Advanced Analytics with PySpark

Patterns for Learning from Data at Scale Using Python and Spark




Price: $41.03 - $59.99
Authors: Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills
Publisher: O'Reilly Media
Published: 2022
Pages: 233
Language: English
Format: Paper book / ebook (PDF)
ISBN-10: 1098103653
ISBN-13: 9781098103651

The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world datasets to teach you how to approach analytics problems using PySpark, Spark's Python API, and other best practices in Spark programming.

Data scientists Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills offer an introduction to the Spark ecosystem, then dive into patterns that apply common techniques, including classification, clustering, collaborative filtering, and anomaly detection, to fields such as genomics, security, and finance. This updated edition also covers NLP and image processing.

If you have a basic understanding of machine learning and statistics and you program in Python, this book will get you started with large-scale data analysis.

Familiarize yourself with Spark's programming model and ecosystem; learn general approaches in data science; examine complete implementations that analyze large public datasets; discover which machine learning tools make sense for particular problems; and explore code that can be adapted to many uses.





Similar Books


Advanced Analytics with Transact-SQL


by Dejan Sarka

Learn about business intelligence (BI) features in T-SQL and how they can help you with data science and analytics efforts without the need to bring in other languages such as R and Python. This book shows you how to compute statistical measures using your existing skills in T-SQL. You will learn how to calculate descriptive statistics, i...

Price:  $32.55  |  Publisher:  Apress  |  Release:  2021

PySpark Recipes


by Raju Kumar Mishra

Quickly find solutions to common programming problems encountered while processing big data. Content is presented in the popular problem-solution format. Look up the programming problem that you want to solve. Read the solution. Apply the solution directly in your own code. Problem solved! PySpark Recipes covers Hadoop and its shortcomings...

Price:  $35.10  |  Publisher:  Apress  |  Release:  2018

Derivatives Analytics with Python


by Yves Hilpisch

Derivatives Analytics with Python shows you how to implement market-consistent valuation and hedging approaches using advanced financial models, efficient numerical techniques, and the powerful capabilities of the Python programming language. This unique guide offers detailed explanations of all theory, methods, and processes, giving you ...

Price:  $71.74  |  Publisher:  Wiley  |  Release:  2015

Advanced Analytics and Deep Learning Models


by Archana Mire, Shaveta Malik, Amit Kumar Tyagi

The book provides readers with an in-depth understanding of concepts and technologies related to the importance of analytics and deep learning in many useful real-world applications such as e-healthcare, transportation, agriculture, stock market, etc. Advanced analytics is a mixture of machine learning, artificial intelligence, graphs, tex...

Price:  $162.46  |  Publisher:  Wiley  |  Release:  2022

Big Data Analytics with Spark


by Mohammed Guller

This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML. Big Data Analytics w...

Price:  $29.99  |  Publisher:  Apress  |  Release:  2016

Text Analytics with Python


by Dipanjan Sarkar

Derive useful insights from your data using Python. You will learn both basic and advanced concepts, including text and language syntax, structure, and semantics. You will focus on algorithms and techniques, such as text classification, clustering, topic modeling, and text summarization. Text Analytics with Python teaches you the technique...

Price:  $18.99  |  Publisher:  Apress  |  Release:  2016

Understanding Azure Data Factory


by Sudhir Rawat, Abhishek Narain

Improve your analytics and data platform to solve major challenges, including operationalizing big data and advanced analytics workloads on Azure. You will learn how to monitor complex pipelines, set alerts, and extend your organization's custom monitoring requirements. This book starts with an overview of the Azure Data Factory as a ...

Price:  $30.09  |  Publisher:  Apress  |  Release:  2019

Machine Learning with PySpark


by Pramod Singh

Build machine learning models, natural language processing applications, and recommender systems with PySpark to solve various business challenges. This book starts with the fundamentals of Spark and its evolution and then covers the entire spectrum of traditional machine learning algorithms along with natural language processing and reco...

Price:  $20.41  |  Publisher:  Apress  |  Release:  2019