Advanced Analytics with PySpark

Patterns for Learning from Data at Scale Using Python and Spark




Price: $35.42 - $38.33
Authors: Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills
Publisher: O'Reilly Media
Published: 2022
Pages: 233
Language: English
Format: Paper book / ebook (PDF)
ISBN-10: 1098103653
ISBN-13: 9781098103651

The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world datasets to teach you how to approach analytics problems using PySpark, Spark's Python API, and other best practices in Spark programming.

Data scientists Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills offer an introduction to the Spark ecosystem, then dive into patterns that apply common techniques, including classification, clustering, collaborative filtering, and anomaly detection, to fields such as genomics, security, and finance. This updated edition also covers NLP and image processing.

If you have a basic understanding of machine learning and statistics and you program in Python, this book will get you started with large-scale data analysis.

Familiarize yourself with Spark's programming model and ecosystem. Learn general approaches in data science. Examine complete implementations that analyze large public datasets. Discover which machine learning tools make sense for particular problems. Explore code that can be adapted to many uses.



Similar Books


Advanced Analytics with Transact-SQL

by Dejan Sarka

Learn about business intelligence (BI) features in T-SQL and how they can help you with data science and analytics efforts without the need to bring in other languages such as R and Python. This book shows you how to compute statistical measures using your existing skills in T-SQL. You will learn how to calculate descriptive statistics, i...

Price:  $32.55  |  Publisher:  Apress  |  Release:  2021

Advanced Data Analytics Using Python, 2nd Edition

by Sayan Mukhopadhyay, Pratip Samanta

Understand advanced data analytics concepts such as time series and principal component analysis with ETL, supervised learning, and PySpark using Python. This book covers architectural patterns in data analytics, text and image classification, optimization techniques, natural language processing, and computer vision in the cloud environme...

Price:  $33.88  |  Publisher:  Apress  |  Release:  2023

Predictive Analytics with Microsoft Azure Machine Learning

by Roger Barga, Valentine Fontama, Wee Hyong Tok

Data Science and Machine Learning are in high demand, as customers are increasingly looking for ways to glean insights from all their data. More customers now realize that Business Intelligence is not enough as the volume, speed and complexity of data now defy traditional analytics tools. While Business Intelligence addresses descriptive ...

Price:  $24.59  |  Publisher:  Apress  |  Release:  2014

PySpark Recipes

by Raju Kumar Mishra

Quickly find solutions to common programming problems encountered while processing big data. Content is presented in the popular problem-solution format. Look up the programming problem that you want to solve. Read the solution. Apply the solution directly in your own code. Problem solved! PySpark Recipes covers Hadoop and its shortcomings...

Price:  $35.10  |  Publisher:  Apress  |  Release:  2018

Derivatives Analytics with Python

by Yves Hilpisch

Derivatives Analytics with Python shows you how to implement market-consistent valuation and hedging approaches using advanced financial models, efficient numerical techniques, and the powerful capabilities of the Python programming language. This unique guide offers detailed explanations of all theory, methods, and processes, giving you ...

Price:  $71.74  |  Publisher:  Wiley  |  Release:  2015

Advanced Analytics and Deep Learning Models

by Archana Mire, Shaveta Malik, Amit Kumar Tyagi

The book provides readers with an in-depth understanding of concepts and technologies related to the importance of analytics and deep learning in many useful real-world applications such as e-healthcare, transportation, agriculture, stock market, etc. Advanced analytics is a mixture of machine learning, artificial intelligence, graphs, tex...

Price:  $162.46  |  Publisher:  Wiley  |  Release:  2022

Big Data Analytics with Spark

by Mohammed Guller

This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML. Big Data Analytics w...

Price:  $29.99  |  Publisher:  Apress  |  Release:  2016

Text Analytics with Python

by Dipanjan Sarkar

Derive useful insights from your data using Python. You will learn both basic and advanced concepts, including text and language syntax, structure, and semantics. You will focus on algorithms and techniques, such as text classification, clustering, topic modeling, and text summarization. Text Analytics with Python teaches you the technique...

Price:  $18.99  |  Publisher:  Apress  |  Release:  2016