Advanced Analytics with Spark

Patterns for Learning from Data at Scale




Price: $20.00 - $38.15
Authors: Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills
Publisher: O'Reilly Media
Published: 2015
Pages: 276
Language: English
Format: Paper book / ebook (PDF)
ISBN-10: 1491912766
ISBN-13: 9781491912768

In this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example.

You'll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques - classification, collaborative filtering, and anomaly detection among others - to fields such as genomics, security, and finance. If you have an entry-level understanding of machine learning and statistics, and you program in Java, Python, or Scala, you'll find these patterns useful for working on your own data applications.





Similar Books


Big Data Analytics with Spark

by Mohammed Guller

This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML. Big Data Analytics w...

Price:  $29.99  |  Publisher:  Apress  |  Release:  2016

Advanced Analytics with Spark, 2nd Edition

by Sandy Ryza, Uri Laserson, Josh Wills, Sean Owen

In the second edition of this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example. Updated for Spark 2.1, this ed...

Price:  $29.85  |  Publisher:  O'Reilly Media  |  Release:  2017

Advanced Analytics with Transact-SQL

by Dejan Sarka

Learn about business intelligence (BI) features in T-SQL and how they can help you with data science and analytics efforts without the need to bring in other languages such as R and Python. This book shows you how to compute statistical measures using your existing skills in T-SQL. You will learn how to calculate descriptive statistics, i...

Price:  $32.55  |  Publisher:  Apress  |  Release:  2021

Advanced Analytics with PySpark

by Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills

The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world datasets to teach you how to approach analytics...

Price:  $35.42  |  Publisher:  O'Reilly Media  |  Release:  2022

Fast Data Processing with Spark, 2nd Edition

by Krishna Sankar, Holden Karau

Spark is a framework for writing fast, distributed programs. It solves problems similar to those addressed by Hadoop MapReduce, but with a fast in-memory approach and a clean, functional-style API. With its ability to integrate with Hadoop and built-in tools for interactive query analysis (Spark SQL), large-scale graph processing and analysis (G...

Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2015

Fast Data Processing with Spark

by Holden Karau

Spark is a framework for writing fast, distributed programs. It solves problems similar to those addressed by Hadoop MapReduce, but with a fast in-memory approach and a clean, functional-style API. With its ability to integrate with Hadoop and built-in tools for interactive query analysis (Shark), large-scale graph processing and analysis (Bagel), and ...

Price:  $22.99  |  Publisher:  Packt Publishing  |  Release:  2013

Machine Learning with Spark

by Nick Pentreath

Apache Spark is a framework for distributed computing designed from the ground up for low-latency tasks and in-memory data storage. It is one of the few frameworks for parallel computing that combines speed, scalability, in-memory processing, and fault tolerance with ease of programming and a flexible, expressive, ...

Price:  $34.99  |  Publisher:  Packt Publishing  |  Release:  2015

Apache Spark 2: Data Processing and Real-Time Analytics

by Romeo Kienzler, Md. Rezaul Karim, Sridhar Alla, Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei

Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and building your o...

Price:  $49.99  |  Publisher:  Packt Publishing  |  Release:  2018