Advanced Analytics with Spark, 2nd Edition

Patterns for Learning from Data at Scale




Price: $29.85 - $30.47
Authors: Sandy Ryza, Uri Laserson, Josh Wills, Sean Owen
Publisher: O'Reilly Media
Published: 2017
Pages: 280
Language: English
Format: Paper book / ebook (PDF)
ISBN-10: 1491972955
ISBN-13: 9781491972953

In the second edition of this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example. Updated for Spark 2.1, this edition acts as an introduction to these techniques and other best practices in Spark programming.

You'll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques - including classification, clustering, collaborative filtering, and anomaly detection - to fields such as genomics, security, and finance.

If you have an entry-level understanding of machine learning and statistics, and you program in Java, Python, or Scala, you'll find the book's patterns useful for working on your own data applications.

Familiarize yourself with the Spark programming model
Become comfortable within the Spark ecosystem
Learn general approaches in data science
Examine complete implementations that analyze large public data sets
Discover which machine learning tools make sense for particular problems
Acquire code that can be adapted to many uses





Similar Books


Advanced Data Analytics Using Python, 2nd Edition


by Sayan Mukhopadhyay, Pratip Samanta

Understand advanced data analytics concepts such as time series and principal component analysis with ETL, supervised learning, and PySpark using Python. This book covers architectural patterns in data analytics, text and image classification, optimization techniques, natural language processing, and computer vision in the cloud environme...

Price:  $33.88  |  Publisher:  Apress  |  Release:  2023

Fast Data Processing with Spark, 2nd Edition


by Krishna Sankar, Holden Karau

Spark is a framework used for writing fast, distributed programs. Spark solves similar problems as Hadoop MapReduce does, but with a fast in-memory approach and a clean functional style API. With its ability to integrate with Hadoop and built-in tools for interactive query analysis (Spark SQL), large-scale graph processing and analysis (G...

Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2015

Java Persistence with Hibernate, 2nd Edition


by Christian Bauer, Gavin King, Gary Gregory

Java Persistence with Hibernate, 2nd Edition explores Hibernate by developing an application that ties together hundreds of individual examples. You'll immediately dig into the rich programming model of Hibernate, working through mappings, queries, fetching strategies, transactions, conversations, caching, and more. Along the way you...

Price:  $39.99  |  Publisher:  Manning  |  Release:  2015

Angular Development with Typescript, 2nd Edition


by Yakov Fain, Anton Moiseev

Angular Development with TypeScript, 2nd Edition is an intermediate-level tutorial that introduces Angular and TypeScript to developers comfortable with building web applications using other frameworks and tools. Whether you're building lightweight web clients or full-featured SPAs, Angular is a clear choice. The Angular framework is ...

Price:  $39.99  |  Publisher:  Manning  |  Release:  2018

Practical Data Science with R, 2nd Edition


by Nina Zumel, John Mount

Practical Data Science with R, Second Edition takes a practice-oriented approach to explaining basic principles in the ever expanding field of data science. You'll jump right to real-world use cases as you apply the R programming language and statistical analysis techniques to carefully explained examples based in marketing, business...

Price:  $39.99  |  Publisher:  Manning  |  Release:  2019

Spring Persistence with Hibernate, 2nd Edition


by Brian D. Murphy, Paul Fisher

Learn how to use the core Hibernate APIs and tools as part of the Spring Framework. This book illustrates how these two frameworks can be best utilized. Other persistence solutions available in Spring are also shown, including the Java Persistence API (JPA). Spring Persistence with Hibernate, Second Edition has been updated to cover Spring ...

Price:  $33.19  |  Publisher:  Apress  |  Release:  2016

Big Data Analytics with Spark


by Mohammed Guller

This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML. Big Data Analytics w...

Price:  $29.99  |  Publisher:  Apress  |  Release:  2016

Writing Excel Macros with VBA, 2nd Edition


by Steven Roman, PhD

Achieving maximum control and flexibility in Microsoft Excel often requires careful custom programming using the VBA (Visual Basic for Applications) language. Writing Excel Macros with VBA, 2nd Edition offers a solid introduction to writing VBA macros and programs, and will show you how to get more power at the programming level: f...

Price:  $4.84  |  Publisher:  O'Reilly Media  |  Release:  2002