Learning Spark
Lightning-Fast Big Data Analysis
Price | $32.23 - $44.26 |
Authors | Matei Zaharia, Holden Karau, Andy Konwinski, Patrick Wendell |
Publisher | O'Reilly Media |
Published | 2015 |
Pages | 276 |
Language | English |
Format | Paper book / ebook (PDF) |
ISBN-10 | 1449358624 |
ISBN-13 | 9781449358624 |
Data in all domains is getting bigger. How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala. This edition includes new information on Spark SQL, Spark Streaming, setup, and Maven coordinates.
Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You'll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning.
Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell; Leverage Spark's powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib; Use one programming paradigm instead of mixing and matching tools like Hive, Hadoop, Mahout, and Storm; Learn how to deploy interactive, batch, and streaming applications; Connect to data sources including HDFS, Hive, JSON, and S3; Master advanced topics like data partitioning and shared variables.
- Matei Zaharia (2 books)
- Holden Karau (5 books)
- Andy Konwinski
- Patrick Wendell
Similar Books
Next-Generation Machine Learning with Spark
by Butch Quinto
Access real-world documentation and examples for the Spark platform for building large-scale, enterprise-grade machine learning applications. The past decade has seen an astonishing series of advances in machine learning. These breakthroughs are disrupting our everyday life and making an impact across every industry. Next-Generation Machine...
Price: $26.41 | Publisher: Apress | Release: 2020
Spark in Action, 2nd Edition
by Jean-Georges Perrin
The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, 2nd Edition, you'll learn to take advantage of Spark's core features and incredible processing speed, with applications including real-time computation, delayed eval...
Price: $35.89 | Publisher: Manning | Release: 2020
Sams Teach Yourself Apache Spark in 24 Hours
by Jeffrey Aven
Apache Spark is a fast, scalable, and flexible open source distributed processing engine for big data systems and is one of the most active open source big data projects to date. In just 24 lessons of one hour or less, Sams Teach Yourself Apache Spark in 24 Hours helps you build practical Big Data solutions that leverage Spark's amaz...
Price: $32.51 | Publisher: SAMS Publishing | Release: 2016
by Nick Pentreath
Apache Spark is a framework for distributed computing that is designed from the ground up to be optimized for low latency tasks and in-memory data storage. It is one of the few frameworks for parallel computing that combines speed, scalability, in-memory processing, and fault tolerance with ease of programming and a flexible, expressive, ...
Price: $34.99 | Publisher: Packt Publishing | Release: 2015
Beginning Apache Spark 2
by Hien Luu
Develop applications for the big data landscape with Spark and Hadoop. This book also explains the role of Spark in developing scalable machine learning and analytics applications with Cloud technologies. Beginning Apache Spark 2 gives you an introduction to Apache Spark and shows you how to work with it. Along the way, you'll discove...
Price: $25.33 | Publisher: Apress | Release: 2018
Apache Spark 2: Data Processing and Real-Time Analytics
by Romeo Kienzler, Md. Rezaul Karim, Sridhar Alla, Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei
Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and building your o...
Price: $49.99 | Publisher: Packt Publishing | Release: 2018
Advanced Analytics with PySpark
by Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills
The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world datasets to teach you how to approach analytics...
Price: $35.42 | Publisher: O'Reilly Media | Release: 2022
by Mohammed Guller
This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML. Big Data Analytics w...
Price: $29.99 | Publisher: Apress | Release: 2016