Practical Apache Spark
Using the Scala API
|Price||$31.66 - $48.38|
|Authors||Subhashini Chellappan, Dharanitharan Ganesan|
|Format||Paper book / ebook (PDF)|
Work with Apache Spark using Scala to deploy and set up single-node, multi-node, and high-availability clusters. This book discusses various components of Spark such as Spark Core, DataFrames, Datasets and SQL, Spark Streaming, Spark MLlib, and R on Spark with the help of practical code snippets for each topic. Practical Apache Spark also covers the integration of Apache Spark with Kafka, with examples. You'll follow a learn-to-do-by-yourself approach: learn the concepts, practice the code snippets in Scala, and complete the assignments to get overall exposure.
On completion, you'll have knowledge of the functional programming aspects of Scala and hands-on expertise in various Spark components. You'll also become familiar with machine learning algorithms and their real-time usage.
Discover the functional programming features of Scala; Understand the complete architecture of Spark and its components; Integrate Apache Spark with Hive and Kafka; Use Spark SQL, DataFrames, and Datasets to process data using traditional SQL queries; Work with different machine learning concepts and libraries using Spark's MLlib packages.
by Mike Frampton
Apache Spark is an in-memory, cluster-based parallel processing system that provides a wide range of functionality such as graph processing, machine learning, stream processing, and SQL. It operates at unprecedented speeds, is easy to use, and offers a rich set of data transformations. This book aims to take your limited knowledge of Spark to th...
Price: $35.25 | Publisher: Packt Publishing | Release: 2015
by Holden Karau, Rachel Warren
Apache Spark is amazing when everything clicks. But if you haven't seen the performance improvements you expected, or still don't feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and handle la...
Price: $27.31 | Publisher: O'Reilly Media | Release: 2017
by Jeffrey Aven
Apache Spark is a fast, scalable, and flexible open source distributed processing engine for big data systems and is one of the most active open source big data projects to date. In just 24 lessons of one hour or less, Sams Teach Yourself Apache Spark in 24 Hours helps you build practical Big Data solutions that leverage Spark's amazing s...
Price: $31.49 | Publisher: SAMS Publishing | Release: 2016
by Hien Luu
Develop applications for the big data landscape with Spark and Hadoop. This book also explains the role of Spark in developing scalable machine learning and analytics applications with Cloud technologies. Beginning Apache Spark 2 gives you an introduction to Apache Spark and shows you how to work with it. Along the way, you'll discover res...
Price: $25.33 | Publisher: Apress | Release: 2018
by Zubair Nabi
Learn the right cutting-edge skills and knowledge to leverage Spark Streaming to implement a wide array of real-time, streaming applications. This book walks you through end-to-end real-time application development using real-world applications, data, and code. Taking an application-first approach, each chapter introduces use cases from a...
Price: $34.99 | Publisher: Apress | Release: 2016
by Sam R. Alapati
Follow this handbook to build, configure, tune, and secure Apache Cassandra databases. Start with the installation of Cassandra and move on to the creation of a single instance, and then a cluster of Cassandra databases. Cassandra is increasingly a key player in many big data environments, and this book shows you how to use Cassandra with ...
Price: $39.99 | Publisher: Apress | Release: 2017
by Mahmoud Parsian
If you are ready to dive into the MapReduce framework for processing large datasets, this practical book takes you step by step through the algorithms and tools you need to build distributed MapReduce applications with Apache Hadoop or Apache Spark. Each chapter provides a recipe for solving a massive computational problem, such as buildi...
Price: $58.74 | Publisher: O'Reilly Media | Release: 2015
by Nick Pentreath
Apache Spark is a framework for distributed computing that is designed from the ground up to be optimized for low-latency tasks and in-memory data storage. It is one of the few frameworks for parallel computing that combines speed, scalability, in-memory processing, and fault tolerance with ease of programming and a flexible, expressive, ...
Price: $29.99 | Publisher: Packt Publishing | Release: 2015