Practical Apache Spark

Using the Scala API




Price: $31.66 - $37.29
Authors: Subhashini Chellappan, Dharanitharan Ganesan
Publisher: Apress
Published: 2018
Pages: 280
Language: English
Format: Paper book / ebook (PDF)
ISBN-10: 1484236513
ISBN-13: 9781484236512

Work with Apache Spark using Scala to deploy and set up single-node, multi-node, and high-availability clusters. This book discusses the main components of Spark, including Spark Core, DataFrames, Datasets and SQL, Spark Streaming, Spark MLlib, and R on Spark, with practical code snippets for each topic. Practical Apache Spark also covers the integration of Apache Spark with Kafka, with examples. You'll follow a learn-by-doing approach: learn the concepts, practice the code snippets in Scala, and complete the assignments to gain overall exposure.

On completion, you'll have knowledge of the functional programming aspects of Scala, and hands-on expertise in various Spark components. You'll also become familiar with machine learning algorithms with real-time usage.

Discover the functional programming features of Scala; Understand the complete architecture of Spark and its components; Integrate Apache Spark with Hive and Kafka; Use Spark SQL, DataFrames, and Datasets to process data using traditional SQL queries; Work with different machine learning concepts and libraries using Spark's MLlib packages.




Similar Books


Mastering Apache Spark

by Mike Frampton

Apache Spark is an in-memory, cluster-based parallel processing system that provides a wide range of functionality, such as graph processing, machine learning, stream processing, and SQL. It operates at unprecedented speeds, is easy to use, and offers a rich set of data transformations. This book aims to take your limited knowledge of Spark to th...

Price:  $43.99  |  Publisher:  Packt Publishing  |  Release:  2015

Apache Spark 2: Data Processing and Real-Time Analytics

by Romeo Kienzler, Md. Rezaul Karim, Sridhar Alla, Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei

Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and building your o...

Price:  $49.99  |  Publisher:  Packt Publishing  |  Release:  2018

High Performance Spark

by Holden Karau, Rachel Warren

Apache Spark is amazing when everything clicks. But if you haven't seen the performance improvements you expected, or still don't feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and...

Price:  $27.31  |  Publisher:  O'Reilly Media  |  Release:  2017

Graph Algorithms

by Mark Needham, Amy Hodler

Learn how graph algorithms can help you leverage relationships within your data to develop intelligent solutions and enhance your machine learning models. With this practical guide, developers and data scientists will discover how graph analytics deliver value, whether they're used for building dynamic network models or forecasting re...

Price:  $45.63  |  Free ebook  |  Publisher:  O'Reilly Media  |  Release:  2019

Sams Teach Yourself Apache Spark in 24 Hours

by Jeffrey Aven

Apache Spark is a fast, scalable, and flexible open source distributed processing engine for big data systems and is one of the most active open source big data projects to date. In just 24 lessons of one hour or less, Sams Teach Yourself Apache Spark in 24 Hours helps you build practical Big Data solutions that leverage Spark's amaz...

Price:  $32.51  |  Publisher:  SAMS Publishing  |  Release:  2016

Beginning Apache Spark 2

by Hien Luu

Develop applications for the big data landscape with Spark and Hadoop. This book also explains the role of Spark in developing scalable machine learning and analytics applications with Cloud technologies. Beginning Apache Spark 2 gives you an introduction to Apache Spark and shows you how to work with it. Along the way, you'll discove...

Price:  $25.33  |  Publisher:  Apress  |  Release:  2018

The Azure Data Lakehouse Toolkit

by Ron L'Esteve

Design and implement a modern data lakehouse on the Azure Data Platform using Delta Lake, Apache Spark, Azure Databricks, Azure Synapse Analytics, and Snowflake. This book teaches you the intricate details of the Data Lakehouse Paradigm and how to efficiently design a cloud-based data lakehouse using highly performant and cutting-edge Apa...

Price:  $54.99  |  Publisher:  Apress  |  Release:  2022

PolyBase Revealed

by Kevin Feasel

Harness the power of PolyBase data virtualization software to make data from a variety of sources easily accessible through SQL queries while using the T-SQL skills you already know and have mastered. PolyBase Revealed shows you how to use the PolyBase feature of SQL Server 2019 to integrate SQL Server with Azure Blob Storage, Apache Hadoo...

Price:  $21.73  |  Publisher:  Apress  |  Release:  2020