Apache Spark 2: Data Processing and Real-Time Analytics
Master complex big data processing, stream analytics, and machine learning with Apache Spark
|Price||$49.99 - $64.68|
|Authors||Romeo Kienzler, Md. Rezaul Karim, Sridhar Alla, Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei|
|Format||Paper book / ebook (PDF)|
Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionality, including big data processing, analytics, and machine learning. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to extend Spark's functionality and build your own data flow and machine learning programs on the platform.
You will work with the different modules in Apache Spark: interactive querying with Spark SQL, using DataFrames and Datasets, implementing streaming analytics with Spark Streaming, and applying machine learning and deep learning techniques on Spark using MLlib and various external tools.
By the end of this Learning Path, you will have the knowledge you need to master Apache Spark and build your own big data processing and analytics pipeline.
by Shilpi Saxena
This book will teach you how to use Storm for real-time data processing and to make your applications highly available with no downtime using Cassandra. The book starts off with the basics of Storm and its components along with setting up the environment for the execution of a Storm topology in local and distributed mode. Moving on, you wi...
Price: $35.99 | Publisher: Packt Publishing | Release: 2015
by Vinay Singh
SAP HANA is an in-memory database created by SAP. SAP HANA breaks traditional database barriers to simplify IT landscapes, eliminating data preparation, pre-aggregation, and tuning. SAP HANA and in-memory computing allow you to instantly access huge volumes of structured and unstructured data, including text data, from different sources. S...
Price: $31.99 | Publisher: Packt Publishing | Release: 2015
by Krishna Sankar, Holden Karau
Spark is a framework used for writing fast, distributed programs. Spark solves problems similar to those addressed by Hadoop MapReduce, but with a fast in-memory approach and a clean, functional-style API. With its ability to integrate with Hadoop and built-in tools for interactive query analysis (Spark SQL), large-scale graph processing and analysis (G...
Price: $23.99 | Publisher: Packt Publishing | Release: 2015
by Hien Luu
Develop applications for the big data landscape with Spark and Hadoop. This book also explains the role of Spark in developing scalable machine learning and analytics applications with Cloud technologies. Beginning Apache Spark 2 gives you an introduction to Apache Spark and shows you how to work with it. Along the way, you'll discover res...
Price: $25.33 | Publisher: Apress | Release: 2018
by Quinton Anderson
Storm is a free and open source distributed real-time computation system. Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing. Storm is simple, can be used with any programming language, and is a lot of fun to use! Storm Real Time Processing Cookbook will ha...
Price: $29.99 | Publisher: Packt Publishing | Release: 2013
by P. Taylor Goetz, Brian O'Neill
Storm is the most popular framework for real-time stream processing. Storm provides the fundamental primitives and guarantees required for fault-tolerant distributed computing in high-volume, mission-critical applications. It is both an integration technology and a data flow and control mechanism, making it the core of many big dat...
Price: $27.07 | Publisher: Packt Publishing | Release: 2014
by Byron Ellis
Real-time analytics is the hottest topic in data analytics today. In Real-Time Analytics, expert Byron Ellis teaches data analysts the technologies needed to build an effective real-time analytics platform. This platform can then be used to make sense of the constantly changing data that is beginning to outpace traditional batch-based analysis plat...
Price: $33.90 | Publisher: Wiley | Release: 2014
by Mike Frampton
Apache Spark is an in-memory, cluster-based parallel processing system that provides a wide range of functionality, such as graph processing, machine learning, stream processing, and SQL. It operates at unprecedented speeds, is easy to use, and offers a rich set of data transformations. This book aims to take your limited knowledge of Spark to th...
Price: $35.25 | Publisher: Packt Publishing | Release: 2015