Apache Spark 2: Data Processing and Real-Time Analytics
Master complex big data processing, stream analytics, and machine learning with Apache Spark
|Price||$49.99 - $64.68|
|Authors||Romeo Kienzler, Md. Rezaul Karim, Sridhar Alla, Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei|
|Format||Paper book / ebook (PDF)|
Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and build your own data flow and machine learning programs on this platform.
You will work with the different modules in Apache Spark, such as interactive querying with Spark SQL, using DataFrames and datasets, implementing streaming analytics with Spark Streaming, and applying machine learning and deep learning techniques on Spark using MLlib and various external tools.
By the end of this elaborately designed Learning Path, you will have all the knowledge you need to master Apache Spark and build your own big data processing and analytics pipeline quickly and without any hassle.
by Vinay Singh
SAP HANA is an in-memory database created by SAP. SAP HANA breaks traditional database barriers to simplify IT landscapes, eliminating data preparation, pre-aggregation, and tuning. SAP HANA and in-memory computing allow you to instantly access huge volumes of structured and unstructured data, including text data, from different sources. S...
Price: $30.00 | Publisher: Packt Publishing | Release: 2015
by Shilpi Saxena
This book will teach you how to use Storm for real-time data processing and to make your applications highly available with no downtime using Cassandra. The book starts off with the basics of Storm and its components along with setting up the environment for the execution of a Storm topology in local and distributed mode. Moving on, you wi...
Price: $44.99 | Publisher: Packt Publishing | Release: 2015
by Krishna Sankar, Holden Karau
Spark is a framework used for writing fast, distributed programs. Spark solves problems similar to those addressed by Hadoop MapReduce, but with a fast in-memory approach and a clean functional style API. With its ability to integrate with Hadoop and built-in tools for interactive query analysis (Spark SQL), large-scale graph processing and analysis (G...
Price: $29.99 | Publisher: Packt Publishing | Release: 2015
by Hien Luu
Develop applications for the big data landscape with Spark and Hadoop. This book also explains the role of Spark in developing scalable machine learning and analytics applications with Cloud technologies. Beginning Apache Spark 2 gives you an introduction to Apache Spark and shows you how to work with it. Along the way, you'll discove...
Price: $25.33 | Publisher: Apress | Release: 2018
by Wayne Winston
Master business modeling and analysis techniques with Microsoft Excel 2019 and Office 365 and transform data into bottom-line results. Written by award-winning educator Wayne Winston, this hands-on, scenario-focused guide helps you use Excel to ask the right questions and get accurate, actionable answers. New coverage ranges from Power Qu...
Price: $23.99 | Publisher: Microsoft Press | Release: 2019
by Quinton Anderson
Storm is a free and open source distributed real-time computation system. Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing. Storm is simple, can be used with any programming language, and is a lot of fun to use! Storm Real Time Processing Cookbook will ha...
Price: $29.99 | Publisher: Packt Publishing | Release: 2013
by P. Taylor Goetz, Brian O'Neill
Storm is the most popular framework for real-time stream processing. Storm provides the fundamental primitives and guarantees required for fault-tolerant distributed computing in high-volume, mission critical applications. It is both an integration technology and a data flow and control mechanism, making it the core of many big dat...
Price: $24.99 | Publisher: Packt Publishing | Release: 2014
by Raul Estrada
Apache Kafka is a great open source platform for handling your real-time data pipeline to ensure high-speed filtering and pattern matching on the fly. In this book, you will learn how to use Apache Kafka for efficient processing of distributed applications and will get familiar with solving everyday problems in fast data and processing p...
Price: $29.99 | Publisher: Packt Publishing | Release: 2018