Stream Processing with Apache Flink
Fundamentals, Implementation, and Operation of Streaming Applications
|Price||$47.52 - $59.49|
|Authors||Fabian Hueske, Vasiliki Kalavri|
|Format||Paper book / ebook (PDF)|
Get started with Apache Flink, the open source framework that enables you to process streaming data - such as user interactions, sensor data, and machine logs - as it arrives. With this practical guide, you'll learn how to use Apache Flink's stream processing APIs to implement, continuously run, and maintain real-world applications.
Authors Fabian Hueske, one of Flink's creators, and Vasia Kalavri, a core contributor to Flink's graph processing API (Gelly), explain the fundamental concepts of parallel stream processing and show you how streaming analytics differs from traditional batch data analysis. Software engineers, data engineers, and system administrators will learn the basics of Flink's DataStream API, including the structure and components of a common Flink streaming application.
- Solve real-world problems with Apache Flink's DataStream API;
- Set up an environment for developing stream processing applications for Flink;
- Design streaming applications and migrate periodic batch workloads to continuous streaming workloads;
- Learn about windowed operations that process groups of records;
- Ingest data streams into a DataStream application and emit a result stream into different storage systems;
- Implement stateful and custom operators common in stream processing applications;
- Operate, maintain, and update continuously running Flink streaming applications;
- Explore several deployment options, including the setup of highly available installations.
by Romeo Kienzler, Md. Rezaul Karim, Sridhar Alla, Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei
Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and building your own da...
Price: $49.99 | Publisher: Packt Publishing | Release: 2018
by Mike Frampton
Apache Spark is an in-memory cluster based parallel processing system that provides a wide range of functionality like graph processing, machine learning, stream processing and SQL. It operates at unprecedented speeds, is easy to use and offers a rich set of data transformations. This book aims to take your limited knowledge of Spark to th...
Price: $35.25 | Publisher: Packt Publishing | Release: 2015
by Quinton Anderson
Storm is a free and open source distributed real-time computation system. Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing. Storm is simple, can be used with any programming language, and is a lot of fun to use! Storm Real Time Processing Cookbook will ha...
Price: $29.99 | Publisher: Packt Publishing | Release: 2013
by Ankit Jain, Anand Nalya
Starting with the very basics of Storm, you will learn how to set up Storm on a single machine and move on to deploying Storm on your cluster. You will understand how Kafka can be integrated with Storm using the Kafka spout. You will then proceed to explore the Trident abstraction tool with Storm to perform stateful stream processing, guar...
Price: $14.24 | Publisher: Packt Publishing | Release: 2014
by Manuel Ignacio Franco Galeano
Processing big data in real time is challenging due to scalability, information consistency, and fault-tolerance. This book teaches you how to use Spark to make your overall analytical workflow faster and more efficient. You'll explore all core concepts and tools within the Spark ecosystem, such as Spark Streaming, the Spark Streaming API...
Price: $29.99 | Publisher: Packt Publishing | Release: 2018
by Arun C. Murthy, Vinod Kumar Vavilapalli, Doug Eadline, Joseph Niemiec, Jeff Markham
Apache Hadoop is helping drive the Big Data revolution. Now, its data processing has been completely overhauled: Apache Hadoop YARN provides resource management at data center scale and easier ways to create distributed applications that process petabytes of data. And now in Apache Hadoop YARN, two Hadoop technical leaders show you how to...
Price: $30.15 | Publisher: Addison-Wesley | Release: 2014
by Gwen Shapira, Neha Narkhede, Todd Palino
Every enterprise application creates data, whether it's log messages, metrics, user activity, outgoing messages, or something else. And how to move all of this data becomes nearly as important as the data itself. If you're an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how t...
Price: $23.49 | Publisher: O'Reilly Media | Release: 2017
by William P. Bejeck Jr.
Kafka Streams in Action teaches you everything you need to know to implement stream processing on data flowing into your Kafka platform, allowing you to focus on getting more from your data without sacrificing time or effort. Not all stream-based applications require a dedicated processing cluster. The lightweight Kafka Streams library pro...
Price: $35.99 | Publisher: Manning | Release: 2018