Data Pipelines with Apache Airflow



Bookstore > Books > Data Pipelines with Apache Airflow

Price$36.99 - $49.58
Rating
AuthorsBas P. Harenslak, Julian Rutger de Ruiter
PublisherManning
Published2021
Pages480
LanguageEnglish
FormatPaper book / ebook (PDF)
ISBN-101617296902
ISBN-139781617296901
EBook Hardcover Paperback

A successful pipeline moves data efficiently, minimizing pauses and blockages between tasks, keeping every process along the way operational. Apache Airflow provides a single customizable environment for building and managing data pipelines, eliminating the need for a hodgepodge collection of tools, snowflake code, and homegrown processes. Using real-world scenarios and examples, Data Pipelines with Apache Airflow teaches you how to simplify and automate data pipelines, reduce operational overhead, and smoothly integrate all the technologies in your stack.

Data pipelines manage the flow of data from initial collection through consolidation, cleaning, analysis, visualization, and more. Apache Airflow provides a single platform you can use to design, implement, monitor, and maintain your pipelines. Its easy-to-use UI, plug-and-play options, and flexible Python scripting make Airflow perfect for any data management task.

Data Pipelines with Apache Airflow teaches you how to build and maintain effective data pipelines. You'll explore the most common usage patterns, including aggregating multiple data sources, connecting to and from data lakes, and cloud deployment. Part reference and part tutorial, this practical guide covers every aspect of the directed acyclic graphs (DAGs) that power Airflow, and how to customize them for your pipeline's needs.




5 5 114

Similar Books


Data Engineering with Alteryx

Data Engineering with Alteryx

by Paul Houghton

Alteryx is a GUI-based development platform for data analytic applications.Data Engineering with Alteryx will help you leverage Alteryx's code-free aspects which increase development speed while still enabling you to make the most of the code-based skills you have.This book will teach you the principles of DataOps and how they can be...

Price:  $44.99  |  Publisher:  Packt Publishing  |  Release:  2022

Modern Data Engineering with Apache Spark

Modern Data Engineering with Apache Spark

by Scott Haines

Leverage Apache Spark within a modern data engineering ecosystem. This hands-on guide will teach you how to write fully functional applications, follow industry best practices, and learn the rationale behind these decisions. With Apache Spark as the foundation, you will follow a step-by-step journey beginning with the basics of data inges...

Price:  $46.38  |  Publisher:  Apress  |  Release:  2022

Big Data Processing with Apache Spark

Big Data Processing with Apache Spark

by Manuel Ignacio Franco Galeano

Processing big data in real time is challenging due to scalability, information consistency, and fault-tolerance. This book teaches you how to use Spark to make your overall analytical workflow faster and more efficient. You'll explore all core concepts and tools within the Spark ecosystem, such as Spark Streaming, the Spark Streamin...

Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2018

Practical Data Science with R, 2nd Edition

Practical Data Science with R, 2nd Edition

by Nina Zumel, John Mount

Practical Data Science with R, Second Edition takes a practice-oriented approach to explaining basic principles in the ever expanding field of data science. You'll jump right to real-world use cases as you apply the R programming language and statistical analysis techniques to carefully explained examples based in marketing, business...

Price:  $39.99  |  Publisher:  Manning  |  Release:  2019

Data Engineering with Google Cloud Platform

Data Engineering with Google Cloud Platform

by Adi Wijaya

With this book, you'll understand how the highly scalable Google Cloud Platform (GCP) enables data engineers to create end-to-end data pipelines right from storing and processing data and workflow orchestration to presenting data through visualization dashboards.Starting with a quick overview of the fundamental concepts of data engin...

Price:  $49.99  |  Publisher:  Packt Publishing  |  Release:  2022

Productive and Efficient Data Science with Python

Productive and Efficient Data Science with Python

by Tirthajyoti Sarkar

This book focuses on the Python-based tools and techniques to help you become highly productive at all aspects of typical data science stacks such as statistical analysis, visualization, model selection, and feature engineering.You'll review the inefficiencies and bottlenecks lurking in the daily business process and solve them with ...

Price:  $49.99  |  Publisher:  Apress  |  Release:  2022

Learning Apache Mahout

Learning Apache Mahout

by Chandramani Tiwary

In the past few years the generation of data and our capability to store and process it has grown exponentially. There is a need for scalable analytics frameworks and people with the right skills to get the information needed from this Big Data. Apache Mahout is one of the first and most prominent Big Data machine learning platforms. It i...

Price:  $44.99  |  Publisher:  Packt Publishing  |  Release:  2015

Modern Big Data Processing with Hadoop

Modern Big Data Processing with Hadoop

by Naresh Kumar, Prashant Shindgikar

The complex structure of data these days requires sophisticated solutions for data transformation, to make the information more accessible to the users.This book empowers you to build such solutions with relative ease with the help of Apache Hadoop, along with a host of other Big Data tools.This book will give you a complete understanding...

Price:  $50.55  |  Publisher:  Packt Publishing  |  Release:  2018