Apache Books



Modern Data Engineering with Apache Spark

by Scott Haines

Leverage Apache Spark within a modern data engineering ecosystem. This hands-on guide will teach you how to write fully functional applications, follow industry best practices, and learn the rationale behind these decisions. With Apache Spark as the foundation, you will follow a step-by-step journey beginning with the basics of data ingestion, processing, and transformation, and ending up with an entire loc...

Price:  $46.38  |  Publisher:  Apress  |  Release:  2022

Cloud-Native Microservices with Apache Pulsar

by Rahul Sharma, Mohammad Atyab

Apply different enterprise integration and processing strategies available with Pulsar, Apache's multi-tenant, high-performance, cloud-native messaging and streaming platform. This book is a comprehensive guide that examines using Pulsar Java libraries to build distributed applications with message-driven architecture. You'll begin with an introduction to Apache Pulsar architecture. The first few c...

Price:  $31.09  |  Publisher:  Apress  |  Release:  2022

The Azure Data Lakehouse Toolkit

by Ron L'Esteve

Design and implement a modern data lakehouse on the Azure Data Platform using Delta Lake, Apache Spark, Azure Databricks, Azure Synapse Analytics, and Snowflake. This book teaches you the intricate details of the Data Lakehouse Paradigm and how to efficiently design a cloud-based data lakehouse using highly performant, cutting-edge Apache Spark capabilities in Azure Databricks, Azure Synapse Analytics...

Price:  $54.99  |  Publisher:  Apress  |  Release:  2022

In-Memory Analytics with Apache Arrow

by Matthew Topol

Apache Arrow is designed to accelerate analytics and allow the exchange of data across big data systems easily. In-Memory Analytics with Apache Arrow begins with a quick overview of the Apache Arrow format, before moving on to helping you to understand Arrow's versatility and benefits as you walk through a variety of real-world use cases. You'll cover key tasks such as enhancing data science workfl...

Price:  $44.99  |  Publisher:  Packt Publishing  |  Release:  2022

Advanced Analytics with PySpark

by Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills

The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world datasets to teach you how to approach analytics problems using PySpark, Spark's Python API, and other best pract...

Price:  $41.03  |  Publisher:  O'Reilly Media  |  Release:  2022

Machine Learning with PySpark, 2nd Edition

by Pramod Singh

Master the new features in PySpark 3.1 to develop data-driven, intelligent applications. This updated edition covers topics ranging from building scalable machine learning models, to natural language processing, to recommender systems. Machine Learning with PySpark, Second Edition begins with the fundamentals of Apache Spark, including the latest updates to the framework. Next, you will learn the full spectr...

Price:  $49.05  |  Publisher:  Apress  |  Release:  2022

Data Algorithms with Spark

by Mahmoud Parsian

Apache Spark's speed, ease of use, sophisticated analytics, and multilanguage support make practical knowledge of this cluster-computing framework a required skill for data engineers and data scientists. With this hands-on guide, anyone looking for an introduction to Spark will learn practical algorithms and examples using PySpark. In each chapter, author Mahmoud Parsian shows you how to solve a data p...

Price:  $42.95  |  Publisher:  O'Reilly Media  |  Release:  2022

Kafka in Action

by Dylan Scott, Viktor Gamov, Dave Klein

Kafka in Action is a fast-paced introduction to every aspect of working with Apache Kafka. Starting with an overview of Kafka's core concepts, you'll immediately learn how to set up and execute basic data movement tasks and how to produce and consume streams of events. Advancing quickly, you'll soon be ready to use Kafka in your day-to-day workflow, and start digging into even more advanced K...

Price:  $44.99  |  Publisher:  Manning  |  Release:  2022

Elasticsearch 8.x Cookbook, 5th Edition

by Alberto Paro

Elasticsearch is a Lucene-based distributed search engine at the heart of the Elastic Stack that allows you to index and search unstructured content with petabytes of data. With this updated fifth edition, you'll cover comprehensive recipes relating to what's new in Elasticsearch 8.x and see how to create and run complex queries and analytics. The recipes will guide you through performing index map...

Price:  $49.99  |  Publisher:  Packt Publishing  |  Release:  2022

Automated Machine Learning on AWS

by Trenton Potgieter

AWS provides a wide range of solutions to help automate a machine learning workflow with just a few lines of code. With this practical book, you'll learn how to automate a machine learning pipeline using the various AWS services. Automated Machine Learning on AWS begins with a quick overview of what the machine learning pipeline/process looks like and highlights the typical challenges that you may face ...

Price:  $44.99  |  Publisher:  Packt Publishing  |  Release:  2022
