Modern Data Engineering with Apache Spark
by Scott Haines
Leverage Apache Spark within a modern data engineering ecosystem. This hands-on guide will teach you how to write fully functional applications, follow industry best practices, and learn the rationale behind these decisions. With Apache Spark as the foundation, you will follow a step-by-step journey beginning with the basics of data ingestion, processing, and transformation, and ending up with an entire loc...
Price: $46.38 | Publisher: Apress | Release: 2022
Cloud-Native Microservices with Apache Pulsar
by Rahul Sharma, Mohammad Atyab
Apply different enterprise integration and processing strategies available with Pulsar, Apache's multi-tenant, high-performance, cloud-native messaging and streaming platform. This book is a comprehensive guide that examines using Pulsar Java libraries to build distributed applications with message-driven architecture. You'll begin with an introduction to Apache Pulsar architecture. The first few c...
Price: $31.09 | Publisher: Apress | Release: 2022
The Azure Data Lakehouse Toolkit
by Ron L'Esteve
Design and implement a modern data lakehouse on the Azure Data Platform using Delta Lake, Apache Spark, Azure Databricks, Azure Synapse Analytics, and Snowflake. This book teaches you the intricate details of the Data Lakehouse Paradigm and how to efficiently design a cloud-based data lakehouse using highly performant and cutting-edge Apache Spark capabilities using Azure Databricks, Azure Synapse Analytics...
Price: $54.99 | Publisher: Apress | Release: 2022
Apache Essentials, 2nd Edition
by Darren James Harkness
Take a friendly, non-technical approach to installing, configuring, and maintaining a web server for development and testing on Mac OS, Linux, and Windows. This new edition uses straightforward language to demystify the mechanics of the web, leading the reader through a complex topic via simple, iterative steps. The book reflects current, relevant Apache configurations and web application frameworks, and pr...
Price: $37.49 | Publisher: Apress | Release: 2022
In-Memory Analytics with Apache Arrow
by Matthew Topol
Apache Arrow is designed to accelerate analytics and allow the exchange of data across big data systems easily. In-Memory Analytics with Apache Arrow begins with a quick overview of the Apache Arrow format, before moving on to helping you to understand Arrow's versatility and benefits as you walk through a variety of real-world use cases. You'll cover key tasks such as enhancing data science workfl...
Price: $44.99 | Publisher: Packt Publishing | Release: 2022
Advanced Analytics with PySpark
by Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills
The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world datasets to teach you how to approach analytics problems using PySpark, Spark's Python API, and other best pract...
Price: $35.42 | Publisher: O'Reilly Media | Release: 2022
Machine Learning with PySpark, 2nd Edition
by Pramod Singh
Master the new features in PySpark 3.1 to develop data-driven, intelligent applications. This updated edition covers topics ranging from building scalable machine learning models, to natural language processing, to recommender systems. Machine Learning with PySpark, Second Edition begins with the fundamentals of Apache Spark, including the latest updates to the framework. Next, you will learn the full spectr...
Price: $49.05 | Publisher: Apress | Release: 2022
PHP 8 for Absolute Beginners, 3rd Edition
by Jason Lengstorf, Thomas Blom Hansen, Steve Prettyman
Embark on a practical journey of building dynamic sites aided by multiple projects that can be easily adapted to real-world scenarios. This third edition will show you how to become a confident PHP developer, ready to take the next steps to being a Full Stack Developer and/or successful website or web application programmer. You won't be swamped with every detail of the full PHP language up front - ins...
Price: $40.67 | Publisher: Apress | Release: 2022
by Mahmoud Parsian
Apache Spark's speed, ease of use, sophisticated analytics, and multilanguage support make practical knowledge of this cluster-computing framework a required skill for data engineers and data scientists. With this hands-on guide, anyone looking for an introduction to Spark will learn practical algorithms and examples using PySpark. In each chapter, author Mahmoud Parsian shows you how to solve a data p...
Price: $42.95 | Publisher: O'Reilly Media | Release: 2022
Kafka in Action
by Dylan Scott, Viktor Gamov, Dave Klein
Kafka in Action is a fast-paced introduction to every aspect of working with Apache Kafka. Starting with an overview of Kafka's core concepts, you'll immediately learn how to set up and execute basic data movement tasks and how to produce and consume streams of events. Advancing quickly, you'll soon be ready to use Kafka in your day-to-day workflow, and start digging into even more advanced K...
Price: $44.99 | Publisher: Manning | Release: 2022