The Azure Data Lakehouse Toolkit
Building and Scaling Data Lakehouses on Azure with Delta Lake, Apache Spark, Databricks, Synapse Analytics, and Snowflake
Price | $54.99 |
Author | Ron L'Esteve |
Publisher | Apress |
Published | 2022 |
Pages | 465 |
Language | English |
Format | Paper book / ebook (PDF) |
ISBN-10 | 1484282329 |
ISBN-13 | 9781484282328 |
Design and implement a modern data lakehouse on the Azure Data Platform using Delta Lake, Apache Spark, Azure Databricks, Azure Synapse Analytics, and Snowflake. This book teaches you the details of the data lakehouse paradigm and how to efficiently design a cloud-based data lakehouse, drawing on high-performance Apache Spark capabilities together with Azure Databricks, Azure Synapse Analytics, and Snowflake. You will learn to write efficient PySpark code for batch and streaming ELT jobs on Azure. You will also follow practical, scenario-based examples showing how to apply the capabilities of Delta Lake and Apache Spark to optimize performance and to secure, share, and manage high-volume, high-velocity, and high-variety data in your lakehouse.
The patterns you learn from this book will help you build high-performing, scalable, ACID-compliant lakehouses using flexible and cost-efficient decoupled storage and compute. Extensive coverage of Delta Lake shows what this open source storage layer can offer. In addition to the in-depth Databricks examples, the book covers alternative platforms such as Synapse Analytics and Snowflake so that you can make the right platform choice for your needs.
After reading this book, you will be able to implement Delta Lake capabilities, including Schema Evolution, Change Feed, Live Tables, Sharing, and Clones, to enable better business intelligence and advanced analytics on your data within the Azure Data Platform.
Similar Books
Cloud Data Design, Orchestration, and Management Using Microsoft Azure
by Francesco Diaz, Roberto Freato
Use Microsoft Azure to optimally design your data solutions and save time and money. Scenarios are presented covering analysis, design, integration, monitoring, and derivatives. This book is about data and provides you with a wide range of possibilities to implement a data solution on Azure, from hybrid cloud to PaaS services. Migration fr...
Price: $41.41 | Publisher: Apress | Release: 2018
Understanding Azure Data Factory
by Sudhir Rawat, Abhishek Narain
Improve your analytics and data platform to solve major challenges, including operationalizing big data and advanced analytics workloads on Azure. You will learn how to monitor complex pipelines, set alerts, and extend your organization's custom monitoring requirements. This book starts with an overview of the Azure Data Factory as a ...
Price: $30.09 | Publisher: Apress | Release: 2019
The Data Warehouse Toolkit, 3rd Edition
by Ralph Kimball, Margy Ross
The first edition of Ralph Kimball's The Data Warehouse Toolkit introduced the industry to dimensional modeling, and now his books are considered the most authoritative guides in this space. This new third edition is a complete library of updated dimensional modeling techniques, the most comprehensive collection ever. It covers new a...
Price: $48.99 | Publisher: Wiley | Release: 2013
by Zoiner Tejada
Microsoft Azure has over 20 platform-as-a-service (PaaS) offerings that can act in support of a big data analytics solution. So which one is right for your project? This practical book helps you understand the breadth of Azure services by organizing them into a reference framework you can use when crafting your own big data analytics solu...
Price: $22.99 | Publisher: O'Reilly Media | Release: 2017
Guide to NoSQL with Azure Cosmos DB
by Gaston C. Hillar, Daron Yondem
Cosmos DB is a NoSQL database service included in Azure that is continuously adding new features and has quickly become one of the most innovative services found in Azure, targeting mission-critical applications at a global scale. This book starts off by showing you the main features of Cosmos DB, their supported NoSQL data models and the...
Price: $29.99 | Publisher: Packt Publishing | Release: 2018
Beginning Apache Spark Using Azure Databricks
by Robert Ilijason
Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions ...
Price: $32.32 | Publisher: Apress | Release: 2020
SQL Server Data Automation Through Frameworks
by Andy Leonard, Kent Bradshaw
Learn to automate SQL Server operations using frameworks built from metadata-driven stored procedures and SQL Server Integration Services (SSIS). Bring all the power of Transact-SQL (T-SQL) and Microsoft .NET to bear on your repetitive data, data integration, and ETL processes. Do this for no added cost over what you've already spent...
Price: $37.99 | Publisher: Apress | Release: 2020
Microsoft Azure Data Solutions
by Daniel A. Seara, Francesco Milano, Danilo Dominici
Cloud technologies are advancing at an accelerating pace, supplanting traditional relational and data warehouse storage solutions with novel, high-value alternatives. Now, three pioneering Azure Data consultants offer an expert introduction to the relational, non-relational, and data warehouse solutions offered by the Azure platform. Draw...
Price: $36.23 | Publisher: Microsoft Press | Release: 2021