Apache Spark Books




Elasticsearch 8.x Cookbook, 5th Edition

by Alberto Paro

Elasticsearch is a Lucene-based distributed search engine at the heart of the Elastic Stack that allows you to index and search unstructured content with petabytes of data. With this updated fifth edition, you'll cover comprehensive recipes relating to what's new in Elasticsearch 8.x and see how to create and run complex queries and analytics. The recipes will guide you through performing index map...

Price:  $49.99  |  Publisher:  Packt Publishing  |  Release:  2022

Introducing .NET for Apache Spark

by Ed Elliott

Get started using Apache Spark via C# or F# and the .NET for Apache Spark bindings. This book is an introduction to both Apache Spark and the .NET bindings. Readers new to Apache Spark will get up to speed quickly using Spark for data processing tasks performed against large and very large datasets. You will learn how to combine your knowledge of .NET with Apache Spark to bring massive computing power to be...

Price:  $46.96  |  Publisher:  Apress  |  Release:  2021

Practical Weak Supervision

by Wee Hyong Tok, Amit Bahree, Senja Filipi

Most data scientists and engineers today rely on quality labeled data to train machine learning models. But building a training set manually is time-consuming and expensive, leaving many companies with unfinished ML projects. There's a more practical approach. In this book, Wee Hyong Tok, Amit Bahree, and Senja Filipi show you how to create products using weakly supervised learning models. You'll l...

Price:  $61.93  |  Publisher:  O'Reilly Media  |  Release:  2021

Data Science at the Command Line, 2nd Edition

by Jeroen Janssens

This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed with over 100 Unix power tools - useful whether you work wit...

Price:  $54.85  |  Publisher:  O'Reilly Media  |  Release:  2021

Designing Cloud Data Platforms

by Danil Zburivsky, Lynda Partner

Centralized data warehouses, the long-time de facto standard for housing data for analytics, are rapidly giving way to multi-faceted cloud data platforms. Companies that embrace modern cloud data platforms benefit from an integrated view of their business using all of their data and can take advantage of advanced analytic practices to drive predictions and as yet unimagined data services. Designing Cloud Dat...

Price:  $39.99  |  Publisher:  Manning  |  Release:  2021

Next-Generation Machine Learning with Spark

by Butch Quinto

Access real-world documentation and examples for the Spark platform for building large-scale, enterprise-grade machine learning applications. The past decade has seen an astonishing series of advances in machine learning. These breakthroughs are disrupting our everyday life and making an impact across every industry. Next-Generation Machine Learning with Spark provides a gentle introduction to Spark and Spark...

Price:  $26.41  |  Publisher:  Apress  |  Release:  2020

Spark in Action, 2nd Edition

by Jean-Georges Perrin

The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, 2nd Edition, you'll learn to take advantage of Spark's core features and incredible processing speed, with applications including real-time computation, delayed evaluation, and machine learning. Spark skills are a hot commodity in ente...

Price:  $39.99  |  Publisher:  Manning  |  Release:  2020

PolyBase Revealed

by Kevin Feasel

Harness the power of PolyBase data virtualization software to make data from a variety of sources easily accessible through SQL queries while using the T-SQL skills you already know and have mastered. PolyBase Revealed shows you how to use the PolyBase feature of SQL Server 2019 to integrate SQL Server with Azure Blob Storage, Apache Hadoop, other SQL Server instances, Oracle, Cosmos DB, Apache Spark, and mo...

Price:  $21.73  |  Publisher:  Apress  |  Release:  2020

Beginning Apache Spark Using Azure Databricks

by Robert Ilijason

Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions cost, while at the same time getting the results you need, incremental...

Price:  $32.32  |  Publisher:  Apress  |  Release:  2020

SQL Server Big Data Clusters

by Benjamin Weissman, Enrico van de Laar

Use this guide to one of SQL Server 2019's most impactful features - Big Data Clusters. You will learn about data virtualization and data lakes for this complete artificial intelligence (AI) and machine learning (ML) platform within the SQL Server database engine. You will know how to use Big Data Clusters to combine large volumes of streaming data for analysis along with data stored in a traditional d...

Price:  $33.67  |  Publisher:  Apress  |  Release:  2020

