Elasticsearch 8.x Cookbook, 5th Edition
by Alberto Paro
Elasticsearch is a Lucene-based distributed search engine at the heart of the Elastic Stack that allows you to index and search unstructured content with petabytes of data. With this updated fifth edition, you'll cover comprehensive recipes relating to what's new in Elasticsearch 8.x and see how to create and run complex queries and analytics. The recipes will guide you through performing index map...
Price: $49.99 | Publisher: Packt Publishing | Release: 2022
Introducing .NET for Apache Spark
by Ed Elliott
Get started using Apache Spark via C# or F# and the .NET for Apache Spark bindings. This book is an introduction to both Apache Spark and the .NET bindings. Readers new to Apache Spark will get up to speed quickly using Spark for data processing tasks performed against large and very large datasets. You will learn how to combine your knowledge of .NET with Apache Spark to bring massive computing power to be...
Price: $46.96 | Publisher: Apress | Release: 2021
by Wee Hyong Tok, Amit Bahree, Senja Filipi
Most data scientists and engineers today rely on quality labeled data to train machine learning models. But building a training set manually is time-consuming and expensive, leaving many companies with unfinished ML projects. There's a more practical approach. In this book, Wee Hyong Tok, Amit Bahree, and Senja Filipi show you how to create products using weakly supervised learning models. You'll l...
Price: $61.93 | Publisher: O'Reilly Media | Release: 2021
Data Science at the Command Line, 2nd Edition
by Jeroen Janssens
This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed with over 100 Unix power tools, useful whether you work wit...
Price: $54.85 | Publisher: O'Reilly Media | Release: 2021
Designing Cloud Data Platforms
by Danil Zburivsky, Lynda Partner
Centralized data warehouses, the long-time de facto standard for housing data for analytics, are rapidly giving way to multi-faceted cloud data platforms. Companies that embrace modern cloud data platforms benefit from an integrated view of their business using all of their data and can take advantage of advanced analytic practices to drive predictions and as-yet-unimagined data services. Designing Cloud Dat...
Price: $39.99 | Publisher: Manning | Release: 2021
Next-Generation Machine Learning with Spark
by Butch Quinto
Access real-world documentation and examples for the Spark platform for building large-scale, enterprise-grade machine learning applications. The past decade has seen an astonishing series of advances in machine learning. These breakthroughs are disrupting our everyday life and making an impact across every industry. Next-Generation Machine Learning with Spark provides a gentle introduction to Spark and Spark...
Price: $26.41 | Publisher: Apress | Release: 2020
Spark in Action, 2nd Edition
by Jean-Georges Perrin
The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, 2nd Edition, you'll learn to take advantage of Spark's core features and incredible processing speed, with applications including real-time computation, delayed evaluation, and machine learning. Spark skills are a hot commodity in ente...
Price: $35.89 | Publisher: Manning | Release: 2020
PolyBase Revealed
by Kevin Feasel
Harness the power of PolyBase data virtualization software to make data from a variety of sources easily accessible through SQL queries while using the T-SQL skills you already know and have mastered. PolyBase Revealed shows you how to use the PolyBase feature of SQL Server 2019 to integrate SQL Server with Azure Blob Storage, Apache Hadoop, other SQL Server instances, Oracle, Cosmos DB, Apache Spark, and mo...
Price: $21.73 | Publisher: Apress | Release: 2020
Beginning Apache Spark Using Azure Databricks
by Robert Ilijason
Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions cost, while at the same time getting the results you need, incremental...
Price: $32.32 | Publisher: Apress | Release: 2020
by Benjamin Weissman, Enrico van de Laar
Use this guide to one of SQL Server 2019's most impactful features - Big Data Clusters. You will learn about data virtualization and data lakes for this complete artificial intelligence (AI) and machine learning (ML) platform within the SQL Server database engine. You will know how to use Big Data Clusters to combine large volumes of streaming data for analysis along with data stored in a traditional d...
Price: $33.67 | Publisher: Apress | Release: 2020