Trino: The Definitive Guide, 2nd Edition
by Matt Fuller, Manfred Moser, Martin Traverso
Perform fast interactive analytics against different data sources using the Trino high-performance distributed SQL query engine. In the second edition of this practical guide, you'll learn how to conduct analytics on data where it lives, whether it's a data lake using Hive, a modern lakehouse with Iceberg or Delta Lake, a different system like Cassandra, Kafka, or SingleStore, or a relational data...
Price: $52.99 | Publisher: O'Reilly Media | Release: 2022
by Matt Fuller, Manfred Moser, Martin Traverso
Perform fast interactive analytics against different data sources using the Trino high-performance distributed SQL query engine. With this practical guide, you'll learn how to conduct analytics on data where it lives, whether it's Hive, Cassandra, a relational database, or a proprietary data store. Analysts, software engineers, and production engineers will learn how to manage, use, and even devel...
Price: $64.73 | Publisher: O'Reilly Media | Release: 2021
Beginning Apache Spark Using Azure Databricks
by Robert Ilijason
Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions cost, while at the same time getting the results you need, incremental...
Price: $32.32 | Publisher: Apress | Release: 2020
by Matt Fuller, Martin Traverso, Manfred Moser
Perform fast interactive analytics against different data sources using the Presto high-performance, distributed SQL query engine. With this practical guide, you'll learn how to conduct analytics on data where it lives, whether it's Hive, Cassandra, a relational database, or a proprietary data store. Analysts, software engineers, and production engineers will learn how to manage, use, and even dev...
Price: $51.81 | Publisher: O'Reilly Media | Release: 2020
Practical Apache Spark
by Subhashini Chellappan, Dharanitharan Ganesan
Work with Apache Spark using Scala to deploy and set up single-node, multi-node, and high-availability clusters. This book discusses various components of Spark such as Spark Core, DataFrames, Datasets and SQL, Spark Streaming, Spark MLlib, and R on Spark with the help of practical code snippets for each topic. Practical Apache Spark also covers the integration of Apache Spark with Kafka with examples. You...
Price: $31.66 | Publisher: Apress | Release: 2018
by Paul Rogers, Charles Givre
Get up to speed with Apache Drill, an extensible distributed SQL query engine that reads massive datasets in many popular file formats such as Parquet, JSON, and CSV. Drill reads data in HDFS or in cloud-native storage such as S3 and works with Hive metastores along with distributed databases such as HBase, MongoDB, and relational databases. Drill works everywhere: on your laptop or in your largest cluster....
Price: $42.26 | Publisher: O'Reilly Media | Release: 2018
Apache Hadoop 3 Quick Start Guide
by Hrishikesh Karambelkar
Apache Hadoop is a widely used distributed data platform. It enables large datasets to be efficiently processed instead of using one large computer to store and process the data. This book will get you started with the Hadoop ecosystem, and introduce you to the main technical topics, including MapReduce, YARN, and HDFS. The book begins with an overview of big data and Apache Hadoop. Then, you will set up a p...
Price: $29.99 | Publisher: Packt Publishing | Release: 2018
Sams Teach Yourself Hadoop in 24 Hours
by Jeffrey Aven
Apache Hadoop is the technology at the heart of the Big Data revolution, and Hadoop skills are in enormous demand. Now, in just 24 lessons of one hour or less, you can learn all the skills and techniques you'll need to deploy each key component of a Hadoop platform in your local environment or in the cloud, building a fully functional Hadoop cluster and using it with real programs and datasets. Each sh...
Price: $31.99 | Publisher: SAMS Publishing | Release: 2017
by Bill Havanki
Until recently, Hadoop deployments existed on hardware owned and run by organizations. Now, of course, you can acquire the computing resources and network connectivity to run Hadoop clusters in the cloud. But there's a lot more to deploying Hadoop to the public cloud than simply renting machines. This hands-on guide shows developers and systems administrators familiar with Hadoop how to install, use, an...
Price: $25.32 | Publisher: O'Reilly Media | Release: 2017
by Brandon Perry
Learn to use C#'s powerful set of core libraries to automate tedious yet important tasks like fuzzing, performing vulnerability scans, and analyzing malware. With some help from Mono, you'll write your own practical security tools that will run on Windows, OS X, Linux, and even mobile devices. After a crash course in C# and some of its advanced features, you'll learn how to: Generate shellco...
Price: $17.99 | Publisher: No Starch Press | Release: 2017