Hadoop Books



Data Analysis with Python and PySpark

by Jonathan Rioux

Data Analysis with Python and PySpark is your guide to delivering successful Python-driven data projects. Packed with relevant examples and essential techniques, this practical book teaches you to build pipelines for reporting, machine learning, and other data-centric tasks. Quick exercises in every chapter help you practice what you've learned, and rapidly start implementing PySpark into your data sys...

Price:  $57.69  |  Publisher:  Manning  |  Release:  2022

PolyBase Revealed

by Kevin Feasel

Harness the power of PolyBase data virtualization software to make data from a variety of sources easily accessible through SQL queries while using the T-SQL skills you already know and have mastered. PolyBase Revealed shows you how to use the PolyBase feature of SQL Server 2019 to integrate SQL Server with Azure Blob Storage, Apache Hadoop, other SQL Server instances, Oracle, Cosmos DB, Apache Spark, and mo...

Price:  $21.73  |  Publisher:  Apress  |  Release:  2020

Beginning Apache Spark Using Azure Databricks

by Robert Ilijason

Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions cost, while at the same time getting the results you need, incremental...

Price:  $32.32  |  Publisher:  Apress  |  Release:  2020

Spark in Action, 2nd Edition

by Jean-Georges Perrin

The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, 2nd Edition, you'll learn to take advantage of Spark's core features and incredible processing speed, with applications including real-time computation, delayed evaluation, and machine learning. Spark skills are a hot commodity in ente...

Price:  $39.99  |  Publisher:  Manning  |  Release:  2020

Mastering Large Datasets with Python

by John T. Wolohan

Modern data science solutions need to be clean, easy to read, and scalable. In Mastering Large Datasets with Python, author J.T. Wolohan teaches you how to take a small project and scale it up using a functionally influenced approach to Python coding. You'll explore methods and built-in Python tools that lend themselves to clarity and scalability, like the high-performing parallelism method, as well as...

Price:  $39.99  |  Publisher:  Manning  |  Release:  2020

Hadoop for Windows Succinctly

FREE EBOOK - Hadoop for Windows Succinctly

by Dave Vickers

Author Dave Vickers provides a thorough guide to using Hadoop directly on Windows operating systems. From a conceptual overview to practical examples, Hadoop for Windows Succinctly is a valuable resource for developers....

Publisher:  Syncfusion  |  Release:  2019

Modern Big Data Processing with Hadoop

by Naresh Kumar, Prashant Shindgikar

The complex structure of data these days requires sophisticated solutions for data transformation, to make the information more accessible to the users. This book empowers you to build such solutions with relative ease with the help of Apache Hadoop, along with a host of other Big Data tools. This book will give you a complete understanding of the data lifecycle management with Hadoop, followed by modeling of...

Price:  $50.55  |  Publisher:  Packt Publishing  |  Release:  2018

Apache Hadoop 3 Quick Start Guide

by Hrishikesh Karambelkar

Apache Hadoop is a widely used distributed data platform. It enables large datasets to be processed efficiently across a cluster, instead of using one large computer to store and process the data. This book will get you started with the Hadoop ecosystem, and introduce you to the main technical topics, including MapReduce, YARN, and HDFS. The book begins with an overview of big data and Apache Hadoop. Then, you will set up a p...

Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2018

PySpark Recipes

by Raju Kumar Mishra

Quickly find solutions to common programming problems encountered while processing big data. Content is presented in the popular problem-solution format. Look up the programming problem that you want to solve. Read the solution. Apply the solution directly in your own code. Problem solved! PySpark Recipes covers Hadoop and its shortcomings. The architecture of Spark, PySpark, and RDD are presented. You will ...

Price:  $35.10  |  Publisher:  Apress  |  Release:  2018

Advanced Data Analytics Using Python

by Sayan Mukhopadhyay

Gain a broad foundation of advanced data analytics concepts and discover the recent revolution in databases such as Neo4j, Elasticsearch, and MongoDB. This book discusses how to implement ETL techniques including topical crawling, which is applied in domains such as high-frequency algorithmic trading and goal-oriented dialog systems. You'll also see examples of machine learning concepts such as semi-su...

Price:  $29.01  |  Publisher:  Apress  |  Release:  2018
