Hadoop Books



PolyBase Revealed

by Kevin Feasel

Harness the power of PolyBase data virtualization software to make data from a variety of sources easily accessible through SQL queries while using the T-SQL skills you already know and have mastered. PolyBase Revealed shows you how to use the PolyBase feature of SQL Server 2019 to integrate SQL Server with Azure Blob Storage, Apache Hadoop, other SQL Server instances, Oracle, Cosmos DB, Apache Spark, and mo...

Price:  $21.73  |  Publisher:  Apress  |  Release:  2020

Beginning Apache Spark Using Azure Databricks

by Robert Ilijason

Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions cost, while at the same time getting the results you need, incremental...

Price:  $37.76  |  Publisher:  Apress  |  Release:  2020

Spark in Action, 2nd Edition

by Jean-Georges Perrin

The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, 2nd Edition, you'll learn to take advantage of Spark's core features and incredible processing speed, with applications including real-time computation, delayed evaluation, and machine learning. Spark skills are a hot commodity in enterprises wo...

Price:  $59.99  |  Publisher:  Manning  |  Release:  2020

Mastering Large Datasets with Python

by John T. Wolohan

Modern data science solutions need to be clean, easy to read, and scalable. In Mastering Large Datasets with Python, author J.T. Wolohan teaches you how to take a small project and scale it up using a functionally influenced approach to Python coding. You'll explore methods and built-in Python tools that lend themselves to clarity and scalability, like the high-performing parallelism method, as well as dist...

Price:  $39.99  |  Publisher:  Manning  |  Release:  2020

Modern Big Data Processing with Hadoop

by Naresh Kumar, Prashant Shindgikar

The complex structure of data these days requires sophisticated solutions for data transformation, to make the information more accessible to the users. This book empowers you to build such solutions with relative ease with the help of Apache Hadoop, along with a host of other Big Data tools. This book will give you a complete understanding of the data lifecycle management with Hadoop, followed by modeling of...

Price:  $31.99  |  Publisher:  Packt Publishing  |  Release:  2018

Apache Hadoop 3 Quick Start Guide

by Hrishikesh Karambelkar

Apache Hadoop is a widely used distributed data platform. It enables large datasets to be efficiently processed instead of using one large computer to store and process the data. This book will get you started with the Hadoop ecosystem, and introduce you to the main technical topics, including MapReduce, YARN, and HDFS. The book begins with an overview of big data and Apache Hadoop. Then, you will set up a p...

Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2018

PySpark Recipes

by Raju Kumar Mishra

Quickly find solutions to common programming problems encountered while processing big data. Content is presented in the popular problem-solution format. Look up the programming problem that you want to solve. Read the solution. Apply the solution directly in your own code. Problem solved! PySpark Recipes covers Hadoop and its shortcomings. The architecture of Spark, PySpark, and RDD are presented. You will ...

Price:  $35.10  |  Publisher:  Apress  |  Release:  2018

Advanced Data Analytics Using Python

by Sayan Mukhopadhyay

Gain a broad foundation of advanced data analytics concepts and discover the recent revolution in databases such as Neo4j, Elasticsearch, and MongoDB. This book discusses how to implement ETL techniques including topical crawling, which is applied in domains such as high-frequency algorithmic trading and goal-oriented dialog systems. You'll also see examples of machine learning concepts such as semi-supervi...

Price:  $29.01  |  Publisher:  Apress  |  Release:  2018

Practical Enterprise Data Lake Insights

by Saurabh Gupta, Venkata Giri

Use this practical guide to successfully handle the challenges encountered when designing an enterprise data lake and learn industry best practices to resolve issues. When designing an enterprise data lake you often hit a roadblock when you must leave the comfort of the relational world and learn the nuances of handling non-relational data. Starting from sourcing data into the Hadoop ecosystem, you will go t...

Price:  $24.14  |  Publisher:  Apress  |  Release:  2018

Beginning Apache Spark 2

by Hien Luu

Develop applications for the big data landscape with Spark and Hadoop. This book also explains the role of Spark in developing scalable machine learning and analytics applications with Cloud technologies. Beginning Apache Spark 2 gives you an introduction to Apache Spark and shows you how to work with it. Along the way, you'll discover resilient distributed datasets (RDDs); use Spark SQL for structured data;...

Price:  $25.33  |  Publisher:  Apress  |  Release:  2018
