Modern Big Data Processing with Hadoop

Expert techniques for architecting end-to-end big data solutions to get valuable insights




Price: $50.55 - $54.99
Authors: Naresh Kumar, Prashant Shindgikar
Publisher: Packt Publishing
Published: 2018
Pages: 394
Language: English
Format: Paper book / ebook (PDF)
ISBN-10: 178712276X
ISBN-13: 9781787122765

The complex structure of data these days requires sophisticated solutions for data transformation to make the information more accessible to users. This book empowers you to build such solutions with relative ease with the help of Apache Hadoop, along with a host of other Big Data tools.

This book will give you a complete understanding of data lifecycle management with Hadoop, followed by modeling of structured and unstructured data in Hadoop. It will also show you how to design real-time streaming pipelines by leveraging tools such as Apache Spark, and build efficient enterprise search solutions using Elasticsearch. You will learn to build enterprise-grade analytics solutions on Hadoop, and how to visualize your data using tools such as Apache Superset. This book also covers techniques for deploying your Big Data solutions on the cloud with Apache Ambari, as well as expert techniques for managing and administering your Hadoop cluster.

By the end of this book, you will have all the knowledge you need to build expert Big Data systems.

Build an efficient enterprise Big Data strategy centered around Apache Hadoop; Gain a thorough understanding of using Hadoop with various Big Data frameworks such as Apache Spark, Elasticsearch, and more; Set up and deploy your Big Data environment on premises or on the cloud with Apache Ambari; Design effective streaming data pipelines and build your own enterprise search solutions; Use historical data to build your analytics solutions and visualize them using popular tools such as Apache Superset; Plan, set up, and administer your Hadoop cluster efficiently.





Similar Books


Big Data Processing with Apache Spark


by Manuel Ignacio Franco Galeano

Processing big data in real time is challenging due to scalability, information consistency, and fault tolerance. This book teaches you how to use Spark to make your overall analytical workflow faster and more efficient. You'll explore all core concepts and tools within the Spark ecosystem, such as Spark Streaming, the Spark Streamin...

Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2018

Big Data Analytics with R and Hadoop


by Vignesh Prajapati

Big data analytics is the process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations, and other useful information. Such information can provide competitive advantages over rival organizations and result in business benefits, such as more effective marketing and increased revenue. New...

Price:  $5.77  |  Publisher:  Packt Publishing  |  Release:  2013

Apache Spark 2: Data Processing and Real-Time Analytics


by Romeo Kienzler, Md. Rezaul Karim, Sridhar Alla, Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei

Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and building your o...

Price:  $49.99  |  Publisher:  Packt Publishing  |  Release:  2018

Scaling Big Data with Hadoop and Solr


by Hrishikesh Vijay Karambelkar

As data grows exponentially day-by-day, extracting information becomes a tedious activity in itself. Technologies like Hadoop are trying to address some of the concerns, while Solr provides high-speed faceted search. Bringing these two technologies together is helping organizations resolve the problem of information extraction from Big Da...

Price:  $26.99  |  Publisher:  Packt Publishing  |  Release:  2013

Fast Data Processing with Spark


by Holden Karau

Spark is a framework for writing fast, distributed programs. Spark solves similar problems as Hadoop MapReduce does but with a fast in-memory approach and a clean functional style API. With its ability to integrate with Hadoop and inbuilt tools for interactive query analysis (Shark), large-scale graph processing and analysis (Bagel), and ...

Price:  $22.99  |  Publisher:  Packt Publishing  |  Release:  2013

Apache Hadoop 3 Quick Start Guide


by Hrishikesh Karambelkar

Apache Hadoop is a widely used distributed data platform. It enables large datasets to be efficiently processed instead of using one large computer to store and process the data. This book will get you started with the Hadoop ecosystem, and introduce you to the main technical topics, including MapReduce, YARN, and HDFS. The book begins wit...

Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2018

Big Data Analytics with Spark


by Mohammed Guller

This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML. Big Data Analytics w...

Price:  $29.99  |  Publisher:  Apress  |  Release:  2016

Fast Data Processing with Spark, 2nd Edition


by Krishna Sankar, Holden Karau

Spark is a framework used for writing fast, distributed programs. Spark solves similar problems as Hadoop MapReduce does, but with a fast in-memory approach and a clean functional style API. With its ability to integrate with Hadoop and built-in tools for interactive query analysis (Spark SQL), large-scale graph processing and analysis (G...

Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2015