Modern Big Data Processing with Hadoop

    Expert techniques for architecting end-to-end big data solutions to get valuable insights




    Price: $50.55 - $54.99
    Authors: Naresh Kumar, Prashant Shindgikar
    Publisher: Packt Publishing
    Published: 2018
    Pages: 394
    Language: English
    Format: Paper book / ebook (PDF)
    ISBN-10: 178712276X
    ISBN-13: 9781787122765

    Today's complex data structures require sophisticated data transformation solutions to make information more accessible to users. This book empowers you to build such solutions with relative ease using Apache Hadoop, along with a host of other Big Data tools.

    This book will give you a complete understanding of data lifecycle management with Hadoop, followed by modeling of structured and unstructured data in Hadoop. It will also show you how to design real-time streaming pipelines by leveraging tools such as Apache Spark, and how to build efficient enterprise search solutions using Elasticsearch. You will learn to build enterprise-grade analytics solutions on Hadoop and to visualize your data using tools such as Apache Superset. This book also covers techniques for deploying your Big Data solutions on the cloud with Apache Ambari, as well as expert techniques for managing and administering your Hadoop cluster.

    By the end of this book, you will have all the knowledge you need to build expert Big Data systems.

    Build an efficient enterprise Big Data strategy centered around Apache Hadoop
    Gain a thorough understanding of using Hadoop with various Big Data frameworks such as Apache Spark, Elasticsearch, and more
    Set up and deploy your Big Data environment on premises or on the cloud with Apache Ambari
    Design effective streaming data pipelines and build your own enterprise search solutions
    Utilize historical data to build your analytics solutions and visualize them using popular tools such as Apache Superset
    Plan, set up, and administer your Hadoop cluster efficiently





    Similar Books


    Big Data Processing with Apache Spark

    by Manuel Ignacio Franco Galeano

    Processing big data in real time is challenging due to scalability, information consistency, and fault-tolerance. This book teaches you how to use Spark to make your overall analytical workflow faster and more efficient. You'll explore all core concepts and tools within the Spark ecosystem, such as Spark Streaming, the Spark Streamin...

    Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2018

    Big Data Analytics with R and Hadoop

    by Vignesh Prajapati

    Big data analytics is the process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations, and other useful information. Such information can provide competitive advantages over rival organizations and result in business benefits, such as more effective marketing and increased revenue. New...

    Price:  $5.77  |  Publisher:  Packt Publishing  |  Release:  2013

    Apache Spark 2: Data Processing and Real-Time Analytics

    by Romeo Kienzler, Md. Rezaul Karim, Sridhar Alla, Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei

    Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and building your o...

    Price:  $49.99  |  Publisher:  Packt Publishing  |  Release:  2018

    Scaling Big Data with Hadoop and Solr

    by Hrishikesh Vijay Karambelkar

    As data grows exponentially day-by-day, extracting information becomes a tedious activity in itself. Technologies like Hadoop are trying to address some of the concerns, while Solr provides high-speed faceted search. Bringing these two technologies together is helping organizations resolve the problem of information extraction from Big Da...

    Price:  $26.99  |  Publisher:  Packt Publishing  |  Release:  2013

    Fast Data Processing with Spark

    by Holden Karau

    Spark is a framework for writing fast, distributed programs. Spark solves similar problems as Hadoop MapReduce does but with a fast in-memory approach and a clean functional style API. With its ability to integrate with Hadoop and inbuilt tools for interactive query analysis (Shark), large-scale graph processing and analysis (Bagel), and ...

    Price:  $22.99  |  Publisher:  Packt Publishing  |  Release:  2013

    Apache Hadoop 3 Quick Start Guide

    by Hrishikesh Karambelkar

    Apache Hadoop is a widely used distributed data platform. It enables large datasets to be efficiently processed instead of using one large computer to store and process the data. This book will get you started with the Hadoop ecosystem, and introduce you to the main technical topics, including MapReduce, YARN, and HDFS. The book begins wit...

    Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2018

    Big Data Analytics with Spark

    by Mohammed Guller

    This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML. Big Data Analytics w...

    Price:  $29.99  |  Publisher:  Apress  |  Release:  2016

    Fast Data Processing with Spark, 2nd Edition

    by Krishna Sankar, Holden Karau

    Spark is a framework used for writing fast, distributed programs. Spark solves similar problems as Hadoop MapReduce does, but with a fast in-memory approach and a clean functional style API. With its ability to integrate with Hadoop and built-in tools for interactive query analysis (Spark SQL), large-scale graph processing and analysis (G...

    Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2015