Learning Apache Mahout

Acquire practical skills in Big Data Analytics and explore data science with Apache Mahout




Price: $44.99 - $65.24
Author: Chandramani Tiwary
Publisher: Packt Publishing
Published: 2015
Pages: 250
Language: English
Format: Paper book / ebook (PDF)
ISBN-10: 1783555211
ISBN-13: 9781783555215

In the past few years, the generation of data and our capability to store and process it have grown exponentially. There is a need for scalable analytics frameworks and for people with the right skills to extract the needed information from this Big Data. Apache Mahout is one of the first and most prominent Big Data machine learning platforms. It implements machine learning algorithms on top of distributed processing platforms such as Hadoop and Spark.

Starting with the basics of Mahout and machine learning, you will explore prominent algorithms and their implementation in Mahout. You will learn about Mahout's building blocks, address feature extraction, feature reduction, and the curse of dimensionality, and delve into classification use cases with the random forest and Naïve Bayes classifiers as well as item-based and user-based recommendation. You will then work with clustering in Mahout using the K-means algorithm and implement Mahout without MapReduce. Finish with a flourish by exploring end-to-end use cases on customer analytics and text analytics to gain real-life, practical know-how of analytics projects.





Similar Books


Learning Apache Mahout Classification

by Ashish Gupta

This book is a practical guide that explains the classification algorithms provided in Apache Mahout with the help of actual examples. Starting with an introduction to classification and model evaluation techniques, we will explore Apache Mahout and learn why it is a good choice for classification. Next, you will learn about different cla...

Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2015

Apache Mahout Essentials

by Jayani Withanawasam

Apache Mahout is a scalable machine learning library with algorithms for clustering, classification, and recommendations. It empowers users to analyze patterns in large, diverse, and complex datasets faster and more scalably. This book is an all-inclusive guide to analyzing large and complex datasets using Apache Mahout. It explains compli...

Price:  $24.99  |  Publisher:  Packt Publishing  |  Release:  2015

Learning Apache Kafka, 2nd Edition

by Nishant Garg

Kafka is one of those systems that is very simple to describe at a high level but has an incredible depth of technical detail when you dig deeper. Learning Apache Kafka, Second Edition provides you with step-by-step, practical examples that help you take advantage of the real power of Kafka and handle hundreds of megabytes of messages per s...

Price:  $13.07  |  Publisher:  Packt Publishing  |  Release:  2015

Practical Apache Lucene 8

by Atri Sharma

Gain a thorough knowledge of Lucene's capabilities and use it to develop your own search applications. This book explores the Java-based, high-performance text search engine library used to build search capabilities in your applications. Starting with the basics of Lucene and searching, you will learn about the types of queries used ...

Price:  $31.61  |  Publisher:  Apress  |  Release:  2020

Learning Apache Drill

by Paul Rogers, Charles Givre

Get up to speed with Apache Drill, an extensible distributed SQL query engine that reads massive datasets in many popular file formats such as Parquet, JSON, and CSV. Drill reads data in HDFS or in cloud-native storage such as S3 and works with Hive metastores along with distributed databases such as HBase, MongoDB, and relational databas...

Price:  $42.26  |  Publisher:  O'Reilly Media  |  Release:  2018

Learning Apache Thrift

by Krzysztof Rakowski

With modern software systems becoming increasingly complex, providing a scalable communication architecture for applications in different languages is tedious. The Apache Thrift framework is the solution to this problem! It helps build efficient and easy-to-maintain services and offers a plethora of options matching your application type by ...

Price:  $13.46  |  Publisher:  Packt Publishing  |  Release:  2015

Learning Spark

by Matei Zaharia, Holden Karau, Andy Konwinski, Patrick Wendell

Data in all domains is getting bigger. How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala. Th...

Price:  $32.23  |  Publisher:  O'Reilly Media  |  Release:  2015

Machine Learning with Spark

by Nick Pentreath

Apache Spark is a framework for distributed computing that is designed from the ground up to be optimized for low-latency tasks and in-memory data storage. It is one of the few frameworks for parallel computing that combines speed, scalability, in-memory processing, and fault tolerance with ease of programming and a flexible, expressive, ...

Price:  $34.99  |  Publisher:  Packt Publishing  |  Release:  2015