Beginning Apache Spark 2

With Resilient Distributed Datasets, Spark SQL, Structured Streaming and Spark Machine Learning library




Price: $25.33 - $43.07
Author: Hien Luu
Publisher: Apress
Published: 2018
Pages: 393
Language: English
Format: Paper book / ebook (PDF)
ISBN-10: 1484235789
ISBN-13: 9781484235782

Develop applications for the big data landscape with Spark and Hadoop. This book also explains the role of Spark in developing scalable machine learning and analytics applications with Cloud technologies. Beginning Apache Spark 2 gives you an introduction to Apache Spark and shows you how to work with it.

Along the way, you'll discover resilient distributed datasets (RDDs); use Spark SQL for structured data; and learn stream processing and build real-time applications with Spark Structured Streaming. Furthermore, you'll learn the fundamentals of Spark ML for machine learning and much more.

After you read this book, you will have the fundamentals to become proficient in using Apache Spark and know when and how to apply it to your big data applications.

You will understand Spark's unified data processing platform; run Spark in Spark Shell or Databricks; use and manipulate RDDs; work with structured data using Spark SQL through its operations and advanced functions; build real-time applications using Spark Structured Streaming; and develop intelligent applications with the Spark Machine Learning library.





Similar Books


Apache Spark 2: Data Processing and Real-Time Analytics

by Romeo Kienzler, Md. Rezaul Karim, Sridhar Alla, Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei

Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and building your o...

Price:  $49.99  |  Publisher:  Packt Publishing  |  Release:  2018

Mastering Apache Spark

by Mike Frampton

Apache Spark is an in-memory, cluster-based parallel processing system that provides a wide range of functionality like graph processing, machine learning, stream processing and SQL. It operates at unprecedented speeds, is easy to use and offers a rich set of data transformations. This book aims to take your limited knowledge of Spark to th...

Price:  $43.99  |  Publisher:  Packt Publishing  |  Release:  2015

Beginning Apache Spark Using Azure Databricks

by Robert Ilijason

Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions ...

Price:  $32.32  |  Publisher:  Apress  |  Release:  2020

Sams Teach Yourself Apache Spark in 24 Hours

by Jeffrey Aven

Apache Spark is a fast, scalable, and flexible open source distributed processing engine for big data systems and is one of the most active open source big data projects to date. In just 24 lessons of one hour or less, Sams Teach Yourself Apache Spark in 24 Hours helps you build practical Big Data solutions that leverage Spark's amaz...

Price:  $32.51  |  Publisher:  SAMS Publishing  |  Release:  2016

Apache Cookbook, 2nd Edition

by Rich Bowen, Ken Coar

There's plenty of documentation on installing and configuring the Apache web server, but where do you find help for the day-to-day stuff, like adding common modules or fine-tuning your activity logging? That's easy. The new edition of the Apache Cookbook offers you updated solutions to the problems you're likely to encounte...

Price:  $4.79  |  Publisher:  O'Reilly Media  |  Release:  2007

Beginning SQL Server 2005 Programming

by Robert Vieira

Covering all the fundamentals of SQL Server 2005, this developer-oriented guide begins with an exploration of the foundation objects of SQL. Each chapter builds on the previous one, gradually progressing to increasingly advanced topics. By the time you've completed this book, you will be prepared to perform as an efficient SQL Server...

Price:  $3.74  |  Publisher:  Wrox  |  Release:  2006

Beginning EJB 3, 2nd Edition

by Jonathan Wetherbee, Chirag Rathod, Raghu Kodali, Peter Zadrozny

Develop powerful, standards-based, back-end business logic with Beginning EJB 3, Java EE 7 Edition. Led by an author team with 20 years of combined Enterprise JavaBeans experience, you'll learn how to use the new EJB 3.2 APIs. You'll gain the knowledge and skills you'll need to create the complex enterprise applications tha...

Price:  $50.00  |  Publisher:  Apress  |  Release:  2013

Pro Apache Hadoop, 2nd Edition

by Sameer Wadkar, Madhu Siddalingaiah, Jason Venner

Pro Apache Hadoop, Second Edition brings you up to speed on Hadoop - the framework of big data. Revised to cover Hadoop 2.0, the book covers the very latest developments such as YARN (aka MapReduce 2.0), new HDFS high-availability features, and increased scalability in the form of HDFS Federations. All the old content has been revised too...

Price:  $22.99  |  Publisher:  Apress  |  Release:  2014