High Performance Spark

Best Practices for Scaling and Optimizing Apache Spark




Price: $27.31
Authors: Holden Karau, Rachel Warren
Publisher: O'Reilly Media
Published: 2017
Pages: 175
Language: English
Format: Paper book / ebook (PDF)
ISBN-10: 1491943203
ISBN-13: 9781491995662

Apache Spark is amazing when everything clicks. But if you haven't seen the performance improvements you expected, or still don't feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and handle larger data sizes, while using fewer resources.

Ideal for software engineers, data engineers, developers, and system administrators working with large-scale data applications, this book describes techniques that can reduce data infrastructure costs and developer hours. Not only will you gain a more comprehensive understanding of Spark, you'll also learn how to make it sing.

How Spark SQL's new interfaces improve performance over Spark's RDD data structure
The choice between data joins in Core Spark and Spark SQL
Techniques for getting the most out of standard RDD transformations
How to work around performance issues in Spark's key/value pair paradigm
Writing high-performance Spark code without Scala or the JVM
How to test for functionality and performance when applying suggested improvements
Using Spark MLlib and Spark ML machine learning libraries
Spark's Streaming components and external community packages




Similar Books


Clojure High Performance Programming

by Shantanu Kumar

Clojure is a young, dynamic, functional programming language that runs on the Java Virtual Machine. It is built with performance, pragmatism, and simplicity in mind. Like most general purpose languages, Clojure's features have different performance characteristics that one should know in order to write high performance code. Clojure H...

Price:  $20.99  |  Publisher:  Packt Publishing  |  Release:  2013

High Performance Web Sites

by Steve Souders

Want to speed up your web site? This book presents 14 specific rules that will cut 20% to 25% off response time when users request a page. Author Steve Souders, in his job as Chief Performance Yahoo!, collected these best practices while optimizing some of the most-visited pages on the Web. Even sites that had already been highly optimize...

Price:  $4.30  |  Publisher:  O'Reilly Media  |  Release:  2007

NGINX High Performance

by Rahul Sharma

NGINX is one of the most common free, open source web servers. Its performance-oriented architecture and small footprint make it an ideal choice for high-traffic websites. NGINX offers great performance and optimal resource utilization to its administrators. This practical guide walks you through how to tune one of the leading free open s...

Price:  $29.99  |  Publisher:  Packt Publishing  |  Release:  2015

High-Performance Caching with Nginx and Nginx Plus

by Floyd Smith

You can cache static assets - more than half the payload needed to respond to many web requests - and even application-generated web pages (whether partial or complete). And you can use cache clusters and microcaching to increase the caching capability of your web applications while simplifying implementation and reducing operational comp...

Free ebook  |  Publisher:  Self-publishing  |  Release:  2017

High Performance JavaScript

by Nicholas C. Zakas

If you're like most developers, you rely heavily on JavaScript to build interactive and quick-responding web applications. The problem is that all of those lines of JavaScript code can slow down your apps. This book reveals techniques and strategies to help you eliminate performance bottlenecks during development. You'll learn o...

Price:  $19.59  |  Publisher:  O'Reilly Media  |  Release:  2010

Apache Solr High Performance

by Surendra Mohan

Apache Solr is one of the most popular open source search servers available on the web. However, simply setting up Apache Solr is not enough to ensure the success of your web product. To maximize efficiency, you need to use techniques to boost Solr performance in order to return relevant results faster. You need to implement robust techni...

Price:  $20.99  |  Publisher:  Packt Publishing  |  Release:  2014

PostgreSQL 10 High Performance

by Ibrar Ahmed, Gregory Smith, Enrico Pirozzi

PostgreSQL database servers have a common set of problems that they encounter as their usage gets heavier and requirements get more demanding. Peek into the future of your PostgreSQL 10 database's problems today. Know the warning signs to look for and how to avoid the most common issues before they even happen. Surprisingly, most Post...

Price:  $44.99  |  Publisher:  Packt Publishing  |  Release:  2018

Mastering High Performance with Kotlin

by Igor Kucherenko

The ease with which we write applications has been increasing, but with it comes the need to address their performance. A balancing act between easily implementing complex applications and keeping their performance optimal is a present-day requirement. In this book, we explore how to achieve this crucial balance, while developing and deplo...

Price:  $44.57  |  Publisher:  Packt Publishing  |  Release:  2018