Advanced Analytics with PySpark
Patterns for Learning from Data at Scale Using Python and Spark
|Price||$41.03 - $59.99|
|Authors||Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills|
|Format||Paper book / ebook (PDF)|
The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world datasets to teach you how to approach analytics problems using PySpark, Spark's Python API, and other best practices in Spark programming.
Data scientists Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills offer an introduction to the Spark ecosystem, then dive into patterns that apply common techniques, including classification, clustering, collaborative filtering, and anomaly detection, to fields such as genomics, security, and finance. This updated edition also covers NLP and image processing.
If you have a basic understanding of machine learning and statistics and you program in Python, this book will get you started with large-scale data analysis.
Familiarize yourself with Spark's programming model and ecosystem; learn general approaches in data science; examine complete implementations that analyze large public datasets; discover which machine learning tools make sense for particular problems; and explore code that can be adapted to many uses.
by Dejan Sarka
Learn about business intelligence (BI) features in T-SQL and how they can help you with data science and analytics efforts without the need to bring in other languages such as R and Python. This book shows you how to compute statistical measures using your existing skills in T-SQL. You will learn how to calculate descriptive statistics, i...
Price: $32.55 | Publisher: Apress | Release: 2021
by Raju Kumar Mishra
Quickly find solutions to common programming problems encountered while processing big data. Content is presented in the popular problem-solution format. Look up the programming problem that you want to solve. Read the solution. Apply the solution directly in your own code. Problem solved! PySpark Recipes covers Hadoop and its shortcomings...
Price: $35.10 | Publisher: Apress | Release: 2018
by Yves Hilpisch
Derivatives Analytics with Python shows you how to implement market-consistent valuation and hedging approaches using advanced financial models, efficient numerical techniques, and the powerful capabilities of the Python programming language. This unique guide offers detailed explanations of all theory, methods, and processes, giving you ...
Price: $71.74 | Publisher: Wiley | Release: 2015
by Archana Mire, Shaveta Malik, Amit Kumar Tyagi
The book provides readers with an in-depth understanding of concepts and technologies related to the importance of analytics and deep learning in many useful real-world applications such as e-healthcare, transportation, agriculture, stock market, etc. Advanced analytics is a mixture of machine learning, artificial intelligence, graphs, tex...
Price: $162.46 | Publisher: Wiley | Release: 2022
by Mohammed Guller
This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML. Big Data Analytics w...
Price: $29.99 | Publisher: Apress | Release: 2016
by Dipanjan Sarkar
Derive useful insights from your data using Python. You will learn both basic and advanced concepts, including text and language syntax, structure, and semantics. You will focus on algorithms and techniques, such as text classification, clustering, topic modeling, and text summarization. Text Analytics with Python teaches you the technique...
Price: $18.99 | Publisher: Apress | Release: 2016
by Sudhir Rawat, Abhishek Narain
Improve your analytics and data platform to solve major challenges, including operationalizing big data and advanced analytics workloads on Azure. You will learn how to monitor complex pipelines, set alerts, and extend your organization's custom monitoring requirements. This book starts with an overview of the Azure Data Factory as a ...
Price: $30.09 | Publisher: Apress | Release: 2019
by Pramod Singh
Build machine learning models, natural language processing applications, and recommender systems with PySpark to solve various business challenges. This book starts with the fundamentals of Spark and its evolution and then covers the entire spectrum of traditional machine learning algorithms along with natural language processing and reco...
Price: $20.41 | Publisher: Apress | Release: 2019