Next-Generation Big Data

A Practical Guide to Apache Kudu, Impala, and Spark



Price: $33.51 - $39.99
Author: Butch Quinto
Publisher: Apress
Published: 2018
Pages: 557
Language: English
Format: Paper book / ebook (PDF)
ISBN-10: 1484231465
ISBN-13: 9781484231463

Utilize this practical and easy-to-follow guide to modernize traditional enterprise data warehouse and business intelligence environments with next-generation big data technologies.

Next-Generation Big Data takes a holistic approach, covering the most important aspects of modern enterprise big data. The book covers not only the main technology stack but also the next-generation tools and applications used for big data warehousing, data warehouse optimization, real-time and batch data ingestion and processing, real-time data visualization, big data governance, data wrangling, big data cloud deployments, and distributed in-memory big data computing. Finally, the book offers extensive, detailed coverage of big data case studies from Navistar, Cerner, British Telecom, Shopzilla, Thomson Reuters, and Mastercard.

Install Apache Kudu, Impala, and Spark to modernize enterprise data warehouse and business intelligence environments, complete with real-world, easy-to-follow examples and practical advice.
Integrate HBase, Solr, Oracle, SQL Server, MySQL, Flume, Kafka, HDFS, and Amazon S3 with Apache Kudu, Impala, and Spark.
Use StreamSets, Talend, Pentaho, and CDAP for real-time and batch data ingestion and processing.
Utilize Trifacta, Alteryx, and Datameer for data wrangling and interactive data processing.
Turbocharge Spark with Alluxio, a distributed in-memory storage platform.
Deploy big data in the cloud using Cloudera Director.
Perform real-time data visualization and time series analysis using Zoomdata, Apache Kudu, Impala, and Spark.
Understand enterprise big data topics such as big data governance, metadata management, data lineage, impact analysis, and policy enforcement, and how to use Cloudera Navigator to perform common data governance tasks.
Implement big data use cases such as big data warehousing, data warehouse optimization, Internet of Things, real-time data ingestion and analytics, complex event processing, and scalable predictive modeling.
Study real-world big data case studies from innovative companies, including Navistar, Cerner, British Telecom, Shopzilla, Thomson Reuters, and Mastercard.




Similar Books


Next Generation Databases: NoSQL, NewSQL, and Big Data

This is a book for enterprise architects, database administrators, and developers who need to understand the latest developments in database technologies. It is the book to help you choose the correct database technology at a time when concepts such as Big Data, NoSQL and NewSQL are making what used to be an easy choice into a complex dec...
Big Data Imperatives

Big Data Imperatives focuses on resolving the key questions on everyone's mind: Which data matters? Do you have enough data volume to justify the usage? How do you want to process this amount of data? How long do you really need to keep it active for your analysis, marketing, and BI applications? Big data is emerging from the realm of one-of...
Big Data Bootcamp

Investors and technology gurus have called big data one of the most important trends to come along in decades. Big Data Bootcamp explains what big data is and how you can use it in your company to become one of tomorrow's market leaders. Along the way, it explains the very latest technologies, companies, and advancements. Big data holds th...
Scalable Big Data Architecture

This book highlights the different types of data architecture and illustrates the many possibilities hidden behind the term "Big Data", from the usage of NoSQL databases to the deployment of stream analytics architecture, machine learning, and governance. Scalable Big Data Architecture covers real-world, concrete industr...
Big Data Analytics with R and Hadoop

Big data analytics is the process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations, and other useful information. Such information can provide competitive advantages over rival organizations and result in business benefits, such as more effective marketing and increased revenue. New...
Scaling Big Data with Hadoop and Solr

As data grows exponentially day-by-day, extracting information becomes a tedious activity in itself. Technologies like Hadoop are trying to address some of the concerns, while Solr provides high-speed faceted search. Bringing these two technologies together is helping organizations resolve the problem of information extraction from Big Da...