Practical Synthetic Data Generation
Balancing Privacy and the Broad Availability of Data
Price: $48.30 - $67.99
Authors: Khaled El Emam, Lucy Mosquera, Richard Hoptroff
Format: Paper book / ebook (PDF)
Building and testing machine learning models requires access to large and diverse data. But where can you find usable datasets without running into privacy issues? This practical book introduces techniques for generating synthetic data - fake data generated from real data - so you can perform secondary analysis to do research, understand customer behaviors, develop new products, or generate new revenue.
Data scientists will learn how synthetic data generation provides a way to make such data broadly available for secondary purposes while addressing many privacy concerns. Analysts will learn the principles and steps for generating synthetic data from real datasets. And business leaders will see how synthetic data can help accelerate time to a product or solution.
This book describes:
- Steps for generating synthetic data using multivariate normal distributions
- Methods for distribution fitting covering different goodness-of-fit metrics
- How to replicate the simple structure of original data
- An approach for modeling data structure to consider complex relationships
- Multiple approaches and metrics you can use to assess data utility
- How analysis performed on real data can be replicated with synthetic data
- Privacy implications of synthetic data and methods to assess identity disclosure
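As a rough illustration of the first topic, generating synthetic data from a multivariate normal distribution, here is a minimal sketch in Python. It is not taken from the book: the function names (`fit_mvn`, `cholesky`, `sample_mvn`) and the tiny "real" dataset are hypothetical, and it uses only the standard library. The idea is to estimate the mean vector and covariance matrix of the real data, then draw new rows from a normal distribution with those parameters.

```python
import math
import random

def fit_mvn(rows):
    """Estimate the mean vector and sample covariance matrix of numeric rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov

def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a must be positive definite)."""
    d = len(a)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def sample_mvn(mean, cov, n, rng):
    """Draw n synthetic rows as mean + L z, where z is standard normal."""
    L = cholesky(cov)
    d = len(mean)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        rows.append([mean[i] + sum(L[i][k] * z[k] for k in range(i + 1))
                     for i in range(d)])
    return rows

# Hypothetical "real" data: (age, income in thousands). Values are made up.
real = [[25, 32.0], [31, 41.5], [38, 52.0], [44, 58.5], [52, 71.0], [60, 69.5]]

rng = random.Random(0)                 # fixed seed so the run is repeatable
mean, cov = fit_mvn(real)              # learn parameters from the real data
synthetic = sample_mvn(mean, cov, 1000, rng)
syn_mean, syn_cov = fit_mvn(synthetic) # compare against the real parameters
```

A sketch like this reproduces only the means and pairwise covariances of the original columns. It does not capture non-normal marginals or nonlinear relationships, and it is not by itself a privacy guarantee, which is why utility and disclosure assessments are listed as separate topics.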
by Bobby Curtis, Fuad Arshad, Erik Benner, Maris Elsins, Matt Gallagher, Pete Sharman, Yury Velikanov
Practical Oracle Database Appliance is a hands-on book taking you through the components and implementation of the Oracle Database Appliance. Learn about architecture, installation, configuration, and reconfiguration. Install and configure the Oracle Database Appliance with confidence. Make the right choices between the various configurat...
Price: $49.99 | Publisher: Apress | Release: 2014
by Saurabh Gupta, Venkata Giri
Use this practical guide to successfully handle the challenges encountered when designing an enterprise data lake and learn industry best practices to resolve issues. When designing an enterprise data lake, you often hit a roadblock when you must leave the comfort of the relational world and learn the nuances of handling non-relational data...
Price: $24.14 | Publisher: Apress | Release: 2018
by Max Shron
Many analysts are too concerned with tools and techniques for cleansing, modeling, and visualizing datasets and not concerned enough with asking the right questions. In this practical guide, data strategy consultant Max Shron shows you how to put the why before the how, through an often-overlooked set of analytical skills. Thinking with Da...
Price: $10.73 | Publisher: O'Reilly Media | Release: 2014
by Greg Jordan
Why have developers at places like Facebook and Twitter increasingly turned to graph databases to manage their highly connected big data? The short answer is that graphs offer superior speed and flexibility to get the job done. It's time you added skills in graph databases to your toolkit. In Practical Neo4j, database expert Greg Jordan gu...
Price: $24.53 | Publisher: Apress | Release: 2015
by Valentina Porcu
Learn how to use Python and its structures, how to install Python, and which tools are best suited for data analyst work. This book provides you with a handy reference and tutorial on topics ranging from basic Python concepts through to data mining, manipulating and importing datasets, and data analysis. Python for Data Mining Quick Syntax...
Price: $23.88 | Publisher: Apress | Release: 2018
by Tyler Akidau, Slava Chernyak, Reuven Lax
Streaming data is a big deal in big data these days. As more and more businesses seek to tame the massive unbounded data sets that pervade our world, streaming systems have finally reached a level of maturity sufficient for mainstream adoption. With this practical guide, data engineers, data scientists, and developers will learn how to wo...
Price: $52.52 | Publisher: O'Reilly Media | Release: 2018
by Luk Arbuckle, Khaled El Emam
How can you use data in a way that protects individual privacy but still provides useful and meaningful analytics? With this practical book, data architects and engineers will learn how to establish and integrate secure, repeatable anonymization processes into their data flows and analytics in a sustainable manner. Luk Arbuckle and Khaled ...
Price: $36.65 | Publisher: O'Reilly Media | Release: 2020
by Ilango Gurusamy
Scala, together with the Spark Framework, forms a rich and powerful data processing ecosystem. Modern Scala Projects is a journey into the depths of this ecosystem. The machine learning (ML) projects presented in this book enable you to create practical, robust data analytics solutions, with an emphasis on automating data workflows with t...
Price: $30.83 | Publisher: Packt Publishing | Release: 2018