Practical Synthetic Data Generation

Balancing Privacy and the Broad Availability of Data




Price: $59.99
Authors: Khaled El Emam, Lucy Mosquera, Richard Hoptroff
Publisher: O'Reilly Media
Published: 2020
Pages: 166
Language: English
Format: Paper book / ebook (PDF)
ISBN-10: 1492072745
ISBN-13: 9781492072744

Building and testing machine learning models requires access to large and diverse data. But where can you find usable datasets without running into privacy issues? This practical book introduces techniques for generating synthetic data - fake data generated from real data - so you can perform secondary analysis to do research, understand customer behaviors, develop new products, or generate new revenue.

Data scientists will learn how synthetic data generation provides a way to make such data broadly available for secondary purposes while addressing many privacy concerns. Analysts will learn the principles and steps for generating synthetic data from real datasets. And business leaders will see how synthetic data can help accelerate time to a product or solution.

This book describes:
- Steps for generating synthetic data using multivariate normal distributions
- Methods for distribution fitting covering different goodness-of-fit metrics
- How to replicate the simple structure of original data
- An approach for modeling data structure to consider complex relationships
- Multiple approaches and metrics you can use to assess data utility
- How analysis performed on real data can be replicated with synthetic data
- Privacy implications of synthetic data and methods to assess identity disclosure
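As a rough illustration of the first topic listed above, generating synthetic data with a multivariate normal distribution can be sketched in a few lines of Python. This is a minimal sketch and not the authors' method: the function name `synthesize_mvn`, the stand-in "real" dataset, and the choice of NumPy are all assumptions made for this example. It estimates the mean vector and covariance matrix of the real data, then draws new rows from the fitted distribution.

```python
import numpy as np


def synthesize_mvn(real, n_samples, seed=0):
    """Fit a multivariate normal to `real` (rows = records) and sample from it.

    Hypothetical helper for illustration only.
    """
    real = np.asarray(real, dtype=float)
    mean = real.mean(axis=0)              # per-column means
    cov = np.cov(real, rowvar=False)      # column-by-column covariance matrix
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n_samples)


# Stand-in for a real dataset: three correlated numeric columns (assumed data).
rng = np.random.default_rng(42)
base = rng.normal(size=(2000, 1))
real = np.hstack([
    base + 0.5 * rng.normal(size=(2000, 1)),
    2.0 * base + rng.normal(size=(2000, 1)),
    rng.normal(size=(2000, 1)),
])

synthetic = synthesize_mvn(real, n_samples=2000)

# A simple utility check: compare the correlation structure of both datasets.
print(np.round(np.corrcoef(real, rowvar=False), 2))
print(np.round(np.corrcoef(synthetic, rowvar=False), 2))
```

A sketch like this preserves only means and pairwise linear relationships. It assumes numeric, roughly normal columns and does not by itself address the privacy assessment mentioned in the description.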



Similar Books


Practical Enterprise Data Lake Insights

by Saurabh Gupta, Venkata Giri

Use this practical guide to successfully handle the challenges encountered when designing an enterprise data lake and learn industry best practices to resolve issues. When designing an enterprise data lake you often hit a roadblock when you must leave the comfort of the relational world and learn the nuances of handling non-relational data...

Price:  $24.14  |  Publisher:  Apress  |  Release:  2018

Practical Oracle Database Appliance

by Bobby Curtis, Fuad Arshad, Erik Benner, Maris Elsins, Matt Gallagher, Pete Sharman, Yury Velikanov

Practical Oracle Database Appliance is a hands-on book taking you through the components and implementation of the Oracle Database Appliance. Learn about architecture, installation, configuration, and reconfiguration. Install and configure the Oracle Database Appliance with confidence. Make the right choices between the various configurat...

Price:  $49.99  |  Publisher:  Apress  |  Release:  2014

Fundamentals of Data Engineering

by Joe Reis, Matt Housley

Data engineering has grown rapidly in the past decade, leaving many software engineers, data scientists, and analysts looking for a comprehensive view of this practice. With this practical book, you'll learn how to plan and build systems to serve the needs of your organization and customers by evaluating the best technologies availab...

Price:  $36.47  |  Publisher:  O'Reilly Media  |  Release:  2022

Practical Simulations for Machine Learning

by Paris Buttfield-Addison, Mars Buttfield-Addison, Tim Nugent, Jon Manning

Simulation and synthesis are core parts of the future of AI and machine learning. Consider: programmers, data scientists, and machine learning engineers can create the brain of a self-driving car without the car. Rather than use information from the real world, you can synthesize artificial data using simulations to train traditional mach...

Price:  $59.99  |  Publisher:  O'Reilly Media  |  Release:  2022

Practical Python Data Wrangling and Data Quality

by Susan E. McGregor

The world around us is full of data that holds unique insights and valuable stories, and this book will help you uncover them. Whether you already work with data or want to learn more about its possibilities, the examples and techniques in this practical book will help you more easily clean, evaluate, and analyze data so that you can gene...

Price:  $49.58  |  Publisher:  O'Reilly Media  |  Release:  2021

The Self-Service Data Roadmap

by Sandeep Uttamchandani

Data-driven insights are a key competitive advantage for any industry today, but deriving insights from raw data can still take days or weeks. Most organizations can't scale data science teams fast enough to keep up with the growing amounts of data to transform. What's the answer? Self-service data. With this practical book, data...

Price:  $48.49  |  Publisher:  O'Reilly Media  |  Release:  2020

Thinking with Data

by Max Shron

Many analysts are too concerned with tools and techniques for cleansing, modeling, and visualizing datasets and not concerned enough with asking the right questions. In this practical guide, data strategy consultant Max Shron shows you how to put the why before the how, through an often-overlooked set of analytical skills. Thinking with Da...

Price:  $25.17  |  Publisher:  O'Reilly Media  |  Release:  2014

Practical Neo4j

by Greg Jordan

Why have developers at places like Facebook and Twitter increasingly turned to graph databases to manage their highly connected big data? The short answer is that graphs offer superior speed and flexibility to get the job done. It's time you added skills in graph databases to your toolkit. In Practical Neo4j, database expert Greg Jord...

Price:  $24.53  |  Publisher:  Apress  |  Release:  2015