
Communications of the ACM

ACM TechNews

Artificial Data Give the Same Results as Real Data--Without Compromising Privacy

[Image: A representation of artificial data. Credit: MIT News]

Researchers at the Massachusetts Institute of Technology (MIT) have developed the Synthetic Data Vault (SDV), a machine-learning system that automatically creates synthetic data. Such artificial data can be used in data science efforts that otherwise would be thwarted due to limited access to authentic data.

Using authentic data raises significant privacy concerns, whereas synthetic data can still be used to develop and test data science algorithms and models.

The SDV algorithm, known as recursive conditional parameter aggregation, exploits the hierarchical organization of data common to all databases.
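To make the idea of aggregating parameters up a table hierarchy more concrete, here is a minimal one-level sketch. It is not the SDV implementation: the table names, column names, and the `aggregate_child_params` helper are all hypothetical, and simple count, mean, and standard deviation stand in for whatever parameters the real system fits. The sketch only shows how statistics describing a parent row's child rows can be folded into that parent row, so that the parent table alone carries information about the tables beneath it.

```python
from statistics import mean, pstdev


def aggregate_child_params(parent_rows, child_rows, parent_key, foreign_key, value_col):
    """Return copies of the parent rows, each extended with summary
    parameters (count, mean, std) of the child rows that reference it.

    Hypothetical helper for illustration only; not part of SDV.
    """
    extended = []
    for parent in parent_rows:
        # Collect the values of the child rows that belong to this parent.
        values = [
            child[value_col]
            for child in child_rows
            if child[foreign_key] == parent[parent_key]
        ]
        row = dict(parent)
        row[f"{value_col}_count"] = len(values)
        row[f"{value_col}_mean"] = mean(values) if values else 0.0
        row[f"{value_col}_std"] = pstdev(values) if values else 0.0
        extended.append(row)
    return extended


# Toy two-table hierarchy: each order references one customer.
customers = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 51},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 20.0},
    {"order_id": 11, "customer_id": 1, "amount": 40.0},
    {"order_id": 12, "customer_id": 2, "amount": 15.0},
]

extended_customers = aggregate_child_params(
    customers, orders, "customer_id", "customer_id", "amount"
)
for row in extended_customers:
    print(row)
```

In a deeper hierarchy, the same step would presumably be applied bottom-up, so that the lowest tables are summarized first and their parameters travel upward with each aggregation.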

The researchers found the synthetic data can successfully replace real data in software writing and testing. They also note the SDV can be scaled to create very small or very large synthetic datasets, facilitating rapid development cycles or stress tests for big data systems.
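The scaling claim can be illustrated with a small sketch: once a model has been fitted to real data, the number of synthetic rows drawn from it is a free choice. The example below assumes a single numeric column modeled as a Gaussian, which is a deliberate simplification and not a description of how the SDV models data. The function names `fit_gaussian` and `sample_synthetic` and the sample values are hypothetical.

```python
import random
from statistics import mean, pstdev


def fit_gaussian(values):
    """Fit a (mean, std) pair to one numeric column. Illustrative only."""
    return mean(values), pstdev(values)


def sample_synthetic(params, n_rows, seed=0):
    """Draw n_rows synthetic values from the fitted Gaussian."""
    mu, sigma = params
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return [rng.gauss(mu, sigma) for _ in range(n_rows)]


# A made-up "real" column used only to fit the toy model.
real_amounts = [20.0, 40.0, 15.0, 32.5, 27.0]
params = fit_gaussian(real_amounts)

# The same fitted model yields a tiny set for quick development cycles
# or a large one for stress testing.
small = sample_synthetic(params, 10)
large = sample_synthetic(params, 100_000)
print(len(small), len(large))
```

The sampled values never copy a specific real record; they are drawn from the fitted parameters, which is the property that lets the dataset size be chosen independently of the original data.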

From MIT News


Abstracts Copyright © 2017 Information Inc., Bethesda, Maryland, USA

