SDV: The Synthetic Data Vault

What can you use synthetic data for?

Use synthetic data in place of real data for added protection, or alongside your real data as an enhancement.

Expand Access

Pilot New Products

The SDV Ecosystem

Public, Source-Available Libraries

The SDV is an ecosystem of synthetic data models, benchmarks, and metrics. Explore the publicly available libraries that support it. Each can be used as a standalone package for particular needs.


Copulas

Models & generates tabular data with classic statistical methods. Uses multivariate copulas.


CTGAN

Models & generates tabular data with Deep Learning. Offers CTGAN and TVAE models.


DeepEcho

Models & generates time series data with a mix of classic statistical models and Deep Learning.


RDT

Discovers properties & transforms data for data science use. Reverses the transforms to reproduce realistic data.

Synthetic Data Vault

Generates synthetic data across single table, relational, and time series data. Supports multiple models & evaluations.
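
The transform-and-reverse idea mentioned above (learn a column's properties, convert it to numbers for modeling, then convert numbers back into realistic values) can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the class name ReversibleCategoryTransform and its methods are hypothetical and are not the API of any SDV library. It assumes a single categorical column and gives each category a slice of the range [0, 1] sized by how often it appears.

```python
# Hypothetical illustration only; this class is not part of any SDV library.
class ReversibleCategoryTransform:
    """Maps each category to a slice of [0, 1] sized by its frequency."""

    def fit(self, values):
        # Count how often each category appears in the column.
        counts = {}
        for value in values:
            counts[value] = counts.get(value, 0) + 1

        # Assign each category a contiguous slice of [0, 1].
        self.intervals = {}
        start = 0.0
        for category in sorted(counts):
            width = counts[category] / len(values)
            self.intervals[category] = (start, start + width)
            start += width
        return self

    def transform(self, values):
        # Represent each category by the midpoint of its slice.
        return [sum(self.intervals[value]) / 2 for value in values]

    def reverse_transform(self, numbers):
        # Map each number back to the category whose slice contains it.
        categories = list(self.intervals)
        restored = []
        for number in numbers:
            match = categories[-1]  # fallback for numbers outside every slice
            for category in categories:
                low, high = self.intervals[category]
                if low <= number < high:
                    match = category
                    break
            restored.append(match)
        return restored


room_types = ['BASIC', 'BASIC', 'PREMIUM', 'BASIC']
transformer = ReversibleCategoryTransform().fit(room_types)
numeric = transformer.transform(room_types)
restored = transformer.reverse_transform(numeric)
```

Here restored matches room_types, and any new number between 0 and 1 (for example, one produced by a model) also maps back to a valid category, with frequent categories covering more of the range.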


Try it out now!

Quickly discover SDV with just a few lines of code!

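from sdv.datasets.demo import download_demo
from sdv.lite import SingleTablePreset

# Download a demo table and its metadata
real_data, metadata = download_demo(
    'single_table', 'fake_hotel_guests')

# Create the synthesizer and fit it to the real data before sampling
synthesizer = SingleTablePreset(metadata, name='FAST_ML')
synthesizer.fit(real_data)

synthetic_data = synthesizer.sample(num_rows=10)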

Join Us

Join Our Community

Chat with developers across the world. Stay up to date with the latest features, blogs, and news.

© 2023, DataCebo, Inc.