The Synthetic Data Vault Blog. Put synthetic data to work!

Featured Article

The Most Important Open Source Demographic That No One Thinks About

Kalyan Veeramachaneni

23 January, 2023

How we define a user in 2023 to build a community around synthetic data.

All Articles

Neha Patki

26 January, 2023

3 user-centric growth strategies for open source

Our open source project grew faster when we adopted a user-centric mindset. Here are 3 strategies we used along the way.

Kalyan Veeramachaneni

23 January, 2023

The Most Important Open Source Demographic That No One Thinks About

How we define a user in 2023 to build a community around synthetic data.

Neha Patki

10 January, 2023

Can you use synthetic data for label balancing?

Imbalanced data can prevent your projects from succeeding. Will synthetic data work? Explore the rationale behind label balancing.

Santiago Gomez Paz

20 December, 2022

Interpreting the Progress of CTGAN

It can be difficult to verify the progress that a GAN is making. What if we combined it with easily interpretable metrics and visualizations?

Neha Patki

07 October, 2022

How to evaluate synthetic data for your project — and avoid the biggest mistake we see

Proper evaluation is critical when using synthetic data. Avoid this common mistake and lead your project to success.

Arnav Modi

24 February, 2022

ML Model Development using Synthetic Data Clones

What happens when you train a machine learning model on synthetic data instead of real data? Let's experiment to find out.

Neha Patki

25 January, 2022

Building the Unique Combinations Constraint in the SDV

Sometimes, you want to limit the number of permutations in your synthetic data. Explore the strategies we used for enforcing this kind of logic.

Kalyan Veeramachaneni

03 January, 2022

The SDV in 2021: A year in review

In this article, we summarize the SDV's growth in downloads and community building, which indicates increasing market demand for synthetic data.

Andrew Montanez

21 December, 2021

How we engineered constraint handling strategies in SDV

The SDV enforces deterministic rules using constraints. What strategies did we use to engineer this ML system? Dive into the details.

Neha Patki

01 December, 2021

User input to enhance synthetic data generation

ML models learn some rules out of the box, while other logic requires more work. Which is which? Read more to find out.

Neha Patki

16 November, 2021

Software Testing: Synthetic data changes the game

Creating fake data is an old concept, but machine learning is a whole new ballgame. Learn why ML is a key ingredient in synthetic data.

Neha Patki

19 May, 2021

Your Feedback in Action, Part 2: Data Workflow

After thousands of downloads, see how the synthetic data workflow in the SDV has evolved based on feedback from users.