This repository is part of The Synthetic Data Vault Project, a project from DataCebo.

[![Development Status](https://img.shields.io/badge/Development%20Status-2%20--%20Pre--Alpha-yellow)](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha) [![PyPi Shield](https://img.shields.io/pypi/v/copulas.svg)](https://pypi.python.org/pypi/copulas) [![Downloads](https://pepy.tech/badge/copulas)](https://pepy.tech/project/copulas) [![Unit Tests](https://github.com/sdv-dev/Copulas/actions/workflows/unit.yml/badge.svg)](https://github.com/sdv-dev/Copulas/actions/workflows/unit.yml) [![Coverage Status](https://codecov.io/gh/sdv-dev/Copulas/branch/master/graph/badge.svg)](https://codecov.io/gh/sdv-dev/Copulas) [![Slack](https://img.shields.io/badge/Community-Slack-blue?style=plastic&logo=slack)](https://bit.ly/sdv-slack-invite)

</div>

Overview

Copulas is a Python library for modeling multivariate distributions and sampling from them using copula functions. Given a table of numerical data, use Copulas to learn the distribution and generate new synthetic data following the same statistical properties.

Key Features:

  • Model multivariate data. Choose from a variety of univariate distributions and copulas – including Archimedean Copulas, Gaussian Copulas and Vine Copulas.

  • Compare real and synthetic data visually after building your model. Visualizations are available as 1D histograms, 2D scatterplots and 3D scatterplots.

  • Access & manipulate learned parameters. With complete access to the internals of the model, you can set or tune parameters as you see fit.
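
To make the copula idea behind these features concrete, here is a minimal standard-library sketch of a two-column Gaussian copula. It is not the Copulas API or its implementation: the helper names (`to_normal_scores`, `fit_gaussian_copula`, `sample_gaussian_copula`, and so on) are hypothetical, the marginals are assumed to be plain empirical distributions, and the toy input data is made up for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf / inverse cdf


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def to_normal_scores(values):
    """Map a column to standard-normal scores through its empirical ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # rank / (n + 1) keeps the probability strictly inside (0, 1)
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def empirical_quantile(sorted_values, u):
    """Return the observed value sitting at probability u."""
    idx = min(int(u * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]


def fit_gaussian_copula(xs, ys):
    """Learn the dependence (rho) and keep the sorted columns as marginals."""
    rho = pearson(to_normal_scores(xs), to_normal_scores(ys))
    return {"rho": rho, "x_sorted": sorted(xs), "y_sorted": sorted(ys)}


def sample_gaussian_copula(model, n, rng):
    """Draw n synthetic (x, y) rows from the fitted model."""
    rho = model["rho"]
    scale = max(0.0, 1.0 - rho ** 2) ** 0.5
    rows = []
    for _ in range(n):
        # Correlated standard normals built from two independent draws
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)
        # Normal scores -> uniforms -> values from the empirical marginals
        u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
        rows.append((empirical_quantile(model["x_sorted"], u1),
                     empirical_quantile(model["y_sorted"], u2)))
    return rows


rng = random.Random(0)
# Toy "real" data (made up): y depends on x plus noise
xs = [rng.expovariate(1.0) for _ in range(500)]
ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]

model = fit_gaussian_copula(xs, ys)
synthetic = sample_gaussian_copula(model, 500, rng)
```

The split between marginals (the sorted columns) and dependence (the single `rho` value) is what lets a copula model mix different univariate distributions with different dependence structures. The library itself offers parametric univariate distributions and several copula families, so treat this only as a picture of the general technique.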

Install

Install the Copulas library using pip or conda.

pip install copulas
conda install -c conda-forge copulas

Usage

Get started using a demo dataset. This dataset contains 3 numerical columns.

```python
from copulas.datasets import sample_trivariate_xyz

real_data = sample_trivariate_xyz()
```
