copulas.datasets module
Sample datasets for the Copulas library.
-
copulas.datasets.sample_bivariate_age_income(size=1000, seed=42)
Sample from a bivariate toy dataset.
This dataset contains two columns, simulated age and income, which are positively correlated and include outliers.
- Parameters
size (int) – Number of samples to generate. Defaults to 1000.
seed (int) – Random seed to use. Defaults to 42.
- Returns
DataFrame with two columns, age and income.
- Return type
pandas.DataFrame
-
copulas.datasets.sample_trivariate_xyz(size=1000, seed=42)
Sample from a three-dimensional toy dataset.
The output is a DataFrame containing three columns:
x: Beta distribution with a=0.1 and b=0.1
y: Beta distribution with a=0.1 and b=0.5
z: Normal distribution + 10 times y
- Parameters
size (int) – Number of samples to generate. Defaults to 1000.
seed (int) – Random seed to use. Defaults to 42.
- Returns
DataFrame with three columns, x, y and z.
- Return type
pandas.DataFrame
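The construction of the three columns described above can be sketched without the library. The following is a minimal, dependency-free illustration built on Python's `random` module, not the library's implementation. The helper name `sketch_trivariate_xyz` is hypothetical, it returns plain lists rather than a DataFrame, and the normal term is assumed to be a standard normal because the description does not give its parameters.

```python
import random


def sketch_trivariate_xyz(size=1000, seed=42):
    """Return three lists (x, y, z) built as described above.

    Assumption: the normal term in z is a standard normal (mean 0, stdev 1).
    """
    rng = random.Random(seed)
    x = [rng.betavariate(0.1, 0.1) for _ in range(size)]
    y = [rng.betavariate(0.1, 0.5) for _ in range(size)]
    # z is normal noise plus 10 times y, so z depends on y but not on x.
    z = [rng.gauss(0.0, 1.0) + 10.0 * value for value in y]
    return x, y, z


x, y, z = sketch_trivariate_xyz(size=5)
for row in zip(x, y, z):
    print(row)
```

Because both beta distributions have shape parameters below 1, most of the x and y values cluster near 0 and 1.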
-
copulas.datasets.sample_univariate_bernoulli(size=1000, seed=42)
Sample from a Bernoulli distribution with p=0.3.
The distribution is built by sampling a uniform random value and then setting the result to 1 when the value is below 0.3 and to 0 otherwise.
- Parameters
size (int) – Number of samples to generate. Defaults to 1000.
seed (int) – Random seed to use. Defaults to 42.
- Returns
Series with the sampled values.
- Return type
pandas.Series
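The thresholding step described above can be sketched as follows. This is a dependency-free illustration under stated assumptions, not the library's code: the helper name `sketch_bernoulli` and its `p` parameter are hypothetical, and it returns a plain list rather than a pandas.Series.

```python
import random


def sketch_bernoulli(size=1000, seed=42, p=0.3):
    """Draw uniform values in [0, 1) and threshold them at p."""
    rng = random.Random(seed)
    # A uniform draw falls below p with probability p, so each sample
    # is 1.0 with probability p and 0.0 otherwise.
    return [1.0 if rng.random() < p else 0.0 for _ in range(size)]


samples = sketch_bernoulli()
# Fraction of ones; it approaches p as size grows.
print(sum(samples) / len(samples))
```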
-
copulas.datasets.sample_univariate_beta(size=1000, seed=42)
Sample from a beta distribution with a=3, b=1 and loc=4.
- Parameters
size (int) – Number of samples to generate. Defaults to 1000.
seed (int) – Random seed to use. Defaults to 42.
- Returns
Series with the sampled values.
- Return type
pandas.Series
-
copulas.datasets.sample_univariate_bimodal(size=1000, seed=42)
Sample from a bimodal distribution which mixes two Gaussians at 0.0 and 10.0 with stdev=1.
The distribution is built by sampling a standard normal and a normal with mean 10 and then selecting one or the other based on a Bernoulli distribution.
- Parameters
size (int) – Number of samples to generate. Defaults to 1000.
seed (int) – Random seed to use. Defaults to 42.
- Returns
Series with the sampled values.
- Return type
pandas.Series
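The mixing procedure described above can be sketched as below. This is an illustration under assumptions rather than the library's implementation: the helper name `sketch_bimodal` is hypothetical, the result is a plain list, and the mixing weight `p` is a placeholder because the description does not state which Bernoulli probability is used.

```python
import random


def sketch_bimodal(size=1000, seed=42, p=0.5):
    """Mix N(0, 1) and N(10, 1), keeping the second with probability p.

    Assumption: p is a hypothetical mixing weight chosen for illustration.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(size):
        first = rng.gauss(0.0, 1.0)    # mode centered at 0
        second = rng.gauss(10.0, 1.0)  # mode centered at 10
        # A Bernoulli draw decides which of the two values is kept.
        samples.append(second if rng.random() < p else first)
    return samples


samples = sketch_bimodal(size=10)
print([round(value, 2) for value in samples])
```

With the two modes ten standard deviations apart, nearly every sample falls clearly into one cluster or the other.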
-
copulas.datasets.sample_univariate_degenerate(size=1000, seed=42)
Sample from a degenerate distribution that only takes one random value.
- Parameters
size (int) – Number of samples to generate. Defaults to 1000.
seed (int) – Random seed to use. Defaults to 42.
- Returns
Series with the sampled values.
- Return type
pandas.Series
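A degenerate sample can be pictured as one random draw repeated `size` times. The sketch below illustrates that idea only; the helper name `sketch_degenerate` is hypothetical, and drawing the constant from [0, 1) is an assumption, since the description does not say which range the value comes from.

```python
import random


def sketch_degenerate(size=1000, seed=42):
    """Return a list in which every entry is the same random value."""
    rng = random.Random(seed)
    # Assumption: the constant is drawn uniformly from [0, 1).
    value = rng.random()
    return [value] * size


samples = sketch_degenerate(size=5)
print(samples)
print(len(set(samples)))  # number of distinct values
```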
-
copulas.datasets.sample_univariate_exponential(size=1000, seed=42)
Sample from an exponential distribution with location 3.0 and rate 1.0.
- Parameters
size (int) – Number of samples to generate. Defaults to 1000.
seed (int) – Random seed to use. Defaults to 42.
- Returns
Series with the sampled values.
- Return type
pandas.Series
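One way to read the description above is as a standard exponential draw shifted so that its support starts at 3.0. The sketch below follows that reading, which is an assumption rather than a confirmed detail of the library; the helper name `sketch_shifted_exponential` and its `loc` and `rate` parameters are hypothetical.

```python
import random


def sketch_shifted_exponential(size=1000, seed=42, loc=3.0, rate=1.0):
    """Draw exponential samples with the given rate and shift them by loc."""
    rng = random.Random(seed)
    # expovariate returns non-negative values with mean 1 / rate, so the
    # shifted samples are all >= loc and have mean loc + 1 / rate.
    return [loc + rng.expovariate(rate) for _ in range(size)]


samples = sketch_shifted_exponential(size=5)
print([round(value, 3) for value in samples])
```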
-
copulas.datasets.sample_univariate_normal(size=1000, seed=42)
Sample from a normal distribution with mean 1 and stdev 1.
- Parameters
size (int) – Number of samples to generate. Defaults to 1000.
seed (int) – Random seed to use. Defaults to 42.
- Returns
Series with the sampled values.
- Return type
pandas.Series
-
copulas.datasets.sample_univariate_uniform(size=1000, seed=42)
Sample from a uniform distribution in [-1.0, 3.0].
- Parameters
size (int) – Number of samples to generate. Defaults to 1000.
seed (int) – Random seed to use. Defaults to 42.
- Returns
Series with the sampled values.
- Return type
pandas.Series
-
copulas.datasets.sample_univariates(size=1000, seed=42)
Sample from a list of univariate distributions.
- Parameters
size (int) – Number of samples to generate. Defaults to 1000.
seed (int) – Random seed to use. Defaults to 42.
- Returns
DataFrame with the sampled distributions.
- Return type
pandas.DataFrame