Danger

You are looking at the documentation for an older version of the SDV. This version of the software is no longer supported or maintained.


sdv.metrics.tabular.GMLogLikelihood.compute

classmethod GMLogLikelihood.compute(real_data, synthetic_data, metadata=None, n_components=(1, 30), covariance_type='diag', iterations=3, retries=3)

Compute this metric.

This fits multiple GaussianMixture models to the real data and then evaluates how likely it is that the synthetic data belongs to the same distribution as the real data.

By default, the optimal number of components and covariance type are searched using the real data, and the likelihood of the synthetic data is then evaluated with those settings 3 times.

Real and synthetic data must be passed as pandas.DataFrame instances, and metadata as a dict representation of the Table metadata.

If no metadata is given, one will be built from the values observed in the real_data.

The output is the average log likelihood across all the GMMs evaluated.
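The procedure described above can be sketched directly with scikit-learn's GaussianMixture. This is a minimal illustration, not the SDV implementation. The helper name gm_log_likelihood_sketch is hypothetical, the use of BIC to pick the number of components is an assumption (the text only says the optimal value is searched), and treating the upper bound of the range as inclusive is also an assumption. A smaller search range than the default is used to keep the example fast.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def gm_log_likelihood_sketch(real, synthetic, n_components=(1, 5),
                             covariance_type="diag", iterations=3, seed=0):
    """Hypothetical helper: average log likelihood of `synthetic`
    under GMMs fitted on `real`."""
    real = np.asarray(real, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)

    # Step 1: search the number of components on the real data.
    # ASSUMPTION: BIC is the selection criterion; the upper bound is inclusive.
    best_k, best_bic = None, np.inf
    for k in range(n_components[0], n_components[1] + 1):
        gmm = GaussianMixture(n_components=k, covariance_type=covariance_type,
                              random_state=seed)
        gmm.fit(real)
        bic = gmm.bic(real)
        if bic < best_bic:
            best_k, best_bic = k, bic

    # Step 2: refit with the selected settings `iterations` times and
    # score the synthetic data each time.
    scores = []
    for i in range(iterations):
        gmm = GaussianMixture(n_components=best_k, covariance_type=covariance_type,
                              random_state=seed + i)
        gmm.fit(real)
        # score() returns the mean per-sample log likelihood.
        scores.append(gmm.score(synthetic))

    # Step 3: average across all evaluated GMMs.
    return float(np.mean(scores))


rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 2))
close = rng.normal(0.0, 1.0, size=(500, 2))  # same distribution as `real`
far = rng.normal(5.0, 1.0, size=(500, 2))    # shifted distribution

score_close = gm_log_likelihood_sketch(real, close)
score_far = gm_log_likelihood_sketch(real, far)
print(score_close, score_far)
```

Synthetic data drawn from the same distribution as the real data should receive a higher (less negative) average log likelihood than data drawn from a shifted distribution.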

Parameters
  • real_data (Union[numpy.ndarray, pandas.DataFrame]) – The values from the real dataset.

  • synthetic_data (Union[numpy.ndarray, pandas.DataFrame]) – The values from the synthetic dataset.

  • metadata (dict) – Table metadata dict.

  • n_components (Union[int, tuple[int]]) – Number of components to use for the GMM. If a tuple with 2 integers is passed, the optimal number of components within the range will be searched. Defaults to (1, 30).

  • covariance_type (Union[str, tuple[str]]) – Covariance type to use for the GMM. If multiple values are passed, the best one will be searched. Defaults to 'diag'.

  • iterations (int) – Number of times that each number of components should be evaluated before averaging the scores. Defaults to 3.

  • retries (int) – Number of times that each iteration will be retried if the GMM model crashes during fit. Defaults to 3.
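To make the parameter semantics above concrete, the following sketch shows one way the n_components and covariance_type arguments could expand into a grid of candidate settings, and how a fit could be retried on failure. Both helpers (candidate_settings and fit_with_retries) are hypothetical and are not part of the SDV API. The inclusive upper bound and the choice to catch any Exception are assumptions of this sketch.

```python
from itertools import product


def candidate_settings(n_components=(1, 30), covariance_type="diag"):
    """Hypothetical helper: expand the arguments into (n_components,
    covariance_type) pairs to search over."""
    if isinstance(n_components, int):
        counts = [n_components]  # a single int means no search
    else:
        low, high = n_components
        counts = list(range(low, high + 1))  # ASSUMPTION: inclusive upper bound

    if isinstance(covariance_type, str):
        kinds = [covariance_type]  # a single str means no search
    else:
        kinds = list(covariance_type)

    return list(product(counts, kinds))


def fit_with_retries(fit, retries=3):
    """Hypothetical helper: call `fit()` up to `retries` times and return
    the first successful result, or None if every attempt fails."""
    for _ in range(retries):
        try:
            return fit()
        except Exception:  # ASSUMPTION: any exception counts as a crash
            continue
    return None


# A fixed setting yields a single candidate; ranges and tuples yield a grid.
single = candidate_settings(3, "full")
grid = candidate_settings((1, 3), ("diag", "full"))

# Simulate a fit that fails twice before succeeding.
attempts = []


def flaky_fit():
    attempts.append(1)
    if len(attempts) < 3:
        raise ValueError("simulated fit failure")
    return "fitted"


result = fit_with_retries(flaky_fit, retries=3)
print(single, len(grid), result)
```

Passing a single int and a single str disables the search entirely, which can be useful when the appropriate settings are already known and a faster evaluation is wanted.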

Returns

Average score returned by the GaussianMixtures.

Return type

float