Danger

You are looking at the documentation for an older version of the SDV. This version of the software is no longer supported or maintained.


sdv.metrics.tabular.BNLikelihood.compute

classmethod BNLikelihood.compute(real_data, synthetic_data, metadata=None, structure=None)

Compute this metric.

This fits a BayesianNetwork to the real data and then evaluates how likely it is that the synthetic data belongs to the same distribution.

Real data and synthetic data must be passed as pandas.DataFrame instances and metadata as a Table metadata dict representation.

If no metadata is given, one will be built from the values observed in the real_data.

If a structure is given, either directly or as a top-level structure entry in the metadata dict, it is passed to the underlying BayesianNetwork for fitting. Otherwise, the structure is learned from the data using the Chow-Liu algorithm.

structure can be passed either as a tuple of tuples representing only the network structure or as a dict representing a full serialization of a previously fitted BayesianNetwork. In the latter scenario, only the structure is extracted from the BayesianNetwork instance, and a new network is then fitted to the given data.
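
To make the tuple-of-tuples form concrete, here is a minimal sketch that turns such a structure into a list of directed edges. It assumes the common convention in which entry i of the outer tuple lists the indices of the parent columns of column i; that convention and the column names below are illustrative assumptions, not something this page confirms.

```python
# Hypothetical column names, in the same order as the DataFrame columns.
columns = ["age", "income", "education"]

# Assumed convention: structure[i] holds the parent indices of column i.
# Here column 0 has no parents, and columns 1 and 2 both depend on column 0.
structure = ((), (0,), (0,))

# Expand the structure into (parent, child) edges for inspection.
edges = [
    (columns[parent], columns[child])
    for child, parents in enumerate(structure)
    for parent in parents
]

print(edges)
```

Listing the edges this way is a quick check that a hand-written structure encodes the dependencies you intended before passing it to compute.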

The output is the average probability across all the synthetic rows.
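
The following stdlib-only sketch illustrates what "average probability across all the synthetic rows" means for a fixed structure. It is not the SDV implementation: the helper names are hypothetical, the columns are assumed to be discrete, the structure is assumed to list parent indices per column, and combinations never seen in the real data are simply given probability 0, which the library may handle differently.

```python
from collections import Counter


def fit_counts(rows, structure):
    """Count (parent values, value) pairs per column on the real rows."""
    joint = [Counter() for _ in structure]
    parent = [Counter() for _ in structure]
    for row in rows:
        for i, parents in enumerate(structure):
            key = tuple(row[p] for p in parents)
            joint[i][(key, row[i])] += 1
            parent[i][key] += 1
    return joint, parent


def row_probability(row, structure, joint, parent):
    """Product over columns of P(value | parent values), estimated by counting."""
    prob = 1.0
    for i, parents in enumerate(structure):
        key = tuple(row[p] for p in parents)
        if parent[i][key] == 0:
            return 0.0  # parent combination never observed in the real data
        prob *= joint[i][(key, row[i])] / parent[i][key]
    return prob


def mean_likelihood(real_rows, synthetic_rows, structure):
    """Fit on the real rows, then average the probabilities of the synthetic rows."""
    joint, parent = fit_counts(real_rows, structure)
    probs = [row_probability(r, structure, joint, parent) for r in synthetic_rows]
    return sum(probs) / len(probs)


real = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "y")]
synthetic = [("a", "x"), ("b", "x")]
structure = ((), (0,))  # column 1 depends on column 0

score = mean_likelihood(real, synthetic, structure)
print(score)
```

In this toy data the first synthetic row is plausible under the fitted network while the second pairs values that never occur together in the real data, so the mean falls well below the probability of the first row alone. A single implausible row can therefore pull the score down noticeably.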

Parameters
  • real_data (Union[numpy.ndarray, pandas.DataFrame]) – The values from the real dataset.

  • synthetic_data (Union[numpy.ndarray, pandas.DataFrame]) – The values from the synthetic dataset.

  • metadata (dict) – Table metadata dict. If not passed, it is built based on the real_data fields and dtypes. Optionally, the metadata can include a structure entry with the structure of the Bayesian Network.

  • structure (dict) – Optional. BayesianNetwork structure to use when fitting to the real data. If not passed, it is learned from the data using the Chow-Liu algorithm. This is ignored if metadata is passed and contains a structure entry.

Returns

Mean of the probabilities returned by the Bayesian Network.

Return type

float