sdv.tabular.copulas.GaussianCopula.sample_remaining_columns

GaussianCopula.sample_remaining_columns(known_columns, batch_size=None, randomize_samples=True, output_file_path=None)

Sample the remaining columns of this table, conditioned on the values given in known_columns.

Parameters
  • known_columns (pandas.DataFrame) – A pandas.DataFrame with the columns that are already known. The output is a DataFrame such that each row in the output is sampled conditionally on the corresponding row in the input.

  • batch_size (int or None) – The number of rows to sample per batch. If None, defaults to num_rows, the number of rows being sampled.

  • randomize_samples (bool) – Whether to randomize the samples. If False, a fixed seed is used when sampling, so repeated calls produce the same rows. Defaults to True.

  • output_file_path (str or None) – The file to periodically write sampled rows to. Defaults to a temporary file, if None.
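
The row-by-row conditioning described for known_columns, and the fixed-seed behaviour of randomize_samples, can be illustrated with a toy model. The sketch below is not SDV's implementation: it assumes a two-column Gaussian with hypothetical parameters (mean_x, mean_y, std_x, std_y, rho) and a hypothetical helper sample_remaining_y, and it assumes that randomize_samples=False corresponds to using a fixed seed. For each known x it draws y from the conditional distribution of y given x.

```python
import math
import random


def sample_remaining_y(known_x, mean_x, mean_y, std_x, std_y, rho,
                       randomize_samples=True):
    """Sample y for each known x from a bivariate Gaussian (toy model, not SDV code)."""
    # A fixed seed makes repeated calls reproducible; otherwise use fresh entropy.
    rng = random.Random() if randomize_samples else random.Random(0)

    # Standard deviation of y given x is the same for every row.
    cond_std = std_y * math.sqrt(1.0 - rho ** 2)

    rows = []
    for x in known_x:
        # Mean of y given x shifts with how far x is from its own mean.
        cond_mean = mean_y + rho * (std_y / std_x) * (x - mean_x)
        rows.append({"x": x, "y": rng.gauss(cond_mean, cond_std)})
    return rows


# One output row per input row, in the same order.
known = [150.0, 170.0, 190.0]
rows = sample_remaining_y(known, 170.0, 70.0, 10.0, 12.0, 0.8,
                          randomize_samples=False)
```

Each output row keeps its known value and gains a sampled one, which mirrors the contract stated for known_columns: row i of the output is conditioned on row i of the input.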

Returns

Sampled data.

Return type

pandas.DataFrame

Raises
  • ConstraintsNotMetError – If the conditions are not valid for the given constraints.

  • ValueError – If any of the following happens:

      • any of the conditions’ columns are not valid.

      • no rows could be generated.
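
A caller may want to handle the ValueError cases listed above rather than let them propagate. The following is a minimal sketch under stated assumptions: sample_or_none is a hypothetical wrapper, model is assumed to be a fitted GaussianCopula (or any object exposing the same method), and randomize_samples=False is assumed to fix the seed. Only ValueError is handled explicitly here.

```python
def sample_or_none(model, known_columns, batch_size=None):
    """Return sampled rows, or None if the model raises ValueError.

    `model` is assumed to expose sample_remaining_columns with the
    signature documented on this page (hypothetical wrapper, not SDV code).
    """
    try:
        return model.sample_remaining_columns(
            known_columns,
            batch_size=batch_size,
            randomize_samples=False,  # assumed to fix the seed for reproducibility
        )
    except ValueError as error:
        # Documented causes: invalid condition columns, or no rows generated.
        print(f"Sampling failed: {error}")
        return None
```

In practice known_columns would be a pandas.DataFrame whose column names match columns of the table the model was fitted on, and a None result signals that the conditions should be revisited.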