You are looking at the documentation for an older version of the SDV! We are no longer supporting or maintaining this version of the software.


CTGAN.sample_remaining_columns(known_columns, max_tries_per_batch=100, batch_size=None, randomize_samples=True, output_file_path=None)

Sample the remaining columns of this table, conditioned on the values of the columns that are already known.

Parameters

  • known_columns (pandas.DataFrame) – A pandas.DataFrame with the columns that are already known. The output is a DataFrame such that each row in the output is sampled conditionally on the corresponding row in the input.

  • max_tries_per_batch (int) – Number of times to retry sampling until the batch size is met. Defaults to 100.

  • batch_size (int) – The batch size to use per sampling call.

  • randomize_samples (bool) – Whether to randomize the samples. If False, a fixed seed is used when sampling. Defaults to True.

  • output_file_path (str or None) – The file to periodically write sampled rows to. If None, a temporary file is used. Defaults to None.


Returns

Sampled data.

Return type

pandas.DataFrame

Raises

  • ConstraintsNotMetError – If the conditions are not valid for the given constraints.

  • ValueError – If any of the following happens:

      • any of the conditions’ columns are not valid.

      • no rows could be generated.
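
The interaction between max_tries_per_batch, batch_size and randomize_samples, and the ValueError raised when no rows could be generated, can be illustrated with a small standard-library sketch. This is a toy stand-in, not SDV's implementation: the function toy_sample_remaining, the accept predicate, the amount column and the seed value 0 are all hypothetical names and choices made for illustration, and the sketch assumes that candidates are proposed in batches and rejected until a valid one is found.

```python
import random


def toy_sample_remaining(known_rows, accept, max_tries_per_batch=100,
                         batch_size=None, randomize_samples=True):
    """Toy stand-in for conditional sampling; NOT SDV's implementation.

    known_rows: list of dicts holding the known column values.
    accept: hypothetical predicate deciding whether a candidate row is valid.
    """
    # Assumption: a fixed seed when randomize_samples is False makes runs repeatable.
    rng = random.Random() if randomize_samples else random.Random(0)
    # Assumption: with no batch_size, propose one candidate per attempt.
    batch_size = batch_size or 1

    output = []
    for known in known_rows:
        row = None
        for _ in range(max_tries_per_batch):
            # Propose a batch of candidates that keep the known values fixed.
            candidates = [dict(known, amount=round(rng.uniform(0, 100), 2))
                          for _ in range(batch_size)]
            valid = [c for c in candidates if accept(c)]
            if valid:
                row = valid[0]
                break
        if row is None:
            raise ValueError(f"no rows could be generated for {known}")
        # One output row per input row, in the same order.
        output.append(row)
    return output


known = [{"country": "US"}, {"country": "FR"}]
rows = toy_sample_remaining(known, accept=lambda r: r["amount"] > 50,
                            batch_size=10, randomize_samples=False)
```

Each output row keeps the known values of the corresponding input row, which mirrors the row-by-row conditioning described for known_columns. With the real method, known_columns is a pandas.DataFrame rather than a list of dicts.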