Danger

You are looking at the documentation for an older version of the SDV. This version of the software is no longer supported or maintained.

sdv.tabular.ctgan.CTGAN.sample

CTGAN.sample(num_rows, randomize_samples=True, max_tries_per_batch=100, batch_size=None, output_file_path=None, conditions=None)

Sample rows from this table.

Parameters
  • num_rows (int) – Number of rows to sample. This parameter is required.

  • randomize_samples (bool) – Whether to randomize the samples. If False, a fixed seed is used when sampling. Defaults to True.

  • max_tries_per_batch (int) – Maximum number of times to retry sampling until the batch size is met. Defaults to 100.

  • batch_size (int or None) – Number of rows to sample per batch. If None, defaults to num_rows.

  • output_file_path (str or None) – Path of the file to which sampled rows are periodically written. If None, rows are not written to any file.

  • conditions – Deprecated argument. Use the sample_conditions method with sdv.sampling.Condition objects instead.
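The interaction between num_rows, batch_size and max_tries_per_batch can be pictured with a small standalone sketch. This is not the SDV implementation: `sample_in_batches`, `draw_row` and `is_valid` are hypothetical names invented for illustration, and raising an error when a batch cannot be filled is a choice made for this sketch only; the library may behave differently.

```python
import random


def sample_in_batches(draw_row, is_valid, num_rows, batch_size=None,
                      max_tries_per_batch=100):
    """Collect ``num_rows`` valid rows, one batch at a time.

    ``draw_row`` and ``is_valid`` are hypothetical callables standing in for
    a model's row generator and a validity check on each generated row.
    """
    # Mirror the documented default: one batch covering all requested rows.
    batch_size = num_rows if batch_size is None else batch_size
    rows = []
    while len(rows) < num_rows:
        target = min(batch_size, num_rows - len(rows))
        batch = []
        for _ in range(max_tries_per_batch):
            # Draw only as many candidates as the batch still needs.
            candidates = [draw_row() for _ in range(target - len(batch))]
            batch.extend(row for row in candidates if is_valid(row))
            if len(batch) >= target:
                break
        if len(batch) < target:
            # Assumption of this sketch: give up loudly on an unfilled batch.
            raise ValueError("batch not filled within max_tries_per_batch tries")
        rows.extend(batch)
    return rows


rng = random.Random(0)  # fixed seed, so repeated runs give the same rows
rows = sample_in_batches(
    draw_row=lambda: rng.randint(0, 9),
    is_valid=lambda value: value % 2 == 0,  # keep even values only
    num_rows=10,
    batch_size=4,
)
```

Here the ten rows are gathered in batches of 4, 4 and 2, and each batch gets up to max_tries_per_batch attempts to replace rejected candidates.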

Returns

Sampled data.

Return type

pandas.DataFrame
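As a usage sketch, the helper below wraps a call to sample using only the parameters listed above. The helper name `sample_reproducibly` and the batch size of 100 are hypothetical choices, and it assumes that randomize_samples=False selects the fixed seed, as the parameter name suggests.

```python
def sample_reproducibly(model, num_rows, output_file_path=None):
    """Call ``model.sample`` with a fixed seed and bounded batches.

    ``model`` is assumed to be a fitted ``sdv.tabular.CTGAN`` instance, or any
    object exposing a compatible ``sample`` method. This helper is a
    hypothetical convenience wrapper, not part of the SDV.
    """
    return model.sample(
        num_rows,
        randomize_samples=False,         # assumed to mean: use the fixed seed
        batch_size=min(num_rows, 100),   # illustrative batch size
        output_file_path=output_file_path,
    )
```

With a fitted model, `sample_reproducibly(model, 250)` would be expected to return a pandas.DataFrame of sampled rows, per the return type documented above.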