Danger

You are looking at the documentation for an older version of the SDV. This version of the software is no longer supported or maintained.

sdv.lite.tabular.TabularPreset.sample_remaining_columns

TabularPreset.sample_remaining_columns(known_columns, max_tries_per_batch=100, batch_size=None, randomize_samples=True, output_file_path=None)

Sample rows from this table, filling in the remaining columns conditionally on the column values that are already known.

Parameters

known_columns (pandas.DataFrame) – A pandas.DataFrame with the columns that are already known. The output is a DataFrame such that each row in the output is sampled conditionally on the corresponding row in the input.
max_tries_per_batch (int) – Number of times to retry sampling, per batch, for rows that were discarded. Defaults to 100.
batch_size (int or None) – The batch size to use per attempt at sampling. If None, defaults to 10 times the number of rows.
randomize_samples (bool) – Whether to randomize the samples. If False, a fixed seed is used when sampling. Defaults to True.
output_file_path (str or None) – The file to periodically write sampled rows to. If None, a temporary file is used.

Returns

Sampled data.

Return type

pandas.DataFrame
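
The parameters above suggest a reject-sampling loop: draw a batch, keep rows that agree with the known values, and retry up to a limit. The sketch below illustrates that idea with the standard library only. It is not SDV's implementation. The names draw_row and sample_remaining_sketch, the toy columns, and the seed value 0 are all assumptions made for illustration, and plain dicts stand in for DataFrame rows.

```python
import random


def draw_row(rng):
    """Toy stand-in for a fitted model: draws one full synthetic row."""
    return {
        "city": rng.choice(["Boston", "Austin"]),
        "plan": rng.choice(["free", "pro"]),
        "age": rng.randint(18, 80),
    }


def sample_remaining_sketch(known_rows, max_tries_per_batch=100,
                            batch_size=None, randomize_samples=True):
    """Hypothetical reject-sampling loop mirroring the documented parameters."""
    # Assumption: seed 0 plays the role of "a fixed seed".
    rng = random.Random() if randomize_samples else random.Random(0)
    if batch_size is None:
        # Mirrors the documented default of 10 times the number of rows.
        batch_size = 10 * len(known_rows)

    sampled = []
    for known in known_rows:
        for _ in range(max_tries_per_batch):
            batch = [draw_row(rng) for _ in range(batch_size)]
            # Keep only rows that agree with every known column value.
            matches = [row for row in batch
                       if all(row[col] == val for col, val in known.items())]
            if matches:
                sampled.append(matches[0])
                break
        else:
            # Every attempt was discarded for this input row.
            raise ValueError(f"No valid row found for {known!r}")
    return sampled


known_rows = [{"city": "Boston"}, {"city": "Austin", "plan": "pro"}]
result = sample_remaining_sketch(known_rows, randomize_samples=False)
for known, row in zip(known_rows, result):
    print(known, "->", row)
```

Each output row corresponds to the input row at the same position, which matches the contract described for known_columns. With a fitted TabularPreset instance (here assumed to be called model), the corresponding call would be model.sample_remaining_columns(known_columns), where known_columns is a pandas.DataFrame.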
