sdv.tabular.ctgan.TVAE
Model wrapping the TVAESynthesizer model.
field_names (list[str]) – List of names of the fields that need to be modeled and included in the generated output data. Any additional fields found in the data will be ignored and will not be included in the generated output. If None, all the fields found in the data are used.
field_types (dict[str, dict]) – Dictionary specifying the data types and subtypes of the fields that will be modeled. Field type and subtype combinations must be compatible with the SDV Metadata Schema.
field_transformers (dict[str, str]) – Dictionary specifying which transformers to use for each field. Available transformers are:
integer: Uses a NumericalTransformer of dtype int.
float: Uses a NumericalTransformer of dtype float.
categorical: Uses a CategoricalTransformer without Gaussian noise.
categorical_fuzzy: Uses a CategoricalTransformer adding Gaussian noise.
one_hot_encoding: Uses a OneHotEncodingTransformer.
label_encoding: Uses a LabelEncodingTransformer.
boolean: Uses a BooleanTransformer.
datetime: Uses a DatetimeTransformer.
anonymize_fields (dict[str, str]) – Dict specifying which fields to anonymize and what faker category they belong to.
primary_key (str) – Name of the field which is the primary key of the table.
constraints (list[Constraint, dict]) – List of Constraint objects or dicts.
table_metadata (dict or metadata.Table) – Table metadata instance or dict representation. If given alongside any other metadata-related arguments, an exception will be raised. If not given at all, it will be built using the other arguments or learned from the data.
embedding_dim (int) – Size of the random sample passed to the Generator. Defaults to 128.
compress_dims (tuple or list of ints) – Size of each hidden layer in the encoder. Defaults to (128, 128).
decompress_dims (tuple or list of ints) – Size of each hidden layer in the decoder. Defaults to (128, 128).
l2scale (float) – Regularization term. Defaults to 1e-5.
batch_size (int) – Number of data samples to process in each step.
epochs (int) – Number of training epochs. Defaults to 300.
loss_factor (int) – Multiplier for the reconstruction error. Defaults to 2.
cuda (bool or str) – If True, use CUDA. If a str, use the indicated device. If False, do not use cuda at all.
rounding (int, str or None) – Define rounding scheme for NumericalTransformer. If set to an int, values will be rounded to that number of decimal places. If None, values will not be rounded. If set to 'auto', the transformer will round to the maximum number of decimal places detected in the fitted data. Defaults to 'auto'.
min_value (int, str or None) – Specify the minimum value the NumericalTransformer should use. If an integer is given, sampled data will be greater than or equal to it. If the string 'auto' is given, the minimum will be the minimum value seen in the fitted data. If None is given, there won’t be a minimum. Defaults to 'auto'.
max_value (int, str or None) – Specify the maximum value the NumericalTransformer should use. If an integer is given, sampled data will be less than or equal to it. If the string 'auto' is given, the maximum will be the maximum value seen in the fitted data. If None is given, there won’t be a maximum. Defaults to 'auto'.
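The rounding, min_value and max_value descriptions above can be illustrated with a small stand-alone sketch. It does not call SDV or its NumericalTransformer. The helper names resolve_bound, count_decimals and postprocess are hypothetical, and the logic only mirrors the documented behaviour: 'auto' learns from the fitted data, None disables the step, and a number is used directly.

```python
from decimal import Decimal


def resolve_bound(setting, fitted_values, pick):
    """Turn a min_value/max_value setting into a concrete bound (or None)."""
    if setting is None:
        return None                 # no bound at all
    if setting == 'auto':
        return pick(fitted_values)  # bound learned from the fitted data
    return setting                  # explicit numeric bound


def count_decimals(value):
    """Number of decimal places needed to write value exactly."""
    exponent = Decimal(str(value)).normalize().as_tuple().exponent
    return max(0, -exponent)


def postprocess(sampled, fitted, rounding='auto', min_value='auto', max_value='auto'):
    """Clip and round sampled values the way the parameter docs describe."""
    low = resolve_bound(min_value, fitted, min)
    high = resolve_bound(max_value, fitted, max)
    if rounding == 'auto':
        # Maximum number of decimal places seen in the fitted data.
        digits = max(count_decimals(v) for v in fitted)
    else:
        digits = rounding           # an int, or None for no rounding

    result = []
    for value in sampled:
        if low is not None:
            value = max(value, low)
        if high is not None:
            value = min(value, high)
        if digits is not None:
            value = round(value, digits)
        result.append(value)
    return result


fitted = [1.5, 2.25, 10.0]
sampled = [0.123, 3.14159, 12.7]
auto_result = postprocess(sampled, fitted)
raw_result = postprocess(sampled, fitted, rounding=None, min_value=None, max_value=None)
```

With all three settings left at 'auto', the sampled values are clipped to the fitted range and rounded to two decimal places; with all three set to None, they pass through unchanged.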
__init__
Initialize self. See help(type(self)) for accurate signature.
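As a rough sketch of how the constructor arguments listed above might be assembled, the snippet below builds a keyword dictionary and checks the field_transformers values against the transformer names documented on this page. The column names are invented for illustration, and build_model is a hypothetical helper. It imports TVAE lazily, so the snippet itself runs without SDV installed, and it assumes an SDV release that still provides the sdv.tabular.ctgan module.

```python
# Transformer names documented for the field_transformers argument.
DOCUMENTED_TRANSFORMERS = {
    'integer', 'float', 'categorical', 'categorical_fuzzy',
    'one_hot_encoding', 'label_encoding', 'boolean', 'datetime',
}

# Hypothetical column names, chosen only for this example.
field_transformers = {
    'age': 'integer',
    'salary': 'float',
    'department': 'one_hot_encoding',
    'is_manager': 'boolean',
    'start_date': 'datetime',
}

# Catch misspelled transformer names before any training starts.
unknown = set(field_transformers.values()) - DOCUMENTED_TRANSFORMERS
if unknown:
    raise ValueError(f'Unknown transformers: {sorted(unknown)}')

# Hyperparameters spelled out with the defaults stated in the parameter list.
model_kwargs = dict(
    field_transformers=field_transformers,
    embedding_dim=128,
    compress_dims=(128, 128),
    decompress_dims=(128, 128),
    l2scale=1e-5,
    epochs=300,
    loss_factor=2,
    cuda=False,          # stay on the CPU
    rounding='auto',
    min_value='auto',
    max_value='auto',
)


def build_model(**overrides):
    """Create a TVAE instance; needs an SDV version shipping sdv.tabular.ctgan."""
    from sdv.tabular.ctgan import TVAE  # imported lazily on purpose
    return TVAE(**{**model_kwargs, **overrides})
```

Calling build_model(epochs=50), for instance, would override only the number of epochs and keep the remaining settings.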
Methods
__init__([field_names, field_types, …]) – Initialize self.
fit(data) – Fit this model to the data.
get_metadata() – Get metadata about the table.
get_parameters() – Get the parameters learned from the data.
load(path) – Load a TabularModel instance from a given path.
sample(num_rows[, randomize_samples, …]) – Sample rows from this table.
sample_conditions(conditions[, max_tries, …]) – Sample rows from this table with the given conditions.
sample_remaining_columns(known_columns[, …])
save(path) – Save this model instance to the given path using pickle.
set_parameters(parameters) – Regenerate a previously learned model from its parameters.
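The methods listed above suggest a typical fit, save, load and sample workflow. The helper below is a minimal sketch of that sequence. fit_save_load_sample is a hypothetical name, and the model class is passed in as an argument so the function does not depend on SDV being importable. It assumes that load can be called on the class and that sample accepts the number of rows as its first argument, as the signatures above indicate.

```python
def fit_save_load_sample(model_cls, data, path, num_rows=5, **model_kwargs):
    """Fit a model, persist it, reload it, and sample from the reloaded copy."""
    model = model_cls(**model_kwargs)   # e.g. TVAE(epochs=10)
    model.fit(data)                     # learn from the table
    model.save(path)                    # documented as pickle-based
    restored = model_cls.load(path)     # assumed to be callable on the class
    return restored.sample(num_rows)    # synthetic rows from the restored model
```

With SDV installed, a call could look like fit_save_load_sample(TVAE, dataframe, 'tvae.pkl', num_rows=100, epochs=10), where TVAE comes from sdv.tabular.ctgan and dataframe is a pandas DataFrame holding the table to model.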