DeepEcho model based on the deepecho.models.par.PARModel class.
field_names (list[str]) – List of names of the fields that need to be modeled
and included in the generated output data. Any additional
fields found in the data will be ignored and will not be
included in the generated output.
If None, all the fields found in the data are used.
field_types (dict[str, dict]) – Dictionary specifying the data types and subtypes
of the fields that will be modeled. Field type and subtype
combinations must be compatible with the SDV Metadata Schema.
anonymize_fields (dict[str, str]) – Dict specifying which fields to anonymize and what faker
category they belong to.
primary_key (str) – Name of the field which is the primary key of the table.
entity_columns (list[str]) – Names of the columns which identify different time series
sequences. These will be used to group the data into separate training examples.
context_columns (list[str]) – The columns in the dataframe which are constant within each
group/entity. These columns will be provided at sampling time
(i.e. the samples will be conditioned on the context variables).
segment_size (int, pd.Timedelta or str) – If specified, cut each training sequence in several segments of
the indicated size. The size can be passed either as an integer
value, which will be interpreted as the number of data points to
put on each segment, or as a pd.Timedelta (or equivalent str
representation), which will be interpreted as the segment length
in time. Timedelta segment sizes can only be used with sequence
indexes of type datetime.
sequence_index (str) – Name of the column that acts as the order index of each
sequence. The sequence index column can be of any type that can
be sorted, such as integer values or datetimes.
context_model (str or sdv.tabular.BaseTabularModel) –
Model to use to sample the context rows. It can be passed as a
string, which must be one of the following:
gaussian_copula (default): Use a GaussianCopula model.
Alternatively, a preconfigured Tabular model instance can be passed.
table_metadata (dict or metadata.Table) – Table metadata instance or dict representation.
If given alongside any other metadata-related arguments, an
exception will be raised.
If not given at all, it will be built using the other
arguments or learned from the data.
epochs (int) – The number of epochs to train for. Defaults to 128.
sample_size (int) – The number of times to sample (before choosing and
returning the sample which maximizes the likelihood).
Defaults to 1.
cuda (bool) – Whether to attempt to use CUDA for GPU computation.
If this is False or CUDA is not available, CPU will be used.
Defaults to True.
verbose (bool) – Whether to print progress to console or not.
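The two interpretations of segment_size described above (a number of data points, or a length in time) can be illustrated with a small standalone sketch. This is not the library's implementation: the helper names segment_by_count and segment_by_time are hypothetical, the rows are assumed to be already sorted by the sequence index, and keeping the shorter trailing segment is an assumption, since the text above does not say how leftovers are handled.

```python
from datetime import datetime, timedelta


def segment_by_count(rows, size):
    """Cut rows into consecutive segments of at most `size` data points."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def segment_by_time(rows, length, key):
    """Cut time-sorted rows into windows of duration `length`.

    Windows are counted from the first timestamp; `key` extracts the
    datetime sequence index from a row. Empty windows are skipped.
    """
    if not rows:
        return []
    start = key(rows[0])
    windows = {}
    for row in rows:
        # timedelta // timedelta yields the integer window number.
        index = (key(row) - start) // length
        windows.setdefault(index, []).append(row)
    return [windows[i] for i in sorted(windows)]


# Hypothetical sequence: one row per day for a week, as (timestamp, value).
rows = [(datetime(2021, 1, 1) + timedelta(days=d), d) for d in range(7)]

by_count = segment_by_count(rows, 3)
by_time = segment_by_time(rows, timedelta(days=2), key=lambda r: r[0])
```

With an integer size the cut depends only on row position, so it works for any sortable sequence index. A time-based cut needs timestamps to subtract, which is consistent with the restriction above that Timedelta sizes require a datetime sequence index.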
__init__([field_names, field_types, …])
Initialize self. See help(type(self)) for accurate signature.
Fit this model to the data.
Get metadata about the table.
Load a TabularModel instance from a given path.
sample([num_sequences, context, sequence_length])
Sample new sequences.
Save this model instance to the given path using pickle.
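The sample_size parameter is described above as drawing several samples and returning the one that maximizes the likelihood. The selection idea can be sketched in isolation as follows. This is a toy illustration only: sample_best_of, draw and log_likelihood are hypothetical names, and the Gaussian score stands in for whatever likelihood the underlying model computes internally.

```python
import math
import random


def log_likelihood(sequence, mean=0.0, std=1.0):
    """Log-likelihood of a sequence under a toy i.i.d. Gaussian model."""
    return sum(
        -0.5 * math.log(2 * math.pi * std ** 2)
        - (x - mean) ** 2 / (2 * std ** 2)
        for x in sequence
    )


def sample_best_of(draw, score, sample_size=1):
    """Draw `sample_size` candidates and return the highest-scoring one."""
    candidates = [draw() for _ in range(sample_size)]
    return max(candidates, key=score)


# Hypothetical generator: sequences of five standard-normal values.
rng = random.Random(0)


def draw():
    return [rng.gauss(0.0, 1.0) for _ in range(5)]


best = sample_best_of(draw, log_likelihood, sample_size=10)
```

With sample_size=1 the function returns the single draw unchanged, which matches the documented default. Larger values trade extra sampling time for candidates the model itself scores as more likely.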