Danger

You are looking at the documentation for an older version of the SDV! We are no longer supporting or maintaining this version of the software.


Constraints

SDV supports adding constraints within a single table. See Constraints for more information about the available single table constraints.

To use single-table constraints within a relational model, pass a list of the applicable constraints when adding a table to your relational Metadata. (See Relational Metadata for more information on constructing a Metadata object.)

In this example, we wish to add a FixedCombinations constraint to our sessions table, which is a child table of users. First, we will create a Metadata object and add the users table.

In [1]: from sdv import load_demo, Metadata

In [2]: tables = load_demo()

In [3]: metadata = Metadata()

In [4]: metadata.add_table(
   ...:     name='users',
   ...:     data=tables['users'],
   ...:     primary_key='user_id'
   ...: )
   ...: 

The metadata now contains the users table.

In [5]: metadata
Out[5]: 
Metadata
  root_path: .
  tables: ['users']
  relationships:

Now, we want to add a child table, sessions, which carries a single-table constraint. In the sessions table, we want to allow only the (device, os) combinations that appear in the original data.

In [6]: from sdv.constraints import FixedCombinations

In [7]: constraint = FixedCombinations(column_names=['device', 'os'])

In [8]: metadata.add_table(
   ...:     name='sessions',
   ...:     data=tables['sessions'],
   ...:     primary_key='session_id',
   ...:     parent='users',
   ...:     foreign_key='user_id',
   ...:     constraints=[constraint],
   ...: )
   ...: 
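To make "combinations that appear in the original data" concrete, here is a minimal sketch in plain Python. It does not use SDV and is not how the library implements the constraint; the rows in `original_sessions` and the helper `observed_combinations` are hypothetical, introduced only for illustration.

```python
# ASSUMPTION: hypothetical rows standing in for the real sessions table.
original_sessions = [
    {"device": "mobile", "os": "ios"},
    {"device": "tablet", "os": "android"},
    {"device": "mobile", "os": "android"},
    {"device": "tablet", "os": "ios"},
    {"device": "mobile", "os": "ios"},  # repeated combination
]


def observed_combinations(rows, column_names):
    """Return the set of distinct value tuples seen across the given columns."""
    return {tuple(row[name] for name in column_names) for row in rows}


# Hypothetical helper: these pairs are the only ones a fixed-combinations
# rule over ('device', 'os') would permit in generated rows.
allowed = observed_combinations(original_sessions, ["device", "os"])
print(sorted(allowed))
```

A pair such as ('desktop', 'ios') never occurs in these rows, so under this reading it would not be an acceptable value for a synthetic session.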

If we get the table metadata for sessions, we can see that the constraint has been added.

In [9]: metadata.get_table_meta('sessions')
Out[9]: 
{'fields': {'session_id': {'type': 'id', 'subtype': 'integer'},
  'user_id': {'type': 'id',
   'subtype': 'integer',
   'ref': {'table': 'users', 'field': 'user_id'}},
  'device': {'type': 'categorical'},
  'os': {'type': 'categorical'},
  'minutes': {'type': 'numerical', 'subtype': 'integer'}},
 'constraints': [{'constraint': 'sdv.constraints.tabular.FixedCombinations',
   'column_names': ['device', 'os']}],
 'primary_key': 'session_id'}
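Because the table metadata is returned as a plain dictionary, the registered constraints can also be checked programmatically. The sketch below copies the dictionary shown above into a literal so that it runs on its own; `constraint_summary` is a hypothetical helper, not part of SDV.

```python
# Copy of the dictionary printed above, so the example is self-contained.
table_meta = {
    "fields": {
        "session_id": {"type": "id", "subtype": "integer"},
        "user_id": {
            "type": "id",
            "subtype": "integer",
            "ref": {"table": "users", "field": "user_id"},
        },
        "device": {"type": "categorical"},
        "os": {"type": "categorical"},
        "minutes": {"type": "numerical", "subtype": "integer"},
    },
    "constraints": [
        {
            "constraint": "sdv.constraints.tabular.FixedCombinations",
            "column_names": ["device", "os"],
        }
    ],
    "primary_key": "session_id",
}


def constraint_summary(meta):
    """Hypothetical helper: list (class name, column names) per constraint."""
    return [
        (entry["constraint"].rsplit(".", 1)[-1], entry.get("column_names"))
        for entry in meta.get("constraints", [])
    ]


print(constraint_summary(table_meta))
```

In a live session the same helper could presumably be given the result of `metadata.get_table_meta('sessions')` directly, assuming it has the shape shown above.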

We can now use this metadata to fit a relational model and synthesize data.

In [10]: from sdv.relational import HMA1

In [11]: model = HMA1(metadata)

In [12]: model.fit(tables)

In [13]: new_data = model.sample()

In the sampled data, we should see that our constraint is satisfied: every (device, os) pair in the sessions table is one that also appears in the original data.

In [14]: new_data
Out[14]: 
{'users':    user_id country gender  age
 0        0      ES      F   57
 1        1      BG      F   32
 2        2      UK    NaN   32
 3        3      ES      F   27
 4        4      DE      F   32
 5        5      ES      M   32
 6        6      FR      F   57
 7        7      ES      M   29
 8        8      ES    NaN   48
 9        9      US      F   45,
 'sessions':    session_id  user_id  device       os  minutes
 0           0        0  mobile      ios        8
 1           1        1  tablet  android        8
 2           2        2  mobile  android       24
 3           3        3  tablet  android       10
 4           4        4  mobile  android       34
 5           5        5  tablet      ios       26
 6           6        6  mobile  android       34
 7           7        7  tablet      ios       33
 8           8        8  mobile      ios       24
 9           9        9  mobile      ios       12}
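Rather than checking the output by eye, the constraint can be verified by comparing the combinations in the sampled table against those in the original. The following is a minimal sketch in plain Python. The `sampled_sessions` values are copied from the output above; `original_sessions` is a hypothetical stand-in for the real demo table, and `combination_violations` is an illustrative helper, not an SDV function.

```python
def combination_violations(original_rows, sampled_rows, column_names):
    """Return sampled value combinations that never occur in the original rows."""
    allowed = {tuple(row[name] for name in column_names) for row in original_rows}
    sampled = {tuple(row[name] for name in column_names) for row in sampled_rows}
    return sampled - allowed


# (device, os) values copied from the sampled sessions table shown above.
sampled_sessions = [
    {"device": "mobile", "os": "ios"},
    {"device": "tablet", "os": "android"},
    {"device": "mobile", "os": "android"},
    {"device": "tablet", "os": "android"},
    {"device": "mobile", "os": "android"},
    {"device": "tablet", "os": "ios"},
    {"device": "mobile", "os": "android"},
    {"device": "tablet", "os": "ios"},
    {"device": "mobile", "os": "ios"},
    {"device": "mobile", "os": "ios"},
]

# ASSUMPTION: hypothetical stand-in for the original sessions table.
original_sessions = [
    {"device": "mobile", "os": "ios"},
    {"device": "mobile", "os": "android"},
    {"device": "tablet", "os": "ios"},
    {"device": "tablet", "os": "android"},
]

violations = combination_violations(
    original_sessions, sampled_sessions, ["device", "os"]
)
print(violations or "no unseen combinations")
```

With the DataFrames from the session above, the row lists could be obtained with pandas' `to_dict('records')`, for example `tables['sessions'].to_dict('records')` and `new_data['sessions'].to_dict('records')`. An empty result means no sampled row uses a combination that is absent from the original data.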