SDV supports adding constraints within a single table. See Constraints for more information about the available single-table constraints.
In order to use single-table constraints within a relational model, you can pass in a list of applicable constraints when adding a table to your relational Metadata. (See Relational Metadata for more information on constructing a Metadata object.)
In this example, we wish to add a FixedCombinations constraint to our sessions table, which is a child table of users. First, we will create a Metadata object and add the users table.
In [1]: from sdv import load_demo, Metadata

In [2]: tables = load_demo()

In [3]: metadata = Metadata()

In [4]: metadata.add_table(
   ...:     name='users',
   ...:     data=tables['users'],
   ...:     primary_key='user_id'
   ...: )
   ...:
The metadata now contains the users table.
In [5]: metadata
Out[5]:
Metadata
  root_path: .
  tables: ['users']
  relationships:
Now, we want to add a child table, sessions, which carries a single-table constraint. In the sessions table, we want to allow only the (device, os) combinations that appear in the original data.
In [6]: from sdv.constraints import FixedCombinations

In [7]: constraint = FixedCombinations(column_names=['device', 'os'])

In [8]: metadata.add_table(
   ...:     name='sessions',
   ...:     data=tables['sessions'],
   ...:     primary_key='session_id',
   ...:     parent='users',
   ...:     foreign_key='user_id',
   ...:     constraints=[constraint],
   ...: )
   ...:
If we get the table metadata for sessions, we can see that the constraint has been added.
In [9]: metadata.get_table_meta('sessions')
Out[9]:
{'fields': {'session_id': {'type': 'id', 'subtype': 'integer'},
  'user_id': {'type': 'id',
   'subtype': 'integer',
   'ref': {'table': 'users', 'field': 'user_id'}},
  'device': {'type': 'categorical'},
  'os': {'type': 'categorical'},
  'minutes': {'type': 'numerical', 'subtype': 'integer'}},
 'constraints': [{'constraint': 'sdv.constraints.tabular.FixedCombinations',
   'column_names': ['device', 'os']}],
 'primary_key': 'session_id'}
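To check the registered constraints programmatically rather than by eye, one option is to read the 'constraints' entry of the returned dict. The sketch below assumes only the dict layout shown in the output above; the helper name list_constraints is hypothetical and not part of SDV, and sessions_meta is a trimmed, hand-copied version of that output.

```python
def list_constraints(table_meta):
    """Return (class path, column names) pairs from a table-metadata dict.

    Hypothetical helper, not part of SDV: it relies only on the dict
    layout printed by get_table_meta in the example above.
    """
    return [
        (entry['constraint'], entry.get('column_names', []))
        for entry in table_meta.get('constraints', [])
    ]


# Trimmed, hand-copied version of the 'sessions' table metadata shown above.
sessions_meta = {
    'fields': {
        'device': {'type': 'categorical'},
        'os': {'type': 'categorical'},
    },
    'constraints': [
        {
            'constraint': 'sdv.constraints.tabular.FixedCombinations',
            'column_names': ['device', 'os'],
        }
    ],
    'primary_key': 'session_id',
}

# Print each constraint's class path and the columns it applies to.
for class_path, columns in list_constraints(sessions_meta):
    print(class_path, columns)
```

With a real Metadata object, the same helper could presumably be applied to the result of metadata.get_table_meta('sessions').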
We can now use this metadata to fit a relational model and synthesize data.
In [10]: from sdv.relational import HMA1

In [11]: model = HMA1(metadata)

In [12]: model.fit(tables)

In [13]: new_data = model.sample()
In the sampled data, the constraint should be satisfied: every (device, os) pair in the sampled sessions table should be a combination that appears in the original data.
In [14]: new_data
Out[14]:
{'users':    user_id country gender  age
 0        0      BG      F   36
 1        1      BG      F   54
 2        2      ES      M   35
 3        3      ES      F   22
 4        4      ES      F   24
 5        5      ES      M   36
 6        6      ES      F   45
 7        7      DE    NaN   44
 8        8      US      M   34
 9        9      UK    NaN   45,
 'sessions':    session_id  user_id  device       os  minutes
 0           0        0  tablet      ios       27
 1           1        1  mobile      ios       17
 2           2        2  tablet      ios       23
 3           3        3  mobile  android       33
 4           4        4  mobile      ios       22
 5           5        5  mobile  android       29
 6           6        6  mobile  android       25
 7           7        7  tablet      ios       34
 8           8        8  tablet      ios       24
 9           9        9  tablet      ios        9}
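A manual check of the constraint can be sketched by collecting the (device, os) pairs seen in the original rows and flagging any sampled row whose pair is not among them. This is plain Python over lists of row dicts, not SDV's own validation; find_violations is a hypothetical helper, and the original_rows values are made up for illustration because the original sessions data is not shown here. The sampled rows are copied from the output above. With pandas DataFrames, rows in this shape can be obtained with df.to_dict('records').

```python
def find_violations(original_rows, sampled_rows, columns):
    """Return the sampled rows whose value combination over `columns`
    never occurs in `original_rows`.

    Hypothetical helper for illustration; not part of SDV.
    """
    allowed = {tuple(row[c] for c in columns) for row in original_rows}
    return [
        row for row in sampled_rows
        if tuple(row[c] for c in columns) not in allowed
    ]


# Assumed stand-in for the original sessions table (illustrative values only).
original_rows = [
    {'device': 'mobile', 'os': 'android'},
    {'device': 'mobile', 'os': 'ios'},
    {'device': 'tablet', 'os': 'ios'},
]

# A few rows copied from the sampled 'sessions' output above.
sampled_rows = [
    {'session_id': 0, 'device': 'tablet', 'os': 'ios'},
    {'session_id': 1, 'device': 'mobile', 'os': 'ios'},
    {'session_id': 3, 'device': 'mobile', 'os': 'android'},
]

# An empty list means every sampled pair also occurs in the original rows.
violations = find_violations(original_rows, sampled_rows, ['device', 'os'])
print(violations)
```

A pair such as (tablet, android), which is absent from the assumed original rows, would be returned as a violation.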