Danger: You are looking at the documentation for an older version of the SDV! We are no longer supporting or maintaining this version of the software. Click here to go to the new docs pages.
SDV supports adding constraints within a single table. See Constraints for more information about the available single table constraints.
To use single-table constraints within a relational model, pass a list of the applicable constraints when adding a table to your relational Metadata. (See Relational Metadata for more information on constructing a Metadata object.)
In this example, we wish to add a FixedCombinations constraint to our sessions table, which is a child table of users. First, we will create a Metadata object and add the users table.
In [1]: from sdv import load_demo, Metadata
In [2]: tables = load_demo()
In [3]: metadata = Metadata()
In [4]: metadata.add_table(
   ...:     name='users',
   ...:     data=tables['users'],
   ...:     primary_key='user_id'
   ...: )
The metadata now contains the users table.
In [5]: metadata
Out[5]:
Metadata
  root_path: .
  tables: ['users']
  relationships:
Now we add a child table, sessions, that carries a single-table constraint. In the sessions table, we want the synthetic data to contain only the (device, os) combinations that appear in the original data.
In [6]: from sdv.constraints import FixedCombinations
In [7]: constraint = FixedCombinations(column_names=['device', 'os'])
In [8]: metadata.add_table(
   ...:     name='sessions',
   ...:     data=tables['sessions'],
   ...:     primary_key='session_id',
   ...:     parent='users',
   ...:     foreign_key='user_id',
   ...:     constraints=[constraint],
   ...: )
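To make the effect of such a constraint concrete, the sketch below contrasts the combinations observed in a table with every combination that could be formed from the individual column values. It uses plain Python and does not call SDV; the rows are hypothetical stand-ins rather than the demo data, and the helper name `observed_combinations` is illustrative only.

```python
from itertools import product

# Hypothetical rows standing in for a sessions table (not the SDV demo data).
rows = [
    {"device": "mobile", "os": "ios"},
    {"device": "mobile", "os": "android"},
    {"device": "tablet", "os": "android"},
]


def observed_combinations(rows, columns):
    """Return the set of value tuples that actually occur in `rows`."""
    return {tuple(row[col] for col in columns) for row in rows}


columns = ["device", "os"]
observed = observed_combinations(rows, columns)

# Every pairing that could be built from the individual column values.
distinct_values = [sorted({row[col] for row in rows}) for col in columns]
possible = set(product(*distinct_values))

# Pairings that never occur together in the input; a fixed-combinations
# rule would keep these out of the synthetic data.
never_observed = possible - observed
print(sorted(never_observed))
```

In this toy input, each device and each OS appears somewhere, yet one pairing never occurs together; that pairing is what the constraint is meant to exclude.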
If we get the table metadata for sessions, we can see that the constraint has been added.
In [9]: metadata.get_table_meta('sessions')
Out[9]:
{'fields': {'session_id': {'type': 'id', 'subtype': 'integer'},
  'user_id': {'type': 'id',
   'subtype': 'integer',
   'ref': {'table': 'users', 'field': 'user_id'}},
  'device': {'type': 'categorical'},
  'os': {'type': 'categorical'},
  'minutes': {'type': 'numerical', 'subtype': 'integer'}},
 'constraints': [{'constraint': 'sdv.constraints.tabular.FixedCombinations',
   'column_names': ['device', 'os']}],
 'primary_key': 'session_id'}
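Because the table metadata is an ordinary dictionary, the registered constraints can be listed with plain Python. The sketch below copies the relevant keys from the output above into a literal so that it runs on its own; `summarize_constraints` is a hypothetical helper, not an SDV function.

```python
# Table metadata as printed above, trimmed to the keys used here.
table_meta = {
    "constraints": [
        {
            "constraint": "sdv.constraints.tabular.FixedCombinations",
            "column_names": ["device", "os"],
        }
    ],
    "primary_key": "session_id",
}


def summarize_constraints(table_meta):
    """Pair each constraint's short class name with the columns it covers."""
    summary = []
    for entry in table_meta.get("constraints", []):
        # Keep only the class name after the last dot of the import path.
        short_name = entry["constraint"].rsplit(".", 1)[-1]
        summary.append((short_name, entry.get("column_names", [])))
    return summary


print(summarize_constraints(table_meta))
```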
We can now use this metadata to fit a relational model and synthesize data.
In [10]: from sdv.relational import HMA1
In [11]: model = HMA1(metadata)
In [12]: model.fit(tables)
In [13]: new_data = model.sample()
In the sampled data, we should see that our constraint is being satisfied.
In [14]: new_data
Out[14]:
{'users':
    user_id country gender  age
 0        0      ES      F   57
 1        1      BG      F   32
 2        2      UK    NaN   32
 3        3      ES      F   27
 4        4      DE      F   32
 5        5      ES      M   32
 6        6      FR      F   57
 7        7      ES      M   29
 8        8      ES    NaN   48
 9        9      US      F   45,
 'sessions':
    session_id  user_id  device       os  minutes
 0           0        0  mobile      ios        8
 1           1        1  tablet  android        8
 2           2        2  mobile  android       24
 3           3        3  tablet  android       10
 4           4        4  mobile  android       34
 5           5        5  tablet      ios       26
 6           6        6  mobile  android       34
 7           7        7  tablet      ios       33
 8           8        8  mobile      ios       24
 9           9        9  mobile      ios       12}
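One way to confirm this programmatically is to compare each sampled (device, os) pair against the pairs allowed by the original data. In the sketch below, the sampled pairs are copied from the output above, while the `allowed` set is an assumption, since the original sessions table is not printed on this page; `find_violations` is a hypothetical helper.

```python
# (device, os) pairs copied row by row from the sampled sessions output above.
sampled = [
    ("mobile", "ios"), ("tablet", "android"), ("mobile", "android"),
    ("tablet", "android"), ("mobile", "android"), ("tablet", "ios"),
    ("mobile", "android"), ("tablet", "ios"), ("mobile", "ios"),
    ("mobile", "ios"),
]

# Assumption: the pairs present in the original sessions table. The page does
# not list them, so this set is a placeholder for illustration only.
allowed = {
    ("mobile", "ios"), ("mobile", "android"),
    ("tablet", "ios"), ("tablet", "android"),
}


def find_violations(pairs, allowed):
    """Return (row_index, pair) for every pair that is not in `allowed`."""
    return [(i, pair) for i, pair in enumerate(pairs) if pair not in allowed]


violations = find_violations(sampled, allowed)
print(violations)
```

With real DataFrames, the pairs could be obtained with `new_data['sessions'][['device', 'os']].itertuples(index=False, name=None)`, and the allowed set could be built the same way from `tables['sessions']`.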