tgan.model module
Module with the model for TGAN.
This module contains two classes:

GraphBuilder
: Defines the graph and implements a Tensorpack-compatible API.

TGANModel
: The public API for the model, offering a simplified interface over GraphBuilder and the Tensorpack trainers in order to fit and sample data.
class tgan.model.GraphBuilder(metadata, batch_size=200, z_dim=200, noise=0.2, l2norm=1e-05, learning_rate=0.001, num_gen_rnn=100, num_gen_feature=100, num_dis_layers=1, num_dis_hidden=100, optimizer='AdamOptimizer', training=True)

Bases: tensorpack.graph_builder.model_desc.ModelDescBase

Main model for TGAN.
static batch_diversity(l, n_kernel=10, kernel_dim=10)

Return the minibatch discrimination vector.
Let \(f(x_i) \in \mathbb{R}^A\) denote a vector of features for input \(x_i\), produced by some intermediate layer in the discriminator. We then multiply the vector \(f(x_i)\) by a tensor \(T \in \mathbb{R}^{A \times B \times C}\), which results in a matrix \(M_i \in \mathbb{R}^{B \times C}\). We then compute the \(L_1\)-distance between the rows of the resulting matrix \(M_i\) across samples \(i \in \{1, 2, \ldots, n\}\) and apply a negative exponential:
\[c_b(x_i, x_j) = \exp(-\lVert M_{i,b} - M_{j,b} \rVert_{L_1}) \in \mathbb{R}.\]

The output \(o(x_i)\) of this minibatch layer for a sample \(x_i\) is then defined as the sum of the \(c_b(x_i, x_j)\)'s with respect to all other samples:

\[\begin{aligned} &o(x_i)_b = \sum^{n}_{j=1} c_b(x_i, x_j) \in \mathbb{R}\\ &o(x_i) = \Big[ o(x_i)_1, o(x_i)_2, \ldots, o(x_i)_B \Big] \in \mathbb{R}^B\\ &o(X) \in \mathbb{R}^{n \times B} \end{aligned}\]

Note
This is extracted from Improved techniques for training GANs (Section 3.2) by Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen.
- Parameters
    l (tf.Tensor) – Features from an intermediate discriminator layer, of shape (batch_size, num_features).
    n_kernel (int) – Number of kernels \(B\).
    kernel_dim (int) – Dimension \(C\) of each kernel.
- Returns
    Minibatch discrimination vector.
- Return type
    tensorflow.Tensor
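The computation above can be sketched in numpy (a minimal illustration of the formula, not the actual TensorFlow implementation; the names, shapes and random kernel tensor are illustrative):

```python
import numpy as np

def batch_diversity_np(f, T):
    """Minibatch discrimination for features f (n, A) and kernel tensor T (A, B, C)."""
    M = np.einsum('na,abc->nbc', f, T)                  # M_i in R^{B x C} for every sample
    dist = np.abs(M[:, None] - M[None, :]).sum(axis=3)  # pairwise L1 distance per kernel b: (n, n, B)
    c = np.exp(-dist)                                   # c_b(x_i, x_j)
    return c.sum(axis=1)                                # o(x_i)_b = sum_j c_b(x_i, x_j): (n, B)

rng = np.random.default_rng(0)
f = rng.normal(size=(4, 8))        # n=4 samples with A=8 features
T = rng.normal(size=(8, 10, 10))   # B=10 kernels of dimension C=10
o = batch_diversity_np(f, T)
print(o.shape)  # (4, 10)
```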
build_graph(*inputs)

Build the whole graph.
- Parameters
    inputs (list[tensorflow.Tensor]) – Input tensors, matching the spec of inputs().
- Returns
None
build_losses(logits_real, logits_fake, extra_g=0, l2_norm=1e-05)

D and G play a two-player minimax game with value function \(V(G, D)\):

\[\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}}[\log D(x)] + \mathbb{E}_{z \sim p_{fake}}[\log(1 - D(G(z)))]\]

- Parameters
logits_real (tensorflow.Tensor) – Discriminator logits computed on real samples.
logits_fake (tensorflow.Tensor) – Discriminator logits computed on fake samples from the generator.
extra_g (float) – Extra term added to the generator loss.
l2_norm (float) – Scale of the L2 regularization term.
- Returns
None
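The value function above corresponds to the standard cross-entropy GAN losses. A numpy sketch for intuition (illustrative only; the real method builds these losses symbolically as TensorFlow ops and also applies the extra_g and L2 terms):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gan_losses(logits_real, logits_fake):
    # D maximizes E[log D(x)] + E[log(1 - D(G(z)))], i.e. minimizes the
    # sigmoid cross-entropy with labels 1 for real and 0 for fake samples.
    d_loss = -(np.log(sigmoid(logits_real)).mean()
               + np.log(1.0 - sigmoid(logits_fake)).mean())
    # G minimizes -E[log D(G(z))] (the usual non-saturating form).
    g_loss = -np.log(sigmoid(logits_fake)).mean()
    return d_loss, g_loss

# a discriminator that already separates real from fake well:
d_loss, g_loss = gan_losses(np.array([5.0, 4.0]), np.array([-5.0, -4.0]))
```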
collect_variables(g_scope='gen', d_scope='discrim')

Assign generator and discriminator variables from their scopes.
- Parameters
g_scope (str) – Scope for the generator.
d_scope (str) – Scope for the discriminator.
- Raises
ValueError – If any of the assignments fails or the collections are empty.
static compute_kl(real, pred)

Compute the Kullback–Leibler divergence, \(D_{KL}(\textrm{pred} \,\|\, \textrm{real})\).
- Parameters
real (tensorflow.Tensor) – Real values.
pred (tensorflow.Tensor) – Predicted values.
- Returns
Computed divergence for the given values.
- Return type
float
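As a point of reference, the textbook formula can be sketched in numpy (illustrative; the real method operates on TensorFlow tensors and the epsilon value is an assumption for numerical stability):

```python
import numpy as np

def kl_divergence(real, pred, eps=1e-8):
    # D_KL(pred || real) = sum_x pred(x) * (log pred(x) - log real(x))
    return float(np.sum(pred * (np.log(pred + eps) - np.log(real + eps))))

p = np.array([0.2, 0.3, 0.5])
q = np.array([0.5, 0.25, 0.25])
kl_divergence(p, p)  # zero for identical distributions
kl_divergence(p, q)  # strictly positive for distinct distributions
```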
discriminator(vecs)

Build the discriminator.
We use an \(l\)-layer fully connected neural network as the discriminator. We concatenate \(v_{1:n_c}\), \(u_{1:n_c}\) and \(d_{1:n_d}\) together as the input. We compute the internal layers as

\[\begin{aligned} f^{(D)}_{1} &= \textrm{LeakyReLU}(\textrm{BN}(W^{(D)}_{1}(v_{1:n_c} \oplus u_{1:n_c} \oplus d_{1:n_d})))\\ f^{(D)}_{i} &= \textrm{LeakyReLU}(\textrm{BN}(W^{(D)}_{i}(f^{(D)}_{i-1} \oplus \textrm{diversity}(f^{(D)}_{i-1})))), \quad i = 2:l \end{aligned}\]

where \(\oplus\) is the concatenation operation, \(\textrm{diversity}(\cdot)\) is the minibatch discrimination vector [42], \(\textrm{BN}(\cdot)\) is batch normalization, and \(\textrm{LeakyReLU}(\cdot)\) is the leaky rectified linear activation function. Each dimension of the diversity vector is the total distance between one sample and all other samples in the minibatch, using some learned distance metric. We further compute the output of the discriminator as \(W^{(D)}(f^{(D)}_{l} \oplus \textrm{diversity}(f^{(D)}_{l}))\), which is a scalar.
- Parameters
    vecs (list[tensorflow.Tensor]) – List of tensors matching the spec of inputs().
- Returns
    A (batch_size, 1) tensor of logits.
- Return type
    tensorpack.FullyConnected
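A numpy sketch of one hidden layer plus the output projection (batch normalization is omitted for brevity, and the random weights, shapes and diversity kernel tensors are illustrative, not the trained parameters):

```python
import numpy as np

rng = np.random.default_rng(1)

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0.0, x, alpha * x)

def diversity(f, T):
    # minibatch discrimination vector (see batch_diversity above): (n, A) -> (n, B)
    M = np.einsum('na,abc->nbc', f, T)
    dist = np.abs(M[:, None] - M[None, :]).sum(axis=3)
    return np.exp(-dist).sum(axis=1)

n, d_in, d_h, B, C = 8, 16, 32, 10, 10
x = rng.normal(size=(n, d_in))      # concatenated v, u and d vectors

# f1 = LeakyReLU(W1 (x concat diversity(x))), batch norm omitted
T0 = 0.1 * rng.normal(size=(d_in, B, C))
W1 = 0.1 * rng.normal(size=(d_in + B, d_h))
f1 = leaky_relu(np.concatenate([x, diversity(x, T0)], axis=1) @ W1)

# scalar output per sample: W (f1 concat diversity(f1))
T1 = 0.1 * rng.normal(size=(d_h, B, C))
W_out = 0.1 * rng.normal(size=(d_h + B, 1))
logits = np.concatenate([f1, diversity(f1, T1)], axis=1) @ W_out
print(logits.shape)  # (8, 1)
```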
generator(z)

Build the generator graph.
We generate a numerical variable in 2 steps: we first generate the value scalar \(v_i\), then generate the cluster vector \(u_i\). We generate a categorical feature in 1 step, as a probability distribution over all possible labels.

The output and hidden state size of the LSTM is \(n_h\). The input to the LSTM at each step \(t\) is the random variable \(z\), the previous hidden vector \(f_{t-1}\) or an embedding vector \(f^{\prime}_{t-1}\) depending on the type of the previous output, and the weighted context vector \(a_{t-1}\). The random variable \(z\) has \(n_z\) dimensions; each dimension is sampled from \(\mathcal{N}(0, 1)\). The attention-based context vector \(a_t\) is a weighted average over all the previous LSTM outputs \(h_{1:t}\), so \(a_t\) is an \(n_h\)-dimensional vector. We learn an attention weight vector \(\alpha_t \in \mathbb{R}^t\) and compute the context as

\[a_t = \sum_{k=1}^{t} \frac{\exp \alpha_{t,k}}{\sum_{j} \exp \alpha_{t,j}} h_k.\]

We set \(a_0 = 0\). The output of the LSTM is \(h_t\), and we project the output to a hidden vector \(f_t = \textrm{tanh}(W_h h_t)\), where \(W_h\) is a learned parameter in the network. The size of \(f_t\) is \(n_f\). We further convert the hidden vector to an output variable.
If the output is the value part of a continuous variable, we compute the output as \(v_i = \textrm{tanh}(W_t f_t)\). The hidden vector for \(t + 1\) step is \(f_t\).
If the output is the cluster part of a continuous variable, we compute the output as \(u_i = \textrm{softmax}(W_t f_t)\). The feature vector for \(t + 1\) step is \(f_t\).
If the output is a discrete variable, we compute the output as \(d_i = \textrm{softmax}(W_t f_t)\). The hidden vector for the \(t + 1\) step is \(f^{\prime}_{t} = E_i [\arg\max_k \hspace{0.25em} d_i]\), where \(E_i \in \mathbb{R}^{|D_i| \times n_f}\) is an embedding matrix for discrete variable \(D_i\).
\(f_0\) is a special vector \(\texttt{<GO>}\) and we learn it during the training.
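The attention step above can be sketched in numpy (shapes are illustrative; in the model, \(\alpha_t\) is a learned parameter rather than a fixed array):

```python
import numpy as np

def attention_context(h, alpha):
    # h: stacked previous LSTM outputs h_{1:t}, shape (t, n_h); alpha: weights, shape (t,)
    w = np.exp(alpha - alpha.max())
    w = w / w.sum()            # softmax over the t previous steps
    return w @ h               # a_t, shape (n_h,)

h = np.arange(6.0).reshape(3, 2)          # t=3 outputs with n_h=2
a_t = attention_context(h, np.zeros(3))   # uniform weights -> plain average
print(a_t)  # [2. 3.]
```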
- Parameters
    z (tensorflow.Tensor) – Noise tensor fed to the generator.
- Returns
    Output tensors of the generator.
- Return type
    list[tensorflow.Tensor]
- Raises
    ValueError – If any of the elements in self.metadata['details'] has an unsupported value in the type key.
class tgan.model.TGANModel(continuous_columns, output='output', gpu=None, max_epoch=5, steps_per_epoch=10000, save_checkpoints=True, restore_session=True, batch_size=200, z_dim=200, noise=0.2, l2norm=1e-05, learning_rate=0.001, num_gen_rnn=100, num_gen_feature=100, num_dis_layers=1, num_dis_hidden=100, optimizer='AdamOptimizer')

Bases: object

Main model from TGAN.
- Parameters
    continuous_columns (list[int]) – 0-indexed list of column indices to be considered continuous.
    output (str, optional) – Path to store the model and its artifacts. Defaults to output.
    gpu (list[str], optional) – Comma-separated list of GPU(s) to use. Defaults to None.
    max_epoch (int, optional) – Number of epochs to use during training. Defaults to 5.
    steps_per_epoch (int, optional) – Number of steps to run on each epoch. Defaults to 10000.
    save_checkpoints (bool, optional) – Whether or not to store checkpoints of the model after each training epoch. Defaults to True.
    restore_session (bool, optional) – Whether or not to continue training from the last checkpoint. Defaults to True.
    batch_size (int, optional) – Size of the batch to feed the model at each step. Defaults to 200.
    z_dim (int, optional) – Number of dimensions in the noise input for the generator. Defaults to 200.
    noise (float, optional) – Upper bound to the gaussian noise added to categorical columns. Defaults to 0.2.
    l2norm (float, optional) – L2 regularization coefficient when computing losses. Defaults to 1e-05.
    learning_rate (float, optional) – Learning rate for the optimizer. Defaults to 0.001.
    num_gen_rnn (int, optional) – Size of the hidden units in the generator LSTM. Defaults to 100.
    num_gen_feature (int, optional) – Number of features in the generator. Defaults to 100.
    num_dis_layers (int, optional) – Number of layers in the discriminator. Defaults to 1.
    num_dis_hidden (int, optional) – Number of hidden units in each discriminator layer. Defaults to 100.
    optimizer (str, optional) – Name of the optimizer to use during fit; possible values are: [GradientDescentOptimizer, AdamOptimizer, AdadeltaOptimizer]. Defaults to AdamOptimizer.
fit(data)

Fit the model to the given data.

- Parameters
    data (pandas.DataFrame) – Dataset to fit the model.
- Returns
    None
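A hypothetical usage sketch (the DataFrame, the chosen column indices and the keyword values are illustrative; the sampling call assumes the companion sample method of TGANModel, which is not documented in this section):

```python
# Illustrative only: `data` is a pandas.DataFrame whose columns 0 and 3
# hold continuous values; all other values here are assumptions.
from tgan.model import TGANModel

tgan = TGANModel(continuous_columns=[0, 3], output='output', max_epoch=5)
tgan.fit(data)                # train, storing checkpoints under `output`
samples = tgan.sample(1000)   # assumed companion method: draw 1000 synthetic rows
```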