tgan.model module
Module with the model for TGAN.
This module contains two classes:

GraphBuilder
: Defines the graph and implements a Tensorpack-compatible API.

TGANModel
: The public API for the model, offering a simplified interface over GraphBuilder and the Tensorpack trainers in order to fit and sample data.
class tgan.model.GraphBuilder(metadata, batch_size=200, z_dim=200, noise=0.2, l2norm=1e-05, learning_rate=0.001, num_gen_rnn=100, num_gen_feature=100, num_dis_layers=1, num_dis_hidden=100, optimizer='AdamOptimizer', training=True)

Bases: tensorpack.graph_builder.model_desc.ModelDescBase

Main model for TGAN.
static batch_diversity(l, n_kernel=10, kernel_dim=10)

Return the minibatch discrimination vector.
Let \(f(x_i) \in \mathbb{R}^A\) denote a vector of features for input \(x_i\), produced by some intermediate layer in the discriminator. We then multiply the vector \(f(x_i)\) by a tensor \(T \in \mathbb{R}^{A \times B \times C}\), which results in a matrix \(M_i \in \mathbb{R}^{B \times C}\). We then compute the \(L_1\)-distance between the rows of the resulting matrix \(M_i\) across samples \(i \in \{1, 2, \ldots, n\}\) and apply a negative exponential:
\[c_b(x_i, x_j) = \exp(-\lVert M_{i,b} - M_{j,b} \rVert_{L_1}) \in \mathbb{R}.\]

The output \(o(x_i)\) of this minibatch layer for a sample \(x_i\) is then defined as the sum of the \(c_b(x_i, x_j)\)'s with respect to all other samples:

\[\begin{aligned} &o(x_i)_b = \sum^{n}_{j=1} c_b(x_i, x_j) \in \mathbb{R}\\ &o(x_i) = \Big[ o(x_i)_1, o(x_i)_2, \ldots, o(x_i)_B \Big] \in \mathbb{R}^B\\ &o(X) \in \mathbb{R}^{n \times B} \end{aligned}\]

Note
This is extracted from Improved techniques for training GANs (Section 3.2) by Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen.
- Parameters
    l (tf.Tensor) – Features from an intermediate discriminator layer, of shape (batch_size, num_features).
    n_kernel (int) – Number of kernels \(B\).
    kernel_dim (int) – Dimension \(C\) of each kernel.
- Returns
    Minibatch discrimination vector.
- Return type
    tensorflow.Tensor
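The computation above can be sketched in numpy (a minimal illustration of the formula, not the actual TensorFlow implementation; the names, shapes and random kernel tensor are illustrative):

```python
import numpy as np

def batch_diversity_np(f, T):
    """Minibatch discrimination for features f (n, A) and kernel tensor T (A, B, C)."""
    M = np.einsum('na,abc->nbc', f, T)                  # M_i in R^{B x C} for every sample
    dist = np.abs(M[:, None] - M[None, :]).sum(axis=3)  # pairwise L1 distance per kernel b: (n, n, B)
    c = np.exp(-dist)                                   # c_b(x_i, x_j)
    return c.sum(axis=1)                                # o(x_i)_b = sum_j c_b(x_i, x_j): (n, B)

rng = np.random.default_rng(0)
f = rng.normal(size=(4, 8))        # n=4 samples with A=8 features
T = rng.normal(size=(8, 10, 10))   # B=10 kernels of dimension C=10
o = batch_diversity_np(f, T)
print(o.shape)  # (4, 10)
```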
build_graph(*inputs)

Build the whole graph.
- Parameters
    inputs (list[tensorflow.Tensor]) – Input tensors, matching the spec of inputs().
- Returns
None
build_losses(logits_real, logits_fake, extra_g=0, l2_norm=1e-05)

D and G play a two-player minimax game with value function \(V(G, D)\):

\[\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}}[\log D(x)] + \mathbb{E}_{z \sim p_{fake}}[\log(1 - D(G(z)))]\]

- Parameters
logits_real (tensorflow.Tensor) – Discriminator logits computed on real samples.
logits_fake (tensorflow.Tensor) – Discriminator logits computed on fake samples from the generator.
extra_g (float) – Extra term added to the generator loss.
l2_norm (float) – Scale of the L2 regularization term.
- Returns
None
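The value function above corresponds to the standard cross-entropy GAN losses. A numpy sketch for intuition (illustrative only; the real method builds these losses symbolically as TensorFlow ops and also applies the extra_g and L2 terms):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gan_losses(logits_real, logits_fake):
    # D maximizes E[log D(x)] + E[log(1 - D(G(z)))], i.e. minimizes the
    # sigmoid cross-entropy with labels 1 for real and 0 for fake samples.
    d_loss = -(np.log(sigmoid(logits_real)).mean()
               + np.log(1.0 - sigmoid(logits_fake)).mean())
    # G minimizes -E[log D(G(z))] (the usual non-saturating form).
    g_loss = -np.log(sigmoid(logits_fake)).mean()
    return d_loss, g_loss

# a discriminator that already separates real from fake well:
d_loss, g_loss = gan_losses(np.array([5.0, 4.0]), np.array([-5.0, -4.0]))
```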
collect_variables(g_scope='gen', d_scope='discrim')

Assign generator and discriminator variables from their scopes.
- Parameters
g_scope (str) – Scope for the generator.
d_scope (str) – Scope for the discriminator.
- Raises
ValueError – If any of the assignments fails or the collections are empty.
static compute_kl(real, pred)

Compute the Kullback–Leibler divergence, \(D_{KL}(\textrm{pred} \,\|\, \textrm{real})\).
- Parameters
real (tensorflow.Tensor) – Real values.
pred (tensorflow.Tensor) – Predicted values.
- Returns
Computed divergence for the given values.
- Return type
float
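As a point of reference, the textbook formula can be sketched in numpy (illustrative; the real method operates on TensorFlow tensors and the epsilon value is an assumption for numerical stability):

```python
import numpy as np

def kl_divergence(real, pred, eps=1e-8):
    # D_KL(pred || real) = sum_x pred(x) * (log pred(x) - log real(x))
    return float(np.sum(pred * (np.log(pred + eps) - np.log(real + eps))))

p = np.array([0.2, 0.3, 0.5])
q = np.array([0.5, 0.25, 0.25])
kl_divergence(p, p)  # zero for identical distributions
kl_divergence(p, q)  # strictly positive for distinct distributions
```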
discriminator(vecs)

Build the discriminator.
We use an \(l\)-layer fully connected neural network as the discriminator. We concatenate \(v_{1:n_c}\), \(u_{1:n_c}\) and \(d_{1:n_d}\) together as the input. We compute the internal layers as

\[\begin{aligned} f^{(D)}_{1} &= \textrm{LeakyReLU}(\textrm{BN}(W^{(D)}_{1}(v_{1:n_c} \oplus u_{1:n_c} \oplus d_{1:n_d})))\\ f^{(D)}_{i} &= \textrm{LeakyReLU}(\textrm{BN}(W^{(D)}_{i}(f^{(D)}_{i-1} \oplus \textrm{diversity}(f^{(D)}_{i-1})))), \quad i = 2:l \end{aligned}\]

where \(\oplus\) is the concatenation operation, \(\textrm{diversity}(\cdot)\) is the minibatch discrimination vector [42], \(\textrm{BN}(\cdot)\) is batch normalization, and \(\textrm{LeakyReLU}(\cdot)\) is the leaky rectified linear activation function. Each dimension of the diversity vector is the total distance between one sample and all other samples in the minibatch, using some learned distance metric. We further compute the output of the discriminator as \(W^{(D)}(f^{(D)}_{l} \oplus \textrm{diversity}(f^{(D)}_{l}))\), which is a scalar.
- Parameters
    vecs (list[tensorflow.Tensor]) – List of tensors matching the spec of inputs().
- Returns
    A (batch_size, 1) tensor of logits.
- Return type
    tensorpack.FullyConnected
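A numpy sketch of one hidden layer plus the output projection (batch normalization is omitted for brevity, and the random weights, shapes and diversity kernel tensors are illustrative, not the trained parameters):

```python
import numpy as np

rng = np.random.default_rng(1)

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0.0, x, alpha * x)

def diversity(f, T):
    # minibatch discrimination vector (see batch_diversity above): (n, A) -> (n, B)
    M = np.einsum('na,abc->nbc', f, T)
    dist = np.abs(M[:, None] - M[None, :]).sum(axis=3)
    return np.exp(-dist).sum(axis=1)

n, d_in, d_h, B, C = 8, 16, 32, 10, 10
x = rng.normal(size=(n, d_in))      # concatenated v, u and d vectors

# f1 = LeakyReLU(W1 (x concat diversity(x))), batch norm omitted
T0 = 0.1 * rng.normal(size=(d_in, B, C))
W1 = 0.1 * rng.normal(size=(d_in + B, d_h))
f1 = leaky_relu(np.concatenate([x, diversity(x, T0)], axis=1) @ W1)

# scalar output per sample: W (f1 concat diversity(f1))
T1 = 0.1 * rng.normal(size=(d_h, B, C))
W_out = 0.1 * rng.normal(size=(d_h + B, 1))
logits = np.concatenate([f1, diversity(f1, T1)], axis=1) @ W_out
print(logits.shape)  # (8, 1)
```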
generator(z)

Build the generator graph.
We generate a numerical variable in 2 steps: we first generate the value scalar \(v_i\), then generate the cluster vector \(u_i\). We generate a categorical feature in 1 step, as a probability distribution over all possible labels.

The output and hidden state size of the LSTM is \(n_h\). The input to the LSTM at each step \(t\) is the random variable \(z\), the previous hidden vector \(f_{t-1}\) or an embedding vector \(f^{\prime}_{t-1}\) depending on the type of the previous output, and the weighted context vector \(a_{t-1}\). The random variable \(z\) has \(n_z\) dimensions; each dimension is sampled from \(\mathcal{N}(0, 1)\). The attention-based context vector \(a_t\) is a weighted average over all the previous LSTM outputs \(h_{1:t}\), so \(a_t\) is an \(n_h\)-dimensional vector. We learn an attention weight vector \(\alpha_t \in \mathbb{R}^t\) and compute the context as

\[a_t = \sum_{k=1}^{t} \frac{\exp \alpha_{t,k}}{\sum_{j} \exp \alpha_{t,j}} h_k.\]

We set \(a_0 = 0\). The output of the LSTM is \(h_t\), and we project the output to a hidden vector \(f_t = \textrm{tanh}(W_h h_t)\), where \(W_h\) is a learned parameter in the network. The size of \(f_t\) is \(n_f\). We further convert the hidden vector to an output variable.
If the output is the value part of a continuous variable, we compute the output as \(v_i = \textrm{tanh}(W_t f_t)\). The hidden vector for \(t + 1\) step is \(f_t\).
If the output is the cluster part of a continuous variable, we compute the output as \(u_i = \textrm{softmax}(W_t f_t)\). The feature vector for \(t + 1\) step is \(f_t\).
If the output is a discrete variable, we compute the output as \(d_i = \textrm{softmax}(W_t f_t)\). The hidden vector for the \(t + 1\) step is \(f^{\prime}_{t} = E_i [\arg\max_k \hspace{0.25em} d_i]\), where \(E_i \in \mathbb{R}^{|D_i| \times n_f}\) is an embedding matrix for discrete variable \(D_i\).
\(f_0\) is a special vector \(\texttt{<GO>}\) and we learn it during the training.
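The attention step above can be sketched in numpy (shapes are illustrative; in the model, \(\alpha_t\) is a learned parameter rather than a fixed array):

```python
import numpy as np

def attention_context(h, alpha):
    # h: stacked previous LSTM outputs h_{1:t}, shape (t, n_h); alpha: weights, shape (t,)
    w = np.exp(alpha - alpha.max())
    w = w / w.sum()            # softmax over the t previous steps
    return w @ h               # a_t, shape (n_h,)

h = np.arange(6.0).reshape(3, 2)          # t=3 outputs with n_h=2
a_t = attention_context(h, np.zeros(3))   # uniform weights -> plain average
print(a_t)  # [2. 3.]
```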
- Parameters
    z (tensorflow.Tensor) – Noise tensor fed to the generator.
- Returns
    Output tensors of the generator.
- Return type
    list[tensorflow.Tensor]
- Raises
    ValueError – If any of the elements in self.metadata['details'] has an unsupported value in the type key.
class tgan.model.TGANModel(continuous_columns, output='output', gpu=None, max_epoch=5, steps_per_epoch=10000, save_checkpoints=True, restore_session=True, batch_size=200, z_dim=200, noise=0.2, l2norm=1e-05, learning_rate=0.001, num_gen_rnn=100, num_gen_feature=100, num_dis_layers=1, num_dis_hidden=100, optimizer='AdamOptimizer')

Bases: object

Main model from TGAN.
- Parameters
    continuous_columns (list[int]) – 0-indexed list of column indices to be considered continuous.
    output (str, optional) – Path to store the model and its artifacts. Defaults to output.
    gpu (list[str], optional) – Comma-separated list of GPU(s) to use. Defaults to None.
    max_epoch (int, optional) – Number of epochs to use during training. Defaults to 5.
    steps_per_epoch (int, optional) – Number of steps to run on each epoch. Defaults to 10000.
    save_checkpoints (bool, optional) – Whether or not to store checkpoints of the model after each training epoch. Defaults to True.
    restore_session (bool, optional) – Whether or not to continue training from the last checkpoint. Defaults to True.
    batch_size (int, optional) – Size of the batch to feed the model at each step. Defaults to 200.
    z_dim (int, optional) – Number of dimensions in the noise input for the generator. Defaults to 200.
    noise (float, optional) – Upper bound to the gaussian noise added to categorical columns. Defaults to 0.2.
    l2norm (float, optional) – L2 regularization coefficient when computing losses. Defaults to 1e-05.
    learning_rate (float, optional) – Learning rate for the optimizer. Defaults to 0.001.
    num_gen_rnn (int, optional) – Size of the hidden units in the generator LSTM. Defaults to 100.
    num_gen_feature (int, optional) – Number of features in the generator. Defaults to 100.
    num_dis_layers (int, optional) – Number of layers in the discriminator. Defaults to 1.
    num_dis_hidden (int, optional) – Number of hidden units in each discriminator layer. Defaults to 100.
    optimizer (str, optional) – Name of the optimizer to use during fit; possible values are: [GradientDescentOptimizer, AdamOptimizer, AdadeltaOptimizer]. Defaults to AdamOptimizer.
fit(data)

Fit the model to the given data.

- Parameters
    data (pandas.DataFrame) – Dataset to fit the model.
- Returns
    None
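A hypothetical usage sketch (the DataFrame, the chosen column indices and the keyword values are illustrative; the sampling call assumes the companion sample method of TGANModel, which is not documented in this section):

```python
# Illustrative only: `data` is a pandas.DataFrame whose columns 0 and 3
# hold continuous values; all other values here are assumptions.
from tgan.model import TGANModel

tgan = TGANModel(continuous_columns=[0, 3], output='output', max_epoch=5)
tgan.fit(data)                # train, storing checkpoints under `output`
samples = tgan.sample(1000)   # assumed companion method: draw 1000 synthetic rows
```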