Generative adversarial networks (GANs) offer a distinct and promising approach to image synthesis based on a game-theoretic formulation. In the GAN framework, a generative model is pitted against an adversary: a discriminative model learns to determine whether a sample comes from the model distribution or the data distribution. The generative model can be thought of as a team of counterfeiters, trying to produce fake currency and use it without detection, while the discriminative model is analogous to the police, trying to detect the counterfeit currency. Competition in this game drives both teams to improve their methods until the counterfeits are indistinguishable from the genuine articles. GANs thus cast generative modeling as a game between two networks: a generator network produces synthetic data given some noise source, and a discriminator network discriminates between the generator's output and true data. GANs can produce very visually appealing samples, but they are often hard to train, and much of the recent work on the subject has been devoted to finding ways of stabilizing training.
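To make the game concrete, here is a minimal sketch of the alternating updates in PyTorch; the architectures, dimensions, and hyperparameters are illustrative assumptions, not taken from any of the works discussed here.

```python
import torch
import torch.nn as nn

z_dim, x_dim = 64, 784  # assumed noise and data dimensions
G = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(x_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()  # D outputs raw logits

def train_step(real):  # real: (batch, x_dim) tensor of data samples
    n = real.size(0)
    ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)

    # Discriminator step: push D(real) toward 1 and D(G(z)) toward 0.
    fake = G(torch.randn(n, z_dim)).detach()
    loss_D = bce(D(real), ones) + bce(D(fake), zeros)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Generator step: try to fool D by pushing D(G(z)) toward 1
    # (the commonly used non-saturating form of the generator loss).
    fake = G(torch.randn(n, z_dim))
    loss_G = bce(D(fake), ones)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```

In practice the two updates are simply alternated over minibatches; much of the instability discussed below stems from this simultaneous optimization of two competing objectives.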

The Generative Adversarial Nets framework was proposed for estimating generative models via an adversarial process in which two models are trained simultaneously: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than from G. The training procedure for G is to maximize the probability of D making a mistake, so the framework corresponds to a minimax two-player game. Experiments demonstrated the potential of the framework through qualitative and quantitative evaluation of the generated samples, and the framework can yield specific training algorithms for many kinds of models and optimization algorithms.
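Concretely, the original formulation has D and G play the following two-player minimax game on the value function V(D, G):

\[
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big],
\]

where D(x) is the probability that x came from the data rather than from G, and G maps noise z drawn from a prior p_z to the data space.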

Several techniques have been proposed to stabilize training, making it possible to train models that were previously untrainable. In Improved Techniques for Training GANs, the authors present a variety of new architectural features and training procedures that can be applied to the GAN framework. Their proposed evaluation metric, the Inception score, gives a basis for comparing the quality of generative models. Applying their techniques to semi-supervised learning, they achieve state-of-the-art results in semi-supervised classification on MNIST, CIFAR-10, and SVHN.
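The Inception score passes generated images through a pretrained Inception classifier and rewards samples whose conditional label distribution p(y|x) is sharp while the marginal label distribution p(y) over the whole sample set is broad: IS = exp(E_x[KL(p(y|x) ‖ p(y))]). A minimal sketch of the computation, assuming `probs` is an (N, num_classes) NumPy array of softmax outputs from such a classifier (obtaining those outputs is omitted here, and the original procedure additionally averages the score over several splits of the samples):

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    # Marginal label distribution p(y) over the generated samples.
    p_y = probs.mean(axis=0, keepdims=True)
    # Per-sample KL(p(y|x) || p(y)), then exponentiate the mean.
    kl = (probs * (np.log(probs + eps) - np.log(p_y + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))
```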

Image synthesis models provide a unique opportunity for performing semi-supervised learning: these models build a rich prior over natural image statistics that can be leveraged by classifiers to improve predictions on datasets for which few labels exist. The work Conditional Image Synthesis with Auxiliary Classifier GANs introduces new methods for the improved training of GANs for image synthesis; it proposes the AC-GAN architecture and demonstrates that AC-GANs can generate globally coherent ImageNet samples. Using a new metric, the authors demonstrate that the samples obtained are more discriminable than those from a model that generates lower-resolution images and performs a naive resize operation. The AC-GAN model can perform semi-supervised learning by ignoring the component of the loss arising from class labels when a label is unavailable for a given training image.
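In the AC-GAN, every generated sample has a class label c in addition to the noise z, and the discriminator outputs both a distribution over sources (real vs. fake) and a distribution over class labels. The objective has two parts, the log-likelihood of the correct source, L_S, and the log-likelihood of the correct class, L_C:

\[
L_S = \mathbb{E}\big[\log P(S = \text{real} \mid X_{\text{real}})\big] + \mathbb{E}\big[\log P(S = \text{fake} \mid X_{\text{fake}})\big],
\]
\[
L_C = \mathbb{E}\big[\log P(C = c \mid X_{\text{real}})\big] + \mathbb{E}\big[\log P(C = c \mid X_{\text{fake}})\big].
\]

D is trained to maximize L_S + L_C, while G is trained to maximize L_C − L_S; dropping the L_C term for unlabeled images is what makes the semi-supervised extension straightforward.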

In Wasserstein GAN, the authors introduce an algorithm they call WGAN, an alternative to traditional GAN training. The method improves the stability of learning, avoids problems like mode collapse, and provides meaningful learning curves useful for debugging and hyperparameter searches. The corresponding optimization problem is sound, and the paper provides extensive theoretical work highlighting deep connections to other distances between distributions.
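In place of the Jensen–Shannon divergence implicit in the standard GAN objective, WGAN approximately minimizes the Earth-Mover (Wasserstein-1) distance: the discriminator becomes a critic f, constrained to be 1-Lipschitz, and the game is min_G max_f E_x[f(x)] − E_z[f(G(z))], with the constraint enforced by clipping the critic's weights after every update. A minimal PyTorch sketch of the critic update follows; the clipping threshold and RMSProp learning rate are those reported in the paper, while the network shapes are illustrative assumptions carried over from the earlier sketch.

```python
import torch
import torch.nn as nn

z_dim, x_dim = 64, 784
G = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim))
critic = nn.Sequential(nn.Linear(x_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_C = torch.optim.RMSprop(critic.parameters(), lr=5e-5)
clip = 0.01  # weight-clipping threshold from the paper

def critic_step(real):
    fake = G(torch.randn(real.size(0), z_dim)).detach()
    # Maximize E[f(real)] - E[f(fake)] by minimizing its negation;
    # note the critic outputs an unbounded score, not a probability.
    loss = critic(fake).mean() - critic(real).mean()
    opt_C.zero_grad(); loss.backward(); opt_C.step()
    for p in critic.parameters():  # crude Lipschitz enforcement
        p.data.clamp_(-clip, clip)
```

The critic is typically updated several times per generator update, and the generator then minimizes −E_z[f(G(z))].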

Generative Adversarial Networks suffer from training instability. In Improved Training of Wasserstein GANs, the authors propose an alternative to clipping weights: penalizing the norm of the gradient of the critic with respect to its input (a sketch of the penalty is given below). The proposed method enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning, including 101-layer ResNets and language models with continuous generators, and strong modeling performance and stability across a variety of architectures are demonstrated. Another interesting direction is adapting the penalty term to the standard GAN objective, where it might stabilize training by encouraging the discriminator to learn smoother decision boundaries.
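The penalty samples points x̂ uniformly along straight lines between pairs of real and generated samples and drives the critic's gradient norm at those points toward 1, adding λ·E[(‖∇_x̂ D(x̂)‖₂ − 1)²] to the critic loss, with λ = 10 in the paper. A minimal sketch for vector-valued inputs, reusing the assumed shapes from the sketches above (image inputs would need eps broadcast over the extra dimensions):

```python
import torch

def gradient_penalty(critic, real, fake, lam=10.0):
    # Interpolate uniformly between real and generated samples.
    eps = torch.rand(real.size(0), 1)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    # Per-sample gradients of the critic output with respect to x_hat;
    # create_graph=True keeps the penalty differentiable for the critic update.
    grads = torch.autograd.grad(critic(x_hat).sum(), x_hat, create_graph=True)[0]
    return lam * ((grads.norm(2, dim=1) - 1.0) ** 2).mean()
```

Because the penalty is imposed per sample, the paper avoids batch normalization in the critic and suggests layer normalization instead.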