**Generative Adversarial Networks (GANs) marked the first great success of Deep Learning when it comes to generative AI. We are reviewing the in and out of this foundational model paradigm. We are going to look at:**

**The overview of GAN’s architecture****The discriminator****The generator****The training****The loss functions****The convergence of the generative training**

## Overview of GAN’s architecture

The goal of Generative Adversarial Networks (GANs) is to generate fake images that seem realistic. GANs can be broken down into the following components:

**The generator**- The role of the generator is to fool the discriminator into generating images as realistic as possible**The discriminator**- the role of the discriminator is to distinguish the generated images from the real ones**The loss functions**- The loss functions are there to capture the problem as an optimization problem and will allow the model to learn. The discriminator wants to become very good at distinguishing real images from fake ones and the generator wants to become very good at fooling the discriminator with fake images.

## The Discriminator

The discriminator is simply a classifier. The input data are real images and generated images and the labels are “1“s if the image is real and “0“ if not.

In the 2014 original paper by Ian J. Goodfellow (“Generative Adversarial Nets“), the proposed discriminator was simply a feedforward network. In 2016 Alec Radford and Luke Metz proposed DCGAN, a convolutional network architecture (“Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”) with a discriminator architecture very close to the VGG16 architecture (“Very Deep Convolutional Networks for Large-Scale Image Recognition”). The architecture is comprised of a succession of convolutional layers where, for each layer, the number of output channels is twice as big as the number of input channels. The final computational unit is a sigmoid for binary classification (fake or not):

The following pseudo-code captures the architecture of the discriminator:

```
discriminator = Sequential(
```**# first layer**
Conv2d(3, 64),
LeakyReLU(0.2),
**# second layer**
Conv2d(64, 128),
BatchNorm2d(),
LeakyReLU(0.2),
**# third layer**
Conv2d(128, 256),
BatchNorm2d,
LeakyReLU(0.2),
**# fourth layer**
Conv2d(256, 512),
BatchNorm2d,
LeakyReLU(),
** # fifth layer**
nn.Conv2d(512, 1),
Sigmoid()
)

## The Generator

The generator takes as input random noise and outputs an image. Its architecture is very similar to that of the discriminator, but performs the opposite operation:

The main difference is that instead of using convolution operations, the generator uses transposed convolution operators. They can be seen as estimators of inverse convolution operations:

The following pseudo-code captures the architecture of the generator:

```
generator = Sequential(
```**# first layer**
ConvTranspose2d( 100, 512),
BatchNorm2d,
ReLU,
** # second layer**
ConvTranspose2d(512, 256),
BatchNorm2d,
ReLU,
** # third layer**
ConvTranspose2d(256, 128),
BatchNorm2d,
ReLU,
** # forth layer**
ConvTranspose2d(128, 64),
BatchNorm2d,
ReLU,
** # fifth layer**
ConvTranspose2d(64, 3),
Tanh()
)

## The Training

Training a GAN can be tricky. We iterate over epochs and data batches. The data needs to represent the distribution of images you are trying to learn. Here are the steps for one batch of images within an epoch:

**Step 1**- We get the predictions from the discriminator on the data batch. The discriminator is trying to estimate the probability that the sample is real.

## Keep reading with a 7-day free trial

Subscribe to The AiEdge Newsletter to keep reading this post and get 7 days of free access to the full post archives.