DCGAN

Summary

DCGAN is one of the most popular and successful network designs for GANs. It consists mainly of convolution layers, with no max pooling or fully connected layers, and it uses strided convolutions for downsampling and transposed convolutions for upsampling. The figure below shows the network design for the generator.

• Replace all max pooling with convolutional stride.

• Use transposed convolution for upsampling.

• Eliminate fully connected layers.

• Use batch normalization everywhere except the output layer of the generator and the input layer of the discriminator.

• Use ReLU in the generator, except for the output layer, which uses tanh.

• Use LeakyReLU in the discriminator.
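The guidelines above can be sketched in PyTorch as follows. This is a minimal illustration, not the reference implementation: the 64x64 RGB image size, the latent size `LATENT_DIM = 100`, the base `width` of 64 channels, the 4x4 kernels, and the LeakyReLU slope of 0.2 are all assumptions chosen for the example. The discriminator here returns a raw logit rather than a sigmoid probability, which is also an assumption (it pairs with `nn.BCEWithLogitsLoss`).

```python
import torch
from torch import nn

LATENT_DIM = 100  # assumed size of the noise vector


class Generator(nn.Module):
    """Maps noise (N, LATENT_DIM, 1, 1) to images (N, 3, 64, 64) in [-1, 1]."""

    def __init__(self, latent_dim=LATENT_DIM, width=64, channels=3):
        super().__init__()

        def up(c_in, c_out, stride, padding):
            # Transposed convolution for upsampling, then batch norm and ReLU.
            return [
                nn.ConvTranspose2d(c_in, c_out, 4, stride, padding, bias=False),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
            ]

        self.net = nn.Sequential(
            *up(latent_dim, width * 8, 1, 0),  # 1x1 -> 4x4
            *up(width * 8, width * 4, 2, 1),   # 4x4 -> 8x8
            *up(width * 4, width * 2, 2, 1),   # 8x8 -> 16x16
            *up(width * 2, width, 2, 1),       # 16x16 -> 32x32
            # Output layer: no batch norm, tanh instead of ReLU.
            nn.ConvTranspose2d(width, channels, 4, 2, 1, bias=False),  # 32x32 -> 64x64
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)


class Discriminator(nn.Module):
    """Maps images (N, 3, 64, 64) to one logit per image."""

    def __init__(self, width=64, channels=3):
        super().__init__()

        def down(c_in, c_out, batch_norm=True):
            # Strided convolution replaces max pooling for downsampling.
            layers = [nn.Conv2d(c_in, c_out, 4, 2, 1, bias=False)]
            if batch_norm:
                layers.append(nn.BatchNorm2d(c_out))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        self.net = nn.Sequential(
            *down(channels, width, batch_norm=False),  # 64x64 -> 32x32, no batch norm on the input layer
            *down(width, width * 2),                   # 32x32 -> 16x16
            *down(width * 2, width * 4),               # 16x16 -> 8x8
            *down(width * 4, width * 8),               # 8x8 -> 4x4
            # A 4x4 convolution instead of a fully connected layer: 4x4 -> 1x1.
            nn.Conv2d(width * 8, 1, 4, 1, 0, bias=False),
        )

    def forward(self, x):
        return self.net(x).view(-1)


# Shape check with a batch of two noise vectors.
z = torch.randn(2, LATENT_DIM, 1, 1)
fake = Generator()(z)
logits = Discriminator()(fake)
print(fake.shape, logits.shape)
```

Each stride-2 layer halves or doubles the spatial size, so four of them plus one 4x4 layer at the end map between a 1x1 latent and a 64x64 image without any pooling or fully connected layers.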

#1: Normalize the inputs

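from torchvision import transforms

# Normalize each of the three RGB channels with mean 0.5 and std 0.5.
transforms.Normalize(
	mean=(0.5, 0.5, 0.5),
	std=(0.5, 0.5, 0.5),
)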

Normalize does the following for each channel:

image = (image - mean) / std

The parameters mean and std are both passed as 0.5 in your case. For an image with values in [0, 1], this normalizes it to the range [-1, 1], which matches the range of the generator's tanh output. For example, the minimum value 0 is converted to (0-0.5)/0.5=-1, and the maximum value 1 is converted to (1-0.5)/0.5=1.

If you would like to get your image back into the [0, 1] range, you can invert the formula above:

image = image * std + mean

As for whether this helps a CNN learn better, I'm not sure. But the majority of the papers I have read employ some normalization scheme, and the one you are following is one of them.
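As a quick check of the arithmetic, here is a small round-trip sketch in plain PyTorch. The helper names `normalize` and `denormalize` are hypothetical and written out by hand to mirror the per-channel formula; the inverse is derived from that formula rather than taken from any library call, and the random 8x8 tensor is only a stand-in for a real image.

```python
import torch

# Per-channel statistics, shaped (3, 1, 1) so they broadcast over (3, H, W).
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)


def normalize(image):
    # Same per-channel formula as above: (image - mean) / std.
    return (image - mean) / std


def denormalize(image):
    # Inverse of normalize: maps [-1, 1] back to [0, 1].
    return image * std + mean


image = torch.rand(3, 8, 8)  # stand-in for an RGB image with values in [0, 1]
normalized = normalize(image)
restored = denormalize(normalized)

print(normalized.min().item(), normalized.max().item())
print(torch.allclose(restored, image, atol=1e-6))
```

The same `denormalize` step is useful before saving or plotting generator samples, since a tanh output also lies in [-1, 1].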

#2: A modified loss function

#3:

How to Identify and Diagnose GAN Failure Modes - Machine Learning Mastery

How to Identify Unstable Models When Training Generative Adversarial Networks. GANs are difficult to train. The reason they are difficult to train is that both the generator model and the discriminator model are trained simultaneously in a zero sum game. This means that improvements to one model come at the expense of the other model.

https://machinelearningmastery.com/practical-guide-to-gan-failure-modes/
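To make the "trained simultaneously" point concrete, here is a minimal sketch of one alternating training step that returns both losses so they can be logged. Everything in it is an assumption for illustration: the tiny convolutional stand-in models, the 8x8 random "real" batch, the standard binary cross-entropy GAN loss, and the Adam settings (learning rate 2e-4, beta1 0.5). It is not code from the linked article.

```python
import torch
from torch import nn

torch.manual_seed(0)

LATENT_DIM = 8  # assumed noise size for this toy example

# Tiny stand-in models (hypothetical), just large enough to run the loop.
generator = nn.Sequential(
    nn.ConvTranspose2d(LATENT_DIM, 8, 4, 1, 0),  # 1x1 -> 4x4
    nn.ReLU(),
    nn.ConvTranspose2d(8, 3, 4, 2, 1),           # 4x4 -> 8x8
    nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Conv2d(3, 8, 4, 2, 1),  # 8x8 -> 4x4
    nn.LeakyReLU(0.2),
    nn.Conv2d(8, 1, 4, 1, 0),  # 4x4 -> 1x1 logit
    nn.Flatten(),              # (N, 1, 1, 1) -> (N, 1)
)

criterion = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))


def train_step(real):
    """Run one discriminator update, then one generator update."""
    batch = real.size(0)
    ones = torch.ones(batch, 1)
    zeros = torch.zeros(batch, 1)

    # Discriminator step: push real images toward 1 and fakes toward 0.
    # detach() keeps this loss from sending gradients into the generator.
    fake = generator(torch.randn(batch, LATENT_DIM, 1, 1))
    d_loss = criterion(discriminator(real), ones) + criterion(
        discriminator(fake.detach()), zeros
    )
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the updated discriminator output 1 for fakes.
    g_loss = criterion(discriminator(fake), ones)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

    return d_loss.item(), g_loss.item()


# Random data in [-1, 1] stands in for a batch of normalized real images.
history = [train_step(torch.rand(16, 3, 8, 8) * 2 - 1) for _ in range(5)]
for step, (d_loss, g_loss) in enumerate(history):
    print(f"step {step}: d_loss={d_loss:.3f} g_loss={g_loss:.3f}")
```

Logging both losses at every step is a common starting point for spotting unstable training, for example a discriminator loss that collapses toward zero while the generator loss keeps growing.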