Implementing Deep Convolutional Generative Adversarial Networks (DCGAN)

How I Generated New Images from Random Data using DCGAN

Training a DCGAN on MNIST by Author

Deep Convolutional Generative Adversarial Networks or DCGANs are the ‘image version’ of the most fundamental implementation of GANs. This architecture essentially leverages Deep Convolutional Neural Networks to generate images belonging to a given distribution from noisy data using the Generator-Discriminator framework.

Generative Adversarial Networks use a generator network to generate new samples of data and a discriminator network to evaluate the generator’s performance. So, fundamentally, the novelty of GANs lies in the evaluator more than in the generator.

This is what sets GANs apart from other generative models: pairing a generative model with a discriminative model is what GANs are all about.

A Comprehensive Guide to Generative Adversarial Networks (GANs)

I have discussed the theory and math behind GANs in another post; consider giving it a read if you are interested in how GANs work!

In this article, we will implement DCGAN using TensorFlow and observe the results for two well-known datasets:

  1. MNIST handwritten digits dataset and
  2. CIFAR-10 image recognition dataset

Loading and Pre-processing the Data

In this section we load and prepare the data for our model.

We load the data from the tensorflow.keras.datasets module, which provides a load_data function for several well-known datasets (including the ones we need). Since the loaded images have pixel values from 0 to 255, we then normalize them to the range -1 to 1.

Preparing Data for Training
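A minimal sketch of this step, assuming CIFAR-10 (whose 32×32×3 images match the generator described later); the batch size is my assumption, and MNIST loads the same way apart from needing an extra channel dimension:

import tensorflow as tf

# Load CIFAR-10 (32x32x3 images); MNIST loads the same way via
# tf.keras.datasets.mnist.load_data(), but needs an extra channel
# dimension: x.reshape(-1, 28, 28, 1)
(train_images, _), (_, _) = tf.keras.datasets.cifar10.load_data()

# Scale pixel values from [0, 255] to [-1, 1]
train_images = (train_images.astype("float32") - 127.5) / 127.5

BUFFER_SIZE = train_images.shape[0]  # shuffle over the full training set
BATCH_SIZE = 256                     # assumed batch size

# Shuffle and batch the images for training
train_dataset = (
    tf.data.Dataset.from_tensor_slices(train_images)
    .shuffle(BUFFER_SIZE)
    .batch(BATCH_SIZE)
)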

The Generator

The generator model mainly consists of deconvolution layers or, more accurately, transposed convolution layers, which are essentially the reverse of the convolution operation.

Transposed convolution, no padding, no strides (via Dumoulin et al.)

The figure above depicts the transpose of convolving a 3×3 kernel over a 4×4 image.

This operation is tantamount to convolving a 3×3 kernel over a 2×2 image with a 2×2 border of zeros.

In my opinion, A Guide to Convolution Arithmetic for Deep Learning is one of the best papers on the convolution operations used in deep learning, and it is well worth a read!

Moving on to the generator, we take a 128-dimensional vector and map it to an 8×8×256 = 16,384-dimensional vector using a fully connected layer. This vector is reshaped to (8, 8, 256), i.e. 256 activation maps of size 8×8. We then apply several deconvolution layers and finally obtain a 3-channel image of size 32×32. This is the generated image.

Generator Model
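A minimal sketch of the generator described above (128-dimensional input, a dense layer reshaped to 8×8×256, transposed convolutions up to a 32×32×3 image). The kernel sizes, filter counts, and batch-norm/LeakyReLU choices are assumptions in the usual DCGAN style, not taken from the article:

import tensorflow as tf
from tensorflow.keras import layers

def build_generator(latent_dim=128):
    # Latent vector -> dense -> (8, 8, 256) -> transposed convolutions -> 32x32x3 image
    return tf.keras.Sequential([
        layers.Dense(8 * 8 * 256, use_bias=False, input_shape=(latent_dim,)),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Reshape((8, 8, 256)),           # 256 activation maps of size 8x8

        layers.Conv2DTranspose(128, 5, strides=2, padding="same", use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),                    # -> (16, 16, 128)

        layers.Conv2DTranspose(64, 5, strides=2, padding="same", use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),                    # -> (32, 32, 64)

        # tanh keeps outputs in [-1, 1], matching the normalized training images
        layers.Conv2DTranspose(3, 5, strides=1, padding="same",
                               use_bias=False, activation="tanh"),  # -> (32, 32, 3)
    ])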

The Discriminator

The discriminator is nothing but a binary classifier that consists of several convolution layers (as in any other image classification task). Finally, the flattened activation maps are mapped to a single probability output that predicts whether the image is real or fake.

Discriminator Model
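A minimal sketch of such a discriminator for 32×32×3 inputs; again, the filter counts, strides, and dropout rate are my assumptions. The final Dense(1) layer outputs a raw logit, which is why the loss defined below uses from_logits=True:

import tensorflow as tf
from tensorflow.keras import layers

def build_discriminator():
    # Binary classifier: strided convolutions, then a single logit (real vs. fake)
    return tf.keras.Sequential([
        layers.Conv2D(64, 5, strides=2, padding="same", input_shape=(32, 32, 3)),
        layers.LeakyReLU(),
        layers.Dropout(0.3),

        layers.Conv2D(128, 5, strides=2, padding="same"),
        layers.LeakyReLU(),
        layers.Dropout(0.3),

        layers.Flatten(),
        layers.Dense(1),   # raw logit: > 0 leans "real", < 0 leans "fake"
    ])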

Defining the Losses

Since this is a binary classification problem, the underlying loss function is binary cross-entropy. However, this loss is adapted and applied to the two networks separately so that each optimizes its own objective.

loss = tf.keras.losses.BinaryCrossentropy(from_logits=True)

The generator is essentially trying to produce images that the discriminator would accept as real. Hence, all generated images should be predicted as “1” (real), and the generator is penalized when they are not.

Generator Loss

Hence, we train the generator so that the discriminator predicts “1” for its outputs.
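A minimal sketch of this, reusing the BinaryCrossentropy object named loss above; the helper name generator_loss is mine:

def generator_loss(fake_output):
    # fake_output: discriminator logits for generated images.
    # The generator is rewarded when these are classified as real (1).
    return loss(tf.ones_like(fake_output), fake_output)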

Contrary to the generator, the discriminator wants to predict generated outputs as fake while still predicting any real image as real. Hence, the discriminator trains on a combination of these two losses.

Discriminator Loss

We train the discriminator to predict “0” (fake) for the generated images and “1” (real) for the images from the dataset.
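A corresponding sketch of the discriminator loss, summing the penalty for real images not labelled “1” and generated images not labelled “0” (the helper name discriminator_loss is mine):

def discriminator_loss(real_output, fake_output):
    # real_output / fake_output: discriminator logits for dataset and generated images
    real_loss = loss(tf.ones_like(real_output), real_output)    # real images -> 1
    fake_loss = loss(tf.zeros_like(fake_output), fake_output)   # generated images -> 0
    return real_loss + fake_loss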

Training the GAN

In each training step, we run the generator and the discriminator together. However, we apply the gradients separately, since the losses and architectures of the two models are different.

Training Epoch
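A minimal sketch of one training step along these lines, reusing the models and loss functions sketched above; the Adam optimizers and the 1e-4 learning rate are my assumptions:

generator = build_generator()
discriminator = build_discriminator()

generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

LATENT_DIM = 128

@tf.function
def train_step(images):
    noise = tf.random.normal([tf.shape(images)[0], LATENT_DIM])

    # Run both models in the same forward pass, but record them on separate
    # tapes so that each network's gradients update only its own variables.
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gen_grads, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(disc_grads, discriminator.trainable_variables))

# One epoch then simply loops over the batched dataset:
# for image_batch in train_dataset:
#     train_step(image_batch)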

After training, I got the following results:

Results @50 Epochs by Author
Results @100 Epochs by Author

Conclusion

We saw how to implement Generative Adversarial Networks, covering the Deep Convolutional flavor of GANs. There are other flavors of GANs that produce conditional outputs and can therefore prove very useful.

Here is a link to the GitHub repository of the code. Feel free to fork it!

References

Original GANs Paper: https://arxiv.org/abs/1406.2661

DCGAN paper: https://arxiv.org/abs/1511.06434

GANs Blog: https://towardsdatascience.com/a-comprehensive-guide-to-generative-adversarial-networks-gans-fcfe65d1cfe4

The code used in this guide is adapted from the official TensorFlow documentation.