Inpainting with AI — get back your images! [PyTorch]




Solving the problem of Image Inpainting with PyTorch and Python


Photo by James Pond on Unsplash

Did you know the old childhood photo you have in that dusty album can be restored? Yeah, that one in which everyone is holding hands and having the time of their lives! Don’t believe me? Check this out here —

Inpainting is a conservation process where damaged, deteriorating, or missing parts of an artwork are filled in to present a complete image. [1] This process can be applied to both physical and digital art mediums such as oil or acrylic paintings, chemical photographic prints, 3-dimensional sculptures, or digital images and video. — https://en.wikipedia.org/wiki/Inpainting

Image inpainting is an active area of AI research where AI has been able to come up with better inpainting results than most artists. In this article, we are going to discuss image inpainting using neural networks — specifically context encoders. This article explains and implements the research work on context encoders that was presented in CVPR 2016.

Context Encoders

To get started with context encoders, we first have to learn what autoencoders are. An autoencoder structurally consists of an encoder, a decoder and a bottleneck. The encoder compresses the input image into a compact representation at the bottleneck, discarding noise along the way, and the decoder reconstructs the image from that representation. Autoencoders are not specific to images, however, and can be extended to other data as well. There are specific variants of autoencoders to fulfill specific tasks.


Autoencoder Architecture
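As a toy example of this encoder/bottleneck/decoder structure, here is a minimal convolutional autoencoder in PyTorch. The layer sizes are my own choices for 28×28 single-channel images and are not part of the paper:

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Toy convolutional autoencoder: encoder -> bottleneck -> decoder."""
    def __init__(self):
        super().__init__()
        # Encoder: compress a 1x28x28 image into a small latent feature map
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 28 -> 14
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 14 -> 7
        )
        # Decoder: reconstruct the image from the latent feature map
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),    # 7 -> 14
            nn.ConvTranspose2d(16, 1, 2, stride=2), nn.Sigmoid(),  # 14 -> 28
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

x = torch.randn(4, 1, 28, 28)
out = AutoEncoder()(x)
print(out.shape)  # torch.Size([4, 1, 28, 28])
```

The reconstruction has the same shape as the input; everything the decoder needs must pass through the 32×7×7 bottleneck.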

Now that we know about autoencoders, we can describe context encoders by analogy. A context encoder is a convolutional neural network trained to generate the contents of an arbitrary image region on the basis of its surroundings — i.e. a context encoder takes in the data surrounding an image region and tries to generate something that would fit into that region. Just as we fitted jigsaw puzzles when we were small, only back then we didn’t have to generate the pieces ;)

Our context encoder here consists of an encoder capturing the context of an image into a compact latent feature representation and a decoder which uses that representation to produce the missing image content. Missing image content? — Since we need an enormous dataset to train a neural network, we cannot afford to work with just the inpainting problem images. So we block out portions of images from normal image datasets to create an inpainting problem and feed the images to the neural network, thus creating missing image content at the region we block.

[It is important to note here that the images fed to the neural network have too many missing portions for classical inpainting methods to work at all.]

Use of GAN

GANs, or Generative Adversarial Networks, have been shown to be extremely useful for image generation. They run on a basic principle: a generator tries to ‘fool’ a discriminator, while a determined discriminator tries to catch the generator out. In other words, the two networks play a minimax game, respectively trying to minimize and maximize a shared loss function.

More about GANs here — https://medium.com/@hmrishavbandyopadhyay/generative-adversarial-networks-hard-not-eea78c1d3c95

Region Masks

Region masks are the portions of images we block out so that we can feed the generated inpainting problems to the model. Blocking out simply means setting the pixel values in that region to zero. There are 3 ways we can do this —

  1. Central Region: The simplest way of blocking out image data is to set a central square patch to zero. Although the network learns inpainting this way, it fails to generalize well, and only low-level features are learned.
  2. Random Block: To counter the problem of the network ‘latching’ onto the masked region boundary, as it does with the central region mask, the masking process is randomized. Instead of a single square patch, a number of overlapping square masks are set up, together covering up to 1/4 of the image.
  3. Random Region: Random block masking, however, still leaves sharp boundaries for the network to latch onto. To deal with this, arbitrary shapes are removed from the images. These shapes can be obtained from the PASCAL VOC 2012 dataset, deformed, and placed as masks at random image locations.


From left: a) central region mask, b) random block mask, c) random region mask [source: https://arxiv.org/abs/1604.07379]

Here, I have implemented only the central region masking method, as this is just a guide to get you started on inpainting with AI. Feel free to try the other masking methods and let me know about the results in the comments!
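Central region masking can be sketched in a few lines of PyTorch. The function name and the 64-pixel patch size for 128×128 inputs are my own assumptions:

```python
import torch

def central_region_mask(imgs, patch=64):
    """Zero out a central `patch` x `patch` square in a batch of images.

    imgs: (N, C, H, W) tensor. Returns (masked_imgs, mask), where mask is 1
    inside the blocked-out region and 0 elsewhere.
    """
    n, c, h, w = imgs.shape
    top, left = (h - patch) // 2, (w - patch) // 2
    mask = torch.zeros(1, 1, h, w)
    mask[:, :, top:top + patch, left:left + patch] = 1.0
    masked = imgs * (1.0 - mask)  # pixels inside the region become zero
    return masked, mask

imgs = torch.rand(2, 3, 128, 128)
masked, mask = central_region_mask(imgs)
```

The mask tensor is returned alongside the masked batch because the loss later needs to know exactly which pixels were dropped.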

Structure

By now, you should have some idea about the model. Let’s see if you’re correct ;)

The model consists of an encoder and a decoder section, making up the context-encoder part of the model. This part also acts as the generator, which produces data and tries to fool the discriminator. The discriminator consists of convolutional layers followed by a Sigmoid function that finally gives a single scalar as output.

Loss

The loss function of the model is divided into 2 parts:

  1. Reconstruction Loss — The reconstruction loss is an L2 loss. It helps to capture the overall structure of the missing region and its coherence with the surrounding context. With $x$ the input image, $\hat{M}$ the binary mask (1 where pixels are dropped, 0 elsewhere), $\odot$ the element-wise product and $F$ the context encoder, it is expressed as:

$$\mathcal{L}_{rec}(x) = \left\lVert \hat{M} \odot \left( x - F\left((1 - \hat{M}) \odot x\right) \right) \right\rVert_2^2$$

It is important to note that using only the L2 loss would give us a blurry image: a blurry prediction reduces the mean pixel-wise error, so the L2 loss is minimized, just not in the way we want it to be.
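Assuming tensors for the real patch, the generated patch and the binary mask (the names are mine), the masked L2 reconstruction loss can be sketched as:

```python
import torch

def reconstruction_loss(real, generated, mask):
    """L2 loss computed only over the masked (missing) region."""
    diff = mask * (real - generated)
    return diff.pow(2).mean()

# Sanity check: all-ones target vs. all-zeros prediction, everything masked
real = torch.ones(1, 3, 8, 8)
fake = torch.zeros_like(real)
mask = torch.ones_like(real)
loss = reconstruction_loss(real, fake, mask)
print(loss.item())  # 1.0
```

With a partial mask, errors outside the blocked-out region contribute nothing, which is exactly what the formula above requires.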

2. Adversarial Loss — This tries to make the prediction ‘look’ real (remember, the generator has to fool the discriminator!), which helps us get past the blurry images that the L2 loss alone would lead us to. With $D$ the discriminator, it is expressed as:

$$\mathcal{L}_{adv} = \max_{D} \; \mathbb{E}_{x \in \mathcal{X}} \left[ \log D(x) + \log \left( 1 - D\left( F\left((1 - \hat{M}) \odot x\right) \right) \right) \right]$$

Here an interesting observation is that the adversarial loss encourages the entire output to look real and not just the missing part. The adversarial network, in other words, gives the whole image a realistic look.
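In practice, this adversarial objective is commonly implemented with binary cross-entropy on the discriminator's outputs. The sketch below is a generic GAN-loss formulation, not the author's exact code:

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()

def adversarial_losses(disc_real, disc_fake):
    """Standard GAN losses from discriminator outputs in (0, 1).

    disc_real: D(x) on real images; disc_fake: D(G(...)) on generated ones.
    """
    real_labels = torch.ones_like(disc_real)
    fake_labels = torch.zeros_like(disc_fake)
    # Discriminator objective: push real -> 1 and fake -> 0
    d_loss = bce(disc_real, real_labels) + bce(disc_fake, fake_labels)
    # Generator objective: fool the discriminator into outputting 1 on fakes
    g_loss = bce(disc_fake, real_labels)
    return d_loss, g_loss

# A confident discriminator (0.9 on real, 0.1 on fake) gives the generator
# a large loss and the discriminator a small one
d_loss, g_loss = adversarial_losses(torch.full((4, 1), 0.9),
                                    torch.full((4, 1), 0.1))
```

Minimizing `g_loss` is equivalent to the generator's side of the minimax game in the formula above.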

The total loss function is a weighted sum of the two:

$$\mathcal{L} = \lambda_{rec} \, \mathcal{L}_{rec} + \lambda_{adv} \, \mathcal{L}_{adv}$$

Let’s build it!

Now that we have covered the main points of the network, let's get down to building the model. I will first build the model structure and then get to the training and loss function parts. The model will be built with the PyTorch library in Python.

Let’s start with the generator network:

The generator model for the network — implemented as a python module
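The embedded gist for this module has not survived in this copy of the article. A sketch of such a generator, loosely following the context-encoder architecture from the paper (the channel sizes and 4000-dimensional bottleneck follow the paper's 128×128 setup; details may differ from the author's gist):

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Context encoder: encode a 128x128 masked image, decode a 64x64 patch."""
    def __init__(self, latent=4000):
        super().__init__()

        def down(cin, cout, bn=True):
            layers = [nn.Conv2d(cin, cout, 4, 2, 1)]
            if bn:
                layers.append(nn.BatchNorm2d(cout))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        self.encoder = nn.Sequential(
            *down(3, 64, bn=False),            # 128 -> 64
            *down(64, 64),                     # 64 -> 32
            *down(64, 128),                    # 32 -> 16
            *down(128, 256),                   # 16 -> 8
            *down(256, 512),                   # 8 -> 4
            nn.Conv2d(512, latent, 4),         # 4 -> 1: the bottleneck
            nn.LeakyReLU(0.2, inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent, 512, 4), nn.BatchNorm2d(512), nn.ReLU(True),     # 1 -> 4
            nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.ReLU(True),  # 4 -> 8
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),  # 8 -> 16
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),    # 16 -> 32
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),                              # 32 -> 64
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

out = Generator()(torch.randn(2, 3, 128, 128))
print(out.shape)  # torch.Size([2, 3, 64, 64])
```

The encoder squeezes the 128×128 context into a 1×1 bottleneck; the decoder then upsamples it into the 64×64 missing patch, with Tanh keeping outputs in [-1, 1].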

Now, the discriminator network:

The discriminator network — implemented as a module
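The discriminator gist is likewise missing here. A sketch matching the description above (convolutional layers ending in a Sigmoid that yields a single scalar), with my own channel sizes for 64×64 patches:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Classify 64x64 patches as real or generated; outputs a scalar in (0, 1)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),                      # 64 -> 32
            nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2, inplace=True),   # 32 -> 16
            nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.LeakyReLU(0.2, inplace=True),  # 16 -> 8
            nn.Conv2d(256, 512, 4, 2, 1), nn.BatchNorm2d(512), nn.LeakyReLU(0.2, inplace=True),  # 8 -> 4
            nn.Conv2d(512, 1, 4), nn.Sigmoid(),  # 4 -> 1: one probability per patch
        )

    def forward(self, x):
        return self.net(x).view(x.size(0), -1)

p = Discriminator()(torch.randn(2, 3, 64, 64))
print(p.shape)  # torch.Size([2, 1])
```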

Let’s start training the network now. We will set the batch size to 64 and the number of epochs to 100. The learning rate is set to 0.0002.

Training module for training the generator and the discriminator
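The training gist is also not preserved in this copy. Below is a self-contained sketch of one training step with the stated hyperparameters; the tiny stand-in networks exist only so the sketch runs end to end (swap in the full generator and discriminator for real training), and the λ weights follow the paper:

```python
import torch
import torch.nn as nn

# Hyperparameters from the article; lambda weights follow the paper
batch_size, epochs, lr = 64, 100, 0.0002
lambda_rec, lambda_adv = 0.999, 0.001

# Tiny stand-in networks so this sketch runs as-is
gen = nn.Sequential(                                   # masked 128x128 -> 64x64 patch
    nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(),              # 128 -> 64
    nn.Conv2d(64, 64, 4, 2, 1), nn.ReLU(),             # 64 -> 32
    nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),     # 32 -> 64
)
disc = nn.Sequential(                                  # 64x64 patch -> scalar in (0, 1)
    nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(),              # 64 -> 32
    nn.Conv2d(64, 1, 32), nn.Sigmoid(), nn.Flatten(),  # -> (N, 1)
)
opt_g = torch.optim.Adam(gen.parameters(), lr=lr, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(disc.parameters(), lr=lr, betas=(0.5, 0.999))
bce, mse = nn.BCELoss(), nn.MSELoss()

def train_step(imgs):
    """One adversarial update on a batch of 3x128x128 images in [-1, 1]."""
    n, patch = imgs.size(0), 64
    top = (imgs.size(2) - patch) // 2
    real_patch = imgs[:, :, top:top + patch, top:top + patch].clone()
    masked = imgs.clone()
    masked[:, :, top:top + patch, top:top + patch] = 0.0  # central region mask
    ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)

    # Discriminator: push real patches towards 1, generated ones towards 0
    fake_patch = gen(masked)
    d_loss = bce(disc(real_patch), ones) + bce(disc(fake_patch.detach()), zeros)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: weighted reconstruction + adversarial loss
    g_loss = lambda_rec * mse(fake_patch, real_patch) + lambda_adv * bce(disc(fake_patch), ones)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# Real training loops over the dataset: for _ in range(epochs): for imgs in loader: ...
d0, g0 = train_step(torch.rand(2, 3, 128, 128) * 2 - 1)
```

Note the `detach()` in the discriminator update: it stops gradients from the discriminator's loss flowing back into the generator.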

Results

Let’s take a glance at what our model has been able to build!

Images at the zeroth epoch (noise) —


Image at zeroth epoch

Images at the 100th epoch —


Images at the 100th epoch

Let’s see what went into the model —


Central Region Masked image

That, from this? Yeah! Pretty cool, huh?

Implement your version of the model. Watch it recreate your childhood photos — and if you are good enough, you might just recreate the future of Inpainting with AI. So, what are you waiting for?

Let me know in the comments if anything goes wrong with your implementation. Here to help :)

