Inpainting with AI — get back your images! [PyTorch]


Solving the problem of Image Inpainting with PyTorch and Python

Photo by James Pond on Unsplash

Did you know the old childhood photo you have in that dusty album can be restored? Yeah, that one in which everyone is holding hands and having the time of their lives! Don’t believe me? Check this out here —

Inpainting is a conservation process where damaged, deteriorating, or missing parts of an artwork are filled in to present a complete image. [1] This process can be applied to both physical and digital art mediums such as oil or acrylic paintings, chemical photographic prints, 3-dimensional sculptures, or digital images and video. — https://en.wikipedia.org/wiki/Inpainting

Image inpainting is an active area of AI research where neural networks have produced inpainting results rivaling those of skilled artists. In this article, we are going to discuss image inpainting using neural networks — specifically context encoders. This article explains and implements the research work on context encoders presented at CVPR 2016.

Context Encoders

To get started with context encoders, we first have to learn what autoencoders are. An autoencoder structurally consists of an encoder, a decoder, and a bottleneck between them. A general autoencoder learns a compact representation of an image by forcing it through the bottleneck, discarding noise along the way. Autoencoders are, however, not specific to images and can be extended to other data as well. There are specific variants of autoencoders for specific tasks.

Autoencoder Architecture
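As a concrete illustration, a minimal convolutional autoencoder can be sketched in PyTorch as follows. The layer sizes, the 64×64 input resolution, and the 100-dimensional bottleneck are assumptions for illustration, not part of the article's model:

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Minimal convolutional autoencoder: 64x64 image -> bottleneck -> 64x64 image."""
    def __init__(self, bottleneck=100):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),   # 64 -> 32
            nn.ReLU(True),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),  # 32 -> 16
            nn.ReLU(True),
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, bottleneck),        # compact latent vector
        )
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck, 64 * 16 * 16),
            nn.ReLU(True),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),  # 16 -> 32
            nn.ReLU(True),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),   # 32 -> 64
            nn.Tanh(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```

The bottleneck forces the network to keep only the information it needs to reconstruct the input — that compression is the whole point of the architecture.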

Now that we know about autoencoders, we can describe context encoders as an analogy to them. A context encoder is a convolutional neural network trained to generate the contents of an arbitrary image region on the basis of its surroundings — i.e., a context encoder takes in the data surrounding an image region and tries to generate something that fits into that region. Just as we fitted jigsaw puzzles when we were small — only back then we didn't have to generate the puzzle pieces ;)

Our context encoder here consists of an encoder capturing the context of an image in a compact latent feature representation and a decoder which uses that representation to produce the missing image content. Missing image content? — Since we need an enormous dataset to train a neural network, we cannot afford to work only with genuinely damaged images. So we block out portions of images from standard image datasets to create artificial inpainting problems and feed those images to the neural network, thus creating missing image content at the regions we block.

[It is important to note here that the images fed to the neural network have too many missing portions for classical inpainting methods to work at all.]

Use of GAN

GANs, or Generative Adversarial Networks, have been shown to be extremely useful for image generation. They run on a basic principle: a generator tries to 'fool' a discriminator, while the discriminator tries to catch the generator's fakes. In other words, the two networks respectively try to minimize and maximize a shared loss function in a minimax game.

More about GANs here — https://medium.com/@hmrishavbandyopadhyay/generative-adversarial-networks-hard-not-eea78c1d3c95

Region Masks

Region masks are the portions of images we block out so that we can feed the resulting inpainting problems to the model. Blocking out simply means setting the pixel values in that region to zero. There are three ways we can do this —

  1. Central Region: The simplest way of blocking out image data is to set a central square patch to zero. Although the network learns inpainting, we face a generalization problem: the network fails to generalize well, and only low-level features are learned.
  2. Random Block: To counter the problem of the network 'latching' onto the masked region boundary, as with the central region mask, the masking process is randomized. Instead of a single square patch, a number of overlapping square masks are placed, covering up to 1/4 of the image.
  3. Random Region: Random block masking, however, still leaves sharp boundaries for the network to latch onto. To deal with this, arbitrary shapes are removed from the images. These shapes can be obtained from the PASCAL VOC 2012 dataset, deformed, and placed as masks at random image locations.

From left — a)Central region mask, b) Random block mask, c)Random region mask [ source: https://arxiv.org/abs/1604.07379 ]

Here, I have implemented only the central region masking method, as this is just a guide to get you started on inpainting with AI. Feel free to try the other masking methods and let me know about the results in the comments!
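The central region masking step can be sketched as below. The default patch of half the image side (so the hole covers a quarter of the image area) follows the paper's setup; the helper name and return convention are my own:

```python
import torch

def central_region_mask(images, patch=None):
    """Zero out a central square patch of each image in the batch.

    Returns the masked batch and the removed patch (the inpainting target).
    """
    n, c, h, w = images.shape
    if patch is None:
        patch = h // 2                     # e.g. a 32x32 hole in a 64x64 image
    top, left = (h - patch) // 2, (w - patch) // 2
    target = images[:, :, top:top + patch, left:left + patch].clone()
    masked = images.clone()
    masked[:, :, top:top + patch, left:left + patch] = 0.0  # block out pixels
    return masked, target
```

The masked batch goes into the generator, and the removed patch serves as ground truth for the reconstruction loss.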

Structure

By now, you should have some idea about the model. Let’s see if you’re correct ;)

The model consists of an encoder and a decoder section, making up the context-encoder part of the model. This part also acts as the generator, which generates data and tries to fool the discriminator. The discriminator consists of convolutional layers followed by a Sigmoid function that finally outputs a single scalar.

Loss

The loss function of the model is divided into two parts:

  1. Reconstruction Loss — The reconstruction loss is an L2 loss. It helps capture the overall structure of the missing region and its coherence with the surrounding context. Mathematically, it is expressed as —
$\mathcal{L}_{rec}(x) = \lVert \hat{M} \odot (x - F((1 - \hat{M}) \odot x)) \rVert_2^2$

where $\hat{M}$ is the binary mask (1 for dropped pixels, 0 for input pixels), $\odot$ is element-wise multiplication, and $F$ is the context encoder.

It is important to note here that using only the L2 loss would give us a blurry image, because a blurry prediction reduces the mean pixel-wise error by averaging over all plausible completions — the L2 loss is minimized, but not in the way we want.
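As a sketch, the masked L2 term can be written in PyTorch like this. The convention that the mask is 1 inside the missing region, and the averaging over all pixels, are assumptions of this sketch:

```python
import torch
import torch.nn.functional as F

def reconstruction_loss(pred, target, mask):
    """Masked L2 term: penalize pixel error only inside the missing region.

    `mask` is 1 inside the hole, 0 elsewhere; the mean is taken over all
    pixels, so the hole's share of the image scales the loss.
    """
    return F.mse_loss(pred * mask, target * mask)
```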

2. Adversarial Loss — This tries to make the prediction 'look' real (remember, the generator has to fool the discriminator!), which helps us get past the blurry images that the L2 loss alone would produce. Mathematically, we can express it as —

$\mathcal{L}_{adv} = \max_D \; \mathbb{E}_{x \in \mathcal{X}} \left[ \log D(x) + \log\left(1 - D\left(F((1 - \hat{M}) \odot x)\right)\right) \right]$

where $D$ is the adversarial discriminator.

Here an interesting observation is that the adversarial loss encourages the entire output to look real and not just the missing part. The adversarial network, in other words, gives the whole image a realistic look.

The total loss function:

$\mathcal{L} = \lambda_{rec} \mathcal{L}_{rec} + \lambda_{adv} \mathcal{L}_{adv}$

(the paper uses $\lambda_{rec} = 0.999$ and $\lambda_{adv} = 0.001$).

Let’s build it!

Now that we have covered the main points of the network, let's get down to building the model. I will first build the model structure and then get to the training and loss function parts. The model will be built with the PyTorch library in Python.

Let’s start with the generator network:

The generator model for the network — implemented as a python module
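A sketch of such a generator module is below. The channel counts and the 4000-channel bottleneck follow the paper's architecture as commonly re-implemented; treat the exact layer details as assumptions rather than the article's verbatim code:

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Context-encoder generator: 64x64 masked image -> 32x32 predicted patch."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1),              # 64 -> 32
            nn.LeakyReLU(0.2, True),
            nn.Conv2d(64, 128, 4, 2, 1),            # 32 -> 16
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2, True),
            nn.Conv2d(128, 256, 4, 2, 1),           # 16 -> 8
            nn.BatchNorm2d(256),
            nn.LeakyReLU(0.2, True),
            nn.Conv2d(256, 512, 4, 2, 1),           # 8 -> 4
            nn.BatchNorm2d(512),
            nn.LeakyReLU(0.2, True),
            nn.Conv2d(512, 4000, 4),                # 4 -> 1: latent bottleneck
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(4000, 512, 4),       # 1 -> 4
            nn.BatchNorm2d(512),
            nn.ReLU(True),
            nn.ConvTranspose2d(512, 256, 4, 2, 1),  # 4 -> 8
            nn.BatchNorm2d(256),
            nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1),  # 8 -> 16
            nn.BatchNorm2d(128),
            nn.ReLU(True),
            nn.ConvTranspose2d(128, 3, 4, 2, 1),    # 16 -> 32: missing patch
            nn.Tanh(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```

Note that the decoder outputs only the 32×32 missing patch, not the full image — the surrounding context is already known.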

Now, the discriminator network:

The discriminator network — implemented as a module
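A sketch of such a discriminator, again with assumed layer sizes, taking the 32×32 real or generated patch down to a single Sigmoid scalar:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Convolutional discriminator: 32x32 patch -> single real/fake score in [0, 1]."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1),       # 32 -> 16
            nn.LeakyReLU(0.2, True),
            nn.Conv2d(64, 128, 4, 2, 1),     # 16 -> 8
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2, True),
            nn.Conv2d(128, 256, 4, 2, 1),    # 8 -> 4
            nn.BatchNorm2d(256),
            nn.LeakyReLU(0.2, True),
            nn.Conv2d(256, 1, 4),            # 4 -> 1: single scalar
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x).view(-1)          # one score per image in the batch
```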

Let’s start training the network now. We will set the batch-size to 64, and the number of epochs to 100. The learning rate is set to 0.0002.

Training module for training the generator and the discriminator
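A self-contained sketch of the adversarial training step is below. Tiny stand-in networks replace the full generator and discriminator so the loop runs on its own; the learning rate of 0.0002 comes from the article, and the loss weights (0.999 reconstruction, 0.001 adversarial) from the paper:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in networks (use the full Generator/Discriminator in practice).
gen = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Tanh())
disc = nn.Sequential(nn.Conv2d(3, 1, 64), nn.Sigmoid(), nn.Flatten())

opt_g = torch.optim.Adam(gen.parameters(), lr=0.0002, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(disc.parameters(), lr=0.0002, betas=(0.5, 0.999))
bce, mse = nn.BCELoss(), nn.MSELoss()
w_rec, w_adv = 0.999, 0.001              # loss weights from the paper

real = torch.rand(4, 3, 64, 64)          # toy "dataset" batch
masked = real.clone()
masked[:, :, 16:48, 16:48] = 0.0         # central region mask

for step in range(2):                    # a couple of steps for illustration
    # --- discriminator: real images vs. generated ones ---
    fake = gen(masked)
    d_real = disc(real).view(-1)
    d_fake = disc(fake.detach()).view(-1)
    loss_d = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # --- generator: fool the discriminator + reconstruct the hole ---
    d_fake = disc(fake).view(-1)
    loss_g = w_adv * bce(d_fake, torch.ones_like(d_fake)) + \
             w_rec * mse(fake[:, :, 16:48, 16:48], real[:, :, 16:48, 16:48])
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

In the full training run, this inner step is repeated over every batch of the dataset for all 100 epochs.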

Results

Let’s take a glance at what our model has been able to build!

Images at the zeroth epoch (noise) —

Image at zeroth epoch

Images at the 100th epoch —

Images at the 100th epoch

Let’s see what went into the model —

Central Region Masked image

That, from this? Yeah! Pretty cool, huh?

Implement your version of the model. Watch it recreate your childhood photos — and if you are good enough, you might just recreate the future of Inpainting with AI. So, what are you waiting for?

Let me know in the comments if anything goes wrong with your implementation. Here to help :)

