Inpainting with AI — get back your images! [PyTorch]


Solving the problem of Image Inpainting with PyTorch and Python


Photo by James Pond on Unsplash

Did you know the old childhood photo you have in that dusty album can be restored? Yeah, that one in which everyone is holding hands and having the time of their lives! Don’t believe me? Check this out here —

Inpainting is a conservation process where damaged, deteriorating, or missing parts of an artwork are filled in to present a complete image. [1] This process can be applied to both physical and digital art mediums such as oil or acrylic paintings, chemical photographic prints, 3-dimensional sculptures, or digital images and video. — https://en.wikipedia.org/wiki/Inpainting

Image inpainting is an active area of AI research, and learned models can now produce inpainting results that rival those of skilled artists. In this article, we are going to discuss image inpainting using neural networks, specifically context encoders. This article explains and implements the research work on context encoders presented at CVPR 2016.

Context Encoders

To get started with context encoders, we first have to understand what autoencoders are. An autoencoder structurally consists of an encoder, a bottleneck, and a decoder: the encoder compresses the input into a compact latent representation at the bottleneck, and the decoder reconstructs the input from that representation, discarding noise and redundancy along the way. Autoencoders are not specific to images and can be applied to other kinds of data as well, and there are specialized variants of autoencoders for specific tasks.


Autoencoder Architecture
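To make the idea concrete, here is a minimal autoencoder sketch in PyTorch. The layer sizes, the 784-dimensional input (e.g. a flattened 28x28 image), and the 32-dimensional bottleneck are illustrative assumptions, not details from this article.

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Minimal autoencoder: the encoder squeezes the input down to a small
    bottleneck code, and the decoder reconstructs the input from that code."""
    def __init__(self, in_dim=784, bottleneck=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, bottleneck),            # compact latent representation
        )
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck, 256), nn.ReLU(),
            nn.Linear(256, in_dim), nn.Sigmoid(),  # reconstruction in [0, 1]
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Usage: reconstruct a batch of flattened 28x28 images.
model = AutoEncoder()
recon = model(torch.rand(16, 784))   # shape (16, 784)
```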

Now that we know about autoencoders, we can describe context encoders by analogy. A context encoder is a convolutional neural network trained to generate the contents of an arbitrary image region on the basis of its surroundings: it takes in the data surrounding an image region and tries to generate something that fits into that region. It is just like fitting jigsaw puzzles when we were small, except that here the network also has to generate the missing piece ;)

Our context encoder consists of an encoder that captures the context of an image in a compact latent feature representation and a decoder that uses that representation to produce the missing image content. Missing image content? Since we need an enormous dataset to train a neural network, we cannot rely on genuinely damaged photos alone. Instead, we block out portions of images from ordinary image datasets to create an inpainting problem and feed those images to the neural network, thereby creating missing content at the blocked regions.

[It is important to note here that the images fed to the neural network have too many missing portions for classical inpainting methods to work at all.]

Use of GANs

GANs, or Generative Adversarial Networks, have been shown to be extremely useful for image generation. They run on a basic principle: a generator tries to 'fool' a discriminator, while the discriminator tries to tell the generator's outputs apart from real data. In other words, the two networks try to minimize and maximize the same loss function, respectively.

More about GANs here — https://medium.com/@hmrishavbandyopadhyay/generative-adversarial-networks-hard-not-eea78c1d3c95
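As a quick illustration of that min-max game, here is a hedged sketch of the two training objectives using binary cross-entropy. The helper names (d_loss, g_loss) and the assumption that the discriminator returns one sigmoid score per sample are mine, not from the article.

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()

def d_loss(D, real, fake):
    # The discriminator is pushed towards 1 on real samples and 0 on fakes,
    # i.e. it tries to maximize log D(real) + log(1 - D(fake)).
    ones = torch.ones(real.size(0), 1)
    zeros = torch.zeros(fake.size(0), 1)
    return bce(D(real), ones) + bce(D(fake.detach()), zeros)

def g_loss(D, fake):
    # The generator is rewarded when the discriminator scores its fakes as
    # real (the non-saturating form of the GAN objective).
    ones = torch.ones(fake.size(0), 1)
    return bce(D(fake), ones)
```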

Region Masks

Region masks are the portions of images we block out so that we can feed the resulting inpainting problems to the model. Blocking out simply means setting the pixel values in that region to zero. There are three ways we can do this:

  1. Central Region: The simplest way of blocking out image data is to set a central square patch to zero. Although the network learns to inpaint, it fails to generalize well, and only low-level features are learned.
  2. Random Block: To counter the problem of the network 'latching' onto the masked region boundary, as it does with the central region mask, the masking process is randomized. Instead of a single square patch, a number of overlapping square masks are placed, covering up to 1/4 of the image.
  3. Random Region: Random block masking, however, still leaves sharp boundaries for the network to latch onto. To deal with this, arbitrary shapes are removed from images. These shapes can be obtained from the PASCAL VOC 2012 dataset, deformed, and placed as masks at random image locations.


From left: a) Central region mask, b) Random block mask, c) Random region mask [source: https://arxiv.org/abs/1604.07379 ]

Here, I have implemented only the central region masking method, as this is just a guide to get you started on inpainting with AI. Feel free to try the other masking methods and let me know about the results in the comments!
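Below is a sketch of the central region masking step. The 128x128 input size with a 64x64 central hole follows the paper's setup; the function name and tensor layout are my own choices for illustration.

```python
import torch

def central_region_mask(imgs, mask_size=64):
    """Zero out a central square patch of a batch of images.

    imgs: (N, C, H, W) tensor.
    Returns (masked images, ground-truth central patch)."""
    _, _, h, w = imgs.shape
    y0, x0 = (h - mask_size) // 2, (w - mask_size) // 2
    target = imgs[:, :, y0:y0 + mask_size, x0:x0 + mask_size].clone()
    masked = imgs.clone()
    masked[:, :, y0:y0 + mask_size, x0:x0 + mask_size] = 0   # block out the region
    return masked, target

# Usage: a batch of eight 128x128 RGB images.
masked, target = central_region_mask(torch.rand(8, 3, 128, 128))
```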

Structure

By now, you should have some idea about the model. Let’s see if you’re correct ;)

The model consists of an encoder and a decoder section, which together make up the context-encoder part of the model. This part also acts as the generator: it generates data and tries to fool the discriminator. The discriminator consists of convolutional layers followed by a Sigmoid function that finally gives a single scalar as output.

Loss

The loss function of the model is divided into 2 parts:

  1. Reconstruction Loss — The reconstruction loss is an L2 loss. It helps capture the overall structure of the missing region and its coherence with the surrounding context. Mathematically, it is expressed as:
L2 loss: $\mathcal{L}_{rec}(x) = \left\lVert \hat{M} \odot \big(x - F\big((1-\hat{M}) \odot x\big)\big) \right\rVert_2^2$, where $x$ is the input image, $\hat{M}$ is the binary mask (1 on the dropped region, 0 elsewhere), $\odot$ denotes element-wise multiplication, and $F$ is the context encoder.

It is important to note here that using only the L2 loss would give us a blurry image: predicting a smooth average of the plausible contents reduces the mean pixel-wise error, so the L2 loss is minimized, but not in the way we want it to be.

2. Adversarial Loss — This tries to make the prediction 'look' real (remember, the generator has to fool the discriminator!), and it helps us get past the blurry images that the L2 loss alone would lead us to. Mathematically, we can express it as:

Adversarial loss: $\mathcal{L}_{adv} = \max_{D}\; \mathbb{E}_{x \in \mathcal{X}} \big[\log D(x) + \log\big(1 - D\big(F((1-\hat{M}) \odot x)\big)\big)\big]$, where $D$ is the adversarial discriminator and $F((1-\hat{M}) \odot x)$ is the generator's prediction for the masked input.

An interesting observation here is that the adversarial loss encourages the entire output to look real, not just the missing part. In other words, the adversarial network gives the whole image a more realistic look.

The total loss function:

Total loss of the model: $\mathcal{L} = \lambda_{rec}\,\mathcal{L}_{rec} + \lambda_{adv}\,\mathcal{L}_{adv}$
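Here is a minimal sketch of that joint generator loss in PyTorch. The 0.999 / 0.001 weighting follows the original paper; the function name and its arguments are assumptions for illustration.

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()   # reconstruction (L2) term
bce = nn.BCELoss()   # adversarial term

def generator_loss(pred_patch, true_patch, d_score_on_pred,
                   lambda_rec=0.999, lambda_adv=0.001):
    """Weighted sum of the reconstruction and adversarial losses for the generator."""
    rec = mse(pred_patch, true_patch)
    # The generator wants the discriminator to score its prediction as real (1).
    adv = bce(d_score_on_pred, torch.ones_like(d_score_on_pred))
    return lambda_rec * rec + lambda_adv * adv
```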

Let’s build it!

Now that we have covered the main points of the network, let's get down to building the model. I will first build the model structure and then move on to the training and loss function parts. The model will be built with the PyTorch library in Python.

Let’s start with the generator network:

The generator model for the network — implemented as a Python module
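A sketch of what this generator can look like under the central-region setup (128x128 masked input, 64x64 predicted patch). The channel sizes and the 4000-dimensional bottleneck loosely follow the paper, but the exact layers are assumptions rather than the author's original code.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Context encoder used as the generator: an encoder compresses the 128x128
    masked image into a bottleneck, and a decoder produces the missing 64x64 patch."""
    def __init__(self, channels=3, bottleneck=4000):
        super().__init__()

        def down(cin, cout, bn=True):
            layers = [nn.Conv2d(cin, cout, 4, stride=2, padding=1)]
            if bn:
                layers.append(nn.BatchNorm2d(cout))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        # Encoder: 128 -> 64 -> 32 -> 16 -> 8 -> 4 -> 1 (bottleneck)
        self.encoder = nn.Sequential(
            *down(channels, 64, bn=False),
            *down(64, 128),
            *down(128, 256),
            *down(256, 512),
            *down(512, 512),
            nn.Conv2d(512, bottleneck, 4),        # 4x4 -> 1x1 latent code
            nn.BatchNorm2d(bottleneck),
            nn.ReLU(inplace=True),
        )
        # Decoder: 1 -> 4 -> 8 -> 16 -> 32 -> 64 (the predicted central patch)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(bottleneck, 512, 4),
            nn.BatchNorm2d(512), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(512, 256, 4, stride=2, padding=1),
            nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1),
            nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, channels, 4, stride=2, padding=1),
            nn.Tanh(),                            # outputs in [-1, 1]
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```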

Now, the discriminator network:

The discriminator network — implemented as a module
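And a matching sketch for the discriminator: it scores 64x64 patches (real or generated) with stacked strided convolutions ending in a Sigmoid, giving a single scalar per patch. The channel sizes are again my assumptions.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Scores a 64x64 patch as real (close to 1) or generated (close to 0)."""
    def __init__(self, channels=3):
        super().__init__()

        def block(cin, cout, bn=True):
            layers = [nn.Conv2d(cin, cout, 4, stride=2, padding=1)]
            if bn:
                layers.append(nn.BatchNorm2d(cout))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        # 64 -> 32 -> 16 -> 8 -> 4 -> 1
        self.model = nn.Sequential(
            *block(channels, 64, bn=False),
            *block(64, 128),
            *block(128, 256),
            *block(256, 512),
            nn.Conv2d(512, 1, 4),   # 4x4 -> single value
            nn.Sigmoid(),           # single scalar score in (0, 1)
        )

    def forward(self, x):
        return self.model(x).view(x.size(0), 1)
```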

Let’s start training the network now. We will set the batch-size to 64, and the number of epochs to 100. The learning rate is set to 0.0002.

Training module for training the generator and the discriminator
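A sketch of the training loop, wiring together the masking helper and the two networks sketched above. The batch size of 64, 100 epochs, and learning rate of 0.0002 come from the article, and the 0.999/0.001 loss weighting from the paper; the dataset path, the ImageFolder loader, and the Adam betas of (0.5, 0.999) are assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import ImageFolder

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Any folder of RGB images works; resize/crop to 128x128 and normalize to [-1, 1]
# so the images match the generator's Tanh output range.
transform = transforms.Compose([
    transforms.Resize(128), transforms.CenterCrop(128),
    transforms.ToTensor(), transforms.Normalize([0.5] * 3, [0.5] * 3),
])
loader = DataLoader(ImageFolder("data/", transform=transform),
                    batch_size=64, shuffle=True)

G, D = Generator().to(device), Discriminator().to(device)
opt_G = torch.optim.Adam(G.parameters(), lr=0.0002, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(D.parameters(), lr=0.0002, betas=(0.5, 0.999))
mse, bce = nn.MSELoss(), nn.BCELoss()
lambda_rec, lambda_adv = 0.999, 0.001

for epoch in range(100):
    for imgs, _ in loader:
        imgs = imgs.to(device)
        masked, target = central_region_mask(imgs)     # blank out the central 64x64 region
        real = torch.ones(imgs.size(0), 1, device=device)
        fake = torch.zeros(imgs.size(0), 1, device=device)

        # --- Discriminator step: real patches vs. generated patches ---
        opt_D.zero_grad()
        gen_patch = G(masked)
        loss_D = bce(D(target), real) + bce(D(gen_patch.detach()), fake)
        loss_D.backward()
        opt_D.step()

        # --- Generator step: weighted reconstruction + adversarial loss ---
        opt_G.zero_grad()
        loss_G = lambda_rec * mse(gen_patch, target) + lambda_adv * bce(D(gen_patch), real)
        loss_G.backward()
        opt_G.step()

    print(f"epoch {epoch}: loss_D={loss_D.item():.4f}, loss_G={loss_G.item():.4f}")
```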

Results

Let’s take a glance at what our model has been able to build!

Images at the zeroth epoch (noise) —


Image at zeroth epoch

Images at the 100th epoch —


Images at the 100th epoch

Let’s see what went into the model —


Central Region Masked image

That, from this? Yeah! Pretty cool, huh?

Implement your version of the model. Watch it recreate your childhood photos — and if you are good enough, you might just recreate the future of Inpainting with AI. So, what are you waiting for?

Let me know in the comments if anything goes wrong with your implementation. Here to help :)

