Image Augmentation Mastering: 15+ Techniques and Useful Functions with Python Codes

Category: IT Technology · Published: 4 years ago

How to use it?

Maybe at this point you do not see how simple the setup is, and yet it is. All we have to do is define a list of the transformations we want to apply to our sample, and that is it. We do not touch anything else afterward. Note that the order of the transformations matters, and it is up to you to choose it.
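To make the idea concrete, here is a minimal sketch of such a list of transformations. The `Compose` class and the two toy transformations are hypothetical names chosen for illustration, not the functions from the original article; they only show that the same list applied in a different order gives a different result.

```python
import numpy as np


class Compose:
    """Apply a list of transformations in order (hypothetical helper)."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, image):
        for transform in self.transforms:
            image = transform(image)
        return image


def add_ten(image):
    # Toy transformation: shift every pixel value by 10.
    return image + 10


def double(image):
    # Toy transformation: multiply every pixel value by 2.
    return image * 2


image = np.ones((2, 2), dtype=np.int64)
a = Compose([add_ten, double])(image)  # (1 + 10) * 2 for every pixel
b = Compose([double, add_ten])(image)  # 1 * 2 + 10 for every pixel
```

Any callable that takes an image and returns an image can be placed in the list, which is what makes the setup easy to extend.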

We can now dive into the purpose of the article and see the image augmentation techniques.

Flip

The first, and one of the simplest, consists of randomly flipping the images along the horizontal and vertical axes. In other words, there is a 50/50 chance of performing a vertical flip and a 50/50 chance of performing a horizontal flip.
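A minimal sketch of this random flip with NumPy could look as follows. The function name `random_flip` and its parameters are assumptions made for this example; the two probabilities default to the 50/50 chance described above.

```python
import numpy as np


def random_flip(image, rng, p_horizontal=0.5, p_vertical=0.5):
    """Flip an (H, W) or (H, W, C) image at random (hypothetical helper)."""
    if rng.random() < p_horizontal:
        image = image[:, ::-1]  # horizontal flip: reverse the columns
    if rng.random() < p_vertical:
        image = image[::-1, :]  # vertical flip: reverse the rows
    return image


rng = np.random.default_rng(seed=0)
image = np.arange(12).reshape(3, 4)
flipped = random_flip(image, rng)
```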

Crop

To do image augmentation, it is common to crop the image randomly. In other words, we keep a part of the image of random size taken from a random location.

The size of the cropped image can be chosen as a ratio of the image dimensions (height, width). If the maximum ratio of the crop is not specified, we consider by default that it is the full size of the image.
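Here is a minimal sketch of such a random crop. The name `random_crop` and the `min_ratio` default are assumptions for this example; `max_ratio` defaults to 1.0, i.e. the full image, as described above.

```python
import numpy as np


def random_crop(image, rng, min_ratio=0.5, max_ratio=1.0):
    """Return a crop of random size and position (hypothetical helper)."""
    height, width = image.shape[:2]
    # Draw the crop size as a ratio of each dimension.
    crop_h = max(1, int(height * rng.uniform(min_ratio, max_ratio)))
    crop_w = max(1, int(width * rng.uniform(min_ratio, max_ratio)))
    # Draw the top-left corner so that the crop stays inside the image.
    top = int(rng.integers(0, height - crop_h + 1))
    left = int(rng.integers(0, width - crop_w + 1))
    return image[top:top + crop_h, left:left + crop_w]


rng = np.random.default_rng(seed=0)
image = np.zeros((40, 60, 3))
crop = random_crop(image, rng)
```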

Kernel filters

General Case

We are going to get into something a little more enjoyable. Filters are great classics, but I think it is important to be able to easily create our own convolution filters. If you do not know how a filter works, I refer you to my article about Conv2d.

So I wanted to make a general function to be able to use our own filters.
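As an illustration, here is a minimal NumPy sketch of such a general filtering function. The name `apply_kernel` is hypothetical, the kernel is assumed to be odd-sized, and the borders are handled by repeating the edge pixels, which is only one possible choice. Like Conv2d, it computes a cross-correlation (the kernel is not flipped).

```python
import numpy as np


def apply_kernel(image, kernel):
    """Filter an (H, W) or (H, W, C) image with a 2D kernel (hypothetical helper)."""
    kernel = np.asarray(kernel, dtype=np.float64)
    k_h, k_w = kernel.shape
    pad_h, pad_w = k_h // 2, k_w // 2
    img = image.astype(np.float64)
    is_gray = img.ndim == 2
    if is_gray:
        img = img[..., None]  # add a channel axis so both cases share one path
    height, width = img.shape[:2]
    # Repeat the edge pixels so the output keeps the input size.
    padded = np.pad(img, ((pad_h, pad_h), (pad_w, pad_w), (0, 0)), mode="edge")
    out = np.zeros_like(img)
    # Sum the shifted copies of the image, each weighted by one kernel entry.
    for i in range(k_h):
        for j in range(k_w):
            out += kernel[i, j] * padded[i:i + height, j:j + width]
    return out[..., 0] if is_gray else out


identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]])
image = np.arange(25, dtype=np.float64).reshape(5, 5)
same = apply_kernel(image, identity)
```

The result is a float array; clipping and casting back to the image type is left to the caller.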

Sharpen

As far as filters are concerned, it is possible to go even further by choosing a filter upstream and applying it with a random weighting. For example, here is a filter for sharpening our image.

Value of the kernel center from 0 to 65
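One way to read this random weighting is sketched below. It is an assumption, not the kernel from the original article: we start from a Laplacian-style kernel, add a random value to its center, and normalize so the weights sum to 1. The random value is drawn from 1 to 65 rather than 0 to 65, because a value of 0 would make the normalization divide by zero. The name `random_sharpen_kernel` is hypothetical.

```python
import numpy as np


def random_sharpen_kernel(rng, max_center=65.0):
    """Build a 3x3 sharpening kernel with a random center weight (assumed form)."""
    center = rng.uniform(1.0, max_center)  # assumed range, 0 excluded
    kernel = np.array([[0.0, -1.0, 0.0],
                       [-1.0, 4.0 + center, -1.0],
                       [0.0, -1.0, 0.0]])
    # The entries sum to `center`, so dividing keeps the overall brightness.
    return kernel / kernel.sum()


rng = np.random.default_rng(seed=0)
kernel = random_sharpen_kernel(rng)
```

With this form, a small center value sharpens strongly and a large one leaves the image almost unchanged. The kernel can then be passed to any 2D filtering routine, for example OpenCV's `cv2.filter2D`.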

Blur

To finish with the filters, the most popular ones are those used to randomly blur our image. There are many ways to blur an image. The best known are the average, median, Gaussian, and bilateral filters.

Average blur

kernel size from 1 to 35

As its name indicates, the average filter replaces each pixel with the average of the values around it. This is done with a kernel whose size can be set for more or less blur. To augment our images with an average filter, we just need to filter the input image with a kernel of random size.
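A minimal sketch of this random average blur is given below. The name `random_average_blur` is hypothetical, the kernel size is drawn between 1 and 35 as in the caption above, and repeating the edge pixels at the borders is an assumption of this example.

```python
import numpy as np


def random_average_blur(image, rng, max_size=35):
    """Blur with a box kernel of random size (hypothetical helper)."""
    size = int(rng.integers(1, max_size + 1))  # kernel size in [1, max_size]
    before = size // 2
    after = size - 1 - before
    img = image.astype(np.float64)
    pad = [(before, after), (before, after)] + [(0, 0)] * (img.ndim - 2)
    padded = np.pad(img, pad, mode="edge")
    height, width = img.shape[:2]
    out = np.zeros_like(img)
    # Sum every shifted copy covered by the kernel, then divide by its area.
    for i in range(size):
        for j in range(size):
            out += padded[i:i + height, j:j + width]
    return out / (size * size)


rng = np.random.default_rng(seed=0)
image = rng.uniform(0, 255, size=(16, 16))
blurred = random_average_blur(image, rng)
```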

Gaussian blur

kernel size from 1 to 35

Finally, the Gaussian blur works in the same way as the average blur, except that the kernel values follow a Gaussian curve centered on the kernel instead of being uniform. Note that the kernel dimensions must be odd numbers.
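The sketch below builds such a Gaussian kernel with a random odd size between 1 and 35. The function names are hypothetical. Deriving sigma from the kernel size uses the heuristic documented for OpenCV's `cv2.getGaussianKernel`, which is a choice made for this example.

```python
import numpy as np


def gaussian_kernel(size, sigma):
    """Return a normalized size x size Gaussian kernel (size must be odd)."""
    if size % 2 == 0:
        raise ValueError("kernel size must be odd")
    axis = np.arange(size) - size // 2  # distances from the center
    curve = np.exp(-(axis ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(curve, curve)  # a 2D Gaussian is separable
    return kernel / kernel.sum()


def random_gaussian_kernel(rng, max_size=35):
    """Draw an odd kernel size in [1, max_size] and build the kernel."""
    size = 2 * int(rng.integers(0, (max_size + 1) // 2)) + 1
    # Heuristic linking sigma to the kernel size (assumption of this sketch).
    sigma = 0.3 * ((size - 1) * 0.5 - 1) + 0.8
    return gaussian_kernel(size, sigma)


rng = np.random.default_rng(seed=0)
kernel = random_gaussian_kernel(rng)
```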

Perspective transformations

By far the most widely used image augmentation techniques are perspective transformations: rotation, translation, shearing, and scaling. These transformations can be performed in 3D. Usually they are only used in 2D, which is a pity. Let’s take advantage of everything we have at our disposal, right?

Rotation

Translation

Shearing

Scaling

Combining Everything

I will not spend more time on the 3D transformations of a 2D image because I wrote a whole article about it. So I reused the function we get at the end of that article. I invite you to have a look at it if you want to know more about homogeneous coordinates and 3D transformation matrices.

What should be noted is that this function lets us randomly perform transformations according to the 4 proposed matrices. The order matters. Here we apply the shearing first, then the rotation, then the scaling, and finally the translation. Note that the translation is expressed as a ratio of the image dimensions.
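To illustrate the order of composition, here is a simplified sketch. It is not the 3D function from the article: it works in 2D with 3x3 homogeneous matrices, and all names and default ranges are assumptions. With column vectors, the transformation applied first is the rightmost matrix in the product.

```python
import numpy as np


def compose_transform(width, height, shear=0.0, angle_deg=0.0,
                      scale=1.0, translate_ratio=(0.0, 0.0)):
    """Compose shear, rotation, scale, translation in that order (2D sketch)."""
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    shear_m = np.array([[1.0, shear, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
    rotate_m = np.array([[cos_t, -sin_t, 0.0], [sin_t, cos_t, 0.0], [0.0, 0.0, 1.0]])
    scale_m = np.diag([scale, scale, 1.0])
    # The translation is given as a ratio of the image dimensions.
    translate_m = np.array([[1.0, 0.0, translate_ratio[0] * width],
                            [0.0, 1.0, translate_ratio[1] * height],
                            [0.0, 0.0, 1.0]])
    return translate_m @ scale_m @ rotate_m @ shear_m


def random_transform(width, height, rng, max_shear=0.2, max_angle=30.0,
                     scale_range=(0.8, 1.2), max_translate=0.1):
    """Draw random parameters (assumed ranges) and compose the matrix."""
    return compose_transform(
        width, height,
        shear=rng.uniform(-max_shear, max_shear),
        angle_deg=rng.uniform(-max_angle, max_angle),
        scale=rng.uniform(*scale_range),
        translate_ratio=(rng.uniform(-max_translate, max_translate),
                         rng.uniform(-max_translate, max_translate)),
    )


rng = np.random.default_rng(seed=0)
matrix = random_transform(640, 480, rng)
```

The resulting 3x3 matrix can be given to a warping routine such as OpenCV's `cv2.warpPerspective` to transform the image itself.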

Combining random rotation translation shearing and scale

Cutout

Cutout replacement by 0, on the whole input and cropping the target at the same time

The cutout is pretty intuitive. It involves removing regions of the input image at random. It works in the same way as the cropping we talked about earlier, but instead of returning the selected regions, we delete them. Once again, we can let the user provide a minimum and maximum size for the regions (as a ratio of the image dimensions) and a maximum number of regions. We can also let the user choose whether to cut the same regions from the target, whether to perform the cutout per channel, and which replacement value fills the deleted regions.
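Here is a minimal sketch of a cutout with these options. The name `random_cutout`, the parameter names, and the defaults are assumptions. In this sketch the target is filled with the same replacement value as the input, and it is left untouched when the cutout is done per channel.

```python
import numpy as np


def random_cutout(image, rng, target=None, max_regions=3, min_ratio=0.1,
                  max_ratio=0.3, fill=0, per_channel=False):
    """Erase random rectangles from a copy of the image (hypothetical helper)."""
    image = image.copy()
    target = None if target is None else target.copy()
    height, width = image.shape[:2]
    n_regions = int(rng.integers(1, max_regions + 1))
    for _ in range(n_regions):
        # Region size as a ratio of the image dimensions.
        h = max(1, int(height * rng.uniform(min_ratio, max_ratio)))
        w = max(1, int(width * rng.uniform(min_ratio, max_ratio)))
        top = int(rng.integers(0, height - h + 1))
        left = int(rng.integers(0, width - w + 1))
        if per_channel and image.ndim == 3:
            # Erase the region in one randomly chosen channel only.
            channel = int(rng.integers(0, image.shape[2]))
            image[top:top + h, left:left + w, channel] = fill
        else:
            image[top:top + h, left:left + w] = fill
            if target is not None:
                target[top:top + h, left:left + w] = fill
    return image, target


rng = np.random.default_rng(seed=0)
image = np.ones((20, 20, 3))
mask = np.ones((20, 20))
cut_image, cut_mask = random_cutout(image, rng, target=mask)
```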

Cutout replacement by 1, channel size on input without cropping the target

Color Spaces

Now we get to the part I find the most fun, and one that is very rarely taken into account. If we know the color spaces, we can take advantage of their properties to augment our images. To give you a simple example, with the HSV color space we can have fun extracting the leaf thanks to its color and changing that color randomly as we wish. That is a very cool thing to do! And it shows the value of having our own image augmentation functions. Of course, this requires a little more creativity, so it is important to know our color spaces to make the most of them, particularly since they can be crucial in preprocessing for our (Deep) Machine Learning models.
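As a rough sketch of the leaf example, the code below selects pixels by their hue in HSV and shifts only those hues. The function name, the hue range, and the saturation threshold are assumptions. The conversion uses Python's standard `colorsys` module through `np.vectorize`, which is slow but dependency-free; a real pipeline would more likely use a vectorized conversion such as OpenCV's `cv2.cvtColor`.

```python
import colorsys

import numpy as np

# Element-wise wrappers around the standard-library conversions.
_rgb_to_hsv = np.vectorize(colorsys.rgb_to_hsv)
_hsv_to_rgb = np.vectorize(colorsys.hsv_to_rgb)


def shift_hue_in_range(image, hue_shift, hue_min, hue_max, min_saturation=0.2):
    """Shift the hue of pixels whose hue lies in [hue_min, hue_max].

    `image` is a float RGB array in [0, 1] with shape (H, W, 3).
    Hues are expressed in [0, 1], as in `colorsys`.
    """
    h, s, v = _rgb_to_hsv(image[..., 0], image[..., 1], image[..., 2])
    # Select colored pixels in the requested hue range (e.g. green leaves).
    mask = (h >= hue_min) & (h <= hue_max) & (s >= min_saturation)
    h = np.where(mask, (h + hue_shift) % 1.0, h)
    r, g, b = _hsv_to_rgb(h, s, v)
    return np.stack([r, g, b], axis=-1)


rng = np.random.default_rng(seed=0)
image = np.zeros((2, 2, 3))
image[0, 0] = (0.0, 1.0, 0.0)  # a green pixel standing in for the leaf
image[0, 1] = (1.0, 0.0, 0.0)  # a red pixel outside the hue range
image[1, 1] = (0.5, 0.5, 0.5)  # a gray pixel with no saturation
recolored = shift_hue_in_range(image, rng.uniform(0.0, 1.0), 0.25, 0.45)
```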

Brightness

Brightness from -100 to 100

Let’s stay on our colors a little longer. A great classic in image augmentation is to play with brightness. There are several ways to do so; the simplest is to add a random bias.
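A minimal sketch of this random bias is shown below, assuming 8-bit images and a shift drawn between -100 and 100 as in the caption above. The name `random_brightness` is hypothetical.

```python
import numpy as np


def random_brightness(image, rng, max_shift=100.0):
    """Add one random bias to every pixel of a uint8 image (hypothetical helper)."""
    shift = rng.uniform(-max_shift, max_shift)
    shifted = image.astype(np.float64) + shift
    # Clip to the valid range before casting back to 8 bits.
    return np.clip(np.rint(shifted), 0, 255).astype(np.uint8)


rng = np.random.default_rng(seed=0)
image = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
brighter_or_darker = random_brightness(image, rng)
```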

Contrasts

Contrasts from -100 to 100

In the same way, it is very simple to play with contrast. This can also be done randomly.
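There are several contrast formulas. The sketch below uses a simple one chosen for this example, not necessarily the one from the article: pixel values are scaled around the image mean by a factor of 1 + amount / 100, so -100 flattens the image and 100 doubles the contrast. The function names are hypothetical.

```python
import numpy as np


def adjust_contrast(image, amount):
    """Scale a uint8 image around its mean; `amount` is in [-100, 100]."""
    factor = 1.0 + amount / 100.0
    img = image.astype(np.float64)
    mean = img.mean()
    stretched = (img - mean) * factor + mean
    return np.clip(np.rint(stretched), 0, 255).astype(np.uint8)


def random_contrast(image, rng, max_amount=100.0):
    """Apply `adjust_contrast` with a random amount (hypothetical helper)."""
    return adjust_contrast(image, rng.uniform(-max_amount, max_amount))


rng = np.random.default_rng(seed=0)
image = np.array([[100, 150], [120, 130]], dtype=np.uint8)
result = random_contrast(image, rng)
```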

Noise injection

The last fairly common image augmentation technique is noise injection. In practice, we simply add a matrix of the same size as our input. This matrix is composed of elements following a random distribution. Noise injection can be done from any random distribution. In practice, we mostly see two of them, but feel free to go further.

Uniform

Gaussian
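The two variants above can be sketched as follows for 8-bit images. The name `add_noise` and the meaning given to `scale` (standard deviation for Gaussian noise, half-width for uniform noise) are assumptions of this example.

```python
import numpy as np


def add_noise(image, rng, kind="gaussian", scale=10.0):
    """Add a random matrix of the same shape as the uint8 input (hypothetical helper)."""
    if kind == "gaussian":
        noise = rng.normal(0.0, scale, size=image.shape)
    elif kind == "uniform":
        noise = rng.uniform(-scale, scale, size=image.shape)
    else:
        raise ValueError(f"unknown noise kind: {kind}")
    noisy = image.astype(np.float64) + noise
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)


rng = np.random.default_rng(seed=0)
image = np.full((16, 16, 3), 128, dtype=np.uint8)
noisy_gaussian = add_noise(image, rng, kind="gaussian")
noisy_uniform = add_noise(image, rng, kind="uniform")
```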

Vignetting

Next, a technique that is much less used but not useless. Some cameras have a vignetting effect, so it is interesting to think about how we can augment our images by randomly imitating this phenomenon. We will also try to give flexibility to the user: we will be able to set the minimum distance from which the effect can randomly start, its intensity, and even whether the effect fades toward black or toward white.
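A minimal sketch of such a vignetting effect for 8-bit images is given below. The names, the linear ramp, and the default ranges are assumptions. Distances are normalized so that 0 is the image center and 1 is a corner.

```python
import numpy as np


def vignette(image, start=0.5, strength=0.8, to_white=False):
    """Fade a uint8 image toward black or white away from the center."""
    height, width = image.shape[:2]
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    # Normalized distance to the center: 0 at the center, 1 at the corners.
    dist = np.sqrt(((ys - cy) / max(cy, 1.0)) ** 2
                   + ((xs - cx) / max(cx, 1.0)) ** 2) / np.sqrt(2.0)
    # Linear ramp that starts at `start` and reaches `strength` at the corners.
    ramp = np.clip((dist - start) / max(1.0 - start, 1e-6), 0.0, 1.0) * strength
    if image.ndim == 3:
        ramp = ramp[..., None]
    goal = 255.0 if to_white else 0.0
    out = image * (1.0 - ramp) + goal * ramp
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


def random_vignette(image, rng, min_start=0.2, strength_range=(0.2, 0.8),
                    to_white=False):
    """Draw a random start distance and intensity (assumed ranges)."""
    start = rng.uniform(min_start, 1.0)
    strength = rng.uniform(*strength_range)
    return vignette(image, start, strength, to_white)


rng = np.random.default_rng(seed=0)
image = np.full((9, 9, 3), 128, dtype=np.uint8)
result = random_vignette(image, rng)
```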

Lens distortion

And finally, the best for last. I am surprised it is not used more often, but we can mimic the distortion of a camera lens. It is like looking through a round glass: what appears to us is distorted because the lens (the glass) is rounded. So if our images are taken from a camera with a lens, why do we not simulate that distortion? This should be used by default for images. At least I think so.

I thus propose in this last function to randomly simulate lens distortion by playing on the radial coefficients k1, k2, k3 and on the tangential coefficients p1, p2. In this method, the order of the coefficients is as follows: k1, k2, p1, p2, k3. I invite you to have a look at the OpenCV documentation on this subject.
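To show what these coefficients do, here is a NumPy-only sketch of the radial and tangential distortion model, with the coefficients in the order k1, k2, p1, p2, k3. It is an illustration rather than the article's function: the focal length, the principal point at the image center, the nearest-neighbor sampling, and the coefficient ranges are all assumptions. Each output pixel samples the input at the position given by the model.

```python
import numpy as np


def lens_distort(image, k1=0.0, k2=0.0, p1=0.0, p2=0.0, k3=0.0):
    """Resample an image through a radial + tangential distortion model."""
    height, width = image.shape[:2]
    focal = float(max(height, width))  # assumed focal length in pixels
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0  # assumed principal point
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    # Normalized coordinates relative to the principal point.
    x, y = (xs - cx) / focal, (ys - cy) / focal
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    # Back to pixel coordinates, with nearest-neighbor sampling.
    src_x = np.rint(x_d * focal + cx).astype(int)
    src_y = np.rint(y_d * focal + cy).astype(int)
    valid = (src_x >= 0) & (src_x < width) & (src_y >= 0) & (src_y < height)
    out = np.zeros_like(image)  # pixels sampled outside the image stay at 0
    out[valid] = image[src_y[valid], src_x[valid]]
    return out


def random_lens_distort(image, rng, max_k=0.3, max_p=0.01):
    """Draw random coefficients (assumed ranges) and distort the image."""
    k1, k2, k3 = rng.uniform(-max_k, max_k, size=3)
    p1, p2 = rng.uniform(-max_p, max_p, size=2)
    return lens_distort(image, k1, k2, p1, p2, k3)


rng = np.random.default_rng(seed=0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
distorted = random_lens_distort(image, rng)
```

In practice, OpenCV's calibration functions take the same coefficient vector together with a camera matrix and interpolate between pixels, which gives smoother results than this nearest-neighbor version.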

