Image Augmentation Mastering: 15+ Techniques and Useful Functions with Python Codes

Category: IT Technology · Published: 4 years ago


How to use it?

Maybe at this point you do not see how simple the setup is, and yet it is. All we have to do is define a list of the transformations we want to apply to our sample, and that is it. We do not touch anything else afterward. Note that the order of the transformations matters, and choosing it is up to you.
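To make the idea of "a list of transformations" concrete, here is a minimal sketch. The `Compose` class and the two toy transforms are hypothetical names introduced for illustration; they are not the article's own code, and the sample is assumed to be a plain NumPy image.

```python
import numpy as np


class Compose:
    """Apply a list of transformations in order (hypothetical helper)."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, image, rng):
        for transform in self.transforms:
            image = transform(image, rng)
        return image


def random_hflip(image, rng):
    # Flip left-right with probability 0.5.
    return image[:, ::-1] if rng.random() < 0.5 else image


def random_rot90(image, rng):
    # Rotate by a random multiple of 90 degrees in the image plane.
    return np.rot90(image, k=int(rng.integers(0, 4)))


rng = np.random.default_rng(0)
pipeline = Compose([random_hflip, random_rot90])  # the order is part of the setup
sample = np.arange(72, dtype=np.uint8).reshape(4, 6, 3)
augmented = pipeline(sample, rng)
```

Swapping the two entries of the list gives a different pipeline, which is why the order is left to the user.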

We can now dive into the purpose of the article and see the image augmentation techniques.

Flip

The first, and one of the simplest, consists of randomly flipping the image along its horizontal and vertical axes. In other words, there is a 50/50 chance of performing a vertical flip and a 50/50 chance of performing a horizontal flip.
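A possible sketch of this random flip with NumPy slicing is shown here. The function name `random_flip` and the parameter `p` are assumptions made for the example.

```python
import numpy as np


def random_flip(image, rng, p=0.5):
    """Flip vertically and/or horizontally, each with probability p."""
    if rng.random() < p:
        image = image[::-1, :]   # vertical flip (top <-> bottom)
    if rng.random() < p:
        image = image[:, ::-1]   # horizontal flip (left <-> right)
    return image


rng = np.random.default_rng(0)
image = np.arange(12).reshape(3, 4)
flipped = random_flip(image, rng)
```

Because the two draws are separate, the image can end up unchanged, flipped on one axis, or flipped on both.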

Crop

To do image augmentation, it is common to crop the image randomly. In other words, we crop a part of the image of random size at a random location.

The size of the crop can be chosen as a ratio of the image dimensions (height, width). If the maximum proportional size of the crop is not specified, we default to the full size of the image.
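The ratio-based crop described above could be sketched as follows. The names `random_crop`, `min_ratio` and `max_ratio` are hypothetical, and the default `max_ratio=1.0` reflects the "size of the image" default mentioned in the text.

```python
import numpy as np


def random_crop(image, rng, min_ratio=0.5, max_ratio=1.0):
    """Crop a window whose size is a random ratio of (height, width)."""
    h, w = image.shape[:2]
    # Draw the crop size independently for each dimension.
    ch = max(1, int(h * rng.uniform(min_ratio, max_ratio)))
    cw = max(1, int(w * rng.uniform(min_ratio, max_ratio)))
    # Draw the top-left corner so the window stays inside the image.
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    return image[top:top + ch, left:left + cw]
```

For a paired input and target, the same `top`, `left`, `ch` and `cw` would have to be applied to both arrays.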

Kernel filters

General Case

We are going to get into something a little more enjoyable. Filters are great classics, but I think it is important to be able to easily create our own convolution filters. If you do not know how a filter works, I refer you to my article about Conv2d.

So I wanted to make a general function to be able to use our own filters.
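As an illustration of what such a general function might look like, here is a NumPy-only sketch. The name `apply_kernel` is hypothetical, border replication is an assumed choice, and the kernel is applied as a correlation (no kernel flip), which is the usual convention for image filters.

```python
import numpy as np


def apply_kernel(image, kernel):
    """Filter a 2D or (H, W, C) image with a 2D kernel, keeping its size."""
    kernel = np.asarray(kernel, dtype=np.float64)
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    img = image.astype(np.float64)
    squeeze = img.ndim == 2
    if squeeze:
        img = img[:, :, None]
    # Replicate the border so the output has the same size as the input.
    padded = np.pad(img, ((ph, kh - 1 - ph), (pw, kw - 1 - pw), (0, 0)), mode="edge")
    out = np.zeros_like(img)
    h, w = img.shape[:2]
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    out = np.clip(np.rint(out), 0, 255).astype(np.uint8)
    return out[:, :, 0] if squeeze else out
```

Any hand-written kernel (edge detection, emboss, and so on) can then be passed as a nested list or array.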

Sharpen

As far as filters are concerned, it is possible to go even further by choosing a filter upstream and applying it with a random weighting. For example, here is the filter for sharpening our image.

Value of the center from 0 to 65.
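One way to weight a sharpening filter randomly is sketched below. It is an assumption, not necessarily the article's formula: the kernel is the identity plus a random strength times a 4-neighbour Laplacian, so its weights always sum to 1. The name `random_sharpen` and the parameter `max_strength` are hypothetical, and the input is assumed to be grayscale.

```python
import numpy as np


def random_sharpen(image, rng, max_strength=1.0):
    """Sharpen a grayscale image with identity + strength * Laplacian."""
    strength = rng.uniform(0.0, max_strength)
    img = np.pad(image.astype(np.float64), 1, mode="edge")
    center = img[1:-1, 1:-1]
    # Sum of the four direct neighbours (up, down, left, right).
    neighbours = img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
    # Equivalent to the kernel [[0, -s, 0], [-s, 1 + 4s, -s], [0, -s, 0]].
    out = (1 + 4 * strength) * center - strength * neighbours
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

With this parameterization the center weight is `1 + 4 * strength`, so a larger strength means a larger center value and a stronger effect, while flat regions are left untouched.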

Blur

To finish with the filters, the most popular ones are used to randomly blur our image. There are many ways to blur an image. The best known are the average, median, Gaussian, and bilateral filters.

Average blur

Kernel size from 1 to 35.

Concerning the average filter: as its name indicates, it averages the values around a given center. This is done with a kernel, whose size can be specified for more or less blur. To augment our images with an average filter, we just need to filter the input image with a kernel of random size.
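A small sketch of a random average blur, written with NumPy only, follows. The name `random_average_blur` and the default `max_ksize` are hypothetical; restricting the size to odd values is an assumed convenience so that the kernel has a well-defined center, and the input is assumed to be grayscale.

```python
import numpy as np


def random_average_blur(image, rng, max_ksize=7):
    """Blur a grayscale image with a k x k mean filter, k odd and random."""
    # Draw an odd kernel size in {1, 3, ...} not exceeding max_ksize; 1 means no blur.
    k = 2 * int(rng.integers(0, (max_ksize + 1) // 2)) + 1
    pad = k // 2
    img = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Sum the k * k shifted copies of the image, then divide to get the mean.
    for i in range(k):
        for j in range(k):
            out += img[i:i + h, j:j + w]
    return np.rint(out / (k * k)).astype(np.uint8)
```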

Gaussian blur

Kernel size from 1 to 35.

Finally, the Gaussian blur works in the same way as the average blur. Instead of an average filter, it uses a filter whose values follow a Gaussian curve from the center. Note that the kernel dimensions must be odd numbers.
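The Gaussian variant can be sketched in the same spirit. The function names are hypothetical, the input is assumed to be grayscale, and the default sigma uses the rule of thumb that OpenCV documents for `getGaussianKernel` when no sigma is given.

```python
import numpy as np


def gaussian_kernel_1d(ksize, sigma=None):
    """Return a normalized 1D Gaussian kernel of odd length ksize."""
    if ksize % 2 == 0:
        raise ValueError("ksize must be odd")
    if sigma is None:
        # Default sigma derived from the kernel size (assumed rule of thumb).
        sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8
    x = np.arange(ksize) - ksize // 2
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def random_gaussian_blur(image, rng, max_ksize=7):
    """Blur a grayscale image with a Gaussian kernel of random odd size."""
    k = 2 * int(rng.integers(0, (max_ksize + 1) // 2)) + 1
    kernel = gaussian_kernel_1d(k)
    pad = k // 2
    img = np.pad(image.astype(np.float64), pad, mode="edge")
    # The 2D Gaussian is separable: filter along rows, then along columns.
    img = np.apply_along_axis(np.convolve, 1, img, kernel, mode="valid")
    img = np.apply_along_axis(np.convolve, 0, img, kernel, mode="valid")
    return np.rint(img).astype(np.uint8)
```

Drawing the size as `2 * n + 1` is a simple way to respect the odd-size constraint.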

Perspective transformations

By far the most widely used image augmentation technique is perspective transformation: rotation, translation, shearing, and scaling. These transformations can be performed in 3D. Usually they are used only in 2D, which is a pity. Let’s take advantage of everything we have at our disposal, right?

Rotation

Translation

Shearing

Scaling

Combining Everything

I will not spend more time on the 3D transformations of a 2D image because I wrote a whole article about it. So I reused the function we end up with in that article. I invite you to have a look at it if you want to know more about homogeneous coordinates and 3D transformation matrices.

What should be noted is that this function lets us randomly perform transformations according to the 4 proposed matrices. The order matters. Here we have the shearing, then the rotation, then the scaling, and finally the translation. Note that the translation is expressed as a ratio of the image dimensions.

Combining random rotation, translation, shearing, and scaling
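The composition order can be illustrated with a simplified sketch. This is a 2D-only stand-in using 3x3 homogeneous matrices, not the 3D function from the article; all names, parameter ranges, the centering step and the nearest-neighbour warp are assumptions made for the example.

```python
import numpy as np


def random_affine_matrix(h, w, rng, max_angle=30.0, max_shear=0.2,
                         scale_range=(0.8, 1.2), max_translate=0.1):
    """Compose shear, rotation, scale and translation into one 3x3 matrix."""
    angle = np.deg2rad(rng.uniform(-max_angle, max_angle))
    shx, shy = rng.uniform(-max_shear, max_shear, size=2)
    s = rng.uniform(*scale_range)
    # Translation is drawn as a ratio of the image dimensions.
    tx = rng.uniform(-max_translate, max_translate) * w
    ty = rng.uniform(-max_translate, max_translate) * h

    shear = np.array([[1, shx, 0], [shy, 1, 0], [0, 0, 1]], dtype=float)
    c, sn = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -sn, 0], [sn, c, 0], [0, 0, 1]])
    scale = np.diag([s, s, 1.0])
    translation = np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)
    # Move the origin to the image center, transform, then move it back.
    to_center = np.array([[1, 0, -w / 2], [0, 1, -h / 2], [0, 0, 1]])
    from_center = np.array([[1, 0, w / 2], [0, 1, h / 2], [0, 0, 1]])
    # The rightmost matrix is applied first: shear, rotation, scale, translation.
    return from_center @ translation @ scale @ rotation @ shear @ to_center


def warp_nearest(image, matrix, fill=0):
    """Warp an image with a 3x3 matrix using inverse nearest-neighbour mapping."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    # For each output pixel, find where it comes from in the input.
    src = np.linalg.inv(matrix) @ dst
    sx = np.rint(src[0] / src[2]).astype(int)
    sy = np.rint(src[1] / src[2]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.full_like(image, fill).reshape(h * w, -1)
    out[valid] = image.reshape(h * w, -1)[sy[valid] * w + sx[valid]]
    return out.reshape(image.shape)
```

Since matrix products do not commute, changing the order of the four factors changes the result, which is the point made in the text.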

Cutout

Cutout with replacement by 0 on the whole input, cropping the target at the same time

Cutout is pretty intuitive. It involves removing regions of the input image at random. It works in the same way as the cropping we talked about earlier, but instead of returning the selected regions, we delete them. We can therefore once again let the user provide a minimum and maximum size (as a ratio) for the regions to delete and a maximum number of regions, choose whether to cut the same regions from the target, perform the cutout per channel, and choose the replacement value of the deleted regions.

Cutout with replacement by 1, per channel on the input, without cropping the target
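A reduced sketch of cutout with most of these options is shown here. The function and parameter names are hypothetical, the image is assumed to have shape (H, W, C), and the option of cutting the target at the same time is left out to keep the example short.

```python
import numpy as np


def random_cutout(image, rng, max_regions=3, min_ratio=0.1, max_ratio=0.3,
                  value=0, per_channel=False):
    """Replace up to max_regions random rectangles of an (H, W, C) image by value."""
    out = image.copy()
    h, w, c = out.shape
    for _ in range(int(rng.integers(1, max_regions + 1))):
        # Region size as a random ratio of the image dimensions.
        rh = max(1, int(h * rng.uniform(min_ratio, max_ratio)))
        rw = max(1, int(w * rng.uniform(min_ratio, max_ratio)))
        top = int(rng.integers(0, h - rh + 1))
        left = int(rng.integers(0, w - rw + 1))
        if per_channel:
            # Erase the rectangle in a single, randomly chosen channel.
            out[top:top + rh, left:left + rw, int(rng.integers(0, c))] = value
        else:
            out[top:top + rh, left:left + rw, :] = value
    return out
```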

Color Spaces

Now we get to the part I find the most fun, and one that is very rarely taken into account. If we know the color spaces, we can take advantage of their properties to augment our images. To give a simple example, with the HSV color space we can extract the leaf thanks to its color and change that color randomly as we wish. That is a very cool thing to do! And it shows the value of having our own image augmentation functions. Of course, this requires a little more creativity. So it is important to know our color spaces to make the most of them, particularly since they can be crucial in preprocessing for our (deep) machine learning models.
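The leaf example could look roughly like the sketch below, which uses the standard library `colorsys` module pixel by pixel. It is slow and only meant to show the idea; the function name, the hue band for green and the saturation threshold are assumptions.

```python
import colorsys

import numpy as np


def shift_hue_in_range(image, hue_range, shift):
    """Shift the hue of RGB pixels whose hue lies in hue_range (all in [0, 1])."""
    lo, hi = hue_range
    out = image.copy()
    h, w, _ = image.shape
    for y in range(h):
        for x in range(w):
            r, g, b = image[y, x] / 255.0
            hue, sat, val = colorsys.rgb_to_hsv(r, g, b)
            # Only recolor saturated pixels inside the selected hue band.
            if lo <= hue <= hi and sat > 0.2:
                rgb = colorsys.hsv_to_rgb((hue + shift) % 1.0, sat, val)
                out[y, x] = np.rint(np.array(rgb) * 255).astype(np.uint8)
    return out


rng = np.random.default_rng(0)
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[0, 0] = (0, 255, 0)   # pure green, hue = 1/3
image[1, 1] = (255, 0, 0)   # pure red, hue = 0
# Assume greens sit roughly between hue 0.2 and 0.45, and draw a random shift.
recolored = shift_hue_in_range(image, (0.2, 0.45), rng.uniform(0.0, 1.0))
```

Only the green pixel is recolored; the red and black pixels fall outside the mask and are left as they are.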

Brightness

Brightness from -100 to 100

Let’s stay on our colors a little longer. A great classic in image augmentation is to play with brightness. There are several ways to do so; the simplest is to add a random bias.
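Adding a random bias can be sketched in a few lines. The name `random_brightness` and the parameter `max_bias` are hypothetical, and the image is assumed to be 8-bit.

```python
import numpy as np


def random_brightness(image, rng, max_bias=100):
    """Add one random bias in [-max_bias, max_bias] to every pixel."""
    bias = int(rng.integers(-max_bias, max_bias + 1))
    # Work in a wider integer type so the sum cannot wrap around.
    shifted = image.astype(np.int16) + bias
    return np.clip(shifted, 0, 255).astype(np.uint8)
```

The cast to a wider type matters: adding directly to a `uint8` array would wrap around instead of saturating.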

Contrasts

Contrasts from -100 to 100

In the same way, it is very simple to play with contrast. This can also be done randomly.
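One simple way to do this is sketched below. It is an assumed formulation, not necessarily the one behind the figure: the value in [-100, 100] is mapped to a gain applied around mid-gray. The names are hypothetical and the image is assumed to be 8-bit.

```python
import numpy as np


def random_contrast(image, rng, max_change=100):
    """Scale pixel values around mid-gray by a random contrast factor."""
    change = rng.uniform(-max_change, max_change)
    # Map [-100, 100] to a gain in [0, 2]: 0 flattens to gray, 2 doubles contrast.
    gain = 1.0 + change / 100.0
    out = (image.astype(np.float64) - 128.0) * gain + 128.0
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```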

Noise injection

The last fairly common image augmentation technique is noise injection. In practice, we simply add a matrix of the same size as our input, whose elements follow a random distribution. Noise injection can be done from any random distribution, but in practice we mostly see two of them. Feel free to go further.
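Both common variants can be sketched with NumPy's random generator. The function names and the default amplitude and sigma are assumptions chosen for the example, and the image is assumed to be 8-bit.

```python
import numpy as np


def add_uniform_noise(image, rng, amplitude=20.0):
    """Add noise drawn uniformly from [-amplitude, amplitude) to each element."""
    noise = rng.uniform(-amplitude, amplitude, size=image.shape)
    return np.clip(np.rint(image + noise), 0, 255).astype(np.uint8)


def add_gaussian_noise(image, rng, sigma=10.0):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    noise = rng.normal(0.0, sigma, size=image.shape)
    return np.clip(np.rint(image + noise), 0, 255).astype(np.uint8)
```

Any other distribution offered by the generator could be swapped in the same way.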

Uniform

Gaussian

Vignetting

Finally, a technique that is much less used but not useless. Some cameras have a vignetting effect. It is interesting to think about how we can augment our images by randomly imitating this phenomenon. We will again try to give flexibility to the user: we can decide the minimum distance from which the effect can randomly start, decide its intensity, and even decide whether the effect goes toward black or toward white.
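A possible sketch of such a vignetting function follows. The linear falloff, the parameter names and their defaults are assumptions made for illustration, and the image is assumed to have shape (H, W, C) with `max_start` below 1.

```python
import numpy as np


def random_vignetting(image, rng, min_start=0.2, max_start=0.8,
                      max_strength=1.0, to_white=False):
    """Fade an (H, W, C) image toward black or white away from its center."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Distance from the center, normalized so the corners are at 1.
    cy, cx = (h - 1) / 2, (w - 1) / 2
    dist = np.hypot((ys - cy) / max(cy, 1), (xs - cx) / max(cx, 1)) / np.sqrt(2)
    start = rng.uniform(min_start, max_start)
    strength = rng.uniform(0.0, max_strength)
    # Mask is 0 inside the start radius and grows linearly to strength at the corners.
    mask = strength * np.clip((dist - start) / (1 - start), 0, 1)[..., None]
    target = 255.0 if to_white else 0.0
    out = image * (1 - mask) + target * mask
    return np.rint(out).astype(np.uint8)
```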

Lens distortion

And finally, the best for last. I am surprised it is not used more often, since it can mimic the distortion of a camera lens. It is like looking through a round glass: what appears to us is distorted because the lens (the glass) is rounded. So if our images are taken by a camera with a lens, why not simulate that distortion? This should be used by default for images. At least I think so.

I thus propose, in this last function, to randomly simulate lens distortion by playing on the radial coefficients k1, k2, k3 and on the tangential coefficients p1, p2. In this method, the order of the coefficients is as follows: k1, k2, p1, p2, k3. I invite you to have a look at the OpenCV documentation on this subject.
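To show what these coefficients do, here is a NumPy-only sketch of the radial and tangential model applied to pixel coordinates. It is not the article's function: the names, the coefficient ranges, the assumed focal length and principal point, and the nearest-neighbour sampling are all choices made for the example.

```python
import numpy as np


def distort_points(x, y, coeffs):
    """Apply the radial/tangential model to normalized coordinates."""
    k1, k2, p1, p2, k3 = coeffs  # same ordering as in the text
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    xd = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    yd = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return xd, yd


def random_lens_distortion(image, rng, max_k=0.3, max_p=0.01):
    """Resample an image through a randomly drawn lens distortion model."""
    h, w = image.shape[:2]
    k1, k2, k3 = rng.uniform(-max_k, max_k, size=3)
    p1, p2 = rng.uniform(-max_p, max_p, size=2)
    f = float(max(h, w))               # assumed focal length in pixels
    cx, cy = (w - 1) / 2, (h - 1) / 2  # assumed principal point: image center
    ys, xs = np.mgrid[0:h, 0:w]
    xd, yd = distort_points((xs - cx) / f, (ys - cy) / f, (k1, k2, p1, p2, k3))
    # For every output pixel, sample the input at the distorted location.
    sx = np.rint(xd * f + cx).astype(int)
    sy = np.rint(yd * f + cy).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(image)
    out[valid] = image[sy[valid], sx[valid]]
    return out
```

Pixels whose source location falls outside the image are left at 0 in this sketch; interpolation and border handling are where a library implementation would do better.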
