Balancing the Regularization Effect of Data Augmentation




A look into the need to balance overfitting and underfitting with data augmentation, using an application of image segmentation on satellite images to identify water bodies.

The Effect of Data Augmentation

When training neural networks, data augmentation is one of the most commonly used pre-processing techniques. The word “augmentation”, which literally means “the action or process of making or becoming greater in size or amount”, summarizes the outcome of this technique. But another important effect is that it increases, or augments, the diversity of the data. This increased diversity means that at each training stage the model comes across a different version of the original data.


Why do we need this ‘increased diversity’ in data? The answer lies in a core tenet of machine learning: the bias-variance tradeoff. More complex models like deep neural networks have low bias but suffer from high variance. This means these models overfit the training data and perform poorly on test data, or any data they haven’t seen before, which leads to higher prediction errors. The increased diversity from data augmentation thus reduces the variance of the model by making it better at generalizing.
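For intuition, here is the standard decomposition of the expected squared error at a test point x, where f is the true function, f̂ the trained model, and σ² the irreducible noise; data augmentation targets the middle (variance) term:

```latex
% Bias-variance decomposition of expected squared error (standard result)
\mathbb{E}\!\left[(y - \hat f(x))^2\right]
  = \underbrace{\left(\mathbb{E}[\hat f(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\left(\hat f(x) - \mathbb{E}[\hat f(x)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```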

For images, some common methods of data augmentation are taking cropped portions, zooming in/out, rotating about an axis, vertical/horizontal flips, and adjusting the brightness and shear intensity. Data augmentation for audio data involves adding noise and changing the speed and pitch.
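As a concrete illustration, here is a minimal sketch of these image transformations expressed with Keras’ ImageDataGenerator (the tool used later in this post); the parameter values are illustrative, not the settings used in the experiments below.

```python
# A minimal sketch of the image augmentations listed above.
# The parameter values are illustrative, not the experimental settings.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rescale=1.0 / 255,            # scale pixel values into [0, 1]
    rotation_range=20,            # random rotations in [-20, 20] degrees
    horizontal_flip=True,         # random horizontal flips
    vertical_flip=True,           # random vertical flips
    zoom_range=0.2,               # random zoom in/out by up to 20%
    shear_range=20,               # shear intensity, in degrees
    brightness_range=(0.8, 1.2),  # random brightness adjustment
)
```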

While data augmentation prevents the model from overfitting, some augmentation combinations can actually lead to underfitting. This slows down training, which puts a huge strain on resources like available processing time, GPU quotas, etc. Moreover, the model cannot learn enough information to give accurate predictions, which again leads to high prediction errors. In this blog post, we take the example of semantic segmentation on satellite images to see the impact of different combinations of data augmentations on training.

About the Data Set

This Kaggle data set provides satellite images from Sentinel-2 and their corresponding masks, which segment the water bodies. The masks were computed using the Normalized Difference Water Index (NDWI). Of the 2841 images in the data set, 2560 were extracted for the training set, 256 for the validation set, and 25 for the test set. The entire analysis and modeling was done on Google Colab with GPU support.
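For illustration, a minimal sketch of such a split is shown below; the directory name and the shuffling seed are assumptions, not taken from the original analysis.

```python
# A minimal sketch of the 2560/256/25 split described above.
import os
import random

image_dir = "water_bodies/Images"  # assumed layout of the Kaggle data set
filenames = sorted(os.listdir(image_dir))

random.seed(42)  # assumed seed, for a reproducible split
random.shuffle(filenames)

train_files = filenames[:2560]     # 2560 training images
val_files = filenames[2560:2816]   # 256 validation images
test_files = filenames[2816:2841]  # 25 test images
```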

Structure of U-NET

Simply put, a U-NET is an autoencoder with residual or skip connections from each convolutional block in the encoder to its counterpart in the decoder. This results in a symmetric ‘U’-like structure. This article gives a comprehensive line-by-line explanation of the structure of a U-NET from the original paper.

We use a slightly modified version of the U-NET as shown below.

Snapshot of a block of the U-NET used (By Author)
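To make the structure concrete, here is a minimal sketch of one encoder stage and one decoder stage with a skip connection in Keras; the filter counts and layer choices are assumptions for illustration, since the exact modified architecture is the one in the snapshot above.

```python
# A minimal sketch of one encoder and one decoder stage of a U-NET.
from tensorflow.keras import layers

def conv_block(x, filters):
    """Two 3x3 convolutions with ReLU, as in the original U-NET."""
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def encoder_stage(x, filters):
    """Conv block followed by 2x2 max pooling; the pre-pool tensor is
    kept so it can be passed across the 'U' as a skip connection."""
    skip = conv_block(x, filters)
    down = layers.MaxPooling2D(pool_size=2)(skip)
    return skip, down

def decoder_stage(x, skip, filters):
    """Upsample, concatenate the matching encoder tensor, then convolve."""
    x = layers.Conv2DTranspose(filters, 2, strides=2, padding="same")(x)
    x = layers.Concatenate()([x, skip])  # the skip connection
    return conv_block(x, filters)
```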

A Look at Different Cases of Data Augmentation

We explore five different cases of data augmentation with the help of the Keras ImageDataGenerator. We want to see how augmentation can lead to overfitting or underfitting during training. To compare the five cases, accuracy and loss during training and validation were used, with binary cross-entropy as the loss function.

When dealing with semantic segmentation, an important point to remember is to apply the same augmentations to the images and their corresponding masks!

In all five cases, the pixel values of the images and masks were rescaled by a factor of 1/255. All images and their masks in the validation and test sets were also rescaled.
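A common way to keep image and mask augmentations in sync with ImageDataGenerator is to create two identically configured generators and pass them the same random seed, so every image receives exactly the same transformation as its mask. Below is a minimal sketch of this pattern; `x_train`, `y_train` (NumPy arrays of images and masks), and the augmentation values are assumptions for illustration.

```python
# Two identically configured generators sharing a seed, so each image
# and its mask receive the same random transformation.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

aug_args = dict(rescale=1.0 / 255, horizontal_flip=True, vertical_flip=True)
image_gen = ImageDataGenerator(**aug_args)
mask_gen = ImageDataGenerator(**aug_args)

seed = 1  # shared seed keeps the two streams of transforms in sync
image_flow = image_gen.flow(x_train, batch_size=16, seed=seed)
mask_flow = mask_gen.flow(y_train, batch_size=16, seed=seed)

# Yields (image_batch, mask_batch) pairs suitable for model.fit().
train_flow = zip(image_flow, mask_flow)
```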

Case 1: This was the base case. Only the pixel values of the images and their masks were rescaled; no augmentations were applied. This case produced the training set with the least variance.

Case 2: In addition to rescaling, the images and their masks were randomly flipped vertically or horizontally.

Case 3: For this case, rescaling, random vertical or horizontal flips, and random rotations between [-20, 20] degrees were applied to the images and their corresponding masks.

Case 4: The images and their corresponding masks were randomly shifted along the width and height by a factor of 0.3.

Case 5: Shear transformations were randomly applied to the images and their corresponding masks using a factor of 20. They were also randomly zoomed within the range [0.2, 0.5].
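The five cases map naturally onto ImageDataGenerator arguments. The sketch below is one plausible reading of the descriptions above, not the exact configuration from the original experiments; each dict would be used to build a pair of same-seed generators as shown earlier.

```python
# The five augmentation cases expressed as ImageDataGenerator arguments.
rescale = dict(rescale=1.0 / 255)

cases = {
    1: dict(**rescale),  # base case: rescaling only
    2: dict(**rescale, horizontal_flip=True, vertical_flip=True),
    3: dict(**rescale, horizontal_flip=True, vertical_flip=True,
            rotation_range=20),  # rotations in [-20, 20] degrees
    4: dict(**rescale, width_shift_range=0.3, height_shift_range=0.3),
    5: dict(**rescale, shear_range=20, zoom_range=[0.2, 0.5]),
}
```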

Different types of augmentations on the same image and its mask (By Author)

A Comparison of Results

Across all five cases, the model was trained for 250 epochs with a batch size of 16. The Adam optimizer was used with a learning rate of 0.00001, a beta 1 of 0.99, and a beta 2 of 0.99. Comparing the training and validation accuracy and loss curves, we see that each case of data augmentation gave a different performance for the same model, trained for the same number of epochs with the same initial state of the optimizer.
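As a sketch, the training setup just described could look as follows in Keras; `model` is the U-NET from earlier and `train_flow` / `val_flow` are the paired image/mask generators, with these names assumed for illustration.

```python
# A minimal sketch of the training setup described above.
from tensorflow.keras.optimizers import Adam

model.compile(
    optimizer=Adam(learning_rate=0.00001, beta_1=0.99, beta_2=0.99),
    loss="binary_crossentropy",  # masks are binary: water vs. background
    metrics=["accuracy"],
)

history = model.fit(
    train_flow,
    steps_per_epoch=2560 // 16,  # 2560 training images, batch size 16
    validation_data=val_flow,
    validation_steps=256 // 16,  # 256 validation images
    epochs=250,
)
```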

