Balancing the Regularization Effect of Data Augmentation

Category: IT Technology · Published: 4 years ago

A look into the need to balance overfitting and underfitting with data augmentation, using an application of image segmentation on satellite images to identify water bodies.

The Effect of Data Augmentation

When training neural networks, data augmentation is one of the most commonly used pre-processing techniques. The word “augmentation”, which literally means “the action or process of making or becoming greater in size or amount”, summarizes the outcome of this technique. Another important effect is that it increases, or augments, the diversity of the data. This increased diversity means that at each training stage the model comes across a different version of the original data.

By Author

Why do we need this ‘increased diversity’ in data? The answer lies in a core tenet of machine learning: the bias-variance tradeoff. More complex models like deep neural networks have low bias but suffer from high variance. This implies that these models overfit the training data and perform poorly on test data, that is, data they haven’t seen before. This leads to higher prediction errors. The increased diversity from data augmentation reduces the variance of the model by making it better at generalizing.
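For reference, the tradeoff mentioned above is usually written as a decomposition of the expected squared error. The notation here is an assumption introduced for illustration and does not come from the original post: the data follow y = f(x) + ε with zero-mean noise of variance σ², and f̂ is the model fitted on a random training set.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Read this way, augmentation targets the variance term, while overly aggressive augmentation risks increasing the bias term, which is the underfitting side of the balance.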

For images, some common methods of data augmentation are taking cropped portions, zooming in/out, rotating about an axis, vertical/horizontal flips, and adjusting the brightness and shear intensity. Data augmentation for audio data involves adding noise and changing the speed and pitch.

While data augmentation prevents the model from overfitting, some augmentation combinations can actually lead to underfitting. This slows down training, which puts a strain on resources such as available processing time and GPU quotas. Moreover, the model cannot learn enough from the data to give accurate predictions, which again leads to high prediction errors. In this blog post, we take the example of semantic segmentation on satellite images to see the impact of different combinations of data augmentations on training.

About the Data set

This Kaggle data set provides satellite images from Sentinel-2 and their corresponding masks, which segment the water bodies. The masks were calculated using the Normalized Difference Water Index (NDWI). Out of a total of 2841 images in the data set, 2560 were used for the train set, 256 for the validation set and 25 for the test set. The entire analysis and modeling was done on Google Colab with GPU support.
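NDWI is commonly computed per pixel from the green and near-infrared reflectances as (Green − NIR) / (Green + NIR), with positive values suggesting water. The sketch below only illustrates that formula. The post does not say which bands or threshold were used to build the masks, so the inputs, the `threshold` of 0 and the function names `ndwi` and `water_mask` are assumptions, and plain nested lists stand in for image arrays.

```python
def ndwi(green, nir):
    """NDWI for one pixel: (green - nir) / (green + nir)."""
    total = green + nir
    if total == 0:
        return 0.0  # assumed convention to avoid division by zero
    return (green - nir) / total


def water_mask(green_band, nir_band, threshold=0.0):
    """Binary mask (1 = water) from two equally sized 2-D bands (lists of rows)."""
    return [
        [1 if ndwi(g, n) > threshold else 0 for g, n in zip(g_row, n_row)]
        for g_row, n_row in zip(green_band, nir_band)
    ]


# Toy 2x2 reflectance values, made up for illustration only.
green = [[0.30, 0.10], [0.25, 0.05]]
nir = [[0.10, 0.40], [0.05, 0.30]]
mask = water_mask(green, nir)
print(mask)
```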

Structure of U-NET

Simply put, a U-NET is an autoencoder with skip connections from each convolutional block in the encoder to its counterpart in the decoder. This results in a symmetric ‘U’-like structure. This article gives a comprehensive line-by-line explanation of the structure of a U-NET from the original paper.
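To make the ‘U’ shape concrete, the following sketch traces tensor shapes through a toy U-NET. It is not the modified network used in this post: the function name `unet_shapes`, the depth of 3, the 16 base filters and the rule of doubling filters at each level are all assumptions chosen for illustration. It shows how each decoder level doubles the spatial size and concatenates the matching encoder output along the channel axis.

```python
def unet_shapes(height, width, base_filters=16, depth=3):
    """Trace (height, width, channels) through a toy U-NET with skip connections."""
    if height % 2 ** depth or width % 2 ** depth:
        raise ValueError("height and width must be divisible by 2**depth")

    shapes, skips = [], []
    h, w = height, width

    # Encoder: each level keeps a copy for the skip connection, then halves h and w.
    for level in range(depth):
        c = base_filters * 2 ** level
        skips.append((h, w, c))
        shapes.append(("encoder", level, (h, w, c)))
        h, w = h // 2, w // 2

    # Bottleneck at the bottom of the 'U'.
    shapes.append(("bottleneck", depth, (h, w, base_filters * 2 ** depth)))

    # Decoder: upsample, then concatenate the matching encoder output on channels.
    for level in reversed(range(depth)):
        h, w = h * 2, w * 2
        skip_h, skip_w, skip_c = skips[level]
        upsampled_c = base_filters * 2 ** level
        shapes.append(("decoder_concat", level, (skip_h, skip_w, upsampled_c + skip_c)))

    return shapes


for stage, level, shape in unet_shapes(128, 128):
    print(stage, level, shape)
```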

We use a slightly modified version of the U-NET as shown below.

Snapshot of a block of the UNET used (By Author)

A Look at Different Cases of Data Augmentation

We explore 5 different cases of data augmentation with the help of the Keras ImageDataGenerator. We want to see how augmentation can lead to overfitting or underfitting during training. To compare the 5 cases, we used accuracy and loss during training and validation, with binary cross-entropy as the loss function.
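For readers who want to see what these two metrics measure per pixel, here is a minimal sketch in plain Python. It is not the Keras implementation: the function names, the clipping constant `eps` and the 0.5 accuracy threshold are assumptions, and the labels and predictions are made-up values for a flattened four-pixel mask.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over flat lists of labels (0/1) and probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log() stays finite
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def pixel_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of pixels whose thresholded prediction matches the label."""
    hits = sum(1 for y, p in zip(y_true, y_pred) if (p > threshold) == (y == 1))
    return hits / len(y_true)


y_true = [1, 0, 1, 0]          # water / not water
y_pred = [0.9, 0.2, 0.4, 0.1]  # predicted water probabilities
loss = binary_cross_entropy(y_true, y_pred)
acc = pixel_accuracy(y_true, y_pred)
print(loss, acc)
```

The third pixel is a confident miss, so it dominates the loss even though three of four pixels are classified correctly. This is why loss and accuracy curves can tell slightly different stories during training.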

When dealing with semantic segmentation, an important point to remember is to apply the same augmentations to the images and their corresponding masks!
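One common way to keep images and masks aligned is to drive both augmentation pipelines from the same random seed, so that every random decision is repeated for the mask. The sketch below illustrates the idea with random flips on tiny nested lists. The helper `random_flip` and the toy data are hypothetical, and this is not necessarily how the pipeline in this post was wired up.

```python
import random


def random_flip(grid, rng):
    """Randomly flip a 2-D grid (list of rows) horizontally and/or vertically."""
    if rng.random() < 0.5:
        grid = [row[::-1] for row in grid]  # horizontal flip
    if rng.random() < 0.5:
        grid = grid[::-1]                   # vertical flip
    return grid


image = [[10, 20, 30], [40, 50, 60]]
mask = [[0, 0, 1], [0, 1, 1]]

# Two generators built from the same seed make identical random choices,
# so the mask receives exactly the flips that the image received.
seed = 42
aug_image = random_flip(image, random.Random(seed))
aug_mask = random_flip(mask, random.Random(seed))
print(aug_image)
print(aug_mask)
```

If the two seeds differed, a flipped image could be paired with an unflipped mask, and the model would be trained on labels that no longer match the pixels.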

In all 5 cases, the pixel values of the images and masks were rescaled by a factor of 1/255. All images and their masks in the validation and test set were also rescaled.

Case 1: This was the base case. Only the pixel values of the images and their masks were rescaled. No augmentations were applied. This case produced the training set with the least variance.

Case 2: In addition to rescaling, the images and their masks were randomly flipped vertically or horizontally.

Case 3: For this case, rescaling, random vertical or horizontal flips and random rotations in the range [-20, 20] degrees were applied to the images and their corresponding masks.

Case 4: The images and their corresponding masks were randomly shifted along the width and height by a factor of 0.3.

Case 5: Shear transformations were randomly applied to the images and their corresponding masks using a factor of 20. The images and masks were also randomly zoomed in within the range [0.2, 0.5].
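The five cases above can be summarized as keyword arguments using the parameter names of the Keras ImageDataGenerator. This is a reconstruction from the descriptions rather than the original code. In particular, it assumes that Cases 4 and 5 apply only the listed transformations on top of rescaling, and the dictionary name `augmentation_cases` is made up for this sketch.

```python
RESCALE = 1.0 / 255

# Keyword arguments per case, using ImageDataGenerator parameter names.
augmentation_cases = {
    "case_1": dict(rescale=RESCALE),
    "case_2": dict(rescale=RESCALE, horizontal_flip=True, vertical_flip=True),
    "case_3": dict(rescale=RESCALE, horizontal_flip=True, vertical_flip=True,
                   rotation_range=20),
    "case_4": dict(rescale=RESCALE, width_shift_range=0.3, height_shift_range=0.3),
    "case_5": dict(rescale=RESCALE, shear_range=20, zoom_range=[0.2, 0.5]),
}

# Validation and test data are only rescaled, never augmented.
eval_kwargs = dict(rescale=RESCALE)

for name, kwargs in augmentation_cases.items():
    extras = sorted(k for k in kwargs if k != "rescale")
    print(name, extras or "rescale only")
```

With TensorFlow installed, each dictionary could be passed as `ImageDataGenerator(**kwargs)`, once for the images and once for the masks, with the same `seed` given to both `flow` calls so the pairs stay aligned.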

Different types of augmentations on the same image and its mask (By Author)

A Comparison of Results

Across all 5 cases, the model was trained for 250 epochs with a batch size of 16. The Adam optimizer was used with a learning rate of 0.00001, beta 1 of 0.99 and beta 2 of 0.99. In the graphs below, we see that each case of data augmentation gave a different performance for the same model, trained for the same number of epochs with the same initial state of the optimizer.
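To give a feel for these settings, here is a minimal sketch of the standard Adam update for a single scalar parameter, using the learning rate and beta values quoted above. The toy loss, the starting value and `eps=1e-7` are assumptions. The step count at the end additionally assumes that every epoch passes once over all 2560 training images.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-5, beta_1=0.99, beta_2=0.99, eps=1e-7):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta_1 * m + (1 - beta_1) * grad       # running mean of gradients
    v = beta_2 * v + (1 - beta_2) * grad ** 2  # running mean of squared gradients
    m_hat = m / (1 - beta_1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta_2 ** t)              # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


theta, m, v = 0.5, 0.0, 0.0
for t in range(1, 4):
    grad = 2.0 * theta  # gradient of the toy loss theta**2
    theta, m, v = adam_step(theta, grad, m, v, t)
print(theta)

# Number of optimizer updates implied by the training setup.
steps_per_epoch = 2560 // 16
total_updates = steps_per_epoch * 250
print(steps_per_epoch, total_updates)
```

Each early update moves the parameter by roughly the learning rate, so with a value as small as 0.00001 the model changes slowly and needs many epochs to show clear differences between augmentation cases.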

