Balancing the Regularization Effect of Data Augmentation

Category: IT Technology · Published: 4 years ago

A look at the need to balance overfitting and underfitting when using data augmentation, illustrated by image segmentation on satellite images to identify water bodies.

The Effect of Data Augmentation

When training neural networks, data augmentation is one of the most commonly used pre-processing techniques. The word “augmentation”, which literally means “the action or process of making or becoming greater in size or amount”, summarizes the outcome of this technique. Another important effect is that it increases, or augments, the diversity of the data. Because of this increased diversity, the model comes across a different version of the original data at each training stage.

Why do we need this ‘increased diversity’ in data? The answer lies in a core tenet of machine learning: the bias-variance tradeoff. More complex models, such as deep neural networks, have low bias but suffer from high variance. This means that these models tend to overfit the training data and perform poorly on test data, that is, data they haven’t seen before, which leads to higher prediction errors. The increased diversity from data augmentation reduces the variance of the model by making it better at generalizing.
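As a reminder, the tradeoff is often written as a decomposition of the expected squared error. The form below is the standard textbook one, not something taken from this post's experiments; the symbols are chosen here for illustration, with f the true function, \hat{f} the trained model, and \sigma^2 the variance of the noise in y = f(x) + \varepsilon.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Data augmentation targets the middle term: a model that sees more varied inputs changes less when the training sample changes.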

For images, some common methods of data augmentation are taking cropped portions, zooming in or out, rotating the image, flipping it vertically or horizontally, and adjusting the brightness and shear intensity. For audio data, augmentation involves adding noise and changing the speed and pitch.

While data augmentation helps prevent the model from overfitting, some augmentation combinations can actually lead to underfitting. This slows down training, which puts a heavy strain on resources such as available processing time and GPU quotas. Moreover, the model cannot learn enough information to give accurate predictions, which again leads to high prediction errors. In this blog post, we use the example of semantic segmentation on satellite images to see the impact of different combinations of data augmentation on training.

About the Data set

This Kaggle data set provides satellite images from Sentinel-2 and their corresponding masks, which segment the water bodies. The masks were calculated using the Normalized Difference Water Index (NDWI). Out of a total of 2841 images in the data set, 2560 were used for the train set, 256 for the validation set, and 25 for the test set. The entire analysis and modeling was done on Google Colab with GPU support.
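To make the mask construction more concrete, here is a minimal sketch of an NDWI-based water mask. It assumes the common green/near-infrared definition, NDWI = (Green - NIR) / (Green + NIR), and a threshold of 0. The post does not state which NDWI variant or threshold the data set uses, so both are assumptions, and the function names and tiny 2x2 bands are made up for illustration.

```python
def ndwi(green, nir):
    """Return the NDWI of one pixel from green and near-infrared reflectance."""
    total = green + nir
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (green - nir) / total


def water_mask(green_band, nir_band, threshold=0.0):
    """Build a binary mask (1 = water) from two bands given as nested lists.

    The default threshold of 0.0 is an assumption for illustration only.
    """
    return [
        [1 if ndwi(g, n) > threshold else 0 for g, n in zip(g_row, n_row)]
        for g_row, n_row in zip(green_band, nir_band)
    ]


# Hypothetical 2x2 reflectance values: the left column behaves like water
# (green brighter than NIR), the right column like land or vegetation.
green = [[0.30, 0.10], [0.25, 0.05]]
nir = [[0.10, 0.40], [0.05, 0.30]]
mask = water_mask(green, nir)
print(mask)
```

Water reflects more green light than near-infrared light, so water pixels tend to have a positive index and land pixels a negative one.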

Structure of U-NET

Simply put, a U-NET is an autoencoder with skip connections from each convolutional block in the encoder to its counterpart in the decoder. This results in a symmetric, ‘U’-shaped structure. This article gives a comprehensive line-by-line explanation of the structure of a U-NET from the original paper.
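The ‘U’ shape can be seen by tracing feature-map shapes through such a network. The sketch below is not the model used in this post. It assumes each encoder block halves the spatial size and doubles the channels, and each decoder step doubles the spatial size, halves the channels, and concatenates the matching encoder output. The function name, depth, and channel counts are hypothetical.

```python
def unet_shapes(height, width, base_channels=16, depth=3):
    """Trace (H, W, C) feature-map shapes through a U-shaped encoder/decoder.

    height and width are assumed to be divisible by 2 ** depth.
    """
    skips = []
    trace = []
    h, w, c = height, width, base_channels
    # Encoder: keep each block's output for the skip connection, then downsample.
    for _ in range(depth):
        skips.append((h, w, c))
        trace.append(("encoder", (h, w, c)))
        h, w, c = h // 2, w // 2, c * 2
    trace.append(("bottleneck", (h, w, c)))
    # Decoder: upsample, then concatenate the matching encoder output.
    for skip_h, skip_w, skip_c in reversed(skips):
        h, w = h * 2, w * 2
        assert (h, w) == (skip_h, skip_w), "skip and decoder sizes must match"
        concat_c = c // 2 + skip_c  # upsampled channels + skip channels
        trace.append(("decoder_concat", (h, w, concat_c)))
        c = skip_c  # later convolutions reduce back to the encoder block's width
    return trace


for stage, shape in unet_shapes(128, 128):
    print(f"{stage:15s} {shape}")
```

Each decoder row has the same height and width as the encoder row it mirrors, which is what lets the skip connection concatenate the two.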

We use a slightly modified version of the U-NET as shown below.

Snapshot of a block of the UNET used (By Author)

A Look at Different Cases of Data Augmentation

We explore 5 different cases of data augmentation with the help of the Keras ImageDataGenerator. We want to see how augmentation can lead to overfitting or underfitting during training. To compare the 5 cases, we use accuracy and loss during training and validation, with binary cross-entropy as the loss function.
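For reference, both metrics can be written out for a handful of pixels. This is a plain-Python sketch, not the Keras implementation. The clipping constant and the 0.5 decision threshold are assumptions, and the labels and probabilities are made up.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over flat lists of labels and probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


def pixel_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of pixels whose thresholded prediction matches the label."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if (p > threshold) == (t == 1))
    return hits / len(y_true)


# Four hypothetical mask pixels (1 = water) and predicted water probabilities.
mask_pixels = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.4, 0.1]
loss = binary_cross_entropy(mask_pixels, predicted)
accuracy = pixel_accuracy(mask_pixels, predicted)
print(f"loss={loss:.4f} accuracy={accuracy:.2f}")
```

The third pixel is the only misclassified one, and it also contributes the largest share of the loss.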

When dealing with semantic segmentation, an important point to remember is to apply the same augmentations to the images and their corresponding masks!
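The point about matching augmentations can be sketched in plain Python: draw the random decisions once and apply them to both arrays. The helper name and the tiny 2x3 example are hypothetical. With Keras generators, a common way to get the same effect is to give the image generator and the mask generator the same seed.

```python
import random


def random_flip_pair(image, mask, rng):
    """Apply the same random flips to an image and its mask (nested lists)."""
    # Draw each decision once so the image and mask cannot drift apart.
    flip_h = rng.random() < 0.5
    flip_v = rng.random() < 0.5

    def apply(grid):
        if flip_h:
            grid = [row[::-1] for row in grid]  # mirror left-right
        if flip_v:
            grid = grid[::-1]  # mirror top-bottom
        return grid

    return apply(image), apply(mask)


# Hypothetical 2x3 image; the mask marks pixels brighter than 5.
image = [[1, 9, 3], [7, 2, 8]]
mask = [[1 if value > 5 else 0 for value in row] for row in image]

aug_image, aug_mask = random_flip_pair(image, mask, random.Random(0))
print(aug_image)
print(aug_mask)
```

If the two arrays were flipped with independent random draws, the mask would label the wrong pixels in roughly half of the samples.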

In all 5 cases, the pixel values of the images and masks were rescaled by a factor of 1/255. All images and their masks in the validation and test sets were also rescaled.

Case 1: This was the base case. Only the pixel values of the images and their masks were rescaled. No augmentations were applied. This case produced a training set with the least variance.

Case 2: In addition to rescaling, the images and their masks were randomly flipped vertically or horizontally.

Case 3: For this case, rescaling, random vertical or horizontal flips, and random rotations in the range [-20, 20] degrees were applied to the images and their corresponding masks.

Case 4: The images and their corresponding masks were randomly shifted along the width and height by a factor of 0.3.

Case 5: Shear transformations were randomly applied to the images and their corresponding masks using a factor of 20. They were also randomly zoomed in, with a zoom range of [0.2, 0.5].
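One way to write the five cases down is as keyword arguments for the Keras ImageDataGenerator. The sketch below only builds plain dictionaries, so it runs without TensorFlow. The keys are real ImageDataGenerator parameters, but the mapping from each case to them is a reading of the descriptions above, not the author's code. In particular, it assumes Cases 4 and 5 use only the listed transformations plus rescaling.

```python
RESCALE = 1.0 / 255  # applied in every case, and to validation/test data

# Assumed mapping of the five cases to ImageDataGenerator keyword arguments.
AUGMENTATION_CASES = {
    "case_1": {"rescale": RESCALE},
    "case_2": {"rescale": RESCALE, "horizontal_flip": True, "vertical_flip": True},
    "case_3": {
        "rescale": RESCALE,
        "horizontal_flip": True,
        "vertical_flip": True,
        "rotation_range": 20,  # random rotations in [-20, 20] degrees
    },
    "case_4": {
        "rescale": RESCALE,
        "width_shift_range": 0.3,
        "height_shift_range": 0.3,
    },
    "case_5": {"rescale": RESCALE, "shear_range": 20, "zoom_range": [0.2, 0.5]},
}

# Validation and test data are only rescaled.
EVAL_KWARGS = {"rescale": RESCALE}

for name, kwargs in AUGMENTATION_CASES.items():
    extras = sorted(key for key in kwargs if key != "rescale")
    print(name, extras or "no augmentation")
```

Each dictionary could then be unpacked into a generator, for example ImageDataGenerator(**AUGMENTATION_CASES["case_3"]), once for the images and once for the masks.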

Different types of augmentations on the same image and its mask (By Author)

A Comparison of Results

Across all 5 cases, the model was trained for 250 epochs with a batch size of 16. The Adam optimizer was used with a learning rate of 0.00001, beta 1 of 0.99, and beta 2 of 0.99. The graphs below show that each case of data augmentation gave a different performance for the same model, trained for the same number of epochs with the same initial state of the optimizer.
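To give a feel for these settings, the sketch below writes out one textbook Adam update with the stated learning rate and betas, and counts the number of updates in the run. It is not the Keras implementation. The epsilon value is an assumption, and the update count assumes each epoch makes exactly one pass over the 2560 training images.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-5, beta1=0.99, beta2=0.99, eps=1e-7):
    """Apply one Adam update to a single parameter and return (theta, m, v)."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


steps_per_epoch = 2560 // 16           # training images / batch size
total_updates = steps_per_epoch * 250  # over all epochs

# First update on a hypothetical parameter of 1.0 with gradient 2.0.
theta, m, v = adam_step(theta=1.0, grad=2.0, m=0.0, v=0.0, t=1)
print(steps_per_epoch, total_updates, theta)
```

On the first step the parameter moves by roughly the learning rate regardless of the gradient's size, so with a learning rate of 0.00001 each update is very small.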
