Classical Neural Networks: What does a Loss Function Landscape look like?

Category: IT Technology · Published: 5 years ago

Ever wondered what kind of topology we are optimising our neural networks on? Well, now you know!

Every neural network’s objective/loss function is to be minimized! But what does this loss function really look like? Today, we will be showing the loss function for two different neural networks (N1, N2: fig.1).

fig.1 (rights: own image)

The family of loss functions we will use when training is MSE (Mean Squared Error). Although other loss function families might be interesting, we will stick with this one for the purpose of illustration.
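For reference, MSE is just the average of the squared differences between predictions and targets. A minimal sketch in plain Python (the function name mse and the sample values are only illustrative, not taken from the networks trained here):

```python
def mse(predictions, targets):
    """Mean Squared Error: average of the squared prediction errors."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative values only: the first two predictions are exact, the third is off by 2.
example_loss = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(example_loss)
```

Because every error is squared, a single badly predicted point can dominate the average, which is what makes high-loss regions stand out in the plots that follow.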

For the extra curious: we will be training the N2 neural network on this distribution (fig.2; and yes, I am too lazy to add noise). The training part is not very interesting since, again, what we want is a landscape illustration.

fig.2 (rights: own image)

And the N1 neural network (fig.3):

fig.3 (rights: own image)

Loss function value as a function of the input (N2)

Let’s simply plot the loss function itself to begin with (fig.4).

fig.4 (rights: own image)

Remarks to be made:

We see that error values are particularly high around x=-2 and y in [-1,1].
Besides showing what a loss function looks like, such an illustration can be useful for someone who wants to purposefully attack such a neural network! For an adversary, this can be a first exploratory step.
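The exploratory step described above can be sketched roughly as follows: evaluate the squared error on a grid of inputs and look for the point where it is highest. This is only a sketch under assumptions: toy_model and toy_target are hypothetical stand-ins, not the real N2 or the distribution of fig.2, and the helper names are made up for the example.

```python
import math


def loss_over_inputs(model, target, xs, ys):
    """Squared error at every (x, y) grid point, as a dict keyed by (x, y)."""
    return {(x, y): (model(x, y) - target(x, y)) ** 2 for x in xs for y in ys}


def worst_input(losses):
    """Grid point where the loss is highest."""
    return max(losses, key=losses.get)


# Hypothetical stand-ins: neither function is the real N2 or the data of fig.2.
def toy_model(x, y):
    return math.tanh(x) + 0.5 * y


def toy_target(x, y):
    return x * y


xs = [k / 2.0 for k in range(-6, 7)]  # -3.0 ... 3.0 in steps of 0.5
ys = [k / 2.0 for k in range(-2, 3)]  # -1.0 ... 1.0 in steps of 0.5
losses = loss_over_inputs(toy_model, toy_target, xs, ys)
print(worst_input(losses))
```

Swapping in a trained network for toy_model and the true labelling function for toy_target gives the kind of surface shown in fig.4.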

Loss function value as a function of weights (N1)

Seeing the loss as a function of the input is nice, but it is not exactly what most people are interested in. Seeing the landscape we optimise on is definitely more useful for crafting an architecture! As mentioned in my previous article, N1 has 7 scalar weights to optimise. Plotting a 7-dimensional surface would do very little for our understanding, so we will arbitrarily project onto two dimensions: we pick two of the weights and keep the other five fixed. Note that we also fix the input, so that the only variables are the two weights (fig.5).

fig.5 (rights: own image)
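A two-weight slice like the one in fig.5 can be sketched as below. Everything specific here is an assumption made for illustration: the 1-2-1 architecture with tanh units (which happens to have 7 scalar weights), the base weight values, the chosen data point, and the names n1_forward, squared_error and loss_slice. It is not the actual N1.

```python
import math


def n1_forward(w, x):
    """Hypothetical 1-2-1 network with 7 scalar weights (an assumption, not the real N1)."""
    w1, b1, w2, b2, v1, v2, c = w
    h1 = math.tanh(w1 * x + b1)
    h2 = math.tanh(w2 * x + b2)
    return v1 * h1 + v2 * h2 + c


def squared_error(w, x, target):
    """Loss for a single data point (x, target)."""
    return (n1_forward(w, x) - target) ** 2


def loss_slice(base_w, i, j, values, x, target):
    """Loss over a 2D grid of weights i and j; the other five stay at base_w."""
    surface = []
    for a in values:
        row = []
        for b in values:
            w = list(base_w)  # copy so base_w is never modified
            w[i], w[j] = a, b
            row.append(squared_error(w, x, target))
        surface.append(row)
    return surface


base_w = [0.5, 0.0, -0.3, 0.1, 1.0, -1.0, 0.0]  # arbitrary illustrative values
values = [k / 5.0 for k in range(-15, 16)]      # -3.0 ... 3.0 in steps of 0.2
surface = loss_slice(base_w, 0, 4, values, x=1.0, target=0.5)
```

The nested list surface can then be handed to any surface or contour plotting routine, with values on both axes.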

Fig.5 is the landscape for one data point. Several things are worth noting:

If we were to minimise over the two arbitrarily picked weights on this plot, the loss for this one data point would obviously decrease.
Now, to minimise such a function, any simple gradient search would be enough, and not even SGD would be needed, since the surface in this projection looks convex. However, it also has a plateau (so it is not strictly convex), and the gradient there is close to zero. We would therefore need gradient descent with momentum rather than plain gradient descent to keep the steps big enough.
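To illustrate the plateau remark, here is a small sketch comparing plain gradient descent with heavy-ball momentum. It uses a single sigmoid unit on one data point as a stand-in, because its squared error has an obvious flat region; this toy loss, the starting point, the learning rate and the momentum coefficient are all assumptions chosen for the example, not the surface of fig.5.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, x=1.0, target=1.0):
    """Squared error of a single sigmoid unit on one data point."""
    return (sigmoid(w * x) - target) ** 2


def grad(w, x=1.0, target=1.0):
    """Analytic derivative of loss with respect to w."""
    s = sigmoid(w * x)
    return 2.0 * (s - target) * s * (1.0 - s) * x


def descend(w, steps, lr, beta=0.0):
    """Gradient descent with heavy-ball momentum; beta=0 is plain descent."""
    velocity = 0.0
    for _ in range(steps):
        velocity = beta * velocity - lr * grad(w)
        w += velocity
    return w


w_start = -6.0  # deep in the saturated region, where the gradient is tiny
w_plain = descend(w_start, steps=200, lr=0.5)
w_momentum = descend(w_start, steps=200, lr=0.5, beta=0.9)
print("plain:   ", loss(w_plain))
print("momentum:", loss(w_momentum))
```

With the same learning rate and number of steps, the plain run barely moves off the plateau, while the momentum run accumulates the small gradients into a usable step and ends at a much lower loss.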

Such a plot can be made for multiple data points when using MSE as a loss metric. Picturing losses for a general neural network is harder because of the number of weights, but this can be a way to avoid randomly trying optimisation algorithms when training, and instead understand the underlying data model that you want to approximate. I hope, again, that this helps people understand that Machine Learning is not black magic and truly requires analysis! Hyperparameter search does not have to be random trials.
