Getting to know Activation Functions in Neural Networks.


What are activation functions in Neural Networks and why should you know about them?


Photo by Marius Masalar on Unsplash

If you are someone who has experience implementing neural networks, you might have encountered the term ‘activation functions’. Does the name ring any bells? No? How about ‘relu’, ‘softmax’ or ‘sigmoid’? Well, those are a few of the most widely used activation functions today. When I started working with neural networks I had no idea what an activation function really does, but there came a point where I could not go ahead with the implementation of my neural network without a sound knowledge of them. I did a little bit of digging and here’s what I found…

What are activation functions?

To put it simply, activation functions are mathematical functions that determine the output of a neural network. They decide whether to activate or deactivate each neuron in order to get the desired output, hence the name activation functions. Now, let’s get into the math…

Figure 1: An activation function acting as a gate between a neuron’s weighted sum and its output

In a neural network, numerical input values (x) are fed into neurons. Each neuron has weights (w) that are multiplied by its inputs to produce a value, which is in turn fed to the neurons in the next layer. Activation functions come into play as mathematical gates in this process, as depicted in Figure 1, and decide whether the output of a given neuron is on or off.
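To make this concrete, here is a minimal sketch of a single neuron in plain Python with NumPy (the input and weight values are made up purely for illustration):

```python
import numpy as np

def neuron(x, w, activation):
    """A single neuron: weighted sum of inputs, gated by an activation."""
    z = np.dot(w, x)       # multiply inputs by their weights and sum
    return activation(z)   # the activation function decides the output

x = np.array([0.5, -1.2, 3.0])   # example input values
w = np.array([0.4, 0.1, -0.7])   # example weights

print(neuron(x, w, np.tanh))     # any activation function can be plugged in
```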

Activation functions can be divided into three main categories: the Binary Step Function, the Linear Activation Function and Non-Linear Activation Functions. Non-linear activation functions, in turn, come in several varieties. Let’s take a deeper look…

1. Binary Step Function


Binary Step Activation Function

The binary step function is a threshold-based activation function: above a certain threshold the neuron is activated, and below it the neuron is deactivated. In the graph above, the threshold is zero. As the name suggests, this activation function can be used for binary classification; however, it cannot be used when you have multiple classes to deal with.
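As a sketch, a binary step with a threshold of zero (as in the graph above) can be written like this; the configurable threshold parameter is my own addition for illustration:

```python
import numpy as np

def binary_step(z, threshold=0.0):
    """Outputs 1 at or above the threshold, 0 below it."""
    return np.where(z >= threshold, 1, 0)

z = np.array([-1.5, -0.1, 0.0, 0.3, 2.0])
print(binary_step(z))  # -> [0 0 1 1 1]
```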

2. Linear Activation Function


Linear Activation Function

Here, the output is directly proportional to the weighted sum of the inputs. Unlike the binary step function, a linear activation function can deal with multiple classes. However, it has its own drawbacks. With a linear activation function, the gradient used in back-propagation is constant, which is not good for learning. Another huge drawback is that no matter how deep the neural network is (how many layers it consists of), the last layer will always be a linear function of the first, because a composition of linear functions is itself linear. This limits the neural network’s ability to deal with complex problems.
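That collapse of stacked linear layers is easy to verify numerically. In this sketch, random matrices stand in for trained weights, and two linear layers with no activation between them turn out to be exactly equivalent to one:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # weights of layer 1
W2 = rng.normal(size=(2, 4))   # weights of layer 2
x = rng.normal(size=3)         # an arbitrary input

two_layers = W2 @ (W1 @ x)     # two linear layers, no activation between
one_layer = (W2 @ W1) @ x      # a single equivalent linear layer

print(np.allclose(two_layers, one_layer))  # -> True
```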

3. Non-Linear Activation Functions

Deep learning practitioners today work with high-dimensional data such as images, audio and video. Given the drawbacks mentioned above, it is not practical to use linear activation functions in the complex applications we use neural networks for. Therefore, it is non-linear functions that are widely used at present. We’ll take a look at a few of the popular non-linear activation functions.

  • Sigmoid function.


Sigmoid function

The sigmoid function (also known as the logistic function) takes a probabilistic approach, with outputs ranging between 0 and 1; it normalizes the output of each neuron. However, for very high or very low inputs the sigmoid makes almost no change in its output, so the gradient becomes vanishingly small and the neural network effectively refuses to learn further. This problem is known as the vanishing gradient.
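The vanishing gradient is easy to see in a sketch: the sigmoid’s derivative, σ'(z) = σ(z)(1 − σ(z)), peaks at 0.25 and shrinks towards zero for large positive or negative inputs:

```python
import numpy as np

def sigmoid(z):
    """Logistic function: squashes any input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid: sigmoid(z) * (1 - sigmoid(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

for z in [-10.0, -1.0, 0.0, 1.0, 10.0]:
    print(f"z={z:+.1f}  sigmoid={sigmoid(z):.5f}  grad={sigmoid_grad(z):.5f}")
# the gradient is 0.25 at z=0 but almost 0 at z=+/-10,
# so extreme inputs contribute almost nothing to learning
```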

  • tanh function


tanh function

The tanh function (also known as the hyperbolic tangent) is much like the sigmoid function but slightly better, since its output ranges between -1 and 1, allowing negative outputs. However, tanh comes with the same vanishing gradient problem as the sigmoid function.
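A quick sketch comparing the two (the sample inputs are arbitrary) shows that tanh outputs are centred on zero while sigmoid outputs are not:

```python
import numpy as np

z = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])

print(np.tanh(z))                 # ranges over (-1, 1), centred on 0
print(1.0 / (1.0 + np.exp(-z)))   # sigmoid stays within (0, 1)
```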

  • ReLU (Rectified Linear Unit) function


ReLU function

With this function, outputs for positive inputs range from 0 to infinity, but when the input is zero or negative, the function outputs zero, and that hinders back-propagation: a neuron stuck at zero receives no gradient and may never recover. This problem is known as the dying ReLU problem.
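A minimal ReLU sketch makes the problem visible: for inputs at or below zero, both the output and the gradient are zero, so those neurons receive no learning signal:

```python
import numpy as np

def relu(z):
    """Rectified Linear Unit: identity for positive inputs, zero otherwise."""
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))  # -> [0.  0.  0.  0.5 2. ]
```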

  • Leaky ReLU


Leaky ReLU function

Leaky ReLU prevents the dying ReLU problem and keeps back-propagation alive by giving negative inputs a small, non-zero slope. One flaw of Leaky ReLU is that this slope is predetermined rather than something the neural network learns for itself.
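Here is a leaky ReLU sketch; the slope alpha for negative inputs is the predetermined constant mentioned above (0.01 is a common default):

```python
import numpy as np

def leaky_relu(z, alpha=0.01):
    """Like ReLU, but negative inputs keep a small fixed slope alpha."""
    return np.where(z > 0, z, alpha * z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(leaky_relu(z))  # -> [-0.02  -0.005  0.  0.5  2. ]
# negative inputs now yield small non-zero outputs and gradients,
# so back-propagation can still update these neurons
```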

There are quite a few other non-linear activation functions, such as softmax and Parametric ReLU, which are not discussed in this article. Now comes the million-dollar question: which activation function is the best? Well, my answer would be it depends… It depends on the problem you are applying the neural network to. For instance, if you are applying a neural network to a binary classification problem, sigmoid will work well, but for some other problem it might not, and that is why it is important to learn the pros and cons of activation functions, so that you can choose the best one for the project you are working on.

How to include activation functions in your code?

Years ago, implementing the math behind all these functions might have been quite difficult, but now, with the advancement of open-source libraries such as TensorFlow and PyTorch, it has become much easier! Let’s see a code snippet where activation functions are included using TensorFlow.

Activation functions in TensorFlow
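For example, a minimal sketch using TensorFlow’s Keras API (the layer sizes here are illustrative) might look like this:

```python
import tensorflow as tf

# Each layer's activation is selected simply by name (or a callable).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(64, activation="tanh"),
    tf.keras.layers.Dense(10, activation="softmax"),  # class probabilities
])

# The same activations are also available as standalone functions:
x = tf.constant([-2.0, 0.0, 2.0])
print(tf.keras.activations.sigmoid(x).numpy())
print(tf.nn.relu(x).numpy())
```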

Seems quite simple, right? As easy as TensorFlow makes it, it is important to have a genuine understanding of these activation functions, because the learning process of your neural network depends heavily on them.

Thank you for reading and hope this article was of help.
