Exploring Neural Networks and their fascinating effectiveness

Category: IT Technology · Published: 4 years ago

Inside AI

Understanding the concepts behind the effectiveness of neural networks, using Python and PyTorch.

Jul 17 · 5 min read

Photo by Andrew Neel on Unsplash

Where did it all start?

It all started with the idea of understanding how the brain actually works. Back in the 1940s, McCulloch and Pitts introduced the artificial neuron [1], and in the 1950s Frank Rosenblatt introduced the first Perceptron [2]. Neural networks have been with us since the 1940s, but the field has had its ups and downs because practical implementations were lacking. The recent growth of deep learning techniques, across a variety of neural network architectures, rests on two major advances: first, computation power (high-performance CPUs and GPUs), and second, the amount of data available.

Geoff Hinton and two of his graduate students showed how one could take a very large dataset called ImageNet, with 10,000 categories and 10 million images, and reduce the classification error by 20 percent using deep learning. This happened back in 2012 at the NIPS meeting, and it was remarkable. As Terrence Sejnowski says,

Traditionally on that dataset (ImageNet), error decreases by less than 1 percent in one year. In one year, 20 years of research was bypassed. That really opened the floodgates. (In an interview with The Verge.)

This was the moment when the “AI buzz” started.

How magical is a neural network?

Artificial neural networks (ANNs) consist of a large number of simple, interconnected processing elements. These elements operate in parallel, and their function is determined by the network structure, the connection strengths, and the processing performed at the computing elements, or nodes. The growth of interest in deep learning was partly due to the failure of traditional programming techniques on “hard” tasks such as machine vision, continuous speech recognition, and machine learning.

There have been significant demonstrations of neural network capabilities in vision, speech, signal processing, and robotics. The variety of problems addressed by neural networks is impressive. There have been recent breakthroughs in tasks such as image classification, object detection, text generation, image captioning, language translation, and image generation with GANs.

Neural networks are a hot research area that is growing quickly and broadly, thanks to their effectiveness and the availability of high-performance processors (GPUs, TPUs). Recent advances in cloud platforms such as AWS and Google Cloud also contribute to better research. However, deep learning often struggles with scaling and with real-world problems (a discussion of these issues is beyond the scope of this post).

Function Approximation

Function approximation describes the behavior of a complex function with an ensemble of simpler functions. Methods include polynomial approximation by Gauss, series expansions that approximate a function around an operating point, such as the Taylor series, and many more.
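To make the idea of a series expansion concrete, here is a minimal sketch that approximates sin(x) around the operating point 0 with the first few terms of its Taylor series. The function name taylor_sin and the choice of x = 1.0 are mine, for illustration only.

```python
import math

def taylor_sin(x, terms):
    """Approximate sin(x) with the first `terms` nonzero Taylor terms around 0."""
    total = 0.0
    for n in range(terms):
        # n-th term of the series: (-1)^n * x^(2n+1) / (2n+1)!
        total += (-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
    return total

# The absolute error shrinks as more simple (polynomial) terms are added.
for terms in (1, 2, 3, 5):
    approx = taylor_sin(1.0, terms)
    print(terms, approx, abs(approx - math.sin(1.0)))
```

Each added term is a simple polynomial, yet together they track the sine closely near the operating point. Far from that point, many more terms are needed.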

Neural Networks are function approximation machines that achieve generalization statistically.

Neural Networks are universal approximators

Feedforward neural networks provide a universal approximation framework, formalized by the Universal Approximation Theorem:

The universal approximation theorem, in one of its most general versions, says that if we consider only continuous activation functions σ, then a standard feedforward neural network with one hidden layer is able to approximate any continuous multivariate function f to any given approximation threshold ε, if and only if σ is non-polynomial. [3]

Feedforward networks provide a universal system for representing functions in the sense that, given a function, there exists a feedforward network that approximates it. The theorem says that a sufficiently large network exists, but it does not say exactly how large that network must be.

In short, a feedforward neural network with a single hidden layer is sufficient to approximate any continuous function, but the layer may be quite large, and the network may fail to generalize correctly.
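One way to build intuition for why a single hidden layer can be enough is to construct such a network by hand rather than train it. The sketch below is my own illustration, not the method used in the notebook. It places one steep sigmoid unit on each sub-interval of [-3, 3] and sets its output weight to the rise of the target function over that sub-interval. The names build_network and max_error, and the steepness factor of 10, are assumptions chosen for this example.

```python
import math

def sigmoid(z):
    # tanh form of the logistic function; avoids overflow for large |z|.
    return 0.5 * (1.0 + math.tanh(0.5 * z))

def build_network(f, lo, hi, hidden):
    """Hand-pick weights of a one-hidden-layer sigmoid network that follows f on [lo, hi]."""
    h = (hi - lo) / hidden            # width of each sub-interval
    steep = 10.0 / h                  # shared input weight (assumed steepness)
    edges = [lo + i * h for i in range(hidden + 1)]
    units = []
    for left, right in zip(edges, edges[1:]):
        centre = 0.5 * (left + right)
        # Hidden unit computes sigmoid(steep * x + bias), switching on near `centre`;
        # its output weight is the rise of f across the sub-interval.
        units.append((-steep * centre, f(right) - f(left)))
    out_bias = f(lo)

    def network(x):
        return out_bias + sum(c * sigmoid(steep * x + b) for b, c in units)

    return network

def max_error(f, net, lo, hi, samples=601):
    """Largest absolute gap between f and the network on an evenly spaced grid."""
    xs = [lo + (hi - lo) * i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - net(x)) for x in xs)

# The worst-case error shrinks as the hidden layer gets wider.
for hidden in (10, 50, 200):
    net = build_network(math.cos, -3.0, 3.0, hidden)
    print(hidden, max_error(math.cos, net, -3.0, 3.0))
```

The construction also shows the cost the theorem leaves open: a tighter threshold ε is met here only by adding more hidden units.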

Example of approximation

As an example of function approximation, I take the well-known sine and cosine functions. Over the range of data points [-3, 3], the function plots look as follows:

  1. Cosine Function

Cosine Plot (Source: Notebook)

  2. Sine Function

Sine Plot (Source: Notebook)

Next, we approximate these functions with a neural network of the following architecture:

Layers: Input(1)-Hidden(100)-Output(1)

Neural Network Architecture[in-hidden-out, 1–100–1]
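A 1-100-1 network of this kind could be written in PyTorch roughly as follows. This is a sketch under my own assumptions, not the notebook's code: the Tanh activation, the Adam optimizer, the learning rate, the number of steps, and the 200 sample points are all choices made for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)  # reproducible initial weights

# Training data: 200 evenly spaced points in [-3, 3], shaped (N, 1).
x = torch.linspace(-3.0, 3.0, 200).unsqueeze(1)
y = torch.cos(x)

# Input(1) - Hidden(100) - Output(1); the Tanh activation is an assumption.
model = nn.Sequential(
    nn.Linear(1, 100),
    nn.Tanh(),
    nn.Linear(100, 1),
)

loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

initial_loss = loss_fn(model(x), y).item()

# Full-batch training: every step uses all 200 points.
for step in range(500):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

final_loss = loss_fn(model(x), y).item()
print(f"MSE before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Swapping torch.cos for torch.sin gives the sine experiment with the same architecture.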

After optimization, the network's approximations of the two functions look as follows:

  1. Cosine

  2. Sine

Problems

The illustrations above show that, with the right set of parameters, the neural network almost fits the original function. Everything seems perfect, so what could possibly go wrong?

Overfitting

Yes, our neural network overfits! The error on the training set is driven to a very small value, but when new data is presented to the network the error is large. The network has memorized the training examples, but it has not learned to generalize to new situations. This must be avoided. Let us look at another, custom function with the same neural network architecture.

Function,

10*(a²)*sin(a)*cos(a)
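For readers who want to reproduce this target, here is a minimal sketch of the function. It assumes that the only variable is a, so the last factor is read as cos(a), and that the function is sampled on [-3, 3] like the sine and cosine examples. The name target and the 201-point grid are mine.

```python
import math

def target(a):
    """Custom target function: 10 * a^2 * sin(a) * cos(a)."""
    return 10 * a ** 2 * math.sin(a) * math.cos(a)

# Sample on [-3, 3]; this range is assumed to match the earlier plots.
xs = [-3.0 + 6.0 * i / 200 for i in range(201)]
ys = [target(a) for a in xs]
print(min(ys), max(ys))
```

Compared with a plain sine or cosine, the a² factor makes the oscillations grow towards the ends of the range, which gives the same network a harder function to fit.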

On function approximation,

Approximated
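The gap between training error and error on new data can be reproduced without a neural network at all. In the sketch below, a degree-14 polynomial that passes exactly through 15 noisy samples of sin(x) stands in for an over-flexible model. Polynomial interpolation is my substitute for the network here, and the noise level, seed, and point count are assumptions for illustration.

```python
import math
import random

random.seed(0)  # reproducible noise

def lagrange_predict(xs, ys, x):
    """Evaluate the unique polynomial through all (xs, ys) points at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total

def mse(predict, xs, ys):
    """Mean squared error of `predict` over paired inputs and targets."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# 15 noisy training samples of sin(x) on [-3, 3].
n = 15
train_x = [-3.0 + 6.0 * i / (n - 1) for i in range(n)]
train_y = [math.sin(x) + random.gauss(0.0, 0.2) for x in train_x]

# Unseen points: midpoints between training inputs, with noise-free targets.
test_x = [(a + b) / 2 for a, b in zip(train_x, train_x[1:])]
test_y = [math.sin(x) for x in test_x]

def predict(x):
    return lagrange_predict(train_x, train_y, x)

train_mse = mse(predict, train_x, train_y)
test_mse = mse(predict, test_x, test_y)
print(f"train MSE: {train_mse:.6f}, test MSE: {test_mse:.6f}")
```

The polynomial memorizes every training point, noise included, so its training error says nothing about how it behaves between those points.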

Conclusion

These were some attempts to understand the basics of the universal approximation power of neural networks, which is what makes deep learning effective on a wide range of tasks. Finally, we summarize the above as follows:

  • Training a neural network on data approximates the unknown underlying mapping function from inputs to outputs.
  • Problems such as overfitting during training hurt the network's results on new (unseen) data.

You can observe how changing the network architecture and parameters affects the results by experimenting with the notebook, which contains the code for the plots above.

References

[1] McCulloch, W. S. and Pitts, W. 1943. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5:115–133.

[2] Rosenblatt, F. 1957. The Perceptron: a perceiving and recognizing automaton. Report 85–460–1, Cornell Aeronautical Laboratory.

[3] Chong, K. F. E. A closer look at the approximation capabilities of neural networks.

Thank you!

