Batch vs Stochastic Gradient Descent

栏目: IT技术 · 发布时间: 5年前

Batch vs Stochastic Gradient Descent

Learn difference between Batch & Stochastic Gradient Descent and choose best descent for your model.

May 31 ·4min read

Batch vs Stochastic Gradient Descent

Photo by Bailey Zindel on Unsplash

Before diving into Gradient Descent, we’ll look how a Linear Regression model deals with Cost function. Main motive to reach Global minimum is to minimize Cost function which is given by,

Batch vs Stochastic Gradient Descent

Here, Hypothesis represents linear equation where, theta(0) is the bias AKA intercept and theta(1) are the weight(slope) given to the feature ‘x’.

Batch vs Stochastic Gradient Descent

Fig: 1

Weights and intercept are randomly initialized taking baby step to reach minimum point. An important parameter in Gradient Descent is the size of the steps, determined by the learning rate hyper-parameter. It’s important to note that if we set high value of learning rate, point will end up taking large steps and probably will not reach global minimum( having large errors). On the other hand, if we take small value of learning rate, purple point will take large amount of time to reach global minimum. Therefore, Optimal learning rate should be taken.


以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

链接

链接

[美] 巴拉巴西 / 徐彬 / 湖南科技出版社 / 2007-04-01 / 28.00

从鸡尾酒会到恐怖分子的巢穴,从远古的细菌到国际组织——所有这一切各自都是一种网络,都是一个令人惊讶的科学革新的一部分。21世纪初,有科学家发现,网络具有深层的秩序,依据简单而强有力的规则运行。这一领域的知识帮助我们了解时尚、病毒等的传播机制,了解生态系统的稳健性,以及经济体系的脆弱性——甚至是民主的未来。 一位致力于研究“链接和节点”的科学家将首次带领我们领略网络革新的内幕。在本书中,作者生......一起来看看 《链接》 这本书的介绍吧!

HTML 压缩/解压工具
HTML 压缩/解压工具

在线压缩/解压 HTML 代码

正则表达式在线测试
正则表达式在线测试

正则表达式在线测试

RGB CMYK 转换工具
RGB CMYK 转换工具

RGB CMYK 互转工具