The Gradient Vector


What is it, and how do we compute it?

[Photo from Unsplash]

Vector Calculus

In vector calculus, one of the major topics is the introduction of vectors and three-dimensional space as an extension of the two-dimensional space often studied in the Cartesian coordinate system. Vectors have two main properties: direction and magnitude. In two dimensions we can visualize a vector extending from the origin as an arrow, exhibiting both direction and magnitude.

[2D vector plot from matplotlib]

Intuitively, this can be extended to three dimensions, where we can visualize an arrow floating in space (again exhibiting both direction and magnitude).

[3D vector graph from JCCC]

Less intuitively, the notion of a vector can be extended to any number of dimensions, where comprehension and analysis can only be accomplished algebraically. It’s important to note that in any case, a vector does not have a specific location. This means that if two vectors have the same direction and magnitude, they are the same vector. Now that we have a basic understanding of vectors, let’s talk about the gradient vector.

The Gradient Vector

Regardless of dimensionality, the gradient vector is a vector containing all first-order partial derivatives of a function.
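
In symbols, for a differentiable function f of n variables, the gradient is written:

```latex
\nabla f(x_1, \dots, x_n) =
\left[ \frac{\partial f}{\partial x_1}, \; \frac{\partial f}{\partial x_2}, \; \dots, \; \frac{\partial f}{\partial x_n} \right]^{T}
```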

Let’s compute the gradient for the following function…

[Image: the function we are computing the gradient vector for]

The gradient is denoted as ∇…

[Image: the gradient vector for function f]

After partially differentiating…

[Image: the gradient vector for function f after substituting the partial derivatives]
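
As a concrete, self-contained sample (the function here is an arbitrary illustration, not necessarily the one pictured above), take f(x, y) = x²y + sin(y):

```latex
\nabla f
= \left[ \frac{\partial f}{\partial x}, \; \frac{\partial f}{\partial y} \right]
= \left[ \, 2xy, \;\; x^2 + \cos(y) \, \right]
```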

That is the gradient vector for the function f(x, y). That’s all great, but what’s the point? What can the gradient vector do, and what does it even mean?

Gradient Ascent: Maximization

The gradient of a function, evaluated at a point, points in the direction of the function's greatest increase at that point. This is incredible. Imagine you have a function modeling profit for your company. Obviously, your goal is to maximize profit. One way to do this is to pick some random inputs, then iteratively update them: compute the gradient at the current inputs and add its values (usually scaled by a small step size, or learning rate, so the updates don't overshoot) to the previous inputs, repeating until a maximum is reached.
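
As a minimal sketch of this loop in Python (the profit function, starting point, step size, and iteration count are all made-up stand-ins, not from a real model):

```python
# Gradient-ascent sketch. The "profit" surface and every constant
# below are illustrative assumptions, not a real profit model.

def profit(x, y):
    # A made-up concave surface with a single maximum at (3, -1).
    return -(x - 3) ** 2 - (y + 1) ** 2 + 10

def gradient(x, y):
    # Hand-computed partial derivatives of profit(x, y).
    return -2 * (x - 3), -2 * (y + 1)

learning_rate = 0.1   # step size scaling each update
x, y = 0.0, 0.0       # arbitrary starting inputs

for _ in range(100):
    dx, dy = gradient(x, y)
    x += learning_rate * dx   # ascent: move *with* the gradient
    y += learning_rate * dy

print(x, y)  # converges toward the maximizer (3, -1)
```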

Gradient Descent: Minimization

We know that the gradient vector points in the direction of greatest increase; conversely, the negative of the gradient vector points in the direction of greatest decrease. The main purpose of gradient descent is to minimize an error or cost, a use most prevalent in machine learning. Imagine you have a function modeling costs for your company. Obviously, your goal is to minimize costs. Just as with maximizing profit, you can pick some random inputs, compute the gradient vector, and iteratively update the inputs by subtracting the (scaled) gradient values from the previous inputs until a minimum is reached.
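
A matching sketch for gradient descent, again with an invented cost function and constants:

```python
# Gradient-descent sketch. The "cost" surface and constants are
# illustrative assumptions, not a real cost model.

def cost(x, y):
    # A made-up convex surface with a single minimum at (1, 2).
    return (x - 1) ** 2 + 2 * (y - 2) ** 2

def gradient(x, y):
    # Hand-computed partial derivatives of cost(x, y).
    return 2 * (x - 1), 4 * (y - 2)

learning_rate = 0.1
x, y = 5.0, -3.0  # arbitrary starting inputs

for _ in range(200):
    dx, dy = gradient(x, y)
    x -= learning_rate * dx   # descent: move *against* the gradient
    y -= learning_rate * dy

print(x, y, cost(x, y))  # converges toward the minimizer (1, 2)
```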

Issues with Gradient Ascent/Descent

The most notable issue with this method of optimization is the existence of relative extrema: points on the function that are a maximum or minimum only relative to the points around them, as exhibited by the graph below.

[Graph of relative extrema, from Paul’s Online Notes]

A traditional calculus approach to optimization runs into this same problem and solves it by comparing the function's output at all relative extrema to determine the true global maximum or minimum. In terms of gradient ascent/descent, a variety of modifications can be made to the iterative update process to avoid (or pass through) relative extrema, aiding the optimization effort. The main variants of gradient ascent/descent, which differ in how much of the data is used to estimate each update, are the following (a sketch contrasting them appears after the list):

  • Stochastic Gradient Ascent/Descent
  • Batch Gradient Ascent/Descent
  • Mini-Batch Gradient Ascent/Descent
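
As a minimal sketch of how the three variants differ (the least-squares problem, data, batch sizes, and learning rate below are all invented for illustration), each update simply estimates the gradient from a different amount of data:

```python
# Sketch contrasting stochastic, mini-batch, and batch updates on a
# made-up least-squares problem: fit w so that w * x ≈ y.
import random

xs = [float(i) for i in range(1, 21)]
ys = [3.0 * x + random.gauss(0, 0.5) for x in xs]  # true slope ≈ 3

def grad_on(pairs, w):
    # Gradient of mean squared error over the given (x, y) pairs.
    return sum(2 * (w * x - y) * x for x, y in pairs) / len(pairs)

def train(batch_size, steps=500, lr=0.001):
    w = 0.0
    data = list(zip(xs, ys))
    for _ in range(steps):
        batch = random.sample(data, batch_size)  # data used this step
        w -= lr * grad_on(batch, w)
    return w

print(train(batch_size=1))        # stochastic: one example per step
print(train(batch_size=5))        # mini-batch: a few examples per step
print(train(batch_size=len(xs)))  # batch: the full dataset per step
```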
