What is it, and how do we compute it?
Jun 5 · 4 min read
Vector Calculus
In vector calculus, one of the major topics is the introduction of vectors and 3-dimensional space as an extension of the 2-dimensional space often studied in the Cartesian coordinate system. Vectors have two main properties: direction and magnitude. In two dimensions we can visualize a vector extending from the origin as an arrow (exhibiting both direction and magnitude).
Intuitively, this can be extended to three dimensions, where we can visualize an arrow floating in space (again, exhibiting both direction and magnitude).
Less intuitively, the notion of a vector can be extended to any number of dimensions, where comprehension and analysis can only be accomplished algebraically. It's important to note that in any case, a vector does not have a specific location. This means that if two vectors have the same direction and magnitude, they are the same vector. Now that we have a basic understanding of vectors, let's talk about the gradient vector.
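As a quick illustration, the magnitude (length) of a vector can be computed directly from its components, in any number of dimensions:

```latex
% Magnitude of a vector v = (v_1, v_2, v_3) in three dimensions:
\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + v_3^2}

% e.g. for v = (1, 2, 2):  \|\mathbf{v}\| = \sqrt{1 + 4 + 4} = 3
```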
The Gradient Vector
Regardless of dimensionality, the gradient vector is a vector containing all first-order partial derivatives of a function.
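In symbols, for a function f of n variables, the gradient collects every first-order partial derivative into a single vector:

```latex
\nabla f(x_1, \dots, x_n) =
\left( \frac{\partial f}{\partial x_1},
       \frac{\partial f}{\partial x_2},
       \dots,
       \frac{\partial f}{\partial x_n} \right)
```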
Let’s compute the gradient for the following function…
The gradient is denoted as ∇…
After partially differentiating…
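The specific function and its derivatives appear as figures in the original post; as a stand-in, here is the same computation for an assumed example function of two variables (this particular f is an illustrative choice, not necessarily the one pictured):

```latex
% Assumed example function of two variables:
f(x, y) = x^2 y + y^3

% Partial derivatives:
\frac{\partial f}{\partial x} = 2xy, \qquad
\frac{\partial f}{\partial y} = x^2 + 3y^2

% Gradient vector:
\nabla f(x, y) = \left( 2xy, \; x^2 + 3y^2 \right)
```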
That is the gradient vector for the function f(x, y). That's all great, but what's the point? What can the gradient vector do, and what does it even mean?
Gradient Ascent: Maximization
The gradient of a function, evaluated at a point, points in the direction of greatest increase of that function. This is incredible. Imagine you have a function modeling profit for your company. Obviously, your goal is to maximize profit. One way to do this is to pick some random inputs and compute the gradient vector there; you can then iteratively update your inputs by computing the gradient and adding those values (scaled by a small step size) to your previous inputs until a maximum is reached.
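A minimal sketch of that update loop might look like the following; the profit function, its gradient, the starting point, the step size, and the iteration count are all illustrative assumptions:

```python
import numpy as np

# Illustrative stand-in for a "profit" function of two inputs and its exact gradient.
# (Any differentiable function and its gradient would work here.)
def profit(x):
    return -(x[0] - 3.0) ** 2 - (x[1] + 1.0) ** 2   # peak at (3, -1)

def profit_gradient(x):
    return np.array([-2.0 * (x[0] - 3.0), -2.0 * (x[1] + 1.0)])

x = np.random.randn(2)   # pick some random inputs
step_size = 0.1          # how far to move along the gradient each iteration

for _ in range(200):
    x = x + step_size * profit_gradient(x)   # add the gradient: move "uphill"

print(x, profit(x))      # x should end up near (3, -1), the maximum
```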
Gradient Descent: Minimization
We know that the gradient vector points in the direction of greatest increase. Conversely, the negative of the gradient vector points in the direction of greatest decrease. The main purpose of gradient descent is to minimize an error or cost, most notably in machine learning. Imagine you have a function modeling costs for your company. Obviously, your goal is to minimize costs. Similar to maximizing profit, you can compute the gradient vector for some random inputs and iteratively update the inputs by subtracting the values in the gradient vector (again scaled by a step size) from your previous inputs until a minimum is reached.
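The descent version of the sketch differs only in the sign of the update; again, the cost function and its gradient below are assumptions made for illustration:

```python
import numpy as np

# Illustrative stand-in for a "cost" function and its gradient.
def cost(x):
    return (x[0] - 2.0) ** 2 + (x[1] - 5.0) ** 2    # minimum at (2, 5)

def cost_gradient(x):
    return np.array([2.0 * (x[0] - 2.0), 2.0 * (x[1] - 5.0)])

x = np.random.randn(2)   # random starting inputs
step_size = 0.1

for _ in range(200):
    x = x - step_size * cost_gradient(x)   # subtract the gradient: move "downhill"

print(x, cost(x))        # x should end up near (2, 5), the minimum
```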
Issues with Gradient Ascent/Descent
The most notable issue with this method of optimization is the existence of relative extrema. Relative extrema are points on the function that are maximum or minimum values only relative to the points around them, not necessarily over the function's entire domain.
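To see the problem concretely, here is a small sketch (the one-dimensional function and the starting points are illustrative assumptions) in which plain gradient descent reaches a different minimum depending on where it starts:

```python
def f(x):
    return (x ** 2 - 1) ** 2 + 0.3 * x   # two minima: near x = -1 (global) and x = +1 (relative)

def df(x):
    return 4.0 * x * (x ** 2 - 1) + 0.3  # derivative of f

def gradient_descent(x, step_size=0.01, iterations=2000):
    for _ in range(iterations):
        x = x - step_size * df(x)
    return x

# Starting on different sides of the "hill" between the two minima
# gives different answers, and only one of them is the global minimum.
print(gradient_descent(-1.5))   # ends near the global minimum (x close to -1)
print(gradient_descent(+1.5))   # ends near the relative minimum (x close to +1)
```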
A traditional calculus approach to optimization runs into this same problem and solves it by comparing the function's output at all relative extrema to determine the true global maximum or minimum. In terms of gradient ascent/descent, there are a variety of modifications that can be made to the iterative process of updating the inputs so that it can escape (or pass over) relative extrema, aiding the optimization effort. The main types of gradient ascent/descent, with a small sketch after the list, are…
- Stochastic Gradient Ascent/Descent
- Batch Gradient Ascent/Descent
- Mini-Batch Gradient Ascent/Descent
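These variants differ mainly in how much of the data is used to estimate the gradient on each update. A rough sketch of the idea, assuming a machine-learning style setup where the cost is an average over many examples (the toy model, names, and sizes here are all illustrative):

```python
import numpy as np

# Toy dataset and a toy linear least-squares model, purely for illustration.
X = np.random.randn(1000, 3)                  # 1000 examples, 3 features
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * np.random.randn(1000)

def gradient(w, X_batch, y_batch):
    # Gradient of the mean squared error over the given batch of examples.
    errors = X_batch @ w - y_batch
    return 2.0 * X_batch.T @ errors / len(y_batch)

w = np.zeros(3)
step_size = 0.1
batch_size = 32    # batch      : use all 1000 examples per update
                   # stochastic : use 1 example per update
                   # mini-batch : use a small batch (e.g. 32) per update

for _ in range(500):
    idx = np.random.choice(len(y), size=batch_size, replace=False)
    w = w - step_size * gradient(w, X[idx], y[idx])   # mini-batch update

print(w)           # should end up close to true_w
```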