The Gradient Vector

What is it, and how do we compute it?

Roman Paolucci

Jun 5 · 4 min read

The Gradient Vector — Photo from Unsplash

Vector Calculus

In vector calculus, one of the first major topics is the introduction of vectors and three-dimensional space as an extension of the two-dimensional space often studied in the Cartesian coordinate system. Vectors have two main properties: direction and magnitude. In two dimensions, we can visualize a vector extending from the origin as an arrow (exhibiting both direction and magnitude).

Intuitively, this can be extended to three dimensions, where we can visualize an arrow floating in space (again, exhibiting both direction and magnitude).
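As a small sketch of these two properties, magnitude and direction can be computed directly from a vector's components. The vector `v` and the helper names `magnitude` and `direction` are chosen here purely for illustration.

```python
import math

def magnitude(v):
    # Length of the vector: square root of the sum of squared components.
    return math.sqrt(sum(c * c for c in v))

def direction(v):
    # Unit vector pointing the same way as v (assumes v is not the zero vector).
    m = magnitude(v)
    return tuple(c / m for c in v)

v = (1.0, 2.0, 2.0)      # an illustrative vector in three dimensions
print(magnitude(v))      # 3.0
print(direction(v))      # components of the unit vector along v
```

The same two functions work unchanged for a vector with any number of components.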

Less intuitively, the notion of a vector can be extended to any number of dimensions, where comprehension and analysis can only be accomplished algebraically. It’s important to note that, in any case, a vector does not have a specific location: if two vectors have the same direction and magnitude, they are the same vector. Now that we have a basic understanding of vectors, let’s talk about the gradient vector.

The Gradient Vector

Regardless of dimensionality, the gradient vector is a vector containing all first-order partial derivatives of a function.

Let’s compute the gradient for the following function…

The gradient is denoted as ∇…
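For a function of two variables such as f(x, y), the standard notation collects the two first-order partial derivatives into a single vector. This is the general form only, written here in LaTeX; it does not depend on which function is chosen.

```latex
\nabla f(x, y) = \left( \frac{\partial f}{\partial x}, \; \frac{\partial f}{\partial y} \right)
```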

After partially differentiating…

That is the gradient vector for the function f(x, y). That’s all great, but what’s the point? What can the gradient vector do, and what does it even mean?
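To see the computation end to end, here is a minimal sketch in Python. The function f(x, y) = x² + 3xy is a hypothetical example picked for this sketch, not necessarily the function used in the original article. Its hand-derived partial derivatives are compared against a central-difference approximation at the point (1, 2).

```python
def f(x, y):
    # Hypothetical example function (an assumption for this sketch).
    return x**2 + 3 * x * y

def grad_f(x, y):
    # Hand-derived partial derivatives: df/dx = 2x + 3y, df/dy = 3x.
    return (2 * x + 3 * y, 3 * x)

def numerical_gradient(func, x, y, h=1e-5):
    # Central differences approximate each partial derivative in turn,
    # holding the other variable fixed.
    df_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    df_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (df_dx, df_dy)

analytic = grad_f(1.0, 2.0)
approx = numerical_gradient(f, 1.0, 2.0)
print(analytic)  # (8.0, 3.0)
print(approx)    # close to the analytic values
```

Agreement between the two results is a quick sanity check that the partial derivatives were taken correctly.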

Gradient Ascent: Maximization

The gradient of a function, evaluated at a point, points in the direction of greatest increase from that point. This is incredible. Imagine you have a function modeling profit for your company. Obviously, your goal is to maximize profit. One way to do this is to pick some random starting inputs and then update them iteratively: compute the gradient at the current inputs and add its values (usually scaled by a small step size) to those inputs, repeating until a maximum is reached.
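The update rule can be sketched as follows. The profit function, its maximum at (3, 2), the step size of 0.1, and the fixed number of steps are all assumptions made for illustration; a fixed starting point is used instead of a random one so the result is reproducible.

```python
def profit(x, y):
    # Hypothetical concave profit function with its maximum at (3, 2).
    return 10 - (x - 3)**2 - (y - 2)**2

def profit_gradient(x, y):
    # Partial derivatives of profit with respect to x and y.
    return (-2 * (x - 3), -2 * (y - 2))

def gradient_ascent(grad, start, step_size=0.1, steps=200):
    x, y = start
    for _ in range(steps):
        gx, gy = grad(x, y)
        # Move a small step in the direction of greatest increase.
        x += step_size * gx
        y += step_size * gy
    return x, y

best_x, best_y = gradient_ascent(profit_gradient, start=(0.0, 0.0))
print(best_x, best_y, profit(best_x, best_y))
```

Without the step size, adding the raw gradient can overshoot the maximum; scaling it down trades speed for stability.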

Gradient Descent: Minimization

We know that the gradient vector points in the direction of greatest increase. Conversely, the negative of the gradient vector points in the direction of greatest decrease. The main purpose of gradient descent is to minimize an error or cost, most notably in machine learning. Imagine you have a function modeling costs for your company. Obviously, your goal is to minimize costs. Similar to maximizing profits, you can pick some random starting inputs, compute the gradient vector there, and iteratively update the inputs by subtracting the values in the gradient vector (again, usually scaled by a small step size) from your previous inputs until a minimum is reached.
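A minimal sketch of the descent loop, this time stopping once the gradient is nearly zero rather than after a fixed number of steps. The cost function, its minimum at (1, -2), the step size, and the tolerance are assumptions chosen for illustration.

```python
import math

def cost(x, y):
    # Hypothetical convex cost function with its minimum at (1, -2).
    return (x - 1)**2 + 2 * (y + 2)**2

def cost_gradient(x, y):
    # Partial derivatives of cost with respect to x and y.
    return (2 * (x - 1), 4 * (y + 2))

def gradient_descent(grad, start, step_size=0.1, tolerance=1e-8, max_steps=10_000):
    x, y = start
    for step in range(max_steps):
        gx, gy = grad(x, y)
        # A near-zero gradient means no direction decreases the cost much further.
        if math.hypot(gx, gy) < tolerance:
            break
        # Move a small step in the direction of greatest decrease.
        x -= step_size * gx
        y -= step_size * gy
    return x, y, step

min_x, min_y, steps_used = gradient_descent(cost_gradient, start=(5.0, 5.0))
print(min_x, min_y, steps_used)
```

The only change from ascent is the sign of the update: subtracting the scaled gradient instead of adding it.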

Issues with Gradient Ascent/Descent

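The most notable issue with this method of optimization is the existence of relative extrema. A relative extremum is a point where the function takes its maximum or minimum value relative to the points around it, as exhibited by the graph below. Because the gradient is zero at such a point, the iterative updates can settle there even when a better value exists elsewhere.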
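The effect is easy to reproduce in one dimension. In the sketch below, the function g(x) = x⁴ - 3x² + x is a hypothetical example with two relative minima, and the step size and starting points are assumptions. Plain gradient descent lands in a different minimum depending on where it starts.

```python
def g(x):
    # Hypothetical function with two relative minima, near x = -1.30 and x = 1.13.
    return x**4 - 3 * x**2 + x

def g_prime(x):
    # Derivative of g (the one-dimensional gradient).
    return 4 * x**3 - 6 * x + 1

def descend(start, step_size=0.01, steps=2000):
    x = start
    for _ in range(steps):
        x -= step_size * g_prime(x)
    return x

left = descend(-2.0)   # settles in the minimum near x = -1.30
right = descend(2.0)   # settles in the minimum near x = 1.13
print(left, g(left))
print(right, g(right))
```

Both runs stop at a point where the derivative is zero, but only the run started from the left finds the lower of the two values.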

A traditional calculus approach to optimization runs into this same problem and solves it by comparing the function output at all relative extrema to determine the true global maximum or minimum. For gradient ascent/descent, a variety of modifications can be made to the iterative process of updating the inputs to avoid (or pass) relative extrema, aiding the optimization effort. The main types of gradient ascent/descent are:

Stochastic Gradient Ascent/Descent
Batch Gradient Ascent/Descent
Mini-Batch Gradient Ascent/Descent
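The article only names these variants. The sketch below assumes the usual distinction between them, namely how many data points are used to estimate the gradient at each update: all of them (batch), one (stochastic), or a small subset (mini-batch). The toy dataset, the one-parameter model y = w·x, the function name `fit_slope`, and the hyperparameters are all hypothetical.

```python
import random

def fit_slope(xs, ys, batch_size, learning_rate=0.01, epochs=200, seed=0):
    # Fits w in y = w * x by gradient descent on the mean squared error.
    rng = random.Random(seed)          # seeded so the shuffling is reproducible
    indices = list(range(len(xs)))
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the mean squared error over the current batch only.
            grad = sum(2 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w -= learning_rate * grad
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # generated from y = 2x, so the best slope is 2

w_batch = fit_slope(xs, ys, batch_size=len(xs))  # batch: all points per update
w_stochastic = fit_slope(xs, ys, batch_size=1)   # stochastic: one point per update
w_mini = fit_slope(xs, ys, batch_size=2)         # mini-batch: two points per update
print(w_batch, w_stochastic, w_mini)
```

On this noise-free toy data all three settings recover the same slope. On real data the smaller batches give noisier gradient estimates, and that noise is one reason they can move past shallow relative extrema.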
