Statistical Decision Theory

栏目: IT技术 · 发布时间: 4年前

In this post, we will discuss some theory that provides the framework for developing machine learning models.

Let’s get started!

If we consider a real valued random input vector, X , and a real valued random output vector, Y , the goal is to find a function f ( X ) for predicting the value of Y. This requires a loss function, L ( Y , f ( X )). This function allows us to penalize errors in predictions. One example of a commonly used loss function is the square error losss:

The loss function is the squared difference between true outcome values and our predictions. If f ( X ) = Y , which means our predictions equal true outcome values, our loss function is equal to zero. So we’d like to find a way to choose a function f ( X ) that gives us values as close to Y as possible.

Given our loss function, we have a critereon for selecting f ( X ). We can calculate the expected squared prediction error by integrating the loss function over x and y :

Where P( X , Y ) is the joint probability distribution in input and output. We can then condition on X and calculate the expected squared prediction error as follows:

We can then minimize this expect squared prediction error point wise, by finding the values, c , which minimize the error given X :

The solution to this is:

Which is the conditional expectation of Y , given X = x. Put another way, the regression function gives the conditional mean of Y, given our knowledge of X. Interestingly, the k -nearest neighbors method is a direct attempt at implementing this method from training data. With nearest neighbors, for each x , we can ask for the average of the y ’s where the input, x , equals a specific value. Our estimator for Y can then be written as:

Where we are taking the average over sample data and using the result to estimate the expected value. We are also conditioning on a region with k neighbors closest to the target point. As the sample size gets larger, the points in the neighborhood are likely to be close to x . Additionally, as the number of neighbors, k , gets larger the mean becomes more stable.

If you’re interested in learning more, Elements of Statistical Learning , by Trevor Hastie, is a great resource. Thank you for reading!

以上就是本文的全部内容，希望本文的内容对大家的学习或者工作能带来一定的帮助，也希望大家多多支持码农网

查看所有标签

猜你喜欢:

Statistical Decision Theory

本站部分资源来源于网络，本站转载出于传递更多信息之目的，版权归原作者或者来源机构所有，如转载稿涉及版权问题，请联系我们。

码农书籍

数据结构

殷人昆 / 清华大学 / 2007-6 / 39.00元

《数据结构》(第2版)“数据结构”是计算机专业的核心课程，是从事计算机软件开发和应用人员必备的专业基础。随着计算机的日益普及，“数据结构”课程也在不断地发展。《数据结构》(第2版)按照清华大学计算机系本科“数据结构”大纲的要求，从面向对象的概念、对象类设计的风格和数据结构的层次开始，从线性结构到非线性结构，从简单到复杂，深入地讨论了各种数据结构内在的逻辑关系及其在计算机中的实现方式和使用。此外，对......一起来看看《数据结构》这本书的介绍吧!

码农工具

Statistical Decision Theory

数据结构

JS 压缩/解压工具

URL 编码/解码

HEX CMYK 转换工具