In this post, we will discuss some theory that provides the framework for developing machine learning models.
Let’s get started!
If we consider a real-valued random input vector, X, and a real-valued random output variable, Y, the goal is to find a function f(X) for predicting the value of Y. This requires a loss function, L(Y, f(X)), which allows us to penalize errors in predictions. One commonly used loss function is the squared error loss:

$$L(Y, f(X)) = (Y - f(X))^2$$
The loss function is the squared difference between the true outcome values and our predictions. If f(X) = Y, which means our predictions equal the true outcome values, our loss function is equal to zero. So we’d like to choose a function f(X) that gives us values as close to Y as possible.
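As a minimal sketch of the squared error loss described above, the NumPy snippet below evaluates it element-wise. The function name `squared_error_loss` and the sample values are assumptions made purely for illustration; they are not taken from the original post.

```python
import numpy as np


def squared_error_loss(y, y_pred):
    """Element-wise squared error loss, L(Y, f(X)) = (Y - f(X))^2."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return (y - y_pred) ** 2


# Hypothetical outcomes and predictions, chosen only for illustration.
y_true = np.array([1.0, 2.0, 3.0])
perfect_pred = np.array([1.0, 2.0, 3.0])   # f(X) = Y everywhere
imperfect_pred = np.array([1.5, 2.0, 1.0])  # two of three predictions are off

loss_perfect = squared_error_loss(y_true, perfect_pred)
loss_imperfect = squared_error_loss(y_true, imperfect_pred)

# Averaging the per-sample losses gives one number summarizing the fit.
mean_loss_imperfect = loss_imperfect.mean()
```

When the predictions match the outcomes exactly, every entry of the loss is zero. Because the difference is squared, larger errors are penalized more heavily than small ones.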
Given our loss function, we have a criterion for selecting f(X). We can calculate the expected squared prediction error (EPE) by integrating the loss function over x and y:

$$\mathrm{EPE}(f) = \mathrm{E}\,[Y - f(X)]^2 = \int [y - f(x)]^2 \, P(dx, dy)$$
Here, P(X, Y) is the joint probability distribution of the input and output. We can then condition on X and write the expected squared prediction error as follows:

$$\mathrm{EPE}(f) = \mathrm{E}_X \, \mathrm{E}_{Y \mid X}\left([Y - f(X)]^2 \mid X\right)$$
We can then minimize this expected squared prediction error pointwise, by finding the value, c, that minimizes the error given X = x:

$$f(x) = \operatorname{argmin}_c \, \mathrm{E}_{Y \mid X}\left([Y - c]^2 \mid X = x\right)$$
The solution to this is:

$$f(x) = \mathrm{E}(Y \mid X = x)$$
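This is the conditional expectation of Y given X = x, also known as the regression function. Put another way, the regression function gives the conditional mean of Y, given our knowledge of X. Interestingly, the k-nearest neighbors method is a direct attempt at implementing this idea from training data. For each x, we would ideally average the y’s whose input equals x exactly. Since training data rarely contains many observations at any single point, we instead average over the observations nearest to x. Our estimator for Y can then be written as:

$$\hat{f}(x) = \mathrm{Ave}\left(y_i \mid x_i \in N_k(x)\right)$$

where N_k(x) denotes the neighborhood containing the k training points closest to x.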
Here we take the average over the sample data and use the result to estimate the expected value. We also relax conditioning on a single point to conditioning on a region containing the k neighbors closest to the target point. As the sample size gets larger, the points in the neighborhood are likely to be close to x. Additionally, as the number of neighbors, k, gets larger, the mean becomes more stable.
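To make the nearest-neighbor estimator concrete, here is a minimal NumPy sketch for one-dimensional inputs. The function name `knn_regress`, the sine-shaped target `true_f`, the noise level, and the choices of sample size and k are all assumptions made for illustration; none of them come from the original post.

```python
import numpy as np


def knn_regress(x_train, y_train, x_query, k):
    """Estimate E(Y | X = x) as the mean y over the k nearest training points."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    # Absolute distance between every query point and every training point.
    dists = np.abs(x_query[:, None] - x_train[None, :])
    # Indices of the k closest training points for each query (order within k is arbitrary).
    nearest = np.argpartition(dists, k - 1, axis=1)[:, :k]
    # Average the outcomes inside each neighborhood.
    return y_train[nearest].mean(axis=1)


def true_f(x):
    """Hypothetical regression function used only to generate example data."""
    return np.sin(2 * np.pi * x)


rng = np.random.default_rng(0)
n = 2000
x_train = rng.uniform(0.0, 1.0, n)
y_train = true_f(x_train) + rng.normal(0.0, 0.3, n)  # noisy observations of true_f

x_query = np.array([0.25, 0.5, 0.75])
pred = knn_regress(x_train, y_train, x_query, k=50)
```

With a large sample and a moderate k, the values in `pred` should land near `true_f(x_query)`. Shrinking k makes the estimate noisier, while growing k without growing the sample widens the neighborhood and biases the average.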
If you’re interested in learning more, The Elements of Statistical Learning, by Trevor Hastie, Robert Tibshirani, and Jerome Friedman, is a great resource. Thank you for reading!