In this post, we will discuss some of the statistical decision theory that provides the framework for developing machine learning models.
Let’s get started!
If we consider a real-valued random input vector, X, and a real-valued random output variable, Y, the goal is to find a function f(X) for predicting the value of Y. This requires a loss function, L(Y, f(X)), which allows us to penalize errors in prediction. One example of a commonly used loss function is the squared error loss:

L(Y, f(X)) = (Y − f(X))^2
The loss function is the squared difference between the true outcome values and our predictions. If f(X) = Y, meaning our predictions equal the true outcome values, the loss is zero. So we'd like to choose a function f(X) that gives us values as close to Y as possible.
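To make this concrete, here is a minimal sketch in Python (the arrays below are toy values chosen for illustration, not from any dataset) showing that the squared error loss is zero exactly when the predictions match the outcomes:

```python
import numpy as np

def squared_error_loss(y, f_x):
    # L(Y, f(X)) = (Y - f(X))^2, evaluated elementwise
    return (y - f_x) ** 2

y = np.array([1.0, 2.0, 3.0])        # true outcomes
f_good = np.array([1.0, 2.0, 3.0])   # predictions equal to the outcomes
f_off = np.array([1.5, 2.0, 2.0])    # predictions that miss two outcomes

print(squared_error_loss(y, f_good))  # [0. 0. 0.]
print(squared_error_loss(y, f_off))   # [0.25 0.   1.  ]
```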
Given our loss function, we have a criterion for selecting f(X). We can calculate the expected squared prediction error by integrating the loss function over x and y:

EPE(f) = E[(Y − f(X))^2] = ∫ [y − f(x)]^2 P(dx, dy)
where P(X, Y) is the joint probability distribution of the input and output. We can then condition on X and calculate the expected squared prediction error as follows:

EPE(f) = E_X E_{Y|X}[(Y − f(X))^2 | X]
We can then minimize this expected squared prediction error pointwise, by finding the values, c, which minimize the error given X:

f(x) = argmin_c E_{Y|X}[(Y − c)^2 | X = x]
The solution to this is:

f(x) = E[Y | X = x]
This is the conditional expectation of Y, given X = x. Put another way, the regression function gives the conditional mean of Y, given our knowledge of X. Interestingly, the k-nearest neighbors method is a direct attempt at implementing this from training data. With nearest neighbors, for each x, we can ask for the average of the y's where the input, x, equals a specific value. Our estimator for Y can then be written as:

f̂(x) = Ave(y_i | x_i ∈ N_k(x))
where N_k(x) is the neighborhood containing the k points closest to the target point x. Two approximations are happening here: we are taking the average over sample data to estimate the expected value, and we are relaxing conditioning at a point to conditioning on a region of the k neighbors closest to the target point. As the sample size gets larger, the points in the neighborhood are likely to be close to x. Additionally, as the number of neighbors, k, gets larger, the average becomes more stable.
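Here is a minimal sketch of this estimator in Python, on synthetic one-dimensional data where the true regression function is known to be E[Y | X = x] = sin(x); the data-generating process, the knn_regress helper, and the parameter choices are all assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: Y = sin(X) + Gaussian noise, so the true
# regression function is E[Y | X = x] = sin(x).
n = 2000
x_train = rng.uniform(0.0, 2.0 * np.pi, size=n)
y_train = np.sin(x_train) + rng.normal(scale=0.3, size=n)

def knn_regress(x0, x_train, y_train, k):
    # f_hat(x0) = Ave(y_i | x_i in N_k(x0)): average the y's of the
    # k training points whose inputs are closest to the target x0.
    nearest = np.argsort(np.abs(x_train - x0))[:k]
    return y_train[nearest].mean()

x0 = 1.0
for k in (1, 10, 100):
    print(f"k={k:>3}: f_hat(x0) = {knn_regress(x0, x_train, y_train, k):.3f}")
print(f"true conditional mean: sin(1.0) = {np.sin(x0):.3f}")
```

With small k the estimate bounces around with the noise in the few nearest y's; with larger k the neighborhood average settles down near sin(1.0), matching the intuition above about stability.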
If you're interested in learning more, The Elements of Statistical Learning, by Trevor Hastie, Robert Tibshirani, and Jerome Friedman, is a great resource. Thank you for reading!