L1 and L2 Regularization — Explained


How to control the complexity of a model


Overfitting is a significant issue in data science that needs to be handled carefully in order to build a robust and accurate model. Overfitting arises when a model fits the training data so closely that it cannot generalize to new observations. An overfit model captures the details and noise in the training data rather than the general trend. As a result, even slight changes in a feature can greatly change the model's output. Overfit models appear outstanding on training data but perform poorly on new, previously unseen observations.

The main cause of overfitting is model complexity. Thus, we can prevent a model from overfitting by controlling its complexity, which is exactly what regularization does. Regularization controls model complexity by adding a penalty on the model's weights to the loss function. With a regularization term added, the model tries to minimize both the loss and its own complexity.

In this post, I will cover two commonly used regularization techniques: L1 and L2 regularization. The two main drivers of model complexity are:

  • Total number of features (handled by L1 regularization), or
  • The weights of features (handled by L2 regularization)

L1 Regularization

It is also called regularization for sparsity. As the name suggests, it is used to handle sparse feature vectors, which consist mostly of zeros. Sparse vectors typically arise in very high-dimensional feature spaces, where a model becomes very difficult to handle.

L1 regularization forces the weights of uninformative features to zero by subtracting a small, constant amount from each weight at every iteration, eventually driving the weight to exactly zero.

L1 regularization penalizes |weight|.
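To see this sparsity effect in practice, here is a minimal sketch using scikit-learn's Lasso (L1-regularized linear regression). The synthetic dataset and the alpha value are illustrative assumptions, not from the original post:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data: 20 features, but only 5 actually carry signal
X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=42)

# Lasso applies an L1 penalty; alpha is the regularization rate (lambda)
lasso = Lasso(alpha=1.0)
lasso.fit(X, y)

# Most weights of uninformative features are driven to exactly zero
print("non-zero weights:", np.sum(lasso.coef_ != 0), "of", X.shape[1])
```

With enough regularization, the surviving non-zero weights typically correspond to the informative features, which is exactly the feature-selection behavior described above.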

L2 Regularization

It is also called regularization for simplicity. If we define model complexity as a function of the weights, the complexity contributed by a feature is proportional to the square of its weight.

L2 regularization forces weights toward zero, but it does not make them exactly zero. L2 regularization acts like a force that removes a small percentage of each weight at every iteration; therefore, the weights never reach exactly zero.

L2 regularization penalizes (weight)².
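The difference between the two update behaviors can be seen in a plain gradient-descent step. The following is a hypothetical sketch of the penalty gradients alone, with the data-loss term ignored so the effect of each penalty is isolated; the learning rate and lambda values are illustrative assumptions:

```python
# Effect of each penalty's gradient on a single weight, ignoring the
# data-loss term (all values are illustrative assumptions)
w_l1, w_l2 = 0.5, 0.5
lr, lam = 0.1, 0.5

for _ in range(100):
    # L1 gradient is constant (the sign of w): a fixed amount is
    # subtracted each step, so the weight can reach exactly zero
    if w_l1 != 0:
        w_l1 = w_l1 - lr * lam * (1 if w_l1 > 0 else -1)
        w_l1 = 0.0 if abs(w_l1) < lr * lam else w_l1  # clip past-zero step

    # L2 gradient is proportional to w: a percentage is removed each
    # step, so the weight shrinks but never reaches exactly zero
    w_l2 = w_l2 - lr * lam * 2 * w_l2

print(w_l1, w_l2)  # -> 0.0 vs. a tiny non-zero value
```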

There is an additional parameter for tuning the L2 regularization term, called the regularization rate (lambda): a scalar that the regularization term is multiplied by before being added to the loss.

Note: Choosing an optimal value for lambda is important. If lambda is too high, the model becomes too simple and tends to underfit. On the other hand, if lambda is too low, the effect of regularization becomes negligible and the model is likely to overfit. If lambda is set to zero, regularization is removed completely (high risk of overfitting!).
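As a rough illustration of this trade-off, the sketch below fits scikit-learn's Ridge (L2-regularized linear regression) with a few values of alpha, which is scikit-learn's name for the regularization rate; the data and the alpha values are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=200, n_features=20, noise=10.0,
                       random_state=42)

# Larger alpha (lambda) -> stronger penalty -> smaller weights overall;
# alpha=0 removes regularization entirely (plain least squares)
for alpha in [0.0, 1.0, 100.0, 10000.0]:
    ridge = Ridge(alpha=alpha).fit(X, y)
    print(f"alpha={alpha:>8}: weight norm = {np.linalg.norm(ridge.coef_):.2f}")
```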

Note: Ridge regression uses L2 regularization whereas Lasso regression uses L1 regularization. Elastic net regression combines L1 and L2 regularization.
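For completeness, here is a minimal sketch of all three estimators in scikit-learn; the alpha and l1_ratio values are illustrative assumptions:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=42)

models = {
    "Ridge (L2)": Ridge(alpha=1.0),
    "Lasso (L1)": Lasso(alpha=1.0),
    # ElasticNet mixes both penalties; l1_ratio sets the L1/L2 balance
    "ElasticNet (L1+L2)": ElasticNet(alpha=1.0, l1_ratio=0.5),
}

for name, model in models.items():
    model.fit(X, y)
    zeros = (model.coef_ == 0).sum()
    print(f"{name}: {zeros} of {X.shape[1]} weights exactly zero")
```

Only the estimators with an L1 component produce exact zeros, matching the behavior described in the two sections above.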

Conclusion

Overfitting is a crucial issue for machine learning models and needs to be handled carefully. We build machine learning models to predict the unknown. We want a model to learn the trends in the training data and apply that knowledge when evaluating new observations. Thus, the model needs to generalize to the underlying structure rather than focus too much on the details of individual training points. Regularization helps models achieve this goal.

