Intuitively, How Can We (Better) Understand Logistic Regression



Logistic Regression and Linear Discriminant Analysis are closely related. Here is an intuitive way to understand them and to help us define Softmax Regression.

Apr 25 · 5 min read

In my previous article, I introduced 5 principles of classification that helped us define more than 5 types of algorithms.

The intuition we used for logistic regression was to “smooth the straight line”, and the smoothing function is a logistic function. Now, how can we better understand where this logistic function comes from?

As in the previous article, we will explain the principle for the 1D situation, with blue dots and red dots.

[Figure: a 1D dataset of blue dots and red dots]

How LDA and Logistic Regression are related

In order to explain LDA (Linear Discriminant Analysis), the idea is to first build two normal distributions, one for each class. For a new dot x, we can consider:

  • PDF_b(x), where PDF_b is the probability density function of the blue dots
  • PDF_r(x), where PDF_r is the probability density function of the red dots
  • p(B): the proportion of blue dots
  • p(R): the proportion of red dots

The final probability of the new dot being blue is:

p(B)×PDF_b(x)/(p(B)×PDF_b(x)+p(R)×PDF_r(x))
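
To make this concrete, here is a minimal sketch (not from the original article; the means, the shared standard deviation, and the priors are made-up values) that computes this posterior with scipy:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical 1D setup: blue dots centered at 0, red dots at 3,
# a shared standard deviation (the LDA assumption), and 50/50 priors.
mu_b, mu_r, sigma = 0.0, 3.0, 1.0
p_B, p_R = 0.5, 0.5

def proba_blue(x):
    """p(B)×PDF_b(x) / (p(B)×PDF_b(x) + p(R)×PDF_r(x))"""
    pdf_b = norm.pdf(x, loc=mu_b, scale=sigma)
    pdf_r = norm.pdf(x, loc=mu_r, scale=sigma)
    return p_B * pdf_b / (p_B * pdf_b + p_R * pdf_r)

print(proba_blue(1.5))  # 0.5: halfway between the two means
print(proba_blue(0.0))  # close to 1: deep in blue territory
```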

Now let’s look at the normal PDF:

PDF(x) = 1/(σ√(2π)) × exp(−(x−μ)²/(2σ²))

Since, in the case of LDA, we consider that the standard deviation is the same for the two classes, we can simplify: in the ratio of the two PDFs, the x² terms cancel out, and the exponent becomes linear in x. This is why we call this Linear Discriminant Analysis.

PDF_r(x)/PDF_b(x) = exp(((x−μ_b)² − (x−μ_r)²)/(2σ²)) = exp(a×x + b), with a = (μ_r−μ_b)/σ² and b = (μ_b²−μ_r²)/(2σ²)

If we don’t adopt the hypothesis of homoscedasticity (that is, the same standard deviation for the two classes), the x² term remains, and the algorithm is then called Quadratic Discriminant Analysis.

So for LDA, we end up with something like:

1/(1+exp(ax+b))

Yes, a logistic function!

Of course, the parameters a and b are different from those estimated by actual logistic regression.
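
We can verify this numerically. With a shared σ, the ratio derivation above gives a = (μ_r − μ_b)/σ² and, once the priors are folded in, b = (μ_b² − μ_r²)/(2σ²) + log(p(R)/p(B)). A small sketch (reusing the made-up means, σ, and priors from the first sketch):

```python
import numpy as np
from scipy.stats import norm

# Same hypothetical setup as in the first sketch.
mu_b, mu_r, sigma = 0.0, 3.0, 1.0
p_B, p_R = 0.5, 0.5

# Logistic parameters in closed form, derived from the shared-sigma assumption.
a = (mu_r - mu_b) / sigma**2
b = (mu_b**2 - mu_r**2) / (2 * sigma**2) + np.log(p_R / p_B)

x = np.linspace(-3.0, 6.0, 50)
logistic = 1.0 / (1.0 + np.exp(a * x + b))

# Posterior computed directly from the two normal PDFs.
num = p_B * norm.pdf(x, mu_b, sigma)
den = num + p_R * norm.pdf(x, mu_r, sigma)
print(np.allclose(logistic, num / den))  # True: the two formulas agree
```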

We can compare the results, and in this situation, we can see that the results are actually very close (the green curve being logistic regression and the black curve being LDA).

[Figure: logistic regression (green curve) vs. LDA (black curve)]

Conclusion: both LDA and Logistic Regression produce a final probability that is a logistic function. The only difference between the two approaches lies in how the parameters are estimated:

  • Logistic Regression uses maximum likelihood to estimate the parameters directly
  • for LDA, the parameters come from the mean and variance estimated for each normal distribution, together with the proportions of each class (the prior probabilities)
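
To see this in practice, here is a minimal sketch (not part of the original article) that fits both models on synthetic 1D data with scikit-learn and compares their probability curves:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

# Synthetic 1D data: blue class around 0, red class around 3, shared spread.
X = np.concatenate([rng.normal(0.0, 1.0, 200),
                    rng.normal(3.0, 1.0, 200)]).reshape(-1, 1)
y = np.array([0] * 200 + [1] * 200)   # 0 = blue, 1 = red

log_reg = LogisticRegression().fit(X, y)
lda = LinearDiscriminantAnalysis().fit(X, y)

# The two probability curves are nearly identical on data like this.
grid = np.linspace(-3.0, 6.0, 5).reshape(-1, 1)
print(log_reg.predict_proba(grid)[:, 0].round(3))  # p(blue | x)
print(lda.predict_proba(grid)[:, 0].round(3))      # p(blue | x)
```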

Simplify the Normal PDF

Since we know that the x² term in the normal PDF will go away under the hypothesis of homoscedasticity, maybe we can directly get rid of it from the beginning.

So we can directly consider:

f_b(x) = exp(a_b×x + b_b) for the blue dots, and f_r(x) = exp(a_r×x + b_r) for the red dots

We can test some parameters in order to draw the curves. Let’s begin with the blue curve f_b(x):

  • to begin with, we can let a_b = 1 (the parameter a for the blue dots)
  • for b_b, we can say that the curve should pass through the point (x = mean of blue dots, y = 1)

And we consider that the situation is symmetric for the red curve:

  • a_r = -1
  • and the red curve should pass through the point (x = mean of red dots, y = 1)

[Figure: the blue curve f_b and the red curve f_r, each passing through y = 1 at its class mean]

Computation of the ratios

When we calculate the ratio to get the final probability:

f_b(x)/(f_b(x) + f_r(x))

we also end up with a logistic function.
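
Here is a small sketch of this construction (the class means are made-up values): with a_b = 1, a_r = -1, and each curve forced to equal 1 at its class mean, the ratio works out to a logistic curve.

```python
import numpy as np

# Hypothetical class means (made-up values for illustration).
mu_b, mu_r = 0.0, 3.0

def f_b(x):
    # a_b = 1, intercept chosen so that f_b(mu_b) = 1
    return np.exp(x - mu_b)

def f_r(x):
    # a_r = -1, intercept chosen so that f_r(mu_r) = 1
    return np.exp(-(x - mu_r))

x = np.linspace(-3.0, 6.0, 50)
ratio = f_b(x) / (f_b(x) + f_r(x))  # the black line in the graphs

# The same curve in closed form: a logistic function.
print(np.allclose(ratio, 1.0 / (1.0 + np.exp(-2.0 * x + mu_b + mu_r))))  # True
```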

Below we can see the ratio (black line). Remember that the parameters a and b here are chosen manually.

[Figure: the ratio curve (black line)]

Even so, we can see that it is actually not that bad in the given situation. The green line in the graph below is the logistic regression model, while the black line is the ratio calculated with our manually chosen parameters.

[Figure: logistic regression (green line) vs. the ratio with manually chosen parameters (black line)]

Conclusion: logistic regression is a normalized exponential function (defined by the two classes).

Softmax regression

With the intuition of “smoothing the straight line”, it is not easy to generalize to the situation of multiple prediction classes. But with the idea of a normalized exponential function, we can simply add more classes.

For K classes, we can consider this normalized exponential function to estimate the probability that x belongs to class j:

P(class j | x) = exp(a_j×x + b_j) / (exp(a_1×x + b_1) + … + exp(a_K×x + b_K))
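
A minimal sketch of this function (the parameters a_j and b_j below are made-up values for illustration):

```python
import numpy as np

def softmax_proba(x, a, b):
    """P(class j | x) = exp(a_j*x + b_j) / sum_k exp(a_k*x + b_k), for scalar x."""
    scores = a * x + b          # one linear score per class
    scores -= scores.max()      # subtract the max for numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()

# Three classes with hypothetical parameters.
a = np.array([-1.0, 0.0, 1.0])
b = np.array([0.0, 1.0, -2.0])

p = softmax_proba(0.5, a, b)
print(p)        # one probability per class
print(p.sum())  # 1.0
```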

Here is a graph with 3 classes:

[Figure: the three softmax probability curves]

This is called softmax regression, and now you know that behind this fancy name, it is just a very simple generalization of logistic regression.

Since logistic regression is very close to LDA, the result of softmax regression should also be close to multiclass LDA, as is the case in the graph below:

[Figure: softmax regression vs. multiclass LDA on 3 classes]
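
As a quick numerical check (again a sketch on synthetic data, not from the original article), scikit-learn's LogisticRegression, which fits a multinomial (softmax) model with its default solver, gives probabilities very close to LinearDiscriminantAnalysis on three Gaussian classes:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)

# Three 1D Gaussian classes with a shared spread (made-up means).
X = np.concatenate([rng.normal(m, 1.0, 150)
                    for m in (-3.0, 0.0, 3.0)]).reshape(-1, 1)
y = np.repeat([0, 1, 2], 150)

softmax_reg = LogisticRegression().fit(X, y)  # multinomial with the default solver
lda = LinearDiscriminantAnalysis().fit(X, y)

grid = np.array([[-2.0], [0.0], [2.0]])
print(softmax_reg.predict_proba(grid).round(3))
print(lda.predict_proba(grid).round(3))       # close to the softmax probabilities
```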

