DS INTO THE REAL WORLD
What makes Logistic Regression a Classification Algorithm?
Log Odds, the baseline of Logistic Regression explained.
Jul 3 · 6 min read
Logistic regression is a statistical model that in its basic form uses a logistic function to model a binary dependent variable, although many more complex extensions exist.
— Wikipedia.
— All the images (plots) were generated and modified by the author.
For almost every data practitioner, Linear Regression is the starting point when learning Machine Learning: it teaches you to predict a continuous value from a given set of independent variables.
Why Logistic, not Linear?
Let us start with the most basic case, Binary Classification, where the model should predict the dependent variable as one of two possible classes, 0 or 1. If we used Linear Regression, the model would output continuous values such as 0.03, +1.2, or -0.9 for a given set of input features. Such values can neither be assigned directly to one of the two classes nor be interpreted as a probability of belonging to a class.
E.g. when we have to predict whether a website is malicious, given the length of its URL as a feature, the response variable has two values: benign and malicious.
If we try to fit a Linear Regression model to this binary classification problem, the model fit will be a straight line, and it is easy to see why that is unsuitable.
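A quick numeric sketch makes the problem concrete. Below is a minimal Python example (the article itself contains no code, so the language and the toy URL-length data are my own illustrative assumptions): an ordinary least-squares line fitted to 0/1 labels happily predicts values outside [0, 1].

```python
import numpy as np

# Toy binary data: URL length (feature) vs. label (0 = benign, 1 = malicious).
# The numbers are made up purely for illustration.
url_length = np.array([10.0, 15.0, 20.0, 60.0, 80.0, 100.0])
label = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Ordinary least-squares fit: label ~ b0 + b1 * url_length
X = np.column_stack([np.ones_like(url_length), url_length])
b0, b1 = np.linalg.lstsq(X, label, rcond=None)[0]

# Predictions outside the training range escape [0, 1],
# so they cannot be read as class probabilities.
for x in (1.0, 50.0, 200.0):
    print(x, b0 + b1 * x)
```

Running this shows a prediction below 0 for a very short URL and above 1 for a very long one, which is exactly why a straight line is the wrong tool here.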
To overcome this problem, we use a sigmoid function, which fits an S-shaped curve to the data and yields a much better model.
Logistic/Sigmoid Function
Logistic Regression can be explained with the Logistic function, also known as the Sigmoid function, which takes any real input x and outputs a probability value between 0 and 1. It is defined as,

p(x) = 1 / (1 + e^(-x))
The model fit using the above Logistic function can be seen below:
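The behaviour of the sigmoid is easy to check numerically. A minimal sketch in Python (my own illustration, not code from the article):

```python
import math

def sigmoid(x):
    """Logistic function: maps any real x to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Large negative inputs approach 0, large positive inputs approach 1,
# and the midpoint x = 0 gives exactly 0.5.
for x in (-6.0, -1.0, 0.0, 1.0, 6.0):
    print(x, sigmoid(x))
```

However large the input grows in either direction, the output stays strictly between 0 and 1, which is what lets us read it as a probability.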
Further, let us write the input t of the logistic function as a linear function of a single explanatory variable x, as in a univariate regression model, where β0 is the intercept and β1 is the slope:

t = β0 + β1x
The general Logistic function p, which outputs a value between 0 and 1, then becomes,

p(x) = 1 / (1 + e^(-(β0 + β1x)))
We can see that data separable into two classes can be modelled by applying a Logistic function to a linear function of the input variable. However, the relation between the input variable x and the output probability, mediated by the sigmoid, is not easy to interpret directly. We therefore introduce the Logit (log-odds) function, which makes this model interpretable in a linear fashion.
Logit (Log-Odds) Function
The Log-odds function, a.k.a. the natural logarithm of the odds, is the inverse of the standard Logistic function. It can be defined and further simplified as,

g(p(x)) = ln( p(x) / (1 - p(x)) ) = β0 + β1x
In the above equation, the terms are as follows:
- g is the logit function; the equation for g(p(x)) shows that the logit is equivalent to the linear regression expression
- ln denotes the natural logarithm
- p(x) is the probability of the dependent variable that falls in one of the two classes 0 or 1, given some linear combination of the predictors
- β0 is the intercept from the linear regression equation
- β1 is the regression coefficient multiplied by some value of the predictor
On further simplifying the above equation and exponentiating both sides, we can deduce the relationship between the probability and the linear model as,

p(x) / (1 - p(x)) = e^(β0 + β1x)
The left-hand term is called the odds, and it is equal to the exponential of the linear regression expression. Taking ln (log base e) of both sides, we can interpret the relation between the log-odds and the independent variable x as linear.
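This linearity can be verified directly in code. A short Python sketch (the coefficients β0 = -2 and β1 = 0.5 are hypothetical values chosen only for illustration):

```python
import math

# Hypothetical coefficients for a univariate logistic model (illustrative only).
beta0, beta1 = -2.0, 0.5

def p(x):
    """Probability from the logistic model."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))

def logit(prob):
    """Log-odds: the inverse of the sigmoid."""
    return math.log(prob / (1.0 - prob))

# The log-odds recover the linear expression beta0 + beta1 * x exactly,
# so a one-unit increase in x shifts the log-odds by exactly beta1.
for x in (0.0, 1.0, 2.0):
    print(x, logit(p(x)), beta0 + beta1 * x)
```

The two printed columns match: while p(x) changes by a different amount at each step, the log-odds always move by β1 per unit of x, which is what makes the coefficients interpretable.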
Why Regression?
The change in probability p(x) with a change in variable x cannot be understood directly, as it is mediated by the sigmoid function. But from the above expression, we can see that the change in the log-odds is linear with respect to a change in x itself. The plot of the log-odds against the linear equation can be seen as,
The value of the linear regression expression can vary from negative to positive infinity, and yet, after transformation with the sigmoid function, the resulting probability p(x) always lies between 0 and 1, i.e. 0 < p(x) < 1. This is what makes Logistic Regression a classification algorithm despite being a regression: it maps the output of a linear regression to a probability and assigns it to a particular class depending on the decision boundary.
Decision Boundary
The decision boundary is defined as a threshold value that helps us to classify the predicted probability value given by sigmoid function into a particular class, positive or negative.
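A minimal sketch of thresholding in Python (again with hypothetical coefficients; with the default threshold of 0.5, the boundary sits exactly where the linear expression crosses zero):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def classify(x, beta0, beta1, threshold=0.5):
    """Classify by comparing the sigmoid output against a decision threshold."""
    return 1 if sigmoid(beta0 + beta1 * x) >= threshold else 0

# Hypothetical coefficients: with threshold 0.5 the boundary is where
# beta0 + beta1 * x = 0, i.e. x = -beta0 / beta1 = 4 in this example.
beta0, beta1 = -2.0, 0.5
print(classify(3.0, beta0, beta1))  # below the boundary -> class 0
print(classify(5.0, beta0, beta1))  # above the boundary -> class 1
```

The threshold need not be 0.5: in applications where one kind of error is costlier (say, missing a malicious website), it can be moved to trade precision against recall.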
Linear Decision Boundary
When two or more classes are linearly separable,
Non-Linear Boundary
When two or more classes are not linearly separable,
Multi-Class Classification
The basic intuition behind Multi-Class and Binary Logistic Regression is the same. However, for a multi-class classification problem, we follow a one-vs-all classification. If there are multiple independent variables, the traditional equation is modified as,

ln( p(x) / (1 - p(x)) ) = β0 + β1x1 + β2x2 + … + βmxm

Here, the log-odds is linearly related to the multiple independent variables present: the linear regression has become a multiple regression with m explanators.
E.g. if we have to predict whether the weather is sunny, rainy, or windy, we are dealing with a multi-class problem. We turn it into three binary classification problems: whether it is sunny or not, whether it is rainy or not, and whether it is windy or not. We run all three classifiers independently on the input features, and the class whose predicted probability is the highest relative to the others becomes the solution.
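The one-vs-all scheme described above can be sketched in a few lines of Python. The three coefficient pairs below are hypothetical stand-ins for three independently fitted binary models; in practice each would be learned from data.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical coefficients for three independently fitted binary models
# (sunny-vs-rest, rainy-vs-rest, windy-vs-rest); values are illustrative.
models = {
    "sunny": (-1.0, 0.8),   # (beta0, beta1)
    "rainy": (0.5, -0.6),
    "windy": (-0.2, 0.1),
}

def predict(x):
    """One-vs-all: score each binary model, return the class with the
    highest predicted probability."""
    scores = {label: sigmoid(b0 + b1 * x) for label, (b0, b1) in models.items()}
    return max(scores, key=scores.get)

print(predict(3.0))
```

Note that the three probabilities need not sum to 1, since each model was trained independently; only their relative order matters for picking the winning class.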
Conclusion
Logistic Regression is one of the simplest Machine Learning models. It is easy to understand, interpretable, and can give quite good results. Every practitioner using Logistic Regression should know about the log-odds, the main concept behind this learning algorithm. Logistic Regression is also highly interpretable with respect to business needs, since it explains how the model's output relates to each of the independent variables used. This post aimed to provide an easy way to understand the regression behind, and the transparency provided by, Logistic Regression.