DS INTO THE REAL WORLD
What makes Logistic Regression a Classification Algorithm?
Log Odds, the baseline of Logistic Regression explained.
Jul 3 · 6 min read
Logistic regression is a statistical model that in its basic form uses a logistic function to model a binary dependent variable, although many more complex extensions exist.
— Wikipedia.
— All the images (plots) were generated and modified by the author.
For almost every data practitioner, Linear Regression is the starting point when learning Machine Learning: it teaches you to predict a continuous value from a given set of independent variables.
Why Logistic, not Linear?
Let us start with the most basic case, Binary Classification, where the model should predict the dependent variable as one of two possible classes, 0 or 1. If we used Linear Regression, the model would output continuous values such as 0.03, +1.2, or -0.9 for a given set of input features. Such values can neither be assigned directly to one of the two classes nor be interpreted as a probability of belonging to a class.
E.g. when we have to predict whether a website is malicious, given the length of its URL as a feature, the response variable has two values: benign and malicious.
If we try to fit a Linear Regression model to this binary classification problem, the model fit will be a straight line, and it is easy to see why that is unsuitable.
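A quick numeric sketch makes the problem concrete. Below is a minimal Python example (the article itself contains no code, so the language and the toy URL-length data are my own illustrative assumptions): an ordinary least-squares line fitted to 0/1 labels happily predicts values outside [0, 1].

```python
import numpy as np

# Toy binary data: URL length (feature) vs. label (0 = benign, 1 = malicious).
# The numbers are made up purely for illustration.
url_length = np.array([10.0, 15.0, 20.0, 60.0, 80.0, 100.0])
label = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Ordinary least-squares fit: label ~ b0 + b1 * url_length
X = np.column_stack([np.ones_like(url_length), url_length])
b0, b1 = np.linalg.lstsq(X, label, rcond=None)[0]

# Predictions outside the training range escape [0, 1],
# so they cannot be read as class probabilities.
for x in (1.0, 50.0, 200.0):
    print(x, b0 + b1 * x)
```

Running this shows a prediction below 0 for a very short URL and above 1 for a very long one, which is exactly why a straight line is the wrong tool here.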
To overcome this problem, we use a sigmoid function, which fits an S-shaped curve to the data and yields a much better model.
Logistic/Sigmoid Function
Logistic Regression can be explained with the Logistic function, also known as the Sigmoid function, which takes any real input x and outputs a probability value between 0 and 1. It is defined as,

p(x) = 1 / (1 + e^(-x))
The model fit using the above Logistic function can be seen below:
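The behaviour of the sigmoid is easy to check numerically. A minimal sketch in Python (my own illustration, not code from the article):

```python
import math

def sigmoid(x):
    """Logistic function: maps any real x to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Large negative inputs approach 0, large positive inputs approach 1,
# and the midpoint x = 0 gives exactly 0.5.
for x in (-6.0, -1.0, 0.0, 1.0, 6.0):
    print(x, sigmoid(x))
```

However large the input grows in either direction, the output stays strictly between 0 and 1, which is what lets us read it as a probability.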
Further, let us write the input t of the logistic function as a linear function of a single explanatory variable x, as in a univariate regression model, where β0 is the intercept and β1 is the slope:

t = β0 + β1x
The general Logistic function p, which outputs a value between 0 and 1, then becomes,

p(x) = 1 / (1 + e^(-(β0 + β1x)))
We can see that data separable into two classes can be modelled by applying a Logistic function to a linear function of the input variable. However, the relation between the input variable x and the output probability, mediated by the sigmoid, is not easy to interpret directly. We therefore introduce the Logit (log-odds) function, which makes this model interpretable in a linear fashion.
Logit (Log-Odds) Function
The Log-odds function, a.k.a. the natural logarithm of the odds, is the inverse of the standard Logistic function. It can be defined and further simplified as,

g(p(x)) = ln( p(x) / (1 - p(x)) ) = β0 + β1x
In the above equation, the terms are as follows:
- g is the logit function; the equation for g(p(x)) shows that the logit is equivalent to the linear regression expression
- ln denotes the natural logarithm
- p(x) is the probability of the dependent variable that falls in one of the two classes 0 or 1, given some linear combination of the predictors
- β0 is the intercept from the linear regression equation
- β1 is the regression coefficient multiplied by some value of the predictor
On further simplifying the above equation and exponentiating both sides, we can deduce the relationship between the probability and the linear model as,

p(x) / (1 - p(x)) = e^(β0 + β1x)
The left-hand term is called the odds, and it is equal to the exponential of the linear regression expression. Taking ln (log base e) of both sides, we can interpret the relation between the log-odds and the independent variable x as linear.
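This linearity can be verified directly in code. A short Python sketch (the coefficients β0 = -2 and β1 = 0.5 are hypothetical values chosen only for illustration):

```python
import math

# Hypothetical coefficients for a univariate logistic model (illustrative only).
beta0, beta1 = -2.0, 0.5

def p(x):
    """Probability from the logistic model."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))

def logit(prob):
    """Log-odds: the inverse of the sigmoid."""
    return math.log(prob / (1.0 - prob))

# The log-odds recover the linear expression beta0 + beta1 * x exactly,
# so a one-unit increase in x shifts the log-odds by exactly beta1.
for x in (0.0, 1.0, 2.0):
    print(x, logit(p(x)), beta0 + beta1 * x)
```

The two printed columns match: while p(x) changes by a different amount at each step, the log-odds always move by β1 per unit of x, which is what makes the coefficients interpretable.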
Why Regression?
The change in probability p(x) with a change in variable x cannot be understood directly, as it is mediated by the sigmoid function. But from the above expression, we can see that the change in the log-odds is linear with respect to a change in x itself. The plot of the log-odds against the linear equation can be seen as,
The value of the linear regression expression can vary from negative to positive infinity, and yet, after transformation with the sigmoid function, the resulting probability p(x) always lies between 0 and 1, i.e. 0 < p(x) < 1. This is what makes Logistic Regression a classification algorithm despite being a regression: it maps the output of a linear regression to a probability and assigns it to a particular class depending on the decision boundary.
Decision Boundary
The decision boundary is defined as a threshold value that helps us to classify the predicted probability value given by sigmoid function into a particular class, positive or negative.
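A minimal sketch of thresholding in Python (again with hypothetical coefficients; with the default threshold of 0.5, the boundary sits exactly where the linear expression crosses zero):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def classify(x, beta0, beta1, threshold=0.5):
    """Classify by comparing the sigmoid output against a decision threshold."""
    return 1 if sigmoid(beta0 + beta1 * x) >= threshold else 0

# Hypothetical coefficients: with threshold 0.5 the boundary is where
# beta0 + beta1 * x = 0, i.e. x = -beta0 / beta1 = 4 in this example.
beta0, beta1 = -2.0, 0.5
print(classify(3.0, beta0, beta1))  # below the boundary -> class 0
print(classify(5.0, beta0, beta1))  # above the boundary -> class 1
```

The threshold need not be 0.5: in applications where one kind of error is costlier (say, missing a malicious website), it can be moved to trade precision against recall.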
Linear Decision Boundary
When two or more classes are linearly separable,
Non-Linear Boundary
When two or more classes are not linearly separable,
Multi-Class Classification
The basic intuition behind Multi-Class and Binary Logistic Regression is the same. However, for a multi-class classification problem, we follow a one-vs-all classification. If there are multiple independent variables, the traditional equation is modified as,

ln( p(x) / (1 - p(x)) ) = β0 + β1x1 + β2x2 + … + βmxm

Here, the log-odds is linearly related to the multiple independent variables present: the linear regression has become a multiple regression with m explanators.
E.g. if we have to predict whether the weather is sunny, rainy, or windy, we are dealing with a multi-class problem. We turn it into three binary classification problems: whether it is sunny or not, whether it is rainy or not, and whether it is windy or not. We run all three classifiers independently on the input features, and the class whose predicted probability is the highest relative to the others becomes the solution.
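The one-vs-all scheme described above can be sketched in a few lines of Python. The three coefficient pairs below are hypothetical stand-ins for three independently fitted binary models; in practice each would be learned from data.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical coefficients for three independently fitted binary models
# (sunny-vs-rest, rainy-vs-rest, windy-vs-rest); values are illustrative.
models = {
    "sunny": (-1.0, 0.8),   # (beta0, beta1)
    "rainy": (0.5, -0.6),
    "windy": (-0.2, 0.1),
}

def predict(x):
    """One-vs-all: score each binary model, return the class with the
    highest predicted probability."""
    scores = {label: sigmoid(b0 + b1 * x) for label, (b0, b1) in models.items()}
    return max(scores, key=scores.get)

print(predict(3.0))
```

Note that the three probabilities need not sum to 1, since each model was trained independently; only their relative order matters for picking the winning class.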
Conclusion
Logistic Regression is one of the simplest Machine Learning models. It is easy to understand, interpretable, and can give quite good results. Every practitioner using Logistic Regression should know about the log-odds, the main concept behind this learning algorithm. Logistic Regression is also highly interpretable with respect to business needs, since it explains how the model's output relates to each of the independent variables used. This post aimed to provide an easy way to understand the regression behind, and the transparency provided by, Logistic Regression.