faq: Why is logistic regression considered a linear model?
rasbt committed Feb 11, 2016
1 parent 8a0d57a commit 0222925
Showing 6 changed files with 45 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -151,6 +151,7 @@ I have set up a separate library, [`mlxtend`](http://rasbt.github.io/mlxtend/),

##### Logistic Regression

- [Why is logistic regression considered a linear model?](./faq/logistic_regression_linear.md)
- [What is the probabilistic interpretation of regularized logistic regression?](./faq/probablistic-logistic-regression.md)
- [Does regularization in logistic regression always result in better fit and better generalization?](./faq/regularized-logistic-regression-performance.md)
- [What is the major difference between naive Bayes and logistic regression?](./faq/naive-bayes-vs-logistic-regression.md)
1 change: 1 addition & 0 deletions faq/README.md
@@ -74,6 +74,7 @@ Sebastian

##### Logistic Regression

- [Why is logistic regression considered a linear model?](./logistic_regression_linear.md)
- [What is the probabilistic interpretation of regularized logistic regression?](./probablistic-logistic-regression.md)
- [Does regularization in logistic regression always result in better fit and better generalization?](./regularized-logistic-regression-performance.md)
- [What is the major difference between naive Bayes and logistic regression?](./naive-bayes-vs-logistic-regression.md)
43 changes: 43 additions & 0 deletions faq/logistic_regression_linear.md
@@ -0,0 +1,43 @@
# Why is logistic regression considered a linear model?

The short answer is: Logistic regression is considered a linear model because the outcome **always** depends on the weighted **sum** of the inputs. Or in other words, the output **cannot** depend on the product (or quotient, etc.) of the input features!

So, why is that? Let’s recapitulate the basics of logistic regression first, which hopefully makes things clearer. Logistic regression is an algorithm that learns a model for binary classification. A nice side-effect is that it gives us the *probability* that a sample belongs to class 1 (or vice versa: class 0). To obtain that probability, the model passes its net input *z* through the so-called logistic function Φ (a certain kind of sigmoid function); it looks like this:

![](./logistic_regression_linear/2.png)

Now, if *Φ(z)* is larger than *0.5* (alternatively: if *z* is larger than *0*), we classify an input as class 1 (and class 0, otherwise). This logistic (activation) function doesn't look very linear at all, right!? So, let's dig a bit deeper and take a look at the equation we use to compute *z* -- the net input function!
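
The decision rule above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `logistic` and `predict` are made up for this example and do not come from any library.

```python
import math

def logistic(z):
    """Logistic sigmoid: maps a real-valued net input z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(z, threshold=0.5):
    """Return class 1 if phi(z) exceeds the threshold, else class 0."""
    return 1 if logistic(z) > threshold else 0

# phi(z) > 0.5 holds exactly when z > 0, so both rules give the same label
for z in (-2.0, -0.1, 0.0, 0.1, 2.0):
    assert predict(z) == (1 if z > 0 else 0)
```

Because Φ is monotonically increasing and Φ(0) = 0.5, thresholding the probability at 0.5 and thresholding *z* at 0 are interchangeable.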

![](./logistic_regression_linear/1.png)

The net input function is simply the dot product of our input features and the respective model coefficients **w**:

![](./logistic_regression_linear/3.png)

Here, x<sub>0</sub> refers to the input of the bias unit, which is always equal to 1, so its weight w<sub>0</sub> acts as the bias (a detail we don’t have to worry about here). I know, mathematical equations can be a bit "abstract" at times, so let's look at a concrete example.

Let's assume we have a sample training point **x** consisting of 4 features (e.g., *sepal length*, *sepal width*, *petal length*, and *petal width* in the [*Iris dataset*](https://archive.ics.uci.edu/ml/datasets/Iris)):

x = [1, 2, 3, 4]

Now, let's assume our weight vector looks like this:

w = [0.5, 0.5, 0.5, 0.5]

Let's compute *z* now!

z = w<sup>T</sup>x = 1 × 0.5 + 2 × 0.5 + 3 × 0.5 + 4 × 0.5 = 5
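
The same computation as a small Python sketch, using the example values of **x** and **w** from above (the helper name `net_input` is an assumption made for this illustration, and the bias term is left out just as in the example):

```python
x = [1, 2, 3, 4]          # feature values from the example
w = [0.5, 0.5, 0.5, 0.5]  # weight vector from the example

def net_input(w, x):
    """Dot product w^T x: a weighted sum with one term per feature."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x))

z = net_input(w, x)
print(z)  # 5.0
```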

---

Not that it is important, but we have a 99.3% chance that this sample belongs to class 1:
*&Phi;(z) = 1 / (1 + e<sup>-5</sup>) &approx; 0.993*
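
As a quick check of that number, here is a minimal sketch that evaluates the logistic function at *z = 5* with Python's standard `math` module:

```python
import math

z = 5.0                            # net input from the example
phi = 1.0 / (1.0 + math.exp(-z))   # logistic function evaluated at z
print(round(phi, 4))  # 0.9933
```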

---

The key is that our model is ***additive***: our outcome *z* depends on a sum of weighted values, e.g.:

*z = w<sub>1</sub>x<sub>1</sub> + w<sub>2</sub>x<sub>2</sub>*

There's no interaction between input values, nothing like x<sub>1</sub> × x<sub>2</sub> or so, which would make our model non-linear!
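
To make the difference concrete, the following sketch compares an additive net input with a hypothetical one that contains a product term. Both functions and their default weights are assumptions chosen for illustration; only the additive form corresponds to the model described here.

```python
def z_additive(x1, x2, w1=0.5, w2=0.5):
    """Additive net input: each feature contributes its own weighted term."""
    return w1 * x1 + w2 * x2

def z_interaction(x1, x2, w=0.5):
    """Hypothetical net input with a product term (an interaction)."""
    return w * x1 * x2

# Effect on z of raising x1 from 1 to 2, measured at two values of x2
for x2 in (1.0, 10.0):
    d_add = z_additive(2.0, x2) - z_additive(1.0, x2)
    d_int = z_interaction(2.0, x2) - z_interaction(1.0, x2)
    print(x2, d_add, d_int)
```

In the additive case the change in *z* is 0.5 at both values of x<sub>2</sub>. With the product term it is 0.5 at x<sub>2</sub> = 1 but 5.0 at x<sub>2</sub> = 10, so the effect of one feature depends on the other.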
Binary file added faq/logistic_regression_linear/1.png
Binary file added faq/logistic_regression_linear/2.png
Binary file added faq/logistic_regression_linear/3.png
