
Logistic Regression

  • ali@fuzzywireless.com
  • Mar 3, 2022
  • 2 min read

Both linear and logistic regression fall under the family of generalized linear models (Wake Forest School of Medicine, 2019). The choice of model depends on the outcome variable: if the outcome is continuous, linear regression is the model of choice; if the outcome is dichotomous, logistic regression is the model of choice; and if the outcome is a count, Poisson regression is the right model for the study (2019). Linear regression is based on least squares estimation, which means the regression coefficients are chosen to minimize the sum of squared distances between each observed and fitted value (Bhalla, 2018). Logistic regression, on the other hand, is based on maximum likelihood estimation, which means the coefficients are chosen to maximize the probability of the observed values of the dependent variable given the independent variables (Whitehead, n.d.). In short, logistic regression is used when the outcome is binary, that is, whether an event occurred or not.
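The two estimation methods can be contrasted in a small sketch. The function names, the toy data (hours studied versus pass/fail), the learning rate, and the step count below are illustrative assumptions, not taken from the cited sources. Least squares has a closed-form solution for a single predictor, while the logistic fit here uses plain gradient ascent on the log-likelihood, which is only one of several ways to maximize it.

```python
import math

def fit_least_squares(xs, ys):
    """Pick intercept and slope that minimize the sum of squared
    differences between observed and fitted values (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sigmoid(z):
    """Map a linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(xs, ys, intercept, slope):
    """Log-probability of the observed 0/1 outcomes under the model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(intercept + slope * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Maximize the log-likelihood by gradient ascent (assumed settings)."""
    intercept, slope = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [y - sigmoid(intercept + slope * x) for x, y in zip(xs, ys)]
        grad_intercept = sum(residuals)
        grad_slope = sum(r * x for r, x in zip(residuals, xs))
        intercept += learning_rate * grad_intercept / n
        slope += learning_rate * grad_slope / n
    return intercept, slope

# Hypothetical data: hours studied and whether the exam was passed (1) or not (0).
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(hours, passed)
print("logistic intercept, slope:", b0, b1)
print("estimated P(pass | 6 hours):", sigmoid(b0 + b1 * 6))
```

The logistic model returns a probability for each input, whereas the least squares fit returns an unbounded fitted value, which is why the latter suits continuous outcomes.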


In linear regression, a linear relationship is expected between the dependent and independent variables, while logistic regression does not require a linear relationship between them (Bhalla, 2018); it instead assumes that the log-odds of the outcome are linear in the independent variables. Similarly, linear regression requires normally distributed residuals and homoscedasticity, that is, the variance of the residuals is the same across all values of the independent variables; logistic regression requires neither. Linear regression assumes a Gaussian (normal) distribution of the dependent variable, while logistic regression assumes a binomial distribution of the dependent variable (2018). The normal distribution is continuous with a bell-shaped curve, while the binomial distribution is discrete but looks like a bell curve when the sample size is large (Hansen, 2003). For logistic regression, the independent variables should preferably not be highly correlated with each other; when they are, it is suggested to remove the less important variable (Whitehead, n.d.).
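The remark that a binomial distribution resembles a bell curve for large samples can be checked numerically. The sketch below compares the binomial probability of each count with the density of a normal distribution that has the same mean and standard deviation; the helper names and the choice of p = 0.5 are assumptions made for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def normal_pdf(x, mean, sd):
    """Density of the normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def max_gap(n, p):
    """Largest absolute difference between the binomial pmf and the
    matching normal density, taken over all counts 0..n."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return max(
        abs(binomial_pmf(k, n, p) - normal_pdf(k, mean, sd))
        for k in range(n + 1)
    )

for n in (10, 100, 1000):
    print(n, max_gap(n, 0.5))
```

The printed gap shrinks as n grows, which is the sense in which the discrete binomial distribution comes to look like the continuous normal curve.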


In summary, logistic regression is best suited for classification problems such as true/false or zero/one (Drakos, 2018). Another popular algorithm for classification problems is the support vector machine (SVM). According to Drakos (2018), one difference between the two is that logistic regression is more sensitive to outliers because its cost function diverges faster than that of SVM. Another key difference is that logistic regression outputs probabilities while SVM outputs 1 or 0, which means logistic regression yields a probability estimate rather than a hard class label. SVM finds the widest separation between classes, while logistic regression optimizes the likelihood function (2018).
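The two cost functions can be placed side by side as functions of the margin. This is a minimal sketch under common textbook conventions that are assumptions here, not something stated by the cited sources: labels are coded as -1 and +1, the margin is the label times the model's raw score, the logistic cost is log(1 + exp(-margin)), and the SVM cost is the hinge loss max(0, 1 - margin).

```python
import math

def logistic_loss(margin):
    """Logistic (log) loss, written to avoid overflow for large |margin|."""
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def hinge_loss(margin):
    """Hinge loss used by a soft-margin SVM."""
    return max(0.0, 1.0 - margin)

# Negative margins are misclassified points; large positive margins are
# points classified correctly with room to spare.
for margin in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(margin, logistic_loss(margin), hinge_loss(margin))
```

Under these definitions the hinge loss is exactly zero once the margin reaches 1, so points far on the correct side stop influencing the SVM, while the logistic loss stays positive for every point, so all observations keep contributing to the fit.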


References:

Bhalla, D. (2018). Difference between linear regression and logistic regression. Retrieved from https://www.listendata.com/2014/11/difference-between-linear-regression.html

Drakos, G. (2018). Support vector machine vs logistic regression. Retrieved from https://towardsdatascience.com/support-vector-machine-vs-logistic-regression-94cc2975433f

Hansen (2003). Normal vs. binomial: what are the hallmarks and differences. Retrieved from http://staweb.sta.cathedral.org/departments/math/mhansen/public_html/23stat/handouts/normbino.htm

Wake Forest School of Medicine (2019). Logistic regression. Retrieved from https://www.phs.wakehealth.edu/docs/dbs/HSRP1/Lecture3-Logistic%20Regression%206-5-08.ppt

Whitehead, J. (n.d.). An introduction to logistic regression. Retrieved from https://nlp.stanford.edu/~manning/courses/ling236/handouts/whitehead-logistic-regression.ppt
