General Least Square Model (GLM)
- ali@fuzzywireless.com
- Mar 3, 2022
The general least square model (GLM), more commonly known as the general linear model, is a mathematical framework for statistical analyses such as regression, analysis of variance, moderation and mediation (Nico, 2017). With suitable extensions, GLM also enables testing of non-linear models in the context of regression. It supports logistic regression for classification and linear regression for continuous target variables: a classification output is a predicted probability, bounded between 0 and 1, whereas a regression output is a numeric prediction. GLM is a supervised learning algorithm for both classification and regression analysis (Nico, 2017).
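The difference between the two kinds of output can be seen in a minimal sketch. It assumes R's built-in 'mtcars' data set, which is not part of the original discussion and is used here only because it ships with R; the choice of 'mpg', 'am' and 'wt' as variables is likewise an assumption for illustration.

```r
# Illustrative sketch on R's built-in mtcars data (an assumption, not the article's data).

# Continuous target: ordinary linear regression of fuel economy (mpg) on weight (wt).
linear_fit <- lm(mpg ~ wt, data = mtcars)

# Binary target: logistic regression of transmission type (am, coded 0/1) on weight.
logistic_fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Two hypothetical cars to predict for.
new_cars <- data.frame(wt = c(2.5, 3.5))

# Regression predictions are plain numeric values on the scale of mpg.
predicted_mpg <- predict(linear_fit, newdata = new_cars)

# Classification predictions are probabilities, bounded between 0 and 1.
predicted_prob <- predict(logistic_fit, newdata = new_cars, type = "response")

print(predicted_mpg)
print(predicted_prob)
```

Passing type = "response" to predict() is what returns probabilities; without it, a logistic model returns values on the log-odds scale.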
Two key assumptions highlighted by Nico (2017) for GLM are a linear relationship between the predictors and the outcome variable and an additive effect of each predictor. However, GLM can also accommodate non-linear and non-additive relationships, by transforming variables so that the relationship becomes linear and by adding interaction (moderation) terms for non-additive effects (Nico, 2017). The four major assumptions of GLM are linearity, normality of residuals, equality of residual variances, and fixed independent variables measured without error (Carey, 2019). Methods that extend GLM to different use cases are (Nico, 2017):
1. Classification problems – logistic regression and support vector machines
2. Non-linearity – nearest neighbor methods, splines, generalized additive models and kernel smoothing
3. Regularized fitting – ridge regression and lasso
4. Interactions – tree-based methods, bagging, random forests and boosting (Nico, 2017)
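Before reaching for the heavier methods in the list, the two simpler remedies mentioned earlier, variable transformation and interaction terms, can be expressed directly in an 'lm' formula. The following is a rough sketch only; the built-in 'mtcars' data set and the variables 'mpg', 'wt' and 'hp' are assumptions chosen for illustration.

```r
# Illustrative sketch on R's built-in mtcars data (an assumption, not the article's data).

# Additive model: each predictor contributes independently to mpg.
additive_fit <- lm(mpg ~ wt + hp, data = mtcars)

# Non-additive model: wt * hp expands to wt + hp + wt:hp,
# where wt:hp is the interaction (moderation) term.
interaction_fit <- lm(mpg ~ wt * hp, data = mtcars)

# Non-linear relationship: transform the predictor inside the formula,
# so the model stays linear in its coefficients.
log_fit <- lm(mpg ~ log(hp), data = mtcars)

print(coef(interaction_fit))
print(coef(log_fit))

# F-test comparing the nested additive and interaction models.
print(anova(additive_fit, interaction_fit))
```

The anova() comparison is one way to judge whether the interaction term improves on the purely additive fit.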
Teknomo (2017) highlighted that several non-linear curves can be converted into linear form, so that a non-linear problem can be solved as a linear one. For instance, a power model can be transformed into a linear model by taking logarithms. However, routinely transforming from non-linear to linear has implications: although it normalizes the residuals, it distorts the ratio-scale properties of measured variables such as dollars, weights or time (Lo & Andrews, 2015). For example, suppose the two measures are 600 and 700 ms in scenario A and 780 and 910 ms in scenario B. The raw differences are 100 ms and 130 ms, yet log(700) - log(600) = 0.15415 and log(910) - log(780) = 0.15415 (natural logarithms), which signifies no difference between the scenarios after logarithmic transformation (Lo & Andrews, 2015).
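Both points can be checked with a short sketch. The first part linearizes a power model y = a * x^b by taking logarithms of both sides, giving log(y) = log(a) + b * log(x); the data are synthetic, generated from the assumed values a = 2 and b = 1.5. The second part reproduces the reaction-time arithmetic for the pairs 600/700 ms and 780/910 ms.

```r
# Part 1: linearize a power model y = a * x^b via log(y) = log(a) + b * log(x).
# Synthetic data generated from a = 2 and b = 1.5, purely for illustration.
x <- c(1, 2, 4, 8, 16)
y <- 2 * x^1.5

power_fit <- lm(log(y) ~ log(x))
a_hat <- exp(coef(power_fit)[["(Intercept)"]])  # back-transform the intercept
b_hat <- coef(power_fit)[["log(x)"]]            # slope is the exponent b
print(c(a = a_hat, b = b_hat))

# Part 2: the raw differences are unequal, but the differences
# of the natural logarithms coincide.
raw_diff_a <- 700 - 600
raw_diff_b <- 910 - 780
log_diff_a <- log(700) - log(600)
log_diff_b <- log(910) - log(780)

print(c(raw_a = raw_diff_a, raw_b = raw_diff_b))
print(round(c(log_a = log_diff_a, log_b = log_diff_b), 5))
```

The log differences coincide because 700/600 and 910/780 are the same ratio (7/6): a log transformation preserves ratios, not absolute differences.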
As an example, R’s linear regression function ‘lm’ can be applied to two variables, say ‘USTemperatures’ and ‘Latitude’, stored in a data frame named ‘SampleData’:
Output <- lm(USTemperatures ~ Latitude, data = SampleData)
‘Output’ holds the fitted model; its coefficients, available through coef(Output), are the intercept and slope of the line fitted to the data:
USTemperatures = (Slope) x Latitude + Intercept
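To make the call runnable end to end, here is a minimal sketch. Since the contents of ‘SampleData’ are not given, the data frame below is a hypothetical stand-in with invented values; only the variable names come from the example above.

```r
# Hypothetical stand-in for SampleData; these values are invented for illustration only.
SampleData <- data.frame(
  Latitude       = c(25, 30, 35, 40, 45),
  USTemperatures = c(63.0, 54.5, 47.8, 39.7, 32.5)
)

Output <- lm(USTemperatures ~ Latitude, data = SampleData)

# coef() returns a named vector: "(Intercept)" plus one slope per predictor.
intercept <- coef(Output)[["(Intercept)"]]
slope     <- coef(Output)[["Latitude"]]

# Apply the line equation by hand and compare with predict().
manual_prediction <- slope * 38 + intercept
model_prediction  <- unname(predict(Output, newdata = data.frame(Latitude = 38)))

print(c(intercept = intercept, slope = slope))
print(c(manual = manual_prediction, model = model_prediction))
```

Calling summary(Output) additionally reports standard errors, the R-squared value and residual diagnostics for the fit.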
References:
Lo, S. & Andrews, S. (2015). To transform or not to transform: using generalized linear mixed models to analyse reaction time data. Retrieved from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4528092/
Teknomo, K. (2017). Non-Linear Transformation. Retrieved from https://people.revoledu.com/kardi/tutorial/Regression/nonlinear/NonLinearTransformation.htm
Carey, G. (2019). The General Linear Model. Retrieved from http://psych.colorado.edu/~carey/courses/psyc5741/handouts/glm%20theory.pdf
Nico, G. (2017). Statistics – Generalized Linear Models (GLM) – extensions of linear model. Retrieved from https://gerardnico.com/data_mining/glm