Introduction

Standardizing parameters (i.e., coefficients) allows them to be compared within and between models, variables and studies. Moreover, because it returns coefficients expressed in standardized units (for instance, in terms of SDs of the response variable), it allows the use of effect size interpretation guidelines, such as the famous Cohen’s (1988) rules of thumb.

However, standardizing the model’s parameters should not be automatically and mindlessly done: for some research fields, particular variables or types of studies (e.g., replications), it sometimes makes more sense to keep, use and interpret the original parameters, especially if they are well known or easily understood.

Critically, parameter standardization is not a trivial process. Different techniques exist, and they can lead to drastically different results. Thus, it is critical that the standardization method is explicitly documented and detailed.

The parameters package includes different techniques of parameter standardization, described below (Bring 1994; Menard 2004, 2011; Gelman 2008; Schielzeth 2010).

Standardization methods

refit: Re-fitting the model with standardized data

This method is based on a complete model re-fit with a standardized version of the data. Hence, this method is equivalent to standardizing the variables before fitting the model. It is the “purest” and the most accurate (Neter, Wasserman, and Kutner 1989), but it is also the most computationally costly and the slowest (especially for Bayesian models). This method is particularly recommended for complex models that include interactions or transformations (e.g., polynomial or spline terms).

library(parameters)

data <- iris
model <- lm(Sepal.Length ~ Petal.Width + Sepal.Width, data=data)

parameters_standardize(model, method="refit")
Parameter Std_Coefficient
(Intercept) 0.00
Petal.Width 0.89
Sepal.Width 0.21
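To make the idea of a re-fit concrete, here is a minimal base R sketch that standardizes the variables by hand and fits the same formula again. It is an illustration of the principle only, not the package's internal code; the object names iris_z, model_z and std_coefs are made up for this example.

```r
# Standardize every variable used in the model (mean 0, SD 1).
vars <- c("Sepal.Length", "Petal.Width", "Sepal.Width")
iris_z <- as.data.frame(scale(iris[vars]))

# Re-fit the same formula on the standardized data.
model_z <- lm(Sepal.Length ~ Petal.Width + Sepal.Width, data = iris_z)

# The coefficients of this re-fitted model are the standardized coefficients.
std_coefs <- coef(model_z)
round(std_coefs, 2)
```

Because every variable is centered, the intercept of the re-fitted model is zero up to floating point error.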

The robust argument (defaulting to FALSE) enables a robust standardization of the data, i.e., based on the median and MAD instead of the mean and SD.

parameters_standardize(model, method="refit", robust=TRUE)
Parameter Std_Coefficient
(Intercept) 0.11
Petal.Width 0.97
Sepal.Width 0.17
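The robust variant can be sketched in base R by centering on the median and scaling by the MAD. This assumes that the MAD is computed as by stats::mad() with its default scaling constant; robust_z is a hypothetical helper written for this illustration.

```r
# Hypothetical helper: center on the median, scale by the MAD.
robust_z <- function(x) (x - median(x)) / mad(x)

vars <- c("Sepal.Length", "Petal.Width", "Sepal.Width")
iris_r <- as.data.frame(lapply(iris[vars], robust_z))

# Re-fit the model on the robustly standardized data.
model_r <- lm(Sepal.Length ~ Petal.Width + Sepal.Width, data = iris_r)
robust_coefs <- coef(model_r)
round(robust_coefs, 2)
```

Unlike the mean-based version, the intercept is not necessarily zero here, since the variables are centered on their medians rather than their means.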

This method is very flexible as it can be applied to all types of models (linear, logistic…).

library(parameters)

data$binary <- ifelse(data$Sepal.Width > 3, 1, 0)
model <- glm(binary ~ Species + Sepal.Length, data = data, family="binomial")
parameters_standardize(model, method="refit")
Parameter Std_Coefficient
(Intercept) 3.3
Speciesversicolor -5.4
Speciesvirginica -5.5
Sepal.Length 1.5
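For a logistic model, the binary response cannot itself be standardized, so a manual re-fit only rescales the numeric predictors. The sketch below assumes this is the relevant behaviour; the column Sepal.Length_z and the objects fit_raw and fit_z are names chosen for the illustration.

```r
d <- iris
d$binary <- ifelse(d$Sepal.Width > 3, 1, 0)

# Standardize the numeric predictor only; the 0/1 response stays as it is.
d$Sepal.Length_z <- as.numeric(scale(d$Sepal.Length))

fit_raw <- glm(binary ~ Species + Sepal.Length, data = d, family = "binomial")
fit_z <- glm(binary ~ Species + Sepal.Length_z, data = d, family = "binomial")

# The standardized slope is the raw slope multiplied by SD(Sepal.Length):
# it is the change in log-odds for a 1 SD change of the predictor.
slope_raw <- coef(fit_raw)[["Sepal.Length"]]
slope_z <- coef(fit_z)[["Sepal.Length_z"]]
c(raw = slope_raw, standardized = slope_z)
```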

2sd: Scaling by two SDs

Same as method = "refit", except that standardization is done by dividing by two times the SD or MAD (depending on robust). This method is useful for obtaining coefficients of continuous predictors that are comparable to coefficients of binary predictors (Gelman 2008).
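The following base R sketch illustrates the idea along the lines of Gelman (2008): the numeric predictor is centered and divided by two SDs, while a binary predictor is left as 0/1. How the package treats the response and binary predictors under this method is not shown here, so this is an assumption-laden illustration; the columns Petal.Width_2sd and Wide are invented for the example.

```r
d <- iris

# Center the numeric predictor and divide it by two standard deviations.
d$Petal.Width_2sd <- (d$Petal.Width - mean(d$Petal.Width)) / (2 * sd(d$Petal.Width))

# A binary predictor, kept on its 0/1 scale.
d$Wide <- as.numeric(d$Sepal.Width > 3)

fit_raw <- lm(Sepal.Length ~ Petal.Width + Wide, data = d)
fit_2sd <- lm(Sepal.Length ~ Petal.Width_2sd + Wide, data = d)

# The rescaled slope now describes a change of 2 SDs of Petal.Width,
# which is roughly comparable to switching a balanced binary predictor from 0 to 1.
coef(fit_2sd)
```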

smart: Scaling by the SDs of the response and the predictor

Post-hoc standardization of the model parameters. The coefficients are divided by the standard deviation (or MAD if robust) of the outcome (which becomes their expression ‘unit’). Then, the coefficients related to numeric variables are additionally multiplied by the standard deviation (or MAD if robust) of the related term, so that they correspond to changes of 1 SD of the predictor (e.g., "A change in 1 SD of x is related to a change of 0.24 of the SD of y"). This does not apply to binary variables or factors, so their coefficients are still related to changes in levels.

model <- lm(Sepal.Length ~ Petal.Width + Sepal.Width, data=data)
parameters_standardize(model, method="smart")
Parameter Std_Coefficient
(Intercept) 0.00
Petal.Width 0.89
Sepal.Width 0.21
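For numeric predictors, the post-hoc rule described above amounts to multiplying each raw slope by SD(x) / SD(y). A minimal base R sketch of that arithmetic follows; it is not the package's implementation, and the names b, sd_x, sd_y and posthoc are chosen for the illustration.

```r
model <- lm(Sepal.Length ~ Petal.Width + Sepal.Width, data = iris)

# Raw slopes of the numeric predictors.
b <- coef(model)[c("Petal.Width", "Sepal.Width")]

# SD of each predictor and of the outcome.
sd_x <- sapply(iris[names(b)], sd)
sd_y <- sd(iris$Sepal.Length)

# Post-hoc standardized slopes: change in SDs of y per 1 SD change of x.
posthoc <- b * sd_x / sd_y
round(posthoc, 2)
```

For a model with only additive numeric terms, these values coincide with those obtained by re-fitting the model on standardized data.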

classic: Basic scaling of all parameters

This method is similar to method = "smart", but treats all variables as continuous: it also scales the coefficient by the standard deviation of factors (transformed to integers) or binary predictors. Although it is inappropriate for these cases, this method is the one implemented by default in other software, such as lm.beta::lm.beta().
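The difference between the two post-hoc rules for a factor can be sketched in base R as follows. This is an illustration of the arithmetic described above, under the assumption that the factor is converted with as.integer(); the names Group, smart_group and classic_group are hypothetical.

```r
d <- iris
d$Group <- factor(ifelse(d$Sepal.Width > 3, "High", "Low"))
model <- lm(Sepal.Length ~ Petal.Width + Group, data = d)

b_group <- coef(model)[["GroupLow"]]
sd_y <- sd(d$Sepal.Length)

# "smart"-style: only divide by SD(y); the coefficient still describes
# the difference between the two levels.
smart_group <- b_group / sd_y

# "classic"-style: also multiply by the SD of the factor coded as integers.
classic_group <- b_group * sd(as.integer(d$Group)) / sd_y

c(smart = smart_group, classic = classic_group)
```

Since the SD of a two-level factor coded as integers is at most about 0.5, the "classic" value is shrunk towards zero and no longer corresponds to a change between levels.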

Methods Comparison

We will use the “refit” method as the baseline. We will then compute the differences between these standardized parameters and the ones provided by the other functions. The bigger the (absolute) number, the worse it is.
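The kind of comparison described here can be sketched in base R for linear models. The helper compare_to_refit() below is hypothetical and is not the comparison() function used in the rest of this section, whose definition is not shown. It contrasts a re-fit on standardized data with a post-hoc scaling based on the SDs of the model matrix columns (the "classic" approach).

```r
# Hypothetical helper: post-hoc ("classic"-style) minus refit coefficients.
compare_to_refit <- function(formula, data) {
  # Refit baseline: standardize all numeric columns, then fit the model.
  num <- vapply(data, is.numeric, logical(1))
  data_z <- data
  data_z[num] <- lapply(data[num], function(x) as.numeric(scale(x)))
  refit <- coef(lm(formula, data = data_z))

  # Post-hoc: scale each raw coefficient by SD(model matrix column) / SD(y).
  fit <- lm(formula, data = data)
  X <- model.matrix(fit)
  y <- model.response(model.frame(fit))
  posthoc <- coef(fit) * apply(X, 2, sd) / sd(y)

  posthoc - refit
}

# Additive model: the two approaches agree.
diff_additive <- compare_to_refit(Sepal.Length ~ Petal.Width + Sepal.Width, iris)
# Interaction model: they no longer agree.
diff_interaction <- compare_to_refit(Sepal.Length ~ Petal.Width * Sepal.Width, iris)

round(diff_additive, 2)
round(diff_interaction, 2)
```

The sign convention (post-hoc minus refit) is a choice made for this sketch; only the absolute size of the differences matters for the comparison.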

SPOILER ALERT: the standardization implemented in parameters is the most accurate and the most flexible.

Models with only numeric predictors

Linear Model

data <- iris
data$Group_Sepal.Width <- as.factor(ifelse(data$Sepal.Width > 3, "High", "Low"))
data$Binary_Sepal.Width <- as.factor(ifelse(data$Sepal.Width > 3, 1, 0))

model <- lm(Sepal.Length ~ Petal.Width + Sepal.Width, data=data) 
comparison(model)
Parameter smart classic lm.beta MuMIn
(Intercept) 0 0 0 0
Petal.Width 0 0 0 0
Sepal.Width 0 0 0 0

For this simple model, all methods return results equal to the “refit” method.

Logistic Model

model <- glm(Binary_Sepal.Width ~ Petal.Width + Sepal.Length, data=data, family="binomial")
comparison(model)
Parameter smart classic lm.beta MuMIn
(Intercept) -0.26 -0.26 Error Error
Petal.Width 0.00 0.00 Error Error
Sepal.Length 0.00 0.00 Error Error

Linear Mixed Model

library(lme4)

model <- lme4::lmer(Sepal.Length ~ Petal.Width + Sepal.Width + (1|Species), data=data)
comparison(model)
Parameter smart classic lm.beta MuMIn
(Intercept) 0 0 Error 0
Petal.Width 0 0 Error 0
Sepal.Width 0 0 Error 0

For this simple mixed model, all methods that support it return results equal to the “refit” method (lm.beta returns an error).

Interactions

model <- lm(Sepal.Length ~ Petal.Width * Sepal.Width, data=data)
comparison(model)
Parameter smart classic lm.beta MuMIn
(Intercept) -0.01 -0.01 -0.01 -0.01
Petal.Width -0.28 -0.28 -0.28 -0.28
Sepal.Width -0.06 -0.06 -0.06 -0.06
Petal.Width:Sepal.Width 0.01 0.24 0.24 0.24

When interactions are involved, post-hoc methods return results that differ from the “refit” baseline. However, the methods implemented in other software arguably perform worse.

Transformation

model <- lm(Sepal.Length ~ poly(Petal.Width, 2) + poly(Sepal.Width, 2), data=data)
comparison(model)
Parameter smart classic lm.beta MuMIn
(Intercept) 0 0.00 0.00 0.00
poly(Petal.Width, 2)1 0 10.48 10.48 10.48
poly(Petal.Width, 2)2 0 -1.86 -1.86 -1.86
poly(Sepal.Width, 2)1 0 3.44 3.44 3.44
poly(Sepal.Width, 2)2 0 0.28 0.28 0.28

For polynomial transformations, other software becomes very unreliable.

Bayesian Models

Parameter smart classic lm.beta MuMIn
(Intercept) 0 0 0 0
Petal.Width 0 0 0 0
Sepal.Width 0 0 0 0

Models with factors

Linear Model

model <- lm(Sepal.Length ~ Petal.Width + Group_Sepal.Width, data=data) 
comparison(model)
Parameter smart classic lm.beta MuMIn
(Intercept) 0.14 0.14 0.14 0.14
Petal.Width 0.00 0.00 0.00 0.00
Group_Sepal.WidthLow 0.00 -0.12 -0.12 -0.12

When factors are involved, methods that standardize the numeric transformation of factors give different results.

Logistic Model

model <- glm(Binary_Sepal.Width ~ Petal.Width + Species, data=data, family="binomial")
comparison(model)
Parameter smart classic lm.beta MuMIn
(Intercept) 8 8.0 Error Error
Petal.Width 0 0.0 Error Error
Speciesversicolor 0 -5.8 Error Error
Speciesvirginica 0 -7.6 Error Error

Linear Mixed Model

library(lme4)

model <- lme4::lmer(Sepal.Length ~ Petal.Length + Group_Sepal.Width + (1|Species), data=data)
comparison(model)
Parameter smart classic lm.beta MuMIn
(Intercept) 0.15 0.15 Error 0.15
Petal.Length 0.00 0.00 Error 0.00
Group_Sepal.WidthLow 0.00 -0.14 Error -0.14

Interactions

model <- lm(Sepal.Length ~ Petal.Width * Group_Sepal.Width, data=data)
comparison(model)
Parameter smart classic lm.beta MuMIn
(Intercept) 0.12 0.12 0.12 0.12
Petal.Width 0.00 0.00 0.00 0.00
Group_Sepal.WidthLow 0.27 0.01 0.01 0.01
Petal.Width:Group_Sepal.WidthLow -0.05 -0.01 -0.01 -0.01

model <- lm(Sepal.Length ~ Group_Sepal.Width * Petal.Width, data=data)
comparison(model)
Parameter smart classic lm.beta MuMIn
(Intercept) 0.12 0.12 0.12 0.12
Group_Sepal.WidthLow 0.27 0.01 0.01 0.01
Petal.Width 0.00 0.00 0.00 0.00
Group_Sepal.WidthLow:Petal.Width 0.00 -0.01 -0.01 -0.01

Bayesian Models

Parameter smart classic lm.beta MuMIn
(Intercept) 0.14 0.14 0.14 0.14
Petal.Width 0.00 0.00 0.00 0.00
Group_Sepal.WidthLow 0.00 -0.13 -0.13 -0.13

Parameter smart classic lm.beta MuMIn
(Intercept) 0.14 0.14 Error Error
Petal.Width 0.00 0.00 Error Error
Group_Sepal.WidthLow 0.00 -0.14 Error Error

Conclusion

Use refit if possible, otherwise smart.

References

Bring, Johan. 1994. “How to Standardize Regression Coefficients.” The American Statistician 48 (3): 209–13.

Gelman, Andrew. 2008. “Scaling Regression Inputs by Dividing by Two Standard Deviations.” Statistics in Medicine 27 (15): 2865–73.

Menard, Scott. 2004. “Six Approaches to Calculating Standardized Logistic Regression Coefficients.” The American Statistician 58 (3): 218–23.

———. 2011. “Standards for Standardized Logistic Regression Coefficients.” Social Forces 89 (4): 1409–28.

Neter, John, William Wasserman, and Michael H Kutner. 1989. “Applied Linear Regression Models.”

Schielzeth, Holger. 2010. “Simple Means to Improve the Interpretability of Regression Coefficients.” Methods in Ecology and Evolution 1 (2): 103–13.