*Test if your model is a good model!*

The primary goal of the **performance** package is to provide utilities for computing **indices of model quality** and **goodness of fit**. This includes measures like r-squared (R2), root mean squared error (RMSE) or intraclass correlation coefficient (ICC) , but also functions to check (mixed) models for overdispersion, zero-inflation, convergence or singularity.

Run the following to install the stable release of **performance** from CRAN:

`install.packages("performance")`

Or this one to install the latest development version:

```
install.packages("remotes")
remotes::install_github("easystats/performance")
```

There is a nice introduction into the package on youtube.

**performance** has a generic `r2()`

function, which computes the r-squared for many different models, including mixed effects and Bayesian regression models.

`r2()`

returns a list containing values related to the “most appropriate” r-squared for the given model.

```
model <- lm(mpg ~ wt + cyl, data = mtcars)
r2(model)
#> # R2 for Linear Regression
#>
#> R2: 0.830
#> adj. R2: 0.819
model <- glm(am ~ wt + cyl, data = mtcars, family = binomial)
r2(model)
#> $R2_Tjur
#> Tjur's R2
#> 0.7051
library(MASS)
data(housing)
model <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
r2(model)
#> $R2_Nagelkerke
#> Nagelkerke's R2
#> 0.1084
```

The different R-squared measures can also be accessed directly via functions like `r2_bayes()`

, `r2_coxsnell()`

or `r2_nagelkerke()`

(see a full list of functions here).

For mixed models, the *conditional* and *marginal* R-squared are returned. The *marginal R-squared* considers only the variance of the fixed effects and indicates how much of the model’s variance is explained by the fixed effects part only. The *conditional R-squared* takes both the fixed and random effects into account and indicates how much of the model’s variance is explained by the “complete” model.

For frequentist mixed models, `r2()`

(resp. `r2_nakagawa()`

) computes the *mean* random effect variances, thus `r2()`

is also appropriate for mixed models with more complex random effects structures, like random slopes or nested random effects (Johnson 2014; Nakagawa, Johnson, and Schielzeth 2017).

```
library(rstanarm)
model <- stan_glmer(Petal.Length ~ Petal.Width + (1 | Species), data = iris, cores = 4)
r2(model)
#> # Bayesian R2 with Standard Error
#>
#> Conditional R2: 0.953 (0.89% CI [0.944, 0.962])
#> Marginal R2: 0.824 (0.89% CI [0.747, 0.888])
library(lme4)
model <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
r2(model)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.799
#> Marginal R2: 0.279
```

Similar to R-squared, the ICC provides information on the explained variance and can be interpreted as “the proportion of the variance explained by the grouping structure in the population” (Hox 2010).

`icc()`

calculates the ICC for various mixed model objects, including `stanreg`

models…

```
library(lme4)
model <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
icc(model)
#> # Intraclass Correlation Coefficient
#>
#> Adjusted ICC: 0.722
#> Conditional ICC: 0.521
```

…and models of class `brmsfit`

.

```
icc(model)
#> # Intraclass Correlation Coefficient
#>
#> Adjusted ICC: 0.930
#> Conditional ICC: 0.771
```

Overdispersion occurs when the observed variance in the data is higher than the expected variance from the model assumption (for Poisson, variance roughly equals the mean of an outcome). `check_overdispersion()`

checks if a count model (including mixed models) is overdispersed or not.

```
library(glmmTMB)
data(Salamanders)
model <- glm(count ~ spp + mined, family = poisson, data = Salamanders)
check_overdispersion(model)
#> # Overdispersion test
#>
#> dispersion ratio = 2.946
#> Pearson's Chi-Squared = 1873.710
#> p-value = < 0.001
#> Overdispersion detected.
```

Overdispersion can be fixed by either modelling the dispersion parameter (not possible with all packages), or by choosing a different distributional family [like Quasi-Poisson, or negative binomial, see (Gelman and Hill 2007).

Zero-inflation (in (Quasi-)Poisson models) is indicated when the amount of observed zeros is larger than the amount of predicted zeros, so the model is *underfitting* zeros. In such cases, it is recommended to use negative binomial or zero-inflated models.

Use `check_zeroinflation()`

to check if zero-inflation is present in the fitted model.

```
model <- glm(count ~ spp + mined, family = poisson, data = Salamanders)
check_zeroinflation(model)
#> # Check for zero-inflation
#>
#> Observed zeros: 387
#> Predicted zeros: 298
#> Ratio: 0.77
#> Model is underfitting zeros (probable zero-inflation).
```

A “singular” model fit means that some dimensions of the variance-covariance matrix have been estimated as exactly zero. This often occurs for mixed models with overly complex random effects structures.

`check_singularity()`

checks mixed models (of class `lme`

, `merMod`

, `glmmTMB`

or `MixMod`

) for singularity, and returns `TRUE`

if the model fit is singular.

```
library(lme4)
data(sleepstudy)
# prepare data
set.seed(123)
sleepstudy$mygrp <- sample(1:5, size = 180, replace = TRUE)
sleepstudy$mysubgrp <- NA
for (i in 1:5) {
filter_group <- sleepstudy$mygrp == i
sleepstudy$mysubgrp[filter_group] <- sample(1:30, size = sum(filter_group), replace = TRUE)
}
# fit strange model
model <- lmer(Reaction ~ Days + (1 | mygrp/mysubgrp) + (1 | Subject), data = sleepstudy)
check_singularity(model)
#> [1] TRUE
```

Remedies to cure issues with singular fits can be found here.

**performance** provides many functions to check model assumptions, like `check_collinearity()`

, `check_normality()`

or `check_heteroscedasticity()`

. To get a comprehensive check, use `check_model()`

.

```
model <- lm(mpg ~ wt * cyl + gear, data = mtcars)
check_model(model)
```

`model_performance()`

computes indices of model performance for regression models. Depending on the model object, typical indices might be r-squared, AIC, BIC, RMSE, ICC or LOOIC.

```
m1 <- lm(mpg ~ wt + cyl, data = mtcars)
model_performance(m1)
#> # Indices of model performance
#>
#> AIC | BIC | R2 | R2 (adj.) | RMSE | Sigma
#> -----------------------------------------------------
#> 156.010 | 161.873 | 0.830 | 0.819 | 2.444 | 2.568
```

```
m2 <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial")
model_performance(m2)
#> # Indices of model performance
#>
#> AIC | BIC | Tjur's R2 | RMSE | Sigma | Log_loss | Score_log | Score_spherical | PCP
#> --------------------------------------------------------------------------------------------
#> 31.298 | 35.695 | 0.478 | 0.359 | 0.934 | 0.395 | -14.903 | 0.095 | 0.743
```

```
library(lme4)
m3 <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
model_performance(m3)
#> # Indices of model performance
#>
#> AIC | BIC | R2 (cond.) | R2 (marg.) | ICC | RMSE | Sigma
#> -----------------------------------------------------------------------
#> 1755.628 | 1774.786 | 0.799 | 0.279 | 0.722 | 23.438 | 25.592
```

The `compare_performance()`

function can be used to compare the performance and quality of several models (including models of different types).

```
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
m4 <- glm(counts ~ outcome + treatment, family = poisson())
compare_performance(m1, m2, m3, m4)
#> # Comparison of Model Performance Indices
#>
#> Name | Model | AIC | BIC | RMSE | Sigma | Score_log | Score_spherical | R2 | R2 (adj.) | Tjur's R2 | Log_loss | PCP | R2 (cond.) | R2 (marg.) | ICC | Nagelkerke's R2
#> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
#> m1 | lm | 156.010 | 161.873 | 2.444 | 2.568 | | | 0.830 | 0.819 | | | | | | |
#> m2 | glm | 31.298 | 35.695 | 0.359 | 0.934 | -14.903 | 0.095 | | | 0.478 | 0.395 | 0.743 | | | |
#> m3 | lmerMod | 1755.628 | 1774.786 | 23.438 | 25.592 | | | | | | | | 0.799 | 0.279 | 0.722 |
#> m4 | glm | 56.761 | 57.747 | 3.043 | 1.132 | -2.598 | 0.324 | | | | | | | | | 0.657
```

One can also easily compute and a **composite index** of model performance and sort the models from the best one to the worse.

```
compare_performance(m1, m2, m3, m4, rank = TRUE)
#> # Comparison of Model Performance Indices
#>
#> Name | Model | AIC | BIC | RMSE | Sigma | Performance-Score
#> --------------------------------------------------------------------------
#> m2 | glm | 31.298 | 35.695 | 0.359 | 0.934 | 100.00%
#> m4 | glm | 56.761 | 57.747 | 3.043 | 1.132 | 96.21%
#> m1 | lm | 156.010 | 161.873 | 2.444 | 2.568 | 92.46%
#> m3 | lmerMod | 1755.628 | 1774.786 | 23.438 | 25.592 | 0.00%
```

Finally, we provide convenient visualisation (the `see`

package must be installed).

`plot(compare_performance(m1, m2, m4, rank = TRUE))`

`test_performance()`

carries out the most relevant and appropriate tests based on the input (for instance, whether the models are nested or not).

```
m1 <- lm(Sepal.Length ~ Petal.Length, data = iris)
m2 <- lm(Sepal.Length ~ Petal.Length + Petal.Width, data = iris)
m3 <- lm(Sepal.Length ~ Petal.Length * Petal.Width, data = iris)
test_performance(m1, m2, m3)
#> Name | Model | BF | Omega2 | p (Omega2) | LR | p (LR)
#> ------------------------------------------------------------
#> m1 | lm | | | | |
#> m2 | lm | 0.601 | 0.03 | 0.062 | 3.99 | 0.057
#> m3 | lm | > 1000 | 0.16 | < .001 | 29.35 | < .001
#> Models were detected as nested and are compared in sequential order.
```

Gelman, Andrew, and Jennifer Hill. 2007. *Data Analysis Using Regression and Multilevel/Hierarchical Models*. Analytical Methods for Social Research. Cambridge ; New York: Cambridge University Press.

Hox, J. J. 2010. *Multilevel Analysis: Techniques and Applications*. 2nd ed. Quantitative Methodology Series. New York: Routledge.

Johnson, Paul C. D. 2014. “Extension of Nakagawa & Schielzeth’s R2 GLMM to Random Slopes Models.” Edited by Robert B. O’Hara. *Methods in Ecology and Evolution* 5 (9): 944–46. https://doi.org/10.1111/2041-210X.12225.

Nakagawa, Shinichi, Paul C. D. Johnson, and Holger Schielzeth. 2017. “The Coefficient of Determination R2 and Intra-Class Correlation Coefficient from Generalized Linear Mixed-Effects Models Revisited and Expanded.” *Journal of the Royal Society Interface* 14 (134): 20170213. https://doi.org/10.1098/rsif.2017.0213.