## What is the R2?

The **coefficient of determination**, denoted R^2 and pronounced “R squared”, typically
corresponds to the proportion of the variance in the dependent variable
(the response) that is *explained* (i.e., predicted) by the
independent variables (the predictors).

It is an “absolute” index of *goodness-of-fit*, ranging from 0
to 1 (often expressed as a percentage), and can be used for model
performance assessment or model comparison.
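For an ordinary linear model, the "proportion of explained variance" can be checked by hand. The following is a minimal sketch using only base R and the built-in `mtcars` data; the object names (`model_lm`, `ss_res`, `ss_tot`, `r2_manual`) are illustrative and not part of the package.

```r
# Fit an ordinary linear model on the built-in mtcars data
model_lm <- lm(mpg ~ wt + cyl, data = mtcars)

# Residual and total sums of squares of the response
ss_res <- sum(residuals(model_lm)^2)
ss_tot <- sum((mtcars$mpg - mean(mtcars$mpg))^2)

# Proportion of the response variance explained by the predictors
r2_manual <- 1 - ss_res / ss_tot

# Compare with the value reported by summary()
all.equal(r2_manual, summary(model_lm)$r.squared)
```

The comparison should hold for any `lm` fitted with an intercept, since `summary()` uses the same definition in that case.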

## Different types of R2

As models become more complex, the computation of an R^2 becomes increasingly less straightforward.

Currently, depending on the context of the regression model object, one can choose from the following measures supported in performance:

- Bayesian R^2
- Cox & Snell’s R^2
- Efron’s R^2
- Kullback-Leibler R^2
- LOO-adjusted R^2
- McFadden’s R^2
- McKelvey & Zavoina’s R^2
- Nagelkerke’s R^2
- Nakagawa’s R^2 for mixed models
- Somers’ D_{xy} rank correlation for binary outcomes
- Tjur’s R^2 - coefficient of discrimination (D)
- Xu’s R^2 (Omega-squared)
- R^2 for models with zero-inflation

*TO BE COMPLETED.*

Before we begin, let’s first load the package.
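Assuming the package is already installed from CRAN, loading it would look like this:

```r
# install.packages("performance")  # only needed if the package is not yet installed
library(performance)
```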

## R2 for `glm`

In the context of a generalized linear model (e.g., a logistic model
whose outcome is binary), R^2 doesn’t
measure the percentage of *“explained variance”*, as this concept
doesn’t apply. However, the R^2s that
have been adapted for GLMs have retained the name of “R2”, mostly
because of their similar properties (the range, the sensitivity, and the
interpretation as the amount of explanatory power).
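To make this concrete, two of the measures listed earlier (Tjur's and McFadden's) can be computed by hand for a logistic model. This is only a sketch of their textbook definitions in base R, on the built-in `mtcars` data; it is not meant to show how the package computes them internally, and all object names are illustrative.

```r
# Logistic regression on the built-in mtcars data (binary outcome: am)
model_glm <- glm(am ~ wt, data = mtcars, family = binomial)

# Tjur's D: difference between the mean predicted probabilities
# of the two outcome groups
p_hat <- fitted(model_glm)
tjur_d <- mean(p_hat[mtcars$am == 1]) - mean(p_hat[mtcars$am == 0])

# McFadden's R2: one minus the ratio of the model log-likelihood
# to the log-likelihood of an intercept-only model
model_null <- update(model_glm, . ~ 1)
mcfadden <- 1 - as.numeric(logLik(model_glm)) / as.numeric(logLik(model_null))

c(Tjur = tjur_d, McFadden = mcfadden)
```

Both values fall between 0 and 1 here, but neither is a ratio of variances, which is why they generally disagree with each other.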

## R2 for Mixed Models

### Marginal vs. Conditional R2

For mixed models, `performance` will return two different R^2s:

- The **conditional** R^2
- The **marginal** R^2

The marginal R^2 considers only the
variance of the **fixed effects** (without the random
effects), while the conditional R^2
takes *both* the **fixed and random effects** into
account (i.e., the total model).
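For a Gaussian mixed model, this distinction is commonly written in terms of three variance components, following Nakagawa's formulation. The symbols are notation introduced here for illustration: \sigma_{f}^{2} is the variance of the fixed-effects predictions, \sigma_{\alpha}^{2} the random-effects variance, and \sigma_{\varepsilon}^{2} the residual variance.

$$
R^{2}_{\text{marginal}} = \frac{\sigma_{f}^{2}}{\sigma_{f}^{2} + \sigma_{\alpha}^{2} + \sigma_{\varepsilon}^{2}},
\qquad
R^{2}_{\text{conditional}} = \frac{\sigma_{f}^{2} + \sigma_{\alpha}^{2}}{\sigma_{f}^{2} + \sigma_{\alpha}^{2} + \sigma_{\varepsilon}^{2}}
$$

The two expressions share the same denominator, so the conditional R^2 can never be smaller than the marginal R^2.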

```
library(lme4)
# defining a linear mixed-effects model
model <- lmer(Petal.Length ~ Petal.Width + (1 | Species), data = iris)
r2(model)
#> # R2 for Mixed Models
#>
#> Conditional R2: 0.933
#> Marginal R2: 0.303
```

Note that `r2` functions only return the R^2 values. We would encourage users to
instead always use the `model_performance` function to get a
more comprehensive set of indices of model fit.

```
model_performance(model)
#> # Indices of model performance
#>
#> AIC     |    AICc |     BIC | R2 (cond.) | R2 (marg.) |   ICC |  RMSE | Sigma
#> -----------------------------------------------------------------------------
#> 159.036 | 159.312 | 171.079 |      0.933 |      0.303 | 0.904 | 0.373 | 0.378
```
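The indices in this table are related to one another. Assuming the reported ICC is the adjusted ICC (the share of the variance left unexplained by the fixed effects that is attributable to the random effects), the conditional R^2 should equal the marginal R^2 plus that share of the remaining variance. A small arithmetic check, using values copied from the output above and illustrative variable names:

```r
# Values copied from the model_performance() output above
r2_marginal <- 0.303
icc         <- 0.904

# Conditional R2 implied by the marginal R2 and the (adjusted) ICC
r2_conditional <- r2_marginal + (1 - r2_marginal) * icc
round(r2_conditional, 3)
#> [1] 0.933
```

The result matches the conditional R^2 reported in the table, up to rounding of the inputs.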

In the current vignette, however, we focus exclusively on the `r2` family of functions and will only talk about this measure.

## R2 for Bayesian Models

```
library(rstanarm)
model <- stan_glm(mpg ~ wt + cyl, data = mtcars, refresh = 0)
r2(model)
#> # Bayesian R2 with Compatibility Interval
#>
#> Conditional R2: 0.816 (95% CI [0.704, 0.897])
```

As discussed above, for mixed-effects models, there will be two components associated with R^2: the conditional and the marginal R^2.

## Comparing change in R2 using Cohen’s *f*

Cohen’s f (of ANOVA
fame) can be used as a measure of effect size in the context of
sequential multiple regression (i.e., **nested
models**). That is, when comparing two models, we can examine
the ratio between the increase in R^2
and the unexplained variance:

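$$
f^{2} = \frac{R_{AB}^{2} - R_{A}^{2}}{1 - R_{AB}^{2}}
$$

Here, R_{A}^{2} is the R^2 of the smaller model and R_{AB}^{2} is the R^2 of the larger model, which contains the additional predictors.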

```
library(effectsize)
data(hardlyworking)
m1 <- lm(salary ~ xtra_hours, data = hardlyworking)
m2 <- lm(salary ~ xtra_hours + n_comps + seniority, data = hardlyworking)
cohens_f_squared(m1, model2 = m2)
#> Cohen's f2 (partial) |      95% CI | R2_delta
#> ---------------------------------------------
#> 1.19                 | [0.99, Inf] |     0.17
#>
#> - One-sided CIs: upper bound fixed at [Inf].
```
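The same ratio can be computed directly from the R^2 values of two nested models. Below is a minimal sketch in base R on the built-in `mtcars` data (not the `hardlyworking` data used above); the model and variable names are illustrative.

```r
# Two nested linear models fitted to the built-in mtcars data
m_a  <- lm(mpg ~ wt, data = mtcars)
m_ab <- lm(mpg ~ wt + cyl + hp, data = mtcars)

# R2 of the smaller and of the larger model
r2_a  <- summary(m_a)$r.squared
r2_ab <- summary(m_ab)$r.squared

# Increase in R2 relative to the variance left unexplained by the larger model
f2 <- (r2_ab - r2_a) / (1 - r2_ab)
f2
```

Because the denominator is the unexplained variance of the larger model, a given increase in R^2 yields a larger f^2 when the larger model already fits well.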

If you want to know more about these indices, you can check out
details and references in the functions that compute them **here**.

## Interpretation

If you want to know how to *interpret* these R^2 values, see these **interpretation
guidelines**.