Check distribution of simulated quantile residuals

check_residuals() checks generalized linear (mixed) models for uniformity of randomized quantile residuals, which can be used to identify typical model misspecification problems, such as over/underdispersion, zero-inflation, and residual spatial and temporal autocorrelation.

Usage

check_residuals(x, ...)

# Default S3 method
check_residuals(x, alternative = "two.sided", distribution = "punif", ...)

Arguments

x: A supported model object or an object returned by simulate_residuals() or DHARMa::simulateResiduals().
...: Passed down to stats::ks.test().
alternative: A character string specifying the alternative hypothesis. Can be one of "two.sided", "less", or "greater". See stats::ks.test() for details.
distribution: The distribution to compare the residuals against. Can be (a) a character value giving a cumulative distribution function (for example, "punif" (default) or "pnorm"), (b) a cumulative distribution function itself (for example, punif or pnorm), or (c) a numeric vector of values.

Value

The p-value of the test statistic.

Details

Simulated quantile residuals are generated by simulating a series of values from a fitted model for each case, comparing the observed response values to these simulations, and computing the empirical quantile of the observed value in the distribution of simulated values. When the model is correctly specified, these quantile residuals follow a uniform (flat) distribution. check_residuals() tests the distribution of the quantile residuals against the uniform distribution using a Kolmogorov-Smirnov test. Essentially, comparing quantile residuals to the uniform distribution tests whether the observed response values deviate from model expectations (i.e., simulated values). In this sense, check_residuals() is similar to posterior predictive checks with check_predictions().
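The steps described above can be sketched by hand in base R. This is only an illustration of the idea, not the code that check_residuals() or DHARMa actually runs; the toy data, the object names (sims, quantile_res) and the random tie-breaking used for the integer response are assumptions made for this sketch.

```r
set.seed(123)

# Toy data (hypothetical): Poisson counts depending on one predictor
n <- 200
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.5 + 0.3 * x))
m <- glm(y ~ x, family = poisson())

# Step 1: simulate new response vectors from the fitted model
n_sim <- 250
sims <- as.matrix(simulate(m, nsim = n_sim)) # n rows, n_sim columns

# Step 2: empirical quantile of each observed value among its simulations.
# The response is an integer, so ties are broken at random by drawing
# uniformly between P(sim < y) and P(sim <= y) (assumption of this sketch)
lower <- rowMeans(sims < y)
upper <- rowMeans(sims <= y)
quantile_res <- runif(n, min = lower, max = upper)

# Step 3: compare the quantile residuals with the uniform distribution
ks <- ks.test(quantile_res, "punif")
ks$p.value
```

Because the model in this sketch matches the data-generating process, the quantile residuals should look roughly uniform; fitting a Poisson model to overdispersed counts instead would typically push the p-value down.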

There is a plot() method to visualize the distribution of quantile residuals using a Q-Q plot. This plot can be interpreted in the same way as a Q-Q plot for normality of residuals in linear regression.

If desired, the residuals can be tested against a different theoretical distribution, or against a vector of numeric values, using the distribution argument.
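The three accepted forms of distribution mirror what stats::ks.test() accepts as its second argument. The following sketch calls ks.test() directly on a stand-in vector of uniform draws (u is hypothetical, not real model residuals), on the assumption that check_residuals() forwards these values to ks.test() as described under Arguments.

```r
set.seed(42)

# Stand-in for simulated quantile residuals (hypothetical data)
u <- runif(100)

# (a) name of a cumulative distribution function as a character value
p_char <- ks.test(u, "punif")$p.value

# (b) the cumulative distribution function itself
p_fun <- ks.test(u, punif)$p.value

# (c) a numeric vector of reference values (two-sample test)
p_vec <- ks.test(u, runif(1000))$p.value

# One-sided alternative, as selected by the alternative argument
p_less <- ks.test(u, "punif", alternative = "less")$p.value

c(char = p_char, fun = p_fun, vec = p_vec, less = p_less)
```

Forms (a) and (b) are equivalent and give the same p-value, whereas (c) runs a two-sample test against the empirical distribution of the supplied values.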

Tests based on simulated residuals

For certain models, or models from certain families, tests like check_zeroinflation() or check_overdispersion() are based on simulated residuals. These are usually more accurate for such tests than the traditionally used Pearson residuals. However, when simulating from more complex models, such as mixed models or models with zero-inflation, there are several important considerations. simulate_residuals() relies on DHARMa::simulateResiduals(), and additional arguments specified in ... are passed on to that function. The defaults in DHARMa are set to the most conservative option that works for all models. However, in many cases the DHARMa documentation advises using different settings in particular situations or for particular models. It is recommended to read the 'Details' in ?DHARMa::simulateResiduals closely to understand the implications of the simulation process and which arguments should be modified to get the most accurate results.

Examples

dat <- DHARMa::createData(sampleSize = 100, overdispersion = 0.5, family = poisson())
m <- glm(observedResponse ~ Environment1, family = poisson(), data = dat)
res <- simulate_residuals(m)
check_residuals(res)
#> Warning: Non-uniformity of simulated residuals detected (p = 0.021).
#>