Compute the Highest Density Interval (HDI) of posterior distributions. All points within this interval have a higher probability density than points outside the interval. The HDI can be used in the context of uncertainty characterisation of posterior distributions as a Credible Interval (CI).
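For a vector of posterior draws, the idea can be sketched as finding the shortest window that contains the requested proportion of the sorted samples. This is a minimal illustration only; `hdi_sketch()` is a hypothetical helper, and bayestestR's internal implementation may differ in its details:

```r
# Minimal sketch of an HDI from posterior draws: among all windows that
# contain `ci` of the sorted samples, pick the narrowest one.
# Illustration only -- not the bayestestR implementation.
hdi_sketch <- function(x, ci = 0.89) {
  x <- sort(x)
  n <- length(x)
  window <- ceiling(ci * n)                      # draws inside the interval
  widths <- x[window:n] - x[1:(n - window + 1)]  # width of each candidate window
  lower <- which.min(widths)                     # start of the narrowest window
  c(CI_low = x[lower], CI_high = x[lower + window - 1])
}

set.seed(1)
hdi_sketch(rnorm(1000), ci = 0.89)
```

Because every point inside the narrowest window has higher density than any point outside it, this shortest-interval construction is equivalent to the density-based definition above for unimodal samples.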

hdi(x, ...)

# S3 method for numeric
hdi(x, ci = 0.89, verbose = TRUE, ...)

# S3 method for data.frame
hdi(x, ci = 0.89, verbose = TRUE, ...)

# S3 method for MCMCglmm
hdi(x, ci = 0.89, verbose = TRUE, ...)

# S3 method for sim.merMod
hdi(x, ci = 0.89, effects = c("fixed", "random",
  "all"), parameters = NULL, verbose = TRUE, ...)

# S3 method for sim
hdi(x, ci = 0.89, parameters = NULL, verbose = TRUE,
  ...)

# S3 method for emmGrid
hdi(x, ci = 0.89, verbose = TRUE, ...)

# S3 method for stanreg
hdi(x, ci = 0.89, effects = c("fixed", "random",
  "all"), parameters = NULL, verbose = TRUE, ...)

# S3 method for brmsfit
hdi(x, ci = 0.89, effects = c("fixed", "random",
  "all"), component = c("conditional", "zi", "zero_inflated", "all"),
  parameters = NULL, verbose = TRUE, ...)

# S3 method for BFBayesFactor
hdi(x, ci = 0.89, verbose = TRUE, ...)

Arguments

x

Vector representing a posterior distribution. Can also be a stanreg, brmsfit, or BayesFactor model.

...

Currently not used.

ci

Value or vector of probabilities of the (credible) interval (CI, between 0 and 1) to be estimated. Defaults to 0.89 (89%).

verbose

Toggle warnings (set to FALSE to suppress them).

effects

Should results for fixed effects, random effects or both be returned? Only applies to mixed models. May be abbreviated.

parameters

Regular expression pattern that describes the parameters that should be returned. Meta-parameters (like lp__ or prior_) are filtered by default, so only parameters that typically appear in the summary() are returned. Use parameters to select specific parameters for the output.

component

Should results for all parameters, parameters for the conditional model or the zero-inflated part of the model be returned? May be abbreviated. Only applies to brms-models.

Value

A data frame with the following columns:

  • Parameter The model parameter(s), if x is a model-object. If x is a vector, this column is missing.

  • CI The probability of the credible interval.

  • CI_low, CI_high The lower and upper credible interval limits for the parameters.

Details

Unlike equal-tailed intervals (see eti()) that typically exclude 2.5% from each tail of the distribution and always include the median, the HDI is not equal-tailed and therefore always includes the mode(s) of posterior distributions.

By default, hdi() and eti() return the 89% intervals (ci = 0.89), deemed to be more stable than, for instance, 95% intervals (Kruschke, 2014). An effective sample size of at least 10,000 is recommended if 95% intervals should be computed (Kruschke, 2014, p. 183ff). Moreover, 89 indicates the arbitrariness of interval limits: its only remarkable property is being the highest prime number that does not exceed the already unstable 95% threshold (McElreath, 2015).

A 90% equal-tailed interval (ETI) has 5% of the distribution on either side of its limits; it ranges from the 5th to the 95th percentile. For symmetric distributions, the two methods of computing credible intervals, the ETI and the HDI, return similar results.

This is not the case for skewed distributions. Indeed, it is possible that parameter values in the ETI have lower credibility (are less probable) than parameter values outside the ETI. This property seems undesirable as a summary of the credible values in a distribution.
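The difference is easy to see on a right-skewed posterior, sketched here with exponential draws (an illustrative choice, not from the original examples): the ETI cuts an equal share off each tail and so excludes the high-density region near zero, while the HDI, being the shortest interval, starts near the mode.

```r
library(bayestestR)

set.seed(123)
skewed <- rexp(10000)    # right-skewed "posterior" for illustration

# Equal-tailed: excludes 5.5% from each tail, so its lower bound sits
# above 0 even though density is highest there
eti(skewed, ci = 0.89)

# Highest density: the shortest 89% interval, starting near the mode at 0
hdi(skewed, ci = 0.89)
```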

On the other hand, the ETI is invariant under monotonic transformations of the distribution (for instance, from the log odds scale to probabilities): the lower and upper bounds of the transformed distribution correspond to the transformed lower and upper bounds of the original distribution. The HDI, in contrast, is not: applying a transformation to the distribution changes the resulting HDI.
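This invariance can be checked numerically. In this illustrative sketch (not from the original examples), log-odds draws are mapped to probabilities with plogis(): the transformed ETI bounds match the ETI of the transformed draws (up to quantile interpolation), while the same does not generally hold for the HDI.

```r
library(bayestestR)

set.seed(42)
log_odds <- rnorm(10000, mean = 0, sd = 1.5)
p <- plogis(log_odds)    # transform draws to the probability scale

# ETI commutes with the monotonic transformation: these two (nearly) agree
plogis(c(eti(log_odds, ci = 0.89)$CI_low, eti(log_odds, ci = 0.89)$CI_high))
c(eti(p, ci = 0.89)$CI_low, eti(p, ci = 0.89)$CI_high)

# The HDI does not: transforming the bounds is not the same as
# computing the HDI of the transformed draws
plogis(c(hdi(log_odds, ci = 0.89)$CI_low, hdi(log_odds, ci = 0.89)$CI_high))
c(hdi(p, ci = 0.89)$CI_low, hdi(p, ci = 0.89)$CI_high)
```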

References

  • Kruschke, J. (2014). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Academic Press.

  • McElreath, R. (2015). Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman and Hall/CRC.

Examples

library(bayestestR)

posterior <- rnorm(1000)
hdi(posterior, ci = .89)
#> # Highest Density Interval
#> 
#> 89% HDI
#> [-1.55, 1.58]
hdi(posterior, ci = c(.80, .90, .95))
#> # Highest Density Intervals
#> 
#> 80% HDI
#> [-1.21, 1.25]
#> 
#> 90% HDI
#> [-1.55, 1.66]
#> 
#> 95% HDI
#> [-2.06, 1.76]
df <- data.frame(replicate(4, rnorm(100)))
hdi(df)
#> # Highest Density Interval
#> 
#> Parameter 89% HDI
#> X1        [-0.93, 2.18]
#> X2        [-1.59, 1.54]
#> X3        [-1.75, 1.74]
#> X4        [-1.25, 1.28]
hdi(df, ci = c(.80, .90, .95))
#> # Highest Density Intervals
#> 
#> Parameter 80% HDI
#> X1        [-0.93, 1.59]
#> X2        [-1.29, 1.22]
#> X3        [-1.75, 1.09]
#> X4        [-0.69, 1.28]
#> 
#> Parameter 90% HDI
#> X1        [-1.15, 2.18]
#> X2        [-1.65, 1.54]
#> X3        [-1.75, 1.77]
#> X4        [-1.00, 1.66]
#> 
#> Parameter 95% HDI
#> X1        [-1.61, 2.18]
#> X2        [-1.65, 2.14]
#> X3        [-1.91, 2.08]
#> X4        [-1.56, 1.88]
library(rstanarm)
model <- stan_glm(mpg ~ wt + gear, data = mtcars, chains = 2, iter = 200)
#> (Stan sampling progress output omitted)
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior
#> means and medians may be unreliable. Running the chains for more iterations
#> may help. See http://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior
#> variances and tail quantiles may be unreliable. Running the chains for more
#> iterations may help. See http://mc-stan.org/misc/warnings.html#tail-ess
hdi(model)
#> # Highest Density Interval
#> 
#> Parameter   89% HDI
#> (Intercept) [31.70, 46.41]
#> wt          [-6.56, -4.56]
#> gear        [-1.57, 1.12]
hdi(model, ci = c(.80, .90, .95))
#> # Highest Density Intervals
#> 
#> Parameter   80% HDI
#> (Intercept) [33.51, 44.58]
#> wt          [-6.19, -4.56]
#> gear        [-1.34, 0.98]
#> 
#> Parameter   90% HDI
#> (Intercept) [30.99, 46.41]
#> wt          [-6.73, -4.56]
#> gear        [-1.71, 1.06]
#> 
#> Parameter   95% HDI
#> (Intercept) [28.59, 49.38]
#> wt          [-7.08, -4.34]
#> gear        [-1.75, 1.66]
library(emmeans)
hdi(emtrends(model, ~1, "wt"))
#> # Highest Density Interval
#> 
#> Parameter 89% HDI
#> overall   [-6.56, -4.56]
if (FALSE) {
  library(brms)
  model <- brms::brm(mpg ~ wt + cyl, data = mtcars)
  hdi(model)
  hdi(model, ci = c(.80, .90, .95))

  library(BayesFactor)
  bf <- ttestBF(x = rnorm(100, 1, 1))
  hdi(bf)
  hdi(bf, ci = c(.80, .90, .95))
}