Parameters from Bayesian Models

Model parameters from Bayesian models. This function internally calls bayestestR::describe_posterior() to get the relevant information for the output.

Usage

# S3 method for class 'data.frame'
model_parameters(
  model,
  as_draws = FALSE,
  exponentiate = FALSE,
  verbose = TRUE,
  ...
)

# S3 method for class 'brmsfit'
model_parameters(
  model,
  centrality = "median",
  dispersion = FALSE,
  ci = 0.95,
  ci_method = "eti",
  test = "pd",
  rope_range = "default",
  rope_ci = 0.95,
  bf_prior = NULL,
  diagnostic = c("ESS", "Rhat"),
  priors = FALSE,
  effects = "fixed",
  component = "all",
  exponentiate = FALSE,
  standardize = NULL,
  group_level = FALSE,
  keep = NULL,
  drop = NULL,
  verbose = TRUE,
  ...
)

Arguments

model

Bayesian model (including SEM from blavaan. May also be a data frame with posterior samples, however, as_draws must be set to TRUE (else, for data frames NULL is returned).

as_draws

Logical, if TRUE and model is of class data.frame, the data frame is treated as posterior samples and handled similar to Bayesian models. All arguments in ... are passed to model_parameters.draws().

exponentiate

Logical, indicating whether or not to exponentiate the coefficients (and related confidence intervals). This is typical for logistic regression, or more generally speaking, for models with log or logit links. It is also recommended to use exponentiate = TRUE for models with log-transformed response values. For models with a log-transformed response variable, when exponentiate = TRUE, a one-unit increase in the predictor is associated with multiplying the outcome by that predictor's coefficient. Note: Delta-method standard errors are also computed (by multiplying the standard errors by the transformed coefficients). This is to mimic behaviour of other software packages, such as Stata, but these standard errors poorly estimate uncertainty for the transformed coefficient. The transformed confidence interval more clearly captures this uncertainty. For compare_parameters(), exponentiate = "nongaussian" will only exponentiate coefficients from non-Gaussian families.

verbose

Toggle warnings and messages.

...

Arguments passed to or from other methods. For instance, when bootstrap = TRUE, arguments like type or parallel are passed down to bootstrap_model().

Further non-documented arguments are:

digits, p_digits, ci_digits and footer_digits to set the number of digits for the output. groups can be used to group coefficients. These arguments will be passed to the print-method, or can directly be used in print(), see documentation in print.parameters_model().
If s_value = TRUE, the p-value will be replaced by the S-value in the output (cf. Rafi and Greenland 2020).
pd adds an additional column with the probability of direction (see bayestestR::p_direction() for details). Furthermore, see 'Examples' for this function.
For developers, whose interest mainly is to get a "tidy" data frame of model summaries, it is recommended to set pretty_names = FALSE to speed up computation of the summary table.

centrality

The point-estimates (centrality indices) to compute. Character (vector) or list with one or more of these options: "median", "mean", "MAP" (see map_estimate()), "trimmed" (which is just mean(x, trim = threshold)), "mode" or "all".

dispersion

Logical, if TRUE, computes indices of dispersion related to the estimate(s) (SD and MAD for mean and median, respectively). Dispersion is not available for "MAP" or "mode" centrality indices.

ci

Credible Interval (CI) level. Default to 0.95 (95%). See bayestestR::ci() for further details.

ci_method

Method for computing degrees of freedom for confidence intervals (CI) and the related p-values. Allowed are following options (which vary depending on the model class): "residual", "normal", "likelihood", "satterthwaite", "kenward", "wald", "profile", "boot", "uniroot", "ml1", "betwithin", "hdi", "quantile", "ci", "eti", "si", "bci", or "bcai". See section Confidence intervals and approximation of degrees of freedom in model_parameters() for further details. When ci_method=NULL, in most cases "wald" is used then.

test

The indices of effect existence to compute. Character (vector) or list with one or more of these options: "p_direction" (or "pd"), "rope", "p_map", "p_significance" (or "ps"), "p_rope", "equivalence_test" (or "equitest"), "bayesfactor" (or "bf") or "all" to compute all tests. For each "test", the corresponding bayestestR function is called (e.g. rope() or p_direction()) and its results included in the summary output.

rope_range

ROPE's lower and higher bounds. Should be a vector of two values (e.g., c(-0.1, 0.1)), "default" or a list of numeric vectors of the same length as numbers of parameters. If "default", the bounds are set to x +- 0.1*SD(response).

rope_ci

The Credible Interval (CI) probability, corresponding to the proportion of HDI, to use for the percentage in ROPE.

bf_prior

Distribution representing a prior for the computation of Bayes factors / SI. Used if the input is a posterior, otherwise (in the case of models) ignored.

diagnostic

Diagnostic metrics to compute. Character (vector) or list with one or more of these options: "ESS", "Rhat", "MCSE" or "all".

priors

Add the prior used for each parameter.

effects

Should variables for fixed effects ("fixed"), random effects ("random") or both ("all") be returned? Only applies to mixed models. May be abbreviated.

For models of from packages brms or rstanarm there are additional options:

"fixed" returns fixed effects.
"random_variance" return random effects parameters (variance and correlation components, e.g. those parameters that start with sd_ or cor_).
"grouplevel" returns random effects group level estimates, i.e. those parameters that start with r_.
"random" returns both "random_variance" and "grouplevel".
"all" returns fixed effects and random effects variances.
"full" returns all parameters.

component

Which type of parameters to return, such as parameters for the conditional model, the zero-inflation part of the model, the dispersion term, or other auxiliary parameters be returned? Applies to models with zero-inflation and/or dispersion formula, or if parameters such as sigma should be included. May be abbreviated. Note that the conditional component is also called count or mean component, depending on the model. There are three convenient shortcuts: component = "all" returns all possible parameters. If component = "location", location parameters such as conditional, zero_inflated, or smooth_terms, are returned (everything that are fixed or random effects - depending on the effects argument - but no auxiliary parameters). For component = "distributional" (or "auxiliary"), components like sigma, dispersion, or beta (and other auxiliary parameters) are returned.

standardize

The method used for standardizing the parameters. Can be NULL (default; no standardization), "refit" (for re-fitting the model on standardized data) or one of "basic", "posthoc", "smart", "pseudo". See 'Details' in standardize_parameters(). Importantly:

The "refit" method does not standardize categorical predictors (i.e. factors), which may be a different behaviour compared to other R packages (such as lm.beta) or other software packages (like SPSS). to mimic such behaviours, either use standardize="basic" or standardize the data with datawizard::standardize(force=TRUE) before fitting the model.
By default, the response (dependent) variable is also standardized, if applicable. Set include_response = FALSE to avoid standardization of the response variable. See details in datawizard::standardize.default().
For mixed models, when using methods other than "refit", only the fixed effects will be standardized.
Robust estimation (i.e., vcov set to a value other than NULL) of standardized parameters only works when standardize="refit".

group_level

Logical, for multilevel models (i.e. models with random effects) and when effects = "random", return the parameters for each group level from random effects only. If group_level = FALSE (the default), also information on SD and COR are returned. Note that this argument is superseded by the new options for the effects argument. effects = "grouplevel" should be used instead of group_level = TRUE.

keep

Character containing a regular expression pattern that describes the parameters that should be included (for keep) or excluded (for drop) in the returned data frame. keep may also be a named list of regular expressions. All non-matching parameters will be removed from the output. If keep is a character vector, every parameter name in the "Parameter" column that matches the regular expression in keep will be selected from the returned data frame (and vice versa, all parameter names matching drop will be excluded). Furthermore, if keep has more than one element, these will be merged with an OR operator into a regular expression pattern like this: "(one|two|three)". If keep is a named list of regular expression patterns, the names of the list-element should equal the column name where selection should be applied. This is useful for model objects where model_parameters() returns multiple columns with parameter components, like in model_parameters.lavaan(). Note that the regular expression pattern should match the parameter names as they are stored in the returned data frame, which can be different from how they are printed. Inspect the $Parameter column of the parameters table to get the exact parameter names.

drop

See keep.

Value

A data frame of indices related to the model's parameters.

Note

When standardize = "refit", columns diagnostic, bf_prior and priors refer to the original model. If model is a data frame, arguments diagnostic, bf_prior and priors are ignored.

There is also a plot()-method implemented in the see-package.

Confidence intervals and approximation of degrees of freedom

There are different ways of approximating the degrees of freedom depending on different assumptions about the nature of the model and its sampling distribution. The ci_method argument modulates the method for computing degrees of freedom (df) that are used to calculate confidence intervals (CI) and the related p-values. Following options are allowed, depending on the model class:

Classical methods:

Classical inference is generally based on the Wald method. The Wald approach to inference computes a test statistic by dividing the parameter estimate by its standard error (Coefficient / SE), then comparing this statistic against a t- or normal distribution. This approach can be used to compute CIs and p-values.

"wald":

Applies to non-Bayesian models. For linear models, CIs computed using the Wald method (SE and a t-distribution with residual df); p-values computed using the Wald method with a t-distribution with residual df. For other models, CIs computed using the Wald method (SE and a normal distribution); p-values computed using the Wald method with a normal distribution.

"normal"

Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a normal distribution.

"residual"

Applies to non-Bayesian models. Compute Wald CIs and p-values, but always use a t-distribution with residual df when possible. If the residual df for a model cannot be determined, a normal distribution is used instead.

Methods for mixed models:

Compared to fixed effects (or single-level) models, determining appropriate df for Wald-based inference in mixed models is more difficult. See the R GLMM FAQ for a discussion.

Several approximate methods for computing df are available, but you should also consider instead using profile likelihood ("profile") or bootstrap ("boot") CIs and p-values instead.

"satterthwaite"

Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with Satterthwaite df); p-values computed using the Wald method with a t-distribution with Satterthwaite df.

"kenward"

Applies to linear mixed models. CIs computed using the Wald method (Kenward-Roger SE and a t-distribution with Kenward-Roger df); p-values computed using the Wald method with Kenward-Roger SE and t-distribution with Kenward-Roger df.

"ml1"

Applies to linear mixed models. CIs computed using the Wald method (SE and a t-distribution with m-l-1 approximated df); p-values computed using the Wald method with a t-distribution with m-l-1 approximated df. See ci_ml1().

"betwithin"

Applies to linear mixed models and generalized linear mixed models. CIs computed using the Wald method (SE and a t-distribution with between-within df); p-values computed using the Wald method with a t-distribution with between-within df. See ci_betwithin().

Likelihood-based methods:

Likelihood-based inference is based on comparing the likelihood for the maximum-likelihood estimate to the the likelihood for models with one or more parameter values changed (e.g., set to zero or a range of alternative values). Likelihood ratios for the maximum-likelihood and alternative models are compared to a $\chi$-squared distribution to compute CIs and p-values.

"profile"

Applies to non-Bayesian models of class glm, polr, merMod or glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using linear interpolation to find where likelihood ratio equals a critical value; p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!)

"uniroot"

Applies to non-Bayesian models of class glmmTMB. CIs computed by profiling the likelihood curve for a parameter, using root finding to find where likelihood ratio equals a critical value; p-values computed using the Wald method with a normal-distribution (note: this might change in a future update!)

Methods for bootstrapped or Bayesian models:

Bootstrap-based inference is based on resampling and refitting the model to the resampled datasets. The distribution of parameter estimates across resampled datasets is used to approximate the parameter's sampling distribution. Depending on the type of model, several different methods for bootstrapping and constructing CIs and p-values from the bootstrap distribution are available.

For Bayesian models, inference is based on drawing samples from the model posterior distribution.

"quantile" (or "eti")

Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as equal tailed intervals using the quantiles of the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::eti().

"hdi"

Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as highest density intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::hdi().

"bci" (or "bcai")

Applies to all models (including Bayesian models). For non-Bayesian models, only applies if bootstrap = TRUE. CIs computed as bias corrected and accelerated intervals for the bootstrap or posterior samples; p-values are based on the probability of direction. See bayestestR::bci().

"si"

Applies to Bayesian models with proper priors. CIs computed as support intervals comparing the posterior samples against the prior samples; p-values are based on the probability of direction. See bayestestR::si().

"boot"

Applies to non-Bayesian models of class merMod. CIs computed using parametric bootstrapping (simulating data from the fitted model); p-values computed using the Wald method with a normal-distribution) (note: this might change in a future update!).

For all iteration-based methods other than "boot" ("hdi", "quantile", "ci", "eti", "si", "bci", "bcai"), p-values are based on the probability of direction (bayestestR::p_direction()), which is converted into a p-value using bayestestR::pd_to_p().

Model components

Possible values for the component argument depend on the model class. Following are valid options:

"all": returns all model components, applies to all models, but will only have an effect for models with more than just the conditional model component.
"conditional": only returns the conditional component, i.e. "fixed effects" terms from the model. Will only have an effect for models with more than just the conditional model component.
"smooth_terms": returns smooth terms, only applies to GAMs (or similar models that may contain smooth terms).
"zero_inflated" (or "zi"): returns the zero-inflation component.
"dispersion": returns the dispersion model component. This is common for models with zero-inflation or that can model the dispersion parameter.
"instruments": for instrumental-variable or some fixed effects regression, returns the instruments.
"nonlinear": for non-linear models (like models of class nlmerMod or nls), returns staring estimates for the nonlinear parameters.
"correlation": for models with correlation-component, like gls, the variables used to describe the correlation structure are returned.

Special models

Some model classes also allow rather uncommon options. These are:

mhurdle: "infrequent_purchase", "ip", and "auxiliary"
BGGM: "correlation" and "intercept"
BFBayesFactor, glmx: "extra"
averaging:"conditional" and "full"
mjoint: "survival"
mfx: "precision", "marginal"
betareg, DirichletRegModel: "precision"
mvord: "thresholds" and "correlation"
clm2: "scale"
selection: "selection", "outcome", and "auxiliary"
lavaan: One or more of "regression", "correlation", "loading", "variance", "defined", or "mean". Can also be "all" to include all components.

For models of class brmsfit (package brms), even more options are possible for the component argument, which are not all documented in detail here.

Examples

# \donttest{
library(parameters)
model <- suppressWarnings(stan_glm(
  Sepal.Length ~ Petal.Length * Species,
  data = iris, iter = 500, refresh = 0
))
model_parameters(model)
#> Parameter                      | Median |         95% CI |     pd |  Rhat
#> -------------------------------------------------------------------------
#> (Intercept)                    |   4.11 | [ 3.48,  4.86] |   100% | 1.018
#> Petal.Length                   |   0.61 | [ 0.09,  1.04] | 98.80% | 1.019
#> Speciesversicolor              |  -1.69 | [-2.81, -0.69] |   100% | 1.012
#> Speciesvirginica               |  -2.98 | [-4.14, -1.97] |   100% | 1.007
#> Petal.Length:Speciesversicolor |   0.21 | [-0.24,  0.76] | 81.50% | 1.021
#> Petal.Length:Speciesvirginica  |   0.37 | [-0.06,  0.90] | 94.60% | 1.018
#> 
#> Parameter                      |    ESS |                 Prior
#> ---------------------------------------------------------------
#> (Intercept)                    | 206.00 | Normal (5.84 +- 2.07)
#> Petal.Length                   | 199.00 | Normal (0.00 +- 1.17)
#> Speciesversicolor              | 377.00 | Normal (0.00 +- 4.38)
#> Speciesvirginica               | 235.00 | Normal (0.00 +- 4.38)
#> Petal.Length:Speciesversicolor | 193.00 | Normal (0.00 +- 1.02)
#> Petal.Length:Speciesvirginica  | 172.00 | Normal (0.00 +- 0.78)
#> 
#> Uncertainty intervals (equal-tailed) computed using a MCMC
#>   distribution approximation.
# }