
This function "pools" (i.e. combines) model parameters in a similar fashion to mice::pool(). However, it pools parameters from parameters_model objects, as returned by model_parameters().

Usage

pool_parameters(
  x,
  exponentiate = FALSE,
  effects = "fixed",
  component = "all",
  verbose = TRUE,
  ...
)

Arguments

x

A list of parameters_model objects, as returned by model_parameters(), or a list of model objects that are supported by model_parameters().

exponentiate

Logical, indicating whether or not to exponentiate the coefficients (and related confidence intervals). This is typical for logistic regression, or more generally speaking, for models with log or logit links. It is also recommended to use exponentiate = TRUE for models with log-transformed response values. For models with a log-transformed response variable, when exponentiate = TRUE, a one-unit increase in the predictor is associated with multiplying the outcome by that predictor's exponentiated coefficient. Note: Delta-method standard errors are also computed (by multiplying the standard errors by the transformed coefficients). This is to mimic the behaviour of other software packages, such as Stata, but these standard errors poorly estimate uncertainty for the transformed coefficient. The transformed confidence interval more clearly captures this uncertainty. For compare_parameters(), exponentiate = "nongaussian" will only exponentiate coefficients from non-Gaussian families.
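The difference between the delta-method standard error and the transformed confidence interval can be sketched in base R. This is an illustration only, not the package's internal code: the dataset (mtcars), the model, and the use of a Wald interval on the link scale are assumptions made for the example, and model_parameters() may use a different interval method by default.

```r
# Logistic regression on a built-in dataset (illustrative choice).
m <- glm(am ~ wt, data = mtcars, family = binomial)

est <- coef(summary(m))[, "Estimate"]
se <- coef(summary(m))[, "Std. Error"]

# Exponentiated coefficients (odds ratios for a logit link).
or <- exp(est)

# Delta-method standard error: the standard error on the link scale
# multiplied by the transformed coefficient.
se_or <- or * se

# Transformed confidence interval: build the interval on the link
# scale first (Wald interval assumed here), then exponentiate its bounds.
z <- qnorm(0.975)
ci_or <- cbind(low = exp(est - z * se), high = exp(est + z * se))

# Compare: the transformed interval is asymmetric around the odds ratio,
# whereas an interval built from `se_or` would be symmetric.
cbind(or, se_or, ci_or)
```

Because the exponential function is convex, the upper arm of the transformed interval is always longer than the lower arm, which a single symmetric standard error cannot express.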

effects

Should parameters for fixed effects ("fixed"), random effects ("random"), or both ("all") be returned? Only applies to mixed models. May be abbreviated. If the calculation of random effects parameters takes too long, you may use effects = "fixed".

component

Should all parameters, parameters for the conditional model, for the zero-inflation part of the model, or for the dispersion model be returned? Applies to models with a zero-inflation and/or dispersion component. component may be one of "conditional", "zi", "zero-inflated", "dispersion" or "all" (default). May be abbreviated.

verbose

Toggle warnings and messages.

...

Arguments passed down to model_parameters(), if x is a list of model-objects. Can be used, for instance, to specify arguments like ci or ci_method etc.
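As a hedged sketch of how `...` is forwarded, the following fits the same linear model to each imputed dataset and requests 90% intervals. It assumes the parameters and mice packages are installed; the seed value and the object names pooled_90 and pooled_95 are illustrative choices.

```r
library(parameters)

data("nhanes2", package = "mice")
imp <- mice::mice(nhanes2, printFlag = FALSE, seed = 123)
models <- lapply(1:5, function(i) {
  lm(bmi ~ age + hyp + chl, data = mice::complete(imp, action = i))
})

# `ci` is not a formal argument of pool_parameters(); it travels
# through `...` to model_parameters() for each model in the list.
pooled_90 <- pool_parameters(models, ci = 0.90)

# Default call for comparison (95% intervals).
pooled_95 <- pool_parameters(models)
```

The 90% intervals in pooled_90 should be narrower than the 95% intervals in pooled_95 for every parameter.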

Value

A data frame of indices related to the model's parameters.

Details

Averaging of parameters follows Rubin's rules (Rubin, 1987, p. 76). The pooled degrees of freedom are based on the Barnard-Rubin adjustment for small samples (Barnard and Rubin, 1999).
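For readers who want to see the arithmetic, here is a minimal base R sketch of the textbook formulas for a single parameter. It is not the package's internal code: pool_rubin() is a hypothetical helper written for this illustration, the input numbers are made up, and the small lower bound on lambda is an assumed guard against division by zero.

```r
# Hypothetical helper (not part of the package): pool one parameter
# across m imputed datasets.
#   est:   point estimates, one per imputed dataset
#   se:    the corresponding standard errors
#   dfcom: complete-data (residual) degrees of freedom
pool_rubin <- function(est, se, dfcom = Inf) {
  m <- length(est)
  q_bar <- mean(est)                # pooled estimate
  u_bar <- mean(se^2)               # within-imputation variance
  b <- var(est)                     # between-imputation variance
  t_var <- u_bar + (1 + 1 / m) * b  # total variance

  # Share of the total variance that is due to missing data.
  lambda <- (1 + 1 / m) * b / t_var
  # Assumed guard: avoids dividing by zero when all estimates agree.
  lambda <- max(lambda, 1e-04)

  # Barnard-Rubin adjusted degrees of freedom.
  df_old <- (m - 1) / lambda^2
  if (is.infinite(dfcom)) {
    df <- df_old
  } else {
    df_obs <- (dfcom + 1) / (dfcom + 3) * dfcom * (1 - lambda)
    df <- df_old * df_obs / (df_old + df_obs)
  }

  c(estimate = q_bar, se = sqrt(t_var), df = df)
}

# Made-up inputs, for illustration only.
pooled <- pool_rubin(
  est = c(1.0, 1.2, 0.9, 1.1, 1.3),
  se = rep(0.2, 5),
  dfcom = 20
)
pooled
```

With a finite dfcom the adjusted degrees of freedom never exceed the complete-data degrees of freedom, which is the point of the small-sample adjustment.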

Note

Models with multiple components (for instance, models with zero-inflation, where predictors appear in both the count and the zero-inflation part, or models with a dispersion component) may fail in rare situations. In this case, compute the pooled parameters for each component separately, using the component argument.

Some model objects do not return standard errors (e.g. objects of class htest). For these models, neither pooled confidence intervals nor p-values are returned.

References

Barnard, J. and Rubin, D.B. (1999). Small sample degrees of freedom with multiple imputation. Biometrika, 86, 948-955.

Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: John Wiley and Sons.

Examples

# example for multiply imputed datasets
data("nhanes2", package = "mice")
imp <- mice::mice(nhanes2, printFlag = FALSE)
models <- lapply(1:5, function(i) {
  lm(bmi ~ age + hyp + chl, data = mice::complete(imp, action = i))
})
pool_parameters(models)
#> # Fixed Effects
#> 
#> Parameter   | Coefficient |   SE |          95% CI | Statistic |    df |      p
#> -------------------------------------------------------------------------------
#> (Intercept) |       18.14 | 3.55 | [ 10.39, 25.90] |      5.12 | 11.63 | < .001
#> age [40-59] |       -6.16 | 2.20 | [-11.54, -0.77] |     -2.80 |  5.97 | 0.031 
#> age [60-99] |       -7.73 | 2.46 | [-13.61, -1.85] |     -3.14 |  6.64 | 0.017 
#> hyp [yes]   |        2.47 | 2.07 | [ -2.58,  7.51] |      1.19 |  6.15 | 0.278 
#> chl         |        0.06 | 0.02 | [  0.01,  0.11] |      2.90 | 10.21 | 0.015 
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed)
#>   computed using a Wald distribution approximation.

# should be identical to:
m <- with(data = imp, expr = lm(bmi ~ age + hyp + chl))
summary(mice::pool(m))
#>          term    estimate  std.error statistic        df      p.value
#> 1 (Intercept) 18.14256305 3.54562901  5.116881 11.625350 0.0002813879
#> 2    age40-59 -6.15715380 2.19792337 -2.801351  5.969005 0.0312810653
#> 3    age60-99 -7.72866592 2.45997959 -3.141760  6.642762 0.0174969755
#> 4      hypyes  2.46673562 2.07396774  1.189380  6.147560 0.2781911871
#> 5         chl  0.06028557 0.02078061  2.901050 10.206904 0.0154916183

# For glm, mice uses residual df, while `pool_parameters()` uses `Inf`
nhanes2$hyp <- datawizard::slide(as.numeric(nhanes2$hyp))
imp <- mice::mice(nhanes2, printFlag = FALSE)
models <- lapply(1:5, function(i) {
  glm(hyp ~ age + chl, family = binomial, data = mice::complete(imp, action = i))
})
m <- with(data = imp, expr = glm(hyp ~ age + chl, family = binomial))
# residual df
summary(mice::pool(m))$df
#> [1] 19.24807 19.24807 19.24807 11.91591
# df = Inf
pool_parameters(models)$df_error
#> [1] Inf Inf Inf Inf
# use residual df instead
pool_parameters(models, ci_method = "residual")$df_error
#> [1] 19.24807 19.24807 19.24807 11.91591