Run a contrast analysis by estimating the differences between each level of a factor. See also other related functions such as estimate_means() and estimate_slopes().

Usage

estimate_contrasts(
  model,
  contrast = NULL,
  at = NULL,
  fixed = NULL,
  transform = "none",
  ci = 0.95,
  adjust = "holm",
  method = "pairwise",
  ...
)

Arguments

model

A statistical model.

contrast

A character vector indicating the name of the variable(s) for which to compute the contrasts.

at

The predictor variable(s) at which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them).

fixed

A character vector indicating the names of the predictors to be "fixed" (i.e., maintained), so that the estimation is made at these values.

transform

Passed to the type argument of emmeans::emmeans(). See the corresponding emmeans vignette. Can be "none" (default for contrasts), "response" (default for means), "mu", "unlink" or "log". "none" leaves the values on the scale of the linear predictor. "response" transforms them to the scale of the response variable. Thus, for a logistic model, "none" gives estimates expressed in log-odds (probabilities on the logit scale) and "response" gives estimates expressed as probabilities.
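To make the two scales concrete, here is a minimal base R sketch that does not call modelbased or emmeans. It assumes a logistic regression on the built-in mtcars data (the model and the two wt values are chosen only for illustration) and uses predict() with type = "link" and type = "response" as stand-ins for what "none" and "response" mean.

```r
# Logistic regression on a built-in dataset (illustrative choice of model)
model <- glm(am ~ wt, data = mtcars, family = binomial())

# Two hypothetical predictor values, one unit apart
newdata <- data.frame(wt = c(2.5, 3.5))

# Linear-predictor scale (log-odds), analogous to transform = "none"
log_odds <- predict(model, newdata = newdata, type = "link")

# Response scale (probabilities), analogous to transform = "response"
probs <- predict(model, newdata = newdata, type = "response")

# The inverse logit maps the link scale onto the response scale
all.equal(unname(plogis(log_odds)), unname(probs))

# On the link scale, the difference between the two predictions is a
# log odds ratio; here it equals the wt coefficient because the two
# wt values differ by exactly 1
link_difference <- unname(diff(log_odds))
all.equal(link_difference, unname(coef(model)["wt"]))
```

A difference computed on the link scale and a difference computed on the probability scale answer different questions, which is presumably why the default differs between contrasts and means.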

ci

Confidence Interval (CI) level. Defaults to 0.95 (95%).

adjust

The p-value adjustment method for frequentist multiple comparisons. Can be one of "holm" (default), "tukey", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr" or "none". See the p-value adjustment section in the emmeans::test documentation.
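As a rough illustration of what such an adjustment does, the sketch below applies base R's stats::p.adjust() to three made-up p-values (they are not output from any model). It only covers methods that p.adjust() itself implements; "tukey" is not among them and is handled on the emmeans side.

```r
# Hypothetical raw p-values from three pairwise comparisons
p_raw <- c(0.010, 0.020, 0.040)

# Holm: step-down procedure, never more conservative than Bonferroni
p_holm <- p.adjust(p_raw, method = "holm")
p_holm
# 0.03 0.04 0.04

# Bonferroni: multiply every p-value by the number of comparisons
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_bonf
# 0.03 0.06 0.12

# Benjamini-Hochberg: controls the false discovery rate
# ("fdr" is an alias for "BH" in p.adjust())
p_bh <- p.adjust(p_raw, method = "BH")
p_bh
# 0.03 0.03 0.04
```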

method

Contrast method. See the same argument in emmeans::contrast().

...

Other arguments passed, for instance, to insight::get_datagrid().

Value

A data frame of estimated contrasts.

Details

Don't forget to also check out the Vignettes and README examples for tutorials and use cases.

The estimate_slopes(), estimate_means() and estimate_contrasts() functions form a group, as they are all based on marginal estimations (estimations based on a model). All three are built on the emmeans package, so reading its documentation (for instance for emmeans::emmeans() and emmeans::emtrends()) is recommended to understand the idea behind these types of procedures.

  • Model-based predictions are the basis for all that follows. The first thing to understand is how models can be used to make predictions (see estimate_link()). A prediction is the response (or "outcome variable") expected by the model given specific values of the predictors (i.e., given a specific data configuration). This is why the concept of a reference grid is so important for direct predictions.

  • Marginal "means", obtained via estimate_means(), are an extension of such predictions: they "average" (collapse) over some of the predictors to obtain the average response value at a specific configuration of the remaining predictors. This is typically used when some of the predictors of interest are factors. The parameters of the model will usually give you the intercept value and then the "effect" of each factor level (how different it is from the intercept), whereas marginal means directly give you the mean value of the response variable at each level of a factor. They can also be used to control for, or average over, other predictors, which is useful in the case of multiple predictors with or without interactions.

  • Marginal contrasts, obtained via estimate_contrasts(), are themselves an extension of marginal means, in that they investigate the difference (i.e., the contrast) between marginal means. This is often used to get all pairwise differences between the levels of a factor. It also works for continuous predictors: for instance, one could be interested in whether the difference between two extremes of a continuous predictor is significant.

  • Finally, marginal effects, obtained via estimate_slopes(), are different in that their focus is not on values of the response variable, but on the model's parameters. The idea is to assess the effect of a predictor at a specific configuration of the other predictors. This is relevant in the case of interactions or non-linear relationships, when the effect of a predictor variable changes depending on the other predictors. These effects can also be "averaged" over other predictors, to get for instance the "general trend" of a predictor over different factor levels.
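The first three ideas in the list above (predictions on a reference grid, means per factor level, and differences between them) can be sketched by hand in base R. This is only an illustration of the concept, not how modelbased or emmeans compute their results: it assumes the numeric predictor is held at its sample mean, and it produces point estimates only, with no standard errors, confidence intervals or p-values.

```r
model <- lm(Sepal.Width ~ Species * Petal.Width, data = iris)

# A hand-made reference grid: every Species level, with Petal.Width
# held at its sample mean (an assumption made for this sketch)
grid <- expand.grid(
  Species = levels(iris$Species),
  Petal.Width = mean(iris$Petal.Width)
)

# Model-based predictions on the grid: one estimated mean per level
grid$Predicted <- predict(model, newdata = grid)

# Pairwise differences between those estimated means
idx <- combn(nrow(grid), 2)
differences <- data.frame(
  Level1 = grid$Species[idx[1, ]],
  Level2 = grid$Species[idx[2, ]],
  Difference = grid$Predicted[idx[1, ]] - grid$Predicted[idx[2, ]]
)

grid
differences
```

What the dedicated functions add on top of this arithmetic is the uncertainty of each quantity (standard errors, confidence intervals, adjusted p-values).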

Example: Let's imagine the following model, lm(y ~ condition * x), where condition is a factor with 3 levels (A, B and C) and x is a continuous variable (like age, for example). One idea is to see how this model performs, and compare the actual response y to the one predicted by the model (using estimate_response()). Another idea is to evaluate the mean at each of the condition's levels (using estimate_means()), which can be useful to visualize them. Another possibility is to evaluate the difference between these levels (using estimate_contrasts()). Finally, one could also estimate the effect of x averaged over all conditions, or instead within each condition (using estimate_slopes()).
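The last idea, the effect of x within each condition or averaged over conditions, can be sketched with base R on simulated data. Everything here is hypothetical: the data, the "true" slopes and the variable names are invented for illustration, and the slopes are read directly from the model coefficients rather than obtained from estimate_slopes().

```r
set.seed(123)

# Simulated data: 3 conditions, each with its own slope for x
n <- 90
dat <- data.frame(
  condition = factor(rep(c("A", "B", "C"), each = n / 3)),
  x = runif(n, min = 20, max = 60)
)
true_slopes <- c(A = 0.1, B = 0.3, C = 0.5)
dat$y <- 1 + unname(true_slopes[as.character(dat$condition)]) * dat$x + rnorm(n)

model <- lm(y ~ condition * x, data = dat)
b <- coef(model)

# Slope of x within each condition: the reference-level slope plus
# the corresponding interaction coefficient
slopes <- c(
  A = unname(b["x"]),
  B = unname(b["x"] + b["conditionB:x"]),
  C = unname(b["x"] + b["conditionC:x"])
)
slopes

# Slope of x averaged over conditions, giving each condition equal weight
average_slope <- mean(slopes)
average_slope
```

Because the model contains the full interaction, each within-condition slope equals the slope from a separate regression fitted to that condition alone, which is a convenient way to check the arithmetic.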

Examples

library(modelbased)
if (require("emmeans", quietly = TRUE)) {
# Basic usage
model <- lm(Sepal.Width ~ Species, data = iris)
estimate_contrasts(model)

# Dealing with interactions
model <- lm(Sepal.Width ~ Species * Petal.Width, data = iris)
# By default: selects first factor
estimate_contrasts(model)
# Can also run contrasts between points of a numeric predictor
estimate_contrasts(model, contrast = "Petal.Width", length = 4)
# Or both
estimate_contrasts(model, contrast = c("Species", "Petal.Width"), length = 2)
# Or with custom specifications
estimate_contrasts(model, contrast = c("Species", "Petal.Width=c(1, 2)"))
# Can fix the numeric predictor at a specific value
estimate_contrasts(model, fixed = "Petal.Width")
# Or modulate it
estimate_contrasts(model, at = "Petal.Width", length = 4)

# Standardized differences
estimated <- estimate_contrasts(lm(Sepal.Width ~ Species, data = iris))
standardize(estimated)

# Other models (mixed, Bayesian, ...)
if (require("lme4")) {
  data <- iris
  data$Petal.Length_factor <- ifelse(data$Petal.Length < 4.2, "A", "B")

  model <- lmer(Sepal.Width ~ Species + (1 | Petal.Length_factor), data = data)
  estimate_contrasts(model)
}

data <- mtcars
data$cyl <- as.factor(data$cyl)
data$am <- as.factor(data$am)
if (FALSE) {
if (require("rstanarm")) {
  model <- stan_glm(mpg ~ cyl * am, data = data, refresh = 0)
  estimate_contrasts(model)
  estimate_contrasts(model, fixed = "am")

  model <- stan_glm(mpg ~ cyl * wt, data = data, refresh = 0)
  estimate_contrasts(model)
  estimate_contrasts(model, fixed = "wt")
  estimate_contrasts(model, at = "wt", length = 4)

  model <- stan_glm(Sepal.Width ~ Species + Petal.Width + Petal.Length, data = iris, refresh = 0)
  estimate_contrasts(model, at = "Petal.Length", test = "bf")
}
}
}
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> NOTE: Results may be misleading due to involvement in interactions
#> NOTE: Results may be misleading due to involvement in interactions
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> Loading required package: lme4
#> Loading required package: Matrix
#> No variable was specified for contrast estimation. Selecting `contrast = "Species"`.
#> Cannot use mode = "kenward-roger" because *pbkrtest* package is not installed
#> Cannot use mode = "satterthwaite" because *lmerTest* package is not installed