Formatting Model Parameters

The parameters package, together with the insight package, provides tools to format the layout and style of tables from model parameters. When you use the model_parameters() function, you usually don’t have to worry about formatting and layout, at least not for simple purposes like printing to the console or inside rmarkdown documents. However, sometimes you may want to do the formatting steps manually. This vignette introduces the various functions that are used for formatting parameters tables.

An Example Model

We start with a model that does not make much sense, but it is useful for demonstrating the formatting functions.

data(iris)
iris$Petlen <- cut(iris$Petal.Length, breaks = c(0, 3, 7))
model <- lm(Sepal.Width ~ poly(Sepal.Length, 2) + Species + Petlen, data = iris)

summary(model)
#> 
#> Call:
#> lm(formula = Sepal.Width ~ poly(Sepal.Length, 2) + Species + 
#>     Petlen, data = iris)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -0.7742 -0.1490 -0.0056  0.1666  0.6973 
#> 
#> Coefficients:
#>                        Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)              3.8127     0.0582   65.50  < 2e-16 ***
#> poly(Sepal.Length, 2)1   4.0602     0.4668    8.70    7e-15 ***
#> poly(Sepal.Length, 2)2  -1.3024     0.3149   -4.14    6e-05 ***
#> Speciesversicolor       -1.0056     0.2781   -3.62  0.00041 ***
#> Speciesvirginica        -0.9913     0.2851   -3.48  0.00067 ***
#> Petlen(3,7]             -0.1360     0.2818   -0.48  0.63019    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.28 on 144 degrees of freedom
#> Multiple R-squared:  0.615,  Adjusted R-squared:  0.602 
#> F-statistic:   46 on 5 and 144 DF,  p-value: <2e-16

Formatting Parameter Names

As we can see, the standard R output looks a bit cryptic in such cases, although all necessary and important information is included in the summary. The coefficient names for polynomial terms are difficult to read. Factors created with cut() always require a moment of thought to work out which of the bounds (in this case, 3 and 7 in Petlen(3,7]) is included in the range. And the names of factor levels are directly concatenated to the name of the factor variable.

Thus, the first step would be to format the parameter names, which can be done with format_parameters() from the parameters package:

library(parameters)
format_parameters(model)
#>                 (Intercept)      poly(Sepal.Length, 2)1 
#>               "(Intercept)" "Sepal Length [1st degree]" 
#>      poly(Sepal.Length, 2)2           Speciesversicolor 
#> "Sepal Length [2nd degree]"      "Species [versicolor]" 
#>            Speciesvirginica                 Petlen(3,7] 
#>       "Species [virginica]"             "Petlen [>3-7]"

format_parameters() returns a (named) character vector with the original coefficients as names of each character element, and the formatted names of the coefficients as values of the character vector. Let’s look at the results again:

cat(format_parameters(model), sep = "\n")
#> (Intercept)
#> Sepal Length [1st degree]
#> Sepal Length [2nd degree]
#> Species [versicolor]
#> Species [virginica]
#> Petlen [>3-7]

Now variable names and factor levels, as well as polynomial terms and factors created with cut(), are much more readable. Factor levels are separated from the variable name and placed inside brackets, and the same applies to the coefficients of the different polynomial degrees. The exact range for cut()-factors is also clearer now.
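Because format_parameters() returns a named character vector, it can also serve as a lookup table from raw to formatted coefficient names. The following is a minimal sketch, assuming the same example model as above; the object names pretty_names and coefs are only illustrative.

```r
library(parameters)

# Same example model as above
data(iris)
iris$Petlen <- cut(iris$Petal.Length, breaks = c(0, 3, 7))
model <- lm(Sepal.Width ~ poly(Sepal.Length, 2) + Species + Petlen, data = iris)

# Names are the raw coefficient names, values are the formatted names
pretty_names <- format_parameters(model)

# Use the vector as a lookup table to relabel the coefficients
coefs <- coef(model)
names(coefs) <- unname(pretty_names[names(coefs)])
round(coefs, 2)
```

Indexing by the raw name, rather than by position, keeps the relabelling correct even if the coefficients are reordered or subset beforehand.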

Standardizing Column Names of Parameter Tables

As seen above, summary() returns columns named Estimate, t value or Pr(>|t|). While Estimate is not specific to certain models, t value is. For logistic regression models, you would get z value. Some packages alter the names, so you get just t or t-value etc.

model_parameters() also uses context-specific column names, where applicable:

colnames(model_parameters(model))
#> [1] "Parameter"   "Coefficient" "SE"          "CI"          "CI_low"     
#> [6] "CI_high"     "t"           "df_error"    "p"

For Bayesian models, Coefficient is usually named Median etc. While this makes sense from a user perspective, because you instantly know which type of statistic or coefficient you have, it becomes difficult when you need a generic naming scheme to access model parameters and the type of the input model is unknown. The broom package takes the generic approach and returns “standardized” column names:

library(broom)
colnames(tidy(model))
#> [1] "term"      "estimate"  "std.error" "statistic" "p.value"

To deal with such situations, the insight package provides the standardize_names() function, which does exactly that: it standardizes the column names of the input. In the following example, you see that the statistic column is no longer named t, but Statistic. Likewise, df_error or df_residuals is renamed to df.

library(insight)
model |>
  model_parameters() |>
  standardize_names() |>
  colnames()
#> [1] "Parameter"   "Coefficient" "SE"          "CI"          "CI_low"     
#> [6] "CI_high"     "Statistic"   "df"          "p"

Furthermore, you can request “broom”-style for column names:

model |>
  model_parameters() |>
  standardize_names(style = "broom") |>
  colnames()
#> [1] "term"       "estimate"   "std.error"  "conf.level" "conf.low"  
#> [6] "conf.high"  "statistic"  "df.error"   "p.value"
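Standardized names make it possible to write code that does not need to know which kind of model it receives. The sketch below assumes a hypothetical helper, extract_tests(), that is not part of any package; it relies only on the Parameter, Statistic and p columns shown above.

```r
library(parameters)
library(insight)

# Hypothetical helper: returns the test statistic and p-value without
# knowing whether the original column is called "t", "z", ...
extract_tests <- function(model) {
  params <- standardize_names(model_parameters(model))
  as.data.frame(params)[, c("Parameter", "Statistic", "p")]
}

data(mtcars)
linear_fit <- lm(mpg ~ wt, data = mtcars)                         # t-statistic
logistic_fit <- glm(am ~ wt, data = mtcars, family = binomial())  # z-statistic

extract_tests(linear_fit)
extract_tests(logistic_fit)
```

Without standardize_names(), the helper would have to check for each possible statistic column by name.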

Formatting Column Names and Columns

Besides formatting parameter names (coefficient names) using format_parameters(), we can do even more to make the output more readable. Let’s look at an example that includes confidence intervals.

cbind(summary(model)$coefficients, confint(model))
#>                        Estimate Std. Error t value Pr(>|t|) 2.5 % 97.5 %
#> (Intercept)                3.81      0.058   65.50 4.6e-109  3.70   3.93
#> poly(Sepal.Length, 2)1     4.06      0.467    8.70  7.0e-15  3.14   4.98
#> poly(Sepal.Length, 2)2    -1.30      0.315   -4.14  6.0e-05 -1.92  -0.68
#> Speciesversicolor         -1.01      0.278   -3.62  4.1e-04 -1.56  -0.46
#> Speciesvirginica          -0.99      0.285   -3.48  6.7e-04 -1.55  -0.43
#> Petlen(3,7]               -0.14      0.282   -0.48  6.3e-01 -0.69   0.42

We can get a similar tabular output using broom.

tidy(model, conf.int = TRUE)
#> # A tibble: 6 × 7
#>   term                 estimate std.error statistic   p.value conf.low conf.high
#>   <chr>                   <dbl>     <dbl>     <dbl>     <dbl>    <dbl>     <dbl>
#> 1 (Intercept)             3.81     0.0582    65.5   4.61e-109    3.70      3.93 
#> 2 poly(Sepal.Length, …    4.06     0.467      8.70  7.00e- 15    3.14      4.98 
#> 3 poly(Sepal.Length, …   -1.30     0.315     -4.14  5.98e-  5   -1.92     -0.680
#> 4 Speciesversicolor      -1.01     0.278     -3.62  4.12e-  4   -1.56     -0.456
#> 5 Speciesvirginica       -0.991    0.285     -3.48  6.72e-  4   -1.55     -0.428
#> 6 Petlen(3,7]            -0.136    0.282     -0.482 6.30e-  1   -0.693     0.421

Readability could be improved by collapsing and formatting the confidence intervals, and maybe the p-values. Doing this by hand takes some effort, for instance, formatting the lower and upper confidence limits and collapsing them into one column. However, format_table() is a convenient function that does all the work for you.

format_table() requires a data frame with model parameters as input. However, there are some requirements to make format_table() work. In particular, the column names must follow a certain pattern to be recognized, and this pattern may be either the naming convention from broom or that of the easystats packages.

model |>
  tidy(conf.int = TRUE) |>
  format_table()
#>                     term estimate std.error statistic p.value       conf.int
#> 1            (Intercept)     3.81      0.06     65.50  < .001 [ 3.70,  3.93]
#> 2 poly(Sepal.Length, 2)1     4.06      0.47      8.70  < .001 [ 3.14,  4.98]
#> 3 poly(Sepal.Length, 2)2    -1.30      0.31     -4.14  < .001 [-1.92, -0.68]
#> 4      Speciesversicolor    -1.01      0.28     -3.62  < .001 [-1.56, -0.46]
#> 5       Speciesvirginica    -0.99      0.29     -3.48  < .001 [-1.55, -0.43]
#> 6            Petlen(3,7]    -0.14      0.28     -0.48  0.630  [-0.69,  0.42]
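The column-name requirement also means that format_table() is not tied to output from tidy() or model_parameters(). As a rough sketch, a data frame assembled by hand from base R results can be formatted too, provided its columns follow the easystats naming convention (Parameter, Coefficient, SE, CI_low, CI_high, t, p). The model fit and the object names here are assumptions chosen for illustration.

```r
library(insight)

data(mtcars)
fit <- lm(mpg ~ wt + hp, data = mtcars)
coef_table <- summary(fit)$coefficients
ci <- confint(fit)

# Build the parameters table manually, using easystats column names
params <- data.frame(
  Parameter = rownames(coef_table),
  Coefficient = coef_table[, "Estimate"],
  SE = coef_table[, "Std. Error"],
  CI_low = ci[, 1],
  CI_high = ci[, 2],
  t = coef_table[, "t value"],
  p = coef_table[, "Pr(>|t|)"],
  row.names = NULL
)

formatted <- format_table(params)
formatted
```

Columns with names that format_table() does not recognize are not treated as confidence limits or p-values, so it is worth checking the column names first when the output looks unformatted.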

When the parameters table also includes degrees of freedom, and the degrees of freedom are the same for each parameter, then this information is included in the statistic-column. This is usually the default for model_parameters():

model |>
  model_parameters() |>
  format_table()
#>                   Parameter Coefficient   SE         95% CI t(144)      p
#> 1               (Intercept)        3.81 0.06 [ 3.70,  3.93]  65.50 < .001
#> 2 Sepal Length [1st degree]        4.06 0.47 [ 3.14,  4.98]   8.70 < .001
#> 3 Sepal Length [2nd degree]       -1.30 0.31 [-1.92, -0.68]  -4.14 < .001
#> 4      Species [versicolor]       -1.01 0.28 [-1.56, -0.46]  -3.62 < .001
#> 5       Species [virginica]       -0.99 0.29 [-1.55, -0.43]  -3.48 < .001
#> 6             Petlen [>3-7]       -0.14 0.28 [-0.69,  0.42]  -0.48 0.630

Exporting the Parameters Table

Finally, export_table() from insight formats the data frame and returns a character vector that can be printed to the console or inside rmarkdown documents. The data frame then looks more “table-like”.

data(mtcars)
export_table(mtcars[1:8, 1:5])
#>   mpg | cyl |   disp |  hp | drat
#> ---------------------------------
#> 21.00 |   6 | 160.00 | 110 | 3.90
#> 21.00 |   6 | 160.00 | 110 | 3.90
#> 22.80 |   4 | 108.00 |  93 | 3.85
#> 21.40 |   6 | 258.00 | 110 | 3.08
#> 18.70 |   8 | 360.00 | 175 | 3.15
#> 18.10 |   6 | 225.00 | 105 | 2.76
#> 14.30 |   8 | 360.00 | 245 | 3.21
#> 24.40 |   4 | 146.70 |  62 | 3.69

Putting all this together allows us to create nice tabular outputs of parameters tables. This can be done using broom:

model |>
  tidy(conf.int = TRUE) |>
  format_table() |>
  export_table()
#> term                   | estimate | std.error | statistic | p.value |       conf.int
#> ------------------------------------------------------------------------------------
#> (Intercept)            |     3.81 |      0.06 |     65.50 |  < .001 | [ 3.70,  3.93]
#> poly(Sepal.Length, 2)1 |     4.06 |      0.47 |      8.70 |  < .001 | [ 3.14,  4.98]
#> poly(Sepal.Length, 2)2 |    -1.30 |      0.31 |     -4.14 |  < .001 | [-1.92, -0.68]
#> Speciesversicolor      |    -1.01 |      0.28 |     -3.62 |  < .001 | [-1.56, -0.46]
#> Speciesvirginica       |    -0.99 |      0.29 |     -3.48 |  < .001 | [-1.55, -0.43]
#> Petlen(3,7]            |    -0.14 |      0.28 |     -0.48 |  0.630  | [-0.69,  0.42]

Or, more simply and with many more options (like standardizing, robust standard errors, bootstrapping, …), using model_parameters(), whose print() method does all these steps automatically:

model_parameters(model)
#> Parameter                 | Coefficient |   SE |         95% CI | t(144) |      p
#> ---------------------------------------------------------------------------------
#> (Intercept)               |        3.81 | 0.06 | [ 3.70,  3.93] |  65.50 | < .001
#> Sepal Length [1st degree] |        4.06 | 0.47 | [ 3.14,  4.98] |   8.70 | < .001
#> Sepal Length [2nd degree] |       -1.30 | 0.31 | [-1.92, -0.68] |  -4.14 | < .001
#> Species [versicolor]      |       -1.01 | 0.28 | [-1.56, -0.46] |  -3.62 | < .001
#> Species [virginica]       |       -0.99 | 0.29 | [-1.55, -0.43] |  -3.48 | < .001
#> Petlen [>3-7]             |       -0.14 | 0.28 | [-0.69,  0.42] |  -0.48 | 0.630

Formatting the Parameters Table in Markdown

export_table() provides a few options to generate tables in markdown format. This makes it easy to render nice-looking tables inside markdown documents. First of all, use format = "markdown" to activate the markdown formatting. caption can be used to add a table caption. Furthermore, align lets you choose one alignment for all table columns, or specify the alignment for each column individually.

The following table has six columns. Using align = "lcccrr" would left-align the first column, center columns two to four, and right-align the last two columns.

model |>
  tidy(conf.int = TRUE) |>
  # parentheses look better in markdown tables, so we use them as "brackets" here
  format_table(ci_brackets = c("(", ")")) |>
  export_table(format = "markdown", caption = "My Table", align = "lcccrr")

My Table

| term | estimate | std.error | statistic | p.value | conf.int |
|:---|:---:|:---:|:---:|---:|---:|
| (Intercept) | 3.81 | 0.06 | 65.50 | < .001 | ( 3.70, 3.93) |
| poly(Sepal.Length, 2)1 | 4.06 | 0.47 | 8.70 | < .001 | ( 3.14, 4.98) |
| poly(Sepal.Length, 2)2 | -1.30 | 0.31 | -4.14 | < .001 | (-1.92, -0.68) |
| Speciesversicolor | -1.01 | 0.28 | -3.62 | < .001 | (-1.56, -0.46) |
| Speciesvirginica | -0.99 | 0.29 | -3.48 | < .001 | (-1.55, -0.43) |
| Petlen(3,7] | -0.14 | 0.28 | -0.48 | 0.630 | (-0.69, 0.42) |
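The align string contains one letter per column, so it has to match the number of columns in the table. A smaller sketch with only two columns may make this easier to see; the caption text and the object names small and md are arbitrary choices for this example.

```r
library(insight)

data(mtcars)
small <- mtcars[1:3, c("mpg", "cyl")]

# Two columns, two letters: left-align the first, right-align the second
md <- export_table(
  small,
  format = "markdown",
  caption = "First rows of mtcars",
  align = "lr"
)
md
```

Since the result is a character vector of markdown lines, it can be inspected in the console before it is knitted.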

print_md() is a convenient wrapper around format_table() and export_table(format = "markdown"), and lets you directly format the output of functions like model_parameters(), simulate_parameters() or other parameters functions in markdown format.

These tables are also nicely formatted when knitting markdown-documents into Word or PDF. print_md() applies some default settings that have proven to work well for markdown, PDF or Word tables.

model_parameters(model) |> print_md()

| Parameter | Coefficient | SE | 95% CI | t(144) | p |
|---|---|---|---|---|---|
| (Intercept) | 3.81 | 0.06 | (3.70, 3.93) | 65.50 | < .001 |
| Sepal Length (1st degree) | 4.06 | 0.47 | (3.14, 4.98) | 8.70 | < .001 |
| Sepal Length (2nd degree) | -1.30 | 0.31 | (-1.92, -0.68) | -4.14 | < .001 |
| Species (versicolor) | -1.01 | 0.28 | (-1.56, -0.46) | -3.62 | < .001 |
| Species (virginica) | -0.99 | 0.29 | (-1.55, -0.43) | -3.48 | < .001 |
| Petlen (>3-7) | -0.14 | 0.28 | (-0.69, 0.42) | -0.48 | 0.630 |

A similar option is print_html(), which is a convenient wrapper for format_table() and export_table(format = "html"). Using HTML in markdown has the advantage that it will be properly rendered when exporting to PDF.

model_parameters(model) |> print_html()

| Parameter | Coefficient | SE | 95% CI | t(144) | p |
|---|---|---|---|---|---|
| (Intercept) | 3.81 | 0.06 | (3.70, 3.93) | 65.50 | < .001 |
| Sepal Length (1st degree) | 4.06 | 0.47 | (3.14, 4.98) | 8.70 | < .001 |
| Sepal Length (2nd degree) | -1.30 | 0.31 | (-1.92, -0.68) | -4.14 | < .001 |
| Species (versicolor) | -1.01 | 0.28 | (-1.56, -0.46) | -3.62 | < .001 |
| Species (virginica) | -0.99 | 0.29 | (-1.55, -0.43) | -3.48 | < .001 |
| Petlen (>3-7) | -0.14 | 0.28 | (-0.69, 0.42) | -0.48 | 0.630 |

print_md() and print_html() are the main functions for users who want to generate nicely rendered tables inside markdown documents. A wrapper around both is display(), which calls either print_md() or print_html().

model_parameters(model) |> display(format = "html")

| Parameter | Coefficient | SE | 95% CI | t(144) | p |
|---|---|---|---|---|---|
| (Intercept) | 3.81 | 0.06 | (3.70, 3.93) | 65.50 | < .001 |
| Sepal Length (1st degree) | 4.06 | 0.47 | (3.14, 4.98) | 8.70 | < .001 |
| Sepal Length (2nd degree) | -1.30 | 0.31 | (-1.92, -0.68) | -4.14 | < .001 |
| Species (versicolor) | -1.01 | 0.28 | (-1.56, -0.46) | -3.62 | < .001 |
| Species (virginica) | -0.99 | 0.29 | (-1.55, -0.43) | -3.48 | < .001 |
| Petlen (>3-7) | -0.14 | 0.28 | (-0.69, 0.42) | -0.48 | 0.630 |
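Since display() takes the output format as an argument, one possible pattern is to choose the format depending on the document being knitted. This is only a sketch: it assumes the knitr package is installed and uses knitr::is_html_output(), which is not mentioned in this vignette, to detect HTML output. The variable name fmt is arbitrary.

```r
library(parameters)

# Same example model as above
data(iris)
iris$Petlen <- cut(iris$Petal.Length, breaks = c(0, 3, 7))
model <- lm(Sepal.Width ~ poly(Sepal.Length, 2) + Species + Petlen, data = iris)

# Assumption: use the HTML table when knitting to HTML, markdown otherwise
fmt <- if (knitr::is_html_output()) "html" else "markdown"

display(model_parameters(model), format = fmt)
```

Outside of a knitting session, knitr::is_html_output() returns FALSE, so the sketch falls back to the markdown table.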