Compute the **Probability of Direction** (*pd*, also known as the Maximum Probability of Effect - *MPE*). This can be interpreted as the probability that a parameter (described by its posterior distribution) is strictly positive or negative (whichever is the most probable). Although differently expressed, this index is fairly similar (*i.e.*, is strongly correlated) to the frequentist **p-value** (see details).

## Usage

```
p_direction(x, ...)
pd(x, ...)
# S3 method for class 'numeric'
p_direction(x, method = "direct", null = 0, ...)
# S3 method for class 'data.frame'
p_direction(x, method = "direct", null = 0, ...)
# S3 method for class 'MCMCglmm'
p_direction(x, method = "direct", null = 0, ...)
# S3 method for class 'emmGrid'
p_direction(x, method = "direct", null = 0, ...)
# S3 method for class 'stanreg'
p_direction(
  x,
  effects = c("fixed", "random", "all"),
  component = c("location", "all", "conditional", "smooth_terms", "sigma",
    "distributional", "auxiliary"),
  parameters = NULL,
  method = "direct",
  null = 0,
  ...
)
# S3 method for class 'brmsfit'
p_direction(
  x,
  effects = c("fixed", "random", "all"),
  component = c("conditional", "zi", "zero_inflated", "all"),
  parameters = NULL,
  method = "direct",
  null = 0,
  ...
)
# S3 method for class 'BFBayesFactor'
p_direction(x, method = "direct", null = 0, ...)
# S3 method for class 'get_predicted'
p_direction(
  x,
  method = "direct",
  null = 0,
  use_iterations = FALSE,
  verbose = TRUE,
  ...
)
```

## Arguments

- x
A vector representing a posterior distribution, or a data frame of posterior draws (samples by parameter). Can also be a Bayesian model.

- ...
Currently not used.

- method
Can be `"direct"` or one of the methods of `estimate_density()`, such as `"kernel"`, `"logspline"` or `"KernSmooth"`. See details.

- null
The value considered as a "null" effect. Traditionally 0, but could also be 1 in the case of ratios of change (OR, IRR, ...).

- effects
Should results for fixed effects, random effects or both be returned? Only applies to mixed models. May be abbreviated.

- component
Should results for all parameters, parameters for the conditional model or the zero-inflated part of the model be returned? May be abbreviated. Only applies to brms-models.

- parameters
Regular expression pattern that describes the parameters that should be returned. Meta-parameters (like `lp__` or `prior_`) are filtered by default, so only parameters that typically appear in the `summary()` are returned. Use `parameters` to select specific parameters for the output.

- use_iterations
Logical, if `TRUE` and `x` is a `get_predicted` object (returned by `insight::get_predicted()`), the function is applied to the iterations instead of the predictions. This only applies to models that return iterations for predicted values (e.g., `brmsfit` models).

- verbose
Toggle off warnings.

## Value

Values between 0.5 and 1 *or* between 0 and 1 (see the section on the possible range of values in the details) corresponding to
the probability of direction (pd).

## Details

### What is the *pd*?

The Probability of Direction (pd) is an index of effect existence, representing the certainty with which an effect goes in a particular direction (i.e., is positive or negative / has a sign), typically ranging from 0.5 to 1 (but see next section for cases where it can range between 0 and 1). Beyond its simplicity of interpretation, understanding and computation, this index also presents other interesting properties:

- Like other posterior-based indices, *pd* is solely based on the posterior distributions and does not require any additional information from the data or the model (e.g., priors, as in the case of Bayes factors).

- It is robust to the scale of both the response variable and the predictors.

- It is strongly correlated with the frequentist p-value, and can thus be used to draw parallels and give some reference to readers unfamiliar with Bayesian statistics (Makowski et al., 2019).
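
The robustness to scale can be illustrated with a minimal base-R sketch. The helper `pd_direct()` is a hypothetical stand-in written for this illustration (it is not the package's implementation), and the draws are simulated rather than taken from a fitted model:

```r
# Hypothetical helper: share of draws on the dominant side of `null`.
pd_direct <- function(x, null = 0) {
  max(mean(x > null), mean(x < null))
}

set.seed(123)
draws <- rnorm(1000, mean = 0.3, sd = 1) # simulated posterior draws

pd_original <- pd_direct(draws)
# Rescaling (e.g., expressing a variable in other units) multiplies
# every draw by a positive constant, which leaves all signs unchanged.
pd_rescaled <- pd_direct(draws * 1000)

identical(pd_original, pd_rescaled)
#> [1] TRUE
```

Because only the sign of each draw relative to the null matters, any positive rescaling leaves the index unchanged.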

### Relationship with the p-value

In most cases, it seems that the *pd* has a direct correspondence with the
frequentist one-sided *p*-value ($p_{one-sided} = 1 - p_d$); for the two-sided *p*-value, the formula is:
$$p = 2 \times (1 - p_d)$$
Thus, a two-sided p-value of respectively `.1`, `.05`, `.01` and `.001` would
correspond approximately to a *pd* of `95%`, `97.5%`, `99.5%` and `99.95%`.
See `pd_to_p()` for details.
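
The correspondence above can be checked with a few lines of base R. The function name `pd_to_p_two_sided()` is hypothetical and only restates the formula; in practice the package's own `pd_to_p()` would be the natural choice:

```r
# Hypothetical helper restating p = 2 * (1 - pd) for a two-sided p-value.
pd_to_p_two_sided <- function(pd) {
  2 * (1 - pd)
}

pd_values <- c(0.95, 0.975, 0.995, 0.9995)
p_values <- pd_to_p_two_sided(pd_values)

# Side-by-side view of each pd and its approximate two-sided p-value
data.frame(pd = pd_values, p = round(p_values, 4))
```

The resulting p-values are .1, .05, .01 and .001, matching the values listed above.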

### Possible Range of Values

The largest value *pd* can take is 1, when the posterior is strictly directional.
However, the smallest value *pd* can take depends on the parameter space
represented by the posterior.

**For a continuous parameter space**, exact values of 0 (or any point null
value) are not possible, and so 100% of the posterior has *some* sign, some
positive, some negative. Therefore, the smallest the *pd* can be is 0.5,
with an equal posterior mass of positive and negative values. Values close to
0.5 *cannot* be used to support the null hypothesis (that the parameter does
*not* have a direction), in a similar way to how large p-values cannot be used
to support the null hypothesis (see `pd_to_p()`; Makowski et al., 2019).

**For a discrete parameter space or a parameter space that is a mixture
between discrete and continuous spaces**, exact values of 0 (or any point
null value) *are* possible! Therefore, the smallest the *pd* can be is 0,
with 100% of the posterior mass on 0. Thus values close to 0 can be used to
support the null hypothesis (see van den Bergh et al., 2021).

Examples of posteriors representing discrete parameter space:

- When a parameter can only take discrete values.

- When a mixture prior/posterior is used (such as the spike-and-slab prior; see van den Bergh et al., 2021).

- When conducting Bayesian model averaging (e.g., `weighted_posteriors()` or `brms::posterior_average`).
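
To make the lower bound concrete, here is a minimal base-R sketch with a hand-built, spike-and-slab-like set of draws. The helper `pd_direct()` and the draw values are assumptions made for illustration only, not output from a real model:

```r
# Hypothetical helper: share of draws on the dominant side of `null`.
pd_direct <- function(x, null = 0) {
  max(mean(x > null), mean(x < null))
}

# 60% of the posterior mass sits exactly on 0 (the "spike");
# the remaining draws are split between positive and negative values.
draws <- c(rep(0, 600), rep(0.5, 300), rep(-0.5, 100))

pd_mixture <- pd_direct(draws)
pd_mixture # 0.3: below 0.5, because draws equal to 0 have no direction
```

With all draws equal to 0, the same computation returns 0, the smallest value the *pd* can take.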

### Methods of computation

The *pd* is defined as:
$$p_d = \max\left(Pr(\hat{\theta} < \theta_{null}), Pr(\hat{\theta} > \theta_{null})\right)$$

The simplest and most direct way to compute the *pd* is to compute the
proportion of positive (or larger than `null`) posterior samples, the
proportion of negative (or smaller than `null`) posterior samples, and take
the larger of the two. This "simple" method (`method = "direct"`) is the most
straightforward, but its precision is directly tied to the number of posterior draws.
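
The direct computation described above can be sketched in base R as follows. The function `pd_direct()` is a hypothetical illustration of the definition, not the package's internal code, and the posterior is simulated:

```r
# Hypothetical helper implementing the "direct" definition of the pd.
pd_direct <- function(x, null = 0) {
  p_above <- mean(x > null) # proportion of draws larger than `null`
  p_below <- mean(x < null) # proportion of draws smaller than `null`
  max(p_above, p_below)
}

set.seed(42)
posterior <- rnorm(1000, mean = 1, sd = 1) # simulated posterior draws

pd_value <- pd_direct(posterior)
pd_value

# With 1000 draws, the result can only change in steps of 1/1000,
# which is why precision depends on the number of posterior draws.
pd_value * length(posterior)
```

A different null (e.g., `null = 1` for ratios) only changes the threshold against which the draws are compared.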

The second approach relies on density estimation: it starts by
estimating the continuous, smooth density function (for which many methods are
available), then computes the area under the curve (AUC) of the density on
either side of `null` and takes the larger of the two. Note that this approach
assumes a continuous density function, and so **when the posterior represents
a (partially) discrete parameter space, the direct method must be used** (see above).
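
As a rough illustration of the density-based approach, the sketch below uses base R's `stats::density()` and a trapezoidal approximation of the area on each side of `null`. The function name `pd_density()`, the grid size and the integration scheme are assumptions made for this example; the package's own density methods may differ in detail:

```r
# Hypothetical helper: pd from a kernel density estimate (base R only).
pd_density <- function(x, null = 0, n = 2048) {
  d <- stats::density(x, n = n) # Gaussian kernel, default bandwidth
  left <- d$x[-length(d$x)]
  right <- d$x[-1]
  # Trapezoid area of each grid interval
  area <- (right - left) * (d$y[-1] + d$y[-length(d$y)]) / 2
  midpoint <- (left + right) / 2
  # Normalised area under the curve above `null`
  area_above <- sum(area[midpoint > null]) / sum(area)
  max(area_above, 1 - area_above)
}

set.seed(42)
posterior <- rnorm(1000, mean = 1, sd = 1) # simulated posterior draws

pd_kernel <- pd_density(posterior)
pd_simple <- max(mean(posterior > 0), mean(posterior < 0))

c(kernel = pd_kernel, direct = pd_simple)
```

For a continuous posterior such as this one, the two values should be close; for (partially) discrete posteriors the smoothing would spread point masses across the null, which is why the direct method is required there.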

## Note

There is also a `plot()`-method implemented in the see-package.

## References

Makowski, D., Ben-Shachar, M. S., Chen, S. A., & Lüdecke, D. (2019). Indices of effect existence and significance in the Bayesian framework. Frontiers in psychology, 10, 2767. doi:10.3389/fpsyg.2019.02767

van den Bergh, D., Haaf, J. M., Ly, A., Rouder, J. N., & Wagenmakers, E. J. (2021). A cautionary note on estimating effect size. Advances in Methods and Practices in Psychological Science, 4(1). doi:10.1177/2515245921992035

## See also

`pd_to_p()` to convert between Probability of Direction (pd) and p-value.

## Examples

```
library(bayestestR)
# Simulate a posterior distribution of mean 1 and SD 1
# ----------------------------------------------------
posterior <- rnorm(1000, mean = 1, sd = 1)
p_direction(posterior)
#> Probability of Direction
#>
#> Parameter | pd
#> ------------------
#> Posterior | 84.50%
p_direction(posterior, method = "kernel")
#> Probability of Direction
#>
#> Parameter | pd
#> ------------------
#> Posterior | 83.17%
# Simulate a dataframe of posterior distributions
# -----------------------------------------------
df <- data.frame(replicate(4, rnorm(100)))
p_direction(df)
#> Probability of Direction
#>
#> Parameter | pd
#> ------------------
#> X1 | 51.00%
#> X2 | 52.00%
#> X3 | 51.00%
#> X4 | 58.00%
p_direction(df, method = "kernel")
#> Probability of Direction
#>
#> Parameter | pd
#> ------------------
#> X1 | 51.24%
#> X2 | 51.93%
#> X3 | 50.15%
#> X4 | 59.86%
# \donttest{
# rstanarm models
# -----------------------------------------------
if (require("rstanarm")) {
  model <- rstanarm::stan_glm(mpg ~ wt + cyl,
    data = mtcars,
    chains = 2, refresh = 0
  )
  p_direction(model)
  p_direction(model, method = "kernel")
}
#> Probability of Direction
#>
#> Parameter | pd
#> ---------------------
#> (Intercept) | 100.00%
#> wt | 99.98%
#> cyl | 99.97%
# emmeans
# -----------------------------------------------
if (require("emmeans")) {
  p_direction(emtrends(model, ~1, "wt", data = mtcars))
}
#> Probability of Direction
#>
#> Parameter | pd
#> ----------------
#> overall | 100%
# brms models
# -----------------------------------------------
if (require("brms")) {
  model <- brms::brm(mpg ~ wt + cyl, data = mtcars)
  p_direction(model)
  p_direction(model, method = "kernel")
}
#> Compiling Stan program...
#> Start sampling
#> Probability of Direction
#>
#> Parameter | pd
#> --------------------
#> (Intercept) | 100%
#> wt | 99.99%
#> cyl | 99.97%
# BayesFactor objects
# -----------------------------------------------
if (require("BayesFactor")) {
  bf <- ttestBF(x = rnorm(100, 1, 1))
  p_direction(bf)
  p_direction(bf, method = "kernel")
}
#> Probability of Direction
#>
#> Parameter | pd
#> -----------------
#> Difference | 100%
# }
```