Data Preparation

Main functions for cleaning and preparing data

data_to_long() reshape_longer()
Reshape (pivot) data from wide to long
data_to_wide() reshape_wider()
Reshape (pivot) data from long to wide
data_extract()
Extract one or more columns or elements from an object
data_match() data_filter()
Return filtered or sliced data frame, or row indices
find_columns() data_find() get_columns() data_select()
Find or get columns in a data frame based on search patterns
data_relocate() data_reorder() data_remove()
Relocate (reorder) columns of a data frame
data_arrange()
Arrange rows by column values
data_merge() data_join()
Merge (join) two data frames, or a list of data frames
data_partition()
Partition data into subsets (for example, training and test sets)
data_rotate() data_transpose()
Rotate (transpose) a data frame
data_group() data_ungroup()
Create a grouped data frame
data_duplicated()
Extract all duplicates
data_unique()
Keep only one row from each set of rows with duplicated IDs
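
As a rough illustration of what reshaping from wide to long and arranging rows mean, the following sketch uses base R only. It does not call the functions listed above, and the data frame `wide` and its column names are made up for the example.

```r
# Hypothetical wide data: one row per id, one column per time point.
wide <- data.frame(id = 1:2, t1 = c(10, 20), t2 = c(11, 21))

# Wide to long: stack the measurement columns and record which
# column each value came from.
long <- data.frame(
  id    = rep(wide$id, times = 2),
  name  = rep(c("t1", "t2"), each = nrow(wide)),
  value = c(wide$t1, wide$t2)
)

# Arrange rows by id, then by time point.
long_sorted <- long[order(long$id, long$name), ]
```

The long format has one row per id and time point, which is the shape most modelling and plotting functions expect.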

Statistical Transformations

Functions for transforming variables

categorize()
Recode (or "cut") data into groups of values
recode_values() change_code()
Recode old values of variables into new values
rescale() change_scale()
Rescale Variables to a New Range
reverse() reverse_scale()
Reverse-Score Variables
slide()
Shift numeric value range
adjust() data_adjust()
Adjust data for the effect of other variable(s)
center() centre()
Centering (Grand-Mean Centering)
demean() degroup() detrend()
Compute group-meaned and de-meaned variables
normalize() unnormalize()
Normalize numeric variable to 0-1 range
ranktransform()
(Signed) rank transformation
rescale_weights()
Rescale design weights for multilevel analysis
standardize() standardise() unstandardize() unstandardise()
Standardization (Z-scoring)
standardize(<default>)
Re-fit a model with standardized data
winsorize()
Winsorize data
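
For orientation, several of the transformations named above can be written as one-line formulas. The sketch below uses base R only and shows the textbook definitions; it is not the package's implementation, and functions such as center(), standardize(), normalize() and reverse() may differ in defaults, attributes and handling of missing values. The vector `x` is made up for the example.

```r
# Hypothetical numeric variable.
x <- c(2, 4, 6, 8, 10)

# Grand-mean centering: subtract the overall mean.
centered <- x - mean(x)

# Standardization (z-scoring): center, then divide by the standard deviation.
standardized <- (x - mean(x)) / sd(x)

# Normalization: rescale linearly so the minimum maps to 0 and the maximum to 1.
normalized <- (x - min(x)) / (max(x) - min(x))

# Reverse-scoring within the observed range (assumption: the scale's
# range equals the observed minimum and maximum).
reversed <- max(x) + min(x) - x
```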

Data Properties

Functions to compute statistical summaries of data properties and distributions

data_codebook() print_html(<data_codebook>)
Generate a codebook of a data frame
data_tabulate()
Create frequency tables of variables
data_peek()
Peek at values and type of variables in a data frame
coef_var() distribution_coef_var()
Compute the coefficient of variation
describe_distribution()
Describe a distribution
distribution_mode()
Compute mode for a statistical distribution
skewness() kurtosis() print(<parameters_kurtosis>) print(<parameters_skewness>) summary(<parameters_skewness>) summary(<parameters_kurtosis>)
Compute Skewness and (Excess) Kurtosis
smoothness()
Quantify the smoothness of a vector
weighted_mean() weighted_median() weighted_sd() weighted_mad()
Weighted Mean, Median, SD, and MAD
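
A few of the summaries above have simple definitions that can be checked by hand. The following base R sketch assumes made-up values `x` and weights `w`; it illustrates the definitions only and is not how the listed functions are implemented.

```r
# Hypothetical values and weights.
x <- c(1, 2, 2, 3, 10)
w <- c(1, 1, 2, 1, 1)

# Coefficient of variation: standard deviation relative to the mean.
cv <- sd(x) / mean(x)

# Weighted mean: weighted sum of the values divided by the sum of weights.
w_mean <- sum(w * x) / sum(w)

# Mode of a discrete variable: the most frequent value.
mode_value <- as.numeric(names(which.max(table(x))))
```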

Convert and Replace Data

Helpers for data replacements

coerce_to_numeric()
Convert to Numeric (if possible)
to_numeric()
Convert data to numeric
to_factor()
Convert data to factors
replace_nan_inf()
Convert infinite or NaN values into NA
convert_na_to()
Replace missing values in a variable or a data frame
convert_to_na()
Convert non-missing values in a variable into missing values
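
The replacement helpers above cover two directions: turning special values into NA, and turning NA into a regular value. A minimal base R sketch of both directions follows, assuming a made-up vector `x` and a replacement value of 0; it does not call the listed functions.

```r
# Hypothetical vector with NaN, infinite and missing values.
x <- c(1, NaN, Inf, -Inf, NA, 5)

# Convert NaN and infinite values into NA.
x_clean <- x
x_clean[is.nan(x_clean) | is.infinite(x_clean)] <- NA

# Replace the missing values with a chosen value (0 here).
x_filled <- x_clean
x_filled[is.na(x_filled)] <- 0
```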

Import Data

Helpers for importing data

data_read()
Read (import) data files from various sources

Helpers for Data Preparation

Primarily useful in the context of other ‘easystats’ packages

reshape_ci()
Reshape CI between wide/long formats
data_addprefix() data_addsuffix() data_rename() data_rename_rows()
Rename columns and variable names
empty_columns() empty_rows() remove_empty_columns() remove_empty_rows() remove_empty()
Return or remove variables or observations that are completely missing
rownames_as_column() column_as_rownames()
Tools for working with row names
row_to_colnames() colnames_to_row()
Tools for working with column names
find_columns() data_find() get_columns() data_select()
Find or get columns in a data frame based on search patterns
data_restoretype()
Restore the type of columns according to a reference data frame

Helpers for Text Formatting

Primarily useful for the ‘report’ package

Visualization Helpers

Primarily useful in the context of other ‘easystats’ packages

visualisation_recipe()
Prepare objects for visualisation

Reexports

Functions from other packages re-exported for convenience

Data

Datasets useful for examples and tests

efc
Sample dataset from the EFC Survey
nhanes_sample
Sample dataset from the National Health and Nutrition Examination Survey