Safe and intuitive functions to rename variables or rows in data frames. data_rename() renames columns, i.e. it facilitates renaming variables. data_addprefix() and data_addsuffix() add a prefix or suffix to column names. data_rename_rows() is a convenient shortcut to add or rename row names of a data frame; unlike row.names(), its input and output are both data frames, so it integrates smoothly into a pipe workflow.

Usage

data_addprefix(
  data,
  pattern,
  select = NULL,
  exclude = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  verbose = TRUE,
  ...
)

data_addsuffix(
  data,
  pattern,
  select = NULL,
  exclude = NULL,
  ignore_case = FALSE,
  regex = FALSE,
  verbose = TRUE,
  ...
)

data_rename(
  data,
  pattern = NULL,
  replacement = NULL,
  safe = TRUE,
  verbose = TRUE,
  ...
)

data_rename_rows(data, rows = NULL)

Arguments

data

A data frame, or an object that can be coerced to a data frame.

pattern

Character vector. For data_rename(), indicates the columns that should be selected for renaming. Can be NULL (in which case all columns are selected). For data_rename(), pattern can also be a named vector of the form c(<new name> = "<old name>"); in this case, the names are used as the new column names and the replacement argument is ignored. For data_addprefix() or data_addsuffix(), a character string that will be added as prefix or suffix to the column names.

select

Variables that will be included when performing the required tasks. Can be either

  • a variable specified as a literal variable name (e.g., column_name),

  • a string with the variable name (e.g., "column_name"), a character vector of variable names (e.g., c("col1", "col2", "col3")), or a character vector of variable names including ranges specified via : (e.g., c("col1:col3", "col5")),

  • a formula with variable names (e.g., ~column_1 + column_2),

  • a vector of positive integers, giving the positions counting from the left (e.g. 1 or c(1, 3, 5)),

  • a vector of negative integers, giving the positions counting from the right (e.g., -1 or -1:-3),

  • one of the following select-helpers: starts_with(), ends_with(), contains(), a range using : or regex(""). starts_with(), ends_with(), and contains() accept several patterns, e.g. starts_with("Sep", "Petal"),

  • or a function testing for logical conditions, e.g. is.numeric() (or is.numeric), or any user-defined function that selects the variables for which the function returns TRUE (like: foo <- function(x) mean(x) > 3),

  • ranges specified via literal variable names, select-helpers (except regex()) and (user-defined) functions can be negated, i.e. return non-matching elements, when prefixed with a -, e.g. -ends_with(""), -is.numeric or -(Sepal.Width:Petal.Length). Note: Negation means that matches are excluded, so the exclude argument can be used as an alternative. For instance, select=-ends_with("Length") (with -) is equivalent to exclude=ends_with("Length") (no -). If negation does not work as expected, use the exclude argument instead.

If NULL, selects all columns. Patterns that find no matches are silently ignored, e.g. extract_column_names(iris, select = c("Species", "Test")) will just return "Species".
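The selection rules above can be illustrated with a minimal sketch on the built-in iris data. It assumes the datawizard package is installed and attached; the object names (prefixed, via_select, via_exclude) are only illustrative.

```r
library(datawizard)

# Add a prefix only to columns whose names start with "Sepal";
# all other columns keep their original names.
prefixed <- data_addprefix(iris, "NEW_", select = starts_with("Sepal"))
colnames(prefixed)

# Negated selection and `exclude` are described above as equivalent:
# both leave the "*.Length" columns untouched.
via_select <- data_addsuffix(iris, "_x", select = -ends_with("Length"))
via_exclude <- data_addsuffix(iris, "_x", exclude = ends_with("Length"))
identical(colnames(via_select), colnames(via_exclude))
```

Using exclude rather than a negated select is the more defensive choice when the selection is built programmatically.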

exclude

See select; however, column names matched by the pattern from exclude will be excluded instead of selected. If NULL (the default), excludes no columns.

ignore_case

Logical, if TRUE and when one of the select-helpers or a regular expression is used in select, ignores lower/upper case in the search pattern when matching against variable names.

regex

Logical, if TRUE, the search pattern from select will be treated as a regular expression. When regex = TRUE, select must be a character string (or a variable containing a character string) and is not allowed to be one of the supported select-helpers or a character vector of length > 1. regex = TRUE is comparable to using one of the two select-helpers, select = contains("") or select = regex(""). However, since the select-helpers may not work when called from inside other functions (see 'Details'), this argument may be used as a workaround.
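As a minimal sketch of the regex argument, the call below passes a single character string as select and lets regex = TRUE treat it as a regular expression. It assumes datawizard is attached; the suffix "_cm" and the object name suffixed are only illustrative.

```r
library(datawizard)

# `select` is a plain character string here, interpreted as a
# regular expression because `regex = TRUE`.
suffixed <- data_addsuffix(iris, "_cm", select = "^Petal", regex = TRUE)
colnames(suffixed)
```

Because select is an ordinary string, the same call can be wrapped in a user-defined function, with the pattern passed in as an argument.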

verbose

Toggle warnings and messages.

...

Other arguments passed to or from other functions.

replacement

Character vector. Indicates the new names of the columns selected in pattern. Can be NULL (in which case columns are numbered in sequential order). If not NULL, pattern and replacement must be of the same length. If pattern is a named vector, replacement is ignored.

safe

Logical. If TRUE (the default), no error is thrown if, for instance, the variable to be renamed/removed doesn't exist.

rows

Vector of row names.
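A minimal sketch of data_rename_rows() in a pipe, assuming datawizard is attached and R >= 4.1 (for the native |> pipe). The data subset and the row labels are made up for illustration.

```r
library(datawizard)

# Take a small data frame and replace its row names in a single pipe step.
# The vector passed to `rows` needs one entry per row.
cars <- head(mtcars, 3) |>
  data_rename_rows(rows = c("car_a", "car_b", "car_c"))

rownames(cars)
```

The result is still a data frame, so further data-frame verbs can follow in the same pipe.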

Value

A modified data frame.

Examples

# Add prefix / suffix to all columns
head(data_addprefix(iris, "NEW_"))
#>   NEW_Sepal.Length NEW_Sepal.Width NEW_Petal.Length NEW_Petal.Width NEW_Species
#> 1              5.1             3.5              1.4             0.2      setosa
#> 2              4.9             3.0              1.4             0.2      setosa
#> 3              4.7             3.2              1.3             0.2      setosa
#> 4              4.6             3.1              1.5             0.2      setosa
#> 5              5.0             3.6              1.4             0.2      setosa
#> 6              5.4             3.9              1.7             0.4      setosa
head(data_addsuffix(iris, "_OLD"))
#>   Sepal.Length_OLD Sepal.Width_OLD Petal.Length_OLD Petal.Width_OLD Species_OLD
#> 1              5.1             3.5              1.4             0.2      setosa
#> 2              4.9             3.0              1.4             0.2      setosa
#> 3              4.7             3.2              1.3             0.2      setosa
#> 4              4.6             3.1              1.5             0.2      setosa
#> 5              5.0             3.6              1.4             0.2      setosa
#> 6              5.4             3.9              1.7             0.4      setosa

# Rename columns
head(data_rename(iris, "Sepal.Length", "length"))
#>   length Sepal.Width Petal.Length Petal.Width Species
#> 1    5.1         3.5          1.4         0.2  setosa
#> 2    4.9         3.0          1.4         0.2  setosa
#> 3    4.7         3.2          1.3         0.2  setosa
#> 4    4.6         3.1          1.5         0.2  setosa
#> 5    5.0         3.6          1.4         0.2  setosa
#> 6    5.4         3.9          1.7         0.4  setosa
# data_rename(iris, "FakeCol", "length", safe=FALSE)  # This fails
head(data_rename(iris, "FakeCol", "length")) # This doesn't
#> Variable `FakeCol` is not in your data frame :/
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa
head(data_rename(iris, c("Sepal.Length", "Sepal.Width"), c("length", "width")))
#>   length width Petal.Length Petal.Width Species
#> 1    5.1   3.5          1.4         0.2  setosa
#> 2    4.9   3.0          1.4         0.2  setosa
#> 3    4.7   3.2          1.3         0.2  setosa
#> 4    4.6   3.1          1.5         0.2  setosa
#> 5    5.0   3.6          1.4         0.2  setosa
#> 6    5.4   3.9          1.7         0.4  setosa

# use named vector to rename
head(data_rename(iris, c(length = "Sepal.Length", width = "Sepal.Width")))
#>   length width Petal.Length Petal.Width Species
#> 1    5.1   3.5          1.4         0.2  setosa
#> 2    4.9   3.0          1.4         0.2  setosa
#> 3    4.7   3.2          1.3         0.2  setosa
#> 4    4.6   3.1          1.5         0.2  setosa
#> 5    5.0   3.6          1.4         0.2  setosa
#> 6    5.4   3.9          1.7         0.4  setosa

# Reset names
head(data_rename(iris, NULL))
#>     1   2   3   4      5
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3.0 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5.0 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa

# Change all
head(data_rename(iris, replacement = paste0("Var", 1:5)))
#>   Var1 Var2 Var3 Var4   Var5
#> 1  5.1  3.5  1.4  0.2 setosa
#> 2  4.9  3.0  1.4  0.2 setosa
#> 3  4.7  3.2  1.3  0.2 setosa
#> 4  4.6  3.1  1.5  0.2 setosa
#> 5  5.0  3.6  1.4  0.2 setosa
#> 6  5.4  3.9  1.7  0.4 setosa