
Reporting Datasets and Dataframes
Source: R/report.character.R, R/report.data.frame.R, R/report.factor.R, and 1 more
report.data.frame.Rd
Create reports for data frames.
Usage
# S3 method for character
report(
  x,
  n_entries = 3,
  levels_percentage = "auto",
  missing_percentage = "auto",
  ...
)

# S3 method for data.frame
report(
  x,
  n = FALSE,
  centrality = "mean",
  dispersion = TRUE,
  range = TRUE,
  distribution = FALSE,
  levels_percentage = "auto",
  digits = 2,
  n_entries = 3,
  missing_percentage = "auto",
  ...
)

# S3 method for factor
report(x, levels_percentage = "auto", ...)

# S3 method for numeric
report(
  x,
  n = FALSE,
  centrality = "mean",
  dispersion = TRUE,
  range = TRUE,
  distribution = FALSE,
  missing_percentage = "auto",
  digits = 2,
  ...
)
Arguments
- x
The R object that you want to report (see list of supported objects above).
- n_entries
Number of different character entries to show. Can be "all".
- levels_percentage
Show character entries and factor levels by number or percentage. If "auto", number and percentage are shown if the number of observations is larger than 100.
- missing_percentage
Show missing values by number or percentage. If "auto" (the default), number and percentage are shown if the number of observations is larger than 100.
- ...
Arguments passed to or from other methods.
- n
Include number of observations for each individual variable.
- centrality
Character vector, indicating the index of centrality (either "mean" or "median").
- dispersion
Show index of dispersion (sd if centrality = "mean", or mad if centrality = "median").
- range
Show range.
- distribution
Show kurtosis and skewness.
- digits
Number of significant digits.
Value
An object of class report().
Examples
library(report)
r <- report(iris,
  centrality = "median", dispersion = FALSE,
  distribution = TRUE, missing_percentage = TRUE
)
r
#> The data contains 150 observations of the following 5 variables:
#>
#> - Sepal.Length: n = 150, Mean = 5.84, SD = 0.83, Median = 5.80, MAD = 1.04, range: [4.30, 7.90], Skewness = 0.31, Kurtosis = -0.55, 0% missing
#> - Sepal.Width: n = 150, Mean = 3.06, SD = 0.44, Median = 3.00, MAD = 0.44, range: [2, 4.40], Skewness = 0.32, Kurtosis = 0.23, 0% missing
#> - Petal.Length: n = 150, Mean = 3.76, SD = 1.77, Median = 4.35, MAD = 1.85, range: [1, 6.90], Skewness = -0.27, Kurtosis = -1.40, 0% missing
#> - Petal.Width: n = 150, Mean = 1.20, SD = 0.76, Median = 1.30, MAD = 1.04, range: [0.10, 2.50], Skewness = -0.10, Kurtosis = -1.34, 0% missing
#> - Species: 3 levels, namely setosa (n = 50, 33.33%), versicolor (n = 50, 33.33%) and virginica (n = 50, 33.33%)
summary(r)
#> The data contains 150 observations of the following 5 variables:
#>
#> - Sepal.Length: Median = 5.80, range: [4.30, 7.90], Skewness = 0.31, Kurtosis = -0.55
#> - Sepal.Width: Median = 3.00, range: [2, 4.40], Skewness = 0.32, Kurtosis = 0.23
#> - Petal.Length: Median = 4.35, range: [1, 6.90], Skewness = -0.27, Kurtosis = -1.40
#> - Petal.Width: Median = 1.30, range: [0.10, 2.50], Skewness = -0.10, Kurtosis = -1.34
#> - Species: 3 levels, namely setosa (n = 50), versicolor (n = 50) and virginica (n = 50)
as.data.frame(r)
#> Variable | Level | n_Obs | percentage_Obs | Mean | SD | Median | MAD | Min | Max | Skewness | Kurtosis | percentage_Missing
#> -----------------------------------------------------------------------------------------------------------------------------------------
#> Sepal.Length | | 150 | | 5.84 | 0.83 | 5.80 | 1.04 | 4.30 | 7.90 | 0.31 | -0.55 | 0.00
#> Sepal.Width | | 150 | | 3.06 | 0.44 | 3.00 | 0.44 | 2.00 | 4.40 | 0.32 | 0.23 | 0.00
#> Petal.Length | | 150 | | 3.76 | 1.77 | 4.35 | 1.85 | 1.00 | 6.90 | -0.27 | -1.40 | 0.00
#> Petal.Width | | 150 | | 1.20 | 0.76 | 1.30 | 1.04 | 0.10 | 2.50 | -0.10 | -1.34 | 0.00
#> Species | setosa | 50 | 33.33 | | | | | | | | |
#> Species | versicolor | 50 | 33.33 | | | | | | | | |
#> Species | virginica | 50 | 33.33 | | | | | | | | |
summary(as.data.frame(r))
#> Variable | Level | n_Obs | percentage_Obs | Median | Min | Max | Skewness | Kurtosis | percentage_Missing
#> --------------------------------------------------------------------------------------------------------------------
#> Sepal.Length | | | | 5.80 | 4.30 | 7.90 | 0.31 | -0.55 | 0.00
#> Sepal.Width | | | | 3.00 | 2.00 | 4.40 | 0.32 | 0.23 | 0.00
#> Petal.Length | | | | 4.35 | 1.00 | 6.90 | -0.27 | -1.40 | 0.00
#> Petal.Width | | | | 1.30 | 0.10 | 2.50 | -0.10 | -1.34 | 0.00
#> Species | setosa | 50 | 33.33 | | | | | |
#> Species | versicolor | 50 | 33.33 | | | | | |
#> Species | virginica | 50 | 33.33 | | | | | |
# grouped analysis using `{dplyr}` package
if (require("dplyr")) {
  r <- iris %>%
    group_by(Species) %>%
    report()
  r
  summary(r)
  as.data.frame(r)
  summary(as.data.frame(r))
}
#> Loading required package: dplyr
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
#> Warning: Following variable(s) were not found: n_Obs
#> Warning: Following variable(s) were not found: n_Obs
#> Warning: Following variable(s) were not found: n_Obs
#> Group | Variable | Mean | SD | Min | Max | n_Missing
#> -----------------------------------------------------------------
#> versicolor | Sepal.Length | 5.94 | 0.52 | 4.90 | 7.00 | 0
#> versicolor | Sepal.Width | 2.77 | 0.31 | 2.00 | 3.40 | 0
#> versicolor | Petal.Length | 4.26 | 0.47 | 3.00 | 5.10 | 0
#> versicolor | Petal.Width | 1.33 | 0.20 | 1.00 | 1.80 | 0
#> virginica | Sepal.Length | 6.59 | 0.64 | 4.90 | 7.90 | 0
#> virginica | Sepal.Width | 2.97 | 0.32 | 2.20 | 3.80 | 0
#> virginica | Petal.Length | 5.55 | 0.55 | 4.50 | 6.90 | 0
#> virginica | Petal.Width | 2.03 | 0.27 | 1.40 | 2.50 | 0
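
The examples above only exercise the data frame method. As a minimal sketch, the numeric, factor, and character methods listed under Usage can be called directly on single columns in the same way. The object names (r_num, r_fac, r_chr) are illustrative only, and no output is shown because the exact text depends on the installed version of the package.

```r
library(report)

# Numeric vector: median-based summary that also reports skewness and kurtosis
r_num <- report(iris$Sepal.Length, centrality = "median", distribution = TRUE)
r_num

# Factor: levels_percentage is left at its default, "auto"
# (iris$Species has 150 observations)
r_fac <- report(iris$Species)
r_fac

# Character vector: limit the number of distinct entries shown to 2
r_chr <- report(as.character(iris$Species), n_entries = 2)
r_chr

# As with the data frame method, the underlying table can be extracted
as.data.frame(r_num)
```

Comparing r_fac with report(head(iris$Species, 60)) is one way to see how the "auto" setting reacts to the number of observations.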