Compute a hierarchical or k-means cluster analysis and return the group assignment for each observation as a vector.
cluster_analysis(
  x,
  n_clusters = NULL,
  method = c("hclust", "kmeans"),
  distance = c("euclidean", "maximum", "manhattan", "canberra", "binary", "minkowski"),
  agglomeration = c("ward", "ward.D", "ward.D2", "single", "complete", "average",
    "mcquitty", "median", "centroid"),
  iterations = 20,
  algorithm = c("Hartigan-Wong", "Lloyd", "MacQueen"),
  force = TRUE,
  package = c("NbClust", "mclust"),
  verbose = TRUE
)
x  A data frame. 

n_clusters  Number of clusters used for the cluster solution. By
default, the number of clusters to extract is determined by calling
n_clusters().
method  Method for computing the cluster analysis. By default
("hclust"), a hierarchical cluster analysis is computed. Use "kmeans"
to compute a k-means cluster analysis.
distance  Distance measure to be used when method = "hclust".
agglomeration  Agglomeration method to be used when method = "hclust".
iterations  Maximum number of iterations allowed. Only applies if
method = "kmeans".
algorithm  Algorithm used for calculating the k-means clusters. Only
applies if method = "kmeans".
force  Logical, if 
package  Package from which methods are to be called to determine the
number of clusters. Can be "NbClust" or "mclust".
verbose  Toggle warnings and messages. 
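As a rough illustration of how these arguments relate to one another, the sketch below runs both clustering routes with base R only. It assumes that distance, agglomeration, iterations and algorithm correspond to the arguments of stats::dist(), stats::hclust() and stats::kmeans() that accept the same values. The standardization step, the fixed number of three clusters and the seed are choices made for this example, not documented behaviour of cluster_analysis().

```r
# Standardize the four numeric iris columns (assumption: z-scores are used)
z <- scale(iris[, 1:4])

# Hierarchical route: a distance measure plus an agglomeration method
d <- stats::dist(z, method = "manhattan")
tree <- stats::hclust(d, method = "average")
hc_groups <- stats::cutree(tree, k = 3)

# k-means route: an iteration limit plus an algorithm
set.seed(123) # k-means starts from random centers
km <- stats::kmeans(z, centers = 3, iter.max = 20, algorithm = "Hartigan-Wong")
km_groups <- km$cluster

# Cross-tabulate the two solutions to see how far they agree
table(hclust = hc_groups, kmeans = km_groups)
```

Both routes return one integer group label per row, which is the shape of result described for cluster_analysis().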
The group classification for each observation as a vector. The
returned vector includes missing values, so it has the same length
as nrow(x).
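One way to obtain a group vector that keeps the length of nrow(x) when some rows contain missing values is sketched below with base R. This is a minimal sketch under stated assumptions, not the package's implementation: the rows set to NA, the Ward linkage ("ward.D2") and the choice of three clusters are all hypothetical example settings.

```r
# Toy data: iris measurements with two values set to missing
x <- iris[, 1:4]
x[c(5, 60), "Sepal.Length"] <- NA

# Cluster only the complete rows, after standardizing them
complete <- stats::complete.cases(x)
z <- scale(x[complete, ])
tree <- stats::hclust(stats::dist(z, method = "euclidean"), method = "ward.D2")

# Start from an all-NA vector so the result lines up with the rows of x
groups <- rep(NA_integer_, nrow(x))
groups[complete] <- stats::cutree(tree, k = 3)

length(groups) == nrow(x) # TRUE
which(is.na(groups))      # 5 60
```

Because the vector is aligned with the rows of x, it can be attached directly as a new column, for example x$group <- groups.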
The print() and plot() methods show the (standardized) mean value for
each variable within each cluster. Thus, a higher absolute value indicates
that a certain variable characteristic is more pronounced within that
specific cluster (as compared to other cluster groups with lower absolute
mean values).
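The kind of table described here can be approximated with base R, as sketched below. The sketch assumes that "standardized" means z-scores computed over all observations; the clustering settings (Euclidean distance, "ward.D2", three clusters) are example choices and may not match what print() uses internally.

```r
# z-standardize each variable over all observations
z <- as.data.frame(scale(iris[, 1:4]))

# Any group vector works here; this one comes from Ward clustering
groups <- stats::cutree(stats::hclust(stats::dist(z), method = "ward.D2"), k = 3)

# Mean z-score of every variable within every cluster
cluster_means <- stats::aggregate(z, by = list(Group = groups), FUN = mean)
round(cluster_means, 2)
```

A value far from zero in one row means that the cluster sits well above or below the overall mean on that variable, which is the reading suggested above.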
There is also a plot() method implemented in the see package.
Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K (2014) cluster: Cluster Analysis Basics and Extensions. R package.
n_clusters() to determine the number of clusters to extract,
cluster_discrimination() to determine the accuracy of cluster group
classification and check_clusterstructure() to check suitability of data
for clustering.
# Hierarchical clustering of the iris dataset
groups <- cluster_analysis(iris[, 1:4], 3)
groups
#> # Cluster Analysis (mean z-score by cluster)
#>
#> Term Group 1 Group 2 Group 3
#> Sepal.Length 1.01 0.09 1.24
#> Sepal.Width 0.85 0.70 0.07
#> Petal.Length 1.30 0.38 1.14
#> Petal.Width 1.25 0.31 1.19
#>
#> # Accuracy of Cluster Group Classification
#>
#> Group Accuracy
#> 1 100.00%
#> 2 95.31%
#> 3 97.22%
#>
#> Overall accuracy of classification: 97.33%
#>
# K-means clustering of the iris dataset, auto-detection of cluster groups
if (FALSE) {
groups <- cluster_analysis(iris[, 1:4], method = "k")
groups
}