mclust_tidiers {broom} | R Documentation |
These methods summarize the results of Mclust clustering into three tidy forms. tidy describes the size, mixing probability, mean and variability of each class, augment adds the class assignments and their probabilities to the original data, and glance summarizes the model parameters of the clustering.
## S3 method for class 'Mclust'
tidy(x, ...)

## S3 method for class 'Mclust'
augment(x, data, ...)

## S3 method for class 'Mclust'
glance(x, ...)
x |
Mclust object |
... |
extra arguments, not used |
data |
Original data (required for augment) |
All tidying methods return a data.frame without rownames, whose structure depends on the method chosen.
tidy returns one row per component, with the columns
component |
A factor identifying the cluster, from 1:k (or 0:k when x includes a noise term) |
size |
The size of each component |
proportion |
The mixing proportion of each component |
variance |
The variance of each component, for one-dimensional and spherical models; the column is omitted for other models. NA for the noise component |
mean |
The mean of each component. For models with two or more dimensions, one mean column is added per dimension. NA for the noise component |
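As a rough illustration of where these columns can come from, the sketch below rebuilds a similar table by hand from a fitted one-dimensional model. It assumes that size is the number of observations classified into each component and that the remaining values are read from the parameters element of the Mclust object. The toy data and the name components are made up for this example, and this is not necessarily how tidy is implemented.

```r
library(mclust)

set.seed(2016)
# Hypothetical toy data: two well-separated one-dimensional groups
x <- c(rnorm(50, mean = 0), rnorm(50, mean = 6))
m <- Mclust(x, G = 2)

# One row per component, assembled directly from the fitted object
components <- data.frame(
  component = factor(seq_len(m$G)),
  # number of observations assigned to each component
  size = as.integer(table(factor(m$classification, levels = seq_len(m$G)))),
  # estimated mixing proportions
  proportion = m$parameters$pro,
  # a single value for the equal-variance model, one per component otherwise
  variance = m$parameters$variance$sigmasq,
  mean = as.numeric(m$parameters$mean)
)
components
```

The sizes add up to the number of observations and the proportions add up to one, which is a quick sanity check on any such table.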
augment returns the original data with two extra columns:
.class |
The class assigned by the Mclust algorithm |
.uncertainty |
The uncertainty associated with the classification |
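To make the two extra columns concrete, the following sketch adds them by hand. It assumes they correspond to the classification and uncertainty elements of the Mclust object; the toy data and the name augmented are hypothetical, and the snippet is only an approximation of what augment produces.

```r
library(mclust)

set.seed(2016)
# Hypothetical toy data: two well-separated one-dimensional groups
x <- c(rnorm(50, mean = 0), rnorm(50, mean = 6))
m <- Mclust(x, G = 2)

# Attach the assigned class and its uncertainty to the original values
augmented <- data.frame(
  x = x,
  .class = factor(m$classification),
  .uncertainty = m$uncertainty
)
head(augmented)
```

An uncertainty near 0 means the observation sits firmly in its assigned component; larger values flag points that lie between components.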
glance returns a one-row data.frame with the columns
model |
A character string denoting the model at which the optimal BIC occurs |
n |
The number of observations in the data |
G |
The optimal number of mixture components |
BIC |
The optimal BIC value |
logLik |
The log-likelihood corresponding to the optimal BIC |
df |
The number of estimated parameters |
hypvol |
The hypervolume parameter for the noise component if required, otherwise set to NA |
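Most of these values can also be read directly from the fitted object. The sketch below collects them into a one-row data.frame, assuming the element names modelName, n, G, bic, loglik and df of an Mclust object. The toy data and the name summary_row are hypothetical, and hypvol is left out because no noise component is fitted here.

```r
library(mclust)

set.seed(2016)
# Hypothetical toy data: two well-separated one-dimensional groups
x <- c(rnorm(50, mean = 0), rnorm(50, mean = 6))
m <- Mclust(x, G = 2)

# One-row summary of the selected model
summary_row <- data.frame(
  model = m$modelName,  # name of the model with the best BIC
  n = m$n,              # number of observations
  G = m$G,              # number of mixture components
  BIC = m$bic,          # BIC of the selected model
  logLik = m$loglik,    # log-likelihood of the selected model
  df = m$df             # number of estimated parameters
)
summary_row
```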
library(broom)
library(dplyr)
library(ggplot2)
library(mclust)

set.seed(2016)

# Three cluster centers with different sizes
centers <- data.frame(cluster = factor(1:3),
                      size = c(100, 150, 50),
                      x1 = c(5, 0, -3),
                      x2 = c(-1, 1, -2))

# Simulate normally distributed points around each center
points <- centers %>%
  group_by(cluster) %>%
  do(data.frame(x1 = rnorm(.$size[1], .$x1[1]),
                x2 = rnorm(.$size[1], .$x2[1]))) %>%
  ungroup()

# Fit the mixture model on the two coordinates only
m <- Mclust(points %>% dplyr::select(x1, x2))

tidy(m)
head(augment(m, points))
glance(m)