nroTrain {Numero} | R Documentation |
Iterative algorithm to adapt a self-organizing map (SOM) to a set of multivariable data.
nroTrain(som, data, subsample = NULL, metric = "euclid")
som | A list object as returned by
data | A matrix or a data frame.
subsample | Number of rows used during a single training cycle.
metric | Distance metric in data space, either "euclid" or "pearson".
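To give a feel for the two metric options, the sketch below compares a Euclidean distance with a correlation-based distance on two toy vectors. The form 1 - r used for the "pearson" case is an assumption (a common definition); this page does not state the exact formula Numero applies, and the vectors x and y are invented for illustration.

```r
# Two toy profiles (illustrative values, not from the Numero data).
x <- c(1, 2, 3, 4)
y <- c(2, 4, 6, 9)

# Euclidean distance: sensitive to differences in magnitude.
d_euclid <- sqrt(sum((x - y)^2))

# Correlation-based distance, ASSUMED here to be 1 - Pearson r:
# sensitive to differences in profile shape rather than magnitude.
d_pearson <- 1 - cor(x, y)

print(c(euclid = d_euclid, pearson = d_pearson))
```

The two profiles are far apart in Euclidean terms but close in correlation terms, which is the practical difference between the options.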
The model is fitted according to columns that are found both in the SOM centroids and the input data.
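The column matching described above can be pictured with base R alone. This is a minimal sketch, not the package's internal code: the objects centroids and data are hypothetical stand-ins, and the use of intersect() is an assumption about how shared columns might be found.

```r
# Hypothetical stand-ins for the SOM centroids and the input data.
centroids <- matrix(0, nrow = 4, ncol = 3,
                    dimnames = list(NULL, c("CHOL", "HDL2C", "TG")))
data <- data.frame(TG = c(1.2, 0.8), CHOL = c(5.1, 4.7), CREAT = c(80, 95))

# Columns present in both objects, in the order of the centroid columns.
shared <- intersect(colnames(centroids), colnames(data))

# Only the shared columns would take part in the fitting.
trdata <- as.matrix(data[, shared, drop = FALSE])
print(shared)
```

In this toy case HDL2C (missing from the data) and CREAT (missing from the centroids) are both left out.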
If subsample is less than the number of data rows, a random subset of the specified size is used for each training cycle.
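The subsampling rule can be sketched in base R as follows. The loop, the number of cycles (n_cycles) and sampling without replacement are assumptions made for illustration; the actual training step inside nroTrain is not shown on this page and is only marked by a comment.

```r
set.seed(1)

# Toy data: 1000 rows, 5 variables (random values for illustration).
data <- matrix(rnorm(1000 * 5), nrow = 1000)
subsample <- 200
n_cycles <- 3  # hypothetical number of training cycles

for (cycle in seq_len(n_cycles)) {
  if (subsample < nrow(data)) {
    # Draw a fresh random subset for this cycle
    # (without replacement, which is an assumption).
    rows <- sample.int(nrow(data), size = subsample)
  } else {
    # Otherwise every row is used.
    rows <- seq_len(nrow(data))
  }
  batch <- data[rows, , drop = FALSE]
  # ... a training step on 'batch' would go here ...
  cat("cycle", cycle, "uses", nrow(batch), "rows\n")
}
```

Because a new subset is drawn in every cycle, different rows contribute to different cycles even though each cycle only sees 200 of them.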
A copy of the list object som, where the element centroids is updated according to the data patterns. The quantization errors during training are stored in the element history, and the element metric is set to the distance measure used.
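As a rough illustration of what a quantization error measures, the sketch below computes the mean Euclidean distance from each data row to its nearest centroid. This definition is an assumption (a common one for SOMs); the exact quantity that nroTrain stores in history is not specified on this page, and all values here are random toy data.

```r
set.seed(1)

# Toy data (50 rows) and toy centroids (4 units), 2 variables each.
data <- matrix(rnorm(50 * 2), ncol = 2)
centroids <- matrix(rnorm(4 * 2), ncol = 2)

# Distance from every row to every centroid: a 50 x 4 matrix.
d <- sapply(seq_len(nrow(centroids)), function(k) {
  sqrt(rowSums(sweep(data, 2, centroids[k, ])^2))
})

# ASSUMED definition: average distance to the best-matching centroid.
qerr <- mean(apply(d, 1, min))
print(qerr)
```

A value like this, recorded once per training cycle, would be expected to decrease as the centroids adapt to the data.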
Gao S, Mutter S, Casey AE, Mäkinen V-P (2018) Numero: a statistical framework to define multivariable subgroups in complex population-based datasets, Int J Epidemiology, https://doi.org/10.1093/ije/dyy113
# Import data.
fname <- system.file("extdata", "finndiane.txt", package = "Numero")
dataset <- read.delim(file = fname)

# Prepare training data.
trvars <- c("CHOL", "HDL2C", "TG", "CREAT", "uALB")
trdata <- scale.default(dataset[,trvars])

# K-means clustering.
km <- nroKmeans(data = trdata)

# Train with full data.
sm <- nroKohonen(seeds = km)
sm <- nroTrain(som = sm, data = trdata)
print(sm$history)

# Train with subsampling.
sm <- nroKohonen(seeds = km)
sm <- nroTrain(som = sm, data = trdata, subsample = 200)
print(sm$history)