step_corr {recipes} | R Documentation |
step_corr creates a specification of a recipe step that will potentially remove variables that have large absolute correlations with other variables.
step_corr(recipe, ..., role = NA, trained = FALSE, threshold = 0.9, use = "pairwise.complete.obs", method = "pearson", removals = NULL)
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
... |
One or more selector functions to choose which variables are
affected by the step. See |
role |
Not used by this step since no new variables are created. |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
threshold |
A value for the threshold of absolute correlation values. The step will try to remove the minimum number of columns so that all the resulting absolute correlations are less than this value. |
use |
A character string for the |
method |
A character string for the |
removals |
A character string that contains the names of columns that
should be removed. These values are not determined until
|
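The default values of use and method match argument names of stats::cor. Assuming the step forwards them to that function (an assumption, since the descriptions above are cut off), their effect can be previewed in base R. The data frame toy below is made up for illustration and is not part of the recipes package.

```r
# Hypothetical toy data with one missing value in column b.
toy <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(2, 4, 6, 8, NA),
  c = c(5, 3, 4, 1, 2)
)

# With cor()'s own default (use = "everything"), the correlation between
# b and any other column is NA because b contains a missing value.
cor_default <- cor(toy)

# "pairwise.complete.obs" drops incomplete rows separately for each pair
# of columns, so a and b are compared on their first four rows only.
cor_pairwise <- cor(toy, use = "pairwise.complete.obs", method = "pearson")

# A rank-based alternative accepted by the same method argument.
cor_spearman <- cor(toy, use = "pairwise.complete.obs", method = "spearman")

round(cor_pairwise, 2)
```

Pairwise deletion keeps as many rows as possible for each pair, which is likely why it is the default here, but each entry of the matrix may then be based on a different subset of rows.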
This step attempts to remove variables to keep the largest absolute correlation between the variables less than threshold.
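One way to picture this kind of filter is a greedy loop over the absolute correlation matrix. The sketch below is a simplified illustration in base R, not the code used by recipes or caret. The function greedy_corr_filter and the data frame demo are hypothetical names, and the tie-breaking rule (drop the member of the worst pair with the larger mean absolute correlation) is an assumption.

```r
# Illustrative greedy filter (hypothetical helper, not the recipes implementation).
# x is assumed to be a data frame of numeric columns with column names.
greedy_corr_filter <- function(x, threshold = 0.9,
                               use = "pairwise.complete.obs",
                               method = "pearson") {
  cmat <- abs(cor(x, use = use, method = method))
  cmat[is.na(cmat)] <- 0   # treat undefined correlations as "not correlated"
  diag(cmat) <- 0          # ignore each column's correlation with itself
  removed <- character(0)

  while (ncol(cmat) > 1 && max(cmat) >= threshold) {
    # Locate the most strongly correlated remaining pair.
    idx <- which(cmat == max(cmat), arr.ind = TRUE)[1, ]
    pair <- colnames(cmat)[idx]
    # Drop the member of the pair with the larger mean absolute
    # correlation across the remaining columns.
    avg <- rowMeans(cmat[pair, , drop = FALSE])
    drop_col <- pair[which.max(avg)]
    removed <- c(removed, drop_col)
    keep <- setdiff(colnames(cmat), drop_col)
    cmat <- cmat[keep, keep, drop = FALSE]
  }
  removed
}

# Made-up data: x2 is nearly a copy of x1, x3 is only weakly related to both.
demo <- data.frame(
  x1 = 1:10,
  x2 = 1:10 + rep(c(0.1, -0.1), 5),
  x3 = c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)
)
removed <- greedy_corr_filter(demo, threshold = 0.9)
remaining <- demo[, setdiff(names(demo), removed), drop = FALSE]
```

Here removed holds one of x1 or x2, and every absolute correlation among the remaining columns is below the threshold. A greedy loop like this does not guarantee the smallest possible set of removals, which is consistent with the wording that the step "will try" to remove the minimum number of columns.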
An updated version of recipe with the new step added to the sequence of existing steps (if any).
Original R code for filtering algorithm by Dong Li, modified by Max Kuhn. Contributions by Reynald Lescarbeau (for original in caret package). Max Kuhn for the step function.
step_nzv
recipe
prep.recipe
bake.recipe
library(recipes)

data(biomass)

# Add a noisy copy of carbon so that at least one pair is highly correlated
set.seed(3535)
biomass$duplicate <- biomass$carbon + rnorm(nrow(biomass))

biomass_tr <- biomass[biomass$dataset == "Training",]
biomass_te <- biomass[biomass$dataset == "Testing",]

rec <- recipe(HHV ~ carbon + hydrogen + oxygen + nitrogen +
              sulfur + duplicate,
              data = biomass_tr)

corr_filter <- rec %>%
  step_corr(all_predictors(), threshold = .5)

# Estimate which columns to remove from the training set
filter_obj <- prep(corr_filter, training = biomass_tr)

# Apply the trained filter to the test set
filtered_te <- bake(filter_obj, biomass_te)

# Absolute correlations before and after filtering
round(abs(cor(biomass_tr[, c(3:7, 9)])), 2)
round(abs(cor(filtered_te)), 2)