step_lincomb {recipes} | R Documentation |
step_lincomb creates a specification of a recipe step that
will potentially remove numeric variables that have exact linear
combinations between them.
step_lincomb(recipe, ..., role = NA, trained = FALSE, max_steps = 5, removals = NULL)
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
... |
One or more selector functions to choose which variables are
affected by the step. See |
role |
Not used by this step since no new variables are created. |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
max_steps |
The maximum number of times to apply the algorithm. |
removals |
A character string that contains the names of columns that
should be removed. These values are not determined until
|
This step finds exact linear combinations between two or more
variables and recommends which column(s) should be removed to resolve the
issue. This algorithm may need to be applied multiple times (as defined
by max_steps).
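To make the idea of an exact linear combination concrete, here is a minimal base R sketch that flags dependent columns with a pivoted QR decomposition. It is an illustration only and is not claimed to be how step_lincomb is implemented. The helper name find_dependent_cols and the simulated data are hypothetical, and no intercept column is considered.

```r
# Hypothetical helper (not part of recipes): return the names of columns
# that are exact linear combinations of the other columns.
find_dependent_cols <- function(x) {
  x <- as.matrix(x)
  qr_x <- qr(x)          # base R QR; near-collinear columns are pivoted last
  rank <- qr_x$rank
  if (rank == ncol(x)) {
    return(character(0)) # full column rank: nothing to remove
  }
  # Columns pivoted past the numerical rank are linearly dependent.
  colnames(x)[qr_x$pivot[(rank + 1):ncol(x)]]
}

# Simulated data (an assumption for illustration only).
set.seed(1)
dat <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20))
dat$d <- 0.5 * dat$a - 0.2 * dat$b  # exact linear combination of a and b

find_dependent_cols(dat)
find_dependent_cols(dat[, c("a", "b", "c")])
```

The first call should report d as the removable column, and the second should return an empty character vector because a, b, and c are linearly independent. Removing one column may not resolve every dependency in a larger data set, which is consistent with the step allowing repeated passes.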
An updated version of recipe with the new step added to the sequence
of existing steps (if any).
Max Kuhn, Kirk Mettler, and Jed Wing
step_nzv
step_corr
recipe
prep.recipe
bake.recipe
library(recipes)

data(biomass)

# Add two predictors that are exact linear combinations of existing columns
biomass$new_1 <- with(biomass, .1*carbon - .2*hydrogen + .6*sulfur)
biomass$new_2 <- with(biomass, .5*carbon - .2*oxygen + .6*nitrogen)

# Split into training and test sets
biomass_tr <- biomass[biomass$dataset == "Training",]
biomass_te <- biomass[biomass$dataset == "Testing",]

rec <- recipe(HHV ~ carbon + hydrogen + oxygen + nitrogen +
                sulfur + new_1 + new_2,
              data = biomass_tr)

# Add the filter, then estimate which columns to remove from the training set
lincomb_filter <- rec %>%
  step_lincomb(all_predictors())

prep(lincomb_filter, training = biomass_tr)