lmrob.control {robustbase} | R Documentation |
Tuning parameters for lmrob, the MM-type regression estimator and the associated S-, M- and D-estimators. Using setting="KS2011" sets the defaults as suggested by Koller and Stahel (2011) and analogously for "KS2014".
The .M*.default functions and .M*.defaults lists contain default tuning parameters for all the predefined psi functions, see also Mpsi, etc.
lmrob.control(setting, seed = NULL, nResample = 500,
              tuning.chi = NULL, bb = 0.5, tuning.psi = NULL,
              max.it = 50, groups = 5, n.group = 400,
              k.fast.s = 1, best.r.s = 2,
              k.max = 200, maxit.scale = 200, k.m_s = 20,
              refine.tol = 1e-7, rel.tol = 1e-7, solve.tol = 1e-7,
              trace.lev = 0, mts = 1000,
              subsampling = c("nonsingular", "simple"),
              compute.rd = FALSE, method = "MM", psi = "bisquare",
              numpoints = 10, cov = NULL,
              split.type = c("f", "fi", "fii"), fast.s.large.n = 2000,
              eps.outlier = function(nobs) 0.1 / nobs,
              eps.x = function(maxx) .Machine$double.eps^(.75)*maxx,
              compute.outlier.stats = method,
              warn.limit.reject = 0.5,
              warn.limit.meanrw = 0.5, ...)

.Mchi.tuning.defaults
.Mchi.tuning.default(psi)
.Mpsi.tuning.defaults
.Mpsi.tuning.default(psi)
setting |
a string specifying alternative default values. Leave empty for the defaults or use "KS2011" or "KS2014"; see Details. |
seed |
|
nResample |
number of re-sampling candidates to be used to find the initial S-estimator. Currently defaults to 500 which works well in most situations (see references). |
tuning.chi |
tuning constant vector for the S-estimator. If NULL (the default), a default depending on psi is used; see Details. |
bb |
expected value under the normal model of the “chi” (rather rho) function with tuning constant equal to tuning.chi. |
tuning.psi |
tuning constant vector for the redescending M-estimator. If NULL (the default), a default depending on psi is used; see Details. |
max.it |
integer specifying the maximum number of IRWLS iterations. |
groups |
(for the fast-S algorithm): Number of random subsets to use when the data set is large. |
n.group |
(for the fast-S algorithm): Size of each of the
|
k.fast.s |
(for the fast-S algorithm): Number of local improvement steps (“I-steps”) for each re-sampling candidate. |
k.m_s |
(for the M-S algorithm): specifies after how many unsuccessful refinement steps the algorithm stops. |
best.r.s |
(for the fast-S algorithm): Number of best candidates to be iterated further (i.e., “refined”); denoted t in Salibian-Barrera & Yohai (2006). |
k.max |
(for the fast-S algorithm): maximal number of refinement steps for the “fully” iterated best candidates. |
maxit.scale |
integer specifying the maximum number of C level
|
refine.tol |
(for the fast-S algorithm): relative convergence tolerance for the fully iterated best candidates. |
rel.tol |
(for the RWLS iterations of the MM algorithm): relative convergence tolerance for the parameter vector. |
solve.tol |
(for the S algorithm): relative
tolerance for inversion. Hence, this corresponds to
|
trace.lev |
integer indicating if the progress of the MM-algorithm
should be traced (increasingly); default |
mts |
maximum number of samples to try in subsampling algorithm. |
subsampling |
type of subsampling to be used, a string:
|
compute.rd |
logical indicating if robust distances (based on
the MCD robust covariance estimator |
method |
string specifying the estimator-chain. |
psi |
string specifying the type of ψ-function
used. See Details of |
numpoints |
number of points used in Gauss quadrature. |
cov |
function or string with function name to be used to
calculate covariance matrix estimate. The default is
|
split.type |
determines how categorical and continuous variables
are split. See |
fast.s.large.n |
minimum number of observations required to switch from ordinary “fast S” algorithm to an efficient “large n” strategy. |
eps.outlier |
limit on the robustness weight below which an observation
is considered to be an outlier.
Either a numeric(1) or a function that takes the number of observations as
an argument. Used in |
eps.x |
limit on the absolute value of the elements of the design matrix below which an element is considered zero. Either a numeric(1) or a function that takes the maximum absolute value in the design matrix as an argument. |
compute.outlier.stats |
vector of |
warn.limit.reject |
limit of ratio
# rejected / # obs in level
above (>=) which a warning is produced.
Set to |
warn.limit.meanrw |
limit of the mean robustness weight per factor level
below which (<=) a warning is produced.
Set to |
... |
further arguments to be added as |
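Several arguments above (eps.outlier, eps.x) accept either a single number or a function of a data-dependent quantity. The following base R sketch shows one way such an argument could be resolved to a number. The helper resolve_limit() and the values of nobs and maxx are assumptions made for illustration; they are not part of robustbase, and this is not necessarily how lmrob evaluates these arguments internally.

```r
## Hypothetical helper (not part of robustbase): turn a
## "numeric(1) or function" argument into a single number.
resolve_limit <- function(limit, arg) {
  value <- if (is.function(limit)) limit(arg) else limit
  stopifnot(is.numeric(value), length(value) == 1L)
  value
}

## Defaults as given in the usage
eps.outlier <- function(nobs) 0.1 / nobs
eps.x       <- function(maxx) .Machine$double.eps^(.75) * maxx

nobs <- 200   # assumed number of observations
maxx <- 100   # assumed largest absolute entry of the design matrix

outlier_limit <- resolve_limit(eps.outlier, nobs)  # 0.1 / 200
zero_limit    <- resolve_limit(eps.x, maxx)        # scales with maxx
fixed_limit   <- resolve_limit(1e-3, nobs)         # a plain number passes through
```

Passing a function keeps the limit proportional to the sample size or to the scale of the design matrix, whereas a fixed number applies the same limit to every data set.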
The option setting="KS2011" alters the default arguments. They are changed to method = "SMDM", psi = "lqq", max.it = 500, k.max = 2000, cov = ".vcov.w". The defaults of all the remaining arguments are not changed.
The option setting="KS2014" builds upon setting="KS2011". More arguments are changed: best.r.s = 20, k.fast.s = 2, nResample = 1000. This setting should produce more stable estimates for designs with factors.
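The way the two settings layer on top of the plain defaults can be sketched in base R. The override values are those listed in the text; the helper apply_setting() and the reduced defaults list are hypothetical and only illustrate the idea that "KS2014" extends "KS2011" and that explicitly supplied arguments take precedence. It is not the robustbase implementation.

```r
## Overrides as listed in the text; the helper itself is hypothetical.
ks2011 <- list(method = "SMDM", psi = "lqq", max.it = 500,
               k.max = 2000, cov = ".vcov.w")
ks2014 <- modifyList(ks2011,
                     list(best.r.s = 20, k.fast.s = 2, nResample = 1000))

apply_setting <- function(defaults, setting = NULL, ...) {
  overrides <- switch(if (is.null(setting)) "none" else setting,
                      none   = list(),
                      KS2011 = ks2011,
                      KS2014 = ks2014,
                      stop("unknown setting: ", setting))
  ## explicit arguments in '...' win over the setting
  modifyList(modifyList(defaults, overrides), list(...))
}

## A subset of the plain defaults from the usage
defaults <- list(method = "MM", psi = "bisquare", max.it = 50, k.max = 200,
                 best.r.s = 2, k.fast.s = 1, nResample = 500)

ctrl <- apply_setting(defaults, "KS2014", max.it = 1000)
str(ctrl)
```

With robustbase attached, the corresponding real call would be lmrob.control("KS2014", max.it = 1000).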
By default, and in .Mpsi.tuning.default() and .Mchi.tuning.default(), tuning.chi and tuning.psi are set to yield an MM-estimate with break-down point 0.5 and efficiency of 95% at the normal.
To get these defaults, e.g., .Mpsi.tuning.default(psi) is equivalent to but more efficient than the formerly widely used lmrob.control(psi = psi)$tuning.psi.
These defaults are:
psi | tuning.chi | tuning.psi |
bisquare | 1.54764 | 4.685061 |
welsh | 0.5773502 | 2.11 |
ggw | c(-0.5, 1.5, NA, 0.5) | c(-0.5, 1.5, 0.95, NA) |
lqq | c(-0.5, 1.5, NA, 0.5) | c(-0.5, 1.5, 0.95, NA) |
optimal | 0.4047 | 1.060158 |
hampel | c(1.5, 3.5, 8)*0.2119163 | c(1.5, 3.5, 8)*0.9014 |
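The bisquare row of the table can be checked numerically in base R. The sketch below assumes the usual parametrization of Tukey's bisquare (rho scaled to a maximum of 1) and uses the standard formulas: the break-down point equals E[rho(Z)] and the asymptotic efficiency equals (E psi'(Z))^2 / E psi(Z)^2 for Z ~ N(0, 1). The function names are chosen here for illustration; the results should come out close to 0.5 and 0.95.

```r
## Tukey's bisquare with tuning constant cc, written out in base R
rho_bisq  <- function(x, cc) ifelse(abs(x) <= cc, 1 - (1 - (x / cc)^2)^3, 1)
psi_bisq  <- function(x, cc) ifelse(abs(x) <= cc, x * (1 - (x / cc)^2)^2, 0)
dpsi_bisq <- function(x, cc) ifelse(abs(x) <= cc,
                                    (1 - (x / cc)^2) * (1 - 5 * (x / cc)^2), 0)

## Integral of f(x, cc) * dnorm(x) over [-cc, cc]
E_inside <- function(f, cc)
  integrate(function(x) f(x, cc) * dnorm(x), -cc, cc)$value

c_chi <- 1.54764    # tuning.chi for "bisquare"
c_psi <- 4.685061   # tuning.psi for "bisquare"

## E[rho(Z)]: part inside [-cc, cc] plus rho = 1 on both tails
bb  <- E_inside(rho_bisq, c_chi) + 2 * pnorm(-c_chi)
## Asymptotic efficiency at the normal: (E psi')^2 / E psi^2
eff <- E_inside(dpsi_bisq, c_psi)^2 /
       E_inside(function(x, cc) psi_bisq(x, cc)^2, c_psi)

round(c(bb = bb, efficiency = eff), 3)
```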
The values for the tuning constant for the ggw psi function are hard coded. The constants vector has four elements: minimal slope, b (controlling the bend at the maximum of the curve), efficiency, break-down point. Use NA for an unspecified value, see examples in the tables.
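To make the four positions easier to read, the vectors from the table can be given names. The element names used below are illustrative labels only; robustbase itself works with the plain unnamed vectors.

```r
## Labels are an assumption for readability, not names used by robustbase.
ggw_fields <- c("min.slope", "b", "efficiency", "breakdown")
ggw_chi <- setNames(c(-0.5, 1.5, NA, 0.5),  ggw_fields)  # tuning.chi
ggw_psi <- setNames(c(-0.5, 1.5, 0.95, NA), ggw_fields)  # tuning.psi

## Which of the two criteria does a vector specify?
target <- function(v) {
  crit <- v[c("efficiency", "breakdown")]
  names(crit)[!is.na(crit)]
}

target(ggw_chi)
target(ggw_psi)
```

The S-step vector fixes the break-down point and leaves the efficiency as NA, while the M-step vector fixes the efficiency and leaves the break-down point as NA.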
The constants for the "hampel" psi function are chosen to have a redescending slope of -1/3. Constants for a slope of -1/2 would be
psi | tuning.chi | tuning.psi |
"hampel" | c(2, 4, 8) * 0.1981319 | c(2, 4, 8) * 0.690794 |
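The stated slopes follow directly from the three Hampel constants. Assuming the standard three-part Hampel psi with constants (a, b, r), psi falls linearly from a at x = b to 0 at x = r, so the descending slope is -a / (r - b), and a common scale factor cancels. The helper name hampel_slope() is chosen here for illustration.

```r
## Slope of the descending segment of Hampel's psi with constants (a, b, r):
## psi falls linearly from a at x = b to 0 at x = r.
hampel_slope <- function(k) {
  stopifnot(length(k) == 3L, k[1] <= k[2], k[2] < k[3])
  -k[1] / (k[3] - k[2])
}

slope_default <- hampel_slope(c(1.5, 3.5, 8) * 0.9014)   # default tuning.psi
slope_alt     <- hampel_slope(c(2, 4, 8) * 0.690794)     # alternative constants
c(default = slope_default, alternative = slope_alt)
```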
Alternative coefficients for an efficiency of 85% at the normal are given in the table below.
psi | tuning.psi |
bisquare | 3.443689 |
welsh | 1.456 |
ggw, lqq | c(-0.5, 1.5, 0.85, NA) |
optimal | 0.8684 |
hampel (-1/3) | c(1.5, 3.5, 8) * 0.5704545 |
hampel (-1/2) | c(2, 4, 8) * 0.4769578 |
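The 85% constants for the bisquare and Welsh functions can be checked by numerical integration in base R. The sketch assumes the parametrizations psi(x) = x (1 - (x/c)^2)^2 on [-c, c] for the bisquare and psi(x) = x exp(-(x/c)^2 / 2) for Welsh, and uses the asymptotic efficiency (E psi'(Z))^2 / E psi(Z)^2 at the standard normal. The function names are chosen here for illustration; both values should come out close to 0.85.

```r
## Asymptotic efficiency at the normal, (E psi')^2 / E psi^2, by quadrature
efficiency <- function(psi, dpsi, lower = -Inf, upper = Inf) {
  E <- function(f) integrate(function(x) f(x) * dnorm(x), lower, upper)$value
  E(dpsi)^2 / E(function(x) psi(x)^2)
}

eff_bisquare <- function(cc)
  efficiency(function(x) x * (1 - (x / cc)^2)^2,
             function(x) (1 - (x / cc)^2) * (1 - 5 * (x / cc)^2),
             lower = -cc, upper = cc)      # psi vanishes outside [-cc, cc]

eff_welsh <- function(cc)
  efficiency(function(x) x * exp(-(x / cc)^2 / 2),
             function(x) (1 - (x / cc)^2) * exp(-(x / cc)^2 / 2))

round(c(bisquare = eff_bisquare(3.443689), welsh = eff_welsh(1.456)), 3)
```

Calling eff_welsh(2.11) in the same way gives a value close to 0.95, matching the default tuning.psi for "welsh".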
.Mchi.tuning.default(psi) and .Mpsi.tuning.default(psi) return a short numeric vector of tuning constants which are defaults for the corresponding psi-function, see the Details. They are based on the named lists .Mchi.tuning.defaults and .Mpsi.tuning.defaults, respectively.
lmrob.control() returns a named list with over twenty components, corresponding to the arguments, where tuning.psi and tuning.chi are typically computed, as .Mpsi.tuning.default(psi) or .Mchi.tuning.default(psi), respectively.
Matias Salibian-Barrera, Martin Maechler and Manuel Koller
Koller, M. and Stahel, W.A. (2011) Sharpening Wald-type inference in robust regression for small samples. Computational Statistics & Data Analysis 55(8), 2504–2515.
Koller, M. and Stahel, W.A. (2014) Nonsingular subsampling for regression S estimators with categorical predictors. Under review. (2012 version available from https://arxiv.org/abs/1208.5595)
Mpsi, etc, for the (fast!) psi function computations; lmrob, also for references and examples.
## Show the default settings:
str(lmrob.control())

## Artificial data for a simple "robust t test":
set.seed(17)
y <- y0 <- rnorm(200)
y[sample(200,20)] <- 100*rnorm(20)
gr <- as.factor(rbinom(200, 1, prob = 1/8))
lmrob(y0 ~ 0+gr)

## Use Koller & Stahel(2011)'s recommendation but a larger 'max.it':
str(ctrl <- lmrob.control("KS2011", max.it = 1000))

str(.Mpsi.tuning.defaults)
stopifnot(identical(.Mpsi.tuning.defaults,
                    sapply(names(.Mpsi.tuning.defaults),
                           .Mpsi.tuning.default)))
## Containing (names!) all our (pre-defined) redescenders:
str(.Mchi.tuning.defaults)