NullDistribution {coin} | R Documentation
Specification of the asymptotic, approximative (Monte Carlo) and exact reference distribution.
asymptotic(maxpts = 25000, abseps = 0.001, releps = 0)
approximate(B = 10000, parallel = c("no", "multicore", "snow"),
            ncpus = 1, cl = NULL)
exact(algorithm = c("auto", "shift", "split-up"), fact = NULL)
maxpts: an integer, the maximum number of function values; see
        package mvtnorm.

abseps: a numeric, the absolute error tolerance; see package mvtnorm.

releps: a numeric, the relative error tolerance; see package mvtnorm.

B: a positive integer, the number of Monte Carlo replicates used for the
        computation of the approximative reference distribution. Defaults to
        10000.

parallel: a character, the type of parallel operation: either "no"
        (default), "multicore" or "snow".

ncpus: an integer, the number of processes to be used in parallel operation.
        Defaults to 1.

cl: an object inheriting from class "cluster", specifying an optional
        parallel or snow cluster to use if parallel = "snow". Defaults to
        NULL.

algorithm: a character, the algorithm used for the computation of the exact
        reference distribution: either "auto" (default), "shift" or
        "split-up".

fact: an integer to multiply the response values with. Defaults to NULL.
asymptotic, approximate and exact provide control of the specification of
the asymptotic, approximative (Monte Carlo) and exact reference
distributions, and are supplied to, e.g., independence_test via its
distribution argument.
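As a minimal sketch of how the three functions are passed through the distribution argument, the following runs the same two-sample test against each reference distribution. The data frame dat and its values are hypothetical, and the B argument is spelled as on this page; other coin releases may name it differently.

```r
library("coin")

## Hypothetical two-sample data with positive integer responses
dat <- data.frame(
    y = c(3, 5, 6, 9, 10, 12, 14, 15),
    g = factor(rep(c("a", "b"), each = 4))
)

## Asymptotic reference distribution (defaults for maxpts, abseps, releps)
p_asymptotic <- pvalue(independence_test(y ~ g, data = dat,
                                         distribution = asymptotic()))

## Approximative (Monte Carlo) reference distribution; seed set for
## reproducibility of the resampling
set.seed(123)
p_approximate <- pvalue(independence_test(y ~ g, data = dat,
                                          distribution = approximate(B = 10000)))

## Exact reference distribution (univariate two-sample problem)
p_exact <- pvalue(independence_test(y ~ g, data = dat,
                                    distribution = exact()))
```

With only eight observations the three p-values can differ noticeably, which is the situation where the choice of reference distribution matters most.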
The asymptotic reference distribution is computed using randomised quasi-Monte
Carlo methods (Genz and Bretz, 2009) and is applicable to arbitrary covariance
structures and dimensions up to 1000. See package mvtnorm for details
on maxpts, abseps and releps.
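To illustrate where maxpts, abseps and releps enter, the sketch below calls pmvnorm from package mvtnorm directly with the same tolerances as the defaults of asymptotic. It is not coin's internal code: the correlation matrix corr and the observed value tmax are hypothetical, and the p-value shown is that of a generic two-sided maximum-type statistic.

```r
library("mvtnorm")

## Hypothetical correlation matrix of a 3-dimensional standardised statistic
corr <- matrix(0.5, nrow = 3, ncol = 3)
diag(corr) <- 1

## Hypothetical observed maximum of the absolute standardised statistics
tmax <- 2.2

## P(|Z_1| <= tmax, |Z_2| <= tmax, |Z_3| <= tmax) by the randomised
## quasi-Monte Carlo algorithm of Genz and Bretz
set.seed(1)
prob <- pmvnorm(lower = rep(-tmax, 3), upper = rep(tmax, 3), corr = corr,
                algorithm = GenzBretz(maxpts = 25000, abseps = 0.001,
                                      releps = 0))

## Two-sided p-value of the maximum-type statistic
p_value <- 1 - as.numeric(prob)

## Estimated absolute error reported by the integration routine
err <- attr(prob, "error")
```

Because the integration is randomised, results are reproducible only up to the error tolerance unless a seed is set; tightening abseps generally requires a larger maxpts.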
The approximative (Monte Carlo) reference distribution is always applicable.
In this case, the distribution is obtained by a conditional Monte Carlo
procedure, i.e., by computing the test statistic for B random samples
from all admissible permutations of the response Y within each
block (Hothorn et al., 2008). By default, the distribution is obtained
through serial operation (parallel = "no"). Parallel operation is requested
by setting parallel to either "multicore" (not available for MS Windows) or
"snow". In the latter case, if cl = NULL (default), a cluster with ncpus
processes is created on the local machine, unless a default cluster has been
registered (see setDefaultCluster in package parallel), in which case the
registered cluster is used instead. Alternatively, an optional parallel or
snow cluster can be specified by cl. See ‘Examples’ and package parallel for
details on parallel operation.
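The conditional Monte Carlo procedure described above can be sketched in base R as follows. This is a simplified illustration, not coin's implementation: the helper mc_pvalue, the absolute difference in means used as test statistic, and the data are all hypothetical.

```r
## Hypothetical helper: Monte Carlo p-value for a two-sample statistic,
## permuting the response within each block
mc_pvalue <- function(y, group, block, B = 10000) {
    ## Test statistic: absolute difference in group means
    stat <- function(resp) {
        abs(mean(resp[group == levels(group)[1]]) -
            mean(resp[group == levels(group)[2]]))
    }
    observed <- stat(y)

    ## Draw one admissible permutation: shuffle y separately in each block
    permute_within <- function() {
        yp <- y
        for (idx in split(seq_along(y), block)) {
            yp[idx] <- y[idx[sample.int(length(idx))]]
        }
        yp
    }

    ## Statistic for B random permutations
    resampled <- replicate(B, stat(permute_within()))

    ## Proportion of resampled statistics at least as large as the observed
    ## one (small tolerance guards against floating-point ties)
    mean(resampled >= observed - sqrt(.Machine$double.eps))
}

## Hypothetical data: two groups measured in two blocks
y <- c(5.1, 4.8, 6.3, 6.9, 5.5, 5.0, 7.2, 6.8)
group <- factor(rep(c("ctrl", "trt"), times = 4))
block <- factor(rep(c("b1", "b2"), each = 4))

set.seed(123)
p <- mc_pvalue(y, group, block, B = 2000)
```

Permuting within blocks keeps every block's set of responses intact, so differences between blocks cannot masquerade as a group effect.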
The exact reference distribution is currently applicable to univariate
two-sample problems only, using either the shift algorithm (Streitberg and
Röhmel, 1986, 1987) or the split-up algorithm (van de Wiel,
2001). The shift algorithm is defined for positive integer valued scores
h(Y) only, but handles blocks pertaining to, e.g., pre- and
post-stratification. The split-up algorithm can be used with non-integer
scores, but does not handle blocks. By default, an automatic choice is made
(algorithm = "auto") but the shift and split-up algorithms can be
selected by setting algorithm to either "shift" or "split-up", respectively.
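To show why the shift algorithm needs positive integer scores, here is a minimal base R sketch of the underlying counting idea for a single block. The function shift_distribution is a hypothetical illustration and not the routine used by coin: it tabulates, for every possible score sum, how many subsets of size m attain it, by repeatedly shifting the count table by one score at a time.

```r
## Hypothetical helper: exact distribution of the sum of m scores drawn
## without replacement from positive integer scores
shift_distribution <- function(scores, m) {
    stopifnot(all(scores > 0), all(scores == round(scores)),
              m >= 1, m <= length(scores))
    total <- sum(scores)

    ## counts[k + 1, s + 1] = number of subsets of size k with score sum s
    counts <- matrix(0, nrow = m + 1, ncol = total + 1)
    counts[1, 1] <- 1

    for (a in scores) {
        ## Descending k ensures each score is used at most once per subset
        for (k in m:1) {
            ## Adding score a to a subset of size k - 1 shifts its sum by a
            counts[k + 1, (a + 1):(total + 1)] <-
                counts[k + 1, (a + 1):(total + 1)] +
                counts[k, 1:(total + 1 - a)]
        }
    }

    freq <- counts[m + 1, ]
    data.frame(sum = 0:total,
               prob = freq / choose(length(scores), m))[freq > 0, ]
}

## Ranks 1..5, first sample of size 2: exact upper-tail probability of a
## rank sum of at least 8
d <- shift_distribution(1:5, m = 2)
p_upper <- sum(d$prob[d$sum >= 8])  # 2 of 10 subsets, i.e. 0.2
```

The score sum indexes the columns of the count table, which is why the scores must be positive integers; non-integer scores such as midranks would first have to be rescaled to integers, which is presumably the purpose of fact.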
Starting with coin version 1.1-0, the default for algorithm is "auto",
having identical behaviour to "shift" in previous versions. In earlier
versions of the package, algorithm = "shift" silently switched to the
split-up algorithm if non-integer scores were detected, whereas the current
version exits with a warning.
Genz, A. and Bretz, F. (2009). Computation of Multivariate Normal and t Probabilities. Heidelberg: Springer-Verlag.
Hothorn, T., Hornik, K., van de Wiel, M. A. and Zeileis, A. (2008). Implementing a class of permutation tests: the coin package. Journal of Statistical Software 28(8), 1–23. http://www.jstatsoft.org/v28/i08/
Streitberg, B. and Röhmel, J. (1986). Exact distributions for permutations and rank tests: an introduction to some recently published algorithms. Statistical Software Newsletter 12(1), 10–17.
Streitberg, B. and Röhmel, J. (1987). Exakte Verteilungen für Rang- und Randomisierungstests im allgemeinen c-Stichprobenfall. EDV in Medizin und Biologie 18(1), 12–19.
van de Wiel, M. A. (2001). The split-up algorithm: a fast symbolic method for computing p-values of distribution-free statistics. Computational Statistics 16(4), 519–538.
## Approximative (Monte Carlo) Cochran-Mantel-Haenszel test

## Serial operation
set.seed(123)
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(B = 100000))

## Not run: 
## Multicore with 8 processes (not for MS Windows)
set.seed(123, kind = "L'Ecuyer-CMRG")
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(B = 100000,
                                    parallel = "multicore", ncpus = 8))

## Automatic PSOCK cluster with 4 processes
set.seed(123, kind = "L'Ecuyer-CMRG")
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(B = 100000,
                                    parallel = "snow", ncpus = 4))

## Registered FORK cluster with 12 processes (not for MS Windows)
fork12 <- parallel::makeCluster(12, "FORK") # set-up cluster
parallel::setDefaultCluster(fork12)         # register default cluster
set.seed(123, kind = "L'Ecuyer-CMRG")
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(B = 100000,
                                    parallel = "snow"))
parallel::stopCluster(fork12)               # clean-up

## User-specified PSOCK cluster with 8 processes
psock8 <- parallel::makeCluster(8, "PSOCK") # set-up cluster
set.seed(123, kind = "L'Ecuyer-CMRG")
cmh_test(disease ~ smoking | gender, data = alzheimer,
         distribution = approximate(B = 100000,
                                    parallel = "snow", cl = psock8))
parallel::stopCluster(psock8)               # clean-up

## End(Not run)