SurvivalTests {coin} | R Documentation |
Testing the equality of the survival distributions in two or more independent groups.
## S3 method for class 'formula'
logrank_test(formula, data, subset = NULL, weights = NULL, ...)

## S3 method for class 'IndependenceProblem'
logrank_test(object, ties.method = c("mid-ranks", "Hothorn-Lausen",
                                     "average-scores"),
             type = c("logrank", "Gehan-Breslow", "Tarone-Ware",
                      "Prentice", "Prentice-Marek",
                      "Andersen-Borgan-Gill-Keiding",
                      "Fleming-Harrington", "Self"),
             rho = NULL, gamma = NULL, ...)
formula |
a formula of the form |
data |
an optional data frame containing the variables in the model formula. |
subset |
an optional vector specifying a subset of observations to be used. Defaults
to |
weights |
an optional formula of the form |
object |
an object inheriting from class |
ties.method |
a character, the method used to handle ties: the score generating function
either uses mid-ranks ( |
type |
a character, the type of test: either |
rho |
a numeric, the ρ constant when |
gamma |
a numeric, the γ constant when |
... |
further arguments to be passed to |
logrank_test provides the weighted logrank test reformulated as a
linear rank test. The family of weighted logrank tests encompasses a large
collection of tests commonly used in the analysis of survival data including,
but not limited to, the standard (unweighted) logrank test, the Gehan-Breslow
test, the Tarone-Ware class of tests, the Prentice test, the Prentice-Marek
test, the Andersen-Borgan-Gill-Keiding test, the Fleming-Harrington class of
tests and the Self class of tests. A general description of these methods is
given by Klein and Moeschberger (2003, Ch. 7). See Letón and
Zuluaga (2001) for the linear rank test formulation.
The null hypothesis of equality, or conditional equality given block, of the
survival distribution of y in the groups defined by x is tested. In the
two-sample case, the two-sided null hypothesis is H_0: θ = 1, where
θ = λ_2 / λ_1 and λ_s is the hazard rate in the sth sample. When
alternative = "less", the null hypothesis is H_0: θ ≥ 1, i.e., the
alternative is that survival is lower in sample 1 than in sample 2. When
alternative = "greater", the null hypothesis is H_0: θ ≤ 1, i.e., the
alternative is that survival is higher in sample 1 than in sample 2.
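The direction of the one-sided alternatives can be illustrated with a small sketch. It assumes exponential survival in both samples, which is an assumption made only for this illustration (the test itself does not require it); the rates lambda1 and lambda2 are hypothetical values.

```r
## Hypothetical constant hazard rates for samples 1 and 2 (illustration only)
lambda1 <- 0.10
lambda2 <- 0.05
theta <- lambda2 / lambda1        # hazard ratio theta = lambda_2 / lambda_1

## Exponential survival functions S_s(t) = exp(-lambda_s * t)
t <- seq(1, 20, by = 1)
S1 <- exp(-lambda1 * t)
S2 <- exp(-lambda2 * t)

## theta < 1 goes together with lower survival in sample 1, which is the
## direction targeted by alternative = "less"
theta < 1
all(S1 < S2)
```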
If x is an ordered factor, the default scores, 1:nlevels(x), can be altered
using the scores argument (see independence_test); this argument can also be
used to coerce nominal factors to class "ordered". In this case, a
linear-by-linear association test is computed and the direction of the
alternative hypothesis can be specified using the alternative argument. This
type of extension of the standard logrank test was given by Tarone (1975) and
later generalized to general weights by Tarone and Ware (1977).
Let (t_i, δ_i), i = 1, 2, …, n, represent a censored random sample of size n, where t_i is the observed survival time and δ_i is the status indicator (δ_i is 0 for censored observations and 1 otherwise). To allow for ties in the data, let t_(1) < t_(2) < … < t_(m) represent the m, m ≤ n, ordered distinct event times. At time t_(k), k = 1, 2, …, m, the number of events and the number of subjects at risk are given by d_k = sum(i = 1, …, n) I(t_i = t_(k) | δ_i = 1) and n_k = n - r_k, respectively, where r_k depends on the ties handling method.
Three different methods of handling ties are available using ties.method:
mid-ranks ("mid-ranks", default), the Hothorn-Lausen method ("Hothorn-Lausen")
and average-scores ("average-scores"). The first and last methods are
discussed and contrasted by Callaert (2003), whereas the second method is
defined in Hothorn and Lausen (2003). The mid-ranks method leads to
r_k = sum(i = 1, …, n) I(t_i < t_(k))
whereas the Hothorn-Lausen method uses
r_k = sum(i = 1, …, n) I(t_i <= t_(k)) - 1.
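As a worked illustration of these definitions, the following base R sketch tabulates d_k and n_k = n - r_k under both the mid-ranks and the Hothorn-Lausen definitions of r_k. It uses the Callaert (2003, Tab. 1) data from the examples on this page; the variable names are chosen here for illustration and the code is not taken from the package internals.

```r
## Example data (Callaert, 2003, Tab. 1); all observations are events
time  <- c(1, 1, 5, 6, 6, 6, 6, 2, 2, 2, 3, 4, 4, 5, 5)
event <- rep(1, length(time))
n <- length(time)

## Ordered distinct event times t_(1) < ... < t_(m)
tk <- sort(unique(time[event == 1]))

## Number of events d_k at each distinct event time
d <- sapply(tk, function(s) sum(time == s & event == 1))

## r_k under the two definitions given above
r_mid <- sapply(tk, function(s) sum(time < s))         # mid-ranks
r_hl  <- sapply(tk, function(s) sum(time <= s)) - 1    # Hothorn-Lausen

## Numbers at risk n_k = n - r_k, side by side
data.frame(tk, d, n_mid = n - r_mid, n_HL = n - r_hl)
```

With these data the two definitions agree only at the third event time, where a single event occurs.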
The scores assigned to censored and uncensored observations at the kth event time are given by
C_k = sum(j=1,…,k) w_j * (d_j / n_j) and c_k = C_k - w_k,
respectively, where w is the logrank weight. For the average-scores method, used by, e.g., the software package StatXact, the d_k events observed at the kth event time are arbitrarily ordered by assigning them distinct values t_(k_l), l = 1, 2, …, d_k, infinitesimally to the left of t_(k). Then scores C_k_l and c_k_l are computed as indicated above, effectively assuming that no event times are tied. The scores C_k and c_k are assigned the average of the scores C_k_l and c_k_l respectively. It then follows that the score for the ith subject is
a_i = C_k' if δ_i = 0, and a_i = c_k' otherwise,
where k' = max{k : t_i >= t_(k)}.
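The score construction can be traced by hand in base R. The sketch below assumes the mid-ranks definition of r_k and the standard logrank weights w_k = 1, and uses the lung cancer data from the examples on this page. It is an illustration of the formulas above, not the implementation used by logrank_trafo.

```r
## Lung cancer data from the examples (event = 0 means censored)
time  <- c(257, 476, 355, 1779, 355, 191, 563, 242, 285, 16, 16, 16, 257, 16)
event <- c(0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1)
n <- length(time)

tk <- sort(unique(time[event == 1]))                        # distinct event times
d  <- sapply(tk, function(s) sum(time == s & event == 1))   # events d_k
nk <- n - sapply(tk, function(s) sum(time < s))             # n_k, mid-ranks

w  <- rep(1, length(tk))    # standard logrank weights, w_k = 1
Ck <- cumsum(w * d / nk)    # scores for censored observations
ck <- Ck - w                # scores for uncensored observations

## k' = max{k : t_i >= t_(k)}; every observation here has k' >= 1 because
## the smallest observed time is an event time
kprime <- findInterval(time, tk)
a <- ifelse(event == 0, Ck[kprime], ck[kprime])

round(a, 3)
sum(a)    # zero up to rounding error for the unweighted logrank scores
```

Data with censored observations before the first event time would need an extra case (k' undefined), which this sketch does not handle.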
The type argument allows for a choice between some of the most well-known
members of the family of weighted logrank tests, each corresponding to a
particular weight function. The standard logrank test ("logrank", default)
was suggested by Mantel (1966), Peto and Peto (1972) and Cox (1972) and has
w_k = 1. The Gehan-Breslow test ("Gehan-Breslow") proposed by Gehan (1965)
and later extended to K samples by Breslow (1970) is a generalization of the
Wilcoxon rank-sum test, where w_k = n_k. The Tarone-Ware class of tests
("Tarone-Ware") discussed by Tarone and Ware (1977) has w_k = n_k^ρ, where ρ
is a constant; ρ = 0.5 (default) was suggested by Tarone and Ware (1977), but
note that ρ = 0 and ρ = 1 lead to the standard logrank test and the
Gehan-Breslow test respectively. The Prentice test ("Prentice") is another
generalization of the Wilcoxon rank-sum test suggested by Prentice (1978),
where
w_k = prod(j = 1, …, k) n_j / (n_j + d_j).
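The first four weight functions can be written down directly in base R. The sketch below takes the numbers at risk and numbers of events from the Callaert (2003) example data under mid-ranks as given inputs; the helper tarone_ware is a name introduced here for illustration and is not part of the package.

```r
## n_k and d_k at the six distinct event times of the Callaert (2003)
## example data, using the mid-ranks definition of n_k
nk <- c(15, 13, 10, 9, 7, 4)
d  <- c(2, 3, 1, 2, 3, 4)

w_logrank  <- rep(1, length(nk))             # w_k = 1
w_gehan    <- nk                             # w_k = n_k
tarone_ware <- function(rho = 0.5) nk^rho    # w_k = n_k^rho (hypothetical helper)
w_prentice <- cumprod(nk / (nk + d))         # w_k = prod_{j <= k} n_j / (n_j + d_j)

## rho = 0 and rho = 1 recover the logrank and Gehan-Breslow weights
all.equal(tarone_ware(0), w_logrank)
all.equal(tarone_ware(1), w_gehan)

round(cbind(w_logrank, w_gehan, w_TW = tarone_ware(), w_prentice), 3)
```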
The Prentice-Marek test ("Prentice-Marek") is yet another generalization of
the Wilcoxon rank-sum test discussed by Prentice and Marek (1979), with
w_k = prod(j = 1, …, k) (n_j + 1 - d_j) / (n_j + 1).
The Andersen-Borgan-Gill-Keiding test ("Andersen-Borgan-Gill-Keiding")
proposed by Andersen et al. (1982) is a modified version of the
Prentice-Marek test using
w_k = (n_k / (n_k + 1)) prod(j = 0, …, k - 1) (n_j + 1 - d_j) / (n_j + 1)
where n_0 := n and d_0 := 0. The Fleming-Harrington class of tests
("Fleming-Harrington") suggested by Fleming and Harrington (1991) uses
w_k = Shat_k^ρ * (1 - Shat_k)^γ, where ρ and γ are constants and
Shat_k = prod(j = 0, …, k - 1) (n_j - d_j) / n_j, Shat_0 := 1
is the left-continuous Kaplan-Meier estimator of the survival function;
ρ = 0 and γ = 0 lead to the standard logrank test. The Self class of tests
("Self") proposed by Self (1991) has w_k = v_k^ρ * (1 - v_k)^γ, where
v_k = 1 / 2 * (t_(k - 1) + t_(k)) / t_(m), t_(0) := 0
is the standardized mid-point between the (k - 1)th and the kth event time. (This is a slight generalization of Self's original proposal in order to allow for non-integer follow-up times.) Again, ρ and γ are constants and ρ = 0 and γ = 0 lead to the standard logrank test.
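The product-type weights can be sketched in base R with cumprod. The example again treats n_k and d_k from the Callaert (2003) data (mid-ranks, no censoring) as given inputs; fleming_harrington is a helper name introduced here for illustration, not a function of the package.

```r
## n_k and d_k at the six distinct event times (mid-ranks, no censoring)
nk <- c(15, 13, 10, 9, 7, 4)
d  <- c(2, 3, 1, 2, 3, 4)
m  <- length(nk)

## Prentice-Marek: w_k = prod_{j = 1..k} (n_j + 1 - d_j) / (n_j + 1)
w_pm <- cumprod((nk + 1 - d) / (nk + 1))

## Andersen-Borgan-Gill-Keiding: the product runs over j = 0..k-1 and the
## j = 0 factor equals 1 because n_0 := n and d_0 := 0
w_abgk <- nk / (nk + 1) * c(1, w_pm[-m])

## Left-continuous Kaplan-Meier estimate Shat_k, same index shift
Shat <- c(1, cumprod((nk - d) / nk)[-m])

## Fleming-Harrington weights (hypothetical helper)
fleming_harrington <- function(rho, gamma) Shat^rho * (1 - Shat)^gamma

fleming_harrington(0, 0)            # all ones: the standard logrank weights
round(fleming_harrington(1, 0), 3)  # emphasizes early differences
round(fleming_harrington(0, 1), 3)  # emphasizes late differences
```

Without censoring, Shat_k reduces to n_k / n, which gives a quick check of the index shift.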
The conditional null distribution of the test statistic is used to obtain
p-values and an asymptotic approximation of the exact distribution is used by
default (distribution = "asymptotic"). Alternatively, the distribution can be
approximated via Monte Carlo resampling or computed exactly for univariate
two-sample problems by setting distribution to "approximate" or "exact"
respectively. See asymptotic, approximate and exact for details.
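What "conditional null distribution" means in the two-sample case can be sketched by brute force: the scores are held fixed and all assignments of observations to groups are enumerated. The base R code below does this for the Callaert (2003) data with logrank scores under mid-ranks. It is a didactic sketch and makes no claim to reproduce the algorithm or the numerical details used by the package.

```r
## Callaert (2003) example data; all observations are events
time  <- c(1, 1, 5, 6, 6, 6, 6, 2, 2, 2, 3, 4, 4, 5, 5)
group <- rep(0:1, c(7, 8))
n <- length(time)

## Logrank scores under mid-ranks (w_k = 1, no censoring)
tk <- sort(unique(time))
d  <- sapply(tk, function(s) sum(time == s))
nk <- n - sapply(tk, function(s) sum(time < s))
a  <- (cumsum(d / nk) - 1)[match(time, tk)]

## Linear statistic: sum of the scores in group 0
t_obs <- sum(a[group == 0])

## Conditional null distribution: all choose(15, 7) = 6435 ways of
## assigning seven of the fifteen fixed scores to group 0
t_all <- combn(n, 7, function(idx) sum(a[idx]))
mu <- 7 * mean(a)    # conditional expectation of the statistic

## Two-sided p-value; the tolerance guards against floating-point ties
p_exact <- mean(abs(t_all - mu) >= abs(t_obs - mu) - 1e-10)
p_exact
```

Replacing the full enumeration by a random sample of group assignments gives the Monte Carlo variant of the same idea.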
An object inheriting from class "IndependenceTest".
The test statistic differs from the one used by survdiff (in package
survival), since the conditional variance used by logrank_test is not
identical to the variance estimate used by the classical logrank test.
Combining independence_test or symmetry_test with logrank_trafo offers more
flexibility than logrank_test and allows for, among other things,
maximum-type versatile test procedures (e.g., Lee, 1996; see ‘Examples’) and
user-supplied logrank weights (see GTSG for tests against Weibull-type or
crossing-curve alternatives).
Starting with version 1.1-0, logrank_test replaces surv_test, which is now
deprecated and will be removed in a future release. Furthermore,
logrank_trafo is now an increasing function for all choices of ties.method,
implying that the test statistic has the same sign irrespective of the ties
handling method. Consequently, the sign of the test statistic will now be the
opposite of what it was in earlier versions unless
ties.method = "average-scores". (In versions of coin prior to 1.1-0,
logrank_trafo was a decreasing function when ties.method was other than
"average-scores".)
Andersen, P. K., Borgan, Ø., Gill, R. and Keiding, N. (1982). Linear nonparametric tests for comparison of counting processes, with applications to censored survival data (with discussion). International Statistical Review 50(3), 219–258.
Breslow, N. (1970). A generalized Kruskal-Wallis test for comparing K samples subject to unequal patterns of censorship. Biometrika 57(3), 579–594.
Callaert, H. (2003). Comparing statistical software packages: The case of the logrank test in StatXact. The American Statistician 57(3), 214–217.
Cox, D. R. (1972). Regression models and life-tables (with discussion). Journal of the Royal Statistical Society B 34(2), 187–220.
Fleming, T. R. and Harrington, D. P. (1991). Counting Processes and Survival Analysis. New York: John Wiley & Sons.
Gehan, E. A. (1965). A generalized Wilcoxon test for comparing arbitrarily singly-censored samples. Biometrika 52(1–2), 203–223.
Hothorn, T. and Lausen, B. (2003). On the exact distribution of maximally selected rank statistics. Computational Statistics & Data Analysis 43(2), 121–137.
Klein, J. P. and Moeschberger, M. L. (2003). Survival Analysis: Techniques for Censored and Truncated Data, Second Edition. New York: Springer.
Lee, J. W. (1996). Some versatile tests based on the simultaneous use of weighted log-rank statistics. Biometrics 52(2), 721–725.
Letón, E. and Zuluaga, P. (2001). Equivalence between scores and weighted tests for survival curves. Communications in Statistics – Theory and Methods 30(4), 591–608.
Mantel, N. (1966). Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemotherapy Reports 50(3), 163–170.
Peto, R. and Peto, J. (1972). Asymptotically efficient rank invariant test procedures (with discussion). Journal of the Royal Statistical Society A 135(2), 185–207.
Prentice, R. L. (1978). Linear rank tests with right censored data. Biometrika 65(1), 167–179.
Prentice, R. L. and Marek, P. (1979). A qualitative discrepancy between censored data rank tests. Biometrics 35(4), 861–867.
Self, S. G. (1991). An adaptive weighted log-rank test with application to cancer prevention and screening trials. Biometrics 47(3), 975–986.
Tarone, R. E. (1975). Tests for trend in life table analysis. Biometrika 62(3), 679–682.
Tarone, R. E. and Ware, J. (1977). On distribution-free tests for equality of survival distributions. Biometrika 64(1), 156–160.
library("coin")   # also attaches survival, which provides Surv() and survdiff()

## Example data (Callaert, 2003, Tab. 1)
callaert <- data.frame(
    time = c(1, 1, 5, 6, 6, 6, 6, 2, 2, 2, 3, 4, 4, 5, 5),
    group = factor(rep(0:1, c(7, 8)))
)

## Logrank scores using mid-ranks (Callaert, 2003, Tab. 2)
with(callaert,
     logrank_trafo(Surv(time)))

## Classical asymptotic logrank test (p = 0.0523)
survdiff(Surv(time) ~ group, data = callaert)

## Exact logrank test using mid-ranks (p = 0.0505)
logrank_test(Surv(time) ~ group, data = callaert, distribution = "exact")

## Exact logrank test using average-scores (p = 0.0468)
logrank_test(Surv(time) ~ group, data = callaert, distribution = "exact",
             ties.method = "average-scores")

## Lung cancer data (StatXact 9 manual, p. 213, Tab. 7.19)
lungcancer <- data.frame(
    time = c(257, 476, 355, 1779, 355,
             191, 563, 242, 285, 16, 16, 16, 257, 16),
    event = c(0, 0, 1, 1, 0,
              1, 1, 1, 1, 1, 1, 1, 1, 1),
    group = factor(rep(1:2, c(5, 9)),
                   labels = c("newdrug", "control"))
)

## Logrank scores using average-scores (StatXact 9 manual, p. 214)
with(lungcancer,
     logrank_trafo(Surv(time, event), ties.method = "average-scores"))

## Exact logrank test using average-scores (StatXact 9 manual, p. 215)
logrank_test(Surv(time, event) ~ group, data = lungcancer,
             distribution = "exact", ties.method = "average-scores")

## Exact Prentice test using average-scores (StatXact 9 manual, p. 222)
logrank_test(Surv(time, event) ~ group, data = lungcancer,
             distribution = "exact", ties.method = "average-scores",
             type = "Prentice")

## Approximate (Monte Carlo) versatile test (Lee, 1996)
rho.gamma <- expand.grid(rho = seq(0, 2, 1), gamma = seq(0, 2, 1))
lee_trafo <- function(y)
    logrank_trafo(y, ties.method = "average-scores",
                  type = "Fleming-Harrington",
                  rho = rho.gamma["rho"], gamma = rho.gamma["gamma"])
it <- independence_test(Surv(time, event) ~ group, data = lungcancer,
                        distribution = approximate(B = 10000),
                        ytrafo = function(data)
                            trafo(data, surv_trafo = lee_trafo))
pvalue(it, method = "step-down")