dts_stat {twosamples}    R Documentation

Two Sample Test

Description

A two-sample test based on the DTS statistic, with a p-value computed by resampling from the combined sample.

Usage

dts_stat(a, b, power = 1)

dts_test(a, b, nboots = 2000, p = default.p)

two_sample(a, b, nboots = 2000, p = default.p)

Arguments

a

a vector of numbers

b

a vector of numbers

power

the power to raise the test statistic to (plays the same role as p)

nboots

Number of bootstrap iterations

p

the power to raise the test statistic to

Details

The DTS test compares two ECDFs using a reweighted Wasserstein distance between them. Just as wass_test extends cvm_test to interval data, DTS extends ad_test to interval data. Formally, if E is the ECDF of sample 1, F is the ECDF of sample 2, and G is the ECDF of the combined sample, then DTS = Integral |E(x)-F(x)|/(G(x)(1-G(x))) across all x. The test p-value is calculated by randomly resampling two samples, of the same sizes as the originals, from the combined sample.

Intuitively, the DTS test improves on Anderson-Darling (AD) by allowing more extreme observations to carry more weight. At a higher level, CVM, AD, KS and similar tests only require ordinal data. DTS gains its power because it takes advantage of the properties of interval data, i.e. the distances between observations have some meaning. This is the same argument as for Wasserstein (WASS) versus AD/CVM/KS. However, DTS, like AD, also downweights noisier observations relative to WASS, which is intended to give it extra power.
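The formula and the resampling step described above can be illustrated with a short base-R sketch. This is not the package's implementation and may not reproduce its numbers exactly. The names dts_sketch and dts_pvalue_sketch are hypothetical, the integral is evaluated as a sum over the intervals between consecutive pooled observations, and drawing with replacement is an assumption, since the text only says the samples are resampled from the combined sample.

```r
# Sketch of DTS = Integral |E(x) - F(x)| / (G(x) (1 - G(x))) dx.
# Hypothetical helper, not part of the twosamples package.
dts_sketch <- function(a, b) {
  pooled <- sort(unique(c(a, b)))
  if (length(pooled) < 2) return(0)      # no interval to integrate over
  E  <- ecdf(a)                          # ECDF of sample 1
  Fb <- ecdf(b)                          # ECDF of sample 2
  G  <- ecdf(c(a, b))                    # ECDF of the combined sample
  left  <- pooled[-length(pooled)]       # left endpoint of each interval
  width <- diff(pooled)                  # interval lengths
  g <- G(left)                           # strictly between 0 and 1 here
  # The ECDFs are step functions, so the integrand is constant on each
  # interval; outside the pooled range E and F agree and contribute nothing.
  sum(abs(E(left) - Fb(left)) / (g * (1 - g)) * width)
}

# Sketch of the resampling p-value: the share of resampled statistics that
# are at least as large as the observed one.
# Assumption: resamples are drawn with replacement from the pooled data.
dts_pvalue_sketch <- function(a, b, nboots = 2000) {
  observed <- dts_sketch(a, b)
  pooled <- c(a, b)
  boots <- replicate(nboots, {
    new_a <- sample(pooled, length(a), replace = TRUE)
    new_b <- sample(pooled, length(b), replace = TRUE)
    dts_sketch(new_a, new_b)
  })
  c(stat = observed, p = mean(boots >= observed))
}
```

For two well-separated samples, such as rnorm(20) and rnorm(20, 4), the sketch should return a p-value at or near zero, because almost no resampled pair from the pooled data is as far apart as the original pair.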

Value

Output is a length-2 vector containing the test statistic and the p-value, in that order. The vector has 3 attributes: the size of each of the two samples, and the number of bootstraps performed for the p-value.

Functions

Examples

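vec1 <- rnorm(20)         # 20 draws from a standard normal
vec2 <- rnorm(20, 4)      # 20 draws from a normal with mean 4
dts_stat(vec1, vec2)      # DTS test statistic
dts_test(vec1, vec2)      # test statistic and bootstrap p-value
two_sample(vec1, vec2)    # takes the same arguments as dts_test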

[Package twosamples version 1.0.0 Index]