RARECOVER {AssotesteR} | R Documentation |
RARECOVER is an algorithm proposed by Bhatia et al. (2010) that selects a set of variants by forward variable selection: starting from a null model without any genetic variants, variants are added to the model one at a time according to their statistical significance.
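The forward selection described above can be sketched roughly as follows. This is an illustrative sketch, not the package's internal code: `forward_select` and `chisq_stat` are hypothetical names, and the stopping rule (stop once the best improvement no longer exceeds `dif`) is an assumption based on the description of the `dif` argument.

```r
# Greedy forward selection sketch (hypothetical helper, not the package code).
# stat_fun(y, carrier) must return one number; larger means stronger association.
forward_select <- function(y, X, stat_fun, dif = 0.5) {
  X <- as.matrix(X)
  selected <- integer(0)
  carrier <- rep(FALSE, nrow(X))  # null model: no variant selected yet
  current <- 0
  repeat {
    candidates <- setdiff(seq_len(ncol(X)), selected)
    if (length(candidates) == 0) break
    # statistic obtained when each remaining variant is added to the current set
    stats <- vapply(candidates, function(j) {
      stat_fun(y, carrier | (!is.na(X[, j]) & X[, j] > 0))
    }, numeric(1))
    best <- which.max(stats)
    # assumed stopping rule: improvement must exceed dif
    if (stats[best] - current <= dif) break
    j <- candidates[best]
    selected <- c(selected, j)
    carrier <- carrier | (!is.na(X[, j]) & X[, j] > 0)
    current <- stats[best]
  }
  list(stat = current, set = selected)
}

# Example statistic (assumption): Pearson chi-square without continuity correction
chisq_stat <- function(y, carrier) {
  if (length(unique(carrier)) < 2 || length(unique(y)) < 2) return(0)
  tab <- table(carrier, y)
  unname(suppressWarnings(chisq.test(tab, correct = FALSE)$statistic))
}
```

A call such as `forward_select(y, X, chisq_stat, dif = 0.5)` returns the final statistic and the indices of the selected columns of `X`.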
RARECOVER(y, X, maf = 0.05, dif = 0.5, perm = 100)
y |
numeric vector with phenotype status: 0=controls, 1=cases. No missing data allowed |
X |
numeric matrix or data frame with genotype data coded as 0, 1, 2. Missing data is allowed |
maf |
numeric value indicating the minor allele frequency threshold for rare variants (0.05 by default) |
dif |
numeric value between 0 and 1 as a threshold for the decision criterion in the RARECOVER algorithm (0.5 by default) |
perm |
positive integer indicating the number of permutations (100 by default) |
The applied association test statistic (denoted as XCORR in Bhatia et al., 2010) is based on Pearson's chi-square statistic.
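To make the statistic concrete, here is a minimal sketch of Pearson's chi-square computed by hand on a 2x2 table. It assumes the table crosses carrier status (at least one minor allele among a chosen set of variants) with case/control status; that construction and the helper name `carrier_chisq` are illustrative assumptions, not taken from the package source.

```r
# Hypothetical helper: Pearson chi-square for carrier status vs. phenotype.
# y: 0/1 phenotype vector; X: genotype matrix (0, 1, 2); idx: columns to combine.
carrier_chisq <- function(y, X, idx) {
  X <- as.matrix(X)
  # a subject is a carrier if any chosen variant has a minor allele (NAs ignored)
  carrier <- rowSums(X[, idx, drop = FALSE] > 0, na.rm = TRUE) > 0
  tab <- table(factor(carrier, levels = c(FALSE, TRUE)),
               factor(y, levels = c(0, 1)))
  # expected counts under independence: row total * column total / N
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  if (any(expected == 0)) return(0)  # degenerate table, no evidence either way
  sum((tab - expected)^2 / expected)
}
```

For a non-degenerate table this should match `chisq.test(tab, correct = FALSE)$statistic`.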
The argument maf specifies the minor allele frequency threshold for rare variants. By default, only variants with a minor allele frequency below maf=0.05 are included in the analysis. To treat all variants in X as rare variants, set maf=1 so that all of them are included.
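As an illustration of the maf filter, the sketch below estimates each variant's minor allele frequency from the 0/1/2 genotype coding and keeps the columns below the threshold. The helper name `rare_variants` is hypothetical, and the package may compute frequencies differently; missing genotypes are simply skipped here via `na.rm = TRUE`.

```r
# Hypothetical helper: indices of variants whose estimated MAF is below a threshold.
rare_variants <- function(X, maf = 0.05) {
  X <- as.matrix(X)
  # frequency of the coded allele: mean genotype / 2, ignoring missing values
  freq <- colMeans(X, na.rm = TRUE) / 2
  # the minor allele frequency is the smaller of freq and 1 - freq
  freq <- pmin(freq, 1 - freq)
  which(freq < maf)
}
```

Because a minor allele frequency cannot exceed 0.5, calling `rare_variants(X, maf = 1)` keeps every column.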
There is no imputation for the missing data. Missing values are simply ignored in the computations.
An object of class "assoctest", basically a list with the following elements:
rc.stat |
rarecover statistic |
perm.pval |
permutation p-value |
set |
set of selected variants |
args |
descriptive information with number of controls, cases, variants, rare variants, maf, number of selected variants, and permutations |
name |
name of the statistic |
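The permutation p-value reported in perm.pval can be understood through a sketch like the one below, which shuffles the phenotype labels and recomputes a statistic each time. The helper `perm_pvalue` and its `stat_fun` argument are hypothetical, and the +1 correction in the numerator and denominator is a choice made for this sketch; the package may use a different convention.

```r
# Hypothetical helper: permutation p-value for an arbitrary statistic.
# stat_fun(y, X) must return a single number; larger means stronger association.
perm_pvalue <- function(y, X, stat_fun, perm = 100) {
  observed <- stat_fun(y, X)
  # recompute the statistic after shuffling case/control labels
  permuted <- replicate(perm, stat_fun(sample(y), X))
  # share of permutations at least as large as the observed value,
  # with a +1 correction so the p-value is never exactly zero
  (sum(permuted >= observed) + 1) / (perm + 1)
}
```

With perm = 100 the smallest attainable p-value under this convention is 1/101, so more permutations are needed to resolve smaller p-values.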
Gaston Sanchez
Bhatia G, Bansal V, Harismendy O, Schork NJ, Topol EJ, Frazer K, Bafna V (2010) A Covering Method for Detecting Genetic Associations between Rare Variants and Common Phenotypes. PLoS Computational Biology, 6(10): e1000954
## Not run: 
# number of cases
cases = 500

# number of controls
controls = 500

# total (cases + controls)
total = cases + controls

# phenotype vector
phenotype = c(rep(1, cases), rep(0, controls))

# genotype matrix with 10 variants (random data)
set.seed(1234)
genotype = matrix(rbinom(total*10, 2, 0.051), nrow=total, ncol=10)

# apply RARECOVER with maf=0.05 and 500 permutations
myrc = RARECOVER(phenotype, genotype, maf=0.05, perm=500)
myrc

## End(Not run)