snpRecode {gdmp} | R Documentation |
snpRecode
is a function to convert SNP genotypes to 0, 1, and 2 for the homozygous,
heterozygous, and other homozygous genotype, respectively.
snpRecode(snpG, designat)
snpG |
is a column vector in the genotypes array, created by toArray. |
designat |
is the two-base allele designation for each SNP. This is sometimes called allele report data, where the specific bases of alleles A and B are reported. Formatted as a data frame with two factors, one for allele A and one for allele B. See ‘Examples’. |
Recodes SNP genotypes by counting the number of copies of allele A in each element of snpG, where snpG is a column vector in the genotypes array ga, and ga is the genotypes array created by toArray. The array contains elements such as "AA", "AG", "GA", "-A", "- -".
Unknown genotypes are those containing a base other than A, G, C, or T; they are coded as 5.
A column vector of the integers 0, 1, and 2 is created based on the number of copies of allele A in each element of the supplied vector of genotypes. A value of 5 is used to indicate an unknown genotype.
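The counting rule described above can be sketched in base R. This is only an illustration, not the gdmp implementation: the function name recode_sketch and its arguments alleleA and alleleB are hypothetical, and the sketch assumes that any genotype that is not exactly two characters, each matching allele A or allele B, is treated as unknown.

```r
## Hypothetical helper, not part of gdmp: counts copies of allele A
## in each genotype string and returns 5 for unknown genotypes.
recode_sketch <- function(snpG, alleleA, alleleB) {
  snpG    <- as.character(snpG)
  alleleA <- as.character(alleleA)   # factors are converted to their labels
  alleleB <- as.character(alleleB)
  vapply(snpG, function(g) {
    bases <- strsplit(g, "")[[1]]    # split the genotype into single characters
    ## Assumption: a valid genotype has exactly two characters, each equal
    ## to allele A or allele B; anything else is coded as unknown (5).
    if (length(bases) != 2 || !all(bases %in% c(alleleA, alleleB)))
      return(5L)
    sum(bases == alleleA)            # number of copies of allele A: 0, 1, or 2
  }, integer(1), USE.NAMES = FALSE)
}

## One SNP with allele A = "A" and allele B = "G"
recode_sketch(c("AA", "AG", "GA", "GG", "-A", "- -"), "A", "G")
# [1] 2 1 1 0 5 5
```

Under this sketch, the order of the two bases does not matter, so "AG" and "GA" both recode to 1.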
## Simulate random allele designations for 100 bi-allelic SNPs
set.seed(2016)
desig <- array(sample(c('A','C','G','T'), size = 200, repl = TRUE), dim=c(100, 2))

## Simulate random SNP genotypes for 20 individuals - put them in array format
## '-' indicates an unknown base
ga <- array(0, dim=c(20, 100))
for(i in 1:20)
  for(j in 1:100)
    ga[i, j] <- paste(sample(c(desig[j,],"-"), 2, prob=c(.46, .46, .08), repl=TRUE), collapse='')

## Recode the matrix, place recoded genotypes in ga.r
desig <- data.frame(AlleleA_Forward = factor(desig[,1]), AlleleB_Forward = factor(desig[,2]))
ga.r <- array(5, dim=c(20, 100))
for(i in 1:100)
  ga.r[,i] <- snpRecode(ga[,i], desig[i,])

## Tabulate recoded genotypes in the matrix ga.r
table(ga.r)
#   0   1   2   5
# 326 632 701 341