bisect_semi_suprevised {bisect} | R Documentation |
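Estimate cell type proportions for samples with unknown cell composition from methylation read counts, using additional samples whose cell composition is known.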
bisect_semi_suprevised(methylation_unkown_samples, total_reads_unknown_samples, methylation_known_samples, total_reads_known_samples, cell_composition_known_samples, alpha = NA, iterations = 200)
methylation_unkown_samples |
a matrix of individuals (rows) on sites (columns), containing the number of methylated reads for each site, in each individual for the samples with unknown cell composition. |
total_reads_unknown_samples |
a matrix of individuals (rows) on sites (columns), containing the total number of reads for each site, in each individual for the samples with unknown cell composition. |
methylation_known_samples |
a matrix of individuals (rows) on sites (columns), containing the number of methylated reads for each site, in each individual for the samples with known cell composition. |
total_reads_known_samples |
a matrix of individuals (rows) on sites (columns), containing the total number of reads for each site, in each individual for the samples with known cell composition. |
cell_composition_known_samples |
a matrix of individuals (rows) on cell types (columns), containing the proportion of each cell type, in each known sample. |
alpha |
a vector containing the hyper-parameters for the Dirichlet prior, one value for each cell type. If NA, each value is initialized to 1/(number of cell types). |
iterations |
the number of iterations to use in the EM algorithm. |
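The shape requirements described in the arguments can be checked before calling the function. The following is a minimal sketch, not part of the bisect package: `check_bisect_inputs` is a hypothetical helper, and the toy matrices are made up purely for illustration. It checks that the read-count matrices agree in their dimensions, that methylated counts never exceed total counts, and it builds the default `alpha` of 1/(number of cell types) described above.

```r
# Hypothetical helper (not provided by bisect): sanity-check input shapes
# and return the alpha vector that would be used.
check_bisect_inputs <- function(methylation_unknown, total_unknown,
                                methylation_known, total_known,
                                cell_composition_known, alpha = NA) {
  stopifnot(
    # methylated and total read matrices must have matching shapes
    identical(dim(methylation_unknown), dim(total_unknown)),
    identical(dim(methylation_known), dim(total_known)),
    # known and unknown samples must cover the same sites (columns)
    ncol(methylation_unknown) == ncol(methylation_known),
    # one row of cell proportions per known sample
    nrow(methylation_known) == nrow(cell_composition_known),
    # methylated reads cannot exceed total reads
    all(methylation_unknown <= total_unknown),
    all(methylation_known <= total_known)
  )
  n_cell_types <- ncol(cell_composition_known)
  # Default described in the documentation: 1 / (number of cell types)
  if (length(alpha) == 1 && is.na(alpha)) {
    alpha <- rep(1 / n_cell_types, n_cell_types)
  }
  stopifnot(length(alpha) == n_cell_types)
  alpha
}

# Toy data (made up): 3 known samples, 4 unknown samples, 5 sites, 2 cell types
set.seed(1)
n_sites <- 5
n_known <- 3
n_unknown <- 4
total_known <- matrix(rpois(n_known * n_sites, 30) + 1L, nrow = n_known)
total_unknown <- matrix(rpois(n_unknown * n_sites, 30) + 1L, nrow = n_unknown)
methylation_known <- matrix(
  rbinom(n_known * n_sites, size = total_known, prob = 0.5), nrow = n_known)
methylation_unknown <- matrix(
  rbinom(n_unknown * n_sites, size = total_unknown, prob = 0.5), nrow = n_unknown)
cell_composition_known <- matrix(c(0.7, 0.3,
                                   0.4, 0.6,
                                   0.5, 0.5), nrow = n_known, byrow = TRUE)

alpha <- check_bisect_inputs(methylation_unknown, total_unknown,
                             methylation_known, total_known,
                             cell_composition_known)
```

Running such a check first tends to give clearer error messages than a dimension mismatch surfacing inside the EM iterations.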
A list containing P, a matrix of estimated cell proportions for the unknown samples, and Pi, an estimated reference (the probability of methylation in each cell type).
## Randomly choose samples to be used as known
n_known_samples <- 50
known_samples_indices <- sample.int(nrow(baseline_GSE40279), size = n_known_samples)
known_samples <- as.matrix(baseline_GSE40279[known_samples_indices, ])

## Fit a Dirichlet distribution to the known samples to use as prior
fit_dirichlet <- sirt::dirichlet.mle(as.matrix(known_samples))
alpha <- fit_dirichlet$alpha

## Prepare the 4 needed matrices
methylation_known <- methylation_GSE40279[known_samples_indices, ]
methylation_unknown <- methylation_GSE40279[-known_samples_indices, ]
total_known <- total_reads_GSE40279[known_samples_indices, ]
total_unknown <- total_reads_GSE40279[-known_samples_indices, ]

## Run Bisect. Around 200 iterations are recommended; only 10 are used here to speed up the example.
results <- bisect_semi_suprevised(methylation_unknown, total_unknown,
                                  methylation_known, total_known,
                                  known_samples, alpha, iterations = 10)
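When reference proportions happen to be available for the samples treated as unknown, the estimates in the returned `P` matrix can be compared against them. The following is a minimal sketch under that assumption: `proportion_error` is a hypothetical helper, not part of bisect, and the two small matrices are made-up stand-ins for the estimated `P` and the reference proportions.

```r
# Hypothetical helper (not provided by bisect): mean absolute error
# per cell type between estimated and reference proportions.
proportion_error <- function(P_estimated, P_reference) {
  # both matrices must be samples (rows) by cell types (columns)
  stopifnot(identical(dim(P_estimated), dim(P_reference)))
  colMeans(abs(P_estimated - P_reference))
}

# Made-up values standing in for the estimated P and the reference proportions
P_estimated <- matrix(c(0.60, 0.40,
                        0.30, 0.70), nrow = 2, byrow = TRUE)
P_reference <- matrix(c(0.50, 0.50,
                        0.35, 0.65), nrow = 2, byrow = TRUE)

errors <- proportion_error(P_estimated, P_reference)
print(errors)
```

A per-cell-type summary is used here because rare cell types can have small absolute errors that are still large relative to their proportion; other error measures may suit a given analysis better.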