fun.chisq.test {FunChisq} | R Documentation |
Asymptotic chi-squared, normalized chi-squared or exact tests on contingency tables to determine model-free functional dependency of the column variable on the row variable.
fun.chisq.test(
  x,
  method = c("fchisq", "nfchisq",
             "exact", "exact.qp", "exact.dp", "exact.dqp",
             "default", "normalized", "simulate.p.value"),
  alternative = c("non-constant", "all"), log.p = FALSE,
  index.kind = c("conditional", "unconditional"),
  simulate.nruns = 2000, exact.mode.bound = TRUE
)
x |
a matrix representing a contingency table. The row variable represents the independent variable or all unique combinations of multiple independent variables. The column variable is the dependent variable. |
method |
a character string to specify the method to compute the functional chi-squared test statistic and its p-value. The options are Note: |
alternative |
a character string to specify the alternative hypothesis. The options are |
log.p |
logical; if |
index.kind |
a character string to specify the kind of function index xi.f to be estimated. The options are |
simulate.nruns |
a number to specify the number of tables generated to simulate the null distribution. Default is 2000. |
exact.mode.bound |
logical; if |
The functional chi-squared test determines whether the column variable is a function of the row variable in contingency table x (Zhang and Song, 2013; Zhang, 2014). This function supports three hypothesis testing methods:
index.kind specifies the kind of function index to be computed. If the experimental design controls neither the row nor column marginal sums, index.kind = "unconditional" is recommended; if the column marginal sums are controlled, index.kind = "conditional" (default) is recommended. The choice of index.kind affects only the function index xi.f value, but not the test statistic or p-value.
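As a rough check of the last statement, the sketch below runs the test twice on one table and compares the returned components. It assumes the FunChisq package is installed and uses only arguments documented on this page; the table is the one from Example 1.

```r
# Sketch: compare the two index kinds on one table (requires FunChisq).
if (requireNamespace("FunChisq", quietly = TRUE)) {
  x <- matrix(c(20, 0, 20, 0, 20, 0, 5, 0, 5), 3)

  cond   <- FunChisq::fun.chisq.test(x, index.kind = "conditional")
  uncond <- FunChisq::fun.chisq.test(x, index.kind = "unconditional")

  # The estimate (function index xi.f) may differ between the two kinds.
  print(c(conditional   = unname(cond$estimate),
          unconditional = unname(uncond$estimate)))

  # The test statistic and p-value should be the same for both kinds.
  print(all.equal(unname(cond$statistic), unname(uncond$statistic)))
  print(all.equal(cond$p.value, uncond$p.value))
}
```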
When method="fchisq" (equivalent to "default", the default), the test statistic is computed as described in (Zhang and Song, 2013; Zhang, 2014) and the p-value is computed using the chi-squared distribution.
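A minimal base-R sketch of the asymptotic test may help fix ideas. The formula is an assumed reading of the statistic in Zhang and Song (2013), not code taken from the package: each row is compared with a uniform distribution over the columns, and the deviation of the column marginal from uniform is subtracted. The helper name `fun_chisq_sketch` and the degrees of freedom (r - 1)(s - 1) are likewise assumptions of this sketch.

```r
# Hypothetical helper, not part of FunChisq.
# Rows of x: independent variable; columns of x: dependent variable.
fun_chisq_sketch <- function(x) {
  n <- sum(x)
  s <- ncol(x)                       # number of levels of the column variable
  row_sums <- rowSums(x)
  keep <- row_sums > 0               # empty rows contribute nothing
  expected <- row_sums[keep] / s     # uniform spread within each row

  # Deviation of each row from a uniform distribution over the columns
  row_term <- sum((x[keep, , drop = FALSE] - expected)^2 / expected)
  # Deviation of the column marginal from a uniform distribution
  col_term <- sum((colSums(x) - n / s)^2 / (n / s))

  statistic <- row_term - col_term
  df <- (nrow(x) - 1) * (ncol(x) - 1)   # assumed degrees of freedom
  list(statistic = statistic, df = df,
       p.value = pchisq(statistic, df, lower.tail = FALSE))
}

x <- matrix(c(20, 0, 20, 0, 20, 0, 5, 0, 5), 3)
res_xy <- fun_chisq_sketch(x)      # column variable as a function of row variable
res_yx <- fun_chisq_sketch(t(x))   # reverse direction
print(c(res_xy$statistic, res_yx$statistic))
```

If the assumption holds, the values should agree with the `statistic` component returned by `fun.chisq.test(x)` and `fun.chisq.test(t(x))`; comparing the two is a quick way to validate the sketch.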
When method="nfchisq" (equivalent to "normalized"), the test statistic is obtained by shifting and scaling the original test statistic (Zhang and Song, 2013; Zhang, 2014); and the p-value is computed using the standard normal distribution (Box et al., 2005). The normalized chi-squared, more conservative on the degrees of freedom, was used by the Best Performer NMSUSongLab in HPN-DREAM (DREAM8) Breast Cancer Network Inference Challenges.
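The shift and scale are not spelled out on this page. A natural choice, assumed here, is to subtract the mean of the chi-squared distribution (the degrees of freedom) and divide by its standard deviation (the square root of twice the degrees of freedom). The helper `normalize_fun_chisq` is hypothetical and only illustrates that assumption.

```r
# Hypothetical helper, not part of FunChisq.
# Assumes the normalization z = (statistic - df) / sqrt(2 * df).
normalize_fun_chisq <- function(statistic, df, log.p = FALSE) {
  z <- (statistic - df) / sqrt(2 * df)
  list(statistic = z,
       # Upper tail of the standard normal distribution
       p.value = pnorm(z, lower.tail = FALSE, log.p = log.p))
}

# Illustrative input: a statistic of 72 on 4 degrees of freedom
res_norm <- normalize_fun_chisq(72, 4)
print(res_norm)
```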
When method="exact", "exact.qp" (quadratic programming), "exact.dp" (dynamic programming), or "exact.dqp" (dynamic and quadratic programming), an exact functional test is performed. The option of "exact" uses "exact.dqp", the fastest method. The methods compute an exact p-value, as described in (Zhong and Song, 2019; Nguyen, 2018).
For the "exact.qp" and "exact.dp" options, the exact test is performed only if the sample size is no more than 200 or the average cell count is less than five, and the table size is no more than 10 in either row or column; otherwise the asymptotic functional chi-squared test (method="fchisq") is used instead.
For "exact.dqp", the exact functional test will always be performed.
For 2-by-2 contingency tables, the asymptotic test options (method="fchisq" or "nfchisq") are recommended to test functional dependency, instead of the exact functional test.
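The exact variants can be compared side by side on a small table. The sketch below assumes an installed FunChisq version that accepts all four method strings listed above; on such a small table the p-values are expected to agree up to numerical tolerance, since the variants differ in algorithm rather than in the test itself.

```r
# Sketch: run each exact variant on the same small table (requires FunChisq).
if (requireNamespace("FunChisq", quietly = TRUE)) {
  x <- matrix(c(4, 0, 4, 0, 4, 0, 1, 0, 1), 3)
  exact_methods <- c("exact", "exact.qp", "exact.dp", "exact.dqp")

  # Collect the p-value reported by each variant
  exact_p <- sapply(exact_methods, function(m) {
    FunChisq::fun.chisq.test(x, method = m)$p.value
  })
  print(exact_p)
}
```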
When method="simulate.p.value", a simulated null distribution is used to calculate the p-value. The null distribution is a multinomial distribution that is the product of two marginal distributions. Like other Monte Carlo based methods, this method is slower but may be more accurate than other methods based on asymptotic distributions.
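The simulation idea can be sketched in base R. Everything below is illustrative rather than the package's implementation: `fun_chisq_stat` is a hypothetical helper using an assumed form of the functional chi-squared statistic, the null cell probabilities are taken as the outer product of the observed marginal proportions, and the add-one correction in the p-value is a design choice of this sketch.

```r
# Hypothetical helpers, not FunChisq internals.
fun_chisq_stat <- function(x) {
  n <- sum(x); s <- ncol(x)
  rs <- rowSums(x); keep <- rs > 0       # skip empty rows
  e <- rs[keep] / s
  sum((x[keep, , drop = FALSE] - e)^2 / e) -
    sum((colSums(x) - n / s)^2 / (n / s))
}

simulate_p_value <- function(x, nruns = 2000) {
  n <- sum(x)
  # Null cell probabilities: product of the two marginal distributions
  p_null <- outer(rowSums(x), colSums(x)) / n^2
  observed <- fun_chisq_stat(x)
  # Each column of `draws` is one simulated table, flattened column-major
  draws <- rmultinom(nruns, size = n, prob = as.vector(p_null))
  simulated <- apply(draws, 2, function(v) {
    fun_chisq_stat(matrix(v, nrow = nrow(x)))
  })
  # Add-one correction keeps the estimate away from exactly zero
  (1 + sum(simulated >= observed)) / (nruns + 1)
}

set.seed(1)
x <- matrix(c(20, 0, 20, 0, 20, 0, 5, 0, 5), 3)
p_sim <- simulate_p_value(x, nruns = 500)
print(p_sim)
```

The resolution of the simulated p-value is limited by the number of runs, which is why a larger `simulate.nruns` trades running time for precision.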
A list with class "htest" containing the following components:
statistic |
the functional chi-squared statistic if |
parameter |
degrees of freedom for the functional chi-squared statistic. |
p.value |
p-value of the functional test. If |
estimate |
an estimate of function index between 0 and 1. The value of 1 indicates a strictly mathematical function. It is asymmetrical with respect to transpose of the input contingency table, different from the symmetrical Cramer's V based on the Pearson's chi-squared test statistic. |
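The contrast with Cramer's V can be seen numerically. The sketch below computes Cramer's V in base R with the standard formula sqrt(chi-squared / (n * min(r - 1, c - 1))), where `cramers_v` is a helper defined here for illustration, and then, if FunChisq is installed, prints the function index for the table and its transpose.

```r
# Cramer's V from Pearson's chi-squared statistic (base R only).
cramers_v <- function(x) {
  # Warnings about small expected counts are irrelevant to the statistic itself
  chi2 <- suppressWarnings(chisq.test(x, correct = FALSE)$statistic)
  unname(sqrt(chi2 / (sum(x) * (min(dim(x)) - 1))))
}

x <- matrix(c(20, 0, 20, 0, 20, 0, 5, 0, 5), 3)
v_xy <- cramers_v(x)
v_yx <- cramers_v(t(x))
print(all.equal(v_xy, v_yx))   # symmetric under transpose

# The function index is direction-specific (requires FunChisq).
if (requireNamespace("FunChisq", quietly = TRUE)) {
  print(FunChisq::fun.chisq.test(x)$estimate)
  print(FunChisq::fun.chisq.test(t(x))$estimate)
}
```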
Yang Zhang, Hua Zhong and Joe Song
Box, G. E., Hunter, J. S. and Hunter, W. G. (2005) Statistics for Experimenters: Design, Innovation and Discovery, 2nd ed., New York: Wiley-Interscience.
Nguyen, H. H. (2018) Inference of Functional Dependency via Asymmetric, Optimal, and Model-free Statistics. Unpublished doctoral dissertation, Department of Computer Science, New Mexico State University, Las Cruces, USA.
Zhang, Y. and Song, M. (2013) Deciphering interactions in causal networks without parametric assumptions. arXiv Molecular Networks, arXiv:1311.2707, https://arxiv.org/abs/1311.2707
Zhang, Y. (2014) Nonparametric Statistical Methods for Biological Network Inference. Unpublished doctoral dissertation, Department of Computer Science, New Mexico State University, Las Cruces, USA.
Zhong, H. and Song, M. (2019) A fast exact functional test for directional association and cancer biology applications. IEEE/ACM Transactions on Computational Biology and Bioinformatics 16(3), 818–826. Retrieved from https://doi.org/10.1109/TCBB.2018.2809743
For data discretization by optimal univariate k-means clustering, see Ckmeans.1d.dp.
For symmetrical dependency tests on discrete data, see Pearson's chi-squared test chisq.test, Fisher's exact test fisher.test, and mutual information entropy.
## Not run:
# Example 1. Asymptotic functional chi-squared test
x <- matrix(c(20,0,20,0,20,0,5,0,5), 3)
fun.chisq.test(x) # strong functional dependency
fun.chisq.test(t(x)) # weak functional dependency

# Example 2. Normalized functional chi-squared test
x <- matrix(c(8,0,8,0,8,0,2,0,2), 3)
fun.chisq.test(x, method="nfchisq") # strong functional dependency
fun.chisq.test(t(x), method="nfchisq") # weak functional dependency

# Example 3. Exact functional chi-squared test
x <- matrix(c(4,0,4,0,4,0,1,0,1), 3)
fun.chisq.test(x, method="exact") # strong functional dependency
fun.chisq.test(t(x), method="exact") # weak functional dependency

# Example 4. Exact functional chi-squared test on a real data set
# (Shen et al., 2002)
# x is a contingency table with row variable for p53 mutation and
# column variable for CIMP
x <- matrix(c(12,26,18,0,8,12), nrow=2, ncol=3, byrow=TRUE)
# Test the functional dependency: p53 mutation -> CIMP
fun.chisq.test(x, method="exact")
# Test the functional dependency: CIMP -> p53 mutation
fun.chisq.test(t(x), method="exact")

# Example 5. Asymptotic functional chi-squared test with simulated distribution
x <- matrix(c(20,0,20,0,20,0,5,0,5), 3)
fun.chisq.test(x, method="simulate.p.value")
# Use the full argument name rather than relying on partial matching
fun.chisq.test(x, method="simulate.p.value", simulate.nruns = 1000)

## End(Not run)