cond.fun.chisq.test {FunChisq}    R Documentation

Conditional Functional Chi-Squared Test for Model-Free Conditional Functional Dependency

Description

Asymptotic chi-squared test of the model-free functional dependency of an effect variable Y on a cause variable X, conditioned on a third variable Z.

Usage

cond.fun.chisq.test(x, y, z=NULL, data=NULL, log.p = FALSE,
                    method = c("fchisq", "nfchisq"))

Arguments

x

vector or character; either a discrete random variable (cause) represented as a vector, or a character column name in data.

y

vector or character; either a discrete random variable (effect) represented as a vector, or a character column name in data.

z

vector or character; either a discrete random variable (condition) represented as a vector, or a character column name in data. If NULL, the result of fun.chisq.test on the contingency table with x as the row variable and y as the column variable is returned. See ?fun.chisq.test for details. The default is NULL.

data

data.frame; a data frame containing the three variables x, y and z. If NULL, x, y and z must be vectors. The default is NULL.

log.p

logical; if TRUE, the p-value is returned as log(p). Taking the log improves accuracy when the p-value is close to zero. The default is FALSE.

method

a character string specifying the method used to compute the conditional functional chi-squared test statistic and its p-value. The options are "fchisq" (default) and "nfchisq". See Details.

Details

The conditional functional chi-squared test introduces the concept of conditional functional dependency, in which the functional association between two variables (x and y) is tested conditioned on a third variable (z). Two methods are provided to compute the chi-squared statistic and its p-value. When method = "fchisq", the p-value is computed using the chi-squared distribution; when method = "nfchisq", a normalized statistic is obtained by shifting and scaling the original chi-squared statistic, and its p-value is computed using the standard normal distribution (Box et al., 2005). The normalized test is more conservative on the degrees of freedom.
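
As a rough illustration of how the two kinds of p-value are obtained, the sketch below uses base R only. It assumes the normalization shifts the statistic by its degrees of freedom and scales it by sqrt(2 * df); that formula and all numeric values are assumptions made for illustration and are not taken from the package source. The last lines show why a log-scale p-value, as requested by log.p = TRUE, can be useful for extreme statistics.

```r
# Illustrative values only (hypothetical, not produced by FunChisq)
chisq_stat <- 25   # an unnormalized chi-squared statistic
df <- 8            # its degrees of freedom

# "fchisq"-style p-value: upper tail of the chi-squared distribution
p_chisq <- pchisq(chisq_stat, df = df, lower.tail = FALSE)

# "nfchisq"-style p-value, ASSUMING the normalization
# (statistic - df) / sqrt(2 * df), then the standard normal upper tail
z_stat <- (chisq_stat - df) / sqrt(2 * df)
p_norm <- pnorm(z_stat, lower.tail = FALSE)

# Analogue of log.p = TRUE: for an extreme statistic the plain p-value
# underflows in double precision, while log(p) remains finite
big_stat <- 5000
p_plain <- pchisq(big_stat, df = df, lower.tail = FALSE)
p_log <- pchisq(big_stat, df = df, lower.tail = FALSE, log.p = TRUE)
```

Comparing p_plain with p_log shows the point of the log.p argument: the former carries no information once it underflows, whereas the latter can still be used to rank very strong dependencies.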

Value

A list with class "htest" containing the following components:

statistic

the conditional functional chi-squared statistic if method = "fchisq"; or the normalized conditional functional chi-squared statistic if method = "nfchisq".

parameter

degrees of freedom for the conditional functional chi-squared statistic.

p.value

p-value of the conditional functional test. If method = "fchisq", the p-value is computed from an asymptotic chi-squared distribution; if method = "nfchisq", the p-value is computed from the standard normal distribution.

estimate

an estimate of the conditional function index, between 0 and 1. A value of 1 indicates strong functional dependency between x and y, given z. The index is asymmetric: its value depends on whether x is taken as the cause of effect y or vice versa.

Author(s)

Sajal Kumar and Mingzhou Song

References

Box, G. E., Hunter, J. S. and Hunter, W. G. (2005) Statistics for Experimenters: Design, Innovation and Discovery, 2nd ed., New York: Wiley-Interscience.

Zhang, Y. and Song, M. (2013) Deciphering interactions in causal networks without parametric assumptions. arXiv Molecular Networks, arXiv:1311.2707, https://arxiv.org/abs/1311.2707

Zhang, Y. (2014) Nonparametric Statistical Methods for Biological Network Inference. Unpublished doctoral dissertation, Department of Computer Science, New Mexico State University, Las Cruces, USA.

Zhong, H. and Song, M. (2018) A fast exact functional test for directional association and cancer biology applications. IEEE/ACM Transactions on Computational Biology and Bioinformatics. In press. https://doi.org/10.1109/TCBB.2018.2809743

Examples

# Generate a relationship between variables X and Z
xz = matrix(c(30,2,2, 2,2,40, 2,30,2),ncol=3,nrow=3,
            byrow = TRUE)
# Re-construct X
x = rep(c(1:nrow(xz)),rowSums(xz))
# Re-construct Z
z = c()
for(i in 1:nrow(xz))
  z = c(z,rep(c(1:ncol(xz)),xz[i,]))

# Generate a relationship between variables Z and Y
# Make sure Z retains its distribution
zy = matrix(c(4,30, 30,4, 4,40),ncol=2,nrow=3,
            byrow = TRUE)
# Re-construct Y
y = rep(0,length(z))
for(i in unique(z))
  y[z==i] = rep(c(1:ncol(zy)),zy[i,])

# Tables
table(x,z)
table(z,y)
table(x,y)

# Conditional functional dependency
# Y = f(X) | Z should be false
cond.fun.chisq.test(x=x,y=y,z=z)
# Z = f(X) | Y should be true
cond.fun.chisq.test(x=x,y=z,z=y)
# Y = f(Z) | X should be true
cond.fun.chisq.test(x=z,y=y,z=x)
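
# A sketch of the same kind of test through the data-frame interface.
# Assumes x, y and z from the examples above are still in the workspace;
# the data frame 'd' and its column names are made up for illustration.
d = data.frame(cause = x, effect = y, cond = z)
# Z = f(X) | Y via column names, using the normalized statistic
# and a p-value reported on the log scale
res = cond.fun.chisq.test(x = "cause", y = "cond", z = "effect", data = d,
                          method = "nfchisq", log.p = TRUE)
res$statistic  # normalized conditional functional chi-squared statistic
res$p.value    # log(p), because log.p = TRUE
res$estimate   # conditional function index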

[Package FunChisq version 2.4.8-1]