knn_mi {rmi}    R Documentation
Description

Computes mutual information based on the distribution of nearest neighborhood distances. The methods available are KSG1 and KSG2, as described by Kraskov et al. (2004), and the Local Non-Uniformity Corrected (LNC) KSG, as described by Gao et al. (2015). The LNC method is based on KSG2 but applies PCA volume corrections to adjust for observed non-uniformity of the local neighborhood of each point in the sample.
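To make the nearest-neighbour construction concrete, here is a minimal base-R sketch of the KSG1 estimator for two scalar variables, following Kraskov et al. (2004). It is an illustrative re-implementation only, not the package's code, and the helper name ksg1_sketch is hypothetical.

# Minimal sketch of KSG1 for two scalar variables (illustration only).
ksg1_sketch <- function(x, y, k = 5) {
  n  <- length(x)
  nx <- ny <- numeric(n)
  for (i in 1:n) {
    dx  <- abs(x - x[i])
    dy  <- abs(y - y[i])
    dz  <- pmax(dx, dy)          # max-norm distance in the joint (x, y) space
    eps <- sort(dz)[k + 1]       # distance to the k-th neighbour (index 1 is the point itself)
    nx[i] <- sum(dx < eps) - 1   # marginal neighbours strictly closer than eps, excluding self
    ny[i] <- sum(dy < eps) - 1
  }
  digamma(k) + digamma(n) - mean(digamma(nx + 1) + digamma(ny + 1))
}

On the bivariate normal data used in the Examples below, this sketch should land near the analytic value -0.5*log(1 - cor(x, y)^2), which is the same quantity knn_mi targets.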
Usage

knn_mi(data, splits, options)
Arguments

data      Matrix of sample observations; each row is an observation.

splits    A vector that describes which sets of columns in data are grouped into the variables whose mutual information is computed. For example, splits = c(1,1) treats two columns as two scalar variables, while splits = c(2,1) treats the first two columns as one bivariate variable and the third column as a scalar (see the short illustration after this list).

options   A list that specifies the estimator and its necessary parameters (see Details).
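As a hedged illustration of how splits groups columns (the simulated data and variable names below are made up for this sketch, not taken from the package examples):

set.seed(42)
x1 <- rnorm(500); x2 <- rnorm(500); y <- x1 + rnorm(500)
dat <- cbind(x1, x2, y)
# splits = c(1, 1, 1): three scalar variables, i.e. the redundancy I(x1; x2; y)
knn_mi(dat, c(1, 1, 1), options = list(method = "KSG1", k = 5))
# splits = c(2, 1): columns 1-2 form one bivariate variable, column 3 a scalar,
# i.e. I((x1, x2); y)
knn_mi(dat, c(2, 1), options = list(method = "KSG1", k = 5))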
Details

The currently available methods are LNC, KSG1, and KSG2.
For KSG1 use: options = list(method = "KSG1", k = 5)
For KSG2 use: options = list(method = "KSG2", k = 5)
For LNC use: options = list(method = "LNC", k = 10, alpha = 0.65); the neighborhood order must satisfy k > ncol(data).
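For concreteness, the three option lists above can be dropped into the same call. A small hedged sketch on simulated data follows; with two columns, the LNC requirement k > ncol(data) = 2 is satisfied by k = 10.

set.seed(1)
x <- rnorm(500)
y <- x + rnorm(500)
d <- cbind(x, y)
knn_mi(d, c(1, 1), options = list(method = "KSG1", k = 5))
knn_mi(d, c(1, 1), options = list(method = "KSG2", k = 5))
knn_mi(d, c(1, 1), options = list(method = "LNC",  k = 10, alpha = 0.65))  # k = 10 > ncol(d) = 2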
Author(s)

Isaac Michaud, North Carolina State University, ijmichau@ncsu.edu
References

Gao, S., Ver Steeg, G., & Galstyan, A. (2015). Efficient estimation of mutual information for strongly dependent variables. Artificial Intelligence and Statistics: 277-286.

Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6): 066138.
Examples

set.seed(123)
x <- rnorm(1000)
y <- x + rnorm(1000)
knn_mi(cbind(x,y), c(1,1), options = list(method = "KSG2", k = 6))

set.seed(123)
x <- rnorm(1000)
y <- 100*x + rnorm(1000)
knn_mi(cbind(x,y), c(1,1), options = list(method = "LNC", alpha = 0.65, k = 10))
# approximate analytic value of mutual information
-0.5*log(1 - cor(x,y)^2)

z <- rnorm(1000)
# redundancy I(x;y;z) is approximately the same as I(x;y)
knn_mi(cbind(x,y,z), c(1,1,1), options = list(method = "LNC", alpha = c(0.5,0,0,0), k = 10))
# mutual information I((x,y);z) is approximately 0
knn_mi(cbind(x,y,z), c(2,1), options = list(method = "LNC", alpha = c(0.5,0.65,0), k = 10))