isr {ISR3} | R Documentation |
isr
performs imputation of missing values based on an optionally
specified model. Missing values are assumed to be missing at random (MAR).
isr(X, M, Xinit, mi = 1, burnIn = 100, thinning = 20, intercept = T)
X |
A matrix of points to be imputed or used for covariates by isr.
|
M |
An optional logical matrix specifying the factorized pdf of the joint multivariate normal distribution of the variables requiring imputation.
A description of the factorized pdf is provided in the details.
The column names of |
Xinit |
An optional matrix with the same dimensions of |
mi |
A scalar indicating the number of imputations to return |
burnIn |
A scalar indicating the number of iterations to burn in before returning imputations. Note that burnIn is the total number of iterations; no thinning is performed until multiple imputation generation starts. |
thinning |
A scalar that represents the amount of thinning for the MCMC routine. A value of one implies no thinning. |
intercept |
A logical value indicating whether the imputation model should have an intercept. |
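To make the interaction of burnIn, thinning, and mi concrete, the following sketch counts iterations under one plausible reading of the argument descriptions: burnIn unthinned iterations run first, and each returned imputation is then separated from the previous one by thinning iterations. The helper keptIterations is hypothetical and not part of ISR3, and the exact indexing used inside isr is an assumption here.

```r
# Hypothetical helper (not part of ISR3): iteration indices at which
# imputations would be saved, ASSUMING burnIn unthinned iterations are
# followed by one saved draw every `thinning` iterations.
keptIterations <- function(mi = 1, burnIn = 100, thinning = 20) {
  burnIn + thinning * seq_len(mi)
}

kept <- keptIterations(mi = 5, burnIn = 100, thinning = 20)
kept        # iterations assumed to be returned as imputations
max(kept)   # total number of MCMC iterations under this assumption
```

Under this reading, the run length grows linearly in both mi and thinning, so large values of either can dominate the cost of the burn-in.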
The ISR algorithm performs Bayesian multivariate normal imputation. This imputation follows two steps, an imputation step and a prediction step. In the imputation step, the missing values are imputed from a Normal-Inverse-Wishart model with non-informative priors. In the prediction step, the parameters are estimated using both the observed and imputed values.
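The alternation between the two steps can be illustrated with a deliberately simplified, single-variable sketch in base R. It is not the ISR3 implementation: it assumes one incomplete variable y regressed on one complete covariate z, uses the standard non-informative prior proportional to 1/sigma^2, and all names (y, z, nIter, draws) are illustrative.

```r
set.seed(1)
n <- 50
z <- rnorm(n)                        # fully observed covariate
y <- 1 + 2 * z + rnorm(n)            # variable to be imputed
miss <- sample(n, 10)                # indices set to missing
y[miss] <- NA

Z <- cbind(1, z)                     # design matrix with intercept
yCur <- y
yCur[miss] <- mean(y, na.rm = TRUE)  # crude starting values

nIter <- 200
draws <- matrix(NA_real_, nIter, length(miss))
ZtZinv <- solve(crossprod(Z))        # (Z'Z)^{-1}; fixed because Z is complete

for (it in seq_len(nIter)) {
  # Prediction step: draw (beta, sigma^2) from their posterior given the
  # observed and currently imputed values.
  betaHat <- ZtZinv %*% crossprod(Z, yCur)
  rss <- sum((yCur - Z %*% betaHat)^2)
  sigma2 <- rss / rchisq(1, df = n - ncol(Z))
  beta <- betaHat + t(chol(sigma2 * ZtZinv)) %*% rnorm(ncol(Z))

  # Imputation step: redraw the missing values given the drawn parameters.
  yCur[miss] <- Z[miss, , drop = FALSE] %*% beta +
    rnorm(length(miss), sd = sqrt(sigma2))
  draws[it, ] <- yCur[miss]
}
```

Each row of draws holds one set of imputed values; discarding early rows and keeping every k-th later row corresponds to burn-in and thinning.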
Imputation of parameters is done through the conditional factoring of the joint pdf.
A conditional factoring is an expansion of the joint pdf of all
the dependent variables in X,
e.g. Pr(X|Z) = Pr(X1,X2,X3|Z) = Pr(X1|Z) Pr(X2|X1,Z) Pr(X3|X1,X2,Z),
where the right hand side is the fully conditional specification for the dependent variables X1-X3 and independent variable Z.
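Judging from the layout used in the Examples, each row of M appears to correspond to one dependent variable and each column to a variable in X, with TRUE marking the variables that row is conditioned on. Under that reading (an assumption, since the full description of M is not reproduced here), the factoring Pr(X1|Z) Pr(X2|X1,Z) Pr(X3|X1,X2,Z) could be encoded as follows; the variable names are illustrative.

```r
vars <- c("Z", "X1", "X2", "X3")     # one covariate, three dependent variables
dep  <- vars[-1]

# Row i is TRUE for every variable that dependent variable i conditions on.
M <- matrix(c(
  # Z     X1     X2     X3
  TRUE,  FALSE, FALSE, FALSE,   # Pr(X1 | Z)
  TRUE,  TRUE,  FALSE, FALSE,   # Pr(X2 | X1, Z)
  TRUE,  TRUE,  TRUE,  FALSE    # Pr(X3 | X1, X2, Z)
), nrow = 3, byrow = TRUE, dimnames = list(dep, vars))

# Print each conditional implied by M.
for (v in dep) {
  cat(sprintf("Pr(%s | %s)\n", v, paste(vars[M[v, ]], collapse = ", ")))
}
```

The strictly lower-triangular pattern among the dependent variables is what makes this a valid factoring: no variable is conditioned on itself or on a variable that appears later in the chain.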
This function returns a list with two elements:
param
a three-dimensional array of parameter estimates of the factored pdf. The last dimension is an index for the multiple imputations.
imputed
a three-dimensional array of X with imputed values. The last dimension is an index for the multiple imputations.
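Because both elements carry the imputation index in their last dimension, individual completed data sets can be extracted by slicing along it. The sketch below uses a stand-in list with the documented shape rather than a real isr() result, so the object res and its contents are purely illustrative.

```r
set.seed(2)
n <- 4; p <- 3; mi <- 5

# Stand-in for the `imputed` element: an n x p x mi array.
res <- list(imputed = array(rnorm(n * p * mi), dim = c(n, p, mi)))

# First completed data set (n x p matrix).
firstImputation <- res$imputed[, , 1]

# Column means within each imputation (p x mi matrix).
perImputationMeans <- apply(res$imputed, 3, colMeans)

# Column means pooled over all imputations (length p).
pooledMeans <- apply(res$imputed, 2, mean)
```

Keeping per-imputation summaries separate, as in perImputationMeans, preserves the between-imputation variability that a single pooled number hides.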
Robbins, M. W., & White, T. K. (2011). Farm commodity payments and imputation in the Agricultural Resource Management Survey. American Journal of Agricultural Economics, DOI: 10.1093/ajae/aaq166.
# simulation parameters
set.seed(100)
n <- 30
p <- 5
missing <- 10

# generate a covar matrix
covarMatrix <- rWishart(1, p + 1, diag(p))[, , 1]

# simulation of variables under the variable relationships
U <- chol(covarMatrix)
X <- matrix(rnorm(n * p), nrow = n) %*% U

# make some data missing
X[sample(1:(n * p), size = missing)] <- NA

# specify relationships
fitMatrix <- matrix(c(
  # Covar2 CoVar1 Var1   Var2   Var3
  # 1. Var1
  TRUE,    TRUE,  FALSE, FALSE, FALSE,
  # 2. Var2
  TRUE,    TRUE,  FALSE, FALSE, FALSE,
  # 3. Var3
  TRUE,    TRUE,  TRUE,  TRUE,  FALSE
), nrow = 3, byrow = TRUE)

covarList <- c('Covar2', 'CoVar1', 'Var1', 'Var2', 'Var3')

# setup names
colnames(fitMatrix) <- covarList
rownames(fitMatrix) <- covarList[-1:-2]
colnames(X) <- covarList

XImputed <- isr(X, fitMatrix)