FHDI_CellProb {FHDI}    R Documentation

Joint cell probabilities for multivariate incomplete categorical data

Description

Calculate the joint cell probabilities for multivariate incomplete categorical data using the expectation-maximization (EM) algorithm.

Usage

FHDI_CellProb(datz, w=NULL, id=NULL)

Arguments

datz

Multivariate incomplete categorical data, for example the "cell" component returned by FHDI_CellMake.

w

Sampling weight, given either as a scalar or as a vector w(nrow_y). Default = 1.0 if NULL.

id

Index for each unit. Default = 1:nrow_y if NULL.

Details

The joint cell probabilities are estimated with the EM by weighting method (Ibrahim, 1990). The algorithm computes the maximum likelihood estimates of the joint cell probabilities under the missing-at-random assumption.
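
The idea of the paragraph above can be sketched in base R. This is only an illustration under stated assumptions, not the code used inside FHDI: the helper name em_cell_prob, its arguments, and the choice to restrict the support to cells seen in fully observed rows are all hypothetical. In the E-step each incomplete row is split fractionally over the cells that agree with its observed items; in the M-step the weighted fractions are averaged.

```r
# Hypothetical helper for illustration; this is not the FHDI implementation.
em_cell_prob <- function(z, w = rep(1, nrow(z)), max_iter = 500, tol = 1e-10) {
  z <- as.matrix(z)
  # Support: joint cells seen in fully observed rows (an assumption of this sketch).
  cells <- unique(z[stats::complete.cases(z), , drop = FALSE])
  n_cells <- nrow(cells)

  # compat[i, g] is TRUE when row i agrees with cell g on every observed item.
  compat <- matrix(TRUE, nrow(z), n_cells)
  for (g in seq_len(n_cells)) {
    agree <- sweep(z, 2, cells[g, ], "==")   # NA where z is missing
    compat[, g] <- apply(is.na(agree) | agree, 1, all)
  }

  # Rows that match no supported cell carry no information here; drop them.
  keep <- rowSums(compat) > 0
  compat <- compat[keep, , drop = FALSE]
  w <- w[keep]

  p <- rep(1 / n_cells, n_cells)
  for (iter in seq_len(max_iter)) {
    # E-step: split each row's unit mass over its compatible cells.
    num <- sweep(compat, 2, p, "*")
    frac <- num / rowSums(num)
    # M-step: weighted average of the fractional allocations.
    p_new <- colSums(w * frac) / sum(w)
    converged <- max(abs(p_new - p)) < tol
    p <- p_new
    if (converged) break
  }
  names(p) <- apply(cells, 1, paste, collapse = "")
  p
}

# Toy categorical data: two variables, two rows with one item missing.
z_toy <- rbind(c(1, 1), c(1, 2), c(2, 1), c(2, 2), c(1, NA), c(NA, 2))
p_hat <- em_cell_prob(z_toy)
p_hat
```

With fully observed data the sketch reduces to the weighted cell proportions, which is a quick way to sanity-check it.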

Value

cellpr

Table of the joint cell probabilities. Each cell name encodes one category per variable, following the user-defined categories in "k": e.g., the name "325" denotes the 3rd, 2nd, and 5th categories of three variables, respectively, whereas "a1c" denotes the 10th, 1st, and 12th categories.

w

Copy of the sampling weights "w" initially defined by the user.
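
The cell names in "cellpr" can be decoded by hand. The following is a minimal sketch, assuming one character per variable with "1" to "9" followed by lowercase letters ("a" = 10, "b" = 11, ...), as the "a1c" example above suggests; decode_cell is a hypothetical helper, not a function of the FHDI package.

```r
# Hypothetical helper, not exported by FHDI.
decode_cell <- function(name) {
  # Assumed symbol order: "1".."9" map to 1..9, then "a" = 10, "b" = 11, ...
  symbols <- c(as.character(1:9), letters)
  match(strsplit(name, "")[[1]], symbols)
}

decode_cell("325")   # categories 3, 2, 5
decode_cell("a1c")   # categories 10, 1, 12
```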

Author(s)

Dr. Im, Jongho jonghoim@iastate.edu
Dr. Cho, Inho icho@iastate.edu
Dr. Kim, Jaekwang jkim@iastate.edu

References

Im, J., Cho, I.H. and Kim, J.K. (2018). FHDI: An R Package for Fractional Hot-Deck Imputation. The R Journal. 10(1), pp. 140-154.

Im, J., Kim, J.K. and Fuller, W.A. (2015). Two-phase sampling approach to fractional hot deck imputation. Proceedings of the Survey Research Methods Section, American Statistical Association, Seattle, WA.

Ibrahim, J.G. (1990). Incomplete data in generalized linear models. Journal of the American Statistical Association 85, 765-769.

Examples

### Toy Example ### 
# y : four continuous variables (y1, y2, y3, y4)
# r : indicator corresponding to missingness in y

set.seed(1345) 
n=100 
rho=0.5 
e1=rnorm(n,0,1) 
e2=rnorm(n,0,1) 
e3=rgamma(n,1,1) 
e4=rnorm(n,0,sd=sqrt(3/2))

y1=1+e1 
y2=2+rho*e1+sqrt(1-rho^2)*e2 
y3=y1+e3 
y4=-1+0.5*y3+e4

r1=rbinom(n,1,prob=0.6) 
r2=rbinom(n,1,prob=0.7) 
r3=rbinom(n,1,prob=0.8) 
r4=rbinom(n,1,prob=0.9)

y1[r1==0]=NA 
y2[r2==0]=NA 
y3[r3==0]=NA 
y4[r4==0]=NA

daty=cbind(y1,y2,y3,y4)

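# build the imputation cells from daty
result_CM=FHDI_CellMake(daty, k=5, s_op_merge="fixed")
# categorized data used as input to FHDI_CellProb
datz=result_CM$cell
# estimate the joint cell probabilities and list the returned components
result_CP=FHDI_CellProb(datz)
names(result_CP)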

[Package FHDI version 1.3.2 Index]