robustgam.CV {robustgam} | R Documentation |
This function combines robustgam with automatic smoothing parameter selection. The smoothing parameter is selected by minimizing the robust cross validation criterion described in Wong, Yao and Lee (2013), which is designed to be robust to outliers. A grid search is used to find the smoothing parameter that minimizes the criterion.
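The grid search described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: `toy_criterion` is a hypothetical stand-in for the robust cross validation criterion, and the logarithmic spacing of the grid is an assumption, since the spacing used by robustgam.CV is not stated here. The grid limits and size mirror the defaults of sp.min, sp.max and len.

```r
# Hypothetical stand-in for the robust cross validation criterion:
# a smooth curve in log10(sp) with its minimum at sp = 1e-5.
toy_criterion <- function(sp) (log10(sp) + 5)^2

# Grid limits and size, mirroring the defaults of sp.min, sp.max and len.
sp_min <- 1e-7
sp_max <- 1e-3
len <- 50

# Log-spaced grid between sp_min and sp_max (the spacing is an assumption).
sp_grid <- exp(seq(log(sp_min), log(sp_max), length.out = len))

# Evaluate the criterion at every grid point and keep the minimiser.
criteria <- vapply(sp_grid, toy_criterion, numeric(1))
optim_index <- which.min(criteria)
optim_sp <- sp_grid[optim_index]
optim_criterion <- criteria[optim_index]
```

In the real function each criterion evaluation involves refitting the model on the cross validation partitions, so the cost of the search grows with both len and ngroup.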
robustgam.CV(X, y, family, p=3, K=30, c=1.345, show.msg=FALSE, count.lim=200, w.count.lim=50, smooth.basis="tp", wx=FALSE, sp.min=1e-7, sp.max=1e-3, len=50, show.msg.2=TRUE, ngroup=length(y), seed=12345)
X |
a vector or a matrix (each covariate forms a column) of covariates |
y |
a vector of responses |
family |
A family object specifying the distribution and the link function. See |
p |
order of the basis; it depends on the choice of smooth.basis. |
K |
number of knots of the basis; it depends on the choice of smooth.basis. |
c |
tuning parameter for the Huber function; a smaller value of c corresponds to a more robust fit. The recommended values are 1.2 for the binomial distribution and 1.6 for the Poisson distribution. |
show.msg |
If |
count.lim |
maximum number of iterations of the whole algorithm |
w.count.lim |
maximum number of updates of the weights. It corresponds to zeta in Wong, Yao and Lee (2013) |
smooth.basis |
the specification of basis. Four choices are available: |
wx |
If |
sp.min |
A vector of minimum values of the searching range for smoothing parameters. If only one value is specified, it will be used for all smoothing parameters. |
sp.max |
A vector of maximum values of the searching range for smoothing parameters. If only one value is specified, it will be used for all smoothing parameters. |
len |
A vector of grid sizes. If only one value is specified, it will be used for all smoothing parameters. |
show.msg.2 |
If |
ngroup |
number of groups used in the cross validation. If |
seed |
The seed for random generator used in generating partitions. |
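Two of the arguments above may be easier to picture in code: the Huber tuning constant c, and the pair ngroup and seed. The sketch below uses base R only. `huber_psi` is the standard Huber psi function, and `make_folds` is a hypothetical helper showing one common way to build reproducible cross validation groups. Neither is taken from the robustgam source, and how robustgam.CV actually forms its partitions is not stated here.

```r
# Standard Huber psi function: identity on [-c_huber, c_huber], clipped outside.
# A smaller tuning constant clips more residuals, which limits their influence.
huber_psi <- function(r, c_huber = 1.345) pmax(-c_huber, pmin(c_huber, r))

r <- c(-4, -1, 0, 1, 4)
psi_16 <- huber_psi(r, c_huber = 1.6)  # extreme residuals clipped to +/- 1.6
psi_12 <- huber_psi(r, c_huber = 1.2)  # extreme residuals clipped to +/- 1.2

# Hypothetical helper: assign n observations to ngroup groups of
# near-equal size, shuffled reproducibly with a seed.
# Note that set.seed() resets the global random number generator state.
make_folds <- function(n, ngroup = n, seed = 12345) {
  set.seed(seed)
  sample(rep_len(seq_len(ngroup), n))
}

folds <- make_folds(n = 100, ngroup = 5)  # five groups of 20 observations
loo <- make_folds(n = 100)                # ngroup = n: leave-one-out
```

With ngroup equal to the number of observations, which mirrors the default ngroup=length(y), every observation forms its own group and the scheme reduces to leave-one-out cross validation. A smaller ngroup, such as 5, reduces the number of refits.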
fitted.values |
fitted values (of the optimum fit) |
initial.fitted |
the starting values of the algorithm (of the optimum fit) |
beta |
estimated coefficients (corresponding to the basis) (of the optimum fit) |
optim.index |
the index of the optimum fit |
optim.index2 |
the index of the optimum fit in another representation:
|
optim.criterion |
the optimum value of robust cross validation criterion |
optim.sp |
the optimum value of the smoothing parameter |
criteria |
the values of criteria for all fits during grid search |
sp |
the grid of smoothing parameter |
optim.fit |
the robustgam fit object of the optimum fit. It is handy for applying the prediction method. |
Raymond K. W. Wong <raymondkww.dev@gmail.com>
Raymond K. W. Wong, Fang Yao and Thomas C. M. Lee (2013) Robust Estimation for Generalized Additive Models. Journal of Computational and Graphical Statistics, to appear.
robustgam.GIC, robustgam.GIC.optim, robustgam.CV, pred.robustgam
# load library
library(robustgam)

# test function
test.fun <- function(x, ...) {
  return(2*sin(2*pi*(1-x)^2))
}

# some setting
set.seed(1234)
true.family <- poisson()
out.prop <- 0.05
n <- 100

# generating dataset for poisson case
x <- runif(n)
x <- x[order(x)]
true.eta <- test.fun(x)
true.mu <- true.family$linkinv(test.fun(x))
y <- rpois(n, true.mu) # for poisson case

# create outlier for poisson case
out.n <- trunc(n*out.prop)
out.list <- sample(1:n, out.n, replace=FALSE)
y[out.list] <- round(y[out.list]*runif(out.n,min=3,max=5)^(sample(c(-1,1),out.n,TRUE)))

## Not run:
# robust GAM fit
robustfit.gic <- robustgam.CV(x, y, family=true.family, p=3, c=1.6, show.msg=FALSE,
  count.lim=400, smooth.basis='tp', ngroup=5)
robustfit <- robustfit.gic$optim.fit

# ordinary GAM fit
nonrobustfit <- gam(y~s(x, bs="tp", m=3), family=true.family) # m = p for 'tp'

# prediction
x.new <- seq(range(x)[1], range(x)[2], len=1000)
robustfit.new <- pred.robustgam(robustfit, data.frame(X=x.new))$predict.values
nonrobustfit.new <- as.vector(predict.gam(nonrobustfit, data.frame(x=x.new), type="response"))

# plot
plot(x, y)
lines(x.new, true.family$linkinv(test.fun(x.new)), col="blue")
lines(x.new, robustfit.new, col="red")
lines(x.new, nonrobustfit.new, col="green")
legend(0.6, 23, c("true mu", "robust fit", "nonrobust fit"),
  col=c("blue","red","green"), lty=c(1,1,1))
## End(Not run)