regnet {regnet}R Documentation

fit a regression for given lambda with network-based regularization

Description

Network-based penalization regression for given values of λ_{1} and λ_{2}. Typical usage is to have the cv.regnet function compute the optimal lambdas, then provide them to the regnet function. Users could also use MCP or Lasso.

Usage

regnet(X, Y, response = c("binary", "continuous", "survival"),
  penalty = c("network", "mcp", "lasso"), lamb.1 = NULL,
  lamb.2 = NULL, r = NULL, clv = NULL, initiation = NULL,
  alpha.i = 1, robust = FALSE)

Arguments

X

matrix of predictors without intercept. Each row should be an observation vector. A column of 1 will be added to the X matrix by the program as the intercept.

Y

response variable. For response="binary", Y should be a numeric vector with zeros and ones. For response="survival", Y should be a two-column matrix with columns named 'time' and 'status'. The latter is a binary variable, with '1' indicating a event, and '0' indicating censoring.

response

response type. regnet now supports three types of response: "binary", "continuous" and "survival".

penalty

penalty type. regnet provides three choices for the penalty function: "network", "mcp" and "lasso".

lamb.1

the tuning parameter λ_{1} that imposes sparsity.

lamb.2

the tuning parameter λ_{2} that controls the smoothness among coefficient profiles. λ_{2} is needed for network penalty.

r

the regularization parameter in MCP. For binary response, r should be larger than 4.

clv

a value or a vector, indexing variables that are not subject to penalty. clv only works for continuous and survival responses for now, and will be ignored for other types of responses.

initiation

method for initiating the coefficient vector. For binary and continuous response, the default is elastic-net, and for survival response the default is zero.

alpha.i

the elastic-net mixing parameter. The program can use the elastic-net for choosing initial values of the coefficient vector. alpha.i is the elastic-net mixing parameter, with 0 alpha.i 1. alpha.i=1 is the lasso penalty, and alpha.i=0 is the ridge penalty. If the user chooses a method other than elastic-net for initializing coefficients, alpha.i will be ignored.

robust

logical flag. Whether or not to use robust methods. Robust methods are only available for survival response.

Details

The current version of regnet supports two types of responses: “binary”, "continuous" and “survival”. regnet(…, response="binary", penalty="network") fits a network-based penalized logistic regression; regnet(…, response="continuous", penalty="network") fits a network-based least square regression; regnet(…, response="survival", penalty="network") fits a robust regularized AFT model using network penalty. Please see the references for more details of the models. By default, regnet uses robust methods for survival response. If users would like to use non-robust methods, simply set robust=FALSE. User could also use MCP or Lasso penalty.

The coefficients are always estimated on a standardized X matrix. regnet standardizes each columns of X to have unit variance (using 1/n rather than 1/(n-1) formula). If the coefficients on the original scale are needed, the user can refit a standard model using the subset of variables that have non-zero coefficients.

Value

a vector of estimated coefficients. Please note that, if there are variables not subject to penalty (indicated by clv), the order of returned vector is c(Intercept, unpenalized coefficients of clv variables, penalized coefficients of other variables).

References

Ren, J., He, T., Li, Y., Liu, S., Du, Y., Jiang, Y., and Wu, C. (2017). Network-based regularization for high dimensional SNP data in the case-control study of Type 2 diabetes. BMC Genetics, 18(1):44

Ren, J., Du, Y., Li, S., Ma, S., Jiang,Y. and Wu, C. (2019). Robust network-based regularization and variable selection for high dimensional genomics data in cancer prognosis. Genet. Epidemiol., 43:276-291

See Also

cv.regnet

Examples

## Survival response
data(SurvExample)
X = rgn.surv$X
Y = rgn.surv$Y
clv = c(1:5) # variables 1 to 5 are clinical variables which we choose not to penalize.
penalty = "network"
b = regnet(X, Y, "survival", penalty, rgn.surv$lamb1, rgn.surv$lamb2, clv=clv, robust=TRUE)
index = which(rgn.surv$beta != 0)
pos = which(b != 0)
tp = length(intersect(index, pos))
fp = length(pos) - tp
list(tp=tp, fp=fp)


[Package regnet version 0.4.0 Index]