islasso {islasso}    R Documentation
islasso
is used to fit lasso regression models in which the nonsmooth L1-norm penalty is replaced by a smooth approximation justified under the induced smoothing paradigm. Simple lasso-type or elastic-net penalties are permitted, and linear, logistic, Poisson and Gamma responses are allowed.
islasso(formula, family = gaussian, lambda, alpha = 1, data, weights, subset, offset, unpenalized, contrasts = NULL, control = is.control())
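A minimal call is sketched below; the data frame d and the response y are assumptions made only for this illustration, not objects provided by the package.

fit <- islasso(y ~ ., family = gaussian, data = d)  # lambda is selected automatically when missing
summary(fit)                                        # estimates with standard errors (see the methods listed below)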
formula: an object of class "formula" (or one that can be coerced to that class): the 'usual' symbolic description of the model to be fitted.

family: the assumed response distribution. Gaussian, (quasi) Binomial, (quasi) Poisson, and Gamma are allowed.

lambda: value of the tuning parameter in the objective. If missing, the optimal lambda is computed using aic.islasso.

alpha: the elastic-net mixing parameter, with 0 ≤ α ≤ 1. The penalty is defined as (1 - α)/2 ||β||_2^2 + α ||β||_1, so alpha = 1 gives the lasso penalty and alpha = 0 the ridge penalty (a short usage sketch follows this argument list).

data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which islasso is called.

weights: observation weights. Default is 1 for each observation.

subset: an optional vector specifying a subset of observations to be used in the fitting process.

offset: this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases.

unpenalized: optional. A vector of integers or characters indicating which covariate coefficients are not to be penalized. The intercept, if included in the model, is always unpenalized.

contrasts: an optional list. See the contrasts.arg of model.matrix.default.

control: a list of parameters for controlling the fitting process (see is.control).
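The sketch below illustrates the lambda, alpha and unpenalized arguments; the data frame d and the covariate name x1 are assumptions made only for this illustration.

# sketch: elastic-net-type penalty with a fixed tuning parameter, leaving x1 unpenalized
fit <- islasso(y ~ ., data = d, family = gaussian,
               lambda = 2,            # fixed value of the tuning parameter
               alpha = 0.5,           # mixing parameter: 1 = lasso, 0 = ridge
               unpenalized = "x1")    # coefficient of x1 is not penalized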
islasso
estimates regression models by imposing a lasso-type penalty on some or all regression coefficients. However, the nonsmooth L1-norm penalty is replaced by a smooth approximation justified under the induced smoothing paradigm. The advantage is that reliable standard errors are returned as model output and hypothesis testing on linear combinations of the regression parameters can be carried out straightforwardly via the Wald statistic. Simulation studies provide evidence that the proposed approach controls type-I errors and exhibits good power in different scenarios.
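As a concrete illustration of Wald-based inference, the sketch below computes coefficient-wise z-statistics from the coefficients and se components returned by islasso (documented in the list of returned components below); it illustrates the idea only and is not a replacement for the package's summary method.

# sketch: Wald z-statistics and two-sided p-values from a fitted islasso object `fit`
z    <- fit$coefficients / fit$se     # Wald statistics, one per coefficient
pval <- 2 * pnorm(-abs(z))            # two-sided p-values
ci   <- cbind(lower = fit$coefficients - 1.96 * fit$se,   # approximate 95% Wald
              upper = fit$coefficients + 1.96 * fit$se)   # confidence intervals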
A list with components:

coefficients: a named vector of coefficients.
se: a named vector of standard errors.
res: the working residuals.
fitted.values: the fitted values.
linear.predictors: the linear predictors.
rank: the estimated degrees of freedom.
family: the family object used.
deviance: the family deviance.
null.deviance: the family null deviance.
aic: the Akaike Information Criterion.
df.null: the degrees of freedom of a null model.
phi: the estimated dispersion parameter.
beta.unbias: the unbiased coefficients.
se.unbias: the unbiased standard errors.
internal: internal elements.
control: the value of the control argument used.
model: if requested (the default), the model frame used.
terms: the terms object used.
contrasts: (only where relevant) the contrasts used.
call: the matched call.
formula: the formula supplied.
The main function of the same name was inspired by the R function previously implemented by Vito MR Muggeo. Maintainer: Gianluca Sottile <gianluca.sottile@unipa.it>
Cilluffo, G., Sottile, G., La Grutta, S. and Muggeo, V.M.R. (2019). The Induced Smoothed lasso: A practical framework for hypothesis testing in high dimensional regression. Statistical Methods in Medical Research, doi: 10.1177/0962280219842890.
Sottile, G., Cilluffo, G. and Muggeo, V.M.R. (2019). The R package islasso: estimation and hypothesis testing in lasso regression. Technical Report on ResearchGate, doi: 10.13140/RG.2.2.16360.11521.
islasso.fit, coef.islasso, summary.islasso, residuals.islasso, AIC.islasso, logLik.islasso, fitted.islasso, predict.islasso and deviance.islasso methods.
set.seed(1)
n <- 100
p <- 100
p1 <- 10  # number of nonzero coefficients
coef.veri <- sort(round(c(seq(.5, 3, l = p1/2), seq(-1, -2, l = p1/2)), 2))
sigma <- 1
coef <- c(coef.veri, rep(0, p - p1))
X <- matrix(rnorm(n * p), n, p)
eta <- drop(X %*% coef)

##### gaussian ######
mu <- eta
y <- mu + rnorm(n, 0, sigma)

o <- islasso(y ~ -1 + X, family = gaussian)
o
summary(o)
coef(o)
fitted(o)
predict(o, type = "response")
plot(o)
residuals(o)
deviance(o)
AIC(o)
logLik(o)

## Not run:
# for the interaction
o <- islasso(y ~ -1 + X[, 1] * X[, 2], family = gaussian)

##### binomial ######
coef <- c(c(1, 1, 1), rep(0, p - 3))
X <- matrix(rnorm(n * p), n, p)
eta <- drop(cbind(1, X) %*% c(1, coef))
mu <- binomial()$linkinv(eta)
y <- rbinom(n, 100, mu)
y <- cbind(y, 100 - y)
o <- islasso(y ~ X, family = binomial)

##### poisson ######
coef <- c(c(1, 1, 1), rep(0, p - 3))
X <- matrix(rnorm(n * p), n, p)
eta <- drop(cbind(1, X) %*% c(1, coef))
mu <- poisson()$linkinv(eta)
y <- rpois(n, mu)
o <- islasso(y ~ X, family = poisson)

##### Gamma ######
coef <- c(c(1, 1, 1), rep(0, p - 3))
X <- matrix(rnorm(n * p), n, p)
eta <- drop(cbind(1, X) %*% c(1, coef))
mu <- Gamma(link = "log")$linkinv(eta)
shape <- 10
phi <- 1 / shape
y <- rgamma(n, scale = mu / shape, shape = shape)
o <- islasso(y ~ X, family = Gamma(link = "log"))
## End(Not run)