irlba {irlba} | R Documentation |
The augmented implicitly restarted Lanczos bidiagonalization algorithm (IRLBA) finds a few approximate largest singular values and corresponding singular vectors of a sparse or dense matrix using a method of Baglama and Reichel. It is a fast and memory-efficient way to compute a partial SVD.
irlba(A, nv = 5, nu, maxit = 1000, work = nv + 7, reorth = TRUE, tol = 1e-05, v = NULL, right_only = FALSE, verbose = FALSE, scale, center, du, ds, dv, shift, mult, fastpath = TRUE)
A |
numeric real- or complex-valued matrix or real-valued sparse matrix. |
nv |
number of right singular vectors to estimate. |
nu |
number of left singular vectors to estimate (defaults to |
maxit |
maximum number of iterations. |
work |
working subspace dimension, larger values can speed convergence at the cost of more memory use. |
reorth |
if |
tol |
convergence is determined when ||AV - US|| < tol*||A||, where the spectral norm ||A|| is approximated by the largest estimated singular value, and U, V, S are the matrices corresponding to the estimated left and right singular vectors, and diagonal matrix of estimated singular values, respectively. |
v |
optional starting vector or output from a previous run of |
right_only |
logical value indicating return only the right singular vectors
( |
verbose |
logical value that when |
scale |
optional column scaling vector whose values divide each column of |
center |
optional column centering vector whose values are subtracted from each
column of |
du |
DEPRECATED optional subspace deflation vector (see notes). |
ds |
DEPRECATED optional subspace deflation scalar (see notes). |
dv |
DEPRECATED optional subspace deflation vector (see notes). |
shift |
optional shift value (square matrices only, see notes). |
mult |
optional custom matrix multiplication function (default is |
fastpath |
try a fast C algorithm implementation if possible; set |
Returns a list with entries:
max(nu, nv) approximate singular values
nu approximate left singular vectors (only when right_only=FALSE)
nv approximate right singular vectors
The number of Lanczos iterations carried out
The total number of matrix vector products carried out
The syntax of irlba
partially follows svd
, with an important
exception. The usual R svd
function always returns a complete set of
singular values, even if the number of singular vectors nu
or nv
is set less than the maximum. The irlba
function returns a number of
estimated singular values equal to the maximum of the number of specified
singular vectors nu
and nv
.
Use the optional scale
parameter to implicitly scale each column of
the matrix A
by the values in the scale
vector, computing the
truncated SVD of the column-scaled sweep(A, 2, scale, FUN=`/`)
, or
equivalently, A %*% diag(1 / scale)
, without explicitly forming the
scaled matrix. scale
must be a non-zero vector of length equal
to the number of columns of A
.
Use the optional center
parameter to implicitly subtract the values
in the center
vector from each column of A
, computing the
truncated SVD of sweep(A, 2, center, FUN=`-`)
,
without explicitly forming the centered matrix. This option may not be
used together with the general rank 1 deflation options. center
must be a vector of length equal to the number of columns of A
.
This option may be used to efficiently compute principal components without
explicitly forming the centered matrix (which can, importantly, preserve
sparsity in the matrix). See the examples.
The optional deflation parameters are deprecated and will be removed in
a future version. They could be used to compute the rank-one deflated
SVD of A - ds*du %*% t(dv), where
t(du) %*% A - ds * t(dv) == 0. For
example, the triple ds, du, dv
may be a known singular value
and corresponding singular vectors. Or ds=m
and dv
and du
represent a vector of column means of A
and of ones,
respectively, where m
is the number of rows of A
.
This functionality can be effectively replaced with custom matrix
product functions.
Specify an optional alternative matrix multiplication operator in the
mult
parameter. mult
must be a function of two arguments,
and must handle both cases where one argument is a vector and the other
a matrix. See the examples and
demo("custom_matrix_multiply", package="irlba")
for an
alternative approach.
Use the v
option to supply a starting vector for the iterative
method. A random vector is used by default (precede with set.seed()
to
for reproducibility). Optionally set v
to
the output of a previous run of irlba
to restart the method, adding
additional singular values/vectors without recomputing the solution
subspace. See the examples.
The function may generate the following warnings:
"did not converge–results might be invalid!; try increasing maxit or fastpath=FALSE" means that the algorithm didn't
converge – this is potentially a serious problem and the returned results may not be valid. irlba
reports a warning here instead of an error so that you can inspect whatever is returned. If this
happens, carefully heed the warning and inspect the result.
"You're computing a large percentage of total singular values, standard svd might work better!"
irlba
is designed to efficiently compute a few of the largest singular values and associated
singular vectors of a matrix. The standard svd
function will be more efficient for computing
large numbers of singular values than irlba
.
"convergence criterion below machine epsilon" means that the product of tol
and the
largest estimated singular value is really small and the normal convergence criterion is only
met up to round off error.
The function might return an error for several reasons including a situation when the starting
vector v
is near the null space of the matrix. In that case, try a different v
.
The fastpath=TRUE
option only supports real-valued matrices and sparse matrices
of type dgCMatrix
(for now). Other problems fall back to the reference
R implementation.
Augmented Implicitly Restarted Lanczos Bidiagonalization Methods, J. Baglama and L. Reichel, SIAM J. Sci. Comput. 2005.
set.seed(1) A <- matrix(runif(400), nrow=20) S <- irlba(A, 3) S$d # Compare with svd svd(A)$d[1:3] # Restart the algorithm to compute more singular values # (starting with an existing solution S) S1 <- irlba(A, 5, v=S) # Principal components (see also prcomp_irlba) P <- irlba(A, nv=1, center=colMeans(A)) # Compare with prcomp and prcomp_irlba (might vary up to sign) cbind(P$v, prcomp(A)$rotation[, 1], prcomp_irlba(A)$rotation[, 1]) # A custom matrix multiplication function that scales the columns of A # (cf the scale option). This function scales the columns of A to unit norm. col_scale <- sqrt(apply(A, 2, crossprod)) mult <- function(x, y) { # check if x is a vector if (is.vector(x)) { return((x %*% y) / col_scale) } # else x is the matrix x %*% (y / col_scale) } irlba(A, 3, mult=mult)$d # Compare with: irlba(A, 3, scale=col_scale)$d # Compare with: svd(sweep(A, 2, col_scale, FUN=`/`))$d[1:3]