clusters {intervals} | R Documentation |
This function uses tools in the intervals package to quickly identify clusters – contiguous collections of positions or intervals which are separated by no more than a given distance from their neighbors to either side.
## S4 method for signature 'numeric'
clusters(x, w, which = FALSE, check_valid = TRUE)

## S4 method for signature 'Intervals_virtual'
clusters(x, w, which = FALSE, check_valid = TRUE)
x |
An appropriate object. |
w |
Maximum permitted distance between a cluster member and its neighbors to either side. |
which |
Should indices into the x object be returned instead of actual subsets of x? |
check_valid |
Should |
A cluster is defined to be a maximal collection, with at least two members, of components of x which are separated by no more than w. Note that when x represents intervals, an interval must actually contain a point at distance w or less from a neighboring interval to be assigned to the same cluster. If the ends of both intervals in question are open and exactly at distance w, they will not be deemed to be cluster co-members. See the example below.
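The endpoint rule described above can be sketched as a pairwise test in base R. This is only an illustration of the stated rule, not code from the intervals package: the function same_cluster and its argument names are hypothetical, and it assumes the two intervals are disjoint, with the first lying to the left of the second.

```r
# Hypothetical helper (not part of the intervals package).
# right_end / right_closed: right endpoint of the left-hand interval and
#   whether that endpoint is closed.
# left_end / left_closed: left endpoint of the right-hand interval and
#   whether that endpoint is closed.
same_cluster <- function(right_end, right_closed, left_end, left_closed, w) {
  gap <- left_end - right_end
  # A gap strictly below w always qualifies. A gap of exactly w qualifies
  # only if at least one of the facing endpoints is closed, so that some
  # interval really contains a point at distance w from its neighbor.
  # (Exact equality is fine for this illustration; real data may need a
  # tolerance.)
  gap < w || (gap == w && (right_closed || left_closed))
}

# [1, 2) and [3, 4): the facing ends are 2 (open) and 3 (closed).
same_cluster(2, FALSE, 3, TRUE, w = 1)

# (1, 2) and (3, 4): both facing ends are open.
same_cluster(2, FALSE, 3, FALSE, w = 1)
```

The two calls mirror the open end point case in the examples: the first pair qualifies as co-members, while the second does not.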
A list whose components are the clusters. Each component is thus a subset of x, or, if which == TRUE, a vector of indices into the x object. (The indices correspond to row numbers when x is of class "Intervals_virtual".)
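To make the difference between returning subsets and returning indices concrete, here is a minimal base R sketch for the numeric case. The function naive_cluster_indices is hypothetical and is not how the package computes its result; the ordering of indices within each component is an assumption of this sketch and may differ from what clusters returns.

```r
# Hypothetical helper (not part of the intervals package): returns, for each
# cluster, the positions in the original vector x of its members.
naive_cluster_indices <- function(x, w) {
  ord <- order(x)
  # Start a new group wherever the gap to the previous sorted value exceeds w.
  group <- cumsum(c(TRUE, diff(x[ord]) > w))
  idx <- unname(split(ord, group))
  # A cluster needs at least two members, so singletons are dropped.
  idx[lengths(idx) >= 2]
}

x <- c(31, 1, 100, 3, 30)
idx <- naive_cluster_indices(x, w = 2)
idx                               # indices into x
lapply(idx, function(i) x[i])     # the corresponding subsets of x
```

Indexing x by each component recovers the subsets that would otherwise be returned directly.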
Implementation is by a call to reduce followed by a call to interval_overlap. The clusters methods are included to illustrate the utility of the core functions in the intervals package, although they are also useful in their own right.
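The reduce-then-overlap idea can be imitated in base R for numeric input. The sketch below is an analogy only and makes no claim about the package's actual code: reduce_then_assign is a hypothetical name, the choice to expand each point to the closed interval [x, x + w] is an assumption, and findInterval stands in for the overlap step.

```r
# Hypothetical base R analogy (not the intervals package implementation).
reduce_then_assign <- function(x, w) {
  x <- sort(x)
  left <- x
  right <- x + w
  # "Reduce" step: a new merged interval begins wherever a left endpoint lies
  # strictly beyond every right endpoint seen so far. Touching closed
  # intervals (gap exactly w) are merged.
  starts <- c(TRUE, left[-1] > cummax(right)[-length(right)])
  merged_left <- left[starts]
  # "Overlap" step: find which merged interval each point falls in.
  member <- findInterval(x, merged_left)
  out <- unname(split(x, member))
  # Keep only collections with at least two members.
  out[lengths(out) >= 2]
}

reduce_then_assign(c(1, 3, 10, 30, 31, 33, 100), w = 2)
```

Here 1 and 3 form one cluster and 30, 31, 33 another, while 10 and 100 have no neighbor within distance 2 and are dropped.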
# Numeric method
w <- 20
x <- sample( 1000, 100 )
c1 <- clusters( x, w )

# Check results
sapply( c1, function( x ) all( diff(x) <= w ) )
d1 <- diff( sort(x) )
all.equal(
  as.numeric( d1[ d1 <= w ] ),
  unlist( sapply( c1, diff ) )
)

# Intervals method, starting with a reduced object so we know that all
# intervals are disjoint and sorted.
B <- 100
left <- runif( B, 0, 1e4 )
right <- left + rexp( B, rate = 1/10 )
y <- reduce( Intervals( cbind( left, right ) ) )

gaps <- function(x) x[-1,1] - x[-nrow(x),2]
hist( gaps(y), breaks = 30 )

w <- 200
c2 <- clusters( y, w )
head( c2 )
sapply( c2, function(x) all( gaps(x) <= w ) )

# Clusters and open end points. See "Details".
z <- Intervals(
  matrix( 1:4, 2, 2, byrow = TRUE ),
  closed = c( TRUE, FALSE )
)
z
clusters( z, 1 )
closed(z)[1] <- FALSE
z
clusters( z, 1 )