kNN {dbscan} | R Documentation |
This function uses a kd-tree to find all k nearest neighbors in a data matrix (including distances) fast.
kNN(x, k, sort = TRUE, search = "kdtree", bucketSize = 10, splitRule = "suggest", approx = 0)
x |
a data matrix, a dist object or a kNN object. |
k |
number of neighbors to find. |
search |
nearest neighbor search strategy (one of "kdtree", "linear" or "dist"). |
sort |
sort the neighbors by distance? Note that this is expensive and
|
bucketSize |
max size of the kd-tree leafs. |
splitRule |
rule to split the kd-tree. One of "STD", "MIDPT", "FAIR", "SL_MIDPT", "SL_FAIR" or "SUGGEST" (SL stands for sliding). "SUGGEST" uses ANNs best guess. |
approx |
use approximate nearest neighbors. All NN up to a distance of
a factor of 1+ |
search
controls if a kd-tree or linear search (both implemented in
the ANN library; see Mount and Arya, 2010). Note, that these implementations cannot handle NAs. search="dist"
precomputes Euclidean distances first using R.
NAs are handled, but the resulting distance matrix cannot contain NAs. To use other distance measures, a precomputed distance matrix can be
provided as x
(search
is ignored).
bucketSize
and splitRule
influence how the kd-tree is built.
approx
uses the approximate nearest neighbor search implemented in ANN.
All nearest neighbors up to a distance of eps
/(1+approx
)
will be considered and all with a distance greater than eps
will not
be considered. The other points might be considered. Note that this results in
some actual nearest neighbors being omitted leading to spurious clusters and noise points. However, the algorithm will enjoy a significant speedup. For more details see Mount and Arya (2010).
Note: self-matches are removed!
An object of class kNN containing a list with the following components:
dist |
a matrix with distances. |
id |
a matrix with ids. |
k |
number of k used. |
Michael Hahsler
David M. Mount and Sunil Arya (2010). ANN: A Library for Approximate Nearest Neighbor Searching, http://www.cs.umd.edu/~mount/ANN/.
NN
and frNN
for fixed radius nearest neighbors.
data(iris) x <- iris[, -5] # finding kNN directly in data (using a kd-tree) nn <- kNN(x, k=5) nn # explore neighborhood of point 10 i <- 10 nn$id[i,] plot(x, col = ifelse(1:nrow(iris) %in% nn$id[i,], "red", "black")) # Visualize the 5 nearest neighbors plot(nn, x) # Visualize a reduced 2-NN graph plot(kNN(nn, k = 2), x)