akmeans {akmeans} | R Documentation |
The adaptive K-means algorithm is simple:
1. Set min.k and max.k.
2. Run K-means with K = min.k.
3. For each cluster, check the threshold condition.
4. If all clusters satisfy the threshold condition, stop and return the result.
5. If K > max.k, stop. Otherwise, go to step 6.
6. For each cluster that violates the threshold condition, run K'-means with K' = 2 on its members, so K increases by the number of violating clusters.
7. Run K-means with the current cluster centers as the initial centers, then go to step 4.
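The steps above can be sketched in a few lines of R. This is a minimal illustration built on the base `kmeans` function, not the package's actual implementation: the names `cluster_ok` and `adaptive_kmeans_sketch` are hypothetical, only the ths1-style condition with Euclidean distance is covered, and the guard against clusters too small to split is an added assumption.

```r
# Hypothetical helper (not from the akmeans package): TRUE when every member
# satisfies sum((sample - center)^2) < ths * sum(center^2).
cluster_ok <- function(members, center, ths) {
  sq_dist <- rowSums(sweep(members, 2, center)^2)
  all(sq_dist < ths * sum(center^2))
}

adaptive_kmeans_sketch <- function(x, ths = 0.2, min.k = 5, max.k = 100,
                                   iter.max = 100) {
  fit <- kmeans(x, centers = min.k, iter.max = iter.max)        # step 2
  repeat {
    k <- nrow(fit$centers)
    ok <- vapply(seq_len(k), function(j) {                      # step 3
      cluster_ok(x[fit$cluster == j, , drop = FALSE], fit$centers[j, ], ths)
    }, logical(1))
    if (all(ok)) break                                          # step 4
    if (k > max.k) break                                        # step 5
    centers <- fit$centers[ok, , drop = FALSE]
    for (j in which(!ok)) {                                     # step 6
      members <- x[fit$cluster == j, , drop = FALSE]
      if (nrow(unique(members)) >= 3) {
        # Split the violating cluster in two.
        centers <- rbind(centers, kmeans(members, centers = 2)$centers)
      } else {
        # Assumption: too few distinct points to split, keep the old center.
        centers <- rbind(centers, fit$centers[j, , drop = FALSE])
      }
    }
    if (nrow(centers) == k) break  # nothing was split; avoid looping forever
    fit <- kmeans(x, centers = centers, iter.max = iter.max)    # step 7
  }
  fit
}
```

Because step 5 is checked before splitting, the final K in this sketch can exceed max.k by the number of clusters split in the last pass.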
akmeans(x, ths1 = 0.2, ths2 = 0.2, ths3 = 0.7, ths4 = 0.2, min.k = 5, max.k = 100, iter.max = 100, nstart = 1, mode = 1, d.metric = 1, verbose = TRUE)
x |
data matrix n by p: all elements should be numeric |
ths1 |
threshold to decide whether to increase k or not: check sum((sample-assigned center)^2) < ths1*sum(assigned center^2) |
ths2 |
threshold to decide whether to increase k or not: check all components of |sample-assigned center| < ths2 |
ths3 |
threshold to decide whether to increase k or not: check inner product of (sample,assigned center) > ths3 , this is only for cosine distance metric |
ths4 |
threshold to decide whether to increase k or not: check sum(abs(sample-assigned center)) < ths4 |
min.k |
minimum number of clusters, starting k |
max.k |
maximum number of clusters |
iter.max |
passed to the kmeans function |
nstart |
passed to the kmeans function |
mode |
1: use ths1, 2: use ths2, 3: use ths3 |
d.metric |
1: use the Euclidean distance metric; any other value: use the cosine distance metric |
verbose |
print the messages or not |
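To make the four threshold conditions concrete, the sketch below evaluates each one for a single sample and its assigned center. The values are made up for illustration, and normalising both vectors to unit length before taking the inner product for ths3 is an assumption about how the cosine case is handled, not something the documentation states.

```r
# Illustrative values only (not from the package).
sample_pt <- c(1.0, 2.0, 2.0)
center    <- c(1.1, 1.9, 2.3)
d <- sample_pt - center

# ths1: squared distance relative to the squared norm of the center
cond1 <- sum(d^2) < 0.2 * sum(center^2)

# ths2: every component of |sample - center| below the threshold
cond2 <- all(abs(d) < 0.2)

# ths3: inner product above the threshold
# (assumption: both vectors are scaled to unit length first)
unit <- function(v) v / sqrt(sum(v^2))
cond3 <- sum(unit(sample_pt) * unit(center)) > 0.7

# ths4: sum of absolute differences below the threshold
cond4 <- sum(abs(d)) < 0.2

c(ths1 = cond1, ths2 = cond2, ths3 = cond3, ths4 = cond4)
```

Here the sample passes the ths1 and ths3 checks but fails ths2 and ths4, because one component differs from the center by 0.3.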
If d.metric = 1, the result is the same as that returned by the 'kmeans' function. If d.metric is not 1, a list is returned with components:
cluster: A vector of integers indicating the cluster to which each point is allocated.
centers: A matrix of cluster centres.
size: The number of points in each cluster.
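The three components listed above are also present in a base `kmeans` result, so code that reads only `cluster`, `centers`, and `size` should work for either metric. The sketch below uses `kmeans` directly as a stand-in for the d.metric = 1 case; the seed and the choice of 5 centers are arbitrary.

```r
set.seed(42)
x <- matrix(rnorm(1000), 100, 10)

# Stand-in for the d.metric = 1 case, which returns a kmeans-style result.
res <- kmeans(x, centers = 5, iter.max = 100)

length(res$cluster)  # one cluster label per row of x
dim(res$centers)     # number of clusters by number of columns of x
res$size             # points per cluster; sums to nrow(x)
```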
Jungsuk Kwac
x = matrix(rnorm(1000),100,10)
akmeans(x) ## euclidean distance based
akmeans(x,d.metric=2,ths3=0.8,mode=3) ## cosine distance based