ImpKmeans: An Improved Version of the KMeans Algorithm, by Determining Optimum Initial Centroids, based on Multivariate Kernel Density Estimation and Kd-Tree
Date
2024
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Budapest Tech
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
K-means is the best-known clustering algorithm because of its simplicity, speed and efficiency. However, the resulting clusters are influenced by the randomly selected initial centroids, and many techniques have been proposed to address this issue. This paper proposes a new version of the k-means clustering algorithm, ImpKmeans (An Improved Version of the K-Means Algorithm by Determining Optimum Initial Centroids Based on Multivariate Kernel Density Estimation and Kd-Tree), which uses kernel density estimation to find the optimum initial centroids. Kernel density estimation is used because it is a nonparametric distribution estimation method that can identify high-density regions. To assess the efficiency of ImpKmeans, we compared it with several state-of-the-art algorithms. In the experimental studies, the proposed algorithm outperformed the compared versions of k-means: ImpKmeans was the most successful algorithm in 46 of 60 tests, while the second-best algorithm was the best in 34 tests. Moreover, the experimental results indicated that ImpKmeans is fast compared to the selected k-means versions.
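The abstract does not spell out how the density estimate is turned into centroids, so the following is only a minimal illustrative sketch of the general idea of density-based seeding, not the ImpKmeans procedure itself. It assumes a Gaussian kernel with a hand-picked `bandwidth`, a hypothetical `min_separation` rule to keep seeds apart, and a brute-force pairwise loop in place of the kd-tree mentioned in the title. All function and parameter names are made up for illustration.

```python
import math


def gaussian_kde_density(points, bandwidth):
    """Unnormalised Gaussian kernel density at each point (brute force, O(n^2))."""
    densities = []
    for p in points:
        total = 0.0
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        densities.append(total / len(points))
    return densities


def density_based_seeds(points, k, bandwidth, min_separation):
    """Pick up to k seeds, densest points first.

    Assumption (not from the paper): a candidate is skipped when it lies
    closer than min_separation to a seed that was already chosen.
    """
    densities = gaussian_kde_density(points, bandwidth)
    order = sorted(range(len(points)), key=lambda i: densities[i], reverse=True)
    seeds = []
    for i in order:
        if all(math.dist(points[i], s) >= min_separation for s in seeds):
            seeds.append(points[i])
        if len(seeds) == k:
            break
    return seeds


# Two small, well-separated groups of 2-D points (toy data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
seeds = density_based_seeds(points, k=2, bandwidth=0.5, min_separation=2.0)
print(seeds)
```

The seeds returned this way could then be passed as the initial centroids of an ordinary k-means run in place of random starting points. For larger datasets the pairwise loop would normally be replaced by a spatial index such as a kd-tree.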
Description
Keywords
k-means, clustering, kernel density estimation, centroid initialization, kd-tree
Source
Acta Polytechnica Hungarica
WoS Q Value
Q2
Scopus Q Value
Q1
Volume
21
Issue
2