ImpKmeans: An Improved Version of the KMeans Algorithm, by Determining Optimum Initial Centroids, based on Multivariate Kernel Density Estimation and Kd-Tree

Date

2024

Authors

Journal Title

Journal ISSN

Volume Title

Publisher

Budapest Tech

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

K-means is the best-known clustering algorithm because of its simplicity, speed, and efficiency. However, the resulting clusters are influenced by the randomly selected initial centroids, and many techniques have been proposed to address this issue. This paper proposes a new version of the k-means clustering algorithm, named ImpKmeans (An Improved Version of K-Means Algorithm by Determining Optimum Initial Centroids Based on Multivariate Kernel Density Estimation and Kd-tree), which uses kernel density estimation to find the optimum initial centroids. Kernel density estimation is used because it is a nonparametric distribution estimation method that can identify dense regions. To assess the efficiency of ImpKmeans, we compared it with several state-of-the-art algorithms. In the experimental studies, the proposed algorithm outperformed the compared versions of k-means: ImpKmeans was the most successful algorithm in 46 of 60 tests, while the second-best algorithm was the best in 34 tests. Moreover, the experimental results indicated that ImpKmeans is fast compared to the selected k-means versions.
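The abstract does not spell out the ImpKmeans procedure, so the following is only a minimal sketch of the general idea it names: seeding k-means from high-density regions found with a Gaussian kernel density estimate instead of from random points. Everything in it is an assumption made for illustration, not the authors' method: the function names (`kde_scores`, `density_seeds`, `kmeans`), the fixed `bandwidth`, and the seed-selection rule (density multiplied by squared distance to the nearest seed already chosen). It also evaluates the density by brute force, whereas the paper uses a kd-tree.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kde_scores(points, bandwidth):
    """Unnormalised Gaussian kernel density estimate at every data point."""
    scale = 2.0 * bandwidth ** 2
    return [sum(math.exp(-sq_dist(p, q) / scale) for q in points)
            for p in points]


def density_seeds(points, k, bandwidth):
    """Pick k seeds: the densest point first, then points that are both
    dense and far from the seeds chosen so far (illustrative heuristic)."""
    dens = kde_scores(points, bandwidth)
    seeds = [max(range(len(points)), key=dens.__getitem__)]
    while len(seeds) < k:
        def score(i):
            return dens[i] * min(sq_dist(points[i], points[s]) for s in seeds)
        seeds.append(max(range(len(points)), key=score))
    return [points[i] for i in seeds]


def kmeans(points, centroids, iters=50):
    """Plain Lloyd iterations starting from the given centroids."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)),
                    key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        # An empty cluster keeps its previous centroid.
        new = [tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[j]
               for j, c in enumerate(clusters)]
        if new == centroids:
            break
        centroids = new
    return centroids


# Synthetic demo data (hypothetical): three well-separated 2-D blobs.
rng = random.Random(0)
true_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
data = [(cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
        for cx, cy in true_centers for _ in range(30)]

seeds = density_seeds(data, 3, bandwidth=1.0)
centers = kmeans(data, seeds)
```

Because the seeds are chosen deterministically from the data, repeated runs on the same input give the same clustering, which is the property that random initialization lacks. The brute-force density evaluation costs quadratic time in the number of points; a spatial index such as a kd-tree is the usual way to reduce that cost.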

Description

Keywords

k-means, clustering, kernel density estimation, centroid initialization, kd-tree

Source

Acta Polytechnica Hungarica

WoS Quartile

Q2

Scopus Quartile

Q1

Volume

21

Issue

2

Citation