kpeaks: an R package for quick selection of k for cluster analysis

Cebeci, Zeynel and Cebeci, Cagatay (2019) kpeaks: an R package for quick selection of k for cluster analysis. In: 2018 International Conference on Artificial Intelligence and Data Processing - Proceedings. IEEE, TUR. ISBN 9781538668788 (https://doi.org/10.1109/IDAP.2018.8620896)

Accepted Author Manuscript

Abstract

The number of clusters, k, is a mandatory user-specified input argument required to start any partitioning clustering algorithm. In unsupervised learning applications, an optimal value of this argument is generally determined using one of the internal validity indexes. However, determining k with the aid of these indexes is computationally very expensive, because they derive a value of k from the results of several runs of a clustering algorithm. In contrast, the package 'kpeaks' makes it possible to estimate k before starting a clustering session. It is based on a simple, novel technique that uses descriptive statistics of the peak counts of the features in a dataset. In this paper, we introduce the R package 'kpeaks' and illustrate its details as an implementation for quick selection of the number of clusters for starting clustering algorithms.
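
The general idea of estimating k from per-feature peak counts can be illustrated with a minimal sketch. The code below is not the 'kpeaks' implementation and does not use its API; it is an independent Python illustration under stated assumptions: each feature is binned into an equal-width histogram (the bin count `n_bins` is an assumed parameter), a peak is taken to be a bin or plateau of bins higher than both neighbours, and the candidate values of k are the rounded mean, the rounded median, and the maximum of the peak counts. All function names are hypothetical.

```python
import statistics


def histogram_counts(values, n_bins):
    """Equal-width histogram of one feature (assumed binning rule)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature falls into a single bin.
        return [len(values)]
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp the maximum value into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def count_peaks(counts, min_count=1):
    """Count local maxima in a list of bin counts.

    A run of equal bins counts as one peak if it is higher than the
    bins on both sides. `min_count` is an assumed noise threshold.
    """
    padded = [0] + list(counts) + [0]
    peaks = 0
    i = 1
    while i < len(padded) - 1:
        # Extend j to the end of a plateau of equal counts.
        j = i
        while j + 1 < len(padded) - 1 and padded[j + 1] == padded[i]:
            j += 1
        is_peak = padded[i] > padded[i - 1] and padded[i] > padded[j + 1]
        if is_peak and padded[i] >= min_count:
            peaks += 1
        i = j + 1
    return peaks


def estimate_k(rows, n_bins=10, min_count=1):
    """Summarise per-feature peak counts into candidate values of k."""
    columns = list(zip(*rows))  # one tuple of values per feature
    peak_counts = [
        count_peaks(histogram_counts(col, n_bins), min_count)
        for col in columns
    ]
    return {
        "peak_counts": peak_counts,
        # Python's round() rounds exact halves to the nearest even integer.
        "mean": round(statistics.mean(peak_counts)),
        "median": round(statistics.median(peak_counts)),
        "max": max(peak_counts),
    }


# Toy data: the first feature has two groups of values, the others three.
rows = [
    (0.0, 0.0, 0.0),
    (0.1, 0.1, 0.2),
    (0.2, 5.0, 4.9),
    (9.8, 5.1, 5.0),
    (9.9, 9.9, 9.8),
    (10.0, 10.0, 10.0),
]
result = estimate_k(rows, n_bins=5)
print(result)
```

Because no clustering algorithm is run, the cost of such an estimate grows only with the number of observations and features, which is the contrast with internal validity indexes drawn above. On real data the result is sensitive to the binning and to how noisy peaks are filtered, so the thresholds here should be read as placeholders.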