A Novel Clustering Algorithm Using K-means (CUK)

International Journal of Computer Applications
© 2011 by IJCA Journal
Number 1 - Article 3
Year of Publication: 2011
Authors:
Khaled W. Alnaji
Wesam M. Ashour
DOI: 10.5120/2995-4025

Khaled W. Alnaji and Wesam M. Ashour. A Novel Clustering Algorithm Using K-means (CUK). International Journal of Computer Applications 25(1):25-30, July 2011. Full text available. BibTeX

@article{key:article,
	author = {Khaled W. Alnaji and Wesam M. Ashour},
	title = {A Novel Clustering Algorithm using K-means (CUK)},
	journal = {International Journal of Computer Applications},
	year = {2011},
	volume = {25},
	number = {1},
	pages = {25-30},
	month = {July},
	note = {Full text available}
}

Abstract

While K-means is one of the best-known methods for partitioning a data set into clusters, it still has problems when clusters differ in size and density, and it converges to one of many local minima. Many methods have been proposed to overcome these limitations, but most of them do not handle differences in both density and size at the same time: they succeed in overcoming one limitation while failing on the other. In this paper we propose a novel clustering algorithm using K-means (CUK). The proposed algorithm uses K-means to cluster the data objects with one additional centroid, followed by several partitioning and merging steps. The merging decision depends on the average mean distance, that is, the average distance between each cluster mean and the data objects of that cluster: the clusters that are least and closest in average mean distance are merged into one cluster. This process continues until the required number of final clusters is obtained in an accurate and efficient way. Compared with K-means, the results obtained by the proposed CUK algorithm were found to be more effective and accurate.
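
The abstract gives only an outline of the procedure, so the following Python sketch is one possible reading of it rather than the authors' implementation. It assumes a single partitioning pass with k + 1 centroids and a merge rule that joins the two clusters whose average mean distances are closest in value. The function names (`kmeans`, `avg_mean_distance`, `cuk_sketch`), the evenly spaced initial centroids, and the exact merge criterion are all assumptions made for illustration; the paper's repeated partitioning and merging steps are not reproduced.

```python
import math


def mean(cluster):
    """Component-wise mean of a non-empty list of points."""
    dims = len(cluster[0])
    return tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dims))


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means; returns the non-empty clusters as lists of points.

    Assumption: initial centroids are taken at evenly spaced indices so the
    sketch is deterministic. Real uses would normally pick them at random.
    """
    centroids = [points[i * len(points) // k] for i in range(k)]
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return [c for c in clusters if c]


def avg_mean_distance(cluster):
    """Average distance between a cluster's mean and its own data objects."""
    m = mean(cluster)
    return sum(math.dist(p, m) for p in cluster) / len(cluster)


def cuk_sketch(points, k):
    """Cluster with one additional centroid, then merge down to k clusters.

    Assumed merge rule: join the pair of clusters whose average mean
    distances differ the least.
    """
    clusters = kmeans(points, k + 1)
    while len(clusters) > k:
        amd = [avg_mean_distance(c) for c in clusters]
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: abs(amd[ij[0]] - amd[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return clusters
```

Calling `cuk_sketch(points, 2)` on a list of coordinate tuples returns two lists of points. Comparing average mean distances is a rough proxy for density here: fragments of one over-split cluster tend to have similar values, which is why this sketch merges the pair with the smallest difference.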
