A Novel Clustering Algorithm using K-means (CUK)

Khaled W. Alnaji; Wesam M. Ashour

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 25 - Number 1

Year of Publication: 2011

Authors: Khaled W. Alnaji, Wesam M. Ashour

10.5120/2995-4025

Khaled W. Alnaji, Wesam M. Ashour . A Novel Clustering Algorithm using K-means (CUK). International Journal of Computer Applications. 25, 1 ( July 2011), 25-30. DOI=10.5120/2995-4025

@article{ 10.5120/2995-4025,
author = { Khaled W. Alnaji, Wesam M. Ashour },
title = { A Novel Clustering Algorithm using K-means (CUK) },
journal = { International Journal of Computer Applications },
issue_date = { July 2011 },
volume = { 25 },
number = { 1 },
month = { July },
year = { 2011 },
issn = { 0975-8887 },
pages = { 25-30 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume25/number1/2995-4025/ },
doi = { 10.5120/2995-4025 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T20:10:39.580012+05:30
%A Khaled W. Alnaji
%A Wesam M. Ashour
%T A Novel Clustering Algorithm using K-means (CUK)
%J International Journal of Computer Applications
%@ 0975-8887
%V 25
%N 1
%P 25-30
%D 2011
%I Foundation of Computer Science (FCS), NY, USA

Abstract

While K-means is one of the best-known methods for partitioning a data set into clusters, it performs poorly when clusters differ in size and density, and it converges to one of many local minima. Many methods have been proposed to overcome these limitations, but most do not handle differences in both density and size at the same time: they succeed with one and fail with the other. In this paper we propose a novel clustering algorithm using K-means (CUK). The proposed algorithm clusters data objects with K-means using one additional centroid, followed by several partitioning and merging steps. The merging decision depends on the average mean distance, that is, the average distance between each cluster mean and each data object: the least and closest clusters in average mean distance are merged into one cluster. This process continues until the final required clusters are obtained in an accurate and efficient way. Compared with K-means, the results obtained by the proposed algorithm CUK were found to be more effective and accurate.
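
The abstract gives only an outline of CUK, so the following Python sketch is one possible reading of it, not the authors' implementation. It assumes that K-means is run once with k + 1 centroids, that the "average mean distance" of a cluster is the mean Euclidean distance between the cluster mean and that cluster's own members, and that the cluster with the least average mean distance is merged with the cluster whose mean is closest to it. The function names `kmeans`, `average_mean_distance`, and `cuk_sketch` are hypothetical, and the repeated partitioning and merging rounds mentioned in the abstract are reduced here to a single merge.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's K-means on rows of X; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment against the final centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids

def average_mean_distance(X, labels, centroids):
    """Assumed definition: mean distance from each cluster mean to its members."""
    return np.array([
        np.linalg.norm(X[labels == j] - centroids[j], axis=1).mean()
        if np.any(labels == j) else np.inf
        for j in range(len(centroids))
    ])

def cuk_sketch(X, k, seed=0):
    """One reading of the abstract: cluster with k + 1 centroids, then merge once."""
    X = np.asarray(X, dtype=float)
    labels, centroids = kmeans(X, k + 1, seed=seed)
    amd = average_mean_distance(X, labels, centroids)
    a = int(amd.argmin())                      # least average mean distance
    gap = np.linalg.norm(centroids - centroids[a], axis=1)
    gap[a] = np.inf                            # never merge a cluster with itself
    b = int(gap.argmin())                      # closest other cluster mean
    merged = np.where(labels == b, a, labels)  # fold cluster b into cluster a
    # Relabel so cluster ids are consecutive integers starting at 0.
    _, merged = np.unique(merged, return_inverse=True)
    return merged
```

Calling `cuk_sketch(X, k)` on an `(n, d)` array returns `n` integer labels with at most `k` distinct values. The merge criterion is the main design choice here; the full paper may define it differently.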

Index Terms

Computer Science

Information Sciences

Keywords

Data Clustering, K-means, Clustering using K-means, Average Mean Distance