A Comparative Study of Data Clustering Algorithms

Geet Singhal; Shipra Panwar; Kanika Jain; Devender Banga

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 83 - Number 15

Year of Publication: 2013

Authors: Geet Singhal, Shipra Panwar, Kanika Jain, Devender Banga

DOI: 10.5120/14528-2927

Geet Singhal, Shipra Panwar, Kanika Jain, Devender Banga. A Comparative Study of Data Clustering Algorithms. International Journal of Computer Applications. 83, 15 (December 2013), 41-46. DOI=10.5120/14528-2927

@article{10.5120/14528-2927,
  author = {Geet Singhal and Shipra Panwar and Kanika Jain and Devender Banga},
  title = {A Comparative Study of Data Clustering Algorithms},
  journal = {International Journal of Computer Applications},
  issue_date = {December 2013},
  volume = {83},
  number = {15},
  month = {December},
  year = {2013},
  issn = {0975-8887},
  pages = {41-46},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume83/number15/14528-2927/},
  doi = {10.5120/14528-2927},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article

%1 2024-02-06T21:59:31.349200+05:30

%A Geet Singhal

%A Shipra Panwar

%A Kanika Jain

%A Devender Banga

%T A Comparative Study of Data Clustering Algorithms

%J International Journal of Computer Applications

%@ 0975-8887

%V 83

%N 15

%P 41-46

%D 2013

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Data clustering is the process of partitioning data points into meaningful clusters such that points within a cluster are similar to one another and points in different clusters are dissimilar. It is an unsupervised approach to grouping data into patterns. Clustering algorithms generally fall into two categories: hard clustering, in which each data object belongs to exactly one cluster, and soft clustering, in which a data object can belong to more than one cluster. This paper presents a comparative study of three major data clustering algorithms, highlighting their merits and demerits: k-means, fuzzy c-means, and the k-NN clustering algorithm. Choosing an appropriate clustering algorithm depends on several factors, one of which is the size of the data to be partitioned.
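The hard/soft distinction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`hard_assign`, `soft_memberships`, `kmeans`), the sample points, and the fuzzifier value `m = 2` are assumptions chosen for illustration, and the membership formula is the textbook fuzzy c-means rule using Euclidean distance.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def hard_assign(point, centers):
    """Hard clustering (k-means style): index of the single nearest center."""
    return min(range(len(centers)), key=lambda j: euclidean(point, centers[j]))

def soft_memberships(point, centers, m=2.0):
    """Soft clustering (fuzzy c-means style): one membership degree per
    center, with the degrees summing to 1. m > 1 is the fuzzifier."""
    dists = [euclidean(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** power for d_k in dists) for d_j in dists]

def kmeans(points, centers, iterations=10):
    """Plain k-means: alternate hard assignment and mean update."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[hard_assign(p, centers)].append(p)
        # An empty cluster keeps its previous center.
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centers)
        ]
    return centers

if __name__ == "__main__":
    # Hypothetical sample data: two well-separated groups in the plane.
    points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
    centers = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
    print("k-means centers:", centers)
    print("hard label of (2, 2):", hard_assign((2.0, 2.0), centers))
    print("soft memberships of (5, 5):", soft_memberships((5.0, 5.0), centers))
```

With hard assignment a point receives exactly one label, whereas the soft memberships express how strongly it belongs to each cluster; a point midway between two centers receives roughly equal degrees for both.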

Index Terms

Computer Science

Information Sciences

Keywords

k-means algorithm, c-means algorithm, k-NN algorithm, Euclidean distance, hard clustering, soft clustering