CATCLUS – A Proposed Algorithm for Clustering Categorical Data

Srikanta Kolay; Kumar S. Ray

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 139 - Number 10

Year of Publication: 2016

Authors: Srikanta Kolay, Kumar S. Ray

10.5120/ijca2016909394

Srikanta Kolay, Kumar S. Ray. CATCLUS – A Proposed Algorithm for Clustering Categorical Data. International Journal of Computer Applications. 139, 10 (April 2016), 40-44. DOI=10.5120/ijca2016909394

@article{10.5120/ijca2016909394,
  author = {Srikanta Kolay and Kumar S. Ray},
  title = {CATCLUS – A Proposed Algorithm for Clustering Categorical Data},
  journal = {International Journal of Computer Applications},
  issue_date = {April 2016},
  volume = {139},
  number = {10},
  month = {April},
  year = {2016},
  issn = {0975-8887},
  pages = {40-44},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume139/number10/24530-2016909394/},
  doi = {10.5120/ijca2016909394},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T23:40:37.039705+05:30
%A Srikanta Kolay
%A Kumar S. Ray
%T CATCLUS – A Proposed Algorithm for Clustering Categorical Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 139
%N 10
%P 40-44
%D 2016
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Classifying categorical data involves more complexity than classifying numerical data, because a firm outline cannot be drawn for categorical data. Researchers adopt different assumptions to treat such data, and dissimilarity measures designed for numerical data cannot be applied to it directly. This paper proposes a new clustering algorithm for categorical data that uses a newly devised dissimilarity measure. The paper gives only a theoretical description of the proposed algorithm, with an appropriate example.
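
As an illustration of why numerical dissimilarity measures do not carry over directly: categorical values such as colours have no order or magnitude, so a difference like "red" minus "blue" is undefined. The sketch below uses simple matching (counting the attributes on which two records disagree), a common baseline for categorical data. It is not the dissimilarity measure proposed in this paper, which the abstract does not specify. The function name and the sample records are hypothetical.

```python
from typing import Sequence


def simple_matching_dissimilarity(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the attributes on which two categorical records disagree.

    Illustrative baseline only; this is NOT the CATCLUS measure.
    """
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical records with attributes (colour, shape, size).
r1 = ("red", "round", "small")
r2 = ("red", "square", "small")
r3 = ("blue", "square", "large")

d12 = simple_matching_dissimilarity(r1, r2)  # differ on shape only
d13 = simple_matching_dissimilarity(r1, r3)  # differ on every attribute
d23 = simple_matching_dissimilarity(r2, r3)  # differ on colour and size
```

Under this baseline every mismatch counts equally, which is one of the assumptions a purpose-built measure for categorical data may choose to relax.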

Index Terms

Computer Science

Information Sciences

Keywords

Categorical Data, Clustering, Dissimilarity Measure, Algorithm