10.5120/ijca2016909394
Srikanta Kolay and Kumar S Ray. Article: CATCLUS – A Proposed Algorithm for Clustering Categorical Data. International Journal of Computer Applications 139(10):40-44, April 2016. Published by Foundation of Computer Science (FCS), NY, USA.
BibTeX
@article{key:article,
  author = {Srikanta Kolay and Kumar S. Ray},
  title = {Article: CATCLUS – A Proposed Algorithm for Clustering Categorical Data},
  journal = {International Journal of Computer Applications},
  year = {2016},
  volume = {139},
  number = {10},
  pages = {40-44},
  month = {April},
  note = {Published by Foundation of Computer Science (FCS), NY, USA}
}
Abstract
Classification of categorical data is more complex than classification of numerical data, because a firm outline cannot be drawn for categorical data. Researchers have adopted different assumptions to treat such data. Moreover, dissimilarity measures designed for numerical data cannot be applied directly to categorical data. This paper proposes a new clustering algorithm for categorical data that uses a newly devised dissimilarity measure. The paper gives only a theoretical description of the proposed algorithm, with an appropriate example.
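To illustrate why numerical dissimilarity measures do not carry over directly, the following minimal sketch compares Euclidean distance on arbitrarily integer-coded categories with the well-known simple matching dissimilarity. This is not the measure devised in the paper, which the abstract does not specify; the records, the attribute names, and the integer coding are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def simple_matching(a, b):
    """Fraction of attributes on which two categorical records differ.

    A generic, well-known measure used here for illustration only;
    it is NOT the dissimilarity measure proposed for CATCLUS.
    """
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def euclidean(a, b):
    """Ordinary Euclidean distance between two numeric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical records with two categorical attributes: (colour, shape).
r1 = ("red", "circle")
r2 = ("green", "circle")
r3 = ("blue", "circle")

# An arbitrary integer coding (assumption): the order of colours is meaningless.
codes = {"red": 0, "green": 1, "blue": 2, "circle": 0}


def encode(record):
    """Map each categorical value to its arbitrary integer code."""
    return [codes[value] for value in record]


# Euclidean distance on the codes suggests red is "closer" to green than to blue,
# purely because of the chosen numbering.
d_num_12 = euclidean(encode(r1), encode(r2))
d_num_13 = euclidean(encode(r1), encode(r3))

# Simple matching treats both pairs identically: they differ on one of two attributes.
d_cat_12 = simple_matching(r1, r2)
d_cat_13 = simple_matching(r1, r3)

print(d_num_12, d_num_13)
print(d_cat_12, d_cat_13)
```

Renumbering the colours changes the Euclidean distances but leaves the simple matching values unchanged, which is the kind of behaviour a dissimilarity measure for categorical data needs.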
Keywords
Categorical Data, Clustering, Dissimilarity Measure, Algorithm.