DOI: 10.5120/6072-7456
Sovan Kumar Patnaik, Soumya Sahoo and Dillip Kumar Swain. Article: Clustering of Categorical Data by Assigning Rank through Statistical Approach. International Journal of Computer Applications 43(2):1-3, April 2012.
BibTeX
@article{key:article,
  author  = {Sovan Kumar Patnaik and Soumya Sahoo and Dillip Kumar Swain},
  title   = {Article: Clustering of Categorical Data by Assigning Rank through Statistical Approach},
  journal = {International Journal of Computer Applications},
  year    = {2012},
  volume  = {43},
  number  = {2},
  pages   = {1-3},
  month   = {April},
  note    = {Full text available}
}
Abstract
Most earlier work on clustering has focused on numerical data, whose inherent geometric properties can be exploited to define distance functions between data points naturally. An algorithm that works only on numeric values, however, cannot be used to cluster real-world data containing categorical values. Recently, the problem of clustering categorical data has started to draw interest. The k-means algorithm is well known for its efficiency in clustering large data sets, but it operates on numeric values. In this paper we extend the k-means algorithm to categorical domains by assigning a rank value to the attributes.
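The abstract does not say how the rank values are derived, so the following Python sketch only illustrates the general idea of ranking categorical values and then running ordinary k-means on the result. Ranking by frequency (rank 1 for the most common value, ties broken alphabetically) is an assumption made here for illustration and may differ from the statistical approach used in the paper. The names `rank_encode` and `kmeans`, the seeding rule, and the toy data are likewise hypothetical.

```python
from collections import Counter


def rank_encode(column):
    """Replace each categorical value with a numeric rank.

    Assumption: rank 1 goes to the most frequent value, rank 2 to the next,
    and so on; ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(column)
    ordered = sorted(counts, key=lambda v: (-counts[v], v))
    ranks = {value: i + 1 for i, value in enumerate(ordered)}
    return [ranks[v] for v in column]


def kmeans(points, k, iterations=20):
    """Basic k-means on numeric points, seeded with the first k distinct points."""
    centroids = []
    for p in points:
        if list(p) not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Toy categorical records: (colour, size). Purely illustrative data.
rows = [
    ("red", "small"), ("red", "small"), ("red", "small"),
    ("blue", "large"), ("blue", "large"), ("green", "large"),
]

# Encode each attribute (column) separately, then rebuild numeric rows.
columns = [rank_encode(list(col)) for col in zip(*rows)]
points = [list(p) for p in zip(*columns)]

labels, centroids = kmeans(points, k=2)
print(labels)
```

One consequence of any rank encoding is that it imposes an order and a spacing on values that may have neither, so the clusters depend on how the ranks are chosen.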