Research Article

Clustering of Categorical Data by Assigning Rank through Statistical Approach

by Sovan Kumar Patnaik, Soumya Sahoo, Dillip Kumar Swain
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 43 - Number 2
Year of Publication: 2012
Authors: Sovan Kumar Patnaik, Soumya Sahoo, Dillip Kumar Swain
DOI: 10.5120/6072-7456

Sovan Kumar Patnaik, Soumya Sahoo, Dillip Kumar Swain. Clustering of Categorical Data by Assigning Rank through Statistical Approach. International Journal of Computer Applications. 43, 2 (April 2012), 1-3. DOI=10.5120/6072-7456

@article{ 10.5120/6072-7456,
author = { Sovan Kumar Patnaik and Soumya Sahoo and Dillip Kumar Swain },
title = { Clustering of Categorical Data by Assigning Rank through Statistical Approach },
journal = { International Journal of Computer Applications },
issue_date = { April 2012 },
volume = { 43 },
number = { 2 },
month = { April },
year = { 2012 },
issn = { 0975-8887 },
pages = { 1-3 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume43/number2/6072-7456/ },
doi = { 10.5120/6072-7456 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:32:47.544060+05:30
%A Sovan Kumar Patnaik
%A Soumya Sahoo
%A Dillip Kumar Swain
%T Clustering of Categorical Data by Assigning Rank through Statistical Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 43
%N 2
%P 1-3
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Most earlier work on clustering has focused on numerical data, whose inherent geometric properties can be exploited to define distance functions between data points naturally. Methods that work only on numeric values cannot be used to cluster real-world data containing categorical values. Recently, the problem of clustering categorical data has started to draw interest. The k-means algorithm is well known for its efficiency in clustering large data sets. In this paper, we extend the k-means algorithm to categorical domains by assigning rank values to the attributes.
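
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how ranks are assigned, so the sketch assumes a frequency-based rule (rank 1 for the most frequent category in each attribute, ties broken alphabetically). The function names `rank_encode` and `kmeans` are hypothetical, and the centroid initialisation (the first k distinct points) is chosen only to keep the example deterministic.

```python
from collections import Counter


def rank_encode(rows):
    """Replace each categorical value with a numeric rank within its column.

    Assumption (not from the paper): rank 1 goes to the most frequent
    category, with ties broken alphabetically.
    """
    n_cols = len(rows[0])
    encoded_cols = []
    for j in range(n_cols):
        counts = Counter(row[j] for row in rows)
        ordered = sorted(counts, key=lambda v: (-counts[v], v))
        rank = {value: i + 1 for i, value in enumerate(ordered)}
        encoded_cols.append([float(rank[row[j]]) for row in rows])
    # Transpose the columns back into one numeric point per row.
    return [list(point) for point in zip(*encoded_cols)]


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain Lloyd-style k-means on numeric points.

    Assumption: centroids start at the first k distinct points.
    """
    centroids = []
    for p in points:
        if p not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    if len(centroids) < k:
        raise ValueError("fewer than k distinct points")

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

For example, encoding the rows `("red", "small")`, `("red", "small")`, `("blue", "large")`, `("blue", "large")`, `("red", "small")` maps the first, second, and fifth rows to `[1.0, 1.0]` and the other two to `[2.0, 2.0]`, and `kmeans(encoded, 2)` then separates those two groups. Note that any rank encoding imposes an ordering on categories that may have none, so the resulting distances depend on the ranking rule chosen.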

Index Terms

Computer Science
Information Sciences

Keywords

Categorical Data, K-means, Rank Value