Research Article

K-modes Clustering Algorithm for Categorical Data

by Neha Sharma, Nirmal Gaud
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 127 - Number 17
Year of Publication: 2015
DOI: 10.5120/ijca2015906708

Neha Sharma, Nirmal Gaud. K-modes Clustering Algorithm for Categorical Data. International Journal of Computer Applications. 127, 17 (October 2015), 1-6. DOI=10.5120/ijca2015906708

@article{ 10.5120/ijca2015906708,
author = { Neha Sharma and Nirmal Gaud },
title = { K-modes Clustering Algorithm for Categorical Data },
journal = { International Journal of Computer Applications },
issue_date = { October 2015 },
volume = { 127 },
number = { 17 },
month = { October },
year = { 2015 },
issn = { 0975-8887 },
pages = { 1-6 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume127/number17/22818-2015906708/ },
doi = { 10.5120/ijca2015906708 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:18:15.652950+05:30
%A Neha Sharma
%A Nirmal Gaud
%T K-modes Clustering Algorithm for Categorical Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 127
%N 17
%P 1-6
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Partitional clustering is commonly performed using K-modes clustering algorithms, which work well for large datasets. The K-modes technique uses randomly chosen initial cluster centres (modes) as seeds, so the clustering results often depend on the choice of initial cluster centres, and the resulting cluster structure may not be repeatable. K-modes has been widely applied to clustering categorical data by replacing means with modes. Previous algorithms select attributes on the basis of frequency, which has not given the best results. The proposed algorithm selects attributes on the basis of information gain, which gives better results. Experimental results show that the proposed technique achieves better accuracy.
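
As a rough illustration of the baseline the abstract describes, the following Python sketch shows a plain K-modes loop: initial modes are drawn at random, objects are assigned to the nearest mode under simple matching (Hamming) dissimilarity, and each mode is then updated to the most frequent value of every attribute in its cluster. It is only a minimal sketch of standard K-modes, not the proposed information-gain-based algorithm, whose details are not given in the abstract. The function names (`hamming`, `column_modes`, `k_modes`), the `seed` parameter, and the toy data are assumptions made for illustration.

```python
import random
from collections import Counter


def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent value of each attribute across the given rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(data, k, seed=0, max_iter=100):
    """Baseline K-modes with random initial modes (illustrative only)."""
    rng = random.Random(seed)
    # Random initialization: this is the step that makes results seed-dependent.
    modes = rng.sample(data, k)
    labels = None
    for _ in range(max_iter):
        # Assign every object to the closest mode.
        new_labels = [
            min(range(k), key=lambda j: hamming(row, modes[j])) for row in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes will not change
        labels = new_labels
        # Update each mode from the members of its cluster.
        for j in range(k):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mode
                modes[j] = column_modes(members)
    return labels, modes


# Hypothetical toy data: (colour, size, shape)
data = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "oval"),
    ("blue", "small", "square"),
]

labels, modes = k_modes(data, k=2, seed=1)
print(labels)
print(modes)
```

Running the sketch with different `seed` values may give different partitions of the same data, which is the sensitivity to initial cluster centres that the abstract refers to.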

Index Terms

Computer Science
Information Sciences

Keywords

Clustering, Categorical data, K-means algorithm, K-modes algorithm, Text mining