Research Article

Comparative Study of Fuzzy k-Nearest Neighbor and Fuzzy C-means Algorithms

by Pradeep Kumar Jena, Subhagata Chattopadhyay
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 57 - Number 7
Year of Publication: 2012
DOI: 10.5120/9127-3294

Pradeep Kumar Jena, Subhagata Chattopadhyay. Comparative Study of Fuzzy k-Nearest Neighbor and Fuzzy C-means Algorithms. International Journal of Computer Applications. 57, 7 (November 2012), 22-32. DOI=10.5120/9127-3294

@article{ 10.5120/9127-3294,
author = { Pradeep Kumar Jena, Subhagata Chattopadhyay },
title = { Comparative Study of Fuzzy k-Nearest Neighbor and Fuzzy C-means Algorithms },
journal = { International Journal of Computer Applications },
issue_date = { November 2012 },
volume = { 57 },
number = { 7 },
month = { November },
year = { 2012 },
issn = { 0975-8887 },
pages = { 22-32 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume57/number7/9127-3294/ },
doi = { 10.5120/9127-3294 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:01:19.915705+05:30
%A Pradeep Kumar Jena
%A Subhagata Chattopadhyay
%T Comparative Study of Fuzzy k-Nearest Neighbor and Fuzzy C-means Algorithms
%J International Journal of Computer Applications
%@ 0975-8887
%V 57
%N 7
%P 22-32
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Fuzzy clustering techniques handle the fuzzy relationships among the data points and between the data points and the cluster centers (termed here cluster fuzziness). Distance measures, in turn, are important for quantifying the extent of such fuzziness. These two parameters govern the quality of the clusters and the run time. Visualizing multidimensional data clusters in lower dimensions is another important research area, since it helps reveal hidden patterns within the clusters. This paper investigates the effects of cluster fuzziness and of three distance measures, namely Manhattan distance (MH), Euclidean distance (ED), and Cosine distance (COS), on the Fuzzy c-means (FCM) and Fuzzy k-nearest neighbor (FkNN) clustering techniques, applied to the Iris and extended Wine data sets. Cluster quality is assessed based on (i) the data discrepancy factor (DDF, proposed in this study), (ii) cluster size, (iii) compactness, (iv) distinctiveness, (v) execution time, and (vi) cluster fuzziness (m) values. The study observes that FCM handles cluster fuzziness better than FkNN, and that the MH distance measure yields the best clusters with both FCM and FkNN. Finally, the best clusters are visualized using a Self-Organizing Map (SOM).
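
The abstract names three distance measures and a fuzziness parameter m but does not show how they are computed. The following is a minimal sketch, assuming the textbook definitions of the Manhattan, Euclidean, and Cosine distances and the standard FCM membership formula; it is not taken from the paper. The function names (`manhattan`, `euclidean`, `cosine_distance`, `fcm_memberships`) are hypothetical, and using a non-Euclidean distance inside the FCM membership formula is an assumption about how such a comparison could be set up.

```python
import math

def manhattan(a, b):
    # MH: sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    # ED: square root of the sum of squared coordinate differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # COS: one minus the cosine similarity (undefined for zero vectors).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def fcm_memberships(point, centers, m=2.0, dist=euclidean):
    # Membership of one point in each cluster, using the standard FCM rule
    # u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with fuzzifier m > 1.
    d = [dist(point, c) for c in centers]
    zeros = [i for i, di in enumerate(d) if di == 0.0]
    if zeros:
        # A point lying exactly on a center belongs fully to that center.
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** exponent for dk in d) for di in d]

if __name__ == "__main__":
    centers = [(0.0, 0.0), (3.0, 0.0)]
    # A larger m spreads the membership more evenly across clusters.
    print(fcm_memberships((1.0, 0.0), centers, m=2.0))
    print(fcm_memberships((1.0, 0.0), centers, m=3.0))
    print(fcm_memberships((1.0, 0.0), centers, m=2.0, dist=manhattan))
```

In this sketch, raising m from 2 to 3 moves the memberships of the sample point closer to an even split, which is the sense in which m controls cluster fuzziness.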

Index Terms

Computer Science
Information Sciences

Keywords

Fuzzy clusters, FkNN, FCM, Cluster fuzziness, Data discrepancy factor (DDF)