Research Article

A New Efficient Approach towards k-means Clustering Algorithm

by Pallavi Purohit, Ritesh Joshi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 65 - Number 11
Year of Publication: 2013
Authors: Pallavi Purohit, Ritesh Joshi
DOI: 10.5120/10966-6097

Pallavi Purohit, Ritesh Joshi. A New Efficient Approach towards k-means Clustering Algorithm. International Journal of Computer Applications 65, 11 (March 2013), 7-10. DOI=10.5120/10966-6097

@article{ 10.5120/10966-6097,
author = { Pallavi Purohit, Ritesh Joshi },
title = { A New Efficient Approach towards k-means Clustering Algorithm },
journal = { International Journal of Computer Applications },
issue_date = { March 2013 },
volume = { 65 },
number = { 11 },
month = { March },
year = { 2013 },
issn = { 0975-8887 },
pages = { 7-10 },
numpages = {4},
url = { https://ijcaonline.org/archives/volume65/number11/10966-6097/ },
doi = { 10.5120/10966-6097 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%A Pallavi Purohit
%A Ritesh Joshi
%T A New Efficient Approach towards k-means Clustering Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 65
%N 11
%P 7-10
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The k-means clustering algorithm is widely used in many practical applications. The original k-means algorithm selects the initial centroids and medoids randomly, which affects the quality of the resulting clusters and sometimes produces unstable or empty clusters that are meaningless. The original algorithm is also computationally expensive, requiring time proportional to the product of the number of data items, the number of clusters, and the number of iterations. The new approach to the k-means algorithm eliminates this deficiency of the existing method. It first computes the k initial centroids, as specified by the user, and then produces better, more effective clusters without sacrificing accuracy. It generates stable clusters, reduces the mean squared error, and improves the quality of the clustering. We also apply the algorithm to the evaluation of students' academic performance, to support effective decision making by student counselors.
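The abstract does not spell out how the k initial centroids are computed, so the sketch below is only an illustration of the general idea of replacing random seeds with a deterministic initialization: standard Lloyd iterations paired with a simple seeding scheme (sort points by distance from the origin and take one representative per partition). The function names `deterministic_init` and `kmeans` and the seeding rule are assumptions for illustration, not the paper's actual method.

```python
import math

def deterministic_init(points, k):
    """Pick k initial centroids deterministically: rank points by their
    distance from the origin, split the ranking into k equal partitions,
    and take the middle point of each partition as a seed."""
    ranked = sorted(points, key=lambda p: math.dist(p, (0,) * len(p)))
    step = len(ranked) // k
    return [ranked[i * step + step // 2] for i in range(k)]

def kmeans(points, k, iters=100):
    """Standard Lloyd iterations on top of the deterministic seeds."""
    centroids = deterministic_init(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for j, cluster in enumerate(clusters):
            if cluster:  # deterministic seeding makes empty clusters unlikely
                dim = len(cluster[0])
                new_centroids.append(tuple(
                    sum(p[d] for p in cluster) / len(cluster)
                    for d in range(dim)))
            else:
                new_centroids.append(centroids[j])
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, clusters
```

Because the seeds depend only on the data, repeated runs yield the same partition, which is the stability property the abstract emphasizes; random initialization, by contrast, can yield a different (and occasionally empty) clustering on each run.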

References
  1. Dechang Pi, Xiaolin Qin and Qiang Wang, "Fuzzy Clustering Algorithm Based on Tree for Association Rules", International Journal of Information Technology, Vol. 12, No. 3, 2006.
  2. Fahim A. M., Salem A. M., "Efficient enhanced k-means clustering algorithm", Journal of Zhejiang University Science, 1626-1633, 2006.
  3. Fang Yuan, Zeng Hui Meng, "A New Algorithm to Get the Initial Centroids", Third International Conference on Machine Learning and Cybernetics, Shanghai, 26-29 August, 1191-1193, 2004.
  4. Friedrich Leisch and Bettina Grün, "Extending Standard Cluster Algorithms to Allow for Group Constraints", Compstat 2006, Proceedings in Computational Statistics, Physica-Verlag, Heidelberg, Germany, 2006.
  5. J. MacQueen, "Some Methods for Classification and Analysis of Multivariate Observations", University of California, Los Angeles, 281-297.
  6. Maria Camila N. Barioni, Humberto L. Razente, Agma J. M. Traina, "An efficient approach to scale up k-medoid based algorithms in large databases", 265-279.
  7. Michael Steinbach, Levent Ertöz and Vipin Kumar, "Challenges in High Dimensional Data Sets", International Conference of Data Management, Vol. 2, No. 3, 2005.
  8. Parsons L., Haque E., and Liu H., "Subspace clustering for high dimensional data: A review", SIGKDD Explorations Newsletter 6, 90-105, 2004.
  9. Rui Xu, Donald Wunsch, "Survey of Clustering Algorithms", IEEE Transactions on Neural Networks, Vol. 16, No. 3, May 2005.
  10. Sanjay Garg, Ramesh Chandra Jain, "Variations of k-means Algorithm: A Study for High-Dimensional Large Data Sets", Information Technology Journal 5(6), 1132-1135, 2006.
  11. Vance Faber, "Clustering and the Continuous k-means Algorithm", Los Alamos Science, Georgian Electronics Scientific Journal: Computer Science and Telecommunications, Vol. 4, No. 3, 1994.
  12. Zhexue Huang, "A Fast Clustering Algorithm to Cluster Very Large Categorical Data Sets in Data Mining".
  13. Nathan Rountree, "Further Data Mining: Building Decision Trees", first presented 28 July 1999.
  14. Yang Liu, "Introduction to Rough Set Theory and Its Application in Decision Support System".
  15. Wei-Yin Loh, "Regression trees with unbiased variable selection and interaction detection", University of Wisconsin-Madison.
  16. S. Rasoul Safavian and David Landgrebe, "A Survey of Decision Tree Classifier Methodology", School of Electrical Engineering, Purdue University, West Lafayette, IN 47907.
  17. David S. Vogel, Ognian Asparouhov and Tobias Scheffer, "Scalable Look-Ahead Linear Regression Trees".
  18. Alin Dobra, "Classification and Regression Tree Construction", Thesis Proposal, Department of Computer Science, Cornell University, Ithaca, NY, November 25, 2002.
  19. Yinmei Huang, "Classification and regression tree (CART) analysis: methodological review and its application", Ph.D. student, Department of Sociology, The University of Akron, Olin Hall 247, Akron, OH 44325-1905.
  20. Yan X. and Han J. (2003), "gSpan: Graph-Based Substructure Pattern Mining", Proc. 2nd IEEE Int. Conf. on Data Mining (ICDM 2003, Maebashi, Japan), 721-724, IEEE Press, Piscataway, NJ, USA.
Index Terms

Computer Science
Information Sciences

Keywords

Cluster analysis, Centroids, k-means