BPSO Optimized K-means Clustering Approach for Data Analysis

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2016
Authors:
Juhi Gupta, Aakanksha Mahajan
DOI: 10.5120/ijca2016907945

Juhi Gupta and Aakanksha Mahajan. BPSO Optimized K-means Clustering Approach for Data Analysis. International Journal of Computer Applications 133(15):9-14, January 2016. Published by Foundation of Computer Science (FCS), NY, USA. BibTeX

@article{key:article,
	author = {Juhi Gupta and Aakanksha Mahajan},
	title = {BPSO Optimized K-means Clustering Approach for Data Analysis},
	journal = {International Journal of Computer Applications},
	year = {2016},
	volume = {133},
	number = {15},
	pages = {9--14},
	month = {January},
	note = {Published by Foundation of Computer Science (FCS), NY, USA}
}

Abstract

In data mining, K-means (KM) clustering is well known for its efficiency on large data sets. The aim of clustering is to group similar items together, so that objects in the same cluster are as close to each other as possible (homogeneity) and objects in different clusters are as far apart as possible (separation).
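
As an illustration of the grouping described above, here is a minimal sketch of the classical K-means iteration in plain Python. It is not taken from the paper: the function names (`sq_dist`, `assign`, `sse`, `kmeans`), the squared Euclidean distance, and the toy data are assumptions made for this example.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def sse(points, centroids, labels):
    """Sum of squared errors: squared distance of each point to its own centroid."""
    return sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))

def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Classical K-means; `init` optionally fixes the starting centroids."""
    rng = random.Random(seed)
    start = init if init is not None else rng.sample(points, k)
    centroids = [tuple(c) for c in start]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        new_centroids = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                # Mean of the cluster, computed coordinate by coordinate.
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        new_labels = assign(points, new_centroids)
        centroids = new_centroids
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels, sse(points, centroids, labels)

# Toy data (hypothetical): two tight pairs of points, far apart from each other.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
centroids, labels, error = kmeans(points, 2, init=[(0.0, 0.0), (10.0, 0.0)])
```

On this toy data each pair ends up in its own cluster, so points within a cluster are close together while the two clusters are far apart.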

However, the classical K-means algorithm has two weaknesses. First, it is sensitive to the choice of initial centroids and is easily trapped in a local minimum of its objective, the sum of squared errors. Second, the K-means (KM) problem of finding a global minimum of the sum of squared errors is NP-hard even when the number of clusters is 2 or the number of attributes per data point is 2, so finding the optimal clustering is believed to be computationally intractable.
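
The sensitivity to initial centroids can be seen on a four-point example. The sketch below is an illustration under assumed data, not the authors' code: `lloyd_step` is a hypothetical helper that performs one assignment-and-update step, and both starting configurations turn out to be fixed points of that step, although their sums of squared errors differ widely.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def lloyd_step(points, centroids):
    """One K-means iteration: assign to the nearest centroid, then recompute means."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        clusters[nearest].append(p)
    new_centroids = [
        tuple(sum(x) / len(c) for x in zip(*c)) if c else old
        for c, old in zip(clusters, centroids)
    ]
    error = sum(sq_dist(p, m) for c, m in zip(clusters, new_centroids) for p in c)
    return new_centroids, error

# Corners of a 4 x 1 rectangle (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Start that splits the rectangle into its left and right sides.
good_start = [(0.0, 0.5), (4.0, 0.5)]
# Start that splits it into bottom and top rows instead.
bad_start = [(2.0, 0.0), (2.0, 1.0)]

good_centroids, good_sse = lloyd_step(points, good_start)
bad_centroids, bad_sse = lloyd_step(points, bad_start)

# Neither start moves after a step, so K-means stops at both of them,
# but the second one is stuck with a much larger sum of squared errors.
print(good_centroids == good_start, good_sse)  # True 1.0
print(bad_centroids == bad_start, bad_sse)     # True 16.0
```

Because the update step cannot leave the bottom/top split, the algorithm has no way to reach the better left/right split from that start, which is the local-minimum behaviour described above.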

In this dissertation, the KM clustering problem is addressed with an optimized KM algorithm. The proposed algorithm, named BPSO (Bacterial Particle Swarm Optimization), considers how to derive an optimization model for the minimum sum of squared errors of a given data set. Two evolutionary optimization algorithms, BFO (Bacterial Foraging Optimization) and PSO (Particle Swarm Optimization), are combined to optimize the KM algorithm, with the aim of producing more accurate clusterings than the basic KM algorithm. The F-measure is used to compare the basic K-means algorithm with the BPSO algorithm.
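
The abstract does not give the details of BPSO, so the following is only a rough sketch of one ingredient: plain PSO searching for centroids that minimize the sum of squared errors. The BFO component and the way the two methods are combined are omitted. All names and parameter values here (`pso_centroids`, `inertia`, `c1`, `c2`, the swarm size) are assumptions for illustration, not the authors' settings.

```python
import random

def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
        for p in points
    )

def pso_centroids(points, k, n_particles=10, n_iter=50,
                  inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain PSO over centroid positions; a particle is a flat list of k*d coordinates."""
    rng = random.Random(seed)
    d = len(points[0])

    def decode(position):
        # Split the flat coordinate list back into k centroids of dimension d.
        return [tuple(position[j * d:(j + 1) * d]) for j in range(k)]

    def fitness(position):
        return sse(points, decode(position))

    # Each particle starts at k randomly chosen data points, with zero velocity.
    positions = [[x for p in rng.sample(points, k) for x in p]
                 for _ in range(n_particles)]
    velocities = [[0.0] * (k * d) for _ in range(n_particles)]
    best_pos = [pos[:] for pos in positions]        # personal bests
    best_fit = [fitness(pos) for pos in positions]
    g = min(range(n_particles), key=lambda i: best_fit[i])
    global_pos, global_fit = best_pos[g][:], best_fit[g]
    initial_fit = global_fit

    for _ in range(n_iter):
        for i in range(n_particles):
            for t in range(k * d):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + pull to personal and global best.
                velocities[i][t] = (inertia * velocities[i][t]
                                    + c1 * r1 * (best_pos[i][t] - positions[i][t])
                                    + c2 * r2 * (global_pos[t] - positions[i][t]))
                positions[i][t] += velocities[i][t]
            fit = fitness(positions[i])
            if fit < best_fit[i]:
                best_pos[i], best_fit[i] = positions[i][:], fit
                if fit < global_fit:
                    global_pos, global_fit = positions[i][:], fit
    return decode(global_pos), global_fit, initial_fit

# Hypothetical toy data: two loose groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.5),
          (8.0, 0.0), (8.0, 1.0), (9.0, 0.5)]
centroids, final_sse, initial_sse = pso_centroids(points, k=2)
```

Because the global best is only replaced by a strictly better position, `final_sse` is never worse than `initial_sse`. In a hybrid scheme of the kind the abstract describes, centroids found this way would typically be passed to K-means as its starting point, but that coupling is an assumption here.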

Keywords

PSO (Particle Swarm Optimization), BFO (Bacterial Foraging Optimization), KDD (Knowledge Discovery in Databases), BPSO (Bacterial Particle Swarm Optimization), KM (K-Means)