
Analysis of Clustering Algorithm of Weka Tool on Air Pollution Dataset

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2017
Richa Agrawal, Jitendra Agrawal

Richa Agrawal and Jitendra Agrawal. Analysis of Clustering Algorithm of Weka Tool on Air Pollution Dataset. International Journal of Computer Applications 168(13):1-5, June 2017.

	author = {Richa Agrawal and Jitendra Agrawal},
	title = {Analysis of Clustering Algorithm of Weka Tool on Air Pollution Dataset},
	journal = {International Journal of Computer Applications},
	issue_date = {June 2017},
	volume = {168},
	number = {13},
	month = {Jun},
	year = {2017},
	issn = {0975-8887},
	pages = {1-5},
	numpages = {5},
	url = {},
	doi = {10.5120/ijca2017914522},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}


Data mining is the process of extracting knowledge from large amounts of data, which may be stored in databases and information repositories. Data mining tasks can be divided into two models: descriptive and predictive. The predictive model predicts values from a different set of sample data and is classified into three types: classification, regression, and time series. The descriptive model enables us to determine patterns in sample data and is subdivided into clustering, summarization, and association rules. Clustering creates groups of classes based on the patterns and relationships in the data. There are different types of clustering algorithms, such as partition-based and density-based algorithms. This paper analyzes and compares various clustering algorithms using the WEKA tool to find out which algorithm is most convenient for users performing clustering. It also presents applications of the WEKA data mining tool, which can cluster huge data sets, and notes that clustering can help in search engine optimization.
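As a rough illustration of the partition-based clustering mentioned above, the following minimal sketch implements plain k-means (Lloyd's algorithm) in Python. It is not WEKA's implementation and not the authors' experimental setup; the function name `kmeans`, the choice of the first k points as initial centroids, and the pollutant readings are all assumptions made for this example.

```python
import math


def kmeans(points, k, iterations=20):
    """Minimal Lloyd's-algorithm k-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Hypothetical (PM2.5, NO2) readings, invented for illustration only.
readings = [(12.0, 20.0), (15.0, 22.0), (14.0, 19.0),
            (80.0, 60.0), (85.0, 65.0), (78.0, 58.0)]
labels, centers = kmeans(readings, k=2)
print(labels)
print(centers)
```

With these invented readings the low and high pollution samples end up in separate clusters. A real comparison would also need to consider initialization, attribute scaling, and the choice of k, none of which this sketch addresses.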




Data Mining, Clustering algorithms, K-mean, LVQ, SOM, cobweb, WEKA
