Research Article

A Review on Clustering Web data using PSO

by Jayshree Ghorpade-aher, Roshan Bagdiya
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 108 - Number 6
Year of Publication: 2014
Authors: Jayshree Ghorpade-aher, Roshan Bagdiya
10.5120/18917-0245

Jayshree Ghorpade-aher, Roshan Bagdiya. A Review on Clustering Web data using PSO. International Journal of Computer Applications. 108, 6 (December 2014), 31-36. DOI=10.5120/18917-0245

@article{ 10.5120/18917-0245,
author = { Jayshree Ghorpade-aher and Roshan Bagdiya },
title = { A Review on Clustering Web data using PSO },
journal = { International Journal of Computer Applications },
issue_date = { December 2014 },
volume = { 108 },
number = { 6 },
month = { December },
year = { 2014 },
issn = { 0975-8887 },
pages = { 31-36 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume108/number6/18917-0245/ },
doi = { 10.5120/18917-0245 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:42:17.685389+05:30
%A Jayshree Ghorpade-aher
%A Roshan Bagdiya
%T A Review on Clustering Web data using PSO
%J International Journal of Computer Applications
%@ 0975-8887
%V 108
%N 6
%P 31-36
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The amount of information available on the World Wide Web, the largest shared information source, is growing rapidly. Because the web is widely distributed, open, and highly dynamic, its resources are scattered and lack unified management and structure. Roughly 90% of web data is unstructured, and it needs to be structured because the lack of structure greatly reduces the efficiency of using web information. Web text feature extraction and clustering are the main challenges in web data mining, and they require an efficient clustering technique. Data mining tasks require fast and accurate partitioning of huge volumes of unstructured data, which may come with a variety of dimensions and attributes. This paper focuses on the clustering techniques that are useful for web data clustering. We perform a literature survey and describe an evolutionary, bio-inspired Swarm Intelligence algorithm called Particle Swarm Optimization (PSO) for obtaining optimized clustering results. To preprocess the input data, improving accuracy and optimizing keyword searching, stop word removal and stemming are used. The PSO algorithm can greatly improve the efficiency of web text processing, and such evolutionary clustering techniques are used for web text data clustering.
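
The pipeline outlined in the abstract (stop word removal, stemming, then PSO-based clustering) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The stop-word list, the suffix-stripping "stemmer", the term-frequency vectors, the global-best PSO variant, and the parameter values (`w`, `c1`, `c2`, swarm size) are all assumptions chosen for illustration, as are the function names `preprocess`, `vectorize`, and `pso_cluster`. Each particle encodes `k` candidate centroids, and the fitness is the sum of distances from every document vector to its nearest centroid.

```python
import math
import random

# Tiny illustrative stop-word list (an assumption; real systems use much larger lists).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "on", "for"}


def preprocess(text):
    """Lowercase, drop stop words, and strip a few suffixes (a crude stand-in for stemming)."""
    stems = []
    for token in text.lower().split():
        if not token.isalpha() or token in STOP_WORDS:
            continue
        for suffix in ("ing", "ed", "s"):
            if token.endswith(suffix) and len(token) > len(suffix) + 2:
                token = token[: -len(suffix)]
                break
        stems.append(token)
    return stems


def vectorize(token_lists):
    """Turn token lists into term-frequency vectors over a shared, sorted vocabulary."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    return [[tokens.count(t) / max(len(tokens), 1) for t in vocab] for tokens in token_lists]


def split_centroids(position, k, dim):
    """Cut a flat particle position into k centroids of length dim."""
    return [position[j * dim:(j + 1) * dim] for j in range(k)]


def fitness(position, data, k):
    """Sum of Euclidean distances from each point to its nearest centroid (lower is better)."""
    centroids = split_centroids(position, k, len(data[0]))
    return sum(min(math.dist(x, c) for c in centroids) for x in data)


def pso_cluster(data, k, n_particles=20, iters=100, w=0.72, c1=1.49, c2=1.49, seed=0):
    """Global-best PSO over centroid positions; returns (labels, best fitness)."""
    rng = random.Random(seed)
    dim = len(data[0])
    size = k * dim
    # Initialise every particle from k randomly chosen data points (requires k <= len(data)).
    swarm = [[v for x in rng.sample(data, k) for v in x] for _ in range(n_particles)]
    vel = [[0.0] * size for _ in range(n_particles)]
    pbest = [p[:] for p in swarm]
    pbest_fit = [fitness(p, data, k) for p in swarm]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(size):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + pull towards personal best + pull towards global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - swarm[i][d])
                             + c2 * r2 * (gbest[d] - swarm[i][d]))
                swarm[i][d] += vel[i][d]
            f = fitness(swarm[i], data, k)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = swarm[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = swarm[i][:], f
    centroids = split_centroids(gbest, k, dim)
    labels = [min(range(k), key=lambda j: math.dist(x, centroids[j])) for x in data]
    return labels, gbest_fit


if __name__ == "__main__":
    pages = ["Cats and dogs are pets", "Dogs chasing cats",
             "Stocks and bonds trading", "Trading stocks in markets"]
    vectors = vectorize([preprocess(p) for p in pages])
    labels, error = pso_cluster(vectors, k=2)
    print(labels, round(error, 3))
```

In practice the term-frequency vectors would likely be replaced by TF-IDF weights and the suffix stripping by an established stemmer. The sketch keeps to the standard library so that the structure of the swarm update stays visible.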

Index Terms

Computer Science
Information Sciences

Keywords

Particle Swarm Optimization, Clustering, Evolutionary Algorithm, Web data