An Improved Optimized Web Page Classification using Firefly Algorithm with NB Classifier (WPCNB)

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2016
Authors:
Khushboo Bhatt, Anju Singh, Divakar Singh
10.5120/ijca2016910668

Khushboo Bhatt, Anju Singh and Divakar Singh. An Improved Optimized Web Page Classification using Firefly Algorithm with NB Classifier (WPCNB). International Journal of Computer Applications 146(4):15-21, July 2016. BibTeX

@article{10.5120/ijca2016910668,
	author = {Khushboo Bhatt and Anju Singh and Divakar Singh},
	title = {An Improved Optimized Web Page Classification using Firefly Algorithm with NB Classifier (WPCNB)},
	journal = {International Journal of Computer Applications},
	issue_date = {July 2016},
	volume = {146},
	number = {4},
	month = {Jul},
	year = {2016},
	issn = {0975-8887},
	pages = {15-21},
	numpages = {7},
	url = {http://www.ijcaonline.org/archives/volume146/number4/25385-2016910668},
	doi = {10.5120/ijca2016910668},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}
}

Abstract

The Web is a huge repository of information, and accurate automated Web page classifiers are needed to maintain Web directories and to improve search engine performance. In the Web page classification problem, each term in each HTML/XML tag of each Web page can be taken as a feature, so an efficient method for selecting the best features and reducing the feature space is derived here. Classification of Web page content is essential to many tasks in Web information retrieval, such as maintaining Web directories and focused crawling. The uncontrolled nature of Web content presents additional challenges to Web page classification as compared to traditional text classification, but the interconnected nature of hypertext also provides features that can assist the process. This work reviews Web page classification, notes the importance of these Web-specific features and algorithms, describes state-of-the-art practices, and tracks the underlying assumptions behind the use of information from neighboring pages. Our aim in this work is to optimize the selection of the best features for the Web page classification problem. The Firefly Algorithm (FA) is a recent nature-inspired optimization algorithm that simulates the flash pattern and characteristics of fireflies. Clustering is a popular data analysis technique to identify homogeneous groups of objects based on the values of their attributes. Here FA is used for clustering on benchmark problems, where it is found to be more suitable than Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO), and the nine other methods used. Web page optimization using the Naïve Bayes classifier (WPCNB) is an improved, optimized Web page classification approach that uses the firefly algorithm with an NB classifier. This work is tested on a research banking data set, where the firefly algorithm is used for Web optimization and the Naïve Bayes (NB) classifier is used to classify pages against the pages selected with reference to different fireflies.
The proposed work is found to be better than existing key approaches in terms of parameters such as feature measure (FM), accuracy, and precision. It is also a search optimization approach and can be enhanced in the future by using different genetic algorithm (GA) based classifiers.
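
The combination described above, firefly-based feature selection scored by a Naïve Bayes classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic data, the Bernoulli NB variant, the 0.5 threshold that turns a firefly position into a feature mask, and all names and parameter values (`firefly_select`, `beta0`, `gamma`, `alpha`) are assumptions made for illustration. The movement rule follows the standard firefly update, in which a dimmer firefly moves toward a brighter one with attractiveness beta0 * exp(-gamma * r^2) plus a small random step.

```python
import math
import random

def train_nb(X, y, mask):
    """Fit a Bernoulli Naive Bayes model on the features selected by mask."""
    feats = [j for j, keep in enumerate(mask) if keep]
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = math.log(len(rows) / len(X))
        # Laplace-smoothed probability that feature j is present in class c
        probs = {j: (sum(r[j] for r in rows) + 1) / (len(rows) + 2) for j in feats}
        model[c] = (prior, probs)
    return model

def predict_nb(model, x):
    """Return the class with the highest log-posterior score for sample x."""
    def score(c):
        prior, probs = model[c]
        return prior + sum(math.log(p if x[j] else 1 - p) for j, p in probs.items())
    return max(model, key=score)

def fitness(pos, train, valid):
    """Brightness of a firefly: NB accuracy on valid using features with pos > 0.5."""
    mask = [p > 0.5 for p in pos]
    if not any(mask):
        return 0.0  # an empty feature set cannot classify anything
    model = train_nb(train[0], train[1], mask)
    hits = sum(predict_nb(model, x) == label for x, label in zip(*valid))
    return hits / len(valid[0])

def clamp(v):
    return min(1.0, max(0.0, v))

def firefly_select(train, valid, n_features, n_fireflies=8, n_iter=15,
                   beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Search for a feature mask; parameter values are illustrative only."""
    rng = random.Random(seed)
    swarm = [[rng.random() for _ in range(n_features)] for _ in range(n_fireflies)]
    light = [fitness(p, train, valid) for p in swarm]
    best_i = max(range(n_fireflies), key=light.__getitem__)
    initial_best = light[best_i]
    best_pos, best_fit = swarm[best_i][:], light[best_i]
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] > light[i]:  # move firefly i toward brighter firefly j
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    swarm[i] = [clamp(a + beta * (b - a) + alpha * (rng.random() - 0.5))
                                for a, b in zip(swarm[i], swarm[j])]
                    light[i] = fitness(swarm[i], train, valid)
                    if light[i] > best_fit:
                        best_pos, best_fit = swarm[i][:], light[i]
    return [p > 0.5 for p in best_pos], best_fit, initial_best

def make_data(n, rng):
    """Toy pages: features 0-1 correlate with the class, features 2-5 are noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = [int(rng.random() < (0.9 if label else 0.1)) for _ in range(2)]
        noise = [rng.randint(0, 1) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)
    return X, y

data_rng = random.Random(42)
train, valid = make_data(80, data_rng), make_data(40, data_rng)
mask, best_fit, initial_best = firefly_select(train, valid, n_features=6)
print("selected features:", [j for j, keep in enumerate(mask) if keep])
print("validation accuracy:", best_fit)
```

In this sketch the brightness of a firefly is simply the validation accuracy of the NB classifier trained on the selected features. A fuller evaluation would also report precision and F-measure on a separate held-out test set, since the validation split here is already used to guide the search.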

Keywords

Feature, Naïve Bayes classifier, J48, F-measure.