Research Article

Design of Web Ranking Module using Genetic Algorithm

by Vikas Thada, Vivek Jaglan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 97 - Number 9
Year of Publication: 2014
Authors: Vikas Thada, Vivek Jaglan

Vikas Thada, Vivek Jaglan. Design of Web Ranking Module using Genetic Algorithm. International Journal of Computer Applications. 97, 9 (July 2014), 43-48. DOI=10.5120/17038-7346

@article{ 10.5120/17038-7346,
author = { Vikas Thada, Vivek Jaglan },
title = { Design of Web Ranking Module using Genetic Algorithm },
journal = { International Journal of Computer Applications },
issue_date = { July 2014 },
volume = { 97 },
number = { 9 },
month = { July },
year = { 2014 },
issn = { 0975-8887 },
pages = { 43-48 },
numpages = {9},
url = { },
doi = { 10.5120/17038-7346 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:23:41.778777+05:30
%A Vikas Thada
%A Vivek Jaglan
%T Design of Web Ranking Module using Genetic Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 97
%N 9
%P 43-48
%D 2014
%I Foundation of Computer Science (FCS), NY, USA

Crawling is the process by which web search engines collect data from the web. Focused crawling is a special type of crawling in which the crawler looks for information related to a predefined topic [1]. This paper presents a method for finding the most relevant document among a set of documents for a given set of keywords. Relevance is checked with the Rogers-Tanimoto, Mountford and Baroni-Urbani/Buser similarity coefficients. The method uses a genetic algorithm to show that the average similarity of documents to the query increases when the probability of mutation is low and the probability of crossover is high. It also analyses the performance of the different similarity coefficients on the same set of documents and ranks the documents whose relevancy is highest among the three coefficients.
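As a rough illustration of the relevance check described above, the sketch below computes the three coefficients for a query and a document represented as binary keyword vectors. The formulas are the commonly cited textbook forms, not taken from the paper itself, and the vector encoding, the function names and the zero-denominator handling are assumptions made for this example.

```python
from math import sqrt


def contingency(query, doc):
    """Count matches between two equal-length binary vectors.

    a: keyword present in both, b: only in query,
    c: only in document, d: absent from both.
    """
    if len(query) != len(doc):
        raise ValueError("vectors must have the same length")
    pairs = list(zip(query, doc))
    a = sum(1 for q, x in pairs if q and x)
    b = sum(1 for q, x in pairs if q and not x)
    c = sum(1 for q, x in pairs if not q and x)
    d = sum(1 for q, x in pairs if not q and not x)
    return a, b, c, d


def rogers_tanimoto(a, b, c, d):
    # (a + d) / (a + 2(b + c) + d)
    denom = a + 2 * (b + c) + d
    return (a + d) / denom if denom else 0.0


def mountford(a, b, c, d):
    # 2a / (ab + ac + 2bc); returning 0.0 for a zero denominator
    # is a choice made for this sketch only.
    denom = a * b + a * c + 2 * b * c
    return 2 * a / denom if denom else 0.0


def baroni_urbani_buser(a, b, c, d):
    # (sqrt(ad) + a) / (sqrt(ad) + a + b + c)
    root = sqrt(a * d)
    denom = root + a + b + c
    return (root + a) / denom if denom else 0.0


# Hypothetical example: six keywords, 1 = keyword present.
query = [1, 1, 1, 0, 0, 0]
doc = [1, 1, 0, 1, 0, 0]
counts = contingency(query, doc)
scores = {
    "rogers_tanimoto": rogers_tanimoto(*counts),
    "mountford": mountford(*counts),
    "baroni_urbani_buser": baroni_urbani_buser(*counts),
}
```

A ranking step in this spirit could score every document against the query with each coefficient and sort by the resulting values; how the paper combines the three scores is not specified in the abstract.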

  1. B. Novak, "A Survey of Focused Web Crawling Algorithms", Proceedings of SIKDD, pp. 55–58, 12-15 Oct 2004.
  2. http://www.wikipedia.org/Web_Crawler
  3. B. Klabbankoh, O. Pinngern, "Applied Genetic Algorithms in Information Retrieval", Proceeding of IEEE, pp. 702-711, Nov 2004.
  4. S. S. Satya and P. Simon, "Review on Applicability of Genetic Algorithm to Web Search", International Journal of Computer Theory and Engineering, vol. 1, no. 4, pp. 450-455, 2009.
  5. M. Shokouhi, P. Chubak, Z. Raeesy, "Enhancing Focused Crawling with Genetic Algorithms", vol. 4-6, pp. 503-508, 2005.
  6. www.sequentix.de/gelquest/help/distance_measures.htm?
  7. V. Consonni and R. Todeschini, "New Similarity Coefficients for Binary Data", Communications in Mathematical and in Computer Chemistry, pp. 581-592, 2012.
  8. H. Wolda, "Similarity Indices, Sample Size and Diversity", Oecologia, Springer-Verlag, pp. 296-302, 1981.
  9. M. A. Kauser, M. Nasar, S. K. Singh, "A Detailed Study on Information Retrieval using Genetic Algorithm", Journal of Industrial and Intelligent Information, vol. 1, no. 3, pp. 122-127, Sep 2013.
  10. http://en.wikipedia/wiki/Fitness_Proportionate_Selection
  11. J. R. Koza, "Survey of Genetic Algorithms and Genetic Programming", Proceedings of the Wescon, pp. 589-595, 1995.
  12. http://textalyser.net/
  13. http://www.webconfs.com/keyword-density-checker.php
  14. V. Thada, V. Jaglan, "Use of Genetic Algorithm in Web Information Retrieval", International Journal of Emerging Technologies in Computational and Applied Sciences, vol. 7, no. 3, pp. 278-281, Feb 2014.
Index Terms

Computer Science
Information Sciences


Relevancy similarity coefficients genetic