Research Article

Design of Web Ranking Module using Genetic Algorithm

by Vikas Thada, Vivek Jaglan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 97 - Number 9
Year of Publication: 2014

Crawling is the process by which web search engines collect data from the web. Focused crawling is a special type of crawling in which the crawler looks for information related to a predefined topic [1]. This paper presents a method for finding the most relevant document among a set of documents for a given set of keywords. Relevance is measured with the Rogers-Tanimoto, Mountford, and Baroni-Urbani/Buser similarity coefficients. The method uses a genetic algorithm to show that the average similarity of documents to the query increases when the probability of mutation is low and the probability of crossover is high. It also compares the performance of the different similarity coefficients on the same set of documents and ranks the documents whose relevancy is highest among the three coefficients.
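As an illustration of the three similarity coefficients named above, the following is a minimal sketch using their standard binary definitions. It assumes documents and queries are represented as sets of terms over a shared vocabulary, which the abstract does not specify. With a = terms present in both, b = terms only in the document, c = terms only in the query, and d = terms in neither, the coefficients are Rogers-Tanimoto = (a + d) / (a + 2(b + c) + d), Mountford = 2a / (ab + ac + 2bc), and Baroni-Urbani/Buser = (sqrt(ad) + a) / (sqrt(ad) + a + b + c). The function names, the sample vocabulary, and the choice to return 0.0 for a zero denominator are assumptions made for this example, not details taken from the paper.

```python
from math import sqrt


def contingency(doc_terms, query_terms, vocabulary):
    """Count the four binary match cells (a, b, c, d) over a shared vocabulary."""
    doc, query = set(doc_terms), set(query_terms)
    a = b = c = d = 0
    for term in vocabulary:
        in_doc, in_query = term in doc, term in query
        if in_doc and in_query:
            a += 1  # term present in both document and query
        elif in_doc:
            b += 1  # term present only in the document
        elif in_query:
            c += 1  # term present only in the query
        else:
            d += 1  # term absent from both
    return a, b, c, d


def rogers_tanimoto(a, b, c, d):
    denom = a + d + 2 * (b + c)
    return (a + d) / denom if denom else 0.0


def mountford(a, b, c, d):
    # d is unused by this coefficient; kept so all three share one signature.
    denom = a * b + a * c + 2 * b * c
    # Assumption: a zero denominator is mapped to 0.0 for this sketch.
    return 2 * a / denom if denom else 0.0


def baroni_urbani_buser(a, b, c, d):
    root = sqrt(a * d)
    denom = root + a + b + c
    return (root + a) / denom if denom else 0.0


# Hypothetical toy data, for illustration only.
vocabulary = ["web", "crawler", "genetic", "algorithm", "ranking", "search"]
document = ["web", "crawler", "ranking"]
query = ["web", "ranking", "genetic"]

cells = contingency(document, query, vocabulary)
scores = {
    "rogers_tanimoto": rogers_tanimoto(*cells),
    "mountford": mountford(*cells),
    "baroni_urbani_buser": baroni_urbani_buser(*cells),
}
```

Note that the Mountford coefficient, unlike the other two, is not bounded to the interval [0, 1], so comparing raw scores across coefficients may require normalization.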

