Research Article

Crawling the Hidden Web: An Approach to Dynamic Web Indexing

by Moumie Soulemane, Mohammad Rafiuzzaman, Hasan Mahmud
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 55 - Number 1
Year of Publication: 2012
Authors: Moumie Soulemane, Mohammad Rafiuzzaman, Hasan Mahmud

Moumie Soulemane, Mohammad Rafiuzzaman, Hasan Mahmud. Crawling the Hidden Web: An Approach to Dynamic Web Indexing. International Journal of Computer Applications. 55, 1 (October 2012), 7-15. DOI=10.5120/8717-7290

The majority of websites that hold online information are dynamic, and are therefore too sophisticated for many traditional search engines to index. As the number of such hidden web pages keeps growing, the issue continues to draw diverse opinions from researchers and practitioners in the web mining community. The features that enrich these dynamic web pages also make them increasingly challenging to index. In this paper we explain these features and challenges and present a framework for dynamic web indexing. We describe the experimental setup and the development process, together with the implementation of the framework and the results obtained from it. We conclude by outlining possible future work: integrating Hadoop MapReduce with the framework to update and maintain the index.

Index Terms

Computer Science
Information Sciences


Keywords

Dynamic web pages, crawler, hidden web, index, Hadoop