An Efficiently harvesting Deep Web Interfaces based on Two Stage Crawler

Research Article

Published in June 2018 by Rohini Navnathkhedkar, Madhuri Dalal

International Conference on Emerging Trends in Computing and Communication

Foundation of Computer Science USA

ICETCC2017 - Number 3

June 2018

Authors: Rohini Navnathkhedkar, Madhuri Dalal

Rohini Navnathkhedkar, Madhuri Dalal. An Efficiently harvesting Deep Web Interfaces based on Two Stage Crawler. International Conference on Emerging Trends in Computing and Communication. ICETCC2017, 3 (June 2018), 18-22.

@article{

author = { Rohini Navnathkhedkar, Madhuri Dalal },

title = { An Efficiently harvesting Deep Web Interfaces based on Two Stage Crawler },

journal = { International Conference on Emerging Trends in Computing and Communication },

issue_date = { June 2018 },

volume = { ICETCC2017 },

number = { 3 },

month = { June },

year = { 2018 },

issn = 0975-8887,

pages = { 18-22 },

numpages = 5,

url = { /proceedings/icetcc2017/number3/29474-c129/ },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Proceeding Article

%1 International Conference on Emerging Trends in Computing and Communication

%A Rohini Navnathkhedkar

%A Madhuri Dalal

%T An Efficiently harvesting Deep Web Interfaces based on Two Stage Crawler

%J International Conference on Emerging Trends in Computing and Communication

%@ 0975-8887

%V ICETCC2017

%N 3

%P 18-22

%D 2018

%I International Journal of Computer Applications

Abstract

As the deep web grows at a very fast pace, there has been increased interest in techniques that help locate deep-web interfaces efficiently. However, because of the large volume of web resources and the dynamic nature of the deep web, achieving both wide coverage and high efficiency is challenging. We propose a two-stage framework for harvesting deep-web interfaces. In the first stage, the crawler performs site-based searching for center pages with the help of search engines, which avoids visiting a large number of pages. To achieve more accurate results for a focused crawl, it ranks websites so that highly relevant ones for a given topic are prioritized. In the second stage, it achieves fast in-site searching by excavating the most relevant links with adaptive link ranking.
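The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function and class names (`relevance`, `rank_sites`, `AdaptiveLinkRanker`), the term-overlap score, and the +1/-1 weight update are all assumptions chosen for brevity, and no real crawling or form detection is performed.

```python
import heapq
from collections import Counter


def relevance(text, topic_terms):
    """Toy relevance score: fraction of topic terms that occur in the text."""
    words = set(text.lower().split())
    return sum(t in words for t in topic_terms) / len(topic_terms)


def rank_sites(sites, topic_terms):
    """Stage 1 (assumed form): order candidate sites, most relevant first.

    `sites` maps a site URL to a short description, e.g. a search-engine snippet.
    """
    # heapq is a min-heap, so scores are negated to pop the best site first.
    heap = [(-relevance(desc, topic_terms), url) for url, desc in sites.items()]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


class AdaptiveLinkRanker:
    """Stage 2 (assumed form): rank in-site links and learn from crawl results."""

    def __init__(self):
        self.weights = Counter()  # word -> learned weight, 0 if unseen

    def score(self, anchor_text):
        return sum(self.weights[w] for w in anchor_text.lower().split())

    def feedback(self, anchor_text, found_form):
        # Reward anchor words that led to a searchable form, penalize the rest.
        delta = 1 if found_form else -1
        for w in anchor_text.lower().split():
            self.weights[w] += delta

    def order(self, links):
        """Sort (url, anchor_text) pairs so the most promising link is first."""
        return sorted(links, key=lambda link: self.score(link[1]), reverse=True)
```

In use, `rank_sites` would decide which site to visit next, and within each site `AdaptiveLinkRanker.order` would choose which links to follow, with `feedback` called after each page is checked for a searchable form. The learned weights carry over from one page to the next, which is what makes the link ranking adaptive in this sketch.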

Index Terms

Computer Science

Information Sciences

Keywords

Deep Web, Ranking, Adaptive Learning, Two-stage Crawler