HWPDE: Novel Approach for Data Extraction from Structured Web Pages

International Journal of Computer Applications
© 2012 by IJCA Journal
Volume 50 - Number 8
Year of Publication: 2012
Manpreet Singh Sehgal

	author = {Manpreet Singh Sehgal and Anuradha},
	title = {HWPDE: Novel Approach for Data Extraction from Structured Web Pages},
	journal = {International Journal of Computer Applications},
	year = {2012},
	volume = {50},
	number = {8},
	pages = {22--27},
	month = {July},
	note = {Full text available}


Diving into the World Wide Web to fetch precious stones (relevant information) is a tedious task given the limitations of current diving equipment (current browsers). While much work is being carried out to improve the quality of the diving equipment, a related area of research is to devise novel approaches for mining. This paper describes a novel approach for extracting web data from hidden websites so that it can be offered as a free service that gives users a better experience of searching for relevant data. In the proposed method, a crawler extracts the relevant data (information) contained in the web pages of hidden websites and stores it in a local database, building a large repository of structured, indexed, and ultimately relevant data. Such extracted data has the potential to optimally satisfy end users who are starved of relevant information.


