Article:Crawler Indexing using Tree Structure and its Implementation

Deepika Sharma; Parul Gupta; Dr. A.K. Sharma

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 31 - Number 6

Year of Publication: 2011

DOI: 10.5120/3830-5323

Deepika Sharma, Parul Gupta, Dr. A.K. Sharma. Article:Crawler Indexing using Tree Structure and its Implementation. International Journal of Computer Applications. 31, 6 (October 2011), 34-39. DOI=10.5120/3830-5323

@article{ 10.5120/3830-5323,
  author = { Deepika Sharma and Parul Gupta and Dr. A.K. Sharma },
  title = { Article:Crawler Indexing using Tree Structure and its Implementation },
  journal = { International Journal of Computer Applications },
  issue_date = { October 2011 },
  volume = { 31 },
  number = { 6 },
  month = { October },
  year = { 2011 },
  issn = { 0975-8887 },
  pages = { 34-39 },
  numpages = {9},
  url = { https://ijcaonline.org/archives/volume31/number6/3830-5323/ },
  doi = { 10.5120/3830-5323 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T20:17:27.270068+05:30
%A Deepika Sharma
%A Parul Gupta
%A Dr. A.K. Sharma
%T Article:Crawler Indexing using Tree Structure and its Implementation
%J International Journal of Computer Applications
%@ 0975-8887
%V 31
%N 6
%P 34-39
%D 2011
%I Foundation of Computer Science (FCS), NY, USA

Abstract

The World Wide Web holds a vast amount of content that is useful to millions of people, and information seekers typically begin their Web activity with a search engine such as Google or Yahoo. Our aim is to build a search tool that is cost-effective, efficient, fast and user-friendly. In response to a query, it should retrieve the most relevant information stored in its database. It should also be portable, so that it can be deployed on any platform without additional cost or inconvenience. Our goal is a Web search engine that retrieves the best-matched web pages in the shortest possible time. This paper proposes a crawler algorithm in which the crawler visits web pages recursively and stores the relevant data in the database. The algorithm applies the basic principles of a tree structure to maintain the crawled data for use by the search engine. The tree/node structure in the database filters the searched word more efficiently and returns results to the user faster, which makes searching the Web more efficient. The paper also implements crawler indexing with the tree structure using an HTML-based Update File at the Web Server, making both crawling and searching more efficient.
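The abstract does not spell out the tree layout or the crawl procedure, so the following is only a minimal sketch of the general idea: a recursive crawl that stores each word of a page in a tree of nodes, which a search then walks one character at a time. It assumes a character trie as the tree structure; the names `TrieIndex`, `crawl`, `LinkAndTextParser` and the in-memory `PAGES` dictionary (standing in for both the Web and the database) are hypothetical and not taken from the paper.

```python
from html.parser import HTMLParser
import re


class LinkAndTextParser(HTMLParser):
    """Collects href targets and text content from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text.append(data)


class TrieNode:
    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.urls = set()   # pages containing the word that ends at this node


class TrieIndex:
    """Assumed tree structure: one node per character of an indexed word."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, word, url):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.urls.add(url)

    def search(self, word):
        node = self.root
        for ch in word.lower():
            node = node.children.get(ch)
            if node is None:
                return set()  # the walk left the tree: word was never indexed
        return set(node.urls)


def crawl(url, fetch, index, visited=None):
    """Recursively visit url and the pages it links to, indexing each page once."""
    if visited is None:
        visited = set()
    if url in visited:
        return visited
    visited.add(url)
    html = fetch(url)
    if html is None:  # unknown or unreachable page
        return visited
    parser = LinkAndTextParser()
    parser.feed(html)
    parser.close()
    for word in re.findall(r"[a-z0-9]+", " ".join(parser.text).lower()):
        index.add(word, url)
    for link in parser.links:
        crawl(link, fetch, index, visited)
    return visited


# Hypothetical in-memory "web" so the sketch runs without network access.
PAGES = {
    "/home": '<a href="/trees">Trees</a> <a href="/crawl">Crawl</a> welcome',
    "/trees": '<p>tree index structure</p> <a href="/home">back</a>',
    "/crawl": '<p>crawler stores the index</p>',
}

index = TrieIndex()
visited = crawl("/home", PAGES.get, index)
print(sorted(index.search("index")))
```

A lookup costs at most one node hop per character of the query word, independent of how many pages were crawled, which is one plausible reading of how a tree/node structure could filter the searched word quickly. A real crawler would additionally need URL normalisation, politeness limits and an iterative frontier instead of unbounded recursion.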


Index Terms

Computer Science

Information Sciences

Keywords

Crawler, Indexing, Tree Structure, World-Wide Web