Research Article

Ignoring Irrelevant Pages in Weighted PageRank Algorithm using Text Content of the Target Page

by Sunil Kumar, Niraj Singhal
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 85 - Number 1
Year of Publication: 2014
Authors: Sunil Kumar, Niraj Singhal
DOI: 10.5120/14806-3014

Sunil Kumar, Niraj Singhal. Ignoring Irrelevant Pages in Weighted PageRank Algorithm using Text Content of the Target Page. International Journal of Computer Applications. 85, 1 (January 2014), 30-33. DOI=10.5120/14806-3014

@article{ 10.5120/14806-3014,
author = { Sunil Kumar and Niraj Singhal },
title = { Ignoring Irrelevant Pages in Weighted PageRank Algorithm using Text Content of the Target Page },
journal = { International Journal of Computer Applications },
issue_date = { January 2014 },
volume = { 85 },
number = { 1 },
month = { January },
year = { 2014 },
issn = { 0975-8887 },
pages = { 30-33 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume85/number1/14806-3014/ },
doi = { 10.5120/14806-3014 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:01:21.339925+05:30
%A Sunil Kumar
%A Niraj Singhal
%T Ignoring Irrelevant Pages in Weighted PageRank Algorithm using Text Content of the Target Page
%J International Journal of Computer Applications
%@ 0975-8887
%V 85
%N 1
%P 30-33
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The web is expanding day by day, and people generally rely on search engines to explore it. This growth has created many challenges for information retrieval. The quality of the retrieved information is one of the major issues to be addressed, and current information retrieval approaches need to be modified to meet such challenges. In query-based searching, search engines return a list of web documents containing both relevant and irrelevant pages, and sometimes rank irrelevant pages higher than relevant ones. This paper presents a novel approach to ignoring irrelevant pages in the Weighted PageRank algorithm using the text content of the target pages.
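
The abstract does not spell out the procedure, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that a target page counts as relevant when its text contains at least one query term (a simple stand-in for whatever content test the paper uses), drops links to irrelevant targets, and then runs Weighted PageRank in the form commonly attributed to Xing and Ghorbani, where each link is weighted by the target's share of in-links and out-links among the source's targets. The function names, the damping factor of 0.85, and the toy data are all assumptions made for this example.

```python
def filter_links(links, texts, query_terms):
    """Drop links whose target page text contains none of the query terms.

    Assumption: term overlap is used as the relevance test for illustration.
    """
    terms = {t.lower() for t in query_terms}

    def relevant(page):
        words = set(texts.get(page, "").lower().split())
        return bool(words & terms)

    return {src: [dst for dst in dsts if relevant(dst)]
            for src, dsts in links.items()}


def weighted_pagerank(links, d=0.85, iterations=50):
    """Iterate WPR(u) = (1 - d) + d * sum over v -> u of WPR(v) * W_in * W_out."""
    pages = set(links) | {p for dsts in links.values() for p in dsts}
    out_links = {p: links.get(p, []) for p in pages}

    # Count in-links and out-links for every page.
    in_count = {p: 0 for p in pages}
    for dsts in out_links.values():
        for dst in dsts:
            in_count[dst] += 1
    out_count = {p: len(out_links[p]) for p in pages}

    rank = {p: 1.0 for p in pages}
    for _ in range(iterations):
        new_rank = {p: 1.0 - d for p in pages}
        for v in pages:
            targets = out_links[v]
            total_in = sum(in_count[p] for p in targets)
            total_out = sum(out_count[p] for p in targets)
            for u in targets:
                # Share of in-links / out-links that u holds among v's targets.
                w_in = in_count[u] / total_in if total_in else 0.0
                w_out = out_count[u] / total_out if total_out else 0.0
                new_rank[u] += d * rank[v] * w_in * w_out
        rank = new_rank
    return rank


# Toy example (hypothetical pages and text).
links = {"a": ["b", "spam"], "b": ["a"], "spam": ["a"]}
texts = {
    "a": "pagerank link analysis",
    "b": "weighted pagerank",
    "spam": "cheap watches",
}
filtered = filter_links(links, texts, ["pagerank"])
ranks = weighted_pagerank(filtered)
```

In this toy graph the link from `a` to `spam` is dropped because the text of `spam` shares no term with the query, so `spam` receives no rank from other pages and ends up below both `a` and `b`.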

Index Terms

Computer Science
Information Sciences

Keywords

PageRank, Irrelevant pages, Page content, Links