Research Article

An Improvement of Link Analysis Algorithm to Mine Pertinent Links: Weighted HITS Algorithm based on additive fusion of graphs by Query Similarity

by Hemangini S. Patel, Apurva A. Desai
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 176 - Number 24
Year of Publication: 2020
Authors: Hemangini S. Patel, Apurva A. Desai
10.5120/ijca2020920232

Hemangini S. Patel, Apurva A. Desai. An Improvement of Link Analysis Algorithm to Mine Pertinent Links: Weighted HITS Algorithm based on additive fusion of graphs by Query Similarity. International Journal of Computer Applications. 176, 24 (May 2020), 21-27. DOI=10.5120/ijca2020920232

@article{ 10.5120/ijca2020920232,
author = { Hemangini S. Patel and Apurva A. Desai },
title = { An Improvement of Link Analysis Algorithm to Mine Pertinent Links: Weighted HITS Algorithm based on additive fusion of graphs by Query Similarity },
journal = { International Journal of Computer Applications },
issue_date = { May 2020 },
volume = { 176 },
number = { 24 },
month = { May },
year = { 2020 },
issn = { 0975-8887 },
pages = { 21-27 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume176/number24/31347-2020920232/ },
doi = { 10.5120/ijca2020920232 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:43:23.859824+05:30
%A Hemangini S. Patel
%A Apurva A. Desai
%T An Improvement of Link Analysis Algorithm to Mine Pertinent Links: Weighted HITS Algorithm based on additive fusion of graphs by Query Similarity
%J International Journal of Computer Applications
%@ 0975-8887
%V 176
%N 24
%P 21-27
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Link analysis has been found to significantly improve web search performance by extracting pertinent, valuable links. Short queries generally match link anchors and titles. We give weighted input to HITS and propose the weighted HITS (WHITS) algorithm, in which, through additive fusion of graphs, an entry of the adjacency matrix is given double weight when the link's anchor and title match the query term. Experimental results provide evidence that WHITS returns unique rankings for authoritative pages whose link anchors and link titles are similar to the query term. The proposed algorithm thus helps to extract pertinent and valuable links based on the similarity of link anchors and titles to the query term.
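
The weighting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "additive fusion" means adding a 0/1 query-match graph to the 0/1 link graph (so a matched link weighs 2), that a match is a case-insensitive substring hit in either the anchor or the title, and that scores are L2-normalized on each iteration. The function name `weighted_hits`, the tuple layout, and the sample data are hypothetical.

```python
from math import sqrt

def weighted_hits(links, query, n_pages, iterations=50):
    """Sketch of HITS on a query-weighted adjacency matrix.

    links: iterable of (src, dst, anchor_text, title_text) tuples (assumed layout).
    Returns (hubs, authorities) as lists of length n_pages.
    """
    q = query.lower()
    # Fused matrix W = A + M: A marks every link with 1, M adds 1 for links
    # whose anchor or title contains the query (assumed matching rule).
    w = [[0.0] * n_pages for _ in range(n_pages)]
    for src, dst, anchor, title in links:
        matched = q in anchor.lower() or q in title.lower()
        w[src][dst] = 2.0 if matched else 1.0

    hubs = [1.0] * n_pages
    auths = [1.0] * n_pages
    for _ in range(iterations):
        # Authority update: weighted sum of hub scores of pages linking in.
        auths = [sum(w[i][j] * hubs[i] for i in range(n_pages))
                 for j in range(n_pages)]
        norm = sqrt(sum(a * a for a in auths)) or 1.0
        auths = [a / norm for a in auths]
        # Hub update: weighted sum of authority scores of pages linked to.
        hubs = [sum(w[i][j] * auths[j] for j in range(n_pages))
                for i in range(n_pages)]
        norm = sqrt(sum(h * h for h in hubs)) or 1.0
        hubs = [h / norm for h in hubs]
    return hubs, auths

# Hypothetical toy graph: pages 0 and 1 both link to pages 2 and 3.
EXAMPLE_LINKS = [
    (0, 2, "python tutorial", "Learn Python"),
    (0, 3, "home", "Main page"),
    (1, 2, "python docs", "Reference"),
    (1, 3, "contact", "Contact us"),
]
```

In this toy graph, pages 2 and 3 have identical in-links, so with a query that matches nothing they tie on authority. With the query "python", the links into page 2 are doubled and the tie is broken in its favour, which is one way to read the abstract's claim about unique rankings.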

Hemangini S. Patel completed her undergraduate and postgraduate degrees in Information Technology at Veer Narmad South Gujarat University. She completed her Ph.D. in computer science at Veer Narmad South Gujarat University. She has been an Assistant Professor at Bhagwan Mahavir College of Computer Application, Surat, Gujarat, India, since 2008. She has 12 years of teaching experience at the undergraduate level, beginning in 2006. Her fields of interest are Data mining, Link Analysis,
Apurva A. Desai completed his undergraduate and postgraduate degrees at Veer Narmad South Gujarat University, Surat, Gujarat, India. He earned his Ph.D. in 1997 in the joint fields of Operation Research and Computer Science. He has extensive teaching and research experience dating from 1990. He has published many research papers at the national and international levels and has four books to his credit. Dr. Desai is a Dean of faculty of Computer Science and Information Technology and Chairman Board of Studi
Index Terms

Computer Science
Information Sciences

Keywords

Web Mining, Web Structure Mining, Information Retrieval, Link Analysis, Anchor Text, WWW