Analysis of Vector Space Model in Information Retrieval

IJCA Proceedings on National Conference on Communication Technologies & its impact on Next Generation Computing 2012
© 2012 by IJCA Journal
CTNGC - Number 2
Year of Publication: 2012
Authors:
Jitendra Nath Singh
Sanjay Kumar Dwivedi

Jitendra Nath Singh and Sanjay Kumar Dwivedi. Article: Analysis of Vector Space Model in Information Retrieval. IJCA Proceedings on National Conference on Communication Technologies & its impact on Next Generation Computing 2012 CTNGC(2):14-18, November 2012. Full text available. BibTeX

@article{key:article,
	author = {Jitendra Nath Singh and Sanjay Kumar Dwivedi},
	title = {Article: Analysis of Vector Space Model in Information Retrieval},
	journal = {IJCA Proceedings on National Conference on Communication Technologies & its impact on Next Generation Computing 2012},
	year = {2012},
	volume = {CTNGC},
	number = {2},
	pages = {14-18},
	month = {November},
	note = {Full text available}
}

Abstract

Information retrieval is a key technology behind web search services. In information retrieval, it is common to model index terms and documents as vectors in a suitably defined vector space. The vector space model is one of the classical and most widely applied retrieval models for evaluating the relevance of web pages. The retrieval operation consists of computing the cosine similarity between a given query vector and each document vector, and then ranking the documents by that score. In this paper, we present different vector space approaches to computing the similarity score of hits returned by a search engine. We expect this investigation to lead to a clearer understanding of the issues and problems in using the vector space model in information retrieval. Our work discusses the main aspects of vector space models and provides a comprehensive comparison of the Term-Count model, the Tf-Idf model, and the vector space model based on normalization.
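
The retrieval operation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`tokenize`, `idf_table`, `tfidf_vector`, `cosine`, `rank`), the whitespace tokenizer, and the idf formula log(N/df) are assumptions chosen for illustration; the paper may use different weighting variants. The sketch builds either raw term-count vectors or Tf-Idf vectors, and the cosine function supplies the length normalization.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def term_count_vector(tokens):
    # Term-count model: weight of a term is its raw frequency.
    return dict(Counter(tokens))


def idf_table(doc_tokens):
    # Assumed idf variant: log(N / df), where df is the number of
    # documents that contain the term at least once.
    n = len(doc_tokens)
    df = Counter()
    for tokens in doc_tokens:
        df.update(set(tokens))
    return {term: math.log(n / count) for term, count in df.items()}


def tfidf_vector(tokens, idf):
    # Tf-Idf model: raw frequency times idf; unseen terms get weight 0.
    counts = Counter(tokens)
    return {term: c * idf.get(term, 0.0) for term, c in counts.items()}


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs, use_tfidf=True):
    # Returns (score, document index) pairs, best match first.
    doc_tokens = [tokenize(d) for d in docs]
    if use_tfidf:
        idf = idf_table(doc_tokens)
        vectorize = lambda tokens: tfidf_vector(tokens, idf)
    else:
        vectorize = term_count_vector
    q = vectorize(tokenize(query))
    scores = [(cosine(q, vectorize(t)), i) for i, t in enumerate(doc_tokens)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


if __name__ == "__main__":
    docs = [
        "gold silver truck",
        "shipment of gold damaged in a fire",
        "delivery of silver arrived in a silver truck",
        "shipment of gold arrived in a truck",
    ]
    for score, index in rank("gold silver truck", docs):
        print(f"{score:.3f}  {docs[index]}")
```

Calling `rank` with `use_tfidf=False` switches to the term-count weighting, which makes it easy to compare how the two weighting schemes order the same set of documents.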
