Keyword Extraction using Semantic Analysis

International Journal of Computer Applications
© 2013 by IJCA Journal
Volume 61 - Number 1
Year of Publication: 2013
Mohamed H. Haggag

Mohamed H. Haggag. Article: Keyword Extraction using Semantic Analysis. International Journal of Computer Applications 61(1):1-6, January 2013. Full text available.

BibTeX:

	author = {Mohamed H. Haggag},
	title = {Article: Keyword Extraction using Semantic Analysis},
	journal = {International Journal of Computer Applications},
	year = {2013},
	volume = {61},
	number = {1},
	pages = {1-6},
	month = {January},
	note = {Full text available}


Keywords are a list of significant words or terms that best present the document context in brief and relate to the textual context. Extraction models are categorized as statistical, linguistic, machine learning, or a combination of these approaches. This paper introduces a model for extracting keywords based on their relatedness weight among all terms in the text. The strength of the relationship between terms is evaluated by semantic similarity. Document terms are assigned a weighted metric based on the likeness of their meaning content. Terms that are strongly correlated with each other are weighted highly in each individual term's semantic similarity. Providing the overall term similarity is crucial for defining relevant keywords that best express the text in both frequency and weighted likeness. Keywords are recursively evaluated according to their cohesion with each other and with the document context. The proposed model showed enhanced precision and recall extraction values over other approaches.
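The abstract gives no formulas, so the following is only a minimal sketch of the general idea of scoring terms by frequency and by their semantic relatedness to the other terms in the text. It is not the paper's model. The names `rank_keywords`, `toy_similarity`, and `TOY_SIM`, the hand-made similarity values, and the scoring rule (frequency multiplied by average similarity to the other distinct terms) are all assumptions made for illustration. The recursive cohesion step mentioned in the abstract is not implemented here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, hand-made similarity values in [0, 1]; a real system might
# derive these from a lexical resource such as WordNet instead.
TOY_SIM = {
    frozenset(("bank", "loan")): 0.8,
    frozenset(("bank", "interest")): 0.7,
    frozenset(("loan", "interest")): 0.9,
    frozenset(("bank", "river")): 0.3,
}


def toy_similarity(a, b):
    """Symmetric lookup into the toy table; unknown pairs score 0.0."""
    return TOY_SIM.get(frozenset((a, b)), 0.0)


def rank_keywords(terms, similarity, top_k=3):
    """Rank distinct terms by frequency times average similarity to the
    other distinct terms (an assumed scoring rule, not the paper's)."""
    freq = Counter(terms)
    vocab = sorted(freq)

    # Accumulate each term's total similarity to every other distinct term.
    relatedness = {t: 0.0 for t in vocab}
    for a, b in combinations(vocab, 2):
        s = similarity(a, b)
        relatedness[a] += s
        relatedness[b] += s

    n_others = max(len(vocab) - 1, 1)  # avoid division by zero for one term
    scores = {t: freq[t] * relatedness[t] / n_others for t in vocab}

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    tokens = ["bank", "loan", "bank", "interest", "river", "loan", "bank"]
    for term, score in rank_keywords(tokens, toy_similarity, top_k=3):
        print(f"{term}: {score:.3f}")
```

Passing the similarity function as a parameter keeps the ranking logic independent of the similarity measure, so the toy table can be swapped for any other measure without changing `rank_keywords`.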

