Parts of Speech Tagging in Bengali for MWEs Detection

International Journal of Computer Applications
© 2014 by IJCA Journal
Volume 99 - Number 19
Year of Publication: 2014
Md Jaynal Abedin
Bipul Syam Purkayastha

Md Jaynal Abedin and Bipul Syam Purkayastha. Article: Parts of Speech Tagging in Bengali for MWEs Detection. International Journal of Computer Applications 99(19):33-36, August 2014. Full text available. BibTeX

	author = {Md Jaynal Abedin and Bipul Syam Purkayastha},
	title = {Article: Parts of Speech Tagging in Bengali for MWEs Detection},
	journal = {International Journal of Computer Applications},
	year = {2014},
	volume = {99},
	number = {19},
	pages = {33-36},
	month = {August},
	note = {Full text available}


Part-of-speech (POS) tagging is the process of assigning a part-of-speech tag to every word in a sentence. It is one of the basic tools in many natural language processing (NLP) applications, such as word sense disambiguation, information retrieval, information processing, parsing, question answering, multiword expression (MWE) detection, and machine translation. Resolving the ambiguity of lexical items in a language depends on correctly identifying their parts of speech, which in turn can improve language processing applications in several ways. This paper describes a POS tagset for multiword expression detection in Bengali (Bangla), which is also important for many NLP applications.
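As a rough illustration of the two ideas in the abstract, the sketch below tags each token by dictionary lookup and then flags adjacent tokens whose tags match a pattern as MWE candidates. It is not the tagset or the method of the paper: the lexicon, the tag names (`NOUN`, `ADJ`, and so on), the noun-noun pattern, and the function names are all assumptions made for the example, and English tokens stand in for Bengali ones.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy lexicon: the words and tag names are illustrative
# assumptions, not the Bengali tagset described in the paper.
TOY_LEXICON: Dict[str, str] = {
    "the": "DET",
    "prime": "ADJ",
    "minister": "NOUN",
    "visited": "VERB",
    "railway": "NOUN",
    "station": "NOUN",
}


def tag_sentence(
    tokens: Sequence[str],
    lexicon: Dict[str, str],
    default_tag: str = "UNK",
) -> List[Tuple[str, str]]:
    """Assign one tag to every token by lexicon lookup (no disambiguation)."""
    return [(tok, lexicon.get(tok.lower(), default_tag)) for tok in tokens]


def mwe_candidates(
    tagged: Sequence[Tuple[str, str]],
    pattern: Sequence[str] = ("NOUN", "NOUN"),
) -> List[str]:
    """Return token sequences whose tags match the given tag pattern."""
    n = len(pattern)
    found = []
    for i in range(len(tagged) - n + 1):
        window = tagged[i:i + n]
        if tuple(tag for _, tag in window) == tuple(pattern):
            found.append(" ".join(tok for tok, _ in window))
    return found


sentence = "The prime minister visited the railway station".split()
tagged = tag_sentence(sentence, TOY_LEXICON)
candidates = mwe_candidates(tagged)
print(tagged)
print(candidates)
```

A lookup tagger like this cannot resolve words that take more than one tag, which is the ambiguity problem the abstract refers to; a practical tagger would use context, and the tag patterns would be chosen to suit the MWE types of the target language.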

