Research Article

Normalization of Myanmar Grammatical Categories for Part-of-Speech Tagging

by Phyu Hninn Myint, Tin Myat Htwe, Ni Lar Thein
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 36 - Number 1
Year of Publication: 2011
DOI: 10.5120/4454-6233

Phyu Hninn Myint, Tin Myat Htwe, Ni Lar Thein. Normalization of Myanmar Grammatical Categories for Part-of-Speech Tagging. International Journal of Computer Applications. 36, 1 (December 2011), 10-17. DOI=10.5120/4454-6233

@article{ 10.5120/4454-6233,
author = { Phyu Hninn Myint and Tin Myat Htwe and Ni Lar Thein },
title = { Normalization of Myanmar Grammatical Categories for Part-of-Speech Tagging },
journal = { International Journal of Computer Applications },
issue_date = { December 2011 },
volume = { 36 },
number = { 1 },
month = { December },
year = { 2011 },
issn = { 0975-8887 },
pages = { 10-17 },
numpages = {8},
url = { https://ijcaonline.org/archives/volume36/number1/4454-6233/ },
doi = { 10.5120/4454-6233 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:21:59.478712+05:30
%A Phyu Hninn Myint
%A Tin Myat Htwe
%A Ni Lar Thein
%T Normalization of Myanmar Grammatical Categories for Part-of-Speech Tagging
%J International Journal of Computer Applications
%@ 0975-8887
%V 36
%N 1
%P 10-17
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In this paper, we analyze the syntactic structure of Myanmar grammatical categories so that Myanmar text can be tagged with standard Part-of-Speech (POS) tags. In the Myanmar lexicon, every word is annotated with a basic tag; these words can be called stem words or root words. The Myanmar POS-tagged corpus proposed in [11] used basic POS tagging for each word, so every word in that corpus carries only a basic tag, as in the lexicon. Standard POS tagging requires a normalization step that combines words into more meaningful units and annotates some words with more appropriate, finer POS tags and categories. These finer tags can be called standard POS tags, and they can be aligned directly with English POS tags, which makes them very useful in a Myanmar-to-English machine translation system. Hence, the main aim of this study is to develop customized lexical rules that deduce a finer, or standard, POS tag from combinations of basic POS tags. By analyzing Myanmar grammatical categories, we define 27 normalization rules. The rules were evaluated on a corpus of 1000 basic POS-tagged sentences, and the results were fully satisfactory for all words in these sentences.
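
The idea of deducing a finer tag from a combination of basic tags can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the basic tag names (`NOUN`, `VERB`, `PART_PLURAL`, `PART_PAST`, `PART_NOMINALIZER`), the finer tags, the three rules, and the placeholder words `w1`, `w2`, and so on are hypothetical and are not the 27 rules or the tagset defined in the paper. The sketch also assumes that merged words are joined without a space, and that the longest matching rule is tried first.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, POS tag)

# Hypothetical rules: a sequence of basic tags maps to one finer tag.
# Illustrative placeholders only, not the rules defined in the paper.
RULES: Dict[Tuple[str, ...], str] = {
    ("VERB", "PART_PAST"): "VBD",
    ("NOUN", "PART_PLURAL"): "NNS",
    ("VERB", "PART_NOMINALIZER"): "NN",
}


def normalize(tokens: List[Token]) -> List[Token]:
    """Merge adjacent basic-tagged tokens that match a rule into one finer-tagged token."""
    max_len = max(len(pattern) for pattern in RULES)
    out: List[Token] = []
    i = 0
    while i < len(tokens):
        # Try the longest rule pattern first, down to patterns of length 2.
        for n in range(max_len, 1, -1):
            window = tokens[i:i + n]
            pattern = tuple(tag for _, tag in window)
            if len(window) == n and pattern in RULES:
                merged_word = "".join(word for word, _ in window)
                out.append((merged_word, RULES[pattern]))
                i += n
                break
        else:
            # No rule matched at this position: keep the basic tag unchanged.
            out.append(tokens[i])
            i += 1
    return out
```

Under these assumptions, `normalize([("w1", "NOUN"), ("w2", "PART_PLURAL")])` merges the two tokens into a single token tagged `NNS`, while a token that matches no rule keeps its basic tag.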

References
  1. Bradley, D. 2010. The Characteristics of the Burmic family of Tibeto-Burman. The International Symposium on Sino-Tibetan Comparative Studies in the 21st Century. Institute of Linguistics. Academia Sinica. Taipei. Taiwan.
  2. Department of the Myanmar Language Commission. 2005. Myanmar Grammar. Ministry of Education. Myanmar.
  3. Department of the Myanmar Language Commission. 2006. Myanmar-English Dictionary. Ministry of Education. Myanmar.
  4. Grammar. Burmese language. http://en.wikipedia.org/wiki/Burmese_Language
  5. Hopple, P. 1999. Nominalization in Burmese - sentence patterns. The 32nd International Conference on Sino-Tibetan Languages and Linguistics. Urbana-Champaign.
  6. Hopple, P. 2003. The Structure of Nominalization in Burmese. Ph.D. dissertation. University of Texas. Arlington.
  7. Hopple, P. Burmese Particles as Boundary Marking Units of Text. http://ic.payap.ac.th/graduate/linguistic/papers/Burmese_Particles.pdf?v=1289545363
  8. Judson, A. 1842. Grammatical Notices of the Burmese Language. Maulmain: American Baptist Mission Press.
  9. Ko, Taw Sein. 1924. Elementary handbook of the Burmese language. Rangoon: American Baptist Mission Press.
  10. Latter, T. 1845. A Grammar of the Language of Burmah. Baptist Mission Press.
  11. Myint, P. H. 2010. Assigning automatically Part-of-Speech tags to build tagged corpus for Myanmar language. The Fifth Conference on Parallel Soft Computing. Yangon. Myanmar.
  12. Soe, M. 1999. A grammar of Burmese. Ph.D. dissertation. University of Oregon.
  13. Thurgood, G. 1977. Burmese Historical Morphology. Proceedings of the 3rd Annual Meeting of the Berkeley Linguistics Society.
  14. Wright, E. 1877. Anglo-Burmese Student's Assistant. Tenasserim Press.
Index Terms

Computer Science
Information Sciences

Keywords

Part-of-Speech tagging
Normalization of Grammatical Categories