Kannada Part-Of-Speech Tagging with Probabilistic Classifiers

Shambhavi B R; Ramakanth Kumar P

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 48 - Number 17

Year of Publication: 2012

Authors: Shambhavi B R, Ramakanth Kumar P

DOI: 10.5120/7442-0452

Shambhavi B R, Ramakanth Kumar P. Kannada Part-Of-Speech Tagging with Probabilistic Classifiers. International Journal of Computer Applications. 48, 17 (June 2012), 26-30. DOI=10.5120/7442-0452

@article{10.5120/7442-0452,
  author = {Shambhavi B R and Ramakanth Kumar P},
  title = {Kannada Part-Of-Speech Tagging with Probabilistic Classifiers},
  journal = {International Journal of Computer Applications},
  issue_date = {June 2012},
  volume = {48},
  number = {17},
  month = {June},
  year = {2012},
  issn = {0975-8887},
  pages = {26-30},
  numpages = {5},
  url = {https://ijcaonline.org/archives/volume48/number17/7442-0452/},
  doi = {10.5120/7442-0452},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%A Shambhavi B R
%A Ramakanth Kumar P
%T Kannada Part-Of-Speech Tagging with Probabilistic Classifiers
%J International Journal of Computer Applications
%@ 0975-8887
%V 48
%N 17
%P 26-30
%D 2012
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Part-Of-Speech (POS) tagging is the Natural Language Processing (NLP) task in which each word in a sentence is labeled with a tag indicating its appropriate part of speech. Among supervised machine learning classification algorithms, the second-order Hidden Markov Model (HMM) and Conditional Random Fields (CRF) are chosen in this work for POS tagging of the Kannada language. The training data comprises 51,269 words and the test data around 2,932 tokens; the two sets are disjoint and both are taken from the EMILLE corpus. Experiments show that the accuracy of the tools based on HMM and CRF is 79.9% and 84.58% respectively.
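To make the HMM side of this setup concrete, the following is a minimal sketch of an HMM tagger decoded with the Viterbi algorithm and scored by token-level accuracy. It is not the tool evaluated in the paper: it uses a first-order (bigram) model rather than the second-order model named above, add-one smoothing chosen only for simplicity, and a tiny English toy corpus with hypothetical tags (`DET`, `NOUN`, `VERB`) standing in for the EMILLE Kannada data. All function and variable names are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start pseudo-tag


def train_hmm(tagged_sentences):
    """Count tag transitions and word emissions from (word, tag) sentences."""
    transitions = defaultdict(Counter)  # previous tag -> counts of next tag
    emissions = defaultdict(Counter)    # tag -> counts of emitted word
    vocab = set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            transitions[prev][tag] += 1
            emissions[tag][word] += 1
            vocab.add(word)
            prev = tag
    return transitions, emissions, sorted(emissions), vocab


def log_prob(counter, key, num_outcomes):
    """Add-one smoothed log probability (an assumption, not the paper's choice)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + num_outcomes))


def viterbi(words, model):
    """Return the most probable tag sequence for a non-empty word list."""
    transitions, emissions, tags, vocab = model
    n_tags, n_words = len(tags), len(vocab) + 1  # +1 reserves mass for unseen words
    # best[i][t] = (log score of best path ending in tag t at position i, backpointer)
    best = [{
        t: (log_prob(transitions[START], t, n_tags)
            + log_prob(emissions[t], words[0], n_words), None)
        for t in tags
    }]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            prev_tag, score = max(
                ((p, best[i - 1][p][0] + log_prob(transitions[p], t, n_tags))
                 for p in tags),
                key=lambda pair: pair[1],
            )
            column[t] = (score + log_prob(emissions[t], words[i], n_words), prev_tag)
        best.append(column)
    # Follow backpointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return path[::-1]


def accuracy(model, tagged_sentences):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = total = 0
    for sentence in tagged_sentences:
        words = [w for w, _ in sentence]
        gold = [t for _, t in sentence]
        predicted = viterbi(words, model)
        correct += sum(p == g for p, g in zip(predicted, gold))
        total += len(gold)
    return correct / total


# Toy data only; the training and test sentences are disjoint.
TOY_TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
TOY_TEST = [[("a", "DET"), ("cat", "NOUN"), ("barks", "VERB")]]

model = train_hmm(TOY_TRAIN)
print(viterbi(["a", "cat", "barks"], model))
print(accuracy(model, TOY_TEST))
```

A second-order tagger would condition each transition on the two preceding tags instead of one, and a CRF would replace the generative counts with discriminatively trained feature weights; the accuracy function applies unchanged to either.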

Index Terms

Computer Science

Information Sciences

Keywords

Natural Language Processing, Part Of Speech Tagging, Hidden Markov Model, Conditional Random Fields