Research Article

Sentence Boundary Detection in Kannada Language

by Deepamala. N, Ramakanth Kumar. P
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 39 - Number 9
Year of Publication: 2012
Authors: Deepamala. N, Ramakanth Kumar. P
10.5120/4852-7124

Deepamala. N, Ramakanth Kumar. P. Sentence Boundary Detection in Kannada Language. International Journal of Computer Applications. 39, 9 (February 2012), 38-41. DOI=10.5120/4852-7124

@article{ 10.5120/4852-7124,
author = { Deepamala. N and Ramakanth Kumar. P },
title = { Sentence Boundary Detection in Kannada Language },
journal = { International Journal of Computer Applications },
issue_date = { February 2012 },
volume = { 39 },
number = { 9 },
month = { February },
year = { 2012 },
issn = { 0975-8887 },
pages = { 38-41 },
numpages = {4},
url = { https://ijcaonline.org/archives/volume39/number9/4852-7124/ },
doi = { 10.5120/4852-7124 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:26:02.777794+05:30
%A Deepamala. N
%A Ramakanth Kumar. P
%T Sentence Boundary Detection in Kannada Language
%J International Journal of Computer Applications
%@ 0975-8887
%V 39
%N 9
%P 38-41
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Sentence Boundary Detection is a pre-processing step for Natural Language Processing applications. Various algorithms have been used to achieve Sentence Boundary Detection, or Disambiguation, in different languages. In this paper, a rule-based method for Sentence Boundary Detection in the Kannada language is proposed and tested. Kannada, a grammatically rich Indian language, is analyzed based on semantics, and the method is tested on a 227 KB corpus. The code is written in C using wide characters, with support for Unicode. The method detected sentence boundaries correctly in 99.2% of cases.
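
The abstract does not list the rules themselves, so the following is only a minimal sketch of how a rule-based detector over wide characters might look in C, not the authors' implementation. The function names (`is_abbreviation`, `find_sentence_boundaries`), the abbreviation list, and both rules are assumptions made for illustration. The verb-suffix analysis suggested by the keywords is not modelled here.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical abbreviation list; the paper's actual list is not given here. */
static const wchar_t *const ABBREVIATIONS[] = {
    L"\u0CA1\u0CBE", /* Kannada letters DDA + vowel sign AA, as in "Dr" */
    L"Dr",
    L"Prof",
};
static const size_t N_ABBREVIATIONS =
    sizeof ABBREVIATIONS / sizeof ABBREVIATIONS[0];

/* Return 1 if the len characters starting at word match a listed abbreviation. */
static int is_abbreviation(const wchar_t *word, size_t len)
{
    for (size_t i = 0; i < N_ABBREVIATIONS; i++) {
        if (wcslen(ABBREVIATIONS[i]) == len &&
            wcsncmp(ABBREVIATIONS[i], word, len) == 0)
            return 1;
    }
    return 0;
}

/*
 * Scan text and record the index just past each detected sentence
 * terminator in out[] (at most max entries are written).
 * Returns the total number of boundaries detected.
 */
size_t find_sentence_boundaries(const wchar_t *text, size_t *out, size_t max)
{
    size_t count = 0;
    size_t word_start = 0; /* index where the current word begins */

    for (size_t i = 0; text[i] != L'\0'; i++) {
        wchar_t c = text[i];

        if (iswspace((wint_t)c)) {
            word_start = i + 1;
            continue;
        }
        if (c != L'.' && c != L'?' && c != L'!')
            continue;

        /* Assumed rule 1: a terminator must be followed by whitespace
         * or the end of the text (this skips decimals such as 3.14). */
        wchar_t next = text[i + 1];
        if (next != L'\0' && !iswspace((wint_t)next))
            continue;

        /* Assumed rule 2: a period after a listed abbreviation
         * does not end the sentence. */
        if (c == L'.' && is_abbreviation(text + word_start, i - word_start))
            continue;

        if (count < max)
            out[count] = i + 1;
        count++;
    }
    return count;
}
```

Called on a wide string such as `L"Dr. Rao came. He left!"`, the function skips the period after the abbreviation and reports two boundaries. Working on `wchar_t` rather than bytes lets each Kannada code point be compared as a single unit, which is presumably why the authors chose wide characters.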

Index Terms

Computer Science
Information Sciences

Keywords

Sentence Boundary Detection, Verb Suffix, Abbreviation