Morpheme Segmentation for Highly Agglutinative Tamil Language by Means of Unsupervised Learning

IJCA Proceedings on International Conference on Communication, Computing and Information Technology
© 2015 by IJCA Journal
ICCCMIT 2014 - Number 1
Year of Publication: 2015
Authors:
Ananthi Sheshasaayee
Angela Deepa.V.R

Ananthi Sheshasaayee and Angela Deepa V. R. Morpheme Segmentation for Highly Agglutinative Tamil Language by Means of Unsupervised Learning. IJCA Proceedings on International Conference on Communication, Computing and Information Technology ICCCMIT 2014(1):32-35, March 2015. Full text available. BibTeX:

@article{key:article,
	author = {Ananthi Sheshasaayee and Angela Deepa V. R.},
	title = {Morpheme Segmentation for Highly Agglutinative Tamil Language by Means of Unsupervised Learning},
	journal = {IJCA Proceedings on International Conference on Communication, Computing and Information Technology},
	year = {2015},
	volume = {ICCCMIT 2014},
	number = {1},
	pages = {32-35},
	month = {March},
	note = {Full text available}
}

Abstract

Understanding human language is one of the major challenges in the field of intelligent information systems. Morphological processing is the first step in many natural language processing applications, and it becomes crucial for morphologically rich languages. This paper illustrates the importance of unsupervised morphological segmentation algorithms for the problem of morpheme boundary detection in the Tamil language, which is highly inflectional and agglutinative in its morphology. It serves as groundwork that presents the various methods and a comparative study of the selected algorithms, which are chosen from work on highly agglutinative languages such as Kannada, Finnish and Bengali. The principal advantages of these algorithms point toward efficient morphological processing of the Tamil language.
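
To make the idea of unsupervised morpheme boundary detection concrete, the following is a minimal sketch using the letter successor variety heuristic. This technique is swapped in purely for illustration and is not claimed to be one of the algorithms the paper compares. It proposes a boundary after any word prefix that is followed by many distinct letters across the word list. The word list (informally romanized Tamil noun forms), the names `WORDS`, `successor_sets` and `segment`, and the threshold of 3 are all assumptions made for this example.

```python
from collections import defaultdict

END = "$"  # end-of-word marker, counted as one possible successor

# Toy word list, informally romanized and chosen only for illustration
# (an assumption of this sketch, not data from the paper).
WORDS = [
    "malai", "malaigal", "malaiyai", "malaiyil",
    "kadai", "kadaigal", "kadaiyai", "kadaiyil",
    "paravai", "paravaigal", "paravaiyai", "paravaiyil",
]


def successor_sets(words):
    """Map every prefix seen in the word list to the set of symbols that follow it."""
    succ = defaultdict(set)
    for w in words:
        for i in range(len(w) + 1):
            nxt = w[i] if i < len(w) else END
            succ[w[:i]].add(nxt)
    return succ


def segment(word, succ, threshold=3):
    """Split `word` after each proper prefix whose successor count reaches `threshold`."""
    pieces, start = [], 0
    for i in range(1, len(word)):
        # .get avoids inserting unseen prefixes into the defaultdict
        if len(succ.get(word[:i], ())) >= threshold:
            pieces.append(word[start:i])
            start = i
    pieces.append(word[start:])
    return pieces


SUCC = successor_sets(WORDS)

if __name__ == "__main__":
    for w in ("malaigal", "kadaiyil", "paravai"):
        print(w, "->", segment(w, SUCC))
```

On this toy list the stems `malai`, `kadai` and `paravai` are each followed by three different symbols, so a boundary is proposed right after them. The glide `y` stays attached to the suffix piece (for example `yil`), which shows one limitation of a purely distributional heuristic on agglutinative word forms.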
