DOI: 10.5120/6877-9090
A. S. M. Mahmudul Hasan, Saria Islam and M. Arifur Rahman. "Performance Analysis of Different Smoothing Methods on n-grams for Statistical Machine Translation". International Journal of Computer Applications 46(2):45-51, May 2012.
BibTeX:

@article{key:article,
  author  = {A. S. M. Mahmudul Hasan and Saria Islam and M. Arifur Rahman},
  title   = {Performance Analysis of Different Smoothing Methods on n-grams for Statistical Machine Translation},
  journal = {International Journal of Computer Applications},
  year    = {2012},
  volume  = {46},
  number  = {2},
  pages   = {45-51},
  month   = {May}
}
Abstract
Smoothing techniques adjust the maximum likelihood estimate of probabilities to produce more accurate probabilities. This is one of the most important tasks when building a language model from a limited amount of training data. The main contribution of this paper is an analysis of the performance of different smoothing techniques on n-grams. We consider three of the most widely used smoothing algorithms for language modeling: Witten-Bell smoothing, Kneser-Ney smoothing, and modified Kneser-Ney smoothing. For evaluation we use the BLEU (Bilingual Evaluation Understudy) and NIST (National Institute of Standards and Technology) scoring techniques. A detailed evaluation of these models is performed by comparing the automatically produced word alignments. We use the Moses Statistical Machine Translation system for our work (i.e. the Moses decoder, GIZA++, mkcls, SRILM, IRSTLM, Pharaoh, and the BLEU scoring tool). The machine translation approach has been tested on German-to-English and English-to-German tasks. The results we obtain are significantly better than those obtained with alternative approaches to machine translation. This paper addresses several aspects of Statistical Machine Translation (SMT), with the emphasis on the architecture and modeling of an SMT system.
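To make the smoothing discussion concrete, below is a minimal sketch of interpolated Kneser-Ney smoothing for a bigram language model in Python. It only illustrates the general technique named in the abstract, not the SRILM/IRSTLM implementations used in the paper; the toy corpus, the discount value D = 0.75, and the function name `kneser_ney_bigram` are assumptions made for this example.

```python
from collections import Counter

def kneser_ney_bigram(tokens, discount=0.75):
    """Interpolated Kneser-Ney bigram model (illustrative sketch).

    P(w | v) = max(c(v, w) - D, 0) / c(v) + lambda(v) * P_cont(w)
    lambda(v) = D * |{w' : c(v, w') > 0}| / c(v)
    P_cont(w) = |{v' : c(v', w) > 0}| / (number of distinct bigram types)
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    histories = Counter(tokens[:-1])                  # c(v): counts of bigram histories
    followers = Counter(v for (v, _w) in bigrams)     # distinct continuations of each history
    preceders = Counter(w for (_v, w) in bigrams)     # distinct histories preceding each word
    total_bigram_types = len(bigrams)

    def prob(v, w):
        # Continuation probability: in how many distinct contexts does w appear?
        p_cont = preceders[w] / total_bigram_types
        if histories[v] == 0:
            return p_cont                             # unseen history: fall back to continuation prob
        discounted = max(bigrams[(v, w)] - discount, 0.0) / histories[v]
        interpolation_weight = discount * followers[v] / histories[v]
        return discounted + interpolation_weight * p_cont

    return prob

# Tiny usage example on a toy corpus (assumed data, for illustration only).
tokens = "the cat sat on the mat the dog sat on the rug".split()
p = kneser_ney_bigram(tokens)
print(p("on", "the"))   # frequent bigram keeps most of its probability mass
print(p("on", "dog"))   # unseen bigram still receives non-zero probability
```

Witten-Bell smoothing follows the same interpolation skeleton but derives the weight from the number of distinct continuations of the history instead of subtracting a fixed discount, which is why the three methods compared in the paper can be swapped behind the same interface.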