Research Article

Rule based Sentence Simplification for English to Tamil Machine Translation System

by Poornima C, Dhanalakshmi V, Anand Kumar M, Soman K P
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 25 - Number 8
Year of Publication: 2011
Authors: Poornima C, Dhanalakshmi V, Anand Kumar M, Soman K P
DOI: 10.5120/3050-4147

Poornima C, Dhanalakshmi V, Anand Kumar M, Soman K P. Rule based Sentence Simplification for English to Tamil Machine Translation System. International Journal of Computer Applications. 25, 8 (July 2011), 38-42. DOI=10.5120/3050-4147

@article{ 10.5120/3050-4147,
author = { Poornima C and Dhanalakshmi V and Anand Kumar M and Soman K P },
title = { Rule based Sentence Simplification for English to Tamil Machine Translation System },
journal = { International Journal of Computer Applications },
issue_date = { July 2011 },
volume = { 25 },
number = { 8 },
month = { July },
year = { 2011 },
issn = { 0975-8887 },
pages = { 38-42 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume25/number8/3050-4147/ },
doi = { 10.5120/3050-4147 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:11:15.099603+05:30
%A Poornima C
%A Dhanalakshmi V
%A Anand Kumar M
%A Soman K P
%T Rule based Sentence Simplification for English to Tamil Machine Translation System
%J International Journal of Computer Applications
%@ 0975-8887
%V 25
%N 8
%P 38-42
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Machine translation is the process by which computer software translates text from one natural language to another. Handling complex sentences is generally considered difficult for any machine translation system, so simplifying the input sentence becomes necessary to improve translation quality. Many approaches are available for simplifying complex sentences. In this paper, a rule-based technique is proposed that simplifies complex sentences based on connectives such as relative pronouns and coordinating and subordinating conjunctions. The simplification is expressed as a list of sub-sentences that are portions of the original sentence, and the meaning of the original sentence remains unaltered. The characters '.' and '?' are used as delimiters, and the presence of a delimiter in the given sentence is an important prerequisite. Initial splitting is based on the delimiters; simplification is then based on the connectives. The method is useful as a preprocessing tool for machine translation.
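
The two-stage flow described in the abstract (split on delimiters first, then on connectives) can be illustrated with a minimal Python sketch. This is not the authors' rule set: the connective list, the function names, and the plain surface-word matching are all assumptions made for illustration, and the keywords suggest the actual system also relies on POS tags, which this sketch does not model.

```python
import re

# Delimiters named in the abstract.
DELIMITERS = ".?"

# Hypothetical connective list (assumption); the paper's actual rules are not
# given in the abstract.
CONNECTIVES = ["and", "but", "because", "although", "who", "which"]

_SENTENCE_RE = re.compile(r"[^.?]+[.?]")
_CONNECTIVE_RE = re.compile(
    r"\b(?:" + "|".join(CONNECTIVES) + r")\b", re.IGNORECASE
)


def split_on_delimiters(text):
    """Stage 1: cut the input into sentences that end in '.' or '?'.

    Text without a closing delimiter is not returned, which mirrors the
    stated prerequisite that a delimiter must be present.
    """
    return [s.strip() for s in _SENTENCE_RE.findall(text)]


def split_on_connectives(sentence):
    """Stage 2: cut one sentence into sub-sentences at each connective."""
    body = sentence.rstrip(DELIMITERS).strip()
    parts = _CONNECTIVE_RE.split(body)
    # Drop empty fragments and stray commas or spaces around each part.
    return [p.strip(" ,") for p in parts if p.strip(" ,")]


def simplify(text):
    """Run both stages and return the flat list of sub-sentences."""
    result = []
    for sentence in split_on_delimiters(text):
        result.extend(split_on_connectives(sentence))
    return result


if __name__ == "__main__":
    sample = "She opened the door and he walked in. Did you stay because it rained?"
    for sub_sentence in simplify(sample):
        print(sub_sentence)
```

Each returned item is a portion of the original sentence, as the abstract describes. Purely lexical splitting of this kind does not restore a missing subject after a relative pronoun such as "who", so a real system would need additional rules for those cases.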

Index Terms

Computer Science
Information Sciences

Keywords

Sentence simplification, Sentence segmentation, POS tag, Machine translation