Research Article

A Finite State Transducer (FST) based Font Converter

by Sriram Chaudhury, Shubhamay Sen, Gyanranjan Nandi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 58 - Number 17
Year of Publication: 2012
Authors: Sriram Chaudhury, Shubhamay Sen, Gyanranjan Nandi
10.5120/9376-3852

Sriram Chaudhury, Shubhamay Sen, Gyanranjan Nandi. A Finite State Transducer (FST) based Font Converter. International Journal of Computer Applications. 58, 17 (November 2012), 35-39. DOI=10.5120/9376-3852

@article{ 10.5120/9376-3852,
author = { Sriram Chaudhury and Shubhamay Sen and Gyanranjan Nandi },
title = { A Finite State Transducer (FST) based Font Converter },
journal = { International Journal of Computer Applications },
issue_date = { November 2012 },
volume = { 58 },
number = { 17 },
month = { November },
year = { 2012 },
issn = { 0975-8887 },
pages = { 35-39 },
numpages = {5},
url = { https://ijcaonline.org/archives/volume58/number17/9376-3852/ },
doi = { 10.5120/9376-3852 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:03:15.926960+05:30
%A Sriram Chaudhury
%A Shubhamay Sen
%A Gyanranjan Nandi
%T A Finite State Transducer (FST) based Font Converter
%J International Journal of Computer Applications
%@ 0975-8887
%V 58
%N 17
%P 35-39
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper describes a rule-based approach to developing an Oriya font converter that converts text in the proprietary SAMBAD and AKRUTI fonts to standard Unicode. Such conversion supports electronic storage of information in the native language and enables proper search and retrieval. Our approach relies mainly on the Apertium machine translation tool, which uses Finite State Transducers, to convert the symbolic data to standardized Unicode Oriya text. It requires a map table that maps the commonly used Oriya syllables in the proprietary font to their corresponding font codes, and a dictionary specifying the rules for mapping the proprietary font codes to Unicode. Unhandled symbols that remain in the intermediate converted file are then rectified with the Flex scanner tool. The converted text is in standard Unicode and remains unchanged across platforms, since Unicode is supported by almost all of them.
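
The pipeline summarized in the abstract (map table, rule-driven transduction, then a cleanup pass for unhandled symbols) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the paper uses Apertium's finite state transducers and a Flex scanner, whereas the sketch swaps in a hand-written trie with longest-match lookup and a regular-expression cleanup. All names (`MAP_TABLE`, `CLEANUP_RULES`, `convert`) and every legacy font code shown are hypothetical; they are not real SAMBAD or AKRUTI codes.

```python
import re

# Hypothetical map table: legacy font code sequence -> Unicode Oriya.
# The left-hand codes are invented for illustration only.
MAP_TABLE = {
    "k": "\u0B15",               # ORIYA LETTER KA
    "K": "\u0B16",               # ORIYA LETTER KHA
    "kx": "\u0B15\u0B4D\u0B37",  # conjunct: KA + VIRAMA + SSA
    "a": "\u0B3E",               # ORIYA VOWEL SIGN AA
    "i": "\u0B3F",               # ORIYA VOWEL SIGN I
}

# Hypothetical cleanup rules standing in for the Flex scanner pass:
# here, stray '#' markers left in the intermediate text are dropped.
CLEANUP_RULES = [(re.compile(r"#+"), "")]


def build_trie(table):
    """Compile the map table into a trie; the None key marks an accepting state."""
    root = {}
    for src, dst in table.items():
        node = root
        for ch in src:
            node = node.setdefault(ch, {})
        node[None] = dst
    return root


def transduce(text, trie):
    """Rewrite text left to right, always taking the longest matching code."""
    out = []
    i = 0
    while i < len(text):
        node, j, best = trie, i, None
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if None in node:
                best = (j, node[None])  # remember the longest match so far
        if best is None:
            out.append(text[i])  # unhandled symbol: pass through unchanged
            i += 1
        else:
            i, emitted = best
            out.append(emitted)
    return "".join(out)


def cleanup(text):
    """Apply the post-processing rules to the intermediate output."""
    for pattern, replacement in CLEANUP_RULES:
        text = pattern.sub(replacement, text)
    return text


TRIE = build_trie(MAP_TABLE)


def convert(text):
    """Full sketch pipeline: transduce, then clean up leftover symbols."""
    return cleanup(transduce(text, TRIE))
```

Longest-match lookup matters because a conjunct code such as the hypothetical `kx` must win over its prefix `k`; symbols with no rule pass through to the intermediate text, where the cleanup stage can handle them.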

Index Terms

Computer Science
Information Sciences

Keywords

Oriya font converter
Proprietary font to Unicode font converter
A Finite State Transducer based font Converter
Font converter for Indian language
Rule based font conversion
Apertium in font conversion