Script Identification for Tri-Lingual Image Document

Research Article

Published on September 2014 by Anil Kumar Dahiya, Vivek Kumar Verma

Recent Advances in Wireless Communication and Artificial Intelligence

Foundation of Computer Science USA

RAWCAI - Number 1

September 2014

Authors: Anil Kumar Dahiya, Vivek Kumar Verma

Anil Kumar Dahiya, Vivek Kumar Verma. Script Identification for Tri-Lingual Image Document. Recent Advances in Wireless Communication and Artificial Intelligence. RAWCAI, 1 (September 2014), 35-38.

@article{

author = { Anil Kumar Dahiya and Vivek Kumar Verma },

title = { Script Identification for Tri-Lingual Image Document },

journal = { Recent Advances in Wireless Communication and Artificial Intelligence },

issue_date = { September 2014 },

volume = { RAWCAI },

number = { 1 },

month = { September },

year = { 2014 },

issn = { 0975-8887 },

pages = { 35-38 },

numpages = 4,

url = { /proceedings/rawcai/number1/17916-1412/ },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Proceeding Article

%1 Recent Advances in Wireless Communication and Artificial Intelligence

%A Anil Kumar Dahiya

%A Vivek Kumar Verma

%T Script Identification for Tri-Lingual Image Document

%J Recent Advances in Wireless Communication and Artificial Intelligence

%@ 0975-8887

%V RAWCAI

%N 1

%P 35-38

%D 2014

%I International Journal of Computer Applications

Abstract

In a multilingual environment, where a single document image may contain more than one script, a script identification system is needed. Automatic script identification facilitates (i) automatic archiving of multilingual documents, (ii) searching online archives of document images, and (iii) selecting a script-specific OCR engine in a multilingual environment. The main objective of this system is to identify each script and feed the corresponding text to its script-specific Optical Character Recognition (OCR) system, which converts a document image into editable text. Script identification for Indian-script languages is a well-studied research field. This paper describes a script identification technique that discriminates among three major Indian scripts: Oriya, Telugu and Kannada. All three descend from the Brahmi script, and most of their character shapes are very similar. The method is applied to text lines segmented from the document image and is independent of font size and typeface. It uses basic distinguishing features derived from texture analysis, based on the horizontal and vertical projection profiles. On a test dataset of three ancient mixed-script document images, it achieves an overall accuracy of 98.64% at the line level.
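The abstract names horizontal and vertical projection profiles but does not give the exact features or classifier. The following is a minimal sketch, not the authors' implementation. It assumes a binarized image stored as rows of 0/1 pixels (1 = ink), and the helper names `horizontal_profile`, `vertical_profile`, `segment_lines` and `normalized` are hypothetical. It shows how a page could be split into text lines at empty rows, and how the two profiles could be computed per line and scaled by their peak, which is one simple way to reduce dependence on font size.

```python
from typing import List, Tuple

# Assumption: a binarized image, 1 = ink (text) pixel, 0 = background.
Image = List[List[int]]


def horizontal_profile(img: Image) -> List[int]:
    """Count ink pixels in each row."""
    return [sum(row) for row in img]


def vertical_profile(img: Image) -> List[int]:
    """Count ink pixels in each column."""
    if not img:
        return []
    return [sum(col) for col in zip(*img)]


def segment_lines(img: Image) -> List[Tuple[int, int]]:
    """Split a page into text lines at rows whose horizontal profile is zero.

    Returns (start_row, end_row) pairs with end_row exclusive.
    """
    lines: List[Tuple[int, int]] = []
    start = None
    for y, count in enumerate(horizontal_profile(img)):
        if count > 0 and start is None:
            start = y                      # first inked row of a new line
        elif count == 0 and start is not None:
            lines.append((start, y))       # blank row closes the line
            start = None
    if start is not None:                  # line runs to the bottom edge
        lines.append((start, len(img)))
    return lines


def normalized(profile: List[int]) -> List[float]:
    """Scale a profile by its peak so its shape is comparable across sizes."""
    peak = max(profile, default=0)
    return [v / peak for v in profile] if peak else [0.0] * len(profile)


# Toy page with two "text lines" separated by a blank row.
page = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 1, 1, 1],
]

for top, bottom in segment_lines(page):
    line = page[top:bottom]
    print((top, bottom), horizontal_profile(line), vertical_profile(line))
```

In a full pipeline, the per-line profiles (or statistics derived from them) would be passed to a classifier; the keywords suggest KNN, but the abstract does not specify the feature vector, so that step is left out here.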


Index Terms

Computer Science

Information Sciences

Keywords

OCR, Script Identification, KNN, Oriya, Telugu, Kannada, Projection Profile