Efficient Text Segmentation for Born-Digital Compound Images

Research Article

Published in March 2017 by Sonwanevikas V, Shahane N. M

Emerging Trends in Computing

Foundation of Computer Science USA

ETC2016 - Number 2

March 2017

Authors: Sonwanevikas V, Shahane N. M

Sonwanevikas V, Shahane N. M. Efficient Text Segmentation for Born-Digital Compound Images. Emerging Trends in Computing. ETC2016, 2 (March 2017), 26-30.

@article{
author = { Sonwanevikas V, Shahane N. M },
title = { Efficient Text Segmentation for Born-Digital Compound Images },
journal = { Emerging Trends in Computing },
issue_date = { March 2017 },
volume = { ETC2016 },
number = { 2 },
month = { March },
year = { 2017 },
issn = { 0975-8887 },
pages = { 26-30 },
numpages = { 5 },
url = { /proceedings/etc2016/number2/27311-6264/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Proceeding Article
%1 Emerging Trends in Computing
%A Sonwanevikas V
%A Shahane N. M
%T Efficient Text Segmentation for Born-Digital Compound Images
%J Emerging Trends in Computing
%@ 0975-8887
%V ETC2016
%N 2
%P 26-30
%D 2017
%I International Journal of Computer Applications

Abstract

Images are important information carriers and are often used in email messages and web pages to convey textual information. In a born-digital compound image (BDCI), text and graphics/pictures are combined on a digital device; such images have distinct characteristics, notably low resolution (convenient for online transmission and on-screen display) and text that is created digitally on the image. Text extracted from BDCIs can be used in many applications, such as retrieving web content, improving indexing, enhancing content accessibility, and content filtering. Distinguishing text in BDCIs is difficult because text appears in various styles (i.e., orientation, size, and colour), some neighbouring texts are connected, and some text characters are superimposed on pictorial regions, which may lead to misclassification. Researchers have proposed many methods that commonly assume character-level or block-based objects to separate text from compound images, but these methods fail to extract reliable features for detecting all text and for identifying connected components. To address these issues, two novel and efficient algorithms, Local Image Activity Measure (LIAM) and Scale and Orientation Invariant Grouping (SOIG), are proposed to assemble separated characters into Textual Connected Components (TCCs). These algorithms are based on the distribution of pixel variations and the mean intrastring distance, and aim to precisely segment textual regions from BDCIs.
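
The abstract does not define LIAM, so the following is only a minimal sketch of one plausible block-level activity measure based on pixel variations. It assumes a grayscale image stored as a list of rows and scores each block by the sum of absolute differences between adjacent pixels. The function names, the block size, and the threshold are illustrative assumptions and are not taken from the paper.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of intensities (0..255)


def block_activity(img: Image, top: int, left: int, size: int) -> int:
    """Sum of absolute differences between horizontally and vertically
    adjacent pixels inside one block (an assumed activity measure)."""
    bottom = min(top + size, len(img))
    right = min(left + size, len(img[0]))
    total = 0
    for r in range(top, bottom):
        for c in range(left, right):
            if c + 1 < right:   # horizontal neighbour inside the block
                total += abs(img[r][c] - img[r][c + 1])
            if r + 1 < bottom:  # vertical neighbour inside the block
                total += abs(img[r][c] - img[r + 1][c])
    return total


def activity_map(img: Image, size: int = 4) -> List[List[int]]:
    """Activity value for every non-overlapping size x size block."""
    return [
        [block_activity(img, top, left, size)
         for left in range(0, len(img[0]), size)]
        for top in range(0, len(img), size)
    ]


def text_block_mask(img: Image, size: int = 4,
                    threshold: int = 500) -> List[List[bool]]:
    """Mark blocks whose activity reaches a hypothetical threshold as
    candidate text blocks; a real system would tune or learn this value."""
    return [[a >= threshold for a in row] for row in activity_map(img, size)]


if __name__ == "__main__":
    # Left half is flat background, right half has sharp stripes
    # similar to character strokes.
    sample = [[128] * 4 + [0, 255, 0, 255] for _ in range(4)]
    print(activity_map(sample, 4))     # [[0, 3060]]
    print(text_block_mask(sample, 4))  # [[False, True]]
```

Flat regions score zero while sharp-edged, stroke-like regions score high, which is the intuition behind using pixel variation to separate text from smooth pictorial content. Grouping the detected characters by mean intrastring distance, as SOIG is described as doing, is not covered by this sketch.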

Index Terms

Computer Science

Information Sciences

Keywords

Born-digital Compound Image, Text Segmentation, Mean Intrastring Distance