Research Article

A Script Independent Technique for Extraction of Characters from Handwritten Word Images

by Ram Sarkar, Samir Malakar, Nibaran Das, Subhadip Basu, Mita Nasipuri
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 1 - Number 23
Year of Publication: 2010
DOI: 10.5120/530-693

Ram Sarkar, Samir Malakar, Nibaran Das, Subhadip Basu, Mita Nasipuri. A Script Independent Technique for Extraction of Characters from Handwritten Word Images. International Journal of Computer Applications. 1, 23 (February 2010), 83-88. DOI=10.5120/530-693

@article{ 10.5120/530-693,
author = { Ram Sarkar and Samir Malakar and Nibaran Das and Subhadip Basu and Mita Nasipuri },
title = { A Script Independent Technique for Extraction of Characters from Handwritten Word Images },
journal = { International Journal of Computer Applications },
issue_date = { February 2010 },
volume = { 1 },
number = { 23 },
month = { February },
year = { 2010 },
issn = { 0975-8887 },
pages = { 83-88 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume1/number23/530-693/ },
doi = { 10.5120/530-693 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T19:48:08.052247+05:30
%A Ram Sarkar
%A Samir Malakar
%A Nibaran Das
%A Subhadip Basu
%A Mita Nasipuri
%T A Script Independent Technique for Extraction of Characters from Handwritten Word Images
%J International Journal of Computer Applications
%@ 0975-8887
%V 1
%N 23
%P 83-88
%D 2010
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper reports a script-independent technique for segmenting handwritten word images into characters. Word-to-character segmentation is an important preprocessing step in optical character recognition. In handwritten text, however, touching characters reduce the accuracy of segmentation. The technique is tested on handwritten words in four scripts: Bangla, Devanagari, Gurmukhi and Syloti. All four scripts are characterized by a distinct line, called the headline or Matra, that runs along the top of most of the characters forming a word. Unlike in English script, the characters of these handwritten scripts and their components often encircle the main character, which makes conventional segmentation methodologies inapplicable. The technique uses two fuzzy features: one to identify the Matra region and one to identify potential segmentation points. In experiments on a sample of 400 handwritten word images covering the four scripts, the technique achieves success rates of 95.41%, 93.61%, 91.23% and 92.37% for Bangla, Devanagari, Gurmukhi and Syloti respectively.
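
The abstract does not define the two fuzzy features, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a binarized word image (1 = ink) and two hypothetical helpers: `matra_membership` scores each row by its ink density relative to the densest row, and `candidate_cut_columns` lists columns with little or no ink below the detected Matra rows. The function names, the membership definition and the 0.8 threshold are all illustrative assumptions.

```python
def matra_membership(image):
    """Per-row membership in [0, 1] for the headline (Matra) region.

    image: list of rows, each a list of 0/1 ints (1 = ink).
    Assumption: membership is the row's ink count divided by the largest
    row ink count. This is an illustrative stand-in, not the paper's feature.
    """
    row_ink = [sum(row) for row in image]
    peak = max(row_ink, default=0)
    if peak == 0:
        return [0.0 for _ in image]
    return [count / peak for count in row_ink]


def matra_rows(image, threshold=0.8):
    """Indices of rows whose membership reaches the (assumed) threshold."""
    return [r for r, mu in enumerate(matra_membership(image)) if mu >= threshold]


def candidate_cut_columns(image, matra, max_ink=0):
    """Columns whose ink count below the Matra rows is at most max_ink."""
    if not image:
        return []
    first_body_row = max(matra) + 1 if matra else 0
    body = image[first_body_row:]
    width = len(image[0])
    return [c for c in range(width) if sum(row[c] for row in body) <= max_ink]


# Toy word: a headline in row 1 joining two character bodies.
WORD = [
    "0000000000",
    "1111111111",
    "0100001000",
    "0110001100",
    "0100001000",
]
IMAGE = [[int(ch) for ch in row] for row in WORD]

headline = matra_rows(IMAGE)
print("Matra rows:", headline)
print("Candidate cut columns:", candidate_cut_columns(IMAGE, headline))
```

A crisp threshold is used here only for brevity. A fuzzy formulation would keep the graded membership values and combine them with other evidence before deciding where to cut.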

Index Terms

Computer Science
Information Sciences

Keywords

Character segmentation, handwritten word images, script independent technique, fuzzy features