Text Line Segmentation of Handwritten Documents using Clustering Method based on Thresholding Approach

IJCA Proceedings on National Conference on Advanced Computing and Communications 2012
© 2012 by IJCA Journal
NCACC - Number 1
Year of Publication: 2012
Authors:
M. Ravi Kumar
Nayana N Shetty
B. P. Pragath

Ravi M Kumar, Nayana N Shetty and B P Pragath. Article: Text Line Segmentation of Handwritten Documents using Clustering Method based on Thresholding Approach. IJCA Proceedings on National Conference on Advanced Computing and Communications 2012 NCACC(1):9-12, August 2012. Full text available. BibTeX

@article{key:article,
	author = {M. Ravi Kumar and Nayana N Shetty and B. P. Pragath},
	title = {Article: Text Line Segmentation of Handwritten Documents using Clustering Method based on Thresholding Approach},
	journal = {IJCA Proceedings on National Conference on Advanced Computing and Communications 2012},
	year = {2012},
	volume = {NCACC},
	number = {1},
	pages = {9-12},
	month = {August},
	note = {Full text available}
}

Abstract

Segmentation of text lines in unconstrained handwritten documents is still a challenging task because handwritten text lines are often non-uniformly skewed and curved, and the space between lines is not obvious. In this paper, we propose a text-line segmentation algorithm based on threshold-based clustering. The connected components of the document image are grouped into clusters, from which the text lines are extracted dynamically by coloring each text line.
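
The abstract does not spell out the details of the method, so the following is only an illustrative sketch of the general idea and not the authors' algorithm. It assumes a binary image stored as a list of lists (1 = ink), 8-connectivity, and a hypothetical grouping rule in which a component joins a line when its vertical centroid lies within a threshold of that line's running mean. The function names and the threshold value are assumptions made for illustration.

```python
from collections import deque


def connected_components(image):
    """Collect 8-connected groups of foreground pixels (value 1) by flood fill."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def cluster_into_lines(components, threshold):
    """Group components into lines by vertical centroid (assumed rule).

    Components are visited from top to bottom. A component joins the
    current line if its centroid is within `threshold` rows of the
    line's running mean; otherwise it starts a new line.
    """
    centroids = [sum(y for y, _ in comp) / len(comp) for comp in components]
    order = sorted(range(len(components)), key=lambda i: centroids[i])
    lines, means = [], []
    for i in order:
        if lines and abs(centroids[i] - means[-1]) <= threshold:
            lines[-1].append(i)
            means[-1] = sum(centroids[j] for j in lines[-1]) / len(lines[-1])
        else:
            lines.append([i])
            means.append(centroids[i])
    return lines


def color_lines(image, components, lines):
    """Return a label map: 0 for background, k for pixels of the k-th line."""
    colored = [[0] * len(image[0]) for _ in image]
    for label, line in enumerate(lines, start=1):
        for i in line:
            for y, x in components[i]:
                colored[y][x] = label
    return colored
```

A fixed threshold on vertical centroids only works for roughly horizontal, well-separated lines. Skewed or curved lines, which the abstract names as the hard case, would need a locally adaptive threshold or a different distance measure.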
