Segmentation of Text Lines and Characters in Ancient Tamil Script Documents using Computational Intelligence Techniques

International Journal of Computer Applications
© 2012 by IJCA Journal
Volume 52 - Number 14
Year of Publication: 2012
Authors:
N. Sridevi
P. Subashini
10.5120/8268-1826

N Sridevi and P Subashini. Article: Segmentation of Text Lines and Characters in Ancient Tamil Script Documents using Computational Intelligence Techniques. International Journal of Computer Applications 52(14):7-12, August 2012. Full text available.

BibTeX

@article{key:article,
	author = {N. Sridevi and P. Subashini},
	title = {Article: Segmentation of Text Lines and Characters in Ancient Tamil Script Documents using Computational Intelligence Techniques},
	journal = {International Journal of Computer Applications},
	year = {2012},
	volume = {52},
	number = {14},
	pages = {7-12},
	month = {August},
	note = {Full text available}
}

Abstract

Document image segmentation is one of the critical phases in a handwritten character recognition system, because the accuracy of recognition depends on correct segmentation of individual characters. Segmentation decomposes a document image into text lines, then words, and finally individual characters. Ancient Tamil script documents consist of vowels, consonants and various modifiers, so a suitable segmentation algorithm is required. In existing methods, segmentation of overlapping lines and characters is difficult. To overcome this problem, two methods are proposed, one for line segmentation and another for character segmentation. The first method uses projection profiles and particle swarm optimization (PSO) for line segmentation. The second method uses a combination of connected components and a nearest-neighborhood method to segment the characters. Experimental results show that these methods give better results when compared to other methods.
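
The two building blocks named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it shows only a plain horizontal projection profile for line segmentation and 8-connected component labelling for character candidates, and it omits the PSO and nearest-neighborhood steps that the paper uses to handle overlapping lines and characters. The function names, the `min_ink` threshold, and the toy `page` are assumptions made for illustration.

```python
from collections import deque


def segment_lines(binary, min_ink=1):
    """Return (top, bottom) row ranges of text lines, bottom exclusive.

    binary is a list of rows; a truthy pixel is ink. A row belongs to a
    text line when its horizontal projection (ink count) >= min_ink.
    """
    profile = [sum(1 for px in row if px) for row in binary]
    lines, start = [], None
    for row, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = row                      # a line begins
        elif count < min_ink and start is not None:
            lines.append((start, row))       # a line ends at a blank row
            start = None
    if start is not None:                    # line touching the last row
        lines.append((start, len(profile)))
    return lines


def connected_components(binary):
    """Return bounding boxes (top, left, bottom, right) with exclusive
    bottom/right of 8-connected ink regions, sorted left to right."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r0 in range(height):
        for c0 in range(width):
            if not binary[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            top, left, bottom, right = r0, c0, r0, c0
            while queue:
                r, c = queue.popleft()
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < height and 0 <= nc < width
                                and binary[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            boxes.append((top, left, bottom + 1, right + 1))
    return sorted(boxes, key=lambda box: box[1])


# Toy binarized page: two text lines separated by a blank row.
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
lines = segment_lines(page)
chars = [connected_components(page[top:bottom]) for top, bottom in lines]
```

On this toy page, `lines` is `[(1, 3), (4, 5)]`, and the first line yields two character boxes while the second yields one. A plain projection profile like this fails when adjacent lines overlap vertically, which is the case the abstract says the proposed methods are designed to address.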
