Research Article

Level Set Methodology for Tamil Document Image Binarization and Segmentation

by S. Karthik, Hemanth.V.K, V. Balaji, K. P. Soman
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 39 - Number 9
Year of Publication: 2012
Authors: S. Karthik, Hemanth.V.K, V. Balaji, K. P. Soman
10.5120/4846-7117

S. Karthik, Hemanth.V.K, V. Balaji, K. P. Soman. Level Set Methodology for Tamil Document Image Binarization and Segmentation. International Journal of Computer Applications. 39, 9 (February 2012), 7-12. DOI=10.5120/4846-7117

@article{ 10.5120/4846-7117,
author = { S. Karthik and Hemanth.V.K and V. Balaji and K. P. Soman },
title = { Level Set Methodology for Tamil Document Image Binarization and Segmentation },
journal = { International Journal of Computer Applications },
issue_date = { February 2012 },
volume = { 39 },
number = { 9 },
month = { February },
year = { 2012 },
issn = { 0975-8887 },
pages = { 7-12 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume39/number9/4846-7117/ },
doi = { 10.5120/4846-7117 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:25:59.388889+05:30
%A S. Karthik
%A Hemanth.V.K
%A V. Balaji
%A K. P. Soman
%T Level Set Methodology for Tamil Document Image Binarization and Segmentation
%J International Journal of Computer Applications
%@ 0975-8887
%V 39
%N 9
%P 7-12
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The most challenging task in OCR is segmenting the characters properly. The accuracy of segmentation depends on the quality of the binarization technique applied. Binarization is the process of setting all intensity values greater than some threshold value to “on” and the rest to “off”. It converts the document image into a binary image, extracting the text and eliminating the background. This process also removes noise. Its output is used as the input to the image segmentation process. Conventionally, separate methods are used for binarization and segmentation. In this paper we investigate the use of two recently introduced convex optimization methods, the selective local/global segmentation (SLGS) algorithm [16] and the fast global minimization (FGM) algorithm [15], for simultaneous binarization and segmentation. Of the two methods tried, one is found to be suitable for the OCR task. The FGM algorithm provides an average accuracy of 89.97% for Tamil character segmentation.
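The thresholding definition given in the abstract can be illustrated with a minimal sketch. This is plain global thresholding, not the level set (SLGS or FGM) approach the paper investigates; the function name `binarize`, the sample `patch`, and the threshold of 128 are all assumptions chosen for illustration.

```python
def binarize(image, threshold):
    """Return a binary image: 1 ("on") where intensity > threshold, else 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


# Hypothetical 3x4 grayscale patch with intensities in 0..255.
patch = [
    [12, 200, 198, 15],
    [10, 220, 30, 14],
    [11, 210, 205, 13],
]

# The threshold of 128 is an arbitrary illustrative choice.
binary = binarize(patch, threshold=128)

# Show "on" pixels as '#' and "off" pixels as '.'.
for row in binary:
    print("".join("#" if value else "." for value in row))
```

A single fixed threshold like this is sensitive to uneven illumination and noise, which is one reason more elaborate binarization methods are studied.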

References
  1. J. Ohya, A. Shio, and S. Akamatsu, “Recognizing characters in scene images”, IEEE Trans. Pattern Anal. Mach. Intell., 16(2), 1994, pp. 214-220.
  2. Y. Zhong, K. Karu, and A. K. Jain, “Locating text in complex color images”, Proc. of 3rd Int. Conf. Document Analysis and Recognition, vol. 1, 1995, pp. 146-149.
  3. O. D. Trier and T. Taxt, “Evaluation of binarization methods for document images”, IEEE Trans. Pattern Anal. Machine Intell., vol. 17, Mar. 1995, pp. 312-315.
  4. A.T. Abak, U. Baris, and B. Sankur, “The Performance Evaluation of Thresholding Algorithms for Optical Character Recognition”, ICDAR 97, Ulm, Germany, 1997, pp. 697-700.
  5. M. Sezgin, “Survey over image thresholding techniques and quantitative performance evaluation”, Journal of electronic imaging, 13, 146, 2004, doi:10.1117/1.1631315.
  6. Pavlos Stathis, Ergina Kavallieratou, Nikos Papamarkos, “An Evaluation Technique for Binarization Algorithms”, Journal of Universal Computer Science, vol. 14, no. 18, 2008, pp. 3011-3030.
  7. W. Niblack, “An Introduction to Image Processing”, Prentice-Hall, Englewood Cliffs, NJ, 1986, pp. 115-116.
  8. J. Sauvola and M. Pietikäinen, “Adaptive document image binarization”, Pattern Recogn. 33, 2000, pp. 225-236.
  9. J. N. Kapur, P. K. Sahoo, and A. K. C. Wong, “A New Method for Gray-Level Picture Thresholding Using the Entropy of the Histogram,” Computer Vision, Graphics and Image Processing 29, Mar. 1985, pp.273-285.
  10. P. K. Loo and C. L. Tan. “Adaptive Region Growing Color Segmentation for Text Using Irregular Pyramid”. Document Analysis Systems VI Lecture Notes in Computer Science, Volume 3163/2004, 2004, pp. 103-106, DOI: 10.1007/978-3-540-28640-0-25.
  11. C. Fung, R. Chamchong, “A Review of Evaluation of Optimal Binarization Technique for Character Segmentation in Historical Manuscripts”, 3rd Int. Conf. Knowledge Discovery and Data Mining, 2010, pp. 236-240.
  12. N. Dinh, J. Park, G. Lee, “Korean Text Detection and Binarization in Color Signboards”, Int. Conf. ALPIT, 2008, pp. 235-240.
  13. P. Stathis, E. Kavallieratou, N. Papamarkos, “An evaluation survey of binarization algorithms on historical documents”, 19th ICPR, 2008, pp. 1-4.
  14. X. Bresson, S. Esedoglu, P. Vandergheynst, J.-P. Thiran, and S. Osher, “Fast Global Minimization of the Active Contour / Snake Model”, Journal of Mathematical Imaging and Vision, 28, 2007, pp. 151-167.
  15. X. Bresson, “A Short Guide on a Fast Global Minimization Algorithm for Active Contour Models”, Energy, pp. 1-19, 2009.
  16. Zhang, K., Zhang, L., Song, H., and Zhou, W. “Active contours with selective local or global segmentation: A new formulation and level set method”, Image and Vision Computing 28(4), 2010, pp. 668-676. Elsevier B.V. doi: 10.1016/j.imavis.2009.10.009.
  17. David Rivest-Hénault, Reza Farrahi Moghaddam, Mohamed Cheriet, “A local linear level set method for the binarization of degraded historical document images”, Springer-Verlag, 2011.
  18. Basura Fernando, Sezer Karaoglu, Alain Trémeau, “Extreme Value Theory Based Text Binarization In Documents and Natural Scenes”, 3rd Int. Conf. Machine Vision, Hong Kong, 2010.
  19. Kichenassamy, S., Kumar, A., Olver, P., Tannenbaum, A., and Yezzi, A. “Gradient flows and geometric active contour models”, Proc. ICCV, Cambridge, 1995.
  20. Caselles, V., Catte, F., Coll, T., and Dibos, F. “A geometric model for active contours”, Numerische Mathematik, 66:1-31, 1993.
Index Terms

Computer Science
Information Sciences

Keywords

Level Set, Active Contours, Binarization, Segmentation