Research Article

Skew Correction Function of OCR: Stroke-Whitespace based Algorithmic Approach

by Mohammad Abu Obaida, Tanay Kumar Roy, Md. Abu Horaira, Md. Jakir Hossain
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 28 - Number 8
Year of Publication: 2011
Authors: Mohammad Abu Obaida, Tanay Kumar Roy, Md. Abu Horaira, Md. Jakir Hossain
DOI: 10.5120/3409-4759

Mohammad Abu Obaida, Tanay Kumar Roy, Md. Abu Horaira, Md. Jakir Hossain. Skew Correction Function of OCR: Stroke-Whitespace based Algorithmic Approach. International Journal of Computer Applications. 28, 8 (August 2011), 7-12. DOI=10.5120/3409-4759

@article{ 10.5120/3409-4759,
author = { Mohammad Abu Obaida and Tanay Kumar Roy and Md. Abu Horaira and Md. Jakir Hossain },
title = { Skew Correction Function of OCR: Stroke-Whitespace based Algorithmic Approach },
journal = { International Journal of Computer Applications },
issue_date = { August 2011 },
volume = { 28 },
number = { 8 },
month = { August },
year = { 2011 },
issn = { 0975-8887 },
pages = { 7-12 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume28/number8/3410-4759/ },
doi = { 10.5120/3409-4759 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:14:12.989105+05:30
%A Mohammad Abu Obaida
%A Tanay Kumar Roy
%A Md. Abu Horaira
%A Md. Jakir Hossain
%T Skew Correction Function of OCR: Stroke-Whitespace based Algorithmic Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 28
%N 8
%P 7-12
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

As technology evolves, the emergence of Optical Character Recognition (OCR) for both printed and handwritten documents in any language is to be expected. In developing OCR for languages such as Hindi, Bengali, and Marathi, which are among the 15 most spoken languages in the world, skew correction remains a challenging task because little research has been carried out in this area. In this paper, we confront this challenge and describe a stroke-whitespace based algorithmic approach that uses the horizontal projection technique to correct the skew of writing in these languages precisely. The paper proposes a simple and effective process, named the OJ method, that corrects the skew of images for any degree of rotation. In essence, the paper deals with images that are rotated by 180°, using the Stroke-Whitespace distance method.
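For readers unfamiliar with the horizontal projection technique mentioned in the abstract, the following is a minimal, generic sketch of projection-profile skew estimation. It is not the authors' OJ method and does not implement the Stroke-Whitespace distance step, whose details are not given here. The function names, the search range (`max_angle`, `step`), the sum-of-squares score, and the synthetic test page are all assumptions made for illustration.

```python
import math


def horizontal_projection(ink_pixels, angle_deg):
    """Count ink pixels per row after rotating the coordinates by angle_deg."""
    theta = math.radians(angle_deg)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    profile = {}
    for x, y in ink_pixels:
        row = round(y * cos_t - x * sin_t)  # row index in the rotated frame
        profile[row] = profile.get(row, 0) + 1
    return profile


def sharpness(profile):
    """Sum of squared row counts: large when ink concentrates in few rows."""
    return sum(count * count for count in profile.values())


def estimate_skew(image, max_angle=15.0, step=0.5):
    """Return the candidate angle (degrees) whose projection is sharpest.

    `image` is a list of rows of 0/1 values (1 = ink). The search range and
    step size are illustrative assumptions, not values from the paper.
    """
    ink_pixels = [(x, y) for y, row in enumerate(image)
                  for x, value in enumerate(row) if value]
    steps = int(round(2 * max_angle / step))
    candidates = [-max_angle + i * step for i in range(steps + 1)]
    return max(candidates,
               key=lambda a: sharpness(horizontal_projection(ink_pixels, a)))


def synthetic_page(skew_deg, width=200, height=120, baselines=(20, 50, 80)):
    """Hypothetical test input: thin 'text lines' drawn at a known skew."""
    image = [[0] * width for _ in range(height)]
    slope = math.tan(math.radians(skew_deg))
    for y0 in baselines:
        for x in range(width):
            y = round(y0 + x * slope)
            if 0 <= y < height:
                image[y][x] = 1
    return image
```

On this toy input, `estimate_skew(synthetic_page(5.0))` should land near the 5 degrees used to draw the lines. A score of this kind cannot distinguish an upright page from one rotated by 180°, because turning the page upside down only reverses the order of the rows in the profile; that case needs an additional cue, which is the role the abstract assigns to the Stroke-Whitespace distance method.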

Index Terms

Computer Science
Information Sciences

Keywords

OCR, Skew Correction, Text rotation