Denoising of Document Images using Discrete Curvelet Transform for OCR Applications

International Journal of Computer Applications
© 2012 by IJCA Journal
Volume 55 - Number 10
Year of Publication: 2012
C. Patvardhan
A. K. Verma
C. V. Lakshmi

This paper presents a curvelet transform based scheme for denoising and binarizing document images corrupted by white Gaussian noise and impulse noise. The scheme exploits the sparse representation and edge-preserving properties of the curvelet transform. Impulse noise is introduced during document scanning or after binarization of scanned document images, while white Gaussian noise corrupts document images during transmission. Either type of noise, or a combination of the two, can severely degrade the performance of any OCR system. In the proposed scheme, the curvelet coefficients at each scale are thresholded with a level-dependent threshold, computed by a modified sqtwolog method (universal threshold) from an estimate of the noise standard deviation. The noisy curvelet coefficients are thresholded by hard thresholding. After curvelet based denoising, the image is binarized using the global Otsu method and post-processed to smooth the text boundaries and remove isolated pixels for better OCR performance. The curvelet based scheme is compared with a wavelet transform based scheme and a modified wavelet based scheme with edge preservation. The results show that the curvelet based scheme performs better on images containing Gaussian noise, impulse noise, and a combination of both.
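The thresholding and binarization steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard universal (sqtwolog) threshold, sigma * sqrt(2 ln N), since the abstract does not describe the modification, and it assumes a median-absolute-deviation noise estimate, which the abstract does not specify. The curvelet transform itself is omitted: `coeffs` stands for the coefficients of one scale, and all function names are hypothetical.

```python
import math
import statistics


def estimate_sigma(coeffs):
    """Robust noise estimate: median absolute deviation / 0.6745 (assumed estimator)."""
    med = statistics.median(coeffs)
    mad = statistics.median(abs(c - med) for c in coeffs)
    return mad / 0.6745


def universal_threshold(coeffs):
    """Standard universal (sqtwolog) threshold for the coefficients of one scale."""
    return estimate_sigma(coeffs) * math.sqrt(2.0 * math.log(len(coeffs)))


def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t; set the rest to zero."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


def otsu_threshold(pixels, levels=256):
    """Global Otsu threshold: the gray level maximizing between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of gray values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(pixels, t):
    """Map pixels above the threshold to 1 (background) and the rest to 0 (text)."""
    return [1 if p > t else 0 for p in pixels]


# Toy data: one large coefficient (signal) among small ones (noise).
coeffs = [0.1, -0.2, 0.05, -0.1, 0.15, 5.0, -0.05, 0.0]
denoised = hard_threshold(coeffs, universal_threshold(coeffs))

# Toy image: dark text pixels (10) and bright background pixels (200).
pixels = [10] * 50 + [200] * 50
binary = binarize(pixels, otsu_threshold(pixels))
```

In a full pipeline the threshold would be recomputed for each curvelet scale, which is what makes it level-dependent, and the inverse transform would be applied before Otsu binarization.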


