
Parallel Implementation of Niblack’s Binarization Approach on CUDA

International Journal of Computer Applications
© 2011 by IJCA Journal
Number 1 - Article 1
Year of Publication: 2011
Brij Mohan Singh
Rahul Sharma
Ankush Mittal
Debashish Ghosh

Brij Mohan Singh, Rahul Sharma, Ankush Mittal and Debashish Ghosh. Article: Parallel Implementation of Niblacks Binarization Approach on CUDA. International Journal of Computer Applications 32(2):22-27, October 2011. Full text available. BibTeX

	author = {Brij Mohan Singh and Rahul Sharma and Ankush Mittal and Debashish Ghosh},
	title = {Article: Parallel Implementation of Niblacks Binarization Approach on CUDA},
	journal = {International Journal of Computer Applications},
	year = {2011},
	volume = {32},
	number = {2},
	pages = {22-27},
	month = {October},
	note = {Full text available}


Image processing and pattern recognition algorithms are slow to execute on a single-core processor. Graphics Processing Units (GPUs) have become popular because of their speed, programmability, low cost, and large number of built-in execution cores. Many researchers have begun using the GPU as a co-processor alongside a single-core computer system to speed up the execution of algorithms. The main goal of this work is to make binarization faster on a GPU so that large numbers of degraded document images can be recognized. The parallel implementation in this paper focuses on the well-known Niblack’s binarization approach for Optical Character Recognition (OCR) systems, since binarization is one of the most fundamental and important problems in computer vision and pattern recognition. Our work makes extensive use of the highly multithreaded architecture of a multi-core GPU. Efficient use of shared memory is required to optimize parallel reduction in Compute Unified Device Architecture (CUDA). Experimental results show that the parallel implementation achieved an average speedup of 20.84x over the serial implementation when running on a GeForce 9500 GT GPU with 32 cores. Niblack’s binarization method is also evaluated using the PSNR, F-measure, NRM, and IND measures.
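For readers unfamiliar with the method, the following is a minimal serial sketch of Niblack's local thresholding rule, T(x, y) = m(x, y) + k * s(x, y), where m and s are the mean and standard deviation of the gray levels in a window centered on the pixel. It is not the authors' CUDA implementation. The function name `niblack_binarize`, the window size of 3, the value k = -0.2, and the border handling (clipping the window at the image edge) are all assumptions chosen for illustration; the abstract does not state them.

```python
import math


def niblack_binarize(image, window=3, k=-0.2):
    """Binarize a grayscale image given as a list of rows of ints (0..255).

    Hypothetical helper for illustration. Each pixel is compared with its
    own local threshold T = mean + k * std, computed over a window that is
    clipped at the image borders (an assumed border policy).
    """
    height = len(image)
    width = len(image[0])
    half = window // 2
    out = [[0] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            # Accumulate the sum and sum of squares over the local window.
            total = 0.0
            total_sq = 0.0
            count = 0
            for yy in range(max(0, y - half), min(height, y + half + 1)):
                for xx in range(max(0, x - half), min(width, x + half + 1)):
                    v = image[yy][xx]
                    total += v
                    total_sq += v * v
                    count += 1

            mean = total / count
            # Guard against tiny negative values from floating-point rounding.
            variance = max(total_sq / count - mean * mean, 0.0)
            threshold = mean + k * math.sqrt(variance)

            # Pixels brighter than the threshold become background (255),
            # the rest become foreground (0).
            out[y][x] = 255 if image[y][x] > threshold else 0

    return out


if __name__ == "__main__":
    page = [
        [200, 200, 200],
        [200, 50, 200],
        [200, 200, 200],
    ]
    for row in niblack_binarize(page):
        print(row)
```

Each output pixel depends only on the input pixels in its own window, so the two outer loops carry no dependency between iterations. This independence is what makes a one-thread-per-pixel mapping on a GPU plausible, with the window sums being the kind of accumulation that a shared-memory reduction can speed up.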

