
Preprocessing Challenges in Document Image Analysis

IJCA Proceedings on National Conference on Recent Trends in Computing
© 2012 by IJCA Journal
NCRTC - Number 9
Year of Publication: 2012
Authors:
Keshao D. Kalaskar
Mahendra P. Dhore

Keshao D Kalaskar and Mahendra P Dhore. Article: Preprocessing Challenges in Document Image Analysis. IJCA Proceedings on National Conference on Recent Trends in Computing NCRTC(9):25-30, May 2012. Full text available. BibTeX

@article{key:article,
	author = {Keshao D. Kalaskar and Mahendra P. Dhore},
	title = {Article: Preprocessing Challenges in Document Image Analysis},
	journal = {IJCA Proceedings on National Conference on Recent Trends in Computing},
	year = {2012},
	volume = {NCRTC},
	number = {9},
	pages = {25-30},
	month = {May},
	note = {Full text available}
}

Abstract

Document Image Analysis (DIA) is the subfield of digital image processing that aims to convert document images into symbolic form for modification, storage, retrieval, reuse, and transmission. It supports the transition from bookshelves and filing cabinets to a paperless, and perhaps even wireless, world. Preprocessing is the first stage of document image analysis and involves representation, noise reduction, binarization, skew estimation and detection, zoning, and character segmentation. This paper focuses on the major challenges faced when preprocessing document images for document image analysis.
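
To make one of the preprocessing stages named in the abstract concrete, the following is a minimal sketch of binarization. It assumes a global threshold chosen by Otsu's method, which is a common choice but is not stated in the abstract as the authors' technique. The function names `otsu_threshold` and `binarize`, and the representation of an image as a list of rows of 8-bit gray levels, are assumptions made for illustration only.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0-255) that maximises between-class variance.

    `pixels` is a flat, non-empty list of integers in the range 0..255.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels with gray level <= t
    sum_bg = 0    # sum of gray levels of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no pixels at or below t yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is at or below t
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image):
    """Map a grayscale image (list of rows) to 1 for dark ink, 0 for background.

    Assumes dark text on a light page, so pixels at or below the
    threshold are treated as ink.
    """
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[1 if p <= t else 0 for p in row] for row in image]


# Hypothetical 2x3 page fragment: three dark pixels, three light pixels.
page = [[30, 40, 220],
        [35, 210, 230]]
binary = binarize(page)
```

A single global threshold like this tends to work only when illumination and paper tone are uniform across the page. Degraded or unevenly lit documents usually call for locally adaptive thresholds, which is one reason binarization is listed among the preprocessing challenges.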
