Research Article

Discrimination between Printed and Handwritten Text in Documents

Published in 2010 by M.S. Shirdhonkar, Manesh B. Kokare
Recent Trends in Image Processing and Pattern Recognition
Foundation of Computer Science USA
RTIPPR - Number 3
Authors: M.S. Shirdhonkar, Manesh B. Kokare

M.S. Shirdhonkar, Manesh B. Kokare. Discrimination between Printed and Handwritten Text in Documents. Recent Trends in Image Processing and Pattern Recognition. RTIPPR, 3 (2010), 131-134.

@article{
author = { M.S. Shirdhonkar and Manesh B. Kokare },
title = { Discrimination between Printed and Handwritten Text in Documents },
journal = { Recent Trends in Image Processing and Pattern Recognition },
issue_date = { 2010 },
volume = { RTIPPR },
number = { 3 },
year = { 2010 },
issn = 0975-8887,
pages = { 131-134 },
numpages = 4,
url = { /specialissues/rtippr/number3/987-110/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Special Issue Article
%1 Recent Trends in Image Processing and Pattern Recognition
%A M.S. Shirdhonkar
%A Manesh B. Kokare
%T Discrimination between Printed and Handwritten Text in Documents
%J Recent Trends in Image Processing and Pattern Recognition
%@ 0975-8887
%V RTIPPR
%N 3
%P 131-134
%D 2010
%I International Journal of Computer Applications
Abstract

Recognition techniques for printed and handwritten text in scanned documents differ significantly. In this paper, we propose a method to automatically identify the signature in scanned document images, which makes it possible to retrieve document images based on the signature. A simple region growing algorithm segments the document into a number of patches. A patch is composed of many closely located components, where a component is a single piece of connected foreground pixels (say, under 8-connectivity). We extract the state features of all the patches to identify the signature in the document images. A label for each segmented patch is inferred using a neural network (NN) model and a support vector machine (SVM). These models are flexible enough to include signature as a type of handwriting and isolate it from machine print. Our experimental results show that the classification rate of the SVM is superior to that of the NN.
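
The segmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the region growing criterion, so the sketch assumes that components are merged into a patch when their bounding boxes lie within a hypothetical `max_gap` pixels of each other. The function names, the binary list-of-lists image format, and the threshold are all assumptions made for illustration. Feature extraction and the NN/SVM labeling are not shown.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of the 8-connected
    foreground components of a binary image (1 = foreground), in scan order."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over the 8 neighbours of each pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def box_gap(a, b):
    """Coordinate distance between the closest edges of two boxes
    (0 when the boxes overlap in both directions)."""
    dy = max(a[0] - b[2], b[0] - a[2], 0)
    dx = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dy, dx)


def group_into_patches(boxes, max_gap):
    """Grow each patch by absorbing every component within max_gap of any
    component already in the patch (an assumed merging criterion)."""
    unassigned = list(range(len(boxes)))
    patches = []
    while unassigned:
        patch = [unassigned.pop(0)]
        grew = True
        while grew:
            grew = False
            for i in unassigned[:]:
                if any(box_gap(boxes[i], boxes[j]) <= max_gap for j in patch):
                    patch.append(i)
                    unassigned.remove(i)
                    grew = True
        patches.append(sorted(patch))
    return patches


# Toy page: two nearby components on the left, one isolated on the right.
page = [
    [1, 1, 0, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
]
boxes = connected_components(page)
patches = group_into_patches(boxes, max_gap=2)
print(boxes)
print(patches)
```

In this toy page, the two diagonal pixels on the right form a single component only because 8-connectivity is used; under 4-connectivity they would be two separate components. Each resulting patch would then be described by a feature vector and passed to a classifier.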

Index Terms

Computer Science
Information Sciences

Keywords

Document analysis, text identification, machine vision, signature detection, retrieval