Research Article

An Imaging Technique for Retrieval of Lost Content in Damaged Documents

by Neelam Bhardwaj, Suneeta Agarwal
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 104 - Number 5
Year of Publication: 2014
Authors: Neelam Bhardwaj, Suneeta Agarwal
10.5120/18195-9113

Neelam Bhardwaj, Suneeta Agarwal. An Imaging Technique for Retrieval of Lost Content in Damaged Documents. International Journal of Computer Applications. 104, 5 (October 2014), 1-5. DOI=10.5120/18195-9113

@article{ 10.5120/18195-9113,
author = { Neelam Bhardwaj and Suneeta Agarwal },
title = { An Imaging Technique for Retrieval of Lost Content in Damaged Documents },
journal = { International Journal of Computer Applications },
issue_date = { October 2014 },
volume = { 104 },
number = { 5 },
month = { October },
year = { 2014 },
issn = { 0975-8887 },
pages = { 1-5 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume104/number5/18195-9113/ },
doi = { 10.5120/18195-9113 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:35:20.359232+05:30
%A Neelam Bhardwaj
%A Suneeta Agarwal
%T An Imaging Technique for Retrieval of Lost Content in Damaged Documents
%J International Journal of Computer Applications
%@ 0975-8887
%V 104
%N 5
%P 1-5
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Useful content in documents is often lost or hidden, intentionally or accidentally, for reasons such as whitener, pasted-over paper, ink spreading, fading, or dirt. These damaged documents may be important because they record a research result, a historic event, a pact, or some other significant piece of information. Retrieval of such lost content remains a largely unexplored research area. The available approaches only guess at the lost content by exploring the information that remains intact, and no approach found so far can claim that what it retrieves is exactly the original. We propose a new approach and have developed an experimental setup that images the damaged document by sensing the light that passes through it, instead of the light reflected from it, and then applies OCR to the acquired images to retrieve the lost or hidden content. Experiments on various test documents gave good results. The scheme applies only to documents that are physically available, but it ensures that the retrieved content is the original.
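The step between acquiring a transmitted-light image and running OCR can be illustrated with a small sketch. The abstract does not say how the acquired images are prepared, so everything below is an assumption made for illustration: the image is taken to be an 8-bit grayscale grid in which ink blocks more light than bare paper, and Otsu thresholding (a standard binarization method, not one named by the authors) separates ink from background. The function names `otsu_threshold` and `binarize` are hypothetical, and the OCR call itself is left out.

```python
def otsu_threshold(pixels):
    """Return the 8-bit level that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_level, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for level in range(256):
        weight_dark += hist[level]
        if weight_dark == 0:
            continue  # no pixels at or below this level yet
        weight_light = total - weight_dark
        if weight_light == 0:
            break  # every pixel is already in the dark class
        sum_dark += level * hist[level]
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_variance, best_level = variance, level
    return best_level


def binarize(image):
    """Mark pixels at or below the Otsu level as ink (1), the rest as paper (0).

    Assumption: in a transmitted-light image, ink appears darker than paper.
    """
    flat = [value for row in image for value in row]
    level = otsu_threshold(flat)
    return [[1 if value <= level else 0 for value in row] for row in image]


# Toy 3x4 "transmitted-light" image: three dark (ink) pixels on bright paper.
image = [
    [200, 205, 198, 202],
    [201,  60,  55, 199],
    [203,  58, 204, 200],
]
mask = binarize(image)
for row in mask:
    print(row)
```

In a real setup the resulting binary mask would be handed to an OCR engine. Whether thresholding is needed at all depends on the acquired images, which the abstract does not describe.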

Index Terms

Computer Science
Information Sciences

Keywords

Imaging, damaged document, content retrieval, OCR.