Comparison of Some Text Extraction Methodologies

IJCA Proceedings on National Conference on Recent Trends in Computing
© 2012 by IJCA Journal
NCRTC - Number 5
Year of Publication: 2012
Authors:
V. K. Yeotikar
M. P. Dhore

V K Yeotikar and M P Dhore. Article: Comparison of Some Text Extraction Methodologies. IJCA Proceedings on National Conference on Recent Trends in Computing NCRTC(5):30-33, May 2012. Full text available. BibTeX

@article{key:article,
	author = {V. K. Yeotikar and M. P. Dhore},
	title = {Article: Comparison of Some Text Extraction Methodologies},
	journal = {IJCA Proceedings on National Conference on Recent Trends in Computing},
	year = {2012},
	volume = {NCRTC},
	number = {5},
	pages = {30-33},
	month = {May},
	note = {Full text available}
}

Abstract

In document image analysis, digitized images of printed documents typically consist of a mixture of text, graphics, and image elements. For proper processing and efficient representation, these elements have to be separated. For most applications it is essential to distinguish text from non-text, because text carries most of the information. Text lines may have different orientations or may be curved. Some of the techniques proposed for text string extraction are completely independent of text orientation and can handle text in various font styles and sizes. There are also many fast and efficient methods for extracting graphics and text paragraphs from printed documents. This paper compares some of the text extraction techniques proposed by researchers.
