Hybrid Approach for Annotating Unstructured Document

Meghana.h.j; Pushpa Ravikumar

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 120 - Number 13

Year of Publication: 2015

Authors: Meghana.h.j, Pushpa Ravikumar

DOI: 10.5120/21291-4270

Meghana.h.j, Pushpa Ravikumar. Hybrid Approach for Annotating Unstructured Document. International Journal of Computer Applications. 120, 13 (June 2015), 38-41. DOI=10.5120/21291-4270

@article{ 10.5120/21291-4270,
  author = { Meghana.h.j and Pushpa Ravikumar },
  title = { Hybrid Approach for Annotating Unstructured Document },
  journal = { International Journal of Computer Applications },
  issue_date = { June 2015 },
  volume = { 120 },
  number = { 13 },
  month = { June },
  year = { 2015 },
  issn = { 0975-8887 },
  pages = { 38-41 },
  numpages = {9},
  url = { https://ijcaonline.org/archives/volume120/number13/21291-4270/ },
  doi = { 10.5120/21291-4270 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T23:06:10.043657+05:30
%A Meghana.h.j
%A Pushpa Ravikumar
%T Hybrid Approach for Annotating Unstructured Document
%J International Journal of Computer Applications
%@ 0975-8887
%V 120
%N 13
%P 38-41
%D 2015
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Annotation is the process of adding information to a document so that information can later be extracted from it. Many organizations nowadays generate large amounts of data, which is present in textual form. Such collections of text documents contain a large amount of structured information that is completely hidden in the unstructured text. Information extraction algorithms are costly because they work directly on top of the text, and they do not provide the necessary structured information. In this paper, we present a method that generates structured attributes by identifying the documents that contain the information of interest; this information is later useful for querying the database. The major contribution of this paper is an algorithm that identifies the structured attributes present in a document by combining the query workload with the content of the text document. Our experimental results show that this technique gives better results than methods that rely only on the content of the document or only on the query workload.
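
The idea of combining document content with the query workload can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. The scoring functions, the indicator-term sets, the weight `alpha`, and the sample data are all assumptions made for illustration. Each candidate attribute receives a content score (how many of its indicator terms appear in the document) and a query score (how often the workload asks for it), and the two are mixed linearly.

```python
import re


def content_score(doc_tokens, indicator_terms):
    """Share of an attribute's indicator terms that appear in the document."""
    if not indicator_terms:
        return 0.0
    present = sum(1 for term in indicator_terms if term in doc_tokens)
    return present / len(indicator_terms)


def query_score(attribute, workload):
    """Share of workload queries that reference the attribute."""
    if not workload:
        return 0.0
    hits = sum(1 for query in workload if attribute in query)
    return hits / len(workload)


def rank_attributes(text, attributes, workload, alpha=0.5):
    """Rank attributes by a weighted mix of content and query evidence.

    alpha is a hypothetical weight: 1.0 uses content only, 0.0 uses the
    query workload only.
    """
    doc_tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    scored = []
    for name, terms in attributes.items():
        cv = content_score(doc_tokens, terms)
        qv = query_score(name, workload)
        scored.append((name, alpha * cv + (1 - alpha) * qv))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical attributes, each with terms that hint at its presence.
attributes = {
    "location": {"city", "street", "region"},
    "price": {"cost", "usd", "price"},
    "date": {"january", "monday", "deadline"},
}
# Hypothetical workload: each query is the set of attributes it filters on.
workload = [{"location", "price"}, {"price"}, {"location"}, {"location", "date"}]
text = "Apartment near the city centre, quiet street, price 900 usd per month"

ranking = rank_attributes(text, attributes, workload)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Setting `alpha` to 1.0 or 0.0 reduces the sketch to a content-only or a workload-only baseline, the two kinds of method the abstract compares against.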

Index Terms

Computer Science

Information Sciences

Keywords

Annotation, CADS form, CV and QV