Research Article

Context-free Grammar Learning from Text Document using Sequential Pattern

by Ramesh Thakur
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 106 - Number 15
Year of Publication: 2014
Authors: Ramesh Thakur
10.5120/18597-9856

Ramesh Thakur. Context-free Grammar Learning from Text Document using Sequential Pattern. International Journal of Computer Applications. 106, 15 (November 2014), 23-26. DOI=10.5120/18597-9856

@article{ 10.5120/18597-9856,
author = { Ramesh Thakur },
title = { Context-free Grammar Learning from Text Document using Sequential Pattern },
journal = { International Journal of Computer Applications },
issue_date = { November 2014 },
volume = { 106 },
number = { 15 },
month = { November },
year = { 2014 },
issn = { 0975-8887 },
pages = { 23-26 },
numpages = {4},
url = { https://ijcaonline.org/archives/volume106/number15/18597-9856/ },
doi = { 10.5120/18597-9856 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:39:29.412338+05:30
%A Ramesh Thakur
%T Context-free Grammar Learning from Text Document using Sequential Pattern
%J International Journal of Computer Applications
%@ 0975-8887
%V 106
%N 15
%P 23-26
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The World Wide Web and information systems have advanced significantly over the last two decades and now dominate a wide range of business and scientific applications. Blumberg and Atre estimate that more than 85% of all business information exists as unstructured and semi-structured documents, typically formatted for human viewing rather than for machine processing. Extracting information from these documents is a challenging task. One promising approach is to extract grammar rules, which can be used to create structural descriptions of text documents. In this paper I propose a grammatical inference method that uses sequential patterns to infer a formal language (a context-free grammar) describing the given sample set.
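
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes that a "sequential pattern" is an adjacent pair of tokens occurring in at least `min_support` sample sentences, and that each frequent pattern is replaced by a fresh nonterminal. The function names (`infer_grammar`, `replace_pair`, `expand`) and the nonterminal naming scheme `N1`, `N2`, ... are hypothetical, and the sketch assumes no input token already uses such a name.

```python
from collections import Counter


def replace_pair(sentence, pair, symbol):
    """Replace each non-overlapping occurrence of an adjacent pair with symbol."""
    out, i = [], 0
    while i < len(sentence):
        if i + 1 < len(sentence) and (sentence[i], sentence[i + 1]) == pair:
            out.append(symbol)
            i += 2
        else:
            out.append(sentence[i])
            i += 1
    return out


def infer_grammar(sentences, min_support=2):
    """Build CFG rules by repeatedly abstracting the best-supported pair.

    Returns a dict mapping each nonterminal to a list of alternatives,
    where every alternative is a list of symbols. "S" is the start symbol.
    """
    sentences = [list(s) for s in sentences]
    rules = {}
    while True:
        # Support = number of sentences that contain the pair at least once.
        counts = Counter()
        for s in sentences:
            counts.update(set(zip(s, s[1:])))
        frequent = [(c, p) for p, c in counts.items() if c >= min_support]
        if not frequent:
            break
        # Deterministic choice: highest support, then smallest pair.
        _, pair = min(frequent, key=lambda cp: (-cp[0], cp[1]))
        symbol = f"N{len(rules) + 1}"  # hypothetical naming scheme
        rules[symbol] = [list(pair)]
        sentences = [replace_pair(s, pair, symbol) for s in sentences]
    # Each distinct rewritten sentence becomes one alternative of S.
    rules["S"] = [list(alt) for alt in sorted({tuple(s) for s in sentences})]
    return rules


def expand(rules, symbols):
    """Yield every terminal string derivable from a sequence of symbols."""
    if not symbols:
        yield []
        return
    head, rest = symbols[0], symbols[1:]
    if head in rules:
        heads = [e for alt in rules[head] for e in expand(rules, alt)]
    else:
        heads = [[head]]  # terminal symbol
    for h in heads:
        for r in expand(rules, rest):
            yield h + r


sample = [
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
    ["the", "cat", "runs"],
]
grammar = infer_grammar(sample)
for lhs, alternatives in grammar.items():
    print(lhs, "->", " | ".join(" ".join(alt) for alt in alternatives))
```

A grammar built this way derives exactly the sample sentences, so it compresses the sample rather than generalizing beyond it. A real grammatical inference method would also need a generalization step, for example merging nonterminals that occur in similar contexts.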

References
  1. Blumberg, R., & Atre, S. "The problem with unstructured data." DM Review, 13, pp. 42-49, 2003.
  2. James, S., Mark, D., Roger, F., Melliyal, A., Jean, I., & Xavier, L. "Managing Unstructured Data with Oracle Database 11g." An Oracle White Paper, pp. 1-9, Feb 2009.
  3. B. M. Sundheim, "Overview of the third message understanding evaluation and conference," In Proceedings of the Third Message Understanding Conference (MUC-3), pp. 3–16, San Diego, CA, 1991.
  4. P. Palaga, L. Nguyen, U. Leser, and J. Hakenberg, "High-performance information extraction with AliBaba," In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, EDBT '09, ACM, New York, pp. 1140–1143, 2009.
  5. Allen, J., "Natural Language Understanding," The Benjamin/Cummings Publishing Company, Inc., Redwood City, CA, USA, Second Edition, 1995.
  6. N. A. Chinchor, "Overview of MUC-7/MET-2," 1998.
  7. E. M. Gold, "Language identification in the limit," Inform. Control, vol. 10, no. 5, pp. 447–474, 1967.
  8. E. M. Gold, "Complexity of automaton identification from given data," Inform. Control, vol. 37, pp. 302–320, 1978.
  9. R. Agrawal and R. Srikant, "Mining Sequential Patterns," In Proceedings of the International Conference on Data Engineering (ICDE), Taipei, Taiwan, 1995.
  10. R. Srikant and R. Agrawal, "Mining Sequential Patterns: Generalizations and Performance Improvements," In Proceedings of the 5th International Conference on Extending Database Technology (EDBT), Avignon, France, March 1996.
  11. Suresh Jain, Ramesh Thakur, and N. S. Chaudhari, "Discovery of Sequential Pattern from Text Document," In Proceedings of the National Conference on Intelligent Information Retrieval & Processing (NCIIRP - 2006), CSI Surat Chapter, Bardoli, Surat, April 2006.
  12. Arianna D'Ulizia, Fernando Ferri, and Patrizia Grifoni, "A survey of grammatical inference methods for natural language learning," Artificial Intelligence Review, vol. 36, no. 1, pp. 1-27, 2011.
Index Terms

Computer Science
Information Sciences

Keywords

Information Extraction, Grammatical Inference, Sequential Pattern