Research Article

Lexical Syntactic Patterns and Novel Statistical Measures based Bootstrapping Approach for Evolution of Biomedical Ontologies

by B. Sathiya, T. V. Geetha
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 177 - Number 39
Year of Publication: 2020
Authors: B. Sathiya, T. V. Geetha
10.5120/ijca2020919873

B. Sathiya, T. V. Geetha. Lexical Syntactic Patterns and Novel Statistical Measures based Bootstrapping Approach for Evolution of Biomedical Ontologies. International Journal of Computer Applications. 177, 39 (Feb 2020), 21-27. DOI=10.5120/ijca2020919873

@article{ 10.5120/ijca2020919873,
author = { B. Sathiya and T. V. Geetha },
title = { Lexical Syntactic Patterns and Novel Statistical Measures based Bootstrapping Approach for Evolution of Biomedical Ontologies },
journal = { International Journal of Computer Applications },
issue_date = { Feb 2020 },
volume = { 177 },
number = { 39 },
month = { Feb },
year = { 2020 },
issn = { 0975-8887 },
pages = { 21-27 },
numpages = {7},
url = { https://ijcaonline.org/archives/volume177/number39/31164-2020919873/ },
doi = { 10.5120/ijca2020919873 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:48:12.778810+05:30
%A B. Sathiya
%A T. V. Geetha
%T Lexical Syntactic Patterns and Novel Statistical Measures based Bootstrapping Approach for Evolution of Biomedical Ontologies
%J International Journal of Computer Applications
%@ 0975-8887
%V 177
%N 39
%P 21-27
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Extracting knowledge and processing information from rapidly growing biomedical data is a primary challenge for researchers in this field. It is commonly addressed with an ontology, a semantic knowledge representation model with a controlled vocabulary. However, the exponential growth of biomedical data quickly makes an ontology outdated, so its evolution becomes inevitable. Although numerous ontology evolution systems have attempted to evolve ontologies automatically in various ways, they do not automatically identify the concepts that need to be evolved or discover new components for those concepts, such as related new concepts and relations. Therefore, the aim of this work is to automatically identify the concepts that need to be evolved and to discover new components for those concepts using web pages and the MEDLINE database. In particular, a new concept selection measure, CE (Concept to be Evolved), is designed to select the concepts with a high possibility of being evolved, based on each concept's number of neighbours and its depth. Next, a lexical syntactic pattern based bootstrapping approach with new statistical scoring measures, namely HH-CS (Hyponym Hypernym-Concept Scoring), DR-CS (Domain Range-Concept Scoring) and RS (Relation Scoring), is proposed: a set of patterns is used to discover new candidate components from web pages, and the scoring measures are used to select the correct candidates against the MEDLINE database. Experimental results on biomedical ontologies, in terms of precision, recall, F-measure and ontology quality metrics, demonstrate the effectiveness of the proposed CE measure and of the bootstrapping approach with the new statistical measures in identifying concepts to be evolved and discovering new components.
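
The abstract gives neither the concrete patterns nor the formulas for CE, HH-CS, DR-CS or RS, so the following is only a minimal sketch of the general idea of pattern-based candidate discovery followed by statistical scoring. The "such as" pattern, the function names `extract_candidates` and `pmi`, and the use of pointwise mutual information as the score are all assumptions made for illustration; they are not the authors' patterns or measures.

```python
import math
import re

# Hypothetical lexical syntactic pattern (an assumption, not from the paper):
# "<hypernym> such as <hyponym>, <hyponym> and <hyponym>"
# The hypernym is limited to one or two words to keep the sketch simple.
SUCH_AS = re.compile(
    r"(?P<hypernym>\w+(?: \w+)?) such as (?P<hyponyms>[^.;]+)",
    re.IGNORECASE,
)


def extract_candidates(sentence):
    """Return (hyponym, hypernym) candidate pairs found in one sentence."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group("hypernym").lower()
        # Split the enumeration on commas and on the words "and" / "or".
        parts = re.split(r",|\band\b|\bor\b", match.group("hyponyms"))
        for part in parts:
            hyponym = part.strip().lower()
            if hyponym:
                pairs.append((hyponym, hypernym))
    return pairs


def pmi(pair_count, x_count, y_count, total):
    """Pointwise mutual information from raw document counts.

    A generic stand-in for a candidate score; it is NOT the paper's
    HH-CS, DR-CS or RS measure. All counts must be positive.
    """
    return math.log2((pair_count * total) / (x_count * y_count))


sentence = "The drug targets liver diseases such as hepatitis, cirrhosis and steatosis."
candidates = extract_candidates(sentence)
```

In a bootstrapping setting, candidates extracted this way would be scored against corpus counts (for example, document frequencies from a literature database), and only the high-scoring pairs would be kept and fed into the next iteration. The counts passed to `pmi` are left to the caller because the abstract does not say how they are obtained.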

Index Terms

Computer Science
Information Sciences

Keywords

Ontology evolution, enrichment, bootstrapping, biomedical ontologies.