Research Article

Ontology based Semantic Similarity Measure using Concept Weighting

Published on April 2014 by S. Anitha Elavarasi, J. Akilandeswari, K. Menaga
International Conference on Knowledge Collaboration in Engineering
Foundation of Computer Science USA
ICKCE - Number 1
April 2014
Authors: S. Anitha Elavarasi, J. Akilandeswari, K. Menaga

S. Anitha Elavarasi, J. Akilandeswari, K. Menaga. Ontology based Semantic Similarity Measure using Concept Weighting. International Conference on Knowledge Collaboration in Engineering. ICKCE, 1 (April 2014), 15-20.

@article{
author = { S. Anitha Elavarasi, J. Akilandeswari, K. Menaga },
title = { Ontology based Semantic Similarity Measure using Concept Weighting },
journal = { International Conference on Knowledge Collaboration in Engineering },
issue_date = { April 2014 },
volume = { ICKCE },
number = { 1 },
month = { April },
year = { 2014 },
issn = { 0975-8887 },
pages = { 15-20 },
numpages = 6,
url = { /proceedings/ickce/number1/16141-1006/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 International Conference on Knowledge Collaboration in Engineering
%A S. Anitha Elavarasi
%A J. Akilandeswari
%A K. Menaga
%T Ontology based Semantic Similarity Measure using Concept Weighting
%J International Conference on Knowledge Collaboration in Engineering
%@ 0975-8887
%V ICKCE
%N 1
%P 15-20
%D 2014
%I International Journal of Computer Applications
Abstract

Measuring semantic similarity between documents is essential when working with free-text documents. Representing the presence or absence of a concept in binary form may not give accurate results. Weighting concepts by term frequency can increase the accuracy of document clustering. In the proposed approach, a concept's weight is determined from its term frequency and semantic distance. The semantic similarity of a concept is derived from an ontology retrieved through Swoogle. A vector space model, together with an ontology of parent-child (is-a) relationships, is exploited using Protégé. Term frequencies for the extracted concepts are calculated through text processing. Cosine similarity over the concept weights is then applied to measure the similarity between documents, and the documents are clustered according to the similarity score. A sample walkthrough of the proposed system is presented by comparing two documents.
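
The pipeline outlined in the abstract (term frequency, semantic distance over an is-a ontology, concept weights, cosine similarity) can be illustrated with a minimal Python sketch. The abstract does not give the weighting formula, so the scheme below is an assumption made only for illustration: each occurrence of an ontology term contributes 1 / (1 + distance) to every concept, where distance is the number of is-a edges between the two. The toy ontology `PARENT` and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical is-a ontology (child -> parent); not taken from the paper.
PARENT = {
    "car": "vehicle",
    "truck": "vehicle",
    "vehicle": "entity",
    "apple": "fruit",
    "fruit": "entity",
}
CONCEPTS = sorted(set(PARENT) | set(PARENT.values()))


def ancestors(concept):
    """Path from a concept up to the ontology root, inclusive."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def semantic_distance(a, b):
    """Count is-a edges between a and b through their lowest common ancestor."""
    path_a, path_b = ancestors(a), ancestors(b)
    depth_in_b = {c: i for i, c in enumerate(path_b)}
    for i, c in enumerate(path_a):
        if c in depth_in_b:
            return i + depth_in_b[c]
    return None  # the two concepts share no ancestor


def concept_weights(tokens):
    """Assumed scheme: weight(c) = sum over terms t of tf(t) / (1 + distance(c, t))."""
    tf = Counter(t for t in tokens if t in CONCEPTS)
    weights = {}
    for c in CONCEPTS:
        total = 0.0
        for term, count in tf.items():
            d = semantic_distance(c, term)
            if d is not None:
                total += count / (1 + d)
        weights[c] = total
    return weights


def cosine(u, v):
    """Cosine similarity between two concept-weight vectors stored as dicts."""
    dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in set(u) | set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


doc1 = concept_weights("car truck car".split())
doc2 = concept_weights("truck vehicle".split())
doc3 = concept_weights("apple fruit".split())
print(cosine(doc1, doc2), cosine(doc1, doc3))
```

Under this assumed weighting, the two vehicle-related documents score higher against each other than against the fruit document, even though they do not share every term; a purely binary presence/absence vector would not capture that graded relatedness.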

Index Terms

Computer Science
Information Sciences

Keywords

Ontology, Text Processing, Semantic Distance, Term Frequency, Concept Weight, Cosine Similarity, Clustering