Research Article

Designing Method and Algorithm of Semantic Comparison for Detecting Similarities in the Case of Fuzzy Information using Software Engineering and Medical Ontologies*

by Abdualmajed Ahmed Al-Khulaidi, Adel A. Nasser, Fahd Nasser Al-Wesabi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 181 - Number 40
Year of Publication: 2019
Authors: Abdualmajed Ahmed Al-Khulaidi, Adel A. Nasser, Fahd Nasser Al-Wesabi
10.5120/ijca2019918434

Abdualmajed Ahmed Al-Khulaidi, Adel A. Nasser, Fahd Nasser Al-Wesabi. Designing Method and Algorithm of Semantic Comparison for Detecting Similarities in the Case of Fuzzy Information using Software Engineering and Medical Ontologies*. International Journal of Computer Applications. 181, 40 (Feb 2019), 32-40. DOI=10.5120/ijca2019918434

@article{ 10.5120/ijca2019918434,
author = { Abdualmajed Ahmed Al-Khulaidi and Adel A. Nasser and Fahd Nasser Al-Wesabi },
title = { Designing Method and Algorithm of Semantic Comparison for Detecting Similarities in the Case of Fuzzy Information using Software Engineering and Medical Ontologies* },
journal = { International Journal of Computer Applications },
issue_date = { Feb 2019 },
volume = { 181 },
number = { 40 },
month = { Feb },
year = { 2019 },
issn = { 0975-8887 },
pages = { 32-40 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume181/number40/30329-2019918434/ },
doi = { 10.5120/ijca2019918434 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:08:42.023454+05:30
%A Abdualmajed Ahmed Al-Khulaidi
%A Adel A. Nasser
%A Fahd Nasser Al-Wesabi
%T Designing Method and Algorithm of Semantic Comparison for Detecting Similarities in the Case of Fuzzy Information using Software Engineering and Medical Ontologies*
%J International Journal of Computer Applications
%@ 0975-8887
%V 181
%N 40
%P 32-40
%D 2019
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This article presents a semantic comparison method for detecting plagiarism in fuzzy information. We designed an algorithm with a semantic dimension that detects plagiarism in such information, including disguises such as changing the sentence structure or replacing words with synonyms, and that tolerates technical spelling errors such as incompletely written word endings or unofficial and unknown abbreviations. The analysis reports the degree of similarity to the original text and gives an overall evaluation of the similarity of texts based on their surface structure. Experiments show that, on the file-size criterion, the proposed method outperforms the Sherlock method by 6% when words in the file are replaced with synonyms and by 1% when the file is rewritten. In terms of the time taken to examine files, measured as speedup, the proposed method is 1.02 times faster than the Sherlock method when synonyms are used and 1.01 times faster when words are rewritten, for a file of 382 words. The average execution time of the proposed algorithm for finding plagiarism is 3.47% lower than that of the Sherlock algorithm when synonyms are used and 1.83% lower when words are rewritten. The algorithm remains effective as the file size increases, with a gain of up to 2.73% in the synonym case and 2.69% in the rewriting case. From the results presented in the tables, we conclude that the average error rate of the proposed algorithm is 2% lower than that of the Sherlock algorithm. The complexity of the proposed algorithm is O(m*n).
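
The abstract states what the algorithm tolerates (synonym substitution, spelling errors, truncated endings) and its O(m*n) cost, but not its steps. As a rough illustration only, and not the authors' algorithm, the Python sketch below fills a Wagner-Fischer edit-distance table over words, treating two words as equal when they are identical, listed as synonyms, or within a small character edit distance. The `SYNONYM_GROUPS` table, the `max_typos` threshold, the minimum word length, and all function names are assumptions made for this example; the toy synonym table merely stands in for the software engineering and medical ontologies named in the title.

```python
from typing import FrozenSet, List, Sequence

# Hypothetical toy synonym groups (an assumption for illustration);
# a real system would look synonyms up in an ontology or lexical database.
SYNONYM_GROUPS: List[FrozenSet[str]] = [
    frozenset({"illness", "disease", "sickness"}),
    frozenset({"program", "software", "application"}),
]


def char_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            # deletion, insertion, substitution (or match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def words_match(w1: str, w2: str, max_typos: int = 1) -> bool:
    """Fuzzy word equality: identical, synonyms, or a small spelling slip."""
    w1, w2 = w1.lower(), w2.lower()
    if w1 == w2:
        return True
    if any(w1 in group and w2 in group for group in SYNONYM_GROUPS):
        return True
    # Only tolerate typos or cut-off endings on words longer than 3 letters,
    # so that short words such as "a" and "c" are not conflated.
    return min(len(w1), len(w2)) > 3 and char_distance(w1, w2) <= max_typos


def similarity(original: Sequence[str], suspect: Sequence[str]) -> float:
    """Word-level similarity in [0, 1] using m * n fuzzy word comparisons."""
    m, n = len(original), len(suspect)
    if max(m, n) == 0:
        return 1.0
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i]
        for j in range(1, n + 1):
            cost = 0 if words_match(original[i - 1], suspect[j - 1]) else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    # Normalise the word edit distance by the longer text's length.
    return 1.0 - prev[-1] / max(m, n)
```

Under these assumptions, `similarity("the disease spreads quickly".split(), "the illness spreads quickly".split())` treats the synonym swap as a match, while two texts with no fuzzy-matching words score 0. The table has (m+1) by (n+1) cells for texts of m and n words, which is where an O(m*n) bound on word comparisons would come from in a scheme of this kind.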

Index Terms

Computer Science
Information Sciences

Keywords

Plagiarism detection, medical ontologies, software engineering ontology, semantic network