Research Article

Improving the Performance of Adopted Approaches for Extracting Arabic Keyphrases

by Fatma Elghannam
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 153 - Number 7
Year of Publication: 2016
Authors: Fatma Elghannam
10.5120/ijca2016912099

Fatma Elghannam. Improving the Performance of Adopted Approaches for Extracting Arabic Keyphrases. International Journal of Computer Applications. 153, 7 (Nov 2016), 13-17. DOI=10.5120/ijca2016912099

@article{ 10.5120/ijca2016912099,
author = { Fatma Elghannam },
title = { Improving the Performance of Adopted Approaches for Extracting Arabic Keyphrases },
journal = { International Journal of Computer Applications },
issue_date = { Nov 2016 },
volume = { 153 },
number = { 7 },
month = { Nov },
year = { 2016 },
issn = { 0975-8887 },
pages = { 13-17 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume153/number7/26414-2016912099/ },
doi = { 10.5120/ijca2016912099 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:58:30.042691+05:30
%A Fatma Elghannam
%T Improving the Performance of Adopted Approaches for Extracting Arabic Keyphrases
%J International Journal of Computer Applications
%@ 0975-8887
%V 153
%N 7
%P 13-17
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This work discusses improving automatic keyphrase extraction using deep linguistic features and a supervised machine learning algorithm. The n-gram method for extracting important keyphrases produces a huge number of candidate terms. Many of those terms are non-keyphrases, either because they are linguistically non-expressive or because they are redundant in sense. The objective is to restrict the number of candidate terms while keeping the relevant ones. This work extends a previous one on keyphrase extraction for Arabic documents. The proposed work covers the deep linguistic features of the candidate terms. To capture well-structured terms, a newly added definite structure feature is introduced and tested. A set of linguistic features of the previously assigned candidate terms is applied to a supervised machine learning technique to classify the candidates as keyphrases or not. The experiments showed that the proposed technique improves the accuracy of keyphrase extraction relative to the previous version and other available extractors.
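The candidate explosion described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the function names, the stopword-boundary rule, and the English sample tokens (used only for readability) are all assumptions made for illustration. The paper's actual linguistic features, including the definite structure feature, are not reproduced here.

```python
from typing import List, Set, Tuple

Candidate = Tuple[str, ...]


def ngram_candidates(tokens: List[str], max_n: int = 3) -> List[Candidate]:
    """Return every contiguous n-gram of length 1..max_n (hypothetical helper)."""
    candidates: List[Candidate] = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            candidates.append(tuple(tokens[i:i + n]))
    return candidates


def filter_candidates(candidates: List[Candidate], stopwords: Set[str]) -> List[Candidate]:
    """Keep n-grams that neither start nor end with a stopword, dropping duplicates.

    This boundary rule is an assumed stand-in for a linguistic filter,
    not the feature set used in the paper.
    """
    seen: Set[Candidate] = set()
    kept: List[Candidate] = []
    for cand in candidates:
        if cand[0] in stopwords or cand[-1] in stopwords:
            continue
        if cand in seen:
            continue
        seen.add(cand)
        kept.append(cand)
    return kept


# Illustrative English tokens; a real system would use tokenized Arabic text.
tokens = "the extraction of keyphrases from the text".split()
stopwords = {"the", "of", "from"}

all_cands = ngram_candidates(tokens)
kept = filter_candidates(all_cands, stopwords)

print(len(all_cands), "raw candidates")
print(len(kept), "candidates after filtering")
for cand in kept:
    print(" ".join(cand))
```

Even on a seven-token sentence, the raw n-gram list is several times larger than the filtered one. In a supervised setting, the surviving candidates would then be described by feature vectors and passed to a classifier that labels each one as a keyphrase or not.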

Index Terms

Computer Science
Information Sciences

Keywords

Keyphrase Extraction, Arabic Keyphrases, Information Retrieval, Classification Methods, Computational Linguistics