Research Article

Implementation of Extractive Text Summarization using Word Frequency in Python

by Ahmad Farhan Al Shammari
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 184 - Number 47
Year of Publication: 2023
DOI: 10.5120/ijca2023922583

Ahmad Farhan Al Shammari. Implementation of Extractive Text Summarization using Word Frequency in Python. International Journal of Computer Applications. 184, 47 (Feb 2023), 23-26. DOI=10.5120/ijca2023922583

@article{ 10.5120/ijca2023922583,
author = { Ahmad Farhan Al Shammari },
title = { Implementation of Extractive Text Summarization using Word Frequency in Python },
journal = { International Journal of Computer Applications },
issue_date = { Feb 2023 },
volume = { 184 },
number = { 47 },
month = { Feb },
year = { 2023 },
issn = { 0975-8887 },
pages = { 23-26 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume184/number47/32623-2023922583/ },
doi = { 10.5120/ijca2023922583 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:24:14.343940+05:30
%A Ahmad Farhan Al Shammari
%T Implementation of Extractive Text Summarization using Word Frequency in Python
%J International Journal of Computer Applications
%@ 0975-8887
%V 184
%N 47
%P 23-26
%D 2023
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The goal of this research is to develop an extractive text summarization program using word frequency in Python. The text summarization process consists of the following steps: preprocessing the text, word tokenization, creating a bag of words, calculating word frequencies, sentence tokenization, calculating sentence scores, calculating the average score, and producing the summary. The program was tested on an experimental text from Wikipedia. It performed each step of the summarization process and produced the required summary.
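
The steps listed in the abstract can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the paper's program. The abstract does not say which tokenizer, stop-word list, or selection rule the author uses, so the regex-based tokenizers, the small STOP_WORDS set, the function names, and the rule of keeping sentences that score above the average are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list (an assumption for this sketch).
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was", "with",
}


def tokenize_words(text):
    """Preprocess and word-tokenize: lowercase, keep alphabetic words, drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def tokenize_sentences(text):
    """Sentence-tokenize with a rough heuristic: split after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def summarize(text):
    """Return the sentences whose score exceeds the average sentence score."""
    # Bag of words with word frequencies over the whole text.
    word_freq = Counter(tokenize_words(text))
    sentences = tokenize_sentences(text)
    if not sentences:
        return ""
    # Sentence score: sum of the frequencies of the words in the sentence.
    scores = [sum(word_freq[w] for w in tokenize_words(s)) for s in sentences]
    average = sum(scores) / len(scores)
    # Assumed selection rule: keep above-average sentences in their original order.
    return " ".join(s for s, score in zip(sentences, scores) if score > average)


text = (
    "Python is a programming language. "
    "Python is popular for text processing. "
    "The weather was nice."
)
print(summarize(text))
```

In this toy input the first two sentences score 4 and 5 against an average of about 3.67, so both are kept, while the third scores 2 and is dropped. Summing raw frequencies favors long sentences; dividing each score by the sentence length is a common variation, but the abstract gives no indication of whether the paper does so.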

Index Terms

Computer Science
Information Sciences

Keywords

Artificial Intelligence, Machine Learning, Text Summarization, Natural Language Processing, Tokenization, Word Frequency, Sentence Score, Summary, Python