Research Article

Survey on Research Paper Classification based on TF-IDF and Stemming Technique using Classification Algorithm

by Kshitija G. Deshmukh, S. A. Itkar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 176 - Number 25
Year of Publication: 2020
Authors: Kshitija G. Deshmukh, S. A. Itkar
10.5120/ijca2020920248

Kshitija G. Deshmukh, S. A. Itkar. Survey on Research Paper Classification based on TF-IDF and Stemming Technique using Classification Algorithm. International Journal of Computer Applications. 176, 25 (May 2020), 23-27. DOI=10.5120/ijca2020920248

@article{ 10.5120/ijca2020920248,
author = { Kshitija G. Deshmukh, S. A. Itkar },
title = { Survey on Research Paper Classification based on TF-IDF and Stemming Technique using Classification Algorithm },
journal = { International Journal of Computer Applications },
issue_date = { May 2020 },
volume = { 176 },
number = { 25 },
month = { May },
year = { 2020 },
issn = { 0975-8887 },
pages = { 23-27 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume176/number25/31356-2020920248/ },
doi = { 10.5120/ijca2020920248 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:43:28.685640+05:30
%A Kshitija G. Deshmukh
%A S. A. Itkar
%T Survey on Research Paper Classification based on TF-IDF and Stemming Technique using Classification Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 176
%N 25
%P 23-27
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

A classification system categorizes objects into classes or groups of classes. It is used in a wide range of applications, including text classification, web page classification, image classification, and research paper classification. Research papers are published both online and offline. Text classification and class prediction are important for paper classification, where reducing the feature size speeds up the learning process of classifiers. Text classification is an area of growing interest within text mining research. This paper presents a survey of the classification algorithms and stemming techniques used for text classification.

References
  1. Sang-Woon Kim and Joon-Min Gil, “Research paper classification systems based on TF-IDF and LDA schemes”, Hum. Cent. Comput. Inf. Sci. (2019).
  2. D. Yogeshwaran, Dr. N. Yuvaraj, “Text Classification using Recurrent Neural Network in Quora”, International Research Journal of Engineering and Technology (IRJET), Volume: 06, Issue: 02, Feb 2019.
  3. Fatiha Barigou, “Impact of Instance Selection on kNN-Based Text Categorization”, Journal of Information Processing Systems, Vol. 14, No. 2, pp. 418-434, April 2018.
  4. Pema Gurung and Rupali Wagh, “A study on Topic Identification using K means clustering algorithm: Big vs. Small Documents”, Advances in Computational Sciences and Technology, ISSN 0973-6107, Volume 10, Number 2 (2017), pp. 221-233.
  5. Ms. Anjali Ganesh Jivani, “A Comparative Study of Stemming Algorithms”, Int. J. Comp. Tech. Appl. (IJCTA), Nov-Dec 2011, Vol 2 (6), 1930-1938.
  6. Vairaprakash Gurusamy, S. Kannan, K. Nandhini, “Performance Analysis: Stemming Algorithm for the English Language”, IJSRD - International Journal for Scientific Research & Development, Vol. 5, Issue 05, 2017, ISSN (online): 2321-0613.
  7. Prafulla Bafna, Dhanya Pramod, Anagha Vaidya, “Document Clustering: TF-IDF approach”, IEEE, 2016.
  8. Bruno Trstenjak, Sasa Mikac, Dzenana Donko, “KNN with TF-IDF Based Framework for Text Categorization”, Procedia Engineering 69 (2014) 1356-1364, ScienceDirect.
  9. Juan Ramos, “Using TF-IDF to Determine Word Relevance in Document Queries”.
  10. Anping Zeng, Yongping Huang, “A Text Classification Algorithm Based on Rocchio and Hierarchical Clustering”, D.-S. Huang et al. (Eds.): ICIC 2011, LNCS 6838, pp. 432-439, Springer-Verlag Berlin Heidelberg, 2011.
  11. Mr. Brijain R Patel, Mr. Kushik K Rana, “A Survey on Decision Tree Algorithm For Classification”, IJEDR, 2014, Volume 2, Issue 1, ISSN: 2321-9939.
  12. Dalibor Buzic, Jasminka Dobsa, “Lyrics Classification using Naive Bayes”, May 2018.
  13. Durgesh K. Srivastava, Lekha Bhambhu, “Data Classification Using Support Vector Machine”, Journal of Theoretical and Applied Information Technology, February 2010.
  14. Alon Jacovi, Oren Sar Shalom, Yoav Goldberg, “Understanding Convolutional Neural Networks for Text Classification”, Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, November 1, 2018.
  15. Hanumanthappa M and Narayana Swamy M, “Language Independent Categorization of Documents Based on the Domain”, Advances in Natural and Applied Sciences, 9(6) Special 2015.
  16. Jashanjot Kaur, Preetpal Kaur Buttar, “A Systematic Review on Stopword Removal Algorithms”, International Journal on Future Revolution in Computer Science & Communication Engineering, April 2018, ISSN: 2454-4248, Volume: 4, Issue: 4.
  17. Sandeep R. Sirsat, Dr. Vinay Chavan, Dr. Hemant S. Mahalle, “Strength and Accuracy Analysis of Affix Removal Stemming Algorithms”, International Journal of Computer Science and Information Technologies (IJCSIT), Aug 2015, Vol. 4 (2), 265-269.
  18. Mrs. R. Jayanthi, Ms. C. Jeevitha, “An Approach for Effective Text Pre-Processing Using Improved Porters Stemming Algorithm”, IJISET - International Journal of Innovative Science, Engineering & Technology, Vol. 2, Issue 7, July 2015.
  19. Deepika Sharma, “Stemming Algorithms: A Comparative Study and their Analysis”, International Journal of Applied Information Systems (IJAIS), ISSN: 2249-0868, Foundation of Computer Science FCS, New York, USA, Volume 4, No. 3, September 2012.
  20. Benno Stein and Martin Potthast, “Putting Successor Variety Stemming to Work”, Advances in Data Analysis: Selected Papers from the 30th Annual Conference of the German Classification Society (GfKl), Berlin, ISBN 978-3-540-70980-0, pp. 367-374, © Springer 2007.
  21. S. P. Ruba Rani, B. Ramesh, M. Anusha, Dr. J. G. R. Sathiaseelan, “Evaluation of Stemming Techniques for Text Classification”, International Journal of Computer Science and Mobile Computing, Vol. 4, Issue 3, March 2015, pp. 165-171.
  22. Xiaofei Zhou, Yue Hu, Li Guo, “Text Categorization Based on Clustering Feature Selection”, 1877-0509 © 2014 Published by Elsevier.
  23. Pengfei Liu, Xipeng Qiu, Xuanjing Huang, “Recurrent Neural Network for Text Classification with Multi-Task Learning”, Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16).
  24. Siwei Lai, Liheng Xu, Kang Liu, Jun Zhao, “Recurrent Convolutional Neural Networks for Text Classification”, Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence.
Index Terms

Computer Science
Information Sciences

Keywords

Text Classification, Stemming Technique, Classes