Research Article

Some Investigations on Machine Learning Techniques for Automated Text Categorization

by Bhagirath Prajapati, Sanjay Garg, N C Chauhan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 71 - Number 3
Year of Publication: 2013
DOI: 10.5120/12340-8617

Bhagirath Prajapati, Sanjay Garg, N C Chauhan. Some Investigations on Machine Learning Techniques for Automated Text Categorization. International Journal of Computer Applications. 71, 3 (June 2013), 32-36. DOI=10.5120/12340-8617

@article{ 10.5120/12340-8617,
author = { Bhagirath Prajapati and Sanjay Garg and N C Chauhan },
title = { Some Investigations on Machine Learning Techniques for Automated Text Categorization },
journal = { International Journal of Computer Applications },
issue_date = { June 2013 },
volume = { 71 },
number = { 3 },
month = { June },
year = { 2013 },
issn = { 0975-8887 },
pages = { 32-36 },
numpages = {5},
url = { https://ijcaonline.org/archives/volume71/number3/12340-8617/ },
doi = { 10.5120/12340-8617 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:34:33.175968+05:30
%A Bhagirath Prajapati
%A Sanjay Garg
%A N C Chauhan
%T Some Investigations on Machine Learning Techniques for Automated Text Categorization
%J International Journal of Computer Applications
%@ 0975-8887
%V 71
%N 3
%P 32-36
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The automated categorization (classification) of texts into predefined categories is one of the most widely explored research areas in text mining. The volume of available digital data is now very large, and organizing it into predefined categories has become a challenging task. Machine learning offers an approach in which an automated classifier is trained to classify documents with minimal human assistance. This paper discusses the Naïve Bayes, Rocchio, k-Nearest Neighbor and Support Vector Machine methods within the machine learning paradigm for the automated categorization of documents into predefined categories.
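To make the idea of training a classifier on labeled documents concrete, the following is a minimal sketch of one of the four methods named above, multinomial Naïve Bayes. It is not the authors' implementation, which this page does not show. Whitespace tokenization, add-one (Laplace) smoothing, the toy training set, and the names `tokenize`, `train_naive_bayes`, `classify`, `TRAIN` and `MODEL` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Assumed preprocessing: lowercase and split on whitespace."""
    return text.lower().split()


def train_naive_bayes(docs):
    """Fit a multinomial Naive Bayes model from (text, label) pairs.

    Returns log priors per class, term counts per class, and the vocabulary.
    """
    class_docs = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_docs[label] += 1
        for tok in tokenize(text):
            term_counts[label][tok] += 1
            vocab.add(tok)
    total = sum(class_docs.values())
    log_priors = {c: math.log(n / total) for c, n in class_docs.items()}
    return log_priors, term_counts, vocab


def classify(text, model):
    """Return the class with the highest posterior log score."""
    log_priors, term_counts, vocab = model
    best_label, best_score = None, -math.inf
    for label, log_prior in log_priors.items():
        counts = term_counts[label]
        # Add-one smoothing: every vocabulary term gets one extra count.
        denom = sum(counts.values()) + len(vocab)
        score = log_prior
        for tok in tokenize(text):
            if tok in vocab:  # terms unseen in training are ignored
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus with two predefined categories.
TRAIN = [
    ("the team won the football match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the processor runs the new compiler", "computing"),
    ("the kernel patch fixed a memory bug", "computing"),
]
MODEL = train_naive_bayes(TRAIN)

print(classify("a goal in the match", MODEL))
print(classify("memory bug in the compiler", MODEL))
```

Scores are summed in log space so that products of many small probabilities do not underflow. A realistic setup would add stop-word removal and term weighting, and would evaluate on a held-out test set rather than a handful of sentences.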

Index Terms

Computer Science
Information Sciences

Keywords

Machine learning, Text categorization