Text Classification by Enhancing Weights of Terms based on their Positional Appearances

Anagha R Kulkarni; Vrinda Tokekar; Parag Kulkarni

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 78 - Number 9

Year of Publication: 2013

DOI: 10.5120/13518-1298

Anagha R Kulkarni, Vrinda Tokekar, Parag Kulkarni. Text Classification by Enhancing Weights of Terms based on their Positional Appearances. International Journal of Computer Applications. 78, 9 (September 2013), 23-26. DOI=10.5120/13518-1298

@article{10.5120/13518-1298,
  author = {Anagha R Kulkarni and Vrinda Tokekar and Parag Kulkarni},
  title = {Text Classification by Enhancing Weights of Terms based on their Positional Appearances},
  journal = {International Journal of Computer Applications},
  issue_date = {September 2013},
  volume = {78},
  number = {9},
  month = {September},
  year = {2013},
  issn = {0975-8887},
  pages = {23-26},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume78/number9/13518-1298/},
  doi = {10.5120/13518-1298},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T21:51:17.353533+05:30
%A Anagha R Kulkarni
%A Vrinda Tokekar
%A Parag Kulkarni
%T Text Classification by Enhancing Weights of Terms based on their Positional Appearances
%J International Journal of Computer Applications
%@ 0975-8887
%V 78
%N 9
%P 23-26
%D 2013
%I Foundation of Computer Science (FCS), NY, USA

Abstract

A large store of hidden information is available in text documents, and extracting accurate, useful information from it is very important. The Multinomial Naïve Bayes classification algorithm is effective at processing text and extracting accurate information. This paper proposes a new approach that assigns weights to terms based on their positional appearance. The effectiveness of this approach is demonstrated on two standard text datasets, Reuters-21578 and 20-newsgroups. The proposed approach improves the average F-measure by at least 1.0% for Reuters-21578 and 2% for 20-newsgroups.
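The abstract does not state the weighting scheme itself, so the following is only an illustrative sketch of the general idea: term counts are weighted by where a term appears in the document, and the weighted counts are then fed to a Multinomial Naïve Bayes classifier. The rule used here (terms in the leading part of a document count more), the names `positional_weights` and `WeightedMultinomialNB`, and the parameters `lead_fraction`, `boost` and `alpha` are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import defaultdict


def positional_weights(tokens, lead_fraction=0.2, boost=2.0):
    """Return {term: weighted count} for one tokenized document.

    Assumption (not from the paper): a token in the leading
    `lead_fraction` of the document contributes `boost`, any later
    token contributes 1.0.
    """
    cutoff = max(1, int(len(tokens) * lead_fraction))
    weights = defaultdict(float)
    for i, tok in enumerate(tokens):
        weights[tok] += boost if i < cutoff else 1.0
    return dict(weights)


class WeightedMultinomialNB:
    """Multinomial Naive Bayes over weighted (non-integer) term counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, docs, labels):
        self.vocab = set()
        self.class_docs = defaultdict(int)
        self.term_totals = defaultdict(lambda: defaultdict(float))
        for tokens, label in zip(docs, labels):
            self.class_docs[label] += 1
            for term, w in positional_weights(tokens).items():
                self.term_totals[label][term] += w
                self.vocab.add(term)
        self.n_docs = len(labels)
        return self

    def predict(self, tokens):
        feats = positional_weights(tokens)
        best_label, best_score = None, -math.inf
        for label, n in self.class_docs.items():
            # log prior plus weighted log likelihood of each known term
            score = math.log(n / self.n_docs)
            denom = (sum(self.term_totals[label].values())
                     + self.alpha * len(self.vocab))
            for term, w in feats.items():
                if term in self.vocab:  # skip terms never seen in training
                    num = self.term_totals[label].get(term, 0.0) + self.alpha
                    score += w * math.log(num / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data, invented purely for illustration.
train_docs = [
    ["oil", "prices", "rise", "market"],
    ["crude", "oil", "exports", "fall"],
    ["team", "wins", "match", "final"],
    ["player", "scores", "goal", "match"],
]
train_labels = ["energy", "energy", "sport", "sport"]

model = WeightedMultinomialNB().fit(train_docs, train_labels)
prediction = model.predict(["oil", "market", "news"])
```

Setting `boost=1.0` reduces the sketch to ordinary term-frequency Multinomial Naïve Bayes, which gives a simple baseline for comparing the effect of any positional weighting rule.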

Index Terms

Computer Science

Information Sciences

Keywords

Term Weighting; Document Classification; Multinomial Naïve Bayes Classification Algorithm