Machine Learning approach to Document Classification using Concept based Features

C. Saranya Jothi; D. Thenmozhi

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 118 - Number 20

Year of Publication: 2015

DOI: 10.5120/20864-3578

C. Saranya Jothi and D. Thenmozhi. Machine Learning approach to Document Classification using Concept based Features. International Journal of Computer Applications 118, 20 (May 2015), 33-36. DOI=10.5120/20864-3578

@article{10.5120/20864-3578,
  author = {C. Saranya Jothi and D. Thenmozhi},
  title = {Machine Learning approach to Document Classification using Concept based Features},
  journal = {International Journal of Computer Applications},
  issue_date = {May 2015},
  volume = {118},
  number = {20},
  month = {May},
  year = {2015},
  issn = {0975-8887},
  pages = {33-36},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume118/number20/20864-3578/},
  doi = {10.5120/20864-3578},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

Abstract

Text mining is the process of deriving high-quality information from text. Text processing involves search and replace operations on text in electronic form. A number of approaches have been developed to represent and classify text documents. Most of these approaches try to attain good classification performance while representing a document only by its words. We propose a concept-based methodology that uses concepts instead of terms. It represents the meaning of the text and thereby reduces the number of features. The Support Vector Machine (SVM) algorithm is applied for document classification. Classification performance with the original features is then compared against performance with the concept-based features. This methodology enhances document classification accuracy.
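
The idea of replacing terms with concepts before SVM classification can be illustrated with a small Python sketch. It is not the authors' implementation. The `CONCEPTS` dictionary is a hypothetical, hand-made stand-in for a lookup in a lexical resource such as WordNet, the four training documents are invented toy data, and `train_linear_svm` is a minimal hinge-loss learner written for illustration rather than a production SVM.

```python
from collections import Counter

# Hypothetical word-to-concept map (assumption): a hand-made stand-in for
# a lexical resource lookup. Words not listed are kept unchanged.
CONCEPTS = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "dog": "animal", "puppy": "animal", "cat": "animal", "kitten": "animal",
}

# Invented toy corpus: label +1 for vehicle documents, -1 for animal documents.
TRAIN = [
    ("car truck road", 1),
    ("automobile road", 1),
    ("dog cat park", -1),
    ("puppy kitten park", -1),
]


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def to_concepts(tokens):
    """Replace each term with its concept when one is known."""
    return [CONCEPTS.get(t, t) for t in tokens]


def vectorize(tokens, vocab):
    """Count how often each vocabulary feature occurs in the token list."""
    counts = Counter(tokens)
    return [counts[f] for f in vocab]


def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=20):
    """Linear SVM trained by stochastic subgradient descent on the
    L2-regularised hinge loss (illustrative only)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1 - lr * lam) for wj in w]  # regularisation shrink
            if margin < 1:  # inside the margin or misclassified
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Original (term) features versus concept-based features.
term_vocab = sorted({t for text, _ in TRAIN for t in tokenize(text)})
concept_vocab = sorted({c for text, _ in TRAIN for c in to_concepts(tokenize(text))})

X = [vectorize(to_concepts(tokenize(text)), concept_vocab) for text, _ in TRAIN]
y = [label for _, label in TRAIN]
w, b = train_linear_svm(X, y)


def classify(text):
    """Classify a new document using its concept-based features."""
    return predict(w, b, vectorize(to_concepts(tokenize(text)), concept_vocab))


print("term features:", len(term_vocab))
print("concept features:", len(concept_vocab))
print("truck road ->", classify("truck road"))
print("kitten park ->", classify("kitten park"))
```

On this toy corpus the concept mapping collapses several distinct terms into shared features, so the classifier is trained on a smaller feature set than the term-based representation would give it. A real experiment would draw concepts from a lexical database and use an established SVM implementation.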

Index Terms

Computer Science

Information Sciences

Keywords

Text classification, Support Vector Machine, Feature Selection