Research Article

DACS Dewey index-based Arabic Document Categorization System

by A. F. Alajmi, E. M Saad, M H Awadalla
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 47 - Number 23
Year of Publication: 2012

A. F. Alajmi, E. M Saad, M H Awadalla. DACS Dewey index-based Arabic Document Categorization System. International Journal of Computer Applications. 47, 23 (June 2012), 50-57. DOI=10.5120/7500-0634

This paper presents an Arabic text categorization system. First, a stop-word list is generated using a statistical approach that captures the inflection of different Arabic words. Second, a feature representation model based on a Hidden Markov Model is developed to extract roots and morphological weights. Third, a semantic synonym-merging technique is presented for feature reduction. Finally, a Dewey index-based back-propagation artificial neural network is developed for Arabic document categorization. The system was compared with other classifiers, and the results suggest that the architecture is promising.
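
The first and third steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the document-frequency threshold, the function names (build_stop_words, merge_synonyms, to_features), and the synonym map are all assumptions made for illustration. The sketch assumes tokens have already been reduced to roots; the Hidden Markov Model root extraction and the neural network classifier are not reproduced here.

```python
from collections import Counter


def build_stop_words(docs, min_doc_ratio=0.8):
    """Return tokens that occur in at least min_doc_ratio of the documents.

    Assumption: document frequency stands in for the paper's statistical
    approach, whose details are not given in the abstract.
    """
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each token once per document
    threshold = min_doc_ratio * len(docs)
    return {tok for tok, df in doc_freq.items() if df >= threshold}


def merge_synonyms(tokens, synonym_map):
    """Replace each token with its canonical synonym so synonyms share one feature."""
    return [synonym_map.get(tok, tok) for tok in tokens]


def to_features(tokens, stop_words, synonym_map):
    """Drop stop words, merge synonyms, and return term counts."""
    kept = [tok for tok in tokens if tok not in stop_words]
    return Counter(merge_synonyms(kept, synonym_map))


# Hypothetical toy corpus; English placeholders stand in for Arabic roots.
docs = [
    ["the", "book", "library"],
    ["the", "tome", "index"],
    ["the", "book", "index"],
]
stop_words = build_stop_words(docs, min_doc_ratio=0.9)
features = to_features(["the", "tome", "book"], stop_words, {"tome": "book"})
```

Merging synonyms before counting reduces the number of distinct features, so the resulting vector passed to a classifier is shorter than one built from the raw tokens.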

Index Terms

Computer Science
Information Sciences


Keywords

Arabic Text Processing, Natural Language Processing, Classification, Feature Reduction, Feature Representation, Morphological Analyzer