Research Article

Towards Arabic Spell-Checker Based on N-Grams Scores

by Hasan Muaidi, Rasha Al-tarawneh
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 53 - Number 3
Year of Publication: 2012
Authors: Hasan Muaidi, Rasha Al-tarawneh
10.5120/8400-2168

Hasan Muaidi, Rasha Al-tarawneh. Towards Arabic Spell-Checker Based on N-Grams Scores. International Journal of Computer Applications. 53, 3 (September 2012), 12-16. DOI=10.5120/8400-2168

@article{ 10.5120/8400-2168,
author = { Hasan Muaidi, Rasha Al-tarawneh },
title = { Towards Arabic Spell-Checker Based on N-Grams Scores },
journal = { International Journal of Computer Applications },
issue_date = { September 2012 },
volume = { 53 },
number = { 3 },
month = { September },
year = { 2012 },
issn = { 0975-8887 },
pages = { 12-16 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume53/number3/8400-2168/ },
doi = { 10.5120/8400-2168 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:53:10.147957+05:30
%A Hasan Muaidi
%A Rasha Al-tarawneh
%T Towards Arabic Spell-Checker Based on N-Grams Scores
%J International Journal of Computer Applications
%@ 0975-8887
%V 53
%N 3
%P 12-16
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The main purpose of this paper is to develop a simple and flexible spell-checker for the Arabic language. The proposed spell-checker is based on N-gram scores. For this purpose, eleven matrices are built to represent the combinations of letters in Arabic words. Each matrix records the connections between the two letters of a 2-gram. Each cell in a generated matrix is assigned an integer value of 2, 1, or 0: the value 2 if a word ends with these two letters, 1 if the two letters are connected and the word is not yet finished, and 0 otherwise. To check a word, the spell-checker extracts each pair of letters in the word and examines the value stored for each pair. If the value is 0, the spell-checker considers the test word wrong. If the value is 1, which indicates a connection, it continues to the next pair until it reaches a value of 2, which determines that the word is correct. The overall accuracy of the proposed spell-checker reaches 98.99%.
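
The lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the eleven matrices are indexed, so the sketch assumes one matrix per pair position (covering words of up to twelve letters), stores each matrix sparsely as a dictionary, and accepts a word only when its final pair scores 2. The names `MAX_PAIRS`, `build_matrices`, and `is_correct`, and the three-word lexicon, are hypothetical.

```python
MAX_PAIRS = 11  # assumption: one matrix per pair position, so words of 2 to 12 letters


def build_matrices(lexicon):
    """Build one sparse score matrix per pair position; a missing cell counts as 0."""
    matrices = [dict() for _ in range(MAX_PAIRS)]
    for word in lexicon:
        if not 2 <= len(word) <= MAX_PAIRS + 1:
            continue  # outside the scope of this sketch
        last = len(word) - 2  # index of the word's final letter pair
        for i in range(len(word) - 1):
            pair = (word[i], word[i + 1])
            if i == last:
                matrices[i][pair] = 2  # a word ends with this pair
            else:
                # the pair connects and the word continues; never downgrade a 2
                matrices[i][pair] = max(matrices[i].get(pair, 0), 1)
    return matrices


def is_correct(word, matrices):
    """Return True if every pair is connected and the final pair scores 2."""
    if not 2 <= len(word) <= MAX_PAIRS + 1:
        return False
    last = len(word) - 2
    for i in range(len(word) - 1):
        score = matrices[i].get((word[i], word[i + 1]), 0)
        if score == 0:
            return False  # no connection between these two letters: misspelled
        if i == last:
            return score == 2  # the word must end on a word-final pair
    return False


# Hypothetical toy lexicon, for illustration only.
lexicon = ["كتب", "كتاب", "باب"]
matrices = build_matrices(lexicon)
print(is_correct("كتاب", matrices))  # a lexicon word
print(is_correct("كبت", matrices))   # two letters transposed
```

Because each cell holds a single integer, a pair that both ends one word and continues another is stored as 2 in this sketch and is accepted in either role. The abstract does not state how the paper handles that case, so this choice is a simplification and can let some non-words through.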

References
  1. Feldman A. Computational linguistics: Models, resources, applications. Computational Linguistics, 32(3):443–444, 2006.
  2. Haddad B. and Yaseen M. Detection and correction of nonwords in Arabic: A hybrid approach. International Journal of Computer Processing of Oriental Languages, 30, 2007.
  3. P. Brown, P. deSouza, R. Mercer, V. Pietra, and J. Lai. Class-based n-gram models of natural language. Computational Linguistics, 18:467–479, 1992.
  4. Muaidi H. Extraction Of Arabic Word Roots: An Approach Based on Computational Model and Multi-Backpropagation Neural Networks. PhD thesis, De Montfort University - UK, 2008.
  5. Satori H., Harti M., and Chenfour N. Arabic speech recognition system using cmu-sphinx4. CoRR 0704.2201, 2007.
  6. Shaalan K., Allam A., and Gomah A. Towards automatic spell checking for Arabic. In Language Engineering, 2003.
  7. Kukich Karen. Technique for automatically correcting words in text. ACM Computing Surveys (CSUR), 24:377–439, 1992.
  8. Karttunen. Applications of finite-state transducers in natural language processing. In CIAA: International Conference on Implementation and Application of Automata, LNCS, 2000.
  9. Kabbani M. The Arabic spell-checker dictionary from ayaspell project. Technical report, Prix special des troisiemes rencontres africaines du Logiciel Libre, 2008.
  10. Suleiman H. Mustafa and Qasem A. Al-Radaideh. Using n-grams for Arabic text searching. JASIST, 55(11):1002–1007, 2004.
  11. Alqrainy S., Ayesh A., and Muaidi H. Automated tagging system and tagset design for Arabic text. International Journal of Computational Linguistics Research, 1:55–62, 2010.
  12. Zerrouki T. and Balla A. Implementation of infixes and circumfixes in the spellcheckers. In Proceedings of the Second International Conference on Arabic Language Resources and Tools, 2009.
Index Terms

Computer Science
Information Sciences

Keywords

Natural Languages Processing, Arabic Language Processing, Spell-Checker, N-Gram