Checking the Correctness of Bangla Words using N-Gram

International Journal of Computer Applications
© 2014 by IJCA Journal
Volume 89 - Number 11
Year of Publication: 2014
Nur Hossain Khan
Gonesh Chandra Saha
Bappa Sarker
Md. Habibur Rahman

Nur Hossain Khan, Gonesh Chandra Saha, Bappa Sarker and Md. Habibur Rahman. Article: Checking the Correctness of Bangla Words using N-Gram. International Journal of Computer Applications 89(11):1-3, March 2014. Full text available. BibTeX

	author = {Nur Hossain Khan and Gonesh Chandra Saha and Bappa Sarker and Md. Habibur Rahman},
	title = {Article: Checking the Correctness of Bangla Words using N-Gram},
	journal = {International Journal of Computer Applications},
	year = {2014},
	volume = {89},
	number = {11},
	pages = {1-3},
	month = {March},
	note = {Full text available}


N-gram models are used in many domains, such as spelling and syntactic verification, speech recognition, machine translation, and character recognition. This paper describes a system for checking the correctness of a Bangla word using an N-gram model. An experimental corpus containing one million word tokens was used to train the system. The corpus was part of the BdNC01 corpus, created in the SIPL lab of Islamic University. Using sample texts collected from different newspapers, the system was tested on 50,000 correct and 50,000 incorrect words. The system correctly determined the correctness of the test words at a rate of 96.17%. This paper also describes the limitations of the system and possible solutions.
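
The abstract does not say whether the N-grams are character-level or word-level, which order is used, or how a word is scored. As a rough illustration only, the sketch below assumes character trigrams with boundary padding and treats a word as plausible when every one of its trigrams was seen in training. The function names, the padding symbols, the `min_count` threshold, and the four-word toy corpus are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def char_ngrams(word, n=3):
    """Return the character n-grams of a word, padded with boundary markers.

    Assumption: '^' and '$' never occur inside real words.
    Note: Python iterates over Unicode code points, so Bangla vowel signs
    are counted as separate symbols in this sketch.
    """
    padded = "^" * (n - 1) + word + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def train(words, n=3):
    """Count every character n-gram seen in the training words."""
    counts = Counter()
    for w in words:
        counts.update(char_ngrams(w, n))
    return counts


def looks_correct(word, counts, n=3, min_count=1):
    """Accept a word only if all of its n-grams were seen at least min_count times."""
    return all(counts[g] >= min_count for g in char_ngrams(word, n))


# Toy corpus for illustration; the paper trained on one million word tokens.
corpus = ["বাংলা", "বাংলাদেশ", "ভাষা", "দেশ"]
model = train(corpus)

print(looks_correct("বাংলা", model))  # a word from the training list
print(looks_correct("বাংশ", model))   # contains a trigram never seen in training
```

A real system would probably replace the hard seen/unseen test with smoothed probabilities and a tuned threshold, since with a finite corpus some valid words will contain unseen N-grams.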

