Research Article

A Revised Unicode based Sorting Algorithm for Bengali Texts

by Md. Mahfuzur Rahaman
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 147 - Number 14
Year of Publication: 2016
Authors: Md. Mahfuzur Rahaman
10.5120/ijca2016911305

Md. Mahfuzur Rahaman. A Revised Unicode based Sorting Algorithm for Bengali Texts. International Journal of Computer Applications. 147, 14 (Aug 2016), 35-40. DOI=10.5120/ijca2016911305

@article{ 10.5120/ijca2016911305,
author = { Md. Mahfuzur Rahaman },
title = { A Revised Unicode based Sorting Algorithm for Bengali Texts },
journal = { International Journal of Computer Applications },
issue_date = { Aug 2016 },
volume = { 147 },
number = { 14 },
month = { Aug },
year = { 2016 },
issn = { 0975-8887 },
pages = { 35-40 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume147/number14/25836-2016911305/ },
doi = { 10.5120/ijca2016911305 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:51:58.378410+05:30
%A Md. Mahfuzur Rahaman
%T A Revised Unicode based Sorting Algorithm for Bengali Texts
%J International Journal of Computer Applications
%@ 0975-8887
%V 147
%N 14
%P 35-40
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper describes a sorting algorithm for Bengali texts; sorting is one of the most vital tasks in Bengali Natural Language Processing. Because Unicode is much preferable to ASCII-based encodings, Bengali text should use the Unicode representation. However, owing to some distinct properties of the Bengali language, Bengali texts cannot be sorted directly by the order of characters in the Unicode scheme. A few works have addressed this topic, some for ASCII encoding and others for Unicode, but they still have drawbacks, and there is still no standard for sorting Bengali texts. In this paper, we discuss the previous approaches and propose a revised and simpler procedure for sorting Unicode Bengali texts. We use a mapping to simplify the sorting process. The efficiency of the method depends on the efficiency of the underlying sorting algorithm. The method can sort any Unicode Bengali text, and it will also work for Unicode text in any language if only the mapping is changed. The process is therefore both keyboard and language independent.
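
The mapping idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's actual mapping, which this page does not reproduce. The names `ASSUMED_ORDER`, `build_rank_map`, and `make_sort_key` are hypothetical, and the five-letter order is an assumption chosen only to show one case where code-point order and a linguistic order can disagree: the letter RRA (U+09DC) is encoded after HA (U+09B9), whereas the sketch assumes it should sort directly after DDA (U+09A1).

```python
# Hypothetical collation order for a handful of Bengali letters.
# ASSUMPTION (illustration only): RRA (U+09DC) sorts right after DDA (U+09A1),
# even though its code point is larger than that of HA (U+09B9).
ASSUMED_ORDER = [
    "\u0995",  # BENGALI LETTER KA
    "\u09a1",  # BENGALI LETTER DDA
    "\u09dc",  # BENGALI LETTER RRA
    "\u09ac",  # BENGALI LETTER BA
    "\u09b9",  # BENGALI LETTER HA
]


def build_rank_map(order):
    """Map each character to its position in the desired order."""
    return {ch: rank for rank, ch in enumerate(order)}


def make_sort_key(rank_map):
    """Return a key function that translates a word into a list of ranks.

    Characters missing from the mapping are placed after all mapped ones
    and ordered among themselves by code point.
    """
    fallback = len(rank_map)

    def key(word):
        return [
            (rank_map[ch], 0) if ch in rank_map else (fallback, ord(ch))
            for ch in word
        ]

    return key


rank_map = build_rank_map(ASSUMED_ORDER)
sort_key = make_sort_key(rank_map)

# Three two-letter strings that differ only in their second letter.
words = ["\u0995\u09b9", "\u0995\u09dc", "\u0995\u09a1"]

by_code_point = sorted(words)             # plain Unicode code-point order
by_mapping = sorted(words, key=sort_key)  # order defined by the mapping

print(by_code_point)
print(by_mapping)
```

The sort itself is delegated to Python's built-in `sorted`, so the running time is governed by that algorithm, in line with the abstract's remark about efficiency. Supporting another language under this sketch would only require passing a different order list to `build_rank_map`.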

Index Terms

Computer Science
Information Sciences

Keywords

Bengali Word Sorting; Bengali Text Sorting; Unicode Bengali Text Sorting; Bengali Linguistic Sort; Bengali Dictionary Sort; Bangla Academy Dictionary Based Sort