DOI: 10.5120/ijca2016909488
Jumi Sarmah and Shikhar Kr. Sarma. Decision Tree based Supervised Word Sense Disambiguation for Assamese. International Journal of Computer Applications 141(1):42-48, May 2016.
BibTeX
@article{10.5120/ijca2016909488,
  author = {Jumi Sarmah and Shikhar Kr. Sarma},
  title = {Decision Tree based Supervised Word Sense Disambiguation for Assamese},
  journal = {International Journal of Computer Applications},
  issue_date = {May 2016},
  volume = {141},
  number = {1},
  month = {May},
  year = {2016},
  issn = {0975-8887},
  pages = {42-48},
  numpages = {7},
  url = {http://www.ijcaonline.org/archives/volume141/number1/24752-2016909488},
  doi = {10.5120/ijca2016909488},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}
Abstract
Word Sense Disambiguation (WSD) aims to automatically disambiguate words that have multiple senses in a context. A sense denotes a meaning of a word, and words that have several meanings in a context are referred to as ambiguous words. WSD is vital in many important Natural Language Processing tasks such as MT, IR, TC and SP. This paper proposes a supervised Machine Learning approach, the Decision Tree (DT), for the Word Sense Disambiguation task in the Assamese language. A Decision Tree is a flowchart-like tree-structured decision model in which each internal node denotes a test, each branch represents an outcome of that test, and each leaf holds a sense label. J48, a Java implementation of the C4.5 decision tree algorithm, is used for the experiments. A few polysemous words, with different real occurrences in Assamese text and manual sense annotation, were collected as the training and test dataset. The DT algorithm produced an average F-measure of 0.611 under 10-fold cross-validation on 10 Assamese ambiguous words.
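The tree structure and the F-measure mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' J48 setup: the tree is hand-built rather than learned by C4.5, and the feature names (`prev_word`, `next_word`), the English toy word "bank", its sense labels and the five contexts are all hypothetical and are not taken from the paper's Assamese data. The unweighted mean over senses is also an assumption, since the abstract does not say how the average F-measure is computed.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Leaf:
    sense: str  # a leaf holds a sense label


@dataclass
class Node:
    feature: str                 # context feature tested at this internal node
    branches: Dict[str, "Tree"]  # test outcome -> subtree
    default: str                 # fallback sense for outcomes with no branch


Tree = Union[Leaf, Node]


def classify(tree: Tree, context: Dict[str, str]) -> str:
    """Follow one branch per test until a leaf (sense label) is reached."""
    while isinstance(tree, Node):
        outcome = context.get(tree.feature)
        if outcome not in tree.branches:
            return tree.default
        tree = tree.branches[outcome]
    return tree.sense


def f_measure(gold: List[str], predicted: List[str], sense: str) -> float:
    """F-measure for one sense: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == sense and p == sense for g, p in pairs)
    fp = sum(g != sense and p == sense for g, p in pairs)
    fn = sum(g == sense and p != sense for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical hand-built tree for the toy ambiguous word "bank".
toy_tree = Node(
    feature="prev_word",
    branches={
        "river": Leaf("bank/shore"),
        "the": Node(
            feature="next_word",
            branches={
                "account": Leaf("bank/finance"),
                "erosion": Leaf("bank/shore"),
            },
            default="bank/finance",
        ),
    },
    default="bank/finance",
)

# Hypothetical annotated occurrences (context features plus gold sense).
contexts = [
    {"prev_word": "river", "next_word": "was"},
    {"prev_word": "the", "next_word": "account"},
    {"prev_word": "the", "next_word": "erosion"},
    {"prev_word": "a", "next_word": "loan"},
    {"prev_word": "the", "next_word": "flooded"},
]
gold = ["bank/shore", "bank/finance", "bank/shore", "bank/finance", "bank/shore"]

predicted = [classify(toy_tree, c) for c in contexts]
senses = sorted(set(gold))
# Unweighted mean over senses (an assumption about the averaging).
average_f = sum(f_measure(gold, predicted, s) for s in senses) / len(senses)
print(predicted)
print(round(average_f, 3))
```

In a 10-fold cross-validation, the annotated occurrences of each word would be split into ten parts, a tree would be learned on nine of them and scored on the held-out part, and the scores would be averaged over the ten rounds. The sketch above shows only the scoring step on a single fixed tree.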
Keywords
Word Sense Disambiguation, Decision Tree, Assamese, Supervised approach