Decision Tree based Supervised Word Sense Disambiguation for Assamese

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2016
Authors:
Jumi Sarmah, Shikhar Kr. Sarma
10.5120/ijca2016909488

Jumi Sarmah and Shikhar Kr. Sarma. Decision Tree based Supervised Word Sense Disambiguation for Assamese. International Journal of Computer Applications 141(1):42-48, May 2016. BibTeX

@article{10.5120/ijca2016909488,
	author = {Jumi Sarmah and Shikhar Kr. Sarma},
	title = {Decision Tree based Supervised Word Sense Disambiguation for Assamese},
	journal = {International Journal of Computer Applications},
	issue_date = {May 2016},
	volume = {141},
	number = {1},
	month = {May},
	year = {2016},
	issn = {0975-8887},
	pages = {42-48},
	numpages = {7},
	url = {http://www.ijcaonline.org/archives/volume141/number1/24752-2016909488},
	doi = {10.5120/ijca2016909488},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}
}

Abstract

Word Sense Disambiguation (WSD) aims to automatically disambiguate words that have multiple senses in a context. A sense denotes the meaning of a word, and words that have several meanings in a context are referred to as ambiguous words. WSD is vital to many important Natural Language Processing tasks such as MT, IR, TC, and SP. This paper proposes a supervised Machine Learning approach, the Decision Tree (DT), for the Word Sense Disambiguation task in the Assamese language. A Decision Tree is a decision model with a flow-chart-like tree structure in which each internal node denotes a test, each branch represents the result of a test, and each leaf holds a sense label. J48, a Java implementation of the C4.5 decision tree algorithm, is used for the experiments. A few polysemous words, with their real occurrences in Assamese text and manual sense annotation, were collected as the training and test dataset. The DT algorithm produces an average F-measure of 0.611 under 10-fold cross-validation on 10 Assamese ambiguous words.
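
As an illustration of how a C4.5-style decision tree chooses the test at an internal node, the following minimal Python sketch computes the gain ratio of candidate context features over a handful of sense-annotated instances. It is not the authors' J48 setup: the feature names ("prev", "next"), the placeholder context words, and the sense labels are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math
from collections import Counter, defaultdict


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(instances, feature):
    """Gain ratio of splitting `instances` on `feature` (C4.5 split criterion).

    Each instance is a (features_dict, sense_label) pair.
    """
    labels = [sense for _, sense in instances]
    # Group the sense labels by the value the feature takes.
    partitions = defaultdict(list)
    for features, sense in instances:
        partitions[features[feature]].append(sense)
    total = len(instances)
    # Expected entropy of the sense labels after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information: entropy of the feature's own value distribution.
    split_info = entropy([features[feature] for features, _ in instances])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: context features of one ambiguous word, with
# manually assigned sense labels (placeholders, not taken from the paper).
DATA = [
    ({"prev": "river", "next": "side"}, "SENSE_A"),
    ({"prev": "river", "next": "loan"}, "SENSE_A"),
    ({"prev": "money", "next": "side"}, "SENSE_B"),
    ({"prev": "money", "next": "loan"}, "SENSE_B"),
]

# The feature with the highest gain ratio becomes the test at the node.
best_feature = max(["prev", "next"], key=lambda f: gain_ratio(DATA, f))
```

On this toy data the "prev" feature separates the two senses perfectly while "next" carries no information about the sense, so "prev" would be selected as the root test, and each of its branches would end in a leaf holding a single sense label.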

Keywords

Word Sense Disambiguation, Decision Tree, Assamese, Supervised approach