Generating Multilingual Subjectivity Resources using English Language

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2016
Authors:
Vandana Jha, Shreedevi G. R., P. Deepa Shenoy, Venugopal K. R.
10.5120/ijca2016911946

Vandana Jha, Shreedevi G. R., P. Deepa Shenoy and Venugopal K. R. Generating Multilingual Subjectivity Resources using English Language. International Journal of Computer Applications 152(9):41-47, October 2016.

@article{10.5120/ijca2016911946,
	author = {Vandana Jha and Shreedevi G. R. and P. Deepa Shenoy and Venugopal K. R.},
	title = {Generating Multilingual Subjectivity Resources using English Language},
	journal = {International Journal of Computer Applications},
	issue_date = {October 2016},
	volume = {152},
	number = {9},
	month = {Oct},
	year = {2016},
	issn = {0975-8887},
	pages = {41-47},
	numpages = {7},
	url = {http://www.ijcaonline.org/archives/volume152/number9/26362-2016911946},
	doi = {10.5120/ijca2016911946},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}
}

Abstract

Text data can be of two types: facts and opinions. With the introduction of the UTF-8 standard and the development of Web 2.0, an abundance of opinionated text data is now available on the web in many languages. Subjectivity analysis aims to divide such data into subjective and objective sentences and to extract subjective information from it automatically. Many subjectivity resources and subjectivity analysis studies are available for English. In this paper, we examine different methods of generating subjectivity resources for Hindi and other Indian languages using resources and tools available for English. Two methods are proposed using word-level subjectivity annotations. The first uses the English-language OpinionFinder subjectivity lexicon; the second uses a small Hindi seed word list that is expanded to generate a subjectivity lexicon. Four methods are proposed using sentence-level subjectivity annotations. These methods use subjectivity-annotated corpora and tools available for English. Different evaluation strategies are used to validate the generated Hindi lexicon and corpora. The simulations conducted confirm that these methods are effective in rapidly creating subjectivity resources for Hindi and other Indian languages.
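The abstract does not spell out how the Hindi seed word list is expanded, so the following is only a minimal sketch of one common approach, assuming a breadth-first walk over synonym and antonym relations. The function name, the depth limit, the polarity labels, and the transliterated toy entries are all hypothetical illustrations, not the authors' data or algorithm. In practice the relations would come from a lexical resource such as a Hindi WordNet.

```python
from collections import deque

def expand_seed_lexicon(seeds, synonyms, antonyms, max_depth=2):
    """Grow a {word: label} lexicon from labelled seed words.

    Synonyms inherit the label of the word they were reached from;
    antonyms receive the opposite polarity. Words already in the
    lexicon are never relabelled. (Hypothetical helper, for illustration.)
    """
    flip = {"positive": "negative", "negative": "positive"}
    lexicon = dict(seeds)
    queue = deque((word, 0) for word in seeds)
    while queue:
        word, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand beyond the depth limit
        label = lexicon[word]
        for syn in synonyms.get(word, []):
            if syn not in lexicon:
                lexicon[syn] = label
                queue.append((syn, depth + 1))
        for ant in antonyms.get(word, []):
            if ant not in lexicon:
                lexicon[ant] = flip.get(label, label)
                queue.append((ant, depth + 1))
    return lexicon

# Toy relations (assumed, transliterated): accha = good, badhiya = great,
# bura = bad, kharab = poor.
toy_synonyms = {"accha": ["badhiya"], "bura": ["kharab"]}
toy_antonyms = {"accha": ["bura"]}
toy_seeds = {"accha": "positive"}

lexicon = expand_seed_lexicon(toy_seeds, toy_synonyms, toy_antonyms)
```

Every word that ends up in `lexicon` is treated as subjective; the depth limit is one simple way to limit label drift as the walk moves away from the hand-checked seeds.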

Keywords

Data Mining, Text Mining, Subjectivity Analysis, Hindi Language, Natural Language Processing.