Research Article

Dependency Parsing using the URDU.KON-TB Treebank

by Saima Munir, Qaisar Abbas, Bushra Jamil
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 167 - Number 12
Year of Publication: 2017
Authors: Saima Munir, Qaisar Abbas, Bushra Jamil
10.5120/ijca2017914492

Saima Munir, Qaisar Abbas, Bushra Jamil. Dependency Parsing using the URDU.KON-TB Treebank. International Journal of Computer Applications. 167, 12 (Jun 2017), 25-31. DOI=10.5120/ijca2017914492

@article{ 10.5120/ijca2017914492,
author = { Saima Munir and Qaisar Abbas and Bushra Jamil },
title = { Dependency Parsing using the URDU.KON-TB Treebank },
journal = { International Journal of Computer Applications },
issue_date = { Jun 2017 },
volume = { 167 },
number = { 12 },
month = { Jun },
year = { 2017 },
issn = { 0975-8887 },
pages = { 25-31 },
numpages = {7},
url = { https://ijcaonline.org/archives/volume167/number12/27823-2017914492/ },
doi = { 10.5120/ijca2017914492 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:14:55.528250+05:30
%A Saima Munir
%A Qaisar Abbas
%A Bushra Jamil
%T Dependency Parsing using the URDU.KON-TB Treebank
%J International Journal of Computer Applications
%@ 0975-8887
%V 167
%N 12
%P 25-31
%D 2017
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In this paper, we present an evaluation of the URDU.KON-TB treebank in the dependency parsing domain. The treebank was developed on the basis of phrase structure and a hyper dependency structure, which consists only of functional labels on constituents. It is annotated with three tagsets, semi-semantic POS (SSP), semi-semantic syntactic (SSS), and functional (F), and was previously evaluated in the phrase structure parsing domain. To evaluate it in the dependency parsing domain, we selected MaltParser. To use the data in the parser, we converted the annotated URDU.KON-TB data to the CoNLL format. We also measured the compatibility of the data with CoNLL, along with its usability for dependency parsing. A few assumptions were made to make the data compatible. The converted data were split into 80% training data and 20% test data. We performed eight experiments. Four were conducted on the converted data with six different feature models. Their results show that the URDU.KON-TB treebank, as converted, is not suitable for dependency parsing, because the head information required for dependency relations is missing from the treebank. We then performed four further experiments with an assumption-based enhancement that adds head information. Training and testing used the Nivre arc-eager algorithm. These experiments show that the treebank data can be used to develop a new dependency treebank for Urdu.
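The conversion and split described in the abstract can be illustrated with a minimal Python sketch. It assumes the ten-column CoNLL-X layout and writes `_` for any field that is not available, which is how a missing head or dependency relation would appear. The function names, the placeholder tokens, the tags `NN` and `VB`, the labels `SUBJ` and `ROOT`, and the seeded shuffle before splitting are all illustrative assumptions. They are not the URDU.KON-TB tagsets and not the authors' actual procedure.

```python
import random

# Column order of the CoNLL-X format.
CONLL_COLUMNS = ["ID", "FORM", "LEMMA", "CPOSTAG", "POSTAG",
                 "FEATS", "HEAD", "DEPREL", "PHEAD", "PDEPREL"]


def to_conll_line(index, form, pos, head=None, deprel=None):
    """Format one token as a tab-separated CoNLL-X row.

    Fields that are unknown (for example a missing head) are written as '_'.
    """
    fields = {
        "ID": str(index),
        "FORM": form,
        "CPOSTAG": pos,
        "POSTAG": pos,
        "HEAD": "_" if head is None else str(head),
        "DEPREL": "_" if deprel is None else deprel,
    }
    return "\t".join(fields.get(col, "_") for col in CONLL_COLUMNS)


def sentence_to_conll(tokens):
    """Convert a list of (form, pos[, head, deprel]) tuples to CoNLL-X rows."""
    return "\n".join(to_conll_line(i, *tok)
                     for i, tok in enumerate(tokens, start=1))


def split_80_20(sentences, seed=0):
    """Split sentences into 80% training and 20% test data.

    Shuffling with a fixed seed is an assumption made for this sketch.
    """
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Placeholder sentence without head information (HEAD and DEPREL are '_').
    print(sentence_to_conll([("w1", "NN"), ("w2", "VB")]))
    # The same sentence after hypothetical head information is added.
    print(sentence_to_conll([("w1", "NN", 2, "SUBJ"), ("w2", "VB", 0, "ROOT")]))
```

Rows whose HEAD column is `_` carry no dependency arcs for a data-driven parser to learn from, which matches the abstract's observation that head information had to be added before the later experiments.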

References
  1. Abbas, Q. (2014). Building Computational Resources: The URDU.KON-TB Treebank and the Urdu Parser (Doctoral dissertation).
  2. Ali, W., & Hussain, S. (2010). Urdu dependency parser: a data-driven approach. In Proceedings of Conference on Language and Technology (CLT10), SNLP, Lahore, Pakistan.
  3. Ali, W. (2010). Data-Driven Dependency Parsing for Urdu, MS (MPhil), Computer Sciences thesis, Department of Computer Sciences, National University of Computer and Emerging Sciences (NUCES), Lahore, Pakistan.
  4. Bhat, R. A., Jain, S., & Sharma, D. M. (2012). Experiments on dependency parsing of Urdu. Proceedings of TLT11, 31-36.
  5. Bhat, R. A., & Sharma, D. M. (2012, July). A dependency treebank of Urdu and its evaluation. In Proceedings of the Sixth Linguistic Annotation Workshop (pp. 157-165). Association for Computational Linguistics.
  6. Abbas, Q. (2014). Semi-semantic part of speech annotation and evaluation. LAW VIII, 75.
  7. Nivre, J., Hall, J., & Nilsson, J. (2006, May). Maltparser: A data-driven parser-generator for dependency parsing. In Proceedings of LREC (Vol. 6, pp. 2216-2219).
  8. Bharati, A., Husain, S., Ambati, B., Jain, S., Sharma, D., & Sangal, R. (2008). Two semantic features make all the difference in parsing accuracy. Proc. of ICON, 8.
  9. Ballesteros, M., & Nivre, J. (2012, May). MaltOptimizer: A System for MaltParser Optimization. In LREC (pp. 2757-2763).
  10. Nivre, J., Hall, J., Nilsson, J., Chanev, A., Eryigit, G., Kübler, S., ... & Marsi, E. (2007). MaltParser: A language-independent system for data-driven dependency parsing. Natural Language Engineering, 13(02), 95-135.
  11. Bohnet, B., & Nivre, J. (2012, July). A transition-based system for joint part-of-speech tagging and labeled non-projective dependency parsing. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (pp. 1455-1465). Association for Computational Linguistics.
  12. Spreyer, K., & Kuhn, J. (2009, June). Data-driven dependency parsing of new languages using incomplete and noisy training data. In Proceedings of the Thirteenth Conference on Computational Natural Language Learning (pp. 12-20). Association for Computational Linguistics.
  13. Ambati, B. R., Husain, S., Nivre, J., & Sangal, R. (2010, June). On the role of morphosyntactic features in Hindi dependency parsing. In Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages (pp. 94-102). Association for Computational Linguistics.
  14. Nilsson, J. (2009). Transformation and Combination in Data-Driven Dependency Parsing.
  15. Nivre, J. (2008). Sorting out dependency parsing. In Advances in Natural Language Processing (pp. 16-27). Springer Berlin Heidelberg.
  16. Abbas, Q. 2015, Morphologically rich Urdu grammar parsing using Earley algorithm, Natural Language Engineering (NLE), Vol.21(2), PP.1-36, ISSN: 1351-3249, DOI: 10.1017/S1351324915000133, Cambridge University Press, UK
  17. N. Chomsky. Three Models For The Description Of Language. Information Theory, IRE Transactions on, 2(3):113–124, 1956.
  18. PUNEETH, K. (2016). Dependency Parsing and Empty Category Detection in Hindi Language (Doctoral dissertation, International Institute of Information Technology Hyderabad).
  19. GADE, R. P. (2014). Dependency parsing approaches for Indian Languages: Hindi and Sanskrit (Doctoral dissertation, International Institute of Information Technology Hyderabad).
  20. J. Nivre, Inductive Dependency Parsing, Springer, 2006.
  21. M. Marcus, B. Santorini, and M.A. Marcinkiewicz, "Building a large annotated corpus of English: The Penn Treebank", Computational Linguistics 1993
Index Terms

Computer Science
Information Sciences

Keywords

Phrase structure parsing, Data Driven Dependency Parsing, MaltParser