Research Article

Named Entity Recognition in Telugu Language using Language Dependent Features and Rule based Approach

by B. Sasidhar, P. M. Yohan, Dr. A. Vinaya Babu, Dr. A. Govardhan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 22 - Number 8
Year of Publication: 2011
10.5120/2602-3628

B. Sasidhar, P. M. Yohan, Dr. A. Vinaya Babu, Dr. A. Govardhan. Named Entity Recognition in Telugu Language using Language Dependent Features and Rule based Approach. International Journal of Computer Applications. 22, 8 (May 2011), 30-34. DOI=10.5120/2602-3628

@article{ 10.5120/2602-3628,
author = { B. Sasidhar and P. M. Yohan and A. Vinaya Babu and A. Govardhan },
title = { Named Entity Recognition in Telugu Language using Language Dependent Features and Rule based Approach },
journal = { International Journal of Computer Applications },
issue_date = { May 2011 },
volume = { 22 },
number = { 8 },
month = { May },
year = { 2011 },
issn = { 0975-8887 },
pages = { 30-34 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume22/number8/2602-3628/ },
doi = { 10.5120/2602-3628 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:08:52.845448+05:30
%A B. Sasidhar
%A P. M. Yohan
%A Dr. A. Vinaya Babu
%A Dr. A. Govardhan
%T Named Entity Recognition in Telugu Language using Language Dependent Features and Rule based Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 22
%N 8
%P 30-34
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The objective of Named Entity Recognition (NER) is to categorize all named entities in a document into predefined classes such as person, organization, location, brand name and others. NER is difficult in Indian languages such as Telugu, Hindi, Bengali and Urdu, where sufficient gazetteers and annotated corpora are not available, in contrast to English. A rule-based system is also very difficult to implement for Indian languages such as Telugu, because the grammatical and linguistic analysis needed to write rules is lacking. In this paper we describe the identification of named entities in Telugu using gazetteer lists, language-dependent features and a rule-based approach. We describe a two-phase approach to NER. The first phase identifies nouns using Telugu dictionaries, a noun morphological stemmer and noun suffixes. The second phase identifies named entities using transliterated gazetteer lists for the different named entity tags, together with named entity suffix features, context features and morphological features.
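
The two-phase idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the romanized word lists, suffix lists, context cue, tag names and function names below are all hypothetical placeholders chosen for illustration, and a real system would rely on full Telugu dictionaries, a proper morphological stemmer and much larger gazetteers.

```python
# Illustrative sketch only. Every lexicon entry, suffix and tag name here is a
# hypothetical, romanized placeholder, not data from the paper.

NOUN_DICTIONARY = {"hyderabad", "ramu", "pustakam"}
NOUN_SUFFIXES = ("lo", "ki", "nu")            # assumed noun case-marker suffixes

GAZETTEERS = {
    "PERSON": {"ramu"},
    "LOCATION": {"hyderabad"},
}
NE_SUFFIXES = {
    "PERSON": ("rao", "reddy"),
    "LOCATION": ("puram", "palli"),
}
CONTEXT_CUES = {"PERSON": {"sri"}}            # cue word directly before a name


def stem(token):
    """Strip the first matching noun suffix, keeping at least 3 characters."""
    for suffix in NOUN_SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def find_nouns(tokens):
    """Phase 1: return (index, stem) pairs for tokens that look like nouns.

    A token counts as a noun candidate if a noun suffix was stripped from it
    or if its stem appears in the noun dictionary.
    """
    nouns = []
    for i, token in enumerate(tokens):
        root = stem(token)
        if root != token or root in NOUN_DICTIONARY:
            nouns.append((i, root))
    return nouns


def classify(tokens, index, root):
    """Phase 2: tag one noun using gazetteer, then suffix, then context."""
    for tag, names in GAZETTEERS.items():
        if root in names:
            return tag
    for tag, suffixes in NE_SUFFIXES.items():
        if root.endswith(suffixes):
            return tag
    previous = tokens[index - 1] if index > 0 else None
    for tag, cues in CONTEXT_CUES.items():
        if previous in cues:
            return tag
    return None


def recognize(tokens):
    """Run both phases and return (token, tag) pairs for recognized entities."""
    entities = []
    for index, root in find_nouns(tokens):
        tag = classify(tokens, index, root)
        if tag is not None:
            entities.append((tokens[index], tag))
    return entities


print(recognize(["ramuki", "hyderabadlo", "pustakam", "dorikindi"]))
```

In this sketch the gazetteer lookup is tried first, the suffix features second and the context cue last; that ordering is a design assumption made here, not something the abstract specifies.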

Index Terms

Computer Science
Information Sciences

Keywords

NER, morphological stemmer, NER features