Research Article

Survey of Named Entity Recognition Systems with respect to Indian and Foreign Languages

by Nita Patil, Ajay S. Patil, B. V. Pawar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 134 - Number 16
Year of Publication: 2016
Authors: Nita Patil, Ajay S. Patil, B. V. Pawar
10.5120/ijca2016908197

Nita Patil, Ajay S. Patil, B. V. Pawar. Survey of Named Entity Recognition Systems with respect to Indian and Foreign Languages. International Journal of Computer Applications. 134, 16 (January 2016), 21-26. DOI=10.5120/ijca2016908197

@article{ 10.5120/ijca2016908197,
author = { Nita Patil and Ajay S. Patil and B. V. Pawar },
title = { Survey of Named Entity Recognition Systems with respect to Indian and Foreign Languages },
journal = { International Journal of Computer Applications },
issue_date = { January 2016 },
volume = { 134 },
number = { 16 },
month = { January },
year = { 2016 },
issn = { 0975-8887 },
pages = { 21-26 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume134/number16/23999-2016908197/ },
doi = { 10.5120/ijca2016908197 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:34:23.821868+05:30
%A Nita Patil
%A Ajay S. Patil
%A B. V. Pawar
%T Survey of Named Entity Recognition Systems with respect to Indian and Foreign Languages
%J International Journal of Computer Applications
%@ 0975-8887
%V 134
%N 16
%P 21-26
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Named Entity Recognition (NER) is a subtask of Information Extraction that identifies named entities and classifies them into named entity classes such as person, location, and organization. NER can be used to preprocess textual information and convert it into a structured form useful for Information Retrieval, Machine Translation, Question Answering, and Text Summarization. This paper presents a survey of NER research done for various Indian and non-Indian languages. It reports the study and observations related to the approaches, techniques, and features required to implement NER for various languages, especially Indian languages.
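The two steps named in the abstract, identification and classification, can be illustrated with a minimal sketch. It uses a plain dictionary lookup (a gazetteer) and is not taken from any system covered by the survey. The gazetteer entries, the `Entity` record, and the `recognize` function are hypothetical names introduced only for this illustration.

```python
import re
from typing import List, NamedTuple

# Hypothetical toy gazetteer (assumption for illustration only).
# Practical NER systems rely on rules, statistical models, or both.
GAZETTEER = {
    "Asha Rao": "PERSON",
    "Pune": "LOCATION",
    "IIT Bombay": "ORGANIZATION",
}


class Entity(NamedTuple):
    """Structured record produced for each recognized name."""
    text: str
    label: str
    start: int
    end: int


def recognize(text: str) -> List[Entity]:
    """Identify gazetteer names in `text` and classify each one."""
    entities = []
    for name, label in GAZETTEER.items():
        # Whole-word match of the literal name (identification step).
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            # Attach the class from the gazetteer (classification step).
            entities.append(
                Entity(match.group(), label, match.start(), match.end())
            )
    # Return entities in the order they appear in the text.
    return sorted(entities, key=lambda e: e.start)


if __name__ == "__main__":
    for entity in recognize("Asha Rao moved from Pune to IIT Bombay."):
        print(entity)
```

The output is a list of records holding the text span, class label, and character offsets, which is one simple form of the structured representation the abstract mentions. A lookup of this kind cannot handle unseen names or ambiguous ones, which is why real systems go beyond it.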

Index Terms

Computer Science
Information Sciences

Keywords

NER tools, Information Extraction, Machine Translation