Research Article

Hybrid Approach for Named Entity Recognition

by Kanwalpreet Singh Bajwa, Amardeep Kaur
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 118 - Number 1
Year of Publication: 2015
Authors: Kanwalpreet Singh Bajwa, Amardeep Kaur
10.5120/20713-3048

Kanwalpreet Singh Bajwa, Amardeep Kaur. Hybrid Approach for Named Entity Recognition. International Journal of Computer Applications. 118, 1 (May 2015), 36-41. DOI=10.5120/20713-3048

@article{ 10.5120/20713-3048,
author = { Kanwalpreet Singh Bajwa and Amardeep Kaur },
title = { Hybrid Approach for Named Entity Recognition },
journal = { International Journal of Computer Applications },
issue_date = { May 2015 },
volume = { 118 },
number = { 1 },
month = { May },
year = { 2015 },
issn = { 0975-8887 },
pages = { 36-41 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume118/number1/20713-3048/ },
doi = { 10.5120/20713-3048 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:00:32.577235+05:30
%A Kanwalpreet Singh Bajwa
%A Amardeep Kaur
%T Hybrid Approach for Named Entity Recognition
%J International Journal of Computer Applications
%@ 0975-8887
%V 118
%N 1
%P 36-41
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper proposes a Named Entity Recognition (NER) system for the Punjabi language that uses a hybrid approach, combining a rule-based approach with a machine learning approach, namely a Hidden Markov Model (HMM). Because no dataset was available, Named Entities (NEs) were tagged manually under linguistic supervision to create the training and testing datasets. The proposed system recognizes the following entity types: Name of person, Location, Time, Date, Designation, Organization, Title-person, Event, Abbreviation, Facility, Number, Artifact, Relation and Measure. Two versions of the Punjabi NER system are presented: the first uses the HMM only, and the second uses the hybrid approach, in which the HMM is combined with handcrafted rules. The hybrid system achieves a precision of 72.92%, a recall of 76.27% and an F-measure of 74.56%, whereas the HMM-only system achieves a precision, recall and F-measure of 47.57%, 48.98% and 48.27%, respectively. This comparison shows that the proposed hybrid NER system performs better than the HMM alone.
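
The hybrid idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the tag subset, the probability tables, the transliterated tokens, the two regular-expression rules and the function names (viterbi, apply_rules, hybrid_tag, f_measure) are all assumptions chosen for illustration. A toy HMM is decoded with the Viterbi algorithm, and handcrafted rules then override the HMM tag wherever a rule fires. The last function computes the F-measure as the harmonic mean of precision and recall, which is the standard definition and is assumed here to be the one used in the paper.

```python
import math
import re

# Hypothetical toy parameters. A real system would estimate them from the
# manually tagged Punjabi corpus; the tokens below are transliterated
# placeholders, not data from the paper.
TAGS = ["O", "PERSON", "LOCATION"]
START = {"O": 0.6, "PERSON": 0.3, "LOCATION": 0.1}
TRANS = {
    "O": {"O": 0.7, "PERSON": 0.2, "LOCATION": 0.1},
    "PERSON": {"O": 0.6, "PERSON": 0.3, "LOCATION": 0.1},
    "LOCATION": {"O": 0.8, "PERSON": 0.1, "LOCATION": 0.1},
}
EMIT = {
    "O": {"visited": 0.4, "on": 0.4},
    "PERSON": {"gurpreet": 0.8},
    "LOCATION": {"amritsar": 0.8},
}
UNSEEN = 1e-6  # crude smoothing for words missing from the emission table


def viterbi(words):
    """Return the most probable tag sequence under the toy HMM."""
    def emit(tag, word):
        return math.log(EMIT[tag].get(word.lower(), UNSEEN))

    # score[t] is the best log-probability of any path ending in tag t.
    score = {t: math.log(START[t]) + emit(t, words[0]) for t in TAGS}
    back = []  # one dict of back-pointers per word after the first
    for word in words[1:]:
        prev, score, pointers = score, {}, {}
        for t in TAGS:
            best = max(TAGS, key=lambda p: prev[p] + math.log(TRANS[p][t]))
            score[t] = prev[best] + math.log(TRANS[best][t]) + emit(t, word)
            pointers[t] = best
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: score[t])
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]


# Hypothetical handcrafted rules for entity types with a regular surface form.
DATE_RULE = re.compile(r"^\d{1,2}[/-]\d{1,2}[/-]\d{2,4}$")
NUMBER_RULE = re.compile(r"^\d+$")


def apply_rules(words, tags):
    """Override the HMM tag wherever a handcrafted rule matches the token."""
    out = []
    for word, tag in zip(words, tags):
        if DATE_RULE.match(word):
            out.append("DATE")
        elif NUMBER_RULE.match(word):
            out.append("NUMBER")
        else:
            out.append(tag)
    return out


def hybrid_tag(words):
    """HMM decoding followed by rule-based correction."""
    return apply_rules(words, viterbi(words))


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


sentence = ["Gurpreet", "visited", "Amritsar", "on", "12/05/2015"]
print(viterbi(sentence))
print(hybrid_tag(sentence))
print(round(f_measure(72.92, 76.27), 2))
```

In this sketch the HMM alone has never seen the date token and labels it O, while the rule layer relabels it DATE. Applying f_measure to the reported hybrid precision and recall gives 74.56 to two decimals, consistent with the figure in the abstract.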

Index Terms

Computer Science
Information Sciences

Keywords

Named Entity Recognition (NER), NLP, Hidden Markov Model (HMM), Rule-based approach