Information Extraction from Clinical Text using NLP and Machine Learning: Issues and Opportunities

IJCA Proceedings on National Conference on “Recent Trends in Information Technology”
© 2016 by IJCA Journal
NCRTIT 2016 - Number 2
Year of Publication: 2016
M. Sridevi
Arunkumar B. R.

M. Sridevi and Arunkumar B. R. Article: Information Extraction from Clinical Text using NLP and Machine Learning: Issues and Opportunities. IJCA Proceedings on National Conference on Recent Trends in Information Technology NCRTIT 2016(2):11-16, August 2016. Full text available.

BibTeX:

	author = {M. Sridevi and Arunkumar B. R.},
	title = {Article: Information Extraction from Clinical Text using NLP and Machine Learning: Issues and Opportunities},
	journal = {IJCA Proceedings on National Conference on Recent Trends in Information Technology},
	year = {2016},
	volume = {NCRTIT 2016},
	number = {2},
	pages = {11-16},
	month = {August},
	note = {Full text available}


Natural Language Processing (NLP) and Machine Learning are rapidly gaining importance in the era of data digitalization. The value of data changes over time, which makes it important to harness that value for in-depth research in various domains. Extracting information from clinical text helps in automated terminology management, data mining, de-identification of clinical text, identifying research subjects and studying the effect of research on them, predicting the onset and progress of various chronic diseases, and analysing diseases, treatments and side effects. Methods based on NLP and Machine Learning tend to perform better in this area, but analysing clinical text requires more experience than analysing the biomedical literature. The issues and opportunities in information extraction from clinical text need to be intensively reviewed to find new avenues in this domain of research.

