
Lexical Analysis of Religious Texts using Text Mining and Machine Learning Tools

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2017
Mayuri Verma

Mayuri Verma. Lexical Analysis of Religious Texts using Text Mining and Machine Learning Tools. International Journal of Computer Applications 168(8):39-45, June 2017.

BibTeX:

	author = {Mayuri Verma},
	title = {Lexical Analysis of Religious Texts using Text Mining and Machine Learning Tools},
	journal = {International Journal of Computer Applications},
	issue_date = {June 2017},
	volume = {168},
	number = {8},
	month = {Jun},
	year = {2017},
	issn = {0975-8887},
	pages = {39-45},
	numpages = {7},
	url = {},
	doi = {10.5120/ijca2017914486},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}


This paper presents a text mining approach for exploring the similarities and differences between religious texts using part-of-speech (POS) tagging and a term-document matrix. Automated text mining and machine learning tools were used for lexical analysis of ten world-famous religious texts: the Holy Bible, the Dhammapada, the Tao Te Ching, the Bhagwad Gita, the Guru Granth Sahib, the Agama, the Quran, the Rig Veda, the Sarbachan and the Torah. The extracted noun categories were used as features to explore relationships between these religions and the ideas that have emerged in different religions from different geographic regions.
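The pipeline described in the abstract (keep the nouns, then build a term-document matrix) can be sketched roughly as follows. This is a minimal illustration in Python, not the author's implementation; the keywords suggest the paper used R. The noun lexicon, the sample snippets, and the names `NOUN_LEXICON`, `DOCS`, `extract_nouns`, `term_document_matrix` and `cosine` are all hypothetical. The lexicon lookup stands in for a real POS tagger, and cosine similarity is an assumed comparison measure that the abstract does not name.

```python
import math
import re
from collections import Counter

# Toy stand-in for a POS tagger: a hand-made noun list (hypothetical).
# A real pipeline would tag every token and keep those labelled as nouns.
NOUN_LEXICON = {"god", "soul", "path", "truth", "mind", "heaven", "earth", "way"}

# Placeholder snippets for illustration only, NOT quotations from the texts.
DOCS = {
    "text_a": "The mind follows the path and the path leads the mind to truth.",
    "text_b": "God made heaven and earth, and God saw the truth.",
    "text_c": "The way of heaven is the way of earth.",
}


def extract_nouns(text):
    """Lowercase the text, split it into alphabetic tokens, keep known nouns."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t in NOUN_LEXICON]


def term_document_matrix(docs):
    """Return (sorted vocabulary, {document name: count vector over vocabulary})."""
    counts = {name: Counter(extract_nouns(text)) for name, text in docs.items()}
    vocab = sorted(set().union(*counts.values()))
    matrix = {name: [c[term] for term in vocab] for name, c in counts.items()}
    return vocab, matrix


def cosine(u, v):
    """Cosine similarity of two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


vocab, tdm = term_document_matrix(DOCS)

if __name__ == "__main__":
    print(vocab)
    for name, row in tdm.items():
        print(name, row)
    print(round(cosine(tdm["text_b"], tdm["text_c"]), 3))
```

Each row of the matrix is a noun-frequency profile of one document, so documents that share vocabulary score higher under the assumed similarity measure, while documents with no nouns in common score zero.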




Keywords: Religious Texts, POS Tagging, R, Lexical Analysis