Research Article

BERT vs. Logistic Regression: Classifying Mental Health-Related Text using Machine Learning and Natural Language Processing

by Amila Hrnjić, Zerina Altoka
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Number 104
Year of Publication: 2026
Authors: Amila Hrnjić, Zerina Altoka
10.5120/ijca1dc7d28ef5a6

Amila Hrnjić, Zerina Altoka. BERT vs. Logistic Regression: Classifying Mental Health-Related Text using Machine Learning and Natural Language Processing. International Journal of Computer Applications. 187, 104 (May 2026), 32-39. DOI=10.5120/ijca1dc7d28ef5a6

@article{ 10.5120/ijca1dc7d28ef5a6,
author = { Amila Hrnjić and Zerina Altoka },
title = { BERT vs. Logistic Regression: Classifying Mental Health-Related Text using Machine Learning and Natural Language Processing },
journal = { International Journal of Computer Applications },
issue_date = { May 2026 },
volume = { 187 },
number = { 104 },
month = { May },
year = { 2026 },
issn = { 0975-8887 },
pages = { 32-39 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume187/number104/bert-vs-logistic-regression-classifying-mental-health-related-text-using-machine-learning-and-natural-language-processing/ },
doi = { 10.5120/ijca1dc7d28ef5a6 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2026-05-17T02:29:17+05:30
%A Amila Hrnjić
%A Zerina Altoka
%T BERT vs. Logistic Regression: Classifying Mental Health-Related Text using Machine Learning and Natural Language Processing
%J International Journal of Computer Applications
%@ 0975-8887
%V 187
%N 104
%P 32-39
%D 2026
%I Foundation of Computer Science (FCS), NY, USA
Abstract

With the rise of mental health discussions in online spaces, the ability to automatically detect emotionally sensitive content has become increasingly important. This study compares a traditional machine learning (ML) method with a deep learning model to evaluate their effectiveness in classifying mental health-related texts. Two models were tested: logistic regression (LR) with TF-IDF features and the BERT transformer model. A balanced dataset of labeled text samples was used, with standard natural language processing (NLP) preprocessing applied. Model performance was evaluated using precision, recall, F1-score, and AUC. BERT outperformed logistic regression on all four metrics, achieving an F1-score of 0.95 and an AUC of 0.99. Confusion matrices and ROC curves confirmed BERT's higher accuracy and its ability to reduce false classifications. These findings highlight the strength of deep learning models in understanding nuanced language, which is crucial in the mental health domain. Overall, the study confirms that transformer-based models like BERT offer a more reliable approach to classifying emotionally sensitive content, with promising applications in early detection tools and mental health support systems.
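The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The code below is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are hypothetical and were chosen only to show how precision, recall, F1-score, and AUC are computed for a binary classifier.

```python
from typing import List, Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive example receives a
    higher score than a random negative one (ties count as one half)."""
    pos: List[float] = [s for t, s in zip(y_true, scores) if t == 1]
    neg: List[float] = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = mental health-related, 0 = not related.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]  # assumed model probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} auc={auc:.3f}")
```

Precision, recall, and F1 depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is why the two kinds of metric are usually reported together.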

Index Terms

Computer Science
Information Sciences

Keywords

Mental health, Text classification, BERT, Machine Learning, NLP