| International Journal of Computer Applications |
| Foundation of Computer Science (FCS), NY, USA |
| Volume 187 - Number 60 |
| Year of Publication: 2025 |
| Authors: Alan Janbey |
| DOI: 10.5120/ijca2025925566 |
Alan Janbey. Cross-Platform NLP Framework for Detecting LGBTQIA Hate Speech: Evaluation on Reddit and Simulated Twitter Datasets. International Journal of Computer Applications. 187, 60 (Nov 2025), 1-12. DOI=10.5120/ijca2025925566
Online hate speech targeting the LGBTQIA community presents a persistent challenge to social cohesion and individual well-being. This study proposes a computational approach to detecting and mitigating such content using Natural Language Processing (NLP) techniques. Data were collected from public Reddit forums, annotated into offensive and acceptable categories, and pre-processed using tokenisation, normalisation, and stopword removal. Both Count Vectorisation and TF-IDF Vectorisation were employed to generate features for training a Decision Tree Classifier. To enhance robustness and assess cross-platform applicability, a simulated evaluation was also conducted on a representative Twitter dataset. The Reddit dataset evaluation yielded an accuracy of 0.76, with strong precision for acceptable content but lower precision for offensive content due to vocabulary variability. The simulated Twitter dataset showed improved balance between precision and recall, achieving an accuracy of 0.81. High-resolution visualisations, including word clouds, class distribution charts, and an NLP workflow diagram, provide insights into data characteristics and model architecture. The results indicate that the proposed approach is effective for detecting offensive language in LGBTQIA-related discourse and adaptable to multiple social media platforms. Future research will explore multilingual extensions, multimodal content analysis, and real-time deployment for proactive content moderation.
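The pre-processing and feature-extraction steps named in the abstract (tokenisation, normalisation, stopword removal, then Count and TF-IDF vectorisation) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the abstract does not specify their tooling, so the function names, the tiny stopword list, the URL-stripping rule, and the smoothed IDF formula below are all assumptions made for illustration, and the example texts are benign placeholders rather than data from the study.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; the study's list is not given).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "this"}


def preprocess(text):
    """Normalise (lowercase, drop URLs and non-letters), tokenise, remove stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # strip URLs (assumed rule)
    text = re.sub(r"[^a-z\s]", " ", text)       # keep only letters and whitespace
    return [tok for tok in text.split() if tok not in STOPWORDS]


def build_vocabulary(token_lists):
    """Map each distinct token to a column index, in alphabetical order."""
    vocab = sorted({tok for toks in token_lists for tok in toks})
    return {tok: i for i, tok in enumerate(vocab)}


def count_vectors(token_lists, vocab):
    """Count vectorisation: one row per document, one column per vocabulary term."""
    rows = []
    for toks in token_lists:
        counts = Counter(toks)
        # Iterating the dict yields tokens in column-index order.
        rows.append([counts.get(tok, 0) for tok in vocab])
    return rows


def tfidf_vectors(count_rows):
    """TF-IDF weighting with a smoothed IDF: ln((1 + n) / (1 + df)) + 1 (assumed variant)."""
    n_docs = len(count_rows)
    n_terms = len(count_rows[0]) if count_rows else 0
    doc_freq = [sum(1 for row in count_rows if row[j] > 0) for j in range(n_terms)]
    idf = [math.log((1 + n_docs) / (1 + df)) + 1.0 for df in doc_freq]
    return [[count * weight for count, weight in zip(row, idf)] for row in count_rows]


if __name__ == "__main__":
    # Benign placeholder posts, not taken from the Reddit or Twitter data.
    docs = [
        "This community is welcoming and kind!",
        "Kind people are welcome here: https://example.org",
        "The moderators removed the rude comment.",
    ]
    tokens = [preprocess(d) for d in docs]
    vocab = build_vocabulary(tokens)
    counts = count_vectors(tokens, vocab)
    tfidf = tfidf_vectors(counts)
    for doc_tokens, row in zip(tokens, tfidf):
        print(doc_tokens, [round(v, 3) for v in row])
```

Either feature matrix, paired with the offensive/acceptable labels, could then be passed to a decision tree learner (for example, scikit-learn's `DecisionTreeClassifier`). Terms that appear in fewer documents receive a larger TF-IDF weight than terms shared across many documents, which is the main practical difference from raw counts.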