Research Article

XGBoost-based Employee Attrition Prediction with SHAP Explainability: A Comparative Study of Supervised Classification Algorithms on the IBM HR Analytics Dataset

by Anurag Bodkhe, Sahil Jirapure, Ujjwal Garud, Shrinivas Bhore
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Number 99
Year of Publication: 2026
Authors: Anurag Bodkhe, Sahil Jirapure, Ujjwal Garud, Shrinivas Bhore
10.5120/ijca1f36c9153a77

Anurag Bodkhe, Sahil Jirapure, Ujjwal Garud, Shrinivas Bhore. XGBoost-based Employee Attrition Prediction with SHAP Explainability: A Comparative Study of Supervised Classification Algorithms on the IBM HR Analytics Dataset. International Journal of Computer Applications. 187, 99 (Apr 2026), 7-11. DOI=10.5120/ijca1f36c9153a77

@article{ 10.5120/ijca1f36c9153a77,
author = { Anurag Bodkhe and Sahil Jirapure and Ujjwal Garud and Shrinivas Bhore },
title = { XGBoost-based Employee Attrition Prediction with SHAP Explainability: A Comparative Study of Supervised Classification Algorithms on the IBM HR Analytics Dataset },
journal = { International Journal of Computer Applications },
issue_date = { Apr 2026 },
volume = { 187 },
number = { 99 },
month = { Apr },
year = { 2026 },
issn = { 0975-8887 },
pages = { 7-11 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume187/number99/xgboost-based-employee-attrition-prediction-with-shap-explainability-a-comparative-study-of-supervised-classification-algorithms-on-the-ibm-hr-analytics-dataset/ },
doi = { 10.5120/ijca1f36c9153a77 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2026-04-28T21:29:24.423858+05:30
%A Anurag Bodkhe
%A Sahil Jirapure
%A Ujjwal Garud
%A Shrinivas Bhore
%T XGBoost-based Employee Attrition Prediction with SHAP Explainability: A Comparative Study of Supervised Classification Algorithms on the IBM HR Analytics Dataset
%J International Journal of Computer Applications
%@ 0975-8887
%V 187
%N 99
%P 7-11
%D 2026
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Employee attrition remains one of the most consequential workforce challenges facing contemporary organizations, with replacement costs estimated between 50% and 200% of an affected employee's annual compensation. This paper presents the design, implementation, and empirical evaluation of an Employee Attrition Prediction System (EAPS) built on supervised machine learning techniques applied to the IBM HR Analytics dataset comprising 1,470 employee records and 35 workforce attributes. Four classification algorithms—Logistic Regression, Support Vector Machine (SVM), Random Forest, and XGBoost—are systematically trained, tuned, and evaluated under realistic class-imbalance conditions using the Synthetic Minority Oversampling Technique (SMOTE). Three domain-informed engineered features are introduced: Compensation Ratio, Tenure per Job, and Years Without Change. Experimental results demonstrate that XGBoost achieves superior performance across all five evaluation metrics, attaining 97.2% accuracy, 96.8% precision, 95.4% recall, a macro F1 score of 96.1%, and an AUC-ROC of 0.991. A modular six-component system architecture is proposed, culminating in an HR decision-support dashboard leveraging SHAP (SHapley Additive exPlanations) values for individualized, interpretable attrition risk assessments.
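
As a minimal sketch of the five evaluation metrics named in the abstract, the following pure-Python snippet computes accuracy, precision, recall, macro F1, and AUC-ROC on a made-up, imbalanced ten-record sample. The labels, scores, 0.5 decision threshold, and function names are illustrative assumptions. They are not the paper's data or code, and a real pipeline would more likely rely on a library such as scikit-learn.

```python
# Toy sample (hypothetical): 1 = employee left, 0 = employee stayed.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical predicted attrition probabilities from some classifier.
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.1, 0.05, 0.4, 0.35]
# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 with `positive` treated as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true))
    return sum(precision_recall_f1(y_true, y_pred, c)[2] for c in classes) / len(classes)


def auc_roc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    prec, rec, _ = precision_recall_f1(y_true, y_pred, positive=1)
    print(f"accuracy={accuracy(y_true, y_pred):.3f} precision={prec:.3f} "
          f"recall={rec:.3f} macro_f1={macro_f1(y_true, y_pred):.3f} "
          f"auc={auc_roc(y_true, scores):.3f}")
```

On this toy sample accuracy is 0.8 while macro F1 is about 0.76, which illustrates why studies on imbalanced attrition data typically report class-sensitive metrics alongside accuracy.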

Index Terms

Computer Science
Information Sciences

Keywords

Machine Learning, Employee Attrition Prediction, XGBoost, Random Forest, SMOTE, Explainable AI, Binary Classification, IBM HR Dataset, SHAP, Feature Engineering