Comparative Evaluation of Ensemble Learning Algorithms for Early Detection of Diabetes

Abdoul Malik

Abstract

Diabetes is a severe metabolic disorder that poses a significant worldwide health challenge. Early detection is crucial to avoid serious complications and reduce the burden on healthcare systems. Despite the rising use of machine learning in medical diagnostics, many studies rely on a single algorithm, which limits predictive robustness and effectiveness. This study presents a comparative evaluation of five widely used ensemble learning algorithms (Random Forest, AdaBoost, XGBoost, LightGBM, and CatBoost) on the early-stage diabetes risk prediction dataset. The algorithms were trained and validated using 10-fold cross-validation, and their performance was assessed using accuracy, precision, recall, F1-score, and AUC. Random Forest outperformed the other models, with an accuracy of 98.46%, an F1-score of 98.75%, and an AUC of 100%, closely followed by LightGBM and CatBoost. The findings indicate that ensemble learning approaches, especially Random Forest, demonstrate practical and generalizable predictive performance and provide highly reliable predictions for early diabetes detection that surpass those of other methods. They also highlight the practical value of ensemble learning in clinical contexts, offering robust and interpretable tools to support timely diabetes diagnosis and decision-making.
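
For readers unfamiliar with the evaluation protocol named in the abstract, the following is a minimal sketch of how a stratified k-fold split and the reported metrics can be computed. It is not the author's code: the function names (stratified_folds, binary_metrics, auc) are hypothetical, the labels are assumed to be binary with 1 as the positive class, and the round-robin fold assignment is only one simple way to keep class ratios similar across folds.

```python
from collections import defaultdict


def stratified_folds(labels, k=10):
    """Assign each sample index to one of k folds, keeping class ratios similar."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    # Deal the indices of each class round-robin across the folds.
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for labels in {0, 1} (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a 10-fold run, each fold returned by stratified_folds would serve once as the held-out set while the model is trained on the other nine, and the per-fold metrics would then be averaged. Under this pairwise definition, an AUC of 100% means that every positive case received a higher score than every negative case.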

Article Details

How to Cite
Malik, A. (2025). Comparative Evaluation of Ensemble Learning Algorithms for Early Detection of Diabetes. EDRAAK, 2025, 103-110. https://doi.org/10.70470/EDRAAK/2025/013
Section
Articles