Publication: Random Forest Model Based on Machine Learning for Early Detection of Diabetes
Abstract
Diabetes mellitus is increasingly prevalent worldwide and represents a significant public health challenge. Although specific treatments are available, innovative strategies are needed to improve early detection and management of the disease. This research aims to develop a model for the early detection of diabetes using the Random Forest algorithm, following the Knowledge Discovery in Databases (KDD) methodology, which comprises the phases of selection, preprocessing, transformation, data mining, and interpretation and evaluation. The dataset includes 520 randomly selected patient records. The model achieved robust performance, with an accuracy of 85%, a sensitivity of 75%, and an F1-score of 78%, indicating an adequate balance between precision and sensitivity. Specificity was 78%, while the area under the ROC curve (AUC) reached 86%, demonstrating a high discriminative ability between positive and negative cases. The balanced accuracy was 82%, and the Matthews correlation coefficient (MCC) was 0.72, supporting the strength and reliability of the model even in the presence of class imbalance. These results demonstrate the effectiveness of the machine learning-based approach for the early detection of diabetes mellitus, with potential application in clinical decision support systems. © 2025 Elsevier B.V., All rights reserved.
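The evaluation metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch in plain Python, not the authors' code: the function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration, and they are not taken from the study's 520-record dataset. AUC is omitted because it requires predicted scores rather than counts.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration. A ratio with a zero denominator is
    reported as 0.0, which is a convention assumed here, not part of the source.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall, true positive rate
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)     # positive predictive value
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    balanced_accuracy = (sensitivity + specificity) / 2
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Illustrative counts only (assumed, not the study's results).
metrics = classification_metrics(tp=30, fp=10, tn=50, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Balanced accuracy and MCC are included because, unlike plain accuracy, they remain informative when one class is much more frequent than the other, which is the class-imbalance situation the abstract refers to.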


