Taiwo, Adedoyin O. (2025) Machine Learning Model for Prediction of Prediabetes Among Adults in Nigeria and Ghana. International Journal of Innovative Science and Research Technology, 10 (4): 25apr1043. pp. 2367-2377. ISSN 2456-2165

IJISRT25APR1043.pdf - Published Version

Abstract

Introduction: Prediabetes is a significant metabolic condition that can harm the body as a whole, with millions of cases in Africa. Early identification and treatment of prediabetes, together with maintaining a healthy lifestyle, are necessary to reduce the risk of diabetes. Machine learning is a computational method for learning automatically from data in order to make accurate predictions. The deployment of machine learning models to predict health outcomes in clinical medicine (including oncology, cardiovascular disease, and diabetes) is gaining traction around the globe; however, no such model is available for predicting prediabetes among Africans. Hence, there is a need for an Afrocentric model that identifies the risk of developing prediabetes among Africans.
Objective: The aim of this study is to build a model that predicts prediabetes among adult Nigerians and Ghanaians, to support proper diagnosis and preventive measures.
Methods: The data analysed in this research included 2,463 participants from Nigeria and Ghana. Further pre-processing, which excluded participants who were already diabetic, left 2,016 participants. The outcome variable is a recode of the laboratory fasting blood glucose variable: participants with values < 99 mg/dl are normal, those with values between 100 mg/dl and 125 mg/dl are prediabetic, and those with values > 125 mg/dl are diabetic. This study assessed supervised machine learning predictive models, including support vector machine (SVM), k-nearest neighbours (k-NN), Naïve Bayes, Random Forest, Decision Tree Classifier, and Logistic Regression, to predict diagnostic outcomes for prediabetes. The performance of the models was assessed using precision, recall, area under the curve (AUC), and F1 score.
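The outcome recoding described in the Methods can be illustrated with a minimal Python sketch. It is not the author's code: the function names `classify_fbg` and `exclude_diabetic` are hypothetical, and because the abstract leaves values between 99 and 100 mg/dl unassigned, the sketch assumes that anything below 100 mg/dl counts as normal.

```python
from typing import Iterable, List, Tuple


def classify_fbg(fbg_mg_dl: float) -> str:
    """Recode a fasting blood glucose value (mg/dl) into an outcome label.

    Assumption (not stated in the abstract): values below 100 mg/dl are
    treated as normal, which closes the gap between "< 99" and "100-125".
    """
    if fbg_mg_dl < 0:
        raise ValueError("fasting blood glucose cannot be negative")
    if fbg_mg_dl < 100:
        return "normal"
    if fbg_mg_dl <= 125:
        return "prediabetic"
    return "diabetic"


def exclude_diabetic(values: Iterable[float]) -> List[Tuple[float, str]]:
    """Label each value and drop the diabetic ones, mirroring the
    pre-processing step that removed already-diabetic participants."""
    labelled = [(v, classify_fbg(v)) for v in values]
    return [(v, label) for v, label in labelled if label != "diabetic"]


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    for value, label in exclude_diabetic([88.0, 104.0, 125.0, 140.0]):
        print(value, label)
```

Keeping the thresholds in one small function makes the boundary assumption explicit and easy to change if the study's actual cut-offs differ.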
Results: The results showed that 10% of the study participants were prediabetic. Family History (OR = 41.50), Hypertension Status (OR = 1.53), Tobacco Use (OR = 1.05), Alcohol Use (OR = 1.01), BMI (OR = 1.04), and Obesity (OR = 1.28) were associated with increased odds of prediabetes. The feature selection methods showed that Domicile, Alcohol Use, Family History, Tobacco Use, Dyslipidemia, Body Mass Index (BMI), Age, Obesity, Blood Pressure, Hypertension Status, Country, and Gender contributed most to the prediction of prediabetes. By area under the curve and accuracy, Random Forest (0.90, 0.85), SVM (0.92, 0.86), and Logistic Regression (0.92, 0.86) performed best.
Conclusion: The study concluded that the Support Vector Machine (SVM) is the most efficient model for predicting prediabetes. Hence, SVM can be integrated into medical devices and software applications to identify prediabetes among adults in Nigeria and Ghana. This study will also aid future researchers in selecting the most suitable predictive models for the implementation of community lifestyle programs aimed at reducing the prevalence of prediabetes.
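To help read the odds ratios reported in the Results, the sketch below computes an unadjusted odds ratio from a 2x2 table. This is an illustration only: the abstract does not say how its odds ratios were estimated, the function name `odds_ratio` is hypothetical, and the counts in the example are invented rather than taken from the study.

```python
def odds_ratio(
    exposed_cases: int,
    exposed_noncases: int,
    unexposed_cases: int,
    unexposed_noncases: int,
) -> float:
    """Unadjusted odds ratio from a 2x2 table.

    OR = (exposed_cases / exposed_noncases)
         / (unexposed_cases / unexposed_noncases)
    An OR above 1 means the odds of the outcome are higher in the
    exposed group; an OR of 1 means no difference.
    """
    cells = (exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive to compute an odds ratio")
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


if __name__ == "__main__":
    # Hypothetical counts, NOT from the study: 30 of 100 exposed and
    # 10 of 100 unexposed participants have the outcome.
    print(round(odds_ratio(30, 70, 10, 90), 2))
```

Read this way, an odds ratio close to 1 (such as the reported 1.01 for Alcohol Use) indicates only a slight increase in odds, while a large value (such as 41.50 for Family History) indicates a strong association.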

Item Type: Article
Subjects: R Medicine > R Medicine (General)
Divisions: Faculty of Medicine, Health and Life Sciences > School of Medicine
Depositing User: Editor IJISRT Publication
Date Deposited: 07 May 2025 09:13
Last Modified: 07 May 2025 09:13
URI: https://eprint.ijisrt.org/id/eprint/736