Comparative Study on Early Stage Diabete Detection by Using Machine Learning Methods

Date

2023

Publisher

Institute of Electrical and Electronics Engineers

Abstract

This paper presents a machine learning approach to diabetes prediction, with the aim of improving the precision of medical examinations by applying machine learning to electronic health records (EHRs). Using the Pima Indian dataset, we address missing values with two distinct strategies: data imputation and a novel filtered data approach. We then evaluate six supervised machine learning models: Logistic Regression, Random Forest, K-Nearest Neighbor, Support Vector Machine, XGBoost, and CatBoost. Each model is assessed on accuracy, precision, sensitivity, specificity, and stability. With the imputation strategy, the Random Forest classifier achieved 98% accuracy. With the filtered data approach, which is our main contribution, the XGBoost classifier achieved a promising 84% accuracy. We argue that this finding establishes the superiority of the filtered data methodology and marks a significant step towards better patient risk scoring systems and earlier prediction of disease onset.
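The two missing-value strategies and the reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how values were imputed or how records were filtered, so the sketch assumes median imputation and assumes that "filtered" means dropping incomplete records. The rows, column names, and function names below are all hypothetical.

```python
from statistics import median

# Toy records, NOT real patient data. None marks a missing measurement.
ROWS = [
    {"glucose": 148, "bmi": 33.6, "label": 1},
    {"glucose": 85, "bmi": None, "label": 0},
    {"glucose": None, "bmi": 23.3, "label": 1},
    {"glucose": 89, "bmi": 28.1, "label": 0},
    {"glucose": 137, "bmi": 43.1, "label": 1},
]
FEATURES = ("glucose", "bmi")


def impute_median(rows, features=FEATURES):
    """Imputation strategy (assumed: median): fill each gap with the column median."""
    medians = {
        f: median(r[f] for r in rows if r[f] is not None) for f in features
    }
    return [
        {**r, **{f: medians[f] if r[f] is None else r[f] for f in features}}
        for r in rows
    ]


def filter_complete(rows, features=FEATURES):
    """Filtering strategy (assumed meaning): keep only rows with no missing feature."""
    return [r for r in rows if all(r[f] is not None for f in features)]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity, and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


imputed = impute_median(ROWS)      # same number of rows, gaps filled
filtered = filter_complete(ROWS)   # fewer rows, no synthetic values
```

Imputation keeps every record but introduces estimated values, whereas filtering trains only on observed measurements at the cost of a smaller dataset. That trade-off may explain why the two strategies are reported separately.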

Keywords

Diabetic Detection, Machine Learning Methods, Logistic Regression, Random Forest, K-Nearest Neighbour, Support Vector Machine, XGBoost, CatBoost
