Comparative Analysis of Resampling Techniques and Machine Learning Classifiers in Multiclass Classification of Diabetes Mellitus

2023 International Conference on Self Sustainable Artificial Intelligence Systems (ICSSAS) (2023)

Abstract
This study examines how various resampling techniques, paired with different machine learning classifiers, affect the accuracy of multiclass diabetes classification on an imbalanced dataset. The Mendeley diabetes dataset is a multiclass dataset containing records of patients with no diabetes, pre-diabetes, and diabetes. It is imbalanced, with the diabetic class as the majority. The study compares oversampling, undersampling, and hybrid techniques across different machine learning algorithms to classify each patient as diabetic, pre-diabetic, or non-diabetic. Eight machine learning algorithms and ten resampling techniques were applied to the dataset. The results indicate that the combinations of XGBoost with K-Means SMOTE and with SMOTE-N attain the highest accuracy, 99.2%. They also suggest that oversampling techniques perform better than undersampling and hybrid techniques.
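To make the oversampling idea concrete, the following is a minimal sketch of basic SMOTE-style interpolation on a three-class problem. It is not the authors' code, and it implements plain SMOTE rather than the K-Means SMOTE or SMOTE-N variants named in the abstract. The function name `smote_oversample`, the parameter `k`, and the toy feature values are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def smote_oversample(X, y, k=3, seed=0):
    """Oversample every non-majority class up to the majority count.

    Each synthetic point lies on the segment between a randomly chosen
    sample and one of its k nearest neighbours from the same class.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        points = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            base = rng.choice(points)
            others = [p for p in points if p is not base]
            if not others:
                # A single-sample class cannot be interpolated; duplicate it.
                X_out.append(list(base))
                y_out.append(label)
                continue
            # Sort same-class samples by squared Euclidean distance to base.
            others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
            neighbour = rng.choice(others[:k])
            gap = rng.random()  # interpolation factor in [0, 1)
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: the diabetic class is the majority, as in the abstract.
X = [
    [1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.2], [1.3, 1.0], [1.0, 0.8],
    [3.0, 3.0], [3.2, 3.1],
    [5.0, 1.0], [5.1, 1.2], [4.9, 0.9],
]
y = ["diabetic"] * 6 + ["pre-diabetic"] * 2 + ["non-diabetic"] * 3

X_bal, y_bal = smote_oversample(X, y)
print(Counter(y_bal))
```

After resampling, every class has as many samples as the majority class. In a comparison like the one described, resampling would normally be applied to the training split only, so that the test set keeps the original class distribution.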
Keywords
Multiclass Classification, Resampling, SMOTE, Diabetes Mellitus, Imbalanced Data