Comparative Analysis of Resampling Techniques and Machine Learning Classifiers in Multiclass Classification of Diabetes Mellitus
2023 International Conference on Self Sustainable Artificial Intelligence Systems (ICSSAS), 2023
Abstract
This study examines how different resampling techniques, paired with different machine learning classifiers, affect the accuracy of multiclass diabetes classification on an imbalanced dataset. The Mendeley diabetes dataset is a multiclass dataset containing records of patients with no diabetes, pre-diabetes, and diabetes. It is imbalanced, with the diabetic class in the majority. The study compares oversampling, undersampling, and hybrid techniques across several machine learning algorithms to classify each person as diabetic, pre-diabetic, or non-diabetic. Eight machine learning algorithms and ten resampling techniques were applied to the dataset. The results indicate that the combinations of XGBoost with K-Means SMOTE and with SMOTE-N attain the highest accuracy, 99.2%. They also suggest that oversampling techniques perform better than undersampling and hybrid techniques.
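To make the oversampling idea concrete, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the authors' pipeline: the function name `smote_oversample`, the parameter `k`, and the toy data (two arbitrary numeric features, three classes with class 2 as the majority) are all assumptions made for illustration, and the K-Means and nominal-feature variants named in the abstract are not implemented here.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=3, seed=0):
    """Hypothetical helper: balance classes by interpolating between
    a sample and one of its k nearest same-class neighbours."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # bring every class up to the majority size
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # interpolation needs at least two points in the class
        for _ in range(target - count):
            base = rng.choice(members)
            # k nearest same-class neighbours of the base point (excluding itself)
            others = [m for m in members if m is not base]
            neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
            mate = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from base to mate
            X_out.append([b + gap * (m - b) for b, m in zip(base, mate)])
            y_out.append(label)
    return X_out, y_out


# Toy, made-up data: class 2 plays the role of the majority (diabetic) class.
X = [[5.0, 24.0], [5.2, 25.0], [5.1, 23.5],
     [6.0, 27.0], [6.1, 28.0],
     [8.0, 31.0], [8.5, 33.0], [9.0, 30.0], [7.9, 32.0], [8.8, 34.0], [9.4, 35.0]]
y = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2]

X_res, y_res = smote_oversample(X, y, k=2)
print(Counter(y_res))
```

After resampling, every class has as many samples as the majority class, and each synthetic point lies on a line segment between two original points of the same class. In a real experiment the resampling would be applied to the training split only, so that synthetic samples never leak into the evaluation data.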
Keywords
Multiclass Classification, Resampling, SMOTE, Diabetes Mellitus, Imbalanced Data