Data Pre-Processing Using SMOTE Technique for Gender Classification with Imbalance Hu’s Moments Features

Ahmad Haadzal Kamarulzalis, Muhamad Hasbullah Mohd Razali, Balkiah Moktar

Proceedings of the Second International Conference on the Future of ASEAN (ICoFA) 2017 – Volume 2 (2018)

Abstract
Imbalanced data is common in real-world applications such as text categorization, face recognition for gender classification, medical diagnosis, fraud detection, and oil-spill detection in satellite images. Most machine learning algorithms focus on classifying the majority class while ignoring or misclassifying minority samples. Minority samples are those that occur rarely but are very important. It is commonly agreed that standard classifiers such as neural networks, support vector machines, and C4.5 are heavily biased toward recognizing the majority class, since they are built to maximize overall accuracy, to which the minority class contributes very little. In this study, we demonstrate how the synthetic minority over-sampling technique (SMOTE) can significantly reduce the imbalance problem in gender classification from the data-level perspective. Hu’s moments of the face images were generated as numerical descriptors with different imbalance ratios and classified using a supervised decision tree (J48) algorithm. The results show that, before the data were preprocessed with SMOTE, the minority group was severely misclassified as the majority group. Our claims are confirmed by applying SMOTE to reduce the imbalance effects before inducing the decision tree.
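The core idea of SMOTE, creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbours, can be illustrated with a minimal sketch. This is a simplified variant written for illustration only, not the implementation used in the study: the function name `smote`, its parameters, and the toy 2-D feature vectors are all assumptions, and the base sample is drawn at random rather than by iterating over every minority sample as in the original formulation.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Simplified SMOTE sketch (hypothetical helper, not the paper's code).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base`, excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy, made-up 2-D feature vectors (assumed for illustration; not the paper's data).
majority = [[float(x), float(y)] for x in range(5) for y in range(4)]  # 20 samples
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.0], [11.5, 11.5]]    # 4 samples

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In this sketch the oversampling is applied before any classifier is trained, which mirrors the data-level ordering described above (rebalance first, then induce the decision tree). In practice the synthetic samples would be generated from the training split only, so that no interpolated points leak into the evaluation data.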
Keywords
Imbalanced data, SMOTE, Hu’s moments, J48 decision tree