A two-stage case-based reasoning driven classification paradigm for financial distress prediction with missing and imbalanced data

Lean Yu, Mengxin Li, Xiaojun Liu

Expert Systems with Applications (2024)

Abstract
Financial distress prediction is often hampered by missing feature values and by an imbalance between normal and abnormal samples, both of which significantly degrade prediction models. To address both problems, a two-stage case-based reasoning (CBR)-driven classification paradigm is proposed to predict financial distress accurately and robustly. The paradigm has two main stages: CBR-driven missing data imputation and learning vector quantization (LVQ)-CBR-driven classification. In the first stage, a hybrid CBR-driven weighted imputation method fills in missing values in the analytical dataset with reliable and stable imputation performance, thereby solving the missing data problem. In the second stage, an LVQ-CBR-driven classification model is constructed to predict financial distress. By highlighting and fully learning the minority abnormal samples, the model addresses the low prediction accuracy on those samples that arises from data imbalance. For illustration and verification, experiments are performed on seven datasets of Chinese listed enterprises with different missing and imbalance rates. The results show that, compared with other imputation methods, imbalanced data processing methods, and their combinations, the proposed paradigm achieves the best imputation performance, greatly improves the prediction accuracy on minority abnormal samples, and delivers the best overall prediction performance. This implies that the proposed two-stage CBR-driven classification paradigm is a competitive solution to financial distress prediction with missing and imbalanced data.
Keywords
Case-based reasoning, Missing data imputation, Imbalanced data classification, Financial distress prediction, Learning vector quantization clustering
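
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the hybrid weighting scheme or how LVQ and CBR are combined, so the sketch substitutes two generic techniques, inverse-distance-weighted imputation from the most similar complete cases and a plain LVQ1 classifier. All function names, the toy data, and the choice of one prototype per class (initialized at the class mean so the minority class is represented equally) are assumptions made for illustration.

```python
import math

def impute_from_similar_cases(rows, k=2):
    """Stage 1 (assumed variant): fill None entries from the k most similar
    complete cases, weighted by inverse distance on the observed features."""
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete case")
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]
        # Retrieve: rank complete cases by Euclidean distance on observed features.
        scored = sorted(
            ((math.sqrt(sum((row[j] - c[j]) ** 2 for j in observed)), c)
             for c in complete),
            key=lambda pair: pair[0],
        )[:k]
        weights = [1.0 / (d + 1e-9) for d, _ in scored]
        total = sum(weights)
        # Reuse: each missing value is the weighted mean over the retrieved cases.
        filled.append([
            v if v is not None
            else sum(w * c[j] for w, (_, c) in zip(weights, scored)) / total
            for j, v in enumerate(row)
        ])
    return filled

def nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda i: sum((p - xi) ** 2 for p, xi in zip(prototypes[i], x)))

def class_mean_prototypes(X, y):
    """One prototype per class (an assumption), placed at the class mean."""
    labels = sorted(set(y))
    protos = []
    for label in labels:
        members = [x for x, t in zip(X, y) if t == label]
        protos.append([sum(col) / len(members) for col in zip(*members)])
    return protos, labels

def train_lvq1(X, y, prototypes, proto_labels, lr=0.1, epochs=20):
    """Stage 2 (plain LVQ1): pull the winning prototype toward a sample of the
    same class, push it away from a sample of a different class."""
    protos = [list(p) for p in prototypes]
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        for x, label in zip(X, y):
            i = nearest(protos, x)
            sign = 1.0 if proto_labels[i] == label else -1.0
            protos[i] = [p + sign * rate * (xi - p) for p, xi in zip(protos[i], x)]
    return protos

def predict(prototypes, proto_labels, x):
    return proto_labels[nearest(prototypes, x)]

# Toy data (hypothetical): label 0 = normal, label 1 = abnormal (minority).
rows = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, None], [5.0, 5.0], [5.2, None]]
labels = [0, 0, 0, 0, 1, 1]

X = impute_from_similar_cases(rows, k=2)
init_protos, proto_labels = class_mean_prototypes(X, labels)
protos = train_lvq1(X, labels, init_protos, proto_labels)
```

With the prototypes trained, `predict(protos, proto_labels, x)` assigns a new sample the label of its nearest prototype. Giving each class the same number of prototypes is one simple way to keep the minority class from being swamped; the paper's own mechanism for emphasizing minority samples may differ.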