Hybrid Defect Prediction Model Based on Counterfactual Feature Optimization

Hum. Centric Intell. Syst.(2023)

引用 0|浏览12
暂无评分
摘要
Software defect prediction is critical to ensuring software quality. Researchers have worked on building various defect prediction models to improve the performance of defect prediction. Existing defect prediction models are mainly divided into two categories: models constructed based on artificial statistical features and models constructed based on semantic features. DP-CNN [Li J, He P, Zhu J, et al. Software defect prediction via convolutional neural network. In: 2017 IEEE international conference on software quality, reliability and security (QRS). IEEE, 2017; 318–328.] is one of the best defect prediction models, because it combines both artificial statistical features and semantic features, so its performance is greatly improved compared to traditional defect prediction models. This paper is based on the DP-CNN model and makes the following two improvements: first, using a new Struc2vec network representation technique to mine existing information between software modules, which specializes in learning node representations from structural identity and can further extract structural features associated with defects. Let the DP-CNN model once again incorporate the newly mined structural features. Then, this paper proposes a feature selection method based on counterfactual explanations, which can determine the importance score of each feature by the feature change rate of counterfactual samples. The origin of these feature importance scores is interpretable. Under the guidance of these interpretable feature importance scores, better feature subsets can be obtained and used to optimize artificial statistical features within the DP-CNN model. Based on the above methods, this paper proposes a new hybrid defect prediction model DPS-CNN-STR. Evaluating our model on six open source projects in terms of F1 score in defect prediction. Experimental results show that DPS-CNN-STR improves the state-of-the-art method by an average of 3.3%.
更多
查看译文
关键词
hybrid defect prediction model,feature,prediction model
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要