A Two-Step Approach to Boost Neural Network Generalizability in Predicting Defective Software

Alexandre Nascimento, Vinícius Veloso de Melo, Márcio P. Basgalupp, Luis Alberto Viera Dias

Advances in Intelligent Systems and Computing (2023)

Abstract
With society's digitalization, the ever-growing dependence on software has increased the negative impact of poor software quality. In 2022, that impact was estimated to cost the US economy $2.41 trillion. In the search for better tools to support quality assurance efforts such as software testing, many studies have demonstrated the use of Machine Learning (ML) classifiers to predict defective software modules. Such classifiers could be used to focus test efforts on the potentially defective modules, enhancing the results achieved with limited resources. However, the practical applicability of many of those studies is arguable because of (1) the misuse of their training datasets; (2) the improper metrics used to measure the classifiers' performance; (3) the use of data from only a single system or project; and (4) the use of data from only a single programming language. When those factors are not considered, the experiments' results are biased towards very high accuracy, leading to improper conclusions about the generalizability of the classifiers to practical use. This study sheds light on those issues and reports promising results by proposing a novel 2-step approach for artificial neural networks (ANN) and testing its cross-project and cross-language generalizability on a large dataset of 17,147 software modules from 12 projects written in distinct programming languages (C, C++, and Java). The results demonstrated that the proposed approach could deal with an imbalanced dataset and outperform a similar ANN trained with the conventional approach. Moreover, the proposed approach increased the number of defective modules found with the same software test effort by 277%.
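To illustrate why accuracy can be an improper metric on an imbalanced defect dataset, the following minimal sketch compares accuracy and recall for a trivial classifier that labels every module as non-defective. The dataset (1,000 modules, 50 of them defective) and all function names are hypothetical and chosen only for illustration; they are not taken from the study, and this is not the authors' 2-step approach.

```python
# Illustrative sketch only: the data below is made up, not from the study.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = defective."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def accuracy(y_true, y_pred):
    """Fraction of modules whose label was predicted correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)

def recall(y_true, y_pred):
    """Fraction of defective modules that were actually flagged."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical imbalanced dataset: 50 defective modules out of 1,000.
y_true = [1] * 50 + [0] * 950

# A "classifier" that always predicts the majority class (non-defective).
y_majority = [0] * 1000

acc = accuracy(y_true, y_majority)
rec = recall(y_true, y_majority)
print(f"accuracy={acc:.2f} recall={rec:.2f}")
```

In this made-up case the majority-class predictor scores high accuracy while finding no defective modules at all, which is the kind of bias the abstract warns about when accuracy is reported on imbalanced data.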
Keywords
neural network generalizability, neural network, software, two-step