A Method and Experiment to evaluate Deep Neural Networks as Test Oracles for Scientific Software

Proceedings of the 3rd ACM/IEEE International Conference on Automation of Software Test (2022)

Abstract
Testing scientific software is challenging because such systems usually exhibit non-deterministic behaviour and, in addition, generate non-trivial outputs such as images. Artificial intelligence (AI) is now helping to advance software testing practice. In this article, we evaluate seven deep neural networks (DNNs), specifically deep convolutional neural networks (CNNs) with up to 161 layers, as test oracle procedures for testing scientific models. First, we propose a method, TOrC, which starts by generating training, validation, and test image datasets via combinatorial interaction testing applied to the original codes and second-order mutants. TOrC also includes classical steps such as transfer learning, a technique recommended for DNNs. We then assessed the performance of the oracles (CNNs). The main conclusions of this research are: i) a greater number of layers does not necessarily mean that a CNN will perform better; ii) transfer learning is a valuable technique, but extended solutions may eventually be needed to obtain better performance; iii) data-centric AI is an interesting path to follow; and iv) there is no clear correlation between the software bugs in the scientific models and the errors (image misclassifications) made by the CNNs.

CCS CONCEPTS • Software and its engineering → Software testing and debugging; • Computing methodologies → Neural networks; Supervised learning by classification; Computer vision.
Keywords
Test Oracles, Deep Convolutional Neural Networks, Transfer Learning, Explainable Artificial Intelligence, Data-Centric Artificial Intelligence
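
The abstract states that TOrC generates its image datasets via combinatorial interaction testing. As a rough illustration of that general idea only, and not the authors' implementation, the sketch below greedily builds a pairwise (2-way) covering suite over input factors. The function name `pairwise_suite` and the factor names `grid_size`, `time_step`, and `solver` are hypothetical, and the abstract does not say which interaction strength or generation algorithm TOrC uses.

```python
from itertools import combinations, product


def pairwise_suite(factors):
    """Greedy 2-way covering suite (illustrative sketch, not TOrC itself).

    factors: dict mapping a factor name to its list of levels.
    Returns a list of test configurations (dicts) such that every pair of
    levels from two different factors appears together in at least one test.
    """
    names = list(factors)
    # All level pairs, across every pair of factors, that still need covering.
    uncovered = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in factors[a]
        for vb in factors[b]
    }
    # Exhaustive candidate pool; acceptable for small illustrative inputs.
    candidates = [dict(zip(names, levels)) for levels in product(*factors.values())]

    def covers(test, pair):
        (a, va), (b, vb) = pair
        return test[a] == va and test[b] == vb

    suite = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda t: sum(covers(t, p) for p in uncovered))
        suite.append(best)
        uncovered = {p for p in uncovered if not covers(best, p)}
    return suite


# Hypothetical input factors for a scientific model (assumed for illustration).
factors = {
    "grid_size": [32, 64, 128],
    "time_step": [0.1, 0.01, 0.001],
    "solver": ["euler", "rk4", "implicit"],
}
suite = pairwise_suite(factors)
```

Each configuration in `suite` could then drive one run of the original code or of a mutant, with the resulting image labelled accordingly. The greedy choice does not guarantee a minimal suite, but it will typically need far fewer runs than the 27 exhaustive combinations.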