IRJIT -- An Information Retrieval Technique for Just-in-time Defect Identification

arXiv (Cornell University), 2022

Abstract
Defect identification at commit check-in time prevents the introduction of defects into software. Current defect identification approaches either rely on manually crafted features such as change metrics or require training expensive machine learning or deep learning models. Because they rely on a complex underlying model, these approaches are often not explainable: developers cannot understand the models' predictions. An approach that is not explainable might not be adopted in real-life development environments because developers lack trust in its results. Furthermore, because their training process is extensive, these approaches cannot readily learn from new examples as they arrive, which makes them unsuitable for fast online prediction. To address these limitations, we propose an approach called IRJIT that applies information retrieval to source code and labels new commits as buggy or clean based on their similarity to past buggy or clean commits. Our approach is online and explainable: it can learn from new data without retraining, and developers can see the documents that support a prediction. Through an evaluation on 8 open-source projects, we show that IRJIT achieves AUC and F1 scores close to those of the state-of-the-art machine learning approach JITLine, without considerable retraining.
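The core idea in the abstract, labeling a new commit by its similarity to past labeled commits, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the retrieval engine, the ranking function, or the voting rule, so the sketch assumes TF-IDF vectors with cosine similarity and a score-weighted vote over the top k hits. The names `CommitIndex`, `tokenize`, `cosine`, and the sample commit IDs are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split changed code into lowercase identifier-like tokens."""
    return [t for t in re.split(r"[^A-Za-z0-9_]+", text.lower()) if t]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class CommitIndex:
    """Hypothetical in-memory index of past commits (assumption, not IRJIT's)."""

    def __init__(self):
        self.docs = []       # (commit_id, term frequencies, label)
        self.df = Counter()  # document frequency of each term

    def add(self, commit_id, changed_code, label):
        """Online update: indexing a commit needs no model retraining."""
        tf = Counter(tokenize(changed_code))
        self.docs.append((commit_id, tf, label))
        self.df.update(tf.keys())

    def _vector(self, tf):
        # Smoothed IDF so unseen query terms still get a finite weight.
        n = len(self.docs)
        return {t: c * (math.log((1 + n) / (1 + self.df[t])) + 1.0)
                for t, c in tf.items()}

    def query(self, changed_code, k=3):
        """Return the k most similar past commits as (score, id, label)."""
        q = self._vector(Counter(tokenize(changed_code)))
        scored = [(cosine(q, self._vector(tf)), cid, label)
                  for cid, tf, label in self.docs]
        scored.sort(key=lambda hit: hit[0], reverse=True)
        return scored[:k]

    def predict(self, changed_code, k=3):
        """Score-weighted vote over the top-k hits; ties fall back to 'clean'."""
        hits = self.query(changed_code, k)
        buggy = sum(s for s, _, label in hits if label == "buggy")
        clean = sum(s for s, _, label in hits if label == "clean")
        return ("buggy" if buggy > clean else "clean"), hits


index = CommitIndex()
index.add("c1", "if (ptr == null) return ptr.length", "buggy")
index.add("c2", "for i in range(n): total += i", "clean")
index.add("c3", "ptr.length without null check", "buggy")

label, evidence = index.predict("return ptr.length when ptr null", k=1)
print(label)
for score, commit_id, past_label in evidence:
    print(commit_id, past_label, round(score, 3))
```

The returned `evidence` list is what makes such a prediction inspectable: a developer can open the retrieved commits and judge whether the match is meaningful. Calling `add` with a newly labeled commit updates the index immediately, which mirrors the online property the abstract describes.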
Keywords
information retrieval technique, IRJIT, identification, just-in-time