MTLink: Adaptive multi-task learning based pre-trained language model for traceability link recovery between issues and commits

Yang Deng, Bangchao Wang, Qiang Zhu, Junping Liu, Jiewen Kuang, Xingfu Li

Journal of King Saud University - Computer and Information Sciences (2024)

Abstract
Recovering traceability links between issues and commits, known as issue-commit link recovery (ILR), plays a significant role in software maintenance by enhancing developers' observability in practice. Recent advances in large language models, particularly pre-trained models, have improved the effectiveness of automated ILR. However, the large parameter counts and long training times of these models pose challenges in large software projects. In addition, existing methods often overlook the associations and distinctions among artifacts, which leads to erroneous links. To mitigate these problems, this paper proposes a novel link recovery method called MTLink. It uses multi-teacher knowledge distillation (MTKD) to compress the model and employs an adaptive multi-task strategy to reduce information loss and improve link accuracy. Experiments are conducted on four open-source projects. The results show that (i) MTLink outperforms state-of-the-art methods; (ii) multi-teacher knowledge distillation maintains accuracy despite the reduction in model size; and (iii) the adaptive multi-task tracing method effectively handles confusion caused by similar artifacts and balances each task. In conclusion, MTLink offers an efficient solution for ILR in software traceability. The code is available at https://zenodo.org/records/10321150.
Keywords
Issue-commit link recovery, Multi-teacher knowledge distillation, Adaptive multi-task
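
The abstract names multi-teacher knowledge distillation without giving its formulation. As a rough illustration of the general technique only, and not MTLink's actual loss, the sketch below blends a softened KL term against the averaged teacher distributions with a cross-entropy term on the true link label. The function names, the uniform teacher weighting, the temperature, and the `alpha` mixing factor are all assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def multi_teacher_kd_loss(student_logits, teacher_logits_list, label,
                          temperature=2.0, alpha=0.5, teacher_weights=None):
    """Generic multi-teacher distillation loss (hypothetical helper).

    Assumptions, not taken from the paper: teachers are weighted
    uniformly unless weights are given, and the soft and hard terms
    are mixed by a fixed alpha.
    """
    n_teachers = len(teacher_logits_list)
    if teacher_weights is None:
        teacher_weights = [1.0 / n_teachers] * n_teachers

    # Weighted average of the teachers' softened class distributions.
    teacher_probs = [softmax(t, temperature) for t in teacher_logits_list]
    target = [
        sum(w * p[k] for w, p in zip(teacher_weights, teacher_probs))
        for k in range(len(student_logits))
    ]

    # KL(target || student) on softened distributions.
    student_soft = softmax(student_logits, temperature)
    kd = sum(q * math.log(q / p) for q, p in zip(target, student_soft) if q > 0)

    # Standard cross-entropy against the ground-truth label.
    ce = -math.log(softmax(student_logits)[label])

    # The temperature**2 factor is the usual rescaling of the soft term.
    return alpha * temperature ** 2 * kd + (1 - alpha) * ce


# Two classes: index 0 = "not linked", index 1 = "linked".
loss = multi_teacher_kd_loss(
    student_logits=[0.2, 1.1],
    teacher_logits_list=[[-0.5, 2.0], [0.1, 1.4]],
    label=1,
)
print(f"distillation loss: {loss:.4f}")
```

In a real training setup the logits would come from a compact student model and several larger pre-trained teachers scoring the same issue-commit pair; how MTLink weights its teachers and balances its tasks is described in the paper itself, not here.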