A Comparative Study of Data Collection Periods for Just-In-Time Defect Prediction Using the Automatic Machine Learning Method

IEICE Trans. Inf. Syst. (2023)

Abstract
This paper focuses on the "data collection period" used to train a better Just-In-Time (JIT) defect prediction model, that is, whether to use the early commit data or the recent commit data, and conducts a large-scale comparative study to identify an appropriate data collection period. Because many machine learning algorithms can be used to train defect prediction models, the choice of algorithm can become a threat to validity. Hence, this study adopts an automatic machine learning method to mitigate selection bias in the comparison. The empirical results on 122 open-source software projects support the trend that a dataset composed of recent commits makes a better training set for JIT defect prediction models.
Keywords
Just-in-time defect prediction, automatic machine learning, training data collection period, process metrics