Characterization and prediction of issue-related risks in software projects

MSR(2015)

引用 37|浏览62
暂无评分
摘要
Identifying risks relevant to a software project and planning measures to deal with them are critical to the success of the project. Current practices in risk assessment mostly rely on high-level, generic guidance or the subjective judgements of experts. In this paper, we propose a novel approach to risk assessment using historical data associated with a software project. Specifically, our approach identifies patterns of past events that caused project delays, and uses this knowledge to identify risks in the current state of the project. A set of risk factors characterizing "risky" software tasks (in the form of issues) were extracted from five open source projects: Apache, Duraspace, JBoss, Moodle, and Spring. In addition, we performed feature selection using a sparse logistic regression model to select risk factors with good discriminative power. Based on these risk factors, we built predictive models to predict if an issue will cause a project delay. Our predictive models are able to predict both the risk impact (i.e. the extend of the delay) and the likelihood of a risk occurring. The evaluation results demonstrate the effectiveness of our predictive models, achieving on average 48%--81% precision, 23%--90% recall, 29%--71% F-measure, and 70%--92% Area Under the ROC Curve. Our predictive models also have low error rates: 0.39--0.75 for Macro-averaged Mean Cost-Error and 0.7--1.2 for Macro-averaged Mean Absolute Error.
更多
查看译文
关键词
issue-related risks,software projects,risk assessment,project delays,risks identification,risky software tasks,open source projects,Apache,Duraspace,JBoss,Moodle,Spring,feature selection,sparse logistic regression model,risk factor selection,project delay,macroaveraged mean cost-error,macroaveraged mean absolute error
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要