Comments on “Using $k$ k -Core Decomposition on Class Dependency Networks to Improve Bug Prediction Model's Practical Performance”
IEEE Transactions on Software Engineering(2022)
摘要
In a very recent paper by (Qu
et al.
, 2021), the authors propose an effective equation,
top-core
, to improve the performance of effort-aware bug prediction models. A distinctive feature of
top-core
is that it takes into account the
coreness
of a class in a Class Dependency Network (CDN) when calculating the
relative risk
of a class to be buggy. In this comment, we show that Qu
et al.
's paper contains three shortcomings that may influence the performance of
top-core
or even have the potential to lead to erroneous results. First, we show that the CDN that they use to calculate the
coreness
of classes is not very accurate, neglecting many important types of dependency relations between classes such as
method call
relation,
access
relation, and
instantiates
relation. Second, they trained a Logistic Regression model using the scikit-learn framework to predict the probability of a specific class to be buggy. It is actually an L2 regularized Logistic Regression model, which is dependent on the scale of the features. But they neglected to normalize the features, making the obtained results erroneous. Finally, the number of execution times (viz. 10 times in the paper of Qu
et al.
) they used to reduce the bias caused by the randomness (viz. random split of instances and the process to handle class-imbalance problem) in the experiments is too small to ensure that the obtained results converge to stable values; but they failed to signify the precision level of their results for comparison. In this comment, we provide solutions to the problems by using i) an improved CDN (ICDN) to represent the structure of software systems, ii) the z-score method to normalize the features, and iii) an adaptive mechanism to determine the number of execution times. In the experiments, we find that Qu
et al.
's approach based on the Logistic Regression model does not perform significantly better than the state-of-the-art approach Ree, which is inconsistent with the conclusion in Qu
et al.
's work. We also observe that replacing CDN with ICDN does improve the performance of Qu
et al.
's approach.
更多查看译文
关键词
Effort-aware bug prediction,coreness, $k$ k -core,class dependency network,complex network
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要