FeatRacer: Locating Features Through Assisted Traceability

IEEE Transactions on Software Engineering (2023)

Abstract
Locating features is one of the most common software development activities. It is typically done during maintenance and evolution, when developers need to identify the exact places in a codebase where specific features are implemented. Unfortunately, locating features is laborious and error-prone, since feature knowledge fades, projects are developed by different developers, and features are often scattered across the codebase. Recognizing the need, many automated feature location techniques have been proposed, which try to retroactively recover features, i.e., very domain-specific information, from the codebase. Unfortunately, such techniques require large training datasets, only recover coarse-grained locations, and produce too many false positives to be useful in practice. An alternative is recording features during development, when they are still fresh in a developer's mind. However, recording is easily forgotten and also costly, especially when the software evolves and such recordings need to be updated. We address the infamous feature location problem (a.k.a. the concern location or concept assignment problem) differently. We present FeatRacer, which combines feature recording and automated feature location in a way that allows developers to proactively and continuously record features and their locations during development, while addressing the shortcomings of both strategies. Specifically, FeatRacer relies on embedded code annotations and a machine-learning-based recommender system. When a developer forgets to annotate, FeatRacer reminds the developer about potentially missing features, which it learned from the feature recording practices in the project at hand. FeatRacer also facilitates fine-grained locations as decided by the developer.
Our evaluation shows that FeatRacer outperforms traditional automated feature location based on Latent Semantic Indexing (LSI) and Linear Discriminant Analysis (LDA), two of the most common methods to realize such techniques, when predicting features for 4,650 commit changesets from the histories of 16 open-source projects spanning an average of three years between 1985 and 2015. Compared to the traditional techniques, FeatRacer showed a 3x higher precision and a 4.5x higher recall, with an average precision and recall of 89.6% among all 16 projects. It can accurately predict feature locations within the first five commits of our evaluation projects, so it is already effective for small datasets. FeatRacer takes on average 1.9 ms to learn from past code fragments of a project, and 0.002 ms to predict forgotten feature annotations in new code.
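For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of set-based precision and recall for the feature labels predicted for a single code fragment. It is an illustration only, not FeatRacer's evaluation code: the function name `precision_recall` and the feature names are hypothetical, and the abstract does not state how the scores are averaged across changesets or projects.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], actual: Set[str]) -> Tuple[float, float]:
    """Set-based precision and recall for one fragment's feature labels.

    Assumption: an empty prediction (or empty ground truth) scores 0.0;
    the abstract does not specify how such cases are handled.
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical feature names, chosen only for illustration.
predicted = {"Logging", "Networking", "Cache"}
actual = {"Logging", "Networking", "Encryption", "UI"}

p, r = precision_recall(predicted, actual)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.67 recall=0.50"
```

Here two of the three predicted features are correct (precision 2/3), and two of the four true features are found (recall 2/4).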
Keywords
Codes, Annotations, Recording, Software, Large scale integration, Manuals, Maintenance engineering, Feature location, traceability, recommender