Duplicate Bug Report detection using Named Entity Recognition

KNOWLEDGE-BASED SYSTEMS (2024)

Abstract
Software bugs pose significant management challenges. A Bug Tracking System (BTS) serves as the standard platform for recording, overseeing, and managing bugs throughout software development and maintenance. Because a BTS collects bug reports from many people, the same bug is often reported more than once. The resulting duplicate bug reports (DBRs) burden manual inspection, risk the same bug being assigned repeatedly, and reduce the efficiency of bug resolution. Many existing DBR detection techniques overlook the structured information that is abundant in the descriptive text about bug behavior in reports. To address this gap, this study introduces a method named CorNER, which improves DBR detection performance by converting unstructured text into structured data via named entity recognition (NER). Specifically, CorNER employs Random Forest with context (RNER) to annotate entities in the title and description sections of bug reports, then uses Text Convolutional Neural Networks (TextCNN) for feature extraction. Experiments on five datasets show that CorNER improves the F1-score by 6.24% and 4.96% on average over two prevalent DBR detection methods.
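
The abstract does not say which features the context-aware random forest uses, so the following is only a minimal sketch of one common way to give a token-level classifier some context: each token is described together with its neighbours inside a fixed window. The function names, the `<pad>` token, the window size, and the whitespace tokenizer are all assumptions made for illustration, not details taken from CorNER.

```python
from typing import Dict, List

PAD = "<pad>"  # hypothetical padding token for positions outside the text


def context_features(tokens: List[str], index: int, window: int = 2) -> Dict[str, str]:
    """Describe the token at `index` together with its neighbours within `window`."""
    features = {"token": tokens[index].lower()}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # the centre token is already stored under "token"
        pos = index + offset
        neighbour = tokens[pos].lower() if 0 <= pos < len(tokens) else PAD
        features[f"token[{offset:+d}]"] = neighbour
    return features


def featurize(text: str, window: int = 2) -> List[Dict[str, str]]:
    """Split on whitespace (a simplification) and build one feature dict per token."""
    tokens = text.split()
    return [context_features(tokens, i, window) for i in range(len(tokens))]


# Illustrative bug report title (invented for this example).
title = "App crashes when opening settings"
rows = featurize(title, window=1)
for row in rows:
    print(row)
```

Feature dictionaries of this shape could then be vectorized and passed to a random forest classifier (for instance scikit-learn's `DictVectorizer` and `RandomForestClassifier`) to predict an entity label per token. That pairing is likewise an assumption about how such a tagger might be assembled.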
Keywords
Duplicate Bug Report detection, Named Entity Recognition, Context-aware random forest, Convolutional neural networks