Multi-Label Requirements Classification with Large Taxonomies
arXiv (2024)
Abstract
Classification aids software development activities by organizing
requirements into classes for easier access and retrieval. The majority of
requirements classification research has, so far, focused on binary or
multi-class classification. Multi-label classification with large taxonomies
could aid requirements traceability but is prohibitively costly with supervised
training. Hence, we investigate zero-shot learning to evaluate the feasibility
of multi-label requirements classification with large taxonomies. Together
with domain experts from industry, we associated 129 requirements
with 769 labels from taxonomies ranging between 250 and 1183 classes. Then, we
conducted a controlled experiment to study the impact of the type of
classifier, the hierarchy, and the structural characteristics of taxonomies on
the classification performance. The results show that: (1) The sentence-based
classifier had a significantly higher recall compared to the word-based
classifier; however, the precision and F1-score did not improve significantly.
(2) The hierarchical classification strategy did not always improve the
performance of requirements classification. (3) The total number of nodes and
the number of leaf nodes of the taxonomies have a strong negative correlation
with the recall of the hierarchical sentence-based classifier. We investigate the problem of
multi-label requirements classification with large taxonomies, illustrate a
systematic process to create a ground truth involving industry participants,
and provide an analysis of different classification pipelines using zero-shot
learning.
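
The abstract reports precision, recall, and F1-score for multi-label predictions. As a minimal sketch of how these metrics can be computed for a single requirement, the following compares a predicted label set against a ground-truth label set. The function name and the example labels are hypothetical and chosen only for illustration; the paper's own evaluation procedure (for example, how scores are aggregated across requirements or taxonomies) is not described in the abstract and may differ.

```python
def precision_recall_f1(predicted, expected):
    """Set-based precision, recall, and F1 for one requirement's labels.

    Illustrative helper (not from the paper): both arguments are
    iterables of label names.
    """
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels for a single requirement.
expected = {"door", "window", "wall"}
predicted = {"door", "window", "roof", "floor"}

p, r, f = precision_recall_f1(predicted, expected)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

A classifier that returns more candidate labels tends to raise recall while lowering precision, which is one way a significant recall gain can coexist with no significant change in precision or F1-score.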