Looking for related posts on GitHub discussions

PEERJ COMPUTER SCIENCE(2023)

引用 0|浏览12
暂无评分
摘要
Software teams increasingly adopt different tools and communication channels to aid the software collaborative development model and coordinate tasks. Among such resources, software development forums have become widely used by developers. Such environments enable developers to get and share technical information quickly. In line with this trend, GitHub announced GitHub Discussions-a native forum to facilitate collaborative discussions between users and members of communities hosted on the platform. Since GitHub Discussions is a software development forum, it faces challenges similar to those faced by systems used for asynchronous communication, including the problems caused by related posts (duplicated and near-duplicated posts). These related posts can add noise to the platform and compromise project knowledge sharing. Hence, this article addresses the problem of detecting related posts on GitHub Discussions. To achieve this, we propose an approach based on a Sentence-BERT pre-trained general-purpose model: the RD-Detector. We evaluated RD-Detector using data from three communities hosted in GitHub. Our dataset comprises 16,048 discussion posts. Three maintainers and three Software Engineering (SE) researchers manually evaluated the RD-Detector results, achieving 77-100% of precision and 66% of recall. In addition, maintainers pointed out practical applications of the approach, such as providing knowledge to support merging the discussion posts and converting the posts to comments on other related posts. Maintainers can benefit from RD-Detector to address the laborintensive task of manually detecting related posts.
更多
查看译文
关键词
Sentence-BERT,Software teams interaction,Communication tool,Knowledge sharing,GitHub Discussions,Related posts
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要