SentiCR: a customized sentiment analysis tool for code review interactions.

ASE(2017)

引用 156|浏览142
暂无评分
摘要
Sentiment Analysis tools, developed for analyzing social media text or product reviews, work poorly on a Software Engineering (SE) dataset. Since prior studies have found developers expressing sentiments during various SE activities, there is a need for a customized sentiment analysis tool for the SE domain. On this goal, we manually labeled 2000 review comments to build a training dataset and used our dataset to evaluate seven popular sentiment analysis tools. The poor performances of the existing sentiment analysis tools motivated us to build SentiCR, a sentiment analysis tool especially designed for code review comments. We evaluated SentiCR using one hundred 10-fold cross-validations of eight supervised learning algorithms. We found a model, trained using the Gradient Boosting Tree (GBT) algorithm, providing the highest mean accuracy (83%), the highest mean precision (67.8%), and the highest mean recall (58.4%) in identifying negative review comments.
更多
查看译文
关键词
software engineering dataset,gradient boosting tree algorithm,GBT algorithm,SentiCR,negative review comments,code review comments,SE domain,SE activities,social media text,code review interactions,customized sentiment analysis tool
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要