Revisiting Sentiment Analysis for Software Engineering in the Era of Large Language Models

CoRR(2023)

Abstract
Software development is an inherently collaborative process in which various stakeholders frequently express their opinions and emotions across diverse platforms. Recognizing the sentiments conveyed in these interactions is crucial for the effective development and ongoing maintenance of software systems. For instance, app developers can use sentiment analysis of user reviews to improve the quality of their apps. Over the years, many tools have been proposed to aid in sentiment analysis, but accurately identifying the sentiments expressed in software engineering datasets remains challenging. Recent advances have showcased the potential of fine-tuned pre-trained language models in handling software engineering datasets, although these models are constrained by a shortage of labeled data. With the emergence of large language models (LLMs), it is pertinent to investigate how these models perform in the context of sentiment analysis for software engineering. In this work, we undertake a comprehensive empirical study using five established software engineering datasets. We assess the performance of three open-source LLMs in both zero-shot and few-shot scenarios. Additionally, we compare fine-tuned pre-trained smaller language models with prompted LLMs. Our experimental findings demonstrate that LLMs exhibit state-of-the-art performance on datasets marked by limited training data and imbalanced distributions. LLMs can also achieve excellent performance under a zero-shot setting. However, when ample training data is available, or the dataset exhibits a more balanced distribution, fine-tuned smaller language models can still achieve superior results.
Key words
sentiment analysis, large language models, software engineering