Whose Emotions and Moral Sentiments Do Language Models Reflect?
CoRR (2024)
Abstract
Language models (LMs) are known to represent the perspectives of some social
groups better than others, which may impact their performance, especially on
subjective tasks such as content moderation and hate speech detection. To
explore how LMs represent different perspectives, existing research has focused
on positional alignment, i.e., how closely the models mimic the opinions and
stances of different groups, e.g., liberals or conservatives. However, human
communication also encompasses emotional and moral dimensions. We define the
problem of affective alignment, which measures how well LMs' emotional and
moral tone matches that of different groups. By comparing the affect of
responses generated by 36 LMs to the affect of Twitter messages, we observe
significant misalignment of LMs with both ideological groups. This misalignment
is larger than the partisan divide in the U.S. Even after steering the LMs
towards specific ideological perspectives, the misalignment and liberal
tendencies of the models persist, suggesting a systemic bias within LMs.
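
To make the notion of affective alignment concrete, here is a minimal sketch of one plausible way to quantify it: classify each message into a discrete emotion category, then compare the emotion distribution of LM responses against that of a human reference group with a distributional distance. The emotion taxonomy, the toy labels, and the choice of Jensen-Shannon distance are all illustrative assumptions; the paper's exact classifier, affect categories, and metric are not specified in this abstract.

```python
# Sketch: comparing the emotion distribution of LM outputs to a human
# reference corpus. All categories and labels below are hypothetical.
import numpy as np
from scipy.spatial.distance import jensenshannon

# Assumed discrete emotion taxonomy (not the paper's actual one).
EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "optimism"]

def emotion_distribution(labels: list[str]) -> np.ndarray:
    """Turn per-message emotion labels into a normalized distribution."""
    counts = np.array([labels.count(e) for e in EMOTIONS], dtype=float)
    return counts / counts.sum()

# Toy labels standing in for classifier output over tweets and LM responses.
tweet_labels = ["anger", "anger", "joy", "sadness", "optimism", "anger"]
lm_labels = ["joy", "joy", "optimism", "joy", "sadness", "optimism"]

p = emotion_distribution(tweet_labels)  # human reference group
q = emotion_distribution(lm_labels)     # LM-generated responses

# Jensen-Shannon distance with base 2 is bounded in [0, 1]:
# 0 = identical affect profiles, 1 = completely disjoint.
misalignment = jensenshannon(p, q, base=2)
print(f"Affective misalignment (JS distance): {misalignment:.3f}")
```

Under this sketch, computing the distance separately against liberal and conservative reference corpora would let one compare an LM's misalignment with each group, which is the kind of comparison the abstract describes.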