Are Large Language Models Consistent over Value-laden Questions?
arXiv (2024)
Abstract
Large language models (LLMs) appear to bias their survey answers toward
certain values. Nonetheless, some argue that LLMs are too inconsistent to
simulate particular values. Are they? To answer, we first define value
consistency as the similarity of answers across (1) paraphrases of one
question, (2) related questions under one topic, (3) multiple-choice and
open-ended use-cases of one question, and (4) multilingual translations of a
question to English, Chinese, German, and Japanese. We apply these measures to
a few large (≥34B-parameter), open LLMs, including llama-3, as well as gpt-4o, using
eight thousand questions spanning more than 300 topics. Unlike prior work, we
find that models are relatively consistent across paraphrases, use-cases,
translations, and within a topic. Still, some inconsistencies remain. Models
are more consistent on uncontroversial topics (e.g., in the U.S.,
"Thanksgiving") than on controversial ones ("euthanasia"). Base models are both
more consistent compared to fine-tuned models and are uniform in their
consistency across topics, while fine-tuned models are more inconsistent about
some topics ("euthanasia") than others ("women's rights") like our human
subjects (n=165).
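
As an illustration of the consistency definition above, the following is a
minimal Python sketch of one plausible paraphrase-consistency measure: mean
pairwise exact-match agreement among a model's answers to paraphrases of a
single question. The function name and the exact-match criterion are
assumptions for illustration only, not the paper's actual metric (which
compares answer similarity across paraphrases, topics, use-cases, and
translations).

```python
from itertools import combinations

def pairwise_consistency(answers: list[str]) -> float:
    """Mean pairwise agreement among answers to paraphrases of one question.

    Returns 1.0 when all answers match and 0.0 when no pair matches.
    Hypothetical exact-match sketch, not the paper's metric.
    """
    if len(answers) < 2:
        return 1.0  # a single answer is trivially consistent
    pairs = list(combinations(answers, 2))
    agreements = sum(a == b for a, b in pairs)
    return agreements / len(pairs)

# Hypothetical usage: a model's multiple-choice answers to four
# paraphrases of the same value-laden question.
print(pairwise_consistency(["agree", "agree", "disagree", "agree"]))  # 0.5
```

Under this toy measure, three of the six answer pairs agree, giving a
consistency of 0.5; a real evaluation would instead compare distributions of
sampled answers rather than single strings.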