Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond
CoRR(2024)
摘要
Uncertainty estimation plays a pivotal role in ensuring the reliability of
safety-critical human-AI interaction systems, particularly in the medical
domain. However, a general method for quantifying the uncertainty of free-form
answers has yet to be established in open-ended medical question-answering (QA)
tasks, where irrelevant words and sequences with limited semantic information
can be the primary source of uncertainty due to the presence of generative
inequality. In this paper, we propose the Word-Sequence Entropy (WSE), which
calibrates the uncertainty proportion at both the word and sequence levels
according to the semantic relevance, with greater emphasis placed on keywords
and more relevant sequences when performing uncertainty quantification. We
compare WSE with 6 baseline methods on 5 free-form medical QA datasets,
utilizing 7 "off-the-shelf" large language models (LLMs), and show that WSE
exhibits superior performance on accurate uncertainty measurement under two
standard criteria for correctness evaluation (e.g., WSE outperforms existing
state-of-the-art method by 3.23
terms of the potential for real-world medical QA applications, we achieve a
significant enhancement in the performance of LLMs when employing sequences
with lower uncertainty, identified by WSE, as final answers (e.g., +6.36
accuracy improvement on the COVID-QA dataset), without requiring any additional
task-specific fine-tuning or architectural modifications.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要