Alleviating Hallucinations of Large Language Models through Induced Hallucinations
arXiv (2023)
Abstract
Despite their impressive capabilities, large language models (LLMs) have been
observed to generate responses that include inaccurate or fabricated
information, a phenomenon commonly known as “hallucination”. In this work, we
propose a simple Induce-then-Contrast Decoding (ICD) strategy to
alleviate hallucinations. We first construct a factually weak LLM by inducing
hallucinations from the original LLM. Then, we penalize these induced
hallucinations during decoding to enhance the factuality of the generated
content. Concretely, we determine the final next-token predictions by
amplifying the predictions from the original model and downplaying the induced
untruthful predictions via contrastive decoding. Experimental results on both
discrimination-based and generation-based hallucination evaluation benchmarks,
such as TruthfulQA and FActScore, demonstrate that our proposed ICD
method can effectively enhance the factuality of LLMs across various model
sizes and families. For example, when equipped with ICD, Llama2-7B-Chat and
Mistral-7B-Instruct achieve performance comparable to ChatGPT and GPT4 on
TruthfulQA, respectively.
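The abstract describes a two-model decoding scheme: take next-token predictions from the original model, take predictions from a deliberately hallucination-induced copy, and contrast the two so that tokens favored only by the untruthful model are penalized. The following is a minimal sketch of one such contrastive decoding step in PyTorch; the (1 + alpha) weighting, the `alpha` and `plaus_ratio` values, and the adaptive plausibility mask follow the general contrastive-decoding recipe and are assumptions for illustration, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def contrastive_decode_step(orig_logits, induced_logits, alpha=1.0, plaus_ratio=0.1):
    """One induce-then-contrast decoding step (illustrative sketch).

    orig_logits / induced_logits: (vocab_size,) next-token logits from the
    original model and from the hallucination-induced model, respectively.
    """
    orig_logprobs = F.log_softmax(orig_logits, dim=-1)
    induced_logprobs = F.log_softmax(induced_logits, dim=-1)

    # Adaptive plausibility constraint (assumed): only keep tokens to which
    # the original model itself assigns reasonably high probability.
    probs = orig_logprobs.exp()
    plausible = probs >= plaus_ratio * probs.max()

    # Contrast: amplify the original model's predictions and downplay the
    # induced (untruthful) model's predictions.
    scores = (1 + alpha) * orig_logprobs - alpha * induced_logprobs
    scores = scores.masked_fill(~plausible, float("-inf"))
    return scores  # greedy decoding would pick scores.argmax()

# Usage with random logits standing in for the two models' outputs.
vocab_size = 32000
torch.manual_seed(0)
orig = torch.randn(vocab_size)
induced = torch.randn(vocab_size)
next_token = contrastive_decode_step(orig, induced).argmax().item()
```

In practice the induced logits would come from the same base model made factually weaker (e.g. via a hallucination-inducing prompt or fine-tuning), and this step would be repeated autoregressively at every decoding position.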