EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models
CoRR(2024)
摘要
Multimodal large language models (MLLMs) have attracted increasing attention
in the past few years, but they may still generate descriptions that include
objects not present in the corresponding images, a phenomenon known as object
hallucination. To eliminate hallucinations, existing methods manually annotate
paired responses with and without hallucinations, and then employ various
alignment algorithms to improve the alignment capability between images and
text. However, they not only demand considerable computation resources during
the finetuning stage but also require expensive human annotation to construct
paired data needed by the alignment algorithms. To address these issues, we
borrow the idea of unlearning and propose an efficient fine-grained unlearning
framework (EFUF), which can eliminate hallucinations without the need for
paired data. Extensive experiments show that our method consistently reduces
hallucinations while preserving the generation quality with modest
computational overhead. Our code and datasets will be publicly available.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要