Uncovering Bias in Large Vision-Language Models with Counterfactuals
arXiv (2024)
Abstract
With the advent of Large Language Models (LLMs) possessing increasingly
impressive capabilities, a number of Large Vision-Language Models (LVLMs) have
been proposed to augment LLMs with visual inputs. Such models condition
generated text on both an input image and a text prompt, enabling a variety of
use cases such as visual question answering and multimodal chat. While prior
studies have examined the social biases contained in text generated by LLMs,
this topic has been relatively unexplored in LVLMs. Examining social biases in
LVLMs is particularly challenging due to the confounding contributions of bias
induced by information contained across the text and visual modalities. To
address this problem, we conduct a large-scale study of text
generated by different LVLMs under counterfactual changes to input images.
Specifically, we present LVLMs with identical open-ended text prompts while
conditioning on images from different counterfactual sets, where each set
contains images which are largely identical in their depiction of a common
subject (e.g., a doctor), but vary only in terms of intersectional social
attributes (e.g., race and gender). We comprehensively evaluate the text
produced by different LVLMs under this counterfactual generation setting and
find that social attributes such as race, gender, and physical characteristics
depicted in input images can significantly influence toxicity and the
generation of competency-associated words.
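
The counterfactual comparison described above can be illustrated with a minimal sketch. It assumes the generated text has already been collected as (set, attribute, text) records, and it scores each text by counting words from a small competency lexicon. The lexicon, the record layout, and the names `competency_count` and `attribute_gaps` are hypothetical choices made for illustration; the abstract does not specify the actual word list, metrics, or pipeline.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical lexicon: the abstract does not give the actual word list.
COMPETENCY_WORDS = {"skilled", "competent", "intelligent", "capable"}


def competency_count(text):
    """Count tokens that appear in the illustrative competency lexicon."""
    tokens = (tok.strip(".,!?;:").lower() for tok in text.split())
    return sum(tok in COMPETENCY_WORDS for tok in tokens)


def attribute_gaps(records, score=competency_count):
    """Compare scores across attributes within each counterfactual set.

    records: iterable of (set_id, attribute, generated_text) tuples.
    Returns {set_id: (max_mean - min_mean, {attribute: mean_score})}.
    """
    by_set = defaultdict(lambda: defaultdict(list))
    for set_id, attribute, text in records:
        by_set[set_id][attribute].append(score(text))

    result = {}
    for set_id, groups in by_set.items():
        means = {attr: mean(vals) for attr, vals in groups.items()}
        result[set_id] = (max(means.values()) - min(means.values()), means)
    return result


# Toy records standing in for text an LVLM generated from the same prompt
# on two images of one counterfactual set (invented for illustration only).
records = [
    ("doctor", "group_a", "A skilled and capable doctor."),
    ("doctor", "group_b", "A doctor standing in a room."),
]
gaps = attribute_gaps(records)
```

Because the prompt and the depicted subject are held fixed within a set, a large gap between attribute groups points to the social attributes shown in the image as the source of the difference. Any other per-text scorer, such as a toxicity classifier, could be passed through the `score` parameter.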