SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model
arXiv (2024)
Abstract
The emergence of Vision Language Models (VLMs) has brought unprecedented
advances in understanding multimodal information. The combination of textual
and visual semantics in VLMs is highly complex and diverse, making the safety
alignment of these models challenging. Furthermore, due to the limited study on
the safety alignment of VLMs, there is a lack of large-scale, high-quality
datasets. To address these limitations, we propose a Safety Preference
Alignment dataset for Vision Language Models named SPA-VL. In terms of breadth,
SPA-VL covers 6 harmfulness domains, 13 categories, and 53 subcategories, and
contains 100,788 samples of the quadruple (question, image, chosen response,
rejected response). In terms of depth, the responses are collected from 12
open- (e.g., QwenVL) and closed-source (e.g., Gemini) VLMs to ensure diversity.
The experimental results indicate that models trained with alignment techniques
on the SPA-VL dataset exhibit substantial improvements in harmlessness and
helpfulness while maintaining core capabilities. SPA-VL, as a large-scale,
high-quality, and diverse dataset, represents a significant milestone in
ensuring that VLMs achieve both harmlessness and helpfulness. Our code
(https://github.com/EchoseChen/SPA-VL-RLHF) and the SPA-VL dataset
(https://huggingface.co/datasets/sqrti/SPA-VL) are publicly available.
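
To make the quadruple format concrete, the following is a minimal sketch of how one (question, image, chosen response, rejected response) sample could be represented and scored with a pairwise preference objective. The field names, the `PreferenceSample` class, and the `dpo_loss` helper are all hypothetical illustrations and not the dataset's actual schema or the authors' training code. The abstract says only "alignment techniques"; the DPO-style loss here is one common choice, assumed for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferenceSample:
    """One (question, image, chosen, rejected) quadruple. Field names are assumed."""
    question: str
    image_path: str  # hypothetical: a file path stands in for the image itself
    chosen: str      # the preferred (safer, more helpful) response
    rejected: str    # the dispreferred response


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO-style pairwise loss for a single sample: -log(sigmoid(margin)).

    Inputs are sequence log-probabilities of the chosen and rejected responses
    under the policy being trained and under a frozen reference model.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically stable softplus(-margin), which equals -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Illustrative sample; the text and path are invented, not taken from SPA-VL.
sample = PreferenceSample(
    question="What is shown in this picture?",
    image_path="images/example.jpg",
    chosen="A safe and helpful answer.",
    rejected="An unsafe answer.",
)
```

The loss falls as the policy assigns relatively more probability to `chosen` than to `rejected`, compared with the reference model, which is the behavior a preference dataset of this shape is meant to drive.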