COBIAS: Contextual Reliability in Bias Assessment
CoRR (2024)
Abstract
Large Language Models (LLMs) are trained on inherently biased data. Previous
work on debiasing models relies on benchmark datasets to measure model
performance. However, these datasets suffer from several pitfalls due to the
extremely subjective understanding of bias, highlighting a critical need for
contextual exploration. We propose understanding the context of user inputs
by considering the diverse situations in which an input statement could
plausibly occur. This approach would allow for frameworks that foster bias awareness
rather than guardrails that hurt user engagement. Our contribution is twofold:
(i) we create a dataset of 2287 stereotyped statements augmented with points
for adding context; (ii) we develop the Context-Oriented Bias Indicator and
Assessment Score (COBIAS) to assess statements' contextual reliability in
measuring bias. Our metric is a significant predictor of the contextual
reliability of bias-benchmark datasets (χ^2 = 71.02, p < 2.2 × 10^-16).
COBIAS can be used to create reliable datasets, which in turn can improve
bias mitigation work.