COBIAS: Contextual Reliability in Bias Assessment

Priyanshul Govil, Vamshi Krishna Bonagiri, Manas Gaur, Ponnurangam Kumaraguru, Sanorita Dey

CoRR (2024)

Abstract

Large Language Models (LLMs) are trained on inherently biased data. Previous work on debiasing models relies on benchmark datasets to measure model performance. However, these datasets suffer from several pitfalls because the understanding of bias is extremely subjective, which highlights a critical need for contextual exploration. We propose understanding the context of user inputs by considering the diverse situations in which an input statement could occur. This approach would allow for frameworks that foster bias awareness rather than guardrails that hurt user engagement. Our contribution is twofold: (i) we create a dataset of 2,287 stereotyped statements augmented with points for adding context; (ii) we develop the Context-Oriented Bias Indicator and Assessment Score (COBIAS) to assess statements' contextual reliability in measuring bias. Our metric is a significant predictor of the contextual reliability of bias-benchmark datasets (χ² = 71.02, p < 2.2 × 10⁻¹⁶). COBIAS can be used to create reliable datasets, which can in turn improve bias-mitigation work.
