Effective Controllable Bias Mitigation for Classification and Retrieval using Gate Adapters
CoRR (2024)
Abstract
Bias mitigation in language models has been the topic of many studies, with a
recent focus on learning separate modules like adapters for on-demand
debiasing. Besides optimizing for a modularized debiased model, it is often
critical in practice to control the degree of bias reduction at inference time,
e.g., in order to tune for a desired performance-fairness trade-off in search
results or to control the strength of debiasing in classification tasks. In
this paper, we introduce Controllable Gate Adapter (ConGater), a novel modular
gating mechanism with adjustable sensitivity parameters, which allows for a
gradual transition from the biased state of the model to the fully debiased
version at inference time. We demonstrate ConGater performance by (1)
conducting adversarial debiasing experiments with three different models on
three classification tasks with four protected attributes, and (2) reducing the
bias of search results through fairness list-wise regularization to enable
adjusting a trade-off between performance and fairness metrics. Our experiments
on the classification tasks show that compared to baselines of the same
caliber, ConGater can maintain higher task performance while containing less
information regarding the attributes. Our results on the retrieval task show
that the fully debiased ConGater can achieve the same fairness performance
while maintaining more than twice the task performance of recent strong
baselines. Overall, besides strong performance, ConGater enables the continuous
transitioning between biased and debiased states of models, enhancing
personalization of use and interpretability through controllability.
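
As a rough illustration of what a gate with an adjustable sensitivity parameter could look like, the sketch below scales each hidden value by a gate that moves from 1 (no effect) to a plain sigmoid as a control value goes from 0 to 1. The abstract does not give the gating function, so the linear interpolation used here, along with the names `gate`, `apply_gate`, and `omega`, are assumptions made for illustration only and should not be read as the paper's formulation.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def gate(logit: float, omega: float) -> float:
    """Hypothetical controllable gate (not the paper's exact function).

    omega = 0 returns 1.0, so the gated value is left unchanged.
    omega = 1 returns sigmoid(logit), i.e. the gate is fully active.
    Values in between interpolate linearly between the two.
    """
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must be in [0, 1]")
    return 1.0 - omega * (1.0 - sigmoid(logit))


def apply_gate(hidden, gate_logits, omega):
    """Element-wise multiply hidden values by their gate values."""
    return [h * gate(z, omega) for h, z in zip(hidden, gate_logits)]


hidden = [1.0, -2.0]
gate_logits = [0.0, 3.0]  # assumed outputs of a small learned gating layer
for omega in (0.0, 0.5, 1.0):
    print(omega, apply_gate(hidden, gate_logits, omega))
```

With `omega` set at inference time, the same trained model can be moved gradually between its original behaviour and its fully gated behaviour, which is the kind of control the abstract describes.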