Fairness Without Harm: An Influence-Guided Active Sampling Approach
CoRR (2024)
Abstract
The pursuit of fairness in machine learning (ML), ensuring that models do
not exhibit biases against protected demographic groups, typically results in a
compromise. This compromise can be explained by a Pareto frontier: given
certain resources (e.g., data), reducing fairness violations often comes at the
cost of lowering model accuracy. In this work, we aim to train models that
mitigate group fairness disparities without harming model accuracy.
Intuitively, acquiring more data is a natural and promising way to achieve this
goal, as it can reach a better Pareto frontier of the fairness-accuracy
tradeoff. Current data acquisition methods, such as fair active learning
approaches, typically require annotating sensitive attributes. However, these
annotations should be protected due to privacy and safety concerns. In this
paper, we propose a tractable active data sampling algorithm that does not rely
on group annotations for the training data, instead requiring group annotations
only on a small validation set. Specifically, the algorithm first scores each
new example by its influence on fairness and accuracy, evaluated on the
validation set, and then selects a certain number of examples for training. We
theoretically analyze how acquiring more data can improve fairness without
causing harm, and validate the feasibility of our sampling approach in the
context of risk disparity. We also provide upper bounds on the generalization
error and risk disparity, as well as the connections between them. Extensive
experiments on real-world data demonstrate the effectiveness of the proposed
algorithm.
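The scoring-and-selection step described above lends itself to a short sketch. Below is a minimal illustration in PyTorch, assuming a first-order (gradient dot-product) approximation of influence; the paper's exact influence estimator and fairness surrogate may differ, and all names here (`risk_disparity`, `select_batch`, `lam`) are illustrative, not the authors' code. Consistent with the setting in the abstract, group labels `val_g` are required only for the small validation set, never for the candidate pool.

```python
# Sketch of influence-guided active sampling under a first-order
# (gradient dot-product) approximation of influence. Hypothetical
# names throughout; not the paper's implementation.
import torch
import torch.nn.functional as F

def flat_grad(loss, model):
    """Flatten the gradient of `loss` w.r.t. all model parameters."""
    grads = torch.autograd.grad(loss, list(model.parameters()), retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def risk_disparity(logits, labels, groups):
    """Hypothetical surrogate for risk disparity: the absolute difference
    of average losses between the two validation groups."""
    per_example = F.cross_entropy(logits, labels, reduction="none")
    return (per_example[groups == 0].mean() - per_example[groups == 1].mean()).abs()

def select_batch(model, cand_x, cand_y, val_x, val_y, val_g, k, lam=1.0):
    """Score each candidate by its estimated joint influence on validation
    accuracy and fairness; return indices of the k highest-scoring examples."""
    val_logits = model(val_x)
    g_acc = flat_grad(F.cross_entropy(val_logits, val_y), model)       # accuracy objective
    g_fair = flat_grad(risk_disparity(val_logits, val_y, val_g), model)  # fairness objective

    scores = []
    for x, y in zip(cand_x, cand_y):
        g_z = flat_grad(F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0)), model)
        # A gradient step on candidate z changes a validation objective by
        # roughly -eta * <g_z, g_val>, so a larger dot product means the
        # example is more beneficial to that objective.
        scores.append((g_z @ g_acc + lam * (g_z @ g_fair)).item())
    return torch.tensor(scores).topk(k).indices
```

In this sketch, candidates whose training gradients align with both the validation accuracy gradient and the fairness gradient score highest, so scoring costs one forward/backward pass per candidate; `lam` trades off the two objectives.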