Simultaneously Achieving Group Exposure Fairness and Within-Group Meritocracy in Stochastic Bandits
CoRR (2024)
Abstract
Existing approaches to fairness in stochastic multi-armed bandits (MAB)
primarily focus on exposure guarantees for individual arms. When arms are
naturally grouped by certain attribute(s), we propose Bi-Level Fairness, which
considers two levels of fairness. At the first level, Bi-Level Fairness
guarantees a certain minimum exposure to each group. To address the unbalanced
allocation of pulls to individual arms within a group, we consider meritocratic
fairness at the second level, which ensures that each arm is pulled according
to its merit within the group. Our work shows that we can adapt a UCB-based
algorithm to achieve Bi-Level Fairness by (i) providing anytime Group
Exposure Fairness guarantees and (ii) ensuring individual-level Meritocratic
Fairness within each group. We first show that one can decompose regret bounds
into two components: (a) regret due to anytime group exposure fairness and (b)
regret due to meritocratic fairness within each group. Our proposed algorithm,
BF-UCB, balances these two regrets optimally to achieve an upper bound of
O(√T) on regret, where T is the stopping time. With the help of
simulated experiments, we further show that BF-UCB achieves sub-linear regret,
provides better group and individual exposure guarantees than existing
algorithms, and does not result in a significant drop in reward relative to the
UCB algorithm, which does not impose any fairness constraint.
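
The two levels described in the abstract can be illustrated with a minimal sketch. This is not the paper's BF-UCB: the abstract does not give the algorithm, so every name and rule here is an assumption made for illustration. The function `bilevel_fair_ucb_sketch`, its parameters (`means`, `groups`, `min_share`, `horizon`, `seed`), the largest-deficit rule used for group exposure, and the choice of the empirical mean as an arm's "merit" are all hypothetical. Rewards are assumed to be Bernoulli, and the minimum shares are assumed to sum to at most 1.

```python
import math
import random


def bilevel_fair_ucb_sketch(means, groups, min_share, horizon, seed=0):
    """Illustrative two-level fair bandit loop (an assumption, not BF-UCB).

    means:     true Bernoulli mean of each arm (used only to simulate rewards)
    groups:    group label of each arm
    min_share: dict mapping group label -> minimum fraction of pulls
    horizon:   number of rounds to simulate
    """
    rng = random.Random(seed)
    n_arms = len(means)
    pulls = [0] * n_arms
    reward_sum = [0.0] * n_arms
    group_pulls = {g: 0 for g in min_share}
    members = {g: [a for a in range(n_arms) if groups[a] == g] for g in min_share}

    def ucb(arm, t):
        # Standard UCB1-style index; unpulled arms get an infinite index.
        if pulls[arm] == 0:
            return math.inf
        mean = reward_sum[arm] / pulls[arm]
        return mean + math.sqrt(2.0 * math.log(t) / pulls[arm])

    for t in range(1, horizon + 1):
        # Level 1 (group exposure): serve the group that is furthest
        # behind its minimum share, if any group is behind at all.
        deficit = {g: min_share[g] * t - group_pulls[g] for g in min_share}
        chosen = max(deficit, key=deficit.get)
        if deficit[chosen] <= 0:
            # No group is behind: follow the arm with the highest UCB index.
            best_arm = max(range(n_arms), key=lambda a: ucb(a, t))
            chosen = groups[best_arm]

        # Level 2 (within-group merit): try each arm once, then sample
        # arms in proportion to their empirical mean reward.
        arms = members[chosen]
        unpulled = [a for a in arms if pulls[a] == 0]
        if unpulled:
            arm = unpulled[0]
        else:
            merit = [max(reward_sum[a] / pulls[a], 1e-3) for a in arms]
            arm = rng.choices(arms, weights=merit, k=1)[0]

        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        reward_sum[arm] += reward
        group_pulls[chosen] += 1

    return pulls, group_pulls
```

For example, calling `bilevel_fair_ucb_sketch([0.9, 0.8, 0.3, 0.2], ["A", "A", "B", "B"], {"A": 0.3, "B": 0.3}, 2000)` returns per-arm and per-group pull counts. Under the largest-deficit rule assumed here, each group's count stays within a small constant of its minimum share at every round, while the remaining pulls follow the UCB index. Sampling in proportion to merit within a group, rather than always pulling the group's best arm, is what spreads exposure across that group's arms.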