Confirmation bias optimizes reward learning


Confirmation bias, the tendency to overweight information that matches prior beliefs or choices, has been shown to manifest even in simple reinforcement learning. In line with recent work, we find that participants learned significantly more from choice-confirming outcomes in a reward-learning task. What is less clear is whether asymmetric learning rates benefit the learner. Here, we combine data from human participants and artificial agents to examine how confirmation-biased learning might improve performance by counteracting decisional and environmental noise. We evaluate one potential mechanism for such noise reduction: visual attention, a demonstrated driver of both value-based choice and predictive learning. Surprisingly, visual attention showed the opposite pattern to confirmation bias: participants were most likely to fixate on “missed opportunities”, slightly dampening the effects of the confirmation bias we observed. Several million simulated experiments with artificial agents showed this bias to be a reward-maximizing strategy compared to several alternatives, but only if disconfirming feedback is not completely ignored, a condition that visual attention may help to enforce.

### Competing Interest Statement

The authors have declared no competing interest.
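To make the idea of asymmetric learning rates in the abstract concrete, the following is a minimal sketch of a confirmation-biased value learner. It is not the authors' model or simulation code. It assumes a bandit task in which outcomes are revealed for both the chosen and unchosen options, a delta-rule update, and a softmax choice rule. The names `update_values`, `softmax_choice`, `run_agent`, `alpha_conf`, `alpha_disc`, and `beta` are hypothetical and chosen for illustration.

```python
import math
import random


def update_values(q, chosen, rewards, alpha_conf, alpha_disc):
    """Return new value estimates after one trial with full feedback.

    A prediction error counts as "confirming" when it supports the choice
    just made: better than expected for the chosen option, or worse than
    expected for an unchosen option. Confirming errors are scaled by
    alpha_conf; all other errors are scaled by alpha_disc.
    """
    new_q = list(q)
    for option, reward in enumerate(rewards):
        delta = reward - q[option]  # prediction error for this option
        if option == chosen:
            confirming = delta > 0
        else:
            confirming = delta < 0
        alpha = alpha_conf if confirming else alpha_disc
        new_q[option] = q[option] + alpha * delta
    return new_q


def softmax_choice(q, beta, rng):
    """Sample an option with probability proportional to exp(beta * q)."""
    peak = max(q)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (v - peak)) for v in q]
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for option, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return option
    return len(q) - 1


def run_agent(reward_probs, n_trials, alpha_conf, alpha_disc, beta, seed=0):
    """Simulate one agent; return its mean reward and final value estimates."""
    rng = random.Random(seed)
    q = [0.5] * len(reward_probs)  # assumed neutral starting values
    total = 0.0
    for _ in range(n_trials):
        chosen = softmax_choice(q, beta, rng)
        # Binary outcomes are drawn for every option (assumed full feedback).
        rewards = [1.0 if rng.random() < p else 0.0 for p in reward_probs]
        total += rewards[chosen]
        q = update_values(q, chosen, rewards, alpha_conf, alpha_disc)
    return total / n_trials, q
```

Setting `alpha_conf` above `alpha_disc` gives a confirmation-biased agent, equal rates give an unbiased one, and `alpha_disc = 0` corresponds to ignoring disconfirming feedback entirely. Comparing mean reward from `run_agent` across these settings and many seeds is one way such a comparison could be set up under the assumptions above.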