Information spillover effect on human exploratory behavior in contextual multi-armed bandit problem.

World Symposium on Software Engineering (2023)

Abstract
Recently, the upper confidence bound (UCB) strategy combined with Gaussian process belief updating has received much attention as a model of human exploratory behavior in vast search spaces. However, a major drawback of this model is that it retains the independence from irrelevant alternatives (IIA) property. This property implies that each alternative (arm) is evaluated independently of its relationship to the other alternatives, which eliminates the information spillover effect. In contextual bandit settings in particular, this appears to be a strong restriction. In this study, we first present an empirical example in which the IIA property does not hold. Next, we propose a modification of the UCB model in which the search bonus is given by the information gain from an alternative rather than by its uncertainty. Information gain is widely known as an efficient search criterion in active learning, and it accounts for the information spillover effect. Our empirical results show that the information spillover effect plays an important role in guiding human search over vast spaces.
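The contrast between the two search bonuses can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact form of the information gain, so the sketch assumes an RBF kernel, a one-dimensional context, and a bonus equal to the expected reduction in total posterior variance across all arms after one noisy observation. The function names (`gp_posterior`, `ucb_scores`, `spillover_scores`) and the parameters `beta`, `noise_var`, and `length_scale` are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between two 1-D arrays of contexts.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_arms, noise_var=0.1, length_scale=1.0):
    # Standard Gaussian process posterior mean and covariance at the arms.
    k_oo = rbf_kernel(x_obs, x_obs, length_scale) + noise_var * np.eye(len(x_obs))
    k_ao = rbf_kernel(x_arms, x_obs, length_scale)
    k_aa = rbf_kernel(x_arms, x_arms, length_scale)
    solved = np.linalg.solve(k_oo, k_ao.T)  # shape (n_obs, n_arms)
    mean = solved.T @ y_obs
    cov = k_aa - k_ao @ solved
    return mean, cov

def ucb_scores(mean, cov, beta=1.0):
    # Classic UCB: the bonus depends only on each arm's own uncertainty.
    return mean + beta * np.sqrt(np.clip(np.diag(cov), 0.0, None))

def spillover_scores(mean, cov, beta=1.0, noise_var=0.1):
    # Assumed stand-in for an information-gain bonus: the total posterior
    # variance removed across ALL arms by one noisy observation of arm i,
    # i.e. sum_j cov[j, i]**2 / (cov[i, i] + noise_var).
    var = np.clip(np.diag(cov), 0.0, None)
    reduction = (cov ** 2).sum(axis=0) / (var + noise_var)
    return mean + beta * reduction

# Three clustered arms and one isolated arm; a single distant observation
# leaves all four arms close to the prior.
x_arms = np.array([0.0, 0.1, 0.2, 5.0])
x_obs = np.array([10.0])
y_obs = np.array([0.0])

mean, cov = gp_posterior(x_obs, y_obs, x_arms)
ucb = ucb_scores(mean, cov)
spill = spillover_scores(mean, cov)
print("UCB scores:      ", ucb)
print("Spillover scores:", spill)
```

Under these assumptions, the uncertainty-based bonus treats all four arms almost identically, whereas the spillover-aware bonus favors the clustered arms, because observing one of them also reduces uncertainty about its neighbors.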