Non-Myopic Knowledge Gradient Policy for Ranking and Selection.

WSC (2022)

Abstract

We consider the ranking and selection (R&S) problem with a fixed simulation budget, in which the budget is allocated sequentially. Deriving the optimal sampling procedure for this problem amounts to solving a stochastic dynamic program that is highly intractable. To overcome this difficulty, existing R&S procedures are often designed from a myopic viewpoint. However, these myopic procedures are only single-step optimal and may perform poorly on general sequential R&S problems. Therefore, in this paper, we combine two popular lookahead strategies to design a non-myopic knowledge gradient (KG) procedure. To streamline the computation of the procedure, we also propose a modified Monte Carlo tree search method designed specifically for the R&S context. We show that the new procedure can outperform the classic KG procedure.
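
For context on the myopic baseline the abstract compares against, the following is a minimal sketch of the classic single-step KG rule under independent normal beliefs with known sampling variances. It is not the paper's non-myopic procedure or its tree search. The function names (`kg_factors`, `kg_choose`) and the argument names (`mu`, `sigma2`, `noise_var`) are illustrative assumptions, not taken from the paper.

```python
import math


def _normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kg_factors(mu, sigma2, noise_var):
    """Single-step KG factor for each alternative.

    mu[i]        : current posterior mean of alternative i (assumed name)
    sigma2[i]    : current posterior variance of alternative i (assumed name)
    noise_var[i] : known sampling variance of alternative i (assumed name)
    """
    factors = []
    for i in range(len(mu)):
        # Standard deviation of the change in the posterior mean of
        # alternative i caused by one additional sample.
        sigma_tilde = sigma2[i] / math.sqrt(sigma2[i] + noise_var[i])
        if sigma_tilde == 0.0:
            factors.append(0.0)  # nothing left to learn about i
            continue
        best_other = max(m for j, m in enumerate(mu) if j != i)
        z = -abs(mu[i] - best_other) / sigma_tilde
        # Expected one-step improvement in the best posterior mean.
        factors.append(sigma_tilde * (z * _normal_cdf(z) + _normal_pdf(z)))
    return factors


def kg_choose(mu, sigma2, noise_var):
    """Index of the alternative with the largest KG factor."""
    factors = kg_factors(mu, sigma2, noise_var)
    return max(range(len(factors)), key=factors.__getitem__)
```

Because this rule values only the next sample, it is single-step optimal in the sense described above; a non-myopic procedure would instead account for how later samples will be allocated.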

Keywords

ranking, knowledge, selection, non-myopic
