Determinize, Solve, and Generalize: Classical Planning for MDP Heuristics

msra(2009)

Abstract
Heuristics make MDP solvers practical by reducing their space and memory requirements. Some of the most effective heuristics (e.g., the FF heuristic) first determinize the MDP to a classical approximation and then solve a relaxation of the resulting classical problem (e.g., one which ignores the actions' delete effects). While these heuristics can be computed quite quickly, they frequently yield overly optimistic value estimates. This paper proposes a novel class of heuristics, called THUDS, which improve on the existing methods by using full-fledged classical planners to solve the non-relaxed determinizations. THUDS produces more informative state value estimates than those given by the FF heuristic, causing many fewer states to be explored. Of course, invoking a deterministic planner can be very slow; to overcome this high cost, THUDS generalizes the heuristic value of one state to many others by extracting basis functions from the plans discovered in the process of heuristic computation. Thus, the classical planner is only called for states without basis functions, amortizing its costly invocation. Experiments show that THUDS can provide large time and memory savings compared to the FF heuristic and that generalization is vital in making THUDS computationally feasible.
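The determinize, solve, and generalize loop described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and everything in it is an assumption made for illustration: states are sets of propositions, the actions are taken to be already determinized, a basis function is modeled as a conjunction of propositions obtained by regressing the goal through a plan, its weight is the cost of the remaining plan suffix, and a state's heuristic value is the smallest weight among the basis functions that hold in it. The names `Action`, `regress`, `bfs_planner`, and `GeneralizingHeuristic` are hypothetical, and the breadth-first search merely stands in for a full classical planner.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

State = FrozenSet[str]


@dataclass(frozen=True)
class Action:
    """A deterministic STRIPS-style action (assumed output of determinization)."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str] = frozenset()
    cost: float = 1.0


def regress(goal: State, plan: List[Action]) -> List[Tuple[State, float]]:
    """Regress the goal through each plan suffix, yielding (condition, cost-to-goal)."""
    cond, cost, out = goal, 0.0, []
    for a in reversed(plan):
        cond = (cond - a.add) | a.pre  # what must hold before executing a
        cost += a.cost
        out.append((cond, cost))
    return out


def bfs_planner(actions: List[Action]) -> Callable[[State, State], Optional[List[Action]]]:
    """Breadth-first search standing in for a real classical planner."""
    def plan(state: State, goal: State) -> Optional[List[Action]]:
        frontier, seen = deque([(state, [])]), {state}
        while frontier:
            s, path = frontier.popleft()
            if goal <= s:
                return path
            for a in actions:
                if a.pre <= s:
                    t = (s - a.delete) | a.add
                    if t not in seen:
                        seen.add(t)
                        frontier.append((t, path + [a]))
        return None  # goal unreachable in the determinization
    return plan


class GeneralizingHeuristic:
    """Calls the planner only for states that no stored basis function covers."""

    def __init__(self, goal: State, planner) -> None:
        self.goal = goal
        self.planner = planner
        self.basis: Dict[State, float] = {}  # condition -> weight
        self.planner_calls = 0

    def _lookup(self, state: State) -> Optional[float]:
        weights = [w for cond, w in self.basis.items() if cond <= state]
        return min(weights) if weights else None

    def __call__(self, state: State) -> float:
        if self.goal <= state:
            return 0.0
        value = self._lookup(state)
        if value is not None:  # generalization: reuse an earlier plan
            return value
        self.planner_calls += 1
        plan = self.planner(state, self.goal)
        if plan is None:
            return float("inf")
        for cond, w in regress(self.goal, plan):
            self.basis[cond] = min(w, self.basis.get(cond, float("inf")))
        return self._lookup(state)


# Toy domain: fetch a key at location a, move to b, open the door.
actions = [
    Action("pickup", frozenset({"at_a"}), frozenset({"has_key"})),
    Action("move", frozenset({"at_a"}), frozenset({"at_b"}), frozenset({"at_a"})),
    Action("open", frozenset({"at_b", "has_key"}), frozenset({"open"})),
]
h = GeneralizingHeuristic(frozenset({"open"}), bfs_planner(actions))
s0 = frozenset({"at_a"})
print(h(s0), h.planner_calls)
```

In this toy domain a single planner call from `s0` yields three basis functions, so later queries for states such as `{"at_a", "has_key"}` or `{"at_b", "has_key"}` are answered from the stored conditions without planning again, which is the amortization effect the abstract refers to.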