Exploiting Learned Policies and Learned Heuristics in Bounded-Suboptimal Search.

SOCS(2021)

引用 0|浏览0
暂无评分
摘要
Machine Learning (ML) has made significant progress to perform different tasks, such as image classification, speech recognition, and natural language processing, mainly driven by deep learning. Also, ML algorithms, through learning policies or heuristics estimates, have demonstrated potential for solving deterministic problems that would usually be solved using search techniques. Nevertheless, in solving a search problem with purely learning techniques, it is not possible to deliver guarantees regarding the quality of the solution. This research explores how a learned policy or heuristic can be integrated with a bounded-suboptimal search algorithm using Focal Search, sorting the FOCAL list using the concept of discrepancies to speed up the search. On the experimental side, we train a simple neural network as a learned policy and the DeepCubeA as a learned heuristic for the 15-puzzle domain. The results show that a learned policy or heuristic can reduce, at least, one order of magnitude, the expansions than WA* with the same bound and deliver better solution quality.
更多
查看译文
关键词
learned heuristics,learned policies,search,bounded-suboptimal
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要