Scalable POMDP Decision-Making Using Circulant Controllers

2021 IEEE International Conference on Robotics and Automation (ICRA 2021)

Abstract
This paper presents a novel policy representation for partially observable Markov decision processes (POMDPs) called circulant controllers and a provably efficient gradient-based algorithm for them. A formal mathematical description is provided that leverages circulant matrices for the controller’s stochastic node transitions. This structure is particularly effective for capturing decision-making patterns found in real-world domains with repeated periodic behaviors that adapt their cycles based on observation. This includes domains such as bipedal walking over varied terrain, pick-and-place tasks in warehouses, and home healthcare monitoring and medicine delivery in household environments. A performant gradient-based algorithm is presented with a detailed theoretical analysis, formally proving the algorithm’s improved performance, as well as circulant controllers’ structural properties. Experiments on these domains demonstrate that the proposed controller algorithm outperforms other state-of-the-art POMDP controller algorithms. The proposed novel controller approach is demonstrated on an actual robot performing a navigation task in a real household environment.
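To make the circulant structure mentioned above concrete, the following is a minimal sketch, not the paper's formulation: it builds a row-stochastic circulant matrix in which every row is a cyclic shift of one probability vector, so each controller node shares the same relative transition pattern. The function name `circulant_transition_matrix`, the 4-node example, and the use of a single matrix (rather than, say, one per observation) are all assumptions made for illustration.

```python
import numpy as np

def circulant_transition_matrix(first_row):
    """Build a row-stochastic circulant matrix from one probability vector.

    Hypothetical helper for illustration only; the paper's actual
    parameterization is not given in the abstract.
    """
    p = np.asarray(first_row, dtype=float)
    if p.ndim != 1 or np.any(p < 0) or not np.isclose(p.sum(), 1.0):
        raise ValueError("first_row must be a 1-D probability vector")
    # Row i is the first row cyclically shifted right by i positions,
    # so entry (i, j) depends only on (j - i) mod n.
    return np.stack([np.roll(p, i) for i in range(len(p))])

# Assumed example: a 4-node controller that stays at the current node
# with probability 0.1 and advances to the next node with probability 0.9.
T = circulant_transition_matrix([0.1, 0.9, 0.0, 0.0])
print(T)
```

Because the whole matrix is determined by its first row, a controller with n nodes needs only n transition parameters here instead of n squared, which is one plausible reading of why such a structure could scale well for periodic behaviors.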
Keywords
stochastic node transitions, real household environment, robot navigation task, POMDP controller, formal mathematical description, gradient-based algorithm, partially observable Markov decision processes, circulant controllers, scalable POMDP decision-making