Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process

Operations Research Letters (2024)

Abstract
We study the constrained Markov decision process (CMDP), in which an agent aims to maximize the expected accumulated reward subject to constraints on its utilities/costs. We propose a new primal-dual approach that integrates entropy regularization with Nesterov's accelerated gradient method. The proposed approach is shown to converge to the global optimum with a complexity of Õ(1/ϵ) in terms of the optimality gap and the constraint violation, which improves on the complexity of existing primal-dual approaches by a factor of O(1/ϵ).
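
For orientation, the setting described in the abstract can be written in the standard textbook form below. This is a sketch under assumed notation, not the paper's own statement: the symbols \rho (initial state distribution), V_r and V_g (reward and utility value functions), b (constraint threshold), \lambda (dual variable), \tau (regularization weight), and \mathcal{H} (discounted policy entropy) are all assumptions introduced here, and the paper's exact regularizer and sign conventions may differ.

```latex
% Standard CMDP formulation (assumed notation, single constraint)
\max_{\pi} \; V_r^{\pi}(\rho)
\quad \text{s.t.} \quad V_g^{\pi}(\rho) \ge b

% Lagrangian and the associated saddle-point problem
L(\pi, \lambda) = V_r^{\pi}(\rho) + \lambda \bigl( V_g^{\pi}(\rho) - b \bigr),
\qquad
\max_{\pi} \, \min_{\lambda \ge 0} \, L(\pi, \lambda)

% One common form of entropy regularization (assumed, with weight \tau > 0)
L_{\tau}(\pi, \lambda) = L(\pi, \lambda) + \tau \, \mathcal{H}(\pi)
```

In this notation, a primal-dual method alternates updates of the policy \pi and the multiplier \lambda. The abstract's claim is that such a scheme reaches an ϵ-accurate solution, in both optimality gap and constraint violation, with Õ(1/ϵ) complexity.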
Keywords
constrained Markov decision process, primal-dual algorithm, entropy regularization, accelerated gradient method, policy optimization