Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process
Operations Research Letters(2024)
摘要
The problem of constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated reward subject to constraints on its utilities/costs. We propose a new primal-dual approach with a novel integration of entropy regularization and Nesterov's accelerated gradient method. The proposed approach is shown to converge to the global optimum with a complexity of O˜(1/ϵ) in terms of the optimality gap and the constraint violation, which improves the complexity of the existing primal-dual approaches by a factor of O(1/ϵ).
更多查看译文
关键词
constrained Markov decision process,primal-dual algorithm,entropy regularization,accelerated gradient method,policy optimization
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要