RL in Markov Games with Independent Function Approximation: Improved Sample Complexity Bound under the Local Access Model
arXiv (2024)
Abstract
Efficiently learning equilibria with large state and action spaces in
general-sum Markov games while overcoming the curse of multi-agency is a
challenging problem. Recent works have attempted to solve this problem by
employing independent linear function classes to approximate the marginal
Q-value for each agent. However, existing sample complexity bounds under such
a framework have a suboptimal dependency on the desired accuracy ϵ
or the action space. In this work, we introduce a new algorithm,
Lin-Confident-FTRL, for learning coarse correlated equilibria (CCE) with local
access to the simulator, i.e., one can interact with the underlying environment
on the visited states. Up to a logarithmic dependence on the size of the state
space, Lin-Confident-FTRL learns an ϵ-CCE with a provably optimal
accuracy bound O(ϵ^{-2}) and gets rid of the linear dependency on the
action space, while scaling polynomially with relevant problem parameters (such
as the number of agents and time horizon). Moreover, our analysis of
Lin-Confident-FTRL generalizes the virtual policy iteration technique in the
single-agent local planning literature, which yields a new computationally
efficient algorithm with a tighter sample complexity bound when assuming random
access to the simulator.
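
For reference, the ϵ-CCE notion used above is commonly formalized as below. This is a sketch in standard notation that the abstract itself does not fix. The symbols are assumptions: m is the number of agents, s_1 the initial state, V_i^π(s_1) agent i's expected return under the (possibly correlated) joint policy π, and π_i' × π_{-i} the policy in which agent i deviates to π_i' while the other agents follow π.

```latex
\[
  \max_{i \in [m]} \Bigl( \max_{\pi_i'} V_i^{\pi_i' \times \pi_{-i}}(s_1) - V_i^{\pi}(s_1) \Bigr) \le \epsilon
\]
```

Under this reading, the O(ϵ^{-2}) accuracy bound says that the number of simulator queries needed to drive every agent's gain from a unilateral deviation below ϵ grows as 1/ϵ², with the remaining problem parameters entering polynomially.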