PTDE: Personalized Training with Distilled Execution for Multi-Agent Reinforcement Learning
arXiv (2022)
Abstract
Centralized Training with Decentralized Execution (CTDE) has emerged as a
widely adopted paradigm in multi-agent reinforcement learning, emphasizing the
utilization of global information for learning an enhanced joint Q-function
or centralized critic. In contrast, our investigation delves into harnessing
global information to directly enhance individual Q-functions or individual
actors. Notably, we discover that applying identical global information
universally across all agents proves insufficient for optimal performance.
Consequently, we advocate for the customization of global information tailored
to each agent, creating agent-personalized global information to bolster
overall performance. Furthermore, we introduce a novel paradigm named
Personalized Training with Distilled Execution (PTDE), wherein
agent-personalized global information is distilled into the agent's local
information. This distilled information is then utilized during decentralized
execution, resulting in minimal performance degradation. PTDE can be integrated
with state-of-the-art algorithms and yields notable performance gains across
diverse benchmarks, including SMAC, Google Research Football (GRF), and a
Learning to Rank (LTR) task.
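
The distillation idea described above can be illustrated with a toy sketch. This is not the paper's architecture: the linear teacher and student, the mean-squared-error objective, and every name here (`teacher_weights`, `distill`, `obs_dim`, and so on) are assumptions made for illustration. A per-agent "teacher" maps the global state to an agent-personalized embedding, and a "student" that sees only the agent's local observation is trained to reproduce that embedding, so that execution needs local information only.

```python
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def teacher_weights(agent_id, embed_dim, state_dim):
    # Hypothetical stand-in for a learned, agent-personalized global-information
    # module: a fixed linear map that differs per agent.
    return [[1.0 / (1 + j + k + agent_id) for j in range(state_dim)]
            for k in range(embed_dim)]


def eval_loss(W_student, W_teacher, states, obs_dim):
    """Average distillation error of the student over a batch of global states."""
    total = 0.0
    for s in states:
        total += mse(matvec(W_student, s[:obs_dim]), matvec(W_teacher, s))
    return total / len(states)


def distill(agent_id, steps=500, state_dim=6, obs_dim=3, embed_dim=2,
            lr=0.05, seed=0):
    """Train a local-observation student to mimic the agent's teacher embedding.

    Returns the held-out distillation loss before and after training.
    """
    rng = random.Random(seed)
    W_teacher = teacher_weights(agent_id, embed_dim, state_dim)
    W_student = [[0.0] * obs_dim for _ in range(embed_dim)]
    held_out = [[rng.uniform(-1, 1) for _ in range(state_dim)]
                for _ in range(200)]
    before = eval_loss(W_student, W_teacher, held_out, obs_dim)

    for _ in range(steps):
        state = [rng.uniform(-1, 1) for _ in range(state_dim)]
        obs = state[:obs_dim]               # assumption: agent observes a slice of the state
        target = matvec(W_teacher, state)   # personalized global information (training only)
        pred = matvec(W_student, obs)       # distilled estimate from local information
        for k in range(embed_dim):
            grad_out = 2.0 * (pred[k] - target[k]) / embed_dim
            for j in range(obs_dim):
                W_student[k][j] -= lr * grad_out * obs[j]

    after = eval_loss(W_student, W_teacher, held_out, obs_dim)
    return before, after
```

Calling `before, after = distill(agent_id=0)` returns the held-out distillation error before and after training. In this toy setup the remaining error comes from the part of the global state the student cannot observe, which loosely mirrors the small gap between training with global information and decentralized execution that the abstract mentions.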