Sequential Cooperative Multi-Agent Reinforcement Learning

AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems (2023)

Abstract
Cooperative multi-agent reinforcement learning (MARL) aims to coordinate the actions of multiple agents via a shared team reward. The complex interactions among agents make this problem extremely difficult. Mainstream MARL methods often implicitly learn an opaque value decomposition of the shared reward into individual utilities, which gives no insight into how well each agent acts and offers no direct guidance for policy optimization. This paper presents a sequential MARL framework that factorizes and simplifies the complex interaction analysis into a sequential evaluation process for more effective and efficient learning. We explicitly formulate this factorization via a novel sequential advantage function that evaluates each agent's actions, achieving explicable credit assignment and substantially facilitating policy optimization. We realize sequential credit assignment (SeCA) by dynamically adjusting the evaluation sequence according to agents' contributions to the team. Extensive experiments on a challenging set of StarCraft II micromanagement tasks verify SeCA's effectiveness.
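The abstract does not spell out the form of the sequential advantage function, but a common way to realize a sequential evaluation is to credit each agent with its marginal contribution on top of the agents already committed earlier in the sequence. The sketch below is a minimal illustration under that assumption; the names `sequential_advantages` and `critic`, and the prefix-evaluation interface, are hypothetical and not the paper's API.

```python
def sequential_advantages(critic, state, actions):
    """Hypothetical sequential credit assignment sketch.

    critic(state, action_prefix) is assumed to return a scalar team-value
    estimate given the actions of the first len(action_prefix) agents
    (how remaining agents are marginalized is left to the critic).
    Each agent's advantage is the change in team value when its action
    is appended to the sequence, so the advantages telescope: they sum
    to critic(state, actions) - critic(state, []).
    """
    advantages = []
    prev_value = critic(state, actions[:0])  # baseline: no agent committed yet
    for i in range(len(actions)):
        value = critic(state, actions[: i + 1])  # commit agent i's action
        advantages.append(value - prev_value)    # agent i's marginal contribution
        prev_value = value
    return advantages


# Toy usage: with an additive critic, each agent's advantage is its own action.
adv = sequential_advantages(lambda s, a: float(sum(a)), state=None, actions=[1, 2, 3])
print(adv)  # [1.0, 2.0, 3.0]
```

Note that SeCA additionally adjusts the agent ordering in light of each agent's contribution to the team; that dynamic reordering is omitted from this sketch.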