Learning Ad Hoc Cooperation Policies from Limited Priors via Meta-Reinforcement Learning

Applied Sciences (Basel), 2024

Abstract
When agents must collaborate without prior coordination, the multi-agent cooperation problem becomes an ad hoc teamwork (AHT) problem. Mainstream AHT research divides into type-based and type-free methods: the former relies on a set of known teammate types to infer the current teammate's type, while the latter requires no such types at all. In many real-world applications, however, both a complete absence of known types and fully sufficient knowledge of them are impractical. This research therefore focuses on the challenge of AHT with limited known types. To this end, this paper proposes Few typE-based Ad hoc Teamwork via meta-reinforcement learning (FEAT), a method that adapts to teammates within a single episode using only a small set of known types. FEAT employs meta-reinforcement learning with limited priors about known types to develop a highly adaptive policy, and it uses this policy to automatically generate a diverse type repository. During ad hoc cooperation, the agent can autonomously identify known teammate types and directly apply the corresponding pre-trained optimal cooperative policy, or swiftly update the meta policy to respond to teammates of unknown types. Comprehensive experiments in the pursuit domain validate the effectiveness of the algorithm and its components.
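The abstract's online decision rule (reuse a pre-trained policy when the teammate matches a known type, otherwise fall back to fast meta-policy adaptation) can be sketched as below. This is a minimal illustration, not the paper's actual implementation: the names `type_repository`, `score_fn`, and the threshold test are all hypothetical stand-ins for FEAT's type-identification mechanism.

```python
# Hypothetical sketch of FEAT's online adaptation rule: if the current
# teammate's behaviour matches a known type confidently enough, reuse that
# type's pre-trained cooperative policy; otherwise return the meta policy
# for fast within-episode adaptation. All names here are illustrative.

def select_policy(observations, type_repository, meta_policy,
                  score_fn, threshold=0.8):
    """Return the policy the ad hoc agent should follow.

    type_repository: dict mapping type id -> pre-trained cooperative policy
    score_fn(type_id, observations): similarity of the observed teammate
        behaviour to a known type, assumed to lie in [0, 1]
    """
    scores = {t: score_fn(t, observations) for t in type_repository}
    best_type = max(scores, key=scores.get)
    if scores[best_type] >= threshold:
        # Known teammate type: apply the pre-trained optimal policy directly.
        return type_repository[best_type]
    # Unknown type: fall back to updating the meta policy online.
    return meta_policy

# Toy usage in a pursuit-style setting: two known types, and a teammate
# whose behaviour closely matches the "chaser" type.
repo = {"chaser": "policy_chaser", "blocker": "policy_blocker"}
obs = ["moved_toward_prey"] * 5
score = lambda t, o: 0.9 if t == "chaser" else 0.2
print(select_policy(obs, repo, "meta_policy", score))  # -> policy_chaser
```

The threshold trades off the two failure modes: set too low, the agent misapplies a known-type policy to a novel teammate; set too high, it wastes episode steps adapting the meta policy when a pre-trained policy would already cooperate well.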
Keywords
ad hoc teamwork,meta-reinforcement learning,self-play,agent types,online adapting