DSDF: Coordinated look-ahead strategy in multi-agent reinforcement learning with noisy agents

PROCEEDINGS OF THE 7TH JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE AND MANAGEMENT OF DATA, CODS-COMAD 2024 (2024)

Abstract
Existing multi-agent reinforcement learning methods that follow centralized training with decentralized execution (CTDE) train agents to learn a pattern of coordinated actions that yields an optimal joint policy. However, if during the execution phase some agents degrade and take noisy actions (actions deviating from those suggested by the policy) to varying degrees, these methods coordinate poorly. In this paper, we show how such random noise in agents, which may result from the degradation or aging of robots, adds uncertainty to coordination and thereby leads to unsatisfactory global rewards. In such a scenario, the agents that act in accordance with the policy must understand the behavior and limitations of the noisy agents, while the noisy agents must plan in cognizance of their own limitations. Our proposed method, Deep Stochastic Discount Factor (DSDF), tunes the discount factor for each agent uniquely based on its degree of degradation, thereby altering the agents' global planning. Moreover, since the degree of degradation of some agents is expected to change over time, our method provides a framework under which such changes can be addressed incrementally without extensive retraining. Results on benchmark environments show the efficacy of DSDF compared with existing approaches.
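The core idea of the abstract (a per-agent discount factor that shrinks with the agent's degradation, so noisier agents plan over shorter horizons) can be illustrated with a minimal sketch. This is not the paper's DSDF network; the mapping `degradation_to_gamma` and the tabular Q-learning update below are hypothetical stand-ins chosen only to show where an agent-specific discount factor enters the value update.

```python
import numpy as np

def degradation_to_gamma(p_noise, gamma_base=0.99):
    """Hypothetical mapping from an agent's noise/degradation level
    (probability of taking a random action) to its discount factor:
    the noisier the agent, the more heavily it discounts the future."""
    return gamma_base * (1.0 - p_noise)

def q_update(q, s, a, r, s_next, gamma_i, alpha=0.1):
    """One tabular Q-learning step for agent i, using that agent's
    unique discount factor gamma_i instead of a shared global gamma."""
    td_target = r + gamma_i * np.max(q[s_next])
    q[s, a] += alpha * (td_target - q[s, a])
    return q

# A healthy agent keeps a long horizon; a degraded one plans myopically.
gamma_healthy = degradation_to_gamma(0.0)   # 0.99
gamma_noisy = degradation_to_gamma(0.5)     # 0.495
```

In the paper itself, the per-agent discount factors are tuned by a learned (stochastic) mechanism rather than a fixed formula, and the agents are trained with deep networks under CTDE; the sketch only fixes the intuition that each agent's horizon is adapted to its reliability.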
Keywords
reinforcement learning, multi-agent reinforcement learning, discount factor, noisy agents