ROMAX: Certifiably Robust Deep Multiagent Reinforcement Learning via Convex Relaxation

IEEE International Conference on Robotics and Automation (2022)

Cited 16 | Views 20
Abstract
In a multirobot system, a number of cyber-physical attacks (e.g., communication hijack, observation perturbations) can challenge the robustness of agents. This robustness issue worsens in multiagent reinforcement learning because the environment is non-stationary: agents learn simultaneously, and their changing policies affect the transition and reward functions. In this paper, we propose a minimax MARL approach to infer the worst-case policy update of other agents. As the minimax formulation is computationally intractable to solve, we apply the convex relaxation of neural networks to solve the inner minimization problem. This convex relaxation enables robustness in interacting with peer agents that may have significantly different behaviors and also yields a certified bound on the original optimization problem. We evaluate our approach on multiple mixed cooperative-competitive tasks and show that our method outperforms previous state-of-the-art approaches on this topic.
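The abstract's core idea is to lower-bound the inner minimization (the worst-case peer behavior) with a convex relaxation of the value network rather than solving it exactly. The sketch below illustrates one such relaxation, interval bound propagation, on a small ReLU MLP; the network shapes, weights, and the choice of this particular relaxation are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def ibp_bounds(weights, biases, x, eps):
    """Propagate the box [x - eps, x + eps] through a ReLU MLP.

    Returns elementwise (lower, upper) bounds on the output; the lower
    bound certifies a floor on the value under any perturbation in the
    box, standing in for the intractable inner minimization.
    """
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(zip(weights, biases)):
        center = (lo + hi) / 2.0
        radius = (hi - lo) / 2.0
        new_center = W @ center + b
        new_radius = np.abs(W) @ radius  # worst-case spread per output unit
        lo, hi = new_center - new_radius, new_center + new_radius
        if i < len(weights) - 1:  # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

# Hypothetical 3-4-1 value network with random weights.
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((4, 3)), rng.standard_normal((1, 4))]
bs = [np.zeros(4), np.zeros(1)]
x = rng.standard_normal(3)

lo, hi = ibp_bounds(Ws, bs, x, eps=0.1)
nominal, _ = ibp_bounds(Ws, bs, x, eps=0.0)  # unperturbed forward pass
# Soundness: the certified interval must contain the nominal output.
assert np.all(lo <= nominal) and np.all(nominal <= hi)
```

The bound is cheap (two matrix products per layer) but loose; tighter linear relaxations trade compute for a better certificate, which is the usual design axis in certified-robustness methods.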
Keywords
ROMAX, robust deep multiagent reinforcement learning, convex relaxation, multirobot system, cyber-physical attacks, communication hijack, observation perturbations, robustness, non-stationarity, changing policies, reward functions, minimax MARL approach, worst-case policy update, minimax formulation, inner minimization problem, peer agents