Adversary Agnostic Robust Deep Reinforcement Learning

IEEE Transactions on Neural Networks and Learning Systems (2023)

Abstract
Deep reinforcement learning (DRL) policies have been shown to be deceived by perturbations (e.g., random noise or intentional adversarial attacks) on state observations that appear at test time but are unknown during training. To increase the robustness of DRL policies, previous approaches assume that explicit adversarial information can be added into the training process, so that the trained policy also generalizes to these perturbed observations. However, such approaches not only make robustness improvement more expensive but may also leave a model prone to other kinds of attacks in the wild. In contrast, we propose an adversary agnostic robust DRL paradigm that does not require learning from predefined adversaries. To this end, we first theoretically show that robustness can indeed be achieved independently of the adversaries in a policy distillation (PD) setting. Motivated by this finding, we propose a new PD loss with two terms: 1) a prescription gap maximization (PGM) loss that simultaneously maximizes the likelihood of the action selected by the teacher policy and the entropy over the remaining actions and 2) a corresponding Jacobian regularization (JR) loss that minimizes the magnitude of gradients with respect to the input state. The theoretical analysis substantiates that our distillation loss is guaranteed to increase the prescription gap and hence improve adversarial robustness. Furthermore, experiments on five Atari games firmly verify the superiority of our approach over state-of-the-art baselines.
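The exact loss formulation is given in the paper; as a rough illustration of the two terms described above, the following PyTorch-style sketch shows how a combined PGM + JR objective might be assembled. All names and weights here (`pgm_jr_loss`, `lambda_ent`, `lambda_jr`) are hypothetical placeholders, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def pgm_jr_loss(student_logits, teacher_action, state,
                lambda_ent=0.1, lambda_jr=0.01):
    """Hypothetical sketch of a PGM + JR distillation loss.

    student_logits: (B, A) action logits of the student policy, computed from `state`
    teacher_action: (B,) index of the action prescribed by the teacher policy
    state: (B, ...) input observations with requires_grad=True (part of the graph)
    """
    log_probs = F.log_softmax(student_logits, dim=-1)   # (B, A)
    probs = log_probs.exp()

    # PGM term 1: maximize the likelihood of the teacher-prescribed action
    nll = F.nll_loss(log_probs, teacher_action)

    # PGM term 2: maximize the entropy over the remaining (non-teacher) actions
    mask = torch.ones_like(probs)
    mask.scatter_(1, teacher_action.unsqueeze(1), 0.0)
    rest = (probs * mask)
    rest = rest / rest.sum(dim=1, keepdim=True).clamp_min(1e-8)
    rest_entropy = -(rest * rest.clamp_min(1e-8).log()).sum(dim=1).mean()

    # JR term: penalize the gradient magnitude of the selected-action logit w.r.t. the state
    selected = student_logits.gather(1, teacher_action.unsqueeze(1)).sum()
    grads, = torch.autograd.grad(selected, state, create_graph=True)
    jr = grads.pow(2).sum(dim=tuple(range(1, grads.dim()))).mean()

    # Minimizing this quantity maximizes the prescription gap while damping input sensitivity
    return nll - lambda_ent * rest_entropy + lambda_jr * jr
```

Note that this sketch assumes the student logits are produced inside the same autograd graph as `state`, so the Jacobian term can be backpropagated; the relative weights of the entropy and JR terms are tunable hyperparameters.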
Keywords
Robustness, Training, Perturbation methods, Reinforcement learning, Jacobian matrices, Games, Entropy, Adversarial robustness, adversary agnostic, Atari games, deep reinforcement learning (DRL)