Communication-Efficient and Resilient Distributed Q-Learning

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2024)

Abstract
This article investigates the problem of communication-efficient and resilient multiagent reinforcement learning (MARL). Specifically, we consider a setting where a set of agents are interconnected over a given network and can only exchange information with their neighbors. Each agent observes a common Markov decision process and has a local cost that is a function of the current system state and the applied control action. The goal of MARL is for all agents to learn a policy that optimizes the infinite-horizon discounted average of all their costs. Within this general setting, we consider two extensions to existing MARL algorithms. First, we provide an event-triggered learning rule where agents exchange information with their neighbors only if a certain triggering condition is satisfied. We show that this enables learning while reducing the amount of communication. Next, we consider the scenario where some of the agents can be adversarial (as captured by the Byzantine attack model) and arbitrarily deviate from the prescribed learning algorithm. We establish a fundamental trade-off between optimality and resilience when Byzantine agents are present. We then develop a resilient algorithm and show almost sure convergence of all reliable agents' value functions to a neighborhood of the optimal value function of all reliable agents, under certain conditions on the network topology. When the optimal Q-values are sufficiently separated for different actions, we show that all reliable agents can learn the optimal policy under our algorithm.
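To make the two ideas in the abstract concrete, the following is a minimal illustrative sketch, not the paper's actual algorithm: tabular Q-learning agents on a shared toy MDP that (i) broadcast their Q-tables only when they have drifted past an assumed threshold since the last transmission (event-triggered communication) and (ii) fuse received tables with a coordinate-wise trimmed mean to tolerate up to f Byzantine neighbors. All names and parameters (trigger_threshold, f, the fusion weights, the complete communication graph) are assumptions for illustration and do not reproduce the paper's triggering condition or resilient update rule.

```python
# Hedged sketch: event-triggered, trimmed-mean-resilient distributed Q-learning
# on a randomly generated common MDP. Purely illustrative; not the paper's method.
import numpy as np

rng = np.random.default_rng(0)

n_agents, n_states, n_actions = 5, 4, 2
gamma, alpha = 0.9, 0.1
trigger_threshold = 0.05   # assumed event-triggering threshold
f = 1                      # assumed bound on Byzantine neighbors per agent

# Per-agent Q-tables and the copy each agent last broadcast.
Q = [np.zeros((n_states, n_actions)) for _ in range(n_agents)]
last_broadcast = [q.copy() for q in Q]

# Toy common MDP: random transition kernel, agent-specific costs.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
costs = rng.uniform(0.0, 1.0, size=(n_agents, n_states, n_actions))

def trimmed_mean(values, f):
    """Coordinate-wise trimmed mean: drop the f largest and f smallest entries."""
    stacked = np.sort(np.stack(values), axis=0)
    return stacked[f:len(values) - f].mean(axis=0)

state = 0
for step in range(2000):
    action = rng.integers(n_actions)                 # random exploration policy
    next_state = rng.choice(n_states, p=P[state, action])

    # Local temporal-difference update at every agent (costs are minimized).
    for i in range(n_agents):
        target = costs[i, state, action] + gamma * Q[i][next_state].min()
        Q[i][state, action] += alpha * (target - Q[i][state, action])

    # Event-triggered broadcast: communicate only if the local table has
    # drifted far enough from the last transmitted copy.
    broadcasts = {}
    for i in range(n_agents):
        if np.abs(Q[i] - last_broadcast[i]).max() > trigger_threshold:
            broadcasts[i] = Q[i].copy()
            last_broadcast[i] = Q[i].copy()

    # Resilient fusion over a complete graph: each agent trims extremes among
    # the received tables before mixing the result with its own estimate.
    for i in range(n_agents):
        received = [q for j, q in broadcasts.items() if j != i]
        if len(received) > 2 * f:
            fused = trimmed_mean(received, f)
            Q[i] = 0.5 * Q[i] + 0.5 * fused

    state = next_state

print("Agent 0 greedy (cost-minimizing) policy:", Q[0].argmin(axis=1))
```

In this sketch, the trimmed mean discards the f most extreme received values per Q-table entry, which is one standard way to bound the influence of Byzantine neighbors; the event trigger trades a small amount of staleness in the exchanged tables for fewer transmissions, mirroring the communication-versus-learning trade-off described in the abstract.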
Keywords
Event-triggered communication, multiagent systems, reinforcement learning, resilience