Multi-User Delay-Constrained Scheduling With Deep Recurrent Reinforcement Learning

IEEE/ACM Transactions on Networking (2024)

Abstract
Multi-user delay-constrained scheduling is a crucial challenge in various real-world applications, such as wireless communication, live streaming, and cloud computing. The scheduler must make real-time decisions to guarantee both delay and resource constraints simultaneously, without prior information on system dynamics, which can be time-varying and challenging to estimate. Additionally, many practical scenarios suffer from partial observability due to sensing noise or hidden correlations. To address these challenges, we propose a deep reinforcement learning (DRL) algorithm called Recurrent Softmax Delayed Deep Double Deterministic Policy Gradient ($\mathtt{RSD4}$) (https://github.com/hupihe/RSD4), a data-driven method based on a Partially Observed Markov Decision Process (POMDP) formulation. $\mathtt{RSD4}$ guarantees resource and delay constraints via a Lagrangian dual and delay-sensitive queues, respectively. It also efficiently handles partial observability with a memory mechanism enabled by a recurrent neural network (RNN). Moreover, it introduces user-level decomposition and node-level merging to support large-scale multihop scenarios. Extensive experiments on simulated and real-world datasets demonstrate that $\mathtt{RSD4}$ is robust to system dynamics and partially observable environments and achieves superior performance over existing methods.
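To make the two mechanisms named in the abstract concrete, the following is a minimal sketch, not taken from the authors' repository (https://github.com/hupihe/RSD4): a GRU-based recurrent Q-network that keeps a memory over partial observations, and a Lagrangian multiplier updated by dual ascent to price resource usage. All class names, tensor shapes, and the budget and learning-rate parameters are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch (not the authors' RSD4 code). Illustrates (1) a recurrent
# critic for partially observed inputs and (2) dual ascent on a Lagrange
# multiplier for an average resource-usage constraint. Names and shapes
# are assumptions for illustration only.
import torch
import torch.nn as nn


class RecurrentQNet(nn.Module):
    """GRU-based critic: maps an observation sequence to per-action Q-values."""

    def __init__(self, obs_dim: int, num_actions: int, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_actions)

    def forward(self, obs_seq, h0=None):
        # obs_seq: (batch, time, obs_dim); the GRU hidden state acts as the
        # memory that compensates for partial observability.
        out, h_t = self.gru(obs_seq, h0)
        return self.head(out), h_t  # Q-values at every step, final hidden state


class LagrangeMultiplier:
    """Dual ascent on lambda >= 0 for an average resource-usage constraint."""

    def __init__(self, budget: float, lr: float = 1e-3):
        self.budget = budget  # allowed average resource consumption (assumed)
        self.lr = lr
        self.lmbda = 0.0

    def penalized_reward(self, reward: float, resource_used: float) -> float:
        # Lagrangian relaxation: reward minus priced resource usage.
        return reward - self.lmbda * resource_used

    def update(self, avg_resource_used: float) -> None:
        # Raise the price when the constraint is violated, lower it otherwise.
        self.lmbda = max(0.0, self.lmbda + self.lr * (avg_resource_used - self.budget))


if __name__ == "__main__":
    q_net = RecurrentQNet(obs_dim=8, num_actions=4)
    dual = LagrangeMultiplier(budget=1.0)

    obs_seq = torch.randn(2, 10, 8)        # (batch=2, time=10, obs_dim=8)
    q_values, _ = q_net(obs_seq)
    print(q_values.shape)                  # torch.Size([2, 10, 4])

    dual.update(avg_resource_used=1.3)     # over budget -> lambda increases
    print(round(dual.lmbda, 6))
```

In this sketch the multiplier update runs once per training iteration on the measured average resource usage, so the penalized reward steers the recurrent policy toward the resource budget while the delay requirement would be handled separately (e.g., by the delay-sensitive queue construction described in the paper).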
Keywords
Delay-constrained, scheduling, partial observability, deep reinforcement learning