DRL Empowered On-policy and Off-policy ABR for 5G Mobile Ultra-HD Video Delivery

Mobile Networks and Applications (2024)

Abstract
Fifth generation (5G) and beyond-5G networks support high-throughput ultra-high definition (UHD) video applications. This paper examines the use of dynamic adaptive streaming over HTTP (DASH) to deliver UHD videos from servers to 5G-capable devices. Because wireless network conditions are highly dynamic, providing a high quality of experience (QoE) for UHD video delivery is particularly challenging. Adaptive bit rate (ABR) algorithms are therefore used to adapt the video bit rate to the prevailing network conditions. Most existing ABR algorithms are based on predetermined rules and consequently do not generalize to a broad variety of network conditions. Recent research has shown that ABR algorithms powered by deep reinforcement learning (DRL), in particular the vanilla asynchronous advantage actor-critic (A3C) method, are more effective at generalizing to different network conditions. However, they have limitations, such as a lag between behavior and target policies, sample inefficiency, and sensitivity to the environment's randomness. In this paper, we propose the design and implementation of two DRL-empowered ABR algorithms: (i) the on-policy proximal policy optimization adaptive bit rate (PPO-ABR), and (ii) the off-policy soft actor-critic adaptive bit rate (SAC-ABR). We evaluate the proposed algorithms using 5G traces from the Lumos 5G dataset and show that, by exploiting specific properties of on-policy and off-policy methods, they perform much better than vanilla A3C for different variations of QoE metrics.
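The abstract does not give the paper's exact QoE formulation, but DRL-based ABR work in this line of research typically optimizes a linear QoE metric: the sum of per-chunk video quality, minus a rebuffering penalty, minus a smoothness penalty on bitrate switches. The sketch below illustrates that common formulation; the function name, the weight `mu`, and the example values are illustrative assumptions, not taken from the paper.

```python
def qoe_linear(bitrates, rebuffer_times, mu=4.3):
    """Common linear QoE metric for ABR (an assumed formulation, not
    necessarily the paper's): total chunk quality (here, bitrate in Mbps)
    minus a rebuffering penalty weighted by mu, minus a smoothness
    penalty on consecutive bitrate switches."""
    quality = sum(bitrates)
    rebuffer_penalty = mu * sum(rebuffer_times)
    smoothness_penalty = sum(
        abs(b_next - b_prev)
        for b_prev, b_next in zip(bitrates, bitrates[1:])
    )
    return quality - rebuffer_penalty - smoothness_penalty

# Hypothetical example: three UHD-range chunks with one 0.5 s stall.
print(qoe_linear([8.0, 16.0, 16.0], [0.0, 0.5, 0.0]))
```

A DRL agent (A3C, PPO, or SAC) would receive this per-chunk QoE term as its reward, so maximizing the return corresponds to balancing quality, rebuffering, and stability under varying network throughput.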
Keywords
Deep reinforcement learning, Actor-critic methods, Quality of experience (QoE), Adaptive bit rates (ABR), Video streaming