Deep Reinforcement Learning for Improving Resource Utilization Efficiency of URLLC With Imperfect Channel State Information
IEEE Wireless Communications Letters (2023)
Abstract
This letter optimizes the resource blocks allocated to channel estimation and data transmission in multiple-input single-output (MISO) ultra-reliable low-latency communication (URLLC) systems, with the goal of improving resource utilization efficiency subject to a reliability constraint. Because wireless channels are correlated in the temporal domain and channel estimation is not error-free, the problem is formulated as a partially observable Markov decision process (POMDP), i.e., a sequential decision-making problem with partial observations. To solve this problem, we develop a constrained deep reinforcement learning (DRL) algorithm, namely Cascaded-Action Twin Delayed Deep Deterministic policy (CA-TD3). We train the policy using a primal-domain method and compare it against a primal-dual method and an existing benchmark. Two channel models are considered for evaluation: a first-order autoregressive correlated channel model and the clustered delay line (CDL) channel model. Simulation results show that the proposed primal CA-TD3 method converges faster than the primal-dual method and improves resource utilization efficiency by over 30% relative to the benchmark.
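The first-order autoregressive model mentioned in the abstract is a standard way to generate temporally correlated fading: each slot's channel is a weighted sum of the previous channel and an independent innovation, with the weight controlling how fast the channel decorrelates. A minimal sketch is below; the correlation coefficient `rho`, the antenna count, and unit average channel power are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def ar1_channel(num_slots, num_antennas=4, rho=0.9, seed=0):
    """Generate a temporally correlated MISO channel via an AR(1) process.

    Update rule (Gauss-Markov model): h_t = rho * h_{t-1} + sqrt(1 - rho^2) * e_t,
    where e_t is unit-power complex Gaussian noise, so the stationary channel
    power stays at 1 and the lag-1 temporal correlation equals rho.
    """
    rng = np.random.default_rng(seed)
    scale = np.sqrt(0.5)  # per-component std for unit-power complex Gaussians

    def cgauss():
        return (rng.normal(scale=scale, size=num_antennas)
                + 1j * rng.normal(scale=scale, size=num_antennas))

    h = cgauss()  # initial channel draw from the stationary distribution
    channels = [h]
    for _ in range(num_slots - 1):
        h = rho * h + np.sqrt(1 - rho**2) * cgauss()  # AR(1) update
        channels.append(h)
    return np.array(channels)  # shape: (num_slots, num_antennas)

H = ar1_channel(num_slots=1000)
```

In a POMDP simulation, the agent would not observe `H` directly; it would see a noisy estimate whose quality depends on how many resource blocks were spent on channel estimation, which is exactly the trade-off the letter optimizes.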
Key words
Ultra-reliable and low-latency communications, deep reinforcement learning, resource allocation