Joint Resource Allocation for Time-Varying Underwater Acoustic Communication System: A Self-Reflection Adversarial Bandit Approach

IEEE Transactions on Systems, Man, and Cybernetics: Systems (2024)

Abstract
This study addresses a joint channel selection and power allocation problem for time-varying underwater acoustic communication (UWAC) systems. Without any prior channel information, designing a highly adaptable resource allocation algorithm that copes with a fast time-varying environment is very challenging. To address this issue, a hierarchical learning approach that combines adversarial multiarmed bandit (MAB) theory with outdated pilot-based feedback information is proposed. The proposed approach optimizes the joint resource allocation strategy online without any prior channel state information (CSI). Specifically, a hierarchical self-reflection learning structure is proposed to offer different learning manners and spaces for the actually played information and the outdated feedback information, thereby balancing exploitation and exploration to cope with the time-varying environment effectively. Further, an integration learning structure is proposed to alleviate the solving difficulty and policy explosion of the joint multiple-substrategy problem. The user can rapidly obtain a few superior strategies in a low-dimensional space and then efficiently search for the expected optimal strategy in the high-dimensional space; as a result, the learning efficiency is significantly improved. Owing to the elaborate learning structures, the proposed algorithms show strong tolerance to delay and incomplete information. The superiority of the proposed algorithms is demonstrated through numerical results.
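For orientation, the adversarial MAB setting named in the abstract can be illustrated with EXP3, a standard adversarial bandit algorithm. The sketch below is not the paper's hierarchical self-reflection method: it treats every (channel, power level) pair as one flat arm, assumes immediate rather than outdated feedback, and uses a hypothetical `toy_reward` function with rewards in [0, 1]. The function name, the parameters, and the value `gamma=0.1` are all assumptions made for illustration.

```python
import math
import random


def exp3_joint_selection(reward_fn, n_channels, n_power_levels, horizon,
                         gamma=0.1, seed=0):
    """Run EXP3 over flat joint (channel, power level) arms.

    reward_fn(t, channel, power) must return a reward in [0, 1].
    Returns the list of arms and how often each one was played.
    """
    rng = random.Random(seed)
    # Joint action space: its size is n_channels * n_power_levels.
    arms = [(c, p) for c in range(n_channels) for p in range(n_power_levels)]
    k = len(arms)
    log_weights = [0.0] * k  # weights kept in the log domain for stability
    counts = [0] * k

    for t in range(horizon):
        # Mix the exponential-weight distribution with uniform exploration.
        m = max(log_weights)
        exp_w = [math.exp(w - m) for w in log_weights]
        total = sum(exp_w)
        probs = [(1.0 - gamma) * w / total + gamma / k for w in exp_w]

        # Sample one arm and observe only its reward (bandit feedback).
        i = rng.choices(range(k), weights=probs)[0]
        channel, power = arms[i]
        reward = reward_fn(t, channel, power)

        # Importance-weighted reward estimate, then exponential update.
        estimate = reward / probs[i]
        log_weights[i] += gamma * estimate / k
        counts[i] += 1

    return arms, counts


def toy_reward(t, channel, power):
    """Hypothetical reward: one joint arm is clearly better than the rest."""
    return 0.9 if (channel, power) == (1, 2) else 0.2


arms, counts = exp3_joint_selection(toy_reward, n_channels=2,
                                    n_power_levels=3, horizon=3000)
most_played = arms[counts.index(max(counts))]
print("most played (channel, power):", most_played)
```

Because the flat arm set grows as the product of the number of channels and power levels, this kind of enumeration shows the policy explosion that the abstract's integration learning structure is intended to alleviate.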
Keywords
Joint resource allocation, machine learning, underwater acoustic communication (UWAC) system