Restless Bandits for Sensor Scheduling in Energy Constrained Networks

2022 Eighth Indian Control Conference (ICC)

Abstract
We consider the problem of sensor scheduling in an energy constrained network, modeled as a restless multi-armed bandit with dynamically available arms. Each arm represents a sensor whose availability fluctuates because of its limited energy. The data transmission rate depends on channel quality, so sensor scheduling is a sequential decision problem that must account for both the evolution of channel quality and the fluctuation of the sensors' energy levels. When a sensor with available energy is scheduled, it yields a data rate determined by its channel quality; this is the immediate reward. Channel quality is modeled as a two-state Markov chain, where the higher state corresponds to better quality and hence a higher immediate reward. Sensors that are not scheduled yield no reward, and sensors without energy cannot be scheduled. Moreover, the channel quality of a sensor is not directly observable to the decision maker; only the signals received after data transmissions are observable, making this a partially observable restless bandit problem. The decision maker's objective is to maximize the infinite-horizon discounted cumulative reward by sequentially scheduling sensors. We study Whittle's index policy and describe an algorithm to compute the index. We also study an online rollout policy and analyze its computational complexity. Simulation examples compare the performance of the index policy, the rollout policy, and a myopic policy.
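To make the setup concrete, the sketch below simulates the kind of model the abstract describes: each sensor's channel follows a two-state Markov chain, the decision maker only observes the channel state of the sensor it schedules, so it maintains a belief (probability of the good state) per arm, and a simple myopic baseline schedules the available arm with the highest belief. All numerical values (transition probabilities, rewards, energy budgets) and the myopic rule itself are illustrative assumptions for exposition, not parameters or the index/rollout algorithms from the paper.

```python
# Minimal sketch (assumed parameters, not the paper's implementation):
# two-state Markov channels, per-arm belief tracking, energy-limited arms,
# and a myopic scheduling baseline.
import numpy as np

rng = np.random.default_rng(0)

N = 5                 # number of sensors (arms)
T = 200               # simulated horizon
beta = 0.95           # discount factor
p11, p01 = 0.8, 0.3   # P(good -> good), P(bad -> good) -- assumed values
r_good, r_bad = 1.0, 0.2  # immediate data-rate reward in each channel state

channel = rng.integers(0, 2, size=N)        # hidden channel states (1 = good)
belief = np.full(N, p01 / (1 - p11 + p01))  # stationary prior on the good state
energy = rng.integers(3, 8, size=N)         # remaining transmissions per sensor

total, discount = 0.0, 1.0
for t in range(T):
    available = energy > 0
    if not available.any():
        break
    # Myopic rule: schedule the available arm with the highest expected
    # immediate reward, i.e. the largest belief of being in the good state.
    scores = np.where(available, belief, -np.inf)
    a = int(np.argmax(scores))

    # Collect the reward and observe the scheduled sensor's channel state.
    obs = int(channel[a])
    total += discount * (r_good if obs else r_bad)
    energy[a] -= 1

    # Belief update: the scheduled arm's state was observed, so its next belief
    # is the one-step transition probability from the observed state; all other
    # arms' beliefs evolve through the Markov chain without observation.
    belief = belief * p11 + (1 - belief) * p01
    belief[a] = p11 if obs else p01

    # Hidden channels evolve independently of the scheduling decision.
    channel = (rng.random(N) < np.where(channel == 1, p11, p01)).astype(int)
    discount *= beta

print(f"discounted cumulative reward (myopic policy): {total:.3f}")
```

The Whittle index and rollout policies studied in the paper would replace the `scores` line with an index computed per arm from its belief, or with a lookahead that simulates future trajectories under a base policy; the belief and energy bookkeeping stays the same.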