A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health
CoRR(2024)
摘要
Efforts to reduce maternal mortality rate, a key UN Sustainable Development
target (SDG Target 3.1), rely largely on preventative care programs to spread
critical health information to high-risk populations. These programs face two
important challenges: efficiently allocating limited health resources to large
beneficiary populations, and adapting to evolving policy priorities. While
prior works in restless multi-armed bandit (RMAB) demonstrated success in
public health allocation tasks, they lack flexibility to adapt to evolving
policy priorities. Concurrently, Large Language Models (LLMs) have emerged as
adept, automated planners in various domains, including robotic control and
navigation. In this paper, we propose DLM: a Decision Language Model for RMABs.
To enable dynamic fine-tuning of RMAB policies for challenging public health
settings using human-language commands, we propose using LLMs as automated
planners to (1) interpret human policy preference prompts, (2) propose code
reward functions for a multi-agent RL environment for RMABs, and (3) iterate on
the generated reward using feedback from RMAB simulations to effectively adapt
policy outcomes. In collaboration with ARMMAN, an India-based public health
organization promoting preventative care for pregnant mothers, we conduct a
simulation study, showing DLM can dynamically shape policy outcomes using only
human language commands as input.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要