Reinforcement learning for humanitarian relief distribution with trucks and UAVs under travel time uncertainty

Transportation Research Part C: Emerging Technologies (2023)

Abstract
Effective humanitarian relief operations are challenging in the aftermath of disasters, as trucks often face considerable travel time uncertainty due to damaged transportation networks. Efficient deployment of Unmanned Aerial Vehicles (UAVs) can mitigate this problem by supplementing truck fleets. To plan last-mile relief distribution in this setting, we introduce a multi-trip, split-delivery vehicle routing problem with trucks and UAVs, soft time windows, and stochastic travel times, formulated as a stochastic dynamic program. Within a finite time horizon, we aim to maximize a weighted objective function comprising the number of goods delivered, the number of different locations visited, and late arrival penalties. Our study offers insights into dealing with travel time uncertainty in humanitarian logistics by (i) deploying UAVs as partial substitutes for trucks, (ii) evaluating dynamic solutions generated by two deep reinforcement learning (RL) approaches, specifically value function approximation (VFA) and policy function approximation (PFA), and (iii) comparing the RL solutions with solutions stemming from mathematical programming and dynamic heuristics. Experiments are performed on both Solomon-based instances and two real-world cases. The real-world cases, the 2015 Nepal earthquake and the 2018 Indonesia tsunami, are based on locally collected field data and real-world UAV specifications, and aim to provide practical insights. The experimental results show that dynamic decision-making improves both the performance and the robustness of humanitarian operations, achieving reductions in lateness penalties of around 85% compared to static solutions based on expected travel times. Furthermore, the results show that replacing half of the trucks with UAVs improves the weighted objective value by 11% to 56%, benefitting both reliability and location coverage. The results indicate that both the deployment of UAVs and the use of dynamic methods successfully mitigate travel time uncertainties in humanitarian operations.
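The weighted objective described in the abstract can be illustrated with a minimal sketch. The abstract names only the three components (goods delivered, distinct locations visited, late arrival penalties); the weights, the linear lateness penalty, and all names below (Delivery, weighted_objective, w_goods, w_locations, w_late) are hypothetical choices made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delivery:
    """One completed visit (hypothetical record layout, not from the paper)."""
    location: str   # demand point served
    goods: int      # units delivered on this visit
    arrival: float  # realized arrival time
    due: float      # end of the soft time window


def weighted_objective(deliveries, w_goods=1.0, w_locations=1.0, w_late=1.0):
    """Goods delivered plus distinct locations visited, minus a lateness penalty.

    Assumption: lateness is penalized linearly in the time past the soft
    window; the paper may define the penalty differently.
    """
    goods = sum(d.goods for d in deliveries)
    # Split deliveries can visit a location more than once; count it once.
    locations = len({d.location for d in deliveries})
    lateness = sum(max(0.0, d.arrival - d.due) for d in deliveries)
    return w_goods * goods + w_locations * locations - w_late * lateness


# Two visits to A (split delivery) and one late visit to B.
example = [
    Delivery("A", 10, arrival=5.0, due=6.0),
    Delivery("B", 4, arrival=9.0, due=7.0),
    Delivery("A", 3, arrival=12.0, due=12.0),
]
score = weighted_objective(example, w_goods=1.0, w_locations=5.0, w_late=2.0)
```

In this sketch, raising w_locations relative to w_goods rewards spreading deliveries across more demand points, while w_late controls how strongly violations of the soft time windows are discouraged.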
Keywords
Humanitarian logistics, Last-mile relief distribution, Travel time uncertainty, UAVs, Reinforcement learning, Comparative analysis