Verified Safe Reinforcement Learning for Neural Network Dynamic Models
CoRR (2024)
Abstract
Learning reliably safe autonomous control is one of the core problems in
trustworthy autonomy. However, training a controller that can be formally
verified to be safe remains a major challenge. We introduce a novel approach
for learning verified safe control policies in nonlinear neural dynamical
systems while maximizing overall performance. Our approach aims to achieve
safety in the sense of finite-horizon reachability proofs, and comprises
three key parts. The first is a novel curriculum learning scheme that
iteratively increases the verified safe horizon. The second exploits the
iterative nature of gradient-based learning to enable incremental
verification, reusing information from prior verification runs. Finally, we
learn multiple verified initial-state-dependent controllers, an idea that is
especially valuable for more complex domains where learning a single universal
verified safe controller is extremely challenging. Our experiments on five safe
control problems demonstrate that our trained controllers can achieve verified
safety over horizons that are as much as an order of magnitude longer than
state-of-the-art baselines, while maintaining high reward, as well as a perfect
safety record over entire episodes.
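
The curriculum idea (verify safety over a short horizon first, then lengthen the horizon once verification succeeds) can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a hypothetical 1-D linear system x' = (a - b*k) * x with a single scalar gain k, uses plain interval propagation as the reachability check, and replaces gradient-based learning with a crude "increase the gain until the check passes" update. It also ignores reward, incremental verification, and initial-state-dependent controllers. All names and parameters (`reach_interval`, `verified_safe`, `curriculum_train`, `a`, `b`, `step`) are made up for illustration.

```python
def reach_interval(lo, hi, k, horizon, a=1.2, b=1.0):
    """Propagate the state interval [lo, hi] through x' = (a - b*k) * x.

    Returns the list of interval bounds for steps 0..horizon.
    The toy dynamics and the constants a, b are assumptions.
    """
    c = a - b * k
    bounds = [(lo, hi)]
    for _ in range(horizon):
        # Scaling by c may flip the interval when c is negative, so re-sort.
        lo, hi = sorted((c * lo, c * hi))
        bounds.append((lo, hi))
    return bounds


def verified_safe(lo, hi, k, horizon, limit):
    """Finite-horizon check: every reachable interval stays inside [-limit, limit]."""
    return all(-limit <= l and h <= limit
               for l, h in reach_interval(lo, hi, k, horizon))


def curriculum_train(lo, hi, limit, target_horizon, k=0.0, step=0.05,
                     max_updates=1000):
    """Grow the verified horizon one step at a time.

    The inner loop is a placeholder for a learning update: it simply
    raises the gain until the current horizon verifies.
    """
    updates = 0
    for horizon in range(1, target_horizon + 1):
        while not verified_safe(lo, hi, k, horizon, limit):
            k += step
            updates += 1
            if updates > max_updates:
                raise RuntimeError("failed to verify horizon %d" % horizon)
    return k


if __name__ == "__main__":
    gain = curriculum_train(-1.0, 1.0, limit=2.0, target_horizon=20)
    print("gain:", gain, "verified:", verified_safe(-1.0, 1.0, gain, 20, 2.0))
```

Because each horizon is only attempted after all shorter ones verify, the gain changes only when the longer horizon actually exposes a violation, which is the intuition behind lengthening the verified horizon gradually rather than targeting the full horizon from the start.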