Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions
CoRR (2024)
Abstract
Deep generative models (DGMs) have demonstrated great success across various
domains, particularly in generating text, images, and videos using models
trained from offline data. Similarly, data-driven decision-making and robotic
control also necessitate learning a generator function from the offline data to
serve as the strategy or policy. In this case, applying deep generative models
in offline policy learning exhibits great potential, and numerous studies have
explored this direction. However, the field still lacks a comprehensive
review, so its different branches have developed relatively independently.
Thus, we provide the first systematic review of the applications of deep
generative models for offline policy learning. In particular, we cover five
mainstream deep generative models, including Variational Auto-Encoders,
Generative Adversarial Networks, Normalizing Flows, Transformers, and Diffusion
Models, and their applications in both offline reinforcement learning (offline
RL) and imitation learning (IL). Offline RL and IL are two main branches of
offline policy learning and are widely adopted techniques for sequential
decision-making. Specifically, for each type of DGM-based offline policy
learning, we distill its fundamental scheme, categorize related works based on
the usage of the DGM, and trace the development of algorithms in
that field. Following the main content, we summarize with in-depth discussions of
deep generative models and offline policy learning, based on which
we present our perspectives on future research directions. This work offers a
hands-on reference for the research progress in deep generative models for
offline policy learning, and aims to inspire improved DGM-based offline RL or
IL algorithms.