Generalized Predictive Model for Autonomous Driving
CVPR 2024 (2024)
Abstract
In this paper, we introduce the first large-scale video prediction model for
autonomous driving. To remove the constraint of high-cost data collection
and improve the generalization ability of our model, we acquire
massive data from the web and pair it with diverse and high-quality text
descriptions. The resulting dataset contains over 2000 hours of driving
videos, spanning areas all over the world with diverse weather conditions and
traffic scenarios. Inheriting the merits from recent latent diffusion models,
our model, dubbed GenAD, handles the challenging dynamics in driving scenes
with novel temporal reasoning blocks. We show that it can generalize to
various unseen driving datasets in a zero-shot manner, surpassing general or
driving-specific video prediction counterparts. Furthermore, GenAD can be
adapted into an action-conditioned prediction model or a motion planner,
holding great potential for real-world driving applications.