Valeo4Cast: A Modular Approach to End-to-End Forecasting
arXiv (2024)
Abstract
Motion forecasting is crucial in autonomous driving systems to anticipate the
future trajectories of surrounding agents such as pedestrians, vehicles, and
traffic signals. In end-to-end forecasting, the model must jointly detect the
positions and past trajectories of the scene's elements from sensor data
(cameras or LiDARs) and predict their future locations. We depart
from the current trend of tackling this task via end-to-end training from
perception to forecasting and we use a modular approach instead. Following a
recent study, we individually build and train detection, tracking, and
forecasting modules. We then use only consecutive finetuning steps to better
integrate the modules and alleviate compounding errors. Our study reveals that
this simple yet effective approach significantly improves performance on the
end-to-end forecasting benchmark. Consequently, our solution ranks first in the
Argoverse 2 End-to-End Forecasting Challenge held at the CVPR 2024 Workshop on
Autonomous Driving (WAD), with 63.82 mAPf. In forecasting, we outperform last
year's winner by 17.1 points and this year's runner-up by 13.3 points. This
remarkable forecasting performance can be explained by our
modular paradigm, which integrates finetuning strategies and significantly
outperforms the end-to-end-trained counterparts.