Street Gaussians for Modeling Dynamic Urban Scenes
CoRR(2024)
摘要
This paper aims to tackle the problem of modeling dynamic urban street scenes
from monocular videos. Recent methods extend NeRF by incorporating tracked
vehicle poses to animate vehicles, enabling photo-realistic view synthesis of
dynamic urban street scenes. However, significant limitations are their slow
training and rendering speed, coupled with the critical need for high precision
in tracked vehicle poses. We introduce Street Gaussians, a new explicit scene
representation that tackles all these limitations. Specifically, the dynamic
urban street is represented as a set of point clouds equipped with semantic
logits and 3D Gaussians, each associated with either a foreground vehicle or
the background. To model the dynamics of foreground object vehicles, each
object point cloud is optimized with optimizable tracked poses, along with a
dynamic spherical harmonics model for the dynamic appearance. The explicit
representation allows easy composition of object vehicles and background, which
in turn allows for scene editing operations and rendering at 133 FPS
(1066×1600 resolution) within half an hour of training. The proposed
method is evaluated on multiple challenging benchmarks, including KITTI and
Waymo Open datasets. Experiments show that the proposed method consistently
outperforms state-of-the-art methods across all datasets. Furthermore, the
proposed representation delivers performance on par with that achieved using
precise ground-truth poses, despite relying only on poses from an off-the-shelf
tracker. The code is available at https://zju3dv.github.io/street_gaussians/.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要