Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection
arXiv (2024)
Abstract
Multi-view 3D object detection systems often struggle to produce precise
predictions because estimating depth from images is difficult, which leads to
redundant and incorrect detections. Our paper presents Ray Denoising, an
innovative method that enhances detection accuracy by strategically sampling
along camera rays to construct hard negative examples. These examples, visually
challenging to differentiate from true positives, compel the model to learn
depth-aware features, thereby improving its capacity to distinguish between
true and false positives. Ray Denoising is designed as a plug-and-play module,
compatible with any DETR-style multi-view 3D detector, and it only minimally
increases training computational costs without affecting inference speed. Our
comprehensive experiments, including detailed ablation studies, consistently
demonstrate that Ray Denoising outperforms strong baselines across multiple
datasets. It achieves a 1.9% improvement in mean Average Precision (mAP) over
the state-of-the-art StreamPETR method on the nuScenes dataset. It also shows
significant performance gains on the Argoverse 2 dataset, highlighting its
generalization capability. The code will be available at
https://github.com/LiewFeng/RayDN.
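
To make the idea of sampling along camera rays concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single camera center and a single ground-truth object center, and it draws depth offsets from a uniform distribution; the abstract does not specify the sampling distribution, so that choice, the function name `sample_ray_queries`, and the parameters `num_negatives`, `min_offset`, and `max_offset` are all hypothetical. Each negative lies on the same ray as the true object, so it projects to the same image location and differs only in depth.

```python
import math
import random


def sample_ray_queries(cam_center, obj_center, num_negatives=3,
                       min_offset=1.0, max_offset=5.0, rng=None):
    """Return [(point, is_positive), ...] lying on the ray camera -> object.

    Hypothetical helper: the uniform offset distribution and all parameter
    names are illustrative assumptions, not taken from the paper.
    """
    rng = rng or random.Random(0)

    # Ray direction from the camera center through the object center.
    direction = [o - c for o, c in zip(obj_center, cam_center)]
    depth = math.sqrt(sum(d * d for d in direction))
    if depth == 0:
        raise ValueError("object center coincides with camera center")
    unit = [d / depth for d in direction]

    # The ground-truth center is the single positive example.
    queries = [(tuple(obj_center), True)]

    for _ in range(num_negatives):
        # Shift the depth by at least min_offset so the negative is
        # clearly separated from the positive along the ray.
        offset = rng.uniform(min_offset, max_offset)
        sign = rng.choice((-1.0, 1.0))
        new_depth = depth + sign * offset
        if new_depth <= 0:
            # Keep the sample in front of the camera.
            new_depth = depth + offset
        point = tuple(c + new_depth * u for c, u in zip(cam_center, unit))
        queries.append((point, False))

    return queries


if __name__ == "__main__":
    for point, is_positive in sample_ray_queries((0.0, 0.0, 0.0),
                                                 (10.0, 0.0, 0.0)):
        print("positive" if is_positive else "negative", point)
```

In a DETR-style detector, such points would typically serve as extra training-time queries, with the positive matched to the ground-truth box and the negatives supervised as background; that wiring is outside the scope of this sketch.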