Efficient flexible voxel-based two-stage network for 3D object detection in autonomous driving

Applied Soft Computing (2024)

Abstract
3D object detection from LiDAR point clouds plays an important role in autonomous driving. Balancing inference speed and detection accuracy is difficult because point cloud data is large and stored in an unstructured form, which makes its features hard to represent. To address this challenge, we propose a two-stage point cloud object detector, AFV-RCNN. In stage 1, we introduce an attention flexible voxel feature encoding layer, which uses flexible voxels to speed up feature encoding and applies voxel attention to focus on the foreground points to be detected. In stage 2, we design a multi-level, grid-based multi-scale RoI (Region of Interest) feature fusion module, which extracts complete 3D structures directly from the 3D region proposals and captures both local and global features through multi-scale partitioning. During training, GHM-C Loss is applied to address the imbalance among target categories and the imbalance between hard and easy samples in the classification task. We evaluate the model on the public KITTI dataset and the Waymo Open Dataset. On KITTI, the mAP for 3D detection is 73.41%, and inference on a single GPU reaches 30.0 fps. Compared with other state-of-the-art methods, AFV-RCNN achieves the inference speed of a single-stage detector together with the detection accuracy of a two-stage detector, maintaining higher detection accuracy while processing point clouds efficiently.
Keywords
Point cloud,Voxel,3D object detection,Autonomous driving
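
The abstract names GHM-C Loss as the remedy for class imbalance and hard/easy sample imbalance, but does not give its settings. As a rough illustration only, the following is a minimal pure-Python sketch of the general gradient-harmonizing idea for binary classification: each sample's cross-entropy is reweighted by the inverse of how many samples share its gradient-norm bin. The function name `ghm_c_loss`, the choice of 10 bins, and normalizing by the number of non-empty bins are assumptions made for this sketch, not details taken from the paper.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def ghm_c_loss(logits, targets, num_bins=10):
    """Hypothetical sketch of a GHM-C style loss (binary case).

    logits:  list of raw classifier scores
    targets: list of 0/1 labels, same length as logits
    num_bins: number of gradient-norm bins (assumed value, not from the paper)
    """
    n = len(logits)
    probs = [sigmoid(z) for z in logits]

    # Gradient norm of binary cross-entropy w.r.t. the logit: |p - y|.
    grads = [abs(p - t) for p, t in zip(probs, targets)]

    # Assign each sample to a bin over [0, 1]; g == 1.0 goes to the last bin.
    bins = [min(int(g * num_bins), num_bins - 1) for g in grads]
    counts = [0] * num_bins
    for b in bins:
        counts[b] += 1
    nonempty = sum(1 for c in counts if c > 0)

    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, t, b in zip(probs, targets, bins):
        bce = -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
        # Samples in crowded bins (e.g. many easy negatives) get small weights.
        weight = n / (counts[b] * nonempty)
        total += weight * bce
    return total / n
```

When every sample falls into the same bin, all weights equal 1 and the sketch reduces to plain mean cross-entropy; when a few samples sit in sparsely populated bins, their contribution grows relative to the crowded bins.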