Residual Graph Convolutional Network for Bird's-Eye-View Semantic Segmentation
IEEE/CVF Winter Conference on Applications of Computer Vision (2023)
Abstract
Retrieving spatial information and understanding the semantic information of
the surroundings are important for Bird's-Eye-View (BEV) semantic segmentation.
In autonomous driving, vehicles need to be aware
of their surroundings to drive safely. However, current BEV semantic
segmentation techniques, namely deep Convolutional Neural Networks (CNNs) and
transformers, have difficulty capturing the global semantic relationships
of the surroundings in the early layers of the network. In this paper, we
propose to incorporate a novel Residual Graph Convolutional (RGC) module in
deep CNNs to acquire both the global information and the region-level semantic
relationship in the multi-view image domain. Specifically, the RGC module
employs a non-overlapping graph space projection to efficiently project the
complete BEV information into graph space. It then builds interconnected
spatial and channel graphs to extract spatial information between each node and
channel information within each node (i.e., extract contextual relationships of
the global features). Furthermore, it uses a downsample residual process to
enhance coordinate feature reuse and thereby maintain the global information. A
segmentation data augmentation and alignment module simultaneously augments
and aligns the BEV features and ground truth, geometrically preserving their
alignment to achieve better segmentation results. Our experimental results on
the nuScenes benchmark dataset demonstrate that the RGC network outperforms
four state-of-the-art networks and four of its own variants in terms of IoU and
mIoU. The proposed RGC network achieves an mIoU 3.1% higher than that of the
best state-of-the-art network, BEVFusion. Code and models will be released.
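
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how per-class IoU and mIoU are commonly computed for a segmentation map. It is not the authors' evaluation code: the abstract does not describe the exact protocol (for example, thresholding or per-class binary masks), and the function names `per_class_iou` and `mean_iou`, as well as the toy labels, are illustrative assumptions.

```python
from typing import List, Optional, Sequence


def per_class_iou(
    pred: Sequence[int], target: Sequence[int], num_classes: int
) -> List[Optional[float]]:
    """Intersection over union for each class over flattened label maps.

    Hypothetical helper (not from the paper). Returns None for a class
    that appears in neither the prediction nor the ground truth.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious: List[Optional[float]] = []
    for c in range(num_classes):
        # Cells where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Cells where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        ious.append(inter / union if union else None)
    return ious


def mean_iou(ious: Sequence[Optional[float]]) -> float:
    """Average the per-class IoU values, skipping undefined (None) entries."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid) if valid else 0.0


# Toy flattened BEV label maps with three classes (made-up values).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(pred, target, num_classes=3)
miou = mean_iou(ious)
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, so the mIoU is 0.5. Skipping classes absent from both maps is one common convention; other evaluation scripts treat such classes differently.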