Multi-Modal Virtual-Real Fusion based Transformer for Collaborative Perception

2022 IEEE 13th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP)

Abstract
Automobile intelligence and networking have become an inevitable trend in the future development of the automotive industry. Existing intelligent and connected vehicles rely on single-agent intelligence for basic perception, which remains weak at accurate recognition and positioning in complex traffic scenes, such as for small and distant objects. To tackle this issue, we propose a multi-modal virtual-real fusion Transformer for collaborative perception. Specifically, to exploit the complementary information in RGB images and LiDAR point clouds, we propose the multi-modal virtual-real fusion (MVRF) method, which generates virtual points to compensate for the lack of point information at sparse locations. Furthermore, a heterogeneous graph attention network (HGAN) is constructed to capture inter-agent interaction and adaptively incorporate the features of multiple agents. The HGAN contains a series of encoder layers, each comprising a heterogeneous inter-agent attention module and a multi-scale self-attention module; the former learns different relationships based on the agents' types, while the latter simultaneously captures global and local spatial attention. Extensive experiments demonstrate that the proposed method achieves superior performance compared with state-of-the-art methods.
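
A minimal sketch of the virtual-real fusion idea is given below, assuming an estimated depth map and a pinhole camera model for lifting image pixels into 3D virtual points. The function names, uniform pixel sampling, and source-flag channel are illustrative assumptions, not the paper's actual MVRF implementation.

```python
# Minimal sketch of multi-modal virtual-real point fusion. All interfaces
# here are assumptions for illustration, not the paper's MVRF code.
import torch

def generate_virtual_points(rgb_feats, depth_map, intrinsics, num_samples=2048):
    """Lift sampled image pixels to 3D 'virtual' points via an estimated
    depth map, densifying regions where the LiDAR scan is sparse."""
    H, W = depth_map.shape
    # Uniform pixel sampling; a real system would likely sample around
    # 2D detections or foreground masks instead.
    u = torch.randint(0, W, (num_samples,))
    v = torch.randint(0, H, (num_samples,))
    z = depth_map[v, u]                           # estimated depth per pixel
    fx, fy, cx, cy = intrinsics
    x = (u.float() - cx) * z / fx                 # pinhole back-projection
    y = (v.float() - cy) * z / fy
    virtual_xyz = torch.stack([x, y, z], dim=-1)  # (num_samples, 3)
    # Attach image features so virtual points carry RGB semantics.
    virtual_feats = rgb_feats[:, v, u].t()        # (num_samples, C)
    return virtual_xyz, virtual_feats

def fuse_virtual_real(real_xyz, real_feats, virtual_xyz, virtual_feats):
    """Concatenate real LiDAR points with virtual points; a binary flag
    channel lets later layers tell the two point sources apart."""
    real_flag = torch.ones(real_xyz.shape[0], 1)
    virt_flag = torch.zeros(virtual_xyz.shape[0], 1)
    xyz = torch.cat([real_xyz, virtual_xyz], dim=0)
    feats = torch.cat([torch.cat([real_feats, real_flag], dim=-1),
                       torch.cat([virtual_feats, virt_flag], dim=-1)], dim=0)
    return xyz, feats
```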
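The HGAN encoder layer could be sketched roughly as follows. The per-type-pair attention dictionary, the strided-token multi-scale scheme, and all module names are assumptions made for illustration, since the abstract does not specify the exact architecture.

```python
# Hypothetical sketch of one HGAN encoder layer: heterogeneous inter-agent
# attention followed by multi-scale self-attention. Not the paper's code.
import torch
import torch.nn as nn

class HGANEncoderLayer(nn.Module):
    def __init__(self, dim, num_agent_types=2, heads=4, scales=(1, 2, 4)):
        super().__init__()
        # One attention module per (query-type, key-type) pair so the layer
        # can learn type-dependent interaction patterns between agents.
        self.inter_agent_attn = nn.ModuleDict({
            f"{q}-{k}": nn.MultiheadAttention(dim, heads, batch_first=True)
            for q in range(num_agent_types) for k in range(num_agent_types)
        })
        self.scales = scales
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, feats, agent_types):
        # feats: (num_agents, tokens, dim); agent_types: list of type ids.
        fused = []
        for i, qt in enumerate(agent_types):
            q = feats[i:i + 1]
            msgs = []
            for j, kt in enumerate(agent_types):
                attn = self.inter_agent_attn[f"{qt}-{kt}"]
                out, _ = attn(q, feats[j:j + 1], feats[j:j + 1])
                msgs.append(out)
            # Aggregate messages from all agents into the query agent.
            fused.append(self.norm1(q + torch.stack(msgs).mean(0)))
        x = torch.cat(fused, dim=0)
        # Multi-scale self-attention: subsample key/value tokens at several
        # strides to mix local and global context, then average the results.
        outs = []
        for s in self.scales:
            kv = x[:, ::s]
            out, _ = self.self_attn(x, kv, kv)
            outs.append(out)
        return self.norm2(x + torch.stack(outs).mean(0))
```

As a usage example under these assumptions, HGANEncoderLayer(dim=256, num_agent_types=2)(feats, [0, 1]) would fuse features from a vehicle agent and an infrastructure agent with type-specific attention.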
Keywords
Collaborative Perception, Intelligent and Connected Vehicle, Multi-Modal Fusion