An Iterative Attention Fusion Network for 6D Object Pose Estimation

Weili Chen, Hui Zhang, Yiming Jiang, Yurong Chen, Bo Chen, Changqing Huang

2023 China Automation Congress (CAC) (2023)

6D object pose estimation plays a crucial role in many real-world applications, and RGB-D based methods have shown remarkable performance in recent years. A key technical challenge for these methods is how to fully exploit the information from both the RGB and depth/point cloud modalities. Previous efforts mainly focused on simply concatenating the features of the two modalities, which limits their performance in heavily cluttered scenes and in real-time applications. In this paper, we propose a generic framework for estimating 6D pose from RGB-D data. Building upon pixel-level fusion of features from the RGB and depth modalities, we extract semantic features from object masks and incorporate them into the dense feature fusion. This enriches the semantic information of the fused features while reducing irrelevant data. Moreover, we introduce an iterative attention module that interacts with the fused features to enhance their representational ability and produce a more robust feature representation. Extensive experiments on the LineMOD dataset show that our approach achieves significantly better performance than prior work. In particular, our method achieves an average pose prediction accuracy of 98.2%.
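The fusion idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify layer types, feature dimensions, or the form of the attention, so the function names (`fuse_features`, `iterative_attention`), the use of scaled dot-product self-attention with a residual update, the number of iterations, and the random weights are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_features(rgb_feat, geo_feat, mask_feat):
    """Pixel-level fusion (assumed form): concatenate per-pixel RGB and
    geometry features with a mask-derived semantic vector tiled to every pixel.

    rgb_feat:  (N, C_rgb)  per-pixel colour features
    geo_feat:  (N, C_geo)  per-point depth/point-cloud features
    mask_feat: (C_mask,)   semantic feature from the object mask
    """
    n = rgb_feat.shape[0]
    mask_tiled = np.broadcast_to(mask_feat, (n, mask_feat.shape[0]))
    return np.concatenate([rgb_feat, geo_feat, mask_tiled], axis=1)


def iterative_attention(feat, w_q, w_k, w_v, num_iters=3):
    """Refine fused features by repeated self-attention (assumed form).

    Each iteration applies scaled dot-product attention over all pixels
    and adds the result back to the features (residual update), so w_v
    must map back to the fused feature dimension.
    """
    d = w_q.shape[1]
    out = feat
    for _ in range(num_iters):
        q, k, v = out @ w_q, out @ w_k, out @ w_v
        attn = softmax(q @ k.T / np.sqrt(d), axis=-1)  # (N, N) weights
        out = out + attn @ v
    return out


# Toy inputs: 8 pixels, hypothetical feature sizes.
rng = np.random.default_rng(0)
n_points = 8
rgb = rng.standard_normal((n_points, 4))
geo = rng.standard_normal((n_points, 4))
mask = rng.standard_normal(2)

fused = fuse_features(rgb, geo, mask)
c = fused.shape[1]
w_q = rng.standard_normal((c, 5)) * 0.1
w_k = rng.standard_normal((c, 5)) * 0.1
w_v = rng.standard_normal((c, c)) * 0.1
refined = iterative_attention(fused, w_q, w_k, w_v)
```

In a real network the projection matrices would be learned and the refined per-pixel features would feed a pose regression head; here they are random so that only the data flow is shown.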