RTransNav: Relation-wise Transformer Network for More Successful Object Goal Navigation

AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems (2023)

Abstract
The task of object goal navigation is to drive an embodied agent to find the location of a given target using only visual observations. The mapping learned from the visual perception of these observations determines the navigation actions. We attribute the agent's poor generalization across scenes to a lack of good visual perception and spatial reasoning ability. The mutual relationships between edges and objects in the observation are the essential part of a scene graph and reflect a deep understanding of visual perception. Despite recent advances, such as visual transformers and contextual information embedding, visual perception over graph representations remains a challenging task. In this work, we propose a novel Heterogeneous Zone Graph Visual Transformer formulation for graph representation and visual perception. It consists of two key ideas: 1) a Heterogeneous Zone Graph (HZG) that explores the heterogeneous target-related zone graph and spatial information, allowing the agent to navigate efficiently; 2) a Relation-wise Transformer Network (RTN) that transforms the relationship between previously observed objects and navigation actions. RTN extracts rich node and edge features while paying more attention to the target-related zone. We model self-attention in the node-to-node encoder and cross-attention in the edge-to-node decoder. The HZG-based model and RTN are shown to improve the agent's policy and to achieve state-of-the-art (SOTA) results on commonly used datasets.
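
The encoder and decoder attention pattern mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the architecture, so the single-head scaled dot-product attention, the absence of learned projections, the choice of edge features as decoder queries, and all names and sizes (`node_feats`, `edge_feats`, `num_nodes`, `num_edges`, `dim`) are assumptions made purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    # Standard scaled dot-product attention (single head, no learned weights).
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ values, weights

# Hypothetical sizes and random features standing in for graph nodes and edges.
rng = np.random.default_rng(0)
num_nodes, num_edges, dim = 5, 7, 16
node_feats = rng.normal(size=(num_nodes, dim))
edge_feats = rng.normal(size=(num_edges, dim))

# Encoder step (assumed): node features attend to each other (self-attention).
encoded_nodes, self_weights = attention(node_feats, node_feats, node_feats)

# Decoder step (assumed): edge features query the encoded nodes (cross-attention).
decoded, cross_weights = attention(edge_feats, encoded_nodes, encoded_nodes)

print(encoded_nodes.shape, decoded.shape)
```

In this sketch each row of `self_weights` and `cross_weights` is a probability distribution over nodes, which is one way to read "paying more attention to the target-related zone": nodes with larger weights contribute more to the output features.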