Reward for Exploration Based on View Synthesis

IEEE Access(2023)

Abstract
Research on embodied AI has flourished in recent years, aiming to give AI access to real-world information. Visual exploration is a fundamental task in embodied-AI applications such as object-goal navigation, embodied question answering (EQA), and rearrangement, yet it remains challenging. The frontier-based method is successful but difficult to use with reinforcement learning (RL). Moreover, it relies heavily on a two-dimensional grid-map representation, making it difficult to handle free movement in three-dimensional environments. We propose a novel reward for RGB-D camera-based exploration that maximizes the amount of new information contained in the observations obtained from the camera. The basic idea of our method is to predict the destination image by view synthesis, using a point cloud obtained by back-projecting depth information. The more missing regions this predicted image contains, the more likely the destination is to contain unknown information. For efficient exploration, we also propose a topological-map implementation that prevents the agent from repeatedly visiting the same states. Our method achieves coverage of area, objects, and landmarks comparable to that of state-of-the-art visual exploration methods without using two-dimensional grid maps. Furthermore, we implement object-goal navigation by integrating object detection with simple point-goal navigation, and it outperforms the task-specific RL method with the same architecture in success rate.
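The back-projection and "missing regions" idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the pinhole intrinsics, function names, and the pixel-hole fraction used as a reward proxy are all our own assumptions.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Back-project a depth map (H, W) into a 3-D point cloud in the
    camera frame, using assumed pinhole intrinsics (fx, fy, cx, cy)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def novelty_reward(points, pose, fx, fy, cx, cy, h, w):
    """Reproject the point cloud into a hypothetical destination camera
    pose (R, t) and return the fraction of target pixels left empty --
    a crude proxy for how much unknown information that view contains."""
    R, t = pose
    cam = points @ R.T + t              # points in destination camera frame
    cam = cam[cam[:, 2] > 1e-6]         # keep points in front of the camera
    u = np.round(cam[:, 0] * fx / cam[:, 2] + cx).astype(int)
    v = np.round(cam[:, 1] * fy / cam[:, 2] + cy).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    filled = np.zeros((h, w), dtype=bool)
    filled[v[inside], u[inside]] = True  # pixels covered by known geometry
    return 1.0 - filled.mean()           # holes = potential new information
```

Under this sketch, reprojecting into the current pose yields a reward near zero (everything already observed), while a shifted pose exposes unfilled pixels and thus a positive reward, matching the intuition that views with more synthesis gaps are more promising to visit.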
Keywords
Task analysis,Navigation,Point cloud compression,Visualization,Optimization,Image reconstruction,Robot kinematics,Embodied-AI,visual exploration,object-goal navigation,view synthesis