Semantic Map-based Generation of Navigation Instructions
arXiv (2024)
Abstract
We are interested in the generation of navigation instructions, either in
their own right or as training material for robotic navigation tasks. In this
paper, we propose a new approach to navigation instruction generation by
framing the problem as an image captioning task using semantic maps as visual
input. Conventional approaches employ a sequence of panorama images to generate
navigation instructions. Semantic maps abstract away from visual details and
fuse the information in multiple panorama images into a single top-down
representation, thereby reducing the computational complexity of processing the input.
We present a benchmark dataset for instruction generation using semantic maps,
propose an initial model and ask human subjects to manually assess the quality
of generated instructions. Our initial investigations show promise in using
semantic maps for instruction generation instead of a sequence of panorama
images, but there is vast scope for improvement. We release the code for data
preparation and model training at https://github.com/chengzu-li/VLGen.