Mixed Knowledge Relation Transformer for Image Captioning

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Abstract
Relationships among the objects in an image have contributed significantly to progress in image captioning, especially when combined with the Transformer architecture. Most such methods compute relationships only between entities and ignore the information between entities and the background. Moreover, the ways of exploring relational information inside the image can be extended further. In this paper, we further explore the relationships between objects from both internal and external perspectives, and embed the image's vital global information into the internal relationship module. To validate the effectiveness of our model, we conduct extensive experiments on the widely used MSCOCO dataset and achieve state-of-the-art performance on both the online and offline test sets.
Keywords
image captioning, external knowledge, object relation