RICOA: Rich Captioning with Object Attributes.

Enes Muvahhid Sahin, Gözde Bozdagi Akar

IEEE International Conference on Consumer Electronics (2024)

Abstract
In this study, we demonstrate how state-of-the-art baseline image captioning methods overlook important details in the image, and we analyze the reasons behind this problem. We propose a novel approach, named RICOA (RIch Captioning with Object Attributes), which integrates object attributes into the generated captions. Our analyses demonstrate that the proposed approach generates richer and more visually grounded captions by successfully integrating the attributes of objects in the scene.
Keywords
image captioning, novel object captioning, vision-language pretraining, object tags, object attributes