GuessWhich? Visual dialog with attentive memory network

Pattern Recognition (2021)

Abstract
• We use a memory network in the cooperative ‘GuessWhich’ game between Q-BOT and A-BOT. It reduces repetition in the generated dialogs and makes image retrieval efficient.
• We propose a novel Attentive Memory Network that adds a fusion model to the memory network. The fusion model makes effective use of the manually labeled caption and the image, so the generated dialogs and the predicted image representation are visually grounded.
• Experiments on the VisDial 1.0 dataset demonstrate that our generated dialogs are natural and precise, and the results exceed those of state-of-the-art ‘GuessWhich’-based visual dialog algorithms. Extensive image retrieval experiments show that our method also produces more accurate results than the benchmarks.
Keywords
Visual dialog,Attentive memory network,Reinforcement learning