Diversity and Consistency - Exploring Visual Question-Answer Pair Generation.

EMNLP (2021)

Abstract
Although it shows promising value for downstream applications, generating questions and answers together is under-explored. In this paper, we introduce a novel task that targets question-answer pair generation from visual images. It requires not only generating diverse question-answer pairs but also keeping each pair consistent. We study different generation paradigms for this task and propose three models: the pipeline model, the joint model, and the sequential model. We integrate variational inference into these models to achieve diversity and consistency. We also propose region representation scaling and attention alignment to further improve consistency. Finally, we devise an evaluator as a quantitative metric for consistency. We validate our approach on two benchmarks, VQA2.0 and Visual-7w, by automatically and manually evaluating diversity and consistency. Experimental results show the effectiveness of our models: they can generate diverse or consistent pairs. Moreover, this task can be used to improve visual question generation and visual question answering.
Keywords
Visual Question Answering, Topic Modeling, Representation Learning, Visual Recognition, Scene Graph Generation