Connective Cognition Network for Directional Visual Commonsense Reasoning
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019)(2019)
摘要
Visual commonsense reasoning (VCR) has been introduced to boost research of cognition-level visual understanding, i.e., a thorough understanding of correlated details of the scene plus an inference with related commonsense knowledge. Recent studies on neuroscience have suggested that brain function or cognition can be described as a global and dynamic integration of local neuronal connectivity, which is context-sensitive to specific cognition tasks. Inspired by this idea, towards VCR, we propose a connective cognition network (CCN) to dynamically reorganize the visual neuron connectivity that is contextualized by the meaning of questions and answers. Concretely, we first develop visual neuron connectivity to fully model correlations of visual content. Then, a contextualization process is introduced to fuse the sentence representation with that of visual neurons. Finally, based on the output of contextualized connectivity, we propose directional connectivity to infer answers or rationales. Experimental results on the VCR dataset demonstrate the effectiveness of our method. Particularly, in Q -> AR mode, our method is around 4% higher than the state-of-the-art method.
更多查看译文
关键词
brain function,dynamic integration
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络