Dual-stream Multi-Modal Graph Neural Network for Few-Shot Learning

Wenli Zhang, Luping Shi

2023 IEEE 6th International Conference on Multimedia Information Processing and Retrieval (MIPR), 2023

Abstract
Few-shot learning aims to rapidly recognize unseen targets from only a small number of labeled samples, a core capability of humans. However, existing research primarily maps samples into the feature space of a single modality, neglecting features hidden in other modalities. To address this problem, we develop a Dual-stream Multi-Modal Graph Neural Network (DMMG) that leverages additional multi-modal information for few-shot learning. In parallel streams, we convert the representations of text and images into each other's feature spaces, and then compare text and image instances in both vector spaces simultaneously, exploiting the potential of each modality for few-shot classification by bypassing the single-modality information bottleneck. This approach can be extended to a wide range of metric-based few-shot learning methods. Experiments on the miniImageNet dataset demonstrate that DMMG outperforms state-of-the-art few-shot learning methods, highlighting the effectiveness of our proposed approach.
Keywords
Few-shot Learning,Embedding Propagation,Multi-Modality,Graph Neural Networks
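The abstract describes projecting each modality into the other's feature space in parallel streams and fusing the resulting similarities for metric-based classification. Below is a minimal sketch of that dual-stream idea, not the authors' implementation: it omits the graph neural network and embedding propagation components of DMMG, assumes pre-extracted image features and class-name text embeddings, and all module names, dimensions, and the fusion rule are hypothetical choices for illustration.

```python
# Sketch of a dual-stream cross-modal comparison for an N-way few-shot episode.
# Assumptions: 512-d image features, 300-d text embeddings, simple linear projectors,
# cosine-similarity scoring; DMMG's graph propagation is not modeled here.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DualStreamProjector(nn.Module):
    """Projects each modality into the other's feature space in parallel streams."""

    def __init__(self, img_dim: int, txt_dim: int):
        super().__init__()
        self.img_to_txt = nn.Linear(img_dim, txt_dim)  # image stream -> text space
        self.txt_to_img = nn.Linear(txt_dim, img_dim)  # text stream -> image space

    def forward(self, img_feat: torch.Tensor, txt_feat: torch.Tensor):
        return self.img_to_txt(img_feat), self.txt_to_img(txt_feat)


def dual_space_logits(query_img, proto_img, proto_txt, projector):
    """Compare queries with class prototypes in both spaces and fuse the scores."""
    q_in_txt, txt_in_img = projector(query_img, proto_txt)
    # Text-space stream: projected query images vs. class text embeddings.
    sim_txt = F.normalize(q_in_txt, dim=-1) @ F.normalize(proto_txt, dim=-1).t()
    # Image-space stream: raw query images vs. image prototypes averaged with projected text.
    fused_proto = F.normalize((proto_img + txt_in_img) / 2, dim=-1)
    sim_img = F.normalize(query_img, dim=-1) @ fused_proto.t()
    return sim_txt + sim_img  # fused logits over the N classes


if __name__ == "__main__":
    # Toy 5-way episode with random features.
    projector = DualStreamProjector(img_dim=512, txt_dim=300)
    query_img = torch.randn(10, 512)   # 10 query images
    proto_img = torch.randn(5, 512)    # image prototypes (support-set means)
    proto_txt = torch.randn(5, 300)    # text embeddings of the 5 class names
    logits = dual_space_logits(query_img, proto_img, proto_txt, projector)
    print(logits.shape)  # torch.Size([10, 5])
```

In this reading, the two streams supply complementary evidence: the text-space similarity and the image-space similarity are computed independently and summed, so neither modality's representation has to pass through the other before scoring.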