Self-supervised deep partial adversarial network for micro-video multimodal classification

Yun Li, Shuyi Liu, Xuejun Wang, Peiguang Jing

Inf. Sci. (2023)

Abstract
Micro-videos have gained popularity on various social media platforms because they provide a great medium for real-time storytelling. Although micro-videos can be naturally characterized by several modalities, for situations with uncertain missing modalities, a flexible multimodal representation learning framework that integrates complementary and consistent information has been difficult to develop. To better deal with the issue regarding incomplete modalities in multimodal micro-video classification, in this paper, we propose a self-supervised deep multimodal adversarial network (SDMAN) to learn comprehensive and robust micro-video representations. Specifically, we first consider a parallel multi-head attention (MHA) encoding module that simultaneously learns the representations of complete and incomplete modality groupings. We then present a multimodal self-supervised cycle generative adversarial network module, in which multiple generative adversarial networks are explored to transfer the information obtained from the complete modality grouping to the incomplete modality groupings. As a result, complementarity and consistency are mutually promoted among the modalities. Furthermore, experiments conducted on a large-scale micro-video dataset demonstrate that the SDMAN performs better than the state-of-the-art methods.
Keywords
Micro-video classification, Multimodal representation, Self-supervised learning
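
The abstract distinguishes a complete modality grouping from incomplete modality groupings without listing them. As a rough illustration only, the sketch below enumerates such groupings for three hypothetical modalities. The modality names, the function name `modality_groupings`, and the choice to treat every non-empty proper subset as an incomplete grouping are all assumptions, not details taken from the paper.

```python
from itertools import combinations

# Hypothetical modality names; the abstract does not list them.
MODALITIES = ("visual", "acoustic", "textual")


def modality_groupings(modalities):
    """Return (complete, incomplete).

    complete   -- the full tuple of modalities
    incomplete -- every non-empty proper subset, ordered by size and
                  then by input order (an assumed definition)
    """
    complete = tuple(modalities)
    incomplete = [
        subset
        for size in range(1, len(complete))
        for subset in combinations(complete, size)
    ]
    return complete, incomplete


complete, incomplete = modality_groupings(MODALITIES)
print("complete:", complete)
for group in incomplete:
    print("incomplete:", group)
```

Under these assumptions, three modalities give one complete grouping and six incomplete ones. In the framework described above, each grouping would be encoded in parallel, and information would be transferred from the complete grouping to the incomplete ones.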