
A Dual-branch Enhanced Multi-task Learning Network for Multimodal Sentiment Analysis

ICMR '23: Proceedings of the 2023 ACM International Conference on Multimedia Retrieval (2023)

Abstract
Multimodal sentiment analysis is a complex research problem. First, current multimodal approaches fail to adequately consider the intricate multi-level correspondence between modalities and the unique contextual information within each modality; second, cross-modal fusion methods somewhat weaken modality-specific internal features, which is a limitation of the traditional single-branch model. To this end, we propose a dual-branch enhanced multi-task learning network (DBEM), a new architecture that considers both the multiple dependencies of sequences and the heterogeneity of multimodal data, for better multimodal sentiment analysis. The global-local branch takes into account the intra-modal dependencies of temporal subsequences of different lengths and aggregates global and local features to enrich feature diversity. The cross-refine branch considers the difference in information density across modalities and adopts coarse-to-fine fusion learning to model the inter-modal dependencies. Coarse-grained fusion achieves low-level feature reinforcement of the audio and visual modalities, and fine-grained fusion improves the ability to integrate complementary information between different levels of modalities. Finally, multi-task learning is carried out on the enhanced fusion features obtained from the dual-branch network to improve the generalization and performance of the model. Compared with the single-branch network (SBEM, a variant of the DBEM model) and SOTA methods, experimental results on two datasets, CH-SIMS and CMU-MOSEI, validate the effectiveness of the DBEM model.
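The abstract does not specify the operators used inside DBEM, so the following is only a toy sketch of two ideas it names: aggregating global and local features over subsequences of different lengths, and combining several task losses. It is not the authors' implementation. The function names, the mean and max pooling choices, the window sizes, the task names, and the weighted absolute-error loss are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_pool(frames: Sequence[Vector]) -> Vector:
    """Average a sequence of equal-length feature vectors."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def global_local_features(frames: Sequence[Vector],
                          window_sizes: Sequence[int] = (2, 4)) -> Vector:
    """Concatenate one global summary with one local summary per window size.

    Assumed design (hypothetical): the global summary is the mean over the
    whole sequence; each local summary is the element-wise max over the means
    of non-overlapping windows of a given length.
    """
    features = mean_pool(frames)  # global view of the sequence
    for w in window_sizes:
        # Split into non-overlapping windows; the last one may be shorter.
        windows = [frames[i:i + w] for i in range(0, len(frames), w)]
        window_means = [mean_pool(win) for win in windows]
        features += [max(col) for col in zip(*window_means)]
    return features


def multitask_loss(predictions: Dict[str, float],
                   targets: Dict[str, float],
                   weights: Dict[str, float]) -> float:
    """Weighted sum of per-task absolute errors (an assumed loss form)."""
    return sum(weights[t] * abs(predictions[t] - targets[t]) for t in targets)


if __name__ == "__main__":
    # Four time steps of a hypothetical 2-dimensional modality feature.
    frames = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]
    print(global_local_features(frames))
    print(multitask_loss({"multimodal": 0.5, "text": 0.0},
                         {"multimodal": 1.0, "text": 0.5},
                         {"multimodal": 1.0, "text": 0.5}))
```

In this sketch the concatenated vector grows by one feature-sized block per window size, which is one simple way to expose both sequence-level and subsequence-level statistics to a downstream predictor.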
Keywords
Dual-branch, Multimodal sentiment analysis, Multi-task learning