Video Relationship Detection Using Mixture of Experts

IEEE Access (2023)

Abstract
Machine comprehension of visual information from images and videos by neural networks suffers from two limitations: (1) the computational and inference gap between vision and language when accurately determining which object a given agent acts on and then representing that interaction in language, and (2) the limited stability and generalization of a classifier trained as a single, monolithic neural network. To address these limitations, we propose MoE-VRD, a novel approach to visual relationship detection via a mixture of experts. MoE-VRD recognizes language triplets in the form of a <subject, predicate, object> tuple to extract the relationship between subject, predicate, and object from visual processing. Since detecting a relationship between a subject (acting) and one or more objects (being acted upon) requires that the action be recognized, we base our network on recent work in visual relationship detection. To address the limitations associated with single monolithic networks, our mixture of experts is based on multiple small models whose outputs are aggregated. That is, each expert in MoE-VRD is a visual relationship learner capable of detecting and tagging objects. MoE-VRD employs an ensemble of networks while preserving the complexity and computational cost of the original underlying visual relationship model by applying a sparsely-gated mixture of experts, which allows for conditional computation and a significant gain in neural network capacity. We show that the conditional computation capabilities and massive scalability of the mixture of experts lead to an approach to the visual relationship detection problem that outperforms the state of the art.
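To make the idea of a sparsely-gated mixture of experts concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear gate, top-k selection with a softmax over the selected gate logits, and experts that each return a list of predicate scores. All names here (`top_k_gate`, `moe_predict`, `gate_matrix`) are hypothetical and chosen for illustration; any noise term or load-balancing loss that a full system might use is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(gate_logits, k):
    """Select the k experts with the largest gate logits.

    Returns a dict {expert_index: weight}; the weights are a softmax over
    the selected logits only, so they sum to 1. Experts that are not
    selected do not appear in the dict at all.
    """
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    chosen_weights = softmax([gate_logits[i] for i in chosen])
    return dict(zip(chosen, chosen_weights))


def moe_predict(features, experts, gate_matrix, k=2):
    """Aggregate predicate scores from the k experts picked by the gate.

    features:    list of floats describing one subject-object pair (assumed)
    experts:     list of callables, features -> list of predicate scores
    gate_matrix: one row of gate weights per expert (assumed linear gate)
    """
    # Linear gate: one logit per expert.
    gate_logits = [sum(w * x for w, x in zip(row, features))
                   for row in gate_matrix]
    weights = top_k_gate(gate_logits, k)

    combined = None
    # Conditional computation: only the selected experts are evaluated.
    for index, weight in weights.items():
        scores = experts[index](features)
        if combined is None:
            combined = [0.0] * len(scores)
        combined = [c + weight * s for c, s in zip(combined, scores)]
    return combined, weights
```

With `k` fixed, the number of experts evaluated per input stays constant no matter how many experts exist, which is the sense in which capacity can grow without a matching growth in per-input computation.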
Key words
Visualization, Videos, Neural networks, Computational modeling, Video sequences, Computer vision, Deep learning, video analysis, visual relationship detection, mixture-of-experts