Don't Judge by the Look: Towards Motion Coherent Video Representation
CoRR(2024)
Abstract
Current training pipelines in object recognition neglect Hue Jittering when
doing data augmentation as it not only brings appearance changes that are
detrimental to classification, but also the implementation is inefficient in
practice. In this study, we investigate the effect of hue variance in the
context of video understanding and find this variance to be beneficial since
static appearances are less important in videos that contain motion
information. Based on this observation, we propose a data augmentation method
for video understanding, named Motion Coherent Augmentation (MCA), that
introduces appearance variation in videos and implicitly encourages the model
to prioritize motion patterns, rather than static appearances. Concretely, we
propose an operation SwapMix to efficiently modify the appearance of video
samples, and introduce Variation Alignment (VA) to resolve the distribution
shift caused by SwapMix, enforcing the model to learn appearance invariant
representations. Comprehensive empirical evaluation across various
architectures and different datasets solidly validates the effectiveness and
generalization ability of MCA, and the application of VA in other augmentation
methods. Code is available at https://github.com/BeSpontaneous/MCA-pytorch.
MoreTranslated text
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined