Adversarially Masked Video Consistency for Unsupervised Domain Adaptation
CoRR (2024)
Abstract
We study the problem of unsupervised domain adaptation for egocentric videos.
We propose a transformer-based model to learn class-discriminative and
domain-invariant feature representations. It consists of two novel designs. The
first module, a Generative Adversarial Domain Alignment Network, learns
domain-invariant representations by jointly training a mask generator and a
domain-invariant encoder in an adversarial manner: the encoder is trained to
minimize the feature distance between the source and target domains, while the
mask generator aims to produce challenging masks that maximize it. The second is a Masked
Consistency Learning module to learn class-discriminative representations. It
enforces the prediction consistency between the masked target videos and their
full forms. To better evaluate the effectiveness of domain adaptation methods,
we construct a more challenging benchmark for egocentric videos, U-Ego4D. Our
method achieves state-of-the-art performance on the EPIC-Kitchens and the
proposed U-Ego4D benchmarks.
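To make the two objectives concrete, here is a minimal sketch of the loss terms the abstract describes. It assumes a simple mean-feature distance as the domain alignment criterion and a KL divergence as the masked-consistency criterion; the feature shapes, the choice of distance, and all function names are illustrative assumptions, not details taken from the paper (which does not specify them in the abstract):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over class logits.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def domain_distance(src_feat, tgt_feat):
    # Distance between the mean source and target features of a batch.
    # In the adversarial game, the encoder minimizes this quantity while
    # the mask generator produces masks that maximize it.
    return float(np.linalg.norm(src_feat.mean(axis=0) - tgt_feat.mean(axis=0)))

def consistency_loss(logits_full, logits_masked):
    # KL divergence between predictions on the full target video and its
    # masked form; minimizing it enforces prediction consistency.
    p = softmax(logits_full)
    q = softmax(logits_masked)
    return float(np.mean(np.sum(p * (np.log(p + 1e-8) - np.log(q + 1e-8)), axis=-1)))

# Toy usage with random "video features" (batch of 4, feature dim 8) and
# class logits (batch of 4, 5 classes).
rng = np.random.default_rng(0)
src, tgt = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
full, masked = rng.normal(size=(4, 5)), rng.normal(size=(4, 5))
align = domain_distance(src, tgt)       # encoder: minimize; mask generator: maximize
consist = consistency_loss(full, masked)  # minimized on target videos
```

Identical inputs drive both losses to zero, which is the fixed point each objective pulls toward: aligned domains for the encoder, and agreement between masked and full predictions for the consistency module.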