Lip Forgery Video Detection via Multi-Phoneme Selection


Deepfake techniques can produce realistic manipulated videos, including full-face synthesis and local region forgery. General detection methods work well on the former but usually struggle to capture local artifacts, especially in lip forgery. In this paper, we focus on the lip forgery detection task. We first establish a robust mapping from audio to lip shapes. We then classify the lip shapes of each video frame according to the spoken phonemes, which enables the network to capture the dissonance between lip shapes and phonemes in fake videos and increases interpretability. Each lip shape-phoneme set is used to train a sub-model, and those with better discrimination are selected to form an ensemble classification model. Extensive experimental results demonstrate that our method outperforms most state-of-the-art methods on both the public DFDC dataset and a self-organized lip forgery dataset.
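
The selection-and-ensemble step described above can be illustrated with a minimal Python sketch. The abstract does not specify how sub-models are ranked or how their outputs are combined, so everything here is an assumption: sub-models are ranked by a hypothetical validation accuracy, the top k are kept, and the video-level score is the plain average of per-frame fake probabilities from the kept sub-models. The phoneme labels, accuracies, and function names are illustrative only and are not taken from the paper.

```python
from statistics import mean


def select_phonemes(val_accuracy, k):
    """Return the k phonemes whose sub-models score highest on validation.

    val_accuracy: dict mapping phoneme label -> validation accuracy
    (assumed selection criterion; the paper may use a different one).
    """
    ranked = sorted(val_accuracy, key=val_accuracy.get, reverse=True)
    return set(ranked[:k])


def video_fake_score(frame_scores, selected):
    """Average per-frame fake probabilities from the selected sub-models only.

    frame_scores: list of (phoneme, fake_probability) pairs, one per frame.
    Returns None when no frame belongs to a selected phoneme.
    """
    kept = [p for phoneme, p in frame_scores if phoneme in selected]
    return mean(kept) if kept else None


# Hypothetical validation accuracies for four phoneme sub-models.
val_accuracy = {"AA": 0.91, "M": 0.88, "S": 0.62, "OW": 0.85}
selected = select_phonemes(val_accuracy, k=2)

# Hypothetical per-frame outputs: (phoneme spoken in the frame, fake probability).
frames = [("AA", 0.9), ("S", 0.2), ("M", 0.7), ("OW", 0.4)]
score = video_fake_score(frames, selected)
print(sorted(selected), score)
```

In this sketch, frames whose phoneme has no selected sub-model are simply ignored. A weighted vote or a learned combiner would be equally plausible readings of "ensemble classification model".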