DeepFake detection with multi-scale convolution and vision transformer

Digital Signal Processing (2023)

Abstract
Modern generative techniques make it possible to create or manipulate image and video content without introducing obvious visual artifacts. If such manipulated images or videos are abused, they can have a severe negative impact on society and individuals, so deepfake detection has attracted considerable attention in recent years. Although existing methods achieve good detection performance on high-quality datasets, their results on low-quality datasets and in cross-dataset evaluation remain far from satisfactory. We therefore propose a new CNN-based method for deepfake detection that combines multi-scale convolution with a vision transformer. In the proposed model, a multi-scale module built from dilated convolution and depthwise separable convolution captures more face details and tampering artifacts at different scales. In place of a traditional classification module, a vision transformer further learns the global information of the face features for classification. Extensive experiments demonstrate that, in most cases, the proposed method achieves better detection results than related modern methods on both high-quality and low-quality datasets, and that it generalizes well across datasets. Ablation experiments are also provided to verify the rationality of the proposed network.
Keywords
Deepfake, Convolutional neural network, Multi-scale feature, Vision transformer
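
The abstract names dilated convolution and depthwise separable convolution as the building blocks of the multi-scale module. The short sketch below is illustrative arithmetic only: it shows how the dilation rate widens the area a fixed-size kernel covers, and how a depthwise separable layer reduces the weight count compared with a standard convolution. The 3x3 kernel, the dilation rates 1 to 3, and the 64-to-128 channel setting are assumptions chosen for illustration, not values taken from the paper, and the function names are hypothetical.

```python
# Illustrative arithmetic only. Kernel size, dilation rates, and channel
# counts are assumptions for this sketch, not values from the paper.


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent (per side) covered by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def standard_conv_params(kernel_size: int, c_in: int, c_out: int) -> int:
    """Weight count of a standard 2D convolution (bias omitted)."""
    return kernel_size * kernel_size * c_in * c_out


def depthwise_separable_params(kernel_size: int, c_in: int, c_out: int) -> int:
    """Weight count of a depthwise conv followed by a 1x1 pointwise conv (bias omitted)."""
    depthwise = kernel_size * kernel_size * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out                      # 1x1 conv that mixes channels
    return depthwise + pointwise


# Extent covered by a 3x3 kernel at assumed dilation rates 1, 2, and 3.
extents = {d: effective_kernel_size(3, d) for d in (1, 2, 3)}

# Weight counts for an assumed 64 -> 128 channel layer with a 3x3 kernel.
standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)

print("extent per dilation rate:", extents)
print("standard conv weights:", standard)
print("depthwise separable weights:", separable)
```

Under these assumed settings, parallel branches with different dilation rates see the same feature map at several spatial extents while each branch keeps the weight count of a 3x3 kernel, which is one plausible reading of how a multi-scale module can capture artifacts at different scales cheaply.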