Dual-Stream Temporal Convolutional Neural Network for Voice Presentation Attack Detection

2022 International Workshop on Biometrics and Forensics (IWBF), 2022

Abstract
Improving the robustness of biometric systems to external attacks is of the utmost importance for the research community. In particular, Automatic Speaker Verification (ASV) can be easily bypassed either by launching attack presentations at the capture device, i.e., the microphone (physical access attacks), or by replacing the input sample in the channel between the capture device and the signal processor (logical access attacks). To address these security threats, the ASVspoof challenges have evaluated the generalisation ability of several Presentation Attack Detection (PAD) approaches over the last decade. These algorithms have reported remarkable performance in detecting physical and logical access attacks when combined with the decision provided by the ASV system. However, they fundamentally depend on the complementary information of ASV systems for reliable detection and are therefore not interoperable across different systems. In this work, we propose an interoperable dual-stream PAD method which leverages temporal information from image-based voice spectrograms to enhance generalisation in PAD. Experiments conducted on the publicly available ASVspoof 2019 and 2021 databases show the feasibility of our approach for detecting both physical and logical access attacks unseen during training.
Keywords
Presentation Attack Detection, Automatic Speaker Verification, Generalisation Capability, Temporal Convolutional Neural Network