Spoof Detection using Voice Contribution on LFCC features and ResNet-34

2023 18th International Joint Symposium on Artificial Intelligence and Natural Language Processing (ISAI-NLP 2023)

Abstract
Biometric authentication, particularly speaker verification, has advanced significantly in recent years. Despite this progress, evidence shows that speaker verification systems remain vulnerable to spoofing attacks, so dedicated countermeasures are needed to detect the different attack types. This paper focuses on detecting replay, speech synthesis, and voice conversion attacks. Our spoof detection approach uses linear frequency cepstral coefficients (LFCC) for front-end feature extraction and ResNet-34 to distinguish genuine from spoofed speech. We evaluate the combined LFCC and ResNet-34 system on the physical access (PA) and logical access (LA) subsets of the ASVspoof 2019 dataset. For both subsets, we compare extracting features from the entire utterance with an alternative that extracts features from only a specific percentage of the voice segment within the utterance. We also compare the proposed method with the established baselines, LFCC-GMM and CQCC-GMM. For replay attacks (PA), the proposed method achieves an equal error rate (EER) of 3.11% on the development set and 3.49% on the evaluation set. For voice conversion and speech synthesis attacks (LA), it achieves EERs of 0.16% on the development set and 6.89% on the evaluation set. These results indicate that the method is promising for detecting both PA and LA spoofing attacks.
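The abstract reports its results as equal error rates but does not describe how they were computed, and the ASVspoof challenge distributes its own scoring tools. As a rough illustration of the metric only, the sketch below computes an EER from two lists of detector scores. It assumes that a higher score means "more likely genuine"; the function name `equal_error_rate` and the toy scores are hypothetical and are not taken from the paper.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Return (eer, threshold) for a detector where higher = more genuine.

    Illustrative sketch only: a simple O(n^2) threshold sweep, not the
    official ASVspoof scoring code.
    """
    if not genuine_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one value above the
    # maximum so the "reject everything" operating point is also covered.
    candidates = sorted(set(genuine_scores) | set(spoof_scores))
    candidates.append(candidates[-1] + 1.0)

    best = None  # (gap, eer, threshold)
    for t in candidates:
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance rate: spoofed trials scored at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Toy scores (made up for illustration, not results from the paper).
genuine = [0.9, 0.8, 0.7, 0.4]
spoof = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, spoof)
print(f"EER = {eer:.2%} at threshold {threshold}")  # EER = 25.00% at threshold 0.6
```

With finite score lists the two error rates rarely cross exactly, so this sketch reports the average of the false acceptance and false rejection rates at the threshold where they are closest. Other tools may interpolate instead and give slightly different values.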
Keywords
replay attack,speech synthesis,voice conversion,LFCC,ResNet-34,ASVspoof