Toward Pitch-Insensitive Speaker Verification via Soundfield.

IEEE Internet of Things Journal (2024)

Abstract
Automatic speaker verification systems (ASVs) verify a person's identity by their voice and have been widely deployed for user authentication. However, existing ASVs are based on traditional audio spectral features and hence perform poorly in verifying pitch-changed utterances from speakers with a cold or sore throat. In this article, we propose soundfield tracker (SOFTER), a soundfield-based speaker verification system that can verify speakers regardless of pitch changes. SOFTER is based on the observation that soundfield features reflect the speaker's vocal tract, mouth, head, torso, etc., which are less affected by pitch changes in speech signals. SOFTER can be integrated into off-the-shelf smartphones without any hardware modifications. One major challenge is that the soundfield is sensitive to the distance between the speaker and the phone. To solve this problem, we propose a two-stage mechanism combining distance sensing and soundfield reconstruction, which reconstructs the soundfield to a setting similar to the one in the enrollment phase, so that the speaker can be verified at any distance from the phone. We compare SOFTER with six state-of-the-art academic and commercial ASVs on two data sets of 134 speakers and 31,000 speech samples. Results show that SOFTER has an equal error rate (EER) of 2.18% and 1.61% on the two data sets, respectively. Moreover, SOFTER outperforms other ASVs by at least 24.67% on average in verifying pitch-varying or pathological speech samples, which demonstrates SOFTER's effectiveness under both normal and unhealthy user conditions.
Keywords
Biometrics, pathological speech, pitch variation, soundfield, speaker verification
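
The abstract reports results as an equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. As a minimal sketch of how that metric is commonly computed from verification scores (this is not the authors' evaluation code; the function name `equal_error_rate` and the toy scores are hypothetical, and higher scores are assumed to mean "more likely the same speaker"):

```python
def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) for two lists of similarity scores.

    genuine:  scores from trials where the claimed speaker is the true speaker
    impostor: scores from trials where the claimed speaker is someone else
    Assumption: a trial is accepted when its score >= threshold.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    best = None  # (|FAR - FRR|, EER estimate, threshold)
    # Every observed score is a candidate threshold.
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # genuine rejected
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # With finite data FAR and FRR rarely match exactly,
            # so report their mean at the closest crossing point.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    genuine_scores = [0.9, 0.8, 0.7, 0.4]
    impostor_scores = [0.6, 0.3, 0.2, 0.1]
    eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

Sweeping only the observed scores keeps the sketch short and dependency-free; evaluations on large trial lists typically interpolate along the ROC curve instead.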