Evaluation of Phonexia automatic speaker recognition software under conditions reflecting those of a real forensic voice comparison case (forensic_eval_01)

Speech Communication (2019)

Abstract
As part of the Speech Communication virtual special issue “Multi-laboratory evaluation of forensic voice comparison systems under conditions reflecting those of a real forensic case (forensic_eval_01)”, two automatic speaker recognition systems developed by the company Phonexia were tested. The first, named SID (Speaker Identification)-XL3, is an i-vector PLDA system that works with two streams of features: one uses MFCCs in the classical sense, and the other uses DNN-Stacked Bottle-Neck features based on correlated spectral-domain features as well as on information from voice/voiceless detection and fundamental frequency. The second system, SID-BETA4, uses MFCCs as input features (without deltas and double deltas) and employs a DNN-based speaker embedding architecture. Each of the two systems was tested in two variants. In the first, the system was used without any domain-specific data, i.e. data from the training set of forensic_eval_01. In the second, training set data were used with a method called 10% FAR calibration. With this method, scores are shifted so that 10% of the scores in the non-target distribution (based on training data) have LLR > 0 and 90% have LLR < 0. Results showed that the speaker embedding system SID-BETA4 gives a clear improvement over SID-XL3 in terms of accuracy, discrimination and precision measures. FAR calibration left precision unaffected but led to an improvement in discrimination. The accuracy measure Cllr (pooled) improved with FAR calibration for SID-XL3 but not for SID-BETA4.
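The score shift described in the abstract can be illustrated with a minimal sketch. It assumes the shift is a single additive offset estimated from non-target training scores; the abstract does not give Phonexia's actual procedure, so the function names, the midpoint rule for placing the threshold, and the toy scores below are all hypothetical.

```python
def far_calibration_offset(non_target_scores, far=0.10):
    """Offset to subtract so that roughly a fraction `far` of the
    non-target scores end up above 0 (hypothetical helper, not
    Phonexia's implementation)."""
    ordered = sorted(non_target_scores)
    n = len(ordered)
    if n < 2 or not 0.0 < far < 1.0:
        raise ValueError("need at least two scores and 0 < far < 1")
    n_above = round(far * n)  # number of scores allowed above 0
    if n_above == 0:
        # Too few scores to leave any above 0: place the threshold at the maximum.
        return ordered[-1]
    k = n - n_above  # number of scores that must fall below 0
    # Assumption: put the threshold halfway between the two neighbouring scores.
    return (ordered[k - 1] + ordered[k]) / 2.0


def apply_offset(scores, offset):
    """Shift every score by the same offset."""
    return [s - offset for s in scores]


# Toy non-target scores (made up for illustration only).
non_target = [float(x) for x in range(-10, 10)]
offset = far_calibration_offset(non_target)
shifted = apply_offset(non_target, offset)
```

The same offset would then be applied to the scores of the test comparisons, so that the zero point of the shifted scale corresponds to the chosen 10% operating point on the training data.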
Keywords
Forensic voice comparison, Automatic speaker recognition, Evaluation, Phonexia