Environmental robust speech and speaker recognition through multi-channel histogram equalization

Neurocomputing (2012)

Abstract
Feature statistics normalization in the cepstral domain is one of the most effective approaches for robust automatic speech and speaker recognition in noisy acoustic scenarios: feature coefficients are normalized through suitable linear or nonlinear transformations so that the statistics of noisy speech match those of clean speech. Histogram equalization (HEQ) belongs to this category of algorithms, has proved effective for this purpose, and is therefore taken here as the reference. In this paper, the availability of multiple acoustic channels is used to enhance the statistics modeling capabilities of the HEQ algorithm by exploiting multiple noisy speech occurrences, with the aim of maximizing the effectiveness of the cepstral normalization process. Computer simulations based on the Aurora 2 database in speech and speaker recognition scenarios show that a significant recognition improvement can be achieved with respect to the single-channel counterpart and other multi-channel techniques, confirming the effectiveness of the idea. The proposed algorithmic configuration has also been combined with the kernel estimation technique to further improve speech recognition performance.
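
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a standard normal reference distribution, a rank-based empirical CDF, and, for the multi-channel case, that frames from all channels are simply pooled to estimate the noisy CDF. The function names `empirical_cdf`, `heq`, and `multichannel_heq` are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def empirical_cdf(values, pool):
    """Rank-based CDF of `values` within `pool`, kept strictly inside (0, 1)."""
    sorted_pool = np.sort(pool)
    n = len(sorted_pool)
    ranks = np.searchsorted(sorted_pool, values, side="right")
    return np.clip((ranks - 0.5) / n, 0.5 / n, 1.0 - 0.5 / n)


def heq(features, pool=None):
    """Map each coefficient (column) onto a standard normal reference.

    features: (frames, coeffs) array of cepstral coefficients.
    pool:     samples used to estimate the noisy CDF (defaults to `features`).
    Assumption: the clean-speech reference is N(0, 1) for every coefficient.
    """
    features = np.asarray(features, dtype=float)
    pool = features if pool is None else np.asarray(pool, dtype=float)
    inverse_reference_cdf = np.vectorize(NormalDist().inv_cdf)
    out = np.empty_like(features)
    for k in range(features.shape[1]):
        out[:, k] = inverse_reference_cdf(empirical_cdf(features[:, k], pool[:, k]))
    return out


def multichannel_heq(channels):
    """Equalize every channel against a CDF estimated from all channels.

    Assumption: pooling frames across channels stands in for the paper's
    multi-channel statistics modeling, which the abstract does not detail.
    """
    pool = np.concatenate(channels, axis=0)
    return [heq(channel, pool) for channel in channels]
```

Pooling gives the CDF estimate more samples per utterance than a single channel provides, which is one plausible reading of "exploiting the availability of multiple noisy speech occurrences"; the paper's actual formulation may differ.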
Keywords
multiple noisy speech occurrence, environmental robust speech, noisy acoustic scenario, speech recognition performance, multi-channel histogram equalization, speaker recognition scenario, speaker recognition, significant recognition improvement, noisy speech statistic, feature statistics normalization, HEQ algorithm, clean speech, histogram equalization, speech recognition