A Deep Neural Network Approach To Speech Bandwidth Expansion

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Abstract
We propose a deep neural network (DNN) approach to speech bandwidth expansion (BWE) by estimating the spectral mapping function from narrowband (4 kHz in bandwidth) to wideband (8 kHz in bandwidth). Log-spectrum power is used as both the input and the output feature to perform the required nonlinear transformation, and DNNs are trained to realize this high-dimensional mapping function. When evaluating the proposed approach on a large-scale 10-hour test set, we found that the DNN-expanded speech signals give excellent objective quality measures in terms of segmental signal-to-noise ratio and log-spectral distortion when compared with conventional BWE based on Gaussian mixture models (GMMs). Subjective listening tests also give a 69% preference score for DNN-expanded speech versus 31% for GMM when the phase information is assumed known. For tests in real operation, when the phase information is imaged from the given narrowband signal, the preference comparison goes up to 84% versus 16%. A correct phase recovery can further increase the BWE performance of the proposed DNN method.
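The abstract names log-spectrum power features and log-spectral distortion (LSD) but does not define them. The following is a minimal sketch of one common formulation, not the authors' implementation: the frame length, hop size, Hann window, dB scaling, and the function names `log_power_spectrum` and `log_spectral_distortion` are all assumptions made for illustration.

```python
import numpy as np

def log_power_spectrum(signal, frame_len=512, hop=256, eps=1e-10):
    """Return per-frame log-power spectra in dB, shape (n_frames, frame_len // 2 + 1).

    Assumed setup (not taken from the paper): Hann window, fixed hop,
    and a small eps to avoid log(0).
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    spectrum = np.fft.rfft(frames, axis=1)
    return 10.0 * np.log10(np.abs(spectrum) ** 2 + eps)

def log_spectral_distortion(ref_lps, est_lps):
    """Mean over frames of the RMS difference (in dB) across frequency bins."""
    per_frame = np.sqrt(np.mean((ref_lps - est_lps) ** 2, axis=1))
    return float(np.mean(per_frame))
```

Under these assumptions, a lower LSD between the true wideband spectra and the expanded spectra indicates a closer spectral match; identical inputs give an LSD of zero.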
Keywords
Deep neural network, speech bandwidth expansion, spectrum mapping, phase estimation