Data driven articulatory synthesis with deep neural networks

Computer Speech & Language (2016)

Abstract
We present an articulatory-to-acoustic mapping for real-time articulatory synthesis. The method uses a deep neural network with a tapped-delay input line. The tapped-delay line efficiently captures dynamics in articulatory trajectories. The model achieved higher accuracy than competing models based on Gaussian mixtures. The improvement was also found to be perceptible in a subjective listening test.

The conventional approach for data-driven articulatory synthesis consists of modeling the joint acoustic-articulatory distribution with a Gaussian mixture model (GMM), followed by a post-processing step that optimizes the resulting acoustic trajectories. This final step can significantly improve the accuracy of the GMM frame-by-frame mapping, but it is computationally intensive and requires that the entire utterance be synthesized beforehand, making it unsuited for real-time synthesis. To address this issue, we present a deep neural network (DNN) articulatory synthesizer that uses a tapped-delay input line, allowing the model to capture context information in the articulatory trajectory without the need for post-processing. We characterize the DNN as a function of the context size and number of hidden layers, and compare it against two GMM articulatory synthesizers: a baseline model that performs a simple frame-by-frame mapping, and a second model that also performs trajectory optimization. Our results show that a DNN with a 60-ms context window and two 512-neuron hidden layers can synthesize speech at four times the frame rate, comparable to frame-by-frame mappings, while improving on the accuracy of trajectory optimization (a 9.8% reduction in Mel cepstral distortion). Subjective evaluation through pairwise listening tests also shows a strong preference for the DNN articulatory synthesizer over GMM trajectory optimization.
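To make the tapped-delay idea concrete, the following is a minimal NumPy sketch of how a context window can be stacked into a single input vector and passed through a network with two 512-unit hidden layers. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the frame step, the feature dimensions, the activation function, or whether the window is causal. Here the window is assumed to be causal (past frames only), the frame step is assumed to be 5 ms (so 60 ms corresponds to 12 taps), and the 12 articulatory channels and 25 acoustic coefficients are hypothetical. The weights are random, so the output only demonstrates shapes and data flow.

```python
import numpy as np


def tapped_delay(frames, n_taps):
    """Stack each frame with its n_taps - 1 predecessors (causal window).

    frames: (T, D) array of articulatory features.
    Returns a (T, n_taps * D) array. The first frames are padded by
    repeating frame 0, so every output row has a full window.
    """
    T, _ = frames.shape
    pad = np.repeat(frames[:1], n_taps - 1, axis=0)
    padded = np.vstack([pad, frames])
    # Row t of the result is [x[t - n_taps + 1], ..., x[t]] flattened.
    return np.hstack([padded[k:k + T] for k in range(n_taps)])


def init_mlp(sizes, rng):
    """Random weights and zero biases for a fully connected network."""
    weights = [rng.normal(0.0, 0.1, size=(m, n))
               for m, n in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(n) for n in sizes[1:]]
    return weights, biases


def mlp_forward(x, weights, biases):
    """Forward pass: tanh hidden layers (an assumption), linear output."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.tanh(h @ W + b)
    return h @ weights[-1] + biases[-1]


rng = np.random.default_rng(0)
ema = rng.normal(size=(100, 12))  # hypothetical: 100 frames, 12 channels
n_taps = 12                       # 60 ms at an assumed 5 ms frame step

x = tapped_delay(ema, n_taps)
weights, biases = init_mlp([x.shape[1], 512, 512, 25], rng)
acoustic = mlp_forward(x, weights, biases)

print(x.shape, acoustic.shape)  # (100, 144) (100, 25)
```

Because each output frame depends only on the current and preceding input frames, a mapping of this kind can run frame by frame as data arrives, which is the property the abstract contrasts with utterance-level trajectory optimization.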
Keywords
Articulatory synthesis, Electromagnetic articulography, Deep learning, Gaussian mixture models