Speech Recognition Based on Deep Tensor Neural Network and Multifactor Feature

2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2019

Abstract
This paper presents a speech recognition system based on a deep tensor neural network that takes a multifactor feature as the input to its acoustic model. First, a deep neural network is trained on the MOCHA database [1] to estimate articulatory features from input speech. Mel-frequency cepstral coefficients (MFCCs) are then combined with the estimated articulatory features to form the multifactor feature. A deep tensor neural network, which involves tensor interactions among neurons, serves as the acoustic model. Speech recognition results indicate that the multifactor feature improves recognition performance under both clean and noisy background conditions, and that, owing to its tensor interactions, the deep tensor neural network models multifactor features better than a conventional deep neural network.
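The two ideas in the abstract, a multifactor feature built from MFCCs plus articulatory features and a layer with tensor interactions, can be illustrated with a small sketch. The abstract does not specify the layer form, so this assumes one common formulation: the input is projected into two sigmoid subspaces, and their flattened outer product is passed on. All function names, the feature dimensions (13 MFCCs, 14 articulatory values), and the random weights are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine_sigmoid(weights, bias, x):
    """Return sigmoid(W x + b), with W given as a list of rows."""
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def multifactor_feature(mfcc, articulatory):
    """Concatenate per-frame MFCCs with estimated articulatory features."""
    return list(mfcc) + list(articulatory)


def tensor_layer(x, w1, b1, w2, b2):
    """Project x into two subspaces and return their flattened outer product."""
    h1 = affine_sigmoid(w1, b1, x)
    h2 = affine_sigmoid(w2, b2, x)
    # Every unit of h1 multiplies every unit of h2 (the tensor interaction).
    return [a * b for a in h1 for b in h2]


def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
mfcc = [rng.gauss(0, 1) for _ in range(13)]          # assumed: 13 MFCCs per frame
articulatory = [rng.gauss(0, 1) for _ in range(14)]  # assumed: 14 articulatory values
x = multifactor_feature(mfcc, articulatory)

w1, b1 = random_matrix(4, len(x), rng), [0.0] * 4    # first subspace, 4 units
w2, b2 = random_matrix(3, len(x), rng), [0.0] * 3    # second subspace, 3 units
v = tensor_layer(x, w1, b1, w2, b2)
print(len(x), len(v))
```

Running the sketch prints `27 12`: the 27-dimensional multifactor frame is mapped to 4 x 3 = 12 pairwise products. In a conventional layer the two groups of units would only be summed into the next layer, whereas here each product lets one factor of the input modulate the other.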
Keywords
articulatory feature, multifactor feature, deep tensor neural network, tensor interactions, speech recognition results, speech recognition system, acoustic model, Mel frequency cepstrum coefficients, speech recognition performance improvement, noisy background conditions