Evaluation Of A Silent Speech Interface Based On Magnetic Sensing And Deep Learning For A Phonetically Rich Vocabulary

José A. González, Lam Aun Cheah, Phil D. Green, James M. Gilbert, Stephen R. Ell, Roger K. Moore, Ed Holdsworth

18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017), Vols 1-6: Situated Interaction (2017)

Abstract

To help people who have lost their voice following total laryngectomy, we present a speech restoration system that produces audible speech from articulator movement. The speech articulators are monitored by sensing changes in magnetic field caused by movements of small magnets attached to the lips and tongue. Then, articulator movement is mapped to a sequence of speech parameter vectors using a transformation learned from simultaneous recordings of speech and articulatory data. In this work, this transformation is performed using a type of recurrent neural network (RNN) with fixed latency, which is suitable for real-time processing. The system is evaluated on a phonetically rich database with simultaneous recordings of speech and articulatory data made by non-impaired subjects. Experimental results show that our RNN-based mapping obtains more accurate speech reconstructions (evaluated using objective quality metrics and a listening test) than articulatory-to-acoustic mappings using Gaussian mixture models (GMMs) or deep neural networks (DNNs). Moreover, our fixed-latency RNN architecture provides comparable performance to an utterance-level batch mapping using bidirectional RNNs (BiRNNs).
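
The abstract does not spell out how the fixed latency is obtained, so the following is only a minimal sketch of one common way to get it: a forward-only RNN whose training targets are shifted by a fixed number of frames, so that each speech frame is emitted after a bounded amount of future sensor context has been seen. The function names (`make_delayed_pairs`, `rnn_stream`), the plain tanh recurrent cell, and all array shapes are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def make_delayed_pairs(sensor, speech, delay):
    """Align sensor frame t with speech frame t - delay (hypothetical helper).

    sensor: (T, D) array of articulatory features.
    speech: (T, O) array of speech parameter vectors.
    Returns inputs (T + delay, D), targets (T + delay, O) and a boolean
    mask marking the output steps that have a real speech target.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    if len(sensor) != len(speech):
        raise ValueError("sensor and speech must have the same number of frames")
    # Repeat the last sensor frame so the final speech frames still get emitted.
    pad = np.repeat(sensor[-1:], delay, axis=0)
    inputs = np.concatenate([sensor, pad], axis=0)
    # The first `delay` output steps have no speech frame to predict yet.
    blank = np.zeros((delay, speech.shape[1]))
    targets = np.concatenate([blank, speech], axis=0)
    mask = np.concatenate([np.zeros(delay, dtype=bool),
                           np.ones(len(speech), dtype=bool)])
    return inputs, targets, mask


def rnn_stream(inputs, W_in, W_rec, W_out):
    """Run a simple forward-only tanh RNN one frame at a time.

    Only past and current inputs are used at each step, so the output
    delay is set entirely by how the training targets were shifted.
    """
    h = np.zeros(W_rec.shape[0])
    outputs = []
    for x in inputs:
        h = np.tanh(W_in @ x + W_rec @ h)
        outputs.append(W_out @ h)
    return np.stack(outputs)
```

With a shift of `delay` frames, the output produced at step t is trained to match speech frame t - delay, so the latency stays constant regardless of utterance length. A bidirectional RNN, by contrast, needs the whole utterance before it can produce any output.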

Keywords

speech rehabilitation, articulatory-to-acoustic mapping, recurrent neural network, speech synthesis
