Perceived naturalness of electrolaryngeal speech produced using sEMG-controlled vs. manual pitch modulation

17th Annual Conference of the International Speech Communication Association (INTERSPEECH 2016), Vols 1-5: Understanding Speech Processing in Humans and Machines (2016)

Abstract
Producing speech with natural prosodic patterns is an ongoing challenge for users of electrolaryngeal (EL) speech. This study describes speech produced using a method currently in development, wherein a prosodic pattern is derived from skin surface electromyographic (sEMG) signals recorded from under the chin (submental surface). Eight laryngectomees who currently use a TruTone(TM) EL as their primary or backup mode of speech provided samples of EL speech in two modes: conventional thumb-pressure pitch-modulated control (represented by the TruTone(TM) EL; Griffin Laboratories, CA, U.S.A.) and sEMG-based pitch-modulated control (EMG-EL). Ratings of perceived naturalness were obtained from ten listeners unfamiliar with EL speech. Listener ratings indicated that five speakers produced equally natural speech using both devices, and three produced significantly more natural speech using the EMG-EL than the TruTone(TM) EL. Mean fundamental frequency (f0) was similar within speakers for both modes; however, mean f0 range and standard deviation were significantly larger for the EMG-EL than for the TruTone(TM) EL, despite both devices having a similar potential f0 range. This study showed that, for some EL users, the EMG-EL provides an intuitive means of controlling f0-based prosodic patterns that sound more natural than those produced with push-button control.
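
The f0 measures named above (mean, range, and standard deviation) can be illustrated with a minimal sketch. It assumes an f0 contour is already available as a list of per-frame values in Hz; the function name `f0_summary`, the convention that 0 or None marks an unvoiced frame, and the two example contours are hypothetical and are not taken from the study.

```python
import statistics


def f0_summary(f0_hz):
    """Summarize an f0 contour given as per-frame values in Hz.

    Frames equal to 0 or None are treated as unvoiced and skipped
    (an assumption of this sketch, not a detail from the study).
    """
    voiced = [f for f in f0_hz if f]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    return {
        "mean_hz": statistics.fmean(voiced),
        "range_hz": max(voiced) - min(voiced),
        "sd_hz": statistics.stdev(voiced),  # sample standard deviation
    }


# Made-up contours for illustration only; not data from the study.
flat = [100.0, 101.0, 99.0, 100.0]
varied = [80.0, 120.0, 90.0, 110.0]

flat_stats = f0_summary(flat)
varied_stats = f0_summary(varied)
```

The two made-up contours share the same mean f0 but differ in range and standard deviation, which is the kind of pattern the abstract reports: similar mean f0 across modes, with larger f0 variation in one of them.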
Keywords
alaryngeal speech, electrolaryngeal speech, fundamental frequency, laryngectomy, naturalness, prosody