
Neural Speech Synthesis with Style Intensity Interpolation: A Perceptual Analysis

HRI '20: ACM/IEEE International Conference on Human-Robot Interaction, Cambridge, United Kingdom, March 2020

Abstract
State-of-the-art speech synthesis has considerably reduced the gap between synthetic and human speech at the perceptual level. However, the impact of speech style control on perception is not well understood. In this paper, we propose a method to analyze how controlling the parameters of a TTS system affects the perception of the generated sentence. This is done through visualization and analysis of listening test results. To this end, we train a speech synthesis system on several discrete categories of speech style. Each style is encoded in the network using a one-hot representation. After training, we interpolate between the vectors representing each style. A perception test showed that, despite being trained only on discrete categories of data, the network is capable of generating intermediate intensity levels between neutral and a given speech style.
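The interpolation step described in the abstract can be illustrated with a minimal sketch. It assumes linear interpolation between the one-hot vector for the neutral style and the one-hot vector for a target style; the abstract does not specify the interpolation scheme, and the style names, the `STYLES` list, and the function names below are hypothetical, chosen only for illustration.

```python
from typing import List

# Hypothetical style categories; the abstract does not list the actual ones.
STYLES = ["neutral", "happy", "sad"]


def one_hot(style: str, styles: List[str] = STYLES) -> List[float]:
    """Return the one-hot vector for a discrete style category."""
    if style not in styles:
        raise ValueError(f"unknown style: {style}")
    return [1.0 if s == style else 0.0 for s in styles]


def interpolate_style(target: str, intensity: float,
                      neutral: str = "neutral",
                      styles: List[str] = STYLES) -> List[float]:
    """Linearly blend the neutral and target one-hot vectors.

    intensity = 0.0 gives the neutral vector, 1.0 gives the target vector,
    and values in between give an intermediate style code (an assumption
    about how the intermediate intensity levels are produced).
    """
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be in [0, 1]")
    start = one_hot(neutral, styles)
    end = one_hot(target, styles)
    return [(1.0 - intensity) * a + intensity * b for a, b in zip(start, end)]


if __name__ == "__main__":
    # Style codes at five intensity levels between neutral and "happy".
    for level in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(level, interpolate_style("happy", level))
```

In such a setup, the blended vector would be fed to the trained network in place of the one-hot style input, so the model sees style codes at inference time that it never saw during training.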
Keywords
Deep Learning, Speech Synthesis, Style Interpolation, Perception