Pitch Contour Separation from Overlapping Speech.

Interspeech (2021)

Abstract
In everyday conversation, speakers' utterances often overlap. For conversation corpora recorded in diverse environments, pitch extraction results in the overlapping parts may be incorrect. The goal of this study is to establish a technique for separating each speaker's pitch contour from overlapping speech in conversation. The proposed method estimates the statistically most plausible fo contour from the spectrogram of the overlapping speech, given information identifying the speaker to extract. Visual inspection of the separation results showed that the proposed model was able to extract accurate fo contours of the specified speakers from overlapping speech. Applying this method reduced voicing decision errors and gross pitch errors by 63% compared to simple pitch extraction from overlapping speech.
Keywords
prosody, conversation, source separation, neural fo model
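
The abstract reports its result in terms of voicing decision errors and gross pitch errors. As a rough illustration of how such measures are commonly computed, the sketch below follows the usual conventions from the pitch-tracking literature, which may differ from the paper's exact definitions: voicing decision error is the fraction of frames whose voiced/unvoiced label disagrees with the reference, and gross pitch error is the fraction of frames voiced in both tracks whose estimate deviates from the reference by more than 20%. The function name `pitch_errors`, the use of 0 Hz to mark unvoiced frames, and the 20% threshold are assumptions made for this example.

```python
def pitch_errors(ref_fo, est_fo, gross_threshold=0.2):
    """Return (vde, gpe) for two equal-length per-frame fo tracks in Hz.

    Assumed convention: a value of 0 marks an unvoiced frame.
    vde: fraction of all frames with a voiced/unvoiced mismatch.
    gpe: fraction of frames voiced in both tracks whose relative
         deviation from the reference exceeds gross_threshold.
    """
    if len(ref_fo) != len(est_fo):
        raise ValueError("tracks must have the same number of frames")
    if not ref_fo:
        raise ValueError("tracks must not be empty")

    voicing_mismatches = 0
    both_voiced = 0
    gross_errors = 0
    for ref, est in zip(ref_fo, est_fo):
        ref_voiced = ref > 0
        est_voiced = est > 0
        if ref_voiced != est_voiced:
            voicing_mismatches += 1
        elif ref_voiced:
            # Both tracks are voiced here, so the frame counts toward GPE.
            both_voiced += 1
            if abs(est - ref) / ref > gross_threshold:
                gross_errors += 1

    vde = voicing_mismatches / len(ref_fo)
    # With no frame voiced in both tracks, GPE is reported as 0.0 here.
    gpe = gross_errors / both_voiced if both_voiced else 0.0
    return vde, gpe


# Hypothetical five-frame tracks, for illustration only (not data from the paper).
reference = [0, 100, 100, 200, 0]
estimate = [0, 100, 150, 0, 120]
vde, gpe = pitch_errors(reference, estimate)
print(vde, gpe)
```

In this toy input, two of the five frames disagree on voicing, and one of the two frames voiced in both tracks is off by 50%, so the sketch would report a voicing decision error of 0.4 and a gross pitch error of 0.5.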