LeanSpeech: The Microsoft Lightweight Speech Synthesis System for Limmits Challenge 2023

Chen Zhang, Shubham Bansal, Aakash Lakhera, Jinzhu Li, Gang Wang, Sandeepkumar Satpal, Sheng Zhao, Lei He

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2023)

Abstract
This paper describes LeanSpeech, the Microsoft Text-to-Speech (TTS) system for the LIMMITS (Lightweight, Multi-speaker, Multi-lingual Indic TTS) Challenge 2023, which is part of ICASSP 2023 and aims to encourage the advance of TTS in Indian languages. We propose a lightweight encoder-decoder acoustic model (AM) composed of 1-D convolution and LSTM blocks, which is trained with knowledge distillation from a multi-speaker, multi-lingual teacher model, DelightfulTTS [1]. The speech corpus is reprocessed and used in both AM training and vocoder fine-tuning. In Track-2 of the challenge, our system achieves a MOS of 4.56 and an SMOS of 3.98, which indicates the efficiency of the proposed model and training strategy.
Keywords
Text to Speech, Lightweight, Multi-speaker, Multi-lingual, WaveGlow
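
The abstract mentions training the lightweight acoustic model with knowledge distillation from a teacher model but does not state the loss. As a rough illustration only, the sketch below shows one common way such an objective can be formed: the student's predicted mel-spectrogram is compared against both the teacher's prediction and the ground-truth mel-spectrogram, and the two errors are blended. The L1 distance, the weight `alpha`, the list-of-frames representation, and the function names `l1_loss` and `distillation_loss` are all assumptions made for this example and are not taken from the paper.

```python
from typing import List

# Hypothetical representation: a mel-spectrogram is a list of frames,
# and each frame is a list of mel-bin values.
Mel = List[List[float]]


def l1_loss(pred: Mel, target: Mel) -> float:
    """Mean absolute error over all frames and mel bins."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of frames")
    total = 0.0
    count = 0
    for p_frame, t_frame in zip(pred, target):
        if len(p_frame) != len(t_frame):
            raise ValueError("frames must have the same number of mel bins")
        for p, t in zip(p_frame, t_frame):
            total += abs(p - t)
            count += 1
    if count == 0:
        raise ValueError("inputs must not be empty")
    return total / count


def distillation_loss(student: Mel, teacher: Mel, ground_truth: Mel,
                      alpha: float = 0.5) -> float:
    """Blend of student-vs-teacher and student-vs-ground-truth L1 errors.

    alpha is an assumed weighting: 1.0 uses only the teacher's output,
    0.0 uses only the ground-truth mel-spectrogram.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    teacher_term = l1_loss(student, teacher)
    truth_term = l1_loss(student, ground_truth)
    return alpha * teacher_term + (1.0 - alpha) * truth_term
```

In an actual training setup the same idea would be applied to tensors in a deep learning framework, with the teacher's parameters kept frozen while only the student is updated.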