LeanSpeech: The Microsoft Lightweight Speech Synthesis System for Limmits Challenge 2023

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)(2023)

Abstract
This paper describes LeanSpeech, the Microsoft Text-to-Speech (TTS) system for the LIMMITS (Lightweight, Multi-speaker, Multi-lingual Indic TTS) Challenge 2023, which is part of ICASSP 2023 and aims to encourage the advancement of TTS in Indian languages. We propose a lightweight encoder-decoder acoustic model (AM) composed of 1-D convolution and LSTM blocks, which is trained with knowledge distillation from a multi-speaker, multi-lingual teacher model, DelightfulTTS [1]. The speech corpus is reprocessed and used in both AM training and vocoder fine-tuning. In Track-2 of the challenge, our system achieves a MOS of 4.56 and an SMOS of 3.98, which indicates the efficiency of the proposed model and training strategy.
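The abstract does not state the training objective used for distillation. As a minimal sketch of one common form, assume the student acoustic model regresses mel-spectrogram frames toward both the teacher's output and the recorded speech, with the two terms mixed by a weight. The names `l1_distance`, `distillation_loss`, and `alpha`, and the choice of an L1 distance, are illustrative assumptions and are not taken from the paper.

```python
from typing import List

# A mel-spectrogram as a list of frames, each frame a list of mel-bin values.
Mel = List[List[float]]


def l1_distance(a: Mel, b: Mel) -> float:
    """Mean absolute difference between two equally shaped mel-spectrograms."""
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        raise ValueError("mel-spectrograms must have the same shape")
    count = sum(len(frame) for frame in a)
    if count == 0:
        raise ValueError("mel-spectrograms must not be empty")
    total = sum(abs(x - y) for fa, fb in zip(a, b) for x, y in zip(fa, fb))
    return total / count


def distillation_loss(student: Mel, teacher: Mel, recorded: Mel,
                      alpha: float = 0.5) -> float:
    """Hypothetical mixed objective (an assumption, not the paper's loss).

    alpha weights the distance to the teacher's output;
    (1 - alpha) weights the distance to the recorded speech.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    to_teacher = l1_distance(student, teacher)
    to_recorded = l1_distance(student, recorded)
    return alpha * to_teacher + (1.0 - alpha) * to_recorded


if __name__ == "__main__":
    # Toy 2-frame, 2-bin spectrograms, invented purely for illustration.
    student = [[0.0, 1.0], [2.0, 3.0]]
    teacher = [[0.0, 1.0], [2.0, 4.0]]
    recorded = [[2.0, 1.0], [2.0, 3.0]]
    print(distillation_loss(student, teacher, recorded, alpha=0.5))
```

Setting `alpha` to 1.0 trains the student on the teacher's output alone, while 0.0 reduces to ordinary supervised training on the recordings. Which setting, if either, matches the system described here is not stated in the abstract.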
Key words
Text to Speech, Lightweight, Multi-speaker, Multi-lingual, WaveGlow