End-to-End Speech Translation with Mutual Knowledge Distillation

ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024

Abstract
Multi-task learning (MTL) is widely used to improve end-to-end speech translation (ST); it implicitly transfers knowledge from auxiliary automatic speech recognition (ASR) and/or machine translation (MT) to ST through shared modules. In this study, we find that triple-task MTL (ST+MT+ASR) suffers from a knowledge transfer limitation that leads to performance stagnation compared with dual-task MTL (ST+MT or ST+ASR). To address this issue, we propose a simple yet effective method, ST-MKD (Speech Translation with Mutual Knowledge Distillation). In ST-MKD, we employ a mutual knowledge distillation framework to mutually enhance dual-task MTL models with different knowledge bases, and explore regularization to maintain the consistency of the task representations. Experiments on the ST benchmark dataset MuST-C show that ST-MKD significantly outperforms a strong MTL baseline and achieves state-of-the-art performance under three speech pre-training settings. Further analyses confirm that our approach effectively overcomes the knowledge transfer limitation of triple-task MTL.
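The abstract does not spell out the training objective, but a mutual knowledge distillation setup between two dual-task models can be sketched roughly as follows. This is an illustrative sketch only: the model handles (model_st_mt, model_st_asr), the weighting kd_weight, the temperature, and the loss form (cross-entropy plus a symmetric mimicry KL term) are assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): mutual knowledge distillation
# between two dual-task ST models, each trained on its own ST cross-entropy
# plus a KL term pulling its output distribution toward the other model's.
import torch
import torch.nn.functional as F


def mutual_kd_step(model_st_mt, model_st_asr, speech_input, st_targets,
                   kd_weight: float = 1.0, temperature: float = 1.0):
    """One training step of mutual distillation on the shared ST task.

    Assumed interfaces: each model maps speech features to token-level
    ST logits of shape (batch, tgt_len, vocab); st_targets holds target
    token indices of shape (batch, tgt_len).
    """
    logits_a = model_st_mt(speech_input)   # ST logits from the ST+MT model
    logits_b = model_st_asr(speech_input)  # ST logits from the ST+ASR model

    # Standard ST losses against the reference translation.
    ce_a = F.cross_entropy(logits_a.transpose(1, 2), st_targets)
    ce_b = F.cross_entropy(logits_b.transpose(1, 2), st_targets)

    # Mimicry terms: each model matches the other's (detached) distribution,
    # so gradients only flow through the "student" side of each KL term.
    log_pa = F.log_softmax(logits_a / temperature, dim=-1)
    log_pb = F.log_softmax(logits_b / temperature, dim=-1)
    kd_a = F.kl_div(log_pa, log_pb.detach(), reduction="batchmean", log_target=True)
    kd_b = F.kl_div(log_pb, log_pa.detach(), reduction="batchmean", log_target=True)

    loss = ce_a + ce_b + kd_weight * (temperature ** 2) * (kd_a + kd_b)
    return loss
```

In this reading, each dual-task model acts as both teacher and student on the shared ST task, which is one common way to realize "mutual" distillation; the paper's actual objective, including its representation-consistency regularization, may differ.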
Keywords
Speech translation, Mutual knowledge distillation, Multi-task learning