Neural2Speech: A Transfer Learning Framework for Neural-Driven Speech Reconstruction
ICASSP 2024 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Abstract
Reconstructing natural speech from neural activity is vital for enabling
direct communication via brain-computer interfaces. Previous efforts have
explored converting neural recordings into speech with complex deep neural
network (DNN) models trained on extensive neural data, which are
resource-intensive to collect under typical clinical constraints. Achieving
satisfactory reconstruction from limited-scale neural recordings, however,
has remained challenging, mainly because of the complexity of speech
representations and the scarcity of neural data. To overcome these
challenges, we propose Neural2Speech, a transfer learning framework for
neural-driven speech reconstruction with two distinct training phases.
First, a speech autoencoder is pre-trained on readily available speech
corpora to decode speech waveforms from encoded speech representations.
Second, a lightweight adaptor is trained on the small-scale neural
recordings to align neural activity with the speech representations used
for decoding. Remarkably, Neural2Speech demonstrates the feasibility of
neural-driven speech reconstruction with only 20 minutes of intracranial
data, significantly outperforming existing baselines in speech fidelity
and intelligibility.
Keywords
Brain-computer interface, Electrocorticography, Speech reconstruction, Transfer learning