THE TUM+TUT+KUL APPROACH TO THE 2ND CHIME CHALLENGE: MULTI-STREAM ASR EXPLOITING BLSTM NETWORKS AND SPARSE NMF

International Conference on Acoustics, Speech and Signal Processing (2013)

Citations: 34 | Views: 56
Abstract
We present our joint contribution to the 2nd CHiME Speech Separation and Recognition Challenge. Our system combines speech enhancement by supervised sparse non-negative matrix factorisation (NMF) with a multi-stream speech recognition system. In addition to a conventional MFCC HMM recogniser, predictions by a bidirectional Long Short-Term Memory recurrent neural network (BLSTM-RNN) and from non-negative sparse classification (NSC) are integrated into a triple-stream recogniser. Experiments are carried out on the small vocabulary and the medium vocabulary recognition tasks of the Challenge. Consistent improvements over the Challenge baselines demonstrate the efficacy of the proposed system, resulting in an average word accuracy of 92.8% in the small vocabulary task and an average word error rate of 41.42% in the medium vocabulary task.
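The enhancement stage named in the abstract, supervised sparse NMF, can be illustrated with a minimal sketch. The abstract does not give the cost function, dictionary sizes, sparsity weight, or reconstruction rule, so everything below is an assumption made for illustration: a fixed dictionary built from separate speech and noise atoms, a KL-divergence cost with an L1 penalty on the activations, multiplicative updates, and a Wiener-style soft mask for reconstruction. The function names `sparse_nmf_activations` and `enhance` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def sparse_nmf_activations(V, W, sparsity=0.1, n_iter=100, eps=1e-12, seed=0):
    """Estimate non-negative activations H such that V is approximately W @ H.

    The dictionary W is kept fixed (the "supervised" part). The cost assumed
    here is KL divergence plus an L1 penalty on H weighted by `sparsity`,
    minimised with multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    col_sums = W.sum(axis=0)[:, None]  # W^T 1, one value per dictionary atom
    for _ in range(n_iter):
        ratio = V / (W @ H + eps)
        H *= (W.T @ ratio) / (col_sums + sparsity)
    return H


def enhance(V, W_speech, W_noise, **kwargs):
    """Return a speech magnitude estimate and the soft mask used to obtain it."""
    W = np.hstack([W_speech, W_noise])
    H = sparse_nmf_activations(V, W, **kwargs)
    k = W_speech.shape[1]
    speech = W_speech @ H[:k]   # part of the model explained by speech atoms
    noise = W_noise @ H[k:]     # part of the model explained by noise atoms
    mask = speech / (speech + noise + 1e-12)
    return mask * V, mask


# Toy data standing in for magnitude spectrograms (20 bins, 30 frames).
rng = np.random.default_rng(1)
W_s = rng.random((20, 5))   # hypothetical speech dictionary
W_n = rng.random((20, 3))   # hypothetical noise dictionary
V = W_s @ rng.random((5, 30)) + W_n @ rng.random((3, 30))
S_hat, mask = enhance(V, W_s, W_n)
```

In a setup like this, the enhanced spectrogram `S_hat` would feed feature extraction for the recogniser. The speech-atom activations could in principle also serve as evidence for a sparse-classification stream, but how the paper derives its NSC stream is not stated in the abstract.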