Locality enhanced dynamic biasing and sampling strategies for contextual ASR

Md Asif Jalal, Pablo Peso Parada, George Pavlidis, Vasileios Moschopoulos, Karthikeyan Saravanan, Chrysovalantis-Giorgos Kontoulis, Jisi Zhang, Anastasios Drosou, Gil Ho Lee, Jungin Lee, Seokyeong Jung

2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (2024)

Abstract
Automatic Speech Recognition (ASR) still faces challenges when recognizing time-variant rare phrases. Contextual biasing (CB) modules bias the ASR model towards such contextually relevant phrases. During training, a list of biasing phrases is selected from a large pool of phrases following a sampling strategy. In this work, we first analyse different sampling strategies to provide insights into the training of CB for ASR, using correlation plots between the bias embeddings at various training stages. Second, we introduce neighbourhood attention (NA), which localizes self-attention (SA) to the nearest neighbouring frames to further refine the CB output. The results show that the proposed approach provides on average a 25.84% improvement on the LibriSpeech sets and rare-word evaluation compared to the baseline.
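The abstract gives no implementation details for the NA module, so the following is only a minimal, generic sketch of the idea of restricting self-attention to nearby frames, not the authors' method. The function name `neighbourhood_attention` and the parameter `radius` are hypothetical, and the learned query/key/value projections of a real attention layer are replaced by the identity for brevity.

```python
import numpy as np

def neighbourhood_attention(x, radius):
    """Single-head self-attention in which frame t attends only to
    frames within `radius` positions of t (hypothetical sketch).

    x: array of shape (num_frames, dim), e.g. a CB module's output.
    Returns the refined frames and the attention weights.
    """
    num_frames, dim = x.shape
    # Scaled dot-product scores; identity Q/K/V projections are an assumption.
    scores = x @ x.T / np.sqrt(dim)
    # Mask out every pair of frames further apart than `radius`.
    idx = np.arange(num_frames)
    outside = np.abs(idx[:, None] - idx[None, :]) > radius
    scores = np.where(outside, -np.inf, scores)
    # Row-wise softmax; the diagonal is never masked, so each row max is finite.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ x, weights

# Usage: 6 frames of 4-dimensional features, each attending to itself
# and its immediate left and right neighbours.
frames = np.random.default_rng(0).normal(size=(6, 4))
refined, attn = neighbourhood_attention(frames, radius=1)
```

With `radius` set to at least `num_frames - 1` the mask is empty and the sketch reduces to ordinary full self-attention, which is one way to see NA as a localized special case of SA.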
Keywords
Contextual Biasing, ASR, Local Attention, Adaptation