Domain Adaptation Using Density Ratio Approach and CTC Decoder for Streaming Speech Recognition

Tatsunari Takagi, Yukoh Wakabayashi, Atsunori Ogawa, Norihide Kitaoka

2023 10th International Conference on Advanced Informatics: Concept, Theory and Application (ICAICTA), 2023

Abstract
This study proposes a domain adaptation method for streaming automatic speech recognition (ASR) of Japanese speech. The development of ASR technology has led to its use in various applications; however, recognition accuracy declines when there is a mismatch between the domain of the training data and that of the speech to be recognized (i.e., the target data). In particular, the accuracy of online speech recognition, also known as streaming ASR, which is becoming popular in real-world applications, is degraded by such differences. Since creating custom training data for specialized applications is costly, domain adaptation is often used to transform existing training corpora. In this study, we propose a method of domain adaptation for streaming ASR of Japanese speech, utilizing the Density Ratio Approach (DRA), which involves the use of language models (LMs) trained with large amounts of text data. To facilitate recognition of streaming speech data, a CTC decoder is used to successively replace linguistic information on a frame-by-frame basis. Recognition results are then obtained through greedy search. Taking into consideration CTC's assumption of conditional independence, we used 1-gram, 2-gram, and 3-gram LMs for the subtraction and addition of language information during domain adaptation with DRA, that is, for replacement of the selected linguistic information. Experimental results confirm that the proposed method enhances speech recognition accuracy when there is a mismatch between the data in the training and target domains.
Keywords
Streaming ASR, CTC, DRA, Domain Adaptation
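
The frame-by-frame replacement of linguistic information described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes unigram LMs only (the paper also uses 2-gram and 3-gram LMs), leaves the CTC blank score untouched, and uses hypothetical names and weights (`dra_greedy_decode`, `src_weight`, `tgt_weight`). For each frame, the source-domain LM log-probability is subtracted from the CTC log-posterior and the target-domain LM log-probability is added, after which a greedy search picks the best token and the usual CTC collapse is applied.

```python
import math

BLANK = 0  # assumed index of the CTC blank symbol


def dra_greedy_decode(log_posteriors, src_lm, tgt_lm, src_weight=0.5, tgt_weight=0.5):
    """Greedy CTC decoding with per-frame density-ratio rescoring.

    log_posteriors: list of frames; each frame is a list of log-probabilities
        over token indices (index BLANK is the blank symbol).
    src_lm, tgt_lm: unigram log-probabilities indexed by token (assumption:
        unigram only; the blank entry is never read).
    src_weight, tgt_weight: hypothetical interpolation weights.
    """
    best_path = []
    for frame in log_posteriors:
        scores = []
        for token, log_p in enumerate(frame):
            if token == BLANK:
                # Assumption: the blank score is not rescored by the LMs.
                scores.append(log_p)
            else:
                # Subtract source-domain LM score, add target-domain LM score.
                scores.append(log_p - src_weight * src_lm[token] + tgt_weight * tgt_lm[token])
        # Greedy search: keep the best-scoring token of this frame.
        best_path.append(max(range(len(scores)), key=scores.__getitem__))

    # CTC collapse: merge consecutive repeats, then drop blanks.
    decoded = []
    previous = None
    for token in best_path:
        if token != previous and token != BLANK:
            decoded.append(token)
        previous = token
    return decoded


# Toy vocabulary: 0 = blank, 1 = "A", 2 = "B" (illustrative values only).
speech = [math.log(p) for p in (0.10, 0.50, 0.40)]   # acoustics slightly favor "A"
silence = [math.log(p) for p in (0.98, 0.01, 0.01)]  # blank dominates
frames = [speech, speech, silence, speech]
source_lm = [0.0, math.log(0.8), math.log(0.2)]      # training domain favors "A"
target_lm = [0.0, math.log(0.2), math.log(0.8)]      # target domain favors "B"

adapted = dra_greedy_decode(frames, source_lm, target_lm)
unadapted = dra_greedy_decode(frames, source_lm, target_lm, src_weight=0.0, tgt_weight=0.0)
print(adapted, unadapted)
```

In this toy setup, setting both weights to zero recovers plain greedy CTC decoding, so the two results show how replacing the source-domain prior with the target-domain prior can change the per-frame decision.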