Decoupling word-pair distance and co-occurrence information for effective long history context language modeling

IEEE Transactions on Audio, Speech, and Language Processing (2015)

Abstract
In this paper, we propose using the distance and co-occurrence information of word pairs to improve language modeling. We show empirically that, for history-context sizes of up to ten words, the extracted distance and co-occurrence information complements the n-gram language model well, for which learning long-history contexts is inherently difficult. Evaluated on the Wall Street Journal and Switchboard corpora, the proposed model reduces trigram-model perplexity by up to 11.2% and 6.5%, respectively. Compared with the distant-bigram model and the trigger model, our model captures far-context information more effectively, as verified in terms of perplexity and computational efficiency, i.e., fewer free parameters to be fine-tuned. Experiments applying the proposed model to speech recognition, text classification, and word prediction tasks showed improved performance.
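The abstract describes extracting distance and co-occurrence statistics for word pairs within a history window of up to ten words. As a minimal illustrative sketch (not the paper's actual implementation — the function name, data structure, and toy corpus are assumptions), such statistics could be gathered as follows:

```python
from collections import defaultdict

def word_pair_stats(tokens, max_distance=10):
    """Count (history_word, target_word) co-occurrences at each distance.

    counts[(w_hist, w_tgt)][d] is the number of times w_hist occurred
    exactly d positions before w_tgt, for d up to max_distance.
    This is only a sketch of the kind of statistic the abstract names,
    not the paper's model.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for i, target in enumerate(tokens):
        # Look back at most max_distance words into the history.
        for d in range(1, min(i, max_distance) + 1):
            counts[(tokens[i - d], target)][d] += 1
    return counts

corpus = "the cat sat on the mat because the cat was tired".split()
stats = word_pair_stats(corpus)
# ("the", "cat") occurs at distance 1 twice and at distance 4 once
# in this toy corpus.
```

A full model would then combine such distance-binned co-occurrence counts with an n-gram model's probabilities, which is where the abstract reports the perplexity reductions.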
Keywords
language modeling,speech recognition,text categorization