Improving Chinese spell checking with bidirectional LSTMs and confusionset-based decision network

Neural Computing and Applications (2023)

Abstract
Chinese spell checking is the task of detecting and correcting Chinese spelling errors, and it is important for natural language understanding. Existing studies on Chinese spell checking are mainly based on n-gram language models and neural network models. However, an n-gram model must trade off the value of n against storage resources, and most neural networks cannot efficiently handle cases in which correct and incorrect characters are severely unevenly distributed. This limits spell checking in text application scenarios that contain many oral expressions. To address these issues, a confusionset-guided decision network for spoken Chinese spell checking is proposed. By using a confusionset to generate the candidate set, the model can locate wrong characters with a decision network, which makes the bidirectional long short-term memory (BiLSTM) network pay more attention to the characteristics of the wrong characters. To verify the correctness and effectiveness of the model, extensive experiments were carried out on a logistics question-and-answer corpus and the SIGHAN Bake-off dataset. Experimental results show that the model is efficient: it is effective at spell checking for spoken Chinese, it outperforms all competitor models, and it can efficiently correct wrong characters in real scenarios.
Keywords
Chinese spell checking, Attention, Seriously uneven distribution, Oral expression
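
The abstract mentions using a confusionset (a mapping from each character to characters that are easily confused with it) to generate a candidate set. The following is only a minimal sketch of that general idea, not the paper's implementation: the toy `CONFUSION_SET` entries and the `generate_candidates` helper are hypothetical, and the decision network and BiLSTM described in the abstract are not modeled here.

```python
from typing import Dict, List, Set

# Hypothetical toy confusionset: each character maps to characters that are
# phonetically or visually similar to it. Real confusionsets are far larger.
CONFUSION_SET: Dict[str, Set[str]] = {
    "在": {"再"},
    "再": {"在"},
}


def generate_candidates(
    sentence: str, confusion_set: Dict[str, Set[str]]
) -> List[List[str]]:
    """Return, for each position, the original character followed by its
    confusable alternatives (sorted so the output is deterministic)."""
    candidates = []
    for ch in sentence:
        # Exclude the character itself in case the set lists it.
        alternatives = sorted(confusion_set.get(ch, set()) - {ch})
        candidates.append([ch] + alternatives)
    return candidates


if __name__ == "__main__":
    # "我再家" contains a likely misuse of "再" for "在".
    for position, options in enumerate(generate_candidates("我再家", CONFUSION_SET)):
        print(position, options)
```

In a full system, a downstream model would score the candidates at each position and decide whether to keep the original character or replace it; this sketch stops at candidate generation.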