HWCGEC: HW-TSC's 2023 Submission for the NLPCC2023 Chinese Grammatical Error Correction Task

Chang Su, Xiaofeng Zhao, Xiaosong Qiao, Min Zhang, Hao Yang, Junhao Zhu, Ming Zhu, Wenbing Ma

NLPCC (3) (2023)

Abstract
Deep learning has shown remarkable effectiveness in various language tasks. This paper presents Huawei Translation Services Center's (HW-TSC's) system, HWCGEC, which achieved the best performance among the seven submissions to NLPCC2023 shared task 1, Chinese grammatical error correction (CGEC). CGEC aims to automatically correct grammatical errors that violate language rules, converting noisy input texts into clean output texts. Our experiments show that, after fine-tuning, BART, a sequence-to-sequence (seq2seq) model, outperforms ChatGLM, a large language model (LLM), when the training data is large and the LoRA fine-tuning mode updates only a small number of parameters. In addition, the BART model achieves good results on the CGEC task through data augmentation and curriculum learning. Although the LLM performed poorly in our experiments, LLMs possess excellent reasoning abilities; as training sets become more diverse and data augmentation methods more refined, LLMs trained in supervised fine-tuning (SFT) mode are expected to achieve significant improvements on CGEC tasks in the future.
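The abstract does not detail the data augmentation pipeline used for the seq2seq model. As a hedged illustration only, one common way to augment CGEC training data is to inject synthetic errors into clean sentences, producing (noisy input, clean target) pairs in the seq2seq training format; the error operations below (character deletion, duplication, adjacent swap) are generic choices assumed for this sketch, not the paper's actual rules:

```python
import random

# Hypothetical sketch of CGEC data augmentation: corrupt clean sentences with
# simple synthetic errors to create (noisy, clean) seq2seq training pairs.
# The specific operations here are illustrative assumptions, not the
# augmentation rules used in the paper.

def inject_errors(sentence: str, rng: random.Random, n_errors: int = 1) -> str:
    chars = list(sentence)
    for _ in range(n_errors):
        if len(chars) < 2:
            break
        op = rng.choice(["delete", "duplicate", "swap"])
        i = rng.randrange(len(chars) - 1)
        if op == "delete":
            del chars[i]          # simulate a missing character
        elif op == "duplicate":
            chars.insert(i, chars[i])  # simulate a redundant character
        else:
            # simulate a word-order error by swapping adjacent characters
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def make_pairs(clean_sentences, seed: int = 0):
    """Return (noisy input, clean target) pairs for seq2seq training."""
    rng = random.Random(seed)
    return [(inject_errors(s, rng), s) for s in clean_sentences]
```

A fixed random seed keeps the generated corpus reproducible across runs, which matters when the same augmented data feeds a curriculum-learning schedule.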
Keywords
NLPCC2023, grammatical error correction, HW-TSC