An Effective Incorporating Heterogeneous Knowledge Curriculum Learning for Sequence Labeling
CoRR (2024)
Abstract
Sequence labeling models often benefit from incorporating external knowledge.
However, this practice introduces data heterogeneity and complicates the model
with additional modules, leading to increased expenses for training a
high-performing model. To address this challenge, we propose a two-stage
curriculum learning (TCL) framework specifically designed for sequence labeling
tasks. The TCL framework enhances training by gradually introducing data
instances from easy to hard, aiming to improve both performance and training
speed. Furthermore, we explore different metrics for assessing the difficulty
levels of sequence labeling tasks. Through extensive experimentation on six
Chinese word segmentation (CWS) and part-of-speech (POS) tagging datasets, we
demonstrate the effectiveness of our approach in enhancing the performance of
sequence labeling models. Additionally, our analysis indicates that TCL
accelerates training and alleviates the slow training problem associated with
complex models.
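
The easy-to-hard idea described above can be illustrated with a minimal sketch. This is not the paper's TCL framework: the abstract does not specify the two stages or the difficulty metrics, so the sentence-length metric, the linear pacing, and the names `length_difficulty` and `curriculum_subsets` are all assumptions made purely for illustration.

```python
import math


def length_difficulty(sentence):
    """Hypothetical difficulty score: longer sentences are treated as harder."""
    return len(sentence)


def curriculum_subsets(dataset, num_stages, difficulty=length_difficulty):
    """Yield growing training subsets, ordered from easy to hard.

    Assumed linear pacing: stage k exposes the easiest k/num_stages
    fraction of the data, so the final stage covers the full dataset.
    """
    ranked = sorted(dataset, key=difficulty)
    for stage in range(1, num_stages + 1):
        cutoff = math.ceil(len(ranked) * stage / num_stages)
        yield ranked[:cutoff]


# Toy pre-segmented sentences (illustrative data, not from the paper).
sentences = [
    "我 爱 北京 天安门".split(),
    "你 好".split(),
    "今天 天气 很 好 我们 去 公园 散步".split(),
    "谢谢".split(),
]

stages = list(curriculum_subsets(sentences, num_stages=2))
for i, subset in enumerate(stages, start=1):
    print(f"stage {i}: {len(subset)} sentences")
```

In an actual training loop, each yielded subset would be passed to the sequence labeling model for one or more epochs before moving to the next stage; a model-based difficulty score could replace the length heuristic by passing a different `difficulty` callable.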