Aspect Term Extraction via Contrastive Learning over Self-augmented Data

IEEE International Joint Conference on Neural Networks (IJCNN), 2022

Abstract
Aspect Term Extraction (ATE) is a natural language processing task that identifies the expressions describing product attributes; such expressions (words) are referred to as aspect terms. Current neural ATE models suffer from the sparsity of available training data and, as a result, fail to be robust in real applications due to overfitting. More seriously, neural networks cannot sufficiently learn the distinguishable underlying features, which causes high misjudgment rates. Deliberate data expansion by human annotation undoubtedly helps to alleviate the problem, but it is time-consuming. To overcome this bottleneck, we utilize the Regularized Dropout (R-Drop) approach to implement self data augmentation, creating variant distributed representations on the fly during the forward computation of the neural network. More importantly, we propose to conduct contrastive learning over the self-augmented data, which fully leverages the variant distributed representations to explore distinguishable features. We experiment on four widely used benchmark datasets (R14-16 and L14) from the Semantic Evaluation (SemEval) shared tasks. Experimental results show that contrastive learning over self-augmented data yields significant performance gains, with an improvement of up to 1.9% in F1-score. In addition, the experiments demonstrate that our method outperforms the state of the art on two test sets (R14 and R16) and achieves competitive performance on the remaining test sets.
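The abstract does not include code; as a rough illustration of the described pipeline (two dropout-perturbed forward passes as self data augmentation, an R-Drop consistency term, and a contrastive objective over the resulting variant representations), a minimal PyTorch sketch might look like the following. The `encoder`/`classifier` modules, the loss weights `alpha`/`beta`, and the temperature `tau` are illustrative assumptions rather than details from the paper, and masking of padding tokens is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def rdrop_contrastive_loss(encoder, classifier, input_ids, attention_mask,
                           labels, alpha=1.0, beta=0.1, tau=0.1):
    """Hypothetical sketch: R-Drop self-augmentation plus a contrastive
    term for token-level aspect term tagging (not the authors' code)."""
    # Two forward passes with independent dropout masks produce two
    # variant representations of the same batch -- the "self-augmented"
    # data, obtained with no extra human annotation.
    h1 = encoder(input_ids, attention_mask=attention_mask).last_hidden_state
    h2 = encoder(input_ids, attention_mask=attention_mask).last_hidden_state
    logits1, logits2 = classifier(h1), classifier(h2)  # (batch, seq_len, tags)

    # Standard token-level tagging loss on both views.
    ce = F.cross_entropy(logits1.transpose(1, 2), labels) \
       + F.cross_entropy(logits2.transpose(1, 2), labels)

    # R-Drop consistency: symmetric KL between the two predictive
    # distributions keeps the sub-models from drifting apart.
    p = F.log_softmax(logits1, dim=-1)
    q = F.log_softmax(logits2, dim=-1)
    kl = 0.5 * (F.kl_div(p, q.exp(), reduction="batchmean")
              + F.kl_div(q, p.exp(), reduction="batchmean"))

    # Contrastive (InfoNCE-style) loss over the variant representations:
    # the same token's two views are positives; all other tokens in the
    # batch serve as negatives.
    z1 = F.normalize(h1.flatten(0, 1), dim=-1)  # (batch * seq_len, hidden)
    z2 = F.normalize(h2.flatten(0, 1), dim=-1)
    sim = z1 @ z2.t() / tau                     # pairwise similarities
    targets = torch.arange(sim.size(0), device=sim.device)
    nce = F.cross_entropy(sim, targets)

    return ce + alpha * kl + beta * nce
```

In this reading, the KL term regularizes the two dropout sub-models toward consistent predictions (as in R-Drop), while the contrastive term exploits the same pair of views to sharpen the distinguishable features the abstract refers to.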
Keywords
Aspect term extraction, Contrastive learning, Regularized dropout