Learning to Attentively Represent Distinctive Information for Semantic Text Matching.

NLPCC (1) (2023)

Abstract
Pre-trained language models (PLMs) such as BERT have achieved remarkable results on the task of Semantic Text Matching (STM). Nevertheless, existing models struggle to discern subtle distinctions between texts, even though such distinctions are vital clues for STM; concretely, altering a single word can significantly change the semantics of an entire text. To address this problem, we propose a novel method that attentively represents distinctive information for STM. It comprises two components: a Reversed Attention Mechanism (RAM) and Sample-based Adaptive Learning (SAL). RAM reverses the hidden states of texts before computing attention, which helps highlight the syntactic constituents that differ between the texts being compared. In addition, the model may acquire biases during the initial stage of training. For example, because a majority of positive examples exhibit high lexical overlap, it may ignore the distinctions between sentence pairs and simply classify any pair with high lexical overlap as positive. SAL is designed to help the model comprehensively acquire the semantic knowledge hidden in the distinctive constituents. Experiments on six STM datasets demonstrate the effectiveness of the proposed approach. Furthermore, we employ ChatGPT to generate textual descriptions of the distinctions between texts and empirically validate the significance of distinctive information for the semantic text matching task.
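The abstract describes RAM only at a high level. The sketch below is a purely hypothetical illustration, not the paper's formulation: it assumes "reversing" means negating the hidden states of the compared text before dot-product attention, so that tokens attend most strongly to the constituents of the other text they are least similar to, i.e. the distinctive parts. The function name reversed_cross_attention and all shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def reversed_cross_attention(h_a: torch.Tensor, h_b: torch.Tensor) -> torch.Tensor:
    """Hypothetical reversed-attention sketch (assumption, not the paper's exact RAM).

    h_a: (len_a, dim) hidden states of text A
    h_b: (len_b, dim) hidden states of text B
    Returns, for each token of A, an attention-weighted summary of the
    constituents of B that differ most from it.
    """
    # Negate B's hidden states so that dissimilar tokens get HIGH scores.
    scores = h_a @ (-h_b).T / h_a.size(-1) ** 0.5   # (len_a, len_b)
    weights = F.softmax(scores, dim=-1)             # mass concentrates on distinctive tokens of B
    return weights @ h_b                            # (len_a, dim)

# Toy usage with random "hidden states"
a = torch.randn(5, 16)   # 5-token text A, 16-dim states
b = torch.randn(7, 16)   # 7-token text B
distinct = reversed_cross_attention(a, b)
print(distinct.shape)    # torch.Size([5, 16])
```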
Keywords
semantic text matching, represent distinctive information, learning