Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations.
CoRR (2023)
Abstract
We introduce the sub-sentence encoder, a contrastively learned contextual
embedding model for fine-grained semantic representation of text. In contrast
to the standard practice with sentence embeddings, where the meaning of an
entire sequence of text is encoded into a fixed-length vector, the sub-sentence
encoder learns to produce distinct contextual embeddings corresponding to
different atomic propositions, i.e., atomic units of meaning expressed within a
text sequence. The sub-sentence embeddings are contrastively learned to
recognize (inferred) semantic equivalence between propositions across different
text sequences. Our experiments show the effectiveness of sub-sentence encoders
in applications such as retrieving supporting facts for fine-grained text
attribution and recognizing the conditional semantic similarity between texts.
In practice, we demonstrate that sub-sentence encoders keep the same level of
inference cost and space complexity as sentence encoders.
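
The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each proposition is marked by a binary mask over the tokens of a sentence, that a proposition embedding is the mean of the masked token vectors, and that training uses an in-batch InfoNCE-style contrastive loss. The function names, the toy vectors, and the temperature value are all hypothetical.

```python
import math

def masked_mean_pool(token_vecs, mask):
    """Average the token vectors selected by a binary proposition mask (assumed pooling)."""
    selected = [vec for vec, keep in zip(token_vecs, mask) if keep]
    if not selected:
        raise ValueError("mask selects no tokens")
    dim = len(selected[0])
    return [sum(vec[i] for vec in selected) / len(selected) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def info_nce(anchors, positives, temperature=0.1):
    """In-batch contrastive loss: positives[i] matches anchors[i];
    every other entry of positives serves as a negative for anchors[i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)

# Toy contextual token vectors for one four-token sentence (made-up values).
token_vecs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
mask_a = [1, 1, 0, 0]  # tokens expressing a first proposition
mask_b = [0, 0, 1, 1]  # tokens expressing a second proposition

emb_a = masked_mean_pool(token_vecs, mask_a)
emb_b = masked_mean_pool(token_vecs, mask_b)

# Correctly paired propositions give a lower loss than swapped pairs.
loss_aligned = info_nce([emb_a, emb_b], [emb_a, emb_b])
loss_swapped = info_nce([emb_a, emb_b], [emb_b, emb_a])
```

Under these assumptions one sentence yields one fixed-length vector per proposition rather than a single vector for the whole sequence, and the loss rewards matching propositions across text sequences while pushing non-matching ones apart.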
Keywords
contrastive learning, propositional semantic