SetBERT: Enhancing Retrieval Performance for Boolean Logic and Set Operation Queries
CoRR (2024)
Abstract
We introduce SetBERT, a fine-tuned BERT-based model designed to enhance query
embeddings for set operations and Boolean logic queries, such as Intersection
(AND), Difference (NOT), and Union (OR). SetBERT significantly improves
retrieval performance for logic-structured queries, an area where both
traditional and neural retrieval methods typically underperform. We propose an
innovative use of inversed-contrastive loss, which focuses on identifying the
negative sentence, and fine-tune BERT on a dataset generated by prompting
GPT. Furthermore, we demonstrate that, in contrast to its effect on other
BERT-based models, fine-tuning with triplet loss actually degrades performance
for this specific
task. Our experiments reveal that SetBERT-base not only significantly
outperforms BERT-base (up to a 63
performance comparable to the much larger BERT-large model, despite being only
one-third the size.
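
For reference, the three query types named in the abstract correspond to ordinary set operations over the documents that match each term. The sketch below is not SetBERT and does not come from the paper. It only illustrates the exact-match semantics that an embedding-based retriever would need to approximate. The toy corpus and the helper names (`DOCS`, `matching`, `intersection`, `difference`, `union`) are hypothetical.

```python
# Toy corpus: document id -> set of terms (hypothetical data for illustration).
DOCS = {
    "d1": {"python", "retrieval"},
    "d2": {"python", "logic"},
    "d3": {"retrieval", "logic"},
    "d4": {"java"},
}


def matching(term):
    """Return the ids of all documents that contain `term`."""
    return {doc_id for doc_id, terms in DOCS.items() if term in terms}


def intersection(a, b):
    """Intersection (AND): documents containing both a and b."""
    return matching(a) & matching(b)


def difference(a, b):
    """Difference (NOT): documents containing a but not b."""
    return matching(a) - matching(b)


def union(a, b):
    """Union (OR): documents containing a, b, or both."""
    return matching(a) | matching(b)
```

A Difference query such as `difference("python", "logic")` has to exclude documents that mention the second term, even though those documents are topically close to the query. That exclusion is the behavior the abstract describes as difficult for retrieval methods.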