Denoising Table-Text Retrieval for Open-Domain Question Answering
CoRR (2024)
Abstract
In table-text open-domain question answering, a retriever system retrieves
relevant evidence from tables and text to answer questions. Previous studies
share two common challenges: first, their retrievers can be affected by
false-positive labels in training datasets; second, they may struggle to
provide appropriate evidence for questions that require reasoning across the
table. To address these issues, we propose
Denoised Table-Text Retriever (DoTTeR). Our approach first uses a
denoised training dataset with fewer false-positive labels, obtained by
discarding instances with lower question-relevance scores as measured by a
false-positive detection model. Subsequently, we integrate table-level ranking
information into the retriever to assist in finding evidence for questions that
demand reasoning across the table. To encode this ranking information, we
fine-tune a rank-aware column encoder to identify minimum and maximum values
within a column. Experimental results demonstrate that DoTTeR significantly
outperforms strong baselines on both retrieval recall and downstream QA tasks.
Our code is available at https://github.com/deokhk/DoTTeR.
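
The two ideas in the abstract can be illustrated with a minimal sketch. This is not the DoTTeR implementation: `toy_score` is a hypothetical word-overlap stand-in for the false-positive detection model, the threshold value is an assumption, and `min_max_labels` only shows the kind of per-cell targets a rank-aware column encoder could be trained to predict.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical training instance: (question, candidate evidence block).
Instance = Tuple[str, str]


def toy_score(question: str, evidence: str) -> float:
    """Assumed stand-in scorer: fraction of question tokens found in the evidence."""
    q_tokens = set(question.lower().split())
    e_tokens = set(evidence.lower().split())
    if not q_tokens:
        return 0.0
    return len(q_tokens & e_tokens) / len(q_tokens)


def denoise(
    instances: Sequence[Instance],
    score_fn: Callable[[str, str], float],
    threshold: float,
) -> List[Instance]:
    """Keep only instances whose question-relevance score reaches the threshold."""
    return [inst for inst in instances if score_fn(inst[0], inst[1]) >= threshold]


def min_max_labels(column: Sequence[float]) -> List[str]:
    """Label each cell of a numeric column as 'max', 'min', or 'other'.

    If every value is equal, all cells are labeled 'max' (an arbitrary choice
    made for this sketch).
    """
    if not column:
        return []
    lowest, highest = min(column), max(column)
    labels = []
    for value in column:
        if value == highest:
            labels.append("max")
        elif value == lowest:
            labels.append("min")
        else:
            labels.append("other")
    return labels


if __name__ == "__main__":
    data = [
        ("who won in 2010", "2010 winner was Spain"),
        ("who won in 2010", "bananas are yellow"),
    ]
    print(denoise(data, toy_score, threshold=0.2))
    print(min_max_labels([3.0, 9.5, 1.2, 9.5]))
```

In this sketch the second instance is dropped because its evidence shares no tokens with the question; a learned detection model would replace the overlap heuristic in practice.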