NoiseQA: Challenge Set Evaluation for User-Centric Question Answering

Abhilasha Ravichander,Siddharth Dalmia,Maria Ryskina,Florian Metze,Eduard Hovy,Alan W Black

EACL（2021）

引用 21|浏览50

暂无评分

摘要

When Question-Answering (QA) systems are deployed in the real world, users query them through a variety of interfaces, such as speaking to voice assistants, typing questions into a search engine, or even translating questions to languages supported by the QA system. While there has been significant community attention devoted to identifying correct answers in passages assuming a perfectly formed question, we show that components in the pipeline that precede an answering engine can introduce varied and considerable sources of error, and performance can degrade substantially based on these upstream noise sources even for powerful pre-trained QA models. We conclude that there is substantial room for progress before QA systems can be effectively deployed, highlight the need for QA evaluation to expand to consider real-world use, and hope that our findings will spur greater community interest in the issues that arise when our systems actually need to be of utility to humans.

查看译文

关键词

challenge set evaluation,user-centric

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要