VoMBaT: A Tool for Visualising Evaluation Measure Behaviour in High-Recall Search Tasks

PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023(2023)

引用 3|浏览12
暂无评分
摘要
The objective of High-Recall Information Retrieval (HRIR) is to retrieve as many relevant documents as possible for a given search topic. One approach to HRIR is Technology-Assisted Review (TAR), which uses information retrieval and machine learning techniques to aid the review of large document collections. TAR systems are commonly used in legal eDiscovery and systematic literature reviews. Successful TAR systems are able to find the majority of relevant documents using the least number of assessments. Commonly used retrospective evaluation assumes that the system achieves a specific, fixed recall level first, and then measures the precision or work saved (e.g., precision at r% recall). This approach can cause problems related to understanding the behaviour of evaluation measures in a fixed recall setting. It is also problematic when estimating time and money savings during technology-assisted reviews. This paper presents a new visual analytics tool to explore the dynamics of evaluation measures depending on recall level. We implemented 18 evaluation measures based on the confusion matrix terms, both from general IR tasks and specific to TAR. The tool allows for a comparison of the behaviour of these measures in a fixed recall evaluation setting. It can also simulate savings in time and money and a count of manual vs automatic assessments for different datasets depending on the model quality. The tool is open-source, and the demo is available under the following URL: https://vombat.streamlit.app.
更多
查看译文
关键词
visual analytics,technology-assisted reviews,systematic reviews,citation screening,evaluation,evaluation measures,TNR
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要