Evaluating the Explainability of Neural Rankers
Lecture Notes in Computer Science: Advances in Information Retrieval (2024)

Abstract
Information retrieval models have witnessed a paradigm shift from
unsupervised statistical approaches to feature-based supervised approaches to
completely data-driven ones that make use of the pre-training of large language
models. While the increasing complexity of search models has demonstrated
improvements in effectiveness (measured in terms of the relevance of
top-retrieved results), a question worthy of thorough inspection is: "how
explainable are these models?", which is what this paper aims to evaluate. In
particular, we propose a common evaluation platform to systematically evaluate
the explainability of any ranking model (the explanation algorithm being
identical for all the models that are to be evaluated). In our proposed
framework, each model, in addition to returning a ranked list of documents,
is also required to return a list of explanation units, or rationales, for each
document. This meta-information from each document is then used to measure how
locally consistent these rationales are as an intrinsic measure of
interpretability - one that does not require manual relevance assessments.
Additionally, as an extrinsic measure, we compute how relevant these rationales
are by leveraging sub-document-level relevance assessments. Our findings show a
number of interesting observations: sentence-level rationales are more
consistent, an increase in complexity mostly leads to less consistent
explanations, and interpretability measures offer a complementary
dimension of evaluation of IR systems because consistency is not
well-correlated with nDCG at top ranks.