CARE: Extracting Experimental Findings From Clinical Literature.
CoRR (2023)
Abstract
Extracting fine-grained experimental findings from literature can provide
massive utility for scientific applications. Prior work has focused on
developing annotation schemas and datasets for limited aspects of this problem,
leading to simpler information extraction datasets which do not capture the
real-world complexity and nuance required for this task. Focusing on
biomedicine, this work presents CARE (Clinical Aggregation-oriented Result
Extraction) -- a new IE dataset for the task of extracting clinical findings.
We develop a new annotation schema capturing fine-grained findings as n-ary
relations between entities and attributes, which includes phenomena challenging
for current IE systems such as discontinuous entity spans, nested relations,
and variable-arity n-ary relations. Using this schema, we collect extensive
annotations for 700 abstracts from two sources: clinical trials and case
reports. We also benchmark the performance of various state-of-the-art IE
systems on our dataset, including extractive models and generative LLMs in
fully supervised and limited data settings. Our results demonstrate the
difficulty of our dataset -- even SOTA models such as GPT-4 struggle,
particularly on relation extraction. We release our annotation schema and CARE
to encourage further research on extracting and aggregating scientific findings
from literature.
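
To make the schema phenomena named above more concrete, here is a minimal sketch of how discontinuous entity spans, nested relations, and variable-arity n-ary relations could be represented. It is an illustration only and not the CARE annotation format: the class names, the labels (Intervention, Outcome, Direction, Population, Effect, AppliesTo), the example sentence, and the span_of helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Entity:
    """An entity mention; more than one span means it is discontinuous."""
    label: str                           # hypothetical label, not from CARE
    spans: Tuple[Tuple[int, int], ...]   # (start, end) character offsets

    def text(self, doc: str) -> str:
        # Join the pieces of a possibly discontinuous mention.
        return " ".join(doc[start:end] for start, end in self.spans)


@dataclass(frozen=True)
class Relation:
    """An n-ary relation whose arguments are entities or other relations."""
    label: str                           # hypothetical label, not from CARE
    args: Tuple[Union[Entity, "Relation"], ...]

    @property
    def arity(self) -> int:
        return len(self.args)

    def depth(self) -> int:
        # 1 for a flat relation, plus 1 for each level of nesting.
        nested = [a.depth() for a in self.args if isinstance(a, Relation)]
        return 1 + max(nested, default=0)


def span_of(doc: str, phrase: str) -> Tuple[int, int]:
    """Hypothetical helper: offsets of the first occurrence of phrase."""
    start = doc.index(phrase)
    return (start, start + len(phrase))


# Made-up example sentence, not taken from the dataset.
doc = "In adults, Drug A reduced systolic and diastolic blood pressure."

population = Entity("Population", (span_of(doc, "adults"),))
drug = Entity("Intervention", (span_of(doc, "Drug A"),))
direction = Entity("Direction", (span_of(doc, "reduced"),))
# Discontinuous span: "systolic" ... "blood pressure".
sys_bp = Entity(
    "Outcome",
    (span_of(doc, "systolic"), span_of(doc, "blood pressure")),
)

# A ternary relation, then a binary relation that nests it (variable arity).
finding = Relation("Effect", (drug, sys_bp, direction))
scoped = Relation("AppliesTo", (finding, population))

print(sys_bp.text(doc), finding.arity, scoped.arity, scoped.depth())
```

Keeping arguments as a tuple of arbitrary length, with relations allowed as arguments, is one simple way to accommodate both variable arity and nesting; a real annotation format would likely also need argument roles and attribute values.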