Generating multiple-choice questions for medical question answering with distractors and cue-masking

arXiv (2023)

Abstract
Medical multiple-choice question answering (MCQA) is particularly difficult. Questions may describe patient symptoms and ask for the correct diagnosis, which requires domain knowledge and complex reasoning. Standard language modeling pretraining alone is not sufficient to achieve the best results. Jin et al. (2020) showed that focusing masked language modeling on disease name prediction, with medical encyclopedic paragraphs as input, leads to a considerable improvement in MCQA accuracy. In this work, we show that (1) fine-tuning on a generated MCQA dataset outperforms the masked-language-modeling-based objective and (2) correctly masking the cues to the answers is critical for good performance. We release new pretraining datasets and achieve state-of-the-art results on 4 MCQA datasets, notably +5.7% with a base-size model on MedQA-USMLE.
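As a rough illustration of the idea in the abstract, the sketch below turns one encyclopedic paragraph into a multiple-choice item by masking mentions of the answer and sampling distractors. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names `mask_cues` and `make_mcq`, the `[MASK]` token, the plain string matching used to find cues, and the uniform random choice of distractors are all hypothetical choices made for this example, and the paragraph is toy text.

```python
import random
import re

MASK = "[MASK]"  # assumed placeholder token, not taken from the paper


def mask_cues(paragraph, answer, mask=MASK):
    """Replace every case-insensitive mention of the answer with a mask token."""
    pattern = re.compile(re.escape(answer), flags=re.IGNORECASE)
    return pattern.sub(mask, paragraph)


def make_mcq(paragraph, answer, candidates, n_distractors=3, seed=0):
    """Build one multiple-choice item from a paragraph about `answer`.

    Distractors are drawn uniformly at random from the other candidate
    names; this is a simplifying assumption for the example.
    """
    rng = random.Random(seed)
    pool = [c for c in candidates if c.lower() != answer.lower()]
    distractors = rng.sample(pool, n_distractors)
    options = distractors + [answer]
    rng.shuffle(options)
    return {
        "question": mask_cues(paragraph, answer),
        "options": options,
        "label": options.index(answer),  # index of the correct option
    }


# Toy paragraph and candidate list, invented for illustration only.
example = make_mcq(
    "Measles is a highly contagious viral disease. "
    "Symptoms of measles include fever and rash.",
    "measles",
    ["influenza", "rubella", "mumps", "measles", "scarlet fever"],
)
```

Without the masking step, the answer string would appear verbatim in the question text, so a model could pick the right option by string matching alone; the abstract's second claim is that removing such cues matters for downstream accuracy.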
Keywords
distractors, medical questions, multiple-choice, cue-masking