227 Spine-tuned Natural Language Models and Bespoke Regular Expression Classifiers for Automated Spinal Surgery Registry Development

Daniel Alexander Alber, Alexander Cheung, David B. Kurland, Karl Lee Sangwon, Lavender Jiang, Chris Liu, Eric Karl Oermann

Neurosurgery (2024)

Abstract
INTRODUCTION: Surgical research demands the development of clinical registries, often through time-intensive manual chart review. Natural language processing (NLP) may accelerate registry development, and an ideal automatic registry (autoregistry) algorithm would be highly accurate while requiring minimal manual data annotation. NLP approaches, including bespoke regular expression (RegEx) classifiers and large language models (LLMs), possess distinct strengths and weaknesses and have not been compared in the setting of autoregistry development.

METHODS: We used an institutional data lake to retrieve 31,502 neurosurgical operative notes. A standardized set of spinal procedures was chosen for inclusion in the autoregistry, and 200 manually annotated notes were used for training and testing. RegEx classifiers were engineered to retrieve procedural information from unprocessed notes. A family of 110-million-parameter BERT models, including LLMs pre-trained on clinical text, was fine-tuned for the same tasks. We also tested an open-source 7-billion-parameter LLM chatbot, Vicuna, without fine-tuning.

RESULTS: The RegEx classifiers identified spinal procedures and associated vertebral levels in nearly 99% of operative notes. The fine-tuned LLMs identified common procedures (e.g., spinal fusion and laminectomy) with greater than 95% accuracy but performed poorly on rarer procedures (e.g., XLIF, corpectomy) and on vertebral body identification. Qualitative evaluation of the Vicuna chatbot showed potential for the same tasks after iteratively refined prompting.

CONCLUSIONS: The goal of autoregistry development is to minimize time- and labor-intensive manual chart review. We found that the fine-tuned LLMs could not match the accuracy and efficiency of the RegEx classifiers. However, LLMs may be well suited to expanding existing clinical databases that provide a robust training set. Further work combining NLP approaches will attempt to develop a pipeline for autoregistry development from natural language (plain English) queries.
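As an illustration of the RegEx approach described in the methods, the following is a minimal Python sketch of a rule-based classifier that flags procedure mentions and vertebral levels in a free-text operative note. The abstract does not give the authors' actual expressions, so the pattern set, the procedure labels, and the function names (`extract_procedures`, `extract_levels`, `classify_note`) are all hypothetical, and a real classifier would likely need many more synonyms and negation handling.

```python
import re

# Hypothetical patterns for illustration only; the study's expressions
# are not published in the abstract.
PROCEDURE_PATTERNS = {
    "fusion": re.compile(r"\b(?:fusion|arthrodesis)\b", re.IGNORECASE),
    "laminectomy": re.compile(r"\blaminectom(?:y|ies)\b", re.IGNORECASE),
    "corpectomy": re.compile(r"\bcorpectom(?:y|ies)\b", re.IGNORECASE),
    "xlif": re.compile(
        r"\b(?:XLIF|extreme lateral interbody fusion)\b", re.IGNORECASE
    ),
}

# Matches single levels ("L4") and ranges ("L4-5", "L4-L5", "T12 to L2").
LEVEL_PATTERN = re.compile(
    r"\b([CTLS])(\d{1,2})"
    r"(?:\s*(?:-|to|through)\s*([CTLS])?(\d{1,2}))?\b",
    re.IGNORECASE,
)


def extract_procedures(note):
    """Return the sorted labels of all procedures mentioned in the note."""
    return sorted(
        label
        for label, pattern in PROCEDURE_PATTERNS.items()
        if pattern.search(note)
    )


def extract_levels(note):
    """Return vertebral level mentions normalized as 'L4' or 'L4-L5'."""
    levels = []
    for match in LEVEL_PATTERN.finditer(note):
        start_region, start_num, end_region, end_num = match.groups()
        start = f"{start_region.upper()}{int(start_num)}"
        if end_num is None:
            levels.append(start)
        else:
            # A range such as "L4-5" omits the second region letter,
            # so it is assumed to match the first.
            region = (end_region or start_region).upper()
            levels.append(f"{start}-{region}{int(end_num)}")
    return levels


def classify_note(note):
    """Combine procedure and level extraction for one operative note."""
    return {
        "procedures": extract_procedures(note),
        "levels": extract_levels(note),
    }


if __name__ == "__main__":
    example = "Procedure: L4-5 laminectomy and posterior spinal fusion T12 to L2."
    print(classify_note(example))
```

A sketch like this needs no annotated training data, which is consistent with the efficiency advantage the abstract reports for the RegEx classifiers, but its coverage depends entirely on how well the hand-written patterns anticipate the wording of real notes.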