Systematic evaluation of common natural language processing techniques to codify clinical notes

Nazgol Tavabi,Mallika Singh,James Pruneski,Ata M. Kiapour

PLOS ONE（2024）

引用 4|浏览18

暂无评分

摘要

Proper codification of medical diagnoses and procedures is essential for optimized health care management, quality improvement, research, and reimbursement tasks within large healthcare systems. Assignment of diagnostic or procedure codes is a tedious manual process, often prone to human error. Natural Language Processing (NLP) has been suggested to facilitate this manual codification process. Yet, little is known on best practices to utilize NLP for such applications. With Large Language Models (LLMs) becoming more ubiquitous in daily life, it is critical to remember, not every task requires that level of resource and effort. Here we comprehensively assessed the performance of common NLP techniques to predict current procedural terminology (CPT) from operative notes. CPT codes are commonly used to track surgical procedures and interventions and are the primary means for reimbursement. Our analysis of 100 most common musculoskeletal CPT codes suggest that traditional approaches can outperform more resource intensive approaches like BERT significantly (P-value = 4.4e-17) with average AUROC of 0.96 and accuracy of 0.97, in addition to providing interpretability which can be very helpful and even crucial in the clinical domain. We also proposed a complexity measure to quantify the complexity of a classification task and how this measure could influence the effect of dataset size on model's performance. Finally, we provide preliminary evidence that NLP can help minimize the codification error, including mislabeling due to human error.

查看译文

关键词

natural language processing,natural language processing techniques,notes

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要