Deep learning for [18F]fluorodeoxyglucose-PET-CT classification in patients with lymphoma: a dual-centre retrospective analysis

Lancet Digital Health (2024)

Abstract
Background The rising global cancer burden has led to an increasing demand for imaging tests such as [18F]fluorodeoxyglucose ([18F]FDG)-PET-CT. To aid imaging specialists in dealing with high scan volumes, we aimed to train a deep learning artificial intelligence algorithm to classify [18F]FDG-PET-CT scans of patients with lymphoma with or without hypermetabolic tumour sites.

Methods In this retrospective analysis, we collected 16 583 [18F]FDG-PET-CTs of 5072 patients with lymphoma who had undergone PET-CT before or after treatment at the Memorial Sloan Kettering Cancer Center, New York, NY, USA. Using maximum intensity projection (MIP), three-dimensional (3D) PET, and 3D CT data, our ResNet34-based deep learning model (Lymphoma Artificial Reader System [LARS]) for [18F]FDG-PET-CT binary classification (Deauville 1-3 vs 4-5) was trained on 80% of the dataset and tested on the remaining 20%. For external testing, 1000 [18F]FDG-PET-CTs were obtained from a second centre (Medical University of Vienna, Vienna, Austria). Seven model variants were evaluated, including MIP-based LARS-avg (optimised for accuracy) and LARS-max (optimised for sensitivity), and 3D PET-CT-based LARS-ptct. Following expert curation, areas under the curve (AUCs), accuracies, sensitivities, and specificities were calculated.

Findings In the internal test cohort (3325 PET-CTs, 1012 patients), LARS-avg achieved an AUC of 0.949 (95% CI 0.942-0.956), accuracy of 0.890 (0.879-0.901), sensitivity of 0.868 (0.851-0.885), and specificity of 0.913 (0.899-0.925); LARS-max achieved an AUC of 0.949 (0.942-0.956), accuracy of 0.868 (0.858-0.879), sensitivity of 0.909 (0.896-0.924), and specificity of 0.826 (0.808-0.843); and LARS-ptct achieved an AUC of 0.939 (0.930-0.948), accuracy of 0.875 (0.864-0.887), sensitivity of 0.836 (0.817-0.855), and specificity of 0.915 (0.901-0.927). In the external test cohort (1000 PET-CTs, 503 patients), LARS-avg achieved an AUC of 0.953 (0.938-0.966), accuracy of 0.907 (0.888-0.925), sensitivity of 0.874 (0.843-0.904), and specificity of 0.949 (0.921-0.960); LARS-max achieved an AUC of 0.952 (0.937-0.965), accuracy of 0.898 (0.878-0.916), sensitivity of 0.899 (0.871-0.926), and specificity of 0.897 (0.871-0.922); and LARS-ptct achieved an AUC of 0.932 (0.915-0.948), accuracy of 0.870 (0.850-0.891), sensitivity of 0.827 (0.793-0.863), and specificity of 0.913 (0.889-0.937).

Interpretation Deep learning accurately distinguishes between [18F]FDG-PET-CT scans of lymphoma patients with and without hypermetabolic tumour sites. Deep learning might therefore be useful to rule out the presence of metabolically active disease in such patients, or serve as a second reader or decision support tool.

Copyright (c) 2023 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.
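The accuracy, sensitivity, and specificity figures reported above follow from a standard confusion-matrix calculation, with Deauville 4-5 as the positive class. The sketch below illustrates that calculation on made-up toy data. It also shows one possible reading of the "avg" and "max" variant names, namely pooling several per-view scores into a single scan-level score. The abstract does not describe how LARS-avg and LARS-max are constructed, so the `aggregate` function, the 0.5 threshold, and the toy scores are all hypothetical and are not the authors' method.

```python
from statistics import mean


def aggregate(view_scores, mode):
    """Pool per-view scores into one scan-level score.

    ASSUMPTION: "avg"/"max" pooling over views is a guess at what the
    variant names mean; the abstract does not define them.
    """
    if mode == "avg":
        return mean(view_scores)
    if mode == "max":
        return max(view_scores)
    raise ValueError(f"unknown mode: {mode}")


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity for labels in {0, 1}.

    1 = positive (Deauville 4-5), 0 = negative (Deauville 1-3).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


# Toy data (invented): (per-view scores, reference label) for four scans.
scans = [
    ([0.9, 0.8], 1),
    ([0.6, 0.3], 1),
    ([0.2, 0.1], 0),
    ([0.7, 0.2], 0),
]
labels = [label for _, label in scans]

results = {}
for mode in ("avg", "max"):
    # Hypothetical decision threshold of 0.5 on the pooled score.
    preds = [int(aggregate(scores, mode) >= 0.5) for scores, _ in scans]
    results[mode] = binary_metrics(labels, preds)
    print(mode, results[mode])
```

On this toy data, max pooling flags more scans as positive than average pooling, which raises sensitivity and lowers specificity. That is the same direction of trade-off the abstract reports between LARS-max and LARS-avg, but the example says nothing about the size of the effect in the real cohorts.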