Simulation-driven training of vision transformers enables metal artifact reduction of highly truncated CBCT scans.

Medical Physics (2023)

Abstract
BACKGROUND: Due to the high attenuation of metals, severe artifacts occur in cone beam computed tomography (CBCT). Metal segmentation in the CBCT projections usually serves as a prerequisite for metal artifact reduction (MAR) algorithms.

PURPOSE: Truncation caused by the limited detector size leads to incomplete metal masks when threshold-based segmentation is applied in the CBCT volume. Therefore, this work pursues segmenting metal directly in the CBCT projections.

METHODS: Since generating high-quality clinical training data is a constant challenge, this study proposes to generate simulated digital radiographs (data I) based on real CT data combined with self-designed computer-aided design (CAD) implants. In addition to the simulated projections generated from 3D volumes, 2D x-ray images combined with projections of implants serve as a complementary data set (data II) to improve network performance. For metal segmentation, SwinConvUNet is proposed, which uses shifted-window (Swin) vision transformers (ViTs) with patch merging as the encoder.

RESULTS: The model's performance is evaluated on accurately labeled test data sets obtained from cadaver scans as well as on unlabeled clinical projections. When trained on data I only, the convolutional neural network (CNN) encoder-based networks UNet and TransUNet achieve only limited performance on the cadaver test data, with average Dice scores of 0.821 and 0.850, respectively. After training on both data I and data II, the average Dice scores of the two models increase to 0.906 and 0.919, respectively. By replacing the CNN encoder with a Swin transformer, the proposed SwinConvUNet reaches an average Dice score of 0.933 on cadaver projections when trained on data I only, and the highest average Dice score of 0.953 when trained on the combined data set.
CONCLUSIONS: Our experiments quantitatively demonstrate the effectiveness of combining projections simulated via the two pathways for network training. Moreover, the proposed SwinConvUNet, trained on the simulated projections, achieves state-of-the-art, robust metal segmentation, as demonstrated in experiments on cadaver and clinical data sets. With accurate segmentations from the proposed model, MAR can be conducted even for highly truncated CBCT scans.
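The segmentation networks above are compared via the average Dice score on binary metal masks. As a point of reference, here is a minimal NumPy sketch of the metric, 2|A∩B| / (|A| + |B|); the function name `dice_score` and the toy masks are our own illustration, not code from the paper:

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient between two binary masks: 2*|A∩B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / (pred.sum() + target.sum() + eps))

# Toy example: overlapping "metal" regions in a 4x4 projection
pred = np.zeros((4, 4), dtype=np.uint8)
target = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 1:3] = 1    # 4 predicted metal pixels
target[1:4, 1:3] = 1  # 6 ground-truth metal pixels; overlap = 4 pixels
print(round(dice_score(pred, target), 3))  # 2*4 / (4+6) -> 0.8
```

A Dice score of 1.0 means the predicted and ground-truth masks coincide exactly, so the reported increase from 0.821 to 0.953 corresponds to substantially better overlap with the labeled cadaver masks.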