ArthroNet: a monocular depth estimation technique with 3D segmented maps for knee arthroscopy

Intelligent Medicine (2023)

Abstract
Background: Lack of depth perception in medical imaging systems is a long-standing technological limitation of minimally invasive surgery. Visualizing anatomical structures in 3D can improve conventional arthroscopic surgery, because a full 3D semantic representation of the surgical site directly supports the surgeon's perception of the scene. It also opens the possibility of registering intraoperative images with preoperative clinical records, a step toward semi-autonomous and fully autonomous platforms. This study aimed to present a novel monocular depth prediction model that infers depth maps from a single color arthroscopic video frame.

Methods: We applied a novel technique that combines supervised and self-supervised loss terms and thereby eliminates the drawbacks of each technique used alone. It enabled the estimation of edge-preserving depth maps from a single untextured arthroscopic frame. The proposed image acquisition technique projected artificial textures onto the surface to improve the quality of disparity maps computed from stereo images. Moreover, by integrating an attention-aware multi-scale feature extraction technique with global scene contextual constraints and multi-scale depth fusion, the model could predict reliable and accurate tissue depth at the surgical site that complies with the scene geometry.

Results: A total of 4,128 stereo frames from a knee phantom were used to train the network; during this pre-training stage, the network learned disparity maps from the stereo images. The fine-tuning phase used 12,695 knee arthroscopic stereo frames from cadaver experiments, along with their corresponding coarse disparity maps obtained by stereo matching. In a supervised fashion, the network learns the transformation from the left image to the disparity map, whereas the self-supervised loss term refines the coarse depth map by minimizing reprojection, gradient, and structural dissimilarity losses. Together, our method produces high-quality 3D maps with minimal reprojection losses of 0.0004132 (structural similarity index), 0.00036120156 (L1 error distance), and 6.591908 × 10⁻⁵ (L1 gradient error distance).

Conclusion: Machine learning techniques for monocular depth prediction were studied to infer accurate depth maps from a single color arthroscopic video frame. Moreover, the study integrates a segmentation model, so that 3D segmented maps are inferred, providing extended perception and tissue awareness.
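The combination of a supervised term and self-supervised reprojection terms described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the loss weights, the single-window (global) SSIM, and the choice to reconstruct the left image by sampling the right image at x - d are all assumptions made for illustration. The abstract states only that the supervised term uses coarse disparity maps from stereo matching and that the self-supervised term minimizes reprojection, gradient, and structural dissimilarity losses.

```python
import numpy as np


def warp_right_to_left(right, disparity):
    """Reconstruct the left view by sampling the right view at x - d.

    Assumes rectified grayscale stereo images of shape (H, W) and a
    left-referenced disparity map of the same shape (an assumption).
    """
    h, w = right.shape
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity
    xs = np.clip(xs, 0.0, w - 1.0)          # clamp samples to the image
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    # Linear interpolation along the horizontal (epipolar) direction.
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]


def l1_distance(a, b):
    """Mean absolute difference between two arrays."""
    return float(np.abs(a - b).mean())


def gradient_l1(a, b):
    """L1 distance between the finite-difference gradients of two images."""
    dx = np.abs(np.diff(a, axis=1) - np.diff(b, axis=1)).mean()
    dy = np.abs(np.diff(a, axis=0) - np.diff(b, axis=0)).mean()
    return float(dx + dy)


def ssim_dissimilarity(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    """(1 - SSIM) / 2 with one window over the whole image.

    A simplification: SSIM is normally computed over local windows.
    Constants assume intensities in [0, 1].
    """
    mu_a, mu_b = a.mean(), b.mean()
    var_a = ((a - mu_a) ** 2).mean()
    var_b = ((b - mu_b) ** 2).mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    ssim = ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    )
    return float((1.0 - ssim) / 2.0)


def combined_loss(left, right, pred_disp, coarse_disp,
                  w_sup=1.0, w_l1=1.0, w_grad=1.0, w_ssim=1.0):
    """Supervised plus self-supervised loss; weights are hypothetical."""
    # Supervised term: match the coarse disparity from stereo matching.
    supervised = l1_distance(pred_disp, coarse_disp)
    # Self-supervised terms: compare the left image with its reprojection.
    recon = warp_right_to_left(right, pred_disp)
    self_supervised = (
        w_l1 * l1_distance(left, recon)
        + w_grad * gradient_l1(left, recon)
        + w_ssim * ssim_dissimilarity(left, recon)
    )
    return w_sup * supervised + self_supervised
```

In this sketch the supervised term anchors the prediction to the coarse disparity, while the reprojection terms penalize predictions that fail to explain the stereo pair, which is the intuition behind using the two together.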
Keywords
Monocular depth estimation technique, 3D segmented maps, Knee arthroscopic