Evaluation of an AI System for Breast Cancer Screening in Mammograms of Young Women.

Journal of Clinical Oncology (2020)

Abstract
e14068

Background: Interpreting mammograms is challenging, especially in young women, who typically have dense breasts. Artificial intelligence (AI) promises to improve breast cancer detection; however, such systems should be tested on different datasets. Our aim was to evaluate the performance of a publicly available deep convolutional neural network, developed by Wu et al. (IEEE Trans. Med. Imaging, 2019), on mammograms of young women.

Methods: The test dataset consisted of mammograms obtained on a single occasion from 135 young women (up to 40 years old) on a Siemens mammography system. Each exam consisted of 4 full-field digital mammography images and had two labels (left malignant and right malignant). Mammograms were analyzed by a single mammography-trained radiologist using the BI-RADS reporting tool. Among the 270 labels, 170 were malignant and 100 were non-malignant. We used the program developed by Wu et al., which, according to its authors, achieves an AUC of 0.895 in the general population. As a preliminary test, we ran this program on the publicly available INbreast dataset and obtained an AUC of 0.8708, close to the result reported by the authors.

Results: Applied to our dataset of young women, the program achieved an AUC of 0.876, with a standard error of 0.0290. At the equal-error-rate point of the ROC curve, sensitivity and specificity are both 0.774. From this result we conclude that, at least for our dataset, cancer detection in young women is not substantially more difficult for an AI system than in the general population. Fine-tuning the weights of the original network on the population of young women using transfer learning gave a slight improvement in AUC, to 0.9018±0.0528, where the mean and the standard error were obtained by 5-fold cross-validation. Because the improvement is small and the standard errors are large, a larger test set would be needed to confirm that the observed improvement is real.
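The Results quote three summary statistics: an AUC, its standard error, and the sensitivity and specificity at the equal-error-rate point. The abstract does not say how these were computed, so the following is only a minimal sketch of one common approach. The Hanley-McNeil formula is an assumed choice of standard-error estimator and need not reproduce the reported 0.0290; the function names and the toy scores are hypothetical and are not the study data.

```python
import math

def auc_mann_whitney(scores_pos, scores_neg):
    """AUC as the fraction of (malignant, non-malignant) pairs in which the
    malignant case has the higher score; ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of the AUC by the Hanley-McNeil (1982) approximation.
    Assumption: the abstract does not name its estimator."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc * auc)
           + (n_neg - 1) * (q2 - auc * auc)) / (n_pos * n_neg)
    return math.sqrt(var)

def equal_error_rate_point(scores_pos, scores_neg):
    """Scan every observed score as a threshold (predict malignant if
    score >= threshold) and return (threshold, sensitivity, specificity)
    where sensitivity and specificity are closest to each other."""
    best = None
    for t in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(s >= t for s in scores_pos) / len(scores_pos)
        spec = sum(s < t for s in scores_neg) / len(scores_neg)
        gap = abs(sens - spec)
        if best is None or gap < best[0]:
            best = (gap, t, sens, spec)
    return best[1], best[2], best[3]

# Toy scores (hypothetical): higher means "more likely malignant".
toy_pos = [0.9, 0.8, 0.7, 0.4]
toy_neg = [0.6, 0.3, 0.2, 0.1]

toy_auc = auc_mann_whitney(toy_pos, toy_neg)
print(toy_auc)                                    # 0.9375
print(hanley_mcneil_se(toy_auc, len(toy_pos), len(toy_neg)))
print(equal_error_rate_point(toy_pos, toy_neg))   # (0.6, 0.75, 0.75)
```

On a finite test set sensitivity and specificity rarely coincide exactly, so the sketch returns the threshold at which they are closest; interpolating along the ROC curve is an equally reasonable alternative.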
Conclusions: The experimental data show no substantial degradation in accuracy when a mammogram screening program developed for the general population is applied to young women. Fine-tuning the network on the population of young women also appears to yield a slight improvement in accuracy.
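The fine-tuned AUC is reported as a mean and standard error over 5-fold cross-validation. As a hedged illustration of how such a figure can be assembled, the sketch below splits labels into stratified folds, scores each held-out fold, and summarizes the per-fold AUCs. It makes several assumptions: the standard error is taken as the sample standard deviation of the fold AUCs divided by the square root of the number of folds, `score_fn` is a hypothetical callback standing in for "fine-tune on the training folds, then predict on the held-out fold", and the toy labels and scores are invented. A real split would probably also keep both breasts of one woman in the same fold, which this sketch does not do.

```python
import math
import random

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k=5, seed=0):
    """Deal shuffled indices into k folds class by class, so every fold
    contains both malignant and non-malignant labels."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def cross_validated_auc(labels, score_fn, k=5, seed=0):
    """Return (mean AUC, standard error, per-fold AUCs).
    score_fn(train_idx, test_idx) is a hypothetical callback that returns
    one score per index in test_idx."""
    folds = stratified_folds(labels, k, seed)
    aucs = []
    for f in range(k):
        test_idx = folds[f]
        train_idx = [i for g in range(k) if g != f for i in folds[g]]
        scores = score_fn(train_idx, test_idx)
        aucs.append(auc([labels[i] for i in test_idx], scores))
    mean = sum(aucs) / k
    sd = math.sqrt(sum((a - mean) ** 2 for a in aucs) / (k - 1))
    return mean, sd / math.sqrt(k), aucs

# Toy data (hypothetical): 17 malignant and 10 non-malignant labels.
labels = [1] * 17 + [0] * 10
toy_scores = ([0.6 + 0.02 * i for i in range(17)]
              + [0.3 + 0.04 * i for i in range(10)])

def toy_score_fn(train_idx, test_idx):
    # Stand-in for a fine-tuned model: looks up fixed scores, ignores training.
    return [toy_scores[i] for i in test_idx]

mean_auc, se_auc, fold_aucs = cross_validated_auc(labels, toy_score_fn)
print(f"{mean_auc:.4f} ± {se_auc:.4f}")
```

With only five folds the standard error is itself a noisy estimate, which is consistent with the caution expressed above about needing a larger test set.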