Transparency in Artificial Intelligence Reporting in Ophthalmology: A Scoping Review

Dinah Chen, Alexi Geevarghese, Samuel Lee, Caitlin Plovnick, Cansu Elgin, Raymond Zhou, Eric Oermann, Yindalon Aphinyonaphongs, Lama A Al-Aswad

Ophthalmology Science (2024)

Abstract
Topic: This scoping review summarizes artificial intelligence (AI) reporting in the ophthalmology literature with respect to model development and validation. We characterize the state of transparency in the reporting of studies prospectively validating models for disease classification.

Clinical Relevance: Understanding what elements authors currently describe regarding their AI models may aid in the future standardization of reporting. This review highlights the need for transparency to facilitate the critical appraisal of models prior to clinical implementation and to minimize bias and inappropriate use. Transparent reporting can support effective and equitable use in clinical settings.

Methods: Eligible articles (as of January 2022) from PubMed, Embase, Web of Science, and CINAHL were independently screened by 2 reviewers. All observational and clinical trial studies evaluating the performance of an AI model for disease classification of ophthalmic conditions were included. Studies were evaluated for reporting of parameters derived from reporting guidelines (CONSORT-AI, MI-CLAIM) and our previously published editorial on model cards. The reporting of these factors, which included basic model and dataset details (source, demographics) and prospective validation outcomes, was summarized.

Results: Thirty-seven prospective validation studies were included in the scoping review. Eleven additional associated training and/or retrospective validation studies were included if this information could not be determined from the primary articles. These 37 studies validated 27 unique AI models; multiple studies evaluated the same algorithms (EyeArt, IDx-DR, and Medios AI). Details of model development were variably reported; 18 of 27 models described training dataset annotation and 10 of 27 studies reported training data distribution. Demographic information of training data was rarely reported; 7 of the 27 unique models reported age and gender and only 2 reported race and/or ethnicity. At the level of prospective clinical validation, age and gender of populations were more consistently reported (29 and 28 of 37 studies, respectively), but only 9 studies reported race and/or ethnicity data. Scope of use was difficult to discern for the majority of models. Fifteen studies did not state or imply primary users.

Conclusion: Our scoping review demonstrates variable reporting of information related to both model development and validation. The intention of our study was not to assess the quality of the factors we examined, but to characterize what information is, and is not, regularly reported. Our results suggest the need for greater transparency in the reporting of information necessary to determine the appropriateness and fairness of these tools prior to clinical use.

Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.