Use of Artificial Intelligence for Analyzing Kidney Stone Composition: Are We There Yet?

Mayo Clinic Proceedings: Digital Health (2023)

Abstract
Kidney stones are associated with significant morbidity and are linked to over $5 billion in economic burden annually, secondary to treatment costs and lost worker productivity [1,2]. Unfortunately, the prevalence of kidney stones has been increasing over the years [3]. Efforts to study the use of artificial intelligence (AI) to assist with the diagnosis, treatment, and prevention of kidney stones have therefore intensified over the last 2 decades [4,5]. Because accurate diagnosis of kidney stones is crucial for patients to receive appropriate treatment and prevention recommendations, Day et al [6] used AI for quality assurance during kidney stone spectra analysis to minimize spectra erroneously classified by clinical laboratory technologists.

The role of AI in kidney stone management has matured considerably since 2000, with 14 studies performed to assess diagnostic prediction, 10 to assess imaging, 9 to assess stone composition, 3 to assess procedure outcomes, 12 to assess extracorporeal shockwave lithotripsy, 4 to assess percutaneous nephrolithotomy, and 3 to assess ureteroscopy [5]. Accurate analysis of stone composition is the most crucial diagnostic procedure for ensuring that the patient receives appropriate treatment and prevention against the recurrence of kidney stones. Common techniques used for kidney stone composition analysis are infrared spectroscopy, X-ray diffraction, and chemical analysis. In 2016, a multicenter study assessed the quality of urinary stone analysis in laboratories within Europe [7]. This study found that only 56% of the laboratories, across 8 countries in Europe, met the quality requirements for accurate analysis of urinary stones.

The study also emphasized the importance of appropriate equipment, reference spectra, and the qualification of laboratory technologists for an accurate analysis. This lack of accuracy in nearly half of the laboratories has motivated efforts to assess the role of AI in improving the quality of kidney stone composition analysis.

Several studies have so far assessed the role of AI in detecting stone composition [8-15]. These studies are summarized in the Table. Most of them assessed the composition of kidney stones with accuracies near or above 90%. What is remarkable is that they did so despite the small sample sizes used for machine learning and deep learning in the majority of these studies. These results suggest that larger sample sizes may further improve the accuracy of AI applications.

Table. Studies Assessing the Role of AI in the Detection of Stone Composition

| Reference, year | Sample size (n) | Technique used | Outcome of interest | Accuracy |
|---|---|---|---|---|
| Bejan et al [8], 2019 | Mined data from >125 million notes in EHR and identified 11,585 patients | StoneX, a natural language processing (NLP) algorithm | NLP algorithm performance in detecting stone composition from EHR | Positive predictive values: uric acid stones, 87.5%; rest of the stones, >90% |
| Kazemi and Mirroshandel [9], 2018 | 936 patients with nephrolithiasis | Data were preprocessed to improve model performance, and various data mining models, such as Bayesian model, decision trees, artificial NNs, and rule-based classifiers, were employed. The final model was an ensemble learning model combining individual classifiers | Predict the chances of nephrolithiasis and kidney stone type based on various risk factors | The final ensemble-based model showed 97.1% accuracy in predicting kidney stone type |
| Aldoukhi et al [10], 2019 | 63 human kidney stones of various compositions | Deep convolutional neural network (DCNN) model based on digital photographs of stones | Network prediction recall in identifying kidney stone composition from digital photographs | Composition prediction recall: uric acid, 94%; calcium oxalate, 90%; cystine, 75%; triple phosphate, 86%. Overall weighted recall: 85% |
| Kriegshauser et al [11], 2016 | 32-stone dataset | Single-source dual-energy computerized tomography (ssDECT) scan-based multiparametric model comprising artificial neural network (ANN), support vector machine (SVM), decision tree (C4.5), RandomTree, and naïve Bayes tree (NBTree) algorithms | Distinguishing uric acid from non-uric acid stones; distinguishing subtypes of non-uric acid stones | For stones >5 mm: 100% accuracy in distinguishing uric acid from non-uric acid stones; 75% accuracy in non-uric acid stone subtype diagnosis |
| Große Hokamp et al [12], 2020 | 200 kidney stones | Dual-energy CT images were used to train ML and shallow NN | Evaluate whether ML using a shallow NN enables prediction of a stone's main component and whether the diagnostic accuracy obtained with the NN is dose independent | Nearly 90% accuracy in predicting the main stone composition for stone sizes 3-18 mm |
| Saçlı et al [13], 2019 | 105 naturally occurring renal calculi samples collected from 40 different patients | Microwave dielectric properties of various types of renal calculi were fitted with Cole-Cole parameters, and the k-nearest neighbor ML algorithm was used | Evaluate whether the ML algorithm can accurately classify renal calculi compositions | 98.2% accuracy for renal calculi classification using Cole-Cole parameters |
| Cui et al [14], 2019 | 98 patients with infective stones and 59 with noninfective stones | Radiomics features extracted from CT images were processed by the LASSO algorithm, then ensemble learning based on bagged trees was used | Performance of the model in differentiating noninfection vs infection kidney stones | 90.7% accuracy in differentiating noninfective vs infective kidney stones |
| Zhang et al [15], 2018 | 18 urate stones and 32 nonurate stones from 45 patients | CT texture analysis (CTTA)-based features were used to train SVM classifiers | Diagnostic accuracy in differentiating uric acid stones from non-uric acid stones based on unenhanced CT images | Average SVM accuracy for differentiating stone subtypes varied from 88% to 92% (after 10-fold cross-validation) |

CT, computerized tomography; EHR, electronic health record; ML, machine learning; NN, neural network.

In a different approach, Bejan et al [8] used a natural language processing-based pattern-matching algorithm that was developed to extract kidney stone composition-related information from a database comprising 125 million clinical notes. On the basis of more than 45,235 matching texts from the clinical notes of 11,585 patients, the algorithm achieved a positive predictive value of >90% in identifying information related to the most common stone types. The algorithm also enabled them to match this information with 14 patient phenotypes related to International Classification of Diseases, Ninth Revision diagnoses (comorbidities and other clinical characteristics) against 6 different stone types and to establish associations among them.

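The general idea of pattern matching over clinical notes can be illustrated with a minimal sketch. This is not the StoneX algorithm of Bejan et al, whose rules are not described here; the term list, the regular expression, and the function name `extract_stone_mentions` are hypothetical and chosen only for illustration.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical vocabulary mapping surface forms to a canonical stone type.
# A real system would need a much richer lexicon and negation handling.
STONE_TERMS = {
    "calcium oxalate": "calcium oxalate",
    "uric acid": "uric acid",
    "hydroxyapatite": "hydroxyapatite",
    "struvite": "struvite",
    "cystine": "cystine",
    "brushite": "brushite",
}

# Match any known term, optionally preceded by a percentage such as "80%".
PATTERN = re.compile(
    r"(?:(\d{1,3})\s*%\s*)?(" + "|".join(map(re.escape, STONE_TERMS)) + r")",
    flags=re.IGNORECASE,
)

def extract_stone_mentions(note: str) -> List[Tuple[str, Optional[int]]]:
    """Return (stone type, percentage or None) for each mention in a note."""
    mentions = []
    for match in PATTERN.finditer(note):
        percent = int(match.group(1)) if match.group(1) else None
        stone_type = STONE_TERMS[match.group(2).lower()]
        mentions.append((stone_type, percent))
    return mentions
```

Applied to a made-up note such as "Stone analysis: 80% calcium oxalate, 20% Uric Acid. No struvite.", the function returns calcium oxalate (80), uric acid (20), and struvite (no percentage). The last item is a negated mention that a naive matcher still picks up, which is one reason the positive predictive value of such an algorithm has to be validated against manual review.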
In their adjusted multivariate regression-based analysis, for example, uric acid stones were associated with type 2 diabetes mellitus (odds ratio [OR], 2.69; 95% confidence interval [CI], 1.91-3.79), an association often reported in the literature, and hydroxyapatite stones were associated with pulmonary collapse (OR, 3.67; 95% CI, 2.10-6.42), an association not reported previously. Moreover, a survival analysis of time to a second stone surgery in their cohort showed statistically significant differences among stone types (P=.03) [8]. These findings suggest that AI holds great promise for diagnosing kidney stones in the future.

The strength of the study reported by Day et al [6] stems from their access to a large institutional historical database, which was developed through clinically validated Fourier transform infrared spectroscopy (FTIR)-assisted analysis of >1,000,000 kidney stones over several years (The Metals Laboratory at Mayo Clinic). This database, in conjunction with an internally developed FTIR spectra library with >300 different stone composition spectra, laid a strong foundation for the development, training, and validation of their machine-learning algorithm.

Spectra readouts from FTIR instruments and technologist-derived interpretations of the spectra were used in phase 1 (algorithm development) of their study. In a prospective phase 2, they also developed a quality assurance initiative using this program, wherein a report was generated twice a week to examine the discordance between the AI program interpretation and the human interpretation (as documented by the technologists in the Laboratory Information System). Incongruent interpretations were then reviewed by an independent qualified technologist to determine the final interpretation, and the rate of verified human misclassifications resulting in a revised report was calculated as the primary outcome. Over the 1-year study period, 81,517 kidney stone spectra were reviewed, and the overall revision rate of misreported kidney stone spectra was nearly 8 times higher (relative risk, 7.9; 95% CI, 4.1-15.2) than in the year before the AI quality assurance program was initiated. However, it is worth noting that although the misreported spectra identified by AI amounted to 7.61 per 10,000 spectra during the study period, the overall event rate was still very low at 0.096% (78 of the 81,517) [6].

The role of explainable AI (XAI) in understanding the reasoning and mechanism behind model-based predictions has been well established in clinical decision-support systems [16]. Day et al [6] explored the relationship between their convolutional neural network and its predictions by incorporating SHapley Additive exPlanation (SHAP) plots, using 600 training samples as the background, across a range of correct and incorrect predictions. Their SHAP analysis examined heatmap intensities, which represented the relative weights of the spectral locations contributing to the model predictions; areas of increased heatmap intensity corresponded to unique wavelength regions for specific stones. With the help of qualified technologists, such areas of increased heatmap intensity were reviewed and confirmed to represent specific wavelength regions used to identify other commonly encountered kidney stones. The incorporation of a multidisciplinary team including laboratory staff, data scientists, and biostatisticians, along with collaborators in clinical practice, gave further strength to their study in that it was not performed in a silo using just a database but was tried and tested in the real world [6].

Because 60-70% of all diagnoses are based on laboratory data, a laboratory error, even one occurring at a low rate, can considerably affect the care of patients on a national level and can add to the economic burden on society. The COVID-19 pandemic has highlighted how much our health care depends on its workforce to ensure patient care delivery. Human-AI-augmented workflows can potentially reduce the workload on health care staff by assisting with repetitive task execution and minimizing staff burnout. With the increased use of digital information, AI-based computational pathology modalities have emerged as promising tools for increasing the accuracy and availability of high-quality care delivered to patients [17].

However, as more such machine-learning-based models are developed, their performance may differ noticeably. There remains a need to compare the performance of models created for similar purposes using standard metrics. Sensitivity, specificity, positive predictive value, negative predictive value, accuracy, precision, recall, the confusion matrix, the receiver operating characteristic curve, the area under the curve, and the F1 score are some of the commonly used tools to assess such models. The Table summarizes the performance metrics of some of the models relevant to this paper.

The rigor and characteristics of the training dataset play a great role in achieving better model performance. Although the robustness of such models is expected to improve with bigger and more diverse datasets, obtaining such datasets may not always be feasible [18,19]. To overcome such limitations, several strategies are often incorporated into model development and training. Sharing data from different sites, which vary in population and disease characteristics, often allows models to be tested and fine-tuned to perform well in varied situations; however, the associated ethical, legal, and logistical barriers often make it challenging. Data augmentation is another approach, wherein synthetic data are generated using the characteristics of the available dataset, often preserving the labels of the original dataset [20,21]. Transfer learning, in contrast, uses an existing solution to a similar problem as a starting point for solving a new classification problem, thus requiring less training data to obtain a robust solution.

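Label-preserving data augmentation can be sketched for one-dimensional spectra as follows. This is only an illustration under stated assumptions, not the procedure used by Day et al or by any of the cited studies: the function names, the three perturbations (a small shift along the axis, an intensity rescaling, and additive Gaussian noise), and all parameter values are hypothetical, and whether such perturbations are physically appropriate for FTIR spectra would need to be checked by laboratory experts.

```python
import random

def augment_spectrum(spectrum, rng, noise_sd=0.01, max_shift=2, scale_range=0.05):
    """Return one synthetic variant of a 1-D spectrum (list of floats).

    Assumes the spectrum has more than `max_shift` points.
    """
    # Shift by a few points along the axis, padding with the edge value.
    shift = rng.randint(-max_shift, max_shift)
    if shift > 0:
        shifted = [spectrum[0]] * shift + spectrum[:-shift]
    elif shift < 0:
        shifted = spectrum[-shift:] + [spectrum[-1]] * (-shift)
    else:
        shifted = list(spectrum)
    # Rescale the overall intensity slightly.
    scale = 1.0 + rng.uniform(-scale_range, scale_range)
    # Add small Gaussian noise to every point.
    return [scale * value + rng.gauss(0.0, noise_sd) for value in shifted]

def augment_dataset(spectra, labels, copies=3, seed=0):
    """Return each original plus `copies` synthetic variants, labels preserved."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    out_spectra, out_labels = [], []
    for spectrum, label in zip(spectra, labels):
        out_spectra.append(list(spectrum))
        out_labels.append(label)
        for _ in range(copies):
            out_spectra.append(augment_spectrum(spectrum, rng))
            out_labels.append(label)  # the synthetic copy keeps the original label
    return out_spectra, out_labels
```

With two labeled spectra and `copies=3`, the function returns eight spectra, each with the same length as its source and carrying the label of the spectrum it was derived from. The key design point is that only the inputs are perturbed; the labels are never altered.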
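Taken together, data sharing, data augmentation, and transfer learning strengthen existing machine-learning approaches by reducing possible bias from the training dataset, optimizing the models for new, unseen data, and increasing generalizability [22].

Human-AI-driven management strategies hold great promise for the future and provide a roadmap for more personalized patient care. Because AI is still in a nascent phase, some challenges remain: most hospitals lack the necessary information technology infrastructure, and it remains to be seen whether scientific societies can review and validate these technologies as they are developed and assess their role in management guidelines. Lack of insurance coverage for human-AI-augmented workflows may also pose a challenge for the implementation of these applications in clinical practice if not addressed by scientific societies. A shift in clinical paradigm is forthcoming, with AI algorithms taking their place in the management guidelines for kidney stone disease, but we are not there yet.

The authors report no competing interests.

References
1. Saigal C.S., Joyce G., Timilsina A.R.; Urologic Diseases of America Project. Direct and indirect costs of nephrolithiasis in an employed population: opportunity for disease management? Kidney Int. 2005; 68: 1808-1814.
2. Pearle M.S., Calhoun E.A., Curhan G.C.; Urologic Diseases of America Project. Urologic diseases in America project: urolithiasis. J Urol. 2005; 173: 848-857.
3. Scales C.D., Smith A.C., Hanley J.M., Saigal C.S.; Urologic Diseases of America Project. Prevalence of kidney stones in the United States. Eur Urol. 2012; 62: 160-165. https://doi.org/10.1016/j.eururo.2012.03.052
4. Shah M., Naik N., Somani B.K., Hameed B.M.Z. Artificial intelligence (AI) in urology-current use and future directions: an iTRUE study. Turk J Urol. 2020; 46: S27-S39.
5. Hameed B.M.Z., Shah M., Naik N., et al. The ascent of artificial intelligence in endourology: a systematic review over the last 2 decades. Curr Urol Rep. 2021; 22: 53. https://doi.org/10.1007/s11934-021-01069-3
6. Day P.L., Erdahl S., Rokke D.L., et al. Artificial intelligence for kidney stone spectra analysis: using artificial intelligence algorithms for quality assurance in the clinical laboratory. Mayo Clinic Proceedings: Digit Health. 2023; 1: 1-12. https://doi.org/10.1016/j.mcpdig.2023.01.001
7. Siener R., Buchholz N., Daudon M., et al. Quality assessment of urinary stone analysis: results of a multicenter study of laboratories in Europe. PLoS One. 2016; 11: e0156606. https://doi.org/10.1371/journal.pone.0156606
8. Bejan C.A., Lee D.J., Xu Y., Hsi R.S. Performance of a natural language processing method to extract stone composition from the electronic health record. Urology. 2019; 132: 56-62. https://doi.org/10.1016/j.urology.2019.07.007
9. Kazemi Y., Mirroshandel S.A. A novel method for predicting kidney stone type using ensemble learning. Artif Intell Med. 2018; 84: 117-126. https://doi.org/10.1016/j.artmed.2017.12.001
10. Aldoukhi A.H., Law H., Black K.M., Roberts W.W., Deng J., Ghani K.R. PD04-06 deep learning computer vision algorithm for detecting kidney stone composition: towards an automated future. J Urol. 2019; 201: e75-e76.
11. Kriegshauser J.S., Silva A.C., Paden R.G., et al. Ex vivo renal stone characterization with single-source dual-energy computed tomography: a multiparametric approach. Acad Radiol. 2016; 23: 969-976. https://doi.org/10.1016/j.acra.2016.03.009
12. Große Hokamp N.G., Lennartz S., Salem J., et al. Dose independent characterization of renal stones by means of dual energy computed tomography and machine learning: an ex-vivo study. Eur Radiol. 2020; 30: 1397-1404. https://doi.org/10.1007/s00330-019-06455-7
13. Saçlı B., Aydınalp C., Cansız G., et al. Microwave dielectric property based classification of renal calculi: application of a kNN algorithm. Comput Biol Med. 2019; 112: 103366.
14. Cui X., et al. Preoperative prediction of infection stones using radiomics features from computed tomography. IEEE Access. 2019; 7: 122675-122683. https://doi.org/10.1109/ACCESS.2019.2937907
15. Zhang G.M.Y., Sun H., Shi B., Xu M., Xue H.D., Jin Z.Y. Uric acid versus non-uric acid urinary stones: differentiation with single energy CT texture analysis. Clin Radiol. 2018; 73: 792-799. https://doi.org/10.1016/j.crad.2018.04.010
16. Antoniadi A.M., Du Y., Guendouz Y., et al. Current challenges and future opportunities for XAI in machine learning-based clinical decision support systems: a systematic review. Appl Sci. 2021; 11: 5088. https://doi.org/10.3390/app11115088
17. Cui M., Zhang D.Y. Artificial intelligence and computational pathology. Lab Invest. 2021; 101: 412-422. https://doi.org/10.1038/s41374-020-00514-0
18. Salamon J., Bello J.P. Deep convolutional neural networks and data augmentation for environmental sound classification. IEEE Signal Process Lett. 2017; 24: 279-283. https://doi.org/10.1109/LSP.2017.2657381
19. Lei C., Hu B., Wang D., Zhang S., Chen Z. A preliminary study on data augmentation of deep learning for image classification. In: Proceedings of the 11th Asia-Pacific Symposium on Internetware. 2019. https://doi.org/10.1145/3361242.3361259
20. Um T.T., Pfister F.M.J., Pichler D., et al. Data augmentation of wearable sensor data for Parkinson's disease monitoring using convolutional neural networks. In: Proceedings of the 19th ACM International Conference on Multimodal Interaction. 2017. https://doi.org/10.1145/3136755.3136817
21. Simard P.Y., Steinkraus D., Platt J.C. Best practices for convolutional neural networks applied to visual document analysis. Paper presented at: Seventh International Conference on Document Analysis and Recognition; August 6, 2003; Edinburgh, United Kingdom.
22. Rozo A., Moeyersons J., Morales J., et al. Data augmentation and transfer learning for data quality assessment in respiratory monitoring. Front Bioeng Biotechnol. 2022; 10: 806761. https://doi.org/10.3389/fbioe.2022.806761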
Keywords
kidney stone composition, artificial intelligence