Direct prediction of Homologous Recombination Deficiency from routine histology in ten different tumor types with attention-based Multiple Instance Learning: a development and validation study

medRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract
Background: Homologous Recombination Deficiency (HRD) is a pan-cancer predictive biomarker that identifies patients who benefit from therapy with PARP inhibitors (PARPi). However, testing for HRD is highly complex. Here, we investigated whether Deep Learning can predict HRD status solely based on routine Hematoxylin & Eosin (H&E) histology images in ten cancer types.

Methods: We developed a fully automated deep learning pipeline with attention-weighted multiple instance learning (attMIL) to predict HRD status from histology images. A combined genomic scar HRD score, which integrated loss of heterozygosity (LOH), telomeric allelic imbalance (TAI) and large-scale state transitions (LST), was calculated from whole genome sequencing data for n=4,565 patients from two independent cohorts. The primary statistical endpoint was the Area Under the Receiver Operating Characteristic curve (AUROC) for the prediction of genomic scar HRD with a clinically used cutoff value.

Results: We found that HRD status is predictable in tumors of the endometrium, pancreas and lung, reaching cross-validated AUROCs of 0.79, 0.58 and 0.66, respectively. Predictions generalized well to an external cohort, with AUROCs of 0.93, 0.81 and 0.73, respectively. Additionally, an HRD classifier trained on breast cancer yielded an AUROC of 0.78 in internal validation and was able to predict HRD in endometrial, prostate and pancreatic cancer with AUROCs of 0.87, 0.84 and 0.67, indicating that an HRD-like phenotype is shared across tumor entities.

Conclusion: In this study, we show that HRD is directly predictable from H&E slides using attMIL within and across ten different tumor types.

### Competing Interest Statement

JNK reports consulting services for Owkin, France, Panakeia, UK and DoMore Diagnostics, Norway and has received honoraria for lectures by MSD, Eisai and Fresenius. JSRF reports a leadership (board of directors) role at Grupo Oncoclinicas, stock or other ownership interests at Repare Therapeutics and Paige.AI, and a consulting or advisory role at Genentech/Roche, Invicro, Ventana Medical Systems, Volition RX, Paige.AI, Goldman Sachs, Bain Capital, Novartis, Repare Therapeutics, Lilly, Saga Diagnostics, Swarm and Personalis. No other potential conflicts of interest are reported by any of the authors.

### Funding Statement

JNK is supported by the German Federal Ministry of Health (DEEP LIVER, ZMVI1-2520DAT111) and the Max-Eder-Programme of the German Cancer Aid (grant #70113864), the German Federal Ministry of Education and Research (PEARL, 01KD2104C), and the German Academic Exchange Service (SECAI, 57616814). This research was supported by the National Institute for Health and Care Research (NIHR, NIHR213331) Leeds Biomedical Research Centre. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. JSRF is funded in part by the Breast Cancer Research Foundation, a Susan G Komen Leadership Grant, the NIH/NCI P50 CA247749 01 grant and by the NIH/NCI Cancer Center Core Grant P30-CA008748.

### Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes

The WSI, molecular and clinical data for TCGA and CPTAC cohorts are publicly accessible at and (accessed 08 March 2022). Script for calculating the HRD score is available under (accessed 06 June 2022). All other source codes can be downloaded under . Our calculated HRD score is publicly available in Supplementary Table 2. Moreover, our custom TCGA-BRCA HRD-H and HRD-L group can be accessed for the PanCancer Atlas cohort at (Supplementary 3).

* AI: artificial intelligence
* ASCAT: Allele-Specific Copy number Analysis of Tumors
* attMIL: attention-weighted multiple instance learning
* AUROC: Area Under the Receiver Operating Characteristic curve
* BRCA: breast invasive carcinoma
* BRCA1/2: Breast Cancer genes 1 and 2
* CI: confidence interval
* CIOMS: Council for International Organizations of Medical Sciences
* CPTAC: Clinical Proteomic Tumor Analysis Consortium
* CRC: colorectal cancer
* DL: Deep Learning
* DSB: DNA double-strand breaks
* ER-: estrogen receptor negative
* ER+: estrogen receptor positive
* FDA: U.S. Food and Drug Administration
* GBM: glioblastoma
* GDC: Genomic Data Commons
* GIS: genomic instability score
* H&E: Hematoxylin & Eosin
* HR: Homologous recombination
* HRD-H: HRD high
* HRD-L: HRD low
* HRD: Homologous Recombination Deficiency
* HRR: Homologous recombination repair
* LIHC: liver hepatocellular carcinoma
* LOH: loss of heterozygosity
* LSCC: squamous cell carcinoma of the lung
* LST: large-scale state transitions
* LUAD: adenocarcinoma of the lung
* LUSC: squamous cell carcinoma of the lung
* OV: ovarian cancer
* PAAD: pancreatic adenocarcinoma
* PDA: pancreatic adenocarcinoma
* PARP: Poly(ADP-Ribose)-polymerase
* PARPi: Poly(ADP-Ribose)-polymerase inhibitor
* PRAD: prostate adenocarcinoma
* PRC: precision-recall curve
* ROC: receiver operating characteristic curve
* SBS3: single base substitution 3
* SNP: single nucleotide polymorphism
* SSDBs: single strand DNA breaks
* SSL: self-supervised learning
* TAI: telomeric allelic imbalance
* TCGA: The Cancer Genome Atlas
* TRIPOD: Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis
* UCEC: endometrial carcinoma
* WSI: whole slide images
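To make the Methods concrete, the following is a minimal sketch of how a combined genomic scar score and the AUROC endpoint could be computed. It is not the authors' script. The names `ScarCounts`, `hrd_score`, `is_hrd_high` and `auroc` are hypothetical, and the cutoff of 42 is an assumption for illustration: the abstract only refers to "a clinically used cutoff value" without stating the number.

```python
from dataclasses import dataclass

# ASSUMPTION: illustrative cutoff only; the abstract does not state the value.
HRD_CUTOFF = 42


@dataclass
class ScarCounts:
    """Per-patient genomic scar counts (hypothetical container)."""
    loh: int  # loss of heterozygosity
    tai: int  # telomeric allelic imbalance
    lst: int  # large-scale state transitions


def hrd_score(counts: ScarCounts) -> int:
    """Combined genomic scar score: unweighted sum of LOH, TAI and LST."""
    return counts.loh + counts.tai + counts.lst


def is_hrd_high(counts: ScarCounts, cutoff: int = HRD_CUTOFF) -> bool:
    """Binarize the score; scores at or above the cutoff are labeled HRD-high."""
    return hrd_score(counts) >= cutoff


def auroc(labels, scores) -> float:
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    patients = [ScarCounts(15, 20, 10), ScarCounts(5, 10, 8)]
    labels = [is_hrd_high(p) for p in patients]
    # Made-up model outputs, used only to exercise the function.
    print(labels, auroc(labels, [0.8, 0.3]))
```

The pairwise AUROC above is quadratic in the number of patients, which is fine for a sketch; a rank-based implementation would be preferable for cohorts of several thousand cases.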
Keywords
multiple instance learning,homologous recombination deficiency,different tumor types,routine histology,homologous recombination,attention-based
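As an illustration of the attention-based multiple instance learning named in the title and keywords, the sketch below shows one common formulation of attention pooling: each tile's feature vector receives a score from a small tanh layer, the scores are normalized with a softmax, and the slide-level representation is the attention-weighted sum of tile features. The abstract does not specify the architecture, so this formulation, the function names `attention_pool` and `predict_slide`, and all parameter shapes are assumptions. The weights here are untrained placeholders.

```python
import math


def attention_pool(tiles, V, w):
    """Attention pooling over a bag of tile feature vectors.

    tiles: K feature vectors of length D (one per image tile)
    V:     L x D matrix of the hidden attention layer (assumed shape)
    w:     length-L vector mapping hidden units to one score per tile
    Returns (attention weights of length K, pooled vector of length D).
    """
    logits = []
    for h in tiles:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        logits.append(sum(wj * uj for wj, uj in zip(w, hidden)))
    # Numerically stable softmax over the K tile scores.
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    attn = [e / total for e in exps]
    dim = len(tiles[0])
    pooled = [sum(a * h[d] for a, h in zip(attn, tiles)) for d in range(dim)]
    return attn, pooled


def predict_slide(tiles, V, w, clf_w, clf_b):
    """Slide-level probability from the pooled vector via a logistic layer."""
    attn, pooled = attention_pool(tiles, V, w)
    logit = sum(c * z for c, z in zip(clf_w, pooled)) + clf_b
    return 1.0 / (1.0 + math.exp(-logit)), attn


if __name__ == "__main__":
    # Three tiles with two features each; all parameters are placeholders.
    tiles = [[0.2, 1.0], [1.5, -0.3], [0.0, 0.4]]
    V = [[0.5, -0.5], [0.1, 0.9]]
    w = [1.0, -1.0]
    prob, attn = predict_slide(tiles, V, w, clf_w=[0.7, -0.2], clf_b=0.0)
    print(prob, attn)
```

Because the attention weights sum to one, they can be read as each tile's relative contribution to the slide-level prediction, which is what makes this kind of model usable for highlighting regions of a whole slide image.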