An Integrated Pipeline For Detecting And Characterizing Structural Variation In Cancer

Cancer Research (2015)

Abstract
Detection and characterization of somatic structural variants (SVs) and copy-number variants (CNVs) from whole genome sequencing remains a challenging part of cancer analysis. Many callers have been developed that use different detection strategies, but most methods suffer from high rates of false positives and false negatives, and agreement between different callers is usually low. We have developed a flexible pipeline that combines the results of multiple callers, filters calls to remove likely artifacts, and functionally annotates the resulting variants. We employ a diverse set of variant callers utilizing a combination of read depth, read pair, and split read detection methods: NBIC-seq (Xi et al., 2011), Crest (Wang et al., 2011), Delly (Rausch et al., 2012), and BreakDancer (Chen et al., 2009). To remove artifact calls due to mis-mapping, we apply filters that discard predicted SVs whose breakpoints exhibit certain sequence features (e.g. extensive mapping ambiguity, high repeat content). SVs corresponding to known germline variants (1000G, DGV, in-house database) are marked and removed as unlikely somatic variants: this greatly helps to prevent both sequencing protocol- and caller-specific artifacts as well as false positive somatic calls arising from missed calls in the matched germline sample. Finally, we employ our sensitive split read mapper SplazerS to identify SV breakpoints with base pair precision. In this step, we are also able to remove remaining germline variants for which we find split read support in the matched normal sample. The final predicted structural variants are annotated for overlap with SVs in COSMIC, overlap with known cancer genes and potential impact on gene structure. We use a public synthetic data set (DREAM challenge; Boutros et al., 2014) to demonstrate that using our selected ensemble of tools significantly improves sensitivity as compared to any single caller and that our filters effectively remove artifacts. 
Further, we show results from a set of colorectal cancer samples (Brannon et al., 2014) in which highly similar primary and metastatic tumors show excellent agreement in somatic SV calls in the absence of overlap between unrelated samples. Results from testing our pipeline on TCGA glioblastoma multiforme tumors, for which validated genomic rearrangements are available, will also be presented. In conclusion, our pipeline improves detection of SVs by integrating orthogonal calling methods and facilitates identification of clinically relevant SVs through effective filters and cancer-specific functional annotation.

Citation Format: Minita Shah, Dayna M. Oschwald, Soren Germer, Michael C. Zody, Toby Bloom, Anne-Katrin Emde. An integrated pipeline for detecting and characterizing structural variation in cancer. [abstract]. In: Proceedings of the 106th Annual Meeting of the American Association for Cancer Research; 2015 Apr 18-22; Philadelphia, PA. Philadelphia (PA): AACR; Cancer Res 2015;75(15 Suppl):Abstract nr 4876. doi:10.1158/1538-7445.AM2015-4876
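The abstract does not specify how calls from different callers are matched or how overlap with known germline variants is decided. As a rough illustration only, the sketch below assumes a 50% reciprocal-overlap criterion on same-type, same-chromosome intervals. The `SVCall` type, the function names, and the `min_ro` threshold are all hypothetical and are not taken from the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVCall:
    """Hypothetical minimal SV record; assumes start < end on one chromosome."""
    chrom: str
    start: int
    end: int
    svtype: str
    caller: str


def reciprocal_overlap(a: SVCall, b: SVCall) -> float:
    """Smaller of the two fractions of each interval covered by the other."""
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def same_event(a: SVCall, b: SVCall, min_ro: float = 0.5) -> bool:
    """Assumed matching rule: same chromosome, same type, enough overlap."""
    return (
        a.chrom == b.chrom
        and a.svtype == b.svtype
        and reciprocal_overlap(a, b) >= min_ro
    )


def merge_calls(calls, min_ro: float = 0.5):
    """Greedily group calls that match the first member of an existing cluster."""
    clusters = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.start, c.end)):
        for cluster in clusters:
            if same_event(cluster[0], call, min_ro):
                cluster.append(call)
                break
        else:
            clusters.append([call])
    return clusters


def somatic_candidates(calls, germline, min_ro: float = 0.5):
    """Drop merged clusters in which any call matches a known germline SV."""
    return [
        cluster
        for cluster in merge_calls(calls, min_ro)
        if not any(same_event(c, g, min_ro) for c in cluster for g in germline)
    ]


calls = [
    SVCall("chr1", 1000, 2000, "DEL", "delly"),
    SVCall("chr1", 1050, 1990, "DEL", "breakdancer"),
    SVCall("chr2", 500, 900, "DUP", "crest"),
]
germline = [SVCall("chr2", 480, 910, "DUP", "known_germline_db")]
kept = somatic_candidates(calls, germline)
```

In this toy input, the two chr1 deletions are grouped into one cluster supported by two callers, while the chr2 duplication is discarded because it matches a known germline entry. A real pipeline would also need to handle translocations, breakpoint uncertainty, and caller-specific coordinate conventions, none of which this sketch attempts.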