Detection of fusion transcripts and their genomic breakpoints from RNA sequencing data

biorxiv(2021)

引用 0|浏览12
暂无评分
摘要
Spliced fusion-transcripts are typically identified by RNA-seq without elucidating the causal genomic breakpoints. However, non poly(A)-enriched RNA-seq contains large proportions of intronic reads spanning also genomic breakpoints. Using 1.274 RNA-seq samples, we investigated what additional information is embedded in non poly(A)-enriched RNA-seq data. Here, we present our novel, graph-based, Dr. Disco algorithm that makes use of both intronic and exonic RNA-seq reads to identify not only fusion transcripts but also genomic breakpoints in gene but also in intergenic regions. Dr. Disco identified TMPRSS2-ERG fusions with genomic breakpoints and other transcribed rearrangements from multiple RNA-sequencing cohorts. In breast cancer and glioma samples Dr. Disco identified rearrangement hotspots near CCND1 and MDM2 and could directly associate this with increased expression. A comparison with matched DNA-sequencing revealed that most genomic breakpoints are not, or minimally, transcribed while also revealing highly expressed translocations missed by DNA-seq. By using the full potential of non poly(A)-enriched RNA-seq data, Dr. Disco can reliably identify expressed genomic breakpoints and their transcriptional effects. ### Competing Interest Statement The authors have declared no competing interest.
更多
查看译文
关键词
fusion transcripts,genomic breakpoints,rna,sequencing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要