Optimized splitting of RNA sequencing data by species

biorxiv(2021)

引用 0|浏览7
暂无评分
摘要
Gene expression studies using chimeric xenograft transplants or co-culture systems have proven to be valuable to uncover cellular dynamics and interactions during development or in disease models. However, the mRNA sequence similarities among species presents a challenge for accurate transcript quantification. To identify optimal strategies for analyzing mixed-species RNA sequencing data, we evaluate both alignment-dependent and alignment-independent methods. Alignment of reads to a pooled reference index is effective, particularly if optimal alignments are used to classify sequencing reads by species, which are re-aligned with individual genomes, generating >97% accuracy across a range of species ratios. Alignment-independent methods, such as Convolutional Neural Networks, which extract the conserved patterns of sequences from two species, classify RNA sequencing reads with over 85% accuracy. Importantly, both methods perform well with different ratios of human and mouse reads. Our evaluation identifies valuable and effective strategies to dissect species composition of RNA sequencing data from mixed populations. ### Competing Interest Statement The authors have declared no competing interest.
更多
查看译文
关键词
rna,splitting,species
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要