Protein k-mers enable assembly-free microbial metapangenomics

biorxiv(2022)

引用 1|浏览5
暂无评分
摘要
An estimated 2 billion species of microbes exist on Earth with orders of magnitude more strains. Microbial pangenomes are created by aggregating all genomes of a single clade and reflect the metabolic diversity of groups of organisms. As de novo metagenome analysis techniques have matured and reference genome databases have expanded, metapangenome analysis has risen in popularity as a tool to organize the functional potential of organisms in relation to the environment from which those organisms were sampled. However, the reliance on assembly and binning or on reference databases often leaves substantial portions of metagenomes unanalyzed, thereby underestimating the functional potential of a community. To address this challenge, we present a method for metapangenomics that relies on amino acid k-mers (kaa-mers) and metagenome assembly graph queries. To enable this method, we first show that kaa-mers estimate pangenome characteristics and that open reading frames can be accurately predicted from short shotgun sequencing reads using the previously developed tool orpheum. These techniques enable pangenomics to be performed directly on short sequencing reads. To enable metapangenome analysis, we combine these approaches with compact de Bruijn assembly graph queries to directly generate sets of sequencing reads for a specific species from a metagenome. When applied to stool metagenomes from an individual receiving antibiotics over time, we show that these approaches identify strain fluctuations that coincide with antibiotic exposure. ### Competing Interest Statement The authors have declared no competing interest.
更多
查看译文
关键词
protein,k-mers,assembly-free
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要