Exploring high-quality microbial genomes by assembling short-reads with long-range connectivity

bioRxiv (2023)

Abstract
Although long-read sequencing can generate complete genomes of unculturable microbes, its high cost hinders its widespread application in large cohorts. An alternative is to assemble short reads that carry long-range connectivity information, which can be a cost-effective way to generate high-quality microbial genomes. We developed Pangaea to improve metagenome assembly using short reads with physical or virtual barcodes. It uses a deep-learning-based binning algorithm to assemble co-barcoded reads with similar sequence contexts and abundances, which improves the assemblies of high- and medium-abundance microbes. Pangaea also uses a multi-thresholding reassembly strategy to refine the assemblies of low-abundance microbes. We benchmarked Pangaea on linked reads and on a combination of short and long reads from mock communities and human gut metagenomes. Pangaea achieved significantly higher contig continuity and produced more near-complete metagenome-assembled genomes (NCMAGs) than existing assemblers. Pangaea also generated three complete, circular NCMAGs from the human gut microbiomes.

Competing Interest Statement
LJH is an employee of Kangmeihuada GeneTech Co., Ltd (KMHD).

MAG: Metagenome-Assembled Genome
NCMAG: Near-Complete Metagenome-Assembled Genome
OGRE: Overlap Graph-based Read clustEring
PacBio CLR: PacBio Continuous Long-Reads
RSS: Resident Set Size
rRNA: Ribosomal RNA
stLFR: single-tube Long Fragment Read
TELL-Seq: Transposase Enzyme-Linked Long-read Sequencing
TNF: TetraNucleotide Frequency
tRNA: Transfer RNA
UST: Universal Sequencing Technology
VAE: Variational AutoEncoder
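
The abstract says reads are binned by sequence context and abundance, and the abbreviation list includes TNF (tetranucleotide frequency). As a rough illustration of what a TNF feature vector is, the sketch below counts overlapping 4-mers in a sequence and normalizes them to frequencies. This is not Pangaea's code: the function name `tnf`, the handling of ambiguous bases, and the lack of reverse-complement merging are assumptions made only for this example.

```python
from itertools import product

# All 256 possible tetranucleotides over A, C, G, T, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def tnf(seq):
    """Return a 256-element tetranucleotide frequency vector for seq.

    Hypothetical helper for illustration only. Windows containing
    characters other than A, C, G, T (for example N) are skipped.
    """
    seq = seq.upper()
    counts = [0] * len(KMERS)
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        # No valid 4-mer found: return an all-zero vector.
        return [0.0] * len(KMERS)
    return [c / total for c in counts]


if __name__ == "__main__":
    vec = tnf("ACGTACGT")
    print(len(vec), vec[INDEX["ACGT"]])
```

Vectors like this are a common way to summarize sequence composition so that sequences from the same genome can be grouped together; how Pangaea actually builds and combines its features is described in the paper itself, not here.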