Fast Short Read De-novo Assembly Using Overlap-Layout-Consensus Approach.
IEEE/ACM Transactions on Computational Biology and Bioinformatics (2020)
Abstract
The de-novo genome assembly is a challenging computational problem for which several pipelines have been developed. The advent of long-read sequencing technology has resulted in a new set of algorithmic approaches for the assembly process. In this work, we identify that one of these new and fast long-read assembly techniques (using Minimap2 and Miniasm) can be modified for the short-read assembly process. This possibility motivated us to customize a long-read assembly approach for applications in a short-read assembly scenario. Here, we compare and contrast our proposed de-novo assembly pipeline (MiniSR) with three other recently developed programs for the assembly of bacterial and small eukaryotic genomes. We have documented two trade-offs: one between speed and accuracy and the other between contiguity and base-calling errors. Our proposed assembly pipeline shows a good balance in these trade-offs. The resulting pipeline is 6 and 2.2 times faster than the short-read assemblers Spades and SGA, respectively. MiniSR generates assemblies of superior N50 and NGA50 to SGA, although assemblies are less complete and accurate than those from Spades. A third tool, SOAPdenovo2, is as fast as our proposed pipeline but had poorer assembly quality.
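
The contiguity metrics named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper or from MiniSR; the function name `n50` and its `total` parameter are hypothetical, chosen only for illustration. N50 is the length of the contig at which the running sum of contig lengths, sorted from longest to shortest, first reaches half of the assembly size. Passing a reference genome length as `total` gives the NG50 variant; NGA50 is usually computed the same way, but over the lengths of blocks aligned to the reference after contigs are split at misassemblies.

```python
def n50(lengths, total=None):
    """Return the N50-style statistic for a list of contig lengths.

    `total` is a hypothetical parameter: when omitted, the assembly size
    (sum of lengths) is used, giving N50; when set to a reference genome
    length, the result corresponds to NG50. Returns 0 if the threshold is
    never reached (for example, empty input).
    """
    ordered = sorted(lengths, reverse=True)
    if total is None:
        total = sum(ordered)
    threshold = total / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total
        # defines the statistic.
        if running >= threshold:
            return length
    return 0


if __name__ == "__main__":
    contigs = [80, 70, 50, 40, 30, 20]  # made-up lengths, for illustration only
    print(n50(contigs))             # N50 over the assembly size
    print(n50(contigs, total=400))  # NG50 against an assumed reference length of 400
```

A higher value means that more of the assembly sits in long contigs, which is the sense in which the abstract compares contiguity across assemblers.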
Keywords
Pipelines,Genomics,Bioinformatics,Indexing,Tools,DNA