Fast and accurate taxonomic classification for viral genomes with VISTA

Research Square (2023)

Abstract
The exponential growth of viral genome sequences in public databases has created a pressing need for a universal, scalable, and automated taxonomic framework to support comprehensive virus studies. Here, we present VISTA (Virus Sequence-based Taxonomy Assignment), a novel computational tool for virus taxonomy that combines a pairwise sequence comparison system with a framework for automatically identifying demarcation thresholds. VISTA uses physicochemical property sequences, k-mer profiles, and selected features to construct a comprehensive distance-based framework for taxonomic clustering. Through a systematic comparison with Pairwise Sequence Comparison (PASC), we demonstrate that VISTA achieves significantly improved separation of taxonomic groups, establishes more objective taxonomic demarcation thresholds, and runs substantially faster. We successfully applied VISTA to the class Caudoviricetes and 38 other virus families, indicating that our approach is robust, scalable, and capable of providing taxonomy assignments for both prokaryotic and eukaryotic viruses. Moreover, applying VISTA to 679 unclassified phage genomes recovered from metagenomic data identified 46 novel virus families. VISTA is available as both a command-line tool and a user-friendly web portal at https://ngdc.cncb.ac.cn/vista.
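To make the idea of a k-mer-based pairwise distance concrete, the following is a minimal Python sketch and not VISTA's actual implementation. It assumes a plain normalized k-mer frequency profile over the alphabet ACGT and a cosine distance; the function names (`kmer_profile`, `cosine_distance`, `same_group`), the choice of k = 3, and the hand-supplied `threshold` are all hypothetical. VISTA itself also uses physicochemical property sequences and selected features, and it identifies demarcation thresholds automatically, none of which is modeled here.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(seq, k=3):
    """Return normalized k-mer frequencies over the DNA alphabet ACGT.

    k-mers containing other characters (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def cosine_distance(u, v):
    """Cosine distance between two equal-length profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an empty profile as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def same_group(seq_a, seq_b, threshold, k=3):
    """Hypothetical rule: two genomes share a group if their distance
    does not exceed a demarcation threshold supplied by the caller."""
    d = cosine_distance(kmer_profile(seq_a, k), kmer_profile(seq_b, k))
    return d <= threshold
```

In this sketch, a taxonomic demarcation threshold is simply a cutoff on the pairwise distance: pairs at or below the cutoff are placed in the same group, and pairs above it are separated.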
Keywords
viral genomes, accurate taxonomic classification