Artificial intelligence redefines RNA virus discovery

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract
RNA viruses are diverse components of global ecosystems. The metagenomic identification of RNA viruses is currently limited to those with sequence similarity to known viruses, such that the highly divergent viruses that comprise the "dark matter" of the virosphere remain challenging to detect. We developed a deep learning algorithm, LucaProt, to search for highly divergent RNA-dependent RNA polymerase (RdRP) sequences in 10,487 global meta-transcriptomes. LucaProt integrates both sequence and structural information to accurately and efficiently detect RdRP sequences. With this approach, we identified 180,571 RNA viral species and 180 superclades (viral phyla/classes). This is the broadest diversity of RNA viruses described to date, including many viruses undetectable using BLAST or HMM approaches. The newly identified RNA viruses were present in diverse ecological niches, including the air, hot springs and hydrothermal vents, and both virus diversity and abundance varied substantially among ecological types. We also identified the longest RNA virus genome (nido-like) observed so far, at 47,250 nucleotides, and expanded the diversity of RNA bacteriophages to more than ten phyla/classes. This study marks the beginning of a new era of virus discovery, with the potential to redefine our understanding of the global virosphere and reshape our understanding of virus evolutionary history.

Competing Interest Statement: The authors have declared no competing interest.
Keywords
RNA virus discovery, artificial intelligence