BERT Classifies SARS-CoV-2 Variants

Smart Innovation, Systems and Technologies (2023)

Abstract
Medical diagnostics faced numerous difficulties during the COVID-19 pandemic. One of these has been the need for ongoing monitoring of SARS-CoV-2 mutations. Genomics is the technique most frequently used to identify variants precisely, an approach made possible by the ongoing global collection of RNA samples of the virus. Nevertheless, variant identification techniques are frequently resource-intensive, so the diagnostic capacity of small medical laboratories may not be sufficient. In this work, an effective deep learning strategy for identifying SARS-CoV-2 variants is presented. The work makes two contributions: (1) a fine-tuning architecture of Bidirectional Encoder Representations from Transformers (BERT) to identify SARS-CoV-2 variants; (2) biological insights obtained by exploiting BERT self-attention. This approach enables analysis of the S gene of the virus to quickly recognize its variant. BERT is a transformer-based neural network first developed for natural language processing, but it has been applied effectively in many other domains, including genomic sequence analysis. BERT was therefore fine-tuned to adapt it to the RNA sequence domain, achieving a 98.59% F1-score on test data and successfully identifying the variants circulating to date. Since BERT relies on the self-attention mechanism, the interpretability of the model was also examined: by attending to particular regions of the S gene, BERT extracts pertinent biological information about the variants. The presented approach thus provides insights into the specific characteristics of SARS-CoV-2 RNA samples.
Keywords
BERT, variants, SARS-CoV
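
To illustrate the kind of pipeline the abstract describes, the minimal sketch below fine-tunes a BERT sequence classifier on k-mer tokenized S-gene sequences and exposes the self-attention maps used for interpretability. It is an assumption-laden illustration, not the authors' exact method: the toy sequences, label scheme, k-mer size, and the use of the generic `bert-base-uncased` checkpoint with Hugging Face `transformers` are all placeholders.

```python
# Hypothetical sketch: fine-tuning BERT to classify SARS-CoV-2 variants from
# k-mer tokenized S-gene sequences. Data, labels, and model checkpoint are
# illustrative assumptions, not the paper's actual setup.
import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

def kmers(seq, k=6):
    """Split a nucleotide sequence into overlapping k-mers,
    e.g. k=3 turns 'ATGGTG' into 'ATG TGG GGT GTG'."""
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))

# Toy data standing in for S-gene reads and their variant labels (assumed format).
spike_sequences = ["ATGTTTGTTTTTCTTGTTTTATTGCCA", "ATGTTTGTTTTTCTTGTTTTACTGCCA"]
variant_labels = [0, 1]  # e.g. 0 = one variant, 1 = another (illustrative)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,            # one class per variant in this toy example
    output_attentions=True,  # expose self-attention for interpretability
)

enc = tokenizer([kmers(s) for s in spike_sequences],
                padding=True, truncation=True, max_length=512,
                return_tensors="pt")
labels = torch.tensor(variant_labels)

# One gradient step of fine-tuning (a real run would loop over many batches).
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
out = model(**enc, labels=labels)
out.loss.backward()
optimizer.step()

# Self-attention maps: a tuple with one tensor per layer, each of shape
# [batch, num_heads, seq_len, seq_len]; token positions receiving high
# attention can be mapped back to S-gene regions that drive the prediction.
attentions = out.attentions
print(out.loss.item(), attentions[-1].shape)
```

In this sketch, `output_attentions=True` is what makes the interpretability step possible: the returned attention tensors can be aggregated per token and projected back onto S-gene coordinates, mirroring the abstract's observation that the model attends to variant-discriminating regions of the gene.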