Novel sequence agnostic computational approach based on the proportion of amino acids of the spike protein for reliable discrimination of SARS-CoV2

Benedict Paul C, S P Abusheikh, R Harini, Ramya Lakshmi D, Gnanadurai John Fletcher

Research Square (2023)

Abstract
Protein sequencing is integral to discovery pipelines. We have developed a sequence-agnostic computational approach, based on the proportion of amino acids in the spike protein, for the reliable discrimination of SARS-CoV2 from other coronaviruses. A total of 717 complete spike protein sequences of coronaviruses were retrieved from the NCBI Virus database. A Python-based algorithm was developed to read the FASTA files and extract the percentage of each amino acid residue (https://github.com/ABUSHEIKHSP/aminoacidProfiler). A Go-based algorithm was developed to analyze the sequences for discriminatory thresholds of signature amino acid proportions (https://github.com/drbenedictpaul/AminoacidSignatureGenerator). The algorithm specifically discriminated SARS-CoV2 from other coronaviruses at a threshold of ≥ 7.4%, with an amino acid signature comprising Leu, Ser, Thr, and Val. In addition, it returned specific signatures for SARS-CoV (Leu, Ser, Thr) and MERS (Leu, Ser). The seasonal coronaviruses exclusively had asparagine alongside their unique signatures. The study findings demonstrate that this novel, pragmatic computational approach can be used with confidence to discriminate SARS-CoV2 from other coronaviruses. Its usefulness need not be restricted to this task; it can be leveraged for diverse applications in protein biology.
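The two steps described in the abstract (computing per-residue percentages from FASTA records, then selecting residues that meet a proportion threshold) can be illustrated with a minimal Python sketch. This is not the authors' code from either repository. The function names `read_fasta`, `composition`, and `signature` are hypothetical, and the sketch assumes that a "signature" means the set of residues whose share of the sequence is at or above the threshold, which is one plausible reading of the abstract rather than a confirmed definition.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def read_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dict."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {h: "".join(parts) for h, parts in records.items()}


def composition(seq):
    """Return the percentage of each standard amino acid in seq.

    Non-standard symbols (for example X or *) are ignored.
    """
    counts = Counter(c for c in seq if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def signature(seq, threshold=7.4):
    """Residues whose proportion is >= threshold percent (assumed definition)."""
    comp = composition(seq)
    return sorted(aa for aa, pct in comp.items() if pct >= threshold)


# Toy input for illustration only; this is NOT a real spike protein sequence.
toy_fasta = ">toy\nLLSS\nTTVVAG\n"
toy_seq = read_fasta(toy_fasta)["toy"]
toy_signature = signature(toy_seq, threshold=15.0)
```

On the toy record, Leu, Ser, Thr, and Val each make up 20% of the ten residues, so `toy_signature` contains `L`, `S`, `T`, and `V`. Applying the same functions to real spike protein sequences with the 7.4% threshold would be the analogous step, though the published signatures come from the authors' own tools and dataset.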
Keywords
spike protein, amino acids, SARS-CoV