BERT_PLPS: A BERT-based Model for Predicting Lysine Phosphoglycerylation Sites

Songning Lai, Yankun Cao, Pengwei Wang, Lan Ye, Zhi Li

Research Square (2023)

Abstract

As one of the most important post-translational modifications, lysine phosphoglycerylation affects many important biosynthetic processes in the human body. However, traditional experimental methods for recognizing lysine phosphoglycerylation sites are both expensive and time-consuming. Computational techniques may provide an economical and efficient way to predict these sites, so establishing prediction models with high accuracy is both necessary and meaningful. In the present study, we propose a BERT-based model, BERT_PLPS, which accurately predicts lysine phosphoglycerylation sites. The model extracts amino acid sequence features with three algorithms: CKSAAP, AAC, and BE. Sample equalization (class balancing) is performed using the ADASYN and KNN algorithms. The dimensionality of the data is reduced with the ISOMap algorithm, and the features are encoded into feature sequences by an encoder as the input to a BERT-based prediction model. To better learn the intrinsic biological language of lysine, we replaced the original static mask with a dynamic random mask. Compared to other machine learning or deep learning-based models, BERT_PLPS achieves up to 99.53% accuracy and outperforms the most advanced model (PLP_FS), with an increase of approximately 0.35% in ACC and approximately 0.93% in MCC.
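The three sequence encodings named in the abstract can be illustrated with a minimal sketch. It assumes the usual definitions of these encodings (CKSAAP as the composition of k-spaced amino acid pairs, AAC as amino acid composition, BE as binary one-hot encoding); the window string, the `max_gap` value, the normalization by the number of valid pairs, and all function names are illustrative assumptions, not details taken from the paper.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def aac(window):
    """Amino acid composition: frequency of each standard residue in the window."""
    residues = [r for r in window if r in AMINO_ACIDS]
    n = len(residues) or 1  # avoid division by zero on an empty window
    return [residues.count(a) / n for a in AMINO_ACIDS]


def binary_encoding(window):
    """Binary (one-hot) encoding: 20 values per position, all zeros for unknown residues."""
    vec = []
    for r in window:
        vec.extend(1 if r == a else 0 for a in AMINO_ACIDS)
    return vec


def cksaap(window, max_gap=2):
    """Composition of k-spaced amino acid pairs for k = 0..max_gap (assumed range)."""
    vec = []
    for k in range(max_gap + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]  # residues separated by k positions
            if pair in counts:
                counts[pair] += 1
                total += 1
        # Normalize by the number of valid pairs found at this gap (an assumption).
        vec.extend(counts[p] / total if total else 0.0 for p in PAIRS)
    return vec


def encode(window, max_gap=2):
    """Concatenate the three encodings into one feature vector."""
    return cksaap(window, max_gap) + aac(window) + binary_encoding(window)


if __name__ == "__main__":
    window = "ACDKACD"  # hypothetical 7-residue window centered on a lysine (K)
    features = encode(window)
    print(len(features))
```

In a pipeline like the one described, a vector of this kind would then go through class balancing and dimensionality reduction before being passed to the model; those steps are not shown here.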
Keywords
bert-based