The Identifications of Post Translational Modification Sites with Capsule Network

SIMULATION TOOLS AND TECHNIQUES, SIMUTOOLS 2021(2022)

引用 0|浏览0
暂无评分
摘要
Post-translational modification (PTM) is considered a significant biological process with a tremendous impact on the function of proteins in both eukaryotes, and prokaryotes cells. Malonylation of lysine is a newly discovered post-translational modification, which is associated with many diseases, such as type 2 diabetes and different types of cancer. In addition, compared with the experimental identification of propionylation sites, the calculation method can save time and reduce cost. In this paper, we combine principal component analysis with support vector machine (SVM) to propose a new computational model - Mal-prec (malonylation prediction). Firstly, the one-hot encoding, physicochemical properties and the composition of k-spacer acid pairs were used to extract sequence features. Secondly, we preprocess the data, select the best feature subset by principal component analysis (PCA), and predict the malonylation sites by SVM. And then, we do a five-fold cross validation, and the results show that compared with other methods, Mal-prec can get better prediction performance. In the 10-fold cross validation of independent data sets, AUC (area under receiver operating characteristic curve) analysis has reached 96.39%. Mal-pred is used to identify the malonylation sites in the protein sequence, which is a computationally reliable method. It is superior to the existing prediction tools that found in the literature and can be used as a useful tool for identifying and discovering novel malonylation sites in human proteins.
更多
查看译文
关键词
Post translational modification, Malonylation, One-hot encoding, Principal component analysis, Support vector machine
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要