Prediction of protein aggregation propensity employing SqFt-based logistic regression model

Fatemeh Eshari, Fahime Momeni, Amirreza Faraj Nezhadi, Soudabeh Shemehsavar, Mehran Habibi-Rezaei

International Journal of Biological Macromolecules (2023)

Abstract
Here we present a novel machine-learning approach, based on logistic regression (LR), to predict protein aggregation propensity (PAP), a key factor in the formation of amyloid fibrils. Amyloid fibrils are associated with various neurodegenerative diseases (ND) such as Alzheimer's disease (AD) and Parkinson's disease (PD), which are caused by oxidative stress and impaired protein homeostasis. The LR model is trained and tested on a dataset of hexapeptides with known aggregation tendencies, described by eight physicochemical features. Its performance is evaluated using the F-measure and the Matthews correlation coefficient (MCC) and compared with that of existing methods. The effect of combining sequence and feature information in the prediction is also investigated. The LR model with sequence and feature information achieves a high F-measure (0.841) and MCC (0.6692), outperforming other methods and demonstrating its efficiency and reliability for PAP prediction. Its overall performance was also higher than that of other known servers, for instance Aggrescan, Metamyl, Foldamyloid, and PASTA 2.0. The LR model can be accessed at: https://github.com/KatherineEshari/Protein-aggregation-prediction.
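
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how the F-measure and MCC can be computed from binary predictions (1 = aggregating, 0 = non-aggregating). It is not taken from the authors' repository. The function names and the toy labels are assumptions made purely for illustration, and the F-measure is assumed to be the balanced F1 variant.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def f_measure(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels, invented for illustration only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(f"F-measure = {f_measure(tp, fp, fn):.3f}")  # F-measure = 0.667
print(f"MCC       = {mcc(tp, tn, fp, fn):.3f}")    # MCC       = 0.333
```

MCC uses all four cells of the confusion matrix, so it stays informative when the aggregating and non-aggregating classes are imbalanced, which is one common reason for reporting it alongside the F-measure.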
Keywords
Protein aggregation, Logistic regression, Machine learning