Prediction of protein structural class for low-similarity sequences using support vector machine and PSI-BLAST profile.

Biochimie(2010)

引用 112|浏览20
暂无评分
摘要
Knowledge of structural class plays an important role in understanding protein folding patterns. In this study, a simple and powerful computational method, which combines support vector machine with PSI-BLAST profile, is proposed to predict protein structural class for low-similarity sequences. The evolution information encoding in the PSI-BLAST profiles is converted into a series of fixed-length feature vectors by extracting amino acid composition and dipeptide composition from the profiles. The resulting vectors are then fed to a support vector machine classifier for the prediction of protein structural class. To evaluate the performance of the proposed method, jackknife cross-validation tests are performed on two widely used benchmark datasets, 1189 (containing 1092 proteins) and 25PDB (containing 1673 proteins) with sequence similarity lower than 40% and 25%, respectively. The overall accuracies attain 70.7% and 72.9% for 1189 and 25PDB datasets, respectively. Comparison of our results with other methods shows that our method is very promising to predict protein structural class particularly for low-similarity datasets and may at least play an important complementary role to existing methods.
更多
查看译文
关键词
Protein structural class,Sequence similarity,Support vector machine,Position-specific score matrix
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要