SCMPSP: Prediction and characterization of photosynthetic proteins based on a scoring card method

BMC Bioinformatics(2015)

引用 23|浏览19
暂无评分
摘要
Background Photosynthetic proteins (PSPs) greatly differ in their structure and function as they are involved in numerous subprocesses that take place inside an organelle called a chloroplast. Few studies predict PSPs from sequences due to their high variety of sequences and structues. This work aims to predict and characterize PSPs by establishing the datasets of PSP and non-PSP sequences and developing prediction methods. Results A novel bioinformatics method of predicting and characterizing PSPs based on scoring card method (SCMPSP) was used. First, a dataset consisting of 649 PSPs was established by using a Gene Ontology term GO:0015979 and 649 non-PSPs from the SwissProt database with sequence identity <= 25%.- Several prediction methods are presented based on support vector machine (SVM), decision tree J48, Bayes, BLAST, and SCM. The SVM method using dipeptide features-performed well and yielded - a test accuracy of 72.31%. The SCMPSP method uses the estimated propensity scores of 400 dipeptides - as PSPs and has a test accuracy of 71.54%, which is comparable to that of the SVM method. The derived propensity scores of 20 amino acids were further used to identify informative physicochemical properties for characterizing PSPs. The analytical results reveal the following four characteristics of PSPs: 1) PSPs favour hydrophobic side chain amino acids; 2) PSPs are composed of the amino acids prone to form helices in membrane environments; 3) PSPs have low interaction with water; and 4) PSPs prefer to be composed of the amino acids of electron-reactive side chains. Conclusions The SCMPSP method not only estimates the propensity of a sequence to be PSPs, it also discovers characteristics that further improve understanding of PSPs. The SCMPSP source code and the datasets used in this study are available at http://iclab.life.nctu.edu.tw/SCMPSP/ .
更多
查看译文
关键词
Support Vector Machine,Propensity Score,Dipeptide,Thylakoid Membrane,Amino Acid Side Chain
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要