Novel topological methods for identifying surprising protein tertiary structure relationships

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract

We present fast and simple-to-implement measures of the entanglement of protein tertiary structures which are appropriate for highly flexible structure comparison. These quantities are based on the writhing and crossing numbers, which are heavily utilised in DNA topology studies and which have recently shown some promising results when applied to proteins. Here we show how they can be applied in a novel manner across various scales of the protein’s backbone to identify similar topologies which can be missed by more common comparison methods based on RMSD, secondary structure or primary sequence. We derive empirical bounds on the entanglement implied by these measures and show how they can be used to constrain the search space of a protein for solution scattering, a method highly suited to determining the likely structure of proteins in solution, where crystal structures or machine-learning-based predictions often fail to match experimental data. In addition, we identify large-scale helical geometries present in a large array of proteins, which are consistent across a number of different protein structure types and sequences. This is used in one specific case to demonstrate significant structural similarity between Rossmann fold and TIM barrel proteins, a link which is potentially significant because attempts to engineer the latter have in the past produced the former. Finally, we provide the SWRITHE Python notebook to calculate these metrics.

Author summary

There is much interest in developing quantitative methods to compare different protein structures or identify common sub-structures across protein families. We present novel methods for studying and comparing protein structures based on the entanglement of their amino-acid backbone and demonstrate a number of their critical properties. First, they are shown to be especially useful in identifying similar protein entanglement for structures which may be seen as distinct via more established methods. Second, by studying the distribution of entanglement across a wide sample of proteins, we show that there exists a minimum expected amount (a lower bound) of entanglement given the protein’s length. This bound is shown to be useful in ensuring realistic predictions from experimental structural determination methods. Third, using fundamental properties of this entanglement measure, we identify two common classes of protein sub-structure. The first class consists of large-scale helices, which provide stability to the structure. These helical structures indicate strong structural similarity between two protein families usually regarded as differing significantly. The second class of sub-structure is one which, though complex, has a small net entanglement. This configuration is physically useful in other disciplines, but its function in proteins is not yet clear. Finally, we provide an interactive Python notebook to compute these measures for a given protein.
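The writhing and crossing numbers referred to in the abstract can be illustrated with a short sketch. The code below is not the SWRITHE notebook and does not reproduce the authors' multi-scale procedure; it assumes a simple midpoint discretisation of the Gauss double integral over a polygonal backbone (for example, a list of C-alpha coordinates), and the function name `writhe_and_acn` is hypothetical.

```python
import numpy as np

def writhe_and_acn(points):
    """Approximate writhe and average crossing number of an open polygonal curve.

    Assumption: midpoint-rule discretisation of the Gauss double integral.
    This is an illustrative sketch, not the SWRITHE implementation.
    """
    pts = np.asarray(points, dtype=float)
    seg = pts[1:] - pts[:-1]            # segment vectors (tangent times length)
    mid = 0.5 * (pts[1:] + pts[:-1])    # segment midpoints

    # Pairwise midpoint separations and segment cross products.
    diff = mid[:, None, :] - mid[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    cross = np.cross(seg[:, None, :], seg[None, :, :])

    # Triple product (t_i x t_j) . (r_i - r_j) for every pair of segments.
    num = np.einsum("ijk,ijk->ij", cross, diff)

    # Exclude self-pairs; their numerator is zero but so is the distance.
    np.fill_diagonal(dist, np.inf)
    terms = num / dist**3

    writhe = terms.sum() / (4.0 * np.pi)         # signed sum
    acn = np.abs(terms).sum() / (4.0 * np.pi)    # unsigned sum
    return writhe, acn

# Example: a two-turn helix and its mirror image (z negated).
t = np.linspace(0.0, 4.0 * np.pi, 60)
helix = np.column_stack([np.cos(t), np.sin(t), 0.3 * t])
mirror = helix * np.array([1.0, 1.0, -1.0])
print(writhe_and_acn(helix), writhe_and_acn(mirror))
```

In this sketch the signed sum changes sign under reflection while the unsigned sum does not, and a planar curve gives zero for both. The approximation degrades when non-adjacent segments pass very close to one another, where an exact segment-pair formula would be preferable.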
Keywords
novel topological methods, surprising protein, structure