Protein complex similarity based on Weisfeiler-Lehman labeling

semanticscholar(2018)

Abstract
Being able to quantify the similarity between two protein complexes is essential for numerous applications. Prominent examples are searching databases of known complexes with a given query complex, comparing the output of different protein complex prediction algorithms, and summarizing or clustering protein complexes, e.g., for visualization. While the corresponding problems have received much attention for single proteins and protein families, the question of how to model and compute similarity between protein complexes has not yet been systematically studied. Because protein complexes can be naturally modeled as graphs, general graph similarity measures may in principle be used, but these are often computationally hard to obtain and do not take typical properties of protein complexes into account. Here we propose a parametric family of similarity measures based on Weisfeiler-Lehman labeling. We evaluate it on simulated complexes of the extended human integrin adhesome network. Because the connectivity (graph topology) of real complexes is often unknown and hard to obtain experimentally, we use both known protein-protein interaction networks and known interdependencies (constraints) between interactions to simulate more realistic complexes than could be obtained from interaction networks alone. We empirically show that the defined family of similarity measures is in good agreement with edit similarity, a similarity measure derived from graph edit distance, but can be computed much more efficiently. It can therefore be used in large-scale studies and simulations and serve as a basis for further refinements of modeling protein complex similarity.
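To make the idea of a Weisfeiler-Lehman based similarity concrete, the following is a minimal sketch and not the measure defined in the paper. It assumes a complex is given as an adjacency dictionary plus a protein label per node, refines labels for a chosen number of iterations, and compares the resulting label multisets with a Jaccard ratio. The function names `wl_label_multiset` and `wl_similarity`, the Jaccard comparison, and the `iterations` parameter are all illustrative assumptions.

```python
from collections import Counter
import hashlib


def wl_label_multiset(adj, labels, iterations):
    """Collect Weisfeiler-Lehman labels of all nodes over all iterations.

    adj:    dict mapping node -> iterable of neighbouring nodes
    labels: dict mapping node -> initial label (e.g. protein name)
    Returns a Counter over (iteration, label) pairs.
    """
    current = dict(labels)
    bag = Counter((0, lab) for lab in current.values())
    for it in range(1, iterations + 1):
        refined = {}
        for node in adj:
            # New label = own label plus the sorted labels of the neighbours.
            neighbours = tuple(sorted(current[n] for n in adj[node]))
            signature = repr((current[node], neighbours))
            # Hash only to keep labels short; the signature itself would also work.
            refined[node] = hashlib.sha1(signature.encode()).hexdigest()[:12]
        current = refined
        bag.update((it, lab) for lab in current.values())
    return bag


def wl_similarity(g1, g2, iterations=2):
    """Jaccard similarity of the WL label multisets (an assumed choice)."""
    b1 = wl_label_multiset(g1[0], g1[1], iterations)
    b2 = wl_label_multiset(g2[0], g2[1], iterations)
    inter = sum((b1 & b2).values())  # multiset intersection: minimum counts
    union = sum((b1 | b2).values())  # multiset union: maximum counts
    return inter / union if union else 1.0


# Hypothetical toy complexes: same three proteins, different topology.
proteins = {"a": "A", "b": "B", "c": "C"}
path = ({"a": ["b"], "b": ["a", "c"], "c": ["b"]}, proteins)
triangle = ({"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}, proteins)

same_proteins_only = wl_similarity(path, triangle, iterations=0)
with_topology = wl_similarity(path, triangle, iterations=1)
```

With `iterations=0` only the protein sets are compared, so the two toy complexes count as identical; each further iteration lets the topology of a larger neighbourhood influence the score. In this sketch the iteration count plays the role of the parameter of the family. Each iteration only sorts and hashes neighbour labels, which is why such measures are far cheaper than graph edit distance.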