Parallelization of the functional flow algorithm for prediction of protein function using protein-protein interaction networks

High Performance Computing and Simulation (2011)

Abstract
Protein-protein interaction networks provide important information about the functions of proteins. Various studies analyze interaction networks and predict the functions of novel proteins based on their network connectivity. However, all of these methods are sequential and do not utilize high performance computing. Functional flow is one such method; it uses network connectivity, distance effect, and network topology, with both local and global views, to predict protein function. With these advantages, the functional flow algorithm produces more accurate results than other techniques. However, because no parallelized version of the algorithm exists, the method cannot be practically applied to large-scale networks of complex species. In this paper, we provide a parallel implementation of functional flow. We use Hadoop, an open source map/reduce environment. For our experiments, we installed Hadoop on 18 hosts with eight cores each. The first map/reduce job distributes the protein interaction network in a format that allows parallel distributed computing on all the worker nodes. The other map/reduce jobs generate flows for each known protein function, and the functions of novel proteins are predicted by accumulating all of these generated flows. Our experiments show that the method can be distributed across worker nodes efficiently and that performance improves as the number of resources increases.
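The job structure described in the abstract (one pass that generates flows per known function, followed by accumulation of those flows) can be illustrated with a small single-machine sketch in plain Python. This is not the authors' Hadoop code. The abstract does not state the flow update rule, so the rule used here (annotated proteins act as unlimited sources, flow only runs from a fuller reservoir to an emptier one, and each edge carries at most its weight) is an assumption, as are all function names, the graph layout, and the example annotations.

```python
from collections import defaultdict


def map_flows(protein, neighbors, reservoir):
    """Map step (hypothetical): emit (protein, amount) pairs for flow leaving `protein`.

    `neighbors` maps each neighbor to an edge weight. The downhill rule and the
    capacity limit below are assumptions, not details taken from the abstract.
    """
    total_weight = sum(neighbors.values())
    pairs = []
    for neighbor, weight in neighbors.items():
        if reservoir[protein] > reservoir[neighbor]:  # flow only runs downhill
            share = reservoir[protein] * weight / total_weight
            amount = min(weight, share)  # an edge carries at most its weight
            pairs.append((neighbor, amount))   # inflow at the neighbor
            pairs.append((protein, -amount))   # matching outflow at the sender
    return pairs


def reduce_flows(pairs):
    """Reduce step (hypothetical): sum inflow and net reservoir change per protein."""
    inflow = defaultdict(float)
    net = defaultdict(float)
    for protein, amount in pairs:
        net[protein] += amount
        if amount > 0:
            inflow[protein] += amount
    return inflow, net


def functional_flow(graph, sources, steps):
    """Run `steps` synchronous iterations for one function; return accumulated inflow."""
    reservoir = {p: (float("inf") if p in sources else 0.0) for p in graph}
    score = defaultdict(float)
    for _ in range(steps):
        pairs = []
        for protein, neighbors in graph.items():
            pairs.extend(map_flows(protein, neighbors, reservoir))
        inflow, net = reduce_flows(pairs)
        for protein, delta in net.items():
            reservoir[protein] += delta
        for protein, amount in inflow.items():
            score[protein] += amount
    return dict(score)


def predict(graph, annotations, steps):
    """Assign each unannotated protein the function with the highest accumulated flow."""
    known = set().union(*annotations.values())
    best = {}
    for function, sources in annotations.items():
        for protein, value in functional_flow(graph, sources, steps).items():
            if protein in known:
                continue
            if value > best.get(protein, (None, 0.0))[1]:
                best[protein] = (function, value)
    return {protein: function for protein, (function, _) in best.items()}


if __name__ == "__main__":
    # Toy network with made-up proteins and function labels.
    toy_graph = {
        "A": {"B": 1.0},
        "B": {"A": 1.0, "C": 0.5},
        "C": {"B": 0.5},
    }
    toy_annotations = {"kinase": {"A"}, "transport": {"C"}}
    print(predict(toy_graph, toy_annotations, steps=2))
```

In a Hadoop setting, `map_flows` and `reduce_flows` would correspond to the mapper and reducer of one iteration, with the reservoir state written out between jobs. How the paper actually partitions the network and passes state between jobs is not described in the abstract.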
Key words
bioinformatics, parallel processing, proteins, public domain software, Hadoop, MapReduce, distance effect, functional flow algorithm, high performance computing, network connectivity, network topology, open source map environment, parallel distributed computing, parallel implementation, protein function prediction, protein-protein interaction network, sequential method, Bioinformatics and Biocomputing, Network Flow, Parallel and Distributed Computing, Protein-Protein Interactions