
CGAT-core: a python framework for building scalable, reproducible computational biology workflows

bioRxiv (2019)

Abstract
In the genomics era, computational biologists regularly need to process, analyse and integrate large and complex biomedical datasets. Analysis inevitably involves multiple dependent steps, resulting in complex pipelines or workflows, often with several branches. Large data volumes mean that processing needs to be quick and efficient, and scientific rigour requires that analysis be consistent and fully reproducible. We have developed CGAT-core, a Python package for the rapid construction of complex computational workflows. CGAT-core seamlessly handles parallelisation across high performance computing clusters, integration of Conda environments, full parameterisation, database integration and logging. To illustrate our workflow framework, we present a pipeline for the analysis of RNAseq data using pseudoalignment.

Availability: CGAT-core is freely available under an MIT licence for installation and use, including source code at

Contact: andreas.heger{at}imm.ox.ac.uk (AH), david.sims{at}imm.ox.ac.uk (DS), adam.cribbs{at}imm.ox.ac.uk (AC)

Supplementary information:

* DSL: Domain Specific Languages; API: Application Programming Interface; DRMAA: Distributed Resource Management Application API
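The idea of a workflow made of dependent, branching steps can be sketched in a few lines of standard-library Python. This is only an illustration of the general concept and does not use or reproduce the CGAT-core API; the step names and the `DEPENDENCIES` mapping are hypothetical, loosely modelled on a pseudoalignment-style RNAseq analysis. Steps that appear in the same batch have no unmet dependencies and could, in principle, be dispatched in parallel.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each step maps to the set of steps it depends on.
# These names are illustrative assumptions, not CGAT-core tasks.
DEPENDENCIES = {
    "build_index": set(),
    "quality_control": set(),
    "quantify": {"build_index"},
    "merge_counts": {"quantify"},
    "report": {"merge_counts", "quality_control"},
}


def run_order(dependencies):
    """Return one valid sequential execution order for the steps."""
    return list(TopologicalSorter(dependencies).static_order())


def parallel_batches(dependencies):
    """Group steps into batches whose members are mutually independent.

    Every step in a batch has all of its dependencies satisfied by
    earlier batches, so the members of one batch could run concurrently.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable result
        batches.append(ready)
        sorter.done(*ready)  # mark the whole batch as finished
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(parallel_batches(DEPENDENCIES), start=1):
        print(f"batch {number}: {', '.join(batch)}")
```

Here the two branches (indexing and quality control) start together and only rejoin at the final reporting step, which is the kind of structure the abstract describes. A real framework would additionally handle cluster submission, parameterisation and logging.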