A single-cell RNA-seq Training and Analysis Suite using the Galaxy Framework

GigaScience (2020)

Abstract
Background: The vast ecosystem of single-cell RNA-seq tools has until recently been plagued by an excess of diverging analysis strategies, inconsistent file formats, and compatibility issues between different software suites. The uptake of 10x Genomics datasets has begun to calm this diversity, and the bioinformatics community leans once more towards the large computing requirements and the statistically driven methods needed to process and understand these ever-growing datasets.

Results: Here we outline several Galaxy workflows and learning resources for scRNA-seq, with the aim of providing a comprehensive analysis environment paired with a thorough user learning experience that bridges the knowledge gap between the computational methods and the underlying cell biology. The Galaxy reproducible bioinformatics framework provides tools, workflows and training materials that not only enable users to perform one-click 10x preprocessing, but also empower them to demultiplex raw sequencing data from custom tagged and full-length sequencing protocols. The downstream analysis supports a wide range of high-quality interoperable suites separated into common stages of analysis: inspection, filtering, normalization, confounder removal and clustering. The teaching resources cover an assortment of concepts from computer science to cell biology. Access to all resources is provided at the [singlecell.usegalaxy.eu][1] portal.

Conclusions: The reproducible and training-oriented Galaxy framework provides a sustainable HPC environment for users to run flexible analyses on both 10x and alternative platforms. The tutorials from the Galaxy Training Network, along with the frequent training workshops hosted by the Galaxy Community, provide a means for users to learn, publish and teach scRNA-seq analysis.

### Competing Interest Statement

The authors have declared no competing interest.
### List of abbreviations

* DOI: Digital Object Identifier
* GTN: Galaxy Training Network
* HDF5: Hierarchical Data Format 5
* HPC: High Performance Computing
* PAGA: Partition-based Graph Abstraction
* PCA: Principal Component Analysis
* scRNA: Single-Cell RNA
* tSNE: t-distributed Stochastic Neighbor Embedding
* UMAP: Uniform Manifold Approximation and Projection
* UMI: Unique Molecular Identifier

[1]: http://singlecell.usegalaxy.eu