Genome variant calling workflow implementation and deployment in HPC infrastructure.

BIBM(2021)

Abstract
Short variant discovery is one of the most important steps in genomic studies, since it enables the identification of genetic variants that influence the emergence and evolution of some diseases. Specifically, cancer can be associated with germline variants present in small populations, as well as with somatic variants located in tumor cells. It is therefore necessary to implement workflows that analyze the data produced by next-generation sequencing while taking advantage of the resources available in HPC infrastructures. This work presents PIPEMB-WDL, a workflow for HPC infrastructure that integrates short variant discovery for germline and somatic calling, including the pre-processing and variant refinement steps, following the GATK4 best practices. The workflow was developed using emerging technologies that are still under active development, such as WDL and the Cromwell engine. The challenges we address in this paper are integrating and deploying container technologies, workload manager technologies, Cromwell, and WDL in our HPC infrastructure.
Keywords
Workflow, short variant discovery, HPC, Cromwell, WDL, Slurm, Singularity, GATK4