ddRAD-seq variant calling in peach and the effect of removing PCR duplicates

Acta Horticulturae (2022)

Abstract
Double digest RAD-seq (ddRAD-seq) is a flexible and cost-effective strategy that has become one of the most popular genotyping approaches in plants. Library preparation combines two restriction enzymes and is followed by PCR amplification of the template molecules. However, PCR introduces duplicate sequences, which may erroneously inflate the confidence of genotype calls at a given site. Although variant calling is relatively straightforward, it is time-consuming and involves multiple steps, so removing any unnecessary step would reduce computation time and simplify the analysis. The primary aim of this study is therefore to evaluate whether removing PCR duplicates is necessary and how it affects SNP and indel calling in peach. In addition, accurate identification of genetic variants in plants is crucial for understanding phenotypic traits and monitoring breeding programs, yet false positive calls are a common issue that can hamper the detection of relevant variants. A good combination of computational tools for alignment and variant calling is therefore needed to tackle these artifacts. To address this, three variant callers (BCFtools-mpileup, Freebayes and GATK-HaplotypeCaller) were run on alignments produced by the BWA-mem read mapper. Variants found in the intersection of these callers are selected as a high-confidence set and flagged for subsequent analysis. The pipeline is documented and available as a set of Makefiles that can be adapted to any species. This work provides useful guidelines and a reproducible workflow for variant detection using ddRAD-seq data.
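The intersection step described above can be illustrated with a minimal Python sketch. It is not the authors' Makefile pipeline: the function names (`parse_vcf_lines`, `high_confidence`) and the toy records are hypothetical, and it assumes that each caller's output is available as plain VCF text and that a variant is identified by its chromosome, position, reference allele and alternate allele. A real workflow would probably also need to normalize indel representation before comparing callers, since the same indel can be written in more than one way.

```python
from typing import Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Key used to decide whether two callers report the same variant."""
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_vcf_lines(lines: Iterable[str]) -> Set[Variant]:
    """Collect (chrom, pos, ref, alt) keys from tab-separated VCF records.

    Header lines (starting with '#') and blank lines are skipped.
    Multi-allelic records are split into one key per ALT allele.
    """
    variants: Set[Variant] = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alts = fields[:5]
        for alt in alts.split(","):
            variants.add(Variant(chrom, int(pos), ref, alt))
    return variants


def high_confidence(*call_sets: Set[Variant]) -> Set[Variant]:
    """Return the variants reported by every caller (set intersection).

    At least one call set must be supplied.
    """
    return set.intersection(*call_sets)


if __name__ == "__main__":
    # Toy records (hypothetical, not data from the study).
    caller_a = parse_vcf_lines(["Pp01\t100\t.\tA\tG", "Pp01\t250\t.\tC\tT"])
    caller_b = parse_vcf_lines(["Pp01\t100\t.\tA\tG", "Pp02\t75\t.\tT\tC"])
    caller_c = parse_vcf_lines(["Pp01\t100\t.\tA\tG,T"])
    for variant in sorted(high_confidence(caller_a, caller_b, caller_c)):
        print(variant)
```

Keying on the allele as well as the position is a deliberate choice in this sketch: two callers that agree on a site but disagree on the alternate allele are not counted as concordant.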
Keywords
Prunus persica, DNA-variants, SAMtools, Stacks