A chromosome-level, haplotype-phased genome assembly for Vanilla planifolia highlights that partial endoreplication challenges accurate whole genome assembly

Q. Piet, G. Droc, W. Marande, G. Sarah, S. Bocs, C. Klopp, M. Bourge, S. Siljak-Yakovlev, O. Bouchez, C. Lopez-Roques, S. Lepers-Andrzejewski, L. Bourgois, J. Zucca, M. Dron, P. Besse, M. Grisoni, C. Jourda, C. Charron

Plant Communications (2022)

Abstract
Vanilla planifolia, the species cultivated to produce one of the world’s most popular flavors, is highly prone to partial genome endoreplication (PE), which leads to highly unbalanced DNA content in cells. We report here the first molecular evidence of PE at chromosome scale through the assembly and annotation of an accurate haplotype-phased genome of V. planifolia. Cytogenetic data demonstrated that the diploid genome size is 4.09 Gb, with 16 chromosome pairs, although aneuploid cells are frequently observed. Using PacBio HiFi and optical mapping, we assembled and phased a diploid genome of 3.4 Gb with a scaffold N50 of 1.2 Mb and 59,128 predicted protein-coding genes. The atypical k-mer frequencies and the uneven sequencing depth observed agreed with our expectation of unbalanced genome representation. Sixty-seven percent of the genes were scattered over only 30% of the genome, putatively linking gene-rich regions and the endoreplication phenomenon. In contrast, low-coverage (non-endoreplicated) regions were rich in repeated elements but also contained 33% of the annotated genes. Furthermore, this assembly showed distinct haplotype-specific patterns of sequencing depth variation, suggesting a complex molecular regulation of endoreplication along the chromosomes. This high-quality anchored assembly represented 83% of the estimated V. planifolia genome. It provides a significant step towards the elucidation of this complex genome. To support post-genomics efforts, we developed the Vanilla Genome Hub, a user-friendly integrated web portal that allows centralized access to high-throughput genomic and other omics data, and interoperable use of bioinformatics tools.
The genome of the orchid Vanilla planifolia (4.09 Gb, 16 pairs of chromosomes) is highly prone to partial endoreplication (PE), which leads to very unbalanced DNA content in the cells. We report here the first molecular evidence of PE at chromosome scale through the assembly and annotation of an accurate haplotype-phased genome of V. planifolia. Distinct haplotype-specific patterns of sequencing depth variation suggest complex molecular regulation of endoreplication along the chromosomes. To facilitate post-genomics efforts, an integrated, public, and user-friendly web portal (the Vanilla Genome Hub) has been developed.
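The two summary statistics quoted in the abstract (scaffold N50 and the share of the estimated genome that was assembled) can be illustrated with a minimal Python sketch. The function names and the toy scaffold lengths are hypothetical and are not taken from the paper; only the 3.4 Gb and 4.09 Gb figures come from the abstract, and the authors' actual tooling is not described here.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def assembled_fraction(assembly_bp, estimated_bp):
    """Fraction of the estimated genome size covered by the assembly."""
    return assembly_bp / estimated_bp


# Toy scaffold lengths (hypothetical, for illustration only).
toy_scaffolds = [5, 4, 3, 2, 1]
print(n50(toy_scaffolds))

# Figures from the abstract: 3.4 Gb assembled, 4.09 Gb estimated by cytogenetics.
print(round(assembled_fraction(3.4e9, 4.09e9), 2))
```

With the abstract's figures, the ratio of 3.4 Gb to 4.09 Gb rounds to 0.83, which matches the reported 83% of the estimated genome.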
Key words
Vanilla planifolia genome, partial endoreplication, whole-genome assembly, chromosome-level, haplotype-phased