The Ribosomal Operon Database (ROD): A full-length rDNA operon database extracted from genome assemblies

crossref(2024)

Abstract
Current rDNA reference sequence databases are tailored towards shorter DNA markers, such as parts of the 16S/18S marker or the ITS region. However, due to advances in long-read DNA sequencing technologies, longer stretches of the rDNA operon are increasingly used in environmental sequencing studies to increase phylogenetic resolution. There is, therefore, a growing need for longer rDNA reference sequences. Here, we present the Ribosomal Operon Database (ROD), which includes eukaryotic full-length rDNA operons extracted from publicly available genome assemblies. Full-length operons were detected in 34.1% of the 34,701 examined eukaryotic genome assemblies from NCBI. In most cases (53.1%), more than one operon variant was detected, which can be due to intragenomic operon copy variability, allelic variation in non-haploid genomes, or technical errors from the sequencing and assembly process. The highest copy number found was 5,947 in Zea mays. In total, 453,697 unique operons were detected, with 69,480 operon clusters remaining after intragenomic clustering at 99% sequence identity. The operon length varied extensively across eukaryotes, ranging from 4,136 to 16,463 bp, which will lead to considerable bias during PCR amplification of the entire operon. Clustering the full-length operons revealed that the different parts (i.e., 18S, 28S, the hypervariable region V4 of 18S, and ITS) provide divergent taxonomic resolution, with 18S and the V4 region being the most conserved. ROD will be updated regularly to provide an increasing number of full-length rDNA operons to the scientific community.
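To make the idea of intragenomic clustering at 99% sequence identity concrete, the following is a minimal sketch of greedy centroid clustering. It is an illustration only: the abstract does not state which tool or identity definition was used to build ROD, so the `identity` function (a `difflib`-based proxy rather than a true pairwise alignment), the function names, and the toy sequences are all assumptions.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Illustrative identity proxy: matched characters over the longer length.

    Assumption: a real pipeline would use alignment-based identity instead.
    """
    if not a or not b:
        return 0.0
    blocks = SequenceMatcher(None, a, b, autojunk=False).get_matching_blocks()
    matched = sum(block.size for block in blocks)
    return matched / max(len(a), len(b))


def greedy_cluster(operons: dict[str, str], threshold: float = 0.99) -> list[list[str]]:
    """Group operon copies from one genome; the first name in each list is the centroid."""
    clusters: list[list[str]] = []
    # Process the longest sequences first so they become centroids.
    for name in sorted(operons, key=lambda n: len(operons[n]), reverse=True):
        for cluster in clusters:
            if identity(operons[cluster[0]], operons[name]) >= threshold:
                cluster.append(name)
                break
        else:
            # No centroid is similar enough, so this copy starts a new cluster.
            clusters.append([name])
    return clusters


# Hypothetical toy data: two near-identical copies and one unrelated sequence.
reference = "ACGT" * 50
copies = {
    "copy_1": reference,
    "copy_2": reference[:100] + "C" + reference[101:],  # one substitution in 200 bp
    "copy_3": "TTGACA" * 30,
}
clusters = greedy_cluster(copies)
```

Under these assumptions, copies that differ by a single substitution over 200 bp fall into the same cluster, while the unrelated sequence forms its own; this is how many detected copies can collapse into far fewer operon clusters per genome.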