Core gene set of the species Saccharomyces cerevisiae

bioRxiv (2023)

Abstract
Examination of the genome sequence of Saccharomyces cerevisiae strain S288c and 93 additional diverse strains allows identification of the 5873 genes that make up the core gene set of this species and gives a better sense of the organization and plasticity of its genome. Each S. cerevisiae strain contains dozens to hundreds of strain-specific genes. In addition to a variable content of retrotransposons, some strains contain a novel transposable element, Ty7. Examination further shows that some annotated putative protein-coding genes are likely artifacts. We propose altering approximately 5% of the current annotations in the widely used reference strain S288c. Potential null alleles are common and are found in all 94 strains examined; they typically contain a single stop codon or frameshift. There are also gene remnants, pseudogenes, and variable arrays of genes. Among the core genes there are now only 373 protein-coding genes of unknown function, classified as uncharacterized in the Saccharomyces Genome Database. This work suggests that carefully edited and annotated genome sequences have a role in understanding the genome organization and content of a species. We propose that gene remnants be added to the repertoire of features found in the genome of S. cerevisiae, and likely in those of other fungal species.

Competing Interest Statement: The authors have declared no competing interest.