High-Quality Genome Sequence Resource of Thielaviopsis paradoxa Strain X-3314, Causing Coconut Stem Bleeding

Plant Disease (2022)

Abstract
Plant Disease, Vol. 106, No. 9
RESOURCE ANNOUNCEMENT (Open Access)

Xiaoqing Niu(1), Fengyu Yu(1), Hui Zhu(1), Xiuli Meng(1), Dejie Yang(1), Weiquan Qin(1), and Guodong Lu(2)†

(1) Coconut Research Institute of Chinese Academy of Tropical Agricultural Sciences, Wenchang 571339, China
(2) Key Laboratory of Biopesticide and Chemical Biology, Ministry of Education, Fujian Agriculture and Forestry University, Fuzhou 350002, China
† Corresponding author: G. Lu; E-mail: lgd@fafu.edu.cn
ORCID: Xiaoqing Niu, https://orcid.org/0000-0001-5348-8060; Hui Zhu, https://orcid.org/0000-0001-8609-4754

Published Online: 27 Jul 2022
https://doi.org/10.1094/PDIS-01-22-0231-A

Genome Announcement

Coconut palm (Cocos nucifera L.) is one of the most economically and ecologically important perennial oil crops in the humid tropics. Coconut fruit is either eaten raw or processed into manufactured products and byproducts, which generate employment and income in producing regions (Carvalho et al. 2013; Niu et al. 2019). Coconut stem-bleeding (CSB) is one of the most important diseases of coconut (Dulce and Edson 2009). In China, it was first reported in 2012 (Yu et al. 2012), and it has become a major constraint on coconut production in nearly all coconut plantations.

The causal agent of CSB is the filamentous ascomycete Thielaviopsis paradoxa (syn. Ceratocystis paradoxa), which produces two types of asexual spores, endoconidia and chlamydospores; the latter can survive for long periods in the soil (Niu et al. 2019; Yu et al. 2012). Previous studies have reported the biological characteristics and other hosts of T. paradoxa (Polizzi et al. 2006; Yu et al. 2011). However, to date, an annotated draft genome of T. paradoxa has not been made publicly available.

Here, we report a draft genome assembly of T. paradoxa strain X-3314, collected from a CSB-infected coconut stem in Wenchang, Hainan Province, China (Yu et al. 2012). High-quality genomic DNA and RNA were extracted from fresh mycelium cultivated in potato dextrose broth and sent to Biomarker Technologies Co., Ltd. (Beijing, China) for whole-genome sequencing and RNA sequencing (RNA-seq), respectively. We obtained 7.06 Gb of PacBio long reads (approximately 237× genome coverage; N50 = 16.71 kb, maximum read length = 96.99 kb) generated on the PacBio sequencing platform in continuous long read mode (Fig. 1A). We also obtained 2.67 Gb of short reads (approximately 90× genome coverage) and 5.05 Gb of RNA-seq reads generated on the Illumina HiSeq 3000 sequencing platform.

The genome size of strain X-3314 was estimated by GenomeScope v2.0 (Ranallo-Benavidez et al. 2020) from the k-mer distribution of the Illumina short reads (k = 21, ploidy P = 1). The average k-mer depth was 64 (Fig. 1B, main peak), and the estimated genome size was 33,693,763 bp, of which 83.20% was unique sequence (Fig. 1B).

Fig. 1. Genome characteristics of Thielaviopsis paradoxa strain X-3314. A, Length distribution of PacBio long reads. B, The k-mer-based genome size estimation conducted by GenomeScope v2.0 with Illumina short reads (k = 21, ploidy = 1). C, Completeness of genome assembly and protein-coding genes assessed by BUSCO v5.2.2 at the fungi (n = 758) and ascomycota (n = 1,706) levels. D, Protein-coding genes predicted by EVM, which combined de novo, homolog, and RNA-sequencing (RNA-seq) results.

A de novo genome assembly was generated by wtdbg2 v2.5 (Ruan and Li 2020) from PacBio long reads that had been corrected with Canu v1.9 (Koren et al. 2017). The draft genome assembly was then base corrected by Pilon v1.23 (Walker et al. 2014) using the Illumina short reads.
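
As a rough check on the sequencing depths quoted above, fold coverage can be approximated as the total number of sequenced bases divided by the genome size. The sketch below is illustrative only: the function name is hypothetical, and using the 29.80-Mb assembly size as the denominator is an assumption, because the text does not state which genome size the coverage values were based on.

```python
def fold_coverage(total_bases: float, genome_size_bp: float) -> float:
    """Approximate fold coverage as sequenced bases per genome base."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp


# Assumed denominator: the 29.80-Mb assembly size (not stated in the text).
ASSEMBLY_SIZE_BP = 29.80e6

pacbio_cov = fold_coverage(7.06e9, ASSEMBLY_SIZE_BP)    # 7.06 Gb of PacBio reads
illumina_cov = fold_coverage(2.67e9, ASSEMBLY_SIZE_BP)  # 2.67 Gb of Illumina reads

print(f"PacBio: ~{pacbio_cov:.0f}x, Illumina: ~{illumina_cov:.0f}x")
```

Under this assumption, the two values round to 237× and 90×, in line with the reported coverage.
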
Finally, we obtained a 29.80-Mb polished genome assembly for strain X-3314, which is approximately 88% of the estimated genome size (33.69 Mb), with a GC content of 48.09%. The final genome assembly consisted of 12 contigs with an N50 of 4.46 Mb (L50 = 4) and a maximum contig length of 5.03 Mb (Table 1).

Table 1. Genome summary of Thielaviopsis paradoxa strain X-3314

Features(a)                            X-3314
Estimated genome size (Mb)             33.69
Assembly size (Mb)                     29.80
Contig number                          12
Contig N50 (Mb)                        4.46
Contig L50                             4
Maximum contig length (Mb)             5.03
GC content (%)                         51.00
Repeat content (%)                     9.03
Protein-coding genes                   6,931
Genes annotated by nr                  6,541
Genes annotated by Swiss-Prot          4,751
Genes annotated by Pfam                5,336
Genes annotated by GO                  2,374
Genes annotated by KEGG                2,822
Genes annotated by KOG                 4,061
Membrane transport proteins(b)         111
Carbohydrate-active enzymes(b)         222
Pathogen–host interaction genes(b)     2,004
Putative effectors(c)                  357

(a) Database abbreviations: nr = nonredundant, GO = gene ontology, KEGG = Kyoto Encyclopedia of Genes and Genomes, and KOG = eukaryotic orthologous groups.
(b) Membrane transport proteins were annotated by the Transporter Classification Database (https://tcdb.org/), carbohydrate-active enzymes were annotated by dbCAN2 meta server (https://bcb.unl.edu/dbCAN2/), and pathogen–host interaction genes were annotated by PHI-base (http://www.phi-base.org/).
(c) Putative effectors in this study were defined as having a signal peptide identified by SignalP v5.0 (Almagro Armenteros et al. 2019) but without transmembrane helices predicted by TMHMM v2.0 (Krogh et al. 2001).

Genome completeness analyzed by BUSCO v5.2.2 (Manni et al. 2021) showed that the genome assembly of strain X-3314 contained 752 complete orthologs (99.21%, including 749 single-copy and 3 duplicated) at the fungi level (n = 758), and 1,663 complete (97.48%, including 1,658 single-copy and 5 duplicated) and 6 fragmented orthologs at the ascomycota level (n = 1,706) (Fig. 1C).
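
The contig N50 and L50 values reported for the assembly summarize its contiguity: N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly, and L50 is the number of contigs in that set. A minimal sketch of the calculation follows; the function name and the toy contig lengths are hypothetical and are not the actual X-3314 contig lengths.

```python
from typing import List, Tuple


def n50_l50(contig_lengths: List[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a list of contig lengths."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(contig_lengths) / 2
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for rank, length in enumerate(sorted(contig_lengths, reverse=True), start=1):
        running += length
        if running >= half_total:
            return length, rank
    raise RuntimeError("unreachable for non-empty input")


# Toy lengths for illustration only (hypothetical, arbitrary units).
toy_contigs = [5, 4, 3, 2, 1]
toy_n50, toy_l50 = n50_l50(toy_contigs)
print(f"N50 = {toy_n50}, L50 = {toy_l50}")
```

For the toy input, the total is 15, so the two longest contigs (5 + 4 = 9) are the first to reach half of the total, giving N50 = 4 and L50 = 2.
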
Illumina short reads were mapped to the genome assembly using BWA v0.7.17 (Li and Durbin 2009). The mapping summary showed that 15,279,159 of 17,813,016 (85.78%) Illumina short reads were mapped, of which 14,972,310 (84.05% of all reads) were properly paired.

A de novo repeat library of strain X-3314 was generated by RepeatModeler v2.0.2 (http://www.repeatmasker.org/RepeatModeler/) and then used for repeat masking with RepeatMasker v4.1.2 (http://www.repeatmasker.org/). In total, 2,692,269 bp (9.03%) of repeat sequences were identified in the genome assembly of strain X-3314 (Table 1). Interestingly, the repeats consisted mainly of simple repeats (1,675,458 bp, 62.23%), long interspersed nuclear elements (401,481 bp, 14.91%), and low-complexity repeats (322,508 bp, 11.98%).

In total, 6,931 protein-coding genes (PCGs) were identified from the repeat-masked genome assembly using EVM v2.1.6 (Haas et al. 2008), which integrated three kinds of evidence for gene prediction: ab initio results from Augustus v3.4.0 (Keller et al. 2011), RNA-seq-based results from PASA v2.3.3 (Haas et al. 2008), and homology-based results from GeMoMa v2.3 (Keilwagen et al. 2019) (Table 1; Fig. 1D). In total, 5,802 PCGs (83.71%) were supported by all three methods (Fig. 1D), indicating that the predicted PCGs were reliable. The quality of the PCGs was also assessed by BUSCO v5.2.2 (Manni et al. 2021). The PCG set of strain X-3314 contained 750 complete (98.94%, including 748 single-copy and 2 duplicated) and 3 fragmented orthologs at the fungi level (n = 758), and 1,654 complete (96.95%, including 1,651 single-copy and 3 duplicated) and 11 fragmented orthologs at the ascomycota level (n = 1,706) (Fig. 1C).

The functions of the PCGs were annotated by aligning the predicted proteins against a set of widely used bioinformatic databases.
Of 6,931 PCGs, 6,541 (94.37%) and 4,751 (68.55%) were assigned annotations from the NCBI nr/nt database (https://ftp.ncbi.nlm.nih.gov/blast/db/) and Swiss-Prot (https://www.uniprot.org/), respectively, by DIAMOND v2.0.11 (Buchfink et al. 2021) (Table 1). EggNOG-mapper v2 (Cantalapiedra et al. 2021) was employed for Kyoto Encyclopedia of Genes and Genomes (2,822 PCGs, 40.72%) and eukaryotic orthologous groups (4,061 PCGs, 58.59%) annotation (Table 1). InterProScan v5.52-86.0 (Jones et al. 2014) was used for Pfam (5,336 PCGs, 76.99%) and gene ontology (2,374 PCGs, 34.25%) annotation (Table 1).

PCGs were also annotated with pathogenicity-related databases, and 2,004 pathogen–host interaction-related PCGs (PHI-base v4.12; http://www.phi-base.org/), 222 carbohydrate-active enzymes (dbCAN2) (Lombard et al. 2014), 111 membrane transport proteins (Transporter Classification Database; https://tcdb.org/) (Saier et al. 2021), and 357 putative effectors (Bao et al. 2017) were identified (Table 1).

The nearly complete genome assembly and annotation will help to improve understanding of the pathogenicity of T. paradoxa and provide a foundation for comparative genomics with other Thielaviopsis spp.

Data Availability

The genome assembly and predicted gene annotations of strain X-3314 have been deposited in the Genome Warehouse (http://ngdc.cncb.ac.cn/gwh, accession number GWHBHEV00000000, BioProject: PRJCA008091) at the National Genomics Data Center, China National Center for Bioinformation (CNCB-NGDC Members and Partners 2021).

The author(s) declare no conflict of interest.

Literature Cited

Almagro Armenteros, J. J., Tsirigos, K. D., Sønderby, C. K., Petersen, T. N., Winther, O., Brunak, S., von Heijne, G., and Nielsen, H. 2019. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat. Biotechnol. 37:420-423. https://doi.org/10.1038/s41587-019-0036-z
Bao, J., Chen, M., Zhong, Z., Tang, W., Lin, L., Zhang, X., Jiang, H., Zhang, D., Miao, C., Tang, H., Zhang, J., Lu, G., Ming, R., Norvienyeku, J., Wang, B., and Wang, Z. 2017. PacBio sequencing reveals transposable elements as a key contributor to genomic plasticity and virulence variation in Magnaporthe oryzae. Mol. Plant 10:1465-1468. https://doi.org/10.1016/j.molp.2017.08.008
Buchfink, B., Reuter, K., and Drost, H. G. 2021. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat. Methods 18:366-368. https://doi.org/10.1038/s41592-021-01101-x
Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P., and Huerta-Cepas, J. 2021. eggNOG-mapper v2: Functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38:5825-5829. https://doi.org/10.1093/molbev/msab293
Carvalho, R. R., Souza, P. E., Warwick, D. R., Pozza, E. A., and Filho, J. L. 2013. Spatial and temporal analysis of stem bleeding disease in coconut palm in the state of Sergipe, Brazil. An. Acad. Bras. Cienc. 85:1567-1576. https://doi.org/10.1590/0001-37652013112412
CNCB-NGDC Members and Partners. 2021. Database resources of the National Genomics Data Center, China National Center for Bioinformation in 2021. Nucleic Acids Res. 49:D18-D28. https://doi.org/10.1093/nar/gkaa1022
Dulce, R. N., and Edson, E. M. 2009. Outbreak of stem bleeding in coconuts caused by Thielaviopsis paradoxa in Sergipe, Brazil. Trop. Plant Pathol. 34:175-177.
Haas, B. J., Salzberg, S. L., Zhu, W., Pertea, M., Allen, J. E., Orvis, J., White, O., Buell, C. R., and Wortman, J. R. 2008. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9:R7. https://doi.org/10.1186/gb-2008-9-1-r7
Jones, P., Binns, D., Chang, H. Y., Fraser, M., Li, W., McAnulla, C., McWilliam, H., Maslen, J., Mitchell, A., Nuka, G., Pesseat, S., Quinn, A. F., Sangrador-Vegas, A., Scheremetjew, M., Yong, S. Y., Lopez, R., and Hunter, S. 2014. InterProScan 5: Genome-scale protein function classification. Bioinformatics 30:1236-1240. https://doi.org/10.1093/bioinformatics/btu031
Keilwagen, J., Hartung, F., and Grau, J. 2019. GeMoMa: Homology-based gene prediction utilizing intron position conservation and RNA-seq data. Methods Mol. Biol. 1962:161-177. https://doi.org/10.1007/978-1-4939-9173-0_9
Keller, O., Kollmar, M., Stanke, M., and Waack, S. 2011. A novel hybrid gene prediction method employing protein multiple sequence alignments. Bioinformatics 27:757-763. https://doi.org/10.1093/bioinformatics/btr010
Koren, S., Walenz, B. P., Berlin, K., Miller, J. R., Bergman, N. H., and Phillippy, A. M. 2017. Canu: Scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27:722-736. https://doi.org/10.1101/gr.215087.116
Krogh, A., Larsson, B., von Heijne, G., and Sonnhammer, E. L. 2001. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J. Mol. Biol. 305:567-580. https://doi.org/10.1006/jmbi.2000.4315
Li, H., and Durbin, R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754-1760. https://doi.org/10.1093/bioinformatics/btp324
Lombard, V., Golaconda Ramulu, H., Drula, E., Coutinho, P. M., and Henrissat, B. 2014. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. 42:D490-D495. https://doi.org/10.1093/nar/gkt1178
Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A., and Zdobnov, E. M. 2021. BUSCO update: Novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 38:4647-4654. https://doi.org/10.1093/molbev/msab199
Niu, X., Pei, M., Liang, C., Lv, Y., Wu, X., Zhang, R., Lu, G., Yu, F., Zhu, H., and Qin, W. 2019. Genetic transformation and green fluorescent protein labeling in Ceratocystis paradoxa from coconut. Int. J. Mol. Sci. 20:2387.
Polizzi, G., Castello, I., Vitale, A., Catara, V., and Tamburino, V. 2006. First report of Thielaviopsis trunk rot of date palm in Italy. Plant Dis. 90:972. https://doi.org/10.1094/PD-90-0972C
Ranallo-Benavidez, T. R., Jaron, K. S., and Schatz, M. C. 2020. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat. Commun. 11:1432. https://doi.org/10.1038/s41467-020-14998-3
Ruan, J., and Li, H. 2020. Fast and accurate long-read assembly with wtdbg2. Nat. Methods 17:155-158. https://doi.org/10.1038/s41592-019-0669-3
Saier, M. H., Reddy, V. S., Moreno-Hagelsieb, G., Hendargo, K. J., Zhang, Y., Iddamsetty, V., Lam, K. J. K., Tian, N., Russum, S., Wang, J., and Medrano-Soto, A. 2021. The Transporter Classification Database (TCDB): 2021 update. Nucleic Acids Res. 49:D461-D467. https://doi.org/10.1093/nar/gkaa1004
Walker, B. J., Abeel, T., Shea, T., Priest, M., Abouelliel, A., Sakthikumar, S., Cuomo, C. A., Zeng, Q., Wortman, J., Young, S. K., and Earl, A. M. 2014. Pilon: An integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963. https://doi.org/10.1371/journal.pone.0112963
Yu, F. Y., Lin, C. H., Zhu, H., Wang, P., Chen, S. T., Niu, X. Q., and Tang, Q. H. 2011. Biological characteristics of the pathogenic fungus causing stem bleeding disease of coconut. Chin. J. Trop. Crops 32:1122-1127.
Yu, F. Y., Niu, X. Q., Tang, Q. H., Zhu, H., Song, W. W., Qin, W. Q., and Lin, C. H. 2012. First report of stem bleeding in coconut caused by Ceratocystis paradoxa in Hainan, China. Plant Dis. 96:290. https://doi.org/10.1094/PDIS-10-11-0840

Funding: This work was supported by Innovation Platform for Academicians of Hainan Province, grants from Hainan Natural Science Foundation (2019RC339) and Major Research Project for Science and Technology (ZDKJ201817), and the Key Project of Hainan Province (ZDYF2019072).

Plant Disease Vol. 106, No. 9, September 2022. ISSN: 0191-2917; e-ISSN: 1943-7692
Issue Date: 30 Aug 2022. Published: 27 Jul 2022. Accepted: 21 Apr 2022. Pages: 2514-2517
© 2022 The American Phytopathological Society
Keywords
coconut stem-bleeding, genome assembly, Thielaviopsis paradoxa, X-3314