      Is Open Access

      Establishment of a stable, effective and universal genetic transformation technique in the diverse species of Brassica oleracea

      research-article


          Abstract

          Brassica oleracea is an economically important species that includes seven cultivated variants. Agrobacterium-mediated transformation of B. oleracea crops, mainly via hypocotyl and cotyledon explants, has been achieved in the past. However, previously established transformation methods showed low efficiency, severe genotype limitation and a prolonged period for obtaining transformants, greatly restricting their application in functional genomic studies and crop improvement. In this study, we compared the shoot regeneration and genetic transformation efficiency of hypocotyl, cotyledon petiole and curd peduncle explants from twelve genotypes of cauliflower and broccoli. As a result, an Agrobacterium-mediated transformation method using curd peduncle as the explant was established, which is rapid, efficient, and amenable to high-throughput transformation and genome editing. The average genetic transformation efficiency of this method is stable, reaching up to 11.87%, and the method was successfully implemented in twelve different genotypes of cauliflower and broccoli and in other B. oleracea crops with low genotype dependence. Peduncle explants were found to contain abundant cambial cells with strong cell division and shoot regeneration ability, which might be why this method achieved stable and high genetic transformation efficiency with almost no genotype dependence.
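          The abstract reports efficiency as a percentage without restating how it is computed. The following is a minimal sketch, assuming the common convention that transformation efficiency is the number of confirmed transgenic plants divided by the number of inoculated explants; the function name and the counts are hypothetical and are not taken from the article.

```python
def transformation_efficiency(confirmed_transgenic: int, explants_inoculated: int) -> float:
    """Return transformation efficiency as a percentage.

    Assumed convention (not stated in the abstract):
    efficiency = confirmed transgenic plants / inoculated explants * 100.
    """
    if explants_inoculated <= 0:
        raise ValueError("explants_inoculated must be positive")
    if confirmed_transgenic < 0:
        raise ValueError("confirmed_transgenic must be non-negative")
    return 100.0 * confirmed_transgenic / explants_inoculated


# Hypothetical counts for illustration only.
example = transformation_efficiency(confirmed_transgenic=12, explants_inoculated=100)
print(f"{example:.2f}%")
```

          Other definitions exist (for example, counting independent transgenic events rather than plants), so the denominator and numerator should be checked against the article's Methods before comparing values.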


          Most cited references (47)

          Is Open Access

          The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes

Brassica oleracea comprises many important vegetable crops including cauliflower, broccoli, cabbages, Brussels sprouts, kohlrabi and kales. The species demonstrates extreme morphological diversity and crop forms, with various members grown for their leaves, flowers and stems. About 76 million tons of Brassica vegetables were produced in 2010, with a value of 14.85 billion dollars (http://faostat.fao.org/). Most B. oleracea crops are high in protein1 and carotenoids2, and contain diverse glucosinolates (GSLs) that function as unique phytochemicals for plant defence against fungal and bacterial pathogens3 and on consumption have been shown to have potent anticancer properties4 5 6. B. oleracea is a member of the family Brassicaceae (~338 genera and 3,709 species)7 and one of three diploid Brassica species in the classical triangle of U8 that also includes diploids B. rapa (AA) and B. nigra (BB) and allotetraploids B. juncea (AABB), B. napus (AACC) and B. carinata (BBCC). These allotetraploid species are important oilseed crops, accounting for 12% of world edible oil production (http://faostat.fao.org/). As the origin and relationship between these species is clear, the timing and nature of the evolutionary events associated with Brassica divergence and speciation can be revealed by interspecific genome comparison. Each of the Brassica genomes retains evidence of recursive whole-genome duplication (WGD) events9 10 (Supplementary Fig. 1) and has undergone a Brassiceae-lineage-specific whole-genome triplication (WGT)11 12 since their divergence from the Arabidopsis lineage. These events were followed by diploidization that involved substantial genome reshuffling and gene losses11 12 13 14 15. Because of this, Brassica species are a model for the study of polyploid genome evolution (Supplementary Fig. 2), mechanisms of duplicated gene loss, neo- and sub-functionalization, and associated impact on morphological diversity and species differentiation. 
We report a draft genome sequence of B. oleracea and its comprehensive genomic comparison with the genome of sister species B. rapa, which diverged from a common ancestor ~4 MYA. These data provide insights into the dynamics of Brassica genome evolution and divergence, and serve as important resources for Brassica vegetable and oilseed crop breeding. Furthermore, this genome will support studies of the large range of morphological variation found within B. oleracea, which includes sexually compatible crops such as cabbages, cauliflower and broccoli that are important for their economic, nutritional and potent anticancer value.

Results

B. oleracea genome assembly and annotation

Complementing the sequencing of the smaller B. rapa genome11, a draft genome assembly of B. oleracea var. capitata line 02–12 was produced by interleaving Illumina, Roche 454 and Sanger sequence data. This assembly represents 85% of the estimated 630 Mb genome, and includes >98% of the gene space (Supplementary Methods, Supplementary Tables 1–3, 7 and 8 and Supplementary Fig. 3). The assembly was anchored to a new genetic map16 to produce nine pseudo-chromosomes that account for 72% of the assembly, and validated by comparison with a B. oleracea physical map17, a high-density B. napus genetic map18 and complete BAC sequences (Supplementary Figs 4–9 and Supplementary Tables 4 and 5). For comparative analyses, identical genome annotation pipelines were used for annotation of protein-coding genes and transposable elements (TEs) for B. oleracea and B. rapa. A total of 45,758 protein-coding genes were predicted, with a mean transcript length of 1,761 bp, a mean coding length of 1,037 bp, and a mean of 4.55 exons per gene (Table 1, Supplementary Methods, Supplementary Table 6 and Supplementary Fig. 10), similar to A. thaliana 19 and B. rapa 11. 
Publicly available ESTs, together with RNA sequencing (RNA-seq) data generated in this study, support 94% of predicted gene models (Supplementary Tables 7 and 8), and 91.6% of predicted genes have a match in at least one public protein database (Supplementary Tables 9 and 10, and Supplementary Fig. 11). Of the 45,758 predicted genes, 13,032 produce alternative splicing (AS) variants with intron retention and exon skipping (Supplementary Table 11). Genome annotation also predicted 3,756 non-coding RNAs (miRNA, tRNA, rRNA and snRNA) (Supplementary Table 12). A combination of structure-based analyses and homology-based comparisons resulted in the identification of 13,382 TEs with clearly identified terminal boundaries, including 5,107 retrotransposons and 8,275 DNA transposons (Supplementary Methods, Supplementary Fig. 12 and Supplementary Table 13). These elements together with numerous truncated elements or TE remnants make up 38.80% of the assembled portion of the B. oleracea genome, whereas TEs account for only 21.47% of the B. rapa genome assembly. Copia (11.64%) and gypsy (7.84%) retroelements are the major constituents of the repetitive fraction, and are unevenly distributed across each chromosome, with retrotransposons predominantly found in pericentromeric or heterochromatic regions (Supplementary Fig. 13) in B. oleracea. Tentative physical positions of some of the centromeres were determined based on homologue and phylogenetic analysis of the centromere-specific 76 bp tandem repeats CentBo-1 and CentBo-2 and copia-type retrotransposon (CentCRBo) (Supplementary Table 14 and Supplementary Figs 14–17). The distribution of 45S and 5S rDNA sequences was also visualized by fluorescent in situ hybridization (Supplementary Figs 18 and 19), leading to a predicted karyotype ideogram for B. oleracea (Supplementary Fig. 20). 
An extra-centromeric locus with colocalized centromeric satellite repeat CentBo-1 and the centromeric retrotransposon CRBo-1 was observed on the long arm of chromosome 6 (Supplementary Figs 18–20). A comprehensive database for the genome information is accessible at http://www.ocri-genomics.org/bolbase/index.html.

Conserved syntenic blocks and genome rearrangement after WGT

The relatively complete triplicated regions in B. oleracea and B. rapa were constructed and they relate to the 24 ancestral crucifer blocks (A–X) in A. thaliana 20. Further, the triplicated blocks resulting from WGT in the two Brassica species were partitioned into three subgenomes: LF (Least-fractionated), MF1 (Medium-fractionated) and MF2 (Most-fractionated)11 (Fig. 1a, Supplementary Methods, Supplementary Tables 15 and 16, and Supplementary Figs 21–26). These syntenic blocks occupy the majority of the genome assemblies of A. thaliana (19,628 genes, 72.24% of 27,169 genes), B. oleracea (26,485 genes, 57.88%) and B. rapa (26,698 genes, 64.84%), and provide a foundation for comparative analyses of chromosomal rearrangement, gene loss and divergence of retained paralogues after WGT. Massive gene loss occurred in an asymmetrical and reciprocal fashion in the three subgenomes of each species and was largely completed before the B. oleracea–B. rapa divergence (Fig. 1c, Supplementary Tables 17–19 and Supplementary Figs 25–27). The timing of this evolutionary process was supported by the estimated timing of WGT ~15.9 million years ago (MYA), and species divergence ~4.6 MYA, based on synonymous substitution (Ks) rates of genes located in the blocks (Fig. 1b and Supplementary Table 20). Gene loss occurred mainly through small deletions that may be caused by illegitimate recombination21 22 (Supplementary Fig. 27), consistent with observations in other plant genomes. 
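The Ks-based dating mentioned above is usually done with the relation T = Ks / (2r), where r is the synonymous substitution rate per site per year. The sketch below illustrates that relation only; the rate of 1.5e-8 is an assumed value often used for Brassicaceae and is not stated in this text, and the Ks values in the usage example are hypothetical rather than the paper's measurements.

```python
def divergence_time_mya(ks: float, rate_per_site_per_year: float = 1.5e-8) -> float:
    """Convert a synonymous substitution distance (Ks) into million years.

    Uses T = Ks / (2 * r). The factor of 2 accounts for substitutions
    accumulating independently along both diverging lineages.
    The default rate is an assumption, not a value from the paper.
    """
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    years = ks / (2.0 * rate_per_site_per_year)
    return years / 1e6


# Hypothetical Ks values for illustration only.
for label, ks in [("older duplication", 0.45), ("recent split", 0.12)]:
    print(label, round(divergence_time_mya(ks), 2), "MYA")
```

Because the estimate scales inversely with r, the choice of substitution rate dominates the uncertainty of any date obtained this way.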
Abundant genome rearrangement following WGT and subsequent Brassica species divergence resulted in complex mosaics of triplicated ancestral genomic blocks in the A and C genomes (Fig. 1a and Supplementary Fig. 28). At least 19 major, and numerous fine-scale, chromosome rearrangements occurred, which differentiate the two Brassica species (Supplementary Fig. 29). This is in agreement with previous comparative studies based on chromosome painting12 23 and genetic mapping24 25. The extensive chromosome reshuffling in Brassica is in contrast to that observed in other taxa, such as the highly syntenic tomato–potato and pear–apple genomes, each with longer divergence times and less genome rearrangement26 27. This difference may be a consequence of mesopolyploidy in Brassica.

Greater TEs accumulation in B. oleracea than B. rapa

Both retro- (22.13%) and DNA (16.67%) TEs appear to be more highly amplified in B. oleracea relative to B. rapa (9.43 and 12.04%) (Fig. 2a and Supplementary Table 13). We constructed 1,362 gap-free contig-contig syntenic regions by clustering orthologous B. rapa–B. oleracea genes using MCscan (Supplementary Figs 29 and 30). The B. oleracea TE length (34.03% of the 259.6M) is 3.4 times greater than that of the syntenic B. rapa regions (16.73% of the 155.0M) (Fig. 2c, Supplementary Tables 21 and 22, and Supplementary Fig. 31). Phylogenetic analysis revealed that B. oleracea has both more LTR retrotransposon (LTR-RT) families, and more members in most families than B. rapa (Fig. 2d and Supplementary Figs 12, 32 and 33). Furthermore, two new lineages of LTR-RTs, Brassica Copia Retrotransposon and Brassica Gypsy Retrotransposon, were defined in both Brassica species (Supplementary Fig. 33). Analysis of LTR insertion time revealed that ~98% of B. oleracea intact LTR-RTs amplified continuously over the ~4 million years (MY) since the B. oleracea–B. rapa split, whereas ~68% of B. rapa intact LTR-RTs amplified rapidly within the last 1 MY, predominantly in the recent 0.2 MY (Fig. 2b and Supplementary Fig. 34). Hence, LTR-RTs expanded more in the intergenic space of euchromatic regions in B. oleracea than B. rapa. This agrees with previous observations based on comparison of BAC sequences between the A and C genomes28. As a consequence of continuous TE amplification over the last 4 MY, the genome size of B. oleracea is ~30% larger than that of B. rapa although the two genomes share the same ploidy and are largely collinear.

Species-specific genes and tandemly duplicated genes

While the genomes of B. oleracea and B. rapa are highly similar in terms of total gene clusters/sequences and the gene number in each cluster, there are also a large number of species-specific genes in the two species. A total of 66.5% (34,237 genes) of B. oleracea genes and 74.9% (34,324) of B. rapa genes were clustered into OrthoMCL groups (Supplementary Table 23 and Supplementary Fig. 35). We identified 9,832 B. oleracea-specific and 5,735 B. rapa-specific genes, of which 77% were supported by gene expression and/or a clear Arabidopsis homologue (Supplementary Table 24). More than 90% of these specific genes were validated for their absence in the counterpart genomes by reciprocal mapping of raw clean reads (Supplementary Tables 25 and 26). Most Brassica-specific genes are randomly distributed along the chromosomes (Supplementary Figs 36 and 37). More than 80% of the species-specific genes were surrounded by non-specific genes (Supplementary Fig. 38), suggesting that deletion of individual genes may be the major mechanism underlying gene loss and the difference in gene numbers between B. oleracea and B. rapa. Tandem duplication produces clusters of duplicated genes and contributes to the expansion of gene families29. We identified 1,825, 2,111 and 1,554 gene clusters containing 4,365, 5,181 and 4,170 tandemly duplicated genes in B. oleracea, B. rapa and A. thaliana, respectively (Fig. 3a, Supplementary Tables 27 and 28 and Supplementary Fig. 39). The wide range of sequence divergence of tandem gene pairs in each species suggests that tandem gene duplication occurred continuously throughout the evolutionary history of these species, rather than in discrete bursts (Supplementary Figs 40 and 41). Their continuous and asymmetrical occurrence after species divergence resulted in 522, 697 and 815 species-specific tandem clusters in the three genomes. The frequency of tandem duplication is independent of the total gene content, suggesting that genome triplication has not inhibited its occurrence. Tandemly duplicated genes are preferentially enriched for gene ontology (GO) categories related to defence response and pathways related to secondary metabolism such as indole alkaloid biosynthesis and tropane, piperidine and pyridine alkaloid biosynthesis (Fig. 3b, Supplementary Tables 29–32 and Supplementary Fig. 42). Over 44.0 and 51.9% of the NBS-encoding resistance genes are tandemly duplicated in B. oleracea and B. rapa, respectively (Supplementary Table 33).

Biased loss and retention of genes after WGT/WGD

Following polyploidization, reversion of gene numbers towards diploid levels through gene loss has been widely observed in plants30. However, in Brassica this only appears to be true for collinear genes in the conserved syntenic regions, with a loss of ~60% of the predicted post-triplication gene set, nearly restoring the pre-triplication gene number. This is reflected in an overall retention rate of 1.2-fold of A. thaliana orthologous genes in corresponding syntenic regions (Fig. 1c and Supplementary Table 18). In contrast, in terms of genes that have no collinear gene in A. thaliana and either Brassica species (hereafter called non-collinear genes), the gene retention rate is 2.5-fold the A. thaliana gene number in B. oleracea and 1.9-fold in B. rapa, both significantly higher than the expected rates (P value 40% of WGT paralogous gene pairs are differentially expressed in these species (Fig. 4b and Supplementary Fig. 65), suggesting potential subfunctionalization of these genes. In both species, a general trend of expression differentiation was alpha-WGD paralogous genes (~46%) > WGT paralogous genes (~42%) > tandemly duplicated genes (~35%) (Fig. 4b, Supplementary Fig. 66 and Supplementary Tables 42 and 43). Different tissues harbour approximately the same number of differentially expressed duplicates, but this number was slightly higher in flower tissue. The expression level of genes in the LF subgenome was significantly higher than corresponding syntenic genes in the more fractionated subgenomes (MF1 and MF2) while no expression dominance relationship was observed between the subgenomes MF1 and MF2 (Fig. 4c, Supplementary Table 44 and Supplementary Fig. 67). Duplicated transcription factor gene pairs showed less differentiated expression (~38%) than the expected ratio at the genome-wide level (Fig. 4d and Supplementary Table 45), while paralogues with GO categories related to membrane, catalytic activity and defence response exhibited a higher ratio of differentiated expression (Fig. 4e and Supplementary Table 46). Of B. oleracea–B. rapa orthologous gene pairs (23,823 in total), ~42% were differentially expressed across all tissues (Supplementary Tables 42 and 43). Furthermore, many paralogues generate different transcripts, resulting in expression differentiation. Analysis of AS variants of paralogous gene pairs that have identical numbers of exons demonstrated that these variants (either different variants or differential expression of the same variants) cause >20% and >44% of such paralogous genes to be differentially expressed in B. oleracea and B. rapa, respectively (Fig. 4f and Supplementary Table 47). For orthologous gene pairs of B. oleracea and B. rapa, 35.5% (8,467) of gene pairs showed differential expression due to AS variation. When only counting intron retention and exon skipping, 9.3% (2,215) of gene pairs differ. Divergence in AS variants of gene pairs presents an important layer of gene regulation, as reported35 36 37 38, and thus provides a genetic basis for species evolution and new species formation.

Unique GSLs metabolism pathways

GSLs and hydrolysis products have been of long-standing interest due to their role in plant defence and anticancer properties. Compared with B. rapa and B. napus, B. oleracea has the greatest GSL profile diversity, with wide qualitative and quantitative variation39 40. We identified 101 and 105 GSL biosynthesis genes in B. rapa and B. oleracea, respectively, and 22 GSL catabolism genes in each species (Fig. 5a, Supplementary Table 48 and Supplementary Data 2). In the GSL biosynthesis and catabolism pathways, tandem genes (41.4%, 40.7% and 33.9% in A. thaliana, B. oleracea and B. rapa, respectively) were present in a much higher proportion than the genome-wide average (Supplementary Table 32). The observed variation of GSL profiles is mainly attributed to the duplication of two genes, methylthioalkylmalate (MAM) synthase and 2-oxoglutarate-dependent dioxygenase (AOP). In Arabidopsis, the MAM family contains three tandemly duplicated and functionally diverse members (MAM1, MAM2 and MAM3), and functional analysis demonstrated that MAM2 (absent in ecotype Columbia) and MAM1 catalyse the condensation reaction of the first and the first two elongation cycles for the synthesis of dominant 3 and 4 carbon (C) side-chain aliphatic GSLs, respectively40 41, while MAM3 is assumed to contribute to the production of all GSL chain lengths42. In B. rapa and B. oleracea, MAM1/MAM2 genes experienced independent tandem duplication to produce 6 and 5 orthologs respectively (Fig. 5b,c). The main GSLs in B. oleracea are 4C and 3C GSLs (progoitrin, gluconapin, glucoraphanin and sinigrin)43, while those in B. rapa are 4C and 5C GSLs (gluconapin and glucobrassicanapin)39 (Fig. 5a). Based on the results of expression and phylogenetic analyses, we found a pair of genes Bol017070 and Bra013007, which are the only orthologous genes showing high expression in B. oleracea but silenced in B. rapa (Fig. 5a). This expression difference most likely leads to greater accumulation of the 3C GSL anticancer precursor sinigrin in B. oleracea. Meanwhile, the expression level of MAM3 in B. rapa is much higher than in B. oleracea, explaining the accumulation of 5C GSL glucobrassicanapin in B. rapa. Other genes affecting specific anticancer GSL products are AOPs. Previously, research has reported four gene loci involved in the side-chain modifications of aliphatic GSLs in Arabidopsis. Two tandemly duplicated genes AOP2 and AOP3 catalyse the formation of alkenyl and hydroxyalkyl GSLs, respectively. When both AOPs are non-functional, the plant accumulates the precursor methylsulfinyl alkyl GSL. We identified three AOP2 genes in B. oleracea (Fig. 5d), but two are non-functional due to the presence of premature stop codons. In contrast, all three AOP2 copies are functional in B. rapa 44. No AOP3 homologue has been identified in Brassica. This analysis supports GSL content surveys and explains why glucoraphanin is abundant in B. oleracea, but not in B. rapa.

Discussion

The Brassica genomes experienced WGT11 12 25 followed by massive gene loss and frequent reshuffling of triplicated genomic blocks. Analysis of retained or lost genes following triplication identified over-retention of genes for metabolic pathways such as oxidative phosphorylation, carbon fixation, photosynthesis and circadian rhythm32, which may contribute to polyploid vigour45. Fewer lost genes were observed in the less-fractionated subgenome, possibly due to expression dominance as reported in maize46. 
Gene expression analysis revealed extensive divergence and AS variants between duplicate genes. This subfunctionalization or neofunctionalization of duplicated genes provides genetic novelty and a basis for species evolution and new species formation. For example, TF genes that are considered to be conserved still have more than 38% of paralogous pairs showing differential expression across tissues although this percentage is lower than the average from all duplicated genes. Gene expression variation may contribute to an increased complexity of regulatory networks after polyploidization. The multi-layered asymmetrical evolution of the Brassica genomes revealed in this study suggests mechanisms of polyploid genome evolution underlying speciation. Asymmetrical gene loss between the Brassica subgenomes, the asymmetrical amplification of TEs and tandem duplications, preferential enrichment of genes for certain pathways or functional categories, and divergence in DNA sequence and expression, including alternative splicing among a large number of paralogous and orthologous genes, together shape a route for genome evolution after polyploidization. A molecular model of polyploid genome evolution through these asymmetrical mechanisms is summarized in Supplementary Fig. 2. The additional information of accessible large datasets and resource was provided in Supplementary Table 49. In summary, the B. oleracea genomic sequence, its features in comparison with its relatives, and the genome evolution mechanisms revealed, provide a fundamental resource for the genetic improvement of important traits, including components of GSLs for anticancer pharmaceuticals. The genome sequence has also laid a foundation for investigation of the tremendous range of morphological variation in B. oleracea as well as supporting genome analysis of the important allotetraploid crop B. napus (canola or rapeseed). Methods Sample preparation and genome sequencing A B. oleracea sp. 
capitata homozygous line 02–12 with elite agronomic characters and widely used as a parent in hybrid breeding was used for the reference genome sequencing (Supplementary Methods). The seedlings of plants were collected and genomic DNA was extracted from leaves with a standard CTAB extraction method. Illumina Genome Analyser whole-genome shotgun sequencing combined with GS FLX Titanium sequencing technology was used to achieve a B. oleracea draft genome. We constructed a total of 35 paired-end sequencing libraries with insertion sizes of 180 base pairs (bp), 200 bp, 350 bp, 500 bp, 650 bp, 800 bp, 2 kb, 5 kb, 10 kb and 20 kb following a standard protocol provided by Illumina (Supplementary Methods). Sequencing was performed using Illumina Genome Analyser II according to the manufacturer’s standard protocol. Genome assembly and validation We took a series of checking and filtering measures on reads following the Illumina-Pipeline, and low-quality reads, adaptor sequences and duplicates were removed (Supplementary Methods). The reads after the above filtering and correction steps were used to perform assembly including contig construction, scaffold construction and gap filling using SOAPdenovo1.04 ( http://soap.genomics.org.cn/) (Supplementary Methods). Finally, we used 20-kb-span paired-end data generated from the 454 platform and 105-kb-span BAC-end data downloaded from NCBI ( http://www.ncbi.nlm.nih.gov/nucgss?term=BOT01) to extend scaffold length (Supplementary Methods). The B. oleracea genome size was estimated using the distribution curve of 17-mer frequency (Supplementary Methods). To anchor the assembled scaffolds onto pseudo-chromosomes, we developed a genetic map using a double haploid population with 165 lines derived from a F1 cross between two homozygous lines 02–12 (sequenced) and 0188 (re-sequenced). 
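The 17-mer based genome size estimate mentioned above is commonly obtained by dividing the total number of k-mers by the depth at the main peak of the k-mer frequency distribution. The sketch below assumes that standard relation; the histogram, the `min_depth` cutoff for discarding likely error k-mers and the function name are hypothetical and do not reproduce the paper's actual procedure, which is described in its Supplementary Methods.

```python
def estimate_genome_size(histogram: dict[int, int], min_depth: int = 3) -> float:
    """Estimate genome size from a k-mer depth histogram.

    histogram maps depth -> number of distinct k-mers seen at that depth.
    Depths below min_depth are treated as sequencing errors and ignored
    (an assumed convention; other pipelines handle errors differently).
    Genome size ~= total k-mer occurrences / peak depth.
    """
    usable = {depth: count for depth, count in histogram.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=usable.get)          # depth with the most distinct k-mers
    total_kmers = sum(depth * count for depth, count in usable.items())
    return total_kmers / peak_depth


# Toy histogram for illustration only: a large error spike at depth 1-2
# and a main peak at depth 20.
toy_histogram = {1: 1000, 2: 100, 19: 50, 20: 200, 21: 50}
print(estimate_genome_size(toy_histogram))
```

On real data the histogram would come from a k-mer counter, and heterozygosity or repeat peaks can shift the apparent main peak, so the peak choice should be inspected rather than taken blindly.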
The genetic map contains 1,227 simple sequence repeat markers and single nucleotide polymorphism markers in nine linkage groups, which span a total of 1,180.2 cM with an average of 0.96 cM between the adjacent loci16. To position these markers to the scaffolds, marker primers were compared with the scaffold sequences using e-PCR (parameters -n2 -g1 –d 400–800), with the best-scoring match chosen in case of multiple matches. We validated the B. oleracea genome assembly by comparing it with the published physical map constructed using 73,728 BAC clones ( http://lulu.pgml.uga.edu/fpc/WebAGCoL/brassica/WebFPC/)17 and a genetic map from B. napus 18 (Supplementary Methods). Eleven Sanger-sequenced B. oleracea BAC sequences were used to assess the assembled genome using MUMmer-3.22 ( http://mummer.sourceforge.net/) (Supplementary Methods). Gene prediction and annotation Gene prediction was performed on the genome sequence after pre-masking for TEs (Supplementary Methods). Gene prediction was processed with the following steps: (i) De novo gene prediction used AUGUSTUS47 and GlimmerHMM48 with parameters trained from A. thaliana genes. (ii) For homologue prediction, we mapped the protein sequences from A. thaliana, O. sativa, C. papaya, V. vinifera and P. trichocarpa to the B. oleracea genome using tblastn with an E-value cutoff of 10−5, and used GeneWise (Version 2.2.0)49 for gene annotation. (iii) For EST-aided annotation, the Brassica ESTs from NCBI were aligned to the B. oleracea genome using BLAT (identity ≥0.95, coverage ≥0.90) and further assembled using PASA50. Finally, all the predictions were combined using GLEAN51 to produce the consensus gene sets. Functional annotation of B. oleracea genes was based on comparison with SwissProt, TrEMBL, Interproscan and KEGG proteins databases. The tRNA genes were identified by tRNAscan-SE using default parameters52. Then rRNAs were compared with the genome using blastn. 
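The EST alignment thresholds given above (identity ≥0.95, coverage ≥0.90) amount to a simple per-alignment filter. The sketch below shows one way such a filter could be expressed; the `EstAlignment` record, its field names and the exact definitions of identity and coverage are assumptions for illustration and are not the BLAT output format or the authors' scripts.

```python
from dataclasses import dataclass


@dataclass
class EstAlignment:
    """Hypothetical summary of one EST-to-genome alignment."""
    est_id: str
    est_length: int     # total EST length in bases
    aligned_bases: int  # EST bases covered by the alignment
    matches: int        # aligned bases identical to the genome


def passes_filter(aln: EstAlignment, min_identity: float = 0.95,
                  min_coverage: float = 0.90) -> bool:
    """Keep alignments meeting both thresholds.

    Assumed definitions: identity = matches / aligned_bases,
    coverage = aligned_bases / est_length.
    """
    if aln.est_length <= 0 or aln.aligned_bases <= 0:
        return False
    identity = aln.matches / aln.aligned_bases
    coverage = aln.aligned_bases / aln.est_length
    return identity >= min_identity and coverage >= min_coverage


# Made-up alignments for illustration only.
alignments = [
    EstAlignment("est_a", 500, 480, 470),  # high identity, high coverage
    EstAlignment("est_b", 500, 400, 398),  # coverage too low
    EstAlignment("est_c", 500, 490, 450),  # identity too low
]
kept = [a.est_id for a in alignments if passes_filter(a)]
print(kept)
```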
Other non-coding RNAs, including miRNA, snRNA, were identified using INFERNAL53 by comparison with the Rfam database. TE annotation LTR-RTs were initially identified using the LTR_STRUC54 programme, and then manually annotated and checked based on structure characteristics and sequence homology. Refined intact elements were then used to identify other intact elements and solo LTRs55. All the LTR-RTs with clear boundaries and insertion sites were classified into superfamilies (Copia-like, Gypsy-like and Unclassified retroelements) and families relying on the internal protein sequence, 5′, 3′ LTRs, primer-binding site and polypurine tracts. Non-LTR-RTs (Long interspersed nuclear element, LINE and Short interspersed nuclear element, SINE) and DNA transposons (Tc1-Mariner, hAT, Mutator, Pong, PIF-Harbinger, CACTA and miniature inverted repeat TE) were identified using conserved protein domains of reverse transposase or transposase as queries to search against the assembled genome using tblastn. Further upstream and downstream sequences of the candidate matches were compared with each other to define their boundaries and structure56. Helitron elements were identified by the HelSearch 1.0 programme57 and manually inspected. All the TE categories were identified according to the criteria described previously58. Typical elements of each category were selected and mixed together as a database for RepeatMasker59 analysis. Around 20 × coverage of shotgun reads randomly sampled from the two Brassica genomes were masked by the same TE data set to confirm the different accumulation of TEs between the two genomes. Syntenic block construction of B. oleracea and its relatives We used the same strategy as described in the B. rapa genome paper11 to construct syntenic blocks between species (Supplementary Methods). 
The all-against-all blastp comparison (E-value ≤ 1e–5) provided the gene pairs for syntenic clustering determined by MCScan (MATCH_SCORE: 50, MATCH_SIZE: 5, GAP_SCORE: –3, E_VALUE: 1E–05). As applied in B. rapa 11, we assigned and partitioned multiple B. oleracea or B. rapa chromosomal segments that matched the same A. thaliana segment (‘A to X’ numbering system in A. thaliana 22) into three subgenomes: LF, MF1 and MF2.

OrthoMCL clustering

To identify and estimate the number of potential orthologous gene families between B. oleracea, B. rapa, A. thaliana, C. papaya, P. trichocarpa, V. vinifera, S. bicolor and O. sativa, and also between B. oleracea and B. rapa, we applied the OrthoMCL pipeline60 using standard settings (blastp E value […] 5) or the Fisher’s exact test (N≤5) was used to detect significant differences between the proportion of (WGT or tandem) genes observed in each child GO, IPR or KEGG categories, and the expected overall proportion of (WGT or tandem) genes in the whole genome. Correlation of the gene numbers of WGT-derived paralogous genes with tandem genes in 938 GO terms was tested by Pearson correlation coefficients (Supplementary Figure 68). The Benjamini–Hochberg false discovery rate was performed to adjust the P values67.

Author contributions

I.B., B.C., D.E., Q.H., W.H., G.J.K., S.L., Y.L., J. Ma, A.H.P., J.C.P., I.A.P.P., JunW., XiaowuW., XiyinW. and T.-J.Y. are principal investigators (alphabetic order). B.C., W.H., A.H.P., JunW. and XiaowuW. are equally contributing senior authors. S.L., J.W., W.H., X.X. and Z.Y. planned and managed the project. S.L., C.T., A.H.P. and D.E., X.Y. and M.Z. wrote this manuscript and I.B., J. Ma., G.J.K., J.C.P., B.C., T.-J.Y., I.A.P.P., XiyinW., XiaowuW., K.L., Y.L., J.B. and A.G.S. made revision or edits or comments. J.W. (leader), W.H. (co-leader), JunW., L.Y., and Z.Y. performed DNA sequencing. L.Y. (leader), W.H. (co-leader), S.H., J.W., S.L. and J.Y. conducted genomic sequence assembly. S.H. (leader), XiyinW. (co-leader), J.Min, I.B., W.H., J.B., D.E., P.R., S.L., J.S., Y.L. and W.W. conducted scaffold anchoring to linkage maps and assembly validation. X.Y. (leader), J.Y. (co-leader), S.L., Q.Z., S.H. and J. Min performed annotation. C.T. (leader), Wanshun L., W.H., Y.L., C.L., W.W., J. Wu, S.L., C.D. and M.Z. performed transcriptome sequencing. S.L. conceived analysis of comparison and evolution. S.L. (leader), C.T., X.Y., ZhangyanW., C.L., S.H., J. Ma, J.Y., M.Z., Zhuo W., Q.Z., S.P., I.A.P.P., A.G.S., L.Y., I.B., G.J.K., J.C.P., XiaowuW., B.C., F.C., YinH., WenbinL. and X.Liang performed analysis of comparative genomics and evolution. J. Ma (leader), M.Z., Q.Z., C.T., S.L., B.C., S.H., H.B., C.L. and JianaL. conducted TE analysis. XiyinW. (leader), J.Y., T.-J.Y., ZhangyanW., L.W., J. Li, T.-H.L., JinpengW., H.J., X.T., X.L., M.G. and L.J. conducted gene family analysis. K.L. (leader), J.Y., S.L., C.T., H.L., H.G., S.P., D.Z., Z.F., Q.H., Xnfa W., C.Q., D.D., Z.H., Y.H., J.H., D.M., J.L., Z. Li, J.Z., L.X., Y.Zhou., Z.L. and Y.Zhang conducted trait-related gene analysis. A.H.P. (leader), XiyinW., D.J., Y.W. and T.-H.L. conducted gene conversion analysis. T.-J. Y. (leader), M.Z., P.S., B.-S.P., J.Ma, N.E.W., R.Q., X.L., J.Lee and H.H.K. conducted centromere analysis. C.T. (leader), S.L., X.Y., S.H., C.L., Zhangyan W., Q.Z., J.Y., J.T. and J.B. conducted tandemly duplicated gene analysis. ZhangyanW. and J.Y. performed data submission.

Additional information

Accession codes: Genome sequence data for B. oleracea have been deposited in the DDBJ/EMBL/GenBank nucleotide core database under the accession code AOIX00000000. Transcriptome sequence data for B. rapa and B. oleracea have been deposited in the DDBJ/EMBL/GenBank Sequence Read Archive (SRA) under the accession codes GSE43245 and GSE42891 respectively.

How to cite this article: Liu, S. et al. The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes. Nat. Commun. 5:3930 doi: 10.1038/ncomms4930 (2014).

Supplementary Material

Supplementary Figures, Tables, Methods and References: Supplementary Figures 1-68, Supplementary Tables 1-49, Supplementary Methods and Supplementary References
Supplementary Data 1: The 23,823 Brassica oleracea-B. rapa orthologous gene pairs and those with different exon numbers
Supplementary Data 2: The genes for biosynthesis and breakdown of glucosinolates (GSL) in B. rapa and B. oleracea.
Supplementary Data 3: The multiple sequence alignment of gene families corresponding to Figure 5 and Supplementary Figures 46-61.

            A rapid DNA isolation procedure for small quantities of fresh leaf tissue


              Chromosome-scale assemblies of plant genomes using nanopore long reads and optical maps

              Plant genomes are often characterized by a high level of repetitiveness and polyploid nature. Consequently, creating genome assemblies for plant genomes is challenging. The introduction of short-read technologies 10 years ago substantially increased the number of available plant genomes. Generally, these assemblies are incomplete and fragmented, and only a few are at the chromosome scale. Recently, Pacific Biosciences and Oxford Nanopore sequencing technologies were commercialized that can sequence long DNA fragments (kilobases to megabase) and, using efficient algorithms, provide high-quality assemblies in terms of contiguity and completeness of repetitive regions1-4. However, even though genome assemblies based on long reads exhibit high contig N50s (>1 Mb), these methods are still insufficient to decipher genome organization at the chromosome level. Here, we describe a strategy based on long reads (MinION or PromethION sequencers) and optical maps (Saphyr system) that can produce chromosome-level assemblies and demonstrate applicability by generating high-quality genome sequences for two new dicotyledon morphotypes, Brassica rapa Z1 (yellow sarson) and Brassica oleracea HDEM (broccoli), and one new monocotyledon, Musa schizocarpa (banana). All three assemblies show contig N50s of >5 Mb and contain scaffolds that represent entire chromosomes or chromosome arms.

                Author and article information

                Contributors
                Journal: Frontiers in Plant Science (Front. Plant Sci.)
                Publisher: Frontiers Media S.A.
                ISSN: 1664-462X
                Published: 12 October 2022
                Volume: 13
                Article number: 1021669
                Affiliations
                [1] Institute of Vegetables, Zhejiang Academy of Agricultural Sciences , Hangzhou, China
                Author notes

                Edited by: Mengyao Li, Sichuan Agricultural University, China

                Reviewed by: Changwei Zhang, Nanjing Agricultural University, China; Guang-Long Wang, Huaiyin Institute of Technology, China; Xu Yang, Yangzhou University, China

                *Correspondence: Honghui Gu, guhh@zaas.ac.cn

                †These authors have contributed equally to this work

                This article was submitted to Functional and Applied Plant Genomics, a section of the journal Frontiers in Plant Science

                Article
                DOI: 10.3389/fpls.2022.1021669
                PMCID: PMC9597678
                PMID: 36311069
                Copyright © 2022 Sheng, Yu, Wang, Shen and Gu

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 17 August 2022
                Accepted: 15 September 2022
                Page count
                Figures: 7, Tables: 1, Equations: 0, References: 48, Pages: 12, Words: 5576
                Categories
                Plant Science
                Original Research

                Plant science & Botany
                Keywords: Brassica oleracea, cauliflower, broccoli, curd peduncle explants, direct shoot regeneration, effective transformation
