
      Eight high-quality genomes reveal pan-genome architecture and ecotype differentiation of Brassica napus

      research-article


          Abstract

          Rapeseed (Brassica napus) is the second most important oilseed crop in the world, but the genetic diversity underlying its massive phenotypic variation remains largely unexplored. Here, we report the sequencing, de novo assembly and annotation of eight B. napus accessions. Using pan-genome comparative analysis, we identified millions of small variations and 77.2–149.6 megabases of presence and absence variations (PAVs). More than 9.4% of the genes contained large-effect mutations or structural variations. A PAV-based genome-wide association study (PAV-GWAS) directly identified causal structural variations for silique length, seed weight and flowering time in a nested association mapping population with ZS11 (reference line) as the donor. These variations were not detected by a single-nucleotide polymorphism-based GWAS (SNP-GWAS), demonstrating that PAV-GWAS is complementary to SNP-GWAS in identifying associations with traits. Further analysis showed that PAVs in three FLOWERING LOCUS C genes were closely related to flowering time and ecotype differentiation. This study provides resources to support a better understanding of the genome architecture and to accelerate the genetic improvement of B. napus.
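
          The idea behind a PAV-based association scan can be illustrated with a minimal sketch. This is not the authors' pipeline: a real GWAS would typically use a mixed model that corrects for population structure, whereas this toy version only compares trait means between lines that carry a segment and lines that lack it, using Welch's t-statistic. The function names, PAV identifiers and trait values below are hypothetical and made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pav_association(presence, phenotype):
    """Welch's t-statistic for a trait, carriers (1) versus non-carriers (0).

    Simplified stand-in for a GWAS test; no population-structure correction.
    """
    carriers = [y for g, y in zip(presence, phenotype) if g == 1]
    absent = [y for g, y in zip(presence, phenotype) if g == 0]
    if len(carriers) < 2 or len(absent) < 2:
        return None  # too few lines in one group to estimate a variance
    se = sqrt(variance(carriers) / len(carriers) + variance(absent) / len(absent))
    if se == 0:
        return None  # no within-group variation, statistic undefined
    return (mean(carriers) - mean(absent)) / se


def scan_pavs(pav_matrix, phenotype):
    """Rank PAVs by absolute t-statistic; pav_matrix maps PAV id -> 0/1 list."""
    scores = {}
    for pav_id, presence in pav_matrix.items():
        t = pav_association(presence, phenotype)
        if t is not None:
            scores[pav_id] = t
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Toy data (invented): a trait measured in eight lines and two hypothetical PAVs.
trait = [5.1, 5.3, 4.9, 7.8, 8.1, 7.9, 5.0, 8.0]
pavs = {
    "pav_A": [0, 0, 0, 1, 1, 1, 0, 1],  # presence tracks the high-trait lines
    "pav_B": [1, 0, 1, 0, 1, 0, 1, 0],  # presence largely unrelated to the trait
}
ranked = scan_pavs(pavs, trait)
```

          In this toy data, `pav_A` ranks first because its presence separates the high-trait lines from the low-trait ones. A SNP-based scan can miss such a signal when no genotyped SNP tags the inserted or deleted segment, which is one way to read the abstract's claim that the two approaches are complementary.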

          Abstract

          The assembly of eight high-quality rapeseed genomes allows identification of presence and absence variations (PAVs) and small variations. PAV-based genome-wide association analysis uncovered causal variations for agronomic traits and ecotype differentiation.


                Author and article information

                Contributors
                kdliu@mail.hzau.edu.cn
                yqy@mail.hzau.edu.cn
                llchen@mail.hzau.edu.cn
                guoliang@mail.hzau.edu.cn
                Journal
                Nature Plants (Nat Plants)
                Nature Publishing Group UK (London)
                ISSN: 2055-0278
                Published: 13 January 2020
                Volume 6, Issue 1, Pages 34-45
                Affiliations
                [1] National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, People’s Republic of China (ISNI 0000 0004 1790 4137, GRID grid.35155.37)
                [2] Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, People’s Republic of China (ISNI 0000 0004 1790 4137, GRID grid.35155.37)
                Author information
                http://orcid.org/0000-0001-7955-4222
                http://orcid.org/0000-0002-1395-2049
                http://orcid.org/0000-0002-3510-8906
                http://orcid.org/0000-0002-3005-526X
                http://orcid.org/0000-0001-7191-5062
                Article
                577
                DOI: 10.1038/s41477-019-0577-7
                PMCID: PMC6965005
                PMID: 31932676
                41c828bc-e8ca-4bf9-8864-c1c87ec8e0c6
                © The Author(s) 2020

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 31 July 2019
                Accepted: 29 November 2019
                Categories
                Article
                Custom metadata
                © The Author(s), under exclusive licence to Springer Nature Limited 2020

                genetics, plant sciences, agriculture
