
      Super-pangenome analyses highlight genomic diversity and structural variation across wild and cultivated tomato species

      research-article


          Abstract

          Effective utilization of wild relatives is key to overcoming challenges in genetic improvement of cultivated tomato, which has a narrow genetic basis; however, current efforts to decipher high-quality genomes for tomato wild species are insufficient. Here, we report chromosome-scale tomato genomes from nine wild species and two cultivated accessions, representative of Solanum section Lycopersicon, the tomato clade. Together with two previously released genomes, we elucidate the phylogeny of Lycopersicon and construct a section-wide gene repertoire. We reveal the landscape of structural variants and provide entry to the genomic diversity among tomato wild relatives, enabling the discovery of a wild tomato gene with the potential to increase yields of modern cultivated tomatoes. Construction of a graph-based genome enables structural-variant-based genome-wide association studies, identifying numerous signals associated with tomato flavor-related traits and fruit metabolites. The tomato super-pangenome resources will expedite biological studies and breeding of this globally important crop.

          Abstract

          A tomato super-pangenome constructed using chromosome-scale genomes of nine wild species and two cultivated accessions highlights genomic diversity and structural variation across wild and cultivated tomatoes.


                Author and article information

                Contributors
                wanghuan@caas.cn
                lihongbo_solab@163.com
                yuqinghui@xaas.ac.cn
                Journal
                Nat Genet
                Nature Genetics
                Nature Publishing Group US (New York)
                ISSN (print): 1061-4036
                ISSN (electronic): 1546-1718
                6 April 2023
                2023; 55(5): 852-860
                Affiliations
                [1] GRID grid.433811.c, ISNI 0000 0004 1798 1482, The State Key Laboratory of Genetic Improvement and Germplasm Innovation of Crop Resistance in Arid Desert Regions (Preparation), Key Laboratory of Genome Research and Genetic Improvement of Xinjiang Characteristic Fruits and Vegetables, Institute of Horticultural Crops, Xinjiang Academy of Agricultural Sciences, Urumqi, China
                [2] GRID grid.410727.7, ISNI 0000 0001 0526 1937, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, China
                [3] GRID grid.410727.7, ISNI 0000 0001 0526 1937, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Shenzhen Key Laboratory of Agricultural Synthetic Biology, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
                [4] GRID grid.413251.0, ISNI 0000 0000 9354 9799, College of Horticulture, Xinjiang Agricultural University, Urumqi, China
                [5] GRID grid.413254.5, ISNI 0000 0000 9544 7024, College of Life Science and Technology, Xinjiang University, Urumqi, China
                [6] Adsen Biotechnology Co., Ltd., Urumqi, China
                [7] GRID grid.27871.3b, ISNI 0000 0000 9750 7019, College of Horticulture, Nanjing Agricultural University, Nanjing, China
                [8] GRID grid.5386.8, ISNI 000000041936877X, Boyce Thompson Institute, Cornell University, Ithaca, NY, USA
                [9] GRID grid.512862.a, US Department of Agriculture-Agricultural Research Service, Robert W. Holley Center for Agriculture and Health, Ithaca, NY, USA
                [10] GRID grid.410727.7, ISNI 0000 0001 0526 1937, Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing, China
                Author information
                http://orcid.org/0000-0003-3356-2125
                http://orcid.org/0000-0002-5140-8220
                http://orcid.org/0000-0003-0780-2973
                http://orcid.org/0000-0002-8547-5309
                http://orcid.org/0000-0001-9684-1450
                http://orcid.org/0000-0003-3836-0741
                http://orcid.org/0000-0003-1579-4600
                http://orcid.org/0000-0002-9342-5301
                Article
                Article number: 1340
                DOI: 10.1038/s41588-023-01340-y
                PMCID: PMC10181942
                PMID: 37024581
                © The Author(s) 2023

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 1 December 2021
                Accepted: 21 February 2023
                Funding
                Funded by: the National Natural Science Foundation of China (31860555, 32260763 and 31991180), Key projects for crop traits formation and cutting-edge technologies in biological breeding (xjnkywdzc-2022001), Key Research and development task special project of Xinjiang (2022B02002), Special Incubation Project of Science & Technology Renovation of Xinjiang Academy of Agricultural Sciences (xjkcpy-2021001), China Agriculture Research System of MOF and MARA (CARS-23-G24), Guangdong Major Project of Basic and Applied Basic Research (2021B0301030004), the National Key Research and Development Program of China (2019YFA0906200 and 2021YFF1000100), Shenzhen Science and Technology Program (Grant No. KQTD2016113010482651), Special Funds for Science Technology Innovation and Industrial Development of Shenzhen Dapeng New District (Grant No. RC201901-05), Shenzhen Outstanding Talents Training Fund and the US National Science Foundation (IOS-1855585).
                Categories
                Article
                Custom metadata
                © Springer Nature America, Inc. 2023

                Genetics
                plant breeding, genomics
