      Chromosome-level genome assembly and characterization of Sophora japonica

      research-article


          Abstract

          Sophora japonica is a medium-sized deciduous tree belonging to the Leguminosae family and is known for its high ecological, economic and medicinal value. Here, we present a draft genome of S. japonica, which is ∼511.49 Mb long (contig N50 of 17.34 Mb) and was assembled from Illumina, Nanopore and Hi-C data. Using the Hi-C data, we reliably anchored 110 contigs onto 14 chromosomes, representing 91.62% of the total genome, with an improved N50 of 31.32 Mb. Further investigation identified 271.76 Mb (53.13%) of repetitive sequences and 31,000 protein-coding genes, of which 30,721 (99.1%) were functionally annotated. Phylogenetic analysis indicates that S. japonica diverged from Arabidopsis thaliana and Glycine max ∼107.53 and ∼61.24 million years ago, respectively. We detected evidence of both a species-specific and a common-legume whole-genome duplication event in S. japonica. We further found that multiple TF families (e.g. BBX and PAL) have expanded in S. japonica, which might have led to its enhanced tolerance to abiotic stress. In addition, S. japonica harbours more genes involved in the lignin and cellulose biosynthesis pathways than the other two species. Finally, population genomic analyses revealed no obvious differentiation among geographical groups and showed that the effective population size has declined continuously since 2 Ma. Our genomic data provide a powerful comparative framework for studying adaptation, evolution and active-ingredient biosynthesis in S. japonica. More importantly, this high-quality S. japonica genome should help to elucidate the biosynthesis of its main bioactive components and to improve its production and/or processing.
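
          The abstract reports assembly contiguity as N50 (17.34 Mb for contigs, 31.32 Mb after Hi-C scaffolding). For readers unfamiliar with the metric, the following is a minimal sketch of the standard N50 definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The function name `n50` and the toy lengths are illustrative assumptions, not the authors' code or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), not taken from the paper.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

          Because N50 depends only on the length distribution, joining contigs into longer chromosome-scale scaffolds raises it without adding sequence, which is consistent with the increase reported after Hi-C anchoring.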


                Author and article information

                Journal
                DNA Res
                DNA Research: An International Journal for Rapid Publication of Reports on Genes and Genomes
                Oxford University Press
                1340-2838
                1756-1663
                June 2022
                25 April 2022
                Volume: 29
                Issue: 3
                Article: dsac009
                Affiliations
                [1 ] State Key Laboratory of Grassland Agro-Ecosystems, and College of Ecology, Lanzhou University , Lanzhou 730000, China
                [2 ] Key Laboratory of Bio-resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University , Chengdu 610000, China
                [3 ] Institute of Loess Plateau, Shanxi University , Taiyuan 030006, China
                Author notes
                To whom correspondence should be addressed. Tel. 13880788291. Email: rudf@lzu.edu.cn (D.R.); Tel. 13880788291. Email: lbb2015@sxu.edu.cn (B.L.)

                Weixiao Lei, Zefu Wang and Man Cao contributed equally to this work.

                Article
                Article ID: dsac009
                DOI: 10.1093/dnares/dsac009
                PMCID: PMC9154292
                PMID: 35466378
                e1fe0472-24ee-4b5a-a9f9-de0460e4f989
                © The Author(s) 2022. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License ( https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

                History
                14 December 2021
                03 April 2022
                07 April 2022
                27 May 2022
                Page count
                Pages: 10
                Funding
                Funded by: National Natural Science Foundation of China, DOI 10.13039/501100001809;
                Award ID: 32001085
                Funded by: Fundamental Research Funds for Central Universities;
                Award ID: lzujbky-2020-34
                Award ID: lzujbky-2020-ct02
                Categories
                Resource Article: Genomes Explored

                Genetics
                Keywords: Sophora japonica, genome, Nanopore, Hi-C, WGD
