      Is Open Access

      A revamped rat reference genome improves the discovery of genetic diversity in laboratory rats

      research-article
      Cell Genomics
      Elsevier
      rat, reference genome, mRatBN7.2, Rnor_6.0, hybrid rat diversity panel, heterogeneous stock, genetic map, phylogenetic tree, inbred strains, recombinant inbred


          Summary

          The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors approximately 9-fold, and increases contiguity 290-fold compared with its predecessor. Gene annotations are now more complete, improving the mapping precision of genomic, transcriptomic, and proteomic datasets. We jointly analyzed 163 short-read whole-genome sequencing datasets representing 120 laboratory rat strains and substrains using mRatBN7.2. We identified ∼20.0 million sequence variations, of which 18,700 are predicted to impact the function of 6,677 genes. We also generated a new rat genetic map from 1,893 heterogeneous stock rats and annotated transcription start sites and alternative polyadenylation sites. Together with this extensive analysis of genomic variation among rat strains, the mRatBN7.2 assembly enhances our understanding of the rat genome and gives researchers an expanded resource for studies involving rats.

          Highlights

          • mRatBN7.2 is a rat reference genome with improved contiguity and accuracy

          • Gene annotations, from both RefSeq and Ensembl, are improved with mRatBN7.2

          • Our analysis of 120 strains/substrains of rats found 20 million sequence variations

          • A refined phylogenetic tree reveals the relationships between laboratory rats

          Abstract

          de Jong et al. evaluated the seventh assembly of the rat reference genome, mRatBN7.2, and found that it reduces base-level errors and increases contiguity, although some misassemblies remain. Gene annotations are now more complete. Analysis of whole genomes representing 120 rat strains/substrains revealed 20 million sequence variations. Phylogenetic analysis refined ancestral relationships among these strains. In addition, a new rat genetic map, along with annotated transcription start sites and alternative polyadenylation sites based on mRatBN7.2, is provided.

                Author and article information

                Contributors
                Journal
                Title: Cell Genomics
                Abbreviated title: Cell Genom
                Publisher: Elsevier
                ISSN: 2666-979X
                26 March 2024
                10 April 2024
                26 March 2024
                Volume: 4
                Issue: 4
                Article number: 100527
                Affiliations
                [1] Department of Pharmacology, Addiction Science, and Toxicology, University of Tennessee Health Science Center, Memphis, TN, USA
                [2] Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
                [3] Institute of Biotechnology, University of Helsinki, Helsinki, Finland
                [4] Department of Psychiatry, University of California San Diego, San Diego, CA, USA
                [5] Department of Integrative Structural and Computational Biology, Scripps Research, San Diego, CA, USA
                [6] Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, USA
                [7] Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI, USA
                [8] Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI, USA
                [9] Department of Medicine, University of California San Diego, San Diego, CA, USA
                [10] Tree of Life, Wellcome Sanger Institute, Cambridge, UK
                [11] Institute of Genetics and Biophysics, National Research Council, Naples, Italy
                [12] Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
                [13] Department of Anatomy, Physiology & Genetics, The American Genome Center, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
                [14] The Brown Foundation Institute of Molecular Medicine, Center for Human Genetics, University of Texas Health Science Center, Houston, TX, USA
                [15] Genome Structure and Ageing, University of Groningen, UMC, Groningen, the Netherlands
                [16] European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus in Hinxton, Cambridgeshire, UK
                [17] Gluck Equine Research Center, Department of Veterinary Science, University of Kentucky, Louisville, KY, USA
                [18] Center for Proteomics and Metabolomics, St. Jude Children’s Research Hospital, Memphis, TN, USA
                [19] Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
                [20] Center for Immunity and Immunotherapies, Seattle Children’s Research Institute, Seattle, WA, USA
                [21] Department of Pediatrics, University of Washington School of Medicine, Seattle, WA, USA
                [22] Institute of Physiology, Czech Academy of Sciences, Prague, Czechia
                [23] Department of Internal Medicine, Section on Molecular Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA
                [24] Department of Animal Sciences, Washington State University, Pullman, WA, USA
                [25] National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
                [26] Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA
                Author notes
                [∗] Corresponding author: junzli@med.umich.edu
                [∗∗] Corresponding author: hchen@uthsc.edu
                [27] These authors contributed equally
                [28] Lead contact

                Article
                PII: S2666-979X(24)00069-7; article number: 100527
                DOI: 10.1016/j.xgen.2024.100527
                PMC ID: 11019364
                PubMed ID: 38537634
                © 2024 The Authors

                This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

                History
                Received: 2 October 2023
                Revised: 26 December 2023
                Accepted: 29 February 2024
                Categories
                Resource

