      The complete sequence of a human genome*

      Science (New York, N.Y.)


          Abstract

          Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion base pair (bp) sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million bp of sequence containing 1,956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.

          One-Sentence Summary:

          Twenty years after the initial drafts, a truly complete sequence of a human genome reveals what has been missing.


                Author and article information

                Journal
                0404511
                7473
                Science (New York, N.Y.)
                0036-8075
                1095-9203
                7 March 2022
                April 2022
                31 March 2022
                10 June 2022
                Volume: 376
                Issue: 6588
                Pages: 44-53
                Affiliations
                [1] Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
                [2] Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego; La Jolla, CA, USA
                [3] Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University; Saint Petersburg, Russia
                [4] Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
                [5] Department of Bioengineering, University of California, Berkeley; Berkeley, CA, USA
                [6] Sirius University of Science and Technology; Sochi, Russia
                [7] Vavilov Institute of General Genetics; Moscow, Russia
                [8] Department of Molecular Biology and Genetics, Johns Hopkins University; Baltimore, MD, USA
                [9] Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
                [10] Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
                [11] UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
                [12] University of Geneva Medical School; Geneva, Switzerland
                [13] Stowers Institute for Medical Research; Kansas City, MO, USA
                [14] NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
                [15] Department of Molecular and Cell Biology, University of California, Berkeley; Berkeley, CA, USA
                [16] Department of Data Sciences, Dana-Farber Cancer Institute; Boston, MA, USA
                [17] Department of Biomedical Informatics, Harvard Medical School; Boston, MA, USA
                [18] DNAnexus; Mountain View, CA, USA
                [19] Wellcome Sanger Institute; Cambridge, UK
                [20] Department of Genetics, University of Cambridge; Cambridge, UK
                [21] Inscripta; Boulder, CO, USA
                [22] Laboratory of Neurogenetics of Language and The Vertebrate Genome Lab, The Rockefeller University; New York, NY, USA
                [23] Howard Hughes Medical Institute; Chevy Chase, MD, USA
                [24] Department of Genetics, Washington University School of Medicine; St. Louis, MO, USA
                [25] University of Tennessee Health Science Center; Memphis, TN, USA
                [26] McDonnell Genome Institute, Washington University in St. Louis; St. Louis, MO, USA
                [27] Department of Genetics, Yale University School of Medicine; New Haven, CT, USA
                [28] Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
                [29] Pacific Biosciences; Menlo Park, CA, USA
                [30] Department of Computational and Data Sciences, Indian Institute of Science; Bangalore, KA, India
                [31] Reservoir Genomics LLC; Oakland, CA, USA
                [32] Department of Computer Science and Engineering, University of California, San Diego; San Diego, CA, USA
                [33] Undiagnosed Diseases Program, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
                [34] Heinrich Heine University Düsseldorf, Medical Faculty, Institute for Medical Biometry and Bioinformatics; Düsseldorf, Germany
                [35] Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
                [36] Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children’s Hospital; Seattle, WA, USA
                [37] Max-Planck Institute of Molecular Cell Biology and Genetics; Dresden, Germany
                [38] Department of Psychiatry, University of Massachusetts Medical School; Worcester, MA, USA
                [39] Faculty of Biology, Lomonosov Moscow State University; Moscow, Russia
                [40] Cancer Institute of New Jersey; New Brunswick, NJ, USA
                [41] Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, USA
                [42] National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health; Bethesda, MD, USA
                [43] Human Genome Sequencing Center, Baylor College of Medicine; Houston, TX, USA
                [44] Genome Center, MIND Institute, Department of Biochemistry and Molecular Medicine, University of California, Davis; Davis, CA, USA
                [45] Institute for Systems Biology; Seattle, WA, USA
                [46] Digital BioLogic d.o.o.; Ivanić-Grad, Croatia
                [47] Chan Zuckerberg Biohub; San Francisco, CA, USA
                [48] Department of Molecular Genetics and Microbiology, Duke University School of Medicine; Durham, NC, USA
                [49] Department of Biology, Johns Hopkins University; Baltimore, MD, USA
                [50] Department of Pathology, University of Pittsburgh; Pittsburgh, PA, USA
                [51] Research Center of Biotechnology of the Russian Academy of Sciences; Moscow, Russia
                [52] Department of Biochemistry and Molecular Biology, University of Kansas Medical School; Kansas City, MO, USA
                [53] Department of Biomolecular Engineering, University of California, Santa Cruz; Santa Cruz, CA, USA
                Author notes
                [†]

                These authors contributed equally to this work

                [‡]

                Present address: Oxford Nanopore Technologies Inc.; Lexington, MA, USA

                [*]

                This manuscript has been accepted for publication in Science. This version has not undergone final editing. Please refer to the complete version of record at http://www.sciencemag.org/. The manuscript may not be reproduced or used in any manner that does not fall within the fair use provisions of the Copyright Act without the prior, written permission of AAAS.

                Author contributions: Analysis teams (leads*): Assembly: SN*, SK*, MR*, MA, HC, CSC, RD, EG, MKi, MKo, HL, TM, EWM, IS, BPW, AW, AMP. Acrocentrics: AMP*, JLG*, MR, SEA, MB, RD, LGL, TP. Validation: AR*, AVB*, AM*, MA*, AMM*, KS*, WC, LGL, TD, GF, AF, KH, CJ, EDJ, DP, VAS, YS, BAS, FTN, JT, JMDW, AMP. Segmental duplications: MRV*, EEE*, SN, SK, MD, PCD, AG, GAL, DP, CJS, DCS, MYD, WT, KHM, AMP. Satellite annotation: NA*, IAA*, KHM*, AVB, LU, TD, LGL, PAP, EIR, ASt, BAS, AMP. Epigenetics: AG*, WT*, SK, AR, MRV, NA, SJH, GAL, GVC, MCS, RJO, EEE, KHM, AMP. Variants: SA*, DCS*, SMY*, SZ*, RCM*, MYD*, JMZ*, MCS*, NFH, MKi, JM, DEM, NDO, JAR, FJS, KS, ASh, JW, CX, AMP. Repeat annotation: SJH*, RJO*, AG, PGSG, GAH, LGL, AFAS, JMS. Gene annotation: MD*, MH*, ASh*, SN, SK, PCD, ITF, SLS, FTN, AMP. Browsers: MD*, NCC, PK. Data generation: SJH, GGB, SYB, GVC, RSF, TAGL, IMH, MWH, MJ, JK, VVM, JCM, BP, PP, ACY, US, MYD, JLG, RJO, WT, EEE, KHM, AMP. Computational resources: CSC, AF, RJO, MCS, KHM, AMP. Manuscript draft: AMP. Figures: SK, SN, AMP, AR. Editing: AMP, SN, SK, AR, EEE, KHM, with the assistance of all authors. Supplement: SN, SK, with the assistance of the working groups. Supervision: RCM, MYD, IAA, JLG, RJO, WT, JMZ, MCS, EEE, KHM, AMP. Conceptualization: EEE, KHM, AMP.

                [*] Corresponding authors: Evan E. Eichler (eee@gs.washington.edu); Karen H. Miga (khmiga@ucsc.edu); Adam M. Phillippy (adam.phillippy@nih.gov)
                Article
                NIHMS ID: NIHMS1775562
                DOI: 10.1126/science.abj6987
                PMCID: PMC9186530
                PMID: 35357919
                5ed7ff03-6add-4381-b836-01b0c6684034

                This work is licensed under a Creative Commons Attribution 4.0 International License, which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
