
      Multi-platform discovery of haplotype-resolved structural variation in human genomes

      research-article
      Nature Communications
      Nature Publishing Group UK


          Abstract

          The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.
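The abstract separates indels from SVs at a 50 bp size cutoff. A minimal sketch of that size-based split follows, assuming each variant is represented only by its length in base pairs. The function names and the example lengths are hypothetical illustrations and are not taken from the authors' pipeline.

```python
from collections import Counter

# Size cutoff stated in the abstract: indels are <50 bp, SVs are >=50 bp.
SV_MIN_LENGTH_BP = 50


def classify_variant(length_bp: int) -> str:
    """Return 'indel' for variants shorter than 50 bp and 'SV' otherwise."""
    if length_bp < 1:
        raise ValueError("variant length must be a positive number of base pairs")
    return "SV" if length_bp >= SV_MIN_LENGTH_BP else "indel"


def count_by_class(lengths_bp):
    """Tally how many variant lengths fall into each size class."""
    return Counter(classify_variant(n) for n in lengths_bp)


# Made-up lengths for illustration only; these are not data from the study.
example_lengths = [1, 12, 49, 50, 51, 3200]
counts = count_by_class(example_lengths)
print(counts["indel"], counts["SV"])
```

A variant of exactly 50 bp falls on the SV side of the cutoff, matching the "≥50 bp" wording in the abstract.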

          Abstract

          Structural variants (SVs) in human genomes contribute to both genetic diversity and disease. Here, the authors use a multi-platform strategy to generate haplotype-resolved SVs for three human parent–child trios.


                Author and article information

                Contributors
                jan.korbel@embl.de
                eee@gs.washington.edu
                charles.lee@jax.org
                Journal
                Nat Commun
                Nature Communications
                Nature Publishing Group UK (London)
                ISSN: 2041-1723
                Published: 16 April 2019
                Volume: 10
                Article number: 1784
                Affiliations
                [1] ISNI 0000 0001 2298 6657, GRID grid.34477.33, Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195 USA
                [2] ISNI 0000 0001 2156 6853, GRID grid.42505.36, Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089 USA
                [3] ISNI 0000 0004 0495 846X, GRID grid.4709.a, European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
                [4] ISNI 0000 0000 8683 7370, GRID grid.214458.e, Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109 USA
                [5] ISNI 0000 0004 1936 754X, GRID grid.38142.3c, Center for Genomic Medicine, Massachusetts General Hospital, Department of Neurology, Harvard Medical School, Boston, MA 02114 USA
                [6] ISNI 0000 0004 0374 0039, GRID grid.249880.f, The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032 USA
                [7] European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, 9713 AV Groningen, The Netherlands
                [8] ISNI 0000 0004 0491 9823, GRID grid.419528.3, Center for Bioinformatics, Saarland University and the Max Planck Institute for Informatics, 66123 Saarbrücken, Germany
                [9] ISNI 0000 0001 2175 4264, GRID grid.411024.2, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201 USA
                [10] ISNI 0000 0001 0670 2351, GRID grid.59734.3c, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029 USA
                [11] ISNI 0000 0001 0599 1243, GRID grid.43169.39, The School of Life Science and Technology of Xi’an Jiaotong University, 710049 Xi’an, China
                [12] ISNI 0000 0001 0599 1243, GRID grid.43169.39, MOE Key Lab for Intelligent Networks & Networks Security, School of Electronics and Information Engineering, Xi’an Jiaotong University, 710049 Xi’an, China
                [13] ISNI 0000 0001 0599 1243, GRID grid.43169.39, Ye-Lab For Omics and Omics Informatics, Xi’an Jiaotong University, 710049 Xi’an, China
                [14] ISNI 0000 0004 1936 754X, GRID grid.38142.3c, Program in Bioinformatics and Integrative Genomics, Harvard Medical School, Boston, MA 02115 USA
                [15] ISNI 0000 0001 2291 4776, GRID grid.240145.6, Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030 USA
                [16] ISNI 0000 0000 8598 2218, GRID grid.266859.6, Department of Bioinformatics and Genomics, College of Computing and Informatics, The University of North Carolina at Charlotte, Charlotte, NC 28223 USA
                [17] ISNI 0000 0004 1936 754X, GRID grid.38142.3c, Department of Genetics, Harvard Medical School, Boston, MA 02115 USA
                [18] GRID grid.66859.34, The Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA
                [19] GRID grid.66859.34, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA
                [20] European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD United Kingdom
                [21] ISNI 0000 0004 1936 8710, GRID grid.47100.32, Yale University Medical School, Computational Biology and Bioinformatics Program, New Haven, CT 06520 USA
                [22] ISNI 0000 0004 1936 8710, GRID grid.47100.32, Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, New Haven, CT 06520 USA
                [23] ISNI 0000 0004 1936 9684, GRID grid.27860.3b, Biochemistry and Molecular Medicine, University of California Davis, Davis, CA 95616 USA
                [24] ISNI 0000 0004 1936 9684, GRID grid.27860.3b, UC Davis Genome Center, University of California, Davis, Davis, CA 95616 USA
                [25] ISNI 0000 0001 2193 0096, GRID grid.223827.e, USTAR Center for Genetic Discovery and Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
                [26] GRID grid.423340.2, Pacific Biosciences, Menlo Park, CA 94025 USA
                [27] ISNI 0000 0004 0473 1353, GRID grid.470262.5, Bionano Genomics, San Diego, CA 92121 USA
                [28] ISNI 0000 0001 2107 4242, GRID grid.266100.3, Beyster Center for Genomics of Psychiatric Diseases, Department of Psychiatry, University of California San Diego, La Jolla, CA 92093 USA
                [29] GRID grid.498512.3, 10X Genomics, Pleasanton, CA 94566 USA
                [30] ISNI 0000 0004 0507 3954, GRID grid.185669.5, Illumina Clinical Services Laboratory, Illumina, Inc., 5200 Illumina Way, San Diego, CA 92122 USA
                [31] ISNI 0000 0001 2107 4242, GRID grid.266100.3, Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA 92093 USA
                [32] ISNI 0000 0000 9737 1625, GRID grid.1052.6, Ludwig Institute for Cancer Research, La Jolla, CA 92093 USA
                [33] ISNI 0000 0001 2171 7754, GRID grid.255649.9, Department of Graduate Studies – Life Sciences, Ewha Womans University, 52, Ewhayeodae-gil, Seodaemun-gu, Seoul, 03760 South Korea
                [34] GRID grid.410904.8, DNA Link, Seodaemun-gu, Seoul, South Korea
                [35] TreeCode Sdn Bhd, Bandar Botanic, 41200 Klang, Malaysia
                [36] ISNI 0000 0001 2107 4242, GRID grid.266100.3, Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093 USA
                [37] ISNI 0000 0001 2181 3113, GRID grid.166341.7, School of Biomedical Engineering, Drexel University, Philadelphia, PA 19104 USA
                [38] GRID grid.66859.34, Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA
                [39] ISNI 0000 0000 9206 2401, GRID grid.267308.8, Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77225 USA
                [40] ISNI 0000 0001 2355 7002, GRID grid.4367.6, Department of Medicine, McDonnell Genome Institute, Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63108 USA
                [41] ISNI 0000 0001 2308 5949, GRID grid.10347.31, High Impact Research, University of Malaya, 50603 Kuala Lumpur, Malaysia
                [42] ISNI 0000 0004 1936 8710, GRID grid.47100.32, Department of Computer Science, Yale University, 266 Whitney Avenue, New Haven, CT 06520 USA
                [43] ISNI 0000 0004 1936 8710, GRID grid.47100.32, Department of Statistics and Data Science, Yale University, 266 Whitney Avenue, New Haven, CT 06520 USA
                [44] ISNI 0000 0001 2297 6811, GRID grid.266102.1, Institute for Human Genetics, University of California–San Francisco, San Francisco, CA 94143 USA
                [45] ISNI 0000 0001 0702 3000, GRID grid.248762.d, Terry Fox Laboratory, BC Cancer Agency, Vancouver, BC V5Z 1L3 Canada
                [46] ISNI 0000 0001 2288 9830, GRID grid.17091.3e, Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
                [47] ISNI 0000 0001 2107 4242, GRID grid.266100.3, Department of Pediatrics, University of California San Diego, La Jolla, CA 92093 USA
                [48] GRID grid.452438.c, The First Affiliated Hospital of Xi’an Jiaotong University, 710061 Xi’an, China
                [49] GRID grid.66859.34, Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, MA 02142 USA
                [50] ISNI 0000 0000 8683 7370, GRID grid.214458.e, Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109 USA
                [51] ISNI 0000 0001 2298 6657, GRID grid.34477.33, Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195 USA
                Author information
                http://orcid.org/0000-0003-4036-9577
                http://orcid.org/0000-0001-5773-5620
                http://orcid.org/0000-0001-9671-1533
                http://orcid.org/0000-0002-3128-3547
                http://orcid.org/0000-0003-1183-0432
                http://orcid.org/0000-0003-0381-7801
                http://orcid.org/0000-0002-5187-0415
                http://orcid.org/0000-0002-2536-165X
                http://orcid.org/0000-0003-4394-2455
                http://orcid.org/0000-0001-5750-1808
                http://orcid.org/0000-0003-4264-1853
                http://orcid.org/0000-0002-5989-6898
                http://orcid.org/0000-0003-4944-4107
                http://orcid.org/0000-0003-3047-4250
                http://orcid.org/0000-0002-3492-1102
                http://orcid.org/0000-0001-8413-6498
                http://orcid.org/0000-0002-5640-9070
                http://orcid.org/0000-0002-0539-9714
                http://orcid.org/0000-0001-9312-1159
                http://orcid.org/0000-0001-7827-9839
                http://orcid.org/0000-0002-5435-1127
                http://orcid.org/0000-0002-3897-7955
                http://orcid.org/0000-0003-4013-5279
                http://orcid.org/0000-0002-9087-526X
                http://orcid.org/0000-0003-4662-3177
                http://orcid.org/0000-0002-9376-1030
                http://orcid.org/0000-0002-2798-3794
                http://orcid.org/0000-0002-8246-4014
                Article
                Publisher ID: 8148
                DOI: 10.1038/s41467-018-08148-z
                PMCID: PMC6467913
                PMID: 30992455
                d8756384-2c9c-4320-bd31-644b71a14d4f
                © The Author(s) 2019

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 24 October 2018
                Accepted: 20 December 2018