      Extreme enrichment of VNTR-associated polymorphicity in human subtelomeres: genes with most VNTRs are predominantly expressed in the brain

      research-article


          Abstract

          The human genome harbors numerous structural variants (SVs) which, due to their repetitive nature, are currently underexplored in short-read whole-genome sequencing approaches. Using single-molecule, real-time (SMRT) long-read sequencing technology in combination with FALCON-Unzip, we generated a de novo assembly of the diploid genome of a 115-year-old Dutch cognitively healthy woman. We combined this assembly with two previously published haploid assemblies (CHM1 and CHM13) and the GRCh38 reference genome to create a compendium of SVs that occur across five independent human haplotypes using the graph-based multi-genome aligner REVEAL. Across these five haplotypes, we detected 31,680 euchromatic SVs (>50 bp). Of these, ~62% were comprised of repetitive sequences with ‘variable number tandem repeats’ (VNTRs), ~10% were mobile elements (Alu, L1, and SVA), while the remaining variants were inversions and indels. We observed that VNTRs with GC-content >60% and repeat patterns longer than 15 bp were 21-fold enriched in the subtelomeric regions (within 5 Mb of the ends of chromosome arms). VNTR lengths can expand to exceed a critical length which is associated with impaired gene transcription. The genes that contained most VNTRs, of which PTPRN2 and DLGAP2 are the most prominent examples, were found to be predominantly expressed in the brain and associated with a wide variety of neurological disorders. Repeat-induced variation represents a sizeable fraction of the genetic variation in human genomes and should be included in investigations of genetic factors associated with phenotypic traits, specifically those associated with neurological disorders. We make available the long and short-read sequence data of the supercentenarian genome, and a compendium of SVs as identified across 5 human haplotypes.
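          The thresholds quoted in the abstract (GC-content >60%, repeat pattern longer than 15 bp, within 5 Mb of the ends of chromosome arms) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Vntr` record, its field names, and the helper functions are hypothetical. It also assumes that GC-content is measured on the repeat motif and that "ends of chromosome arms" means the two ends of the chromosome sequence; the article may define either differently.

```python
from dataclasses import dataclass

# Thresholds taken from the abstract; everything else here is illustrative.
SUBTELOMERE_BP = 5_000_000  # "within 5 Mb of the ends of chromosome arms"
MIN_GC = 0.60               # GC-content > 60%
MIN_MOTIF_BP = 15           # repeat pattern longer than 15 bp


@dataclass
class Vntr:
    """Hypothetical record for one VNTR call (not a format from the paper)."""
    chrom_length: int  # chromosome length in bp
    start: int         # 0-based start coordinate
    end: int           # exclusive end coordinate
    motif: str         # sequence of the repeat unit


def gc_fraction(seq: str) -> float:
    """Fraction of G/C bases in seq; returns 0.0 for an empty sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def is_subtelomeric(v: Vntr) -> bool:
    """True if the VNTR overlaps the first or last 5 Mb of the chromosome.

    Assumption: chromosome ends stand in for the ends of chromosome arms.
    """
    return v.start < SUBTELOMERE_BP or v.end > v.chrom_length - SUBTELOMERE_BP


def is_gc_rich_long_motif(v: Vntr) -> bool:
    """True if the motif is GC-rich (>60%) and longer than 15 bp.

    Assumption: GC-content is computed on the motif, not the full repeat tract.
    """
    return gc_fraction(v.motif) > MIN_GC and len(v.motif) > MIN_MOTIF_BP


# Made-up example: a 17 bp GC-rich motif 1 Mb from the start of a 100 Mb chromosome.
example = Vntr(chrom_length=100_000_000, start=1_000_000, end=1_000_500,
               motif="GCGCGGCCGCGGGCAGC")
print(is_subtelomeric(example), is_gc_rich_long_motif(example))
```

          Counting how many VNTRs pass both checks inside and outside the subtelomeric windows, normalised by the size of each region, is one way an enrichment ratio such as the reported 21-fold figure could be computed.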


                Author and article information

                Contributors
                h.holstege@amsterdamumc.nl
                Journal
                Translational Psychiatry (Transl Psychiatry)
                Nature Publishing Group UK (London)
                ISSN: 2158-3188
                2 November 2020
                Volume 10, Article 369
                Affiliations
                [1] Department of Clinical Genetics, Amsterdam Neuroscience, Vrije Universiteit Amsterdam, Amsterdam UMC, Amsterdam, The Netherlands (GRID grid.484519.5)
                [2] Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands (GRID grid.5292.c, ISNI 0000 0001 2097 4740)
                [3] Department of Human Genetics, KU Leuven, Leuven, Belgium (GRID grid.5596.f, ISNI 0000 0001 0668 7884)
                [4] Pacific Biosciences, Menlo Park, CA, USA (GRID grid.423340.2, ISNI 0000 0004 0640 9878)
                [5] Alzheimer Center Amsterdam, Department of Neurology, Amsterdam Neuroscience, Vrije Universiteit Amsterdam, Amsterdam UMC, Amsterdam, The Netherlands (GRID grid.484519.5)
                Author information
                http://orcid.org/0000-0001-8019-4893
                http://orcid.org/0000-0003-3047-4250
                http://orcid.org/0000-0002-1148-1562
                http://orcid.org/0000-0002-7688-3087
                Article
                1060
                DOI: 10.1038/s41398-020-01060-5
                PMCID: PMC7608644
                PMID: 33139705
                ddf6101a-d874-4882-b623-3354921cdfdc
                © The Author(s) 2020

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                : 1 August 2020
                : 27 August 2020
                : 22 September 2020
                Funding
                Funded by: Stichting Dioraphte (Dioraphte Foundation), FundRef https://doi.org/10.13039/501100010573
                Funded by: Alzheimer Nederland (Alzheimer Netherlands), FundRef https://doi.org/10.13039/501100010969; Award ID: WE09.2014-03
                This work was supported by Stichting Alzheimer Nederland (WE09.2014-03), Stichting Dioraphte, Horstingstuit foundation, Stichting VUmc Fonds, and Stichting Universiteitsfonds Delft.
                Clinical Psychology & Psychiatry
                clinical genetics