      Recurrent repeat expansions in human cancer genomes

      research-article


          Abstract

          Expansion of a single repetitive DNA sequence, termed a tandem repeat (TR), is known to cause more than 50 diseases [1, 2]. However, repeat expansions are often not explored beyond neurological and neurodegenerative disorders. In some cancers, mutations accumulate in short tracts of TRs, a phenomenon termed microsatellite instability; however, larger repeat expansions have not been systematically analysed in cancer [3–8]. Here we identified TR expansions in 2,622 cancer genomes spanning 29 cancer types. In seven cancer types, we found 160 recurrent repeat expansions (rREs), most of which (155/160) were subtype specific. We found that rREs were non-uniformly distributed in the genome with enrichment near candidate cis-regulatory elements, suggesting a potential role in gene regulation. One rRE, a GAAA-repeat expansion, located near a regulatory element in the first intron of UGT2B7, was detected in 34% of renal cell carcinoma samples and was validated by long-read DNA sequencing. Moreover, in preliminary experiments, treating cells that harbour this rRE with a GAAA-targeting molecule led to a dose-dependent decrease in cell proliferation. Overall, our results suggest that rREs may be an important but unexplored source of genetic variation in human cancer, and we provide a comprehensive catalogue for further study.
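
          To make the notion of a tandem-repeat tract concrete, the following is a minimal illustrative sketch in Python. It is not the detection method used in the article, which the abstract does not describe. The function name `longest_tandem_run` and both sequences are hypothetical and synthetic; the sketch only counts back-to-back copies of a motif such as GAAA in a string, and it assumes a motif that does not overlap itself.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of `motif` in `sequence`.

    Illustrative only: exact matching, no sequencing errors, no interruptions,
    and a motif whose prefix does not equal its suffix (true for GAAA).
    """
    motif = motif.upper()
    # One or more copies of the motif placed directly after each other.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = (
        len(match.group(0)) // len(motif)
        for match in pattern.finditer(sequence.upper())
    )
    return max(runs, default=0)


# Synthetic toy sequences (assumptions, not data from the article).
reference = "TTC" + "GAAA" * 5 + "CCT"   # short tract
expanded = "TTC" + "GAAA" * 40 + "CCT"   # much longer tract

print(longest_tandem_run(reference, "GAAA"))  # 5
print(longest_tandem_run(expanded, "GAAA"))   # 40
```

          In this toy setting, an expansion simply shows up as a much larger copy count in one sequence than in the other; real analyses work from sequencing reads and need to handle errors and imperfect repeats.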

          Abstract

          An atlas explores the landscape of recurrent repeat expansions in human cancer genomes.


                Author and article information

                Contributors
                gerwin@stanford.edu
                mark@gersteinlab.org
                mpsnyder@stanford.edu
                Journal
                Nature
                Nature Publishing Group UK (London)
                ISSN: 0028-0836, 1476-4687
                14 December 2022
                2023; 613 (7942): 96-102
                Affiliations
                [1] Department of Genetics, Stanford University, Stanford, CA, USA (GRID grid.168010.e; ISNI 0000000419368956)
                [2] Department of Biomedical Informatics, Columbia University, New York, NY, USA (GRID grid.21729.3f; ISNI 0000000419368729)
                [3] New York Genome Center, New York, NY, USA (GRID grid.429884.b; ISNI 0000 0004 1791 0895)
                [4] Illumina, Inc., San Diego, CA, USA (GRID grid.185669.5; ISNI 0000 0004 0507 3954)
                [5] Division of Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA (GRID grid.168010.e; ISNI 0000000419368956)
                [6] Data Science Program, Northwestern University, Chicago, IL, USA (GRID grid.16753.36; ISNI 0000 0001 2299 3507)
                [7] Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada (GRID grid.42327.30; ISNI 0000 0004 0473 9646)
                [8] Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada (GRID grid.17063.33; ISNI 0000 0001 2157 2938)
                [9] Department of Urology, Stanford University School of Medicine, Stanford, CA, USA (GRID grid.168010.e; ISNI 0000000419368956)
                [10] Veterans Affairs Palo Alto Health Care System, Palo Alto, CA, USA (GRID grid.280747.e; ISNI 0000 0004 0419 2556)
                [11] Division of Nephrology, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA (GRID grid.168010.e; ISNI 0000000419368956)
                [12] Computational Biology and Bioinformatics Program, Yale University, New Haven, CT, USA (GRID grid.47100.32; ISNI 0000000419368710)
                [13] Molecular Biophysics and Biochemistry Department, Yale University, New Haven, CT, USA (GRID grid.47100.32; ISNI 0000000419368710)
                [14] Department of Computer Science, Yale University, New Haven, CT, USA (GRID grid.47100.32; ISNI 0000000419368710)
                Author information
                http://orcid.org/0000-0002-9286-7626
                http://orcid.org/0000-0002-1352-8686
                http://orcid.org/0000-0002-5194-8454
                http://orcid.org/0000-0003-1769-4814
                http://orcid.org/0000-0001-7273-4968
                http://orcid.org/0000-0001-9980-3863
                http://orcid.org/0000-0001-8965-1253
                http://orcid.org/0000-0002-9746-3719
                http://orcid.org/0000-0003-0784-7987
                Article
                Article number: 5515
                DOI: 10.1038/s41586-022-05515-1
                PMCID: PMC9812771
                PMID: 36517591
                Record ID: a072e8e7-4258-4afb-ae0e-9276e97bd6aa
                © The Author(s) 2022

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 13 August 2021
                Accepted: 2 November 2022
                Categories
                Article
                Custom metadata
                © The Author(s), under exclusive licence to Springer Nature Limited 2023

                Uncategorized
                genome,genetic variation,cancer genomics