      A Census of Tandemly Repeated Polymorphic Loci in Genic Regions Through the Comparative Integration of Human Genome Assemblies

      methods-article


          Abstract

          Polymorphic Tandem Repeats (PTR) are a common form of polymorphism in the human genome. A PTR is a Tandem Repeat (TR) locus at which the number of repeating units in an individual (or in a population) differs from that in the reference genome. Several phenotypic traits and diseases have been found to be strongly associated with, or caused by, specific PTR loci. PTR are divided into two main classes: Short Tandem Repeats (STR), whose repeating unit is up to 6 base pairs long, and Variable Number Tandem Repeats (VNTR), whose repeating unit is longer than 6 base pairs. As ever larger populations are screened in high-throughput sequencing projects, it becomes technically feasible and desirable to explore the association between PTR and a wide range of such traits and conditions. To facilitate these studies, we have devised a method for compiling catalogs of PTR from assembled genomes, and we have produced a catalog of PTR for the genic regions (exons, introns, UTR and adjacent regions) of the human genome (GRCh38). In the first phase, we applied four different TR discovery software tools and uncovered 55,223,485 TR (after duplicate removal) in GRCh38; in the second phase, comparison with five assembled human genomes determined 373,173 of these to be PTR. Of these PTR, 263,266 are not included in state-of-the-art PTR catalogs. The new methodology is based mainly on a hierarchical and systematic application of alignment-based sequence comparisons to identify TR and measure their polymorphism. While previous catalogs focus on STR of small total size, we remove any size restriction, aiming at the more general class of PTR, and we also target fuzzy TR by using specific detection tools. Like previous catalogs of human polymorphic loci, our catalog is geared toward the discovery of disease-associated loci. Validation by cross-referencing with existing catalogs on common clinically relevant loci shows good concordance. Overall, the proposed census of human PTR in genic regions is a shared, web-accessible resource, complementary to existing catalogs, that facilitates future genome-wide studies involving PTR.
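The two definitions in the abstract (the STR/VNTR split at a 6 bp repeating unit, and a PTR as a locus whose unit count differs from the reference) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract describes alignment-based comparisons that also handle fuzzy repeats, whereas the sketch below counts only exact, back-to-back copies. The function names and the example sequences are hypothetical and chosen purely for illustration.

```python
STR_MAX_UNIT_BP = 6  # threshold from the abstract: units of up to 6 bp are STR


def repeat_class(unit: str) -> str:
    """Return 'STR' for repeating units of up to 6 bp, 'VNTR' for longer units."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    return "STR" if len(unit) <= STR_MAX_UNIT_BP else "VNTR"


def copy_number(locus: str, unit: str) -> int:
    """Count exact, consecutive copies of `unit` starting at the first base of `locus`.

    Simplifying assumption: exact matching only. Fuzzy (imperfect) repeats,
    which the catalog explicitly targets, would need alignment-based scoring.
    """
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    count = 0
    pos = 0
    while locus.startswith(unit, pos):
        count += 1
        pos += len(unit)
    return count


def is_polymorphic(unit: str, reference_locus: str, other_loci: list[str]) -> bool:
    """Flag a locus as polymorphic if any other assembly has a different copy number."""
    ref_copies = copy_number(reference_locus, unit)
    return any(copy_number(locus, unit) != ref_copies for locus in other_loci)


if __name__ == "__main__":
    unit = "CAG"                      # hypothetical 3 bp unit
    reference = "CAG" * 5 + "TTA"     # hypothetical reference allele
    assemblies = ["CAG" * 5 + "TTA", "CAG" * 8 + "TTA"]
    print(repeat_class(unit))
    print(copy_number(reference, unit))
    print(is_polymorphic(unit, reference, assemblies))
```

In this toy setting a locus is reported as polymorphic as soon as one compared assembly carries a different number of exact copies; a real measure of polymorphism would have to tolerate mismatches and indels inside the repeat.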


                Author and article information

                Journal
                Frontiers in Genetics (Front. Genet.)
                Publisher: Frontiers Media S.A.
                ISSN: 1664-8021
                Published: 02 May 2018
                Volume: 9
                Article: 155
                Affiliations
                [1] Institute for Informatics and Telematics of CNR, Pisa, Italy
                [2] Department of Health Sciences, University of Eastern Piedmont Amedeo Avogadro, Novara, Italy
                [3] Institute for Biomedical Technologies of CNR, Segrate, Italy
                [4] Department of Science and Technological Innovation, University of Eastern Piedmont Amedeo Avogadro, Novara, Italy
                Author notes

                Edited by: Max A. Alekseyev, George Washington University, United States

                Reviewed by: Ibrokhim Abdurakhmonov, Academy of Science of Uzbekistan, Uzbekistan; Enrique Medina-Acosta, State University of Norte Fluminense, Brazil; Dapeng Wang, University of Oxford, United Kingdom; Arthur Gruber, Universidade de São Paulo, Brazil

                *Correspondence: Marco Pellegrini, marco.pellegrini@iit.cnr.it

                This article was submitted to Bioinformatics and Computational Biology, a section of the journal Frontiers in Genetics

                Article
                DOI: 10.3389/fgene.2018.00155
                PMCID: PMC5941971
                PMID: 29770143
                Copyright © 2018 Genovese, Geraci, Corrado, Mangano, D'Aurizio, Bordoni, Severgnini, Manzini, De Bellis, D'Alfonso and Pellegrini.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 18 October 2017
                Accepted: 13 April 2018
                Page count
                Figures: 12, Tables: 3, Equations: 0, References: 62, Pages: 21, Words: 16034
                Categories
                Genetics
                Methods

                Keywords
                variable number tandem repeats, short tandem repeats, polymorphic tandem repeats, genic regions, catalog, tandem repeat detection tools, fuzzy tandem repeats, measure of polymorphism
