      Founder lineages in the Iberian Roma mitogenomes recapitulate the Roma diaspora and show the effects of demographic bottlenecks

      research-article


          Abstract

          The Roma are the largest ethnic minority in Europe. With a Northwestern Indian origin ~1.5 kya, they travelled through West Asia until their arrival in Europe around the eleventh century CE. Their diaspora through Europe is characterized by population bottlenecks and founder events, which have contributed to their present-day genetic and cultural diversity. In our study, we focus on founder effects in the mitochondrial DNA (mtDNA) pool of Iberian Roma by producing and analyzing 144 novel whole mtDNA sequences. Over 60% of their mtDNA pool is composed of founder lineages, either of South Asian origin or acquired by gene flow during their diaspora in the Middle East or locally in Europe. The time to the most recent common ancestor (TMRCA) of these lineages predates the historical record of the Roma arrival in Spain. The abundance of founder lineages contrasts with the ~0.7% of autochthonous founder lineages present in the non-Roma Iberian population. Within those founder lineages, we found a substantial number of South Asian M5a1b1a1 haplotypes and high frequencies of West Eurasian founder lineages (U3b1c, J2b1c, J1c1b, J1b3a, H88, among others), which we characterized phylogenetically and placed in phylogeographical context. In addition, we found no evidence of genetic substructure among Roma within the Iberian Peninsula. These results show the magnitude of founder effects in the Iberian Roma and further explain Roma history and genetic diversity from a matrilineal point of view.


                Author and article information

                Contributors
                david.comas@upf.edu
                Journal
                Sci Rep
                Scientific Reports
                Nature Publishing Group UK (London)
                2045-2322
                4 November 2022
                2022
                Volume: 12
                Article number: 18720
                Affiliations
                [1] Departament de Medicina i Ciències de la Vida, Institut de Biologia Evolutiva (CSIC-UPF), Universitat Pompeu Fabra, 08003 Barcelona, Spain (GRID grid.5612.0; ISNI 0000 0001 2172 2676)
                [2] Facultat de Sociologia, Universitat Autònoma de Barcelona, Barcelona, Spain (GRID grid.7080.f; ISNI 0000 0001 2296 0625)
                [3] Facultat de Geografia i Història, Universitat de Barcelona, Barcelona, Spain (GRID grid.5841.8; ISNI 0000 0004 1937 0247)
                Article
                23349
                DOI: 10.1038/s41598-022-23349-9
                9636147
                36333436
                cf1fd928-8e46-4791-9247-6d99253db941
                © The Author(s) 2022

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 15 June 2022
                Accepted: 30 October 2022
                Funding
                Funded by: Spanish Ministry of Economy and Competitiveness
                Award ID: CGL2016-75389-P (MINEICO/FEDER, UE)
                Funded by: Agència de Gestió d’Ajuts Universitaris i de la Recerca (Generalitat de Catalunya)
                Award ID: 2017SGR00702
                Categories
                Article
                Uncategorized
                population genetics, biological anthropology
