      Is Open Access

      Admixture has obscured signals of historical hard sweeps in humans

      research-article


          Abstract

          The role of natural selection in shaping biological diversity is an area of intense interest in modern biology. To date, studies of positive selection have primarily relied on genomic datasets from contemporary populations, which are susceptible to confounding factors associated with complex and often unknown aspects of population history. In particular, admixture between diverged populations can distort or hide prior selection events in modern genomes, though this process is not explicitly accounted for in most selection studies despite its apparent ubiquity in humans and other species. Through analyses of ancient and modern human genomes, we show that previously reported Holocene-era admixture has masked more than 50 historic hard sweeps in modern European genomes. Our results imply that this canonical mode of selection has probably been underappreciated in the evolutionary history of humans and suggest that our current understanding of the tempo and mode of selection in natural populations may be inaccurate.

          Abstract

          Through analyses of ancient and modern human genomes, the authors show that previously reported Holocene-era admixture has masked more than 50 historic hard sweeps in modern European genomes.
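          As a rough illustration of why admixture can hide a past sweep, the sketch below computes the expected allele frequency immediately after admixture as the proportion-weighted mean of the source-population frequencies. The function name, the three source frequencies and the equal admixture proportions are hypothetical values chosen for illustration, not figures from the article, and the calculation ignores drift and selection after admixture. It is not the authors' detection method.

```python
def admixed_frequency(source_freqs, proportions):
    """Expected allele frequency right after admixture.

    Simple linear mixing: the proportion-weighted mean of the allele
    frequencies in the source populations (no drift, no selection).
    """
    if len(source_freqs) != len(proportions):
        raise ValueError("need one admixture proportion per source population")
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("admixture proportions must sum to 1")
    return sum(f * a for f, a in zip(source_freqs, proportions))


# Hypothetical example: the allele was swept to fixation (1.0) in one
# source population but stayed rare (0.1 and 0.2) in the other two.
# All numbers here are assumptions for illustration only.
freq = admixed_frequency([1.0, 0.1, 0.2], [1 / 3, 1 / 3, 1 / 3])
print(f"frequency in the admixed population: {freq:.3f}")
```

          Under these assumed values, an allele that was fixed in one source population sits at an intermediate frequency in the admixed population, so it no longer carries the near-fixation signal that hard-sweep scans of modern genomes typically look for.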


                Author and article information

                Contributors
                yassine.souilmi@adelaide.edu.au
                raymond.tobler@adelaide.edu.au
                johar.angad@mayo.edu
                alanjcooper42@gmail.com
                cdh5313@psu.edu
                Journal
                Nature Ecology & Evolution (Nat Ecol Evol)
                Nature Publishing Group UK (London)
                ISSN: 2397-334X
                Published: 31 October 2022
                Volume 6, issue 12, pages 2003-2015
                Affiliations
                [1] Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia (GRID grid.1010.0; ISNI 0000 0004 1936 7304)
                [2] Evolution of Cultural Diversity Initiative, Australian National University, Canberra, Australian Capital Territory, Australia (GRID grid.1001.0; ISNI 0000 0001 2180 7477)
                [3] Department of Cardiovascular Diseases, Mayo Clinic, Rochester, MN, USA (GRID grid.66875.3a; ISNI 0000 0004 0459 167X)
                [4] Transplantation Immunology Group, Immunology Division, Garvan Institute of Medical Research, Darlinghurst, New South Wales, Australia (GRID grid.415306.5; ISNI 0000 0000 9983 6924)
                [5] St Vincent’s Clinical School, Faculty of Medicine, UNSW, Darlinghurst, New South Wales, Australia (GRID grid.1005.4; ISNI 0000 0004 4902 0432)
                [6] ARC Centre of Excellence for Mathematical and Statistical Frontiers, The University of Adelaide, Adelaide, South Australia, Australia (GRID grid.1010.0; ISNI 0000 0004 1936 7304)
                [7] Department of Archaeogenetics, Max Planck Institute for the Science of Human History, Jena, Germany (GRID grid.469873.7; ISNI 0000 0004 4914 1197)
                [8] School of Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia (GRID grid.1010.0; ISNI 0000 0004 1936 7304)
                [9] Chronos 14Carbon-Cycle Facility and Earth and Sustainability Science Research Centre, University of New South Wales, Sydney, New South Wales, Australia (GRID grid.1005.4; ISNI 0000 0004 4902 0432)
                [10] Statistics and Bioinformatics Group, School of Fundamental Sciences, Massey University, Palmerston North, New Zealand (GRID grid.148374.d; ISNI 0000 0001 0696 9806)
                [11] South Australian Museum, Adelaide, South Australia, Australia (GRID grid.437963.c; ISNI 0000 0001 1349 5098)
                [12] BlueSky Genetics, Ashton, South Australia, Australia
                [13] Department of Biology, Penn State University, University Park, PA, USA (GRID grid.29857.31; ISNI 0000 0001 2097 4281)
                Author information
                http://orcid.org/0000-0001-7543-4864
                http://orcid.org/0000-0002-4603-1473
                http://orcid.org/0000-0001-9698-3352
                http://orcid.org/0000-0002-5862-7389
                http://orcid.org/0000-0001-6417-4702
                http://orcid.org/0000-0002-4204-5018
                http://orcid.org/0000-0002-1688-8951
                http://orcid.org/0000-0002-6197-3872
                http://orcid.org/0000-0001-6733-0993
                http://orcid.org/0000-0002-2267-2604
                Article
                Article number: 1914
                DOI: 10.1038/s41559-022-01914-9
                PMC ID: 9715430
                PMID: 36316412
                4126d2c7-7ea0-40a2-837c-ea2f71d68650
                © The Author(s) 2022

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 21 November 2021
                Accepted: 16 September 2022
                Funding
                Funded by: Australian Research Council, DE190101069
                Funded by: Australian Research Council, FL140100260
                Funded by: Australian Research Council, DE180100883
                Categories
                Article
                Custom metadata
                © The Author(s), under exclusive licence to Springer Nature Limited 2022

                molecular evolution, population genetics
