
      Pool-hmm: a Python program for estimating the allele frequency spectrum and detecting selective sweeps from next generation sequencing of pooled samples


          Abstract

          Due to its cost effectiveness, next generation sequencing of pools of individuals (Pool-Seq) is becoming a popular strategy for genome-wide estimation of allele frequencies in population samples. As the allele frequency spectrum provides information about past episodes of selection, Pool-Seq is also a promising design for genomic scans for selection. However, no software tool has yet been developed for selection scans based on Pool-Seq data. We introduce Pool-hmm, a Python program for the estimation of allele frequencies and the detection of selective sweeps in a Pool-Seq sample. Pool-hmm includes several options that allow a flexible analysis of Pool-Seq data, and can be run in parallel on several processors. Source code and documentation for Pool-hmm are freely available at https://qgsp.jouy.inra.fr/.
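To make the idea of estimating allele frequencies from pooled reads concrete, here is a minimal sketch. It is not Pool-hmm's actual model or interface: the function names, the symmetric per-read error rate, and the uniform prior over allele counts are all assumptions made for illustration. For a single site, it computes the posterior probability of each possible derived-allele count in the pool, given how many reads carry the derived allele.

```python
from math import comb


def allele_count_likelihoods(derived_reads, total_reads, n_chromosomes, error_rate=0.01):
    """Binomial likelihood of the read counts for each derived-allele count k.

    Hypothetical helper: assumes reads are drawn independently from the pool
    and that sequencing errors swap the two alleles with probability error_rate.
    """
    likelihoods = []
    for k in range(n_chromosomes + 1):
        freq = k / n_chromosomes
        # Probability that one read shows the derived allele, given k.
        p = freq * (1 - error_rate) + (1 - freq) * error_rate
        likelihoods.append(
            comb(total_reads, derived_reads)
            * p ** derived_reads
            * (1 - p) ** (total_reads - derived_reads)
        )
    return likelihoods


def posterior_allele_counts(derived_reads, total_reads, n_chromosomes, error_rate=0.01):
    """Posterior over k = 0..n_chromosomes under a uniform prior (an assumption)."""
    lik = allele_count_likelihoods(derived_reads, total_reads, n_chromosomes, error_rate)
    total = sum(lik)
    return [value / total for value in lik]


if __name__ == "__main__":
    # 5 of 20 reads carry the derived allele in a pool of 8 chromosomes.
    posterior = posterior_allele_counts(5, 20, 8)
    best_k = max(range(len(posterior)), key=posterior.__getitem__)
    print("most probable derived-allele count:", best_k)
```

Replacing the uniform prior with a genome-wide allele frequency spectrum would sharpen the per-site posteriors, which is one reason a spectrum estimate is useful in this setting.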


          Most cited references (8)


          Genomic scans for selective sweeps using SNP data.

          Detecting selective sweeps from genomic SNP data is complicated by the intricate ascertainment schemes used to discover SNPs, and by the confounding influence of the underlying complex demographics and varying mutation and recombination rates. Current methods for detecting selective sweeps have little or no robustness to the demographic assumptions and varying recombination rates, and provide no method for correcting for ascertainment biases. Here, we present several new tests aimed at detecting selective sweeps from genomic SNP data. Using extensive simulations, we show that a new parametric test, based on composite likelihood, has a high power to detect selective sweeps and is surprisingly robust to assumptions regarding recombination rates and demography (i.e., has low Type I error). Our new test also provides estimates of the location of the selective sweep(s) and the magnitude of the selection coefficient. To illustrate the method, we apply our approach to data from the Seattle SNP project and to Chromosome 2 data from the HapMap project. In Chromosome 2, the most extreme signal is found in the lactase gene, which previously has been shown to be undergoing positive selection. Evidence for selective sweeps is also found in many other regions, including genes known to be associated with disease risk such as DPP10 and COL4A3.

            The next generation of molecular markers from massively parallel sequencing of pooled DNA samples.

            Next generation sequencing (NGS) is about to revolutionize genetic analysis. Currently NGS techniques are mainly used to sequence individual genomes. Due to the high sequence coverage required, the costs for population-scale analyses are still too high to allow an extension to nonmodel organisms. Here, we show that NGS of pools of individuals is often more effective in SNP discovery and provides more accurate allele frequency estimates, even when taking sequencing errors into account. We modify the population genetic estimators Tajima's π and Watterson's θ to obtain unbiased estimates from NGS pooling data. Given the same sequencing effort, the resulting estimators often show a better performance than those obtained from individual sequencing. Although our analysis also shows that NGS of pools of individuals will not be preferable under all circumstances, it provides a cost-effective approach to estimate allele frequencies on a genome-wide scale.
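For orientation, the two estimators named in this abstract can be sketched in their classical form, which assumes individually sequenced chromosomes. This is not the pool-corrected version the cited paper derives; the function names and the input format (one derived-allele count per site) are assumptions made for illustration.

```python
def watterson_theta(derived_counts, n_chromosomes):
    """Classical Watterson estimator: segregating sites S divided by a_n."""
    # A site is segregating when both alleles are present in the sample.
    segregating = sum(1 for k in derived_counts if 0 < k < n_chromosomes)
    # a_n is the harmonic number sum_{i=1}^{n-1} 1/i.
    a_n = sum(1.0 / i for i in range(1, n_chromosomes))
    return segregating / a_n


def tajima_pi(derived_counts, n_chromosomes):
    """Classical Tajima estimator: mean number of pairwise differences."""
    n = n_chromosomes
    pairs = n * (n - 1) / 2
    # At each site, k * (n - k) of the chromosome pairs differ.
    return sum(k * (n - k) for k in derived_counts) / pairs


if __name__ == "__main__":
    counts = [1, 2]  # derived-allele counts at two sites, 4 chromosomes sampled
    print(watterson_theta(counts, 4), tajima_pi(counts, 4))
```

With pooled data the derived-allele counts are not observed directly, only read counts, which is why the classical formulas need the correction described above.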

              OmegaPlus: a scalable tool for rapid detection of selective sweeps in whole-genome datasets.

                Recent advances in sequencing technologies have led to the rapid accumulation of molecular sequence data. Analyzing whole-genome data (as obtained from next-generation sequencers) from intra-species samples allows the detection of signatures of positive selection along the genome and therefore the identification of potentially advantageous genes in the course of the evolution of a population. We introduce OmegaPlus, an open-source tool for rapid detection of selective sweeps in whole-genome data based on linkage disequilibrium. The tool is up to two orders of magnitude faster than existing programs for this purpose and also exhibits up to two orders of magnitude smaller memory requirements. OmegaPlus is available under GNU GPL at http://www.exelixis-lab.org/software.html.

                Author and article information

                Journal
                Molecular Ecology Resources (Mol Ecol Resour)
                Blackwell Publishing Ltd
                ISSN: 1755-098X, 1755-0998
                March 2013
                11 January 2013
                Volume 13, Issue 2, Pages 337-340
                Affiliations
                [1 ]Laboratoire de Génétique Cellulaire, INRA 24 Chemin de Borde Rouge, Auzeville CS 52627, Castanet Tolosan Cedex, 31326, France
                [2 ]Institut für Populationsgenetik, Vetmeduni Vienna Veterinärplatz 1, Wien, A-1210, Austria
                [3 ]Institute of Statistics and Operations Research, University of Vienna Universitätsstrasse 5/9, Wien, A-1010, Austria
                Author notes
                Correspondence: Simon Boitard, Fax: +33 561285308; E-mail: simon.boitard@toulouse.inra.fr
                Article
                DOI: 10.1111/1755-0998.12063
                PMCID: PMC3592992
                PMID: 23311589
                © 2013 Blackwell Publishing Ltd

                Re-use of this article is permitted in accordance with the Creative Commons Deed, Attribution 2.5, which does not permit commercial exploitation.

                History
                Received: 10 July 2012
                Revised: 26 November 2012
                Accepted: 29 November 2012
                Categories
                Resource Articles

                Ecology
                allele frequency spectrum, hidden Markov models, next generation sequencing, pooled DNA, selective sweeps
