      Delimiting Species Using Single-Locus Data and the Generalized Mixed Yule Coalescent Approach: A Revised Method and Evaluation on Simulated Data Sets

      research-article
      Systematic Biology
      Oxford University Press

          Abstract

          DNA barcoding-type studies assemble single-locus data from large samples of individuals and species, and have provided new kinds of data for evolutionary surveys of diversity. An important goal of many such studies is to delimit evolutionarily significant species units, especially in biodiversity surveys from environmental DNA samples. The Generalized Mixed Yule Coalescent (GMYC) method is a likelihood method for delimiting species by fitting within- and between-species branching models to reconstructed gene trees. Although the method has been widely used, it has not previously been described in detail or evaluated fully against simulations of alternative scenarios of true patterns of population variation and divergence between species. Here, we present important reformulations to the GMYC method as originally specified, and demonstrate its robustness to a range of departures from its simplifying assumptions. The main factor affecting the accuracy of delimitation is the mean population size of species relative to divergence times between them. Other departures from the model assumptions, such as varying population sizes among species, alternative scenarios for speciation and extinction, and population growth or subdivision within species, have relatively smaller effects. Our simulations demonstrate that support measures derived from the likelihood function provide a robust indication of when the model performs well and when it leads to inaccurate delimitations. Finally, the so-called single-threshold version of the method outperforms the multiple-threshold version of the method on simulated data: we argue that this might represent a fundamental limit due to the nature of evidence used to delimit species in this approach. Together with other studies comparing its performance relative to other methods, our findings support the robustness of GMYC as a tool for delimiting species when only single-locus information is available. 
[Clusters; coalescent; DNA; genealogical; neutral; speciation; species.]
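The single-threshold idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a rooted, binary, ultrametric gene tree summarized only by the ages of its internal nodes, and a candidate threshold time separating older between-species branching from younger within-species coalescence. The function name `lineages_at_threshold` and the example node ages are hypothetical, chosen for illustration; the sketch only counts the lineages crossing a candidate threshold (the putative species implied by that threshold) and does not compute the GMYC likelihood or choose the threshold.

```python
from typing import Sequence


def lineages_at_threshold(node_ages: Sequence[float], threshold: float) -> int:
    """Count lineages crossing `threshold` in a rooted binary ultrametric tree.

    node_ages: ages (time before present) of all internal nodes.
    Starting from a single lineage above the root, every internal node
    older than the threshold splits one lineage into two, adding one.
    """
    return 1 + sum(1 for age in node_ages if age > threshold)


# Hypothetical tree with 8 tips: three deep splits and four shallow ones.
example_ages = [10.0, 6.0, 5.0, 0.4, 0.3, 0.2, 0.1]

# A threshold at age 1.0 lies below the three deep nodes only,
# so four lineages cross it (four putative species).
n_putative_species = lineages_at_threshold(example_ages, 1.0)
```

In the method itself the threshold would be chosen by fitting the branching models to the tree; here it is supplied by hand to show what a given threshold implies.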

                Author and article information

                Journal
                Systematic Biology (Syst Biol)
                Oxford University Press
                ISSN: 1063-5157, 1076-836X
                September 2013
                14 June 2013
                Volume: 62
                Issue: 5
                Pages: 707-724
                Affiliations
                1Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, Berkshire SL5 7PY, UK; and 2Department of Entomology, Natural History Museum, London SW7 5BD, UK
                Author notes

                Associate Editor: Richard Glor

                *Correspondence to be sent to: Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, Berkshire SL5 7PY, UK; E-mail: t.barraclough@imperial.ac.uk.
                Article
                syt033
                DOI: 10.1093/sysbio/syt033
                PMCID: 3739884
                PMID: 23681854
                © The Author(s) 2013. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                11 August 2012
                19 November 2012
                3 May 2013
                Page count
                Pages: 18
                Categories
                Regular Articles

                Animal science & Zoology
