      Finding Single Copy Genes Out of Sequenced Genomes for Multilocus Phylogenetics in Non-Model Fungi

      research-article

          Abstract

          Historically, fungal multigene phylogenies have been reconstructed based on a small number of commonly used genes. The availability of complete fungal genomes has given rise to a new wave of model organisms that provide a large number of genes potentially useful for building robust gene genealogies. Unfortunately, cross-utilization of these resources to study phylogenetic relationships in the vast majority of non-model fungi (i.e. “orphan” species) remains unexamined. To address this problem, we developed a method coupled with a program named “PHYLORPH” (PHYLogenetic markers for ORPHans). The method screens fungal genomic databases (107 fully sequenced fungal genomes) for single copy genes that might be easily transferable and well suited for studies at low taxonomic levels (for example, in species complexes) in non-model fungal species. To maximize the chance of targeting genes with informative regions, PHYLORPH displays a graphical evaluation system based on the estimation of nucleotide divergence relative to substitution type. The usefulness of this approach was tested by developing markers in four non-model groups of fungal pathogens. For each pathogen considered, 7 to 40% of the 10–15 best candidate genes proposed by PHYLORPH yielded sequencing success. Levels of polymorphism of these genes were compared with those obtained for some genes traditionally used to build fungal phylogenies (e.g. nuclear rDNA, β-tubulin, γ-actin, Elongation factor EF-1α). These genes were ranked among the best-performing ones and accurately resolved relationships among taxa in each of the four non-model groups of fungi considered. We envision that PHYLORPH will constitute a useful tool for obtaining new and accurate phylogenetic markers to resolve relationships between closely related non-model fungal species.
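The two ideas in the abstract, screening for single copy genes and relating nucleotide divergence to substitution type, can be illustrated with a minimal Python sketch. This is not PHYLORPH's code: the function names, the input layout (ortholog groups as lists of `(genome, gene)` pairs) and the use of an uncorrected p-distance are assumptions made only for illustration.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def single_copy_groups(groups, genomes):
    """Return ids of ortholog groups with exactly one gene per genome.

    `groups` maps a group id to a list of (genome, gene) pairs.
    This layout is hypothetical, not PHYLORPH's actual input format.
    """
    wanted = set(genomes)
    selected = []
    for group_id, members in groups.items():
        counts = Counter(genome for genome, _gene in members)
        # Keep the group only if every genome is present exactly once.
        if set(counts) == wanted and all(c == 1 for c in counts.values()):
            selected.append(group_id)
    return selected


def substitution_profile(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    Returns (transitions, transversions, p_distance), where p_distance is
    the uncorrected proportion of differing sites among compared sites.
    Sites with gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    p_distance = (transitions + transversions) / compared if compared else 0.0
    return transitions, transversions, p_distance
```

Plotting the transition and transversion counts against divergence for many sequence pairs gives one common way to inspect a candidate gene for substitution saturation; whether PHYLORPH's graphical evaluation works exactly this way is not stated in the abstract.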

          Most cited references (76)

          OrthoMCL: identification of ortholog groups for eukaryotic genomes.

          The identification of orthologous groups is useful for genome annotation, studies on gene/protein evolution, comparative genomics, and the identification of taxonomically restricted sequences. Methods successfully exploited for prokaryotic genome analysis have proved difficult to apply to eukaryotes, however, as larger genomes may contain multiple paralogous genes, and sequence information is often incomplete. OrthoMCL provides a scalable method for constructing orthologous groups across multiple eukaryotic taxa, using a Markov Cluster algorithm to group (putative) orthologs and paralogs. This method performs similarly to the INPARANOID algorithm when applied to two genomes, but can be extended to cluster orthologs from multiple species. OrthoMCL clusters are coherent with groups identified by EGO, but improved recognition of "recent" paralogs permits overlapping EGO groups representing the same gene to be merged. Comparison with previously assigned EC annotations suggests a high degree of reliability, implying utility for automated eukaryotic genome annotation. OrthoMCL has been applied to the proteome data set from seven publicly available genomes (human, fly, worm, yeast, Arabidopsis, the malaria parasite Plasmodium falciparum, and Escherichia coli). A Web interface allows queries based on individual genes or user-defined phylogenetic patterns (http://www.cbil.upenn.edu/gene-family). Analysis of clusters incorporating P. falciparum genes identifies numerous enzymes that were incompletely annotated in first-pass annotation of the parasite genome.

            Phylogenetic species recognition and species concepts in fungi.

            The operational species concept, i.e., the one used to recognize species, is contrasted to the theoretical species concept. A phylogenetic approach to recognize fungal species based on concordance of multiple gene genealogies is compared to those based on morphology and reproductive behavior. Examples where Phylogenetic Species Recognition has been applied to fungi are reviewed and concerns regarding Phylogenetic Species Recognition are discussed.

              Assessing Performance of Orthology Detection Strategies Applied to Eukaryotic Genomes

              Orthology detection is critically important for accurate functional annotation, and has been widely used to facilitate studies on comparative and evolutionary genomics. Although various methods are now available, there has been no comprehensive analysis of performance, due to the lack of a genomic-scale ‘gold standard’ orthology dataset. Even in the absence of such datasets, the comparison of results from alternative methodologies contains useful information, as agreement enhances confidence and disagreement indicates possible errors. Latent Class Analysis (LCA) is a statistical technique that can exploit this information to reasonably infer sensitivities and specificities, and is applied here to evaluate the performance of various orthology detection methods on a eukaryotic dataset. Overall, we observe a trade-off between sensitivity and specificity in orthology detection, with BLAST-based methods characterized by high sensitivity, and tree-based methods by high specificity. Two algorithms exhibit the best overall balance, with both sensitivity and specificity > 80%: INPARANOID identifies orthologs across two species while OrthoMCL clusters orthologs from multiple species. Among methods that permit clustering of ortholog groups spanning multiple genomes, the (automated) OrthoMCL algorithm exhibits better within-group consistency with respect to protein function and domain architecture than the (manually curated) KOG database, and the homolog clustering algorithm TribeMCL as well. By way of using LCA, we are also able to comprehensively assess similarities and statistical dependence between various strategies, and evaluate the effects of parameter settings on performance. In summary, we present a comprehensive evaluation of orthology detection on a divergent set of eukaryotic genomes, thus providing insights and guides for method selection, tuning and development for different applications.

              Many biological questions have been addressed by multiple tests yielding binary (yes/no) outcomes but no clear definition of truth, making LCA an attractive approach for computational biology.
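For readers unfamiliar with the two measures quoted above, the following Python sketch shows how sensitivity and specificity are defined when a reference set is available. It does not implement Latent Class Analysis, which estimates these quantities without a gold standard; the function name and the idea of a known `truth` set are assumptions used purely to illustrate the definitions.

```python
def sensitivity_specificity(candidates, predicted, truth):
    """Compute sensitivity and specificity over a set of candidate pairs.

    candidates: all gene pairs that were evaluated
    predicted:  pairs a method called orthologous
    truth:      pairs assumed to be truly orthologous (hypothetical reference)
    """
    candidates = set(candidates)
    predicted = set(predicted) & candidates
    truth = set(truth) & candidates

    tp = len(predicted & truth)               # called ortholog, is ortholog
    fn = len(truth - predicted)               # missed ortholog
    fp = len(predicted - truth)               # called ortholog, is not
    tn = len(candidates - predicted - truth)  # correctly rejected

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```

A method that calls many pairs orthologous tends to raise sensitivity at the cost of specificity, which is the trade-off the abstract describes between BLAST-based and tree-based methods.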

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 13 April 2011
                Volume 6, Issue 4, e18803
                Affiliations
                [1] INRA, UMR1202, BIOGECO (Biodiversité Gènes et Communautés), Cestas, France
                [2] INRA, Nancy Université, UMR1136 Interactions Arbres/Microorganismes (IFR110), Champenoux, France
                University College Dublin, Ireland
                Author notes

                Conceived and designed the experiments: NF. Performed the experiments: NF TD CH. Analyzed the data: NF TD CH. Contributed reagents/materials/analysis tools: NF M-LD-L CD. Wrote the paper: NF. Supervised the project and improved the manuscript: M-LD-L CD. Designed and tested the program: NF.

                Article
                Manuscript: PONE-D-10-04683
                DOI: 10.1371/journal.pone.0018803
                PMCID: PMC3076447
                PMID: 21533204
                963fecfe-3038-498d-b0d4-78107ee88078
                Feau et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 10 November 2010
                Accepted: 18 March 2011
                Page count
                Pages: 17
                Categories
                Research Article
                Biology
                Evolutionary Biology
                Evolutionary Systematics
                Molecular Systematics
                Phylogenetics
                Comparative Genomics
                Genomics
                Genome Sequencing
