      Is Open Access

      Lactobacillus paracasei Comparative Genomics: Towards Species Pan-Genome Definition and Exploitation of Diversity

      research-article


          Abstract

          Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its “pan-genome”. We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800–3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon) are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25 and 53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison were used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis, in order to link the distribution pattern of a specific phenotype to the presence/absence of specific sets of genes.
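
The gene-trait matching idea in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' actual procedure, so the scoring rule used here (the fraction of strains in which gene presence agrees with the phenotype) is an assumption, and the function name, ortholog group labels, and strain names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def gene_trait_matches(
    presence: Dict[str, Set[str]],   # ortholog group -> strains that carry it
    phenotype: Dict[str, bool],      # strain -> True if it grows on the sugar
    min_agreement: float = 1.0,
) -> List[Tuple[str, float]]:
    """Rank ortholog groups by how well presence/absence tracks a phenotype.

    The score is the fraction of strains where "gene present" equals
    "phenotype positive". This is a simple concordance measure chosen for
    illustration, not the method used in the paper.
    """
    strains = sorted(phenotype)
    results = []
    for group, carriers in presence.items():
        agree = sum((s in carriers) == phenotype[s] for s in strains)
        score = agree / len(strains)
        if score >= min_agreement:
            results.append((group, score))
    # Best score first, ties broken alphabetically by group name.
    return sorted(results, key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: four strains, three ortholog groups.
presence = {
    "OG_sugar_cassette": {"strainA", "strainC"},
    "OG_core": {"strainA", "strainB", "strainC", "strainD"},
    "OG_phage": {"strainB"},
}
phenotype = {"strainA": True, "strainB": False, "strainC": True, "strainD": False}

matches = gene_trait_matches(presence, phenotype)
print(matches)
```

In this toy data only the hypothetical sugar cassette matches the growth pattern perfectly. A core gene present in every strain carries no information about a variable phenotype, which is why such analyses focus on the variable part of the pan-genome.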

          Most cited references (81)


          Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial "pan-genome".

          The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and also limits genome-wide screens for vaccine candidates or for antimicrobial targets. We have generated the genomic sequence of six strains representing the five major disease-causing serotypes of Streptococcus agalactiae, the main cause of neonatal infection in humans. Analysis of these genomes and those available in databases showed that the S. agalactiae species can be described by a pan-genome consisting of a core genome shared by all isolates, accounting for approximately 80% of any single genome, plus a dispensable genome consisting of partially shared and strain-specific genes. Mathematical extrapolation of the data suggests that the gene reservoir available for inclusion in the S. agalactiae pan-genome is vast and that unique genes will continue to be identified even after sequencing hundreds of genomes.

            The microbial pan-genome.

            A decade after the beginning of the genomic era, the question of how genomics can describe a bacterial species has not been fully addressed. Experimental data have shown that in some species new genes are discovered even after sequencing the genomes of several strains. Mathematical modeling predicts that new genes will be discovered even after sequencing hundreds of genomes per species. Therefore, a bacterial species can be described by its pan-genome, which is composed of a "core genome" containing genes present in all strains, and a "dispensable genome" containing genes present in two or more strains and genes unique to single strains. Given that the number of unique genes is vast, the pan-genome of a bacterial species might be orders of magnitude larger than any single genome.
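
The definitions in this abstract translate directly into set operations. The following is a minimal sketch, assuming each genome has already been reduced to a set of gene-family identifiers; the function name and the toy identifiers are hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def partition_pan_genome(genomes: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split gene families into core, shared and unique sets.

    core   : present in every genome
    shared : present in two or more genomes, but not all
    unique : present in exactly one genome
    The "dispensable genome" of the abstract is shared plus unique.
    """
    if len(genomes) < 2:
        raise ValueError("need at least two genomes to separate core from unique")
    # Count in how many genomes each gene family occurs.
    counts = Counter(gene for genes in genomes.values() for gene in genes)
    n = len(genomes)
    pan = set(counts)
    core = {gene for gene, c in counts.items() if c == n}
    unique = {gene for gene, c in counts.items() if c == 1}
    shared = pan - core - unique
    return {"pan": pan, "core": core, "shared": shared, "unique": unique}


# Hypothetical toy input: three strains with a handful of gene families.
genomes = {
    "s1": {"a", "b", "c"},
    "s2": {"a", "b", "d"},
    "s3": {"a", "e"},
}
parts = partition_pan_genome(genomes)
dispensable = parts["shared"] | parts["unique"]
print({name: sorted(genes) for name, genes in parts.items()})
```

Real analyses first have to decide which genes from different strains count as the same family, which is the ortholog clustering step; this sketch takes that step as given.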

              Automatic clustering of orthologs and in-paralogs from pairwise species comparisons.

              Orthologs are genes in different species that originate from a single gene in the last common ancestor of these species. Such genes have often retained identical biological roles in the present-day organisms. It is hence important to identify orthologs for transferring functional information between genes in different organisms with a high degree of reliability. For example, orthologs of human proteins are often functionally characterized in model organisms. Unfortunately, orthology analysis between human and e.g. invertebrates is often complex because of large numbers of paralogs within protein families. Paralogs that predate the species split, which we call out-paralogs, can easily be confused with true orthologs. Paralogs that arose after the species split, which we call in-paralogs, however, are bona fide orthologs by definition. Orthologs and in-paralogs are typically detected with phylogenetic methods, but these are slow and difficult to automate. Automatic clustering methods based on two-way best genome-wide matches on the other hand, have so far not separated in-paralogs from out-paralogs effectively. We present a fully automatic method for finding orthologs and in-paralogs from two species. Ortholog clusters are seeded with a two-way best pairwise match, after which an algorithm for adding in-paralogs is applied. The method bypasses multiple alignments and phylogenetic trees, which can be slow and error-prone steps in classical ortholog detection. Still, it robustly detects complex orthologous relationships and assigns confidence values for both orthologs and in-paralogs. The program, called INPARANOID, was tested on all completely sequenced eukaryotic genomes. To assess the quality of INPARANOID results, ortholog clusters were generated from a dataset of worm and mammalian transmembrane proteins, and were compared to clusters derived by manual tree-based ortholog detection methods. 
This study led to the identification with a high degree of confidence of over a dozen novel worm-mammalian ortholog assignments that were previously undetected because of shortcomings of phylogenetic methods. A WWW server that allows searching for orthologs between human and several fully sequenced genomes is installed at http://www.cgb.ki.se/inparanoid/. This is the first comprehensive resource with orthologs of all fully sequenced eukaryotic genomes. Programs and tables of orthology assignments are available from the same location. Copyright 2001 Academic Press.
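
The seeding step described in this abstract, a two-way best pairwise match, can be sketched as follows. This is not the INPARANOID program: the in-paralog addition and the confidence values are omitted, and the input format (one similarity score per gene pair, treated as symmetric) is a simplifying assumption. All gene names and scores are hypothetical.

```python
from typing import Dict, List, Tuple


def reciprocal_best_hits(
    scores: Dict[Tuple[str, str], float],
) -> List[Tuple[str, str]]:
    """Return (gene_a, gene_b) pairs that are each other's best match.

    `scores` maps (gene from species A, gene from species B) to a similarity
    score, higher meaning more similar. On ties the first pair seen wins.
    """
    best_for_a: Dict[str, Tuple[str, float]] = {}
    best_for_b: Dict[str, Tuple[str, float]] = {}
    for (a, b), score in scores.items():
        # Track the best-scoring partner in each direction.
        if a not in best_for_a or score > best_for_a[a][1]:
            best_for_a[a] = (b, score)
        if b not in best_for_b or score > best_for_b[b][1]:
            best_for_b[b] = (a, score)
    # Keep a pair only if the best match holds in both directions.
    return sorted(
        (a, b) for a, (b, _) in best_for_a.items() if best_for_b[b][0] == a
    )


# Hypothetical similarity scores between genes of two species.
scores = {
    ("a1", "b1"): 300.0,
    ("a1", "b2"): 120.0,
    ("a2", "b1"): 250.0,
    ("a2", "b2"): 90.0,
    ("a3", "b3"): 200.0,
}
seeds = reciprocal_best_hits(scores)
print(seeds)
```

Here a2 prefers b1, but b1 prefers a1, so a2 seeds no cluster. Deciding whether a gene like a2 is an in-paralog that belongs in the a1/b1 cluster is the part of the method this sketch leaves out.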

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 19 July 2013
                Volume: 8
                Issue: 7
                Article number: e68731
                Affiliations
                [1] Danone Research, Palaiseau, France
                [2] Institut Pasteur, Genotyping of Pathogens and Public Health platform, Paris, France
                [3] NIZO food research, Ede, The Netherlands
                [4] Centre for Molecular and Biomolecular Informatics (CMBI), Radboud University Medical Centre, Nijmegen, The Netherlands
                [5] Netherlands Bioinformatics Centre (NBIC), Nijmegen, The Netherlands
                [6] Kluyver Center for Genomics of Industrial Fermentation, Delft, The Netherlands
                [7] Microbial Bioinformatics, Ede, The Netherlands
                Baylor College of Medicine, United States of America
                Author notes

                Competing Interests: The authors declare that TS, JvHV and CC work for Danone Research, part of the Danone Group. Danone is selling products that contain Lactobacilli. Danone Research financed part of this study, including subcontracting to NIZO food research (MW, JB) and Microbial Bioinformatics (RS). This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.

                Conceived and designed the experiments: TS JHV MW CC SB RS. Performed the experiments: JP SB CC MW. Analyzed the data: SB CC JB MW JP RS. Contributed reagents/materials/analysis tools: SB JB MW. Wrote the paper: RS TS JHV.

                [¤] Current address: Università Cattolica del Sacro Cuore, Piacenza, Italy

                Article
                Manuscript number: PONE-D-12-38397
                DOI: 10.1371/journal.pone.0068731
                PMCID: PMC3716772
                PMID: 23894338
                Record ID: b769f5cb-73da-4a98-a4cc-afbfb1e3696f
                Copyright © 2013

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 3 December 2012
                Accepted: 31 May 2013
                Page count
                Pages: 18
                Funding
                This work was partially supported by a KIT grant from the Kluyver Centre for Genomics of Industrial Fermentation, which is part of the Netherlands Genomics Initiative (NGI). Danone Research financed part of this study, including subcontracting to NIZO food research (MW, JB) and Microbial Bioinformatics (RS). TS, JvHV and CC obtained financial support from the ERA-Net PathoGenoMics (ANR-10-PATH-004 project). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology
                Computational Biology
                Genomics
                Comparative Genomics
                Genome Evolution
                Genome Sequencing
                Molecular Genetics
                Gene Identification and Analysis
                Sequence Analysis
                Microbiology
                Bacteriology
                Bacterial Biochemistry
                Bacterial Evolution
                Industrial Microbiology
                Microbial Evolution
                Microbial Metabolism
