      Genome-centric analysis of short and long read metagenomes reveals uncharacterized microbiome diversity in Southeast Asians

      research-article


          Abstract

          Despite extensive efforts to address it, the vastness of uncharacterized ‘dark matter’ microbial genetic diversity can impact short-read sequencing based metagenomic studies. Population-specific biases in genomic reference databases can further compound this problem. Leveraging advances in hybrid assembly (using short and long reads) and Hi-C technologies in a cross-sectional survey, we deeply characterized 109 gut microbiomes from three ethnicities in Singapore to comprehensively reconstruct 4497 medium and high-quality metagenome assembled genomes, 1708 of which were missing in short-read only analysis and with >28× N50 improvement. Species-level clustering identified 70 (>10% of total) novel gut species out of 685, improved reference genomes for 363 species (53% of total), and discovered 3413 strains unique to these populations. Among the top 10 most abundant gut bacteria in our study, one of the species and >80% of strains were unrepresented in existing databases. Annotation of biosynthetic gene clusters (BGCs) uncovered more than 27,000 BGCs with a large fraction (36–88%) unrepresented in current databases, and with several unique clusters predicted to produce bacteriocins that could significantly alter microbiome community structure. These results reveal significant uncharacterized gut microbial diversity in Southeast Asian populations and highlight the utility of hybrid metagenomic references for bioprospecting and disease-focused studies.
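          The abstract reports a >28× N50 improvement for hybrid assemblies over short-read-only ones. As a reminder of what that metric measures, here is a minimal sketch of the standard N50 definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The function name `n50` and both contig-length lists are hypothetical illustrations, not values or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in kb (illustrative only, not study data).
short_read_contigs = [2, 2, 3, 3, 4, 6]
hybrid_contigs = [20, 60, 120]

fold_improvement = n50(hybrid_contigs) / n50(short_read_contigs)
print(n50(short_read_contigs), n50(hybrid_contigs), fold_improvement)
```

          Because N50 is weighted by contig length, merging many short fragments into a few long contigs raises it sharply even when the total assembled length barely changes, which is why it is a common way to summarize assembly contiguity.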

          Abstract

          Reference genomes for gut microbiomes help unravel microbial “dark matter” and serve as a valuable resource for disease-focused studies. Here, the authors perform short- and long-read metagenomics and metagenome-assembled genome analyses to profile the gut microbiomes of Southeast Asian populations, revealing significant species- and strain-level diversity, along with thousands of previously uncharacterized biosynthetic gene clusters.


                Author and article information

                Contributors
                yyteo@nus.edu.sg
                henning@tll.org.sg
                nagarajann@gis.a-star.edu.sg
                Journal
                Nat Commun
                Nature Communications
                Nature Publishing Group UK (London)
                ISSN 2041-1723
                13 October 2022
                Volume 13, article 6044
                Affiliations
                [1] GRID grid.418377.e, ISNI 0000 0004 0620 715X, Genome Institute of Singapore, Singapore, 138672 Singapore
                [2] GRID grid.4280.e, ISNI 0000 0001 2180 6431, Life Sciences Institute, National University of Singapore, Singapore, 117456 Singapore
                [3] GRID grid.1051.5, ISNI 0000 0000 9760 5620, Baker Heart and Diabetes Institute, 75 Commercial Rd, Melbourne, 3004 VIC Australia
                [4] GRID grid.226688.0, ISNI 0000 0004 0620 9198, Temasek Life Sciences Laboratory, 1 Research Link, Singapore, 117604 Singapore
                [5] GRID grid.4280.e, ISNI 0000 0001 2180 6431, Saw Swee Hock School of Public Health, National University of Singapore, 12 Science Drive 2, Singapore, 117549 Singapore
                [6] GRID grid.4280.e, ISNI 0000 0001 2180 6431, Department of Statistics and Applied Probability, National University of Singapore, Singapore, 117546 Singapore
                [7] GRID grid.4280.e, ISNI 0000 0001 2180 6431, Department of Biological Sciences, National University of Singapore, Singapore, 117558 Singapore
                [8] GRID grid.4280.e, ISNI 0000 0001 2180 6431, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117596 Singapore
                Author information
                http://orcid.org/0000-0003-0997-2490
                http://orcid.org/0000-0001-5317-0458
                http://orcid.org/0000-0002-5763-0236
                http://orcid.org/0000-0003-0850-5604
                Article
                33782
                DOI: 10.1038/s41467-022-33782-z
                PMC ID: 9561172
                PubMed ID: 36229545
                329211c3-e5bd-461b-8fad-f43545dd9515
                © The Author(s) 2022

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 27 April 2022
                Accepted: 27 September 2022
                Funding
                Funded by: Agency for Science, Technology and Research (A*STAR), FundRef https://doi.org/10.13039/501100001348
                Categories
                Article
                Uncategorized
                microbial genetics, microbiome, genome assembly algorithms, metagenomics
