
      Gut microbiome diversity detected by high-coverage 16S and shotgun sequencing of paired stool and colon sample

      data-paper


          Abstract

          The gut microbiome has a fundamental role in human health and disease. However, studying the complex structure and function of the gut microbiome using next-generation sequencing is challenging and prone to reproducibility problems. Here, we obtained cross-sectional colon biopsies and faecal samples from nine participants in our COLSCREEN study and sequenced them at high coverage using Illumina paired-end shotgun (for faecal samples) and IonTorrent 16S (for paired faecal samples and colon biopsies) technologies. The metagenomes consisted of between 47 and 92 million reads per sample, and the targeted sequencing covered more than 300,000 reads per sample across seven hypervariable regions of the 16S gene. Our data are freely available and coupled with code for the presented metagenomic analysis using up-to-date bioinformatics algorithms. These results will add to the insights needed to design comprehensive microbiome analyses and also provide data for further testing aimed at unambiguous gut microbiome analysis.
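
          As a rough illustration of how the per-sample read depths quoted above could be checked after downloading the data, the sketch below counts reads in a FASTQ file. It assumes the common four-lines-per-record FASTQ layout and optional gzip compression; the function name and the file name in the usage comment are hypothetical and are not taken from the authors' published code.

```python
import gzip
from pathlib import Path


def count_fastq_reads(path):
    """Count reads in a FASTQ file.

    Assumes the common layout of exactly four lines per record
    (header, sequence, '+', qualities); multi-line FASTQ is not handled.
    """
    path = Path(path)
    # Use gzip for ".gz" files, plain text otherwise.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path} has {n_lines} lines, not a multiple of 4")
    return n_lines // 4


# Hypothetical usage (the file name is a placeholder):
# n_reads = count_fastq_reads("sample_R1.fastq.gz")
# print(f"{n_reads / 1e6:.1f} million reads")
```

          For paired-end data, the forward and reverse files of a sample should each give the same count, which makes this a quick sanity check before any downstream analysis.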

          Measurement(s): genome • rRNA_16S
          Technology Type(s): DNA sequencing
          Factor Type(s): sex • age • Smoking • Weight • Height • Diet • Medication
          Sample Characteristic - Organism: Homo sapiens
          Sample Characteristic - Environment: feces • colon

          Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.11902236


                Author and article information

                Contributors
                v.moreno@iconcologia.net
                ville.pimenoff@ki.se
                Journal
                Scientific Data (Sci Data)
                Publisher: Nature Publishing Group UK (London)
                ISSN: 2052-4463
                Published: 16 March 2020
                Volume: 7
                Article number: 92
                Affiliations
                [1] ISNI 0000 0001 2097 8389, GRID grid.418701.b, Oncology Data Analytics Program, Catalan Institute of Oncology (ICO), Barcelona, Spain
                [2] ISNI 0000 0004 0427 2257, GRID grid.418284.3, Colorectal Cancer Group, ONCOBELL Program, Bellvitge Institute of Biomedical Research (IDIBELL), Barcelona, Spain
                [3] ISNI 0000 0004 1756 6246, GRID grid.466571.7, Consortium for Biomedical Research in Epidemiology and Public Health (CIBERESP), Barcelona, Spain
                [4] GRID grid.417656.7, Gastroenterology Department, Bellvitge University Hospital-IDIBELL, Hospitalet de Llobregat, Barcelona, Spain
                [5] ISNI 0000 0004 0427 2257, GRID grid.418284.3, Cancer Epigenetics and Biology Program (PEBC), Bellvitge Biomedical Research Institute (IDIBELL), Barcelona, Catalonia, Spain
                [6] Digestive System Service, Moisés Broggi Hospital, Sant Joan Despí, Spain
                [7] Endoscopy Unit, Digestive System Service, Viladecans Hospital-IDIBELL, Viladecans, Spain
                [8] ISNI 0000 0004 1937 0247, GRID grid.5841.8, Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, Spain
                [9] ISNI 0000 0004 1937 0626, GRID grid.4714.6, National Cancer Center Finland (FICAN-MID) and Karolinska Institute, Stockholm, Sweden
                Author information
                http://orcid.org/0000-0001-6484-3937
                http://orcid.org/0000-0001-6071-7343
                http://orcid.org/0000-0001-8840-7259
                http://orcid.org/0000-0002-2818-5487
                Article
                Publisher ID: 427
                DOI: 10.1038/s41597-020-0427-5
                PMCID: 7075950
                PMID: 32179734
                7be13bde-6a31-4b76-9db0-1c028b8784fa
                © The Author(s) 2020

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.

                History
                Received: 13 September 2019
                Accepted: 21 February 2020
                Funding
                Funded by: Ministry of Science, Government of Spain, FPU17/05474
                Funded by: Fundacion Cientifica de la Asociacion Espanola Contra el Cancer, AECC
                Funded by: Spanish Ministry for Economy and Competitivity, Instituto de Salud Carlos III, co-funded by FEDER funds –a way to build Europe– (FIS PI17/00092), Agency for Management of University and Research Grants (AGAUR) of the Catalan Government (grant 2017SGR723).
                Funded by: Ministry of Science, Innovation and Universities, Government of Spain (grant FPU17/05474). Ministry of Health, Government of Catalonia (grants SLT002/16/00496 and SLT002/16/00398), Spanish Ministry for Economy and Competitivity, Instituto de Salud Carlos III, co-funded by FEDER funds –a way to build Europe– (FIS PI17/00092), Agency for Management of University and Research Grants (AGAUR) of the Catalan Government (grant 2017SGR723). Mireia Obón-Santacana received a post-doctoral fellowship from “Fundación Científica de la Asociación Española Contra el Cáncer (AECC)”. We thank CERCA Program, Generalitat de Catalunya for institutional support.
                Funded by: Ministry of Health, Government of Catalonia (grants SLT002/16/00496 and SLT002/16/00398) and Spanish Ministry for Economy and Competitivity, Instituto de Salud Carlos III, co-funded by FEDER funds –a way to build Europe– (FIS PI17/00092).
                Categories
                Data Descriptor
                Custom metadata

                Keywords: diagnostic markers, microbiome, classification and taxonomy, DNA sequencing
