      A practical guide to amplicon and metagenomic analysis of microbiome data

      review-article

          Abstract

          Advances in high-throughput sequencing (HTS) have fostered rapid developments in the field of microbiome research, and massive microbiome datasets are now being generated. However, the diversity of software tools and the complexity of analysis pipelines make it difficult to access this field. Here, we systematically summarize the advantages and limitations of microbiome methods. Then, we recommend specific pipelines for amplicon and metagenomic analyses, and describe commonly-used software and databases, to help researchers select the appropriate tools. Furthermore, we introduce statistical and visualization methods suitable for microbiome analysis, including alpha- and beta-diversity, taxonomic composition, difference comparisons, correlation, networks, machine learning, evolution, source tracing, and common visualization styles to help researchers make informed choices. Finally, a step-by-step reproducible analysis guide is introduced. We hope this review will allow researchers to carry out data analysis more effectively and to quickly select the appropriate tools in order to efficiently mine the biological significance behind the data.

                Author and article information

                Contributors
                yxliu@genetics.ac.cn
                ybai@genetics.ac.cn
                Journal
                Protein & Cell
                Higher Education Press (Beijing)
                ISSN: 1674-800X, 1674-8018
                11 May 2020
                May 2021
                Volume: 12
                Issue: 5
                Pages: 315-330
                Affiliations
                [1] GRID grid.9227.e, ISNI 0000000119573309, State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101 China
                [2] GRID grid.410726.6, ISNI 0000 0004 1797 8419, CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing, 100049 China
                [3] GRID grid.9227.e, ISNI 0000000119573309, CAS-JIC Centre of Excellence for Plant and Microbial Science, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, 100101 China
                [4] GRID grid.410726.6, ISNI 0000 0004 1797 8419, College of Advanced Agricultural Sciences, University of Chinese Academy of Sciences, Beijing, 100049 China
                [5] GRID grid.410318.f, ISNI 0000 0004 0632 3409, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700 China
                [6] GRID grid.13402.34, ISNI 0000 0004 1759 700X, Department of Rheumatology Immunology & Allergy, Children’s Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310053 China
                Author information
                http://orcid.org/0000-0003-1832-9835
                http://orcid.org/0000-0003-0705-0636
                http://orcid.org/0000-0003-3134-3113
                http://orcid.org/0000-0002-4930-9493
                http://orcid.org/0000-0003-0733-8031
                http://orcid.org/0000-0001-5422-6740
                http://orcid.org/0000-0003-2652-7022
                Article
                724
                DOI: 10.1007/s13238-020-00724-8
                PMC ID: 8106563
                PMID: 32394199
                ff3b0837-f233-44b4-9646-7e735ba6868c
                © The Author(s) 2020

                Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 4 February 2020
                Accepted: 10 April 2020
                Categories
                Review
                Custom metadata
                © The Author(s) 2021

                Keywords: metagenome, marker genes, high-throughput sequencing, pipeline, reproducible analysis, visualization
