Microbiome differential abundance methods produce different results across 38 datasets

Abstract

Identifying differentially abundant microbes is a common goal of microbiome studies. Multiple methods are used interchangeably for this purpose in the literature. Yet few large-scale studies have systematically examined whether these tools can appropriately be used interchangeably, or how large and how consequential the differences between them are. Here, we compare the performance of 14 differential abundance testing methods on 38 16S rRNA gene datasets with two sample groups. We test for differences in amplicon sequence variants and operational taxonomic units (referred to collectively as ASVs) between these groups. Our findings confirm that these tools identify drastically different numbers and sets of significant ASVs, and that results depend on data pre-processing. For many tools, the number of features identified correlates with aspects of the data, such as sample size, sequencing depth, and effect size of community differences. ALDEx2 and ANCOM-II produce the most consistent results across studies and agree best with the intersect of results from different approaches. Nevertheless, we recommend that researchers use a consensus approach based on multiple differential abundance methods to help ensure robust biological interpretations.
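One way to picture the consensus approach recommended above is a simple vote across tools. The sketch below is illustrative only and is not the authors' procedure: the function name `consensus_features`, the `min_methods` threshold, the feature IDs, and the per-tool calls are all hypothetical, made-up values.

```python
from collections import Counter


def consensus_features(calls, min_methods):
    """Return the features flagged as significant by at least `min_methods` tools.

    `calls` maps a tool name to the set of feature IDs that tool called
    significant (hypothetical input format, assumed for this sketch).
    """
    # Count, for each feature, how many tools flagged it.
    counts = Counter(feature for features in calls.values() for feature in set(features))
    return {feature for feature, n in counts.items() if n >= min_methods}


# Made-up example calls; these are not results from the article.
calls = {
    "ALDEx2": {"ASV1", "ASV2"},
    "ANCOM-II": {"ASV1"},
    "other_tool": {"ASV1", "ASV2", "ASV3", "ASV4"},
}

# Keep only features supported by at least two of the three tools.
consensus = consensus_features(calls, min_methods=2)
```

Setting `min_methods` to the number of tools gives the strict intersection of all results, while lower values trade stringency for sensitivity. The appropriate threshold is a judgment call that this sketch does not settle.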

Abstract

Many microbiome differential abundance methods are available, but systematic comparisons among them are lacking. Here, the authors compare the performance of 14 differential abundance testing methods on 38 16S rRNA gene datasets with two sample groups, and show that ALDEx2 and ANCOM-II produce the most consistent results.

Author and article information

Contributors

Jacob T. Nearing:

ORCID: http://orcid.org/0000-0002-2261-034X

jacob.nearing@dal.ca

Journal

Journal ID (nlm-ta): Nat Commun

Journal ID (iso-abbrev): Nat Commun

Title: Nature Communications

Publisher: Nature Publishing Group UK (London)

ISSN (Electronic): 2041-1723

Publication date (Electronic): 17 January 2022

Publication date PMC-release: 17 January 2022

Publication date Collection: 2022

Volume: 13

Electronic Location Identifier: 342

Affiliations

[1] GRID grid.55602.34, ISNI 0000 0004 1936 8200, Department of Microbiology and Immunology, Dalhousie University, Halifax, NS, Canada

[2] GRID grid.55602.34, ISNI 0000 0004 1936 8200, Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada

[3] GRID grid.55602.34, ISNI 0000 0004 1936 8200, Department of Computer Science, Dalhousie University, Halifax, NS, Canada

[4] GRID grid.55602.34, ISNI 0000 0004 1936 8200, Integrated Microbiome Resource, Dalhousie University, Halifax, NS, Canada

[5] GRID grid.55602.34, ISNI 0000 0004 1936 8200, Department of Civil and Resource Engineering, Dalhousie University, Halifax, NS, Canada

[6] GRID grid.55602.34, ISNI 0000 0004 1936 8200, Department of Pharmacology, Dalhousie University, Halifax, NS, Canada

Author information

Jacob T. Nearing http://orcid.org/0000-0002-2261-034X

Molly G. Hayes http://orcid.org/0000-0003-3565-535X

Akhilesh S. Dhanani http://orcid.org/0000-0002-4204-0153

André M. Comeau http://orcid.org/0000-0001-7066-7239

Article

Publisher ID: 28034

DOI: 10.1038/s41467-022-28034-z

PMC ID: 8763921

PubMed ID: 35039521

SO-VID: 84b14c55-e55d-4dbf-b051-f6beea6d94ec

License:

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

History

Date received: 12 May 2021

Date accepted: 4 January 2022

Custom metadata

ScienceOpen disciplines: Uncategorized

Keywords: data processing, high-throughput screening, statistical methods, microbiome

