“How many viruses are there in the environment, compared to cells?” This deceptively
simple question remains challenging to address today, yet because it relates to fundamental
processes and characteristics of viral communities, even the imperfect approximations
available have been critical for the field of viral ecology. Most notably, early quantitative
observations reporting a high abundance of virus-like particles in aquatic environments
[1, 2] put a bright spotlight on viruses in the microbial ecology field, and spurred
many to investigate the possible roles and impacts of these viruses [3, 4]. Admittedly,
these observations also led to some redundancy in the introductions of many viral ecology publications,
with statements such as “viruses are the most abundant entities” or “viruses outnumber
cells by 10 to 1” almost systematically included (about 1550 hits in Google Scholar
as of May 12, 2023, a trend to which we ourselves and almost every researcher in the
field contributed, for better or worse).
This “virus to cell” ratio, also referred to as the virus-to-microbe ratio (VMR or
VTM), virus-to-bacterium ratio (VBR), or virus-to-prokaryote ratio (VPR), is often
reported as a key ecological metric, and typically derives from counts of viruses
and cells using transmission electron microscopy (TEM), epifluorescence microscopy
(EFM) or flow cytometry (FCM). Among these, EFM counts have been the most widely used
as they combine a relatively high throughput and an ability to distinguish even small
viruses from background noise [5]. These abundances of viruses, cells, and their ratio
are some of the basic pieces of information used to estimate the potential impacts
of viruses on ecosystem food webs and microbial processes, alongside other key metrics
such as frequency of visibly infected cells or bacterial production. Yet, the relevance
and usefulness of these all-embracing counts of all observable virus-like particles
and all microbial cells in a sample remain questionable.
In aquatic environments, a large-scale meta-analysis indicated that, while a 10:1
ratio could represent a median value for some ecosystems, VMRs typically span
~2 to 3 orders of magnitude so that “a 10:1 model has either limited or no explanatory
power” [6]. A similar pattern of virus-like particles being overall more abundant
than microbial cells, but with high sample-to-sample variation in the specific ratio,
was reported for other environments [4]. Beyond the over-generalization of the 10
to 1 ratio in the literature, counts of virus-like particles are not without biases
and limitations [7, 8]. Several types of structures can be counted as “virus-like
particles” yet not be infectious viruses, including defective virions, Gene Transfer
Agents, and in some cases DNA-containing vesicles or minerals. At the same time, some
genuine viruses can be challenging to identify and count, including ssDNA and RNA
viruses for which standard dyes are not always efficient, and large viruses that can
be confounded with microbial cells. Virus-like particle counts can thus be both under-
and over-estimated, with no clear way to better constrain this uncertainty using current
methods. Consequently, alternative and complementary approaches enabling estimation
of virus:microbe ratios, or even better virus:host ratios, are highly desirable and
a topic of active research.
Meanwhile, these last 15 years have seen a rapid transformation of the viral ecology
field with the fast rise of metagenomics. Large genome catalogs for uncultivated viruses
obtained from metagenomes have already provided invaluable information regarding their
functional diversity, distribution across ecosystems, and dynamics through space and
time [9]. Given the large amount of metagenome data already available and likely to
be generated in the near future, the prospect of leveraging metagenome-assembled genomes
to estimate VMR is very appealing. Yet, the compositional nature of metagenomic data
along with uncertainties around viral sequence detection methods have so far limited
such attempts, and it is not entirely clear if and how much metagenomes can be used
in this area.
In a new study, López-García et al. try to tackle this challenge by formally establishing
a metagenome-based virus to (cellular) microbe ratio (mVMR), and comparing it to microscopy-based
counts across several ecosystems and sample types including bulk samples, cellular
fractions, and viral fractions [10]. The mVMR metric relies on updated collections
of single-copy marker genes enabling the detection of most of the major known viral,
bacterial, and archaeal groups, with the ratio of abundance for viral and host markers
used as a proxy for the virus:cell ratio in the original sample. When comparing mVMR
estimations to epifluorescence-based counts (“fVMR”) for a set of aquatic samples
and size fractions, both metrics typically ranged between 1:1 and 10:1 viruses to
cells, yet, intriguingly, the overall correlation between the two approaches was limited.
In freshwater environments, mVMR was more than twice that of fVMR, while the opposite
was true in some marine and hypersaline samples. These discrepancies may be due to
technical limitations, but could also reflect biological differences between these
ecosystems, including the number and abundance of “non-viral virus-like particles”
(likely counted with fVMR but not mVMR), as well as the abundance and frequency of
viral genomes integrated in host genomes with little to no virion production (likely
counted in mVMR but not fVMR).
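As a rough illustration of how a marker-based ratio of this kind can be derived, the Python sketch below divides a summed viral marker depth by a median cellular marker depth. It is not the pipeline used by López-García et al.: the function name `mvmr_sketch`, the input format, the assumption of one marker per viral group, and the use of the median across universal cellular markers are all hypothetical choices made for this example.

```python
from statistics import median


def mvmr_sketch(viral_group_depths, cell_marker_depths):
    """Illustrative marker-based virus-to-microbe ratio (hypothetical helper).

    viral_group_depths: dict mapping a viral group name to the sequencing
        depth of that group's single-copy marker (assumption: one marker
        per group, so depths can be summed across groups).
    cell_marker_depths: list of sequencing depths for universal single-copy
        cellular markers (assumption: each occurs once per genome, so their
        median approximates the number of cellular genomes sampled).
    """
    if not cell_marker_depths:
        raise ValueError("at least one cellular marker depth is required")

    viral_total = sum(viral_group_depths.values())
    cell_estimate = median(cell_marker_depths)

    if cell_estimate <= 0:
        raise ValueError("cellular marker depth is zero; ratio is undefined")
    return viral_total / cell_estimate


# Made-up depths: two viral groups and three cellular markers.
example_ratio = mvmr_sketch(
    {"viral_group_A": 30.0, "viral_group_B": 12.0},
    [9.0, 10.0, 14.0],
)
```

With these made-up inputs, the viral depths sum to 42 and the median cellular depth is 10, giving a ratio of about 4.2 viruses per cell. Any integrated viral genome carrying a marker would be counted here whether or not it produces virions, which is one of the reasons such a ratio can diverge from microscopy counts.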
Applied to a broader range of samples and ecosystems, mVMR typically ranged between
1:1 and ~10:1 with a relatively high sample-to-sample variation. Some broad ecosystem
trends were nevertheless apparent, with higher mVMR in aquatic ecosystems compared
to soil/sediments and animal-associated microbiomes. López-García et al. also illustrate
another potential advantage of the mVMR approach by providing an estimated relative
abundance of individual taxa within the aggregated “virus” and “microbe” counts for
each sample, leveraging the fact that single-copy genes can also be used as taxonomic
markers. This larger analysis illustrates both the promises and the remaining challenges
of mVMR as a viral ecology metric. On the one hand, the prospect of leveraging the ever-growing
number of public metagenomes, along with the possibility to estimate mVMR in a lineage-specific
way to more closely reflect virus:host relationships, are tantalizing and would enable
a much larger and more in-depth investigation of virus:host dynamics across microbiomes.
On the other hand, it is still complicated at this point to determine whether mVMR
applied in this way is a “superior” metric, i.e., more accurate and less biased than
microscopy counts.
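To make the lineage-level idea concrete, the Python sketch below converts marker depths into relative abundances and computes a per-lineage virus:host ratio. It is only a sketch under stated assumptions: the function names, the input dictionaries, and in particular the availability of a host lineage assignment for each viral marker are hypothetical and are not taken from López-García et al.

```python
def relative_abundance(depths):
    """Convert marker depths per taxon into fractions that sum to 1."""
    total = sum(depths.values())
    if total <= 0:
        raise ValueError("total depth must be positive")
    return {taxon: depth / total for taxon, depth in depths.items()}


def lineage_ratios(virus_depth_by_host, host_depth):
    """Per-lineage virus:host ratio (hypothetical helper).

    virus_depth_by_host: viral marker depth summed per predicted host
        lineage (assumption: a host prediction exists for each virus).
    host_depth: cellular marker depth per lineage.
    Lineages with zero host depth are skipped because the ratio is undefined.
    """
    return {
        lineage: virus_depth_by_host.get(lineage, 0.0) / depth
        for lineage, depth in host_depth.items()
        if depth > 0
    }


# Made-up depths for two host lineages.
host_fractions = relative_abundance({"lineage_X": 4.0, "lineage_Y": 12.0})
ratios = lineage_ratios({"lineage_X": 20.0}, {"lineage_X": 4.0, "lineage_Y": 12.0})
```

In this toy example, lineage_X makes up a quarter of the cellular community yet carries a 5:1 virus:host ratio, while lineage_Y has no detected viruses. A single aggregated ratio for the sample would hide this contrast.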
Several technical limitations could lead to both under- and over-estimation of mVMR.
Publicly available metagenomes most often target cellular size fractions, and may
miss a significant number of viruses including (i) viral genomes encapsidated in virions,
(ii) viruses not represented in current collections of single-copy marker genes, and
(iii) ssDNA and RNA viruses not captured by standard library preparation protocols.
While the most common dsDNA viruses are now most likely well captured by single-copy
marker genes, a comprehensive mVMR would have to rely on integrated sampling across
size fractions and a combination of DNA and RNA libraries, which would not be available
for the majority of publicly available data. Meanwhile, mVMR counts will also include
elements not typically considered in the VMR metric, e.g., dormant viruses that do
not produce virions and/or kill their host, or virus-like machinery encoded by microbes
that includes homologs of virus marker genes. Hence, with the potential for relatively
large under- and over-estimation, it is still difficult to get a sense for how accurate
the mVMR metric is.
Ultimately, mVMR and fVMR appear to provide orthogonal perspectives on a highly complex
biological process with different technical biases and limitations. On a bad day,
a viral ecologist would only see two flawed metrics with unknown confidence intervals
that cannot be trusted. But on a good day, these two approaches can easily be seen
as highly complementary, and the addition of mVMR next to fVMR could provide a much-needed
additional window into virus:host interactions in microbiomes. With more controlled
studies of known consortia to better evaluate methodological biases, advances in “quantitative
sequencing” with the use of e.g., artificial spike-ins or high-throughput single-cell
approaches, and more studies providing paired microscopy- and sequencing-based observations,
we believe that the limitations around fVMR and mVMR will progressively be reduced,
finally enabling viral ecologists to be more confident in these measurements. For
better or worse, however, since both fVMR and mVMR seem to suggest that viruses typically
outnumber cells across diverse samples and microbiomes, we expect to keep reading
variants of the “viruses are the most abundant …” introduction statement for the foreseeable
future.