We have come a long way in recent years in developing analytical strategies in
metabolomics. We have seen huge progress in tackling multiplatform
measurement, data analysis, data integration, and interpretation.
1
Mass spectrometry (MS) is the unrivaled technology
in the field. Following a divide and conquer strategy, successful
approaches defined and addressed sub-omes individually. Recursively
solving technical “subproblems” also with regard to
the analytical tasks of quantification and identification allowed
us to make significant progress. However, some of the challenges,
as imposed by the metabolome’s complexity (molecules <1500
Da), are not entirely overcome to date.
Indeed, the physicochemical
space occupied by this building block of life is vastly heterogeneous,
spanning concentration ranges from (high) fM to mM
2
and forming dynamic complex reaction networks. The complete
scope of metabolic networks remains to be elucidated. This holds true
for “simple” organisms such as bacteria with relatively
small-sized metabolomes (in the hundreds) and even more so for the human
body metabolome, which consists of hundreds of different metabolomes
depending on body fluid, cell type, status, and tissue.
Beyond the endogenous human pathways, numerous metabolites exist
that are transformed and/or circulated through the complex interplay
with trillions of microbes constituting the “ecosystem”
of the human body.
3
Additionally, certain
disease-specific metabolites (e.g. methylated amino acids) with biological
function may occur, defining the so-called epi-metabolome.
4
Damaged and repaired metabolites can be the result
of enzymatic impairment.
5
Finally, the
human metabolome is highly dependent on nutrition and the surrounding
environment. More than 200,000 food-derived metabolites and 10,000
xenobiotics exist that are potentially circulating.
6
Consequently, we have not yet reached the ultimate
aim, which is to comprehensively identify and quantify all metabolites
with one or at least a few analytical runs. Metabolome coverage, selectivity,
sensitivity and throughput remain conflicting goals that we have to
navigate.
7
While this fact limits the pace
of experimental assessment of metabolome inventories
8
regarding different cell types and model organisms, there
has been significant progress in customizing workflows, with the aim
of providing a pragmatic base for informative metabolite measurements.
9
The virtuous cycle of the global metabolomics
workflow starts with discoveries by nontargeted analysis. Over the
last few years, this analytical strategy has seen a tremendous impact
across different metabolomics applications and beyond. At the same
time, analytical chemists, embracing this novel omics-type of measurement,
have been and continue to be challenged regarding quality control (QC),
method standardization, and harmonization. Evidently, the computational
methods for data processing and data analysis are by far more complex
than in target analysis. Establishing metrics and guidelines for nontargeted
analysis is not straightforward,
10
especially
compared to the well-established validation practice in target analysis.
Experimental design and data quality
11
are
key to fully exploit the potential of nontargeted analysis with regard
to e.g. biomarker discovery
12,13
and beyond. The integration
of reference materials in nontargeted workflows is still under debate.
The complexity of omics-reference material production, following
stringent metrological criteria, results in high costs, which conflicts
with the idea of affordable discoveries in large-scale studies. The
authors assume that this lack of general acceptance has in turn reduced
the pace of material development, and today we still have only a few
biological matrix reference standards available. Finally, whether
a discovery can be standardized might be debatable; however, a finding
should be validated. In fact, a metabolomics experiment should not
end with nontargeted methods, but the results should be validated
both analytically and biologically.
14
Thus, the final analytical step of our ideal virtuous metabolomics
cycle includes targeted measurements using authentic standards. Typical
sample numbers in metabolomics range from tens or hundreds up to a
few thousand, depending on the study design.
15
The more diverse the study cohort, the more samples must be analyzed
in order to generate a meaningful hypothesis. Following the golden
rules of step-wise discovery and stringent analytical validation is
more demanding for large scale studies. Time spans between sampling,
analysis, interpretation, discovery, and final validation together
with the limited availability of authentic standards pose practical
limitations towards this approach. A major aim of analytical development
remains increasing the throughput of measurement. Regarding compound annotation, virtually
every current study accepts annotations with a varying
but defined degree of certainty. This holds true for metabolomics
and lipidomics, where annotation is facilitated by rule-based MS data
interpretation as enabled by the structural templates of lipids. It
is common practice in both applications to report levels of annotation.
16,17
However, estimating the proportion of potentially false assignments
is still an exciting field of research.
18
Finally, analytical validation should include the quantitative dimension
of discoveries in nontargeted analysis. Despite significant progress
in harmonization, standardization, and advanced statistical analysis,
19
large scale multicenter studies remain challenging.
Recent applications resort to small scale studies for hypothesis generation,
followed by a (wide) targeted large scale study for hypothesis validation.
20
Biological validation is dependent on
the scope of the study. In metabolic phenotyping, biological probability
checks are facilitated by massive joint efforts to deploy open-source
metabolic atlases for a number of different organisms. Comparisons
with both experimental data and predictions (reactions, rules, and
enzymes) support the findings.
21
The complexity
of biological validation increases dramatically in the case of a hypothesized
biological function. Then, validation of the generated hypothesis
does not only address the mere presence/up- or downregulation of a
certain metabolite/pathway, but the hypothesized biological function
needs to be corroborated. For example, in functional metabolomics,
22
cutting-edge multi-omics analysis
23
together with biochemical assays unravels molecular
functions and associated modulatory mechanisms of perturbed metabolism
in relation to phenotype.
Undoubtedly, accepting multiple lines
of evidence in nontargeted discoveries (with reported degree of confidence)
has accelerated metabolomics research. The question to which degree
analytical validation can be reduced or even entirely replaced by
advanced computational methods and biological validation experiments
needs to be addressed in the overwhelmingly interdisciplinary science
of metabolomics. Reporting on the accurate assessment and the resulting
degree of confidence alone is a minimum requirement.
24
On the other hand, lines of evidence beyond strict
analytical validation might accelerate the measurement
step itself. High-throughput technologies proved to be fit-for-purpose
in dedicated applications despite limited selectivity.
25,26
This review will focus on recurring topics in MS-based metabolomics
measurement (including lipids). We will emphasize the role of stable
isotopes for both target and nontargeted analysis giving an overview
on different standard materials derived from isotopically labeled
biomass and strategies enabled by these materials. We will discuss
the current state of the art of quantification, validation, and harmonization
with respect to both metabolomics and lipidomics. We will include
strategies enabling various ways of scientific evidence regarding
the metabolite/lipid annotation task. Finally, we will survey the
rationales of workflow design, which straddle coverage and throughput.
Nearly five years have passed since Cajka and Fiehn published their
review on the state of the art of metabolomics/lipidomics, proposing
at the same time a vision of merging targeted and nontargeted analysis.
9
Since then, many studies have realized the potential
of simultaneous unanticipated discovery and quantification of a selected
metabolite pool, a strategy enabled by high-resolution mass spectrometry
(HRMS). We report on the progress of “merging” ideas.
We think that lipidomics and metabolomics need to be integrated into
one workflow. We will discuss the potential of chromatographic solutions
as compared to recent high-throughput technologies for the simultaneous
analysis of the two sub-omes, as a first key step.
Established Concepts
of Quantification in Metabolomics/Lipidomics
For accurate
absolute quantification, guidelines on bioanalytical method validation
from the United States Food & Drug Administration (U.S. FDA)
27
or European Medicines Agency (EMA)
28
establish gold standards and metrological frames.
However, application to omics-type analysis is challenged by the sheer
number of analytes within one measurement, the lack of standards,
and the need for an actual analyte-free matrix. In the following,
we will give a brief tutorial summary on absolute quantification strategies
currently established in the field of metabolomics and lipidomics.
The term quantitative assessment in MS-based omics studies often refers
to relative quantification of differences between sample groups, while
here we refer to absolute quantification requiring proper standardization
and analytical validation. A brief introduction will emphasize the
need for standards and reference materials, in the form of both multi-mix
standards and biological matrix material.
Recommended Absolute Quantification
Approaches
The method of highest metrological order in MS-based
analysis is isotope dilution established by matrix-matched multi-level
external calibration with internal standardization. The internal standard
(ISTD) should be added as early as possible in the analytical process, and equilibration
between sample and spike should be ensured prior to extraction. Multilevel
calibration is preferred, as the working range (given by the lower limit
of quantification (LLOQ) and the upper limit of quantification (ULOQ))
is assessed and controlled along with the quantification exercise.
This is not the case when isotope dilution is based on a single spike
level (one-point calibration). Next to this gold standard, other external
calibration strategies could meet the recommendations of widely accepted
(bio-) analytical method validation guidelines, as well, as long as
they properly employ internal standardization. As internal standards,
either standards of similar structure or standards of matching retention time
(RT), and thus co-ionization, are commonly used. Spiking the same amount
of ISTD to external calibrants and samples allows us to use ISTDs
without certified concentration. Figure 1
dissects the calibration method into four
major components and discusses their relevance. According to the guidelines,
the analysis of biological matrix blanks is mandatory. The conceptualization
of such a blank sample, i.e. a biological matrix free of endogenous
metabolites, is challenging. Knockout experiments for specific metabolites,
albeit tedious, offer a solution. However, most studies resort to
simplifying approaches using extraction blanks or protein mixtures.
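To make the calibration concept concrete, the following minimal Python sketch illustrates multi-point external calibration with internal standardization; all peak areas and concentrations are hypothetical, and a real workflow would additionally verify the working range (LLOQ/ULOQ), matrix matching, and regression weighting:

```python
# Minimal sketch of multi-point external calibration with internal
# standardization (illustrative values, not from any real assay).
import numpy as np

# Calibrant concentrations spiked into a matrix-matched blank, each
# measured as analyte/ISTD peak-area ratio; same ISTD amount in all.
cal_conc = np.array([0.1, 0.5, 1.0, 5.0, 10.0])          # µM
cal_ratio = np.array([0.021, 0.098, 0.205, 1.01, 2.02])  # area(analyte)/area(ISTD)

slope, intercept = np.polyfit(cal_conc, cal_ratio, 1)    # linear calibration model

def quantify(sample_ratio: float) -> float:
    """Back-calculate concentration from the analyte/ISTD area ratio."""
    return (sample_ratio - intercept) / slope

print(f"{quantify(0.48):.2f} µM")  # sample measured at area ratio 0.48
```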
Figure 1
Accurate
absolute quantification according to the U.S. FDA guideline. Four
requirements need to be fulfilled for calibration: 1, matrix-matched;
2, multipoint; 3, external standardization; 4, internal standardization.
Additionally, their control point, the challenge, and a practical
solution for omics-experiments are given. *The ranking of ISTD follows
the levels of quantification of the Lipidomics Standards Initiative
(LSI).
29
The gold standard of quantification, if applied to omics-type
analysis, requires a high number of external standards (ESTDs) and
ISTDs, which are stable isotope labeled. Fully labeled standards are
expensive but simplify data evaluation and validation. State-of-the-art
wide-targeted assays in metabolomics implement hundreds of standards.
In research practice, the need for fit-for-purpose methods has led
to the implementation of alternative quantification strategies with
the aim of reducing the overall number of standards, measurements,
and costs involved. In lipidomics, calibration strategies resorting
to few standards per lipid class have successfully been established
as enabled by the structural templates of lipids.
30
Moreover, recent developments concern the use of partial
isotopic labeling for standard production.
31
Figure 2
provides
an overview on established quantification methods in metabolomics/lipidomics
compared to the gold standard (matrix-matched multi-point external
calibration including internal standardization).
Figure 2
Fit-for-purpose internal
standard-based quantification strategies established in the field
of metabolomics and lipidomics. Colors in the graphs symbolize values
from the sample (purple), compound-specific standards (green), and
surrogate standards (orange).
As previously mentioned, isotope dilution using a known amount of
isotopically labeled ISTD with characterized concentration (traceable)
offers a method of high metrological order. Both fully labeled and
partially labeled ISTDs can be used. In the latter case, concentrations
are calculated using multiple linear regressions. The “single
spike” isotope dilution method is accurate, provided that (1)
spike and sample are equilibrated upon extraction, (2) the blend
ratio is within the linear dynamic range, and (3) the blend ratio differs
significantly from the natural ratio. Thus, additional validation experiments are
required. For highest metrological order, reversed isotope dilution
experiments are necessary to characterize the spike with every experiment.
These steps are mostly omitted in -omics measurements. The validation
process is accelerated by kit solutions and commercial availability
of ISTD mixtures with concentration levels tailored for specific applications.
If no compound-specific calibrant is available, surrogate calibration
is accomplished by structurally similar standards, either using isotopically
labeled ISTD or non-endogenous ISTDs. Structurally similar standards
are preferred over RT matched standards, which ensure co-ionization
only. Surrogate internal standardization drastically reduces the number
of necessary standards. It is executed as multi-point calibration
32
or as one-point calibration.
33
In lipidomics, surrogate calibration is accepted, provided that
lipid class co-ionization and the use of response factors
34
are ensured. If lipid surrogate quantification
is performed on the MS2 level, variations in signal intensities between
the different fatty acyl chain fragments have to be mathematically
corrected for. Schuhmann et al. recently published a model based on
commercially available lipid standards to correct systematic errors
(up to 60%) for common glycerophospholipids due to the differences
in (1) the sn-1/2 positions of the glycerol backbone, (2) the length
of the hydrocarbon chain, and (3) the number and location of double
bonds.
35
Stable Isotope Labeling
In contrast to radionuclides, stable isotopes have stable nuclei and hence
represent a safe alternative for labeling approaches. The overall
abundance of heavy stable isotopes in nature is low (<5%). Owing to
the isotope effect, i.e. isotopic fractionation upon chemical
reactions and biological processes, the natural abundance varies to
a small degree, forming the basis for natural tracer studies in geochronology,
ecology, archeology, or climatology. The low natural abundance facilitates
the production of pure stable isotope labeled compounds, either via
chemical synthesis or via in vivo synthesis.
36
Stable isotopes and stable isotope labeling
have a well-documented history in MS, which was exquisitely outlined
for life sciences by Lehmann.
37
In this
review, we emphasize the pivotal role of stable isotope labeled biomass.
Today, in vivo synthesized stable isotope labeled
compounds have become essential tools for mass spectrometry-based
identification or quantification in metabolomics (including lipidomics).
The important application of supplied stable isotope tracers in metabolomics
for flux and tracer studies is comprehensively covered elsewhere.
38−41
Labeled biomass was used early on in quantitative omics workflows,
e.g. amino acid labeling to monitor proteome changes upon system perturbation.
Relative quantification in proteomics studies using cell culture based
labeling
42
was performed, but also successful
labeling of higher organisms such as Caenorhabditis elegans, Drosophila melanogaster,
and mice was reported.
43−45
However, it is important to note that higher organisms require complex
nutrients and media compositions, so that in most cases
only specific amino acids (SILAC approach) were labeled, leading to
amino acid labeling efficiencies of up to 98%.
44
Only when fully labeled microorganisms such as Escherichia coli 98% enriched in 15N
were fed
to worms (C. elegans) or fruit flies
(D. melanogaster) were protein extracts
with a labeling degree of up to 94% obtained.
45
However, the small number of nitrogen atoms in most metabolites limits
the use of 15N labeling in metabolomics or lipidomics; thus, carbon or deuterium labeling
is preferred. Already in 2005, absolute quantification based on internal
standardization by uniformly 13C-labeled yeast cell extracts
was introduced, paving the way for absolute quantification of a high-numbered
analyte panel.
46
At that time no enrichment
degrees were reported for metabolites or lipids. The use of labeled
biomass for quantification tasks in metabolomics was facilitated by
fully labeled E. coli grown in shaking flasks as
pioneered by the group of Rabinowitz
47
and
further extended for eukaryotic uniformly labeled yeast grown in fermenters
by Canelas et al.
48
Enrichment Degree and Isotopologue
Distribution
Isotopically labeled standards are characterized
by the enrichment degree—often used interchangeably with the
term labeling efficiency—which refers to the probability of
finding a labeled atom at any possible label site. One has to be aware
that the actual relative abundance of the heaviest isotopologue, i.e.
the fully labeled isotopologue, is lower than the enrichment degree
and depends on enrichment, the number of labeling sites, elemental
composition, and mass resolution (see Figure 3
A–D). A simplified assumption of
100% abundance of the fully labeled isotopologue leads to errors in
actual relative abundance in the mass spectra (examples for isoleucine
and phosphatidylcholine (PC) 34:2 can be found in Figure 3
E). This is relevant in absolute
quantification relying on ISTDs with known concentration and especially
crucial if the labeled compound is used as surrogate ISTD as e.g.
often performed in lipidomics.
33
In this
case, either all isotopologues are summed up (after they have been
checked for interferences) or the actual value is corrected e.g. similar
to isotope correction Type 1 for natural unlabeled lipids.
33
A useful tool for fast prediction of isotopologue
distributions from molecular formulas is enviPat, which is available
as a web version and an R package.
49
Overall,
in order to enable omics-type analysis, knowledge of the enrichment degree
is of paramount importance. Spike materials of high enrichment degree
(>99%) are preferred as they lead to more distinct isotopologue
signals, reduced spectral overlay, and more straightforward data interpretation.
Figure 3
Difference
between enrichment degree and the relative isotopic abundance of a
fully labeled isotopologue. (A) Isoleucine with 6 carbon atoms is
used as an example. (B) Calculation of abundances for carbon as a di-isotopic
element is based on the binomial formula. Other elements with more
than one isotope (e.g. H, N) influence the final abundance according
to their natural abundance, also based on a binomial formula. Polyisotopic
elements (O) are described by polynomial terms. Usually, the contribution
of H, N, and O to the overall difference is minimal (here 1–2%),
but other elements must be considered (e.g. Cl, Br, S). (C) Determination
of the coefficients of the binomial formula for each term according to
the n + 1 line in Pascal’s triangle (for n = 6: 1, 6, 15, 20,
15, 6, 1). (D) Binomial formula for n = 6. Each term is the relative
abundance of the corresponding isotopologue without the consideration
of other elemental isotopes. The last term corresponds to the fully
labeled isotopologue. The sum of all isotopologues is always 100%.
(E) Exemplarily, the effect of a 1% enrichment difference (99%, darker
color; 98%, lighter color) on the abundance is shown for PC 34:2
(n = 42, blue) and isoleucine (n = 6, grey). The bar chart shows the
distribution from the fully labeled isotopologue (M′) to
M′ – 4 for both molecules. The difference of the fully
labeled isotopologue from 100% is already 12% for the 98% labeled isoleucine
and 58% for PC 34:2. But even for a better enrichment (99%), the error
for PC 34:2 is still 36%, highlighting the importance of considering
the relative abundance in quantification workflows.
50
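The binomial calculation behind Figure 3 can be reproduced in a few lines. The sketch below considers carbon only (neglecting the small H, N, and O contributions mentioned in the caption), which is why its values deviate slightly from those in the figure:

```python
# Carbon-only isotopologue distribution of a uniformly 13C-labeled compound
# (sketch of the binomial calculation in Figure 3; contributions of H, N,
# and O isotopes are neglected here).
from math import comb

def isotopologue_abundances(n_carbons, enrichment):
    """Relative abundances of M', M'-1, ..., M'-n for enrichment p."""
    p = enrichment
    return [comb(n_carbons, k) * p ** (n_carbons - k) * (1 - p) ** k
            for k in range(n_carbons + 1)]

for name, n in [("isoleucine", 6), ("PC 34:2", 42)]:
    for p in (0.99, 0.98):
        fully_labeled = isotopologue_abundances(n, p)[0]
        print(f"{name} at {p:.0%} enrichment: M' = {fully_labeled:.1%}")
# isoleucine at 98%: M' ≈ 88.6% (~11% below the naive 100% assumption)
# PC 34:2 at 98%:    M' ≈ 42.8%; at 99%: M' ≈ 65.6%
```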
Suite of In Vivo Synthesized Isotopically Labeled Materials
In the last
decades, labeled organisms such as bacteria, yeast, or plants have
been grown to create huge libraries of stable isotope labeled (13C, 15N, 34S, 2H)
endogenous
metabolites.
51−53
Controlled growth of E. coli or Pichia pastoris was
particularly successful, as enrichment degrees higher than 99% were
achieved, leading to the simultaneous production of hundreds of biologically
relevant labeled metabolites covering the highly conserved primary
metabolome.
51,54−56
Some of the
materials are already commercially available (such as labeled E. coli, yeast, and
algae products; details
on the materials can be found in Table 1
).
Table 1. Overview on Labeled Biomass Materials

Organism | Kingdom | Isotope | Enrichment degree | Feed | Reference
Escherichia coli | Bacteria | 13C | >98% | Glucose | Mahieu and Patti 2017 (57)
Escherichia coli | Bacteria | 15N | >98% | (NH4)2SO4 | Krüger et al. 2008 (43)
Arthrospira platensis (spirulina) | Bacteria | 13C | >97% | CO2 | Berthold et al. 1991 (58)
Chlamydomonas reinhardtii (alga) | Protist | 13C | >98% | CO2 | Behrens et al. 1994 (59)
Chlorella vulgaris (alga) | Protist | 13C | >98% | CO2 | Behrens et al. 1994 (59)
Nannochloropsis oculata (alga) | Protist | 13C | >85% | CO2 | Doomun et al. 2020 (60)
Pichia pastoris (yeast) | Fungi | 13C | >98% | Glucose | Neubauer et al. 2012 (56)
Pichia pastoris (yeast) | Fungi | 34S | >95% | Na2SO4 | Hermann et al. 2016 (61)
Saccharomyces cerevisiae (yeast) | Fungi | 15N | >94% | (NH4)2SO4 | Krüger et al. 2008 (43)
Fusarium graminearum | Fungi | 13C | >99.5% | Glucose | Bueschl et al. 2014 (52)
Arabidopsis thaliana | Plantae | 13C | >95% | CO2 | Giavalisco et al. 2009 (53)
Triticum durum (wheat) | Plantae | 13C/15N | >96%/>95% | CO2/NO3 salts | Ćeranić et al. 2020 (62)
Caenorhabditis elegans (worm) | Animalia | 15N | >98% | E. coli | Krüger et al. 2008 (43)
Drosophila melanogaster (fly) | Animalia | 15N | >94% | S. cerevisiae | Krüger et al. 2008 (43)
Rattus norvegicus domestica (rat) | Animalia | 15N | >94% | Spirulina | McClatchy et al. 2007 (63)
Mus musculus (mouse) | Animalia | 13C | 6–75% | Ralstonia eutropha | Dethloff et al. 2018 (64)
Homo sapiens (HeLa cells) | Animalia | 2H | 0–5% | 5% D2O | Kim et al. 2019 (31)
Homo sapiens (HCT116 cells) | Animalia | 13C | 0–99% | Glucose and AAs | Grankvist et al. 2018 (65)
The list of
labeled organisms keeps growing. For example, uniformly 13C-labeled lipids derived
from the microalga Nannochloropsis
oculata were measured via MS/MS to calculate 13C enrichment for both the whole molecule
and the different building
blocks of a lipid.
60
Such information can
be useful to follow labeling of the head group versus fatty acids
and might help to study lipid synthesis and remodeling processes.
Advances in stable isotope labeling of plants using customized closed
growth chambers enabled an increase of the enrichment degree to 96–98%
for 13C and 95–99% for 15N, adding a complex
compound panel of primary and secondary metabolites.
62
Still missing is a fully labeled mammalian organism. The
complex feed or media and the resulting high costs limit the production
to partial labeling approaches, which have been used successfully
for relative quantification. For example, growing HeLa cells on a
5% deuterium oxide enriched medium, together with a deconvolution algorithm
facilitating classical isotope dilution approaches, enabled improved
relative quantification for lipids.
Even
mice can be partially labeled (6–75% enrichment depending on
the metabolite) by feeding a commercially available 13C-labeled bacterial diet (Ralstonia
eutropha). This
strategy was also applied for relative quantification, improving precision
from 27% to less than 10%.
64
Table 1
summarizes the labeled biomass
materials, the isotopes used, enrichment degrees, feeds, and literature references.
Applications of Stable Isotope Labeled Biomass
Isotopically
labeled biomass has three major applications in metabolomics and lipidomics,
namely (1) credentialing by identification of biological metabolites
using labeled and nonlabeled metabolite pairs, (2) validation of isotopologue
distributions, and (3) standardization and normalization for quantification
workflows.
Credentialing: Isotopically Labeled Biomass for Identification
Credentialing-type approaches involve the analysis of samples containing
analytes in an unlabeled as well as a stable-isotope labeled form.
Mixing of extracts from uniformly labeled organisms with those from
unlabeled organisms allows us to distinguish metabolic features with
biological origin from background contaminants by the occurrence of
shifted m/z and MS/MS spectra and, in approaches implementing liquid
chromatography (LC–MS), also matching RTs. An early application
of comprehensive incorporation of stable isotope labeled biomass was
published by Giavalisco et al.,
53
who applied 13C labeling of Arabidopsis thaliana in order
to recognize biological features and improve the molecular formula
annotation of their flow injection (FI-) Fourier-transform ion cyclotron
resonance (FTICR) and reversed-phase (RP)-LC-FTICR analysis. The first
open-source software MetExtract, capable of automating the assignment
of LC-MS peaks originating from 13C-labeled compounds to
their endogenous counterparts, was published by Bueschl et al.
66
Later, other tools mostly relying on differential
incorporation of isotopic labels into metabolites have been introduced,
which simplify this type of analysis and include tracer analysis (MAVEN,
67
mzMatch-ISO,
68
X13CMS,
69
isoMETLIN,
70
george,
71
and ALLocator
72
).
The isotopic ratio outlier analysis (IROA) approach demonstrated the
introduction of highly specific isotopologue patterns to further improve
specificity and quantification capabilities using labeled organisms.
73
In 2014, Mahieu et al.
74
coined the term “credentialing” and further emphasized
the importance of this type of approach for the recognition of real
biological features and the comparison and fine tuning of metabolomics
workflows. Later they used stable isotope labeling combined with other
feature grouping and noise removal approaches to show that the number
of biological features in an E. coli extract can account for less than 5% of all features
detected via
nontargeted peak detection.
57
MetExtract
was later updated to MetExtract II to remove mismatches and group
different ion-species as well as employ stable isotope patterns for
the purpose of LC-MS peak detection, annotation/noise removal in fragmentation
spectra, molecular formula elucidation, and isotopic tracer studies.
75
This presented a significant step in harvesting
the full potential of stable isotope labeling. In 2019, Wang et al.
employed not only 13C but also 15N isotopically
labeled organisms (Saccharomyces cerevisiae and E. coli).
76
As
in the original credentialing approach, they combined stable isotope
labeling with other noise reduction and feature grouping approaches
in order to recognize biological features. Using this approach, they
found a comparable number of biological features (only 4% of the peaks
were annotated as apparent metabolites). Moreover, systematic annotation
of peaks and discrimination of biological compounds (including isotopic
variants) from adducts, fragments and MS artifacts was established.
In fact, the correct annotation of adducts was identified as a
major bottleneck for elucidating the number of true sample molecules.
Subsequently, the integration of stable isotope labeled buffers
in LC-HRMS improved cost efficiency and introduced a universal stable
isotope labeling approach for the corroboration and annotation of
real chemical features in any kind of sample.
77
The disadvantage of doubled measurement time is compensated by the
comparable performance (for noise removal and annotation) to other
credentialing approaches.
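The core pairing logic shared by these credentialing approaches can be sketched as follows; published tools such as MetExtract II implement far more (isotopologue patterns, RT alignment, MS/MS checks), and the feature list here is purely hypothetical:

```python
# Sketch of the core pairing step in credentialing: find unlabeled/13C-labeled
# feature pairs that co-elute and differ by n * (13C - 12C) mass units.
C13_SHIFT = 1.003355  # Da per incorporated 13C atom

features = [  # (m/z, retention time in s) from a mixed U-12C/U-13C extract
    (180.0634, 312.0), (186.0835, 311.8),   # hexose pair, n = 6
    (132.1019, 255.1), (138.1220, 255.3),   # leucine pair, n = 6
    (279.1591, 401.2),                      # unpaired -> likely background
]

def credential(features, n_carbons, mz_tol=0.005, rt_tol=2.0):
    """Return feature pairs consistent with n_carbons 13C labels."""
    pairs = []
    for mz_l, rt_l in features:
        for mz_h, rt_h in features:
            if (abs(mz_h - mz_l - n_carbons * C13_SHIFT) < mz_tol
                    and abs(rt_h - rt_l) < rt_tol):
                pairs.append(((mz_l, rt_l), (mz_h, rt_h)))
    return pairs

print(credential(features, n_carbons=6))  # returns the two credentialed pairs
```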
Isotopically Labeled Biomass for Validated
Isotopologue Distribution Elucidations
Another way to harvest
stable-isotope labels in metabolomics is the investigation of differential
incorporation of labels into organisms, which is comprehensively reviewed elsewhere.
38−40
However, we want to highlight the application of labeled biomass
with controlled labeling pattern
78−80
to validate isotope
tracer analysis workflows.
81
In the past,
it was shown that 13C tracer and flux experiments demand
dedicated validation tools. Spectral accuracy, i.e. an instrument’s
ability to truly measure the fractional abundance of the different
isotopologues, is crucial. Metabolite standards with natural isotopic
pattern (as well as fully labeled standards) are not well suited to
assess the accuracy of carbon isotopologue distribution in tracer
studies. Due to the low natural abundance of 13C, heavy
natural isotopologues are below the limit of detection. Using in vivo synthesis, tailored
carbon isotopologue distributions
of primary metabolites can be obtained, which serve as ideal references.
The isotopologue distributions of stable isotope-labeled compounds
can be assessed with excellent precisions of <1% and trueness bias
as small as 0.01–1%.
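As an illustration, spectral accuracy against such a tailored reference distribution could be assessed as in the following sketch; the certified and measured fractional abundances are hypothetical:

```python
# Sketch: assessing spectral accuracy against a reference carbon
# isotopologue distribution (CID) from a tailored labeled standard.
import numpy as np

reference = np.array([0.10, 0.20, 0.40, 0.20, 0.10])   # certified CID (M+0..M+4)
replicates = np.array([                                 # measured fractions
    [0.101, 0.198, 0.402, 0.199, 0.100],
    [0.099, 0.201, 0.399, 0.201, 0.100],
    [0.100, 0.199, 0.401, 0.200, 0.100],
])

mean = replicates.mean(axis=0)
bias = (mean - reference) / reference * 100           # trueness bias, %
rsd = replicates.std(axis=0, ddof=1) / mean * 100     # precision, % RSD
print("bias %:", np.round(bias, 2), "RSD %:", np.round(rsd, 2))
```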
Isotopically Labeled Biomass
for Quantification
Starting in the 1980s, stable isotope-labeled
ISTDs and isotope dilution approaches in combination with LC- and
gas chromatography (GC)-MS/MS were used to improve quantification
of small molecules.
37
In metabolomics,
internal standardization is widely adopted for absolute quantification,
as the analytical process consists of multiple steps and requires
normalization. Chemical synthesis of isotope-labeled standards is prohibitive for
omics-type analysis, as hundreds of ISTDs would be required, making
isotopically labeled biomass a promising alternative. These cost-effective in vivo synthesized
metabolite standards are characterized
with respect to their isotope labeling degree but not their concentrations.
Thus, normalization between samples (relative quantification) or internal
standardization of external calibration (absolute quantification)
47,48,54,56
is accomplished by spiking known amounts of labeled biomass into
the samples. The benefits of these quantification workflows are well
documented. Overall, improved analytical figures of merit (trueness,
precision, and linearity) have been observed upon the integration
of labeled yeast extracts.
54,55,82
The use of HRMS together with stable isotope labeled standards supports
workflows merging absolute quantification and nontargeted unanticipated
discoveries (relative quantification and annotation) in one analytical
run. This powerful strategy has been addressed in metabolomics and
lipidomics.
54,82
In lipidomics, only a slight
decrease of identified lipids (∼10%) was observed in the presence
of labeled biomass.
82
This can be explained
by ion competition in complex matrices when applying data-dependent
fragmentation and can be further optimized by deep metabolite profiling
or data-independent acquisition. Stable isotope labeled materials
used as intermediates have to be chosen on the basis of sufficient metabolite/lipid
class coverage and biomass availability/costs. Labeled yeast, e.g. P. pastoris, offers
a reasonable compromise for quantitative
studies, as it is a eukaryotic organism that can be easily cultivated
under controlled conditions on a sole carbon source. Yeasts share
a high metabolome and lipidome overlap with humans, including the evolutionarily
conserved primary metabolome, e.g. amino acids, nucleotides, organic
acids, and metabolites of the central carbon metabolism. Lipids are
also covered, as shown by Natter et al.
83
Wolrab et al. summarized the most frequently up- and downregulated
lipids in oncology including the classes phosphatidylcholines (PC),
phosphatidylethanolamines (PE), phosphatidylinositols (PI), phosphatidylserines
(PS), lysophosphatidylcholines (LPC), lysophosphatidylethanolamines
(LPE), lysophosphatidic acids (LPA), free fatty acids (FA), triacylglycerols
(TG), diacylglycerols (DG), cholesterol esters (CE), sphingomyelins
(SM), ceramides (Cer), monosialodihexosylganglioside (GM3), and sulfatides
(SHexCer) in both tissue and body fluids,
84
and except for CE, SM, GM3, and SHexCer, all of the listed classes
are present in yeast. In the past, P. pastoris yeast extracts were successfully spiked
to human plasma (including
Standard Reference Material (SRM) 1950 from the National Institute
of Standards and Technology (NIST), USA), different cell extracts,
and yeast, either as ISTD based on ethanolic extracts or chloroform
based lipidome isotope labeling of yeast (LILY) extracts, for metabolites
and lipids, respectively (Figure 4
A, B).
Figure 4
Current in-house library of annotated metabolites and
lipids found in Pichia pastoris (yeast). (A) Metabolite
classes in ethanolic yeast extract
85
classified
using the ClassyFire
86
annotation system.
(B) Lipid classes annotated in chloroformic yeast extract.
87
GPL, glycerophospholipids; GL, glycerolipids;
SL, sphingolipids; ST, sterols; PR, prenols; Hex1Cer, hexosyl ceramides;
SPH, sphingosine bases; SE, steryl esters; Co, coenzyme Q; PG, phosphatidylglycerols;
PA, phosphatidic acids; CL, cardiolipins.
At present, a library of 206 metabolites for the ethanolic
yeast extract covering the classes of (1) organic acids and derivatives,
(2) nucleosides, nucleotides, and analogues, (3) lipids and lipid-like
molecules, (4) organic oxygen compounds, (5) organoheterocyclic compounds,
(6) organic nitrogen compounds, and (7) benzoids is established (Figure 4
A). All of the identified
metabolites were also present in the Human Metabolome Database (HMDB).
88
This can be in part attributed to the human
microbiome, but also to the evolutionary (inter-species) conservation
of the primary metabolome. With regard to the yeast and human lipidome,
major differences exist including a different sphingoid base—SPH
18:0;3 instead of SPH 18:1;2—as well as other sphingolipid
classes (inositol phosphoceramide (IPC), mannosylinositol phosphoceramide
(MIPC), and mannose-bis(inositolphospho)ceramide (M(IP)2C)) instead
of SM, ceramide 1-phosphates (CerP), and gangliosides. Yeasts also
contain a smaller diversity of fatty acids, with a maximum of three
double bonds and a lack of highly polyunsaturated fatty acids (PUFAs).
Furthermore, no ether lipids (plasmanyl (ether bond), plasmenyl (vinyl
bond)) are present and cholesterol is replaced by ergosterol in yeast.
Overall, this leads to a list of 405 lipid species (Figure 4
B) combining information from
reports on LILY from chloroform extracts by RP-LC-MS
82
and an improved preparative supercritical fluid chromatography
(SFC) workflow.
87
Optimized extraction
strategies and confirmation by authentic standards can further increase
the metabolite and lipid list in yeast.
89
Here, we want to emphasize the possibility of class- or retention-time-specific
standardization: by using labeled compounds as
class- or retention-time-specific ISTDs when the target metabolite or lipid is not
present in the yeast extract,
90
the list
of possible analytes in a quantitative approach can be further enlarged
and adapted to the sample of interest.
Harmonization
and Reference Materials
Joint efforts toward harmonized metabolomics
protocols and the definition of minimum quality requirements
are of paramount importance. There is a vibrant scientific community
working toward harmonization to raise the transparency and quality of
published results.
17,91−95
Standardized methods and reference materials provide
benchmarks, paving the way to reproducibility and most importantly
interassay commutability, with regard to both targeted and nontargeted
analysis.
Reference Materials and Interlaboratory Comparisons
Certified reference materials represent the highest metrological
order benchmarks enabling traceable and accurate quantification in
metabolomics workflows. Certification requires an inherently long
lead time, as composition and quantitative values are reported with
characterized uncertainty and stability. Certified reference materials
are provided by metrological institutions or by accredited material
producers. While the application of (certified) reference materials
in absolute quantification is well established, their integration
for nontargeted metabolomics is emerging. A recent multi-platform
study by hydrophilic interaction liquid chromatography (HILIC)/RP-LC
HRMS
96
demonstrated the power of using
high-quality benchmarks in large-scale nontargeted metabolomics. Three
pooled human plasma reference materials (Qstd3, 211 CHEAR, NIST SRM
1950) were repeatedly measured along with 3600 samples over a period
of 17 months, providing a convincing strategy for data normalization
and approximate concentration levels.
As the pace of standard
production suitable for omics-type research in national metrological
institutions is slow, international ring trials/interlaboratory initiatives
drive standardization by offering measurement protocols and consensus
values for biological matrix materials which can be distributed to
the community. For the widely adopted NIST reference material human
plasma SRM 1950, the number of consensus values assessed by international
ring trials is continuously growing. Consensus values for 250 metabolites
(amino acids, biogenic amines, acylcarnitines, glycerolipids, glycerophospholipids,
cholesteryl esters, sphingolipids, hexoses) were assessed on the basis
of the Biocrates AbsoluteIDQ p400 HR kit.
97
Interlaboratory
comparisons are of paramount importance in lipidomics, since reference
materials are lacking. In 2017, an international ring trial provided
consensus values for 339 lipids (from the major categories: fatty
acids, glycerolipids, glycerophospholipids, sphingolipids, sterols)
in SRM 1950.
98
Recently, Triebl et al.
99
further emphasized the need for reference samples
by showing that lipidomics workflows continue to suffer from limitations
associated with reproducibility and commutability of quantitative
data from different platforms, even when isotopically labeled ISTDs
were included. The authors compared direct infusion, HILIC, and RP-LC-MS
workflows for lipid analysis, showing that upon normalization to the
reference sample SRM 1950, platform-dependent quantitative bias was
successfully removed.
99
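The underlying normalization idea can be sketched in a few lines; the per-platform readings below are hypothetical, and the actual study applied full lipidomics workflows rather than single values:

```python
# Sketch: removing platform-dependent bias by normalizing each platform's
# study-sample results to its own measurement of a shared reference
# material (e.g., SRM 1950). All values are hypothetical.
srm1950_consensus = 1.00          # consensus value for one lipid (µmol/L)

platforms = {                     # per platform: (SRM 1950 result, sample result)
    "direct infusion": (1.20, 2.40),
    "HILIC-MS":        (0.85, 1.70),
    "RP-LC-MS":        (1.05, 2.10),
}

for name, (srm_measured, sample) in platforms.items():
    normalized = sample * srm1950_consensus / srm_measured
    print(f"{name}: {normalized:.2f} µmol/L")  # all converge to 2.00
```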
The frequent use
of SRM 1950 in both metabolomics and lipidomics studies
96,97,100,101
highlights its key role as a reference point for merged workflows.
Another recent interlaboratory study tested seven distinct materials
including human urine pools from four SRMs and one research-grade
test material (RGTM) provided by NIST.
102
Untargeted analytical profiles for these materials were obtained
using a variety of common metabolomics platforms (nuclear magnetic
resonance (NMR), GC- and LC-MS), leading to the conclusion that all
platforms were able to detect compositional differences despite some
platform-dependent differences.
Community-Based Guidelines
in Metabolomics
Community guidelines on how to report and
perform metabolomics workflows form the basis of standardization.
The Metabolomics Standardization Initiative (MSI) of the Metabolomics
Society
91
has worked intensively on definitions
and guidelines considering all steps of the targeted and nontargeted
analytical process for many years. This includes defining the analytical
task, sampling/analysis of data standards, data evaluation, and reporting.
17,93
The metabolomics community is currently revisiting the standards
of metabolite reporting based on the state-of-the-art level-of-confidence
scale
94
(1–3), introducing new subclasses
(A–F) for unambiguous metabolite identification, such as cis/trans configuration information.
In October 2020, a
new guideline on lipid classification, nomenclature, and shorthand
notation was published
95
including major
changes for the annotation of double bond equivalents and the number
of oxygens as well as newly delineated oxygenated lipid species. Figure 5
shows the metabolite
and lipid identification ranking according to the newly proposed guidelines
of the metabolomics community.
Figure 5
Metabolite (left) and lipid (right) identification
according to the proposed guidelines of the Metabolomics Society (A–G)
using the examples of leucine and a PC 18:0/16:2(7E,11Z)[R]. The lowest
annotation level corresponds to known accurate mass information (G)
followed by a known compound class (F), known compound sum formula
(E), known functional moieties (D), known structure (isoleucine)/double
bond position (PC 18:0/16:2(7,11)) (C), known diastereomer (B), and
the highest level corresponding to enantiomer-specific identification (A). *In lipidomics
105
3 intermediate steps are distinguished at level
D: sum of carbon and double bond number for all fatty acyl chains
(PC 34:2)/known distribution (PC 18:0_16:2) and known position of
the fatty acyl chains (PC 18:0/16:2).
Updated metabolomics repositories such as MetaboLights
103
provide openness and transparency of reported
data sets. These repositories will be essential for the development of
community-based benchmark materials and will facilitate the adoption
of accepted guidelines.
Instrument-dependent compound identification
workflows complicate cross-platform evaluations and call for harmonization
of reference libraries. A recent European interlaboratory study published
harmonization guidelines for acquisition and processing of tandem
MS data. Interestingly, it also revealed that under certain collision
energies, time-of-flight (TOF) and Orbitrap fragmentation spectra
are comparable.
104
Quality Control and Benchmarking
QC and normalization strategies are essential for successful large-scale
studies. Normalization can be performed data-driven via QC samples
or via ISTDs and is extensively summarized elsewhere.
106−109
In large-scale metabolomics and lipidomics studies, the concept
of a pooled sample for QC has gained worldwide acceptance, also allowing
us to correct for intra- and interbatch variations and to accomplish
MS/MS measurements required for annotation.
109,110
However, the production of sufficient amounts of pooled samples
can be problematic for multicenter studies in clinical metabolomics.
Additionally, if only one sample pool including all sample groups
is produced, dilution effects can mask low-abundance metabolite signals.
The production of QCs for each group represents an alternative; however,
in some cases preparing a pooled sample is simply impossible. For
example, in many large-scale investigations such as longitudinal clinical
studies or population profiling, not all samples are available at
the beginning of the analysis. Alternatively, multistandard mixes
of metabolites and/or lipids are established reference samples, which
can be either custom-made in the lab or ordered as commercially
available stocks, e.g. LSMLS or MSMLS (from IROA), including 400 metabolites
(1 mg each) or 600 metabolites (5 μg each) per well
plate. Lipid-specific kits are also offered, e.g. AbsoluteIDQ (from
Biocrates) including 180 or 400 lipids. More recently, lipid mixes
with matrix-specific concentrations have become commercially available, e.g.
SPLASH LIPIDOMIX (Avanti) products, which include one deuterated ISTD
for each major lipid class at ratios relative to human plasma. Another
possibility is to take deuterated standards from the UltimateSPLASH
(Avanti) panel from different lipid classes to prepare a customized
lipid mix. These valuable standard panels offer reference materials
for streamlined validation protocols and accelerate harmonization.
However, it should be emphasized that harmonization efforts enabled
by reference standard mixes and kit-type analysis will not replace
certified reference materials, which are fully traceable. Recently,
the concept of a cheap and easily accessible biological benchmark
material was proposed for metabolomics and lipidomics. The idea was
resumed from proteomics, where HeLa cell extracts have become the
gold standard for benchmarking instrument performance and proof-of-principle
experiments upon introduction of new analytical methods.
111−115
Yeast ethanolic extracts with a characterized metabolome not only
enable testing of the chemical space and coverage upon method implementation
and development but also enable in-house routines for instrumental
performance tests with additional potential for batch-to-batch corrections
in large-scale nontargeted metabolomics studies. The benchmark material
is obtained from P. pastoris from
fully controlled fermentations, which can be easily reproduced in
a lab with fermentor access.
85
Additionally,
these extracts are commercially available in both endogenous
and 13C-labeled form. An open-source yeast metabolite
and lipid library has been established for the material. All reported compounds
are listed in the Human Metabolome Database, showing once more
that yeast is a cost-effective benchmark material for human metabolomics.
Of the 206 metabolites, 104 were stable for several years when stored
in aliquots at −80 °C.
85
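As an illustration of QC-driven intra-batch correction, the following sketch anchors a trend on pooled-QC injections and divides all injections by it; real pipelines typically use LOESS or spline fits, while this dependency-light version uses linear interpolation on hypothetical intensities:

```python
# Sketch of QC-anchored signal-drift correction: fit the trend of one
# feature's intensity across pooled-QC injections, then divide all
# injections by the interpolated trend.
import numpy as np

injection_order = np.arange(1, 13)
intensity = np.array([1000, 990, 1015, 960, 955, 940,
                      930, 900, 910, 880, 870, 850], dtype=float)
qc_idx = np.array([0, 3, 6, 9, 11])  # positions of pooled-QC injections

# Piecewise-linear drift estimate anchored on the QC injections only
trend = np.interp(injection_order, injection_order[qc_idx], intensity[qc_idx])
corrected = intensity / trend * np.median(intensity[qc_idx])
print(np.round(corrected, 1))  # drift across the batch is flattened out
```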
Nontargeted Data Analysis—Increasing Quality by Multiple Lines
of Evidence
Nontargeted metabolomics workflows consist of
key steps that need to be addressed individually with regard to standardization.
The first step of a nontargeted experiment involves the analytical
process aspects, discussed in several reviews.
1,116,117
Data analysis constitutes the most time
consuming and complex step of nontargeted experiments. Many tools
and approaches are available for this process and have been summarized
extensively.
107,118−120
More specifically, data analysis follows stepwise data preprocessing,
feature table processing, statistical analysis (feature prioritization
and biomarker elucidation), annotation, and biological contextualization
like pathway mapping and integration with other omics data, all of
which (with the exception of statistical analysis) are discussed in
the following. We will emphasize the multiple strategies of corroborating
nontargeted read-outs and deliberately focus on different aspects
that improve quality.
Data Preprocessing
Data preprocessing
(DPP) presents the first major challenge in nontargeted metabolomics
since it facilitates the translation of raw data into the less complex
format of so-called feature tables. While approaches enabling metabolomics
DPP keep being improved, the general steps have remained unchanged
across different tools (Figure 6
) (with very few exceptions as in ref (121)). However, despite this
fact and the development of different DPP parameter optimization tools
122−124
DPP often suffers from extensive problems. These include false negative
and false positive reports of ion species as well as wrongly reported
abundance values, among other issues.
125−128
It should be noted that data
preprocessing is not challenging because it is hard to perform, but
because it is hard to perform well. This point was laid out by Sindelar
et al., who demonstrated how poor data preprocessing performance
can severely complicate downstream data analysis.
129
It is therefore essential to control the effectiveness
of this process.
Figure 6
General steps of nontargeted data preprocessing.
There are a number of advances we would like to
highlight in this context. One recent R package, named patRoon, combines
different data preprocessing and annotation algorithms into a single
framework and thereby allows us to build pipelines in the R-environment.
130
This increases flexibility in data processing
choices considerably since it allows us to combine the strengths of
many different tools and to compare them more easily. It is worth
noting that patRoon supports any HRMS platform and incorporates algorithms
from many widely used tools such as ProteoWizard
131
and XCMS.
132
Two other tools which
should be noted here are NeatMS
133
and
MetaClean,
134
which are based on deep learning
and machine learning, respectively. Both tools allow us to comprehensively
assess the peak picking quality as conducted via different tools for
experimental datasets.
133
To the best of
our knowledge these recently published works represent the only available
tools to comprehensively assess peak picking quality for all picked
peaks, which poses a significant advancement. However, RT alignment
and false negatives are not considered in this approach, which makes
further development necessary. To address this need, mzRAPP was introduced,
a tool enabling reliability assessment of different nontargeted data
preprocessing steps (under submission). It is based on automatically
validated and extended benchmarks (starting from user supplied integration
boundaries per molecular formula) and allows us to derive different
performance metrics including the proportion of false negatives, affected
isotopic ratios, and the number of alignment errors for nontargeted
DPP of any experimental datasets. It is worth noting that the use
of benchmark datasets in this context enables us to investigate the
number of false negative peaks, as they provide a so-called “ground
truth” as a reference point. While this also offers several
other apparent advantages for the benchmarking of different DPP tools,
135
benchmark datasets in metabolomics come with
significant problems. First, their curation process requires extensive
manual work and is hugely time intensive (although some do exist;
e.g., ref (136)). This,
in turn, implies that it is impractical to create benchmarks for different
types of datasets (e.g. sample complexities or choices in instrumentation
or acquisition mode such as RP-LC, HILIC, orbitrap MS, TOF MS) which
might imply different needs for applied DPP software. Secondly, it
can be problematic to consider benchmark datasets as “ground
truth” without sufficient validation. mzRAPP tackles this problem
by automatically applying a number of validation metrics to check
the consistency of user supplied benchmark candidates.
Elimination
of Redundancies and Noise from Feature Tables
As discussed,
nontargeted data preprocessing of LC-HRMS data generally leads to
aligned feature tables. Ideally (when bioinformatic noise is not
considered), rows in those feature tables correspond to chromatographic
peaks with specific mz@RT values in different samples. Hence, each
mz@RT value ideally reflects an ion species originating from a sample
molecule eluting from the chromatographic dimension and being ionized
in the electrospray. However, preprocessing workflows typically introduce
significant numbers of bioinformatic noise features into data sets.
In this context, we would like to highlight three recently published
tools allowing us to remove those noise features from datasets. MetProc
allows us to remove features based on missing value structures in
QC samples.
137
Another tool, genuMet,
relies solely on injection order to identify false positive features,
without requiring measured QC samples.
138
Finally MS-CleanR has been added to the MS-DIAL
139
workflow, allowing us to (among other things not discussed
here) filter features based on blank signals, background drifts, unusual
mass decimals and relative standard deviations (RSDs).
140
Since all of those tools offer slightly different
approaches, their compatibility for different data sets remains to
be elucidated.
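To illustrate the kind of filters these tools apply, the sketch below combines a blank-fold filter with a pooled-QC RSD filter; thresholds and intensities are hypothetical and not taken from any of the cited tools:

```python
# Sketch of two common feature-table filters (conceptually similar to the
# blank- and RSD-based filters mentioned above; not any tool's actual code).
import numpy as np

def keep_feature(sample_int, qc_int, blank_int,
                 blank_fold=3.0, max_qc_rsd=30.0):
    """Keep a feature if its mean sample intensity exceeds blank_fold x the
    mean blank intensity and its pooled-QC RSD is below max_qc_rsd (%)."""
    sample_int, qc_int, blank_int = map(np.asarray, (sample_int, qc_int, blank_int))
    above_blank = sample_int.mean() > blank_fold * max(blank_int.mean(), 1e-12)
    qc_rsd = qc_int.std(ddof=1) / qc_int.mean() * 100
    return above_blank and qc_rsd < max_qc_rsd

print(keep_feature([5e5, 4e5, 6e5], [5.1e5, 4.9e5, 5.0e5], [1e4, 2e4]))  # True
print(keep_feature([5e4, 4e4, 6e4], [9e4, 3e4, 5e4], [4e4, 3e4]))        # False
```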
Over the last few years, many authors
have discussed the challenge that the number of reported ion species
cannot be directly translated into the number of sample molecules.
11
This is due to bioinformatic noise and because
one molecule will form multiple ion species due to the presence of
different isotopologues and adducts. It has been reported that a single
metabolite can lead to more than 100 different ion species during
the ionization process.
141
More recently,
it was also shown that adduct species differ significantly in HILIC
compared to RP chromatography.
142
The same
work also highlighted the problem of in-source fragmentation, which
poses a significant risk for wrong annotation.
Over the years
a number of approaches have been developed to group those different
ion-species in order to eliminate redundancies or even gain additional
reliability for annotations. Many tools enabling this and other important
data analysis steps are summarized elsewhere.
120
An interesting experimental approach which has been shown
to allow improved and simplified annotation of adducts has been to
measure samples twice with different LC-MS buffer compositions (14NH3–acetate and
15NH3–formate buffer)
77
(in fact, this
approach has also been used by ref (142)). In both conducted studies this approach
showed
great potential for annotating adducts and eliminating noise. Unlike
credentialing approaches
74,143
this workflow is applicable
to any sample, even if it cannot be labeled via stable isotopes. However,
as it requires two measurements for each sample, it dramatically increases
measurement time and might not be applicable to small sample volumes.
Nevertheless, this approach presents significant improvement in increased
control over noise reduction and adduct annotation.
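The basic grouping logic behind adduct annotation can be sketched as follows: co-eluting features are tested for m/z differences consistent with two known adducts of the same neutral mass. The adduct masses are standard positive-mode values; the features are hypothetical:

```python
# Sketch: annotating co-eluting features as adducts of the same molecule by
# matching m/z differences against known adduct mass offsets (positive mode).
ADDUCT_MASS = {"[M+H]+": 1.00728, "[M+Na]+": 22.98922, "[M+NH4]+": 18.03383}

features = [(181.0707, 300.1), (203.0526, 300.0), (198.0972, 300.2)]  # (m/z, RT s)

def annotate_adducts(features, mz_tol=0.005, rt_tol=2.0):
    groups = []
    for i, (mz_i, rt_i) in enumerate(features):
        for j, (mz_j, rt_j) in enumerate(features):
            if i >= j or abs(rt_i - rt_j) > rt_tol:
                continue
            for a, ma in ADDUCT_MASS.items():
                for b, mb in ADDUCT_MASS.items():
                    # both point to the same neutral mass M if mz_i - ma == mz_j - mb
                    if a != b and abs((mz_i - ma) - (mz_j - mb)) < mz_tol:
                        groups.append((i, a, j, b, round(mz_i - ma, 4)))
    return groups

for g in annotate_adducts(features):
    print(g)  # all three share neutral mass ~180.063 (e.g., a hexose)
```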
The Annotation
Task
In metabolomics the term annotation refers to the assignment
of molecular information to features. This information can involve
details on contributing atoms (molecular formula e.g. C6H12O6), structural class (e.g.
steroid), atomic
connections (e.g. phenylalanine), relative stereochemistry (e.g. leucine
or isoleucine) or chirality (e.g. d-leucine). Different approaches
allow us to collect evidence for the assignment of a feature at any
of those levels. In fact, the Metabolite Identification Task Group
of the Metabolomics Society has proposed reporting standards for different
levels of identification depending on the nature of collected evidence
(Figure 5
shows the
proposed metabolite annotation). While those standards define
specific types of evidence which have to be collected for a level
to be reached (e.g. matching of acquired MS/MS scans against a mass
spectral library), there are no consensus criteria for the necessary
strength of collected evidence (e.g. what constitutes a valid spectral
match). In this context one of the most discussed topics is the adaptation
of a false discovery rate (FDR) for spectral matching as it is routinely
applied in the proteomics field. Over recent years a range of different
strategies allowing us to apply this idea also in metabolomics has
been proposed or implemented.
18,144−147
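For orientation, the score underlying most spectral matches is a cosine similarity between aligned fragment lists, as in the minimal sketch below; production library searches use weighted or modified variants and, increasingly, the FDR estimation strategies cited above:

```python
# Minimal sketch of a spectral match score: cosine similarity between two
# centroided MS/MS spectra after aligning fragments within an m/z tolerance.
# (Library search engines typically use intensity-weighted variants.)
from math import sqrt

def cosine_score(spec_a, spec_b, mz_tol=0.01):
    """spec_a, spec_b: lists of (m/z, intensity) pairs."""
    shared = [(ia, ib) for mza, ia in spec_a for mzb, ib in spec_b
              if abs(mza - mzb) <= mz_tol]        # naive greedy alignment
    dot = sum(ia * ib for ia, ib in shared)
    norm = sqrt(sum(i**2 for _, i in spec_a)) * sqrt(sum(i**2 for _, i in spec_b))
    return dot / norm if norm else 0.0

query = [(86.0964, 100.0), (132.1019, 35.0), (69.0699, 20.0)]   # hypothetical
library = [(86.0964, 95.0), (132.1019, 40.0), (44.0495, 10.0)]
print(f"{cosine_score(query, library):.3f}")  # ~0.976
```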
While their actual application is still scarce, they definitely
pose a step toward increased reliability of annotations. Another point,
which needs to be considered in this regard is the nature of reference
spectra used for spectral matching. Until now, matching against experimental
spectral libraries has been considered the gold standard for this
kind of approach. Although spectral libraries have been growing to
impressive sizes (e.g. recently METLIN reached more than 850 K standard
spectra),
148
a recent evaluation on available
reference spectra from authentic chemical standards
149
regarding the coverage of different MS spectral libraries
in different genome scale metabolic models (GSMs) revealed that on
average only <40% of metabolites in the models are represented.
Meanwhile, in silico approaches such as MetFrag
150
(a combinatorial fragmenter) and machine learning based methods
such as CFM-ID
151
(an in-silico fragmenter)
and CSI:FingerID
152
(a structure predictor)
are increasingly accepted. This is mainly due to their increased
coverage of the molecular space since they do not rely on experimental
fragmentation data but molecular structure databases such as PubChem.
153
Indeed some of those can even go beyond that
(e.g. in combination with tools like EMMF
154
).
The advantage is evident since such structure databases
are many orders of magnitude larger than any spectral library. Indeed,
this might lead to an improved FDR when using this kind of approach
as compared to matching against a spectral library with less metabolic
coverage. Another strategy worth mentioning supports annotations by exploiting the reactivities of specific functional groups. Briefly, this involves the selective derivatization of functional groups (such as amines, carboxylic acids, alcohols, etc.), targeting metabolite subsets commonly referred to as sub-omes.
155
Derivatization improves
overall ionization efficiency and enables selective separation and
enrichment using reversed-phase stationary phases. Moreover, the production of sample-specific ISTDs is facilitated: blends of samples derivatized with isotopically labeled or unlabeled reagent, respectively, have served for relative as well as absolute quantification.
156
This also enables credentialing-type approaches
(as discussed above).
157
As a drawback,
these approaches take considerable effort in terms of data analysis.
Dedicated RT and spectral libraries for identification of derivatized
molecules (available for some derivatization strategies such as dansylation
158
) are required. It should be noted that derivatization
approaches reduce throughput and require dedicated validation, due
to challenges arising from matrix effects and decreased stability.
159
Hence, derivatization strategies can potentially bring many advantages but require extensive method development and validation work.
H/D exchange on
the other hand, is more straightforward in its application and can be integrated into existing data evaluation pipelines. Recently, there
have been significant advancements in infrastructure for this type
of analysis. For example, the software MetFrag supports H/D exchange
data.
160
Although H/D exchange only allows
us to investigate acidic moieties, its potential for annotation has
been shown in multiple studies.
161,162
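The arithmetic behind this readout is simple: every labile proton exchanged for deuterium shifts the neutral mass by the exact H/D mass difference. A minimal sketch (neutral monoisotopic masses assumed; deuterons introduced by the ionization itself are neglected here):

# Each exchangeable (labile) proton replaced by deuterium shifts the
# neutral mass by m(2H) - m(1H) = 2.0141018 - 1.0078250 = 1.0062768 Da.
MASS_SHIFT_PER_EXCHANGE = 1.0062768

def n_exchangeable_protons(mass_h, mass_d):
    """Estimate the labile-proton count from masses measured with
    H2O- vs D2O-based eluents (a deliberately simplified model)."""
    return round((mass_d - mass_h) / MASS_SHIFT_PER_EXCHANGE)

# Example: citric acid (4 labile protons: 3 COOH + 1 OH)
print(n_exchangeable_protons(192.0270, 192.0270 + 4 * 1.0062768))  # -> 4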
In
cases where the annotation strategies discussed above fail to deliver
the desired insight, novel approaches based upon complex bioinformatics
algorithms fill the void. These innovations utilize molecular networking
of fragmentation spectra (spectral similarity translated to biochemical
and chemical substructures) or machine learning algorithms. In this
context MS2LDA, which was initially published in 2016,
163
associates specific fragments and/or neutral losses with chemical moieties, thereby enabling the inspection of complex structural relationships between different unknown analytes. This algorithm has been further developed to now directly enable differential analysis of chemical substructures between different samples (such as investigations on the regulation of xenobiotic derivatives across different samples).
164
More recently, feature-based molecular networking, which allows the chromatographic and/or ion mobility dimension to be considered in this type of analysis, has been introduced.
165
This way, isomers
and in-source fragments can potentially be investigated. Another tool
we wish to highlight here is CANOPUS,
166
which classifies features via their MS/MS spectra even when existing
spectral libraries do not include MS/MS scans of the class in question.
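At the core of such networking lies pairwise spectral similarity. A minimal cosine-score sketch over centroided peak lists (the tolerance and the greedy peak pairing are simplifications; production tools use a modified cosine that additionally accounts for precursor mass shifts):

import math

def cosine_score(spec_a, spec_b, mz_tol=0.01):
    """Simplified cosine similarity between two MS/MS spectra,
    each given as a list of (mz, intensity) peaks."""
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    if not norm_a or not norm_b:
        return 0.0
    used, dot = set(), 0.0
    for mz_a, int_a in spec_a:
        for j, (mz_b, int_b) in enumerate(spec_b):
            if j not in used and abs(mz_a - mz_b) <= mz_tol:
                dot += int_a * int_b
                used.add(j)
                break
    return dot / (norm_a * norm_b)

Edges are then drawn between spectra whose score exceeds a chosen threshold, and the resulting connected components approximate molecular families.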
Annotation
in the Field of Lipidomics
The general annotation strategies
applied for metabolomics are often not applicable in nontargeted lipidomics.
This is reflected in a survey among lipidomics researchers from 2018
167
revealing that 60% of all researchers rely
mostly on manual (visual) annotation. Even though software tools are
available and commonly applied (e.g. LDA;
168
MS-DIAL,
169
LIFS software tools
170
), manual annotation remains an integral part
of lipid annotation highlighting the lack of adequate nontargeted
analysis tools in lipidomics. Most available software tools are based
on two approaches: library matching (MS-DIAL, LipidSearch, Greazy,
LipidDex, etc.) and decision rule-based identification (LDA, LipidXplorer,
LipidMatch, LipidHunter, etc.). Because the building blocks of lipids lead to distinct MS2 patterns within a given class, decision rule sets based on well-defined fragments (fragment rules) and their intensity relationships (intensity rules) can be defined for specific lipid classes.
171,172
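To make the notion of fragment rules concrete, a minimal sketch follows (the rule table is an illustrative subset, not a validated rule set; for instance, phosphatidylcholines in positive mode show the diagnostic phosphocholine head group fragment at m/z 184.0733, while phosphatidylethanolamines show a characteristic neutral loss of 141.0191):

# Minimal fragment-rule sketch for positive-mode MS2 (illustrative only).
FRAGMENT_RULES = {
    "PC": {"required_fragment": [184.0733]},  # phosphocholine head group
    "PE": {"required_loss": [141.0191]},      # loss of phosphoethanolamine
}

def matches_class(precursor_mz, ms2_peaks, lipid_class, tol=0.005):
    """ms2_peaks: list of (mz, intensity); True if all rules are met."""
    rule = FRAGMENT_RULES[lipid_class]
    mzs = [mz for mz, _ in ms2_peaks]
    for frag in rule.get("required_fragment", []):
        if not any(abs(mz - frag) <= tol for mz in mzs):
            return False
    for loss in rule.get("required_loss", []):
        if not any(abs((precursor_mz - mz) - loss) <= tol for mz in mzs):
            return False
    return True

Real rule sets additionally encode intensity relationships between fragments, which the decision rule-based tools listed above evaluate class by class.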
For library matching, similar principles are applied as in standard metabolomics workflows, using accurate mass, MS2 spectra, and scoring algorithms. Both experimental and in-silico databases are applied in lipidomics. Unfortunately, false discovery rate calculation has not been possible to date, and a certain level of false assignments is state of the art in nontargeted lipidomics.
Hence, it is of utmost importance to reliably estimate the proportion
of potentially false assignments. Filtering of false positive annotations can be done by relative RT, since within a homologous lipid series of the same class retention depends on the relative carbon number and/or relative double bond number.
173
Using regression models, the so-called equivalent carbon number (ECN) model can be applied for
manual annotation
174
or RT prediction
175
in order to exclude false positive hits and
confirm lipids. Additionally, Kendrick mass plots can be used to identify
homologous series in lipid data sets.
176
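The arithmetic behind such plots is straightforward. A minimal sketch using the CH2-based Kendrick mass (the example m/z values are chosen to differ by exactly one CH2 unit; members of a homologous series share nearly the same Kendrick mass defect):

CH2 = 14.01565  # exact mass of the CH2 repeat unit

def kendrick_mass_defect(mz):
    """CH2-based Kendrick mass defect; ions differing only in the
    number of CH2 units collapse onto (nearly) the same value."""
    kendrick_mass = mz * 14.0 / CH2
    return round(kendrick_mass) - kendrick_mass

# Two ions one CH2 apart yield (nearly) identical mass defects
print(kendrick_mass_defect(760.5851), kendrick_mass_defect(774.6008))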
The application of Bayesian statistics presents an interesting and
promising direction and may overcome some limitations of hand-crafted
rule sets.
177
Excellent community-based
resources provide guidelines (see ILS, LSI
92
) on criteria and characteristic fragments for MS/MS annotation.
The LIPID MAPS
178
and LSI websites continuously update information on manual inspection of MS/MS data, reporting obligatory fragment ions for the unambiguous annotation of lipids. Still, only minimum requirements have been defined (see ILS and LSI), so that openness and transparency of reported datasets remain essential to bring harmonization in lipidomics to the next level. As nontargeted lipidomics remains error-prone and still requires
expert knowledge, comprehensive information on lipid annotations is
essential. The periodicity of lipids offers further control points
in lipid identification. In our opinion, lipidomics and metabolomics annotation have to be harmonized, which is already possible using the identification levels proposed by the Metabolomics Society (Figure 5).
Retention Time
and Cross Section as Orthogonal Parameters in Nontargeted Analysis
Retention
Time for Annotation
Orthogonal data such as chromatographic RTs are key to increasing the confidence of MS-based compound annotations. So far, the poor reproducibility and commutability of experimental retention times across labs, even when restricting to reversed-phase chromatography, has precluded the wide adoption of RT libraries
179
for high quality annotation across labs. RT prediction from molecular structures is currently a very active area of research.
The most relevant developments are summarized elsewhere.
180
The most recent advances not covered in the
review are provided by the software tools Retip
181
and QSRR automator.
182
Retip
is a machine learning based tool which has been trained using more than 800 standard compounds each for RP and HILIC chromatography. Retip was integrated into the MS-DIAL toolbox. QSRR automator has
been published as a Python package and builds RT prediction models
for in-house chromatographic methods.
182
It is worth noting that RT prediction is not (yet) accurate enough to enable unambiguous identification of small molecules. However, it can be applied for the annotation of (mis)annotated in-source fragments and allows reranking of positional isomers, which can provide valuable insights.
181
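Conceptually, such tools learn a mapping from molecular descriptors to retention time. A minimal QSRR-style sketch with scikit-learn on synthetic data (the descriptor matrix and the toy structure-retention relationship are hypothetical; Retip and QSRR automator implement far more elaborate descriptor generation and model selection):

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Hypothetical training data: rows = standard compounds, columns =
# molecular descriptors (logP, TPSA, ...); y = measured RTs in min.
rng = np.random.default_rng(0)
X = rng.normal(size=(800, 20))
y = 2.0 * X[:, 0] + rng.normal(scale=0.3, size=800)  # toy relationship

model = RandomForestRegressor(n_estimators=200, random_state=0)
print(cross_val_score(model, X, y, cv=5, scoring="r2").mean())
model.fit(X, y)
predicted_rt = model.predict(X[:5])  # predict RTs for new descriptor rows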
Collision Cross Section
Value for Compound Annotation
The role of collision cross
sections (CCS) obtained from ion mobility spectrometry (IMS) for confident
compound annotation has been extensively discussed.
183−186
The pace of generating CCS databases (both experimental and in-silico
predicted) has been enormous.
187−189
Currently there are two unified
databases, CCS Compendium
186
and AllCCS.
190
Novel open-source software tools facilitate
data evaluation.
131,169
Seminal studies showed
that interlaboratory reproducibility of CCS assessment outperforms
191,192
reproducibility of chromatographic RTs. As a drawback, a CCS value
correlates with the measured accurate mass of a molecule, while chromatographic
retention offers an entirely orthogonal identifier. Due to the current
limitations in ion mobility resolution, isomer separation of small
primary metabolites is limited. In complex samples, only molecules
exhibiting CCS differences in the low % range (typically 3%) are routinely
resolved. The resolution is improved by novel advanced instrumental
concepts.
193,194
Recently, the potential of trapped
IMS (TIMS) to separate lipid isomers was shown.
195
The obtained resolving power allowed the discrimination of
lipid species exhibiting CCS differences of <1% in complex biological
mixtures. Several studies implemented ion mobility for structurally
characterizing lipids with a high degree of specificity. Information
on double bond position and geometry was obtained combining IMS with
ozonolysis and Paternò–Büchi reaction.
196,197
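In practice, CCS-supported annotation often amounts to a combined m/z and CCS tolerance filter against a database such as the CCS Compendium or AllCCS. A minimal sketch (database entries and tolerance windows are purely illustrative):

# Hypothetical database entries: (name, m/z, CCS in A^2)
CCS_DB = [
    ("candidate A", 181.0707, 139.9),
    ("candidate B", 181.0705, 148.2),
]

def annotate(mz, ccs, mz_tol_ppm=5.0, ccs_tol_pct=1.0):
    """Return candidates inside both the m/z (ppm) and CCS (%) windows."""
    hits = []
    for name, db_mz, db_ccs in CCS_DB:
        ppm = abs(mz - db_mz) / db_mz * 1e6
        pct = abs(ccs - db_ccs) / db_ccs * 100.0
        if ppm <= mz_tol_ppm and pct <= ccs_tol_pct:
            hits.append((name, round(ppm, 2), round(pct, 2)))
    return hits

print(annotate(181.0706, 140.5))  # retains only "candidate A"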
Navigating the Conflicting Goals of Metabolome
Coverage and Throughput
To date, metabolome analysis is the best fit-for-purpose compromise between coverage, selectivity,
and throughput. High coverage implies a wide interrogation window
with regard to both the chemical molecular dimension and the metabolite
abundance dimension (8 orders of magnitude concentration difference).
Major application areas of metabolomics such as, e.g., precision medicine
envisage the measurement of large cohorts (thousands of samples) in
regulated environments. The current transitory phase from small scale
experiments to large scale studies, industry- and clinical applications,
triggers exciting developments regarding streamlined workflows and
tailored solutions with advanced throughput. As the field moves forward,
economic considerations regarding cost effectiveness and automation
of the complete workflow become more important. Miniaturization accommodates
the analysis of small precious samples, bears the potential of increasing
sensitivity, and reduces solvent consumption following the principle
of green chemistry. We will discuss key aspects of current developments—from
sample preparation to analysis, advancing automation, miniaturization,
and throughput—and discuss the methods with regard to coverage
and selectivity.
Sample Preparation
High-throughput
sample preparation is still a bottleneck preventing exploitation of
the full potential of high-throughput MS-based metabolomics. A recent
review discusses the need for high-throughput technologies emphasizing
the role of sample preparation.
198
The
state of the art of sample preparation strategies for all relevant
sample matrices is comprehensively reviewed elsewhere.
1
Protein precipitation upon dilution, liquid–liquid
extraction, and solid phase extraction (SPE) are widely accepted methods
in metabolomics and lipidomics analysis which can be adopted for robotic
liquid handing systems (e.g. methyl tert-butyl ether (MTBE) extractions
in lipidomics
199
). Further advancement
of classical sample preparation strategies in metabolomics and lipidomics
is driven by emerging application fields such as biotechnological
large-scale enzyme activity screens and plate-based biomarker or drug
screening and includes the development of miniaturized green sample
pre-treatment (e.g. micro-liquid–liquid extraction, low volume
SPE), offering favorable extraction kinetics, high preconcentration
rates, and increased throughput. For example, implementation of a
commercially available, fully automated SPE system using small volume
SPE cartridges achieved a duty cycle of less than 15 seconds per sample
preparation.
200
Automated nondispersive
micro-liquid–liquid extraction allows high-throughput through
parallelization. Dispersive micro-liquid–liquid extraction
ameliorates extraction kinetics, but severe limitations regarding
automation of phase separation have been reported.
198,201
Currently, solid-phase microextraction (SPME) and electromembrane
extraction methods are “re-explored” for metabolomics,
given their potential for fully automated parallel extraction in well-plate
formats and enrichment through miniaturization.
202−204
SPME is a nondestructive and nonexhaustive extraction showing
great promise in probing and extraction of “tiny” metabolomes.
While multianalyte quantification remains a challenge, low invasiveness
of SPME and the nonexhaustive nature of extraction, together with
recently developed extractive phases, make the technique particularly
attractive for time-resolved or spatially resolved metabolomics fingerprinting.
202
For example, a high-throughput time-course
metabolomic analysis was achieved through multiple extraction of 96-well-plate
cell cultures.
205
Direct immersion (DI) in vivo sampling enabled time-resolved metabolic fingerprinting
of animal brains
206,207
and a method for the analysis
of small molecules from semi-solid tissue relying on DI-SPME and desorption electrospray ionization (DESI)-MS has been proposed, promising space-resolved analysis of tissues.
208
Non-exhaustive in vivo extraction followed by GC X GC qTOFMS analysis enabled
real-time monitoring of apple metabolism during the process of ripening
on the tree. The slim geometry of the extraction device avoided tissue
wounding and oxidative degradation of analytes seen with conventional
workflows relying on harvesting, metabolism quenching, and ex vivo extraction.
209
However,
the current selection of commercially available DI-SPME extractive
devices is very narrow, limiting the wide adoption of this technique.
202
Electromembrane extraction is a combination
of partitioning-based liquid–liquid extraction and electrophoresis.
Fundamentals of electromembrane extraction have been summarized in
a review by Douin et al.
204
Analytes move
from a donor phase, usually an aqueous sample, through a water-immiscible
organic layer acting as purification filter, into an aqueous (or optionally
organic) acceptor phase. Mass transfer is driven by an electric field
introduced between donor and acceptor phase via insertion of electrodes
and application of direct current in the milliampere range, which
speeds up the extraction process and enhances extraction yield compared
to simple partitioning-based extraction. For optimized systems, selective
analyte enrichment up to 100-fold and recoveries up to 100%
204
and excellent cleanup potential have been reported
(salt- and protein-removal,
210
phospholipid-removal
211
). The technique holds high potential for point
of care analysis as enabled by parallelization and downscaling of
analysis as well as implementation into microfluidic chips (e.g. Hansen
et al.
212
). However, the extraction principle is inherently limited to ionizable species and is not suited for molecules prone to degradation by electrolysis, and electrolysis phenomena are aggravated with decreasing acceptor volume. Moreover, electromembrane extraction is a selective extraction procedure,
204
precluding the full scope of wide coverage metabolomics.
On the other hand, high selectivity towards target analytes is a desirable
feature for specialized routine application in regulated environments
as it facilitates process validation.
Direct Analysis in Metabolomics
and Lipidomics
Flow Injection-MS
Direct analysis
has its undisputed role as a rapid first-pass metabolic fingerprinting
method. It comes with a reduced analysis time of 2–5 min, thereby
increasing the analytical throughput by one order of magnitude compared
to typical LC-MS-based metabolomics. A recent review gives an excellent
summary on successful applications and well-known limitations imposed
by matrix effects and the occurrence of isomers and in-source fragments.
213
Ion suppression and ion competition were studied
in fundamental experiments using injections of 5 μL at flow
rates <100 μL min–1, where ion competition
was shown to be a major cause for limited sensitivity in orbitrap
MS.
214
As a consequence, sensitivity could
be increased by optimizing data acquisition. The use of sequential narrow mass segments in trapping MS, with fixed m/z windows or variable sample-specific windows, proved to be a valid strategy for improving sensitivity and linear dynamic range.
214
A recent study combined FI-HRMS with online fractionation
improving the metabolome coverage and reducing matrix effects.
215
The fully automated sequential fractionation
was based on solid-phase extraction on complementary ion-exchange
and reversed-phase chemistries. Fast and high coverage screening (3
min per polarity) was thoroughly validated for targeted analysis of
50 diagnostic and explorative biomarkers in plasma samples, including
amino acids, amines, purines, sugars, acylcarnitines, organic acids,
and fatty acids. The sensitivity of FI was significantly improved.
LLOQ values comparable to conventional LC-MS/MS were reported. FI-HRMS
for quantification of high-abundant cholesterol and cholesteryl esters utilizing compound-specific response factors proved to be fit for purpose for cultured cells, tissue homogenates, and serum samples.
34
IMS offers a rapid (millisecond-regime)
post-ionization separation dimension,
216
which makes it particularly attractive for FI analysis. Its benefit
for both targeted and nontargeted metabolomics has been investigated.
217,218
Compared to FI-MS alone, FI-IMS-MS offers improved linearity and
reduced noise level. Nonetheless, ionization suppression due to matrix
effects remains a major obstacle with detrimental impact on sensitivity,
peak capacity, and consequently, coverage.
219
It is therefore unlikely that IMS will render chromatographic separations
obsolete in nontargeted analysis.
Ultimate Throughput–Duty
Cycles of Seconds Per Sample
The cycle time of the sample
transfer to the MS limits the throughput of FI-MS-based metabolomics.
For example, the fastest commercially available SPE system offers
a sample cycle time of 10 seconds, limited by the required SPE elution
volumes.
200
When used without SPE, the
rate limiting step becomes the autosampler, enabling a duty cycle
of 2.5 seconds per sample,
220
a setting
which was proposed for drug discovery and high-throughput MS targeted
assays.
Duty cycles of seconds per sample are also realized
in alternative ambient MS approaches. However, despite significant
progress, large scale metabolomics studies have not yet been put into
practice. Excellent duty cycles in the second-regime were, for example,
obtained by immediate drop on demand technology combined with open
port sampling interfaces (I-DOT-OPSI-MS).
221
Recent studies on single cell metabolomics demonstrate the power
of high throughput MS. Another emerging high-throughput technique enables nanoliter-scale infusion MS at sampling rates of up to 6 Hz using robotic plate handling.
222
Acoustic
droplet ejection (ADE) uses acoustic pulses to generate nanoliter-droplets
directly from a microtiter plate in a contactless manner with high
speed, precision, and accuracy. The potential areas of future applications
are evident and range from high-throughput drug screening assays,
plate based synthetic chemistry, and large-scale biotechnological
studies addressing enzyme kinetics. Interfacing ADE with MS involved
(1) acoustic mist ionization (AMI) coupled to MS
222
or (2) acoustic ejection MS (AEMS) using an open port interface
(OPI) with electrospray ionization (ESI).
223,224
While the first approach integrated droplet generation and ionization,
the latter configuration used ADE only for sample delivery for subsequent
ionization by ESI. This way, matrix effects and adverse effects caused
by contamination of MS transfer capillaries were reduced. Excellent
analytical figures of merit were obtained upon injection of 25,000 samples (standards), revealing an RSD of 8.5% for peak intensity and a full width at half-maximum of 177 ms; peak widths were on the order of 200 ms.
224
Miniaturization–Nanoflow
Direct Infusion
Miniaturization of direct analysis toward
nanoflow proved to be particularly attractive because of the inherent
features of nanoESI. Ionization at this flow regime is characterized
by increased ionization efficiency. At the same time, differences
in ionization efficiency for different molecules are significantly
reduced as compared to ESI at higher flow rates.
225
Shotgun lipidomics accomplished by chip-based nanoESI orbitrap MS has become an essential tool of the trade for both lipid identification
and quantification.
35,226
A 50 min analysis time consuming
only 10 μL of sample solution is theoretically possible.
227
In practice, a 5–15 min run time ensures
analysis at both polarities while applying data dependent acquisition
(DDA) or data independent acquisition (DIA) approaches. Today, MS2
methods based on DIA (covering the whole mass range in 1 Da steps
228
) prevail over DDA (which follows the intensity order
226
). Dedicated software solutions allow for noise
filtering accelerating data processing.
229
Typically, several hundred lipids are identified on a species level
covering the abundant lipid classes. Several strategies enable increased
coverage, by e.g. including derivatization.
230
Pitfalls regarding lipid identification are summarized and curated
by the LSI.
29
Quantification is achieved
by ISTDs. The lipid head group determines the ionization efficiency
to a large extent, allowing us to minimize the number of calibrators
to one or a few per class. Response factor corrections were introduced
for the quantification of neutral lipids.
34
Quantification on the molecular species level is complicated because, at the required MS2 level, different fatty acyl chain moieties show different responses (deviating by up to 60%), jeopardizing accuracy without correction.
231
Chromatography—Key
Steps Toward Coverage and Throughput
Miniaturization of Liquid
Chromatography
In MS-based metabolomics, microscale and nanoscale
separations have been developed with the aim of advancing small scale
sample analysis, increasing sensitivity and thus coverage of low abundant
analytes, and finally reducing costs by overall reduced reagent consumption.
Miniaturized separation used with tailored low-diameter ESI-emitters
offers unrivaled absolute detection limits (fmol on column). Combinations
with large volume injection and online enrichment allow the analysis
of very low analyte concentrations and very efficient sample use.
However, the successful application of online-enrichment-nano-RP-LC
faces limitations: Numerous primary analytes show poor retention on
RP-LC, and sample volumes may be extremely limited, as in single cell
analysis. In such cases, the full sensitivity potential of nano-RP-LC-MS
platforms may be exploited by analyte derivatization, increasing RP-retention
and ionization efficiency. A comprehensive summary of theory, common
approaches, and over 20 of the most recent applications of nano-LC-MS
in metabolomics and lipidomics investigation can be found elsewhere.
232
Single cell analysis is an emerging application
of small-scale metabolomics by nano-LC-MS. Recently, Nakatani et al.
reported a method for derivatization-free targeted quantification
of hydrophilic metabolites in single HeLa cells. Living single cells
were sampled from culture using an in-house developed nano-pipette
device, and the sampling capillary was directly connected to a sample
loop line. The optimized nano-LC-MS/MS method based on a self-packed
RP-LC column (pentafluorophenylpropyl Discovery HSF5, 0.1 × 180
mm, 3 μm) and multiple reaction monitoring yielded an average
sensitivity increase of 26-fold compared to a conventional flow setup
(2.1 × 150 mm) employing the same column chemistry. Eighteen relatively
abundant hydrophilic metabolites (16 amino acids and 2 nucleic acid
related metabolites) were detected and quantified in 22 single HeLa
cells. Clustering in different groups was observed.
233
Another emerging nano-LC-MS application is the in-depth,
high-coverage analysis of the lipidome. With a recently published
110 min nano-LC-MS method, linear dynamic range and sensitivity could
be substantially increased by 1–2 and 2–3 orders of
magnitude, respectively, when compared to conventional high-performance
LC (HPLC) (150 × 2.1 mm, 2.7 μm). The proposed workflow
displayed excellent analytical figures of merit after careful optimization
of sample reconstitution. Lipidome coverage was evaluated for the
phospholipidome of S. cerevisiae and achieved increased lipid identification (436
phospholipids)
compared to conventional-flow HPLC and a shotgun approach. Low abundant
lipid species and isomers could be detected even when they were coeluting.
234
When combined with a new data evaluation pipeline,
almost 900 lipid species in 26 lipid classes in S. cerevisiae were identified. The
identification
rate was increased by a factor of 4 compared to previous whole yeast
lipidome shotgun studies.
89
The high potential of this workflow for in-depth lipidome analysis is highlighted by the detection of less common lipid classes like monomethyl-PE (MMPE)
and dimethyl-PE (DMPE) and lipids with incorporated odd-chain and
diunsaturated fatty acids.
For a long time, the development
and wide adoption of microscale separations in metabolomics suffered
from the fact that many stationary phase chemistries were not commercialized
for the required column dimensions (1.5–0.5 mm inner diameter
for micro-LC; 0.5–0.15 mm I.D. for capillary LC
235
). Micro-LC separations are more common, since
ionization performance of ESI sources is compromised at the flow regime
of capillary-LC (0.01–0.001 mL min–1). The
sensitivity gain of micro-LC is moderate as compared to microbore-LC
(3.2–1.5 mm I.D.). In a recently published study, the optimized
microflow-LC-MS/MS improved sensitivity in a compound-dependent manner
by 6- to 49-fold when compared to conventional microbore-LC-MS/MS.
236
In metabolomics, the sensitivity gain
provided by micro-LC has been exploited to design rapid separations
for metabolic phenotyping. Throughput has been optimized at the expense
of chromatographic performance, providing fit for purpose platforms
with enhanced but not maximized sensitivity upon miniaturization.
237−241
A systematic comparison to conventional HPLC methods in terms of
LOD and LLOQ was beyond the scope of these studies. Short micro-ultraperformance
(UPLC) type of separation utilizing RP materials with sub-2-μm
particles (100% wettable, 1.0 × 50 mm, 1.7 μm) and separation
times of 2.5 min was successfully applied in large scale phenotyping
studies
237
and recently explored in combination
with ion mobility.
238,239
A rapid micro-HILIC method utilizing
sub-2-μm particles (Acquity UPLC BEH-Amide, 1.0 × 50 mm,
1.7 μm) addressed the analysis of polar metabolites in rat urine
in less than 3.5 min. Comparison to conventional HILIC-MS demonstrated
4-fold reduction of analysis time, 75% reduction in solvent consumption,
and 18-fold reduction of sample consumption, while providing sufficient
retention of polar metabolites (e.g. hexoses, methylhistidine, kynurenic
acid, creatinine) and excellent run-to-run reproducibility (RT RSDs between 0.31 and 6.3% over 134 sample injections).
240
Rapid lipidomic profiling of plasma by micro-LC-IMS-MS
proved to be fit for the purpose for clustering plasma lipotypes as
assessed in breast cancer patients and healthy controls indicating
the suitability of micro-LC–IMS–MS as a rapid platform
for large scale lipidomics screening.
241
The combination of online SPE enrichment with micro-LC-MS was validated
for targeted analysis of 13 steroid hormones from human plasma. After
careful optimization, large volume injections obtained excellent LOQs
in the sub-ng/mL range at high throughput (below 3 min per sample).
Validation according to FDA guidelines showed the suitability for
high-throughput analysis in a clinical routine laboratory.
242
Cebo et al. introduced a validated approach
based on offline mixed-mode SPE enrichment and micro-UHPLC-ESI-triple
quadrupole (QqQ)-MS/MS for the quantification of 42 oxylipins in plasma
and platelets at reasonable throughput (13 min per sample).
243
Limits of detection were between 2 and 250
fmol on column, offering comparable LODs to well established conventional
LC approaches, but at significantly reduced solvent consumption.
Multidimensional Chromatography
Comprehensive two-dimensional
chromatography is undoubtedly a powerful strategy for the separation
of complex mixtures.
244,245
In order to maximize the separation
space and thus the peak capacity, orthogonality and compatibility
of the two dimensions is essential. Each chromatographic peak of the
first dimension requires sampling several times, efficient transfer,
and rapid separation by a second dimension, in order to maintain the
chromatographic resolution of the first dimension. The time spans
between successive separations in the second dimension should be minimized
requiring short separation and equilibration times. This is well established
in GC X GC but still a technical challenge in 2D-liquid-chromatography.
The wide suite of successful GC X GC applications in metabolomics
has been summarized elsewhere.
1,245
In LC X LC, method
development is still regarded as a bottleneck. The experimental design
regarding the two separation dimensions is not straightforward, as
separation conditions influence and restrict each other.
244
Solvent incompatibility with regard to mismatch
of elution strength and immiscibility requires dilution of the sample
upon transfer to the second dimension. Typical 2D-LC-MS designs involve
HILIC and RP-LC. Long microLC/microbore LC columns (flow rate 10–50
μL/min) are employed as the first dimension, followed by short,
wide-bore columns (flow rate in the mL/min regime) in the second dimension.
In the last years many new instrumental 2D-LC designs have been developed
for metabolomics/lipidomics applications facilitating flexibility
and universality.
246
Elaborate constructions
allow versatile modulation (i.e. sample collection and transfer) including
active modulation with dilution conditions optimized over the separation
time. Despite significant progress, the theoretical peak capacity
in LC X LC-MS is hardly reached in practice, due to incomplete usage
of separation space, still suboptimal cutting, and peak deterioration
upon remaining solvent incompatibility.
Adoption of comprehensive
LC X LC-MS in metabolomics and lipidomics has been limited mainly
because comprehensive 2D-LC-MS metabolomics approaches developed up
to date have largely suffered from incomplete usage of separation
space in HILIC and RP-LC combinations and from severe sensitivity
loss.
247
The latter diminishes actual coverage
in nontargeted screenings. Solvent evaporation interfaces as featured
in SFC X RP-LC lipidomics might overcome this challenge. Recent reports
on 2D-SFC-RP-MS, separating 370 lipids from 10 lipid classes of human
plasma within 38 min, are promising.
248
2D-LC-MS relying on heart-cutting strategies proved to be powerful
in selected applications
246
including the
separation of secondary metabolites in plants and emerging targeted
chiral metabolomics.
249,250
Dual/Parallel Chromatography
Several column switching approaches have been introduced as elegant
solutions for increased throughput and coverage within one analytical
run. One successful configuration (Figure 7
A) integrated serial orthogonal chromatography
in order to transfer the poorly retained metabolites of the first
dimension onto a second orthogonal column and enable two parallel
separations subsequently. This configuration offered a valuable alternative
to heart-cut chromatography.
246
A simple
six-port valve was installed between the two chromatographic columns, enabling the transfer of metabolites eluting from the first column onto the second column. Two independent separations were carried out by
the switching of the valve. This setting was successfully employed
for high coverage metabolome
251
and lipidome
analysis.
252
For metabolome analysis, reversed-phase
and porous graphitized carbon LC were combined. The method was validated
by targeted absolute quantification of 80 primary metabolites in P. pastoris. Excellent
RT stability (average
0.4%) even in the presence of a biological matrix was obtained. An
interplatform comparison with GC- and LC-tandem-MS analyses showed
the power of the method even with respect to sugar phosphate isomer
quantification.
251
The same separation
concept combined HILIC and RP-LC for high coverage lipidome analysis.
253
The void volume of the HILIC separation containing
non-polar lipids was transferred to the RP column which enabled the
on-line combination of HILIC with RP without any dilution in the second
dimension. Rapid consecutive separation for polar lipids and class
specific separation for nonpolar lipids was accomplished within one
analytical run of only 15 min (including re-equilibration time, using
stationary phases with sub-2-μm particles and UPLC).
Figure 7
Practical setup
solutions for sequential and parallel LC. (A) In valve position A,
the void volume of the first column is transferred to the second column.
Afterward, the valve is switched to position B and the sample is analyzed on both columns in parallel.
251,252
(B) In valve position
A, the first extract is injected on the first column and analyzed.
Meanwhile, the second column is equilibrated and the mobile phase
is flushed into waste. After separation on the first column, the valve
is switched to position B and the second extract is injected on the
second column and analyzed while the first column is equilibrated.
253
(C) In valve position A, the sample is loaded
and divided into two sample loops equally. In valve position B, both
parts of the sample are injected onto two orthogonal columns and analyzed.
254
Figure 7
summarizes options for successful column
switching technologies. Dual chromatography extends the separation
space by fully automated consecutive or parallel execution of orthogonal
chromatographic separations. Different configurations enabled orthogonal
dual HILIC/RP-LC separation by parallel injection of two extracts
from one biological specimen (Figure 7
B) or of one sample extract (Figure 7
C). While the latter configuration was proposed
for simultaneous analysis of nonpolar and polar metabolites,
254
the parallel injection of different sample
extracts facilitated the development of merged metabolomics/lipidomics.
253
The HRMS workflow integrated biphasic extractions,
parallel injection, separation, and MS analysis providing the full
scope of targeted and nontargeted metabolomics and lipidomics within
one analytical run. Wang et al. proposed a dual chromatography approach
for simultaneous lipidomics and metabolomics analysis implementing
parallel HILIC and RP separation in a heart cutting 2D-LC configuration
where parallel analysis was preceded by prefractionation on a first
separation dimension.
255
As a drawback,
this method precluded the integration of biphasic extractions, since
only one sample could be analyzed. Recently, sample preparation and
reconstitution were reoptimized in order to provide high coverage
within one measurement solution.
256
MS Platforms and Data Acquisition Strategies—Improving Coverage,
Selectivity, and Reliability
Despite significant progress,
cutting edge low-resolution tandem-MS outperforms latest generation
HRMS for quantitative analysis, both in terms of sensitivity and linear
dynamic range. Large scale metabolomics studies are often performed
on triple quadrupole-MS based platforms profiting from increased robustness
and high quantitative capabilities.
257
Today,
the implementation of multiple reaction monitoring (MRM) approaches
is supported by significant computational resources. A library containing
MRM transitions for more than 15,500 molecules is publicly available.
258
Both experimentally assessed and in-silico
generated MRM transitions are included. Dedicated software tools enable
optimization of MRM transitions including collision energies using
mass spectral libraries, such as METLIN and HMDB.
259
Large-scale metabolomics as enabled by QqQ-MS was successfully
applied in wide-targeted assays providing absolute quantification
of a high coverage metabolite panel.
1,97
Recently,
hybrid MS approaches emerged which offer attractive ways bridging
the concepts of targeted and nontargeted analysis.
20
These workflows successfully exploit the power of QqQ-MS for accurate relative quantification, without omitting a discovery step.
The “discovery” is realized through optimizing the MRM
transitions based on a sample matrix representative of the large-scale
study. This optimization/discovery step can be performed using low
mass resolution MS only or integrating high mass resolution. In MS/MS
analysis by HRMS, sensitivity and selectivity (and thus the coverage)
are significantly influenced by the type of mass spectrometer, but
also by the selected data acquisition strategy. DIA and DDA acquisition
modes have their specific applications in both targeted quantification
and nontargeted compound annotation.
260−262
For both DIA and DDA,
new tools for online/on-the-fly and offline scan-level control, fragmentation,
and acquisition optimization are available to support automated mass
spectrometer parameter choice.
262−264
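On the practical side, assembling an MRM method from such libraries is largely a selection and ranking task. A minimal sketch over a hypothetical transition table (compound names, m/z values, collision energies, and intensities are illustrative only):

# Hypothetical MRM library rows: (compound, precursor m/z, product m/z,
# collision energy in eV, expected relative intensity)
MRM_LIBRARY = [
    ("alanine", 90.05, 44.05, 10, 1.00),
    ("alanine", 90.05, 72.04, 15, 0.35),
    ("serine", 106.05, 60.04, 12, 1.00),
]

def select_transitions(compounds, n_per_compound=2):
    """Pick the most intense transitions (quantifier plus qualifier)."""
    selected = {}
    for name in compounds:
        rows = [r for r in MRM_LIBRARY if r[0] == name]
        rows.sort(key=lambda r: r[4], reverse=True)
        selected[name] = rows[:n_per_compound]
    return selected

print(select_transitions(["alanine"]))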
Toward MS-Based Multi-omics
Emerging multi-omics analysis led to significant efforts for methods
integrating multiple omics layers for one sample. Significant progress
relates to the experimental design of multi-omics measurement and
data evaluation strategies. Multi-omics applications profit from global
phenotype metabolomics data acquired at reasonable throughput. Sophisticated
sample preparation protocols together with high coverage multi-platforms
characterize the tools of the trade for multi-omics analysis. Cutting
edge network analysis enables us to integrate MS-based datasets with
genome, transcriptome, proteome, and metabolome information derived
from orthogonal platforms.
Multi-omics Sample Preparation Strategies
Multi-omics sample preparation approaches have to deal with the
challenge that preferred collection methods, storage techniques, required
quantity, and choice of biological samples are not directly transferable
from one omics field to the other, especially when quantification
rather than profiling has to be performed.
265
The metabolomics part of multi-omics studies is especially challenging
as degradation, oxidation, or conversion of metabolites (including
lipids) might occur during sample preparation. Moreover, the procedures
have to be tailored for the two sub-omes.
9
In the multi-omics setting, discovery studies call for minimal pretreatment
in order to prevent the potential loss of metabolites.
266
Multi-omics sample preparation strategies based
on a single sample enable the true combination of multi-molecular
information without influences of different sample aliquots. One-phase
extractions coined as sample preparation for multi-omics technologies
(SPOT) were successfully applied for the parallel analysis of the
proteome and metabolome.
267
Other multi-omics
strategies involve filter tips such as cellulose based filter tips,
which enable detergent-free single-pot metabolomics and proteomics
by capturing the protein fraction, collecting the metabolite flow-through
and the peptide fraction after trypsin digestion.
268
For lipid sub-ome integration, extraction strategies involve
different solvent mixtures with higher organic content, e.g. chloroform/MeOH
269
or MTBE/MeOH.
199
Two-phase extractions such as chloroform/MeOH/water
270
or MTBE/MeOH/water
226,253,271,272
proved to be particularly successful for global high coverage analysis
as metabolites and lipids can be analyzed from a single sample while the protein fraction is separated at the same time. In terms of sample
handling and automatization potential, MTBE/MeOH/water is preferable
as the protein pellet is found at the vial bottom after induction of phase separation
and is not present in the intermediate phase between the polar and
nonpolar phase. Such phase separation strategies reduce the sample
amount needed and pave the way for direct-infusion and LC-MS based
multi-omics (metabolomics, lipidomics, and proteomics) from a single
sample.
226,273
Additional lipidome coverage can be achieved
using approaches such as (1) 3-phase extraction separating neutral
lipids in the upper phase and glycerophospholipids in the middle organic
phase
274
or (2) the combination of two-phase
liquid extraction with MTBE combined with SPE.
271
In order to understand metabolite action on the subcellular
level, experimental resolution of subcellular metabolism is needed.
Recently, non-aqueous fractionation was successfully applied to resolve
subcellular plant metabolism and the corresponding proteome.
275
In this comprehensive approach, organic solvent
mixtures and ultracentrifugation were applied to analyze proteins and primary metabolites in a four-compartment plant model comprising chloroplasts,
cytosol, vacuole, and mitochondria.
275
However, additional steps such as multiple liquid phases or subcellular fractionation increase the sample preparation effort significantly, so that scientists have to decide in each case how much additional time they are willing to invest to increase metabolite coverage.
Multiplatform Analysis Strategies for Metabolomics and Lipidomics
High coverage analysis in MS-based metabolomics is only achieved
through multiplatform solutions involving orthogonal chromatographic
separations integrating both HRMS and MS/MS methods. Despite significant
progress in HRMS and its quantification capabilities, to date tandem MS is the method of choice when aiming at the analysis of low abundant
metabolites (e.g. bile acids or fatty acyl-CoA esters). Hydrophilic
primary metabolites—the biological definition of metabolites
involved in growth, development, and reproduction—are among
the most evolutionarily conserved biomolecules. Multiple isomers and
isomeric in-source fragments intrinsically challenge MS analysis.
276
Examples are hexose phosphates, pentose phosphates,
3-phosphoglycerate/2-phosphoglycerate, citrate/isocitrate, homoserine/threonine,
leucine/isoleucine, adenosine monophosphate/deoxyguanosine monophosphate,
adenosine triphosphate/deoxyguanosine triphosphate, and alanine/sarcosine.
As a consequence, common multiplatform workflows include chromatographic
separations providing selectivity for primary metabolites, a prerequisite
for accuracy regarding both identification and quantification, respectively.
In LC, ion-pairing and HILIC are the methods of choice separating
water-soluble central metabolites. A currently accepted protocol of
wide targeted analysis relies on RP ion-pairing MS/MS analysis. The
method covers 215 metabolites including amino acids, citric acid cycle
intermediates, and other carboxylic acids, nucleobases, nucleosides,
phosphosugars, and fatty acids.
277
Sugar
phosphate isomers are quantified based on distinct MS/MS fragmentation
patterns
278
(as no baseline separation
is provided). There are several drawbacks associated with the use
of ion-pairing reagents, such as MS system contamination, ion suppression
effects limiting overall sensitivity, together with the fact that
metabolites ionizing only in positive mode such as e.g. carnitines
and S-adenosylmethionine cannot be measured. Generally, the use of
ion-pairing reagents implies the establishment of a dedicated MS system,
often precluding the combination with HRMS. For HILIC separations,
two stationary phases, i.e. the BEH amide phase and the polymeric zwitterionic
phase,
279
prevail, using both acidic and
neutral/basic eluents. HILIC separations are versatile, but as a drawback
there is no single experimental setting covering all relevant primary
metabolites.
280
When optimized properly, the separation selectivity for phosphorylated carbohydrates is comparable to that of ion-pairing chromatography. Still, GC remains the unrivaled separation
method when addressing intermediates of glycolysis and pentose phosphate
pathways. Wide coverage of the primary metabolome is established upon
two-step derivatization procedures (ethoximation/methoximation followed
by trimethylsilylation). Routine applications involve robotic just-in-time derivatization. However, GC is not suitable for measuring the
energy status of a given sample, as important nucleotides and cofactors
are not covered. For this purpose, both ion-pairing
281
chromatography or HILIC can be applied after careful consideration
of sample preparation.
280
To date, despite
excellent separation power for polar and ionic metabolites, the application
of capillary electrophoresis (CE) in metabolomics is limited to expert
laboratories. The most recent CE developments were comprehensively
summarized elsewhere.
1,198
Examples of multiplatform combinations in practice integrate the analysis of different extracts (tailored
preparation for sub-ome analysis) followed by nontargeted assays (RP-LC-HRMS
for lipidomics, GC-HRMS for primary metabolites,
280
complementary HILIC-HRMS for metabolites not amenable to
GC-MS, and targeted tandem mass spectrometric assays for low abundant
metabolites such as bile acids,
282
steroids,
and oxylipins).
283
Alternatively, the number
of MS platforms can be reduced by replacing the combination of GC-HRMS/LC-HRMS
by two complementary LC-HRMS methods (either two HILIC methods with
acidic and basic eluents/positive and negative ionization or the combination
of acidic RP-LC-HRMS and basic HILIC-HRMS).
In lipidomics, the
majority of lipid classes can be covered by state of the art profiling
approaches such as direct infusion MS or RP-MS. Multiplatform combinations
in lipidomics often involve shotgun lipidomics for bulk lipid analysis
in combination with LC-MS strategies enlarging the lipidome in terms
of lower abundant lipids as recently shown for the platelet lipidome.
284
An excellent review by Rustam and Reid summarizes
the analytical challenges and advances in lipidomics, including common MS-based approaches, chromatographic solutions, and possible combinations.
285
The major challenge remains measuring high-abundant
membrane lipids besides very low-abundant signal molecules across
a huge polarity range (log P values from 5 to 35). Lipid mediators present at low concentrations are an important subclass, often analyzed by RP-LC to cover different eicosanoid isomers.
286
Sphingolipid analysis is of high interest as these lipids are involved
in signaling and protein sorting and are often suppressed by other
membrane lipids.
287
Extraction and analysis
of glycosphingolipid subclasses such as gangliosides or sulfatides
is challenging, and potential strategies are summarized in a recent
workflow by Barrientos et al.
288
If specific
lipid sub-omes (e.g. sterols or prenols) are of interest, LIPID MAPS
provides methods and protocols for LC-MS and GC-MS based analysis
as starting points including a summary of available analytical standards
(see resources section).
178
Merging Metabolomics
and Lipidomics
Monitoring the metabolic phenotype should
always consider lipids due to their critical function in health and
disease. Lipids make up 79% (90,678 lipids in 114,126 metabolites)
of all listed metabolites in the HMDB 4.0 (accessed October 2020)
88
highlighting the need to cover them in the analysis
workflows. Especially when it comes to biomarker research the metabolome
including the lipidome should be monitored to follow disease relevant
changes as shown in recent studies on cancer prediction
289,290
or cardiovascular disease.
291
Cajka and
Fiehn provide an excellent overview on the challenges and opportunities
of merged metabolomics and lipidomics workflows.
9
Here, we want to emphasize that global metabolite and lipid
profiling in one analytical run is possible, as shown by two-phase MTBE
extraction and fully automated parallel HILIC chromatography for metabolites
and RP chromatography for lipids.
253
The
instrumental setup was realized by an HRMS, a dual-injection autosampler, and two-positional six-port valves, enabling simultaneous lipid and metabolite analysis in one analytical run of 32 min (Figure 7
B). Untargeted screening of human plasma
samples resulted in >100 annotated metabolites (organic acids, amino acids, nucleotides, acylcarnitines) and >380 lipids (phospholipids, sphingolipids, cholesteryl esters, di- and triglycerides). Stable-isotope labeled
metabolites and lipids from yeast extracts further enabled us to merge targeted and nontargeted identification; such approaches are generally possible whenever labeled biomass is available.
metabolomics, and lipidomics can be performed from a single sample
which provides the starting-point for interesting multi-omics studies
based on network analysis, e.g. protein-metabolite interactions in
mesenchymal stem cell adipogenesis.
283
Network Analysis and Visualization of Multi-omics Workflows
Multi-omics derived data sets including different sub-omes rely on
appropriate data integration originating from several layers of information.
The ultimate aim is to understand “the flow of information”
underlying a certain phenotype. Here, we will consider relevant solutions (a selection of tools can be found in Table 2) that emerged to facilitate downstream analysis of metabolomics data and to generate or validate an underlying biological hypothesis. Visualization tools play a crucial
role for biological interpretation of metabolomics data, and well-covered
overviews can be found in several recent reviews.
292−294
Such tools are required at the end of a metabolomics workflow pipeline and assume successful tackling of the earlier steps in the pipeline,
292,295
including adept study design, the biological experiment, sample preparation, identification, quantification, QC assessment, batch-effect adjustment, etc. The aspects and critical points
of a metabolomics experiment from study design and sample preparation
to data analysis and evaluation of various tools have been discussed
in a comprehensive review.
293
Table 2
Selected Tools for Data Analysis and Visualization, Metabolic Networking, and Databases
MetaboAnalyst (Chong et al., 2019; ref 328): one-in-all metabolomics data analysis tool collection.
MetExplore (Chazalviel et al., 2018; Cottret et al., 2018; refs 303,304): visualization of metabolic networks and pathways; facilitates the analysis of omics data in a biochemical context and pathway enrichment.
KEGG (Kanehisa et al., 2017; ref 329): “encyclopedia of genes and genomes”; several model organisms; KEGG orthology for genes and proteins.
Reactome (Bohler et al., 2016; Fabregat et al., 2018; refs 309,330): knowledge base of biomolecular pathways: free, open-source, open-data, curated and peer-reviewed.
Cyc databases (Caspi et al., 2020; ref 311): the “largest curated collection of metabolic pathways”; many different model organisms.
Virtual Metabolic Human database (Noronha et al., 2017; Noronha et al., 2019; refs 310,314): human and gut microbiome metabolism, 255 diseases, and also microbial genes and microbes.
WikiPathways (Slenter et al., 2018; ref 312): browsable, editable database curated by the research community.
Chemical Similarity Enrichment Analysis (ChemRICH) (Barupal and Fiehn, 2017; ref 306): alternative to biochemical pathway mapping for metabolomic datasets; based on structural similarity rather than biochemistry directly; the enrichment test is based on the Kolmogorov–Smirnov test (not the hypergeometric or Fisher exact test).
Metabox (Wanichthanarak et al., 2017; ref 308): metabolomics data analysis and interpretation toolbox for integration of proteomics and transcriptomics data.
Metscape (Gao et al., 2010; Karnovsky et al., 2012; refs 322,323): Cytoscape plugin; metabolomics correlation networks and KEGG-based metabolic networks integrating gene expression and metabolomics.
PathBank (Wishart et al., 2020; ref 313): comprehensive, interactive database for metabolic pathways in 10 different model organisms.
OmicsNet (Zhou and Xia, 2018; ref 318): multi-omics data integration; biological networks (genes, proteins, microRNAs, transcription factors, metabolites).
GEM-Vis (Buchweitz et al., 2020; ref 324): visualization of time-course metabolomic data within the context of metabolic network maps.
FEMTO (Nägele et al., 2016; ref 302): integration of metabolomic time-series analysis and network information.
Among the commonly applied strategies
are uni- and multivariate statistical methods. Multivariate methods involve several unsupervised methods, like principal component analysis (PCA) or hierarchical clustering (HC), as well as supervised methods: partial least squares discriminant analysis (PLS-DA), orthogonal projections to latent structures discriminant analysis (oPLS-DA), (linear) discriminant analysis ((L)DA), (canonical) correspondence analysis ((C)CA), random forests (RF), support vector machines (SVM), neural networks (NN), and feature selection strategies (recursive feature elimination, genetic algorithms, and sparse models such as Lasso, Elastic Net, and sparse PLS). Such methods are capable of capturing and pinpointing the unique metabolic fingerprints related to the underlying phenotype.
119
While this is no concern with unsupervised
techniques, a particular issue with supervised methods is the risk
of overfitting to the labeled data.
296
However,
accompanying cross-validation can help avoid this issue.
297
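A minimal scikit-learn sketch illustrates the point on purely random data (dimensions arbitrary; by construction there is no real class signal, so any apparent accuracy above chance indicates a leaky validation scheme):

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Random "metabolite intensity" matrix: 40 samples x 500 features,
# with arbitrary binary class labels (no real signal by construction).
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 500))
y = rng.integers(0, 2, size=40)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
print(model.fit(X, y).score(X, y))                # near-perfect apparent accuracy
print(cross_val_score(model, X, y, cv=5).mean())  # collapses to ~chance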
A powerful approach beyond the realm
of statistics and machine learning algorithms is pathway analysis,
which takes advantage of established biological knowledge. In
the simplest variation, metabolites of interest derived from a metabolomics
experiment can be mapped on the pathways defined in a particular library.
In the case of over-representation analysis (ORA), corresponding p-values
can be obtained based on the metabolites list for the affected pathways.
Extending the metabolite list with quantitative information (fold-change,
intensity, and absolute amounts) can be further exploited by metabolite
set enrichment analysis (MSEA). Even beyond this, considering the position of affected metabolites within pathways informs about the perturbation of the pathway and is a useful additional metric next to the enrichment, as it is capable of identifying subtle but consistent changes: affected metabolites are weighted by centrality measures, so that more central metabolites contribute more to the perturbation score. Such a strategy constitutes the core of combined enrichment and topology analysis.
298,299
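At its core, ORA reduces to a hypergeometric test. A minimal sketch (all counts hypothetical):

from scipy.stats import hypergeom

def ora_p_value(n_background, n_pathway, n_selected, n_hits):
    """P(observing >= n_hits pathway members among the selected
    metabolites) under random draws from the measured background."""
    return hypergeom.sf(n_hits - 1, n_background, n_pathway, n_selected)

# Example: 500 measured metabolites, pathway of 20, 30 significant, 6 hits
print(ora_p_value(500, 20, 30, 6))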
Although pathway analysis is a powerful tool and there is great
merit in identifying relevant pathways corresponding to a phenotype,
it suffers from several limitations. First, definitions of pathways
differ to varying degrees across databases.
300
Second, highly linked metabolites with a high number of
possible biochemical reactions are also constituents of multiple pathways
and pathways might overlap. Hence, it is challenging to explain changes
over several pathways.
301
In such a scenario,
time-series analysis might support the identification of regulatory
hubs within a metabolic network.
302
A great way to summarize results and enable biological interpretation
is the use of network-based approaches. Similar to pathway analysis,
metabolic networking relies on reference databases for biochemical
and signaling pathway information but constructs a single network
where each metabolite is linked by all possible biochemical reactions.
With the help of several strategies and options to extract a subnetwork
capturing all relevant metabolites from the input metabolite list,
they can represent a global and concise picture of metabolism.
301
MetExplore
303,304
allows metabolic
network construction, exploration, and combination with omics data
analysis. It can access several databases for multiple model organisms
and allows collaboration in curation and annotation of metabolic networks.
The interpretation of results is aided by metabolite set enrichment
analysis (MSEA) and extraction of relevant subnetworks. Correlation-network
construction does not require biochemical knowledge but is based on
quantitative information. It can establish correlations and metabolites
can be grouped based on the magnitude and sign of correlations, while
the network visualization strategy lends itself well to identifying the
corresponding clusters and relationships among them.
305
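A minimal sketch of such correlation-network construction with numpy and networkx (the correlation threshold is chosen for illustration):

import numpy as np
import networkx as nx

def correlation_network(data, names, threshold=0.8):
    """Nodes are metabolites; edges connect pairs whose absolute
    Pearson correlation across samples exceeds the threshold.
    data: samples x metabolites intensity matrix."""
    corr = np.corrcoef(data, rowvar=False)
    graph = nx.Graph()
    graph.add_nodes_from(names)
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                graph.add_edge(names[i], names[j], weight=corr[i, j])
    return graph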
Furthermore, correlations can also be determined based
on chemical similarity in a metabolite list (ChemRICH)
306
or spectral similarity (MS2LDA).
164
The full potential and perspectives of network analysis
in metabolomics data analysis, and of systems biology approaches for
biological interpretation, have been discussed in recent reviews, including
the various possibilities and respective tools.
294,305
A plethora
of bioinformatics and data analysis tools has been developed in the R
ecosystem, including many for the metabolomics community. These tools
and their evolution have been extensively reviewed.
120
However,
the barrier to entry can be substantially higher for users with no
programming skills or experience with command line tools. One response
by R developers is to include a graphical user interface (GUI) within
the package, which allows users to work within the comfort of their
browser. Several web-based tools have emerged in recent years that
function as metabolomics data analysis toolboxes, allow the
visualization of metabolomics results via different modules, and offer
multiple solutions from the aforementioned options (MetaboAnalyst,
307
MetaBox,
308
MetExplore
304
). Such tools lower the barrier to entry with
aesthetic, user-friendly GUIs and example datasets. As a prime example,
MetaboAnalyst and its equivalent R-based package MetaboAnalystR
123
offer multiple functionalities in several modules,
including metabolite identification, exploratory data analysis,
pathway enrichment analysis, combined MSEA and topology analysis,
and multi-omics integration, to name just a few.
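As a toy illustration of the in-package browser GUI pattern mentioned above: all names below are hypothetical, and actual tools wrap far richer interfaces, commonly built on frameworks such as shiny.
```r
library(shiny)

# Hypothetical wrapper an R package could export to launch a browser GUI
run_metabo_gui <- function() {
  ui <- fluidPage(
    titlePanel("Toy metabolomics GUI"),
    fileInput("csv", "Upload feature table (CSV, samples x metabolites)"),
    plotOutput("pca")
  )
  server <- function(input, output) {
    output$pca <- renderPlot({
      req(input$csv)                        # wait until a file is uploaded
      X <- read.csv(input$csv$datapath, row.names = 1)
      scores <- prcomp(X, scale. = TRUE)$x  # simple exploratory PCA
      plot(scores[, 1:2], xlab = "PC1", ylab = "PC2")
    })
  }
  shinyApp(ui, server)
}
```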
Several pathway
databases exist (KEGG, Reactome,
309
Recon,
310
Cyc,
311
WikiPathways,
312
PathBank,
313
etc., Table 2
) with different
foci, numbers of model organisms covered, and thus different target
audiences, features, and applications. They have been reviewed extensively.
293
Most of them provide basic functionality
to map metabolites from a list to their pathways, visualization, and
some form of quantitative analysis (ORA, MSEA). As a unique feature, Virtual
Metabolic Human
310,314
integrates the largest database
of human and gut microbiome metabolism and presents a virtual human
model with many possible pathological conditions.
The final
integration of data from multiple omics-type experiments (like genomics,
transcriptomics, and proteomics) complementing metabolomics studies
315
depends on the ability to combine multilayer
information. MetaboAnalyst, Reactome, Recon, PaintOmics 3,
316
the R-package mixOmics,
317
and OmicsNet
318,319
contain several modules
for multi-omics data integration. A comprehensive review by Wörheide
et al.
320
discusses the various ways
to perform data integration in multi-omics workflows. MetScape
321−323
is a Cytoscape plugin to facilitate the visualization of correlation
networks and metabolic networks based on metabolomics data. These
networks can also integrate transcriptomics data to inspect gene-metabolite
connections, and subnetworks can be extracted. The new visualization
technique GEM-Vis
324
facilitates the visualization
and exploration of time-course metabolomics experiments as metabolic
network maps.
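As a minimal sketch of such multilayer integration, the following uses the mixOmics package mentioned above together with its bundled nutrimouse data set (gene expression plus lipid concentrations from the same animals); the component numbers and keepX/keepY values are arbitrary choices for illustration only.
```r
library(mixOmics)

data(nutrimouse)          # bundled example: two omics blocks, same mice
X <- nutrimouse$gene      # transcriptomics block
Y <- nutrimouse$lipid     # lipid (metabolomics-like) block

# Sparse PLS extracts pairs of components that link correlated gene and
# lipid signatures across the two omics layers
res <- spls(X, Y, ncomp = 2, keepX = c(10, 10), keepY = c(5, 5))

plotVar(res)              # correlation circle of the selected variables
```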
Although the field of metabolomics downstream
data analysis and visualization has clearly gained momentum with an
increasing number of novel tools in recent years,
120,293,325
there are many software examples
that are no longer available through the uniform resource locator
(URL) originally referenced. This is by no means specific to metabolomics,
but rather to bioinformatics software in general.
326
In addition, some tools—though still available in
online repositories—are not compatible with current R versions
or require specific dependencies. Virtual environments, virtual
machines, or containers are technical solutions to the problem of
long-term software availability, as sketched at the end of this section.
Here it has to be mentioned that the funding of scientific research
is often not able to cover the full life cycle of software development,
as functional tools require maintenance even after their publication.
327
Hence,
sustainability of software solutions is of utmost importance as the
challenge of growing data complexity increases our dependency on data
interpretation pipelines.
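As one pragmatic route in the R ecosystem, the sketch below assumes the renv package; system-level containers such as Docker address the same problem one layer lower.
```r
# Record and restore exact package versions for an analysis project
install.packages("renv")

renv::init()      # create a project-local library and an renv.lock file
# ... develop and run the analysis pipeline ...
renv::snapshot()  # pin the package versions currently in use

# Years later, or inside a fresh container / virtual machine:
renv::restore()   # reinstall exactly the recorded versions
```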
Conclusions
Years
of successful analytical development have led to informative, tailored methods.
An optimal metabolomics workflow should cover the lipid dimension
and has to find the right balance of coverage, throughput, and accuracy.
State of the art workflows consist of complementary multiplatform
modules which allow nontargeted discoveries and targeted absolute
quantification. Only recently have the measurement and data evaluation
strategies of the two sub-ome-specific disciplines, metabolomics and
lipidomics, begun to converge. Both high-resolution and low-resolution
tandem MS are integral
parts of multiplatform approaches. To date, coverage of low-abundance
metabolites (pM/low nM concentrations) is ensured by quadrupole-based
tandem MS. While there has been a paradigmatic shift toward using HRMS
for targeted absolute quantification, thereby enabling us to merge
targeted and nontargeted approaches, typical limits of detection of
HRMS workflows remain in the (low) nM range. The final measurement
strategy depends on sample type and size, sampling frequency, the
envisaged depth of the metabolomics/lipidomic profile, the different
experimental conditions addressed, and finally, the type of information
expected as the outcome. Accurate quantification and identification are
the prerequisites for correct biological interpretation, a bold argument
that remains valid even in the context of powerful, data-rich multi-omics
analysis and network integration. While the gold standard for
validating the quantitative aspect of MS-based metabolomics will remain
stringent analytical validation using standards and reference materials,
the corroboration of the qualitative realm in metabolomics is currently
being revolutionized by bioinformatics tools. For example, in silico
approaches are more and more accepted as an alternative to spectral
library searches. However, one should not forget that the development
and validation of these tools is inherently linked to the availability
of excellent community-based resources. Providing standards and reference
materials, and setting up and curating open-source data sets and experimental
spectral libraries, was and still is of paramount importance for the
progress of the field.