      Is Open Access

      Transcriptome analysis of parallel-evolved Escherichia coli strains under ethanol stress

      research-article


          Abstract

          Background

          Understanding ethanol tolerance in microorganisms is important for the improvement of bioethanol production. Hence, we performed parallel-evolution experiments using Escherichia coli cells under ethanol stress to determine the phenotypic changes necessary for ethanol tolerance.

          Results

          After cultivation for 1,000 generations under 5% ethanol stress, we obtained 6 ethanol-tolerant strains that showed an approximately 2-fold increase in specific growth rate compared with their ancestor. Expression analysis using microarrays revealed that common expression changes occurred during adaptive evolution to the ethanol stress environment. Biosynthetic pathways of amino acids, including tryptophan, histidine, and branched-chain amino acids, were commonly up-regulated in the tolerant strains, suggesting that activation of these pathways is involved in the development of ethanol tolerance. In support of this hypothesis, supplementation of the culture medium with isoleucine, tryptophan, and histidine increased the specific growth rate under ethanol stress. Furthermore, genes related to iron ion metabolism were commonly up-regulated in the tolerant strains, which suggests a change in the intracellular redox state during adaptive evolution.

          Conclusions

          The common phenotypic changes in the ethanol-tolerant strains we identified could provide a fundamental basis for designing ethanol-tolerant strains for industrial purposes.


          Most cited references (38)

          Is Open Access

          Metabolomic and transcriptomic stress response of Escherichia coli

          Introduction

          The response of biological systems to environmental perturbations is characterized by a fast and appropriate adjustment of physiology on every level of the cellular and molecular network. Stress response, as reflected on the level of gene expression, displays some conserved features largely independent of the organism. Gene expression stress responses are transient, leading to new steady state levels similar to those of unstressed cells even in the presence of a persistent stress (Lopez-Maury et al, 2008). Stress response usually combines specific responses, aimed at minimizing deleterious effects (e.g. catalase during oxidative stress) or repairing damage (e.g. protein chaperones under temperature stress), with general responses, which in part comprise the downregulation of genes related to translation and ribosome biogenesis (Hengge-Aronis, 2000). This in turn is reflected by the growth cessation or reduction observed under essentially all stress conditions and is an important strategy to adjust cellular physiology to the new condition. Escherichia coli has been intensively investigated in relation to stress responses (Zheng et al, 2001; Chang et al, 2002; Patten et al, 2004; Phadtare and Inouye, 2004; Gadgil et al, 2005; Durfee et al, 2008). Major components of the general and specific response regulate key cellular processes ensuring global control upon perturbation. σS (RpoS) is a central regulator during the response to many stress conditions. σS controls expression of >140 genes involved in metabolism, protein processing, stress adaptation, transport, and transcriptional regulation (Weber et al, 2005). Another important global regulator is (p)ppGpp, involved in the stringent response, one of the mechanisms bacteria use to tune metabolism to available resources. The stringent response is observed when depleting the system of amino acids, and during carbon starvation (Irr, 1972). 
The majority of global analyses of the E. coli response to environmental changes have been limited to just one level of information processing, transcription. Although this may be explained by both the central importance of gene expression and the availability of mature techniques, which permit the study of transcriptional changes on a genome-wide level, it is also true that similar approaches on different molecular levels are largely missing. Specifically, comprehensive analyses of changes on the level of metabolites are very rare (Brauer et al, 2006). This is particularly true for the integrated and parallel analysis of the systems response on two levels of genome information processing such as the transcriptome and the metabolome (Bradley et al, 2009). To better understand system response to perturbation, we designed a time-resolved experiment to compare and integrate metabolic and transcript changes of E. coli using four stress conditions including non-lethal temperature shifts, oxidative stress, and carbon starvation relative to cultures grown under optimal conditions. The resulting data set allowed us to identify parallel and distinct response patterns, represented by conserved patterns on both the metabolic and gene expression levels, across all stress conditions, which indicates a systematic adjustment to suboptimal growth conditions through the impediment of energy demanding growth-related processes. In addition to this conserved component, each response displayed a large amount of stress specificity, thus allowing the clear discrimination of the various stresses through clustering of the metabolomic or transcriptomic data. Performing a time-resolved analysis of the response, however, showed a higher degree of stress specificity for the metabolomic response when compared with the transcriptomic response during the early time points after stress application. 
In addition, metabolic profiles of cultures entering stationary phase are, in contrast to transcript changes, highly dissimilar to metabolic responses to all other tested perturbations. Clustering and canonical correlation approaches were followed to identify coordinated changes on the transcriptome and the metabolite level, which revealed previously known specific pathway regulations (such as Kleefeld et al, 2009) as well as potential new ones that will require biological validation through further experimentation.

Results and discussion

Experiment design

An established metabolic profiling platform was used to characterize the metabolic responses of E. coli to four different environmental perturbations, comprising oxidative stress, glucose-lactose diauxic shift, heat and cold treatments, and using an unperturbed culture as a control. Each experimental condition was independently repeated three times and in each of these three biological repetitions, three technical replicates were made, thereby yielding a total of >550 samples. Metabolic profiles containing 188 metabolites (95 could be positively identified, 58 could be chemically classified, and 35 were of unknown structure) from E. coli cultures before, during, and after acclimation to the four perturbations plus controls were obtained. In parallel to gas chromatography mass spectrometry (GC-MS) measurements, microarray-based transcript profiling was carried out for samples from time points 10–50 min post-perturbation plus two control time points before each perturbation for all conditions except the oxidative stress experiment, in which all samples (12 time points) were used for transcript profiling covering the entire growth curve, including the stationary phase. Again, three biological replicates were analyzed for each time point, but in contrast to the metabolic profiling no technical repetitions were performed. 
The overall measurement reproducibility was determined for all independently performed biological experiments. Relative standard deviation (RSD) of technical and biological replicates was calculated, and showed high reproducibility (Supplementary Figure 1). Median RSD of metabolic measurements for all biological replicates lay within the range of 19.5 (cold) to 27.1% (oxidative stress). The experiments were designed to both compare and contrast the growth phases within any single applied condition, and also of similar (parallel) time points from the different perturbations on both the metabolic and transcript level. However, of greatest interest was the dynamic response of the system to each of the different conditions applied. Therefore, each experiment was sampled with at least 11 non-linear time points with the highest sampling resolution during the adaptation phase of the culture immediately after perturbation. The five experimental conditions resulted in three distinct growth curves. Exponentially growing cells confronted with oxidative stress and glucose-lactose shift arrested growth for ∼40 min and then resumed logarithmic growth (40–210 min after stress) until reaching stationary phase at about 210 min after stress. After both heat and cold stress application, E. coli stopped growing for approximately 40–50 min and then slowly recovered growth (50–260 min) although at a much slower rate. Within the time frame of the experiment (260 min after stress application), heat and cold stressed cultures did not reach stationary phase. Unperturbed control cultures reached stationary phase about 210 min after having reached optical density (OD) 0.6 (the time point of stress application for the treated cultures). Further details about the growth and sampling time points can be found in Supplementary Figure 2. 
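
The reproducibility measure described above can be illustrated with a minimal sketch. It assumes that the RSD is the sample standard deviation divided by the mean, expressed as a percentage, and that the reported figure is the median of the per-metabolite RSDs; the exact definition used in the study is given in its methods and is not reproduced here. The function names and the intensity values are hypothetical.

```python
from statistics import mean, median, stdev


def rsd_percent(values):
    """Relative standard deviation: sample SD divided by the mean, in percent."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined for a zero mean")
    return 100.0 * stdev(values) / abs(m)


def median_rsd(replicates_by_metabolite):
    """Median of the per-metabolite RSDs.

    replicates_by_metabolite maps a metabolite name to the list of
    intensities measured in its replicates.
    """
    return median(rsd_percent(v) for v in replicates_by_metabolite.values())


# Hypothetical intensities for three biological replicates (not study data).
example = {
    "glc-6-P": [100.0, 120.0, 80.0],
    "3PGA": [50.0, 55.0, 45.0],
    "pyruvic acid": [10.0, 13.0, 7.0],
}
print(median_rsd(example))
```

With these made-up values the per-metabolite RSDs are 20%, 10%, and 30%, so the median is 20%.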

Growth phase has a predominant influence on metabolic profiles

Here, we describe the significant metabolic changes (P⩽0.05, ratio ⩾2) relative to time points before perturbation, illustrating the influence growth phase has on the metabolic composition. We first analyzed the changes of the metabolite composition across all time points of all conditions. As cultures were harvested before perturbation in mid-logarithmic growth, a comparison of the metabolic response from each condition to the average of metabolites taken before perturbation is possible (Figure 1). Figure 1 shows the metabolic profiles of all identified metabolites for all four stress conditions and the control relative to time points before perturbation. One of the most striking features of the heat map is the strong influence of the growth phase on metabolite levels. During both temperature experiments (cold and heat stress), the temperature was maintained at the altered level after the initial shock treatment. In consequence, no resumption of exponential growth was observed (Supplementary Figure 2). In this sense, the applied cold and heat stresses are ‘permanent’, which is largely reflected in the metabolic readout. After application of cold or heat, metabolite levels stay fixed or gradually recover after the initial perturbation immediately following stress. This is in contrast to the more transient changes seen after hydrogen peroxide treatment and carbon source shift, in both of which exponential growth is restored 40 min post-perturbation. A detailed description of the metabolite changes is given in the Supplementary information.

The conserved metabolic response pattern is in agreement with the energy conservation program

The requirement to conserve energy is an important feature of all stress responses, and this necessity has been associated with many stress response mechanisms including the stringent response (Durfee et al, 2008), and the general stress response (Weber et al, 2005). 
The implementation of the latter has been shown through gene expression studies to reduce energy expenditure through the repression of genes involved in growth, cell division, and protein synthesis (Weber et al, 2005). The repression of transcripts involved in aerobic metabolism has also been seen in response to oxidative stress (Chang et al, 2002) and carbon starvation (Nystrom, 2004). It has been shown that the stringent response involves the downregulation of transcripts involved in transcription and translation (Barker et al, 2001). In light of these transcriptome-based observations, we decided to see whether the general decrease of central metabolism is also reflected on the metabolite level across the different stress conditions. As induction of the general stress response takes place directly after perturbation, we concentrated on the changes specifically during the first 40 min after application of the stress, the time where cells had not yet resumed growth (Supplementary Figure 2). Metabolic profiles of all identified metabolites are presented in Figure 1, whereas all significant changes are shown in Supplementary Table 1. Consistent decrease in the levels of metabolites related to glycolysis, the pentose phosphate pathway (p.p.p.), and the TCA cycle is one of the most pronounced effects of the stress application (Supplementary Figure 3). Those include rapid decrease of glucose-6-phosphate (glc-6-P), glyceric acid-3-phosphate (3PGA), pyruvic acid followed by decrease of succinic acid, erythrose-4-phosphate (E-4-P), and ribose-5-phosphate (ribose-5-P) within 40 min, and 6-phosphogluconic acid 90 min after heat-stress application. After oxidative stress application within 20 min, glc-6-P, 3PGA, malic acid, and 2-ketoglutaric acid decreased. Levels of 2-ketoglutaric acid decreased also 10 min after glucose-lactose shift. At 90 min after cold stress, levels of malic acid and ribose-5-P significantly decreased. 
Noteworthy is the decrease in levels of ribose-5-P, which is a precursor for nucleotide biosynthesis. The decrease in nucleotide biosynthesis is also strongly reflected on the transcript level (see below), being one of the most pronounced responses common to different stress conditions (Gasch et al, 2000). The only glycolytic intermediate that accumulates during the adaptation phase is phosphoenolpyruvic acid (PEP), which transiently increases 10 min after the glucose-lactose shift. As PEP serves as phosphate donor for the phosphotransferase system responsible for glucose import, swift accumulation of PEP was recently proposed to be a direct effect of decreased glucose import caused by low glucose concentration in the medium (Brauer et al, 2006). Another general effect of stress application is the accumulation of various amino acids (Supplementary Figure 3). During the adaptation phase, levels of alanine, asparagine, lysine, isoleucine, methionine, leucine, aspartic acid, glutamic acid, phenylalanine, and homoserine significantly increase under cold; isoleucine, threonine, phenylalanine, lysine, alanine, asparagine, glutamic acid, and homoserine under heat; asparagine in the lactose shift; and alanine and asparagine in the oxidative stress experiment. The increase in amino-acid levels could be, at least in part, a result of increased protein degradation (Mandelstam, 1963). Degradation of proteins can be caused by the need to eliminate abnormal proteins formed as a result of stress, or can be interpreted as a means to increase the availability of amino acids required for the synthesis of new proteins important for survival under the new, less favorable condition (Willetts, 1967). It has been shown that protein degradation is influenced by the increase of ppGpp levels during amino acid and carbon starvation, and this degradation was suggested to be dependent on the action of Lon and Clp proteases (Kuroda et al, 2001). 
Proteins that are preferentially degraded by proteases are free ribosomal proteins, tagged with a polyphosphate chain, which stimulates proteolytic attack (Kuroda et al, 2001). In line with those findings, we observed a massive increase in the levels of various amino acids on the entry to the stationary phase of growth starting from 210 min after oxidative stress, lactose shift, and in parallel time points in the control cultures (Supplementary Table 1). Although many amino acids accumulate, some do show a decrease. Methionine levels significantly decrease after both heat and oxidative stress (Supplementary Table 1), which is in agreement with methionine synthase (MetE) being very sensitive to oxidation. The addition of methionine to the growth medium leads to increased survival of E. coli during heat stress and a shortened growth lag during oxidative stress (Hondorp and Matthews, 2004). As oxidized MetE is inactive, the resulting methionine limitation might affect protein translation (Gold, 1988). In line with these findings, we observe an increase in methionine levels on growth resumption in the oxidative stress experiment (Figure 1; Supplementary Figure 3). Taken together, the changes observed on the metabolic level, specifically the decrease in most measured metabolites of the TCA cycle and the glycolysis pathway (cf. also Supplementary Figure 3 for a better presentation of these metabolites), are in agreement with the general energy conservation strategy previously reported for the transcriptomic response.

Major changes at the metabolic and transcript level coincide with growth transitions

As discussed in the Introduction, both specific (Figure 1) and general responses (Gasch et al, 2000; Weber et al, 2005) were observed. To further probe conserved and non-conserved responses, we analyzed time points displaying the highest number of changes. 
To this end, the number of metabolites and transcripts that significantly differ (change) between two neighboring time points (the time point of interest and the directly preceding one) within each condition were calculated. When performing this analysis for all conditions, a highly conserved pattern emerged for both transcripts and metabolites (Figure 2A and B). Thus, on both levels, the largest number of changes is observed within the first time point after stress application with the largest number of changes on the transcriptome level displayed by the heat-stress conditions. As to the metabolite pattern, the diauxic shift displays the largest number of changes followed by cold stress, oxidative stress, and heat stress. It is important to note that no significant changes were observed for the control cultures during this growth period (mid-log growth phase), indicating that exponential growth phase is represented on both levels by few if any changes on the level of transcripts and metabolites, which is in agreement with transcript level observations (Chang et al, 2002). Overrepresentation analysis of functional categories (based on gene ontology—GO) of genes, which change at 10 min past stress application reveals a conserved pattern across all conditions. Genes associated with amino acid, amine, nucleotide, and ribonucleotide biosynthetic processes and ATP synthesis, proton transport were downregulated (Supplementary Figure 4). These findings are in agreement with comparable experiments performed for both yeast and E. coli (Gasch et al, 2000; Chang et al, 2002; Durfee et al, 2008). Interestingly, we observed downregulation of genes assigned to ‘flagella motility' GO term across all conditions. As flagella motility requires a steep proton gradient between the periplasmatic space and the cytoplasm, decreased cell motion could indicate energy deficiency. Other biological processes that depend on proton gradient are ATP synthesis and transmembrane transport. 
However, in contrast to genes involved in ATP synthesis, which decrease after all perturbations, genes encoding general transport increase during glucose-lactose shift and oxidative stress. This could indicate that transport of external carbon sources is favored over chemotaxis (Lemuth et al, 2008). The coincidence of the response on both levels can indicate that the changes on the metabolic level are not transcriptionally dependent. Global proteomics analyses indicated that protein levels, post-translational modifications, and stability are directly affected by different perturbations (for review see Kultz, 2005). As enzyme abundance and activity have a predominant influence on biochemical reactions, the possibility that metabolic changes are caused by enzymes directly influenced by environmental conditions cannot be excluded. This possibility could be tested by applying transcription inhibitors (e.g. Rifampicin) and analyzing the kinetics of the metabolic response. It would be interesting to further extend this concept by applying inhibitors of protein synthesis or of protein post-translational modification.

Stress response displays higher specificity on the metabolite as compared with the transcript level with respect to the individual stress applied

As described above, the general response pattern on both metabolite and transcript level is similar with respect to its kinetics within 40 min post-perturbation. To see whether this pattern is due to similar or rather dissimilar responses, we determined which metabolites and transcripts change significantly (for significance thresholds see Materials and methods section) during the different stress treatments in comparison to the corresponding time points from the control. Subsequently, we asked whether the observed changes display a significant overlap between different conditions by applying the Fisher exact test. This analysis enables us to compare the specificity (as defined in the Materials and methods section) of the E. coli response to perturbation on the metabolome with that on the transcriptome. Figure 3 displays these results for all pairwise comparisons of experimental conditions in a binary form: 1 encodes a significant overlap or dependence of the response of two conditions, whereas a 0 entry corresponds to no significant overlap, that is, an independent response. The absolute numbers of changing genes and metabolites are shown in Supplementary Figure 5. With respect to the metabolites, as shown in Figure 3A for the first post-perturbation time point (10 min), stress specificity is high, with only one of the six possible comparisons displaying significant similarity (heat and oxidative stress). At later time points (20 and 30 min post-perturbation), three out of six comparisons show overlap, whereas after 40 min, only heat and oxidative stress still overlap. We summarize these findings by a positive predictive value (PPV) of 71% for the metabolic response. We next analyzed the overlap on the transcriptome level. However, as the number of metabolites analyzed is less than the number of transcripts, a direct comparison between both data sets would be biased. Moreover, this could lead to a higher level of conservation on the transcript level due to the inclusion of many general transcriptional responses (as exemplified by ESR in yeast, Gasch et al, 2000) not paralleled by any metabolite data. Therefore, the transcriptome analysis included only those 288 genes that are directly linked to metabolic enzymes (based on EcoCyc), by considering genes where either the substrate or the product was contained in the metabolite data set (Supplementary Table 2). In contrast to the metabolite data, more pairwise comparisons of different conditions show dependence in the transcriptome response (Figure 3B). Our results show a significant overlap for three comparisons within 10 min, and five pairwise comparisons 20 min after stress (Figure 3B). 
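
The pairwise overlap test and its binary encoding can be sketched as follows. This is an illustration only: it assumes a one-sided Fisher exact test (enrichment of the overlap) on the 2×2 table formed by two sets of changed features within a common universe, and a hypothetical significance cut-off of 0.05. The sidedness, thresholds, and specificity definition actually used are given in the study's Materials and methods and are not reproduced here. The function names and the example sets are hypothetical.

```python
from math import comb


def overlap_pvalue(changed_a, changed_b, universe_size):
    """One-sided Fisher exact p-value for the overlap of two sets.

    Returns the probability of seeing an overlap at least as large as the
    observed one if the two sets were drawn independently from a universe of
    universe_size features (hypergeometric upper tail).
    """
    a, b = set(changed_a), set(changed_b)
    n_a, n_b, k = len(a), len(b), len(a & b)
    if max(n_a, n_b) > universe_size:
        raise ValueError("a set cannot be larger than the universe")
    total = comb(universe_size, n_b)
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    return tail / total


def overlap_flag(changed_a, changed_b, universe_size, alpha=0.05):
    """Binary encoding: 1 for a significant overlap, 0 otherwise."""
    return 1 if overlap_pvalue(changed_a, changed_b, universe_size) <= alpha else 0


# Hypothetical example: 10 measured features, two conditions.
heat = {"m1", "m2", "m3", "m4"}
oxidative = {"m3", "m4", "m5"}
print(overlap_pvalue(heat, oxidative, 10))  # 40/120
print(overlap_flag(heat, oxidative, 10))
```

In this toy case two shared features out of sets of four and three are not enough for significance, so the pair would be encoded as 0 (independent response).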
The number of dependent responses decreases with increasing time; specifically, the response of the diauxic shift experiment loses similarity to other responses, in correspondence with the metabolic response (Figure 3A). The highest similarity was found for the response toward heat and oxidative stress at both levels. This corroborates the link between responses to heat and oxidative stress observed in earlier studies (Farr and Kogoma, 1991) and is in further agreement with the results of the HCA presented in Supplementary Figure 7. Taken together, the response on the metabolic level is clearly more specific, as the PPV on the metabolites is 71% in contrast to 42% on the transcript level. Our observation that the metabolic response displays a higher level of specificity as compared with the transcriptomics response cannot be explained in a straightforward way. One interpretation is that metabolism has both the capacity to react faster and the need to react more specifically compared with the more mid-term adjustment based on reprogramming of the transcription–translation machinery. A fast delivery of metabolites needed to protect the system could be crucial for the initial survival of the system before more massive changes brought about by changes in the gene expression program come into play. One example of such a mechanism is the osmotic stress response in Synechocystis, where the concentration of compatible solute is regulated on the post-transcriptional level of protein activity triggered directly by the stress and paralleled by a more time-consuming induction of gene expression (Hagemann, 1996).

In contrast to the highly conserved transcriptional response pattern, the metabolite response is different for growth arrest induced by stress and by reaching stationary phase

E. coli responds to stress by ceasing or reducing growth. 
It has been shown previously that changes on the transcript level, as a result of stress-induced growth arrest, significantly overlap with changes observed when cells cease to grow due to entering stationary phase (Chang et al, 2002; Weber et al, 2005). In light of the observation that the stress-induced changes on the metabolite level in the initial response phase display a higher stress specificity compared with the transcript level, we were interested to determine the degree of similarity of the changes on the metabolite level observed in response to the two different growth cessation conditions. To this end, we compared time points from the stress adaptation phase and time points taken 210 min after stress application (for details see Materials and methods section). At this time point, the lactose shift, oxidative stress, and the control experiment had entered the stationary phase (Supplementary Figure 2). Both temperature stress experiments were excluded from this comparison as, due to the maintained temperature stress, these cultures do not resume exponential growth and therefore do not run out of nutrients and enter stationary phase. When comparing only the metabolic profiles for the three stationary phase samples, a high degree of similarity is seen (Figure 4A) in agreement with the results of the HCA shown in Supplementary Figure 7, suggesting an underlying common cause. Among the metabolites that change consistently in all stationary phase conditions PEP, isoleucine, and phenylalanine all increased, whereas homoserine consistently decreased. A decrease in homoserine levels and an increase in PEP has previously been shown under carbon and nitrogen starvation (Brauer et al, 2006). The assumption of carbon starvation as the common underlying source is further supported by transcriptome data revealing an upregulation of carbon starvation-induced genes (csiD, csiE, cstA). 
The metabolites that significantly change their concentration upon entry into stationary phase (210–260 min) were subsequently compared with those whose levels changed within 10–40 min after the respective perturbation. Only 1 of the 12 pairwise comparisons of metabolic responses, (heat-stress response versus stationary phase of the oxidative stress) resulted in a significant similarity as based on the Fisher exact test (Figure 4B; see Supplementary information for a discussion regarding the overlap between stationary phase and heat stress). This indicates a high degree of dissimilarity (PPV=92%) between metabolic responses during growth cessation as induced through stationary phases or through various stress applications, which is in strong contrast to the high level of overlap reported for the response on the transcript level (Chang et al, 2002). To assure ourselves that the difference described above between metabolite and transcript characteristics is not due to differences in experimental conditions, we performed the same comparison between the transcriptome changes observed during growth cessation due to stationary phase as compared with induced by stress application on our own data set. To this end, stationary phase samples from the oxidative stress experiment were analyzed for the transcriptome and compared against the transcriptome changes occurring as a result of stress application. With the exception of the cold stress response, a highly significant overlap between stationary phase-induced growth arrest and stress-induced growth arrest was observed (PPV=25%), thus further strengthening the significance of the observed disparate behavior for the metabolite response (Figure 4C). 

The level of coordination between transcript and metabolite data is strongly influenced by the environmental conditions

As outlined in the Introduction, biological systems respond to changes in their environments by adjusting their entire physiology to the new condition involving different levels of the system. In this study, we have monitored responses in parallel on the transcriptome and the metabolite level, thus allowing one to compare the level of coordination between both molecular readouts. To perform this analysis, we followed two different approaches, an untargeted (holistic) co-clustering approach and a targeted approach using prior biological knowledge in conjunction with canonical correlation analysis (CCA). In the co-clustering approach, metabolites and transcripts were jointly subjected to a k-means clustering. The resulting clusters were subsequently analyzed for overrepresentation of transcripts and metabolites from the same biochemical pathway (see Materials and methods section for details). When applying this approach to the entire data set, that is, combining the measurements of all individual stress conditions, no co-clustering of metabolites and transcripts from the same pathway could be observed (data not shown). Applying this co-clustering approach separately to each growth phase of each stress condition (e.g. all time points from the oxidative stress condition), we were able to identify several metabolites and transcripts from the same pathway within the same cluster, although the overall enrichment is restricted to ≈10% of the derived clusters. Furthermore, several gene–metabolite pathway associations are not preserved and were found for only one of the conditions. Interestingly, the oxidative and cold stress conditions exhibit the largest number of associations (see Supplementary Table 3 for a full representation of the results). 
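
The co-clustering idea can be illustrated with a small, self-contained sketch. It is not the procedure of the study: it assumes z-scored time profiles, a plain Lloyd-style k-means with a deterministic start (the first k profiles, so the result depends on input order), and it only flags pathways for which a metabolite and a transcript land in the same cluster, whereas the study tests for statistical overrepresentation. All profile values and pathway labels are hypothetical.

```python
from statistics import mean, pstdev


def zscore(profile):
    """Scale a time profile to zero mean and unit variance (compares shapes)."""
    m, s = mean(profile), pstdev(profile)
    return [0.0] * len(profile) if s == 0 else [(x - m) / s for x in profile]


def sq_dist(p, q):
    """Squared Euclidean distance between two profiles of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=100):
    """Lloyd's k-means; centroids start at the first k points."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels


def co_clustered_pathways(features, k):
    """Pathways with a metabolite and a transcript in the same cluster.

    features maps a name to (kind, pathway, profile), where kind is
    "metabolite" or "transcript".
    """
    names = list(features)
    labels = kmeans([zscore(features[n][2]) for n in names], k)
    kinds_seen = {}
    for name, label in zip(names, labels):
        kind, pathway, _ = features[name]
        kinds_seen.setdefault((label, pathway), set()).add(kind)
    return sorted(
        {pw for (_, pw), kinds in kinds_seen.items()
         if kinds == {"metabolite", "transcript"}}
    )


# Hypothetical profiles over four time points (not study data).
features = {
    "asparagine": ("metabolite", "asparagine degradation", [1.0, 2.0, 4.0, 8.0]),
    "trehalose": ("metabolite", "trehalose degradation", [8.0, 4.0, 2.0, 1.0]),
    "ansB": ("transcript", "asparagine degradation", [2.0, 4.0, 8.0, 16.0]),
    "treA": ("transcript", "trehalose degradation", [16.0, 8.0, 4.0, 2.0]),
}
print(co_clustered_pathways(features, k=2))
```

Because the profiles are z-scored before clustering, a metabolite and a transcript with the same temporal shape but different absolute scale fall into the same cluster, which is the property the co-clustering relies on.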
One striking observation immediately apparent was the overrepresentation of amino acids in the gene–metabolite associations and more specifically the association between amino acids and genes involved in amino-acid catabolism (cf. Figure 5, which shows, as an example, a schematic view of the corresponding pathway and the corresponding transcript and metabolite levels). Thus, asparagine levels are highly associated with transcript levels of the asparaginase gene ansB; threonine and its precursor, aspartic acid, correlate with expression of the tdh and kbl genes; and arginine correlates with expression of genes involved in the arginine and ornithine degradation pathway. Glutamine levels correlate with a number of transcripts associated with arginine biosynthesis, which might indicate a common regulation by glutamate, a precursor for both arginine and glutamine synthesis. In contrast to the numerous associations between amino-acid catabolism genes and amino acids, only a few associations are observable for amino acids and corresponding genes encoding biosynthetic enzymes. Examples of this type of association are observable between valine and one of the enzymes from the valine biosynthesis pathway, IlvC, and between histidine and genes coding for two enzymes involved in histidine biosynthesis, HisB and HisC. The only association observed for a non-amino acid metabolite and a related gene is the co-clustering of trehalose and the gene treA, encoding its degrading enzyme trehalase, under stationary phase (Figure 5). Most of the data described here and in other studies indicate that environmental changes are most profound in central metabolism, especially with respect to the early response. 
In a second approach, we therefore limited the analysis to particular pathways covering parts of central metabolism, which bears the further advantage of significantly reducing data complexity especially with respect to the transcripts, thus allowing other algorithms to be applied. More specifically, metabolites from glycolysis, the TCA cycle, the p.p.p., and anaerobic respiration were subjected to a CCA together with transcript data of all enzymes from those pathways as derived from EcoCyc. As we are also interested in general regulators, we further included several global transcriptional regulators, known to be involved in metabolism control (ArcA, ArcB, Cra, Crp, Cya, Fnr, Mlc). A complete list of all metabolites and transcripts covered is given in Supplementary Table 4. Figure 6 shows in an exemplary manner the canonical structure correlation plot as a result of the CCA, applied to the control condition data (see Supplementary Figure 8 for the remaining two conditions discussed in this section). The results for the three conditions are summarized in the form of projection onto pathways in Figure 7A–C. When applying CCA to all conditions separately, multiple associations were observed only for three conditions: control growth, heat stress, and stationary phase. The visualization of the canonical structure correlations with the first two canonical variates (see Materials and methods section) shows a number of metabolites in close proximity to genes coding enzymes, which catalyze their biochemical conversions. For the remaining three conditions, cold stress, oxidative stress, and diauxic shift, very few or no intuitive associations were observed. Under control conditions, two groups of highly associated metabolites and transcripts are observed (Figures 6 and 7A, colored in magenta and blue). The first comprises all measured metabolites from the oxidative p.p.p. 
(glc-6-P, 6-P-gluconic acid, ribose-5-P, and E-4-P) in addition to metabolites from the glycolytic pathway (3PGA and PEP in addition to glc-6-P) forming a strong association with two genes encoding pathway enzymes, that is rpe encoding ribulose phosphate 3-epimerase and pps encoding PEP synthase. The high association of metabolites and transcripts from these two pathways is only observed under optimal growth conditions and is largely lost under all other conditions analyzed such as heat stress and during the stationary phase (see Supplementary Figure 8A and B). This tight coupling between glycolysis and the p.p.p. might reflect the strong demand of fast growing cells for synthesis of high levels of the nucleotide precursor ribose-5-P. It is known that exponentially growing cells metabolize glc-6-P into fructose-6-phosphate (fru-6-P) and 3PGA by glycolytic enzymes, and next use transketolase and transaldolase enzymes from p.p.p. to convert two molecules of fru-6-P and one molecule of 3PGA into three molecules of ribose-5-P (Berg et al, 2006). Finally, these data suggest that both rpe and pps could have a major regulatory function mostly exerted through transcriptional regulation of both genes. The second group of coordinated metabolites and genes found under optimal growth conditions form part of the TCA cycle. Thus, the expression of the mqo gene encoding malate-quinone oxidoreductase (MQO) is associated with all TCA cycle intermediates measured: 2-ketoglutaric acid, fumaric acid, malic acid, and succinic acid. In addition, pyruvic acid, which is located at the key point between glycolysis and the TCA cycle, shows association with mqo. MQO catalyses the irreversible oxidation of malate to oxaloacetate (Kather et al, 2000) that in turn regulates the activity of citrate synthase, which is a major rate determining enzyme of the TCA cycle (Frederick and Roy Curtiss, 1996). 
Although the conversion of malate to oxaloacetate is also catalyzed by other enzymes including the NAD-dependent malate dehydrogenase (mdh), it was recently suggested that under optimal growth conditions, MQO is the major route of malate oxidation (van der Rest et al, 2000). The strong association between mqo gene expression and multiple members of the TCA cycle as well as pyruvate suggests that mqo expression has a major function in the regulation of the TCA cycle, which needs to be experimentally validated. The tight coupling between the oxidative p.p.p. and glycolysis is lost, however, under non-optimal growth conditions. Thus, during stationary growth, no association is observed between any metabolites and transcripts related to those pathways (Figure 7C). In contrast, under heat stress (Figure 7B; Supplementary Figure 8B), the expression of the zwf gene encoding glc-6-P dehydrogenase correlates with three intermediates of the p.p.p., namely glc-6-P, 6-phosphogluconic acid, and E-4-P, suggesting control of the flux through the p.p.p. by changes in zwf expression. Expression of zwf, which encodes the first key enzyme of the p.p.p., is controlled, among others, by the SoxRS regulon in response to oxidative stress (Fawcett and Wolf, 1995). The correlation between zwf expression and p.p.p. metabolites under heat stress indicates a similar redirection of the p.p.p. under heat-stress conditions, again emphasizing the similarity between heat and oxidative stress. Analysis of the stationary phase data reveals, among others, the association of three metabolites of the TCA cycle (malic, fumaric, and succinic acid) with the expression of several genes, including fumarate reductase (frdC, frdD), fumarase B (fumB), and the fumarate-succinate antiporter (dcuB).
This is a most interesting observation, as fumaric acid is known to serve as an alternative electron acceptor during anaerobic respiration and further regulates the expression of genes associated with anaerobic respiration, including the four genes mentioned above (Jones and Gunsalus, 1987; Zientz et al, 1998; Golby et al, 1999). The mechanism of this regulation includes activation of the DcuS-DcuR two-component system by fumaric acid, which subsequently stimulates expression of target genes (Kleefeld et al, 2009). Our data confirm this model and in addition show that this regulation only holds true under stationary phase, which is characterized, among others, by limiting oxygen availability. Because the expression of both fumarate reductase genes (frdC, frdD) is also tightly coordinated with malic and succinic acid, this model can be further extended: expression of these genes might be regulated by the levels of all three metabolites, which is in agreement with previous studies (Kleefeld et al, 2009). A complex picture, different from both the stationary phase and the optimal growth conditions, emerges from the analysis of the heat-stress experiment concerning the TCA cycle. Inspection of the canonical loadings shows, among other associations, a high similarity between the expression levels of the pflB gene coding for pyruvate formate-lyase (PFL) and the concentration of pyruvic acid. Pyruvic acid is furthermore strongly associated with the transcriptional regulator FNR (fnr). This association is in full agreement with a model developed for anaerobic conditions (which are approximated by heat stress), which suggests that expression of pflB is regulated in an FNR-dependent manner by pyruvic acid (Sawers and Bock, 1988). It is interesting to see that two other genes from upper glycolysis (pgk and pgi) are also in close proximity to fnr, pflB, pyruvic acid, and 3PGA (Supplementary Figure 8B). Both of these genes seem to have an important function in anaerobic metabolism.
The expression of pgk encoding phosphoglycerate kinase is induced under anaerobiosis (Nellemann et al, 1989), whereas a mutation in pgi was shown to reduce the expression of several anaerobically induced genes, including PFL, with glucose as the sole carbon source (Rasmussen et al, 1991). Interestingly, the effect of the pgi mutation could be overcome by addition of pyruvic acid (Rasmussen et al, 1991). This, together with our data, might suggest that the induction of PFL expression depends on the presence of glycolytic intermediates whose synthesis is blocked in the pgi mutant, most likely pyruvic acid (Leonardo et al, 1993). This leads to the hypothesis that the products of both pgk and pgi could have important functions under hypoxic conditions by controlling the levels of pyruvate, which is then converted by PFL in anaerobic respiration.

Conclusion

The time-resolved and combined analysis of the transcriptomic and metabolomic response of E. coli to four different stresses reveals conserved and specific responses on both levels of information processing. Different stress conditions have a similar global impact on cell metabolism, which consists of an energy conservation strategy, as is evident on the transcript and metabolic level. Co-occurring responses on the transcript and metabolic level were observed as peaks of maximal changes directly post-perturbation, irrespective of the stress applied. The co-occurrence of metabolic and transcript responses was observed for functionally related genes and metabolites and is proposed to be an effect of strong co-regulation of both levels of response. Specificity of the response is higher on the metabolome than on the transcriptome level, especially during early time points after perturbation. Stress-induced growth cessation is similar to stationary phase growth cessation when compared on the level of the transcriptome, but different when compared on the level of the metabolome.
Application of co-clustering and CCA on combined metabolite–transcript data identified a number of condition-dependent significant associations between metabolites and transcripts. The results obtained confirm and extend existing models of co-regulation between gene expression and metabolites, demonstrating the power of integrated, systems-oriented analysis.

Materials and methods

E. coli culture conditions

For all experiments, E. coli strain MG1655 was used, which was obtained from the American Type Culture Collection (ATCC 700926). The minimal medium used for all experiments was a modification of MOPS (morpholinopropane sulfonate) minimal medium (Neidhardt et al, 1974) obtained from Teknova, CA (product number M2006), which contains 86 mM NaCl, 9.5 mM NH4Cl, 5 mM K2HPO4 and 0.2% glucose. All cultures were grown aerobically in a thermostatically controlled 37°C culture room. Cultures (150 ml culture volume) were stirred by magnetic stirrers at 330 r.p.m. (Thermo Scientific Variomag Multipoint 6) in 1000 ml Erlenmeyer flasks. Analysis of gene expression data for transcripts indicative of anaerobiosis showed the absence of any oxygen shortage under optimal growth conditions and, on the contrary, showed a slight induction of genes associated with aerobic respiration, for example ubiquinone oxidoreductase (nuoH, nuoN, nuoL). Induction of expression of genes associated with hypoxia was, however, observed after glucose-lactose shift and oxidative stress, and was more pronounced during heat stress and stationary phase. Temperature and pH were carefully monitored during growth. Starting cultures were inoculated from a single colony and grown overnight. Each experimental culture was then inoculated from such an overnight culture at a dilution of 1:20 into 150 ml fresh MOPS minimal medium in a 1000 ml flask. Growth of cultures was monitored by measuring OD at 600 nm using an Eppendorf Biophotometer.
All cultures were grown until early mid-log phase (OD 0.6), at which point each of the perturbations was applied.

Oxidative stress

A measure of 200 μg/ml of 30% pre-warmed hydrogen peroxide (Fluka) was added to 150 ml constantly stirred (330 r.p.m.) cultures kept in 1000 ml flasks. The amount of hydrogen peroxide used for the stress was calculated to cause a non-lethal 40 min lag phase. This was monitored by plating on solid LB medium and calculating viable cell number.

Cold stress

Cultures were transferred from 37°C into an ice-cold water bath to lower the temperature, while stirring, to 16°C in 0.

Co-clustering and pathway enrichment analysis of genes and metabolites

Here, we use a co-clustering approach to determine the extent to which genes and metabolites showing differential expression under the investigated conditions are involved in the same biochemical pathway. We apply a k-means clustering algorithm simultaneously to the combined metabolite and transcript level data for a specific condition, given in the form of an m × n matrix J (m is the total number of genes and metabolites and n is the number of time points). To limit the effect of the absolute magnitude of concentration or expression levels on the similarity measure used, we normalized every row in J to have zero mean and unit variance (i.e. we performed a z-score transformation). To supply a suitable estimate for the initial number of clusters (i.e. parameter k) for the k-means algorithm for every experimental condition, we used a graph-based approach to estimate a probabilistic data-dependent range for k (Klie et al, 2010). Briefly, this approach uses the topology of a graph-based representation of J to identify dense regions (i.e. clusters) of the data by means of a random walk. These dense regions are then enumerated by construction of a minimum spanning tree.
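The normalization and clustering step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names (`zscore_rows`, `kmeans`, `best_of_restarts`), the toy matrix `J`, the Euclidean distance, and the choice of 20 restarts are all assumptions made for the example, and the graph-based estimate of the range for k (Klie et al, 2010) is not reproduced.

```python
import math
import random

def zscore_rows(matrix):
    """Scale every row to zero mean and unit (population) variance."""
    scaled = []
    for row in matrix:
        mean = sum(row) / len(row)
        sd = math.sqrt(sum((v - mean) ** 2 for v in row) / len(row))
        # A constant row carries no profile shape; map it to zeros.
        scaled.append([(v - mean) / sd if sd > 0 else 0.0 for v in row])
    return scaled

def kmeans(rows, k, n_iter=100, seed=0):
    """Lloyd's k-means with Euclidean distance; returns (labels, rmse)."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(r, centers[c]))
                      for r in rows]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster ran empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    sq_err = sum(math.dist(r, centers[lab]) ** 2 for r, lab in zip(rows, labels))
    return labels, math.sqrt(sq_err / len(rows))

def best_of_restarts(rows, k, n_restarts=20):
    """Repeat k-means with different random starts; keep the lowest RMSE."""
    runs = [kmeans(rows, k, seed=s) for s in range(n_restarts)]
    return min(runs, key=lambda run: run[1])

# Hypothetical joint matrix J: rows are two transcripts and two metabolites,
# columns are four time points. Magnitudes differ, profile shapes do not.
J = [
    [1.0, 2.0, 3.0, 4.0],          # transcript, rising
    [10.0, 20.0, 30.0, 40.0],      # metabolite, rising
    [4.0, 3.0, 2.0, 1.0],          # transcript, falling
    [400.0, 300.0, 200.0, 100.0],  # metabolite, falling
]
labels, rmse = best_of_restarts(zscore_rows(J), k=2)
print(labels, rmse)
```

After the z-score transformation the rising transcript and the rising metabolite have the same profile despite a tenfold difference in magnitude, so they land in the same cluster, which is the effect the row normalization is meant to achieve.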
Note that this range is dependent on the used similarity measure and was computed for Euclidean distance and Pearson's correlation coefficient, each resulting in an independent clustering of J. To further increase the robustness of the presented findings in section ‘The level of coordination between transcript and metabolite data is strongly influenced by the environmental conditions,' we repeated the clustering procedure 100 times with randomized initial cluster centers for each k in the previously determined interval, for both similarity measures. Out of those 100 clustering runs, we selected the clustering that minimizes the root mean square error for a given k. This approach aims at compensating for the non-deterministic nature of the k-means algorithm. Finally, over-representation of certain pathways on each cluster was determined analogous to finding enriched GO terms, using the hyper-geometric distribution as a null distribution (Rivals et al, 2007). The significance level was, again, set to 0.05 and the P-values are Benjamini–Hochberg corrected. We focus only on pathways that are enriched for both metabolites and genes, although the pathways enriched only for metabolites and only for genes can also be readily determined. To validate the significance of the observed co-clustering events of genes and metabolites resulting in an enrichment of a certain pathway, we applied a non-parametric bootstrap sampling procedure. Briefly, we sampled m genes and metabolites with replacement and uniform probability from the original condition-specific joint metabolite–transcript data set. The obtained bootstrap sample is then again subjected to k-means clustering to determine whether the previously observed co-clustering is significant or a random observation. 
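The enrichment test with multiple-testing correction can be illustrated as follows. This is a hedged sketch under stated assumptions rather than the procedure used in the paper: the function names and all counts in the example are hypothetical, the test is the standard one-sided hypergeometric test, and the bootstrap validation described above is not included.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) for a cluster of size n drawn without replacement from
    N entities, K of which belong to the pathway of interest."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, upper + 1)) / comb(N, n)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical example: 100 genes and metabolites in total, 10 of them in
# one pathway, and a cluster of 10 members containing 5 pathway members.
p_cluster = hypergeom_sf(k=5, N=100, K=10, n=10)

# Hypothetical raw P-values for four cluster/pathway combinations.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [a <= 0.05 for a in adj]
print(p_cluster, adj, significant)
```

In the setting of the text, such a test would be run once for the genes and once for the metabolites of a cluster, and a pathway would be reported only if both are enriched.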
A more detailed treatment of this analysis step can be found in the Supplementary information; the derived P-values and probabilities for co-clustering events can be found in Supplementary Table 3 together with the pathway enrichment results. We point out that all co-clustering events presented here are significant at the 1% level (after correcting for multiple testing). In summary, we searched for pathway over-representation in each combination of experimental condition, choice of k, and similarity measure.

CCA of genes and metabolites involved in primary metabolism

CCA is a statistical technique for studying associations between two sets of variables (Hotelling, 1936) measured on the same experimental units. CCA and its variants were previously applied either to compare data of the same source (e.g. microarray data) originating from different species (van den Berg et al, 2009) or to integrate different sources of data from the same system (e.g. complementing gene expression data with phenotypic data (González et al, 2008) or integrating data originating from different ‘omics' technologies (Le Cao et al, 2009)). Given a set of genes and a set of metabolites, the principal idea of CCA is to find two linear combinations, one for the set of genes and one for the set of metabolites, which are maximally correlated. Here, the set of genes is described by the matrix X of dimension n × p, where rows correspond to the expression levels measured at n time points of p genes (columns) under one specific condition. Correspondingly, Y of dimension n × q represents the n measured concentrations of q metabolites under the same experimental condition. Furthermore, we denote the ith column of matrix X by $X_i$ and correspondingly denote by $Y_j$ the jth column vector of Y. To avoid having many more variables than observations, we used the data from all three independent biological replicates individually, instead of taking the mean or median of the replicates.
Moreover, it will be assumed that the columns of X and Y are standardized (i.e. have a mean of 0 and a variance of 1), that $p \geq q$, and that X and Y are of full column rank p and q, respectively. Let $a^1 = (a_1^1, \dots, a_p^1)^T$ and $b^1 = (b_1^1, \dots, b_q^1)^T$ denote the two basis vectors, scaled such that $\operatorname{var}(U_1) = \operatorname{var}(V_1) = 1$, for which the correlation between the projections of the variables onto these basis vectors, given by

$$U_1 = X a^1 = \sum_{i=1}^{p} a_i^1 X_i \quad \text{and} \quad V_1 = Y b^1 = \sum_{j=1}^{q} b_j^1 Y_j,$$

is maximized:

$$\rho_1 = \operatorname{cor}(U_1, V_1) = \max_{a,\,b} \operatorname{cor}(Xa, Yb).$$

The derived linear projections $U_1$ and $V_1$ will be called the first canonical variates and $\rho_1$ is referred to as the first canonical correlation. Higher order canonical variates can be found as a stepwise problem with the restriction to be orthogonal to the already determined set of linear combinations. Note that the successively computed canonical correlations satisfy $\rho_1 \geq \rho_2 \geq \dots \geq \rho_q$. In this work, we use the results of the CCA on a subset of genes and metabolites involved in the primary metabolism (see section ‘The level of coordination between transcript and metabolite data is strongly influenced by the environmental conditions' and Supplementary Table 4) as an explanatory tool to display associations between genes and metabolites that are less prominent by means of direct linear relationships (e.g. Pearson correlation) in the initial data. Specifically, for the purpose of visualization, we use two-dimensional scatter plots for the genes and metabolites, which are also known as canonical loadings plots. Here, the axes are defined by the canonical variates $U_j$ and $U_k$ of the gene set X, with $j \neq k$ and both from the integer interval [1, q]. Coordinates of genes in X and metabolites in Y on each axis correspond to Pearson correlations between their initial representation (e.g. for gene i the corresponding column vector $X_i$) and the respective canonical variate $U_s$, $s \in \{j, k\}$. This form of correlation is known as canonical structure correlation. An example of canonical structure correlation vectors can be found in the Supplementary information (Supplementary Table 5).
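To make the plot coordinates concrete, the following sketch computes canonical structure correlations for a few variables, assuming the two canonical variates have already been obtained from a CCA. Everything here is illustrative: the variable names, the numeric profiles, and the two variates `u_j` and `u_k` are hypothetical, and the 0.5 cutoff simply mirrors the inner circle used in the paper's plots.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def structure_coordinates(variables, u_j, u_k):
    """Plot coordinates: correlation of each variable with both variates."""
    return {name: (pearson(values, u_j), pearson(values, u_k))
            for name, values in variables.items()}

# Hypothetical canonical variates over four samples. They are chosen to be
# uncorrelated, as canonical variates of one set are by construction.
u_j = [-1.5, -0.5, 0.5, 1.5]
u_k = [1.0, -1.0, -1.0, 1.0]

# Hypothetical standardized profiles of two genes and one metabolite.
variables = {
    "gene_A": [0.0, 2.0, 4.0, 6.0],
    "metabolite_B": [-0.5, -1.5, -0.5, 2.5],
    "gene_C": [1.0, -1.0, 1.0, -1.0],
}

coords = structure_coordinates(variables, u_j, u_k)
radius = {name: math.hypot(cx, cy) for name, (cx, cy) in coords.items()}
# Variables outside the inner circle of radius 0.5 are the ones of interest.
strong = {name for name, r in radius.items() if r > 0.5}
print(coords, strong)
```

Because the two variates are uncorrelated, every point falls inside the unit circle, and variables that point in the same direction far from the origin are the candidates for a gene–metabolite association.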
As both genes and metabolites are assumed to be of unit variance, their projections on the plane ($U_j$; $U_k$) reside within a circle of radius 1 centered at the origin. Variables with a strong relation are projected in the same direction from the origin. The greater the distance from the origin, the stronger the relation. For clarity, a second circle with a radius of 0.5 is shown; genes and metabolites inside it have weaker associations that are of limited importance for the conducted analysis. The CCA results presented in the paper rely on a regularized version of CCA, which is available in the CCA package (González et al, 2008) for the statistical software R.

Supplementary Material

Supplementary data

Supplementary text, Supplementary tables S1–5, Supplementary figures S1–8

Metabolic raw data

The raw metabolite peak height data (columns 9-205) are supplied together with all samples (rows). The basic description of all samples is given in Columns 1-8 and includes: sample ID, experimental condition, time point, technical and biological repetition, internal standard peak height, and a list of samples prior to perturbation. For further information about the GC-MS analysis and metabolic data processing, see the Materials and Methods section of the manuscript.

Normalization. All metabolites were normalized to the optical density (Column 6) and the intensity of the internal standard (Column 7). Additionally, all time points were normalized to the average of the time points prior to perturbation, which are marked by "TRUE" in Column 8. The proper alignment of the time points from different biological repetitions of the same experiment was done based on the expression of stress-specific marker genes (see manuscript main text for further details).
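The normalization of the raw peak heights could look roughly like the sketch below. It rests on assumptions that the text does not confirm: "normalized to" is read as simple division, the pre-perturbation reference is the arithmetic mean, and the function name and all numbers are hypothetical.

```python
def normalize_metabolite(peaks, od, internal_std, pre_perturbation):
    """Normalize one metabolite across samples.

    peaks            raw peak heights of the metabolite
    od               optical density per sample (Column 6 in the data file)
    internal_std     internal standard peak height per sample (Column 7)
    pre_perturbation True for samples taken before the perturbation (Column 8)
    """
    # Step 1 (assumed to be a division): correct for biomass and for the
    # internal standard intensity.
    scaled = [p / (o * s) for p, o, s in zip(peaks, od, internal_std)]
    # Step 2: express every sample relative to the pre-perturbation mean.
    baseline = [v for v, pre in zip(scaled, pre_perturbation) if pre]
    reference = sum(baseline) / len(baseline)
    return [v / reference for v in scaled]

# Hypothetical values for three samples, two of them before the perturbation.
relative = normalize_metabolite(
    peaks=[100.0, 200.0, 800.0],
    od=[0.5, 0.5, 1.0],
    internal_std=[10.0, 20.0, 20.0],
    pre_perturbation=[True, True, False],
)
print(relative)
```

With this reading, values of 1 correspond to the pre-perturbation level and the third sample shows a twofold increase.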

            RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation

            RegulonDB (http://regulondb.ccg.unam.mx/) is the primary reference database offering curated knowledge of the transcriptional regulatory network of Escherichia coli K12, currently the best-known electronically encoded database of the genetic regulatory network of any free-living organism. This paper summarizes the improvements, new biology and new features available in version 6.0. Curation of original literature is, from now on, up to date for every new release. All the objects are supported by their corresponding evidences, now classified as strong or weak. Transcription factors are classified by origin of their effectors and by gene ontology class. We have now computational predictions for σ54 and five different promoter types of the σ70 family, as well as their corresponding −10 and −35 boxes. In addition to those curated from the literature, we added about 300 experimentally mapped promoters coming from our own high-throughput mapping efforts. RegulonDB v.6.0 now expands beyond transcription initiation, including RNA regulatory elements, specifically riboswitches, attenuators and small RNAs, with their known associated targets. The data can be accessed through overviews of correlations about gene regulation. RegulonDB associated original literature, together with more than 4000 curation notes, can now be searched with the Textpresso text mining engine.

              Global iron-dependent gene regulation in Escherichia coli. A new mechanism for iron homeostasis.

              Organisms generally respond to iron deficiency by increasing their capacity to take up iron and by consuming intracellular iron stores. Escherichia coli, in which iron metabolism is particularly well understood, contains at least 7 iron-acquisition systems encoded by 35 iron-repressed genes. This Fe-dependent repression is mediated by a transcriptional repressor, Fur (ferric uptake regulation), which also controls genes involved in other processes such as iron storage, the Tricarboxylic Acid Cycle, pathogenicity, and redox-stress resistance. Our macroarray-based global analysis of iron- and Fur-dependent gene expression in E. coli has revealed several novel Fur-repressed genes likely to specify at least three additional iron-transport pathways. Interestingly, a large group of energy metabolism genes was found to be iron and Fur induced. Many of these genes encode iron-rich respiratory complexes. This iron- and Fur-dependent regulation appears to represent a novel iron-homeostatic mechanism whereby the synthesis of many iron-containing proteins is repressed under iron-restricted conditions. This mechanism thus accounts for the low iron contents of fur mutants and explains how E. coli can modulate its iron requirements. Analysis of 55Fe-labeled E. coli proteins revealed a marked decrease in iron-protein composition for the fur mutant, and visible and EPR spectroscopy showed major reductions in cytochrome b and d levels, and in iron-sulfur cluster contents for the chelator-treated wild-type and/or fur mutant, correlating well with the array and quantitative RT-PCR data. In combination, the results provide compelling evidence for the regulation of intracellular iron consumption by the Fe2+-Fur complex.

                Author and article information

                Journal
                BMC Genomics
                BioMed Central
                1471-2164
                2010
                19 October 2010
                : 11
                : 579
                Affiliations
                [1 ]Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka, Japan
                [2 ]Exploratory Research for Advanced Technology (ERATO), Japan Science and Technology Agency (JST), 1-5 Yamadaoka, Suita, Osaka, Japan
                [3 ]Graduate School of Frontier Biosciences, Osaka University, 1-5 Yamadaoka, Suita, Osaka, Japan
                Article
                1471-2164-11-579
                10.1186/1471-2164-11-579
                3091726
                20955615
                5c3bbb48-bb74-4ef1-a23c-f4b7ce45aa02
                Copyright ©2010 Horinouchi et al; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 26 July 2010
                : 19 October 2010
                Categories
                Research Article

                Genetics
