The effect of human amnion epithelial cells on lung development and inflammation in preterm lambs exposed to antenatal inflammation

Abstract

Background

Lung inflammation and impaired alveolarization are hallmarks of bronchopulmonary dysplasia (BPD). We hypothesize that human amnion epithelial cells (hAECs) are anti-inflammatory and reduce lung injury in preterm lambs born after antenatal exposure to inflammation.

Methods

Pregnant ewes received either intra-amniotic lipopolysaccharide (LPS, from E. coli O55:B5; 4 mg) or saline (Sal) on day 126 of gestation. Lambs were delivered by cesarean section at 128 d gestation (term ~150 d). Lambs received intravenous hAECs (LPS/hAEC: n = 7; 30 × 10⁶ cells) or equivalent volumes of saline (LPS/Sal, n = 10; or Sal/Sal, n = 9) immediately after birth. Respiratory support was gradually de-escalated, aimed at early weaning from mechanical ventilation towards unassisted respiration. Lung tissue was collected 1 week after birth. Lung morphology was assessed and mRNA levels for inflammatory mediators were measured.

Results

Respiratory support required by LPS/hAEC lambs did not differ from that required by Sal/Sal or LPS/Sal lambs. The lung tissue:airspace ratio was lower in LPS/Sal lambs than in Sal/Sal lambs (P < 0.05), but not in LPS/hAEC lambs. LPS/hAEC lambs tended to have increased septation in their lungs versus LPS/Sal lambs (P = 0.08). Expression of inflammatory cytokines was highest in LPS/hAEC lambs.

Conclusions

Postnatal administration of a single dose of hAECs stimulates a pulmonary immune response without changing ventilator requirements in preterm lambs born after intrauterine inflammation.

Most cited references 57

Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes

Jo Vandesompele, Katleen De Preter, Filip Pattyn … (2002)

Background

Gene-expression analysis is increasingly important in many fields of biological research. Understanding patterns of expressed genes is expected to provide insight into complex regulatory networks and will most probably lead to the identification of genes relevant to new biological processes, or implicated in disease. Two recently developed methods to measure transcript abundance have gained much popularity and are frequently applied. Microarrays allow the parallel analysis of thousands of genes in two differentially labeled RNA populations [1], while real-time RT-PCR provides the simultaneous measurement of gene expression in many different samples for a limited number of genes, and is especially suitable when only a small number of cells are available [2,3,4]. Both techniques have the advantage of speed, throughput and a high degree of potential automation compared to conventional quantification methods, such as northern-blot analysis, ribonuclease protection assay, or competitive RT-PCR. Nevertheless, these new approaches require the same kind of normalization as the traditional methods of mRNA quantification.

Several variables need to be controlled for in gene-expression analysis, such as the amount of starting material, enzymatic efficiencies, and differences between tissues or cells in overall transcriptional activity. Various strategies have been applied to normalize these variations. Under controlled conditions of reproducible extraction of good-quality RNA, the gene transcript number is ideally standardized to the number of cells, but accurate enumeration of cells is often precluded, for example when starting with solid tissue. Another frequently applied normalization scalar is the RNA mass quantity, especially in northern blot analysis. There are several arguments against the use of mass quantity. The quality of RNA and related efficiency of the enzymatic reactions are not taken into account.
Moreover, in some instances it is impossible to quantify this parameter, for example, when only minimal amounts of RNA are available from microdissected tissues. Probably the strongest argument against the use of total RNA mass for normalization is the fact that it consists predominantly of rRNA molecules, and is not always representative of the mRNA fraction. This was recently evidenced by a significant imbalance between rRNA and mRNA content in approximately 7.5% of mammary adenocarcinomas [5]. Also, it has been reported that rRNA transcription is affected by biological factors and drugs [6,7,8]. Further drawbacks to the use of 18S or 28S rRNA molecules as standards are their absence in purified mRNA samples, and their high abundance compared to target mRNA transcripts. The latter makes it difficult to accurately subtract the baseline value in real-time RT-PCR data analysis. To date, internal control genes are most frequently used to normalize the mRNA fraction. This internal control - often referred to as a housekeeping gene - should not vary in the tissues or cells under investigation, or in response to experimental treatment. However, many studies make use of these constitutively expressed control genes without proper validation of their presumed stability of expression. But the literature shows that housekeeping gene expression - although occasionally constant in a given cell type or experimental condition - can vary considerably (reviewed in [9,10,11,12]). With the increased sensitivity, reproducibility and large dynamic range of real-time RT-PCR methods, the requirements for a proper internal control gene have become increasingly stringent. In this study, we carried out an extensive evaluation of 10 commonly used housekeeping genes in 13 different human tissues, and outlined a procedure for calculating a normalization factor based on multiple control genes for more accurate and reliable normalization of gene-expression data. 
Furthermore, this normalization factor was validated in a comparative study with frequently applied microarray scaling factors using publicly available microarray data.

Results

Expression profiling of housekeeping genes

Primers were designed for ten commonly used housekeeping genes (ACTB, B2M, GAPD, HMBS, HPRT1, RPL13A, SDHA, TBP, UBC and YWHAZ) (see Table 1 for full gene name, accession number, function, chromosomal localization, alias, existence of processed pseudogenes, and indication that primers span an intron; see Table 2 for primer sequences). Special attention was paid to selecting genes that belong to different functional classes, which significantly reduces the chance that genes might be co-regulated. The expression level of these 10 internal control genes was determined in 34 neuroblastoma cell lines (independently prepared in different labs from different patients), 20 short-term cultured normal fibroblast samples from different individuals, 13 normal leukocyte samples, 9 normal bone-marrow samples, and 9 additional normal human tissues from pooled organs (heart, brain, fetal brain, lung, trachea, kidney, mammary gland, small intestine and uterus). The raw expression values are available as a tab-delimited file (see Additional data files).

Single control normalization error

To determine the possible errors related to the common practice of using only one housekeeping gene for normalization, we calculated the ratio of the ratios of two control genes in two different samples (from the same tissue panel) and termed it the single control normalization error, E (see Materials and methods). For two ideal internal control genes (constant ratio between the genes in all samples), E equals 1. In practice, observed E values are larger than 1 and constitute the erroneous E-fold expression difference between two samples, depending on the particular housekeeping gene used for normalization.
E values were calculated for all 45 two-by-two combinations of control genes and 865 two-by-two sample combinations within the available tissue panels (neuroblastoma, fibroblast, leukocyte, bone marrow and a series of normal tissues from Clontech; that is, a total of 38,925 data points) (Figure 1). In addition, the systematic error distribution was calculated by analysis of repeated runs of the same control gene. The average 75th and 90th percentile E values are 3.0 (range 2.1-3.9), and 6.4 (range 3.0-10.9), respectively.

Gene-stability measure and ranking of selected housekeeping genes

It is generally accepted that gene-expression levels should be normalized by a carefully selected stable internal control gene. However, to validate the presumed stable expression of a given control gene, prior knowledge of a reliable measure to normalize this gene in order to remove any nonspecific variation is required. To address this circular problem, we developed a gene-stability measure to determine the expression stability of control genes on the basis of non-normalized expression levels. This measure relies on the principle that the expression ratio of two ideal internal control genes is identical in all samples, regardless of the experimental condition or cell type. In this way, variation of the expression ratios of two real-life housekeeping genes reflects the fact that one (or both) of the genes is (are) not constantly expressed, with increasing variation in ratio corresponding to decreasing expression stability. For every control gene we determined the pairwise variation with all other control genes as the standard deviation of the logarithmically transformed expression ratios, and defined the internal control gene-stability measure M as the average pairwise variation of a particular gene with all other control genes. Genes with the lowest M values have the most stable expression.
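The gene-stability measure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the geNorm code: the base-2 logarithm, the sample (n - 1) standard deviation, the function names and the toy expression values are all choices made for this example.

```python
import math
from statistics import stdev


def pairwise_variation(expr_j, expr_k):
    """Standard deviation of the log-transformed expression ratios of two genes.

    Assumption: log base 2 and the sample standard deviation are used here;
    the text above does not fix either choice.
    """
    log_ratios = [math.log2(a / b) for a, b in zip(expr_j, expr_k)]
    return stdev(log_ratios)


def stability_measure(expression):
    """Return {gene: M}, where M is the average pairwise variation of a gene
    with all other control genes. `expression` maps a gene name to its raw,
    non-normalized expression level in each sample."""
    genes = list(expression)
    m_values = {}
    for j in genes:
        others = [k for k in genes if k != j]
        total = sum(pairwise_variation(expression[j], expression[k]) for k in others)
        m_values[j] = total / len(others)
    return m_values


# Hypothetical toy data: GENE_A and GENE_B keep a constant ratio across
# samples, while GENE_C fluctuates independently.
toy_expression = {
    "GENE_A": [100.0, 200.0, 400.0, 800.0],
    "GENE_B": [10.0, 20.0, 40.0, 80.0],
    "GENE_C": [50.0, 400.0, 100.0, 1600.0],
}
m_values = stability_measure(toy_expression)
most_stable_first = sorted(m_values, key=m_values.get)
```

In this toy set the two genes with a constant ratio receive the lowest M values, and the fluctuating gene receives the highest, which is the ordering the measure is meant to produce.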
Assuming that the control genes are not co-regulated, stepwise exclusion of the gene with the highest M value results in a combination of two constitutively expressed housekeeping genes that have the most stable expression in the tested samples. To manage the large number of calculations, we have written a Visual Basic Application (VBA) for Microsoft Excel - termed geNorm - that automatically calculates the gene-stability measure M for all control genes in a given set of samples (geNorm is freely available from the authors on request). The program enables elimination of the worst-scoring housekeeping gene (that is, the one with the highest M value) and recalculation of new M values for the remaining genes. Using this VBA applet, we ranked the ten control genes in the five tissue panels tested according to their expression stability (Figure 2, Table 3). In addition, the systematic variation was calculated as the pairwise variation, V, for repeated RT-PCR experiments on the same gene, reflecting the inherent machine, enzymatic and pipet variation.

Normalization factor calculation based on the geometric mean of multiple control genes

We concluded that in order to measure expression levels accurately, normalization by multiple housekeeping genes instead of one is required. Consequently, a normalization factor based on the expression levels of the best-performing housekeeping genes must be calculated. For accurate averaging of the control genes, we propose to use the geometric mean instead of the arithmetic mean, as the former controls better for possible outlying values and abundance differences between the different genes. The number of genes used for geometric averaging is a trade-off between practical considerations and accuracy. It is obvious that an accurate normalization factor should not include the rather unstable genes that were observed in some tissues.
On the other hand, it remains relatively impractical to quantify, for example, eight control genes when only a few target genes need to be studied, or when only minimal amounts of RNA are available. Furthermore, it is a waste of resources to quantify more genes than necessary if all genes are relatively stably expressed and if the normalization factor does not significantly change whether or not more genes are included. Taking all this into consideration, we recommend the minimal use of the three most stable internal control genes for calculation of an RT-PCR normalization factor (NF_n, n = 3), and stepwise inclusion of more control genes until the (n + 1)th gene has no significant contribution to the newly calculated normalization factor (NF_(n+1)). To determine the possible need or utility of including more than three genes for normalization, the pairwise variation V_(n/n+1) was calculated between the two sequential normalization factors (NF_n and NF_(n+1)) for all samples within the same tissue panel (with a_ij = NF_(n,i) and a_ik = NF_(n+1,i), n the number of genes used for normalization (3 ≤ n ≤ 9), and i the sample index; see Equations 2 and 3 in Materials and methods). A large variation means that the added gene has a significant effect and should preferably be included for calculation of a reliable normalization factor. For all tissue types, normalization factors were calculated for the three most stable control genes (that is, those with the lowest M value) and for seven additional factors by stepwise inclusion of the most stable remaining control gene. Pairwise variations were subsequently calculated for every series of NF_n and NF_(n+1) normalization factors, reflecting the effect of adding an (n + 1)th gene (Figure 3a). It is apparent that the inclusion of a fourth gene has no significant effect (that is, low V3/4 value) for leukocytes, fibroblasts and bone marrow.
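The stepwise procedure above can be sketched as follows. It is a minimal illustration under assumptions: the genes are taken to be already ranked from most to least stable, the base-2 logarithm and the sample standard deviation are choices made here, the function names and toy values are hypothetical, and the 0.15 default mirrors the cut-off value chosen in this study.

```python
import math
from statistics import stdev


def geometric_mean(values):
    """Geometric mean computed via the mean of logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalization_factors(expression, genes):
    """Per-sample normalization factor: geometric mean of the selected genes."""
    n_samples = len(expression[genes[0]])
    return [geometric_mean([expression[g][i] for g in genes]) for i in range(n_samples)]


def pairwise_variation(nf_n, nf_next):
    """Standard deviation of log2(NF_n / NF_(n+1)) across samples."""
    return stdev([math.log2(a / b) for a, b in zip(nf_n, nf_next)])


def genes_required(expression, ranked_genes, cutoff=0.15, minimum=3):
    """Start with the `minimum` most stable genes and add one gene at a time
    until adding the next gene changes the factor by less than `cutoff`."""
    n = minimum
    while n < len(ranked_genes):
        nf_n = normalization_factors(expression, ranked_genes[:n])
        nf_next = normalization_factors(expression, ranked_genes[:n + 1])
        if pairwise_variation(nf_n, nf_next) < cutoff:
            return n
        n += 1
    return n


# Hypothetical toy panel: G1-G3 follow the same per-sample loading pattern.
stable_panel = {
    "G1": [100.0, 200.0, 400.0],
    "G2": [10.0, 20.0, 40.0],
    "G3": [50.0, 100.0, 200.0],
    "G4": [20.0, 40.0, 80.0],      # follows the same pattern: adds nothing new
}
noisy_panel = dict(stable_panel, G4=[100.0, 3200.0, 25.0])  # erratic fourth gene
ranked = ["G1", "G2", "G3", "G4"]

n_stable = genes_required(stable_panel, ranked)
n_noisy = genes_required(noisy_panel, ranked)
v34_noisy = pairwise_variation(
    normalization_factors(noisy_panel, ranked[:3]),
    normalization_factors(noisy_panel, ranked[:4]),
)
```

In the first toy panel V3/4 is essentially zero, so three genes suffice; in the second the fourth gene shifts the factor strongly and the loop moves on. A low V3/4 in this sketch corresponds to the case above in which a fourth gene contributes little.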
This is also illustrated by the nearly perfect correlation between NF3 and NF4 values, as shown for fibroblasts in Figure 3b. On the basis of these data, we decided to take 0.15 as a cut-off value, below which the inclusion of an additional control gene is not required. For neuroblastoma and the pool of normal tissues, one and two additional genes, respectively, are necessary for reliable normalization (see also Figure 3b). The high V8/9 and V9/10 values for the normal pool, neuroblastoma and leukocytes corroborate very well the findings obtained by stepwise exclusion of the worst-scoring control gene (Figure 2). This analysis showed an initial steep decrease in average M value, pointing at two aberrantly expressed control genes for leukocytes and one unstable gene for neuroblastoma and the pool of normal tissues. Furthermore, the need to include additional control genes for these last two tissue panels is in keeping with the high variation in control-gene expression, as evidenced from Figure 2.

Validation of proposed real-time RT-PCR normalization factors

To assess the validity of the established gene-stability measure, that is, that genes with the lowest M values have indeed the most stable expression, we determined the gene-specific variation for each control gene as the variation coefficient of the expression levels after normalization. This coefficient should be minimal for proper housekeeping genes. Three different normalization factors were calculated, based on the geometric mean of three genes with, respectively, the lowest (NF3(1-3)), the highest (NF3(8-10)), and intermediate M values (NF3(6-8)) (as determined by geNorm). We subsequently determined the average gene-specific variation of the three genes with the most stable expression (that is, the lowest variation coefficient) for each normalization factor and within each tissue panel (Figure 4a).
It is clear that the gene-specific variation in all tissue panels is by far the smallest when the data are normalized to NF3(1-3). This demonstrates that the gene-stability measure effectively identified the control genes with the most stable expression. To verify that a high M value is characteristic of an unstable or differentially expressed gene, we analyzed the expression level of MYCN - a highly differentially expressed protooncogene in neuroblastoma with prognostic value [13] - together with the set of ten housekeeping genes. MYCN was readily identified as the most differentially expressed gene, with an M value of 6.02 compared to 2.17 for the least stable control gene (B2M) in neuroblastoma. It was further observed that normalization with a single control gene consistently resulted in significantly higher gene-specific variations of the other control genes (data not shown), which underscores the improvement in normalization by using multiple housekeeping genes. To show that the associations between the best control genes are independent of cell proliferation, we analyzed the expression level of the proliferation marker PCNA in the neuroblastoma cancer cell lines, and determined the Spearman rank correlation coefficient between the raw expression levels of the four best housekeepers and the marker gene PCNA. From this analysis, it was clear that the control genes were - as expected - significantly correlated (p < 0.001, correlation coefficient between 0.60 and 0.76). In contrast, no correlation was observed between PCNA and three of the four control genes, and only a weak correlation (p = 0.024, coefficient = 0.43) between PCNA and control gene HPRT1. These data firmly demonstrate that the most stable control genes (identified by the geNorm algorithm) are not per se linked to the state of cell proliferation of the samples. 
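The gene-specific variation used in the validation above (the coefficient of variation of expression levels after normalization) can be illustrated with a short sketch. The function names, the toy values and the use of the sample standard deviation are assumptions made for this example.

```python
import math
from statistics import mean, stdev


def geometric_mean(values):
    """Geometric mean computed via the mean of logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def gene_specific_variation(expression, nf_genes):
    """Normalize every gene to the geometric-mean factor of `nf_genes`, then
    return {gene: coefficient of variation of its normalized levels}."""
    n_samples = len(next(iter(expression.values())))
    factors = [
        geometric_mean([expression[g][i] for g in nf_genes]) for i in range(n_samples)
    ]
    variation = {}
    for gene, levels in expression.items():
        normalized = [x / f for x, f in zip(levels, factors)]
        variation[gene] = stdev(normalized) / mean(normalized)
    return variation


# Hypothetical toy data: S1-S3 differ only by sample loading (1x, 2x, 4x),
# whereas U changes independently of loading.
toy_expression = {
    "S1": [100.0, 200.0, 400.0],
    "S2": [10.0, 20.0, 40.0],
    "S3": [50.0, 100.0, 200.0],
    "U": [50.0, 400.0, 100.0],
}
cv_after_normalization = gene_specific_variation(toy_expression, ["S1", "S2", "S3"])
```

With a factor built from the three genes that track sample loading, their own residual variation collapses to nearly zero while the independently varying gene keeps a large coefficient of variation, which is the pattern the validation looks for.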
To further validate the accuracy of geometric averaging of carefully selected control genes for normalization, the geometric means of housekeeping-gene expression levels obtained from publicly available microarray data were compared with commonly applied microarray normalization factors calculated for the same data. For this purpose, an 8,000-gene array data set [14] was chosen, containing nine of the ten control genes evaluated in this RT-PCR study. Two commonly applied microarray normalization factors (based on median ratio normalization, and total intensity normalization) [15,16,17] were determined for eight randomly selected hybridization sets. Subsequently, for each hybridization set, the background-corrected expression levels of nine housekeeping genes for the two fluorescence channels were imported into geNorm and ranked, as described for the RT-PCR data. As these microarray data originate from hybridizations of cell lines from various histological origin versus a reference pool of multiple cell lines, we have calculated the geometric mean of the five most stable control genes (NF5) for each hybridization set, in accordance with the recommendations for reliable normalization within a heterogeneous tissue panel (see previous paragraph). Alternatively, internal control genes were excluded in a stepwise manner until the M values of the remaining genes were below 0.7 (experimental value shown to eliminate the most variable and outlying genes in this microarray dataset). Depending on the hybridization set, seven to nine genes fitted this criterion, upon which the geometric mean was calculated (NF_(M<0.7)). Both normalization factors (NF5 and NF_(M<0.7)) were shown to be similar to the calculated microarray normalization factors (Figure 4b).

Tissue-specific housekeeping gene expression

To compare the control gene-expression levels within the heterogeneous group of all 13 tested tissues, the same set of control genes should be used for normalization.
We therefore calculated the geometric mean of six control genes that were withheld from the set of ten genes after elimination of the two genes with the highest M value within each tissue panel (that is, B2M, RPL13A, ACTB and HMBS) (see Table 3). Given the large variety of tested tissues, this is the optimal strategy to eliminate most variation, and to allow direct comparison between the different samples. Under the assumption of equal PCR threshold cycle values for equal transcript numbers of different genes, an estimation of the transcript abundance of the various control genes can be made. Figure 5 shows that the ten tested genes belong to various abundance classes, with an approximately 400-fold expression difference between the most abundant (ACTB) and the rarest (HMBS) transcript. Although the overall abundance of a given control gene in the different tissues is relatively similar, we clearly observe tissue-specific expression differences, for example, B2M expression level is 112-fold higher in leukocytes compared to fetal brain, and ACTB shows an expression difference of 22-fold between fibroblasts and heart tissue. It is also clear that some genes have a relatively constant expression level (for example, UBC and HPRT1) compared to the differential expression pattern of others (for example, B2M and ACTB).

Discussion

Accurate normalization of gene-expression levels is an absolute prerequisite for reliable results, especially when the biological significance of subtle gene-expression differences is studied. Still, little attention has been paid to the systematic study of normalization procedures and the impact on the conclusions. For RT-PCR, there is a general consensus on using a single control gene for normalization purposes.
A comprehensive literature analysis of expression studies that were published in high-impact journals during 1999 indicated that GAPD, ACTB, 18S and 28S rRNA were used as single control genes for normalization in more than 90% of cases [11]. As numerous studies reported that housekeeping gene expression can vary considerably [6,9,10,11,12], the validity of the conclusions is highly dependent on the applied control. Some laboratories have tried to find the optimal control gene for their experimental system, and often rRNA molecules were proposed as best references. These studies should be approached with some caution, as often only the variation in expression of the tested genes with respect to the mass loading of total RNA was assessed. As rRNA molecules make up the bulk of total RNA, they should indeed correlate very well with the total RNA mass, but that does not necessarily make them good control genes. As outlined in the introduction, total RNA and rRNA levels are not proper references, because of the observed imbalance between rRNA and mRNA fractions. In addition to searching for a stable control gene, we aimed at determining the errors related to the common practice of single control normalization. In this study, we provide evidence that a conventional normalization strategy based on a single housekeeping gene leads to erroneous normalization up to 3.0- and 6.4-fold in 25% and 10% of the cases, respectively, with sporadic cases showing error values above 20. This analysis showed that a few control genes were unstable and significantly differentially expressed in some tissue panels, as evidenced by the decrease from 5.9 to 4.5 for the 90th-percentile single control normalization error value for neuroblastoma when the B2M gene is omitted (data not shown). This finding agrees with the reported differential expression of B2M in neuroblastoma, corresponding to the stage of differentiation of the tumor cells [18]. 
The error-distribution curves not only reflect the stability of expression of the applied controls, but also the sample heterogeneity within a tissue panel, as noted from the less steep curve for the heterogeneous set of normal pooled tissues compared to the other, relatively homogeneous, tissue panels. In this regard, the issue has been raised that finding proper control genes is even more important when working with tissues of different histological origin [9]. The single control normalization error values point to inherent noisy oscillations in expression levels of the control genes, a finding which has been corroborated in other large-scale studies where several thousand genes were measured in different cells or tissues by microarray analysis. No gene was found on an 8,000-feature array that did not vary by ratios of at least twofold across a panel of 60 cell lines [14], and a set of genes frequently used for normalization (including GAPD and ACTB) was found to vary in expression by 7- to 23-fold [9]. Taken together, our data and these studies clearly show that ideal and universal control genes do not exist. This warrants the search for stably expressed genes in each experimental system, and for the development of an accurate normalization strategy. To validate the expression stability of the tested control genes without any prior assumption of a metric for standardization, we had initially measured the correlation between the raw, non-normalized expression levels of any two control genes, which should be nearly perfect for proper control genes. We observed, however, that the data range between the minimum and maximum expression levels, or any outlying value, could have a profound influence on the slope of the regression line, and consequently on the value of the correlation coefficient. This made Pearson and Spearman correlation coefficients unsuitable for this kind of analysis. 
We have therefore developed a new stability measure, based on the principle that the expression ratio of two proper control genes should be identical in all samples, regardless of the experimental condition or cell type, with increasing ratio variation corresponding to decreasing expression stability of one (or both) of the tested genes. The proposed standard deviation of log-transformed control gene ratios is a robust measure for the variation between two control genes, as it does not impose any requirements for normality or homoscedasticity of the data points. Furthermore, this measure is independent of the abundance difference between the genes, and is equally affected by any outlying or extreme ratio (that is, outliers for a sample with low or high overall expression, or outliers caused by an upregulated or downregulated gene have an equivalent increase in pairwise variation V). Logarithmic transformation of the ratios is required for symmetrical distribution of the data around zero, resulting in equal absolute values (but opposite signs) for a given ratio and the inverse ratio. As a result, the standard deviation of log-transformed ratios is identical to the standard deviation of log-transformed inverse ratios, which makes this measure characteristic for every combination of two genes. Having established a robust measure to assess the variation in expression of two control genes, we subsequently defined a gene-stability measure M as the average pairwise variation between a particular gene and all other control genes. Using a VBA applet geNorm developed in-house, we ranked ten commonly used housekeeping genes belonging to different functional and abundance classes according to their expression stability in five tested tissue panels. 
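The ranking by stepwise exclusion can be sketched as below. This is an illustrative re-implementation under assumptions, not the geNorm VBA applet itself: the base-2 logarithm, the sample standard deviation, the function names and the toy values are all choices made for this example. Because the standard deviation of log ratios is the same for a ratio and its inverse, each gene pair is computed once and stored under an unordered key.

```python
import math
from itertools import combinations
from statistics import stdev


def stability_values(expression):
    """Return {gene: M} for the genes currently in `expression`."""
    genes = list(expression)
    pair_v = {}
    for j, k in combinations(genes, 2):
        log_ratios = [math.log2(a / b) for a, b in zip(expression[j], expression[k])]
        # Unordered key: V(j, k) equals V(k, j) for log-transformed ratios.
        pair_v[frozenset((j, k))] = stdev(log_ratios)
    return {
        j: sum(pair_v[frozenset((j, k))] for k in genes if k != j) / (len(genes) - 1)
        for j in genes
    }


def rank_by_stepwise_exclusion(expression):
    """Repeatedly drop the gene with the highest M and recompute M for the
    rest, until two genes remain. Returns (exclusion order, final pair)."""
    remaining = dict(expression)
    excluded = []
    while len(remaining) > 2:
        m_values = stability_values(remaining)
        worst = max(m_values, key=m_values.get)
        excluded.append((worst, m_values[worst]))
        del remaining[worst]
    return excluded, sorted(remaining)


# Hypothetical toy data: A and B keep a constant ratio, C deviates
# moderately and D deviates strongly in every second sample.
toy_expression = {
    "A": [100.0, 200.0, 400.0, 800.0],
    "B": [10.0, 20.0, 40.0, 80.0],
    "C": [100.0, 400.0, 400.0, 1600.0],
    "D": [100.0, 3200.0, 400.0, 12800.0],
}
excluded, final_pair = rank_by_stepwise_exclusion(toy_expression)
exclusion_order = [gene for gene, _ in excluded]
```

In this toy set the strongly deviating gene is excluded first and the moderately deviating one second, leaving the two genes with a constant ratio as the final pair.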
The clear decrease of M of the remaining control genes during stepwise exclusion of the worst-scoring gene points at differences in the stability of gene-specific expression and demonstrates that the remaining genes are more stably expressed than the excluded genes. Some tissue panels show a relatively steep initial decline, which reflects the exclusion of one or more aberrantly expressed control genes (for example, ACTB and HMBS for leukocytes), as also noticed from the single control normalization error analyses (see above). The average gene stability values of the remaining genes during stepwise elimination of the least stable control genes also indicates tissue-specific differences, with bone marrow and the pool of normal tissues having the lowest and highest overall expression variation, respectively. The latter is no surprise, given the larger tissue heterogeneity in this panel. The question of whether the observed high variation for neuroblastoma is a cancer-related phenomenon of deregulated expression is currently under further investigation. From these analyses, it is clear that there is no universal control gene suitable for all cell types. ACTB and B2M appear to be the worst-scoring genes, whereas UBC, GAPD and HPRT1 seem to be the best overall control genes, each belonging to the four most stable genes in four out of five tested tissues. However, these generalizations should be treated with caution. B2M appears to be one of the least stable control genes, but is nevertheless a good choice for normalization of leukocyte expression levels. This clearly demonstrates that a proper choice of housekeeping genes is highly dependent on the tissues or cells under investigation. This is even more important when considering the differences in transcript abundance of some control genes between different tissues. 
The large expression differences between the tissues tested for B2M and ACTB, for instance, would definitely result in large normalization errors if they were used for standardization. Interestingly, the observed tissue-specific expression of these control genes is in keeping with their known role or function: there is high B2M expression in leukocytes, where it is a major cell-surface marker, and relatively low non-muscle cytoskeletal ACTB expression in heart tissue, which is predominantly of muscular origin. In view of the inherent variation in expression of housekeeping genes, we recommend the use of at least three proper control genes for calculating a normalization factor, and present a procedure to determine whether or not more - and if so, how many - control genes were required for reliable normalization. This analysis clearly showed that three stable control genes sufficed for accurate normalization of samples with relatively low expression variation, whereas other tissue panels required a fourth, or even a fifth control gene to capture the observed variation. The purpose of normalization is to remove the sampling differences (such as RNA quantity and quality) in order to identify real gene-specific variation. For proper internal control genes, this variation should be minimal or none. To validate the gene-stability measure M and the geNorm algorithm to identify the most stable control genes in a set of samples, we have calculated the gene-specific variation for each gene as the coefficient of variation of normalized expression levels. To this end, the raw expression values were standardized to different normalization factors, calculated as the geomean of the most, intermediate, or least stable control genes (as determined by geNorm). The rationale of this analysis is that a normalization factor based on proper internal control genes should remove all nonspecific variation. 
In contrast, unstable control genes cannot completely remove the nonspecific variation, and even add more variation, resulting in larger so-called gene-specific variations for the tested control genes. This analysis clearly demonstrated that most nonspecific variation was removed when the most stable control genes (as determined by geNorm) were used for normalization, which proves that the novel stability measure and strategy presented here effectively allowed the stability of gene expression in the different tissue panels to be assessed. Further validation demonstrated that the geometric mean of carefully selected control genes is an accurate estimate of the mRNA transcript fraction, as determined by comparison with frequently applied microarray normalization factors. Although both RT-PCR normalization factors based on geometric averaging are relatively similar, the one based on at least seven control genes (that is, NF_(M<0.7)) is slightly more equivalent to the microarray-scaling factors. Two possible explanations can account for this observation. First, the five most stable control genes as determined by geNorm are based on only two RNA samples (that is, a Cy3-labeled reference pool, and a Cy5-labeled test sample), in contrast to the RT-PCR data, where 9 to 34 samples were used, resulting in more reliable estimation of the expression stability. Second, recent technical reports clearly state that array hybridization analyses experience considerable - often underestimated - variation and uncertainty at several levels. Accurate background fluorescence correction and spot quality assessment, among others, have been described as critical issues for reliable ratio estimation [19,20,21]. The higher variability associated with array hybridization results might thus explain the need for more control genes to normalize the data.
Nevertheless, this study clearly showed that normalization based on the geometric mean of carefully selected control genes results in equivalent ratio estimation compared to commonly applied array scale factors, which validates its use for RT-PCR normalization. In addition, the method presented could easily be applied to normalize gene-expression levels resulting from microarray hybridization experiments, where only a limited number of genes are spotted, including some housekeeping genes. In conclusion, we described and validated a procedure to identify the most stable control genes in a given set of tissue samples, and to determine the optimal number of genes required for reliable normalization of RT-PCR data. The strategy presented can be applied to any number or kind of genes or tissues, and should allow more accurate gene-expression profiling. This is of utmost importance for studying the biological significance of subtle expression differences, and for confirmatory and/or extended analyses of microarray results by means of RT-PCR. Materials and methods Sample preparation Thirty-four neuroblastoma cell lines were grown to subconfluency according to standard culture conditions. RNA was isolated using the RNeasy Midi Kit (Qiagen) according to the manufacturer's instructions. Nine RNA samples from pooled normal human tissues (heart, brain, fetal brain, lung, trachea, kidney, mammary gland, small intestine and uterus) were obtained from Clontech. Blood and fibroblast biopsies were obtained from different normal healthy individuals. Thirteen leukocyte samples were isolated from 5 ml fresh blood using Qiagen's erythrocyte lysis buffer. Fibroblast cells from 20 upper-arm skin biopsies were cultured for a short time (3-4 passages) and harvested at subconfluency as described [22]. Bone marrow samples were obtained from nine patients with no hematological malignancy. 
Total RNA of leukocyte, fibroblast and bone marrow samples was extracted using Trizol (Invitrogen), according to the manufacturer's instructions. Real-time RT-PCR DNase treatment, cDNA synthesis, primer design and SYBR Green I RT-PCR were carried out as described [23]. In brief, 2 μg of each total RNA sample was treated with the RQ1 RNase-free DNase according to the manufacturer's instructions (Promega). Treated RNA samples were desalted (to prevent carry over of magnesium) before cDNA synthesis using Microcon-100 spin columns (Millipore). First-strand cDNA was synthesized using random hexamers and SuperscriptII reverse transcriptase according to the manufacturer's instructions (Invitrogen), and subsequently diluted with nuclease-free water (Sigma) to 12.5 ng/μl cDNA. RT-PCR amplification mixtures (25 μl) contained 25 ng template cDNA, 2x SYBR Green I Master Mix buffer (12.5 μl) (Applied Biosystems) and 300 nM forward and reverse primer. Reactions were run on an ABI PRISM 5700 Sequence Detector (Applied Biosystems). The cycling conditions comprised 10 min polymerase activation at 95°C and 40 cycles at 95°C for 15 sec and 60°C for 60 sec. Each assay included (in duplicate): a standard curve of four serial dilution points of SK-N-SH or IMR-32 cDNA (ranging from 50 ng to 50 pg), a no-template control, and 25 ng of each test cDNA. All PCR efficiencies were above 95%. Sequence Detection Software (version 1.3) (Applied Biosystems) results were exported as tab-delimited text files and imported into Microsoft Excel for further analysis. The median coefficient of variation (based on calculated quantities) of duplicated samples was 6%. Single control normalization error E For any given m tissue samples, real-time RT-PCR gene-expression levels a ij of n internal control genes are measured. For every combination of two tissue samples p and q, and every combination of two internal control genes j and k, the single control normalization error E was calculated (Equation 1). 
This is the fold expression difference between samples p and q when normalized to housekeeping gene j or k ($\forall j,k \in [1,n]$, $p,q \in [1,m]$, $j \neq k$ and $p \neq q$).

Internal control gene-stability measure M

For every combination of two internal control genes j and k, an array $A_{jk}$ of m elements is calculated, which consists of the log2-transformed expression ratios $a_{ij}/a_{ik}$ (Equation 2). We define the pairwise variation $V_{jk}$ for the control genes j and k as the standard deviation of the $A_{jk}$ elements (Equation 3). The gene-stability measure $M_j$ for control gene j is the arithmetic mean of all pairwise variations $V_{jk}$ (Equation 4), with $\forall j,k \in [1,n]$ and $j \neq k$:

$$A_{jk} = \left\{ \log_2\left(\frac{a_{1j}}{a_{1k}}\right), \log_2\left(\frac{a_{2j}}{a_{2k}}\right), \ldots, \log_2\left(\frac{a_{mj}}{a_{mk}}\right) \right\} \quad (2)$$

$$V_{jk} = \mathrm{st.dev}(A_{jk}) \quad (3)$$

$$M_j = \frac{\sum_{k=1,\,k \neq j}^{n} V_{jk}}{n-1} \quad (4)$$

Normalization of array data

Publicly available raw microarray data [14] were downloaded as tab-delimited files. Eight hybridization data sets were randomly selected and imported into Microsoft Excel software for further manipulation (MCF7, DU-145, 786-0, BC2, K562, A549, U251, and SK-OV-3). For each hybridization array, all spots with Cy3 or Cy5 fluorescence intensities below the average overall background level plus one standard deviation were discarded. Subsequently, a local background correction for each spot was applied. Two scale factors were calculated for each slide on the basis of median ratio normalization (median ratio set to 1) and total intensity normalization (equalized sum of fluorescence intensities for both channels). Nine housekeeping genes were identified by BLAST similarity or keyword search against the database of cDNA clones present on the array (see IMAGE clones listed in Table 1).

Additional data files

The raw expression values are available as a tab-delimited file (Additional data file 1: raw expression values).
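The gene-stability measure M described in the Materials and methods above can be illustrated with a short Python sketch. This is not the geNorm implementation itself; the function name `gene_stability`, the input layout, and the use of the sample standard deviation for "st.dev" are assumptions made for illustration.

```python
import math
from statistics import stdev


def gene_stability(expression):
    """Return the stability measure M_j for every control gene.

    expression: dict mapping gene name -> list of expression levels a_ij,
    with all lists in the same sample order (hypothetical layout).
    """
    genes = list(expression)
    m_values = {}
    for j in genes:
        pairwise_variations = []
        for k in genes:
            if k == j:
                continue
            # A_jk: log2-transformed expression ratios a_ij / a_ik
            a_jk = [math.log2(x / y)
                    for x, y in zip(expression[j], expression[k])]
            # V_jk: standard deviation of the A_jk elements
            # (sample standard deviation assumed here)
            pairwise_variations.append(stdev(a_jk))
        # M_j: arithmetic mean of all pairwise variations V_jk
        m_values[j] = sum(pairwise_variations) / len(pairwise_variations)
    return m_values


example = {
    "A": [1.0, 2.0, 4.0],
    "B": [2.0, 4.0, 8.0],   # perfectly co-regulated with A
    "C": [1.0, 1.0, 1.0],   # constant, so its ratio to A and B varies
}
stability = gene_stability(example)
```

In this toy input, genes A and B keep a constant ratio across samples and therefore receive the lowest M values, while C receives the highest.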

qBase relative quantification framework and software for management and automated analysis of real-time quantitative PCR data

Jan Hellemans, Geert Mortier, Anne Paepe … (2007)

Background Since its introduction more than 10 years ago [1], quantitative PCR (qPCR) has become the standard method for quantification of nucleic acid sequences. The ease of use and high sensitivity, specificity and accuracy has resulted in a rapidly expanding number of applications with increasing throughput of samples to be analyzed. The software programs provided along with the various qPCR instruments allow for straightforward extraction of quantification cycle values from the recorded fluorescence measurements, and at best, interpolation of unknown quantities using a standard curve of serially diluted known quantities. However, these programs usually do not provide an adequate solution for the processing of these raw data (coming from one or multiple runs) into meaningful results, such as normalized and calibrated relative quantities. Furthermore, the currently available tools all have one or more of the following intrinsic limitations: dedicated for one instrument, cumbersome data import, a limited number of samples and genes can be processed, forced number of replicates, normalization using only one reference gene, lack of data quality controls (for example, replicate variability, negative controls, reference gene expression stability), inability to calibrate multiple runs, limited result visualization options, lack of experimental archive, and closed software architecture. To address the shortcomings of the available software tools and quantification strategies, we modified the classic delta-delta-Ct method to take multiple reference genes and gene specific amplification efficiencies into account, as well as the errors on all measured parameters along the entire calculation track. On top of that, we developed an inter-run calibration algorithm to correct for (often underestimated) run-to-run differences. Our advanced models and algorithms are implemented in qBase, a flexible and open source program for qPCR data management and analysis. 
Four basic principles were followed during development of the program: the use of correct models and formulas for quantification and error propagation, inclusion of data quality control where required, automation of the workflow as much as possible while retaining flexibility, and user friendliness of operation. Our quantification framework and software fit exactly in current thinking that places emphasis on getting every step of a real-time PCR assay right (such as RNA quality assessment, appropriate reverse transcription, selection of a proper normalization strategy, and so on [2]), especially if small differences between samples need to be reliably demonstrated. In this entire workflow, data analysis is an important last step. Results and discussion Determination of the error on estimated amplification efficiencies qBase employs a proven, advanced and universally applicable relative quantification model. An important underlying assumption is that PCR efficiency is assay dependent and sample independent. While this may not be true in every experimental situation, there is currently no consensus on how sample specific PCR efficiencies should be calculated and used for robust quantification. Most evaluation studies attribute a lack of precision to these sample specific efficiency estimation methods. Hence, the gold standard is still the use of a PCR efficiency estimated by a serial dilution series (preferably of pooled cDNA samples, to mimic as much as possible the actual samples to be measured), at least if one aims at accurate and precise quantification. Sample specific PCR efficiency estimation has its usefulness, but currently only for outlier detection [3-5]. Calculation of relative quantities from quantification cycle values requires knowledge of the amplification efficiency of the PCR. 
As stated above, amplicon specific amplification efficiencies are preferably determined using linear regression (formulas 1 and 5 in Materials and methods) of a serial dilution series with known quantities (either relative or absolute). However, the error on the estimated amplification efficiency is almost never determined, nor taken into account. This error can be calculated using linear regression as well (formulas 2 to 4 and 6), and should subsequently be propagated during conversion of the quantification cycle values to the relative quantities. The formula for the error on the slope provides the mathematical basis to learn how more accurate amplification efficiency estimates can be achieved, that is, by expanding the range of the dilution and including more measurement points. Calculation of normalized relative quantities and error minimization Methods for the conversion of quantification cycle values (Cq; see Materials and methods for terminology) into normalized relative quantities (NRQs) were first reported in 2001. 
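The regression-based efficiency estimate and its error, as discussed above, can be sketched in Python as follows. The function name `efficiency_from_dilution` is hypothetical, and the conversion from slope to efficiency ($E = 10^{-1/\mathrm{slope}}$) with first-order error propagation is a standard approach assumed here; the article's own formulas 1 to 6 are not reproduced in this text, so this should not be read as the exact qBase calculation.

```python
import math


def efficiency_from_dilution(quantities, cq_values):
    """Estimate the amplification efficiency (base E) and its standard error.

    quantities: known relative or absolute input quantities of the dilution
    series; cq_values: measured quantification cycle values (same order).
    """
    x = [math.log10(q) for q in quantities]
    y = list(cq_values)
    n = len(x)
    if n < 3:
        raise ValueError("at least three dilution points are needed")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2
                      for xi, yi in zip(x, y))
    # Standard error of the slope: shrinks as sxx grows, i.e. with a wider
    # dilution range and more measurement points.
    se_slope = math.sqrt(residual_ss / (n - 2) / sxx)
    efficiency = 10 ** (-1.0 / slope)
    # First-order propagation: dE/dslope = E * ln(10) / slope^2
    se_efficiency = efficiency * math.log(10) * se_slope / slope ** 2
    return efficiency, se_efficiency


# Ideal series: Cq drops by one cycle for every doubling of input.
dilution = [1000.0, 100.0, 10.0, 1.0]
cqs = [30.0 - math.log2(q) for q in dilution]
eff, se_eff = efficiency_from_dilution(dilution, cqs)
```

Because `sxx` appears in the denominator of the slope error, the sketch shows directly why a wider dilution range and more points give a more precise efficiency estimate.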
The simplest model described by Livak and Schmittgen [6] assumes 100% PCR efficiency (reflected by a value of 2 for the base E of the exponential function) and uses a single reference gene for normalization:

$$NRQ = 2^{\Delta\Delta Ct}$$

Pfaffl [7] modified the above model by adjusting for differences in PCR efficiency between the gene of interest (goi) and a reference gene (ref):

$$NRQ = \frac{E_{goi}^{\Delta Ct,goi}}{E_{ref}^{\Delta Ct,ref}}$$

This model constituted an improvement over the classic delta-delta-Ct method, but cannot deal with multiple (f) reference genes, which is required for reliable measurements of subtle expression differences [8]. Therefore, we further extended this model to take into account multiple stably expressed reference genes for improved normalization. Although not yet published, this advanced and generalized model of relative quantification has been applied previously in our nucleic acid quantification studies [8-12].
$$NRQ = \frac{E_{goi}^{\Delta Ct,goi}}{\sqrt[f]{\prod_{o}^{f} E_{ref_o}^{\Delta Ct,ref_o}}}$$

The calculation of relative quantities, normalization and corresponding error propagation is detailed in formulas 7-16. The basic principle of the delta-Cq quantification model is that a difference (delta) in quantification cycle value between two samples (often a true unknown and calibrator or reference sample) is transformed into relative quantities using the exponential function with the efficiency of the PCR reaction as its base. In principle, any sample can be selected as calibrator, either a real untreated control, or the sample with the highest or lowest expression. In addition, any arbitrary cycle value can be chosen as the calibrator quantification cycle value. The choice of calibrator sample or cycle value does not influence the relative quantification result; while numbers may be different, the actual fold differences between the samples remain identical, so results are fully equivalent and thus only rescaled. However, the choice of calibrator quantification cycle value does have a profound influence on the final error on the relative quantities if the error on the estimated amplification efficiency (see above) is taken into account in the error propagation procedure.
To address this issue, we developed an error minimization approach that uses the arithmetic mean quantification cycle value across all samples for a gene within a single run as the calibrator quantification cycle value. As the increase in error is proportional to the difference in quantification cycle between the sample of interest and the calibrator (formula 12), the overall final error is minimized if the mean quantification cycle is used as the calibrator quantification cycle value (Figure 1). Evaluation of normalization The normalization of relative quantities with reference genes relies on the assumption that the reference genes are stably expressed across all tested samples. When using only one reference gene, its stability can not be evaluated. The use of multiple reference genes does not only produce more reliable data, but permits an evaluation of the stability of these genes as well. Previously, we developed a method for the identification of the most stably expressed reference genes in a set of samples [8,13]. The same stability parameter (formulas 21-25) can also be used to evaluate the measured reference genes in an actual quantification experiment. In addition, we calculate here another powerful indicator for expression stability in the actual experiment (formulas 17-20): the coefficient of variation of normalized reference gene relative quantities. Ideally, a reference gene should display the same expression level across all samples after normalization. Consequently, the coefficient of variation indicates how stably the gene is expressed. To provide reference values for acceptable gene stability values (M) and coefficients of variation (CV), we calculated these normalization quality parameters for our previously established reference gene expression data matrix obtained for 85 samples belonging to 5 different human tissue groups [8]. 
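The coefficient of variation of normalized reference gene relative quantities, introduced above as an indicator of expression stability, can be sketched as follows. The function name `reference_gene_cv` and the input layout are hypothetical, and the CV is assumed here to be the sample standard deviation divided by the arithmetic mean; formulas 17-20 of the article are not reproduced in this text, so the exact definition used by qBase may differ.

```python
import math
from statistics import mean, stdev


def reference_gene_cv(relative_quantities):
    """Return the CV of normalized relative quantities per reference gene.

    relative_quantities: dict mapping reference gene -> list of relative
    quantities, one per sample, same sample order (hypothetical layout).
    """
    genes = list(relative_quantities)
    n_samples = len(relative_quantities[genes[0]])
    # Sample-specific normalization factor: geometric mean over the genes.
    factors = [
        math.prod(relative_quantities[g][i] for g in genes) ** (1.0 / len(genes))
        for i in range(n_samples)
    ]
    cv = {}
    for g in genes:
        normalized = [q / f for q, f in zip(relative_quantities[g], factors)]
        # Assumed definition: sample standard deviation over arithmetic mean.
        cv[g] = stdev(normalized) / mean(normalized)
    return cv


cv_stable = reference_gene_cv({"A": [1.0, 2.0, 4.0], "B": [2.0, 4.0, 8.0]})
cv_unstable = reference_gene_cv({"A": [1.0, 1.0], "B": [1.0, 4.0]})
```

Two reference genes that rise and fall together across samples give a CV near zero after normalization, whereas a pair that diverges gives a clearly non-zero CV for both genes.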
Table 1 shows that mean CV and M values lower than 25% and 0.5, respectively, are typically observed for stably expressed reference genes in relatively homogeneous sample panels. For more heterogeneous panels, the mean CV and M values can increase to 50% and 1, respectively. While the use of multiple stably expressed reference genes is currently considered to be the gold standard for normalization of mRNA expression, other strategies might be more appropriate for specific applications, such as: counting cell numbers and expressing mRNA expression levels as copy numbers per cell; using a biologically relevant, specific internal reference (sometimes referred to as in situ calibration); or normalizing against DNA (for overview of alternative strategies, see [14]). Clearly, no single strategy is applicable to every experimental situation and it remains up to individual researchers to identify and validate the method most appropriate for their experimental conditions. Important to note is that the presented qBase framework and software is compatible with most of the above mentioned normalization strategies. Inter-run calibration Two different experimental set-ups can be followed in a qPCR relative quantification experiment. According to the preferred sample maximization method, as many samples as possible are analyzed in the same run. This means that different genes (assays) should be analyzed in different runs if not enough free wells are available to analyze the different genes in the same run. In contrast, the gene maximization set-up analyzes multiple genes in the same run, and spreads samples across runs if required (Figure 2). The latter approach is often used in commercial kits or in prospective studies. It is important to realize that in a relative quantification study, the experimenter is usually interested in comparing the expression level of a particular gene between different samples. 
Therefore, the sample maximization method is highly recommended because it does not suffer from (often underestimated) technical (run-to-run) variation between the samples. Whatever set-up is used, inter-run calibration is required to correct for possible run-to-run variation whenever all samples are not analyzed in the same run. For this purpose, the experimenter needs to analyze so-called inter-run calibrators (IRCs); these are identical samples that are tested in both runs. By measuring the difference in quantification cycle or NRQ between the IRCs in both runs, it is possible to calculate a correction or calibration factor to remove the run-to-run difference, and proceed as if all samples were analyzed in the same run. Inter-run calibration is required because the relationship between quantification cycle value and relative quantity is run dependent due to instrument related variation (PCR block, lamp, filters, detectors, and so on), data analysis settings (baseline correction and threshold), reagents (polymerase, fluorophores, and so on) and optical properties of plastics. Important to note is that inter-run calibration should be performed on a gene per gene basis. It is not sufficient to determine the quantification cycle or relative quantity relation for one primer pair; the experimenter should do this for all assays. To provide experimental proof of the advantage of sample maximization over gene maximization with respect to reduction in variation, we designed and performed an experiment consisting of five different runs (Figure 2). The results for one of the genes are shown in Figure 3. With gene maximization, 11 samples are spread over runs 1 and 2. Samples 1 to 3 occur in both runs and can thus be used as IRCs. Run 5 contains all 11 samples in a sample maximization set-up. When comparing the Cq values for the IRCs between runs 1 and 2, it is apparent that those in run 2 are systematically higher (0.77 cycles). 
After conversion of Cq values into NRQs (and thus taking into account the Cq run-to-run differences for 3 reference genes as well), the NRQ values for samples 1 to 3 differ, on average, by 72% (Additional data file 1). It is important to realize that these values are merely examples. Although the differences can be minimized in a well designed and controlled experiment, they can be much bigger and are generally unpredictable. Anyway, by performing proper inter-run calibration, these run-dependent differences can be corrected and the resulting expression pattern (obtained by calibrating the gene maximization set-up) becomes highly similar to that from the sample maximization method (where there is no run-to-run variation). To our knowledge, there is only one instrument software that can perform such a correction, but the algorithm is based on the Cq values of a single IRC. Although it can be valid to calibrate data based on Cq values, this method has the drawback that the same template dilution needs to be used in all the runs to be calibrated (for example, nucleic acids from a new cDNA synthesis or a new dilution cannot be reliably used). It is often much more straightforward and easier to calibrate the runs based on the NRQs of the IRCs (formulas 13-16). The quantity (and to some extent also the quality) of the calibrating input material is adjusted after normalization. This has the important advantage that independently prepared cDNA of the same RNA source can be used as a calibrator in the different runs (which allows addition of extra runs, even when the cDNA of the calibrator is run out). To some extent, even a biological replicate (for example, regrown cells) can be used for inter-run calibration when doing the calibration on the NRQs, provided that the experimenter realizes this introduces some level of biological replicate variation (but still adequately removes inter-run variation). 
The validity of using independently prepared cDNA as calibrator is demonstrated by the experiment described in Figure 2. Inter-run calibration between runs 1 and 3 based on IRCs from different cDNA preparations results in the same expression pattern as that obtained with sample maximization or inter-run calibration with the same cDNA (Figure 3). This is also clearly demonstrated by calculating the ratio of the calibrated NRQs (CNRQs) in runs 2 and 3 (mean ratio: 0.985, 95% CI: [0.945, 1.026]) (Additional data file 2). It is also advisable to use multiple IRCs. A failed calibrator does not ruin an experiment if two or more are available. In addition, calibration with multiple IRCs gives more precise results with a smaller error. Based on our real calibration experiment, inter-run calibration using a single IRC inherently increases the uncertainty on the relative quantity by about 70% whereas a set of 3 IRCs increases it by only 40% (Table 2). Although it is still advisable to choose the sample maximization setup, inter-run calibration based on the NRQs of multiple IRCs provides reliable results and flexibility in the source of the IRCs. It is important to note that formulas 13'-16' can only be used for inter-run calibration if the same set of IRCs is used in all runs to be calibrated. For more complex experimental set-ups (whereby different combinations of IRCs are used in the various runs), advanced inter-run calibration algorithms are currently being developed in our laboratory (whereby the challenge is the proper propagation of the errors). The process of inter-run calibration is very analogous to normalization. Normalization removes the sample specific non-biological variation, while inter-run calibration removes the technical run-to-run variation between samples analyzed in different runs. 
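A minimal sketch of inter-run calibration on NRQs, assuming the calibration factor for a run is the geometric mean of the NRQs of its inter-run calibrators and that the same set of IRCs is present in every run. The function name `calibrate_run` and the sample names are hypothetical, error propagation is omitted, and the calculation would need to be repeated for each gene separately.

```python
import math


def calibrate_run(nrq_by_sample, irc_names):
    """Return calibrated NRQs (CNRQs) for one gene in one run.

    nrq_by_sample: dict mapping sample name -> NRQ measured in this run.
    irc_names: names of the inter-run calibrators present in this run.
    """
    # Calibration factor: geometric mean of the IRC NRQs in this run.
    log_sum = sum(math.log(nrq_by_sample[name]) for name in irc_names)
    calibration_factor = math.exp(log_sum / len(irc_names))
    return {sample: value / calibration_factor
            for sample, value in nrq_by_sample.items()}


ircs = ["irc1", "irc2"]
run_1 = calibrate_run({"irc1": 1.0, "irc2": 4.0, "s1": 2.0}, ircs)
# Run 2 reads twofold higher for every sample because of run-to-run variation.
run_2 = calibrate_run({"irc1": 2.0, "irc2": 8.0, "s2": 4.0}, ircs)
```

After calibration, the shared calibrators take the same values in both runs, so `s1` and `s2` can be compared as if they had been measured together.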
As such, the same formulas can be used to calculate the inter-run calibration factor (the geometric mean of the different IRCs' NRQs; formulas 13'-16'), and the same quality parameters can be applied to monitor the inter-run calibration process (provided multiple IRCs are used; formulas 21'-25'). Calculation of the IRC stability measure allows the evaluation of the quality of the calibration, which depends on the results of the IRCs. Our experiment shows that, with low M values (Additional data file 2: M ≅ 0.1), virtually identical results are obtained for the different selections of IRCs (Table 2). If inconsistent or erroneous data were obtained for one of the IRCs, higher IRC-M values would be obtained and dissimilar results would be calculated for different sets of IRCs. Therefore, the IRC stability measure M is of great value to determine the quality of the IRCs (provided more than one IRC is used), and to verify whether the calibration procedure is trustworthy. qBase Calculation of NRQs for large data sets, followed by inter-run calibration, is a difficult, error prone and time consuming process when performed in a spreadsheet, especially if errors have to be propagated throughout all calculations. To automate these calculations, and to provide data quality control and result visualization, we developed the software program qBase (Figure 4a). This program is composed of two modules: the 'qBase Browser' for managing and archiving data and the 'qBase Analyzer' for processing raw data into biologically meaningful results. qBase Browser The Browser allows users to import and to organize hierarchically runs from most currently available qPCR instruments. 
In qBase, data are structured into three layers: raw data from the individual runs (plates) are stored in the run layer; the experiment layer groups data from different runs that need to be processed and visualized together; and the project layer combines a number of related experiments (for example, biological replicates of the same experiment). This hierarchical structure provides a clear framework to manage qPCR data in a straightforward and simple manner. The qBase Browser window is split into two parts: the bottom of the screen provides an explorer-like window to browse through the data; and the top of the screen contains a separate window displaying the annotation of the selected run, experiment or project. The qBase Browser allows the deletion and addition of projects, experiments and runs. The facility for exporting and importing projects and experiments is a convenient way to exchange data between different qBase users. Data import Each qPCR instrument has its own method of data collection and storage, accompanied by a large heterogeneity in export files with respect to file format, table layout and used terminology. During import into qBase, the different instrument export files are translated into a common internal format. This format contains information on the well name, sample type, sample and gene name, quantification cycle value, starting quantity values (for standards), and the exclusion status. The last field indicates whether the measurement should be excluded from further calculations without actually discarding the measurement. Data can be imported from a number of data formats. Two standards (qBase internal format and RDML (Real-time PCR Data Markup Language)) and a number of instrument specific formats are supported. The qBase standard consists of a Microsoft Excel table in which the columns correspond to the information that is used internally by qBase. 
RDML is a universal format under development for the exchange of qPCR data under the form of XML files [15]. The import wizard guides users through the process of data import (Figure 4b). To address the limitation that some instrument software packages provide only a single identifier field for a well (while there are numerous variables, such as sample and gene name, sample type, and so on), qBase offers the possibility to extract multiple types of information from a single identifier. As such, the identifier 'UNKN|JohnSmith|Gremlin' could, for instance, be extracted to sample type 'UNKN' (unknown), sample name 'JohnSmith' and gene name 'Gremlin'. qBase analyzer The Analyzer is the data processing module for experiments. It performs relative quantification with proper error propagation along all quantifications, provides a number of quality controls and visualizes NRQs. This process involves several consecutive steps, some of them to be interactively performed by the user, others automatically executed by the program. Users are guided through the analysis by means of a simple workflow scheme in the main screen of the qBase Experiment Analyzer (Figure 4d). Step 1: Initialization The first step in the workflow is the (automatic) initialization of an experiment, during which raw data from all individual run files from the same experiment are combined into a single data table. The initialization procedure also generates a non-redundant list of all the samples and genes within the experiment. There are no limits on the number of replicates, genes or samples contained within an experiment, except for those imposed by Excel (no more than 65,535 wells can be stored into a single experiment). The absence of such limitations is a major improvement compared to the existing PCR data analysis tools, which are usually limited to processing data from a single plate or run with a fixed number of sample replicates. 
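The extraction of several annotation fields from a single well identifier, and the non-redundant sample and gene lists built during initialization, can be sketched as below. The function names, the field order and the `|` separator follow the 'UNKN|JohnSmith|Gremlin' example from the text, but they are otherwise assumptions and do not describe how qBase is implemented internally.

```python
def parse_identifier(identifier,
                     fields=("sample_type", "sample_name", "gene_name"),
                     separator="|"):
    """Split a single well identifier into named annotation fields."""
    parts = [part.strip() for part in identifier.split(separator)]
    if len(parts) != len(fields):
        raise ValueError(
            f"expected {len(fields)} fields in {identifier!r}, got {len(parts)}")
    return dict(zip(fields, parts))


def initialize_experiment(identifiers):
    """Parse all wells and build non-redundant, sorted sample and gene lists."""
    wells = [parse_identifier(identifier) for identifier in identifiers]
    samples = sorted({well["sample_name"] for well in wells})
    genes = sorted({well["gene_name"] for well in wells})
    return wells, samples, genes


wells, samples, genes = initialize_experiment([
    "UNKN|JohnSmith|Gremlin",
    "UNKN|JohnSmith|Gremlin",   # technical replicate of the same well content
    "UNKN|JaneDoe|Gremlin",     # hypothetical second sample
])
```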
In qBase, data points with identical sample and gene names are automatically identified as technical replicates, except when the wells are located in different runs. In the latter case, they are interpreted as IRCs and renamed as such, that is, an appendix is added to indicate the run in which they are analyzed. Within the sample and gene lists on the main screen, a color code is used to label the reference genes and special sample types (standards, no template controls, no amplification controls, and IRCs; Figure 4d). Step 2: Review sample and gene annotation Sample and gene names can be easily modified in all runs belonging to the same experiment. This is very useful for achieving consistent naming of samples and genes across runs. To change names in only a selection of wells in a particular run, a run editor is available in qBase. This editor visualizes the plate (or rotor) layout with well annotation. It allows the modification of gene and sample names, as well as sample types and quantities in individually selected cells or in a range of neighboring cells. Together these tools allow users to review and correct the input annotation. Step 3: Reference gene selection Accurate relative quantification requires appropriate normalization to correct for non-specific experimental variation, such as differences in starting quantity and quality between the samples. The current consensus is that multiple stably expressed reference genes are required for accurate and robust normalization, especially for measuring subtle expression differences. While different tools are available to determine which candidate reference genes are stably expressed (for example, geNorm [8,13], BestKeeper [16], Normfinder [17]), almost no software is available to perform straightforward normalization using more than one reference gene (with the exception of the commercial Bio-Rad iQ5 and the REST 2005 software). 
qBase allows gene expression levels to be normalized using up to five reference genes that can easily be selected from the gene list. Step 4: Raw data quality control Several problems and mistakes can occur when preparing and performing qPCR reactions. The erroneous data produced by these problems need to be detected and excluded from further data analysis to prevent obscuring valuable information or generating false positive results. qBase provides several important quality control checks to evaluate whether: a no template control (NTC) is present for all genes (primer pairs); the quantification cycle values of NTCs are larger than a user defined threshold; the difference in quantification cycle value between samples of interest and NTCs is larger than a user defined threshold; the difference in quantification cycle value between replicated reactions is less than a user defined threshold; and genes are spread over multiple runs (meaning that not all samples tested for a particular gene are analyzed in the same run). After data quality control, a message box reports all quality issue alerts and the involved data points are color-coded in the data list. This allows users to easily evaluate their data and to select data points for exclusion from analysis without actually removing the data themselves. Step 5: Sample order and selection During initialization, samples are ordered alphanumerically, but the order of the samples can be adjusted in a user defined way. Samples can be re-ordered in the list by using the up and down keyboard arrows or the sample context menu. Samples that do not need to show up in the results can be excluded by using the delete button on the keyboard or the sample context menu. Apart from changing the default sample order and display selection in the Analyzer main screen, this can also be modified in a temporary gene specific manner when reviewing the results (see below). 
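Some of the raw data quality control checks listed under Step 4 can be sketched in Python as follows. The well layout, the function name `quality_alerts` and the default thresholds are hypothetical; in qBase the thresholds are user defined, and only three of the listed checks are shown here.

```python
from collections import defaultdict


def quality_alerts(wells, ntc_min_cq=35.0, max_replicate_range=0.5):
    """Return a list of quality alerts for a set of wells.

    wells: list of dicts with keys 'gene', 'sample', 'sample_type' and 'cq'
    (cq is None when no amplification was detected).
    """
    ntc_cq = defaultdict(list)
    replicate_cq = defaultdict(list)
    genes = set()
    for well in wells:
        genes.add(well["gene"])
        if well["sample_type"] == "NTC":
            ntc_cq[well["gene"]].append(well["cq"])
        elif well["cq"] is not None:
            replicate_cq[(well["gene"], well["sample"])].append(well["cq"])

    alerts = []
    for gene in sorted(genes):
        # Check 1: a no template control is present for every gene.
        if gene not in ntc_cq:
            alerts.append(f"{gene}: no NTC present")
            continue
        # Check 2: NTC Cq values are larger than the threshold.
        detected = [cq for cq in ntc_cq[gene] if cq is not None]
        if detected and min(detected) <= ntc_min_cq:
            alerts.append(
                f"{gene}: NTC Cq {min(detected):.1f} not above {ntc_min_cq}")
    # Check 3: replicated reactions differ by less than the threshold.
    for (gene, sample), values in sorted(replicate_cq.items()):
        spread = max(values) - min(values)
        if spread >= max_replicate_range:
            alerts.append(f"{gene}/{sample}: replicate Cq range {spread:.2f}")
    return alerts


alerts = quality_alerts([
    {"gene": "Gremlin", "sample": "s1", "sample_type": "UNKN", "cq": 24.0},
    {"gene": "Gremlin", "sample": "s1", "sample_type": "UNKN", "cq": 24.9},
    {"gene": "Gremlin", "sample": "ntc", "sample_type": "NTC", "cq": 33.0},
    {"gene": "ACTB", "sample": "s1", "sample_type": "UNKN", "cq": 18.0},
    {"gene": "ACTB", "sample": "s1", "sample_type": "UNKN", "cq": 18.1},
])
```

The function only reports alerts and leaves the data untouched, in line with the idea that flagged points are excluded from analysis without being removed.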
Step 6: Amplification efficiencies

All quantification models transform (logarithmic) quantification cycle values into quantities using an exponential function with the efficiency of the PCR reaction as its base. Although these models and derivative formulas have been used for years, no model or software has taken into account the error (uncertainty) on the calculated efficiency. qBase is the first tool that takes the error on the amplification efficiency into account by means of proper error propagation. Within qBase, gene specific amplification efficiencies can be specified in three ways. A default amplification efficiency (and error) can be set for all genes, or it can be provided for each gene individually. In the latter case, the efficiencies and corresponding errors can simply be entered (for example, when calculated in an independent experiment), or calculated from a standard dilution series. qBase provides an interface for the evaluation of standard curves whereby outlier reactions can be removed. Amplification efficiencies are calculated by means of linear regression and can be saved to the gene list, in order to be taken into account during further calculation steps (Figure 4c).

Step 7: Calculation of relative quantities

After raw qPCR data (quantification cycle values) quality control, reference gene(s) selection and amplification efficiency estimation, qBase can calculate the normalized and rescaled quantities.
This process is fully automated and involves the following steps: calculation of the average and the standard deviation of the quantification cycle values for all technical replicates (data points with identical gene and sample names), whereby the program automatically detects the number of replicates for each sample-gene combination and can deal with a variable number of replicates (formulas 7-8); conversion of quantification cycle values into relative quantities based on the gene specific amplification efficiency (formulas 9-12); calculation of a sample specific normalization factor by taking the geometric mean of the relative quantities of the reference genes (formulas 13-14); normalization of quantities by division by the normalization factor (formulas 15-16); and rescaling of the normalized quantities as requested by the user (either relative to the sample with the highest or lowest relative quantity, or relative to a user defined calibrator) (Figure 5). For each step in the calculation of normalized and rescaled relative quantities, qBase propagates the error. Depending on the settings, qBase will use the classic delta-delta-Ct method (100% PCR efficiency and one reference gene) [6], the Pfaffl modification of delta-delta-Ct (gene specific PCR efficiency correction and one reference gene) [7] or our generalized qBase model (gene specific PCR efficiency correction and multiple reference gene normalization).

Evaluation of normalization

Normalization can be monitored by inspecting the normalization factors for all samples, or by calculating reference gene stability parameters. In an experiment with perfect reference genes and identical sample input amounts of equal quality, the normalization factor should be similar for all samples. Variations indicate unequal starting amounts, PCR problems or unstable reference genes. The qBase normalization factor histogram allows easy identification of these potential problems.
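As an illustration of the calculation steps described above (conversion of mean Cq values into relative quantities, geometric averaging of the reference genes into a normalization factor, and normalization), the following Python sketch may help. It is a simplified sketch rather than qBase's own code: the function names and the example Cq values are hypothetical, error propagation and rescaling are omitted, and the efficiency is passed in as the amplification base E (2.0 corresponds to 100% efficiency).

```python
import math


def relative_quantities(mean_cq, efficiency):
    """Convert mean Cq values of one gene into relative quantities (RQ).

    mean_cq maps sample name -> mean Cq of the technical replicates.
    The reference Cq is the mean over all samples, and RQ = E ** delta_Cq.
    """
    reference_cq = sum(mean_cq.values()) / len(mean_cq)
    return {s: efficiency ** (reference_cq - cq) for s, cq in mean_cq.items()}


def normalization_factors(rq_by_reference_gene):
    """Geometric mean of the reference gene RQs for every sample."""
    genes = list(rq_by_reference_gene)
    samples = rq_by_reference_gene[genes[0]]
    return {
        s: math.prod(rq_by_reference_gene[g][s] for g in genes) ** (1 / len(genes))
        for s in samples
    }


def normalize(rq, nf):
    """Normalized relative quantity: RQ divided by the normalization factor."""
    return {s: rq[s] / nf[s] for s in rq}


# Made-up Cq values for one gene of interest and two reference genes.
target = relative_quantities({"A": 24.0, "B": 22.0}, efficiency=2.0)
refs = {
    "REF1": relative_quantities({"A": 21.0, "B": 20.0}, efficiency=2.0),
    "REF2": relative_quantities({"A": 19.0, "B": 18.0}, efficiency=2.0),
}
nf = normalization_factors(refs)
nrq = normalize(target, nf)
```

In this example the target gene differs by two cycles between samples A and B while both reference genes differ by one cycle, so sample B ends up with twice the normalized quantity of sample A. With an efficiency of 2.0 and a single reference gene, the same functions reduce to the classic delta-delta-Ct calculation.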
One of the unique features of qBase is the option to normalize the relative quantities with multiple reference genes, resulting in more accurate and reliable results. In addition, qBase evaluates the stability of the applied reference genes (and hence the reliability of the normalization) by calculating two quality measures: the coefficient of variation of the normalized reference gene expression levels; and the geNorm stability M-value. Both values are meaningful, and can be calculated, only if multiple reference genes are quantified. The lower these quality values, the more stably the reference genes are expressed in the tested samples. Based on our reported data on the expression of 10 candidate reference genes in 85 samples from 13 different human tissues [8], we have calculated the above mentioned quality parameters and propose acceptable values for M and CV in Table 1. Note that the limits of acceptance largely depend on the required accuracy and resolution of the relative quantification study.

Step 8: Inter-run calibration

qBase is especially useful and unique for analysis of experiments containing multiple runs. As users are usually interested in comparing the expression for a given gene between different samples, the sample maximization experimental set-up is the preferred set-up because it minimizes technical (run-to-run) variation between the samples. Nevertheless, the gene maximization set-up is also frequently used. To correct the inter-run variation introduced by this set-up as much as possible, qBase allows runs to be calibrated (on a gene specific basis) using one or multiple IRCs (Figure 5). If no sample is measured for the same gene in the different runs, qBase cannot perform calibration and inter-run differences are assumed to be nil.
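The inter-run calibration described in this step can be sketched in Python as follows. The sketch assumes that normalized relative quantities (NRQs) are already available per run for a single gene; the function name `calibrate_runs`, the data layout and the example values are hypothetical and do not reflect how qBase stores its data.

```python
import math


def calibrate_runs(nrq_by_run, irc_names):
    """Calibrate the NRQs of one gene across runs using shared IRCs.

    nrq_by_run maps run -> {sample name: NRQ}. The calibration factor of a
    run is the geometric mean of the NRQs of its IRCs; every NRQ in that
    run is divided by it. Runs without any IRC are left unchanged, that
    is, inter-run differences are assumed to be nil.
    """
    calibrated = {}
    for run, values in nrq_by_run.items():
        ircs = [values[name] for name in irc_names if name in values]
        cf = math.prod(ircs) ** (1 / len(ircs)) if ircs else 1.0
        calibrated[run] = {sample: v / cf for sample, v in values.items()}
    return calibrated


# Made-up NRQs: run 2 reads twice as high as run 1 for both IRCs.
nrq_by_run = {
    "run1": {"IRC1": 1.0, "IRC2": 4.0, "S1": 3.0},
    "run2": {"IRC1": 2.0, "IRC2": 8.0, "S2": 6.0},
}
cnrq = calibrate_runs(nrq_by_run, irc_names=["IRC1", "IRC2"])
```

After calibration, each IRC has the same value in both runs, and samples S1 and S2 become directly comparable even though they were measured in different runs.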
Another unique and important aspect is that inter-run calibration is performed after normalization, which greatly enhances the flexibility in experimental design, as it is no longer obligatory that the same IRC template is used throughout all runs (as such, a new batch of cDNA can be synthesized, and variations will be canceled out during normalization).

Step 9: Evaluation of results

Normalized and rescaled relative quantities can be presented in three ways: a single-gene histogram, a multi-gene histogram, or a table. The default sample order and sample selection are defined in the main qBase window by editing the sample list. For the single-gene histogram (Figure 4e) the default order and selection can be changed to an alphanumerical, a user defined or a quantity based (that is, decreasing quantities) order. The option menu allows users to define the size of the error to be displayed (one or more standard errors of the mean). For both histogram views, the scale of the Y-axis can be switched from linear to logarithmic mode and vice versa. The multi-gene histogram (Figure 4f) is instrumental for comparing expression patterns (but not the actual expression levels) between different genes (because each gene is rescaled independently). The genes to be shown in the histogram can be selected from a gene list. Data from the table view (with or without error values) can be easily exported for further processing in other dedicated programs.

Distribution

qBase is freely available for non-commercial research and can be downloaded from the qBase website [18].

Manual and tutorial

For the training of new qBase users we have designed a demo experiment that is explained in detail in a step-by-step tutorial. Demo experiment 1 consists of 4 runs (96-well format) containing 16 samples, 5 standards, and a no template control to be analyzed for 5 genes of interest and 3 reference genes.
Demo experiment 2 adds two runs to the initial experiment, expanding it with eight additional samples and three calibrators for inter-run calibration. After training, complete analysis of these six plates can be performed in less than an hour. This includes data import, correction of well annotation, quality control, determination of amplification efficiencies, inter-run calibration, calculations and results interpretation. To our knowledge, there are no other tools available that can perform all these functions. Conventional spreadsheet calculations would take considerably longer, are error prone and do not include quality control.

Conclusion

Although qPCR has been around for more than ten years, the employed calculation models are still amenable to improvement. Here we report our advanced, and proven, model for relative quantification that uses gene-specific amplification efficiencies and allows normalization with multiple reference genes. Errors are propagated throughout all calculation steps, and previously ignored errors, such as the uncertainty on the estimated amplification efficiency, are now taken into account. In addition, we developed an inter-run calibration method that allows samples analyzed in different runs to be compared against each other. We implemented these improved and innovative methods in an easy-to-use, Microsoft Excel based tool for the management and the automated analysis of qPCR data, coined qBase. This freely available software package incorporates several data quality controls and uses an advanced relative quantification model with efficiency correction, multiple reference gene normalization, inter-run calibration and error propagation along each step of the calculations. A configurable graphical results output and the possibility to import and export experiments allow easy results interpretation and data exchange, respectively.
As a final comment, we would like to point out that, although our framework and program help with the management and interpretation of mRNA data, assessment of biological relevance or statistical significance requires the correlation of these mRNA data with protein levels or activity, and the measurement of biological replicates, respectively.

Materials and methods

Terminology

In accordance with the Real-time PCR Data Markup Language (RDML), we used the proposed universal terms for the plethora of available descriptions (for example, quantification cycle value (Cq) instead of cycle threshold value (Ct), take off point (TOP) or crossing point (Cp)).

Error propagation

Error propagation is performed using the delta method, based on a truncated Taylor series expansion.

Symbols used in formulas

N, number of replicates i; g, number of genes j; c, number of IRCs m, m'; r, number of runs l, l'; s, number of samples k; f, number of reference genes p, p'; h, number of standard curve points q with known quantity Q; Cq, quantification cycle; CF, calibration factor; NF, normalization factor; RQ, relative quantity (relative to other samples within the same run for the same gene); NRQ, normalized relative quantity; SE, standard error; IRC, inter-run calibrator; CV, coefficient of variation; A, column matrix in which each element consists of the log2 transformed (normalized) relative quantity ratio; V, geNorm pairwise variation; M, geNorm stability parameter.

Determination of amplification efficiencies

A standard curve can be generated from the Cq and quantity values of a dilution series measured for the same amplicon within a single run.
The slope and its standard error can be calculated for this curve by means of linear regression:

$$\mathrm{slope}_{jl} = \frac{\sum_{q=1}^{h} \left(Q_{qjl} - \overline{Q_{jl}}\right)\left(Cq_{qjl} - \overline{Cq_{jl}}\right)}{\sum_{q=1}^{h} \left(Q_{qjl} - \overline{Q_{jl}}\right)^{2}} \qquad \text{(formula 1)}$$

$$s_{e,jl} = \sqrt{\frac{\sum_{q=1}^{h} \left(Cq_{qjl,\mathrm{measured}} - Cq_{qjl,\mathrm{predicted}}\right)^{2}}{h-2}} \qquad \text{(formula 2)}$$

$$s_{x,jl} = \sqrt{\frac{1}{h-1} \sum_{q=1}^{h} \left(Q_{qjl} - \overline{Q_{jl}}\right)^{2}} \qquad \text{(formula 3)}$$

$$SE(\mathrm{slope}_{jl}) = \frac{s_{e,jl}}{s_{x,jl}\,(h-1)} \qquad \text{(formula 4)}$$

The base for exponential amplification E, and its standard error SE(E) are calculated from these values:

$$E_{jl} = 10^{\left(\frac{1}{\mathrm{slope}_{jl}}\right)} \qquad \text{(formula 5)}$$

$$SE(E_{jl}) = \frac{E_{jl} \cdot \ln(10) \cdot SE(\mathrm{slope}_{jl})}{\mathrm{slope}_{jl}^{2}} \qquad \text{(formula 6)}$$

Conversion of Cq values into relative quantities

Step 1 Calculation of the average Cq value for all replicates of the same gene/sample combination jk within a given run l:

$$\overline{Cq_{jkl}} = \frac{\sum_{i=1}^{n} Cq_{ijkl}}{n} \qquad \text{(formula 7)}$$

$$SE(Cq_{jkl}) = \sqrt{\frac{1}{n(n-1)} \sum_{i=1}^{n} \left(Cq_{ijkl} - \overline{Cq_{jkl}}\right)^{2}} \qquad \text{(formula 8)}$$

Step 2 Transformation of mean Cq value into RQ using the gene specific PCR efficiency $E_{jl}$, with minimization of the overall error:

$$Cq_{\mathrm{reference},jl} = \overline{Cq_{jl}} = \frac{\sum_{k=1}^{s} Cq_{jkl}}{s} \qquad \text{(formula 9)}$$

$$\Delta Cq_{jkl} = Cq_{\mathrm{reference},jl} - Cq_{jkl} \qquad \text{(formula 10)}$$

$$RQ_{jkl} = E_{jl}^{\Delta Cq_{jkl}} \qquad \text{(formula 11)}$$

$$SE(RQ_{jkl}) = \sqrt{RQ_{jkl}^{2} \left[ \left( \frac{\Delta Cq_{jkl} \cdot SD(E_{jl})}{E_{jl}} \right)^{2} + \left( \ln(E_{jl}) \cdot SD\left(\overline{Cq_{jkl}}\right) \right)^{2} \right]} \qquad \text{(formula 12)}$$

Normalization: inter-run calibration

The procedures for normalization and inter-run calibration are highly analogous and are therefore described in parallel.

Step 1 Calculation of the normalization factor NF for sample k based on the RQs of the reference genes p.
Step 1' Calculation of the calibration factor CF for gene j in run l based on the NRQs of the IRCs m:

$$NF_{k} = \sqrt[f]{\prod_{p=1}^{f} RQ_{pk}} \qquad \text{(formula 13)}$$

$$CF_{jl} = \sqrt[c]{\prod_{m=1}^{c} NRQ_{jlm}} \qquad \text{(formula 13'; for definition of NRQ, see formula 15)}$$

$$SE(NF_{k}) = NF_{k} \sqrt{\sum_{p=1}^{f} \left( \frac{SE(RQ_{pk})}{f \cdot RQ_{pk}} \right)^{2}} \qquad \text{(formula 14)}$$

$$SE(CF_{jl}) = CF_{jl} \sqrt{\sum_{m=1}^{c} \left( \frac{SE(NRQ_{jlm})}{c \cdot NRQ_{jlm}} \right)^{2}} \qquad \text{(formula 14')}$$

Step 2 Conversion of RQs into NRQs.
Step 2' Conversion of NRQs into CNRQs:

$$NRQ_{jk} = \frac{RQ_{jk}}{NF_{k}} \qquad \text{(formula 15)}$$

$$CNRQ_{jkl} = \frac{NRQ_{jkl}}{CF_{jl}} \qquad \text{(formula 15')}$$

$$SE(NRQ_{jk}) = NRQ_{jk} \sqrt{\left( \frac{SE(NF_{k})}{NF_{k}} \right)^{2} + \left( \frac{SE(RQ_{jk})}{RQ_{jk}} \right)^{2}} \qquad \text{(formula 16)}$$

$$SE(CNRQ_{jkl}) = CNRQ_{jkl} \sqrt{\left( \frac{SE(CF_{jl})}{CF_{jl}} \right)^{2} + \left( \frac{SE(NRQ_{jkl})}{NRQ_{jkl}} \right)^{2}} \qquad \text{(formula 16')}$$

Coefficient of variation of NRQs of a reference gene

Step 1 Calculation of the mean NRQ for all samples k and a given reference gene p:

$$\overline{NRQ_{p}} = \frac{\sum_{k=1}^{s} NRQ_{pk}}{s} \qquad \text{(formula 17)}$$

$$SE(NRQ_{p}) = \sqrt{\frac{1}{s-1} \sum_{k=1}^{s} \left( NRQ_{pk} - \overline{NRQ_{p}} \right)^{2}} \qquad \text{(formula 18)}$$

Step 2 Calculation of the coefficient of variation CV of a given reference gene p across all samples k:

$$CV_{p} = \frac{SE(NRQ_{p})}{\overline{NRQ_{p}}} \qquad \text{(formula 19)}$$

Step 3 Calculation of the mean coefficient of variation for all reference genes:

$$\overline{CV} = \frac{\sum_{p=1}^{f} CV_{p}}{f} \qquad \text{(formula 20)}$$

Reference gene and IRC stability parameter M

Since normalization and inter-run calibration are highly analogous, quality evaluation using the stability parameter M is similar as well. Therefore, both methods are explained in parallel.

Step 1 Calculation of the s × 1 matrix $A^{\mathrm{gene}}$ in which the kth element is the log2 transformed ratio between the relative quantities (not yet normalized) of two reference genes p and p' in sample k; matrix $A^{\mathrm{sample}}$ is calculated in an analogous manner.
Step 1' Calculation of the g × 1 matrix $A^{\mathrm{irc}}$ in which the jth element is the log2 transformed ratio between the NRQs of two IRCs m and m' for the same gene j within a run l; matrix $A^{\mathrm{run}}$ is calculated in an analogous manner:

$$(\forall\, p, p' \in [1, f],\; p \neq p'): \quad A^{\mathrm{gene}}_{pp'k} = \log_{2}\left( \frac{RQ_{kp}}{RQ_{kp'}} \right) \qquad \text{(formula 21)}$$

$$(\forall\, m, m' \in [1, c],\; m \neq m'): \quad A^{\mathrm{irc}}_{mm'jl} = \log_{2}\left( \frac{NRQ_{mjl}}{NRQ_{m'jl}} \right) \qquad \text{(formula 21')}$$

Step 2 Calculation of the mean log transformed ratio and the standard deviation $V^{\mathrm{gene}}$ for all samples k and a given reference gene combination (p, p'). $V^{\mathrm{gene}}$ is the geNorm pairwise variation V for two reference genes.

Step 2' Calculation of the mean log transformed ratio and the standard deviation $V^{\mathrm{irc}}$ for all runs l and a given IRC combination (m, m') and a given gene j.
V sample and V run are calculated similarly from A sample and A run , respectively: A p p ′ g e n e = ∑ k = 1 s A p p ′ k g e n e s ( formula 22 ) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerbhv2BYDwAHbqedmvETj2BSbqee0evGueE0jxyaibaiKI8=vI8tuQ8FMI8Gi=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqadeqadaaakeaacaWGbbWaa0baaSqaaiaadchaceWGWbGbauaaaeaacaWGNbGaamyzaiaad6gacaWGLbaaaOGaeyypa0ZaaSaaaeaadaaeWbqaaiaadgeadaqhaaWcbaGaamiCaiqadchagaqbaiaadUgaaeaacaWGNbGaamyzaiaad6gacaWGLbaaaaqaaiaadUgacqGH9aqpcaaIXaaabaGaam4CaaqdcqGHris5aaGcbaGaam4CaaaacaWLjaGaaCzcamaabmaabaGaaeOzaiaab+gacaqGYbGaaeyBaiaabwhacaqGSbGaaeyyaiaabccacaaIYaGaaGOmaaGaayjkaiaawMcaaaaa@54CA@ A m m ′ j i r c = ∑ l = 1 r A m m ′ j l i r c r ( formula 22 ' ) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerbhv2BYDwAHbqedmvETj2BSbqee0evGueE0jxyaibaiKI8=vI8tuQ8FMI8Gi=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqadeqadaaakeaacaWGbbWaa0baaSqaaiaad2gaceWGTbGbauaacaWGQbaabaGaamyAaiaadkhacaWGJbaaaOGaeyypa0ZaaSaaaeaadaaeWbqaaiaadgeadaqhaaWcbaGaamyBaiqad2gagaqbaiaadQgacaWGSbaabaGaamyAaiaadkhacaWGJbaaaaqaaiaadYgacqGH9aqpcaaIXaaabaGaamOCaaqdcqGHris5aaGcbaGaamOCaaaacaWLjaGaaCzcamaabmaabaGaaeOzaiaab+gacaqGYbGaaeyBaiaabwhacaqGSbGaaeyyaiaabccacaaIYaGaaGOmaiaacEcaaiaawIcacaGLPaaaaaa@557B@ V p p ′ g e n e = S D ( A p p ′ g e n e ) = 1 s − 1 ∑ k = 1 s ( A p p ′ k g e n e − A p p ′ g e n e ¯ ) 2 ( formula 23 ) 
MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerbhv2BYDwAHbqedmvETj2BSbqee0evGueE0jxyaibaiKI8=vI8tuQ8FMI8Gi=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqadeqadaaakeaacaWGwbWaa0baaSqaaiaadchaceWGWbGbauaaaeaacaWGNbGaamyzaiaad6gacaWGLbaaaOGaeyypa0Jaam4uaiaadseadaqadaqaaiaadgeadaqhaaWcbaGaamiCaiqadchagaqbaaqaaiaadEgacaWGLbGaamOBaiaadwgaaaaakiaawIcacaGLPaaacqGH9aqpdaGcaaqaamaalaaabaGaaGymaaqaaiaadohacqGHsislcaaIXaaaamaaqahabaWaaeWaaeaacaWGbbWaa0baaSqaaiaadchaceWGWbGbauaacaWGRbaabaGaam4zaiaadwgacaWGUbGaamyzaaaakiabgkHiTmaanaaabaGaamyqamaaDaaaleaacaWGWbGabmiCayaafaaabaGaam4zaiaadwgacaWGUbGaamyzaaaaaaaakiaawIcacaGLPaaadaahaaWcbeqaaiaaikdaaaaabaGaam4Aaiabg2da9iaaigdaaeaacaWGZbaaniabggHiLdaaleqaaOGaaCzcaiaaxMaadaqadaqaaiaabAgacaqGVbGaaeOCaiaab2gacaqG1bGaaeiBaiaabggacaqGGaGaaGOmaiaaiodaaiaawIcacaGLPaaaaaa@6C54@ V m m ′ j i r c = S D ( A m m ′ j i r c ) = 1 r − 1 ∑ l = 1 r ( A m m ′ j l i r c − A m m ′ j i r c ¯ ) 2 ( formula 23 ' ) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerbhv2BYDwAHbqedmvETj2BSbqee0evGueE0jxyaibaiKI8=vI8tuQ8FMI8Gi=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqadeqadaaakeaacaWGwbWaa0baaSqaaiaad2gaceWGTbGbauaacaWGQbaabaGaamyAaiaadkhacaWGJbaaaOGaeyypa0Jaam4uaiaadseadaqadaqaaiaadgeadaqhaaWcbaGaamyBaiqad2gagaqbaiaadQgaaeaacaWGPbGaamOCaiaadogaaaaakiaawIcacaGLPaaacqGH9aqpdaGcaaqaamaalaaabaGaaGymaaqaaiaadkhacqGHsislcaaIXaaaamaaqahabaWaaeWaaeaacaWGbbWaa0baaSqaaiaad2gaceWGTbGbauaacaWGQbGaamiBaaqaaiaadMgacaWGYbGaam4yaaaakiabgkHiTmaanaaabaGaamyqamaaDaaaleaacaWGTbGabmyBayaafaGaamOAaaqaaiaadMgacaWGYbGaam4yaaaaaaaakiaawIcacaGLPaaadaahaaWcbeqaaiaaikdaaaaabaGaamiBaiabg2da9iaaigdaaeaacaWGYbaaniabggHiLdaaleqaaOGaaCzcaiaaxMaadaqadaqaaiaabAgacaqGVbGaaeOCaiaab2gacaqG1bGaaeiBaiaabggacaqGGaGaaGOmaiaaiodacaGGNaaacaGLOaGaayzkaaaaaa@6D0B@ Step 3 Calculation of the arithmetic mean M gene of all pairwise variations V 
gene of a given reference gene p with all other tested reference genes p'. M gene represents the geNorm gene stability measure M for a particular reference gene p. Step 3' Calculation of the arithmetic mean M irc of all pair wise variations V irc of a given IRC m with all the other IRCs m', for the same gene j. M sample and M run are calculated similarly from V sample and V run , respectively: M p g e n e = ∑ p ′ = 1 f V p p ′ g e n e f − 1 ( formula 24 ) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerbhv2BYDwAHbqedmvETj2BSbqee0evGueE0jxyaibaiKI8=vI8tuQ8FMI8Gi=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqadeqadaaakeaacaWGnbWaa0baaSqaaiaadchaaeaacaWGNbGaamyzaiaad6gacaWGLbaaaOGaeyypa0ZaaSaaaeaadaaeWbqaaiaadAfadaqhaaWcbaGaamiCaiqadchagaqbaaqaaiaadEgacaWGLbGaamOBaiaadwgaaaaabaGabmiCayaafaGaeyypa0JaaGymaaqaaiaadAgaa0GaeyyeIuoaaOqaaiaadAgacqGHsislcaaIXaaaaiaaxMaacaWLjaWaaeWaaeaacaqGMbGaae4BaiaabkhacaqGTbGaaeyDaiaabYgacaqGHbGaaeiiaiaaikdacaaI0aaacaGLOaGaayzkaaaaaa@549B@ M m j i r c = ∑ m ′ = 1 c V m m ′ j i r c c − 1 ( formula 24 ' ) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerbhv2BYDwAHbqedmvETj2BSbqee0evGueE0jxyaibaiKI8=vI8tuQ8FMI8Gi=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqadeqadaaakeaacaWGnbWaa0baaSqaaiaad2gacaWGQbaabaGaamyAaiaadkhacaWGJbaaaOGaeyypa0ZaaSaaaeaadaaeWbqaaiaadAfadaqhaaWcbaGaamyBaiqad2gagaqbaiaadQgaaeaacaWGPbGaamOCaiaadogaaaaabaGabmyBayaafaGaeyypa0JaaGymaaqaaiaadogaa0GaeyyeIuoaaOqaaiaadogacqGHsislcaaIXaaaaiaaxMaacaWLjaWaaeWaaeaacaqGMbGaae4BaiaabkhacaqGTbGaaeyDaiaabYgacaqGHbGaaeiiaiaaikdacaaI0aGaai4jaaGaayjkaiaawMcaaaaa@5546@ Step 4 Calculation of the mean stability measure for all reference genes. 
Step 4' Calculation of the mean stability measure for all IRCs: M g e n e ¯ = ∑ p = 1 f M p g e n e f ( formula 25 ) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerbhv2BYDwAHbqedmvETj2BSbqee0evGueE0jxyaibaiKI8=vI8tuQ8FMI8Gi=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqadeqadaaakeaadaqdaaqaaiaad2eadaahaaWcbeqaaiaadEgacaWGLbGaamOBaiaadwgaaaaaaOGaeyypa0ZaaSaaaeaadaaeWbqaaiaad2eadaqhaaWcbaGaamiCaaqaaiaadEgacaWGLbGaamOBaiaadwgaaaaabaGaamiCaiabg2da9iaaigdaaeaacaWGMbaaniabggHiLdaakeaacaWGMbaaaiaaxMaacaWLjaWaaeWaaeaacaqGMbGaae4BaiaabkhacaqGTbGaaeyDaiaabYgacaqGHbGaaeiiaiaaikdacaaI1aaacaGLOaGaayzkaaaaaa@50FA@ M j i r c ¯ = ∑ m = 1 f M m j i r c f ( formula 25 ' ) MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfeBSjuyZL2yd9gzLbvyNv2Caerbhv2BYDwAHbqedmvETj2BSbqee0evGueE0jxyaibaiKI8=vI8tuQ8FMI8Gi=hEeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciGacaGaaeqabaqadeqadaaakeaadaqdaaqaaiaad2eadaqhaaWcbaGaamOAaaqaaiaadMgacaWGYbGaam4yaaaaaaGccqGH9aqpdaWcaaqaamaaqahabaGaamytamaaDaaaleaacaWGTbGaamOAaaqaaiaadMgacaWGYbGaam4yaaaaaeaacaWGTbGaeyypa0JaaGymaaqaaiaadAgaa0GaeyyeIuoaaOqaaiaadAgaaaGaaCzcaiaaxMaadaqadaqaaiaabAgacaqGVbGaaeOCaiaab2gacaqG1bGaaeiBaiaabggacaqGGaGaaGOmaiaaiwdacaGGNaaacaGLOaGaayzkaaaaaa@51B1@ Calculations on the effect of inter-run calibration The calculations for Figure 3 and Additional data file 1 have been performed as described in the formulas listed above. Difference in Cq is defined as the mean difference between the IRCs in run 1 and run 2. Fold change is defined as the ratio of the geometric mean of the (C)NRQs of the IRCs in run 1 and run 2. For the calculation of the effects of inter-run calibration, NRQ values were retrieved from qBase for runs 1, 2 and 3 independently. Inter-run calibration was performed as described in formulas 13'-16', using one, two or three IRCs (Additional data file 2). 
The effect of inter-run calibration with two IRCs was calculated on the three sets of two IRCs (IRCs 1,2 versus IRCs 1,3 versus IRCs 2,3). Similarly, the effect of inter-run calibration with one IRC was calculated over all individual IRCs. The increase in error is defined as the ratio of the relative error after and before calibration. The 95% confidence interval (CI) for this increase was calculated on log-transformed ratios. For the investigation of the effect of the selection of (sets of) IRCs from the three available calibrators, CNRQs for the different calibrated data sets were rescaled to allow them to be compared. The fold difference between the data sets was log transformed and a 95% CI was calculated. The effect of calibration with identical or independently prepared cDNA was studied similarly to the effect of the selection of IRCs. The IRC stability measure was calculated as described in formulas 21'-25'. Additional data files The following additional data are available with the online version of this paper. Additional data file 1 contains all the data and calculations leading to the results presented in Figure 3. Additional data file 2 contains all the data and calculations that were used for the evaluation of the effect of inter-run calibration on the final results. The conclusions of these calculations are represented, in part, in Table 2. Supplementary Material Additional data file 1 Data and calculations leading to the results presented in Figure 3 Click here for file Additional data file 2 The conclusions of these calculations are represented, in part, in Table 2 Click here for file
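As an illustration of the gene stability calculation in formulas 21 to 25, the following is a minimal Python sketch, not the qBase or geNorm implementation. It assumes the relative quantities are supplied as a list of rows (one row per sample k, one column per reference gene p); the function name `genorm_stability` and the example values are hypothetical and chosen only so that the result can be checked by hand.

```python
import math
from statistics import mean, stdev


def genorm_stability(rq):
    """Compute pairwise variation V, stability M per gene, and mean M.

    rq: list of s rows (samples), each with f positive relative
        quantities (one per reference gene). Hypothetical input layout.
    """
    f = len(rq[0])
    # log2-transform once so that each ratio becomes a difference
    log_rq = [[math.log2(x) for x in row] for row in rq]

    v = [[0.0] * f for _ in range(f)]
    for p in range(f):
        for q in range(f):
            if p == q:
                continue
            # formula 21: log2 ratio of genes p and q in every sample
            a = [row[p] - row[q] for row in log_rq]
            # formulas 22-23: sample standard deviation (s - 1 denominator)
            v[p][q] = stdev(a)

    # formula 24: mean pairwise variation of gene p against the other genes
    m = [sum(v[p]) / (f - 1) for p in range(f)]
    # formula 25: mean stability over all reference genes
    return v, m, mean(m)


# Made-up data: genes 0 and 1 are perfectly co-regulated, gene 2 is constant.
example_rq = [
    [1.0, 2.0, 1.0],
    [2.0, 4.0, 1.0],
    [4.0, 8.0, 1.0],
]
v, m, m_mean = genorm_stability(example_rq)
print(m)       # [0.5, 0.5, 1.0]
print(m_mean)
```

In this toy data set the two co-regulated genes receive the lower M value (0.5) and the third gene the higher one (1.0), which matches the interpretation of M as a stability measure where lower is more stable. The primed formulas (21' to 25') follow the same pattern with runs in place of samples and IRCs in place of reference genes.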

Neonatal sepsis: an international perspective.

G. Heath, P. Kazembe, S. Vergnano … (2005)

Neonatal infections currently cause about 1.6 million deaths annually in developing countries. Sepsis and meningitis are responsible for most of these deaths. Resistance to commonly used antibiotics is emerging and constitutes an important problem worldwide. To reduce global neonatal mortality, strategies of proven efficacy, such as hand washing, barrier nursing, restriction of antibiotic use, and rationalisation of admission to neonatal units, need to be implemented. Different approaches require further research.

Author and article information

Contributors

Paris Clarice Papagianis:

ORCID: https://orcid.org/0000-0002-4709-6017

Roles: Data curation, Investigation, Project administration, Writing – original draft, Writing – review & editing

Siavash Ahmadi-Noorbakhsh:

ORCID: https://orcid.org/0000-0001-6293-670X

Roles: Project administration, Writing – review & editing

Rebecca Lim: Roles: Conceptualization, Funding acquisition, Resources, Writing – review & editing

Euan Wallace: Roles: Conceptualization, Funding acquisition, Writing – review & editing

Graeme Polglase: Roles: Formal analysis, Funding acquisition, Methodology, Resources, Supervision, Writing – review & editing

J. Jane Pillow: Roles: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

Timothy J. Moss: Roles: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

Harald Ehrhardt: Role: Editor

Journal

Journal ID (nlm-ta): PLoS One

Journal ID (iso-abbrev): PLoS One

Journal ID (publisher-id): plos

Title: PLoS ONE

Publisher: Public Library of Science (San Francisco, CA, USA)

ISSN (Electronic): 1932-6203

Publication date (Electronic): 25 June 2021

Publication date Collection: 2021

Volume: 16

Issue: 6

Electronic Location Identifier: e0253456

Affiliations

[1] The Ritchie Centre, Hudson Institute of Medical Research, Clayton, Victoria, Australia

[2] Department of Obstetrics and Gynaecology, School of Clinical Health Sciences, Monash University, Clayton, Victoria, Australia

[3] School of Human Sciences, The University of Western Australia, Crawley, WA, Australia

[4] School of Health Sciences and Health Innovations Research Institute, RMIT University, Melbourne, VIC, Australia

Center of Pediatrics, GERMANY

Author notes

Competing Interests: The University of Western Australia (via JJP) has consultancy agreements with Chiesi Farmaceutici S.p.A. unrelated to the subject of this study. Fisher & Paykel Healthcare have material transfer agreements with Hudson Research Institute that are also unrelated to this work. There are no other relevant interests relating to employment, consultancy, patents, or products in development or marketed products to declare. These material transfer agreements do not alter our adherence to PLOS ONE policies on sharing data and materials. None of the authors have conflicts of interest to disclose.

‡ These authors are joint senior authors on this work.

* E-mail: pariscpapagianis@ 123456gmail.com

Author information

Paris Clarice Papagianis https://orcid.org/0000-0002-4709-6017

Siavash Ahmadi-Noorbakhsh https://orcid.org/0000-0001-6293-670X

Article

Publisher ID: PONE-D-21-04396

DOI: 10.1371/journal.pone.0253456

PMC ID: 8232434

PubMed ID: 34170941

SO-VID: f5ba3869-6040-46e5-b872-34647813cfce

License:

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

History

Date received: 9 February 2021

Date accepted: 4 June 2021

Page count

Figures: 6, Tables: 3, Pages: 20

Funding

Funded by: National Health and Medical Research Council (AU)

Award ID: 1077769

Award Recipient: Timothy J. Moss

Funded by: National Health and Medical Research Council (AU)

Award ID: 1057514

Award Recipient: J. Jane Pillow

Funded by: National Health and Medical Research Council (funder ID: http://dx.doi.org/10.13039/501100000925)

Award ID: 1043294

Award Recipient: Timothy J. Moss

Funded by: National Health and Medical Research Council (funder ID: http://dx.doi.org/10.13039/501100000925)

Award ID: 1077691

Award Recipient: J. Jane Pillow

This research was supported by an NHMRC Project Grant (1077769), NHMRC Centre for Research Excellence (1057514), two NHMRC Senior Research Fellowships (JJP: 1077691; TJM: 1043294), the Victorian Government’s Operational Infrastructure Support Program, and the West Australian Government’s Medical and Health Research Infrastructure Fund. Unrestricted equipment and consumable support was provided by Chiesi Farmaceutici S.p.A. (poractant alfa); Fisher & Paykel Healthcare (ventilator circuits); and ICU Medical (arterial monitoring lines). None of the commercial industry funders had any input into study design, data collection and analysis, decision to publish or preparation of the manuscript. Chiesi Farmaceutici S.p.A. reviewed the final manuscript for technical accuracy pertaining to description and use of their surfactant, in accordance with a Material Transfer Agreement associated with the provision of surfactant for study animals. The University of Western Australia (via JJP) has consultancy agreements with Chiesi Farmaceutici S.p.A. unrelated to the subject of this study. Fisher & Paykel Healthcare have material transfer agreements with Hudson Research Institute that are also unrelated to this work. There are no other relevant interests relating to employment, consultancy, patents, or products in development or marketed products to declare. These material transfer agreements do not alter our adherence to PLOS ONE policies on sharing data and materials. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Custom metadata

Data Availability: All relevant data are within the manuscript and its Supporting information files.

ScienceOpen disciplines: Uncategorized
