Biomarker discovery in heterogeneous tissue samples - taking the in-silico deconfounding approach

Abstract

Background

For heterogeneous tissues, such as blood, measurements of gene expression are confounded by the relative proportions of the cell types involved. Conclusions therefore have to rely on estimates of gene expression signals for homogeneous cell populations, obtained e.g. by micro-dissection, fluorescence-activated cell sorting, or in-silico deconfounding. We studied the feasibility and validity of a non-negative matrix decomposition algorithm using experimental gene expression data for blood and sorted cells from the same donor samples. Our objective was to optimize the algorithm for the detection of differentially expressed genes and to enable its use for classification in the difficult scenario of reversely regulated genes. This would be of importance for the identification of candidate biomarkers in heterogeneous tissues.
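The confounding described above can be illustrated with a small sketch. It assumes a simple linear mixing model (measured expression is the proportion-weighted sum of cell-type expression) and recovers the proportions of a single sample with non-negative multiplicative updates while the cell-type signatures are held fixed. This is not the authors' implementation: the signature values, the function names `mix` and `estimate_proportions`, and the final rescaling to a sum of one are all assumptions made for illustration.

```python
def mix(signatures, proportions):
    """Mixed-tissue expression per gene under a linear mixing model (assumed).

    signatures[g][k] is the expression of gene g in pure cell type k;
    proportions[k] is the fraction of cell type k in the sample.
    """
    return [sum(s * p for s, p in zip(row, proportions)) for row in signatures]


def estimate_proportions(signatures, mixed, n_iter=500):
    """Non-negative estimate of cell-type proportions for one sample.

    Uses multiplicative updates for the least-squares objective with the
    signature matrix held fixed. Estimates start positive and stay
    non-negative because each step only multiplies by a non-negative ratio.
    """
    n_types = len(signatures[0])
    h = [1.0 / n_types] * n_types  # strictly positive starting point
    for _ in range(n_iter):
        fitted = mix(signatures, h)  # computed once, so all h[k] update together
        for k in range(n_types):
            num = sum(row[k] * x for row, x in zip(signatures, mixed))
            den = sum(row[k] * f for row, f in zip(signatures, fitted))
            if den > 0:
                h[k] *= num / den
    total = sum(h)
    return [v / total for v in h]  # rescaling to sum 1 is an assumption


# Hypothetical signatures: 4 genes (rows) x 2 cell types (columns).
signatures = [
    [100.0, 10.0],
    [20.0, 80.0],
    [50.0, 50.0],
    [5.0, 60.0],
]

# Same cell-type expression, different cell-type proportions.
sample_a = mix(signatures, [0.7, 0.3])
sample_b = mix(signatures, [0.4, 0.6])

# Gene 0 looks differentially expressed although no cell type changed.
apparent_shift = sample_a[0] - sample_b[0]

estimated_a = estimate_proportions(signatures, sample_a)
estimated_b = estimate_proportions(signatures, sample_b)
```

In this noise-free toy case the estimates match the proportions used to build the mixtures. With real data the signatures themselves have to be estimated, which is the part a full matrix decomposition addresses.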

Results

Experimental data and simulation studies involving noise parameters estimated from these data revealed that, for valid detection of differential gene expression, quantile normalization and the use of non-log data are optimal. We demonstrate the feasibility of predicting the proportions of constituent cell types from gene expression data of single samples, as a prerequisite for a deconfounding-based classification approach.

Classification cross-validation errors with and without deconfounding results are reported, as well as sample-size dependencies. An implementation of the algorithm and the simulation and analysis scripts are available.
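As a minimal sketch of the preprocessing named above, the following applies quantile normalization directly to raw (non-log) intensities. It is a generic textbook version, not the authors' script: the function name `quantile_normalize`, the tie handling, and the example intensities are assumptions.

```python
def quantile_normalize(samples):
    """Quantile-normalize samples given as lists of raw, non-log intensities.

    Each sample is mapped onto a shared reference distribution: the mean,
    across samples, of the values at each rank. Ties within a sample are
    broken by gene position (an arbitrary choice for this sketch).
    """
    n_genes = len(samples[0])
    if any(len(s) != n_genes for s in samples):
        raise ValueError("all samples must cover the same genes")

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(col[k] for col in sorted_samples) / len(samples)
        for k in range(n_genes)
    ]

    normalized = []
    for s in samples:
        # Gene indices ordered by increasing intensity within this sample.
        order = sorted(range(n_genes), key=lambda g: s[g])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical raw intensities: 3 samples x 4 genes, no log transform applied.
raw = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
norm = quantile_normalize(raw)
```

After normalization every sample has the same set of values, and only the assignment of values to genes differs between samples. Staying on the non-log scale keeps a linear mixing model additive, which is one plausible reading of why non-log data suit deconfounding.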

Conclusions

The deconfounding algorithm without decorrelation using quantile normalization on non-log data is proposed for biomarkers that are difficult to detect, and for cases where confounding by varying proportions of cell types is the suspected reason. In this case, a deconfounding ranking approach can be used as a powerful alternative to, or complement of, other statistical learning approaches to define candidate biomarkers for molecular diagnosis and prediction in biomedicine, in realistically noisy conditions and with moderate sample sizes.

Author and article information

Journal

Journal ID (nlm-ta): BMC Bioinformatics

Title: BMC Bioinformatics

Publisher: BioMed Central

ISSN (Electronic): 1471-2105

Publication date (Collection): 2010

Publication date (Electronic): 14 January 2010

Volume: 11

Page: 27

Affiliations

[1] Department of Genetics and Biometry, Research Institute for the Biology of Farm Animals, Wilhelm-Stahl Allee 2, D 18196 Dummerstorf, Germany

[2] Bioinformatics Chair, Institute for Biochemistry and Biology at the University of Potsdam, Karl-Liebknecht-Str. 24-25, D 14476 Potsdam-Golm, Germany

[3] Molecular Biology and Human Genetics, University of Stellenbosch, Tygerberg, Cape Town 7505, South Africa

[4] Department of Immunology, Max-Planck-Institute for Infection Biology, Charitéplatz 1, D 10117 Berlin, Germany

[5] Department of Immunology, Bernhard-Nocht-Institute for Tropical Medicine, Bernhard-Nocht-Str. 74, D 20359 Hamburg, Germany

Article

Publisher ID: 1471-2105-11-27

DOI: 10.1186/1471-2105-11-27

PMC ID: 3098067

PubMed ID: 20070912

SO-VID: 774a8a27-1579-4113-8d30-62937611b17d

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 3 September 2009

Date accepted: 14 January 2010
