      Critical limitations of consensus clustering in class discovery

      research-article
      Scientific Reports
      Nature Publishing Group


          Abstract

          Consensus clustering (CC) has been adopted for unsupervised class discovery in many genomic studies. It calculates how frequently two samples are grouped together in repeated clustering runs, and uses the resulting pairwise "consensus rates" for visual demonstration that clusters exist, for comparing cluster stability, and for estimating the optimal cluster number (K). However, the sensitivity and specificity of CC have not been systematically assessed. Through simulations we find that CC is able to divide randomly generated unimodal data into apparently stable clusters for a range of K, essentially reporting chance partitions of cluster-less data. For data with known structure, the common implementations of CC perform poorly in identifying the true K. These results suggest that CC should be applied and interpreted with caution. We find that a new metric based on CC, the proportion of ambiguously clustered pairs (PAC), infers K equally or more reliably than similar methods in simulated data with known K. Our overall approach involves the use of realistic null distributions based on the observed gene-gene correlation structure in a given study, and the implementation of PAC to more accurately estimate K. We discuss the strength of our approach in the context of other ensemble-based methods.
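
The two quantities named in the abstract, pairwise consensus rates and PAC, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The base clusterer (a plain k-means), the 80% subsampling fraction, the number of runs, and the ambiguity window (0.1, 0.9) are all assumptions chosen for illustration, and the function names `kmeans_labels`, `consensus_matrix`, and `pac` are hypothetical.

```python
import numpy as np


def kmeans_labels(X, k, rng, n_iter=20):
    """Plain Lloyd's k-means (assumed base clusterer); one label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]  # copy of k random rows
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center: shape (n, k)
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def consensus_matrix(X, k, n_runs=50, frac=0.8, seed=0):
    """Consensus rate for each sample pair: the fraction of runs in which the
    two samples were clustered together, among runs where both were subsampled."""
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))  # times a pair shared a cluster
    sampled = np.zeros((n, n))   # times a pair was drawn in the same run
    m = max(k, int(round(frac * n)))
    for _ in range(n_runs):
        idx = rng.choice(n, size=m, replace=False)
        labels = kmeans_labels(X[idx], k, rng)
        same = (labels[:, None] == labels[None, :]).astype(float)
        sampled[np.ix_(idx, idx)] += 1.0
        together[np.ix_(idx, idx)] += same
    with np.errstate(invalid="ignore", divide="ignore"):
        # Pairs never drawn together get NaN rather than a made-up rate
        return np.where(sampled > 0, together / sampled, np.nan)


def pac(M, lower=0.1, upper=0.9):
    """Proportion of sample pairs whose consensus rate lies strictly between
    `lower` and `upper` (assumed window); lower values mean a crisper partition."""
    vals = M[np.triu_indices_from(M, k=1)]  # each unordered pair once
    vals = vals[~np.isnan(vals)]
    return float(np.mean((vals > lower) & (vals < upper)))


if __name__ == "__main__":
    # Cluster-less (unimodal Gaussian) data, standing in for a simple null
    null_data = np.random.default_rng(1).normal(size=(60, 5))
    for k in (2, 3, 4):
        print(k, round(pac(consensus_matrix(null_data, k)), 3))
```

In this sketch a candidate K would be preferred when its PAC is lowest. Running the same procedure on structure-free data of the same shape gives a rough reference for how low PAC can fall by chance, although the null data here is an uncorrelated Gaussian and does not reproduce the gene-gene correlation structure the abstract calls for.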


                Author and article information

                Journal: Scientific Reports (Sci Rep)
                Publisher: Nature Publishing Group
                ISSN: 2045-2322
                Published: 27 August 2014
                Volume: 4
                Article number: 6207
                Affiliations
                [1] Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, USA
                [2] Department of Statistics and EECS, University of Michigan, Ann Arbor, MI, USA
                [3] Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
                [4] Current address: Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, NY, USA
                Article
                Publisher ID: srep06207
                DOI: 10.1038/srep06207
                PMCID: PMC4145288
                PMID: 25158761
                Copyright © 2014, Macmillan Publishers Limited. All rights reserved

                This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder in order to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/

                History
                Received: 19 May 2014
                Accepted: 08 August 2014
                Categories
                Article
