      Is Open Access

      A multiobjective multi-view cluster ensemble technique: Application in patient subclassification

      research-article
      PLoS ONE
      Public Library of Science


          Abstract

          Recent high-throughput omics technologies have been used to assemble large biomedical omics datasets. Clustering of single-omics data has proven invaluable in biomedical research. For the task of patient sub-classification, however, all available omics data should be used jointly rather than treated individually, and clustering of multi-omics datasets has the potential to reveal deeper insights. Here, we propose a late-integration-based multiobjective multi-view clustering algorithm that uses a special perturbation operator. Initially, a large number of diverse clustering solutions (called base partitionings) are generated for each omic dataset using four clustering algorithms, viz., k-means, complete linkage, spectral, and fast search clustering. These base partitionings of the multi-omic datasets are then combined by the perturbation operator, which uses an ensemble technique to generate new solutions from the base partitionings. The optimal combination of partitioning solutions across the different views is determined by optimizing two objective functions: conn-XB, which checks the quality of the partitionings for the different views, and an agreement index, which checks the agreement between the views. The search capability of a multiobjective simulated annealing approach, namely AMOSA, is used for this purpose. Lastly, the non-dominated solutions of the different views are combined based on similarity to generate a single set of non-dominated solutions. The proposed algorithm is evaluated on 13 multi-view cancer datasets. An elaborate comparative study with several baseline methods and five state-of-the-art models is performed to show the effectiveness of the algorithm.
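The abstract names three building blocks without defining them: an ensemble step over base partitionings, an agreement measure between views, and the selection of non-dominated solutions. The Python sketch below illustrates generic versions of each, under stated assumptions. The function names `pair_agreement`, `co_association`, and `non_dominated` are hypothetical. The pairwise (Rand-style) agreement and the co-association matrix are common stand-ins, not the paper's agreement index, conn-XB objective, or perturbation operator, whose exact forms the abstract does not give. The Pareto filter assumes every objective is to be maximized.

```python
from itertools import combinations


def pair_agreement(labels_a, labels_b):
    """Fraction of sample pairs on which two partitionings agree (Rand index).

    A pair agrees when both partitionings co-cluster the two samples, or
    both separate them. Assumption: a generic stand-in for an agreement
    measure between views, not the paper's agreement index.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitionings must label the same samples")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def co_association(partitionings):
    """Ensemble summary: share of base partitionings that co-cluster each pair."""
    n = len(partitionings[0])
    m = len(partitionings)
    return [
        [sum(p[i] == p[j] for p in partitionings) / m for j in range(n)]
        for i in range(n)
    ]


def non_dominated(solutions):
    """Keep objective vectors that no other vector dominates.

    Assumption: all objectives are maximized; negate any objective that
    should be minimized before calling this.
    """
    def dominates(u, v):
        return all(a >= b for a, b in zip(u, v)) and any(
            a > b for a, b in zip(u, v)
        )

    return [s for s in solutions if not any(dominates(t, s) for t in solutions)]


# Toy labels for six patients in two views (made-up data for illustration).
view_1 = [0, 0, 1, 1, 2, 2]
view_2 = [1, 1, 0, 0, 0, 2]
print(pair_agreement(view_1, view_2))
print(co_association([view_1, view_2])[2])
print(non_dominated([(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]))
```

Label values themselves carry no meaning here: only whether two patients share a label matters, which is why both the agreement measure and the co-association matrix compare pairs rather than raw labels.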


                Author and article information

                Contributors
                Roles: Conceptualization, Formal analysis, Investigation, Methodology, Validation, Visualization, Writing – original draft, Writing – review & editing
                Roles: Conceptualization, Supervision, Writing – review & editing
                Role: Editor
                Journal
                PLoS One
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1932-6203
                Published: 23 May 2019
                Volume 14, Issue 5, e0216904
                Affiliations
                [001] Department of Computer Science and Engineering, Indian Institute of Technology Patna, India
                Northeast Electric Power University, China
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Author information
                http://orcid.org/0000-0001-8140-6499
                Article
                Manuscript ID: PONE-D-19-04726
                DOI: 10.1371/journal.pone.0216904
                PMCID: 6533037
                PMID: 31120942
                58d10f6c-8130-45eb-bed0-7559f4cbe0ba
                © 2019 Mitra, Saha

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 17 February 2019
                Accepted: 30 April 2019
                Page count
                Figures: 6, Tables: 11, Pages: 30
                Funding
                Dr. Sriparna Saha gratefully acknowledges the Young Faculty Research Fellowship (YFRF) Award, supported by Visvesvaraya PhD scheme for Electronics and IT, Ministry of Electronics and Information Technology (MeitY), Government of India, being implemented by Digital India Corporation (formerly Media Lab Asia) for carrying out this research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Physical Sciences > Mathematics > Applied Mathematics > Algorithms
                Research and Analysis Methods > Simulation and Modeling > Algorithms
                Biology and Life Sciences > Genetics > Gene Expression
                Physical Sciences > Mathematics > Applied Mathematics > Algorithms > Clustering Algorithms
                Research and Analysis Methods > Simulation and Modeling > Algorithms > Clustering Algorithms
                Research and Analysis Methods > Research Facilities > Information Centers > Archives
                Biology and life sciences > Biochemistry > Nucleic acids > RNA > Non-coding RNA > Natural antisense transcripts > MicroRNAs
                Biology and life sciences > Genetics > Gene expression > Gene regulation > MicroRNAs
                Biology and life sciences > Cell biology > Chromosome biology > Chromatin > Chromatin modification > DNA methylation
                Biology and life sciences > Genetics > Epigenetics > Chromatin > Chromatin modification > DNA methylation
                Biology and life sciences > Genetics > Gene expression > Chromatin > Chromatin modification > DNA methylation
                Biology and life sciences > Genetics > DNA > DNA modification > DNA methylation
                Biology and life sciences > Biochemistry > Nucleic acids > DNA > DNA modification > DNA methylation
                Biology and life sciences > Genetics > Epigenetics > DNA modification > DNA methylation
                Biology and life sciences > Genetics > Gene expression > DNA modification > DNA methylation
                Ecology and Environmental Sciences > Soil Science > Soil Perturbation
                Biology and Life Sciences > Genetics > Gene Expression > Gene Regulation
                Custom metadata
                Underlying data owned by Ron Shamir’s Lab at Tel Aviv University, Israel, are available here: http://acgt.cs.tau.ac.il/multi_omic_benchmark/download.htm. Further underlying datasets may be found at the following: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE22219 (GSE22219, for the OXF.BRC.1 and OXF.BRC.1 datasets); https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE22220 (GSE22220, for the OXF.BRC.1 and OXF.BRC.1 datasets); http://cbio.mskcc.org/cancergenomics/prostate/data/ (for MSKCC.PRCA).
