
      A consensus orthogonal partial least squares discriminant analysis (OPLS-DA) strategy for multiblock Omics data fusion.

      1 ,
      Analytica chimica acta


          Abstract

          Omics approaches have proven their value to provide a broad monitoring of biological systems. However, as no single analytical technique is sufficient to reveal the full biochemical content of complex biological matrices or biofluids, the fusion of information from several data sources has become a decisive issue. Omics studies generate an increasing amount of massive data obtained from different analytical devices. These data are usually high dimensional and extracting knowledge from these multiple blocks is challenging. Appropriate tools are therefore needed to handle these datasets suitably. For that purpose, a generic methodology is proposed by combining the strengths of established data analysis strategies, i.e. multiple kernel learning and OPLS-DA to offer an efficient tool for the fusion of Omics data obtained from multiple sources. Three real case studies are proposed to assess the potential of the method. A first example illustrates the fusion of mass spectrometry-based metabolomic data acquired in both negative and positive electrospray ionisation modes, from leaf samples of the model plant Arabidopsis thaliana. A second dataset involves the classification of wine grape varieties based on polyphenolic extracts analysed by two-dimensional heteronuclear magnetic resonance spectroscopy. A third case study underlines the ability of the method to combine heterogeneous data from systems biology with the analysis of publicly available data related to NCI-60 cancer cell lines from different tissue origins, which include metabolomics, transcriptomics and proteomics. The fusion of Omics data from different sources is expected to provide a more complete view of biological systems. The proposed method was demonstrated as a relevant and widely applicable alternative to handle efficiently the inherent characteristics of multiple Omics data, such as very large numbers of noisy collinear variables.
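          The abstract names the two ingredients of the method (multiple kernel learning and OPLS-DA) without giving the algorithm. As a rough illustration of the kernel-fusion idea only, the sketch below builds one linear kernel per data block and adds them into a single sample-by-sample matrix. The column centring, the Frobenius-norm scaling, the equal block weights, the function name `consensus_kernel` and the random toy blocks are all assumptions made for this example; none of them is taken from the abstract, and the discriminant (OPLS-DA) step is not implemented here.

```python
import numpy as np


def consensus_kernel(blocks):
    """Sum of Frobenius-normalised linear kernels, one per data block.

    Each block is an (n_samples, n_variables) array; all blocks must
    describe the same samples in the same order. Hypothetical helper,
    not the published implementation.
    """
    n = blocks[0].shape[0]
    K = np.zeros((n, n))
    for X in blocks:
        if X.shape[0] != n:
            raise ValueError("all blocks must describe the same samples")
        Xc = X - X.mean(axis=0)              # centre each variable (assumed preprocessing)
        Kb = Xc @ Xc.T                       # linear kernel, n x n whatever the block width
        K += Kb / np.linalg.norm(Kb, "fro")  # scale so every block has unit Frobenius norm
    return K


# Toy data standing in for two analytical platforms of very different width.
rng = np.random.default_rng(0)
block_a = rng.normal(size=(12, 500))
block_b = rng.normal(size=(12, 80))
K = consensus_kernel([block_a, block_b])
```

          Because each block is reduced to an n x n matrix before fusion, a block with thousands of variables does not automatically outweigh a smaller one, and the fused matrix could then be handed to any kernel-based classifier.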


          Author and article information

          Journal
          Anal. Chim. Acta
          Analytica chimica acta
          1873-4324
          0003-2670
          Mar 26 2013
          Volume: 769
          Affiliations
          [1 ] Laboratoire de Chimie Analytique, AgroParisTech, Paris, France.
          Article
          PII: S0003-2670(13)00170-0
          DOI: 10.1016/j.aca.2013.01.022
          PMID: 23498118
          97f75b60-7396-4429-ba8a-64ec0849edc4
          Copyright © 2013 Elsevier B.V. All rights reserved.