      Machine learning and multi-omics data reveal driver gene-based molecular subtypes in hepatocellular carcinoma for precision treatment

      research-article


          Abstract

          The heterogeneity of Hepatocellular Carcinoma (HCC) poses a barrier to effective treatment. Stratifying highly heterogeneous HCC into molecular subtypes with similar features is crucial for personalized anti-tumor therapies. Although driver genes play pivotal roles in cancer progression, their potential in HCC subtyping has been largely overlooked. This study aims to utilize driver genes to construct HCC subtype models and unravel their molecular mechanisms. Utilizing a novel computational framework, we expanded the initially identified 96 driver genes to 1192 based on mutational aspects and an additional 233 considering driver dysregulation. These genes were subsequently employed as stratification markers for further analyses. A novel multi-omics subtype classification algorithm was developed, leveraging mutation and expression data of the identified stratification genes. This algorithm successfully categorized HCC into two distinct subtypes, CLASS A and CLASS B, demonstrating significant differences in survival outcomes. Integrating multi-omics and single-cell data unveiled substantial distinctions between these subtypes regarding transcriptomics, mutations, copy number variations, and epigenomics. Moreover, our prognostic model exhibited excellent predictive performance in training and external validation cohorts. Finally, a 10-gene classification model for these subtypes identified TTK as a promising therapeutic target with robust classification capabilities. This comprehensive study provides a novel perspective on HCC stratification, offering crucial insights for a deeper understanding of its pathogenesis and the development of promising treatment strategies.

          Author summary

          Dividing highly heterogeneous HCC into molecular subtypes with similar characteristics is crucial for personalized anti-tumor therapies. Although driver genes play pivotal roles in cancer progression, their potential in HCC subtyping has been largely overlooked. In this work, we developed a multi-omics network-based stratification algorithm that utilizes patient mutation data and requires fewer computational resources for subtype assignment. Through this algorithm, we categorized HCC into two subtypes, CLASS A and CLASS B. Using multi-omics and single-cell data, we identified differences between these subtypes in gene expression, methylation, immune infiltration, and other aspects. Beyond subtype characterization, our study established a robust clinical prediction model (https://mike-wang-bjut.shinyapps.io/DynNomapp_HCC_Sutypes/) incorporating subtype information and typical clinical features, enabling precise survival predictions. Finally, we developed a high-performing machine learning classifier for our subtypes. Analyzing this classification model and reviewing previous experimental papers, we identified TTK as a potential diagnostic marker and therapeutic target specific to our subtypes. In conclusion, our research offers a novel perspective on HCC stratification, which is crucial for a deeper understanding of its pathogenesis and for developing promising treatment strategies.


                Author and article information

                Contributors
                Roles: Conceptualization, Formal analysis, Methodology, Writing – original draft
                Roles: Investigation, Visualization, Writing – original draft
                Roles: Data curation, Software
                Roles: Supervision, Writing – review & editing
                Role: Funding acquisition
                Role: Editor
                Journal
                PLOS Computational Biology (PLoS Comput Biol)
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1553-734X, 1553-7358
                Published: 10 May 2024 (May 2024 issue)
                Volume: 20
                Issue: 5
                Article: e1012113
                Affiliations
                [001] Faculty of Environment and Life of Beijing University of Technology, Beijing, China
                University of Southern California, UNITED STATES
                Author notes

                The authors declare that they have no competing interests.

                Author information
                https://orcid.org/0009-0001-5963-2462
                https://orcid.org/0000-0002-5799-1971
                Article
                PCOMPBIOL-D-24-00005
                10.1371/journal.pcbi.1012113
                11230636
                38728362
                3ff6e611-2f4b-4c98-8783-7373715252f0
                © 2024 Wang et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 2 January 2024
                Accepted: 24 April 2024
                Page count
                Figures: 8, Tables: 2, Pages: 25
                Funding
                Funded by: National Key Research and Development Program of China; Award ID: 2022YFC2704804
                Funded by: National Natural Science Foundation of China (funder ID: http://dx.doi.org/10.13039/501100001809); Award ID: 61931013
                This study was supported by the National Key Research and Development Program of China (2022YFC2704804 to BG), and the National Natural Science Foundation of China (61931013 to BG). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Medicine and Health Sciences > Oncology > Cancers and Neoplasms > Carcinoma > Hepatocellular Carcinoma
                Medicine and Health Sciences > Oncology > Cancers and Neoplasms > Gastrointestinal Tumors > Hepatocellular Carcinoma
                Medicine and Health Sciences > Gastroenterology and Hepatology > Liver Diseases > Hepatocellular Carcinoma
                Biology and Life Sciences > Biochemistry > Proteins > Protein Domains
                Biology and life sciences > Cell biology > Chromosome biology > Chromatin > Chromatin modification > DNA methylation
                Biology and life sciences > Genetics > Epigenetics > Chromatin > Chromatin modification > DNA methylation
                Biology and life sciences > Genetics > Gene expression > Chromatin > Chromatin modification > DNA methylation
                Biology and life sciences > Genetics > DNA > DNA modification > DNA methylation
                Biology and life sciences > Biochemistry > Nucleic acids > DNA > DNA modification > DNA methylation
                Biology and life sciences > Genetics > Epigenetics > DNA modification > DNA methylation
                Biology and life sciences > Genetics > Gene expression > DNA modification > DNA methylation
                Biology and Life Sciences > Genetics > Mutation
                Biology and Life Sciences > Genetics > Gene Expression
                Medicine and Health Sciences > Oncology > Cancer Treatment
                Computer and Information Sciences > Artificial Intelligence > Machine Learning
                Medicine and Health Sciences > Diagnostic Medicine > Prognosis
                Custom metadata
                Version-of-record update to uncorrected proof: 2024-07-08
                Data availability: The data used in this study are all from the public databases TCGA (https://www.cancer.gov/ccg/research/genome-sequencing/tcga) and ICGC (https://dcc.icgc.org/). The Fudan cohort and single-cell dataset are from the GEO database, with accession numbers GSE14520 and GSE149614, respectively. The LIMORE dataset was obtained from its original paper (PMID: 31378681). To facilitate clinical translation, we developed an interactive HCC prognosis model: https://mike-wang-bjut.shinyapps.io/DynNomapp_HCC_Sutypes/. The subtype classifiers SVM_10 and SVM_TTK, along with the relevant code, are stored at https://github.com/Mike-W29/SVM_model_for_HCC_subtype.

                Quantitative & Systems biology