      Gene set proximity analysis: expanding gene set enrichment analysis through learned geometric embeddings, with drug-repurposing applications in COVID-19

      research-article


          Abstract

          Motivation

          Gene set analysis methods rely on knowledge-based representations of genetic interactions in the form of both gene set collections and protein–protein interaction (PPI) networks. However, explicit representations of genetic interactions often fail to capture complex interdependencies among genes, limiting the analytic power of such methods.

          Results

          We propose an extension of gene set enrichment analysis to a latent embedding space reflecting PPI network topology, called gene set proximity analysis (GSPA). Compared with existing methods, GSPA provides improved ability to identify disease-associated pathways in disease-matched gene expression datasets, while improving reproducibility of enrichment statistics for similar gene sets. GSPA is statistically straightforward, reducing to a version of traditional gene set enrichment analysis through a single user-defined parameter. We apply our method to identify novel drug associations with SARS-CoV-2 viral entry. Finally, we validate our drug association predictions through retrospective clinical analysis of claims data from 8 million patients, supporting a role for gabapentin as a risk factor and metformin as a protective factor for severe COVID-19.
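The idea of moving enrichment analysis into an embedding space can be illustrated with a minimal sketch. It is not the GSPA implementation: the function names (`expand_gene_set`, `enrichment_score`), the radius parameter `r`, the Euclidean distance, the unweighted running-sum statistic, and the toy 2-D embeddings are all assumptions made for illustration. The sketch adds to a gene set every gene lying strictly within distance `r` of a member in the embedding space, then computes a GSEA-style running-sum score over a ranked gene list. With `r = 0` nothing is added, so the score is that of the original gene set.

```python
import math


def expand_gene_set(gene_set, embeddings, r):
    """Return gene_set plus every embedded gene strictly within distance r of a member."""
    members = [g for g in gene_set if g in embeddings]
    expanded = set(gene_set)
    for gene, vec in embeddings.items():
        if any(math.dist(vec, embeddings[m]) < r for m in members):
            expanded.add(gene)
    return expanded


def enrichment_score(ranked_genes, gene_set):
    """Unweighted GSEA-style running sum; returns the largest deviation from zero."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up on a set member, step down otherwise.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical toy data: gene names, coordinates and ranking are made up.
embeddings = {
    "A": (0.0, 0.0),
    "B": (0.1, 0.0),   # close to A in the embedding space
    "C": (5.0, 5.0),
    "D": (5.1, 5.0),
    "E": (10.0, 0.0),
}
ranked = ["B", "A", "C", "D", "E"]  # most to least differentially expressed
gene_set = {"A"}

for r in (0.0, 0.5):
    expanded = expand_gene_set(gene_set, embeddings, r)
    print(r, sorted(expanded), enrichment_score(ranked, expanded))
```

In this toy ranking, the top gene `B` is not annotated to the set but sits next to `A` in the embedding, so the score rises from 0.75 at `r = 0` to 1.0 at `r = 0.5`. An explicit gene set alone would miss that association.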

          Availability and implementation

          GSPA is available for download as a command-line Python package at https://github.com/henrycousins/gspa.

          Supplementary information

          Supplementary data are available at Bioinformatics online.

          Most cited references (44)


          Gene Ontology: tool for the unification of biology

          Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

            Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles

            Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

              Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources.

              DAVID bioinformatics resources consists of an integrated biological knowledgebase and analytic tools aimed at systematically extracting biological meaning from large gene/protein lists. This protocol explains how to use DAVID, a high-throughput and integrated data-mining environment, to analyze gene lists derived from high-throughput genomic experiments. The procedure first requires uploading a gene list containing any number of common gene identifiers followed by analysis using one or more text and pathway-mining tools such as gene functional classification, functional annotation chart or clustering and functional annotation table. By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.

                Author and article information

                Contributors
                Role: Associate Editor
                Journal
                Bioinformatics
                Oxford University Press
                1367-4803
                1367-4811
                January 2023
                17 November 2022
                Volume: 39
                Issue: 1
                Article ID: btac735
                Affiliations
                Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
                Optum Labs at UnitedHealth Group, Minneapolis, MN 55343, USA
                Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
                Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
                Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
                Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
                Author notes
                To whom correspondence should be addressed. Email: cousinsh@stanford.edu or russ.altman@stanford.edu
                Author information
                https://orcid.org/0000-0002-8694-0604
                https://orcid.org/0000-0003-4725-8714
                https://orcid.org/0000-0003-3859-2905
                Article
                btac735
                DOI: 10.1093/bioinformatics/btac735
                PMCID: PMC9805577
                PMID: 36394254
                © The Author(s) 2022. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                04 March 2022
                27 September 2022
                05 November 2022
                16 November 2022
                30 November 2022
                Page count
                Pages: 8
                Funding
                Funded by: National Institutes of Health, DOI 10.13039/100000002
                Award ID: GM007365
                Award ID: GM102365
                Funded by: Knight-Hennessy Scholarships
                Funded by: UnitedHealth Group Research and Development
                Categories
                Original Paper
                Gene Expression
                Bioinformatics & Computational biology
