
      Functional Cohesion of Gene Sets Determined by Latent Semantic Indexing of PubMed Abstracts

      research-article


          Abstract

          High-throughput genomic technologies enable researchers to identify genes that are co-regulated with respect to specific experimental conditions. Numerous statistical approaches have been developed to identify differentially expressed genes. Because each approach can produce distinct gene sets, it is difficult for biologists to determine which statistical approach yields biologically relevant gene sets and is appropriate for their study. To address this issue, we implemented Latent Semantic Indexing (LSI) to determine the functional coherence of gene sets. An LSI model was built using over 1 million Medline abstracts for over 20,000 mouse and human genes annotated in Entrez Gene. The gene-to-gene LSI-derived similarities were used to calculate a literature cohesion p-value (LPv) for a given gene set using a Fisher's exact test. We tested this method against genes in more than 6,000 functional pathways annotated in Gene Ontology (GO) and found that approximately 75% of the gene sets in the GO biological process category and 90% of the gene sets in the GO molecular function and cellular component categories were functionally cohesive (LPv < 0.05). These results indicate that the LPv methodology is both robust and accurate. Application of this method to previously published microarray datasets demonstrated that LPv can be helpful in selecting appropriate feature extraction methods. To enable real-time calculation of LPv for mouse or human gene sets, we developed a web tool called Gene-set Cohesion Analysis Tool (GCAT). GCAT can complement other gene set enrichment approaches by determining the overall functional cohesion of data sets, taking into account both explicit and implicit gene interactions reported in the biomedical literature.
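          The abstract does not spell out how the Fisher's exact test is set up, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a precomputed gene-to-gene similarity table (standing in for the LSI-derived similarities), a hypothetical similarity threshold, and a 2x2 table that compares gene pairs inside the set with all remaining pairs. The names `cohesion_p_value`, `fisher_exact_greater`, and the toy data are illustrative assumptions.

```python
from itertools import combinations
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)
    tail = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / denom


def cohesion_p_value(similarity, all_genes, gene_set, threshold):
    """Assumed construction: are strongly similar pairs enriched inside gene_set?

    similarity maps frozenset({gene1, gene2}) to a similarity score.
    """
    members = set(gene_set)
    a = b = c = d = 0
    for g1, g2 in combinations(sorted(all_genes), 2):
        strong = similarity[frozenset((g1, g2))] >= threshold
        inside = g1 in members and g2 in members
        if inside and strong:
            a += 1  # pair within the set, above threshold
        elif inside:
            b += 1  # pair within the set, below threshold
        elif strong:
            c += 1  # other pair, above threshold
        else:
            d += 1  # other pair, below threshold
    return fisher_exact_greater(a, b, c, d)


# Toy data (hypothetical): three mutually similar genes among six.
ALL_GENES = ["g1", "g2", "g3", "g4", "g5", "g6"]
GENE_SET = ["g1", "g2", "g3"]
SIMILARITY = {
    frozenset(pair): (0.9 if set(pair) <= set(GENE_SET) else 0.1)
    for pair in combinations(ALL_GENES, 2)
}

lpv = cohesion_p_value(SIMILARITY, ALL_GENES, GENE_SET, threshold=0.5)
print(lpv)  # 1/455, about 0.0022
```

          In this toy case all three within-set pairs exceed the threshold and none of the twelve other pairs do, so the table is [[3, 0], [0, 12]] and the set would count as cohesive at the 0.05 level used in the abstract.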

          Availability

          GCAT is freely available at http://binf1.memphis.edu/gcat


                Author and article information

                Contributors
                Role: Editor
                Journal: PLoS ONE
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 14 April 2011
                Volume 6, issue 4, article e18851
                Affiliations
                [1] Bioinformatics Program, University of Memphis, Memphis, Tennessee, United States of America
                [2] Department of Mathematical Sciences, University of Memphis, Memphis, Tennessee, United States of America
                [3] Department of Computer Science, University of Memphis, Memphis, Tennessee, United States of America
                [4] Computable Genomix, Memphis, Tennessee, United States of America
                [5] Department of Electrical and Computer Engineering, University of Tennessee, Knoxville, Tennessee, United States of America
                [6] Department of Biological Sciences, University of Memphis, Memphis, Tennessee, United States of America
                Cairo University, Egypt
                Author notes

                Conceived and designed the experiments: RH LX. Performed the experiments: LX. Analyzed the data: LX. Contributed reagents/materials/analysis tools: NF YL. Wrote the paper: RH LX. Generated the gene-by-gene similarity matrix by LSI under the direction of M. W. Berry: KH. Directed KH in generating gene-by-gene similarity matrix by LSI: MWB. Provided statistical supervision of the study: EOG.

                [¤] Current address: Computer Science Department, University of California Los Angeles, Los Angeles, California, United States of America

                Article
                Manuscript ID: PONE-D-10-06046
                DOI: 10.1371/journal.pone.0018851
                PMCID: PMC3077411
                PMID: 21533142
                Xu et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 10 November 2010
                Accepted: 21 March 2011
                Page count
                Pages: 9
                Categories
                Research Article
                Biology
                Computational Biology
                Microarrays
                Text Mining
                Genetics
                Gene Function
                Genomics
                Functional Genomics
