      Identifying Relationships among Genomic Disease Regions: Predicting Genes at Pathogenic SNP Associations and Rare Deletions

      research-article

          Abstract

          Translating a set of disease regions into insight about pathogenic mechanisms requires not only the ability to identify the key disease genes within them, but also the biological relationships among those key genes. Here we describe a statistical method, Gene Relationships Among Implicated Loci (GRAIL), that takes a list of disease regions and automatically assesses the degree of relatedness of implicated genes using 250,000 PubMed abstracts. We first evaluated GRAIL by assessing its ability to identify subsets of highly related genes in common pathways from validated lipid and height SNP associations from recent genome-wide studies. We then tested GRAIL by assessing its ability to separate true disease regions from many false positive disease regions in two separate practical applications in human genetics. First, we took 74 nominally associated Crohn's disease SNPs and applied GRAIL to identify a subset of 13 SNPs with highly related genes. Of these, ten were convincingly validated in follow-up genotyping; genotyping results for the remaining three were inconclusive. Next, we applied GRAIL to 165 rare deletion events seen in schizophrenia cases (less than one-third of which are contributing to disease risk). We demonstrate that GRAIL is able to identify a subset of 16 deletions containing highly related genes; many of these genes are expressed in the central nervous system and play a role in neuronal synapses. GRAIL offers a statistically robust approach to identifying functionally related genes from across multiple disease regions that likely represent key disease pathways. An online version of this method is available for public use (http://www.broad.mit.edu/mpg/grail/).

          Author Summary

          Modern genetic studies, including genome-wide surveys for disease-associated loci and copy number variation, provide a list of critical genomic regions that play an important role in predisposition to disease. Using these regions to understand disease pathogenesis requires the ability to first distinguish causal genes from other nearby genes spuriously contained within these regions. To do this we must identify the key pathways suggested by those causal genes. In this manuscript we describe a statistical approach, Gene Relationships Across Implicated Loci (GRAIL), to achieve this task. It starts with genomic regions and identifies related subsets of genes involved in similar biological processes—these genes highlight the likely causal genes and the key pathways. GRAIL uses abstracts from the entirety of the published scientific literature about the genes to look for potential relationships between genes. We apply GRAIL to four very different phenotypes. In each case we identify a subset of highly related genes; in cases where false positive regions are present, GRAIL is able to separate out likely true positives. GRAIL therefore offers the potential to translate disease genomic regions from unbiased genomic surveys into the key processes that may be critical to the disease.
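
The idea in the summary above (start from regions, then look for genes in different regions whose literature is similar) can be illustrated with a toy sketch. This is not GRAIL's statistical procedure, which the abstract does not specify in detail; it is a simplified stand-in that uses bag-of-words cosine similarity and an arbitrary threshold. The gene names, region names, text snippets, the `threshold` value and the function names are all hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def word_vector(text):
    """Bag-of-words counts for a piece of text (hypothetical stand-in for abstracts)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_regions(regions, gene_text, threshold=0.3):
    """For each region, pick the gene that is textually similar to genes in
    the largest number of OTHER regions. Returns {region: (gene, count)}.
    The threshold is an arbitrary illustrative choice, not a GRAIL parameter."""
    vectors = {gene: word_vector(text) for gene, text in gene_text.items()}
    result = {}
    for region, genes in regions.items():
        best_gene, best_count = None, -1
        for gene in genes:
            count = 0
            for other_region, other_genes in regions.items():
                if other_region == region:
                    continue  # only relationships across regions count
                if any(cosine(vectors[gene], vectors[o]) >= threshold
                       for o in other_genes):
                    count += 1
            if count > best_count:
                best_gene, best_count = gene, count
        result[region] = (best_gene, best_count)
    return result


# Entirely made-up toy data: three regions, five placeholder genes.
regions = {
    "region1": ["GENE_A", "GENE_B"],
    "region2": ["GENE_C"],
    "region3": ["GENE_D", "GENE_E"],
}
gene_text = {
    "GENE_A": "lipid transport cholesterol metabolism",
    "GENE_B": "olfactory receptor signaling",
    "GENE_C": "cholesterol lipid binding",
    "GENE_D": "lipid metabolism enzyme",
    "GENE_E": "keratin filament structure",
}

result = score_regions(regions, gene_text)
for region, (gene, count) in result.items():
    print(region, gene, count)
```

In this toy input, the genes whose text shares vocabulary across regions (GENE_A, GENE_C, GENE_D) are selected over the unrelated genes that merely sit in the same region. A real analysis would also need a significance assessment, which this sketch omits.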

                Author and article information

                Contributors
                Role: Editor
                Journal: PLoS Genetics (PLoS Genet)
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1553-7390, 1553-7404
                Published: 26 June 2009
                Volume: 5
                Issue: 6
                Article number: e1000534
                Affiliations
                [1] Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
                [2] Center for Human Genetic Research, Massachusetts General Hospital, Boston, Massachusetts, United States of America
                [3] Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
                [4] Harvard Medical School – Partners HealthCare Center for Genetics and Genomics, Boston, Massachusetts, United States of America
                [5] Harvard-MIT Health Sciences and Technology, Cambridge, Massachusetts, United States of America
                [6] Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, Massachusetts, United States of America
                [7] Gastroenterology Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
                [8] Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
                [9] Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
                [10] Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts, United States of America
                [11] Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts, United States of America
                [12] Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
                [13] Diabetes Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
                Princeton University, United States of America
                Author notes

                Conceived and designed the experiments: SR RMP EJR SMP PS DA MJD. Performed the experiments: SR EJR ACYN International Schizophrenia Consortium. Analyzed the data: SR ACYN EMS RJX MJD. Contributed reagents/materials/analysis tools: SR International Schizophrenia Consortium EMS DA MJD. Wrote the paper: SR RMP EJR ACYN SMP PS EMS RJX DA MJD. Critically read and contributed to the final manuscript: SR RMP SJR ACYN SMP PS EMS RJX DA MJD.

                Article
                Manuscript ID: 09-PLGE-RA-0248R2
                DOI: 10.1371/journal.pgen.1000534
                PMCID: PMC2694358
                PMID: 19557189
                9fc6529e-2d14-4d22-9d20-48b06d37adc2
                Raychaudhuri et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 16 February 2009
                Accepted: 22 May 2009
                Page count
                Pages: 15
                Categories
                Research Article
                Computational Biology/Genomics
                Computational Biology/Literature Analysis
                Computational Biology/Systems Biology
                Genetics and Genomics/Bioinformatics
                Genetics and Genomics/Complex Traits
                Genetics and Genomics/Gene Discovery
                Genetics and Genomics/Genetics of Disease
                Immunology/Autoimmunity

                Genetics
