Fine-mapping causal tissues and genes at disease-associated loci

Abstract

Heritable diseases often manifest in a highly tissue-specific manner, with different disease loci mediated by genes in distinct tissues or cell types. We propose Tissue-Gene Fine-Mapping (TGFM), a fine-mapping method that infers the posterior probability (PIP) for each gene-tissue pair to mediate a disease locus by analyzing GWAS summary statistics (and in-sample LD) and leveraging eQTL data from diverse tissues to build cis-predicted expression models; TGFM also assigns PIPs to causal variants that are not mediated by gene expression in assayed genes and tissues. TGFM accounts for both co-regulation across genes and tissues and LD between SNPs (generalizing existing fine-mapping methods), and incorporates genome-wide estimates of each tissue’s contribution to disease as tissue-level priors. TGFM was well-calibrated and moderately well-powered in simulations; unlike previous methods, TGFM was able to attain correct calibration by modeling uncertainty in cis-predicted expression models. We applied TGFM to 45 UK Biobank diseases/traits (average $N = 316$K) using eQTL data from 38 GTEx tissues. TGFM identified an average of 147 PIP > 0.5 causal genetic elements per disease/trait, of which 11% were gene-tissue pairs. Implicated gene-tissue pairs were concentrated in known disease-critical tissues, and causal genes were strongly enriched in disease-relevant gene sets. Causal gene-tissue pairs identified by TGFM recapitulated known biology (e.g., TPO-thyroid for Hypothyroidism), but also included biologically plausible novel findings (e.g., SLC20A2-artery aorta for Diastolic blood pressure). Further application of TGFM to single-cell eQTL data from 9 cell types in peripheral blood mononuclear cells (PBMC), analyzed jointly with GTEx tissues, identified 30 additional causal gene-PBMC cell type pairs at PIP > 0.5, primarily for autoimmune disease and blood cell traits, including the well-established role of CTLA4 in CD8⁺ T cells for All autoimmune disease.
In conclusion, TGFM is a robust and powerful method for fine-mapping causal tissues and genes at disease-associated loci.
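
To make the notion of a PIP with tissue-level priors concrete, the following is a minimal toy sketch, not the TGFM algorithm. It assumes exactly one causal element per locus and uses Wakefield-style approximate Bayes factors on z-scores, whereas TGFM as described above additionally models LD between SNPs, co-regulation across genes and tissues, and uncertainty in the predicted expression models. The function name `single_effect_pips`, the `prior_var` default, the element labels, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def single_effect_pips(z, prior_weights, prior_var=25.0):
    """Toy PIPs assuming exactly one causal element in the locus.

    z             : marginal z-scores, one per candidate element
    prior_weights : positive prior weights (e.g. tissue-level priors);
                    they are normalized to sum to 1
    prior_var     : assumed prior variance of the effect on the z-score scale
    """
    z = np.asarray(z, dtype=float)
    w = np.asarray(prior_weights, dtype=float)
    w = w / w.sum()
    # Approximate Bayes factor: z ~ N(0, 1 + prior_var) if causal, N(0, 1) if not.
    log_bf = -0.5 * np.log1p(prior_var) + 0.5 * z**2 * prior_var / (1.0 + prior_var)
    log_post = np.log(w) + log_bf
    log_post -= log_post.max()          # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Hypothetical locus: two gene-tissue pairs and two non-mediated variants.
labels = ["geneA-tissue1", "geneA-tissue2", "variant1", "variant2"]
kinds = ["gene-tissue", "gene-tissue", "variant", "variant"]
z_scores = [5.2, 3.1, 4.0, 1.0]         # made-up values
priors = [2.0, 1.0, 1.0, 1.0]           # made-up: tissue1 up-weighted

pips = single_effect_pips(z_scores, priors)
for label, kind, pip in zip(labels, kinds, pips):
    flag = "PIP > 0.5" if pip > 0.5 else ""
    print(f"{label:15s} {kind:12s} {pip:.3f} {flag}")
```

In this sketch the prior weights only shift posterior mass between elements with comparable evidence; with equal z-scores, the PIPs are simply proportional to the priors.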

Author and article information

Journal

Journal ID (nlm-ta): medRxiv

Journal ID (publisher-id): MEDRXIV

Title: medRxiv

Publisher: Cold Spring Harbor Laboratory

Publication date (Electronic): 08 November 2023

Electronic Location Identifier: 2023.11.01.23297909

Affiliations

[1] Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA

[2] Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA

[3] Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA

[4] Department of Medicine, University of California San Diego, La Jolla, CA, USA

[5] Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA

[6] Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA

Author notes

[&] Corresponding authors: bstrober@hsph.harvard.edu and aprice@hsph.harvard.edu

Article

DOI: 10.1101/2023.11.01.23297909

PMC ID: 10635248

PubMed ID: 37961337

SO-VID: 2deea64b-6165-4e82-a2b1-ac9f2cfbced1

License:

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License, which allows reusers to copy and distribute the material in any medium or format in unadapted form only, and only so long as attribution is given to the creator. The license allows for commercial use.
