
      MOSTWAS: Multi-Omic Strategies for Transcriptome-Wide Association Studies

      research-article
      PLoS Genetics
      Public Library of Science


          Abstract

          Traditional predictive models for transcriptome-wide association studies (TWAS) consider only single nucleotide polymorphisms (SNPs) local to genes of interest and perform parameter shrinkage with a regularization process. These approaches ignore the effect of distal-SNPs or other molecular effects underlying the SNP-gene association. Here, we outline multi-omics strategies for transcriptome imputation from germline genetics to allow more powerful testing of gene-trait associations by prioritizing distal-SNPs to the gene of interest. In one extension, we identify mediating biomarkers (CpG sites, microRNAs, and transcription factors) highly associated with gene expression and train predictive models for these mediators using their local SNPs. Imputed values for mediators are then incorporated into the final predictive model of gene expression, along with local SNPs. In the second extension, we assess distal-eQTLs (SNPs associated with genes that are not in a local window around them) for their mediation effect through mediating biomarkers local to these distal-eSNPs. Distal-eSNPs with large indirect mediation effects are then included in the transcriptomic prediction model with the local SNPs around the gene of interest. Using simulations and real data from ROS/MAP brain tissue and TCGA breast tumors, we show considerable gains in percent variance explained of gene expression (1–2% additive increase) and in TWAS power to detect gene-trait associations. This integrative approach to transcriptome-wide imputation and association studies aids in identifying the complex interactions underlying genetic regulation within a tissue and important risk genes for various traits and disorders.
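
          The first extension described in the abstract (impute mediators from their own local SNPs, then add the imputed mediators to the local-SNP model of expression) can be illustrated with a small two-stage sketch. This is not the MOSTWAS implementation: the simulated genotypes, the closed-form ridge regression, the penalty `lam`, and every function name below are assumptions chosen for illustration only.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression on centered data; returns (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def ridge_predict(X, model):
    coef, intercept = model
    return X @ coef + intercept

def r_squared(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def simulate(n=600, p_local=20, p_distal=20, seed=0):
    """Toy data (hypothetical): a mediator driven by distal SNPs affects expression."""
    rng = np.random.default_rng(seed)
    local = rng.binomial(2, 0.3, size=(n, p_local)).astype(float)
    distal = rng.binomial(2, 0.3, size=(n, p_distal)).astype(float)
    mediator = distal @ rng.normal(0, 0.5, p_distal) + rng.normal(0, 1, n)
    expression = (local @ rng.normal(0, 0.3, p_local)
                  + 0.8 * mediator + rng.normal(0, 1, n))
    return local, distal, mediator, expression

def compare_models(seed=0, lam=1.0):
    """Return held-out R^2 for (local SNPs only, local SNPs + imputed mediator)."""
    local, distal, mediator, expr = simulate(seed=seed)
    n = len(expr)
    train, test = np.arange(n // 2), np.arange(n // 2, n)

    # Baseline: expression predicted from local SNPs only.
    base = ridge_fit(local[train], expr[train], lam)
    r2_local = r_squared(expr[test], ridge_predict(local[test], base))

    # Stage 1: predict the mediator from the SNPs local to the mediator.
    med_model = ridge_fit(distal[train], mediator[train], lam)
    med_hat = ridge_predict(distal, med_model)
    # Simplification: training samples use in-sample fitted mediator values;
    # a careful analysis would impute them by cross-validation instead.

    # Stage 2: predict expression from local SNPs plus the imputed mediator.
    X2 = np.column_stack([local, med_hat])
    full = ridge_fit(X2[train], expr[train], lam)
    r2_multi = r_squared(expr[test], ridge_predict(X2[test], full))
    return r2_local, r2_multi
```

          Calling `compare_models()` returns the held-out R² of the local-only model and of the two-stage model on the same simulated samples. Any gain seen here reflects the strong mediated signal built into the toy simulation and says nothing about the size of the gains reported for real data.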

          Author summary

          Transcriptome-wide association studies (TWAS) are a powerful strategy to study gene-trait associations by integrating genome-wide association studies (GWAS) with gene expression datasets. TWAS increase study power and interpretability by mapping genetic variants to genes. However, traditional TWAS consider only variants that are close to a gene and thus ignore important variants far away from the gene that may be involved in complex regulatory mechanisms. Here, we present MOSTWAS (Multi-Omic Strategies for TWAS), a suite of tools that extends the TWAS framework to include these distal variants. MOSTWAS leverages multi-omic data on regulatory biomarkers (transcription factors, microRNAs, epigenetics) and borrows techniques from mediation analysis to prioritize distal variants that lie near these regulatory biomarkers. Using simulations and real public data from brain tissue and breast tumors, we show that MOSTWAS improves upon traditional TWAS in both predictive performance and power to detect gene-trait associations. MOSTWAS also aids in identifying possible mechanisms for gene regulation using a novel added-last test that assesses the added information gained from the distal variants beyond the local association. In conclusion, our method aids in detecting important risk genes for traits and disorders and the possible complex interactions underlying genetic regulation within a tissue.
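
          The mediation-analysis idea used to prioritize distal variants can be sketched with a generic product-of-coefficients estimate of an indirect effect (SNP to mediator, then mediator to expression), assessed by permuting the SNP. This is a textbook-style illustration under stated assumptions, not the test implemented in MOSTWAS: the single-SNP, single-mediator setup, the simulated effect sizes, the permutation scheme, and all function names are hypothetical.

```python
import numpy as np

def ols_slopes(X, y):
    """Least-squares slopes of y on the columns of X (intercept fitted, not returned)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1:]

def indirect_effect(snp, mediator, expression):
    """Product-of-coefficients estimate of the SNP effect routed through the mediator."""
    a = ols_slopes(snp[:, None], mediator)[0]  # SNP -> mediator
    # Mediator -> expression, adjusting for the SNP's direct effect.
    b = ols_slopes(np.column_stack([mediator, snp]), expression)[0]
    return a * b

def permutation_pvalue(snp, mediator, expression, n_perm=500, seed=0):
    """Two-sided permutation p-value; shuffling the SNP breaks the SNP -> mediator link."""
    rng = np.random.default_rng(seed)
    observed = indirect_effect(snp, mediator, expression)
    null = np.array([
        indirect_effect(rng.permutation(snp), mediator, expression)
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(np.abs(null) >= abs(observed))) / (n_perm + 1)
    return observed, p_value

def simulate_mediated(n=400, a=0.8, b=0.7, seed=1):
    """Toy data (hypothetical): the SNP affects expression only through the mediator."""
    rng = np.random.default_rng(seed)
    snp = rng.binomial(2, 0.3, size=n).astype(float)
    mediator = a * snp + rng.normal(0, 1, n)
    expression = b * mediator + rng.normal(0, 1, n)
    return snp, mediator, expression
```

          For example, `permutation_pvalue(*simulate_mediated())` returns the estimated indirect effect and its permutation p-value; in this toy setup the true indirect effect is `a * b`. Variants with large, significant indirect effects would be the candidates to carry forward into the expression model.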


                Author and article information

                Contributors
                Roles: Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing
                Roles: Funding acquisition, Validation, Visualization, Writing – review & editing
                Roles: Conceptualization, Funding acquisition, Methodology, Supervision, Validation, Visualization, Writing – review & editing
                Role: Editor
                Journal
                PLoS Genetics (PLoS Genet)
                Public Library of Science (San Francisco, CA USA)
                ISSN: 1553-7390, 1553-7404
                Published: 8 March 2021
                Volume 17, Issue 3, e1009398
                Affiliations
                [1] Department of Pathology and Laboratory Medicine, University of California-Los Angeles, Los Angeles, California, United States of America
                [2] Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
                [3] Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
                [4] Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
                Case Western Reserve University, UNITED STATES
                Author notes

                The authors have declared that no competing interests exist.

                Author information
                https://orcid.org/0000-0003-1196-4385
                https://orcid.org/0000-0002-9275-4189
                https://orcid.org/0000-0001-8401-0545
                Article
                Manuscript number: PGENETICS-D-20-01017
                DOI: 10.1371/journal.pgen.1009398
                PMCID: PMC7971899
                PMID: 33684137
                © 2021 Bhattacharya et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 27 June 2020
                Accepted: 4 February 2021
                Page count
                Figures: 5, Tables: 0, Pages: 30
                Funding
                Funded by: National Institute of Environmental Health Sciences (funder-id http://dx.doi.org/10.13039/100000066); Award ID: P30-ES010126
                Funded by: National Heart, Lung, and Blood Institute (funder-id http://dx.doi.org/10.13039/100000050); Award ID: R01-HL129132
                Funded by: National Institute of General Medical Sciences (funder-id http://dx.doi.org/10.13039/100000057); Award ID: R01-GM105785
                Funded by: National Heart, Lung, and Blood Institute (funder-id http://dx.doi.org/10.13039/100000050); Award ID: R01-HL146500
                Funded by: National Institute of Child Health and Human Development (funder-id http://dx.doi.org/10.13039/100000071); Award ID: U54-HD079124
                Funded by: National Cancer Institute (funder-id http://dx.doi.org/10.13039/100000054); Award ID: P01-CA142538
                Funded by: National Institute of Mental Health (funder-id http://dx.doi.org/10.13039/100000025); Award ID: R01-MH118349
                A.B. is supported by the National Institute of Environmental Health Sciences (P30-ES010126, https://www.niehs.nih.gov/), Y.L. is partially supported by the National Heart, Lung, and Blood Institute (R01-HL129132, https://www.nhlbi.nih.gov/), the National Institute of General Medical Sciences (R01-GM105785, https://www.nigms.nih.gov/), the National Heart, Lung, and Blood Institute (R01-HL146500, https://www.nhlbi.nih.gov/), and the National Institute of Child Health and Human Development (U54-HD079124, https://www.nichd.nih.gov/). M.I.L. is supported by the National Cancer Institute (P01-CA142538, https://www.cancer.gov/), the National Institute of Environmental Health Sciences (P30-ES010126, https://www.niehs.nih.gov/), and the National Institute of Mental Health (R01-MH118349, https://www.nimh.nih.gov). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences > Computational Biology > Genome Analysis > Genome-Wide Association Studies
                Biology and Life Sciences > Genetics > Genomics > Genome Analysis > Genome-Wide Association Studies
                Biology and Life Sciences > Genetics > Human Genetics > Genome-Wide Association Studies
                Biology and Life Sciences > Genetics > Gene Expression
                Biology and Life Sciences > Genetics > Heredity
                Biology and Life Sciences > Genetics > Genetic Loci
                Research and Analysis Methods > Mathematical and Statistical Techniques > Statistical Methods > Forecasting
                Physical Sciences > Mathematics > Statistics > Statistical Methods > Forecasting
                Physical Sciences > Mathematics > Discrete Mathematics > Combinatorics > Permutation
                Biology and Life Sciences > Computational Biology > Genome Analysis > Transcriptome Analysis
                Biology and Life Sciences > Genetics > Genomics > Genome Analysis > Transcriptome Analysis
                Medicine and Health Sciences > Oncology > Cancers and Neoplasms > Breast Tumors > Breast Cancer
                Custom metadata
                vor-update-to-uncorrected-proof: 2021-03-18
                All relevant data are referred to within the manuscript and its Supporting Information files:
                • MOSTWAS software: https://github.com/bhattacharya-a-bt/MOSTWAS
                • Sample code: https://github.com/bhattacharya-a-bt/mostwas_suppdata
                • Models and full results: https://doi.org/10.5281/zenodo.4314067 [32]
                • TCGA GDC Legacy Archive: https://portal.gdc.cancer.gov/legacy-archive
                • GDAC Firehose Browser: https://gdac.broadinstitute.org
                • ROS/MAP data: https://www.synapse.org/#!Synapse:syn3219045
                • iCOGS GWAS Summary Statistics: http://bcac.ccge.medschl.cam.ac.uk/bcacdata/icogs-complete-summary-results
                • IGAP Late-onset Alzheimer's Disease Risk GWAS Summary Statistics: http://web.pasteur-lille.fr/en/recherche/u744/igap/igap_download.php
                • PGC Major Depressive Disorder GWAS Summary Statistics: https://www.med.unc.edu/pgc/download-results/mdd/
                • UKBB Major Depressive Disorder GWAX Summary Statistics: http://gwas-browser.nygenome.org/downloads/gwas-browser/
                • PsychENCODE Project: http://resource.psychencode.org/
