      MendelianRandomization: an R package for performing Mendelian randomization analyses using summarized data


          Abstract

          MendelianRandomization is a software package for the R open-source software environment that performs Mendelian randomization analyses using summarized data. Its core functionality is to implement the inverse-variance weighted, MR-Egger and weighted median methods for multiple genetic variants. Several options are available to the user, such as robust regression, fixed- or random-effects models, and the penalization of weights for genetic variants with heterogeneous causal estimates. Extensions to these methods, such as allowing for variants to be correlated, can be chosen if appropriate. Graphical commands display summarized data in an interactive graph, or plot causal estimates from multiple methods for comparison. Although the main method of data entry is directly by the user, there is also an option for incorporating summarized data from the PhenoScanner database of genotype-phenotype associations. We hope to develop this feature in future versions of the package. The R software environment is available for download from https://www.r-project.org/. The MendelianRandomization package can be downloaded from the Comprehensive R Archive Network (CRAN) within R, or directly from https://cran.r-project.org/web/packages/MendelianRandomization/. Both R and the MendelianRandomization package are released under GNU General Public Licenses (GPL-2|GPL-3).
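          The three core estimators named in the abstract can be illustrated outside R. The following is a minimal Python sketch of the standard textbook formulas for point estimates from summarized data; it is not the package's implementation, and the function names and the four-variant data set are hypothetical, made up for illustration. It assumes uncorrelated variants and uses first-order weights based only on the outcome standard errors.

```python
import math

# Hypothetical summarized data for four uncorrelated variants (illustrative only):
# bx = variant-exposure associations, by = variant-outcome associations,
# byse = standard errors of the variant-outcome associations.
bx = [0.10, 0.20, 0.15, 0.30]
byse = [0.02, 0.03, 0.025, 0.04]
by = [0.5 * x for x in bx]  # constructed so that every ratio by/bx is about 0.5


def ivw(bx, by, byse):
    """Fixed-effect inverse-variance weighted estimate and its standard error.

    Equivalent to a weighted regression of by on bx through the origin,
    with weights 1 / byse^2.
    """
    num = sum(x * y / s ** 2 for x, y, s in zip(bx, by, byse))
    den = sum(x ** 2 / s ** 2 for x, s in zip(bx, byse))
    return num / den, 1.0 / math.sqrt(den)


def egger(bx, by, byse):
    """MR-Egger point estimates: weighted regression of by on bx with an intercept.

    Returns (intercept, slope). Variants are first oriented so that each
    variant-exposure association is positive.
    """
    signs = [1.0 if x >= 0 else -1.0 for x in bx]
    x_or = [abs(x) for x in bx]
    y_or = [y * sg for y, sg in zip(by, signs)]
    w = [1.0 / s ** 2 for s in byse]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x_or)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y_or)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x_or))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x_or, y_or))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def weighted_median(bx, by, byse):
    """Weighted median of the per-variant ratio estimates by/bx.

    Weights are the first-order inverse variances (bx / byse)^2; the estimate
    is linearly interpolated at the 50% point of the cumulative weights.
    Requires at least two variants.
    """
    ratios = [y / x for x, y in zip(bx, by)]
    weights = [(x / s) ** 2 for x, s in zip(bx, byse)]
    order = sorted(range(len(ratios)), key=lambda j: ratios[j])
    theta = [ratios[j] for j in order]
    w = [weights[j] for j in order]
    total = sum(w)
    cum, running = [], 0.0
    for wj in w:
        running += wj
        cum.append((running - 0.5 * wj) / total)
    below = max(j for j, c in enumerate(cum) if c < 0.5)
    frac = (0.5 - cum[below]) / (cum[below + 1] - cum[below])
    return theta[below] + (theta[below + 1] - theta[below]) * frac


ivw_est, ivw_se = ivw(bx, by, byse)
egger_intercept, egger_slope = egger(bx, by, byse)
median_est = weighted_median(bx, by, byse)
print(ivw_est, ivw_se, egger_intercept, egger_slope, median_est)
```

          Because the illustrative data lie exactly on a line through the origin, all three methods should agree and the MR-Egger intercept should be close to zero; with real data, disagreement between the methods is itself informative about possible invalid instruments.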


          Most cited references (8)


          'Mendelian randomization': can genetic epidemiology contribute to understanding environmental determinants of disease?

          Associations between modifiable exposures and disease seen in observational epidemiology are sometimes confounded and thus misleading, despite our best efforts to improve the design and analysis of studies. Mendelian randomization, the random assortment of genes from parents to offspring that occurs during gamete formation and conception, provides one method for assessing the causal nature of some environmental exposures. The association between a disease and a polymorphism that mimics the biological link between a proposed exposure and disease is not generally susceptible to the reverse causation or confounding that may distort interpretations of conventional observational studies. Several examples where the phenotypic effects of polymorphisms are well documented provide encouraging evidence of the explanatory power of Mendelian randomization and are described. The limitations of the approach include confounding by polymorphisms in linkage disequilibrium with the polymorphism under study, the possibility that polymorphisms have several phenotypic effects associated with disease, the lack of suitable polymorphisms for studying modifiable exposures of interest, and canalization (the buffering of the effects of genetic variation during development). Nevertheless, Mendelian randomization provides new opportunities to test causality and demonstrates how investment in the human genome project may contribute to understanding and preventing the adverse effects on human health of modifiable exposures.

            Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the $I^2$ statistic

            Background: MR-Egger regression has recently been proposed as a method for Mendelian randomization (MR) analyses incorporating summary data estimates of causal effect from multiple individual variants, which is robust to invalid instruments. It can be used to test for directional pleiotropy and provides an estimate of the causal effect adjusted for its presence. MR-Egger regression provides a useful additional sensitivity analysis to the standard inverse variance weighted (IVW) approach that assumes all variants are valid instruments. Both methods use weights that consider the single nucleotide polymorphism (SNP)-exposure associations to be known, rather than estimated. We call this the 'NO Measurement Error' (NOME) assumption. Causal effect estimates from the IVW approach exhibit weak instrument bias whenever the genetic variants utilized violate the NOME assumption, which can be reliably measured using the F-statistic. The effect of NOME violation on MR-Egger regression has yet to be studied.

            Methods: An adaptation of the $I^2$ statistic from the field of meta-analysis is proposed to quantify the strength of NOME violation for MR-Egger. It lies between 0 and 1, and indicates the expected relative bias (or dilution) of the MR-Egger causal estimate in the two-sample MR context. We call it $I^2_{GX}$. The method of simulation extrapolation is also explored to counteract the dilution. Their joint utility is evaluated using simulated data and applied to a real MR example.

            Results: In simulated two-sample MR analyses we show that, when a causal effect exists, the MR-Egger estimate of causal effect is biased towards the null when NOME is violated, and the stronger the violation (as indicated by lower values of $I^2_{GX}$), the stronger the dilution. When additionally all genetic variants are valid instruments, the type I error rate of the MR-Egger test for pleiotropy is inflated and the causal effect underestimated. Simulation extrapolation is shown to substantially mitigate these adverse effects. We demonstrate our proposed approach for a two-sample summary data MR analysis to estimate the causal effect of low-density lipoprotein on heart disease risk. A high value of $I^2_{GX}$ close to 1 indicates that dilution does not materially affect the standard MR-Egger analyses for these data.

            Conclusions: Care must be taken to assess the NOME assumption via the $I^2_{GX}$ statistic before implementing standard MR-Egger regression in the two-sample summary data context. If $I^2_{GX}$ is sufficiently low (less than 90%), inferences from the method should be interpreted with caution and adjustment methods considered.
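            As a rough illustration of the quantities this abstract refers to, the sketch below computes an $I^2$-type statistic from the variant-exposure associations and their standard errors, using the usual meta-analysis form (Q minus its degrees of freedom, divided by Q, truncated at zero), together with the common per-variant approximation F = (estimate / standard error)^2. It assumes the unweighted form of the statistic; the function names and data are hypothetical, and the article itself should be consulted for the exact definition and the weighted variant.

```python
def i2_gx(bx, bxse):
    """I^2-type statistic for the variant-exposure associations.

    Q is Cochran's Q for the bx values with inverse-variance weights
    1 / bxse^2; the statistic is (Q - (k - 1)) / Q, truncated at 0.
    """
    w = [1.0 / s ** 2 for s in bxse]
    mean = sum(wi * x for wi, x in zip(w, bx)) / sum(w)
    q = sum(wi * (x - mean) ** 2 for wi, x in zip(w, bx))
    k = len(bx)
    if q <= 0:
        return 0.0
    return max(0.0, (q - (k - 1)) / q)


def approx_f_stats(bx, bxse):
    """Approximate per-variant F-statistics as squared z-scores."""
    return [(x / s) ** 2 for x, s in zip(bx, bxse)]


# Hypothetical variant-exposure associations (illustrative only).
bx_demo = [0.1, 0.2, 0.3, 0.4]

precise = i2_gx(bx_demo, [0.01] * 4)   # small standard errors: little dilution expected
imprecise = i2_gx(bx_demo, [1.0] * 4)  # large standard errors: statistic truncates to 0
f_demo = approx_f_stats(bx_demo, [0.01] * 4)
print(precise, imprecise, f_demo)
```

            Under the rule of thumb stated in the conclusions, a value such as the first one (well above 0.9) would suggest little dilution, whereas a value near zero would call for caution or an adjustment method.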

              Using published data in Mendelian randomization: a blueprint for efficient identification of causal risk factors

              Finding individual-level data for adequately powered Mendelian randomization analyses may be problematic. As publicly available summarized data on genetic associations with disease outcomes from large consortia are becoming more abundant, use of published data is an attractive analysis strategy for obtaining precise estimates of the causal effects of risk factors on outcomes. We detail the necessary steps for conducting Mendelian randomization investigations using published data, and present novel statistical methods for combining data on the associations of multiple (correlated or uncorrelated) genetic variants with the risk factor and outcome into a single causal effect estimate. A two-sample analysis strategy may be employed, in which evidence on the gene-risk factor and gene-outcome associations is taken from different data sources. These approaches allow the efficient identification of risk factors that are suitable targets for clinical intervention from published data, although the ability to assess the assumptions necessary for causal inference is diminished. Methods and guidance are illustrated using the example of the causal effect of serum calcium levels on fasting glucose concentrations. The estimated causal effect of a 1 standard deviation (0.13 mmol/L) increase in calcium levels on fasting glucose (mM) using a single lead variant from the CASR gene region is 0.044 (95% credible interval −0.002, 0.100). In contrast, using our method to account for the correlation between variants, the corresponding estimate using 17 genetic variants is 0.022 (95% credible interval 0.009, 0.035), a more clearly positive causal effect. Electronic supplementary material: The online version of this article (doi:10.1007/s10654-015-0011-z) contains supplementary material, which is available to authorized users.
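              One common way to combine correlated variants from summarized data is generalized weighted regression, in which the outcome associations are given a covariance matrix built from their standard errors and the between-variant correlations. The sketch below illustrates that idea only; it is not necessarily the method of the article above, which reports credible intervals. The function names, the correlation matrix and the data are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def correlated_ivw(bx, by, byse, rho):
    """Generalized weighted regression estimate for correlated variants.

    omega[j][k] = byse[j] * byse[k] * rho[j][k];
    estimate = (bx' omega^-1 by) / (bx' omega^-1 bx),
    standard error = 1 / sqrt(bx' omega^-1 bx) (fixed-effect form).
    """
    n = len(bx)
    omega = [[byse[j] * byse[k] * rho[j][k] for k in range(n)] for j in range(n)]
    u = solve(omega, bx)  # omega^-1 bx; omega is symmetric, so u' by = bx' omega^-1 by
    den = sum(ui * x for ui, x in zip(u, bx))
    num = sum(ui * y for ui, y in zip(u, by))
    return num / den, 1.0 / math.sqrt(den)


# Hypothetical data for three variants in partial linkage disequilibrium.
bx3 = [0.10, 0.20, 0.15]
by3 = [0.5 * x for x in bx3]
byse3 = [0.02, 0.03, 0.025]
rho3 = [[1.0, 0.3, 0.0],
        [0.3, 1.0, 0.2],
        [0.0, 0.2, 1.0]]
identity3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

est_corr, se_corr = correlated_ivw(bx3, by3, byse3, rho3)
est_ind, se_ind = correlated_ivw(bx3, by3, byse3, identity3)
print(est_corr, se_corr, est_ind, se_ind)
```

              With an identity correlation matrix the estimate reduces to the ordinary inverse-variance weighted estimate, which gives a quick sanity check on any implementation.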

                Author and article information

                Journal
                International Journal of Epidemiology (Int J Epidemiol)
                Publisher: Oxford University Press
                ISSN: 0300-5771, 1464-3685
                Issue date: December 2017
                Published: 07 April 2017
                Volume: 46
                Issue: 6
                Pages: 1734-1739
                Affiliations
                [1] Newnham College, University of Cambridge, and
                [2] Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
                Author notes
                Corresponding author. Strangeways Research Laboratory, 2 Worts Causeway, Cambridge CB1 8RN, UK. E-mail: sb452@medschl.cam.ac.uk
                Article
                Publisher ID: dyx034
                DOI: 10.1093/ije/dyx034
                PMCID: PMC5510723
                PMID: 28398548
                Record ID: f4e43a56-d7df-4e40-bc1f-6a92b1c94e01
                © The Author 2017. Published by Oxford University Press on behalf of the International Epidemiological Association

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 14 February 2017
                Page count
                Pages: 6
                Funding
                Funded by: Wellcome Trust 10.13039/100004440
                Award ID: 100114
                Categories
                Software Application Profile

                Public health
                Mendelian randomization, instrumental variable, causal inference, summarized data, two-sample, data parasite
