
      LDpred2: better, faster, stronger

      research-article


          Abstract

          Motivation

          Polygenic scores have become a central tool in human genetics research. LDpred is a popular method for deriving polygenic scores based on summary statistics and a matrix of correlation between genetic variants. However, LDpred has limitations that may reduce its predictive performance.

          Results

          Here, we present LDpred2, a new version of LDpred that addresses these issues. We also provide two new options in LDpred2: a ‘sparse’ option that can learn effects that are exactly 0, and an ‘auto’ option that directly learns the two LDpred parameters from data. We benchmark the predictive performance of LDpred2 against LDpred1 on simulated and real data, demonstrating substantial improvements in robustness and predictive accuracy. We then show that LDpred2 also outperforms other recently developed polygenic score methods, with a mean AUC over the 8 real traits analyzed here of 65.1%, compared to 63.8% for lassosum, 62.9% for PRS-CS and 61.5% for SBayesR. Note that LDpred2 provides more accurate polygenic scores when run genome-wide rather than per chromosome.
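
As a rough illustration of the two quantities this abstract relies on: a polygenic score is a weighted sum of an individual's allele counts, and AUC measures how often the score ranks a case above a control. The sketch below is plain Python on made-up toy data. The function names `polygenic_score` and `auc` and every number in it are illustrative assumptions; none of it is the LDpred2 algorithm or part of the bigsnpr API.

```python
def polygenic_score(genotypes, effects):
    """Weighted sum of allele counts (0, 1 or 2) and per-variant effect sizes."""
    return sum(g * b for g, b in zip(genotypes, effects))


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 4 individuals, 3 variants.
effects = [0.5, -0.2, 0.0]  # one effect is exactly 0, as a sparse model could yield
genotypes = [
    [2, 0, 1],
    [1, 1, 0],
    [0, 2, 2],
    [0, 0, 1],
]
labels = [1, 1, 0, 0]  # 1 = case, 0 = control

scores = [polygenic_score(g, effects) for g in genotypes]
print(scores, auc(scores, labels))
```

The pairwise AUC loop is quadratic in sample size and is chosen here only for readability; a rank-based computation would be preferable on real cohorts.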

          Availability and implementation

          LDpred2 is implemented in the R package bigsnpr.

          Supplementary information

          Supplementary data are available at Bioinformatics online.


                Author and article information

                Contributors
                Role: Associate Editor
                Journal: Bioinformatics
                Publisher: Oxford University Press
                ISSN: 1367-4803, 1367-4811
                Publication dates: 01 December 2020; 16 December 2020
                Volume: 36
                Issue: 22-23
                Pages: 5424-5431
                Affiliations
                [1] National Centre for Register-Based Research, Aarhus University, Aarhus 8210, Denmark
                [2] Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble 38000, France
                [3] Bioinformatics Research Centre, Aarhus University, Aarhus 8000, Denmark
                Author notes
                To whom correspondence should be addressed: florian.prive.21@gmail.com and bjv@econ.au.dk
                Article
                Publisher ID: btaa1029
                DOI: 10.1093/bioinformatics/btaa1029
                PMCID: PMC8016455
                PMID: 33326037
                © The Author(s) 2020. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 15 July 2020
                : 24 November 2020
                : 01 December 2020
                : 25 November 2020
                Page count
                Pages: 8
                Funding
                Funded by: Danish National Research Foundation, DOI 10.13039/501100001732;
                Funded by: Lundbeck Foundation Initiative for Integrative Psychiatric Research;
                Award ID: R248-2017-2003
                Categories
                Original Papers
                Genetics and Population Analysis
                AcademicSubjects/SCI01060

                Bioinformatics & Computational biology
