PRSice-2: Polygenic Risk Score software for biobank-scale data


Abstract

Background

Polygenic risk score (PRS) analyses have become an integral part of biomedical research, exploited to gain insights into shared aetiology among traits, to control for genomic profile in experimental studies, and to strengthen causal inference, among a range of applications. Substantial efforts are now devoted to biobank projects to collect large genetic and phenotypic data, providing unprecedented opportunity for genetic discovery and applications. To process the large-scale data provided by such biobank resources, highly efficient and scalable methods and software are required.

Results

Here we introduce PRSice-2, an efficient and scalable software program for automating and simplifying PRS analyses on large-scale data. PRSice-2 handles both genotyped and imputed data, provides empirical association P-values free from inflation due to overfitting, supports different inheritance models, and can evaluate multiple continuous and binary target traits simultaneously. We demonstrate that PRSice-2 is dramatically faster and more memory-efficient than PRSice-1 and alternative PRS software, LDpred and lassosum, while having comparable predictive power.
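To make the idea of an empirical association P-value concrete, the following is a minimal sketch in plain Python. It is not PRSice-2's implementation, and every function name and parameter in it is hypothetical. It assumes a score defined as an effect-size-weighted sum of allele dosages, a set of candidate SNP P-value thresholds, and a phenotype permutation scheme in which the best fit across all thresholds is recomputed for each shuffle, so that selecting the best threshold does not inflate the reported P-value.

```python
import random
from statistics import mean


def polygenic_score(dosages, effects):
    """Weighted sum of allele dosages (0 to 2) by per-SNP effect sizes."""
    return sum(d * b for d, b in zip(dosages, effects))


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no association signal
    return sxy * sxy / (sxx * syy)


def best_r2(genotypes, effects, snp_pvalues, phenotype, thresholds):
    """Best R^2 over all candidate SNP P-value thresholds."""
    best = 0.0
    for t in thresholds:
        keep = [j for j, p in enumerate(snp_pvalues) if p <= t]
        if not keep:
            continue
        kept_effects = [effects[j] for j in keep]
        scores = [
            polygenic_score([row[j] for j in keep], kept_effects)
            for row in genotypes
        ]
        best = max(best, r_squared(scores, phenotype))
    return best


def empirical_p_value(genotypes, effects, snp_pvalues, phenotype,
                      thresholds, n_perm=1000, seed=0):
    """Permutation P-value for the best-threshold fit (assumed scheme).

    The phenotype is shuffled n_perm times; each shuffle repeats the
    whole threshold search, so the null distribution reflects the
    same selection step as the observed statistic.
    """
    rng = random.Random(seed)
    observed = best_r2(genotypes, effects, snp_pvalues, phenotype, thresholds)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null = best_r2(genotypes, effects, snp_pvalues, shuffled, thresholds)
        if null >= observed:
            hits += 1
    # The +1 terms keep the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Calling `empirical_p_value` with a list of per-individual dosage rows, per-SNP effect sizes and P-values, a phenotype vector, and a list of thresholds returns a value in (0, 1]. Repeating the threshold search inside each permutation is the design choice that matters here: comparing the observed best fit against a single-threshold null would understate the P-value.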

Conclusion

PRSice-2's combination of efficiency and power will be increasingly important as data sizes grow and as the applications of PRS become more sophisticated, e.g., when incorporated into high-dimensional or gene set–based analyses. PRSice-2 is written in C++, with an R script for plotting, and is freely available for download from http://PRSice.info.


Author and article information

Journal

Journal ID (nlm-ta): Gigascience

Journal ID (iso-abbrev): Gigascience

Journal ID (publisher-id): gigascience

Title: GigaScience

Publisher: Oxford University Press

ISSN (Electronic): 2047-217X

Publication date (Collection): July 2019

Publication date (Electronic): 15 July 2019

Publication date (PMC-release): 15 July 2019

Volume: 8

Issue: 7

Electronic Location Identifier: giz082

Affiliations

[1] MRC Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, De Crespigny Park, Denmark Hill, London, UK, SE5 8AF

[2] Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, 1 Gustave L. Levy Pl, New York City, NY 10029, USA

Author notes

Correspondence address. Shing Wan Choi, Icahn School of Medicine, Mount Sinai, New York, USA. E-mail: choishingwan@gmail.com

Correspondence address. Paul F. O'Reilly, Icahn School of Medicine, Mount Sinai, New York, USA. E-mail: paul.oreilly@mssm.edu

Author information

Shing Wan Choi http://orcid.org/0000-0003-2215-3238

Paul F O'Reilly http://orcid.org/0000-0001-7515-0845

Article

Publisher ID: giz082

DOI: 10.1093/gigascience/giz082

PMC ID: 6629542

PubMed ID: 31307061

SO-VID: a58b7d01-3520-438f-8ab1-39984fd64d27

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 27 November 2018

Date revision received: 13 March 2019

Date accepted: 11 June 2019

Page count

Pages: 6

Funding

Funded by: Medical Research Council 10.13039/501100000265

Award ID: MR/N015746/1

Funded by: National Institute for Health Research 10.13039/501100000272

Funded by: South London and Maudsley NHS Foundation Trust 10.13039/100009362

Funded by: King's College London 10.13039/501100000764

Funded by: Department of Health 10.13039/501100003921
