Towards Clinical Utility of Polygenic 
Risk Scores 
 
Samuel A. Lambert1-4, Gad Abraham1-2,5, Michael Inouye1-6 
 
1. Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary 
Care, University of Cambridge, Cambridge CB1 8RN, United Kingdom 
2. Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, 
Melbourne, Victoria 3004, Australia 
3. MRC/BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary 
Care, University of Cambridge, Cambridge CB1 8RN, United Kingdom 
4. Cambridge Substantive Site, Health Data Research UK, Wellcome Genome Campus, 
Hinxton, UK 
5. Department of Clinical Pathology, University of Melbourne, Parkville, VIC 3010, Australia 
6. The Alan Turing Institute, London, UK 
 
Abstract 
Prediction of disease risk is an essential part of preventative medicine, often guiding clinical 
management. Risk prediction typically includes risk factors such as age, sex, family history of 
disease, and lifestyle (e.g. smoking status); however, in recent years there has been increasing 
interest in incorporating genomic information into risk models. Polygenic risk scores (PRS) aggregate 
the effects of many genetic variants across the human genome into a single score, and have 
recently been shown to have predictive value for multiple common diseases. In this review, we 
summarise the potential use cases for seven common diseases (breast cancer, prostate cancer, 
coronary artery disease, obesity, type 1 diabetes, type 2 diabetes, Alzheimer's disease) where 
PRS have or could have clinical utility. PRS analyses for these diseases frequently revolved 
around (i) risk prediction performance of a PRS alone and in combination with other non-genetic 
risk factors, (ii) estimation of lifetime risk trajectories, (iii) the independent information of PRS 
and family history of disease or monogenic mutations, and (iv) estimation of the value of adding 
a PRS to specific clinical risk prediction scenarios. We summarise open questions regarding 
PRS usability, ancestry bias, and transferability, emphasising the need for the next wave of 
studies to focus on the implementation and health-economic value of PRS testing. In 
conclusion, it is becoming clear that PRS have value in disease risk prediction and there are 
multiple areas where this may have clinical utility. 
Downloaded from https://academic.oup.com/hmg/advance-article-abstract/doi/10.1093/hmg/ddz187/5540980 by University of Cambridge user on 31 July 2019
Introduction 
A multitude of human traits and diseases are heritable to varying degrees. Further, the genetic 
basis for many such traits has been established as polygenic—explained by the contributions of 
many genes, each with moderate or weak contribution to the trait, in contrast to Mendelian traits 
which are caused by variation in one gene or a small set of genes with large effect. The 
combination of large-scale genome variation projects, such as the HapMap (1) and 1000 
Genomes projects (2), together with low-cost robust genotyping platforms, has enabled 
genome-wide association studies (GWAS) on large cohorts. GWAS have focused on identifying 
disease- or trait-associated genetic variants (typically SNPs, single nucleotide polymorphisms) 
which are common in a given population (e.g. minor allele frequency [MAF]>1%). To date, 
GWAS have identified thousands of loci that are associated with a range of complex human 
traits and diseases, including cardiovascular diseases, cancers, obesity, and Alzheimer’s 
disease (3). These data have provided numerous insights into the genes and pathways that 
cause disease, but more recently the use of these data for disease risk prediction has gained 
interest (4–6). 
 
Polygenic risk scores (PRS), sometimes referred to as genomic risk scores (GRS), are one 
such method to predict an individual's genetic predisposition for disease. In their simplest and 
most common form, PRS are sums of the effects of m SNPs, based on the estimated SNP 
effect sizes $\hat{\beta}_j$ (obtained from GWAS summary statistics), 

$$\mathrm{PRS}_i = \sum_{j=1}^{m} x_{ij} \hat{\beta}_j$$

where $x_{ij}$ is the genotype for the ith individual and jth SNP (usually encoded as 0, 1, or 2 for the 
effect allele dosage). Typically, these scores include hundreds-to-thousands of SNPs, motivated 
by theory and data showing that many diseases are polygenic (7). In this way, PRS aggregate 
the contribution of an individual’s germline genome into a single number proportional to the risk 
for a given disease.  
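
As an illustration of the weighted sum described above, the following is a minimal Python sketch. The effect sizes and genotypes are hypothetical values chosen only for the example, and the function name `polygenic_score` is our own; real pipelines would additionally need to match effect alleles and handle missing genotypes.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], betas: Sequence[float]) -> float:
    """Weighted sum of effect-allele dosages for one individual."""
    if len(dosages) != len(betas):
        raise ValueError("dosages and betas must cover the same SNPs")
    return sum(x * b for x, b in zip(dosages, betas))


# Hypothetical per-SNP effect sizes (e.g. log odds ratios) for m = 4 SNPs.
betas = [0.10, -0.05, 0.20, 0.02]

# Hypothetical effect-allele dosages (0, 1 or 2), one row per individual.
genotypes = [
    [0, 1, 2, 1],
    [2, 2, 0, 0],
    [1, 0, 1, 2],
]

# One score per individual, in the same order as the rows of `genotypes`.
scores = [polygenic_score(row, betas) for row in genotypes]
```

The raw scores are only meaningful relative to a reference distribution, which is why they are usually standardised within a cohort before effect sizes are reported.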
 
There are numerous considerations related to the data and methods used to develop and 
validate a PRS (see (8, 9) for details). Here, we briefly summarise approaches that use GWAS 
summary statistics (alleles and effect sizes, and/or p-value) (10) rather than individual-level 
genotypes, although the principles are broadly similar. Initially, PRS tended to be constructed 
from genome-wide significant SNPs (typically, P < 5×10⁻⁸), which for many diseases led to weakly 
predictive PRS as the number of genome-wide significant SNPs was small (11, 12). In contrast with 
GWAS, which were designed to detect SNPs associated with the disease while maintaining a 
low false positive rate, the task of prediction allows for methods with a more lenient signal-to-noise 
trade-off. Thus, more powerful PRS can typically be constructed by incorporating larger 
numbers of SNPs; however, there is a trade-off between using a small number of SNPs with 
precise effect estimates and a large number of SNPs with increasingly noisy effect size 
estimates. There is no universal set of parameters for this trade-off, as they depend on the 
genetic architecture of the disease, genotyping density, and sample size. In practice, a training 
set comprising individual-level genotypes and phenotypes is often used to optimise the PRS. 
Using an independent validation dataset, or cross-validation, allows unbiased estimation of the 
predictive performance, avoiding optimism due to overfitting. Generally, once predictive 
performance plateaus or declines in the validation set, the optimal trade-off of signal and noise 
has been reached. 
 
Another consideration is linkage disequilibrium (LD), the correlations between nearby SNPs, 
which leads to over-representation of high LD regions in the model, thus potentially reducing its 
predictive performance. Common methods for constructing PRS include LD pruning (randomly 
removing one SNP from a pair in high LD), P-value thresholding, and clumping (pruning by LD 
while preferentially retaining more significantly associated SNPs), as well as more complex 
methods that explicitly account for LD, such as LDpred (13) and lassosum (14). The result of 
PRS development is the set of SNPs and effect sizes that can be applied to an independent 
sample. 
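
The combination of P-value thresholding and clumping can be sketched as a greedy procedure. The Python below is a simplified illustration, not the implementation of any particular tool: the SNP identifiers, P-values and pairwise r² values are hypothetical, and we assume that any SNP pair absent from the `r2` table is not in LD (which stands in for the physical distance window that real clumping uses).

```python
from typing import Dict, List, Tuple


def clump(
    p_values: Dict[str, float],
    r2: Dict[Tuple[str, str], float],
    p_threshold: float = 5e-8,
    r2_threshold: float = 0.1,
) -> List[str]:
    """Greedy clumping: walk SNPs from most to least significant and keep a
    SNP only if it is not in LD with any SNP that has already been kept."""

    def ld(a: str, b: str) -> float:
        # Pairs missing from the table are assumed to be uncorrelated.
        return r2.get((a, b), r2.get((b, a), 0.0))

    # P-value thresholding, then order by significance.
    candidates = sorted(
        (snp for snp, p in p_values.items() if p <= p_threshold),
        key=lambda snp: p_values[snp],
    )
    kept: List[str] = []
    for snp in candidates:
        if all(ld(snp, index_snp) < r2_threshold for index_snp in kept):
            kept.append(snp)
    return kept


# Hypothetical summary statistics and LD estimates.
p_values = {"rs1": 1e-10, "rs2": 1e-9, "rs3": 1e-4, "rs4": 3e-9}
r2 = {("rs1", "rs2"): 0.8, ("rs2", "rs4"): 0.05}

selected = clump(p_values, r2)
```

Here `rs2` is dropped because it is in high LD with the more significant `rs1`, and `rs3` only enters the score if `p_threshold` is relaxed. In practice both thresholds are tuning parameters chosen on a training set.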
 
After the PRS has been constructed, it is essential to assess its predictive performance with the 
disease of interest in an external cohort, one not used for the underlying GWAS or for tuning the 
PRS. The accuracy of a PRS is bounded by the disease's heritability (total amount of disease 
variance that can be explained by genetics), and current PRS agree with estimates from theory 
(see Box 1). For polygenic scores of quantitative traits, the effect size per standard deviation 
(SD) change is usually reported, as well as the proportion of variance explained (R²) by the 
score. However, as most diseases are binary outcomes, the effect sizes are expressed as odds 
ratios (OR) or hazard ratios (HR), depending on the study design (case/control vs. prospective) 
and the availability of age at event. The model’s performance can be measured using variance 
explained (Nagelkerke's or pseudo-R²), or classification accuracy using area under the 
receiver-operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), or 
Harrell’s C-index (15). However, caution must be exercised when interpreting prediction metrics 
such as AUC or C-index without sufficient context; even small increases in these metrics can 
lead to several percent of the population being reclassified into different risk categories, 
changing their clinical management. Further, these metrics do not take into account the costs 
and benefits of various clinical decisions (e.g. use of statins), which can only be done within a 
public health and health-economic framework. In addition, when comparing metrics across 
studies, it is important to note that the ancestry as well as study design (e.g. covariates included 
in the risk model) can affect these measures (as well as the standard deviation of the PRS). 
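
For readers less familiar with the AUC, it can be read as the probability that a randomly chosen case has a higher score than a randomly chosen non-case. The short Python sketch below computes it directly from that definition; the scores are hypothetical and the function name `auc` is our own.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Fraction of (case, control) pairs in which the case scores higher;
    ties count as half a win."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical standardised PRS values.
cases = [0.9, 0.4, 0.7]
controls = [0.1, 0.5, 0.3, 0.4]

discrimination = auc(cases, controls)
```

Because this quantity depends only on the ranking of scores, it says nothing about how many individuals cross a clinical risk threshold, which is one reason it needs to be interpreted in context.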
Potential for PRS utility 
We reviewed the literature for seven well-studied diseases where PRS could potentially have 
clinical value. These diseases include coronary artery disease (CAD), diabetes (types 1 and 2), 
obesity (and body mass index (BMI)), breast cancer, prostate cancer, and Alzheimer's disease. 
Table 1 summarises information about each disease, its conventional risk factors, potential 
uses of a PRS, and recent references evaluating the clinical use of PRS in each case. A 
common theme is the expectation that the utility of PRS will be to predict future disease risk or 
identify those most at risk, and use this information to target treatments or alter screening 
paradigms. In this section, we elaborate on two examples where the clinical benefit of a PRS 
has been suggested: (i) providing better CAD risk estimates to guide treatments, and (ii) the 
potential to target screening to populations at high risk of prostate cancer.  
 
For cardiovascular disease, traditional risk factors such as systolic blood pressure, cholesterol 
levels, and smoking habits (Table 1) are routinely used to predict risk and guide initiation of 
treatment (e.g. statins) to lower low-density lipoprotein (LDL) cholesterol and reduce the disease 
risk. Recent studies have shown that adding PRS for CAD to the Framingham Risk Score and 
the ACC/AHA pooled risk equations resulted in increased predictive power (16). Additionally, in 
two re-analyses of clinical trials evaluating the effect of statin use on cardiovascular disease 
prevention, it was shown that the treatment benefit (absolute CAD risk reduction) was highest in 
those with the highest CAD polygenic risk (17, 18). The MI-GENES study found that disclosing 
CAD genetic risk to patients when deciding whether to initiate statin therapy resulted in 
improved LDL reduction, and the effect was again higher in those with the most genetic risk 
(19). Preliminary health economic analysis has also shown the potential cost benefits of using 
PRS in targeted testing for CAD prevention within the Finnish health system (20). Together 
these results show that a PRS for CAD can inform a more accurate risk estimate and define 
individuals most likely to benefit from statin therapy; however, the exact net benefit will likely 
vary across health systems and thus will require evaluation within each one. 
 
Another potential use case for PRS may be to increase the utility of lower sensitivity diagnostics. 
The serum prostate-specific antigen (PSA) test was used to screen for prostate cancer, but 
large trials showed that it results in a significant amount of overdiagnosis (false-positives leading 
to overtreatment) (21); while still used in diagnosis it has been abandoned for broad screening. 
Multiple prostate cancer PRS have been developed that can accurately stratify individuals' risk; 
a key finding from these studies has been that the probability of overdiagnosis by screening 
decreases as an individual’s prostate cancer polygenic risk increases (22–24). This finding 
suggests that the PSA test could be targeted to a higher-risk population, as measured by a 
PRS, where the PSA test has a higher positive predictive value. In other disease areas there is 
similar interest in adjusting screening test frequency and/or age of initiation, and in breast 
cancer the WISDOM clinical trial (25) is currently evaluating the use of risk (including PRS) 
instead of age-based guidelines (26) to guide these decisions. 
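
The dependence of positive predictive value on the prior risk of the tested group follows from Bayes' rule, and can be sketched in a few lines of Python. The sensitivity, specificity and prevalence values below are hypothetical and are not estimates for the PSA test; they only illustrate the direction of the effect.

```python
def positive_predictive_value(
    sensitivity: float, specificity: float, prevalence: float
) -> float:
    """Probability of disease given a positive test (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Same hypothetical test applied to two groups with different prior risk.
ppv_general = positive_predictive_value(0.8, 0.8, prevalence=0.02)
ppv_high_prs = positive_predictive_value(0.8, 0.8, prevalence=0.10)
```

With identical test characteristics, the group with the higher prior risk has the higher positive predictive value, which is the rationale for targeting a screening test using a PRS.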
Lessons learned from PRS prediction studies 
PRS define a lifetime risk trajectory 
The majority of common complex diseases are late onset, with risk accumulating over time. Age 
is typically the strongest predictor of risk for many common diseases (Table 1), since it 
encapsulates the time dimension over which environmental exposures (risk factors) occur, as 
well as the ageing process (which can accelerate disease processes) (Figure 1). Thus the goal 
of risk prediction for such diseases is to evaluate whether the risk, either lifetime risk or shorter 
time horizons (e.g. 10-year risk), is higher than a threshold given by clinical guidelines or by 
age-adjusted average risk. The predicted risk can then be used to plan appropriate clinical 
action, whether it be treatment or increased screening. The shape of this risk trajectory can be 
different from birth, and is modified by an individual’s genetics as well as environment and 
behaviors, such as smoking, diet, exercise, and medication usage. 
 
Analyses across a range of complex diseases have utilised methods from survival analysis to 
examine how PRS affect the trajectory of cumulative risk over a lifetime, including CAD (16, 27), 
breast cancer (28, 29), prostate cancer (23), Alzheimer’s disease (30, 31), and weight gain 
trajectories (32). These trajectories stratified by genetic risk can be estimated from an early age, 
prior to any clinical risk factors manifesting. For example, an average male in the UK Biobank 
would reach 10% cumulative risk of CAD by the age of 68 (27). On the other hand, individuals 
with the highest and lowest 20% of CAD PRS would attain 10% cumulative risk by 61 and 75 
years, respectively. Similarly, the risk trajectory of breast cancer was modified in a cohort of 
Estonian women (29), whereby at age 70 the average risk of breast cancer was 5%, but was 
12% for those above the 95th percentile of genetic risk, and 2.4% for those in the bottom quintile. Taken 
together, and with evidence from other diseases, it is clear that genetic risk can substantially 
stratify individual disease risk trajectories above what can be predicted by age alone. 
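
To make the idea of PRS-stratified trajectories concrete, the sketch below assumes a proportional hazards model with a Weibull baseline, under which a hazard ratio rescales the cumulative hazard at every age. This is an illustrative assumption of ours, not the model fitted in the cited studies, and the `scale` and `shape` parameters are hypothetical rather than calibrated to any cohort.

```python
import math


def cumulative_risk(
    age: float, hazard_ratio: float, scale: float = 120.0, shape: float = 4.0
) -> float:
    """Cumulative risk by `age` under a Weibull baseline and proportional hazards."""
    baseline_cumulative_hazard = (age / scale) ** shape
    return 1.0 - math.exp(-hazard_ratio * baseline_cumulative_hazard)


def age_at_risk(
    target: float, hazard_ratio: float, scale: float = 120.0, shape: float = 4.0
) -> float:
    """Age at which cumulative risk reaches `target` (inverse of cumulative_risk)."""
    return scale * (-math.log(1.0 - target) / hazard_ratio) ** (1.0 / shape)


# Hypothetical hazard ratios for low, average and high polygenic risk groups.
ages_at_10_percent = {
    group: age_at_risk(0.10, hr)
    for group, hr in {"low": 0.5, "average": 1.0, "high": 2.0}.items()
}
```

Under these assumptions a higher hazard ratio brings forward the age at which a given cumulative risk is reached, which mirrors the pattern of earlier threshold crossing in high-PRS groups described above.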
PRS capture risk not quantified by family history and rare 
monogenic mutations 
Two other major predictors that have been used for disease risk prediction are (i) family history 
and (ii) monogenic mutations.  
 
A family history of disease is a composite of genetic risk (both common and rare) and a shared 
environment. For instance, many breast cancer risk prediction methods implemented in clinical 
practice (e.g. BOADICEA (33)) use family history, often represented in a pedigree, to estimate 
risk alongside other predictors. Family history, however, suffers from several drawbacks: (i) 
family history depends on actual disease events occurring (a cancer diagnosis), and thus 
cannot detect individuals who are at high risk but have not experienced an event; (ii) complex 
trait theory predicts that the majority of cases of complex disease arise in individuals without any 
family history of disease (sporadic cases) (34); and (iii) family history information is often 
incomplete or imprecise in practice, leading to further reduction in its predictive power. 
 
PRS can be thought of as a method to explicitly capture the common polygenic component of 
family history. Indeed, even early PRS could predict lower prevalence diseases better than 
family history (35). Recently, more predictive PRS for higher prevalence disease, such as CAD, 
have been shown to be associated with CAD independently of family history (16, 36). Since 
family history includes an environmental as well as genetic component, we expect that as PRS 
get more powerful, they will better capture the common genetic component of family history, 
without affecting the shared environment or the monogenic (rare) component. Thus, it is likely 
that for prediction purposes, models combining both family history and PRS will be stronger 
than either factor alone, and that family history will not be made redundant by 
PRS.  
 
Another form of genetic risk is monogenic in origin, namely, Mendelian germline mutations with 
high penetrance. Examples include BRCA1/2 mutations for breast, ovarian, and prostate 
cancers, and familial hypercholesterolemia (FH, caused by LDLR/APOB/PCSK9 mutations) for 
CAD. While these mutations are often highly penetrant, their relative rarity in the population 
means that they only explain a small fraction of disease cases. Furthermore, these rare genetic 
variants are generally not well-genotyped by standard genome-wide genotyping arrays (nor 
well-imputed from reference panels) (37), such that PRS derived from standard GWAS 
summary statistics with typical MAF thresholds do not capture rarer variation with high 
accuracy.  
 
Comparing the relative contributions of polygenic and monogenic risk is not straightforward 
since PRS represent continuous risk while monogenic risk is typically represented as 
presence/absence of known mutations. One approach used by Khera et al. (2018) was to find 
what proportion of the population had a PRS level high enough to be considered as equivalent 
to carrying monogenic mutations. For example, a CAD PRS in the top 8% of the distribution confers an odds ratio of 3, 
similar to that of FH (38, 39), but high polygenic risk is far more prevalent (1 in 13 and 1 in 200 for the PRS 
top 8% and FH, respectively), thus representing a much higher disease burden at the 
population level. 
 
Since monogenic and polygenic risk are largely independent, individuals can inherit any 
combination of these two factors, and some small proportion of the population may receive both 
high polygenic risk as well as monogenic mutations for the same disease, putting them at 
extreme risk; conversely, some monogenic carriers may be at lower risk than their average 
peers. This has been shown for LDL cholesterol levels in carriers of both FH mutations and high 
CAD risk (39), and by the ability of CAD PRS to predict CAD in cohorts of high-risk FH cases 
(16, 40). Outside of CAD, PRS for diseases have been combined with well-studied mutations to 
show that PRS provides additional stratification in carriers of BRCA1/2 mutations in prostate 
cancer (41) and breast cancer (42), and APOE ε4 carriers in Alzheimer’s disease (30, 43–45). 
There is some evidence from Alzheimer’s disease (44) and breast cancer (42) that polygenic 
risk may interact non-additively with monogenic risk, but more research is needed to understand 
the impact on risk prediction. Ultimately, combined monogenic/polygenic scores will likely 
provide the most information for individual risk prediction. 
PRS are largely independent of traditional risk factors and can 
improve current clinical risk prediction models 
When considering adding PRS to risk models based on traditional risk factors, there are three 
main questions: (i) is the PRS associated with disease risk independently of traditional risk 
factors; (ii) does the PRS combine additively or non-additively with traditional risk factors in 
affecting risk; and (iii) does the PRS increase predictive power over traditional risk factors.  
 
The PRS for several diseases have been shown to be associated with disease risk largely 
independently of traditional risk factors. For example, in CAD, the association of PRS with 
disease is only partially attenuated by adjusting for a range of traditional risk factors such as 
systolic blood pressure, LDL cholesterol, BMI, and others (16, 27, 46). In addition, the PRSs are 
often only weakly associated with these risk factors. This is likely due to several reasons: (i) 
PRS are based on a large number of SNPs representing a multitude of biological pathways, 
some of which are not represented by traditional risk factors; (ii) many risk factors are 
themselves driven both by genetics and environment, and PRS can only capture the genetic 
component; (iii) current PRS are incomplete in that they typically only explain a small proportion 
of heritability; (iv) some risk factors, such as blood pressure, can exhibit substantial temporal 
variation and noise in measurements, whereas the PRS is capturing a life-long effect. 
 
A subsequent question is whether PRS and traditional risk factors combine additively in 
affecting disease, or whether one modifies the other in a non-additive way (statistical interaction). 
Results so far in CAD (16, 27) indicate that PRS and traditional risk factors combine largely 
additively. There is some evidence that PRS for breast cancer may interact with a minority of its 
risk factors, including alcohol consumption, height, and hormone therapy (47); however, it is 
unknown whether the magnitude of these interactions has substantial implications for improved 
risk prediction. 
 
The final issue is whether PRS add substantial new information on top of traditional risk factors, 
so as to increase predictive power. In breast cancer this has been tested with multiple PRS 
(varying in GWAS summary statistics, training datasets, number of SNPs in the score) and 
multiple established risk predictors (varying in the genetic and non-genetic risk factors included; 
models listed in Table 1 and (48, 49)). In a systematic review and meta-analysis of these 
studies Fung et al. found that the AUC of any risk predictor improved by 0.004 with the inclusion 
of a PRS, and the net reclassification improvement (NRI, a measure of change in classification 
accuracy based on established risk thresholds) improved in all studies but one (49); however, 
care should be taken when interpreting these results as all of the scores included fewer than 
100 SNPs. In a recent study of PRS utility for risk prediction in 101 breast cancer families 
without BRCA1/2 mutations, the inclusion of a 161-SNP PRS into BOADICEA changed 
screening recommendations for 11.5–19.8% of women, depending on the risk guidelines used (50). 
Using another recent PRS (PRS-77; (51)) resulted in a similar fraction of risk categories changing 
when included into BOADICEA and a number of other risk prediction methods (BRCAPRO, 
BCRAT, and IBIS) in a small Australian cohort (52), and a smaller 67 SNP PRS was 
independently predictive of risk when included in the Gail risk model along with mammographic 
density and endogenous hormones (53) (similar findings are observed using PRS-77 (54)). The 
use of larger cancer PRS (28, 29, 38) will likely improve risk stratification further. 
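
The categorical NRI mentioned above can be computed from each person's risk category under the old and new models. The Python sketch below uses the standard definition (net proportion of events moving up plus net proportion of non-events moving down); the category assignments are hypothetical and the function name is our own.

```python
from typing import Sequence


def net_reclassification_improvement(
    old_categories: Sequence[int],
    new_categories: Sequence[int],
    events: Sequence[bool],
) -> float:
    """Categorical NRI; categories are integers ordered from low to high risk."""
    up_events = down_events = up_nonevents = down_nonevents = 0
    n_events = n_nonevents = 0
    for old, new, event in zip(old_categories, new_categories, events):
        if event:
            n_events += 1
            up_events += new > old
            down_events += new < old
        else:
            n_nonevents += 1
            up_nonevents += new > old
            down_nonevents += new < old
    if n_events == 0 or n_nonevents == 0:
        raise ValueError("need at least one event and one non-event")
    return (
        (up_events - down_events) / n_events
        + (down_nonevents - up_nonevents) / n_nonevents
    )


# Hypothetical example: 0 = below the risk threshold, 1 = above it.
old = [0, 0, 1, 1, 0, 1, 0, 1]   # categories from the clinical model alone
new = [1, 1, 1, 0, 0, 0, 0, 1]   # categories after adding a PRS
had_event = [True, True, True, True, False, False, False, False]

nri = net_reclassification_improvement(old, new, had_event)
```

Unlike the AUC, the NRI depends on where the risk thresholds are placed, so values are only comparable between studies that use the same categories.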
PRS are most informative for prevention 
While there is benefit to adding PRS to existing clinical risk scores, the unique characteristics of 
PRS open up possibilities for earlier prevention. Indeed, a study to predict the development of 
T1D in high-risk children (family history of T1D) found that a PRS was only predictive of 
progression to T1D before any metabolic abnormalities were present (high DPT-1 score), 
indicating the value of a T1D PRS for predicting those likely to progress to disease (55). For 
cardiovascular disease, traditional risk factors are typically not measured early in life and can 
have substantial temporal variation. In contrast, individuals can be genotyped early in life and 
have their PRS calculated for a wide range of complex diseases. For those at substantially increased 
lifetime risk of disease, but without elevated traditional risk factors, targeted lifestyle 
interventions could be used to reduce their risk, for example by more frequent follow-ups or 
more stringent targets for traditional risk factors (e.g. cholesterol) (56).  
Open questions and challenges for the PRS field 
We have outlined the value and potential of PRS for disease risk prediction but there remain a 
number of technical, practical, and ethical concerns that should be resolved before widespread 
clinical adoption. 
Improving the replicability and comparability of PRS predictions 
Currently, PRS exist in the research domain, where scores, methods, and standards are 
constantly developing. The PRS for a single disease area can vary widely in their risk 
predictions because they will include different numbers (10–10⁶) and non-overlapping sets of 
SNPs, with different effect sizes in different scores, depending on the GWAS summary statistics 
used to create the score (e.g., number of samples and their ancestry, phenotype definition, 
imputation panel for SNPs), along with the computational method and samples used to train the 
score. Apparent performance can also vary due to the covariates adjusted for in the risk 
prediction, such as age and sex. We believe this lack of consistency to be a prime concern for 
the PRS field and additional resources, such as a centralised public database of published 
polygenic scores, are necessary to increase PRS comparability and evaluation, and thus 
improve their potential for translation. However, further major challenges remain, including those 
discussed below: increasing the diversity of genotyped cohorts to reduce the bias of PRS 
performance for European ancestries; investigating sex-based differences in PRS performance; 
and delineating clinical utility in disease-specific scenarios, rather than relying on generic 
prediction metrics, such as AUC.  
Sources of bias in PRS predictions: stratification by ancestry and 
sex? 
Currently, the majority of PRS are developed and evaluated using individuals of European 
ancestries, since the majority of GWAS and genetic reference panels (used for imputation) are 
currently biased toward European ancestries (57, 58). Because of this it has been observed that 
PRS developed using data from European ancestries are less predictive in non-European 
ancestries (59–63). There are various possible reasons for this lack of transferability including (i) 
population stratification in the original summary statistics (confounding the association results); 
(ii) differences in LD patterns between ancestries; and (iii) differences in the true genetic 
architectures of disease, including gene-environment interactions (58, 64). 
 
Population stratification can generally be adjusted for during the GWAS or in the evaluation of 
the score on new datasets, using principal component analysis (PCA) or linear mixed models 
(65). Care must be taken even within a single ancestry group, as there can be regional 
variations of PRS driven by subtle population stratification (66, 67). As for performance 
differences due to diverging LD patterns, these arise because many GWAS SNPs are not 
necessarily the causal SNP but are in LD with the causal SNP (tagging); however, due to 
differences in LD between populations, the causal SNPs may no longer be well tagged, leading 
to reduced performance (58, 64). 
 
The issue of genetic architecture differing between ancestry groups is difficult to assess 
without large GWAS in non-European ancestries. So far, the evidence from diseases such as 
T2D is that the genetic architecture is largely concordant between European and non-European 
ancestries (68–71); and directionally concordant effect sizes between different ancestries have 
been observed in multiple other comparisons of GWAS across ancestral groups (72–74). 
Assuming that this holds across the majority of complex diseases, LD differences are likely the 
main challenge to overcome. Some proposed solutions include a single pan-ancestry PRS or 
creating different ancestry-specific PRSs (62, 75–77). A related challenge will then be how to 
accurately align an individual to a PRS based on their ancestry. 
 
Another important yet relatively unexplored aspect of PRS performance is how predictions 
differ by sex. Many traits, including disease risk, differ by sex and some of that may be partly 
genetic (78). However, most GWAS are not sex-specific, and often exclude sex chromosomes 
(particularly X) from the analysis. This is an area of interest for future PRS research, with recent 
results showing stronger predictive power for obesity (79) and Alzheimer’s disease (80) using 
sex-specific PRS. 
What is the value of PRS, and how do we achieve it? 
In this review we have outlined how PRS can improve risk prediction, and 
highlighted cases of potential clinical utility. However, the evaluation of a PRS in public health 
and health economic terms as well as in feasibility of implementation is necessary to motivate 
adoption; these aspects, however, have not been extensively explored. Public health, economic, 
and implementation assessment will be highly dependent on the PRS use case and costs of the 
clinical action (e.g. medication, or altered screening guidelines). A previous review outlined the 
potential value of PRS in optimally allocating therapies in reducing the Number Needed to Treat 
(NNT) (81); however, cost-benefit analyses represent another large step to be taken. To our 
knowledge the cost-benefit of PRS testing has only been explored in CAD and breast cancer. In 
a simulation framework of the Finnish health system, it was found that an optimal allocation of a 
CAD PRS alongside traditional risk factors would be cost-beneficial if deployed in a targeted, 
rather than population-wide, approach (20). A UK-based analysis found that allocating 
breast cancer screening using risk-based (a combined predictor including a PRS) rather than 
age-based estimates would improve the cost-effectiveness and the benefit-to-harm ratio over 
current guidelines (26). While these cases suggest the value of genetic testing in their specific 
use cases, they may be underestimating the potential benefit due to the multiple PRS that can 
be estimated from a single genotype array. It is possible that there would be a significant health 
and economic benefit for genotyping once and receiving concurrent risk predictions for multiple 
diseases, optimizing treatment or screening for each.  
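
The link between risk stratification and the Number Needed to Treat can be shown with simple arithmetic. The sketch below assumes, for illustration only, that a therapy confers the same relative risk reduction in every risk group; the baseline risks and the relative risk reduction are hypothetical values, not estimates from the cited trials.

```python
def number_needed_to_treat(
    baseline_risk: float, relative_risk_reduction: float
) -> float:
    """NNT = 1 / absolute risk reduction, where the absolute risk reduction
    is the baseline risk multiplied by the relative risk reduction."""
    absolute_risk_reduction = baseline_risk * relative_risk_reduction
    return 1.0 / absolute_risk_reduction


# Hypothetical 10-year risks in average and high polygenic risk groups.
nnt_average = number_needed_to_treat(baseline_risk=0.04, relative_risk_reduction=0.25)
nnt_high_prs = number_needed_to_treat(baseline_risk=0.10, relative_risk_reduction=0.25)
```

Under this assumption the higher-risk group has the smaller NNT, so the same number of treatments prevents more events; a cost-benefit analysis would additionally weigh the cost of genotyping and of the clinical action itself.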
Conclusions 
Sixteen years since the human genome sequence was finished and nearly 13 years since the 
GWAS era began, PRS have emerged as a powerful tool to predict genetic predisposition of 
disease. For the seven diseases we evaluate here, the addition of PRS generally increased the 
accuracy of existing risk models of established risk factors, with the resulting improved risk 
prediction models affecting clinical management (diagnostic screening and/or treatment) in 
sizeable fractions of patients (~10% in the case of breast cancer). While these studies 
demonstrate the potential clinical impact and benefits of using PRS, there are still open 
questions regarding their eventual utility.   
 
The utility of PRS for informing disease risk is further evidenced by its practical implementation 
as a one-time, minimally invasive DNA extraction (e.g. saliva or blood draw) at any point in a 
lifetime, coupled with low-cost array genotyping and, in the future, genome sequencing. A single 
individual's genotype data allows for the parallel calculation of PRS for many diseases. From 
this single test, preexisting risk prediction models for multiple diseases appear to be improved, 
and lifetime risk trajectories can be estimated. In the future, these risk estimates may be used to 
guide screening frequencies, therapeutic interventions, and targeted recommendations for 
lifestyle. Regardless, the predictive accuracy of PRS will continue to improve with larger and 
more diverse cohorts as well as improved methods to derive and apply PRS, all of which are 
likely to increase the potential clinical utility of PRS and accelerate translation. 
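The parallel calculation of several PRS from one genotype can be sketched as follows. This is an illustrative example only: a PRS is taken here as a weighted sum of effect-allele dosages, and the variant identifiers, weights and disease labels are hypothetical placeholders, not values from any published score.

```python
from typing import Dict


def polygenic_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages (0-2).

    Variants that are absent from the genotype are skipped.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# One individual's genotype: effect-allele dosage per variant (hypothetical IDs).
genotype = {"variant_A": 2, "variant_B": 1, "variant_C": 0, "variant_D": 1}

# One weight table per disease (hypothetical per-allele effect sizes).
score_weights = {
    "CAD": {"variant_A": 0.10, "variant_B": -0.05, "variant_D": 0.20},
    "breast cancer": {"variant_B": 0.08, "variant_C": 0.12},
}

# The same genotype is reused for every disease-specific score.
scores = {
    disease: polygenic_score(genotype, weights)
    for disease, weights in score_weights.items()
}
print(scores)
```

In practice a raw score is usually interpreted relative to a reference distribution (for example as a percentile), which this sketch leaves out.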
 
 
  
Display Items  
 
Box 1: Empirical results for CAD closely follow predictions from polygenic trait theory. 
Under an additive genetic liability threshold model, and given assumed values for several key 
quantities, including population prevalence K and heritability h2 on the liability scale (and/or 
the sibling recurrence risk λs), we can derive the expected predictive power of a PRS, measured 
as sensitivity, specificity, AUC, and other quantities (82, 83). 
 
The adjacent figure shows simulation results for two scenarios relevant to CAD (assuming a 
population prevalence K=0.05 and h2=0.5): (a) a PRS explaining 10% of the phenotypic variance, 
similar to the results achieved by the latest CAD PRS (27); and (b) a PRS explaining all the 
known heritability of CAD (50% of the phenotypic variance). Clearly, as the PRS explains more of 
the heritability, there is greater separation between the average scores of cases and non-cases 
(quantified by the AUC) and correspondingly larger effect sizes (ORstdev). For a disease such 
as CAD, the expected AUC from a PRS explaining all of the known heritability is 0.9. For 
scenario (a), the top 5% of the population will have an average absolute (lifetime) CAD risk of 
15%, but for scenario (b) this rises to 40%, and the top 15% of the population have a 
risk of >10%.  
 
Note that the genetic liability threshold model has no direct bearing on how to increase 
the heritability explained by PRS, only on what the consequences of such an increase are. To increase 
the explained heritability we will likely need larger GWAS sample sizes (84, 85), together with 
wider genotyping of rarer genetic variants, such as via whole-genome sequencing (86, 87). In 
the absence of larger sample sizes, multi-trait prediction models can also be used to make small 
but consistent gains in predictive power (88, 89). 
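The two scenarios in Box 1 can be explored with a small Monte Carlo simulation of the liability threshold model. This is a minimal sketch under stated assumptions, not the procedure used to produce the box figure: liability is taken to have unit variance, the PRS and the residual are drawn as independent normal variables, and the function name, sample size and seed are our own choices.

```python
import random
from statistics import NormalDist


def simulate_liability_model(r2: float, prevalence: float = 0.05,
                             n: int = 100_000, seed: int = 1):
    """Simulate a liability threshold model in which a PRS explains a
    fraction r2 of the (unit) variance in liability.

    Returns (AUC of the PRS, disease risk among the top 5% of PRS).
    """
    rng = random.Random(seed)
    # Individuals whose liability exceeds this cut-off are cases.
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    sd_prs, sd_rest = r2 ** 0.5, (1.0 - r2) ** 0.5

    people = []
    for _ in range(n):
        prs = rng.gauss(0.0, sd_prs)
        liability = prs + rng.gauss(0.0, sd_rest)
        people.append((prs, liability > threshold))
    people.sort(key=lambda p: p[0])  # ascending PRS

    # AUC from the rank-sum (Mann-Whitney) statistic of the cases.
    n_cases = sum(is_case for _, is_case in people)
    n_controls = n - n_cases
    rank_sum = sum(rank for rank, (_, is_case) in enumerate(people, start=1)
                   if is_case)
    auc = (rank_sum - n_cases * (n_cases + 1) / 2) / (n_cases * n_controls)

    # Observed disease frequency in the highest 5% of PRS values.
    top = people[-(n // 20):]
    top_risk = sum(is_case for _, is_case in top) / len(top)
    return auc, top_risk


auc_a, top5_a = simulate_liability_model(r2=0.10)  # scenario (a)
auc_b, top5_b = simulate_liability_model(r2=0.50)  # scenario (b)
print(f"(a) AUC={auc_a:.2f}, risk in top 5%={top5_a:.2f}")
print(f"(b) AUC={auc_b:.2f}, risk in top 5%={top5_b:.2f}")
```

The simulated AUC and top-5% risk can be compared with the values quoted in the box; small differences are expected from sampling noise.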
 
 
  
Table 1. An overview of PRS in seven different diseases. 

Cardiometabolic Traits/Diseases 

Obesity & BMI 
Mendelian Risk Factors: MC4R mutations 
Other Risk Factors: Age, Sex, Family History; Lifestyle: Diet, Physical Activity 
Potential clinical utility for PRS: 
• Targeting lifestyle interventions and potential treatments (e.g. bariatric surgery) to those 
  at most risk of developing obesity 
  ◦ BMI PRS is enriched in those who have undergone bariatric surgery in UK Biobank (32) 
• Predicting weight gain trajectories (32, 90, 91) 
• Useful as a risk predictor of other diseases where obesity is a causal risk factor (79) 

Coronary artery disease (CAD) 
Mendelian Risk Factors: Familial Hypercholesterolemia (FH) mutations: LDLR, APOB, PCSK9 
Other Risk Factors: Age, Sex, Family History; Systolic blood pressure, LDL or non-HDL 
  cholesterol, BMI; Lifestyle: Smoking, Diet, Physical Activity 
Potential clinical utility for PRS: 
• Adds accuracy to clinical risk predictors (e.g. Framingham Risk Score, ACC/AHA13 (16)) 
• Useful for defining most benefit from statin prescription (17, 18) 
• Useful for estimating lifetime risk trajectories (27, 56) 

Diabetes (Type 1) 
Mendelian Risk Factors: Maturity onset diabetes of the young (MODY) related genes; HLA 
  susceptibility alleles 
Other Risk Factors: Age, Sex, Family History; DPT-1 Metabolic Risk Score: BMI, glucose, and 
  C-peptide 
Potential clinical utility for PRS: 
• Predicting at-risk children who are most likely to progress to disease (55, 75, 92, 93) 
• Discriminating between Type 1 and 2 Diabetes (93) 

Diabetes (Type 2) 
Mendelian Risk Factors: Undetermined 
Other Risk Factors: Age, Sex, Family History; BMI, waist circumference, waist-hip ratio, 
  history of hypertension, history of high blood glucose; Lifestyle: Smoking, Diet, Physical 
  Activity level 
Potential clinical utility for PRS: 
• Adding additional stratification to already accurate risk models (e.g. age, sex, BMI) (94) 
• Estimating lifetime risk trajectories (94) 

Cancers 

Breast Cancer 
Mendelian Risk Factors: Pathogenic BRCA1/2 mutations; Lower risk pathogenic variants: PALB2, 
  ATM, CHEK2 
Other Risk Factors: Age, Sex, Family History; Age at menarche, age at menopause, nulliparity 
  and age at first childbirth, BMI, Hormone replacement therapy 
Potential clinical utility for PRS: 
• Currently implemented within the BOADICEA risk model (33) 
• Added to other models including: Gail, Tyrer-Cuzick, BCSC, BI-RADS, Rosner-Colditz, NCI 
  (29, 49, 53) 
• Has value when included in risk models that can be applied to study risk-based vs. 
  age-based screening programs (26) 
• Disease subtyping: Can be used to estimate genetic risk for ER-positive or negative breast 
  cancer separately (28) 

Prostate Cancer 
Mendelian Risk Factors: Pathogenic BRCA1/2 mutations 
Other Risk Factors: Age, Sex, Family History 
Potential clinical utility for PRS: 
• Improve predictions for risk-based screening and target PSA test to those with higher 
  genetic risk (22) 
• Positive predictive value (PPV) of the PSA test increases with genetic risk (23) 

Other 

Alzheimer's Disease 
Mendelian Risk Factors: APOE ε4 and ε2 alleles 
Other Risk Factors: Age, Sex, Family History 
Potential clinical utility for PRS: 
• Current polygenic scores can explain the majority of heritability for common variants (95) 
• Polygenic Hazard Scores (PHS) to estimate age-of-onset (30) 
 
 
  
 
 
 
 
 
 
Figure 1. PRS define lifetime risk trajectories. (A) Example density plot of a population 
according to polygenic risk. The distribution is filled and labeled according to the lowest (0-20%; 
blue), population average (40-60%; grey), and highest (80-100%; red) quintiles of genetic risk. 
(B) Example of a risk trajectory (Kaplan-Meier cumulative risk curve) for the population average 
(grey), and the highest and lowest quintiles of genetic risk (coloured as in A). A representative 
risk threshold is shown as an example. 
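A cumulative risk curve of the kind described in panel B can be estimated with the Kaplan-Meier method. The sketch below is a minimal, dependency-free illustration: the follow-up ages and event indicators are made-up toy data, not values from any cohort, and the function name is our own.

```python
from typing import List, Tuple


def km_cumulative_risk(times: List[float],
                       events: List[bool]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate of cumulative risk, 1 - S(t), at each event time.

    times:  age (or follow-up time) at event or censoring
    events: True if the disease event was observed, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_total = 0
        # Group all individuals who share the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_total += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, 1.0 - survival))
        at_risk -= n_total  # events and censored both leave the risk set
    return curve


# Toy data for two PRS quintiles (hypothetical ages and outcomes).
high_quintile = km_cumulative_risk([50, 60, 60, 70, 80],
                                   [True, False, True, True, False])
low_quintile = km_cumulative_risk([55, 65, 75, 80, 80],
                                  [False, True, False, False, False])
print("highest quintile:", high_quintile)
print("lowest quintile:", low_quintile)
```

Comparing the age at which each curve crosses a chosen risk threshold gives the kind of quintile-specific contrast the figure illustrates.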
  
References 
1. Altshuler,D.M., Gibbs,R.A., Peltonen,L., Schaffner,S.F., Yu,F., Dermitzakis,E., Bonnen,P.E., 
De Bakker,P.I.W., Deloukas,P., Gabriel,S.B., et al. (2010) Integrating common and rare 
genetic variation in diverse human populations. Nature, 467, 52–58. 
2. 1000 Genomes Project Consortium, Auton,A., Brooks,L.D., Durbin,R.M., Garrison,E.P., 
Kang,H.M., Korbel,J.O., Marchini,J.L., McCarthy,S., McVean,G.A., et al. (2015) A global 
reference for human genetic variation. Nature, 526, 68–74. 
3. Buniello,A., MacArthur,J.A.L., Cerezo,M., Harris,L.W., Hayhurst,J., Malangone,C., 
McMahon,A., Morales,J., Mountjoy,E., Sollis,E., et al. (2019) The NHGRI-EBI GWAS 
Catalog of published genome-wide association studies, targeted arrays and summary 
statistics 2019. Nucleic Acids Res., 47, D1005–D1012. 
4. Abraham,G. and Inouye,M. (2015) Genomic risk prediction of complex human disease and its 
clinical application. Curr. Opin. Genet. Dev., 33, 10–16. 
5. Torkamani,A., Wineinger,N.E. and Topol,E.J. (2018) The personal and clinical utility of 
polygenic risk scores. Nat. Rev. Genet., 19, 581–590. 
6. Martin,A.R., Daly,M.J., Robinson,E.B., Hyman,S.E. and Neale,B.M. (2018) Predicting 
Polygenic Risk of Psychiatric Disorders. Biol. Psychiatry, 10.1016/j.biopsych.2018.12.015. 
7. Visscher,P.M., Wray,N.R., Zhang,Q., Sklar,P., McCarthy,M.I., Brown,M.A. and Yang,J. (2017) 
10 Years of GWAS Discovery: Biology, Function, and Translation. Am. J. Hum. Genet., 
101, 5–22. 
8. Choi,S.W., Mak,T.S.H. and O’Reilly,P.F. (2018) A guide to performing Polygenic Risk 
Score analyses. bioRxiv, 10.1101/416545. 
9. Chatterjee,N., Shi,J. and García-Closas,M. (2016) Developing and evaluating polygenic risk 
prediction models for stratified disease prevention. Nat. Rev. Genet., 17, 392–406. 
10. Pasaniuc,B. and Price,A.L. (2017) Dissecting the genetics of complex traits using summary 
association statistics. Nat. Rev. Genet., 18, 117–127. 
11. International Schizophrenia Consortium, Purcell,S.M., Wray,N.R., Stone,J.L., Visscher,P.M., 
O’Donovan,M.C., Sullivan,P.F. and Sklar,P. (2009) Common polygenic variation 
contributes to risk of schizophrenia and bipolar disorder. Nature, 460, 748–52. 
12. Evans,D.M., Visscher,P.M. and Wray,N.R. (2009) Harnessing the information contained 
within genome-wide association studies to improve individual prediction of complex 
disease risk. Hum. Mol. Genet., 18, 3525–3531. 
13. Vilhjálmsson,B.J., Yang,J., Finucane,H.K., Gusev,A., Lindström,S., Ripke,S., Genovese,G., 
Loh,P.-R., Bhatia,G., Do,R., et al. (2015) Modeling Linkage Disequilibrium Increases 
Accuracy of Polygenic Risk Scores. Am. J. Hum. Genet., 97, 576–592. 
14. Mak,T.S.H., Porsch,R.M., Choi,S.W., Zhou,X. and Sham,P.C. (2017) Polygenic scores via 
penalized regression on summary statistics. Genet. Epidemiol., 41, 469–480. 
15. Steyerberg,E.W., Vickers,A.J., Cook,N.R., Gerds,T., Gonen,M., Obuchowski,N., 
Pencina,M.J. and Kattan,M.W. (2010) Assessing the Performance of Prediction Models. 
Epidemiology, 21, 128–138. 
16. Abraham,G., Havulinna,A.S., Bhalala,O.G., Byars,S.G., De Livera,A.M., Yetukuri,L., 
Tikkanen,E., Perola,M., Schunkert,H., Sijbrands,E.J., et al. (2016) Genomic prediction of 
coronary heart disease. Eur. Heart J., 37, 3267–3278. 
17. Natarajan,P., Young,R., Stitziel,N.O., Padmanabhan,S., Baber,U., Mehran,R., Sartori,S., 
Fuster,V., Reilly,D.F., Butterworth,A., et al. (2017) Polygenic Risk Score Identifies 
Subgroup With Higher Burden of Atherosclerosis and Greater Relative Benefit From Statin 
Therapy in the Primary Prevention Setting. Circulation, 135, 2091–2101. 
18. Mega,J.L., Stitziel,N.O., Smith,J.G., Chasman,D.I., Caulfield,M., Devlin,J.J., Nordio,F., 
Hyde,C., Cannon,C.P., Sacks,F., et al. (2015) Genetic risk, coronary heart disease events, 
and the clinical benefit of statin therapy: an analysis of primary and secondary prevention 
trials. Lancet (London, England), 385, 2264–2271. 
19. Kullo,I.J., Jouni,H., Austin,E.E., Brown,S.-A., Kruisselbrink,T.M., Isseh,I.N., Haddad,R.A., 
Marroush,T.S., Shameer,K., Olson,J.E., et al. (2016) Incorporating a Genetic Risk Score 
Into Coronary Heart Disease Risk Estimates: Effect on Low-Density Lipoprotein 
Cholesterol Levels (the MI-GENES Clinical Trial). Circulation, 133, 1181–8. 
20. Hynninen,Y., Linna,M. and Vilkkumaa,E. (2018) Value of genetic testing in the prevention of 
cardiovascular events. PLoS One, 14, e0210010. 
21. Grossman,D.C., Curry,S.J., Owens,D.K., Bibbins-Domingo,K., Caughey,A.B., 
Davidson,K.W., Doubeni,C.A., Ebell,M., Epling,J.W., Kemper,A.R., et al. (2018) Screening 
for Prostate Cancer. JAMA, 319, 1901. 
22. Pashayan,N., Pharoah,P.D., Schleutker,J., Talala,K., Tammela,T.L., Määttänen,L., 
Harrington,P., Tyrer,J., Eeles,R., Duffy,S.W., et al. (2015) Reducing overdiagnosis by 
polygenic risk-stratified screening: findings from the Finnish section of the ERSPC. Br. J. 
Cancer, 113, 1086–1093. 
23. Seibert,T.M., Fan,C.C., Wang,Y., Zuber,V., Karunamuni,R., Parsons,J.K., Eeles,R.A., 
Easton,D.F., Kote-Jarai,Z., Al Olama,A.A., et al. (2018) Polygenic hazard score to guide 
screening for aggressive prostate cancer: development and validation in large scale 
cohorts. BMJ, 360, j5757. 
24. Pashayan,N., Duffy,S.W., Neal,D.E., Hamdy,F.C., Donovan,J.L., Martin,R.M., Harrington,P., 
Benlloch,S., Amin Al Olama,A., Shah,M., et al. (2015) Implications of polygenic risk-
stratified screening for prostate cancer on overdiagnosis. Genet. Med., 17, 789–795. 
25. Shieh,Y., Eklund,M., Madlensky,L., Sawyer,S.D., Thompson,C.K., Stover Fiscalini,A., Ziv,E., 
van’t Veer,L.J., Esserman,L.J. and Tice,J.A. (2017) Breast Cancer Screening in the 
Precision Medicine Era: Risk-Based Screening in a Population-Based Trial. J. Natl. Cancer 
Inst., 109, djw290. 
26. Pashayan,N., Morris,S., Gilbert,F.J. and Pharoah,P.D.P. (2018) Cost-effectiveness and 
Benefit-to-Harm Ratio of Risk-Stratified Screening for Breast Cancer. JAMA Oncol., 4, 
1504. 
27. Inouye,M., Abraham,G., Nelson,C.P., Wood,A.M., Sweeting,M.J., Dudbridge,F., Lai,F.Y., 
Kaptoge,S., Brozynska,M., Wang,T., et al. (2018) Genomic Risk Prediction of Coronary 
Artery Disease in 480,000 Adults: Implications for Primary Prevention. J. Am. Coll. Cardiol., 
72, 1883–1893. 
28. Mavaddat,N., Michailidou,K., Dennis,J., Lush,M., Fachal,L., Lee,A., Tyrer,J.P., Chen,T.-H., 
Wang,Q., Bolla,M.K., et al. (2019) Polygenic Risk Scores for Prediction of Breast Cancer 
and Breast Cancer Subtypes. Am. J. Hum. Genet., 104, 21–34. 
29. Läll,K., Lepamets,M., Palover,M., Esko,T., Metspalu,A., Tõnisson,N., Padrik,P., Mägi,R. and 
Fischer,K. (2019) Polygenic prediction of breast cancer: comparison of genetic predictors 
and implications for risk stratification. BMC Cancer, 19, 557. 
30. Desikan,R.S., Fan,C.C., Wang,Y., Schork,A.J., Cabral,H.J., Cupples,L.A., Thompson,W.K., 
Besser,L., Kukull,W.A., Holland,D., et al. (2017) Genetic assessment of age-associated 
Alzheimer disease risk: Development and validation of a polygenic hazard score. PLoS 
Med., 14, 1–17. 
31. Tan,C.H., Hyman,B.T., Tan,J.J.X., Hess,C.P., Dillon,W.P., Schellenberg,G.D., Besser,L.M., 
Kukull,W.A., Kauppi,K., McEvoy,L.K., et al. (2017) Polygenic hazard scores in preclinical 
Alzheimer disease. Ann. Neurol., 82, 484–488. 
32. Khera,A. V., Chaffin,M., Wade,K.H., Zahid,S., Brancale,J., Xia,R., Distefano,M., Senol-
Cosar,O., Haas,M.E., Bick,A., et al. (2019) Polygenic Prediction of Weight and Obesity 
Trajectories from Birth to Adulthood. Cell, 177, 587-596.e9. 
33. Lee,A., Mavaddat,N., Wilcox,A.N., Cunningham,A.P., Carver,T., Hartley,S., Babb de 
Villiers,C., Izquierdo,A., Simard,J., Schmidt,M.K., et al. (2019) BOADICEA: a 
comprehensive breast cancer risk prediction model incorporating genetic and nongenetic 
risk factors. Genet. Med., 0, 1. 
34. Yang,J., Visscher,P.M. and Wray,N.R. (2010) Sporadic cases are the norm for complex 
disease. Eur. J. Hum. Genet., 18, 1039–1043. 
35. Do,C.B., Hinds,D.A., Francke,U. and Eriksson,N. (2012) Comparison of Family History and 
SNPs for Predicting Risk of Complex Disease. PLoS Genet., 8, e1002973. 
36. Tada,H., Melander,O., Louie,J.Z., Catanese,J.J., Rowland,C.M., Devlin,J.J., Kathiresan,S. 
and Shiffman,D. (2016) Risk prediction by genetic risk scores for coronary heart disease is 
independent of self-reported family history. Eur. Heart J., 37, 561–7. 
37. Weedon,M., Jackson,L., Harrison,J., Ruth,K., Tyrrell,J., Hattersley,A. and Wright,C. (2019) 
Very rare pathogenic genetic variants detected by SNP-chips are usually false positives: 
implications for direct-to-consumer genetic testing. bioRxiv, 10.1101/696799. 
38. Khera,A. V., Chaffin,M., Aragam,K.G., Haas,M.E., Roselli,C., Choi,S.H., Natarajan,P., 
Lander,E.S., Lubitz,S.A., Ellinor,P.T., et al. (2018) Genome-wide polygenic scores for 
common diseases identify individuals with risk equivalent to monogenic mutations. Nat. 
Genet., 50, 1219–1224. 
39. Khera,A. V., Chaffin,M., Zekavat,S.M., Collins,R.L., Roselli,C., Natarajan,P., Lichtman,J.H., 
D’Onofrio,G., Mattera,J.A., Dreyer,R.P., et al. (2018) Whole Genome Sequencing to 
Characterize Monogenic and Polygenic Contributions in Patients Hospitalized with Early-
Onset Myocardial Infarction. Circulation, 10.1161/CIRCULATIONAHA.118.035658. 
40. Paquette,M., Chong,M., Thériault,S., Dufour,R., Paré,G. and Baass,A. (2017) Polygenic risk 
score predicts prevalence of cardiovascular disease in patients with familial 
hypercholesterolemia. J. Clin. Lipidol., 11, 725-732.e5. 
41. Lecarpentier,J., Silvestri,V., Kuchenbaecker,K.B., Barrowdale,D., Dennis,J., McGuffog,L., 
Soucy,P., Leslie,G., Rizzolo,P., Navazio,A.S., et al. (2017) Prediction of Breast and 
Prostate Cancer Risks in Male BRCA1 and BRCA2 Mutation Carriers Using Polygenic Risk 
Scores. J. Clin. Oncol., 35, 2240–2250. 
42. Kuchenbaecker,K.B., McGuffog,L., Barrowdale,D., Lee,A., Soucy,P., Dennis,J., 
Domchek,S.M., Robson,M., Spurdle,A.B., Ramus,S.J., et al. (2017) Evaluation of 
Polygenic Risk Scores for Breast and Ovarian Cancer Risk Prediction in BRCA1 and 
BRCA2 Mutation Carriers. J. Natl. Cancer Inst., 109, 248–252. 
43. Escott-Price,V., Sims,R., Bannister,C., Harold,D., Vronskaya,M., Majounie,E., 
Badarinarayan,N., GERAD/PERADES, IGAP consortia, Morgan,K., et al. (2015) Common 
polygenic variation enhances risk prediction for Alzheimer’s disease. Brain, 138, 3673–84. 
44. van der Lee,S.J., Wolters,F.J., Ikram,M.K., Hofman,A., Ikram,M.A., Amin,N. and van 
Duijn,C.M. (2018) The effect of APOE and other common genetic variants on the onset of 
Alzheimer’s disease and dementia: a community-based cohort study. Lancet Neurol., 17, 
434–444. 
45. Stocker,H., Möllers,T., Perna,L. and Brenner,H. (2018) The genetic risk of Alzheimer’s 
disease beyond APOE ε4: systematic review of Alzheimer’s genetic risk scores. Transl. 
Psychiatry, 8, 166. 
46. Hindy,G., Wiberg,F., Almgren,P., Melander,O. and Orho-Melander,M. (2018) Polygenic Risk 
Score for Coronary Heart Disease Modifies the Elevated Risk by Cigarette Smoking for 
Disease Incidence. Circ. Genomic Precis. Med., 11, e001856. 
47. Rudolph,A., Song,M., Brook,M.N., Milne,R.L., Mavaddat,N., Michailidou,K., Bolla,M.K., 
Wang,Q., Dennis,J., Wilcox,A.N., et al. (2018) Joint associations of a polygenic risk score 
and environmental risk factors for breast cancer in the Breast Cancer Association 
Consortium. Int. J. Epidemiol., 47, 526–536. 
48. Willoughby,A., Andreassen,P.R. and Toland,A.E. (2019) Genetic testing to guide risk-
stratified screens for breast cancer. J. Pers. Med., 9. 
49. Fung,S.M., Wong,X.Y., Lee,S.X., Miao,H., Hartman,M. and Wee,H.L. (2019) Performance of 
single-nucleotide polymorphisms in breast cancer risk prediction models: A Systematic 
Review and Meta-analysis. Cancer Epidemiol. Biomarkers Prev., 28, 506–521. 
50. Lakeman,I.M.M., Hilbers,F.S., Rodríguez-Girondo,M., Lee,A., Vreeswijk,M.P.G., 
Hollestelle,A., Seynaeve,C., Meijers-Heijboer,H., Oosterwijk,J.C., Hoogerbrugge,N., et al. 
(2019) Addition of a 161-SNP polygenic risk score to family history-based risk prediction: 
impact on clinical management in non-BRCA1/2 breast cancer families. J. Med. Genet., 
10.1136/jmedgenet-2019-106072. 
51. Mavaddat,N., Pharoah,P.D.P., Michailidou,K., Tyrer,J., Brook,M.N., Bolla,M.K., Wang,Q., 
Dennis,J., Dunning,A.M., Shah,M., et al. (2015) Prediction of Breast Cancer Risk Based on 
Profiling With Common Genetic Variants. JNCI J. Natl. Cancer Inst., 107, 1–15. 
52. Dite,G.S., Macinnis,R.J., Bickerstaffe,A., Dowty,J.G., Allman,R., Apicella,C., Milne,R.L., 
Tsimiklis,H., Phillips,K.A., Giles,G.G., et al. (2016) Breast cancer risk prediction using 
clinical models and 77 independent risk-associated SNPs for women aged under 50 years: 
Australian breast cancer family registry. Cancer Epidemiol. Biomarkers Prev., 25, 359–365. 
53. Zhang,X., Rice,M., Tworoger,S.S., Rosner,B.A., Eliassen,A.H., Tamimi,R.M., Joshi,A.D., 
Lindstrom,S., Qian,J., Colditz,G.A., et al. (2018) Addition of a polygenic risk score, 
mammographic density, and endogenous hormones to existing breast cancer risk 
prediction models: A nested case–control study. PLOS Med., 15, e1002644. 
54. Vachon,C.M., Scott,C.G., Tamimi,R.M., Thompson,D.J., Fasching,P.A., Stone,J., 
Southey,M.C., Winham,S., Lindström,S., Lilyquist,J., et al. (2019) Joint association of 
mammographic density adjusted for age and body mass index and polygenic risk score 
with breast cancer risk. Breast Cancer Res., 21, 1–10. 
55. Redondo,M.J., Geyer,S., Steck,A.K., Sharp,S., Wentworth,J.M., Weedon,M.N., Antinozzi,P., 
Sosenko,J., Atkinson,M., Pugliese,A., et al. (2018) A type 1 diabetes genetic risk score 
predicts progression of islet autoimmunity and development of type 1 diabetes in 
individuals at risk. Diabetes Care, 41, 1887–1894. 
56. Natarajan,P. (2018) Polygenic Risk Scoring for Coronary Heart Disease. J. Am. Coll. 
Cardiol., 72, 1894–1897. 
57. Morales,J., Welter,D., Bowler,E.H., Cerezo,M., Harris,L.W., McMahon,A.C., Hall,P., 
Junkins,H.A., Milano,A., Hastings,E., et al. (2018) A standardized framework for 
representation of ancestry data in genomics studies, with application to the NHGRI-EBI 
GWAS Catalog. Genome Biol., 19, 21. 
58. Martin,A.R., Kanai,M., Kamatani,Y., Okada,Y., Neale,B.M. and Daly,M.J. (2019) Clinical use 
of current polygenic risk scores may exacerbate health disparities. Nat. Genet., 51, 584–
591. 
59. Ware,E.B., Schmitz,L.L., Faul,J., Gard,A., Smith,J.A., Zhao,W., Weir,D. and Kardia,S.L.R. 
(2017) Heterogeneity in polygenic scores for common human traits. bioRxiv, 
10.1101/106062. 
60. Reisberg,S., Iljasenko,T., Läll,K., Fischer,K. and Vilo,J. (2017) Comparing distributions of 
polygenic risk scores of type 2 diabetes and coronary heart disease within different 
populations. PLoS One, 12, e0179238. 
61. Kim,M.S., Patel,K.P., Teng,A.K., Berens,A.J. and Lachance,J. (2018) Genetic disease risks 
can be misestimated across global populations. Genome Biol., 19, 179. 
62. Onengut-Gumuscu,S., Chen,W.-M., Robertson,C.C., Bonnie,J.K., Farber,E., Zhu,Z., 
Oksenberg,J.R., Brant,S.R., Bridges,S.L., Edberg,J.C., et al. (2019) Type 1 Diabetes Risk 
in African-Ancestry Participants and Utility of an Ancestry-Specific Genetic Risk Score. 
Diabetes Care, 42, 406–415. 
63. Curtis,D. (2018) Polygenic risk score for schizophrenia is more strongly associated with 
ancestry than with schizophrenia. Psychiatr. Genet., 28, 85–89. 
64. Martin,A.R., Gignoux,C.R., Walters,R.K., Wojcik,G.L., Neale,B.M., Gravel,S., Daly,M.J., 
Bustamante,C.D. and Kenny,E.E. (2017) Human Demographic History Impacts Genetic 
Risk Prediction across Diverse Populations. Am. J. Hum. Genet., 100, 635–649. 
65. Price,A.L., Zaitlen,N.A., Reich,D. and Patterson,N. (2010) New approaches to population 
stratification in genome-wide association studies. Nat. Rev. Genet., 11, 459–463. 
66. Haworth,S., Mitchell,R., Corbin,L., Wade,K.H., Dudding,T., Budu-Aggrey,A., Carslake,D., 
Hemani,G., Paternoster,L., Smith,G.D., et al. (2019) Apparent latent structure within the UK 
Biobank sample has implications for epidemiological analysis. Nat. Commun., 10, 333. 
67. Kerminen,S., Martin,A.R., Koskela,J., Ruotsalainen,S.E., Havulinna,A.S., Surakka,I., 
Palotie,A., Perola,M., Salomaa,V., Daly,M.J., et al. (2019) Geographic Variation and Bias in 
the Polygenic Scores of Complex Diseases and Traits in Finland. Am. J. Hum. Genet., 104, 
1169–1181. 
68. Mahajan,A., Go,M.J., Zhang,W., Below,J.E., Gaulton,K.J., Ferreira,T., Horikoshi,M., 
Johnson,A.D., Ng,M.C.Y., Prokopenko,I., et al. (2014) Genome-wide trans-ancestry meta-
analysis provides insight into the genetic architecture of type 2 diabetes susceptibility. Nat. 
Genet., 46, 234–244. 
69. Waters,K.M., Stram,D.O., Hassanein,M.T., Le Marchand,L., Wilkens,L.R., Maskarinec,G., 
Monroe,K.R., Kolonel,L.N., Altshuler,D., Henderson,B.E., et al. (2010) Consistent 
Association of Type 2 Diabetes Risk Variants Found in Europeans in Diverse Racial and 
Ethnic Groups. PLoS Genet., 6, e1001078. 
70. Hassanali,N., De Silva,N.M.G., Robertson,N., Rayner,N.W., Barrett,A., Bennett,A.J., 
Groves,C.J., Matthews,D.R., Katulanda,P., Frayling,T.M., et al. (2014) Evaluation of 
Common Type 2 Diabetes Risk Variants in a South Asian Population of Sri Lankan 
Descent. PLoS One, 9, e98608. 
71. Gan,W., Walters,R.G., Holmes,M. V., Bragg,F., Millwood,I.Y., Banasik,K., Chen,Y., Du,H., 
Iona,A., Mahajan,A., et al. (2016) Evaluation of type 2 diabetes genetic risk variants in 
Chinese adults: findings from 93,000 individuals from the China Kadoorie Biobank. 
Diabetologia, 59, 1446–1457. 
72. Wojcik,G.L., Graff,M., Nishimura,K.K., Tao,R., Haessler,J., Gignoux,C.R., Highland,H.M., 
Patel,Y.M., Sorokin,E.P., Avery,C.L., et al. (2019) Genetic analyses of diverse populations 
improves discovery for complex traits. Nature, 570, 514–518. 
73. Gurdasani,D., Barroso,I., Zeggini,E. and Sandhu,M.S. (2019) Genomics of disease risk in 
globally diverse populations. Nat. Rev. Genet., 10.1038/s41576-019-0144-0. 
74. Lam,M., Chen,C.-Y., Li,Z., Martin,A., Bryois,J., Ma,X., Gaspar,H., Ikeda,M., Benyamin,B., 
Brown,B., et al. (2018) Comparative genetic architectures of schizophrenia in East Asian 
and European populations. bioRxiv, 10.1101/445874. 
75. Perry,D.J., Wasserfall,C.H., Oram,R.A., Williams,M.D., Posgai,A., Muir,A.B., Haller,M.J., 
Schatz,D.A., Wallet,M.A., Mathews,C.E., et al. (2018) Application of a Genetic Risk Score 
to Racially Diverse Type 1 Diabetes Populations Demonstrates the Need for Diversity in 
Risk-Modeling. Sci. Rep., 8, 4529. 
76. Starlard-Davenport,A., Allman,R., Dite,G.S., Hopper,J.L., Tuff,E.S., Macleod,S., 
Kadlubar,S., Preston,M. and Henry-Tillman,R. (2018) Validation of a genetic risk score for 
Arkansas women of color. PLoS One, 13. 
77. Shieh,Y., Fejerman,L., Lott,P.C., Marker,K., Sawyer,S.D., Hu,D., Huntsman,S., Torres,J., 
Echeverry,M., Bohorquez,M.E., et al. (2019) A polygenic risk score for breast cancer in 
U.S. Latinas and Latin-American women. bioRxiv, 10.1101/598730. 
78. Khramtsova,E.A., Davis,L.K. and Stranger,B.E. (2019) The role of sex in the genomics 
of human complex traits. Nat. Rev. Genet., 20, 173–190. 
79. Censin,J.C., Bovijn,J., Ferreira,T., Pulit,S.L., Magi,R., Mahajan,A., Holmes,M. V. and 
Lindgren,C.M. (2019) Causal relevance of obesity on the leading causes of death in 
women and men: A Mendelian randomization study. bioRxiv, 10.1101/523217. 
80. Tan,C.H., Fan,C.C., Mormino,E.C., Sugrue,L.P., Broce,I.J., Hess,C.P., Dillon,W.P., 
Bonham,L.W., Yokoyama,J.S., Karch,C.M., et al. (2018) Polygenic hazard score: an 
enrichment marker for Alzheimer’s associated amyloid and tau deposition. Acta 
Neuropathol., 135, 85–93. 
81. Gibson,G. (2019) On the utilization of polygenic risk scores for therapeutic targeting. PLOS 
Genet., 15, e1008060. 
82. Wray,N.R., Yang,J., Goddard,M.E. and Visscher,P.M. (2010) The Genetic Interpretation of 
Area under the ROC Curve in Genomic Profiling. PLoS Genet., 6, e1000864. 
83. So,H.-C. and Sham,P.C. (2010) A Unifying Framework for Evaluating the Predictive Power 
of Genetic Variants Based on the Level of Heritability Explained. PLoS Genet., 6, 
e1001230. 
84. Dudbridge,F. (2013) Power and Predictive Accuracy of Polygenic Risk Scores. PLoS 
Genet., 9, e1003348. 
85. Chatterjee,N., Wheeler,B., Sampson,J., Hartge,P., Chanock,S.J. and Park,J.H. (2013) 
Projecting the performance of risk prediction based on polygenic analyses of genome-wide 
association studies. Nat. Genet., 45, 400–405. 
86. Wainschtein,P., Jain,D.P., Yengo,L. and Zheng,Z. (2019) Recovery of trait heritability from 
whole genome sequence data. bioRxiv, 10.1101/588020. 
87. Wray,N.R., Kemper,K.E., Hayes,B.J., Goddard,M.E. and Visscher,P.M. (2019) Complex 
Trait Prediction from Genome Data: Contrasting EBV in Livestock to PRS in Humans. 
Genetics, 211, 1131–1141. 
88. Turley,P., Walters,R.K., Maghzian,O., Okbay,A., Lee,J.J., Fontana,M.A., Nguyen-Viet,T.A., 
Wedow,R., Zacher,M., Furlotte,N.A., et al. (2018) Multi-trait analysis of genome-wide 
association summary statistics using MTAG. Nat. Genet., 50, 229–237. 
89. Maier,R.M., Zhu,Z., Lee,S.H., Trzaskowski,M., Ruderfer,D.M., Stahl,E.A., Ripke,S., 
Wray,N.R., Yang,J., Visscher,P.M., et al. (2018) Improving genetic prediction by leveraging 
genetic correlations among human diseases and traits. Nat. Commun., 9, 989. 
90. Song,M., Zheng,Y., Qi,L., Hu,F.B., Chan,A.T. and Giovannucci,E.L. (2018) Longitudinal 
analysis of genetic susceptibility and BMI throughout adult life. Diabetes, 67, 248–255. 
91. Brandkvist,M., Bjørngaard,J.H., Ødegård,R.A., Åsvold,B.O., Sund,E.R. and Vie,G.Å. (2019) 
Quantifying the impact of genes on body mass index during the obesity epidemic: 
longitudinal findings from the HUNT Study. BMJ, 10.1136/bmj.l4067. 
92. Bonifacio,E., Beyerlein,A., Hippich,M., Winkler,C., Vehik,K., Weedon,M.N., Laimighofer,M., 
Hattersley,A.T., Krumsiek,J., Frohnert,B.I., et al. (2018) Genetic scores to stratify risk of 
developing multiple islet autoantibodies and type 1 diabetes: A prospective study in 
children. PLoS Med., 15, e1002548. 
93. Sharp,S.A., Rich,S.S., Wood,A.R., Jones,S.E., Beaumont,R.N., Harrison,J.W., 
Schneider,D.A., Locke,J.M., Tyrrell,J., Weedon,M.N., et al. (2019) Development and 
Standardization of an Improved Type 1 Diabetes Genetic Risk Score for Use in Newborn 
Screening and Incident Diagnosis. Diabetes Care, 42, 200–207. 
94. Läll,K., Mägi,R., Morris,A., Metspalu,A. and Fischer,K. (2017) Personalized risk prediction 
for type 2 diabetes: the potential of genetic risk scores. Genet. Med., 19, 322–329. 
95. Escott-Price,V., Shoai,M., Pither,R., Williams,J. and Hardy,J. (2017) Polygenic score 
prediction captures nearly all common genetic risk for Alzheimer’s disease. Neurobiol. 
Aging, 49, 214. 
 
Downloaded from https://academic.oup.com/hmg/advance-article-abstract/doi/10.1093/hmg/ddz187/5540980 by University of Cambridge user on 31 July 2019