Mendelian randomization for cardiovascular diseases: principles and applications

Susanna C. Larsson1,2, Adam S. Butterworth3-7, and Stephen Burgess3,4,8*

1 Unit of Medical Epidemiology, Department of Surgical Sciences, Uppsala University, Uppsala, Sweden
2 Unit of Cardiovascular and Nutritional Epidemiology, Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
3 British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
4 Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Papworth Road, Cambridge, United Kingdom
5 British Heart Foundation Centre of Research Excellence, School of Clinical Medicine, Addenbrooke’s Hospital, University of Cambridge, Cambridge, United Kingdom
6 Health Data Research UK, Wellcome Genome Campus and University of Cambridge, Hinxton, United Kingdom
7 NIHR Blood and Transplant Research Unit in Donor Health and Behaviour, University of Cambridge, Cambridge, United Kingdom
8 MRC Biostatistics Unit, University of Cambridge, Cambridge, United Kingdom

* Corresponding author. Email: sb452@medschl.cam.ac.uk

Word count for abstract: 180 words
Word count for text only: 4985 words

Abstract

Large-scale genome-wide association studies conducted over the last decade have uncovered numerous genetic variants associated with cardiometabolic traits and risk factors. These discoveries have enabled the Mendelian randomization (MR) design, which uses genetic variation as a natural experiment to improve causal inferences from observational data. By analogy with the random assignment of treatment in randomized controlled trials, the random segregation of genetic alleles when DNA is transmitted from parents to offspring at gamete formation is expected to reduce confounding in genetic associations. MR analyses make a set of assumptions that must hold for valid results.
Provided that the assumptions are well justified for the genetic variants that are employed as instrumental variables, MR studies can inform on whether a putative risk factor likely has a causal effect on the disease or not. MR has been increasingly applied over recent years to predict the efficacy and safety of existing and novel drugs targeting cardiovascular risk factors, and to explore the repurposing potential of available drugs. This review article describes the principles of the MR design and some applications in cardiovascular epidemiology.

Keywords: Cardiovascular disease; Genetics; Mendelian randomization; Single-nucleotide polymorphisms

Introduction

Identification of causal risk factors and effective treatments for prevention of cardiovascular disease (CVD), the leading cause of morbidity and premature death worldwide,1 is crucial from both individual and societal perspectives. Randomized controlled trials (RCTs) are considered the gold standard design to infer causality. However, RCTs are expensive, time consuming, and often unfeasible to conduct, for example because of poor long-term compliance and ethical issues about random treatment allocation. Thus, relationships of modifiable risk factors with CVD events have mostly been investigated using observational study designs, such as case-control and cohort studies, which cannot reliably infer causality as confounding and reverse causation bias can distort the findings. Large-scale genome-wide association studies (GWAS) performed over the last decade have uncovered numerous genetic variants associated with cardiovascular risk factors, such as body mass index (BMI),2 glycemic traits,3 blood pressure,4 blood lipids,5 alcohol and tobacco use,6 coffee consumption,7 and physical activity,8 as well as CVD outcomes9-14 (Table 1). These discoveries have enabled the Mendelian randomization (MR) design, which employs genetic variation as a natural experiment to improve causal inferences from observational data.
This state-of-the-art review describes the principles and some applications of the MR design to improve causal inference in cardiovascular epidemiology. Information included in this review is based on literature published through 1 July 2023.

What is a Mendelian randomization study?

MR is an application of instrumental variable analysis, which aims to test a causal hypothesis in non-experimental data. In an MR analysis, genetic variants, commonly single-nucleotide polymorphisms, are used as instrumental variables for the putative risk factor. The principle of MR refers to Mendel’s second law of independent segregation of genetic alleles when DNA is transmitted from parents to offspring at gamete formation. This is similar to the random assignment of treatment in an RCT, which aims to produce groups with similar clinical characteristics, hence reducing the risk of confounding. Figure 1 illustrates the analogy of an RCT and MR study investigating the effect of higher serum calcium levels on coronary heart disease risk. In this example, participants in the RCT are randomly assigned to receive either placebo or calcium supplements, leading to higher serum calcium levels in the treatment group.15 Analogously, in the MR study, the study population is ‘randomized’ by genetic variants that associate with serum calcium levels; for each variant, a participant may inherit the allele that raises serum calcium levels, or the allele that does not raise serum calcium levels. In both studies, the randomization is independent of confounding factors, and allows inferences on the effect of elevated serum calcium levels on coronary heart disease risk. If the randomized group in an RCT (or the genetically ‘randomized’ group in MR) with higher average levels of serum calcium also has a higher risk of coronary heart disease, this is indicative of a causal effect of calcium levels on coronary heart disease risk.
An MR study also diminishes the risk of reverse causation bias as genetic variants are unchangeable and cannot be influenced by disease status. A glossary of common terms used in MR studies is provided in Box 1.

What are the advantages?

MR studies have several advantages over RCTs. They are often faster and cheaper to conduct, as they can be conducted using existing large-scale GWAS data. MR studies can inform on potential causal relationships between modifiable risk factors and rare diseases that would require extensive sample sizes and long-term follow-up for sufficient endpoints to occur in an RCT. Moreover, MR studies can investigate exposures with expected adverse effects on disease risk, which would be unethical to test in trials. RCTs for a modifiable risk factor or medical treatment usually examine short-term effects as long-term compliance can be difficult to achieve and cost increases with longer duration. In contrast, as genetic variants are fixed at conception, MR results reflect the effects of life-long perturbations in the risk factor. Thus, MR is a valuable study design to overcome several of the limitations and problems confronted in conventional observational studies and RCTs. Nonetheless, MR should not be considered a panacea as this design comes with its own set of assumptions and caveats, as described below.

What are the assumptions?

The three core assumptions that must hold for valid results in an MR analysis are illustrated in Figure 2. Specifically, the genetic variant (or multiple genetic variants) used as an instrumental variable for the risk factor must: (1) reliably associate with the risk factor under investigation (relevance assumption); (2) not associate with any known or unknown confounding factors (independence assumption); and (3) influence the outcome only through the risk factor and not through any direct causal pathway (exclusion restriction assumption).
The first assumption can be tested by choosing genetic variants that are significantly associated with the risk factor in a GWAS. Typically, genetic variants selected as instrumental variables are associated with the risk factor at the conventional level of genome-wide significance (P<5×10⁻⁸), although increasingly, Mendelian randomization analyses are conducted using variants from gene regions chosen based on prior knowledge about the relevance of the gene function to the risk factor. The plausibility of the second assumption can be evaluated by examining whether the genetic variant is associated with competing risk factors. The third assumption cannot be assessed directly but must be justified by biological knowledge. In MR analyses involving multiple genetic variants, the plausibility of the assumptions can also be assessed by statistical methods (e.g., MR-Egger test for pleiotropy, see Box 1).

What are the caveats?

A primary concern for the validity of results from an MR analysis is pleiotropy, specifically ‘horizontal pleiotropy’ whereby a genetic variant affects the outcome through a pathway that does not involve the risk factor of interest. This would violate the MR assumptions and can be caused by multiple biological functions of the gene. It can also occur where the variant associates with a factor (e.g., educational attainment) that is upstream of the risk factor of interest and which associates with multiple downstream risk factors (e.g., lifestyle factors) that affect the outcome via distinct pathways. Horizontal pleiotropy can produce a spurious, non-causal association between genetic predictors of the studied risk factor and the outcome but can also result in a false negative finding if the pleiotropic effect counteracts the true causal effect of the risk factor on the outcome.
As an example, use of genetic variants that associate with coffee consumption (risk factor of interest) but also with another risk factor, such as smoking (confounder), that is not on the causal pathway from coffee consumption to coronary heart disease would give an estimate of the association between genetically predicted coffee consumption and coronary heart disease that does not correspond to the true causal effect (Figure 3). Another type of pleiotropy, termed ‘vertical pleiotropy’, occurs when a genetic variant associates with another factor on the causal pathway from the genetic variants via the risk factor to the outcome, such that any causal pathway from the variants to the outcome passes through the risk factor. This type of pleiotropy does not invalidate MR estimates. Indeed, some MR studies seek to uncover factors that lie on the causal pathway from the studied risk factor to disease, as these are potential mediators of the causal relationship and can improve our mechanistic understanding about causal pathways. As an example, the association between higher physical activity level and reduced risk of coronary heart disease may be mediated via BMI8 (Figure 3). However, distinguishing between horizontal and vertical pleiotropy is primarily dependent on our biological understanding of the relationships between the genetic variants, exposure, outcome, and pleiotropic factors.

Linkage disequilibrium (LD), which refers to the correlation of genetic variants in the population, is another potential caveat in MR studies. Genetic variants in physical proximity on the same chromosome can be in LD. Confounding would result if the genetic variant used to proxy the risk factor of interest is in LD (i.e., it is correlated) with another genetic variant that is associated with the outcome through a pathway that does not involve the risk factor of interest.
As an example, a study of genetically predicted glucose-dependent insulinotropic polypeptide receptor (GIPR) agonism in relation to CVD occurrence showed that the association of higher GIPR-mediated fasting glucose-dependent insulinotropic polypeptide levels with coronary artery disease risk was not driven by GIPR variants but was the result of LD confounding between variants at the GIPR locus and a variant in SNRPD2, an established coronary artery disease risk locus.16

What are the limitations?

A shortcoming of the MR design is that it can only be applied to risk factors for which suitable genetic variants are available. Genetic variants typically have a small effect on most risk factors (i.e., they explain a small proportion of the variation), which can lead to low statistical power in the MR analysis and the risk of false negative findings. The proportion of variance explained and thus the statistical power can be increased by utilizing multiple genetic variants associated with the risk factor as instrumental variables. For example, the fat mass and obesity-associated gene (FTO) is the locus with the largest effect on BMI, but this locus explains less than 0.5% of the variation in BMI in populations of European ancestries and even less in populations of other ancestries.17 The corresponding variation explained by all near-independent genetic variants (n=941) found to be associated with BMI in a GWAS meta-analysis involving ~700 000 European ancestry individuals was ~6%.2 The amount of variation explained by known genetic variants is often below 5% for complex phenotypes. MR studies of such phenotypes require very large sample sizes, particularly large numbers of cases, to achieve reasonable power to detect weak to modest effects. The variation in the risk factor explained by genetics is higher for risk factors that are less influenced by environmental factors.
As an example, circulating lipoprotein(a) (Lp[a]) levels are mainly determined by genetic variations at the LPA locus. Genetic variants that have been used to proxy the effect of Lp(a) explain over 60% of the variation in Lp(a) levels.18

One- or two-sample MR study?

In a one-sample MR study, the genetic variant-risk factor association and the genetic variant-outcome association are obtained from the same individuals, while in a two-sample MR study those associations come from independent study populations. For example, a two-sample MR study on serum calcium levels and coronary heart disease risk can involve summarized (i.e., aggregated) data for the genetic associations with serum calcium from one study19 and the corresponding data for the genetic associations with coronary artery disease from another study.20,21 An advantage of the two-sample design is that statistical power is typically greater as existing summarized data from large-scale GWAS consortia can be used. The two-sample design comes with the requirement that the two samples represent similar underlying populations (or better still, the same population) as the genetic variants identified to be associated with the risk factor in the first sample should be reliable predictors of the risk factor also in the outcome dataset. This assumption may not hold if age, sex, ancestry, or other characteristics differ between the two samples. For example, genetic variants associated with smoking heaviness in a GWAS analysis involving smokers only would be unsuitable as instrumental variables in an MR analysis with outcome data from a population largely consisting of non-smokers.
Ideally, there should be no overlap of the populations in the two samples as overlap of cases can bias MR estimates in the direction of the observational association, especially when the genetic associations with the risk factor are not strong.22 A limitation of using summarized data is the reliance on the validity of GWAS results reported by other research groups, and reduced flexibility in the MR analysis. Using individual-level data enables more comprehensive analyses, such as non-linear MR analysis or analysis of a specific subgroup (e.g., among smokers only). GWAS data that can be used in two-sample MR analyses are publicly available for many phenotypes and disease outcomes. A few examples of large-scale GWAS studies are listed in Table 1.

How to select genetic variants?

There are two typical strategies for selection of genetic variants for use in an MR analysis. Genetic variant selection can be based on biological rationale or by including all independent genetic variants associated with the risk factor irrespective of biological function. For example, MR studies of the association between alcohol consumption and CVD have either used variants in genes that encode enzymes with a key role in alcohol metabolism,23,24 or all independent genetic variants associated with alcohol consumption at the genome-wide significance level in large consortium data.6 Alcohol (ethanol) is metabolized in the liver via two steps: firstly by alcohol dehydrogenases and secondly by acetaldehyde dehydrogenases.
Genetic variants in the coding gene regions of these enzymes affect alcohol drinking behaviors as accumulation of the intermediate product of this two-step reaction (i.e., acetaldehyde) produces discomfort, such as facial flushing and increases of pulse rate and skin temperature, at sufficient concentrations.25 MR studies have demonstrated that higher alcohol consumption proxied by one or more variants in the coding gene regions for the alcohol or acetaldehyde dehydrogenases is associated with higher systolic blood pressure (SBP)23,24,26,27 and increased risk of coronary heart disease23,27 and stroke.24,27

Alcohol drinking behavior and many other phenotypes are polygenic, meaning that they are influenced by variants in many genes. When multiple genetic variants are available as instruments, with individual-level data, the variants can be combined into a polygenic score, which is the weighted sum of genotypes over many variants.28 This score can then be used as an instrumental variable. In a two-sample MR analysis based on summarized data, each variant provides its own MR ratio estimate, and these estimates are combined by taking a weighted average. Similar to MR analyses of alcohol consumption proxied by variants in genes involved in alcohol metabolism, a two-sample MR analysis showed that higher alcohol consumption proxied by 94 genetic variants was associated with higher SBP and stroke risk.27

The two strategies to select genetic variants come with different strengths and limitations. The advantage of using few genetic variants with a clear biological role in influencing the putative risk factor is that the likelihood of pleiotropic effects is typically lower.
For example, the instrument comprising all genetic variants associated with alcohol consumption was associated with smoking liability in UK Biobank.27 An advantage of using all genetic variants associated with the risk factor is that statistical power can be greater as the proportion of variation in the risk factor explained increases with the number of genetic variants employed as instrumental variables. Furthermore, when many genetic variants are available, a broad range of sensitivity analyses can be employed to test the MR assumptions.

How to analyze the data and obtain causal estimates?

For a single genetic variant, the MR estimate can be obtained by dividing the variant-outcome association by the variant-risk factor association. The ratio is known as the Wald estimate. In a two-sample MR study based on multiple genetic variants and summarized data, the causal estimate can be obtained by the inverse-variance weighted method, which is a meta-analysis of the single Wald ratios and is the most efficient method (greatest statistical power) but is sensitive to pleiotropy.29 Several other methods that are more robust to pleiotropy but typically less efficient, such as the weighted median,30 MR-Egger,31 and MR-PRESSO32 methods, are commonly used as sensitivity analyses. These approaches require the availability of variants in multiple gene regions. A brief description of some commonly used MR methods is available in Box 1; further detailed comparisons of methods can be found elsewhere.33,34

Why account for other risk factors?

According to Mendel’s laws, each characteristic should be inherited independently of other characteristics, thereby preventing confounding. Nevertheless, genetic variants may still have pleiotropic associations with other variables.
Adjustment for related traits with shared genetic predictors and for known pleiotropic factors can be done in a multivariable MR analysis, which is a statistical approach that allows for the association of genetic variants with multiple risk factors to be incorporated into the analysis.35 As an example, multivariable MR analysis has been conducted to unravel which one (or more) of the atherogenic lipid-related traits accounts for the causal association with major CVDs. These studies have demonstrated that low-density lipoprotein cholesterol (LDL-C), apolipoprotein B, and triglycerides were all associated with coronary artery disease and ischemic stroke when assessed individually in univariable MR analysis.36,37 Nevertheless, only apolipoprotein B remained robustly associated with these CVDs in multivariable MR analysis with mutual adjustment for the other lipid-related traits.36,37 Multivariable MR analysis can also be applied to explore mediating effects of factors that may lie in the causal pathway from the studied risk factor to the outcome. As an example, multivariable MR analysis was performed to evaluate the mediating effects of cardiometabolic risk factors on the association between adiposity and common atherosclerotic CVDs.38 The study showed that genetically predicted SBP and type 2 diabetes liability mediated 27% and 41%, respectively, of the association between genetically predicted BMI and risk of coronary artery disease.38 Adjustment for blood lipids and smoking liability through multivariable MR analysis resulted in only minor attenuations in the association estimate for genetically predicted BMI in relation to coronary artery disease,38 suggesting that lipids and smoking were not major mediators or confounders of the relationship. 
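
As a hedged illustration of the estimators discussed in the two preceding sections, the sketch below implements the ratio (Wald) estimate, a fixed-effect inverse-variance weighted (IVW) average of the ratios, and a two-exposure multivariable MR fitted as a weighted regression through the origin. The function names, the first-order standard error used for the ratio, and the toy summary statistics are assumptions made purely for illustration; they are not taken from any study cited in this review, and dedicated MR software would normally be used for real analyses.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Ratio (Wald) estimate for a single variant.

    The standard error is a first-order approximation that ignores
    uncertainty in the variant-exposure association (an assumption).
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted average of per-variant ratios."""
    weighted_sum = 0.0
    total_weight = 0.0
    for bx, by, sy in zip(beta_exposure, beta_outcome, se_outcome):
        estimate, se = wald_ratio(bx, by, sy)
        weight = 1.0 / se ** 2
        weighted_sum += weight * estimate
        total_weight += weight
    return weighted_sum / total_weight, math.sqrt(1.0 / total_weight)


def mvmr_two_exposures(beta_x1, beta_x2, beta_outcome, se_outcome):
    """Multivariable MR for two exposures.

    Weighted least squares of the variant-outcome associations on both sets
    of variant-exposure associations, with no intercept and weights
    1 / se_outcome**2. Returns the two direct-effect estimates.
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for b1, b2, by, sy in zip(beta_x1, beta_x2, beta_outcome, se_outcome):
        w = 1.0 / sy ** 2
        s11 += w * b1 * b1
        s12 += w * b1 * b2
        s22 += w * b2 * b2
        t1 += w * b1 * by
        t2 += w * b2 * by
    det = s11 * s22 - s12 ** 2
    if det == 0.0:
        raise ValueError("exposure associations are collinear")
    # Solve the 2x2 normal equations directly.
    theta1 = (s22 * t1 - s12 * t2) / det
    theta2 = (s11 * t2 - s12 * t1) / det
    return theta1, theta2


# Hypothetical, noise-free summary statistics for four variants.
# Only exposure 1 affects the outcome (true effect 0.5); exposure 2 shares
# genetic predictors with exposure 1 but has no direct effect.
BX1 = [0.10, 0.20, 0.05, 0.15]
BX2 = [0.08, 0.15, 0.10, 0.05]
BY = [0.5 * b for b in BX1]
SY = [0.01, 0.01, 0.01, 0.01]

print("IVW, exposure 1 alone:", ivw_estimate(BX1, BY, SY)[0])
print("IVW, exposure 2 alone:", ivw_estimate(BX2, BY, SY)[0])
print("Multivariable MR:", mvmr_two_exposures(BX1, BX2, BY, SY))
```

With these toy data, the univariable IVW estimate for the second exposure is clearly positive even though it has no direct effect, whereas the multivariable fit attributes the effect to the first exposure. This mirrors, in miniature, the attenuation described above for correlated lipid-related traits.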
Triangulating the evidence

Although MR studies can add an important piece to the puzzle on the possible causal effect of a risk factor on a health outcome, MR findings should be interpreted in the light of evidence from other sources such as traditional observational and experimental studies. As an example, results from prospective cohort studies have shown that high circulating calcium levels are associated with an increased risk of myocardial infarction.39,40 Furthermore, a meta-analysis of three RCTs showed that relatively short-term high-dose calcium monotherapy or calcium plus vitamin D supplements, both of which result in a slight but significant increase in serum calcium levels15 and are amongst the most commonly prescribed therapeutics,41 increased the risk of myocardial infarction.42 On top of this evidence, MR investigations have found that genetically predicted lifelong higher serum calcium levels are associated with an increased risk of coronary artery disease,21 myocardial infarction,21 and overall CVD.42 Hence, triangulating the evidence across study designs supports a causal association between short-term and lifelong modest elevations in circulating calcium levels and a higher risk of myocardial infarction.

What are the applications?

MR has been applied to investigate potential causal relationships between putative risk factors and CVD risk as well as to predict the efficacy and adverse effects of existing and novel drugs and for drug repurposing opportunities. A summary of MR studies on conventional cardiovascular risk factors and lifestyle factors in relation to CVD risk is presented below and in the Graphical Abstract. Some examples of drug-target and drug repurposing MR studies are also provided.
Conventional cardiovascular risk factors

MR studies have provided convincing evidence that greater adiposity, instrumented by BMI-associated genetic variants discovered by the Genetic Investigation of Anthropometric Traits consortium,2,43 is causally associated with increased risk of most CVDs.13,44-48 In a recent meta-analysis of MR studies, genetically predicted higher BMI was associated with an increased risk of all 14 studied CVDs.46 Likewise, genetically predicted greater waist-to-hip ratio, whole-body fat mass, and visceral fat are associated with increased risk of CVDs.38,49-52 Genetically predicted adiposity is associated with cardiometabolic factors, including glycemic traits, blood pressure, and circulating lipids.53-56 MR studies have provided evidence that higher fasting insulin, fasting glucose, or glycated hemoglobin levels are causally associated with an increased risk of some CVDs (e.g., coronary artery disease, peripheral artery disease, and ischemic stroke),48,57-59 and that elevated SBP is a risk factor for most CVDs.45,48,60 With respect to circulating lipids, MR studies have concluded that atherogenic lipid-related entities, including LDL-C,13,61-64 apolipoprotein B,13,36,37,61,65,66 and Lp(a),13,18,61,67-72 are associated with an increased risk of atherosclerotic CVDs and that genetically predicted Lp(a) levels are associated with atrial fibrillation risk.72 Multivariable MR analyses have suggested that the associations of LDL-C with risk of coronary artery disease, ischemic stroke, and peripheral artery disease are largely driven by apolipoprotein B.36,37,65 MR findings have not supported an independent causal role of high-density lipoprotein cholesterol in major CVDs after accounting for LDL-C or apolipoprotein B.36,37,48,61,65

Lifestyle factors

The MR design has been used to investigate the potential causal associations of lifestyle factors, including smoking, alcohol and coffee consumption, physical activity, and sleep patterns, with
risk of CVD. A consistent association has been reported for genetic liability to smoking with increased risk of most CVDs,48,59,73-75 with the strongest magnitude of association observed for peripheral artery disease and abdominal aortic aneurysm.73-75 In contrast to conventional observational studies showing a protective association between moderate alcohol consumption and risk of coronary heart disease76 and ischemic stroke,77 MR studies have shown that genetically predicted higher alcohol consumption is associated with an increased risk of coronary heart disease23,78 and stroke27,78 in European populations. Moreover, in the China Kadoorie Biobank, alcohol consumption proxied by a loss-of-function variant of the aldehyde dehydrogenase 2 gene (common in east Asian populations) and a variant of the alcohol dehydrogenase 1B gene had a continuous positive log-linear association with risk of both ischemic stroke and intracerebral hemorrhage but was not associated with myocardial infarction.24 A nonlinear MR analysis in the UK Biobank showed that light alcohol drinking was associated with a minimal increase in coronary artery disease risk and that the risk increased exponentially at higher intakes.78 MR studies have provided suggestive evidence that genetically predicted higher alcohol consumption is associated with increased risk of abdominal aortic aneurysm,27 atrial fibrillation,27,78 heart failure,78 and peripheral artery disease,27,59 but not with aortic valve stenosis and venous thromboembolism27 in European populations. 
The observational findings of an inverse association between moderate coffee consumption and risk of CVD,79,80 particularly coronary heart disease and ischemic stroke,79 have no support from MR studies which have proxied coffee consumption by a couple of variants in genes known to be involved in caffeine metabolism (and associated with coffee consumption)48,80 or all variants strongly associated with coffee consumption.48,81 Furthermore, no association has been observed between genetically predicted plasma caffeine levels and CVD risk.82 It should be noted that MR analyses have assumed a linear relationship between coffee consumption and CVD and results might therefore have been attenuated to the null if the relationship is nonlinear as suggested by observational studies (lowest CVD risk at 3 to 5 cups per day).79 The disparate results, which highlight the importance of triangulation of evidence, might also reflect residual confounding in the observational studies or pleiotropy in MR studies.

Observational findings of inverse associations between physical activity and risk of major CVDs83 have gained little support from MR studies.
Suggestive evidence of strong inverse associations has been observed for genetically predicted vigorous physical activity and risk of myocardial infarction84 and for moderate-to-vigorous physical activity and risk of subarachnoid hemorrhage.48 However, other MR studies reported no association of genetically predicted self-reported moderate to vigorous physical activity with risk of coronary artery disease,85 ischemic stroke,85 peripheral artery disease,59 or heart failure.86 Likewise, no association has been found between genetically predicted accelerometer-based physical activity and risk of coronary artery disease or ischemic stroke.85 The genetic instruments used in these MR studies explain little variation in the physical activity phenotypes (i.e., between ~0.1% and 0.24%).59,85,86 Thus, the negative MR findings may reflect insufficient power to detect weak to modest associations.

The associations of sleep traits, particularly sleep duration and insomnia, with risk of CVD have been investigated in several MR studies.48,59,87-90 For example, an MR study involving 404 044 UK Biobank participants found that genetic liability to short sleep duration (≤6 h) was associated with increased risk of hypertension, coronary artery disease, myocardial infarction, and pulmonary embolism, and possibly with atrial fibrillation.87 Other MR studies found that genetic liability to short sleep duration was associated with an increased risk of peripheral artery disease.59,88 Moreover, genetic liability to insomnia has been found to associate with increased risk of several CVDs, including coronary artery disease, peripheral artery disease, atrial fibrillation, heart failure, ischemic stroke, and subarachnoid hemorrhage.48,59,89,90

Predicting efficacy and adverse drug effects

The methodology of drug-target MR analysis to evaluate efficacy and safety of drugs targeting cardiovascular risk factors has been described in depth previously.91-93 In brief, most drugs act by targeting
proteins, which are coded for by genes. Variants in the region of the relevant protein-coding gene can thus be used to proxy the pharmacologic effects of perturbing the corresponding drug target. Whereas MR studies of modifiable risk factors generally employ variants from multiple gene regions, MR analyses exploring drug target effects typically utilize variants from a single gene region, specifically, variants within the region around the protein-coding gene. Such an analysis is known as cis-MR analysis as variants near the protein-coding gene are named cis-variants. A prerequisite is that the selected genetic variants represent the clinical effects of the drug target.

As an example, MR has been applied to predict the effects of cholesterol-lowering drugs that target 3-hydroxy-3-methylglutaryl-CoA reductase (HMGCR, target of statins), Niemann-Pick C1–like 1 (NPC1L1, target of ezetimibe), proprotein convertase subtilisin/kexin type 9 (PCSK9, target of PCSK9 inhibitors), and cholesteryl ester transfer protein (CETP, target of CETP inhibitors) on CVD risk.62,64,66,94-100 In one of these initial MR studies,94 the researchers selected variants within ±100 kb of the HMGCR and NPC1L1 genes that associated with LDL-C levels at a threshold of P<5.0×10⁻⁶ in the Global Lipids Genetic Consortium101 as instrumental variables to predict the corresponding drug effects.
The analysis showed that genetically predicted LDL-C lowering mediated by variants in the HMGCR or NPC1L1 gene or in both genes was associated with a reduced risk of coronary heart disease.94 Similar associations have been reported for genetic mimicry of HMGCR inhibition in relation to risk of ischemic stroke96 and abdominal aortic aneurysm.64 Likewise, genetic mimicry of PCSK9 or CETP inhibition has been found to associate with a reduced risk of coronary artery disease62,98,99 and abdominal aortic aneurysm64 as well as ischemic stroke in some98,99 but not all studies.62,97,100 With respect to adverse effects, genetically predicted LDL-C lowering mediated through variants in the NPC1L1 gene has been associated with an increased risk of gallstone disease,102 whereas genetic mimicry of HMGCR and CETP inhibition has been associated with an increased risk of intracerebral hemorrhage66 and age-related macular degeneration,100,103 respectively. Additionally, genetically predicted LDL-C lowering independent of drug target has been reported to associate with an increased risk of type 2 diabetes.95,104,105 As a further example, cis-MR analyses have been performed to predict the cardiovascular effects of lowering circulating levels of Lp(a). 
These MR studies have used variants in the LPA gene region as instrumental variables and consistently demonstrated that genetically predicted higher Lp(a) levels are associated with an increased risk of many CVDs, particularly coronary artery disease, peripheral artery disease, aortic stenosis, abdominal aortic aneurysm, and ischemic stroke.18,67-71 According to a multivariable MR analysis, the increased risk of coronary artery disease related to higher Lp(a) levels is independent of apolipoprotein B.106 A recent study that integrated MR and phenome-wide association study technologies (MR-PheWAS; see Box 1) to explore the effects of Lp(a) on 1081 outcomes amongst ~400 000 participants of the UK Biobank found little evidence of adverse effects of lowering Lp(a) levels.71 Nevertheless, earlier MR studies have suggested a possible increased risk of Alzheimer’s disease associated with Lp(a)-lowering.68,70 RCTs are underway to investigate whether Lp(a)-lowering therapies can reduce the risk of recurrent major adverse cardiovascular events as well as to evaluate the safety and tolerance of such therapies.107

An important limitation of this approach is that the extent to which genetic variants mimic the action of specific drugs is often unclear.108 Further limitations are that genetic variants have life-long effects, whereas trials assess the impact of short-term interventions; trials often compare the impact of interventions on top of standard care (such as statins), and hence MR investigations may not reflect real-world practice; and trial endpoints often differ from outcomes in MR analyses.

Drug repurposing

MR has been applied to evaluate the repurposing potential of available drugs.
As an example, MR studies have demonstrated that genetic mimicry of interleukin-6 receptor blockade (targeted by tocilizumab) is associated with lower risk of rheumatoid arthritis,109,110 but also with reduced risks of several CVDs110-115 as well as COVID-19.116 Nonetheless, a possible side effect is an increased risk of pneumonia.110,116 These findings suggest that blockade of the interleukin-6 signaling pathway may be a target for the prevention of diverse CVDs and COVID-19 but that caution should be taken with regard to possible adverse effects. As another example, the broad effects of genetic mimicry of tyrosine kinase 2 (TYK2) inhibition (targeted by deucravacitinib) on ~1500 outcomes amongst ~340 000 participants of the UK Biobank study were recently examined in an MR-PheWAS.117 The study showed that TYK2 inhibition instrumented by a variant in the TYK2 gene was, as expected, effective in reducing the risk of psoriasis and other autoimmune diseases, but was associated with potential adverse effects such as increased risk of prostate and breast cancer.117

Future directions

MR investigations are dependent on the availability of studies with linked genetic and epidemiological data. These have expanded in several directions in recent years: in size, in coverage, and in scope. Larger sample sizes enable more powerful analyses, as well as adequately powered analyses in population subgroups. Data are becoming available on a wider range of population groups, such as the multi-ancestry GWAS from the Global Lipids Genetic Consortium.5 This is important not only to improve representation in research findings, but also because key treatment-mimicking variants may only be available in specific ancestry groups, such as loss-of-function variants proxying darapladib in East Asians.118,119 Most GWAS and MR analyses conducted to date have included participants of primarily European ancestries. The generalizability of the findings to other ancestries deserves further study.
Finally, ever more detailed data on proteomics,120 metabolomics,121 transcriptomics,122 and other biological domains are enabling focused, translational analyses to understand the potential effects of diverse interventions. This is combined with methodological innovations, enabling analyses that characterize causal non-linear response curves,123 identify relevant causal traits,124 and model effects on multiple outcomes.125

Conclusions

MR analyses can provide critical evidence on the potential causal effects of many modifiable exposures, including traditional epidemiological risk factors, lifestyle factors, and druggable targets. The validity of inferences is subject to untestable assumptions that will not hold in all cases. Still, MR can add important evidence supporting or dampening enthusiasm for the exposure as a worthwhile target for therapeutic intervention.

Data availability

All data in this paper are available via published articles.

Funding

SCL receives financial support from the Swedish Heart-Lung Foundation (Hjärt-Lungfonden; 20190247), the Swedish Research Council (Vetenskapsrådet; 2019-00977), and the Swedish Cancer Society (Cancerfonden). SB is supported by the Wellcome Trust (225790/Z/22/Z) and the United Kingdom Research and Innovation Medical Research Council (MC_UU_00002/7). This research was supported by the National Institute for Health Research Cambridge Biomedical Research Centre (NIHR203312). The views expressed are those of the authors and not necessarily those of the National Institute for Health Research or the Department of Health and Social Care.
The BHF Cardiovascular Epidemiology Unit has been supported by core funding from the NIHR Blood and Transplant Research Unit (BTRU) in Donor Health and Genomics (NIHR BTRU-2014-10024), NIHR BTRU in Donor Health and Behaviour (NIHR203337), UK Medical Research Council (MR/L003120/1), British Heart Foundation (SP/09/002; RG/13/13/30194; RG/18/13/33946) and NIHR Cambridge BRC (BRC-1215-20014; NIHR203312) and has received funding from an EC-Innovative Medicines Initiative (BigData@Heart). This work was supported by Health Data Research UK, which is funded by the UK Medical Research Council, Engineering and Physical Sciences Research Council, Economic and Social Research Council, Department of Health and Social Care (England), Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), British Heart Foundation and Wellcome. For the purpose of open access, the author(s) has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising from this submission.

Conflict of interest

A.S.B. has received grants unrelated to this work from AstraZeneca, Bayer, Biogen, BioMarin, Bioverativ, Novartis and Sanofi. The other authors declare no conflicts of interest.

References

1. GBD 2019 Diseases and Injuries Collaborators. Global burden of 369 diseases and injuries in 204 countries and territories, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet 2020;396:1204-1222. doi: 10.1016/S0140-6736(20)30925-9
2. Yengo L, Sidorenko J, Kemper KE, Zheng Z, Wood AR, Weedon MN, et al. Meta-analysis of genome-wide association studies for height and body mass index in approximately 700000 individuals of European ancestry. Hum Mol Genet 2018;27:3641-3649. doi: 10.1093/hmg/ddy271
3. Chen J, Spracklen CN, Marenne G, Varshney A, Corbin LJ, Luan J, et al.
The trans-ancestral genomic architecture of glycemic traits. Nat Genet 2021;53:840-860. doi: 10.1038/s41588-021-00852-9
4. Evangelou E, Warren HR, Mosen-Ansorena D, Mifsud B, Pazoki R, Gao H, et al. Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits. Nat Genet 2018;50:1412-1425. doi: 10.1038/s41588-018-0205-x
5. Graham SE, Clarke SL, Wu KH, Kanoni S, Zajac GJM, Ramdas S, et al. The power of genetic diversity in genome-wide association studies of lipids. Nature 2021;600:675-679. doi: 10.1038/s41586-021-04064-3
6. Liu M, Jiang Y, Wedow R, Li Y, Brazel DM, Chen F, et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat Genet 2019;51:237-244. doi: 10.1038/s41588-018-0307-5
7. Cornelis MC, Byrne EM, Esko T, Nalls MA, Ganna A, Paynter N, et al. Genome-wide meta-analysis identifies six novel loci associated with habitual coffee consumption. Mol Psychiatry 2015;20:647-656. doi: 10.1038/mp.2014.107
8. Wang Z, Emmerich A, Pillon NJ, Moore T, Hemerich D, Cornelis MC, et al. Genome-wide association analyses of physical activity and sedentary behavior provide insights into underlying mechanisms and roles in disease prevention. Nat Genet 2022;54:1332-1344. doi: 10.1038/s41588-022-01165-1
9. van der Harst P, Verweij N. Identification of 64 Novel Genetic Loci Provides an Expanded View on the Genetic Architecture of Coronary Artery Disease. Circ Res 2018;122:433-443.
10. Nielsen JB, Thorolfsdottir RB, Fritsche LG, Zhou W, Skov MW, Graham SE, et al. Biobank-driven genomic discovery yields new insight into atrial fibrillation biology. Nat Genet 2018;50:1234-1239. doi: 10.1038/s41588-018-0171-3
11. Aragam KG, Jiang T, Goel A, Kanoni S, Wolford BN, Atri DS, et al. Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants. Nat Genet 2022;54:1803-1815.
doi: 10.1038/s41588-022-01233-6
12. Mishra A, Malik R, Hachiya T, Jurgenson T, Namba S, Posner DC, et al. Stroke genetics informs drug discovery and risk prediction across ancestries. Nature 2022;611:115-123. doi: 10.1038/s41586-022-05165-3
13. Chen HY, Dina C, Small AM, Shaffer CM, Levinson RT, Helgadottir A, et al. Dyslipidemia, inflammation, calcification, and adiposity in aortic stenosis: a genome-wide study. Eur Heart J 2023. doi: 10.1093/eurheartj/ehad142
14. Rasooly D, Peloso GM, Pereira AC, Dashti H, Giambartolomei C, Wheeler E, et al. Genome-wide association analysis and Mendelian randomization proteomics identify drug targets for heart failure. Nat Commun 2023;14:3826. doi: 10.1038/s41467-023-39253-3
15. Yuan S, Baron JA, Michaelsson K, Larsson SC. Serum calcium and 25-hydroxyvitamin D in relation to longevity, cardiovascular disease and cancer: a Mendelian randomization study. NPJ Genom Med 2021;6:86. doi: 10.1038/s41525-021-00250-4
16. Bowker N, Hansford R, Burgess S, Foley CN, Auyeung VPW, Erzurumluoglu AM, et al. Genetically Predicted Glucose-Dependent Insulinotropic Polypeptide (GIP) Levels and Cardiovascular Disease Risk Are Driven by Distinct Causal Variants in the GIPR Region. Diabetes 2021;70:2706-2719. doi: 10.2337/db21-0103
17. Loos RJ, Yeo GS. The bigger picture of FTO: the first GWAS-identified obesity gene. Nat Rev Endocrinol 2014;10:51-61. doi: 10.1038/nrendo.2013.227
18. Burgess S, Ference BA, Staley JR, Freitag DF, Mason AM, Nielsen SF, et al. Association of LPA Variants With Risk of Coronary Disease and the Implications for Lipoprotein(a)-Lowering Therapies: A Mendelian Randomization Analysis. JAMA Cardiol 2018;3:619-627. doi: 10.1001/jamacardio.2018.1470
19. O'Seaghdha CM, Wu H, Yang Q, Kapur K, Guessous I, Zuber AM, et al. Meta-analysis of genome-wide association studies identifies six new loci for serum calcium concentrations. PLoS Genet 2013;9:e1003796. doi: 10.1371/journal.pgen.1003796
20.
Nikpay M, Goel A, Won HH, Hall LM, Willenborg C, Kanoni S, et al. A comprehensive 1,000 Genomes-based genome-wide association meta-analysis of coronary artery disease. Nat Genet 2015;47:1121-1130. doi: 10.1038/ng.3396
21. Larsson SC, Burgess S, Michaelsson K. Association of genetic variants related to serum calcium levels with coronary artery disease and myocardial infarction. JAMA 2017;318:371-380. doi: 10.1001/jama.2017.8981
22. Burgess S, Davies NM, Thompson SG. Bias due to participant overlap in two-sample Mendelian randomization. Genet Epidemiol 2016;40:597-608. doi: 10.1002/gepi.21998
23. Holmes MV, Dale CE, Zuccolo L, Silverwood RJ, Guo Y, Ye Z, et al. Association between alcohol and cardiovascular disease: Mendelian randomisation analysis based on individual participant data. BMJ 2014;349:g4164. doi: 10.1136/bmj.g4164
24. Millwood IY, Walters RG, Mei XW, Guo Y, Yang L, Bian Z, et al. Conventional and genetic evidence on alcohol and vascular disease aetiology: a prospective study of 500 000 men and women in China. Lancet 2019;393:1831-1842. doi: 10.1016/S0140-6736(18)31772-0
25. Polimanti R, Gelernter J. ADH1B: From alcoholism, natural selection, and cancer to the human phenome. Am J Med Genet B Neuropsychiatr Genet 2018;177:113-125. doi: 10.1002/ajmg.b.32523
26. Lawlor DA, Nordestgaard BG, Benn M, Zuccolo L, Tybjaerg-Hansen A, Davey Smith G. Exploring causal associations between alcohol and coronary heart disease risk factors: findings from a Mendelian randomization study in the Copenhagen General Population Study. Eur Heart J 2013;34:2519-2528. doi: 10.1093/eurheartj/eht081
27. Larsson SC, Burgess S, Mason AM, Michaelsson K. Alcohol consumption and cardiovascular disease: A Mendelian randomization study. Circ Genom Precis Med 2020;13:e002814. doi: 10.1161/CIRCGEN.119.002814
28. Dudbridge F. Polygenic Mendelian Randomization. Cold Spring Harb Perspect Med 2021;11. doi: 10.1101/cshperspect.a039586
29. Burgess S, Butterworth A, Thompson SG.
Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol 2013;37:658-665. doi: 10.1002/gepi.21758
30. Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol 2016;40:304-314. doi: 10.1002/gepi.21965
31. Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol 2015;44:512-525. doi: 10.1093/ije/dyv080
32. Verbanck M, Chen CY, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet 2018;50:693-698. doi: 10.1038/s41588-018-0099-7
33. Slob EAW, Burgess S. A comparison of robust Mendelian randomization methods using summary data. Genet Epidemiol 2020;44:313-329. doi: 10.1002/gepi.22295
34. Sanderson E, Glymour MM, Holmes MV, Kang H, Morrison J, Munafo MR, et al. Mendelian randomization. Nat Rev Methods Primers 2022;2. doi: 10.1038/s43586-021-00092-5
35. Sanderson E. Multivariable Mendelian Randomization and Mediation. Cold Spring Harb Perspect Med 2021;11. doi: 10.1101/cshperspect.a038984
36. Richardson TG, Sanderson E, Palmer TM, Ala-Korpela M, Ference BA, Davey Smith G, Holmes MV. Evaluating the relationship between circulating lipoprotein lipids and apolipoproteins with risk of coronary heart disease: A multivariable Mendelian randomisation analysis. PLoS Med 2020;17:e1003062. doi: 10.1371/journal.pmed.1003062
37. Yuan S, Tang B, Zheng J, Larsson SC. Circulating Lipoprotein Lipids, Apolipoproteins and Ischemic Stroke. Ann Neurol 2020;88:1229-1236. doi: 10.1002/ana.25916
38. Gill D, Zuber V, Dawson J, Pearson-Stuttard J, Carter AR, Sanderson E, et al.
Risk factors mediating the effect of body mass index and waist-to-hip ratio on cardiovascular outcomes: Mendelian randomization analysis. Int J Obes (Lond) 2021;45:1428-1438. doi: 10.1038/s41366-021-00807-4
39. Reid IR, Gamble GD, Bolland MJ. Circulating calcium concentrations, vascular disease and mortality: a systematic review. J Intern Med 2016;279:524-540. doi: 10.1111/joim.12464
40. Rohrmann S, Garmo H, Malmstrom H, Hammar N, Jungner I, Walldius G, Van Hemelrijck M. Association between serum calcium concentration and risk of incident and fatal cardiovascular disease in the prospective AMORIS study. Atherosclerosis 2016;251:85-93. doi: 10.1016/j.atherosclerosis.2016.06.004
41. Audi S, Burrage DR, Lonsdale DO, Pontefract S, Coleman JJ, Hitchings AW, Baker EH. The 'top 100' drugs and classes in England: an updated 'starter formulary' for trainee prescribers. Br J Clin Pharmacol 2018;84:2562-2571. doi: 10.1111/bcp.13709
42. Bolland MJ, Grey A, Avenell A, Gamble GD, Reid IR. Calcium supplements with or without vitamin D and risk of cardiovascular events: reanalysis of the Women's Health Initiative limited access dataset and meta-analysis. BMJ 2011;342:d2040. doi: 10.1136/bmj.d2040
43. Locke AE, Kahali B, Berndt SI, Justice AE, Pers TH, Day FR, et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 2015;518:197-206. doi: 10.1038/nature14177
44. Larsson SC, Bäck M, Rees JMB, Mason AM, Burgess S. Body mass index and body composition in relation to 14 cardiovascular conditions in UK Biobank: a Mendelian randomization study. Eur Heart J 2019;41:221-226. doi: 10.1093/eurheartj/ehz388
45. Shah S, Henry A, Roselli C, Lin H, Sveinbjornsson G, Fatemifar G, et al. Genome-wide association and Mendelian randomisation analysis provide insights into the pathogenesis of heart failure. Nat Commun 2020;11:163. doi: 10.1038/s41467-019-13690-5
46. Larsson SC, Burgess S.
Causal role of high body mass index in multiple chronic diseases: a systematic review and meta-analysis of Mendelian randomization studies. BMC Med 2021;19:320. doi: 10.1186/s12916-021-02188-x
47. Kim MS, Kim WJ, Khera AV, Kim JY, Yon DK, Lee SW, et al. Association between adiposity and cardiovascular outcomes: an umbrella review and meta-analysis of observational and Mendelian randomization studies. Eur Heart J 2021;42:3388-3403. doi: 10.1093/eurheartj/ehab454
48. Karhunen V, Bakker MK, Ruigrok YM, Gill D, Larsson SC. Modifiable Risk Factors for Intracranial Aneurysm and Aneurysmal Subarachnoid Hemorrhage: A Mendelian Randomization Study. J Am Heart Assoc 2021;10:e022277. doi: 10.1161/JAHA.121.022277
49. Karlsson T, Rask-Andersen M, Pan G, Hoglund J, Wadelius C, Ek WE, Johansson A. Contribution of genetics to visceral adiposity and its relation to cardiovascular and metabolic disease. Nat Med 2019;25:1390-1395. doi: 10.1038/s41591-019-0563-7
50. Larsson SC, Burgess S. Fat mass and fat-free mass in relation to cardiometabolic diseases: a two-sample Mendelian randomization study. J Intern Med 2020;288:260-262. doi: 10.1111/joim.13078
51. Emdin CA, Khera AV, Natarajan P, Klarin D, Zekavat SM, Hsiao AJ, Kathiresan S. Genetic Association of Waist-to-Hip Ratio With Cardiometabolic Traits, Type 2 Diabetes, and Coronary Heart Disease. JAMA 2017;317:626-634. doi: 10.1001/jama.2016.21042
52. Tikkanen E, Gustafsson S, Knowles JW, Perez M, Burgess S, Ingelsson E. Body composition and atrial fibrillation: a Mendelian randomization study. Eur Heart J 2019 Feb 5. [Epub ahead of print]. doi: 10.1093/eurheartj/ehz003
53. Kivimaki M, Smith GD, Timpson NJ, Lawlor DA, Batty GD, Kahonen M, et al. Lifetime body mass index and later atherosclerosis risk in young adults: examining causal links using Mendelian randomization in the Cardiovascular Risk in Young Finns study. Eur Heart J 2008;29:2552-2560. doi: 10.1093/eurheartj/ehn252
54.
Fall T, Hagg S, Magi R, Ploner A, Fischer K, Horikoshi M, et al. The role of adiposity in cardiometabolic traits: a Mendelian randomization analysis. PLoS Med 2013;10:e1001474. doi: 10.1371/journal.pmed.1001474
55. Lyall DM, Celis-Morales C, Ward J, Iliodromiti S, Anderson JJ, Gill JMR, et al. Association of Body Mass Index With Cardiometabolic Disease in the UK Biobank: A Mendelian Randomization Study. JAMA Cardiol 2017;2:882-889. doi: 10.1001/jamacardio.2016.5804
56. Dale CE, Fatemifar G, Palmer TM, White J, Prieto-Merino D, Zabaneh D, et al. Causal Associations of Adiposity and Body Fat Distribution With Coronary Heart Disease, Stroke Subtypes, and Type 2 Diabetes Mellitus: A Mendelian Randomization Analysis. Circulation 2017;135:2373-2388. doi: 10.1161/circulationaha.116.026560
57. Burgess S, Malik R, Liu B, Mason AM, Georgakis MK, Dichgans M, Gill D. Dose-response relationship between genetically proxied average blood glucose levels and incident coronary heart disease in individuals without diabetes mellitus. Diabetologia 2021;64:845-849. doi: 10.1007/s00125-020-05377-0
58. Yuan S, Mason AM, Burgess S, Larsson SC. Differentiating Associations of Glycemic Traits With Atherosclerotic and Thrombotic Outcomes: Mendelian Randomization Investigation. Diabetes 2022;71:2222-2232. doi: 10.2337/db21-0905
59. Hoek AG, van Oort S, Elders PJM, Beulens JWJ. Causal Association of Cardiovascular Risk Factors and Lifestyle Behaviors With Peripheral Artery Disease: A Mendelian Randomization Approach. J Am Heart Assoc 2022;11:e025644. doi: 10.1161/JAHA.122.025644
60. Higgins H, Mason AM, Larsson SC, Gill D, Langenberg C, Burgess S. Estimating the population benefits of blood pressure lowering: A wide-angled Mendelian randomization study in UK Biobank. J Am Heart Assoc 2021;10:e021098.
61. Small AM, Peloso GM, Linefsky J, Aragam J, Galloway A, Tanukonda V, et al.
Multiancestry Genome-Wide Association Study of Aortic Stenosis Identifies Multiple Novel Loci in the Million Veteran Program. Circulation 2023;147:942-955. doi: 10.1161/CIRCULATIONAHA.122.061451
62. Allara E, Morani G, Carter P, Gkatzionis A, Zuber V, Foley CN, et al. Genetic determinants of lipids and cardiovascular disease outcomes: A wide-angled Mendelian randomization investigation. Circ Genom Precis Med 2019;12:e002711. doi: 10.1161/CIRCGEN.119.002711
63. Holmes MV, Asselbergs FW, Palmer TM, Drenos F, Lanktree MB, Nelson CP, et al. Mendelian randomization of blood lipids for coronary heart disease. Eur Heart J 2015;36:539-550. doi: 10.1093/eurheartj/eht571
64. Harrison SC, Holmes MV, Burgess S, Asselbergs FW, Jones GT, Baas AF, et al. Genetic Association of Lipids and Lipid Drug Targets With Abdominal Aortic Aneurysm: A Meta-analysis. JAMA Cardiol 2018;3:26-33. doi: 10.1001/jamacardio.2017.4293
65. Levin MG, Zuber V, Walker VM, Klarin D, Lynch J, Malik R, et al. Prioritizing the Role of Major Lipoproteins and Subfractions as Risk Factors for Peripheral Artery Disease. Circulation 2021;144:353-364. doi: 10.1161/circulationaha.121.053797
66. Yu Z, Zhang L, Zhang G, Xia K, Yang Q, Huang T, Fan D. Lipids, Apolipoproteins, Statins, and Intracerebral Hemorrhage: A Mendelian Randomization Study. Ann Neurol 2022;92:390-399. doi: 10.1002/ana.26426
67. Emdin CA, Khera AV, Natarajan P, Klarin D, Won HH, Peloso GM, et al. Phenotypic Characterization of Genetically Lowered Human Lipoprotein(a) Levels. J Am Coll Cardiol 2016;68:2761-2772. doi: 10.1016/j.jacc.2016.10.033
68. Pan Y, Li H, Wang Y, Meng X, Wang Y. Causal effect of Lp(a) [lipoprotein(a)] level on ischemic stroke and Alzheimer disease: A Mendelian randomization study. Stroke 2019;50:3532-3539. doi: 10.1161/STROKEAHA.119.026872
69. Gudbjartsson DF, Thorgeirsson G, Sulem P, Helgadottir A, Gylfason A, Saemundsdottir J, et al. Lipoprotein(a) concentration and risks of cardiovascular disease and diabetes.
J Am Coll Cardiol 2019;74:2982-2994. doi: 10.1016/j.jacc.2019.10.019
70. Larsson SC, Gill D, Mason AM, Jiang T, Back M, Butterworth AS, Burgess S. Lipoprotein(a) in Alzheimer, atherosclerotic, cerebrovascular, thrombotic, and valvular disease: Mendelian randomization investigation. Circulation 2020;141:1826-1828. doi: 10.1161/CIRCULATIONAHA.120.045826
71. Larsson SC, Wang L, Li X, Jiang F, Chen X, Mantzoros CS. Circulating lipoprotein(a) levels and health outcomes: Phenome-wide Mendelian randomization and disease-trajectory analyses. Metabolism 2022;137:155347. doi: 10.1016/j.metabol.2022.155347
72. Mohammadi-Shemirani P, Chong M, Narula S, Perrot N, Conen D, Roberts JD, et al. Elevated Lipoprotein(a) and Risk of Atrial Fibrillation: An Observational and Mendelian Randomization Study. J Am Coll Cardiol 2022;79:1579-1590. doi: 10.1016/j.jacc.2022.02.018
73. Larsson SC, Mason AM, Back M, Klarin D, Damrauer SM, Million Veteran P, et al. Genetic predisposition to smoking in relation to 14 cardiovascular diseases. Eur Heart J 2020;41:3304-3310. doi: 10.1093/eurheartj/ehaa193
74. Klarin D, Verma SS, Judy R, Dikilitas O, Wolford BN, Paranjpe I, et al. Genetic Architecture of Abdominal Aortic Aneurysm in the Million Veteran Program. Circulation 2020;142:1633-1646. doi: 10.1161/CIRCULATIONAHA.120.047544
75. Larsson SC, Burgess S. Appraising the causal role of smoking in multiple diseases: A systematic review and meta-analysis of Mendelian randomization studies. EBioMedicine 2022;82:104154. doi: 10.1016/j.ebiom.2022.104154
76. Ronksley PE, Brien SE, Turner BJ, Mukamal KJ, Ghali WA. Association of alcohol consumption with selected cardiovascular disease outcomes: a systematic review and meta-analysis. BMJ 2011;342:d671. doi: 10.1136/bmj.d671
77. Larsson SC, Wallin A, Wolk A, Markus HS. Differing association of alcohol consumption with different stroke types: a systematic review and meta-analysis. BMC Med 2016;14:178. doi: 10.1186/s12916-016-0721-4
78.
Biddinger KJ, Emdin CA, Haas ME, Wang M, Hindy G, Ellinor PT, et al. Association of Habitual Alcohol Intake With Risk of Cardiovascular Disease. JAMA Netw Open 2022;5:e223849. doi: 10.1001/jamanetworkopen.2022.3849
79. Ding M, Bhupathiraju SN, Satija A, van Dam RM, Hu FB. Long-term coffee consumption and risk of cardiovascular disease: a systematic review and a dose-response meta-analysis of prospective cohort studies. Circulation 2014;129:643-659. doi: 10.1161/CIRCULATIONAHA.113.005925
80. Nordestgaard AT, Nordestgaard BG. Coffee intake, cardiovascular disease and all-cause mortality: observational and Mendelian randomization analyses in 95 000-223 000 individuals. Int J Epidemiol 2016;45:1938-1952. doi: 10.1093/ije/dyw325
81. Yuan S, Carter P, Mason AM, Burgess S, Larsson SC. Coffee Consumption and Cardiovascular Diseases: A Mendelian Randomization Study. Nutrients 2021;13. doi: 10.3390/nu13072218
82. Larsson SC, Woolf B, Gill D. Appraisal of the causal effect of plasma caffeine on adiposity, type 2 diabetes, and cardiovascular disease: two sample mendelian randomisation study. BMJ Med 2023;2:1-8. doi: 10.1136/bmjmed-2022-000335
83. Kraus WE, Powell KE, Haskell WL, Janz KF, Campbell WW, Jakicic JM, et al. Physical Activity, All-Cause and Cardiovascular Mortality, and Cardiovascular Disease. Med Sci Sports Exerc 2019;51:1270-1281. doi: 10.1249/MSS.0000000000001939
84. Zhuo C, Zhao J, Chen M, Lu Y. Physical Activity and Risks of Cardiovascular Diseases: A Mendelian Randomization Study. Front Cardiovasc Med 2021;8:722154. doi: 10.3389/fcvm.2021.722154
85. Bahls M, Leitzmann MF, Karch A, Teumer A, Dorr M, Felix SB, et al. Physical activity, sedentary behavior and risk of coronary artery disease, myocardial infarction and ischemic stroke: a two-sample Mendelian randomization study. Clin Res Cardiol 2021;110:1564-1573. doi: 10.1007/s00392-021-01846-7
86. van Oort S, Beulens JWJ, van Ballegooijen AJ, Handoko ML, Larsson SC.
Modifiable lifestyle factors and heart failure: A Mendelian randomization study. Am Heart J 2020;227:64-73. doi: 10.1016/j.ahj.2020.06.007
87. Ai S, Zhang J, Zhao G, Wang N, Li G, So HC, et al. Causal associations of short and long sleep durations with 12 cardiovascular diseases: linear and nonlinear Mendelian randomization analyses in UK Biobank. Eur Heart J 2021;42:3349-3357. doi: 10.1093/eurheartj/ehab170
88. Yuan S, Levin MG, Titova OE, Chen J, Sun Y, Million Veteran Program VA, et al. Sleep duration, daytime napping, and risk of peripheral artery disease: multinational cohort and Mendelian randomization studies. Eur Heart J Open 2023;3:oead008. doi: 10.1093/ehjopen/oead008
89. Larsson SC, Markus HS. Genetic Liability to Insomnia and Cardiovascular Disease Risk. Circulation 2019;140:796-798. doi: 10.1161/CIRCULATIONAHA.119.041830
90. Yuan S, Mason AM, Burgess S, Larsson SC. Genetic liability to insomnia in relation to cardiovascular diseases: a Mendelian randomisation study. Eur J Epidemiol 2021;36:393-400. doi: 10.1007/s10654-021-00737-5
91. Burgess S, Mason AM, Grant AJ, Slob EAW, Gkatzionis A, Zuber V, et al. Using genetic association data to guide drug discovery and development: Review of methods and applications. Am J Hum Genet 2023;110:195-214. doi: 10.1016/j.ajhg.2022.12.017
92. Gill D, Georgakis MK, Walker VM, Schmidt AF, Gkatzionis A, Freitag DF, et al. Mendelian randomization for studying the effects of perturbing drug targets. Wellcome Open Res 2021;6:16. doi: 10.12688/wellcomeopenres.16544.2
93. Schmidt AF, Finan C, Gordillo-Maranon M, Asselbergs FW, Freitag DF, Patel RS, et al. Genetic drug target validation using Mendelian randomisation. Nat Commun 2020;11:3255. doi: 10.1038/s41467-020-16969-0
94. Ference BA, Majeed F, Penumetcha R, Flack JM, Brook RD.
Effect of naturally random allocation to lower low-density lipoprotein cholesterol on the risk of coronary heart disease mediated by polymorphisms in NPC1L1, HMGCR, or both: a 2 x 2 factorial Mendelian randomization study. J Am Coll Cardiol 2015;65:1552-1561. doi: 10.1016/j.jacc.2015.02.020
95. Lotta LA, Sharp SJ, Burgess S, Perry JRB, Stewart ID, Willems SM, et al. Association Between Low-Density Lipoprotein Cholesterol-Lowering Genetic Variants and Risk of Type 2 Diabetes: A Meta-analysis. JAMA 2016;316:1383-1391. doi: 10.1001/jama.2016.14568
96. Hindy G, Engstrom G, Larsson SC, Traylor M, Markus HS, Melander O, Orho-Melander M. Role of blood lipids in the development of ischemic stroke and its subtypes: A Mendelian randomization study. Stroke 2018;49:820-827. doi: 10.1161/strokeaha.117.019653
97. Hopewell JC, Malik R, Valdes-Marquez E, Worrall BB, Collins R, ISGC MCot. Differential effects of PCSK9 variants on risk of coronary disease and ischaemic stroke. Eur Heart J 2018;39:354-359. doi: 10.1093/eurheartj/ehx373
98. Schmidt AF, Hunt NB, Gordillo-Maranon M, Charoen P, Drenos F, Kivimaki M, et al. Cholesteryl ester transfer protein (CETP) as a drug target for cardiovascular disease. Nat Commun 2021;12:5640. doi: 10.1038/s41467-021-25703-3
99. De Marchis GM, Dittrich TD, Malik R, Zietz AV, Kriemler LF, Ference BA, et al. Genetic proxies for PCSK9 inhibition associate with lipoprotein(a): Effects on coronary artery disease and ischemic stroke. Atherosclerosis 2022;361:41-46. doi: 10.1016/j.atherosclerosis.2022.09.007
100. Nordestgaard LT, Christoffersen M, Lauridsen BK, Afzal S, Nordestgaard BG, Frikke-Schmidt R, Tybjaerg-Hansen A. Long-term Benefits and Harms Associated With Genetic Cholesteryl Ester Transfer Protein Deficiency in the General Population. JAMA Cardiol 2022;7:55-64. doi: 10.1001/jamacardio.2021.3728
101. Willer CJ, Schmidt EM, Sengupta S, Peloso GM, Gustafsson S, Kanoni S, et al. Discovery and refinement of loci associated with lipid levels.
Nat Genet 2013;45:1274-1283. doi: 10.1038/ng.2797
102. Lauridsen BK, Stender S, Frikke-Schmidt R, Nordestgaard BG, Tybjaerg-Hansen A. Genetic variation in the cholesterol transporter NPC1L1, ischaemic vascular disease, and gallstone disease. Eur Heart J 2015;36:1601-1608. doi: 10.1093/eurheartj/ehv108
103. Burgess S, Davey Smith G. Mendelian Randomization Implicates High-Density Lipoprotein Cholesterol-Associated Mechanisms in Etiology of Age-Related Macular Degeneration. Ophthalmology 2017;124:1165-1174. doi: 10.1016/j.ophtha.2017.03.042
104. Ference BA, Robinson JG, Brook RD, Catapano AL, Chapman MJ, Neff DR, et al. Variation in PCSK9 and HMGCR and Risk of Cardiovascular Disease and Diabetes. N Engl J Med 2016;375:2144-2153. doi: 10.1056/NEJMoa1604304
105. Schmidt AF, Swerdlow DI, Holmes MV, Patel RS, Fairhurst-Hunter Z, Lyall DM, et al. PCSK9 genetic variants and risk of type 2 diabetes: a mendelian randomisation study. Lancet Diabetes Endocrinol 2017;5:97-105. doi: 10.1016/S2213-8587(16)30396-5
106. Trinder M, Zekavat SM, Uddin MM, Pampana A, Natarajan P. Apolipoprotein B is an insufficient explanation for the risk of coronary disease associated with lipoprotein(a). Cardiovasc Res 2021;117:1245-1247. doi: 10.1093/cvr/cvab060
107. Sheridan C. RNA drugs lower lipoprotein(a) and genetically driven cholesterol. Nat Biotechnol 2022;40:983-985. doi: 10.1038/s41587-022-01396-x
108. Anderson EL, Williams DM. Drug target Mendelian randomisation: are we really instrumenting drug use? Diabetologia 2023;66:1156-1158. doi: 10.1007/s00125-023-05875-x
109. Yuan S, Li X, Lin A, Larsson SC. Interleukins and rheumatoid arthritis: bi-directional Mendelian randomization investigation. Semin Arthritis Rheum 2022;53:151958. doi: 10.1016/j.semarthrit.2022.151958
110. Cupido AJ, Asselbergs FW, Natarajan P, Group CIW, Ridker PM, Hovingh GK, Schmidt AF. Dissecting the IL-6 pathway in cardiometabolic disease: A Mendelian randomization study on both IL6 and IL6R.
Br J Clin Pharmacol 2022;88:2875-2884. doi: 10.1111/bcp.15191 111. Interleukin-6 Receptor Mendelian Randomisation Analysis C, Swerdlow DI, Holmes MV, Kuchenbaecker KB, Engmann JE, Shah T, et al. The interleukin-6 receptor as a target for prevention of coronary heart disease: a mendelian randomisation analysis. Lancet 2012;379:1214-1224. doi: 10.1016/S0140-6736(12)60110-X 112. Harrison SC, Smith AJ, Jones GT, Swerdlow DI, Rampuri R, Bown MJ, et al. Interleukin-6 receptor pathways in abdominal aortic aneurysm. Eur Heart J 2013;34:3707- 3716. doi: 10.1093/eurheartj/ehs354 113. Rosa M, Chignon A, Li Z, Boulanger MC, Arsenault BJ, Bosse Y, et al. A Mendelian randomization study of IL6 signaling in cardiovascular diseases, immune-related disorders and longevity. NPJ Genom Med 2019;4:23. doi: 10.1038/s41525-019-0097-4 114. Yuan S, Lin A, He QQ, Burgess S, Larsson SC. Circulating interleukins in relation to coronary artery disease, atrial fibrillation and ischemic stroke and its subtypes: A two-sample Mendelian randomization study. Int J Cardiol 2020;313:99-104. doi: 10.1016/j.ijcard.2020.03.053 115. Georgakis MK, Malik R, Gill D, Franceschini N, Sudlow CLM, Dichgans M, Invent Consortium CIWG. Interleukin-6 Signaling Effects on Ischemic Stroke and Other Cardiovascular Outcomes: A Mendelian Randomization Study. Circ Genom Precis Med 2020;13:e002872. doi: 10.1161/CIRCGEN.119.002872 116. Larsson SC, Burgess S, Gill D. Genetically proxied interleukin-6 receptor inhibition: opposing associations with COVID-19 and pneumonia. Eur Respir J 2021;57. doi: 10.1183/13993003.03545-2020 117. Yuan S, Wang L, Zhang H, Xu F, Zhou X, Yu L, et al. Mendelian randomization and clinical trial evidence supports TYK2 inhibition as a therapeutic target for autoimmune diseases. EBioMedicine 2023;89:104488. doi: 10.1016/j.ebiom.2023.104488 118. Gregson JM, Freitag DF, Surendran P, Stitziel NO, Chowdhury R, Burgess S, et al. 
Genetic invalidation of Lp-PLA(2) as a therapeutic target: Large-scale study of five functional Lp-PLA(2)-lowering alleles. Eur J Prev Cardiol 2017;24:492-504. doi: 10.1177/2047487316682186 119. Millwood IY, Bennett DA, Walters RG, Clarke R, Waterworth D, Johnson T, et al. A phenome-wide association study of a lipoprotein-associated phospholipase A2 loss-of- function variant in 90 000 Chinese adults. International Journal of Epidemiology 2016;45:1588-1599. doi: 120. Sun BB, Chiou J, Traylor M, Benner C, Hsu Y-H, Richardson TG, et al. Genetic regulation of the human plasma proteome in 54,306 UK Biobank participants. BioRxiv 2022:2022.2006.2017.496443. doi: 10.1101/2022.06.17.496443 121. Yin X, Chan LS, Bose D, Jackson AU, VandeHaar P, Locke AE, et al. Genome-wide association studies of metabolites in Finnish men identify disease-relevant loci. Nat Commun 2022;13:1644. doi: 10.1038/s41467-022-29143-5 25 122. Consortium GT. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 2020;369:1318-1330. doi: 10.1126/science.aaz1776 123. Tian H, Mason AM, Liu C, Burgess S. Relaxing parametric assumptions for non- linear Mendelian randomization using a doubly-ranked stratification method. BioRxiv, posted preprint on June 30, 2022. doi: https://doi.org/10.1101/2022.06.28.497930. doi: 124. Batool F, Patel A, Gill D, Burgess S. Disentangling the effects of traits with shared clustered genetic predictors using multivariable Mendelian randomization. Genet Epidemiol 2022;46:415-429. doi: 10.1002/gepi.22462 125. Zuber V, Lewin A, Levin MG, Haglund A, Ben-Aicha S, Emanueli C, et al. Multi- response Mendelian randomization: Identification of shared and distinct exposures for multimorbidity and multiple related disease outcomes. The American Journal of Human Genetics 2023;110:1177-1199. doi: 10.1016/j.ajhg.2023.06.005 https://doi.org/10.1101/2022.06.28.497930 26 Box 1. 
Glossary of frequently used terms in Mendelian randomization studies

Concepts
• Causality – Refers to a cause-and-effect relationship: altering the level of the exposure would change the outcome (or change the risk of the outcome where it is a disease). In contrast, an ‘association’ does not necessarily imply causality but merely that the exposure and outcome are correlated.
• Confounding – Refers to a distortion in the estimate for a risk factor-outcome association that occurs when the risk factor of interest is associated with another factor that causally affects the outcome. For example, the association of alcohol consumption with coronary heart disease risk may be confounded by the fact that people who drink alcohol are also more likely to smoke cigarettes, which has a causal influence on disease risk.
• Genome-wide association study (GWAS) – A hypothesis-free study design that tests the associations of thousands or millions of genetic variants with a phenotype. The principal aim of a GWAS is to identify variants that are associated with the phenotype, which can be used to identify genes that are relevant to the etiology of the phenotype or to develop a predictive polygenic score for the phenotype.
• Linkage disequilibrium (LD) – Refers to the non-independent segregation of genetic variants. Genetic variants in proximity on the same chromosome can be inherited together, which can lead to correlations between them if allele frequencies are similar.
• Mendelian randomization (MR) – The use of genetic variants associated with the exposure (proposed risk factor) to understand the causal effect of the exposure on a health outcome.
• Mendelian randomization phenome-wide association study (MR-PheWAS) – A hypothesis-free study design that performs Mendelian randomization for a risk factor on a wide range of outcomes. A limitation of this approach is multiple hypothesis testing, which leads to a challenge in identifying true associations and biologically relevant associations.
• Phenotype – Refers to an individual’s observable characteristics, such as eye color, blood type, and body weight. An individual’s phenotype may be determined by genotype alone (e.g., in the case of blood type) or by both genotype and environmental factors (e.g., for body weight).
• Pleiotropy – Refers to the association of a genetic variant with multiple phenotypes. Horizontal pleiotropy refers to the association of a genetic variant with more than one phenotype on discrete biological pathways. This pleiotropy is of concern as it violates the exclusion-restriction assumption and can distort the results. Vertical pleiotropy refers to the association of a genetic variant with more than one phenotype on the same biological pathway, which does not invalidate the findings.
• Reverse causation (also known as reverse causality) – A phenomenon by which the outcome (disease) affects the levels of the exposure (risk factor) rather than vice versa, as would be expected. This bias is minimized in MR studies as genetic variants are unchangeable and cannot be influenced by disease status.
• Single-nucleotide polymorphism – A common genetic variation in which one base in the DNA is changed (e.g., a C instead of a T at a particular place in the genetic sequence).

Statistical methods
• Inverse-variance weighted method – Most efficient (greatest statistical power) and usually the main analytical method in MR studies involving multiple genetic variants. Requires that all genetic variants are valid instrumental variables.
• Weighted median – A common complementary method in MR studies that operates by taking the median of variant-specific estimates. Robust to outliers but sensitive to the addition or removal of genetic variants.
• Multivariable MR – A statistical method that allows for the association of genetic variants with multiple risk factors to be incorporated into the analysis. The approach can be used to adjust for known confounders or to explore the mediating effects of factors that are in the causal pathway from the risk factor of interest to the outcome.
• MR-Egger – A common complementary method in MR studies. Can test and adjust for pleiotropy but is sensitive to outliers and less efficient compared to the inverse-variance weighted method.
• MR-PRESSO (Pleiotropy RESidual Sum and Outlier) – Can identify and remove outliers but has a high false positive rate with several invalid instrumental variables.
• Non-linear MR – A statistical approach to assess the shape of the causal relationship between the exposure and the outcome, and in particular whether the causal effect of the exposure on the outcome varies at different levels of the exposure.

Table 1 Examples of genome-wide association studies relevant to cardiovascular research

| Phenotype | Consortium or study | No. of genetic variants or loci* | Total sample size (cases) | Reference |
| --- | --- | --- | --- | --- |
| Potential risk factors | | | | |
| Alcohol and tobacco use | GSCAN | 378/99† | Up to 1 232 091 | 6 |
| Blood lipids | GLGC | 773 loci | Up to 1 654 960 | 5 |
| Blood pressure traits | ICBP and UKBB | >1000 variants | 757 601 | 4 |
| Body mass index | GIANT and UKBB | 941 variants | 681 275 | 2 |
| Coffee consumption | CCGC | 8 loci | 91 462 | 7 |
| Glycemic traits | MAGIC | 242 loci | 281 416 | 3 |
| Physical activity | 51 studies | 11 loci for MVPA | Up to 703 901 | 8 |
| Cardiovascular outcomes | | | | |
| Aortic valve stenosis | Ten studies | 18 variants | 653 867 (13 765) | 13 |
| Atrial fibrillation | AFGen and five other studies | 142 variants | 1 030 836 (60 620) | 10 |
| Coronary artery disease | CARDIoGRAMplusC4D | 241 loci | 1 165 690 (181 522) | 11 |
| Heart failure | HERMES and MVP | 39 variants | 1 188 957 (90 653) | 14 |
| Stroke | GIGASTROKE | 89 loci | 1 614 080 (110 182) | 12 |

ICBP, International Consortium of Blood Pressure Genome Wide Association Studies; CARDIoGRAMplusC4D, Coronary ARtery DIsease Genome wide Replication and Meta-analysis plus The Coronary Artery Disease Genetics consortium; CCGC, Coffee and Caffeine Genetics Consortium; GIANT, Genetic Investigation of ANthropometric Traits; GLGC, Global Lipids Genetics Consortium; GSCAN, GWAS and Sequencing Consortium of Alcohol and Nicotine use; MAGIC, Meta-Analysis of Glucose and Insulin-related Traits Consortium; MVP, Million Veteran Program; MVPA, moderate-to-vigorous intensity physical activity; UKBB, UK Biobank.
*Number of independent or near-independent genetic variants (single-nucleotide polymorphisms) or loci identified to be associated with the phenotype at the genome-wide significance threshold.
†Number of near-independent genetic variants associated with smoking initiation/alcohol consumption.

Figure legends

Graphical Abstract. Mendelian randomization findings on major cardiometabolic and lifestyle factors and common cardiovascular diseases.

Figure 1. Comparison of randomized controlled trial (RCT) and Mendelian randomization (MR) study designs showing the common basis behind interpretation of a causal effect of higher serum calcium levels on coronary heart disease. According to Mendel’s laws, random and independent inheritance of genetic alleles can be thought of as analogous to the random allocation of treatment vs. placebo in an RCT. Therefore, by the same reasoning, if MR finds that genetic variants affecting serum calcium levels are associated with a difference in coronary heart disease risk, it provides evidence that serum calcium causally affects coronary heart disease.

Figure 2. Illustration of the Mendelian randomization assumptions with the example of alcohol consumption as the putative risk factor and coronary heart disease as the outcome. Dashed lines indicate pathways that would violate the assumptions. In this example, a genetic variant in the alcohol dehydrogenase 1B (ADH1B) gene is robustly associated with alcohol consumption in individuals of European ancestries, is not associated with smoking (a main potential confounder) and has a key role in the metabolism of alcohol.

Figure 3. Examples of horizontal and vertical pleiotropy in a Mendelian randomization study. (A) An example of horizontal pleiotropy, in which the variant used as an instrumental variable for coffee consumption is also associated with smoking (a confounder); this violates the third assumption (exclusion restriction) and can invalidate the results. (B) An example of vertical pleiotropy, in which the effect of physical activity on coronary heart disease is mediated by body weight. This does not distort the findings.
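The inverse-variance weighted, weighted median, and MR-Egger methods listed under "Statistical methods" in Box 1 can be illustrated with summary-level data. The following is a minimal Python sketch, not the authors' implementation: it assumes uncorrelated variants, first-order weights (standard error of the variant-outcome association only), and a fixed-effect model. The function names `ivw`, `weighted_median`, and `mr_egger`, and the toy numbers, are hypothetical and chosen for illustration only.

```python
import numpy as np

def ivw(bx, by, se_by):
    """Fixed-effect inverse-variance weighted estimate and standard error.

    bx: variant-exposure associations, by: variant-outcome associations,
    se_by: standard errors of by (first-order weights; an assumption here).
    """
    bx, by, se_by = (np.asarray(a, dtype=float) for a in (bx, by, se_by))
    ratio = by / bx                  # variant-specific ratio estimates
    w = bx**2 / se_by**2             # inverse variance of each ratio estimate
    return np.sum(w * ratio) / np.sum(w), np.sqrt(1.0 / np.sum(w))

def weighted_median(bx, by, se_by):
    """Weighted median of the variant-specific ratio estimates."""
    bx, by, se_by = (np.asarray(a, dtype=float) for a in (bx, by, se_by))
    ratio = by / bx
    w = bx**2 / se_by**2
    order = np.argsort(ratio)
    ratio, w = ratio[order], w[order]
    # Percentile of each estimate: cumulative weight minus half its own weight.
    p = (np.cumsum(w) - 0.5 * w) / np.sum(w)
    # Interpolate between the estimates on either side of the 50th percentile.
    return float(np.interp(0.5, p, ratio))

def mr_egger(bx, by, se_by):
    """MR-Egger: weighted regression of by on bx with an intercept.

    Returns (intercept, slope); a non-zero intercept suggests directional
    pleiotropy, and the slope is the pleiotropy-adjusted estimate.
    """
    bx, by, se_by = (np.asarray(a, dtype=float) for a in (bx, by, se_by))
    sign = np.where(bx < 0, -1.0, 1.0)
    bx, by = bx * sign, by * sign    # orient variants to exposure-increasing alleles
    design = np.column_stack([np.ones_like(bx), bx])
    sw = 1.0 / se_by                 # square root of the weights 1/se_by^2
    coef, *_ = np.linalg.lstsq(design * sw[:, None], by * sw, rcond=None)
    return float(coef[0]), float(coef[1])

# Toy data (hypothetical): true effect 0.5, no pleiotropy.
bx = np.array([0.1, 0.2, 0.3, 0.4])
se_by = np.array([0.01, 0.01, 0.02, 0.02])
by = 0.5 * bx
print(ivw(bx, by, se_by), weighted_median(bx, by, se_by), mr_egger(bx, by, se_by))

# Same variants with a constant pleiotropic effect of 0.02 on the outcome.
by_pleio = 0.5 * bx + 0.02
print(ivw(bx, by_pleio, se_by), mr_egger(bx, by_pleio, se_by))
```

In the second toy data set every variant carries the same direct effect on the outcome, so the inverse-variance weighted estimate is pushed away from the true value while the MR-Egger intercept absorbs the offset. Real analyses would typically rely on an established package and report standard errors for all three estimators.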