Bayesian Optimization in the Latent Space of a Variational Autoencoder for the Generation of Selective FLT3 Inhibitors

Raghav Chandra, Robert I. Horne,* and Michele Vendruscolo*

Cite This: J. Chem. Theory Comput. 2024, 20, 469−476

ABSTRACT: The process of drug design requires the initial identification of compounds that bind their targets with high affinity and selectivity. Advances in generative modeling of small molecules based on deep learning are offering novel opportunities for making this process faster and cheaper. Here, we propose an approach to achieve this goal, in which predictions of binding affinity are used in conjunction with the Junction Tree Variational Autoencoder (JTVAE), whose latent space is used to facilitate the efficient exploration of the chemical space using a Bayesian optimization strategy. The exploration identifies small molecules predicted to have both high affinity and high selectivity by using an objective function that optimizes the binding to the target while penalizing the binding to off-targets. The framework is demonstrated for FMS-like tyrosine kinase 3 (FLT3) and shown to generate small molecules with predicted affinity and selectivity comparable to those of clinically approved drugs for this target.
■ INTRODUCTION

Drug discovery is highly expensive, uncertain, and inefficient.1,2 It has been estimated that the development cost of a new drug is close to $2 billion.1,3,4 It is also remarkable that 24% of all marketed drugs and 35% of anticancer drugs originated from serendipitous discoveries.5 The advent of deep learning methods is providing novel opportunities to reduce the time, cost, and rate of failure of drug discovery pipelines.6−8 These approaches frequently require large amounts of relevant data, as in the identification of the experimental antibiotics halicin9 and abaucin,10 where deep learning methods were trained on the experimentally measured inhibition of bacterial growth by thousands of small molecules.

To reduce this high data requirement, the aim of this work is to develop an end-to-end pipeline for the generation of small molecules with high affinity for a chosen target binding pocket and with low affinity for structurally similar off-target pockets, without the need for extensive target-specific data. The low data requirement of this approach is based on the following observation. A fundamental aspect of generative modeling is to use a deep learning strategy to estimate a function that assigns a probability for a given compound to bind its intended target. Knowing this function enables one, at least in principle, to sample the chemical space to identify compounds with predicted high affinity. However, there are at least two major problems with this approach. The first is that the chemical space relevant for drug discovery has been estimated to contain some 10^60 compounds,11 and thus learning the binding affinity function requires substantial amounts of data. The second is that molecular representations are often discontinuous,12,13 and thus not easily amenable to efficient searches.
In order to overcome these problems, one can work in the latent chemical space, which is the vector space used by variational autoencoders in deep-learning methods to represent a compound.12,13 Since the dimensionality of a typical latent space can be on the order of just 10^2, one can ask whether learning the function in the latent chemical space could require less data. Furthermore, since in the latent space of variational autoencoders (VAEs) small molecules corresponding to similar vectors are structurally similar, the search for binders of high affinity and specificity can be cast as an optimization problem. Based on this idea, the pipeline described in this work has 5 components: (1) a method of predicting the binding affinity of a compound for its target pocket, (2) a method of searching for off-target pockets on other proteins similar to the target pocket, (3) a variational autoencoder to represent the structure of the compound, (4) an objective function to estimate the binding affinity and the specificity for the target pocket, and (5) a Bayesian optimization method to maximize the objective function in the latent space of the variational autoencoder. A schematic overview of the pipeline is shown in Figure 1.

Received: November 4, 2023. Revised: November 25, 2023. Accepted: November 27, 2023. Published: December 19, 2023. © 2023 The Authors. Published by American Chemical Society. This article is licensed under CC-BY 4.0. https://doi.org/10.1021/acs.jctc.3c01224
This pipeline is illustrated for FMS-like tyrosine kinase 3 (FLT3), which was chosen due to its coverage in the scientific literature, including data on drugs with documented off-target binding.14 FLT3 is mutated in 28% of adult patients with de novo acute myeloid leukemia,15 and belongs to the class III family of receptor tyrosine kinases (RTKs).16 Here we target the ATP binding site of FLT3, using the binding pocket of the structure with gilteritinib bound. Gilteritinib, which binds at this site, is a type I inhibitor. In this way, we are targeting a known inhibitory mechanism and therefore believe the generated molecules may have relevance as potential type I inhibitors of this particular kinase. Although several generative methods have already been reported to identify kinase inhibitors,17−21 to our knowledge, the problem of specificity for this class of targets22−24 has received less attention in generative modeling. Our results indicate that the pipeline that we report is effective in generating small molecules with a predicted high affinity and high specificity for the intended target.

■ RESULTS

Pocket Similarity Search. Naïve optimization of the binding affinity of small molecules against a selected target would confer no selectivity and would therefore be likely to lead to nonspecific small molecules. Therefore, we selected structurally similar protein pockets to be used as off-target test cases. A fast 3D pocket alignment method, PoSSuM,25 was used for this purpose (see Methods). PoSSuM uses putative and known binding sites algorithmically determined from the Protein Data Bank (PDB),26 and embeds pockets into feature vectors that encode their geometric and physicochemical properties. The cosine similarity of these vectors can then be computed, and 3D alignment is carried out on pairs that have high cosine similarities, indicating that the pockets are similar.
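The cosine-similarity prefilter described above can be illustrated with a minimal sketch. It is not the PoSSuM implementation: the pocket names, the 4-dimensional descriptors, and the helper functions `cosine_similarity` and `similar_pockets` are hypothetical and serve only to show how a similarity threshold selects candidate pockets for subsequent 3D alignment.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def similar_pockets(query, candidates, threshold):
    """Return (name, similarity) pairs above the threshold, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    kept = [(name, s) for name, s in scored if s > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Toy descriptors (hypothetical values, for illustration only).
query_pocket = [1.0, 0.5, 0.0, 2.0]
candidate_pockets = {
    "pocket_a": [1.0, 0.4, 0.1, 2.1],  # nearly parallel to the query
    "pocket_b": [0.0, 2.0, 1.0, 0.0],  # points in a very different direction
}
hits = similar_pockets(query_pocket, candidate_pockets, threshold=0.8)
```

In this toy case only `pocket_a` passes the threshold. In a real workflow, the surviving pairs would then be aligned in 3D and ranked by a residue-level criterion.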
For the test case of FLT3, pockets with cosine similarities greater than 0.8 were sorted by the fraction of aligned residues that are identical, and the top five results were selected. The binding pocket used was that of the crystal structure of FLT3 in complex with gilteritinib,27 a small molecule binder of FLT3. The top five results after following this procedure were the following five kinases: PDGFRA (2/18 different residues), CKIT (2/18 different residues), VEGFR2 (5/16 different residues), MK2 (5/16 different residues), and JAK2 (6/18 different residues). The similarity search was validated using drugs against FLT3 with known off-target binding. Several examples of off-target binding are known for the first three pockets above which, like FLT3, are members of RTK class III.28 Notably, midostaurin, sorafenib, sunitinib, and other first-generation FLT3 inhibitors bind to these off-targets and are not selective for FLT3.29 Dual JAK2/FLT3 inhibitors have also been reported.30

Binding Affinity Prediction. The fast and accurate prediction of the binding affinity of a small molecule for a protein is a challenging problem.1,31 Here, we adopted a variant of AutoDock Vina,32 a widely used docking method to predict the protein−ligand complex structure and the corresponding binding score (see Methods). We found that the variant Vinardo33 performs better than the default parameters in all relevant metrics for FLT3. Vinardo has a parameter, “exhaustiveness,” which controls how comprehensive the docking procedure is for each molecule. Since the pipeline is modular, other binding affinity predictors could be used.

Objective Score. We defined an objective score as a function of the predicted binding affinities to the target and off-targets (see Eq. 1 and Methods).
We did not include any stipulations for drug-likeness in the objective function as it would complicate its optimization, apart from using JTVAE, which was trained on 250,000 drug-like molecules from the ZINC database.34 The quantitative estimate of drug-likeness (QED)30 was calculated for the top-scoring small molecules and found to be high for the majority of them.

Molecular Representations. We used the latent space of the Junction Tree Variational Autoencoder (JTVAE),35 which generates chemically valid small molecules, although not necessarily readily synthesizable or stable (see Methods). The JTVAE encodes and decodes small molecules using a 56-dimensional latent space where each dimension is normally distributed with mean 0 and variance 1. By sampling vectors in this latent space and decoding them, drug-like molecules can be generated.35 The published model was used, pretrained on 250,000 drug-like molecules from the ZINC database.34

Figure 1. Schematic overview of the method reported in this work. (A) Off-target identification module. After target selection, the PoSSuM25 database of similar protein−ligand binding pockets was used to screen for homologous off-targets. (B) Scoring module. AutoDock Vina32 was applied to screen compounds to predict the binding energies to the binding pocket of the target and to the binding pockets of the homologous off-targets. The binding energies were then used in an objective function that rewarded binding to the target and penalized off-target binding, to provide an overall score for each molecule (see Eq 1). (C) Overall iterative specificity pipeline. Compounds were initially randomly sampled from the latent space of a pretrained junction tree variational autoencoder (JTVAE).35 The resulting compounds were then passed through the scoring module, and the latent space of the molecular representations was then iteratively sampled via Bayesian optimization to obtain the molecules that maximized the objective function.

Bayesian Optimization. Due to the high computational cost of each prediction, which involves one target calculation and five off-target calculations by Vinardo, Bayesian optimization was selected as the method for optimization of the objective function36 (see Eq. 1 and Methods). Occasionally (in around 1% of vectors), JTVAE produces small molecules for which RDKit35,37 is unable to produce the 3D conformations that are required for docking and energy calculations by Vinardo. Empirically, this tends to be due to some chemically unfeasible or synthetically inaccessible substructure, and as such, we are not concerned with missing potential hits due to removing these points. Because the acquisition function usually has only one global optimum, Bayesian optimization is nonstochastic, and so an invalid point cannot simply be ignored, as optimization of the acquisition function would return the same point. Therefore, for such points the target and off-target binding affinities are all arbitrarily set to −5.0 kcal/mol, as this produces a low value of the objective function and discourages exploration of this region. For the case of other pockets, this could be set to the value for εanybind +1 kcal/mol in the objective function (see Eq. 1 below). The frequency of this event is sufficiently low that it does not significantly impact optimization.
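The handling of invalid points described above can be sketched as a thin wrapper around the docking step. This is an illustration under stated assumptions rather than the authors' code: `ConformerError`, `dock_all`, and `energies_or_fallback` are hypothetical names, and the docking step itself is represented by a placeholder callable.

```python
FALLBACK_ENERGY = -5.0  # kcal/mol, the value quoted in the text for the FLT3 case

# Target pocket followed by the five off-target pockets named in the text.
POCKETS = ("FLT3", "CKIT", "PDGFRA", "VEGFR2", "MK2", "JAK2")


class ConformerError(Exception):
    """Hypothetical error raised when no 3D conformation can be built."""


def energies_or_fallback(smiles, dock_all):
    """Return {pocket: energy in kcal/mol} for one molecule.

    `dock_all` is a placeholder callable mapping a SMILES string to a dict of
    docking energies. If it signals that no 3D conformation could be produced,
    every pocket is assigned the same fallback energy, so the point still
    receives a (low) objective value instead of being skipped.
    """
    try:
        return dock_all(smiles)
    except ConformerError:
        return {pocket: FALLBACK_ENERGY for pocket in POCKETS}


def failing_docker(smiles):
    # Stand-in for a docking step that cannot embed the molecule in 3D.
    raise ConformerError(smiles)


result = energies_or_fallback("C1CC1", failing_docker)
```

Returning a value rather than discarding the point matters because the optimizer would otherwise propose the same latent vector again.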
In all cases of repeated small molecules, we choose to return the previously calculated scores and binding affinities to avoid wasting computational resources on repeated re-evaluations.

As the optimization is carried out in a complex and noisy space, the Gaussian process correctly learns that it cannot infer long-range dependencies. In practice, this means that a vast majority of the latent space is stationary. Hence, the typical procedure for optimizing the acquisition function, namely randomizing a large number of points and performing derivative-based optimization on them, fails to find optima. However, we know that the optima of the acquisition function for such a problem must lie close to the best points that are already known, where the Gaussian process is nonstationary. Therefore, the optimization of the acquisition function is seeded near points where the objective evaluation was greater than 0.4, allowing efficient optimization of the acquisition function. Ten iterations of seeding and optimization of seeded points are carried out using 1024 randomized seeding points and 1024 points drawn with a standard deviation of 0.1 in each dimension around points where the objective evaluation was greater than 0.4. These points are optimized using L-BFGS-B38 and the point with the highest acquisition value is chosen as the point to sample.

Targeting FLT3. The structure and binding pocket of FLT3 used are shown in Figure 2, bound to gilteritinib. First, for comparison with Bayesian optimization, random sampling from the latent space was attempted, yielding around 4.5% of samples with objective evaluations greater than 0.5, which was taken as the threshold for being a hit (Figure 3). In all plots, repeated sampling of the same small molecule is not shown. The objective function is noisy and hence a significant fraction of hits from random sampling will be results of fortuitous noise.
Bayesian optimization was carried out following a random initialization phase, increasing the hit rate to around 10% (Figure 4). In the first ∼40 iterations, the optimization probes near a variety of points with high objective evaluations, many of which do not give strong evaluations, suggesting that the initial high scores were the results of fortuitous noise. The optimization then converges on a promising region of the latent space, in which it generates several high-scoring candidates that are unlikely to be results of fortuitous noise given their high frequency and structural similarities. Repetition of the optimization with different initializations resulted in convergence to different regions of the latent space. We thus generated candidates with high levels of structural diversity (Table 1 and Figure 5).

Figure 2. Structure of gilteritinib bound to FLT3. Image from the PDB26 of PDB ID 6JQR.27

Figure 3. Representative results of a run of 200 iterations of random sampling from the latent space. Hits (shown in orange), which we define to be small molecules with scores greater than 0.5, occur with a frequency of ∼4.5%. Structures of the highest performing 3 small molecules are also shown.

Figure 4. Representative results of a run of 200 iterations of Bayesian optimization following 150 initialization iterations. The expected improvement acquisition function was used with a fixed noise Gaussian process of 0.2 involving the Matérn 5/2 covariance kernel. Acquisition function optimization was seeded near points with objective evaluations greater than 0.4. Hits (shown in orange), which we define to be small molecules with scores greater than 0.5, occur with a frequency of ∼10%. The plot displays 156 unique small molecules, and structures of the highest performing 3 small molecules are also shown.

Evaluation of the Generated Small Molecules. Binding affinities of some clinically approved drugs used against FLT3 were predicted at an exhaustiveness of 64 for the purpose of comparison with generated small molecules (Table 2). Only type I inhibitors are shown, as these bind to the same pocket as gilteritinib, whose complex structure was used to determine the pocket locations, while type II inhibitors have a different binding site.39 The predictions were compared to literature information on known binding affinities to FLT3 and the off-targets40,41 (Table 3).

We defined an objective score as a function of the predicted binding affinities to the target (εtarget) and off-targets (εoff‑target) and an artificial “anybind” target (εanybind). The energy of the anybind target was empirically set to be 1 kcal/mol lower than the average observed binding to the target when randomly sampling from the latent space in order to remove weak binders to the target from consideration. The objective score is defined as

$$P = \frac{1}{Z}\, e^{-\beta \varepsilon_{\mathrm{target}}} = \frac{e^{-\beta \varepsilon_{\mathrm{target}}}}{e^{-\beta \varepsilon_{\mathrm{target}}} + e^{-\beta \varepsilon_{\mathrm{anybind}}} + \sum_{\mathrm{off\text{-}target}} e^{-\beta \varepsilon_{\mathrm{off\text{-}target}}}} \qquad (1)$$

Using binding affinities and

$$\beta \varepsilon = \ln(K_d) \qquad (2)$$

the objective function (Eq. 1) can be approximated to

$$P = \frac{K_{d,\mathrm{target}}^{-1}}{K_{d,\mathrm{target}}^{-1} + \sum_{\mathrm{off\text{-}target}} K_{d,\mathrm{off\text{-}target}}^{-1}} \qquad (3)$$

if it is assumed that the temperature of binding assays was close to 310 K and contributions to Z from unreported off-targets and the anybind term are negligible. Carrying out this conversion yields the scores indicated in Table 3, and the agreement with those derived from Vinardo (Table 2) is good, aside from gilteritinib (Figure 6A). Additionally, it was expected that Vinardo binding affinities should be proportional to ln(Kd) and this hypothesis was tested (Figure 6B). The error in the Vinardo predictions is linked to the number of rotatable bonds present in the small molecule42 and this is thought to explain its failure for gilteritinib, which has 9 rotatable bonds. The most successful predictions, midostaurin and lestaurtinib, have 1 and 3 rotatable bonds, respectively. The majority of the small molecules generated with this pipeline have fewer than 5 rotatable bonds.

It should be stressed that FLT3 was chosen as a target for its difficult selectivity, as PDGFRA and CKIT both differ by only two out of 18 aligned residues. Therefore, the problem of the distinction between the binding affinities to these pockets is especially difficult. It would be reasonable to expect that a more typical binding pocket could have more dissimilar off-targets, for which this distinction is easier.

Bayesian optimization was carried out three times, each time with 150 random initialization iterations followed by 200 optimization iterations. All hit small molecules (those with objective scores above 0.5) had their objective function rerun with a Vinardo exhaustiveness of 64 and the top 20 small molecules following this procedure are shown in Figure 5 and Table 1. These score comparably or more strongly than clinically approved drugs in all metrics (Table 2). They exhibit high QED scores and selectivity scores as well as strong predicted binding to FLT3. The high QED scores are believed to be inherent to the latent space of the variational autoencoder. Many of the predicted small molecules are amines, which were protonated with reference to a physiological pH.

Table 1. AutoDock Vina (Vinardo) Binding Affinity Predictions (kcal/mol) at 64 Exhaustiveness for Top 20 Generated Small Molecules across Three Runs with Objective Scores as Defined in Eq 1

Kinase pocket columns give AutoDock Vina binding energies in kcal mol−1; score and QED are metrics.

molecule  FLT3    CKIT   PDGFRA  VEGFR2  MK2    JAK2   score  QED
A         −9.11   −6.02  −6.48   −5.97   −6.14  −5.48  0.958  0.909
B         −10.26  −8.44  −6.97   −7.26   −4.62  −5.25  0.938  0.760
C         −8.25   −5.27  −4.58   −5.85   −5.60  −6.13  0.907  0.737
D         −9.49   −7.93  −6.33   −5.00   −5.89  −7.80  0.865  0.778
E         −8.56   −6.23  −5.21   −5.72   −5.99  −7.49  0.803  0.739
F         −8.00   −6.45  −4.96   −6.06   −5.27  −6.80  0.755  0.737
G         −7.70   −6.41  −5.02   −5.95   −5.21  −6.23  0.734  0.780
H         −7.48   −5.39  −4.40   −6.07   −5.55  −6.29  0.703  0.866
I         −7.85   −6.87  −4.52   −6.05   −3.9   −6.64  0.69   0.899
J         −7.50   −6.26  −4.97   −5.96   −5.79  −5.87  0.688  0.862
K         −8.06   −7.19  −5.83   −5.96   −5.98  −6.54  0.686  0.779
L         −8.05   −5.96  −7.41   −6.06   −4.31  −6.21  0.661  0.749
M         −6.98   −5.46  −4.76   −5.45   −4.62  −5.54  0.658  0.822
N         −7.67   −6.68  −4.85   −6.23   −6.27  −5.84  0.656  0.827
O         −7.58   −5.12  −6.22   −6.21   −5.85  −6.67  0.623  0.865
P         −7.60   −6.88  −5.68   −5.37   −5.38  −6.42  0.611  0.935
Q         −6.93   −5.80  −5.21   −5.48   −5.00  −5.26  0.606  0.822
R         −7.86   −7.27  −6.00   −5.74   −4.12  −6.86  0.586  0.912
S         −7.41   −6.04  −6.14   −6.37   −4.34  −6.36  0.584  0.880
T         −7.30   −6.41  −6.06   −5.74   −4.34  −6.20  0.573  0.934

Figure 5. Top 20 predictions after rerunning small molecules with scores greater than 0.5 at 64 exhaustiveness. Small molecules from three runs of 200 iterations of Bayesian optimization following 150 initialization iterations.

■ CONCLUSIONS

We have shown that Bayesian optimization in the latent space of a variational autoencoder is a powerful approach for the generation of small molecules predicted to be highly selective for their chosen target. The generated small molecules for the illustrative case of FLT3 were shown to be comparable, or in some cases superior, in all predicted metrics to clinically approved drugs, including drug-likeness, selectivity, and target binding affinity.
We believe that the pipeline demonstrated here represents a useful method for the generation of hit compounds at low computational expense when compared with strategies such as in silico high throughput screening of compound libraries. Importantly, our method encourages selectivity for the desired target, an aspect that is often neglected in computational approaches. We note that experimental validation will be the next step for this work to determine whether the generated small molecules are indeed selective.

■ METHODS

Binding Affinity Prediction. We used Vinardo,33 a variant of AutoDock Vina,32 a common docking method, to predict the protein−ligand complex structure and the corresponding binding score. Vinardo has a parameter, “exhaustiveness,” which controls how comprehensive the search is. Rigid receptors are used with an exhaustiveness of 8 for optimization, which is increased to 64 for subsequent validation of strong candidates.

Table 2. AutoDock Vina (Vinardo) Binding Affinity Predictions (kcal/mol) at 64 Exhaustiveness for Selected Type I Inhibitors of FLT3 with Objective Scores as Defined in Eq 1

Kinase pocket columns give AutoDock Vina binding energies in kcal mol−1; score and QED are metrics.

drug          FLT3   CKIT   PDGFRA  VEGFR2  MK2    JAK2   score  QED
gilteritinib  −5.55  −5.19  −4.73   −4.78   −4.62  −5.91  0.162  0.428
crenolanib    −7.33  −6.33  −4.93   −5.36   −4.99  −6.06  0.657  0.504
sunitinib     −6.95  −6.79  −5.08   −5.58   −4.92  −6.47  0.378  0.626
lestaurtinib  −7.86  −6.19  −5.67   −6.17   −5.59  −7.72  0.492  0.373
midostaurin   −7.85  −6.36  −6.06   −6.25   −5.16  −7.05  0.645  0.287

Table 3. Kd Values (in nM) for Selected Type I Inhibitors of FLT3 with Scores Estimated Using Eq 3

Off-target Kd values are listed in the column order CKIT, PDGFRA, VEGFR2, MK2, JAK2, with unreported entries omitted.

drug          FLT3 Kd/nM  off-target Kd/nM    score  QED
gilteritinib  7.0         (unreported)        1      0.428
crenolanib    0.74        78, 3.2             0.806  0.504
sunitinib     0.41        0.37, 0.79, 1.5     0.345  0.626
lestaurtinib  8.5         150, 380, 220, 3.7  0.293  0.373
midostaurin   11.0        220, 380, 94        0.836  0.287

Figure 6. Correlations between predicted and experimental selectivity and binding affinity. (A) Selectivity scores calculated from experimental measurements (Eq. 3) against those from Vina-predicted binding affinities (Eq. 1). (B) Experimentally determined binding affinities against Vina-predicted binding affinities.

Molecular Representations. We used the Junction Tree Variational Autoencoder (JTVAE).35 The JTVAE encodes and decodes small molecules using a 56-dimensional latent space where each dimension is normally distributed with mean 0 and variance 1. The published model was used, pretrained on 250,000 drug-like molecules from the ZINC database.34

Pocket Similarity Search. We used the freely available database from a fast 3D pocket alignment method, PoSSuM.25 This was constructed using putative and known binding sites algorithmically determined from the Protein Data Bank (PDB),26 and embedding pockets into feature vectors that encode their geometric and physicochemical properties. The cosine similarity of these vectors can then be computed, and 3D alignment is carried out on pairs that have high cosine similarities, indicating that the pockets are similar.

Objective Score. We defined an objective score as a function of the predicted binding affinities to the target (εtarget) and off-targets (εoff‑target) and an artificial “anybind” target (εanybind), see Eq 1.
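As a worked illustration of Eq 1 and Eq 3, the following minimal sketch evaluates the score as a Boltzmann fraction with equal weights, assuming T = 310 K and an anybind energy of −6 kcal/mol as stated for FLT3. The function names `objective_score` and `score_from_kd` are hypothetical and this is not the authors' implementation. The inputs are taken from Table 1 (molecule A) and Table 3 (crenolanib).

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def objective_score(e_target, e_off_targets, e_anybind=-6.0, temperature=310.0):
    """Eq 1: Boltzmann fraction bound to the target (energies in kcal/mol)."""
    beta = 1.0 / (K_B * temperature)
    energies = [e_target, e_anybind, *e_off_targets]
    # Shift by the lowest energy before exponentiating for numerical stability;
    # the shift cancels in the ratio.
    e_min = min(energies)
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    return weights[0] / sum(weights)


def score_from_kd(kd_target, kd_off_targets):
    """Eq 3: the same fraction written with dissociation constants."""
    inv_target = 1.0 / kd_target
    return inv_target / (inv_target + sum(1.0 / kd for kd in kd_off_targets))


# Molecule A from Table 1: FLT3, then CKIT, PDGFRA, VEGFR2, MK2, JAK2.
score_a = objective_score(-9.11, [-6.02, -6.48, -5.97, -6.14, -5.48])

# A molecule that failed 3D embedding: all six energies set to -5.0 kcal/mol.
score_failed = objective_score(-5.0, [-5.0] * 5)

# Crenolanib from Table 3 (Kd values in nM).
score_crenolanib = score_from_kd(0.74, [78.0, 3.2])
```

Under these assumptions, `score_a` comes out at about 0.958 and `score_crenolanib` at about 0.806, matching the tabulated scores, while the all −5.0 kcal/mol fallback gives roughly 0.09, well below the hit threshold of 0.5.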
This score is based on the fraction of small molecules (P) that would be bound to the target pocket in the canonical ensemble, where it is assumed that the partition function (Z) is composed of equally weighted Boltzmann terms for the target, off-targets, and an artificial anybind target. The energy of the anybind target was empirically set to be 1 kcal/mol lower than the average observed binding to the target when randomly sampling from the latent space, which gives −6 kcal/mol in the case of FLT3. The anybind target is included to penalize weak binding to the desired target. This can be interpreted as an expectation that any small molecule will be able to find some environment where it will bind with that energy, and it biases the distribution of small molecules so that only those that bind more strongly than this value can have high values of the objective function. For the case of other binding pockets, this constant could either be specified using domain knowledge or set close to the average observed value after random sampling. β is equal to 1/kT, where k is the Boltzmann constant and T is the absolute temperature, here taken as 310 K. In theory, it would be possible to use protein abundance databases to appropriately weight Boltzmann terms.43 However, in practice, abundances differ by several orders of magnitude across databases, so the simpler approach of equal weighting is taken here. Unequal weighting could be used if there were large differences in the magnitudes of target abundances. For the test case of FLT3 and its identified off-targets, this is not the case.

Bayesian Optimization. Bayesian optimization was carried out using the expected improvement acquisition function44 and the ARD Matérn 5/2 kernel36 with a fixed noise of 0.2.
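For readers unfamiliar with these two ingredients, the sketch below writes out the standard textbook forms of the ARD Matérn 5/2 kernel and of the expected improvement for a maximization problem. It is not taken from the authors' code: the function names and the example length scales are illustrative assumptions, and the Gaussian process posterior itself (where the fixed noise of 0.2 would enter) is not shown.

```python
import math


def matern52_ard(x, y, lengthscales, variance=1.0):
    """ARD Matern 5/2 kernel: one length scale per input dimension."""
    r = math.sqrt(sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, lengthscales)))
    s = math.sqrt(5.0) * r
    # (1 + sqrt(5) r + 5 r^2 / 3) * exp(-sqrt(5) r), written in terms of s.
    return variance * (1.0 + s + s * s / 3.0) * math.exp(-s)


def expected_improvement(mean, std, best):
    """Expected improvement over `best` for a Gaussian posterior (maximization)."""
    if std <= 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf


# Identical points are perfectly correlated; distant points are nearly independent.
k_same = matern52_ard([0.1, 0.2], [0.1, 0.2], [0.5, 0.5])
k_far = matern52_ard([0.0, 0.0], [1.0, 1.0], [0.1, 0.1])

# A candidate whose posterior mean equals the incumbent still has positive
# expected improvement because of its posterior uncertainty.
ei_at_best = expected_improvement(mean=0.4, std=0.2, best=0.4)
```

With short length scales the kernel decays quickly, which is the behavior the text describes when the Gaussian process learns that it cannot infer long-range dependencies.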
As we choose to return previously calculated scores for small molecules already seen, an automatic determination of the noise would incorrectly infer very small noise as repeated sampling in the same region would return exactly the same score. Returning calculated scores guides the Gaussian process to learn appropriately large characteristic length scales, which encourage sampling of new small molecules. Optimization is carried out in the unit hypercube, which is cast to the normally distributed space of JTVAE using inverse transform methods. ■ ASSOCIATED CONTENT Accession Codes The full code can be found at the GitHub repository: https:// github.com/raghavchandra123/selectivebayes. ■ AUTHOR INFORMATION Corresponding Authors Michele Vendruscolo − Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, U.K.; orcid.org/0000-0002-3616- 1610; Email: mv245@cam.ac.uk Robert I. Horne − Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, U.K.; orcid.org/0000-0003-1534- 2639; Email: rih29@cam.ac.uk Author Raghav Chandra − Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, U.K. Complete contact information is available at: https://pubs.acs.org/10.1021/acs.jctc.3c01224 Notes The authors declare no competing financial interest. ■ REFERENCES (1) Sadybekov, A. V.; Katritch, V. Computational approaches streamlining drug discovery. Nature 2023, 616 (7958), 673−685. (2) Wong, C. H.; Siah, K. W.; Lo, A. W. Estimation of clinical trial success rates and related parameters. Biostatistics 2019, 20 (2), 273− 286. (3) Paul, S. M.; Mytelka, D. S.; Dunwiddie, C. T.; Persinger, C. C.; Munos, B. H.; Lindborg, S. R.; Schacht, A. L. How to improve R&D productivity: the pharmaceutical industry’s grand challenge. Nat. Rev. Drug Discovery 2010, 9 (3), 203−214. (4) Wouters, O. J.; McKee, M.; Luyten, J. 
Estimated research and development investment needed to bring a new medicine to market, 2009−2018. JAMA 2020, 323 (9), 844−853.
(5) Hargrave-Thomas, E.; Yu, B.; Reynisson, J. Serendipity in anticancer drug discovery. World J. Clin. Oncol. 2012, 3 (1), 1.
(6) Schneider, P.; Walters, W. P.; Plowright, A. T.; Sieroka, N.; Listgarten, J.; Goodnow, R. A., Jr; Fisher, J.; Jansen, J. M.; Duca, J. S.; Rush, T. S. Rethinking drug design in the artificial intelligence era. Nat. Rev. Drug Discovery 2020, 19 (5), 353−364.
(7) Bender, A.; Cortés-Ciriano, I. Artificial intelligence in drug discovery: what is realistic, what are illusions? Part 1: Ways to make an impact, and why we are not there yet. Drug Discovery Today 2021, 26 (2), 511−524.
(8) Vamathevan, J.; Clark, D.; Czodrowski, P.; Dunham, I.; Ferran, E.; Lee, G.; Li, B.; Madabhushi, A.; Shah, P.; Spitzer, M. Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discovery 2019, 18 (6), 463−477.
(9) Stokes, J. M.; Yang, K.; Swanson, K.; Jin, W.; Cubillos-Ruiz, A.; Donghia, N. M.; MacNair, C. R.; French, S.; Carfrae, L. A.; Bloom-Ackermann, Z. A deep learning approach to antibiotic discovery. Cell 2020, 180 (4), 688−702.e13.
(10) Liu, G.; Catacutan, D. B.; Rathod, K.; Swanson, K.; Jin, W.; Mohammed, J. C.; Chiappino-Pepe, A.; Syed, S. A.; Fragis, M.; Rachwalski, K. Deep learning-guided discovery of an antibiotic targeting Acinetobacter baumannii. Nat. Chem. Biol. 2023, 19, 1342−1350, DOI: 10.1038/s41589-023-01349-8.
(11) Dobson, C. M. Chemical space and biology. Nature 2004, 432, 824−828.
(12) Gómez-Bombarelli, R.; Wei, J. N.; Duvenaud, D.; Hernández-Lobato, J. M.; Sánchez-Lengeling, B.; Sheberla, D.; Aguilera-Iparraguirre, J.; Hirzel, T. D.; Adams, R. P.; Aspuru-Guzik, A. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 2018, 4 (2), 268−276.
(13) Bilodeau, C.; Jin, W.; Jaakkola, T.; Barzilay, R.; Jensen, K. F. Generative models for molecular discovery: Recent advances and challenges. WIREs Comput. Mol. Sci. 2022, 12 (5), No. e1608.
(14) Daver, N.; Schlenk, R. F.; Russell, N. H.; Levis, M. J. Targeting FLT3 mutations in AML: review of current knowledge and evidence. Leukemia 2019, 33 (2), 299−312.
(15) Cancer Genome Atlas Research Network. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N. Engl. J. Med. 2013, 368 (22), 2059−2074.
(16) Gebru, M. T.; Wang, H.-G. Therapeutic targeting of FLT3 and associated drug resistance in acute myeloid leukemia. J. Hematol. Oncol. 2020, 13 (1), 155.
(17) Zhavoronkov, A.; Ivanenkov, Y. A.; Aliper, A.; Veselov, M. S.; Aladinskiy, V. A.; Aladinskaya, A. V.; Terentiev, V. A.; Polykovskiy, D. A.; Kuznetsov, M. D.; Asadulaev, A. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 2019, 37 (9), 1038−1040.
(18) Li, Y.; Zhang, L.; Wang, Y.; Zou, J.; Yang, R.; Luo, X.; Wu, C.; Yang, W.; Tian, C.; Xu, H. Generative deep learning enables the discovery of a potent and selective RIPK1 inhibitor. Nat. Commun. 2022, 13 (1), 6891.
(19) Moret, M.; Pachon Angona, I.; Cotos, L.; Yan, S.; Atz, K.; Brunner, C.; Baumgartner, M.; Grisoni, F.; Schneider, G. Leveraging molecular structure and bioactivity with chemical language models for de novo drug design. Nat. Commun. 2023, 14 (1), 114.
(20) Bajorath, J. Generative kinase inhibitor modeling viewed from a medicinal chemistry perspective. Future Med. Chem. 2023, 15 (4), 313−315.
(21) Krishnan, K.; Kassab, R.; Agajanian, S.; Verkhivker, G. Interpretable Machine Learning Models for Molecular Design of Tyrosine Kinase Inhibitors Using Variational Autoencoders and Perturbation-Based Approach of Chemical Space Exploration. Int. J. Mol. Sci.
2022, 23 (19), 11262.
(22) Müller, S.; Chaikuad, A.; Gray, N. S.; Knapp, S. The ins and outs of selective kinase inhibitor development. Nat. Chem. Biol. 2015, 11 (11), 818−821.
(23) Davis, M. I.; Hunt, J. P.; Herrgard, S.; Ciceri, P.; Wodicka, L. M.; Pallares, G.; Hocker, M.; Treiber, D. K.; Zarrinkar, P. P. Comprehensive analysis of kinase inhibitor selectivity. Nat. Biotechnol. 2011, 29 (11), 1046−1051.
(24) Lu, X.; Smaill, J. B.; Ding, K. New promise and opportunities for allosteric kinase inhibitors. Angew. Chem., Int. Ed. 2020, 59 (33), 13764−13776.
(25) Ito, J.-I.; Tabei, Y.; Shimizu, K.; Tsuda, K.; Tomii, K. PoSSuM: a database of similar protein−ligand binding and putative pockets. Nucleic Acids Res. 2012, 40 (D1), D541−D548.
(26) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28 (1), 235−242.
(27) Kawase, T.; Nakazawa, T.; Eguchi, T.; Tsuzuki, H.; Ueno, Y.; Amano, Y.; Suzuki, T.; Mori, M.; Yoshida, T. Effect of Fms-like tyrosine kinase 3 (FLT3) ligand (FL) on antitumor activity of gilteritinib, a FLT3 inhibitor, in mice xenografted with FL-overexpressing cells. Oncotarget 2019, 10 (58), 6111.
(28) Abu-Duhier, F. M.; Goodeve, A. C.; Care, R. S.; Gari, M.; Wilson, G. A.; Peake, I. R.; Reilly, J. T. Mutational analysis of class III receptor tyrosine kinases (C-KIT, C-FMS, FLT3) in idiopathic myelofibrosis. Br. J. Haematol. 2003, 120 (3), 464−470.
(29) Verstovsek, S.; Odenike, O.; Singer, J. W.; Granston, T.; Al-Fayoumi, S.; Deeg, H. J. Phase 1/2 study of pacritinib, a next generation JAK2/FLT3 inhibitor, in myelofibrosis or other myeloid malignancies. J. Hematol. Oncol. 2016, 9 (1), 137.
(30) Bickerton, G. R.; Paolini, G. V.; Besnard, J.; Muresan, S.; Hopkins, A. L. Quantifying the chemical beauty of drugs. Nat. Chem. 2012, 4 (2), 90−98.
(31) Gentile, F.; Agrawal, V.; Hsing, M.; Ton, A.-T.; Ban, F.; Norinder, U.; Gleave, M. E.; Cherkasov, A.
Deep docking: a deep learning platform for augmentation of structure based drug discovery. ACS Cent. Sci. 2020, 6 (6), 939−949.
(32) Eberhardt, J.; Santos-Martins, D.; Tillack, A. F.; Forli, S. AutoDock Vina 1.2.0: New docking methods, expanded force field, and python bindings. J. Chem. Inf. Model. 2021, 61 (8), 3891−3898.
(33) Quiroga, R.; Villarreal, M. A. Vinardo: A scoring function based on autodock vina improves scoring, docking, and virtual screening. PLoS One 2016, 11 (5), No. e0155183.
(34) Irwin, J. J.; Tang, K. G.; Young, J.; Dandarchuluun, C.; Wong, B. R.; Khurelbaatar, M.; Moroz, Y. S.; Mayfield, J.; Sayle, R. A. ZINC20 - a free ultralarge-scale chemical database for ligand discovery. J. Chem. Inf. Model. 2020, 60 (12), 6065−6073.
(35) Jin, W.; Barzilay, R.; Jaakkola, T. Junction tree variational autoencoder for molecular graph generation. Proceedings of the 35th International Conference on Machine Learning, 2018; pp 2323−2332.
(36) Snoek, J.; Larochelle, H.; Adams, R. P. Practical bayesian optimization of machine learning algorithms. Adv. Neural Inf. Process. Syst. 2012, 25.
(37) Landrum, G. RDKit: Open-source cheminformatics, Google Scholar, 2006.
(38) Byrd, R. H.; Lu, P.; Nocedal, J.; Zhu, C. A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 1995, 16 (5), 1190−1208.
(39) Kiyoi, H.; Kawashima, N.; Ishikawa, Y. FLT3 mutations in acute myeloid leukemia: Therapeutic paradigm beyond inhibitor development. Cancer Sci. 2020, 111 (2), 312−322.
(40) Heinrich, M. C.; Griffith, D.; McKinley, A.; Patterson, J.; Presnell, A.; Ramachandran, A.; Debiec-Rychter, M. Crenolanib inhibits the drug-resistant PDGFRA D842V mutation associated with imatinib-resistant gastrointestinal stromal tumors. Clin. Cancer Res. 2012, 18 (16), 4375−4384.
(41) Davies, M.; Nowotka, M.; Papadatos, G.; Dedman, N.; Gaulton, A.; Atkinson, F.; Bellis, L.; Overington, J. P. ChEMBL web services: streamlining access to drug discovery data and utilities.
Nucleic Acids Res. 2015, 43 (W1), W612−W620.
(42) Trott, O.; Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 2010, 31 (2), 455−461.
(43) Wang, M.; Herrmann, C. J.; Simonovic, M.; Szklarczyk, D.; von Mering, C. Version 4.0 of PaxDb: protein abundance data, integrated across model organisms, tissues, and cell-lines. Proteomics 2015, 15 (18), 3163−3168.
(44) Mockus, J. Application of Bayesian approach to numerical methods of global and stochastic optimization. J. Glob. Optim. 1994, 4, 347−365.