I see the whole project in the context of targeted / knowledge-based scoring versus physics-based approaches (e.g. MM-PBSA, LIE, PB binding calculations). Both approaches are starting to come within computational reach for virtual screening and library design, and hence it is useful to establish what one can expect. COMBINE is, at least to me, at the extreme end of the knowledge-based scoring functions: if one stays within one series, it is well established that the results are excellent. For library design the challenge is to select binders from a set of molecules with an identical scaffold (geometry), a challenge for most docking programs, hence the value of more detailed scoring functions.
Make a table with molecule picture, pIC50, reference, and PDB code for the supplementary material, together with a short write-up noting that we are aware of the problem of mixing ligands from several labs.
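A minimal sketch of how that supplementary table could be generated with RDKit, assuming the ligand data sit in a hypothetical ligands.csv with smiles, pIC50, reference and pdb_code columns (file and column names are placeholders, not our actual data):

```python
import pandas as pd
from rdkit import Chem
from rdkit.Chem import Draw

df = pd.read_csv("ligands.csv")  # hypothetical file with the four columns above
mols = [Chem.MolFromSmiles(s) for s in df["smiles"]]
legends = [f"pIC50 = {row.pIC50:.2f} | {row.pdb_code} | {row.reference}"
           for row in df.itertuples()]
img = Draw.MolsToGridImage(mols, molsPerRow=4, subImgSize=(250, 200),
                           legends=legends)
img.save("supplementary_ligand_table.png")  # returns a PIL image outside a notebook
```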
Technical procedure, variable selection procedure, and final plots (q2, experimental vs predicted, residuals).
Write up the methods for docking (how the receptor was prepared, etc.).
In this section we look at the feasibility of using COMBINE to score docked structures in a virtual screening setting. The experiment proceeds in three stages:
1. Use the training set with known X-ray structures to examine how well docking plus minimisation reproduces the X-ray poses (i.e. RMSD plots: simply a histogram of RMSDs plus one picture of a successful docking; a minimal RMSD sketch follows this list). Having established a docking protocol, we move on to binding affinity prediction for the training set (section Training set below). We also need to score the final poses with ZAP for reference, since we will compare against it later.
2. Take a test set with known Ki values (but unknown X-ray structures), use GOLD to predict the binding mode, and then score with COMBINE or ZAP for binding affinity prediction.
3. Take random molecules (with an extremely low probability of binding) and see how they score with the model; here we use the NCI dataset. This emulates a virtual screening setting. The obvious criticism is that it is trivial to separate NCI compounds from protease inhibitors computationally; perhaps we can add a property-matched set of inhibitors as well.
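A minimal sketch of the stage-1 RMSD histogram, assuming RDKit and that each X-ray/docked pose pair is stored as mol files in the same receptor frame (the ligand identifiers and file names below are placeholders):

```python
from rdkit import Chem
from rdkit.Chem import rdMolAlign
import matplotlib.pyplot as plt

ligand_ids = ["lig01", "lig02"]  # placeholder identifiers for the training set

rmsds = []
for name in ligand_ids:
    ref = Chem.MolFromMolFile(f"{name}_xray.mol")     # crystallographic pose
    pose = Chem.MolFromMolFile(f"{name}_docked.mol")  # docked pose, same frame
    # CalcRMS handles graph symmetry but does not realign the molecules,
    # which is what we want when both poses sit in the receptor frame.
    rmsds.append(rdMolAlign.CalcRMS(pose, ref))

plt.hist(rmsds, bins=20)
plt.xlabel("RMSD to X-ray pose (Å)")
plt.ylabel("count")
plt.savefig("rmsd_histogram.png")
```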
Redocking of inhibitors used to develop COMBINE model (GOLD)
Details on the method and the receptor used.
Results:
Look at the binding energy prediction for the training dataset. There is a set of baseline models that we compare against. The simplest is to predict every Ki as the training-set mean and look at the error in the predictions. We also compare with GOLD ChemScore and ZAP.
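A minimal sketch of the null-model comparison, with placeholder data; the point is that the RMSE of the mean-only prediction equals the (population) standard deviation of the response, which is the bar any real model has to beat:

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

y = np.array([-8.2, -9.1, -7.5, -10.0, -9.3])          # placeholder measured values
model_pred = np.array([-8.0, -9.4, -7.9, -9.5, -9.0])  # placeholder model output

null_pred = np.full_like(y, y.mean())  # "Ki = mean(Ki)" baseline
print("null model RMSE:", rmse(y, null_pred))  # equals the population SD of y
print("model RMSE:     ", rmse(y, model_pred))
```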
It looks as if the LV4 and LV5 scores perform best for the training set; make a plot of R-square and RMS error to prove/illustrate this (a plotting sketch is below).
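A sketch of such a plot, using scikit-learn's PLSRegression as a stand-in for the COMBINE chemometrics; the descriptor matrix and response here are random placeholders, not our data:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 50))        # placeholder interaction-energy matrix
y = rng.normal(-8.9, 1.7, size=30)   # placeholder response

n_lv = list(range(1, 7))
r2, rmse = [], []
for k in n_lv:
    pls = PLSRegression(n_components=k).fit(X, y)
    pred = pls.predict(X).ravel()
    r2.append(pls.score(X, y))                       # fitted R-square
    rmse.append(float(np.sqrt(np.mean((y - pred) ** 2))))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.plot(n_lv, r2, "o-")
ax1.set(xlabel="latent variables", ylabel="R-square")
ax2.plot(n_lv, rmse, "o-")
ax2.set(xlabel="latent variables", ylabel="RMSE")
fig.tight_layout()
fig.savefig("fit_vs_latent_variables.png")
```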
Summary of fit:

| RSquare                | 0.207084 |
| RSquare Adj            | 0.178765 |
| Root Mean Square Error | 1.546532 |
| Mean of Response       | -8.86933 |
| Observations           | 30       |

Response moments (the Std Dev doubles as the RMSE of the mean-only baseline):

| Mean           | -8.86933 |
| Std Dev [RMSE] | 1.706574 |
| Std Error      | 0.311576 |
| SSE            | 84.45949 |
Compare selection methods. Summary-of-fit statistics for the six models (Mean of Response = -8.86933 and N = 30 throughout; only the first block carried a label, LV4, in these notes):

| Model        | RSquare  | RSquare Adj | Root Mean Square Error |
| LV4          | 0.450632 | 0.431012    | 1.287291 |
| (unlabelled) | 0.417038 | 0.396218    | 1.326067 |
| (unlabelled) | 0.423632 | 0.403048    | 1.318545 |
| (unlabelled) | 0.465337 | 0.446242    | 1.269946 |
| (unlabelled) | 0.618364 | 0.604734    | 1.072927 |
| (unlabelled) | 0.614585 | 0.600821    | 1.078225 |
Details on the dataset.
Here there is a rapid decrease in performance with the number of latent variables, and little predictive value beyond the first one, so to some degree the model is overfitted (or arguably not fit for purpose...). Again, illustrate this with plots of R-square versus the number of latent variables for the different selection methods (see the cross-validation sketch below).
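A companion sketch for the overfitting check, computing leave-one-out q2 as a function of the number of latent variables (again with random placeholder data); a q2 that peaks at one latent variable while the fitted R-square keeps climbing is exactly the signature described above:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 50))        # placeholder descriptors
y = rng.normal(-8.9, 1.7, size=30)   # placeholder response

ss_tot = np.sum((y - y.mean()) ** 2)
for k in range(1, 7):
    # leave-one-out predictions for a k-component PLS model
    pred = cross_val_predict(PLSRegression(n_components=k), X, y,
                             cv=LeaveOneOut()).ravel()
    q2 = 1.0 - np.sum((y - pred) ** 2) / ss_tot
    print(f"{k} LV: q2 = {q2:+.3f}")
```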
Summary of fit for this dataset:

| RSquare                | 0.167599 |
| RSquare Adj            | 0.162923 |
| Root Mean Square Error | 1.287089 |
| Mean of Response       | -8.54581 |
| Observations           | 180      |
For this dataset we also want to look at ChemScore and ZAP performance. The correlation with experimental Ki values is atrocious, but that is usually the case for scoring functions; they may still be useful for virtual screening (next section).
Here we want to look at enrichments for the different selection methods. This is a standard virtual screening exercise, but arguably a rather easy case (not much value in this).
We need to produce enrichment plots for the dataset (a sketch of the calculation is below). It would be good to run this on a larger set of NCI compounds, and also on a property-matched set; I guess we could use our old thrombin dummies.
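A minimal sketch of the enrichment-curve calculation, with placeholder scores and labels; it assumes higher scores are better, so for binding energies (lower = better) negate before ranking:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
scores = rng.normal(size=200)                                # placeholder scores
labels = rng.permutation(np.r_[np.ones(30), np.zeros(170)])  # 30 actives, 170 decoys

order = np.argsort(-scores)                 # best-scored compounds first
found = np.cumsum(labels[order]) / labels.sum()
screened = np.arange(1, len(scores) + 1) / len(scores)

plt.plot(screened, found, label="model")
plt.plot([0, 1], [0, 1], "--", label="random")
plt.xlabel("fraction of database screened")
plt.ylabel("fraction of actives recovered")
plt.legend()
plt.savefig("enrichment.png")
```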
Performance on the library set: scoring within one scaffold. Look at overall performance and performance per library (sketch below).
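A minimal sketch of the per-library breakdown, assuming a table with one row per compound; the column names (library, expt, pred) and the data are placeholders:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({                       # placeholder data
    "library": ["A", "A", "A", "B", "B", "B"],
    "expt":    [-8.1, -9.0, -8.5, -7.4, -8.6, -8.0],
    "pred":    [-8.4, -8.7, -8.8, -7.9, -8.2, -7.7],
})

def fit_stats(g):
    # RMSE and Pearson r of predicted vs experimental affinity
    resid = g["expt"] - g["pred"]
    return pd.Series({"n": len(g),
                      "rmse": float(np.sqrt(np.mean(resid ** 2))),
                      "r": g["expt"].corr(g["pred"])})

print(fit_stats(df))                                             # overall
print(df.groupby("library")[["expt", "pred"]].apply(fit_stats))  # per library
```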