Procedure for preparing a thrombin model


1. Superposition

/home/henricsn/combine2go/data/thrombin/pdb/
superimpose 1ets onto 1o5a
superimpose the other thrombin structures onto the aligned 1ets structure

selecting PDB structures of thrombin

pconv -g -no -s 1ets -pdb 1ghy -n A09


Figure: superposition of the X-ray structures used for the thrombin model; blue: receptor, grey: ligands

2. Building a receptor model

selected 1c5l (human thrombin, apo) as the basis for the thrombin model
checked the structure against the electron density map (2Fo-Fc)
- structure looks fine; the loop between T147 and G150 and the loop Thr1H - Glu1C are not defined

removed alternative conformations and water molecules
residues removed from light chain: Thr1H - Glu1C, Asp14L - Arg15; residue L-STY63 was changed to Tyr
alternative conformations used for the heavy chain: S27b, M84b, E97Aa, M106b, E192a, M210a, E217a, K224a
/home/henricsn/combine2go/data/thrombin/pdb/model/thrombin-model-temp.pdb

xleap.sh -pdb thrombin-model -n 000 -prot thrombin-model-temp.pdb -lig model1.ligand -m /home/henricsn/combine2go/parameter/protmin.thrombin-model.template -w -noac -nopc -notemp -bres |tee xleap.log


Figure: thrombin model (thrombin-model.minbres.pdb) together with the superimposed ligands of the X-ray structures

thrombin-model.min.pdb
thrombin-model.minbres.pdb

3. Ki values

checking Ki values
/home/henricsn/combine2go/sdf/all_160206_3D.sdf

4. Preparation of ligands

added hydrogens with babel (babel -ipdb 1d9i.ligand.pdb -osdf 1d9i-babel.sdf) and pymol; saved the modified ligands as /home/henricsn/combine2go/data/thrombin/pdb/ligand/*.ligand.prot.pdb
34 ligands:
1a4w.ligand.prot.pdb 1c1v.ligand.prot.pdb 1dwd.ligand.prot.pdb 1o5g.ligand.prot.pdb
1ae8.ligand.prot.pdb 1c1w.ligand.prot.pdb 1fpc.ligand.prot.pdb 1qbv.ligand.prot.pdb
1afe.ligand.prot.pdb 1c4u.ligand.prot.pdb 1ghv.ligand.prot.pdb 1ta2.ligand.prot.pdb
1aix.ligand.prot.pdb 1c4v.ligand.prot.pdb 1ghw.ligand.prot.pdb 1ta6.ligand.prot.pdb
1bcu.ligand.prot.pdb 1c4y.ligand.prot.pdb 1ghy.ligand.prot.pdb 1tom.ligand.prot.pdb
1bhx.ligand.prot.pdb 1c5n.ligand.prot.pdb 1gj4.ligand.prot.pdb 7kme.ligand.prot.pdb
1bmm.ligand.prot.pdb 1d6w.ligand.prot.pdb 1k21.ligand.prot.pdb 8kme.ligand.prot.pdb
1bmn.ligand.prot.pdb 1d9i.ligand.prot.pdb 1k22.ligand.prot.pdb
1c1u.ligand.prot.pdb 1dwc.ligand.prot.pdb 1o2g.ligand.prot.pdb

renaming atoms:
pdbconv-ligand 1a4w.ligand.prot.pdb 1a4w.ligand.pdb
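
The babel and pdbconv-ligand calls above are shown for one ligand each. A minimal sketch for writing out the same two command lines for several ligands is given below. It only prints the commands and runs nothing; the manual pymol step between them is not covered. The function name ligand_commands and the three-entry ID list are illustrative assumptions; the file name patterns are copied from the two example commands.

```python
import shlex

# Illustrative subset of the PDB IDs from the ligand list; extend as needed.
PDB_IDS = ["1a4w", "1d9i", "1ghy"]


def ligand_commands(pdb_id):
    """Return the babel and pdbconv-ligand command lines for one ligand.

    The argument patterns mirror the two example commands in the notes;
    nothing is executed here.
    """
    babel = ["babel", "-ipdb", f"{pdb_id}.ligand.pdb",
             "-osdf", f"{pdb_id}-babel.sdf"]
    rename = ["pdbconv-ligand", f"{pdb_id}.ligand.prot.pdb",
              f"{pdb_id}.ligand.pdb"]
    return [shlex.join(babel), shlex.join(rename)]


# Dry run: print the commands so they can be reviewed before use.
for pdb_id in PDB_IDS:
    for line in ligand_commands(pdb_id):
        print(line)
```

Printing first makes it easy to review the list before pasting it into a shell.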

5. Minimization

start minimization in /home/henricsn/combine2go/data/thrombin/model1_200206
local_start.sh -sdf /home/henricsn/combine2go/sdf/all_160206_3D.sdf -inrec ./ -inlig /home/henricsn/combine2go/data/thrombin/pdb/ligand/ -model thrombin-model.min.pdb -ligbase .ligand.pdb |tee local.log

problems with the following ligands:

A85/8kme too large
A43/1aix boron not defined in gaff.dat
A69/1tab antechamber problems

A04/1o2g and A17/1o5g clashes between ligand and Lys60F (Lys52)

structures not selected because of a Zn coordinated in the active site:
A59/1c1w
A58/1c1v
A49/1c1u
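
A small bookkeeping sketch for the exclusions above: it removes the problem structures from the 34 prepared ligands and counts what is left. The result is 29, which matches the number of ligands in section 6. That this subtraction is how the 29 were chosen is an assumption; the notes do not say so. 1tab does not appear among the 34 ligand file names, so listing it changes nothing, and the two ligands with Lys60F clashes (1o2g, 1o5g) are not treated as excluded because A04 and A17 appear in section 6.

```python
# The 34 prepared ligands (PDB IDs from the *.ligand.prot.pdb file names).
LIGANDS = """
1a4w 1c1v 1dwd 1o5g
1ae8 1c1w 1fpc 1qbv
1afe 1c4u 1ghv 1ta2
1aix 1c4v 1ghw 1ta6
1bcu 1c4y 1ghy 1tom
1bhx 1c5n 1gj4 7kme
1bmm 1d6w 1k21 8kme
1bmn 1d9i 1k22
1c1u 1dwc 1o2g
""".split()

# Structures reported as problematic or not selected, with the stated reason.
EXCLUDED = {
    "8kme": "too large",
    "1aix": "boron not defined in gaff.dat",
    "1tab": "antechamber problems",
    "1c1w": "Zn in active site",
    "1c1v": "Zn in active site",
    "1c1u": "Zn in active site",
}

# Ligands that remain after removing every excluded ID.
remaining = sorted(set(LIGANDS) - set(EXCLUDED))
# Excluded IDs that never occurred in the ligand list.
not_in_list = sorted(set(EXCLUDED) - set(LIGANDS))

print(len(LIGANDS), "prepared,", len(remaining), "remaining")
print("excluded but not in ligand list:", not_in_list)
```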

6. Interaction energies

29 ligands:
A04.0.min.xyz A18.0.min.xyz A46.0.min.xyz A68.0.min.xyz A86.0.min.xyz A91.0.min.xyz
A09.1.min.xyz A19.0.min.xyz A47.0.min.xyz A72.0.min.xyz A87.0.min.xyz A92.0.min.xyz
A12.1.min.xyz A22.0.min.xyz A48.2.min.xyz A73.0.min.xyz A88.0.min.xyz A93.0.min.xyz
A14.0.min.xyz A44.0.min.xyz A62.0.min.xyz A82.1.min.xyz A89.0.min.xyz A94.0.min.xyz
A17.0.min.xyz A45.0.min.xyz A63.0.min.xyz A84.0.min.xyz A90.0.min.xyz

mkdir anal

cd anal/

start calculation of interaction energies
calc_inter_model.sh -ki thrombin -sdf /home/henricsn/combine2go/sdf/all_160206_3D.sdf -indir ../ -outdir ./ -x |tee calc.log
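
Before analysing the energies, it may be worth confirming that the 29 minimized structures listed in this section are the same 29 objects that GOLPE reports loading in section 7. The sketch below compares the two lists as sets; both are copied by hand from these notes, and the variable names are illustrative.

```python
# Minimized complexes from this section (file name without ".min.xyz").
MINIMIZED = """
A04.0 A18.0 A46.0 A68.0 A86.0 A91.0
A09.1 A19.0 A47.0 A72.0 A87.0 A92.0
A12.1 A22.0 A48.2 A73.0 A88.0 A93.0
A14.0 A44.0 A62.0 A82.1 A89.0 A94.0
A17.0 A45.0 A63.0 A84.0 A90.0
""".split()

# Objects in the order GOLPE lists them under "Loading :" in section 7.
GOLPE_OBJECTS = """
A09.1 A86.0 A22.0 A72.0 A90.0 A87.0 A63.0 A92.0 A94.0 A93.0
A89.0 A91.0 A62.0 A44.0 A12.1 A17.0 A68.0 A73.0 A47.0 A19.0
A84.0 A04.0 A18.0 A45.0 A46.0 A88.0 A14.0 A48.2 A82.1
""".split()

# Entries present in one list but missing from the other.
only_minimized = sorted(set(MINIMIZED) - set(GOLPE_OBJECTS))
only_golpe = sorted(set(GOLPE_OBJECTS) - set(MINIMIZED))

print(len(MINIMIZED), "minimized,", len(GOLPE_OBJECTS), "loaded")
print("only minimized:", only_minimized)
print("only in GOLPE: ", only_golpe)
```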


7. Prediction

Data file : /home/henricsn/combine2go/data/thrombin/model1_200206/golpe/output.dat
Comment : Golpe dat for COMBINE analysis, 02/21/06 16:14:40
Number of variables = 590
Number of experiments = 29
Number of X-variables = 582 , Y-variables = 1
Loading :
1 A09.1
2 A86.0
3 A22.0
4 A72.0
5 A90.0
6 A87.0
7 A63.0
8 A92.0
9 A94.0
10 A93.0
11 A89.0
12 A91.0
13 A62.0
14 A44.0
15 A12.1
16 A17.0
17 A68.0
18 A73.0
19 A47.0
20 A19.0
21 A84.0
22 A04.0
23 A18.0
24 A45.0
25 A46.0
26 A88.0
27 A14.0
28 A48.2
29 A82.1

Active X-variables (SS 1.0E-7) = 228
Active Y-variables (SS 1.0E-7) = 1
Active X-variables after PRETREATMENT = 228
Active Y-variables after PRETREATMENT = 1

Principal Component Analysis (PCA) 29 objects 228 X-var
components XVarExp XAccum
1 51.9133 51.9133
2 9.1910 61.1042
3 7.2524 68.3566
4 4.8495 73.2061
5 4.0591 77.2652
6 3.9408 81.2059
7 3.6106 84.8166
8 2.4545 87.2710
9 2.1938 89.4648
10 1.8242 91.2890

PCA Rank Validation - using 4 random groups
components PRESS Seps R
1 2.0974e+03 2.2718e+03 0.9232
2 1.0979e+03 1.0516e+03 1.0440
3 9.8209e+02 8.1770e+02 1.2010
4 5.3508e+03 6.3855e+02 8.3797
5 2.1962e+03 5.1816e+02 4.2384
6 5.2104e+02 4.2062e+02 1.2388
7 3.3270e+02 3.3201e+02 1.0021
8 4.3021e+02 2.5559e+02 1.6832
9 4.6896e+02 2.0372e+02 2.3020
10 3.5559e+04 1.5990e+02 222.3865
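
The log does not define the R column. Numerically it agrees with PRESS divided by Seps for every row of the table above, which the sketch below checks to a relative tolerance of 1e-3. This is an observation about the printed numbers, not a statement of how GOLPE defines R.

```python
import math

# (components, PRESS, Seps, R) copied from the rank validation table above.
ROWS = [
    (1, 2.0974e+03, 2.2718e+03, 0.9232),
    (2, 1.0979e+03, 1.0516e+03, 1.0440),
    (3, 9.8209e+02, 8.1770e+02, 1.2010),
    (4, 5.3508e+03, 6.3855e+02, 8.3797),
    (5, 2.1962e+03, 5.1816e+02, 4.2384),
    (6, 5.2104e+02, 4.2062e+02, 1.2388),
    (7, 3.3270e+02, 3.3201e+02, 1.0021),
    (8, 4.3021e+02, 2.5559e+02, 1.6832),
    (9, 4.6896e+02, 2.0372e+02, 2.3020),
    (10, 3.5559e+04, 1.5990e+02, 222.3865),
]

# Components whose printed R differs from PRESS / Seps beyond rounding.
mismatches = [
    n for n, press, seps, r in ROWS
    if not math.isclose(press / seps, r, rel_tol=1e-3)
]

print("rows where R != PRESS/Seps:", mismatches)
```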

Partial Least Squares (PLS) 29 objects 228 X-var 1 Y-var
Y1 components XVarExp XAccum SDEC r2
0 0.0000 0.0000 2.5948 0.0000
1 49.3449 49.3449 2.1231 0.3305
2 10.6351 59.9800 1.5373 0.6490
3 4.0066 63.9866 1.2712 0.7600
4 2.3666 66.3532 1.1146 0.8155
5 3.6328 69.9861 1.0090 0.8488
6 4.7149 74.7010 0.9222 0.8737
7 2.8602 77.5612 0.8533 0.8919
8 2.3764 79.9375 0.7809 0.9094
9 2.1442 82.0817 0.6920 0.9289
10 1.4829 83.5647 0.6053 0.9456

PLS Model Validation - LOO
Y1 components SDEP SDEV(sdep) q2
0 2.6875 - -0.0727
1 2.3873 - 0.1536
2 2.1311 - 0.3255
3 2.1921 - 0.2863
4 2.2379 - 0.2562
5 2.5593 - 0.0272
6 2.7710 - -0.1404
7 3.0330 - -0.3662
8 3.3226 - -0.6396
9 3.7573 - -1.0966
10 4.1967 - -1.6158
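
The r2 and q2 columns of the two tables above can be reproduced from SDEC and SDEP, assuming the usual definitions r2 = 1 - (SDEC/SDEC0)^2 and q2 = 1 - (SDEP/SDEC0)^2, where SDEC0 is the SDEC value at 0 components (2.5948). The sketch below checks this against the printed values; the function name explained is illustrative, and the agreement is only within rounding of the four printed decimals.

```python
SDEC0 = 2.5948  # SDEC at 0 components (standard deviation of Y)

# Values copied from the PLS and LOO tables above; index = number of components.
SDEC = [2.5948, 2.1231, 1.5373, 1.2712, 1.1146, 1.0090,
        0.9222, 0.8533, 0.7809, 0.6920, 0.6053]
R2 = [0.0000, 0.3305, 0.6490, 0.7600, 0.8155, 0.8488,
      0.8737, 0.8919, 0.9094, 0.9289, 0.9456]
SDEP = [2.6875, 2.3873, 2.1311, 2.1921, 2.2379, 2.5593,
        2.7710, 3.0330, 3.3226, 3.7573, 4.1967]
Q2 = [-0.0727, 0.1536, 0.3255, 0.2863, 0.2562, 0.0272,
      -0.1404, -0.3662, -0.6396, -1.0966, -1.6158]


def explained(sd, sd0=SDEC0):
    """Fraction of Y variance explained for a given error standard deviation."""
    return 1.0 - (sd / sd0) ** 2


# Largest deviation between recomputed and printed values.
max_r2_err = max(abs(explained(s) - r) for s, r in zip(SDEC, R2))
max_q2_err = max(abs(explained(s) - q) for s, q in zip(SDEP, Q2))

print(f"max |r2 error| = {max_r2_err:.1e}")
print(f"max |q2 error| = {max_q2_err:.1e}")
```

A q2 below zero, as for 0 components and for 6 or more components here, means the cross-validated predictions are worse than predicting the mean of Y.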

Deleted unselected vars. (D-Optimal)
1 : 228 - 114 Comp.=4 , Wed Feb 22 10:14:10 2006
Active X-variables after VAR. SELECT. = 114

*** FFD Variable Selection Started ***
Max. dimensionality : 4
Validation Mode : LOO
Recalculate weights : yes
Comb./Var. ratio : 2.0
Use groups :
Uncertains : Retain
Fold-over design : no
perc. of dummies : 20

Deleted unselected vars. (F.Factorial)
1 : 114 - 100 Comp.=4 , Wed Feb 22 10:14:25 2006
Active X-variables after VAR. SELECT. = 100

Principal Component Analysis (PCA) 29 objects 100 X-var
components XVarExp XAccum
1 41.7468 41.7468
2 13.4479 55.1947
3 12.7690 67.9637
4 8.3512 76.3150
5 6.8785 83.1935

PCA Rank Validation - using 4 random groups
components PRESS Seps R
1 6.4791e+02 8.2382e+02 0.7865
2 1.8660e+03 4.6197e+02 4.0393
3 1.0663e+03 3.4157e+02 3.1219
4 2.9614e+03 2.3443e+02 12.6325
5 1.9648e+02 1.6610e+02 1.1829
6 1.1059e+02 1.1275e+02 0.9808
7 9.3985e+01 8.4796e+01 1.1084

Partial Least Squares (PLS) 29 objects 100 X-var 1 Y-var
Y1 components XVarExp XAccum SDEC r2
0 0.0000 0.0000 2.5948 0.0000
1 40.3089 40.3089 1.7332 0.5539
2 11.5101 51.8190 1.3351 0.7353
3 7.6649 59.4839 1.2325 0.7744
4 9.0552 68.5391 1.1282 0.8110
5 6.0591 74.5981 0.9386 0.8692
6 7.1584 81.7565 0.8343 0.8966
7 3.1069 84.8634 0.6621 0.9349

PLS Model Validation - LOO
Y1 components SDEP SDEV(sdep) q2
0 2.6875 - -0.0727
1 1.9745 - 0.4210
2 1.8121 - 0.5123
3 1.8103 - 0.5133
4 2.0096 - 0.4002
5 2.5135 - 0.0617
6 2.6900 - -0.0747
7 2.7183 - -0.0974

after removing potential outlier: A91

Data file : /home/henricsn/combine2go/data/thrombin/model1_200206/golpe/output-A91.dat

Comment : Extraction from output.dat

Number of variables = 590
Number of experiments = 28

Number of X-variables = 582 , Y-variables = 1
Loading :

1 A09.1
2 A86.0
3 A22.0
4 A72.0
5 A90.0
6 A87.0
7 A63.0
8 A92.0
9 A94.0
10 A93.0
11 A89.0
13 A62.0
14 A44.0
15 A12.1
16 A17.0
17 A68.0
18 A73.0
19 A47.0
20 A19.0
21 A84.0
22 A04.0
23 A18.0
24 A45.0
25 A46.0
26 A88.0
27 A14.0
28 A48.2
29 A82.1

Active X-variables (SS 1.0E-7) = 224
Active Y-variables (SS 1.0E-7) = 1

Active X-variables after PRETREATMENT = 224
Active Y-variables after PRETREATMENT = 1


Principal Component Analysis (PCA) 28 objects 224 X-var

components XVarExp XAccum
1 52.9307 52.9307
2 9.4123 62.3430
3 7.4219 69.7650
4 4.1313 73.8963
5 4.2523 78.1486
6 4.0121 82.1607
7 2.8239 84.9846



PCA Rank Validation - using 4 random groups

components PRESS Seps R
1 2.0118e+03 2.1970e+03 0.9157
2 1.6334e+03 9.9410e+02 1.6431
3 2.2787e+03 7.6341e+02 2.9849
4 5.4931e+03 5.8741e+02 9.3513
5 1.1197e+03 4.8518e+02 2.3079
6 4.6836e+02 3.8781e+02 1.2077
7 3.0735e+02 3.0169e+02 1.0187


Partial Least Squares (PLS) 28 objects 224 X-var 1 Y-var

Y1 components XVarExp XAccum SDEC r2
0 0.0000 0.0000 2.6214 0.0000
1 50.3463 50.3463 2.1645 0.3183
2 10.7028 61.0491 1.5330 0.6580
3 3.4660 64.5151 1.1298 0.8143
4 3.1321 67.6472 1.0038 0.8534
5 5.0592 72.7064 0.9128 0.8788
6 3.0767 75.7831 0.7975 0.9074
7 0.5760 76.3591 0.6653 0.9356


PLS Model Validation - LOO

Y1 components SDEP SDEV(sdep) q2
0 2.7185 - -0.0754
1 2.4436 - 0.1311
2 2.1929 - 0.3002
3 2.0273 - 0.4020
4 1.9244 - 0.4611
5 2.0860 - 0.3668
6 2.3072 - 0.2254
7 2.5249 - 0.0723


Deleted unselected vars. (D-Optimal)
1 : 224 - 112 Comp.=4 , Wed Feb 22 14:34:19 2006
Active X-variables after VAR. SELECT. = 112


*** FFD Variable Selection Started ***
Max. dimensionality : 4
Validation Mode : LOO
Recalculate weights : yes
Comb./Var. ratio : 2.0
Use groups :
Uncertains : Retain
Fold-over design : no
perc. of dummies : 20


Deleted unselected vars. (F.Factorial)
1 : 112 - 100 Comp.=4 , Wed Feb 22 14:34:34 2006
Active X-variables after VAR. SELECT. = 100



Principal Component Analysis (PCA) 28 objects 100 X-var

components XVarExp XAccum
1 45.8754 45.8754
2 9.6223 55.4977
3 9.6057 65.1034
4 6.8805 71.9839
5 6.9018 78.8858
6 4.3504 83.2362
7 3.4082 86.6444



PCA Rank Validation - using 4 random groups

components PRESS Seps R
1 5.4754e+02 7.9577e+02 0.6881
2 1.3422e+04 4.1404e+02 32.4166
3 3.6891e+02 3.2678e+02 1.1289
4 2.3583e+02 2.4557e+02 0.9603
5 1.7135e+02 1.8861e+02 0.9085
6 1.3400e+02 1.3573e+02 0.9872
7 1.0366e+02 1.0269e+02 1.0095


Partial Least Squares (PLS) 28 objects 100 X-var 1 Y-var

Y1 components XVarExp XAccum SDEC r2
0 0.0000 0.0000 2.6214 0.0000
1 44.8369 44.8369 1.7746 0.5417
2 8.2411 53.0780 1.1845 0.7958
3 6.4920 59.5700 1.0813 0.8299
4 3.3068 62.8768 0.9665 0.8641
5 6.4937 69.3705 0.8355 0.8984
6 4.7944 74.1649 0.6508 0.9384
7 5.1959 79.3608 0.5718 0.9524


PLS Model Validation - LOO

Y1 components SDEP SDEV(sdep) q2
0 2.7185 - -0.0754
1 2.0375 - 0.3959
2 1.6646 - 0.5968
3 1.5735 - 0.6397
4 1.9563 - 0.4431
5 2.2310 - 0.2757
6 2.6518 - -0.0233
7 2.5788 - 0.0323
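
To compare the four PLS runs in this section, the sketch below picks, for each run, the number of components with the highest LOO q2. The q2 values are copied from the four "PLS Model Validation - LOO" tables; the run labels are descriptive names chosen here. Choosing the component count by maximum q2 is one common convention and is an assumption, not something stated in the log.

```python
# LOO q2 per run; list index = number of PLS components (0 upwards).
RUNS = {
    "29 objects, 228 X-var": [-0.0727, 0.1536, 0.3255, 0.2863, 0.2562, 0.0272,
                              -0.1404, -0.3662, -0.6396, -1.0966, -1.6158],
    "29 objects, 100 X-var": [-0.0727, 0.4210, 0.5123, 0.5133, 0.4002, 0.0617,
                              -0.0747, -0.0974],
    "28 objects (no A91), 224 X-var": [-0.0754, 0.1311, 0.3002, 0.4020, 0.4611,
                                       0.3668, 0.2254, 0.0723],
    "28 objects (no A91), 100 X-var": [-0.0754, 0.3959, 0.5968, 0.6397, 0.4431,
                                       0.2757, -0.0233, 0.0323],
}


def best_components(q2_values):
    """Return (number of components, q2) for the highest q2 in the list."""
    n = max(range(len(q2_values)), key=lambda i: q2_values[i])
    return n, q2_values[n]


summary = {label: best_components(q2) for label, q2 in RUNS.items()}

for label, (n, q2) in summary.items():
    print(f"{label:32s} best q2 = {q2:.4f} at {n} components")
```

In these numbers both variable selection and removal of A91 raise the best q2, and the combination gives the highest value (0.6397 at 3 components).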
