This Brownian dynamics program is specifically devoted to modeling of generic
protein:protein association.
Treatment of electrostatic interactions is improved by using the potential-derived
charges (computed by the ecm package) and by including charge desolvation
penalties.
Offers a variety of ways to model the encounter complexes.
2. How is the program optimized for speed and
what is paid for that?
Overlaps are disallowed by using the exclusion grid of one of the proteins
and the surface atoms of the other protein for which the surface atoms
are treated as having identical radii. Consequently, rigid proteins are
simulated and the treatment of the surface of the second protein is approximate.
The time step is variable depending on the radial distance between the
centers of 2 proteins. Rotations are accumulated and performed only when
the accumulated rotation exceeds a given value (courtesy of S.H.Northrup).
These approximations have no effect on the results, although they speed
up calculations considerably, so long as the parameters defining the time
step and rotations are reasonable. The criterion for "reasonability"
of the time step is a resolution of the model and the distance to the boundaries,
namely, small rotations resulting in less than 0.5 Å shift of atoms
may be accumulated and larger moves may be done, when the proteins are
far from the overlap and c-surface.
The standard version computes interaction forces using charges on one protein
and the potential grid of the other. The drawback is that the executable
needs to store electrostatic potential grids, making the program size ~
50MB at 150^3 grid.
When the electrostatic potential is stored on a 1 Å grid, the force
computations are 1.5 times faster than the simulations with electrostatic
potentials stored on the grid with another spacing. This spacing, however,
may be too large for a highly variable potential.
The program has 2 main computational bottlenecks: checks for exclusions
and force computations. The third bottleneck appeares with icomm=3,
when the check for reaction criteria may become the most expensive operation.
3. Why is the standard executable that big
?
The potentials are stored as real*8 grids, which has an advantage over
real*4 on SGI computers running on IRIX6. You may consider using real*4
to decrease the size. Also think of the possibility of using the option
iforce=2.
4. How do I know what size the potential
grid should be ?
The size of the potential grid cannot be larger than allowed by present
day computers. For example 190^3 is about the limit while using UHBD on
an SGI Power Challenge with 2 GB RAM. On one hand, the extension of the
grid should be large enough to ensure the interactions at distances further
than the extension of the grid may safely be neglected. On the other
hand, the spacing of the grid must be sufficient to ensure adequate accuracy
in computation of the forces. At higher ionic strength conditions,
it is recommended to use a smaller grid spacing keeping the same dimension,
thus decreasing the extension of the grid. As an example, favourable interactions
at surface-to-surface distances larger than 30 Å in a solvent of
50 mM ionic strength (corresponding Debye length - 14 Å) account
for about 25% of the association rate of barnase and barstar.
5. While modeling the association at different
ionic strengths by sda, I have got oscillating ionic strength dependence
of association rates...
Make sure the hydrogen atoms are treated consistently in electrostatic
computations and Brownian dynamics simulations. First, the hydrogens need
to be treated in Brownian dynamics program as they were treated in electrostatic
computations. Second, the probe radius for protein 2 needs to be larger
than the ion radius used in electrostatic computations. Inconsistency is
found to produce unexpected ionic strength dependence when the ion inpenetrable
region (as treated in the electrostatic computations) appears to be penetrable
by protein charges, thus changing the (smooth) Debye-Hueckel treatment
of interactions.
6. How do I check if the probe radius to represent
all atoms of the second protein is adequate ?
Run the reverse simulations interchanging the 2 proteins (i.e. 1 protein
treated as protein 2 and vice-versa). The difference between the results
of the direct and the reverse simulations is a measure of the effect of
this probe approximation.
7. The computed rates seem too high, the contact
distance dependence shows a saturation tendency at 3-4 Å...
Check (after the standard consistency checks) the adequacy of the minimal
time step dt1. Then think of anything else.
8. Different treatments of hydrogens give different
results....
The treatment of hydrogens DOES influence the interaction strength, and
association rates as a consequence. What you should do is to think which
treatment gives adequate treatment of the interactions.
9. Different treatment of some side chain conformations
give different results...
See above.
10. How do I use option iforce=2?
Most important is to decrease the number of charges while trying to keep
the description of interactions accurate. For that, run ecm
several times with different sets of charge sites to figure out the smallest
set of charges that gives sufficient accuracy of fitting of the potentials.
The charges having small "Norm of Coulomb-Debye grid" can be eliminated
from the set influencing at least the fitting accuracy. Unless the number
of charges is less than 10-20 this option may be incredibly slow. For example,
the results of simulations of barnase and barstar (having 47 and 45 charges
to represent their potentials) using option iforce=1 may
be essentially reproduced using an option iforce=2 and
30 and 28 charges (respectively). The second simulations are however 6
times longer. Grid-charge computations need ~20*(nq1+nq2) (or 1.5 times
more when variable grid spacing routines are used) multiplications to evaluate
the forces. Charge-charge computations need ~15*nq1*nq2 multiplications,
making these computations as efficient as the grid-charge treatment only
when the number of charges of the proteins satisfies nq1~nq2<3-4.
11. What are the different executables for
?
2 excutables sda and sdaa are for
running on SGI IRIX and DEC ALPHA, respectively. The difference in sources
is reading of binary and ascii (respectively) grid files, since on the
ALPHA station, I could not manage to read in the UHBD binary grid.
sdabf is a separate program to sample different positions
of the second grid around the first and output averaged Boltzmann factors
of electrostatic interaction free energies at every definition of the encounter
complex. It also writes 2 grids with the values of the averaged (upon
different orientations, by Boltzmann factor) electrostatic interaction
energy and the values of the average Boltzmann factor at each point of
the space, where the center of the second protein was placed during sampling.
12. I got a strange exclusion grid profile
in the filefort.77.
If the grid is not centred, then the center is wrongly specified in input
file. You may choose not to specify the center at all by setting variable
ic1=0,
but if the centers have come from grid calculations, it is worth checking
that the pdb file you use and the pdb-file used by UHBD are related to
the same position (of the same protein).
If you wanted to use an atom with large radius and modified the routine
specifying the atomic radii, take care that you also changed the variable
armax
in subroutine mkex1par.
13. How long simulations need I run ?
The relative error of the rate evaluation is ~ M-1/2, where
M is the number of reaction events defining the rate. 100 rare events would
be detected with ~ 10% accuracy. If the expected fraction of events is
x
(exactly the number you want to know) and the relative error you would
accept is r, then you need in ~r-2/x
runs,
among which r-2will be reactive.
The fraction of events depends on the value of b (defining
the radius of the sphere where all the trajectories begin), D
- relative diffusion constant of proteins and the expected reaction rate
kexp
(where exp stands for "expected", may be related to "experimental")
just being equal x=kexp/4
b D.
14. Are there any cutoff artefacts ?
The interaction force is set to zero when the charge falls outside the
extent of the grid. The effect of truncation of interactions diminishs
as the extent of the potential grids is increased. For barnase:barstar
interactions, for example, increasing the extent of the grid from 54 A
to 74 A increased computed association rates by some 20% when 50 mM salt
conditions were modeled. The effect is smaller when the salt concentration
is larger.
15. All 3 reaction criteria options appear
to deal with specific contacts. What could I do to study non-specific contact
formation ?
It is possible to define non-specific contacts by placing all possible
pairwise contacts in rxna files, and using option icomm=2.
In order to monitor a reasonably small number of pair contacts, preselection
of contacts to be included may be necessary.
16. How can I watch the Brownian motion of
proteins simulated by the program ?
The visualization is done in 4 steps.
First choose the trajectory to be visualized from the normal run. This
can be done by looking at fort.66, where a long trajectory indicates that
the proteins are interacting strongly, although not necessarily at the
right place. Or it can be done by running sda with nprint=1
and tracking the trajectory after which the recorded reaction rate (beta)
for the close contacts changes - this means that reaction took place at
the end of this trajectory. In the output file, the beta-s are recorded
sequentially in one record starting with the trajectory number for each
contact distance, and sequentially for 1,2,3,4 contacts.
The second step is to run sda which will record the indicated
trajectory. For this, the parameter iwrite in the input
file should be set to the number of the trajectory to be written into file
fort.55.
The third step is to use the program his2pdbs.f from the aux/
directory, which reads this fort.55 file and writes a sequence of pdb files.
It reads 3 lines from standard input: the name of the pdb file with the
coordinates of the second protein; the name of the history file (fort.55);
and the first letter (say L) to appear in the name of the pdb files of
form SXXXX.pdb.
The fourth step depends on the visualization software you have. For
QUANTA it may be useful to convert this set of pdb files to DCD format.
To make movies, I loaded a set of pdb files into InsightII sequentially,
making snapshots, and then converting the set of snapshots into a movie
file - this needs to be done with an InsightII script, since there are
several hundred files to be processed.
An example of this procedure is given n sda distribution.
R. Gabdoulline, R.C. Wade, February, 2000; March 2005PrivacyImprint