SDA, Frequently asked questions

sda FAQ

1. What is new in this program ?

This Brownian dynamics program is specifically devoted to modeling of generic protein:protein association.
Treatment of electrostatic interactions is improved by using the potential-derived charges (computed by the ecm package) and by including charge desolvation penalties.
Offers a variety of ways to model the encounter complexes.

2. How is the program optimized for speed and what is paid for that?

Overlaps are disallowed by using the exclusion grid of one of the proteins and the surface atoms of the other protein for which the surface atoms are treated as having identical radii. Consequently, rigid proteins are simulated and the treatment of the surface of the second protein is approximate.
The time step is variable depending on the radial distance between the centers of 2 proteins. Rotations are accumulated and performed only when the accumulated rotation exceeds a given value (courtesy of S.H.Northrup). These approximations have no effect on the results, although they speed up calculations considerably, so long as the parameters defining the time step and rotations are reasonable. The criterion for "reasonability" of the time step is a resolution of the model and the distance to the boundaries, namely, small rotations resulting in less than 0.5 Å shift of atoms may be accumulated and larger moves may be done, when the proteins are far from the overlap and c-surface.
The standard version computes interaction forces using charges on one protein and the potential grid of the other. The drawback is that the executable needs to store electrostatic potential grids, making the program size ~ 50MB at 150^3 grid.
When the electrostatic potential is stored on a 1 Å grid, the force computations are 1.5 times faster than the simulations with electrostatic potentials stored on the grid with another spacing. This spacing, however, may be too large for a highly variable potential.
The program has 2 main computational bottlenecks: checks for exclusions and force computations. The third bottleneck appeares with icomm=3, when the check for reaction criteria may become the most expensive operation.

3. Why is the standard executable that big ?

The potentials are stored as real*8 grids, which has an advantage over real*4 on SGI computers running on IRIX6. You may consider using real*4 to decrease the size. Also think of the possibility of using the option iforce=2.

4. How do I know what size the potential grid should be ?

The size of the potential grid cannot be larger than allowed by present day computers. For example 190^3 is about the limit while using UHBD on an SGI Power Challenge with 2 GB RAM. On one hand, the extension of the grid should be large enough to ensure the interactions at distances further than the extension of the grid may safely be neglected. On the other hand, the spacing of the grid must be sufficient to ensure adequate accuracy in computation of the forces. At higher ionic strength conditions, it is recommended to use a smaller grid spacing keeping the same dimension, thus decreasing the extension of the grid. As an example, favourable interactions at surface-to-surface distances larger than 30 Å in a solvent of 50 mM ionic strength (corresponding Debye length - 14 Å) account for about 25% of the association rate of barnase and barstar.

5. While modeling the association at different ionic strengths by sda, I have got oscillating ionic strength dependence of association rates...

Make sure the hydrogen atoms are treated consistently in electrostatic computations and Brownian dynamics simulations. First, the hydrogens need to be treated in Brownian dynamics program as they were treated in electrostatic computations. Second, the probe radius for protein 2 needs to be larger than the ion radius used in electrostatic computations. Inconsistency is found to produce unexpected ionic strength dependence when the ion inpenetrable region (as treated in the electrostatic computations) appears to be penetrable by protein charges, thus changing the (smooth) Debye-Hueckel treatment of interactions.

6. How do I check if the probe radius to represent all atoms of the second protein is adequate ?

Run the reverse simulations interchanging the 2 proteins (i.e. 1 protein treated as protein 2 and vice-versa). The difference between the results of the direct and the reverse simulations is a measure of the effect of this probe approximation.

7. The computed rates seem too high, the contact distance dependence shows a saturation tendency at 3-4 Å...

Check (after the standard consistency checks) the adequacy of the minimal time step dt1. Then think of anything else.

8. Different treatments of hydrogens give different results....

The treatment of hydrogens DOES influence the interaction strength, and association rates as a consequence. What you should do is to think which treatment gives adequate treatment of the interactions.

9. Different treatment of some side chain conformations give different results...

See above.

10. How do I use option iforce=2?

Most important is to decrease the number of charges while trying to keep the description of interactions accurate. For that, run ecm several times with different sets of charge sites to figure out the smallest set of charges that gives sufficient accuracy of fitting of the potentials. The charges having small "Norm of Coulomb-Debye grid" can be eliminated from the set influencing at least the fitting accuracy. Unless the number of charges is less than 10-20 this option may be incredibly slow. For example, the results of simulations of barnase and barstar (having 47 and 45 charges to represent their potentials) using option iforce=1 may be essentially reproduced using an option iforce=2 and 30 and 28 charges (respectively). The second simulations are however 6 times longer. Grid-charge computations need ~20*(nq1+nq2) (or 1.5 times more when variable grid spacing routines are used) multiplications to evaluate the forces. Charge-charge computations need ~15*nq1*nq2 multiplications, making these computations as efficient as the grid-charge treatment only when the number of charges of the proteins satisfies nq1~nq2<3-4.

11. What are the different executables for ?

2 excutables sda and sdaa are for running on SGI IRIX and DEC ALPHA, respectively. The difference in sources is reading of binary and ascii (respectively) grid files, since on the ALPHA station, I could not manage to read in the UHBD binary grid.
sdabf is a separate program to sample different positions of the second grid around the first and output averaged Boltzmann factors of electrostatic interaction free energies at every definition of the encounter complex. It also writes 2 grids with the values of the averaged (upon different orientations, by Boltzmann factor) electrostatic interaction energy and the values of the average Boltzmann factor at each point of the space, where the center of the second protein was placed during sampling.

12. I got a strange exclusion grid profile in the file fort.77.

If the grid is not centred, then the center is wrongly specified in input file. You may choose not to specify the center at all by setting variable ic1=0, but if the centers have come from grid calculations, it is worth checking that the pdb file you use and the pdb-file used by UHBD are related to the same position (of the same protein).
If you wanted to use an atom with large radius and modified the routine specifying the atomic radii, take care that you also changed the variable armax in subroutine mkex1par.

13. How long simulations need I run ?

The relative error of the rate evaluation is ~ M^-1/2, where M is the number of reaction events defining the rate. 100 rare events would be detected with ~ 10% accuracy. If the expected fraction of events is x (exactly the number you want to know) and the relative error you would accept is r, then you need in ~r^-2/x runs, among which r^-2 will be reactive. The fraction of events depends on the value of b (defining the radius of the sphere where all the trajectories begin), D - relative diffusion constant of proteins and the expected reaction rate k_exp (where exp stands for "expected", may be related to "experimental") just being equal x=k_exp/4 b D.

14. Are there any cutoff artefacts ?

The interaction force is set to zero when the charge falls outside the extent of the grid. The effect of truncation of interactions diminishs as the extent of the potential grids is increased. For barnase:barstar interactions, for example, increasing the extent of the grid from 54 A to 74 A increased computed association rates by some 20% when 50 mM salt conditions were modeled. The effect is smaller when the salt concentration is larger.

15. All 3 reaction criteria options appear to deal with specific contacts. What could I do to study non-specific contact formation ?

It is possible to define non-specific contacts by placing all possible pairwise contacts in rxna files, and using option icomm=2. In order to monitor a reasonably small number of pair contacts, preselection of contacts to be included may be necessary.

16. How can I watch the Brownian motion of proteins simulated by the program ?

The visualization is done in 4 steps.

First choose the trajectory to be visualized from the normal run. This can be done by looking at fort.66, where a long trajectory indicates that the proteins are interacting strongly, although not necessarily at the right place. Or it can be done by running sda with nprint=1 and tracking the trajectory after which the recorded reaction rate (beta) for the close contacts changes - this means that reaction took place at the end of this trajectory. In the output file, the beta-s are recorded sequentially in one record starting with the trajectory number for each contact distance, and sequentially for 1,2,3,4 contacts.

The second step is to run sda which will record the indicated trajectory. For this, the parameter iwrite in the input file should be set to the number of the trajectory to be written into file fort.55.

The third step is to use the program his2pdbs.f from the aux/ directory, which reads this fort.55 file and writes a sequence of pdb files. It reads 3 lines from standard input: the name of the pdb file with the coordinates of the second protein; the name of the history file (fort.55); and the first letter (say L) to appear in the name of the pdb files of form SXXXX.pdb.

The fourth step depends on the visualization software you have. For QUANTA it may be useful to convert this set of pdb files to DCD format. To make movies, I loaded a set of pdb files into InsightII sequentially, making snapshots, and then converting the set of snapshots into a movie file - this needs to be done with an InsightII script, since there are several hundred files to be processed.

An example of this procedure is given n sda distribution.

R. Gabdoulline,
R.C. Wade,
February, 2000; March 2005 Privacy Imprint