Running the usage example
Computation options
The directory tree of PIPSA distribution
Timing
References
Nine protein models of PH domains are compared on the basis of their electrostatic potentials. This is a subset of the 104 PH domains analyzed in [1]. All model structures are available elsewhere [2]. PIPSA performs an automated comparison, which clearly shows two clusters: one formed by proteins with a mostly positive potential, the other by proteins with negative potentials. In this case, the classification could be made simply by inspecting electrostatic potential contours. However, PIPSA performs the classification in an automated, objective and quantitative fashion and allows more subtle classification, as found in the full 104-domain dataset [1]. Indeed, for this nine-model example PIPSA shows that the cluster of proteins with positive potentials can be further divided into two subclusters.
1. Unzip and untar the PIPSA distribution (see the directory tree of the distribution described below).
2. Put your pdb files in the directory pdbs/. For the example run, copy the 9 pdb files from the directory pdbs_example/ to the directory pdbs/.
3. Go to the directory source/, run the script do_pipsa and choose the calculation mode (1, 2 or 3).
4. Collect the results (3 files: sims.log, sims.kin and sims.mat) in the corresponding directory (pdbs/, uhbd/ or grid/ for options 1, 2 or 3, respectively).
5. Display the proteins, clustered according to the similarity of their interaction properties, with the kinemage program using the command "mage sims.kin".
6. Cluster the proteins according to the similarity matrix sims.mat using the command nmrclust.
Note. The executables in the distribution were compiled under Unix (SGI IRIX 6.5); see the file Makefile in the directory source/.
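Step 2 of the example run (copying the example structures into pdbs/) can be scripted. The following is only a minimal sketch: the function name stage_example_pdbs is hypothetical, and it assumes that the example structure files carry a .pdb suffix, which the distribution does not state explicitly.

```python
import shutil
from pathlib import Path


def stage_example_pdbs(pipsa_root):
    """Copy the example pdb files from pdbs_example/ into pdbs/ (step 2)."""
    root = Path(pipsa_root)
    src = root / "pdbs_example"
    dst = root / "pdbs"
    dst.mkdir(exist_ok=True)
    copied = []
    # Assumption: structure files end in .pdb; other files found in
    # pdbs_example/ (e.g. earlier qdipsim results) are left where they are.
    for pdb in sorted(src.glob("*.pdb")):
        shutil.copy2(pdb, dst / pdb.name)
        copied.append(pdb.name)
    return copied
```

Called with the top-level directory of the unpacked distribution, the function returns the names of the copied files, so the count can be checked against the 9 models expected for the example.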
Option "1. Analytical estimate" does not require any additional programs to be installed. It computes the electrostatic similarity matrix based on a monopole + dipole representation of the proteins. The results of the similarity analysis are in the files:
pdbs/sims.log - log-file, containing the similarity matrix, created by running qdipsim.
pdbs/sims.kin - kinemage file for presenting proteins as points in 3D;
pdbs/sims.mat - matrix of pairwise distances between proteins, based on their similarity.
Option "2. Similarity of electrostatic potentials, computed using UHBD" may be chosen only if you have the UHBD executable located at source/uhbd and the WHATIF program installed so that it can be executed by typing whatif on the command line. It computes electrostatic potential grids using the UHBD program to solve the finite-difference Poisson-Boltzmann equation (FDPBE), writes the electrostatic potentials to files and computes the similarity matrix based on these potentials. The results of the similarity analysis are in the files:
uhbd/sims.log - log-file, containing the similarity matrix, created by 2potsim_skin.
uhbd/sims.kin - kinemage file for presenting proteins as points in 3D;
uhbd/sims.mat - matrix of pairwise distances between proteins, based on their similarity.
For this example case, you can skip the WHATIF calculations; the necessary pdb files will then simply be copied from the directory whatif_example/ of the PIPSA distribution. The script will prompt you about this possibility.
For the example case you can also skip the UHBD calculations if you download a zip file of the electrostatic potential grid files (8220 KB) and unzip it in the pipsa directory, so that the grid files needed for the similarity calculations are in the directory uhbd_example/. The script will prompt you about this possibility.
Option "3. Similarity of probe interaction fields, computed using the program GRID" may be chosen only if you have the GRID executables grin and grid located at source/grin and source/grid. It computes molecular interaction field grids for small chemical probes (the PO4^2- ion by default) using the GRID program and computes the similarity matrix based on these interaction fields. The results of the similarity analysis are in the files:
grid/sims.log - log-file, containing the similarity matrix, created by 2potsim_skin.
grid/sims.kin - kinemage file for presenting proteins as points in 3D;
grid/sims.mat - matrix of pairwise distances between proteins, based on their similarity.
The files sims.kin are kinemage files to be visualized with MAGE [3] ("mage sims.kin" if your MAGE executable is mage). The files sims.mat may be used as a distance matrix for the program NMRCLUST [4]. To do so, after starting nmrclust, answer "no" to the question "Use a PDB file for input?" and enter sims.mat as the matrix filename.
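The requirements of the three options, and the location of their result files, can be checked before and after a run. The sketch below is an illustration only, not part of PIPSA: the names available_modes and missing_results are hypothetical, and the check for option 2 is deliberately strict (it ignores the possibility, described above, of skipping the WHATIF and UHBD calculations for the example case).

```python
import os
import shutil
from pathlib import Path

# Result directory for each calculation mode, as described in the text.
RESULT_DIRS = {1: "pdbs", 2: "uhbd", 3: "grid"}
RESULT_FILES = ("sims.log", "sims.kin", "sims.mat")


def _is_executable(path):
    """True if path is an existing file with execute permission."""
    return path.is_file() and os.access(path, os.X_OK)


def available_modes(pipsa_root):
    """Return the calculation modes whose external programs are present."""
    source = Path(pipsa_root) / "source"
    modes = [1]  # the analytical estimate needs no additional programs
    # Option 2: UHBD at source/uhbd and whatif reachable on the PATH.
    if _is_executable(source / "uhbd") and shutil.which("whatif"):
        modes.append(2)
    # Option 3: GRID executables at source/grin and source/grid.
    if _is_executable(source / "grin") and _is_executable(source / "grid"):
        modes.append(3)
    return modes


def missing_results(pipsa_root, mode):
    """List the expected result files that are absent for the given mode."""
    out_dir = Path(pipsa_root) / RESULT_DIRS[mode]
    return [f"{RESULT_DIRS[mode]}/{name}" for name in RESULT_FILES
            if not (out_dir / name).is_file()]
```

An empty list from missing_results(root, mode) indicates that sims.log, sims.kin and sims.mat are all present in the directory belonging to that mode.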
The directory tree of PIPSA distribution:
doc/ - documentation;
pdbs/ - the directory to keep the original data: pdb files of (superimposed) proteins;
source/ - the directory with all scripts and programs to use;
pdbs_example/ - the pdb files of the usage example and the results of running qdipsim;
grid_example/ - copy of the directory grid/ obtained after running the PIPSA demo, with grid files removed;
whatif_example/ - copy of the directory whatif/, with pdb files for electrostatic computations, obtained after running the PIPSA demo;
uhbd_example/ - copy of the directory uhbd/ obtained after running the PIPSA demo, with grid files removed.
The following 3 directories will be created by the main script "do_pipsa":
grid/ - the directory to store interaction field grids computed by the GRID program;
whatif/ - the directory to store the pdb files prepared for electrostatic computations;
uhbd/ - the directory to store electrostatic potential grids and perform similarity computations.
In the distribution, these 3 directories and the directory pdbs/ were renamed to the corresponding *_example/ directories after a test run of the script do_pipsa, and all grid files were deleted.
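The same bookkeeping (renaming the run directories to *_example/ and deleting the grid files) can be applied to your own runs. This is a rough sketch under stated assumptions: archive_run and its parameter grid_suffixes are hypothetical, and because the distribution does not say how grid files are named, no file is deleted unless the caller passes the suffixes explicitly.

```python
from pathlib import Path

RUN_DIRS = ("pdbs", "grid", "whatif", "uhbd")


def archive_run(pipsa_root, grid_suffixes=()):
    """Rename run directories to <name>_example, dropping grid files first."""
    root = Path(pipsa_root)
    archived = []
    for name in RUN_DIRS:
        src = root / name
        dst = root / f"{name}_example"
        if not src.is_dir() or dst.exists():
            continue  # nothing to archive, or an archive already exists
        # grid_suffixes is a caller-supplied guess at the grid file
        # suffixes; with the default empty tuple nothing is removed.
        for path in list(src.iterdir()):
            if path.is_file() and path.suffix in grid_suffixes:
                path.unlink()
        src.rename(dst)
        archived.append(dst.name)
    return archived
```

Existing *_example/ directories, such as those shipped with the distribution, are never overwritten; the corresponding run directory is simply left in place.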
Timing for the example case:
1. 42 min to compute n=9 GRID grids of size (m=65)^3
2. 2 min to compute n^2 similarity indices
3. 3 min to compute n=9 electrostatic potential grids of size (m=65)^3
4. 2 min to compute n^2 similarity indices
Steps 1 and 3 scale as ~ n*m^3 and steps 2 and 4 scale as ~ n^2, i.e. expect step 1 to take 420 min and step 2 to take 200 min when doing the same for n=90 pdb files.
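The extrapolation above is plain power-law scaling in n with the grid size m held fixed. A minimal sketch of the arithmetic follows; the function name extrapolate_minutes is hypothetical, and real timings will of course depend on the machine.

```python
def extrapolate_minutes(ref_minutes, n_ref, n_new, power):
    """Scale a measured time by (n_new / n_ref) ** power, with m fixed."""
    return ref_minutes * (n_new / n_ref) ** power


# Step 1 (grid computation) is linear in n: 42 min * 10 = 420 min.
grid_minutes = extrapolate_minutes(42, n_ref=9, n_new=90, power=1)
# Step 2 (similarity indices) is quadratic in n: 2 min * 100 = 200 min.
similarity_minutes = extrapolate_minutes(2, n_ref=9, n_new=90, power=2)
```

The same function applies to steps 3 and 4 by substituting their measured times of 3 min and 2 min.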
[1] N. Blomberg, R. R. Gabdoulline, M. Nilges and R. C. Wade. Classification of protein sequences by homology modeling and quantitative analysis of electrostatic similarity. Proteins: Structure, Function, and Genetics, 37:379-387 (1999).
[2] http://www.EMBL-Heidelberg.DE/~blomberg/PHdomains/pdbfiles/
[3] MAGE: copyright © 1998 by David C. Richardson, Little River Institute, 5820 Old Stony Way, Durham NC 27705; dcr@kinemage.biochem.duke.edu; Biochemistry Dept., Duke University, North Carolina 27710, USA.
[4] Lawrence A. Kelley, Stephen P. Gardner and Michael J. Sutcliffe. An Automated Approach for Clustering an Ensemble of NMR-Derived Protein Structures into Conformationally-Related Subfamilies. Protein Engineering 9, 1063-1065 (1996).
Razif Gabdoulline, January 2000