analytically defined (molecular) surfaces

Introduction

  • Surfaces
  • Distance to the surface
  • Operations on the surface
  • Projections of molecular surfaces and interfaces
  • Accuracy of area values
  • The program

    Surfaces

    The surface of a molecule bears information about how it interacts with other molecules and its solvent. As the surface of a molecule is not a quantity for which a unique physico-chemical definition exists, several definitions have been introduced. Those most commonly used are:

  • The probe or solvent accessible surface, i.e. the boundary of the region where the centres of the atoms of other molecules with the radius of the probe are allowed to be placed (see Lee and Richards). The solvent accessible areas and volumes of molecules may be used to quantify their interaction with the solvent (see Eisenberg and McLachlan).
  • The probe excluded molecular surface, which is drawn on the surface of the probe as it is rolled over the molecule (Richards). It highlights the region belonging to the molecule alone; namely, no part of the probe is allowed to be inside the molecular surface. It may be calculated by construction of a dot surface (Richards) or analytically (Connolly).
  • The van der Waals surface of the molecule, which depends only on the atomic radii and coordinates of the atoms in the molecule. It is equivalent to a molecular surface computed with a probe of zero radius.

    The analytical molecular surface is smoother than the van der Waals surface and this facilitates its analysis. However, its primary importance is not its smoothness but that it gives a different description of the molecular interior from the van der Waals surface, since the cavities inside the molecule which are not accessible to the solvent probe appear to belong to the molecule's interior.

    The properties of a molecule are usually displayed on its surface by assigning values to points on the surface, all but a minority of which lie on the surface of a single atom and thus possess the properties of that atom. The points can be displayed on a graphical device or used for computational analysis. The analysis of surface properties is usually based on visual inspection, for which points are coloured or assigned sizes corresponding to the value of a property. Solid rendering, as exemplified by the GRASP program of Nicholls, can be used to enhance visual quality. There are several algorithms allowing automation of surface analysis. We mention two by way of example:
    a) The fully automated detection of clusters of surface points with like properties in order to specify hydrophobic patches on protein surfaces by Lijnzaad et al.
    b) The location of knobs and holes on a protein surface using a geometric hashing algorithm applied to coordinates derived from a dot representation of the molecular surface by Fischer et al.

    A reasonably accurate representation of a molecular surface requires 10-30 points per Å², generating 100,000-300,000 points per surface for medium-sized proteins. While the scanning of all surface points is fast, considerable computational effort is necessary to establish the connectivity (neighbourhood) of the points. Operations with dot surfaces can, however, be made extremely efficient by avoiding time-consuming distance checks (see Eisenhaber et al.).

    An alternative description of a molecular surface, which is analogous to the van der Waals surface, may be derived from a Gaussian description of the molecule (Duncan and Olson). It is defined using the approximate electron density distribution:
    ,
    representing the contribution from each atom i of the molecule with different weighting factors and taking into account the different sizes of the atoms. This form of representation is rather arbitrary and exponential functions such as:
    ,
    would give a more correct description of the asymptotic behavior of the electron density. An advantage of the Gaussian surface description is that it can describe the fuzziness of molecular surfaces due to high-frequency atomic vibrations (see Agishtein).

    The volumetric properties of the Gaussian molecular representation have been compared to those of other surfaces by Duncan and Olson and by Grant and Pickup. They found that a Gaussian representation could reproduce area and volume quantities computed using a hard-sphere model. The Gaussian representation is exploited to locate specific regions on molecular surfaces in the SURFNET program of Laskowski. Physically, the Gaussian representation provides a much more natural and realistic description of the shape of molecules than the atomic hard-sphere representation. Another advantage is its continuity, which allows straightforward analytical estimation of the derivatives of surface-dependent properties, in contrast to the computational difficulties of dealing with the discontinuous hard-sphere representation.

    In this study, we use a related, exponential, density function to construct molecular surfaces. It differs from previous uses of Gaussian functions in the following respects:
    a) The function is constructed to define molecular surfaces rather than the molecular interior. We intentionally study the surfaces directly, since quantization and numerical operations on the 2D surface require considerably fewer operations and memory than operations on the 3D density function itself.
    b) Surfaces similar to the solvent accessible surface of the molecule can be derived as well as those corresponding to its van der Waals surface. The solvent accessible surface is easier to treat analytically because of its relative simplicity.
    c) The parameters of the density function are assigned to define the degree of smoothing of a hard-sphere surface. These parameters are related to electron density parameters to the extent that the hard-sphere model reflects the electron density.

    The starting point for this study is the analytical definition of the surface approximating the solvent accessible or van der Waals surface. The surface is defined implicitly using the functional of the distance to it (simply, distance=0). Using this functional, one can make a numerically fast and stable projection to the surface, on which one can place a set of approximately equally spaced points with obvious connectivity. The latter may be used to build a grid of quasi-curvilinear coordinates on the surface, which allows a part of the surface or a molecular interface to be mapped to a flat rectangle. The mapping allows the projection of the surface from 3D to 2D with conservation of neighbourhood and reasonably small distortions. Although the mapping algorithm used may give rise to overlaps when the distance conservation requirement is strong, overlaps decrease as this requirement is weakened. The mapping algorithm is particularly suitable for analyzing the solvent accessible surface of a molecule, and can be used for the van der Waals surface, although interpretation is more difficult as overlap is more of a problem. Interfaces between molecules can be mapped reasonably in the majority of cases.


    Distance to the surface

    Let us define an exponential parametric function g(r,A;d) associated with atom A, whose center is positioned at $\mathbf{r}_A$ and which has a radius $R_A$. The parametric function depends on the coordinate r and an adjustable parameter d:

    $$g(\mathbf{r},A;d) = \exp\left(-\frac{|\mathbf{r}-\mathbf{r}_A| - R_A}{d}\right).$$

    While other functional forms may be equally appropriate, we will only consider the above exponential function. A simple manipulation gives the exact value of the distance to the surface of the atom A, $l(\mathbf{r},A)$:

    $$l(\mathbf{r},A) = -d \ln g(\mathbf{r},A;d) = |\mathbf{r}-\mathbf{r}_A| - R_A,$$

    which is independent of the parameter d. Assignment of zero atom radius would give the distance to the atom center.

    Now, let us consider a molecule M comprised of atoms $A_i$, i=1,2,...,N, with radii $R_i$ and coordinates $\mathbf{r}_i$. One can define the exponential parametric function for a molecule as a sum of those for every atom:

    $$G(\mathbf{r},M;d) = \sum_{i=1}^{N} g(\mathbf{r},A_i;d). \qquad (1a)$$

    Then, the distance functional is:

    $$L(\mathbf{r},M;d) = -d \ln G(\mathbf{r},M;d) = -d \ln \sum_{i=1}^{N} \exp\left(-\frac{|\mathbf{r}-\mathbf{r}_i| - R_i}{d}\right). \qquad (1b)$$

    This definition is formally equivalent to an isocontour value of the sum in (1a), since a zero distance corresponds to G(r,M;d)=1, but is easier to handle numerically.
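    The distance functional can be sketched in a few lines. This is an illustrative sketch rather than the ADS implementation: it assumes the per-atom function has the form exp(-(|r - r_i| - R_i)/d), and the names atom_distance and molecule_distance, the (center, radius) tuple layout and the example coordinates are hypothetical.

```python
import math

def atom_distance(r, center, radius):
    """Signed distance from point r to the sphere surface of a single atom."""
    return math.dist(r, center) - radius

def molecule_distance(r, atoms, d):
    """Distance functional -d * ln(sum_i exp(-l_i / d)) over all atoms.

    atoms: list of (center, radius) pairs (assumed layout); d > 0 in Angstrom.
    """
    ls = [atom_distance(r, c, R) for c, R in atoms]
    l_min = min(ls)
    # Factor out the closest atom so that the exponentials cannot overflow.
    s = sum(math.exp(-(l - l_min) / d) for l in ls)
    return l_min - d * math.log(s)

# Two hypothetical atoms 1.5 Angstrom apart, each with radius 1.7 Angstrom.
atoms = [((0.0, 0.0, 0.0), 1.7), ((1.5, 0.0, 0.0), 1.7)]
point = (0.0, 3.0, 0.0)
single = molecule_distance(point, atoms[:1], d=0.5)  # one atom only
both = molecule_distance(point, atoms, d=0.5)        # both atoms contribute
sharp = molecule_distance(point, atoms, d=0.01)      # small d: closest atom dominates
```

    Factoring out the smallest per-atom distance before exponentiating is a numerical convenience chosen here; it does not change the value of the functional.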

    At any point, r, the major contribution to the sum in (1a) will be from the closest atom. Thus, the distance, $L(\mathbf{r},M;d)$, derived from G(r,M;d) will largely reflect the distance to the closest atom (as the distance to a collection of atoms should do according to molecular surface ideology). Variation of the parameter d changes the relative contribution of the close atoms to the sum. As d is decreased, the number of contributing atoms decreases and, in the limit d=0, only the closest atom contributes.

    Depending on which radii are ascribed to the atoms, one obtains the distances to different surfaces: a) if all $R_i$ are atomic van der Waals radii, then $L(\mathbf{r},M;d)=0$ defines an approximation to the van der Waals surface; b) if all $R_i$ are van der Waals radii of atoms incremented by a solvent probe radius, then the solvent accessible surface is approximated.

    The choice of the value of the parameter d should be related to the characteristic interatomic distances. At any given point on the surface, the relative contributions to the sum in equation (1b) arising from the closest atom and the next closest atom can be compared. Considering a typical interatomic distance of 1.5 Å between these atoms, the contribution to the sum from the second atom will be 0.05 and 0.22 times that of the closest atom for values of d of 0.5 and 1.0 Å, respectively. Consequently, considering only the two closest atoms, the $L(\mathbf{r},M;d)=0$ surface will follow the van der Waals or solvent accessible surface with a deviation of only 0.025 Å at d=0.5 Å or 0.2 Å at d=1.0 Å. If the geometrical arrangement of the atoms differs from that described above, or more atoms are considered, the distortion may be much greater. Figures 01-04 illustrate the dependence of the computed surfaces on the parameter d for human growth hormone.
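    The quoted ratios and deviations can be checked with a short calculation. The sketch below assumes the exponential form exp(-(distance to atom surface)/d) for each atom and takes the second atom's surface to be 1.5 Å farther from the point than that of the closest atom; the function names are hypothetical.

```python
import math

def second_atom_ratio(separation, d):
    """Contribution of the second atom relative to the closest one."""
    return math.exp(-separation / d)

def surface_shift(separation, d):
    """Outward shift of the zero-distance surface caused by the second atom.

    With two atoms the functional is l1 - d * ln(1 + ratio), so the surface
    moves away from the closest atom by d * ln(1 + ratio).
    """
    return d * math.log(1.0 + second_atom_ratio(separation, d))

for d in (0.5, 1.0):
    ratio = second_atom_ratio(1.5, d)
    shift = surface_shift(1.5, d)
    print(f"d = {d} A: ratio = {ratio:.3f}, shift = {shift:.3f} A")
```

    Under these assumptions the ratios come out close to 0.05 and 0.22, and the shifts close to 0.025 Å and 0.2 Å, in line with the figures given in the text.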

    Figure 01: Contours on the solvent accessible surface of human growth hormone (hGH) computed with d=1.0 Å. There are 6911 points at the surface. The solvent accessible surface area, computed from the hard-sphere representation, is 9785 Å².
    Figure 02: Contours on the solvent accessible surface of human growth hormone (hGH) computed with d=0.5 Å. There are 7497 points at the surface.
    Figure 03: Planar section through the same molecule, hGH, showing contours corresponding to van der Waals surfaces at d=1.0, 0.50 and 0.25 Å. The contour with d=0.5 Å shows essentially all the details of the molecular surface except the cavities within the molecule. At d<0.25 Å the contours are not of closed form, making continuous tracing of the surface impossible.
    Figure 04: Contours on the same cross-section plane as in Figure 03 for solvent accessible surfaces computed with four values of d. Note that surfaces computed with d<0.25 Å are almost indistinguishable.

    The surface of molecule M is thus defined by a zero value of the distance functional (1b), $L(\mathbf{r},M;d)=0$. To define the interface between two molecules, M1 and M2, the definition of the distance to the molecular surface given in (1b) can be used in:

    $$L(\mathbf{r},M_1;d) = L(\mathbf{r},M_2;d),$$

    where $L(\mathbf{r},M_1;d)$ and $L(\mathbf{r},M_2;d)$ are the distances to the surfaces of the first and second molecules, respectively. The parameters in the definition of the distance functional (1b) can be modified to obtain different interfaces: one defined as equidistant to the molecular surfaces, using atomic radii, or one defined as equidistant to the closest atoms, by setting all radii to zero.
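    As an illustration, the interface condition can be written as the zero set of the difference between the two distance functionals. The sketch below assumes the exponential form exp(-(|r - r_i| - R_i)/d) for each atom; interface_function and the single-atom "molecules" are hypothetical.

```python
import math

def molecule_distance(r, atoms, d):
    """Distance functional -d * ln(sum_i exp(-(|r - r_i| - R_i) / d))."""
    ls = [math.dist(r, center) - radius for center, radius in atoms]
    l_min = min(ls)
    s = sum(math.exp(-(l - l_min) / d) for l in ls)
    return l_min - d * math.log(s)

def interface_function(r, mol1, mol2, d):
    """Zero on the interface, negative nearer mol1, positive nearer mol2."""
    return molecule_distance(r, mol1, d) - molecule_distance(r, mol2, d)

# Two hypothetical one-atom "molecules" placed symmetrically about x = 0.
mol1 = [((-2.0, 0.0, 0.0), 1.5)]
mol2 = [((2.0, 0.0, 0.0), 1.5)]
on_plane = interface_function((0.0, 1.0, 0.0), mol1, mol2, d=0.5)
near_mol1 = interface_function((-1.0, 0.0, 0.0), mol1, mol2, d=0.5)
```

    Passing radii of zero for every atom gives the variant of the interface that is equidistant to the closest atom centers rather than to the atomic surfaces.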


    Operations on the surface

    The performance of some of the operations described below depends on the value assigned to the parameter d, which defines the closeness of the computed surface to a hard-sphere surface. The first two operations can be performed at any d, whereas mapping of the surface can usually (for proteins) be done only at d > 0.5 Å, when the surface is smooth enough to keep the mapping operations stable. Improvement of the algorithms may, however, enable mapping of the more complicated surfaces obtained at smaller d.

    a. Placing a point on the surface

    The surface is defined implicitly and is smooth. The basic operation for moving from any given point to the point on the surface is projection to the surface along the gradient of the distance function (1b). Projection is performed iteratively with a step size equal to the distance to the surface.
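    A minimal sketch of this projection is given below. It assumes the exponential form exp(-(|r - r_i| - R_i)/d) for each atom, for which the gradient of the distance functional is the weighted average of the unit vectors pointing away from the atom centers. The function names, tolerance and iteration limit are hypothetical choices, not those of the ADS program.

```python
import math

def distance_and_gradient(r, atoms, d):
    """Return the distance functional and its gradient at point r.

    Assumes r does not coincide with any atom center.
    """
    ls, units = [], []
    for center, radius in atoms:
        diff = [a - b for a, b in zip(r, center)]
        dist = math.sqrt(sum(x * x for x in diff))
        ls.append(dist - radius)
        units.append([x / dist for x in diff])
    l_min = min(ls)
    weights = [math.exp(-(l - l_min) / d) for l in ls]
    total = sum(weights)
    value = l_min - d * math.log(total)
    grad = [sum(w * u[k] for w, u in zip(weights, units)) / total
            for k in range(3)]
    return value, grad

def project_to_surface(r, atoms, d, tol=1e-6, max_iter=50):
    """Move r along the gradient direction until the distance is ~0."""
    r = list(r)
    for _ in range(max_iter):
        value, grad = distance_and_gradient(r, atoms, d)
        if abs(value) < tol:
            break
        # Assumes a non-vanishing gradient (not a symmetric saddle point).
        norm = math.sqrt(sum(g * g for g in grad))
        # Step length equals the current distance to the surface.
        r = [x - value * g / norm for x, g in zip(r, grad)]
    return r

atoms = [((0.0, 0.0, 0.0), 1.7), ((1.5, 0.0, 0.0), 1.7)]
on_surface = project_to_surface((0.3, 4.0, 0.2), atoms, d=0.5)
```

    For a single atom the first step already lands on the sphere; with several atoms the step slightly undershoots because the gradient norm is below one, so a few iterations are needed.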

    b. Motions along the surface

    Motions along the surface from any point can be done using the surface tangents. The choice of the functional (1b) makes the higher order derivatives small, so the tangent movements can be corrected by a few iterations (typically 1-2, when the deviations from the surface are less than 1 Å), when the step size of the tangent motion is within 1-3 Å. Consequently, a move along the surface costs several computations of sums like that in (1b) (the functional itself and its 3 first derivatives are computed at each iteration). Computation of the sum in (1b) scales linearly with the number of atoms, so the same number of motions will cost only twice as much when the molecule size is doubled.
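    A tangent move followed by a corrective projection might look as follows. This is a sketch under the same assumption as before (per-atom terms of the form exp(-(|r - r_i| - R_i)/d)); tangent_step and its arguments are hypothetical names.

```python
import math

def distance_and_gradient(r, atoms, d):
    """Distance functional and its gradient (r must not be an atom center)."""
    ls, units = [], []
    for center, radius in atoms:
        diff = [a - b for a, b in zip(r, center)]
        dist = math.sqrt(sum(x * x for x in diff))
        ls.append(dist - radius)
        units.append([x / dist for x in diff])
    l_min = min(ls)
    w = [math.exp(-(l - l_min) / d) for l in ls]
    total = sum(w)
    grad = [sum(wi * u[k] for wi, u in zip(w, units)) / total for k in range(3)]
    return l_min - d * math.log(total), grad

def project_to_surface(r, atoms, d, tol=1e-6, max_iter=50):
    """Iterate steps of length equal to the distance, along the gradient."""
    r = list(r)
    for _ in range(max_iter):
        value, grad = distance_and_gradient(r, atoms, d)
        if abs(value) < tol:
            break
        norm = math.sqrt(sum(g * g for g in grad))
        r = [x - value * g / norm for x, g in zip(r, grad)]
    return r

def tangent_step(r, direction, step, atoms, d):
    """Move from surface point r by `step` along the tangent, then re-project.

    `direction` must not be parallel to the surface normal at r.
    """
    _, grad = distance_and_gradient(r, atoms, d)
    norm = math.sqrt(sum(g * g for g in grad))
    n = [g / norm for g in grad]
    dot = sum(a * b for a, b in zip(direction, n))
    t = [a - dot * b for a, b in zip(direction, n)]  # remove normal component
    t_norm = math.sqrt(sum(x * x for x in t))
    moved = [x + step * ti / t_norm for x, ti in zip(r, t)]
    return project_to_surface(moved, atoms, d)

sphere = [((0.0, 0.0, 0.0), 2.0)]
new_point = tangent_step((2.0, 0.0, 0.0), (0.0, 1.0, 0.0), 1.0, sphere, d=0.5)
```

    Each call evaluates the sum over atoms once per iteration, which is where the linear scaling with the number of atoms comes from.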

    c. Mapping part of a surface

    Locally, the molecular surface resembles a Euclidean rectangle (being topologically equivalent to it). One can define pseudo-Euclidean coordinates on an interesting part of the surface. It is natural to introduce polar coordinates, starting from some specific point on the surface. This can be achieved by growing rings of points outwards from a given center. A variety of algorithms can be used. We introduce the basal distance D to define the distance between the points on the surface. Each ring of points is constructed at this distance from the previous one and then projected onto the surface. The points on each newly constructed ring are then checked, one after the other, to see if the distance to the next point is less than 0.75D, in which case this next point is eliminated, or larger than 1.7D, in which case one more point is added. An example of a set of points resulting from applying this procedure is shown in Figure 05. This method is similar to that used by Bacon and Moult, who introduced web coordinates on molecular surface patches by fitting the precomputed surface points by B-splines and constructing a self-growing web. We use the ring construction algorithm to map the interfaces between two molecules. If D=1 Å, the algorithm places ca. 1 point per 1.225 Å² of the surface.


    Figure 05: Example of a set of points resulting from the mapping procedure.
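    The spacing check applied to each new ring can be sketched as a single pass over its points. The sketch makes several assumptions that the text does not specify: an added point is placed at the midpoint of the gap, the ring is treated as an open chain (the gap between the last and first point is not checked), and re-projection of new points onto the surface is omitted. The name respace_ring is hypothetical.

```python
import math

def respace_ring(points, D, lower=0.75, upper=1.7):
    """Drop points closer than lower*D to the previous kept point and
    insert a midpoint where the gap exceeds upper*D."""
    if not points:
        return []
    result = [points[0]]
    for p in points[1:]:
        gap = math.dist(result[-1], p)
        if gap < lower * D:
            continue  # too close to the previous kept point: eliminate
        if gap > upper * D:
            # Too far: add one more point halfway (assumed placement).
            mid = tuple((a + b) / 2 for a, b in zip(result[-1], p))
            result.append(mid)
        result.append(p)
    return result

ring = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
respaced = respace_ring(ring, D=1.0)
```

    In a full implementation each inserted midpoint would be projected back onto the surface before the next ring is grown.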

    d. Scanning the whole molecular surface

    To represent the molecular surface as a whole, one needs a representative set of points on the surface that cover the entire surface as uniformly as possible. The simplest solution is to project every atom center in the molecule onto the closest point on the surface. A drawback of this procedure is that the set of points on the surface cannot be made more dense unless a check over all surface point pairs is done because the points are not ordered according to proximity. It is just such computationally intensive searching that we wish to avoid. An alternative procedure is to use the algorithm described above in section (c) to map the entire molecular surface. For relatively spherical molecular surfaces, the growing rings may eventually contract to a point on the other side of the molecular surface from that at which growing was started, thus defining the spherical coordinates. These mapped coordinates may be used to locate every point on the surface. Making a more dense representation is straightforward since the set of points is ordered. Examples are presented in Figures 01 and 02 where this algorithm was used to cover the solvent accessible surface of the molecule by a sequence of rings.


    Projections of molecular surfaces and interfaces

    Surfaces are usually defined as a set of points that is visualized with the aid of graphical programs and analysed by eye. The automation of the procedure would not only save time spent in analysis, but would also avoid possible errors resulting from the subjective nature of the manual analysis procedure. The projection of two different surfaces onto one simple surface is necessary for the comparison of these surfaces (see review by Masek). A natural solution is to project the surface onto a plane where it can be quantified in a straightforward manner. However, simple projection of the entire molecular surface encounters a topological problem (met already in mapping the surface of the earth), since a closed surface can at best be projected onto a sphere rather than a plane without destroying the connectivities between points. A beautiful solution to this problem for small molecules was suggested by Gasteiger et al., who used Kohonen maps to project the surface onto a torus, conserving neighbourhood.

    For projecting the surface onto a sphere, the gnomonic projection (see Chau and Dean) provides the simplest solution. The surface is placed within the sphere and then projected along the radials of the sphere. The feasibility of approximately conserving distances between points on the surface upon projection is highly dependent on the relative orientations of the sphere and surface and distance conservation will not be uniform over the surface. Moreover, two points will often be projected onto one, resulting in loss of information. A projection that avoids overlaps is the spherical harmonics representation developed by Duncan and Olson in which a special procedure eliminates overlaps resulting from the gnomonic projection.

    The solution proposed in this work is based on the analytical definition of molecular surfaces and the algorithm of growing rings, which builds up quasi-polar coordinates on the surface. This construction may overlap onto itself, causing the duplicate projection of some regions of the surface. The duplication problem is inevitable as can be appreciated from imagining covering an irregular surface with a piece of paper. The duplicate covering can obviously not be completely avoided except in the case of a planar surface. However, most of the duplications can be avoided by weakening the requirement of distance conservation, in an analogous way to covering the surface with a piece of stretch film rather than paper. Depending on the curvature of the surface, this will cause different degrees of distortion. On the other hand, overlaps caused by topological differences between the original and projection surfaces should be handled either by projecting onto an appropriate simple surface or by additional description, for example, specifying the equivalence of opposite borders of a rectangle to which a torus has been projected.

    Once the surface has been covered by a set of rings, the m-th point of the n-th ring can be projected onto a point (x,y) on the plane using the following formulae:

    , where M is the number of points in the n-th ring.

    As a result, various properties on the surface or interface are transferred to an appropriate Euclidean rectangle.

    Figures 01 and 02 show surfaces mapped avoiding duplications. A correction of the procedure described in section (d) above avoids local sources of duplication, so that the growing rings eventually evolve to a point on the other side of the molecule. The ring numbering can serve as a latitude and the length parameter along each ring as a longitude. This provides a means by which to automatically introduce spherical coordinates for a molecular surface.


    Accuracy of area values

    There is approximate area conservation during the mapping procedure. When the reference distance for ring generation is D=1 Å, approximately one point is placed per 1.225 Å². However, the distribution of the points is not uniform. Projection of the surface rings to the rings of the polar coordinates on a planar rectangle further distorts the areas.

    We repeated the mapping for the two interfaces for hGH binding to hGHR1 and hGHR2 using different starting points. In both cases, the interface remains the same since its definition is independent of mapping. Distortions occur when the surface is projected onto the planar square and these distortions are related to the position of the interface with respect to the center/starting point. There is no strict correlation between the distance to the chosen center and the distortions, although the area around the center will always be less distorted. Mapping with different centers is a good test of area conserving properties. Figure 06 shows the correlation between the areas assigned to every residue of both proteins at both interfaces during two different mappings. Deviations are within either an absolute limit or a relative limit, both of which are marked in Figure 06. These deviations depend mainly on the surface curvature, which defines the degree of distortion.

    Figure 06: Correlation between areas assigned to residues at the hGH-hGHR1 and hGH-hGHR2 interfaces when mapping is started at points separated by 2 Å. The absolute tolerance limit is shown in red and the relative tolerance limit in blue. The scale is logarithmic.
    Figure 07: Correlation between the number of surface points per residue on the surface shown in Figure 02 for hGH and residue accessibility.

    The distortions above are introduced by the projection procedure. However, the placing of the points on the interface or surface is also subject to error. These distortions can be quantified by relating the accessibility of each residue to the number of surface points generated on the analytical surface which are closer to this residue than any other (see Figure 07). There are two sources of deviations. The first is the deviation of the analytical smooth surface from the hard-sphere surface used to compute residue accessibilities. The second is the algorithm generating the points on the surface, which is designed to generate non-intersecting rings that evolve to a single point when mapping of the surface is complete. As can be seen in Figure 07, although the upper estimate of the error for the surface shown in Figure 02 is 22 Å², for the majority of residues the number of surface points per residue correlates much better (with a scaling factor of 1.25) with the residue accessibilities. The scaling factor arises because the surface was constructed so that there is on average 1 point per 1.225 Å².
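    The per-residue estimate described above can be sketched as a nearest-residue count. The sketch assumes that "closer to this residue" is measured by the distance to the nearest atomic sphere surface of that residue, and that each point represents 1.225 Å²; the function name, the (residue, center, radius) layout and the toy data are hypothetical.

```python
import math
from collections import Counter

def area_per_residue(surface_points, atoms, area_per_point=1.225):
    """Estimate residue areas from counts of surface points.

    atoms: list of (residue_id, center, radius) tuples (assumed layout).
    Each point is assigned to the residue owning the atom whose sphere
    surface is nearest to it.
    """
    counts = Counter()
    for p in surface_points:
        residue, _, _ = min(
            atoms, key=lambda a: math.dist(p, a[1]) - a[2]
        )
        counts[residue] += 1
    return {res: n * area_per_point for res, n in counts.items()}

# Toy data: two single-atom residues and three surface points.
atoms = [("A", (0.0, 0.0, 0.0), 1.5), ("B", (10.0, 0.0, 0.0), 1.5)]
points = [(1.5, 0.0, 0.0), (0.0, 1.5, 0.0), (8.5, 0.0, 0.0)]
areas = area_per_residue(points, atoms)
```

    The resulting areas can then be compared with hard-sphere residue accessibilities, as done in Figure 07.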


    The program

    The illustrations above are generated using the ADS (Analytically Defined molecular Surfaces) program package. The interface analysis is performed in four steps. In the first step, rings of points are constructed on the interface using a program which maps the surface of the molecule. In the second step, the required properties are assigned to the points on the surface. In the third step, the surface rings are projected onto a rectangle, producing data files to be used in the fourth step, which is to display the properties and plot the pictures. Output data files are readable by the X-window oriented program XFarbe of Preusser, which is used to make a contour representation of the functions on a rectangle. Computer times necessary to perform the manipulations (which have not yet been optimized for speed) are within 2-5 minutes on an SGI Indy RC 4600 workstation for proteins having ca. 1500 atoms.


    References 

    1. Agishtein, M.E. Fuzzy molecular surfaces. J. Biomol. Str. & Dynamics., 1992, 9, 759-768.
    2. Bacon D.J. and J. Moult. Docking by least-squares fitting of molecular surface patterns. J. Mol. Biol., 1992, 225, 849-858.
    3. Chau P.L. and P.M. Dean. Molecular recognition: 3D surface structure comparison by gnomonic projection. J. Mol. Graphics, 1987, 5, 97-100.
    4. Connolly, M. L. Solvent-accessible surfaces of proteins and nucleic acids. Science, 1983, 221, 709-713.
    5. Clackson T., J.A.Wells. A hot spot of binding energy in a hormone-receptor interface. Science, 1995, 267, 383-386.
    6. de Vos A.M., M. Ultsch, A.A. Kossiakoff. Human growth hormone and extracellular domain of its receptor: Crystal structure of the complex. Science, 1992, 255, 306-312.
      PDB entry 3hhr.brk, notations hGH, hGHR1 and hGHR2 are used for chains A, B and C, respectively.
    7. Duncan, B., and A. J. Olson. Approximation and characterization of molecular surfaces. Biopolymers, 1993, 33, 219-229.
    8. Duncan, B., and A. J. Olson. Shape analysis of molecular surfaces. Biopolymers, 1993, 33, 231-238.
    9. Eisenberg, D., A.D. McLachlan. Solvation energy in protein folding and binding. Nature, 1986, 319, 199-203.
    10. Eisenhaber, F., P. Lijnzaad, P. Argos, C. Sander, and M. Scharf. The double cubic lattice method: efficient approaches to numerical integration of surface area and volume and to dot surface contouring of molecular assemblies. J. Comp. Chem., 1995, 16, 273-284.
    11. Fischer D., R. Norel, H. Wolfson, R. Nussinov. Surface motifs by a computer vision technique: searches, detection, and implications for protein-ligand recognition. Proteins: Structure, Function, and Genetics, 1993, 16, 278-292.
    12. Gasteiger, J., X. Li, C. Rudolph, J. Sadowski, and J. Zupan. Representation of molecular electrostatic potentials by topological feature maps. J. Am. Chem. Soc., 1994, 116, 4608-4620.
    13. Gasteiger, J., X. Li, and A. Uschold. The beauty of molecular surfaces as revealed by self organizing neural networks. J. Mol. Graphics., 1994, 12, 90-97.
    14. Grant, J. A., and B. T. Pickup. A gaussian description of the molecular shape. J. Phys. Chem. 1995, 99, 3503-3510.
    15. Hodgkin, E.E. and W.G. Richards. Molecular similarity based on electrostatic potential and electric field. Int. J. Quantum Chemistry. Quantum Biol. Symp., 1987, 14, 105-110.
    16. Laskowski, R. A. SURFNET: A program for visualizing molecular surfaces, cavities, and intermolecular interactions. J. Mol. Graphics, 1995, 13, 323-330.
    17. Lee, B., F.M. Richards. The interpretation of protein structures: Estimation of static accessibility. J. Mol. Biol., 1971, 55 , 379-400.
    18. Lijnzaad P., H.J.C. Berendsen, P.Argos. Hydrophobic patches on the surfaces of protein structures. Proteins: Structure, Function, and Genetics, 1996, 25, 389-397.
    19. Madura, J.D., J.M. Briggs, R.C. Wade, M.E. Davis, B.A. Luty, A. Ilin, J. Antosiewicz, M.K. Gilson, B. Bagheri, L.R. Scott, and J.A. McCammon. Electrostatics and diffusion of molecules in solutions: simulations with the University of Houston Brownian Dynamics Program. Comp. Phys. Comm., 1995, 91, 57-95.
    20. Masek B.B. Molecular surface comparisons. In: Molecular similarity in drug design , edited by P.M. Dean, Blackie Academic and Professional, Glasgow, 1995, 163-186.
    21. Nicholls, A., R. Bharadwaj, and B. Honig. GRASP: Graphical representation and analysis of molecular surfaces. Biophys. J., 1993, 64, A166.
    22. Preusser, A. Algorithm 671 - FARB-E-2D: Fill area with bicubics on rectangles - A contour plot program. ACM Trans. Math. Soft., 1989, 15, 79-89. On-line description of the program is also available.
    23. QUANTA molecular modeling software package. 1992. Molecular Simulations, Inc. Waltham, MA.
    24. Richards F.M. Areas, volumes, packing, and protein structure. Annu. Rev. Biophys. Bioeng., 1977, 6, 151-176.



















































