European Molecular Biology Laboratory, Heidelberg, Germany.
The effects of data pretreatment, data scaling, and variable selection on three-dimensional quantitative structure-activity relationships
derived by comparative molecular field analysis (CoMFA) using the GRID energy function were studied in detail for a set of inhibitors
of the human synovial fluid phospholipase A2 (HSF-PLA2). The quality of the models was evaluated for predictive power and ability to map
the receptor binding site by (a) comparison of predicted and experimental activities using cross-validation and external validation
sets and (b) comparison of the regions selected in space in the CoMFA models with a crystal structure of an HSF-PLA2-inhibitor complex,
with optimized comparative binding energy analysis (COMBINE) models (Ortiz et al., 1995) and with structure-activity relationships
derived previously for different sets of compounds. It is found that:
(1) Data scaling and dielectric modeling strongly influence CoMFA results. Unscaled data and a uniform dielectric constant of 4 are
well suited to GRID-CoMFA studies for the present compound set.
(2) The GOLPE and Q2-GRS variable selection methods select variables in roughly the same regions in Cartesian space, but they produce
different models in chemometric space and differ in their sensitivity to data scaling and pretreatment and in their tendency to overfit.
(3) CoMFA models are consistent with COMBINE models in that they identify approximately the same intermolecular interactions as relevant
for activity. Our study provides support for the qualitative receptor-mapping properties of CoMFA models and for the validity of variable
selection when applied with care and also provides guidelines for how to evaluate the quality of CoMFA models.
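The cross-validated predictive power mentioned above is commonly summarized as q2 = 1 - PRESS/SS. The following is a minimal sketch of a leave-one-out q2 calculation, not the procedure used in the study: it assumes an ordinary least-squares model in place of the partial least-squares regression normally used in CoMFA, takes SS about the overall mean activity, and uses a hypothetical function name (loo_q2) and made-up toy data.

```python
import numpy as np

def loo_q2(X, y):
    """Leave-one-out q2 = 1 - PRESS / SS for a linear least-squares model.

    Assumption: ordinary least squares stands in for PLS here, and SS is
    the sum of squared deviations of y from its overall mean.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i  # leave compound i out
        # Fit intercept + coefficients on the remaining n-1 compounds.
        A = np.column_stack([np.ones(keep.sum()), X[keep]])
        coef, *_ = np.linalg.lstsq(A, y[keep], rcond=None)
        pred = coef[0] + X[i] @ coef[1:]
        press += (y[i] - pred) ** 2
    ss = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / ss

# Toy example (invented numbers): one descriptor, five "compounds".
X_toy = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y_toy = [1.0, 3.0, 5.0, 7.0, 9.0]  # exactly y = 2x + 1
q2_toy = loo_q2(X_toy, y_toy)
```

Because every left-out compound is predicted by a model that never saw it, q2 can be much lower than the fitted r2 and can even be negative, which is why it is a useful guard against the overfitting that variable selection may introduce.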
J Med Chem. 1997 Mar 28;40(7):1136-48.
PMID: 9089335 [PubMed - indexed for MEDLINE]