
J. Chim. Phys. (1999) 96, 566-590
© EDP Sciences, Les Ulis

An approximate procedure for the calculation of van der Waals and solvent-accessible surface areas; computing Gibbs free energies of hydration
M. Ulmschneider and E. Pénigault
Laboratoire de Photochimie Générale, UMR 7525 du CNRS, ENSCMu, 3 rue Alfred Werner, 68093 Mulhouse cedex, France
(Received 5 February 1998; accepted 4 February 1999)
Correspondence and reprints.

RÉSUMÉ
The geometrical and analytical steps of the new ASC procedure for the approximate calculation of molecular surfaces (van der Waals or solvent-accessible) are adjusted. The objective is to approximate as closely as possible the mean surface values calculated for the molecular structures of a reference set. Molecular surfaces, partial atomic surfaces, and the gradients describing surface changes as a function of nuclear displacements are evaluated analytically. After its validation, the ASC model is used to study the correlations between molecular surfaces and experimental Gibbs free energies of hydration. A satisfactory correlation is established for a wide range of organic compounds, provided that σ and π surface elements are introduced in addition to the traditional distinction between polar and non-polar partial atomic surfaces.

Keywords: analytical model, molecular surface, van der Waals surface, solvent-accessible surface, partial atomic surface, atomic sigma surface, atomic pi surface, Gibbs free energy of hydration.

ABSTRACT
Both the geometrical and the analytical steps of the new ASC procedure for the approximate computation of molecular van der Waals and solvent-accessible surface areas were calibrated. The objective was to obtain the best fit to accurate surface values for a representative set of molecular structures. Molecular surfaces, partial atomic surfaces, and the gradients describing surface changes as a function of atomic displacements are described analytically. After validation of the ASC model, an attempt was made to correlate experimental Gibbs free energies of hydration with partial atomic surfaces. A satisfactory correlation was obtained for a large set of organic compounds, provided that σ- and π-surface elements are introduced in addition to the traditional differentiation into polar and non-polar partial atomic surfaces.

Keywords: analytical model, molecular surface, van der Waals surface, solvent-accessible surface, partial atomic surface, atomic sigma surface, atomic pi surface, Gibbs free energy of hydration.

INTRODUCTION
Although not strictly defined in a quantum mechanical sense, the notion of molecular surface plays an important role in the interpretation and prediction of molecular properties and recognition [1-4]. The present paper is concerned with an analytical procedure for the calculation of van der Waals and solvent-accessible molecular surfaces as well as partial atomic surface areas. The van der Waals surface of a molecule is the exposed surface of the fused van der Waals spheres of its constituent atoms. The solvent-accessible surface is the locus of the centre of a probe sphere rolling over the van der Waals surface [4]. Both surfaces are continuous, with discontinuous slopes at the boundaries between atoms.

The accurate analytical calculation of surface areas is a complex geometrical problem of multiply overlapping spheres. A variety of numerical procedures have been developed for the approximate calculation of molecular surfaces [3-13]. Given sufficient computational effort, numerical procedures can be applied to any desired degree of accuracy. However, they do not provide the gradient describing the changes in surface area for small atomic displacements. This quantity, which proves useful in molecular mechanics calculations, would have to be estimated by finite differences, requiring excessive computational effort. Numerical approaches rely on geometrical constructions, surface point distributions, or three-dimensional grid algorithms, and they are time-consuming. For this reason, several authors have proposed analytical solutions to the molecular surface problem: analytical approximations based on statistics, or exact solutions to more rigorous definitions of a solvent-accessible surface, have been worked out [14-19]. The present paper deals with a fast and effective analytical procedure.
It consists of a geometrical step, in which a few probe points with assigned surface equivalents are placed around each atom, followed by an analytical step, in which partial surface inclusions in neighbouring atoms are estimated by using a simple point-to-atom distance function. Since the geometrical and the analytical steps both depend on the atomic coordinates in a straightforward manner, the surface areas as well as their derivatives with respect to atomic displacements are analytically available. Throughout this work the procedure is referred to as ASC (Approximated Surface Calculation).

Most biochemical processes in living systems occur in aqueous media. Accordingly, many attempts have been made at describing the properties of molecular systems in water [2,11,15,20-30]. To assess conformational properties and molecular recognition phenomena in aqueous solution, it is necessary to estimate the Gibbs free energies of hydration in a conformation-dependent manner [31-34]. Hence, simple empirical models that estimate this hydration energy directly from the structural information of a given solute molecule are of particular interest. By using the novel analytical ASC method to compute partial atomic van der Waals surface areas, a linear additivity scheme of atom-specific, σ/π-differentiated solvation energy contributions was developed. These contributions were calibrated by multiple linear regression analysis of calculated and experimental Gibbs free energies of hydration for a large set of organic compounds.

1. GENERAL ASPECTS OF THE ASC PROCEDURE


A key element of the ASC method [35,36] is the arrangement of the probe points around individual atoms, which is based on atomic hybridization and structural context. The centred regular tetrahedron, trigonal bipyramid or regular octahedron assigned to each bonded sp3-, sp2- or sp1-hybridized atom, respectively, is oriented in such a way that the corresponding vertices lie optimally on the bond axes of the heavy-atom skeleton (all distances from the centre of the polyhedron to the vertices are equal to unity). Each vertex not identified by a bond axis is used to define the location of an atom-specific probe point, placed along the radial vector to the vertex at a distance r from the atomic nucleus and centre of the polyhedron. The radial distance r equals the atomic van der Waals radius r_vdw for van der Waals surfaces, and r_vdw incremented by a solvent radius r_sol for solvent-accessible surfaces. A radial increment r_inc is added to optimize the surface calculations. An example of construction is shown in Figure 1.

To estimate the free surface area of a bonded atom A, a surface equivalent s_A is assigned to each associated probe point. Since probe points are arranged on the vertices of symmetrical polyhedra, the surface equivalents are given by

$$ s_A = \frac{4 \pi r_A^2}{n_A} $$

where r_A is the radius of the atomic sphere and n_A the number of vertices of the selected polyhedron.
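As a rough illustration of the geometrical step, the sketch below places probe points on the free vertices of a unit tetrahedron (the sp3 case) and assigns each of them the surface equivalent 4πr²/n. The function names, the example radius of 1.7 Å and the choice of which vertices lie on bond axes are assumptions made for illustration only; they are not taken from the ASC implementation.

```python
import math

# Unit vectors to the four vertices of a regular tetrahedron centred on the atom.
_T = 1.0 / math.sqrt(3.0)
TETRAHEDRON = [(_T, _T, _T), (_T, -_T, -_T), (-_T, _T, -_T), (-_T, -_T, _T)]


def surface_equivalent(radius, n_vertices):
    """Surface equivalent per probe point: sphere area divided by vertex count."""
    return 4.0 * math.pi * radius ** 2 / n_vertices


def probe_points(centre, radius, vertices, bonded):
    """Place a probe point at distance `radius` along every vertex that is
    not aligned with a bond axis (vertex indices listed in `bonded`)."""
    points = []
    for i, (vx, vy, vz) in enumerate(vertices):
        if i in bonded:
            continue
        points.append((centre[0] + radius * vx,
                       centre[1] + radius * vy,
                       centre[2] + radius * vz))
    return points


# Hypothetical sp3 atom at the origin, radius 1.7 Å, vertices 0 and 1 on bond axes.
r = 1.7
s_equiv = surface_equivalent(r, len(TETRAHEDRON))
pts = probe_points((0.0, 0.0, 0.0), r, TETRAHEDRON, bonded={0, 1})
```

In this toy case two probe points remain, each carrying one quarter of the sphere area; orienting the polyhedron along real bond axes is not shown.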

Figure 1: probe vector constructions for a model lactam, reference polyhedra (above) and van der Waals probe points (below) for selected atoms (the probe points are differentiated as σ- and π-type points)


To estimate surface inclusions by mutual overlap of neighbouring atomic spheres, inclusion factors are defined by a universal function f of the distance between probe points and atomic nuclei. This function takes values between 0 (total inclusion) and 1 (no inclusion) within a limited distance range bounded by the cutoff distance c_off, beyond which there is no inclusion. The cutoff distance is obtained from the van der Waals radius r_B,vdw of the overlapping atom B, optionally augmented by the radius r_sol of the sphere simulating a solvent molecule:

$$ c_{off} = c_{ext} \, (r_{B,vdw} + r_{sol}) = c_{ext} \, r_B $$

The extension factor c_ext is introduced as an adjustable parameter to optimize the surface estimation. The inclusion function f and its derivative with respect to the distance must be well behaved as the distance x approaches the boundaries of the definition interval. A good way to achieve this is to take a cubic spline over the definition range:

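$$ f(x) = \begin{cases} \text{cubic polynomial in } x \text{ with coefficients } a \text{ and } b & \text{if } x \in [0, c_{off}] \\ 1 & \text{if } x \in \, ]c_{off}, +\infty[ \end{cases} $$

Explicit values for a and b can be found by applying the expressions of the function f and its derivative f' at the cutoff boundary:

$$ f(c_{off}) = 1, \qquad f'(c_{off}) = 0 $$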

Solving this system of two equations leads to complete definition of the inclusion function:
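As a worked illustration, suppose the cubic has the form f(x) = a x³ + b x² on [0, c_off]. This form is an assumption, chosen so that f(0) = 0 (total inclusion) with zero slope at x = 0; the polynomial actually used by the authors may differ. Requiring f(c_off) = 1 and f'(c_off) = 0 then gives:

```latex
\begin{aligned}
a\,c_{off}^{3} + b\,c_{off}^{2} &= 1, \qquad 3a\,c_{off}^{2} + 2b\,c_{off} = 0 \\
\Rightarrow\quad a &= -\frac{2}{c_{off}^{3}}, \qquad b = \frac{3}{c_{off}^{2}} \\
f(x) &= 3\left(\frac{x}{c_{off}}\right)^{2} - 2\left(\frac{x}{c_{off}}\right)^{3}, \qquad 0 \le x \le c_{off}
\end{aligned}
```

Under this assumption f rises monotonically from 0 to 1 over the interval, with f(c_off/2) = 1/2, and joins the constant branch f = 1 with a continuous first derivative.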

For an atom A with n_A probe points and an intersecting atom B, the residual exposed surface area of the j-th probe point of A is given by

$$ s_{A,j} = s_A \cdot f_{A,j}(x) $$

where s_A is the surface equivalent of a probe point of A and x is the distance between the j-th probe point of A and atom B. For a set of N overlapping atoms, a probabilistic approach [15, 17] is adopted. The residual exposed surface equivalent of the j-th probe point of A is given by the product of the respective inclusion factors:

$$ s_{A,j} = s_A \cdot f_{A,j} = s_A \prod_{k=1,\, k \neq A}^{N} f_{A,j}(x_k) $$

where x_k is the distance between the j-th probe point of A and atom k. The exposed partial atomic surface S_A of atom A and the exposed surface S of the molecule are given by the following equations:

$$ S_A = \sum_{j=1}^{n_A} s_{A,j}, \qquad S = \sum_{A} S_A $$
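The analytical step can be sketched in a few lines. The code below is illustrative only: the explicit cubic `3t² - 2t³` is an assumed form of the inclusion function (it satisfies f = 1 and f' = 0 at the cutoff, but is not confirmed by the text), the function names are hypothetical, and the default `c_ext = 1.89` is the van der Waals value quoted in Table 1.

```python
import math


def inclusion(x, c_off):
    """Assumed cubic inclusion factor: 0 at x = 0, 1 at and beyond c_off."""
    if x >= c_off:
        return 1.0
    t = x / c_off
    return 3.0 * t ** 2 - 2.0 * t ** 3


def exposed_atomic_surface(probe_pts, s_equiv, neighbours, c_ext=1.89):
    """Sum over probe points of s_equiv times the product of inclusion factors.

    neighbours: list of ((x, y, z), radius) for the overlapping atoms.
    """
    total = 0.0
    for p in probe_pts:
        factor = 1.0
        for centre, radius in neighbours:
            # Cutoff distance of this neighbour: extension factor times its radius.
            factor *= inclusion(math.dist(p, centre), c_ext * radius)
        total += s_equiv * factor
    return total


# One probe point carrying 9.0 Å² and a distant neighbour: nothing is included.
free = exposed_atomic_surface([(1.7, 0.0, 0.0)], 9.0, [((10.0, 0.0, 0.0), 1.7)])
# The same probe point sitting on a neighbour's nucleus: fully included.
buried = exposed_atomic_surface([(1.7, 0.0, 0.0)], 9.0, [((1.7, 0.0, 0.0), 1.7)])
```

Because every term is a smooth function of the coordinates, the same loop structure could also accumulate analytical derivatives, which is the property the procedure relies on.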

2. CALIBRATION OF ASC PROCEDURE

The structures of 430 representative small monofunctional and polyfunctional organic molecules were built by using the modelling package MOLOC [37], which is the property of the F. Hoffmann-La Roche Company. The structure library contains hydrocarbons, alcohols, phenols, ethers, amines, ketones, aldehydes and esters. To obtain representative samples for proper calibration, various conformations were included for both cyclic and acyclic compounds and for most major functional groups.
Table 1: equations of best fit and parameters used for the ASC procedure (S_MOLSV and S_ASC are the accurate and the approximate surface areas; the solvent radius is a fixed parameter; probe shift and extension factor are optimized during calibration)

| parameter | van der Waals surfaces | solvent-accessible surfaces |
|---|---|---|
| linear regression analysis | S_MOLSV = (0.006 ± 0.800) + (1.123 ± 0.005) S_ASC | S_MOLSV = (0.004 ± 4.879) + (1.190 ± 0.016) S_ASC |
| n | 430 | 430 |
| s | 5.60 | 25.22 |
| r² | 0.993 | 0.927 |
| F | 59177 | 5430 |
| solvent radius r_sol (Å) | 0 | 1.45 |
| probe shift r_inc (Å) | 0 | 1.00 |
| extension factor c_ext | 1.89 | 1.56 |
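The best-fit lines of Table 1 are ordinary least-squares regressions of the accurate areas on the approximate ones. The sketch below shows how intercept, slope and r² of such a line can be computed; the function name is hypothetical and the numbers are invented placeholders, not data from the structure library.

```python
def linear_fit(x, y):
    """Least-squares line y = intercept + slope * x, with r squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return intercept, slope, r2


# Placeholder areas in Å² (hypothetical): approximate values and reference values.
s_asc = [80.0, 120.0, 150.0, 210.0]
s_ref = [91.0, 133.0, 170.0, 236.0]
intercept, slope, r2 = linear_fit(s_asc, s_ref)
```

A calibration along these lines would repeat the fit for each trial parameter set and keep the set giving the tightest linear relationship; that outer search is not shown here.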


For all molecules in the structure library, the approximate molecular surface areas S_ASC were determined for a given parameter set {r_inc, c_ext}. The accurate surface areas S_MOLSV were computed by the numerical procedure MOLSV, available through QCPE [6]. The optimum values for the parameters r_inc and c_ext were calculated by optimizing the linear relationship between S_ASC and S_MOLSV. The results for both van der Waals and solvent-accessible surfaces are summarized in Table 1. Gratifyingly, r_inc remained close to zero for optimum estimates of the van der Waals surfaces, so that, for conceptual simplicity, this parameter was set to 0 for this type of surface calculation. Similar reasoning was applied in the case of solvent-accessible surfaces, where r_inc was set to 1. From these data as well as the plots of Figures 2 and 3, it is evident that the procedure yields reasonably accurate estimates of molecular van der Waals surfaces, but is somewhat less satisfactory for solvent-accessible surfaces, where the degree of complexity in multiply overlapping atomic spheres is considerably higher.

For comparison, Table II gives the CPU times in seconds, measured on a Micro Vax 3800, for the van der Waals and solvent-accessible surface areas of four selected molecular structures, calculated by using the MOLSV and the ASC models. These molecules are benzene, the α-helical conformation of N-acetyl-decaalanyl-N'-methylamide (Ac-Ala10-NHMe), bovine B1 insulin (the second chain of a polypeptide hormone, comprising 30 amino acids) and sperm whale deoxymyoglobin (a polypeptide chain of 153 residues). The number n of heavy atoms (i.e. C, O and N) in the selected molecules is also shown in Table II. Inspection of Table II shows that the ASC algorithm is considerably faster than the MOLSV model.
Figure 2: correlation of exact MOLSV values vs approximate ASC values for van der Waals molecular surface areas, both in Å² (the statistical data of the linear correlation are given in Table 1)

Figure 3: correlation of exact MOLSV values vs approximate ASC values for solvent-accessible molecular surface areas, both in Å² (the statistical data of the linear correlation are given in Table 1)

The computation times with the ASC model are shorter because the number of points required for one atom is on average 600 times lower than with MOLSV. However, ASC calculations are not 600 times faster, because additional arithmetic steps are needed for each point. The core of both algorithms consists of two nested loops over n, so the CPU time should be a quadratic function of n; in practice, for large values of n, the dependence of the CPU time on n is roughly linear. The efficiency is improved by drawing up for each atom a list of the neighbouring atoms located within a specific cutoff distance, thus reducing the number of unnecessary calculations. For a given molecule, the CPU time then depends on both n and the average number of atoms selected in those lists. For a given atom, the number of neighbouring atoms in the list may vary according to the size and the topology of the molecule, and also according to the type of surface to be computed. For instance, in the intersecting-sphere approach more atoms overlap when computing the solvent-accessible surface area, so the CPU times are comparatively longer than those for van der Waals surfaces.
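The neighbour lists mentioned above can be sketched as follows. This is a minimal brute-force version with a hypothetical function name; the list construction itself is still a double loop over atoms, and whatever bookkeeping the original programs use to build the lists is not described in the text.

```python
import math


def neighbour_lists(coords, cutoff):
    """For each atom, list the indices of the atoms lying within `cutoff`."""
    n = len(coords)
    lists = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                lists[i].append(j)
                lists[j].append(i)
    return lists


# Three hypothetical atoms on a line (coordinates in Å).
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (8.0, 0.0, 0.0)]
near = neighbour_lists(atoms, cutoff=4.0)
```

Once the lists exist, the inclusion factors of a probe point need only be evaluated for the listed atoms, so the cost per atom scales with the local packing rather than with n. A larger cutoff, as needed for solvent-accessible surfaces, lengthens every list.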
Table II: CPU times in seconds, measured with a Micro Vax 3800, for van der Waals and solvent-accessible surface areas calculated by using the MOLSV and the ASC models (n is the number of heavy atoms)

| molecule | n | van der Waals, MOLSV | van der Waals, ASC | solvent-accessible, MOLSV | solvent-accessible, ASC |
|---|---|---|---|---|---|
| benzene | 6 | 1.6 | 0.03 | 1.4 | 0.03 |
| (Ac-Ala10-NHMe) α helix | 55 | 17.9 | 0.7 | 39.9 | 1.1 |
| bovine B1 insulin | 161 | 53.1 | 2.5 | 128.8 | 4.7 |
| deoxymyoglobin | 1217 | 443.7 | 58.4 | 1174.4 | 83.7 |

3. CALCULATING PARTIAL ATOMIC SURFACE AREAS


To assess the ability to estimate partial atomic surfaces, the above results were compared with accurate partial surfaces for all atoms contained in the structure library. The comparison is best performed for individual subsets of atoms, classified by element type (E), hybridization index (h) and number (n) of covalently bonded non-hydrogen ligands (index h is the exponent in the hybridization type sp^h). Tables III and IV list the number of occurrences in each atomic subset (Ehn) and summarize the results in terms of mean surface areas and standard deviations for van der Waals and solvent-accessible partial surfaces, respectively. The data are displayed in Figures 4a and 4b. While the mean values scatter reasonably well about the main diagonal, the method is obviously more successful in estimating partial van der Waals surfaces than partial solvent-accessible surfaces.


Table III: van der Waals surface areas in Å² for the different individual subsets of atoms, classified by element type (E), hybridization index (h) and number (n) of covalently bonded non-hydrogen ligands (for each subset, the number of occurrences in the data set is given, as well as the mean values and the standard deviations for partial atomic surfaces calculated by MOLSV and ASC). The first eight rows are the carbon subsets; the labels of the nine nitrogen and oxygen subsets that follow are not legible in the source and are marked N/O.

| subset | occurrences | mean MOLSV | mean ASC | std. dev. MOLSV | std. dev. ASC |
|---|---|---|---|---|---|
| C11 | 4 | 31.79 | 32.79 | 0.52 | 0.16 |
| C12 | 24 | 15.04 | 14.04 | 1.89 | 1.54 |
| C21 | 26 | 31.57 | 32.84 | 1.84 | 2.20 |
| C22 | 749 | 17.00 | 14.96 | 1.30 | 1.49 |
| C23 | 560 | 5.05 | 5.41 | 1.78 | 1.51 |
| C31 | 1160 | 27.48 | 28.00 | 3.66 | 4.57 |
| C32 | 1051 | 17.70 | 17.75 | 2.10 | 2.11 |
| C33 | 453 | 4.70 | 4.48 | 1.35 | 1.41 |
| N/O | 14 | 24.54 | 25.28 | 0.07 | 0.04 |
| N/O | 59 | 23.34 | 21.77 | 2.76 | 4.00 |
| N/O | 12 | 12.64 | 12.17 | 1.22 | 1.60 |
| N/O | 4 | 2.46 | 3.91 | 0.37 | 0.58 |
| N/O | 32 | 22.27 | 20.49 | 2.59 | 4.20 |
| N/O | 205 | 18.55 | 14.25 | 2.08 | 3.05 |
| N/O | 38 | 8.70 | 8.08 | 1.16 | 1.49 |
| N/O | 43 | 18.46 | 14.16 | 2.18 | 3.64 |
| N/O | 66 | 9.84 | 9.16 | 1.63 | 1.47 |

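Table IV: solvent-accessible surface areas in Å² for the different individual subsets of atoms, classified by element type (E), hybridization index (h) and number (n) of covalently bonded non-hydrogen ligands (for each subset, the number of occurrences in the data set is given, as well as the mean values and the standard deviations for partial atomic surfaces calculated by MOLSV and ASC). The first eight rows are the carbon subsets; the labels of the nine nitrogen and oxygen subsets that follow are not legible in the source and are marked N/O.

| subset | occurrences | mean MOLSV | mean ASC | std. dev. MOLSV | std. dev. ASC |
|---|---|---|---|---|---|
| C11 | 4 | 84.31 | 91.60 | 1.44 | 11.63 |
| C12 | 24 | 19.99 | 28.03 | 8.31 | 7.32 |
| C21 | 26 | 74.91 | 78.53 | 11.06 | 14.24 |
| C22 | 749 | 32.93 | 32.42 | 7.79 | 8.16 |
| C23 | 560 | 3.49 | 4.01 | 2.39 | 3.41 |
| C31 | 1160 | 55.25 | 54.67 | 14.82 | 18.62 |
| C32 | 1051 | 33.18 | 34.01 | 9.29 | 9.89 |
| C33 | 453 | 8.24 | 10.68 | 5.48 | 5.50 |
| N/O | 14 | 71.17 | 64.52 | 0.06 | 0.66 |
| N/O | 59 | 47.33 | 37.02 | 11.90 | 14.10 |
| N/O | 12 | 22.70 | 22.18 | 5.71 | 4.02 |
| N/O | 4 | 0.70 | 4.59 | 0.58 | 1.31 |
| N/O | 32 | 44.59 | 39.45 | 13.18 | 18.51 |
| N/O | 205 | 40.24 | 31.24 | 11.23 | 12.30 |
| N/O | 38 | 12.39 | 15.08 | 4.76 | 4.41 |
| N/O | 43 | 38.71 | 33.77 | 12.80 | 17.39 |
| N/O | 66 | 17.84 | 20.68 | 8.67 | 8.08 |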

Figure 4: correlation of approximate ASC values vs exact MOLSV values for partial atomic van der Waals (a) and solvent-accessible (b) surface areas; mean values and standard deviations are plotted for individual atomic subsets Ehn, classified according to element type (E), hybridization index (h), and number (n) of covalently bonded non-hydrogen ligands; all surfaces in Å²

Figure 5: plot of approximate ASC values vs exact MOLSV values of partial van der Waals surface areas (in Å²) of sp3-hybridized carbon atoms with one, two and three covalently bonded non-hydrogen ligands, for all molecules in the structure library

Figure 6: plot of approximate ASC values vs exact MOLSV values of partial solvent-accessible areas (in Å²) of sp2-hybridized carbon atoms with one, two and three covalently bonded non-hydrogen ligands, for all molecules in the structure library

The considerable variations in the magnitudes of the standard deviations reflect the diversity of structural contexts and the concomitant differences in surface inclusions. The procedure reproduces both mean values and standard deviations reasonably well. Further analysis shows that such correlations hold for each individual atomic subset, where data sets are clustered along the main diagonal. This is illustrated in Figures 5 and 6 for the structurally important subsets of sp3- and sp2-hybridized carbon atoms with one to three non-hydrogen ligands. These subsets of ubiquitous atomic units are highly populated in the structure library. Hence, the parameter calibration is somewhat biased towards these structurally important elements. This seems to be justified in view of the overall satisfactory correlations of the results.

4. EXPERIMENTAL GIBBS FREE ENERGIES OF HYDRATION

The affinity of a compound for an aqueous environment can be evaluated by experimental determination of its vapour pressure over dilute aqueous solutions. The method, studied extensively by Wolfenden et al. [38], involves measurement of the dimensionless equilibrium constant for the transfer of a substance from the vapour phase, in which each molecule exists in virtual isolation, to a dilute aqueous solution in which solute-solute interactions can be disregarded. The dilute solution is pH-adjusted, if necessary, to maintain the solute in the uncharged state. Measurements are limited to relatively volatile solutes that exhibit substantial vapour pressures above the aqueous phase. Experimental error is generally within a few kJ mol⁻¹. This is a direct method for the determination of Gibbs free energies of hydration; indirect methods rely on additional organic solvents and the determination of partition coefficients. Because the direct method is restricted to relatively volatile compounds, it is not applicable to biological systems, and the possibility of extrapolating the data for smaller molecules to larger ones remains an open issue.

An extensive list of 350 small organic compounds, with various experimental thermodynamic properties of the corresponding neutral species derived from different sources, has been critically reviewed and tabulated by Cabani et al. [39]. From this compilation, 268 entries were selected, excluding halogen-containing compounds. The corresponding three-dimensional structures were generated with the united-atom molecular modelling program MOLOC [37] and organized into a structure library. This library contains representative sets of hydrocarbons, alcohols, phenols, ethers, amines, pyridines, ketones, aldehydes, esters and nitro compounds. There are also a few carboxylic acids, nitriles, thioethers and thiols, but only one amide. In terms of elemental distribution, carbon atoms are clearly the most abundant, followed by oxygen and nitrogen atoms. Sulfur atoms are rare. Apart from the lack of sp2-hybridized sulfur atoms, sp1- or sp2-hybridized carbon and nitrogen atoms, as well as sp2-hybridized oxygen atoms, are well represented. Only a few charged or polyfunctional compounds are contained in this library. On the other hand, branched and unbranched, as well as cyclic and acyclic, structures are approximately equally abundant.

5. HYDRATION AND MOLECULAR SURFACE

Solvation is generally considered to be predominantly a molecular surface phenomenon [2,20]. Accordingly, a correlation has been established between the surface area of apolar solutes and their Gibbs free energies of transfer from hydrocarbon solvents to water [20,21,25]. Several additivity schemes have been proposed to estimate Gibbs free energies of hydration based on the notions of fragments, functional groups or atomic classes, and their independent contributions to solvation. An early model [1] has been improved by considering the solvent-accessible surface area of individual fragments and assuming proportionality between surface area and contribution to solvation energies [31-34]. For a molecular structure with N fragments, the total Gibbs free energy of hydration is then given by the sum

$$ \Delta G_h = \sum_{i=1}^{N} g_i S_i $$

where, for the i-th fragment, S_i is the accessible surface area and g_i its specific contribution to the Gibbs free energy of hydration.
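The additivity scheme amounts to a weighted sum of exposed areas. The short sketch below shows the arithmetic only; the class names and all numerical values are hypothetical placeholders and are not fitted coefficients from this work.

```python
def hydration_free_energy(surfaces, coefficients):
    """Additive estimate: sum of g_i * S_i over fragments (or atom classes)."""
    return sum(coefficients[name] * area for name, area in surfaces.items())


# Placeholder numbers, chosen only to show the arithmetic.
g = {"nonpolar": 0.03, "polar": -0.50}      # kJ mol^-1 Å^-2 (hypothetical)
areas = {"nonpolar": 100.0, "polar": 20.0}  # Å^2 (hypothetical)
dg_h = hydration_free_energy(areas, g)      # 0.03 * 100 + (-0.50) * 20
```

Because the areas depend on the conformation while the coefficients do not, re-evaluating the sum for a new geometry is all that is needed to follow conformational changes in the estimated hydration energy.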

Changes in Gibbs free energies of hydration resulting from changes in conformation are usually rationalized through the conformational dependence of the exposure of fragment atoms. A similar approximation was adopted in the present work, together with the new ASC method for the analytical calculation of partial atomic surfaces [35, 36]. This method provides a sufficiently detailed description of molecular surface topologies, while keeping close to chemical intuition. Since ASC is an approximate method, the accuracy with which it estimates the solvent-accessible and the van der Waals surface areas of the 268 molecular structures contained in the library was assessed first.
Table V: correlation equations of estimated and accurate molecular surface areas (in Å²)

| correlation | equation | n | s | r² | F |
|---|---|---|---|---|---|
| solvent-accessible areas | S_sa,MOLSV = (26.26 ± 5.417) + (0.834 ± 0.017) · S_sa,ASC | 268 | 16.27 | 0.898 | 2351 |
| van der Waals areas | S_vdw,MOLSV = (1.124 ± 0.676) + (0.969 ± 0.005) · S_vdw,ASC | 268 | 2.61 | 0.993 | 40492 |
| solvent-accessible areas vs. van der Waals areas | S_sa,MOLSV = (74.607 ± 1.269) + (1.573 ± 0.009) · S_vdw,MOLSV | 268 | 4.87 | 0.991 | 28927 |

For this purpose, the approximate surface areas estimated by ASC were compared with the accurate areas obtained with MOLSV [6]. The correlations of estimated and accurate molecular surface areas are summarized in Table V. The ASC method works successfully for van der Waals surfaces but is somewhat less satisfactory for solvent-accessible surfaces. However, since numerically accurate van der Waals and solvent-accessible surfaces are very closely correlated (Table V), the use of ASC-estimated van der Waals surfaces in the solvation parameterization scheme is justified.

6. σ/π DIFFERENTIATION IN THE SURFACE DEFINITION

In the geometrical step of the ASC model, a small number of probe points are placed around each atom at the vertices of atom-centred reference polyhedra, the shape and orientation of which depend on atomic hybridization state and valence geometry. Consequently, for sp2- or sp1-hybridized atoms, we may differentiate between σ- and π-type probe points, lying in the σ- and the π-planes of a given atom, respectively. Each probe point carries a surface equivalent which is characterized by the chemical nature of its associated atom and is generally partially included in neighbouring atomic spheres. These surface elements are expected to contribute to the overall Gibbs free energy of hydration to the extent of their exposure. Hence, a straightforward model can be proposed, based on a strictly atomic additivity scheme, the complexity of which is governed by the degree of atomic differentiation alone.

A preliminary analysis of experimental hydration energy data revealed the need for a differentiation between σ- and π-type surface areas. This is illustrated for the subset of aromatic hydrocarbons and for that of ketones and aldehydes. In both cases, the experimental Gibbs free energies of hydration (Table VI) correlate poorly with the total van der Waals surfaces, as borne out by the scatter plots of Figures 7 and 8 and by correlation coefficients of ca. 40%. Interestingly, the correlation for the aliphatic ketones and aldehydes cannot be improved by a differentiation into total non-polar carbon and polar oxygen van der Waals surface areas (Figure 9). However, for both data sets, the surface-to-solvation correlation can be significantly improved by introducing σ- and π-type surface areas. Indeed, a σ/π-differentiated analysis results in correlation coefficients close to 90% in both data sets (Figures 10 and 11). However, these two individual subset correlations do not result in a uniform parameterization scheme, as is evident from the opposite signs of the π-contributions to the solvation energy of carbon atoms in carbonyl groups and aromatic rings. Therefore, a more refined approach has to be adopted in order to arrive at a surface model that adequately covers the whole data set available. Using atom-specific σ- and π-type surface elements for C, N and O, and only σ-type elements for S, the best linear fit to the experimental hydration energy data yields a correlation coefficient of 72%. Although not of paramount importance, this result is remarkable in view of the fact that it is achieved with a minimal parameter set of only seven types of surface elements.
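A multiple linear regression of hydration energies on several surface types can be sketched with the normal equations. The code below is a generic least-squares illustration in pure Python, not the authors' regression program; the function names are hypothetical and the data are synthetic values generated from known coefficients so that the fit can be checked.

```python
def solve(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_surface_model(surfaces, dg):
    """Least-squares fit of dg = g0 + sum_i g_i * S_i via the normal equations."""
    rows = [[1.0] + list(s) for s in surfaces]  # leading 1 carries the intercept g0
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * y for r, y in zip(rows, dg)) for i in range(p)]
    return solve(ata, aty)


# Synthetic (sigma, pi) areas in Å² and energies built from g0 = 1, g = (0.5, -0.2).
areas = [(10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (20.0, 5.0)]
energies = [1.0 + 0.5 * s + (-0.2) * p for s, p in areas]
g0, g_sigma, g_pi = fit_surface_model(areas, energies)
```

With real data the residual scatter, and hence the standard errors quoted with each coefficient, would come from the misfit between fitted and experimental energies; that statistical step is omitted here.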

Figure 7: experimental Gibbs free energies of hydration ΔG_h (in kJ mol⁻¹) vs total van der Waals areas for aromatic hydrocarbons; ΔG_h,calc = (37.07 ± 6.08) + (-0.15 ± 0.04) · S_vdw, n = 27, s = 4.39, r² = 0.41

Figure 8: experimental Gibbs free energies of hydration ΔG_h (in kJ mol⁻¹) vs total van der Waals areas for aliphatic ketones and aldehydes; ΔG_h,calc = (-20.34 ± 1.71) + (0.05 ± 0.01) · S_vdw, n = 28, s = 2.19, r² = 0.39

Table VI: aromatic hydrocarbons, ketones and aldehydes, with experimental Gibbs free energies of hydration, ΔG_h (in kJ mol⁻¹)

| compound | ΔG_h | compound | ΔG_h | compound | ΔG_h |
|---|---|---|---|---|---|
| benzene | -3.62 | 1,3-dimethylnaphthalene | -10.35 | 2-octanone | -12.06 |
| methylbenzene | -3.71 | 1,4-dimethylnaphthalene | -11.79 | 2-nonanone | -10.41 |
| ethylbenzene | -3.33 | 2,3-dimethylnaphthalene | -11.64 | 5-nonanone | -11.18 |
| 1,2-dimethylbenzene | -3.77 | 2,6-dimethylnaphthalene | -11.00 | 2-undecanone | -9.05 |
| 1,3-dimethylbenzene | -3.50 | acenaphthene | -13.77 | acetophenone | -19.18 |
| 1,4-dimethylbenzene | -3.37 | anthracene | -17.70 | acetaldehyde | -14.66 |
| propylbenzene | -2.23 | phenanthrene | -16.53 | propanal | -14.40 |
| isopropylbenzene | -1.26 | pyrene | -16.68 | butanal | -13.29 |
| 1,2,4-trimethylbenzene | -3.60 | 2-propanone | -16.12 | pentanal | -12.68 |
| butylbenzene | -1.66 | 2-butanone | -15.22 | hexanal | -11.76 |
| sec-butylbenzene | -1.88 | 2-pentanone | -14.76 | heptanal | -11.18 |
| t-butylbenzene | -1.83 | 3-pentanone | -14.28 | octanal | -9.58 |
| i-pentylbenzene | -0.74 | 3-methyl-2-butanone | -13.56 | nonanal | -8.69 |
| 1,1'-biphenyl | -11.06 | 2-hexanone | -13.76 | trans-2-butenal | -17.68 |
| biphenylmethane | -11.78 | 4-methyl-2-pentanone | -12.81 | trans-2-hexenal | -15.40 |
| 9H-fluorene | -14.41 | 2-heptanone | -12.72 | trans-2-octenal | -14.40 |
| naphthalene | -10.01 | 4-heptanone | -12.24 | trans,trans-2,4-hexadienal | -19.39 |
| 1-methylnaphthalene | -9.91 | 2,4-dimethyl-3-pentanone | -11.46 | benzaldehyde | -16.84 |
| 1-ethylnaphthalene | -10.02 | | | | |

Figure 9: experimental Gibbs free energies of hydration ΔG_h vs ΔG_h values calculated for aliphatic ketones and aldehydes with the total van der Waals surface areas of non-polar carbon and polar oxygen (both axes in kJ mol⁻¹); fitted intercept (-29.16 ± 4.39) and surface coefficients (0.04 ± 0.01), (-0.08 ± 0.02) and (0.00 ± 0.32), in the order printed

Figure 10: experimental Gibbs free energies of hydration ΔG_h vs ΔG_h values calculated for aromatic hydrocarbons with the σ/π-type surface areas (both axes in kJ mol⁻¹); ΔG_h,calc = (7.64 ± 2.49) + (-0.01 ± 0.02) · S_σ + (-0.21 ± 0.01) · S_π, n = 27, s = 1.73, r² = 0.91, F = 124

Figure 11: experimental Gibbs free energies of hydration ΔG_h vs ΔG_h values calculated for aliphatic ketones and aldehydes with the σ/π-type surface areas; fitted intercept (-29.16 ± 4.39) and surface coefficients (0.04 ± 0.01), (-0.08 ± 0.02), (0.00 ± 0.32) and (1.55 ± 0.89), in the order printed; n = 28, s = 1.09, r² = 0.868


7. PARAMETERIZATION RELYING ON σ & π-SURFACE AREAS

Further exploratory data analysis [35] using this minimal parameter set, complemented by selected additional parameters, revealed the importance of differentiating between atomic hybridization states and between protonated and unprotonated heteroatoms, and of recognizing polarization effects induced by heteroatoms. A further differentiation of the σ- and π-type van der Waals surfaces is proposed as an attempt to include such aspects. To be as general as possible, the atomic species, i.e. carbon, nitrogen, oxygen and sulfur atoms, were considered separately. Heteroatoms were distinguished according to whether they are protonated or not. Differently hybridized atoms were also separated. To incorporate polarity effects, two types of carbon atoms were used: a carbon atom directly bonded (alpha) to a heteroatom was assumed to be strongly influenced by its polar neighbour, whereas the effect of induced polarization at carbon atoms in more remote positions was disregarded, and such atoms were treated as carbon atoms in hydrocarbons. The fully extended parameterization scheme relying on surface area contributions is denoted ASC-ΔG and includes σ- and π-type van der Waals surface areas of the following atomic categories:

1) Carbon atoms in the apolar skeleton: sp¹-, sp²- or sp³-hybridized.
2) Carbon atoms in alpha position to a heteroatom: sp¹-, sp²- or sp³-hybridized.
3) Hydrogenated nitrogen atoms: sp²- or sp³-hybridized.
4) Non-hydrogenated nitrogen atoms: sp¹-, sp²- or sp³-hybridized.
5) Hydrogenated oxygen atoms: sp²- or sp³-hybridized.
6) Non-hydrogenated oxygen atoms: sp²- or sp³-hybridized.
7) Sulfur atoms: sp³-hybridized.
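
As an illustration only, the seven categories listed above can be expressed as a small lookup. The sketch below is not the authors' implementation: the function names `atom_category` and `surface_key` and the label strings are hypothetical, and the perception of hybridization states and alpha positions is assumed to be done elsewhere.

```python
# Hybridization states admitted for each atomic category, transcribed from
# the seven categories listed above. The label strings are illustrative only.
ALLOWED_HYBRIDIZATIONS = {
    "C": {"sp1", "sp2", "sp3"},        # carbon in the apolar skeleton
    "C_alpha": {"sp1", "sp2", "sp3"},  # carbon alpha to a heteroatom
    "NH": {"sp2", "sp3"},              # hydrogenated nitrogen
    "N": {"sp1", "sp2", "sp3"},        # non-hydrogenated nitrogen
    "OH": {"sp2", "sp3"},              # hydrogenated oxygen
    "O": {"sp2", "sp3"},               # non-hydrogenated oxygen
    "S": {"sp3"},                      # sulfur
}


def atom_category(element, hybridization, hydrogenated=False,
                  alpha_to_heteroatom=False):
    """Return a hypothetical category label such as 'C_alpha,sp2'."""
    if element == "C":
        name = "C_alpha" if alpha_to_heteroatom else "C"
    elif element in ("N", "O"):
        name = element + "H" if hydrogenated else element
    elif element == "S":
        name = "S"
    else:
        raise ValueError(f"no category defined for element {element!r}")
    if hybridization not in ALLOWED_HYBRIDIZATIONS[name]:
        raise ValueError(f"{hybridization} is not listed for category {name}")
    return f"{name},{hybridization}"


def surface_key(category, surface_type):
    """Append the surface type ('sigma' or 'pi') to a category label."""
    if surface_type not in ("sigma", "pi"):
        raise ValueError("surface_type must be 'sigma' or 'pi'")
    return f"{category},{surface_type}"


# Example: the carbonyl carbon of a ketone is sp2 and alpha to oxygen.
print(surface_key(atom_category("C", "sp2", alpha_to_heteroatom=True), "pi"))
```

Keeping the category label separate from the surface type means a new atomic species only requires a new entry in the lookup, which is in line with the extensibility the scheme is meant to offer.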
The values of the g-coefficients obtained by multiple regression analysis incorporating the complete data set are given in Table VII. The number of occurrences of each surface category in the data library is also indicated. Figure 12 shows the resulting correlation. Experimental and estimated Gibbs free energies of hydration for the model structures are given in the Appendix. The explained variance r² is only 87%. However, when compared to the earlier simplified σ/π-model with

Table VII: g-coefficients and statistical data for the best fit of the ASC-ΔG model with Gibbs free energies of hydration for the complete library data set, with the number of occurrences for each class.
n = 268, s = 4.38, r² = 0.872, F = 69
$\Delta G_{h,\mathrm{calc}} = g_0 + \sum_i g_i \cdot S_i$
$g_0$ in kJ mol⁻¹, $g_i$ in kJ mol⁻¹ Å⁻²

go gc.spl,a gc,sp 1,
A

0.06k1.48 0.21M.77 -0.08M.19 0.5039.08 -0.46f0.05 0.03M.01 -0.39k3.90 0.82M.19 -1.33M.24 -0.0633.03 -0.81fl.16 -1.0539.20 -0.79M.06

gN,sp

'. a

5 5 1 9 27 7
,11 11

42.57185.21

g ~ . s p ' rr . g ~s p 2 ,. a g~ ,sp 2 , rr g~,sp3 a, g ~sp ', ~ a

-1 6.7W29.02
-0.96kO.89 0.88M.43 -2.6750.56 -3.7w3.35 3.30I4.56 -0.91M.05 -2.1M.43 2.08M.53 -0.73M.34

9
88
88

gc . sp 2. a gc , sp 2 , rr
g ~ sp . 3, a g a c , sp g n ,~ sp
l. 2,
A

221
5

OH. sp 2, rr
g ~ ~ . s a p

a
A

43
100

'.
*

32 77 77 27

g a c . sp ',

g o . sp 2, 0

g a ~sp , 3, a
~ N HSP 2, O

1 12
1 1

go . p ,

2.

g o , sp3, a

~ N H sp , 2, A g ~s p 3 .~a ,

22

6 -0.07fl.06 gs . a - -2 go in k~ mol-', gi in kJ mol-1 A

Figure 12: experimental Gibbs free energies of hydration $\Delta G_h$ vs $\Delta G_h$ values calculated with the ASC-ΔG model, for the complete reference set.

seven types of surfaces, the refined surface differentiation increases the predictive ability of the model by nearly 15 percentage points. It is noteworthy that the constant term can be dropped. This is a reassuring aspect and is consistent with a complete surface area approach. Some parameters are statistically poorly defined, namely those of atoms that are poorly represented in the reference library or whose experimental data are highly scattered.

For a given series of heteroatoms (e.g. sp²-hybridized oxygen atoms), the σ-surface contributes a negative (favourable) hydration energy value, and the π-surface a positive one. The parameters are of the same order of magnitude. Carbon atoms follow the opposite pattern: σ-type surfaces are more hydrophobic than π-type surfaces. This trend has already been observed: carbon atoms in C-C double bonds contribute an overall hydrophilic increment to the Gibbs free energy of hydration, owing to the hydrophilic nature and generally better accessibility of π-type surfaces. The slightly hydrophilic nature of the σ-surface of sp³-hybridized carbon atoms alpha to heteroatoms is noteworthy. It may reflect the anticipated polarization by the heteroatom.

The design of ASC-ΔG is simple and straightforward, and it can be extended to any class of molecular compounds. When new atomic species are concerned, the parameters for their σ and π surfaces are simply added to the parameter list for the various hybridizations. Insofar as these new species are also hydrogen-bond donor sites, the corresponding parameters must be included as well. Finally, a new calibration has to be performed to ensure an overall consistent set of parameters.
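
The additive form of the model, a constant term plus a sum of coefficient-weighted partial surface areas, can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the authors' code: the function name `hydration_free_energy` and the dictionary keys are hypothetical, the coefficients are those quoted in the caption of Figure 10 for aromatic hydrocarbons, and the surface areas in the example are invented purely to show the arithmetic.

```python
def hydration_free_energy(surface_areas, g, g0=0.0):
    """Evaluate dG_calc = g0 + sum_i g[i] * S[i].

    surface_areas: mapping from surface class to area (square angstroms)
    g:             mapping from surface class to coefficient (kJ/mol per square angstrom)
    g0:            constant term (kJ/mol); defaults to zero
    """
    missing = set(surface_areas) - set(g)
    if missing:
        raise KeyError(f"no coefficient for surface classes: {sorted(missing)}")
    return g0 + sum(g[key] * area for key, area in surface_areas.items())


# Coefficients quoted in the caption of Figure 10 (aromatic hydrocarbons).
G_AROMATIC = {"sigma": -0.01, "pi": -0.21}
G0_AROMATIC = 7.64

# Hypothetical partial surface areas, chosen only for illustration.
example_surfaces = {"sigma": 20.0, "pi": 50.0}

dg = hydration_free_energy(example_surfaces, G_AROMATIC, G0_AROMATIC)
print(f"{dg:.2f} kJ/mol")
```

Extending the model to a new atomic species then amounts to adding entries to the coefficient mapping, followed by a recalibration of the whole set as described above.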

CONCLUSION
Given the conceptual simplicity of the ASC method, the success achieved in estimating both molecular and partial atomic surfaces is remarkable. This method exhibits several potential advantages. Surfaces can be estimated orders of magnitude faster than with most numerical procedures. The surfaces, as well as their variations as a function of atomic displacements, are given analytically, so they can easily be incorporated into structure minimization or molecular dynamics calculations. In addition, the concept of surface probe points, spatially arranged by reference to the
valence states of bonded atoms, with the inherent possibility to differentiate between hydrophobic and hydrophilic, polar and non-polar, as well as σ- and π-type surface elements, lends itself to chemically intuitive interpretations of surface topologies and associated molecular properties. Its use to estimate solvation effects and partition coefficients is among the most obvious applications.

To calculate Gibbs free energies of hydration, the ASC-ΔG model has one major advantage over more fundamental methods: it yields essentially the same quality of answers in a transparent way and with a minimum of computational effort. It rests on the additivity of the hydrophobic and hydrophilic contributions of the various atomic surface elements and thus, in principle, can easily be extended to a larger variety of organic compounds, given the availability of experimental data.

Further refinements are conceivable. Carbon atoms alpha to a heteroatom can be differentiated according to the nature of the latter. For example, carbon atoms in alpha position to an oxygen atom may contribute to hydration differently from carbon atoms in alpha position to a nitrogen atom. It may be appropriate to differentiate nitrogen and oxygen atoms according to their electronic charges. Aromatic carbon atoms and heteroatoms may be considered as specific atomic classes. However, these multiple classifications increase the degrees of freedom of the model and should be considered only with a substantially larger data set. Refinements could also account for the two different types of H-bond sites at the oxygen atom of a hydroxyl group. Indeed, one σ-direction corresponds to the O-H bond, which acts as an H-bond donor, whereas the other two σ-directions correspond to the lone pairs, which can only act as H-bond acceptors. In the present version of the model, no distinction was made between these probes, and the results obtained represent an average effect.
Hydrophilic or hydrophobic surfaces of atoms are currently treated as if there were no cooperativity effects. These effects involve favourable or unfavourable many-body interactions among different donor/acceptor sites. It would be possible to model such effects by an additional analytical inter-probe function of the distance, involving only accessible probes. In this manner, special hydration effects, like those observed for small rings or due to favourable or unfavourable interactions between neighbouring functional groups, could be accounted for.

Appendix
List of the 47 molecular structures of the Scheraga paper [32] with experimental and calculated Gibbs free energies of hydration $\Delta G_h$ in kJ mol⁻¹ (experimental data source [39], except for values marked (w), which are from [38]).

molecular structure         exp. val.    Scheraga   σ/π model   ASC-ΔG model
acetamide                    -40.63       -28.04     -39.77      -40.65
acetic acid                  -28.05       -29.15     -31.04      -32.17
acetic acid ethyl ester      -12.95        -4.21     -14.18      -12.30
acetic acid methyl ester     -13.87        -7.40     -16.00      -14.13
anthracene                   -17.70       -11.69     -18.46      -14.91
benzene                       -3.62        -7.40      -8.03       -1.46
benzenethiol                 -10.67       -13.58     -16.81      -12.79
butane                        -8.70        -8.29       6.74        2.75
butanoic acid                -26.59       -27.79     -27.22      -28.26
1-butanol                    -19.73       -19.01     -20.64      -18.51
2-butanol                    -19.10       -16.38     -17.30      -15.03
1-butylamine                 -17.97       -18.34     -18.59      -20.41
1,3-dimethylbenzene           -3.50         1.90      -3.25       -5.24
1,4-dimethylbenzene           -3.37         1.89      -3.26       -5.26
2,2-dimethylpropane           10.46         8.71       8.03        3.27
ethane                         7.66         6.21       4.52        1.87
ethanethiol                   -5.42        -2.81       5.38       -3.34
ethanol                      -20.98       -21.09     -22.67      -19.28
ethylamine                   -18.84       -20.42     -20.69      -21.19
ethylbenzene                  -3.33        -1.24      -4.85       -3.97
heptane                       10.96        11.24      10.22        4.15
hexane                        10.40        10.37       9.06        3.68
1-hexanol                    -18.26       -16.96     -16.93      -17.59
1-hexylamine                 -16.87       -16.29     -16.28      -19.48
methanethiol                  -5.19        -4.72      -6.55       -5.24
methanol                     -21.40       -24.04     -25.26      -22.37
methyl ethyl sulfide          -6.20 (w)                0.13       -4.24
methylamine                  -19.09       -23.80     -22.88      -24.08
methylbenzene                 -3.71        -2.76      -5.65       -3.37
4-methylimidazole            -42.92 (w)   -25.64     -26.16      -20.73
3-methylindole               -24.75 (w)   -17.10     -20.92      -31.93
4-methylphenol               -25.67       -32.91     -21.78      -28.16
2-methylpropane                9.70         8.25       6.80        2.78
naphthalene                  -10.01        -9.55     -11.18       -8.24
octane                        12.10        12.25      11.39        4.61
pentane                        9.76         9.33       7.90        3.22
1-pentanol                   -18.72       -17.97     -18.12      -18.04
1-pentylamine                -17.14       -17.30     -17.44      -19.95
phenol                       -27.68       -37.56     -24.15      -26.21
propane                        8.18         7.25       5.58        2.29
propanoic acid               -27.09       -28.81     -28.32      -28.55
1-propanol                   -20.19       -20.05     -21.79      -18.94
2-propanol                   -19.90       -19.85     -19.77      -16.48
propionamide                 -39.41 (w)   -21.78     -37.32      -37.48
1-propylamine                -18.37       -19.38     -19.75      -20.40
n-propylbenzene               -2.23        -0.21      -3.56       -1.76
propylguanidine              -45.73 (w)   -59.91     -59.58      -69.89
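
The tabulated values lend themselves to a simple deviation measure per model. The sketch below is only an illustration of how such a comparison could be set up: the helper `mean_absolute_error` is a hypothetical name, the five rows were picked arbitrarily from the list above, and the column order (experimental, Scheraga, σ/π model, ASC-ΔG model) is assumed to follow the column headings. A five-row subset says nothing about the full data set.

```python
# Five rows copied from the appendix list, in the assumed column order:
# (experimental, Scheraga, sigma/pi model, ASC-dG model), all in kJ/mol.
ROWS = {
    "acetamide": (-40.63, -28.04, -39.77, -40.65),
    "benzene": (-3.62, -7.40, -8.03, -1.46),
    "ethanol": (-20.98, -21.09, -22.67, -19.28),
    "methanol": (-21.40, -24.04, -25.26, -22.37),
    "propane": (8.18, 7.25, 5.58, 2.29),
}

MODEL_COLUMNS = {"Scheraga": 1, "sigma/pi": 2, "ASC-dG": 3}


def mean_absolute_error(rows, column):
    """Mean absolute deviation of one model column from the experimental column."""
    deviations = [abs(values[column] - values[0]) for values in rows.values()]
    return sum(deviations) / len(deviations)


for model, column in MODEL_COLUMNS.items():
    print(f"{model:10s} {mean_absolute_error(ROWS, column):6.2f} kJ/mol")
```

The same function applies unchanged to the complete list once all rows are entered; rows with a missing model value would have to be skipped for that model.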

REFERENCES
1 Butler JAV, Harrower P (1937) Trans Faraday Soc 33, 229-236.
2 Langmuir IM (1925) Colloid Symposium Monograph, Chemical Catalog Co, New-York, Vol 3.
3 Bondi A (1964) J Phys Chem 68, 441-451.
4 Lee B, Richards FM (1971) J Mol Biol 55, 379-400.
5 Stouch TR, Jurs PC (1986) J Chem Inf Comput Sci 26, 4-12.
6 Smith GS (1985) QCPE 509.
7 Pearlmann RS (1986) in Partition Coefficient Determination and Estimation, Dunn III WJ, Block JH, Pearlmann RS, Eds, Pergamon Press, New-York, 3-20.
8 Meyer AY (1988) J Comp Chem 9, 18-24.
9 Akahane K, Nagano Y, Umeyama H (1989) Chem Pharm Bull 37, 86-92.
10 Shrake A, Rupley JA (1973) J Mol Biol 79, 351-371.
11 Cramer CJ, Truhlar DG (1992) J Comp Chem 13, 1089-1097.
12 Grand SML, Merz Jr KM (1993) J Comp Chem 14, 349-352.
13 Duncan BS, Olson AJ (1993) Biopolymers 33, 219-229.
14 Connolly ML (1983) J Appl Cryst 16, 548-558.
15 Still WC, Tempczyk A, Hawley RC, Hendrickson T (1990) J Am Chem Soc 112, 6127-6129.
16 Richmond TJ (1984) J Mol Biol 178, 63-89.
17 Wodak S, Janin J (1980) Proc Natl Acad Sci USA 77, 1736-1740.
18 Sanner M (1992) PhD Thesis, Université de Haute-Alsace, Mulhouse, France.
19 Agishtein ME (1992) J Biomol Struc & Dyn 9, 759-768.
20 Chothia C (1974) Nature 248, 338-339.
21 Reynolds JA, Gilbert DB, Tanford C (1974) Proc Natl Acad Sci USA 71, 2925-2927.
22 Rupley JA, Gratton E, Careri G (1983) TIBS 8, 18-23.
23 Ben-Naim A, Marcus Y (1984) J Chem Phys 81, 2016-2027.
24 Frömmel C (1984) J Theor Biol 111, 247-260.
25 Sharp KA, Nicholls A, Fine RE, Honig B (1991) Science 252, 106-109.
26 Sharp KA (1991) Cur Biol 1, 171-174.
27 Gao J, Xia X (1992) Science 258, 631-634.
28 Soda K, Hirashima H (1990) J Phys Soc Jap 59, 4177-4185.
29 Soda K, Hirashima H (1992) J Phys Soc Jap 61, 2992-3006.
30 Sharp K, Jean-Charles A, Honig B (1992) J Phys Chem 96, 3822-3828.
31 Hasel W, Hendrickson TF, Still WC (1988) Tet Comp Meth 1, Vol 2, 103-106.
32 Ooi T, Oobatake M, Némethy G, Scheraga HA (1987) Proc Natl Acad Sci USA 84, 3086-3090.
33 Eisenberg D, Wesson M, Yamashita M (1989) Chemica Scripta 29A, 217-221.
34 Eisenberg D, Mc Lachlan AD (1986) Nature 319, 199-203.
35 Ulmschneider M (1993) PhD Thesis, Université de Haute-Alsace, Mulhouse, France.
36 Ulmschneider M, Pénigault E (1999) J Chim Phys, in press.
37 Müller K, Ammann HJ, Doran DM, Gerber PR, Gubernator K, Schrepfer G (1989) in Trends in Medicinal Chemistry '88, van der Goot GDH, Pallos L, Timmermann H, Eds, Elsevier Science Publishers, Amsterdam.
38 Wolfenden R, Anderson L, Cullis PM, Southgate CCB (1981) Biochemistry 20, 849-855.
39 Cabani S, Gianni P, Mollica V, Lepori L (1981) J Sol Chem 10, 563-595.
