

Biochimica et Biophysica Acta, 580 (1979) 24-31

Elsevier/North-Holland Biomedical Press

BBA 38273


T.L. BLUNDELL a, B.T. SEWELL a and A.D. McLACHLAN b,*

a Laboratory of Molecular Biology, Department of Crystallography, Birkbeck College, University of London, Malet Street, London WC1 and b Medical Research Council, Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH (U.K.)
(Received January 5th, 1979)

Key words: Pepsin structure; Gene duplication; Acid proteases; Domain structure

The observation that the acid proteases contain two structurally equivalent lobes related by a dyad through the active centre has been extended to show that in endothiapepsin each lobe contains two similar halves related by a further local dyad. In lobe 1, 22 pairs of α-carbons are equivalent with a root mean square deviation of 1.92 Å. In lobe 2, 17 pairs match within 2.31 Å. Convergent evolution or gene quadruplication may have occurred.

Introduction

In the acid protease enzymes of the pepsin family the polypeptide chain forms two distinct globular lobes which are linked together across two antiparallel strands of β-sheet [1-3]. The active centre is formed round two aspartic acid groups which face one another across a cleft between the two lobes. The amino acid sequences surrounding these residues, Asp-32 and Asp-215, are remarkably similar [4,5] and X-ray analysis shows that the α-carbon backbones of the lobes are structurally related [3]. In penicillopepsin 61 residues of lobe 1 can be superimposed on equivalent residues of lobe 2 by a screw rotation about an approximate dyad axis running through the active centre. Tang et al. [3] proposed that the enzyme may have evolved from a symmetrical dimer of two identical separated subunits which later united into one chain by tandem gene duplication [6,7], although the lack of strong sequence repeats is also consistent with convergent evolution. Each lobe of the acid proteases is built round an eight-stranded β-sheet which

* To whom correspondence should be addressed.



Fig. 1. Topology of the β-sheets in (a) partly unfolded lobe; (b) fully folded lobe, oriented in the same way as Fig. 3b and c. Two interlocked four-stranded sheets a, b, c, d and a', b', c', d', white and black strands in (a), are followed by a single strand e' and a pair of antiparallel strands q, r (shaded). Dotted lines in (a) show how outermost loops fold up and pair together over top of the lower layer to close the lobe with an upper hydrogen-bonded layer. Strands c and c' are each bent and consist of two sections c1, c2 or c1', c2'. In the complete enzyme two lobes are linked by hydrogen bonds between antiparallel strands q, q'. Individual hydrogen bonds cannot yet be assigned at 2.7 Å resolution.

itself has a repetitive topology (Fig. 1), so that the complete lobe can be regarded as a paired structure built from two intertwined four-stranded pieces. This fact suggested to one of us (A.D.M.) and independently to Andreeva and coworkers [8,9] that each lobe might itself contain two structurally equivalent parts, paired about a new dyad inside the lobe.

Structural comparison

To test this idea we have compared the α-carbon positions in the half-lobes using our X-ray analysis of pepsin from the fungus Endothia parasitica at 2.7 Å resolution [1]. The method depends on finding the best rotation and translation which superpose two sets of residues with the least root mean square deviation [10,11]. A trial set of equivalent residues defines an initial match which can be improved iteratively by adding or removing a few atoms at a time (or even by

starting afresh). In the results we have generally considered pairs of regions to be spatially equivalent if they satisfy three conditions: (1) all residues match within 3.0 Å; (2) residues form runs of similar structure at least three residues long; and (3) the β-sheets are pleated correspondingly, with internal and external side chains correctly matched. The fits presented here are based solely on three-dimensional structure; the amino acid sequence used here is tentative, being based partly on the electron density map and partly on X-ray evidence by analogy with pepsin, chymosin and penicillopepsin [2] (the sequence is being determined by Pedersen and Foltman [12]). Some regions of the structure seem to follow similar paths without being precisely superposable. These we have classed as 'topologically equivalent' in the alignments, without defining the relationship more rigorously. Three independent comparisons were done: lobe 1 with lobe 2, the two halves of lobe 1, and the two halves of lobe 2. These comparisons were quite independent of the previous one on penicillopepsin [2,3], the coordinates of which were not available to us.

Three dyad axes

The best fit between lobe 1 and lobe 2 gives 63 structurally equivalent atom pairs with a root mean square deviation of 2.06 Å as shown in Fig. 2 and Table I. A further 37 atoms are topologically equivalent as defined above. The match corresponds closely with that found for penicillopepsin. One curious feature appears to be an asymmetry near the active site, since Phe-31, Asp-32, Thr-33 do not quite match their partners 214-216 and have deviations of 3.1 Å (exceeding the normal limit prescribed above). The operation which superposes the lobes is almost a pure rotation, with a screw shift of only 0.003 Å

Table I

Crystal cartesian axes are related to the P21 unit cell [13] with a = 53.6 Å, b = 74.05 Å, c = 45.7 Å and β = 110° as follows: x and y parallel to a and b, with z normal to the a-b plane. Orthogonal molecular axes are defined with origin the centre of mass of the 63 pairs of matched residues; K-axis along dyad l12; I-axis from centre of lobe 1 to centre of lobe 2 (63 atoms); J normal to I and K. Centres of lobes are thus at (±11.03, 0, 0) in the I,J,K system. In crystal cartesian axes I = (0.684, -0.727, -0.053) and J = (0.605, -0.526, -0.597). RMS, root mean square.

Quantity                              Lobe 1 with lobe 2   Halves of lobe 1   Halves of lobe 2
Number of residues fitted                   63                  22                 17
RMS distance (Å)                             2.06                1.92               2.31
Rotation angle (°)                         168.4               175.1              167.7
Direction of axis (crystal)      lx          0.407              -0.034              0.746
                                 ly          0.440               0.912             -0.225
                                 lz         -0.800              -0.408             -0.627
Shift along axis (Å)                         0.003               1.21               0.65
Point on axis (crystal)          x          19.32                8.60              31.77
                                 y          -4.84                6.15             -13.75
                                 z          28.42               29.03              30.14
Direction of axis (molecule)     lI          0.0                -0.665              0.707
                                 lJ          0.0                -0.126              0.041
                                 lK          1.000               0.714              0.706
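The axis directions listed in Table I are enough to check the mutual orientation of the three dyads numerically. The following is a minimal sketch rather than the procedure used by the authors: it takes the three crystal-frame direction cosines and the 168.4° rotation angle from Table I, computes the angles between the axes, and rotates the lobe 1 dyad about the molecular dyad to see how closely it lands on the lobe 2 dyad. The variable and function names are illustrative, the sense of rotation is an assumption, and because the tabulated values are rounded the results should only be expected to lie close to the angles quoted in the text.

```python
import math

# Direction cosines of the three dyad axes in the crystal Cartesian frame,
# copied from Table I. Names are illustrative only.
L12 = (0.407, 0.440, -0.800)   # dyad relating lobe 1 to lobe 2
L1 = (-0.034, 0.912, -0.408)   # local dyad relating the halves of lobe 1
L2 = (0.746, -0.225, -0.627)   # local dyad relating the halves of lobe 2
ROTATION_12 = 168.4            # rotation angle (degrees), lobe 1 onto lobe 2


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(v):
    n = math.sqrt(dot(v, v))
    return tuple(a / n for a in v)


def angle_deg(u, v):
    """Angle in degrees between two directions (renormalised, since the
    tabulated direction cosines are rounded to three decimals)."""
    c = dot(unit(u), unit(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def rotate(v, axis, theta_deg):
    """Rotate v by theta_deg about an axis through the origin
    (Rodrigues' rotation formula). Assumes a right-handed sense."""
    k = unit(axis)
    t = math.radians(theta_deg)
    kxv = cross(k, v)
    kdv = dot(k, v)
    return tuple(v[i] * math.cos(t) + kxv[i] * math.sin(t)
                 + k[i] * kdv * (1.0 - math.cos(t)) for i in range(3))


if __name__ == "__main__":
    print("l12 to l1:", round(angle_deg(L12, L1), 1))
    print("l12 to l2:", round(angle_deg(L12, L2), 1))
    print("l1 to l2: ", round(angle_deg(L1, L2), 1))
    # Misfit between the rotated lobe 1 dyad and the lobe 2 dyad.
    print("rotated l1 to l2:", round(angle_deg(rotate(L1, L12, ROTATION_12), L2), 1))
```

Only directions are compared here; the points on the axes and the screw shifts in Table I are not used, so the sketch says nothing about how closely the axes intersect.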


Fig. 2. Alignment of lobe 1 with lobe 2. Spatially equivalent pairs of residues matched within 3.0 Å are shown by underlines. Colons mark topologically equivalent pairs whose positions correspond less accurately. Principal β-sheet regions are boxed and named a, b, c, d or a', b', c', d' to indicate the two halves of each lobe. Strands q and r form the link between lobes. Strand e' exists only in second halves of both lobes. Curved double lines indicate extra loops which exist in only one lobe. Single lines show gaps. Amino acid sequence is very tentative, with pig pepsin numbering. No extra residues are present, while residues 22-23, 209-210; 109-111, 291-294; 142-143 and 159-160 are not present.
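The first two acceptance conditions for spatial equivalence (a match within 3.0 Å, and runs at least three residues long) can be expressed as a simple filter. The sketch below is illustrative only and is not the authors' program: it assumes the two sets of α-carbon coordinates have already been superposed and that consecutive list positions correspond to consecutive residues in both chains. The third condition, matching β-sheet pleating, needs side-chain information and is not covered. All names and the example coordinates are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance between two points given as (x, y, z) tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def equivalent_pairs(coords_a, coords_b, cutoff=3.0, min_run=3):
    """Indices of aligned positions accepted as spatially equivalent.

    coords_a[i] and coords_b[i] hold the already superposed alpha-carbon
    positions (in angstroms) of the i-th aligned pair. A position is kept
    only if it lies in an unbroken run of at least min_run consecutive
    pairs that all match within cutoff.
    """
    close = [distance(p, q) <= cutoff for p, q in zip(coords_a, coords_b)]
    kept, run = [], []
    for i, ok in enumerate(close + [False]):  # sentinel flushes the last run
        if ok:
            run.append(i)
        else:
            if len(run) >= min_run:
                kept.extend(run)
            run = []
    return kept


def rms_deviation(coords_a, coords_b, indices):
    """Root mean square deviation over the selected aligned pairs."""
    if not indices:
        raise ValueError("no pairs selected")
    total = sum(distance(coords_a[i], coords_b[i]) ** 2 for i in indices)
    return math.sqrt(total / len(indices))


if __name__ == "__main__":
    # Hypothetical coordinates, not taken from the endothiapepsin structure.
    a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
         (11.4, 0.0, 0.0), (15.2, 0.0, 0.0), (19.0, 0.0, 0.0)]
    b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 2.0, 0.0),
         (11.4, 5.0, 0.0), (15.2, 1.0, 0.0), (19.0, 1.0, 0.0)]
    kept = equivalent_pairs(a, b)
    print(kept, rms_deviation(a, b, kept))
```

In the hypothetical example the fourth pair deviates by 5 Å and is rejected, and the two pairs after it are dropped as well because they form a run of only two.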

(Table I). Fig. 3a gives an overall view of the enzyme. The internal dyad inside lobe 1 shows clearly in Fig. 3b. The upper layer of β-sheet nearest the eye is made of two antiparallel turns with their tips at residues 21, 24 and 92, 93, respectively (22-23 are missing). These loops pair across a dyad between residues 19 and 90. The pleating of the two sheets matches, so that, for example, the side chains of residues 18, 20, 27 and 89, 91, 96 all point towards the interior. At the back of Fig. 3b there are four lower β-strands paired about the axis on the lower level, such that residues 29-33 and 36-40 match 99-103 and 119-123, respectively. In all there are 22 equivalent atom pairs with a root mean square deviation of 1.92 Å. Of


Fig. 3. (a) Overall stereo view into active centre along dyad which relates two lobes. Lobe 1 is at top and residues are labelled to emphasise the alignment of lobes. Asp-32 and Asp-215 are paired close to dyad. Loop with tip near residues 75-78 comes across active centre from lobe 1 as an extension of β-sheet strands a'-b'. (b and c) Individual lobes viewed from equivalent directions looking down local dyads which relate the halves of each lobe. (b) lobe 1; (c) lobe 2. Equivalent residues are labelled to match according to alignment (Fig. 4). In each lobe the upper layer is formed by strands b, c1 paired with b', c1' and the lower layer by c2, d, paired with c2', d'.


these pairs 16 belong to the central sheets while six others lie on the outer edges of the domain. The β-sheets in lobe 1 are well-defined and regular so that equivalent atoms are easily picked out. In lobe 2 the lower layer β-sheet strands are well-defined and fold about a clearly marked dyad, but the upper layer (residues 192-206 and 256-266) lies in a region of weak electron density and appears to twist markedly. Here it is not possible to choose equivalent pairs of atoms with confidence, and the alignment in Fig. 3b is rather tentative around this region. The 17 matched atoms have a larger root mean square deviation of 2.31 Å.

The results have been tested for consistency. First it may be seen from Figs. 2 and 4 that the relationships between the halves of each lobe are consistent with the alignment of the two complete lobes, except for some marginal fits. The match of residues 0-4 with 59-63 is a little uncertain and so is that of 51-53 with 140, 141, 144. Another test is geometrical. Are the three dyads in mutually consistent directions? The rotation which puts lobe 1 onto lobe 2 also brings the screw dyad axis l1 close to the direction of l2 (with an error of only 2.9°). Thus the lobe axes l1 and l2, respectively, make similar angles of 44.4° and 45.1° with the molecular dyad l12 and an angle of 88.6° with one another. The three dyads are almost coplanar. In Table I the axes are also referred to a more natural molecule-based coordinate system I, J, K. These tests suggest that the fitting operations are all consistent within the
Fig. 4. Combined alignment of the four half-lobes about their internal dyads. Sequences read left or right along strands a, b, c, d. Segments 1A, 1B are the two halves of lobe 1 and 2A, 2B the homologous halves of lobe 2. Residues which match to 3.0 Å within lobes are marked (o). Residues in large type match topologically. Thick horizontal lines show unmatched loops. <>, residues close to dyad, in upper and lower sheets; ~, active-site aspartates. Vertical wavy lines show fold between strands c1 and c2 and the end of sheets d. Sequence is very tentative.

present accuracy of the X-ray structure. Any realignment of the halves of lobe 2 would require a correction to the relationship between the complete lobes (Fig. 2) if the two well-matched halves of lobe 1 have been correctly related.

Folding and evolution

Our results show that the central β-sheet core of each lobe has a symmetrical hydrogen-bonded shell built by interlocking four twisted [14] antiparallel strands in the first half of the chain with four similar strands in the second half, round the local dyad axis. The half-lobes appear unable to exist correctly folded in isolation from one another and are linked across the dyad by matched isologous pairs of interactions as if they were two identical subunits in a dimeric enzyme [15,16]. The amino acid sequences of the half-lobes show no repeats except perhaps in the b-c1 loops (comparing 14-26 with 85-95 and 190-206 with 248 ... 264) and the hydrophobic central parts of strands c2 and d. The dyad could thus have arisen by convergent evolution. Alternatively the lobe may have evolved as a closely paired dimer of two identical four-stranded sheets by tandem gene duplication in the way proposed before for the full acid protease structure [3] and for other proteins [17-19]. If this happened then each lobe is an 'intramolecular dimer' while the whole protease would be a quadruple gene product or 'intramolecular tetramer' containing four copies of an ancestral β-sheet unit about 45 residues long [8,9]. The terminal strands q and r of each lobe would need to have been added after the first duplication and before the second. There is strong evidence for four-fold structural repeats in other proteins [20-23] and two proteins show some of the same structural features as endothiapepsin. The eye-lens protein γ-crystallin [24] and the serine proteases [25,26] are also built of two closely similar lobes with repeated dyad-related halves within each lobe.
References

1 Subramanian, E., Swan, I.D.A., Liu, M., Davies, D.R., Jenkins, J.A., Tickle, I.J. and Blundell, T.L. (1977) Proc. Natl. Acad. Sci. U.S. 74, 556-559
2 Hsu, I.N., Delbaere, L.T.J., James, M.N.G. and Hofmann, T. (1977) Nature 266, 140-145
3 Tang, J., James, M.N.G., Hsu, I.N., Jenkins, J.A. and Blundell, T.L. (1978) Nature 271, 618-621
4 Sepulveda, P., Marciniszyn, J., Liu, D. and Tang, J. (1975) J. Biol. Chem. 250, 5082-5088
5 Tang, J. (1977) Acid Proteases, Structure-Function and Biology, Plenum, New York
6 Dixon, H.G. (1966) in Essays in Biochemistry (Campbell, P.N. and Greville, G.D., eds.), Vol. 2, Academic Press, New York
7 Ohno, A. (1970) Evolution by Gene Duplication, Allen and Unwin, London
8 Andreeva, N.S., Fedorov, A.A., Gutschina, A.E., Riskulov, R.R., Schutzkever, N.E. and Safro, M.G. (1978) Mol. Biol. USSR 12, 922-935
9 Andreeva, N.S. and Gutschina, A.E. (1979) Biochem. Biophys. Res. Commun. 87, 32-42
10 Remington, S.J. and Matthews, B.W. (1978) Proc. Natl. Acad. Sci. U.S. 75, 2180-2184
11 Rossmann, M.G. and Argos, P. (1977) J. Mol. Biol. 99-129
12 Pedersen, V. and Foltman, B. (1977) in Acid Proteases, Structure-Function and Biology (Tang, J., ed.), Plenum, New York
13 Jenkins, J.A., Blundell, T.L., Tickle, I.J. and Ungaretti, L. (1975) J. Mol. Biol. 99, 583-590
14 Chothia, C., Levitt, M. and Richardson, D. (1977) Proc. Natl. Acad. Sci. U.S. 74, 4130-4134
15 Monod, J., Wyman, J. and Changeux, J.P. (1965) J. Mol. Biol. 12, 88-118
16 Matthews, B.W. and Bernhard, S.A. (1973) Annu. Rev. Biophys. Bioeng. 2, 257-317
17 Weeds, A.G. and McLachlan, A.D. (1974) Nature 252, 646-649
18 Watenpaugh, K.D., Sieker, L.C., Herriott, J.R. and Jensen, L.H. (1973) Acta Crystallogr. B29, 943-946
19 Ploegman, J.H., Drent, G., Kalk, K.H. and Hol, W.G.J. (1978) J. Mol. Biol. 123, 557-594
20 Poljak, R.J., Amzel, L.M. and Phizackerley, R.P. (1976) Prog. Biophys. Mol. Biol. 31, 67-93
21 Wright, C.S. (1977) J. Mol. Biol. 111, 439-457
22 Sjodahl, J. (1977) Eur. J. Biochem. 78, 471-490
23 Collins, J.H. (1976) in Calcium in Biological Systems, pp. 303-334, S.E.B. Symposia No. 30, University Press, Cambridge
24 Blundell, T.L., Carlisle, C.H., Lindley, P.F., Moss, D.S., Slingsby, C. and Tickle, I.J. (1978) Acta Crystallogr. Suppl. B., 11th Int. Congr. of Crystallogr., Warsaw, in the press
25 Birktoft, J.J. and Blow, D.M. (1972) J. Mol. Biol. 68, 187-240
26 McLachlan, A.D. (1979) J. Mol. Biol. 128, 49-80