
The EMBO Journal Vol.2 No.11 pp.2059-2063, 1983

Amino acid sequence data on glial fibrillary acidic protein (GFA); implications for the subdivision of intermediate filaments into epithelial and non-epithelial members
Norbert Geisler and Klaus Weber*
Max Planck Institute for Biophysical Chemistry, D-3400 Göttingen, FRG
Communicated by K. Weber
Received on 22 July 1983; revised on 8 August 1983

Determination of 50% of the sequence of the astrocyte-specific intermediate filament (IF) protein documents the hypervariable regions as well as parts of the coiled-coil array of glial fibrillary acidic protein (GFA). The results show that the four non-epithelial IF proteins known to form homopolymers (myogenic desmin, mesenchymal vimentin, GFA and neurofilament 68 K protein) are much more closely related to one another than to the epithelial keratins, which seem to form heteropolymers only. Of the four non-epithelial proteins, desmin and vimentin are the most closely related, since GFA has a shorter non-α-helical array at the amino terminus. We discuss the possibility that the non-α-helical terminal arrays, because of their sequence and length variability, are responsible for differences between distinct IF in physical-chemical properties such as the low ionic strength-induced depolymerization into protofilaments.
Key words: intermediate filaments/GFA/neurofilaments/desmin/keratin
Introduction
Immunological data have been instrumental in subdividing the complex multigene family of intermediate filaments (IF) in a histologically meaningful manner (neurofilaments, myogenic desmin, glial-specific GFA, mesenchymal vimentin, epithelial cytokeratins) (for a recent review, see Osborn and Weber, 1983). Amino acid sequence studies not only confirmed this distinction but also allowed the development of a common structural model for IF proteins (Geisler et al., 1982a). In this model, a rod-like middle domain displaying extensive α-helical arrays able to form interpolypeptide coiled-coil structures accounts for the α-type X-ray diffraction patterns known for all IFs. Whereas the rod of some 310 residues is highly related in all IF proteins, the two flanking terminal domains reveal themselves as hypervariable regions differing both in sequence and length (Geisler and Weber, 1982; Hanukoglu and Fuchs, 1982). This topographical description was based originally only on sequence data for desmin (Geisler et al., 1982a; Geisler and Weber, 1982), vimentin (Geisler and Weber, 1981a; Geisler et al., 1982b), a human 50 K epidermal keratin (Hanukoglu and Fuchs, 1982) and the alignment of partial sequence data for two hard α-keratins from sheep wool (Geisler and Weber, 1982; Hanukoglu and Fuchs, 1982; Weber and Geisler, 1982; Dowling et al., 1983). Subsequent studies on a 59 K epidermal keratin (Steinert et al., 1983), the neurofilament 68 K protein (Geisler et al., 1983) and extension of the vimentin sequence (Quax-Jeuken et al., 1983) have fully supported the model. In particular, it has become obvious (Geisler and Weber, 1982) that, contrary to previous assumptions (Steinert et al., 1980), even the short regions separating the coiled-coil arrays of the rods are highly preserved both in sequence and in length.

Here we have characterized the last member of the four major non-epithelial IF proteins, the glial-specific protein GFA (glial fibrillary acidic protein). Determination of the amino-terminal 82 and carboxy-terminal 136 residues fully covers the hypervariable regions and provides additional information on the rod region. In line with several common biochemical properties and their histological expression patterns, we show that the four non-epithelial IF proteins form a highly related subgroup, with vimentin and desmin being the closest members.

*To whom reprint requests should be sent.
IRL Press Limited, Oxford, England.

Results and Discussion
Determination of porcine GFA sequences
In order to fully cover the hypervariable regions and still obtain sequence information on the rod domain (Geisler and Weber, 1982) we aimed at two particular fragments. Following our earlier fragmentation studies, GFA was cleaved at its sole cysteine residue (Geisler et al., 1982b) and the carboxy-terminal 15 K fragment was isolated and subjected to sequence analysis. We also probed the molecule for an arginine-rich headpiece by digesting the protein with lysine-specific protease in the presence of 6 M urea followed by chromatography on CM-cellulose, a procedure already found useful in the isolation of the corresponding region of the neurofilament 68 K protein (Geisler et al., 1983). This approach provided a 10 K fragment retaining the blocked amino terminus of the whole molecule.

Digests obtained with various proteases were separated by paper methods and h.p.l.c. Peptides were recovered and characterized by amino acid composition and stepwise Edman degradation. The combined information characterized the 10 K and 15 K fragments as shown in Figure 1. In each case the sequence gave an amino acid composition in good agreement with the value obtained by acid hydrolysis. In the case of the 15 K fragment we note: (i) although the identities of the three residues following the blocked cysteine are known, their order, based only on indirect arguments, is still tentative; (ii) the two residues following the cysteine at positions 43 and 59, which are also underlined in Figure 1, have been assigned by compositional data only; (iii) we lack a smaller overlap peptide covering residues 9 and 10 from the carboxyl end, although a larger fragment supports the sequence as given.
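The consistency check between a proposed sequence and its acid hydrolysate can be sketched in a few lines. This is only an illustration, not the procedure used in this study: the function names, the tolerance and the toy peptide are hypothetical. It relies on two general facts about acid hydrolysis: asparagine and glutamine are deamidated and are therefore reported together with aspartic and glutamic acid (Asx, B and Glx, Z), and tryptophan is largely destroyed.

```python
from collections import Counter

# Acid hydrolysis deamidates Asn and Gln, so they are pooled with Asp and Glu
# and reported as Asx (B) and Glx (Z).
POOLED = {"D": "B", "N": "B", "E": "Z", "Q": "Z"}


def hydrolysate_composition(sequence: str) -> Counter:
    """Residue counts of a sequence as an acid hydrolysate would report them."""
    counts = Counter()
    for residue in sequence.upper():
        if residue == "W":  # tryptophan does not survive acid hydrolysis
            continue
        counts[POOLED.get(residue, residue)] += 1
    return counts


def composition_mismatches(sequence: str, measured: dict, tolerance: float = 0.5) -> dict:
    """Map residue -> (expected, measured) wherever the two differ by more than `tolerance`.

    `tolerance` is an assumed cut-off, chosen here only for illustration.
    """
    expected = hydrolysate_composition(sequence)
    keys = set(expected) | set(measured)
    return {
        key: (expected.get(key, 0), measured.get(key, 0))
        for key in sorted(keys)
        if abs(expected.get(key, 0) - measured.get(key, 0)) > tolerance
    }


if __name__ == "__main__":
    toy_peptide = "SNEQDRWGG"  # hypothetical peptide, not a GFA fragment
    toy_measured = {"S": 1.1, "B": 1.9, "Z": 2.0, "R": 0.9, "G": 2.8}
    print(dict(hydrolysate_composition(toy_peptide)))
    print(composition_mismatches(toy_peptide, toy_measured))
```

An empty result from `composition_mismatches` corresponds to the "good agreement" described in the text; any entry flags a residue whose count should be re-examined.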
The sequence information on the amino-terminal 82 and carboxy-terminal 136 residues of the GFA molecule is added to our previous alignment of chicken gizzard desmin, a 50 K human epidermal keratin, the two major components of sheep wool hard α-keratin and the sequences known for porcine vimentin (Geisler and Weber, 1982). The alignment for vimentin can now be greatly extended towards the amino-terminal side by the sequence deduced for hamster vimentin using DNA data (Quax-Jeuken et al., 1983). Figure 1 also incorporates a recent report on a murine 59 K epidermal keratin (Steinert et al., 1983) and the partial sequence of porcine neurofilament 68 K protein (Geisler et al., 1983). All sequences emphasize the structural motif proposed for IF proteins: a highly related coiled-coil domain flanked by hypervariable terminal domains.

Relation between GFA and the other three non-epithelial IF proteins
As seen in Figure 1, the presumptive rod domain of GFA is well delineated at its two ends since the data already emphasize the common part of IF proteins corresponding to desmin residues 97-407 (Geisler and Weber, 1982). We note that the two ends of the coiled-coil array, i.e., the start of coil Ia and the end of coil II, each approach a consensus sequence, as predicted. The common sequence in the latter region explains why a monoclonal antibody elicited with human GFA seems to recognize all IF proteins (Pruss et al., 1981); its epitope lies within the last 20 residues of coil II (Geisler et al., 1983). That the end of coil II may be some three residues earlier than previously thought (Geisler et al., 1982a, 1983) is indicated by the sequence of a mouse epidermal keratin (Steinert et al., 1983) whose non-helical tailpiece sequence starts somewhat closer to the consensus sequence than expected from the sequences of the other proteins. We expect from the sequence data a second type of general IF antibody whose epitope could lie in the high homology segment of coil Ia (i.e., desmin residues 103-121). The formation of mixed dimers upon Cu2+-phenanthroline oxidation of certain heteropolymeric IF (Quinlan and Franke, 1982, 1983), taken together with the identical location along coil II of the sole cysteine residues of desmin, vimentin (Geisler and Weber, 1981a) and now also of GFA (Figure 1), may put several restrictions on future detailed models of IF structure.

Our data on porcine GFA are somewhat at variance with an earlier proposed short sequence of 28 residues for the antigenic 15 K cyanogen bromide fragment of bovine GFA (Hong and Davison, 1981), which we have previously aligned by homology into coil Ia (Weber and Geisler, 1982). Some of the few discrepancies may be species-specific residues, although this phenomenon is rare along the rod domain of other IF proteins (Geisler and Weber, 1981a; Quax-Jeuken et al., 1983). The combined results show, however, that the major antigenic site(s) of rabbit anti-GFA antibodies are located on the 15 K CNBr fragment, which spans most of coils Ia and Ib (Figure 1). It remains to be seen if the site(s) are on one of the two coils (Ia, Ib) or in the short non-α-helical spacer I located between them.

Whereas residues 46-82 of GFA can be readily aligned with residues 80-116 of desmin, the preceding 45 amino acids of GFA are difficult to accommodate in a straightforward manner (Figure 1). The hypervariability of the headpiece domains has already necessitated the introduction of several gaps in order to align the headpieces of vimentin, desmin, and neurofilament 68 K protein, and even after this is done only a rather remote sequence relation is achieved. Thus, for instance, vimentin and desmin show 70% sequence identity in the rod region (desmin residues 97-407) but only a rare identical tripeptide in their headpieces (Geisler and Weber, 1981a; Quax-Jeuken et al., 1983). The GFA headpiece does not allow a further easy alignment. It is characterized by a noticeably shorter length (46 residues versus 70-80) and a clustering of arginine and particularly of proline residues. Nevertheless, the headpieces of all non-epithelial IF proteins are clearly non-α-helical, reveal several β-turns in line with their high proteolytic susceptibility and are rich in arginine, proline, hydroxyamino acids and glycine. They rarely display an acidic residue or a basic amino acid other than arginine (Geisler et al., 1982a, 1983; Quax-Jeuken et al., 1983).

[Figure 1. Aligned sequences of ME1, HE1, 8, 7, D, V, G and NF, arranged in the panels Head, Coil Ia, Spacer, Coil Ib, Spacer II, Coil II and Tail.]

Fig. 1. Sequence relationship between GFA and other IF proteins. Alignment is based on previous arguments (Geisler and Weber, 1982). For primary sequence data see the following references (Geisler et al., 1982a, 1983; Geisler and Weber, 1982; Hanukoglu and Fuchs, 1982; Steinert et al., 1983; Quax-Jeuken et al., 1983; Dowling et al., 1983). Abbreviations for individual proteins are ME1 (mouse epidermal keratin 59 K), HE1 (human epidermal keratin 50 K), 8 and 7 (sheep wool hard α-keratin components 8c-1 and 7c), D (chicken desmin), V (hamster vimentin), G (porcine GFA) and NF (porcine 68 K neurofilament protein). Horizontal lines indicate as yet unestablished sequences. X is an arginine or lysine residue in NF. In the carboxy-terminal part of GFA a few minor ambiguities are underlined (see text). All proteins seem to have a blocking group, most probably the acetyl group, at their amino-terminal residue. The three structural domains are indicated, as are the hydrophobic a and d positions (dots) in the consecutive heptads of the presumptive coiled-coils. Note some general irregularity early in coil II and the a to d reversal around desmin residue 342, also common to all proteins. Arrowheads mark the location of the isolated desmin rod (Geisler et al., 1982a). A leader sequence (underlined) occurring early in the rod is typical only of non-epithelial IF proteins. Within the highly related sequence arrays of the rods (desmin residues 97-407), residues identical in all four non-epithelial IF proteins or in all proteins are given in bold letters. Deletions (dashes) allow for better alignment of the short spacer regions and to some extent of the hypervariable non-α-helical terminal domains. In the latter case there is only a rather remote relation of D, V and NF through the entire headpiece region, but a convincing homology is seen in the tailpieces of D, V and G, which can be extended with a lower degree of homology to HE1.
Due to a mistake in sequence storage and transfer which was not recognised previously, the following residues in the desmin sequence must be corrected (see original data in Geisler and Weber, 1982): 268 I, 270 A, 275 I, 280 I and 281 A.

The carboxy-terminal 50 residues of GFA form the variable and non-α-helical tailpiece. This region can be easily aligned with the corresponding arrays in desmin and in vimentin, if allowance for some deletions and amino acid exchanges is made as suggested in Figure 1. Thus, of the four non-epithelial IF proteins, desmin, vimentin and GFA are the most closely related members when only their coiled-coil arrays are compared. In the currently established rod regions the sequence identity levels lie around 70%, a value noticeably higher than the 55% seen for the neurofilament 68 K protein. The striking relation between these proteins is still sometimes considered to indicate only some homology (Quinlan and Franke, 1983). It should therefore be stressed that the sequence identity levels observed in the rods approach those known for more distant vertebrate haemoglobins. In the headpiece region GFA diverges noticeably from the three other non-epithelial IF proteins, whereas in the tailpiece domain the neurofilament protein is the most remote. Thus, in an overall view, desmin and vimentin are clearly the closest pair of the non-epithelial IF proteins.

The suggestion from gel electrophoresis that GFA most likely has a shorter polypeptide length than desmin (Rueger et al., 1981) is now confirmed. Since the common rod domain shows very little change in length even in the spacer regions of the more remote IF proteins (Figure 1), we would predict from the noticeably shorter headpiece that porcine GFA is some 4 K smaller than chicken desmin, for which the chemically determined mol. wt. is 53 K (Geisler and Weber, 1982). The good alignment at the carboxyl end and the presence of a blocking group at the amino-terminal methionine indicate that GFA purified from spinal cord is not a degraded molecule. The data on the highly protease-sensitive headpiece (for enzymatic digestion studies see Rueger et al., 1981) and its sequence organization also explain several studies on GFA from different glial material reporting multiple components with different solubilities, mol. wts. and isoelectric points (see, for instance, Dahl and Bignami, 1975; Newcomb et al., 1982; Bigbee et al., 1983).
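The predicted size difference of "some 4 K" can be reproduced with simple arithmetic. The sketch below uses only the alignment offset stated in the text (GFA residue 46 corresponds to desmin residue 80) and the 53 K value for desmin. The average residue mass of 110 Da is an assumed rule of thumb, not a figure from this study, and the function names are illustrative. Deletions in the tailpiece alignment are ignored.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb value, not taken from the paper


def headpiece_offset(gfa_residue: int, desmin_residue: int) -> int:
    """Number of residues by which the GFA headpiece is shorter, from one aligned pair."""
    return desmin_residue - gfa_residue


def predicted_mass_kda(reference_mass_kda: float, missing_residues: int,
                       residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Reference mass minus the mass of the missing residues, in kDa."""
    return reference_mass_kda - missing_residues * residue_mass_da / 1000.0


if __name__ == "__main__":
    offset = headpiece_offset(gfa_residue=46, desmin_residue=80)
    difference_kda = offset * AVERAGE_RESIDUE_MASS_DA / 1000.0
    print(f"GFA headpiece shorter by {offset} residues, about {difference_kda:.1f} K")
    print(f"Predicted GFA size: about {predicted_mass_kda(53.0, offset):.1f} K")
```

Under these assumptions the headpiece is 34 residues shorter, which corresponds to a little under 4 K and places GFA near 49 K.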
As discussed before for desmin and vimentin (Geisler et al., 1982a), a zipper-type proteolytic removal of fragments from the basic headpiece must lead to a more pronounced influence of the acidic rod portion and therefore gives rise, in two-dimensional gels, to a 'staircase pattern' of fragments with decreasing mol. wt. and isoelectric point, as also observed for GFA (Bigbee et al., 1983). In addition, it has been shown that the 40 K rod domain of desmin, which can be obtained by chymotryptic digestion in vitro, behaves as a soluble protein under physiological conditions where normal desmin filaments are insoluble (Geisler et al., 1982a). Thus the lower mol. wt. derivatives of GFA, which probably arise due to the action of a Ca2+-activated protease, should also become more soluble. In order to avoid this enzyme activity our preparation (see Materials and methods) includes EGTA, as do a few previous procedures (Rueger et al., 1979, 1981).

Non-epithelial IF proteins form a closely related subgroup
Whereas all non-epithelial IF proteins are able to form homopolymeric 100 Å filaments in vitro (Rueger et al., 1979; Geisler and Weber, 1981b; Steinert et al., 1981), similar studies on epidermal keratins indicate the necessity for obligatory heteropolymers (Steinert et al., 1976, 1981), and no epithelial cell type displaying a single cytokeratin is known (Moll et al., 1982). Current sequence data offer only a partial answer for this difference between epithelial and non-epithelial IF. Although at least parts of the terminal domains are necessary for filament integrity (Geisler et al., 1982a), the framework of the structure itself seems to rely on the interactions between coiled-coils (Fraser et al., 1976). With the molecular characterization of GFA (see above) and the neurofilament 68 K protein (Geisler et al., 1983), the previous speculation that the rod sequences of keratins are more remote (Geisler and Weber, 1982; Hanukoglu and Fuchs, 1982) is now consolidated. Among non-epithelial proteins, sequence identity of at least 55% is seen, whereas transition to the keratins provides a much lower value of 30%. In addition, the two hard α-keratins (8c-1 and 7c), known to interact in vitro when their coil Ib fragments are mixed (Gruen and Woods, 1983), share only some 30% sequence identity. Thus it seems that keratin structure requires at least two distinct rod domains, either as α-helices or as coiled-coils, which, by acting in a complementary manner, lead to a self-assembly-competent component. In non-epithelial IF the necessary information seems already contained within a single protein. Since the two epidermal keratin sequences that are currently known clearly reflect component 8c-1, it is to be expected that there are other epidermal keratins more closely related to 7c and that the two prototypes interact to allow self assembly.

Rod sequences of various IF proteins reveal a pronounced 28 residue repeat with a regular pattern of positive and negative charges thought to be responsible for the highly specific electrostatic interactions necessary to pack coiled-coils in parallel or antiparallel orientation (Parry et al., 1977; McLachlan and Stewart, 1982). It remains to be seen if comparative Fourier analysis can locate the differences between non-epithelial IF proteins and keratins. In this context, we note that sequence homology between the two subgroups is not randomly distributed (Geisler and Weber, 1982). The start of coil Ia, the end of coil Ib and both ends of coil II are particularly well preserved among all IF proteins and these consensus sequences should be of general structural importance.
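Identity levels such as the 70%, 55% and 30% quoted above are obtained by counting identical residues over aligned positions. The following is a minimal sketch under stated assumptions: columns with a gap in either sequence are skipped (the convention used in this study is not stated), and the function name and toy strings are hypothetical rather than taken from Figure 1.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumed convention: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: 8 ungapped columns, 6 of them identical.
    print(percent_identity("ACDE-GHIKL", "ACDFWGH-KM"))
```

Counting gapped columns as mismatches instead would lower every value, so comparisons between protein pairs are only meaningful when the same convention is applied throughout.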
In addition, we have observed that all non-epithelial IF proteins reveal a leader sequence with the potential for α-helix formation but with poor coiled-coil forming ability. This sequence of 17 residues precedes the highly related part of the rod (Figure 1, underlined). Its function, if any, remains unknown but it has not been detected in the keratin sequences so far established.

Divergence of the terminal hypervariable domains
Since previous descriptions of these regions were restricted to a rather small sample of proteins (Geisler and Weber, 1982; Hanukoglu and Fuchs, 1982; Steinert et al., 1983) one could not foresee how general the proposed hypervariability would be. As summarized in Figure 1 we now recognize several distinct motifs. The long tandem repeats of several glycine residues flanked by large hydrophobic residues can occur with epidermal keratins both in the headpieces and in the tailpieces (Hanukoglu and Fuchs, 1982; Steinert et al., 1983). That such sequences are not a necessary requirement for keratin structure is shown by the hard α-keratins of sheep wool, where proline- and cysteine-rich regions reveal a different chemistry probably leading to extensive disulfide crosslinking (Crewther et al., 1980; Sparrow and Inglis, 1980; Weber and Geisler, 1982). The headpieces of all non-epithelial proteins are arginine- and proline-rich (Geisler et al., 1982a, 1983; Quax-Jeuken et al., 1983, and Results), and curiously a preference for arginine can also be seen in some of the emerging head- and tailpieces of epidermal keratins (Steinert et al., 1983). Finally, the neurofilament triplet proteins have extremely complicated and very long tailpieces, where one subdomain has been shown to contain 46 glutamic acids within 106 residues (Geisler et al., 1983; Figure 1).
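Compositionally biased stretches such as the glutamic acid-rich subdomain can be located with a sliding-window count. The sketch below is an illustration only: the function name, window size and toy sequence are hypothetical, and this is not presented as the way the subdomain was originally identified. The only value taken from the text is the ratio 46/106.

```python
def richest_window(sequence: str, residues: str, window: int) -> tuple:
    """Return (start, count) for the first window holding the most residues from `residues`."""
    seq = sequence.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    targets = set(residues.upper())
    count = sum(res in targets for res in seq[:window])
    best_start, best_count = 0, count
    for start in range(1, len(seq) - window + 1):
        # Slide by one: drop the residue leaving the window, add the one entering it.
        count -= seq[start - 1] in targets
        count += seq[start + window - 1] in targets
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count


if __name__ == "__main__":
    toy = "GSGSEEAEEKEEGSGS"  # hypothetical sequence, not the neurofilament tailpiece
    start, count = richest_window(toy, "E", window=6)
    print(f"richest window starts at index {start} with {count} E residues")
    # The subdomain described in the text: 46 Glu in 106 residues, about 43%.
    print(f"Glu fraction of the reported subdomain: {46 / 106:.1%}")
```

Passing several residues, for example `"DE"`, scans for acidic stretches in general rather than for glutamic acid alone.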

Although the coiled-coils and their interaction must be the key to IF structure, the non-α-helical terminal domains make an independent and necessary contribution to filament stability. The rod domain of desmin (residues 73-415) isolated by chymotryptic trimming is a soluble protein at physiological salt and pH, although intact desmin forms extensive filament arrays under the same conditions (Geisler et al., 1982a; Geisler and Weber, 1982). However, it is difficult to decide if differences in physical-chemical properties of distinct IF, such as low ionic strength-induced depolymerization into protofilaments, are due only to changes in the terminal domains, to differences in the organization of higher order coiled-coil interactions, or to both. Thus further biochemical and sequence data are required to explain the higher tendency of GFA versus desmin/vimentin to depolymerize in low ionic strength buffers (Rueger et al., 1979; Steinert et al., 1981). Nevertheless, it is very likely that the extreme insolubility of epidermal keratins reported previously (Steinert, 1978) is related to their unique 'polyglycine' tracts, which can occur in either one or both terminal domains and possibly form a special structure. Although all epidermal keratins characterized by amino acid composition seem rich in glycine (Zackroff et al., 1981; Hanukoglu and Fuchs, 1982), this is not the case for certain cytokeratins from interior epithelia. We have recently found that a higher mol. wt. cytokeratin of porcine intestinal epithelium has a nearly normal glycine content and that the cytokeratin mixture of these cells shows a solubility more related to desmin than to epidermal keratins (our unpublished results). Thus a further possibility is indicated to account for the broad spectrum of molecularly distinct keratins.

Materials and methods
GFA was isolated from porcine spinal cord by a modification of our general procedure to isolate IF proteins. After extraction with a physiological buffer [0.1 M MES, pH 6.5; 0.1 M NaCl, 5 mM EGTA, 1 mM phenylmethylsulphonyl fluoride (PMSF), 0.5 mM dithiothreitol (DTT)] followed by buffer with 1% Triton X-100 and again by buffer, the insoluble material was dissolved in 6 M urea buffer in the presence of EGTA and chromatographed on DEAE-cellulose using a shallow salt gradient. GFA eluted well separated from the later emerging neurofilament triplet proteins (Geisler and Weber, 1981b). GFA purified in this manner can be isolated within a few days in 50 mg amounts. It has a polypeptide chain of apparent mol. wt. 51 000 and is obtained in 95% purity. The material is specifically detected in Western blots by two GFA antisera, which include a rabbit serum kindly provided by A.Bignami. Removal of urea under the conditions described by Rueger et al. (1979) leads to extensive formation of 10 nm filaments. Protein was dialyzed against water and lyophilized.

The amino-terminal 10 K fragment was obtained by digestion with lysine-specific protease (Boehringer Mannheim, FRG) in 6 M urea followed by chromatography on CM-cellulose as described for neurofilament 68 K protein (Geisler et al., 1983). The carboxy-terminal 15 K fragment resulted from treatment with 2-nitro-5-thiocyanobenzoic acid leading to cleavage at the single cysteine residue (Geisler et al., 1982b). The fragment was isolated by gel filtration through Sepharose 6B in the presence of urea. Digestion with trypsin, chymotrypsin, thermolysin and V8 protease as well as separation by two-dimensional paper methods and h.p.l.c. was as before. Peptides were recovered and characterized by amino acid composition and stepwise degradation using either the normal dansyl-Edman or a modified technique (Geisler and Weber, 1982a; Geisler et al., 1983). Information from the Offord plot was used to check the acid and amide assignments.

Polyacrylamide gel electrophoresis in the presence of SDS was used to monitor purity of GFA and its fragments.
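For readers preparing the extraction buffer, the molar concentrations given above can be converted into weighed amounts. This is a sketch under stated assumptions: the molar masses are those of the anhydrous free compounds and are not given in the text, so the reagent label should be checked (MES, for instance, is often supplied as a hydrate). The function name is illustrative. Adjusting the pH to 6.5 is not covered, and PMSF is commonly added from a concentrated stock in an organic solvent rather than weighed in directly.

```python
# Molar masses in g/mol for the anhydrous free compounds (assumed reagent forms).
MOLAR_MASS_G_PER_MOL = {
    "MES": 195.24,
    "NaCl": 58.44,
    "EGTA": 380.35,
    "PMSF": 174.19,
    "DTT": 154.25,
}

# Concentrations in mol/L, taken from the extraction buffer described in the text.
EXTRACTION_BUFFER_MOLAR = {
    "MES": 0.1,
    "NaCl": 0.1,
    "EGTA": 0.005,
    "PMSF": 0.001,
    "DTT": 0.0005,
}


def grams_needed(concentrations: dict, volume_l: float,
                 molar_mass: dict = MOLAR_MASS_G_PER_MOL) -> dict:
    """Mass of each component in grams for `volume_l` litres, rounded to the milligram."""
    return {
        name: round(molarity * molar_mass[name] * volume_l, 3)
        for name, molarity in concentrations.items()
    }


if __name__ == "__main__":
    for name, grams in grams_needed(EXTRACTION_BUFFER_MOLAR, volume_l=1.0).items():
        print(f"{name}: {grams} g per litre")
```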

References
Bigbee,J.W., Bigner,D.D., Pegram,C. and Eng,L.F. (1983) J. Neurochem., 40, 460-467.
Crewther,W.G., Dowling,L.M. and Inglis,A.S. (1980) Proceedings of the 6th Quinquennial International Wool Textile Research Conference, Pretoria, Vol. 2, pp. 79-91.
Dahl,D. and Bignami,A. (1975) Biochim. Biophys. Acta, 386, 41-51.
Dowling,L.M., Parry,D.A.D. and Sparrow,L.G. (1983) Biosci. Rep., 3, 73-78.
Fraser,R.D.B., MacRae,T.P. and Suzuki,E. (1976) J. Mol. Biol., 108, 435-452.
Geisler,N. and Weber,K. (1981a) Proc. Natl. Acad. Sci. USA, 78, 4120-4123.
Geisler,N. and Weber,K. (1981b) J. Mol. Biol., 151, 565-571.
Geisler,N. and Weber,K. (1982) EMBO J., 1, 1649-1656.
Geisler,N., Kaufmann,E. and Weber,K. (1982a) Cell, 30, 277-286.
Geisler,N., Plessmann,U. and Weber,K. (1982b) Nature, 296, 448-450.
Geisler,N., Kaufmann,E., Fischer,S., Plessmann,U. and Weber,K. (1983) EMBO J., 2, 1295-1302.
Gruen,L.C. and Woods,E.F. (1983) Biochem. J., 209, 587-595.
Hanukoglu,I. and Fuchs,E. (1982) Cell, 31, 243-252.
Hong,B. and Davison,P.F. (1981) Biochim. Biophys. Acta, 670, 139-145.
McLachlan,A.D. and Stewart,M. (1982) J. Mol. Biol., 162, 693-698.
Moll,R., Franke,W.W., Schiller,D.L., Geiger,B. and Krepler,R. (1982) Cell, 31, 11-24.
Newcomb,J., Glynn,P. and Cuzner,M.L. (1982) J. Neurochem., 38, 267-274.
Osborn,M. and Weber,K. (1983) Lab. Invest., 48, 372-394.
Parry,D.A.D., Crewther,W.G., Fraser,R.D.B. and MacRae,T.P. (1977) J. Mol. Biol., 113, 449-454.
Pruss,R.M., Mirsky,R., Raff,M.C., Thorpe,R., Dowding,A.J. and Anderton,B.H. (1981) Cell, 27, 419-428.
Quax-Jeuken,Y.E.F.M., Quax,W.J. and Bloemendal,H. (1983) Proc. Natl. Acad. Sci. USA, 80, 3548-3552.
Quinlan,R.A. and Franke,W.W. (1982) Proc. Natl. Acad. Sci. USA, 79, 3452-3456.
Quinlan,R.A. and Franke,W.W. (1983) Eur. J. Biochem., 132, 477-484.
Rueger,D.C., Huston,J.S., Dahl,D. and Bignami,A. (1979) J. Mol. Biol., 135, 53-68.
Rueger,D.C., Gardner,E.E., Simonian,H.D., Dahl,D. and Bignami,A. (1981) J. Biol. Chem., 256, 10606-10612.
Sparrow,L.G. and Inglis,A.S. (1980) Proceedings of the 6th Quinquennial International Wool Textile Research Conference, Pretoria, Vol. 2, pp. 237-246.
Steinert,P.M. (1978) J. Mol. Biol., 123, 49-70.
Steinert,P.M., Idler,W.W. and Zimmerman,S.B. (1976) J. Mol. Biol., 108, 547-567.
Steinert,P.M., Idler,W.W. and Goldman,R.D. (1980) Proc. Natl. Acad. Sci. USA, 77, 4534-4538.
Steinert,P.M., Idler,W.W., Cabral,F., Gottesman,M.M. and Goldman,R.D. (1981) Proc. Natl. Acad. Sci. USA, 78, 3692-3696.
Steinert,P.M., Rice,R.H., Roop,D.R., Trus,B.L. and Steven,A.D. (1983) Nature, 302, 794-800.
Weber,K. and Geisler,N. (1982) EMBO J., 1, 1155-1160.
Zackroff,R.V., Steinert,P.M., Whitman,M.A. and Goldman,R.D. (1981) Cell Surf. Rev., 7, 56-97.
Note added in proof
A recent DNA sequence study on a 56-kd human epidermal keratin shows a very close sequence relationship with wool α-keratin 7c (I. Hanukoglu and E. Fuchs, Cell, 33, 915-924, 1983). Thus, as predicted from α-keratins, there are only two prototype rod sequences present in epidermal keratins.

Acknowledgements
We thank W.Koch for preparing GFA and greatly appreciate the protein-chemical expertise of U.Plessmann. Dr.A.Bignami kindly supplied a sample of rabbit anti-GFA serum. This work was supported by the Max Planck Society and a grant from the Deutsche Forschungsgemeinschaft to N.G. and K.W.
