Escolar Documentos
Profissional Documentos
Cultura Documentos
Molecular
BioSystems
REVIEW
www.rsc.org/molecularbiosystems
Introduction
a
This journal is
151
Basics of SAXS
First, we shall briey present the main aspects of SAXS
applied to solutions of biological macromolecules and the basic
equations required to understand the principles of this technique.
152
This journal is
K
X
n k Ik s;
k1
where nk and Ik(s) are the volume fraction and the scattering
intensity from the k-th component, respectively. Perhaps the
worst case of a polydisperse system in practical applications is
unspecic aggregation, which often prevents meaningful analysis
of SAXS data. However, SAXS is useful for studies of equilibrium systems like oligomeric protein mixtures. Here, the
number of components (oligomers) is relatively low (e.g. two
components for a monomerdimer equilibrium) and their scattering patterns are often available, thus SAXS can provide the
volume fractions (nk) of the components by solving eqn (3).18
Both IDPs and modular multi-domain proteins with exible
linkers can be represented as mixtures of dierent conformations
of the same molecule, and their scattering is therefore described
by eqn (3). However, these mixtures contain a very large number
of congurations (k c 1) and thus plain decomposition, as in
the case of oligomeric mixtures, is not feasible. Below we
describe the methods specically developed for the studies of
highly exible systems.
Fig. 2 Eect of the conformational sampling of IDPs on SAXS proles and their Kartky representations. (A) Ten 100-residue long polyalanine
conformations built with Flexible-Meccano, chosen from a pool of 10 000, representing the variety of sizes and shapes encountered in a disordered
protein. (B) Individual SAXS proles (black) and (C) Kratky plots (black) of these ten randomly selected chains. The average of the SAXS
intensities and the Kratky plots from the 10 000 conformations are shown in red in (B) and (C), respectively. Averaged curves indicate the common
behavior of fully disordered proteins.
This journal is
153
(5)
155
Table 1 Experimental and calculated Rgs for intrinsically disordered proteins. The bottom part of the table presents data for several Tau protein
constructs
Protein
#Residuesa
Rgexp/A
RgRC/Ab
Ref.
MeCP2
Msh6 N-term
Ki-1/57
MeCP2 (78305)
Synthetic Resilin
Pig Calpastatin domain I
HrpO
II-1
a-Synuclein, pH 7.5
a-Synuclein, pH 3.0
N-tail nucleoprotein MV
b-synuclein
Human NHE1 cdt (5 1C)
Human NHE1 cdt (45 1C)
ERM Transactivation Domain
Neuroligin 3
elF4E binding Protein (4E-BP)
Prothymosin a, pH 7.5
Prothymosin a, pH 2.5
paNHE1 cdt (5 1C)
paNHE1 cdt (45 1C)
N-protein of bacteriophage l
FEZ1 monomer
HIV-1 Tat133
p53 (193)
Sic1
pSic1 (hexaphosphorylated)
PIR domain
IB5
ACTR (5 1C)
ACTR (45 1C)
N-term VS Virus phosphoprotein
486
304
292
228
185
148
147
141c
140
140
139
137
131
131
130
118
117
109
109
107
107
107
103
101
93
92
92
75
73c
71
71
68
62.5 4.5
56 2
47.5 1.0
37.0 0.9
50 5
35.4
35.0
41.0 1.0
40 1
30 1
27.2 0.5
49 1
37.1d
35.3d
39.6 0.7
31.5 1.0
48.8 0.2
37.8 0.9
27.6 0.9
32.8d
32.9d
33 2e
36 1
33.0 1.5
28.7 0.3
34.7
34.0
26.5 0.5
27.9 1.0
25.8d
23.8d
26 1f
64.2
50.2
49.2
43.3
38.8
34.5
34.4
33.6
33.5
33.5
33.4
33.1
32.4
32.4
32.2
30.6
30.5
29.4
29.4
29.1
29.1
29.1
28.5
28.3
27.1
26.9
26.9
24.2
23.8
23.5
23.5
23.0
30
31
32
30
33
34
35
36
37
37
38
39
40
41
42
43
44
45
45
40
41
46
47
48
49
50
50
51
36
41
41
52
Tau
Tau
Tau
Tau
Tau
Tau
Tau
Tau
Tau
Tau
Tau
Tau
Tau
Tau
Tau
Tau
Tau
Tau
441
202
174
130
352
171
143
99
283
167
185
254
202
352
352
130
441
441
65
42
39
38
53
37
36
35
52
40
41
49
41
54
52
35
66
67
61.0
40.6
37.5
32.2
54.2
37.2
33.9
28.0
48.4
36.7
38.7
45.7
40.6
54.2
54.2
32.2
61.0
61.0
53
53
53
53
53
53
53
53
53
53
53
53
53
53
53
53
54
54
ht40
K32
K16
K18
ht23
K27
K17
K19
K44
K10
K25
K23
K32 AT8 AT100
ht23 S214E
ht23 AT8 AT100
K18 P301L
ht40 AT8 AT100 PHF1 (10 1C)
ht40 AT8 AT100 PHF1 (50 1C)
3
3
3
3
3
2
2
1
2
1
2
2
3
3
3
2
3
3
a
When present, His-tags were considered part of the protein. b Threshold Rg value obtained from the parametrization of Florys relationship with
the coil database. c Length of the most populated isoform of the samples was used. d Rg derived from averaging conformations selected with
EOM. e Data measured by SANS in highly crowded conditions (130 mg ml1 of BPTI). f Data derived from the 10 mM Arg/Glu buer.
157
Fig. 4 Comparison of the experimental SAXS proles (empty dots) with the theoretical ones derived from large ensembles computed with
Flexible-Meccano (FM)(red) for Sendai virus PX, Transactivation domain of p53, and K18 construct of Tau protein. FM ensembles were modied
to reproduce RDC data in these cases. The excellent agreement in these three examples indicates the proper description of the overall properties of
IDPs coded in FM. Note that these plots come from a direct comparison of SAXS curves and not from a tting.
ones for the partially folded Sendai virus PX,14 the transactivation
domain of p53,49 and the K18 construct of Tau protein
(Fig. 4).53,103 In these two latter cases, transient levels of
structuration were found by RDCs. Remarkably, in all these
cases the same structural model simultaneously described
NMR properties, which mainly report on the conformational
(local) properties, and SAXS curves, which report on the size
and shape (global) of the proteins. Again, these results underline the synergy between SAXS and NMR observables.
The encouraging results obtained for the structural modeling of several IDPs using FM prompted the derivation of a
specic parametrization of Florys equation for these proteins
(eqn (5)).29 This aim was accomplished by tting Rg values
derived from synthetic SAXS curves computed for large
conformational ensembles, built with FM, of several proteins
covering a large spectra of sizes (eqn (6)).
Rg = (2.54 0.01)N(0.522
0.01)
(6)
159
Fig. 6 Low-resolution SAXS models of the oligomer, bril, and an on-pathway oligomer of a-synuclein. (A) The p(r) of the early and late brils
(large spheres, light and dark colors respectively), and the cross section of the early bril (inset, black spheres) compared with the p(r) of the
oligomer [inset, (white spheres)]. (B) SAXS-derived structure of the a-synuclein oligomer. Two orthogonal views of the average structure (mesh
representation) and the ltered averaged structure (surface representation) are displayed and superimposed. The ltered structure has a volume
corresponding to the average volume of individual models, and the dierence between the average and ltered structure indicates the general level
of dierences between individual models. (C) The SAXS-derived model of the symmetric, early bril that is present in solution in equilibrium with
native species and a fourth component. A single repeating unit is shown in cyan, with the averaged model and the ltered averaged models
superimposed in mesh-representation. The principle of the repeats building the mature brils is shown to the right, where three repeats of the
ltrated model are displayed. Repeats two and three have been translated 880 A vertically with respect to the rst repeat, and the model to the right
is rotated 901 around a vertical axis with respect to the left model. (D) Model for the elongation of brils. In pink/purple, 26 oligomers constituting
one repeating unit of the mature bril (averaged and ltered model shown in cyan mesh) are displayed in surface representation. Below, the 26
oligomers are superimposed with the bril repeating unit, whereas the two models are separated above. The lowest representation is rotated 901
around a horizontal axis with respect to the top two models. Figure reprinted from ref. 108.
N
1X
In s
N n1
Fig. 7 Schematic representation of the EOM strategy for the analysis of SAXS data in terms of Rg distributions. The M conformations/curves of the
pool (random distribution), left part of the gure, are used to generate the initial C chromosomes and to feed the genetic operators (mutations, crossing
and elitism) along the GA process that runs for G generations. The complete process is repeated R independent times, and each run provides N selected
structures/curves that t the experimental prole. The structural analysis of the resulting conformations is displayed on the right part of the scheme, the
Rg distribution of the selected (N R) conformations is compared with that derived from the pool that is considered as a complete conformational
freedom scenario. From this comparison it is possible to derive a quantitative structural estimation of the protein conformations coexisting in solution.
w2
This journal is
K
1 X
mIsj Iexp sj 2
K 1 j1
ssj
161
163
Conclusions
164
Fig. 9 EOM analysis of ribosomal L12 protein. (A) Cartoon of a L12 conformation (1rqu) with the dimerization N-terminal domain (NTD) in
green and the C-terminal domain (CTD) in blue, exible linker is shown in red. (B) Rg distributions from the EOM-selected ensemble (red) and that
corresponding to the pool (black). Interdomain distance distributions, NTD-CTD (C) and CTD-CTD (D), from the EOM-selected ensemble (red)
and the pool (black). The shift in the distributions of the selected ensemble indicates that L12 behaves as a more extended particle than expected
from a random coil linker, and some degree of interdomain correlation is present. (E) Correlation between interdomain NTD-CTD distances in
each conformation of the EOM-selected ensemble with respect to those found in the random pool. Positive values indicate overpopulation in the
selected ensemble, whereas negative values imply depleted correlations. Peaks outside of the diagonal indicate a strong degree of asymmetry in
selected conformations. Adapted with permission from ref. 133.
List of abbreviations
SAS
SAXS
SANS
IDP
EOM
NMR
RDC
GA
FM
AUC
DLS
MO
CS
PRE
Small-angle scattering
Small-angle X-ray scattering
Small-angle neutron scattering
Intrinsically disordered protein
Ensemble optimization method
Nuclear magnetic resonance
Residual dipolar coupling
Genetic algorithm
Flexible-Meccano
Analytical ultracentrifugation
Dynamic light scattering
Maximum occurrence
Chemical shifts
Paramagnetic relaxation enhancement
Acknowledgements
The authors thank Yolanda Perez and Miquel Pons (IRB
Barcelona) for providing unpublished data, Giacomo Parigi and
Claudio Luchinat (CERM-Florence), and Martin Blackledge
(IBS-Grenoble) for providing gures and raw data. Tanya Yates
This journal is
References
1 L. A. Feigin and D. I. Svergun, Structure analysis by small-angle
X-ray and neutron scattering, 1987, Plenum Press, New York,
pp. 335.
2 D. I. Svergun and M. H. J. Koch, Curr. Opin. Struct. Biol., 2002,
12, 654660.
3 M. V. Petoukhov and D. I. Svergun, Curr. Opin. Struct. Biol.,
2007, 17, 562571.
4 C. D. Putnam, M. Hammel, G. L. Hura and J. A. Tainer, Q. Rev.
Biophys., 2007, 40, 191285.
5 D. A. Jacques and J. Trewhella, Protein Sci., 2010, 19, 642567.
6 H. D. Mertens and D. I. Svergun, J. Struct. Biol., 2010, 172,
128141.
7 A. R. Round, D. Franke, S. Moritz, R. Huchler, M. Fritsche,
D. Malthan, R. Klaering, D. I. Svergun and M. Roessle, J. Appl.
Crystallogr., 2008, 41, 913917.
8 G. L. Hura, A. L. Menon, M. Hammel, R. P. Rambo, F. L. Poole,
S. E. Tsutakawa, F. E. Jenney, S. Classen, K. A. Frankel,
R. C. Hopkins, S. J. Yang, J. W. Scott, B. D. Dillard, M. W. W.
Adams and J. A. Tainer, Nat. Methods, 2009, 6, 606612.
9 J. Blobel, P. Bernado, D. I. Svergun, R. Tauler and M. Pons,
J. Am. Chem. Soc., 2009, 131, 43784386.
10 B. Vestergaard, M. Groenning, M. Roessle, J. S. Kastrup, M. van
de Weert, J. M. Flink, S. Frokjaer, M. Gajhede and D. I. Svergun,
PLoS Biol., 2007, 5, e134.
11 S. Akiyama, A. Nohara and Y. Maeda, Mol. Cell, 2008, 29,
703716.
12 S. Doniach, Chem. Rev., 2001, 101, 17631778.
13 L. Pollack and S. Doniach, Methods Enzymol., 2009, 469, 253268.
14 P. Bernado, L. Blanchard, P. Timmins, D. Marion, R. W. H.
Ruigrok and M. Blackledge, Proc. Natl. Acad. Sci. U. S. A., 2005,
102, 1700217007.
15 P. Bernado, E. Mylonas, M. V. Petoukhov, M. Blackledge and
D. I. Svergun, J. Am. Chem. Soc., 2007, 129, 56565664.
165
166
This journal is
This journal is
167