Alexander Shekhtman, David S. Burz-Protein NMR Techniques (Methods in Molecular Biology, V831) - Springer (2011)

METHODS IN MOLECULAR BIOLOGY™
Series Editor
John M. Walker
School of Life Sciences
University of Hertfordshire
Hatfield, Hertfordshire, AL10 9AB, UK
For further volumes:

http://www.springer.com/series/7651
Protein NMR Techniques
Third Edition
Edited by
Alexander Shekhtman and David S. Burz

Department of Chemistry, University at Albany, State University of New York,
Albany, NY, USA
Editors
Alexander Shekhtman
Department of Chemistry
University at Albany
State University of New York
1400 Washington Avenue
Albany, NY 12222, USA
ashekhta@albany.edu

David S. Burz
Department of Chemistry
University at Albany
State University of New York
1400 Washington Avenue
Albany, NY 12222, USA
dsburz@albany.edu
ISSN 1064-3745 e-ISSN 1940-6029

ISBN 978-1-61779-479-7 e-ISBN 978-1-61779-480-3
DOI 10.1007/978-1-61779-480-3
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2011943883
© Springer Science+Business Media, LLC 2012

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the
publisher (Humana Press, c/o Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA),
except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or
hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified
as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
Printed on acid-free paper
Humana Press is part of Springer Science+Business Media (www.springer.com)

Preface
The field of protein NMR spectroscopy has rapidly expanded into new areas of biochemistry,
molecular biology, and cell biology research that were impossible to study as recently as 10
years ago. The potential to study macromolecular systems that were once considered too
large or too transient or too complex by using NMR spectroscopy is now being realized
with the development of innovative technologies. Standard NMR technologies are also get-
ting a facelift in part due to the pervasive nature of high-throughput approaches in bio-
chemical and biomedical research. These advances warrant a new edition of Protein NMR
Techniques that includes an authoritative but down-to-earth description of new methodolo-
gies. This edition consists of 24 chapters divided into four major categories: NMR sample
preparation, solution NMR methodologies, solid-state NMR methodologies, and data pro-
cessing. The material presented contains enough detail for use not only in specialized NMR
laboratories, but in biochemical, molecular, and cell biology research labs that have access
to high-field NMR spectrometers.
Preparing proteins for NMR spectroscopy can be a time-consuming process that may
take longer than data collection and analysis combined. Expression in bacterial cells still
remains one of the most popular ways of preparing NMR samples. Chapter 1 discusses new
methods for optimizing and increasing the production of isotope-labeled protein in bacteria.
However, some proteins are difficult to express in bacteria; in these cases, an alternative
approach is to use yeast cells. Chapter 2 describes a methodology for producing proteins
in yeast that are usually secreted into the growth medium. This technique is proving
to be as robust and economic as bacterial production. One drawback to using proteins
secreted by yeast is that they may exhibit altered patterns of glycosylation and phosphoryla-
tion. To avoid this problem and achieve proper posttranslational modifications, proteins are
best produced in insect or mammalian cells. Advances in the use of these cells for producing
NMR samples are detailed in Chapters 3 and 4. Cell-free expression of proteins has become
a method of choice for high-throughput protein production, especially in cases where the
yields from in vivo overexpression are very low. Cell-free systems allow for the selective
incorporation of any isotope-labeled amino acid into a target protein with minimal scram-
bling. Chapters 5 and 6 describe the cell-free production of proteins for solution and solid-
state NMR, respectively. Finally, although well-expressed in bacterial cells, some soluble
proteins do not fold properly in sufficient quantity to permit analyses of structure, dynam-
ics, and interactions. Chapter 7 presents a methodology for expressing and purifying such
proteins in a cost-efficient manner.
The chapters on solution NMR methodologies range from the study of individual pro-
teins, large multidomain proteins, protein–ligand and protein–nucleic acid complexes
in vitro, to the study of proteins inside living cells. A strategy for studying supramolecular
systems, which has become possible due to advances in isotope labeling and NMR pulse
sequences, is described in Chapter 8. Chapter 9 presents basic protocols and the latest
improvements for measuring relaxation rates and analyzing protein dynamics. Methods to
help overcome difficulties in applying solution NMR to the study of membrane proteins are
detailed in Chapter 10. Structurally characterizing multi-domain proteins can be challenging
due to the inherent flexibility present in these systems and requires special approaches out-
lined in Chapter 11. To regulate biological activity, proteins engage in interactions with
other macromolecules present in the cell. A description of methods used to prepare pro-
tein–RNA, protein–DNA, and protein–ligand complexes suitable for study by using NMR
spectroscopy is presented in Chapters 12, 13, and 14. Lastly, Chapter 15 describes in-cell
NMR spectroscopy, a relatively new area of NMR research that affords atomic resolution
information about isotope-labeled proteins inside living cells.
Solid-state NMR spectroscopy presents a complementary approach to studying pro-
teins, especially since the method is not limited by the molecular size constraints that ham-
per solution NMR. With the availability of high-field NMR spectrometers, solid-state NMR
has become a viable technique for acquiring unique information about protein systems that
are difficult to characterize by using solution NMR. Chapter 16 reviews the use of magic
angle spinning solid-state NMR to study the structure and dynamics of perdeuterated pro-
teins. The preparation and characterization of protein complexes for solid-state NMR and
methodologies to analyze the structures and dynamics of protein complexes are presented
in Chapter 17. The area of membrane protein expression has seen extensive advances of late,
spurred by intense interest in signaling pathways, but impeded by difficulties in preparing
samples in sufficient quantity for NMR spectroscopy. Chapter 18 details methods for pro-
ducing membrane proteins suitable for study by using solid-state NMR.
Processing and analyzing NMR data has historically been an extremely laborious part of
NMR research, requiring skillful NMR spectroscopists to assign chemical shifts and to deter-
mine atomic resolution structures of proteins. With the advent of high-throughput assign-
ment protocols, this task has become largely manageable by a trained graduate student.
Nevertheless, there are difficult cases for which there is no substitution for the experienced
spectroscopist. For example, characterization of eukaryotic kinases by NMR spectroscopy is
complicated by the extensive dynamics and large size exhibited by these proteins. Chapter 19
describes the procedures used to assign backbone resonances for ERK2. The reactivity of
solvent-exposed backbone amides varies by at least a billion-fold because of electrostatic
interactions at the protein surface. The use of electrostatic analysis of hydrogen
exchange rates to analyze protein flexibility is reviewed in Chapter 20. Chapter 21 presents a
strategy for assigning the backbone resonances of small- to medium-sized globular proteins
in a few hours by using a highly automated program, BATCH, to acquire, process, and ana-
lyze NMR data. A versatile protocol, UNIO, that provides nearly fully automated structure
determination is described in Chapter 22. In UNIO, user intervention is encouraged and
facilitated by graphical tools for preparing, analyzing, validating, and presenting the NMR
structure. Chapter 23 details the use of the ARIA software, which incorporates both solution
and solid-state NMR structural constraints to perform structure calculations. The final chap-
ter, Chapter 24, introduces the software DYNAMICS for analyzing relaxation rates that
characterize the overall tumbling and local dynamics of a protein.
This book presents a comprehensive description of the latest innovations in the field of
protein NMR. It focuses on the importance of biochemistry, molecular biology, and cell
biology to NMR spectroscopy while avoiding excessive repetition of existing material,
which is readily available through a number of excellent texts and reviews that cover topics
relevant to studying proteins by using NMR. Rather than reiterating the fundamental
principles behind NMR methodologies, we have emphasized the practical aspects of experi-
mental design combined with practical advice and examples. We hope that this book will
provide both experienced NMR spectroscopists and biochemists, who are new to the field
of NMR, with enough background to successfully apply these techniques to their
research.
Albany, NY, USA Alexander Shekhtman

David S. Burz
Contents
Preface
Contributors
1 A Novel Bacterial Expression Method with Optimized Parameters
for Very High Yield Production of Triple-Labeled Proteins
Victoria Murray, Yuefei Huang, Jianglei Chen, Jianjun Wang,
and Qianqian Li
2 Isotopic Labeling of Heterologous Proteins in the Yeast Pichia pastoris
and Kluyveromyces lactis
Toshihiko Sugiki, Osamu Ichikawa, Mayumi Miyazawa-Onami,
Ichio Shimada, and Hideo Takahashi
3 Isotope Labeling in Insect Cells
Krishna Saxena, Arpana Dutta, Judith Klein-Seetharaman,
and Harald Schwalbe
4 Isotope Labeling in Mammalian Cells
Arpana Dutta, Krishna Saxena, Harald Schwalbe,
and Judith Klein-Seetharaman
5 Cell-Free Protein Production for NMR Studies
Mitsuhiro Takeda and Masatsune Kainosho
6 Cell-Free Membrane Protein Expression for Solid-State NMR
Alaa Abdine, Kyu-Ho Park, and Dror E. Warschawski
7 Expression and Purification of Src-family Kinases for Solution NMR Studies
Andrea Piserchio, David Cowburn, and Ranajeet Ghose
8 NMR Studies of Large Protein Systems
Shiou-Ru Tzeng, Ming-Tao Pai, and Charalampos G. Kalodimos
9 Protein Dynamics by 15N Nuclear Magnetic Relaxation
Fabien Ferrage
10 Bacterial Production and Solution NMR Studies of a Viral Membrane
Ion Channel
Jolyon K. Claridge and Jason R. Schnell
11 Preparation of the Modular Multi-Domain Protein RPA for Study
by NMR Spectroscopy
Chris A. Brosey, Marie-Eve Chagot, and Walter J. Chazin
12 NMR Studies of Protein–RNA Interactions
Carla A. Theimer, Nakesha L. Smith, and May Khanna
13 Preparation and Optimization of Protein–DNA Complexes Suitable
for Detailed NMR Studies
My D. Sam and Robert T. Clubb
14 NMR Studies of Protein–Ligand Interactions
Michael Goldflam, Teresa Tarragó, Margarida Gairí, and Ernest Giralt
15 In-Cell NMR Spectroscopy in Escherichia coli
Kirsten E. Robinson, Patrick N. Reardon, and Leonard D. Spicer
16 Deuterated Peptides and Proteins: Structure and Dynamics Studies
by MAS Solid-State NMR
Bernd Reif
17 Solid-State NMR Spectroscopy of Protein Complexes
Shangjin Sun, Yun Han, Sivakumar Paramasivam, Si Yan,
Amanda E. Siglin, John C. Williams, In-Ja L. Byeon, Jinwoo Ahn,
Angela M. Gronenborn, and Tatyana Polenova
18 Synthesis, Purification, and Characterization of Single Helix Membrane
Peptides and Proteins for NMR Spectroscopy
Miki Itaya, Ian C. Brett, and Steven O. Smith
19 Assignment of Backbone Resonances in a Eukaryotic Protein
Kinase – ERK2 as a Representative Example
Andrea Piserchio, Kevin N. Dalby, and Ranajeet Ghose
20 Electrostatics of Hydrogen Exchange for Analyzing Protein Flexibility
Griselda Hernández, Janet S. Anderson, and David M. LeMaster
21 Fast Protein Backbone NMR Resonance Assignment Using
the BATCH Strategy
Bernhard Brutscher and Ewen Lescop
22 Comprehensive Automation for NMR Structure Determination of Proteins
Paul Guerry and Torsten Herrmann
23 ARIA for Solution and Solid-State NMR
Benjamin Bardiaux, Thérèse Malliavin, and Michael Nilges
24 Determining Protein Dynamics from 15N Relaxation Data
by Using DYNAMICS
David Fushman
Index
Contributors
ALAA ABDINE • CNRS and Université Paris Diderot, IBPC, Paris, France
JINWOO AHN • Department of Structural Biology, Pittsburgh Center for HIV Protein
Interactions, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
JANET S. ANDERSON • Department of Chemistry, Union College, NY, USA
BENJAMIN BARDIAUX • NMR-supported Structural Biology, Leibniz-Institut
für Molekulare Pharmakologie (FMP), Berlin, Germany
IAN C. BRETT • Department of Biochemistry and Cell Biology, Stony Brook University,
Stony Brook, NY, USA
CHRIS A. BROSEY • Departments of Biochemistry and Chemistry, Center for Structural
Biology, Vanderbilt University, Nashville, TN, USA
BERNHARD BRUTSCHER • Institut de Biologie Structurale – Jean-Pierre Ebel,
CNRS, CEA, UJF, UMR5075, Grenoble Cedex, France
IN-JA L. BYEON • Department of Structural Biology, Pittsburgh Center for HIV Protein
Interactions, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
MARIE-EVE CHAGOT • Departments of Biochemistry and Chemistry,
Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
WALTER J. CHAZIN • Departments of Biochemistry and Chemistry,
Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
JIANGLEI CHEN • Department of Biochemistry and Molecular Biology,
School of Medicine, Wayne State University, Detroit, MI, USA
JOLYON K. CLARIDGE • Department of Biochemistry, University of Oxford, Oxford, UK
ROBERT T. CLUBB • Department of Chemistry and Biochemistry,
University of California, Los Angeles, CA, USA
DAVID COWBURN • Departments of Biochemistry and Physiology & Biophysics,
Albert Einstein College of Medicine of Yeshiva University, Bronx, NY, USA
KEVIN N. DALBY • Division of Medicinal Chemistry, University of Texas,
Austin, TX, USA; Graduate Programs, Cellular and Molecular Biology, Pharmacy,
Biomedical Engineering and Biochemistry, University of Texas, Austin, TX, USA
ARPANA DUTTA • Department of Structural Biology, University of Pittsburgh
School of Medicine, Pittsburgh, PA, USA
FABIEN FERRAGE • Département de chimie, Ecole normale supérieure et
Laboratoire des Biomolécules, CNRS UMR 7203, Paris, Cedex, France
DAVID FUSHMAN • Department of Chemistry and Biochemistry and Center for
Biomolecular Structure and Organization, University of Maryland, MD, USA
MARGARIDA GAIRÍ • Servicios Cientifico Tecnicos, Universitat de Barcelona,
Barcelona, Spain
RANAJEET GHOSE • Department of Chemistry, The City College of New York,
New York, NY, USA; Graduate Center of the City University of New York,
New York, NY, USA
ERNEST GIRALT • Institute for Research in Biomedicine (IRB Barcelona),
Parc Científic de Barcelona, Barcelona, Spain; Departament de Química Orgànica,
Universitat de Barcelona, Barcelona, Spain
MICHAEL GOLDFLAM • Institute for Research in Biomedicine (IRB Barcelona),
Parc Científic de Barcelona, Barcelona, Spain
ANGELA M. GRONENBORN • Department of Structural Biology, Pittsburgh Center
for HIV Protein Interactions, University of Pittsburgh School of Medicine,
Pittsburgh, PA, USA
PAUL GUERRY • Centre Européen de RMN à très Hauts Champs, Université de Lyon,
Ecole Normale Supérieure de Lyon, CNRS, Université Claude, Villeurbanne, France
YUN HAN • Department of Chemistry and Biochemistry, University of Delaware,
Newark, DE, USA
GRISELDA HERNÁNDEZ • Department of Health and Department of Biomedical
Sciences, Wadsworth Center, School of Public Health, University at
Albany – SUNY, Albany, NY, USA
TORSTEN HERRMANN • Centre Européen de RMN à très Hauts Champs,
Université de Lyon, CNRS, Ecole Normale Supérieure de Lyon, Université Claude,
Villeurbanne, France
YUEFEI HUANG • Department of Biochemistry and Molecular Biology,
OSAMU ICHIKAWA • Graduate School of Pharmaceutical Sciences,
The University of Tokyo, Tokyo, Japan
MIKI ITAYA • Department of Biochemistry and Cell Biology, Stony Brook University,
Stony Brook, NY, USA
MASATSUNE KAINOSHO • Graduate School of Science, Nagoya University,
Nagoya, Japan; Center for Priority Areas, Tokyo Metropolitan University,
Hachioji, Japan
CHARALAMPOS G. KALODIMOS • Department of Chemistry and Chemical Biology,
Rutgers University, Piscataway, NJ, USA
MAY KHANNA • Department of Biochemistry and Molecular Biology,
Indiana University School of Medicine, Indianapolis, IN, USA
JUDITH KLEIN-SEETHARAMAN • Department of Structural Biology, University
of Pittsburgh School of Medicine, Pittsburgh, PA, USA
DAVID M. LEMASTER • Department of Health and Department of Biomedical
Sciences, Wadsworth Center, School of Public Health, University at
Albany – SUNY, Albany, NY, USA
EWEN LESCOP • Laboratoire de Chimie et Biologie Structurales, Institut de
Chimie des Substances Naturelles, Centre de Recherche de Gif,
CNRS, Gif-sur-Yvette, France
QIANQIAN LI • Department of Biochemistry and Molecular Biology, School of Medicine,
Wayne State University, Detroit, MI, USA
THÉRÈSE MALLIAVIN • Unité de Bioinformatique Structurale, CNRS
URA 2185, Institut Pasteur, Paris, France
MAYUMI MIYAZAWA-ONAMI • Japan Biological Informatics Consortium (JBiC),
Tokyo, Japan; Biomedicinal Information Research Center (BIRC),
National Institute of Advanced Industrial Science and Technology (AIST),
Tokyo, Japan
VICTORIA MURRAY • Department of Biochemistry and Molecular Biology,
MICHAEL NILGES • Unité de Bioinformatique Structurale, CNRS
URA 2185, Institut Pasteur, Paris, France
MING-TAO PAI • Department of Chemistry and Chemical Biology, Rutgers University,
Piscataway, NJ, USA
SIVAKUMAR PARAMASIVAM • Department of Chemistry and Biochemistry,
University of Delaware, Newark, DE, USA
KYU-HO PARK • CNRS and Université Paris Diderot, IBPC, Paris, France
ANDREA PISERCHIO • Department of Chemistry, The City College of New York,
New York, NY, USA
TATYANA POLENOVA • Department of Chemistry and Biochemistry,
University of Delaware, Newark, DE, USA
PATRICK N. REARDON • Department of Biochemistry, Duke University NMR Center,
Durham, NC, USA
BERND REIF • Munich Center for Integrated Protein Science (CIPSM) at Department
Chemie, Technische Universität München, Garching, Germany; Leibniz-Institut
für Molekulare Pharmakologie (FMP), Berlin, Germany; Helmholtz-Zentrum
München (HMGU), German Research Center for Environmental Health,
Neuherberg, Germany
KIRSTEN E. ROBINSON • Department of Biochemistry, Duke University NMR Center,
Durham, NC, USA
MY D. SAM • Department of Biological Chemistry and Molecular Pharmacology,
Harvard Medical School, Boston, MA, USA
KRISHNA SAXENA • Institute for Organic Chemistry and Chemical Biology,
Center for Biomolecular Magnetic Resonance, Johann Wolfgang Goethe-University
Frankfurt, Frankfurt am Main, Germany
JASON R. SCHNELL • Department of Biochemistry, University of Oxford, Oxford, UK
HARALD SCHWALBE • Institute for Organic Chemistry and Chemical Biology,
Center for Biomolecular Magnetic Resonance, Johann Wolfgang Goethe-University
Frankfurt, Frankfurt am Main, Germany
ICHIO SHIMADA • Biomedicinal Information Research Center (BIRC),
National Institute of Advanced Industrial Science and Technology (AIST),
Tokyo, Japan; Graduate School of Pharmaceutical Sciences, The University of Tokyo,
Tokyo, Japan
AMANDA E. SIGLIN • Department of Molecular Medicine, Beckman Research
Institute of City of Hope, Duarte, CA, USA
NAKESHA L. SMITH • Department of Chemistry, University at Albany SUNY,
Albany, NY, USA
STEVEN O. SMITH • Department of Biochemistry and Cell Biology, Stony Brook
University, Stony Brook, NY, USA
LEONARD D. SPICER • Department of Biochemistry, Duke University NMR Center,
Durham, NC, USA; Department of Radiology, Duke University NMR Center,
Durham, NC, USA
TOSHIHIKO SUGIKI • Japan Biological Informatics Consortium (JBiC), Tokyo, Japan;
Biomedicinal Information Research Center (BIRC), National Institute
of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
SHANGJIN SUN • Department of Chemistry and Biochemistry, University of Delaware,
Newark, DE, USA
HIDEO TAKAHASHI • Department of Supramolecular Biology, Graduate School
of Nanobioscience, Yokohama City University, Yokohama, Japan;
Biomedicinal Information Research Center (BIRC), National Institute
of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
MITSUHIRO TAKEDA • Graduate School of Science, Nagoya University, Nagoya, Japan
TERESA TARRAGÓ • Institute for Research in Biomedicine (IRB Barcelona),
Parc Científic de Barcelona, Barcelona, Spain
CARLA A. THEIMER • Department of Chemistry, University at Albany SUNY,
Albany, NY, USA
SHIOU-RU TZENG • Department of Chemistry and Chemical Biology,
Rutgers University, Piscataway, NJ, USA
JIANJUN WANG • Department of Biochemistry and Molecular Biology,
DROR E. WARSCHAWSKI • CNRS and Université Paris Diderot, IBPC, Paris, France
JOHN C. WILLIAMS • Department of Molecular Medicine, Beckman Research
Institute of City of Hope, Duarte, CA, USA
SI YAN • Department of Chemistry and Biochemistry, University of Delaware,
Newark, DE, USA
Chapter 1
A Novel Bacterial Expression Method with Optimized
Parameters for Very High Yield Production
of Triple-Labeled Proteins
Victoria Murray, Yuefei Huang, Jianglei Chen, Jianjun Wang,
and Qianqian Li
Abstract
The Gram-negative bacterium Escherichia coli offers a means for rapid, high-yield, and economical
production of recombinant proteins. However, when preparing protein samples for NMR, high-level pro-
duction of functional isotopically labeled proteins can be quite challenging. This is especially true for the
preparation of triple-labeled protein samples in D2O (2H/13C/15N). The large expense and time-consuming
nature of triple-labeled protein production for NMR led us to revisit the current bacterial protein expres-
sion protocols. Our goal was to develop an efficient bacterial expression method for very high-level pro-
duction of triple-labeled proteins that could be routinely utilized in every NMR lab without changing
expression vectors or requiring fermentation. We developed a novel high cell-density IPTG-induction
bacterial expression method that combines tightly controlled traditional IPTG-induction expression with
the high cell-density of auto-induction expression. In addition, we optimized several key experimental
protocols and parameters to ensure that our new high cell-density bacterial expression method routinely
produces 14–25 mg of triple-labeled proteins and 15–35 mg of unlabeled proteins from 50-mL bacterial
cell cultures.
Key words: High yield protein production, Bacterial expression, Isotopic labeling, NMR
Alexander Shekhtman and David S. Burz (eds.), Protein NMR Techniques, Methods in Molecular Biology, vol. 831,
DOI 10.1007/978-1-61779-480-3_1, © Springer Science+Business Media, LLC 2012
1. Introduction
To perform NMR structural studies of proteins, we have to produce
proteins that are isotopically labeled with 13C and 15N for small
proteins (<20 kDa) and with 13C, 15N, and 2H for larger proteins
(>20 kDa). Among the many systems available for heterologous
protein production, the Gram-negative bacterium Escherichia coli
remains one of the most attractive hosts (1, 2). This is especially
true for isotopically labeling proteins since bacterial expression
provides the cheapest way to prepare these proteins for NMR studies
(3). Protein expression and purification is a routine practice in
many NMR labs, but it is not uncommon to see a drastic reduction
in protein yield when isotopically labeling the proteins, especially
when D2O must be used.
To overcome these difficulties, we have developed a novel bac-
terial expression method that combines the tightly controlled
traditional IPTG-induction bacterial expression with the high cell-
density of the auto-induction method (4). To summarize our pro-
cedure, we first determine how to make a proper starting culture,
followed by double colony selection and finally high cell-density
expression. With these optimized protocols and parameters, our
new bacterial expression method offers near gram quantity
production of triple-labeled proteins from one-liter bacterial cell
cultures, without changing expression vectors and without fermentation
(5). Thus, every NMR laboratory can apply this bacterial expression
method routinely to produce very high yields of triple-labeled
proteins for NMR structural studies.
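The per-liter figure can be related to the 50-mL yields quoted in the abstract by simple proportional scaling. The sketch below is only an illustration: the function name is hypothetical, and it assumes that yield scales linearly with culture volume, which the text does not state.

```python
# Linear extrapolation of culture yields to a different volume.
# Assumption (not stated in the chapter): yield is proportional to
# culture volume.

def scale_yield(yield_mg, from_volume_ml, to_volume_ml):
    """Scale a protein yield (mg) from one culture volume to another."""
    return yield_mg * to_volume_ml / from_volume_ml

# Yield ranges per 50-mL culture, as quoted in the abstract (mg).
yields_50ml = {
    "triple-labeled": (14, 25),
    "unlabeled": (15, 35),
}

# Extrapolate each end of each range to a 1-L (1000-mL) culture.
per_liter = {
    label: tuple(scale_yield(y, 50, 1000) for y in yield_range)
    for label, yield_range in yields_50ml.items()
}

for label, (low, high) in per_liter.items():
    print(f"{label}: {low:.0f}-{high:.0f} mg per liter")
```

Under this assumption, the quoted triple-labeled range corresponds to a few hundred milligrams per liter of culture.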
2. Materials
2.1. Sample Preparation and SDS-PAGE
1. 4× SDS loading buffer: 200 mM Tris–HCl, pH 6.8, 8% (w/v)
sodium dodecyl sulfate (SDS), 0.4% (w/v) bromophenol blue,
40% glycerol. Store at room temperature.
2. Dithiothreitol (DTT): Prepare 1 M solution using sterile water.
Filter through a 0.22-μm pore size membrane (syringe filter)
and store 1-mL aliquots at −20°C.
3. 30% Acrylamide/bis solution (29:1).
4. 1.5 M Tris–HCl, pH 8.8.
5. 10% SDS.
6. 10% ammonium persulfate.
7. TEMED.
8. M Tris–HCl, pH 6.8.
9. Protein molecular weight markers.
10. 5× SDS running buffer: 0.5% SDS, 125 mM Tris base, 1.25 M
glycine. Store at room temperature. Do not adjust pH.
11. Coomassie blue-staining solution: 0.25% (w/v) Coomassie bril-
liant blue, 45% methanol, 10% acetic acid. Add brilliant blue to
methanol and stir for 60 min. Add water and acetic acid and stir
for another 30 min. Store at room temperature (see Note 1).
12. Destaining solution: 30% methanol, 10% acetic acid. Store at
room temperature.
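Several of the solutions above are concentrated stocks (4× loading buffer, 5× running buffer). As a minimal sketch, assuming the usual C1·V1 = C2·V2 dilution relation, the working concentrations and the stock volume needed can be computed as follows. The function names and the 1-L final volume are illustrative choices, not part of the protocol.

```python
# Dilution of concentrated stocks using C1 * V1 = C2 * V2.
# Function names and the 1-L example volume are hypothetical.

def stock_volume(stock_fold, final_volume, final_fold=1.0):
    """Volume of an N-fold stock needed for final_volume at final_fold."""
    if stock_fold <= 0 or final_fold > stock_fold:
        raise ValueError("stock must be more concentrated than the target")
    return final_fold * final_volume / stock_fold

def working_concentrations(stock, stock_fold):
    """Divide every component concentration by the stock fold."""
    return {name: conc / stock_fold for name, conc in stock.items()}

# 5x SDS running buffer as listed in item 10.
running_5x = {"SDS (%)": 0.5, "Tris base (mM)": 125.0, "glycine (mM)": 1250.0}
running_1x = working_concentrations(running_5x, 5)

# Volumes for 1 L (1000 mL) of 1x running buffer.
stock_ml = stock_volume(5, 1000)
water_ml = 1000 - stock_ml

# 4x loading buffer needed for a 20-uL gel sample (same relation, in uL).
loading_ul = stock_volume(4, 20)

print(running_1x)
print(f"{stock_ml:.0f} mL stock + {water_ml:.0f} mL water")
print(f"{loading_ul:.0f} uL of 4x loading buffer per 20-uL sample")
```

The same two functions apply to any other fold-concentrated stock in this section.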
2.2. Protein Expression
1. LB medium (Miller): Dissolve 25 g of powdered LB medium
in 1 L of distilled water or D2O. Adjust the pH to 7.4 using
NaOH. Autoclave and store at room temperature (see Note 2).
Add antibiotics (KAN or AMP) prior to use.
2. Kanamycin monosulfate stock solution (KAN): Dissolve KAN
monosulfate to a concentration of 30 mg/mL in distilled
water. Syringe filter and store 1 mL aliquots at −20°C.
3. Ampicillin sodium salt stock solution (AMP): Dissolve AMP
sodium salt to a concentration of 50 mg/mL in distilled
water. Syringe filter and store 1 mL aliquots at −20°C.
4. LB agar plates (Miller): Dissolve 40 g of LB agar (Miller) in
1 L of distilled water and/or D2O in a 2-L flask. Cover with
foil and autoclave. Monitor the temperature as it cools. When
the temperature reaches ~50°C, add 1 mL of the KAN or AMP
stock solution. Pour ~10 mL into 100 × 10-mm Petri dishes
and swirl to coat the plate. Let the LB agar solidify at room
temperature. Place plates back into a plastic bag, seal with tape,
and store at 4°C (see Note 2).
5. Isopropyl-β-D-thiogalactopyranoside (IPTG): Prepare a 1 M
solution using distilled water. Syringe filter and store 1 mL
aliquots at −20°C.
6. 100% Glycerol: Autoclave to sterilize. Store at room temperature.
7. 5× M9 Salts (1 L): Dissolve 64 g of Na2HPO4, 15 g of KH2PO4,
5 g of NH4Cl, and 2.5 g of NaCl in distilled water or D2O, adjust
the volume to 1 L. Autoclave to sterilize and store at room
temperature (see Note 2). Omit NH4Cl from this recipe when
5× M9 salts are used for isotope-labeling. Do not adjust pH.
8. 20% Glucose: Dissolve 20 g glucose in distilled water, adjust
volume to 100 mL. Sterilize by filtration and store at 4°C.
9. MgSO4: Prepare a 1-M solution, autoclave, and store at room
temperature.
10. CaCl2: Prepare 1-M solution, autoclave, and store at room
temperature.
11. M9 minimal medium for traditional IPTG method and double
colony selection (100 mL): 78 mL of distilled, sterilized water,
20 mL of 5× M9 salts, 2 mL of 20% glucose, 200 μL of 1 M
MgSO4, 10 μL of 1 M CaCl2, and 100 μL of antibiotic. Add
the CaCl2 last and immediately swirl the flask to dissolve the
cloudy precipitate. Adjust the pH to 7.4 using NaOH.
12. M9 minimal medium for high cell-density IPTG-induction
method (100 mL): 75 mL of distilled, sterilized water, 20 mL of
5× M9 salts, 5 mL of 20% glucose, 200 μL of 1 M MgSO4, 10 μL
of 1 M CaCl2, and 100 μL of antibiotic. Add the CaCl2 last and
immediately swirl the flask to dissolve the cloudy precipitate (see
Note 3). Adjust the pH to 7.4 using NaOH (see Note 4).
13. M9 for double-labeling (100 mL): 80 mL of distilled, sterilized
water, 20 mL of 5× M9 salts without NH4Cl, 100 mg of
15NH4Cl and 0.2 g of 13C-glucose (for traditional IPTG
method) or 1 g of 13C-glucose (for high cell-density IPTG-
induction method), 200 μL of 1 M MgSO4, 10 μL of 1 M
CaCl2, and 100 μL of antibiotic. Add the CaCl2 last and imme-
diately swirl the flask to dissolve the cloudy precipitate (see
Note 3). Adjust the pH to 7.4 using NaOH (see Note 4). Use
a filtration unit with a 0.22-μm pore size to sterilize the medium
(see Note 5).
14. M9 for triple-labeling (100 mL): 80 mL of 99% D2O, 20 mL
of 5× M9 salts without NH4Cl in D2O, 100 mg of 15NH4Cl,
and 0.2 g of 13C/2H-glucose (for traditional IPTG-induction
method), or 1 g of 13C/2H-glucose (for high cell-density
IPTG-induction method), 200 μL of 1 M MgSO4, 10 μL of
1 M CaCl2, and 100 μL of antibiotic. You can also use
13C-glucose; however, this usually generates ~90% deuterated
triple-labeled protein samples. Add the CaCl2 last and immedi-
ately swirl the flask to dissolve the cloudy precipitate (see
Note 3). Adjust the pH to 7.4 using NaOH (see Note 4). Use
a filtration unit with a 0.22-μm pore size to sterilize the medium
(see Note 5).
15. 1000× Trace metals (100 mL): Dissolve 811 mg of FeCl3
(50 mM), 222 mg of CaCl2 (20 mM), 125.8 mg of MnCl2
(10 mM), 161.5 mg of ZnSO4 (10 mM), 26 mg of CoCl2
(2 mM), 26.9 mg of CuCl2 (2 mM), 25.9 mg of NiCl2 (2 mM),
41.2 mg of Na2MoO4 (2 mM), 34.6 mg of Na2SeO3 (2 mM),
and 12.4 mg of H3BO3 (2 mM) in 60 mM HCl. Autoclave to
sterilize. Store at room temperature.
16. BME vitamins (see Note 3).
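The masses in the trace-metal recipe (item 15) follow from mass = concentration × volume × molar mass. The short Python sketch below reproduces them as a check. The molar masses are standard values for the anhydrous salts, which is an assumption because the recipe does not say which hydrate is weighed; the function name is illustrative only.

```python
# Molar masses (g/mol) of the anhydrous salts; assumed, not stated in the recipe.
MOLAR_MASS = {
    "FeCl3": 162.20, "CaCl2": 110.98, "MnCl2": 125.84, "ZnSO4": 161.47,
    "CoCl2": 129.84, "CuCl2": 134.45, "NiCl2": 129.60,
    "Na2MoO4": 205.92, "Na2SeO3": 172.94, "H3BO3": 61.83,
}

# Target concentrations (mM) in the 1000x stock, taken from the recipe.
TARGET_MM = {
    "FeCl3": 50, "CaCl2": 20, "MnCl2": 10, "ZnSO4": 10, "CoCl2": 2,
    "CuCl2": 2, "NiCl2": 2, "Na2MoO4": 2, "Na2SeO3": 2, "H3BO3": 2,
}


def mass_mg(conc_mm, volume_ml, molar_mass):
    """Mass in mg needed to reach conc_mm (mM) in volume_ml (mL)."""
    # mM x mL = micromoles; micromoles x g/mol = micrograms; /1000 gives mg.
    return conc_mm * volume_ml * molar_mass / 1000.0


for salt, conc in TARGET_MM.items():
    print(f"{salt}: {mass_mg(conc, 100, MOLAR_MASS[salt]):.1f} mg per 100 mL")
```

If a hydrated salt is used instead, substitute its molar mass and the required mass scales accordingly.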
2.3. Protein Purification
1. Affinity resin: His-Bind Resin.
2. 8× Charge buffer: 400 mM NiSO4. Store at 4°C.
3. 8× Binding buffer: 160 mM Tris–HCl, pH 7.9, 2 M NaCl,
20 mM imidazole. Store in an amber bottle at room
temperature.
4. 8× Wash buffer: 160 mM Tris–HCl, pH 7.9, 2 M NaCl,
240 mM imidazole. Store in an amber bottle at room
temperature.
5. 4× Elute buffer: 80 mM Tris–HCl, pH 7.9, 1 M NaCl, 4 M
imidazole. Store in an amber bottle at room temperature.
6. (NH4)2CO3: Dissolve 474.4 g of (NH4)2CO3 in 3.5 L of water.
Once dissolved, adjust the volume to 4 L (final concentration
1.23 M) and store at room temperature. Do not adjust pH.
7. Urea.
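The stated (NH4)2CO3 stock concentration can be checked, and the volume needed for the 20 mM dialysis solution of Subheading 3.8 estimated, with a few lines of Python. This is only a sketch: the molar mass of 96.09 g/mol is the standard value for (NH4)2CO3 and is not given in the text, and the function names are illustrative.

```python
MW_AMMONIUM_CARBONATE = 96.09  # g/mol for (NH4)2CO3; standard value, assumed here


def molarity(mass_g, molar_mass, volume_l):
    """Molar concentration of mass_g of solute in volume_l litres."""
    return mass_g / molar_mass / volume_l


def stock_volume_ml(stock_m, final_m, final_volume_ml):
    """Volume of stock needed, from C1 x V1 = C2 x V2."""
    return final_m * final_volume_ml / stock_m


stock = molarity(474.4, MW_AMMONIUM_CARBONATE, 4.0)
print(f"stock concentration: {stock:.2f} M")
print(f"stock per litre of 20 mM solution: {stock_volume_ml(stock, 0.020, 1000):.1f} mL")
```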
3. Methods
When triple-labeling proteins, bacteria have to be grown in D2O,
usually causing a significant reduction in protein yields. We sought
to overcome this obstacle. Our strategy mainly focuses on increas-
ing the cell density of bacterial expression without manipulation of
the expression vector or use of a fermenter. Unfortunately, bacte-
rial expression at a high cell-density in D2O usually causes several
major problems, including (1) plasmid loss, (2) significant reduc-
tion in the pH of the growth medium due to cell metabolites, and
(3) limited availability of dissolved oxygen. These problems often
result in a low or even no protein production with high cell-density
bacterial expression.
We developed several practical protocols that solved these
problems, including (1) preparation of a proper starting culture,
(2) double colony selection in D2O, (3) optimization of bacterial
expression conditions, and (4) better control of the pH of the
medium. We further developed a high cell-density IPTG-induction
bacterial expression method that combines the tightly controlled
traditional IPTG-induction expression with high cell-density auto-
induction expression. Our optimized protocols ensure plasmid sta-
bility inside bacterial cells, resulting in routine production of
14–25 mg triple-labeled proteins from a 50-mL bacterial cell cul-
ture. Importantly, this novel bacterial expression method uses the
same expression vectors as the traditional IPTG-induction method
and does not require a fermenter. Thus, every NMR laboratory can
easily adopt this novel bacterial expression method to produce
large quantities of triple-labeled proteins.
3.1. PAGE Sample Preparation
PAGE samples come either from the cell lysate isolated immediately
after bacterial expression or from the column flow through during
protein purification.
1. For samples collected directly from bacterial cell culture:
Collect 500 μL of cells and place in a 1.7-mL microcentrifuge
tube. Spin down at 12,000 × g for 5 min at room temperature
using a microcentrifuge, discard the supernatant, and tap out
the excess on a paper towel.
2. Add 25 μL of 4× SDS loading buffer and 25 μL of water and
resuspend the pellet. Store the samples in a freezer until ready
to run a gel.
3. Before running the culture samples on a gel, place the samples
on a 90°C heat block for 30 min. Remove from the heat block;
add 50 μL of water and vortex for 30 s (see Note 6).
4. For samples collected from flow through during protein puri-
fication, mix 60 μL of column flow through with 20 μL of
4× SDS loading buffer and mix thoroughly by repeated
pipetting up and down. Store the samples in a freezer until
ready to run a gel.
5. Before running the flow through samples on a gel, place the
samples on a 90°C heat block for 5 min (see Note 6).
6. Before loading samples on a gel, centrifuge at 12,000 × g for
10 min at room temperature using a microcentrifuge to pellet
cellular debris (see Note 7).
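Both sample types end up at 1× SDS loading buffer before loading. The arithmetic can be sketched in Python as follows; the pellet volume is ignored, which is a simplifying assumption, and the function name is illustrative.

```python
def final_fold(stock_fold, stock_ul, other_ul):
    """Final buffer strength after mixing stock_ul of stock with other_ul of liquid."""
    return stock_fold * stock_ul / (stock_ul + other_ul)


# Culture pellet: 25 uL of 4x buffer + 25 uL of water (pellet volume ignored).
after_resuspension = final_fold(4, 25, 25)
# After heating, another 50 uL of water is added.
after_heating = final_fold(4, 25, 25 + 50)
# Column flow through: 20 uL of 4x buffer + 60 uL of sample.
flow_through = final_fold(4, 20, 60)

print(after_resuspension, after_heating, flow_through)
```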
3.2. SDS-PAGE
1. Depending on the protein size, choose an appropriate acrylamide
percentage for the resolving gel (see Note 8).
2. To prepare a 10% SDS-PAGE mini-resolving gel (5 mL), using
a mini-gel apparatus: Mix 1.9 mL of water, 1.7 mL of 30%
acrylamide/bis solution, 1.3 mL of 1.5 M Tris–HCl, pH 8.8,
50 μL of 10% SDS, 50 μL of 10% ammonium persulfate, and
2 μL of TEMED. Mix well and pour between glass plates set
in a loading cassette. Leave about 1.5-cm space on top for the
stacking gel. Gently pipet water on top and let the gel set
(about 20 min).
3. Pour a stacking gel (2 mL) once the resolving gel has set. Mix
1.4 mL of water, 330 μL of 30% acrylamide/bis solution,
250 μL of 1 M Tris–HCl, pH 6.8, 20 μL of 10% SDS, 20 μL
of 10% ammonium persulfate, and 2 μL of TEMED. Pour the water
off the top of the resolving gel, remove excess water with filter
paper, and pour the stacking gel. Insert a comb containing the
appropriate number of lanes and let the stacking gel set (about
20 min).
4. Prepare 1× running buffer by diluting 100 mL of 5× running
buffer solution with 400 mL of distilled water. Make sure to
mix the solution well. Pour the buffer into the inner and outer
chambers of the gel apparatus. For cell culture samples, load
7.5 μL of each sample into the lanes. For column flow through
samples, load 20 μL into each lane. Make sure to load 5 μL of
molecular weight markers in one lane.
5. Secure the lid on the gel box and plug into a power supply.
Run the gel at 88 V for ~2 h. Turn off the power supply when
the blue dye front reaches the bottom of the gel.
6. Remove the gel from the glass plates and place it in a small box
with 20 mL of Coomassie blue-staining solution. Allow the gel
to stain for at least 30 min (see Note 1). Pour out the stain,
rinse the gel with water to remove excess stain and then add
30 mL of destaining solution. Place a small piece of paper towel
in the box to accelerate the destaining process. This process
may take a few hours; however, you can start to detect bands
within 30–60 min.
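The final acrylamide percentage of each gel follows from the volume of 30% stock in the total mix. A minimal Python sketch of that calculation, using the volumes from steps 2 and 3, is given below; the function names are illustrative, and the second function simply inverts the relation for other target percentages (see Note 8).

```python
def gel_percent(acrylamide_ml, other_ml, stock_percent=30.0):
    """Final acrylamide percentage given the stock volume and all other volumes."""
    total_ml = acrylamide_ml + sum(other_ml)
    return stock_percent * acrylamide_ml / total_ml


def stock_volume_ml(target_percent, total_ml, stock_percent=30.0):
    """Volume of acrylamide stock needed for a gel of total_ml at target_percent."""
    return target_percent * total_ml / stock_percent


# Volumes in mL: water, Tris-HCl, SDS, ammonium persulfate, TEMED.
resolving = gel_percent(1.7, [1.9, 1.3, 0.05, 0.05, 0.002])
stacking = gel_percent(0.33, [1.4, 0.25, 0.02, 0.02, 0.002])

print(f"resolving gel: {resolving:.1f}%, stacking gel: {stacking:.1f}%")
print(f"30% stock for a 5-mL 12% gel: {stock_volume_ml(12, 5):.1f} mL")
```

When changing the percentage, the water volume is adjusted so that the total volume stays the same.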
3.3. A Proper Starting Culture
A critical consideration for high-level bacterial expression is the
preparation of a proper starting culture in a rich medium for scaling
up in minimal medium. The general practice in NMR laboratories
is to grow an overnight culture in a rich medium, such as
LB, at 37°C. We observe that an overnight culture usually reaches
saturation by the next morning, which may cause plasmid instability
and loss. Several factors contribute, including basal (leaky) expression
from the T7 system, which produces target proteins that are toxic to
the host cells under these overgrowth conditions (4, 6, 7). This usually
results in a poor yield of target protein. Figure 1 shows a growth
curve for E. coli BL21 (DE3) cells carrying the LCAT/pET30a
vector, in H2O-based LB, suggesting that the bacteria are in the
exponential or log phase of growth between 6 and 7.5 h at 37°C
(see Note 9). We also place particular emphasis on double colony
selection (Subheading 3.5) to ensure that a high percentage of
bacterial cells within the selected colony contain the DNA expression
plasmid.
Fig. 1. Plot of E. coli growth in 5 mL of LB medium over a 10-h period at 37°C starting with
a glycerol stock. The bacterial strain, BL-21(DE3), contains a pET30a vector expressing
the gene for lecithin:cholesterol acyltransferase (LCAT). Based on this plot, the log phase
of the culture is between an OD600 of 1 and 3.5. Note: The growth curve is vector, protein,
and bacterial strain dependent. Reproduced from Murray 2010 with permission from Cold
Spring Harbor.
1. Perform a time course of bacterial growth of a new protein
expression vector in rich (LB) medium by measuring the OD600
every 30 min for ~10–12 h in water-based rich medium and
~16–18 h in D2O-based rich medium (see Note 10). The OD600
of the log phase and the time to reach saturation are vector,
protein, and bacterial strain-dependent; therefore, we suggest
performing this experiment before actually expressing protein
for purification and labeling (see Note 11).
2. Once the optimal OD600 of the log phase of the starting cul-
ture is determined, this will be the OD600 for all future starting
cultures using both traditional IPTG expression and high cell-
density expression methods.
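Once the time course has been recorded, the log phase can be located numerically as well as by eye. The Python sketch below uses one common criterion, the specific growth rate between consecutive readings, and is not the authors' procedure. The readings are hypothetical, the 50% cutoff is an arbitrary assumption, and the window logic assumes the fast-growing intervals are contiguous, which noisy data may violate.

```python
import math


def growth_rates(times_h, od600):
    """Specific growth rate (per hour) between each pair of consecutive readings."""
    return [
        math.log(od600[i + 1] / od600[i]) / (times_h[i + 1] - times_h[i])
        for i in range(len(od600) - 1)
    ]


def log_phase_window(times_h, od600, fraction=0.5):
    """Indices of the first and last readings bounding the fast-growth intervals.

    An interval counts as fast when its rate is at least `fraction` of the
    maximum rate; the intervals are assumed to be contiguous.
    """
    rates = growth_rates(times_h, od600)
    cutoff = fraction * max(rates)
    fast = [i for i, rate in enumerate(rates) if rate >= cutoff]
    return fast[0], fast[-1] + 1


# Hypothetical hourly readings, for illustration only.
times = [0, 1, 2, 3, 4, 5, 6, 7, 8]
ods = [0.05, 0.06, 0.12, 0.25, 0.5, 1.0, 2.0, 3.5, 3.8]

start, end = log_phase_window(times, ods)
doubling_h = math.log(2) / max(growth_rates(times, ods))
print(f"log phase from {times[start]} h (OD600 {ods[start]}) to {times[end]} h (OD600 {ods[end]})")
print(f"shortest doubling time: {doubling_h:.2f} h")
```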
3.4. Traditional IPTG Method
This method can be scaled up or down to suit your needs. The
following protocol is used to make a 100-mL expression culture.
1. Prepare a 5-mL proper starting culture as described in
Subheading 3.3 in a 50-mL conical tube with holes poked in
the lid.
2. Prepare 100 mL of M9 minimal medium in a 250-mL flask.
Add 1.5 mL of the starting culture and measure the OD600.
(To check the OD600 at this point, aliquot 1 mL of the cell
culture into a cuvette and measure the OD600.) We suggest a
starting OD600 between 0.05 and 0.10 for healthy bacterial cell
growth. Place the flask in a 37°C incubator with a shaking
speed of 200 rpm.
3. Start to monitor the OD600 after 3 h (to check the OD600 at this
point, dilute 100 μL of cell culture into a cuvette containing
900 μL of distilled water and measure the OD600. The final OD600
value is ten times the spectrophotometer reading). Once the
OD600 reaches between 0.8 and 1.2 (see Note 12), remove 2 mL
of cell culture and place it in a 15-mL culture tube (for a non-
induced reference). Induce the remaining culture with 0.5 mM
IPTG. Place both the flask and the 15-mL culture tube in a 20°C
incubator overnight with a shaking speed of 200 rpm.
4. The following morning, measure the final OD600 of the cell
culture. If the protein expression is induced at an OD600 of 1,
the OD600 of the bacterial cell culture will be around 2–3, indi-
cating healthy bacterial cell growth. Harvest the cells by spin-
ning down at 10,000 × g for 10 min at 4°C using a benchtop
centrifuge. Remove the supernatant and either store the cell
pellet at −80°C or use immediately for protein purification.
5. To check protein expression levels, take 500 μL samples of
each culture (non-induced and IPTG-induced), follow the
sample preparation protocol (Subheading 3.1) and run an
SDS-PAGE (Subheading 3.2) to compare non-induced and
IPTG-induced samples.
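The inoculation and OD600 measurement steps involve two simple dilution calculations, sketched below in Python. The starter OD600 of 4 is an assumed value within the range suggested in Note 11, and the function names are illustrative.

```python
def starting_od(starter_od, inoculum_ml, medium_ml):
    """OD600 of the expression culture right after inoculation."""
    return starter_od * inoculum_ml / (inoculum_ml + medium_ml)


def inoculum_for_target(starter_od, target_od, medium_ml):
    """Inoculum volume (mL) that gives target_od when added to medium_ml."""
    # Solves target = starter * v / (v + medium) for v.
    return target_od * medium_ml / (starter_od - target_od)


def true_od(reading, dilution_factor=10):
    """Culture OD600 from a reading taken on a diluted sample (step 3)."""
    return reading * dilution_factor


print(f"1.5 mL of an OD600 4 starter in 100 mL: {starting_od(4, 1.5, 100):.3f}")
print(f"inoculum for a starting OD600 of 0.075: {inoculum_for_target(4, 0.075, 100):.2f} mL")
print(f"reading of 0.095 on a tenfold dilution: OD600 {true_od(0.095):.2f}")
```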
3.5. Double Colony Selection
Since plasmid loss is encountered during bacterial expression in D2O
much more frequently than in H2O, we describe a double colony
selection procedure for triple-labeling protein in D2O. Based on our
experience, this is a critical protocol that significantly increases the
yield of triple-labeled proteins. Figure 2 shows the result of a typical
double colony selection of apoE(1–215)/pTYB1 in D2O, demonstrating
that high protein expression levels are achieved after double
colony selection. This high-level expression of apoE(1–215)/pTYB1
in D2O has been stable for more than 2 years.
Fig. 2. SDS-PAGE of protein expression of apoE(1–215)/pTYB1 in D2O before (a), during (b) and after (c) double colony
selection. Arrows indicate the expected protein band (~80 kDa). (a) Shows four different colonies before colony selection.
(b) Shows the results of three different colonies selected from the single colony selection (lanes 1–3) and another three
colonies selected from the double colony selection (lanes 4–6). The second colony selection was based on Colony
3 (lane 3, b). (c) Shows the results of six colonies from the double colony selection, indicating a high level of protein expression
for all six colonies. Reproduced from Sivashanmugam 2009 with permission from Wiley Interscience.
Usually, we perform double colony selection before we opti-
mize the expression conditions. Thus, the traditional IPTG-
induction method is the default method with which to start double
colony selection. The procedure, described in Subheading 3.4, can
be applied here for double colony selection, except for culture vol-
ume and the use of D2O.
1. Prepare LB agar plates with 50% D2O.
2. Perform a bacterial transformation using LB agar plates pre-
pared with D2O.
3. The next afternoon, choose nine colonies to make starting cultures
of 5 mL of LB in 50% D2O and 5 μL of antibiotic in 50-mL
conical tubes, and prepare a master plate (see Note 13). Punch
holes in the lids of the tubes. Eight of the nine colonies will be
induced with IPTG and the ninth colony will be used as a
negative, non-induced control.
4. When the starting culture is ready, add 50–100 μL to 5 mL of
M9 minimal medium in 70% D2O and 5 μL of antibiotic in
50-mL conical tubes. Ensure that the starting OD600 is between
0.05 and 0.10. Place the tube in a 37°C incubator with a shaking
speed of 200 rpm.
5. When the OD600 of the culture reaches between 0.8 and 1 (see
Note 14), add 0.5 mM IPTG to eight of the nine cultures to
induce protein expression, clearly marking which culture is
serving as a negative control. Place the tubes in a 20°C incuba-
tor overnight with a shaking speed of 200 rpm.
6. The following morning, remove 500 μL of cell culture from
each tube and place in a microcentrifuge tube. Centrifuge at
12,000 × g for 10 min at room temperature and discard the
supernatant. Prepare SDS-PAGE samples and run a gel.
7. Choose the colony expressing the biggest protein band (using
the negative control as reference) and prepare 5 mL of LB cul-
ture in either 70 or 99% D2O and 5 μL of antibiotic containing
the colony from the master plate. Grow at 37°C until the
OD600 reaches 0.7–0.9 and spread 150 μL on an LB plate pre-
pared in 50% D2O. Invert the plate and incubate it at 37°C
overnight.
8. The following day, repeat steps 3–6 using 70 or 99% D2O for
the second round of colony selection (see Note 14). When
completed, choose the colony expressing the biggest protein
band to make a 5-mL LB culture in 100% D2O and 5 μL of
antibiotic. Grow the culture halfway through its log phase.
Add 800 μL of culture to 200 μL of 100% sterile glycerol in a
2-mL cryogenic tube with a screw top cap. Pipet up and down
to mix thoroughly, flash freeze the tubes by dipping them
in liquid nitrogen, and store at −80°C. If the protein expression
level is extremely high, we suggest that you make at least 5–10
glycerol stocks of this double selected colony for future use.
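Two quantities in this procedure are worth computing before starting: the volume of IPTG stock for each 5-mL culture and the final glycerol content of the frozen stock. The sketch below is in Python; the 1 M IPTG stock concentration is an assumption (this section does not state it), and the function names are illustrative.

```python
def inducer_volume_ul(final_mm, culture_ml, stock_mm=1000.0):
    """Volume of IPTG stock (uL) for a final concentration of final_mm (mM).

    stock_mm defaults to 1000 mM (1 M), an assumed stock concentration.
    The small volume added is ignored.
    """
    return final_mm * culture_ml / stock_mm * 1000.0


def glycerol_percent(culture_ul, glycerol_ul, glycerol_stock_percent=100.0):
    """Final glycerol percentage (v/v) of a frozen stock."""
    return glycerol_stock_percent * glycerol_ul / (culture_ul + glycerol_ul)


print(f"IPTG stock for 0.5 mM in 5 mL: {inducer_volume_ul(0.5, 5):.1f} uL")
print(f"glycerol in the frozen stock: {glycerol_percent(800, 200):.0f}%")
```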
3.6. High Cell-Density IPTG-Induction Method
This is a hybrid method combining traditional IPTG-induction
and auto-induction bacterial expression methods. It takes advantage
of tightly controlled IPTG-induction and the high cell-density
of the auto-induction bacterial expression. We use rich media, such
as LB and 2× YT, to reach a high cell-density before IPTG-
induction and then switch the culture medium by gently spinning-
down the cells and resuspending them into an equal volume of
minimal medium. However, many problems may occur during a
high cell-density bacterial expression that can cause a significant
reduction in protein yield, such as reduced pH of the expression
medium, poor aeration, and/or plasmid loss during expression.
We describe the following procedure to avoid these problems,
ensuring a high yield of triple-labeled protein.
1. Using the glycerol stock prepared after double colony selec-
tion (Subheading 3.5), make a starting culture in 50 mL of LB
medium containing 50 μL antibiotic in a 250-mL flask. (Dip
the pipet tip in the glycerol stock and scratch the surface. Place
the tip into the culture medium, pipet up and down a few
times and remove the tip. Immediately return the glycerol
stock to the −80°C freezer.) Place the flask in a 37°C incubator
with a shaking speed of 200 rpm and incubate until the OD600
is halfway through the log phase (Subheading 3.3, step 1). DO
NOT let the starting culture grow overnight since the satura-
tion of cell growth may cause plasmid loss.
2. Transfer the cells into sterile tubes and spin down the culture
at 5,000 × g for 7 min at room temperature. Remove the super-
natant and tap the tubes on paper towels to remove as much
LB as possible.
3. Gently resuspend the pellets in 50 mL of M9 minimal medium
(see Note 4) and transfer the resuspended cells to a 250-mL
sterile flask. Place the flask in an incubator that is set at the
optimal induction temperature (see Subheading 3.7).
Maintain the shaking speed at 200 rpm for efficient aeration
and keep the culture at the optimal temperature for 1–1.5 h
to allow the cells to adapt to the new medium. An increase
of approximately 0.5–1 OD600 unit should be observed
at the end of this time. For example, if the OD600 is 3 right
after medium exchange, you can expect the OD600 to reach
3.5–4 at the end of the 1–1.5 h incubation, indicating healthy
cell growth (this increase in OD600 will be slightly less when
expressing in D2O).
4. Add the optimal concentration of IPTG (Subheading 3.7) to
induce protein expression and keep the culture in the incuba-
tor for an optimal period of time (Subheading 3.7) at a shaking
speed of 200 rpm.
5. Measure the final OD600 before harvesting cells. You can expect
to see a two- to fourfold enhancement at the end of the expres-
sion. For example, if your OD600 is 4 after the 1–1.5 h incuba-
tion, the final OD600 will be between 8 and 16, indicating
healthy bacterial growth.
6. Harvest the cells by spinning down the culture at 10,000 × g for
10 min at 4°C. Remove the supernatant and either store the cell
pellet at −80°C, or immediately use for protein purification.
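The two growth checkpoints in steps 3 and 5 can be written as simple checks, for example when logging OD600 readings in a spreadsheet or script. The Python sketch below encodes the thresholds quoted above (an increase of at least 0.5 OD600 unit after adaptation, and a two- to fourfold rise after induction); the function names are illustrative, and cultures in D2O may show a slightly smaller increase.

```python
def adapted_ok(od_before, od_after, min_gain=0.5):
    """True if the OD600 rose by at least min_gain during the 1-1.5 h adaptation."""
    return od_after - od_before >= min_gain


def expected_final_range(od_at_induction, low_fold=2, high_fold=4):
    """Expected final OD600 range for a healthy culture (two- to fourfold rise)."""
    return od_at_induction * low_fold, od_at_induction * high_fold


print(adapted_ok(3.0, 3.5))
print(expected_final_range(4))
```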
3.7. Optimization of Various Conditions
Another important step for high-level protein production using
high cell-density bacterial expression is to optimize the expression
conditions such as culture temperature and the induction time.
These steps are critical for the initial expression of a protein using
the high cell-density expression method. We usually use the tradi-
tional IPTG-induction method first to check if a new protein can
be expressed by using bacteria. Once the protein expression is con-
firmed by the traditional IPTG-induction method, we can opti-
mize the high cell-density expression method to produce high-yield
isotopically labeled proteins.
We usually carry out time courses at different temperatures,
such as 15, 20, room temperature, 30 and 37°C. We normally
prepare a 5-mL starting culture either in D2O or in water for the
time course and closely monitor the following parameters during
expression: OD600, pH, and target protein production. The detailed
procedure of optimization follows:
1. Prepare 10 mL starting cultures in 50-mL flasks: 10 mL of LB
medium in 99% D2O, 10 μL of antibiotic and bacterial cells
from a glycerol stock after double colony selection. Incubate at
37°C with a shaking speed of 200 rpm.
2. Once the optimal OD600 has been reached, gently centrifuge
the cells at 5,000 × g for 7 min at room temperature and dis-
card the supernatant. Resuspend the cell pellets with 10 mL of
M9 minimal medium in 99% D2O in 50-mL flasks and incubate
at various temperatures, such as 15, 20, room temperature,
30, and 37°C, for 1 h. Check the OD600 of the cultures before
and after this 1-h cell incubation.
3. If the OD600 of each culture after the 1-h incubation increases
by 0.5–1 OD600 units, this indicates that the cells are healthy
and growing after the medium exchange. Induce protein
expression by adding 0.5 mM IPTG. Be sure to choose one
culture without IPTG-induction to serve as the negative con-
trol. Return the tubes to appropriate shakers at different tem-
peratures. We found that 0.5 mM IPTG usually gives a
reasonable protein production, thus we always used this IPTG
concentration as our starting point. However, an independent
optimization of IPTG concentration can be carried out and is
discussed in step 6 of this section.
4. For cultures growing below 25°C, let them grow overnight
(~14–16 h). The following morning, collect 500 μL of cell
culture samples every 2 h (typically collect between 16 and
28 h and one more sample the next morning). For the cultures
growing above 25°C, start to collect samples every 2 h after
induction for at least 8 h and one the next morning. At each
time point, check the OD600 and the pH of the cell culture
(see Note 4). Be sure to keep the collected cell pellets at −20°C.
5. When all of the samples have been collected, prepare samples
for SDS-PAGE analysis. A comparison with the negative control
(noninduced culture) allows you to determine which tempera-
ture and induction time give you the best protein yield
(see Note 15).
6. Repeat the above procedure to optimize the IPTG concentration
using the optimized temperature and incubation time. We usually
test IPTG concentrations of 0.1, 0.25, 0.5, 0.75, and 1 mM.
7. Once the optimal conditions have been determined, you can
now perform high cell-density bacterial expression on a larger
scale. Be sure to use a large flask for better aeration. We usually
use a 250-mL flask for a 50-mL cell culture and a 500-mL flask
for a 100-mL cell culture. If you want to grow a 200-mL cell
culture, we suggest dividing the culture into 2 × 100 mL cul-
tures in two 500-mL flasks.
Figure 3 shows a typical time course experiment performed
during optimization of experimental conditions. Table 1 shows the
expression parameters for the time course during our optimization
of the expression of human apolipoprotein A-I (apoAI). It is clear
that at the maximum OD600 the bacterial expression produces the
highest yield of triple-labeled apoAI, as confirmed by a Western
blot (Fig. 3).
Fig. 3. Left panel: An SDS-PAGE showing auto-induction time course of triple-labeled human apoAI expression in D2O at
room temperature. Lanes 1–7 = 24, 28, 32, 36, 40, 44, and 54 h, respectively. Right panel is a Western blot of the same
time course using anti-apoAI monoclonal antibody. Reproduced from Sivashanmugam 2009 with permission from Wiley
Interscience.
Table 1
Parameters of the time course of human apolipoprotein A–I expression

Time            24 h   28 h   32 h   36 h   40 h   44 h   54 h
OD600           2.5    3.9    7.2    9.1    8.4    8      8.1
pH              6.6    6.5    6.3    6      6      6      6.1
Protein yield   −      +      ++     +++    ++     ++     −
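The Table 1 values can be screened programmatically for the time point of maximum OD600 and for any pH reading below the threshold of 6 discussed in Note 4. The following Python sketch uses only the numbers in Table 1; the variable names are illustrative.

```python
# Values transcribed from Table 1.
time_h = [24, 28, 32, 36, 40, 44, 54]
od600 = [2.5, 3.9, 7.2, 9.1, 8.4, 8.0, 8.1]
ph = [6.6, 6.5, 6.3, 6.0, 6.0, 6.0, 6.1]

# Index of the time point with the highest cell density.
peak = max(range(len(od600)), key=od600.__getitem__)
print(f"maximum OD600 {od600[peak]} at {time_h[peak]} h, pH {ph[peak]}")

# Note 4 treats pH below 6 as harmful to cell health and protein production.
low_ph_times = [t for t, p in zip(time_h, ph) if p < 6.0]
print("time points with pH below 6:", low_ph_times)
```

In this data set the OD600 maximum coincides with the strongest protein band (+++), and the pH never falls below 6.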
3.8. Protein Purification
Protein purification depends on the fusion tag that is used, as different
tags serve different purposes. In our laboratory, we generally
use histidine tags. In this section, we describe a typical protein
purification procedure using a His-Bind Resin column.
1. Prepare 1× dilutions of all buffers. Recheck the pH of each to ensure
that it is still 7.9.
2. Centrifuge the cell culture at 10,000 × g for 10 min at 4°C.
Remove the supernatant and resuspend the pellets in 20 mL of
1× binding buffer. If the protein is in inclusion bodies and can
be refolded readily during dialysis, you can resuspend the
pellet in 20 mL of 1× binding buffer containing 6 M urea.
3. Lyse the cells by using either sonication or a French press.
Centrifuge the lysate at 16,000 × g for 20 min at 4°C. Collect
the supernatant and store it on ice.
4. Add 10 mL of 1× binding buffer and repeat step 2 at least twice,
and then combine all the supernatants. Depending on the protein,
you may need to add additional binding buffer and repeat
step 2 three to five times to completely extract the protein from the cells.
5. Equilibrate the affinity column with 50 mL of 1× charge buffer
(see Note 16). Remove the charge buffer and equilibrate the
column with 50 mL of 1× binding buffer. The column should
be a light blue color after equilibration.
6. Load the column with the clear lysate from step 3. The flow
rate should be ~1 mL/min (see Note 17). Collect the flow
through and remove a 60-μL sample for SDS-PAGE.
7. Wash the column with 200 mL of 1× binding buffer, followed
by an additional 100 mL of 1× wash buffer. The flow rate of
wash buffer should also be ~1 mL/min. Elute the column with
100 mL of 1× elution buffer. Collect the last drop of elution
for SDS-PAGE to make sure that all of the protein has been
eluted from the column.
8. Perform SDS-PAGE analysis with all of the collected samples
to assess the purification.
9. Place the eluted protein into a dialysis bag and dialyze exten-
sively against water containing 20 mM (NH4)2CO3 to remove
imidazole, salts and possibly urea. After dialysis, freeze the pro-
tein sample with liquid nitrogen and lyophilize to obtain pure
triple-labeled protein powder. Run a gel to assess the purity of
the protein powder.
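The buffer volumes in step 7 and the resin volume in Note 16 can be worked out ahead of time. The Python sketch below dilutes the concentrated stocks of Subheading 2.3 to 1× and scales the resin volume from the capacity quoted in Note 16 (about 20 mg of protein per 2.5 mL of resin); the function names are illustrative.

```python
def dilute_to_1x(stock_fold, final_ml):
    """Return (stock mL, water mL) needed to make final_ml of 1x buffer."""
    stock_ml = final_ml / stock_fold
    return stock_ml, final_ml - stock_ml


def resin_ml(expected_mg, capacity_mg_per_ml=20 / 2.5):
    """Minimum resin volume for expected_mg of protein (capacity from Note 16)."""
    return expected_mg / capacity_mg_per_ml


print("binding buffer, 200 mL:", dilute_to_1x(8, 200))
print("wash buffer, 100 mL:", dilute_to_1x(8, 100))
print("elute buffer, 100 mL:", dilute_to_1x(4, 100))
print(f"resin for 25 mg of protein: {resin_ml(25):.1f} mL")
```

The 5 mL of resin suggested in Note 16 for a 50-mL culture leaves a margin above this minimum.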
3.9. Conclusion
With the high cell-density IPTG-induction bacterial expression
method and the practical protocols described above, we routinely
produce 14–25 mg of triple-labeled proteins and 15–35 mg of
unlabeled proteins from a 50-mL cell culture for all the proteins
we tested. Table 2 lists the final yields of unlabeled and triple-
labeled proteins obtained using high cell-density bacterial expression,
compared with the yields obtained by using the traditional IPTG-
induction method in a 50-mL cell culture. The results suggest a
5–100-fold enhancement in protein yield. In addition, the protocols
described reproducibly yield a consistently high level of triple-labeled
protein. Table 2 also gives mass spectrometry
data for the triple-labeled protein, indicating the efficiency of
deuteration for triple-labeled protein using auto-induction expressions.
Overall, the deuteration efficiency is around 90% if we assume
the 13C and 15N-labeling are 100%. This is because we used 99.7%
D2O and nondeuterated 13C-glycerol or 13C-glucose in high cell-
density expressions. This result is comparable to the deuteration
efficiency of the traditional IPTG-induction expression method
using singly labeled 13C-glucose.

Table 2
Final yields of unlabeled and triple labeled proteins: high cell-density vs. traditional
IPTG method

Protein               High cell density(b) (mg)   IPTG(b) (mg)   M.W. (Cal) (Da)   M.W. (MS) (Da)   %D(c)
Triple-labeled
RAP(1–210)            20 ± 3                      0.5            33,801            33,525 ± 195     ~92
RAP(91–323)           25 ± 3                      0.8            36,633            36,376 ± 200     ~93
ApoE(1–183)(a)        18 ± 4                      2              22,866            22,686 ± 116     ~89
Mouse apoAI(1–216)    15 ± 2                      0.8            28,014            27,732 ± 125     ~90
Human apoAI           14 ± 1                      0.6            32,814            32,401 ± 150     ~88
Unlabeled
Human apoAI           34 ± 1                      1
Human apoE            17 ± 2                      0.2

M.W. molecular weight
Reproduced from Sivashanmugam 2009 with permission from Wiley Interscience
(a) ApoE(1–183) was expressed in 40% D2O; the rest were expressed in 99.7% D2O
(b) High cell density (50-mL culture volume): high cell-density expression methods, including auto-induction and high
cell-density IPTG-induction; IPTG: the optimized traditional IPTG-induced expression. We repeated the expressions
at least three times for all proteins; the yield shown is the average ± standard deviation
(c) Estimated percentage of deuteration, assuming 100% 13C and 15N-labeling. For apoE(1–183), the %D is the estimated
percentage of deuteration based on 40% D2O. For the other four proteins, the %D is the estimated percentage of
deuteration based on 99.7% D2O
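The fold enhancement for each protein can be recomputed directly from the mean yields in Table 2. The Python sketch below does so; standard deviations are ignored, and the dictionary keys are shortened labels chosen for illustration.

```python
# Mean yields in mg from Table 2: (high cell density, traditional IPTG).
yields_mg = {
    "RAP(1-210)": (20, 0.5),
    "RAP(91-323)": (25, 0.8),
    "ApoE(1-183)": (18, 2),
    "Mouse apoAI(1-216)": (15, 0.8),
    "Human apoAI (triple-labeled)": (14, 0.6),
    "Human apoAI (unlabeled)": (34, 1),
    "Human apoE (unlabeled)": (17, 0.2),
}

# Ratio of high cell-density yield to traditional IPTG yield.
fold = {name: hcd / iptg for name, (hcd, iptg) in yields_mg.items()}

for name, value in fold.items():
    print(f"{name}: {value:.0f}-fold")
print(f"range: {min(fold.values()):.0f}- to {max(fold.values()):.0f}-fold")
```

The ratios of the means run from about 9-fold to about 85-fold, which lies within the 5–100-fold range quoted in the text.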
4. Notes
1. The stain can be reused multiple times. When using fresh stain,
you only need to stain gels for 15–30 min. Pour the used stain
into a separate container. When reusing stain, you may need to
stain gels longer.
2. When using D2O to replace water in LB agar, broth or M9
salts, solutions CANNOT be autoclaved to sterilize. When
making LB broth, follow the directions as stated, but instead
of autoclaving, use a filtration unit with a 0.22-μm pore size to
sterilize. When making LB agar plates, use a microwave to
bring the solution just to a boil (to dissolve agar), add antibiot-
ics once the agar temperature cools to ~50°C, and pour into
plates. When making 5× M9 salts, follow the directions as
stated, however omit NH4Cl for isotopic-labeling. Once again,
do not autoclave; sterilize using a filtration unit with a
0.22-μm pore size.
3. For the auto-induction method, Studier suggests using vita-
mins and trace metals (4). We found vitamins and trace metals
help promote healthy bacterial growth when using our high
cell-density IPTG-induction bacterial expression. We purchased
the trace metals and BME vitamins stock solution from
Sigma. The trace metal mixture used in our laboratory is based on
Studier’s recipe provided in the supplementary materials of his
elegant paper on the auto-induction bacterial expression
method (4). For our optimized high cell-density IPTG-
induction minimal medium, we added 0.25× vitamins and
0.25× trace metals (see Table 1 in ref. 5).
4. While monitoring OD600 during the optimization time course
(Subheading 3.7), the pH should be monitored as well. As the
cell density increases, the pH of the culture lowers due to the
release of cell metabolites. If the pH becomes too low (pH < 6),
it will affect bacterial cell health and protein production. If the
pH drops below 6 during the time course, we increase the pH
of the M9 minimal medium to 8 using NaOH to allow for a
larger buffering capacity of the culture medium.
5. M9 minimal medium containing isotopes (2H, 13C, or 15N)
cannot be autoclaved. It must be sterilized using a filtration
unit with a 0.22-μm pore size.
6. If the protein contains cysteine residue(s), add 10–20 mM
DTT after adding water. Vortex the sample well and let it sit
for 30–60 min at room temperature.
7. The release of DNA can cause the sample to become quite
viscous, making it hard to load on the gel. If you notice this,
you can simply sonicate your sample at a low wattage (4–6 W)
for 5–10 s. Afterward, spin down the sample at 12,000 × g for
2 min at room temperature. Also, when loading the gel, remove
the sample from the top portion of the supernatant to avoid
the pelleted cellular debris at the bottom.
8. To determine which percentage gel to use, follow the guidelines
found in Table A8-8 in Sambrook and Russell (8). Also, if you
are working with proteins that weigh less than 15 kDa,
12% Tricine gels or gradient gels are highly recommended
for better resolution in the molecular weight range of 5–15 kDa.
9. Bacteria display a four-phase pattern of cell growth in liquid
media. First, there is an initial lag phase when bacteria are
adapting to the growth conditions; at this point, an increase in
OD600 will not be seen. Second, bacteria enter their exponen-
tial or log phase at which point the bacterial cells start dividing
(doubling in number). The OD600 during this log phase climbs
steadily. Third, bacteria enter the stationary phase during which
the rate of cell growth significantly slows due to a decrease in
available nutrients and an accumulation of toxins. The OD600
will level off during this phase. Finally, if fresh medium is not
made available and toxins are not removed, bacteria will enter
the death phase and a noticeable drop in OD600 will be observed.
The key is to utilize a starting culture during the exponential
or log phase of their growth curve (see Fig. 1).
10. When performing a time course using rich media, such as LB or
2× YT in D2O, bacteria grow much more slowly than in water.
11. Based on our experience, we suggest the appropriate OD600
range for the starting culture is between 3 and 5 in LB medium
and between 5 and 7 in 2× YT medium. However, the OD600
of the log phase is vector, protein, and bacterial strain-dependent.
Thus, the best way to determine the middle point of the log
phase is to perform a time course of bacterial growth for each
new protein expression vector.
12. Based on our experience, for healthy bacterial cell growth, the
culture is expected to reach an OD600 of 0.8–1.2 within 4–6 h for
bacterial expressions in water and within 6–9 h for bacterial expres-
sions in D2O if the starting OD600 is between 0.05 and 0.10.
13. When performing colony selection, it is helpful to make a
“master plate.” Take a KAN or AMP plate and make a 9-square
grid under the agar plate. Label boxes 1–9. When you are ready
to inoculate the medium, take a sterilized tip, gently touch the
selected colony, and then gently touch the agar of the master
plate in the corresponding box. Go back to the original plate,
retouch the same colony, and drop it into a tube. Repeat this
procedure for all selected colonies. Once finished, put the lid
back on the plate, invert and incubate at 37°C for about 8 h.
Colonies should be about 1–2-mm in diameter. Cut a long
strip of parafilm, wrap the edges of the plate, and store inverted
at 4°C. You can use this plate to regrow the colonies for future
cell cultures. However, the plate is only good for about 2 weeks,
so be sure to also make glycerol stocks.
14. If the bacteria do not grow well in 70% D2O, then they must
be trained to adapt to D2O medium. For this purpose, pick a
colony off a D2O plate and start a 5-mL bacterial culture of LB
medium in 25% D2O. Once the OD600 of the culture reaches

1 at 37°C, transfer 100 μL of the cell culture into 5 mL of LB
medium in 50% D2O. The starting OD600 of this new culture is
about 0.02. Let the cell culture grow at 37°C until the OD600
reaches 1 and transfer 100 μL of the cell culture into 5 mL of
LB medium in 75% D2O, and let the culture grow at 37°C
until the OD600 reaches 2–3. Use this cell culture as your start-
ing culture.
15. A Western blot of this time course will further allow an unam-
biguous determination of the time point that produces the
best protein yield.
16. Typically, we prepare columns containing 5 mL of affinity resin
for a 50-mL cell culture since 2.5 mL of resin can bind about
20 mg of protein. This can be scaled up or down to suit your
needs based on the expected protein yield.
17. If the flow rate is too slow, we have found that passing the
cleared lysate through a 1.5-μm syringe filter to remove cellular
debris prevents the columns from clogging and running slowly.
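The OD600 values and times in Note 12 can be turned into an apparent doubling time, which is a convenient single number for comparing growth in water and in D2O. The following is a minimal sketch, assuming purely exponential growth between the two readings; the function names are hypothetical and the example values are simply taken from the ranges quoted in Note 12.

```python
import math


def doubling_time_h(od_start: float, od_end: float, elapsed_h: float) -> float:
    """Apparent doubling time (h), assuming exponential growth between readings."""
    if od_start <= 0 or od_end <= od_start or elapsed_h <= 0:
        raise ValueError("need 0 < od_start < od_end and elapsed_h > 0")
    return elapsed_h * math.log(2) / math.log(od_end / od_start)


def hours_to_reach(od_start: float, od_target: float, doubling_h: float) -> float:
    """Time (h) needed to grow from od_start to od_target at a fixed doubling time."""
    if od_start <= 0 or od_target <= od_start or doubling_h <= 0:
        raise ValueError("need 0 < od_start < od_target and doubling_h > 0")
    return doubling_h * math.log2(od_target / od_start)


if __name__ == "__main__":
    # Illustrative values from Note 12: OD600 0.05 -> 0.8 in 4 h (H2O culture).
    td = doubling_time_h(0.05, 0.8, 4.0)
    print(f"apparent doubling time: {td:.2f} h")
    # Time to reach OD600 0.8 from 0.1 if the doubling time were 1.5 h.
    print(f"time to target: {hours_to_reach(0.1, 0.8, 1.5):.1f} h")
```

Because lag and stationary phases violate the exponential assumption, the estimate is only meaningful for readings taken within the log phase.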
References
1. Swartz, J.R. (2001) Advances in Escherichia coli production of therapeutic proteins. Curr. Opin. Biotechnol. 12, 195–201.
2. Hewitt, L., and McDonnell, J.M. (2004) Screening and optimizing protein production in E. coli methods. Methods Mol. Biol. 278, 1–16.
3. McIntosh, L.P. and Dahlquist, F.W. (1990) Biosynthetic incorporation of 15N and 13C for assignment and interpretation of nuclear magnetic resonance spectra of proteins. Q Rev. Biophys. 23, 1–38.
4. Studier, F.W. (2005) Protein production by auto-induction in high density shaking cultures. Protein Expr. Purif. 41, 207–234.
5. Sivashanmugam, A., Murray, V., Cui, C., Zhang, Y., Wang, J., and Li, Q. (2009) Practical protocols for production of very high yields of recombinant proteins using Escherichia coli. Protein Sci. 18, 936–948.
6. Chen, H.C., Hwang, C.F., and Mou, D.G. (1992) High-density Escherichia coli cultivation process for hyperexpression of recombinant porcine growth hormone. Enzyme Microb. Technol. 14, 321–326.
7. Baneyx, F. (1999) Recombinant protein expression in Escherichia coli. Curr. Opin. Biotechnol. 10, 411–421.
8. Sambrook, J. and Russell, D. (2001) Molecular Cloning: A Laboratory Manual (3rd ed.). Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (ISBN 978-087969577-4).
Chapter 2
Isotopic Labeling of Heterologous Proteins in the Yeast

Pichia pastoris and Kluyveromyces lactis
Toshihiko Sugiki, Osamu Ichikawa, Mayumi Miyazawa-Onami,
Ichio Shimada, and Hideo Takahashi
Abstract
Several protein expression systems are available for the preparation of stable isotope-labeled recombinant
proteins for NMR studies. Yeast expression systems have several advantages over prokaryotic systems, such
as the widely used Escherichia coli expression system. Protein expression using the methylotrophic yeast
Pichia pastoris is commonly employed for the preparation of isotope-labeled proteins. Recently, the hemi-
ascomycete yeast Kluyveromyces lactis expression system was reported as being useful for preparing proteins
for NMR studies. Since each yeast expression system has different features, their applications have increased
in number. In this chapter, we describe procedures for the efficient production of uniformly isotope-labeled
proteins using the P. pastoris and the K. lactis yeast expression systems.
Key words: Yeast expression systems, Pichia pastoris, Kluyveromyces lactis, Stable isotope labeling,
Fed-batch fermentation, NMR
1. Introduction
The most widely used method for the expression of isotopically

labeled heterologous recombinant proteins is the Escherichia coli
expression system because of easy handling, rapid and high-density
cell growth, high levels of protein production, and relatively low
costs for isotope-labeling. However, in many cases, expression of
structurally and functionally intact eukaryotic proteins by E. coli
and other prokaryotic cells is fundamentally difficult due to (1) the
lack of intracellular organelles, (2) a limited number of molecular
chaperones, and (3) the absence of posttranslational modification
mechanisms (1–6). Yeast, however, combines several advantages of
both eukaryotic and prokaryotic expression systems: (1) processing,
folding, complex disulfide-bond network formation, and posttranslational
modification (e.g., glycosylation) of proteins are possible,
(2) yeast exhibit rapid growth rates and grow to high-density, and
(3) methods for molecular genetic manipulation are well established
and are simple to perform (1–6). Here, we introduce the Pichia
pastoris and Kluyveromyces lactis yeast expression systems for the
stable isotopic labeling of heterologous proteins.
1.1. The P. pastoris Expression System
Host cells and expression vectors are available from Invitrogen.
Based upon Invitrogen instruction manuals (7), previous reports
(8–15), and our experience, we describe here optimized cell culture
procedures for the production of isotope-labeled heterologous
proteins by P. pastoris.
P. pastoris can use methanol both as a carbon source and as an
inducer of protein expression. Methanol is oxidized by alcohol
oxidases, AOX1 and AOX2, in yeast cell peroxisomes (though
the majority of alcohol oxidase activity is attributed to AOX1). The
AOX1 promoter tightly regulates expression of the AOX1 gene,
and methanol induces AOX1 promoter activity. Thus, the AOX1
promoter is used to drive the expression of a target protein
by replacing the AOX1 gene with a cDNA encoding the desired
heterologous protein (1–6).
Invitrogen supplies several vector series that are commonly
used for protein expression in P. pastoris, including pPIC9K,
pPIC3.5K, and pPICZ. Using these vectors, a cDNA cassette con-
taining the target gene under the control of the AOX1 promoter is
inserted into the genome of P. pastoris by using homologous
recombination along with a gene coding for resistance to a drug,
such as Zeocin™ or G418, for subsequent selection of transformed
cells (7).
Target proteins are expressed either intracellularly or secreted
into the medium by P. pastoris. For secretion of target proteins,
vectors pPICZα and pPIC9K are recommended (7). The pPICZα
and pPIC9K series contain a gene that encodes a prepro signal
sequence [such as the Saccharomyces cerevisiae α-mating factor
(α-MF)] between the AOX1 promoter and the target gene (7).
Although target proteins may contain native secretion signals,
the α-MF sequence is generally used as the sole secretion
signal. Since P. pastoris secretes only small amounts of its native proteins,
a secreted target protein comprises the vast majority of the total
proteins in the culture medium, simplifying purification of the
target protein (1–7). A different set of vectors lacking secretion signal
sequences is required if the target protein is to be expressed intracellularly.
For cytosolic and nonglycosylated proteins, vectors pPICZ
or pPIC3.5K are recommended (7).
In this chapter, we describe methods and provide technical
advice for producing uniformly labeled [U-13C, 15N] and [U-2H, 15N]
target proteins in P. pastoris (8–14). The procedures described are
optimized for secretory expression of isotope-labeled DDR2 from

P. pastoris strain X-33 (Mut+ phenotype transformants) using the
pPICZα vector system (15).
1.2. The K. lactis Expression System
Host cells and expression vectors are available from New England
Biolabs. Based upon instruction manuals (16), previous reports,
and our experience, we describe here optimized cell culture proce-
dures for the production of isotope-labeled heterologous proteins
by K. lactis. Heterologous recombinant proteins expressed by
K. lactis can be sequestered intracellularly or secreted into the
medium. As with P. pastoris, K. lactis secretes very low levels of native
proteins; therefore, secreted target proteins comprise the vast
majority of the total protein present in the culture medium, which
simplifies purification of the target protein (16–18).
In K. lactis, the LAC4 promoter drives expression of the target
protein gene. Using the pKLAC vector, a cDNA cassette containing
the target gene under the control of the LAC4 promoter is inserted
into the genome of K. lactis by using homologous recombination
(16–18). The transcriptional activity of the LAC4 promoter is
induced by galactose, which K. lactis also utilizes as a carbon source
for cell growth. Therefore, expression of the target protein is con-
stitutively induced by cultivation in a medium containing galactose
(16–18).
The pKLAC vectors impart a fungal acetamidase gene (amdS)
to the transformants. Since only transformants that express amdS
can utilize acetamide as a sole nitrogen source, amdS is used for
selection of the desired transformants, thereby obviating the need
for expensive antibiotics, such as Zeocin™, and rendering this
auxotrophic selection method more cost-effective (16–18). In addi-
tion, auxotrophic acetamide selection enriches populations of trans-
formants in which multiple tandem copies of target cDNA are
integrated into the yeast genome (16–18). Thus, the K. lactis
protein expression system is simple, cost-effective, easily scaled-up,
and highly reproducible.
We recently established a cost-effective isotope-labeling method
utilizing the hemiascomycete yeast K. lactis (6, 19). In the most
commonly employed K. lactis expression system, 20 g/L galactose
is required as a carbon source for cell growth and as an inducer of
target protein expression (16). However, while a 20 g/L carbon
source is acceptable for uniform 15N-labeling, it is economically
infeasible for uniform 13C-labeling. Sugiki et al. reported that by
combining K. lactis strain GG799, which is characterized by weak
glucose suppression for the LAC4 promoter (16), with a fed-batch
culture method, larger amounts of protein can be expressed using a
smaller amount of glucose, thus reducing costs to a level compa-
rable to isotope-labeling by E. coli systems (6, 19). In this chapter,
we describe methods and provide technical advice for producing
uniformly labeled [U-13C, 15N] and [U-2H, 15N] target proteins in
K. lactis.
2. Materials
The recipes for the culture media described here were derived
mainly from the manuals for the EasySelect™ Pichia Expression Kit
(Invitrogen), the Pichia Fermentation Process Guidelines (Invitrogen),
and the K. lactis Protein Expression Kit (New England Biolabs).
Autoclave sterilization is performed at 121°C for 15 min.
2.1. Uniform 13C, 15N-Labeling Using P. pastoris
1. 40% unlabeled D-glucose stock solution: Dissolve 200 g of
D-glucose in water to a final volume of 500 mL. Sterilize by aseptic
filtration and store at room temperature. The shelf life of this
solution is approximately 1 year.
2. 100 mg/mL Stock solution of Zeocin™ (Invitrogen).
3. Yeast extract peptone dextrose sorbitol (YPDS) agar plates:
Dissolve 20 g of bacto peptone, 10 g of yeast extract, 182.2 g
of sorbitol, and 20 g of bacto agar in 950 mL of water. Sterilize
by autoclaving and cool to 50–60°C. Aseptically add 1 mL
of 100 mg/mL Zeocin™ and 50 mL of 40% glucose stock
solution, mix gently but thoroughly, and dispense into sterile disposable
Petri dishes. Store at 4°C in the dark. The shelf life of these
agar plates is 1–2 weeks.
4. 10× Yeast nitrogen base (YNB): Dissolve 34 g of YNB without
amino acids and ammonium sulfate in 1 L of water (see Note 1).
Warm it to 40–50°C to dissolve completely if needed. Sterilize
by aseptic filtration and store at 4°C. The shelf life of this solu-
tion is approximately 1 year.
5. 1 M Potassium phosphate: Mix 132 mL of 1 M K2HPO4 and
868 mL of 1 M KH2PO4. Adjust the pH to 6.0 ± 0.1 using
KOH or phosphoric acid. Sterilize by autoclaving and store at
room temperature. The shelf life of this solution is greater than
1 year.
6. 500× Biotin: Dissolve 20 mg of biotin in 100 mL of water and
warm it to 40–50°C to dissolve completely. Sterilize by aseptic
filtration and store at 4°C. The shelf life of this solution is
approximately 1 year.
7. 10% (v/v) Glycerol: Mix 50 mL of glycerol with 450 mL of
water. Sterilize by aseptic filtration and store at room temperature.
The shelf life of this solution is over 1 year.
8. 200× PTM1 trace salts: Mix together the following ingredients
and dissolve to a final volume of 1 L in water: 6.0 g of
CuSO4⋅5H2O, 0.08 g of NaI, 3.0 g of MnSO4⋅H2O, 0.2 g
of Na2MoO4⋅2H2O, 0.02 g of boric acid, 0.5 g of CoCl2, 20.0 g
of ZnCl2, 65.0 g of FeSO4⋅7H2O, 0.2 g of biotin, 5.0 mL of
H2SO4. Warm to 40–50°C to dissolve completely if needed.
Sterilize by aseptic filtration and store at 4°C or room temperature.
The shelf life of this solution is approximately 3 months at 4°C.
9. Buffered glycerol-complex (BMGY) medium: Dissolve 20 g bacto
peptone and 10 g yeast extract in 700 mL of water. Sterilize by
autoclaving and cool it to room temperature. Aseptically add
100 mL of sterile 1 M potassium phosphate, 100 mL of sterile
10× YNB, 2 mL of sterile 500× biotin, and 100 mL of sterile
10% (v/v) glycerol solutions. If needed, 5 mL of 200× PTM1
trace salts solution can be added to this medium. Store at 4°C.
The shelf life of this medium is approximately 3 months.
10. 10% 15N-ammonium chloride: Dissolve 6 g of 15N-ammonium
chloride (98 atom% 15N) in 60 mL of water (see Note 2). Sterilize
by aseptic filtration. Prepare this solution just before use.
11. 15N-BM medium: To prepare 1.2 L of uniformly 15N-labeled
buffered minimal (15N-BM) medium, autoclave 900 mL of water
in a 2-L media bottle and allow the water to cool to room
temperature. Aseptically add 120 mL of 10× YNB, 2.4 mL of
500× biotin, 120 mL of 1 M potassium phosphate, and 60 mL
of 10% 15N-ammonium chloride to the autoclaved water.
If needed, 6 mL of 200× PTM1 trace salts solution can be
added to 15N-BM medium.
12. 5% D-[13C6]glucose: Dissolve 1 g of D-[13C6]glucose (99 atom%
13C) in 20 mL of 15N-BM medium and sterilize by aseptic
filtration. Prepare this solution just before use.
13. Antifoaming agent: (see Note 3).
14. 13C-methanol (99 atom% 13C).
15. 50% (w/v) ethanol. The ethanol is needed only for immersing the
air outlet port of the fermentation system.
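The 15N-BM recipe in item 11 follows fixed dilution factors (10× YNB, 500× biotin, 1 M potassium phosphate diluted tenfold, 10% ammonium chloride diluted twentyfold), so it can be rescaled to other batch sizes. The sketch below is illustrative only: the function name is hypothetical, the dilution factors are inferred from the 1.2-L recipe, and the water volume is computed as the remainder, whereas the chapter rounds it to 900 mL.

```python
def bm_medium_volumes(total_ml: float, with_ptm1: bool = False) -> dict:
    """Stock volumes (mL) for a batch of 15N-BM medium of total_ml.

    Dilution factors are inferred from the 1.2-L recipe in item 11 and are
    an assumption of this sketch, not a statement from the kit manuals.
    """
    if total_ml <= 0:
        raise ValueError("total_ml must be positive")
    volumes = {
        "10x YNB": total_ml / 10,
        "500x biotin": total_ml / 500,
        "1 M potassium phosphate": total_ml / 10,
        "10% 15N-ammonium chloride": total_ml / 20,
    }
    if with_ptm1:
        # Optional trace salts, added at 1x from the 200x stock.
        volumes["200x PTM1 trace salts"] = total_ml / 200
    # Water makes up the remainder of the batch.
    volumes["autoclaved water"] = total_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for name, ml in bm_medium_volumes(1200).items():
        print(f"{name}: {ml:.1f} mL")
```

For a 1.2-L batch this reproduces the stock volumes listed in item 11; the computed water volume differs from 900 mL only by the rounding noted above.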
2.2. Uniform 2H, 15N-Labeling Using P. pastoris
1. YPDS agar plates (Subheading 2.1).
2. BMGY medium (Subheading 2.1).
3. 10× YNB(2H2O): Prepare this solution in a manner similar to
10× YNB solution (Subheading 2.1), but use 2H2O (99.8 atom
% 2H) instead of water (H2O). Sterilize by aseptic filtration and
store at room temperature. The shelf life of this solution is
approximately 2–3 months.
4. 1 M potassium phosphate(2H2O): Prepare 150 mL of this solu-
tion in a manner similar to 1 M potassium phosphate solution
(Subheading 2.1), but use 2H2O instead of water. Sterilize by
aseptic filtration. Prepare this solution just before use.
5. 500× biotin(2H2O): Prepare this solution in a manner similar
to 500× biotin solution (Subheading 2.1), but use 2H2O
instead of water. Sterilize by aseptic filtration and store at 4°C.
The shelf life of this solution is approximately 2–3 months.
6. 200× PTM1 trace salts(2H2O): Prepare this solution in a manner
similar to 200× PTM1 trace salts solution (Subheading 2.1),
but use 2H2O instead of water. Sterilize by aseptic filtration and
store at 4°C.
7. 10% 15N-ammonium chloride(2H2O): Dissolve 7 g of
15N-ammonium chloride in 70 mL of 2H2O (see Note 2) and sterilize
by aseptic filtration. Prepare this solution just before use.
8. 10% unlabeled D-glucose(2H2O): Dissolve 1 g of D-glucose in
10 mL of 2H2O and sterilize by aseptic filtration. Prepare this
solution just before use.
9. 15N-BMD (90% 2H2O) medium: To prepare 10 mL of uniformly
15N-labeled buffered minimal medium containing unlabeled
glucose prepared in 90% 2H2O, mix 6 mL of 2H2O and 1 mL
of H2O, and sterilize by aseptic filtration. Aseptically add 1 mL
of 10× YNB(2H2O), 0.02 mL of 500× biotin(2H2O), 1 mL of
1 M potassium phosphate(2H2O), 0.5 mL of 10% unlabeled
D-glucose(2H2O) (0.5% final concentration), 0.5 mL of 10%
15N-ammonium chloride(2H2O), and 5–10 μL of antifoaming
agent. If needed, 0.05 mL of 200× PTM1 trace salts(2H2O) can
be added to this medium. Prepare this medium just before use.
10. 15N-BM(2H2O) medium: To prepare 1.3 L of uniformly
15N-labeled buffered minimal medium in 100% 2H2O, sterilize
956 mL of fresh 2H2O by aseptic filtration. Aseptically add
130 mL of 10× YNB(2H2O), 3 mL of 500× biotin(2H2O),
130 mL of 1 M potassium phosphate(2H2O), and 66 mL of
10% 15N-ammonium chloride(2H2O) into the sterile 2H2O. If
needed, 7 mL of 200× PTM1 trace salts(2H2O) can be added
to this medium. Prepare this medium just before use.
11. 15N-BMD(2H2O) medium: To prepare 5 mL of uniformly
15N-labeled buffered minimal medium containing unlabeled
glucose in 100% 2H2O, aseptically add 0.25 mL of 10% unla-
beled D-glucose(2H2O) to 4.75 mL of 15N-BM(2H2O) in a
50-mL Corning tube. Prepare this medium just before use.
12. 10% D-[2H7]glucose(2H2O): Dissolve 1.5 g of D-[2H7]glucose
(98 atom% 2H) in 15 mL of 2H2O and sterilize by aseptic filtra-
tion. Prepare this solution just before use.
13. 2H, 15N-BMD(2H2O) medium: To prepare 300 mL of uniformly
15N-labeled buffered minimal medium containing D-[2H7]glucose
in 100% 2H2O, transfer 285 mL of 15N-BM(2H2O) into a
sterile, well-dried 1-L baffled flask. Aseptically add 15 mL of
10% D-[2H7]glucose(2H2O) and 0.15–0.30 mL of antifoaming
agent. Prepare this medium just before use.
14. Antifoaming agent (Subheading 2.1).
15. [2H4]methanol (99.5 atom% 2H).
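The nominal "90% 2H2O" of the adaptation medium in item 9 can be checked by summing the volumes of the 2H2O-based and H2O-based components. The following sketch is a rough bookkeeping aid under stated assumptions: each stock is counted entirely as its solvent, the isotopic purity of 2H2O and the microliter volume of antifoaming agent are ignored, and the function and variable names are hypothetical.

```python
def d2o_fraction(components) -> float:
    """Volume fraction of 2H2O-based liquid in a mixture.

    components: iterable of (name, volume_ml, solvent) tuples, where solvent
    is either "2H2O" or "H2O". Solute volumes are neglected (assumption).
    """
    components = list(components)
    total = sum(volume for _, volume, _ in components)
    if total <= 0:
        raise ValueError("total volume must be positive")
    d2o = sum(volume for _, volume, solvent in components if solvent == "2H2O")
    return d2o / total


# Volumes taken from the 10-mL 15N-BMD (90% 2H2O) recipe in item 9.
BMD_90 = [
    ("2H2O", 6.0, "2H2O"),
    ("H2O", 1.0, "H2O"),
    ("10x YNB(2H2O)", 1.0, "2H2O"),
    ("500x biotin(2H2O)", 0.02, "2H2O"),
    ("1 M potassium phosphate(2H2O)", 1.0, "2H2O"),
    ("10% D-glucose(2H2O)", 0.5, "2H2O"),
    ("10% 15N-ammonium chloride(2H2O)", 0.5, "2H2O"),
]

if __name__ == "__main__":
    print(f"2H2O fraction: {d2o_fraction(BMD_90):.3f}")
```

The same helper can be reused to design intermediate adaptation media by changing the ratio of the first two entries.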
2.3. Uniform 13C, 15N-Labeling Using K. lactis
1. Yeast peptone (YP): Dissolve 20 g of bacto peptone and 10 g
of yeast extract in 950 mL of water. Sterilize by autoclaving
and store at room temperature or 4°C. The shelf life of this
solution is approximately 1 year.
2. 40% unlabeled D-glucose stock solution (Subheading 2.1).

3. YPD medium: Aseptically add 50 mL of sterile 40% D-glucose
stock solution to the autoclaved 950 mL YP solution. The shelf
life of this solution is approximately 1 month.
4. 1 M potassium phosphate (Subheading 2.1).
5. YCB-acetamide agar plates: Dissolve 5.85 g of yeast carbon
base (YCB) medium powder and 10 g of bacto agar in 470 mL
of water. Add 25 mL of 1 M potassium phosphate solution.
Autoclave and cool it to 50–60°C. Aseptically add 5 mL of
100× acetamide stock solution (supplied by New England
Biolabs), mix gently but thoroughly, and dispense into sterile disposable
Petri dishes. Store at 4°C. The shelf life of these plates is
approximately 3 months.
6. 10× YNB (Subheading 2.1).
7. 500× biotin (Subheading 2.1).
8. 200× PTM1 trace salts (Subheading 2.1).
9. 10% 15N-ammonium chloride: Dissolve 7.5 g of 15N-ammonium
chloride in 75 mL of water (see Note 2) and sterilize by aseptic
filtration. Prepare this solution just before use.
10. 4% D-[13C6]glucose: Dissolve 2 g of D-[13C6]glucose in 50 mL
of water and sterilize it by aseptic filtration. Prepare this solu-
tion just before use.
11. 13C, 15N-BMD medium: To prepare 0.5 L of 15N-BM medium
containing 0.4% D-[13C6]glucose, autoclave 325 mL of water
containing 0.25–0.50 mL of antifoaming agent in a 2-L fermentation
vessel and allow it to cool to room temperature. Aseptically
add 50 mL of 10× YNB, 1 mL of 500× biotin, 50 mL of 1 M
potassium phosphate, 25 mL of 10% 15N-ammonium chloride,
and 50 mL of 4% D-[13C6]glucose to the autoclaved water for a
final volume of 500 mL. If needed, 2.5 mL of 200× PTM1
trace salts solution can be added to this medium.
12. 6% D-[13C6]glucose: Dissolve 6 g of D-[13C6]glucose in 100 mL
of water and sterilize by aseptic filtration. Prepare this solution
just before use.
13. 13C, 15N-BMD feeding medium: To prepare 1 L of 15N-BM
feeding medium containing 0.6% D-[13C6]glucose, autoclave
650 mL of water containing 0.5–1.0 mL of antifoaming
agent in a 1-L media bottle and allow it to cool to room temperature.
Aseptically add 100 mL of 10× YNB, 2 mL of 500× biotin,
100 mL of 1 M potassium phosphate, 50 mL of 10%
15N-ammonium chloride, and 100 mL of 6% D-[13C6]glucose to
the autoclaved water for a total volume of 1 L. If needed, 5 mL of
200× PTM1 trace salts solution can be added to this medium.
14. Antifoaming agent (Subheading 2.1).
2.4. Uniform 2H, 15N-Labeling Using K. lactis
1. YPD medium (Subheading 2.3).
2. YCB-acetamide agar plates (Subheading 2.3): If needed,
prepare the plates with 2H2O (see Note 4).
3. 10× YNB(2H2O) (Subheading 2.2).
4. 500× biotin(2H2O) (Subheading 2.2).
5. 1 M potassium phosphate(2H2O) (Subheading 2.2).
6. 200× PTM1 trace salts(2H2O) (Subheading 2.2).
7. 10% 15N-ammonium chloride(2H2O): Dissolve 7.5 g of
15N-ammonium chloride in 75 mL of 2H2O (see Note 2) and sterilize
by aseptic filtration. Prepare this solution just before use.
8. 4% D-[2H7]glucose(2H2O): Dissolve 2 g of D-[2H7]glucose in
50 mL of 2H2O and sterilize by aseptic filtration. Prepare this
solution just before use.
9. 2H, 15N-BMD(2H2O): To prepare 0.5 L of uniformly
15N-labeled buffered minimal medium containing D-[2H7]glucose
in 2H2O, sterilize 325 mL of 2H2O by aseptic filtration.
Aseptically add 50 mL of 10× YNB(2H2O), 1 mL of 500×
biotin(2H2O), 50 mL of 1 M potassium phosphate(2H2O),
25 mL of 10% 15N-ammonium chloride(2H2O), 50 mL of 4%
D-[2H7]glucose(2H2O), and 0.25–0.50 mL of antifoaming
agent to the sterilized 2H2O for a final volume of 500 mL. If
needed, 2.5 mL of 200× PTM1 trace salts(2H2O) can be added
to this medium.
10. 6% D-[2H7]glucose(2H2O): Dissolve 6 g of D-[2H7]glucose in
100 mL of 2H2O and sterilize by aseptic filtration. Prepare this
solution just before use.
11. 2H, 15N-BMD(2H2O) feeding medium: Sterilize 650 mL of
2H2O by aseptic filtration. Aseptically add 100 mL of 10×
YNB(2H2O), 2 mL of 500× biotin(2H2O), 100 mL of 1 M
potassium phosphate(2H2O), 50 mL of 10% 15N-ammonium
chloride(2H2O), 100 mL of 6% D-[2H7]glucose(2H2O), and
0.5–1.0 mL of antifoaming agent to the sterilized 2H2O for a
final volume of 1 L. If needed, 5 mL of 200× PTM1 trace
salts(2H2O) can be added to this medium.
12. Antifoaming agent (Subheading 2.1).
13. 50% (w/v) ethanol.
3. Methods
3.1. Uniform 13C, 15N-Labeling Using P. pastoris
1. Prepare necessary media.
(a) Using aluminum foil, loosely cover the open lid of an
empty 1-L baffled flask and autoclave it.
(b) Prepare 1.2 L of fresh, sterile 15N-BM medium in a 2-L
medium bottle and 20 mL of 5% D-[13C6]glucose.
(c) Aseptically combine 180 mL of fresh 15N-BM medium,
20 mL of 5% D-[13C6]glucose and 0.1–0.2 mL of antifoaming
agent in the sterile 1-L baffled flask to make
13C, 15N-BMD medium.
(d) Using aluminum foil, loosely mask the open lid of the
empty 2-L fermentation vessel. Clamp all of the lines that
could come into contact with the growth medium and
autoclave the vessel.
(e) Aseptically combine 1 L of fresh 15N-BM medium,
0.5–1.0 mL of antifoaming agent and 5 mL of 13C-methanol
in the sterile empty 2-L fermentation vessel to make
13C, 15N-BMM medium.
2. Using a sterile loop, scoop a small aliquot of P. pastoris trans-
formants from a frozen glycerol stock, and streak it onto a
YPDS agar plate containing Zeocin™. Incubate the plate for
24–48 h at 30°C.
3. To produce the primary culture, inoculate a fresh, single
colony of the P. pastoris transformants into 5 mL of BMGY
medium (in a 50-mL Corning tube) and shake at 200–250 rpm
at 30°C for 18–24 h.
4. Pellet the primary culture cells by centrifugation at 3,000 × g
for 5 min at 20°C and discard the supernatant. Gently resus-
pend the pellet with 30 mL of fresh, sterile 13C, 15N-BMD
medium, and pour the resuspended cells into the remaining
13C, 15N-BMD medium in the baffled flask for a final volume of
200 mL (see Note 5). Shake the flask at > 200 rpm at 30°C
until the cell density reaches an OD600 of 4–6.
5. Pellet the cells in sterile tubes by centrifuging at 2,000 × g for
10 min at 20°C and discard the supernatant. Gently resuspend
the pellet with 30 mL of fresh, sterile 13C, 15N-BMM medium,
and pour the resuspended cells into the remaining fresh, sterile
13C, 15N-BMM medium in the 2-L fermentation vessel to reach
an OD600 of approximately 1 (see Notes 6–8).
6. Assemble the fermentation system. Aseptically attach the pH
and dissolved O2 sensors to the fermentation vessel and con-
nect the probes to their respective controllers. Connect the
outlet port of the air-feeding tube to the air flow inlet port of
the vessel. Aseptically attach a Liebig condenser to the outlet
port of the vessel, and immerse the air-exhaust port of the
Liebig condenser in 1 L of 50% (w/v) ethanol solution.
Insert the temperature probe into the vessel and connect it to
a temperature controller, such as a circulating water bath.
7. Agitate the culture medium at 300–800 rpm with 0.1–0.3 L/min
of feeding air at 30°C (see Notes 9 and 10). Add
13C-methanol (0.5% (w/v) of the total volume) to the culture
medium every 24 h during this induction phase (20) (see
Note 11). The total amount of 13C-methanol required for sufficient
protein expression in a cost-effective manner is approximately
10–20 mL per 1 L culture medium (see Note 12).
8. Pellet the cells by centrifuging at 6,000 × g for 20 min at 4°C.
If the target protein is secreted into the culture medium, filter
the supernatant by passing it through a < 0.45-μm pore size
membrane and collect the filtrate. If needed, add appropriate
protease inhibitors to the filtrate to prevent proteolytic degra-
dation of secreted target proteins. If the target protein is
expressed in the cytoplasm of the host cells, discard the super-
natant after centrifugation and retain the cell pellet.
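Step 7 specifies the methanol pulse as a percentage (w/v) of the culture volume but the methanol budget in milliliters, so a density is needed to relate the two. The sketch below is a hedged illustration: the function names are hypothetical, and the density of 0.792 g/mL is the approximate value for unlabeled methanol near 20°C, used here as an assumption because isotope-labeled methanol is slightly denser.

```python
# Approximate density of unlabeled methanol near 20 degrees C (g/mL).
# Assumption: labeled methanol is treated as having the same density.
METHANOL_DENSITY_G_PER_ML = 0.792


def methanol_pulse_ml(culture_ml: float, percent_wv: float = 0.5,
                      density_g_per_ml: float = METHANOL_DENSITY_G_PER_ML) -> float:
    """Volume of methanol (mL) giving percent_wv (g per 100 mL) in culture_ml."""
    if culture_ml <= 0 or percent_wv <= 0 or density_g_per_ml <= 0:
        raise ValueError("all arguments must be positive")
    grams = culture_ml * percent_wv / 100
    return grams / density_g_per_ml


def pulses_within_budget(budget_ml: float, pulse_ml: float) -> int:
    """Number of whole pulses that fit within a methanol budget."""
    if budget_ml < 0 or pulse_ml <= 0:
        raise ValueError("budget_ml must be >= 0 and pulse_ml > 0")
    return int(budget_ml // pulse_ml)


if __name__ == "__main__":
    pulse = methanol_pulse_ml(1000)  # 1-L culture, 0.5% (w/v) per pulse
    print(f"pulse volume: {pulse:.1f} mL")
    print(f"pulses within a 20-mL budget: {pulses_within_budget(20, pulse)}")
```

Whether the methanol already present in the starting medium counts against the budget is left to the user; the sketch only converts units.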
3.2. Uniform 2H, 15N-Labeling Using P. pastoris
Deuterium isotope-labeling of target proteins is one of the most
important techniques for protein NMR studies, especially for analyses
of large molecular weight (MW > 25 kDa) proteins and in cross-saturation
experiments to identify intermolecular-binding sites
(6, 21–24). The P. pastoris expression system can be used to overexpress
deuterium-labeled heterologous proteins.
For the efficient production of deuterated target proteins, cells
should be adapted to grow in deuterated broth medium. Adaptation
is achieved by multistage subculturing in which the deuterium
concentration is raised in a stepwise fashion. For instance, subcul-
turing of cells is performed using semideuterated (25–95% 2H2O)
medium, with subsequent culturing in fully deuterated medium
(14) (see Note 4). In the case of uniform 2H-labeling, using a
2H-labeled carbon source (such as [2H4]methanol) only during the
induction phase is insufficient for producing fully deuterated target
proteins (a considerable amount of protons remain on methyl and
Lys δ/ε groups) (14). To achieve the nearly complete deuteration
level required for cross-saturation experiments, a deuterium-labeled
carbon source, such as D-[2H7]glucose, should also be used during
the cell growth phase (prior to the induction phase) (7, 14). In this
section, we describe a cultivation procedure for overexpressing
perdeuterated heterologous proteins in P. pastoris. Using this pro-
cedure, Ichikawa and coworkers successfully prepared a discoidin
domain of DDR2 that was more than 95% deuterated, and clearly
identified the collagen-binding site of DDR2 through transferred
cross-saturation experiments (7).
1. Prepare necessary media:
(a) Autoclave a 1-L baffled flask and a 2-L fermentation vessel,
and dry them in an oven at 60–80°C (for at least 48 h) to
completely remove residual H2O.
(b) Prepare 1.3 L of 15N-BM(2H2O) medium in a 2-L media
bottle, 10 mL of 10% unlabeled D-glucose(2H2O), and
15 mL of 10% D-[2H7]glucose(2H2O) solutions.
(c) Prepare 5 mL each of 15N-BMD(90% 2H2O), 15N-BMD(2H2O),
and 2H, 15N-BMD(2H2O) media in 50-mL
Corning tubes.
(d) Prepare 300 mL of 2H, 15N-BMD(2H2O) medium in
a sterile, well-dried 1-L baffled flask.
(e) Prepare 2H, 15N-BMM(2H2O) medium by mixing 1 L of
15N-BM(2H2O) medium, 0.5–1.0 mL of antifoaming agent
and 5 mL of [2H4]methanol in the sterile, well-dried 2-L
fermentation vessel.
2. Using a sterile loop, scoop a small aliquot of P. pastoris
transformants from a frozen glycerol stock and streak it onto a
YPDS agar plate containing Zeocin™. Incubate the plate for
24–48 h at 30°C.
3. To produce the primary culture, inoculate 5 mL of BMGY
medium in a sterile 50-mL Corning tube with a fresh single
colony of the P. pastoris transformant from the YPDS agar plate
and shake at 200–250 rpm at 30°C for 18–24 h.
4. Add 0.1 mL of the primary culture to 5 mL of fresh, sterile
15N-BMD(90% 2H2O) medium in a sterile 50-mL Corning
tube. Shake the culture tube at 200–300 rpm at 30°C until the
cell density reaches an OD600 of approximately 3–5.
5. Add 0.1 mL of the step 4 culture to 5 mL of fresh, sterile
15N-BMD(2H2O) medium (in a sterile 50-mL Corning tube).
Shake the culture tube at 200–300 rpm at 30°C until the cell
density reaches an OD600 of 3–5.
6. Add 0.1 mL of the step 5 culture to 5 mL of fresh, sterile 2H,
15N-BMD(2H2O) medium in a sterile 50-mL Corning tube.
Shake the culture tube at 200–300 rpm at 30°C until the cell
density reaches an OD600 of 3–5.
7. Pellet the cells by centrifuging at 3,000 × g for 5 min at 20°C
and discard the supernatant. Gently resuspend the pellet with
30 mL of fresh, sterile 2H, 15N-BMD(2H2O) medium and pour
the resuspended cells into the remaining 2H, 15N-BMD(2H2O)
medium in the 1-L baffled flask to a final volume of 300 mL.
Shake the flask at > 200 rpm at 30°C until the cell density
reaches an OD600 of 3–4.
8. Pellet the cells in sterile tubes by centrifuging at 2,000 × g for
10 min at 20°C and discard the supernatant. Gently resuspend
the pellet with 30 mL of fresh, sterile 2H, 15N-BMM(2H2O)
medium and pour the resuspended cells into the remaining
sterile 2H, 15N-BMM(2H2O) medium in the 2-L fermentation
vessel. The starting OD600 of the culture in the fermenter vessel
should be approximately 1 (see Notes 6–8).
9. Assemble the fermentation system as described in
Subheading 3.1, step 6 (see Note 13).
10. Agitate the culture medium at 300–800 rpm with 0.1–0.3
L/min of feeding air at 30°C (see Notes 9 and 10). Add [2H4]
methanol (0.5% (w/v) of the total volume) to the culture
medium every 24 h during this induction phase (see Notes 11
and 12).
11. Collect the target proteins as described in Subheading 3.1,
step 8.
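Each adaptation passage in steps 4–6 dilutes 0.1 mL of culture into 5 mL of medium and regrows it to an OD600 of 3–5, which fixes how many generations the cells spend in each deuterated medium. The following sketch estimates that number; it assumes OD600 is proportional to cell number over the whole range and uses hypothetical function names and illustrative OD values within the ranges given in the steps.

```python
import math


def starting_od(od_inoculum: float, inoculum_ml: float, medium_ml: float) -> float:
    """OD600 immediately after diluting the inoculum into fresh medium."""
    if od_inoculum <= 0 or inoculum_ml <= 0 or medium_ml <= 0:
        raise ValueError("all arguments must be positive")
    return od_inoculum * inoculum_ml / (inoculum_ml + medium_ml)


def generations_per_passage(od_inoculum: float, od_harvest: float,
                            inoculum_ml: float = 0.1, medium_ml: float = 5.0) -> float:
    """Number of doublings between inoculation and harvest of one passage."""
    od0 = starting_od(od_inoculum, inoculum_ml, medium_ml)
    if od_harvest <= od0:
        raise ValueError("od_harvest must exceed the starting OD600")
    return math.log2(od_harvest / od0)


if __name__ == "__main__":
    # Illustrative case: each passage is inoculated from, and regrown to, OD600 = 4.
    per_passage = generations_per_passage(4.0, 4.0)
    print(f"generations per passage: {per_passage:.1f}")
    print(f"generations over three passages: {3 * per_passage:.1f}")
```

When each passage is regrown to the same OD600 as its inoculum, the result depends only on the dilution factor.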
3.3. Uniform 13C, 15N-Labeling Using K. lactis
1. Prepare 0.5 L of sterile 13C, 15N-BMD and 1.0 L of 13C, 15N-BMD
feeding medium in a 2-L fermentation vessel and 1-L media
bottle, respectively.
2. Using a sterile loop, scoop a small aliquot of K. lactis transfor-
mants from a frozen glycerol stock and streak it onto a YCB-
acetamide agar plate. Incubate the plate for 24–48 h at 30°C.
3. To produce the primary culture, inoculate a fresh, single colony
of K. lactis transformants into 5 mL of YPD medium (in a 50-mL
Corning tube) and shake at 200–250 rpm at 30°C for 48 h to
obtain a saturated biomass (OD600 > 20–30).
4. Assemble the fermentation system as described in
Subheading 3.1, step 6. Connect the media bottle containing
1 L of fresh 13C, 15N-BMD feeding medium to the inlet port of
the vessel with an appropriate length of feeding tube, and
attach a peristaltic pump at the midpoint of the feeding tube
(Fig. 1).
5. Pellet the primary culture cells by centrifuging at 3,000 × g for
5 min at 20°C and discard the supernatant. Gently resuspend
the pellet with 30 mL of fresh, sterile 13C, 15N-BMD medium
from the 2-L fermentation vessel, and pour the resuspended
cells into the remaining 13C, 15N-BMD medium in the fermen-
tation vessel. Agitate the culture medium at 600–800 rpm with
0.1–0.3 L/min feeding air at 30°C (see Notes 9, 10, and 12).
6. During the fermentation, continuously feed fresh, sterile 13C,
15N-BMD feeding medium into the fermentation vessel at a
constant flow rate of 8.3 mL/h using the peristaltic pump (see
Note 14).
7. Pellet the cells by centrifuging at 10,000 × g for 15 min at 4°C.
If the target protein is secreted into the culture medium, filter
the supernatant by passing it through a < 0.45-μm pore size
membrane and collect the filtrate. If needed, add appropriate
protease inhibitors to the collected filtrate to prevent proteolytic
degradation of the target protein. If the target protein is expressed
in the cytoplasm of the host cells, discard the supernatant after
centrifugation, and retain the cell pellet.
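The feeding schedule in step 6 can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the pump delivers a constant 8.3 mL/h and that evaporation and sampling losses are ignored. It shows how long 1.0 L of feeding medium lasts and how the culture volume grows from the initial 0.5 L.

```python
def feed_duration_h(feed_volume_ml: float, flow_rate_ml_per_h: float) -> float:
    """Hours needed to deliver the whole feed volume at a constant flow rate."""
    if feed_volume_ml < 0 or flow_rate_ml_per_h <= 0:
        raise ValueError("feed volume must be non-negative and flow rate positive")
    return feed_volume_ml / flow_rate_ml_per_h


def culture_volume_ml(start_volume_ml: float, flow_rate_ml_per_h: float,
                      elapsed_h: float, feed_volume_ml: float) -> float:
    """Culture volume after elapsed_h, capped once the feed bottle is empty."""
    fed = min(flow_rate_ml_per_h * elapsed_h, feed_volume_ml)
    return start_volume_ml + fed


hours = feed_duration_h(1000.0, 8.3)
print(f"Feed lasts {hours:.1f} h (about {hours / 24:.1f} days)")
print(f"Volume after 24 h: {culture_volume_ml(500.0, 8.3, 24.0, 1000.0):.1f} mL")
print(f"Final volume: {culture_volume_ml(500.0, 8.3, hours, 1000.0):.0f} mL")
```

Under these assumptions the feed lasts roughly 5 days, and the final culture volume of about 1.5 L stays within the 2-L fermentation vessel.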
3.4. Uniform 2H, 15N-Labeling Using K. lactis
In our experience, perdeuteration (approximately 90% deuteration
estimated by MS analysis) of maltose-binding protein (MBP) is
successfully achieved using K. lactis with fed-batch fermentation.
Furthermore, insertion of one or two subcultivation steps following
the primary culture improves cell growth and expression of MBP
with a highly efficient (> 92%) level of deuteration (Miyazawa-Onami
and coworkers, unpublished data).

Fig. 1. Overview of the fed-batch fermentation system. (a) Schematic diagram of fed-batch
fermentation. (b) Photograph of the assembled fed-batch fermentation device. a: fresh
media which is continuously fed into the fermentation vessel, b: peristaltic pump, c: fermentation
vessel, d: air flow inlet, and e: air exhaust port.

1. Prepare necessary media:
(a) Autoclave an empty 1-L media bottle and a 2-L fermentation
vessel, and dry them in an oven at 60–80°C (for at least
48 h) to completely remove residual H2O.
(b) Prepare 0.5 L of fresh, sterile 2H, 15N-BMD(2H2O) and
1.0 L of 2H, 15N-BMD(2H2O) feeding medium in the
sterile 2-L fermentation vessel and 1-L media bottle,
respectively.
2. Using a sterile loop, scoop a small aliquot of K. lactis transfor-
mants from a frozen glycerol stock and streak it onto a YCB-
acetamide agar plate. Incubate the plate for 24–48 h at 30°C.
3. To produce the primary culture, inoculate 5 mL of YPD
medium in a 50-mL Corning tube using a fresh single colony
of the K. lactis transformants from the YCB agar plate and
shake at 200–250 rpm at 30°C for 48 h to obtain a saturated
biomass (OD600 > 20–30).
4. Assemble the fermentation system as described in
Subheading 3.1, step 6 (see Note 13). Connect the bottle con-
taining 1 L of fresh 2H, 15N-BMD(2H2O) feeding medium
to the inlet port of the vessel with an appropriate length of
feeding tube, and attach a peristaltic pump at the midpoint
of the feeding tube (Fig. 1).
5. Pellet the cells by centrifuging at 3,000 × g for 5 min at 20°C
and discard the supernatant. Gently resuspend the pellet with
30 mL of fresh, sterile 2H, 15N-BMD(2H2O) medium from the
2-L fermentation vessel, and pour the resuspended cells into
the remaining 2H, 15N-BMD(2H2O) medium in the fermentation
vessel. Agitate the culture medium at 600–800 rpm with
0.1–0.3 L/min feeding air at 30°C (see Notes 9, 10, and 12).
6. During the fermentation, continuously feed fresh 2H,
15N-BMD(2H2O) feeding medium from the 1-L media bottle
into the fermentation vessel at a constant flow rate of 8.3 mL/h
using the peristaltic pump (see Note 14).
7. Collect the target proteins as described in Subheading 3.3,
step 7.
4. Notes
1. Detailed information about the composition of YNB without
amino acids and ammonium sulfate can be found in the Becton
Dickinson Difco Yeast Media Recipes, available at
http://www.bd.com/ds/technicalCenter/inserts/Yeast_Media.pdf.
2. Ammonium sulfate or ammonium chloride is used as a nitrogen
source by the yeast. In many cases, 5–10 g/L of ammonium
sulfate or ammonium chloride are used in the culture of yeast.
In our experience, a sufficient amount of isotope-labeled protein
could be expressed while keeping costs to a minimum by using
2 g/L of 15N-ammonium chloride (7).
3. Foaming of the culture medium caused by agitation and aeration
will severely affect the yield of target protein due to a reduc-
tion in the level of protein expression and partial denaturation
of the expressed proteins (25). Foaming should be controlled
by adding antifoaming agents to the culture medium. We use
0.05–0.10% (w/v) of “Antifoam 204” (supplied by Sigma).
4. For many organisms, 2H2O-containing medium negatively affects
physiological processes. In many cases, stagnation of cell growth
occurs in 75–100% 2H2O, necessitating adaptation to the
deuterated medium by subcultivating the cells several times
prior to the induction of protein expression (2). Furthermore,
the selection of transformants using deuterated agar plates may
be effective in generating special transformants that are more
adaptable to deuterated medium.
5. Although glycerol is widely used as a carbon source for P. pastoris
during the growth phase, in the case of 13C-labeling, 13C-glycerol can
be replaced by 13C-glucose to reduce labeling costs (9, 26, 27).
6. An OD600 = 1 is equivalent to 5 × 10^7 P. pastoris cells per mL (2).
7. Nonmethanolic carbon sources, especially glucose, should be
completely eliminated (or consumed) prior to the addition of
methanol to achieve full induction efficiency of the AOX1 pro-
moter (2, 7, 20). However, the use of mixtures of carbon
sources during the induction phase to improve cell growth and
the production of target proteins has been reported (28–32).
8. The optimal starting point of the induction phase should be
determined by performing a time course study of cell growth
in a “cold-run” prior to beginning a “hot-run,” since the optimal
point for starting induction varies according to the type of target
protein and culture conditions.
9. The level of oxygen supplied to the fermentation medium should
be kept at 0.1–0.3 vvm (volume of oxygen (liters) per volume
of fermentation culture (liters) per minute) during the cultiva-
tion. This level can be easily achieved using any glass fermenter
and mixing the broth medium via an impeller in an air feeding
condition. Maintaining an adequate concentration of dissolved
O2 during cultivation is crucially important for sufficient cell
growth and protein expression (2, 20, 33).
10. Proteolytic degradation of heterologous proteins is one of the
drawbacks of the secretory expression strategy. In our experience,
this often occurs in a culture grown in a poor-nutrient medium
such as the one that may be found under isotope labeling
conditions. To prevent extensive proteolytic degradation of
target proteins, the following measures can be taken: (1) optimizing
the pH range of the culture medium (34); (2) cultivating
at low temperature (35–38); (3) adding extracts of algae, such
as BioExpress (26); (4) adding protease inhibitors (2, 27, 39),
although some types of protease inhibitors are toxic; and
(5) utilizing protease-deficient strains (34, 40). Several
protease-deficient strains of P. pastoris and K. lactis are
commercially available.
11. In the P. pastoris expression system, the methanol concentration
during induction directly affects cell growth and protein
production since an excess of methanol (> 1–2% (w/v)) and an
excessive rate of feeding are toxic to cells (5). Monitoring the
level of dissolved O2 in the culture medium is a convenient
method for assessing the timing for feeding and the appropriate
dose of methanol because sharp “spike” signals are detected
by the O2 electrode when the carbon sources are completely
consumed and/or when the concentration of methanol reaches
a toxic level (20, 33).
12. The typical induction time is between 48 and 96 h. However,
in the case of a deuterium-labeling culture, an induction period
longer than 96 h is typically required. The optimal induction
period and the amount of methanol should be determined by
following the time course of target protein expression level
using SDS-PAGE.
13. To achieve a higher level of deuteration, a glass tube filled with
dried granular calcium chloride should be connected in front
of the air inlet of the fermentation vessel to remove H2O from
the supplied air.
14. The optimal culture conditions (composition of media, pH,
temperature, etc.) vary according to the type of host and target
proteins. In our experience, the most critical factors for obtaining
sufficient protein expression using K. lactis are the concentra-
tion of the carbon source and the concentration of dissolved
oxygen. Especially, in cases involving the use of minimal media
for isotope labeling, we recommend fermentation cultivation
rather than flask culture. Furthermore, addition of isotopically
labeled extracts of algae, such as Celtone, BioExpress (both sup-
plied by Cambridge Isotope Laboratories, Inc.) or C.H.L. (Chlorella
Industry Co., Ltd.) significantly improves cell growth and
protein expression (41).
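Two of the conversions used in these Notes can be written out as a short calculation. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the OD600 conversion of Note 6 is read as cells per mL of culture, and the vvm value of Note 9 is treated as gas volume per culture volume per minute. The 500-mL and 1.0-L volumes are example values only.

```python
CELLS_PER_ML_PER_OD = 5e7  # Note 6, interpreted per mL of culture (assumption)


def cells_in_culture(od600: float, volume_ml: float) -> float:
    """Approximate number of P. pastoris cells in a culture of known OD600."""
    if od600 < 0 or volume_ml < 0:
        raise ValueError("OD600 and volume must be non-negative")
    return od600 * CELLS_PER_ML_PER_OD * volume_ml


def gas_flow_l_per_min(vvm: float, culture_volume_l: float) -> float:
    """Gas flow in L/min corresponding to a given vvm and culture volume."""
    if vvm < 0 or culture_volume_l < 0:
        raise ValueError("vvm and volume must be non-negative")
    return vvm * culture_volume_l


# Hypothetical examples: a 500-mL culture at OD600 = 1 and a 1.0-L culture.
print(f"{cells_in_culture(1.0, 500.0):.2e} cells")
for vvm in (0.1, 0.3):
    print(f"{vvm} vvm -> {gas_flow_l_per_min(vvm, 1.0):.2f} L/min")
```

Because the required flow scales with culture volume, the flow setting in a fed-batch run may need adjusting as the volume increases.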
Acknowledgments
This work was financially supported in part by the Ministry of
Economy, Trade and Industry (METI) and the New Energy and
Industrial Technology Development Organization (NEDO).
References
1. Cregg, J. M., Vedvick, T. S., and Raschke, W. C. (1993) Recent advances in the expression of foreign genes in Pichia pastoris. Biotechnology 11, 905–910.
2. Pickford, A. R., and O’Leary, J. M. (2008) Isotopic labeling of recombinant proteins from the methylotrophic yeast Pichia pastoris, in Methods in Molecular Biology (Downing, A. K., Ed.), vol. 278, pp. 17–33, Humana Press, Totowa, NJ.
3. Lin Cereghino, G. P., Cereghino, J. L., Ilgen, C., and Cregg, J. M. (2002) Production of recombinant proteins in fermenter cultures of the yeast Pichia pastoris. Curr. Opin. Biotechnol. 13, 329–332.
4. Daley, R., and Hearn, M. T. (2005) Expression of heterologous proteins in Pichia pastoris: a useful experimental tool in protein engineering and production. J. Mol. Recognit. 18, 119–138.
5. Cos, O., Ramón, R., Montesinos, J. L., and Valero, F. (2006) Operational strategies, monitoring and control of heterologous protein production in the methylotrophic yeast Pichia pastoris under different promoters: A review. Microbial Cell Factories 5, 17–36.
6. Takahashi, H., and Shimada, I. (2010) Production of isotopically labeled heterologous proteins in non-E. coli prokaryotic cells. J. Biomol. NMR 46, 3–10.
7. Invitrogen Corporation. EasySelect™ Pichia Expression Kit: A manual of methods for expression of recombinant proteins using pPICZ and pPICZα in Pichia pastoris, Version I. Available at http://tools.invitrogen.com/content/sfs/manuals/easyselect_man.pdf.
8. Laroche, Y., Strome, V., De Meutter, J., Messens, J., and Lauwereys, M. (1994) High-level secretion and very efficient isotopic labeling of tick anticoagulant peptide (TAP) expressed in the methylotrophic yeast, Pichia pastoris. Bio/Technol 12, 1119–1124.
9. Denton, H., Smith, M., Husi, H., Uhrin, D., Barlow, P. N., Batt, C. A., and Sawyer, L. (1998) Isotopically labeled bovine β-lactoglobulin for NMR studies expressed in Pichia pastoris. Protein Expr. Purif. 14, 97–103.
10. Wood, M. J., and Komives, E. A. (1999) Production of large quantities of isotopically labeled protein in Pichia pastoris by fermentation. J. Biomol. NMR 13, 149–159.
11. Mine, S., Ueda, T., Hashimoto, Y., Tanaka, Y., and Imoto, T. (1999) High-level expression of uniformly 15N-labeled hen lysozyme in Pichia pastoris and identification of the site in hen lysozyme where phosphate ion binds using NMR measurements. FEBS Lett. 448, 33–37.
12. van den Burg, H. A., de Wit, P. J., and Vervoort, J. (2001) Efficient 13C/15N double labeling of the avirulence protein AVR4 in a methanol-utilizing strain (Mut+) of Pichia pastoris. J. Biomol. NMR 20, 251–261.
13. Rodriguez, E., and Krishna, N. R. (2001) An economical method for 13C/15N isotopic labeling of proteins expressed in Pichia pastoris. J. Biochem. 130, 19–22.
14. Morgan, W. D., Kragt, A., and Feeney, J. (2000) Expression of deuterium-isotope-labelled protein in the yeast Pichia pastoris for NMR study. J. Biomol. NMR 17, 337–347.
15. Ichikawa, M., Osawa, M., Nishida, N., Goshima, N., Nomura, N., and Shimada, I. (2007) Structural basis of the collagen-binding mode of discoidin domain receptor 2. EMBO J. 26, 4168–4176.
16. New England Biolabs, Inc. K. lactis Protein Expression Kit: Instruction manual. Available at http://www.neb.com/nebecomm/ManualFiles/manualE1000.pdf.
17. Colussi, P. A., and Taron, C. H. (2005) Kluyveromyces lactis LAC4 promoter variants that lack function in bacteria but retain full function in K. lactis. Appl. Environ. Microbiol. 71, 7092–7098.
18. Read, J. D., Colussi, P. A., Ganatra, M. B., and Taron, C. H. (2007) Acetamide selection of Kluyveromyces lactis cells transformed with an integrative vector leads to high-frequency formation of multicopy strains. Appl. Environ. Microbiol. 73, 5088–5096.
19. Sugiki, T., Shimada, I., and Takahashi, H. (2008) Stable isotope labeling of protein by Kluyveromyces lactis for NMR study. J. Biomol. NMR 42, 159–162.
20. Invitrogen Corporation. Pichia Fermentation Process Guidelines. Available at http://toolsja.invitrogen.com/content/sfs/manuals/pichiaferm_prot.pdf.
21. Takahashi, H., Nakanishi, T., Kami, K., Arata, Y., and Shimada, I. (2000) A novel NMR method for determining the interface of large protein-protein complexes. Nat. Struct. Biol. 7, 220–223.
22. Nakanishi, T., Miyazawa, M., Sakakura, M., Terasawa, H., Takahashi, H., and Shimada, I. (2002) Determination of the interface of a large protein complex by transferred cross-saturation measurements. J. Mol. Biol. 318, 245–249.
23. Takahashi, H., Miyazawa, M., Ina, Y., Fukunishi, Y., Mizukoshi, Y., Nakamura, H., and Shimada, I. (2006) Utilization of methyl proton resonances in cross-saturation measurement for determining the interfaces of large protein-protein complexes. J. Biomol. NMR 34, 167–177.
24. Shimada, I., Ueda, T., Matsumoto, M., Sakakura, M., Osawa, M., Takeuchi, K., Nishida, N., and Takahashi, H. (2008) Cross-saturation and transferred cross-saturation experiments. Prog. Nucl. Magn. Reson. Spectrosc. 54, 123–140.
25. Koch, V., Rüffer, H.-M., Schügerl, K., Innertsberger, E., Menzel, H., and Weis, J. (1995) Effect of antifoam agents on the medium and microbial cell properties and process performance in small and large reactors. Process Biochem. 30, 435–446.
26. Macauley-Patrick, S., Fazenda, M. L., McNeil, B., and Harvey, L. M. (2005) Heterologous protein production using the Pichia pastoris expression system. Yeast 22, 249–270.
27. Shapiro, R. I., Wen, D., Levesque, M., Hronowski, X., Gill, A., Garber, E. A., Galdes, A., Strauch, K. L., and Taylor, F. R. (2003) Expression of sonic hedgehog-Fc fusion protein in Pichia pastoris. Identification and control of post-translational, chemical, and proteolytic modifications. Protein Expr. Purif. 29, 272–283.
28. Files, D., Ogawa, M., Scaman, C. H., and Baldwin, S. A. (2001) A Pichia pastoris fermentation process for producing high-levels of recombinant human cystatin-C. Enzyme Microbial Technol. 29, 335–340.
29. Zhang, W., Hywood Potter, K. J., Plantz, B. A., Schlegel, V. L., Smith, L. A., and Meagher, M. M. (2003) Pichia pastoris fermentation with mixed-feeds of glycerol and methanol: growth kinetics and production improvement. J. Ind. Microbiol. Biotechnol. 30, 210–215.
30. Xie, J., Zhang, L., Ye, Q., Zhou, Q., Xin, L., Du, P., and Gan, R. (2003) Angiostatin production in cultivation of recombinant Pichia pastoris fed with mixed carbon sources. Biotechnol. Lett. 25, 173–177.
31. McGrew, J. T., Leiske, D., Dell, B., Klinke, R., Krasts, D., Wee, S. F., Abbott, N., Armitage, R., and Harrington, K. (1997) Expression of trimeric CD40 ligand in Pichia pastoris: use of a rapid method to detect high-level expressing transformants. Gene 187, 193–200.
32. d’Anjou, M. C., and Daugulis, A. J. (2001) A rational approach to improving productivity in Pichia pastoris fermentation. Biotechnol. Bioeng 72, 1–11.
33. Cai, M., Huang, Y., Sakaguchi, K., Clore, G. M., Gronenborn, A. M., and Craigie, R. (1998) An efficient and cost-effective isotope labeling protocol for proteins expressed in Escherichia coli. J. Biomol. NMR 11, 97–102.
34. Zhang, Y., Liu, R., and Wu, X. (2007) The proteolytic systems and heterologous proteins degradation in the methylotrophic yeast Pichia pastoris. Annal. Microbiol. 57, 553–560.
35. Li, Z., Xiong, F., Lin, Q., d’Anjou, M., Daugulis, A. J., Yang, D. S., and Hew, C. L. (2001) Low-temperature increases the yield of biological active herring antifreeze protein in Pichia pastoris. Protein Expr. Purif. 21, 438–445.
36. Jahic, M., Gustavsson, M., Jansen, A.-K., Martinelle, M., and Enfors, S.-O. (2003) Analysis and control of proteolysis of a fusion protein in Pichia pastoris fed-batch processes. J. Biotechnol. 102, 45–53.
37. Jahic, M., Wallberg, F., Bollok, M., Garcia, P., and Enfors, S.-O. (2003) Temperature limited fed-batch technique for control of proteolysis in Pichia pastoris bioreactor cultures. Microbial Cell Factories 2, 6–17.
38. Surribas, A., Stahn, R., Montesinos, J. L., Enfors, S.-O., Valero, F., and Jahic, M. (2007) Production of Rhizopus oryzae lipase from Pichia pastoris using alternative operational strategies. J. Biotechnol. 130, 291–299.
39. Shi, X., Karkut, T., Chamankhah, M., Alting-Mees, M., Hemmingsen, S. M., and Hegedus, D. (2003) Optimal conditions for the expression of a single-chain antibody (scFv) gene in Pichia pastoris. Protein Expr. Purif. 28, 321–330.
40. Yao, X. Q., Zhao, H. L., Xue, C., Zhang, W., Xiong, X. H., Wang, Z. W., Li, X. Y., and Liu, Z. M. (2009) Degradation of HSA-AX15(R13K) when expressed in Pichia pastoris can be reduced via the disruption of YPS1 gene in this yeast. J. Biotechnol. 139, 131–136.
41. Madduri, K., Badger, M., Li, Z.-S., Xu, X., Thornburgh, S., Evans, S., and Dhadialla, T. S. (2009) Development of stable isotope and selenomethionine labeling methods for proteins expressed in Pseudomonas fluorescens. Protein Expr. Purif. 65, 57–65.
Chapter 3
Isotope Labeling in Insect Cells

Krishna Saxena, Arpana Dutta, Judith Klein-Seetharaman,
and Harald Schwalbe
Abstract
Recent years have seen remarkable progress in applying nuclear magnetic resonance (NMR) spectroscopy
to proteins that have traditionally been difficult to study due to issues with folding, posttranslational
modification, and expression levels or combinations thereof. In particular, insect cells have proved useful
in allowing large quantities of isotope-labeled, functional proteins to be obtained and purified to homoge-
neity, allowing study of their structures and dynamics by using NMR. Here, we provide protocols that
have proven successful in such endeavors.
Key words: Isotope labeling, Baculovirus, Nuclear magnetic resonance, Recombinant protein
expression, Insect cells
1. Introduction: Baculovirus-Insect Cell Expression System
Isotope labeling of proteins represents an important and often
required tool for the application of nuclear magnetic resonance
(NMR) spectroscopy to investigate the structure and dynamics of
proteins. So far, the great majority of isotope-labeled proteins have
been expressed in Escherichia coli (1) because of the ease of cloning
and expressing proteins at low cost. When possible, protein pro-
duction should be performed in a prokaryotic system (i.e., E. coli),
since this strategy is the most cost-effective and allows the most
flexibility. However, human or complex proteins often cannot be
expressed in E. coli in an active, correctly folded, posttranslationally
modified form (glycosylated, phosphorylated, etc.). The capacity
of E. coli for protein folding and forming disulfide bonds is not
sufficient for many recombinant proteins, although there are a
number of new developments to overcome these limitations:
1. Decreasing the temperature of the cell culture (2, 3)
2. Coexpressing molecular chaperones (4)
3. Fusing highly soluble tags (GST, MBP, TrxA, NusA, SUMO, etc.) to
the target proteins (5)
4. Overexpressing (6, 7) the target protein or using an engineered
E. coli strain capable of forming disulfide bonds in the
cytoplasm (e.g., SHuffle, New England Biolabs)
5. Refolding in vitro
However, low-expression yield and solubility problems of the
target protein in E. coli often force the change from a bacterial
recombinant protein expression system to a eukaryotic host.
Numerous eukaryotic-based expression systems are currently
available in protein expression laboratories. For NMR use, only
three expression hosts are generally considered: yeast (Pichia pastoris
(8), Hansenula polymorpha or Kluyveromyces lactis), baculovirus-
mediated insect cells, and mammalian cells. More recently, the
generation of baculoviruses harboring mammalian promoters
(BacMams) has extended the use of baculovirus-mediated expression
systems (BvE) to the development of gene delivery (9) and
expression vectors in mammalian cells (10). BacMams cannot rep-
licate in mammalian cells, which renders them a much safer alter-
native to conventional virus vectors (11).
Yeast offers a powerful, simple system for expressing recombi-
nant proteins. Besides the capability of performing posttransla-
tional modification to the recombinant protein, the main advantage
of these expression hosts is the feasibility of isotope labeling in
simple minimal-defined media. Therefore, the costs of 15N, 13C, or
2H uniform isotope incorporation are negligible in comparison to
the costs of other eukaryotic cell media. Higher eukaryotes need
well-defined expression media supplemented with expensive,
labeled amino acids. But there are also disadvantages to recombi-
nant protein expression in yeast. Since they provide N- and O-linked
high-mannose-type glycans that could be immunogenic in humans
(12, 13), the production of glycosylated proteins in yeast is not
optimal, especially if glycosylation is required for the biological
functionality of the target protein. Additionally, yeast cannot per-
form tyrosine O-sulfation (14), and proteins whose native forms
are nonglycosylated may be hyperglycosylated when expressed in
yeast (15). A recent study reported the successful reengineering of
the glycosylation pathways in P. pastoris to allow the expression of
recombinant proteins with human-type glycans (16). This may
allow future improvements in the expression of glycosylated
proteins in yeast. However, glycosylation is not the only challenge
for successful expression of proteins in yeast. Certain proteins are
expressed at low levels even when glycosylation is not necessarily
an issue (17–19). While the reasons are not fully clear, low-expression
yields have been attributed to defects in the ER folding machinery
(20–22). Even for proteins in which high-expression yields can be
obtained, much of the protein may be misfolded (23). In order to
detect which part of the folding machinery may be responsible,
protein disulfide isomerase (PDI), an enzyme that catalyzes disul-
fide exchange in the ER, was overexpressed and found to enhance
protein yields (24, 25). On the other hand, human adenosine A2A
receptor levels were not found to increase with an increase in PDI
expression (26). Thus, the process of disulfide bond formation in
yeast remains uncertain. Finally, the case is most uncertain for
membrane proteins, where there is a lack of knowledge on folding
mechanisms and also on the way chaperones and the translocon
participate in the folding pathway. Therefore, it is currently recom-
mended to use higher eukaryotic hosts with advanced cell machin-
ery systems for the production of recombinant proteins that have
to be glycosylated, disulfide bonded, and/or membrane inserted
for functional activity.
More advanced cellular machinery for posttranslational
modifications is offered by the baculovirus-mediated
expression system in insect cells (BvE) and by mammalian
expression methods. The BvE is one of the most efficient and popular
systems among the eukaryotic hosts to use for expressing recombi-
nant proteins. Therefore, its application is widespread in industrial
as well as in academic environments for structural and functional
studies of diverse proteins. However, for the production of therapeutic
recombinant proteins, mammalian expression systems (mainly
Chinese hamster ovary (CHO) cells and human embryonic kidney
(HEK) 293 cells) are required. Since CHO and HEK293 cells have
been extensively characterized and have the ability for human-like
glycosylation, the production of recombinant therapeutic monoclonal
antibodies and Fc fusion proteins in these hosts is safe. The
rate of production of therapeutic proteins, the largest class of new
products being developed by the biopharmaceutical industry, has
increased significantly in recent years (27).
Mammalian expression systems have conventionally been
considered to be too weak and inefficient for protein expression.
However, recent advances have significantly improved the expres-
sion levels of these systems. This chapter and the following one,
therefore, attempt to provide an overview of some of the recent
developments in expression strategies for baculovirus-mediated
insect cell and mammalian expression systems in view of NMR
investigations. Since NMR requires isotope-labeled protein, the
focus of this article is directed toward strategies to incorporate
stable isotopes (15N, 13C) into the target protein in insect and
mammalian cells (see Chapter 4). A major bottleneck of uniform
labeling in higher eukaryotes is the high cost of complex medium
with labeled amino acids. Another limitation of these hosts is that
they cannot survive in deuterium oxide (D2O)-containing media,
so cost-effective generation of perdeuterated proteins is not
available, either for insect or for mammalian cell systems.
It is beyond the scope of both chapters to discuss features of
the BvE or mammalian expression systems in detail. Comprehensive
guides and detailed methodologies for the construction and analysis
of recombinant baculovirus for insect cell expression, maintenance
of insect cells in culture, and analysis of recombinant protein
expression can be found elsewhere (28, 29), including in Baculovirus
Manuals from Invitrogen, Pharmingen, Novagen, and others.
However, given the number of different BvE strategies, we begin
this chapter by explaining the principle of BvE and subsequently
give a short survey of BvE options.
1.1. Principle of Baculovirus-Mediated Recombinant Protein Expression
The insect cell baculovirus-mediated expression system (BvE) is a
powerful platform to rapidly produce high levels of recombinant
proteins (see ref. 28 for an excellent review). Unlike bacterial
expression hosts, the baculovirus system relies on a eukaryotic expression
system and thus offers protein modification, sorting, and transpor-
tation machineries similar to those found in higher eukaryotic
organisms. Baculoviruses are insect viruses that predominantly
infect insect larvae of the order Lepidoptera (butterflies and moths)
(30). A baculovirus expression vector is a recombinant baculovirus
that has been genetically modified to contain a foreign gene of
interest, which can be expressed in insect cells under the control of
a baculovirus gene promoter. The BvE uses a helper-independent
virus that can be propagated to high titers in insect cells adapted for
growth in suspension cultures, enabling the production of large
amounts of protein with relative ease (31). Finally, baculoviruses
are noninfectious to vertebrates and their promoters have been
shown to be inactive in mammalian cells (32).
The most commonly used baculovirus for recombinant
protein expression is Autographa californica multicapsid
nucleopolyhedrovirus (AcMNPV) (33). AcMNPV is a large (130 kb),
lytic, double-stranded DNA virus and can accommodate large
segments of foreign DNA for the expression of recombinant protein
(34). The BvE is based on the infection of cultured insect cells by
a recombinant virus vector in which the target DNA (or multiple
genes) is integrated under the control of the strong viral polyhedrin
promoter (28). The polyhedrin gene (polh) is necessary for
the formation of polyhedral or occlusion bodies in the cell nucleus,
but is nonessential for viral replication in insect cells. Polyhedra are
large particles that appear in the nuclei of AcMNPV-infected insect
cells. The first recombinant baculoviruses were generated by replac-
ing the viral polyhedrin gene with a foreign gene of interest through
homologous recombination (33). Homologous exchange between
the flanking sequences common to both DNA molecules facilitates
the insertion of the gene of interest into the viral genome at the
polh locus, resulting in the production of a recombinant virus
genome and allowing the powerful polyhedrin promoter to drive
protein expression of the foreign gene. Since the efficiency of
homologous recombination is quite low, identification, isolation,
and selection of recombinant virus were traditionally achieved by
labor-intensive, technically demanding plaque assays. Due to the
deletion of the polyhedrin gene, the recombinant plaque has a
more clearly distinct morphology than the parental virus contain-
ing the polh gene. Subsequently, additional rounds of plaque
screening are required to separate the desired recombinant virus
from the parental wild-type virus. However, discriminating between
polyhedron-positive and -negative plaques and isolating recombi-
nant virus turned out to be a serious problem for many investiga-
tors who used the BvE for the production of recombinant proteins.
Nowadays, these technical issues and the time-consuming plaque
purification processes are eliminated. In the next section, some of
the key developments in the BvE are presented and discussed.
1.2. Commercially Available Baculovirus Expression Systems
Generally, the baculovirus genome is considered too large to insert
a foreign gene directly. In most applications of the BvE, the gene
of interest is therefore cloned into a transfer vector, which contains
sequences that flank the polyhedrin gene in the baculovirus
genome. The virus genome and the transfer vector are cotrans-
fected into the insect cells and the gene of interest inserts into the
virus genome via homologous recombination (see Fig. 1) under
the control of the strong late viral polyhedrin promoter (35). Since
a mixture of recombinant and original parental virus is produced
after the initial replication, time-consuming plaque purification
and isolation are required before protein expression can proceed.
BacVector (Merck Biosciences), Baculo-Gold and pBacPAK (BD
Biosciences), and Bac-N-Blue (Invitrogen) are commercially avail-
able BvEs that use homologous recombination to integrate foreign
genes into the virus genome.
New developments in generating recombinant virus by
site-specific transposition (Bac-to-Bac or BaculoDirect, Invitrogen),
and progress in recombination methodology using an engineered
baculovirus that carries a lethal mutation in an essential gene,
open reading frame 1629 (ORF1629) (flashBAC from Oxford Expression
Technologies, or BacMagic from EMD Chemicals, Novagen), have
facilitated the use of BvE for a larger user community (28).
Generally, these improvement strategies can be classified into transfer
plasmid modifications and parental baculovirus genome modifica-
tion (28). The flashBAC and BacMagic are the most promising
BvEs so far, since the efficiency of recombination in both systems
is 100%. Therefore, these BvEs overcome the requirement of time-
consuming plaque assays and protein expression can be started
directly after one or two rounds of virus amplification. This technology
reduces the production of recombinant virus to a one-step procedure,
fully amenable to high-throughput and automated production
systems. Moreover, this approach is compatible with all
baculovirus transfer vectors based on homologous recombination
in insect cells at the polyhedrin locus, including several multigene
coexpression plasmids.

Fig. 1. Construction of baculovirus recombinants with Novagen® BacMagic™ system.
This expression system is based on a modified baculovirus genome containing a bacterial
artificial chromosome (BAC) at the polyhedrin locus and a partial deletion of the essential
ORF1629 viral gene. The BacMagic DNA is mixed with a transfer vector, containing a foreign
gene at the polh locus and the complete ORF1629, to generate the recombinant virus via
homologous recombination in insect cells. Picture modified from the Novagen manual.
© EMD Chemicals Inc., an Affiliate of Merck KGaA, Darmstadt, Germany. BacMagic™ and
Novagen® are trademarks of Merck KGaA.

The flashBAC and BacMagic technology is based on a
modified bacmid: the AcMNPV baculovirus genome in which
a portion of the essential viral gene (ORF1629) is deleted. In addition,
a bacterial artificial chromosome (BAC) replaces the
polyhedrin-coding region. This combination prevents nonrecombinant
virus from replicating in insect cells, yet allows the viral DNA to be
propagated as circular DNA in bacteria. This circular viral DNA is
then isolated and purified from bacterial cells (flashBAC or
BacMagic DNA provided in the kits). Homologous recombination
with a compatible expression plasmid (containing the gene of interest
flanked by the lef2 and ORF1629 recombination sites) restores the
function of the viral ORF1629, allowing the virus DNA to replicate,
and replaces the BAC sequence with the target coding sequence
under the control of the polyhedrin promoter (Fig. 1). Since only
recombinant viruses with a restored ORF1629 can replicate, this
results in a unique recombinant virus population. This population
can then be used directly to infect a larger insect cell culture
(50–200 mL) to produce a high-titer working stock.
1.3. Insect Cell Lines
The main insect cell lines used for cotransfections and baculovirus
amplification are Spodoptera frugiperda Sf9 or Sf21 (derivatives of
the fall armyworm). Trichoplusia ni BTI 5B1-4 (36) (High Five™)
cells are generally used for the production of secreted recombinant
proteins and not for virus production because of the increased pos-
sibility of generating virus mutants (37). Due to the high-mannose
and paucimannose types of glycosylation that are obtained in insect
cells, no therapeutic protein is currently produced using this sys-
tem as this would compromise in vivo bioactivity and potentially
induce allergenic reactions. Engineering insect cells with glycosyl-
transferases allows the production of proteins with mammalian-
type sugars (38).
1.4. Expression of Labeled Recombinant Protein in the Baculovirus Expression System
Due to the high costs of incorporating stable isotopes into insect
cells, it is recommended that the recombinant protein expression
is optimized using the chosen BvE before starting with labeled
fermentation. There are only a few studies reporting the incorporation
of stable isotopes into proteins expressed by baculovirus-mediated
insect cell expression (39–47).
In contrast to the initial trials of amino acid-type selective
labeling of proteins in user-defined insect cell media, nowadays
there are commercial media (BioExpress-2000, Cambridge Isotope
Laboratories, CIL) available for the different labeling applications.
Expression of the uniformly 13C,15N-labeled Abelson kinase domain
(13C,15N BioExpress-2000) provided the first example of backbone
NMR resonance assignments of a recombinant protein expressed
using the BvE (48). So far, most uniform labeling protein work in
insect cells is only performed by industrial research groups due to
the extraordinary costs of the required media. It is not possible to
cultivate insect cells in minimal medium, since this host requires
essential amino acids for its growth. Reports of selective amino
acid isotope labeling in BvE are more frequently cited in the litera-
ture, since this approach is easy and fast and does not require
expensive medium for labeling. Even in the absence of a backbone
assignment of the target protein, structural information can be
deduced from the selective labeling approach based on an existing
X-ray structure of the protein. From a practical aspect, it should be
considered that insect cells have essential and nonessential amino
acids (alanine, cysteine, glutamic acid, glutamine, aspartic acid,
and asparagine) whose content in the medium
depends on the specific provider of the medium. Before starting
with site-specific amino acid labeling, the unlabeled quantity of the
desired amino acid in the medium should be checked to calculate
the required amount of labeled amino acid to add. In Fig. 2, a
schematic presentation of amino acid metabolism is shown for
E. coli and insect cells (42). Interestingly, selective labeling of
amino acids in insect cells is more effective than in bacteria, since the
amino acid pathways in insect cells do not harbor as many
aminotransferases as those in prokaryotes, where aminotransferase
activity leads to cross-labeling problems.

Fig. 2. A schematic presentation of amino acid metabolism in E. coli and Sf9 with respect to 15N: The black arrows symbolize
pathways present in both expression hosts. Pathways that only exist in E. coli are shown in grey. The strength of the arrows
reflects the intensity of the conversion. Picture modified from Bruggert et al. (42).

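The calculation of the required amount of labeled amino acid mentioned above can be sketched as a simple dilution estimate. This is only an illustration under stated assumptions: it treats the labeled and unlabeled amino acid as one well-mixed pool, ignores metabolic scrambling and the small mass difference between labeled and unlabeled forms, and uses a hypothetical unlabeled concentration of 100 mg/L that does not come from the text or from any medium data sheet. The function names are likewise invented for this example.

```python
def labeled_fraction(labeled_mg_per_l: float, unlabeled_mg_per_l: float) -> float:
    """Fraction of the amino acid pool that carries the label (simple dilution)."""
    total = labeled_mg_per_l + unlabeled_mg_per_l
    if total <= 0:
        raise ValueError("total amino acid concentration must be positive")
    return labeled_mg_per_l / total


def labeled_amount_needed(unlabeled_mg_per_l: float, target_fraction: float) -> float:
    """Labeled amino acid (mg/L) needed to reach a target labeled fraction."""
    if not 0 <= target_fraction < 1:
        raise ValueError("target_fraction must be in [0, 1)")
    return unlabeled_mg_per_l * target_fraction / (1 - target_fraction)


if __name__ == "__main__":
    # Hypothetical medium content: 100 mg/L unlabeled Phe (assumption, not from the text).
    unlabeled = 100.0
    for added in (300.0, 600.0):  # range of labeled amino acid quoted in Note 23
        frac = labeled_fraction(added, unlabeled)
        print(f"{added:.0f} mg/L added -> {frac:.0%} of the pool labeled")
    needed = labeled_amount_needed(unlabeled, 0.9)
    print(f"For a 90% labeled pool, add about {needed:.0f} mg/L")
```

In practice, the unlabeled concentration would be taken from the medium provider or measured, and the actual incorporation should be verified experimentally (e.g., by mass spectrometry).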
2. Materials
2.1. Cotransfection of Insect Cells
1. 35-mm tissue culture dishes.
2. Sf9 or Sf21 insect cells.
3. SF900 II insect cell culture medium (serum-free, antibiotic-
free; Invitrogen).
4. Insect GeneJuice® transfection reagent (Novagen) (see Note 1).
5. BacMagic DNA: 100 ng (5 μL) per cotransfection (20 ng/μL).
6. Sterile baculovirus transfer vector DNA containing the gene
under investigation (500 ng per cotransfection).
7. Plastic box to house dishes in the incubator.
8. Sterile pipettes, bijoux (sterile tubes).
2.2. Amplification of Recombinant Virus
1. Recombinant virus seed stock (Subheading 3.1).
2. Sf9 insect cells.
3. SF900 II insect cell culture medium (Subheading 2.1).
4. Inverted phase-contrast microscope.
2.3. Analysis of Recombinant Protein Expression
1. 35-mm tissue culture dishes.
2. Sf9 or High Five insect cells.
3. SF900 II insect cell culture medium (Subheading 2.1).
4. Recombinant virus stock (Subheading 3.2).
5. Phosphate-buffered saline (PBS; Invitrogen): pH 6.2, sterilize
by autoclaving.
2.4. Production of Isotopically Labeled Protein
1. Sf9 insect cells.
2. SF900 II insect cell culture medium (Subheading 2.1).
3. BioExpress-2000-U (CIL): Unlabeled insect cell culture
medium.
4. BioExpress-2000-CN (CIL): 15N-, 13C-labeled insect cell cul-
ture medium.
5. Recombinant virus stock (Subheading 3.2).
6. Protease inhibitor (Complete™, Roche).
7. PBS (Subheading 2.3).
3. Methods
For a general overview of the BvEs and cloning, expression, analysis,
and purification of recombinant proteins in insect cells, please refer
to ref. 29. The following procedures describe the cloning of a recom-
binant baculovirus expression vector, production (Subheading 3.1)
and amplification of recombinant baculovirus (Subheading 3.2),
analysis of recombinant protein expression (Subheading 3.3), and
finally production of isotope-labeled protein (Subheading 3.4)
expressed in insect cells. All of the procedures are based on the flash-
BAC or BacMagic systems (no plaque purification required) and
must be carried out using sterile technique.
Since the flashBAC system is compatible with all transfer vectors
designed for homologous recombination in insect cells at the polh locus
(BacPAK technology), the target gene can be cloned into many
suitable transfer vectors (see Note 2). Moreover, this BvE is
compatible with traditional (T4 DNA ligase) and elegant recombi-
natorial cloning techniques, such as Creator or In-Fusion (BD
Biosciences) (see Note 3). While Gateway (Invitrogen) is one of
the most popular recombinatorial cloning systems, it is limited to
specifically engineered expression vectors with λ recombination sites.
These lead to incorporation of additional amino acids into the
protein of interest.
A new development to avoid repeated cloning of target genes into
host-specific expression vectors is the triple-host transfer vector
pTriEx from Novagen. Because three promoters are present in parallel
in this vector series, recombinant protein can be expressed in
vertebrate cells, insect cells, and bacteria.
3.1. Cotransfection of Insect Cells
For efficient transfection, high-quality DNA of the transfer plasmid
is prepared using commercially available plasmid DNA purification
kits (Qiagen, Novagen) (see Note 4). Additionally, it is recommended
to use fresh, rapidly proliferating cells (see Note 5) for
transfection experiments and to have positive and negative
transfection controls (see Note 6).
1. For each cotransfection, prepare one 35-mm dish. Seed the
dishes with insect cells at least 1 h before use. Use 1 × 10^6 cells/
dish for Sf9 cells and 1.5 × 10^6 cells/dish for Sf21 cells in 2 mL
of SF900 II insect cell culture medium. Allow the cells to
attach by incubating at 28°C for 1 h.
2. During the 1-h incubation period, prepare a DNA–liposome
complex cotransfection mix of BacMagic DNA and Insect
GeneJuice® transfection reagent for each transfection. Assemble
the following components, in the order listed, in a sterile tube
(bijoux) (see Note 7):
1 mL serum-free, antibiotic-free SF900 II insect cell culture
medium
5 μL Insect GeneJuice
5 μL BacMagic DNA (100 ng total)
5 μL Transfer vector DNA (500 ng total)
1.015 mL Total volume
3. Incubate at room temperature for 15–30 min to allow the
DNA–liposome complexes to form.
4. Remove the culture medium from the 35-mm dishes of cells
using a sterile pipette, ensuring that the cell monolayer is not
disrupted (see Note 8).
5. Immediately after the medium has been removed from the
cells, add the 1 mL of the DNA–liposome complex dropwise
to the center of each dish (see Note 9) and incubate in a plastic
sandwich box at 28°C for a minimum of 5 h or overnight.
6. After the incubation period, add another 1 mL of SF900 II
insect cell culture medium to each dish and continue the
incubation for 5 days in total.
7. Following the 5-day incubation period (see Note 10), harvest
the medium containing the recombinant virus into a sterile
bijoux and store in the dark at 4°C. This is the seed stock of
recombinant baculovirus (see Note 11). Due to the limited
size of the stock, the next step is to amplify the virus
(Subheading 3.2).
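The cotransfection mix assembled in step 2 can be checked with a short calculation. The sketch below is an illustration only: the DNA masses and reagent volumes are those listed in step 2, the BacMagic concentration (20 ng/μL) is taken from Subheading 2.1, and the transfer vector concentration of 100 ng/μL is inferred from "5 μL (500 ng total)". The function name and the idea of totalling reagent consumption over several dishes are assumptions made for this example; the protocol itself prepares one mix per transfection.

```python
def cotransfection_mix(n_dishes: int = 1,
                       bacmagic_ng_per_ul: float = 20.0,
                       vector_ng_per_ul: float = 100.0) -> dict:
    """Return the reagent volumes (in microliters) consumed by n_dishes cotransfections."""
    if n_dishes < 1:
        raise ValueError("n_dishes must be at least 1")
    if bacmagic_ng_per_ul <= 0 or vector_ng_per_ul <= 0:
        raise ValueError("DNA concentrations must be positive")
    per_dish = {
        "medium_ul": 1000.0,                        # 1 mL serum-free SF900 II medium
        "genejuice_ul": 5.0,                        # Insect GeneJuice
        "bacmagic_ul": 100.0 / bacmagic_ng_per_ul,  # 100 ng BacMagic DNA
        "vector_ul": 500.0 / vector_ng_per_ul,      # 500 ng transfer vector DNA
    }
    per_dish["total_ul"] = sum(per_dish.values())
    return {name: volume * n_dishes for name, volume in per_dish.items()}


if __name__ == "__main__":
    for name, volume in cotransfection_mix(n_dishes=1).items():
        print(f"{name:>13}: {volume:8.1f}")
```

If the transfer vector preparation has a different concentration, only `vector_ng_per_ul` changes; the 500 ng mass per cotransfection stays fixed.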
3.2. Amplification of Recombinant Virus
Amplification of the recombinant virus (produced in Subheading 3.1)
is necessary before proceeding with recombinant
protein expression. The following provides a protocol (adapted
from the Novagen and flashBAC Manual) for amplification and
preparation of high-titer recombinant virus (passage 1 stock) in
cells grown in suspension culture.
1. Observe the health and viability of cells under an inverted
phase-contrast microscope (see Note 12).
2. Prepare a 100–200 mL culture of Sf9 cells at an appropriate
cell density in serum-free SF900 II insect cell culture medium
(e.g., 2 × 10^6 Sf9 cells/mL in log-phase growth; high aeration
is recommended (see Note 13)). Cells should be infected at a
multiplicity of infection (MOI) of <1 pfu/cell.
3. Add 0.5 mL of recombinant virus seed stock to the cell cul-
ture. Incubate with shaking until the cells are well-infected
(usually, 4–5 days) (see Note 14).
4. When the cells appear to be well-infected with virus, harvest
the cell culture medium by centrifugation at 1,500 × g for
15 min at 4°C. Remove the supernatant aseptically. Store the
supernatant (recombinant virus stock) in the dark at 4°C (see
Note 15).
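The MOI condition in step 2 can be estimated from the numbers given in this protocol. The sketch below is a minimal illustration, assuming the seed stock has the expected titer of about 1 × 10^7 pfu/mL stated in Note 11 (an actual titer would require a plaque assay); the function name is invented for this example. With 0.5 mL of seed stock added to 100 mL of culture at 2 × 10^6 cells/mL, the estimate is about 0.025 pfu/cell, consistent with the requirement of an MOI below 1.

```python
def multiplicity_of_infection(virus_ml: float, titer_pfu_per_ml: float,
                              culture_ml: float, cells_per_ml: float) -> float:
    """MOI = plaque-forming units added / number of cells in the culture."""
    total_cells = culture_ml * cells_per_ml
    if total_cells <= 0:
        raise ValueError("the culture must contain cells")
    return virus_ml * titer_pfu_per_ml / total_cells


if __name__ == "__main__":
    # Seed stock titer is the expected value from Note 11, not a measured one.
    for culture_ml in (100.0, 200.0):
        moi = multiplicity_of_infection(0.5, 1e7, culture_ml, 2e6)
        print(f"{culture_ml:.0f} mL culture: MOI of about {moi:.4f} pfu/cell")
```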
3.3. Analysis of Recombinant Protein Expression
Before proceeding with use of the virus in subsequent protein
expression experiments, a plaque assay to determine the accurate
titer is strongly recommended by the manufacturer of the flashBAC
and BacMagic technology (see Note 16). This allows the
calculation of MOI and ensures cross-referencing and reproduc-
ibility in the following experiments. However, for analysis of protein
expression, it is not absolutely required to know the titer of the
recombinant virus (see Note 17). Pilot expression analysis can also
be performed by infecting cells with different amounts of P1 virus
and monitoring the expression of the recombinant protein.
According to the protein expression literature, it is recommended
that exponentially growing cells should be infected at a high MOI
to ensure that all cells are infected simultaneously and that the cul-
ture is synchronous. However, in our lab, various P1 viruses are
directly added (0.5–10%) to the insect cells for determining the
best ratio for optimal recombinant protein expression without
knowing the exact titer of the different viruses. The best time to
harvest the recombinant protein can be examined by taking samples
at different time points after infection (see Note 18). It is recom-
mended to also prepare noninfected or mock-infected cells as a
control for host cell proteins and a positive control of a successfully
tested recombinant virus. Recombinant protein expression can
then be evaluated by SDS-PAGE or Western blot analysis.
Additionally, protein expression should be compared between Sf9
and High Five cell lines, especially if the protein of interest is
secreted (see Note 19). The following protocol is adapted from
the Novagen and flashBAC Manual.
1. Seed 35-mm dishes (one per virus to be tested plus a positive
and negative control) with 1 × 10^6 Sf9 or High Five cells per
dish and allow the cells to attach by incubating at 28°C for 1 h.
2. Remove the medium, add 200 μL of the recombinant virus
stock to be tested, and incubate at room temperature for an
additional hour.
3. Remove the virus inoculum, add 1.5 mL of SF900 II insect cell
culture medium, and incubate at 28°C for 48 h.
4. Use a sterile pipette tip to scrape the cells from the dishes into
suspension and to transfer them into sterile Eppendorf tubes.
5. Centrifuge the cells at 1,500 × g for 10 min at room tempera-
ture. Remove the supernatant and discard (see Note 20).
6. Resuspend the pellet (protein of interest in the cytosol) in
80 μL of PBS.
7. Analyze the extent of protein overexpression by using SDS-
PAGE and Western blotting if a suitable antibody is available
for the recombinant protein (see Note 21 and Fig. 3).
Fig. 3. Protein overexpression and purification using insect cells. SDS-PAGE (left), Western blot (right): Protein samples
loaded are BC (before Ni-NTA chromatography), FT (flowthrough), M (molecular weight marker), W (wash), E (elution).
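The pilot expression analysis described in Subheading 3.3, in which different amounts of P1 virus (0.5–10%) are tested and samples are taken at several time points, can be laid out as a small screening plan. The sketch below is an illustration under assumptions: the percentages are interpreted as volume of virus stock per volume of culture, the intermediate percentages (1, 2, and 5%) and the 2 mL culture volume are arbitrary choices for the example, and the harvest times are the commonly used ones listed in Note 18. The function name is hypothetical.

```python
from itertools import product


def pilot_screen(culture_ml: float,
                 virus_percent=(0.5, 1.0, 2.0, 5.0, 10.0),
                 harvest_hpi=(24, 48, 72, 96)) -> list:
    """List every combination of virus amount (% v/v) and harvest time (hours post infection)."""
    if culture_ml <= 0:
        raise ValueError("culture_ml must be positive")
    plan = []
    for percent, hpi in product(virus_percent, harvest_hpi):
        plan.append({
            "virus_percent": percent,
            "virus_ml": culture_ml * percent / 100.0,  # volume of P1 virus stock to add
            "harvest_hpi": hpi,
        })
    return plan


if __name__ == "__main__":
    for condition in pilot_screen(culture_ml=2.0):
        print(f"{condition['virus_percent']:5.1f}% virus "
              f"({condition['virus_ml'] * 1000:6.1f} uL), "
              f"harvest at {condition['harvest_hpi']} hpi")
```

Noninfected or mock-infected cells and a positive control virus, as recommended in the text, would be added to such a plan as separate samples.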
3.4. Production of Isotopically Labeled Protein
While fermenters and bioreactors are now routinely used to produce
recombinant proteins in insect cells, good yields can be achieved in
shake cultures at a fraction of the cost. Large, disposable shake
flasks can be obtained that hold up to 1.5 L of insect cell culture
on a shaking platform in a warm room or incubator. The most suitable
culture protocol for isotope labeling of a recombinant protein
in insect cells involves an initial growth phase for the insect cells in
an unlabeled growth/expression medium and a subsequent expression
phase, where cells are centrifuged and resuspended into the
labeling medium prior to infection with the recombinant baculovirus
(see Note 22). This medium exchange by centrifugation allows much
higher expression levels of the labeled protein than growth and
expression in the labeling medium throughout, without a change of
medium. Recently, isotope-labeled expression media for insect cells
have become commercially available (BioExpress-2000, CIL), but
the source and the concentration of the amino acids in these media
are not published. However, for site-specific labeling of amino
acids in proteins expressed in insect cells, adding the isotope-
labeled amino acid of interest to a standard insect cell medium
before infection (see Note 23) also works well and the cost for this
labeling strategy is comparable to that when using E. coli.
The following protocol is adapted from the Application Note:
Efficient uniform isotope labeling of proteins expressed in
Baculovirus-infected insect cells using BioExpress® 2000 (Insect cell)
medium by André Strauss, Gabriele Fendrich, and Wolfgang Jahnke.
Prior to performing isotope labeling of a protein, optimize the
culture and BV-infection conditions in unlabeled medium (e.g.,
BioExpress 2000-U or SF900 II for expression of the protein).
1. Cultivate several 100 mL cultures of Sf9 cells, adapted to
growth in serum-free SF900 II medium, in 500-mL Erlenmeyer
flasks for 3 days at 27°C with shaking at 90 rpm.
2. Prepare the uniform isotope-labeling medium (see Note 24).
3. When the final cell density of the culture has reached 1.5 × 10^6
cells/mL (~3 days), centrifuge the cells under sterile conditions
at 400 × g for 20 min at 20°C.
4. Resuspend the pelleted cells in 100-mL portions in labeled
BioExpress® 2000-CN medium and transfer them to fresh
500-mL Erlenmeyer flasks.
5. Add the recombinant virus stock (titer of 0.5–2 × 10^8 pfu/mL)
to an MOI of 1–2, according to optimized conditions.
6. Grow the 100 mL cultures of baculovirus-infected Sf9 cells for
3 days post infection in labeled BioExpress® 2000-CN medium
at 27°C, with shaking at 90 rpm (see Note 25).
7. Harvest the cells expressing the labeled recombinant protein
by centrifuging at 400 × g for 20 min at 20°C (see Note 26);
resuspend the pelleted cells in 20 mL of PBS containing
protease inhibitor mix (see Note 27), followed by a second
centrifugation in 50-mL plastic tubes at 400 × g for 20 min at
20°C. Store the pelleted cells at −80°C.
8. Isolate and purify the recombinant protein according to
protocols generated for the unlabeled protein.
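The volume of virus stock needed in step 5 follows from the target MOI, the number of cells, and the titer. The sketch below is a minimal illustration using the values given in steps 3 to 5 (100 mL cultures at 1.5 × 10^6 cells/mL, titers of 0.5–2 × 10^8 pfu/mL, MOI of 1–2); it assumes that the cell number is unchanged by the centrifugation and resuspension step, and the function name is invented for this example.

```python
def virus_volume_ml(target_moi: float, culture_ml: float,
                    cells_per_ml: float, titer_pfu_per_ml: float) -> float:
    """Volume of virus stock (mL) that delivers target_moi pfu per cell."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    if target_moi < 0 or culture_ml < 0 or cells_per_ml < 0:
        raise ValueError("MOI, culture volume, and cell density must be non-negative")
    return target_moi * culture_ml * cells_per_ml / titer_pfu_per_ml


if __name__ == "__main__":
    culture_ml, density = 100.0, 1.5e6
    for moi in (1.0, 2.0):
        for titer in (0.5e8, 1e8, 2e8):
            volume = virus_volume_ml(moi, culture_ml, density, titer)
            print(f"MOI {moi:.0f}, titer {titer:.1e} pfu/mL: add {volume:.2f} mL")
```

Because the result scales inversely with the titer, an old stock whose titer has dropped (see Note 15) requires a proportionally larger volume, which is one reason to titer the virus before use.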
4. Notes
1. Lipofectin® (Invitrogen), FuGENE 6 (Roche), Tfx-20™
(Promega), and CELLFECTIN® (Invitrogen) have also been
successfully tested.
2. Several suitable vectors from different providers are available
and are summarized at
http://www.expressiontechnologies.com/flashBAC/vectors.asp.
3. Ligation protocols are described in the manuals of the specific
cloning technology providers.
4. Plasmid DNA purification protocols are provided by the manu-
facturers of the purification kits. The DNA must be sterile and
must be of a quality suitable for transfection into cells.
5. Healthy cells look bright, round, and refractile, and many
should be in the process of dividing into daughter cells.
6. A control transfer vector is supplied with the BacMagic kit and
can be used to make recombinant virus; the lacZ-positive
infected cells can be stained using X-gal. Add 1 mL of appro-
priate insect cell culture medium containing 15 μL of 2% (w/v)
X-gal in N,N-dimethylformamide and incubate at 28°C. After
5 h, the cells and culture medium appear blue in color,
confirming the production of recombinant virus.
7. Plasticware used to prepare the transfection mixture has to be
made from polystyrene and not from polypropylene, since
the complexes bind to polypropylene.
8. When removing liquid from a dish of cells, tip the dish at a
30–60° angle so that the liquid pools toward one side of the
dish. It is important not to allow the cell monolayer to dry out
at this point.
9. Adding the mixture dropwise should not disturb the cell
monolayer if it is done slowly and gently.
10. Cell monolayers in which recombinant virus has been pro-
duced appear very different from mock-transfected control
cells under the inverted microscope. Control cells form a
confluent monolayer while virus-infected cells do not form
a confluent monolayer and appear grainy with enlarged nuclei.
11. The expected titer of this initial viral seed stock is generally
about 1 × 10^7 pfu/mL. In contrast to virus-free cells,
virus-infected cells appear grainy with enlarged nuclei and do not
form a confluent monolayer.
12. It is important that the cells are healthy and in the log phase of
growth to ensure that virus replication occurs efficiently to
generate high-titer stocks of virus for subsequent use in
expression studies.
13. Virus-infected cells have an increased need for oxygen, and
therefore the contents of the flasks should be shaken at quite
high speeds to maximize aeration. The surface area-to-volume
ratio should also be as large as possible for maximum gas
exchange. Do not overfill the flasks.
14. Under a phase-contrast inverted microscope, cells infected
with virus appear grainy when compared to healthy cells. The
infected cells become uniformly rounded and enlarged, with
distinct enlarged nuclei.
15. The virus stock can be stored in the dark at 4°C for 6–12 months,
although the titer begins to drop after 3–4 months. Titer the
virus before use and reamplify if necessary. The addition of
2–5% serum when using serum-free medium can be helpful in
avoiding a drop in titer. Virus may be frozen at −80°C for lon-
ger periods of time. Avoid multiple freeze–thaw cycles.
16. Protocols for plaque assays can be found in the manuals of
Novagen and Oxford Expression Technologies.
17. After cotransfecting and harvesting the seed stock of virus for
further amplification, it is possible to harvest the remaining
cells from the dish and prepare these for SDS-PAGE/Western
blotting. This gives a quick check for gene expression.
18. The most commonly used times are at 24, 48, 72, and 96 h
post infection (hpi). Some proteins may be very stable and
accumulate to high levels by 96 hpi, and others may start to
degrade and thus need to be harvested much earlier.
19. High Five cell lines (Invitrogen) often increase the yield of
secreted recombinant proteins.
20. For secreted protein expression, the supernatant has to be
analyzed.
21. Standard molecular biology procedures like SDS-PAGE and
Western blotting are described in detail in Molecular Cloning:
A Laboratory Manual (49).
22. A similar cost-reducing labeling strategy in E. coli was also
described for the production of isotope-labeled protein by
Marley and coworkers (50).
23. 300–600 mg of 15N-Phe or 15N-Leu (per liter of culture medium)
was added to the insect cells immediately after infection
with virus. The medium used for the amino acid-selective labeling
was 1 L of SF-900 II serum-free medium from Invitrogen.
24. Filter-sterilized medium can be stored for several months
at 4°C in the dark without loss of capacity for protein expression.
Warm to 28°C before using.
25. The harvesting time of the infected cells depends on the stability
of the target protein.
26. For secreted protein expression, the supernatant has to be
further purified.
27. Protease inhibitor mix: Roche Complete (one tablet dissolved
in 200 mL of PBS).
References
1. Goto, N. K., and Kay, L. E. (2000) New developments in isotope labeling strategies for protein solution NMR spectroscopy. Curr. Opin. Struct. Biol. 10, 585–592.
2. Schein, C. H. (1991) Optimizing protein folding to the native state in bacteria. Curr. Opin. Biotechnol. 2, 746–750.
3. Qing, G., Ma, L. C., Khorchid, A., Swapna, G. V., Mal, T. K., Takayama, M. M., Xia, B., Phadtare, S., Ke, H., Acton, T., Montelione, G. T., Ikura, M., and Inouye, M. (2004) Cold-shock induced high-yield protein production in Escherichia coli. Nat. Biotechnol. 22, 877–882.
4. Young, J. C., Agashe, V. R., Siegers, K., and Hartl, F. U. (2004) Pathways of chaperone-mediated protein folding in the cytosol. Nat. Rev. Mol. Cell. Biol. 5, 781–791.
5. Esposito, D., and Chatterjee, D. K. (2006) Enhancement of soluble protein expression through the use of fusion tags. Curr. Opin. Biotechnol. 17, 353–358.
6. Andersen, C. L., Matthey-Dupraz, A., Missiakas, D., and Raina, S. (1997) A new Escherichia coli gene, dsbG, encodes a periplasmic protein involved in disulphide bond formation, required for recycling DsbA/DsbB and DsbC redox proteins. Mol. Microbiol. 26, 121–132.
7. Bardwell, J. C. (1994) Building bridges: disulphide bond formation in the cell. Mol. Microbiol. 14, 199–205.
8. Pickford, A. R., and O’Leary, J. M. (2004) Isotopic labeling of recombinant proteins from the methylotrophic yeast Pichia pastoris. Methods Mol. Biol. 278, 17–33.
9. Hu, Y. C. (2008) Baculoviral vectors for gene delivery: a review. Curr. Gene Ther. 8, 54–65.
10. Kost, T. A., Condreay, J. P., Ames, R. S., Rees, S., and Romanos, M. A. (2007) Implementation of BacMam virus gene delivery technology in a drug discovery setting. Drug Discov. Today 12, 396–403.
11. Hitchman, R. B., Possee, R. D., and King, L. A. (2009) Baculovirus expression systems for recombinant protein production in insect cells. Recent Pat. Biotechnol. 3, 46–54.
12. Lam, J. S., Huang, H., and Levitz, S. M. (2007) Effect of differential N-linked and O-linked mannosylation on recognition of fungal antigens by dendritic cells. PLoS One 2, e1009.
13. Dasgupta, S., Navarrete, A. M., Bayry, J., Delignat, S., Wootla, B., Andre, S., Christophe, O., Nascimbeni, M., Jacquemin, M., Martinez-Pomares, L., Geijtenbeek, T. B., Moris, A., Saint-Remy, J. M., Kazatchkine, M. D., Kaveri, S. V., and Lacroix-Desmazes, S. (2007) A role for exposed mannosylations in presentation of human therapeutic self-proteins to CD4+ T lymphocytes. Proc. Natl. Acad. Sci. U.S.A 104, 8965–8970.
14. Moore, K. L. (2003) The biology and enzymology of protein tyrosine O-sulfation. J. Biol. Chem. 278, 24243–24246.
15. Daly, R., and Hearn, M. T. (2005) Expression of heterologous proteins in Pichia pastoris: a useful experimental tool in protein engineering and production. J. Mol. Recognit. 18, 119–138.
16. Hamilton, S. R., and Gerngross, T. U. (2007) Glycosylation engineering in yeast: the advent of fully humanized yeast. Curr. Opin. Biotechnol. 18, 387–392.
17. Chisholm, V., Chen, C. Y., Simpson, N. J., and Hitzeman, R. A. (1990) Molecular and genetic approach to enhancing protein secretion. Methods Enzymol. 185, 471–482.
18. Dorner, A. J., and Kaufman, R. J. (1990) Analysis of synthesis, processing, and secretion of proteins expressed in mammalian cells. Methods Enzymol. 185, 577–596.
19. Moir, D. T. (1989) Yeast mutants with increased secretion efficiency. Biotechnology 13, 215–231.
20. Biemans, R., Thines, D., Rutgers, T., De Wilde, M., and Cabezon, T. (1991) The large surface protein of hepatitis B virus is retained in the yeast endoplasmic reticulum and provokes its unique enlargement. DNA Cell. Biol. 10, 191–200.
21. Gennaro, D. E., Hoffstein, S. T., Marks, G., Ramos, L., Oka, M. S., Reff, M. E., Hart, T. K., and Bugelski, P. J. (1991) Quantitative immunocytochemical staining for recombinant tissue-type plasminogen activator in transfected Chinese hamster ovary cells. Proc. Soc. Exp. Biol. Med. 198, 591–598.
22. Shuster, J. R. (1991) Gene expression in yeast: protein secretion. Curr. Opin. Biotechnol. 2, 685–690.
23. Mollaaghababa, R., Davidson, F. F., Kaiser, C., and Khorana, H. G. (1996) Structure and function in rhodopsin: expression of functional mammalian opsin in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. U.S.A 93, 11482–11486.
24. Robinson, A. S., Hines, V., and Wittrup, K. D. (1994) Protein disulfide isomerase overexpression increases secretion of foreign proteins in Saccharomyces cerevisiae. Biotechnology (N Y) 12, 381–384.
25. Shusta, E. V., Raines, R. T., Pluckthun, A., and Wittrup, K. D. (1998) Increasing the secretory capacity of Saccharomyces cerevisiae for production of single-chain antibody fragments. Nat. Biotechnol. 16, 773–777.
26. Butz, J. A., Niebauer, R. T., and Robinson, A. S. (2003) Co-expression of molecular chaperones does not improve the heterologous expression of mammalian G-protein coupled receptor expression in yeast. Biotechnol. Bioeng. 84, 292–304.
27. Durocher, Y., and Butler, M. (2009) Expression systems for therapeutic glycoprotein production. Curr. Opin. Biotechnol. 20, 700–707.
28. Jarvis, D. L. (2009) Baculovirus-insect cell expression systems. Methods Enzymol. 463, 191–222.
29. O'Reilly, D. R., Miller, L., and Luckow, V. A. (1992) Baculovirus Expression Vectors - a Laboratory Manual. WH Freeman, New York.
30. van Regenmortel, M. H., Mayo, M. A., Fauquet, C. M., and Maniloff, J. (2000) Virus nomenclature: consensus versus chaos. Arch. Virol. 145, 2227–2232.
31. Hunt, I. (2005) From gene to protein: a review of new and enabling technologies for multi-parallel protein expression. Protein Expr. Purif. 40, 1–22.
32. Carbonell, L. F., Klowden, M. J., and Miller, L. K. (1985) Baculovirus-mediated expression of bacterial genes in dipteran and mammalian cells. J. Virol. 56, 153–160.
33. Smith, G. E., Summers, M. D., and Fraser, M. J. (1983) Production of human beta interferon in insect cells infected with a baculovirus expression vector. Mol. Cell. Biol. 3, 2156–2165.
34. Ayres, M. D., Howard, S. C., Kuzio, J., Lopez-Ferber, M., and Possee, R. D. (1994) The complete DNA sequence of Autographa californica nuclear polyhedrosis virus. Virology 202, 586–605.
35. Kitts, P. A., and Possee, R. D. (1993) A method for producing recombinant baculovirus expression vectors at high frequency. Biotechniques 14, 810–817.
36. Kost, T. A., Condreay, J. P., and Jarvis, D. L. (2005) Baculovirus as versatile vectors for protein expression in insect and mammalian cells. Nat. Biotechnol. 23, 567–575.
37. Friesen, P. D., and Nissen, M. S. (1990) Gene organization and transcription of TED, a lepidopteran retrotransposon integrated within the baculovirus genome. Mol. Cell. Biol. 10, 3067–3077.
38. Harrison, R. L., and Jarvis, D. L. (2006) Protein N-glycosylation in the baculovirus-insect cell expression system and engineering of insect cells to produce “mammalianized” recombinant glycoproteins. Adv. Virus Res. 68, 159–191.
39. DeLange, F., Klaassen, C. H., Wallace-Williams, S. E., Bovee-Geurts, P. H., Liu, X. M., DeGrip, W. J., and Rothschild, K. J. (1998) Tyrosine structural changes detected during the photoactivation of rhodopsin. J. Biol. Chem. 273, 23735–23739.
40. Creemers, A. F., Klaassen, C. H., Bovee-Geurts, P. H., Kelle, R., Kragl, U., Raap, J., de Grip, W. J., Lugtenburg, J., and de Groot, H. J. (1999) Solid state 15N NMR evidence for a complex Schiff base counterion in the visual G-protein-coupled receptor rhodopsin. Biochemistry 38, 7195–7199.
41. Bellizzi, J. J., Widom, J., Kemp, C. W., and Clardy, J. (1999) Producing selenomethionine-labeled proteins with a baculovirus expression vector system. Structure 7, R263–R267.
42. Bruggert, M., Rehm, T., Shanker, S., Georgescu, J., and Holak, T. A. (2003) A novel medium for expression of proteins selectively labeled with 15N-amino acids in Spodoptera frugiperda (Sf9) insect cells. J. Biomol. NMR 25, 335–348.
43. Strauss, A., Bitsch, F., Cutting, B., Fendrich, G., Graff, P., Liebetanz, J., Zurini, M., and Jahnke, W. (2003) Amino-acid-type selective isotope labeling of proteins expressed in Baculovirus-infected insect cells useful for NMR studies. J. Biomol. NMR 26, 367–372.
44. Strauss, A., Bitsch, F., Fendrich, G., Graff, P., Knecht, R., Meyhack, B., and Jahnke, W. (2005) Efficient uniform isotope labeling of Abl kinase expressed in Baculovirus-infected insect cells. J. Biomol. NMR 31, 343–349.
45. Betz, M., Vogtherr, M., Schieborr, U., Elshorst, B., Grimme, S., Pescatore, B., Langer, T., Saxena, K., and Schwalbe, H. (2008) Chemical Biology of Kinases Studied by NMR Spectroscopy, In Chemical Biology (Prof. Dr. Stuart L. Schreiber, P. D. T. M. K. P. D. G., Ed.), pp 852–890.
46. Stockman, B. J., Kothe, M., Kohls, D., Weibley, L., Connolly, B. J., Sheils, A. L., Cao, Q., Cheng, A. C., Yang, L., Kamath, A. V., Ding, Y. H., and Charlton, M. E. (2009) Identification of allosteric PIF-pocket ligands for PDK1 using NMR-based fragment screening and 1H-15N TROSY experiments. Chem. Biol. Drug Des. 73, 179–188.
47. Jahnke, W., Grotzfeld, R. M., Pelle, X., Strauss, A., Fendrich, G., Cowan-Jacob, S. W., Cotesta, S., Fabbro, D., Furet, P., Mestan, J., and Marzinzik, A. L. (2010) Binding or bending: distinction of allosteric Abl kinase agonists from antagonists by an NMR-based conformational assay. J. Am. Chem. Soc. 132, 7043–7048.
48. Vajpai, N., Strauss, A., Fendrich, G., Cowan-Jacob, S. W., Manley, P. W., Jahnke, W., and Grzesiek, S. (2008) Backbone NMR resonance assignment of the Abelson kinase domain in complex with imatinib. Biomol. NMR Assign. 2, 41–42.
49. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
50. Marley, J., Lu, M., and Bracken, C. (2001) A method for efficient isotopic labeling of recombinant proteins. J. Biomol. NMR 20, 71–75.
Chapter 4
Isotope Labeling in Mammalian Cells

Arpana Dutta, Krishna Saxena, Harald Schwalbe,
and Judith Klein-Seetharaman
Abstract
Isotope labeling of proteins represents an important and often required tool for the application of nuclear
magnetic resonance (NMR) spectroscopy to investigate the structure and dynamics of proteins. Mammalian
expression systems have conventionally been considered to be too weak and inefficient for protein expression.
However, recent advances have significantly improved the expression levels of these systems. Here, we
provide an overview of some of the recent developments in expression strategies for mammalian expression
systems in view of NMR investigations.
Key words: Isotope labeling, Nuclear magnetic resonance, Recombinant protein expression, Human
embryonic kidney cells
1. Introduction: Mammalian Cell Expression System
1.1. Principle of Mammalian Cell-Mediated Protein Expression
Mammalian cell expression systems are being increasingly used to
express proteins for structural biology. This is evident from the
increase in the number of structures available in the Protein Data
Bank (PDB) (1) of proteins purified from such sources. The main
reason for switching to these higher eukaryotic expression systems
is the ability to produce biologically active cell surface receptors
and secreted glycoproteins. The functionality of these proteins is
linked to the requirement for posttranslational modifications, such
as glycosylation and disulfide bond formation, which can often only
be satisfied by using mammalian systems and not other eukaryotic
systems, such as yeast and insect cells.
Protein expression in mammalian cells involves transfection
with a plasmid carrying the gene of interest under the control of a
mammalian promoter. The mammalian gene can be introduced
into cells either by way of transient transfection or by developing a
stable cell line. The former can be achieved with both adherent and
suspension cells using polyethyleneimine (PEI). The only advantage
of this method over stable cell line preparation is that milligram
quantities of protein can be prepared in a few days, compared
to the months required by the latter method (2–4). However,
there are several disadvantages associated with the transient
transfection approach: (1) not every protein gives high yields when
transiently transfected, membrane proteins in particular; (2) very
large quantities of DNA need to be prepared; (3) the process
needs to be repeated every time protein is needed, adding labor
and cost. Stable cell line creation overcomes these limitations,
although 2–6 months are required to establish a high-level
expression system. The effort is often justified, however, by the advantages
of a stable mammalian cell line over transient transfection: (1) the
yield is typically very high; (2) the process becomes fast and robust
once the cell line is created for a particular protein; (3) stable cell
lines are usually created using calcium phosphate transfection, an
efficient and inexpensive method; (4) cell lines can be established
in cellular backgrounds that are specifically tailored to the needs of
the protein being expressed. For example, heterogeneous
glycosylation is a problem in NMR studies, which typically require
highly homogeneous material. To overcome this problem, cell lines
can be used that are deficient in complex heterogeneous
glycosylation capabilities (5); (5) inducible cell lines can be created.
In cases where constitutive expression of a
particular protein is toxic to cells, placing the gene under an inducible
promoter can allow expression of otherwise toxic proteins; (6) finally,
stable cells can be easily scaled up by transferring them to suspension
cultures in spinner flasks or bioreactors.
Episomal vectors developed from viruses, such as Epstein–Barr
virus (EBV) (6), bovine papilloma virus (BPV) (7), BK virus (BKV)
(8), and Simian virus 40 (SV40) (9) are mainly used for transfection.
A list of commonly used vectors can be found in (10). The advantages
of using episomal vectors over integrating genes into the
DNA of the host cell are numerous: (1) Episomal expression will
result in gene expression independent of the regulatory mechanisms
of the host cell and the position of integration in the host genome,
leading to higher expression levels since these factors may unfavor-
ably influence expression. Random integration may also disrupt
the characteristics of the gene of interest (11); (2) There is no inter-
ruption in host cell gene expression, which is often the case with
integrative vectors, the latter leading to transformation of the host
cell and undesirable effects on protein production; (3) Episomal
vectors can exist in multiple copies in a cell, thus leading to ampli-
fication of the gene of interest. The main criteria for selecting a
suitable episomal vector are: (1) high copy number in E. coli for
large-scale DNA production; (2) a strong mammalian promoter
for high expression levels of the gene of interest in mammalian
cells; and (3) small size so that genes of different lengths can be
easily cloned into the vector. Thus, the vector typically consists of
a strong mammalian promoter to drive expression of the mammalian
gene of interest; a viral origin of replication, activated by viral
early genes, which is required for replication of the vector and thereby
maintains it in the host system; and a eukaryotic selection
marker, usually an antibiotic resistance gene, such as the neomycin
resistance gene, to select for transfected cells. The human cytomegalovirus
promoter has been shown to be very powerful in both HEK293 and
Chinese Hamster Ovary (CHO) cells and is also active in most
mammalian cells (12).
1.2. Mammalian Cell Lines
Mammalian cell lines used for transfection with virus-based vectors
carrying the gene of interest usually already carry vectors encoding
viral early genes corresponding to the origin of replication used by
the new vector. Some of these early genes are SV40 T antigen,
EBNA-1, E1, and E2 which bind to SV40, EBV, and BPV ori,
respectively. The early genes act in trans to initiate replication by
binding to the ori and they also act as enhancers, thereby increas-
ing the copy number of viral vector and the expression levels of
the protein of interest (13). Since very high copy numbers of the
viral genes can lead to host cell death, regulation of viral replication
can be imposed by transfecting mammalian cells with a different
vector carrying the early genes (13). Mammalian cell lines that are
widely used for protein expression are CHO and Human Embryonic
Kidney 293 (HEK293) cells. Both are suitable for use in adherent
and suspension cultures. The advantage of using CHO cells is the
availability of various auxotrophs that can be used as selection
markers for transfection. An example of such an auxotroph is
dihydrofolate reductase (DHFR) deficient cells which are triple
auxotrophs for hypoxanthine, glycine, and thymidine (14).
Transfection of foreign genes along with DHFR genes in these
cells allows for the selection of clones in a medium devoid of the
above nutrients. Another advantage of this system is that it helps
amplify the foreign gene when DHFR deficient cells are grown in
the presence of methotrexate, which blocks DHFR activity. This
causes the transfected cells to deal with low DHFR activity by
amplifying the copy number of DHFR, thus amplifying the copy
number of the transfected gene of interest. A disadvantage of
the CHO cell line is that these cells do not carry all of the sugar
transferring enzymes (15) that are present in human cell lines. This
may lead to the production of functionally nonrelevant proteins.
Further, some of the posttranslational modifications of human
proteins are not appropriately carried out in these cells (16). It has
also been seen that expression levels are usually lower in CHO cells
than in HEK293 cells (12, 17). Therefore, the human cell line,
HEK293, is used for proteins that require functionally sensitive
posttranslational modifications.
There are two mutant cell lines of HEK293, HEK293E,
expressing EBNA-1, and HEK293T, expressing SV40 large T antigen,
that increase the copy number of plasmids with EBV and SV40
origins of replication, respectively, thereby increasing expression
levels of the protein of interest (13).
1.3. Expression and Labeling of Recombinant Proteins in Mammalian Cells
Uniform isotope labeling of proteins expressed in mammalian cells
is still under development (18). This is because mammalian cells,
like insect cells, are unable to grow in the type of minimal media
that are used for bacteria where glucose or glycerol and ammonium
chloride are the only sources of carbon and nitrogen, respectively.
Cells of higher eukaryotes, like the organisms from which they are derived, require
certain essential amino acids, without which they are unable to
grow. Thus, the use of simple sources for 13C and 15N isotope labeling
of proteins in bacteria is not possible for mammalian cells. A number
of recent approaches, however, have been developed to address
this complication. The first uniformly isotope-labeled protein
purified from mammalian cells was obtained by Hansen et al. (19),
followed by others (20, 21). The method involved purification of a
mixture of isotope labeled amino acids from an acid hydrolysate of
algae or bacteria grown in 15NH4Cl and 13C glucose or 13CO2 to
prepare uniformly labeled urokinase in a Sp2/0 mouse myeloma
cell line. Different purification protocols of the labeled mixture
were tested but only acid hydrolysis, which removes bacterial and
algal products that are toxic to mammalian cells, proved to work.
The mixture, however, needed to be supplemented with commercially
available amino acids that would degrade during hydrolysis,
particularly glutamine and cysteine. A commercially available form of
15N-cysteine was used. Because of the high cost involved in purchasing
labeled glutamine, 15N- and 15N/13C-glutamine were synthesized
from labeled glutamate (available commercially), 15NH4Cl, and
ATP. Dialyzed serum was used to prevent dilution of isotope
labeled amino acids with unlabeled amino acids from the serum.
Such purified amino acid mixtures are commercially available today.
One such commercially available mixture from Martek Biosciences
Corp. was tested to support expression in CHO cells (20). The
growth medium was optimized in terms of its concentration,
removal of amino acids, such as aspartic acid and asparagine,
and addition of amino acids, such as arginine, cysteine, and glu-
tamic acid after isolating and purifying from the mixture so that
the highest level of protein expression was obtained. However,
there are serious concerns regarding the toxic effects of feeding
these mixtures to mammalian cells, leading to extensive methods
of purification and optimization, and making it a time and cost
inefficient method. HEK293 cells, the most commonly used
system of expression for preparing isotope-labeled samples, have,
to date, not been grown successfully in the presence of these
mixtures (Klein-Seetharaman, unpublished results). A uniformly
15N-labeled TGFβ1 sample was prepared from CHO cells by
growing them in Minimum Essential Medium (MEM, Gibco)
with dialyzed serum, uniformly 15N-labeled choline, and uniformly
15N-labeled amino acids (except tryptophan, which was labeled
only at the backbone nitrogen) (22), demonstrating the utility of
CHO systems. The total cost involved was $1,000/L. Therefore,
this method could prove to be expensive if the protein to be
labeled does not have high expression levels, as is the case with
most membrane proteins.
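The cost argument can be made concrete with a short calculation. The following Python sketch divides the medium cost per liter by the expression yield per liter to estimate the labeled-medium cost per milligram of protein. The $1,000/L figure is taken from the text; the function names and the example yields are hypothetical and serve only as illustration.

```python
def medium_cost_per_mg(cost_per_liter: float, yield_mg_per_liter: float) -> float:
    """Labeled-medium cost required to obtain 1 mg of protein."""
    if yield_mg_per_liter <= 0:
        raise ValueError("yield must be positive")
    return cost_per_liter / yield_mg_per_liter


def medium_cost_for_sample(cost_per_liter: float,
                           yield_mg_per_liter: float,
                           sample_mg: float) -> float:
    """Labeled-medium cost for a sample of the given mass."""
    return medium_cost_per_mg(cost_per_liter, yield_mg_per_liter) * sample_mg


if __name__ == "__main__":
    cost = 1000.0  # $/L, the figure quoted in the text
    # Hypothetical expression levels (mg/L), chosen only to show the trend.
    for yield_mg_per_l in (10.0, 1.0, 0.1):
        per_mg = medium_cost_per_mg(cost, yield_mg_per_l)
        print(f"{yield_mg_per_l:5.1f} mg/L -> ${per_mg:,.0f} per mg")
```

The estimate covers medium only; serum, consumables, and labor are not included.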
Mammalian growth media containing a mixture of certain
isotope-labeled amino acids are commercially available and have
been recently used for labeling the G-protein coupled receptor,
bovine rhodopsin, using 15N labeled medium in which GKLQSTVW
amino acids were 15N labeled (23). If the labeled amino acids in the
mixture do not cover the complete sequence of the protein to be
labeled, then this method does not lead to complete labeling, as
was the case with rhodopsin, where 50% of amino acids were labeled
(23). Apart from the high costs involved, an additional problem is
the inability to perform perdeuteration because of the sensitivity of
mammalian cells to deuterium oxide. Therefore, the most widely
used method to obtain structural information on mammalian
proteins by NMR has been by conducting amino acid type selective
(AATS) isotope labeling. AATS labels a protein with specific
isotope-labeled amino acids rather than uniform labels and yields
structural information selectively for the amino acid(s) used in
labeling. This has been successfully done with numerous proteins
(24–26). Extensive AATS labeling has been carried out with
rhodopsin, which was labeled by using 15N-isotopes of tryptophan
(26), lysine (25), histidine (27) and 13C-isotopes of tryptophan
(27), histidine (27) and glutamate (28), all through expression in
HEK293 cells. 15N-lysine, 13C-glycine/serine double labeling has
also been shown for rhodopsin (25). For example, multiple combina-
tions of 15N- and 13C-isotope-labeled rhodopsin samples were
used to assign 15N-tryptophan resonances in the NMR spectrum
of rhodopsin, shown in Fig. 1 (29). Since the protocol for such a
labeling method has been optimized and works very well at least
for rhodopsin, we are providing a step-by-step protocol below. The
first step is the creation of an inducible stable cell line of rhodopsin
in HEK293 cells. Scaling up the expression level is accomplished
by growing these cells in a suspension culture, thereby producing
the milligram quantities required for NMR studies. The yield of
rhodopsin from such a transient transfection is on the order of
50 µg/15 cm2 plate. After establishing the stable cell line with the
opsin (apo form of rhodopsin) gene, the next step is to transfer
these cells to suspension culture and grow them in isotope-labeled
medium containing unlabeled amino acids along with the labeled
amino acid(s) of interest. Up to 10 mg of rhodopsin can be obtained
from such suspension cultures when amino acid supplements are
provided (30), but typically 2 mg of rhodopsin are obtained from
isotope labeled medium in which amino acid mixtures cannot be
supplemented, to avoid dilution of the isotope-labeled amino acids
provided in the medium.
Fig. 1. Dephased (red line) and nondephased (blue line) 15N detected 13C/15N CP REDOR spectra of selectively labeled
rhodopsin in DOPC lipid bilayers at 220 K. (a) α,ε-15N-Trp; (b) 13C′-Leu/α,ε-15N-Trp; (c) 13U-Thr/α,ε-15N-Trp; (d) 13C′-Cys/α,ε-15N-Trp;
(e) 13C′-Pro/α,ε-15N-Trp; (f) 13C′-Gly/α,ε-15N-Trp. Figure reproduced with permission from (29).
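Whether a given set of labeled amino acid types covers a protein can be estimated from its sequence before any medium is purchased. The Python sketch below computes the fraction of residues whose type belongs to the labeled set and lists the residue types that would remain unlabeled. The function names and the short test sequence are hypothetical; the sequence is not that of rhodopsin, and only the labeled set GKLQSTVW is taken from the text.

```python
from collections import Counter


def labeling_coverage(sequence: str, labeled_types: str) -> float:
    """Fraction of residues in `sequence` whose one-letter type is labeled."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    labeled = set(labeled_types.upper())
    return sum(1 for residue in seq if residue in labeled) / len(seq)


def unlabeled_composition(sequence: str, labeled_types: str) -> Counter:
    """Counts of residue types that would stay unlabeled."""
    labeled = set(labeled_types.upper())
    return Counter(r for r in sequence.upper() if r not in labeled)


if __name__ == "__main__":
    toy_sequence = "MGKLAQSTVWDE"   # hypothetical 12-residue example
    labeled_set = "GKLQSTVW"        # labeled types named in the text
    print(f"coverage: {labeling_coverage(toy_sequence, labeled_set):.2f}")
    print("unlabeled:", dict(unlabeled_composition(toy_sequence, labeled_set)))
```

Running the same functions on the real sequence of the target protein gives a quick estimate of how many resonances a partially labeled medium can be expected to report on.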
2. Materials
2.1. Expression of Isotope-Labeled Recombinant Proteins
1. Plasmid containing the gene of interest (see Note 1).
2. HEK293 cells.
3. Dulbecco’s modified Eagle medium (DMEM F-12).
4. Blasticidin (500 µg/mL stock solution prepared in DMEM
F-12).
5. Geneticin, G418 (100 mg/mL stock solution prepared in
DMEM F-12).
6. 0.05% Trypsin–EDTA (Gibco).
7. Penicillin–streptomycin (PS): 100 U/mL of each.
8. Fetal bovine serum (FBS).
9. Phosphate-buffered saline (PBS): Autoclaved.
10. Tetracycline: 200 µg/mL (100× stock).
11. Sodium butyrate: 500 mM (100× stock), filtered.
12. Complete medium: DMEM F-12, 10% FBS, 1% PS.
13. Induction medium: Complete medium containing 2 µg/mL
tetracycline and 5 mM sodium butyrate.
14. Selection medium: Complete medium containing 5 µg/mL
blasticidin and 1, 2, or 3 mg/mL of G418.
15. Cryo medium: Complete medium containing 10% DMSO.
16. 2.5 M CaCl2.
17. BES: 50 mM N,N-bis(2-hydroxyethyl)-2-aminoethanesulfonic
acid, pH 7.2, 250 mM NaCl, 1.5 mM Na2HPO4, 1 M NaOH
used to adjust pH.
18. 15- and 10-cm2 Tissue culture plates.
19. 6- and 24-Well cell culture plates.
20. Sterile forceps.
21. Cloning rings.
22. Vacuum grease.
23. Dodecyl maltoside.
2.2. Specific Isotope Labeling of Proteins
Suspension DMEM medium:
1. Table 1 gives the individual components of suspension DMEM
in their final concentrations based on the composition of commercially
available DMEM. 100× stock solutions are prepared
for most components (see Note 2). Each component is dissolved
in distilled water.

Table 1
Amino acid composition of suspension DMEM

Essential amino acid   mg/L   Nonessential amino acid   mg/L   Vitamins            mg/L   Inorganic salts and other   mg/L
Arginine HCl           84     Alanine                   None   D-Ca pantothenate   4      CaCl2                       50
Histidine HCl H2O      42     Asparagine                None   Choline chloride    4      Fe(NO3)3 9H2O               0.1
Isoleucine             105    Aspartate                 None   Folic acid          4      MgSO4                       97.7
Leucine                105    Cystine 2HCl              63     i-Inositol          7.2    KCl                         400
Lysine HCl             146    Glutamate                 None   Niacinamide         4      NaCl                        6,400
Methionine             30     Glutamine                 584    Pyridoxal HCl       4      NaH2PO4 H2O                 125
Phenylalanine          66     Glycine                   30     Riboflavin          0.4
Threonine              95     Proline                   None   Thiamine HCl        4
Tryptophan             16     Serine                    42                                Phenol Red Na+              15
Valine                 94     Tyrosine 2Na+ 2H2O        104                               Glucose                     4,500

2. 0.05% trypsin–EDTA (Subheading 2.1).
3. FBS (Subheading 2.1).
4. PS (Subheading 2.1).
5. Dialyzed FBS: Heat inactivated, dialyzed against PBS and then
filtered.
6. 10% Pluronic F-68 (100× stock), filtered.
7. 5 mg/mL Heparin (100× stock), filtered.
8. 15-cm2 tissue culture plates.
9. 2-L spinner flasks.
10. 100× tetracycline (Subheading 2.1).
11. 100× sodium butyrate (Subheading 2.1).
12. 20% (w/v) glucose, sterile filtered.
13. 8% (w/v) NaHCO3.
14. Complete medium (Subheading 2.1).
15. Selection medium (Subheading 2.1).
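The arithmetic behind preparing suspension DMEM from the Table 1 values can be sketched in a few lines of Python. The final concentrations for glucose, NaCl, and glutamine are taken from Table 1, and the 100× stock factor from the text; the function and variable names are hypothetical, and the sketch does not account for the halved glutamine concentration used when a suspension culture is started.

```python
# Final concentrations (mg/L) of components added as solids, from Table 1.
SOLIDS_MG_PER_L = {"glucose": 4500.0, "NaCl": 6400.0, "glutamine": 584.0}
STOCK_FOLD = 100.0  # most other components are kept as 100x stocks


def solid_mass_g(component: str, volume_ml: float) -> float:
    """Grams of a solid component needed for `volume_ml` of medium."""
    mg = SOLIDS_MG_PER_L[component] * (volume_ml / 1000.0)
    return mg / 1000.0


def stock_volume_ml(volume_ml: float, fold: float = STOCK_FOLD) -> float:
    """Volume of a concentrated stock that gives 1x in `volume_ml`."""
    return volume_ml / fold


def stock_concentration_mg_per_l(final_mg_per_l: float,
                                 fold: float = STOCK_FOLD) -> float:
    """Concentration at which to prepare the stock solution."""
    return final_mg_per_l * fold


if __name__ == "__main__":
    volume = 500.0  # mL of medium to prepare
    for name in SOLIDS_MG_PER_L:
        print(f"{name}: {solid_mass_g(name, volume):.3f} g")
    print(f"each 100x stock: {stock_volume_ml(volume):.1f} mL")
    # Example: tryptophan is 16 mg/L in the final medium (Table 1).
    print(f"tryptophan stock: {stock_concentration_mg_per_l(16.0):.0f} mg/L")
```

The same mass calculation applies to an isotope-labeled amino acid added as a solid, using its Table 1 concentration in place of the values in the dictionary.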
3. Methods
3.1. Expression of Isotope-Labeled Recombinant Proteins
The following is a protocol for generating an inducible stable cell line
of HEK293 (30–32).
1. Wake up HEK293 cells from cryostocks and maintain them in
15 cm2 plates in 25 mL of complete medium containing 5 µg/mL
blasticidin.
2. Split the cells into a 10-cm2 dish at 80% confluency.
3. Split the cells 1:10 or 1:8 (~1–2 million cells per plate) into complete
medium containing 5 µg/mL blasticidin the day before
transfection.
4. On the day of transfection, the cells should be 30–40% confluent.
To transfect the cells, prepare the following cocktail, adding
components in the order given (volumes are for one 10-cm2
plate): In a falcon tube, mix 30 µg of plasmid DNA, 50 µL of
2.5 M CaCl2, 500 µL of BES, and sterile water to 1 mL.
Incubate the mixture for exactly 1 min and gently add it to the
cells. After 1 h, verify the efficiency of transfection by the presence
of calcium phosphate precipitate which appears as fine sand
particles between cells.
5. Incubate the plates at 35°C and 3% CO2 for 19 h.
6. After 19 h, wash the plates twice with complete medium containing
5 µg/mL blasticidin and add fresh complete medium containing
5 µg/mL blasticidin.
7. Incubate the cells at 37°C and 5% CO2 overnight.
8. Split the cells 1:10 (0.5–1 million cells/plate) into nine plates,
three plates each for selection with G418 (1, 2, and 3 mg/mL).
9. Replace the medium after 20 h with selection medium.
10. Replace the selection medium every 2–3 days until colonies of
workable size are formed.
11. On the day of picking the clones:
(a) Circle the colonies that are to be picked such that the circles
are big enough to place the cloning rings on their
perimeters.
(b) Aspirate the medium and look for more colonies on the
empty plate by eye. Wash with 8 mL of PBS.
(c) Using sterile forceps, dip one end (base) of each cloning ring
in vacuum grease and place it over each clone (see Note 3).
Make sure there is no grease on the inner walls of the rings.
Make sure that the rings form a well around the clones
with a leak proof seal at the bottom.
(d) Add 40 µL of 0.05% trypsin–EDTA to each well and incubate
for 1 min. Then, add 80 µL of fresh selection medium
and gently pipette up and down. Transfer the cells (100 µL)
to a 24-well plate containing 1 mL of selection medium
per well.
(e) Repeat for all the clones.
12. Replace the selection medium every 2–3 days until the cells are
confluent. Make a note of medium changes and transfers to
6-well plates in a chart (see step 13 for details).
13. As the cells become confluent, transfer them to two 6-well
plates at different cell densities. For the transfer:
(a) Pipette 3 mL of selection medium into each well of a
6-well plate.
(b) Aspirate the medium from the cells (24-well plate) and
add 300 µL of 0.05% trypsin–EDTA to each well and
incubate for 1 min.
(c) Add 1,200 µL (use a 1-mL pipette and add 600 + 600 µL)
of selection medium and gently pipette up and down.
(d) Remove 300 µL and add it to one well and place the
remaining (1,200 µL) in another well. The well with higher
cell density will be induced in the future to screen the
clones while the one with lower density will be maintained
and used for making glycerol stocks.
14. Change the selection medium on the cells every 2–3 days.
15. As soon as the cells in the higher density well reach confluence,
induce with induction medium.
16. Harvest the cells 48 h post-induction.
(a) After harvesting, solubilize the cells in a detergent suitable
for the membrane protein of interest (e.g., 1% dodecyl
maltoside).
(b) Centrifuge the solubilized cells at 126,000 × g for 20 min
at room temperature and collect the supernatant.
17. Check for levels of expression by performing a Western dot
blot on serial dilutions of the supernatant in PBS, containing
1% of the detergent used for solubilization, versus protein sam-
ples of known concentrations (see Note 4).
18. When the cells from the low density well reach confluence,
split 1:5 into two 10-cm2 cell culture dishes. Add 0.5 mL of
0.05% trypsin–EDTA to a 6-well plate, and then add 1.5 mL
of selection medium. Add 1 mL of this cell suspension to each
of the two 10-cm2 plates.
19. After the cells reach confluence on the 10-cm2 plate, prepare
three 1-mL glycerol stocks from each plate. To prepare glyc-
erol stocks:
(a) Wash 70–90% confluent plates twice with PBS and
trypsinize with 1 mL of 0.05% trypsin–EDTA for 1 min.
(b) Add 10 mL of complete medium (no selection) and col-
lect the cells in a 15-mL falcon tube.
(c) Centrifuge the cells at 600 × g for 10 min at 4°C.
(d) Aspirate the medium and gently resuspend the cells in
3 mL of cryo medium.
(e) Transfer the cells to cryo vials in 1-mL aliquots and place
the tubes in cryo boxes. Incubate at −20°C for 1 h, and then
−80°C overnight.
(f) On the following day, move the tubes to liquid nitrogen
storage tanks.
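When several plates are transfected in parallel, the cocktail of step 4 can be scaled with a small helper. The Python sketch below assumes that the per-plate quantities are 30 µg of plasmid DNA, 50 µL of 2.5 M CaCl2, and 500 µL of BES in a final volume of 1 mL (micro-scale units are an assumption about the intended reading of that step), and that the DNA stock concentration is known. The function name and the example concentration are hypothetical.

```python
def transfection_cocktail(n_plates: int, dna_stock_ug_per_ul: float) -> dict:
    """Volumes (in microliters) of each cocktail component for n plates.

    Per-plate amounts are assumed to be 30 ug DNA, 50 uL 2.5 M CaCl2,
    500 uL BES, and sterile water to 1,000 uL.
    """
    if n_plates < 1:
        raise ValueError("need at least one plate")
    if dna_stock_ug_per_ul <= 0:
        raise ValueError("DNA concentration must be positive")

    dna_ul = 30.0 / dna_stock_ug_per_ul * n_plates
    cacl2_ul = 50.0 * n_plates
    bes_ul = 500.0 * n_plates
    total_ul = 1000.0 * n_plates
    water_ul = total_ul - dna_ul - cacl2_ul - bes_ul
    if water_ul < 0:
        raise ValueError("DNA stock too dilute to fit in the final volume")

    # Keys follow the order of addition given in the protocol text.
    return {
        "plasmid DNA": dna_ul,
        "2.5 M CaCl2": cacl2_ul,
        "BES": bes_ul,
        "sterile water": water_ul,
    }


if __name__ == "__main__":
    # Hypothetical DNA stock at 1 ug/uL, three 10-cm plates.
    for component, volume in transfection_cocktail(3, 1.0).items():
        print(f"{component}: {volume:.0f} uL")
```

The check on the water volume flags DNA preparations that are too dilute: below roughly 0.067 µg/µL, 30 µg of DNA no longer fits into the 450 µL left after CaCl2 and BES are added.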
3.2. Specific Isotope Labeling of Proteins
After estimating the yield from above (Subheading 3.1) and identifying
the highest yield clone, the following procedure is used to
prepare a protein sample, in suspension culture, in which particular
amino acids are specifically isotope labeled (30–32).
1. Grow the highest yielding stable cell line clone in complete
medium.
2. Split the cells 1:5 into 10-cm2 dishes after ~3 days using 15 mL
of selection medium.
3. Split the cells until a sufficient number of plates are obtained
for setting up spinner flasks. For reference, 1 mg of protein per
liter of suspension culture is usually obtained for rhodopsin
and 3–4 confluent 15-cm2 plates are used to inoculate a 500-mL
flask (see Notes 5 and 6).
4. Prepare labeled suspension DMEM medium from 100× stocks
of the individual components (Table 1, Subheading 2.2), except
for the isotope labeled amino acid(s). Supplement with 10%
dialyzed FBS and 1% PS or obtain isotope-labeled medium
from a commercial source.
(a) Add glucose, NaCl, and glutamine as solids.
(b) Lower the glutamine concentration to half when starting
the suspension culture.
(c) Add Pluronic F-68 and heparin to 1× concentration.
(d) Add isotope-labeled amino acid(s) as solids.
5. Inoculate the suspension cultures and grow at 37°C, spinning
at 47 rpm for 6 days. To inoculate, add 2 mL of 0.05% trypsin–
EDTA to each plate and extract the cells with 8 mL of unlabeled
suspension medium. Collect cells from each plate (10-mL volume)
and centrifuge at 1,000 × g for 10 min at 4°C. Aspirate the
medium and add 10 mL of labeled suspension DMEM medium
to each plate and resuspend the cells. Add the cell suspension
to the suspension culture in the spinner flask.
6. After 6 days (see Note 7), supplement the growth medium
with 6 mL of 20% (w/v) glucose and 4 mL of 8% (w/v)
NaHCO3 and induce expression with 5 mM sodium butyrate
and 2 µg/mL of tetracycline.
7. After the 6th day, feed the cells with 6 mL of 20% glucose
every 24 h.
8. Grow the cells for 2 more days after the 6th day (total 8 days
in suspension).
9. Harvest at the end of 8 days.
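The timing of steps 5–9 can be laid out as a simple schedule. The Python sketch below assumes that the 6 mL of glucose and 4 mL of NaHCO3 refer to a 500-mL culture and scales them linearly with volume, that the inducers are added from the 100× stocks listed under Materials, and that the daily glucose feed falls on day 7 only; these are assumptions for illustration, and the function name is hypothetical.

```python
REFERENCE_ML = 500.0  # assumed culture volume behind the 6 mL / 4 mL figures


def suspension_schedule(culture_ml: float = REFERENCE_ML) -> list:
    """Day-by-day additions (mL) for a suspension labeling culture."""
    if culture_ml <= 0:
        raise ValueError("culture volume must be positive")
    scale = culture_ml / REFERENCE_ML
    glucose_ml = 6.0 * scale            # 20% (w/v) glucose
    bicarbonate_ml = 4.0 * scale        # 8% (w/v) NaHCO3
    inducer_ml = culture_ml / 100.0     # 100x stocks diluted to 1x

    return [
        {"day": 0, "action": "inoculate", "additions_ml": {}},
        {"day": 6, "action": "supplement and induce", "additions_ml": {
            "20% glucose": glucose_ml,
            "8% NaHCO3": bicarbonate_ml,
            "100x sodium butyrate": inducer_ml,
            "100x tetracycline": inducer_ml,
        }},
        {"day": 7, "action": "feed", "additions_ml": {
            "20% glucose": glucose_ml,
        }},
        {"day": 8, "action": "harvest", "additions_ml": {}},
    ]


if __name__ == "__main__":
    for entry in suspension_schedule(500.0):
        additions = ", ".join(f"{v:g} mL {k}"
                              for k, v in entry["additions_ml"].items())
        print(f"day {entry['day']}: {entry['action']}"
              + (f" ({additions})" if additions else ""))
```

The stock volumes ignore the small increase in culture volume caused by the additions themselves.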
4. Conclusions
A variety of expression systems have been developed that can be
used to express a protein of choice in isotope-labeled form, each
system having advantages and disadvantages. These systems include
bacterial, yeast, insect, and mammalian cells. The choice of
system largely depends on the type of protein to be expressed.
Other factors are the costs involved, the need for modifications, and
the yield of protein required. Bacterial expression systems are most
commonly used for soluble protein expression due to ease of use
and cost-effectiveness in cloning and expression. However, this
is not the system of choice for proteins that require
posttranslational modification for their activity. In such cases,
eukaryotic expression systems are required. Yeast, being both
eukaryotic and a microorganism, is a simple system to use, allowing
relatively inexpensive complete isotope labeling of proteins for
NMR studies. Although it can carry out some posttranslational
modifications, these are often not sufficient, and deficiency in
folding machinery can be a problem for some proteins, especially
membrane proteins. For these reasons, higher eukaryotic systems
such as insect cells, in which posttranslational modifications are
very similar to those found in mammalian cells, have been used.
The greatest advantage of insect cell expression is the very large
amounts of protein that can be expressed, in most cases more than
in mammalian cells. However, the disadvantages that hamper
mammalian cell expression, namely expensive isotope-labeled media,
incomplete labeling, and intolerance of perdeuteration, are also
prevalent in insect cells. Hence, efforts are ongoing towards further
developing mammalian expression systems since these are the only
systems in which the full complement of folding and posttransla-
tional modification machineries are available, optimally supporting
the activity of the expressed protein.
5. Notes
1. A tetracycline (tet)-regulated mammalian expression vector
should be used.
2. All solutions were prepared as 100× concentrated stock
solutions, except glucose, NaCl, glutamine, and the isotope-
labeled amino acid(s), which were added as solids. Typically,
500 mL of stock solutions were prepared and filtered to main-
tain sterility. Sterile stock solutions were kept at 4°C and were
stable for several weeks.
3. Cloning rings, vacuum grease, and forceps should be
autoclaved.
4. PVDF membranes should not be used.
5. Cells should be counted before adding them to spinner flasks
so that an appropriate number of cells (between 60
and 90 million cells/500 mL) can be transferred.
6. A 2-L spinner flask is ideal for a 500-mL suspension culture.
7. The color of the medium should turn yellow after 6 days.
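The inoculum range of Note 5 can be checked against a cell count before a spinner flask is set up. The Python sketch below scales the 60–90 million cells per 500 mL range linearly to other culture volumes, which is an assumption; the function names and the example count are hypothetical.

```python
CELLS_PER_500_ML = (60e6, 90e6)  # range given in Note 5


def inoculum_range(culture_ml: float) -> tuple:
    """Lower and upper cell numbers for a culture of the given volume."""
    if culture_ml <= 0:
        raise ValueError("culture volume must be positive")
    low, high = CELLS_PER_500_ML
    return (low * culture_ml / 500.0, high * culture_ml / 500.0)


def inoculum_ok(total_cells: float, culture_ml: float) -> bool:
    """True if the counted cells fall inside the recommended range."""
    low, high = inoculum_range(culture_ml)
    return low <= total_cells <= high


if __name__ == "__main__":
    counted = 75e6  # hypothetical total from pooled plates
    for volume in (500.0, 1000.0):
        low, high = inoculum_range(volume)
        verdict = "ok" if inoculum_ok(counted, volume) else "outside range"
        print(f"{volume:.0f} mL: {low:.2e} to {high:.2e} cells -> {verdict}")
```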
References
1. Nettleship, J. E., Assenberg, R., Diprose, J. M., Rahman-Huq, N., and Owens, R. J. (2010) Recent advances in the production of proteins in insect and mammalian cells for structural biology. J. Struct. Biol. 172, 55–65.
2. Aricescu, A. R., Lu, W., and Jones, E. Y. (2006) A time- and cost-efficient system for high-level protein production in mammalian cells. Acta Crystallogr. D Biol. Crystallogr. 62, 1243–1250.
3. Lee, J. E., Fusco, M. L., and Ollmann Saphire, E. (2009) An efficient platform for screening expression and crystallization of glycoproteins produced in human cells. Nat. Protoc. 4, 592–604.
4. Nettleship, J. E., Rahman-Huq, N., and Owens, R. J. (2009) The production of glycoproteins by transient expression in mammalian cells. Methods Mol. Biol. 498, 245–263.
5. Reeves, P. J., Callewaert, N., Contreras, R., and Khorana, H. G. (2002) Structure and function in rhodopsin: high-level expression of rhodopsin with restricted and homogeneous N-glycosylation by a tetracycline-inducible N-acetylglucosaminyltransferase I-negative HEK293S stable mammalian cell line. Proc. Natl. Acad. Sci. U.S.A. 99, 13419–13424.
6. Sclimenti, C. R., and Calos, M. P. (1998) Epstein-Barr virus vectors for gene expression and transfer. Curr. Opin. Biotechnol. 9, 476–479.
7. Sambrook, J., Rodgers, L., White, J., and Gething, M. J. (1985) Lines of BPV-transformed murine cells that constitutively express influenza virus hemagglutinin. EMBO J. 4, 91–103.
8. Grossi, M. P., Caputo, A., Rimessi, P., Chiccoli, L., Balboni, P. G., and Barbanti-Brodano, G. (1988) New BK virus episomal vector for complementary DNA expression in human cells. Arch. Virol. 102, 275–283.
9. Piechaczek, C., Fetzer, C., Baiker, A., Bode, J., and Lipps, H. J. (1999) A vector based on the SV40 origin of replication and chromosomal S/MARs replicates episomally in CHO cells. Nucleic Acids Res. 27, 426–428.
10. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, CSH Laboratory Press, Cold Spring Harbor, NY.
11. Doerfler, W., Schubbert, R., Heller, H., Kammer, C., Hilger-Eversheim, K., Knoblauch, M., and Remus, R. (1997) Integration of foreign DNA and its consequences in mammalian systems. Trends Biotechnol. 15, 297–301.
12. Xia, W., Bringmann, P., McClary, J., Jones, P. P., Manzana, W., Zhu, Y., Wang, S., Liu, Y., Harvey, S., Madlansacay, M. R., McLean, K., Rosser, M. P., MacRobbie, J., Olsen, C. L., and Cobb, R. R. (2006) High levels of protein expression using different mammalian CMV promoters in several cell lines. Protein Expr. Purif. 45, 115–124.
13. Van Craenenbroeck, K., Vanhoenacker, P., and Haegeman, G. (2000) Episomal vectors for gene expression in mammalian cells. Eur. J. Biochem. 267, 5665–5678.
14. Urlaub, G., and Chasin, L. A. (1980) Isolation of Chinese hamster cell mutants deficient in dihydrofolate reductase activity. Proc. Natl. Acad. Sci. U.S.A. 77, 4216–4220.
15. Grabenhorst, E., Schlenke, P., Pohl, S., Nimtz, M., and Conradt, H. S. (1999) Genetic engineering of recombinant glycoproteins and the glycosylation pathway in mammalian host cells. Glycoconj. J. 16, 81–97.
16. Suttie, J. W. (1986) Report of Workshop on expression of vitamin K-dependent proteins in bacterial and mammalian cells. Madison, Wisconsin, USA, April 1986, Thromb. Res. 44, 129–134.
17. Schlaeger, E. J., Kitas, E. A., and Dorn, A. (2003) SEAP expression in transiently transfected mammalian cells grown in serum-free suspension culture. Cytotechnology 42, 47–55.
18. Takahashi, H., and Shimada, I. (2010) Production of isotopically labeled heterologous proteins in non-E. coli prokaryotic and eukaryotic cells. J. Biomol. NMR 46, 3–10.
19. Hansen, A. P., Petros, A. M., Mazar, A. P., Pederson, T. M., Rueter, A., and Fesik, S. W. (1992) A practical method for uniform isotopic labeling of recombinant proteins in mammalian cells. Biochemistry 31, 12713–12718.
20. Lustbader, J. W., Birken, S., Pollak, S., Pound, A., Chait, B. T., Mirza, U. A., Ramnarain, S., Canfield, R. E., and Brown, J. M. (1996) Expression of human chorionic gonadotropin uniformly labeled with NMR isotopes in Chinese hamster ovary cells: an advance toward rapid determination of glycoprotein structures. J. Biomol. NMR 7, 295–304.
21. Shindo, K., Masuda, K., Takahashi, H., Arata, Y., and Shimada, I. (2000) Backbone 1H, 13C, and 15N resonance assignments of the anti-dansyl antibody Fv fragment. J. Biomol. NMR 17, 357–358.
22. Archer, S. J., Bax, A., Roberts, A. B., Sporn, M. B., Ogawa, Y., Piez, K. A., Weatherbee, J. A., Tsang, M. L., Lucas, R., Zheng, B. L., et al. (1993) Transforming growth factor beta 1: NMR signal assignments of the recombinant protein expressed and isotopically enriched using Chinese hamster ovary cells. Biochemistry 32, 1152–1163.
23. Werner, K., Richter, C., Klein-Seetharaman, J., 28. Han, M., and Smith, S. O. (1995) High-
and Schwalbe, H. (2008) Isotope labeling of resolution structural studies of the retinal–
mammalian GPCRs in HEK293 cells and char- Glu113 interaction in rhodopsin. Biophys. Chem.
acterization of the C-terminus of bovine rho- 56, 23–29.
dopsin by high resolution liquid NMR spec- 29. Werner, K., Lehner, I., Dhiman, H. K., Richter,
troscopy. J. Biomol. NMR 40, 49–53. C., Glaubitz, C., Schwalbe, H., Klein-
24. Arata, Y., Kato, K., Takahashi, H., and Shimada, Seetharaman, J., and Khorana, H. G. (2007)
I. (1994) Nuclear magnetic resonance study of Combined solid state and solution NMR stud-
antibodies: a multinuclear approach. Methods ies of alpha,epsilon-15N labeled bovine rhodop-
Enzymol. 239, 440–464. sin. J. Biomol. NMR 37, 303–312.
25. Klein-Seetharaman, J., Reeves, P. J., Loewen, 30. Reeves, P. J., Kim, J. M., and Khorana, H. G.
M. C., Getmanova, E. V., Chung, J., Schwalbe, (2002) Structure and function in rhodopsin: a
H., Wright, P. E., and Khorana, H. G. (2002) tetracycline-inducible system in stable mamma-
Solution NMR spectroscopy of (alpha -15N) lian cell lines for high-level expression of opsin
lysine-labeled rhodopsin: The single peak mutants. Proc. Natl. Acad. Sci. U.S.A. 99,
observed in both conventional and TROSY- 13413–13418.
type HSQC spectra is ascribed to Lys-339 in 31. Reeves, P. J., Callewaert, N., Contreras, R., and
the carboxyl-terminal peptide sequence. Proc. Khorana, H. G. (2002) Structure and function
Natl. Acad. Sci. U.S.A. 99, 3452–3457. in rhodopsin: high-level expression of rhodop-
26. Klein-Seetharaman, J., Yanamala, N. V., Javeed, sin with restricted and homogeneous
F., Reeves, P. J., Getmanova, E. V., Loewen, M. N-glycosylation by a tetracycline-inducible
C., Schwalbe, H., and Khorana, H. G. (2004) N-acetylglucosaminyltransferase I-negative
Differential dynamics in the G protein-coupled HEK293S stable mammalian cell line 452. Proc.
receptor rhodopsin revealed by solution NMR. Natl. Acad. Sci. U.S.A. 99, 13419–13424.
Proc. Natl. Acad. Sci. U.S.A. 101, 3409–3413. 32. Reeves, P. J., Thurmond, R. L., and Khorana,
27. Patel, A. B., Crocker, E., Reeves, P. J., H. G. (1996) Structure and function in rho-
Getmanova, E. V., Eilers, M., Khorana, H. G., dopsin: high level expression of a synthetic
and Smith, S. O. (2005) Changes in interhelical bovine opsin gene and its mutants in stable
hydrogen bonding upon rhodopsin activation. mammalian cell lines. Proc. Natl. Acad. Sci.
J. Mol. Biol. 347, 803–812. U.S.A. 93, 11487–11492.
Chapter 5
Cell-Free Protein Production for NMR Studies

Mitsuhiro Takeda and Masatsune Kainosho
Abstract
The cell-free expression system using an Escherichia coli extract is a practical method for producing
isotope-labeled proteins. The advantage of the cell-free system over cellular expression is that any isotope-
labeled amino acid can be incorporated into the target protein with minimal scrambling, thus providing
opportunities for advanced isotope labeling of proteins. We have modified the standard protocol for E. coli
cell-free expression to cope with two problems specific to NMR sample preparation. First, endogenous
amino acids present in the E. coli S30 extract lead to dilution of the added isotope. To minimize the content
of the remaining amino acids, a gel filtration step is included in the preparation of the E. coli extract.
Second, proteins produced by the cell-free system are not necessarily homogeneous due to incomplete
processing of the N-terminal formyl-methionine residue, which complicates NMR spectra. Therefore, the
protein of interest is engineered to contain a cleavable N-terminal histidine-tag, which generates a homo-
geneous protein after the digestion of the tag. Here, we describe the protocol for modified E. coli cell-free
expression.
Key words: Cell-free synthesis, S30 extract from E. coli, NMR, Stable isotope labeling
1. Introduction
The NMR study of a protein starts by expressing the isotope-labeled
protein. It is now widely accepted that the cell-free expression
system is one of the most practical methods available for producing
isotope-labeled proteins, along with the cellular expression system
(1, 2). The cell-free system employs a cell extract from an organism
such as Escherichia coli (3–7), an extract from wheat germ (8) or a
mixture of recombinant proteins (9). As a consequence of its
reduced amino acid metabolic activity in vitro, isotope-labeled
amino acids can be efficiently incorporated into a target protein
with minimal scrambling (10), which generates new opportunities
for advanced selective isotope labeling of proteins, such as stereo-
array isotope-labeled proteins (11–13).
We utilize a cell-free system, in which the transcription and
translation systems are reconstituted in a vessel by combining an
E. coli extract and several components. The E. coli cell-free system
has a long history and the level of expression has been improving
due to extensive investigations. The E. coli cell-free protein expression
system is generally regarded as a practical method for producing
isotope-labeled proteins, along with the cellular expression system.
E. coli extract is now commercially available. In many of the E. coli
cell-free systems, transcription is mediated by T7 RNA polymerase,
and thus any plasmid encoding a DNA sequence under the control
of the T7 promoter, such as the pET vectors (Novagen), can be
used. Therefore, it is now relatively easy to test for expression of
proteins of interest by using the E. coli cell-free system.
We encountered two problems with the standard E. coli cell-free
protocol in terms of NMR sample preparation, and addressed
both of them (14). First, the endogenous
amino acids present in the E. coli S30 extract are incorporated into
the target protein, which leads to isotope dilution. When (2H, 13C,
15N)-calmodulin (CaM) protein was produced by using an E. coli
S30 extract prepared by the standard protocol, the labeling efficiency
was only ~90%. Therefore, to minimize the residual amino acids,
we introduced a gel filtration step into the protocol for preparing
the E. coli S30 extract. As a result, the labeling efficiency increased
to 96% without loss of activity in the extract (14).
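The size of the isotope dilution can be illustrated with a simple pooling model. This model is an assumption made here for illustration and is not part of the published protocol: if labeled and endogenous (unlabeled) amino acids are incorporated in proportion to their amounts, then labeling efficiency = labeled / (labeled + unlabeled). The sketch below inverts that relation for the two efficiencies quoted above; the function name is hypothetical.

```python
def unlabeled_to_labeled_ratio(labeling_efficiency: float) -> float:
    """Ratio of unlabeled to labeled amino acids implied by a labeling
    efficiency, under the pooling assumption
    efficiency = labeled / (labeled + unlabeled)."""
    if not 0.0 < labeling_efficiency <= 1.0:
        raise ValueError("labeling efficiency must be in (0, 1]")
    return (1.0 - labeling_efficiency) / labeling_efficiency


standard = unlabeled_to_labeled_ratio(0.90)      # standard S30 extract
gel_filtered = unlabeled_to_labeled_ratio(0.96)  # gel-filtered S30 extract
fold_reduction = standard / gel_filtered

print(f"standard extract:     {standard:.3f} unlabeled per labeled")
print(f"gel-filtered extract: {gel_filtered:.3f} unlabeled per labeled")
print(f"apparent reduction of the unlabeled pool: {fold_reduction:.1f}-fold")
```

Under this assumption, the step from ~90% to 96% corresponds to the unlabeled pool shrinking by a factor of roughly 2.7 relative to the added labeled amino acids.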
Second, the formyl group of the N-terminal methionine in
proteins produced by the cell-free system is not processed completely.
The presence of the formyl moiety causes large chemical shift
changes in nearby residues, giving rise to doubled peaks, which
correspond to formylated and deformylated forms. For example,
the NMR signals from CaM, which has a flexible N terminus, are
affected by the heterogeneity over an extensive region (14). Larger
spectral complications may arise from heterogeneity of the peptide
chain in proteins with a structured N terminus. We solved this
problem by engineering the target protein with a cleavable
N-terminal tag. After cleaving the tag, the protein was homoge-
neous, and the (1H, 15N) heteronuclear single quantum correlation
(HSQC) spectrum of this CaM preparation lacked detectable extra
peaks (14).
To engineer proteins with an N-terminal cleavable tag, a template
vector encoding an N-terminal tag followed by the multicloning
site is useful. We constructed a family of vectors by modifying a
pIVEX2.3d vector (Roche) to encode the N-terminal tag. In these
vectors, the amino acid sequences are identical, but the DNA
sequence in the region of the N-terminal tag is different. Introducing
silent mutations into the DNA sequence in the N-terminal region
can lead to different expression levels of the target protein (14).
Therefore, plasmids with a higher expression level can be isolated
when the target DNA sequence is introduced into the modified
pIVEX vectors. In the case of CaM, for example, improvements of
the protein yields relative to the original sequence were observed for
two of the ten pIVEX vectors. In the large-scale reaction performed
with the best sequence and 51.1 mg of amino acid mixture
(1.7 mg/mL), the amount of CaM synthesized was 5.2 mg (10
wt%). This was twice the yield from the construct prior to the silent
mutagenesis (14).
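The weight-percent yield quoted above follows directly from the two masses. The short sketch below reproduces that arithmetic and also derives the solution volume implied by the stated amino acid concentration; the text does not say whether that volume refers to the reaction solution alone or to the reaction plus dialysis solutions, so it is reported only as an implied figure. Variable and function names are illustrative.

```python
def weight_percent_yield(protein_mg: float, amino_acids_mg: float) -> float:
    """Protein obtained as a percentage of the amino acid mass supplied."""
    return 100.0 * protein_mg / amino_acids_mg


amino_acids_mg = 51.1        # amino acid mixture used in the large-scale run
amino_acids_mg_per_ml = 1.7  # stated concentration of the mixture
cam_mg = 5.2                 # CaM synthesized

yield_wt_percent = weight_percent_yield(cam_mg, amino_acids_mg)
implied_volume_ml = amino_acids_mg / amino_acids_mg_per_ml

print(f"yield: {yield_wt_percent:.1f} wt%")
print(f"implied solution volume: {implied_volume_ml:.1f} mL")
```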
2. Materials
2.1. Preparation of E. coli S30 Extract
1. RNase-free Milli-Q PF Plus water (Millipore) (see Note 1).
2. LB medium: Dissolve 20 g of Luria Broth powder in 1 L of
water.
3. Incomplete rich medium: Dissolve 5.6 g of KH2PO4, 28.9 g of
K2HPO4, 1 g of Bacto yeast extract, and 1.5 mg of thiamine
hydrochloride in 1 L of water, and autoclave the solution. Cool
to room temperature and add 50 mL of 40% (w/v) D-glucose
and 10 mL of 100 mM Mg(OAc)2.
4. BL21 Star (DE3) cells (Invitrogen) (see Note 2).
5. French press (see Note 3).
6. S30 buffer: Combine 10 mL of 1 M Tris–OAc, pH 8.2, 10 mL
of 1.4 M Mg(OAc)2, and 10 mL of 6 M KOAc in a final
volume of 1 L of RNase-free water, autoclave, and add 1 mL
of 1 M DTT.
7. Gel filtration apparatus: Pack Sephadex G25 resin into a
2.5 × 20-cm chromatography column and set the column verti-
cally at 4°C. After the resin settles, attach a funnel to the top of
the column and equilibrate with 500 mL of S30 buffer (see
Note 4).
8. Dialysis tubing (molecular weight cutoff: 6–8 kDa).
9. Polyethylene glycol (PEG)-8000: Avg. MW 8,000 Da.
10. 2-mercaptoethanol.
11. Diethyl pyrocarbonate (DEPC).
12. Centrifugation tube: Wash the tubes with DEPC water and
autoclave to completely remove RNase.
13. 40% (w/v) D-glucose.
14. 100 mM Mg(OAc)2.
15. 1 M Tris–OAc, pH 8.2.
16. 1.4 M Mg(OAc)2.
17. 6 M KOAc.
18. 1 M DTT.
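The final composition of the S30 buffer can be checked from the stock volumes given in item 6 with the usual dilution relation C_final = C_stock × V_added / V_final. The sketch below is a minimal calculation under the assumption that the 1 mL of DTT added after autoclaving does not change the 1-L final volume appreciably; the names are illustrative.

```python
# Stock concentration (M) and volume added (mL) for each S30 buffer component,
# taken from the recipe in item 6.
S30_BUFFER_STOCKS = {
    "Tris-OAc, pH 8.2": (1.0, 10.0),
    "Mg(OAc)2": (1.4, 10.0),
    "KOAc": (6.0, 10.0),
    "DTT": (1.0, 1.0),
}
FINAL_VOLUME_ML = 1000.0  # assumption: the 1 mL of DTT is neglected


def final_mM(stock_molar: float, added_ml: float, final_ml: float) -> float:
    """C_final = C_stock * V_added / V_final, converted from M to mM."""
    return 1000.0 * stock_molar * added_ml / final_ml


s30_final_mM = {
    name: final_mM(stock, volume, FINAL_VOLUME_ML)
    for name, (stock, volume) in S30_BUFFER_STOCKS.items()
}

for name, conc in s30_final_mM.items():
    print(f"{name}: {conc:.0f} mM")
```

The recipe therefore corresponds to roughly 10 mM Tris–OAc, 14 mM Mg(OAc)2, 60 mM KOAc, and 1 mM DTT.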
2.2. Preparation of T7 RNA Polymerase
1. M9 medium supplemented with tryptone peptone: Combine
1 g of Tryptone peptone (Bectone), 5 g of NaCl, 1 g of NH4Cl,
3 g of KH2PO4, and 6 g of Na2HPO4, and add water to 1 L.
After autoclaving, add 10 mL of sterile-filtered 40% (w/v)
glucose and 1 mL of sterile 1 M MgSO4.
2. Sonication buffer A: 50 mM Tris–HCl, pH 8.1, 20 mM NaCl,
2 mM EDTA, and 1 mM DTT.
3. Sonication buffer B: 50 mM Tris–HCl, pH 8.1, 20 mM NaCl,
2 mM EDTA, 1 mM DTT, 1.5 mg/mL lysozyme, 20 μg/mL
PMSF, 10 μg/mL bacitracin, and 0.1 mM benzamidine.
4. Dialysis buffer: 20 mM sodium phosphate–NaOH, pH 7.7,
1 mM EDTA, 1 mM DTT, and 5% glycerol.
5. SP-Sepharose Fast Flow (GE Healthcare): This resin is packed
into a column (3 × 30 cm).
6. Elution buffer: 20 mM sodium phosphate–NaOH, pH 7.7,
1 mM EDTA, 1 mM DTT, 5% glycerol, 200 mM NaCl, and
20 μg/mL PMSF.
7. Isopropyl-beta-D-thiogalactopyranoside (IPTG).
8. 0.8% (w/v) sodium deoxycholate.
9. 2 M Ammonium sulfate.
10. 10% (w/v) Polyethyleneimine–HCl, pH 7.0.
11. Saturated ammonium sulfate: Adjust pH to 7.0 using 1 M Tris.
12. E. coli strain BL21 (DE3) transformed with a plasmid encod-
ing T7 RNA polymerase.
13. LB medium (Subheading 2.1).
14. 40% D-Glucose (Subheading 2.1).
15. 1 M MgSO4: Sterilize by autoclaving.
2.3. Cell-Free Synthesis Reaction
1. LM mixture: To prepare the low-molecular-weight mixture,
combine 22 mL of 2 M HEPES–KOH, pH 7.5, 33.4 mL of
6 M KOAc, 210 mg of DTT, 530 mg of ATP, 338 mg of
CTP, 335 mg of GTP, 310 mg of UTP, 172 mg of cAMP,
28 mg of folinic acid, and 140 mg of tRNA, and, if needed,
64 mL 50% (w/v) PEG-8000 (see Note 5), and then add
RNase-free water up to 200 mL (see Note 6).
2. Dialysis membrane for cell-free reaction: A size 8 cellulose
membrane (Viskase Sales Corporation) ensures a reproducible
expression level. Cut the membrane to an appropriate size.
First, soak the membrane in RNase-free water and then warm
it to 70–80°C. After it has cooled down, wash the membrane
with RNase-free water. Second, soak the membrane in water
containing ~50 mM of NaHCO3, warm it in a microwave to
remove glycerol, and then wash it with RNase-free water.
Third, soak the membrane in water containing ~1 mM EDTA,
warm it in a microwave to further remove glycerol, and then
wash it with RNase-free water three times. Finally, soak the
membrane in water and autoclave it.
3. 0.645 M Creatine phosphate (see Note 7).
4. 10 mg/mL Creatine kinase (see Note 7).
5. RNase inhibitor (Porcine liver).
6. 1.4 M NH4OAc or (15N) NH4OAc (see Note 8).
7. T7 RNA polymerase at ~11 mg/mL (see Note 9).
8. Template DNA: A plasmid containing the sequence encoding
the target protein at ~1 mg/mL (see Note 10).
9. 2× Loading buffer: 2 mL of 0.5 M Tris–HCl, pH 6.8, 4 mL of
10% sodium dodecyl sulfate (SDS), 1.2 mL of 2-mercapto-
ethanol, 2 mL of glycerol, 1–2 mg of bromophenol blue, and
0.8 mL of water.
10. RNase-free water (Subheading 2.1).
11. 0.5 M Mg(OAc)2.
12. S30 extract (Subheading 3.1).
13. Reaction solution: Mix 112 μL of RNase-free water, 9.8 μL of
1.4 M NH4OAc, 15 μL of 0.5 M Mg(OAc)2, 20 μL of SAIL
amino acids, 40 μL of 0.645 M creatine phosphate, 125 μL of
LM mixture, 10 μL of 1 mg/mL template DNA, 4.5 μL of
11 mg/mL T7 RNA polymerase, 1.25 μL of 40 U/mL RNase
inhibitor, 12.5 μL of 10 mg/mL creatine kinase, and 150 μL
of S30 extract for 0.5 mL total volume.
14. Dialysis solution: Mix 1160.8 μL of RNase-free water, 39.2 μL
of 1.4 M NH4OAc, 60 μL of 0.5 M Mg(OAc)2, 80 μL of SAIL
amino acids, 160 μL of 0.645 M creatine phosphate, and
500 μL of LM mixture for 2 mL total volume.
15. Phosphate buffer: 10 mM potassium phosphate, pH 7.0.
Adjust pH using NaOH or HCl.
16. 6 M KOAc (Subheading 2.1).
17. 50% (w/v) PEG-8000.
18. 10% SDS.
19. 0.5 M Tris–HCl, pH 6.8.
20. 2-Mercaptoethanol.
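As a consistency check on item 13, the listed volumes can be summed and converted into final concentrations in the reaction solution. The sketch below uses only the volumes and stock concentrations given in the materials list; the Mg(OAc)2 figure counts only the magnesium added from the 0.5 M stock, not any carried over in the S30 extract. Dictionary and variable names are illustrative.

```python
# Volumes (uL) of the 0.5-mL reaction solution, as listed in item 13.
REACTION_UL = {
    "RNase-free water": 112.0,
    "NH4OAc": 9.8,
    "Mg(OAc)2": 15.0,
    "SAIL amino acids": 20.0,
    "creatine phosphate": 40.0,
    "LM mixture": 125.0,
    "template DNA": 10.0,
    "T7 RNA polymerase": 4.5,
    "RNase inhibitor": 1.25,
    "creatine kinase": 12.5,
    "S30 extract": 150.0,
}
# Stock concentrations from the materials list: (value, unit).
STOCKS = {
    "NH4OAc": (1400.0, "mM"),
    "Mg(OAc)2": (500.0, "mM"),  # added magnesium only
    "creatine phosphate": (645.0, "mM"),
    "template DNA": (1000.0, "ug/mL"),
    "T7 RNA polymerase": (11000.0, "ug/mL"),
    "creatine kinase": (10000.0, "ug/mL"),
}

total_ul = sum(REACTION_UL.values())
final_conc = {
    name: (stock * REACTION_UL[name] / total_ul, unit)
    for name, (stock, unit) in STOCKS.items()
}
volume_fraction = {
    name: REACTION_UL[name] / total_ul for name in ("LM mixture", "S30 extract")
}

print(f"total volume: {total_ul:.2f} uL")
for name, (conc, unit) in final_conc.items():
    print(f"{name}: {conc:.1f} {unit}")
for name, fraction in volume_fraction.items():
    print(f"{name}: {100 * fraction:.0f}% (v/v)")
```

The volumes add up to 500.05 μL, i.e., the stated 0.5 mL, with the LM mixture and the S30 extract making up about 25% and 30% of the volume, respectively.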
3. Methods
The E. coli S30 extract preparation protocol essentially follows
previously reported procedures (3, 7, 15), but contains a gel filtra-
tion step to minimize the endogenous amino acids. Typically, prepa-
ration of the S30 extract takes 2–3 days for one person. We strongly
recommend saving an aliquot of the extract at each step to monitor
the activity throughout the purification. T7 RNA polymerase is
also prepared following established methods (16–18). The prepa-
ration of T7 RNA polymerase takes 1 week for one person.
The cell-free expression strategy for new proteins is shown in
Fig. 1. As the first step, we recommend starting with E. coli
cellular expression of the target protein with uniform 15N labeling.
Fig. 1. Flowchart of the cell-free strategy (reproduced from ref. 14 with permission from
Elsevier Science).
It is less costly and more efficient to produce proteins from E. coli
cells than by using the cell-free method, and cellular expression is
better suited for optimizing the purification protocol and the NMR
buffer conditions. In general, a protein that exhibits poor cellular
expression is also likely to show poor cell-free expression, unless
the protein produced is toxic to E. coli cells.
Next, several parameters are optimized to maximize the expression
level in the cell-free reaction based on small-scale reactions under
different conditions. There are various parameters to consider in
the optimization process, including the concentration of magne-
sium, incubation temperature, incubation time, and concentration
of PEG. If necessary, the target DNA sequence can be transferred
to the line of pIVEX vectors with different silent mutations, as
described in the introduction. When an acceptable expression level
is attained, a large quantity of the 15N-labeled protein is produced
and its (1H, 15N)-HSQC spectrum is compared with that of the
protein produced by cellular expression. This comparison helps to
detect any possible difference between in vivo and in vitro expres-
sion. Special attention should be paid to the incomplete deformy-
lation of the N terminus of proteins in the cell-free reaction. The
presence of formylated and deformylated species complicates NMR
spectra. To overcome this problem, the use of an engineered
N-terminal cleavable tag is helpful, as shown in Fig. 2.
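One way to organize the small-scale optimization described above is to enumerate the combinations of parameters to be tested. The sketch below builds such a grid with the standard-library function itertools.product. The specific levels are hypothetical placeholders chosen around the conditions of this protocol (15 mM added magnesium, 37°C, 4–8 h, and about 4% PEG when PEG is included in the LM mixture); they are not recommendations from the protocol, and in practice one would usually vary one or two parameters at a time rather than run the full grid.

```python
from itertools import product

# Hypothetical screening levels for small-scale (0.5 mL) test reactions.
screen_levels = {
    "added_mg_oac_mM": [10, 15, 20],
    "temperature_C": [30, 37],
    "incubation_h": [4, 8],
    "peg_percent_wv": [0, 4],
}

# One dictionary per test reaction, covering every combination of levels.
conditions = [
    dict(zip(screen_levels, combination))
    for combination in product(*screen_levels.values())
]

print(f"{len(conditions)} test reactions")
for number, condition in enumerate(conditions[:3], start=1):
    print(number, condition)
```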
3.1. Preparation of E. coli S30 Extract
1. Inoculate 10 mL of LB medium with the E. coli stock cells
(BL21 Star (DE3) strain) and incubate the preculture over-
night at 37°C with shaking.
2. Add the preculture to 1 L of incomplete rich medium
prewarmed to 37°C, and incubate the culture at the same
temperature with shaking to an optical density at 600 nm
(OD600) of 0.7 (see Note 11).
3. Centrifuge the cells at 5,000 × g for 10 min at 4°C and then
wash them three times with 200 mL of ice-cold S30 buffer
supplemented with 0.05% 2-mercaptoethanol (see Note 12).
Store the cells as a pellet at −80°C.
4. Gently resuspend the cell pellet in 200 mL of ice-cold S30
buffer supplemented with 0.05% 2-mercaptoethanol. Centrifuge
at 5,000 × g for 10 min at 4°C and weigh the pellet. Resuspend
the pellet in 1.27 mL of S30 buffer per gram of cells.
5. Disrupt the cells with a French Press at 20,000 psi
(1,400 kg cm−2) (see Note 13). Immediately after the disrup-
tion, add 30 μL of 1 M DTT and centrifuge the lysate at
30,000 × g for 30 min at 4°C in DEPC-treated/autoclaved
centrifuge tubes. Carefully collect ~1.4 mL of supernatant per
gram of E. coli without mixing with the precipitate.
[Fig. 2 image: (1H, 15N) HSQC spectra, panels (a) and (b). Horizontal axis: 1H (p.p.m.), 11 to 6; vertical axis: 15N (p.p.m.), 105 to 130. Cross-peaks are labeled with residue assignments.]
Fig. 2. (1H, 15N) HSQC experiments of (U-13C, 15N) CaM. (a) CaM synthesized by the E. coli
cell-free system. (b) CaM synthesized with the N-terminal tag following its removal by
thrombin digestion. The extra peaks in (a) and the peaks in (b) are labeled with their
assignments (reproduced from ref. 14 with permission from Elsevier Science).
6. Transfer the collected supernatant to fresh DEPC-treated/
autoclaved tubes. Centrifuge at 30,000 × g for 30 min at 4°C
and collect ~1.0 mL of the supernatant per gram of E. coli and
place it into a 50-mL tube.
7. Incubate the 50-mL tube at 37°C for 80 min with shaking (see
Note 14).
8. Dialyze the solution for 45 min against 2 L of S30 buffer at
4°C using dialysis tubing with a MWCO of 6–8 kDa.
Repeat the dialysis, and then centrifuge the solution at
15,000 × g for 10 min at 4°C.
9. Load the supernatant onto a gel-filtration column preequili-
brated at 4°C with S30 buffer. Maintain a continuous flow of
S30 buffer in the
column. When the sample first begins to exit the column, as
judged from its color and turbidity (see Note 15), collect, in
bulk, 1.4 times the total volume applied to the column.
10. Dialyze the collected bulk solution at 4°C for 40–50 min
against 700 mL of an equal weight mixture of PEG-8000 and
S30 buffer (see Note 16). Adjust the dialysis time so as to con-
centrate the extract to 0.86 times the initial volume. Dialyze
the extract against 2 L of S30 buffer at 4°C for 60 min to
remove PEG-8000.
11. Transfer the dialyzed S30 extract to 1.5-mL tubes, freeze it in
liquid nitrogen, and store it at −80°C (see Note 17).
12. Assess the extent to which the endogenous amino acids in the
extract are eliminated by preparing a protein by using a 2H-,
13C-, and 15N-labeled amino acid mixture, e.g., the SAIL amino
acid set. The remaining endogenous amino acids give rise to
proton signals in 13C-filtered 1H NMR experiments, as shown
in Fig. 3.
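The per-gram volumes given in steps 4–10 can be turned into a rough volume plan for a given wet cell weight. The sketch below is only an estimate under two assumptions that are not stated in the protocol: that the volume applied to the gel-filtration column is about equal to the volume of the second supernatant (step 6), and that the "initial volume" in step 10 refers to the pooled gel-filtration fraction. The function name is hypothetical.

```python
def s30_volume_plan(pellet_g: float) -> dict:
    """Approximate volumes (mL) through the S30 extract preparation
    for a washed cell pellet of the given wet weight (g)."""
    resuspension_buffer = 1.27 * pellet_g  # step 4
    first_supernatant = 1.4 * pellet_g     # step 5, approximate
    second_supernatant = 1.0 * pellet_g    # step 6, approximate
    # Assumption: steps 7 and 8 do not change the volume appreciably.
    gel_filtration_pool = 1.4 * second_supernatant  # step 9
    # Assumption: "initial volume" in step 10 is the pooled fraction.
    concentrated_extract = 0.86 * gel_filtration_pool
    return {
        "resuspension_buffer_ml": resuspension_buffer,
        "first_supernatant_ml": first_supernatant,
        "second_supernatant_ml": second_supernatant,
        "gel_filtration_pool_ml": gel_filtration_pool,
        "concentrated_extract_ml": concentrated_extract,
    }


plan = s30_volume_plan(5.0)  # example: 5 g of wet cells
for step, volume in plan.items():
    print(f"{step}: {volume:.2f}")
```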
3.2. Preparation of T7 RNA Polymerase
1. Using a glycerol stock of E. coli strain BL21 (DE3) trans-
formed with a plasmid encoding T7 RNA polymerase, inocu-
late 10 mL of LB medium containing the appropriate antibiotic
in a 50-mL tube and then incubate with shaking at 37°C
overnight.
2. Inoculate 1 L of M9 medium supplemented with tryptone
peptone and containing the appropriate antibiotic, in a 2-L flask
that has been prewarmed to 37°C, with the 10-mL culture from
step 1. Incubate the cells with shaking at 37°C until the culture
reaches an OD600 of 0.5.
3. Induce the expression of T7 RNA polymerase by adding IPTG
to a final concentration of 0.5 mM. Incubate the cells with
shaking for 8 h at 37°C and then centrifuge the culture at
5,000 × g for 10 min at 4°C. Store the cell pellet at −80°C.
Fig. 3. 13C-filtered 1H-NMR spectra of CaM. (a) 15N-labeled CaM synthesized with the
conventional extract. (b) 2H-, 13C-, 15N-labeled CaM synthesized with the conventional S30
extract. (c) 2H-, 13C-, 15N-labeled CaM synthesized with the improved, dialyzed S30 extract.
All of the spectra are adjusted for the intensities of the amide region. Since the peaks of the
protons attached to 13C are filtered, only the protons attached to 12C give rise to resonances
in the aliphatic region (reproduced from ref. 14 with permission from Elsevier Science).
4. Suspend the frozen cell pellet in 72 mL of sonication buffer A.
Add 18 mL of sonication buffer B and stir the mixture for
15 min at 4°C (see Note 18). Add 7.5 mL of 0.8% sodium
deoxycholate and stir the mixture for 15 min at 4°C. Disrupt
the cells with an ultrasonic generator (five rounds of 1-min
sonication at 5-min intervals).
5. Add 15 mL of 2 M ammonium sulfate and bring the volume
of the lysate to 150 mL by adding sonication buffer A. Then,
gradually add 15 mL of 10% polyethyleneimine and stir the
mixture on ice for 20 min. Centrifuge at 39,000 × g for 15 min
at 4°C.
6. Gradually add 0.82 volume of saturated ammonium sulfate
to the supernatant. Stir on ice for 15 min and centrifuge the
mixture at 12,000 × g for 15 min at 4°C.
7. Resuspend the precipitate in 45 mL of dialysis buffer contain-
ing 100 mM NaCl and 20 μg/mL PMSF. Dialyze the suspen-
sion against 2 L of dialysis buffer containing 100 mM NaCl
and 20 μg/mL PMSF for 3 h using a membrane with a molec-
ular weight cutoff of 6,000–8,000 Da. Repeat the dialysis for 3 h
and then overnight.
8. Centrifuge the dialysate at 12,000 × g for 10 min at 4°C. Dilute
the dialysate with one volume of dialysis buffer containing
20 μg/mL of PMSF to produce conductivity equal to that of
the dialysis buffer plus 50 mM NaCl. Load the dialysate on an
SP-Sepharose FF column (3 × 30 cm), equilibrated with dialy-
sis buffer containing 50 mM NaCl and 20 μg/mL of PMSF,
and run 2 column volumes of dialysis buffer containing
50 mM NaCl and 20 μg/mL of PMSF at a flow rate of 1 mL/min
to elute impurities. If necessary, to further remove impurities
remaining in the column, load up to 500 mL of dialysis buffer
containing 50 mM NaCl and 20 μg/mL PMSF at a flow rate
of 2 mL/min (see Note 19).
9. Elute the T7 RNA polymerase with elution buffer and dialyze
it against dialysis buffer containing 100 mM NaCl and 20 μg/mL
of PMSF at 4°C overnight.
10. Repeat the dialysis for 6 h (see Note 20). Finally, dialyze the T7
RNA polymerase against dialysis buffer containing 100 mM
NaCl and 50% glycerol at 4°C overnight. Centrifuge at 18,000 × g
for 10 min at 4°C, collect the supernatant, and determine the
concentration of the T7 RNA polymerase by measuring the
optical density at 280 nm (ε280 = 1.4 × 10^5 M−1 cm−1) (see Note 21).
Transfer the protein to 1.5-mL tubes and store it at −20°C.
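Two of the numbers in this procedure lend themselves to a quick calculation. The sketch below converts an A280 reading into a T7 RNA polymerase concentration with the Beer–Lambert law, using the extinction coefficient from step 10, and estimates the ammonium sulfate saturation reached in step 6. The molecular weight of about 99 kDa is an approximate literature value that is not given in this chapter, the saturation estimate assumes that volumes are additive and ignores the ammonium sulfate already added in step 5, and the example absorbance reading is invented for illustration.

```python
EPSILON_280 = 1.4e5          # M^-1 cm^-1, from step 10
T7_RNAP_MW_G_PER_MOL = 99_000.0  # assumption: approximate literature value


def t7_rnap_mg_per_ml(a280: float, path_cm: float = 1.0,
                      dilution_factor: float = 1.0) -> float:
    """Beer-Lambert: c = A / (epsilon * l), corrected for sample dilution.
    mol/L multiplied by g/mol gives g/L, which equals mg/mL."""
    molar = a280 * dilution_factor / (EPSILON_280 * path_cm)
    return molar * T7_RNAP_MW_G_PER_MOL


def fractional_saturation(volumes_of_saturated_added: float) -> float:
    """Saturation after adding x volumes of saturated ammonium sulfate
    to 1 volume of solution, assuming additive volumes."""
    return volumes_of_saturated_added / (1.0 + volumes_of_saturated_added)


# Hypothetical reading: A280 = 0.50 measured on a 20-fold diluted sample.
concentration = t7_rnap_mg_per_ml(0.50, dilution_factor=20.0)
saturation = fractional_saturation(0.82)  # step 6

print(f"T7 RNA polymerase: {concentration:.1f} mg/mL")
print(f"ammonium sulfate: about {100 * saturation:.0f}% saturation")
```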
3.3. Cell-Free Synthesis Reaction
1. Wear sanitary gloves (see Note 22) and thaw the LM mixture
and the S30 extract (see Note 23).
2. Prepare the reaction and dialysis solutions (Subheading 2.3).
In the case of a small-scale reaction for evaluating expression
levels, the typical volumes of the reaction and dialysis solu-
tions are 0.5 and 2.0 mL, respectively. For large production
quantities, each volume is scaled up while maintaining the
volume ratio of the reaction solution to the dialysis solution
(see Note 24).
3. Transfer the dialysis solution to an RNase-free vessel. Tie one
end of the dialysis tube firmly. Transfer the reaction solution to
the dialysis tube, and tie off the other end. Fold the tubing 4–6
times and place it into the vessel such that the dialysis tube is
completely submerged in the dialysis solution.
4. Incubate the vessel with shaking for 4–8 h at 37°C (see
Note 25).
5. Retrieve the reaction and dialysis solutions. If the expressed
protein has a molecular weight smaller than molecular weight
cutoff of the membrane, then it will be found in both the
reaction and dialysis solutions.
6. To evaluate the expression level, centrifuge 50-μL aliquots of
the reaction and dialysis solutions at 5,000 × g for 1 min at 4°C.
Mix the supernatant with an equal volume of 2× loading
buffer, incubate for 5 min at 95°C, and load 10 μL on a poly-
acrylamide gel. Dissolve the precipitate in 50 μL of phosphate
buffer (pH 7.0), mix it with an equal volume of 2× loading
buffer, and load 10 μL on a polyacrylamide gel.
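The scale-up described in step 2 keeps the 0.5 mL : 2.0 mL ratio of reaction to dialysis solution. The sketch below shows one way to compute the dialysis volume and the scaled component volumes for a chosen reaction volume, using the dialysis solution recipe from Subheading 2.3 as the example; the function name and the 5-mL example are illustrative, not taken from the protocol.

```python
# Dialysis solution for a 0.5-mL reaction (uL per 2 mL), Subheading 2.3, item 14.
DIALYSIS_UL_PER_2_ML = {
    "RNase-free water": 1160.8,
    "1.4 M NH4OAc": 39.2,
    "0.5 M Mg(OAc)2": 60.0,
    "SAIL amino acids": 80.0,
    "0.645 M creatine phosphate": 160.0,
    "LM mixture": 500.0,
}
SMALL_SCALE_REACTION_ML = 0.5
SMALL_SCALE_DIALYSIS_ML = 2.0


def plan_scale_up(reaction_ml: float):
    """Return the dialysis volume (mL) and the scaled dialysis recipe (mL)
    for a reaction of the given volume, keeping the small-scale ratio."""
    factor = reaction_ml / SMALL_SCALE_REACTION_ML
    dialysis_ml = SMALL_SCALE_DIALYSIS_ML * factor
    scaled_ml = {
        name: volume_ul * factor / 1000.0
        for name, volume_ul in DIALYSIS_UL_PER_2_ML.items()
    }
    return dialysis_ml, scaled_ml


dialysis_ml, scaled_ml = plan_scale_up(5.0)  # example: 5-mL reaction
print(f"dialysis solution: {dialysis_ml:.1f} mL")
for name, volume in scaled_ml.items():
    print(f"  {name}: {volume:.3f} mL")
```

The same scaling factor applies to the reaction solution recipe, so both solutions keep their small-scale composition.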
4. Notes
1. Water, treated with DEPC and then autoclaved, can also be
used as RNase-free water (19). However, handle it carefully,
since DEPC is a carcinogen.
2. The expression level obtained with extract from BL21 Star is
1.2–1.4 times higher than that obtained with the A19 strain.
3. The disruption of the E. coli cells can be accomplished by using
equipment other than a French press.
4. The uniformity of the filled resin affects the resolution of gel
filtration. The resin is used only once for each preparation.
5. While the use of PEG increases the expression level in some
cases, it hampers SDS-PAGE analysis. Therefore, PEG should
be removed by ethanol precipitation when the sample is sub-
jected to SDS-PAGE analysis.
6. The prepared LM mixture can be frozen at −20°C for 1 month
or more.
7. The creatine phosphate and creatine kinase should be prepared
just prior to use.
8. If 15N labeling of the side-chain amide groups in asparagine
and glutamine residues is intended, use (15N)-NH4OAc.
9. While we prepare the T7 RNA polymerase ourselves, commer-
cially available T7 RNA polymerase can also be used.
10. The cell-free system uses a large amount of the plasmid, and
thus a high-copy plasmid saves both cost and time.
11. The growth rate of the cells in the culture correlates with the
activity of the resulting extract (20). We strongly recommend
monitoring the growth rate. If the growth rate is slow, then it
might be better to restart the culture. On the other hand,
overgrowth, i.e., OD600 larger than 0.9, leads to reduced
activity of the cells.
12. Do not allow the suspension to foam.
13. The cell disruption is a critical step that affects the activity of
the extract.
14. This runoff reaction was performed with a preincubation mix-
ture (14). The expression level does not change, even without
the supplement (21).
15. The fraction to be collected looks yellow.
16. Before use, stir the PEG-S30 buffer at 4°C with a large mag-
netic stir bar to avoid PEG deposition.
17. The presence of DTT is necessary for preserving the activity
over time. The activity of an S30 extract prepared without
DTT decreases by 10–20% after 3 months of storage at −80°C.
18. The viscosity of the solution increases because of the lysozyme.
19. Wash until the optical density at 280 nm becomes constant.
20. The protein may precipitate during the dialysis.
21. If necessary, concentrate the solution up to about 10 mg/mL
with an ultrafiltration device.
22. When setting up the cell-free reaction, use sanitary gloves to
prevent contamination with RNases.
23. The frozen S30 extract should be thawed gently on ice.
24. Tryptophan and tyrosine are poorly soluble. If
needed, warm the amino acid solutions to 60°C. Do not vigor-
ously shake the solution after the template DNA has been
added.
25. PEG is assumed to play a role as a stabilizer of the system. For
example, in the case of CaM, the ratio of protein yields with
and without PEG was 1.6. However, even a small amount of
PEG can disturb the SDS-PAGE quantitation of the production
level of a target protein. If the solution contains PEG, then
precipitate the supernatant with ethanol. Empirically, however,
it is difficult to completely remove PEG even by ethanol
precipitation. Therefore, optimizing the PEG concentration
is somewhat difficult in small-scale cell-free reactions.
Acknowledgments
This work was supported by a grant from the Targeted Protein
Research Program (MEXT) to M. Kainosho and a Grant-in-Aid
for Young Scientists (B) (21770110) to M. Takeda.
References
1. Clemens, M.J., and Pruijn, G.J. (1999) Protein synthesis in Eukaryotic Cell-free systems. Oxford University Press, New York, pp. 129–165.
2. Kramer, G., Kudlicki, W., and Hardesty, B. (1999) Cell-free Coupled Transcription-Translation systems from Escherichia coli. Oxford University Press, New York, pp. 129–165.
3. Zubay, G. (1973) In vitro synthesis of protein in microbial systems. Ann. Rev. Gen. 7, 267–287.
4. Spirin, A.S., Baranov, V.I., Ryabova, L.A., Ovodov, S.Y., and Alakhov, Y.B. (1988) A continuous cell-free translation system capable of producing polypeptides in high yield. Science 242, 1162–1164.
5. Kim, D.M., Kigawa, T., Choi, C.Y., and Yokoyama, S. (1996) A highly efficient cell-free protein synthesis system from Escherichia coli. Eur. J. Biochem. 239, 881–886.
6. Kim, D.M., and Swartz, J.R. (2000) Prolonging cell-free protein synthesis by selective reagent additions. Biotechnol. Prog. 16, 385–390.
7. Kigawa, T., Yabuki, T., Yoshida, Y., Tsutsui, M., Ito, Y., Shibata, T., and Yokoyama, S. (1999) Cell-free production and stable-isotope labeling of milligram quantities of proteins. FEBS Lett. 442, 15–19.
8. Madin, K., Sawasaki, T., Ogasawara, T., and Endo, Y. (2000) A highly efficient and robust cell-free protein synthesis system prepared from wheat embryos: plants apparently contain a suicide system directed at ribosomes. Proc. Natl. Acad. Sci. U.S.A. 97, 559–564.
9. Shimizu, Y., Inoue, A., Tomari, Y., Suzuki, T., Yokogawa, T., Nishikawa, K., and Ueda, T. (2001) Cell-free translation reconstituted with purified components. Nat. Biotechnol. 19, 751–755.
10. Kigawa, T., Muto, Y., and Yokoyama, S. (1995) Cell-free synthesis and amino acid-selective stable isotope labeling of proteins for NMR analysis. J. Biomol. NMR 6, 129–134.
11. Kainosho, M., Torizawa, T., Iwashita, Y., Terauchi, T., Ono, A.M., and Güntert, P. (2006) Optimal isotope labelling for NMR protein structure determinations. Nature 440, 52–57.
12. Takeda, M., Ikeya, T., Güntert, P., and Kainosho, M. (2007) Automated structure determination of proteins with the SAIL-FLYA NMR method. Nature Protocol 2, 2896–2902.
13. Kainosho, M., and Güntert, P. (2010) SAIL-Stereo-array isotope labeling. Q. Rev. Biophys. 7, 1–54.
14. Torizawa, T., Shimizu, M., Taoka, M., Miyano, H., and Kainosho, M. (2004) Efficient production of isotopically labeled proteins by cell-free synthesis: A practical protocol. J. Biomol. NMR 30, 311–325.
15. Pratt, J.M. (1984) Transcription and Translation: A Practical Approach, IRL Press, New York, pp. 179–209.
16. Davanloo, P., Rosenberg, A.H., Dunn, J.J., and Studier, F.W. (1984) Cloning and expression of the gene for bacteriophage T7 RNA polymerase. Proc. Natl. Acad. Sci. U.S.A. 81, 2035–2039.
17. Zawadzki, V., and Gross, H.J. (1991) Rapid and simple purification of T7 RNA polymerase. Nucl. Acid Res. 19, 1948.
18. Grodberg, J., and Dunn, J.J. (1988) ompT encodes the Escherichia coli outer membrane protease that cleaves T7 RNA polymerase during purification. J. Bacteriol. 170, 1245–1253.
19. Huang, Y.H., Leblanc, P., Apostolou, V., Stewart, B., and Moreland, R.B. (1995) Comparison of Milli-Q PF Plus water to DEPC-treated water in the preparation and analysis of RNA. Biotechniques 19, 656–661.
20. Zawada, J., and Swartz, J.R. (2006) Effects of growth rate on cell extract performance in cell-free protein synthesis. Biotechnol. Bioeng. 94, 618–624.
21. Liu, D.V., Zawada, J.F., and Swartz, J.R. (2005) Streamlining Escherichia coli S30 extract preparation for economical cell-free protein synthesis. Biotechnol. Prog. 21, 460–465.
Chapter 6
Cell-Free Membrane Protein Expression for Solid-State NMR

Alaa Abdine, Kyu-Ho Park, and Dror E. Warschawski
Abstract
Although cell-free expression is a relative newcomer to the biochemical toolbox, it has already been
reviewed extensively, even in the more specialized cases such as membrane protein expression, nanolipo-
protein particles, and applications to crystallography and nuclear magnetic resonance (NMR). Solid-state
NMR is also a newcomer to the structural biology toolbox, with its own specificities in terms of sample
preparation. Cell-free expression and solid-state NMR are a promising combination that has already proven
useful for the structural study of membrane proteins in their native environment, the hydrated lipid bilayer.
We describe below several protocols for preparing MscL, a mechanosensitive membrane channel, using
cell-free expression destined for a solid-state NMR study. These protocols are flexible and can easily be
applied to other membrane proteins, with minor adjustments.
Key words: In vitro synthesis, Integral membrane proteins, Membrane protein reconstitution,
Liposomes, Solid-state NMR
1. Introduction
Cell-free expression is one of the major new developments in
structural biology, both for nuclear magnetic resonance (NMR)
and X-ray crystallography, because it allows for overproduction of
proteins, both wild-type and mutant, that are often produced with
prohibitively low yields using classical biosynthetic methods.
Proteins that are expressed cell-free end up in the reaction con-
tainer and hence in water where membrane proteins precipitate,
often irreversibly. Cell-free expression was first thought to be
impractical for membrane protein expression because the system
did not include a biological apparatus for targeting the protein to
the membrane (1). Providing the medium with detergent was also
considered risky since detergent could perturb protein expression
by interfering with the ribosome, polymerase or any other essential
component of the cell-free expression system. These obstacles were
overcome in 2004, when several groups, using optimized cell
lysates, developed protocols for membrane protein expression
in vitro, in the presence of detergents (2–4).
For X-ray crystallography and solution-state NMR, the major
techniques used in structural biology today, membrane protein
structure determination is still a challenge: first, because the aforementioned
necessity of obtaining large quantities of functional
proteins is aggravated, in vivo, by the limited membrane surface
available (5) and second, because membrane proteins are hydro-
phobic and have to be manipulated in detergents at all times when
they are extracted from their native environment, the lipid bilayer
(6). Detergents or other surfactants are used for membrane solubi-
lization and during protein purification (which often requires an
affinity tag, such as a polyhistidine stretch). They can interfere with
protein crystallization or make aggregates that are too large for
solution-state NMR. Last but not least, detergents can interfere
with protein folding or function, and the protein structure or
dynamics determined in a detergent environment is not necessarily
representative of the native structure or dynamics (7, 8). Nevertheless,
since 1985, over 200 membrane protein structures have been
determined by using X-ray crystallography (9) and, since 1997,
about 30 structures by using solution-state NMR (10). In this
context, membrane protein cell-free expression has been developed
and reviewed, especially for isotope labeling strategies, another
issue for NMR, where in vitro synthesis offers a very efficient
and versatile alternative that greatly minimizes amino acid scram-
bling (11, 12).
Solid-state NMR is an alternative for membrane protein
structure determination inside the hydrated lipid bilayer, where the
protein is correctly folded and stable for a long time. Sample preparation
for solid-state NMR is therefore different from that for solution-state
NMR. If the protein is expressed cell-free in the presence of
detergent, it needs to be purified, renatured and reconstituted in a
membrane bilayer. Compared to a protein expressed in a cell, the
solubilization step is avoided and the purification step is greatly
simplified and accelerated, reducing the chances of spurious pro-
teolysis. Reconstitution of the protein in a lipid membrane is the
price to pay to obtain a sample where the membrane protein is
almost certainly in its native state, and where the protein function
can be checked and monitored (13, 14). All these steps are quite
flexible, and they are described below for the expression of the
mechanosensitive channel MscL. This protocol is general for
membrane proteins, although details such as the nature or the
concentration of detergent can vary from one protein to the next.
Cell-free expression in the presence of liposomes or nanodiscs is an
alternative sample preparation approach that has been developed
more recently and that has yet to be proven general for membrane
proteins (15–17). When possible, it presents many advantages for
solid-state NMR studies. First and foremost, the protein is already
in its final state, saving time and avoiding many biochemical steps
where the protein may be partially lost or inactivated. Second, the
protein is purified simply by centrifugation, alleviating the necessity
of adding an affinity tag. In addition, no detergent is used that can
interfere and that needs to be removed. Importantly, the protein is
in its native environment at all times, where it is well folded, func-
tional and stable (18). Finally, this approach is quite flexible, allowing
for a variety of labeling strategies (19). In our hands, cell-free
expression of MscL in the presence of liposomes has proven
efficient (3, 14, 18, 19) and we describe it below, as it is the best
method so far for providing a solid-state NMR sample.
The protocols presented here make use of commercial kits for
cell-free expression, which we found advantageous for their convenience
and reliability, for saving the time and manpower needed to make the
lysate, and for simplifying stock management. Cell lysates can also be
prepared, following published protocols from various cell types
such as bacteria, wheat germ, or others (11, 20). We describe below
an optimized procedure for expressing the membrane protein
MscL, using the commercial Roche/5Prime continuous exchange
vessel and kits, with an Escherichia coli extract. Since, for each sample
preparation protocol, it is necessary to check for protein integrity,
we also describe several tests that should be performed to assess the
quality of the sample.
2. Materials
2.1. Cell-Free Expression of the Mechanosensitive Channel MscL in Detergent Micelles
All solutions are prepared with autoclaved nuclease-free Milli-Q
water (see Note 1).
1. Thermoregulated shaker for the cell-free reaction vessel.
2. Nuclease-free 50-mL tubes.
3. Nuclease-free 1.5-mL tubes.
4. Nuclease-free glass pipettes.
5. Nuclease-free pipette tips (0–10, 10–200, 200–1,000 μL, and
1–5 mL), autoclaved at 121°C for 20 min.
6. pIVEX-2.3-mscL plasmid (3). Encodes the E. coli MscL
C-terminally fused to a His6 tag under the control of the T7
promoter. Dissolve in pure nuclease-free water at 0.5 μg/μL and
store in aliquots at −20°C.
7. RTS 9000 cell-free expression kit (Roche/5Prime). Contains
lyophilized E. coli lysate, reaction mix, feeding mix, reconstitu-
tion buffer, continuous exchange reaction vessel, and a syringe
(see Fig. 1, Notes 2–5). Store at −80°C.
Fig. 1. Roche/5Prime RTS 9000 reaction vessel. The reaction compartment (10 mL) and the
feeding compartment (100 mL) are each accessible through two screws. The reaction
compartment contains the cell lysate, plasmid DNA, detergent or preformed liposomes and
is where the coupled transcription/translation reaction takes place. The separate feeding
compartment provides additional ions, energy substrates, nucleotides and amino acids,
through a semipermeable membrane (MW cutoff 10 kDa). Simultaneously, by-products
that may inhibit the reaction are diluted through the same membrane into the feeding
compartment. This continuous exchange allows cell-free expression to last for up to 24 h.
8. Unlabeled amino acids (powder), store at −20°C.
9. 13C/15N-labeled amino acids (powder), store at −20°C.
10. Dithiothreitol (DTT): Prepare 40 mM stock, store at −20°C.
11. 20% (w/v) Triton X-100: Prepare 200 mL, stir until homoge-
neous, store at 4°C for up to 6 months.
12. Unlabeled amino acid solutions: Prepare 168 mM solutions of
each amino acid except for leucine (140 mM) and the labeled
amino acids, Ile and Thr, with the appropriate solution (see
Note 6). The amount of amino acid to be incorporated is
calculated in Subheading 3.1 and is indicated in Table 1. Transfer
the appropriate volume of each unlabeled amino acid to a 1.5-mL
vial, and sonicate until the solution looks clear. Transfer the indi-
vidual solutions to a 50-mL tube. Add Tyr and Leu last to avoid
precipitation; prepare fresh before using and keep on ice.
13. Labeled amino acid solutions, Ile and Thr: Dissolve 23 mg of
Ile and 3.9 mg of Thr in 1.0 and 0.20 mL of reconstitution
buffer, respectively (~168 mM each), into 1.5-mL tubes; pre-
pare fresh before using and keep on ice (see Subheading 3.1,
Table 1 and Note 7).
14. FPLC purification system.
15. 1 M NaOH: To adjust pH, store at room temperature.
16. 100 mM NiSO4: Store at room temperature.
17. 20% Ethanol: Store at room temperature for up to a month.
18. 4-(2-Aminoethyl) benzene sulfonyl fluoride hydrochloride
(AEBSF): 100 mM stock solution in water, store at 4°C for up
to 6 months.
Table 1
Preparation of amino acid solutions for MscL cell-free expression
in a volume V = 110 mL (see Subheading 3.1 and Fig. 1)

Amino acid   MWi (g/mol)   n    Amount (mg)   Volume (mL)
Ala (A)      89            15   15            0.98
Arg (R)      174           6    11            0.39
Asn (N)      132           6    8.7           0.39
Asp (D)      133           7    10            0.46
Cys (C)      121           0    0             0.10
Gln (Q)      146           4    6.4           0.26
Glu (E)      147           8    13            0.52
Gly (G)      75            13   11            0.85
His (H)      155           7    12            0.46
Ile (I)      131           16   23            1.0
Leu (L)      131           13   19            1.0
Lys (K)      146           9    14            0.59
Met (M)      149           5    8.2           0.33
Phe (F)      165           10   18            0.65
Pro (P)      115           7    8.9           0.46
Ser (S)      105           4    4.6           0.26
Thr (T)      119           3    3.9           0.20
Trp (W)      204           0    0             0.10
Tyr (Y)      181           1    2.0           0.10
Val (V)      117           13   17            0.85

n is the number of each amino acid type, of molecular weight MWi, in the
protein sequence. The weight of each amino acid is (n × MWi × 1.1 × 10^−2),
expressed in mg. Since each amino acid is solubilized at 168 mM, except for
leucine, at 140 mM, the corresponding volume of each amino acid solution is
(n × 11)/168, except for leucine where it is (n × 11)/140, expressed in mL.
19. Chelating column (5 mL): Stored in 20% ethanol at 4°C.

20. FPLC buffer A1: 50 mM NaH2PO4–NaOH (7.1 g/L), pH 8,
300 mM NaCl (17.5 g/L), 10 mM imidazole (680 mg/L),
4.0% Triton X-100 (40 g/L), store at 4°C.
21. FPLC buffer A2: 50 mM NaH2PO4–NaOH (7.1 g/L), pH 8,
300 mM NaCl (17.5 g/L), 10 mM imidazole (680 mg/L),
0.2% Triton X-100 (2 g/L), store at 4°C.
22. FPLC buffer B: 50 mM NaH2PO4–NaOH (7.1 g/L), pH 8,
300 mM NaCl (17.5 g/L), 500 mM imidazole (34 g/L), 0.2%
Triton X-100 (2 g/L), store at 4°C.
23. FPLC sample loading buffer: 50 mM NaH2PO4–NaOH
(7.1 g/L), pH 8, 300 mM NaCl (17.5 g/L), 10 mM imida-
zole (680 mg/L), 1% Triton X-100 (10 g/L), store at 4°C.
24. 10× Dialysis buffer: 0.1 M HEPES–KOH, pH 7.5, 1 M KCl,
and 2% (w/v) Triton X-100, store at 4°C for up to
6 months.
25. 1× Dialysis buffer: Prepare 2 L by diluting 200 mL of 10× buf-
fer in 1,750 mL of water, adjusting the pH to 7.5 using 4 M
KOH, and adding water to 2 L final volume. Store at 4°C for
up to 6 months.
26. 2 dialysis cassettes with a 10 kDa MW cutoff.
27. Methanol.
28. DOPC: 1,2-dioleoyl-sn-glycero-3-phosphocholine powder,
store at −20°C.
29. 4 M KOH: To adjust pH, store at room temperature.
30. 0.8% Triton X-100: Prepare 10 mL, dissolve and stir until
homogeneous, store at 4°C for up to 6 months.
31. Wet polystyrene beads: Wash 5 g of 300–1,200 μm polystyrene
beads with 25 mL of pure methanol and then wash four times
with 25 mL of Milli-Q water. The beads are kept in water at
4°C and are stable for months, provided that the Milli-Q water
is renewed weekly.
32. HEPES–KCl buffer: 10 mM HEPES–KOH, pH 7.5, 100 mM
KCl, store at 4°C for up to 6 months.
33. HEPES solutions for solubilizing Tyr, Trp, and Phe
(Subheading 2.1, see Notes 6 and 7): 60 mM HEPES, pH 13
for Tyr, pH 1 for Trp, and pH 7.5 for Phe. Adjust the pH
using KOH or HCl. Keep on ice.
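The mass concentrations quoted in parentheses for NaCl and imidazole in the FPLC buffers above follow from molar concentration multiplied by molecular weight. The sketch below illustrates that conversion. The molecular weights (58.44 g/mol for NaCl, 68.08 g/mol for imidazole) are standard values assumed here rather than taken from the text, and the function name is purely illustrative.

```python
# Molecular weights in g/mol (standard values, assumed; not stated in the protocol).
MOLECULAR_WEIGHT = {"NaCl": 58.44, "imidazole": 68.08}


def grams_per_liter(compound: str, millimolar: float) -> float:
    """Convert a molar concentration (mM) into a mass concentration (g/L)."""
    return millimolar * 1e-3 * MOLECULAR_WEIGHT[compound]


if __name__ == "__main__":
    # Concentrations used in FPLC buffers A1, A2 and B.
    for compound, mm in [("NaCl", 300), ("imidazole", 10), ("imidazole", 500)]:
        print(f"{mm} mM {compound}: {grams_per_liter(compound, mm):.2f} g/L")
```

Rounded to the precision used in the list, these values agree with the 17.5 g/L, 680 mg/L, and 34 g/L given for the buffers.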
2.2. Cell-Free Expression of the Mechanosensitive Channel MscL in Liposomes
All solutions are prepared with autoclaved nuclease-free Milli-Q
water (see Note 1).
1. Thermoregulated shaker for the cell-free reaction vessel.
2. Mini-Extruder with two 1-mL gastight microsyringes and
0.1-μm cutoff polycarbonate membranes.
3. Nuclease-free 50-mL tubes.
4. Nuclease-free 1.5-mL tubes.
5. Nuclease-free glass pipettes.
6. Nuclease-free pipette tips (0–10, 10–200, 200–1000 µL and
1–5 mL). Autoclave at 121°C for 20 min.
7. pIVEX-2.3-mscL plasmid: For protein expression in liposomes,
an affinity tag such as a polyhistidine stretch is preferable but
not mandatory.
8. RTS 9000 cell-free expression kit (Roche/5Prime). Contains
lyophilized E. coli lysate, reaction mix, feeding mix, reconstitu-
tion buffer, a continuous exchange reaction vessel, and a
syringe (see Fig. 1, Notes 2–5), store at −80°C.
9. Unlabeled amino acids (powder), store at −20°C.
10. 13C/15N-labeled amino acids (powder), store at −20°C.
11. DTT (dithiothreitol): Prepare 40 mM stock, store at −20°C.
12. 4 M KOH: To adjust pH, store at room temperature.
13. AEBSF (Subheading 2.1).
14. DOPC.
15. Chloroform.
16. HEPES–KCl buffer (Subheading 2.1).
17. DOPC liposome preparation: Prepare a chloroform solution of
10 mg/mL of DOPC (100 mg of DOPC in 10 mL of dry
chloroform). Dry the lipids and then resuspend with RTS
reconstitution buffer to obtain a 20 mg/mL aqueous solution.
Sonicate 2.5 mL of the solution for 5 min, and then extrude it
13 times with a 0.1-μm filter, to obtain a liposome solution
containing 40–50 mg of lipids, and store at 4°C. This is the
amount of lipids that is expected in the final sample. DOPC
can be replaced by other lipids or lipid mixtures, at a higher or
lower concentration.
18. HEPES solutions for solubilizing Tyr, Trp, and Phe
(Subheading 2.1).
19. Unlabeled amino acid solutions: Prepare 168 mM solutions of
each amino acid except for leucine (140 mM) and the labeled
amino acids (item 20), with the appropriate solution (see Note 6).
The amount of amino acid to be incorporated is calculated in
Subheading 3.1 and is indicated in Table 1. The appropriate vol-
ume of each unlabeled amino acid is transferred to a 1.5-mL
vial, and sonicated until the solution looks clear. The vials are
transferred into a 50-mL tube. Tyr and Leu are added last, to
avoid precipitation; prepare fresh before using and keep on ice.
20. Labeled amino acid solutions, Arg, Ile, Pro, Met, and Phe:
Weigh 11 mg of Arg, 23 mg of Ile, and 8.9 mg of Pro and dis-
solve in 0.39, 1.0, and 0.46 mL of reconstitution buffer,
respectively (~168 mM). Weigh 8.2 mg of Met and dissolve in
0.33 mL of reconstitution buffer plus 50 μL of 40 mM DTT.
Weigh 18 mg of Phe and dissolve in 0.65 mL of the appropri-
ate HEPES solution (Subheading 2.1), and sonicate for 1 min.
Prepare each amino acid solution fresh before using and keep
on ice (see Subheading 3.1, Table 1 and Note 7).
2.3. Sample Characterization
All solutions described should be prepared using Milli-Q water and
stored at room temperature unless stated otherwise.
1. FA diluted solution: 37% formaldehyde in water, store at 4°C.
2. Dimethyl sulfoxide (DMSO).
3. 1 M NaOH: To adjust pH.
4. 1 M HCl: To adjust pH.
5. 1 M Tris base, pH 8.
6. SDS running buffer: 100 mM Tris base, 100 mM HEPES,
0.1% SDS, adjust to pH 8 using NaOH or HCl.
7. Staining buffer: 0.5% Coomassie Brilliant Blue R-250 dye, 50%
ethanol and 10% acetic acid.
8. Destaining buffer: 20% ethanol and 10% acetic acid.
9. 2× SDS loading buffer: 0.2 M Tris base, pH 8, 8% SDS,
40% glycerol, 0.4% bromophenol blue, 0.4 M DTT, adjust to pH 8
using 1 M NaOH or 1 M HCl.
10. 1× SDS loading buffer: dilute 1 mL of 2× SDS loading buffer
with 1 mL of water.
11. DSS solution: 50 mM disuccinimidyl suberate in DMSO.
12. HEPES–KCl buffer (see Subheading 2.1).
13. Polyvinylidene fluoride (PVDF) membranes.
14. TBS-Tween buffer 1: 50 mM Tris base, pH 7.4, 150 mM
NaCl, 0.05% Tween-20.
15. Monoclonal anti-histidine peroxidase-conjugated antibody solution
(for the dot blot): Dilute 2,000× from the stock solution
with TBS-Tween buffer 1, prepare fresh.
16. Diaminobenzidine solution: Dissolve one tablet of diamin-
obenzidine and one tablet of urea hydrogen peroxide in 5 mL
of water, prepare fresh.
17. Western-blotting detection kit.
18. Nonfat dry milk (powder).
19. Chemiluminescence reagents (H2O2 and luminol), store at 4°C.
20. Towbin buffer: 25 mM Tris base, 192 mM glycine, 20% iso-
propanol, and 0.1% SDS. Store without isopropanol at 4°C,
add isopropanol just before use.
21. Ponceau S solution: 0.1% Ponceau S, 5% acetic acid.
22. TBS-Tween buffer 2: TBS-Tween buffer 1 with 5% (w/v)
nonfat dry milk, freshly made.
23. Mouse anti-histidine antibody solution (for the Western-blot):
Dilute 5,000× from the stock solution with TBS-Tween buffer 1,
prepare fresh.
24. Anti-mouse peroxidase-coupled antibody solution (for the
Western-blot): Dilute 10,000× from the stock solution with
TBS-Tween buffer 1, prepare fresh.
25. Micro BCA Protein Assay kit (Pierce).

27. Bovine serum albumin (BSA): 1 mg/mL stock solution.
28. Solubilization buffer: 0.2% Triton X-100, solubilized in
HEPES–KCl buffer, store at 4°C for up to 6 months.
29. Acetone, stored at −20°C.
30. Nitric acid solution: 1%, store at room temperature in a safety
cabinet. Handle under the hood.
31. Perchloric acid solution: 70%, store at room temperature in a
safety cabinet. Handle under the hood.
32. Ammonium molybdate solution: Dissolve 2.5 g in 100 mL of
water, store at 4°C.
33. Ascorbic acid solution: Dissolve 10 g in 100 mL of water, pre-
pare fresh, and do not keep for more than 1 week at 4°C.
34. Phosphate stock solution: Dissolve 10 mg of KH2PO4 in
100 mL of water and store at 4°C.
35. Nuclease-free 3-mL ultracentrifugation tubes.
36. Sucrose.
37. Sucrose solutions: 10, 20, and 45% (w/v) sucrose, solubilized
in HEPES–KCl buffer, store at 4°C.
38. Nitrocellulose membrane (for Western blotting).
39. X-ray film (to develop the Western blot).
40. Film developer.
2.4. NMR Sample Preparation
Four millimeter diameter rotors for high-resolution magic-angle
spinning solid-state NMR (see Fig. 2).
Fig. 2. Four millimeter diameter rotors for high-resolution magic-angle spinning solid-
state NMR. The sample (approximately 50 μL) is contained in the center of the zirconium
rotor, thanks to the Teflon insert bottom and top, and is kept tightly in place by the top
screw and the Kel-F cap.
3. Methods
The methods outlined below describe the following: (1) calculation

of the amount of incorporated amino acids, (2) cell-free expression
of the mechanosensitive channel MscL in detergent micelles, purifi-
cation of protein/detergent micelles, protein reconstitution into
liposomes, (3) cell-free expression directly into liposomes, (4) sam-
ple characterization, and (5) NMR sample preparation.
3.1. Cell-Free Expression of the Mechanosensitive Channel MscL in Detergent Micelles
The method below describes the production of an MscL sample
for solid-state NMR, using a cell-free protein expression system in
detergent micelles. Although any labeling scheme could be performed,
the sample described below was labeled on isoleucines and
threonines, using 13C/15N-labeled amino acids, and was subsequently
used in solid-state NMR experiments (14).
3.1.1. Day 1: Cell-Free Protein Expression in Detergent Micelles
1. Calculate the amount of amino acids to be incorporated during
protein expression. The concentration of each amino acid in
the protein cell-free expression system (V = 110 mL, see Fig. 1)
is usually ~1 mM. For a given protein, each amino acid
concentration depends on its occurrence, n, in the protein
sequence. In our case, we have obtained good results with a
final concentration of each amino acid equal to n/10, expressed
in mM. With V expressed in mL, the corresponding weight of each
amino acid, of molecular weight MWi (g/mol), is therefore
(n × V × MWi/10) × 10^−3, expressed in mg. The corresponding
volume of each amino acid solution of concentration Ci, where
Ci is 168 or 140 mM (Subheading 2.1), is (n × V/10)/Ci,
expressed in mL. The weights and final volumes calculated for
this protocol are summarized in Table 1.
2. Thaw the DTT, RTS reconstitution buffer, and MscL plasmid
at room temperature.
3. Thaw the other components of the RTS kit (E. coli lysate, reac-
tion mix, and feeding mix) on ice.
4. Prepare the unlabeled amino acid solutions (see Subheading 2.1).
5. Prepare the labeled amino acid solutions (see Subheading 2.1).
6. Add the labeled amino acid solution to the unlabeled amino
acid solution.
7. Add 3 mL of 40 mM DTT, sonicate (80 W for 5 min or until
clear), and store on ice.
8. Reconstitute the lyophilized E. coli lysate in 5.2 mL of recon-
stitution buffer. Shake the bottle gently (see Note 8).
9. Reconstitute the lyophilized reaction mix in 2.2 mL of reconstitution
buffer.
10. Reconstitute the lyophilized feeding mix in 80 mL of reconstitution
buffer. Shake the bottle gently (see Note 8).
11. Prepare the feeding solution by adding 26 mL of the reconsti-
tuted amino acid solutions (mixture of unlabeled and labeled
amino acid solutions) and 3 mL of 40 mM DTT to the feeding
mix (80 mL).
12. Prepare the reaction solution by adding the reconstituted reac-
tion mix (2.2 mL), 2.7 mL of the reconstituted amino acid
solution and 0.3 mL of 40 mM DTT to the reconstituted
E. coli lysate (5.2 mL). Remove a 20 μL aliquot of this solution
for later comparison on gel electrophoresis.
13. Add 300 μL of the MscL plasmid (~150 μg) to the reaction
solution.
14. Add 200 μL of 20% Triton X-100 solution to the reaction solution.
15. Add 2.2 mL of 20% Triton X-100 solution to the feeding
solution.
16. Open both screws of the reaction compartment and fill the
reaction compartment with the reaction solution using a nucle-
ase-free pipette (see Fig. 1). Remove air bubbles by tapping
the vessel lightly.
17. Open both screws of the feeding compartment and fill the
feeding compartment with the feeding solution, using the pro-
vided syringe (see Fig. 1). Remove air bubbles by tapping the
vessel lightly.
18. Insert the reaction vessel into a thermoregulated shaker.
19. Set the shaking speed to 800 rpm.
20. Set the temperature to 30°C.
21. Incubate the reaction for 22 h.
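The amino acid calculation in step 1 above can be sketched in a few lines. This is an illustrative helper, not part of the published protocol: the function name and argument names are assumptions, while the target concentration (n/10 mM), the total volume (110 mL), and the stock concentrations (168 mM, or 140 mM for leucine) are taken from the text.

```python
def amino_acid_amounts(n, mw, volume_ml=110.0, stock_mm=168.0):
    """Return (weight in mg, stock volume in mL) for one amino acid.

    n         : occurrences of the amino acid in the protein sequence
    mw        : molecular weight in g/mol
    volume_ml : total reaction + feeding volume V in mL
    stock_mm  : stock concentration Ci in mM (168, or 140 for leucine)
    """
    final_mm = n / 10.0                      # target concentration in mM
    micromoles = final_mm * volume_ml        # mM x mL = micromol
    weight_mg = micromoles * mw * 1e-3       # micromol x g/mol = microgram, then mg
    stock_volume_ml = micromoles / stock_mm  # micromol / mM = mL
    return weight_mg, stock_volume_ml


if __name__ == "__main__":
    # Isoleucine (n = 16) and leucine (n = 13), both with MWi = 131 g/mol.
    for name, n, stock in [("Ile", 16, 168.0), ("Leu", 13, 140.0)]:
        weight, volume = amino_acid_amounts(n, 131, stock_mm=stock)
        print(f"{name}: {weight:.1f} mg in {volume:.2f} mL")
```

Rounded to two significant figures, the results for Ile and Leu agree with the corresponding rows of Table 1 (23 mg in 1.0 mL and 19 mg in 1.0 mL).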
3.1.2. Day 2: Purification of Detergent/Protein Micelles
1. After 22 h, stop the cell-free protein expression. Remove a 20 μL
aliquot of the reaction run-on solution for gel electrophoresis.
2. Remove the reaction run-on mix using a pipette, add 100 μL
of the AEBSF solution to prevent protease digestion, and stir
at room temperature for 15 min (see Note 9).
3. Dilute the sample to a final volume of 50 mL using FPLC
buffer A1 and centrifuge at 10,000 × g for 15 min at 4°C.
4. Connect the chelating column to the FPLC system (see Note 10).
Pass 4–5 column volumes (CV) of distilled water through
the column, to wash away the ethanol solution in which it has
been kept.
5. Degas the FPLC buffers for 10 min in an ultrasonic bath. Wash
the pump inlets with the degassed solutions. Wash the system
with FPLC buffer A1 until the UV and conductivity baselines
are stable.
6. At the end of the sample centrifugation (step 3), dilute the
supernatant (reaction run-on mix) with FPLC sample loading
buffer to a final volume of 100 mL (final dilution ~10×).
7. Filter this solution through a 0.45-μm filter.
8. Resuspend the pellet with 5 mL of FPLC buffer A1 and remove
a 20 μL aliquot for gel electrophoresis.
9. Introduce half (50 mL) of the diluted reaction run-on mix into
the column (see Note 11). Collect the flow-through in the first
50-mL tube.
10. Wash off the nonspecifically bound material by raising the con-
centration of imidazole to 100 mM, by mixing 80% of FPLC
buffer A1 with 20% of FPLC buffer B (6 CV total). The con-
centration of detergent remains high (around 3%). Collect the
flow-through in the second 50-mL tube.
11. To elute the protein, switch FPLC buffer A1 to buffer A2, to
reduce the detergent concentration to 0.2%. Collect the flow-
through in the second 50-mL tube.
12. Increase the imidazole concentration to 200 mM by mixing
60% of solution A2 with 40% of solution B (2 CV total), and
collect eluant fractions of 2 mL each (see Note 12).
13. Increase the imidazole concentration to 500 mM by switching
to 100% FPLC buffer B (6 CV total) and collect eluant frac-
tions of 2 mL each.
14. Wash the column with FPLC buffer A1 (10 CV or until the
UV and conductivity baselines are stable), then repeat steps
9–13 with the remaining 50 mL (second half of the diluted
reaction run-on mix, see Note 11).
15. At the end of the FPLC protein purification run (see Note 10),
the potentially interesting fractions are separated into approximately
40 tubes, whereas the flow-through is contained in two
50-mL tubes. Remove a 20 μL aliquot from each of these 42
tubes, for gel electrophoresis.
16. Determine which fractions contain the histidine-tagged pro-
tein by performing a dot blot (see Subheading 3.3.4).
17. Wash the two dialysis cassettes in 1× dialysis buffer.
18. For each half of the solution, pool the fractions containing the
protein into a dialysis cassette.
19. Dialyze against 1 L of 1× dialysis buffer, overnight at 4°C.
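The wash and elution conditions in steps 10–13 above are obtained by blending two buffers on the FPLC system. A minimal sketch of the resulting concentrations is given below, assuming ideal linear mixing by volume and the nominal buffer compositions listed in Subheading 2.1; the dictionary layout and function name are illustrative only.

```python
# Nominal compositions from Subheading 2.1 (imidazole in mM, Triton X-100 in %).
BUFFERS = {
    "A1": {"imidazole_mM": 10.0, "triton_pct": 4.0},
    "A2": {"imidazole_mM": 10.0, "triton_pct": 0.2},
    "B": {"imidazole_mM": 500.0, "triton_pct": 0.2},
}


def mix(buffer_a, buffer_b, fraction_b):
    """Linear blend of two buffers; fraction_b is the volume fraction of buffer_b."""
    a, b = BUFFERS[buffer_a], BUFFERS[buffer_b]
    return {key: (1.0 - fraction_b) * a[key] + fraction_b * b[key] for key in a}


if __name__ == "__main__":
    print("wash    (80% A1 + 20% B):", mix("A1", "B", 0.2))
    print("elution (60% A2 + 40% B):", mix("A2", "B", 0.4))
    print("elution (100% B):        ", mix("A2", "B", 1.0))
```

Under these assumptions the blends come out slightly above the round figures quoted in the steps (about 108 mM imidazole and 3.2% detergent for the wash, about 206 mM imidazole for the first elution), because buffers A1 and A2 themselves contain 10 mM imidazole.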
3.1.3. Day 3: Sample Characterization and Reconstitution into Liposomes
1. Dialyze a second time against 1 L of 1× dialysis buffer for 2 h
at 4°C. Remove a 20 μL aliquot for gel electrophoresis.
2. Assess the cell-free expression level, FPLC purification efficiency,
and oligomeric state by SDS-PAGE (see Subheadings
3.3.1 and 3.3.3).
3. After SDS-PAGE, wash the gel in water and perform a Western-blot
transfer (see Subheading 3.3.5).
4. Estimate the protein concentration. Since MscL does not contain
any tryptophan, it has to be quantified by using the BCA
method (see Subheading 3.3.6 and Note 13).
5. BCA quantification of our MscL preparation indicated that each
half of the solution contains approximately 4.8 mg of protein in
a 10 mL volume that will be reconstituted in 19.2 mg of DOPC
liposomes (protein–lipid ratio of 1:4). DOPC can be replaced by
other lipids or lipid mixtures, at a higher or lower concentration
(21). Lipids can also be replaced by other mixtures to reconsti-
tute the protein into bicelles or nanodiscs (22, 23).
6. Split the solutions into four fractions of 5 mL, each containing
2.4 mg of protein and 10 mg of Triton X-100. To each frac-
tion, add 9.6 mg of DOPC solubilized in 2.5 mL of the 0.8%
Triton X-100 solution, such that the final lipid–detergent ratio
is 1:3. Stir slowly for 15 min at room temperature.
7. To each fraction of 7.5 mL add 300 mg of wet polystyrene
beads (detergent–bead ratio is 1:10) and put the four tubes on
a rotator, to remove the detergent by adsorption onto the
beads. Incubate overnight at 4°C. For alternative methods
(21, 24), see Note 14.
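The weight ratios quoted in steps 5–7 above (protein–lipid 1:4, lipid–detergent about 1:3, detergent–bead 1:10) can be checked with a short calculation for one 5-mL fraction. This is only a sketch: the function and key names are illustrative, and the Triton X-100 percentages are treated as w/v, consistent with the 10 mg of detergent stated for 5 mL of 0.2% solution.

```python
def reconstitution_mix(protein_mg=2.4, fraction_ml=5.0, fraction_triton_pct=0.2,
                       lipid_mg=9.6, lipid_solution_ml=2.5, lipid_triton_pct=0.8,
                       beads_mg=300.0):
    """Masses and weight ratios for one reconstitution fraction.

    A w/v percentage p in a volume v (mL) corresponds to p * 10 * v mg.
    """
    detergent_mg = 10.0 * (fraction_triton_pct * fraction_ml
                           + lipid_triton_pct * lipid_solution_ml)
    return {
        "detergent_mg": detergent_mg,
        "lipid_per_protein": lipid_mg / protein_mg,
        "detergent_per_lipid": detergent_mg / lipid_mg,
        "beads_per_detergent": beads_mg / detergent_mg,
        "total_volume_ml": fraction_ml + lipid_solution_ml,
    }


if __name__ == "__main__":
    for key, value in reconstitution_mix().items():
        print(f"{key}: {value:.2f}")
```

With the default values from the protocol, each fraction contains 30 mg of detergent in 7.5 mL, which gives the ratios stated in the steps. Changing the lipid amount or detergent concentration for another protein shows directly how much bead material the same 1:10 ratio would require.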
3.1.4. Day 4: Preparation of the Sample for Solid-State NMR
1. Add another 300 mg of wet polystyrene beads to each tube.
Incubate for another 2 h, at 4°C.
2. After 2 h, set the tubes vertically for 5 min and discard the
pellet and the beads.
3. Divide the supernatant into four tubes and centrifuge at
100,000 × g for 30 min at 4°C. Remove a 20 μL aliquot of one
supernatant for gel electrophoresis, discard the rest and resus-
pend the pellets with 1 mL of HEPES–KCl buffer for each
pellet. Split the new resuspended pellets into two tubes and
centrifuge at 100,000 × g for 30 min at 4°C. Repeat one more
time so that the entire sample fits into a single tube. Remove a
final 20 μL aliquot of the supernatant for gel electrophoresis.
Dry the final pellet under argon and store it at −20°C.
4. Assess the reconstitution by SDS-PAGE (see Subheading 3.3.1),
and by proteoliposome density characterization on a sucrose gra-
dient (see Subheading 3.3.9). Also, estimate the lipid and water
content (see Subheadings 3.3.8 and 3.3.10, and Note 15).
5. The average quantity of protein obtained with the RTS 9000
kit is 10 mg in 10 mL of reaction mix, which is generally suf-
ficient to prepare two or three NMR samples. Transfer ~30 mg
of pelleted sample to the NMR rotor (see Subheading 3.4).
6. Store the remaining pellet at −20°C.
3.2. Cell-Free Expression of the Mechanosensitive Channel MscL in Liposomes
The method below describes the production of an MscL sample
for solid-state NMR, using a cell-free protein expression system
directly into lipid vesicles. Although any labeling scheme could be
performed, the sample described below was labeled on arginines,
isoleucines, methionines, phenylalanines and prolines, using
13C/15N-labeled amino acids, and was subsequently used in solid-state
NMR experiments (19). Lipids used were synthetic DOPC,
but other lipids or complex mixtures can also be used, such as
asolectin (18) or nanodiscs (17).
3.2.1. Day 1: Cell-Free Protein Expression
1. Thaw the DTT solution, RTS reconstitution buffer and MscL
plasmid at room temperature.
2. Thaw the other components of the RTS kit (E. coli lysate, reac-
tion mix, and feeding mix) on ice.
3. Prepare the DOPC liposomes (Subheading 2.2).
4. Prepare the HEPES solutions (Subheading 2.1).
5. Prepare the unlabeled amino acid solutions (Subheading 2.2).
6. Prepare the labeled amino acid solutions (Subheading 2.2).
7. Add the labeled amino acid solutions to the unlabeled amino
acid solutions.
8. Add 3 mL of 40 mM DTT, sonicate (80 W for 5 min or until
clear), and store on ice.
9. Reconstitute the lyophilized E. coli lysate in 2.7 mL of reconstitution buffer.
10. Add the liposome solution (2.5 mL) to the E. coli lysate once
the latter is completely dissolved.
11. Reconstitute the lyophilized reaction mix in 2.2 mL of reconstitution buffer.
12. Reconstitute the lyophilized feeding mix in 80 mL of reconsti-
tution buffer. Shake the bottle gently (see Note 8).
13. Prepare the feeding solution by adding 26 mL of the reconsti-
tuted amino acid solution (mixture of unlabeled and labeled
amino acid solutions) and 3 mL of 40 mM DTT to the feeding
mix solution (80 mL).
14. Prepare the reaction solution by adding the reconstituted reac-
tion mix (2.2 mL), 2.7 mL of the reconstituted amino acid solu-
tion and 0.3 mL of 40 mM DTT to the reconstituted E. coli
lysate with the liposomes (5.2 mL). Take a 20 μL aliquot of this
reaction solution for later comparison on gel electrophoresis.
15. Add 300 μL of the MscL plasmid (~150 μg) to the reaction
solution.
16. Open both screws of the reaction compartment and fill the
reaction compartment with the reaction solution using a nucle-
ase-free pipette (see Fig. 1). Remove any air bubbles by tapping
the vessel lightly.
17. Open both screws of the feeding compartment and fill the
feeding compartment with the feeding solution, using the
provided syringe (see Fig. 1). Remove any air bubbles by
tapping the vessel lightly.
18. Insert the reaction vessel into a thermoregulated shaker.
19. Set the shaking speed to 800 rpm.
20. Set the temperature to 30°C.
21. Incubate the reaction for 22 h.
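The volumes quoted in steps 9-15 can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid, assuming the volumes stated in the steps above; the function names are illustrative and do not come from the RTS kit documentation.

```python
# Bookkeeping sketch for the Day 1 solutions (all volumes in mL).
# Function names are hypothetical; the numbers come from steps 9-15.

def reaction_volumes(lysate=2.7, liposomes=2.5, reaction_mix=2.2,
                     amino_acids=2.7, dtt=0.3, plasmid=0.3):
    """Return (lysate + liposomes, reaction solution, reaction solution + plasmid)."""
    lysate_with_liposomes = lysate + liposomes
    reaction_solution = lysate_with_liposomes + reaction_mix + amino_acids + dtt
    return lysate_with_liposomes, reaction_solution, reaction_solution + plasmid


def feeding_volume(feeding_mix=80.0, amino_acids=26.0, dtt=3.0):
    """Return the total volume of the feeding solution."""
    return feeding_mix + amino_acids + dtt


def plasmid_concentration(plasmid_ug=150.0, final_volume_ml=10.7):
    """Approximate plasmid concentration in the reaction, in ug/mL."""
    return plasmid_ug / final_volume_ml


if __name__ == "__main__":
    with_liposomes, reaction, with_plasmid = reaction_volumes()
    print(f"lysate + liposomes: {with_liposomes:.1f} mL")
    print(f"reaction solution:  {reaction:.1f} mL (+ plasmid: {with_plasmid:.1f} mL)")
    print(f"feeding solution:   {feeding_volume():.1f} mL")
    print(f"plasmid:            {plasmid_concentration():.1f} ug/mL")
```

The first value reproduces the 5.2 mL of lysate with liposomes quoted in step 14, which is a quick way to confirm that no component has been left out before filling the vessel.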
3.2.2. Day 2: Preparation of the Sample for Solid-State NMR
1. After 22 h, stop the cell-free protein expression. Remove a
20 μL aliquot of the reaction run-on solution for gel
electrophoresis.
2. Remove the reaction run-on mix using a pipette, add 100 μL
of the AEBSF solution to prevent protease digestion, and stir
at room temperature for 15 min (see Note 9).
3. Split the sample into six tubes and centrifuge at 100,000 × g for
30 min at 4°C. Remove a 20 μL aliquot of one supernatant for
gel electrophoresis, discard the rest and resuspend each pellet
with 1 mL of HEPES–KCl buffer. Split the resuspended pellets
into three tubes and centrifuge at 100,000 × g for 30 min at
4°C. Repeat one more time so that the entire sample fits into a
single tube. Remove a final 20 μL aliquot of the supernatant
for gel electrophoresis. Dry the final pellet under argon and
store it at −20°C.
4. Assess cell-free expression efficiency and oligomeric state by
SDS-PAGE (see Subheadings 3.3.2 and 3.3.3). At this point,
the expressed protein is generally almost pure.
5. If the protein was expressed with a polyhistidine tag, after SDS-
PAGE, wash the gel in water and perform a Western-blot
transfer (see Subheading 3.3.5).
6. Estimate the protein and lipid concentration using the BCA
and Rouser methods respectively (see Subheadings 3.3.7 and
3.3.8, and Note 13).
7. Characterize the proteoliposome density on a sucrose gradient
(see Subheading 3.3.9), and estimate the water content (see
Subheading 3.3.10 and Note 15).
8. The average quantity of protein obtained with the RTS 9000
kit is 10 mg in 10 mL of reaction mix, which is generally suf-
ficient to prepare two or three NMR samples. About 30 mg of
pelleted sample is transferred into the NMR rotor (see
Subheading 3.4).
9. Store the remaining pellet at −20°C.
3.3. Sample Characterization
3.3.1. Electrophoresis of Protein Expressed in Detergent Micelles
1. Perform SDS-PAGE on the following samples:
(a) Aliquots from the reaction mix before and after the cell-free
expression reaction, to check for expression of the target
protein.
(b) Aliquots from the centrifugation pellet and supernatant,
to check whether the protein was totally solubilized in
detergent micelles or if there are some aggregates.
(c) Aliquots from the purification flow-through and the
purified fractions.
(d) Aliquots of the protein-detergent micelles and of the
proteoliposomes after the reconstitution step, to analyze the
incorporation of the protein.
(e) Aliquots of the supernatant of all the wash steps.
(f) Aliquots of the purified protein after dialysis, to check
its integrity.
2. Dilute all SDS-PAGE samples 1:1 with 2× SDS loading
buffer.
3. Heat the samples to 90°C for 5 min.
4. Cool the samples and load on a gel.
5. Run the gel using SDS running buffer.
6. Wash the gel in water and stain/destain the gel using staining/
destaining buffers, respectively.
3.3.2. Electrophoresis of Protein Expressed in Liposomes
1. Perform SDS-PAGE on the following samples:
(a) Aliquots from the reaction mix before and after the cell-free
expression reaction, to check for expression of the target
protein.
(b) Aliquots of the centrifugation supernatants.
2. Follow steps 2–6 in Subheading 3.3.1.
3.3.3. Electrophoresis for Oligomeric State Characterization
If the protein is an oligomer (like MscL), assess its integrity by
cross-linking the protein with different reagents that target free
primary amines, like formaldehyde (FA) or disuccinimidyl suberate
(DSS) (3, 25). NB: the protein buffer must not contain primary
amines (such as in Tris).
1. Mix 5 μg of purified MscL, either in detergent or liposomes, in
10 μL of HEPES–KCl buffer and 0.54 μL of the FA diluted
solution.
2. Mix 5 μg of purified MscL, either in detergent or liposomes, in
10 μL of HEPES–KCl buffer and 0.2 μL of 50 mM DSS.
3. Incubate both samples, without shaking, for 30 min at room
temperature.
4. Stop the cross-linking reactions by adding 2 μL of 1 M Tris
base, pH 8, to each sample. Gently mix and incubate for 10 min
at room temperature.
5. Add 3 μL of 2× SDS loading buffer to each sample and mix.
6. Heat the FA sample to 60°C and the DSS sample to 90°C, for
5 min in a water bath.
7. Cool the samples and load on a gel.
8. After migration, wash the gel in water and stain/destain the
gel using staining/destaining buffers, respectively (see Fig. 3)
or perform a Western-blot transfer (see Subheading 3.3.5).
Fig. 3. Coomassie blue stained SDS-PAGE of MscL: (1) purified in detergent, (2) cross-linked
with formaldehyde, (3) cross-linked with disuccinimidyl suberate. Formaldehyde
generates five protein bands on the gel, corresponding to various oligomeric forms, from
monomers to pentamers, while disuccinimidyl suberate generates mostly pentamers,
confirming the pentameric nature of Escherichia coli MscL (3, 25).
3.3.4. Dot Blot
1. Deposit 2 μL of each fraction of purified protein directly onto
a PVDF membrane.
2. Allow the protein solution to air-dry.
3. Prepare the TBS-Tween buffers 1 and 2.
4. Prepare the monoclonal antibody solution.
5. When the protein spots are completely dry, incubate the mem-
brane in the TBS-Tween buffer 2 for 30 min.
6. Wash the membrane twice for 5 min with the TBS-Tween
buffer 1.
7. Incubate the membrane with the monoclonal antibody
solution for 1 h.
8. Wash the membrane twice for 5 min with the TBS-Tween
buffer 1.
9. Prepare the diaminobenzidine solution.
10. Dry the membrane on a tissue.
11. Incubate the membrane with the diaminobenzidine solution
for a few seconds, until the spots are colored.
12. Dry the membrane on a tissue. The color remains on the spots
containing the histidine-tagged protein, while it vanishes on
the other spots.
3.3.5. Semidry Western-Blot
1. Prepare the Towbin buffer.
2. Prepare the transfer sandwich by superimposing two filter
papers, the gel, the nitrocellulose membrane and another two
filter papers in the Towbin buffer.
3. Place the transfer sandwich on the anode plate and clip the
cathode plate on top.
4. Clamp the potential at 25 V and blot for 20 min.
5. Prepare TBS-Tween buffers 1 and 2, mouse anti-histidine
antibody, and anti-mouse peroxidase-coupled antibody
solutions.
6. Assess the sample transfer and relative concentration by stain-
ing the membrane with Ponceau S solution. This is a revers-
ible stain that can be removed by washing in TBS-Tween
buffer 1.
7. After protein transfer and destaining, saturate the nitrocellu-
lose membrane by incubating in TBS-Tween buffer 2 for
30 min.
8. Rinse the membrane quickly with TBS-Tween buffer 1.
9. Incubate for 45 min at room temperature with the mouse
anti-histidine antibody solution.
10. Wash the membrane for 4 × 5 min with TBS-Tween buffer 1.
11. Incubate for 30 min at room temperature with the anti-mouse
peroxidase-coupled antibody solution.
12. Wash the membrane for 5 × 5 min with TBS-Tween buffer 1.
13. Incubate the membrane for 1 min with 1 mL of each of the
chemiluminescence reagents (H2O2 and luminol).
14. In a light tight box, expose X-ray films to the membrane for
varying times, usually between 30 s and 2 min, and develop the
films in the photo developer.
3.3.6. Protein Quantification in Detergents
Protein concentration is measured using the BCA method (26, 27),
following the test tube procedure of the Micro BCA Protein Assay Kit:
1. Prepare eight BSA standards by diluting the stock solution
with the solubilization buffer (the final BSA concentration
should fall between 0.5 and 20 μg/mL). Make three replicates
of each dilution. Introduce 500 μL of each into 1.5-mL tubes
(24 tubes in total) and store at room temperature.
2. Dilute the unknown protein sample with the solubilization
buffer to three different concentrations that are expected to be
between 1 and 20 μg/mL. Make three replicates of each dilu-
tion. Transfer 500 μL of each sample into 1.5-mL tubes (nine
tubes in total).
3. Add 500 μL of Micro BCA Working Reagent to each of the
1.5-mL tubes (24 BSA standards and nine unknown proteins)
and incubate for 60 min at 60°C. Cool to room temperature.
4. Measure the absorbance at 562 nm.
5. Plot a standard curve based on the absorbance of the BSA
samples.
6. Deduce the protein concentration of each unknown protein
sample.
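Steps 5 and 6 can be carried out in a spreadsheet or with a short script. The sketch below is one possible implementation, assuming that the standard curve is close to linear over the range used (the kit instructions may recommend another fit); the function names and the example readings are invented for illustration only.

```python
# Minimal sketch of a BCA standard curve (steps 5 and 6).
# Assumption: absorbance at 562 nm is linear in BSA concentration over
# the range of the standards. All names below are hypothetical.

def mean(values):
    return sum(values) / len(values)


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def concentration(a562, slope, intercept, dilution_factor=1.0):
    """Concentration of the undiluted sample, in the units of the standards."""
    return (a562 - intercept) / slope * dilution_factor


if __name__ == "__main__":
    # Invented readings: BSA concentration (ug/mL) -> three replicate absorbances.
    standards = {
        0.5: [0.070, 0.071, 0.069],
        5.0: [0.250, 0.252, 0.248],
        20.0: [0.850, 0.848, 0.852],
    }
    conc = list(standards)
    absorbance = [mean(reps) for reps in standards.values()]
    slope, intercept = fit_line(conc, absorbance)
    # Unknown sample diluted 100-fold, mean of its three replicates.
    unknown = mean([0.450, 0.452, 0.448])
    print(concentration(unknown, slope, intercept, dilution_factor=100.0))
```

Averaging the three replicates before fitting keeps the example short; fitting all individual readings is an equally reasonable choice.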
3.3.7. Protein Quantification in Proteoliposomes
Lipids need to be removed from the sample by precipitating the
proteins using cold acetone, followed by a centrifugation to
separate the protein pellet from the supernatant containing the lipids:
1. Prepare the 24 BSA standard tubes (Subheading 3.3.6, step 1).
2. Dilute the unknown protein sample with the solubilization
buffer to three different concentrations that are expected to be
between 1 and 20 μg/mL. Make three replicates of each dilu-
tion. Transfer 500 μL of each sample into 1.5-mL tubes (nine
tubes in total).
3. Add 1 mL of cold acetone to each tube of unknown protein.
Vortex the tubes and incubate for 60 min at −20°C. Centrifuge
at 10,000 × g for 10 min at room temperature. Discard the
supernatant. Incubate the tubes for 30 min at room tempera-
ture to allow for acetone evaporation. Add 500 μL of the solu-
bilization buffer, and vortex the tubes again.
4. Add 500 μL of Micro BCA Working Reagent to each of the
1.5-mL tubes (24 BSA standards and nine unknown proteins)
and incubate for 60 min at 60°C. Cool to room temperature.
5. Measure the absorbance at 562 nm.
6. Plot a standard curve based on the absorbance of the BSA
samples.
7. Deduce the protein concentration of each unknown protein
sample.
3.3.8. Lipid Quantification in Proteoliposomes
The phospholipid content in the proteoliposome sample is assessed
by the Rouser method, which measures the phosphate concentration
in the sample (28). Full eye, face, and skin protection is
required for this method.
1. Set a heating block at 180°C.
2. Wash the glass tubes in nitric acid solution before use and dry
them in an oven.
3. Prepare five phosphate standard samples by diluting the phos-
phate stock solution into five samples containing 1–5 μg of
KH2PO4 per tube. Classically, 5 μg of KH2PO4 give an absor-
bance of 0.9 at 800 nm. Store at 4°C.
4. Prepare the ascorbic acid solution (Subheading 2.3).
5. Collect three proteoliposome volumes containing approximately
1–5 μg of lipids. Transfer the samples into clean glass
tubes.
6. Add 0.65 mL of the perchloric acid solution and place the
tubes in the heated block at 180°C for 30 min of digestion,
until the yellow color has disappeared.
7. Add 0.65 mL of the perchloric acid solution per phosphate
standard tube (digestion is not necessary).
8. Put all the tubes on ice, and set the heated block at 100°C
(alternatively, a boiling water bath can be used).
9. Once cooled, add to the tubes the following: 3.3 mL of water,
0.5 mL of the ammonium molybdate solution, and 0.5 mL of the
ascorbic acid solution. Agitate on a vortex after each addition.
10. Put the tubes in the heated block at 100°C for 5 min.
11. Put the tubes on ice for 5 min.
12. Read the absorbance of the cooled samples at 800 nm.
13. Plot a standard curve based on the absorbance of the phos-
phate standard samples. The exact concentration in the phos-
pholipid samples is calculated based on the standard curve.
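The standard curve gives the phosphate content of each sample in KH2PO4 equivalents; turning that into a lipid mass requires a molar conversion. The sketch below assumes one phosphate group per lipid, DOPC as the only phospholipid, and molar masses of 136.09 g/mol (KH2PO4) and 786.11 g/mol (DOPC). The default slope simply reuses the rule of thumb quoted in step 3 (5 μg of KH2PO4 for an absorbance of 0.9) and should be replaced by the slope of the measured standard curve. All function names are illustrative.

```python
# Sketch: from absorbance at 800 nm to phospholipid mass (Rouser method).
# Assumptions: one phosphate per lipid, DOPC is the only phospholipid.

KH2PO4_MW = 136.09  # g/mol
DOPC_MW = 786.11    # g/mol


def kh2po4_equivalent_ug(a800, slope_per_ug=0.9 / 5.0, intercept=0.0):
    """Read a KH2PO4-equivalent mass (ug) off a linear standard curve."""
    return (a800 - intercept) / slope_per_ug


def phosphate_nmol(kh2po4_ug):
    """Convert a KH2PO4 mass (ug) to nmol of phosphate."""
    return kh2po4_ug / KH2PO4_MW * 1000.0


def dopc_ug(phosphate_in_nmol):
    """Convert nmol of phosphate to ug of DOPC (1 phosphate per lipid)."""
    return phosphate_in_nmol * DOPC_MW / 1000.0


if __name__ == "__main__":
    equivalent = kh2po4_equivalent_ug(0.36)  # invented reading
    nmol = phosphate_nmol(equivalent)
    print(f"{equivalent:.2f} ug KH2PO4 eq. = {nmol:.2f} nmol P = {dopc_ug(nmol):.2f} ug DOPC")
```

For lipid mixtures such as asolectin, an average molar mass would have to be assumed instead of the DOPC value.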
3.3.9. Proteoliposome Density Characterization
A discontinuous sucrose flotation gradient analysis (21, 29) can be
performed to check that the membrane protein is correctly
reconstituted into proteoliposomes. The proteoliposomes can be layered
at the bottom (see below) or at the top (see Note 16) of the sucrose
layers.
1. After MscL reconstitution, add 120 mg of sucrose to 0.2 mL of
the proteoliposome suspension (containing approximately 96 μg
of protein) and mix gently by pipetting, until the sucrose has
completely dissolved. The final volume is approximately
0.265 mL and the final sucrose concentration is 45%. Adjust the
final volume to 0.4 mL with the buffered 45% sucrose solution.
2. Transfer the resulting suspension to a 3-mL ultracentrifuge
tube, and keep it vertical in an appropriate rack for the
following steps.
3. Carefully deposit 0.7 mL of the buffered 20% sucrose solution,
with minimal flow along the tube wall to avoid layers mixing.
4. Repeat with 0.7 mL of the buffered 10% sucrose solution.
5. Ultracentrifuge at 100,000 × g for 1.5 h in a swinging rotor at
18°C.
6. At the end of the centrifugation, carefully remove and transfer
the tube into the rack, keeping it vertical. Carefully collect six
fractions of 0.3 mL, from top to bottom, into 1.5-mL tubes,
and adjust the final volume to 1 mL with water.
7. Wash the bottom of the 3 mL tube with 100 μL of 1× SDS load-
ing buffer and remove a 15 μL aliquot for gel electrophoresis.
8. Centrifuge the 1.5-mL tubes at 16,000 × g for 15 min at room
temperature, to pellet the proteoliposomes and remove the
sucrose.
9. Resuspend each pellet in 100 μL of 1× SDS loading buffer and
take a 15 μL aliquot of each for gel electrophoresis.
10. Analyze the aliquots by gel electrophoresis. For an expected
ratio of protein to lipid of 1:4 (w/w), proteoliposomes appear
at the interface between 20% and 10% sucrose. Protein-free
liposomes lie at the top of the sucrose gradient, while lipid-free
protein aggregates lie at the bottom of the sucrose gradient
(see Note 17).
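The numbers used to build the gradient can be verified quickly. The sketch below assumes that the sucrose percentages are expressed as weight per volume (g per 100 mL), which is consistent with the values in step 1 but is not stated explicitly; the function names are illustrative.

```python
# Sketch: checking the sucrose gradient set-up (steps 1-6).
# Assumption: percentages are w/v, i.e., grams of sucrose per 100 mL.

def percent_w_v(mass_mg, volume_ml):
    """Concentration in % (w/v) from a mass in mg and a volume in mL."""
    return mass_mg / 1000.0 / volume_ml * 100.0


def gradient_total_ml(layer_volumes_ml):
    """Total volume of the layered gradient."""
    return sum(layer_volumes_ml)


if __name__ == "__main__":
    # 120 mg of sucrose dissolved to a final volume of about 0.265 mL.
    print(f"bottom layer: {percent_w_v(120.0, 0.265):.1f}% sucrose")
    # 0.4 mL sample in 45% sucrose, then 0.7 mL of 20% and 0.7 mL of 10%.
    total = gradient_total_ml([0.4, 0.7, 0.7])
    print(f"total volume: {total:.1f} mL, i.e., {total / 0.3:.0f} fractions of 0.3 mL")
```

The total of 1.8 mL matches the six 0.3-mL fractions collected in step 6.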
3.3.10. Water Quantification in the Sample
Water content is estimated by weighing 10 mg of sample prior to
and after 16 h in vacuum.
3.4. NMR Sample Preparation
1. Weigh the empty rotor, cap, insert bottom, top, and screw
(see Fig. 2).
2. Introduce the insert bottom into the rotor.
3. Introduce a couple of mg of the pelleted sample into the rotor
on the tip of a spatula. Place the rotor into a 1.5-mL tube and
centrifuge it at 10,000 × g for 1 min at room temperature.
Repeat this until approximately 30 mg of sample has been
introduced into the rotor.
4. Place the insert top into the rotor, without the top screw. Clean
the upper part of the rotor with a precision wiper before capping
tightly. At this point, the rotor containing sample should be
stored at −20°C until the NMR experiment is performed.
5. Introduce the rotor into the magic-angle spinning NMR probe
and into the magnet of the NMR spectrometer. Spin the rotor
at ~10 kHz for 15 min.
6. Extract the rotor and open it. Clean the upper part of the rotor
again, in case a drop of water has come out. Place the top
screw, tighten it, and then cap the rotor tightly. Weigh the full
rotor to deduce the final sample mass. A typical sample consists
of 3 mg proteins, 12 mg lipids and 15 mg water.
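The typical composition quoted in step 6 follows from the sample mass, the water content (Subheading 3.3.10) and the protein-to-lipid ratio. The sketch below shows the arithmetic, assuming a protein-to-lipid ratio of 1:4 (w/w) as in Subheading 3.3.9 and that all mass lost under vacuum is water; the function names are illustrative.

```python
# Sketch: composition of the rotor content from simple weighings.
# Assumptions: mass lost under vacuum is water; protein:lipid = 1:4 (w/w).

def water_fraction(mass_before_mg, mass_after_mg):
    """Fraction of the sample mass lost after drying under vacuum."""
    return (mass_before_mg - mass_after_mg) / mass_before_mg


def rotor_composition(sample_mg, water_frac, lipid_to_protein=4.0):
    """Return (protein, lipid, water) masses in mg."""
    water = sample_mg * water_frac
    dry = sample_mg - water
    protein = dry / (1.0 + lipid_to_protein)
    lipid = dry - protein
    return protein, lipid, water


if __name__ == "__main__":
    # Sample mass = full rotor mass minus empty rotor mass (steps 1 and 6).
    frac = water_fraction(10.0, 5.0)  # invented weighing: 10 mg before, 5 mg after
    protein, lipid, water = rotor_composition(30.0, frac)
    print(f"protein {protein:.1f} mg, lipid {lipid:.1f} mg, water {water:.1f} mg")
```

With 30 mg of sample and a water fraction of 0.5, this reproduces the 3 mg of protein, 12 mg of lipid and 15 mg of water given above.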
4. Notes
1. All solutions require Milli-Q water, but only solutions used
inside the cell-free expression vessel need to be autoclaved and
nuclease-free, to make sure that no ribonucleases are present
that would damage the RNAs in the lysate.
2. Continuous-exchange cell-free protein expression with an RTS
commercial kit from Roche/5Prime is advantageous for its
convenience and reliability, and because it saves the time and
manpower needed to prepare the lysate and manage the stocks.
The thermoregulated shaker, by contrast, does not have to be
from Roche/5Prime.
3. Cell-free reactions should be performed on a small scale (RTS
100 or analogous kits) for optimization studies, including
detergents or lipids. Protein yield should be evaluated in a cell-free
kit equipped with a continuous-exchange system (e.g., RTS
500), where the yield is usually higher. Once the protocol is
optimized, the proteins can then be expressed on larger scales
(e.g., RTS 9000).
4. Cell-free kits were stored at −80°C rather than −20°C, which
increases the lifespan of the kits to over a year. Freeze–thaw
cycles should be avoided.
5. Circular DNA was used for cell-free expression of MscL. Linear
PCR templates can also be used, but they require larger
amounts of DNA. DNA quality is important for cell-free
expression. A plasmid purification procedure with a good yield
(typically 100 μg of plasmid DNA per standard MIDI-prep),
including an anion-exchange step, was found to be necessary.
DNA should be dissolved in either nuclease-free water (our
case) or Tris solution (typically 10 mM, at a pH between 8 and
8.5), but not in a buffer containing EDTA, as it would change
the free magnesium ion concentration and reduce the protein
expression yield.
6. Some amino acids are difficult to solubilize. Sonication or heat-
ing at 60°C can improve solubility. Met (M) and Cys (C) have
to be supplemented with DTT (4 and 8 mM respectively). Tyr
(Y), Trp (W), and Phe (F) have to be solubilized in 60 mM
HEPES in nuclease-free water, adjusting the pH to 13, 1, or
7.5 respectively, using KOH or HCl. However, Trp (W), Asp
(D), Asn (N), Cys (C), or Tyr (Y) will not dissolve completely
and have to be used as suspensions. Alternatively, the
Roche/5Prime RTS Amino acid Sampler provides appropriate
stock solutions of each individual unlabeled amino acid.
7. Commercial mixtures of labeled amino acids are also available
(30). All labeled amino acids can be specifically incorporated
into the expressed proteins, except for Gln (Q) and Glu (E)
when using the Roche/5Prime buffer, which contains large
quantities of unlabeled glutamate. Some scrambling is observed
with Ser (S), Asp (D), Asn (N), Gln (Q), and Glu (E). Some
metabolic degradation can occur with Arg (R), Cys (C), Trp
(W), Met (M), Asp (D), and Glu (E) during prolonged incu-
bation. In such a case, an increased amount of amino acids can
enhance the protein production.
8. Too vigorous mixing should be avoided while preparing the
different lysates, as it may denature proteins or ribosomes in
the extracts.
9. AEBSF is added after the expression is complete to prevent
protease degradation of the target product. Protease inhibitors
can also be added to the reaction chamber, but they should
be tested on a small scale beforehand.
10. If the chelating column is not precharged with Ni ions, before
connecting it to the FPLC system, open it at both ends and
wash it with 2 CV of distilled water, 1 CV of 100 mM EDTA,
pH 8, 3 CV of distilled water, 2 CV of 0.1 M NiSO4 solution,
and 5 CV of distilled water. Once the column is charged, an
additional washing step with 5 CV of 20% ethanol is necessary
before storage at 4°C. At the end of the protein purification
run, wash the column with water (5 CV), 0.5 M NaOH (5 CV),
water (5 CV) and store in 20% ethanol.
11. The total reaction run-on is expected to contain approximately
10 mg of protein but also detergent and impurities, which may
bind to the column as well. Since the binding saturation of the
column surface is on the order of 10 mg/mL, the reaction
run-on is split in two so as to always remain below the satura-
tion limit.
12. For MscL, the fractions containing the protein are eluted
between 200 and 500 mM of imidazole, but this is highly pro-
tein dependent. Depending on the histidine-tag accessibility, the
protein may elute at lower or higher imidazole concentrations.
13. Protein concentration is usually estimated using UV fluores-
cence of aromatic amino acids. If the protein is not sufficiently
fluorescent, quantification can be performed using the commercial
Micro BCA bicinchoninic acid protein assay kit (Pierce).
Care should be taken if lipids are present in the sample.
14. Instead of using polystyrene beads, detergent extraction can be
performed using cyclodextrin inclusion compounds, as
described by DeGrip et al. (24), or by dialysis (21).
15. Proteoliposomes can be dehydrated and rehydrated, but care
should be taken, since the process may affect protein activity.
16. The discontinuous sucrose flotation gradient analysis to check
that the membrane protein is correctly reconstituted can be
performed with the proteoliposomes layered either at the top
or at the bottom of the sucrose layers, or both (21, 29). If at the
top, the proteoliposomes must be mixed with sucrose for a
final sucrose concentration of 10%. The rest of the protocol is
the same: After layering 0.7 mL of 45% sucrose, then 0.7 mL
of 20% sucrose and then 0.7 mL of the proteoliposomes in
10% sucrose, the proteoliposomes usually appear at the inter-
face between 20 and 10% sucrose.
17. If the amount of lipid-free protein aggregates is not negligible,
the discontinuous sucrose flotation gradient can also be used as
a purification technique by layering the entire reconstituted
proteoliposome suspension.
Acknowledgments
This work was supported by fellowships from the Ministère de
l’Enseignement Supérieur et de la Recherche and the Fondation
pour la Recherche Médicale (to A.A.), by the CNRS (UMR 7099
and 8619), the ANR (ANR-06-JCJC0014), the Univ Paris Diderot
and the Université Paris-Sud 11. We thank Alexandre Ghazi,
Catherine Berrier, Emmanuelle Billon-Denis, and Michiel A.
Verhoeven for helping us optimize the protocols presented here.
References
1. Lyford, L. K., and Rosenberg, R. L. (1999) Cell-free expression and
functional reconstitution of Homo-oligomeric α7 Nicotinic Acetylcholine
Receptors into Planar Lipid Bilayers. J. Biol. Chem. 274, 25675–25681.
2. Elbaz, Y., Steiner-Mordoch, S., Danieli, T., and Schuldiner, S. (2004)
In vitro synthesis of fully functional EmrE, a multidrug transporter,
and study of its oligomeric state. Proc. Natl. Acad. Sci. U.S.A. 101,
1519–1524.
3. Berrier, C., Park, K. H., Abes, S., Bibonne, A., Betton, J. M., and
Ghazi, A. (2004) Cell-free synthesis of a functional ion channel in the
absence of a membrane and in the presence of detergent. Biochemistry 43,
12585–12591.
4. Klammt, C., Lohr, F., Schafer, B., Haase, W., Dötsch, V., Ruterjans, H.,
Glaubitz, C., and Bernhard, F. (2004) High level cell-free expression and
specific labeling of integral membrane proteins. Eur. J. Biochem. 271,
568–580.
5. Wagner, S., Bader, M. L., Drew, D., and de Gier, J. W. (2006)
Rationalizing membrane protein overexpression. Trends Biotechnol. 24,
364–371.
6. Eshaghi, S. (2009) High-throughput expression and detergent screening
of integral membrane proteins. Methods Mol. Biol. 498, 265–271.
7. Tate, C. G. (2010) Practical considerations of membrane protein
instability during purification and crystallisation. Methods Mol. Biol.
601, 187–203.
8. Breyton, C., Pucci, B., and Popot, J.-L. (2010) Amphipols and
fluorinated surfactants: Two alternatives to detergents for studying
membrane proteins in vitro. Methods Mol. Biol. 601, 219–245.
9. White, S. H. (2010) Membrane Proteins of Known Structure. University of
California at Irvine.
http://blanco.biomol.uci.edu/Membrane_Proteins_xtal.html Accessed 25
July 2011.
10. Warschawski, D. E. (2010) Membrane Proteins of Known Structure
Determined by NMR. Drorlist. http://www.drorlist.com/nmr/MPNMR.html
Accessed 25 July 2011.
11. Schneider, B., Junge, F., Shirokov, V. A., Durst, F., Schwarz, D.,
Dötsch, V., and Bernhard, F. (2010) Membrane protein expression in
cell-free systems. Methods Mol. Biol. 601, 165–186.
12. Sobhanifar, S., Reckel, S., Junge, F., Schwarz, D., Kai, L.,
Karbyshev, M., Löhr, F., Bernhard, F., and Dötsch, V. (2010) Cell-free
expression and stable isotope labelling strategies for membrane proteins.
J. Biomol. NMR 46, 33–43.
13. Lehner, I., Basting, D., Meyer, B., Haase, W., Manolikas, T.,
Kaiser, C., Karas, M., and Glaubitz, C. (2008) The key residue for
substrate transport (Glu14) in the EmrE dimer is asymmetric. J. Biol.
Chem. 283, 3281–3288.
14. Abdine, A., Verhoeven, M. A., Park, K.-H., Ghazi, A., Guittet, E.,
Berrier, C., Van Heijenoort, C., and Warschawski, D. E. (2010) Structural
study of the membrane protein MscL using cell-free expression and
solid-state NMR. J. Magn. Reson. 204, 155–159.
15. Kalmbach, R., Chizhov, I., Schumacher, M. C., Friedrich, T.,
Bamberg, E., and Engelhard, M. (2007) Functional cell-free synthesis of a
seven helix membrane protein: in situ insertion of bacteriorhodopsin into
liposomes. J. Mol. Biol. 371, 639–648.
16. Marques, B., Liguori, L., Paclet, M. H., Villegas-Mendéz, A.,
Rothe, R., Morel, F., Lenormand, J.-L. (2007) Liposome-mediated cellular
delivery of active gp91(phox). PLoS One 2, e856.
17. Katzen, F., Fletcher, J. E., Yang, J. P., Kang, D., Peterson, T. C.,
Cappuccio, J. A., Blanchette, C. D., Sulchek, T., Chromy, B. A.,
Hoeprich, P. D., Coleman, M. A., and Kudlicki, W. (2008) Insertion of
membrane proteins into discoidal membranes using a cell-free protein
expression approach. J. Proteome Res. 7, 3535–3542.
18. Berrier, C., Guilvout, I., Bayan, N., Park, K.-H., Mesneau, A.,
Chami, M., Pugsley, A. P., and Ghazi, A. (2011) Coupled cell-free
synthesis and lipid vesicle insertion of a functional oligomeric channel
MscL - MscL does not need the insertase YidC for insertion in vitro.
Biochim. Biophys. Acta 1808, 41–46.
19. Abdine, A., Verhoeven, M. A., and Warschawski, D. E. (2011) Cell-free
expression and labeling strategies for a new decade in solid-state NMR.
New Biotechnol. 28, 272–276.
20. He, M. (2008) Cell-free protein synthesis: applications in proteomics
and biotechnology. New Biotechnol. 25, 126–132.
21. Rigaud, J.-L., and Lévy, D. (2003) Reconstitution of Membrane Proteins
into Liposomes. Methods Enzymol. 372, 65–86.
22. Triba, M. N., Zoonens, M., Popot, J.-L., Devaux, P. F., and
Warschawski, D. E. (2006) Reconstitution and alignment by a magnetic
field of a β-barrel membrane protein in bicelles. Eur. Biophys. J. 35,
268–275.
23. Leitz, A. J., Bayburt, T. H., Barnakov, A. N., Springer, B. A., and
Sligar, S. G. (2006) Functional reconstitution of β2-adrenergic receptors
utilizing self-assembling Nanodisc technology. BioTechniques 40, 601–612.
24. De Grip, W. J., Van Oostrum, J., and Bovee-Geurts, P. H. M. (1998)
Selective detergent-extraction from mixed detergent/lipid/protein
micelles, using cyclodextrin inclusion compounds: a novel generic
approach for the preparation of proteoliposomes. Biochem. J. 330, 667–674.
25. Sukharev, S. I., Schroeder, M. J., and McCaslin, D. R. (1999)
Stoichiometry of the Large Conductance Bacterial Mechanosensitive Channel
of E. coli. A Biochemical Study. J. Membrane Biol. 171, 183–193.
26. Smith, P. K., Krohn, R. I., Hermanson, G. T., Mallia, A. K.,
Gartner, F. H., Provenzano, M. D., Fujimoto, E. K., Goeke, N. M.,
Olson, B. J., and Klenk, D. C. (1985) Measurement of protein using
bicinchoninic acid. Anal. Biochem. 150, 76–85.
27. Wiechelman, K. J., Braun, R. D., and Fitzpatrick, J. D. (1988)
Investigation of the bicinchoninic acid protein assay: identification of
the groups responsible for color formation. Anal. Biochem. 175, 231–237.
28. Rouser, G., Fleischer, S., and Yamamoto, A. (1970) Two dimensional
thin layer chromatographic separation of polar lipids and determination
of phospholipids by phosphorus analysis of spots. Lipids 5, 494–496.
29. Laird, D. M., Eble, K. S., and Cunningham, C. C. (1986) Reconstitution
of mitochondrial F0.F1-ATPase with phosphatidylcholine using the nonionic
detergent, octylglucoside. J. Biol. Chem. 261, 14844–14850.
30. Etezady-Esfarjani, T., Hiller, S., Villalba, C., and Wüthrich, K.
(2007) Cell-free protein synthesis of perdeuterated proteins for NMR
studies. J. Biomol. NMR 39, 229–238.
Chapter 7
Expression and Purification of Src-family Kinases for Solution NMR Studies
Andrea Piserchio, David Cowburn, and Ranajeet Ghose
Abstract
NMR analyses of the structure, dynamics, and interactions of the Src family kinases (SFKs) have been
hindered by the limited ability to obtain sufficient amounts of properly folded, soluble protein from bacterial
expression systems, to allow these studies to be performed in an economically viable manner. In this chapter,
we detail our attempts to overcome these difficulties using the catalytic domain (SrcCD) of c-Src, the pro-
totypical SFK, as an illustrative example. We describe in detail two general methods to express and purify
SrcCD from Escherichia coli expression systems in both fully active wild-type and kinase-deficient mutant
forms, allowing the efficient and cost-effective labeling by NMR-active isotopes for solution NMR studies.
Key words: Protein tyrosine kinases, Src-family kinases, Escherichia coli expression systems
1. Introduction
Protein kinases in higher eukaryotes transfer the γ-phosphate of
ATP to the hydroxyl groups of specific serine, threonine (serine/
threonine kinases) or tyrosine (tyrosine kinases) residues in
substrate proteins. This covalent modification of the Ser/Thr/
Tyr–OH moiety is a fundamental mechanism of cellular regulation
and intracellular signal transduction (1–3) and has a significant role
in almost every aspect of cell growth, differentiation, maturation,
motility, and regulated cell death. In vivo, the phosphorylation
levels of kinase targets are tightly controlled by the opposing
actions of kinases and their corresponding phosphatases (4, 5).
A loss of this control due to mutations, changes in expression levels,
alteration of protein–protein interactions including localization,
and/or catalytic activity has been implicated in a variety of diseases
including a multitude of human cancers (6–9), making protein
kinases significant and established targets for anticancer therapies
(10–13).
Eukaryotic protein kinases contain a highly conserved catalytic
domain (CD) (14, 15) that forms a dual-lobed structure (Fig. 1)
with a smaller, mainly β-sheet N-terminal lobe (N-lobe), and a
larger, mainly α-helical C-terminal lobe (C-lobe). The catalytic
activity, while wholly contained within the CD, is often regulated
by additional modular domains (16) or insertions within the CD
itself (17). While crystallographic analyses of both serine/threo-
nine (18) and tyrosine kinases (19) have provided valuable insights
into the regulation of their catalytic activity, several questions
remain. The current hypothesis is that most eukaryotic protein
kinases function as simple two-state switches transitioning
between well-defined “active” and “inactive” conformations based
on the phosphorylation state of a small number of regulatory resi-
dues. Mounting evidence suggests that this may be an oversimpli-
fication and the boundary between fully active and fully inactive
states may not be that well defined. Further, the simple
phosphorylation-dependent switch scenario is complicated by mounting
experimental evidence that hints towards the role played by long-
range allosteric dynamic pathways in the regulation of kinase activ-
ity, where active-like conformations may be attained through
remote interactions (20). Structural details on the nature of recog-
nition of downstream substrates and of regulating kinases and
phosphatases are still poorly defined for many of the subfamilies of
Fig. 1. (a) Domain arrangement in inactive full-length c-Src (PDB ID: 2SRC). The residues comprising the linker between
the SH2 and the catalytic domain (CD), which engages the SH3 domain, is shown in ball-and-stick representation, as are
the residues that comprise the C-terminal tail. The regulatory tyrosine residues in the activation loop (Y416, unphosphory-
lated in inactive c-Src) and the C-terminal tail (Y527, phosphorylated in inactive c-Src) are also shown in ball-and-stick
representation. The ATP-analog, ANP (phosphoaminophosphonic acid-adenylate), bound to the ATP-binding site, is also
shown in ball-and-stick representation. (b) Expanded view of SrcCD with the N- and C-lobes indicated. Important
regulatory elements that include the activation loop, Gly-rich loop, the HRD (that precedes the catalytic loop) and DFG
motifs are also indicated. Y416 is shown in a ball-and-stick representation and the αC helix is indicated by a dashed
ellipse. The ligand ANP is also shown bound to the ATP-binding pocket.
the human kinome. This is not unexpected, given that some of

these interactions may be weak or transient, and the resultant com-
plexes may not be amenable to crystallographic analyses.
Given this background, it would seem that solution NMR
spectroscopy is tailor-made to resolve these unanswered issues
about structure, dynamics, and interactions involving protein
kinases. Targeting eukaryotic protein kinases by using solution
NMR methodologies is especially important because of consid-
erable interest in the pharmaceutical industry in designing small
molecule inhibitors of enzymatic activity (21). In particular, the
rational design of inhibitors targeting remote sites exerting allos-
teric control over catalytic activity in protein kinases (type III
inhibitors) holds promise as a strategy for inhibiting kinase signal-
ing in a highly selective fashion. These allosteric sites, which are
more likely to be unique to specific protein kinases or kinase
subfamilies, could be identified by NMR studies performed on a
representative subgroup of the human kinome. These studies are
expected to assume greater importance as the role of yet uncharac-
terized protein kinases in cancer becomes clearer by using a com-
bination of RNAi technologies in tumor cell lines and genetic
knockdown studies (22–25).
Unfortunately, there have been only a handful of NMR studies
reported on protein kinases. These include the receptor tyrosine
kinase Eph (26), c-Abl (27), c-Src (28), ERK2 (29), protein kinase
A (PKA) (30, 31), and Csk (32). A large number of these targets,
including PKA and Eph, for which detailed NMR analyses are
available, may be expressed and purified with relative ease and are,
to a large extent, well behaved in solution, in contrast to a large
majority of the human kinome. NMR studies on a broader class of
protein kinases have been hindered by difficulties in bacterial
expression and purification of sufficient quantities of soluble, prop-
erly folded protein for economically viable labeling with NMR-
active isotopes. Indeed, the only comprehensive NMR study on a
nonreceptor tyrosine kinase examined c-Abl expressed using the
baculovirus Sf9 insect cell expression system (27). This system
allows limited labeling options and is generally not an economi-
cally viable option in a noncommercial setting. Thus, to make these
studies viable for a broader set of representative members of the
kinome, the optimization of protocols for protein production in
bacterial expression systems is required.
The problems encountered in NMR studies of eukaryotic pro-
tein kinases are exemplified by the Src-family of nonreceptor
tyrosine kinases (SFKs) (Fig. 1). c-Src is the prototypical representative
of this nine-member family (c-Src, Blk, Fgr, Fyn, Hck, Lck,
Lyn, Yes, and Yrk) (16, 33, 34). SFKs have been found to be
involved in a multitude of human cancers including those of the
breast, colon, gastrointestinal tract, lung and in leukemias, lym-
phomas, and myelomas (35). In particular, the catalytic activity of
c-Src is highly elevated in a majority of human colon cancers (36).

SFKs in general and c-Src in particular are therefore key targets for
the design of drugs to treat a wide range of human cancers (37).
The catalytic activity of c-Src, and indeed of other SFKs, depends
on the phosphorylation state of two conserved tyrosine residues
(chicken c-Src numbering used throughout): Tyr416 (activation)
and Tyr527 (suppression). Phosphorylation occurs at Tyr416
through an intermolecular mechanism, while Tyr527 is phosphorylated
by another family of nonreceptor tyrosine kinases that
comprises two members: C-terminal Src kinase (Csk) and Csk-homologous
kinase (Chk) (38, 39).
Several researchers have utilized a variety of systems, including
insect cells (40), yeast (Schizosaccharomyces pombe) (41), and
platelets (42) to express c-Src. Weijland et al. (41) used a S. pombe
based strategy to express the catalytic domain of c-Src by coexpressing
the catalytic domain of the tyrosine phosphatase PTP-PEST to
reduce catalytic activity and the resultant cytotoxicity. As an extension
of this strategy, Seeliger et al. (43) have reported the Escherichia
coli expression of the catalytic domains of c-Src and c-Abl by coex-
pressing the downregulating YopH phosphatase (43). However,
multiple issues, including poor sample homogeneity, multiple
phosphorylation states (44), and sample aggregation or degradation,
are often encountered when using these strategies. Escherichia coli,
the system of choice for production of material amenable to analysis
by solution NMR spectroscopy, is particularly affected by these
drawbacks (28). The expression and purification of the catalytic
domain of c-Src (SrcCD) from E. coli for NMR studies was
hindered by several major complications that include the following:
1. Cytotoxicity resulting in poor cell growth after induction.
2. Protein misfolding leading to the accumulation of large
amounts of insoluble material in inclusion bodies, resulting in
vanishingly low yields of properly folded soluble protein.
Attempts to refold these misfolded aggregates using chaperones
in vitro have also not yielded promising results (Piserchio et al.
unpubl. results).
3. The production of heterogeneously phosphorylated protein.
Resolution of these issues is absolutely necessary for NMR,
with its ability to provide both structural and dynamic information
over a wide range of timescales, to play a role in studies of these key
signaling molecules for drug discovery (45). Here, we describe our
efforts toward alleviating these issues by optimal choice of con-
structs, the use of solubility-enhancing tags, coexpression with
chaperones, and carefully optimizing overexpression conditions
(28). These methods should provide a useful guide for the bacte-
rial expression and purification of similar protein kinases for NMR
studies.
2. Expression of the Catalytic Domain of c-Src in Escherichia coli
The first strategy that we utilized for the bacterial expression of
c-Src involved (1) the addition of an N-terminal MBP fusion tag to
enhance both solubility and folding, (2) the coexpression of GroEL
and GroES chaperones to further facilitate the accumulation of
properly folded protein, an idea originally proposed by Cole (46),
and (3) a decrease in the growth temperature after induction to
15°C, thus reducing the rapid accumulation of large quantities of
proteins and macromolecular crowding effects that tend to
facilitate aggregation.
MBP is widely used to enhance the production of fused proteins
through a mechanism that is far from clear. Several possible mecha-
nisms have been proposed including the formation of large vesicular
aggregates with the misfolded or unfolded proteins contained
inside (47) or a general chaperoning effect (48). Our initial
attempts to produce the catalytic domain of the so-called “kinase-
dead mutant” bearing a Lys → Met mutation (SrcCDK295M) (49) in
M9 minimal medium showed that, in fact, just the introduction of
an N-terminal MBP fusion, while not eliminating the presence of
protein in the insoluble inclusion bodies, drastically increased the
amount of soluble material. However, a majority of the protein
produced in this fashion found in solution consisted of soluble
aggregates, presumably formed by misfolded material. Observation
of similar phenomena has added credence to the speculation that
MBP fusions can form micellar-like systems, formed by unfolded
material in the core, and by the soluble MBP domain on the sur-
face (47). Though a relatively small amount of properly folded
SrcCDK295M was obtained by these preparations, it allowed us to
collect the first 15N-HSQC spectra of SrcCDK295M. However, the
quantity of soluble, properly folded SrcCD obtained using this
procedure was not sufficient for detailed NMR analyses. This was
because a large number of experiments, collected over long periods
of time, are required for the resonance assignment process, neces-
sitating multiple NMR samples produced using expensive labeling
supplies. Nevertheless, the observation that material suitable for
NMR experiments could be obtained from an E. coli expression
system was indeed encouraging.
The largest improvement in the yield of properly folded
SrcCDK295M came when the N-terminal MBP fusion system was
transformed into E. coli cells that overexpress GroEL and GroES
chaperones. Cole and coworkers have shown that overexpression
of these bacterial chaperones allows the purification of small amounts
of SrcCDK295M from E. coli (49). We found that a more substantial
improvement, also achievable in the minimal medium used for
NMR sample preparation, can be obtained by integrating the MBP
fusion and the GroELS systems. Furthermore, the same approach
Fig. 2. Production and purification of SrcCDK295M. (a) SDS-PAGE gel (20%) showing different steps in the purification
process: lane 1 – GroEL and GroES (expressed alone), lane 2 – cell lysate supernatant, lane 3 – cell lysate pellet, lane 4 –
metal affinity column flow through, lane 5 – metal affinity column eluate, lane 6 – product after thrombin cleavage
(MBP – higher molecular weight and SrcCDK295M – lower molecular weight), lane 7 – SrcCDK295M after Q-column purification.
(b) Effects of controlled phosphorylation and dephosphorylation of SrcCDK295M on the Q column elution profiles. Dashed line:
sample preincubated with catalytic amounts of commercial wild-type c-Src and Mg2+/ATP. Dark solid line: sample preincubated
with alkaline phosphatase (nonspecific phosphatase). Light solid line: sample preincubated with Lyp phosphatase
(specific for phosphorylated Y416). The identity of pSrcCDK295M was confirmed by immunoblots using anti-pTyr antibodies
and MS/MS, following trypsin digestion.
could also be used to grow wild-type SrcCD (as opposed to

SrcCDK295M), albeit with lower yields, with minimal alteration of
the overall protocol. The introduction of the chaperones, however,
did not completely eliminate the formation of soluble aggregates
mentioned above. However, we noticed that, while these unde-
sired aggregates bound to the amylose columns that are routinely
employed to purify MBP fusion proteins, they were unable to bind
Co2+ affinity columns despite the presence of an N-terminal His6-tag,
most likely because the His6-tag was occluded in the misfolded
species. This fortuitous scenario allowed us to use metal affinity
chromatography to efficiently separate the soluble aggregates
from properly folded SrcCD.
When a target protein is expressed as an MBP fusion, a critical
step prior to NMR analyses is the removal of the bulky MBP tag
from the fusion. SrcCD can be efficiently separated from the MBP
tag by inserting a thrombin cleavage site between the two compo-
nents of the fusion construct, although some care has to be taken
to avoid protein degradation in the presence of thrombin.
We found that resins that bind the N-terminal end of the cleaved
fusion (metal-chelating for His6 or amylose for the MBP tag) can-
not be efficiently used to isolate SrcCD, given the tendency of the
kinase to bind nonspecifically to these columns. We then found
that separation of MBP and SrcCD can be carried out using ion
exchange chromatography (Fig. 2). The catalytically compromised
SrcCDK295M could be efficiently separated from cleaved MBP in this
fashion. Interestingly, during purification we found that two forms
of SrcCDK295M (one selectively phosphorylated at the activating
Tyr416 and the other not; confirmed by Western blot using
anti-pTyr antibodies and MS/MS analysis) (28) were obtained as
two distinct peaks in the elution chromatogram (Fig. 2). This last
observation was surprising to some extent, because Q columns do
not usually have sufficient resolution to separate identical proteins
with unit changes in charge states. Crystallographic and computa-
tional analyses (e.g., refs. 50, 51) have revealed that activating
phosphorylation alters the overall conformation of the catalytic domain,
thus the protein surface that interacts with the column is also
likely altered. This feature, we believe, allows the separation of the
phosphorylated and nonphosphorylated species using ion-exchange
chromatography. We provide below a step-by-step protocol for the
purification of SrcCD as an MBP fusion in Subheading 4.1.
The MBP/GroELS based protocol, though complex in the
nature of its purification scheme, results in high yields of SrcCD
purified to homogeneity (up to 700 nmol/L of culture). However,
the biggest disadvantage lies in the fact that the catalytic domain of
c-Src can only be purified either as a kinase-deficient mutant
(SrcCDK295M) or in wild-type form in complex with a high-affinity
inhibitor (see Subheading 4.1.2), using this strategy. This prevents
access to the full range of catalytically relevant conformations and
motional modes, in spite of having the ability to purify both con-
structs (mutant and wild-type) with the activating Tyr416 in the
unphosphorylated and selectively phosphorylated forms. Thus, we
developed an alternative strategy to purify wild-type SrcCD that
displays full catalytic activity in solution.
Our experience indicates that achieving proper folding is
the main hindrance to SrcCD production in bacterial expression
systems. In eukaryotes, the folding of several protein kinases, including
c-Src, occurs with the assistance of the Hsp90 chaperone machinery
(52–54). This process is further assisted by an essential kinase-
specific cochaperone Cdc37/p50, that can independently bind both
the kinase target and the Hsp90 chaperone, thus approximating the
kinase and the chaperone machinery, and stabilizing the transient
ternary complex (55, 56). Indeed, ternary complexes of Hsp90,
v-Src (a transforming viral form of c-Src that contains a tail trunca-
tion and lacks Tyr527) and Cdc37 (57) have been identified.
We found that coexpressing the Cdc37 cochaperone (the plasmid
encoding Cdc37 was a kind gift from Dr. Avrom Caplan, CCNY)
with wild-type SrcCD leads to a properly folded soluble kinase
domain even in the absence of the N-terminal MBP fusion protein
(Fig. 3). Consequently, the His6-tagged kinase can be purified eas-
ily using metal affinity chromatography followed by size exclusion
chromatography. Thrombin cleavage and ion-exchange chroma-
tography, steps that can both lead to substantial losses in material,
become unnecessary. Although the yields are somewhat lower, the
methodology is less time consuming, and the wild-type kinase
domain can be isolated without the use of an inhibitor, allowing
Fig. 3. (a) SDS-PAGE gel (20%) showing the various steps in the purification of wild-type SrcCD (His6-tagged): lane 1 – cell
lysate supernatant, lane 2 – cell lysate pellet, lane 3 – metal affinity column flow-through (unbound material), lane 4 –
metal affinity column eluate, lane 5 – cSrc after purification using gel filtration. The positions of Cdc37, MBP-Lyp and SrcCD
in lanes 3–5, respectively, are highlighted by boxes. Samples in lanes 4 and 5 were concentrated by a factor of 10. (b) Size
exclusion column profile of the eluate from the metal affinity column purification. (c) Immunoblots of phosphorylated
wild-type SrcCD (preincubated with 2 mM ATP and 10 mM Mg2+) with anti-pTyr antibody. All immunoblots were visualized
using the Bio-Rad ChemiDoc system. Lane 1 – phosphorylated SrcCD, lane 2 – phosphorylated SrcCD after incubation
with alkaline phosphatase.
access to the full complement of structural and dynamics changes

that accompany activation. The detailed protocols used are
described at length in Subheading 4.2.
3. Materials
Most of the materials utilized in the experiments described below

are routinely used in NMR laboratories for protein expression and
purification.
3.1. Expression Vectors and Competent Cells
1. H-MBP-3C vector (58): a kind gift from Dr. Kaushik Dutta,
New York Structural Biology Center (see Note 1).
2. pACYCDuet vector: for simultaneous expression of two
proteins (see Note 2): SrcCD and Cdc37 should be cloned
into this vector following the protocol described below.
3. 0.1-cm gap sterile electroporation cuvette for bacteria.
4. Electroporator (see Note 3).
5. pREP4-groELS electrocompetent BL21 (DE3) cells
(see Note 4).
6. BL21 (DE3) T1 competent cells.
3.2. Media for Bacterial Growth
1. LB (Luria Bertani) Miller medium: Dissolve 20 g of LB
powder in 1 L of water, and sterilize by autoclaving. Store at
room temperature.
2. M9 salts: 47.9 mM anhydrous Na2HPO4 (6.8 g/L), 22.0 mM

anhydrous KH2PO4 (3 g/L), and 8.6 mM NaCl (0.5 g/L),
sterilized by autoclaving. Store at room temperature.
3. Ampicillin stock (1,000×): 100 mg/mL Ampicillin in water.
Store at 4°C.
4. Kanamycin stock (1,000×): 50 mg/mL Kanamycin in water.
Store at 4°C.
5. Chloramphenicol stock (1,000×): 35 mg/mL Chloramphenicol
in Isopropanol. Store at 4°C.
6. LB plates: Mix 40 g/L of LB Miller agar and autoclave the
suspension, let the material cool to ~55°C and add the desired
antibiotic (100 mg/L Ampicillin, or 50 mg/L Kanamycin, or
35 mg/L Chloramphenicol) or combination of antibiotics
(100 mg/L Ampicillin and 50 mg/L Kanamycin, or 100 mg/L
Ampicillin and 35 mg/L Chloramphenicol).
7. 1M Isopropyl β-D-1-thiogalactopyranoside (IPTG): Prepare in
water and store at 4°C.
8. Metal Solution I (1,000×): 1 mM MnCl2·4H2O, 0.2 mM
CoCl2·6H2O, 2 mM ZnCl2, 1 mM CuSO4·5H2O, 50 mM
H3BO3, and 0.001 mM Na2MoO4. The compounds are added
in the order listed above, dissolving each compound completely
before adding the following one.
9. Metal Solution II (1,000×): 1 mM FeSO4, filter-sterilize and
store at 4°C.
10. MgSO4 stock (500×): 1M MgSO4, autoclave and store at room
temperature.
11. CaCl2 stock (1,000×): 0.1M CaCl2, autoclave and store at
room temperature.
12. Vitamin solution (100×): Prepare 10 mL aliquots in 15-mL
centrifuge tubes, and store at −20°C (see Note 5).
13. 15NH4Cl: The material is weighed and carefully dissolved in
the minimum amount of preparation medium (40 mL for 0.5 g
of NH4Cl), sterile-filtered, and added immediately to the
medium.
14. [13C6]-Glucose: The material is weighed and carefully dissolved
in the minimum amount of preparation medium (40 mL
for 2.5 g of glucose), sterile-filtered, and added immediately to
the medium.
15. [U-15N,13C,2H] CELTONE® powder (see Note 6). The material
is weighed and carefully dissolved in the minimum amount of
preparation medium, sterile-filtered, and added immediately to
the medium. Store at room temperature.
16. Anhydrous dextrose.
17. 14N ammonium chloride.
18. 0.22-μm Filters.
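The millimolar values quoted for the M9 salts (item 2 above) can be cross-checked against the listed masses. The short Python sketch below does this. The molar masses are standard values for the anhydrous salts and are not taken from this chapter, and the function and variable names are illustrative only.

```python
# Molar masses (g/mol) of the anhydrous salts: standard values, assumed here,
# not quoted in the chapter.
MOLAR_MASS = {"Na2HPO4": 141.96, "KH2PO4": 136.09, "NaCl": 58.44}

# Mass concentrations (g/L) as listed for the M9 salts.
M9_SALTS_G_PER_L = {"Na2HPO4": 6.8, "KH2PO4": 3.0, "NaCl": 0.5}


def millimolar(grams_per_litre: float, molar_mass: float) -> float:
    """Convert a mass concentration (g/L) to a millimolar concentration."""
    return grams_per_litre / molar_mass * 1000.0


# Millimolar concentration of each salt, keyed by salt name.
m9_mM = {
    salt: millimolar(grams, MOLAR_MASS[salt])
    for salt, grams in M9_SALTS_G_PER_L.items()
}

for salt, conc in m9_mM.items():
    print(f"{salt}: {conc:.1f} mM")
```

The same conversion can be reused for any other stock in this list for which both a mass and a molarity are given.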
3.3. Protein Purification
1. Human α-thrombin, store at −80°C.
2. Empty 20-mL bed volume chromatography columns.
3. Preequilibrated cobalt resin: Resuspend the cobalt-based beads,
(see Note 7) used for metal-affinity purification of His6-tagged
proteins, by gently inverting the bottle. Typically 4 mL of bed
volume is used for every 0.5 L of culture (2 mL of slurry provides
1 mL of bed volume). Once resuspended, pipette the slurry
into a 20 mL empty column until all the methanol is washed
away and the resin settles. Gently wash the column with three
bed volumes of deionized H2O followed by three bed volumes
of lysis buffer. Resuspend the resin in 1–2 mL of lysis buffer (by
gently inverting the closed column) and add to the cell lysate.
4. Lysis buffer: 20 mM Tris–HCl, pH 7.8, 0.5M NaCl, 1% Triton
X-100, 0.1% β-mercaptoethanol. The buffer is prepared without
β-mercaptoethanol and Triton-X, and stored at room temperature
with these two chemicals added immediately before usage.
5. Q columns (5 mL): for ion-exchange chromatography
(see Note 8).
6. Ion-exchange chromatography buffer A: 20 mM Tris–HCl,
pH 7.4, 10% glycerol. Store at room temperature.
7. Ion-exchange chromatography buffer B: 20 mM Tris–HCl,
pH 7.4, 1M NaCl, 10% glycerol. Store at room temperature.
8. Dialysis buffer (before size-exclusion chromatography):
20 mM Tris, pH 7.5 (titrated with 12M HCl), 150 mM NaCl,
1 mM EDTA, and 1 mM DTT (Dithiothreitol), prepared
immediately before use.
9. Size exclusion chromatography buffer: 50 mM sodium phos-
phate, pH 6.5, 150 mM NaCl, 1 mM DTT. 4.82 g of NaH2PO4,
and 2.14 g of Na2HPO4 are added to 1 L of water, the pH is
adjusted if necessary using 12M HCl or 1M NaOH. The DTT
is added immediately before use and the rest stored at room
temperature.
10. Gel filtration column (see Note 9): Preequilibrate the column
with 1.5 volumes of size exclusion chromatography buffer.
The typical flow rate is 0.5 mL/min.
11. AP23464: 10 mM in DMSO. The high-affinity kinase inhibitor
(59) is a kind gift from Dr. David Dalgarno, Ariad Pharmaceuticals.
Dissolve 4.7 mg in 1 mL of dimethyl sulfoxide.
12. Imidazole buffer: 20 mM Tris, 0.5M NaCl, 1M Imidazole.
Dissolve 1.2 g Tris, 14.61 g NaCl, and 34.04 g of Imidazole
in ~ 450 mL H2O. Adjust the pH to 7.8 using 12M HCl, bring
the solution to 0.5 L volume using a graduated cylinder, and
readjust the pH to 7.8 using 12M HCl or 1M NaOH. Cover

the bottle with aluminum foil and store at room temperature.
13. Cleavage buffer: 20 mM Tris–HCl, pH 7.8, 150 mM NaCl.
Add β-mercaptoethanol to a final concentration of 4 mM just
before the cleavage reaction.
14. 1M sodium hydroxide: dissolve 40 g of NaOH in ~ 950 mL of
water and bring to 1 L volume using a graduated cylinder.
15. EDTA solution: 20 mM Tris, 0.5M EDTA. Dissolve 1.2 g of
Tris and 73 g of ethylenediaminetetraacetic acid in 450 mL of
water, adjust the pH to 8.0 using 1M NaOH, bring to 0.5 L volume
with a graduated cylinder, and readjust the pH to 8.0 using 12M
HCl or 1M NaOH.
16. AEBSF: 41.7 mM solution, 1 mL aliquots. Dissolve 10 mg of
4-(2-aminoethyl) benzenesulfonyl fluoride hydrochloride in
1 mL of water and store at −20°C.
17. French press.
18. 12M HCl.
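The masses quoted in the buffer recipes above follow from mass = molarity × volume × molar mass. As a rough illustration, the Python sketch below recomputes the Imidazole buffer recipe (item 12) and the amount of cobalt resin slurry implied by item 3. The molar masses are standard values assumed here rather than taken from the chapter, and all function names are hypothetical.

```python
# Standard molar masses (g/mol): assumed values, not given in the chapter.
MOLAR_MASS = {"Tris": 121.14, "NaCl": 58.44, "imidazole": 68.08}


def grams_needed(molar: float, litres: float, molar_mass: float) -> float:
    """Mass (g) of solute for a solution of the given molarity and volume."""
    return molar * litres * molar_mass


# Imidazole buffer (item 12): 20 mM Tris, 0.5 M NaCl, 1 M imidazole, 0.5 L.
IMIDAZOLE_BUFFER_M = {"Tris": 0.020, "NaCl": 0.5, "imidazole": 1.0}
recipe_g = {
    name: grams_needed(conc, 0.5, MOLAR_MASS[name])
    for name, conc in IMIDAZOLE_BUFFER_M.items()
}


def cobalt_slurry_ml(culture_litres: float) -> float:
    """Slurry volume (mL) from item 3: 4 mL of bed volume per 0.5 L of
    culture, with 2 mL of slurry providing 1 mL of bed volume."""
    bed_ml = 4.0 * culture_litres / 0.5
    return 2.0 * bed_ml


for name, grams in recipe_g.items():
    print(f"{name}: {grams:.2f} g per 0.5 L")
print(f"Cobalt slurry for 0.5 L of culture: {cobalt_slurry_ml(0.5):.0f} mL")
```

The computed masses should agree with the 1.2 g, 14.61 g, and 34.04 g listed for the Imidazole buffer to within rounding.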
4. Methods
4.1. Expression and Purification of the Catalytic Domain of c-Src as an N-terminal MBP Fusion
The detailed protocol provided below is for the bacterial expression
and purification of the catalytic domain of the kinase-deficient
construct (SrcCDK295M) or the wild-type domain in the presence of
a high-affinity inhibitor.
4.1.1. Overexpression of SrcCD
1. Thaw a tube containing the pREP4-groELS electrocompetent
BL21 (DE3) cells on ice (see Note 4).
2. Add ~20 ng of H-MBP-3C vector, encoding SrcCD, to the
electrocompetent cells. Transfer the contents into a pre-
chilled 0.1 cm gap electroporation cuvette and incubate for
2 min on ice.
3. Transform the cells with a 6 ms pulse at 1,730 V, and add
960 μL of LB medium, prewarmed to 37°C. Transfer into a
sterile 15-mL conical centrifuge tube and shake at 250 rpm for
1 h in an incubator set to 37°C.
4. Plate ~100 μL of transformed cells on a prewarmed (37°C) LB
plate containing both Kanamycin and Ampicillin (50 and
100 mg/L, respectively). Incubate overnight at 37°C.
5. Transfer the plate to 4°C for storage.
6. Autoclave 0.5 L of M9 salts in a 2-L flask, transfer 20 mL of
this solution into a 50-mL conical tube. From the stock solu-
tions, add 20 μL of 0.1M CaCl2 (final concentration: 0.1 mM),
40 μL of 1M MgSO4 (final concentration: 2.0 mM), 20 μL of

Metal Solutions I and II, 200 μL of Vitamins solution, 20 μL
each of Ampicillin and Kanamycin (final concentration: 100
and 50 mg/L, respectively). Finally, add 0.1 g (5 g/L) of anhy-
drous dextrose and 20 mg (1 g/L) of ammonium chloride.
Dissolve by agitation or vortexing, and filter the M9 medium
into a preautoclaved 125-mL flask using a sterile 50-mL
syringe and a sterile 0.22-μm filter.
7. Scoop a few colonies from the transformed plate using a sterile
loop, transfer into the M9 medium and incubate overnight in
a shaker (250 rpm) at 37°C.
8. Using a sterile culture pipette, transfer ~20 mL of the M9 salt
solution into a sterile 50-mL centrifuge tube. Use this solu-
tion to dissolve 2.5 g of dextrose and 0.5 g of ammonium
chloride (final concentration: 5 and 1 g/L, respectively).
Then, add 5 mL of vitamin solution, dissolve by shaking or
vortexing, and filter the entire contents into the original 2-L
flask. Add 500 μL of 0.1M CaCl2 and 1 mL of 1M MgSO4 stock
solutions (final concentration: 0.1 and 2.0 mM, respectively),
500 μL each of Metal Solutions I and II, and 500 μL each of Ampicillin
and Kanamycin (final concentration: 100 and 50 mg/L,
respectively) to the flask.
9. Add the 20 mL overnight culture directly into the 0.5 L of M9
medium. Incubate at 250 rpm at 37°C.
10. After 1 h, check the OD600 every 30 min; when the OD600
reaches ~0.8, add 100 μL of 1M IPTG (final concentration:
200 μM, see Note 10). Keep the flask at 37°C for about 30 min
(see Note 11).
11. Transfer the flask to a second shaker with the temperature set
to 15°C, and incubate at 250 rpm overnight (see Note 12).
12. Centrifuge the cells at 4,900 g for 20 min at 4°C. Store the cell
pellet at −80°C.
13. For 15N,13C enrichment, replace 14NH4Cl and dextrose in steps
6 and 8 described above, with 15NH4Cl and [U-13C]-glucose,
respectively.
14. Under normal circumstances, for uniform 15N,13C,2H enrichment,
grow the cells in M9 medium in which H2O is replaced by
D2O. However, in the case of SrcCD, we found that the introduction
of 2H2O dramatically reduces the yields. Therefore, we
follow the approach developed by Fiaux et al. (60), in which a
pool of 15N,13C,2H-labeled amino acids is used to supplement
M9 medium prepared in H2O. In this case: in step 6
above, replace 14NH4Cl with 15NH4Cl; in step 8 above, replace
14NH4Cl with 15NH4Cl, use 0.5 g of dextrose (1 g/L
instead of 5 g/L), and add 0.75–1.0 g of 15N,13C,2H-labeled
algal lysate medium (e.g., CELTONE®, 1.5–2.0 g/L).
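The supplement volumes used in steps 6, 8, and 10 all follow from the fold-concentration of each stock, so they can be rescaled to other culture volumes. The Python sketch below is one way to do this, assuming that an N× stock is simply diluted N-fold into the medium. The dictionary keys and function names are illustrative and are not part of the original protocol.

```python
# Fold-concentration of each liquid stock, as listed in Subheading 3.2.
STOCK_FOLD = {
    "CaCl2 (0.1 M)": 1000,
    "MgSO4 (1 M)": 500,
    "Metal Solution I": 1000,
    "Metal Solution II": 1000,
    "Vitamin solution": 100,
    "Ampicillin": 1000,
    "Kanamycin": 1000,
}

# Solids added per litre of unlabeled medium (steps 6 and 8).
SOLIDS_G_PER_L = {"dextrose": 5.0, "NH4Cl": 1.0}


def supplement_volumes_ul(medium_ml: float) -> dict:
    """Volume (in microlitres) of each stock for the given medium volume."""
    return {name: medium_ml * 1000.0 / fold for name, fold in STOCK_FOLD.items()}


def solid_masses_g(medium_ml: float) -> dict:
    """Mass (g) of each solid for the given medium volume."""
    return {name: g_per_l * medium_ml / 1000.0 for name, g_per_l in SOLIDS_G_PER_L.items()}


def iptg_volume_ul(culture_ml: float, final_uM: float = 200.0, stock_M: float = 1.0) -> float:
    """Volume (in microlitres) of IPTG stock needed to reach final_uM."""
    return culture_ml * final_uM / (stock_M * 1000.0)


for volume_ml in (20.0, 500.0):
    print(f"--- {volume_ml:.0f} mL of M9 medium ---")
    for name, ul in supplement_volumes_ul(volume_ml).items():
        print(f"{name}: {ul:.0f} uL")
    for name, grams in solid_masses_g(volume_ml).items():
        print(f"{name}: {grams:.2f} g")
print(f"IPTG for 500 mL: {iptg_volume_ul(500.0):.0f} uL")
```

For labeled preparations, the solids would be swapped as described in steps 13 and 14; the stock volumes are unchanged.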
4.1.2. Protocol for the Purification of SrcCD from the MBP Fusion Protein
1. Lyse cells using a French Press (1,100 psi), followed by
centrifugation at 16,000 g for 20 min at 4°C. Transfer the
supernatant to a centrifuge tube.
2. Check the pH of the supernatant (see Note 13); if it is below
7.5, adjust by adding small amounts of solid
tris(hydroxymethyl)aminomethane (Tris).
3. Transfer preequilibrated cobalt-resin into the centrifuge tube
containing the lysate supernatant, and incubate for about 1 h
under rotation or slight agitation at 4°C.
4. Pour the lysate and beads into an empty column. Elute the
lysate and wash the column with four bed volumes of lysis
buffer and two bed volumes of lysis buffer containing 3 mM
imidazole (3 μL Imidazole buffer/mL of lysis buffer).
5. Elute the protein with four bed volumes of the lysis buffer
containing 300 mM imidazole (combine three volumes of
Imidazole buffer with seven volumes of lysis buffer).
6. Add 20 μL of EDTA solution/mL of eluate (final concentra-
tion: 10 mM; see Note 14) to the eluate prior to dialysis and
then dialyze against cleavage buffer at 4°C.
7. Determine the concentration of the fusion protein by measuring
the OD280 using 119,100 M−1 cm−1 as the extinction coefficient.
8. Add 5 units of α-thrombin/mg of fusion protein to the solution
and immediately add β-mercaptoethanol to a final concentration
of 4 mM. Cleave overnight at 4°C.
9. Check for completion of the cleavage reaction by running an
SDS–PAGE gel. Add more thrombin if necessary. When the
cleavage is complete, add solid DTT and the protease inhibi-
tor, AEBSF (4-(2-aminoethyl) benzenesulfonyl fluoride) to
final concentrations of 5 mM and 100 μM, respectively. Add
AP23464, a high-affinity kinase inhibitor, to the cleavage buf-
fer to a final concentration of 100 μM for wild-type SrcCD.
10. Dilute the completed reaction fivefold into ion exchange chro-
matography buffer A, and load the resulting sample onto a
Q column (ion-exchange) at a flow-rate of 4 mL/min (the volume
at this stage is likely to be large, possibly more than 100 mL;
see Note 15).
11. Elute the cleaved kinase by running a salt gradient from 0 to
300 mM NaCl, using ion exchange chromatography buffer B,
in 120 min using a flow-rate of 1 mL/min. Unphosphorylated
SrcCDK295M appears at around 15.4 mS/cm and SrcCDK295M
phosphorylated at Tyr416 appears at around 16.9 mS/cm. Wild-
type SrcCD bound with the kinase inhibitor, AP23464, appears
at around 20.4 mS/cm. MBP elutes at around 8–10 mS/cm;
multiple peaks may appear due to nonspecific phosphorylation
(see Note 16).
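Steps 7, 8, and 11 above involve some simple arithmetic that is easy to get wrong at the bench. The Python sketch below illustrates it under stated assumptions: the Beer-Lambert law with the extinction coefficient from step 7, the thrombin dose of 5 units/mg from step 8, and the fraction of buffer B (1M NaCl) needed to reach the 300 mM end point of the gradient in step 11. The molar mass of the fusion protein is not given in this chapter, so the value used in the example (75,000 g/mol) is a hypothetical placeholder, as are the example absorbance and volume.

```python
EXTINCTION_M_CM = 119_100.0  # M^-1 cm^-1, extinction coefficient from step 7


def fusion_molar(a280: float, path_cm: float = 1.0, dilution: float = 1.0) -> float:
    """Beer-Lambert law: molar concentration from the absorbance at 280 nm."""
    return a280 * dilution / (EXTINCTION_M_CM * path_cm)


def thrombin_units(a280: float, volume_ml: float, molar_mass_g_mol: float,
                   units_per_mg: float = 5.0) -> float:
    """Units of thrombin for a sample, at 5 units per mg of fusion protein."""
    mg_per_ml = fusion_molar(a280) * molar_mass_g_mol  # mol/L x g/mol = mg/mL
    return mg_per_ml * volume_ml * units_per_mg


def percent_buffer_b(target_nacl_mM: float, buffer_b_nacl_mM: float = 1000.0) -> float:
    """Percentage of buffer B giving the target NaCl (buffer A has no NaCl)."""
    return 100.0 * target_nacl_mM / buffer_b_nacl_mM


# Gradient from step 11: 0 to 300 mM NaCl in 120 min at 1 mL/min.
gradient_volume_ml = 120.0 * 1.0
gradient_slope_mM_per_ml = 300.0 / gradient_volume_ml

# Hypothetical example values (not from the chapter).
example_a280 = 1.191
example_volume_ml = 20.0
example_molar_mass = 75_000.0  # placeholder; use the value for your construct

conc_uM = fusion_molar(example_a280) * 1e6
units = thrombin_units(example_a280, example_volume_ml, example_molar_mass)
print(f"Fusion protein: {conc_uM:.1f} uM; thrombin: {units:.0f} units")
print(f"Gradient end point: {percent_buffer_b(300.0):.0f}% B over "
      f"{gradient_volume_ml:.0f} mL ({gradient_slope_mM_per_ml:.1f} mM/mL)")
```

The conductivity values quoted in step 11 depend on the instrument and buffer, so the sketch deliberately works in NaCl concentration rather than in mS/cm.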
4.2. Expression and Purification of the Fully Active, Wild-Type Catalytic Domain of c-Src
We provide below a step-by-step protocol for the expression and
purification of wild-type SrcCD without the use of an N-terminal
MBP tag. The SrcCD obtained using this protocol is fully active
and does not require a kinase inhibitor to allow purification to
homogeneity. To implement this protocol, first SrcCD and Cdc37
have to be cloned into a pACYCDuet vector (Chloramphenicol
resistant) using standard procedures. SrcCD should be inserted in
the first cloning site using the EcoRI and SalI restriction enzymes
(resulting in an N-terminal His6-tag for SrcCD, see Note 17), and
Cdc37 in the second site using XhoI and NdeI (see Note 18).
SrcCD expression was also tested with and without the concomi-
tant coexpression of a third protein, the tyrosine phosphatase, Lyp
(61) (a kind gift from Dr. Ronald Seidel, Albert Einstein College
of Medicine). Lyp selectively dephosphorylates the activating Tyr
residue in SFKs (specifically Tyr394 in Lck), and was obtained in an
H-MBP-3C ampicillin-resistant vector (see Note 1).
1. If using electrocompetent cells, transfer ~20 ng of pACYCDuet
vector into the Eppendorf tube containing BL21(DE3)
T1 electrocompetent cells. Transfer the contents of the
Eppendorf tube into a prechilled 0.1 cm gap electroporation
cuvette and incubate for 2 min on ice (see Note 19). If also
overexpressing the phosphatase, then transfer both vectors
(pACYC and H-MBP-3C, 20 ng each) into the cuvette at the
same time (see Note 20).
2. Follow steps 4–14 of Subheading 4.1.1 but use Chloramphenicol
(35 mg/L), and add Ampicillin only if overexpressing the
phosphatase. This applies to step 4 (Chloramphenicol, or
Ampicillin and Chloramphenicol double resistant plates) and
step 6 (Chloramphenicol, or Ampicillin and Chloramphenicol
added to the M9 medium).
3. Purify the protein following steps 1–6 of Subheading 4.1.2.
Note that the final purified protein retains the His6-tag. If the
tag needs to be removed, a thrombin cleavage site could be
inserted at step 1 in the SrcCD forward primer (described in
Note 10). Then steps 7–9 from Subheading 4.1.2. would also
apply. Note that, in our construct, the MBP-phosphatase
fusion protein has an N-terminal His6 tag and coelutes with
the kinase.
4. After elution, dialyze the protein(s) against dialysis buffer.
5. Concentrate the protein(s) down to 2 mL and inject the sample onto a
size exclusion column equilibrated with size exclusion
chromatography buffer. Note that the phosphatase elutes in
the void (presumably due to some level of self-association)
while the kinase elutes consistently at the volume expected for
its molecular weight, so both are readily isolated.
Both methods of expression (with and without coexpression

with Lyp) lead to the isolation of active, wild-type SrcCD (con-
firmed by analysis of kinetics, see Note 21). However, a higher
final OD600 and overall yield of purified protein were observed when
the kinase was coexpressed with Lyp phosphatase (~190 nmol/L
of culture).
5. Conclusions
We have described how careful optimization of overexpression
protocols allows the expression and purification of the catalytic
domain of c-Src in both mutant and active wild-type forms
using E. coli expression systems. In particular, we developed two
related strategies: the first employs both an N-terminal MBP fusion
and coexpression with the GroEL and GroES chaperones, while
the second, by taking advantage of the higher specificity of the
Cdc37 cochaperone for the nascent c-Src chain, does not require
the production of SrcCD as an MBP fusion protein. Both methods
present advantages and disadvantages. In particular, the first
method leads to higher yields, but it is more efficient in producing
the catalytically compromised (K295M) mutant compared to the
wild-type protein. The wild-type protein can only be purified in
the presence of a high-affinity kinase inhibitor when using this
method.
The second method produces lower yields; however, it is also
less labor intensive and, more importantly, allows the production
of the catalytic domain in active, wild-type form. Given the higher
yields, the first method is probably more useful when SrcCD is
required in large amounts for studies such as those to determine
specific protein–protein interactions that do not require the intact
catalytic properties of the kinase. However, given that the MBP-
dependent protocol also produces a considerable amount of solu-
ble misfolded aggregates that have not been observed using the
Cdc37-dependent protocol, we believe that the latter may be better
suited for “in-cell” NMR studies in E. coli (62–64). Obviously,
the latter protocol is required when wild-type, fully active SrcCD
needs to be produced. While this manuscript was in production,
Campos-Olivas et al. (65) used some of the procedures
described herein to produce sufficient amounts of SrcCD and
obtain backbone resonance assignments in complex with the
kinase inhibitor imatinib.
6. Notes
1. The H-MBP-3C vector contains a thrombin cleavage site and
features a maltose binding protein (MBP) tag to enhance
solubility and a His6 sequence to facilitate protein purification
using metal affinity chromatography. The protease 3C site
described in the original publication (58) has been replaced by
a thrombin site in the present case.
2. The pACYCDuet vector was purchased from Novagen (EMD
Biosciences).
3. We used an Eppendorf 2510 electroporator.
4. We observed that the transformation of the pREP4-groESL
vector was often inefficient, and therefore, we avoided the
simultaneous double transformation of pREP4-groESL and
H-MBP-3C vector encoding SrcCD, by preparing a stock of
electrocompetent cells pretransformed with the former. Electro-
competent cells were preferred over chemically competent
cells, due to their ease of preparation and high efficiency of
transformation. These cells were prepared by streaking out
commercial BL21 (DE3) competent cells on an LB plate and
incubating the plate overnight at 37°C. From this, electro-
competent cells were prepared using standard protocols.
5. We used Kao and Michayluk vitamin solution (100×) purchased
from Sigma.
6. CELTONE® is a rich growth medium for bacteria derived
from algal extracts. It may be purchased from Cambridge
Isotope Laboratories.
7. We used TALON beads purchased from Clontech.
8. We used HiTrap Q columns from GE Healthcare Biosciences
Corp.
9. We used a HiPrep 16/60 Sephacryl S-100 column from GE
Healthcare Biosciences Corp. All protein purification was
carried out using an Akta Explorer 100 from GE Healthcare
Biosciences Corp.
10. Lower IPTG concentrations are recommended at low temper-
ature to avoid toxicity.
11. This helps to increase the final yield by boosting mRNA
production, while most protein is synthesized during the
following lower temperature growth. In the first hour after
induction, bacteria mainly activate mRNA transcription, while
actual translation is still minimal.
12. All bacterial growth postinduction (after the initial half an hour
at 37°C, see Note 11) was carried out at a low temperature
(15°C). It is well known that low temperatures decrease the
degradation rates of unstable proteins; in the case of SrcCD,
we observed that low temperature growth is essential for the
production of soluble, properly folded SrcCD. We speculate
that SrcCD folding is a slow process that is in competition with
the formation of aggregates both soluble and insoluble. In our
protocol both the N-terminal MBP fusion and the GroEL/
GroES chaperones (or the Cdc37 cochaperone) provide assis-
tance in stabilizing the nascent chain and protect it from
aggregation. Lower temperatures likely assist this process by
reducing the rates of protein synthesis and aggregation.
13. We noticed that after long induction times at low temperature,
E. coli cells release significant amounts of acidic molecules into
the lysis buffer. This is, in some instances, sufficient to
overcome the buffering capacity of the solution, leading to a pH
below 7.0 and compromising efficient binding of the His6 tag to
the metal affinity column.
14. EDTA is introduced to chelate Co2+ cations. Metal leaks from
the resin can be caused by mechanical stress and by high
Imidazole concentrations. If not chelated by EDTA, Co2+ can
be difficult to eliminate by dialysis, since it can bind the His6-tag
as well as the metal binding pockets in SrcCD if left in the sam-
ple. Co2+ forms a reddish precipitate once a substantial amount
of reducing agent has been added (interfering with the remain-
ing purification process), while the Co2+ remaining in solution,
being paramagnetic, degrades the quality of the NMR spectra.
15. We normally do not load more than 15 mg of fusion protein
into the ion exchange column at a given time.
16. Unlike the catalytically compromised mutant, wild-type SrcCD
does not elute once loaded on the Q column. Presumably the
wild-type protein is less stable under conditions of low ionic
strength used for sample loading, and precipitates on the
column. We found however, that this construct could also be
successfully loaded and eluted using the protocol described
above by adding a high-affinity kinase inhibitor (59) after
cleaving the fusion protein.
17. For SrcCD (human), we used the following primers:
Forward: GGTGGTGGAATTCGTCCAAGCCGCAGA
CTCAGGG.
Reverse: GGTGGTGGTCGACCTAGAGGTTCTCCCC
GGGCTGGTA.
18. For Cdc37 we used the following primers:
Forward: GGTGGTGCATATGGTGGACTACAGCG
TGTGGGAC.
Reverse: GGTGGTGCTCGAGTCACACACTGACATC
CTTCTCATCGCC.
19. The electrocompetent cells were homemade and derived from
the commercial BL21 (DE3) T1 expression system (chemical
competent). We expect that transformation directly into the
commercial electrocompetent cells would produce equivalent
results.
20. In contrast to what was observed for the pREP4-GroESL
vectors, cotransformation of both the H-MBP-3C Lyp vector
and the pACYCDuet hosting SrcCD and Cdc37 does not pres-
ent any difficulties, and can be performed following standard
protocols with chemically competent BL21 (DE3) cells.
21. The kcat for the phosphorylation of a substrate peptide poly-
E4Y by SrcCD was measured using a photometric assay based
on the recycling of ADP and the concomitant consumption of
NADH catalyzed by L-lactate-dehydrogenase (LDH) and
pyruvate kinase (PK). PK regenerates ATP from ADP (a by-
product of the SrcCD phosphorylation process) by converting
phosphoenolpyruvate into pyruvate. LDH then converts the
pyruvate into lactate by oxidizing NADH. Velocities are
obtained following the time-dependent decrease of the NADH
fluorescence. This method yielded a kcat of 240 ± 10/min, a
value comparable to that obtained previously (41).
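The arithmetic behind the coupled assay in Note 21 can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes an absorbance readout at 340 nm with a 1 cm path length and the standard NADH extinction coefficient, substrate concentrations that are saturating, and a hypothetical enzyme concentration and trace. None of these values are taken from the protocol, and the function names are made up for the example.

```python
# Sketch: turnover number from an NADH-coupled kinase assay.
# Assumptions (not from the protocol): absorbance readout at 340 nm,
# 1 cm path length, saturating substrates, hypothetical data.

NADH_EPSILON_340 = 6220.0  # M^-1 cm^-1, molar absorptivity of NADH at 340 nm


def linear_slope(times, values):
    """Least-squares slope of values versus times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def kcat_per_min(times_min, a340, enzyme_molar, path_cm=1.0):
    """Estimate kcat (min^-1); one NADH is oxidized per ADP released."""
    slope = linear_slope(times_min, a340)              # absorbance units per min
    velocity = -slope / (NADH_EPSILON_340 * path_cm)   # M of product per min
    return velocity / enzyme_molar


if __name__ == "__main__":
    # Hypothetical trace: 50 nM enzyme, A340 falling linearly over 5 min.
    times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    trace = [1.000 - 0.0746 * t for t in times]
    print(f"kcat = {kcat_per_min(times, trace, 50e-9):.0f} per min")
```

Only the initial linear part of the trace should be passed to the fit; once NADH or substrate becomes limiting, the slope no longer reflects the kinase velocity.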
Acknowledgments
The following grants from the National Institutes of Health,
GM084278 (to R. G.), GM047021 (to D. C.) and RR03060
(toward partial support of the NMR facilities at CCNY), are
acknowledged. The authors are members of the New York
Structural Biology Center, a STAR center supported by the New
York State Office for Science, Technology and Academic Research.
The authors thank Dr. Avrom Caplan (CCNY), Dr. Ronald Siedel
(Albert Einstein College of Medicine), and Dr. Kaushik Dutta
(NYSBC) for gifts of materials, and members of the Ghose and
Cowburn labs for useful discussions.
References
1. Hunter, T. (2000) Signaling – 2000 and beyond. Cell 100, 113–127.
2. Johnson, L. N. (2009) The regulation of protein phosphorylation. Biochem. Soc. Trans. 37, 627–641.
3. Tarrant, M. K., and Cole, P. A. (2009) The chemical biology of protein phosphorylation. Annu. Rev. Biochem. 78, 797–825.
4. Ostman, A., and Bohmer, F. D. (2001) Regulation of receptor tyrosine kinase signaling by protein tyrosine phosphatases. Trends Cell Biol. 11, 258–266.
5. Shi, Y. (2009) Serine/threonine phosphatases: mechanism through structure. Cell 139, 468–484.
6. Kolibaba, K. S., and Druker, B. J. (1997) Protein tyrosine kinases and cancer. Biochim. Biophys. Acta 1333, F217–248.
7. Blume-Jensen, P., and Hunter, T. (2001) Oncogenic kinase signalling. Nature 411, 355–365.
8. Chong, P. K., Lee, H., Kong, J. W., Loh, M. C., Wong, C. H., and Lim, Y. P. (2008) Phosphoproteomics, oncogenic signaling and cancer research. Proteomics 8, 4370–4382.
9. Pawson, T., and Kofler, M. (2009) Kinome signaling through regulated protein–protein interactions in normal and cancer cells. Curr. Opin. Cell Biol. 21, 147–153.
10. Johnson, L. N. (2009) Protein kinase inhibitors: contributions from structure to clinical compounds. Q. Rev. Biophys. 42, 1–40.
11. Noble, M. E., Endicott, J. A., and Johnson, L. N. (2004) Protein kinase inhibitors: insights into drug design from structure. Science 303, 1800–1805.
12. Nichols, G. L. (2003) Tyrosine kinase inhibitors as cancer therapy. Cancer Invest. 21, 758–771.
13. Shawver, L. K., Slamon, D., and Ullrich, A. (2002) Smart drugs: tyrosine kinase inhibitors in cancer therapy. Cancer Cell 1, 117–123.
14. Nolen, B., Taylor, S., and Ghosh, G. (2004) Regulation of protein kinases; controlling activity through activation segment conformation. Mol. Cell 15, 661–675.
15. Kornev, A. P., and Taylor, S. S. (2010) Defining the conserved internal architecture of a protein kinase. Biochim. Biophys. Acta 1804, 440–444.
16. Parsons, S. J., and Parsons, J. T. (2004) Src family kinases, key regulators of signal transduction. Oncogene 23, 7906–7909.
17. Roux, P. P., and Blenis, J. (2004) ERK and p38 MAPK-activated protein kinases: a family of protein kinases with diverse biological functions. Microbiol. Mol. Biol. Rev. 68, 320–344.
18. Chen, Z., Gibson, T. B., Robinson, F., Silvestro, L., Pearson, G., Xu, B., Wright, A., Vanderbilt, C., and Cobb, M. H. (2001) MAP kinases. Chem. Rev. 101, 2449–2476.
19. Hubbard, S. R., and Till, J. H. (2000) Protein tyrosine kinase structure and function. Annu. Rev. Biochem. 69, 373–398.
20. Shi, Z., Resing, K. A., and Ahn, N. G. (2006) Networks for the allosteric control of protein kinases. Curr. Opin. Struct. Biol. 16, 686–692.
21. Fedorov, O., Muller, S., and Knapp, S. (2010) The (un)targeted cancer kinome. Nature Chem. Biol. 6, 166–169.
22. Grueneberg, D. A., Degot, S., Pearlberg, J., Li, W., Davies, J. E., Baldwin, A., Endege, W., Doench, J., Sawyer, J., Hu, Y., Boyce, F., Xian, J., Munger, K., and Harlow, E. (2008) Kinase requirements in human cells: I. Comparing kinase requirements across various cell types. Proc. Natl. Acad. Sci. U.S.A. 105, 16472–16477.
23. Baldwin, A., Li, W., Grace, M., Pearlberg, J., Harlow, E., Munger, K., and Grueneberg, D. A. (2008) Kinase requirements in human cells: II. Genetic interaction screens identify kinase requirements following HPV16 E7 expression in cancer cells. Proc. Natl. Acad. Sci. U.S.A. 105, 16478–16483.
24. Bommi-Reddy, A., Almeciga, I., Sawyer, J., Geisen, C., Li, W., Harlow, E., Kaelin, W. G., Jr., and Grueneberg, D. A. (2008) Kinase requirements in human cells: III. Altered kinase requirements in VHL−/− cancer cells detected in a pilot synthetic lethal screen. Proc. Natl. Acad. Sci. U.S.A. 105, 16484–16489.
25. Grueneberg, D. A., Li, W., Davies, J. E., Sawyer, J., Pearlberg, J., and Harlow, E. (2008) Kinase requirements in human cells: IV. Differential kinase requirements in cervical and renal human tumor cell lines. Proc. Natl. Acad. Sci. U.S.A. 105, 16490–16495.
26. Wiesner, S., Wybenga-Groot, L. E., Warner, N., Lin, H., Pawson, T., Forman-Kay, J. D., and Sicheri, F. (2006) A change in conformational dynamics underlies the activation of Eph receptor tyrosine kinases. EMBO J. 25, 4686–4696.
27. Vajpai, N., Strauss, A., Fendrich, G., Cowan-Jacob, S. W., Manley, P. W., Grzesiek, S., and Jahnke, W. (2008) Solution conformations and dynamics of Abl kinase-inhibitor complexes determined by NMR substantiate the different binding modes of imatinib/nilotinib and dasatinib. J. Biol. Chem. 283, 18292–18302.
28. Piserchio, A., Ghose, R., and Cowburn, D. (2009) Optimized bacterial expression and purification of the c-Src catalytic domain for solution NMR studies. J. Biomol. NMR 44, 87–93.
29. Piserchio, A., Warthaka, M., Devkota, A. K., Kaoud, T. S., Lee, S., Abramczyk, O., Ren, P., Dalby, K. N., and Ghose, R. (2011) Solution NMR insights into docking interactions involving inactive ERK2. Biochemistry 50, 3660–3672.
30. Masterson, L. R., Mascioni, A., Traaseth, N. J., Taylor, S. S., and Veglia, G. (2008) Allosteric cooperativity in protein kinase A. Proc. Natl. Acad. Sci. U.S.A. 105, 506–511.
31. Masterson, L. R., Cheng, C., Yu, T., Tonelli, M., Kornev, A., Taylor, S. S., and Veglia, G. (2010) Dynamics connect substrate recognition to catalysis in protein kinase A. Nature Chem. Biol. 6, 821–828.
32. Liu, D., Xu, R., and Cowburn, D. (2009) Segmental isotopic labeling of proteins for nuclear magnetic resonance. Meth. Enzymol. 462, 151–175.
33. Boggon, T. J., and Eck, M. J. (2004) Structure and regulation of Src family kinases. Oncogene 23, 7918–7927.
34. Martin, G. S. (2001) The hunting of the Src. Nature Rev. Mol. Cell Biol. 2, 467–475.
35. Alvarez, R. H., Kantarjian, H. M., and Cortes, J. E. (2006) The role of Src in solid and hematologic malignancies: development of new-generation Src inhibitors. Cancer 107, 1918–1929.
36. Russello, S. V., and Shore, S. K. (2003) Src in human carcinogenesis. Front. Biosci. 8, s1068–1073.
37. Trevino, J. G., Summy, J. M., and Gallick, G. E. (2006) Src inhibitors as potential therapeutic agents for human cancers. Mini Rev. Med. Chem. 6, 681–687.
38. Grgurevich, S., Mikhael, A., and McVicar, D. W. (1999) The Csk homologous kinase, Chk, binds tyrosine phosphorylated paxillin in human blastic T cells. Biochem. Biophys. Res. Commun. 256, 668–675.
39. Zrihan-Licht, S., Deng, B., Yarden, Y., McShan, G., Keydar, I., and Avraham, H. (1998) Csk homologous kinase, a novel signaling molecule, directly associates with the activated ErbB-2 receptor in breast cancer cells and inhibits their proliferation. J. Biol. Chem. 273, 4065–4072.
40. Sicheri, F., Moarefi, I., and Kuriyan, J. (1997) Crystal structure of the Src family tyrosine kinase Hck. Nature 385, 602–609.
41. Weijland, A., Neubauer, G., Courtneidge, S. A., Mann, M., Wierenga, R. K., and Superti-Furga, G. (1996) The purification and characterization of the catalytic domain of Src expressed in Schizosaccharomyces pombe. Comparison of unphosphorylated and tyrosine phosphorylated species. Eur. J. Biochem. 240, 756–764.
42. Feder, D., and Bishop, J. M. (1990) Purification and enzymatic characterization of pp60c-src from human platelets. J. Biol. Chem. 265, 8205–8211.
43. Seeliger, M. A., Young, M., Henderson, M. N., Pellicena, P., King, D. S., Falick, A. M., and Kuriyan, J. (2005) High yield bacterial expression of active c-Abl and c-Src tyrosine kinases. Protein Sci. 14, 3135–3139.
44. Stover, D. R., Liebetanz, J., and Lydon, N. B. (1994) Cdc2-mediated modulation of pp60c-src activity. J. Biol. Chem. 269, 26885–26889.
45. Pellecchia, M., Bertini, I., Cowburn, D., Dalvit, C., Giralt, E., Jahnke, W., James, T. L., Homans, S. W., Kessler, H., Luchinat, C., Meyer, B., Oschkinat, H., Peng, J., Schwalbe, H., and Siegal, G. (2008) Perspectives on NMR in drug discovery: a technique comes of age. Nature Rev. Drug Discov. 7, 738–745.
46. Cole, P. A. (1996) Chaperone-assisted protein expression. Structure 4, 239–242.
47. Nomine, Y., Ristriani, T., Laurent, C., Lefevre, J. F., Weiss, E., and Trave, G. (2001) Formation of soluble inclusion bodies by HPV E6 oncoprotein fused to maltose-binding protein. Protein Expr. Purif. 23, 22–32.
48. Kapust, R. B., and Waugh, D. S. (1999) Escherichia coli maltose-binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused. Protein Sci. 8, 1668–1674.
49. Wang, D., Huang, X. Y., and Cole, P. A. (2001) Molecular determinants for Csk-catalyzed tyrosine phosphorylation of the Src tail. Biochemistry 40, 2004–2010.
50. Ozkirimli, E., and Post, C. B. (2006) Src kinase activation: a switched electrostatic network. Protein Sci. 15, 1051–1062.
51. Cowan-Jacob, S. W. (2006) Structural biology of protein tyrosine kinases. Cell Mol. Life Sci. 63, 2608–2625.
52. Welch, W. J., and Feramisco, J. R. (1982) Purification of the major mammalian heat shock proteins. J. Biol. Chem. 257, 14949–14959.
53. Hahn, J. S. (2009) The Hsp90 chaperone machinery: from structure to drug development. BMB Rep. 42, 623–630.
54. Zuehlke, A., and Johnson, J. L. (2010) Hsp90 and co-chaperones twist the functions of diverse client proteins. Biopolymers 93, 211–217.
55. Caplan, A. J., Mandal, A. K., and Theodoraki, M. A. (2007) Molecular chaperones and protein kinase quality control. Trends Cell Biol. 17, 87–92.
56. Karnitz, L. M., and Felts, S. J. (2007) Cdc37 regulation of the kinome: when to hold ‘em and when to fold ‘em. Science STKE 2007, pe22.
57. Whitelaw, M. L., Hutchison, K., and Perdew, G. H. (1991) A 50-kDa cytosolic protein complexed with the 90-kDa heat shock protein (hsp90) is the same protein complexed with pp60v-Src hsp90 in cells transformed by the Rous sarcoma virus. J. Biol. Chem. 266, 16436–16440.
58. Alexandrov, A., Dutta, K., and Pascal, S. M. (2001) MBP fusion protein with a viral protease cleavage site: one-step cleavage/purification of insoluble proteins. Biotechniques 30, 1194–1198.
59. Dalgarno, D., Stehle, T., Narula, S., Schelling, P., van Schravendijk, M. R., Adams, S., Andrade, L., Keats, J., Ram, M., Jin, L., Grossman, T., MacNeil, I., Metcalf, C., 3rd, Shakespeare, W., Wang, Y., Keenan, T., Sundaramoorthi, R., Bohacek, R., Weigele, M., and Sawyer, T. (2006) Structural basis of Src tyrosine kinase inhibition with a new class of potent and selective trisubstituted purine-based compounds. Chem. Biol. Drug Des. 67, 46–57.
60. Fiaux, J., Bertelsen, E. B., Horwich, A. L., and Wuthrich, K. (2004) Uniform and residue-specific 15N-labeling of proteins on a highly deuterated background. J. Biomol. NMR 29, 289–297.
61. Wu, J., Katrekar, A., Honigberg, L. A., Smith, A. M., Conn, M. T., Tang, J., Jeffery, D., Mortara, K., Sampang, J., Williams, S. R., Buggy, J., and Clark, J. M. (2006) Identification of substrates of human protein-tyrosine phosphatase PTPN22. J. Biol. Chem. 281, 11002–11010.
62. Burz, D. S., Dutta, K., Cowburn, D., and Shekhtman, A. (2006) In-cell NMR for protein–protein interactions (STINT-NMR). Nature Protoc. 1, 146–152.
63. Selenko, P., and Wagner, G. (2006) NMR mapping of protein interactions in living cells. Nature Meth. 3, 80–81.
64. Sakakibara, D., Sasaki, A., Ikeya, T., Hamatsu, J., Hanashima, T., Mishima, M., Yoshimasu, M., Hayashi, N., Mikawa, T., Walchli, M., Smith, B. O., Shirakawa, M., Guntert, P., and Ito, Y. (2009) Protein structure determination in living cells by in-cell NMR spectroscopy. Nature 458, 102–105.
65. Campos-Olivas, R., Marenchino, M., Scapozza, L., and Gervasio, F. L. (2011) Backbone assignment of the tyrosine kinase Src catalytic domain in complex with imatinib. Biomol. NMR Asgn. 5, 221–224.
Chapter 8
NMR Studies of Large Protein Systems

Shiou-Ru Tzeng, Ming-Tao Pai, and Charalampos G. Kalodimos
Abstract
In recent years, there has been increased interest in applying NMR spectroscopy to the characterization
of proteins and protein complexes of large molecular weight. The combination of multidimensional NMR,
novel pulse sequences allowing for the selection of slowly relaxing coherence pathways, and the development
of a range of labeling techniques has enabled high-resolution NMR analyses of supramolecular systems of
even megadalton size. Here, we describe how NMR can be used to obtain structural information in large
systems by using as an example the recent structure determination of SecA ATPase (204 kDa) in complex
with a signal peptide.
Key words: Paramagnetic relaxation enhancement, Macromolecular complex, Molecular machinery,
Stable isotope labeling
1. Introduction
Application of NMR spectroscopy to supramolecular systems
(>200 kDa) has been revolutionized by specific labeling of methyl
groups (1). The labeling protocol, pioneered by the group of Lewis
Kay, is very simple and robust (2, 3). The approach exploits some
very favorable properties of methyl groups in proteins: (a) they
occur frequently in the hydrophobic cores of proteins and at the
interfaces of biomolecular complexes, and are thus excellent report-
ers of structure and dynamics; (b) the three protons of the methyl
group all contribute to the intensity of the same signal, and therefore
methyl probes are significantly more sensitive than other candidates;
and (c) methyl groups are intrinsically optimized for use in TROSY
spectroscopy and the simple 1H-13C HMQC experiment can be used
to select for pathways with favorable relaxation properties (4).
Currently, the methyl groups of five different amino acids can be
labeled in a highly specific and scramble-free manner: Ala, Ile (δ1), Leu,
Met, and Val. These five residues are highly abundant, typically
accounting for 35–45% of the total number of residues in a protein
and are distributed throughout the protein, thus providing almost
complete coverage of the protein space.
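To check how well a particular protein is covered by these five residue types, the fraction of methyl-bearing residues can be counted directly from the sequence. The following Python sketch is illustrative only: the function name and the example sequence are hypothetical, and it counts one labeled methyl per Ala, Ile (δ1), and Met and two per Leu and Val, which assumes that both isopropyl methyl positions give observable signals.

```python
# Sketch: estimate methyl-probe coverage from a one-letter protein sequence.
# The helper name and the toy sequence are illustrative assumptions.

METHYL_RESIDUES = "AILMV"  # Ala, Ile, Leu, Met, Val


def methyl_probe_stats(sequence):
    """Return the fraction of AILMV residues and the number of methyl probes."""
    seq = [c for c in sequence.upper() if c.isalpha()]
    counts = {aa: seq.count(aa) for aa in METHYL_RESIDUES}
    labeled_residues = sum(counts.values())
    # One probe per Ala, Ile (delta1 only), and Met; two per Leu and Val.
    probes = counts["A"] + counts["I"] + counts["M"] + 2 * (counts["L"] + counts["V"])
    return {
        "fraction": labeled_residues / len(seq),
        "residues": labeled_residues,
        "methyl_probes": probes,
    }


if __name__ == "__main__":
    stats = methyl_probe_stats("MAILVGGGKK")  # toy sequence, not a real protein
    print(stats)
```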
The methyl-labeling approach combined with methyl-TROSY
currently provides the method of choice for NMR characterization
of large protein systems. Although this approach has proven to be
very robust for recording spectra of large proteins with high sensi-
tivity and resolution, a major hurdle in obtaining site-specific infor-
mation remains the difficulty in obtaining assignments. While
the traditional approach of assigning the backbone and subse-
quently linking the methyl side chains to the backbone has worked
efficiently for smaller proteins, it is not applicable to larger systems.
The only approach currently available is to “disassemble” the
supramolecular system. For higher order oligomeric systems, such as
the proteasome (5), this means preparing the subunit in its monomeric
form, and for large, single-chain proteins, such as SecA (6), preparing
isolated domains or fragments.
In principle, determining solution structures of supramolecular
protein–ligand complexes by NMR should be feasible, provided
that the crystal structures of the free partners are previously known.
Because usually only methyl groups can be robustly and unam-
biguously detected for supramolecular systems, in cases where
complex interactions are mediated by hydrophobic contacts
involving methyl-bearing residues it is likely that intermolecular
NOEs can be detected, thereby enabling the reliable docking of
the complex. Unfavorable motions commonly observed at protein
interfaces, however, may result in line broadening and render NOE
detection unfeasible. Although the NOE has served as the gold
standard for protein structure determination by NMR, the old,
but recently resurrected, paramagnetic relaxation enhancement
(PRE) (7, 8) technique holds great promise for obtaining both
structural and dynamic information in supramolecular protein
complexes (6). By combining transferred NOESY, line broadening,
and PRE experiments, the structure of the 204 kDa SecA ATPase
in complex with a secretory signal peptide was recently determined
(6). Using this system as an example, we describe strategies to
(a) obtain samples optimally labeled for methyl detection, (b) assign
the methyl resonances of the large protein system, and (c) obtain
intermolecular distance restraints for the structure determination
of large protein–ligand complexes.
2. Materials
1. Frozen, transformed Escherichia coli BL21(DE3) cells to
overexpress the protein of interest.
2. M9 medium: 6 g/L Na2HPO4, 3 g/L KH2PO4, pH 7.0–7.4,
0.5 g/L NaCl, 1.0 g/L NH4Cl or 15NH4Cl. Autoclave, let the
medium cool down, and then add 0.1 mL/L of 1 M CaCl2,
1 mL/L of 1 M MgSO4, and 2 g/L D-[2H,12C]-glucose or
D-[2H,13C]-glucose.
3. 1 M CaCl2 stock: Dissolve 11.0 g of CaCl2 in 100 mL of D2O;
filter sterilize.
4. 1 M MgSO4 stock: Dissolve 12.04 g of MgSO4 in 100 mL of
D2O; filter sterilize.
5. D2O (Cambridge Isotope Laboratories, CIL).
6. BIOEXPRESS (CIL).
7. ISOGRO (Isotec).
8. 13CH3-2H-alanine.
9. α-Ketobutyrate (CIL or Isotec).
10. α-Ketoisovalerate (CIL or Isotec).
11. 13CH3-methionine (CIL).
12. IPTG.
13. AMICON stir cell (Millipore).
14. L-broth: 10 g tryptone/L, 5 g yeast extract/L, 5 g NaCl/L,
adjusted to pH 7.4 with NaOH; autoclave.
15. (2,2,5,5-Tetramethyl-2,5-dihydro-1H-pyrrol-3-yl)methyl
methanesulfonothioate (MTSL) (Toronto Research Chemicals
Inc.): Dissolve in acetonitrile.
16. Ascorbic acid.
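The M9 recipe and the two stock solutions listed above can be scaled to any working volume with simple arithmetic. The Python sketch below is a convenience illustration, not part of the protocol: the per-liter quantities are copied from the Materials list, while the function names and the molar masses (which assume anhydrous CaCl2 and MgSO4) are assumptions made for the example.

```python
# Sketch: scale the M9/D2O recipe to a chosen volume and check stock molarity.
# Per-liter quantities come from the Materials list; molar masses assume
# anhydrous salts, which is an assumption about the reagent form weighed.

M9_SALTS_G_PER_L = {
    "Na2HPO4": 6.0,
    "KH2PO4": 3.0,
    "NaCl": 0.5,
    "NH4Cl or 15NH4Cl": 1.0,
}
# Added after autoclaving and cooling.
M9_ADDITIONS_PER_L = {
    "1 M CaCl2 (mL)": 0.1,
    "1 M MgSO4 (mL)": 1.0,
    "deuterated glucose (g)": 2.0,
}
MOLAR_MASS = {"CaCl2": 110.98, "MgSO4": 120.37}  # g/mol, anhydrous


def scale_m9(volume_l):
    """Return the amount of each component needed for volume_l liters."""
    salts = {f"{name} (g)": g * volume_l for name, g in M9_SALTS_G_PER_L.items()}
    additions = {name: q * volume_l for name, q in M9_ADDITIONS_PER_L.items()}
    return {**salts, **additions}


def stock_molarity(mass_g, molar_mass, volume_ml):
    """Molarity of a stock made by dissolving mass_g in volume_ml."""
    return (mass_g / molar_mass) / (volume_ml / 1000.0)


if __name__ == "__main__":
    for component, amount in scale_m9(0.1).items():  # 100 mL of medium
        print(f"{component}: {amount:.3f}")
    print(f"MgSO4 stock: {stock_molarity(12.04, MOLAR_MASS['MgSO4'], 100):.3f} M")
```

If a hydrated salt is used instead, the corresponding molar mass in `MOLAR_MASS` would need to be changed before checking the stock concentration.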
3. Methods
3.1. Protein Labeling for Methyl Detection
1. Pick a freshly transformed colony (see Note 1) of BL21(DE3)
cells and inoculate a 1–2 mL culture of L-broth in D2O containing
0.1% glucose. Grow at 37°C until the cells reach an OD600 of ~0.7–0.8.
2. Centrifuge the cells at 1,200 × g for 20 min at room tempera-
ture and resuspend them in 5–10 mL of sterile M9 medium in
D2O (M9/D2O) in a sterile flask to a starting OD600 of ~0.05.
Incubate the culture in a shaking incubator (220–250 rpm) at
37°C until it reaches an OD600 of ~0.6.
3. Centrifuge as in step 2, and resuspend the cells in 50–100 mL
of M9/D2O prepared with either D-[2H,13C]glucose
(see Note 2) or D-[2H, 12C]glucose (see Note 3) and 15NH4Cl
containing 2% of BIOEXPRESS (CIL) or ISOGRO (Sigma)
(see Note 4). The starting OD600 should be ~0.1. Incubate the
culture in a shaking incubator (220–250 rpm) at 37°C until
the OD600 is ~0.6.
4. Centrifuge as in step 2, resuspend the cells in 1 L of M9/D2O,
and grow until the OD600 is ~0.25.
5. At this point, amino acid precursors for methyl labeling can be
added.
(a) Add precursors (see Note 5) or amino acids 30–60 min
prior to IPTG induction; add the following quantities (final
concentration): 100 mg/L of 13CH3-2Hα-alanine (see
Note 6) for Ala labeling; 45–50 mg/L of α-ketobutyrate
for isoleucine labeling; 85–100 mg/L of α-ketoisovalerate
for leucine and valine labeling (see Note 7); 250 mg/L of
[13CH3]-methionine for Met labeling. The methyl groups
of all five residues can be labeled in one sample in a scramble-
free manner (see Note 8) (Fig. 1a).
(b) Continue incubating the culture for approximately 1 h.
The OD600 should reach a value of ~0.3–0.4.
6. Add IPTG to 0.5 mM to induce protein overexpression.
7. Continue postinduction growth for 6–8 h (see Note 9).
8. Harvest the cells by centrifugation at 5,000 × g for 15 min at 4°C.
9. Freeze the wet cell pack at −80°C.
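The precursor and inducer quantities in steps 5 and 6 scale linearly with culture volume, which can be tabulated as in the following Python sketch. The concentrations are those given in the steps above; the function names, the dictionary keys, and the IPTG molar mass (238.3 g/mol, assuming the usual anhydrous reagent) are assumptions introduced for illustration.

```python
# Sketch: precursor and IPTG amounts for a given culture volume.
# Concentrations are from steps 5(a) and 6; names are illustrative.

# (low, high) final concentrations in mg per liter of culture.
PRECURSORS_MG_PER_L = {
    "13CH3-2Halpha-alanine": (100.0, 100.0),
    "alpha-ketobutyrate": (45.0, 50.0),
    "alpha-ketoisovalerate": (85.0, 100.0),
    "13CH3-methionine": (250.0, 250.0),
}
IPTG_MM = 0.5           # final concentration, step 6
IPTG_G_PER_MOL = 238.3  # assumed molar mass of the IPTG reagent


def labeling_amounts(culture_l):
    """Return (low, high) mg of each precursor for culture_l liters."""
    return {
        name: (low * culture_l, high * culture_l)
        for name, (low, high) in PRECURSORS_MG_PER_L.items()
    }


def iptg_mg(culture_l):
    """Mass of IPTG in mg: mmol/L times g/mol (equal to mg/mmol) times L."""
    return IPTG_MM * IPTG_G_PER_MOL * culture_l


if __name__ == "__main__":
    for name, (low, high) in labeling_amounts(0.5).items():
        print(f"{name}: {low:.1f} to {high:.1f} mg")
    print(f"IPTG: {iptg_mg(0.5):.1f} mg")
```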
Fig. 1. (a) 1H-13C HMQC of [U-2H,12C], Ala-, Leu-, Met-, Val-, Ile-δ1-[13CH3] labeled protein (axes: 1H, p.p.m.; 13C, p.p.m.;
peak regions labeled Ile, Met, Ala, and Leu, Val). The methyl groups of all five amino acids can be labeled with no scrambling.
(b) 1H-13C HMQC of the same protein as in (a, black) but prepared using 10% BIOEXPRESS (green). The Leu methyl groups
are completely suppressed, whereas the Val methyl groups are only minimally affected.
3.2. NMR Assignment
To assign a large protein, such as SecA (204 kDa; 901 residues per
subunit), a domain-parsing strategy is followed.
1. Isolate and characterize, by using NMR, virtually all domains
of the full-length protein and a number of fragments compris-
ing contiguous domains (see Note 10). The size of the isolated
domains and fragments should be such that backbone and
side-chain assignment is feasible using standard approaches.
2. Prepare Ala, Ile, Met, Leu, and Val methyl-labeled samples for
the full-length protein and the domains thereof (see Note 11).
Record methyl-TROSY for all of the samples, overlay, and
compare the spectra of the individual domains against the
spectra of the longer fragments and full-length protein. If good
resonance correspondence among domains, fragments, and
the full-length protein can be demonstrated, then assignment
is in principle transferable.
3. Record standard triple-resonance NMR experiments for isolated
domains and obtain backbone assignments. Record the three-
dimensional spectra required for the assignment of the methyl
groups (see Note 12).
4. Transfer the methyl assignments obtained for the isolated
domains to the larger fragments and finally to the full-length
protein by visually inspecting the methyl-TROSY spectra. Only
the assignment of the obvious and well-dispersed resonances
can be safely transferred this way.
5. Record 13C HMQC-NOESY-HMQC spectra (see Note 13) for
the methyl-labeled samples. Use the NOE patterns to confirm
and extend the assignment transfer from the domains to the
full-length protein. If a crystal structure is available, it can be
used to determine the distances between the methyl groups
and assist with the assignment.
6. Prepare site-directed mutations to assign ambiguous resonances
and further extend and confirm the assignments (see Note 14).
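The visual comparison in steps 2 and 4 can be complemented by a simple numerical check that flags which domain peaks have a single, unshared counterpart in the full-length spectrum. The Python sketch below is one possible way to do this and is not the procedure used by the authors: the tolerance, the 13C scaling weight, the peak labels, and the function name are all hypothetical choices that would need tuning for real spectra.

```python
# Sketch: transfer methyl assignments only where the match is unambiguous.
# Tolerance and 13C weighting are assumed values, not from the protocol.
import math


def transfer_assignments(domain_peaks, full_peaks, tol=0.03, c_weight=0.25):
    """Map domain peak labels to indices of full-length peaks.

    domain_peaks: {label: (1H ppm, 13C ppm)} from the isolated domain.
    full_peaks:   list of (1H ppm, 13C ppm) from the full-length protein.
    A label is transferred only if exactly one full-length peak lies within
    tol (in 1H ppm, with the 13C difference scaled by c_weight) and that
    peak is not also a candidate for another domain peak.
    """
    candidates = {}
    for label, (h, c) in domain_peaks.items():
        candidates[label] = [
            i for i, (fh, fc) in enumerate(full_peaks)
            if math.hypot(h - fh, (c - fc) * c_weight) <= tol
        ]
    claimed = {}
    for label, hits in candidates.items():
        for i in hits:
            claimed.setdefault(i, []).append(label)
    return {
        label: hits[0]
        for label, hits in candidates.items()
        if len(hits) == 1 and len(claimed[hits[0]]) == 1
    }


if __name__ == "__main__":
    domain = {"I10": (0.80, 13.0), "L25": (0.90, 24.0),
              "L30": (0.91, 24.02), "A5": (1.40, 18.0)}
    full = [(0.805, 13.02), (0.905, 24.01), (1.60, 19.0)]
    print(transfer_assignments(domain, full))
```

In this toy example only the isolated Ile peak is transferred; the two overlapping Leu peaks and the shifted Ala peak are left for the NOE- and mutagenesis-based steps, which mirrors the caution in step 4.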
3.3. Paramagnetic Relaxation Enhancement Measurements
1. Prepare the nitroxide spin label (MTSL)-derivatized ligand via
cysteine-specific modification of engineered ligand derivatives
containing single, solvent-accessible cysteine residues at sites of
interest (see Note 15): add MTSL from a concentrated stock
in acetonitrile to the ligand solution (free from any reducing
agent) at a tenfold molar excess over the ligand and allow the
reaction to proceed at 4°C for ~12 h. If available, confirm the
completion of the reaction by mass spectrometry.
2. Remove excess MTSL by extensive dialysis using an Amicon
stirred cell.
3. Determine PRE-derived distances from 1H-13C HMQC spectra
by measuring peak intensities before (paramagnetic) and after
(diamagnetic) reduction of the nitroxide spin label by the
addition of 5 mM ascorbic acid (see Note 16).
4. Convert PRE values to distances by using a modified Solomon-
Bloembergen equation for transverse relaxation (7).
5. Incorporate distance (intermolecular) restraints into the struc-
ture calculation protocol of the complex.
6. For resonances that are strongly affected by the presence of the
spin label in the ligand (Ipara/Idia < 0.15) or that broaden beyond
detection in the paramagnetic spectrum, use only an upper-bound
restraint: the distance estimated from the noise of the spectrum
plus 4 Å.
7. Restrain resonances that remain visible in the paramagnetic
spectra (0.15 ≤ Ipara/Idia < 0.85) to the calculated distance
with ±4 Å upper/lower bounds.
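The conversion of intensity ratios into distance restraints in steps 3 to 7 can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation: it assumes the commonly used form in which Ipara/Idia = R2 exp(−Γ2 t)/(R2 + Γ2) and r = [K (4τc + 3τc/(1 + ωH²τc²))/Γ2]^(1/6) with K = 1.23 × 10^−32 cm^6 s^−2 for a nitroxide label. The diamagnetic R2, the transfer time t, the correlation time τc, the noise-floor ratio, and the choice to leave resonances with ratios of 0.85 or above unrestrained are all assumptions for the example, as are the function names.

```python
# Sketch: PRE intensity ratio -> PRE rate -> distance restraint.
# Equation form and all numerical inputs are assumptions (see lead-in).
import math

K_NITROXIDE = 1.23e-32  # cm^6 s^-2, nitroxide electron-proton constant


def gamma2_from_ratio(ratio, r2_dia, t_transfer):
    """Solve ratio = R2*exp(-G*t)/(R2 + G) for the PRE rate G (s^-1)."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")

    def excess(g):
        return r2_dia * math.exp(-g * t_transfer) / (r2_dia + g) - ratio

    lo, hi = 0.0, 1.0
    while excess(hi) > 0.0:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):      # bisection; excess() decreases with g
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def distance_angstrom(gamma2, tau_c, larmor_hz):
    """Electron-proton distance in angstroms from the PRE rate."""
    omega = 2.0 * math.pi * larmor_hz
    spectral = 4.0 * tau_c + 3.0 * tau_c / (1.0 + (omega * tau_c) ** 2)
    return (K_NITROXIDE * spectral / gamma2) ** (1.0 / 6.0) * 1e8  # cm -> angstrom


def pre_restraint(ratio, r2_dia, t_transfer, tau_c, larmor_hz,
                  noise_ratio=0.05, margin=4.0):
    """Return (lower, upper) bounds in angstroms, or None if unrestrained."""
    if ratio >= 0.85:
        return None  # assumption: weakly affected peaks are left unrestrained
    if ratio < 0.15:
        # Upper bound only, estimated from the noise floor (step 6).
        g = gamma2_from_ratio(max(ratio, noise_ratio), r2_dia, t_transfer)
        return (None, distance_angstrom(g, tau_c, larmor_hz) + margin)
    g = gamma2_from_ratio(ratio, r2_dia, t_transfer)
    d = distance_angstrom(g, tau_c, larmor_hz)
    return (d - margin, d + margin)


if __name__ == "__main__":
    # Hypothetical inputs: R2 = 50 s^-1, t = 8 ms, tau_c = 100 ns, 800 MHz.
    for ratio in (0.05, 0.50, 0.90):
        print(ratio, pre_restraint(ratio, 50.0, 0.008, 100e-9, 800e6))
```

Because the distance depends on the sixth root of Γ2, moderate errors in the assumed R2 or τc change the result by only a few percent, which is one reason generous ±4 Å bounds are workable.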
3.4. Structure Determination
1. Determine the interface between the ligand and the large protein
using differential line broadening (9). The residues affected by
complex formation can be used as ambiguous restraints.
2. If the ligand is a flexible peptide, use transferred NOESY (10)
to determine the structure of the peptide in the complex.
Determine the structure of the complex by using CNS-based
software, such as HADDOCK (11) or Xplor-NIH (12). Use
the crystal structure of the large protein to define the starting
conformation, and both unambiguous and ambiguous
restraints obtained from NOE, PRE, line broadening, and
chemical shift perturbation experiments.
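As one way to turn the line-broadening data of step 1 into a residue list for ambiguous restraints, peak intensities in the free and bound states can be compared after normalizing for the overall broadening that accompanies complex formation. The Python sketch below is purely illustrative: the cutoff, the median normalization, the input format, and the function name are assumptions, and the resulting list would still need to be entered into the docking software following that program's own documentation.

```python
# Sketch: flag residues that broaden more than the bulk on complex formation.
# Cutoff and normalization scheme are assumed, not taken from the protocol.
from statistics import median


def broadened_residues(free_intensity, bound_intensity, cutoff=0.5):
    """Return residue numbers whose bound/free ratio is below cutoff * median.

    free_intensity, bound_intensity: {residue number: peak intensity}.
    Peaks missing from the bound data are treated as fully broadened.
    """
    ratios = {
        res: bound_intensity.get(res, 0.0) / i_free
        for res, i_free in free_intensity.items()
        if i_free > 0
    }
    reference = median(ratios.values())  # typical ratio for unaffected peaks
    return sorted(res for res, r in ratios.items() if r < cutoff * reference)


if __name__ == "__main__":
    free = {10: 100.0, 11: 100.0, 12: 100.0, 13: 100.0, 14: 100.0}
    bound = {10: 80.0, 11: 78.0, 12: 20.0, 13: 82.0}  # residue 14 not detected
    print(broadened_residues(free, bound))
```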
4. Notes
1. Freshly transformed colonies always give better protein yield.

2. [U-2H,13C]-glucose should be used when uniform 13C labeling
is desired or when all side-chain carbons of the methyl-bearing
residues are to be 13C labeled for magnetization transfer from
methyls to the backbone. In this case, the uniformly 13C-labeled
ketoacid precursor must be used.
3. [U-2H,12C]-glucose should be used to produce an NMR sample
in which all carbons are 12C labeled, except the methyl carbons
of interest. 1H-13C HMQC spectra of such samples are recorded
without the use of the constant time version and typically provide
the best resolution. Such a sample can also be used for relax-
ation experiments (1).
4. Up to ~2.5% of a rich labeling medium can be used to increase
the protein yield with no effect on the specific labeling of the
methyl groups.
5. Precursors can be purchased in protonated form and dissolved
in D2O for exchange to take place: at pH 12.5 (45°C), 2–3 h
for α-ketoisovalerate, and at pH 10.5 (45°C), 12–14 h for
α-ketobutyrate; the pH values are optimized for exchange and
prevent the generation of dimers through condensation of two
ketoacid molecules.
6. 13CH3-2Hα-alanine can be prepared by using the tryptophan
synthase enzyme to catalyze the proton-to-deuterium
exchange of the α hydrogen, as described by Matthews and
coworkers (13).
7. Incorporation of 13CH3/12CD3 isotope labels into the isopropyl
moieties of Val and Leu residues should be used for very large
proteins since the inter-methyl dipolar relaxation is significantly
reduced. The methyl-TROSY spectra show significant gains
in resolution with practically no losses in sensitivity despite
the twofold dilution of the NMR-active methyls in such samples.
Precursors have also become available that allow any of the
methyl isotopomers (13CHD2, 13CH2D, and 13CH3) to be incor-
porated into the protein (1). The different isotopomers can
be used for relaxation experiments.
8. In this case, addition of ~2% of a rich labeling medium (e.g.,
BIOEXPRESS or ISOGRO) is required to suppress scrambling
associated with the addition of the alanine amino acid. Interestingly,
a further increase in the rich labeling medium (~10%) completely
suppresses the methyl labeling of Leu while having a minimal
effect on the methyl labeling of Val (Fig. 1b). Since the methyl
groups of these two residues often overlap, this labeling scheme
can be used to differentiate between the two.
9. Critical step: Excessively prolonged growth after induction
should be avoided to prevent generation of methyl groups with
undesired isotopomers.
10. The design of domains and fragments thereof that would retain
their fold and are soluble in isolation can be quite tricky. In
this respect, the availability of a crystal structure can be of
tremendous help.
11. When the methyl groups of all five residues are labeled in the
same sample of a very large protein, the signal may be significantly
compromised due to enhanced inter-methyl relaxation.
The preparation of multiple samples each containing a single
amino acid labeled may be desirable in such a case.
12. An arsenal of pulse sequences is available for methyl assignment
(14).
13. The highly deuterated background suppresses spin diffusion,
and thus the mixing time for the NOESY experiments can be
set as high as 500 ms allowing for NOEs to be observed between
methyl groups as far as ~8 Å.
14. Amino acids should typically be substituted by an isosteric
amino acid to prevent significant changes in the local environment
and protein packing, which could introduce significant
chemical shift effects.
15. Nonreactive Cys residues can be identified by Ellman's test. Sites
for MTSL incorporation should be selected so that they cause
no or minimal effect on protein structure. This can be assessed
by NMR.
16. PRE rates should typically be measured using several MTSL-
derivatized ligands, each containing a single MTSL at a different
site. Because PRE rates provide long-range distance information,
in the absence of available NOE data a large number of PREs
are required to properly determine the structure of a protein–
ligand complex. The complex between SecA and the signal
peptide was determined using 160 PRE-derived intermolecular
restraints.
References
1. Ruschak, A. M., and Kay, L. E. (2010) Methyl groups as probes of supra-molecular structure, dynamics and function. J. Biomol. NMR 46, 75–87.
2. Goto, N., Gardner, K., Mueller, G., Willis, R., and Kay, L. (1999) A robust and cost-effective method for the production of Val, Leu, Ile (delta 1) methyl-protonated 15N-, 13C-, 2H-labeled proteins. J. Biomol. NMR 13, 369–374.
3. Tugarinov, V., Kanelis, V., and Kay, L. E. (2006) Isotope labeling strategies for the study of high-molecular-weight proteins by solution NMR spectroscopy. Nat. Protoc. 1, 749–754.
4. Tugarinov, V., Hwang, P., Ollerenshaw, J., and Kay, L. (2003) Cross-correlated relaxation enhanced 1H-13C NMR spectroscopy of methyl groups in very high molecular weight proteins and protein complexes. J. Am. Chem. Soc. 125, 10420–10428.
5. Sprangers, R., and Kay, L. E. (2007) Quantitative dynamics and binding studies of the 20S proteasome by NMR. Nature 445, 618–622.
6. Gelis, I., Bonvin, A., Keramisanou, D., Koukaki, M., Gouridis, G., Karamanou, S., Economou, A., and Kalodimos, C. G. (2007) Structural basis for signal-sequence recognition by the translocase motor SecA as determined by NMR. Cell 131, 756–769.
7. Battiste, J., and Wagner, G. (2000) Utilization of site-directed spin labeling and high-resolution heteronuclear nuclear magnetic resonance for global fold determination of large proteins with limited nuclear Overhauser effect data. Biochemistry 39, 5355–5365.
8. Tang, C., Schwieters, C., and Clore, G. (2007) Open-to-closed transition in apo maltose-binding protein observed by paramagnetic NMR. Nature 449, 1078–1082.
9. Takeuchi, K., and Wagner, G. (2006) NMR studies of protein interactions. Curr. Opin. Struct. Biol. 16, 109–117.
10. Post, C. (2003) Exchange-transferred NOE spectroscopy and bound ligand structure determination. Curr. Opin. Struct. Biol. 13, 581–588.
11. de Vries, S. J., van Dijk, M., and Bonvin, A. M. (2010) The HADDOCK web server for data-driven biomolecular docking. Nat. Protoc. 5, 883–897.
12. Schwieters, C. D., Kuszewski, J. J., Tjandra, N., and Clore, G. M. (2003) The Xplor-NIH NMR molecular structure determination package. J. Magn. Reson. 160, 65–73.
13. Isaacson, R., Simpson, P., Liu, M., Cota, E., Zhang, X., Freemont, P., and Matthews, S. (2007) A new labeling method for methyl transverse relaxation-optimized spectroscopy NMR spectra of alanine residues. J. Am. Chem. Soc. 129, 15428–15429.
14. Tugarinov, V., and Kay, L. (2003) Ile, Leu, and Val methyl assignments of the 723-residue malate synthase G using a new labeling strategy and novel NMR methods. J. Am. Chem. Soc. 125, 13868–13878.
Chapter 9
Protein Dynamics by 15N Nuclear Magnetic Relaxation

Fabien Ferrage
Abstract
Nitrogen-15 relaxation is the most ubiquitous source of information about protein (backbone) dynamics
used by NMR spectroscopists. It provides the general characteristics of hydrodynamics as well as internal
motions on subnanosecond, micro- and millisecond timescales of a biomolecule. Here, we present a full
protocol to perform and analyze a series of experiments to measure the 15N longitudinal relaxation rate,
the 15N transverse relaxation rate under an echo train or a single echo, the 15N–1H dipolar cross-relaxation
rate, as well as the longitudinal and transverse cross-relaxation rates due to the cross-correlation of the
nitrogen-15 chemical shift anisotropy and the dipolar coupling with the adjacent proton. These rates can
be employed to carry out model-free analyses and can be used to quantify accurately the contribution of
chemical exchange to transverse relaxation.
Key words: Nuclear magnetic resonance, Protein dynamics, Relaxation rates, Nitrogen-15, Longitudinal
relaxation, Transverse relaxation, Cross-relaxation, Cross-correlated relaxation, Chemical exchange
1. Introduction
Nuclear magnetic resonance is a fantastic tool to investigate the
dynamics of biomolecules and, in particular, proteins. Among the
numerous techniques available, which provide access to internal
motions over a wide range of timescales, measurements of backbone
nitrogen-15 relaxation rates have proven to be by far the most
popular method to sample and quantify protein dynamics (1–3).
Nitrogen-15 relaxation rates can be analyzed to determine the
hydrodynamic properties of proteins (4), internal motions faster
than overall motions (subnanosecond) as well as slower motions
on microsecond–millisecond timescales that give rise to chemical
exchange-induced relaxation. Fast subnanosecond motions are
closely related to the local atomic density of a protein, and they
contribute to conformational entropy (5, 6) and are a good
indication of the local malleability of the protein structure.
Hydrodynamic properties can be used for structural refinement
of a single-domain (7) or tight multidomain protein or protein
complex (8); interdomain motions may also be analyzed (9).
Microsecond and millisecond motions and chemical reactions are
essential to protein function. In most cases, the kinetics of these
events can be determined by nitrogen-15 relaxation methods
(10, 11), while, in favorable cases, both thermodynamics and struc-
tural changes between exchanging states can be characterized (12).
The canonical set of nitrogen-15 relaxation experiments
comprises the measurements of longitudinal, R1, and transverse,
R2, auto-relaxation rates, as well as the 15N–{1H} nuclear Overhauser
effect, from which the dipolar cross-relaxation rate between
the nitrogen-15 nucleus and its neighboring proton, σNH, can be
extracted (13). In this review, we present basic protocols to set up
and analyze these experiments, including the latest developments.
In addition, we provide protocols for two additional experiments
for measuring longitudinal and transverse cross-relaxation rates
due to the cross-correlation of the nitrogen-15 chemical shift
anisotropy (CSA) and the dipole–dipole (DD) coupling with its
attached proton. These rates can be analyzed alongside the canonical
set in order to evaluate the contribution of chemical exchange
effects to transverse relaxation (14, 15).
1.1. Theory We provide here the minimal theoretical background that is necessary
to define the various terms that are used in this review. For a
detailed understanding of relaxation theory, the reader should refer
to some of the many reviews that have been published on the
subject (16–19). Relaxation is the irreversible process through which
a spin system evolves toward a steady state. Different elements of the
density operator evolve with different auto-relaxation rates, and
they convert into one another with well-defined cross-relaxation
rates. The molecular processes underlying relaxation are the fluc-
tuations of high-amplitude orientation-dependent interactions,
such as dipole–dipole couplings and the anisotropy of the chemical
shift. The amplitude of these interactions is known so that one
can extract the characteristics of motions from relaxation rates.
Relaxation rates do not depend directly on the correlation function
of interaction Hamiltonians but on their Fourier transform: the
spectral density function J(ω). Here, we describe a protocol to measure
the following relaxation rates for 15N nuclei in a 15N–1H pair.
The longitudinal auto-relaxation rate, R1:
$$R_1 = D\left(6(1+a^2)J(\omega_N) + 2J(\omega_H-\omega_N) + 12J(\omega_H+\omega_N)\right). \tag{1}$$
The transverse auto-relaxation rate, R2:
$$R_2 = D\left(4(1+a^2)J(0) + 3(1+a^2)J(\omega_N) + J(\omega_H-\omega_N) + 6J(\omega_H) + 6J(\omega_H+\omega_N)\right). \tag{2}$$
The 1H–15N dipolar cross-relaxation rate, σNH:
$$\sigma_{NH} = D\left(-2J(\omega_H-\omega_N) + 12J(\omega_H+\omega_N)\right). \tag{3}$$
The CSA/DD transverse cross-correlated cross-relaxation rate, δtN:
$$\delta_N^{t} = Da\left(8J(0) + 6J(\omega_N)\right). \tag{4}$$
The CSA/DD longitudinal cross-correlated cross-relaxation rate, δlN:
$$\delta_N^{l} = 12DaJ(\omega_N), \tag{5}$$
with
$$D = \frac{1}{20}\left(\frac{\mu_0}{4\pi}\right)^2\frac{\hbar^2\gamma_H^2\gamma_N^2}{r_{NH}^6} \quad\text{and}\quad a = -\frac{4\pi}{\mu_0}\,\frac{2r_{NH}^3}{3\hbar\gamma_H}\,B_0(\sigma_\parallel-\sigma_\perp);$$
γH and γN are the gyromagnetic ratios of the proton and nitrogen-15 nuclei,
respectively; ωH and ωN are the Larmor angular frequencies of the
proton and nitrogen-15, respectively (see Note 1); rNH is the inter-
nuclear distance; μ0 is the permeability of free space; ħ is the Planck
constant divided by 2π; B0 is the static magnetic field; and σ∥ and
σ⊥ are the axial and perpendicular components of the anisotropic
chemical shift tensor of the nitrogen-15 nucleus (which we consider
to be axially symmetric).
The 1H–15N dipolar cross-relaxation rate is measured indirectly
from nuclear Overhauser effects. In this case, signal intensities are
measured under two conditions: at the steady state under effective
proton saturation Iss and at equilibrium Ieq. The ratio of these
intensities is:
$$I_{ss}/I_{eq} = 1 + \frac{\gamma_H\sigma_{NH}}{\gamma_N R_1}. \tag{6}$$
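To give a feel for the magnitudes involved, Eqs. 1–6 can be evaluated numerically. The following is a minimal sketch under assumptions that are not part of the protocol: a Lorentzian spectral density J(ω) = τc/(1 + ω²τc²) for a rigid N–H vector in an isotropically tumbling molecule, rNH = 1.02 Å, σ∥ − σ⊥ = −160 ppm, signed Larmor frequencies ω = −γB0, and the expression a = −(4π/μ0) 2rNH³B0(σ∥ − σ⊥)/(3ħγH) implied by the definitions above. All function and variable names are hypothetical.

```python
import math

MU0_OVER_4PI = 1.0e-7      # T m A^-1
HBAR = 1.054571817e-34     # J s
GAMMA_H = 2.6752218744e8   # rad s^-1 T^-1
GAMMA_N = -2.7126e7        # rad s^-1 T^-1 (15N; negative)
R_NH = 1.02e-10            # m, assumed N-H distance
DELTA_SIGMA = -160.0e-6    # assumed sigma_par - sigma_perp


def spectral_density(omega, tau_c):
    """Assumed Lorentzian J(w) for isotropic tumbling of a rigid N-H vector."""
    return tau_c / (1.0 + (omega * tau_c) ** 2)


def relaxation_rates(b0, tau_c):
    """Evaluate Eqs. 1-6 at field b0 (T) and correlation time tau_c (s)."""
    d = (MU0_OVER_4PI ** 2 * HBAR ** 2 * GAMMA_H ** 2 * GAMMA_N ** 2
         / (20.0 * R_NH ** 6))
    a = (-2.0 * R_NH ** 3 * b0 * DELTA_SIGMA
         / (3.0 * MU0_OVER_4PI * HBAR * GAMMA_H))
    w_h = -GAMMA_H * b0  # signed Larmor frequencies (assumed convention)
    w_n = -GAMMA_N * b0

    def j(omega):
        return spectral_density(omega, tau_c)

    r1 = d * (6.0 * (1.0 + a * a) * j(w_n)
              + 2.0 * j(w_h - w_n) + 12.0 * j(w_h + w_n))
    r2 = d * (4.0 * (1.0 + a * a) * j(0.0) + 3.0 * (1.0 + a * a) * j(w_n)
              + j(w_h - w_n) + 6.0 * j(w_h) + 6.0 * j(w_h + w_n))
    sigma_nh = d * (-2.0 * j(w_h - w_n) + 12.0 * j(w_h + w_n))
    delta_t = d * a * (8.0 * j(0.0) + 6.0 * j(w_n))
    delta_l = 12.0 * d * a * j(w_n)
    noe = 1.0 + (GAMMA_H * sigma_nh) / (GAMMA_N * r1)  # Eq. 6
    return {"R1": r1, "R2": r2, "sigma_NH": sigma_nh,
            "delta_t": delta_t, "delta_l": delta_l, "Iss/Ieq": noe}


if __name__ == "__main__":
    for name, value in relaxation_rates(b0=14.1, tau_c=10.0e-9).items():
        print(f"{name:9s} {value:8.3f}")
```

With these assumed parameters (14.1 T, τc = 10 ns), the steady-state ratio of Eq. 6 comes out at roughly 0.8 and R2 is about ten times larger than R1. Internal motions and chemical exchange, which this sketch ignores, change these numbers for real residues.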
In addition, we measure the transverse relaxation rate R2 under
two different conditions: with the more typical CPMG (Carr-Purcell-
Meiboom-Gill) train of echoes and with a single echo combined with
continuous 1H composite pulse decoupling. The latter rate includes,
in most cases, the full contribution of chemical exchange to trans-
verse relaxation, Rex, which is a probe of chemical reactions and
motions on micro- to millisecond timescales (20, 21), a particularly
relevant range for biological processes. In the case of fast exchange
between two sites A and B (with a timescale τex and a difference of
chemical shift between the two sites Δω), we have:
$$R_{ex} = p_A p_B \Delta\omega^2 \tau_{ex}, \tag{7}$$
where pA and pB are the populations of the two exchanging sites A
and B, respectively, with pA + pB = 1.
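As a quick numerical illustration of Eq. 7, the sketch below computes the exchange contribution for a hypothetical two-site system. The function names and the example values (populations, shift difference, timescale, 15N Larmor frequency) are assumptions chosen only to show the order of magnitude.

```python
import math


def ppm_to_rad_per_s(delta_ppm, larmor_mhz):
    """Convert a shift difference in ppm to an angular frequency (rad/s)."""
    return 2.0 * math.pi * delta_ppm * larmor_mhz  # ppm x MHz = Hz


def fast_exchange_rex(p_a, delta_omega, tau_ex):
    """Eq. 7: Rex = pA pB (delta omega)^2 tau_ex, with pB = 1 - pA."""
    p_b = 1.0 - p_a
    return p_a * p_b * delta_omega ** 2 * tau_ex


if __name__ == "__main__":
    # Hypothetical case: 1 ppm 15N shift difference at 60.8 MHz (15N),
    # 90%/10% populations, exchange timescale of 100 microseconds.
    d_omega = ppm_to_rad_per_s(1.0, 60.8)
    rex = fast_exchange_rex(0.9, d_omega, 100.0e-6)
    print(f"delta omega = {d_omega:.1f} rad/s, Rex = {rex:.2f} s^-1")
    # Fast-exchange check: delta omega x tau_ex should be much smaller than 1.
    print(f"delta omega x tau_ex = {d_omega * 100.0e-6:.3f}")
```

Because Rex scales with Δω², and Δω in rad/s scales with B0, the exchange contribution in this regime grows with the square of the static field.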
The focus of this review is not to describe how to determine
the various parameters of the exchange process, which is well
presented in the following reviews (10, 22). Nevertheless, we
describe a protocol to identify the presence of chemical exchange.
2. Materials
1. Appropriately labeled protein sample, preferably (2H, 15N) or (15N)
[studies of (2H, 15N, 13C) and (15N, 13C) labeled proteins are
also possible], at ~1 mM concentration (see Note 2).
2. NMR spectrometer with a magnetic field strength of ≥11.7 T
(see Note 3).
3. nmrPipe: software to process, display and analyze 2D spectra
(http://spin.niddk.nih.gov/NMRPipe/).
4. Curvefit: software to fit the relaxation decays and build-up
curves (http://biochemistry.hs.columbia.edu/labs/palmer/
software/curvefit.html).
5. Grace: software to display the results of the Curvefit analysis
(http://plasma-gate.weizmann.ac.il/Grace/).
3. Methods
3.1. Preliminary Set-up
A few preliminary procedures should be followed before starting
the series of experiments. First, when all experiments are run on the
same spectrometer, one should ensure the temperature calibration
is very recent; in any case, it is safer to calibrate the temperature
prior to recording all experiments. When data are collected by
using two or more spectrometers, the exact match of temperatures
should be verified directly on the same standard methanol sample
on each spectrometer. It is more important to run all experiments
at the exact same temperature than to have a small error on the
nominal temperature of the experiments.
1. When running experiments on the first spectrometer, intro-
duce a sample of perdeuterated methanol into the spectrome-
ter. Set the temperature at the desired value according to the
latest calibration. Let the sample equilibrate for 5 min. Match
and tune the probe, and shim the magnet.
2. Run a simple 1D proton experiment with a small angle excita-
tion (typically with a 1 µs 1H pulse at high power). Measure
the difference in chemical shifts between the hydroxyl and
methyl protons, Δδ. The temperature is (23):
$$T = -16.7467\,\Delta\delta^2 - 52.5130\,\Delta\delta + 419.1381. \tag{8}$$
3. Change the nominal temperature on the spectrometer and
repeat step 2 until the desired temperature is reached. Write
down the value of Δδ.
4. When running experiments on another spectrometer, directly
match the value of Δδ.
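The calibration of Eq. 8 is easy to script. The following is a minimal sketch; the function names are hypothetical, and the units (T in kelvin, shift difference in ppm) are assumed from the form of the calibration rather than stated in the text. The second function inverts Eq. 8 to give the shift difference one would aim for at a desired temperature.

```python
import math

# Coefficients of Eq. 8
A, B, C = -16.7467, -52.5130, 419.1381


def methanol_temperature(delta_ppm):
    """Temperature from the OH/CH3 shift difference (assumed K and ppm)."""
    return A * delta_ppm ** 2 + B * delta_ppm + C


def target_shift_difference(temperature):
    """Invert Eq. 8: positive root of A x^2 + B x + (C - T) = 0."""
    disc = B ** 2 - 4.0 * A * (C - temperature)
    return (-B - math.sqrt(disc)) / (2.0 * A)


if __name__ == "__main__":
    print(f"{methanol_temperature(1.5):.2f}")       # about 302.69
    print(f"{target_shift_difference(298.0):.4f}")  # shift difference to aim for at 298 K
```

Matching the measured shift difference between spectrometers, as in step 4, sidesteps any disagreement between their nominal temperature readings.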
Second, pulses should be calibrated for each set of mea-
surements collected on each spectrometer. The typical values
obtained by using a standard sample should be used as a guess
to start calibration but not directly to run experiments.
1. Introduce the protein sample in the spectrometer, let it equili-
brate for a few minutes. Shim, tune, and match each channel
that is going to be used. Set the carrier on the water signal and
run a 1D proton spectrum with a short (1 µs) excitation at
high power (see Note 4). Phase and use this spectrum as a
reference.
2. Set the pulse duration to the expected value for a 360° flip angle.
Collect a spectrum. Optimize the value of the pulse duration
to get a null signal. Write down the duration of the corresponding
90° pulse.
3. Prepare a 1D version of a 15N–1H Heteronuclear Single
Quantum Coherence (HSQC) experiment, preferably with a
heteronuclear gradient echo. Set the calibrated duration for
proton pulses and the expected value for 15N pulses. Make sure
the carrier on the 15N channel is set at 117 ppm. Set the number
of scans to 16. Run a 1D reference spectrum. If the signal-
to-noise is very high, reduce the number of scans. If it is low,
increase accordingly.
4. Now, set the duration of the first 90° pulse on 15N to the expected
duration for a 180° pulse. Run the experiment. Optimize to get
a null signal. Write down the corresponding value for the cali-
brated 90° pulse.
5. If the sample is carbon-13 labeled, follow steps 3 and 4 with
13C instead of 15N. The 13C carrier should be set at 35 ppm and
the proton signal should be null between 2 and 3 ppm when
the flip angle is 180°.
6. Alternatively, or, when using a carbon-13 and perdeuterated
sample, run steps 3 and 4 with a 1D version of an HNCO
experiment, where the first 13C pulse is adjusted to 180°. The car-
rier on the 13C channel should be set to 174 ppm (see Note 5).
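The arithmetic behind the null-based calibrations in steps 2 and 4 can be sketched as follows. The function names and example durations are hypothetical, and the conversion to a radiofrequency amplitude assumes the usual relation ν1 = 1/(4 × p90) for a rectangular pulse.

```python
def ninety_from_null(null_duration_us, null_angle_deg):
    """90-degree pulse length from the duration that gives a signal null.

    null_angle_deg is 360 for the proton calibration (step 2) and 180 for
    the heteronuclear calibrations (steps 4 to 6).
    """
    return null_duration_us * 90.0 / null_angle_deg


def rf_amplitude_hz(p90_us):
    """Radiofrequency amplitude (Hz) of a rectangular 90-degree pulse."""
    return 1.0 / (4.0 * p90_us * 1.0e-6)


if __name__ == "__main__":
    p90_h = ninety_from_null(40.0, 360.0)  # hypothetical 1H null at 40 us
    p90_n = ninety_from_null(76.0, 180.0)  # hypothetical 15N null at 76 us
    print(p90_h, rf_amplitude_hz(p90_h))
    print(p90_n, rf_amplitude_hz(p90_n))
```

Searching for a null rather than a maximum is the more sensitive approach, because the signal changes sign on passing through the null, whereas it varies only slowly near its maximum.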
Third, the carrier, spectral width, and number of time
points should be optimized in the 15N dimension (this should
have been done during the assignment experiments). In par-
ticular, cross-correlated relaxation experiments on small pro-
teins can have a long duration because of long phase cycles,
but are not signal-to-noise limited. Such optimization is an
important time saver.
1. Set up a 2D HSQC experiment with a wide spectral width (larger
than 30 ppm) so that only the arginine NεHε signals are folded.
2. Process the spectrum and adjust the spectral width and carrier.
Folded peaks should not overlap with nonfolded peaks, and it
is advised to keep about 1 ppm on each side of the spectrum
with no peak. This process may require a few iterations. In the end,
a spectral width between 16 and 24 ppm should be obtained
for most small- or medium-size proteins.
3. This last step should be carried out each time one runs experiments
at a new B0 field. Record a final spectrum with the optimal
carrier and spectral width as well as a large number of points in
the indirect dimension. Process it with the lowest number of
points that provides good resolution of all peaks of interest.
Use this value for all experiments. Be careful not to underestimate
the number of points: doing so may save 10% of the time, but you may
also have to repeat the full series of experiments.
3.2. Auto-Relaxation Rates Measurements
3.2.1. Longitudinal Relaxation Rate
The sequences presented in Fig. 1a, b should be employed for this
experiment. There is no fully satisfactory combination of water-flip
back schemes and suppression of CSA/DD cross-correlated cross-
relaxation pathways during the relaxation delay. One solution is to
saturate the water resonance in each scan and use a long recovery
delay between scans. To saturate the water resonance, drop the
water-flip back pulse at the end of the first INEPT in Fig. 1a and
use a strong gradient G2. In that case, the two shaped pulses in
Fig. 1b can be substituted either by composite pulse decoupling
(see Fig. 1c) or a series of proton 180° pulses (see Fig. 1d). Here,
we use a scheme that can be used when no amide resonance is
lying too close to the water resonance. It is one of the standard
sequences available on Bruker spectrometers.
1. Set pulse durations and amplitudes according to the spectrom-
eter specifications, the calibration and the description of the
pulse sequence in Fig. 1a, b. Set delays according to the pulse
sequence. Calibrate proton shape pulses according to the
spectrometer-specific protocol.
2. Set the recovery delay between scans to a large value, usually
2 s; 3 s can be used if spectrometer time is not a problem.
3. For a well-behaved protein, start with eight scans (16–32 for
low-concentration samples). Set the relaxation delay to a low
value (we use 20 ms). Run the first 1D experiment. Phase and
save this spectrum.
4. Set the relaxation delay to a value close to the expected average
T1. At 300 K, this is 500 ms for a globular 100-residue protein
and 700 ms for a 150-residue protein. Acquire a new spectrum.
Fig. 1. Pulse sequences used for measuring 15N auto-relaxation rates. (a) General scheme. (b) Relaxation sequence for
measuring the longitudinal relaxation rate R1; (c) relaxation sequence for measuring the transverse relaxation rate under a
single echo R2echo; (d) relaxation sequence for the measurement of the transverse relaxation rate under a CPMG echo train
R2CPMG. Narrow (filled) and wide (open) rectangles represent 90° and 180° pulses, respectively. Pulse phases are along
the x-axis of the rotating frame unless otherwise mentioned. Proton composite pulse decoupling during the delay t1 was
performed with a GARP scheme and a radio frequency (rf) field amplitude of 1 kHz. The 13C channel pulse was a 500 µs
smoothed CHIRP pulse (37), with a sweep of 60 kHz on a 600 MHz spectrometer; the carrier at the center of the pulse was
110 ppm. Composite-pulse decoupling during acquisition was performed on the 15N channel with a GARP scheme (38) and
an rf amplitude of 1,090 Hz. The delay τa is 2.56 ms; the delay τb can be adjusted around 5 ms. The phase cycles were:
φ1 = {x, −x}; φ3 = {x, x, −x, −x}; φ4 = {x, x, −x, −x}; φ5 = {−y, −y, y, y}; φacq = {x, −x, −x, x}. When relaxation block (b) is used,
φ2 = {y, y, y, y, −y, −y, −y, −y} and φacq = {x, −x, −x, x, −x, x, x, −x}. The amplitude profile of the pulsed field gradients was
a sine bell shape. Their durations and peak amplitudes over the x, y, and z orientations (when triple axis gradients are
available) were, respectively: G1: 600 µs, 9.5 G/cm, 9.5 G/cm, 0; G2: 1 ms, 0, 0, 30 G/cm; G3: 600 µs, 15 G/cm, –15 G/cm,
0; G4: 1 ms, 0, 0, 40 G/cm; G5: 1 ms, 0, 0, 8.1 G/cm. Coherence selection was achieved by inverting the amplitude of the
gradient G4 and phase φ1. (b) The carrier is placed at 8.2 ppm during the relaxation block; gray bell-shaped pulses are
1.6 ms Q3 Gaussian cascade pulses at 600 MHz (39) (see Note 9). (c) WALTZ-16 decoupling should be used for 1H decoupling
during the relaxation block (40). See text for how to choose the relaxation delays. (d) Gray rectangles are 180° pulses;
depending on the spectrometer and probe, they should be either at high power or less, but should not be longer than
100 µs. τ should be set to 500 µs.
5. Compare the intensities and repeat with adjusted delays until
the ratio of intensities is about 0.3. We call this delay tmax + 20 ms.
6. The best sampling of the decay is achieved with an even decay
of intensities between each time point. If n different relaxation
delays are to be acquired, the intensity of the j-th experiment
should be:
$$I_j = I_1\left(1 - 0.7\,\frac{j-1}{n-1}\right). \tag{9}$$
We expect a mono-exponential decay with an average rate R1av
and we know that:
$$\exp\left(-R_1^{av}\,t_{max}\right) = 0.3 \iff R_1^{av} = -\ln 0.3 / t_{max} \tag{10}$$
and
$$\exp\left(-R_1^{av}\,t_j\right) = I_j / I_1. \tag{11}$$
So that
$$t_j = \frac{t_{max}}{\ln 0.3}\,\ln\left(1 - 0.7\,\frac{j-1}{n-1}\right). \tag{12}$$
7. A typical number of relaxation delays is 8. This number can be
reduced to 6, or even 4, in cases where each spectrum takes a very
long time to acquire. Record all experiments in an interleaved
manner, changing the relaxation time once each full 1D spectrum
is acquired. This requires a small modification of standard pulse
sequence programs. When using standard pulse sequences,
the sequence of delays should be: t1, tn, t2, tn−1, t3, tn−2, etc. With this
order, a gradual loss in the quality of the spectra (because of a bubble,
for instance) makes the decay of intensities deviate from a mono-exponential
instead of being reflected by a higher, erroneous rate. Similarly, at least t1
and, if possible, t3 should be
repeated to identify a possible decay in the quality of spectra.
If this decay is larger than the error bars, the dataset should
be discarded.
8. Before starting the series of experiments, set the number of “dummy
scans” in the first experiment to a very high value (at least
256, preferably 512). This leaves enough time for the tempera-
ture control system of the spectrometer to reach equilibrium.
9. Import the spectra into nmrPipe (24). Process them with typical
parameters: the window function should be a sine-bell function,
the power should be set to 2, and the shift parameter between
0.5 and 0.35 in each dimension (the lower the value, the higher the
resolution). Truncation artifacts in the 2D spectrum should be
avoided at all costs, as they may contaminate the intensity of
neighboring peaks.
10. Peak pick the first spectrum, or export a peak list from the
assignment software, save the peak list as peakX_0.tab, where
X is the number of the spectrum. To extract the intensity of
the peaks with the best possible accuracy, use the nonlinear line
shape analysis tool nlinLS, which is provided with nmrPipe.
First, measure the full width, in points, of the peaks in each
dimension: WH and WN. Then, run the following line:
nlinLS -in peakX_0.tab -out peakX_1.tab -data spectrumX.ft2
-w WH/6 WN/6 -delta X_AXIS WH/10 Y_AXIS WN/10.
This supposes that the proton dimension is the X dimension of
the spectrum. The option w defines the region, where the peak
is fitted (the current command specifies a square of dimensions
WH and WN); the option delta adds a constraint on the following
parameters, here the position of each peak.
11. It is likely that nlinLS will initially provide a series of error
messages, most often indicating bad convergence of the fit.
In spite of these error messages, the output file will be of better
quality than the input file and should be used as an input for a
second round of analysis with nlinLS. The process is iterative
and may require up to five iterations for convergence and a
proper fit with no error message. If the width of a peak is
diverging from typical values in the output file, an additional
constraint can be added to the peak width parameters, WX and
WY. Peaks may also be added or excluded from clusters to
improve convergence. Use the final, error-free list obtained
from the first spectrum as the first input for all other fits.
12. Run this iterative procedure for each spectrum.
13. Collect the intensities and peak numbers with the getCols tool,
which is provided with nmrPipe. nlinLS provides both peak
intensities and volumes. Only the intensities should be used.
Although the precision should be better than the spectral noise,
this turns out to be a good estimate of the error in the intensities.
Use this value as the error for all the peaks of each spectrum.
14. Using a short script or a spreadsheet editor, prepare a table
with peak numbers in the first column, then the intensities of
peaks in the series of spectra in columns 2, 4, 6, etc., and the
respective error in columns 3, 5, 7, etc.
15. Run a short script to create a series of Curvefit input files with
names: N.in, where N is the peak number (see Note 6).
16. Run the command: batch_curve in. The fit for each peak number
appears (see Fig. 2). It is very important to check every single
fit and identify errors that may come from improper intensity
fits or other potential errors. Write down the peaks and spectra
that lead to apparently wrong intensities and inspect the spectra
for potential errors. There may be no error.
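The choice of relaxation delays in steps 6 and 7 (Eq. 12 and the alternating acquisition order) can be sketched as follows. The function names are hypothetical; the delays returned are relative to the shortest delay, so the 20 ms offset mentioned in step 5 would be added afterwards.

```python
import math


def relaxation_delays(t_max, n, final_ratio=0.3):
    """Delays t_j of Eq. 12, giving evenly spaced intensities from 1 to final_ratio."""
    if n < 2:
        raise ValueError("at least two delays are needed")
    delays = []
    for j in range(1, n + 1):
        fraction = 1.0 - (1.0 - final_ratio) * (j - 1) / (n - 1)  # Eq. 9
        delays.append(t_max * math.log(fraction) / math.log(final_ratio) + 0.0)
    return delays


def alternating_order(delays):
    """Acquisition order t1, tn, t2, tn-1, ... described in step 7."""
    order = []
    lo, hi = 0, len(delays) - 1
    while lo <= hi:
        order.append(delays[lo])
        if lo != hi:
            order.append(delays[hi])
        lo += 1
        hi -= 1
    return order


if __name__ == "__main__":
    t = relaxation_delays(t_max=0.6, n=8)  # hypothetical t_max of 600 ms
    print([round(x, 3) for x in t])
    print([round(x, 3) for x in alternating_order(t)])
```

The delays are more closely spaced at short times and spread out toward tmax, which is what keeps the intensity steps equal for an exponential decay.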
3.2.2. Transverse Relaxation Rates with a Single Echo
The sequence presented in Fig. 1a, c should be employed for this
experiment. Only relaxation delays that lead to full cycles of the
proton composite pulse decoupling (CPD) should be used, which
slightly modifies the protocol used for the measurement of
longitudinal relaxation rates.
Fig. 2. Example of a longitudinal relaxation decay curve as shown by Grace during the
Curvefit procedure.
1. Most parameters are identical to those used in longitudinal
relaxation measurements. To cover the entire amide region the
carrier for the proton CPD is set to 8.2 ppm and the amplitude
to about 2 × x Hz; where x is the proton Larmor frequency of
the spectrometer in MHz. Set the CPD scheme to WALTZ-16
(see Note 7). The half relaxation delay T should be set to mul-
tiples of 96 × tcpd, where tcpd is the duration of a 90° pulse at the
decoupling power.
2. Contrary to other sequences, proton CPD with a carrier on the
amides prevents a proper control of the polarization of water.
Set the radiofrequency amplitude to zero for the water flip
back pulse following the first INEPT and use a long recovery
delay (at least 2 s) in between scans.
3. Follow steps 2–7 of the protocol for longitudinal relaxation
(Subheading 3.2.1) with the following changes. Start with the
shortest relaxation delay, i.e., zero. Because the total relaxation
delay has to be a multiple of 192 × tcpd, the choice of possible
delays is severely limited; use all accessible delays between zero
and tmax.
4. To obtain the transverse relaxation rate under a single echo R2echo,
the analyses of spectra are exactly the same as that described in
steps 8–15 of the preceding protocol (Subheading 3.2.1).
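The constraint on the single-echo relaxation delays (steps 1 and 3) can be made concrete with a short calculation. This is a minimal sketch with hypothetical function and parameter names; it assumes that tcpd, the 90° pulse duration at the decoupling power, follows from the decoupling amplitude ν1 as tcpd = 1/(4ν1).

```python
def single_echo_delays(proton_mhz, t_max, amplitude_factor=2.0):
    """Total relaxation delays compatible with full WALTZ-16 cycles.

    The CPD amplitude is taken as amplitude_factor x proton_mhz (in Hz),
    as suggested in step 1; the total delay is a multiple of 192 x tcpd.
    """
    rf_hz = amplitude_factor * proton_mhz
    t_cpd = 1.0 / (4.0 * rf_hz)  # 90-degree pulse at the decoupling power
    step = 192.0 * t_cpd         # two half delays of 96 x tcpd each
    n_steps = int(t_max / step + 1.0e-9)
    return [k * step for k in range(n_steps + 1)]


if __name__ == "__main__":
    # Hypothetical case: 600 MHz spectrometer, t_max of 210 ms.
    for delay in single_echo_delays(600.0, 0.21):
        print(f"{delay * 1000.0:6.1f} ms")
```

At 600 MHz the increment works out to 40 ms, so only a handful of delays fit below a typical tmax, which is why the protocol recommends using all of them.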
3.2.3. Transverse Relaxation Rates with a CPMG Scheme
For the sake of consistency, this protocol is described here. However,
this is the most challenging experiment as heating from high
radiofrequency fields may lead to bubble formation and sample
degradation. It is strongly advised to run this experiment as the
last one of the series.
1. Most parameters are identical to those used in longitudinal
relaxation measurements. Set the power level for 15N 180° pulses
during the CPMG echo train. This experiment is one of the
most demanding that can be run on a high-resolution probe.
The maximum power (i.e., shortest nitrogen-15 pulses) that can
be employed during a CPMG echo train is usually provided to
users for each probe on each spectrometer. If no value is
recommended, ask the person in charge of the spectrometer
maintenance. If you are in charge of the spectrometer mainte-
nance, ask the manufacturer. Often, particularly on cryogenic
probes, these pulses should not be applied at full power.
For proper accuracy of the experiment, 180° pulses should be no
longer than 100 µs.
2. In addition to Fig. 1d, include a temperature-compensation
loop at the beginning of the sequence. At the end of the
recycling delay, include a series of far off-resonance (200 kHz
works fine) 15N 180° pulses at the amplitude of the CPMG
echo train. Do not include 1H 180° pulses as these would alter the
1H longitudinal polarization. The duration of this train should
be such that the total number of 15N 180° pulses is constant
whatever the relaxation delay is. An additional delay, equal to
the relaxation delay, should also be added so that the total
duration for the recovery of proton polarization is constant.
3. Set the relaxation delays as described in the protocol for longitudinal
relaxation (Subheading 3.2.1). Zero can be used as the shortest relax-
ation delay.
4. Use a long recovery delay (preferably 2 s) and a large number
of dummy scans (512) to let the temperature control system
reach equilibrium before the first experiment.
5. Follow steps 9–16 of the R1 analysis (Subheading 3.2.1) to obtain
transverse relaxation rates under a CPMG train, R2CPMG.
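The bookkeeping behind steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration, not part of the original protocol: the function names are hypothetical, and so is the assumption that the CPMG train applies 15N 180° pulses at a fixed rate (`pulses_per_second`), which should be adapted to the actual pulse program. The first function converts the length of a rectangular 180° pulse into the corresponding rf amplitude. The second computes how many off-resonance compensation pulses are needed so that every relaxation delay sees the same total number of 15N 180° pulses.

```python
def rf_amplitude_khz(pulse_180_us):
    """rf amplitude (kHz) of a rectangular 180° pulse lasting pulse_180_us microseconds."""
    # A 180° pulse lasts half a nutation period, so amplitude = 1 / (2 * duration).
    return 1.0e3 / (2.0 * pulse_180_us)


def compensation_plan(relax_delays_s, pulses_per_second):
    """Return one (n_cpmg, n_compensation, extra_delay_s) tuple per relaxation delay.

    pulses_per_second is a hypothetical parameter: the rate at which 15N 180°
    pulses are applied during the CPMG echo train.
    """
    n_cpmg = [round(delay * pulses_per_second) for delay in relax_delays_s]
    n_max = max(n_cpmg)
    plan = []
    for delay, n in zip(relax_delays_s, n_cpmg):
        # Off-resonance pulses make up the difference to the longest train;
        # the extra delay equals the relaxation delay, as described in step 2.
        plan.append((n, n_max - n, delay))
    return plan


if __name__ == "__main__":
    print(rf_amplitude_khz(100.0))  # amplitude for a 100 µs 180° pulse
    for row in compensation_plan([0.0, 0.016, 0.032, 0.064], 1000.0):
        print(row)
```

In every row the sum of CPMG and compensation pulses is the same, which is the condition step 2 imposes to keep sample heating constant across the series.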
3.3. 15N–{1H} Nuclear Overhauser Effect Measurements
The sequence presented in Fig. 3 should be used for this experiment.
It displays a series of improvements introduced recently (25–27).
The two experiments (under effective proton saturation and at
equilibrium) have to be run in an interleaved manner.
1. Set up the saturation scheme (Fig. 3b). The saturation element
is symmetric, 1H pulses have a 180° flip angle, the carrier is placed
in the amide region (8.2 ppm), the interpulse delay tNOE should
be a multiple of 1/JNH, where JNH is the 1H–15N one-bond scalar
coupling constant. In globular proteins at low pH, tNOE = 22 ms
152 F. Ferrage
Fig. 3. Pulse sequence used for recording steady-state 15N–{1H} nuclear Overhauser effects. For each measurement, reference
and steady-state experiments have to be recorded in an interleaved manner. In the reference experiment, one should run
the part of the pulse program displayed in box (a). At the end of the recovery delay TNOE = 10 s (or more), the proton carrier
is placed on resonance with the water signal and a very selective water-flip back pulse is applied (3 ms sinc shaped or
longer). To record steady-state experiments, the boxed sequence in (a) is substituted by the scheme shown in (b) for the
effective saturation of amide proton resonances. After an optional delay T′NOE = 2 s for stable detection of the lock signal,
the proton carrier is placed in the center of the amide region (at 8.2 ppm) as shown by the arrow labeled N. The motif
[delay tNOE/2 – 180° pulse – delay tNOE/2] is repeated nNOE times. The interpulse delay, tNOE, is typically 22 ms (11 ms may
also be used, see text). The rf amplitude for the pulses should be 7.5 kHz at a 500 MHz Larmor frequency and 9 kHz at
600 MHz. A gradient G1 is applied at the end of the last tNOE/2 delay to suppress all transverse components of the proton
polarization. The carrier was moved on-resonance with the water signal as indicated by the W arrow. The number of cycles,
nNOE, was set so that the total duration for effective saturation was 4 s. All narrow (filled) and wide (open) rectangles represent 90°
and 180° pulses, respectively. Pulse phases are along the x-axis of the rotating frame unless otherwise mentioned. Proton
composite pulse decoupling during the delay t1 was performed with a GARP scheme and an rf amplitude of 1 kHz.
Composite-pulse decoupling during acquisition was performed on the 15N channel with a GARP scheme (38) and an rf
amplitude of 1,090 Hz. The delay ta is 2.56 ms. The phase cycles were: φ1 = {y, –y}; φ2 = {x, x, –x, –x}; φ3 = {x, x, –x, –x};
φ4 = {–y, –y, y, y}; φacq = {x, –x, –x, x}. The amplitude profile of the pulsed field gradient was a sine bell shape. The durations
and peak amplitudes over the x, y, and z orientations were, respectively: G1; 600 µs, 15 G/cm, 15 G/cm, 0; G2; 1 ms, 0, 0,
25 G/cm; G3; 1 ms, 0, 0, 40 G/cm; G4; 1 ms, 0, 0, 8.1 G/cm. Coherence selection was achieved by inverting the amplitude
of the gradient G3 and phase φ4.
is a safe value, but tNOE = 11 ms should be used in disordered
proteins or on high pH samples, where proton exchange with
the solvent is faster. The amplitude of these radiofrequency
pulses should be 1.5 × X kHz, where X is the 1H Larmor
frequency in MHz divided by 100 (e.g., 9 kHz on a 600 MHz
spectrometer).
2. Set up the duration of the saturation scheme. The total duration
of the saturation is linked to the 15N R1 rates, but not to the 1H
relaxation rates, since the present scheme ensures an effective
saturation of protons after each saturation element. If the
longitudinal relaxation experiments have been fully analyzed,
identify the lowest value of all 15N R1 rates. Set the value of nNOE
so that the total duration of the saturation scheme is 4/R1.
If only an estimate of the average longitudinal relaxation rates
R1av is available from a comparison of 1D spectra (steps 3–5 of
the protocol for the measurement of R1 rates; Subheading 3.2.1),
set the total duration of saturation to at least 6/R1av and
preferably 8/R1av, if time permits.
3. Set up the reference experiment (Fig. 3a). The soft water-flip
back pulse should be long (at least a 3 ms sinc-shaped pulse,
and longer for spectrometers with a Larmor frequency under
700 MHz) as it should not touch amide resonances closest to
the water resonance. The total duration of the recovery delay,
TNOE, should be long. In this case, the spin system should be at
equilibrium, including the protons. The use of a water-flip
back scheme permits the use of significantly shorter recovery
delays. In general, these delays should be no shorter than 10 s.
It is strongly advisable to spend up to an hour to compare the
intensity of the first 1D spectrum (obtained with t1 = 0) for
various values of TNOE and identify for which value of TNOE the
signal has reached its maximum value. Since the signal-to-noise
ratio will be too low to properly evaluate the difference between
97% and 100% of the signal, add at least 2 s to the value of TNOE
where the signal appears to have reached its maximum value.
4. The total number of scans should be at least 24 and up to 64. The
experiment should last between 12 and 24 h. If longer acquisition
times are necessary, reduce the number of scans for each experiment
and run a series of experiments no longer than 24 h.
The number of dummy scans should not be greater than 64.
(Subheading 3.2.1) to obtain intensities.
6. Compute the NOE ratio (Eq. 6). Although it may not be
necessary for the following analysis, derive the dipolar cross-
relaxation rate from the NOE ratio using Eq. 6.
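The arithmetic of steps 1 and 2 can be sketched as follows. This is a minimal illustration with hypothetical function names: the rf amplitude follows the rule 1.5 × (1H Larmor frequency in MHz / 100) kHz, and the number of cycles nNOE is the smallest integer for which nNOE × tNOE reaches the requested multiple of 1/R1 (4 for the lowest measured R1; 6 to 8 when only the average R1av is known). The small tolerance in the rounding is an implementation detail added here to guard against floating-point noise.

```python
import math


def saturation_rf_khz(larmor_mhz):
    """rf amplitude (kHz) of the 1H 180° saturation pulses for a given 1H Larmor frequency (MHz)."""
    return 1.5 * larmor_mhz / 100.0


def n_noe_cycles(r1_per_s, t_noe_s=0.022, factor=4.0):
    """Smallest number of cycles such that n * t_noe_s >= factor / r1_per_s."""
    total_saturation_s = factor / r1_per_s
    # Subtract a tiny tolerance so that exact multiples are not rounded up by one.
    return math.ceil(total_saturation_s / t_noe_s - 1e-9)


if __name__ == "__main__":
    print(saturation_rf_khz(600.0))         # amplitude in kHz at 600 MHz
    n = n_noe_cycles(1.0)                   # lowest R1 = 1.0 /s, tNOE = 22 ms
    print(n, n * 0.022)                     # cycles and resulting saturation time (s)
    print(n_noe_cycles(1.5, factor=6.0))    # only an average R1av is available
```

Passing `t_noe_s=0.011` covers the shorter interpulse delay recommended for disordered proteins or high pH samples.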
3.4. Measurements of CSA/DD Cross-Correlated Cross-Relaxation Rates
These experiments are less commonly run in spite of their great
utility. The setup turns out to be straightforward after more typical
relaxation experiments have been run.
1. Set pulse durations and amplitudes according to the spectrometer
specifications, the calibration and the description of the pulse
sequence, which is shown in Fig. 4. Set delays according to
the pulse sequence. Calibrate proton shaped pulses according
to the spectrometer-specific protocol. Most of this part has
already been set up in the preceding experiments.
2. For both experiments, the minimum number of scans imposed
by the phase cycle is higher, leading to very long experimental
times when high resolution data needs to be collected (large
number of t1 points). On the other hand, shorter recycle
delays can be used, typically between 1 and 1.5 s with little loss
of sensitivity and no loss of accuracy.
3. These experiments use symmetrical reconversion (15, 28). This
means that all four relaxation pathways in a two-operator space
are detected. These four experiments have to be run in an
Fig. 4. Pulse sequences for measuring CSA/DD cross-correlated cross-relaxation rates. (a) General scheme; (b) relaxation
block for measuring the transverse cross-relaxation rate; (c) relaxation block for measuring the longitudinal cross-relaxation
rate. Only specific elements are detailed here. The delay δ is equal to the shortest value of t1 so that the first effective value
of t1 is zero. The phase cycles were: φ1 = {y, –y}; φ2 = {x, x, x, x, –x, –x, –x, –x}; φ3 = {x, x, x, x, x, x, x, x, –x, –x, –x, –x, –x,
–x, –x, –x}; φ4 = {x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, –x, –x, –x, –x, –x, –x, –x, –x, –x, –x, –x, –x, –x, –x, –x, –x}; φ5 = {x};
φ6 = {x, x, –x, –x}; φ′6 = {x, x, y, y}; φacq = {x, –x, –x, x, –x, x, x, –x, –x, x, x, –x, x, –x, –x, x, –x, x, x, –x, x, –x, –x, x, x, –x, –x,
x, –x, x, x, –x}. Gradient durations and peak amplitudes over the x, y, and z orientations were, respectively: G1; 600 µs, 15 G/cm,
15 G/cm, 0; G2; 600 µs, 6.5 G/cm, 0, 0; G3; 2 ms, 0, 0, 40 G/cm; G4; 1 ms, –9.5 G/cm, 9.5 G/cm, 0 G/cm; G5; 600 µs,
3.5 G/cm, 0 G/cm, 14.5 G/cm; G6; 1 ms, –35 G/cm, –35 G/cm, –35 G/cm; G7; 1 ms, 35 G/cm, 35 G/cm, 35 G/cm. Frequency
sign discrimination was performed using States-TPPI.
interleaved manner (see Note 8). Remember to set the number
of points in the indirect dimension to the proper value (i.e.,
eight times the number of complex time points).
3.4.1. Transverse Cross-Correlated Cross-Relaxation Rates
1. Set up the relaxation delay(s). The limiting factor for sensitivity
is the intensity in the cross-relaxation experiments (pathways II
and III in Fig. 4a). This intensity is maximum when the relaxation
delay is equal to the auto-relaxation time. The optimal
relaxation delay typically lies between 30 and 80 ms. Explore
this interval in 20 ms steps. The intensity versus time curve is
usually flat around the maximum. When two relaxation delays
appear to give the same maximum intensity, first check a relax-
ation delay at the midpoint between the two. If the intensity is
better, keep this delay. If the intensity is the same, choose the
shortest delay so the intensity in the auto-relaxation experi-
ments will be better.
2. This experiment can be run with a single relaxation delay. If time
permits, repeat the optimal relaxation delay and run another delay
with sufficient intensity on the cross-relaxation experiments.
3. Follow steps 9–14 of the longitudinal relaxation protocol
(Subheading 3.2.1) to obtain the intensities of the four interleaved
experiments. The following quantity should be computed:
$S(4T) = \sqrt{\frac{I_{II}(4T)\, I_{III}(4T)}{I_{I}(4T)\, I_{IV}(4T)}}$. (13)
$S(4T) = \tanh\left(\delta_N^{t}\, 4T\right)$. (14)
4. If only one relaxation delay is recorded, simply invert Eq. 14.
If several time points are recorded, compile the values of S and
the corresponding errors in a text file and follow steps 15 and
16 of the longitudinal relaxation protocol (Subheading 3.2.1).
The header of each N.in file should set the hyperbolic tangent
mode for the fit to evaluate $\delta_N^{t}$.
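For a single relaxation delay, the inversion of Eq. 14 mentioned in step 4 can be sketched as below. The function names are hypothetical, and the sketch assumes that S is taken as the square root of the intensity ratio, which is the form for which the hyperbolic tangent dependence holds in symmetrical reconversion. Error propagation and the multi-point fit are deliberately left out.

```python
import math


def symmetrical_reconversion(i_I, i_II, i_III, i_IV):
    """Observable S built from the intensities of the four interleaved experiments."""
    ratio = (i_II * i_III) / (i_I * i_IV)
    if not 0.0 <= ratio < 1.0:
        # Outside this range the hyperbolic tangent cannot be inverted.
        raise ValueError("intensity ratio %.3f is outside [0, 1)" % ratio)
    return math.sqrt(ratio)


def cross_relaxation_rate(s, total_delay_s):
    """Invert S = tanh(rate * total_delay); total_delay_s is the full relaxation delay (4T)."""
    return math.atanh(s) / total_delay_s


if __name__ == "__main__":
    # Hypothetical intensities for one residue and a total relaxation delay of 50 ms.
    s = symmetrical_reconversion(4.0, 1.0, 1.0, 1.0)
    print(s, cross_relaxation_rate(s, 0.05))
```

Because S is a ratio, a common scaling of the four spectra cancels, which is why the four experiments must be recorded in an interleaved manner under identical conditions.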
3.4.2. Longitudinal Cross-Correlated Cross-Relaxation Rates
1. Set up the relaxation delay(s). The constraints are similar to
those for the transverse cross-relaxation rate (Subheading 3.4.1).
A difference is that the cross-relaxation experiments are now
numbers I and IV because of the conversion between longitudinal
polarization and two-spin order in the middle of the cross-relaxation
delay. The optimal delay should be found between 80
and 200 ms. However, particularly in nondeuterated proteins,
there is a strong advantage to using shorter delays. If not,
one has to calculate a correction for the rates that takes into
account the effects of proton–proton cross-relaxation (15). In that
case, at least two well-separated time points should be recorded.
2. This experiment can be run with a single relaxation delay if it is
short and/or the protein is deuterated. To make sure that the
correction is small, it is advisable to run two experiments with
different delays.
3. Follow steps 3–4 for the transverse cross-relaxation protocol
(Subheading 3.4.1). Figure 5 shows the build-up of the
symmetrical reconversion observable for longitudinal cross-
correlated cross-relaxation. These data were recorded with a
small deuterated protein, thus allowing the use of long cross-
relaxation delays.
4. If two or more relaxation delays are recorded on a nondeuterated
sample, evaluate the correction factor, $\Delta$, for each time point,
and compute the ratio:
$I_{IV} / I_{I} = C \exp(-\Delta\, 4T) / \left(1 + 32\, (\delta_N^{l})^{2}\, \Delta\, T^{3} / 3\right)$. (15)
From these ratios, fit the value of $\Delta$. In most cases, evaluate the
factor $32\, (\delta_N^{l})^{2}\, \Delta\, T^{3} / 3$; it should be much smaller than 1 so that
$\Delta$ can be obtained from a simple exponential fit.
5. Compute for each time point:
$\delta_N^{l}\, T = \mathrm{atanh}\left(\sqrt{\frac{I_{I}\, I_{IV}}{I_{II}\, I_{III}}}\right) / \left(1 + 2\Delta^{2} T^{2} / 3\right)$. (16)
Proceed to a linear fit to obtain $\delta_N^{l}$.
3.5. Analysis and Interpretation of 15N Relaxation Rates
The analysis of this dataset closely follows the one presented by
Kroenke et al. (14). The use of more accurate measurements of
nuclear Overhauser effects and longitudinal CSA/DD cross-correlated
cross-relaxation rates as well as the comparison of transverse relaxation
rates measured under a CPMG echo train and a single echo are the
main improvements.
Fig. 5. Example of a longitudinal cross-correlated cross-relaxation build-up curve as shown
by Grace during the Curvefit procedure.
3.5.1. Identification of Chemical Exchange
1. Compute the transverse relaxation rate expected in the absence
of exchange, $R_2^{0}$, for each backbone 15N nucleus:
$R_2^{0} = (R_1 - 1.25\,\sigma_{NH})\, \delta_N^{t} / \delta_N^{l} - 1.08\,\sigma_{NH}$. (17)
2. Derive the exchange contribution to the transverse relaxation
rate measured with a single echo:
$R_{ex}^{echo} = R_2^{echo} - R_2^{0}$. (18)
3. Derive the exchange contribution to the transverse relaxation
rate measured with a CPMG sequence:
$R_{ex}^{CPMG} = R_2^{CPMG} - R_2^{0}$. (19)
4. Figure 6 shows, as an example, a plot of RexCPMG for Calbindin
D9k at 296 K. All but one residue show contributions of exchange
to transverse relaxation rates during CPMG that are not significantly
larger than 1/s. This confirms that motions of the backbone of
Calbindin D9k are very limited on timescales in the 10 µs/ms
range. Only L6 and T45 show small but significant contributions
of chemical exchange to transverse relaxation. RexCPMG
should never be negative, so the presence of a few values around
−0.7/s (with an expected error smaller than 0.2/s) indicates
that the accuracy of the method is not as good as its precision.
In this particular case, the use of a carbon-13 labeled sample
makes the derivation of R20 less accurate since contributions of
15N–13C dipolar interactions to 15N auto-relaxation also have to
be taken into account and predicted.
5. The comparison of RexCPMG and Rexecho provides insights on the
order of magnitude of the timescales of the exchange processes.
If a nonzero value of Rexecho is determined while RexCPMG is zero,
the exchange process is slower than 1 ms. If both values are
identical, the exchange process is faster than 1 ms. If RexCPMG is
smaller than Rexecho but significantly larger than zero, the timescale
of the exchange process is close to 1 ms or two exchange
processes take place, one faster and one slower than 1 ms. Note
that RexCPMG should never be larger than Rexecho.
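The decision rules of step 5 can be written down as a small classifier. This is only a sketch: the function names and the tolerance `tol` (in 1/s), used to decide when a contribution counts as zero or when two values count as identical, are hypothetical and should be chosen from the experimental uncertainties. The exchange-free rate R20 is taken as an input rather than recomputed.

```python
def exchange_contribution(r2, r20):
    """Exchange contribution Rex = R2 - R20, for either the single echo or the CPMG rate."""
    return r2 - r20


def classify_exchange(rex_echo, rex_cpmg, tol=0.5):
    """Qualitative timescale of exchange from the two contributions (all values in 1/s)."""
    echo_nonzero = rex_echo > tol
    cpmg_nonzero = rex_cpmg > tol
    if not echo_nonzero and not cpmg_nonzero:
        return "no exchange detected"
    if rex_cpmg - rex_echo > tol:
        # RexCPMG should never exceed Rexecho; flag the residue for inspection.
        return "inconsistent: RexCPMG larger than Rexecho"
    if echo_nonzero and not cpmg_nonzero:
        return "slower than 1 ms"
    if abs(rex_echo - rex_cpmg) <= tol:
        return "faster than 1 ms"
    return "close to 1 ms, or two processes (one faster, one slower than 1 ms)"


if __name__ == "__main__":
    r20 = 10.0  # hypothetical exchange-free rate, in 1/s
    rex_echo = exchange_contribution(15.0, r20)
    rex_cpmg = exchange_contribution(12.0, r20)
    print(rex_echo, rex_cpmg, classify_exchange(rex_echo, rex_cpmg))
```

Slightly negative values, such as those discussed in step 4, fall into the "no exchange detected" class as long as they stay below the tolerance.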
3.5.2. A Note on the Model-Free Analysis of 15N Relaxation Rates
The ensemble of rates can be used as the input for a model-free
analysis (29). Several programs are available: modelfree (30), fast
modelfree (31), tensor2 (32), and dynamics (33) are good examples.
I do not describe the use of such software here. The only point
discussed is the choice of the ensemble of relaxation rates that
should be employed. The above-mentioned programs are designed
to use the longitudinal relaxation rate R1, the transverse relaxation
rate R2CPMG, and the NOE ratio as inputs. In most cases, these rates
should be used. One of the most difficult tasks of the analysis is the
detection of an exchange contribution to transverse relaxation
rates R2CPMG. When a nonzero contribution RexCPMG is identified,
the rates are at best fit with a simpler model for fast motions. When
a protein shows significant contributions of chemical exchange to
the relaxation of many 15N nuclei, this can be detrimental to the
overall quality of the analysis of hydrodynamic properties and fast
local motions. The analysis described herein provides exchange-free
transverse relaxation rates R20. If, and only if, the ensemble of
RexCPMG rates shows a very flat baseline around zero (see for instance
ref. 15), the R20 rates can be employed instead of the R2CPMG rates
in the model-free analysis.
Fig. 6. Contribution of chemical exchange processes to transverse relaxation in a single
echo experiment RexCPMG in Calbindin D9k at 296 K.
4. Notes
1. Note that since the gyromagnetic ratio of 15N is negative, these
two frequencies have opposite signs. Keeping this in mind
helps understand the analysis of relaxation rates.
2. Particular care should be given to the sample conditions for
relaxation studies. The stability of the sample is very important
since data are collected as a series of experiments over several days
(or possibly more) and will be used together for the analysis.
We advise against running such a series of experiments on
samples with very short (2 days or less) lifetimes unless high
reproducibility can be achieved in sample preparation and
several samples can be used to collect the ensemble of data
discussed here. In order to ensure the quality of the data and a
safe interpretation of hydrodynamic properties and chemical
exchange, it is advised to verify the concentration dependence
of relaxation rates which can be affected by aggregation and
oligomerization. The transverse relaxation rate measured with
a single echo is very sensitive to such processes and can be used
to find the optimal concentration (34).
One of the common problems that can arise during the week-
long data collection is the appearance of bubbles, which can
degrade the field homogeneity and decrease the signal intensity,
especially when a Shigemi tube is being used (as bubbles will be
trapped in the detection volume). Samples should preferably be
degassed before recording long relaxation experiments.
3. The series of experiments described herein is most easily
adapted for small and middle-size proteins (less than 200
amino acids, monomeric). In most cases, the data are easily
recorded on any spectrometer (B0 ≥ 11.7 T) when a concentration
of about 1 mM can be used. When the samples are concentration
limited, the use of spectrometers equipped with
cryogenic probes should be preferred, particularly at low fields
(B0 ≤ 14.1 T). When using cryogenic probes, optimal homogeneity
of the magnetic field B0 should be reached in order to
ensure proper suppression of the water signal. Probes
equipped with triple-axis gradients enhance the water suppression
(35). When only z-axis gradients are available, the sign of
gradients should be chosen such that the phase of the water
transverse polarization builds up (36). For instance, two pulsed
field gradients separated in the pulse sequence by a single 180°
pulse on the proton channel should have opposite signs in the
manner of bipolar gradients.
4. Using a small angle to start helps reduce the effects of radiation
damping. It is also practical: since the signal is directly proportional
to the flip angle (modulo 360°), one can use the intensity
in this first experiment as a meter for the deviations of the following
pulses from 360°. For example, if the intensity in the first
attempt at a 360° pulse is positive and about twice as large as
that of the 1 µs pulse experiment, it is likely that a pulse 2 µs
shorter will be an exact 360° pulse.
5. If, as described here, only adiabatic pulses are used on the 13C
channel, pulse calibration on this channel is optional, since the
quality of the inversion by the adiabatic pulse will not be affected
by a small deviation of the rf amplitude from its nominal value.
6. Sample script for an exponential decay:
The input file is a text file made of 25 columns; the first
one is the peak/residue number while columns 2–25 are the intensity
and error for the 12 longitudinal relaxation experiments.
7. WALTZ-16 is not as good a decoupling scheme as GARP.
However, its basis element is much shorter. Therefore, when
very low amplitude decoupling is performed, or when short
durations are favored (like in the present case) WALTZ-16
should be used instead of GARP.
8. Since the four experiments for each relaxation delay are run in
an interleaved manner, there is no need to record different
relaxation delays in series. If doubts arise about the stability of
the sample, run one set for a given value of the relaxation delay.
Additional datasets for other relaxation delays (or repeats of
the same) can be acquired after the full series of experiments.
9. When working on a spectrometer with a different Larmor
frequency, all pulse durations should be scaled with the inverse
of the field (e.g., 1.6 ms at 600 MHz corresponds to 1.2 ms at
800 MHz). There are a few exceptions: high-power, 15N CPMG,
and 15N decoupling radiofrequency amplitudes usually do not
scale up with B0.
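The input layout described in Note 6 (one line per residue, 25 columns) can be read and fitted along the following lines. This is not the script referred to in that note, which is not reproduced here, and it does not emulate the fitting program used in the protocol. It is a hedged, standard-library sketch that replaces the nonlinear fit by a weighted straight-line fit of ln(I) against the relaxation delay, which is only valid for strictly positive intensities. The function names and the relaxation delays in the demonstration are hypothetical; the delays are passed separately since the file itself does not contain them.

```python
import math


def parse_line(line):
    """Split one 25-column line into (residue, intensities, errors)."""
    fields = line.split()
    if len(fields) != 25:
        raise ValueError("expected 25 columns, got %d" % len(fields))
    values = [float(field) for field in fields[1:]]
    # Columns 2-25 alternate intensity, error for the 12 experiments.
    return fields[0], values[0::2], values[1::2]


def fit_exponential(delays_s, intensities, errors):
    """Weighted log-linear fit of I(t) = I0 * exp(-R * t); returns (I0, R).

    Simplification: a straight line is fitted to ln(I) with weights (I/err)^2.
    """
    weights = [(i / e) ** 2 for i, e in zip(intensities, errors)]
    logs = [math.log(i) for i in intensities]
    sw = sum(weights)
    sx = sum(w * t for w, t in zip(weights, delays_s))
    sy = sum(w * y for w, y in zip(weights, logs))
    sxx = sum(w * t * t for w, t in zip(weights, delays_s))
    sxy = sum(w * t * y for w, t, y in zip(weights, delays_s, logs))
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    intercept = (sy - slope * sx) / sw
    return math.exp(intercept), -slope


if __name__ == "__main__":
    # Synthetic line: 12 hypothetical delays and a decay rate of 2 /s.
    delays = [0.02 * k for k in range(1, 13)]
    columns = ["42"]
    for t in delays:
        columns += ["%.10f" % (100.0 * math.exp(-2.0 * t)), "1.0"]
    residue, intensities, errors = parse_line(" ".join(columns))
    print(residue, fit_exponential(delays, intensities, errors))
```

The log-linear form is convenient for a quick check of the data; a proper nonlinear fit with error estimation remains preferable for the final rates.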
Acknowledgments
I am grateful to Geoffrey Bodenhausen, David Cowburn, Ranajeet

Ghose, Arthur G. Palmer, and Philippe Pelupessy for their many
contributions to my training, from hands-on practice to many
insightful discussions. I thank Mikael Akke for the sample of Calbindin
D9k and Kaushik Dutta for carefully reading this manuscript.
References
1. Mittermaier, A., and Kay, L. E. (2006) Review – New tools provide new insights in NMR studies of protein dynamics. Science 312, 224–228.
2. Palmer, A. G. (2004) NMR characterization of the dynamics of biomacromolecules. Chem. Rev. 104, 3623–3640.
3. Massi, F., Wang, C. Y., and Palmer, A. G. (2006) Solution NMR and computer simulation studies of active site loop motion in triosephosphate isomerase. Biochemistry 45, 10787–10794.
4. Tjandra, N., Feller, S. E., Pastor, R. W., and Bax, A. (1995) Rotational Diffusion Anisotropy of Human Ubiquitin from N-15 NMR Relaxation. J. Am. Chem. Soc. 117, 12562–12566.
5. Akke, M., Brüschweiler, R., and Palmer III, A. G. (1993) NMR Order Parameters and Free Energy: An Analytical Approach and Its Application to Cooperative Ca2+ Binding by Calbindin D9k. J. Am. Chem. Soc. 115, 9832–9833.
6. Frederick, K. K., Marlow, M. S., Valentine, K. G., and Wand, A. J. (2007) Conformational entropy in molecular recognition by proteins. Nature 448, 325–329.
7. Kuszewski, J., Gronenborn, A. M., and Clore, G. M. (1999) Improving the packing and accuracy of NMR structures with a pseudopotential for the radius of gyration. J. Am. Chem. Soc. 121, 2337–2338.
8. Ryabov, Y., and Fushman, D. (2007) Structural Assembly of Multidomain Proteins and Protein Complexes Guided by the Overall Rotational Diffusion Tensor. J. Am. Chem. Soc. 129, 7894–7902.
9. Ryabov, Y. E., and Fushman, D. (2007) A model of interdomain mobility in a multidomain protein. J. Am. Chem. Soc. 129, 3315–3327.
10. Palmer, A. G., and Massi, F. (2006) Characterization of the dynamics of biomacromolecules using rotating-frame spin relaxation NMR spectroscopy. Chem. Rev. 106, 1700–1719.
11. Palmer, A. G. (2004) NMR characterization of the dynamics of biomacromolecules. Chem. Rev. 104, 3623–3640.
12. Vallurupalli, P., Hansen, D. F., and Kay, L. E. (2008) Structures of invisible, excited protein states by relaxation dispersion NMR spectroscopy. Proc. Natl. Acad. Sci. U.S.A. 105, 11766–11771.
13. Kay, L. E., Torchia, D. A., and Bax, A. (1989) Backbone Dynamics of Proteins as Studied by N-15 Inverse Detected Heteronuclear NMR-Spectroscopy – Application to Staphylococcal Nuclease. Biochemistry 28, 8972–8979.
14. Kroenke, C. D., Loria, J. P., Lee, L. K., Rance, M., and Palmer III, A. G. (1998) Longitudinal and Transverse H-1-N-15 Dipolar N-15 Chemical Shift Anisotropy Relaxation Interference: Unambiguous Determination of Rotational Diffusion Tensors and Chemical Exchange Effects in Biological Macromolecules. J. Am. Chem. Soc. 120, 7905–7915.
15. Pelupessy, P., Ferrage, F., and Bodenhausen, G. (2007) Accurate Measurement of Longitudinal Cross-Relaxation Rates in Nuclear Magnetic Resonance. J. Chem. Phys. 126, 134508.
16. Korzhnev, D. M., Billeter, M., Arseniev, A. S., and Orekhov, V. Y. (2001) NMR Studies of Brownian Tumbling and Internal Motions in Proteins. Prog. Nucl. Magn. Reson. Spectrosc. 38, 197–266.
17. Luginbuhl, P., and Wuthrich, K. (2002) Semiclassical nuclear spin relaxation theory revisited for use with biological macromolecules. Prog. Nucl. Magn. Reson. Spectrosc. 40, 199–247.
18. Nicholas, M. P., Eryilmaz, E., Ferrage, F., Cowburn, D., and Ghose, R. (2010) Nuclear spin relaxation in isotropic and anisotropic media. Prog. Nucl. Magn. Reson. Spectrosc. 57, 111–158.
19. Cavanagh, J., Fairbrother, W. J., Palmer III, A. G., Rance, M., and Skelton, N. J. (2006) Protein NMR Spectroscopy: Principles and practice, Academic Press, San Diego.
20. Wang, L. C., Pang, Y. X., Holder, T., Brender, J. R., Kurochkin, A. V., and Zuiderweg, E. R. P. (2001) Functional dynamics in the active site of the ribonuclease binase. Proc. Natl. Acad. Sci. U.S.A. 98, 7684–7689.
21. Wang, C. Y., and Palmer, A. G. (2003) Solution NMR methods for quantitative identification of chemical exchange in N-15-labeled proteins. Magn. Reson. Chem. 41, 866–876.
22. Kempf, J. G., and Loria, J. P. (2004) Measurement of Intermediate Exchange Phenomena. Meth. Mol. Biol. 278, 185–231.
23. Findeisen, M., Brand, T., and Berger, S. (2007) A H-1-NMR thermometer suitable for cryoprobes. Magn. Reson. Chem. 45, 175–178.
24. Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A. (1995) NMRPipe: a Multidimensional Spectral Processing System Based on UNIX Pipes. J. Biomol. NMR 6, 277–293.
25. Ferrage, F., Piserchio, A., Cowburn, D., and Ghose, R. (2008) On the measurement of N-15-{H-1} nuclear Overhauser effects. J. Magn. Reson. 192, 302–313.
26. Ferrage, F., Cowburn, D., and Ghose, R. (2009) Accurate Sampling of High-Frequency Motions in Proteins by Steady-State 15N-{1H} Nuclear Overhauser Effect Measurements in the Presence of Cross-Correlated Relaxation. J. Am. Chem. Soc. 131, 6048–6049.
27. Ferrage, F., Reichel, A., Battacharya, S., Cowburn, D., and Ghose, R. (2010) On the measurement of N-15-{H-1} nuclear Overhauser effects. 2. Effects of the saturation scheme and water signal suppression. J. Magn. Reson. 207, 294–303.
28. Pelupessy, P., Espallargas, G. M., and Bodenhausen, G. (2003) Symmetrical reconversion: measuring cross-correlation rates with enhanced accuracy. J. Magn. Reson. 161, 258–264.
29. Lipari, G., and Szabo, A. (1982) Model-Free Approach to the Interpretation of Nuclear Magnetic Resonance Relaxation In Macromolecules 1. Theory and Range of Validity. J. Am. Chem. Soc. 104, 4546–4559.
30. Mandel, A. M., Akke, M., and Palmer III, A. G. (1995) Backbone Dynamics of Escherichia coli Ribonuclease HI: Correlations with Structure and Function in an Active Enzyme. J. Mol. Biol. 246, 144–163.
31. Cole, R., and Loria, J. P. (2003) FAST-Modelfree: A program for rapid automated analysis of solution NMR spin-relaxation data. J. Biomol. NMR 26, 203–13.
32. Dosset, P., Hus, J. C., Blackledge, M., and Marion, D. (2000) Efficient analysis of macromolecular rotational diffusion from heteronuclear relaxation data. J. Biomol. NMR 16, 23–28.
33. Fushman, D., Cahill, S., and Cowburn, D. (1997) The main chain dynamics of the dynamin pleckstrin homology (PH) domain in solution: analysis of 15N relaxation with monomer/dimer equilibration. J. Mol. Biol. 266, 173–194.
34. Butterwick, J. A., Loria, J. P., Astrof, N. S., Kroenke, C. D., Cole, R., Rance, M., and Palmer, A. G. (2004) Multiple time scale backbone dynamics of homologous thermophilic and mesophilic ribonuclease HI enzymes. J. Mol. Biol. 339, 855–871.
35. Sarkar, R., Moskau, D., Ferrage, F., Vasos, P. R., and Bodenhausen, G. (2008) Single or triple gradients? J. Magn. Reson. 193, 110–118.
36. Muhandiram, D. R., Yamazaki, T., Sykes, B. D., and Kay, L. E. (1995) Measurement of 2H T1ρ Relaxation Times in Uniformly 13C-Labeled and Fractionally 2H-Labeled Proteins in Solution. J. Am. Chem. Soc. 117, 11536–11544.
37. Bohlen, J. M., and Bodenhausen, G. (1993) Experimental Aspects of Chirp NMR-Spectroscopy. J. Magn. Reson. A 102, 293–301.
38. Shaka, A. J., Barker, P. B., and Freeman, R. (1985) Computer-Optimized Decoupling Scheme for Wideband Applications and Low-Level Operation. J. Magn. Reson. 64, 547–552.
39. Emsley, L., and Bodenhausen, G. (1990) Gaussian Pulse Cascades-New Analytical Functions for Rectangular Selective Inversion and In-phase Excitation in NMR. Chem. Phys. Lett. 165, 469–476.
40. Shaka, A. J., Keeler, J., Frenkiel, T., and Freeman, R. (1983) An Improved Sequence for Broad Band Decoupling – WALTZ-16. J. Magn. Reson. 52, 335–38.
Chapter 10
Bacterial Production and Solution NMR Studies of a Viral Membrane Ion Channel
Jolyon K. Claridge and Jason R. Schnell
Abstract
Advances in solution nuclear magnetic resonance (NMR) methodology that enable studies of very large
proteins have also paved the way for studies of membrane proteins that behave like large proteins due to
the added weight of surfactants. Solution NMR has been used to determine the high-resolution structures
of several small membrane proteins dissolved in detergent micelles and small bicelles. However, the usual
difficulties with membrane proteins in producing, purifying, and stabilizing the proteins away from native
membranes remain, requiring intensive screening efforts. Low levels of heterologous expression can be the
most detrimental aspect to studying membrane proteins. This is exacerbated for NMR studies because of
the costs of isotopically enriched media. Thus, solution NMR studies have tended to focus on relatively
small, membrane proteins that can be expressed into inclusion bodies and refolded. Here, we describe
the methods used to produce, purify, and refold the proton channel M2 into detergent micelles, and the
procedures used to determine chemical shift assignments and the atomic level structure of the closed form
of the homotetrameric channel.
Key words: Membrane proteins, Ion channel, NMR, M2
1. Introduction
Several difficulties arise in applying solution nuclear magnetic resonance
(NMR) to the study of membrane proteins. Foremost among the
difficulties are obtaining sufficient quantities of pure and natively
folded membrane proteins at a reasonable cost, adverse effects of
increased relaxation rates from complexes of protein and detergent
micelles or detergent and lipid bicelles, limited lifetime of natively
folded membrane proteins outside of lipid bilayers, and cost and
availability of deuterated detergents. The requirement that the native
structure be stable for days or months remains a general problem
for structural studies of membrane proteins and largely relies on
trial and error to overcome. Access to relatively inexpensive deuterated
detergents facilitates collection of information-rich methyl-based
experiments, and in some cases it is necessary also to prevent rapid
spin diffusion in backbone amide-based NOESY experiments.
Increases in transverse relaxation rates with increasing protein
size often mean that structure determination to high resolution by
solution NMR becomes very difficult for proteins with rotational
correlation times of more than ~20 ns, unless extraordinary efforts
are made. In general, this limit corresponds to protein sizes of
~35 kDa for water-soluble proteins at 30°C. However, studies of
membrane proteins become challenging at much lower sizes
because of the additional mass of the detergent micelles, which
might double or triple the effective tumbling times. The introduc-
tion of TROSY-based pulse sequences (1, 2) and methyl-based
approaches (3) that have been used extensively for much larger
water-soluble proteins greatly benefit studies of small- and medium-
sized membrane proteins (4, 5) as well.
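The effect of the detergent on the accessible size range can be made concrete with a back-of-the-envelope estimate. The sketch below is an illustration only: it assumes that the rotational correlation time scales linearly with mass, anchored at the figures quoted above (~20 ns for ~35 kDa at 30°C), and it treats the micelle as a simple multiplicative factor of 2 to 3. The function name and the example mass are hypothetical.

```python
def effective_tau_c_ns(protein_kda, micelle_factor=1.0,
                       ref_tau_ns=20.0, ref_mass_kda=35.0):
    """Rough rotational correlation time (ns), assuming linear scaling with mass.

    micelle_factor is the assumed slowdown caused by the detergent micelle
    (1.0 for a water-soluble protein, roughly 2 to 3 in a micelle).
    """
    return ref_tau_ns * protein_kda / ref_mass_kda * micelle_factor


if __name__ == "__main__":
    for factor in (1.0, 2.0, 3.0):
        # A hypothetical 12 kDa membrane protein with increasing micelle contributions.
        print(factor, round(effective_tau_c_ns(12.0, factor), 1))
```

Under these assumptions, a protein well below the 35 kDa mark already reaches the ~20 ns regime once the micelle triples its tumbling time.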
A common difficulty in studying membrane proteins by solution
NMR (or, indeed, by any other method) arises from the inability
to produce the proteins in suitable quantities. For NMR, the high
unit cost of culture media enriched in the rare isotopes needed for
multidimensional NMR studies prohibits the large-scale cultures
(>10 L) common for crystallographic studies. Although isotopic
labeling in eukaryotic systems (6, 7) or via cell-free methods (8) has
been reported, the predominant method by which isotopically
labeled proteins are produced remains by expression in bacteria.
Using such methods, both the number and size of membrane
proteins that have been solved by solution NMR have grown rapidly
in the past few years (9, 10). Several of these studies expressed the
proteins into bacterial membranes (e.g., refs. 3–5, 11); however,
a larger trend has been to use solution NMR to study relatively
small membrane proteins that can be refolded from bacterial inclusion
bodies. Membrane proteins containing a small number of
transmembrane helices, which are more likely to fold in vitro,
appear to be abundant in genomes (12, 13).
Expression into inclusion bodies may happen spontaneously or
by fusing the target protein to one that is efficiently “packaged”
into inclusion bodies. One such fusion protein is “trpLE” (14),
which has been used to express and purify several hydrophobic
peptides: Vpu from HIV (15), transmembrane regions of GPCRs
(16), caveolins (17), and several others (18–20) (see Note 1).
TrpLE contains the leader sequence of the trp operon of Escherichia
coli fused to a sequence of 97 residues found near the C-terminus
of the anthranilate synthase gene (21, 22). Fusion to trpLE directs
the protein to inclusion bodies, with the advantages of reduced
toxicity and protease resistance (23). In the trpLE construct,
cysteines have been mutated to alanines to prevent disulfide bond
formation, and methionines have been mutated to leucine to allow
for the introduction of a single CNBr cleavage site at the fusion
junction (14).
In this chapter, we outline the approach used to produce,
purify, and determine the structure of the M2 proton channel from
influenza A in detergent micelles (24). M2 is one of the smallest
ion channels known, being assembled from a homotetramer of a
single-pass membrane protein. However, it is surprisingly sophisticated,
being pH gated and highly proton selective. M2 from past circulating
strains is the target of a class of antiviral drugs, the
adamantanes, and understanding drug resistance may help in developing
drugs that are again effective. We describe how M2 is expressed
at high yield into bacterial inclusion bodies as a trpLE fusion,
chemically cleaved, purified, and refolded into detergent micelles
to provide high-resolution NMR spectra of the native homote-
tramer (24) (Fig. 1).
Fig. 1. An overview of the homotetrameric structure of the high pH, closed M2 proton
channel. Alpha helices are shown as cylinders. An idealized schematic of the DHPC micelle
is shown as a single layer of detergent molecules with the hydrocarbon chains coating the
transmembrane domain. The positioning of the micelle is based on the absence of cross
peaks in amide backbone strips at the resonance frequency of water in an 15N-edited
NOESY (24).
2. Materials
2.1. Expression of the TrpLE-M2 Fusion Protein
● Expression strain of E. coli: BL21(DE3) pLysS cells (Novagen).
● LB agar plates: 10 mg/mL tryptone, 5 mg/mL yeast extract,
10 mg/mL NaCl, and 16 mg/mL agar. Adjust pH to 7.0, if
necessary, with NaOH or HCl and autoclave. Add 100 μM
kanamycin and 100 μM chloramphenicol just before solidification
begins, and pour into plates.
● LB medium: 10 mg/mL tryptone, 5 mg/mL yeast extract,
and 10 mg/mL NaCl. Adjust pH to 7.0, if necessary, with
NaOH or HCl and autoclave.
● M9 medium: 42 mM Na2HPO4, 22 mM KH2PO4, 20 mM
NH4Cl, 10 mM NaCl, 22 mM glucose, 2 mM MgSO4, 0.1 mM
CaCl2, 1× MEM vitamin solution (10 mL of a 100× stock
purchased from Sigma–Aldrich), 100 μM kanamycin, and
100 μM chloramphenicol. The salts (Na2HPO4, KH2PO4,
NH4Cl, and NaCl) are first dissolved in 990 mL of distilled
and 0.22-μm filtered water and autoclaved, followed by addition
of the remaining components. For production of 15N-labeled
protein, uniformly 15N-labeled ammonium chloride is substituted.
For 13C-labeled protein, 11 mM of uniformly 13C-labeled
glucose is substituted. For partial deuteration, the M9 solutes
are dissolved in 99.8% 2H2O and sterile filtered through 0.2-μm
filter flasks, rather than autoclaved.
● 1 M DTT: Prepare stock solution in distilled, filtered water.
● 10% SDS: Prepare by adding 1 g of SDS to 10 mL of distilled,
filtered water.
● NuPAGE Novex 12% Bis–Tris Gel (Invitrogen).
● 4× NuPAGE LDS Sample Buffer (Invitrogen).
● NuPAGE MES SDS Running Buffer (Invitrogen).
● 1 M IPTG: Prepare stock solution in distilled, filtered water.
● 8 M Urea: Prepare stock solution in distilled, filtered water.
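The M9 recipe above is given as final concentrations, so the masses to weigh out follow from mass = concentration × molar mass × volume. The sketch below is only a convenience calculation: the molar masses are those of the anhydrous compounds (an assumption, since the source does not say which hydrates are used), and the function names are illustrative.

```python
# Molar masses (g/mol) of the anhydrous compounds. If a hydrate is used
# (e.g., MgSO4 heptahydrate), substitute its molar mass instead.
MOLAR_MASS = {
    "Na2HPO4": 141.96,
    "KH2PO4": 136.09,
    "NH4Cl": 53.49,
    "NaCl": 58.44,
    "glucose": 180.16,
    "MgSO4": 120.37,
    "CaCl2": 110.98,
}

# Final concentrations (mM) as listed for the unlabeled M9 medium.
M9_MILLIMOLAR = {
    "Na2HPO4": 42, "KH2PO4": 22, "NH4Cl": 20, "NaCl": 10,
    "glucose": 22, "MgSO4": 2, "CaCl2": 0.1,
}


def grams_needed(millimolar, molar_mass, liters=1.0):
    """Mass (g) = concentration (mol/L) x molar mass (g/mol) x volume (L)."""
    return millimolar / 1000.0 * molar_mass * liters


def m9_recipe(liters=1.0):
    """Return {component: grams} for the requested culture volume."""
    return {name: round(grams_needed(mM, MOLAR_MASS[name], liters), 2)
            for name, mM in M9_MILLIMOLAR.items()}


if __name__ == "__main__":
    for name, grams in m9_recipe(1.0).items():
        print(f"{name:8s} {grams:6.2f} g per liter")
```

Isotopically labeled NH4Cl and glucose are slightly heavier than their natural-abundance forms, so their molar masses should be taken from the supplier's certificate rather than from this table.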
2.2. Fusion Protein Purification, Cleavage, and Preparation of Pure M2
● Lysis buffer: 50 mM Tris–HCl, pH 8.0, 200 mM NaCl.
● Guanidine buffer: 20 mM Tris–HCl, pH 8.0, 200 mM NaCl,
6 M guanidine HCl, and 15 mM imidazole.
● Elution buffer: 20 mM Tris–HCl, pH 7.0, 200 mM NaCl,
6 M guanidine HCl, and 400 mM imidazole.
● Reverse-phase C4 column for HPLC: purchased from Grace-Vydac
(214TP C4, 300 Å silica, 5 μm beads, 2.1 × 150 mm).
● Hexafluoroisopropanol: ≥98% purity.
● Formic acid: ≥98% purity.
● Syringe filter: 0.2 μm polytetrafluoroethylene.
● HPLC buffer A: 5% isopropanol, 95% water, 0.1% trifluoroacetic
acid.
● HPLC buffer B: 57% isopropanol, 38% acetonitrile, 5% water,
0.1% trifluoroacetic acid.
● Ni-NTA agarose beads.
● Cyanogen bromide: Solid, ≥98.5% purity.
● Dialysis tubing: 10 kDa MWCO, 22-mm internal diameter.
● Dialysis cassette with 3.5 kDa MWCO.
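Because the two HPLC buffers differ in three solvents at once, it can help to translate a "% buffer B" value into the actual solvent composition. The following is a minimal sketch that assumes ideal volume-additive mixing and ignores the 0.1% trifluoroacetic acid present in both buffers; the gradient length passed to `percent_b_at` is a hypothetical parameter, since the chapter does not state one.

```python
# Volume-percent composition of HPLC buffers A and B as listed above
# (the 0.1% trifluoroacetic acid common to both is left out).
BUFFER_A = {"isopropanol": 5.0, "acetonitrile": 0.0, "water": 95.0}
BUFFER_B = {"isopropanol": 57.0, "acetonitrile": 38.0, "water": 5.0}


def composition_at(percent_b):
    """Solvent composition (vol %) of an A/B mixture containing percent_b of B."""
    if not 0.0 <= percent_b <= 100.0:
        raise ValueError("percent_b must lie between 0 and 100")
    frac_b = percent_b / 100.0
    return {solvent: (1.0 - frac_b) * BUFFER_A[solvent] + frac_b * BUFFER_B[solvent]
            for solvent in BUFFER_A}


def percent_b_at(time_min, gradient_min):
    """%B at time_min for a linear 0-100% B gradient lasting gradient_min."""
    return 100.0 * min(max(time_min / gradient_min, 0.0), 1.0)


if __name__ == "__main__":
    # Composition at the approximate elution point of the cleaved peptide.
    print(composition_at(75.0))
```

For example, `composition_at(75.0)` corresponds to the approximate elution point given for the cleaved M2 peptide in Subheading 3.2.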
2.3. Reconstitution of M2 into Detergent Micelles
● Reconstitution buffer: 50 mM sodium phosphate, pH 7.5,
6 M guanidine HCl, 0.3 M dihexanoyl-sn-glycero-3-phosphocholine
(DHPC) detergent, 30 mM sodium glutamate.
Adjust pH with NaOH.
● NMR buffer: 50 mM sodium phosphate, pH 7.5, 30 mM
sodium glutamate. Adjust pH with NaOH.
● Deuterium oxide: 98% purity.
● Rimantadine: 0.3 M stock prepared in 80 mM DHPC.
● Mini dialysis cups for protein refolding: 3.5 kDa MWCO,
10–100 μL capacity (Thermo Scientific).
● Centrifugal concentrator: 5 kDa MWCO.
2.4. NMR Chemical Shift Assignments and Restraint Measurements
● All NMR experiments are conducted on a 14.1 T spectrometer
equipped with a cryogenic probe.
● Processing and preliminary analysis of data, peak fitting, and
extraction of intensities for determining scalar bond couplings
are performed using NMRPipe (25).
● Resonance assignments and quantitation of NOE cross-peak
intensities are performed using CARA (26).
● Prediction of backbone dihedral angles from chemical shifts is
performed using TALOS (27).
● Cylindrically shaped polyacrylamide gel: A solution containing
4.5% acrylamide concentration with an acrylamide/bisacrylamide
molar ratio of 40 is cast in a 6-mm cylindrical vessel.
● A gel press kit (New Era Enterprises, Inc.) is used to push the
cylindrical gel (6 mm in diameter) into an open-ended 4.2-mm
inner diameter NMR tube.
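Pressing a 6-mm gel into a 4.2-mm tube narrows its cross-section, and the gel must lengthen to compensate; this axial strain is what produces the weak alignment. As a rough guide, the sketch below estimates the stretch under the assumption that the gel volume is conserved (no solvent is squeezed out), which is an idealization rather than a measured property of the gel.

```python
def axial_stretch_factor(gel_diameter_mm, tube_inner_diameter_mm):
    """Factor by which a cylindrical gel lengthens when forced into a
    narrower tube, assuming its volume is conserved."""
    return (gel_diameter_mm / tube_inner_diameter_mm) ** 2


def stretched_length_mm(initial_length_mm, gel_diameter_mm, tube_inner_diameter_mm):
    """Length of the gel after insertion into the tube (same assumption)."""
    return initial_length_mm * axial_stretch_factor(gel_diameter_mm,
                                                    tube_inner_diameter_mm)


if __name__ == "__main__":
    factor = axial_stretch_factor(6.0, 4.2)
    print(f"stretch factor: {factor:.2f}")
    # 9 mm is the initial gel length quoted in Subheading 3.4.
    print(f"stretched length: {stretched_length_mm(9.0, 6.0, 4.2):.1f} mm")
```

With the dimensions given in this chapter, the gel roughly doubles in length, so the final sample height should be checked against the active volume of the probe coil.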
2.5. Structure Calculation and Measurement of Tryptophan Gate Dynamics
● Structure calculations were performed using the program
XPLOR-NIH (28).
● The program PALES (29) was used to evaluate the fit of residual
dipolar couplings (RDCs) to structures.
● Fitting of relaxation dispersion data was accomplished using
the CPMGfit software (Dr. Art Palmer).
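The agreement between measured RDCs and values back-calculated from a structure is summarized in this chapter by a Pearson correlation coefficient (r) and a quality factor (Q). The sketch below shows how these two numbers can be computed from paired lists of couplings. It is not the PALES implementation; it assumes the common definition Q = rms(Dobs − Dcalc)/rms(Dobs), and the input lists are placeholders for values exported from a fitting program.

```python
import math


def pearson_r(observed, calculated):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    return cov / math.sqrt(var_o * var_c)


def q_factor(observed, calculated):
    """Quality factor, assumed here to be rms(obs - calc) / rms(obs)."""
    num = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    den = sum(o ** 2 for o in observed)
    return math.sqrt(num / den)


if __name__ == "__main__":
    # Hypothetical couplings in Hz, for illustration only.
    d_obs = [12.1, -8.4, 3.3, -15.0, 7.9]
    d_calc = [11.0, -9.5, 4.1, -13.2, 6.8]
    print(f"r = {pearson_r(d_obs, d_calc):.3f}, Q = {q_factor(d_obs, d_calc):.3f}")
```

Lower Q and higher r indicate better agreement; the two are complementary because r is insensitive to a uniform scaling error that Q would penalize.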
3. Methods
The following protocols describe in detail our process for expressing
and purifying the M2 proton channel, reconstituting it into detergent
micelles, assigning chemical shifts, and performing structural
analyses. Our lab has also used the same protocol for several other
membrane proteins containing one or two transmembrane domains.
A generalized flowchart of the procedure is provided in Fig. 2.
The gene of interest was inserted between the HindIII (5′) and
BamHI (3′) sites in the pMMHb plasmid (kindly provided by
Stephen Blacklow, Brigham and Women’s Hospital, Boston),
which contains a gene conferring kanamycin resistance. Cyanogen
bromide is used to cleave the expressed construct on the C-terminal
side of a unique methionine, resulting in release of the N-terminal
(His)9-tagged trpLE, and the C-terminal M2 construct with no
extra residues.
Purified peptide of different isotopic composition is gently
refolded from denaturing conditions into buffered detergent
solution suitable for solution NMR experiments. Samples are typically
stable for up to 3 weeks. Resonance assignments are obtained
from a combination of triple-resonance NMR experiments, an
15N-separated NOESY, and a 13C-separated NOESY collected on
15N, 13C, 2H-labeled, 15N-labeled, and 15N, 13C-labeled peptide,
respectively. Structure calculations are performed by an iterative process
that incorporates local experimental restraints first, followed by
long-range restraints. RDCs are incorporated last with a low-temperature
refinement to avoid local minima that arise from the degeneracy of
RDC magnitudes.
Fig. 2. A generalized flowchart showing the production of tetrameric M2 samples from trpLE fusions expressed in
Escherichia coli.
3.1. Expression of the TrpLE-M2 Fusion Protein
1. Transform the plasmid into E. coli BL21(DE3) pLysS cells,
plate on LB agar containing kanamycin and chloramphenicol,
and incubate overnight at 37°C.
2. Inoculate cultures of 200–500 mL Luria–Bertani (LB) media
with individual colonies of freshly transformed bacteria and
grow overnight at 37°C with moderate shaking (150 rpm).
3. Centrifuge the cultures at 2,000 × g for 25 min at 4°C and
resuspend into 40 mL of chilled M9 medium. Add the resuspended
cells to the large-scale M9 cultures (2–4 L) such that the OD600
is 0.2–0.3 relative to water. Grow each liter of culture in a 2.5-L
baffled flask at 37°C with moderate shaking (150 rpm).
4. When the OD600 reaches 0.6–0.7, induce expression of the
trpLE-M2 fusion by adding IPTG from a stock of 1 M IPTG
to a final concentration of 1 mM. Grow overnight. The final
OD600 is typically between 1.2 and 1.4 (see Note 2).
5. Analyze protein expression levels by using SDS-PAGE (12%
Bis–Tris gel). Spin down cell quantities equivalent to 250 μL
of OD600 = 0.6 cell culture at 5,000 × g for 5 min at room
temperature and redissolve the pellet into 40 μL of 8 M urea,
20 μL of 4× LDS sample buffer, 5 μL of 1 M DTT, and 5 μL
of 10% SDS. Load 20 μL of sample per lane and run the gel
using MES SDS running buffer.
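Two steps in this subsection involve simple OD600 arithmetic: choosing how much of the resuspended overnight culture to add to reach the starting density, and normalizing gel samples to a fixed cell quantity. The sketch below illustrates both. It assumes that OD600 scales linearly with cell density (in practice dense suspensions must be diluted before reading), and the function and parameter names are illustrative only.

```python
def inoculum_volume(od_resuspended, target_od, culture_volume):
    """Volume of resuspended cells to add so the culture starts at target_od.

    Solves od_resuspended * v = target_od * (culture_volume + v) for v.
    The result is in the same units as culture_volume.
    """
    if od_resuspended <= target_od:
        raise ValueError("resuspended cells must be denser than the target")
    return target_od * culture_volume / (od_resuspended - target_od)


def gel_sample_volume(current_od, reference_od, reference_volume):
    """Culture volume that holds the same cell quantity as reference_volume
    of culture at reference_od (same units as reference_volume)."""
    return reference_volume * reference_od / current_od


if __name__ == "__main__":
    # Hypothetical reading: resuspended cells at OD600 = 25, target 0.25 in 1 L.
    print(f"add {inoculum_volume(25.0, 0.25, 1000.0):.1f} mL of resuspended cells")
    # Culture at OD600 = 1.2, normalized to 250 volume units at OD600 = 0.6.
    print(f"spin down {gel_sample_volume(1.2, 0.6, 250.0):.0f} volume units")
```

Normalizing every gel sample to the same cell quantity makes band intensities comparable between pre- and post-induction time points.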
3.2. Fusion Protein Purification, Cleavage, and Preparation of Pure M2
1. Harvest the cells by centrifugation at 5,000 × g for 30 min at
4°C. Resuspend the cell pellets in lysis buffer using a Dounce
homogenizer, and sonicate on ice for a total of 3 min with a
20% duty cycle. Spin down the inclusion bodies and cell debris
at 10,000 × g for 25 min at 4°C. Solubilize the pellets in 40 mL
of lysis buffer and spin down at 10,000 × g for 25 min at 4°C to
purify away additional water-soluble contaminants (see Note 3).
2. Dissolve the water-insoluble matter in 50 mL (per 1 L culture)
of guanidine buffer using a Dounce homogenizer. Pellet undissolved
matter, which includes nucleic acids, by centrifugation
at 100,000 × g for 1.5 h at 4°C. Add the supernatant to 2 mL
of Ni-NTA agarose beads (see Note 4) preequilibrated with
guanidine buffer. After a 1-h incubation at 4°C with gentle
rotation, pour the slurry into a gravity column and wash with
80 mL of guanidine buffer. Elute bound trpLE-M2 fusion
protein from the column by adding elution buffer in three stages
of 5 mL each.
3. Dialyze the elution containing the trpLE-M2 fusion in 10 kDa
molecular weight cutoff (MWCO) dialysis tubing against 4 L
of H2O, with several exchanges over 4 h. The trpLE-M2 fusion
precipitates as white flakes. Centrifuge in a swinging-bucket
rotor at 1,500 × g for 30 min at 4°C.
4. Chemically cleave the M2 peptide from trpLE by dissolving
the pellet into 5 mL of 70% formic acid containing 1 g of CNBr
(see Note 5). Cover the reaction vessel with aluminum foil and
allow the reaction to proceed for 2 h under a low-pressure
stream of nitrogen gas. Load the sample via syringe into a dialysis
cassette with a 3.5 kDa MWCO and dialyze against 4 L of water
for 1 h, snap freeze in liquid nitrogen, and then lyophilize.
5. In preparation for reverse-phase HPLC, dissolve the lyophilized
sample in 1.5 mL of hexafluoroisopropanol, which clarifies
after 15 min at 40°C. Subsequently, add 0.5 mL of formic acid
and 2 × 1 mL of water. Draw this solution into a 5-mL syringe,
degas using a vacuum hose, and load onto a C4 reverse-phase
column through a syringe filter. Collect fractions over a linear
gradient from 0 to 100% buffer B (100–0% buffer A). Cleaved
trpLE elutes first (~55% buffer B), followed by the uncut
fusion (~65% buffer B), and the cleaved M2 peptide (~75% buf-
fer B). Cleavage efficiencies for the trpLE-M2 fusion are rou-
tinely between 70 and 80%. Lyophilize the pooled fractions.
For quantitation and aliquoting of peptide, dissolve the dried
samples in 50% acetonitrile containing 0.1% TFA, quantitate
by absorbance at 280 nm based on amino acid composition,
and relyophilize (see Note 6).
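Quantitation by absorbance at 280 nm "based on amino acid composition" amounts to estimating a molar extinction coefficient from the number of tryptophan and tyrosine residues (plus any disulfides) and applying the Beer-Lambert law. The sketch below uses a widely used set of per-residue absorptivities as an assumption; these values were derived for aqueous buffers, so their accuracy in 50% acetonitrile is not guaranteed. The sequence in the example is a made-up placeholder, not the M2 construct.

```python
# Approximate molar absorptivities at 280 nm (M^-1 cm^-1); assumptions of
# this sketch rather than values taken from the chapter.
EPSILON_TRP = 5500.0
EPSILON_TYR = 1490.0
EPSILON_CYSTINE = 125.0


def extinction_coefficient(sequence, disulfides=0):
    """Estimate the molar extinction coefficient at 280 nm from a
    one-letter amino acid sequence."""
    seq = sequence.upper()
    return (seq.count("W") * EPSILON_TRP
            + seq.count("Y") * EPSILON_TYR
            + disulfides * EPSILON_CYSTINE)


def molar_concentration(a280, sequence, path_cm=1.0, disulfides=0):
    """Beer-Lambert law: c = A / (epsilon * path length), in mol/L."""
    eps = extinction_coefficient(sequence, disulfides)
    if eps == 0:
        raise ValueError("no Trp, Tyr, or cystine: A280 cannot be used")
    return a280 / (eps * path_cm)


if __name__ == "__main__":
    peptide = "AWYG"  # hypothetical sequence for illustration
    conc = molar_concentration(0.699, peptide)
    print(f"epsilon = {extinction_coefficient(peptide):.0f} M^-1 cm^-1")
    print(f"concentration = {conc * 1e6:.0f} micromolar")
```

Multiplying the molar concentration by the sample volume and the peptide molar mass gives the mass per aliquot before relyophilization.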
3.3. Reconstitution of M2 into Detergent Micelles
1. Dissolve purified peptide (1.2 mg) into reconstitution buffer
(see Note 7) at a concentration of 250 μM (final volume is
960 μL), split into three 3.5 kDa MWCO dialysis cups, and
dialyze for 12 h against 2 L of NMR buffer with slow stirring
and one buffer change at 10 h.
2. Concentrate the sample to ~0.7 mM monomer using a cen-
trifugal concentrator with a 5 kDa MWCO.
3. Add rimantadine to 10 mM. Rimantadine binds at four equiva-
lent sites near the gate on the lipid-facing side of the channel
and stabilizes the closed conformation of the pore. Because of
the poor water solubility of rimantadine, it is added from a
0.3 M stock that contains 80 mM DHPC (Subheading 2.3).
4. Add deuterium oxide for the magnetic field lock to a concentration
of 5% in three steps.
3.4. NMR Chemical Shift Assignments and Restraint Measurements
1. To achieve nearly complete sequence-specific backbone chemical
shift assignment of 1HN, 15N, 13C′, 13Cα, and 13Cβ, use
TROSY versions of the HNCA, HNCACB, and HNCO experiments
(30, 31) on fully 15N-, 13C-, and 85% 2H-labeled
protein (see Note 8). Process spectra in NMRPipe (25), and
assign resonances and quantitate NOE cross-peak intensities in
CARA (26).
2. Once backbone chemical shifts are known, use TALOS (27) to
predict regions of secondary structure. For M2, this indicates
that the transmembrane domain and the C-terminal juxtamembrane
region are alpha-helical. This is confirmed by the
characteristic local NOE patterns, and is used to aid assignment
of most 1Hα and 1Hβ intraresidue and sequential NOEs in a 3D
15N-edited NOESY spectrum (110 ms; see Note 9 and Fig. 3).
3. Collect a methyl-based 3D 13C-edited NOESY (150 ms mixing
time) on fully 15N-, 13C-labeled protein; this is particularly
helpful in the methyl-rich transmembrane segment to confirm
backbone proton assignments, extend the side-chain proton
assignments, and identify intermolecular contacts (Fig. 4; see
Note 8). Obtain stereospecific assignments of gamma methyls
of valine and delta methyls of leucine from 10% 13C-labeled
protein by recording a constant time 1H–13C HSQC with 28-ms
carbon evolution, which allows discrimination between coupled
and uncoupled methyls based on the sign of the cross peak (32).
4. Complete assignment of side-chain proton resonances is facilitated
by determining a large number of χ1 and χ2 rotamers.
Determine the χ1 of isoleucines, threonines, and valines from
methyl-based measurements of the 3-bond scalar couplings
3JNCγ and 3JC′Cγ (33, 34), and the χ2 of leucine and isoleucine
from 3JCαCδ (35, 36). Determine the χ1 of long-chain aliphatic
(arginine, leucine, and lysine) and aromatic (histidine, phenylalanine,
tryptophan, and tyrosine) side chains from 3JNCγ and 3JC′Cγ
values measured in 1H–15N constant-time TROSY experiments
on 15N-, 13C-, and 85% 2H-labeled protein (37, 38).
Fig. 3. Projection of the 1H,1H plane of 15N-separated NOESY spectra on an M2 sample containing (a) fully protonated or
(b) fully deuterated DHPC detergent showing the loss of information due to spin diffusion when the hydrocarbon chain of DHPC
is protonated (see Note 9). NOE mixing times were 90 and 110 ms, respectively.
Fig. 4. Final, lowest-energy structure of the high pH M2 channel showing the position of
methyl groups throughout the transmembrane domain, including the helix–helix interfaces.
Helices are shown as ribbons and methyl protons as filled circles. Structural elements in
adjacent helices are shaded darker or lighter.
5. NOEs that cannot be explained by intramonomer distances based
on the local secondary structure are identified as intermonomer
NOEs. Carry out the assignment of intermonomer distance
restraints and structure calculations iteratively until all NOE
cross peaks in the NOESY spectra are self-consistent (see
Subheading 3.5). First, identify protein-drug NOEs in the
15N-edited and 13C-edited NOESYs described in steps 2 and 3, and
subsequently confirm them by acquiring an 15N-edited NOESY
(500-ms mixing time) on a sample containing uniformly 15N- and
2H-labeled protein, protonated rimantadine, and perdeuterated
DHPC (see Note 10).
6. Weakly align the ion channel relative to the magnetic field by
using a strained gel (39–41). Soak the protein and detergent
solution into a cylindrically shaped polyacrylamide gel, initially
6 mm in diameter and 9 mm in length; squeeze this into the
4.2-mm inner diameter of an open-ended NMR tube. Obtain
the RDCs by subtracting the scalar coupling (J), measured on
an unaligned sample, from the measured coupling (J + RDC)
of the aligned sample. In both cases, obtain couplings by interleaving
a regular gradient-enhanced HSQC and a gradient-selected
TROSY (42), both acquired with 80 ms of 15N evolution.
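The subtraction described in step 6 is straightforward but is easy to get wrong when the two samples do not yield the same set of resolved peaks. The sketch below keys the couplings by residue number and returns RDCs only for residues measured in both samples. The dictionaries and their values are hypothetical; in practice they would come from peak lists exported from the processing software.

```python
def residual_dipolar_couplings(isotropic_hz, aligned_hz):
    """Return {residue: RDC in Hz} for residues present in both samples.

    isotropic_hz: {residue: J} measured on the unaligned sample
    aligned_hz:   {residue: J + RDC} measured on the gel-aligned sample
    """
    shared = sorted(isotropic_hz.keys() & aligned_hz.keys())
    return {res: aligned_hz[res] - isotropic_hz[res] for res in shared}


if __name__ == "__main__":
    # Hypothetical 1H-15N couplings in Hz, keyed by residue number.
    unaligned = {30: -92.5, 31: -93.0, 33: -92.0}
    aligned = {30: -85.0, 31: -101.0, 32: -90.0}
    for residue, rdc in residual_dipolar_couplings(unaligned, aligned).items():
        print(f"residue {residue}: RDC = {rdc:+.1f} Hz")
```

Residues that appear in only one of the two data sets are dropped silently here; a production script would likely report them so that overlapped peaks can be inspected by hand.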
3.5. Structure Calculation and Measurement of Tryptophan Gate Dynamics
1. Begin structure calculations from a random coil structure
generated in XPLOR-NIH that incorporates intramonomer
NOEs, backbone dihedral angle restraints derived from
chemical shifts, and side-chain χ1 and χ2 restraints. Enforce the
intramonomer NOE restraints by using flat-well harmonic
potentials, with the force constant fixed at 50 kcal mol⁻¹ Å⁻².
For fixed side-chain χ1 and χ2 angles, apply flat-well (±30°)
harmonic potentials with a force constant of 30 kcal mol⁻¹ rad⁻².
During the simulation, ramp the van der Waals, improper
angle, and bond angle force constants to 4.0 kcal mol⁻¹ Å⁻²,
1.0 kcal mol⁻¹ degree⁻², and 1.0 kcal mol⁻¹ degree⁻², respectively.
Calculate a total of 20 monomer structures with a standard
high-temperature simulated annealing protocol in which
the bath temperature is cooled from 1,000 to 200 K.
2. To obtain an initial set of tetramer structures, replicate and
uniquely translate the lowest-energy monomer structure four
times. Perform another high-temperature simulated annealing
run (1,000 to 200 K) using all previous restraints plus intermono-
mer NOEs, which are applied in a fourfold manner consistent
with the C4 symmetry implied by a single set of resonances.
Calculate a total of 100 tetramer structures.
3. Independently cross validate the 100 tetramer structures by
1H–15N RDCs using singular value decomposition (SVD) as
implemented in PALES. Assess the goodness of fit by the
Pearson correlation coefficient (r) and quality factor (Q) (43).
Select the 15 structures with the best agreement with RDCs
(r ~ 0.91 and Q ~ 0.25).
4. Set the approximate initial values of the magnitude (Da) and
rhombicity (Rh) of the alignment tensor to the average values
of Da = 14.0 Hz and Rh = 0.20 that were obtained from the best
SVD fits. Due to flexibility in the tetrameric assembly on the
timescale of RDCs (up to microseconds (44)), the rhombicity
is nonzero when the monomers are fit to a single alignment
tensor. Thus, the monomers are restrained against indepen-
dent alignment tensors during refinement. During this final
refinement, cool the bath from 200 to 20 K. Fix the force
constants for NOE and dihedral restraints at 100 kcal mol⁻¹ Å⁻²
and 40 kcal mol⁻¹ rad⁻², respectively. In addition, ramp a weak
database-derived “Rama” potential function from 0.02 to 0.20
(dimensionless force constant) for the general treatment of side-chain
rotamers. Ramp the RDC restraint force constant
from 0.010 to 0.125 kcal mol⁻¹ Hz⁻², thereby supplementing,
but not supplanting, the NOE restraints. Generate 10 RDC-refined
structures for each of the 15 structures validated by
RDCs, and add the structure with the lowest total energy to
the final ensemble. Choose the structure with heavy atom
conformation closest to the mean to represent the ensemble.
5. Measure the timescale of chemical shift exchange of the Trp41
side chain using an 15N relaxation dispersion CPMG experiment
(45). Because the single tryptophan Hε1 of M2(18–60)
is downfield from the backbone amides, measure relaxation
dispersion to high precision using a 1D experiment with many
scans (>1,000).
6. Fit the dependence of 15N relaxation due to chemical exchange
on the frequency of refocusing (1/τcp) of chemical shift evolution
to a two-site exchange model given by Rex ∝ 1 − (2τex/τcp) tanh
(τcp/2τex), where Rex is the contribution to transverse relaxation
due to chemical shift exchange and τex is the correlation time
of the process that is generating the chemical shift exchange
(46). Analyze the relaxation-compensated CPMG experiment
(45) using the program CPMGfit.
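To make the two-site expression in step 6 concrete, the sketch below evaluates the normalized dispersion profile and recovers the exchange correlation time from a set of points by a simple grid search. It is not CPMGfit and does not reproduce its algorithm or error analysis; the grid of candidate values and the synthetic data are assumptions chosen for illustration, and both times must be given in the same units.

```python
import math


def relative_rex(tau_cp, tau_ex):
    """Normalized dispersion profile: 1 - (2*tau_ex/tau_cp)*tanh(tau_cp/(2*tau_ex)).

    Approaches 0 for fast refocusing (tau_cp << tau_ex) and 1 for slow
    refocusing (tau_cp >> tau_ex).
    """
    x = tau_cp / (2.0 * tau_ex)
    return 1.0 - math.tanh(x) / x


def fit_tau_ex(tau_cp_values, rex_values, candidates):
    """Least-squares grid search over candidate tau_ex values.

    For each candidate, the amplitude (Rex in the slow-refocusing limit)
    is obtained analytically by linear least squares.
    Returns (best tau_ex, best amplitude).
    """
    best = None
    for tau_ex in candidates:
        shape = [relative_rex(t, tau_ex) for t in tau_cp_values]
        amplitude = (sum(s * r for s, r in zip(shape, rex_values))
                     / sum(s * s for s in shape))
        residual = sum((amplitude * s - r) ** 2
                       for s, r in zip(shape, rex_values))
        if best is None or residual < best[0]:
            best = (residual, tau_ex, amplitude)
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic, noise-free data (times in ms) generated with tau_ex = 0.5 ms.
    tau_cp = [0.5, 1.0, 2.0, 4.0, 8.0]
    rex = [10.0 * relative_rex(t, 0.5) for t in tau_cp]
    grid = [0.1 * i for i in range(1, 21)]
    print(fit_tau_ex(tau_cp, rex, grid))
```

A grid search is used here only to keep the example dependency-free; a nonlinear least-squares routine with proper uncertainty estimates is the more usual choice for real data.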
4. Notes
1. Single-pass membrane proteins approaching 100 amino acids
are robustly expressed into bacterial inclusion bodies with
N-terminal trpLE fusions. We have observed that the expression
yields of longer proteins or those containing additional TM
helices can be increased by the addition of a second, C-terminal
trpLE with an intervening methionine for cleavage.
2. Optimal temperature and IPTG conditions should be explored
for each construct. In some cases, lower IPTG and lower temperature
(100 μM IPTG and 18°C) or higher IPTG and higher
temperature (1 mM IPTG and 42°C) result in higher expression
levels. In some cases, better expression can be obtained
using low volumes (200 mL) in 2.5-L baffled flasks and shaking
at very high speeds (300 rpm) to increase aeration.
3. Beta-mercaptoethanol at 0.05% can be added to all cell-
processing buffers if cysteines are present in the target protein.
4. For maximum yields, it may be necessary to use fresh Ni-NTA
resin for each preparation because of degradation that occurs
in 6 M guanidine.
5. Use of CNBr for cleavage of the fusion protein requires a unique
methionine at the fusion site. However, the side chain of the
residue following the methionine can affect cleavage efficiencies.
The hydroxyls of serine and threonine residues impair cleavage
efficiency and should be avoided. In contrast, we have observed
particularly high cleavage efficiencies when a glycine follows
methionine. In addition, lower yields are observed for reactions
with low concentrations of the fusion protein. Formylation of
tryptophan and lysine side chains can occur when the reaction is
allowed to proceed for longer than 3 h. CNBr is very toxic and
corrosive, requiring a chemical hood and all safety precautions
necessary to avoid direct contact.
6. A light, fluffy consistency of the lyophilized peptide prior to
reconstitution correlated with best-quality samples for NMR.
7. The presence of glutamate is necessary to avoid nonspecific
aggregation of M2 above ~200 μM (47). It is important to
include glutamate in the 2H2O that is added to NMR samples
to prevent localized aggregation.
8. Deuteration of the protein and the use of TROSY is critical for
observing many of the cross peaks in the transmembrane
domain because of the slow, rotational tumbling time of the
tetrameric protein (~20 kDa) plus bound detergent.
9. The 3D 13C-edited NOESY spectrum was collected in the
presence of perdeuterated detergent to prevent obfuscation of
the aliphatic region. Collection of the 15N-edited NOESY
spectrum also required deuterated detergent to prevent loss
of magnetization due to rapid spin diffusion. Subsequent tests
established that deuteration of only the acyl chains was sufficient
to suppress rapid spin diffusion. The cost of perdeuterated DHPC
(D35) is approximately tenfold more than that of DHPC
deuterated only at the acyl chains (D22).
10. To ensure that essentially all nonexchangeable protein protons
were replaced with deuterium, cells were grown in 99.9% 2H2O
and perdeuterated glucose (Cambridge Isotope Laboratories).
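The sequence requirements in Note 5 can be checked before a construct is ordered. The sketch below lists the fragments expected from CNBr cleavage (on the C-terminal side of each methionine) and flags the residue that follows each methionine according to the observations in Note 5. The example sequence is invented for illustration, and the comments attached to each site simply restate Note 5 rather than predict yields.

```python
def cnbr_fragments(sequence):
    """Split a one-letter sequence after every internal methionine.

    The methionine stays at the C-terminus of the upstream fragment; a
    methionine at the very end of the sequence creates no new fragment.
    """
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        if residue == "M" and i < len(sequence) - 1:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


def cleavage_site_report(sequence):
    """Return (1-based Met position, following residue, comment) per site."""
    notes = {
        "S": "Ser follows Met: impaired cleavage expected",
        "T": "Thr follows Met: impaired cleavage expected",
        "G": "Gly follows Met: high efficiency observed",
    }
    report = []
    for i, residue in enumerate(sequence[:-1]):
        if residue == "M":
            following = sequence[i + 1]
            report.append((i + 1, following,
                           notes.get(following, "no specific guidance")))
    return report


if __name__ == "__main__":
    construct = "AAMGLLMSK"  # hypothetical sequence with two Met residues
    print(cnbr_fragments(construct))
    for site in cleavage_site_report(construct):
        print(site)
```

A construct suited to this protocol should produce exactly one entry in the report, located at the fusion junction.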
Acknowledgments
James J. Chou and the National Institutes of Health, USA (NIH)
are acknowledged for supporting development of the approaches
described. J.R.S. was supported by a Ruth Kirschstein Fellowship
from the NIH. Matthew E. Call is acknowledged for many useful
modifications and additions to the trpLE fusion expression and
purification protocol.
References
1. LeMaster, D. M. (1994) Isotope labeling in solution protein assignment and structural analysis. Prog. Nucl. Magn. Reson. Spectrosc. 26, 371–419.
2. Pervushin, K., Riek, R., Wider, G., and Wuthrich, K. (1997) Attenuated T2 relaxation by mutual cancellation of dipole-dipole coupling and chemical shift anisotropy indicates an avenue to NMR structures of very large biological macromolecules in solution. Proc. Natl. Acad. Sci. U.S.A. 94, 12366–12371.
3. Ruschak, A. M., and Kay, L. E. (2010) Methyl groups as probes of supra-molecular structure, dynamics and function. J. Biomol. NMR 46, 75–87.
4. Gautier, A., Mott, H. R., Bostock, M. J., Kirkpatrick, J. P., and Nietlispach, D. (2010) Structure determination of the seven-helix transmembrane receptor sensory rhodopsin II by solution NMR spectroscopy. Nat. Struct. Mol. Biol. 17, 768–774.
5. Imai, S., Osawa, M., Takeuchi, K., and Shimada, I. (2010) Structural basis underlying the dual gate properties of KcsA. Proc. Natl. Acad. Sci. U.S.A. 107, 6216–6221.
6. Bruggert, M., Rehm, T., Shanker, S., Georgescu, J., and Holak, T. A. (2003) A novel medium for expression of proteins selectively labeled with 15N-amino acids in Spodoptera frugiperda (Sf9) insect cells. J. Biomol. NMR 25, 335–348.
7. Strauss, A., Bitsch, F., Cutting, B., Fendrich, G., Graff, P., Liebetanz, J., Zurini, M., and Jahnke, W. (2003) Amino-acid-type selective isotope labeling of proteins expressed in Baculovirus-infected insect cells useful for NMR studies. J. Biomol. NMR 26, 367–372.
8. Makino, S., Goren, M. A., Fox, B. G., and Markley, J. L. Cell-free protein synthesis technology in NMR high-throughput structure determination. Methods Mol. Biol. 607, 127–147.
9. Kim, H. J., Howell, S. C., Van Horn, W. D., Jeon, Y. H., and Sanders, C. R. (2009) Recent Advances in the Application of Solution NMR Spectroscopy to Multi-Span Integral Membrane Proteins. Prog. Nucl. Magn. Reson. Spectrosc. 55, 335–360.
10. Warschawski, D. (2010) Membrane proteins of known structure determined by NMR. http://www.drorlist.com/nmr/MPNMR.html.
11. Chill, J. H., Louis, J. M., Miller, C., and Bax, A. (2006) NMR study of the tetrameric KcsA potassium channel in detergent micelles. Protein Sci. 15, 684–698.
12. Liu, Y., Engelman, D. M., and Gerstein, M. (2002) Genomic analysis of membrane protein families: abundance and conserved motifs. Genome Biol. 3, research0054.
13. Marsden, R. L., Lee, D., Maibaum, M., Yeats, C., and Orengo, C. A. (2006) Comprehensive genome analysis of 203 genomes provides structural genomics with new insights into protein family space. Nucleic Acids Res. 34, 1066–1080.
14. Staley, J. P., and Kim, P. S. (1994) Formation of a native-like subdomain in a partially folded intermediate of bovine pancreatic trypsin inhibitor. Protein Sci. 3, 1822–1832.
15. Ma, C., Marassi, F. M., Jones, D. H., Straus, S. K., Bour, S., Strebel, K., Schubert, U., Oblatt-Montal, M., Montal, M., and Opella, S. J. (2002) Expression, purification, and activities of full-length and truncated versions of the integral membrane protein Vpu from HIV-1. Protein Sci. 11, 546–557.
16. Zheng, H., Zhao, J., Wang, S., Lin, C. M., Chen, T., Jones, D. H., Ma, C., Opella, S., and Xie, X. Q. (2005) Biosynthesis and purification of a hydrophobic peptide from transmembrane domains of G-protein-coupled CB2 receptor. J. Pept. Res. 65, 450–458.
17. Diefenderfer, C., Lee, J., Mlyanarski, S., Guo, Y., and Glover, K. J. (2009) Reliable expression and purification of highly insoluble transmembrane domains. Anal. Biochem. 384, 274–278.
18. Call, M. E., Schnell, J. R., Xu, C., Lutz, R. A., Chou, J. J., and Wucherpfennig, K. W. (2006) The structure of the zetazeta transmembrane dimer reveals features essential for its assembly with the T cell receptor. Cell 127, 355–368.
19. Chong, Y. H., Ball, J. M., Issel, C. J., Montelaro, R. C., and Rushlow, K. E. (1991) Analysis of equine humoral immune responses to the transmembrane envelope glycoprotein (gp45) of equine infectious anemia virus. J. Virol. 65, 1013–1018.
20. Smith, J. G., Mothes, W., Blacklow, S. C., and Cunningham, J. M. (2004) The mature avian leukosis virus subgroup A envelope glycoprotein is metastable, and refolding induced by the synergistic effects of receptor binding and low pH is coupled to infection. J. Virol. 78, 1403–1410.
21. Bertrand, K., Squires, C., and Yanofsky, C. (1976) Transcription termination in vivo in the leader region of the tryptophan operon of Escherichia coli. J. Mol. Biol. 103, 319–337.
22. Miozzari, G. F., and Yanofsky, C. (1978) Translation of the leader region of the Escherichia coli tryptophan operon. J. Bacteriol. 133, 1457–1466.
23. Kleid, D. G., Yansura, D., Small, B., Dowbenko, D., Moore, D. M., Grubman, M. J., McKercher, P. D., Morgan, D. O., Robertson, B. H., and Bachrach, H. L. (1981) Cloned viral protein vaccine for foot-and-mouth disease: responses in cattle and swine. Science 214, 1125–1129.
24. Schnell, J. R., and Chou, J. J. (2008) Structure and mechanism of the M2 proton channel of influenza A virus. Nature 451, 591–595.
25. Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A. (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293.
26. Keller, R. (2004) The Computer Aided Resonance Assignment Tutorial, First Edition, Cantina-Verlag, Goldau, Switzerland. http://cara.nmr-software.org/downloads/3-85600-112-3.pdf.
27. Cornilescu, G., Delaglio, F., and Bax, A. (1999) Protein backbone angle restraints from searching a database for chemical shift and sequence homology. J. Biomol. NMR 13, 289–302.
28. Schwieters, C. D., Kuszewski, J., Tjandra, N., and Clore, G. M. (2002) The Xplor-NIH NMR molecular structure determination package. J. Magn. Reson. 160, 66–74.
29. Zweckstetter, M., Bax, A. (2000) Prediction of sterically induced alignment in a dilute liquid crystalline phase: aid to protein structure determination by NMR. J. Am. Chem. Soc. 122, 3791–3792.
30. Salzmann, M., Wider, G., Pervushin, K., and Wuthrich, K. (1999) Improved sensitivity and coherence selection for [N-15,H-1]-TROSY elements in triple resonance experiments. J. Biomol. NMR 15, 181–184.
31. Kay, L. E., Ikura, M., Tschudin, R., and Bax, A. (1990) Three-dimensional triple resonance NMR spectroscopy of isotopically enriched proteins. J. Magn. Reson. 89, 496–514.
32. Neri, D., Szyperski, T., Otting, G., Senn, H., and Wuthrich, K. (1989) Stereospecific nuclear magnetic resonance assignments of the methyl groups of valine and leucine in the DNA-binding domain of the 434 repressor by biosynthetically directed fractional 13C labeling. Biochemistry 28, 7510–7516.
33. Grzesiek, S., Vuister, G. W., and Bax, A. (1993) A simple and sensitive experiment for measurement of JCC couplings between backbone carbonyl and methyl carbons in isotopically enriched proteins. J. Biomol. NMR 3, 487–493.
34. Vuister, G. W., Wang, A. C., and Bax, A. (1993) Measurement of three-bond nitrogen-carbon J couplings in proteins uniformly enriched in 15N and 13C. J. Am. Chem. Soc. 115, 5334–5335.
35. Bax, A., Vuister, G. W., Grzesiek, S., Delaglio, F., Wang, A. C., Tschudin, R., and Zhu, G. (1994) Measurement of homo- and heteronuclear J couplings from quantitative J correlation.
peptide as revealed by three-bond carbon-carbon couplings and 13C chemical shifts. J. Biomol. NMR 7, 256–260.
37. Hu, J.-S., Grzesiek, S., and Bax, A. (1997) Chi1 angle information from a simple two-dimensional NMR experiment which identifies trans 3JNCγ couplings in isotopically enriched proteins. J. Biomol. NMR 9, 323–328.
38. Hu, J.-S., Grzesiek, S., and Bax, A. (1997) Two-dimensional NMR methods for determining χ1 angles of aromatic residues in proteins from three-bond JC′Cγ and JNCγ couplings. J. Am. Chem. Soc. 119, 1803–1804.
39. Tycko, R., Blanco, F. J., and Ishii, Y. (2000) Alignment of biopolymers in strained gels: A new way to create detectable dipole-dipole couplings in high-resolution biomolecular NMR. J. Am. Chem. Soc. 122, 9340–9341.
40. Sass, H. J., Musco, G., Stahl, S. J., Wingfield, P. T., and Grzesiek, S. (2000) Solution NMR of proteins within polyacrylamide gels: diffusional properties and residual alignment by mechanical stress or embedding of oriented purple membranes. J. Biomol. NMR 18, 303–309.
41. Chou, J. J., Gaemers, S., Howder, B., Louis, J. M., Bax, A. (2001) A simple apparatus for generating stretched polyacrylamide gels, yielding uniform alignment of proteins and detergent micelles. J. Biomol. NMR 21, 377–382.
42. Weigelt, J. (1998) Single Scan, Sensitivity- and Gradient-Enhanced TROSY for Multidimensional NMR Experiments. J. Am. Chem. Soc. 120, 10778–10779.
43. Cornilescu, G., Marquardt, J. L., Ottiger, M., and Bax, A. (1998) Validation of protein structure from anisotropic carbonyl chemical shifts in a dilute liquid crystalline phase. J. Am. Chem. Soc. 120, 6836–6837.
44. Lakomek, N. A., Lange, O. F., Walter, K. F., Fares, C., Egger, D., Lunkenheimer, P., Meiler, J., Grubmuller, H., Becker, S., de Groot, B. L., and Griesinger, C. (2008) Residual dipolar couplings as a tool to study molecular recognition of ubiquitin. Biochem. Soc. Trans. 36, 1433–1437.
45. Loria, J. P., Rance, M., and Palmer, A. G. (1999) A relaxation-compensated Carr-Purcell-Meiboom-Gill sequence for characterizing chemical exchange by NMR spectroscopy. J. Am. Chem. Soc. 121, 2331–2332.
46. Allerhand, A., and Thiele, E. (1966) Analysis of Carr-Purcell Spin-Echo NMR Experiments on Multiple-Spin Systems. II. The Effect of Chemical Exchange. J. Chem. Phys. 45, 902–916.
Methods Enzymol. 239, 79–105. 47. Golovanov, A. P., Hautbergue, G. M., Wilson,
36. MacKenzie, K. R., Prestegard, J. H., and S. A., and Lian, L. Y. (2004) A simple method
Engelman, D. M. (1996) Leucine side-chain for improving protein solubility and long-term
rotamers in a glycophorin A transmembrane stability. J. Am. Chem. Soc. 126, 8933–8939.
Chapter 11
Preparation of the Modular Multi-Domain Protein RPA for Study by NMR Spectroscopy
Chris A. Brosey, Marie-Eve Chagot, and Walter J. Chazin
Abstract
The integrity and propagation of the genome depend upon the fidelity of DNA processing events, such as
replication, damage recognition, and repair. Requisite to the numerous biochemical tasks required for
DNA processing is the generation and manipulation of single-stranded DNA (ssDNA). As the primary
eukaryotic ssDNA-binding protein, Replication Protein A (RPA) protects ssDNA templates from stray
nuclease cleavage and untimely reannealment. More importantly, RPA also serves as a platform for orga-
nizing access to ssDNA for readout of the genetic code, recognition of aberrations in DNA, and processing
by enzymes. We have proposed that RPA’s ability to adapt to such a broad spectrum of multiprotein
machinery arises in part from its modular organization and interdomain flexibility. While requisite for
function, RPA’s modular flexibility has presented many challenges to providing a detailed characterization
of the dynamic architecture of the full-length protein. To enable the study of RPA’s interdomain dynamics
and responses to ssDNA binding by biophysical methods including NMR spectroscopy, we have success-
fully produced recombinant full-length RPA in milligram quantities at natural abundance and enriched
with NMR-active isotopes.
Key words: Replication Protein A, DNA processing, Protein modularity, Isotopic labeling,
Recombinant expression, Protein purification, NMR spectroscopy
1. Introduction
As the primary eukaryotic single-stranded DNA (ssDNA)-binding
protein, Replication Protein A (RPA) prevents reannealment of
unwound DNA strands, controls access to DNA templates, and
serves as a scaffold for the assembly and disassembly of DNA
processing machinery (1, 2). A heterotrimer, RPA’s three subunits
(RPA70, RPA32, and RPA14) contain seven structured domains
interconnected by flexible linkers. Three of these domains form
the trimeric core of the protein (70C, 32D, 14), from which
emanate the flexibly linked N-terminal domains of RPA70 (70N,
70A, 70B), as well as the disordered N-terminal and structured
C-terminal domains of RPA32 (32N and 32C, respectively).
Binding of ssDNA is facilitated by domains 70A, 70B, 70C, and
32D, which together occupy an occluded site size of 30 nucle-
otides (1). Interactions with other DNA-processing proteins are
primarily mediated by domains 70N and 32C, and the principal
DNA-binding domains 70A and 70B (1, 2).
As a universal participant in DNA processing, RPA must inter-
act with a wide array of structurally unique multiprotein complexes.
The flexible, modular organization of the protein is thought to be
critical for enabling such structural adaptability (3). Although
high-resolution X-ray or NMR structures of all individual RPA
domains are available (4–8), the dynamic interdomain organiza-
tion of full-length RPA and the accompanying structural altera-
tions imposed by DNA processing have not been extensively
characterized. The full-length protein’s intrinsic flexibility poses
several challenges to study by X-ray diffraction; and at 116 kDa,
RPA falls outside the size limit of conventional NMR methods
(30–40 kDa). Application of advanced NMR approaches, however,
namely, deuterium labeling and TROSY- or CRINEPT-based tech-
niques, has allowed this size limitation to be extended to proteins
in excess of 100 kDa (9–12). This, combined with the discrete
distribution of molecular mass among RPA domains (50 kDa for
the trimer core and 10–14 kDa for the remaining domains), makes
feasible characterization of the full-length protein by NMR (13).
Here, we describe the production of full-length RPA by recom-
binant expression in Escherichia coli and subsequent purification of
the protein by a series of FPLC steps. The protocols provided
include those required for preparation of 2H-, 15N-enriched RPA
for study by NMR spectroscopy.
2. Materials
2.1. Cell Transformation
1. RPA pET15b plasmid (see Note 1).
2. BL21(DE3) pLysS competent cells: 100-μL aliquots stored at
−80°C.
3. LB medium plates: 10 g/L tryptone, 10 g/L NaCl, 5 g/L
yeast extract, 15 g/L agar dissolved in Milli-Q water (filtered
to a resistance of 18.3 MΩ-cm) and autoclaved at 121°C for
15 min. Add antibiotic stocks (ampicillin and chloramphenicol)
at 1:1,000 dilution when the medium has cooled to 50–60°C
(14) (see Note 2).
4. 1,000× Ampicillin stock: 100 mg/mL in Milli-Q water, sterilize
by filtration at 0.2 μm (see Note 3).
5. 1,000× Chloramphenicol stock: 34 mg/mL in ethanol, steril-
ize by filtration at 0.2 μm (see Note 3).
6. SOC recovery medium: 20 g/L tryptone, 0.5 g/L NaCl,
5 g/L yeast extract, 2.5 mM KCl, 5 mM MgCl2, 5 mM
MgSO4, 20 mM glucose dissolved in Milli-Q water and auto-
claved at 121°C for 15 min (14).
2.2. Cell Expression Testing
1. RPA pET15b BL21(DE3) pLysS LB plate (Subheading 2.1).
2. Sterile 10-mL test culture tubes.
3. LB medium: 10 g/L tryptone, 10 g/L NaCl, 5 g/L yeast
extract dissolved in Milli-Q water and autoclaved at 121°C for
15 min.
4. 1,000× Antibiotic stocks (Subheading 2.1).
5. 1 M IPTG: Sterilize by filtration at 0.2 μm and store at −20°C.
6. 2× SDS loading buffer: 100 mM Tris–HCl, pH 6.8, 4% (w/v)
SDS (electrophoresis grade), 0.2% (w/v) bromophenol blue,
20% (w/v) glycerol, 200 mM β-mercaptoethanol (βME,
added fresh) (14).
7. 8 M urea.
8. Precast 4–12% Bis-Tris SDS-PAGE gel (Invitrogen).
9. 1× MES SDS running buffer (Invitrogen).
10. 1× Prestained molecular weight standards.
11. SimplyBlue SafeStain.
2.3. Preparation of Culture Media
2.3.1. LB Medium
1. LB medium (Subheading 2.2).
2. 1,000× Antibiotic stocks (Subheading 2.1).
3. 500-mL Erlenmeyer flask with baffles.
4. Six 2.8-L Fernbach flasks with baffles.
2.3.2. Minimal Medium
1. Milli-Q water (900 mL/L medium).
2. 10× M9 salts: 5 g/L NaCl, 30 g/L KH2PO4, 60 g/L Na2HPO4
dissolved in Milli-Q water, adjusted to pH 7.4 with 10 M
NaOH, and autoclaved at 121°C for 15 min.
3. 1 M MgSO4: Sterilize by filtration at 0.2 μm and store at room
temperature.
4. 1 M CaCl2: Sterilize by filtration at 0.2 μm and store at room
temperature.
5. 20% (w/v) Glucose: Sterilize by filtration at 0.2 μm and store at
room temperature.
6. 1 M Thiamine hydrochloride: Sterilize by filtration at 0.2 μm and
store at room temperature.
8. 15NH4Cl.
9. 500-mL Erlenmeyer flask with baffles.
10. Six 2.8-L Fernbach flasks with baffles.
2.3.3. Deuterated Minimal Medium
1. 99% D2O.
2. Dry components: 0.5 g/L NaCl, 3 g/L KH2PO4, 6 g/L
Na2HPO4, 0.24 g/L MgSO4, 11.1 mg/L CaCl2, 2 g/L glucose,
0.337 g/L thiamine hydrochloride, 0.5 g/L 15NH4Cl, and
0.1 g/L ampicillin.
3. Six sterile vacuum filtration systems with 1-L storage containers.
4. Six sterile 2.8-L Fernbach flasks with baffles.
2.4. Starter Cultures
2.4.1. LB Medium
1. RPA pET15b BL21(DE3) pLysS LB plate (Subheading 2.1).
2. 250 mL LB starter culture (Subheading 2.3.1).
2.4.2. Minimal Medium and Deuterated Minimal Medium
1. RPA pET15b BL21(DE3) pLysS LB plate (Subheading 2.1).
2. Sterile 10-mL test culture tubes.
3. LB medium (Subheading 2.2).
5. 250 mL Minimal medium starter culture (Subheading 2.3.2).
2.5. Large-Scale Cell Culture and Overexpression
1. Starter culture (Subheading 2.4).
2. 6 L Sterile media in Fernbach flasks (Subheading 2.3).
3. 1 M IPTG (Subheading 2.2).
4. Bleach or 1% Terg-a-Zyme solution for decontamination of
spent media.
2.6. RPA Purification
2.6.1. Cell Lysis
1. Lysis buffer: Dissolve two complete EDTA-free protease inhibitor
cocktail tablets (Roche) in 80 mL of Ni-NTA buffer A
(Subheading 2.6.2) in a 150-mL glass beaker on ice immediately
prior to use.
2. 100-mL Glass homogenizer.
3. Sonic dismembrator.
4. 25-mm diameter, 0.45-μm syringe filter.
2.6.2. Ni-NTA Chromatography
1. Refrigerated Äkta FPLC purification system and accessories.
2. Ni-NTA buffer A: 20 mM HEPES, pH 7.5, 500 mM NaCl,
5 mM βME, 10 μM ZnCl2, 10 mM imidazole; adjusted to
target pH with concentrated HCl, filtered at 0.45 μm under
vacuum, and stored at 4°C (see Notes 4–7).
3. Ni-NTA buffer B: 20 mM HEPES, pH 7.5, 500 mM NaCl,
5 mM βME, 10 μM ZnCl2, 300 mM imidazole; adjusted to
target pH with concentrated HCl, filtered at 0.45 μm under
vacuum, and stored at 4°C (see Notes 4–7).
4. 25 mL Ni-NTA pre-packed FPLC column (Sigma–Aldrich).
2.6.3. Desalting Exchange and Heparin Chromatography
1. Centrifugal concentrators (15 mL, 30 kDa MWCO).
2. Refrigerated Äkta FPLC purification system and accessories.
3. Heparin buffer A: 20 mM HEPES, pH 7.5, 50 mM NaCl,
5 mM βME, 10 μM ZnCl2, 10% glycerol; adjusted to target
pH with concentrated HCl, filtered at 0.45 μm under vacuum,
and stored at 4°C (see Notes 6 and 7).
4. Heparin buffer B: 20 mM HEPES, pH 7.5, 1 M NaCl, 5 mM
βME, 10 μM ZnCl2, 10% glycerol; adjusted to target pH with
concentrated HCl, filtered at 0.45 μm under vacuum, and
stored at 4°C (see Notes 6 and 7).
5. HiPrep 26/10 Desalting column (GE Healthcare).
6. HiTrap 5-mL Heparin HP column (GE Healthcare).
2.6.4. Superdex 200 Gel Filtration Chromatography
1. Centrifugal concentrators (15 mL, 30 kDa MWCO).
2. 0.22-μm centrifugal spin filters.
3. Refrigerated Äkta FPLC purification system and accessories.
4. Gel filtration buffer: 20 mM HEPES, pH 7.5, 100 mM NaCl,
5 mM βME, 10 μM ZnCl2, 200 mM arginine; adjusted to tar-
get pH with concentrated HCl, filtered at 0.45 μm under
vacuum, and stored at 4°C (see Notes 6 and 7).
5. Superdex 200 HR 10/30 column (GE Healthcare).
2.7. Preparation of Samples for NMR
1. Centrifugal concentrators (15 mL, 30 kDa MWCO).
2. 5- and 4-mm NMR tubes.
3. 99% D2O.
3. Methods
This section describes a protocol for the production of full-length
RPA in E. coli and its subsequent purification, including preparation
of 2H-,15N-enriched protein for study by NMR. As a rule,
robust expression of full-length RPA in E. coli is challenging as
RPA is a relatively large protein (>100 kDa) and its ssDNA binding
properties are toxic to bacterial cells. Average yields of RPA over-
expressed from the pET15b vector range from 17 mg for 6 L of
rich LB culture to 4–8 mg of 15N-enriched RPA for 6 L of minimal
medium culture (1–2 NMR samples at approximately 130 μM
concentration and 260-μL volume). Working with RPA cultures
grown in deuterated minimal medium requires patience and careful
monitoring, as a deuterated environment is particularly stressful to
the bacterial metabolism. Consequently, deuterated cultures take
much longer to reach their target induction densities and usually
result in a diminished yield of the recombinant protein. Due to the
high cost of D2O and time investment required for growth and
expression, we found it beneficial to pilot a small-scale test culture
(100 mL) to develop an expected timeline for the growth and to
ensure that all reagents were functioning as expected. Subsequent
purification of this culture allowed us to determine that the overall
yield of RPA had not suffered significantly from production in a
deuterated environment. In the expression protocol below, we
describe the full 6 L production run; however, when embarking
upon deuterium labeling for the first time, we highly recommend
starting with the smaller pilot culture.
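The yield figures quoted above can be cross-checked with a short calculation. The following is only an illustrative sketch: it takes the 116 kDa mass of RPA, the approximate 130 μM concentration, and the 260-μL sample volume from the text, while the function name is hypothetical and handling losses are ignored.

```python
# Rough check: how much RPA does one NMR sample consume?
RPA_MW_G_PER_MOL = 116_000      # 116 kDa, from the Introduction


def mass_per_sample_mg(conc_uM: float, volume_uL: float,
                       mw_g_per_mol: float = RPA_MW_G_PER_MOL) -> float:
    """Return the protein mass (mg) in a sample of given concentration and volume."""
    moles = (conc_uM * 1e-6) * (volume_uL * 1e-6)   # mol/L x L
    return moles * mw_g_per_mol * 1e3               # g -> mg


per_sample = mass_per_sample_mg(130, 260)
print(f"{per_sample:.1f} mg per sample")
for yield_mg in (4, 8, 17):
    print(f"{yield_mg} mg yield -> {int(yield_mg // per_sample)} sample(s)")
```

With these inputs, one sample consumes about 3.9 mg, which is consistent with 4–8 mg of 15N-enriched protein supplying 1–2 samples.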
As RPA is a trimeric, DNA-binding protein, the purification
protocol below is designed to ensure samples free of contaminating
ssDNA, as well as uniform stoichiometry among all three RPA
subunits. Expression of RPA from the pET15b vector results in an
excess of the RPA70 subunit, which can be successfully separated
from the intact heterotrimer with heparin and gel filtration chro-
matography. The heparin purification step also selects for RPA free
from ssDNA contamination.
3.1. Cell Transformation
Ensuring robust antibiotic selection of the RPA vector on solid
medium is vital to enabling the success of subsequent liquid
cultures (see Note 2).
1. Thaw 100 μL of BL21(DE3) pLysS competent cells on ice, gently
combine with 100 ng of RPA pET15b vector, and incubate
for 30 min on ice.
2. Heat shock the cells at 42°C for 45 s and incubate on ice for
2 min. Add 900 μL of sterile SOC recovery medium.
3. Incubate the cells for 1 h at 37°C and 200–230 rpm, then
centrifuge the cells at 16,100 × g for 1 min at room tempera-
ture (see Note 8). Remove 900 μL of the clarified SOC medium
and gently resuspend the cells in the remaining medium prior
to plating on an LB medium plate.
4. Incubate the plates overnight at 37°C.
3.2. Cell Expression Testing
Expression testing allows confirmation of the expression capability
of the transformed bacterial colonies prior to scaling up protein
production. The testing also allows for selection of colonies with
the most robust expression.
1. Prepare five LB test cultures as follows: Transfer 5 mL of LB
medium to a 10-mL sterile culture tube and add 1:1,000 dilutions
of ampicillin and chloramphenicol antibiotic stocks.
Inoculate each culture with a colony selected from the center
of a freshly transformed RPA pET15b LB plate (see Note 9)
and incubate the cultures at 37°C, 200–230 rpm, until they
reach an A600 of 0.5–0.6 (approximately 3–4 h).
2. Transfer 250 μL of each LB test culture into an Eppendorf tube
as a preinduction sample. Centrifuge the sample at 16,100 × g
for 1 min, decant the supernatant, add 7 μL each of 2× SDS-
PAGE loading buffer and 8 M urea, and vortex to mix.
3. Add IPTG to a final concentration of 1 mM to the remainder
of the test culture to induce expression and continue to incu-
bate with shaking at room temperature for 3 h. Collect a final
250 μL postinduction sample and process as in step 2.
4. Boil pre- and postinduction SDS-PAGE samples for 5–10 min
to denature the lysates and load 2–4 μL from each sample into
a precast 4–12% Bis–Tris SDS-PAGE gel preloaded into an
electrophoresis cell filled with 1× MES SDS running buffer.
Reserve one lane for 5 μL of 1× prestained molecular weight
standards. Run the gel at 200 V.
5. Remove the gel from the electrophoresis cell and place in a
loosely capped container filled with Milli-Q water (see Note 10).
Fix the gel by heating on a high setting in a microwave oven
for 1 min, followed by 1 min of cooling. Exchange the water
and repeat. Remove the final rinse and stain with SimplyBlue
SafeStain for 20 min. Remove the stain and refill the container
with deionized water to destain the gel.
6. The relative proportion of RPA32 and RPA14 subunits is too
low to observe on the gel. RPA70 should be just distin-
guishable at the appropriate molecular weight in lanes con-
taining postinduction samples. Select colonies exhibiting
the most abundant RPA production for subsequent large-
scale expression.
3.3. Preparation of Culture Media
3.3.1. LB Medium
This medium serves for the production of unlabeled RPA.
Preparation should include a 250-mL starter culture to accommodate
a 6-L large-scale culture.
1. Dissolve LB components (Subheading 2.2) in Milli-Q water in
a 500-mL baffled Erlenmeyer flask (starter culture) or 2.8-L
baffled Fernbach flasks (large-scale culture), autoclave at 121°C
for 15 min, and cool to 50–60°C.
2. Add ampicillin and chloramphenicol at 1:1,000 dilution imme-
diately prior to inoculation.
3.3.2. Minimal Medium
This medium serves for the production of 15N-enriched RPA.
Preparation should include a 250 mL starter culture to accommodate
a 6 L large-scale culture. A 250 mL minimal medium culture
also serves as an adaptation culture for production of deuterated
protein.
1. Dilute 10× M9 salts in Milli-Q water to 1× in a 500-mL baffled
Erlenmeyer flask (starter culture) or 2.8-L baffled Fernbach
flasks (large-scale culture), autoclave at 121°C for 15 min, and
cool to 50–60°C.
2. Add the following components immediately prior to inocu-
lation: 0.5 g/L of 15NH4Cl (see Note 11), 2 mL/L of 1 M
MgSO4, 100 μL/L of 1 M CaCl2, 10 mL/L of 20% glucose,
1 mL/L of 1 M thiamine hydrochloride, and antibiotic stocks
at 1:1,000 dilution.
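As a cross-check on the stock additions in step 2, the sketch below converts each stock volume into a final concentration. The stock molarities and volumes come from the text; the molecular weights are standard anhydrous values and are an assumption of this sketch, as is the helper name.

```python
# Final concentration of each supplement after dilution into 1 L of medium.
# Molecular weights are standard anhydrous values (an assumption here).
ADDITIONS = {
    # name: (stock molarity in mol/L, volume added in mL per L, MW in g/mol)
    "MgSO4": (1.0, 2.0, 120.37),
    "CaCl2": (1.0, 0.1, 110.98),
    "thiamine HCl": (1.0, 1.0, 337.27),
}


def final_conc(stock_M: float, added_mL_per_L: float, mw: float):
    """Return (mM, g/L) reached in the medium, ignoring the small volume change."""
    mM = stock_M * added_mL_per_L          # mol/L x mL/L = mmol/L
    return mM, mM * mw / 1000.0            # mmol/L x g/mol / 1000 = g/L


for name, (stock, vol, mw) in ADDITIONS.items():
    mM, g_per_L = final_conc(stock, vol, mw)
    print(f"{name}: {mM:.1f} mM = {g_per_L:.3f} g/L")

# Glucose is supplied as a 20% (w/v) stock, i.e., 200 g/L.
glucose_g_per_L = 200.0 * 10.0 / 1000.0    # 10 mL/L of the 20% (w/v) stock
print(f"glucose: {glucose_g_per_L:.1f} g/L")
```

Under these assumptions, the results (about 0.24 g/L MgSO4, 11.1 mg/L CaCl2, 0.337 g/L thiamine hydrochloride, and 2 g/L glucose) match the dry-component amounts listed for the deuterated medium in Subheading 2.3.3.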
3.3.3. Deuterated Minimal Medium
This medium serves for six 1 L large-scale production cultures and
is prepared immediately prior to inoculation after the success of the
250 mL minimal medium adaptation culture has been ascertained.
1. Dry autoclave six 2.8-L Fernbach flasks with baffles and allow
to dry thoroughly overnight.
2. Dissolve all dry components in 6 L of 99% D2O (see Notes 12
and 13) and immediately sterilize the medium in 1-L batches
by using sterile vacuum filtration systems (i.e., 1 L/unit). This
apparatus filters the medium directly into a sterile 1-L bottle.
Chloramphenicol is not included in the medium at this stage to
ease the metabolic burden on the cells.
3. Carefully transfer each 1 L of sterile deuterated minimal
medium to a dry, sterile Fernbach flask (see Note 14).
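Because the deuterated medium may be prepared either for the 100 mL pilot culture or for the full 6 L run, a small scaling helper can reduce weighing errors. This is a minimal sketch: the per-litre amounts are those listed in Subheading 2.3.3, and the function and variable names are hypothetical.

```python
# Dry components of the deuterated minimal medium, in grams per litre,
# as listed in Subheading 2.3.3.
DRY_G_PER_L = {
    "NaCl": 0.5,
    "KH2PO4": 3.0,
    "Na2HPO4": 6.0,
    "MgSO4": 0.24,
    "CaCl2": 0.0111,
    "glucose": 2.0,
    "thiamine hydrochloride": 0.337,
    "15NH4Cl": 0.5,
    "ampicillin": 0.1,
}


def weigh_out(volume_L: float) -> dict:
    """Return the mass (g) of each dry component for the given D2O volume."""
    return {name: round(g_per_L * volume_L, 4)
            for name, g_per_L in DRY_G_PER_L.items()}


pilot = weigh_out(0.1)   # 100 mL pilot culture
full = weigh_out(6.0)    # full 6 L production run
for name in DRY_G_PER_L:
    print(f"{name:24s} pilot {pilot[name]:8.4f} g   full {full[name]:8.4f} g")
```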
3.4. Starter Cultures
3.4.1. LB Medium
1. Inoculate a 250 mL LB starter culture (Subheading 3.3.1)
directly with an RPA pET15b colony selected from the test
expression.
2. Grow the culture overnight at 37°C, 200–230 rpm. The culture
should be cloudy in the morning.
3.4.2. Minimal Medium and Deuterated Minimal Medium
1. Inoculate a 4 mL LB starter culture (4 mL LB + 1:1,000
ampicillin/chloramphenicol in a 10-mL culture tube) with an RPA
pET15b colony selected from the test expression. Grow for
3–4 h at 37°C, 200–230 rpm, or until cloudy.
2. Inoculate the 250 mL minimal medium starter culture
(Subheading 3.3.2) with the 4 mL LB starter culture and
shake overnight at 37°C. The culture should be cloudy in the
morning.
3.5. Large-Scale Cell Culture and Overexpression
1. Prepare six 1 L cultures of rich LB medium, minimal medium, or
deuterated minimal medium as described above (Subheading 3.3)
and inoculate each with 30 mL (40 mL for deuterated minimal
medium) of the corresponding overnight starter culture.
2. Grow the cultures at 37°C, 200–230 rpm, until an A600 of
0.6–0.7 is reached (see Note 15).
3. Allow the cultures to equilibrate for half an hour with agitation
at 18°C (or room temperature for deuterated minimal medium)
prior to induction. Collect a preinduction SDS-PAGE sample
as described above (Subheading 3.2) and induce the cells with
1 mM IPTG. Allow cells to express overnight (approximately
16–18 h).
4. The A600 at the end of the expression period should be 1.8–2.0
for LB medium cultures and 0.9–1.0 for standard and deuterated
minimal media cultures. Collect postinduction SDS-PAGE
samples from the cultures as described above (Subheading 3.2).
Harvest the cultures by centrifuging at 10,000 × g for 20 min
at 4°C.
5. Decant the supernatant and reserve the spent deuterated media
for recycling (15). Spent LB or minimal media may be decon-
taminated by the addition of bleach or a 1% Terg-a-zyme solu-
tion for 30 min, and then discarded. If purification does not
follow immediately, transfer the pellets to sterile 50-mL conical
tubes and freeze at −80°C. Run pre- and postinduction SDS-
PAGE samples as in Subheading 3.2 to confirm the presence
of RPA expression.
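The inoculation and induction volumes in the steps above lend themselves to a quick bookkeeping check. The sketch below uses only values stated in the protocol (six 1 L flasks, a 250 mL starter, 30 or 40 mL of inoculum per flask, 1 M IPTG stock, 1 mM final); the function names are hypothetical and the volume added by the IPTG itself is neglected.

```python
# Bookkeeping for the large-scale cultures (values from the steps above).
N_FLASKS = 6
CULTURE_L = 1.0            # litres per Fernbach flask
STARTER_mL = 250.0         # overnight starter culture volume
IPTG_STOCK_M = 1.0
IPTG_FINAL_mM = 1.0


def starter_needed_mL(inoculum_mL_per_flask: float) -> float:
    """Total starter volume consumed across all flasks."""
    return inoculum_mL_per_flask * N_FLASKS


def iptg_per_flask_mL() -> float:
    """IPTG stock volume per flask from C1*V1 = C2*V2 (added volume neglected)."""
    return IPTG_FINAL_mM * CULTURE_L / IPTG_STOCK_M   # mM x L / M = mL


for medium, inoculum in (("LB or minimal", 30.0), ("deuterated minimal", 40.0)):
    used = starter_needed_mL(inoculum)
    print(f"{medium}: {used:.0f} mL of starter used, {STARTER_mL - used:.0f} mL spare")
print(f"IPTG: {iptg_per_flask_mL():.1f} mL of 1 M stock per 1 L flask")
```

In both cases the 250 mL starter culture covers all six flasks, with only about 10 mL to spare for the deuterated cultures.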
3.6. RPA Purification
Purification of RPA involves three primary steps: Ni-NTA affinity,
heparin, and size-exclusion chromatography. For best results, the
protocol should be completed over the course of 2 days, where
Ni-NTA and heparin chromatography steps are accomplished the
first day and the final gel filtration step is carried out on the second
day. If necessary, the Ni-NTA and heparin steps may be divided
into two separate days and protein fractions from each purification
kept at 4°C overnight. Ideally, though, the time from cell lysis to
the final gel filtration exchange should be kept to a minimum.
3.6.1. Cell Lysis
1. If cells have been frozen at −80°C, thaw the pellets by
submerging the 50-mL conical tubes in cool water. Meanwhile,
prepare and chill the lysis buffer and pre-chill the 100-mL glass
homogenizer on ice (see Note 16).
2. Transfer all 6 L of RPA cell pellets into the 100-mL homoge-
nizer, rinse the 50-mL conical tubes with ice-cold lysis buffer,
and add the rinse and any remaining lysis buffer to the
homogenizer.
3. Homogenize the lysate until smooth (10–15 strokes).

4. Return the lysate to the 150-mL glass beaker and pack this into
a 2-L plastic beaker filled with an ice-water bath (see Note 17).
Ensure that there is sufficient ice to securely brace the beaker
and prevent floating.
5. Sonicate the lysate with a macrotip set at 60% power for 5.0 min
of total process time (pulsing 5.0 s on and 5.0 s off). Pause the
sonicator half-way through this cycle to replenish the ice-water
bath and to check the temperature of the lysate (see Note 16).
The lysate should become translucent and less viscous as the
sonicator disrupts the cellular material. If the lysate viscosity remains
unchanged after the cycle is complete, repeat the cycle once
more, monitoring the ice-water bath and lysate temperature.
6. Clarify the lysate by centrifuging at 48,000 × g for 20 min at
4°C. Ensure that both centrifuge and rotor are pre-chilled to
at least 4°C.
7. Decant and filter the clarified supernatant through a 0.45-μm
membrane. Store on ice for immediate loading onto the FPLC
Ni-NTA column.
3.6.2. Ni-NTA Chromatography
Steps 2–4 are implemented as a pre-programmed Äkta FPLC
method.
1. Equilibrate the prepacked 25 mL Ni-NTA column with three
column volumes (3 CVs) each of filtered Milli-Q water and
Ni-NTA buffer A (see Note 18).
2. Load the filtered lysate onto the equilibrated Ni-NTA column
at 1.0–1.5 mL/min.
3. Wash unbound lysate from the column with 4 CVs of Ni-NTA
buffer A at 2.5 mL/min.
4. Elute RPA with a 4 CV gradient (0–100% Ni-NTA buffer B,
10–300 mM imidazole), collecting 6-mL fractions at 2.5 mL/min.
5. Assess the presence of RPA from the A280 chromatogram trace
and SDS-PAGE of relevant fractions (sampling 5 μL of each
fraction). Pool fractions containing all three RPA subunits for
further processing (typically, a 60-mL pool).
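When planning the Ni-NTA method, it can help to tabulate the nominal imidazole concentration in each elution fraction. The following sketch assumes an ideal linear gradient and ignores system delay volume; the column volume, gradient length, fraction size, flow rate, and buffer concentrations are those given above, and the function name is hypothetical.

```python
# Linear-gradient bookkeeping for the Ni-NTA elution (values from the steps above).
CV_mL = 25.0               # column volume
GRADIENT_CV = 4.0
FRACTION_mL = 6.0
FLOW_mL_MIN = 2.5
IMIDAZOLE_A_mM, IMIDAZOLE_B_mM = 10.0, 300.0


def imidazole_mM(volume_mL: float) -> float:
    """Nominal imidazole concentration after `volume_mL` of gradient is delivered."""
    frac_B = min(max(volume_mL / (GRADIENT_CV * CV_mL), 0.0), 1.0)
    return IMIDAZOLE_A_mM + frac_B * (IMIDAZOLE_B_mM - IMIDAZOLE_A_mM)


gradient_mL = GRADIENT_CV * CV_mL
n_fractions = -(-gradient_mL // FRACTION_mL)      # ceiling division
print(f"gradient: {gradient_mL:.0f} mL, {gradient_mL / FLOW_mL_MIN:.0f} min, "
      f"{n_fractions:.0f} fractions")
for i in range(1, int(n_fractions) + 1):
    mid = (i - 0.5) * FRACTION_mL                 # midpoint of fraction i
    print(f"fraction {i:2d}: ~{imidazole_mM(mid):.0f} mM imidazole")
```

The actual concentration at the column outlet lags these nominal values by the system and column volume, so the table is a guide for locating fractions rather than a measurement.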
3.6.3. Desalting Exchange and Heparin Chromatography
The charged DNA-binding clefts of RPA render it sensitive to the
absence of ambient salt. Effective binding of RPA to the heparin
matrix, however, requires a low salt content in the loading buffer.
Direct dialysis into the loading buffer (heparin buffer A) usually
provokes extensive precipitation of RPA. Buffer exchange by desalting,
however, allows for the rapid and successful transfer of RPA
into the loading buffer with minimal aggregation. Once the series
of FPLC desalting runs is complete, it is imperative to load the
exchanged protein directly onto the heparin column to restore a
stabilizing ionic environment. As before, loading, washing, and
elution are implemented automatically using pre-programmed
Äkta FPLC methods.
1. Pre-rinse two centrifugal concentrators (15 mL, 30 kDa
MWCO) with Milli-Q water by centrifuging at 3,700 × g for
10 min at 4°C. Concentrate the Ni-NTA RPA pool to ~30 mL
and store on ice for desalting into heparin buffer A (see Notes
19 and 20).
2. Equilibrate the HiPrep 26/10 Desalting column with 2 CVs
of filtered Milli-Q water and 1.5 CVs of heparin buffer A.
Equilibrate the HiTrap 5 mL Heparin HP column with 5 CVs
of filtered Milli-Q water and 3 CVs of heparin buffer A (see
Note 21).
3. Filter and load 10 mL of the Ni-NTA RPA concentrate onto
the desalting column at 2.0 mL/min, collecting 4-mL frac-
tions. The protein should elute within the first four fractions
(16 mL) of the run. Re-equilibrate the column and repeat the
run twice for the remaining 20 mL of RPA Ni-NTA concen-
trate. Store fractions on ice until all runs are complete.
4. Combine all three desalting pools (48 mL), filter at 0.45 μm,
and load directly onto the equilibrated heparin column at
1.0 mL/min.
5. Wash out unbound sample with 3 CVs of heparin buffer A at
2.5 mL/min.
6. Elute RPA with a 20 CV gradient (0–100% heparin buffer B,
50 mM to 1 M NaCl), collecting 4-mL fractions at 2.5 mL/min.
7. Assess the presence of RPA from the A280 chromatogram trace
and SDS-PAGE of relevant fractions (sampling 5 μL of each
fraction). The elution should include two major peaks: the first
corresponding to RPA70 exclusively and the second to trimeric
RPA. Fractions containing all three RPA subunits are pooled
for final purification (typically, a 40-mL pool).
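The volumes in the desalting and heparin steps can be verified with a few lines of arithmetic. This sketch uses the batch sizes, fraction sizes, column volume, and NaCl range stated above; it assumes an ideal linear gradient with no delay volume, and the function name is hypothetical.

```python
# Desalting batches and heparin gradient (values from the steps above).
CONCENTRATE_mL = 30.0
BATCH_mL = 10.0            # load per desalting run
FRACTIONS_KEPT = 4         # protein elutes in the first four fractions
DESALT_FRACTION_mL = 4.0

runs = int(CONCENTRATE_mL / BATCH_mL)
heparin_load_mL = runs * FRACTIONS_KEPT * DESALT_FRACTION_mL
print(f"{runs} desalting runs -> {heparin_load_mL:.0f} mL loaded onto heparin")

HEPARIN_CV_mL = 5.0
GRADIENT_CV = 20.0
NACL_A_mM, NACL_B_mM = 50.0, 1000.0


def nacl_mM(fraction_index: int, fraction_mL: float = 4.0) -> float:
    """Nominal NaCl concentration at the midpoint of a 1-based gradient fraction."""
    volume = (fraction_index - 0.5) * fraction_mL
    frac_B = min(volume / (GRADIENT_CV * HEPARIN_CV_mL), 1.0)
    return NACL_A_mM + frac_B * (NACL_B_mM - NACL_A_mM)


for idx in (1, 13, 25):
    print(f"fraction {idx}: ~{nacl_mM(idx):.0f} mM NaCl")
```

The three desalting runs give the 48 mL load quoted in step 4, and the 100 mL gradient spans 25 fractions of 4 mL.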
3.6.4. Superdex 200 Gel Filtration Chromatography
Gel filtration provides a final polishing step for the purification
and ensures the removal of any trace RPA70 or low-molecular-weight
contaminants. As before, loading and elution are implemented
automatically using a pre-programmed Äkta FPLC method.
1. Equilibrate the Superdex 200 HR 10/30 column with 1.5 CVs
of filtered Milli-Q water and 1.5 CVs of gel filtration buffer.
2. Pre-rinse two centrifugal concentrators (15 mL, 30 kDa
MWCO) with Milli-Q water by centrifuging at 3,700 × g for 10 min
at 4°C. Concentrate the heparin RPA pool to ~300–500 μL
(see Note 19).
3. Filter the concentrate by using a 0.22-μm centrifugal spin filter
in a refrigerated (4°C) centrifuge and load onto the Superdex
200 HR 10/30 column at 0.3 mL/min, collecting 0.5-mL
fractions for 1.5 CVs.
4. As before, assess the presence of RPA from the A280 trace and SDS-
PAGE of relevant fractions (sampling 5 μL of each fraction).
5. Before the final RPA fractions are pooled, acquire final A280
and A260 measurements by UV-Vis spectrophotometry to
ensure that the selected fractions are DNA-free, as determined
by A260/A280 ratios of 0.64 or less.
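The A260/A280 criterion in step 5 is easy to apply programmatically when many fractions are measured. The sketch below is illustrative only: the 0.64 threshold comes from the text, whereas the fraction names and absorbance readings are hypothetical values invented for the example.

```python
# Screen gel filtration fractions for nucleic acid contamination.
MAX_RATIO = 0.64           # threshold stated in step 5


def is_dna_free(a260: float, a280: float, max_ratio: float = MAX_RATIO) -> bool:
    """True when the A260/A280 ratio is at or below the threshold."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return (a260 / a280) <= max_ratio


# Hypothetical readings, for illustration only: {fraction: (A260, A280)}
readings = {"F21": (0.30, 0.52), "F22": (0.41, 0.70), "F23": (0.45, 0.60)}
pool = [name for name, (a260, a280) in readings.items() if is_dna_free(a260, a280)]
print("fractions to pool:", pool)
```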
3.7. Preparation of Samples for NMR
1. Pre-rinse two centrifugal concentrators (15 mL, 30 kDa
MWCO) with Milli-Q water by centrifuging at 3,700 × g for
10 min at 4°C. Concentrate the S200 RPA pool to ~500–
600 μL (see Note 19), monitoring the protein concentration
by UV-Vis spectrophotometry (the gel filtration buffer may
serve as a blank).
2. The target NMR concentration for RPA is ~100–130 μM
(10–15 mg/mL) with a minimum sample volume of 260 μL
(using a 4-mm-diameter NMR tube). Once the target concen-
tration is reached, centrifuge 260–300 μL of the concentrate at
16,100 × g for 5 min at 4°C to remove any stray precipitation.
3. Load the protein into a standard 4-mm diameter NMR tube,
which may be fitted with an adaptor to fit a 5-mm spinner or
slipped into a 5-mm diameter tube containing 120 μL D2O
without the need for the adaptor.
4. Notes
1. The tricistronic RPA pET15b vector was a gift from the lab of
Alexey Bochkarev. 6×-His tags with thrombin cleavage sites
precede the RPA70 and RPA14 subunits. The order of subunit
open reading frames (ORFs) is as follows: RPA70, RPA14,
RPA32.
2. Using freshly prepared ampicillin stock in the LB medium is
important for ensuring robust RPA transformation. Even
though the choice of the BL21(DE3) pLysS cell line is designed
to circumvent leaky expression, even small amounts of nonin-
duced RPA can potentially result in resistant cells with less than
robust expression.
3. Antibiotic stocks may be aliquoted and stored at −20°C for
future use. For long-term storage of ampicillin stocks, storage
at −80°C is recommended.
4. The 70C domain of RPA contains a zinc-binding motif, for
which ZnCl2 is included in the purification buffers.
5. The most effective imidazole concentrations in the Ni-NTA
buffers will depend on how recently the Ni-NTA resin has been
charged. For freshly charged resin, the imidazole concentration
for Ni-NTA buffer A is often raised to 30 mM for the first few
purifications to compensate for the higher nonspecific affinity
of the resin.
6. βME is added fresh to each buffer immediately prior to use.
Buffers can be prepared without βME and stored at 4°C if
their use is anticipated to last beyond 1–2 days.
7. Preparation of 1 L of each buffer should provide more than
enough for the entire purification.
8. As the transformation efficiency of the RPA pET15b vector is
low and the double-antibiotic selection with the BL21(DE3)
pLysS strain is quite stringent, plating the entire transforma-
tion culture is recommended.
9. When transferring a selected colony to the test expression
culture, be sure to leave a portion behind for future inocula-
tion of the large-scale cultures. If the colony is too small to
divide in this manner, the plate may be left at room tempera-
ture for half a day to allow the colony to regrow.
10. An empty gel tip box will suffice. The level of water should be
enough to immerse the gel.
11. The 15NH4Cl is measured out and added directly to the sterile
medium as a powder. This ensures that the labeled material is
not wasted should a step prior to the inoculation fail.
12. As mentioned at the beginning of Subheading 3, the volume
of deuterated minimal medium may be adjusted for small-scale
testing.
13. Mixing of the deuterated minimal medium dry components
should take place in a clean, dry container and be carried out as
efficiently as possible to prevent exchange with ambient water
vapor. If a container large enough to accommodate 6-L volume
is not available, substitution of two 3-L containers with subse-
quent exchange and mixing between the two batches can be used
to ensure the homogeneity of the medium across all cultures.
14. Performing this sterile transfer in the presence of a Bunsen
burner flame is advised.
15. In our experience, growth in rich LB medium requires 3–4 h
to reach the target A600 while growth in minimal medium
requires 8–10 h. For deuterated minimal medium, this time-
line extends to 1.5–2 days. To ensure that the induction
occurred during daylight hours, cultures were switched to
agitation at 20°C overnight after a full day of growth at 37°C
and then returned to 37°C the next morning. The target A600
was reached during the afternoon of the second day.
16. RPA is susceptible to proteolytic cleavage, particularly at the
unstructured 60–70 amino-acid linker that connects the 70N
and 70A domains. Throughout the purification, it is essential
that all buffers are kept ice cold and that heating from other
steps of the lysis (sonication, centrifugation) is kept to a
minimum.
17. The ice-water bath serves as a heat sink during sonication of
the lysate.
18. Initializing the FPLC system (pump washing, cleaning super-
loops, setting up fraction collectors), as well as equilibration of
the Ni-NTA column, should occur prior to or concurrently
with cell lysis to ensure that the clarified, filtered lysate can be
loaded directly into the system as soon as it is available.
19. Centrifugal concentrators are usually spun in 10–15-min incre-
ments and carefully mixed with each addition of the Ni-NTA
pool to prevent buildup and aggregation of RPA at the base of
the concentrator.
20. The desalting resolution of the HiPrep 26/10 Desalting
column (GE Healthcare) is 10 mL; that is, the column can
effectively exchange 10 mL of injected sample into the tar-
get buffer without contamination from the original buffer.
To avoid desalting too concentrated a volume of RPA and
triggering aggregation, the Ni-NTA pool is processed in
three sequential 10-mL batches.
21. As with the Ni-NTA step, initializing the FPLC system and
equilibrating the desalting and heparin columns should occur
prior to or concurrently with concentrating the RPA Ni-NTA
pool to allow for immediate loading once the target volume
is reached.
Acknowledgments
The authors would like to thank Dr. Dalyir Pretto and Susan Meyn.
This work was supported by the National Institutes of Health
operating grant R01 GM65484 and graduate training grant T32
GM08320.
References
1. Wold, M. S. (1997) Replication protein A: A heterotrimeric,
single-stranded DNA-binding protein required for eukaryotic DNA
metabolism. Annu. Rev. Biochem. 66, 61–92.
2. Fanning, E., Klimovich, V., and Nager, A. R. (2006) A dynamic
model for replication protein A (RPA) function in DNA processing
pathways. Nuc. Acids Res. 34, 4126–4137.
3. Stauffer, M. E., and Chazin, W. J. (2004) Structural mechanisms
of DNA replication, repair, and recombination. J. Biol. Chem. 279,
30915–30918.
4. Bochkarev, A., Pfuetzner, R. A., Edwards, A. M., and Frappier, L.
(1997) Structure of the single-stranded-DNA-binding domain of
replication protein A bound to DNA. Nature 385, 176–181.
5. Jacobs, D. M., Lipton, A. S., Isern, N. G., Daughdrill, G. W.,
Lowry, D. F., Gomes, X., and Wold, M. S. (1999) Human replication
protein A: Global fold of the N-terminal RPA-70 domain reveals a
basic cleft and flexible C-terminal linker. J. Biomol. NMR 14,
321–331.
6. Bochkarev, A., Bochkareva, E., Frappier, L., and Edwards, A. M.
(1999) The crystal structure of the complex of replication protein A
subunits RPA32 and RPA14 reveals a mechanism for single-stranded
DNA binding. EMBO J. 18, 4498–4504.
7. Mer, G., Bochkarev, A., Gupta, R., Bochkareva, E., Frappier, L.,
Ingles, C. J., Edwards, A. M., and Chazin, W. J. (2000) Structural
basis for the recognition of DNA repair proteins UNG2, XPA, and
RAD52 by replication factor A. Cell 103, 449–456.
8. Bochkareva, E., Korolev, S., Lees-Miller, S. P., and Bochkarev, A.
(2002) Structure of the RPA trimerization core and its role in the
multistep DNA-binding mechanism of RPA. EMBO J. 21, 1855–1863.
9. Pervushin, K., Riek, R., Wider, G., and Wuthrich, K. (1997)
Attenuated T2 relaxation by mutual cancellation of dipole-dipole
coupling and chemical shift anisotropy indicates an avenue to NMR
structures of very large biological macromolecules in solution.
Proc. Natl. Acad. Sci. U.S.A. 94, 12366–12371.
10. Riek, R., Wider, G., Pervushin, K., and Wuthrich, K. (1999)
Polarization transfer by cross-correlated relaxation in solution NMR
with very large molecules. Proc. Natl. Acad. Sci. U.S.A. 96,
4918–4923.
11. Riek, R., Pervushin, K., and Wuthrich, K. (2000) TROSY and
CRINEPT: NMR with large molecular and supramolecular structures
in solution. Trends Biochem. Sci. 25, 462–468.
12. Tugarinov, V., Hwang, P. M., and Kay, L. E. (2004) Nuclear
magnetic resonance spectroscopy of high-molecular-weight proteins.
Annu. Rev. Biochem. 73, 107–146.
13. Brosey, C. A., Chagot, M. E., Ehrhardt, M., Pretto, D. I.,
Weiner, B. E., and Chazin, W. J. (2009) NMR analysis of the
architecture and functional remodeling of a modular multi-domain
protein, RPA. J. Am. Chem. Soc. 131, 6346–6347.
14. Sambrook J and Russell D. (2001) Molecular cloning: A laboratory
manual, vol. 3, 3rd ed. Cold Spring Harbor Laboratory Press,
New York.
15. Li, M. X., Corson, D. C., and Sykes, B. D. (2002) Structure
determination by NMR isotope labeling. Meth. Mol. Biol. 173,
255–265.
Chapter 12
NMR Studies of Protein–RNA Interactions

Carla A. Theimer, Nakesha L. Smith, and May Khanna
Abstract
This chapter describes the preparation of NMR quantities of RNA purified to single-nucleotide resolution
for protein–RNA interaction studies. The protocol is easily modified to make nucleotide-specific isotopically
labeled RNAs or uniformly labeled RNA fragments for ligation to generate segmentally labeled RNAs.
Key words: In vitro transcription, Single-nucleotide resolution, RNA synthesis, Protein–RNA interactions,
RNA purification, Isotopic labeling
1. Introduction
Understanding RNA–protein interactions and the structures of

RNA–protein complexes is an important avenue of research.
Ribonucleoprotein (RNP) complexes and RNA–protein interactions
have been found to be central to many biological processes, ranging
from genomic stability through telomeric maintenance (1–3), and
alternative splicing of RNA by the spliceosome (4), to protein synthesis
by the ribosome (5–8). Additionally, exciting new discoveries
regarding small, noncoding RNAs, including snoRNAs, snRNAs,
and RNA interference, continue to unfold. Many of these RNAs are
associated with proteins at various stages of maturation, from traf-
ficking to processing to their final destination, and any defect in
these pathways can lead to serious diseases (9). Thus, dissecting the
structure of RNA–protein interactions is crucial for deciphering
the roles of these RNP complexes as well as identifying potential
targets for pharmaceutical intervention.
Protocols are well-established and documented for the genera-
tion of sufficient quantities of 13C-,15N-isotopically labeled proteins
from bacterial sources for NMR studies; the purification of such

proteins uses removable protein tags (e.g., His6 and GST) and various
permutations of affinity, ion exchange, and size-exclusion chroma-
tography. In addition, methodologies for generating 13C-,15N-
isotopically labeled amino acid-specific (10) or segmentally labeled
proteins are also available (11, 12). Similar strategies for segmental
and nucleotide-specific labeling are particularly important for
the analyses of RNAs by using NMR spectroscopy (13, 14), given
the extremely poor dispersion of RNA chemical shifts. Selectively
labeled RNAs are particularly useful in combination with NMR
experiments that utilize filtering and editing schemes to identify
intramolecular NOEs (between labeled and unlabeled nucleotides
in a single-RNA sequence for unambiguous assignment of overlapped
chemical shifts) and intermolecular NOEs (between RNAs or in RNA–
protein complexes) (15). However, since there is no commercially
available source for chemically synthesized 13C-,15N-isotopically
labeled RNA, these studies rely on the ability to synthesize large
quantities of RNA in vitro in the laboratory. The following sections
outline start-to-finish detailed protocols for the synthesis and purifica-
tion of milligram quantities of RNA using in vitro transcription by T7
RNA polymerase, suitable for use in NMR studies (Fig. 1a).
Fig. 1. RNA synthesis and purification. (a) An RNA synthesis and purification flowchart, including the general steps from
initial template design to NMR data collection. (b) DNA template design for RNA transcription. The RNA product is presented
on the annealed DNA template/T7 Top promoter duplex in lower case letters. The DNA sequences of the T7 RNA polymerase
promoter sequence and DNA template for transcription are presented in upper case letters. The two nucleotides at the 5′
end of the template, which should have 2′-OMe substitutions, are indicated by asterisks. (c) A representative 20% (19:1)
acrylamide:bisacrylamide gel showing the observed bands from a typical analytical transcription experiment compared to
the DNA template.
2. Materials
2.1. DNA Template Design for RNA Synthesis
1. Template DNA: Complementary DNA template for RNA
transcription, 0.1–1.0 mM in water; store at −20°C (Fig. 1b;
see Note 1).
2. T7 Top DNA: Purified T7 RNA polymerase promoter DNA
(coding strand, 5′-TAATACGACTCACTATA-3′), 1 mM in
water; store at −20°C.
2.2. RNA Synthesis
1. 50 µM Annealed DNA template: annealed template DNA/T7 Top
DNA stock solution in water (see Subheading 3.1); store at −20°C.
2. 10× Transcription buffer: 400 mM Tris–HCl, pH 8.0, 10 mM
spermidine, 0.1% (w/v) Triton X-100; store at room temperature.
3. 100 mM dithiothreitol (DTT): Store in 1–2-mL aliquots
at −20°C.
4. 100 mM ATP, CTP, GTP, and UTP: Dissolve nucleotides in
water and adjust to pH 8.0 using 1M NaOH; store in 1–2-mL
aliquots at −20°C (see Note 2).
5. 1M MgCl2: Store at room temperature.
6. T7 RNA polymerase/ribonuclease (RNAse) inhibitor mixture:
T7 RNA polymerase in solution, dialyzed into 30 mM HEPES
(adjusted to pH 7.5 with 1M NaOH), 0.1M potassium glutamate,
0.25 mM EDTA, 0.05% Tween-20, 1 mM DTT, 200 mM
NaCl. Following dialysis, RNAse inhibitor (see Note 3) and
glycerol (50% (w/v) final concentration) are added. Store in
1-mL aliquots at −80°C.
7. Gel running buffer (1× TBE): 90 mM Tris base, 90 mM boric
acid, and 2 mM EDTA (using 0.5M EDTA, pH 8.0, stock
solution). Typically made as a 5× or 10× concentrated stock
and diluted as needed; store at room temperature.
8. Denaturing acrylamide gel solution: 20% acrylamide/bisacryl-
amide (19:1), 1× TBE, and 7.8 M urea; store at 4°C (see Note 4).
9. 10% (w/v) Ammonium persulfate solution (APS): Store at 4°C.
10. Denaturing gel loading buffer: 80% formamide, 10 mM EDTA
(using 0.5M EDTA, pH 8.0, stock solution), 0.025% (w/v)
xylene cyanol, and 0.025% (w/v) bromophenol blue; store at
room temperature.
11. Toluidine blue stain: 0.25% (w/v) in water; prepare as needed,
and store at room temperature.
12. 100% Ethanol; store at −20°C.
13. 0.5M EDTA stock solution: 0.5M EDTA, adjust to pH 8.0
using solid NaOH pellets.
14. N,N,N´,N´-tetramethyl-ethane-1,2-diamine (TEMED).
2.3. RNA Purification
1. Denaturing gel loading buffer, denaturing acrylamide gel
solution, APS, and gel running buffer are identical to those
described in Subheading 2.2.
2. Polyester-backed, silica-based, fluorescent thin-layer chroma-
tography (TLC) plates.
3. Low-salt buffer: 10 mM monosodium phosphate/disodium
phosphate buffer, pH 7.6 (adjusted with 1M NaOH or 1M
HCl if the pH is off by more than 0.2 pH units), 1 mM EDTA,
and 200 mM KCl; store at room temperature (see Note 5).
4. High-salt buffer: 10 mM monosodium phosphate/disodium
phosphate buffer, pH 7.6 (adjusted with 1M NaOH or 1M
HCl if the pH is off by more than 0.2 pH units), 1 mM EDTA,
and 1.5M KCl; store at room temperature (see Note 5).
5. HiTrap-Q anion exchange column (5 mL prepacked).
6. 20% (w/v) Ethanol.
7. Amicon-pressurized stirred cell and 1,000 and/or 3,000
molecular weight cutoff (MWCO) membranes.
8. NMR buffer for preliminary RNA experiments or for RNA–
protein interaction studies (typically, 10–20 mM sodium phos-
phate buffer, pH 6.0–7.0, 0–200 mM KCl).
9. 70% Ethanol.
10. 100% Ethanol.
11. Electroelution chamber.
2.4. Preliminary RNA Analysis by NMR
1. 99.99% D2O: Store in sealed ampoules at room temperature.
2. 4M potassium chloride: Store at room temperature.
3. Acid and base solutions for pH optimization: Usually, 0.1 and
1M hydrochloric acid and 0.1 and 1M potassium (or sodium)
hydroxide; store at room temperature.
4. Sodium azide.
5. Standard and Shigemi NMR tubes.
2.5. RNA–Protein Interactions by NMR
1. Polyethyleneimine (PEI): 5% (w/v) solution at pH 7.9 (MW
50–100 K).
2. Solid NaCl.
3. Amicon-pressurized stirred cell and 1,000 and/or 3,000
MWCO membranes.
4. NMR buffer similar to that used for RNA experiments (see
Subheading 2.3).
5. Shigemi NMR tubes, D2O matched.
6. Solid ammonium sulfate.
3. Methods
3.1. DNA Template Design for RNA Synthesis
Commonly, a partially double-stranded DNA template is used for
in vitro transcription of large quantities of RNA, since only the
promoter region of the template DNA must be double stranded
for T7 RNA polymerase to bind and initiate transcription (Fig. 1b)
(16). Standard PCR can be used to make the DNA template com-
pletely double stranded if it improves transcription efficiency and
the overall yield for a particular template, although we do not often
find this to be necessary. The biggest issue for synthesizing RNA
by using in vitro transcription is RNA product heterogeneity due
to in vitro transcription artifacts (Fig. 1c). It has been demonstrated
that, in vitro, T7 RNA polymerase can produce off-target products
as a result of 5′-heterogeneity, 3′-heterogeneity, and RNA-
templated RNA addition (16–19). T7 RNA polymerase only
synthesizes RNA products that contain at least one 5′-G nucleotide.
Increased transcription efficiency is observed with two or three
5′-G nucleotides, but multiple G nucleotides at the 5′-end of the
RNA have been shown to cause 5′-heterogeneity (additional G
nucleotides) in the RNA product (18). To strike a balance between
transcription efficiency and artifact generation, we prefer, when
possible, to start sequences with no more than two G nucleotides in a
row. 3′-Heterogeneity appears to be a result of the runoff
transcription mechanism and often results in the nontemplated addition of
one or two nucleotides (N + 1 and, less often, N + 2 products) (16).
This problem can be diminished or overcome completely by
having the DNA template synthesized with 2′-methoxyl groups on the
two terminal 5′ nucleotides (20, 21), and we strongly recommend that
template DNAs be synthesized with this modification (Fig. 1b).
Typically, we see RNA-templated RNA addition products that
are ~10–20 nucleotides longer than the expected
transcribed product. Since this phenomenon is sequence dependent,
some sequences display no extraneous long products and some
sequences make large amounts of long products (17, 19). A study
has been performed on DNA sequence mutations that reduce or
abolish this behavior (22); when sequence alteration is not feasible,
there are transcription conditions that can help to reduce this problem
(19). In addition, strategies to eliminate 5′- and 3′-heterogeneity
in RNA samples using ribozyme-based cleavages can also be used,
as described in detail elsewhere (13). Finally, mass spectrometry is
an excellent technique to check that the final RNA product is the
correct length and has the correct sequence composition.
The DNA template is complementary to the sequence of the
RNA and the top strand of the T7 promoter sequence (Fig. 1b).
Both DNA sequences for RNA transcription can either be ordered
from an oligonucleotide synthesis company, typically at a 250-nmol
to 1-µmol scale, unpurified, or synthesized on site if you have access
to a DNA synthesizer. The chemically synthesized DNA should be
desalted by the company and arrive as a lyophilized pellet.
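The relationship between the desired RNA and the DNA oligonucleotide to order can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the example RNA sequence and the function names are hypothetical, and the promoter strand is the 17-nucleotide T7 Top sequence listed in Subheading 2.1. The template is taken to be the reverse complement of the promoter followed by the RNA sequence written as DNA, so its two 5′-terminal nucleotides (the candidates for 2′-OMe substitution) pair with the 3′ end of the RNA.

```python
# Sketch: derive the template-strand DNA oligo for T7 runoff transcription.
T7_TOP = "TAATACGACTCACTATA"  # T7 Top promoter strand (Subheading 2.1)

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(dna.upper()))


def leading_g_count(rna: str) -> int:
    """Count consecutive G nucleotides at the 5' end of the RNA."""
    rna = rna.upper()
    return len(rna) - len(rna.lstrip("G"))


def design_template(rna: str) -> str:
    """Return the template DNA (5'->3') that encodes `rna` after the T7 promoter."""
    rna = rna.upper()
    if leading_g_count(rna) < 1:
        raise ValueError("the RNA should start with at least one 5'-G")
    coding = T7_TOP + rna.replace("U", "T")  # promoter followed by the RNA as DNA
    return reverse_complement(coding)


example_rna = "GGCACUUCGGUGCC"  # hypothetical sequence, for illustration only
template = design_template(example_rna)
print("template (5'->3'):", template)
print("5'-terminal nucleotides to order with 2'-OMe:", template[:2])
print("leading 5'-G count in the RNA:", leading_g_count(example_rna))
```

A leading G count above two would flag a sequence that, according to the discussion above, is more likely to show 5′-heterogeneity.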
1. Dissolve the pellet in 100–300 µL of water. Depending on the
quality of the DNA synthesis, it may be necessary to purify the
DNA template using denaturing polyacrylamide gel
electrophoresis (PAGE), although we do not always do so. Use a UV/
VIS spectrophotometer and the molar extinction coefficient of
the specific DNA sequence to calculate the concentration
of the DNA template (see Note 6) and adjust to ~0.2–1 mM.
A dilution between 1:100 and 1:1,000 should be sufficient for
spectrophotometry.
2. Prepare 1 mL of 50 µM annealed stock DNA for transcription,
which is enough for 50 mL of transcription at a final template
concentration of 1 µM. Pipette 50 µL of 1 mM T7 Top DNA
(total 50 nmol) into a sterile labeled Eppendorf tube. Add
50 nmol of the template DNA for RNA transcription (based
on the calculated concentration) and water to make a total
volume of 1 mL.
3. Vortex the tube containing the DNAs and heat it at 95°C for
5 min.
4. Allow the annealed DNA template to cool to room tempera-
ture. Use the DNA immediately or store for later use at 4
or −20°C.
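The concentration and annealing arithmetic in steps 1 and 2 can be checked with a short Python sketch. It assumes the Beer–Lambert law with a 1-cm path length; the absorbance reading, dilution factor, and extinction coefficient in the example are hypothetical values, not measurements from this protocol.

```python
def stock_conc_mM(a260, dilution_factor, epsilon_M_cm, path_cm=1.0):
    """Beer-Lambert estimate of the undiluted stock concentration in mM."""
    molar = a260 * dilution_factor / (epsilon_M_cm * path_cm)
    return molar * 1e3


def annealing_volumes_uL(template_mM, t7_top_mM=1.0, nmol_each=50.0, total_uL=1000.0):
    """Volumes (in µL) for an equimolar template/T7 Top annealing mix.

    1 mM equals 1 nmol/µL, so nmol divided by mM gives µL directly.
    """
    template_uL = nmol_each / template_mM
    t7_top_uL = nmol_each / t7_top_mM
    water_uL = total_uL - template_uL - t7_top_uL
    if water_uL < 0:
        raise ValueError("stocks are too dilute for the requested total volume")
    return {
        "template_uL": template_uL,
        "t7_top_uL": t7_top_uL,
        "water_uL": water_uL,
        "final_uM": nmol_each / total_uL * 1e3,  # nmol/µL converted to µM
    }


# Hypothetical reading: A260 = 0.45 at a 1:500 dilution, epsilon = 300,000 /M/cm
conc = stock_conc_mM(0.45, 500, 300_000)
mix = annealing_volumes_uL(conc)
print(f"template stock: {conc:.2f} mM")
for name, value in mix.items():
    print(f"{name}: {value:.1f}")
```

With a 1 mM T7 Top stock, the function reproduces the 50 µL of T7 Top DNA and the 50 µM final duplex concentration given in step 2.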
3.2. RNA Synthesis
The transcription reaction is essentially the same for making
selectively or uniformly 13C-,15N-isotopically labeled RNA as it is for
unlabeled RNA. All nucleotides are prepared and stored separately
so that one or more unlabeled nucleotides can be replaced with
13C-,15N-isotopically labeled nucleotides for making selectively
labeled RNA samples or all four unlabeled nucleotides can be
replaced with 13C-,15N-isotopically labeled nucleotides for uniform
labeling. The primary differences are as follows: (1) since labeled
nucleotides are prohibitively expensive, we typically run labeled
transcriptions at a concentration of 2 mM for each nucleotide and
(2) we often explore additional transcription optimization condi-
tions (additional transcription components), such as adding inorganic
pyrophosphatase (IPP) or polyethylene glycol (avg. MW 8,000),
for improved transcription efficiency (23).
Before performing large-scale transcription reactions (10–
50 mL), it is necessary to determine the optimum transcription
conditions for every new DNA template (Fig. 1c). The conditions
that need to be considered include NTP concentration (2–4 mM),
magnesium chloride concentration (15–50 mM), annealed DNA
template concentration (0.5–1 µM), and T7 RNA polymerase/
RNAse inhibitor mixture concentration. We typically use a
concentration of 4 mM for each NTP when working with unlabeled
nucleotides, and ~2 µL of T7 RNA polymerase/RNAse inhibitor
mixture per 100-µL test reaction. Test reactions are performed in
1.5-mL Eppendorf tubes (see Note 7).
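A test matrix of this kind is easy to tabulate before pipetting. The following Python sketch assumes 100-µL test reactions and the stock concentrations listed in Subheading 2.2 (50 µM annealed template, 100 mM of each NTP, 1 M MgCl2, 100 mM DTT, 10× buffer); the particular grid of eight conditions and the function name are illustrative choices, not a prescribed design.

```python
from itertools import product

# Stock concentrations taken from Subheading 2.2.
TEMPLATE_STOCK_UM = 50.0   # annealed DNA template
NTP_STOCK_MM = 100.0       # each of ATP, CTP, GTP, and UTP
MGCL2_STOCK_MM = 1000.0    # 1 M MgCl2
DTT_STOCK_MM = 100.0


def reaction_volumes(ntp_mM, mgcl2_mM, template_uM, polymerase_uL=2.0, total_uL=100.0):
    """Return the volume (in µL) of every component of one test reaction."""
    volumes = {
        "10x_buffer_uL": total_uL / 10.0,             # 1x final
        "dtt_uL": 2.5 / DTT_STOCK_MM * total_uL,      # 2.5 mM final
        "template_uL": template_uM / TEMPLATE_STOCK_UM * total_uL,
        "each_ntp_uL": ntp_mM / NTP_STOCK_MM * total_uL,
        "mgcl2_uL": mgcl2_mM / MGCL2_STOCK_MM * total_uL,
        "polymerase_uL": polymerase_uL,               # always added last
    }
    # The dictionary holds one NTP volume; three more NTPs are added here.
    used = sum(volumes.values()) + 3 * volumes["each_ntp_uL"]
    water = total_uL - used
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    volumes["water_uL"] = water
    return volumes


# Illustrative grid: 2 NTP levels x 4 MgCl2 levels at 1 µM template = 8 reactions.
grid = [
    (ntp, mg, reaction_volumes(ntp, mg, template_uM=1.0))
    for ntp, mg in product((2.0, 4.0), (15.0, 25.0, 35.0, 50.0))
]
for ntp, mg, vols in grid:
    print(f"NTP {ntp:.0f} mM, MgCl2 {mg:.0f} mM -> "
          f"MgCl2 {vols['mgcl2_uL']:.1f} µL, each NTP {vols['each_ntp_uL']:.1f} µL, "
          f"water {vols['water_uL']:.1f} µL")
```

Adding a second template concentration to the grid would double it to 16 reactions, the upper end of the range suggested for a single DNA template.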
1. Thaw the T7 RNA polymerase/RNAse inhibitor mixture at
−20°C, and then keep on ice; handle very gently (see Note 8).
2. Thaw all other frozen components on ice (except for the T7
RNA polymerase/RNAse inhibitor mixture), and vortex them
briefly to ensure homogeneity (see Note 9).
3. For a single DNA template, run between 8 and 16 different
test reaction conditions, each reaction consisting of a total of
100 µL. Add 10 µL of 10× transcription buffer (1× final
concentration) and 2.5 µL of 100 mM DTT (2.5 mM final
concentration) to each of the labeled test reaction tubes.
4. Based on the exact conditions to be tested, add 1–2 µL of
50 µM annealed DNA template (0.5–1.0 µM final concentration),
2–4 µL of each 100 mM NTP stock (2.0–4.0 mM final
concentration), and 1.5–5.0 µL of 1M MgCl2 (15–50 mM final
concentration) to each tube.
5. Add water to bring the volume to 100 µL less the volume of the
T7 RNA polymerase/RNAse inhibitor mixture, which is added
later. Vortex the samples and centrifuge in a standard tabletop
centrifuge at 14,000 × g for 3 min at room temperature to collect
the solution droplets.
6. Add 1–5 µL of T7 RNA polymerase/RNAse inhibitor mixture
to the samples, pipetting the solution up and down gently to
mix in the polymerase and to prevent denaturation of the protein.
Always add the T7 RNA polymerase/RNAse inhibitor mixture
last.
7. Incubate the samples at 37°C for 2–4 h.
8. When the test transcription reactions are complete, centrifuge
in a standard tabletop centrifuge at 14,000 × g for 3 min to pellet
precipitated magnesium pyrophosphate. Transfer 10 µL of
each reaction into clean labeled 1.5-mL Eppendorf tubes and
add 10 µL of denaturing gel loading buffer to each tube. A
standard sample is also made containing 5 µL of the annealed
DNA template, 5 µL of water, and 10 µL of denaturing gel
loading buffer. The samples are heated at ~95°C for 5 min
immediately prior to loading on the denaturing gel.
9. These instructions assume the use of a vertical electrophoresis
system (for example: ASU-250, C.B.S. Scientific) and 17-cm-length
gels, although any vertical gel apparatus with a reasonable gel
size can be used. Rinse the glass plates and spacers with water,
followed by ethanol, and wipe dry immediately prior to use.
Although very small gels can be made using standard protein
SDS-gel apparatus, we do not advise this as the resolution on
such gels is generally too low for single-nucleotide resolution
and thus not satisfactory for this purpose.
10. Prepare a 0.75-mm-thick 20% gel by mixing 50 mL of denaturing
acrylamide gel solution with 500 µL of 10% APS. To this
solution, mix in 50 µL of TEMED and immediately pour the gel.
These gels are a single layer and the comb (10, 14, or 20 well)
is inserted immediately after the entire gel is poured. The gel
takes approximately 30 min to polymerize. The percentage of
acrylamide:bisacrylamide solution used depends on the size of
the RNA product that is run on the gel. We typically use 20%
for RNA transcripts up to 40–50 nucleotides long, 15% for
RNA transcripts between 45 and 70 nucleotides, and 10–12%
for longer RNA transcripts.
11. When the gel is polymerized, remove the comb and carefully
rinse the wells with water by using a 30-mL syringe equipped
with a 22-gauge needle. Place the gel in the apparatus and fill
the buffer chambers with gel running buffer. The gel should be
pre-run at 150 V (or 15 W) for 15–20 min and the wells rinsed
again with gel running buffer prior to loading the samples.
12. Once the samples are loaded, run the gel for 2–3 h at 150 V or
until the bromophenol blue dye front (dark blue) is within
~3 cm of the bottom of the gel.
13. Stain the gel with toluidine blue stain on an orbital shaker for
15 min. Destain the gel with water (multiple exchanges) until
you observe good contrast between the dark blue nucleic acid
bands and the background of the gel. The optimal transcription
conditions are chosen based on the test conditions that produce
the highest intensity of the RNA product band.
14. Once the optimal solution conditions have been identified, the
large-scale transcription reaction can be performed. Generally,
we perform a 30 mL transcription reaction in a sterile, disposable,
blue-capped 50-mL centrifuge tube. We find that this is a large
enough transcription volume to generate a reasonable quantity
of RNA for NMR samples (150–400 nmol of RNA) for most
RNA sequences, although (rarely) some RNA sequences transcribe
much better or much worse than this. The large-scale reaction
is a direct scale-up of the small-scale reaction. The only differences
are the volumes of reagents used and that it is usually not necessary
to centrifuge the solution after vortex mixing. The T7 RNA
polymerase/RNAse inhibitor mixture is still the last ingredient
added before incubation and the tube is swirled gently after
adding the polymerase.
15. Incubate the reactions for 4–8 h at 37°C. The reactions can be
incubated overnight, although this can be risky if there is any
possibility of RNAse contamination or if the RNA sequence
is prone to undesirable side reactions, like RNA-primed
RNA addition by T7 RNA polymerase, as described above (see
Note 10).
16. Centrifuge the reaction solution at 4,000 × g for 5 min at 4°C
and decant the supernatant, which contains the RNA product,
into a sterile 250-mL centrifuge bottle. There is a large pellet
from precipitated inorganic pyrophosphate; this does not contain
any RNA and can be discarded.
17. Add 1/10th the volume of 0.5M EDTA, pH 8.0 (1 mL of
EDTA for every 10 mL of transcription), swirl to mix, and then
add 2.5–3 times the volume of cold 100% ethanol (~75–90 mL of
ethanol for a 30 mL transcription) to the transcription solution.
18. Place the centrifuge bottle containing the RNA product at −20°C
overnight to precipitate (see Note 11).
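Because the large-scale reaction is a direct scale-up of the chosen test reaction, the working volumes for steps 14 and 17 follow from simple proportions. The sketch below assumes a 100-µL test reaction scaled to 30 mL; the "chosen" test recipe is a hypothetical example, and the function names are illustrative.

```python
def scale_up(test_volumes_uL, test_total_uL=100.0, large_mL=30.0):
    """Scale per-component test volumes (µL) to a large reaction, returned in mL."""
    factor = large_mL * 1000.0 / test_total_uL
    return {name: vol * factor / 1000.0 for name, vol in test_volumes_uL.items()}


def precipitation_volumes(transcription_mL, ethanol_fold=2.5):
    """EDTA and ethanol volumes (mL) for step 17, plus the resulting total volume."""
    if not 2.5 <= ethanol_fold <= 3.0:
        raise ValueError("the protocol uses 2.5-3 volumes of ethanol")
    edta_mL = transcription_mL / 10.0          # 1/10th volume of 0.5 M EDTA
    ethanol_mL = ethanol_fold * transcription_mL
    return {
        "edta_mL": edta_mL,
        "ethanol_mL": ethanol_mL,
        "total_mL": transcription_mL + edta_mL + ethanol_mL,
    }


# Hypothetical optimum from the test gel (volumes in µL per 100-µL reaction).
chosen = {"10x_buffer": 10.0, "dtt": 2.5, "template": 2.0,
          "each_ntp": 4.0, "mgcl2": 3.0, "polymerase": 2.0}

for name, mL in scale_up(chosen).items():
    print(f"{name}: {mL:.2f} mL")

precip = precipitation_volumes(30.0)
print(precip)
```

For a 30-mL transcription with 2.5 volumes of ethanol, the total comes to 108 mL, which is consistent with the use of a 250-mL centrifuge bottle in step 16.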
3.3. RNA Purification
Denaturing PAGE is our primary means of purifying large quantities
of RNA to single-nucleotide resolution for NMR. Although we have
not experimented recently with currently available preparative-scale
HPLC columns, in the past we found that HPLC purification would
not purify RNAs larger than ~30 nucleotides to single-nucleotide
resolution on a preparative scale. In addition, while we frequently
purify small quantities of RNA using native PAGE, the native gels
must be run in the cold room (4°C), typically take a long time to
run, and rarely yield single-nucleotide resolution when the single-band
products are checked on analytical denaturing gels for purity.
1. Centrifuge the bottle at 14,000 × g for 45 min at 4°C.
2. Very gently, decant the supernatant, as soon as the centrifuge
stops running (see Note 12). There should be a visible white
pellet on the wall of the bottle. The size of the pellet is not a
direct reflection of product yield, since the majority of the size
of the pellet is due to salt precipitation.
3. If the pellet is very large, put 20–50 mL of cold 70% (w/v) etha-
nol in the bottle very gently to wash away excess salt. Centrifuge
with the pellet located against the outer wall at 14,000 × g for
30 min at 4°C, and again decant the supernatant immediately
(see Note 13).
4. Dry the pellet by placing the capless centrifuge bottle underneath
the hood, angled so that the residual ethanol is not lying
directly on top of the pellet. Evaporating off all residual ethanol
typically takes about 2 h if the supernatant was properly poured
off without disturbing the pellet.
5. When the pellet has completely dried, put 2–3 mL of water in

the bottle directly over the pellet and allow it to sit for 10 min.
Pipette up and down gently to resuspend and transfer to a sterile,
disposable 15-mL centrifuge tube. Rinse the bottle with an
additional 1–2 mL of water to gather up any remaining sample
and transfer to the centrifuge tube.
6. Add denaturing gel loading buffer to the sample. Use 1:1 RNA
solution:gel loading buffer; although if the RNA solution volume
is too large, you can use 2:1 RNA solution:gel loading buffer
and it loads and runs fine on the purification gels.
7. These instructions assume the use of gel electrophoresis
equipment for large, sequencing gels, 20 cm (w) × 42 cm (l) (for
example: DDH-400-20, C.B.S. Scientific), spacers, and 3-well
preparative scale (3-mm thick) combs. Rinse the glass plates,
combs, and spacers with water, followed by ethanol, and wipe
dry immediately prior to use.
8. Pour the number of gels needed to obtain the appropriate
purification level (see Note 14). Preparative-scale gels typically
take ~400 mL of acrylamide solution to fill each gel. Place
400 mL of denaturing acrylamide gel solution into a beaker, add
4 mL of 10% APS solution, and stir briefly to mix the APS in evenly.
Add 400 µL of TEMED and stir the solution again briefly.
In general, a 1/100 ratio of APS and 1/1,000 ratio of TEMED
to gel solution are used for polymerization. The gel must be
poured immediately at this ratio of APS and TEMED. The gels
take 2 h to completely polymerize and cool, although they can
sit for longer or even overnight.
9. Remove the combs and clean the wells by using a wash bottle
of water. It is important to also carefully clean the back plate of the
gel, which is in contact with the ceramic heat exchange plate,
to prevent poor contact and smiling of the gel. Set up the gels in
the sequencing apparatus and fill the upper and lower buffer cham-
bers with gel running buffer. The gels should be pre-electrophoresed
for 20–30 min at 20 W per gel, and the wells thoroughly rinsed
with 1× TBE prior to loading the RNA samples.
10. Heat the RNA samples at ~95°C for 5 min immediately prior
to loading on the purification gels. Load the gels with a standard
1,000 µL pipettor, since the tip easily fits down into the
3-mm wells.
11. The size of the RNA dictates how long the gels should be run
and the wattage. Typically, we run these gels overnight (~18 h)
at 20 W per gel for smaller RNAs (15–25 nucleotides), 25 W
per gel for medium-sized RNAs (25–40 nucleotides), and
30 W per gel for larger RNAs (over 40 nucleotides). The gels
are monitored based on the location of the dyes and the previ-
ous observation from test transcription gels, of where the RNA
runs compared to the xylene cyanol and bromophenol blue dyes
(see Note 15).
12. Remove the gel from the apparatus, carefully remove one glass
plate, and cover the gel with plastic wrap. Flip the gel over
(plastic wrap on the bottom) onto a fluorescent TLC plate and
gently remove the top glass plate. When a handheld short-
wavelength UV light (254 nm) is shone onto the gel, the plate
glows green and the RNA band is visible as a grey to black
shadow on the fluorescent green background (UV shadowing).
Carefully cut the product bands out of the gel with a clean
razor blade and transfer them to a sterile, disposable 50-mL
centrifuge tube and discard the rest of the gel. Repeat for all of
the gels (exposure to UV light should be minimized to avoid
UV damage of RNA).
13. Elute the RNA product by using a 4-trap Elutrap electroelution
chamber (Whatman), with 1× TBE as the running buffer. Gel slices
should be cut into small pieces but not crushed, and placed
into the gel holding chamber between the BT1 and BT2
membranes (see Note 16).
14. Run the elution at 150 V and collect the RNA sample from the
trap at 2–3-h intervals, typically for 9 h. For longer RNAs, we
frequently reduce the voltage to 50 V and run overnight to
ensure complete elution of the product from the gel slices. The
progress of the elution can be tracked by calculating the amount
of RNA in the eluant at each time point, based on UV absor-
bance at 260 nm. Time points should be stored at −20°C until
all of the RNA has been collected, and the elution is finished.
15. Thaw and pool all of the eluted RNA fractions. Load this material
onto a HiTrap-Q anion-exchange column hooked up to a
peristaltic pump (flow rate 3–5 mL/min) and pre-equilibrated
with low-salt buffer. Wash the column with ten column
volumes (50 mL) or more of low-salt buffer. Elute the RNA
with high-salt buffer and collect 4-mL fractions. The purified
RNA is typically found entirely in the second and third fractions.
The column must always be thoroughly rinsed with low- and
high-salt buffers between samples and immediately stored in
20% ethanol when not in use to prevent contamination.
16. Pool the two RNA-containing fractions (8 mL) in a 50-mL centrifuge
tube, add 24 mL of cold 100% ethanol, and store the
RNA overnight at −20°C to precipitate.
17. Remove the centrifuge tube directly from the freezer, make a
weight-matched balance tube, and centrifuge at 15,000 × g for
45 min at 4°C. Very gently, decant the supernatant as soon as
the centrifuge stops running. At this point, it is possible to
again perform a 70% ethanol wash as described above (step 3).
18. Dry the pellet by placing the capless centrifuge tube under-
neath the hood, angled so that the residual ethanol is not lying
directly on top of the pellet. Evaporating off all residual ethanol
typically takes about 2 h.
19. Dissolve the RNA sample in 50 mL of water or NMR buffer, if
you already know what the required buffer should be, based
on the protein to be investigated. For new RNAs under
investigation, desalt the RNA (step 21) into water for pH and salt
titrations in the NMR to determine optimal conditions for
obtaining a single RNA conformation in solution.
20. Anneal the RNA by placing the 50-mL tube at ~95°C for
5 min and then slow cool to room temperature on the bench-
top (see Note 17).
21. When cool, load the solution into an Amicon-stirred cell
(MWCO 3,000 or 1,000 membrane, depending on the size of
your RNA), pressurize the chamber with 55 psi of nitrogen or
argon, and stir. Concentrate the RNA down to ~1–2 mL and
then add fresh buffer or water up to the 50-mL line. Perform
at least three washes to remove excess salt. Concentrate the
RNA down to ~0.5–1 mL (~0.2–2 mM) and transfer to a sterile
1.5-mL Eppendorf tube. The RNA is ready for NMR studies
and should be stored at −20°C when not in use.
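The effect of the repeated concentrate-and-refill washes in step 21 can be estimated with a simple dilution calculation. The sketch below is illustrative only: it assumes that salt passes freely through the membrane and that the retained volume mixes completely with the fresh water or buffer. The function name and the starting salt concentration are hypothetical and are not part of the protocol.

```python
def residual_salt_fraction(retained_ml, refill_ml, washes):
    """Fraction of the original small-molecule concentration that remains
    after repeated concentrate-and-refill cycles (ideal mixing assumed)."""
    if retained_ml <= 0 or refill_ml < retained_ml or washes < 0:
        raise ValueError("need 0 < retained_ml <= refill_ml and washes >= 0")
    return (retained_ml / refill_ml) ** washes


# Three washes: concentrate to 2 mL, then refill to the 50-mL line each time.
fraction = residual_salt_fraction(retained_ml=2.0, refill_ml=50.0, washes=3)

# Hypothetical starting salt concentration, for illustration only.
start_salt_mM = 1500.0
print(f"residual fraction after washes: {fraction:.2e}")
print(f"estimated salt remaining: {start_salt_mM * fraction:.3f} mM")
```

Concentrating to 1 mL rather than 2 mL before each refill lowers the residual fraction per wash from 1/25 to 1/50, which is why the final volume reached before refilling matters as much as the number of washes.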
3.4. Preliminary RNA Analysis by NMR
Once an initial unlabeled RNA sample is made, a basic set of NMR
experiments is run, including 1D 1H NMR experiments collected
in 95% H2O/5% D2O (5–10°C) for pH and salt titrations, NOESYs
collected in 95% H2O/5% D2O (5–10°C) and D2O (20–30°C),
and a TOCSY collected in D2O (20–30°C), to assess the properties
of the RNA in solution at NMR concentrations. Optimal salt and
pH conditions are assessed by the number, intensity, and line
widths of the detectable imino proton resonances compared to the
number of imino protons which are expected to be protected from
rapid exchange due to hydrogen bonding in Watson-Crick and
noncanonical base pairs (Fig. 2). Generally, we find that the optimal
pH for RNA samples falls between 6.0 and 7.0 and the optimal salt
conditions vary from 0 to 200 mM monovalent salt, depending
on the ability of the sequence to form alternative conformations
(see Note 18). The optimal solution conditions are typically dictated
by the protein of interest, but it is important to be aware of the
expected behavior of the RNA under the appropriate solution
conditions.
Degradation products and alternative conformations (including
unwanted dimerization) can be identified from additional imino
proton resonances in both the 1D and 2D spectra collected in 95% H2O/5%
D2O (Fig. 2). In addition, the 1D 1H spectrum helps identify
any small-molecule contaminants in the RNA sample. The most
common contaminants in RNA samples (and their causes) are as
follows: acrylamide and/or urea (insufficient volume of low-salt
wash of the anion-exchange column before eluting the RNA product),
EDTA and/or ethanol (too few buffer exchanges during desalting
and concentration), and ethanol and/or glycerol (insufficient soaking
and rinsing of the Amicon membrane prior to installing in the
stirred cell). EDTA can be eliminated from the high- and low-salt
anion-exchange buffers if it is a recurring contaminant in the final
samples. Ethanol contamination can easily be removed by freezing
the sample in liquid nitrogen, lyophilizing the sample to dryness,
and resuspending in 95% H2O/5% D2O (repeated as necessary).
Glycerol typically requires exhaustive dialysis or buffer exchanges
in the Amicon stirred cell to be completely removed.
Fig. 2. 1D and 2D imino proton NMR spectra of RNA. The secondary structure of a hairpin RNA and the imino proton region
of 1D and 2D NOESY 1H spectra (500 MHz) of the RNA in 95% H2O/5% D2O at 10°C demonstrate the presence of sharp
imino proton resonances. A clear sequential walk through the stem base pairs is indicated for the expected hairpin stem.
The smaller imino proton resonances and NOE peaks are due to an alternative conformation in solution.
1. For titrations, the RNA sample was previously concentrated
into water during the last steps of RNA purification
(Subheading 3.3, steps 19–21). Add 5% (w/v) 99.99% D2O to
the RNA sample (usually, ~0.1–0.3 mM RNA for solution con-
dition optimization) and adjust to the starting pH (~pH 5.5)
using acid and base solutions (see Note 19).
2. Transfer the sample to a standard NMR tube, insert the NMR
tube into the spectrometer, and allow the sample to
equilibrate to 10°C for 5 min before the instrument is set up
for data collection.
3. Collect a baseline 1D imino proton spectrum (using 1,1 echo
water suppression) for a low-salt, low-pH (~5.5–6.0), RNA
sample at 10°C; usually, 128 scans are sufficient (see Note 20).
4. After each spectrum is collected, transfer the RNA sample into
a 1.5-mL Eppendorf tube, adjust the pH by 0.5 pH units using
the appropriate acid and base solutions, return the sample to
the NMR tube, and allow it to equilibrate to 10°C for 5 min in
the spectrometer before the setup for data collection.
Due to increased RNA degradation under basic conditions, we
typically do not take RNA samples above pH 8.0 unless abso-
lutely necessary.
5. After the NMR spectra are collected and compared, identify an
optimal pH based on the number of expected and observed
peaks, intensities of the peaks, line widths, and pH at which
there is the least peak overlap in terms of chemical shift. The
optimal pH for most RNA samples typically lies between pH
6.0 and 7.0.
6. For salt titrations, use a new RNA sample that is already adjusted
to the optimal pH and follow the same procedure as for the pH
titration. Typically, use either 25- or 50-mM increments for
the monovalent salt concentration, aliquoted from a 4 M KCl
stock, and choose the best conditions again based on the number
of expected and observed peaks, intensities of the peaks, line
widths, and monovalent salt concentration at which there is
the least peak overlap in terms of chemical shift. If the optimal
conditions for the RNA are consistent with the conditions
needed for protein stability and protein–RNA interactions, all
of the RNA from the NMR samples can then be pooled with
the stock RNA and exchanged into the appropriate NMR buffer
at this point as described in RNA purification (steps 19–21).
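The volume of 4 M KCl stock needed at each point of the salt titration in step 6 follows from a mass balance that accounts for the small volume increase with every addition. The sketch below is one way to tabulate the additions. The 500-µL sample volume and the function name are assumptions made for illustration, and the sample is assumed to start with no monovalent salt.

```python
def stock_additions(sample_ul, targets_mM, stock_mM=4000.0, start_mM=0.0):
    """Return (target_mM, volume_to_add_ul, new_total_ul) for each
    successive titration point, correcting for the growing sample volume."""
    steps = []
    volume, conc = sample_ul, start_mM
    for target in targets_mM:
        if not conc < target < stock_mM:
            raise ValueError("targets must increase and stay below the stock")
        # Mass balance: conc*volume + stock*add = target*(volume + add)
        add = volume * (target - conc) / (stock_mM - target)
        volume += add
        conc = target
        steps.append((target, add, volume))
    return steps


# 25-mM increments from a 4 M KCl stock into a hypothetical 500-uL sample.
for target, add, total in stock_additions(500.0, [25, 50, 75, 100]):
    print(f"{target:>4} mM: add {add:.2f} uL of stock (total {total:.1f} uL)")
```

Because each addition is only a few microliters, the RNA is diluted by roughly 2–3% over a titration to 100 mM, which is small compared with the intensity changes being monitored.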
3.5. RNA–Protein Interactions by NMR
This section focuses on preparing the protein for investigating RNA–protein
interactions. Protein purifications are well-established and
isotopic labeling of proteins has been reviewed extensively (12, 24,
25). Other than the standard issues and concerns for preparing
NMR quality protein samples, there are two issues that also need
to be considered. First, RNAses can bind to or copurify with the
protein of interest and may potentially degrade the RNA during
complex formation. Second, high-affinity cellular RNA can copurify
with the protein. Usually, several different purification steps, such
as affinity tag purification, followed by ion-exchange and size-exclu-
sion chromatography, are sufficient to ensure complete removal of
RNAses (26). If it is also necessary to remove RNA bound to the protein,
this can be done through treatment with 0.1–0.5% PEI (27).
Uniformly 15N-labeled protein is typically obtained by overex-
pression from E. coli in M9 minimal medium and purified using
affinity columns for either His6- or GST-tagged proteins. This step
is often followed by size-exclusion chromatography. Depending on
the purity of the protein, as judged by Coomassie staining of SDS
polyacrylamide gels, it may be necessary to include a third column
purification step, such as anion-exchange chromatography. Following
these purification steps, there should be no RNAse contamination
left in the purified protein. However, there may still be nucleic acid
contamination, which can be detected by running the purified protein
on an agarose or acrylamide gel and staining with ethidium bromide.
If nucleic acid contamination is detected, a PEI precipitation following
the first affinity column can ensure complete removal.
1. Pool the appropriate fractions that contain the protein of interest
from the affinity column and add solid NaCl to a final concentration
of 1 M.
2. Place all samples at 4°C, add PEI to a final concentration of
0.5% (w/v), and incubate the solution for 1 h.
3. Centrifuge the solution at 15,000 × g for 35 min at 4°C to
remove the precipitated PEI–nucleic acid complexes.
4. Recover the protein from the supernatant through precipita-
tion by adding ammonium sulfate to a final concentration of
75% (see Note 21).
5. Centrifuge the protein solution at 10,000 × g for 20 min at
4°C, decant the supernatant, and recover the pelleted protein
by dissolving in low-salt buffer (Subheading 2.3), if the next
column is an anion-exchange column, or in the final NMR
buffer (Subheading 2.3), if the next column is a size-exclusion
column. The final buffer should be identical to the buffer chosen
for optimal RNA stability (see Subheading 3.4).
6. A final column purification step, such as an ion-exchange column,
can be inserted at this step, if the previous purification steps do
not yield protein of sufficient purity.
7. Pool all the fractions that contain the protein of interest and
concentrate the protein to 0.2–1 mM using
either pressurized Amicon stirred cells or disposable Amicon Ultra
centrifugal filters (Ultracel).
8. Monitor complex formation between the RNA and protein by
acquiring a 1H–15N heteronuclear single-quantum correlation
(15N-HSQC) spectrum of the 15N-labeled protein (isotopic label-
ing of proteins has been extensively reviewed elsewhere (11)).
9. First, collect the 15N-HSQC spectrum of the free protein. The
2D 15N-HSQC should be acquired from 6 to 13 ppm in the
hydrogen dimension and 95–140 ppm in the nitrogen dimen-
sion, which is typically referred to as the fingerprint region of
proteins (see Note 22).
10. Prepare the protein sample in a Shigemi NMR tube using
300 µL of a 0.2 mM protein sample in 5% D2O to obtain a
good quality spectrum and define the effect of buffers on the
overall fold of the protein.
11. Following the initial test of protein folding, measure the
chemical shift perturbation in the 15N-HSQC of the amide
protons from the free protein upon titration with unlabeled
RNA. The protein should be between 0.1 and 0.5 mM and the
unlabeled RNA has to be concentrated enough to obtain
titration ratios ranging from 0.25:1 to 2:1 of RNA:protein
without requiring large volumes of RNA solution
to be added to the NMR sample (see Note 23).
12. Since each peak in the 15N-HSQC represents a distinct amide
of the protein, this titration can yield information on the
position of the RNA binding site at the protein interface as
well as the binding affinity, based on the NMR timescale.
13. The reverse titration (protein into the RNA sample) can also
be performed. First, using the same purification protocol,
obtain a concentrated, unlabeled protein (1–2 mM) that can
be titrated into a 13C-,15N-labeled RNA sample, which is purified
as described above for unlabeled RNA (see Subheading 3.3).
14. Obtain a 13C-HSQC of RNA, which typically exhibits distinct
peaks for the carbons attached to the aromatic and H1′ protons.
Binding to the protein should cause perturbations of
these peaks, and thus yields information on the binding site on
the RNA (28) (see Note 24).
Once a few sites on the protein or RNA have been identified
using the chemical shift perturbation method, other more tar-
geted experiments narrow down the binding region (for a review
of different experiments and how they can be applied to solve
the RNA–protein complex, see ref. 26). Ultimately, measuring
NOEs between the RNA and the protein provides the most
important details for the structural calculations of the RNA–
protein complex. Filtered and edited NOESY experiments (out-
lined in the introduction based on ref. 15) have been instrumental
in identifying NOEs for RNA structure determination and
greatly simplify the identification of NOEs between the RNA
and protein components of the RNA–protein complex.
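The requirement in step 11 that the RNA stock be concentrated can be made concrete by tabulating the stock volume needed at each titration point. The following sketch is illustrative: the 300-µL, 0.2 mM protein sample matches step 10, while the 2 mM RNA stock concentration and the function name are assumptions chosen for the example.

```python
def rna_titration_plan(protein_mM, sample_ul, rna_stock_mM, ratios):
    """For each RNA:protein molar ratio, return the cumulative volume of
    RNA stock added and the resulting (diluted) protein concentration."""
    protein_nmol = protein_mM * sample_ul  # mM x uL = nmol
    plan = []
    for ratio in ratios:
        cumulative_ul = ratio * protein_nmol / rna_stock_mM
        diluted_protein_mM = protein_nmol / (sample_ul + cumulative_ul)
        plan.append((ratio, cumulative_ul, diluted_protein_mM))
    return plan


# 300 uL of 0.2 mM protein titrated with a hypothetical 2 mM RNA stock.
for ratio, vol_ul, prot_mM in rna_titration_plan(0.2, 300.0, 2.0,
                                                 [0.25, 0.5, 1.0, 2.0]):
    print(f"RNA:protein {ratio:>4}:1  cumulative RNA {vol_ul:5.1f} uL  "
          f"protein {prot_mM:.3f} mM")
```

With a more dilute RNA stock the cumulative volume grows in proportion, so a 0.2 mM stock would require doubling the sample volume just to reach a 1:1 ratio; this is the practical reason for concentrating the RNA first.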
4. Notes
1. All water for RNA studies must be distilled deionized 18.0 MΩ·cm
quality from a purification system that is properly maintained
(cartridges replaced as suggested by manufacturer). Large bottles
of autoclaved water are kept on hand in the laboratory for
solution making, and all solutions that come into contact with
RNA are sterile filtered using Nalgene sterile filter units with a
0.2-µm pore size prior to storage. Unless otherwise noted, all
solutions should be handled in this manner and “water” in the
context of this protocol indicates autoclaved water of this quality
which has been sterile filtered prior to use. As long as the water
quality is maintained, we have not found it necessary to treat
water with diethylpyrocarbonate (DEPC) in order to prevent
RNAse contamination and degradation of RNA samples.
2. It is important that the pH of the nucleotides is adjusted using
sodium hydroxide and not Tris base. High Tris base concentra-
tions appear to inhibit transcription by T7 RNA polymerase.
3. For the final solution of T7 RNA polymerase/RNAse inhibitor
mixture, a tube (10,000 U) of SUPERase·In RNase Inhibitor
(Ambion/Applied Biosystems) is mixed with the enzyme before
glycerol is added to a final concentration of 50%
(w/v); the mixture is then aliquoted and stored at −80°C. RNAse
inhibitor is not essential, but is added as a precaution against
any residual RNAse from the T7 RNA polymerase enzyme puri-
fication. The final concentration of T7 RNA polymerase is rarely
measured; we use test transcriptions to identify the amount of
each preparation to use for optimal RNA transcription efficiency.
4. It is important when making this solution that it is not diluted
to volume until all of the urea has dissolved (very-high-quality
ultrapure urea should be used) and that it is not heated. When
using stored acrylamide–urea solutions, the solution must be
examined to ensure that urea has not precipitated out of solution.
It is for this reason that we store these solutions in transparent
brown glass bottles. In addition, since acrylamide is a cumulative
neurotoxin, it is important that gloves are worn at all times
even when it is polymerized. We prefer to buy premixed acryl-
amide to reduce handling.
5. The low- and high-salt buffers for anion exchange can use different
buffer salts and sodium chloride, rather than potassium chloride.
We use phosphate buffer and potassium chloride for this
column because this is basically the last step of RNA purifica-
tion and they are the usual buffer and salt compounds that we
use for RNA NMR studies.
6. If the extinction coefficient for a DNA is not provided in the
documentation from the company, it can be calculated (for
DNA or RNA) using the extinction coefficients for individual
NMPs (A = 15.4 mM⁻¹ cm⁻¹, C = 7.4 mM⁻¹ cm⁻¹, G = 11.8 mM⁻¹ cm⁻¹,
T = 9.6 mM⁻¹ cm⁻¹, U = 9.9 mM⁻¹ cm⁻¹) or a good approximation is to
set the extinction coefficient at 10 mM⁻¹ cm⁻¹ multiplied by the
number of nucleotides. It is not advisable to use the general
assumption that an A260 of 1 corresponds to 50 µg/mL as this
is not accurate for short DNAs or short RNAs.
7. All disposable plasticware (pipette tips and Eppendorf tubes)
that do not come presterilized and certified RNAse and DNAse
free should be autoclaved prior to use and kept in sealed
containers to prevent contamination. We also cover the openings
of glassware with aluminum foil and autoclave all glassware in
the laboratory routinely after washing and rinsing with distilled
deionized water.
8. Although it is possible to use purchased T7 RNA polymerase
for in vitro transcription, the amounts necessary for large-scale
synthesis of RNA for NMR tend to be prohibitively expensive.
Therefore, most laboratories, ours included, purify their own
supply of T7 RNA polymerase for transcription. The plasmid
for His6-tagged T7 RNA polymerase is readily available from
any number of academic sources. In general, we do not quantitate
the T7 RNA polymerase; we simply use analytical transcription
reactions to determine the appropriate amount of T7 RNA
polymerase to be used for optimal transcription efficiency.
9. These solutions do not freeze in a homogeneous fashion, nor
do they thaw that way. Before pipetting out of a thawed tube
of solution, you must make sure that the solution is completely
thawed, then vortex mix the tube briefly to make sure that it is
well-mixed, and centrifuge the solution briefly to push all the
droplets to the bottom of the tube.
10. If the reaction is working, it will start to get cloudy 1–2 h after
you put the reaction into the water bath. The reaction gets
cloudy because, as nucleotide triphosphates are linked together
to form RNA (which has only one phosphate between each
nucleotide), the other two phosphate groups are released as
pyrophosphate (PPi). The pyrophosphate interacts with the
magnesium in solution to form an insoluble magnesium pyro-
phosphate complex. The more precipitation you observe, the
more nucleotide triphosphates have been converted into RNA.
11. If you use 2.5–3× the volume of 100% ethanol, your final solution
will be 70–75% ethanol. It is not recommended to leave a sample
in ethanol for more than a few days. The longer the sample sits,
the more salt precipitates out of solution. The greater the amount
of salt in the precipitate, the bigger effect it has on the separation
efficiency and running of the purification gels. Ideal timing
for RNA ethanol precipitation is either overnight at −20°C or
for a few hours at −80°C.
12. It is important to pour off the supernatant as soon as the rotor
stops spinning, before the pellet detaches from the wall of the
bottle.
13. This step removes excess salt and helps the purification gels to
run with better resolution. It is critical that you do not resus-
pend the pellet or rinse it off the wall with rough handling or
it is very difficult to recover without significant loss of sample.
14. The number of gels needed to purify your RNA needs to be
determined before you run your entire RNA sample on the gels.
Each purification gel has three wells that can be loaded with
different amounts of your dissolved RNA:loading buffer mixture.
Load three different amounts of RNA in the three wells and
run one gel to see which amount gives you a good-sized RNA
band. Generally, loading amounts of the dissolved RNA solution
that correspond to ~2, 4, and 6 mL of the original transcription
give you a good idea what the proper amount for optimal purifica-
tion might be. The RNA band should be somewhere between
the width of a pencil and the width of your finger. Much less
than this is hard to find on the gels and any more than that
eliminates single-nucleotide resolution. The optimal loading
amount is used for subsequent gels for this specific RNA.
15. To obtain the full separation capacity of the 42-cm-long gel, it
is important to run the RNA to the bottom 1/3 of the gel or
as close to the bottom as possible. The location of the RNA of
interest on the gel with respect to the dyes should be noted at
the test-preparative-gel stage. When purifying RNAs that run
very close to the dye front (an ~27-nucleotide RNA runs on
top of the xylene cyanol dye on a 20% 19:1 denaturing acryl-
amide gel), it is necessary to omit the overlapping dye from the
samples completely and run a lane on one of the gels that
contains only dye as a marker.
16. The MWCO for the BT1 membrane is ~5,000. To prevent loss
of smaller product RNAs (~30 nucleotides or less), the BT1
membrane should be replaced with thoroughly rinsed 1,000
MWCO dialysis tubing for these samples. 18-mm wide dialysis
tubing fits exactly in the trap chambers and only needs to be
cut at the open ends to fit the length of the BT1 membrane. It
is also possible to make a second trap by placing an additional
BT1 membrane after the next insert in the trap. It is absolutely
critical that elution traps are not overloaded with gel slices,
which significantly reduces elution efficiency.
17. The conditions for annealing should be based on the behavior
of the RNA in question. RNAs that are intended to form unimo-
lecular structures should be annealed under dilute conditions
(50 mL) while RNAs that are intended to form duplexes (dimers)
should be annealed in more concentrated solutions (2–5 mL).
In addition, snap cooling on ice can be used to affect the
conformational exchange in solution. The timing of adding
monovalent salts (before annealing or immediately after snap
cooling) can be used to bias the sample toward forming a single
conformation as needed. The appropriate annealing conditions
require trial and error for each RNA sample.
18. We typically include 0.2% (w/v) sodium azide in RNA samples
to prevent bacterial contamination. Usually, the sodium azide
is included in the NMR buffer solution used for the final
stages of RNA concentration and desalting (RNA purification:
steps 19–21), after the optimal NMR solution conditions have
been identified. While divalent salts, such as MgCl2, do stabilize
RNA structures and are often needed for larger RNA structures
to form completely, small RNAs typically fold without magnesium,
and the NMR samples last longer (degrade less quickly)
if magnesium is excluded from the NMR buffer whenever
possible.
19. It is very important that, when the pH of RNA samples in
water is being adjusted, you are careful not to overshoot the
desired pH. Alternately adding acid and base to reach the
appropriate pH unnecessarily increases the salt concentration,
which also affects the intensity of the imino proton
resonance peaks.
20. It is important to ensure that the sweep width is wide enough
not to miss any unusual imino proton chemical shifts (proto-
nated cytosine residues, for example). Therefore, we routinely
collect out to 18 ppm initially, although almost all imino proton
peaks are found between ~9 and 15 ppm.
21. PEI can be very difficult to remove from a protein preparation,
as it cannot be dialyzed out effectively. Ammonium sulfate pre-
cipitation is the standard method of separating the protein of
interest from the PEI, with the protein precipitating and the
PEI remaining in the supernatant solution. The precise percent-
age of ammonium sulfate necessary is protein dependent and
the optimal percentage to be used should be tested on small
samples of the protein to ensure maximal protein recovery.
22. Examining the fingerprint region of the free protein spectrum:
If the peaks are well-dispersed with minimal overlap in the center
region of the spectrum, this is an indication that the protein is
well-folded. Each peak in this spectrum represents a unique
amide from the protein. If the protein is not well-folded,
it is still worth testing the titration with RNA, as it is possible
that the RNA may induce conformational change of the protein
upon binding.
23. It is best to start at the lower end of the range of protein concentrations
since formation of RNA:RNA multimers can be an issue
at high concentrations. This titration also gives some idea of
the stoichiometry of the RNA:protein complex. If a large amount
of RNA (high RNA:protein ratio) must be added to observe
chemical shift perturbations, this may be an indication that
measuring NOEs between the RNA and protein will be difficult.
24. One must keep in mind that it is possible to get shifted reso-
nances (on the protein or the RNA) not only in the binding
site, but also in other regions due to allosteric modulation or,
more likely, changes in the pH or monovalent salt concentra-
tion of the solution. This is why it is particularly important to
try to have both the RNA and protein prepared such that their
final solution conditions are identical.
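The extinction coefficient calculation described in Note 6 can be sketched as a simple sum over the sequence. The example below uses the per-nucleotide values from Note 6, interpreted here as mM⁻¹ cm⁻¹ (an assumption about the intended units). The 14-nucleotide sequence, the absorbance reading, and the function names are hypothetical. Like the note, the sum ignores base stacking, so it should be read as an estimate.

```python
# Per-nucleotide extinction coefficients at 260 nm from Note 6,
# assumed here to be in mM^-1 cm^-1.
NMP_E260 = {"A": 15.4, "C": 7.4, "G": 11.8, "T": 9.6, "U": 9.9}


def extinction_coefficient(sequence):
    """Sum the NMP coefficients over a DNA or RNA sequence (mM^-1 cm^-1)."""
    seq = sequence.upper()
    unknown = set(seq) - set(NMP_E260)
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    return sum(NMP_E260[base] for base in seq)


def concentration_mM(a260, sequence, path_cm=1.0, dilution=1.0):
    """Beer-Lambert estimate of the stock concentration in mM."""
    return a260 * dilution / (extinction_coefficient(sequence) * path_cm)


seq = "GGCACUUCGGUGCC"  # hypothetical 14-nt hairpin, for illustration only
print(f"summed coefficient: {extinction_coefficient(seq):.1f} mM^-1 cm^-1")
print(f"10 x length approximation: {10 * len(seq)} mM^-1 cm^-1")
# Hypothetical reading: A260 = 0.5 measured on a 100-fold dilution.
print(f"stock concentration: {concentration_mM(0.5, seq, dilution=100.0):.3f} mM")
```

For this sequence the summed value and the 10-per-nucleotide approximation agree to within about 1%, consistent with the note's statement that the approximation is a good one.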
References
1. Autexier, C., and Triki, I. (1999) Tetrahymena telomerase ribonucleoprotein RNA–protein interactions. Nucl. Acids Res. 27, 2227–2234.
2. Bachand, F., Triki, I., and Autexier, C. (2001) Human telomerase RNA–protein interactions. Nucl. Acids Res. 29, 3385–3393.
3. Greider, C. W., and Blackburn, E. H. (1987) The telomere terminal transferase of Tetrahymena is a ribonucleoprotein enzyme with two kinds of primer specificity. Cell 51, 887–898.
4. Staley, J. P., and Woolford, J. L., Jr. (2009) Assembly of ribosomes and spliceosomes: complex ribonucleoprotein machines. Curr. Opin. Cell Biol. 21, 109–118.
5. Ban, N., Nissen, P., Hansen, J., Moore, P. B., and Steitz, T. A. (2000) The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289, 905–920.
6. Cech, T. R. (2000) Structural biology. The ribosome is a ribozyme. Science 289, 878–879.
7. Schluenzen, F., Tocilj, A., Zarivach, R., Harms, J., Gluehmann, M., Janell, D., Bashan, A., Bartels, H., Agmon, I., Franceschi, F., and Yonath, A. (2000) Structure of functionally activated small ribosomal subunit at 3.3 angstroms resolution. Cell 102, 615–623.
8. Wimberly, B. T., Brodersen, D. E., Clemons, W. M., Jr., Morgan-Warren, R. J., Carter, A. P., Vonrhein, C., Hartsch, T., and Ramakrishnan, V. (2000) Structure of the 30S ribosomal subunit. Nature 407, 327–339.
9. Ule, J. (2008) Ribonucleoprotein complexes in neurologic diseases. Current Opinion in Neurobiology 18, 516–523.
10. Whittaker, J. W. (2007) Selective isotopic labeling of recombinant proteins using amino acid auxotroph strains. Methods Mol. Biol. 389, 175–188.
11. Cowburn, D., Shekhtman, A., Xu, R., Ottesen, J. J., and Muir, T. W. (2004) Segmental isotopic labeling for structural biological applications of NMR. Methods Mol. Biol. 278, 47–56.
12. Liu, D., Xu, R., and Cowburn, D. (2009) Segmental isotopic labeling of proteins for nuclear magnetic resonance. Methods Enzymol. 462, 151–175.
13. Lu, K., Miyazaki, Y., and Summers, M. F. (2010) Isotope labeling strategies for NMR studies of RNA. J. Biomol. NMR 46, 113–125.
14. Nelissen, F. H., van Gammeren, A. J., Tessari, M., Girard, F. C., Heus, H. A., and Wijmenga, S. S. (2008) Multiple segmental and selective isotope labeling of large RNA for NMR structural studies. Nucl. Acids Res. 36, e89.
15. Peterson, R. D., Theimer, C. A., Wu, H., and Feigon, J. (2004) New applications of 2D filtered/edited NOESY for assignment and structure elucidation of RNA and RNA–protein complexes. J. Biomol. NMR 28, 59–67.
16. Milligan, J. F., Groebe, D. R., Witherell, G. W., and Uhlenbeck, O. C. (1987) Oligoribonucleotide synthesis using T7 RNA polymerase and synthetic DNA templates. Nucl. Acids Res. 15, 8783–8798.
17. Cazenave, C., and Uhlenbeck, O. C. (1994) RNA template-directed RNA synthesis by T7 RNA polymerase. Proc. Natl. Acad. Sci. U.S.A. 91, 6972–6976.
18. Pleiss, J. A., Derrick, M. L., and Uhlenbeck, O. C. (1998) T7 RNA polymerase produces 5′ end heterogeneity during in vitro transcription from certain templates. RNA 4, 1313–1317.
19. Triana-Alonso, F. J., Dabrowski, M., Wadzack, J., and Nierhaus, K. H. (1995) Self-coded 3′-extension of run-off transcripts produces aberrant products during in vitro transcription with T7 RNA polymerase. J. Biol. Chem. 270, 6298–6307.
20. Kao, C., Rudisser, S., and Zheng, M. (2001) A simple and efficient method to transcribe RNAs with reduced 3′ heterogeneity. Methods 23, 201–205.
21. Kao, C., Zheng, M., and Rudisser, S. (1999) A simple and efficient method to reduce nontemplated nucleotide addition at the 3′ terminus of RNAs transcribed by T7 RNA polymerase. RNA 5, 1268–1272.
22. Nacheva, G. A., and Berzal-Herranz, A. (2003) Preventing nondesired RNA-primed RNA extension catalyzed by T7 RNA polymerase. Eur. J. Biochem./FEBS 270, 1458–1465.
23. Cunningham, P. R., and Ofengand, J. (1990) Use of inorganic pyrophosphatase to improve the yield of in vitro transcription reactions catalyzed by T7 RNA polymerase. BioTechniques 9, 713–714.
24. Gardner, K. H., and Kay, L. E. (1998) The use of 2H, 13C, 15N multidimensional NMR to study the structure and dynamics of proteins. Ann. Rev. Biophys. Biomol. Struct. 27, 357–406.
25. Marley, J., Lu, M., and Bracken, C. (2001) A method for efficient isotopic labeling of recombinant proteins. J. Biomol. NMR 20, 71–75.
26. Wu, H., Finger, L. D., and Feigon, J. (2005) Structure determination of protein/RNA complexes by NMR. Methods Enzymol. 394, 525–545.
27. Marenchino, M., Armbruster, D. W., and Hennig, M. (2009) Rapid and efficient purification of RNA-binding proteins: application to HIV-1 Rev. Protein Expression and Purification 63, 112–119.
28. Khanna, M., Wu, H., Johansson, C., Caizergues-Ferrer, M., and Feigon, J. (2006) Structural study of the H/ACA snoRNP components Nop10p and the 3′ hairpin of U65 snoRNA. RNA 12, 40–52.
Chapter 13
Preparation and Optimization of Protein–DNA Complexes Suitable for Detailed NMR Studies
My D. Sam and Robert T. Clubb
Abstract
This chapter describes the methods to form and optimize samples of protein–DNA complexes that are
suitable for detailed structure and dynamics studies by NMR spectroscopy.
Key words: Protein–DNA complex, NMR, Structure, Intermolecular NOEs
1. Introduction
Interactions between proteins and DNA molecules play an essential
role in a wide range of important biological processes, including
gene expression and genomic replication, recombination, and
repair. Of particular interest are site-specific DNA-binding tran-
scription factors, which regulate gene expression by recognizing
specific nucleotide sequences. These proteins locate, and bind to,
the correct site within the genome despite the presence of a vast
number of competitor sites with similar geometries and electrostatic
surfaces. Over the past several decades, X-ray crystallography has
been used extensively to determine a large number of high-resolution
structures of protein–DNA complexes (1, 2). This work has provided
a wealth of detailed stereochemical information about binding site
recognition, which typically is achieved through complementary
hydrogen-bonding and van der Waals interactions that can be
maximized by protein folding and/or DNA distortions (3, 4).
However, crystallography provides only a static view of a protein–
DNA complex, and thus little insight into the conformational
dynamics that underpin macromolecular recognition (5).
NMR spectroscopy is a powerful tool that can be used to
investigate protein–DNA recognition in the solution state. When
the spectra obtained are of good quality, NMR can be used to eluci-
date high-resolution structures and atomic-level conformational
dynamics (6, 7). NMR can also be applied to investigate other key
aspects of recognition, including the basis of nonspecific binding
(8), hydration lifetimes (9), on/off rates of binding, and the
process by which a protein locates its binding site (10). Even when
the quality of the NMR spectra is poor due to resonance line
broadening, structural models of a protein–DNA complex can be
generated using chemical shift mapping techniques. In recent years,
the size and complexity of protein–DNA complexes amenable to
NMR studies have increased due to several methodological
advances, such as selective isotopic labeling, residual dipolar coupling
measurements (11), paramagnetic relaxation enhancement
methods (12), and transverse relaxation-optimized experiments
that exploit ultrahigh magnetic field strengths (13, 14). However,
one of the largest obstacles to successfully studying a protein–DNA
complex by NMR is the preparation of sufficiently stable, concen-
trated, and homogeneous samples of the complex.
For a protein–DNA complex to be suitable for detailed NMR
studies, it typically must satisfy several criteria. The components
and nature of the interaction should be well-defined. In particular,
biochemical experiments should have been performed to clearly
delineate the specific nucleotide sequence recognized by the pro-
tein, as well as the stoichiometry and affinity of the resulting complex.
The same molecular weight limitations that hinder NMR studies of
other macromolecules apply, so the final size of the complex is also
an important consideration. Ideally, the complex should have a dis-
sociation constant (Kd) in the submicromolar range, thus making
it more likely that it will be in the slow-exchange regime on the
chemical shift timescale. However, weaker affinity complexes that
are in fast exchange have also been successfully studied. To optimize
the production and spectral qualities of a complex, large quantities
of DNA and isotopically labeled protein are also needed to enable
different preparative procedures and conditions to be tested. Pilot
studies typically make use of purified 15N-enriched protein and a
range of DNA species that differ in their length and sequence. In
our experience, the greatest chance of success occurs when the
protein is soluble in its DNA-free state and its 1H–15N HSQC
spectrum is well-resolved. However, DNA is highly soluble and as
a result the aggregation behavior of the protein may be greatly
reduced upon complex formation.
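The link between affinity and exchange regime can be made concrete with a rough calculation. The sketch below is only illustrative: it assumes a near diffusion-limited association rate constant (k_on = 1e8 M^-1 s^-1), approximates the exchange rate by k_off = Kd × k_on, and uses arbitrary factor-of-ten thresholds to separate the regimes. None of these values come from this chapter, and the function name is hypothetical.

```python
import math

def exchange_regime(kd_molar, delta_nu_hz, k_on=1e8):
    """Roughly classify exchange on the chemical shift timescale.

    kd_molar    : dissociation constant (M)
    delta_nu_hz : shift difference between free and bound states (Hz)
    k_on        : ASSUMED association rate constant (M^-1 s^-1)
    """
    k_off = kd_molar * k_on                    # dissociation rate (s^-1)
    delta_omega = 2.0 * math.pi * delta_nu_hz  # shift difference (rad/s)
    ratio = k_off / delta_omega
    if ratio < 0.1:        # ASSUMED threshold for slow exchange
        regime = "slow"
    elif ratio > 10.0:     # ASSUMED threshold for fast exchange
        regime = "fast"
    else:
        regime = "intermediate"
    return k_off, regime

# Compare three affinities for a resonance that shifts by 100 Hz on binding.
for kd in (1e-7, 1e-5, 1e-4):
    k_off, regime = exchange_regime(kd, delta_nu_hz=100.0)
    print(f"Kd = {kd:.0e} M: k_off ~ {k_off:.0f} s^-1, {regime} exchange")
```

Under these assumptions, a 100 nM complex falls well within the slow-exchange regime, whereas a 100 µM complex does not. Real association rates vary widely, so the classification should be treated as a guide only.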
To create a stable complex suitable for NMR studies, the DNA
molecule must have the appropriate nucleotide sequence and
length to form productive contacts with the protein. For sequence-
specific DNA binding proteins, this information can be obtained
from previously reported biochemical studies, which should define
the specificity, affinity, and stoichiometry of the complex. If the
protein is known to bind to several DNA sites, then an alignment
of their nucleotide sequences may reveal conserved positions
essential for binding. This knowledge is helpful later in optimizing
the spectra of the protein–DNA complex, since it identifies nucle-
otides within the DNA molecule that can presumably be altered
without affecting stability. To reduce spectral overlap, the minimal
DNA sequence with good binding affinity for the protein should
be used to make the complex. If the structure of the protein in the
DNA-free state is known, a model of it docked to B-form DNA
should be constructed. This may help to determine the minimum
DNA length that can be used, and whether additional nucleotides
outside of the known binding site are required to form nonspecific
stabilizing contacts. In practice, DNA molecules studied in our
group almost always contain a G:C base pair (bp) at each end,
which limits fraying by increasing the melting temperature.
Minor changes in the DNA and protein sequences can dramati-
cally affect the NMR spectrum of a protein–DNA complex and are
therefore parameters that can be optimized. A common mistake is
to choose a DNA fragment that is too long with unnecessary base
pairs at either its 3′ or 5′ end. This can be problematic as longer
DNA fragments can contain weaker, “cryptic” binding sites for the
protein that become occupied at the high protein concentrations
present in the NMR sample (typically > 0.5 mM). For example, a
protein that forms numerous interactions with an A-T sequence
located at the center of the primary site might also bind to a
secondary A-T dinucleotide sequence present elsewhere in a longer
DNA fragment. If this occurs, the multiple binding modes of the
protein cause resonance line broadening. Modeling studies and a
comparison of the DNA-binding sites can be used to identify poten-
tial “cryptic” sites, which can then be eliminated by altering the
nucleotide sequence of the DNA molecule. A nucleotide sequence
comparison can also identify dsDNA molecules that have NMR
spectra that can more readily be assigned. For example, it may be
preferable to maximize the number of thymine bases in the sequence,
as their methyl groups are good anchor points in the assignment
process. The length and sequence of the protein can also be adjusted
to improve spectral quality. Typically, this involves deleting unstruc-
tured amino acids at the polypeptide termini to reduce spectral
overlap. However, even subtle single amino acid changes can have a
dramatic impact on spectral quality. For example, in our studies of
an ARID-DNA complex, a single phenylalanine-to-leucine muta-
tion was found to dramatically reduce line broadening, salvaging a
protein–DNA complex that was originally ill suited for structural
analysis by NMR (15). This biochemical approach is not a general
method, but may prove useful in the spectral optimization of other
protein complexes that suffer from interfacial line broadening
caused by dynamic changes in proximal aromatic rings.
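A simple sequence scan can help flag candidate "cryptic" sites before a DNA fragment is ordered. The following is a minimal sketch, assuming that a short core motif from the primary site is a reasonable proxy for a secondary binding site; the function names, the example sequence, and the motif are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def find_motif_sites(top_strand, motif):
    """Locate a motif on both strands of a duplex.

    Returns sorted (position, strand) tuples, where position is the
    0-based index on the top strand. A hit on the bottom strand ("-")
    appears as the reverse complement of the motif on the top strand.
    Palindromic motifs are therefore reported on both strands.
    """
    top = top_strand.upper()
    motif = motif.upper()
    hits = set()
    for query, strand in ((motif, "+"), (reverse_complement(motif), "-")):
        start = top.find(query)
        while start != -1:
            hits.add((start, strand))
            start = top.find(query, start + 1)  # allow overlapping hits
    return sorted(hits)

# Hypothetical 14-nt top strand with a primary site and one extra match.
print(find_motif_sites("GCTTGACGTCAAGC", "TTGA"))
```

Any hit outside the intended binding site is a candidate for removal by altering the nucleotide sequence, although whether it is actually occupied at NMR concentrations still has to be tested experimentally.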
Fig. 1. Flowchart showing the procedures used to form and optimize protein–DNA
complexes for NMR studies.
In this chapter, we outline the approaches we typically use to
form protein–DNA complexes suitable for NMR studies. The
overall procedure for this protocol is outlined in Fig. 1.
2. Materials
The exact reagents used to form protein–DNA complexes suitable
for high-resolution solution-state NMR studies vary depending
upon the specific system that is being studied. In this section, the
materials used to produce the Integrase (Int)–DNA complex are
described (16).
2.1. Binding Affinity Measurements
1. 2× binding buffer: 40 mM Tris–HCl, pH 7.5, 40 mM NaCl,
40 mM KCl, 10% (w/v) glycerol, 2 mM EDTA, and 2 mM
DTT.
2. 32P-labeled DNA: Labeled at its 5′-termini with T4 polynucle-
otide kinase and γ-32P-ATP.
3. PhosphorImager (Molecular Dynamics Inc.): To quantify
radioactivity in the gels.
4. 1× TBE: For 1 L of 10× TBE, dissolve 108 g of Tris base, 55 g
of boric acid, and 7.4 g of disodium EDTA salt. Dilute tenfold
to get 1× TBE.
5. Bovine serum albumin (BSA): Stock concentration 1 mg/mL in
H2O.
6. Poly dI/dC: Stock concentration 0.5 mg/mL.
7. Protein stock solution: 20 µM (4× the highest protein
concentration used in the binding assay). For protein–DNA
complexes with nanomolar affinity, a typical titration range is
as follows (nM): 5,000, 1,000, 200, 100, 50, 25, 5, 1, 0.5, and
0. The protein should be dissolved in the solution in which that
particular protein is most stable. This protein buffer is diluted in
the binding assay and replaced with binding buffer solution.
8. 6× DNA loading dyes: 0.03% (w/v) xylene cyanol FF and
0.03% (w/v) bromophenol blue in 10 mM Tris, pH 8, and 30%
(w/v) glycerol.
9. 1 M Tris, pH 8: Dissolve 121.1 g of Trizma (MW = 121.1 g/mol)
in 800 mL of H2O, titrate with concentrated HCl until pH 8
is achieved, and then bring final volume to 1 L.
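The recipes above rest on two elementary relations: mass = molarity × molecular weight × volume, and C1V1 = C2V2 for diluting stocks. The helper below is a minimal sketch of both; the function names are hypothetical, and the only numbers used are those already given in the list.

```python
def grams_required(molarity, mw_g_per_mol, volume_l):
    """Mass of solute (g) needed for a solution of the given molarity."""
    return molarity * mw_g_per_mol * volume_l

def stock_volume(stock_conc, final_conc, final_volume):
    """Volume of stock needed (same unit as final_volume), from C1V1 = C2V2.

    stock_conc and final_conc must share a unit (e.g. both in "x" or mM).
    """
    return final_conc * final_volume / stock_conc

# 1 M Tris in 1 L requires 121.1 g of Trizma (MW = 121.1 g/mol).
print(grams_required(1.0, 121.1, 1.0))

# 1 L of 1x TBE requires 100 mL of 10x TBE.
print(stock_volume(10.0, 1.0, 1000.0))
```

The same dilution relation applies to any stock in the list, for example when scaling the 2× binding buffer or the 4× protein stock to a different reaction volume.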
2.2. Preparation of Purified Single-Stranded DNA
1. DNA oligonucleotides: 1 µmol scale synthesis (Integrated
DNA Technologies, IDT) (see Note 1).
2. 17% acrylamide–urea gel: For 400 mL total volume, mix
168.2 g of urea, 170 mL of 40% acrylamide (37.5:1), 40 mL
of 10× TBE (Subheading 2.1), 60 mL of H2O, 1 mL of 10%
(w/v) APS, and 100 µL of TEMED.
3. DNA loading buffer: 7 M urea, 50 mM Tris, pH 8.0 (using
1 M Tris, pH 8; Subheading 2.1), 5 mM EDTA (using 0.5 M
EDTA, pH 8), and 10% (w/v) glycerol.
4. 0.5 M EDTA, pH 8: Add 186.1 g of EDTA disodium salt and
~20 g of NaOH pellets to 700 mL of H2O (EDTA dissolves as
it approaches pH 8), and then bring the solution to 1 L once
pH 8 is achieved.
5. 6× DNA loading dyes (see Subheading 2.1).
6. 1× TBE (see Subheading 2.1).
7. FLEX TLC plates.
8. Electroelution chamber.
9. Dialysis buffer: 50 mM Tris–HCl, pH 7.5, 200 mM NaCl, and
2 mM EDTA (using 0.5 M EDTA, pH 8).
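The gel recipe in item 2 can be checked, or rescaled to a different acrylamide percentage, with a few lines of arithmetic. This is a minimal sketch using the quantities listed above and the molecular weight of urea (60.06 g/mol); the function names are hypothetical.

```python
UREA_MW = 60.06  # g/mol

def gel_composition(total_ml, acrylamide_stock_ml, acrylamide_stock_pct,
                    urea_g, tbe_10x_ml):
    """Final acrylamide (%), urea (M), and TBE strength (x) of a gel mix."""
    return {
        "acrylamide_pct": acrylamide_stock_ml * acrylamide_stock_pct / total_ml,
        "urea_molar": urea_g / UREA_MW / (total_ml / 1000.0),
        "tbe_x": tbe_10x_ml * 10.0 / total_ml,
    }

def acrylamide_stock_ml(target_pct, total_ml, stock_pct=40.0):
    """Volume of acrylamide stock (mL) needed for a target gel percentage."""
    return target_pct * total_ml / stock_pct

# The 400 mL recipe: 170 mL of 40% acrylamide, 168.2 g urea, 40 mL 10x TBE.
print(gel_composition(400.0, 170.0, 40.0, 168.2, 40.0))

# Stock volume for a 20% gel of the same total volume.
print(acrylamide_stock_ml(20.0, 400.0))
```

The check confirms that the recipe gives a 17% gel in approximately 7 M urea and 1× TBE. For a 20% gel the extra acrylamide stock has to come out of the water volume, which should be adjusted by hand.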
2.3. Preparation of Duplex DNA for NMR Studies
1. Annealing buffer: Same as dialysis buffer (Subheading 2.2).
2. D2O.
2.4. Preparation of the Protein–DNA Complex
1. High-salt protein buffer: 50 mM Hepes, pH 7.0 (using 1 M
Hepes, pH 7), 500 mM NaCl, and 1 mM DTT.
2. 1 M Hepes, pH 7: Stock solution is adjusted to pH with
NaOH.
3. High-salt DNA buffer: 50 mM Tris, pH 7.5 (using 1 M Tris,
pH 7.5, see Subheading 2.1, except adjust pH to 7.5), 500 mM
NaCl, and 0.1 mM EDTA.
4. Low-salt buffer: 25 mM Hepes, pH 7.0 (using 1 M Hepes, pH
7), 15 mM NaCl, 2 mM DTT, 7% D2O, and 0.01% NaN3.
5. Centricon YM-3 centrifugal filter device (Amicon
Bioseparations).
6. ~50 µM protein solution.
7. ~50 µM dsDNA solution.
3. Methods
Our lab has solved the structures of six protein–DNA complexes.
Four have been determined by NMR spectroscopy and two have
been determined by X-ray crystallography (17–22). Below, we
describe the procedures we generally use to form complexes
between a sequence-specific binding protein and a duplex-DNA
molecule. Unless otherwise stated, the procedures described below
are used to produce samples of the complex between the Int
protein and its cognate DNA site (17–22). Five procedures are
presented: (1) an electrophoretic mobility shift assay (EMSA) for
affinity and specificity measurements, (2) methods to purify single-
stranded DNA (ssDNA), (3) methods to prepare duplex DNA
(dsDNA), (4) the formation of the protein–DNA complex, and
(5) the procedures used to assess the spectral quality of a
protein–DNA complex to determine if additional NMR studies
are warranted.
3.1. Binding Affinity Measurements
Biochemical assays should be available to rapidly estimate the affinity
of wild-type and mutant proteins for different DNA molecules.
A variety of methods can be employed to measure binding, such as
the EMSA, isothermal titration calorimetry (ITC), fluorescence
anisotropy, surface plasmon resonance (SPR; e.g., Biacore), and
fluorescence quenching (if the protein contains an appropriately
positioned tryptophan). However, we favor the EMSA because it is
robust and simple to perform (23). This procedure has been
described in detail previously (24) and is outlined below.
1. Mix the following components: 12 µL of 2× binding buffer,
3 µL of 1 mg/mL BSA, 2 µL of 0.5 mg/mL Poly dI/dC (com-
petitor DNA) for a total of 17 µL.
2. Add to this mixture the protein stock solution and the
appropriate amount of H2O to achieve a final volume of 23 µL
(see Subheading 2.1 and Note 2).
3. Incubate on ice for 20 min.
4. Add 1 µL of 32P-labeled DNA probe (~4,000 cpm/µL).
5. Incubate on ice for 20 min.
6. Prepare an 8% polyacrylamide gel and pre-electrophorese the
gel by running it for 30–60 min at 10 V/cm in 1× TBE at
ambient temperature or 4°C (see Note 3).
7. Load the reaction mixtures onto an 8% polyacrylamide/TBE
gel at 4°C (see Note 4). Load 6× DNA loading dyes in a
separate lane as a reference to track the migration of the free
DNA. The gel run time should be optimized for each specific
system so as to resolve the species of interest and to be as short
as possible. A typical gel run time is ~1.5 h at constant voltage
(10 V/cm), but the voltage should be reduced if the gel
becomes warm during electrophoresis.
8. Quantify the amount of free and bound DNA in each lane by
using a PhosphorImager system or the equivalent.
9. Determine the dissociation constant by fitting to the following
equation: θ = [L]/([L] + Kd), where θ, [L], and Kd are the
fraction of DNA bound, the total protein concentration in
the reaction, and the dissociation constant, respectively. θ is
equal to the counts present in the shifted band divided by the
total counts for the DNA (free plus shifted bands).
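The fit in step 9 can be sketched with the Python standard library alone. The example below is a minimal illustration, not the analysis used by the authors: it minimizes the sum of squared residuals over log10(Kd) with a golden-section search, which assumes a single minimum within the search bounds. The function names, the bounds, and the synthetic data are all hypothetical. Using the total protein concentration for [L] is reasonable only when the labeled DNA concentration is far below Kd.

```python
import math

def fraction_from_counts(shifted_counts, free_counts):
    """theta = counts in the shifted band / total counts in the lane."""
    return shifted_counts / (shifted_counts + free_counts)

def fraction_bound(protein_conc, kd):
    """Single-site isotherm: theta = [L] / ([L] + Kd)."""
    return protein_conc / (protein_conc + kd)

def fit_kd(protein_concs, fractions, lo=1e-3, hi=1e6, iterations=200):
    """Least-squares estimate of Kd (same unit as protein_concs).

    lo and hi are ASSUMED search bounds; the search runs in log10(Kd).
    """
    def sse(log_kd):
        kd = 10.0 ** log_kd
        return sum((f - fraction_bound(c, kd)) ** 2
                   for c, f in zip(protein_concs, fractions))

    a, b = math.log10(lo), math.log10(hi)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)

# Titration points from Note 2 (nM) with noise-free synthetic data.
concs_nm = [0, 0.5, 1, 5, 25, 50, 100, 200, 1000, 5000]
thetas = [fraction_bound(c, 50.0) for c in concs_nm]
print(f"fitted Kd ~ {fit_kd(concs_nm, thetas):.1f} nM")
```

With real gel data, each θ value would come from `fraction_from_counts` applied to the PhosphorImager counts for that lane, and replicate titrations would be needed to judge the uncertainty of the fitted Kd.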
3.2. Preparation of Purified Single-Stranded DNA
In this section, we discuss how to purify large quantities of
commercially available ssDNA oligonucleotides for NMR studies.
We initially purchase the ssDNA in a crude form at a cost of ~$2 per
nucleotide for 1 µmol of material. ssDNA shorter than 20 nucleotides
(nt) is purified on 20% acrylamide–urea gels, and ssDNA longer
than 20 nt is purified on 17% gels. The ssDNA is
then eluted from the gel and dialyzed into native buffer for further
use. The procedure we use generally yields ssDNA that is >98%
pure for molecules up to 40 nt in length. For ssDNA shorter than
15 nt, more conventional approaches are sufficient to produce
ssDNA suitable for NMR studies (see Note 5).
1. Prepare one 17% acrylamide/urea gel with a single lane (see
Note 6).
2. Dissolve the DNA oligonucleotide (1 µmol or 3–6 mg) in
2 mL of DNA loading buffer.
3. Load 20 µL of 6× DNA loading dyes on the right and left edge
of the gel within the single lane. The migration of the xylene
cyanol FF and bromophenol blue dyes along the gel gives an
estimate of how far the DNA has migrated (see Note 7).
4. Run the gel in 1× TBE. It typically takes ~10–13 h for ssDNA
(depending on the length) to migrate ~3/4 of the entire gel
length. For a single gel running at a constant 50 W, the voltage
is ~600–700 V (65–80 mA) (see Note 8).
5. Transfer the gel onto saran wrap by removing the top gel plate
and placing a layer of saran wrap directly on top of the gel.
Invert the gel onto FLEX TLC plates with the saran wrap on
top of the TLC plates. Remove the remaining gel plate and in
a darkroom, use a handheld UV lamp (254 nm) to locate the
ssDNA. Quickly excise the DNA using a razor blade. Cut the
excised gel containing the desired ssDNA into ½-in. pieces to
ensure efficient DNA electroelution. The purpose of the
FLEX TLC plate is to enhance the DNA signals during
exposure to UV light.
6. Assemble an Elutrap™ electroelution device (Whatman).
Membranes for trapping DNA are BT1 (14 nucleotide cutoff)
and BT2 (cellulose acetate membrane). Alternatively, BT1 can be
replaced with a low-molecular-weight cutoff dialysis membrane.
7. Elute the DNA in 1× TBE. Run the Elutrap at 150 V for
8–10 h (remove eluted DNA from the Elutrap two to three
times during this period).
8. Dialyze extensively with dialysis buffer to remove all denatur-
ing reagents from the ssDNA.
9. Determine the DNA concentration from a UV absorbance
reading at 260 nm (A260). The extinction coefficient may be
calculated online using a program provided by IDT
(http://www.idtdna.com/analyzer/Applications/OligoAnalyzer/).
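The concentration in step 9 follows from the Beer–Lambert law, A260 = ε × c × l. The sketch below shows the calculation; the extinction coefficient, dilution factor, and volume in the example are made-up values for illustration, and the function names are hypothetical.

```python
def dna_concentration_molar(a260, extinction_coeff, path_cm=1.0,
                            dilution_factor=1.0):
    """Concentration (M) of the undiluted stock from a measured A260.

    extinction_coeff : molar extinction coefficient (M^-1 cm^-1)
    dilution_factor  : fold dilution applied before the reading
    """
    return a260 * dilution_factor / (extinction_coeff * path_cm)

def amount_nmol(conc_molar, volume_ml):
    """Amount of material (nmol) in a given volume (mL)."""
    return conc_molar * volume_ml * 1e6

# Hypothetical oligonucleotide: epsilon = 200,000 M^-1 cm^-1,
# A260 = 0.5 measured after a 100-fold dilution, 2 mL of stock.
conc = dna_concentration_molar(0.5, 200000.0, dilution_factor=100.0)
print(f"{conc * 1e6:.0f} uM, {amount_nmol(conc, 2.0):.0f} nmol")
```

Readings are most reliable when the diluted sample falls within the linear range of the spectrophotometer, so the dilution factor should be chosen accordingly.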
3.3. Preparation of Duplex DNA for NMR Studies
Complementary purified ssDNA molecules are annealed to produce
the appropriate DNA duplex. NMR and/or chromatographic
approaches are then used to verify duplex formation prior to
forming the protein–DNA complex. The steps used to generate
dsDNA are outlined below.
1. Dissolve complementary ssDNA molecules in annealing buffer
to a final concentration of ~100 µM. The samples should be
free of urea and EDTA.
2. Heat the sample to ~95–100°C for 10 min and slowly cool to
ambient temperature in the heat block. A water bath may be
used in place of a heating block for larger annealing volumes.
3. Perform NMR experiments to ensure that the sample has
properly annealed and that no excess ssDNA is present. Add an
appropriate amount of D2O to maintain the field lock, and
acquire a 2D 1H TOCSY spectrum (mixing time ~40 ms).
Acquire spectra of the ssDNA components of the duplex as
well. Compare the three spectra; signals from the ssDNA spec-
tra appearing in the duplex spectrum indicate ssDNA excess.
Typically, the H5-H6 cross peaks of the cytosine bases are used
to discriminate between the single-stranded and double-
stranded forms of DNA (see Note 9).
3.4. Preparation of the Protein–DNA Complex
Our group follows a conservative approach when forming protein–
DNA complexes to minimize sample loss due to precipitation
(17–22) (see Note 10). In this procedure, dilute solutions of
the components are mixed in the presence of high salt, followed by
concentration and removal of the salt to form the final NMR
sample. Typically, the starting concentration of the dsDNA is
~50–100 µM and it is dissolved in at least 150 mM NaCl, near
physiological pH (see Note 11).
Pilot studies should be performed prior to embarking on a
large-scale production of the complex as we have found that the
order of addition of the protein and DNA components can have a
dramatic impact on the results obtained. To test for precipitation
of the sample upon component mixing, slowly titrate the protein
into a ~20 µL sample of the DNA (the final volume of the fully
formed complex should be ~40 µL). The concentration of the
components is as described above and enough protein should be
added to generate a 1:1 complex. The reverse titration in which
dsDNA is titrated into a sample of the protein should also be per-
formed. In both cases, carefully observe if any precipitation occurs
during the mixing procedure and, more importantly, if any precipi-
tation remains after the titration is complete. A light microscope
can provide an easy way to assay for the presence of precipitation.
As already mentioned, the order of addition can be critical. For
example, in our studies of the Excisionase(Xis)-DNA complex, the
addition of dsDNA to a solution of Xis resulted in irreversible
precipitation, whereas the reverse titration, titrating Xis into a solu-
tion of dsDNA, yielded a soluble complex after mixing (20).
To obtain an NMR sample once the protocol for complex
formation has been optimized, we typically mix the DNA and
protein components such that the final volume of the complex is
~10 mL (~25 µM of the complex). The salt and unwanted buffer
components are then removed by dialysis or using a centrifugal
filter unit. This step is important for complex stability as the pres-
ence of salt tends to destabilize the protein–DNA complex by
shielding electrostatic interactions. Generally, to obtain a sample
with good NMR spectral properties, several variants of the complex
are studied that differ in their pH, complex concentration, and ionic
strength. Initial screening is typically performed using ~200–500 µM
samples of the complex, which are then concentrated further to
construct the final sample once the best conditions are discovered
(see Note 12).
1. Prepare ~10 mL of 50 µM Int protein solution dissolved in
high-salt protein buffer. Prepare ~10 mL of 50 µM dsDNA
solution dissolved in high-salt DNA buffer (see Note 11).
2. Prepare the final NMR sample by slowly titrating 10 mL of
50 µM Int into 10 mL of 50 µM dsDNA. Add ~0.5 mL of
protein, mix, and repeat until the titration is complete. The
specific order of addition and the conditions used should
have been optimized as described above.
3. Exchange the final sample into low-salt buffer using a protein
concentrator (see Note 12). Concentrate the sample to a vol-
ume suitable for NMR experiments. Monitor the total amount
of complex at each step of this process by measuring the A260
(see Note 13).
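The volume to aim for during concentration follows from conservation of material. The sketch below is a minimal example, taking the mixed sample as 20 mL at 25 µM complex (10 mL of each component at 50 µM); the 1 mM target and the 80% recovery are illustrative assumptions rather than values prescribed by this protocol, and the function names are hypothetical.

```python
def complex_amount_nmol(conc_um, volume_ml):
    """Amount of complex (nmol): 1 uM x 1 mL = 1 nmol."""
    return conc_um * volume_ml

def final_volume_ml(conc_um, volume_ml, target_um, recovery=1.0):
    """Volume (mL) at which the sample reaches target_um.

    recovery is the ASSUMED fraction of complex that survives buffer
    exchange and concentration (1.0 means no loss).
    """
    return complex_amount_nmol(conc_um, volume_ml) * recovery / target_um

# 20 mL of 25 uM complex, concentrated toward an assumed 1 mM target.
print(final_volume_ml(25.0, 20.0, 1000.0))                # no losses
print(final_volume_ml(25.0, 20.0, 1000.0, recovery=0.8))  # 20% loss assumed
```

Comparing the volume actually reached with this estimate gives an early warning that material is being lost to the concentrator membrane or to precipitation.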
3.5. Assessing the Quality of the Protein–DNA Complex
1. To assess the quality of the DNA spectrum, record 1D 1H
spectra using a 1331 pulse adjusted to maximally excite the
imino protons (25).
2. Compare the spectra of the complex and the isolated dsDNA
molecule to assess protein binding.
3. If sufficient material is present, use a similar excitation scheme
to record a 2D 1H NOESY spectrum, which reveals whether
the appropriate imino–imino cross peaks are present. Ascertain
the quality of the protein spectrum by comparing the 1H–15N
HSQC spectra of the free and DNA-bound forms of the pro-
tein. The spectrum of the complex should be well-resolved,
exhibit uniform signal intensities, and differ substantially from
the NMR spectrum of the DNA-free protein (Fig. 2).
In favorable cases, it may be possible to observe intermo-
lecular NOEs using protein–DNA complexes containing only
15N-labeled protein. Depending on the structure of the complex,
intermolecular NOE cross peaks are sometimes observed in
the 2D 1H NOESY spectrum between the imino protons of
the DNA and protein protons that resonate upfield of 1.2 ppm.
This is because the most upfield resonances in the 1H spectrum
of DNA are the thymine H5 methyl groups (1.2 and 1.6 ppm).
Intermolecular NOEs between the protein amide and DNA
imino protons can sometimes also be seen in the 3D 15N-edited
NOESY spectrum of the 15N-labeled complex. However, a full
assessment of whether a complex is suitable for structure
determination by NMR requires the acquisition of the appro-
priate 2D and 3D edited and filtered NOESY experiments
using samples in which the protein is labeled with 13C and 15N.
Fig. 2. The 1H–15N HSQC spectrum of the Int-DNA complex, which exhibits good dispersion
and uniform line widths. Data for this complex were suitable to determine a high-
resolution NMR structure (17–22).
4. Notes
1. A variety of companies sell pure or partially purified single-
stranded oligonucleotides that are synthesized using phos-
phoramidite chemistry (e.g., IDT, Invitrogen, Sigma, and
Applied Biosystems). To save money, we typically obtain stan-
dard, desalted, unpurified oligonucleotides, which are then
further purified in-house. Oligonucleotides greater than 60 nt
in length can be produced via enzymatic reactions, which may
also be applied to produce DNA molecules enriched with 13C
and 15N (26, 27).
2. In this procedure, binding isotherms are generated by varying
the protein concentration (0, 0.5, 1, 5, 25, 50, 100, 200,
1,000, 5,000 nM protein).
3. Alternatively, Tris–acetate–EDTA (TAE) may be used for the
gel casting mixture and running buffer.
4. A 5–15% gel could be used depending on the respective sizes
of the individual components and formed complex.
5. A classical and more straightforward method to purify ssDNA
from a crude synthesis is to use an HPLC reverse-phase column
(C4 to C18; porous hydrocarbon silica gel) with 0.1 M trieth-
ylammonium acetate (TEAA; mobile phase A) and acetonitrile
(mobile phase B). A complete purification protocol and a col-
umn selection guide are described in Current Protocols in Nucleic
Acid Chemistry by Andrus et al. (28). A Mono Q column
(GE, Mono Q HR5/5) on an FPLC can be used to purify
ssDNA under denaturing conditions. Buffers for this purifica-
tion scheme are as follows: (a) Buffer A: 50 mM Tris, pH 7.5,
6 M urea, and (b) Buffer B: 50 mM Tris, pH 7.5, 6 M urea,
1.5 M NaCl. No more than 1.5 µmol of ssDNA should be
loaded onto this column for good separation. A 25 mL gradi-
ent (flow rate = 1 mL/min) from 10 to 30% buffer B should be
sufficient to ensure good separation of ssDNA from impurities.
It should be noted that all buffers and resuspended DNA
should be filtered through a 0.2-µm filter before applying to
the column.
6. If the DNA is shorter than ~20 nucleotides, a 20% acrylamide
gel should be used instead.
7. On a 10% gel, xylene cyanol migrates equivalent to a 55-nucleotide
ssDNA molecule and bromophenol blue migrates equivalent
to a 12-nucleotide ssDNA molecule. On a 20% gel, xylene
cyanol and bromophenol blue migrate equivalent to 28- and
8-nucleotide ssDNA molecules, respectively.
8. Power settings for a single 20% acrylamide/7 M urea gel
should be set to 30 W and maximum voltage and current. For
a single gel running at a constant 30 W, the voltage is ~600–650 V
and the current is 45–50 mA.
9. An alternative approach to ensure that solutions of dsDNA do
not contain excess ssDNA is to provide a 10% excess of one of
the DNA strands in the annealing reaction. After annealing, a
Mono Q column is then used to separate the dsDNA molecule
from excess ssDNA under native conditions. The following
buffers can be used: Buffer A: 50 mM Tris, pH 7.5, and 1 mM
EDTA; Buffer B: 50 mM Tris, pH 7.5, 1 mM EDTA, and
1.5 M NaCl. For good separation of the two DNA species, no
more than 2 mg of DNA should be loaded onto a 1 mL Mono
Q column. ssDNA typically elutes at ~25% buffer B
(0.375 M NaCl) and dsDNA typically elutes at ~35% buffer B
(0.525 M NaCl). A 25 mL gradient from 10 to 45% buffer B running
at 1 mL/min works well for oligos between 10 and 30 bp.
10. A variety of approaches can be used to successfully make
protein–DNA complexes for NMR studies. A strategy commonly
used in the literature is to titrate a ~1 mM 15N-labeled sample
of the protein with a concentrated stock solution of the DNA
molecule. 1H–15N HSQC spectra are recorded at various
points during the titration until the desired stoichiometry is
reached. Although the titration method is simple, mixing con-
centrated protein and DNA samples can cause the resulting
complex to precipitate.
11. These conditions prevent a shift in the equilibrium from
dsDNA to ssDNA that occurs at low-salt concentrations.
A similar protein concentration is used when forming the
complex; however, the salt and pH in the protein solution can
vary and are chosen to maximize the stability and solubility
of the free protein.
12. Frequently, the components of the complex are precious and
therefore methods that minimize sample loss during concen-
tration are desired. One trick to concentrate small-volume
samples (< 2 mL) is to partially evaporate the complex. In this
procedure, a weak flow of nitrogen gas is blown over the
sample while it rests in the NMR tube. This can be accomplished
by passing the gas through a drawn-out pipette that is in turn
inserted into the NMR tube. The buffer components are also
concentrated during this process, and therefore the initial
buffer conditions must be chosen accordingly.
13. A rough estimate of the concentration of the complex can be
obtained by measuring its A260 immediately after mixing the
components. The amount of the material present at this point
in the preparation procedure is known. Therefore, it is possible
to estimate the extinction coefficient of the complex by deter-
mining the optical absorbance of the complex at 260 and
280 nm. Once estimated, the extinction coefficient enables the
concentration of the complex to be readily determined as it is
concentrated to a volume suitable for NMR. It also enables the
yield of the concentration procedure to be determined as
the total amount of complex before and after concentrating
can be determined.
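The bookkeeping described in Note 13 can be written out explicitly. The sketch below is a minimal illustration that assumes the Beer–Lambert law holds for the complex at 260 nm; every number in the example is invented, and the function names are hypothetical.

```python
def estimate_extinction(a260, known_conc_molar, path_cm=1.0,
                        dilution_factor=1.0):
    """Apparent extinction coefficient (M^-1 cm^-1) of the complex,
    from an A260 reading taken right after mixing, when the
    concentration of the complex is known."""
    return a260 * dilution_factor / (known_conc_molar * path_cm)

def concentration_molar(a260, extinction_coeff, path_cm=1.0,
                        dilution_factor=1.0):
    """Concentration (M) of the undiluted sample from a later reading."""
    return a260 * dilution_factor / (extinction_coeff * path_cm)

def recovery_yield(conc_before, vol_before_ml, conc_after, vol_after_ml):
    """Fraction of complex retained after concentration."""
    return (conc_after * vol_after_ml) / (conc_before * vol_before_ml)

# Invented example: 25 uM complex in 20 mL reads A260 = 0.75 at 10x dilution.
eps = estimate_extinction(0.75, 25e-6, dilution_factor=10.0)
# After concentrating to 0.45 mL, A260 = 0.60 at 500x dilution.
conc_final = concentration_molar(0.60, eps, dilution_factor=500.0)
print(f"{conc_final * 1e3:.2f} mM, "
      f"yield {recovery_yield(25e-6, 20.0, conc_final, 0.45):.0%}")
```

Because the extinction coefficient is estimated from a single reading, any error in the starting concentrations of the components carries through to all later values, so the result is best treated as the rough estimate that Note 13 describes.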
Acknowledgments
We thank Dr. Evgeny Fadeev for making Fig. 2. This work was
supported by a grant from the National Institutes of Health to
R.T.C. (R01 AI52217).
References
1. Pabo, C. O., and Sauer, R. T. (1992) Transcription
Factors-Structural Families and Principles of DNA Recognition.
Annu. Rev. Biochem. 61, 1053–95.
2. Garvie, C. W., and Wolberger, C. (2001) Recognition of specific
DNA sequences. Mol. Cell. 8, 937–946.
3. Nadassy, K., Wodak, S. J., and Janin, J. (1999) Structural features
of protein-nucleic acid recognition sites. Biochemistry 38,
1999–2017.
4. Jen-Jacobson, L. (1997) Protein-DNA recognition complexes:
conservation of structure and binding energy in the transition
state. Biopolymers 44, 153–180.
5. Boehr, D. D., Nussinov, R., and Wright, P. E. (2009) The role of
dynamic conformational ensembles in biomolecular recognition.
Nat. Chem. Biol. 5, 789–796.
6. Billeter, M., Qian, Y. Q., Otting, G., Muller, M., Gehring, W., and
Wuthrich, K. (1993) Determination of the nuclear magnetic resonance
solution structure of an Antennapedia homeodomain-DNA complex.
J. Mol. Biol. 234, 1084–1093.
7. Omichinski, J. G., Clore, G. M., Schaad, O., Felsenfeld, G.,
Trainor, C., Appella, E., Stahl, S. J., and Gronenborn, A. M. (1993)
NMR structure of a specific DNA complex of Zn-containing DNA
binding domain of GATA-1. Science 261, 438–446.
8. Kalodimos, C. G., Biris, N., Bonvin, A. M., Levandoski, M. M.,
Guennuegues, M., Boelens, R., and Kaptein, R. (2004) Structure and
flexibility adaptation in nonspecific and specific protein-DNA
complexes. Science 305, 386–389.
9. Qian, Y. Q., Otting, G., and Wuthrich, K. (1993) NMR detection of
hydration water in the intermolecular interface of a protein-DNA
complex. J. Am. Chem. Soc. 115, 1189–1190.
10. Iwahara, J., and Clore, G. M. (2006) Direct observation of
enhanced translocation of a homeodomain between DNA cognate sites
by NMR exchange spectroscopy. J. Am. Chem. Soc. 128, 404–405.
11. Tjandra, N., and Bax, A. (1997) Direct measurement of distances
and angles in biomolecules by NMR in a dilute liquid crystalline
medium. Science 278, 1111–1114.
12. Clore, G. M., and Iwahara, J. (2009) Theory, practice, and
applications of paramagnetic relaxation enhancement for the
characterization of transient low-population states of biological
macromolecules and their complexes. Chem. Rev. 109, 4108–4139.
13. Pervushin, K., Riek, R., Wider, G., and Wuthrich, K. (1997)
Attenuated T2 relaxation by mutual cancellation of dipole-dipole
coupling and chemical shift anisotropy indicates an avenue to NMR
structures of very large biological macromolecules in solution.
Proc. Natl. Acad. Sci. USA 94, 12366–12371.
14. Cavanagh, J., Fairbrother, W. J., Palmer, A. G., III, Rance, M.,
and Skelton, N. J. (2006) Protein NMR Spectroscopy: Principles &
Practice (2nd ed.), Academic Press, San Diego.
15. Iwahara, J., Wojciak, J. M., and Clubb, R. T. (2001) Improved NMR
spectra of a protein-DNA complex through rational mutagenesis and
the application of a sensitivity optimized isotope-filtered NOESY
experiment. J. Biomol. NMR 19, 231–241.
16. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular
Cloning, A laboratory manual, 2nd ed., Cold Spring Harbor
Laboratory Press.
17. Abbani, M. A., Papagiannis, C. V., Sam, M. D., Cascio, D.,
Johnson, R. C., and Clubb, R. T. (2007) Structure of the
cooperative Xis-DNA complex reveals a micronucleoprotein filament
that regulates phage lambda intasome assembly. Proc. Natl. Acad.
Sci. USA 104, 2109–2114.
18. Iwahara, J., Iwahara, M., Daughdrill, G. W., Ford, J., and
Clubb, R. T. (2002) The structure of the Dead ringer-DNA complex
reveals how AT-rich interaction domains (ARIDs) recognize DNA.
EMBO J. 21, 1197–1209.
19. Fadeev, E. A., Sam, M. D., and Clubb, R. T. (2009) NMR structure
of the amino-terminal domain of the lambda integrase protein in
complex with DNA: immobilization of a flexible tail facilitates
beta-sheet recognition of the major groove. J. Mol. Biol. 388,
682–690.
20. Sam, M. D., Cascio, D., Johnson, R. C., and Clubb, R. T. (2004)
Crystal structure of the excisionase-DNA complex from
bacteriophage lambda. J. Mol. Biol. 338, 229–240.
21. Wojciak, J. M., Connolly, K. M., and Clubb, R. T. (1999) NMR
structure of the Tn916 integrase-DNA complex. Nature Struct.
Biol. 6, 366–373.
22. Wojciak, J. M., Iwahara, J., and Clubb, R. T. (2001) The Mu
repressor-DNA complex contains an immobilized “wing” within the
minor groove. Nature Struct. Biol. 8, 84–90.
23. Buratowski, S., and Chodosh, L. A. (2001) Mobility shift
DNA-binding assay using gel electrophoresis. Curr. Protoc. Mol.
Biol., Chapter 12, Unit 12.2.
24. Taylor, J. D., Ackroyd, A. J., and Halford, S. E. (1994) The gel
shift assay for the analysis of DNA-protein interactions, in
DNA-protein interactions, principles and protocols (Kneale, G. G.,
Ed.), Humana Press, Totowa, NJ.
25. Hore, P. J. (1983) A new method for water suppression in the
proton NMR spectra of aqueous solutions. J. Magn. Reson. 54,
539–542.
26. Louis, J. M., Martin, R. G., Clore, G. M., and Gronenborn, A. M.
(1998) Preparation of uniformly isotope-labeled DNA
oligonucleotides for NMR spectroscopy. J. Biol. Chem. 273,
2374–2378.
27. Xiong, A. S., Yao, Q. H., Peng, R. H., Duan, H., Li, X., Fan, H. Q.,
Cheng, Z. M., and Li, Y. (2006) PCR-based accurate synthesis of
long DNA sequences. Nat Protoc. 1, 791–797.
28. Andrus, A., and Kuimelis, R. G. (2001) Analysis and purification
of synthetic nucleic acids using HPLC. Curr. Protoc. Nucleic Acid
Chem., Chapter 10, Unit 10.5.
Chapter 14
NMR Studies of Protein–Ligand Interactions

Michael Goldflam, Teresa Tarragó, Margarida Gairí,
and Ernest Giralt
Abstract
Nuclear magnetic resonance (NMR) has evolved into a powerful tool for characterizing protein–ligand
interactions in solution under near physiological conditions. It is now frequently harnessed to assess the
affinity and specificity of interactions; to identify binding epitopes on proteins and ligands; and to charac-
terize the structural rearrangements induced by binding.
The first section of this chapter provides a general overview of the NMR study of protein–ligand
interactions. The section is divided according to two main categories of experiments: those based on
observing protein signals and those based on observing ligand signals. The next section explains two case
studies performed in the authors’ laboratory. The first of these deals with the interaction between vascular
endothelial growth factor and a peptidic ligand, and includes a detailed protocol of chemical shift
perturbation experiments. The second one reports on the interaction between prolyl oligopeptidase and a
small molecule as monitored by ligand saturation transfer difference (STD), and illustrates how NMR can
be used to confirm binding and to identify the binding epitope of a ligand.
Key words: Protein–ligand interactions, Chemical shift perturbation, Saturation transfer difference,
NMR, Vascular endothelial growth factor, Prolyl oligopeptidase, Protein observed experiments,
Ligand observed experiments
1. Introduction
Protein–ligand interactions are integral to diverse biological
processes. They include the interaction of proteins with signaling
molecules, such as neurotransmitters and hormones, or cofactors,
as well as antigen recognition and enzyme–substrate interactions.
In all of these processes, correct biological functioning of the
protein requires that it specifically recognize a ligand at a particular
binding area on its surface.
Deep knowledge of these processes and their underlying
mechanisms is necessary not only for understanding these events at
the molecular level, but also for being able to selectively modulate
these interactions to provoke a desired biological response. This
can be done by modifying natural compounds or by developing
completely new compounds. Both cases offer a nearly unlimited
pool of small organic molecules, peptides, carbohydrates, or mixtures
thereof. Whatever the potential of these compounds to interact
with a given protein, recognition itself is steered by the structural
orientation of the protein’s functional groups. Thus, elucidation
of these interactions greatly facilitates selection of appropriate
functional groups in an appropriate framework.
Protein–ligand interactions can be studied with several tools,
nearly all of which can provide information on binding strength
and specificity. This information can be complemented with data
acquired by isothermal titration calorimetry (ITC), mass spectrom-
etry (MS), surface plasmon resonance (SPR), and nuclear magnetic
resonance (NMR).
ITC records the change in temperature of a protein solution
upon titration with a ligand solution in an isolated chamber (1). It
enables determination of thermodynamic parameters, including
the free energy (ΔG), enthalpy (ΔH), and entropy (ΔS) of the inter-
action, and the change in heat capacity (ΔCp).
In mass spectrometry, various techniques are used to ionize
compounds or complexes and subsequently analyze their mass-to-
charge ratios. Recent developments in MS have facilitated the study
of protein–ligand interactions, allowing the detection and character-
ization of individual conformational states of protein complexes (2).
Owing to the high sensitivity of MS, only minute amounts of
sample are needed. The study of hydrogen–deuterium exchange of
protein backbone amide hydrogens can give information on the
binding epitope of a ligand; however, this is most amenable to
higher affinity ligands. Finally, MS is one of the few methods that
enable study of complexes in gas phase. Comparison between
binding energies in gas phase and in solution may advance under-
standing of the forces behind protein–ligand interactions and,
more precisely, help establish the role of solvation in molecular
recognition at protein surfaces (3).
SPR probes the interaction between an analyte in solution
and a biomolecular recognition element immobilized on a sensor
surface (4). It enables direct determination of the binding kinetics
parameters kon and koff, from which thermodynamic parameters can
be quantified. If the protein is the immobilized binding partner,
then only small amounts are necessary. The main drawback of SPR
is that it requires immobilization of one of the binding partners,
which may influence the protein–ligand interaction.
NMR has evolved into a powerful tool for obtaining massive
amounts of data on inter- and intramolecular processes. Use of NMR
to detect protein–ligand interactions is widely documented in the

literature (5–7). An advantage of NMR over other techniques is
that it provides access to a broad set of experiments that have been
optimized for various objectives: determination of affinity and
specificity; identification of binding epitopes on the protein and on
the ligand; characterization of structural rearrangements induced
by binding; and turnover of substrates by enzymes. Furthermore,
since the experiments are performed in solution, physiological or
near physiological conditions are possible. Another advantage of
NMR is that it is not limited to high-affinity systems: it can be
applied to study very weak interactions (i.e., mM range), for which
other techniques are often unsuitable (8). Moreover, for low-affinity
systems, NMR offers a relatively low incidence of false positives
and false negatives compared to other analytical approaches. The
main limitation of NMR is its low sensitivity. Also, compared to
other techniques, NMR experiments are intrinsically low-through-
put. Nevertheless, improved automation, and development of high
sensitivity probes (e.g., cryoprobes), new pulse sequences, efficient
isotopic labeling techniques, and more powerful magnets, have all
contributed significantly to minimize these limitations. NMR
experiments for protein–ligand interactions fall into two main cat-
egories: either studying them from the perspective of the protein
or from the perspective of the ligand. In the following section,
both approaches are overviewed and some typical experiments
from each group are analyzed.
1.1. Protein Observed Experiments
Although in some cases monodimensional (1D) 1H-NMR experiments
have been used to characterize the protein–ligand binding
by following the 1H chemical shifts of specific residues in the pro-
tein, most experiments on protein observation entail bidimensional
(2D) NMR. Conventional 1D-1H spectra typically cannot resolve
the individual proton signals of the protein. This limitation can be
overcome by distributing the information along two dimensions
and by employing heteronuclear spectroscopy (i.e., studying magnetically
active nuclei other than protons that are present in proteins,
such as 15N and 13C). Since the natural abundance of 15N and
13C (0.37% and 1.1%, respectively) is too low for NMR experiments,
the protein to be studied must be isotopically labeled,
usually, through expression in E. coli. Several efficient labeling
schemes are available, and choosing the right one can greatly simplify
NMR studies of proteins.
The most widely used labeling method is uniform labeling with
15N. For proteins expressed recombinantly in E. coli, 15N (in the
form of an ammonium salt) is added to the expression media. This
simple modification provides near quantitative isotopic labeling of
the protein: all backbone amides as well as the nitrogen-containing
side chains are labeled with this magnetically active nucleus.
Heteronuclear 1H-15N correlation NMR experiments that allow
direct observation of J-coupled 1H to 15N nuclei generate spectra

containing at least one signal for each amino acid, except proline.
Additional signals arise from amides in the side chains. When signal
assignment is available, this strategy enables mapping of changes in
the protein’s backbone amides that are induced by binding of a
ligand, and if the 3D structure of the protein is known, then the
regions directly involved in the binding process can be easily
identified.
Another common labeling scheme for studying protein–ligand
interactions is amino acid specific labeling, in which the desired
amino acid – or a suitable precursor – is added prelabeled to the
expression media and an auxotrophic bacterial strain is used. This is
advantageous for large proteins (i.e., >40 kDa), for which it provides
far simpler spectra than those obtained with uniform labeling of the
backbone. In this context, a type of amino acid is selected which is
well distributed throughout the protein sequence and which can
serve as a probe for changes induced by the ligand–protein interac-
tion. The authors of this review recently used this strategy to map
changes induced by ligand binding to POP, an 80-kDa protein,
using a 15N-indole selective labeling scheme of the 12 Trp residues
in the enzyme (9). Selective labeling of one type of amino acid can
also be attractive for small proteins. According to the Hot Spot the-
ory proposed by Bogan et al., specific amino acids are concentrated
in regions of the protein that contribute to interactions with other
proteins or ligands (10). Therefore, one of these amino acids may
serve as a site-specific probe. Even if the assignment is incomplete,
this scheme can identify ligands that bind to a zone of interest.
For methyl-bearing side chains, 13C labeling provides a very
sensitive probe. Due to its mobility and the presence of three
degenerate protons, the methyl group generates a high intensity
signal in 2D heteronuclear 1H-13C correlation NMR experiments,
while being extremely sensitive to environmental changes. The
advantages and applications of using selectively 13C-labeled methyl
groups in the NMR study of large biomolecules have been reviewed
by Tugarinov (11).
The key experiment for the study of protein–ligand interactions
in 15N-labeled target proteins is the 1H-detected 2D-[15N,1H]-HSQC
experiment (12, 13). For uniformly 15N-labeled samples,
at least one signal per amino acid is observed. The basic
experiment comprises four main blocks (Fig. 1a).
Block A comprises an INEPT module (14), whose purpose is
to transfer nuclear spin polarization between J-coupled nuclei – in
this case, from the more sensitive one, 1H, to the less sensitive one,
15N. Since the transfer delays are tuned to the 1JHN coupling constant
(ca. 90–95 Hz), only magnetization of amide protons is transferred
to the adjacent 15N nucleus. During block B, a 15N frequency
labeling is achieved by incrementing the variable delay t1, which
leads to generation of the indirect dimension of the 2D spectrum
(F1 frequency). Block C comprises a reverse INEPT module.
Nuclear spin polarization is again transferred – this time, from 15N
to 1H. This enables data acquisition in block D, in which 1H
magnetization is directly detected during t2, which corresponds to
the F2 frequency in the 2D spectrum (see Fig. 1b). Both excitation
and direct detection of 1H, the nucleus with the higher gyromagnetic
ratio, provide a highly sensitive NMR experiment.
Fig. 1. (a) Basic pulse sequence for the 1H-15N HSQC experiment. The narrow and wide
bars depict 90° and 180° pulses, respectively. The delays (τ; equal to 1/[4 1JHN]) allow
magnetization evolution to be transferred between coupled nuclei. 15N magnetization
evolves for t1 and 1H magnetization is directly detected during t2. Double Fourier transformation
along t1 and t2 generates a 2D correlation spectrum with frequencies F1 and F2,
respectively, as shown in (b). Every signal in the spectrum corresponds to one NH group in
the protein and gives information on the chemical shifts of an amide nitrogen (F1) and an
amide proton (F2) that are directly coupled through the coupling constant 1JHN.
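As a quick worked number for the INEPT transfer delay of 1/(4 1JHN), the arithmetic below takes 1JHN = 92 Hz. This value is only an illustrative assumption chosen from within the 90–95 Hz range quoted above; the delay symbol τ follows the Fig. 1 caption.

```latex
\[
\tau = \frac{1}{4\,{}^{1}J_{HN}} = \frac{1}{4 \times 92\ \mathrm{Hz}} \approx 2.7\ \mathrm{ms},
\qquad
2\tau = \frac{1}{2\,{}^{1}J_{HN}} \approx 5.4\ \mathrm{ms}
\]
```

Under this assumption, the two τ periods of one INEPT module add up to roughly 5.4 ms; in practice the delay is set for the coupling constant of the sample at hand.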
Since labile (e.g., amide) protons are observed in the experi-
ment, protein NMR must be performed in H2O, rather than in
D2O. Proton concentration in H2O (ca. 100 M) is usually several
orders of magnitude higher than that of the protein (mM range),
which implies a wide dynamic range. Thus, the H2O signal must be
strongly attenuated in order to observe the protein protons at a
sufficient signal-to-noise ratio in the NMR spectrum. Water sup-
pression is thereby a critical requisite that must be experimentally
optimized. Currently, most schemes that provide good water
elimination (15) are based on using pulsed field gradients and
proton selective pulses that enable manipulation of the H2O
magnetization independently of that of the protein.
Protein observed experiments for studying protein–ligand
interactions are very simple: the chemical shifts of the protein signals
change upon binding of the ligand. The resulting chemical shift
perturbation (CSP) provides the basis for detecting binding.
Moreover, if signal assignment is available, the exact location of the
interaction on the protein surface can be mapped. Although this
approach was pioneered by several authors, including Gerhard
Wagner, it is strongly associated with Stephen Fesik and his
colleagues at Abbott Laboratories, who coined the term SAR by
NMR (16) to describe the use of CSP for establishing structure–
activity relationships (SAR) in drug discovery.
Interaction of the protein with a ligand affects not only the

local magnetic environment of the backbone amides, but also
the protein’s dynamics. In principle, NMR is well suited for
studying protein dynamics, although this approach is still in its
infancy for protein–ligand interactions. Smrcka et al. (17) studied
the intensities of 15N-Trp labeled G-protein βγ subunits in the presence
and absence of ligands to gain insight into these subunits’
ability to interact with diverse molecular partners. They concluded
that the wide range of signal intensities corresponding to different
Trp residues is related to differences in local mobility, which is the
underlying mechanism behind their molecular promiscuity. The
experiments done in the presence of a ligand supported this idea,
since the intensities of residues close to the ligand decreased upon
binding (17).
In addition to binding affinity, binding kinetics are also decisive
in CSP experiments. However, since the theory behind this is
already covered in the literature (7, 18), only a qualitative description
of the phenomenon and its impact on CSP are provided here.
Depending on the system being studied, the exchange rate of
the binding event can be much faster or much slower than the difference
between the chemical shifts (expressed in frequency units) of the bound and free states.
This leads to a range of behaviors in CSP experiments, whereby
increasing amounts of a ligand are titrated into a protein sample and
ligand-induced chemical shift changes are subsequently detected.
In the fast exchange regime (see Note 1) the exchange between
the bound and free form is faster than the difference in chemical
shifts. Only one set of protein signals is visible, and their positions
typically shift according to the ratio between the bound and free
species. Therefore, the chemical shifts move from the free form of
the protein to the position of the bound state, which is reached
once the protein sample has been completely saturated with ligand.
If the same amount of ligand is used in each titration step, then
the chemical shifts will change asymptotically. This can be fit to a
mathematical model and used to calculate the affinity (KD) of the
interaction.
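The fit mentioned above can be sketched as follows. This is a minimal illustration, not the authors' protocol: it assumes a 1:1 binding model in fast exchange, in which the observed perturbation equals the bound fraction of protein times the perturbation at saturation. The function names, the grid-search strategy, and all numbers (100 µM protein, KD = 250 µM, 0.20 ppm) are hypothetical and serve only to generate synthetic data.

```python
import math

def bound_fraction(p_total, l_total, kd):
    """Fraction of protein bound for 1:1 binding (all concentrations in the same unit)."""
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / (2.0 * p_total)

def fit_kd(p_total, ligand_concs, observed_csp, kd_grid):
    """Scan candidate KD values; for each one the best delta_max follows
    from linear least squares, and the pair with the lowest error is kept."""
    best = None
    for kd in kd_grid:
        fb = [bound_fraction(p_total, l, kd) for l in ligand_concs]
        denom = sum(f * f for f in fb)
        if denom == 0.0:
            continue
        delta_max = sum(f * d for f, d in zip(fb, observed_csp)) / denom
        sse = sum((d - delta_max * f) ** 2 for f, d in zip(fb, observed_csp))
        if best is None or sse < best[2]:
            best = (kd, delta_max, sse)
    return best

# Synthetic, noise-free titration (hypothetical values, concentrations in uM)
protein = 100.0
ligand = [0.0, 50.0, 100.0, 200.0, 400.0, 800.0, 1600.0]
csp = [0.20 * bound_fraction(protein, l, 250.0) for l in ligand]

kd_grid = [10.0 * 1.02 ** i for i in range(400)]  # log-spaced, 10 uM upwards
kd_fit, dmax_fit, sse = fit_kd(protein, ligand, csp, kd_grid)
print(f"KD ~ {kd_fit:.0f} uM, delta_max ~ {dmax_fit:.3f} ppm")
```

With real data, a nonlinear least-squares routine and an error estimate would normally replace the simple grid search used here.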
In the slow exchange regime the situation is reversed: exchange
is slower than the difference in chemical shifts between the bound
and free states. Therefore, the bound and free states give separate
signals. In the course of the titration experiment, the signal of the
free protein declines while a new signal appears at the position of
the bound state, which increases in intensity until becoming the
only observable signal at the saturation point.
Fast and slow exchange regimes are not isolated extremes: they
are linked by the intermediate regime, whereby the rate of exchange
between the bound and free states is comparable to the difference
in chemical shifts between these two states. Consequently, the
behavior is more complicated, as it entails a mixture of signal shifts,
decreasing signals, and newly appearing signals. This results in very
broad signals and non-Lorentzian line shapes, which makes analysis

very difficult (19).
The equilibrium dissociation constant KD can be used for
quantification of exchange regimes. If a diffusion-controlled on-rate
with kon ~ 10^8 M^-1 s^-1 is assumed, then koff can be estimated as koff = KD × kon:
ligands with KD < 1–10 nM and koff ~ 0.1–1/s will be in the slow
regime; ligands with KD > 10 μM and koff > 10^3/s will be in the
fast regime; and ligands with values in between these will fall
in the intermediate regime. However, these values are only valid if
the association is indeed diffusion-controlled.
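The estimate above amounts to one multiplication followed by a comparison with the quoted cut-offs. The sketch below is illustrative only: the function names are hypothetical, the default kon of 10^8 M^-1 s^-1 is the diffusion-controlled assumption stated in the text, and the labels are rough guides, since the actual regime also depends on the chemical shift difference of each signal.

```python
def estimate_koff(kd_molar, kon=1e8):
    """koff in 1/s from KD in M, assuming a diffusion-controlled kon in 1/(M s)."""
    return kd_molar * kon

def exchange_regime(kd_molar, kon=1e8):
    """Rough regime label based on the koff cut-offs quoted in the text."""
    koff = estimate_koff(kd_molar, kon)
    if koff <= 1.0:
        return "slow"
    if koff > 1e3:
        return "fast"
    return "intermediate"

# Three hypothetical ligands spanning nanomolar to sub-millimolar affinity
for kd in (1e-9, 1e-6, 1e-4):
    print(f"KD = {kd:.0e} M -> koff ~ {estimate_koff(kd):.1e} /s, "
          f"{exchange_regime(kd)} exchange")
```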
CSP experiments are not always easy to interpret, chiefly due
to the difficulty in distinguishing between short- and long-distance
effects. Short-distance effects are perturbations resulting from the
interaction of residues with the ligand. They delineate the binding
zone of the ligand. Long-distance effects are perturbations caused
by structural rearrangements of the protein upon ligand binding.
Although detection of long-distance effects may be of interest,
they can give misleading information if the ligand interaction zone
is unknown. Long distance effects markedly complicate the study
of very flexible systems and weak ligands. To overcome this
problem, Fesik et al. performed CSP studies of FKBP and of Bcl-XL
using closely related ligands (20). Although in both cases all ligands
caused massive perturbations, the differences among the perturba-
tions of these related ligands enabled identification of the binding
site and the crude orientation of the ligands. More recently,
Krishnamoorthy et al. addressed this issue by proposing a new way
to analyze NMR CSP data in detail (21).
1.2. Ligand Observed Experiments
All ligand observation experiments are based on the difference in
NMR parameters between the bound and free states of the ligand.
The changes in nuclear Overhauser effects (NOEs) when ligands
bind to receptor proteins are especially interesting (22). Ligands
with molecular weight lower than 1,000 Da exhibit short correlation
times (τc) and show only weak positive NOEs, very small negative
NOEs, or no NOEs at all, depending on the magnetic field
strength and the molecular weight. Proteins, due to their size,
show large τc, large negative NOEs, and highly efficient spin
diffusion. Upon binding, the ligand forms a high molecular weight
complex with the protein; consequently its properties change,
especially its NOE behavior, with the appearance of strong nega-
tive NOEs, usually called transferred NOEs (trNOEs). The differ-
ence in the properties between the bound and free ligand is as
large as the difference in molecular weight between the ligand and
the protein–ligand complex. Since these differences have a direct
impact on the observable NMR parameters of the ligand, several
experiments can be used to detect and characterize the binding
event. Most ligand observed methods are based on one of the
following: assessment of changes in conventional NMR parameters
of the ligand (e.g., line widths, chemical shifts, relaxation properties,

and diffusion); or observation of intermolecular proton magneti-
zation transfer from the protein to the free ligand (via the bound
ligand), to distinguish between binding and nonbinding ligand
molecules.
There are myriad ligand observed experiments currently available,
some of which are briefly introduced in the following section.
One of the first reported applications from the first category
above entailed using 1H NMR to observe the binding-induced
chemical shift changes in certain signals of a ligand upon its inter-
action with a protein. However, because changes in chemical shifts
are small compared to line width changes, experiments based on
relaxation rate effects have been more extensively used. One such
experiment is the Carr–Purcell–Meiboom–Gill (CPMG) filtered
1H spectrum (23), in which an R2 relaxation filter comprising a
train of conveniently spaced 180° pulses is applied prior to data
acquisition. Provided that the ligand remains bound to the protein
long enough to adopt its relaxation behavior, this method, when
adjusted properly, removes signals from the quickly relaxing protons
of the bound ligand as well as those of the protein. This procedure
is also useful since the degree of the attenuation in ligand signals
can be used to rank the affinity of various ligands.
As mentioned above, the sign and size of the NOEs of a small
molecule (i.e., ligand) will change when binding to a receptor pro-
tein. Transient NOE experiments (see Note 2) such as 2D NOESY
(22) can be performed to observe transferred-NOEs to determine
conformations of ligands bound to proteins (24). During the mixing
time of the NOESY, the NOEs build up to a maximum value, and
the difference in build-up rate between transferred NOEs and
NOEs from the free ligand is the key point for ligand-binding
detection: for binding ligands, trNOE build-up times range from 50 to 100 ms,
whereas for nonbinding ligands, larger values (200–1,000 ms) are
typical. However, this experiment is less sensitive than other exper-
iments (e.g., STD). Nevertheless, its value lies in enabling struc-
ture determination of the bound conformation of the ligand in the
complex, when intramolecular trNOEs are detected. The intermo-
lecular trNOEs between a ligand and a protein can be used to
establish the orientation of the bound ligand in the protein’s
binding pockets.
The transfer NOE effect can be considered as a precursor to
experiments in the second category described above, which are
currently very popular. Responses of magnetization transfer experi-
ments are based on exchange-averaged parameters and are affected
by many experimental parameters. Saturation transfer difference
(STD) and Water-Ligand Observed via Gradient Spectroscopy
(WaterLOGSY) are among the most important of these experi-
ments. Case study 2 is based on the use of STD NMR; therefore,
this experiment is described in more detail.
STD was introduced in 1999 by Bernd Meyer in two seminal

papers (25, 26). He described the experiment in studying the inter-
action of wheat germ agglutinin with saccharides, and reported its
potential use for analyzing mixtures of putative ligands. Several
other STD experiments have since been reported, using a broad
range of targets, including transmembrane receptors on whole
cells.
An ideal sample for an STD experiment comprises a medium
to high molecular weight protein plus a low molecular weight
ligand in a deuterated buffer. The ligand is in high excess over the
protein, to which it binds in fast exchange; these are common
conditions for low affinity ligands. A conventional 1D 1H NMR
spectrum of this sample will show a combination of broad peaks,
corresponding to the protein protons, and narrow peaks, corresponding
to the ligand’s protons. The low concentration of the
protein and its short T2 will generate low intensity signals distrib-
uted along the entire spectrum as follows: δ 10–6 ppm (amide and
aromatic protons); δ 6–4 ppm (α-protons); and δ 4 to −1 ppm
(aliphatic protons). Ligand signals vary strongly by ligand structure,
but usually appear at δ > 0.8 ppm. This leaves a high-field spectral
region occupied exclusively by protein signals.
STD is based on difference spectroscopy, so two sets of experi-
ments, the on-resonance and the off-resonance, are acquired. In the
on-resonance experiment a frequency-selective pulse is repetitively
applied to the sample in the aforementioned range in which only
protein signals are present (e.g., at −1 ppm) to saturate these
protons, which are chiefly methyl groups of aliphatic side chains.
The magnetic saturation will transfer to protons located in prox-
imity, and then spread over the entire protein due to fast spin-
diffusion and fast cross-relaxation mechanisms. This process is
observed in the 1H-spectrum as a nearly complete disappearance of
protein signals. If a ligand present in the sample then binds to the
protein, it will form part of this high molecular weight system, and
consequently, will receive part of that magnetic saturation.
Interestingly, the degree of saturation received by each ligand
proton is not equal, but rather depends on their proximity to the
protein. Therefore, this property can be used to determine
the binding epitope of the ligand.
The on-resonance experiment must be compared to a refer-
ence (i.e., the off-resonance experiment), in which no magnetic
saturation of the protein is performed, and therefore, no change in
signal intensities is observed. Subtracting the on-resonance spec-
trum from the off-resonance spectrum provides the STD
spectrum, in which distinguishing whether or not a ligand binds
to the protein is easy, since only the signals of the binding molecule
are visible. To minimize the appearance of artifacts in the resulting
difference spectrum, the on- and off-resonance experiments must
be completely comparable. Because of this, in the off-resonance
experiment a train of frequency-selective pulses is applied to a
spectral region lacking ligand and protein signals (e.g., 40 ppm).
Furthermore, both experiments are acquired in an interleaving
manner to reduce the impact of equipment instabilities. Figure 2
is a schematic of the STD pulse scheme. The core of the program
comprises three main blocks.
Fig. 2. Basic pulse sequence for an STD experiment. Selective saturation via a train of N
selective 90° pulses separated by a delay δ is performed during block A. Protein signals
are suppressed during an R2 relaxation filtering delay (block B). After a module for water
suppression (block C), the 1H signal is detected during the FID (block D).
During block A, a train of selective pulses is applied for a total
saturation time (tSat). During the on-resonance experiment, the
selective pulses are applied only to the protein signals at high-field,
whereas in the off-resonance experiment the pulses are applied to
a region far off-resonance from the protein and ligand signals.
During block B, after a hard 90° pulse, a spin lock (R2 relaxation
filter) is applied to remove protein background signals. Block C
comprises a water suppression module for samples containing a
significant amount of H2O (i.e., >20%). This block can be omitted
when working in D2O or in deuterated organic solvents.
The widespread use of STD NMR stems from its many attrac-
tive features.
First, STD is amenable to high molecular weight therapeutic
targets. In fact, the larger the target, the more favorable the condi-
tions for the experiment. Magnetic saturation is easily achieved for
larger targets, which promotes saturation transfer from the protein
to the ligand better than for smaller targets. STD works well for
large receptors (>30 kDa). For masses lower than 10 kDa, special
attention must be paid because the R2 relaxation rate may be insuf-
ficient for the intramolecular spreading of the saturation and the
intermolecular transfer to the ligand. These cases may demand longer
saturation pulse trains, addition of viscosity-enhancing
reagents, or use of lower temperatures to slow molecular tumbling.
Secondly, STD experiments can be run with a minute amount
of protein; low micromolar concentrations are usually chosen.
This is true because the ligand is present in molar excess (generally,
100-fold) over the protein. Assuming fast exchange, one molecule
of protein can bind to a multitude of ligand molecules during the
total saturation time tSat (usually 1–3 s). Due to the small R1 relax-
ation values for the free ligand state, free ligand molecules conserve
the magnetic saturation received from the protein, which leads to a

buildup of saturated ligands in the sample imprinted with the infor-
mation of the binding event. This signal amplification is what
makes STD more sensitive than other techniques.
Thirdly, the STD experiment is easy to implement. Optimization
of the on-resonance frequency for each protein is important, such
that only protein signals are selectively irradiated. Although the
ligand-to-protein ratio, saturation time, and length of the R2 relax-
ation filter can all be optimized, in most cases, STD signals will
already be observed with standard (default) parameters. Unfavorable
kinetics of the ligand exchange may be improved by changing the
temperature at which the experiment is performed (27).
Fourth, the binding epitope of a ligand (the specific portions of
the ligand surface critical for molecular recognition) can be esti-
mated from STD experiments (28) by exploiting the fact that STD
signal intensities (ISTD) are not equal for the different protons in
the ligand. The usual interpretation is that the larger the STD
response, the closer the contact between the protein and the ligand.
However, the magnitude of the STD signals depends not only on
the proximity to the receptor, but also on the longitudinal relax-
ation times (T1) of the free ligand; thus, the STD response depends
both on intermolecular cross-relaxation with the saturated receptor
protons and on autorelaxation. STD effects at long saturation
times may be misinterpreted for protons in molecules having
significantly different T1 values. To determine the binding epitope
without the bias of different relaxation times (T1), the STD experi-
ment must be performed at different saturation time (tSat) values
(29). The experimental data for each proton (each having a different T1)
are fitted to a monoexponential STD build-up curve to obtain the
plateau value (STDmax) and the saturation rate
constant (ksat):
\[
\mathrm{STD}_{\mathrm{Ampl.}} = \mathrm{STD}_{\max} \times \left[ 1 - \exp\left( -k_{\mathrm{sat}} \times t_{\mathrm{sat}} \right) \right] \tag{1}
\]
whereby
\[
\mathrm{STD}_{\mathrm{Ampl.}} = \varepsilon \times \eta = \varepsilon \times \frac{I_{\mathrm{STD}}}{I_0} \tag{2}
\]
STDAmpl. corresponds to the STD amplification factor (28) and is a
correction for total ligand concentration to the STD effect; ISTD is
the intensity of an individual proton in the STD spectrum, and I0 is the
intensity of the same proton in the reference spectrum; ε is the
ligand excess; η is the ratio of ISTD to I0; STDmax is the maximal
STD intensity achievable with long saturation times and corresponds
to the STD intensity in the absence of T1 bias.
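The build-up analysis described by Eqs. 1 and 2 can be sketched as follows for a single ligand proton. This is a minimal illustration under stated assumptions rather than the published procedure: the function names are hypothetical, ISTD is taken as the off-resonance minus the on-resonance intensity (the difference spectrum described earlier), and the data are synthetic values generated with STDmax = 12, ksat = 0.8 /s, and a 100-fold ligand excess.

```python
import math

def std_amplification(i_off, i_on, ligand_excess):
    """Eq. 2: amplification factor from off-/on-resonance intensities of one proton."""
    return ligand_excess * (i_off - i_on) / i_off

def fit_buildup(t_sat, std_ampl, k_grid):
    """Fit Eq. 1 by scanning ksat; for each ksat the best STDmax follows
    from linear least squares."""
    best = None
    for k in k_grid:
        shape = [1.0 - math.exp(-k * t) for t in t_sat]
        denom = sum(s * s for s in shape)
        std_max = sum(s * a for s, a in zip(shape, std_ampl)) / denom
        sse = sum((a - std_max * s) ** 2 for s, a in zip(shape, std_ampl))
        if best is None or sse < best[2]:
            best = (std_max, k, sse)
    return best

# Synthetic, noise-free intensities (hypothetical values), saturation times in s
t_sat = [0.25, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0]
excess = 100.0
i_off = [1.0] * len(t_sat)
i_on = [1.0 - 12.0 * (1.0 - math.exp(-0.8 * t)) / excess for t in t_sat]

ampl = [std_amplification(off, on, excess) for off, on in zip(i_off, i_on)]
k_grid = [0.01 * i for i in range(1, 501)]  # 0.01 to 5 per second
std_max_fit, ksat_fit, _ = fit_buildup(t_sat, ampl, k_grid)
print(f"STDmax ~ {std_max_fit:.2f}, ksat ~ {ksat_fit:.2f} /s")
```

Repeating the fit for every resolved ligand proton yields one STDmax and one ksat per proton, which can then be compared across the molecule.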
Finally, STD experiments can be used to determine dissociation
constants if a titration curve is recorded with varying ligand
concentrations at the same saturation time (30). To do this, the
STD-amplification factors are first determined as described above,

and then plotted against the ligand concentration. For one-site
binding models, the curve can be fitted to the following
equation:
\[
\mathrm{STD}_{\mathrm{Ampl.}} = \frac{\mathrm{STD}_{\max} \times [L]}{K_D + [L]} \tag{3}
\]
whereby [L] is the concentration of the ligand, and KD is the dissociation
constant of the protein–ligand complex.
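A fit to the one-site model of Eq. 3 can be sketched in the same spirit. The code below is illustrative only: the function names and the titration values (KD = 0.6 mM, STDmax = 15) are hypothetical, and the total ligand concentration is used in place of [L], an approximation that assumes the ligand is in large excess over the protein.

```python
def one_site(l_conc, std_max, kd):
    """Eq. 3: STD amplification factor at ligand concentration l_conc."""
    return std_max * l_conc / (kd + l_conc)

def fit_one_site(l_concs, std_ampl, kd_grid):
    """Scan candidate KD values; STDmax for each KD comes from linear least squares."""
    best = None
    for kd in kd_grid:
        shape = [l / (kd + l) for l in l_concs]
        denom = sum(s * s for s in shape)
        std_max = sum(s * a for s, a in zip(shape, std_ampl)) / denom
        sse = sum((a - std_max * s) ** 2 for s, a in zip(shape, std_ampl))
        if best is None or sse < best[2]:
            best = (kd, std_max, sse)
    return best

# Synthetic, noise-free titration (hypothetical values), concentrations in mM
ligand_mM = [0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
ampl = [one_site(l, 15.0, 0.6) for l in ligand_mM]

kd_grid = [0.01 * i for i in range(1, 1001)]  # 0.01 to 10 mM
kd_fit, std_max_fit, _ = fit_one_site(ligand_mM, ampl, kd_grid)
print(f"KD ~ {kd_fit:.2f} mM, STDmax ~ {std_max_fit:.1f}")
```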
The range of KD values accessible to STD has been estimated to be from 10^-8 to 10^-3 M,
assuming a diffusion-limited on-rate constant (ca. 10^8 M^-1 s^-1). The
intrinsic sensitivity of the STD experiment is limited by the effi-
ciency of the signal amplification and the magnetization transfer.
The signal amplification depends on the kinetics of the binding
process, especially on the off-rate. For KD < 10^-8 M, small off-rates
cause a low turnover of ligands into saturated ligands; the binding
is so tight that saturation transfer from the bound to the free ligand
molecules is very inefficient. Additionally, when binding is very
weak, the population of the ligand–protein complex is so low that
it leads to either weak STD signals or no signals at all.
Despite being extremely useful and versatile, STD suffers from
certain limitations. Among these is that the large excess of ligand
relative to the protein may promote nonspecific binding (7) once
the specific binding site has been saturated. Another limitation is
that protein saturation is suboptimal in the case of low proton
density, local proton deficiency, or molecular motion which com-
promises the intramolecular 1H–1H dipole interaction network.
In such cases, WaterLOGSY may be a more effective experiment
than STD (31).
The main difference between STD and WaterLOGSY is the
way in which the system receives magnetic saturation. Whereas
STD NMR uses direct saturation of the protein, WaterLOGSY
applies indirect saturation of the protein, namely, by selective
saturation of the bulk water protons (H2O). Therefore, the magnetization
is transferred from water to protein to ligand.
Technically, there are several options to achieve the selective bulk
water saturation. Dalvit et al. use the selective inversion of the
water resonance via the e-PHOGSY scheme (32). The transfer of
magnetization from the water to the protein-bound ligand occurs
via labile receptor protons (NH and OH protein protons) situated
in the ligand-binding site as well as via remote labile protons in
the protein, through spin diffusion. Additionally, direct proton–
proton cross-relaxation between the bound ligand and long-lived
water molecules within the binding pocket is an effective pathway
in the magnetization transfer process. Differential cross-relaxation
properties of binding and nonbinding molecules with water allow
distinguishing between binding and nonbinding ligands. Whereas
binding molecules interact with the proton spins of inverted water
via dipolar interactions, which lead to negative cross-relaxation

rates, nonbinding molecules yield positive cross-relaxation rates.
The result is that signals of nonbinding molecules show oppo-
site sign to, and are usually weaker than, the resonances of
binding ligands.
1.3. Ligand Versus Protein Observed Experiments
Ligand-based and protein-based approaches have distinct advantages
and disadvantages. The former yield information about the strength
of the interaction, the binding epitope, and the conformation of
the ligand. They can be used to simultaneously screen several
compounds for their ability to bind to a protein of interest. Ligand-
based experiments have simple requirements. First, they do not
require isotopically labeled protein. Secondly, there is no upper
limit for protein size, but the size difference between protein and ligand
has to be substantial enough to result in differential relaxation
behavior. In contrast, protein observation experiments are currently
only feasible for proteins weighing ca. 40 kDa or less.
Moreover, they demand considerable amounts of isotopically
labeled protein, which must be stable at high concentrations for
long periods of time. When signal assignment is available, protein-
based experiments may provide more information. Most impor-
tantly, they can be used to identify one or several binding sites of
the ligand on the protein and indicate zones of structural rear-
rangement. Furthermore, in these experiments, formation of ligand
aggregates cannot be misinterpreted as an interaction. However,
the data in protein observation experiments are easiest to interpret
when the binding site has been saturated. In the case of low affinity
ligands, this point may be beyond the limit of solubility. In conclu-
sion, there are cases for which one approach is better suited than
the other. Nevertheless, the full power of NMR to characterize
protein–ligand interactions can only be exploited if both approaches
are combined. Several examples from academic and industrial drug
discovery projects are testament to the great success of a combined
approach (7, 8, 33, 34).
2. Materials
2.1. Protein-Based Study on the Binding of VEGF to the Peptidic Ligand P-7i
1. Purified samples of uniformly 15N-labeled vascular endothelial
growth factor (VEGF): 160 μL at 100 μM in 25 mM
phosphate buffer, pH 7.0 (see Note 3), 50 mM NaCl, 90% H2O,
10% D2O in a 3-mm NMR tube (see Note 4). VEGF is obtained
by recombinant expression as previously described (35).
2. Peptide P-7i: Prepared by using standard solid phase peptide
synthesis (3).
3. Bruker Digital Avance 600 MHz spectrometer equipped with
a cryoprobe (see Note 5).
4. Data processing and analysis programs: TopSpin (36), Cara
(37), and Origin (38). The results are visualized using the
program MOE (39).
2.2. Ligand-Based Study on Binding of POP to the Ligand Baicalin
1. Prolyl oligopeptidase (POP): 160 μL of 100 μM POP in
20 mM phosphate buffer, pH 7.0 (see Note 3) in 100% D2O
in a 3-mm NMR tube (see Note 4).
2. Baicalin: 160 μL of 500 μM baicalin in 20 mM phosphate
buffer, pH 7.0 in 100% D2O in a 3-mm NMR tube.
3. 160 μL of POP (10 μM) and baicalin (500 μM) in 20 mM
phosphate buffer, pH 7.0 in 100% D2O in a 3-mm NMR
tube.
4. 160 μL of POP (20 μM) and baicalin (180 μM) in 20 mM
phosphate buffer, pH 7.0 in 100% D2O in a 3-mm NMR
tube.
5. Bruker Digital Avance 600 MHz NMR spectrometer equipped
with a cryoprobe (see Note 5).
6. Data processing and analysis programs, TopSpin and Origin.
3. Methods
3.1. Protein-Based Study on the Binding of VEGF to the Peptidic Ligand P-7i
The following case study describes NMR monitoring of the interaction
between VEGF and the peptidic ligand P-7i, presented here
as a representative example of a CSP experiment (3). The 23-kDa
VEGF11–109 construct used is a truncated version of VEGF121 which
exhibits excellent solubility and stability. Moreover, it is readily
labeled with 15N and is a symmetric homodimer, making it highly
suited to protein-based NMR experiments. Additionally, the signal
assignment for this construct is available (35). P-7i is a 19 amino
acid-long analog of v107 that was discovered by phage display
(40). It differs from v107 by a single mutation: the Ile-7 is D rather
than L, which translates to a reduced affinity for VEGF (252 μM)
compared to the wild type (1.0 μM). Although P-7i is relatively
large (MW = ca. 2.3 kDa) for NMR studies of protein–ligand
binding, it was selected as an example because it exhibits fast and
intermediate exchange behavior and is amenable to CSP studies.
3.1.1. Sample Preparation
1. Starting from a stock solution of P-7i in water, prepare six
aliquots (two at each of three concentrations) with a total
amount of ligand corresponding to 50, 100, or 200 μM in a
volume of 160 μL (19, 37, or 75 μg, respectively). Freeze the
aliquots in 1.5-mL Eppendorf tubes and lyophilize (Fig. 3).
3.1.2. Recording of CSP NMR Spectra
1. Equilibrate the VEGF sample inside the NMR spectrometer
for 15 min at 318 K before starting the NMR spectra acquisition.
This temperature is chosen because the available signal
assignment was performed at this temperature (35). Execute
standard NMR procedures: tune the probe, shim the magnetic
field, calibrate the length of the 90° pulses for 1H and 15N, and
optimize the water suppression.
2. Record a 1H-15N HSQC spectrum using the FAST-HSQC
experiment (41), which uses pulsed field gradients and a
WATERGATE (42) module to efficiently suppress the water
signal. For this, 2,048 × 256 complex points with a total of
8 transients per increment are used. The total experiment time
is ca. 40 min.
3. Once the spectrum is obtained, remove the sample from the
spectrometer and transfer it to the first Eppendorf tube contain-
ing the lyophilized ligand (see Fig. 3). Add 0.5% of DMSO-d6
to ensure ligand solubility (see Note 6), vortex the sample and
centrifuge (5,000 RCF, 1 min, RT), and transfer all of the liquid
to the previously used NMR tube. Introduce the resulting sam-
ple, containing the protein and the ligand at the first titration
concentration, into the spectrometer. Equilibrate the sample at
318 K for 15 min, re-shim the magnetic field and record a new
spectrum using the same conditions described in step 2.
4. Repeat step 3 until spectra are recorded for each titration point.
By dissolving the lyophilized ligand in the sample incremen-
tally, the concentration of P-7i increases stepwise over the
course of the titration from 0 to 50, 100, 200, 300, 500,
and finally, 700 μM (Fig. 3).
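The aliquot scheme can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: it assumes a molecular weight of exactly 2,300 g/mol for P-7i (the text gives only ca. 2.3 kDa) and neglects the small volume change caused by adding DMSO-d6. The function and variable names are hypothetical and are not part of the original protocol.

```python
from itertools import accumulate

VOLUME_UL = 160.0       # sample volume in microliters (from the protocol)
MW_G_PER_MOL = 2300.0   # assumed: "ca. 2.3 kDa" taken as exactly 2,300 g/mol

# Concentration step contributed by each of the six lyophilized aliquots (uM):
# two aliquots at each of 50, 100, and 200 uM.
INCREMENTS_UM = [50, 50, 100, 100, 200, 200]


def aliquot_mass_ug(conc_um, volume_ul=VOLUME_UL, mw=MW_G_PER_MOL):
    """Mass of ligand (ug) that raises the concentration by conc_um in volume_ul."""
    picomoles = conc_um * volume_ul      # umol/L * uL = pmol
    return picomoles * mw * 1e-6         # pmol * g/mol = pg; 1e-6 converts pg to ug


# Cumulative ligand concentration after each titration step
totals_um = list(accumulate(INCREMENTS_UM))

for step, (inc, total) in enumerate(zip(INCREMENTS_UM, totals_um), start=1):
    print(f"step {step}: +{inc} uM ({aliquot_mass_ug(inc):.1f} ug) -> {total} uM total")
```

Under these assumptions the aliquot masses come out at 18.4, 36.8, and 73.6 μg, close to the rounded values of 19, 37, and 75 μg quoted in Subheading 3.1.1; the small difference presumably reflects the approximate molecular weight used here.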
Fig. 3. Stepwise addition of P-7i to VEGF: overview of lyophilized ligand aliquots, the respective concentration increments
and the total concentration over the course of titration. After the reference experiment on the sample containing only protein,
the ligand concentration was increased stepwise by transferring the protein sample to Eppendorf tubes with the denoted
amounts of lyophilized P-7i prior to acquisition of the next NMR spectrum. A total of seven spectra were acquired by
repeating this procedure: one reference spectrum, plus one spectrum at each of the six titration points.
3.1.3. Data Analysis
1. To process the HSQC spectra, increase the number of points
in the indirect dimension (F1) from 256 to 512 by linear
prediction and then zero fill to 1,024 points to yield a
2,048 × 1,024 matrix. Adjust the phase correction manually
and apply a squared sine weighting function in both dimen-
sions. Process all the spectra acquired in the titration experi-
ment identically using Topspin 2.0. Figure 4 shows the seven
superimposed spectra acquired from the titration of VEGF
with P-7i.
2. Determination of CSP requires peak picking and subsequent
assignment of the peaks to the corresponding residues. Use
the program Cara to do this for the first and last spectra of
the titration (see Note 7).
Fig. 4. (a) Seven superimposed 1H-15N HSQC spectra of a 100-μM sample of uniformly 15N-labeled VEGF11–109 at 318 K
titrated with 0–700 μM of P-7i (600 MHz with cryoprobe). (b) Zoom of Lys48 shifts (reproduced from ref. 3 with kind
permission from Wiley).
3. Extract the relevant data for mapping the binding site by
calculating the distance between the position of the reference
peak in the spectrum of the protein without ligand and the
peak position in the spectrum of the highest ligand concentration.
Calculate the distance between two peaks as the combined
chemical shift change ΔδNH, computed from the proton and
nitrogen chemical shift differences (ΔδH and ΔδN, respectively),
according to the following formula (see Note 8):
\[ \Delta\delta_{NH} = \sqrt{\Delta\delta_{H}^{2} + \left(\frac{\Delta\delta_{N}}{5}\right)^{2}} \qquad (4) \]
The majority of the peaks exhibit fast exchange behavior;
however, residues 17, 21, 26, 64, and 104 exhibit intermediate
exchange. Signal broadening and a rapid decrease in signal
intensity are observed for these residues; nevertheless, despite the
broad signals, assignment is still possible. Residue 21, located
in the ligand-binding zone, exhibits slow exchange behavior.
The signal disappears completely after the second titration
point. CSP analysis is not feasible for this behavior (see Note 9). To
identify the binding site, consider as significant only changes
greater than the sum of the mean shift and one standard deviation.
Figure 5 shows the calculated changes for each residue.
Fig. 5. Histogram of the P-7i induced CSP of the backbone amides Δδ for every residue of
VEGF11–109 observed in the 1H-15N HSQC experiment. The histogram was calculated for the
shifts between the reference spectrum (VEGF alone) and the spectrum of the sample with
the highest ligand concentration (VEGF plus 700 μM of P-7i). The lower (dashed) horizontal
line represents the mean shift, and the upper (dotted) horizontal line, the cutoff for
significant changes (mean shift plus one standard deviation).
Fig. 6. Surface representation of the homodimer VEGF11–109 (PDB:2VPF). Residues encoded
in red exhibit significant CSP, thereby indicating the binding zone for the ligand P-7i (residues
encoded in black show no observable signals).
The residues with significant changes (10 of 83) are mapped to
the 3D-structure of VEGF and depicted in red (Fig. 6) using
the software MOE.
4. Calculate the binding affinity by plotting ΔδNH against the
ligand concentration, which requires that ΔδNH is calculated for
each of the seven titration points. Although a KD can be calcu-
lated for each residue, this exercise is only performed for some
of the residues with a strong shift: those that exhibit the best
signal-to-noise ratios and are free from signal overlap. Thus,
a KD is calculated for residues 25, 48, 50, and 66, assuming
a model of two independent identical binding sites (3):
\[ \Delta\delta_{NH} = F \times \frac{[L_0]-[L]}{[P_0]} = F \times \frac{K_D + [L_0] + 2[P_0] - \sqrt{\left(K_D + [L_0]\right)^{2} + 4[P_0]\left(K_D - [L_0] + [P_0]\right)}}{2[P_0]} \qquad (5) \]
whereby [P0] is the total protein concentration; [L0], the total
ligand concentration; [L], the concentration of unbound
ligand in solution; F, a scaling factor; and KD, the affinity to be
calculated. The calculated average KD for the VEGF – P-7i
system is 252 μM (Fig. 7). The fact that the R2 values are all
greater than 0.99 validates this model of the system.
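Steps 3 and 4 can be sketched in code as follows. This is a minimal Python illustration under stated assumptions, not the authors' Cara/Origin workflow: all function names are hypothetical; the significance cutoff uses the population standard deviation (the text does not say which estimator was applied); the fit scans a grid of KD values and solves the scaling factor F in closed form instead of using a general nonlinear least-squares routine; and the titration data are synthetic and noise-free, generated from Eq. 5 itself, so the fit merely recovers the input parameters.

```python
import math
from statistics import mean, pstdev


def combined_shift(delta_h, delta_n):
    """Eq. 4: combined amide shift change (ppm) from 1H and 15N shift differences."""
    return math.sqrt(delta_h ** 2 + (delta_n / 5.0) ** 2)


def significant_residues(csp_by_residue):
    """Residues whose CSP exceeds the mean plus one standard deviation.

    The population standard deviation is used here; this is an assumption.
    """
    values = list(csp_by_residue.values())
    cutoff = mean(values) + pstdev(values)
    return sorted(res for res, csp in csp_by_residue.items() if csp > cutoff)


def bound_per_protein(l0, p0, kd):
    """([L0] - [L]) / [P0] from Eq. 5; all concentrations in the same unit."""
    root = math.sqrt((kd + l0) ** 2 + 4.0 * p0 * (kd - l0 + p0))
    return (kd + l0 + 2.0 * p0 - root) / (2.0 * p0)


def fit_kd(l0_values, shifts, p0, kd_grid):
    """Least-squares fit of Eq. 5 by scanning KD; F is solved in closed form.

    For a fixed KD the model is linear in F, so the best F is
    sum(y * g) / sum(g * g). Returns (KD, F, sum of squared residuals).
    """
    best = None
    for kd in kd_grid:
        g = [bound_per_protein(l0, p0, kd) for l0 in l0_values]
        f = sum(y * x for y, x in zip(shifts, g)) / sum(x * x for x in g)
        sse = sum((y - f * x) ** 2 for y, x in zip(shifts, g))
        if best is None or sse < best[2]:
            best = (kd, f, sse)
    return best


# Synthetic, noise-free demonstration data (not measured values)
P0_UM = 100.0
L0_UM = [0.0, 50.0, 100.0, 200.0, 300.0, 500.0, 700.0]
TRUE_KD, TRUE_F = 252.0, 0.1
synthetic_shifts = [TRUE_F * bound_per_protein(l0, P0_UM, TRUE_KD) for l0 in L0_UM]

kd_fit, f_fit, _ = fit_kd(L0_UM, synthetic_shifts, P0_UM,
                          kd_grid=[float(k) for k in range(1, 1001)])
print(f"fitted KD = {kd_fit:.0f} uM, F = {f_fit:.3f} ppm")
```

With real data, the same functions would be applied per residue to the ΔδNH values from all seven spectra; a finer KD grid or a dedicated fitting routine would then be needed, and the goodness of fit should be inspected for each residue.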
3.2. Ligand-Based Study on Binding of POP to the Ligand Baicalin
The following case study was chosen to give a step-by-step explanation
of an STD experiment. The example is based on work
performed in the authors’ laboratory to characterize the interaction
between the protease POP and the flavonoid baicalin (43).
Fig. 7. (a) Relative amount of bound ligand plotted against the total ligand concentration
and fitted to a model of two independent identical binding sites. Analysis is based on
Lys48 peak displacement in the spectra of the VEGF-(P-7i) complex. (b) Results of fitting
for the four selected residues (reproduced from ref. 3 with kind permission from Wiley).
These experiments were designed to confirm baicalin as a ligand of
POP and to obtain structure–activity information on POP–baicalin
binding, especially on the influence of the baicalin sugar moiety.
This entailed recording of a saturation buildup curve.
Based on its molecular weight (80 kDa), POP is not appropri-
ate for simple protein-based experiments; however, it is well suited
for ligand-based experiments. Likewise, baicalin (MW 446 Da), as
a relatively small to medium-sized ligand, is ideal for ligand-based
experiments. Concerning its affinity properties, baicalin is a weak
ligand of POP, having an IC50 value of 12 μM (43) (see Note 10).
3.2.1. Optimization of STD Parameters and Confirmation of Baicalin as a POP Ligand
1. Equilibrate the sample containing only POP in the NMR
spectrometer to 308 K (see Note 11) for 15 min. Follow the
standard procedure: tune the probe, shim the magnetic field,
calibrate the length of the 90° pulse, and optimize the water
suppression (see Note 12).
2. To optimize the protein saturation, record a 1H-spectrum of
POP to identify promising signals in the aliphatic region.
The aliphatic region of POP shows a signal at ca. 0.9 ppm
that decreases in intensity in the up-field direction without
occurrence of any new maxima. Therefore, the closer to this
value the protein is irradiated, the greater the saturation (see
Note 13).
3. To determine if protein irradiation is affecting any of the
ligand’s signals, acquire several STD spectra of a sample of
baicalin alone, using values of 0, −1, and −2 ppm for the
on-resonance frequency. No STD signals are observed in any
case. This negative control experiment confirms that no
direct irradiation of baicalin or baicalin-aggregates occurs.
For the subsequent STD experiments, select an on-resonance
frequency of 0 ppm and an off-resonance frequency of
80 ppm (see Note 14).
4. Optimize the total saturation time with a sample containing
POP (10 μM) and baicalin (500 μM) (see Note 15). Acquire
STD spectra with different tSat values (from 1 to 4 s) using
the same total experimental time. Superimpose all
on-resonance spectra and select the spectrum with the best
signal-to-noise ratio as the optimum one (in this case, 2 s).
5. Optimize the spin lock filter by testing different mixing times
(from 20 to 70 ms). Under optimal conditions the protein
signals are completely suppressed, thereby reducing the back-
ground noise in the STD experiment, whereas ligand signals
are not affected. In this case, this is achieved with a spin lock
length of 30 ms.
6. Acquire the final STD spectrum with the aforementioned
optimized conditions and 2 k scans. Process the NMR data by
multiplication with an exponential line-broadening function of
0.5 Hz prior to Fourier transformation. The resulting spectrum
exhibits clear STD signals (Fig. 8) for baicalin, thereby
confirming that it binds to POP.
Fig. 8. (a) 1H STD spectrum of baicalin (500 μM) in the presence of POP (10 μM) recorded at 600 MHz and 308 K.
The protein signals were suppressed by applying a spin lock filter. (b) 1H-reference spectrum of baicalin. *Signals arising
from sample impurities (reproduced from ref. 43 with kind permission from Elsevier).
Fig. 9. (a) Saturation transfer difference amplification for individual protons at different saturation times. The data were
acquired with a ninefold excess of baicalin over POP (20 μM). (b) Build-up curves obtained for individual protons in baicalin.
The key to the nomenclature for the individual protons is provided in Fig. 10 (adjusted from ref. 43 with kind permission
from Elsevier).
3.2.2. Identifying the Binding Epitope on Baicalin
A sample containing 20 μM POP and 180 μM baicalin is used to
identify the binding epitope (see Note 16). Experiments are
performed at 308 K.
1. Obtain data for the saturation buildup curve using the previously
optimized parameters (on-resonance irradiation: 0 ppm; 1 k
scans; 308 K). Record a total of four experiments, using tSat
values of 0.5, 1.0, 1.5, and 2.0 s.
2. For each proton, determine the values for STDAmpl. for the dif-
ferent saturation times (Fig. 9). Superimpose the STD and off-
resonance spectra for each saturation time, measure the
difference between the STD and off-resonance signal for a spe-
cific proton (as a percentage), and multiply this value by the
ligand excess (Eq. 2), which was equal to 9 (see Note 17).
3. Determine the binding epitope by fitting the values for STDAmpl.
against the saturation times for each proton, according to
Eq. 1. Calculate the initial slope v0 by multiplying STDmax by
kSat. The saturation build-up curves of the protons of baicalin
show different initial slopes. The values were normalized by setting
the proton with the highest initial slope to 100% (Fig. 10).
These data indicate that the protons in the γ-chromenone and
the phenyl ring are in close contact with the protein, whereas
those in the saccharide moiety contribute less to binding.
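The analysis in steps 2 and 3 can be sketched as follows. Eq. 1 and Eq. 2 are given earlier in the chapter; this Python sketch assumes that Eq. 2 is the fractional STD effect multiplied by the ligand excess and that Eq. 1 is the monoexponential STDAmpl.(t) = STDmax × (1 − exp(−kSat × t)), consistent with the description in Fig. 10. The function names are hypothetical, the fit uses a simple grid scan over kSat rather than a dedicated fitting program, and the numbers are invented for demonstration and are not data from the study.

```python
import math


def std_amplification(i_std, i_0, ligand_excess):
    """STD amplification factor: fractional STD effect times the ligand excess.

    i_std is the signal intensity in the difference (STD) spectrum and i_0 the
    intensity of the same signal in the off-resonance spectrum.
    """
    return (i_std / i_0) * ligand_excess


def fit_buildup(t_sat, std_ampl, k_grid):
    """Fit STD_Ampl(t) = STD_max * (1 - exp(-k_sat * t)) by scanning k_sat.

    For a fixed k_sat the model is linear in STD_max, which is therefore
    solved in closed form. Returns (STD_max, k_sat, initial slope).
    """
    best = None
    for k in k_grid:
        g = [1.0 - math.exp(-k * t) for t in t_sat]
        std_max = sum(y * x for y, x in zip(std_ampl, g)) / sum(x * x for x in g)
        sse = sum((y - std_max * x) ** 2 for y, x in zip(std_ampl, g))
        if best is None or sse < best[0]:
            best = (sse, std_max, k)
    _, std_max, k_sat = best
    return std_max, k_sat, std_max * k_sat


def relative_epitope(initial_slopes):
    """Scale initial slopes so that the largest one equals 100%."""
    top = max(initial_slopes.values())
    return {proton: 100.0 * v / top for proton, v in initial_slopes.items()}


# Synthetic demonstration (invented numbers, not data from the study)
T_SAT = [0.5, 1.0, 1.5, 2.0]                   # saturation times in s
K_GRID = [i / 100 for i in range(1, 501)]      # candidate k_sat values, 0.01 to 5.00 1/s
curve = [2.0 * (1.0 - math.exp(-1.2 * t)) for t in T_SAT]
std_max, k_sat, slope = fit_buildup(T_SAT, curve, K_GRID)
print(f"STD_max = {std_max:.2f}, k_sat = {k_sat:.2f} 1/s, v0 = {slope:.2f}")
print(relative_epitope({"proton A": slope, "proton B": slope / 2}))
```

Because the ligand excess is the same for every saturation time, it only rescales all curves by a common factor and leaves the relative epitope values unchanged (see Note 17).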
Fig. 10. The STD data were fitted to a monoexponential equation, from which the STDmax and the saturation rate constant
(ksat) were obtained. The initial slope directly correlates to the proximity of the corresponding proton to the protein and is
the product of STDmax and ksat. The relative STDs were calculated by setting the proton with the greatest STD effect to 100%
(adjusted from ref. 43 with kind permission from Elsevier).
4. Notes
1. Exchange between the free (L) and bound (PL) states of a
ligand is considered in this context, assuming that the binding
follows a bimolecular association reaction with second-order
kinetics:
\[ P + L \underset{k_{off}}{\overset{k_{on}}{\rightleftharpoons}} PL; \qquad K_D = \frac{[P][L]}{[PL]} = \frac{k_{off}}{k_{on}} \]
2. In transient NOE experiments, a nonequilibrium state is
generated via radiofrequency pulses and, in a subsequent
mixing period, returns to equilibrium by relaxation.
3. A buffer of the desired pH is created by mixing 25 mM
solutions of NaH2PO4 and Na2HPO4 in deuterated water until
the final pH is reached.
4. The most widely used NMR tubes are 3- and 5-mm standard
tubes and 5-mm Shigemi tubes, which require sample volumes
of 160, 600–700, and 300 μL, respectively. Voehler et al.
analyzed the influence of tube type on the sensitivity of HSQC
experiments (44). In general, 5-mm tubes are recommended
for cases of abundant, poorly soluble protein; 3-mm standard
or 5-mm Shigemi tubes when the sample is limited; and 3-mm
standard tubes for high salt concentrations. Moreover, for
titration experiments 3-mm tubes are easier to manipulate than
Shigemi tubes. Higher protein concentrations are generally
preferable, as they enable shorter acquisition times. One exception
to this is the case of a low affinity, poorly soluble ligand,
for which a major excess of ligand must be employed to
saturate the binding site, which in turn is only possible at
lower protein concentrations.
5. In a cryoprobe the coil and the preamplifier are cooled to
decrease thermal noise. The signal-to-noise ratio can be
increased by three- to fourfold by using a cryoprobe instead of
a conventional probe operating at room temperature. The
authors have obtained a signal-to-noise ratio of ca. 7,000:1 for
1H using a cryoprobe and a standard sample of 0.1% ethylbenzene
in CDCl3.
6. Ligands soluble at high concentrations (ca. 100 mM) in a
vehicle such as DMSO-d6 can usually be added directly to the
sample, without resulting in any significant dilution; however,
if this does not apply, then lyophilized aliquots of ligand can be
used, as in Case Study 1. Vehicle may be added simply to ensure
the solubility of the ligand; however, then a control experi-
ment is necessary to evaluate any changes induced by it. If
lyophilized aliquots of ligand are used, then a vehicle can also
be added to the reference experiment (i.e., protein alone) to
eliminate any vehicle-induced changes during the course of the
titration. Finally, data quality may be improved if the ligand
concentration in the sample is controlled. This is easily achieved
by adding a reference compound at a known concentration to
the sample, such as trimethylsilyl propionate-d4 (TSP), and
then comparing the signal intensities of the ligand and of the
reference in a 1H-spectrum.
7. The authors found the freeware program Cara (available at
http://www.nmr.ch) to be useful and straightforward; nevertheless,
other assignment tools (e.g., NMRview and Sparky)
are equally suitable.
8. This formula is one of the most commonly used ones for
calculating the distance between two peaks in the two-
dimensional plane. Other approaches and their impact on CSP
mapping have been reviewed by Schumann et al. (45).
9. For the wild-type peptide v107, the binding site could not be
mapped by observing CSP. This is because the binding kinetics
are in the slow exchange regime. All residues that are directly
involved in ligand binding or are in close proximity do not
shift, but simply decrease in intensity as a new signal appears.
In this case an alternative mapping of the binding site for slow
exchanging ligands was feasible based on the changes of signal
intensities induced by the binding of the ligand.
10. As mentioned in Subheading 1.2, STD NMR enables calculation
of KD. However, since the IC50 of baicalin for POP had
previously been determined, this calculation was not performed.
Information on the calculation of binding strengths can be
found in references (5) and (30).
11. Temperature is another parameter that can be optimized in an
STD experiment, as it affects the binding kinetics of the system,
the efficiency of protein saturation, and the relaxation rates. In
STD, signal intensities can be increased by changing the
temperature (27); however, once temperature has been
optimized, all other parameters must also be adjusted.
12. Before performing the STD experiments, a 1D 1H spectrum
with a water suppression module was employed to optimize
water suppression. The residual H2O signal was suppressed
with an excitation sculpting block (46), which uses a double
gradient spin-echo to defocus the H2O resonance. Squared
pulses of 2 ms length were used as 180° pulses. Optimized
parameters for water suppression in this experiment were then
used for the STD experiments.
13. This sample can also be used to verify if the saturation is well
spread over the whole protein. This requires that an STD
spectrum without spin lock filter is acquired. A decrease in
intensity of all protein signals indicates a good distribution,
which is the case for most proteins.
14. Some proteins exhibit several up-field shifted signals (ca. 1
to -1 ppm) that differ in intensities. In these cases, the on-
resonance frequency can be optimized by recording a set of
STD spectra of a sample containing only protein, without
the spinlock filter but using different saturation frequencies.
The saturation frequency that provides the strongest decrease
between on- and off-resonance spectra is deemed the most
favorable (upon confirmation that no direct saturation of ligand
signals occurs). Commonly used values for the on-resonance
irradiation lie between 0 and −1 ppm, and for the off-resonance,
either 40 or 80 ppm.
15. The range of ligand excess in STD NMR can be very wide.
A good starting point is to use a 50-fold molar excess of ligand
at a protein concentration of 10 μM. However, the optimal
ratio between protein and ligand depends on the system’s
kinetics. The faster the exchange, the better the signal can be
amplified, which makes higher ligand-to-protein ratios (100-fold
and higher) useful and will ultimately yield stronger STD
effects. Independent of the choice of the ratio, the ligand con-
centration should always be in a range in which aggregation
can be excluded.
16. To map a binding epitope, lower ligand-to-protein ratios are
typically used. The objective is not to reach the maximal STD
amplification, but rather to cover the range of STDAmpl. values
in order to calculate the build-up curve. A smaller excess of
ligand leads to more pronounced differences between ISTD and
I0 for the different saturation times than does a larger excess.
17. In this example the STD amplification factor STDAmpl. was
used, rather than η, the fraction of ISTD from I0. The difference
between the two is that STDAmpl. accounts for the ligand excess
(Eq. 2). Since in this example the ligand excess is equal for all
saturation times, then it turns into a scaling factor between
STDAmpl. and η, which is not required for calculating the bind-
ing epitope. STDAmpl. is more commonly used because it enables
calculation of KD.
References
1. Wiseman, T., Williston, S., Brandts, J. F., and Lin, L. N. (1989) Rapid measurement of binding constants and heats of binding using a new titration calorimeter. Anal. Biochem. 179, 131–137.
2. Baldwin, M. A. (2005) Mass spectrometers for the analysis of biomolecules. Methods Enzymol. 402, 3–48.
3. Dyachenko, A., Goldflam, M., Vilaseca, M., and Giralt, E. (2010) Molecular recognition at protein surface in solution and gas phase: Five VEGF peptidic ligands show inverse affinity when studied by NMR and CID-MS. Biopolymers 94, 689–700.
4. Englebienne, P., Hoonacker, A. V., and Verhas, M. (2003) Surface plasmon resonance: principles, methods and applications in biomedical sciences. Spectroscopy 17, 255–273.
5. Fielding, L. (2003) NMR methods for the determination of protein-ligand dissociation constants. Curr. Top. Med. Chem. 3, 39–53.
6. Carlomagno, T. (2005) Ligand-target interactions: what can we learn from NMR? Annu. Rev. Biophys. Biomol. Struct. 34, 245–266.
7. Lepre, C. A., Moore, J. M., and Peng, J. W. (2004) Theory and applications of NMR-based screening in pharmaceutical research. Chem. Rev. 104, 3641–3676.
8. Dalvit, C. (2009) NMR methods in fragment screening: theory and a comparison with other biophysical techniques. Drug Discov. Today 14, 1051–1057.
9. Tarrago, T., Claasen, B., Kichik, N., Rodriguez-Mias, R. A., Gairi, M., and Giralt, E. (2009) A cost-effective labeling strategy for the NMR study of large proteins: selective 15N-labeling of the tryptophan side chains of prolyl oligopeptidase. Chembiochem. 10, 2736–2739.
10. Bogan, A. A., and Thorn, K. S. (1998) Anatomy of hot spots in protein interfaces. J. Mol. Biol. 280, 1–9.
11. Tugarinov, V., and Kay, L. E. (2005) Methyl groups as probes of structure and dynamics in NMR studies of high-molecular-weight proteins. Chembiochem. 6, 1567–1577.
12. Bodenhausen, G., and Ruben, D. J. (1980) Natural abundance nitrogen-15 NMR by enhanced heteronuclear spectroscopy. Chemical Physics Letters 69, 185–189.
13. Kay, L., Keifer, P., and Saarinen, T. (1992) Pure absorption gradient enhanced heteronuclear single quantum correlation spectroscopy with improved sensitivity. J. Am. Chem. Soc. 114, 10663–10665.
14. Morris, G. A., and Freeman, R. (1979) Enhancement of nuclear magnetic resonance signals by polarization transfer. J. Am. Chem. Soc. 101, 760–762.
15. Gang, Z., and Price, W. S. Solvent signal suppression in NMR. Prog. Nucl. Magn. Reson. Spectrosc. 56, 267–288.
16. Shuker, S. B., Hajduk, P. J., Meadows, R. P., and Fesik, S. W. (1996) Discovering High-Affinity Ligands for Proteins: SAR by NMR. Science 274, 1531–1534.
17. Smrcka, A. V., Kichik, N., Tarrago, T., Burroughs, M., Park, M. S., Itoga, N. K., Stern, H. A., Willardson, B. M., and Giralt, E. (2010) NMR analysis of G-protein betagamma subunit complexes reveals a dynamic G(alpha)-Gbetagamma subunit interface and multiple protein recognition modes. Proc. Natl. Acad. Sci. USA 107, 639–644.
18. Pellecchia, M. (2005) Solution nuclear magnetic resonance spectroscopy techniques for probing intermolecular interactions. Chem. Biol. 12, 961–971.
19. Reibarkh, M., Malia, T. J., and Wagner, G. (2006) NMR distinction of single- and multiple-mode binding of small-molecule protein ligands. J. Am. Chem. Soc. 128, 2160–2161.
20. Medek, A., Hajduk, P., Mack, J., and Fesik, S. (2000) The Use of Differential Chemical Shifts for Determining the Binding Site Location and Orientation of Protein-Bound Ligands. J. Am. Chem. Soc. 122, 1241–1242.
21. Krishnamoorthy, J., Yu, V. C., and Mok, Y. K. (2010) Auto-FACE: an NMR based binding site mapping program for fast chemical exchange protein-ligand systems. PLoS One 5, e8943.
22. Neuhaus, D., and Williamson, M. P. (2000) The Nuclear Overhauser Effect in Structural and Conformational Analysis, 2nd ed., Wiley, New York.
23. Meiboom, S., and Gill, D. (1958) Modified Spin-Echo Method for Measuring Nuclear Relaxation Times. Review of Scientific Instruments 29, 688–691.
24. Ni, F., and Scheraga, H. A. (1994) Use of the Transferred Nuclear Overhauser Effect To Determine the Conformations of Ligands Bound to Proteins. Accts. Chem. Res. 27, 257–264.
25. Moriz, M., and Bernd, M. (1999) Characterization of Ligand Binding by Saturation Transfer Difference NMR Spectroscopy. Angew. Chem. Int. Ed. Engl. 38, 1784–1788.
26. Klein, J., Meinecke, R., Mayer, M., and Meyer, B. (1999) Detecting Binding Affinity to Immobilized Receptor Proteins in Compound Libraries by HR-MAS STD NMR. J. Am. Chem. Soc. 121, 5336–5337.
27. Groves, P., Kover, K. E., Andre, S., Bandorowicz-Pikula, J., Batta, G., Bruix, M., Buchet, R., Canales, A., Canada, F. J., Gabius, H. J., Laurents, D. V., Naranjo, J. R., Palczewska, M., Pikula, S., Rial, E., Strzelecka-Kiliszek, A., and Jimenez-Barbero, J. (2007) Temperature dependence of ligand-protein complex formation as reflected by saturation transfer difference NMR experiments. Magn. Reson. Chem. 45, 745–748.
28. Mayer, M., and Meyer, B. (2001) Group epitope mapping by saturation transfer difference NMR to identify segments of a ligand in direct contact with a protein receptor. J. Am. Chem. Soc. 123, 6108–6117.
29. Mayer, M., and James, T. L. (2004) NMR-based characterization of phenothiazines as a RNA binding scaffold. J. Am. Chem. Soc. 126, 4453–4460.
30. Meyer, B., and Peters, T. (2003) NMR spectroscopy techniques for screening and identifying ligand binding to protein receptors. Angew. Chem. Int. Ed. Engl. 42, 864–890.
31. Dalvit, C., Pevarello, P., Tato, M., Veronesi, M., Vulpetti, A., and Sundstrom, M. (2000) Identification of compounds with binding affinity to proteins via magnetization transfer from bulk water. J. Biomol. NMR 18, 65–68.
32. Dalvit, C., Fogliatto, G., Stewart, A., Veronesi, M., and Stockman, B. (2001) WaterLOGSY as a method for primary NMR screening: Practical aspects and range of applicability. J. Biomol. NMR 21, 349–359.
33. Pellecchia, M., Sem, D. S., and Wuthrich, K. (2002) NMR in drug discovery. Nat. Rev. Drug Discov. 1, 211–219.
34. Oltersdorf, T., Elmore, S. W., Shoemaker, A. R., Armstrong, R. C., Augeri, D. J., Belli, B. A., Bruncko, M., Deckwerth, T. L., Dinges, J., Hajduk, P. J., Joseph, M. K., Kitada, S., Korsmeyer, S. J., Kunzer, A. R., Letai, A., Li, C., Mitten, M. J., Nettesheim, D. G., Ng, S., Nimmer, P. M., O’Connor, J. M., Oleksijew, A., Petros, A. M., Reed, J. C., Shen, W., Tahir, S. K., Thompson, C. B., Tomaselli, K. J., Wang, B., Wendt, M. D., Zhang, H., Fesik, S. W., and Rosenberg, S. H. (2005) An inhibitor of Bcl-2 family proteins induces regression of solid tumours. Nature 435, 677–681.
35. Fairbrother, W. J., Champe, M. A., Christinger, H. W., Keyt, B. A., and Starovasnik, M. A. (1997) 1H, 13C, and 15N backbone assignment and secondary structure of the receptor-binding domain of vascular endothelial growth factor. Protein Sci. 6, 2250–2260.
36. Bruker Corporation, (2007) Topspin 2.0, http://www.bruker-biospin.com/nmr_software.html.
37. Keller, R. (2004) The Computer Aided Resonance Assignment Tutorial, 1st ed., CANTINA Verlag.
38. Origin Corporation, (2007) Origin 8.0, http://www.originlab.com/.
39. Chemical computing group, (2009) http://www.chemcomp.com/index.htm.
40. Pan, B., Li, B., Russell, S. J., Tom, J. Y., Cochran, A. G., and Fairbrother, W. J. (2002) Solution structure of a phage-derived peptide antagonist in complex with vascular endothelial growth factor. J. Mol. Biol. 316, 769–787.
41. Mori, S., Abeygunawardana, C., Johnson, M. O., and van Zijl, P. C. (1995) Improved sensitivity of HSQC spectra of exchanging protons at short interscan delays using a new fast HSQC (FHSQC) detection scheme that avoids water saturation. J. Magn. Reson. B 108, 94–98.
42. Piotto, M., Saudek, V., and Sklenář, V. (1992) Gradient-tailored excitation for single-quantum NMR spectroscopy of aqueous solutions. J. Biomol. NMR 2, 661–665.
43. Tarrago, T., Kichik, N., Claasen, B., Prades, R., Teixido, M., and Giralt, E. (2008) Baicalin, a prodrug able to reach the CNS, is a prolyl oligopeptidase inhibitor. Bioorg. Med. Chem. 16, 7516–7524.
44. Voehler, M. W., Collier, G., Young, J. K., Stone, M. P., and Germann, M. W. (2006) Performance of cryogenic probes as a function of ionic strength and sample tube geometry. J. Magn. Reson. 183, 102–109.
45. Schumann, F. H., Riepl, H., Maurer, T., Gronwald, W., Neidig, K. P., and Kalbitzer, H. R. (2007) Combined chemical shift changes and amino acid specific chemical shift mapping of protein-protein interactions. J. Biomol. NMR 39, 275–289.
46. Hwang, T. L., and Shaka, A. J. (1995) Water suppression that works. Excitation sculpting using arbitrary waveforms and pulsed field gradients. J. Magn. Reson. 112, 275–279.
Chapter 15
In-Cell NMR Spectroscopy in Escherichia coli

Kirsten E. Robinson, Patrick N. Reardon, and Leonard D. Spicer
Abstract
A living cell is a complex system that contains many biological macromolecules and small molecules necessary
for survival, in a relatively small volume. It is within this crowded and complex cellular environment that
proteins function, making in-cell studies of protein structure and binding interactions an exciting and
important area of study. Nuclear magnetic resonance (NMR) spectroscopy is a particularly attractive
method for in-cell studies of proteins since it provides atomic-level data noninvasively in solution. In addition,
NMR has recently undergone significant advances in instrumentation to increase sensitivity and in methods
development to reduce data acquisition times for multidimensional experiments. Thus, NMR spectroscopy
lends itself to studying proteins within a living cell, and recently “in-cell NMR” studies have been reported
from several laboratories. To date, this technique has been successfully applied in Escherichia coli (E. coli),
Xenopus laevis (X. laevis) oocytes, and HeLa host cells. Demonstrated applications include protein assignment
as well as de novo 3D protein structure determination. The most common use, however, is to probe binding
interactions and structural modifications directly from proton–nitrogen correlation spectra. E. coli is the
most extensively used cell type thus far and this chapter is largely confined to reviewing recent literature
and describing methods and detailed protocols for in-cell NMR studies in this bacterial cell.
Key words: In-cell NMR spectroscopy, Protein NMR spectroscopy, Escherichia coli, Fast NMR
spectroscopy
1. Introduction
1.1. In-Cell NMR Spectroscopy: The Technique and Information It Reveals
Within a cell, there are many different proteins, other biological
macromolecules, and small molecules that must function properly
for a cell to survive. The simplest organisms are estimated to utilize
a few hundred types of small molecules and genomically encode up
to 1,000 different proteins (1, 2), with the human genome estimated
to encode 10–100-fold more (3). The large number and diversity
of small molecules and biological macromolecules form a cellular
environment, where these molecules function, that is very crowded
and complex (4). Understanding the influences that this packed
environment has on individual protein structure, stability, and
behavior as well as on protein complexes is one of the challenges of
contemporary biomedical science (1, 5, 6). The knowledge that
can be gained from complete molecular- and atomic-level observa-
tions in a live cellular environment is invaluable in understanding
detailed protein mechanisms of action as well as in the areas of
drug development and protein engineering and numerous other
research frontiers.
In-cell nuclear magnetic resonance (NMR) spectroscopy is
currently the most comprehensive technique for examining
proteins at the atomic level within a living cell (4). The ultimate
goal is molecular structure determination. Historically, high-resolution
protein structures have been determined only on purified proteins.
The fact that NMR is a noninvasive solution technique suggests
that NMR is uniquely suited to examining full atomic-level macro-
molecular structures in a representative environment, although
not necessarily a natural environment (6). The strength of in-cell
NMR spectroscopy not only lies in the determination of de novo
3-D structure, but also in the ability to observe structural changes
of biological macromolecules in a native environment. Structural
changes can be observed directly by monitoring changes that occur
in a protein 2-D 1H–15N (or 1H–13C) heteronuclear single-quantum
coherence (HSQC) fingerprint spectrum, since every protein generates
a unique 1H–15N HSQC that is dependent upon the secondary,
tertiary, and quaternary structure as well as the chemical environment
in which the protein resides. Alterations in the structure of the
target protein or its chemical environment give rise to changes in
the 2-D fingerprint. This enables studies of molecular interactions
within a living cell, such as mapping the binding interface between
two proteins (7).
1.2. In-Cell NMR Spectroscopy: Current Achievements and Challenges
In-cell NMR spectroscopy within E. coli has been used to determine
the 3-D backbone assignment of GB1 (8), to determine the 3-D
de novo structure of the gene product TTHA1718 (9, 10), and
to study protein–protein interactions (7), protein–DNA interactions
(11), and binding events (12), as well as to identify potential new drugs
(13). These studies have all been performed using 15N and/or 13C
isotopic labeling of proteins under 20 kDa. Both uniform labeling
and selective methyl-group labeling strategies have been applied
(14, 15). Labeling schemes typically use isotope 15N enrichment for
2-D fingerprint data collection and also incorporate uniform 13C
labeling when collecting 3-D data for assignment and structure
determination. The 1H–13C HSQC fingerprint can also be useful,
but is often complicated by significant background signals that arise
from metabolites within the cell incorporating the 13C isotope (15).
However, it has been demonstrated that utilizing specific 13C methyl-
group labeling can be beneficial for observing larger proteins that
have attenuated 1H–15N HSQC spectra since methyl groups most
often have independent rotational motion and give better line shape
and intensity (15–17). An alternative specific labeling strategy is the
incorporation of 19F as an NMR probe, which can enhance sensitiv-
ity (18), but typically does not provide sufficient data for structural
characterization.
In-cell NMR spectroscopy enables protein studies within a living
cellular environment with some limitations. Most success has been
achieved for proteins under 12 kDa; the largest protein observed
in-cell with clear spectral features is calmodulin at 16.8 kDa.
Calmodulin was observed using specific labeling techniques to
incorporate either 13C-labeled methyl groups or 15N-labeled lysines
(15, 19). Protein that aggregates or interacts with the cellular
membrane, large cellular complexes, or DNA may have significantly
longer rotational correlation times due to slower molecular motions
and/or may be involved in an intermediate exchange regime. This
can lead to attenuated NMR signals due to line broadening, thus
making the protein quite difficult to detect and often “invisible” by
NMR within a living cell. This was found to be the case with the
MetJ repressor protein (11), which binds nonselectively to DNA.
Care must be taken to demonstrate the source of signal attenuation
by incorporating appropriate controls, alternate analytical
methods, and, when possible, rescue experiments (11).
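To give a feel for why molecular size and the viscosity of the surroundings matter for line broadening, the following sketch estimates the rotational correlation time of a protein treated as a rigid hydrated sphere, using the Stokes–Einstein–Debye relation, tau_c = 4*pi*eta*r^3/(3*k_B*T). It is an illustration only and is not taken from the studies cited here: the spherical model, the partial specific volume (0.73 cm^3/g), the hydration-layer thickness (3.2 Å), the viscosity values, and all function names are assumptions.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro constant, 1/mol


def hydrated_radius_m(mass_da, v_bar_cm3_per_g=0.73, hydration_m=3.2e-10):
    """Radius (m) of a sphere with the protein's volume plus a hydration shell.

    v_bar_cm3_per_g and hydration_m are assumed, typical values.
    """
    volume_m3 = mass_da * v_bar_cm3_per_g / N_A * 1e-6  # cm^3 -> m^3
    return (3.0 * volume_m3 / (4.0 * math.pi)) ** (1.0 / 3.0) + hydration_m


def tau_c_seconds(mass_da, viscosity_pa_s=0.89e-3, temp_k=298.15):
    """Stokes-Einstein-Debye rotational correlation time of a rigid sphere."""
    r = hydrated_radius_m(mass_da)
    return 4.0 * math.pi * viscosity_pa_s * r ** 3 / (3.0 * K_B * temp_k)


if __name__ == "__main__":
    for mass in (6_000, 12_000, 16_800):
        dilute = tau_c_seconds(mass)
        # Assumed doubling of the effective viscosity, purely for comparison.
        viscous = tau_c_seconds(mass, viscosity_pa_s=2 * 0.89e-3)
        print(f"{mass / 1000:5.1f} kDa: {dilute * 1e9:5.1f} ns (water), "
              f"{viscous * 1e9:5.1f} ns (2x viscosity)")
```

In this simple model, the correlation time grows linearly with viscosity and with the hydrated volume, so association with a much larger partner would be expected to lengthen it substantially.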
To characterize proteins by NMR within a living E. coli cell, it
is necessary to isotopically label the protein with 15N and/or 13C.
When the protein is expressed within the cell type of interest
(see Note 1), it is essential to keep in mind that the isotopic labels
are incorporated throughout the cell, into the protein of interest
and into the native proteins and metabolites. The presence of
isotopes in the native biomolecules and metabolites results in
background signals: signals that are not associated with the protein
of interest (Fig. 1). Serber et al. used rifampicin to suppress
production of bacterial protein; however, the detected background
signal remained (19). The background signals that are being
detected are, therefore, thought to belong to small metabolites,
such as amino acids, that are incorporating the stable isotope labels
during growth (19). Growing the cells to log phase in unlabeled
medium and then switching to isotopically labeled medium has
also been studied (19). While this approach did not reduce the
intensity of the observable background signal, this method can be
used for cost and time savings.
Fig. 1. 1H–15N HSQC spectrum from E. coli cells grown in isotopically labeled medium without protein overexpression.
The signals detected here are referred to as the background signal for in-cell NMR spectroscopy.
(Axes: NH (ppm), 9.50–7.00; N (ppm), 110.0–135.0.)
An E. coli in-cell preparation for NMR is composed typically of
a 20% cell slurry. In this slurry, the cells settle over time in an NMR
tube without some form of mixing. It takes a few hours for a 20%
cell slurry to start layering at the bottom of the tube and this may
or may not be confined within the active volume of the probe coil
(see Note 2). This cell settling problem is seemingly easily overcome
by packing the cells close together, and there are in-cell NMR
reports using a denser sample. Careful studies of the effect of cell
density on spectral quality, however, demonstrate that a 20–30%
cell slurry yields the best signal (19) and that higher cell densities
result in line broadening. The source of this spectral degradation
has not been thoroughly studied, but contributing factors are likely
bulk magnetic susceptibility and inhomogeneity of the sample.
Another strategy to prevent cell settling that has been employed is
to encapsulate E. coli cells in alginate microcapsules (20). This
method involves preparing a mixture of warm alginate with E. coli
cells and then using an electric current to force the mixture
through a needle into a calcium bath to cause polymerization. This
technique, however, has not been commonly used. Other options
to prevent cell settling beyond the active volume of the coil may
include using Shigemi tubes or susceptibility plugs, but these have
not been explored.
As mentioned above, another important concern when per-
forming in-cell NMR experiments is that cells remain viable during
the course of the NMR experiment. Cells that die during the
experiment can lyse and release their contents into the supernatant,
thus leading to the protein of interest being found in the medium
and not in the concentrated cellular environment. This leads to
spurious protein signals. The NMR tube is a harsh environment,
limited in nutrients and oxygen, which causes eventual cell death.
It is necessary to determine the viable cell count in the NMR sample
before and after performing the NMR experiment. Cell viability
must be determined by a quantitative assay rather than a qualitative
assay, because even a small percentage of viable cells is sufficient
for a positive result by a qualitative assay. The quantitative assay we
use is a serial dilution plating experiment which is described in
Subheading 3.2 (see Note 3).
A related problem that must be avoided and diligently tested
for is leakage of the protein of interest out of intact cells. It is
always necessary after performing an in-cell NMR experiment to
harvest the cells from the sample and examine the supernatant for
the presence of the protein of interest to ensure that the spectral
signals recorded do not originate partially or fully from outside the
living cells. Investigators should be warned that even healthy cells
frequently release protein into their surroundings and stressed cells
expressing nonnative proteins are often more prone to this release.
There is at least one reported strategy for overcoming NMR detection
of protein outside the cell (8). In our own studies, we found
that a small amount of the protein GB1, a domain of a protein
from Group G Streptococcus, can escape from E. coli cells during
NMR experiments. By including IgG antibody in the cell
slurry medium, the extracellular GB1 is efficiently scavenged by
the large IgG molecule, eliminating this GB1 signal from the supernatant.
Protein leakage from E. coli is also reported to be correlated
with the total fraction of overexpressed protein in the cell (21).
In general, careful controls must be established to ensure that the
supernatant gives no detectable NMR signal for the protein of
interest, thus assuring that the detected protein is located within
living cells.
These considerations place significant time limitations on an
in-cell NMR experiment. Cells settle in a few hours, unless settling
is attenuated (20). Cell viability along with cell wall permeability
affect the intracellular versus extracellular protein concentration,
thus restricting data acquisition time. If protein leakage is observed,
effective extracellular protein scavengers can be employed to extend
the available data collection time (8). Even though these limitations
are highly variable, they usually do not affect 2-D HSQC experi-
ments since the data can be collected within the time frame of a few
minutes. Therefore, 2-D HSQC has been the most common
in-cell NMR experiment performed to date.
HSQC spectra can be very useful when the in vitro assignments
for the protein of interest are already known. The fingerprints
obtained from the in-cell NMR experiment can often be compared
to the in vitro spectrum to obtain the assignments if the structure
does not change. So far, most in-cell NMR HSQC spectra have
shown only slight differences when compared with the in vitro
spectrum, so it is straightforward to transfer the assignments. This
also indicates that the molecular crowding and corresponding
limited protein-accessible free volume characteristic of the intracel-
lular milieu are not significant determinants of the folded state.
With the HSQC spectrum assigned, it is straightforward to study
how specific changes within the cellular environment affect the
protein structure. Through examination of 2-D 1H–15N HSQC
experiments, it is possible to study how drugs bind to the protein
of interest within the cellular environment (12), and to identify
potential new drugs (13). It is also possible to study protein–protein
interactions (7) and to examine the effect that a posttranslational
modification on one of the binding partners may have on the binding
interface (22). Even the effect of nonselective protein–DNA inter-
actions on the NMR spectrum has been studied by in-cell NMR
spectroscopy, leading to new insights into repressor activity in tran-
scription regulation (11). In-cell NMR spectra have also shown
that some, although not all, intrinsically disordered proteins may
gain structure within the cellular environment (23, 24). These studies,
utilizing 2-D 1H–15N HSQC experiments, illustrate the diversity
and strength of in-cell NMR spectroscopy to expand the knowledge
of protein function within a living cell.
While 2-D HSQC experiments are quite useful, if the assign-
ments have not been previously determined in vitro or if the HSQC
fingerprint is dramatically altered in-cell, then it becomes necessary
to collect additional multidimensional data. As mentioned above,
there are time limitations that can significantly affect the ability to
collect multidimensional NMR data. The requirement for short
acquisition times is important when performing 3-D in-cell NMR
experiments; however, the type of experiment implemented should
not compromise sensitivity or resolution of the detected signal
unless specific, well-dispersed spectral features are targeted. These
time limitations generally make it necessary to utilize fast NMR
methodologies and ultrasensitive probes that are continuing to be
developed and improved. There are several ways to collect fast NMR
data (25–28); however, to date, only sparse sampling techniques
have been applied to in-cell NMR spectroscopy.
Most fast NMR methodologies require two aspects to be
considered: the sampling pattern used and the processing method
applied. The sampling patterns used for fast NMR are designed
to reduce the number of points sampled to significantly decrease
the time it takes to collect adequate data for a specific experiment
when compared to the standard Cartesian sampling pattern.
Sparse sampling patterns used for multidimensional NMR include
a radial sampling pattern (28, 29), concentric ring sampling (30),
and random sampling (31). Radial sampling is a special case of con-
centric ring sampling, where the same number of points is taken
for each ring. Random sampling involves collecting data points
that are distributed randomly. These patterns are different from the
standard Cartesian sampling pattern as Cartesian sampling distrib-
utes the data points equally on a grid while these other sampling
patterns either are not on a grid at all or, if positioned to be on a
grid, only partially fill it.
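The difference between these sampling patterns can be made concrete by generating them for a pair of indirect evolution times (t1, t2), expressed in units of one time increment. The sketch below is illustrative only: the grid size, the number of projection angles, the sampled fraction, and the function names are all assumptions, and real acquisition schedules depend on the spectrometer software used.

```python
import math
import random


def cartesian(n1, n2):
    """Full grid: every (t1, t2) increment pair is sampled."""
    return [(i, j) for i in range(n1) for j in range(n2)]


def radial(n_angles, n_radii):
    """Radial pattern: each ring of radius r carries one point per angle.

    Points are generally off-grid: t1 = r*cos(a), t2 = r*sin(a).
    """
    if n_angles < 2:
        raise ValueError("need at least two projection angles")
    pts = [(0.0, 0.0)]  # the origin is shared by all angles; keep it once
    for k in range(n_angles):
        a = (math.pi / 2.0) * k / (n_angles - 1)  # angles from 0 to 90 degrees
        for r in range(1, n_radii):
            pts.append((r * math.cos(a), r * math.sin(a)))
    return pts


def random_on_grid(n1, n2, fraction, seed=0):
    """Random pattern: a randomly chosen subset of the Cartesian grid."""
    rng = random.Random(seed)
    grid = cartesian(n1, n2)
    return rng.sample(grid, round(fraction * len(grid)))


if __name__ == "__main__":
    full = cartesian(32, 32)
    for name, pts in (("Cartesian", full),
                      ("radial", radial(5, 32)),
                      ("random", random_on_grid(32, 32, 0.20))):
        print(f"{name:9s}: {len(pts):4d} points "
              f"({100.0 * len(pts) / len(full):5.1f}% of the full grid)")
```

If each sampled point is assumed to cost the same measurement time, the point count gives a rough proxy for the relative acquisition time of each pattern.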
The other aspect of performing fast NMR experiments is the
ability to properly process the data. There are several processing
methods available (31–38). The sampling pattern chosen can
lead to artifacts being incorporated in the spectrum upon processing.
It is well-known that radial sampling introduces artifacts. As a
result, the field is moving away from radial sampling and toward
concentric ring sampling. In any of the methods used, it is important
to appreciate the potential for introducing artifacts in the processed
data collected from fast acquisition experiments. Programs are
available to aid with processing the data (see Note 4).
Currently, there are two studies that have been performed
using these fast NMR techniques. The first study used projection
reconstruction NMR (PR-NMR) with radial sampling and the
hybrid back-projection/lower value (HBLV) reconstruction algo-
rithm to walk the backbone of GB1 (8). The backbone assignment
of GB1 was accomplished by collecting PR-NMR versions of the
3-D HNCA, HNCO, and HA(CA)NH experiments. It should be
noted that the HA(CA)NH experiment works only for very small
proteins. The second study used random sampling followed by
maximum entropy processing (9). The 3-D structure of TTHA1718
was determined using several 3-D heteronuclear NMR experiments
and distance restraints obtained from NOE data. Fast NMR data
collection is continuously growing and evolving.
1.3. In-Cell NMR Spectroscopy: Looking Ahead
The studies cited above illustrate that in-cell NMR utilizing E. coli
enables the study of proteins within a physiologically relevant environment.
However, the technique requires that the protein of
interest is expressed to intracellular concentrations sufficient for
NMR detection that are greater than the concentrations of most
cellular proteins. Thus, overexpression generates a labeled protein
in the cellular environment but is not strictly in vivo. Ideally, the
protein concentration should be tightly controlled to understand
what effects a high concentration may cause within a living cell as
well as potentially lower the concentration while remaining within
the detection limits of NMR. It has been demonstrated that protein
expression levels for in-cell NMR can be controlled with tunable
promoters, such as the arabinose promoter (7). External delivery
of isotopically labeled protein can provide another strategy to control
concentrations of labeled proteins in cells; however, to date, this
has only been reported using Xenopus laevis oocytes or HeLa cells
(39, 40) (Fig. 2). The overexpression protocol is a useful strategy
because the protein of interest is never exposed to a noncellular
environment; however, E. coli lack much of the complexity found
within eukaryotic organisms as they are not capable of performing
most posttranslational modifications and do not have organelles.
Therefore, it is also of interest to develop complementary methods
to overexpress proteins within a eukaryotic organism, such as yeast
or insect cells, so that the protein of interest is always exposed to
a natural eukaryotic environment, but this has yet to be achieved.
This chapter discusses detailed planning and implementation of in-cell
NMR spectroscopy studies in live E. coli using overexpression
methods to introduce stable isotope labels into the target protein.
Fig. 2. Methods for in-cell NMR spectroscopy. E. coli cells are used for overexpressing
protein within the cell of interest. Proteins are difficult to overexpress in eukaryotic cells;
therefore, currently, two protocols are used. For X. laevis oocytes, physical injection has
been used while for HeLa cells cell-penetrating peptides have been used.
(Panel labels: Overexpression: E. coli; Delivery: X. laevis oocytes (injection needle), HeLa (cell-penetrating peptide); Target protein.)
2. Materials
2.1. Protein Expression
1. E. coli BL21 (DE3) cells containing the expression plasmid (see
Note 1).
2. Luria-Broth (LB) Medium: 10 g NaCl, 10 g Bacto-Tryptone,
5 g yeast extract in 1 L dIH2O. Autoclave to sterilize and store
at 25°C.
3. LB agar selection plates: LB medium with 20 g Bacto-Agar in
1 L dIH2O. Autoclave to sterilize. Add the desired antibiotic
after the medium has cooled to 50°C. Store plates at 4°C.
4. 5× M9 Salts: 64 g Na2HPO4, 15 g KH2PO4, 2.5 g NaCl, 5.0 g
NH4Cl in 1 L dIH2O. Autoclave to sterilize. To incorporate
15N, use 15NH4Cl; final pH is 7.2. Store at 25°C.
5. 1,000× trace metal mix: 2 mM H3BO3, 2 mM CuSO4, 2 mM
CoCl2, 10 mM MnCl2, 2 mM NiSO4, 2 mM (NH4)6Mo7O24.
Filter sterilize and store at 25°C.
6. M9 minimal medium: 1× M9 salts, 2 mM MgSO4, 1 µM FeCl3,
25 µM ZnSO4, 100 µM CaCl2, 1× trace metal mix, 0.0005%
thiamine, 0.3% (w/v) glucose (see Note 5). To incorporate
13C, use 0.3% (w/v) 13C glucose.
7. Inducers (see Table 1).
8. Antibiotics (see Table 2).
9. 99.9% D2O.
2.2. Cell Viability Assay
1. Luria–Broth medium (Subheading 2.1).
2. LB agar selection plates (Subheading 2.1).
3. Antibiotics (Table 2).
2.3. Freeze–Thaw Lysis
1. 10× phosphate buffer: 42.3 g NaH2PO4, 27.46 g Na2HPO4 in
1 L dIH2O. Use NaOH or HCl to adjust pH to 7. Autoclave
and store at 25°C.
2. 10× NaCl: 8.766 g NaCl in 50 mL dIH2O. Filter sterilize and
store at 25°C.
3. 1,000× Phenylmethanesulfonylfluoride (PMSF) stock solution:
0.871 g PMSF in 100% ethanol. Store at 25°C.
4. 1,000× Deoxyribonuclease (DNase) stock solution: 20 mg/mL
DNase (from lyophilized powder) in 10 mM Tris–HCl,
pH 7.5, 50 mM NaCl, 10 mM MgCl2, 1 mM dithiothreitol
(DTT), and 50% (w/v) glycerol. Store at −20°C.
5. Dry ice.
6. 100% Ethanol.
7. 2× Loading dye: 40% (w/v) glycerol, 125 mM Tris–HCl, pH
6.8, 100 mM DTT, 2% (w/v) sodium dodecyl sulfate (SDS),
0.025% (w/v) bromophenol blue.
8. Freeze–thaw lysis buffer: 1× phosphate buffer, 1× NaCl, 1×
PMSF, 1× DNase, and 1.5 mg/mL lysozyme (from lyophilized
powder).
Table 1
Induction Agent Composition
Inducer (a)                                     [Stock solution]   [Working]          Storage
Isopropyl β-D-1-thiogalactopyranoside (IPTG)    1 M in H2O         1 mM               −20°C
Arabinose                                       20% (w/v)          0.02–0.2% (w/v)    25°C
(a) All inducers are filter sterilized
Table 2
Selection Agent Composition
Antibiotic (a)    [Stock solution]               [Working]
Ampicillin        100 mg/mL in H2O               100 µg/mL
Kanamycin         25 mg/mL in H2O                25 µg/mL
Tetracycline      12.5 mg/mL in 100% Ethanol     12.5 µg/mL
Streptomycin      50 mg/mL in H2O                50 µg/mL
(a) All antibiotics are filter sterilized and stored at −20°C
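The concentrated stocks listed above (5×, 10×, 1,000×, or a stated stock concentration) are brought to working strength by simple dilution, C1 × V1 = C2 × V2. The short sketch below shows the arithmetic; the function names are hypothetical and the 50-mL culture volume is only an example.

```python
def volume_from_fold(final_volume_ml, fold):
    """mL of an N-fold (N x) stock needed for final_volume_ml of 1x solution."""
    if fold <= 1:
        raise ValueError("fold must be greater than 1")
    return final_volume_ml / fold


def volume_from_conc(final_volume_ml, stock_conc, working_conc):
    """mL of stock needed, from C1*V1 = C2*V2.

    stock_conc and working_conc must be given in the same unit.
    """
    if working_conc > stock_conc:
        raise ValueError("working concentration exceeds stock concentration")
    return final_volume_ml * working_conc / stock_conc


if __name__ == "__main__":
    culture_ml = 50.0  # example culture volume
    print("5x M9 salts:", volume_from_fold(culture_ml, 5), "mL")
    print("1,000x trace metals:", volume_from_fold(culture_ml, 1000) * 1000, "uL")
    # IPTG: 1 M stock (1000 mM) to 1 mM working concentration.
    print("IPTG:", volume_from_conc(culture_ml, 1000.0, 1.0) * 1000, "uL")
    # Arabinose: 20% (w/v) stock to 0.2% (w/v) working concentration.
    print("Arabinose:", volume_from_conc(culture_ml, 20.0, 0.2), "mL")
```

The calculation ignores the small volume added by each stock, which is usually acceptable for 1,000× stocks but should be accounted for when making up the 5× and 10× components to a fixed final volume.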
3. Methods
3.1. Protein Expression for Preparing an In-Cell NMR Sample
The typical strategy for studying a protein using in-cell NMR
spectroscopy entails incorporating 15N uniform isotope labeling
and collecting a 2-D 1H–15N HSQC. Uniform 13C labeling for the
1H–13C HSQC experiment is rarely done as the background signal
detected is quite high. However, selective labeling schemes to
incorporate 13C isotope into methyl groups can be beneficial,
particularly for larger proteins, and significantly reduce the back-
ground signal detected as very few metabolites incorporate the 13C
label introduced by this strategy. An alternative to labeling methyl
groups with 13C isotope is to use 19F trifluoromethyl groups, but
this procedure has limitations both in protein production and the
spectral information produced.
If the target protein yields a successful in-cell HSQC spectrum,
then higher dimensional experiments and/or binding studies can be
designed and implemented. Binding studies require tight regulation
over the delivery and concentration of the type of binding partner
to be studied. For protein–protein interaction studies, it is desirable
to be able to detect both proteins individually and to have independent
control over expressing or delivering both proteins to the same cell
(see Note 6). The experimental design depends on the goal of the
experiment; however, all in-cell NMR experiments have a similar
basic protocol as described below (Fig. 3).
1. Using a frozen glycerol stock or single LB selection plate colony,
inoculate 5 mL of LB medium with the appropriate antibiotic
(Table 2). Incubate the culture for 12–16 h at 37°C in a shaking
incubator. The container used for all cultures should hold at least
three to four times the volume of the medium to ensure
proper aeration. We find it convenient to start this growth the
afternoon before expression.
2. Start two 50 mL cultures in LB medium containing the appro-
priate antibiotic (Table 2) by inoculating from the overnight
culture to a starting OD600 ≈ 0.05 (see Note 7). Two cultures
are grown to have a sample that can be used for setting NMR
shims and other parameters on the spectrometer without losing
valuable data collection time.
Fig. 3. Schematic of the standard in-cell NMR experiment. (Panel labels: Grow, Centrifuge, Switch media, Induce, Centrifuge, 20% Slurry, Detect Signal, No Signal, Lyse cells.)
3. Place the cultures in a 37°C shaking incubator and grow until
the cells reach log phase (OD600 ≈ 0.4–0.6). Remove two 1-mL
samples from each culture and harvest the cells by spinning at
1,000 × g for 15 min at 25°C. These are the “before induction”
SDS-PAGE samples. Two samples are harvested so that one
sample can be lysed and the other sample can be left intact to
identify artifacts that may be caused by the lysis protocol.
4. Harvest the remainder of the cells at 1,000 × g for 15 min at
4°C.
5. Resuspend the cell pellets in an equivalent volume of isotopi-
cally labeled M9 minimal medium containing the appropriate
antibiotic (Table 2).
6. Allow the cells to recover for ~10 min, then add the appropriate
inducing agent (Table 1), and allow the cells to grow for a
predetermined optimum expression time. Remove two 1-mL
samples for the SDS-PAGE gel analysis “after induction” time
point as was done in step 3.
7. Two hours before the induction is complete, gently harvest
one 50-mL culture by spinning at 1,000 × g for 15 min at 4°C.
Pour the spent medium into a beaker and determine the
volume of the cell pellet by comparing it to a known volume of
water. Resuspend the cell pellet to a 20% slurry using spent
medium containing 10% D2O. Place the cell slurry into a 5-mm
NMR tube and set up the NMR instrument using this sample
(tune, shim, and calibrate rf pulses). At the end of the optimal
induction time, take the second culture, collect two 1-mL
samples for gel analysis, and then gently harvest the remainder
of the cells as described in step 4. Resuspend the cell pellet as
before, place in a clean 5-mm NMR tube, and collect the
desired data. Before and after running the NMR experiment,
perform the serial dilution assay (Subheading 3.2) to assess cell
viability.
(a) If protein signals are detected during the NMR experiment,
gently harvest the cells from the sample and collect an
HSQC on the supernatant. This is done to ensure that the
observed protein is found within the cells and does not
originate from the medium.
(b) If no protein signal is detected, lyse the cells by freeze–thaw
lysis (Subheading 3.3, see Note 8) using a buffer
volume that is equivalent to the original NMR sample volume,
and examine the cleared lysate by NMR. This is done
to determine if the protein concentration achieved during
expression is sufficient to detect by NMR spectroscopy.
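Two volumes in the protocol above have to be worked out at the bench: the amount of overnight culture needed to start a 50-mL culture at the target OD600 (step 2), and the amount of liquid needed to bring a measured cell pellet to a 20% slurry (step 7). The sketch below shows one way to do the arithmetic. The function names are hypothetical, OD600 is assumed to scale linearly on dilution, and the resuspension liquid is assumed to be 90% spent medium plus 10% D2O by volume.

```python
def inoculum_volume_ml(overnight_od600, culture_volume_ml=50.0, target_od600=0.05):
    """mL of overnight culture that dilutes to target_od600 in culture_volume_ml."""
    if overnight_od600 <= target_od600:
        raise ValueError("overnight culture is not denser than the target")
    return target_od600 * culture_volume_ml / overnight_od600


def slurry_volumes_ml(pellet_volume_ml, cell_fraction=0.20, d2o_fraction=0.10):
    """Return (spent medium, D2O, total) volumes in mL for a v/v cell slurry.

    Assumes the liquid added to the pellet is d2o_fraction D2O by volume.
    """
    total = pellet_volume_ml / cell_fraction
    liquid = total - pellet_volume_ml
    d2o = liquid * d2o_fraction
    return liquid - d2o, d2o, total


if __name__ == "__main__":
    print(f"Inoculum: {inoculum_volume_ml(2.5):.2f} mL of overnight culture")
    medium, d2o, total = slurry_volumes_ml(0.12)
    print(f"Slurry: {medium:.3f} mL spent medium + {d2o:.3f} mL D2O "
          f"-> {total:.2f} mL total")
```

The example values (an overnight OD600 of 2.5 and a 0.12-mL pellet) are placeholders; the measured values for the culture in hand should be substituted.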
3.2. Cell Viability Assay (41) (see Note 3)
1. Using a marker, draw partitions on the outside of an LB agar
selective plate to section it as shown in Fig. 4.
2. Perform two to three serial 1:100 dilutions, followed by three
serial 1:10 dilutions into fresh LB medium (Fig. 4). To reduce
error, adjust each dilution to a final volume of 1 mL. Vortex
each dilution prior to preparing the next dilution. This is done
to ensure that cells are evenly suspended.
3. Plate 10 µL drops of each dilution on the previously marked
selective plate, one drop per section (Fig. 4). Vortex the sample
in between each drop. Allow drops to thoroughly dry on the
plate and incubate at 37°C for 12–16 h.
4. Count the number of colonies per section. To calculate cell
viability, identify the section(s) containing 3–30 single colonies,
count the number of colonies for each 10 µL drop (these are
typically called colony-forming units (CFU)), and divide by the
volume (V) plated (in mL) times the dilution factor (D) (10^−x):
(CFU/(V × D)). A healthy bacterial sample with an OD600 of
~1 should give ~10^9 CFU/mL. It is important to keep in mind
that a loss of viable counts equivalent to just one 10-fold dilution
step indicates that 90% of the viable cells have died off.
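The calculation in step 4 can be sketched as follows. The function names are hypothetical, the default plated volume of 0.010 mL (a 10 µL drop) is an assumption that should be changed if a different drop size is used, and the colony counts in the example are invented for illustration.

```python
def cfu_per_ml(colonies, dilution, plated_volume_ml=0.010):
    """CFU/mL = colonies / (V x D), with D given as a fraction (e.g. 1e-6)."""
    if not 0 < dilution <= 1:
        raise ValueError("dilution must be a fraction between 0 and 1")
    return colonies / (plated_volume_ml * dilution)


def percent_viable(cfu_before, cfu_after):
    """Viable cells remaining after the NMR experiment, as a percentage."""
    return 100.0 * cfu_after / cfu_before


if __name__ == "__main__":
    before = cfu_per_ml(colonies=12, dilution=1e-6)  # countable at 10^-6
    after = cfu_per_ml(colonies=12, dilution=1e-5)   # countable one step earlier
    print(f"before: {before:.2e} CFU/mL, after: {after:.2e} CFU/mL")
    print(f"viable after experiment: {percent_viable(before, after):.0f}%")
```

In this invented example, the countable section shifts by one 10-fold dilution between the two time points, which corresponds to about 10% of the cells remaining viable.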
Fig. 4. Serial dilution assay schematic. (Panel labels: NMR Tube; Serial Dilutions 10^2, 10^4, 10^6, 10^7, 10^8, 10^9; Selective Plate with sections numbered 4, 6, 8, 9, 10, 11.)
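The dilution scheme of step 2 (1:100 steps followed by 1:10 steps, each made up to 1 mL) can be tabulated in advance. The sketch below is a hypothetical helper, not part of the published protocol; with three 1:100 steps and three 1:10 steps it yields cumulative dilutions from 10^2 to 10^9.

```python
def cumulative_dilutions(n_hundredfold=3, n_tenfold=3):
    """Cumulative dilution factors after each 1:100 step, then each 1:10 step."""
    factors, total = [], 1
    for step in [100] * n_hundredfold + [10] * n_tenfold:
        total *= step
        factors.append(total)
    return factors


def transfer_volume_ul(step, final_volume_ul=1000.0):
    """Volume carried into the next tube so that it ends at final_volume_ul."""
    return final_volume_ul / step


if __name__ == "__main__":
    for factor in cumulative_dilutions():
        print(f"1:{factor:,}")
    print("1:100 step:", transfer_volume_ul(100), "uL into",
          1000.0 - transfer_volume_ul(100), "uL LB")
    print("1:10 step:", transfer_volume_ul(10), "uL into",
          1000.0 - transfer_volume_ul(10), "uL LB")
```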

3.3. Freeze–Thaw Lysis Protocol (see Note 8)
1. Prepare the freeze–thaw lysis buffer immediately before use
(see Note 9).
2. Resuspend one of each of the two samples gathered from each
point of the protein induction (i.e., before and after induction)
in 50 µL of lysis buffer.
3. Place the samples in a dry ice/ethanol bath for 5 min. Completely
thaw at room temperature. Repeat the freeze–thaw cycle two
more times. Be careful when thawing as some proteins are
temperature sensitive and may be aggregated by high or low
temperatures (see Note 7).
4. Spin the samples at 16,000 × g for 10 min at 25°C. Transfer the
supernatant to a fresh Eppendorf tube; this is the “cleared lysate”
sample. The pellet left behind is the “insoluble” sample.
5. Mix 50 µL of cleared lysate with 50 µL of 2× loading dye.
Resuspend the insoluble pellet using 25 µL of 2× loading dye and
25 µL of dIH2O. Boil the samples for 10 min.
6. Resuspend the cell pellets that were not used in step 2 in
25 µL of 2× loading dye and 25 µL of dIH2O. Boil the samples
for 10 min.
7. Run an SDS-PAGE gel to analyze expression levels and determine
the solubility of the protein (42). Solubility is determined by
what sample the target protein appears in. If the target protein
appears in the cleared lysate sample, then the protein is soluble.
If it appears in the insoluble pellet, then it is insoluble.
8. This method can also be applied to an in-cell NMR sample if
there was no protein signal detected during the experiment.
The volume of lysis buffer should be equal to the volume of
the NMR sample. Follow the freeze–thaw protocol through
step 4, then place the cleared lysate into an NMR tube, and
examine it for a protein signal.
4. Notes
1. Currently, E. coli BL21 (DE3), BL21 (DE3) Gold, BL21 (DE3)
Rosetta, and JM109 (DE3) have been used for performing
in-cell NMR experiments. There are many other expression
strains of E. coli that serve specific purposes. We recommend
examining the available expression strains and selecting a strain
in which overexpression gives the highest yield of soluble
protein of interest. To do this, we typically lyse induced cells,
separate the soluble and insoluble portions, and examine by
SDS-PAGE analysis.
2. We recommend examining the cell slurry at the end of the
experiment to ensure that the cell slurry remains within the coil
volume in the probe. If the cells have all settled out of the
coil volume, then it can be assumed that the detected signal
arises from protein that has leaked out of the cell, making the
experiment no longer an in-cell NMR experiment.
3. There are several quantitative cell viability assays that can be
used, including cell counting, serial dilution, colorimetric
assays, etc. The important aspect of the quantitative assay is
that it has the ability to detect a range of viability allowing the
determination of the percentage of viable cells at the end of the
experiment.
4. New software for FT processing is free and available upon
request from Dr. Pei Zhou or Dr. Brian Coggins. Their e-mail
addresses can be found at this Web site:
http://zhoulab.biochem.duke.edu/.
The software for maximum entropy processing can be obtained
(free for not-for-profit organizations) by going to Dr. Jeffrey
Hoch’s Web site:
http://structbio.uchc.edu/HochLab_files/Hoch_Lab/Software.html.
5. The glucose, thiamine, MgSO4, FeCl3, ZnSO4, and CaCl2
stocks should all be filter sterilized before use.
6. Protein–protein interaction studies within a living cell can be
very informative. They generally require that both proteins can
be independently detected within a living cell. If both proteins
are expressed using the same promoter, it is very difficult to
selectively label one protein. Therefore, the protein expression
should be designed to use two different promoter systems
(e.g., one protein is expressed using the T7 promoter system
and the other protein is expressed using the arabinose system,
as was reported by Burz et al. (7)). Since this requires two
separate vectors, it is necessary to use two different antibiotics
for selection purposes. This means that both antibiotics should
be present in all cultures to ensure that neither vector is lost
from the cell. If the target proteins have well-dispersed spectra,
it may be possible to coexpress the proteins using a single vector
(such as pCDFDuet), thus negating the requirement for multiple
antibiotics.
7. The timing used for growing E. coli cells can vary. The proce-
dure discussed here has been used successfully on several small
proteins (8, 14, 15, 19); however, a different timing was used
for the first 3-D de novo structure (10). Timing of protein
expression is dependent on the expressed protein.
8. This lysis protocol is a gentle method that can be used to determine
if the protein is located in the soluble or insoluble portion of
the E. coli cell. For in-cell NMR spectroscopy, it is necessary
that the protein is located in the soluble portion of the cell.
Proteins that are located in the insoluble portion of the cell are
not detected by solution NMR spectroscopy. There are other
methods of cell lysis for small volumes that utilize detergents
(e.g., Bugbuster Protein Extraction Reagent (Pierce)); however,
we have found that the detergent lysis methods often cause
proteins to appear in the insoluble portion of the cell artificially.
There are also physical methods that can lyse small volumes,
such as the French Press and sonication.
9. Be aware that using lysozyme in the lysis buffer causes lysozyme
to appear on the SDS-PAGE gel. Lysozyme is a 14.7-kDa pro-
tein, so if the protein of interest runs in that range the use
of lysozyme is not recommended. The lysis protocol works
without lysozyme present, although it is less efficient.
Acknowledgments
The authors wish to thank Dr. Ronald A. Venters and Dr. Brian E.
Coggins for their useful discussions and comments on this
manuscript.
References
1. Dobson, C. M. (2004) Chemical space and biology. Nature 432, 824–828.
2. Goto, S., Okuno, Y., Hattori, M., Nishioka, T., and Kanehisa, M. (2002) LIGAND: database of chemical compounds and reactions in biological pathways. Nucl. Acids Res. 30, 402–404.
3. Lander, E. S., et al. (2001) Initial sequencing and analysis of the human genome. Nature 409, 860–921.
4. Goodsell, D. S. (1991) Inside a living cell. Trends Biochem. Sci. 16, 203–206.
5. Ellis, R. J., and Minton, A. P. (2003) Cell biology: join the crowd. Nature 425, 27–28.
6. Hall, D., and Minton, A. P. (2003) Macromolecular crowding: qualitative and semiquantitative successes, quantitative challenges. Biochim. Biophys. Acta. 1649, 127–139.
7. Burz, D. S., Dutta, K., Cowburn, D., and Shekhtman, A. (2006) Mapping structural interactions using in-cell NMR spectroscopy (STINT-NMR). Nat. Methods 3, 91–93.
8. Reardon, P. N., and Spicer, L. D. (2005) Multidimensional NMR spectroscopy for protein characterization and assignment inside cells. J. Am. Chem. Soc. 127, 10848–10849.
9. Sakakibara, D., Sasaki, A., Ikeya, T., Hamatsu, J., Hanashima, T., Mishima, M., Yoshimasu, M., Hayashi, N., Mikawa, T., Walchli, M., Smith, B. O., Shirakawa, M., Guntert, P., and Ito, Y. (2009) Protein structure determination in living cells by in-cell NMR spectroscopy. Nature 458, 102–105.
10. Ikeya, T., Sasaki, A., Sakakibara, D., Shigemitsu, Y., Hamatsu, J., Hanashima, T., Mishima, M., Yoshimasu, M., Hayashi, N., Mikawa, T., Nietlispach, D., Walchli, M., Smith, B. O., Shirakawa, M., Guntert, P., and Ito, Y. (2010) NMR protein structure determination in living E. coli cells using nonlinear sampling. Nat. Protoc. 5, 1051–1060.
11. Augustus, A. M., Reardon, P. N., and Spicer, L. D. (2009) MetJ repressor interactions with DNA probed by in-cell NMR. Proc. Natl. Acad. Sci. USA 106, 5065–5069.
12. Hubbard, J. A., MacLachlan, L. K., King, G. W., Jones, J. J., and Fosberry, A. P. (2003) Nuclear magnetic resonance spectroscopy reveals the functional state of the signalling protein CheY in vivo in Escherichia coli. Mol. Microbiol. 49, 1191–1200.
13. Xie, J., Thapa, R., Reverdatto, S., Burz, D. S., and Shekhtman, A. (2009) Screening of small molecule interactor library by using in-cell NMR spectroscopy (SMILI-NMR). J. Med. Chem. 52, 3516–3522.
14. Serber, Z., Keatinge-Clay, A. T., Ledwidge, R., Kelly, A. E., Miller, S. M., and Dotsch, V. (2001) High-resolution macromolecular NMR spectroscopy inside living cells. J. Am. Chem. Soc. 123, 2446–2447.
15. Serber, Z., Straub, W., Corsini, L., Nomura, A. M., Shimba, N., Craik, C. S., Ortiz de Montellano, P., and Dotsch, V. (2004) Methyl groups as probes for proteins and complexes in in-cell NMR experiments. J. Am. Chem. Soc. 126, 7119–7125.
16. Tugarinov, V., and Kay, L. E. (2003) Ile, Leu, and Val methyl assignments of the 723-residue malate synthase G using a new labeling strategy and novel NMR methods. J. Am. Chem. Soc. 125, 13868–13878.
17. Goto, N. K., Gardner, K. H., Mueller, G. A., Willis, R. C., and Kay, L. E. (1999) A robust and cost-effective method for the production of Val, Leu, Ile (delta 1) methyl-protonated 15N-, 13C-, 2H-labeled proteins. J. Biomol. NMR 13, 369–374.
18. Li, C., Wang, G. F., Wang, Y., Creager-Allen, R., Lutz, E. A., Scronce, H., Slade, K. M., Ruf, R. A., Mehl, R. A., and Pielak, G. J. (2010) Protein (19)F NMR in Escherichia coli. J. Am. Chem. Soc. 132, 321–327.
19. Serber, Z., Ledwidge, R., Miller, S. M., and Dotsch, V. (2001) Evaluation of parameters critical to observing proteins inside living Escherichia coli by in-cell NMR spectroscopy. J. Am. Chem. Soc. 123, 8895–8901.
20. Li, C., Charlton, L. M., Lakkavaram, A., Seagle, C., Wang, G., Young, G. B., Macdonald, J. M., and Pielak, G. J. (2008) Differential dynamical effects of macromolecular crowding on an intrinsically disordered protein and a globular protein: implications for in-cell NMR spectroscopy. J. Am. Chem. Soc. 130, 6310–6311.
21. Barnes, C. O., and Pielak, G. J. (2010) In-cell protein NMR and protein leakage. Proteins 79, 347–351.
22. Burz, D. S., and Shekhtman, A. (2008) In-cell biochemistry using NMR spectroscopy. PLoS One 3, e2571.
23. Dedmon, M. M., Patel, C. N., Young, G. B., and Pielak, G. J. (2002) FlgM gains structure in living cells. Proc. Natl. Acad. Sci. USA 99, 12681–12684.
24. McNulty, B. C., Young, G. B., and Pielak, G. J. (2006) Macromolecular crowding in the Escherichia coli periplasm maintains alpha-synuclein disorder. J. Mol. Biol. 355, 893–897.
25. Mandelshtam, V. A., Taylor, H. S., and Shaka, A. J. (1998) Application of the filter diagonalization method to one- and two-dimensional NMR spectra. J. Magn. Reson. 133, 304–312.
26. Kupce, E., and Freeman, R. (2003) Fast multidimensional NMR of proteins. J. Biomol. NMR 25, 349–354.
27. Schanda, P., Kupce, E., and Brutscher, B. (2005) SOFAST-HMQC experiments for recording two-dimensional heteronuclear correlation spectra of proteins within a few seconds. J. Biomol. NMR 33, 199–211.
28. Kupce, E., and Freeman, R. (2003) Projection-reconstruction of three-dimensional NMR spectra. J. Am. Chem. Soc. 125, 13958–13959.
29. Coggins, B. E., Venters, R. A., and Zhou, P. (2004) Generalized reconstruction of n-D NMR spectra from multiple projections: application to the 5-D HACACONH spectrum of protein G B1 domain. J. Am. Chem. Soc. 126, 1000–1001.
30. Coggins, B. E., and Zhou, P. (2007) Sampling of the NMR time domain along concentric rings. J. Magn. Reson. 184, 207–221.
31. Barna, J. C. J., Laue, E. D., Mayger, M. R., Skilling, J., and Worrall, S. J. P. (1987) Exponential Sampling, an alternative method for sampling in two-dimensional NMR experiments. J. Magn. Reson. 73, 69–77.
32. Hiller, S., Fiorito, F., Wuthrich, K., and Wider, G. (2005) Automated projection spectroscopy (APSY). Proc. Natl. Acad. Sci. USA 102, 10876–10881.
33. Eghbalnia, H. R., Bahrami, A., Tonelli, M., Hallenga, K., and Markley, J. L. (2005) High-resolution iterative frequency identification for NMR as a general strategy for multidimensional data collection. J. Am. Chem. Soc. 127, 12528–12536.
34. Kupce, E., and Freeman, R. (2004) Projection-reconstruction technique for speeding up multidimensional NMR spectroscopy. J. Am. Chem. Soc. 126, 6429–6440.
35. Venters, R. A., Coggins, B. E., Kojetin, D., Cavanagh, J., and Zhou, P. (2005) (4,2)D Projection-reconstruction experiments for protein backbone assignment: application to human carbonic anhydrase II and calbindin D(28 K). J. Am. Chem. Soc. 127, 8785–8795.
36. Coggins, B. E., Venters, R. A., and Zhou, P. (2005) Filtered backprojection for the reconstruction of a high-resolution (4,2)D CH3-NH NOESY spectrum on a 29 kDa protein. J. Am. Chem. Soc. 127, 11562–11563.
37. Kupce, E., and Freeman, R. (2003) Reconstruction of the three-dimensional NMR spectrum of a protein from a set of plane projections. J. Biomol. NMR 27, 383–387.
38. Coggins, B. E., and Zhou, P. (2006) Polar Fourier transforms of radially sampled NMR data. J. Magn. Reson. 182, 84–95.
39. Selenko, P., Serber, Z., Gadea, B., Ruderman, J., and Wagner, G. (2006) Quantitative NMR analysis of the protein G B1 domain in Xenopus laevis egg extracts and intact oocytes. Proc. Natl. Acad. Sci. USA 103, 11904–11909.
40. Inomata, K., Ohno, A., Tochio, H., Isogai, S., Tenno, T., Nakase, I., Takeuchi, T., Futaki, S., Ito, Y., Hiroaki, H., and Shirakawa, M. (2009) High-resolution multi-dimensional NMR spectroscopy of proteins in human cells. Nature 458, 106–109.
41. Beckman, J. S., and Siedow, J. N. (1985) Bactericidal agents generated by the peroxidase-catalyzed oxidation of para-hydroquinones. J. Biol. Chem. 260, 14604–14609.
42. Sambrook, J., and Russell, D. W., (Eds.) (2001) Molecular Cloning: A Laboratory Manual, Vol. 3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor.
Chapter 16
Deuterated Peptides and Proteins: Structure and Dynamics Studies by MAS Solid-State NMR
Bernd Reif
Abstract
Perdeuteration and back substitution of exchangeable protons in microcrystalline proteins, in combination
with recrystallization from D2O-containing buffers, significantly reduce 1H, 1H dipolar interactions. This
way, amide proton line widths on the order of 20 Hz are obtained. Aliphatic protons are accessible either
via specifically protonated precursors or by using low amounts of H2O in the bacterial growth medium.
The labeling scheme enables characterization of structure and dynamics in the solid-state without dipolar
truncation artifacts.
Key words: Magic angle spinning solid-state NMR, Perdeuteration, 2H labeling, Microcrystalline
proteins, 15N relaxation, Order parameters, Protein dynamics
1. Introduction
Magic angle spinning (MAS) solid-state nuclear magnetic resonance
(NMR) spectroscopy has rapidly progressed over the past 10 years.
Whereas samples with one or two NMR active nuclei were investi-
gated in the past, uniformly labeled samples are now the focus of
investigations. This development was made possible with the
advent of microcrystalline proteins (1, 2). Clearly, those samples
hold the potential to characterize a multitude of interactions at the
same time using only one sample. On the other hand, the problem
of dipolar truncation (3), i.e., the suppression of weak interactions
in the presence of strong couplings, needs to be addressed to derive
a structure in the end. Dipolar truncation can be circumvented by
preparing samples that are magnetically dilute in the carbon spin
system. This can be achieved by growing bacteria that overexpress
the protein of interest in a medium that contains (1,3)-13C-glucose
or (2)-13C-glucose (4). Progress in hardware (5–7) and sample
preparation (1, 8–11) resulted in the structural characterization of
these crystalline proteins (12–17). Due to the fact that the line
width in the solid state is independent of molecular tumbling, very
large crystalline protein complexes can be investigated (18). This
was shown for the 143-kDa tryptophan synthase (19), as well as
for the 480-kDa ferritin (20). Experiments are not limited to
precipitated/crystalline proteins, but can also be carried out in solution,
where the viscosity is large enough that the tumbling correlation
time exceeds the MAS rotor period (21).
In addition to investigations involving crystalline/soluble proteins,
NMR experiments are performed on noncrystalline, uniformly
isotopically enriched samples, like membrane proteins and amyloid
fibrils. In this context, a GPCR-bound ligand (22), a toxin binding
to the nicotinic acetylcholine receptor (23), and a potassium channel
KcsA with an interacting toxin (24) were characterized by using
NMR spectroscopy. Furthermore, the uniformly isotopically labeled
membrane proteins phospholamban (25), OmpG (26), EmrE (27),
sensory rhodopsin (28), proteorhodopsin (29, 30), the ABC trans-
porter ArtJ (31), and DsbB (32, 33) were investigated, and amyloid
fibrils formed from the Alzheimer’s disease β-amyloid peptide
(34–37), transthyretin (38, 39), the WW domain (40), Het-s
(41, 42), IAPP (43–45), α-synuclein (46, 47), and Ure2p (48)
were assigned and structurally characterized.
Furthermore, the microtubule-binding protein CAP-Gly
(49, 50) was characterized using MAS solid-state NMR, and large
protein complexes, such as the small heat-shock protein αB-crystallin,
which are not amenable to solution-state NMR methods, were
successfully studied by solid-state NMR (21, 51).
Recently, a number of high-quality reviews on MAS solid-state
NMR methodology on biomolecules have been published
(52–57). The focus of this chapter is, therefore, on NMR
spectroscopic investigations using perdeuterated proteins. Perdeuteration
significantly simplifies spectroscopy by eliminating most of the
strong homo- and heteronuclear interactions that are inherent to
“normal” solid-state NMR samples.
In the past, MAS solid-state NMR experiments were restricted
to the observation of heteronuclear spins. CRAMPS (58–60) type
1H, 1H homonuclear decoupling approaches, such as WHH-4- (61),
BR-24- (62), MSHOT- (63), and Lee-Goldburg-derived sequences,
such as FSLG (64), PMLG (65), w-PMLG (66), DUMBO (67),
or symmetry-based methods (68), did not succeed in reducing the
unscaled proton line widths to values below 150–300 Hz
(i.e., 0.25–0.5 ppm at 600 MHz). Given the small chemical shift
dispersion for protons of the same kind (e.g., HN, Hα or methyl
protons) with a chemical shift range of ca. 3 ppm (for each of
them), the line width achieved is not sufficient to resolve individual
resonances of larger molecules. Due to their high gyromagnetic
ratio, protons should be the nucleus with the highest sensitivity.
In the solid state, however, 1H, 1H dipolar couplings make proton
detection ineffective as they induce significant line broadening.
Application of ultrafast MAS (60–70 kHz) (69) represents an alter-
native to proton homonuclear decoupling. At the moment, however,
with ultrafast spinning, the resolution achieved is comparable to
the resolution that is obtained at moderate spinning frequencies
using an advanced CRAMPS technique. Alternatively, suppression
of strong 1H, 1H dipolar couplings can be chemically achieved by
perdeuterating the sample. In this approach, all nonexchangeable
proton sites are occupied by deuterium atoms which have a gyromag-
netic ratio that is a factor of 6.5 smaller than the proton gyromagnetic
ratio. Correspondingly, all 1H, 2H dipolar couplings are reduced.
In addition, the interaction Hamiltonian becomes heteronuclear
which makes manipulation of the interaction by MAS or RF pulses
straightforward. Exchangeable sites are subsequently back substituted
with protons. This strategy was pioneered in solution-state NMR
(70–72). In the solid state, deuteration was first applied to small
molecules (73–75) and then later extended to peptides (76–78)
and proteins (79–82). The achieved dilution of the proton bath is
illustrated in Fig. 1.
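The scaling argument above can be illustrated numerically. The following is a minimal sketch, not part of the original protocol: it compares the dipolar coupling constant of a 1H–1H pair with that of a 1H–2H pair at the same distance, and converts a line width from Hz to ppm. The gyromagnetic ratios are standard literature values; the 2.0 Å distance and the function names are assumptions chosen only for illustration.

```python
import math

MU0_OVER_4PI = 1e-7        # T m / A (SI prefactor)
HBAR = 1.054571817e-34     # J s
GAMMA_1H = 267.522e6       # rad s^-1 T^-1, literature value
GAMMA_2H = 41.066e6        # rad s^-1 T^-1, literature value


def dipolar_coupling_hz(gamma_i, gamma_s, r_angstrom):
    """Dipolar coupling constant (Hz) for two spins at distance r (in angstrom)."""
    r = r_angstrom * 1e-10
    return MU0_OVER_4PI * gamma_i * gamma_s * HBAR / (2 * math.pi * r**3)


def hz_to_ppm(width_hz, larmor_mhz):
    """Convert a line width in Hz to ppm at the given Larmor frequency (MHz)."""
    return width_hz / larmor_mhz


# Hypothetical 2.0 angstrom proton-proton distance, for illustration only.
d_hh = dipolar_coupling_hz(GAMMA_1H, GAMMA_1H, 2.0)
d_hd = dipolar_coupling_hz(GAMMA_1H, GAMMA_2H, 2.0)
print(f"gamma(1H)/gamma(2H) = {GAMMA_1H / GAMMA_2H:.2f}")
print(f"1H-1H coupling: {d_hh / 1e3:.1f} kHz, 1H-2H coupling: {d_hd / 1e3:.1f} kHz")
print(f"150 Hz at 600 MHz = {hz_to_ppm(150, 600):.2f} ppm")
```

Because the coupling constant is linear in each gyromagnetic ratio, replacing one proton of a pair by a deuteron scales the coupling by the same factor of about 6.5 quoted in the text.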
2. Correlation Spectroscopy
To record proton-detected experiments, solvent suppression
becomes a major issue. Water suppression can be achieved using
pulsed field gradients (79), cross-polarization (CP) periods as spin
locks to purge unwanted solvent magnetization (82), or a combina-
tion of both (84). In experiments that are carried out with proteins
that are recrystallized from buffers containing 100% H2O, the 1H
line width of most of the resonances is typically on the order of
150–250 Hz and 80–150 Hz in the absence and presence of
homonuclear decoupling, respectively (85). We showed recently
that ultrahigh-resolution 1H spectra are obtained in MAS solid-
state NMR if the respective perdeuterated protein is recrystallized
from a buffer containing 90% D2O (86) (Fig. 2). The resulting 1H
line width is on the order of 17–35 Hz for MAS spinning frequen-
cies in the range of 8–24 kHz. This approach enables 1H-detected
2D 1H, 15N correlation spectroscopy without the need for homo-
and heteronuclear dipolar decoupling.
Similarly, high-resolution spectra can be recorded for methyl
protons in perdeuterated peptides and proteins (Fig. 3a) (87). The
bacteria that overexpress the SH3 domain are grown in a medium
containing glucose that is only ~97% enriched in deuterium. The
likelihood that a proton gets incorporated into a methyl group is,
therefore, on the order of 10%. The canonical line widths in the 1H
and 13C dimensions are 20–25 Hz and 5–8 Hz, respectively, at a
MAS rotation frequency of 22 kHz. An increase in sensitivity can be
achieved by making use of precursors that allow selective labeling
of methyl groups in aliphatic side chains, like pyruvate (88) or
α-ketoisovalerate (89). This kind of labeling strategy was pioneered
by Lewis Kay and coworkers for solution-state NMR applications
(90). In the solid state, care should be taken to preferentially
incorporate CHD2 isotopomers, as the dipolar couplings among methyl
protons in the CH3 group can induce severe line broadening.
Fig. 1. Proton density in the α-spectrin SH3 domain upon deuteration. (a) Protonated sample. (b) Sample recrystallized
from 100% H2O: Only exchangeable protons of the protein and protons of refined hydration water are displayed. (c) Sample
recrystallized from 10% H2O and 90% D2O. On average, every molecule contains only 21 protons, assuming that there are
53 hydration water molecules as found in the X-ray structure (PDB: 1U06) (83). (d) Upon deuteration, 1H, 1H dipolar interactions
are strongly attenuated due to the chemical dilution of the proton spins in the sample.
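The estimate that roughly 10% of methyl groups carry a proton when the glucose is ~97% deuterated can be checked with a simple binomial model. The sketch below assumes that each of the three methyl hydrogen sites is protonated independently with a probability of 3%; this ignores any isotope effects or precursor-specific scrambling during biosynthesis, and the function name is hypothetical.

```python
from math import comb


def methyl_isotopomers(deuteration=0.97, sites=3):
    """Probability of finding n protons in a methyl group (binomial model)."""
    p_h = 1.0 - deuteration
    return {
        n_h: comb(sites, n_h) * p_h**n_h * deuteration ** (sites - n_h)
        for n_h in range(sites + 1)
    }


probs = methyl_isotopomers()
at_least_one_h = 1.0 - probs[0]

labels = {0: "CD3", 1: "CHD2", 2: "CH2D", 3: "CH3"}
for n_h, p in probs.items():
    print(f"{labels[n_h]:>4}: {p:.4%}")
print(f"methyl groups with at least one proton: {at_least_one_h:.1%}")
```

Under this assumption, nearly all protonated methyl groups are CHD2 isotopomers, which is the species the text recommends favoring in the solid state.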
Fig. 2. (a) 1H-detected 1H, 15N correlation recorded with a perdeuterated α-spectrin SH3 sample that was recrystallized
from a buffer containing 90% D2O (axes: 1H chemical shift [ppm] and 15N chemical shift [ppm]; cross peaks are labeled with
their residue assignments). (b) Amide proton line widths as a function of MAS rotation frequencies for selected residues
(G51, G28, Wε41, A56, and L61; axes: rotor period [µs] and 1H line width [Hz]).
Reproduced by permission of Wiley from Chevelkov et al. (86).
A prerequisite to achieve high-resolution spectra for long-lasting
multidimensional experiments is an internal 2H lock that can be
employed to decouple 2H, 13C scalar couplings, which induce a
significant broadening of the resonances in 13C evolution periods
(88). In principle, ultrafast MAS probes (69) might allow the use
of higher proton concentrations while maintaining high 1H resolution
as the residual dipolar interactions are more efficiently suppressed.
However, ultrafast MAS probes impose challenges in shimming,
resulting in intrinsic line widths on the order of 25–30 Hz. Problems
arise from the small dimensions of the sample, in particular at high
magnetic fields. This might change when susceptibility matched
wire and isolation become available. Then, higher concentrations
of exchangeable protons might be achieved without compromising
resolution (91, 92).
Fig. 3. (a) 1H, 13C correlation recorded for a [U – 2H, 13C, 15N]-labeled sample of the α-spectrin
SH3 domain (axes: 1H chemical shift [ppm] and 13C chemical shift [ppm]). The experiment makes use
of the residual protonation of the precursors employed during protein biosynthesis. Spectra in gray
and black are recorded using INEPT and CP, respectively, for magnetization transfer. Cross peaks
highlighted with circles are only visible in the INEPT version of the experiment. Reproduced by
permission of Elsevier from Agarwal et al. (87). (b) Cα spectral region of the 13C-detected 2H-DQ,
13C correlation experiment applied to the SH3 domain (axes: 13Cα chemical shift [ppm] and 2Hα DQ and
SQ chemical shift [ppm]). Dα resonances are as narrow as 16 Hz (A56) in the 2H DQ dimension.
Reproduced by permission of ACS from Agarwal et al. (93).
On the other hand, 2H can be used as an additional nucleus or
chemical shift dimension to disperse overlapping resonances. In
the solid state, high-resolution 2H, 13C correlation experiments are
possible, since overall tumbling is absent in immobilized crystalline
systems (93). Figure 3b shows a 13C-detected 2H-DQ, 13C correlation
recorded for a perdeuterated α-spectrin SH3 sample. The 2H dimen-
sion is realized by evolving double-quantum (DQ) coherences,
since 2H-DQ are independent from deviations of the spinning axis
from the magic angle, and insensitive to the motional effects that
would interfere with MAS. In addition, 2H-DQ allow a doubling
of the effective resolution since they evolve twice as fast as single-
quantum (SQ) coherences. The efficiency of the 2H, 13C magnetiza-
tion transfer is strongly coupled to the RF field strength on the 2H
channel (>80 kHz). The use of optimal control (OC) in the
design of better magnetization transfer sequences might alleviate
this problem in the future (94).
In solution-state NMR, exchangeable hydroxyl protons are
difficult to assign due to their rapid exchange with the solvent.
In the solid state, magnetization can be transferred approximately
50× faster using dipolar couplings. This way, many exchangeable
hydroxyl protons become accessible. Figure 4 shows a 13C-detected
1H, 13C correlation spectrum recorded for the microcrystalline
α-spectrin SH3 domain (95). All three threonine hydroxyl protons
are readily assigned by correlations to the Cβ and Cα carbon
chemical shifts. In EXSY-type experiments, the exchange
characteristics of the respective hydroxyl proton can be probed. Dipolar
recoupling experiments allow the proton to be localized within a
hydrogen bond. From the dephasing behavior in REDOR-type
experiments, the distance between the hydroxyl proton and a carbon
atom in the donor as well as acceptor group can be deduced. We
expect that these experiments will have an impact on the understanding
of enzymes that are involved in proton transfer reactions.
Fig. 4. Threonine spectral region of a 13C-detected 1H, 13C correlation recorded for a [U – 2H,
13C, 15N]-labeled sample of the α-spectrin SH3 domain (axes: 13C chemical shift [ppm] and 1H
chemical shift [ppm]). The spectrum was recorded at 5°C. Peaks are split into doublets in the 1H
indirect dimension due to evolution of the 1JNH scalar coupling. Reproduced by permission of ACS
from Agarwal et al. (95).
Figure 5 shows the effect of miscalibration of the spinning
angle on the experimental 15N and 2H spectra. The 15N spectrum
in Fig. 5a is split into a doublet due to the one-bond scalar coupling
between the nitrogen and the amide proton. Even small deviations
of the spinning angle from the magic angle (0.02° for 15N) result
in a noticeable deterioration of the spectral quality. The situation is
worse for 2H-SQ spectroscopy. Because of the large deuterium
quadrupolar interaction, 2H resonances are sensitive to a miscalibra-
tion of the spinning angle as small as 0.005° (Fig. 5b). We expect
that in the future Hall devices, which allow a direct adjustment of
the spinning angle without the need for an external reference sample,
will greatly improve the quality of solid-state spectra (96).
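The sensitivity to the spinning angle can be rationalized with a first-order estimate: under MAS, second-rank anisotropic interactions are scaled by P2(cos β) = (3cos²β − 1)/2, which vanishes only at the magic angle. The sketch below is an order-of-magnitude illustration under that assumption, using the interaction sizes quoted in the caption of Fig. 5 (a 10 kHz dipolar coupling for 15N and a 100 kHz quadrupolar coupling for 2H). It is not a substitute for a full line shape simulation, and the function names are hypothetical.

```python
import math

# Magic angle: arctan(sqrt(2)), about 54.7356 degrees.
MAGIC_ANGLE_DEG = math.degrees(math.atan(math.sqrt(2)))


def p2(beta_deg):
    """Second Legendre polynomial of cos(beta), the MAS scaling factor."""
    c = math.cos(math.radians(beta_deg))
    return 0.5 * (3.0 * c * c - 1.0)


def residual_hz(interaction_hz, offset_deg):
    """First-order residual of an anisotropic interaction for an angle mis-set."""
    return abs(p2(MAGIC_ANGLE_DEG + offset_deg)) * interaction_hz


# Interaction sizes taken from the caption of Fig. 5.
print(f"magic angle: {MAGIC_ANGLE_DEG:.5f} deg")
print(f"15N, 0.02 deg mis-set, 10 kHz coupling: {residual_hz(10e3, 0.02):.1f} Hz")
print(f"2H, 0.005 deg mis-set, 100 kHz coupling: {residual_hz(100e3, 0.005):.1f} Hz")
```

In both cases the residual is comparable to the line widths of 10–16 Hz discussed in this section, which is consistent with the statement that such small deviations already degrade the spectra.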
In addition to using specifically protonated precursors of amino
acid biosynthesis, doping the sample with Cu-EDTA results in a
reduction of the 1H T1 (up to a factor of 15), and thus the recycle
delay of the experiment (99, 100). At the same time, the paramagnetic
complex leaves the proton line width unaffected. Doping is
straightforward in perdeuterated samples for which no high-
power proton decoupling is required in a direct or indirect
evolution period. Typical duty cycles are on the order of 30%
(100-ms acquisition time, 300-ms recycle delay). However, care
has to be taken if protonated samples are employed. The use of
probes that largely exclude the electric field from the active volume
of the sample seems mandatory (5, 6).
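As a small worked example of the numbers above, the following sketch computes the duty cycle for the quoted timings and the gain in signal-to-noise per unit measurement time that would follow from a shorter 1H T1. It assumes that the duty cycle is taken as the ratio of acquisition time to recycle delay, that the recycle delay is scaled in proportion to T1, and that the signal-to-noise ratio grows with the square root of the number of scans; the function names are hypothetical.

```python
import math


def duty_cycle(acquisition_s, recycle_delay_s):
    """Duty cycle, taken here as acquisition time divided by recycle delay."""
    return acquisition_s / recycle_delay_s


def sensitivity_gain(t1_reduction_factor):
    """S/N gain per unit time if the recycle delay is shortened along with T1."""
    return math.sqrt(t1_reduction_factor)


print(f"duty cycle: {duty_cycle(0.100, 0.300):.0%}")
print(f"S/N gain per unit time for a 15-fold shorter T1: {sensitivity_gain(15):.1f}")
```

If the duty cycle is instead defined relative to the total scan time (acquisition plus recycle delay), the same timings give 25%, which is still on the order of 30%.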
The high resolution that is achievable in the solid state of these
highly deuterated proteins enables solution state-type, scalar cou-
pling-based correlation experiments, e.g., HNCO, HNCA, HNCACB,
HNCACO, and HNCOCA, which yield reliable backbone
resonance assignments (18, 101). Note that only by taking into
account the HN proton chemical shift is an unambiguous backbone
assignment of uniformly isotopically enriched proteins in the solid
state possible. Even with 15N line widths on the order of 10 Hz
(which is the typical line width of the resonances in perdeuterated
SH3 recrystallized from 90% D2O), many 15N chemical shifts
overlap (Fig. 2a). Finally, deuteration of nonexchangeable sites
in combination with back exchange of amide protons allows the
determination of HN–HN long-range distances (76, 80, 81), detec-
tion of dynamic water molecules (81, 83), and characterization of
side-chain dynamics using deuterium (102, 103).
Fig. 5. Effect of the MAS spinning axis on the 15N (a) and 2H (b) line width (axes: 15N frequency [Hz] in (a) and 2H chemical
shift [Hz] in (b)). (a) The simulation of the 15N spectrum (15N–1H spin pair) assumes an 15N chemical shift anisotropy of
100 ppm and a dipolar and scalar coupling to a directly bonded proton of 10 kHz and −95 Hz, respectively. The external
magnetic field strength was set to 14.1 T, corresponding to a proton Larmor frequency of 600 MHz. Simulations are shown
for a MAS rotation frequency of 10 kHz (dashed) and 24 kHz (solid). Mis-setting the spinning angle from the “magic angle”
(βMA = βRL = arctan √2 ≈ 54.73561°) reintroduces the sum and difference anisotropy for the upfield and downfield
components, respectively. Reproduced by permission of ACS from Chevelkov et al. (97). (b) Simulation of the 2H DQ (top)
and SQ (bottom) spectrum. The quadrupolar coupling was assumed to be 100 kHz, setting η = 0.1. In the simulation, the
Euler angle βRL, which describes the angle between the principal axes of the rotor fixed frame and the laboratory coordinate
system, was varied as indicated. All simulations were carried out using the program SIMPSON (98). Reproduced by
permission of ACS from Agarwal et al. (93).
3. Characterization of Dynamics in the Solid State
In solution-state NMR, the relaxation properties of an NMR
observable are largely determined by the overall tumbling of
the molecule in the solvent. Local structural fluctuations, which are
often of greater interest than the characterization of the overall
correlation time of a molecule, are therefore difficult to access. The
situation is different when microcrystalline proteins are considered.
In MAS solid-state NMR experiments, overall tumbling is absent
and relaxation is mostly driven by local structural fluctuations.
Therefore, the 15N-T1 relaxation time of an amide nitrogen in the
protein backbone can vary by several orders of magnitude
(104–107). Figure 6 shows the 15N-T1 relaxation times that were
obtained for a perdeuterated sample of the α-spectrin SH3 domain
that was recrystallized from a buffer that contained 90% D2O
(108). The results obtained in the solid state (Fig. 6a) are contrasted
with the 15N-T1 relaxation times that are obtained in solution-state
NMR experiments (Fig. 6b) recorded for the same protein. Clearly,
the dynamic range is increased in the solid state.
Fig. 6. 15N T1 relaxation times of the α-spectrin SH3 domain recorded in the solid state (a) and in solution (b) (axes: residue
number and 15N T1 [s]; the panel legends list data recorded at 500, 600, and 900 MHz). Reproduced
by permission of ACS from Chevelkov et al. (108).
The use of deuterated samples ensures that 1H, 1H spin diffusion does not perturb
the experimental rates. Spin diffusion would result in an averaging
of the experimental rate. A comparative analysis of 15N relaxation
times and order parameters in the solid state and solution shows
that both techniques can be combined to allow for a more reliable
quantification of motional processes (109, 110). The measured
relaxation rate R1(15N) is related to the size of the N-H dipolar
coupling, d, the chemical shift anisotropy, c, and the spectral den-
sity function, J(ω), according to (111, 112):
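$$R_1(^{15}\mathrm{N}) = \frac{d^2}{10}\left[J_0(\omega_H - \omega_N) + 3J_1(\omega_N) + 6J_2(\omega_H + \omega_N)\right] + \frac{2}{15}c^2 J_1(\omega_N) \tag{1}$$
with
$$d^2 = \left(\frac{\gamma_H \gamma_N h}{2\pi r_{NH}^3}\right)^2 \equiv \omega_{HN}^2, \qquad c^2 = \left[\gamma_N B_0\left(\sigma_{\parallel} - \sigma_{\perp}\right)\right]^2 \equiv \omega_N^2 \cdot \Delta\sigma^2 \tag{2}$$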
rNH refers to the 1H–15N bond length, ωH and ωN represent the 1H
and the 15N Larmor frequencies, respectively, and γH and γN are the
gyromagnetic ratios of 1H and 15N, respectively. The
frequency of the 1H, 15N dipole–dipole interaction is denoted as
ωHN. The 15N chemical shift anisotropy (CSA) can be assumed to
be axially symmetric. The 15N–1H bond is tilted by approximately
20° with respect to the principal axis of the 15N CSA tensor (113).
Typical values for the anisotropy of the 15N chemical shift are
Δσ = σ|| − σ⊥ = 170 ± 8 ppm or σz = 106 ± 6 ppm (9, 113–116). In
the absence of motion, an effective N–H bond length of
rNH = ⟨rNH^3⟩^(1/3) = 1.015 Å is assumed (117).
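As a rough numerical illustration of Eqs. 1 and 2, the following sketch evaluates the dipolar coupling and R1 for a single Lorentzian spectral density, J(ω) = τC/(1 + ω²τC²). Several choices here are assumptions made for illustration rather than part of the original analysis: the coupling is written in SI units (which adds the factor μ0/4π), the three spectral densities J0, J1, and J2 are taken to be equal, the Larmor frequencies are used as magnitudes, and the function names are hypothetical.

```python
import math

# SI constants; the gyromagnetic ratios are standard literature values.
MU0_OVER_4PI = 1.0e-7      # T m A^-1
HBAR = 1.054571817e-34     # J s
GAMMA_H = 2.6752218744e8   # rad s^-1 T^-1
GAMMA_N = 2.7126180e7      # rad s^-1 T^-1, magnitude only (the 15N ratio is negative)


def dipolar_coupling(r_nh=1.015e-10):
    """1H-15N dipolar coupling omega_HN in rad/s (Eq. 2 written in SI units)."""
    return MU0_OVER_4PI * GAMMA_H * GAMMA_N * HBAR / r_nh ** 3


def lorentzian(omega, tau_c):
    """J(omega) = tau_c / (1 + omega^2 tau_c^2); a single-correlation-time assumption."""
    return tau_c / (1.0 + (omega * tau_c) ** 2)


def r1_15n(b0, tau_c, delta_sigma=170e-6, r_nh=1.015e-10):
    """15N R1 in s^-1 from Eq. 1, taking J0 = J1 = J2 = lorentzian (assumption)."""
    w_h = GAMMA_H * b0
    w_n = GAMMA_N * b0
    d2 = dipolar_coupling(r_nh) ** 2
    c2 = (w_n * delta_sigma) ** 2
    dipolar = d2 / 10.0 * (
        lorentzian(w_h - w_n, tau_c)
        + 3.0 * lorentzian(w_n, tau_c)
        + 6.0 * lorentzian(w_h + w_n, tau_c)
    )
    csa = 2.0 / 15.0 * c2 * lorentzian(w_n, tau_c)
    return dipolar + csa


if __name__ == "__main__":
    print(f"omega_HN / 2 pi = {dipolar_coupling() / (2 * math.pi) / 1e3:.2f} kHz")
    print(f"R1(14.1 T, 5 ns) = {r1_15n(14.1, 5e-9):.2f} s^-1")
```

The correlation time of 5 ns in the usage lines is an arbitrary example value; for a microcrystalline sample the spectral density would instead have to reflect local motion only.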
Until recently, transverse relaxation properties of nuclear spins
in the solid state were not generally accessible. An exception is
represented by chemical exchange phenomena, which have a direct
impact on the spectral line shape or the powder pattern. In proto-
nated samples, the decay of transverse magnetization cannot be
easily related to the motional properties of the molecule, since
the magnetization decay can arise from insufficient 1H decoupling
or other experimental issues. Given the high resolution that is
achievable in highly deuterated proteins, proton and nitrogen spectra
can be recorded without decoupling in the direct or indirect
dimension (97). The resulting spectrum yields a doublet in either dimen-
sion for every amide moiety. In the solid state, two effects contrib-
ute to differences in the intensities of each of the doublets: first,
the multiplet intensities are affected by a coherent effect which is
MAS frequency dependent; second, static, coherent effects are well
documented and have been exploited by many solid-state NMR
groups (118–123). In brief, the chemical shift ωN^PAS of a particular
amide nitrogen of a given crystallite is determined by two
contributions that are due to the 1H, 15N dipolar interaction and the 15N
chemical shielding in the nitrogen principal axis system:
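$$
\begin{aligned}
\omega_N^{PAS}(\beta) &= \omega_N^{CSA}(\beta) + \omega_{NH}^{Dipol}(\beta) \\
&= \left[ \sigma^{iso} + \delta_N\, \frac{3\cos^2\beta - 1}{2} \right] N_z + D_{HN}\, \frac{3\cos^2\beta - 1}{2}\, 2 N_z H_z \\
&= \sigma^{iso} N_z + \frac{3\cos^2\beta - 1}{2} \left( \delta_N + 2 D_{HN} H_z \right) N_z
\end{aligned} \tag{3}
$$
with
$$
D_{HN} = \left( \frac{\mu_0}{4\pi} \right) \frac{\gamma_H \gamma_N \hbar}{r_{NH}^3},
$$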
in which σiso and δN describe the isotropic and anisotropic chemical
shift of the nitrogen spin, respectively. β refers to the angle of the
principal axis of the dipolar/shielding tensor with respect to the
external magnetic field. For simplicity, it is assumed that the dipolar
and chemical shielding tensors are collinear and that the 15N shielding
tensor is axially symmetric. DHN represents the size of the 1H, 15N
dipolar interaction, which is dependent on the magnetic permeability
μ0, the gyromagnetic ratios, γH and γN, of the proton and the nitrogen
nuclei, respectively, Planck’s constant, ħ, and the N–H bond length
rNH. Given the fact that the proton spin state can adopt the spin
quantum number ±1/2, one obtains
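$$
\begin{aligned}
\delta_N + 2 D_{HN} H_z &= \delta_N + D_{HN}\left[ \left( \tfrac{1}{2} + H_z \right) - \left( \tfrac{1}{2} - H_z \right) \right] \\
&= \delta_N + D_{HN}\left[ H^{\alpha} \right] - D_{HN}\left[ H^{\beta} \right] \\
&= \begin{cases} \delta_N + D_{HN}\left[ H^{\alpha} \right]; & \text{upfield component} \\ \delta_N - D_{HN}\left[ H^{\beta} \right]; & \text{downfield component} \end{cases}
\end{aligned} \tag{4}
$$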
As a consequence, the β spin state of the multiplet experiences
effectively only the difference anisotropy δN − DHN, whereas the α
spin state experiences the sum anisotropy δN + DHN. The intensity
of the spinning sideband resonances associated with the sum anisot-
ropy is, therefore, distributed over a larger spectral region and the
respective central band intensity is decreased. The opposite applies
for the resonance associated with the difference tensor. As the 15N
CSA and 1H–15N dipolar interactions are purely inhomogeneous as
per Maricq and Waugh (124), no contribution of the coherent
static effect to the 15N-Hα/Hβ multiplet line width is expected.
This explanation is supported by numerical simulations in which a
two-site exchange process is explicitly included in the calculation
of the powder average of a 1H, 15N two-spin system (125).
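The redistribution of centerband intensity between the sum and difference tensors can be illustrated with a small powder-average calculation. The sketch below is not the simulation of ref. 125; it assumes an axially symmetric tensor whose static frequency is δ(3cos²θ − 1)/2 (as in Eq. 3), ignores relaxation and exchange, and uses hypothetical example values for δN and DHN (in Hz) together with the 13 kHz MAS frequency quoted for Fig. 7. For each crystallite the phase accumulated over one rotor period is integrated, and the squared modulus of the rotor-period average of exp(iΦ) is taken as the centerband fraction.

```python
import cmath
import math

MAGIC_ANGLE = math.acos(1.0 / math.sqrt(3.0))


def centerband_fraction(aniso_hz, mas_hz, n_beta=60, n_phi=256):
    """Powder-averaged fraction of the signal that stays in the MAS centerband.

    The static frequency is aniso_hz * (3 cos^2(theta) - 1) / 2, i.e. an
    axially symmetric tensor, with theta the angle between tensor axis and B0.
    """
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_beta):
        # Equal steps in cos(beta) reproduce the sin(beta) powder weighting.
        cos_b = -1.0 + (i + 0.5) * 2.0 / n_beta
        sin_b = math.sqrt(1.0 - cos_b * cos_b)
        # Instantaneous frequency (Hz) sampled over one rotor period.
        freq = []
        for k in range(n_phi):
            cos_t = (math.cos(MAGIC_ANGLE) * cos_b
                     + math.sin(MAGIC_ANGLE) * sin_b * math.cos(k * d_phi))
            freq.append(aniso_hz * 0.5 * (3.0 * cos_t * cos_t - 1.0))
        # Accumulate the phase (trapezoidal rule) and average exp(i * phase).
        phase = 0.0
        f0 = 0j
        for k in range(n_phi):
            f0 += cmath.exp(1j * phase)
            phase += 0.5 * (freq[k] + freq[(k + 1) % n_phi]) * d_phi / mas_hz
        f0 /= n_phi
        total += abs(f0) ** 2
    return total / n_beta


# Hypothetical example magnitudes (Hz); not values taken from the text.
DELTA_N_HZ = 6.5e3
D_HN_HZ = 11.6e3
MAS_HZ = 13.0e3

if __name__ == "__main__":
    print("sum tensor:       ", centerband_fraction(DELTA_N_HZ + D_HN_HZ, MAS_HZ))
    print("difference tensor:", centerband_fraction(DELTA_N_HZ - D_HN_HZ, MAS_HZ))
```

Under these assumptions the multiplet component associated with the sum anisotropy retains the smaller centerband fraction, in line with the qualitative argument above.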
Static interference effects are not observable in solution-state
NMR, since they are averaged to zero because of the tumbling
of the molecule in solution. However, a second-order dynamic
interference effect, which is based on dipolar and CSA relaxation
interference, still influences the spectra. This effect is the physical
basis of TROSY (126) and cross-correlated relaxation experiments
(127–129). The size of the 15N CSA, 15N–1H dipole cross-correlated
relaxation rate can be expressed as (127)
$$
\eta^{CSA/DD} = 2ad\,\{4J(0) + 3J(\omega_N)\}\,P_2(\cos\theta), \tag{5}
$$
where a = −(4π/3) B0 (σ|| − σ⊥) rHN³/(hγH) and d = γH² γN² h²/(80π² rHN⁶).
rHN refers to the H–N bond length and P2 refers to the second-order
Legendre polynomial ½(3cos²θ − 1), with θ corresponding to the
angle between the N–H dipolar vector and the principal axis of
the 15N CSA shielding tensor. For an isotropic motional model
without internal motion, the spectral density function J(ω) is given
as J(ω) = τC/(1 + ω²τC²). In addition to the size and relative
orientation of the CSA and dipolar tensors, the cross-correlated
relaxation rate is, therefore, directly proportional to the molecular
correlation time τC. An exact quantification of η is typically performed
in experiments in which the magnetization of 15N-Hα and 15N-Hβ
spin states is allowed to relax for a constant time, Δ (130).
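To make the dependence of Eq. 5 on the correlation time concrete, the following sketch evaluates η for the isotropic spectral density given above. It is an illustration under stated assumptions: SI units are used (so the dipolar coupling carries the factor μ0/4π), the prefactors a and d are rewritten in terms of the CSA frequency c = ωNΔσ and the dipolar coupling, the Larmor frequency is taken as a magnitude, and the function name and default parameter values (Δσ = 170 ppm, θ = 20°, rHN = 1.015 Å) are choices made here from the typical values quoted earlier.

```python
import math

MU0_OVER_4PI = 1.0e-7      # T m A^-1
HBAR = 1.054571817e-34     # J s
GAMMA_H = 2.6752218744e8   # rad s^-1 T^-1
GAMMA_N = 2.7126180e7      # rad s^-1 T^-1, magnitude only


def eta_csa_dd(b0, tau_c, theta_deg=20.0, delta_sigma=170e-6, r_hn=1.015e-10):
    """CSA/dipole cross-correlated relaxation rate (s^-1) following Eq. 5."""
    w_n = GAMMA_N * b0
    d_dip = MU0_OVER_4PI * GAMMA_H * GAMMA_N * HBAR / r_hn ** 3   # rad/s
    c = w_n * delta_sigma                                         # rad/s
    # a and d of Eq. 5 expressed through c and d_dip (SI form, assumption):
    a = -(2.0 / 3.0) * c / d_dip
    d = d_dip ** 2 / 20.0
    j_zero = tau_c
    j_wn = tau_c / (1.0 + (w_n * tau_c) ** 2)
    p2 = 0.5 * (3.0 * math.cos(math.radians(theta_deg)) ** 2 - 1.0)
    return 2.0 * a * d * (4.0 * j_zero + 3.0 * j_wn) * p2


if __name__ == "__main__":
    for tau_ns in (2.0, 5.0, 10.0):
        print(tau_ns, "ns ->", eta_csa_dd(14.1, tau_ns * 1e-9), "s^-1")
```

Because the J(0) term dominates for correlation times of several nanoseconds, the computed rate grows almost linearly with τC, which is the proportionality referred to in the text.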
Keeping the MAS rotation frequency constant, variation of the
effective sample temperature allows a direct probe of CSA-dipole
cross-correlated relaxation effects on the differential 15N-Hα/Hβ
line width (Fig. 7). Clearly, the anisotropy of the intensities associated
with the multiplet components of the Hα/Hβ spin states becomes
larger at lower temperature for both L61 and D62, indicating
slower motional correlation times at lower temperatures. This behavior
is expected. Nevertheless, it is surprising to see that backbone motions
have large enough amplitudes and correlation times to produce
this effect. A significant effect can only be expected for motional
correlation times that are on the order of or larger than the inverse
of the 15N Larmor frequency, i.e., several ns (see Eq. 5).
In the solid state, order parameters are directly accessible by
measuring the dipolar interaction between two spins, e.g., 13C/15N
[Figure 7: 15N columns for L61 and D62 at effective temperatures of 4 °C, 10 °C, 17 °C, and 24 °C; horizontal axis: 15N chemical shift [ppm]. Experimental FWHM values annotated in the panels: L61, 13.0, 13.0, 11.2, and 10.2 Hz (first row) and 17.9, 17.7, 13.8, and 12.5 Hz (second row); D62, 17.0, 16.5, 13.8, and 12.6 Hz (first row) and ~30 Hz and 21.4 Hz (second row).]
Fig. 7. 15N columns extracted from a 2D 1H, 15N correlation experiment that was recorded without 1H decoupling in the 15N
evolution period. The MAS rotation frequency was kept constant at 13 kHz. Contributions from a static correlation between
the 15N CSA tensor and the 1H, 15N dipole as a possible source of the effect should, therefore, be constant and independent
of a change in temperature. The anisotropy of the multiplet intensities increases at lower temperatures, indicating that
the motional correlation time is increased, i.e., that the dynamics become slower. The effects are particularly pronounced for D62, for which the
upfield component cannot be detected when the sample temperature is adjusted to 4 °C. Reproduced by permission of ACS
from Chevelkov et al. (97).
and its directly bonded 1H (9, 131–133). Again, the dipolar interaction
can be determined more reliably if perdeuterated proteins are
employed since residual 1H, 1H dipolar interactions can be neglected
as a possible source of error (134). Dipolar couplings, extracted
from CPPI-type experiments (135–137), are almost unaffected by
the radio frequency inhomogeneities of the probe, and thus enable
a more accurate determination of the absolute value of the coupling.
A comparison of solid-state and solution-state order parameters
allows identification of slow motional processes that are normally
not easily observed in solution (110).
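The step from a measured dipolar coupling to an order parameter can be sketched as a simple ratio. This assumes the commonly used relation S = Dmeasured/Drigid, with the rigid-limit coupling computed in SI units for the effective bond length of 1.015 Å quoted above; the function names and the example coupling of 10.5 kHz are hypothetical and are not taken from the cited experiments.

```python
import math

MU0_OVER_4PI = 1.0e-7      # T m A^-1
HBAR = 1.054571817e-34     # J s
GAMMA_H = 2.6752218744e8   # rad s^-1 T^-1
GAMMA_N = 2.7126180e7      # rad s^-1 T^-1, magnitude only


def rigid_limit_coupling_hz(r_nh=1.015e-10):
    """Rigid-limit 1H-15N dipolar coupling in Hz for a given bond length (m)."""
    return MU0_OVER_4PI * GAMMA_H * GAMMA_N * HBAR / r_nh ** 3 / (2.0 * math.pi)


def order_parameter(measured_hz, r_nh=1.015e-10):
    """Order parameter S as the ratio of measured to rigid-limit coupling (assumption)."""
    return measured_hz / rigid_limit_coupling_hz(r_nh)


if __name__ == "__main__":
    s = order_parameter(10.5e3)   # hypothetical measured coupling of 10.5 kHz
    print(f"S = {s:.3f}, S^2 = {s * s:.3f}")
```

The strong r⁻³ dependence means that the assumed bond length directly scales the resulting order parameter, which is why an effective bond length has to be fixed before solid-state and solution-state values are compared.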
To analyze motion quantitatively, the spectral density functions
in Eqs. 1 and 5 need to be expressed explicitly. This is not trivial,
since the exact form of the spectral density function depends on the
underlying motional model (111). In the framework of extended
model-free formalism (138, 139), the spectral density functions
Jm(ω) are expressed as Lorentzian functions that depend on two
correlation times, τS and τF, and two order parameters, SS and SF,
referring to slow and fast motional processes, respectively:
$$
J(\omega) = \left(1 - S_F^2\right) \frac{\tau_F}{1 + \omega^2 \tau_F^2} + S_F^2 \left(1 - S_S^2\right) \frac{\tau_S}{1 + \omega^2 \tau_S^2}. \tag{6}
$$
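Equation 6 translates directly into a short function. In the sketch below the function name and argument order are choices made here; the example call uses τF = 22 ps and SF² = 0.819 as quoted for residue Q16, while τS = 20 ns and SS² = 0.9 are arbitrary illustration values.

```python
def j_emf(omega, tau_f, tau_s, sf2, ss2):
    """Extended model-free spectral density of Eq. 6 (no overall tumbling term).

    omega in rad/s, tau_f and tau_s in s, sf2 and ss2 are the squared
    order parameters of the fast and slow motion.
    """
    fast = (1.0 - sf2) * tau_f / (1.0 + (omega * tau_f) ** 2)
    slow = sf2 * (1.0 - ss2) * tau_s / (1.0 + (omega * tau_s) ** 2)
    return fast + slow


if __name__ == "__main__":
    omega_n = 3.82e8   # approximate 15N Larmor frequency at 14.1 T, rad/s
    print(j_emf(0.0, 22e-12, 20e-9, 0.819, 0.9))
    print(j_emf(omega_n, 22e-12, 20e-9, 0.819, 0.9))
```

Note that the function vanishes when both squared order parameters equal one, which reflects the statement that relaxation in the solid state is driven by local motion only.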
To find the best fit in the framework of an extended model-free
analysis, all experimental results are combined (15N R1 measured at an
external field of 14.1 T and 21.1 T, corresponding to 1H Larmor
frequencies of 600 MHz and 900 MHz; the 1H–15N dipole, 15N CSA
cross-correlated relaxation rate ηDD/CSA; and the 1H, 15N dipolar couplings).
In total, the data contain four experimental observables, which
are just enough to yield a determined system for the four model
parameters τS, τF, SS², and SF². The best fit corresponds
to the minimum root mean square deviation χ between
experimental and theoretical rates, which is defined as
$$
\chi^2 = \left\{ \sum_i \left[ \frac{1}{R_{1,i}^{expt}} \left( R_{1,i}^{theo} - R_{1,i}^{expt} \right) \right]^2 + \left[ \frac{1}{\eta^{expt}} \left( \eta^{theo} - \eta^{expt} \right) \right]^2 \right\}. \tag{7}
$$
Superscripts theo and expt denote the theoretical and experimental
values for the 15N longitudinal relaxation rate R1 and
the 1H–15N dipole, 15N CSA cross-correlated relaxation rate
ηDD/CSA. In the grid search, the order parameter of fast motion SF² was
calculated according to SF = S/SS while the parameters τS, τF,
and SS² were allowed to float freely. Figure 8 shows RMSD contour
plots for residue Q16 of the α-spectrin SH3 domain as a function
of τS and SS². τF was set to 22 ps, which corresponds to its optimal
value obtained in the course of the grid search. In fitting the curves
in Fig. 8a, only 15N-T1 data are included; Fig. 8b, c, and d contain
cross-correlated relaxation data (ηDD/CSA) as well. We find that the minimum
for the fit of the motional correlation time τS is more restricted if ηDD/CSA
is taken into account. Inclusion of an additional 15N-T1 relaxation
time measured at a different external field strength increases the
[Figure 8: panels (a)–(d), each a contour plot with axes SS² and τS (ns).]
Fig. 8. Rms difference plots between experimental and theoretical data as a function of SS² and τS for the residue Q16 in
α-spectrin SH3. For the best fit, we obtain τF = 22 ps, and SF² = 0.819. Data included in the fit are: (a) 15N T1 measured at 14.1 T
and 21.1 T. (b) 15N T1 measured at 14.1 T and ηDD/CSA. (c) 15N T1 measured at 21.1 T and ηDD/CSA. (d) 15N T1 measured at 14.1 T, 15N T1
measured at 21.1 T and ηDD/CSA. Reproduced by permission of Springer from Chevelkov et al. (140).
steepness of the minimum, but leaves the best fit for τS and SS2 approxi-
mately unaltered. This is in agreement with previous findings
(106, 109).
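The grid search over τS and SS² can be sketched as follows. This is a toy version run on synthetic, noise-free "experimental" rates generated from known parameters, not a reproduction of the published fit. It combines Eqs. 1, 5, 6, and 7 under several assumptions that are made here for illustration: SI units, Δσ = 170 ppm, θ = 20°, rNH = 1.015 Å, J0 = J1 = J2, τF fixed at 22 ps, η taken as a magnitude and evaluated at 14.1 T, and SF² obtained from a total order parameter as S²/SS². All names are hypothetical.

```python
import math

MU0_OVER_4PI = 1.0e-7
HBAR = 1.054571817e-34
GAMMA_H = 2.6752218744e8
GAMMA_N = 2.7126180e7      # magnitude of the 15N gyromagnetic ratio
D_DIP = MU0_OVER_4PI * GAMMA_H * GAMMA_N * HBAR / (1.015e-10) ** 3   # rad/s
DELTA_SIGMA = 170e-6
P2_THETA = 0.5 * (3.0 * math.cos(math.radians(20.0)) ** 2 - 1.0)


def j_emf(omega, tau_f, tau_s, sf2, ss2):
    """Extended model-free spectral density (Eq. 6)."""
    return ((1.0 - sf2) * tau_f / (1.0 + (omega * tau_f) ** 2)
            + sf2 * (1.0 - ss2) * tau_s / (1.0 + (omega * tau_s) ** 2))


def rates(tau_s, ss2, s2_total, tau_f=22e-12, fields=(14.1, 21.1)):
    """Return ([R1 at each field], |eta| at the first field) for one parameter set."""
    sf2 = s2_total / ss2

    def j(omega):
        return j_emf(omega, tau_f, tau_s, sf2, ss2)

    r1 = []
    for b0 in fields:
        w_h, w_n = GAMMA_H * b0, GAMMA_N * b0
        c2 = (w_n * DELTA_SIGMA) ** 2
        r1.append(D_DIP ** 2 / 10.0 * (j(w_h - w_n) + 3.0 * j(w_n) + 6.0 * j(w_h + w_n))
                  + 2.0 / 15.0 * c2 * j(w_n))
    w_n = GAMMA_N * fields[0]
    eta = w_n * DELTA_SIGMA * D_DIP / 15.0 * (4.0 * j(0.0) + 3.0 * j(w_n)) * P2_THETA
    return r1, eta


def chi2(theo, expt):
    """Sum of squared relative deviations, as in Eq. 7."""
    r1_t, eta_t = theo
    r1_e, eta_e = expt
    total = sum(((t - e) / e) ** 2 for t, e in zip(r1_t, r1_e))
    return total + ((eta_t - eta_e) / eta_e) ** 2


def grid_search(expt, s2_total, tau_s_grid, ss2_grid):
    """Return (chi2, tau_s, ss2) at the grid point with the smallest chi2."""
    best = None
    for tau_s in tau_s_grid:
        for ss2 in ss2_grid:
            if ss2 < s2_total:      # SF^2 = S^2 / SS^2 must not exceed 1
                continue
            value = chi2(rates(tau_s, ss2, s2_total), expt)
            if best is None or value < best[0]:
                best = (value, tau_s, ss2)
    return best


# Synthetic data generated from assumed "true" parameters (tau_S = 20 ns, SS^2 = 0.90).
S2_TOTAL = 0.819 * 0.90
EXPT = rates(20e-9, 0.90, S2_TOTAL)
TAU_GRID = [k * 1e-9 for k in range(1, 101)]          # 1 to 100 ns
SS2_GRID = [0.75 + 0.01 * k for k in range(25)]       # 0.75 to 0.99
BEST = grid_search(EXPT, S2_TOTAL, TAU_GRID, SS2_GRID)

if __name__ == "__main__":
    print("chi2 = %.3g at tau_S = %.1f ns, SS^2 = %.2f" % (BEST[0], BEST[1] * 1e9, BEST[2]))
```

Plotting chi2 over the whole grid instead of keeping only the minimum would give contour maps of the kind shown in Fig. 8, and dropping individual observables from chi2 would mimic the comparison between its panels.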
We expect that this kind of analysis will become more and more
important in the future to characterize the dynamics of membrane
proteins and amyloid fibrils. However, for soluble/crystalline proteins,
solid-state NMR might also be the method of choice to quantitate
dynamic processes. Overall tumbling, which is the major source of
relaxation in solution, is absent in the solid state. Local motional
processes are directly reflected in the respective relaxation rates,
and the quantitative characterization of dynamics should thus be
more accurate.
In addition to backbone dynamics, information on side-chain
dynamics is obtained from analysis of the 2H Pake tensor. In the
past, specific deuterium labeling was used to investigate the dynamics
of various crystalline and amorphous solids, like liquid crystals (141),
polymers (3, 142, 143), biomembranes (144, 145), membrane
proteins (146–148), and enzymes (149). If the increment in the
2H dimension of a multidimensional experiment is chosen to be small
enough, the resulting spinning sideband manifold can be employed
to extract the anisotropy and asymmetry parameters for the 2H
quadrupolar tensor in uniformly perdeuterated proteins (102, 150).
The (scaled) anisotropy and asymmetry yield the order parameter
and give direct information on the implicated motional model,
respectively. The quadrupolar interaction, which dominates the
deuterium spectral line shape, is very sensitive to molecular motion
over a large kinetic window (3, 151). This method should apply
to motional processes that are fast compared to the size of the
quadrupolar interaction (ca. 165 kHz for a deuteron bound to an sp3-hybridized carbon
(152)). Intermediate motions (10⁻⁴ to 10⁻⁷ s) result in line shape
distortions due to anisotropic 2H T2 relaxation (142, 153, 154).
In crystalline proteins, this scenario is difficult to observe due to low
signal-to-noise ratios. Faster processes can, in principle, be analyzed
by measuring 2H T1 relaxation times. The anisotropy of the spin-
lattice relaxation time T1 was used in the past to study fast molecular
motion (10⁻⁸ to 10⁻¹² s) (151). Uniformly deuterated spin systems,
similar to the case of 15N–2H T1 relaxation times, also suffer from
(2H, 2H) spin diffusion. The measured 2H R1 rates are generally
averaged because of cross talk among the 2H spins (155). An alternative
route to assess fast side-chain dynamics involves the incorporation
of selective 13C labels into the methyl groups, making use of selectively
isotopically enriched amino acid precursors. This can be achieved
by employing α-ketoisovalerate ((12CD3) (13CHD2)-CD-CO-COO−)
in protein biosynthesis, which yields efficient labeling of one methyl
group (−CD2H) in valine and leucine residues (89, 90). Interestingly,
the resulting side-chain 13C-T1 relaxation times match those found
in solution, demonstrating that motional processes in the solid
state and in solution are highly similar (89, 155). This similarity
opens the door for future characterization of biomolecular dynamics
in which MAS solid-state NMR might play a major role given
the fact that relaxation parameters are independent of molecular
tumbling. Thus, local structural fluctuations are accessible with
much higher precision in solid-state than in solution-state
NMR experiments.
References
1. Pauli, J., Van Rossum, B.-J., Förster, H., De Groot, H. J. M., and Oschkinat, H. (2000) Sample Optimization and Identification of Signal Patterns of Amino Acid Side Chains in 2D-RFDR Spectra of the α-Spectrin SH3 Domain. J. Magn. Reson. 143, 411–416.
2. McDermott, A., Polenova, T., Böckmann, A., Zilm, K. W., Paulsen, E. K., Martin, R. W., and Montelione, G. T. (2000) Partial Assignments for uniformly (13C,15N)-enriched BPTI in the solid state. J. Biomol. NMR 16, 209–219.
3. Schmidt-Rohr, K., and Spiess, H. W. (1994) Multidimensional Solid-State NMR and Polymers, Academic Press, London.
4. LeMaster, D. M., and Kushlan, D. M. (1996) Dynamical Mapping of E. coli Thioredoxin via 13C NMR Relaxation Analysis. J. Am. Chem. Soc. 118, 9255–9264.
5. Stringer, J. A., Bronnimann, C. E., Mullen, C. G., Zhou, D. H. H., Stellfox, S. A., Li, Y., Williams, E. H., and Rienstra, C. M. (2005) Reduction of RF-induced sample heating with a scroll coil resonator structure for solid-state NMR probes. J. Magn. Reson. 173, 40–48.
6. Doty, F. D., Kulkarni, J., Turner, C., Entzminger, G., and Bielecki, A. (2006) Using a cross-coil to reduce RF heating by an order of magnitude in triple-resonance multinuclear MAS at high fields. J. Magn. Reson. 182, 239–253.
7. Dillmann, B., Elbayed, K., Zeiger, H., Weingertner, M.-C., Plotto, M., and Engelke, F. (2007) A novel low-E field coil to minimize heating of biological samples in solid-state multinuclear NMR experiments. J. Magn. Reson. 187, 10–18.
8. Martin, R. W., and Zilm, K. W. (2003) Preparation of protein nanocrystals and their characterization by solid state NMR. J. Magn. Reson. 165, 162–174.
9. Franks, W. T., Zhou, D. H., Wylie, B. J., Money, B. G., Graesser, D. T., Frericks, H. L., Gurmukh, S., and Rienstra, C. M. (2005) Magic-Angle Spinning Solid-State NMR Spectroscopy of the beta 1 Immunoglobulin Binding Domain of Protein G (GB1): 15N and 13C Chemical Shift Assignments and Conformational Analysis. J. Am. Chem. Soc. 127, 12291–12305.
10. Lorch, M., Lehner, I., Siarheyeva, A., Basting, D., Pfleger, N., Manolikas, T., and Glaubitz, C. (2005) NMR and fluorescence spectroscopy approaches to secondary and primary active multidrug efflux pumps. Biochem. Soc. Trans. 33, 873–877.
11. Marulanda, D., Tasayco, M. L., Cataldi, M., Arriaran, V., and Polenova, T. (2005) Resonance Assignments and Secondary Structure Analysis of E. coli Thioredoxin by Magic Angle Spinning Solid-State NMR Spectroscopy. J. Phys. Chem. B 109, 18135–18145.
12. Nomura, K., Takegoshi, K., Terao, T., Uchida, K., and Kainosho, M. (1999) Determination of the Complete Structure of a Uniformly Labeled Molecule by Rotational Resonance Solid-State NMR in the Tilted Rotating Frame. J. Am. Chem. Soc. 121, 4064–4065.
13. Rienstra, C. M., Tucker-Kellogg, L., Jaroniec, C. P., Hohwy, M., Reif, B., McMahon, M. T., Tidor, B., Lozano-Pérez, T., and Griffin, R. G. (2002) De Novo Determination of Peptide Structure with Solid-State MAS NMR Spectroscopy. Proc. Natl. Acad. Sci. USA 99, 10260–10265.
14. Castellani, F., van Rossum, B.-J., Diehl, A., Schubert, M., Rehbein, K., and Oschkinat, H. (2002) Structure of a protein determined by solid-state magic-angle spinning NMR. Nature 420, 98–102.
15. Zech, S. G., Wand, A. J., and McDermott, A. E. (2005) Protein Structure Determination by High-Resolution Solid-State NMR Spectroscopy: Application to Microcrystalline Ubiquitin. J. Am. Chem. Soc. 127, 8618–8626.
16. Loquet, A., Bardiaux, B., Gardiennet, C., Blanchet, C., Baldus, M., Nilges, M., Malliavin, T., and Bockmann, A. (2008) 3D structure determination of the Crh protein from highly ambiguous solid-state NMR restraints. J. Am. Chem. Soc. 130, 3579–3589.
17. Franks, W. T., Wylie, B. J., Frericks Schmidt, H. L., Nieuwkoop, A. J., Mayrhofer, R.-M., Shah, G. J., Graesser, D. T., and Rienstra, C. M. (2008) Dipole tensor-based atomic-resolution structure determination of a nanocrystalline protein by solid-state NMR. Proc. Natl. Acad. Sci. USA 105, 4621–4626.
18. Linser, R., Fink, U., and Reif, B. (2010) Narrow carbonyl resonances in proton-diluted proteins facilitate NMR assignments in the solid-state. J. Biomol. NMR 47, 1–6.
19. Tian, Y., Chen, L., Niks, D., Kaiser, J. M., Lai, J., Rienstra, C. M., Dunn, M. F., and Mueller, L. J. (2009) J-Based 3D sidechain correlation in solid-state proteins. Phys. Chem. Chem. Phys. 11, 7078–7086.
20. Turano, P., Lalli, D., Felli, I. C., Theil, E. C., and Bertini, I. (2010) NMR reveals pathway for ferric mineral precursors to the central cavity of ferritin. Proc. Natl. Acad. Sci. USA 107, 545–550.
21. Mainz, A., Jehle, S., van Rossum, B. J., Oschkinat, H., and Reif, B. (2009) Large Protein Complexes with Extreme Rotational Correlation Times Investigated in Solution by Magic-Angle-Spinning NMR Spectroscopy. J. Am. Chem. Soc. 131, 15968–15969.
22. Luca, S., White, J. F., Sohal, A. K., Filippov, D. V., van Boom, J. H., R., G., and Baldus, M. (2003) The conformation of neurotensin bound to its G protein-coupled receptor. Proc. Natl. Acad. Sci. USA 100, 10706–10711.
23. Krabben, L., van Rossum, B. J., Castellani, F., Bocharov, E., Schulga, A. A., Arseniev, A. S., Weise, C., Hucho, F., and Oschkinat, H. (2004) Towards structure determination of neurotoxin II bound to nicotinic acetylcholine receptor: a solid-state NMR approach. FEBS Lett. 564, 319–324.
24. Lange, A., Giller, K., Hornig, S., Martin-Eauclaire, M. F., Pongs, O., Becker, S., and Baldus, M. (2006) Toxin-induced conformational changes in a potassium channel revealed by solid-state NMR. Nature 440, 959–962.
25. Andronesi, O. C., Becker, S., Seidel, K., Heise, H., Young, H. S., and Baldus, M. (2005) Determination of membrane protein structure and dynamics by magic-angle-spinning solid-state NMR spectroscopy. J. Am. Chem. Soc. 127, 12965–12974.
26. Hiller, M., Krabben, L., Vinothkumar, K. R., Castellani, F., Van Rossum, B., Kühlbrandt, W., and Oschkinat, H. (2005) Solid-State Magic-Angle Spinning NMR of Outer-Membrane Protein G from Escherichia coli. ChemBioChem. 6, 1679–1684.
27. Agarwal, V., Fink, U., Schuldiner, S., and Reif, B. (2007) MAS Solid-State NMR Studies on the Multidrug Transporter EmrE. BBA- Biomembranes 1768, 3036–3043.
28. Etzkorn, M., Martell, S., Andronesi, O. C., Seidel, K., Engelhard, M., and Baldus, M. (2007) Secondary structure, dynamics, and topology of a seven-helix receptor in native membranes, studied by solid-state NMR spectroscopy. Angew. Chem. Int. Edt. 46, 459–462.
29. Shi, L., Lake, E. M. R., Ahmed, M. A. M., Brown, L. S., and Ladizhansky, V. (2009) Solid-state NMR study of proteorhodopsin in the lipid environment: Secondary structure and dynamics. Biochim. Biophys. Acta 1788, 2563–2574.
30. Pfleger, N., Woerner, A. C., Yang, J., Shastri, S., Hellmich, U. A., Aslimovska, L., Maier, M. S. M., and Glaubitz, C. (2009) Solid-state NMR and functional studies on proteorhodopsin. Biochim. Biophys. Acta 1787, 697–705.
31. Lange, V., Becker-Baldus, J., Kunert, B., van Rossum, B.-J., Casagrande, F., Engel, A., Roske, Y., Scheffel, F. M., Schneider, E., and Oschkinat, H. (2010) A MAS NMR Study of the Bacterial ABC Transporter ArtMP. ChemBioChem. 11, 547–555.
32. Li, Y., Berthold, D. A., Frericks, H. L., Gennis, R. B., and Rienstra, C. M. (2007) Partial C-13 and N-15 chemical-shift assignments of the disulfide-bond-forming enzyme DsbB by 3D magic-angle spinning NMR spectroscopy. Chembiochem 8, 434–442.
33. Li, Y., Berthold, D. A., Gennis, R. B., and Rienstra, C. M. (2008) Chemical shift assignment of the transmembrane helices of DsbB, a 20-kDa integral membrane enzyme, by 3D magic-angle spinning NMR spectroscopy. Protein Science 17, 199–204.
34. Petkova, A. T., Ishii, Y., Balbach, J. J., Antzutkin, O. N., Leapman, R. D., Delaglio, F., and Tycko, R. (2002) A structural model for Alzheimer’s β-amyloid fibrils based on experimental constraints from solid state NMR. Proc. Natl. Acad. Sci. USA 99, 16742–16747.
35. Petkova, A. T., Leapman, R. D., Guo, Z. H., Yau, W. M., Mattson, M. P., and Tycko, R. (2005) Self-propagating, molecular-level polymorphism in Alzheimer’s beta-amyloid fibrils. Science 307, 262–265.
36. Tycko, R. (2006) Molecular structure of amyloid fibrils: insights from solid-state NMR. Quart. Rev. Biophys. 39, 1–55.
37. Paravastu, A. K., Qahwash, I., Leapman, R. D., Meredith, S. C., and Tycko, R. (2009) Seeded growth of beta-amyloid fibrils from Alzheimer’s brain-derived fibrils produces a distinct fibril structure. Proc. Natl. Acad. Sci. USA 106, 7443–7448.
38. Jaroniec, C. P., MacPhee, C. E., Astrof, N. S., Dobson, C. M., and Griffin, R. G. (2002) Molecular conformation of a peptide fragment of transthyretin in an amyloid fibril. Proc. Natl. Acad. Sci. USA 99, 16748–16753.
39. Jaroniec, C. P., MacPhee, C. E., Bajaj, V. S., McMahon, M. T., Dobson, C. M., and Griffin, R. G. (2004) High-resolution molecular structure of a peptide in an amyloid fibril determined by magic angle spinning NMR spectroscopy. Proc. Natl. Acad. Sci. USA 101, 711–716.
40. Ferguson, N., Becker, J., Tidow, H., Tremmel, S., Sharpe, T. D., Krause, G., Flinders, J., Petrovich, M., Berriman, J., Oschkinat, H., and Fersht, A. R. (2006) General structural motifs of amyloid protofilaments. Proc. Natl. Acad. Sci. USA 103, 16248–16253.
41. Ritter, C., Maddelein, M.-L., Siemer, A. B., Lührs, T., Ernst, M., Meier, B. H., Saupe, S. J., and Riek, R. (2005) Correlation of structural elements and infectivity of the HET-s prion. Nature 435, 844–848.
42. Wasmer, C., Lange, A., Van Melckebeke, H., Siemer, A. B., Riek, R., and Meier, B. H. (2008) Amyloid fibrils of the HET-s(218–289) prion form a beta solenoid with a triangular hydrophobic core. Science 319, 1523–1526.
43. Luca, S., Yau, W.-M., Leapman, R. D., and Tycko, R. (2007) Peptide Conformation and Supramolecular Organization in Amylin Fibrils: Constraints from Solid-State NMR. Biochemistry 46, 13505–13522.
44. Madine, J., Jack, E., Stockley, P. G., Radford, S. E., Serpell, L. C., and Middleton, D. A. (2008) Structural Insights into the Polymorphism of Amyloid-Like Fibrils Formed by Region 20–29 of Amylin Revealed by Solid-State NMR and X-ray Fiber Diffraction. J. Am. Chem. Soc. 130, 14990–15001.
45. Nielsen, J. T., Bjerring, M., Jeppesen, M. D., Pedersen, R. O., Pedersen, J. M., Hein, K. L., Vosegaard, T., Skrydstrup, T., Otzen, D. E., and Nielsen, N. C. (2009) Unique Identification of Supramolecular Structures in Amyloid Fibrils by Solid-State NMR Spectroscopy. Angew. Chem. Int. Edt. 48, 2118–2121.
46. Heise, H., Hoyer, W., Becker, S., Andronesi, O. C., Riedel, D., and Baldus, M. (2005) Molecular-level secondary structure, polymorphism, and dynamics of full-length alpha-synuclein fibrils studied by solid-state NMR. Proc. Natl. Acad. Sci. USA 102, 15871–15876.
47. Kloepper, K. D., Zhou, D. H., Li, Y., Winter, K. A., George, J. M., and Rienstra, C. M. (2007) Temperature-dependent sensitivity enhancement of solid-state NMR spectra of alpha-synuclein fibrils. J. Biomol. NMR 39, 197–211.
48. Loquet, A., Luc, Gardiennet, C., Sourigues, Y., Wasmer, C., Habenstein, B., Schutz, A., Meier, B. H., Melki, R., and Bockmann, A. (2009) Prion Fibrils of Ure2p Assembled under Physiological Conditions Contain Highly Ordered, Natively Folded Modules. J. Mol. Biol. 394, 108–118.
49. Sun, S., Siglin, A., Williams, J. C., and Polenova, T. (2009) Solid-State and Solution NMR Studies of the CAP-Gly Domain of Mammalian Dynactin and Its Interaction with Microtubules. J. Am. Chem. Soc. 131, 10113–10126.
50. Ahmed, S., Sun, S., Siglin, A. E., Polenova, T., and Williams, J. C. (2010) Disease-Associated Mutations in the p150(Glued) Subunit Destabilize the CAP-gly Domain. Biochemistry 49, 5083–5085.
51. Jehle, S., van Rossum, B.-J., Stout, J. R., Noguchi, S. M., Falber, K., Rehbein, K., Oschkinat, H., Klevit, R. E., and Rajagopal, P. (2009) alpha B-Crystallin: A Hybrid Solid-State/Solution-State NMR Investigation Reveals Structural Aspects of the Heterogeneous Oligomer. J. Mol. Biol. 385, 1481–1497.
52. Baldus, M. (2002) Correlation experiments for assignment and structure elucidation of immobilized polypeptides under magic angle spinning. Prog. NMR Spect. 41, 1–47.
53. Baldus, M. (2006) Molecular interactions investigated by multi-dimensional solid-state NMR. Curr. Opin. Struct. Biol 16, 618–623.
54. Brown, S. P. (2007) Probing proton–proton proximities in the solid state. Prog. NMR Spect. 50, 199–251.
55. Böckmann, A. (2008) 3D protein structures by solid-state NMR: ready for high resolution. Angew. Chem. Int. Edt. 47, 6110–6113.
56. Wylie, B. J., and Rienstra, C. M. (2008) Multidimensional solid state NMR of anisotropic interactions in peptides and proteins. J. Chem. Phys. 128, 052207.
57. McDermott, A. (2009) Structure and Dynamics of Membrane Proteins by Magic Angle Spinning Solid-State NMR. Ann. Rev. Biophys. 38, 385–403.
58. Schnabel, B., Haubenreisser, U., Scheler, G., and Müller, R. (1976) in 19th Congress Ampere pp 441, Heidelberg.
59. Gerstein, B. C., Chow, C., Pembleton, R. G., and Wilson, R. C. (1977) Utility of Pulse Nuclear Magnetic Resonance in Studying Protons in Coals. J. Phys. Chem. 81, 565–570.
60. Burum, D. P. (1990) Combined Rotation and Multiple Pulse Spectroscopy (CRAMPS). Concepts in Magn. Reson. 2, 213–227.
61. Waugh, J. S., Huber, L. M., and Haeberlen, U. (1968) Approach to High-Resolution NMR in Solids. Phys. Rev. Lett. 20, 180.
62. Burum, D. P., and Rhim, W. K. (1979) Analysis of multiple pulse NMR in solids .3. J. Chem. Phys. 71, 944–956.
63. Hohwy, M., Bower, P. V., Jakobsen, H. J., and Nielsen, N. C. (1997) A high-order and broadband CRAMPS experiment using z-rotational decoupling. Chem. Phys. Lett. 273, 297–303.
64. Bielecki, A., Kolbert, A. C., and Levitt, M. H. (1989) Frequency-switched pulse sequences – Homonuclear decoupling and dilute spin NMR in solids. Chem. Phys. Lett. 155, 341–346.
65. Vinogradov, E., Madhu, P. K., and Vega, S. (1999) High-resolution proton solid-state NMR spectroscopy by phase-modulated Lee-Goldburg experiment. Chem. Phys. Lett. 314, 443–450.
66. Vinogradov, E., Madhu, P. K., and Vega, S. (2002) Proton spectroscopy in solid state nuclear magnetic resonance with windowed phase modulated Lee–Goldburg decoupling sequences. Chem. Phys. Lett. 354, 193–202.
67. Lesage, A., Sakellariou, D., Hediger, S., Elena, B., Charmont, P., Steuernagel, S., and Emsley, L. (2003) Experimental aspects of proton NMR spectroscopy in solids using phase-modulated homonuclear dipolar decoupling. J. Magn. Reson. 163, 105–113.
68. Madhu, P. K., Zhao, X., and Levitt, M. H. (2001) High-resolution H-1 NMR in the solid state using symmetry-based pulse sequences. Chem. Phys. Lett. 346, 142–148.
69. Samoson, A., Tuherm, T., Past, J., Reinhold, A., Anupold, T., and Heinmaa, N. (2005) New horizons for magic-angle spinning NMR. Top. Curr. Chem. 246, 15–31.
70. LeMaster, D. M., and Richards, F. M. (1988) NMR Sequential Assignment of Escherichia Coli Thioredoxin Utilizing Random Fractional Deuteration. Biochemistry 27, 142–150.
71. LeMaster, D. M. (1989) Deuteration in protein proton magnetic resonance. Methods Enzymol. 177, 23–43.
72. Kay, L. E., and Gardner, K. H. (1997) Solution NMR spectroscopy beyond 25 kDa. Curr. Op. Struct. Biol. 7, 722–731.
73. McDermott, A. E., Creuzet, F. J., Kolbert, A. C., and Griffin, R. G. (1992) High-Resolution Magic-Angle-Spinning NMR Spectra of Protons in Deuterated Solids. J. Magn. Reson. 98, 408–413.
74. Zheng, L., Fishbein, K. W., Griffin, R. G., and Herzfeld, J. (1993) Two-Dimensional Solid-State 1H NMR and Proton Exchange. J. Am. Chem. Soc. 115, 6254–6261.
75. Zorin, V. E., Brown, S. P., and Hodgkinson, P. (2006) Origins of linewidth in 1H magic-angle spinning NMR. J. Chem. Phys. 125, 144508.
76. Reif, B., Jaroniec, C. P., Rienstra, C. M., Hohwy, M., and Griffin, R. G. (2001) 1H-1H MAS Correlation Spectroscopy and Distance Measurements in a Deuterated Peptide. J. Magn. Reson. 151, 320–327.
77. Reif, B., and Griffin, R. G. (2003) 1H detected 1H,15N Correlation Spectroscopy in Rotating Solids. J. Magn. Reson. 160, 78–83.
78. Zhou, D. H., Graesser, D. T., Franks, W. T., and Rienstra, C. M. (2006) Sensitivity and resolution in proton solid-state NMR at intermediate deuteration levels: Quantitative linewidth analysis and applications to correlation spectroscopy. J. Magn. Reson. 178, 297–307.
79. Chevelkov, V., van Rossum, B. J., Castellani, F., Rehbein, K., Diehl, A., Hohwy, M., Steuernagel, S., Engelke, F., Oschkinat, H., and Reif, B. (2003) 1H detection in MAS solid state NMR spectroscopy employing pulsed field gradients for residual solvent suppression. J. Am. Chem. Soc. 125, 7788–7789.
80. Reif, B., van Rossum, B. J., Castellani, F., Rehbein, K., Diehl, A., and Oschkinat, H. (2003) Determination of 1H 1H distances in a uniformly 2H,15N labeled SH3 domain by MAS solid state NMR spectroscopy. J. Am. Chem. Soc. 125, 1488–1489.
81. Paulson, E. K., Morcombe, C. R., Gaponenko, V., Dancheck, B., Byrd, R. A., and Zilm, K. W. (2003) High-Sensitivity Observation of Dipolar Exchange and NOEs between Exchangeable Protons in Proteins by 3D Solid-State NMR Spectroscopy. J. Am. Chem. Soc. 125, 14222–14223.
82. Paulson, E. K., Morcombe, C. R., Gaponenko, V., Dancheck, B., Byrd, R. A., and Zilm, K. W. (2003) Sensitive High Resolution Inverse Detection NMR Spectroscopy of Proteins in the Solid State. J. Am. Chem. Soc. 125, 15831–15836.
domain of alpha-spectrin by MAS solid-state NMR. J. Biomol. NMR 31, 295–310.
84. Zhou, D. H., and Rienstra, C. M. (2008) High-Performance Solvent Suppression for Proton-Detected Solid-State NMR. J. Magn. Reson. 192, 167–172.
85. Morcombe, C. R., Paulson, E. K., Gaponenko, V., Byrd, R. A., and Zilm, K. W. (2005) H-1-N-15 correlation spectroscopy of nanocrystalline proteins. J. Biomol. NMR 31, 217–230.
86. Chevelkov, V., Rehbein, K., Diehl, A., and Reif, B. (2006) Ultra-high resolution in proton solid-state NMR at high levels of deuteration. Angew. Chem. Int. Ed. 45, 3878–3881.
87. Agarwal, V., and Reif, B. (2008) Residual Methyl Protonation in Perdeuterated Proteins for Multidimensional Correlation Experiments in MAS solid-state NMR Spectroscopy. J. Magn. Reson. 194, 16–24.
88. Agarwal, V., Diehl, A., Skrynnikov, N., and Reif, B. (2006) High Resolution 1H Detected 1H,13C Correlation Spectra in MAS Solid-State NMR using Deuterated Proteins with Selective 1H,2H Isotopic Labeling of Methyl Groups. J. Am. Chem. Soc. 128, 12620–12621.
89. Agarwal, V., Xue, Y., Reif, B., and Skrynnikov, N. R. (2008) Protein side-chain dynamics as observed by solution- and solid-state NMR: a similarity revealed. J. Am. Chem. Soc. 130, 16611–16621.
90. Goto, N., and Kay, L. E. (2000) New developments in isotope strategies for protein solution NMR spectroscopy. Curr. Opin. Cell Biol. 10, 585–592.
91. Zhou, D. H., Shah, G., Cormos, M., Mullen, C., Sandoz, D., and Rienstra, C. M. (2007) Proton-detected solid-state NMR Spectroscopy of fully protonated proteins at 40 kHz magic-angle spinning. J. Am. Chem. Soc. 129, 11791–11801.
92. Zhou, D. H., Shea, J. J., Nieuwkoop, A. J., Franks, W. T., Wylie, B. J., Mullen, C., Sandoz, D., and Rienstra, C. M. (2007) Solid-State Protein-Structure Determination with Proton-Detected Triple-Resonance 3D Magic-Angle-Spinning NMR Spectroscopy. Angew. Chemie Int. Edt. 46, 8380–8383.
93. Agarwal, V., Faelber, K., Schmieder, P., and Reif, B. (2009) High-Resolution Double-Quantum Deuterium Magic Angle Spinning Solid-State NMR Spectroscopy of Perdeuterated Proteins. J. Am. Chem. Soc. 131, 2–3.
94. Tosner, Z., Vosegaard, T., Kehlet, C., Khaneja,
83. Chevelkov, V., Faelber, K., Diehl, A., N., Glaser, S. J., and Nielsen, N. C. (2009)
Heinemann, U., Oschkinat, H., and Reif, B. Optimal control in NMR spectroscopy:
(2005) Detection of dynamic water mole- Numerical implementation in SIMPSON.
cules in a microcrystalline sample of the SH3 J. Magn. Reson. 197, 120–134.
95. Agarwal, V., Linser, R., Fink, U., Faelber, K., Backbone Dynamics in a Crystalline Protein
and Reif, B. (2010) Identification of Hydroxyl from Nitrogen-15 Spin-Lattice Relaxation.
Protons, Determination of their Exchange J. Am. Chem. Soc. 127, 18190–18201.
Dynamics, and Characterization of Hydrogen 107. Giraud, N., Blackledge, M., Böckmann, A.,
Bonding by MAS solid-state NMR Spectro- and Emsley, L. (2007) The influence of nitro-
scopy in a Microcrystalline Protein. J. Am. gen-15 proton-driven spin diffusion on the
Chem. Soc. 132, 3187–3195. measurement of nitrogen-15 longitudinal
96. Mamone, S., Dorsch, A., Johannessen, O. G., relaxation times. J. Magn. Reson. 184, 51–61.
Naik, M. V., Madhu, P. K., and Levitt, M. H. 108. Chevelkov, V., Diehl, A., and Reif, B. (2008)
(2008) A Hall effect angle detector for solid- Measurement of 15 N-T1 Relaxation Rates in a
state NMR. J. Magn. Reson. 190, 135–141. Perdeuterated Protein by MAS Solid-State
97. Chevelkov, V., Faelber, K., Schrey, A., NMR Spectroscopy. J. Chem. Phys. 128,
Rehbein, K., Diehl, A., and Reif, B. (2007) 052316.
Differential Line Broadening in MAS solid- 109. Chevelkov, V., Zhuravleva, A. V., Xue, Y., Reif,
state NMR due to Dynamic Interference. B., and Skrynnikov, N. R. (2007) Combined
J. Am. Chem. Soc. 129, 10195–10200. Analysis of 15 N Relaxation Data from Solid-
98. Bak, M., Rasmussen, J. T., and Nielsen, N. C. and Solution-State NMR Spectroscopy. J. Am.
(2000) SIMPSON: A General Simulation Chem. Soc. 129, 12594–12595.
Program for Solid-State NMR Spectroscopy. 110. Chevelkov, V., Xue, Y., Linser, R., Skrynnikov,
J. Magn. Reson. 147, 296–330. N. R., and Reif, B. (2010) Comparison of
99. Wickramasinghe, N. P., Kotecha, M., Solid-State Dipolar Couplings and Solution
Samoson, A., Past, J., and Ishii, Y. (2007) Relaxation Data Provides Insight into Protein
Sensitivity enhancement in C-13 solid-state Backbone Dynamics. J. Am. Chem. Soc. 132,
NMR of protein microcrystals by use of para- 5015–5017.
magnetic metal ions for optimizing H-1 T-1 111. Torchia, D. A., and Szabo, A. (1982) Spin-
relaxation. J. Magn. Reson. 184, 350–356. Lattice Relaxation in Solids. J. Magn. Reson.
100. Linser, R., Chevelkov, V., Diehl, A., and Reif, B. 49, 107–121.
(2007) Sensitivity Enhancement Using 112. Cavanagh, J., Fairbrother, W. J., Palmer, A.
Paramagnetic Relaxation in MAS Solid State G., and Skelton, N. J. (1996) Protein NMR
NMR of Perdeuterated Proteins. J. Magn. Spectroscopy: Principles and Practice,
Reson. 189, 209–216. Academic Press, San Diego.
101. Linser, R., Fink, U., and Reif, B. (2008) 113. Chekmenev, E. Y., Zhang, Q., Waddell, K.
Proton-detected Scalar Coupling based W., Mashuta, M. S., and Wittebort, R. J.
Assignment Strategies in MAS Solid-State (2004) 15 N Chemical Shielding in Glycyl
NMR Spectroscopy applied to Perdeuterated Tripeptides: Measurement by Solid-State
Proteins J. Magn. Reson. 193, 89–93. NMR and Correlation with X-ray Structure.
102. Hologne, M., Faelber, K., Diehl, A., and Reif, J. Am. Chem. Soc. 126, 379–384.
B. (2005) Characterization of Dynamics of 114. Wylie, B. J., Franks, W. T., and Rienstra, C.
Perdeuterated Proteins by MAS Solid-State M. (2006) Determinations of N-15 chemical
NMR. J. Am. Chem. Soc. 127, 11208–11209. shift anisotropy magnitudes in a uniformly
103. Hologne, M., Chen, Z., and Reif, B. (2006) N-15, C-13-labeled microcrystalline protein
Characterization of dynamic processes using by three-dimensional magic-angle spinning
deuterium in uniformly 2H,13C,15N enriched nuclear magnetic resonance spectroscopy.
peptides by MAS solid-state NMR. J. Magn. J. Phys. Chem. B 110, 10926–10936.
Reson. 179, 20–28. 115. Hall, J. B., and Fushman, D. (2006) Variability
104. Cole, H. B. R., and Torchia, D. A. (1991) An of the N-15 chemical shielding tensors in the
NMR-study of the Backbone Dynamics of B3 domain of protein G from N-15 relaxation
Staphylococcal Nuclease in the Crystalline measurements at several fields. Implications
State. Chem. Phys. 158, 271–281. for backbone order parameters. J. Am. Chem.
105. Giraud, N., Böckmann, A., Lesage, A., Penin, Soc. 128, 7855–7870.
F., Blackledge, M., and Emsley, L. (2004) Site- 116. Wylie, B. J., Sperling, L. J., Frericks, H. L.,
Specific Backbone Dynamics from a Crystalline Shah, G. J., Franks, W. T., and Rienstra, C.
Protein by Solid-State NMR Spectroscopy. M. (2007) Chemical-shift anisotropy mea-
J. Am. Chem. Soc. 126, 11422–11423. surements of amide and carbonyl resonances
106. Giraud, N., Blackledge, M., Goldman, M., in a microcrystalline protein with slow magic-
Böckmann, A., Lesage, A., Penin, F., and angle spinning NMR spectroscopy. J. Am.
Emsley, L. (2005) Quantitative Analysis of Chem. Soc. 129, 5318–5319.
300 B. Reif
117. Yao, L., Vögeli, B., Ying, J., and Bax, A. 129. Reif, B., Diener, A., Hennig, M., Maurer, M.,
(2008) NMR Determination of Amide N-H and Griesinger, C. (2000) Cross Correlated
Equilibrium Bond Length from Concerted Relaxation for the Measurement of Angles
Dipolar Coupling Measurements. J. Am. between Tensorial Interactions. J. Magn.
Chem. Soc. 130, 16518–16520. Reson. 143, 45–68.
118. Zilm, K. W., and Grant, D. M. (1981) 130. Chevelkov, V., Diehl, A., and Reif, B. (2007)
Carbon-13 Dipolar Spectroscopy of Small Quantitative Measurement of Differential
Organic Molecules in Argon Matrices. J. Am. 15
N-Hα/β T2 Relaxation Times in a
Chem. Soc. 103, 2913–2922. Perdeuterated Protein by MAS Solid-State
119. Harris, R. K., Packer, K. J., and Thayer, A. M. NMR Spectroscopy. Magn. Reson. Chem. 45,
(1985) Slow Magic-Angle Rotation 13 C S156–S160.
NMR Studies of Solid Phosphonium Iodides. 131. Lorieau, J. L., and McDermott, A. E. (2006)
The Interplay of Dipolar, Shielding and Order parameters based on (CH)-C-13-H-1,
Indirect Coupling Tensors. J. Magn. Reson. (CH2)-C-13-H-1 and (CH3)-C-13-H-1 het-
62, 284–297. eronuclear dipolar powder patterns: a compari-
120. Griffey, D., and Redfield, A. (1987) Proton- son of MAS-based solid-state NMR sequences.
deteceted heteronuclear edited and correlated Magn. Reson. Chem. 44, 334–347.
nuclear-magnetic-resonance and nuclear 132. Lorieau, J. L., and McDermott, A. E. (2006)
Overhauser effect in solution. Quart. Rev. Conformational Flexibility of a Microcrystalline
Biophys. 19, 51–82. Globular Protein: Order Parameters by Solid-
121. Wu, G., Sun, B., Wasylishen, R. E., and State NMR Spectroscopy. J. Am. Chem. Soc.
Griffin, R. G. (1997) Spinning Sidebands in 128, 11505–11512.
Slow-Magic-Angle-Spinning NMR Spectra 133. Lorieau, J. L., Day, L. A., and McDermott, A.
Arising from Tightly J-Coupled Spin Pairs. J. E. (2008) Conformational dynamics of an
Magn. Reson. 124, 366–371 intact virus: Order parameters for the coat
122. Duma, L., Hediger, S., Lesage, A., Sakellariou, protein of Pf1 bacteriophage. Proc. Natl
D., and Emsley, L. (2003) Carbon-13 lineshapes Acad. Sci. USA 105, 10366–10371.
in solid-state NMR of labeled compounds. 134. Chevelkov, V., Fink, U., and Reif, B. (2009)
Effects of coherent CSA-dipolar cross-correla- Accurate Determination of Order Parameters
tion. J. Magn. Reson. 162, 90–101. from 1 H,15N Dipolar Couplings in MAS
123. Igumenova, T. I., and McDermott, A. E. solid-state NMR experiments. J. Am. Chem.
(2003) Improvement of resolution in solid Soc. 131, 14018–14022.
state NMR spectra with J-decoupling: an 135. Wu, X. L., and Zilm, K. W. (1993) Cross-
analysis of lineshape contributions in uni- Polarization with High-Speed Magic-Angle
formly 13 C-enriched amino acids and pro- Spinning. J. Magn. Reson. A 104, 154–165.
teins. J. Magn. Reson. 164, 270–285. 136. Dvinskikh, S. V., Zimmermann, H., Maliniak,
124. Maricq, M. M., and Waugh, J. S. (1979) A., and Sandstrom, D. (2003) Heteronuclear
NMR in rotating solids. J. Chem. Phys. 70, dipolar recoupling in liquid crystals and solids
3300–3316. by PISEMA-type pulse sequences. J. Magn.
125. Skrynnikov, N. R. (2007) Asymmetric doublets Reson. 164, 165–170.
in MAS NMR: coherent and incoherent mech- 137. Dvinskikh, S. V., Zimmermann, H., Maliniak,
anisms. Magn. Reson. Chem. 45, S161–S173. A., and Sandström, D. (2005) Heteronuclear
126. Pervushin, K., Riek, R., Wider, G., and dipolar recoupling in solid-state nuclear mag-
Wüthrich, K. (1997) Attenuated T2 relaxation netic resonance by amplitude-, phase-, and
by mutual cancellation of dipole-dipole cou- frequency-modulated Lee–Goldburg cross-
pling and chemical shift anisotropy indicates polarization. J. Chem. Phys. 122, 044512.
an avenue to NMR structures of very large 138. Lipari, G., and Szabo, A. (1982) Model-Free
biological macromolecules in solution. Proc. Approach to the Interpretation of Nuclear
Natl. Acad. Sci. USA 94, 12366–12371. Magnetic Resonance Relaxation in
127. Tjandra, N., Szabo, A., and Bax, A. (1996) Macromolecules. 1. Theory and Range of
Protein Backbone Dynamics and 15 N Validity. J. Am. Chem. Soc. 104, 4546–4559.
Chemical Shift Anisotropy from Quantitative 139. Clore, G. M., Szabo, A., Bax, A., Kay, L. E.,
Measurement of Relaxation Interference Driscoll, P. C., and Gronenborn, A. M.
Effects. J. Am. Chem. Soc. 118, 6986–6991. (1990) Deviations from the Simple
128. Reif, B., Hennig, M., and Griesinger, C. 2-Parameter Model-Free Approach to the
(1997) Direct Measurement of Angles Interpretation of N-15 Nuclear Magnetic
Between Bond Vectors in High-Resolution Relaxation of Proteins. J. Am. Chem. Soc.
NMR. Science 276, 1230–1233. 112, 4989–4991.
140. Chevelkov, V., Fink, U., and Reif, B. (2009) site on the nicotinic acetylcholine receptor.
Analysis of the Dynamics of Backbone Motion Proc. Natl. Acad. Sci. USA 98, 2346–2351.
in the Solid-State. J. Biomol. NMR 45, 148. Howard, K. P., Liu, W., Crocker, E., Nanda,
197–206. V., Lear, J., Degrado, W. F., and Smith, S. O.
141. Sandström, D., and Zimmermann, H. (2000) (2005) Rotational orientation of monomers
Correlation of deuterium quadrupolar cou- within a designed homo-oligomer transmem-
plings and carbon-13 chemical shifts in brane helical bundle. Protein Sci. 14,
ordered media by multiple-quantum NMR. J. 1019–1024.
Phys. Chem. B 104, 1490–1493. 149. Williams, J. C., and McDermott, A. E. (1995)
142. Spiess, H. (1985) Deuteron NMR – a new Dynamics of the flexible loop of triosephos-
tool for studying chain mobility and orien- phate isomerase – the loop motion is not
tation in polymers. Adv. Polym. Sci. 66, ligand-gated Biochemistry 34, 8309–8319.
23–58. 150. Hologne, M., Chevelkov, V., and Reif, B.
143. Hirschinger, J., Miura, H., Gardner, K. H., (2006) Deuteration of Peptides and Proteins
and English, A. D. (1990) Segmental dynamin MAS Solid-State NMR. Prog. NMR Spect.
ics in the crystalline phase of Nylon 66 : Solid 48, 211–232.
State 2 H NMR. Macromolecules 23, 151. Hoatson, G. L., and Vold, R. L. (1994) 2 H
2153–2169. NMR Spectroscopy of Solids and Liquid
144. Seelig, J. (1977) Deuterium magnetic reso- Crystals. NMR Basic Principles and Progress
nance: Theory and application to lipid mem- 32, 3–61.
branes. Q. Rev. Biophys. 10, 353–418. 152. Emsley, J. W. (2002) Solid-State NMR
145. Davis, J. H. (1983) The description of mem- Spectroscopy- Principles and Applications,
brane lipid conformation, order and dynamics Duer, M.J. edt., Blackwell Science, Oxford.
by 2 H NMR. Biochim. Biophys. Acta 737, 153. Wittebort, R. J., Olejniczak, E. T., and Griffin,
117–171. R. G. (1987) Analysis of deuterium nuclear
146. Copié, V., McDermott, A. E., Beshah, K., magnetic resonance line shapes in anisotropic
Williams, J. C., Spijker-Assink, M., Gebhard, media. J. Chem. Phys. 86, 5411–5420.
R., Lugtenburg, J., Herzfeld, J., and Griffin, 154. Hologne, M., and Hirschinger, J. (2004)
R. G. (1994) Deuterium Solid-State Nuclear Molecular Dynamics as Studied by Static-
Magnetic Resonance Studies of Methyl Group Powder and MAS 2 H NMR. Solid State NMR
Dynamics in Bacteriorhodopsin and Retinal 26, 1–10.
Model Compounds: Evidence for a 6-s-Trans 155. Reif, B., Xue, Y., Agarwal, V., Pavlova, M. S.,
Chromophore in the Protein. Biochemistry. Hologne, M., Diehl, A., Ryabov, Y. E., and
33, 3280–3286. Skrynnikov, N. R. (2006) Protein Side-Chain
147. Williamson, P. T. F., Watts, J. A., Addona, G. Dynamics Observed by Solution- and Solid-
H., Miller, K. W., and Watts, A. (2001) state NMR: Comparative Analysis of Methyl
Dynamics and orientation of N+ (CD3) 2
H Relaxation Data. J. Am. Chem. Soc. 128,
(3)-bromoacetylcholine bound to its binding 12354–12355.
Chapter 17
Solid-State NMR Spectroscopy of Protein Complexes

Shangjin Sun, Yun Han, Sivakumar Paramasivam, Si Yan,
Amanda E. Siglin, John C. Williams, In-Ja L. Byeon,
Jinwoo Ahn, Angela M. Gronenborn, and Tatyana Polenova
Abstract
Protein–protein interactions are vital for many biological processes. These interactions often result in the
formation of protein assemblies that are large in size, insoluble, and difficult to crystallize, and therefore
are challenging to study by structural biology techniques, such as single crystal X-ray diffraction and solution
NMR spectroscopy. Solid-state NMR (SSNMR) spectroscopy is emerging as a promising technique
for studies of such protein assemblies because it is not limited by molecular size, solubility, or lack of long-
range order. In the past several years, we have applied magic angle spinning SSNMR-based methods to
study several protein complexes. In this chapter, we discuss the general SSNMR methodologies employed
for structural and dynamics analyses of protein complexes with specific examples from our work on thiore-
doxin reassemblies, HIV-1 capsid protein assemblies, and microtubule-associated protein assemblies. We
present protocols for sample preparation and characterization, pulse sequences, SSNMR spectra collection,
and data analysis.
Key words: SSNMR, Magic angle spinning, Protein complexes
1. Introduction
Protein–protein interactions are involved in many important

biological processes such as signal transduction (1), cellular trans-
port (2), viral infection (3, 4), and immune response (5). These
interactions often result in large protein complexes that are insoluble
and difficult to crystallize. Because of the insolubility and inherent
lack of long-range order in protein assemblies, the mature structural
techniques that yield atomic-level information, such as solution
NMR spectroscopy and X-ray crystallography, cannot be applied to
studies of such protein complexes. Solid-state NMR (SSNMR)
spectroscopy has emerged as one of the very few techniques that

can yield atomic level structural information for these types of sys-
tems. Recently, several studies have been reported on SSNMR
applications for analysis of protein assemblies, such as bacterio-
phage viruses (6), oligomeric membrane peptides and proteins
(7–10), amyloid fibrils (11–17), HIV-1 capsid protein assemblies
(18), microtubule-associated protein assemblies (19), as well as
assemblies of soluble proteins (20–22). The major strength of
SSNMR spectroscopy is that there is no intrinsic limitation on
molecular size or solubility, and long-range order is not required.
In large systems where the resonance lines are narrow but spectral
congestion presents a challenge, sparse (23), differential (20), and
selective isotopic labeling (24, 25) enables simplification of SSNMR
spectra and hence detailed atomic-resolution information can be
attained (21, 22, 26, 27). Furthermore, with SSNMR spectros-
copy residue-specific dynamics can be probed for protein com-
plexes on multiple timescales ranging from picoseconds to many
seconds (28), which fosters a deeper understanding of their bio-
logical function.
Resonance assignment (or chemical shift assignment) is a pre-
requisite for extracting site-specific structural and dynamics infor-
mation in proteins by NMR spectroscopy, including SSNMR (29).
NMR experiments for resonance assignments generate two types
of information. The first is correlations between atoms within the
same residue, which allow for amino acid type identification. The
second is correlations between atoms belonging to neighboring
residues, which allow for establishing sequential connectivities.
With the intraresidue and sequential correlations and from the
known primary sequence of a protein, site-specific resonance
assignments are extracted. In SSNMR spectroscopy, either through-
space (dipolar) or through-bond (scalar) correlation spectroscopy
can be employed for assignments under the magic angle spinning
(MAS) conditions (Fig. 1 illustrates the orientation of the sample
rotor with respect to the static magnetic field). MAS (30) frequencies
of 8–20 kHz are usually employed for multidimensional correla-
tion spectroscopy. In the multidimensional MAS NMR correlation
experiments, the typical building blocks for constructing the pulse
sequences are (1) cross polarization (CP) for 1H-15N or 1H-13C
polarization transfer, (2) double cross polarization (DCP) (31) and
its band-selective version (SPECIFIC-CP) (32) for 15N–13C polar-
ization transfer, (3) PDSD (33), DARR (34), DREAM (35),
RFDR (36), SPC5 (37), and several other sequences for 13C–13C
magnetization transfer through proton-driven spin diffusion or its
rotary-assisted variant or by direct 13C–13C dipolar recoupling, (4)
TOBSY (38), CTUC-COSY (39–41), and several other sequences
for 13C–13C magnetization transfer through scalar couplings. For
reviews of the homonuclear and heteronuclear dipolar recoupling
methods see refs. 42, 43, and 44, respectively. Figure 2 shows the
typical 2D and 3D MAS NMR experiments for NMR assignments
based on these building blocks, and the corresponding 2D and 3D
spectra for thioredoxin reassembly, CAP-Gly/MT reassembly, and
HIV-1 CA assembly are presented in Fig. 3.
Fig. 1. Magic angle spinning: (a) the sample rotor is spun at a 54.7° angle (magic angle) with respect to the static magnetic
field; (b) a Varian 3.2-mm thick wall rotor is loaded into a Varian T3 probe. The stator holds the rotor at the magic angle and
allows the bearing/drive air flow to spin the rotor at the desired frequency.
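The 54.7° magic angle and the MAS frequencies quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration and not part of the published protocol: the function names are hypothetical, and it assumes the standard result that rank-2 anisotropic interactions scale with the second Legendre polynomial P2(cos θ), which vanishes at the magic angle.

```python
import math

def second_legendre(theta_rad: float) -> float:
    """Return P2(cos theta) = (3 cos^2(theta) - 1) / 2.

    This is the angular factor that scales rank-2 anisotropic
    interactions (dipolar coupling, CSA) under sample spinning.
    """
    c = math.cos(theta_rad)
    return 0.5 * (3.0 * c * c - 1.0)

def rotor_period_us(mas_khz: float) -> float:
    """Rotor period in microseconds for a MAS frequency given in kHz."""
    return 1000.0 / mas_khz

# The magic angle is the root of P2(cos theta) = 0, i.e. cos(theta) = 1/sqrt(3).
magic_angle_rad = math.acos(1.0 / math.sqrt(3.0))
magic_angle_deg = math.degrees(magic_angle_rad)

if __name__ == "__main__":
    print(f"magic angle: {magic_angle_deg:.2f} degrees")
    print(f"P2 at magic angle: {second_legendre(magic_angle_rad):.1e}")
    # Rotor periods at the two ends of the 8-20 kHz range mentioned in the text.
    for nu_r in (8.0, 20.0):
        print(f"{nu_r:.0f} kHz MAS -> rotor period {rotor_period_us(nu_r):.1f} us")
```

The rotor period matters in practice because rotor-synchronized elements, such as the REDOR π-pulse train used later in this chapter, are timed in units of it.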
Differential (20), selective (24, 25), and sparse (23) isotopic
labeling enables simplification of NMR spectra as well as the
distinction between intra- and intermolecular correlations.
Nonuniform labeling is commonly employed in the structural
and dynamics analysis of large protein assemblies by SSNMR
spectroscopy. Differential labeling with paramagnetic tags for
gaining long-range intermolecular constraints in protein inter-
faces is another emerging area (45–47). There is quite extensive
literature on applications of these various labeling schemes to
solid-state protein NMR spectroscopy (20, 21, 48, 49). In this
chapter, we discuss one of the possible labeling schemes, namely
the differential labeling of two interacting proteins where one
molecule is enriched in 15N, and the second molecule in 13C, 15N.
This approach enables detailed structural analysis of the 13C,15N-labeled
protein and at the same time extraction of the intermolecular
interface information by a suitable dipolar dephasing
technique. We employed this labeling protocol in the
1-73(U-13C,15N)/74-108(U-15N) thioredoxin reassembly
(Subheading 3.1), and developed a set of 2D MAS NMR experi-
ments, which allow for simultaneous identification of the residues
constituting the intermolecular interface and resonance
assignment of the binding partners (27). These experiments,
REDOR–PAINCP, REDOR–PDSD, REDOR–HETCOR, and
HETCOR–REDOR, are presented below. Pulse sequences for
these experiments are shown in Fig. 4 and representative spectra
acquired by these experiments are shown in Fig. 5.
Fig. 2. Pulse sequences for resonance assignments of proteins in MAS solid-state NMR: (a) 2D 13C–13C DARR; (b) 2D
dipolar-based NCA/NCO with SPECIFIC-CP for heteronuclear 15N–13C polarization transfer; (c) 3D dipolar-based NCACX or
NCOCX with SPECIFIC-CP and DARR mixing periods for 15N–13C and 13C–13C polarization transfers, respectively; (d) 3D
dipolar-based NCACB with SPECIFIC-CP and DREAM mixing periods for 15N–13C and 13C–13C polarization transfers, respectively.
Filled and open rectangles represent π/2 and π pulses, respectively, unless specified otherwise.
Fig. 3. Representative solid-state NMR spectra for resonance assignments of protein complexes. (a) 2D spectra of the
1–73(U-13C,15N)/74–108(U-15N) thioredoxin reassembly demonstrating the examples of intraresidue and sequential backbone
and side chain assignments; (a1) 13C–13C DARR; (a2) NCO and (a3) NCA. All spectra are recorded at 14.1 T with the
MAS frequency of 10 kHz. Reproduced from ref. 22 with permission from John Wiley and Sons. (b) Overlay of 2D DARR
spectra of CAP-Gly/MT (black) and CAP-Gly alone (green). The spectra of free CAP-Gly and of CAP-Gly/MT complex are
acquired at 21.1 T and MAS frequency of 14 kHz. (b2) and (b3) are expansions around selected aliphatic regions (Cα–Cβ or
Cα–Cγ correlations) to demonstrate chemical shift perturbations of CAP-Gly upon binding to microtubules. Reproduced from
ref. 19 with permission from the American Chemical Society. (c) Sequential backbone connectivity for the sequence stretch
A105-L111 in HIV-1 CA assemblies of conical morphology based on 3D NCOCX, NCACX, and NCACB experiments at 14.1 T
and MAS frequency of 10 kHz. The residue names are shown on top of the spectra at their 15N chemical shift plane.
Negative cross-peaks resulting from two-bond N–Cβ correlations in the NCACB spectra are displayed in green. Reproduced
from ref. 18 with permission from the American Chemical Society.
Fig. 4. Pulse sequences for interface studies by solid-state NMR. (a) 15N–13C REDOR–PAINCP;
(b) 15N–15N PDSD–REDOR; (c) 1H–15N HETCOR–REDOR; (d) 1H–13C
REDOR–HETCOR. Filled and open rectangles represent π and π/2 pulses, respectively,
unless specified otherwise. An XY-8 phase cycle is used in the rotor-synchronous
REDOR π-pulse train.
There are a number of software packages for multidimensional
data processing and analysis, such as NMRPipe/NMRDraw (50),
RNMRTK (51), NMRView (52), Sparky (53), ccpNMR (54),
ANSIG (55), and SIFT (56). The choice of a particular software
package is largely a matter of judgment, as many of these programs offer similar
capabilities. In our laboratory, we typically employ NMRPipe for
multidimensional NMR data processing and Sparky for spectral
analysis. In multidimensional processing, the choice of the processing
parameters is determined by the specifics of the experiment, and in
some cases, it is beneficial to process the SSNMR spectra in two or
more different ways, tailored for either sensitivity or resolution
enhancement. The window functions and other processing functions
are applied as necessary. For example, a common processing
sequence may include (in one or all dimensions, as needed): 90° or 60°
shifted sine bell/sine square apodization followed by a Lorentzian-to-Gaussian
transformation (for sensitivity or resolution enhancement,
respectively); forward linear prediction in the indirect dimension(s);
zero filling; phase correction; and polynomial or multipoint baseline
correction. Depending on a particular experiment, maximum entropy
reconstruction (57, 58) and/or nonuniform sampling algorithms
(56, 59) may be beneficial.
Fig. 5. 2D spectra for studies of intermolecular interfaces in 1–73(U-13C,15N)/74–108(U-15N) thioredoxin reassembly: (a)
REDOR–PAINCP, (b) REDOR–HETCOR, (c) HETCOR–REDOR, and (d) PDSD–REDOR. All spectra are acquired at 14.1 T with a
MAS frequency of 10 kHz. Reproduced from ref. 27 with permission from the American Chemical Society.
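To make the order of the processing steps described above concrete, the following is a minimal one-dimensional sketch in plain Python. It is not NMRPipe code and does not reproduce any package's exact window definitions: the function names, the window formula, and the synthetic signal are illustrative assumptions, and only three of the steps (shifted sine-bell apodization, zero filling, and Fourier transformation) are shown. Lorentzian-to-Gaussian transformation, linear prediction, phasing, and baseline correction are omitted.

```python
import cmath
import math

def sine_bell(n_points, shift_deg, power=1):
    """Shifted sine-bell window running from sin(shift) down to sin(pi) = 0.

    shift_deg = 90 gives a cosine-like window; power = 2 gives sine-squared.
    (Assumed definition for illustration only.)
    """
    phi0 = math.radians(shift_deg)
    if n_points == 1:
        return [math.sin(phi0) ** power]
    step = (math.pi - phi0) / (n_points - 1)
    return [math.sin(phi0 + step * k) ** power for k in range(n_points)]

def zero_fill(fid, final_size):
    """Append complex zeros until the FID has final_size points."""
    return list(fid) + [0j] * (final_size - len(fid))

def dft(signal):
    """Naive discrete Fourier transform (O(N^2)); adequate for a short demo."""
    n = len(signal)
    return [sum(signal[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
            for j in range(n)]

def process_1d(fid, shift_deg=90.0, power=2, final_size=None):
    """Apodize, zero fill (to twice the length by default), and transform."""
    window = sine_bell(len(fid), shift_deg, power)
    apodized = [s * w for s, w in zip(fid, window)]
    padded = zero_fill(apodized, final_size or 2 * len(fid))
    return dft(padded)

# Synthetic, noise-free FID: one decaying resonance at 125 Hz (made-up values).
n_acquired = 64
dwell_s = 1.0e-3          # dwell time, i.e. a 1,000 Hz spectral width
freq_hz = 125.0
t2_s = 0.02
fid = [cmath.exp(2j * math.pi * freq_hz * k * dwell_s) * math.exp(-k * dwell_s / t2_s)
       for k in range(n_acquired)]

spectrum = process_1d(fid, shift_deg=90.0, power=2, final_size=128)
peak_index = max(range(len(spectrum)), key=lambda j: abs(spectrum[j]))
peak_hz = peak_index / (len(spectrum) * dwell_s)

if __name__ == "__main__":
    print(f"peak at point {peak_index} of {len(spectrum)}, {peak_hz:.1f} Hz")
```

Processing the same FID twice, once with a 90° shift and once with a smaller shift, is one way to compare the sensitivity-oriented and resolution-oriented variants mentioned in the text.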
Numerical simulations of SSNMR spectra are an integral part
of most of the data analysis protocols. Numerical simulations
can be employed for any part of the SSNMR investigation, from
pulse sequence design to interpretation of anisotropic lineshapes to
quantitative calculations of specific spectra. In the past decade,
several powerful software packages have been developed for numerical
simulations of SSNMR experiments, including ANTIOPE
(60), GAMMA (61), BlochLib (62), SIMPSON (63), and SPINEVOLUTION
(64). In addition to these multipurpose simulation
packages, researchers in the field often use custom-coded programs,
for example, under Mathematica and Matlab environments. In our
work on protein assemblies, we utilize SIMPSON, SPINEVOLUTION,
as well as home-written Mathematica- and Fortran-based programs.
Our laboratories have been working on the development of
MAS SSNMR spectroscopy for investigation of protein complexes.
In this chapter, we present experimental protocols for sample
preparation techniques, resonance assignments by MAS NMR
spectroscopy, structure analysis, and dynamics studies of protein
complexes based on our work on three classes of protein complexes:
thioredoxin reassembly, HIV-1 capsid protein assembly, and micro-
tubule/CAP-Gly assembly (18, 19, 22, 27, 28). Figure 6 illustrates
representative morphologies of HIV-1 CA assemblies, microtubules
(MT), and CAP-Gly/MT assemblies before and after MAS.
2. Materials
2.1. Preparation of Thioredoxin Reassemblies for Solid-State NMR Studies
1. M9 Minimal Medium (1 L): Add 200 mL of 5× M9 salts,
2 mL of 1 M MgSO4, 0.1 mL of 1 M CaCl2, 20 mL of
20% glucose, 10 mL of 100 mg/mL of NH4Cl, and 1 mL of
50 mg/mL ampicillin in 767 mL of water (see Note 1).
2. M9 salts (5×): Dissolve 64 g of Na2HPO4·7H2O, 15 g of
KH2PO4, and 2.5 g of NaCl in 500 mL of water. Adjust volume
to 1 L with water. Divide the solution into aliquots of
200 mL. Sterilize by autoclaving for 20 min.
3. NH4Cl (100 mg/mL): Dissolve 1 g of NH4Cl into 10 mL of
water. Sterilize by filtration.
4. 15NH4Cl (100 mg/mL): Dissolve 1 g of 15NH4Cl into 10 mL
of water. Sterilize by filtration.
5. Ampicillin (50 mg/mL): Dissolve 0.5 g into 10 mL of water.
Sterilize by filtration.
phate, pH 7.0, 3 mM EDTA. Adjust pH with 1 M HCl or 1 M
NaOH.
7. Anion exchange chromatography buffer: 20 mM sodium phos-
phate, pH 7.0, 500 mM KCl, 3 mM EDTA. Adjust pH with
1 M HCl or 1 M NaOH.
8. HiLoad Superdex 75 column (see Note 2).
9. DEAE-cellulose resin.
Fig. 6. Morphology of HIV-1 CA assemblies, microtubules, and CAP-Gly/MT characterized by confocal and TEM microscopy.
(a) Confocal images of HIV-1 CA assemblies before and after magic angle spinning of the sample; (b) TEM images of MT
and MT/CAP-Gly assemblies before and after magic angle spinning of the sample.
10. Citraconylation buffer: 500 mM potassium phosphate, pH

8.5. Adjust pH with 1 M HCl or 1 M KOH.
11. Citraconic anhydride.
12. 5 M NaOH.
13. Citraconylated thioredoxin purification buffer (size exclusion):
0.5% NH4HCO3, pH 7.9. Adjust pH with 10–35% ammonium
hydroxide or 1 M HCl.
14. Desalting column: PD-10 disposable column packed with

Sephadex G-25 medium resin.
15. Trypsin: Dissolve sequencing grade modified trypsin lyophilized
powder in 50 mM acetic acid to 100 mg/mL.
16. 50% Acetic acid.
17. Sephadex G-25 and Sephadex G-50 resins (see Note 3).
18. Denaturing buffer: 10 mM potassium phosphate, pH 7.4,
7.6 M urea, pH is adjusted by titrating with 1 M HCl or 1 M
KOH.
19. Refolding buffer: 100 mM potassium phosphate, pH 5.7.
Adjust pH with 1 M HCl or 1 M KOH.
20. Amicon stirred cell, microcon, membrane with a 3,000 Da
molecular weight cut off.
21. Precipitation buffer: 35% PEG-4,000 in 10 mM NaCH3COO,
1 mM NaN3, pH 3.5. Adjust pH with 1 M HCl or 1 M NaOH.
22. 10–35% Ammonium hydroxide.
23. 1 M HCl.
24. 1 M KOH.
25. 1 M NaOH.
26. 1 M MgSO4.
27. 1 M CaCl2.
28. 20% (w/v) D-glucose.
29. 20% (w/v) U-13C6 D-Glucose.
30. 4 mm Bruker HRMAS or Varian 3.2 mm thick wall NMR
sample rotor.
31. E. coli BL21 (DE3).
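The 1 L M9 minimal medium recipe in item 1 of this list is often needed at other culture volumes. The sketch below scales the listed component volumes linearly and derives the resulting glucose, NH4Cl, and ampicillin concentrations. The dictionary keys and function names are hypothetical, and the volumes are copied from item 1 and not independently validated.

```python
# Component volumes (mL) per 1 L of M9 minimal medium, as listed in item 1.
M9_PER_LITER_ML = {
    "5x M9 salts": 200.0,
    "1 M MgSO4": 2.0,
    "1 M CaCl2": 0.1,
    "20% glucose": 20.0,
    "100 mg/mL NH4Cl": 10.0,
    "50 mg/mL ampicillin": 1.0,
    "water": 767.0,
}

def scale_recipe(final_volume_ml):
    """Scale every component linearly to the requested final volume (mL)."""
    factor = final_volume_ml / 1000.0
    return {name: round(volume * factor, 3) for name, volume in M9_PER_LITER_ML.items()}

def final_concentrations_g_per_l():
    """Final concentrations implied by the stock strengths, treating the batch as 1 L."""
    return {
        "glucose": M9_PER_LITER_ML["20% glucose"] * 0.20,           # 20% w/v = 0.20 g/mL
        "NH4Cl": M9_PER_LITER_ML["100 mg/mL NH4Cl"] * 0.100,        # 100 mg/mL = 0.100 g/mL
        "ampicillin": M9_PER_LITER_ML["50 mg/mL ampicillin"] * 0.050,
    }

if __name__ == "__main__":
    total = sum(M9_PER_LITER_ML.values())
    print(f"listed volumes add up to {total:.1f} mL")
    for name, volume in scale_recipe(250.0).items():
        print(f"{name}: {volume} mL")
    print(final_concentrations_g_per_l())
```

For isotopically labeled cultures, the same volumes would apply with the 15NH4Cl and U-13C6 D-glucose stocks from items 4 and 29 substituted for the unlabeled ones, assuming equal stock concentrations.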
2.2. Preparation of HIV-1 CA Assemblies
1. cDNA encoding gag polyprotein, pr55gag: Obtained from the
NIH AIDS Research and Reference Reagent Program (88).
2. pET21 vector (EMD).
3. Basal Vitamins Eagle medium.
4. Modified M9 growth medium: Prepared by supplementing the
1 L of standard M9 medium (Subheading 2.1) with 10.0 mL
of Basal Vitamins Eagle medium (65).
5. Growth medium for selective labeling: Prepared by adding a
13C,15N isotopically labeled amino acid and the other 19 unlabeled
amino acids to the cultures at 100 mg/L in M9 medium.
6. Anion exchange chromatography buffer: 25 mM sodium
phosphate, pH 7.0, 1 mM DTT, 0.02% NaN3. Adjust pH with 1 M HCl or 1 M NaOH.
7. Cation exchange chromatography buffer A: 25 mM sodium

phosphate, pH 5.8, 1 mM DTT, 0.02% NaN3. Adjust pH with 1 M HCl or 1 M NaOH.
8. Cation exchange chromatography buffer B: 25 mM sodium
phosphate, pH 5.8, 1 M NaCl, 1 mM DTT, 0.02% NaN3.
Adjust pH with 1 M HCl or 1 M NaOH.
phate, pH 6.5, 100 mM NaCl, 1 mM DTT, 0.02% NaN3.
10. Anion exchange chromatography column: HiTrap Q HP (GE
Healthcare).
11. Cation exchange chromatography column: HiTrap SP HP
(GE Healthcare).
12. Size exclusion chromatography column: HiLoad Superdex
200 (GE Healthcare).
13. CA dialysis buffer: 25 mM sodium phosphate pH 5.5. Adjust
pH with 1 M HCl or 1 M NaOH.
14. PEG-20,000 solution: 17.5% in H2O (e.g., dissolve 1.75 g
PEG-20,000 in 8.25 mL water).
15. 10 mM EDTA-Cu(II): Dissolve EDTA-Cu(II) in 90%
D2O/10% H2O.
16. CA tubular morphology incubation buffer: 50 mM Tris HCl
buffer, pH 8.0, 1 M NaCl. Adjust pH with 1 M HCl or 1 M
NaOH.
17. 1 M HCl.
18. 1 M NaOH.
19. E. coli Rosetta 2 (DE3).
20. 4 mm Bruker HRMAS or Varian 3.2 mm thick wall NMR
sample rotor.
21. Isopropyl β-D-1-thiogalactopyranoside (IPTG) stock: 200 mM
solution.
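The PEG-20,000 example in item 14 (1.75 g of PEG in 8.25 mL of water) corresponds to 17.5% on a weight-per-weight basis if the density of water is taken as 1 g/mL. The short sketch below verifies that arithmetic and scales it to other batch sizes. The function names are hypothetical and the density value is an assumption.

```python
def mass_percent(solute_g, water_ml, water_density_g_per_ml=1.0):
    """Weight-per-weight percentage of a solute dissolved in water."""
    water_g = water_ml * water_density_g_per_ml
    return 100.0 * solute_g / (solute_g + water_g)

def batch_for_total_mass(total_g, percent_w_w):
    """Return (solute_g, water_g) for a batch of the requested total mass."""
    solute_g = total_g * percent_w_w / 100.0
    return solute_g, total_g - solute_g

peg_percent = mass_percent(1.75, 8.25)

if __name__ == "__main__":
    print(f"PEG-20,000: {peg_percent:.1f}% (w/w)")
    peg_g, water_g = batch_for_total_mass(20.0, 17.5)
    print(f"20 g batch: {peg_g:.2f} g PEG + {water_g:.2f} g water")
```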
2.3. Transmission Electron Microscopy of HIV-1 CA Protein Assemblies
1. TEM staining solution: ammonium molybdate (5% w/v) in
water, filtered with a 0.2-μm syringe filter.
2. Nonsterile 72-well mini trays with lids.
3. 55 mm diameter qualitative circle filter paper.
4. 60 mm × 15 mm-Petri dish.
5. Transmission electron microscope: Zeiss CEM 902, operating
at 80 kV.
6. 400 mesh, Formvar/carbon-coated copper grids, stabilized
with evaporated carbon films.
2.4. Confocal Microscopy of HIV-1 CA Protein Assemblies
1. One-well chambered cover glasses.
2. Staining solution: 0.5% (w/v) Nile Blue A in water, filtered
with a 0.2-μm syringe filter.
3. Laser scanning microscope: Zeiss LSM 510 NLO (25 mW
HeNe laser; 543 nm) equipped with a Zeiss 40× (NA 1.3) oil
immersion objective lens.
2.5. Cryo-SEM Microscopy of HIV-1 CA Protein Assemblies
1. EM PACT high-pressure freezer (Leica).
2. Gold carrier plates.
3. Copper hats.
4. Gold (for deposition).
5. Liquid nitrogen.
2.6. Preparation of CAP-Gly/Microtubule Complexes
1. Modified M9 growth medium (Subheading 2.2).
2. IPTG stock (Subheading 2.2).
3. Buffers for Ni affinity chromatography: 20 mM Tris, pH 7.5,
containing 10, 50, or 200 mM imidazole. Adjust pH with 1 M
HCl or 1 M NaOH.
4. Anion exchange buffer A: 20 mM Tris, pH 7.5, 1 mM DTT.
5. Anion exchange buffer B: 20 mM Tris, pH 7.5, 1 M NaCl,
1 mM DTT. Adjust pH with 1 M HCl or 1 M NaOH.
6. Microtubule polymerization buffer: 25 mM sodium phosphate,
pH 6.0, 25 mM NaCl, 0.4 mM DTT. Adjust pH with
1 M HCl or 1 M NaOH.
7. Ni affinity chromatography column: HisTrap (GE Healthcare).
8. Anion exchange chromatography column: HiTrap FF-Q (GE
Healthcare).
9. Lyophilized bovine tubulin powder: Stored at 4°C. Generally,
fresh tubulin solution is used for assays. Excess tubulin solution
is quick-frozen in liquid nitrogen and stored at −80°C.
10. Paclitaxel (Taxol): 3 mM paclitaxel dissolved in dimethyl
sulfoxide (DMSO). Store at −20°C.
11. GTP.
12. Luria-Bertani (LB) Liquid Medium (1 L): Dissolve 10 g
bacto-tryptone, 5 g bacto-yeast extract, and 10 g NaCl in 950 mL
of tap water. Adjust pH to 7.0 with NaOH. Adjust volume to 1 L
with tap water. Sterilize by autoclaving for 20 min at 15 lb/sq. in.
on the liquid cycle. Let the medium cool to ca. 40°C before
adding antibiotics.
13. E. coli BL21 (DE3).
14. 1 M HCl.
15. 1 M NaOH.
3. Methods
3.1. Preparation of Thioredoxin Reassemblies for Solid-State NMR Studies
For a detailed description of the overexpression system and the
purification protocol for E. coli thioredoxin, see refs. 66 and 67.
For a description of proteolytic cleavage of thioredoxin at Arg-73
by trypsin digestion, see ref. 68. The salient steps pertaining to the
preparation of the SSNMR samples of differentially enriched
thioredoxin reassembly are outlined below.
1. Prepare differentially enriched thioredoxin reassemblies by
overexpressing two batches of thioredoxin in E. coli BL21(DE3)
separately. Use M9 minimal medium containing 15NH4Cl and
U-13C6 glucose for expression of U-13C,15N thioredoxin and
use M9 minimal medium containing 15NH4Cl and natural
abundance glucose for 15N thioredoxin (69).
2. Purify each batch of thioredoxin by loading the crude cell
extract onto a size exclusion (Superdex 75) column. Elute the
protein using size exclusion chromatography buffer.
3. Apply the eluant to an anion exchange (DEAE-cellulose) column.
Elute the protein.
4. Validate the purity of thioredoxin by SDS-PAGE and
measure its concentration by UV absorbance (extinction
coefficient ε280 = 14,100 M⁻¹ cm⁻¹) (66).
5. Cleave each protein batch at the Arg-73 site by trypsin
digestion. First, dialyze thioredoxin against citraconylation buffer
and concentrate the protein in an Amicon stirred cell (molecular
weight cut off: 3,000 Da) to 0.3 mM. Second, block the
lysine side chain amine groups by adding 25 μL of citraconic
anhydride at 20 min intervals (the total amount of citraconic
anhydride is 50 μL for every 1 μmol of thioredoxin) and allow the
reaction to continue for 2 h after the final addition. Add 5 M
NaOH to maintain the pH at 8.5. Third, remove excess reagent
by running the reaction mixture through a desalting column.
Fourth, add trypsin to citraconylated thioredoxin (trypsin to
thioredoxin ratio is 1:100 w/w) and allow the enzyme digestion
to continue for 6 h at 37°C. Finally, lyophilize the mixture and
incubate in 50% acetic acid for 1 h to remove citraconyl groups
on lysine side chains.
6. Separate the two peptide fragments, thioredoxin (1–73) and
thioredoxin (74–108), on a size exclusion chromatography
column (Sephadex G-50) using 50% acetic acid as elution buffer.
7. Purify each fragment separately by size exclusion chromatogra-
phy (Sephadex G-25) using citraconylated thioredoxin purifi-
cation elution buffer.
8. Validate the purity of each fragment by SDS-PAGE and measure
the concentration of each fragment by UV absorbance.
Extinction coefficients for the N fragment (1–73) and the
C fragment (74–108) are ε280 = 14,100 M⁻¹ cm⁻¹ and
ε215 = 39,700 M⁻¹ cm⁻¹, respectively.
9. Reconstitute thioredoxin by mixing each 13C,15N-enriched
fragment with its complementary 15N-enriched counterpart.
This step results in two reassembled thioredoxin samples:
1–73(U-13C,15N)/74–108(U-15N) and
1–73(U-15N)/74–108(U-13C,15N).
Mix equimolar amounts of N and C fragments at low
concentration (~50 μM) in denaturing buffer, then dialyze
against refolding buffer.
10. Concentrate the reconstituted thioredoxin to 70 mg/mL using
an Amicon stirred cell and Microcon (molecular weight cut off:
3,000 Da). Gradually add precipitation buffer (10 μL every
10 min or longer) to 0.5 mL of concentrated thioredoxin
solution until no further protein precipitation is observed.
Quantify the extent of precipitation by measuring residual
absorbance at 280 nm.
11. Centrifuge the hydrated thioredoxin/PEG precipitate at 14,000 × g
for 15 min at 4°C and transfer the pellet into a 4-mm Bruker
HRMAS NMR sample rotor or a Varian 3.2-mm thick-wall
NMR sample rotor. Seal the samples with the upper spacer and
the top spinner (see Note 4).
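The UV concentration measurements in steps 4 and 8 follow the Beer-Lambert law, A = ε·c·l. The following is a minimal sketch of that arithmetic only; the function name, the 1-cm path length, and the absorbance reading are illustrative assumptions, while the extinction coefficients are the ones quoted in the protocol.

```python
# Beer-Lambert law: A = epsilon * c * l, hence c = A / (epsilon * l).

# Extinction coefficients quoted in the protocol (M^-1 cm^-1).
EPS_280_THIOREDOXIN = 14_100   # intact thioredoxin and the N fragment (1-73)
EPS_215_C_FRAGMENT = 39_700    # C fragment (74-108), measured at 215 nm


def molar_concentration(absorbance, epsilon, path_cm=1.0):
    """Return the concentration in mol/L.

    path_cm defaults to a 1-cm cuvette, which is an assumption of this sketch.
    """
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorbance / (epsilon * path_cm)


# Hypothetical A280 reading, chosen only to illustrate the calculation.
a280 = 0.423
conc = molar_concentration(a280, EPS_280_THIOREDOXIN)
print(f"thioredoxin: {conc * 1e6:.1f} uM")
```

Samples that read outside the linear range of the spectrophotometer would need to be diluted first, with the dilution factor applied to the result.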
3.2. Preparation of HIV-1 CA Assemblies
The cDNA encoding gag polyprotein, pr55gag, was obtained from
the NIH AIDS Research and Reference Reagent Program (88).
The DNA sequence coding for CA (gag residues 133–363) was
amplified and subcloned into pET21 vector using the NdeI and
XhoI sites (70). The forward and reverse primers used for PCR
amplification are
5′-GAT ATA CAT ATG CCT ATA GTG CAG AAC ATC CAG GGG-3′ and
5′-GTG GTG CTC GAG TCA TCA CAA AAC TCT TGC CTT ATG GCC GGG-3′,
respectively. The forward primer carries the NdeI site (CAT ATG)
and the reverse primer carries the XhoI site (CTC GAG).
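Because the underlining of the restriction sites does not survive in plain text, the two primers quoted above can be checked for the NdeI (CATATG) and XhoI (CTCGAG) recognition sequences with a short script. This is only an illustrative sketch; the function and variable names are hypothetical.

```python
# Recognition sequences of the two enzymes used for subcloning.
RESTRICTION_SITES = {"NdeI": "CATATG", "XhoI": "CTCGAG"}

# Primers as quoted in the protocol (5' to 3').
FORWARD = "GAT ATA CAT ATG CCT ATA GTG CAG AAC ATC CAG GGG"
REVERSE = "GTG GTG CTC GAG TCA TCA CAA AAC TCT TGC CTT ATG GCC GGG"


def find_sites(primer):
    """Map enzyme name to the 0-based position of its site in the primer."""
    seq = primer.replace(" ", "").upper()
    return {
        name: seq.find(site)
        for name, site in RESTRICTION_SITES.items()
        if site in seq
    }


print("forward:", find_sites(FORWARD))
print("reverse:", find_sites(REVERSE))
```

In both primers the site starts after six leading bases, which leaves a short overhang 5′ of the recognition sequence.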
1. Express U-13C,15N isotopically labeled CA protein in E. coli
Rosetta 2 (DE3) in modified M9 medium using 15NH4Cl and
U-13C6 glucose as the sole nitrogen and carbon sources. Induce
the protein expression with 0.4 mM IPTG at 23°C for 16 h.
2. Express selectively labeled CA protein in E. coli Rosetta 2 (DE3)
in M9 growth medium for selective labeling, prepared by adding
a 13C,15N isotopically labeled amino acid and the other 19
unlabeled amino acids to the cultures at 100 mg/L at the time
0.4 mM IPTG is added to induce protein expression (see Note 5).
3. Purify the CA protein by anion exchange chromatography
using anion exchange chromatography buffer. Use a flow rate
of 2 mL/min and collect the flow through (nonbinding part).
4. Purify the CA protein produced in step 3 by cation exchange
chromatography using a gradient formed by cation exchange
chromatography buffers A and B at a flow rate of 2 mL/min.
The CA-containing fraction elutes at a conductivity of
ca. 10 mS/cm.
5. Remove aggregates by size exclusion chromatography using
size exclusion buffer. The flow rate is 2 mL/min.
6. Validate the purity of CA protein by SDS-PAGE and measure
the concentration of proteins by UV absorbance (extinction
coefficient ε280 = 33,585 M⁻¹ cm⁻¹).
7. Dialyze the CA protein against CA dialysis buffer. To prepare
CA assemblies containing mixed labels, mix two solutions con-
taining CA protein, each isotopically labeled with a different
desired amino acid, in a 1:1 ratio, followed by the assembly
step to produce CA assemblies of one of the three morpholo-
gies: conical, spherical, or tubular, as described below.
8. Lyophilize the purified CA protein (see Note 6). Prepare CA
protein assemblies of conical morphology by adding PEG-20,000
solution to the lyophilized protein to a final protein
concentration of 32 mg/mL. If needed for experiments, add
EDTA-Cu(II) (see Note 7). Incubate the mixture for 1 h at 37°C.
Recover the assembled material as the pellet after centrifugation
at 18,800 × g for 5 min at room temperature. Pack 15 mg of the
precipitate into a 3.2-mm Varian NMR sample rotor and seal the
sample with an upper spacer and a top spinner.
9. Prepare CA assemblies of spherical morphology by mixing a
32-mg/mL CA solution, prepared in CA dialysis buffer, with
PEG-20,000 solution (1:1 volume ratio). Incubate the resulting
mixture on ice for 30 min and dilute it fourfold. Dry the solu-
tion containing the spherical assemblies with N2 gas to remove
any excess water. Pack 12 mg of the dried sample into a 3.2-
mm Varian NMR sample rotor and seal the sample using an
upper spacer and a top spinner (see Note 8).
10. Prepare CA assemblies of tubular morphology by incubating a
32-mg/mL CA solution prepared in tubular morphology
incubation buffer at 37°C for 1 h.
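The concentrations implied by steps 8 and 9 can be worked out with simple dilution arithmetic. The sketch below assumes additive volumes and treats the 17.5% PEG-20,000 stock as a nominal percentage; the function names and the 0.5-mL example volume are illustrative assumptions and are not part of the protocol.

```python
def dilute(concentration, fold):
    """Concentration after a fold-fold dilution (volumes assumed additive)."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return concentration / fold


def protein_mass_mg(volume_ml, target_mg_per_ml=32.0):
    """Mass of lyophilized CA needed to reach the target concentration."""
    return volume_ml * target_mg_per_ml


# Step 9: 32 mg/mL CA mixed 1:1 (v/v) with 17.5% PEG-20,000 solution.
ca_after_mix = dilute(32.0, 2)      # mg/mL
peg_after_mix = dilute(17.5, 2)     # percent

# The mixture is then diluted fourfold.
ca_final = dilute(ca_after_mix, 4)
peg_final = dilute(peg_after_mix, 4)

print(f"after 1:1 mixing: CA {ca_after_mix} mg/mL, PEG {peg_after_mix}%")
print(f"after fourfold dilution: CA {ca_final} mg/mL, PEG {peg_final}%")

# Step 8: protein needed for a hypothetical 0.5-mL conical assembly sample.
print(f"CA for 0.5 mL at 32 mg/mL: {protein_mass_mg(0.5)} mg")
```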
3.3. Transmission Electron Microscopy of HIV-1 CA Protein Assemblies and CAP-Gly/Microtubule Protein Assemblies
The morphologies of the HIV-1 CA assemblies are analyzed using
a Zeiss CEM 902 transmission electron microscope operating at
80 kV. Samples are stained with TEM staining solution, deposited
onto 400 mesh, Formvar/carbon-coated copper grids, and dried
for 40 min. Follow the protocol below to prepare the TEM grids:
1. Transfer 5 μL of CA assemblies slurry and 5 μL of TEM staining
solution to two separate wells of a mini tray.
2. Place the copper grid on the CA assemblies slurry first, with the
shiny side down, and incubate for 1 min.
3. Use the edge of a filter paper to remove excess solution.
4. Place the grid on top of the staining solution and incubate for
30 s with the same side facing down.
6. Place the grid on the CA assemblies slurry drop again for 30 s
with the same side facing down.
8. Place the grid on the staining solution drop again for 30 s with
the same side facing down.
10. Place one piece of filter paper in the Petri dish. Then place the
copper grid on the filter paper, which is already in the Petri dish.
11. Place the Petri dish under a lamp to dry the copper grid for
40 min.
12. Place the dried copper grid on the TEM sample holder and
then acquire the images.
3.4. Confocal Microscopy of HIV-1 CA Protein Assemblies
The morphologies of the HIV-1 CA assemblies are analyzed in
solution using confocal microscopy. The stain, Nile Blue A, is excited
by a 543-nm laser line and fluoresces in the hydrophobic
environment of the protein assemblies. For confocal imaging of the
CA assemblies, follow the steps below.
1. Place 1 μL of protein assemblies slurry on a cover glass (see
Note 9).
2. Add 5 μL of staining solution to the protein assemblies slurry
(see Note 10).
3. Acquire the images using the 543-nm line of the 25-mW
HeNe laser and a Zeiss 40× (NA 1.3) oil immersion objective
lens. Turn on the transmitted light channel when appropriate.
3.5. Cryo-SEM Microscopy of HIV-1 CA Protein Assemblies
The morphologies of the HIV-1 CA assemblies are analyzed by
cryo-SEM on a cold stage after high-pressure freezing.
The cryo-fixed specimens are cryo-fractured under vacuum
to reveal internal structure. For cryo-SEM imaging of the CA
assemblies, follow the steps below (see Note 11).
1. Place 1 μL of protein assemblies slurry on the gold carrier
plate. Carefully cover the gold carrier plate with a copper hat.
2. Transfer the gold plate set to the Leica EM PACT high-pressure
freezer and freeze the sample at 2,000 bar at a cooling rate
dT/dt > 10,000°C/s.
3. Transfer the frozen sample to the precooled sample prepara-
tion chamber (−125°C) with liquid nitrogen.
4. Fracture the copper cover with the knife in the preparation
chamber.
5. Deposit 10 nm of gold on the freshly fractured surface and
lower the temperature to −125°C.
6. Transfer the sample to the cryostage for observation and
increase the temperature to −90°C for 5–7 min to remove
surface water.
7. Acquire images at −125°C and 1.0 kV at a working distance of
approximately 4–5 mm.
3.6. Preparation of CAP-Gly/Microtubule Complexes
The CAP-Gly domain of the p150Glued subunit of mammalian dynactin
encompassing residues 19–107 was subcloned into the
pET28b-His6-SMT3 vector (71) using the BamHI and XhoI restriction
sites. Successful subcloning was confirmed by DNA sequencing.
The (SMT-His6)-CAP-Gly-containing plasmid was transformed into
E. coli BL21(DE3) cells. For subsequent production of isotopically
enriched protein, purification, and assembly steps, follow the
protocols below.
1. Overexpress U-13C,15N (SMT-His6)-CAP-Gly in M9 medium
or modified M9 medium (see Note 12) containing 15NH4Cl
and U-13C6 glucose.
2. Purify the tagged protein by Ni affinity chromatography using
the buffers for Ni affinity chromatography. Remove nonbind-
ing and nonspecifically bound protein impurities by washing
the Ni-affinity column with buffers containing 10 and 50 mM
imidazole, respectively. Elute tagged CAP-Gly by using the
buffer containing 200 mM imidazole.
3. Overexpress His6-ULP1 protease (71) in LB medium (see
Note 13). Purify the enzyme by Ni affinity chromatography
following the same procedure described in step 2. Divide the
His6-ULP1 protease expressed in 1 L of LB medium into
Eppendorf tubes (0.5 mL/tube) and store at −80°C without
assaying enzyme activity.
4. Mix 24 mL of (SMT-His6)-CAP-Gly expressed in 250 mL of
M9 medium or modified M9 medium with 1–1.5 mL of His6-
ULP1 protease to cleave the SMT-His6 tag from CAP-Gly (see
Note 13). Incubate the mixture at 4°C overnight.
5. Dilute the mixture (step 4) to 40 mL with Ni affinity chroma-
tography buffer containing 10 mM imidazole and load the
diluted mixture onto a 5-mL HisTrap (Ni affinity) column.
Elute the CAP-Gly(19–107) with 10 mM imidazole buffer
(CAP-Gly(19–107) does not bind to the column). Elute the
His6-SMT3 tag and His6-ULP1 with 200 mM imidazole buffer.
6. Purify the CAP-Gly containing fractions by anion exchange
chromatography to remove the residual protein and nucleic
acid impurities using a gradient formed by anion exchange
buffers A and B. Set the flow rate to 1 mL/min and run a gradient
of 0–40% buffer B over 100 mL (100 min). The protein elutes
at 15–20% buffer B.
7. Validate the purity of CAP-Gly(19–107) by SDS-PAGE.
8. Dialyze U-13C,15N CAP-Gly(19–107) against microtubule
polymerization buffer. Concentrate CAP-Gly(19–107) to 7.3 mg/mL.
Measure the concentration of CAP-Gly by UV absorbance
(ε280 = 8,250 M⁻¹ cm⁻¹).
9. Dissolve bovine tubulin (lyophilized powder) in the microtubule
polymerization buffer. Add GTP and paclitaxel to the tubulin
solution (final concentrations are 30 μM for tubulin, 1 mM for
GTP, and 15 μM for paclitaxel). Incubate the mixture at 37°C
for 40–45 min.
10. Mix a 1.5-mL solution of 7.3 mg/mL U-13C,15N CAP-Gly(19–107)
with 3.165 mL of 23 μM paclitaxel-stabilized microtubules.
Centrifuge the resulting complex at 80,000 × g for 40 min at
4°C. Pack 14.2 mg of hydrated gel-like pellets into a 3.2-mm
Varian NMR sample rotor and seal the sample using an upper
spacer and a top spinner (see Note 14).
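The anion exchange gradient in step 6 can be translated into salt concentration and elution volume with a few lines of arithmetic. This sketch assumes a strictly linear gradient and ignores the system dead volume; the function names are hypothetical. Buffer A contains no NaCl and buffer B contains 1 M NaCl, as listed in Subheading 2.6.

```python
def percent_b(volume_ml, start=0.0, end=40.0, gradient_ml=100.0):
    """Percent buffer B after volume_ml of a linear gradient."""
    fraction = min(max(volume_ml / gradient_ml, 0.0), 1.0)
    return start + (end - start) * fraction


def volume_at_percent_b(pct, start=0.0, end=40.0, gradient_ml=100.0):
    """Gradient volume (mL) at which a given percent B is reached."""
    if not start <= pct <= end:
        raise ValueError("percent B lies outside the gradient range")
    return gradient_ml * (pct - start) / (end - start)


def nacl_mm(pct_b, nacl_a_mm=0.0, nacl_b_mm=1000.0):
    """NaCl concentration (mM) of a mixture of buffers A and B."""
    return nacl_a_mm + (nacl_b_mm - nacl_a_mm) * pct_b / 100.0


# CAP-Gly is reported to elute at 15-20% buffer B.
for pct in (15.0, 20.0):
    print(f"{pct:.0f}% B: {nacl_mm(pct):.0f} mM NaCl, "
          f"reached after {volume_at_percent_b(pct):.1f} mL")
```

At 1 mL/min the gradient volume in mL equals the elapsed time in minutes, so the same numbers give the approximate elution time.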
3.7. Transmission Electron Microscopy of CAP-Gly/Microtubule Protein Assemblies
The morphologies of the CAP-Gly/microtubule protein assemblies
are analyzed using a Zeiss CEM 902 transmission electron
microscope operating at 80 kV. Samples are stained with 5% (w/v)
ammonium molybdate, deposited onto 400 mesh,
Formvar/carbon-coated copper grids, and dried for 40 min.
1. Polymerize microtubules from bovine tubulin in vitro as
described in Subheading 3.6.
2. Express and purify natural abundance CAP-Gly according to
the procedure described in Subheading 3.6.
3. Gently mix 10 μM CAP-Gly and 10 μM microtubules to prepare
the CAP-Gly/microtubule assembly (see Note 15).
4. Follow steps 2–12 in Subheading 3.3.
3.8. Solid-State NMR Spectroscopy for Resonance Assignments
The typical SSNMR experimental parameters for 2D and 3D
resonance assignment experiments conducted at 14.1 T on a Varian
InfinityPlus instrument equipped with a 3.2-mm triple-tuned T3
probe are detailed below.
1. For the MAS frequency of 10 kHz, set the radio frequency (rf)
field strengths to 95 kHz (1H), 50 kHz (13C), and 50 kHz
(15N) for hard pulses. For 1H–13C CP or 1H–15N CP, contact
times are 0.85 and 1.1 ms, respectively; 1H radio frequency
field is 50 kHz, and 13C or 15N radio frequency field is ca.
40 kHz in the center of a linear or tangential ramp. Use TPPM
decoupling (72); the decoupling field strengths range between
80 and 100 kHz in different experiments. Recycle delays in all
experiments are temperature dependent and, for temperatures
between 0 and −30°C, are typically set to 2 s.
2. For selective magnetization transfers from 15N to 13Cα (NCA)
or to 13C′ (NCO), match the 15N and 13C radio frequencies
according to $\omega_N \pm \omega_C = n\omega_r$ (32). For
example, at 14.1 T and when the MAS frequency is 10 kHz, the
typical rf field strengths are $\omega_C = 25$ kHz (constant
amplitude) and $\omega_N = 15$ kHz (at the center of a
tangential ramp). The mixing time is 6–7 ms.
3. For the DARR and PDSD sequences employed for the 13C–13C
correlation spectroscopy, either as stand-alone experiments or
as part of a 3D NCACX experiment, tune the mixing time to
observe cross peaks within the desired range of distances (see
Note 16). Typical mixing times for one-bond correlations are
10 and 50 ms at 14.1 and 21.1 T, respectively.
4. Use the DREAM sequence in the 2D/3D NCACB experiment
to establish predominantly one-bond Cα–Cβ correlations
following the 15N–13Cα transfer by SPECIFIC-CP. The
double-quantum matching condition for the DREAM sequence is
$n\omega_r = (\omega_{rf}^2 + \Omega_1^2)^{1/2} + (\omega_{rf}^2 + \Omega_2^2)^{1/2}$ (35),
where $\omega_{rf}$ is the rf field strength and $\Omega_1$ and
$\Omega_2$ are the resonance offsets of the two coupled spins.
The typical mixing time in the DREAM step is ca. 2 ms (see
Note 17). The one-bond Cα–Cβ correlations result in negative
cross peaks. Under these conditions, a number of two-bond
correlations (e.g., Cα–Cγ correlations for Thr residues) will appear
in the spectra, and these are positive-intensity cross peaks.
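The two matching conditions quoted in steps 2 and 4 can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, only the orders n = 1 and n = 2 are tested for SPECIFIC-CP (an assumption of this sketch), and the DREAM function returns the single rf amplitude that satisfies the condition for given offsets, whereas in practice the amplitude is swept through this value.

```python
import math


def specific_cp_matches(w_n, w_c, w_r, orders=(1, 2), tol=1e-6):
    """List the (combination, n) pairs for which w_N +/- w_C = n * w_r.

    All values share the same units (kHz in the example below).
    """
    matches = []
    for label, value in (("sum", w_n + w_c), ("difference", abs(w_n - w_c))):
        for n in orders:
            if abs(value - n * w_r) <= tol:
                matches.append((label, n))
    return matches


def dream_rf(w_r, offset1, offset2, n=1, tol=1e-9):
    """Solve n*w_r = sqrt(w_rf^2 + O1^2) + sqrt(w_rf^2 + O2^2) for w_rf.

    Uses bisection; the left-hand sum grows monotonically with w_rf.
    """
    target = n * w_r

    def total(w_rf):
        return math.hypot(w_rf, offset1) + math.hypot(w_rf, offset2)

    if total(0.0) > target:
        raise ValueError("offsets too large for this matching condition")
    lo, hi = 0.0, target
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Step 2 example: 15 kHz on 15N, 25 kHz on 13C, 10 kHz MAS.
print(specific_cp_matches(15.0, 25.0, 10.0))

# DREAM at 10 kHz MAS with both spins on resonance, then with
# hypothetical offsets of 3 kHz each.
print(round(dream_rf(10.0, 0.0, 0.0), 3))
print(round(dream_rf(10.0, 3.0, 3.0), 3))
```

For the field strengths given in step 2 the difference of the two rf fields equals the MAS frequency, which corresponds to the n = 1 condition.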
3.9. Solid-State NMR Experiments for Probing Protein Backbone Dynamics
The pulse sequences used for backbone dynamics experiments are
shown in Fig. 7 and described below.
1. 15N longitudinal relaxation rates (R1 = 1/T1) provide information
about motions on the pico- to nanosecond time scales. Insert
a π/2 − τ − π/2 block into the NCA experiment before the
DCP block, by which magnetization is transferred from 15N to
13Cα. Monitor the decay of 15N magnetization by tracing the
peak intensities in a series of 2D NCA spectra (28). Generate
relaxation curves from the spectra acquired with a series of
delays; for each residue, plot the cross peak intensities as a
function of the delay time, and fit the experimental points to a
single-exponential function $I = I_0 \exp(-R_1 \tau)$ to extract
the residue-specific relaxation rates R1 (see Note 18).
2. For qualitative detection of submillisecond motions of backbone
amide protons, insert a τ/2 − π − τ/2 echo before the 1H–15N CP
block. Proton magnetization is rapidly dephased during the
two τ/2 delays if it is in the rigid environment of the solid
protein. Only the amide protons with high mobility on the
submillisecond time scale survive the relatively long delay
(e.g., 400 μs), and the backbone nitrogen atoms bonded to
these protons are detected (see Note 18).
Fig. 7. Pulse sequences for dynamics studies of proteins and protein assemblies by MAS solid-state NMR. (a) 3D 1H T2′
filtered NCA experiment; (b) 3D NCA-based 15N T1 relaxation experiment; (c) 3D DIPSHIFT-NCA experiment with R18₁⁷ block
employed for dipolar recoupling; (d) 3D ROCSA-NCA experiment. Filled and open rectangles represent π/2 and π pulses,
respectively, unless specified otherwise.
3. 15N chemical shift anisotropy (CSA) is also sensitive to protein
dynamics. The CSA is reduced in the presence of motions
occurring at frequencies faster than the magnitude of the CSA
interaction. Therefore, the ratio of the anisotropy in the presence
of dynamics to the static-limit anisotropy, as well as the
asymmetry parameter of the dynamically averaged CSA tensor,
are probes of the amplitude and geometry of the motions. The 15N
CSA at 14.1 T is ca. 10 kHz, and it is therefore sensitive to
motions occurring on time scales faster than 100 μs.
Measure the 15N CSA tensors site-specifically by introducing a
ROCSA CSA recoupling block (73) before the 15N chemical
shift evolution period in the NCA experiment in a 3D
ROCSA-NCA (74) experiment (see Note 19). Representative 15N CSA
lineshapes are illustrated for reassembled thioredoxin in Fig. 8.
4. 1H–15N dipolar couplings are also sensitive to motions on
time scales of less than 100 μs. Record residue-specific 1H–15N
dipolar lineshapes in a 3D DIPSHIFT-NCA experiment by
introducing a DIPSHIFT period (75) in the basic NCA
sequence. A number of dipolar recoupling sequences can be
employed (the original DIPSHIFT (75), TMREV (76), LGCP
(77–79)). In our work, we employ an RN-type recoupling
block, R18₁⁷ (43), to recouple the 1H–15N dipolar
coupling and at the same time suppress the 1H–1H
homonuclear dipolar interactions. During the R18₁⁷ recoupling
period, the 15N chemical shift is refocused by a spin echo (see
Note 20). Representative 1H–15N dipolar lineshapes are
illustrated for reassembled thioredoxin in Fig. 8.
5. Extract the CSA and dipolar tensor parameters from the 3D
ROCSA-NCA and 3D DIPSHIFT-NCA experiments by
numerical simulations of the experimental lineshapes in
SIMPSON (63) or SPINEVOLUTION (64) to find the best
fit to the experimental results.
Fig. 8. Dynamics information extracted from 3D-ROCSA and 3D-DIPSHIFT experiments in 1–73(U-13C,15N)/74–108(U-15N)
reassembled thioredoxin. (a) Representative 15N CSA lineshapes; the fit values are: G21, δσ = 34 ± 2 ppm, η = 1.0 ± 0.25;
R73, δσ = 75 ± 5 ppm, η = 0.16 ± 0.11; V25, δσ = 90 ± 4 ppm, η = 0.20 ± 0.10; T8, δσ = 97 ± 4 ppm, η = 0.20 ± 0.08.
(b) Chemical shift anisotropy δσ plotted as a function of the residue number; (c) Representative 15N–1H dipolar lineshapes;
(d) Dipolar order parameters <S> plotted as a function of the residue number. Reproduced from ref. 28 with permission
from the American Chemical Society.
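The single-exponential fit described in step 1 can be sketched as follows. This is a minimal illustration, not the authors' analysis procedure: it linearizes the decay by fitting ln(I) against the delay with ordinary least squares, which is adequate for clean data, whereas noisy peak intensities would normally call for a weighted nonlinear fit. The function name, the delays, and the synthetic intensities are all hypothetical.

```python
import math


def fit_r1(delays_s, intensities):
    """Fit I = I0 * exp(-R1 * t) by least squares on ln(I) versus t.

    Returns (I0, R1). Intensities must be positive for the logarithm.
    """
    if len(delays_s) != len(intensities) or len(delays_s) < 2:
        raise ValueError("need at least two (delay, intensity) pairs")
    if any(i <= 0 for i in intensities):
        raise ValueError("intensities must be positive")
    n = len(delays_s)
    log_i = [math.log(i) for i in intensities]
    t_mean = sum(delays_s) / n
    y_mean = sum(log_i) / n
    sxx = sum((t - t_mean) ** 2 for t in delays_s)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(delays_s, log_i))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -slope


# Synthetic, noise-free decay for one residue (hypothetical values).
delays = [0.1, 0.5, 1.0, 2.0, 4.0]                      # seconds
peaks = [100.0 * math.exp(-0.25 * t) for t in delays]   # arbitrary units

i0, r1 = fit_r1(delays, peaks)
print(f"I0 = {i0:.2f}, R1 = {r1:.3f} s^-1, T1 = {1.0 / r1:.2f} s")
```

Repeating the fit for each resolved cross peak in the series of 2D NCA spectra gives the residue-specific R1 values.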
3.10. Solid-State NMR Experiments for Structural Analysis of Protein Interfaces
1. The REDOR block (80) is incorporated into a family of NMR
pulse sequences to differentiate between 15N nuclei in two
distinct environments. REDOR reintroduces the dipolar
coupling between 15N and 13C (or between 13C and 1H) nuclei,
which would otherwise be suppressed by MAS. Therefore, in
the differentially enriched 1–73(U-13C,15N)/74–108(U-15N)
thioredoxin reassembly, the 15N or 1H magnetization in the
U-13C,15N-enriched fragment can be selectively dephased during
the REDOR period by the reintroduced 15N–13C or 1H–13C
heteronuclear dipolar coupling, respectively. The 15N or 1H
magnetization in the U-15N fragment is retained, allowing
for subsequent polarization transfer through the interface or
within the 15N fragment. This results in either intermolecular or
intramolecular isotopically edited correlations, depending on the
desired information (see Note 21).
2. In the REDOR–PAINCP sequence, the 13C–15N REDOR
period is introduced after the initial 1H–15N CP. The residual
unwanted 13C transverse magnetization excited by REDOR is
removed by the 15N Z-filter, after which only the 15N
magnetization of the 15N-enriched fragment is retained. After the 15N t1
chemical shift evolution period, the 15N magnetization is
transferred to 13C using the heteronuclear 15N–13C PAINCP
(11, 76) step followed by the detection of 13C chemical shift
evolution in the t2 period. Since only one of the two fragments
(U-15N,13C) is 13C enriched, the 15N–13C cross peaks represent
exclusively intermolecular through-interface correlations (see
Note 22).
3. In the PDSD–REDOR sequence, a 15N proton-driven spin
diffusion (PDSD) mixing period is introduced after the initial
1H–15N CP step followed by a 15N t1 chemical shift evolution
period to establish sequential 15N–15N correlations. The
subsequent 13C–15N REDOR dephasing period removes the 15N
signals arising from the U-13C,15N-enriched fragment, and
therefore, in the final spectrum only sequential 15N–15N correlations
from the U-15N labeled fragment can be detected, resulting
in considerable spectral simplification due to isotopic editing
(see Note 23).
4. In the HETCOR–REDOR sequence, the initial part is the
FSLG-based 1H–15N HETCOR experiment (81) employing a
flat 1H–15N CP with a short contact time to establish one-bond
1H–15N correlations between the amide proton and nitrogen
atoms in the entire protein. The 15N magnetization arising
from the U-13C,15N-enriched fragment is dephased in the
subsequent REDOR period, which is introduced after the FSLG-CP
part of the sequence. In the final spectrum only 1H–15N
correlations from nuclei in the U-15N labeled fragment are
detected, resulting in considerable spectral simplification due
to isotopic editing (see Note 24).
5. In the REDOR–HETCOR sequence, the 13C–1H REDOR
filter is employed for the dephasing of the 1H magnetization
from the (U-15N,13C) enriched fragment. Under the
experimental conditions, 1H magnetization dephasing is also observed
in the part of the (U-15N) enriched fragment constituting the
intermolecular interface. Following the t1 evolution under
FSLG, the 1H magnetization is transferred to 15N by a flat CP
with a short contact time. The 15N signal is detected during the
t2 period. The final spectrum contains the 1H–15N correlations
arising solely from the residues of the U-15N-enriched fragment,
while cross peaks due to the residues constituting
the intermolecular interface are either absent or display
reduced intensity because of their full or partial 13C/1HN REDOR
dephasing. A combination of HETCOR/REDOR and
REDOR/HETCOR experiments therefore yields information on the
intramolecular 1H–15N correlations in the (U-15N) enriched
fragment as well as on the 1H–15N correlations of the residues
composing the intermolecular interface (see Note 25).
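The REDOR filters above rely on 15N–13C and 1H–13C dipolar couplings, whose relative size at a fixed distance follows directly from the gyromagnetic ratios. The sketch below evaluates the standard dipolar coupling constant, d = (μ0/4π)·ħ·γ1·γ2/(2π·r³), and is meant only as an order-of-magnitude check; the function name and the 3-Å example distance are illustrative assumptions, and the physical constants are rounded literature values.

```python
import math

MU0_OVER_4PI = 1.0e-7          # T m A^-1
HBAR = 1.054571817e-34         # J s

# Gyromagnetic ratios in rad s^-1 T^-1 (rounded literature values).
GAMMA = {"1H": 267.522e6, "13C": 67.2828e6, "15N": -27.116e6}


def dipolar_coupling_hz(nucleus1, nucleus2, r_angstrom):
    """Magnitude of the dipolar coupling constant in Hz at distance r."""
    r_m = r_angstrom * 1.0e-10
    d = MU0_OVER_4PI * HBAR * GAMMA[nucleus1] * GAMMA[nucleus2]
    return abs(d / (2.0 * math.pi * r_m ** 3))


r = 3.0  # hypothetical internuclear distance in angstroms
d_hc = dipolar_coupling_hz("1H", "13C", r)
d_nc = dipolar_coupling_hz("15N", "13C", r)
ratio = d_hc / d_nc

print(f"1H-13C at {r} A: {d_hc:.0f} Hz")
print(f"15N-13C at {r} A: {d_nc:.0f} Hz")
print(f"ratio: {ratio:.2f}")
```

At equal distance the ratio reduces to |γ(1H)/γ(15N)|, slightly below ten, which is consistent with the factor of about ten quoted in Note 25.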
4. Notes
1. Distilled water should be used instead of Millipore pure water
for preparing M9 medium.
2. To improve the efficiency of purification, we used chromatography
columns purchased from GE Healthcare (unless otherwise
indicated), which is also the producer of the AKTA FPLC system
used in our lab. Columns or resins from other vendors may
also work, but the procedure details (e.g., buffer conditions)
will differ and need to be optimized.
3. To achieve adequate resolution, the size exclusion column
packed with Sephadex G-50 should be long enough (>160 cm
for separating two fragments and >140 cm for repurification).
4. To obtain SSNMR spectra with narrow lines, it is critical to
preserve conformational homogeneity during the preparation
of SSNMR samples. Controlled precipitation is a general pro-
tocol that allows for generating conformationally homoge-
neous SSNMR samples of proteins and protein assemblies that
are intrinsically soluble. For controlled precipitation, hanging
drop screening is performed first to identify the suitable condi-
tions. In our work, we typically employ polyethylene glycol
(PEG) of various molecular weights as the precipitant because
in our experience, precipitation conditions for virtually any
well-behaved protein or protein complex can be successfully
established, leading to high-quality samples. In order to pack more
protein sample into the MAS rotor, supernatant should be
removed from the protein/PEG pellet generated by controlled
precipitation. To preserve conformational homogeneity, the
pellet should be kept hydrated.
5. Protein expression is induced for only 2 h 30 min to limit
scrambling of the isotope labels between labeled and unlabeled
amino acids.
6. To prepare CA assembly of conical morphology, the final
concentration should be 32 mg/mL. Lyophilization of CA
enables the direct preparation of CA solution in PEG-20,000
at an initial concentration of 32 mg/mL.
7. Introducing 10 mM Cu(II)-EDTA in the precipitant allows
the Cu(II)-EDTA complex to diffuse into the CA sample and
to enhance proton longitudinal relaxation, thus permitting
shorter recycle delays in the NMR experiments under very
fast-MAS conditions (MAS frequencies of 40 kHz or greater)
(82, 83).
8. The spherical assemblies are not stable in solution, but are
stable and retain their morphology for many weeks when dried
under N2 gas.
9. When observing conical assemblies with confocal microscopy,
an excess of staining solution is desirable.
10. When confocal microscopy is employed for imaging the
tubular CA assemblies, the high salt content quenches some of
the fluorescence, and a higher receiver gain in the fluorescence
channel is needed to get good quality images.
11. The humidity in the ambient environment is critical for the
cryo-SEM experiment. Ice tends to accumulate on the surface
and cover the details of the structures when humidity is high.
12. The protocol of Marley et al. employs modified M9 medium
(89). The timings and the conditions of the individual steps
have to be optimized for a specific protein. For CAP-Gly,
E. coli cells are grown in LB medium until O.D. at 600 nm
reaches 0.8. Cells are pelleted, washed with M9 medium without
a nitrogen or carbon source, and then transferred to the M9
growth medium whose volume is a quarter of that of the LB
medium culture. After 1 h of recovery, expression of SMT-His6-
CAP-Gly is induced by addition of IPTG to 0.8 mM. After
another 4 h, cells are harvested for protein purification.
13. The His6-ULP1 expression system (in E. coli) is a gift from Weill
Cornell Medical College. ULP1 is a cysteine protease (71). For
efficient cleavage by the His6-ULP1 enzyme, DTT is added to
the mixture of His6-ULP1 and (SMT-His6)-CAP-Gly to a final
concentration of 1–5 mM.
14. The CAP-Gly/microtubule ratio is optimized by a co-sedimentation
assay (19).
15. Microtubules are fragile protein assemblies and their morphology
may be altered during various biochemical manipulations.
To prevent shearing of microtubules upon pipetting, the narrow
ends of 100-μL pipette tips are cut off. Prior to and after MAS
experiments, microtubule morphologies have to be examined
by TEM to ensure that the microtubules remain intact.
16. DARR and PDSD mixing times are strongly magnetic field
dependent, and polarization transfer at higher fields is slower.
The mixing time is determined experimentally for a specific
magnetic field strength. For example, the mixing times for one-
bond correlations at 14.1 T are ca. 2–10 ms; at 17.6 T ca. 10 ms,
and at 21.1 T ca. 50 ms.
17. DREAM is a double-quantum homonuclear recoupling sequence,
and the magnetization generated by DREAM is opposite in sign
to the original polarization. Therefore, for the NCACB
experiment, the N–Cα SPECIFIC-CP should be carefully optimized to
avoid two-bond N–Cβ magnetization transfer. Cβ magnetization
generated by nonselective SPECIFIC-CP would cancel signals
generated by DREAM.
18. The pulse lengths and power levels are similar to those in experi-
ments for resonance assignments, with additional 15N T1 or T2
filter delays introduced in the corresponding experiments.
19. In our experiments, a C2₂¹ POST block (84) is used with
(a, b) = (0.0329, 0.467) and one rotor period (100 μs) increment
per t1 point. During ROCSA, a 10-μs 13C π pulse with XY-8
phase cycling scheme (85) is introduced in the middle of every
rotor period on the 13C channel, and 110 kHz CW decoupling
is employed on the 1H channel.
20. In this study, we used the R18₁⁷ = {180₇₀ 180₋₇₀}⁹ element (43)
with a 10 kHz MAS spinning frequency.
21. During the 13C–15N Rotational Echo Double Resonance (REDOR)
dephasing, 100 kHz 1H TPPM decoupling is employed, and
the XY-8 phasing scheme (86) is applied to minimize
resonance offset effects in the rotor-synchronized π-pulse train. The
13C and 15N radio frequency field strengths are both 50 kHz.
Generally, the REDOR dephasing time needs to be optimized;
under our experimental conditions, a dephasing time
longer than 6 ms ensured complete suppression of 15N signals
from the U-15N,13C labeled fragment.
22. During the Proton-Assisted Insensitive Nuclei Cross Polarization
(PAINCP) transfers (11), the radio frequency field strengths
on the 13C and 15N channels are 45 kHz, while the field strength
on the 1H channel is optimized for each experiment and is
57–63 kHz. In the 1H–15N heteronuclear correlation experiments,
a flat CP with a short contact time of 170 μs is used.
23. The N–N PDSD mixing time utilized in the REDOR–PDSD
experiments is 4 s. Under these conditions, almost all of the
cross peaks are from the sequential N(i)–N(i−1) correlations, and
the cross peak intensities are 10–30% of the corresponding
diagonal signals. The cross peaks from N(i)–N(i−2) correlations are
too weak to be detected.
24. The 1H–1H homonuclear dipolar couplings are suppressed by
the Frequency-Switched Lee-Goldburg (FSLG) scheme (81), which
in the PMLG variant (87) can be implemented by ramping the
phase of the proton radio frequency while keeping the proton
carrier frequency unchanged.
25. The 1HN–13C dipolar interaction is ten times stronger than the
15N–13C coupling when the internuclear distances are identical.
Therefore, the dephasing effect of 13C–1HN REDOR is generally
stronger than that of 13C–15N REDOR. As a result, the 13C–1HN
REDOR, in addition to dephasing the HN signals from the
U-15N,13C labeled thioredoxin fragment, will give rise to partial
dephasing of HN signals belonging to the singly 15N labeled
fragment and lining the intermolecular interface of reassembled
thioredoxin. Under our experimental conditions, where
3.2 ms 1H–13C REDOR dephasing is employed, HN signals
from the 15N,13C labeled fragment are eliminated completely,
and 1HN signals corresponding to the residues at the interface
also disappear. The signals belonging to residues away from the
interface are not affected. This experiment allows identification of
amino acid residues that constitute the intermolecular interface.
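The factor of ten quoted in Note 25 follows from the ratio of the 1H and 15N gyromagnetic ratios. The short Python sketch below illustrates this by evaluating the standard dipolar coupling constant, d = −(μ0/4π)γ1γ2ħ/r³, for both spin pairs at the same distance. The function name and the 2.5 Å distance are illustrative assumptions, not values taken from the experiments described here.

```python
import math

# Physical constants in SI units
MU0_OVER_4PI = 1.0e-7        # T m / A
HBAR = 1.054571817e-34       # J s

# Gyromagnetic ratios in rad s^-1 T^-1 (standard tabulated values)
GAMMA = {"1H": 267.522e6, "13C": 67.283e6, "15N": -27.116e6}


def dipolar_coupling_hz(spin_a, spin_b, r_angstrom):
    """Dipolar coupling constant d/(2*pi) in Hz for two spins r_angstrom apart."""
    r = r_angstrom * 1.0e-10  # convert angstrom to metres
    d_rad = -MU0_OVER_4PI * GAMMA[spin_a] * GAMMA[spin_b] * HBAR / r**3
    return d_rad / (2.0 * math.pi)


# Hypothetical distance, chosen only to compare the two couplings
r = 2.5
d_hc = dipolar_coupling_hz("1H", "13C", r)
d_nc = dipolar_coupling_hz("15N", "13C", r)

# The distance cancels in the ratio, leaving |gamma(1H) / gamma(15N)|
ratio = abs(d_hc / d_nc)
print(f"1H-13C: {d_hc:.0f} Hz, 15N-13C: {d_nc:.0f} Hz, ratio: {ratio:.2f}")
```

Because the distance cancels, the ratio is close to 9.9 for any shared internuclear distance, consistent with the statement in Note 25.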
Acknowledgments
The projects discussed here are supported by the National Institute
of General Medical Sciences (NIH Grants P50GM082251 and
R01GM085306) and the National Center for Research Resources
(NIH Grants P20RR017716-07 and P20RR015588). The authors
thank Maria Luisa Tasayco, Dabeiba Marulanda, Jun Yang, Marcela
Cataldi, and Vilma Arriaran for their contributions to the preparation
of thioredoxin reassemblies and/or solid-state NMR studies of
these reassemblies.
References
1. Yool A. J. (2007) Aquaporins: Multiple roles in the central nervous system. Neuroscientist 13, 470–485.
2. Vale R. D. (2003) The molecular motor toolbox for intracellular transport. Cell 112, 467–480.
3. Grunewald K. & Cyrklaff M. (2006) Structure of complex viruses and virus-infected cells by electron cryo tomography. Curr. Opin. Microbiol. 9, 437–442.
4. Klein K. C., Reed J. C., & Lingappa J. R. (2007) Intracellular destinies: Degradation, targeting, assembly, and endocytosis of HIV gag. AIDS Rev. 9, 150–161.
5. Uysal H., et al. (2010) Antibodies to citrullinated proteins: molecular interactions and arthritogenicity. Immunol. Rev. 233, 9–33.
6. Goldbourt A., Gross B. J., Day L. A., & McDermott A. E. (2007) Filamentous phage studied by magic-angle spinning NMR: Resonance assignment and secondary structure of the coat protein in Pf1. J. Am. Chem. Soc. 129, 2338–2344.
7. Hong M. (2007) Structure, topology, and dynamics of membrane peptides and proteins from solid-state NMR Spectroscopy. J. Phys. Chem. B 111, 10340–10351.
8. Lange A., et al. (2006) Toxin-induced conformational changes in a potassium channel revealed by solid-state NMR. Nature 440, 959–962.
9. Porcelli F., Buck-Koehntop B. A., Thennarasu S., Ramamoorthy A., & Veglia G. (2006) Structures of the dimeric and monomeric variants of magainin antimicrobial peptides (MSI-78 and MSI-594) in micelles and bilayers, determined by NMR spectroscopy. Biochemistry 45, 5793–5799.
10. Zheng Z., Yang R., Bodner M. L., & Weliky D. P. (2006) Conformational flexibility and strand arrangements of the membrane-associated HIV fusion peptide trimer probed by solid-state NMR spectroscopy. Biochemistry 45, 12960–12975.
11. Lewandowski J. R., De Paepe G., & Griffin R. G. (2007) Proton assisted insensitive nuclei cross polarization. J. Am. Chem. Soc. 129, 728–729.
12. Chimon S. & Ishii Y. (2005) Capturing intermediate structures of Alzheimer’s beta-amyloid, A beta(1–40), by solid-state NMR spectroscopy. J. Am. Chem. Soc. 127, 13472–13473.
13. Jaroniec C. P., et al. (2004) High-resolution molecular structure of a peptide in an amyloid fibril determined by magic angle spinning NMR spectroscopy. Proc. Natl. Acad. Sci. USA 101, 711–716.
14. Petkova A. T., et al. (2004) Solid state NMR reveals a pH-dependent antiparallel beta-sheet registry in fibrils formed by a beta-amyloid peptide. J. Mol. Biol. 335, 247–260.
15. Shewmaker F., Wickner R. B., & Tycko R. (2006) Amyloid of the prion domain of Sup35p has an in-register parallel beta-sheet structure. Proc. Natl. Acad. Sci. USA 103, 19754–19759.
16. Siemer A. B., et al. (2006) Observation of highly flexible residues in amyloid fibrils of the HET-s prion. J. Am. Chem. Soc. 128, 13224–13228.
17. Tycko R. (2006) Molecular structure of amyloid fibrils: insights from solid-state NMR. Q. Rev. Biophys. 39, 1–55.
18. Han Y., et al. (2010) Solid-State NMR Studies of HIV-1 Capsid Protein Assemblies. J. Am. Chem. Soc. 132, 1976–1987.
19. Sun S. J., Siglin A., Williams J. C., & Polenova T. (2009) Solid-State and Solution NMR Studies of the CAP-Gly Domain of Mammalian Dynactin and Its Interaction with Microtubules. J. Am. Chem. Soc. 131, 10113–10126.
20. Etzkorn M., Bockmann A., Lange A., & Baldus M. (2004) Probing molecular interfaces using 2D magic-angle-spinning NMR on protein mixtures with different uniform labeling. J. Am. Chem. Soc. 126, 14746–14751.
21. Marulanda D., et al. (2004) Magic angle spinning solid-state NMR spectroscopy for structural studies of protein interfaces. Resonance assignments of differentially enriched Escherichia coli thioredoxin reassembled by fragment complementation. J. Am. Chem. Soc. 126, 16608–16620.
22. Yang J., et al. (2007) Magic angle spinning NMR spectroscopy of thioredoxin reassemblies. Magn. Reson. Chem. 45, S73–S83.
23. Castellani F., et al. (2002) Structure of a protein determined by solid-state magic-angle-spinning NMR spectroscopy. Nature 420, 98–102.
24. Hong M. & Jakes K. (1999) Selective and extensive 13C labeling of a membrane protein for solid-state NMR investigations. J. Biomol. NMR 14, 71–74.
25. Muchmore D. C., McIntosh L. P., Russell C. B., Anderson D. E., & Dahlquist F. W. (1989) Expression and nitrogen-15 labeling of proteins for proton and nitrogen-15 nuclear magnetic resonance. Methods Enzymol. 177, 44–73.
26. Marulanda D., Tasayco M. L., Cataldi M., Arriaran V., & Polenova T. (2005) Resonance assignments and secondary structure analysis of E. coli thioredoxin by magic angle spinning solid-state NMR spectroscopy. J. Phys. Chem. B 109, 18135–18145.
27. Yang J., Tasayco M. L., & Polenova T. (2008) Magic angle spinning NMR experiments for structural studies of differentially enriched protein interfaces and protein assemblies. J. Am. Chem. Soc. 130, 5798–5807.
28. Yang J., Tasayco M. L., & Polenova T. (2009) Dynamics of Reassembled Thioredoxin Studied by Magic Angle Spinning NMR: Snapshots from Different Time Scales. J. Am. Chem. Soc. 131, 13690–13702.
29. Franks W., Kloepper K., Wylie B., & Rienstra C. (2007) Four-dimensional heteronuclear correlation experiments for chemical shift assignment of solid proteins. J. Biomol. NMR 39, 107–131.
30. Andrew E. R., Bradbury A., & Eades R. G. (1958) Nuclear Magnetic Resonance Spectra from a Crystal Rotated at High Speed. Nature 182, 1659.
31. Schaefer J., McKay R. A., & Stejskal E. O. (1979) Double-cross-polarization NMR of solids. J. Magn. Reson. 34, 443–447.
32. Baldus M., Petkova A. T., Herzfeld J., & Griffin R. G. (1998) Cross polarization in the tilted frame: assignment and spectral simplification in heteronuclear spin systems. Mol. Phys. 95, 1197–1207.
33. Szeverenyi N. M., Sullivan M. J., & Maciel G. E. (1982) Observation of spin exchange by two-dimensional fourier transform 13C cross polarization-magic-angle spinning. J. Magn. Reson. 47, 462–475.
34. Takegoshi K., Nakamura S., & Terao T. (2001) 13C-1H dipolar-assisted rotational resonance in magic-angle spinning NMR. Chem. Phys. Lett. 344, 631–637.
35. Verel R., Baldus M., Ernst M., & Meier B. H. (1998) A homonuclear spin-pair filter for solid-state NMR based on adiabatic-passage techniques. Chem. Phys. Lett. 287, 421–428.
36. Bennett A. E., et al. (1998) Homonuclear radio frequency-driven recoupling in rotating solids. J. Chem. Phys. 108, 9463–9479.
37. Hohwy M., Rienstra C. M., Jaroniec C. P., & Griffin R. G. (1999) Fivefold symmetric homonuclear dipolar recoupling in rotating solids: Application to double quantum spectroscopy. J. Chem. Phys. 110, 7983–7992.
38. Ernst M., Detken A., Bockmann A., & Meier B. H. (2003) NMR spectra of a microcrystalline protein at 30 kHz MAS. J. Am. Chem. Soc. 125, 15807–15810.
39. Chen L. L., et al. (2007) J-based 2D homonuclear and heteronuclear correlation in solid-state proteins. Magn. Reson. Chem. 45, S84–S92.
40. Chen L., et al. (2006) Constant-Time Through-Bond 13C Correlation Spectroscopy for Assigning Protein Resonances with Solid-State NMR Spectroscopy. J. Am. Chem. Soc. 128, 9992–9993.
41. Chen L., et al. (2007) Backbone assignments in solid-state proteins using J-based 3D Heteronuclear correlation spectroscopy. J. Am. Chem. Soc. 129, 10650–10651.
42. Griffin R. G. (1998) Dipolar recoupling in MAS spectra of biological solids. Nat. Struct. Biol. 5 Suppl, 508–512.
43. Zhao X., Edén M., & Levitt M. H. (2001) Recoupling of heteronuclear dipolar interactions in solid-state NMR using symmetry-based pulse sequences. Chem. Phys. Lett. 342, 353–361.
44. Ladizhansky V. (2009) Homonuclear dipolar recoupling techniques for structure determination in uniformly 13C-labeled proteins. Solid State Nucl. Magn. Reson. 36, 119–128.
45. Balayssac S., Bertini I., Lelli M., Luchinat C., & Maletta M. (2007) Paramagnetic Ions Provide Structural Restraints in Solid-State NMR of Proteins. J. Am. Chem. Soc. 129, 2218–2219.
46. Nadaud P. S., Helmus J. J., Kall S. L., & Jaroniec C. P. (2009) Paramagnetic Ions Enable Tuning of Nuclear Relaxation Rates and Provide Long-Range Structural Restraints in Solid-State NMR of Proteins. J. Am. Chem. Soc. 131, 8108–8120.
47. Xu X., et al. (2009) Intermolecular dynamics studied by paramagnetic tagging. J. Biomol. NMR 43, 247–254.
48. Lian L.-Y. & Middleton D. A. (2001) Labelling approaches for protein structural studies by solution-state and solid-state NMR. Prog. Nucl. Magn. Reson. Spectrosc. 39, 171–190.
49. Schubert M., Manolikas T., Rogowski M., & Meier B. H. (2006) Solid-state NMR spectroscopy of 10% 13C labeled ubiquitin: spectral simplification and stereospecific assignment of isopropyl groups. J. Biomol. NMR 35, 167–173.
50. Delaglio F., et al. (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293.
51. Mobli M., Maciejewski M. W., Gryk M. R., & Hoch J. C. (2007) Automatic maximum entropy spectral reconstruction in NMR. J. Biomol. NMR 39, 133–139.
52. Johnson B. A. & Blevins R. A. (1994) NMR View: A computer program for the visualization and analysis of NMR data. J. Biomol. NMR 4, 603–614.
53. Goddard T. D. & Kneller D. G. Sparky 3 (University of California, San Francisco).
54. Vranken W. F., et al. (2005) The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins 59, 687–696.
55. Kraulis P. J. (1989) ANSIG: A program for the assignment of protein 1H 2D NMR spectra by interactive computer graphics. J. Magn. Reson. 84, 627–633.
56. Matsuki Y., Eddy M. T., & Herzfeld J. (2009) Spectroscopy by Integration of Frequency and Time Domain Information for Fast Acquisition of High-Resolution Dark Spectra. J. Am. Chem. Soc. 131, 4648–4656.
57. Laue E. D., Skilling J., Staunton J., Sibisi S., & Brereton R. G. (1985) Maximum Entropy Method in Nuclear Magnetic Resonance Spectroscopy. J. Magn. Reson. 62, 437–452.
58. Hoch J. C., Stern A. S., Donoho D. L., & Johnstone I. M. (1990) Maximum-Entropy Reconstruction of Complex (Phase-Sensitive) Spectra. J. Magn. Reson. 86, 236–246.
59. Barna J. C. J., Laue E. D., Mayger M. R., Skilling J., & Worrall S. J. P. (1987) Exponential Sampling, an Alternative Method for Sampling in Two-Dimensional NMR Experiments. J. Magn. Reson. 73, 69–77.
60. de Bouregas F. S. & Waugh J. S. (1992) ANTIOPE, a program for computer experiments on spin dynamics. J. Magn. Reson. 96, 280–289.
61. Smith S. A., Levante T. O., Meier B. H., & Ernst R. R. (1994) Computer Simulations in Magnetic Resonance. An Object-Oriented Programming Approach. J. Magn. Reson., Ser A 106, 75–105.
62. Blanton W. B. (2003) BlochLib: a fast NMR C++ tool kit. J. Magn. Reson. 162, 269–283.
63. Bak M., Rasmussen J. T., & Nielsen N. C. (2000) SIMPSON: A General Simulation Program for Solid-State NMR Spectroscopy. J. Magn. Reson. 147, 296–330.
64. Veshtort M. & Griffin R. G. (2006) SPINEVOLUTION: A powerful tool for the simulation of solid and liquid state NMR experiments. J. Magn. Reson. 178, 248–282.
65. Erickson-Viitanen S., et al. (1989) Cleavage of HIV-1 gag polyprotein synthesized in vitro: sequential cleavage by the viral protease. AIDS Res. Hum. Retroviruses 5, 577–591.
66. Langsetmo K., Fuchs J., & Woodward C. (1989) Escherichia coli Thioredoxin Folds into 2 Compact Forms of Different Stability to Urea Denaturation. Biochemistry 28, 3211–3220.
67. Tasayco M. L. & Chao K. (1995) NMR study of the reconstitution of the beta-sheet of thioredoxin by fragment complementation. Proteins 22, 41–44.
68. Slaby I. & Holmgren A. (1975) Reconstitution of Escherichia coli thioredoxin from complementing peptide fragments obtained by cleavage at methionine-37 or arginine-73. J. Biol. Chem. 250, 1340–1347.
69. Sambrook J., Fritsch E. F., & Sambrook J. (1989) Molecular cloning: a laboratory manual (Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.) 2nd Ed.
70. Byeon I. J., et al. (2009) Structural convergence between Cryo-EM and NMR reveals intersubunit interactions critical for HIV-1 capsid function. Cell 139, 780–790.
71. Mossessova E. & Lima C. D. (2000) Ulp1-SUMO Crystal Structure and Genetic Analysis Reveal Conserved Interactions and a Regulatory Element Essential for Cell Growth in Yeast. Mol. Cell 5, 865–876.
72. Bennett A. E., Rienstra C. M., Auger M., Lakshmi K. V., & Griffin R. G. (1995) Heteronuclear Decoupling in Rotating Solids. J. Chem. Phys. 103, 6951–6958.
73. Chan J. C. C. & Tycko R. (2003) Recoupling of chemical shift anisotropies in solid-state NMR under high-speed magic-angle spinning and in uniformly 13C-labeled systems. J. Chem. Phys. 118, 8378–8389.
74. Wylie B. J., Franks W. T., & Rienstra C. M. (2006) Determinations of 15N Chemical Shift Anisotropy Magnitudes in a Uniformly 15N,13C-Labeled Microcrystalline Protein by Three-Dimensional Magic-Angle Spinning Nuclear Magnetic Resonance Spectroscopy. J. Phys. Chem. B 110, 10926–10936.
75. Munowitz M., Aue W. P., & Griffin R. G. (1982) Two-dimensional separation of dipolar and scaled isotropic chemical shift interactions in magic angle NMR spectra. J. Chem. Phys. 77, 1686–1689.
76. Hohwy M., Jaroniec C. P., Reif B., Rienstra C. M., & Griffin R. G. (2000) Local structure and relaxation in solid-state NMR: Accurate measurement of amide N-H bond lengths and H-N-H bond angles. J. Am. Chem. Soc. 122, 3218–3219.
77. Hong M., Yao X., Jakes K., & Huster D. (2002) Investigation of Molecular Motions by Lee-Goldburg Cross-Polarization NMR Spectroscopy. J. Phys. Chem. B 106, 7355–7364.
78. van Rossum B. J., de Groot C. P., Ladizhansky V., Vega S., & de Groot H. J. M. (2000) A method for measuring heteronuclear (1H-13C) distances in high speed MAS NMR. J. Am. Chem. Soc. 122, 3465–3472.
79. Lorieau J. L. & McDermott A. E. (2006) Conformational flexibility of a microcrystalline globular protein: order parameters by solid-state NMR spectroscopy. J. Am. Chem. Soc. 128, 11505–11512.
80. Gullion T. & Schaefer J. (1989) Rotational-echo double-resonance NMR. J. Magn. Reson. 81, 196–200.
81. Bielecki A., Kolbert A. C., & Levitt M. H. (1989) Frequency-switched pulse sequences: Homonuclear decoupling and dilute spin NMR in solids. Chem. Phys. Lett. 155, 341–346.
82. Wickramasinghe N. P., Kotecha M., Samoson A., Past J., & Ishii Y. (2007) Sensitivity enhancement in 13C solid-state NMR of protein microcrystals by use of paramagnetic metal ions for optimizing 1H T1 relaxation. J. Magn. Reson. 184, 350–356.
83. Wickramasinghe N. P., et al. (2009) Nanomole-scale protein solid-state NMR by breaking intrinsic 1H T1 boundaries. Nat. Methods 6, 215–218.
84. Carravetta M., Edén M., Zhao X., Brinkmann A., & Levitt M. H. (2000) Symmetry principles for the design of radiofrequency pulse sequences in the nuclear magnetic resonance of rotating solids. Chem. Phys. Lett. 321, 205–215.
85. Holl S. M., McKay R. A., Gullion T., & Schaefer J. (1990) Rotational-echo triple-resonance NMR. J. Magn. Reson. 89, 620–626.
86. Gullion T., Baker D. B., & Conradi M. S. (1990) New, compensated Carr-Purcell sequences. J. Magn. Reson. 89, 479–484.
87. Vinogradov E., Madhu P. K., & Vega S. (1999) High-resolution proton solid-state NMR spectroscopy by phase-modulated Lee-Goldburg experiment. Chem. Phys. Lett. 314, 443–450.
88. Erickson-Viitanen S., Manfredi J., Viitanen P., Tribe D. E., Tritch R., Hutchison C. A., 3rd, Loeb D. D., & Swanstrom R. (1989) Cleavage of HIV-1 gag polyprotein synthesized in vitro: sequential cleavage by the viral protease. AIDS Res. Hum. Retroviruses 5, 577–591.
89. Marley J., Lu M., & Bracken C. (2001) A method for efficient isotopic labeling of recombinant proteins. J. Biomol. NMR 20, 71–75.
Chapter 18
Synthesis, Purification, and Characterization of Single Helix
Membrane Peptides and Proteins for NMR Spectroscopy
Miki Itaya, Ian C. Brett, and Steven O. Smith
Abstract
Membrane proteins function as receptors, channels, transporters, and enzymes. These proteins are gener-
ally difficult to express and purify in a functional form due to the hydrophobic nature of their membrane
spanning sequences. Studies on membrane proteins with a single membrane spanning helix have been
particularly challenging. Single-pass membrane proteins will often form dimers or higher order oligomers
in cell membranes as a result of sequence motifs that mediate specific transmembrane helix interactions.
Understanding the structural basis for helix association provides insights into how these proteins function.
Nevertheless, nonspecific association or aggregation of hydrophobic membrane spanning sequences can
occur when isolated transmembrane domains are reconstituted into membrane bilayers or solubilized into
detergent micelles for structural studies by solid-state or solution NMR spectroscopy. Here, we outline the
methods used to synthesize, purify, and characterize single transmembrane segments for structural studies.
Two synthetic strategies are discussed. The first strategy is to express hydrophobic peptides as protein
chimera attached to the maltose binding protein. The second strategy is by direct chemical synthesis.
Purification is carried out by several complementary chromatography methods. The peptides are solubi-
lized in detergent for solution NMR studies or reconstituted into model membranes for solid-state NMR
studies. We describe the methods used to characterize the reconstitution of these systems prior to NMR
structural studies to establish if there is nonspecific aggregation.
Key words: Membrane protein, NMR spectroscopy, Gp55-P, Epo receptor, Transmembrane
1. Introduction
Membrane proteins containing single membrane-spanning helices
are involved in a broad range of cellular functions, such as signal
transduction, cellular mobility, and apoptosis. Despite the impor-
tance of these proteins, relatively few structures have been reported.
For protein crystallography, the single transmembrane (TM) helix
presents a problem because it must remain embedded in a mem-
brane environment. The strategy for structural studies has often
been to separately crystallize the extracellular and intracellular
domains of these proteins and assume that the TM sequence forms
a passive tether. In contrast, nuclear magnetic resonance (NMR)
spectroscopy has emerged as an effective tool for probing the
membrane-spanning elements of membrane proteins (1–3). In addi-
tion, it is increasingly recognized that single TM helices can medi-
ate biological function by associating in specific orientations.
Solid-state NMR spectroscopy is well suited for structural studies
of TM proteins reconstituted into lipid bilayers (4–8) and has been
used to study helix–helix interfaces in single TM helix dimers
(9–11) as well as to determine their structure and orientation with
respect to the bilayer normal (12–14). Solution NMR spectroscopy
has been used to solve the structures of several membrane proteins
(15–23), and the structures of single TM helix dimers and higher
order oligomers have been deposited in the Protein Data Bank (24).
We discuss the methods that have been developed for the
synthesis, purification, and characterization of isolated TM heli-
ces using two different membrane proteins. The first membrane
protein is the murine erythropoietin (Epo) receptor (P14753).
The Epo receptor belongs to the cytokine receptor family and is
responsible for the production and development of red blood
cells. The second membrane protein is gp55-P, a viral membrane
protein from the murine spleen focus-forming virus (SFFV) that
interacts with and activates the murine Epo receptor (25–27).
Both proteins are thought to exist as TM-mediated homodimers
(28–30). For each protein, constructs of varying lengths are pro-
duced, ranging from peptides containing only the TM domain to
constructs that include the TM domain and portions of the intra-
cellular or extracellular domains. For the Epo receptor, the lon-
gest peptide sequence (EpoR218–368) used includes the TM domain
(residues 226–248) and two intracellular regions (the switch
region and Box 1/2 regions) that are important for activity. The
switch region (residues 249–256) and Box 1/2 region (residues
257–265, 303–312) are conserved across the cytokine receptors.
The orientation of the switch region controls receptor activity
(31), while residues in Box 1 are involved in binding the JAK2
kinase (32).
1.1. Synthesis of Peptides and Proteins with Single TM Helices
One of the limiting factors of membrane protein structure determination
is the increased difficulty of membrane protein production
and purification. Many different production and purification
schemes have been described for structural studies (22, 33–37).
For short membrane protein sequences, two particularly useful
methods are solid-phase peptide synthesis (38–40) and expression
in E. coli (41–43).
1.1.1. Solid-Phase Peptide Synthesis
The synthesis of membrane peptides by conventional solid-phase
methods can be challenging due to their hydrophobic nature.
During synthesis, the peptides tend to aggregate on the resin and
consequently the efficiency of the reactions needed to sequentially
add amino acids is reduced. Some of the concerns involving
hydrophobic peptide synthesis have previously been discussed
(40, 44). We list below several tips for improving the yield of the
desired peptide.
1. Incorporate charged amino acids at the C terminus. Typically,
single membrane spanning sequences are terminated by a series
of basic amino acids. Because chemical synthesis starts with the
C terminus and progresses toward the N terminus, the inclusion
of these amino acids in the synthesis helps to extend the growing
peptide from the surface of the resin.
2. Use a low density of reactive sites on the resin. The density of
peptides being synthesized on the solid resin support is deter-
mined by the density of the first amino acid coupled to the
resin. A lower density of peptide helps to prevent aggregation.
For hydrophobic peptide synthesis, 0.2 mmol of reactive sites/
gram of the resin (or less) are typically used.
3. Use strong amino acid activating reagents. Peptide aggrega-
tion lowers the efficiency of the reactions for elongating hydro-
phobic peptides (44). The efficiency of these reactions can be
improved by using a combination of HOAt and HATU in
place of the more commonly used reagents
O-benzotriazole-N,N,N′,N′-tetramethyl-uronium-hexafluoro-phosphate
(HBTU) and hydroxybenzotriazole (HOBt) or dicyclohexylcarbodiimide
(DCC) and HOBt (45–47).
4. Incorporate pseudoprolines. Amino acids such as serine,
threonine, and cysteine are difficult to incorporate into hydro-
phobic peptides during the elongation step. Insertion of pseu-
doprolines as temporary side chain protection helps to reduce
peptide self-association and β-sheet formation during synthesis
(48). The native amino acid is regenerated by acid treatment
and ring-opening upon cleavage from the resin.
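The tips above concern two quantities that are easy to estimate before starting a synthesis: the amount of low-loading resin required and the effect of per-step coupling efficiency on the amount of full-length peptide. The Python sketch below is illustrative only. It assumes that every coupling step proceeds with the same efficiency and that the first residue is already attached to the resin; the function names, the 0.1 mmol scale, and the 30-residue length are hypothetical choices.

```python
def resin_mass_g(scale_mmol, loading_mmol_per_g=0.2):
    """Resin mass (g) needed for a given synthesis scale and resin loading.

    The default loading of 0.2 mmol/g is the upper value suggested in the
    text for hydrophobic peptides.
    """
    if loading_mmol_per_g <= 0:
        raise ValueError("loading must be positive")
    return scale_mmol / loading_mmol_per_g


def full_length_fraction(n_residues, coupling_efficiency):
    """Fraction of chains that reach full length.

    Simplifying assumption: the first residue is preloaded on the resin, so
    n_residues - 1 couplings are needed, each with the same efficiency.
    """
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return coupling_efficiency ** (n_residues - 1)


# Hypothetical example: 0.1 mmol synthesis of a 30-residue TM peptide
print(f"resin needed: {resin_mass_g(0.1):.2f} g")
for eff in (0.99, 0.97, 0.95):
    frac = full_length_fraction(30, eff)
    print(f"efficiency {eff:.2f} -> full-length fraction {frac:.2f}")
```

Under these assumptions, a drop in per-step efficiency from 99% to 95% reduces the full-length fraction of a 30-residue peptide from roughly three quarters to under one quarter, which illustrates why aggregation on the resin has such a large effect on yield.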
1.1.2. Protein Overexpression
Many recombinant protein expression methods have been developed,
but the basic differences are in the properties of the vector
that govern the purification method (affinity tag) and the mechanism
of separation of the affinity tag from the protein of interest
(49). Common affinity tags are the hexahistidine (His),
glutathione-S-transferase (GST), and Flag tags. Incubating the cellular
lysate with a modified resin that interacts with the tag allows
purification of the protein of interest. These tags, while useful for
expression and purification of soluble proteins, have had mixed
success when used with hydrophobic proteins. More recent
approaches focus on combining a traditional affinity tag with
another tag that promotes solubility of the otherwise hydrophobic
fusion protein. For example, maltose binding protein (MBP) is
thought to act as a chaperone or to recruit chaperones to promote
the solubility of the hydrophobic fusion peptide (50, 51).
Initial attempts at membrane protein expression and purification
used fusion proteins with a simple affinity tag, such as a His tag,
but neglected a mechanism for removal of the tag. This approach
is undesirable because solubility and success of expression are variable
(43, 52). Also, residual His tags can drive protein oligomerization
(53). Proteolytic separation of the protein of interest from the
affinity/solubility tag can be accomplished using a protease, such
as thrombin, factor Xa, enterokinase, or the tobacco etch virus
(TEV) protease. A less common method of chemical proteolysis
makes use of cyanogen bromide, which cleaves the peptide backbone
at methionine residues.
Recently, a ligation-independent cloning (LIC) vector encoding
a His tag with a TEV protease site (54) has been adapted to include
MBP for the expression of otherwise insoluble or TM domain
containing constructs (21). A major difference between this and
other purification methods is that the protein of interest remains
soluble and does not get shuttled into inclusion bodies. One draw-
back is that there are three non-native residues (SNA) at the N
terminus of the protein after TEV protease cleavage. We have
successfully used this vector to express and purify single TM con-
taining constructs of varying lengths.
1.2. Purification of TM Peptides and Proteins
1.2.1. Peptide Purification
Reverse-phase HPLC is a widely used technique for purifying
hydrophobic peptides (55). The columns used are generally made
by attaching hydrocarbon chains of different lengths to a solid
support. The chain lengths (e.g., C4, C8, C18) characterize the type
of column. Columns with short hydrocarbon chain lengths generally
exhibit weaker interactions with the long hydrophobic peptide
being purified. The consequence is that the resolution is poorer,
but the yield of peptide is higher. Typically, the yield of pure
peptide per crude peptide weight is ~10–20% with a C4 column (40).
The contaminants for expressed and chemically synthesized
peptides are different. For chemically synthesized peptides, the
major protein contaminants are peptides that are one to two amino
acids shorter than the target sequence. In these peptides, the cou-
pling reaction failed at one or more steps. For expressed peptides,
the contaminants are generally other protein products with larger
differences in molecular weight than the target peptide.
Different hydrophobic peptides often require different solvent
conditions to achieve optimal separation. However, a water–ace-
tonitrile gradient is often a good starting point (40). The gradient
starts with a low concentration of acetonitrile in water, and the ace-
tonitrile concentration is increased in a linear fashion during the
purification (see Note 1). We present, in Subheading 3.2.1, a gen-
eral strategy for HPLC purification of a crude peptide mixture from
solid phase synthesis by using reverse phase chromatography.
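The linear water–acetonitrile gradient described above can be written as a simple function of time. The Python sketch below is a minimal illustration; the function name and the example values (10% to 90% acetonitrile over 60 min) are hypothetical and are not the conditions recommended in this chapter, which depend on the peptide.

```python
def percent_acetonitrile(t_min, start_pct, end_pct, duration_min):
    """Acetonitrile percentage at time t_min for a linear gradient.

    The composition is held at start_pct before the gradient begins and at
    end_pct once the gradient has finished.
    """
    if duration_min <= 0:
        raise ValueError("duration must be positive")
    # Clamp the elapsed fraction of the gradient to the range 0..1
    fraction = min(max(t_min / duration_min, 0.0), 1.0)
    return start_pct + (end_pct - start_pct) * fraction


# Hypothetical gradient: 10% to 90% acetonitrile over 60 min
for t in (0, 15, 30, 45, 60):
    pct = percent_acetonitrile(t, 10.0, 90.0, 60.0)
    print(f"{t:3d} min: {pct:.0f}% acetonitrile")
```

Noting the acetonitrile percentage at which the target peptide elutes in a first broad run makes it easier to design a shallower gradient around that value in later runs.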
1.2.2. Protein Purification
Once the fusion protein has been expressed and the TM domain-containing
construct has been separated from the His-MBP tag, the
protein of interest must be separated from this mix. Two different
methods can be employed to accomplish this: organic extraction
(Subheading 3.2.3) and aqueous purification (Subheading 3.2.4).
Constructs that have a high percentage of hydrophobic residues can
be solubilized from a dried mixture after TEV protease cleavage
using organic solvents. This method depends on two factors: a high
hydrophobic content of the protein of interest and low solubility in
organic solvents of the other contaminants (His-TEV, His-MBP,
His-MBP-fusion protein). A drawback of this method is that
denaturation of the protein of interest is likely. In contrast, the
aqueous method leaves the protein of interest in its natively folded
state and uses the His tag present on all contaminants to remove
them from solution. To follow the purification and cleavage reac-
tions, we use SDS-PAGE (Fig. 1) and mass spectrometry (Fig. 2).
Sample conditions require that detergent be present in
excess when purifying membrane proteins; therefore, most of the
preceding steps are done at relatively high detergent concentration.
Because the signals from the detergent alkyl groups will be
overwhelming in some NMR experiments, it would be preferable
to have the sample in a deuterated detergent. It is impractical to
conduct the purification in deuterated detergents because of the
expense, but it is possible to exchange the purified sample into a
deuterated detergent. Detergent exchange can lead to longer protein
relaxation times. However, this effect is generally not appreciable.
Fig. 1. Analysis of protein expression, cleavage, and purification using 15% SDS-PAGE. The ~48-kDa fusion protein
(His-MBP EpoR218–268) is the major protein in the lysate. The eluted protein is nearly pure and, when exposed to TEV protease for
22 h, almost 100% cleavage occurs, resulting in a 5.9-kDa band corresponding to EpoR218–268 (see boxed area). NEB MWM,
NEB broad range molecular weight markers.
[Figure 2 panels: (a, c) SDS-PAGE gels with molecular weight markers (M) at 66, 27, 20, and 14 kDa; (b, d) MALDI-TOF mass spectra of lane 1 and lane 2 samples, intensity (AU) versus m/z (×10³).]
Fig. 2. Characterization of a typical aqueous purification by 15% SDS-PAGE and mass spectrometry. (a) SDS-PAGE gel from
a typical purification. Lane 1 contains the TEV cleavage mixture ~24 h after the addition of the TEV protease. The compo-
nents in the sample are His-TEV protease (27 kDa), His-MBP (43 kDa), and EpoR218–368 (17.4 kDa). (b) MALDI-TOF mass
spectrum (large MW window) of the TEV cleavage mixture showing peaks for the His-TEV protease and His-MBP. (c) SDS-
PAGE gel of the same sample mixture from (a) after the aqueous purification described in the text. Lanes 1 and 2 contain
1 and 5 μL of sample, respectively. (d) MALDI-TOF mass spectrum (small MW window) of the EpoR218–368 peptide after
aqueous purification. The 17.4-kDa band is clearly visible without contaminants from His-MBP or His-TEV protease. The
mass chromatograms shown here were obtained using a Bruker AutoFlex II MALDI-TOF–TOF mass spectrometer.
Detergent exchange can be accomplished by binding the protein
to an ion exchange (IEX) column and extensively washing in a
buffer with the deuterated detergent (56).
1.3. Characterization of NMR Samples

1.3.1. Mass Spectrometry

Mass spectrometry (MS) (Subheading 3.3) is used to verify the
molecular weight and purity of the final purified sample. Figure 2
presents a combination of SDS-PAGE gels and mass spectra to
show the usefulness of mass spectrometry for assaying the purity of
expressed hydrophobic TM peptides after the removal of His-MBP
and His-TEV by aqueous extraction. Mass spectrometry is also
routinely used for assessing the purity of peptides produced by
solid phase synthesis and organic extraction.
1.3.2. Size Exclusion Chromatography

One of the pitfalls in the purification of membrane proteins is
aggregation. While centrifugation can remove large aggregates
from solution, microaggregates and oligomers can remain and are
undesirable for solution NMR spectroscopy. Changing the type or
the amount of detergent used for solubilization can disrupt
microaggregates. Size exclusion chromatography (SEC) can guide
the optimization of peptide solubilization by providing a picture of
the aggregation state. Figure 3 shows that adding more detergent
disperses microaggregates of the EpoR218–268 peptide.
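
The SEC analysis in Fig. 3 relies on a column calibrated with known standards. A minimal sketch of how an apparent molecular weight could be read from such a calibration is given below, assuming the common approximation that log10(MW) decreases linearly with elution volume within the fractionation range. The standards listed are hypothetical placeholders and are not the calibration used for Fig. 3. For a peptide in detergent, the apparent mass includes the bound detergent micelle.

```python
import math

def fit_sec_calibration(standards):
    """Least-squares fit of log10(MW) = slope * Ve + intercept.

    standards: list of (elution_volume_mL, molecular_weight_Da) pairs.
    """
    xs = [ve for ve, _ in standards]
    ys = [math.log10(mw) for _, mw in standards]
    n = len(standards)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def apparent_mw(elution_volume_ml, slope, intercept):
    """Apparent molecular weight (Da) at a given elution volume (mL)."""
    return 10 ** (slope * elution_volume_ml + intercept)

# Hypothetical calibration standards (elution volume in mL, MW in Da).
HYPOTHETICAL_STANDARDS = [
    (10.0, 440_000),
    (12.5, 158_000),
    (14.5, 44_000),
    (17.0, 13_700),
]

slope, intercept = fit_sec_calibration(HYPOTHETICAL_STANDARDS)
print(f"Apparent MW at 15.0 mL: {apparent_mw(15.0, slope, intercept):.0f} Da")
```

A peak that elutes at or near the void volume, as in Fig. 3a, lies outside the linear range of such a calibration and can only be described as aggregated.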
1.3.3. Polarized Attenuated Total Reflection Fourier Transform Infrared Spectroscopy

Attenuated total reflection (ATR)-Fourier transform infrared
(FTIR) spectroscopy is used to characterize the global secondary
structure and TM orientation of the reconstituted peptide (57).
The amide I vibrational bands observed between 1,600 and
1,700 cm⁻¹ are sensitive to secondary structure. Helical secondary
structure yields an amide I vibration between 1,650 and 1,660 cm⁻¹,
while aggregated strand and β-sheet structure yields amide I
vibrations below ~1,645 cm⁻¹. For membrane spanning helices,
the dichroic ratio of the amide I vibration can be used to calculate
the helix orientation relative to the membrane normal (57), and
consequently provides a rapid assay to characterize whether the
peptide is properly inserted into the membrane bilayer (58). The
purified bands from sucrose gradients containing peptide inserted
with a TM orientation generally yield dichroic ratios for the
helical amide I band at ~1,655 cm⁻¹ of greater than ~3. Bands from
sucrose gradients containing aggregated peptide exhibit dichroic
[Figure 3. Panels (a) and (b): FPLC elution profiles, A220 (mAU) versus elution volume (mL, 8–22). The peak in (a) is labeled EpoR(218-268) aggregates; the peak in (b) is labeled EpoR(218-268) dimer.]
Fig. 3. Fast protein liquid chromatography. (a) FPLC analysis of the EpoR218–268 peptide solubilized in DPC at 20× CMC
shows that the peptide elutes in the first fractions from the gel filtration column, indicating the presence of molecular
aggregates. (b) FPLC analysis of the same sample as in (a) after the addition of DPC to 100× CMC and sonication for 1 min
shows that the peptide elutes at the molecular weight corresponding to a dimer. The FPLC separations were performed
with a Superdex 200 10/300 GL column calibrated with known standards. The buffer for the samples and standards con-
tained 100 mM sodium phosphate and 150 mM NaCl at pH 7.0, and the injected sample volume was 0.1 mL. SDS-PAGE
confirmed the identity of the elution fractions (data not shown).
[Figure 4. Polarized ATR-FTIR spectra: absorbance (0–0.2) versus wavenumber (cm⁻¹, 1,800–1,600).]
Fig. 4. Polarized ATR-FTIR spectroscopy. Polarized ATR-FTIR spectra were obtained of the gp55-P peptide reconstituted
into DMPC bilayers by detergent dialysis using IR light polarized parallel (solid line) and perpendicular (dotted line)
relative to the bilayer normal. Only the region between 1,600 cm⁻¹ and 1,800 cm⁻¹ is shown. The amide I vibration is observed
at 1,654 cm⁻¹, indicating that the reconstituted peptide has an α-helical conformation. The intense vibration at 1,735 cm⁻¹
is due to the C=O stretching vibration of the lipid acyl chains. The dichroic ratio (A║/A┴) of the amide I band of 3.3
corresponds to a helix orientation of ~20°. The FTIR spectrum shown here was obtained using a Bruker IFS 66 V/S
spectrometer.
ratios of less than ~3. Figure 4 presents the amide I region from
the polarized FTIR spectrum of gp55-P reconstituted into
dimyristoylphosphatidylcholine (DMPC) bilayers. There is a
single amide I band observed at a frequency of 1,654 cm⁻¹,
characteristic of helical secondary structure. The dichroic ratio of the
1,654 cm⁻¹ vibration is 3.3, which corresponds to a tilt of the helix
axis of ~20° relative to the membrane normal. Together, the
frequency and dichroic ratio of the amide I vibration indicate that
the gp55-P peptide is properly reconstituted in a homogeneous
α-helical conformation.
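
The conversion from dichroic ratio to helix tilt can be sketched as follows, using the standard thick-film ATR expression for the order parameter. This is an illustration under stated assumptions and not necessarily the procedure used for Fig. 4: the electric field amplitudes (typical values for a germanium plate at 45° incidence) and the angle α between the amide I transition dipole and the helix axis are not given in this chapter, and lipid disorder and mosaic spread are ignored.

```python
import math

# Assumed electric field amplitudes for a germanium plate (n = 4.0),
# a thick lipid film (n = 1.43), and 45 degree incidence. These values
# are not taken from this chapter.
EX, EY, EZ = 1.398, 1.516, 1.625

def order_parameter(dichroic_ratio, alpha_deg, ex=EX, ey=EY, ez=EZ):
    """Helix order parameter S from the ATR dichroic ratio R = A_par / A_perp.

    alpha_deg: assumed angle between the amide I transition dipole
    and the helix axis.
    """
    ex2, ey2, ez2 = ex ** 2, ey ** 2, ez ** 2
    s_dipole = (ex2 - dichroic_ratio * ey2 + ez2) / (
        ex2 - dichroic_ratio * ey2 - 2.0 * ez2
    )
    s_alpha = (3.0 * math.cos(math.radians(alpha_deg)) ** 2 - 1.0) / 2.0
    return s_dipole / s_alpha

def tilt_angle_deg(dichroic_ratio, alpha_deg, **fields):
    """Helix tilt from the membrane normal, from S = (3 cos^2(theta) - 1) / 2."""
    s = order_parameter(dichroic_ratio, alpha_deg, **fields)
    cos_sq = min(max((2.0 * s + 1.0) / 3.0, 0.0), 1.0)
    return math.degrees(math.acos(math.sqrt(cos_sq)))

for alpha in (39.0, 41.8):
    print(f"alpha = {alpha} deg: tilt = {tilt_angle_deg(3.3, alpha):.1f} deg")
```

With these assumed amplitudes, a dichroic ratio of 3.3 gives a tilt of roughly 20° when α is 41.8° and roughly 28° when α is 39°, so the value of α should be stated with any reported tilt angle.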
1.3.4. Circular Dichroism Spectroscopy

Circular dichroism (CD) spectroscopy is a widely used technique
to deduce protein secondary structure (59). CD can inform us
about the helical content of the sample, confirming that the sample
is folded or refolded properly. FTIR and CD spectroscopy provide
complementary information on secondary structure. CD can easily
distinguish random coil from α-helical and β-sheet secondary
structure. Light scattering from membrane vesicles often leads to a
red shift and damping of the CD bands making it more difficult to
distinguish the signature bands for α-helix and β-sheet. In
contrast, α-helix and β-sheet are easily distinguished by FTIR
spectroscopy, but they are more difficult to distinguish from random
coil. Figure 5 presents the CD spectrum of the pure EpoR218–268
peptide in dodecylphosphocholine (DPC) at a detergent
concentration corresponding to 1.4 times its CMC. It exhibits negative
CD absorption bands at approximately 208 and 222 nm
characteristic of α-helical secondary structure. The peptide sequence includes
[Figure 5. CD spectrum: ellipticity (millideg, −10 to 30) versus wavelength (nm, 200–280).]
Fig. 5. Circular dichroism spectroscopy. The CD spectrum is shown of pure EpoR218–268
peptide solubilized in DPC at 1.4× CMC. The spectrum was taken at room temperature in
10 mM sodium phosphate buffer at pH 7.0. Minima are observed at approximately 208
and 222 nm, which are characteristic of α-helical secondary structure and in agreement
with the expected global secondary structure of the EpoR218–268 sequence. The CD spec-
trum shown here was obtained using an Olis RSM 1000 CD spectrophotometer.
the TM domain and two intracellular regions (the switch region
and Box 1) and we expect to observe global α-helical structure for
a well-reconstituted sample.
1.4. NMR Spectroscopy

1.4.1. Solid-State 2H NMR Spectroscopy

Deuterium NMR spectroscopy is well suited for probing dynamic
processes in membrane proteins (60–62). Deuterium NMR has
been widely used to look at lipid dynamics, but much less so to
investigate side chain dynamics in membrane proteins because of
sensitivity and selectivity issues (63, 64). However, by combining
magic angle spinning (MAS) with specifically deuterated
membrane proteins and peptides, the sensitivity issue can be resolved
(61). We use deuterium MAS NMR to assess homo-oligomerization
of the reconstituted membrane peptides prior to undertaking
structural studies with more advanced NMR methods. There are
several advantages of deuterium MAS NMR. The first is that it is a
simple experiment, using a single pulse at relatively low MAS
frequencies. The spinning side bands in slow MAS spectra reveal the
envelope of the static 2H lineshape. Second, the measurements are
carried out in the liquid crystalline phase (above the lipid phase
transition temperature). Third, the experiments are comparative in
nature. Figure 6 presents deuterium MAS NMR spectra of gp55-P
selectively labeled at four consecutive leucines (Leu396–399) in
the middle of the TM region. Comparison of the deuterium MAS
side band patterns shows that Leu399 has the most restricted
motion and consequently is likely to be oriented toward the TM
dimer interface. Leu397 and Leu398 exhibit the narrowest
deuterium side band patterns. The narrow lineshapes correspond to
[Figure 6. Deuterium MAS NMR spectra of gp55-P labeled at Leu396, Leu397, Leu398, or Leu399, plotted against frequency (kHz, −20 to 20), with helical wheel diagrams. Leu397 and Leu398 are marked as facing lipids (mobile side chain); Leu396 and Leu399 are marked as facing the dimer interface (restricted motion of side chain). gp55-P sequence shown: RRPPWFTTLISTIMGSLIILLLLLILLIWTLYS]
Fig. 6. Deuterium MAS NMR spectroscopy. Gp55-P has five consecutive leucines in the middle of the TM region that allow
us to map out the dimer interface by using deuterium MAS NMR (9). Four different gp55-P TM peptides were chemically
synthesized, each with one of four sequential leucines labeled with deuterium at the methyl groups. By examining the
deuterium spectrum of each peptide, we can infer which leucines are in the helix dimer interface and which ones face the
lipid side chains. Comparison of the lineshapes of the deuterated leucines clearly shows that Leu396 and Leu399 are most
likely in the dimer interface and that Leu397 and Leu398 are oriented away from it.
increased mobility relative to Leu399. These leucines are likely
facing the surrounding lipids. The modulation of leucine deuterium
lineshape around one turn of the gp55-P TM helix argues that the
helix is dimerizing in membrane bilayers and not forming nonspecific
aggregates. Together, the results from polarized IR and deuterium
NMR presented in Figs. 4 and 6 indicate that the gp55-P TM
peptide can be reconstituted as a helical membrane-spanning
dimer, which is now suitable for high-resolution solid-state NMR
structural studies.
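
The argument that the lineshape is modulated around one turn of the helix can be illustrated with a simple helical wheel calculation. The sketch below assumes an ideal α-helix with 3.6 residues per turn (100° per residue); the real gp55-P helix may deviate from this geometry.

```python
DEGREES_PER_RESIDUE = 100.0  # ideal alpha-helix: 360 degrees / 3.6 residues

def wheel_angle(residue, reference):
    """Angular position of a residue on the helical wheel, relative to a reference residue."""
    return ((residue - reference) * DEGREES_PER_RESIDUE) % 360.0

def angular_separation(res_a, res_b):
    """Smallest angle (0-180 degrees) between two residues on the wheel."""
    diff = (abs(res_a - res_b) * DEGREES_PER_RESIDUE) % 360.0
    return min(diff, 360.0 - diff)

for leu in (396, 397, 398, 399):
    print(f"Leu{leu}: {wheel_angle(leu, 396):.0f} deg")
print(f"Leu396 to Leu399: {angular_separation(396, 399):.0f} deg apart")
print(f"Leu397 to Leu399: {angular_separation(397, 399):.0f} deg apart")
```

On an ideal helix, Leu396 and Leu399 lie within 60° of each other and can share one face, whereas Leu397 and Leu398 point away from that face. This is consistent with the assignment in Fig. 6.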
1.4.2. Solution-State NMR Spectroscopy

After production of a pure sample, preliminary scouting
experiments must be performed to determine sample conditions that
promote both long-term sample stability and sample homogeneity.
Several factors that must be considered are buffer type and
concentration, detergent type and concentration, pH, and temperature
for data collection. These variables are covered well elsewhere
(2, 65). Solubility/stability assessments may be carried out with
unlabeled protein. However, the proton-nitrogen heteronuclear single
quantum correlation (15N-HSQC) experiment should be used to
assess the resolution and the sensitivity of particular sample
conditions; this requires 15N-labeled protein. These experiments can be
conducted at a relatively low protein concentration (~0.1 mM),
[Figure 7. 1H-15N HSQC spectrum: 15N chemical shift (112–125 ppm) versus 1H chemical shift (8.8–7.2 ppm).]
Fig. 7. Solution NMR spectroscopy. The 1H-15N HSQC spectrum of the EpoR220–248 peptide is
shown. The peptide was prepared by the aqueous extraction method and solubilized in
d38-DPC (100× CMC). The spectrum was collected with 32 scans at a temperature of
313 K. About 30 N-H peaks are visible, which are expected from the 31 amino acid
construct, along with the pair of N-H resonances from the single Asn residue side chain.
but a general rule is that most expected peaks should be visible
after an 8-scan experiment; otherwise the time required for
three-dimensional NMR experiments will make these experiments
impractical. Once optimal conditions for data collection have been
identified, proteins with different labeling schemes can be
produced for a variety of solution studies. Figure 7 presents the
15N-HSQC spectrum of the TM domain of the Epo receptor
(EpoR220–248). There are 31 residues in the 15N-labeled peptide,
and the observation of ~30 N-H peaks of roughly the same
intensity indicates that the sample is suitable for three-dimensional
NMR measurements.
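
The expected number of HSQC peaks can be estimated from the sequence before the spectrum is recorded. The sketch below counts one backbone N-H peak per residue, excluding prolines and the N-terminal residue, plus a pair of side chain peaks for each Asn or Gln. The EpoR220–248 sequence is not given in this section, so the gp55-P sequence from Fig. 6 is used purely as sample input. Tryptophan indole N-H peaks are reported separately because they usually fall outside the backbone amide region.

```python
def expected_hsqc_peaks(sequence):
    """Rough count of 1H-15N HSQC peaks expected for a one-letter sequence."""
    seq = sequence.upper()
    # No backbone amide peak is expected for the N-terminal residue or for proline.
    backbone = sum(1 for aa in seq[1:] if aa != "P")
    # Each Asn or Gln side chain NH2 gives a pair of peaks.
    side_chain_nh2 = 2 * (seq.count("N") + seq.count("Q"))
    # One indole N-H per Trp.
    trp_indole = seq.count("W")
    return {
        "backbone": backbone,
        "side_chain_nh2": side_chain_nh2,
        "trp_indole": trp_indole,
    }

# Sample input only: the gp55-P sequence shown in Fig. 6.
GP55P_SEQUENCE = "RRPPWFTTLISTIMGSLIILLLLLILLIWTLYS"
print(expected_hsqc_peaks(GP55P_SEQUENCE))
```

Peaks can be missing because of exchange broadening or overlap, so the count is an upper estimate to compare with the spectrum.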
Here, we describe methods to produce, purify, and characterize
peptides and proteins that are either specifically or fully labeled with
13C and 15N for both solid-state and solution NMR spectroscopy.
2. Materials
The experimental protocols described below require access to
equipment typically found in a laboratory conducting recombinant
protein expression and instrumentation used in protein
characterization, which is often located in laboratories involved in
biophysics. Equipment needed for protein expression includes shaker
incubators, centrifuges, a French press, bath sonicator, and a setup
for gel electrophoresis. Instrumentation needed for protein
characterization includes spectrometers for circular dichroism (CD),
FTIR, and NMR spectroscopy.
2.1. Protein Overexpression

1. His-MBP TEV expression vector (21, 66).
2. Escherichia coli BL21 (DE3) cells.
3. Luria-Bertani (LB) broth: 10 g tryptone, 10 g NaCl, 5 g yeast
extract per liter of water. Adjust to pH 7.0 with 5N NaOH.
Add ampicillin to a final concentration of 100 μg/mL.
4. M9 salts (5× stock): 2.5 g of NaCl, 15 g of KH2PO4, 5 g of NH4Cl, 64 g
of Na2HPO4·7H2O per 1 L of sterile deionized water. 15NH4Cl
can be substituted in the same amount as unlabeled NH4Cl to
express uniformly 15N-labeled protein.
5. M9 medium: 750 mL of sterile deionized water, 200 mL of
autoclaved M9 salts, 20 mL of 20% glucose, 2 mL of 1 M
MgSO4, 0.1 mL of 1 M CaCl2. Ampicillin is added to a final
concentration of 100 μg/mL.
6. Labeled M9 medium: Same as M9 medium except 15NH4Cl
and U13C-glucose are substituted in the same amounts as unla-
beled materials to express uniformly labeled protein. Ampicillin
is added to a final concentration of 100 μg/mL.
7. Isopropyl β-D-1-thiogalactopyranoside (IPTG).
8. 20% (w/v) glucose: Filter sterilize.
9. 1 M MgSO4.
10. 1 M CaCl2.
11. Binding buffer: 20 mM Tris–HCl, pH 8.0 (1.95 g/L Tris–HCl
and 0.92 g/L Tris-Base, adjust pH with 1N NaOH or 1N
HCl), 500 mM NaCl, 5 mM imidazole.
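
The binding buffer recipe can be checked with a short calculation. The sketch below converts the listed masses of Tris–HCl and Tris base into a total Tris concentration and estimates the pH with the Henderson–Hasselbalch equation. The molar masses are standard values, and the pKa of 8.07 is an assumed value near 25°C; neither is stated in this chapter. Because the pKa of Tris changes with temperature, the estimate is only a starting point for the pH adjustment given in the recipe.

```python
import math

# Molar masses in g/mol (standard values, not stated in this chapter).
MW_TRIS_HCL = 157.60
MW_TRIS_BASE = 121.14
MW_NACL = 58.44
MW_IMIDAZOLE = 68.08

def tris_summary(tris_hcl_g_per_l, tris_base_g_per_l, pka=8.07):
    """Return (total Tris in mM, estimated pH) for a Tris-HCl/Tris base mixture.

    pka is an assumed value near 25 degrees C.
    """
    acid_mm = tris_hcl_g_per_l / MW_TRIS_HCL * 1000.0
    base_mm = tris_base_g_per_l / MW_TRIS_BASE * 1000.0
    ph = pka + math.log10(base_mm / acid_mm)  # Henderson-Hasselbalch
    return acid_mm + base_mm, ph

def grams_per_liter(conc_mm, molar_mass):
    """Mass of solute (g) needed per liter for a concentration in mM."""
    return conc_mm / 1000.0 * molar_mass

total_mm, ph_estimate = tris_summary(1.95, 0.92)
print(f"Total Tris: {total_mm:.1f} mM, estimated pH {ph_estimate:.2f}")
print(f"NaCl for 500 mM: {grams_per_liter(500, MW_NACL):.2f} g/L")
print(f"Imidazole for 5 mM: {grams_per_liter(5, MW_IMIDAZOLE):.2f} g/L")
```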
2.2. Purification of TM Peptides and Proteins

2.2.1. Peptide Purification

1. C4 Reversed-phase semipreparatory HPLC column: 200 Å C4
5 μm (Higgins Analytical).
2. 2,2,2 Trifluoroethanol, 99% (TFE).
3. Trifluoroacetic acid (TFA).
4. Solvent A: Water containing 0.1% TFA.
5. Solvent B: Acetonitrile containing 0.1% TFA (see Note 2).
6. Liquid nitrogen.
2.2.2. Protein Purification, Organic and Aqueous Extractions

1. Binding buffer (Subheading 2.1).
2. Wash buffer: 20 mM Tris–HCl, pH 8.0 (1.95 g/L Tris–HCl
Extractions and 0.92 g/L Tris-Base, adjust pH with 1N NaOH or 1N
3. Elution buffer: 20 mM Tris–HCl, pH 8.0 (1.95 g/L Tris–HCl
and 0.92 g/L Tris-Base, adjust pH with 1N NaOH or 1N
4. n-Octyl-β-D-glucopyranoside (β-OG): Critical micelle
concentration (CMC) = 0.25 mM.
5. Ni+/NTA resin.
6. n-Dodecyl-β-D-maltopyranoside (DDM): CMC = 0.015 mM.
7. His-tagged TEV (His-TEV) protease (Invitrogen) or plasmids
for in-house expression available.
8. Trichloroacetic acid.
9. MilliQ water (18.2 MΩ·cm).
10. Methanol:chloroform mixture: 90% methanol:10% chloroform.
11. 0.2-μm PTFE syringe filters.
12. Dialysis membrane 1 kDa MWCO.
13. Amicon spin concentrator (3 kDa MWCO).
2.2.3. Detergent Exchange

1. Detergent exchange dialysis buffer: 20 mM Tris–HCl, pH 8.5
(0.884 g/L Tris–HCl and 1.744 g/L Tris-Base).
2. Detergent exchange start buffer: 20 mM Tris–HCl, pH 8.5
(0.884 g/L Tris–HCl and 1.744 g/L Tris-Base), 3.08 mM
d38-DPC (2× CMC of 1.54 mM).
3. Detergent exchange elution buffer: 20 mM Tris–HCl, pH 8.5
(0.884 g/L Tris–HCl and 1.744 g/L Tris-Base), 3.08 mM
d38-DPC, 1 M NaCl.
4. IEX column: 1 mL Q-Sepharose FF anion exchange column
(GE Healthcare).
5. Amicon spin concentrator (3 kDa MWCO).
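
Detergent concentrations in this chapter are quoted as multiples of the CMC. A minimal sketch of the conversion is shown below, using the d38-DPC CMC of 1.54 mM given above. The molar mass in the example (389.7 g/mol) is an assumed value for fully deuterated DPC and should be replaced with the value supplied by the manufacturer.

```python
D38_DPC_CMC_MM = 1.54  # mM, as listed for the detergent exchange start buffer

def conc_from_cmc_multiple(fold, cmc_mm=D38_DPC_CMC_MM):
    """Detergent concentration (mM) at a given multiple of the CMC."""
    return fold * cmc_mm

def detergent_mass_mg(conc_mm, volume_ml, molar_mass):
    """Mass of detergent (mg) for a target concentration and volume.

    mM x mL = micromol; micromol x g/mol = micrograms; divide by 1000 for mg.
    """
    return conc_mm * volume_ml * molar_mass / 1000.0

ASSUMED_D38_DPC_MW = 389.7  # g/mol, assumption; check the supplier's value

start_buffer_mm = conc_from_cmc_multiple(2)
print(f"2x CMC: {start_buffer_mm:.2f} mM")
mass = detergent_mass_mg(start_buffer_mm, 50.0, ASSUMED_D38_DPC_MW)
print(f"d38-DPC for 50 mL of start buffer: {mass:.1f} mg")
```

The same function gives the concentration for the 100× CMC condition used for the NMR sample in Fig. 7.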
2.3. Mass Spectrometry

1. Saturated sinapinic acid matrix solution: Dissolve sinapinic acid
at ~20 mg/mL in 50% acetonitrile and 0.1% TFA. Spin down excess matrix by
centrifugation at 15,000 × g for 10 s at room temperature and
use only the supernatant (saturated sinapinic acid solution).
2. Washing buffer: 10 mM ammonium phosphate, monobasic
dissolved in 0.1% TFA.
3. Recrystallization buffer: 60% ethanol, 30% acetone, and 10%
water with 0.1% TFA.
2.4. NMR Sample Preparation

2.4.1. Reconstitution into Membrane Lipids

1. β-OG (Subheading 2.2.2).
2. DMPC (Avanti Polar Lipids).
3. TFE.
4. Liquid nitrogen.
5. Phosphate dialysis buffer (1×): 10 mM sodium phosphate,
50 mM NaCl, pH 7.0. Prepare a 10× stock by titrating 100 mM
sodium phosphate dibasic, 500 mM NaCl with 100 mM sodium
phosphate monobasic, 500 mM NaCl to adjust the pH, and dilute
the stock tenfold for use.
6. MES rehydration buffer (for a peptide sequence that contains
cysteine): 5 mM 2-(N-morpholino)ethanesulfonic acid (MES),
50 mM NaCl, 5 mM dithiothreitol. Adjust the pH to 6.2 with
10N NaOH.
7. MES dialysis buffer (for a peptide sequence that contains
cysteine): 5 mM MES, 50 mM NaCl. Adjust the pH to 6.2
with 10N NaOH.
8. Sucrose.
9. Deuterium depleted water (Cambridge Isotope Laboratories).
2.4.2. Solubilization in Detergent Micelles

1. NMR buffer: 10 mM sodium phosphate, pH 7.0 (prepared by
mixing 5.77 mL of 1 M Na2HPO4 and 4.23 mL of 1 M
NaH2PO4 with 990 mL of milliQ water).
2. Argon or nitrogen gas.
3. Deuterium oxide (D2O).
4. 4,4-Dimethyl-4-silapentane-1-sulfonic acid (DSS): 0.1–0.25 mM.
5. Sodium azide (NaN3).
6. Bath sonicator.
3. Methods
3.1. Protein Overexpression

1. Inoculate 25 mL of LB broth with a single colony of E. coli
BL21 (DE3) transformed with the His-MBP-TEV expression
vector containing the TM clone, isolated from a plate or taken
from a frozen glycerol stock solution (see Note 3). Grow over-
night at 37°C.
2. Pellet the cells by centrifuging at 6,000 × g for 20 min at 4°C,
wash twice with M9 medium, and inoculate 1 L of labeled M9
medium with the resuspended cell pellet (see Note 4). Grow at
37°C at 200 rpm until the optical absorbance at 600 nm (A600)
reaches 0.5–0.8, then reduce the temperature to 23°C and
induce by adding IPTG to a final concentration of 0.4 mM.
Continue incubating for 12–16 h, then pellet cells by centri-
fuging at 4,000 × g for 30 min at 4°C.
3. Resuspend the cell pellet in 10 mL of binding buffer. Freeze at
−20°C until ready for extraction.
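
The amounts of labeled reagents needed for the culture follow from the M9 recipe in Subheading 2.1. The sketch below works through the arithmetic for one batch of labeled M9 medium (200 mL of M9 salts at 5 g/L NH4Cl and 20 mL of 20% glucose). It assumes the labeled compounds are substituted gram for gram, as the recipe states.

```python
# Volumes (mL) for one batch of M9 medium, from Subheading 2.1.
M9_RECIPE_ML = {
    "water": 750.0,
    "m9_salts": 200.0,
    "glucose_20pct": 20.0,
    "mgso4_1m": 2.0,
    "cacl2_1m": 0.1,
}
NH4CL_G_PER_L_SALTS = 5.0  # g of NH4Cl per liter of M9 salts
GLUCOSE_G_PER_ML = 0.20    # 20% (w/v) glucose stock

def labeled_reagents(batches=1.0):
    """Grams of 15NH4Cl and U-13C glucose, and total volume, for a number of batches."""
    nh4cl_g = NH4CL_G_PER_L_SALTS * M9_RECIPE_ML["m9_salts"] / 1000.0 * batches
    glucose_g = GLUCOSE_G_PER_ML * M9_RECIPE_ML["glucose_20pct"] * batches
    total_ml = sum(M9_RECIPE_ML.values()) * batches
    return {"15NH4Cl_g": nh4cl_g, "13C_glucose_g": glucose_g, "total_ml": total_ml}

print(labeled_reagents(1))
```

One batch therefore consumes about 1 g of 15NH4Cl and 4 g of 13C-glucose for slightly less than 1 L of medium.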
3.2. Purification of TM Peptides and Proteins

3.2.1. Peptide Purification

1. Dissolve the chemically synthesized crude peptide in TFE. If
it does not promptly dissolve, use a minimum amount of TFA
and dilute immediately with TFE (40).
2. Equilibrate the C4 reversed phase column with 5% acetonitrile
containing 0.1% TFA (95% solvent A + 5% solvent B) (see
Note 5).
3. Inject the sample as prepared in step 1.

4. Elute the peptide using a linear gradient that varies the ace-
tonitrile concentration from the starting concentration to 95%
solvent B over 45 min at a flow rate of 2.5 mL/min, then clean
the column for 10 min with 95% solvent B.
5. Monitor the elution by the optical absorbance at 220 nm (pep-
tide backbone) and 260–280 nm (aromatic amino acids).
Collect fractions across the major peaks. Assess the purity by
using analytical HPLC and MALDI-TOF mass spectrometry
(see Subheading 3.3).
6. Measure the concentration of pure peptide by absorbance or
by amino acid analysis.
7. Freeze with liquid nitrogen and lyophilize (see Note 6), then
store at −20°C.
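
The linear gradient in step 4 can be written as a simple function of time, which is convenient for estimating the solvent composition at which a peak elutes. The sketch below uses the values given above (5% to 95% solvent B over 45 min at 2.5 mL/min). It ignores the delay volume of the HPLC system, which shifts the true composition at the detector.

```python
def percent_b(t_min, start_b=5.0, end_b=95.0, gradient_min=45.0):
    """Programmed percentage of solvent B at time t (min) in a linear gradient."""
    if t_min <= 0:
        return start_b
    if t_min >= gradient_min:
        return end_b
    return start_b + (end_b - start_b) * t_min / gradient_min

def gradient_volume_ml(gradient_min=45.0, flow_ml_per_min=2.5):
    """Total solvent volume delivered during the gradient."""
    return gradient_min * flow_ml_per_min

for t in (0, 15, 30, 45):
    print(f"{t:>2} min: {percent_b(t):.0f}% B")
print(f"Gradient volume: {gradient_volume_ml():.1f} mL")
```

For more hydrophobic peptides, the start_b argument can be raised (for example to 30%, see Note 5), which makes the gradient shallower over the same run time.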
3.2.2. Protein Purification

1. Lyse the cells using one of several methods (e.g., French press,
cell homogenizer).
2. Clarify the lysate by centrifuging at 25,000 × g for 25 min at
4°C.
3. Transfer the supernatant to a tube and add β-OG to ~2× its
CMC (CMC = 0.25 mM), nutate ~5 min at room temperature
to dissolve.
4. Apply the supernatant to a Ni+/NTA column (12.5 mL bed
volume) previously equilibrated with binding buffer, nutate (in
column) for 2–4 h at 4°C to bind the fusion protein, and then
allow the lysate to flow through.
5. Wash with 14 column volumes of wash buffer, collecting all
fractions for SDS-PAGE analysis. Figure 1 shows that most of
the contaminating protein is removed in the flow-through and
the first wash.
6. Elute the column with 3 column volumes of elution buffer,
collecting 1 mL fractions. Measure the optical absorbance at
280 nm (A280), combine the fractions that have protein, mea-
sure the A280 of that mixture (for a rough determination of the
protein concentration). Add DDM to 10× its CMC, and dis-
solve by sonication or nutation.
7. Add an appropriate amount of His-TEV protease (see Note 7),
nutate for 16–24 h at 23°C to cleave. Monitor the cleavage by
the loss of the band corresponding to the His-MBP-fusion
protein and the appearance of two lower molecular weight
bands corresponding to His-MBP and the TM peptide on
SDS-PAGE (see Fig. 1, lane 10).
At this point, further purification proceeds using either the
organic extraction or the aqueous extraction method.
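
The quantities in a TEV cleavage reaction can be estimated before it is set up. The sketch below uses the concentrations from Note 7 (equal volumes of 2 mg/mL fusion protein and 0.175 mg/mL TEV solution) and the approximate masses from Figs. 1 and 2 (~48 kDa fusion protein, 27 kDa His-TEV, 5.9 kDa EpoR218–268). The peptide yield is a theoretical maximum that assumes complete cleavage and no losses.

```python
def tev_cleavage_setup(fusion_volume_ml, fusion_mg_per_ml=2.0, tev_mg_per_ml=0.175,
                       fusion_kda=48.0, tev_kda=27.0, peptide_kda=5.9):
    """Summarize a 1:1 (v:v) TEV cleavage reaction."""
    fusion_mg = fusion_volume_ml * fusion_mg_per_ml
    tev_mg = fusion_volume_ml * tev_mg_per_ml  # equal volume of TEV solution
    return {
        "tev_volume_ml": fusion_volume_ml,
        "mass_ratio_fusion_to_tev": fusion_mg / tev_mg,
        "molar_ratio_fusion_to_tev": (fusion_mg / fusion_kda) / (tev_mg / tev_kda),
        "max_peptide_mg": fusion_mg * peptide_kda / fusion_kda,
    }

for key, value in tev_cleavage_setup(10.0).items():
    print(f"{key}: {value:.2f}")
```

Because the peptide accounts for only about one eighth of the fusion protein mass, most of the protein loaded on the gel after cleavage is His-MBP.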
3.2.3. Organic Extraction

1. Precipitate the protein in the TEV protease mixture
(Subheading 3.2.2) by adding trichloroacetic acid to a final
concentration of 6% (w/v).
2. Centrifuge at 9,000 × g for 20 min at 4°C to collect the precipi-
tate. Decant the supernatant, wash with milliQ water, centri-
fuge again, decant the supernatant; repeat once. Lyophilize to
remove water.
3. Nutate the dried pellet for 2 h at room temperature with
10 mL of 90% methanol/10% chloroform.
4. Filter the supernatant through a 0.2-μm syringe filter (PTFE)
to remove particulate matter. Continue to Subheading 3.4 for
reconstitution of organic solvent purified samples.
5. Assess the purity of the protein by SDS-PAGE analysis and
mass spectrometry as described in Fig. 2.
3.2.4. Aqueous Extraction

1. Dialyze the sample overnight against binding buffer
containing DDM at 1× its CMC to remove the imidazole and glycerol
from the Ni+/NTA purification and TEV cleavage, respectively
(1 kDa MW cutoff is appropriate).
2. Nutate the dialyzed sample in a column with a 20-mL bed
volume of Ni+/NTA (pre-equilibrated with binding buffer plus
1× CMC of DDM) for 2–4 h at 4°C.
3. Collect the flow-through and wash the beads with 2 col-
umn volumes of binding buffer containing DDM at 1× its
CMC. Combine the flow-through with the first column vol-
ume in the wash step. Elute the column with 6 column volumes
of elution buffer.
4. Re-equilibrate the Ni+/NTA with binding buffer. Perform a
second incubation using the combined flow-through and
washes from above. Again, combine the flow-through and first
wash fractions.
5. Dialyze overnight against the final NMR buffer, or if detergent
exchange will occur, against the detergent exchange dialysis
buffer of choice (see Subheading 3.2.5, step 1).
6. If NMR will be performed directly on this sample, concentrate
the result of step 5 using an Amicon spin column (3 kDa
MWCO), and verify the purity and molecular weight by SDS-
PAGE and mass spectrometry (see Fig. 2).
3.2.5. Detergent Exchange

1. Dialyze the aqueous purification sample against detergent
exchange dialysis buffer without deuterated detergent (see
Note 8).
2. Concentrate the sample to 1–3 mL using an Amicon Ultra spin
concentrator.
3. Centrifuge at 20,000 × g for 15 min at 4°C to remove particulate
matter or aggregated protein, save the supernatant.
4. Load the sample on an IEX column pre-equilibrated with
detergent exchange start buffer containing deuterated detergent.
5. Wash with 10 column volumes of detergent exchange start
buffer, collect 1 mL fractions.
6. Elute with 10 column volumes of detergent exchange elution
buffer, collect 1 mL fractions.
7. Check the fractions by SDS-PAGE, combine the fractions con-
taining protein.
3.3. Characterization of Purified Protein

Mass spectrometry is used to verify the molecular weight and purity
of the final purified sample.
1. Mix the sample and saturated sinapinic acid matrix solution in
a volume ratio of 1:5 to 1:10 (see Note 9). Apply 1–2 μL of
solution onto the target plate for MALDI-TOF MS and let it
dry at room temperature (67).
2. Wash the target with 5–10 μL of washing buffer.
3. Remove the washing buffer after a few seconds with a pipet
and let the liquid evaporate.
4. Apply 0.5–1 μL of recrystallization buffer on the washed spot
and let it dry.
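
When the MALDI-TOF spectrum is interpreted, each species can appear at more than one charge state, as with the +2 peak in Fig. 2d. The sketch below lists the m/z expected for protonated ions and matches an observed peak against candidate species. The masses are the nominal values from the Fig. 2 caption, and the 1% tolerance is an arbitrary choice for illustration.

```python
PROTON_MASS = 1.00728  # Da

def mz(mass_da, charge=1):
    """Expected m/z for a protonated ion [M + zH]z+."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (mass_da + charge * PROTON_MASS) / charge

def assign_peak(observed_mz, species, max_charge=3, tolerance=0.01):
    """Return (name, charge) pairs whose expected m/z is within a fractional tolerance."""
    hits = []
    for name, mass in species.items():
        for z in range(1, max_charge + 1):
            expected = mz(mass, z)
            if abs(observed_mz - expected) <= tolerance * expected:
                hits.append((name, z))
    return hits

# Nominal masses (Da) from the Fig. 2 caption.
NOMINAL_MASSES = {"His-TEV": 27_000.0, "His-MBP": 43_000.0, "EpoR218-368": 17_400.0}

print(assign_peak(8_700.0, NOMINAL_MASSES))
print(assign_peak(43_100.0, NOMINAL_MASSES))
```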
3.4. NMR Sample Preparation

3.4.1. Reconstitution into Membrane Lipids

1. Co-solubilize β-OG and DMPC in TFE. The amount of detergent
is determined by the final detergent concentration (5%
w/v) when the sample is rehydrated in step 5. In general, a
1:50 peptide-to-lipid molar ratio is used (see Note 10). Freeze
the solution with liquid nitrogen and remove the solvent under
vacuum; a small amount of water can be added if a powder cannot be
obtained with TFE alone.
2. Dissolve the dry mixture of β-OG and DMPC in a minimum
amount of water (~0.5 mL).
3. Dissolve the appropriate amount of peptide in 1 mL of TFE
and incubate for 6 h at 37°C (for solid-state NMR experi-
ments, we typically use ~2 μmol peptide).
4. Add β-OG /DMPC drop-wise to the peptide solution while
stirring. Then add water drop-wise while stirring the sample
until bubbles form. Bubbling can be achieved when the ratio
of organic solution to water is between 1:2 and 1:4 (19). After
the titration is complete, freeze and lyophilize the sample (see
Note 11).
5. Rehydrate the sample with 4 mL of phosphate dialysis buffer
for 6 h while stirring slowly at 37°C. If the peptide sequence
contains cysteine, use MES rehydration buffer to rehydrate the
sample.
6. Dialyze the rehydrated sample against 2 L of phosphate dialysis
buffer for 48 h at 37°C. If the peptide sequence contains
cysteine, use MES dialysis buffer. Change the dialysis buffer
every 5–12 h. As the sample dialyzes and detergent
concentration decreases, the sample will become cloudy.
7. Save 100 μL (~0.3 mM peptide) of the dialyzed sample for
FTIR analysis (see Note 12, Subheading 1.3.3 and Fig. 4).
Layer the membrane vesicles containing peptides on the sur-
face of the germanium plate.
8. Purify the membrane vesicles containing reconstituted mem-
brane peptides by sucrose gradient ultracentrifugation. Make
sucrose gradients (10–40% w/v) using either a gradient maker
or by careful layering of different densities (1.4 mL for each
layer in 5% increments for a 10–40% gradient for ~10 mL total
volume) of sucrose in an ultracentrifuge tube appropriate for a
swinging bucket ultracentrifuge rotor. Clear or transparent
tubes are preferred. Load the sample from step 6 on the top of
the gradient and ultracentrifuge at 150,000 × g for 8–12 h at
15°C. A peptide oriented in a transmembrane fashion can be
found in the upper band and the aggregates collect in lower
bands or pellet. Collect the upper band and dialyze (repeat
step 6) to remove sucrose.
9. For deuterium NMR measurements, centrifuge the dialyzed
sample at 228,556 × g for 1 h at 4°C (see Note 13). Discard the
supernatant and save the pellet.
10. Freeze and lyophilize the pellet.
11. Rehydrate the pellet with 50% (w/w) deuterium depleted
water.
12. Incubate overnight at 37°C.
13. Pack the rehydrated pellet in a 4-mm MAS rotor.
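
The quantities used in the reconstitution can be worked out from the values in the steps above: ~2 μmol of peptide at a 1:50 peptide-to-lipid molar ratio, β-OG at 5% (w/v) in the 4 mL rehydration volume, and a 10–40% sucrose step gradient poured in 1.4 mL layers. The sketch below shows the arithmetic. The DMPC molar mass (677.93 g/mol) is a standard value that is not stated in this chapter.

```python
DMPC_MW = 677.93  # g/mol, standard value (not stated in this chapter)

def lipid_mass_mg(peptide_umol, lipids_per_peptide=50, lipid_mw=DMPC_MW):
    """Mass of lipid (mg) for a given amount of peptide and molar ratio."""
    # micromol x g/mol = micrograms; divide by 1000 for mg
    return peptide_umol * lipids_per_peptide * lipid_mw / 1000.0

def detergent_mass_mg(volume_ml, percent_w_v=5.0):
    """Mass of detergent (mg) for a % (w/v) concentration in a given volume."""
    return percent_w_v / 100.0 * volume_ml * 1000.0

def sucrose_layers(low=10, high=40, step=5, layer_ml=1.4):
    """Step gradient layers, densest first: (% w/v, volume mL, sucrose g)."""
    return [(pct, layer_ml, pct / 100.0 * layer_ml)
            for pct in range(high, low - 1, -step)]

print(f"DMPC for 2 umol peptide: {lipid_mass_mg(2.0):.1f} mg")
print(f"beta-OG for 4 mL at 5% (w/v): {detergent_mass_mg(4.0):.0f} mg")
layers = sucrose_layers()
for pct, vol, grams in layers:
    print(f"{pct}% sucrose: {vol} mL ({grams:.2f} g)")
print(f"Total gradient volume: {sum(v for _, v, _ in layers):.1f} mL")
```

The seven layers total 9.8 mL, in agreement with the ~10 mL gradient described in step 8.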
3.4.2. Solubilization in Detergent Micelles

Following the organic extraction procedure (Subheading 3.2.3):
1. Blow down the organic solvent with a fine stream of argon or
nitrogen gas to ~2 mL, add the detergent of choice (deuter-
ated or otherwise) to the concentration desired in the final
sample volume (~300 μL for Shigemi tubes, 600 μL for regu-
lar NMR tubes).
2. Add water drop-wise while stirring the sample; the sample may
become cloudy. Stop adding water when the detergent bubbles
do not immediately pop after sample agitation (19).
3. Freeze the sample with liquid nitrogen and lyophilize in a low
temperature (< −70°C), low pressure (~20 mTorr) lyophilizer.
The low temperature and pressure are needed for the water–
organic solvent mixtures.
4. Dissolve the dried mixture in water (~2 mL), refreeze, and
lyophilize to drive off residual organic solvent.
5. Dissolve in an appropriate volume of water or buffer; add D2O
to the final desired concentration. If desired, also add DSS (for
spectral referencing) and NaN3 (antimicrobial) to a final con-
centration of 0.1–0.25 mM and 0.05% w/v, respectively.
Following the aqueous extraction procedure
(Subheading 3.2.4):
1. Dialyze the sample that results from the detergent exchange
(Subheading 3.2.5) into the buffer in which NMR experiments
will be run (without the detergent).
2. Concentrate the sample to roughly 500 μL (~300 μL for a
Shigemi tube) using Amicon spin concentrators, then add
detergent (deuterated or otherwise) to the final desired con-
centration and sonicate in a bath sonicator until the detergent
is dissolved (~60 s).
3. Add D2O for the solvent lock, DSS for spectral referencing,
and NaN3 as an antimicrobial, if desired.
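
The volumes of additives for the final NMR sample follow from the dilution relation C1V1 = C2V2. The sketch below is an illustration only: the stock concentrations (10 mM DSS, 5% w/v NaN3) and the 10% D2O content are assumptions, whereas the final DSS and NaN3 concentrations and the sample volumes are those given above.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ul):
    """Volume of stock (uL) needed to reach final_conc in final_volume_ul.

    final_conc and stock_conc must be in the same units.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_conc / stock_conc * final_volume_ul

SAMPLE_UL = 600.0  # regular NMR tube; ~300 uL for a Shigemi tube

dss_ul = stock_volume_ul(0.2, 10.0, SAMPLE_UL)    # 0.2 mM from an assumed 10 mM stock
azide_ul = stock_volume_ul(0.05, 5.0, SAMPLE_UL)  # 0.05% from an assumed 5% (w/v) stock
d2o_ul = 0.10 * SAMPLE_UL                         # assumed 10% (v/v) D2O for the lock

print(f"DSS stock: {dss_ul:.1f} uL")
print(f"NaN3 stock: {azide_ul:.1f} uL")
print(f"D2O: {d2o_ul:.0f} uL")
print(f"Sample and buffer: {SAMPLE_UL - dss_ul - azide_ul - d2o_ul:.0f} uL")
```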
4. Notes
1. Using water/formic acid/isopropanol as the mobile phase can
dramatically improve resolution (47, 68, 69). However, formic
acid can lead to peptide formylation as well as shorten the life-
time of an HPLC column.
2. Adding 1-propanol or 2-propanol to solvent B is often used to
increase the resolution.
3. This protocol assumes that the desired His-MBP-TEV expres-
sion construct has been prepared as described and transformed
into the E. coli expression strain of interest (here, BL21 (DE3)
cells). For convenience, we prepare 50% glycerol stocks of our
transformants and freeze at −80°C by inoculating 5 mL of LB
broth with a single isolated colony from an LB agar plate,
growing overnight at 37°C, adding sterile glycerol to 50%
(w/v) and then freezing. Scraping the surface of the frozen
glycerol stock with a sterile pipet tip is all that is needed to
remove a small amount for inoculation of the LB broth.
4. Isotopic labeling in E. coli. There are several protocols available
for isotopic labeling that allow incorporation of specifically
labeled or deuterated amino acids (70, 71). The protocol
described here is for full 13C and 15N incorporation. Other pro-
tocols can be substituted at this point.
5. A higher starting concentration of acetonitrile (e.g., 30% sol-
vent B) is used for more hydrophobic peptides.
6. The reverse phase HPLC fractions containing the peptide of
interest often have sufficient water to yield a fluffy powder by
lyophilization. If the peptide elutes in a high concentration of
acetonitrile and dries as a film under vacuum, then a sublimat-
ing solvent, such as cyclohexane, can be added to dissolve the
film and to repeat the lyophilization procedure. Alternatively, a
small amount of water can be added directly to the HPLC elu-
tion fraction.
7. TEV protease cleavage. For TEV protease that is expressed and
purified in-house, the typical yield is ~0.35 mg/mL (the con-
centration of eluate off of the column, total yield is 18 mg/L
of culture). TEV protease is stored in 50% glycerol, 5 mM
DTT, 1 mM EDTA at −20°C until use. Cleavage reactions are
set up in a 1:1 v:v ratio consisting of 1 part (2 mg/mL)
expressed MBP fusion protein to 1 part (0.175 mg/mL) TEV
solution. Regardless of the source of the TEV protease, it is
essential to determine experimentally the cleavage efficiency of
a particular construct. Some fusions will cleave quickly and effi-
ciently with little TEV protease, some constructs require more
TEV protease or require more time. Finally, for some con-
structs proteolysis may not proceed to completion even after
24 h.
8. IEX/detergent exchange. The efficiency of detergent exchange
using an IEX column depends upon strong binding of the pro-
tein to the IEX column. Binding is determined by the protein
charge, which in turn is determined by the pI of the protein
and the pH of the buffer chosen. For instance, EpoR218–268 has
an estimated pI of 5.78, so a basic buffer is used for anion
exchange chromatography. Tris buffer at a pH of 8.5 yields an
overall protein charge of −1.8. This charge allows complete
binding of the EpoR218–268 protein to the IEX column and effi-
cient detergent exchange.
9. Sample preparation for mass spectrometry. Sinapinic acid is
used in this method and is generally suitable for peptides and
proteins larger than 3 kDa. However, other MALDI-TOF
matrices are available. Steps 2–4 in this section may be omitted
if the results are acceptable.
10. Selection of the optimum protein-to-lipid ratio. There are
competing factors that must be considered when selecting the
protein-to-lipid ratio for NMR studies on membrane-reconsti-
tuted peptides. The sensitivity of the NMR measurement
increases as the protein-to-lipid ratio increases. However,
increasing peptide concentrations leads to nonspecific aggre-
gation. In a series of control studies using the transmembrane
domain of glycophorin A, it was found that the peptide began
to nonspecifically aggregate above protein-to-lipid ratios of
~1:50 (57). A typical cellular membrane has a
peptide:lipid molar ratio of 1:60 (72).
11. Organic solvents for peptide–membrane reconstitution. TFE
is often our first choice of an organic solvent for solubilization
of hydrophobic membrane peptides, lipids, and detergent.
A second choice is hexafluoroisopropanol (HFIP). Both sol-
vents are co-soluble with water and allow good co-mixing with
detergent and lipids. The strategy for the reconstitution is to
monomerize the lipid in TFE and then to add mixed deter-
gent–lipid micelles in a minimum amount of water to induce
the formation of mixed micelles containing peptide. In this
procedure, when the mixed detergent–lipid micelles contain-
ing peptide are frozen and lyophilized to remove the organic
solvent, the peptide has not had the opportunity to nonspecifi-
cally aggregate.
12. Step 8 in this section may be omitted if the dichroic ratio of
short helical TM peptides (~30 amino acids) is ≥3, which represents
proper reconstitution with a helical tilt angle of ≤25°
from the bilayer normal. For longer TM peptides, a lower
dichroic ratio may be observed as a result of the peptide
sequence that is not embedded in the membrane and can adopt
other secondary structures and orientations.
13. Water content in solid-state NMR samples. For NMR mea-
surements other than deuterium, the reconstituted membranes
containing peptide after step 6 or 8 are pelleted in an ultracen-
trifuge (e.g., SW60Ti Beckman Coulter rotor, 407,506 × g,
24 h) and then loaded into an NMR rotor as a wet paste. The
sample is then typically spun in an MAS rotor at 3–4 kHz for
30 min to further pellet the membranes and remove excess
water. This step helps balance the rotor for high-speed MAS
experiments. The level of hydration can be measured based on
the intensity of the water 1H resonances relative to those of the
lipid and peptide. The hydration levels after this procedure are
typically in the range of 80–100% (w/w) water. At this level of
hydration, lipid phase transition temperatures are not
changed.
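The quantities quoted in Notes 7 and 10 can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and the peptide and lipid molecular weights used in the example call are placeholder assumptions, not values taken from this chapter.

```python
def tev_reaction(fusion_mg_per_ml, tev_mg_per_ml, fusion_parts=1.0, tev_parts=1.0):
    """Return final concentrations (mg/mL) and the fusion:TEV mass ratio
    after mixing the two stock solutions in the given volume parts."""
    total_parts = fusion_parts + tev_parts
    fusion_final = fusion_mg_per_ml * fusion_parts / total_parts
    tev_final = tev_mg_per_ml * tev_parts / total_parts
    return fusion_final, tev_final, fusion_final / tev_final


def lipid_mass_mg(peptide_mg, peptide_mw, lipid_mw, lipids_per_peptide):
    """Mass of lipid (mg) needed for a peptide:lipid molar ratio of
    1:lipids_per_peptide. Molecular weights are in g/mol."""
    peptide_mmol = peptide_mg / peptide_mw
    return peptide_mmol * lipids_per_peptide * lipid_mw


# Note 7: 1 part 2 mg/mL MBP fusion + 1 part 0.175 mg/mL TEV protease.
fusion_final, tev_final, mass_ratio = tev_reaction(2.0, 0.175)
print(f"fusion {fusion_final} mg/mL, TEV {tev_final} mg/mL, "
      f"mass ratio {mass_ratio:.1f}:1")

# Note 10: lipid needed for 1 mg of peptide at a 1:50 molar ratio.
# The molecular weights (3000 and 700 g/mol) are assumed placeholders.
print(f"lipid required: {lipid_mass_mg(1.0, 3000.0, 700.0, 50):.1f} mg")
```

For a different construct, substitute the measured stock concentrations and the actual molecular weights of the peptide and lipid.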
Acknowledgments
This work was supported by NIH-NSF instrumentation grants
(S10 RR13889 and DBI-9977553) and a grant from the NIH to
S.O.S. (GM-46732). We gratefully acknowledge the W.M. Keck
Foundation for support of the NMR facilities in the Center of
Structural Biology at Stony Brook.
References
1. Opella, S. J., and Marassi, F. M. (2004) Structure determination of membrane proteins by NMR spectroscopy. Chem. Rev. 104, 3587–3606.
2. Sanders, C. R., and Sönnichsen, F. (2006) Solution NMR of membrane proteins: Practice and challenges. Magn. Reson. Chem. 44, S24–S40.
3. McDermott, A. (2009) Structure and dynamics of membrane proteins by magic angle spinning solid-state NMR. Ann. Rev. Biophys. 38, 385–403.
4. Ketchem, R. R., Hu, W., and Cross, T. A. (1993) High-resolution conformation of gramicidin A in a lipid bilayer by solid-state NMR. Science 261, 1457–1460.
5. Andronesi, O. C., Becker, S., Seidel, K., Heise, H., Young, H. S., and Baldus, M. (2005) Determination of membrane protein structure and dynamics by magic-angle-spinning solid-state NMR spectroscopy. J. Am. Chem. Soc. 127, 12965–12974.
6. Cady, S. D., Schmidt-Rohr, K., Wang, J., Soto, C. S., DeGrado, W. F., and Hong, M. (2010) Structure of the amantadine binding site of influenza M2 proton channels in lipid bilayers. Nature 463, 689–692.
7. Chu, S. D., Coey, A. T., and Lorigan, G. A. (2010) Solid-state 2H and 15N NMR studies of side-chain and backbone dynamics of phospholamban in lipid bilayers: Investigation of the N27A mutation. Biochim. Biophys. Acta-Biomembr. 1798, 210–215.
8. Etzkorn, M., Martell, S., Andronesi, O. C., Seidel, K., Engelhard, M., and Baldus, M. (2007) Secondary structure, dynamics, and topology of a seven-helix receptor in native membranes, studied by solid-state NMR spectroscopy. Angew. Chem. Int. Ed. Engl. 46, 459–462.
9. Liu, W., Crocker, E., Constantinescu, S. N., and Smith, S. O. (2005) Helix packing and orientation in the transmembrane dimer of gp55-P of the spleen focus forming virus. Biophys. J. 89, 1194–1202.
10. Smith, S. O., Song, D., Shekar, S., Groesbeek, M., Ziliox, M., and Aimoto, S. (2001) Structure of the transmembrane dimer interface of glycophorin A in membrane bilayers. Biochemistry 40, 6553–6558.
11. Smith, S. O., Smith, C. S., and Bormann, B. J. (1996) Strong hydrogen bonding interactions involving a buried glutamic acid in the transmembrane sequence of the neu/erbB-2 receptor. Nat. Struct. Biol. 3, 252–258.
12. Ramamoorthy, A., and Opella, S. J. (1995) Two-dimensional chemical shift/heteronuclear dipolar coupling spectra obtained with polarization inversion spin exchange at the magic angle and magic-angle sample spinning (PISEMAMAS). Solid State Nucl. Magn. Reson. 4, 387–392.
13. Wang, J. F., Kim, S., Kovacs, F., and Cross, T. A. (2001) Structure of the transmembrane region of the M2 protein H+ channel. Protein Sci. 10, 2241–2250.
14. Kim, M. J., Park, S. H., Opella, S. J., Marsilje, T. H., Michellys, P. Y., Seidel, H. M., and Tian, S. S. (2007) NMR structural studies of interactions of a small, nonpeptidyl Tpo mimic with the thrombopoietin receptor extracellular juxtamembrane and transmembrane domains. J. Biol. Chem. 282, 14253–14261.
15. Bocharov, E. V., Pustovalova, Y. E., Pavlov, K. V., Volynsky, P. E., Goncharuk, M. V., Ermolyuk, Y. S., Karpunin, D. V., Schulga, A. A., Kirpichnikov, M. P., Efremov, R. G., et al. (2007) Unique dimeric structure of BNip3 transmembrane domain suggests membrane permeabilization as a cell death trigger. J. Biol. Chem. 282, 16256–16266.
16. Bocharov, E. V., Mineev, K. S., Volynsky, P. E., Ermolyuk, Y. S., Tkach, E. N., Sobol, A. G., Chupin, V. V., Kirpichnikov, M. P., Efremov, R. G., and Arseniev, A. S. (2008) Spatial structure of the dimeric transmembrane domain of the growth factor receptor ErbB2 presumably corresponding to the receptor active state. J. Biol. Chem. 283, 6950–6956.
17. Roosild, T. P., Greenwald, J., Vega, M., Castronovo, S., Riek, R., and Choe, S. (2005) NMR structure of Mistic, a membrane-integrating protein for membrane protein expression. Science 307, 1317–1321.
18. MacKenzie, K. R., Prestegard, J. H., and Engelman, D. M. (1997) A transmembrane helix dimer: Structure and implications. Science 276, 131–133.
19. Sulistijo, E. S., and MacKenzie, K. R. (2009) Structural basis for dimerization of the BNIP3 transmembrane domain. Biochemistry 48, 5106–5120.
20. Oxenoid, K., and Chou, J. J. (2005) The structure of phospholamban pentamer reveals a channel-like architecture in membranes. Proc. Natl. Acad. Sci. USA 102, 10870–10875.
21. Hu, J., Qin, H., Li, C., Sharma, M., Cross, T. A., and Gao, F. P. (2007) Structural biology of transmembrane domains: Efficient production and characterization of transmembrane peptides by NMR. Protein Sci. 16, 2153–2165.
22. Tamm, L. K., and Liang, B. Y. (2006) NMR of membrane proteins in solution. Prog. Nucl. Magn. Reson. Spectrosc. 48, 201–210.
23. Mineev, K. S., Bocharov, E. V., Pustovalova, Y. E., Bocharova, O. V., Chupin, V. V., and Arseniev, A. S. (2010) Spatial structure of the transmembrane domain heterodimer of ErbB1 and ErbB2 receptor tyrosine kinases. J. Mol. Biol. 400, 231–243.
24. White, S. H. (2009) Biophysical dissection of membrane proteins. Nature 459, 344–346.
25. Ruscetti, S. K., Janesch, N. J., Chakraborti, A., Sawyer, S. T., and Hankins, W. D. (1990) Friend spleen focus-forming virus induces factor independence in an erythropoietin-dependent erythroleukemia cell line. J. Virol. 64, 1057–1062.
26. Hoatlin, M. E., Kozak, S. L., Lilly, F., Chakraborti, A., Kozak, C. A., and Kabat, D. (1990) Activation of erythropoietin receptors by Friend viral gp55 and by erythropoietin and down-modulation by the murine Fv-2r resistance gene. Proc. Natl. Acad. Sci. USA 87, 9985–9989.
27. Li, J. P., D’Andrea, A. D., Lodish, H. F., and Baltimore, D. (1990) Activation of cell growth by binding of Friend spleen focus-forming virus gp55 glycoprotein to the erythropoietin receptor. Nature 343, 762–764.
28. Constantinescu, S. N., Keren, T., Socolovsky, M., Nam, H. S., Henis, Y. I., and Lodish, H. F. (2001) Ligand-independent oligomerization of cell-surface erythropoietin receptor is mediated by the transmembrane domain. Proc. Natl. Acad. Sci. USA 98, 4379–4384.
29. Gurezka, R., Laage, R., Brosig, B., and Langosch, D. (1999) A heptad motif of leucine residues found in membrane proteins can drive self-assembly of artificial transmembrane segments. J. Biol. Chem. 274, 9265–9270.
30. Constantinescu, S. N., Keren, T., Russ, W. P., Ubarretxena-Belandia, I., Malka, Y., Kubatzky, K. F., Engelman, D. M., Lodish, H. F., and Henis, Y. I. (2003) The erythropoietin receptor transmembrane domain mediates complex formation with viral anemic and polycythemic gp55 proteins. J. Biol. Chem. 278, 43755–43763.
31. Constantinescu, S. N., Huang, L. J. S., Nam, H. S., and Lodish, H. F. (2001) The erythropoietin receptor cytosolic juxtamembrane domain contains an essential, precisely oriented, hydrophobic motif. Mol. Cell 7, 377–385.
32. Witthuhn, B. A., Quelle, F. W., Silvennoinen, O., Yi, T. L., Tang, B., Miura, O., and Ihle, J. N. (1993) Jak2 associates with the erythropoietin receptor and is tyrosine-phosphorylated and activated following stimulation with erythropoietin. Cell 74, 227–236.
33. Dawson, P. E., and Kent, S. B. H. (2000) Synthesis of native proteins by chemical ligation. Annu. Rev. Biochem. 69, 923–960.
34. Loll, P. J. (2003) Membrane protein structural biology: the high throughput challenge. J. Struct. Biol. 142, 144–153.
35. Wang, D. N., Safferling, M., Lemieux, M. J., Griffith, H., Chen, Y., and Li, X. D. (2003) Practical aspects of overexpressing bacterial secondary membrane transporters for structural studies. Biochim. Biophys. Acta-Biomembr. 1610, 23–36.
36. Klammt, C., Löhr, F., Schäfer, B., Haase, W., Dötsch, V., Rüterjans, H., Glaubitz, C., and Bernhard, F. (2004) High level cell-free expression and specific labeling of integral membrane proteins. Eur. J. Biochem. 271, 568–580.
37. Laage, R., and Langosch, D. (2001) Strategies for prokaryotic expression of eukaryotic membrane proteins. Traffic 2, 99–104.
38. Bormann, B. J., Knowles, W. J., and Marchesi, V. T. (1989) Synthetic peptides mimic the assembly of transmembrane glycoproteins. J. Biol. Chem. 264, 4033–4037.
39. Kochendoerfer, G. G., Salom, D., Lear, J. D., Wilk-Orescan, R., Kent, S. B. H., and DeGrado, W. F. (1999) Total chemical synthesis of the integral membrane protein influenza A virus M2: Role of its C-terminal domain in tetramer. Biochemistry 38, 11905–11913.
40. Fisher, L. E., and Engelman, D. M. (2001) High-yield synthesis and purification of an α-helical transmembrane domain. Anal. Biochem. 293, 102–108.
41. Tian, C. L., Karra, M. D., Ellis, C. D., Jacob, J., Oxenoid, K., Sonnichsen, F., and Sanders, C. R. (2005) Membrane protein preparation for TROSY NMR screening. Meth. Enzymol. 394, 321–334.
42. Page, R. C., Moore, J. D., Nguyen, H. B., Sharma, M., Chase, R., Gao, F. P., Mobley, C. K., Sanders, C. R., Ma, L., Sonnichsen, F. D., et al. (2006) Comprehensive evaluation of solution nuclear magnetic resonance spectroscopy sample preparation for helical integral membrane proteins. J. Struct. Funct. Genomics 7, 51–64.
43. Qin, H. J., Hu, J., Hua, Y. Z., Challa, S. V., Cross, T. A., and Gao, F. P. (2008) Construction of a series of vectors for high throughput cloning and expression screening of membrane proteins from Mycobacterium tuberculosis. BMC Biotechnol. 8, 51–59.
44. Kent, S. B. H. (1988) Chemical synthesis of peptides and proteins. Annu. Rev. Biochem. 57, 957–989.
45. Carpino, L. A. (1993) 1-Hydroxy-7-Azabenzotriazole – an efficient peptide coupling additive. J. Am. Chem. Soc. 115, 4397–4398.
46. Glover, K. J., Martini, P. M., Vold, R. R., and Komives, E. A. (1999) Preparation of insoluble transmembrane peptides: Glycophorin-A, prion (110–137), and FGFR (368–397). Anal. Biochem. 272, 270–274.
47. Heukeshoven, J., and Dernick, R. (1982) Reversed-phase high-performance liquid-chromatography of virus proteins and other large hydrophobic proteins in formic-acid containing solvents. J. Chromatogr. 252, 241–254.
48. Mutter, M., Nefzi, A., Sato, T., Sun, X., Wahl, F., and Wohr, T. (1995) Pseudo-prolines (Psi-Pro) for accessing inaccessible peptides. Pept. Res. 8, 145–153.
49. Arnau, J., Lauritzen, C., Petersen, G. E., and Pedersen, J. (2006) Current strategies for the use of affinity tags and tag removal for the purification of recombinant proteins. Protein Expr. Purif. 48, 1–13.
50. Kapust, R. B., and Waugh, D. S. (1999) Escherichia coli maltose-binding protein is uncommonly effective at promoting the solubility of polypeptides to which it is fused. Protein Sci. 8, 1668–1674.
51. Nallamsetty, S., and Waugh, D. S. (2006) Solubility-enhancing proteins MBP and NusA play a passive role in the folding of their fusion partners. Protein Expr. Purif. 45, 175–182.
52. Korepanova, A., Gao, F. P., Hua, Y. Z., Qin, H. J., Nakamoto, R. K., and Cross, T. A. (2005) Cloning and expression of multiple integral membrane proteins from Mycobacterium tuberculosis in Escherichia coli. Protein Sci. 14, 148–158.
53. Amor-Mahjoub, M., Suppini, J. P., Gomez-Vrielyunck, N., and Ladjimi, M. (2006) The effect of the hexahistidine-tag in the oligomerization of HSC70 constructs. J. Chromatogr. B Analyt. Technol. Biomed. Life Sci. 844, 328–334.
54. Stols, L., Gu, M. Y., Dieckman, L., Raffen, R., Collart, F. R., and Donnelly, M. I. (2002) A new vector for high-throughput, ligation-independent cloning encoding a tobacco etch virus protease cleavage site. Protein Expr. Purif. 25, 8–15.
55. Lew, S., and London, E. (1997) Simple procedure for reversed-phase high-performance liquid chromatographic purification of long hydrophobic peptides that form transmembrane helices. Anal. Biochem. 251, 113–116.
56. Fleming, K. G., Ackerman, A. L., and Engelman, D. M. (1997) The effect of point mutations on the free energy of transmembrane alpha-helix dimerization. J. Mol. Biol. 272, 266–275.
57. Smith, S. O., Eilers, M., Song, D., Crocker, E., Ying, W. W., Groesbeek, M., Metz, G., Ziliox, M., and Aimoto, S. (2002) Implications of threonine hydrogen bonding in the glycophorin A transmembrane helix dimer. Biophys. J. 82, 2476–2486.
58. Tamm, L. K., and Tatulian, S. A. (1997) Infrared spectroscopy of proteins and peptides in lipid bilayers. Q. Rev. Biophys. 30, 365–429.
59. Johnson, W. C. (1999) Analyzing protein circular dichroism spectra for accurate secondary structures. Proteins 35, 307–312.
60. Siminovitch, D. J. (1998) Solid-state NMR studies of proteins: the view from static 2H NMR experiments. Biochem. Cell Biol. 76, 411–422.
61. Ying, W. W., Irvine, S. E., Beekman, R. A., Siminovitch, D. J., and Smith, S. O. (2000) Deuterium NMR reveals helix packing interactions in phospholamban. J. Am. Chem. Soc. 122, 11125–11128.
62. Sharpe, S., Barber, K. R., Grant, C. W. M., Goodyear, D., and Morrow, M. R. (2002) Organization of model helical peptides in lipid bilayers: Insight into the behavior of single-span protein transmembrane domains. Biophys. J. 83, 345–358.
63. Siminovitch, D. J., Ruocco, M. J., Olejniczak, E. T., Das Gupta, S. K., and Griffin, R. G. (1988) Anisotropic 2H-nuclear magnetic resonance spin-lattice relaxation in cerebroside- and phospholipid-cholesterol bilayer membranes. Biophys. J. 54, 373–381.
64. Bloom, M., and Smith, I. C. P. (1985) Manifestations of lipid-protein interactions in deuterium NMR, in Progress in Protein-Lipid Interactions (Watts, A. & De Pont, J. J. H. H. M., Eds.) pp 61–88, Elsevier, Amsterdam.
65. Krueger-Koplin, R. D., Sorgen, P. L., Krueger-Koplin, S. T., Rivera-Torres, A. O., Cahill, S. M., Hicks, D. B., Grinius, L., Krulwich, T. A., and Girvin, M. E. (2004) An evaluation of detergents for NMR structural studies of membrane proteins. J. Biomol. NMR 28, 43–57.
66. Nallamsetty, S., and Waugh, D. S. (2007) A generic protocol for the expression and purification of recombinant proteins in Escherichia coli using a combinatorial His6-maltose binding protein fusion tag. Nat. Protoc. 2, 383–391.
67. Karas, M., Bachmann, D., Bahr, U., and Hillenkamp, F. (1987) Matrix-assisted ultraviolet laser desorption of non-volatile compounds. Int. J. Mass Spectrom. Ion Process. 78, 53–68.
68. Bollhagen, R., Schmiedberger, M., and Grell, E. (1995) High-Performance Liquid-Chromatographic Purification of Extremely Hydrophobic Peptides – Transmembrane Segments. J. Chromatogr. A 711, 181–186.
69. Sato, T., Kawakami, T., Akaji, K., Konishi, H., Mochizuki, K., Fujiwara, T., Akutsu, H., and Aimoto, S. (2002) Synthesis of a membrane protein with two transmembrane regions. J. Pept. Sci. 8, 172–180.
70. Gardner, K. H., and Kay, L. E. (1998) The use of 2H, 13C, 15N multidimensional NMR to study the structure and dynamics of proteins. Annu. Rev. Biophys. Biomol. Struct. 27, 357–406.
71. Goto, N. K., and Kay, L. E. (2000) New developments in isotope labeling strategies for protein solution NMR spectroscopy. Curr. Opin. Struct. Biol. 10, 585–592.
72. Gennis, R. B. (1989) Membrane Dynamics and Protein-Lipid Interactions, in Biomembranes pp 166–198, Springer-Verlag, New York.
Chapter 19
Assignment of Backbone Resonances in a Eukaryotic
Protein Kinase – ERK2 as a Representative Example
Andrea Piserchio, Kevin N. Dalby, and Ranajeet Ghose
Abstract
A first step toward the analysis of the structure, dynamics, and interactions of proteins by NMR is obtain-
ing an acceptable level of resonance assignments. This process is nontrivial in most eukaryotic kinases given
their size and suboptimal behavior in solution. Using inactive ERK2 as a representative example, we
describe the procedures we utilized to achieve a significant degree of completeness of backbone resonance
assignment.
Key words: MAP kinase, ERK2, TROSY, Backbone resonance assignment, Selective labeling, Spin-
labeled ATP
1. Introduction
ERK2 is a member of the extracellular signal-regulated kinase
(ERK) subfamily of the mitogen-activated protein kinases (MAPKs).
ERKs are upregulated in response to the activation of cell surface
receptors mediated by extracellular cues, such as hormones, cytok-
ines, and growth factors (1–3). ERKs play a central role in growth
factor-related apoptosis in colorectal cancer (4), making the ERK
signaling pathway a key target for cancer therapy (5, 6). The activa-
tion of ERKs (ERK1 and ERK2) occurs downstream of the Ras/
Raf pathway upon dual phosphorylation of the conserved Thr183-
X-Tyr185 motif by MAP/ERK kinase (MEK) (7).
While the bacterial expression and purification of ERK2, at
least in its inactive state (on which we focus here), are more straight-
forward than some other eukaryotic kinases, e.g., c-Src (see
Chapter 7, this volume, (8)), complications in NMR characteriza-
tion due to the large size and extensive dynamics remain a general
trend in most eukaryotic kinases. A necessary step before NMR
studies of structure, dynamics, and interactions of these important
signaling molecules can be undertaken is to obtain a sufficient
number of assignments for backbone resonances. However, stan-
dard methodologies (9, 10) that are successfully applied to smaller
or more well-behaved systems tend to fail for these important sig-
naling molecules. This would explain why only a few detailed NMR
studies on eukaryotic kinases are available in the literature (11–15).
Here, we provide a description of the procedures that we applied to
obtain backbone resonance assignments for inactive ERK2. These
strategies illustrate how similar protocols can be utilized for other
protein kinases. Note that generating homogeneous
samples of active, dual-phosphorylated (on Thr183 and Tyr185)
ERK2 for NMR studies is nontrivial; studies on the active,
dual-phosphorylated species will be described elsewhere.
2. Assignment of Backbone Resonances for Full-Length Inactive ERK2
The size of ERK2 (42 kDa), its tendency for nonspecific aggregation
at concentrations above approximately 200 µM, and dynamics
on the slow to intermediate timescale leading to line-broadening
effects make the assignment of backbone resonances quite difficult.
Given these issues, the resonance assignment procedure for full-length
inactive ERK2 is described below in some detail.
2.1. Standard Triple-Resonance Experiments
For inactive ERK2 (hereafter referred to as ERK2), TROSY-based
experiments (10, 16, 17) consistently displayed significantly
narrower line widths, even at fields as low as 600 MHz, and were
therefore preferred over their non-TROSY counterparts. However,
for samples prepared in D2O-based media, the slow back exchange
of several well-protected amide groups resulted in reduced sensi-
tivity in backbone-directed NMR experiments, complicating their
analysis. This problem was evident when an 15N-, 1H-TROSY spectrum
of perdeuterated ERK2 (prepared in a D2O-based medium)
was compared to that of a sample prepared from cells grown in a
H2O-based medium supplemented with uniformly 2H-, 15N-labeled
amino acids. The latter spectrum displayed additional sets of resonances
not visible in the former. Incomplete back exchange complicates
resonance assignment both by restricting the number of
detectable resonances and by limiting the correlations available
for establishing unambiguous sequential connections (the so-called
backbone walk) between the resonances that are observed.
This complication compounds the generally poor quality of the
triple-resonance experiments, which presumably results from the
aforementioned aggregation phenomenon and dynamics. A TROSY-
HNCO experiment collected at 800 MHz exhibits around 65–70%
of the expected peaks. Clearly, this represents the upper limit for
NMR assignment (for protein prepared in a D2O-based medium,
where there is incomplete back exchange of amide protons), given
that HNCO is by far the most sensitive triple-resonance experi-
ment. In an HNCACB dataset collected at the same field and
requiring a week of acquisition time, Cβ(i − 1) peaks were identified
for approximately 43% of the expected resonances, if only the
resonances also appearing in the HNCO experiment were considered
(30% of the expected resonances considering the entire protein).
Spectral overlap only partially justifies such an incomplete
peak count. An HN(COCA)CB experiment collected at 600 MHz
shows 53% of the spin systems visible in the HNCO (38% of the
overall expected resonances). Fortunately, the Cα-based experiments
were, by comparison, far more complete; the HN(CO)CA
(also collected at 600 MHz), for example, includes 90% of the
Cα(i − 1) peaks expected from the HNCO-detected spin systems.
By combining this with an HNCA (collected at 800 MHz) experiment,
most of the expected Cα(i), Cα(i − 1) patterns matching the
observed HNCO peaks could be successfully recognized. In addition,
roughly 70% (relative to the HNCO-observed resonances) of
the intraresidue HN(CA)CO peaks were also found. However,
CO- or Cα-based experiments are limited in scope since they allow
the identification of only a limited number of amino acid types,
unlike the Cβ-based experiments. Clearly, these statistics indicate
that the extent of the NMR assignment achievable by using con-
ventional approaches is fairly limited. Therefore, we relied heavily
on an approach that takes into consideration the structural and
biochemical features of ERK2. A similar approach has been
employed by Langer and coworkers (18) for assignment of the
catalytic subunit of protein kinase A (PKA).
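The two percentages quoted for each experiment (relative to the HNCO-observed spin systems and relative to the entire protein) are related through the completeness of the HNCO itself. A minimal Python sketch of that conversion follows; the function name is hypothetical, and the HNCO completeness of 0.70 is an assumption taken from the upper end of the 65–70% range quoted above.

```python
def fraction_of_all_residues(fraction_of_hnco_systems, hnco_completeness=0.70):
    """Convert a peak count expressed relative to the HNCO-observed spin
    systems into a fraction of all expected resonances in the protein."""
    return fraction_of_hnco_systems * hnco_completeness


# Values quoted in the text, relative to the HNCO-observed spin systems.
relative_counts = {
    "HNCACB Cb(i-1)": 0.43,
    "HN(COCA)CB": 0.53,
    "HN(CO)CA Ca(i-1)": 0.90,
}

for experiment, relative in relative_counts.items():
    overall = fraction_of_all_residues(relative)
    print(f"{experiment}: {relative:.0%} of HNCO systems, "
          f"about {overall:.0%} of all expected resonances")
```

With this assumption the HNCACB and HN(COCA)CB values come out near 30% and 37%, in line with the 30% and 38% quoted in the text; small differences are expected because the HNCO completeness is only given as a range.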
2.2. Use of Predicted Chemical Shifts
Several crystal structures of ERK2 can be found in the PDB, in
both the inactive (19) and dual-phosphorylated (on T183 and
Y185) active (20) forms. A simple way to take advantage of the
available structural data is to utilize them to predict the protein
NMR chemical shifts. This can be done with reasonable accuracy
for 13C resonances, as long as the crystal structure reflects the struc-
ture in solution. Toward this purpose, we used the software Sparta
(21), freely available from the Bax group. For well-ordered regions,
like sheets and helices, these predictions are expected to be more
accurate than for loops and regions of noncanonical secondary
structure. Discrepancies may also be introduced in highly struc-
tured areas by the presence (or by the lack of) ligands or by inter-
molecular interactions either in solution (aggregation) or in crystallo
(crystal-packing forces). The latter scenario is especially true in flex-
ible and highly dynamic molecules (22, 23) as the protein kinases
are known to be. Note that in most of the crystal
structures available in the PDB, ERK2 is in complex with one of a
variety of ligands. In addition, the measured chemical shift values
are also affected by several sources of experimental error, including
those resulting from low digital resolution (especially for 13Cβ),
spectral overlap, poor signal-to-noise ratio, and artifacts introduced
due to pulse-sequence imperfections and nonidealities. Therefore,
we used a relatively large cutoff (2.3 ppm) for differences between
measured and predicted chemical shifts when evaluating a potential
match for a 13C resonance for a particular position along the pro-
tein sequence. Due to this large uncertainty and the large number
of potential matches along the polypeptide chain, any comparison
done at the level of individual residues leads to several ambiguous
matches and is not particularly informative. If, however, the comparison
is done using stretches of three (or more) resonance peaks
sequentially linked together, the method becomes much more use-
ful. In particular, we found that links typically of four, and some-
times three, residues belonging to well-structured regions were
sufficient to assign these resonances. We also found that, when ana-
lyzing areas of less well-defined secondary structure, this approach
still remains useful when combined with the traditional analysis
based on average chemical shift values expected for a given residue
type as obtained from the Biological Magnetic Resonance Data
Bank (http://www.bmrb.wisc.edu). Often, a link comprising four
residues can be assigned to a protein loop if just three of the four
residues in a given link correlate favorably to the corresponding
predicted chemical shift values, provided that the chemical shifts
observed for all four of them are compatible with the expected
average database values of residues comprising the sequence.
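The fragment-matching idea described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used by the authors: the data structures, function names, and the toy chemical shift values are all hypothetical, and only the 2.3 ppm cutoff and the "three of four residues" relaxation for loops come from the text.

```python
from typing import Dict, List, Optional, Sequence

Shifts = Dict[str, Optional[float]]  # e.g. {"CA": 56.1, "CB": 30.2}

CUTOFF_PPM = 2.3  # cutoff quoted in the text for 13C resonances


def residue_matches(observed: Shifts, predicted: Shifts,
                    cutoff: float = CUTOFF_PPM) -> bool:
    """True if every nucleus available in both sets agrees within the cutoff."""
    shared = [n for n in observed
              if observed[n] is not None and predicted.get(n) is not None]
    if not shared:
        return False
    return all(abs(observed[n] - predicted[n]) <= cutoff for n in shared)


def place_fragment(fragment: Sequence[Shifts], predicted: Sequence[Shifts],
                   min_matches: Optional[int] = None,
                   cutoff: float = CUTOFF_PPM) -> List[int]:
    """Slide a stretch of sequentially linked spin systems along the
    predicted shifts and return every start index where at least
    min_matches residues (default: all of them) agree."""
    need = len(fragment) if min_matches is None else min_matches
    hits = []
    for start in range(len(predicted) - len(fragment) + 1):
        n_ok = sum(residue_matches(obs, predicted[start + k], cutoff)
                   for k, obs in enumerate(fragment))
        if n_ok >= need:
            hits.append(start)
    return hits


# Toy predicted shifts for six consecutive residues (hypothetical values).
predicted = [
    {"CA": 55.0, "CB": 30.0}, {"CA": 62.0, "CB": 70.0}, {"CA": 45.0},
    {"CA": 58.0, "CB": 40.0}, {"CA": 52.0, "CB": 19.0}, {"CA": 61.0, "CB": 33.0},
]

single = [{"CA": 57.0}]
linked = [{"CA": 45.5}, {"CA": 57.0, "CB": 41.0}, {"CA": 53.0, "CB": 18.0}]

print(place_fragment(single, predicted))  # several candidate positions
print(place_fragment(linked, predicted))  # linked stretch narrows the choice
```

Passing `min_matches=3` for a four-residue link mimics the relaxed criterion described for loops; in that case the remaining residue would still need to be checked against average database values for its residue type, which this sketch does not do.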
2.3. Use of Structural Information
Clearly, the assignments obtained using predicted chemical shifts
should not be considered reliable until confirmed using more conventional
experimental spectroscopy-based approaches. An obvious
way to accomplish this is to take advantage of the known
three-dimensional structure of ERK2 and use the internuclear distances
available from it for comparison with cross peaks between
amide protons that appear in a three-dimensional 15N-edited
NOESY-TROSY experiment. The verification of the existence (or
the absence) of specific cross peaks predicted from the crystal struc-
ture is an effective way to validate assignments, especially for
β-strands and loops. In the case of β-sheets, the expected (and
observed) NOEs are mainly long range, allowing confirmation of
sequentially nonproximal stretches of residues that comprise individual
strands of a β-sheet that have been independently assigned.
The reproduction of proper patterns of internuclear distances from
incorrectly assigned resonances is highly unlikely. In the case of loops,
generally only a few specific residues would be expected to be well-
structured and generate amide–amide NOEs, so again the observed
NOE pattern can be used to confirm a tentative assignment. For
helices, however, the expected NOEs are mostly short range and
only involve amino acids in the particular helical segment, so they
can be used principally to distinguish a helical motif from a nonhelical
one. Interamide NOEs in helices can nevertheless be used as
an aid to, or an alternative for, triple-resonance experiments in
order to sequentially link successive amino acid spin systems. Since
the early days of protein NMR spectroscopy when heteronuclear
labeling was not commonplace, “walking” the sequential NH–NH
NOEs represented a simple path to assign resonances correspond-
ing to helical fragments (24). Furthermore, in samples of low pro-
ton density (as in the present case), spin diffusion can be utilized to
generate excellent medium-range connectivities (i, i + 2; i, i + 3,
etc.) within helical stretches. For example, an 15N-edited NOESY-
TROSY experiment with a long mixing time (400 ms) effectively
generates a TOCSY-like pattern among the NH resonances in a
tight turn (Fig. 1). This process is greatly simplified by perdeutera-
tion that reduces magnetization transfer to other regions of the
protein. This is particularly useful in regions of spectral crowding,
when resonance overlaps prevent the unambiguous identification
of sequential NOEs at several positions.
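The comparison between structure-derived distances and observed amide–amide cross peaks can be sketched in Python as follows. This is a hedged illustration only: the function names, the coordinates, and the 5 Å distance threshold are assumptions introduced here (the text does not specify a threshold, and with long mixing times and spin diffusion longer apparent contacts are possible).

```python
import math
from itertools import combinations


def expected_noe_pairs(hn_coords, max_dist=5.0):
    """Return residue pairs (i, j), i < j, whose amide protons lie within
    max_dist (in angstroms) of each other in the reference structure.
    hn_coords maps residue number -> (x, y, z) of the amide proton."""
    pairs = set()
    for (i, a), (j, b) in combinations(sorted(hn_coords.items()), 2):
        if math.dist(a, b) <= max_dist:
            pairs.add((i, j))
    return pairs


def check_assignment(expected, observed):
    """Compare predicted contacts with the cross peaks implied by a
    tentative assignment."""
    observed = {tuple(sorted(p)) for p in observed}
    return {
        "confirmed": sorted(expected & observed),
        "missing": sorted(expected - observed),
        "unexplained": sorted(observed - expected),
    }


# Hypothetical amide proton coordinates (angstroms) for four residues.
coords = {10: (0.0, 0.0, 0.0), 11: (2.8, 0.0, 0.0),
          45: (0.0, 4.5, 0.0), 80: (30.0, 0.0, 0.0)}

expected = expected_noe_pairs(coords)
report = check_assignment(expected, observed=[(11, 10), (45, 80)])
print(report)
```

A tentative assignment that yields many "unexplained" cross peaks and few "confirmed" ones is a candidate for revision, whereas "missing" contacts may simply reflect line broadening or overlap.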
2.4. Use of Selective Labeling Strategies
Selective amino acid labeling represents another route to aid in the
linking of neighboring spin systems and to fill gaps in assignments
when the information content of the triple-resonance experiments,
especially in the Cβ region, is poor. Usually, selective labeling (25)
is performed by supplementing the M9 medium with unlabeled
(14N) ammonium chloride, 1H-, 12C-glucose, or similar nutrients
(sometimes, LB is used directly (26)), a particular 15N-labeled (15N,
12C, 1H) amino acid, and sometimes an unlabeled (14N, 12C, 1H)
pool of the remaining amino acids. Then, a simple 15N, 1H HSQC
experiment should highlight the amide resonances belonging to
the residue type selected. This method can also be more rigorously
applied using E. coli strains that are auxotrophic for the specific
amino acid to be labeled (27). Unfortunately, this labeling approach
did not perform well when applied to ERK2. Independently of the
specific amino acid tested, the resulting HSQC spectrum lacked
discernible peaks. We attributed this problem to extensive line
broadening resulting from efficient 1H–1H relaxation in the absence
of deuteration. Reducing the overall 1H density by growing the
bacteria in D2O did not significantly improve the quality of the
spectra suggesting that this was the result of the contribution of
the local dipolar interactions between the amide and alpha protons
of the selectively labeled amino acids. This phenomenon leads to
an increase in the contribution of the 1H homonuclear R1 to the
relaxation rate of the antiphase term between the amide 15N and
1H nuclei, and results in a broadening of the resonances in an 15N,
1H-HSQC experiment. We then decided to alter our selective
labeling approach and use amino acids selectively 13C-labeled only
at the carbonyl position (14N, 12C, 13CO, 1H) in a uniformly
15N-labeled, deuterated background. As shown by Takeuchi and
coworkers (28), this 15N-labeled, deuterated background can be
achieved by adding 15NH4Cl, 12C-2H glucose, and a pool of 15N,
12C, 2H amino acids (CELTONE base powder, Cambridge Isotope
Laboratories) to the growth medium and by replacing H2O with
D2O. HNCO experiments would then be expected to show peaks
at a position corresponding to the selectively labeled carbonyl and
the nitrogen of the following residue. While the alpha position of
the labeled residues (i − 1) is still protonated, the resonance detected
corresponds to the amide (1HN) for the ith residue that carries a
deuteron at the Cα position. A further advantage of this approach
is the higher resolution offered by the 3D HNCO compared to the
extensive resonance overlap seen in 2D HSQC experiments used
with the 15N-selective labeling approach. Another piece of information
provided by this labeling scheme is the disappearance of the
resonances corresponding to the 14N-labeled amino acids (for those
selectively 13C labeled at the carbonyl position) that can be monitored
using 2D TROSY experiments. We successfully utilized this
strategy for Gly, Ala, Leu, Val, and Ile residues in ERK2 (representative
examples are shown in Fig. 2).
Fig. 1. (a) Structure of ERK2 with the N- and C-terminal lobes colored light and dark grey, respectively. The MAP kinase
insert and the C-terminal extension are colored black. Side chains for the regulatory T183 and Y185 residues are shown
and labeled. Side chains for the tight turn encompassing residues T92-M96 are shown on the structure and expanded on
the right panel. (b) Strips taken from an 15N-edited NOESY-TROSY spectrum collected with a 400-ms mixing time at
800 MHz on a uniformly 2H-, 13C-, 15N-labeled inactive ERK2 sample in a buffer containing 150 mM NaCl, 2 mM DTT, 10 mM
MgCl2, 2 mM ADP, 50 mM phosphate, pH 6.8, 10% 2H2O. Shown here is the effect of spin diffusion generating long-range
connections among the amides of the segment comprising residues T92-M96. The lines highlight the total correlation-like
(as in a TOCSY experiment, where transfer occurs through scalar rather than dipolar couplings) effect of the magnetization
transfer. The source (first label) and target (last label) amide 1HN nuclei for the cross peaks are labeled. Only a single label
is used for the diagonal peaks.
Fig. 2. 13CO, 1H planes for TROSY-based HNCO spectra for representative examples (Leu, Ala) of residue-selective
13CO-labeled samples of ERK2 in a uniformly 15N-labeled, perdeuterated background. Also shown in the extreme left panel
is the corresponding plane from uniformly 13C-, 2H-, 15N-labeled ERK2. The labels correspond to the residue that contributes
the 13CO nucleus (i.e., the i − 1 residue).
2.5. Use of Spin-Labeled ATP Analogs
Like all protein and indeed nonprotein kinases, ERK2 binds ATP
and ADP. However, the chemical shift perturbations induced by
binding of these molecules (or corresponding slowly hydrolyzed
analogs) are not limited to the ATP-binding pocket; therefore, the
shifts of unknown resonances can be difficult to correlate to a specific
portion of the structure simply by monitoring chemical shift
perturbations. It has already been shown for PKA that spin-labeled
ATP (sl-ATP) molecules can be successfully employed to highlight
those residues within a certain distance from the nucleotide binding
pocket (18). The sl-ATP we employed, sl-N3-ATP (a kind gift from
Dr. Pia Vogel, SMU), carries a stable nitroxide spin label as part of a
2,2,5,5-tetramethyl-3-pyrroline scaffold attached to the 3′ (70–80%)
or 2′ (20–30%) positions of the ribose moiety (29). A crystal structure
of ERK2 bound to this specific ligand does not exist; therefore,
we relied on the structure of ATP-bound ERK2 (PDB: 1GOL) to
estimate distances from the spin label. We estimated that spin-labeled
ATP in a one-fourth (or half) substoichiometric amount is capable
of significantly quenching the HNCO peaks corresponding to residues
within 20–25 Å of the 3′ ribose position (a representative
example is shown in Fig. 3). Given the substoichiometric amounts
of sl-ATP used and the low affinity (30) of ATP for inactive ERK2
(KD > ~700 μM), the conformational changes induced by simple
ATP binding are expected to be negligible. This approach helped
extend the assignments in the area at the interface between the N-
and C-lobes of ERK2, a critical region that was difficult to assign by
other means. Using these strategies, we have unambiguously assigned
~90% (~65% of all nonproline resonances) of the resonances seen/
resolved to date at 800 MHz. The largest unassigned continuous
stretch corresponds to the catalytic segment that can be expected to
be in conformational exchange, a phenomenon that would lead to
line-broadening effects. We are investigating alternative strategies
to obtain assignments for this region, including experiments that
allow better visualization of exchange-broadened lines (31).
Fig. 3. Paramagnetic relaxation enhancement (PRE) monitored using TROSY-based HNCO
experiments. Partial and complete quenching for L114 and K115, respectively, induced by
a substoichiometric amount of sl-ATP (1:0.25 ratio) are illustrated. Both residues are at a
distance of ~12 Å from the label.
3. Conclusions
We focused on the problem of the NMR backbone assignment
of eukaryotic kinases using ERK2 as an example. NMR studies
of this class of proteins are hindered by a number of problems,
namely, incomplete back exchange of amide protons, aggregation/
oligomerization at high protein concentration, and internal dynamics.
We have shown here that the careful analysis of otherwise
well-established NMR experiments that can be normally found in
most standard pulse sequence libraries can lead to an acceptable
level of resonance assignment. However, this process requires mul-
tiple sample conditions (different ligands, various selectively labeled
samples, spin-labeled ATP analogs, etc.) and available structural
information. In general, we have found that the resonance assign-
ment of the sites of protein–protein interactions (where known, as
in ERK2) is significantly less challenging than the highly dynamic
regions around the catalytic site. This process of resonance assign-
ment is certainly time and resource consuming, but obtaining these
assignments is clearly worthwhile given their utility in investigating
protein-protein interactions involving these key signaling mole-
cules, especially in the large number of cases where crystallographic
information about the interaction interfaces is not available (32).
Acknowledgments
This research has been supported by the following grants from the
National Institutes of Health: GM084278 (to RG), GM059802
(to KND), and 5G12 RR03060 (toward partial support of the
NMR facilities at The City College of New York). RG is a member
of the New York Structural Biology Center, NYSTAR facility.
KND is a recipient of a grant from the Welch Foundation (F-1390).
The authors thank Dr. Pia Vogel (SMU) for the kind gift of spin-
labeled ATP.
References
1. Murphy, L. O., and Blenis, J. (2006) MAPK signal specificity: the right place at the right time. Trends Biochem. Sci. 31, 268–275.
2. Chen, Z., Gibson, T. B., Robinson, F., Silvestro, L., Pearson, G., Xu, B., Wright, A., Vanderbilt, C., and Cobb, M. H. (2001) MAP kinases. Chem. Rev. 101, 2449–2476.
3. Pearson, G., Robinson, F., Beers Gibson, T., Xu, B. E., Karandikar, M., Berman, K., and Cobb, M. H. (2001) Mitogen-activated protein (MAP) kinase pathways: regulation and physiological functions. Endocrine Rev. 22, 153–183.
4. Fang, J. Y., and Richardson, B. C. (2005) The MAPK signalling pathways and colorectal cancer. The Lancet Oncol. 6, 322–327.
5. Kohno, M., and Pouyssegur, J. (2006) Targeting the ERK signaling pathway in cancer therapy. Annal. Med. 38, 200–211.
6. Kohno, M., and Pouyssegur, J. (2003) Pharmacological inhibitors of the ERK signaling pathway: application as anticancer drugs. Prog. Cell Cyc. Res. 5, 219–224.
7. Roux, P. P., and Blenis, J. (2004) ERK and p38 MAPK-activated protein kinases: a family of protein kinases with diverse biological functions. Microbiol. Mol. Biol. Rev. 68, 320–344.
8. Piserchio, A., Dalby, K. N., and Ghose, R. (2012) Expression and Purification of Src-family Kinases for Solution NMR Studies. Meth. Mol. Biol. 831, 111–132.
9. Sattler, M., Schleucher, J., and Griesinger, C. (1999) Heteronuclear multidimensional NMR experiments for the structure determination of proteins in solution employing pulsed field gradients. Prog. NMR Spectr. 34, 93–158.
10. Salzmann, M., Pervushin, K., Wider, G., Senn, H., and Wuthrich, K. (1998) TROSY in triple-resonance experiments: new perspectives for sequential NMR assignment of large proteins. Proc. Natl. Acad. Sci. USA 95, 13585–13590.
11. Masterson, L. R., Mascioni, A., Traaseth, N. J., Taylor, S. S., and Veglia, G. (2008) Allosteric cooperativity in protein kinase A. Proc. Natl. Acad. Sci. USA 105, 506–511.
12. Masterson, L. R., Cheng, C., Yu, T., Tonelli, M., Kornev, A., Taylor, S. S., and Veglia, G. (2010) Dynamics connect substrate recognition to catalysis in protein kinase A. Nature Chem. Biol. 6, 821–828.
13. Wiesner, S., Wybenga-Groot, L. E., Warner, N., Lin, H., Pawson, T., Forman-Kay, J. D., and Sicheri, F. (2006) A change in conformational dynamics underlies the activation of Eph receptor tyrosine kinases. EMBO J. 25, 4686–4696.
14. Vajpai, N., Strauss, A., Fendrich, G., Cowan-Jacob, S. W., Manley, P. W., Grzesiek, S., and Jahnke, W. (2008) Solution conformations and dynamics of ABL kinase-inhibitor complexes determined by NMR substantiate the different binding modes of imatinib/nilotinib and dasatinib. J. Biol. Chem. 283, 18292–18302.
15. Vogtherr, M., Saxena, K., Hoelder, S., Grimme, S., Betz, M., Schieborr, U., Pescatore, B., Robin, M., Delarbre, L., Langer, T., Wendt, K. U., and Schwalbe, H. (2006) NMR characterization of kinase p38 dynamics in free and ligand-bound forms. Angew. Chem. Intl. Ed. Engl. 45, 993–997.
16. Riek, R., Pervushin, K., and Wuthrich, K. (2000) TROSY and CRINEPT: NMR with large molecular and supramolecular structures in solution. Trends Biochem. Sci. 25, 462–468.
17. Pervushin, K. (2000) Impact of transverse relaxation optimized spectroscopy (TROSY) on NMR as a technique in structural biology. Q. Rev. Biophys. 33, 161–197.
18. Langer, T., Vogtherr, M., Elshorst, B., Betz, M., Schieborr, U., Saxena, K., and Schwalbe, H. (2004) NMR backbone assignment of a protein kinase catalytic domain by a combination of several approaches: application to the catalytic subunit of cAMP-dependent protein kinase. ChemBioChem 5, 1508–1516.
19. Zhang, F., Strand, A., Robbins, D., Cobb, M. H., and Goldsmith, E. J. (1994) Atomic structure of the MAP kinase ERK2 at 2.3 Å resolution. Nature 367, 704–711.
20. Canagarajah, B. J., Khokhlatchev, A., Cobb, M. H., and Goldsmith, E. J. (1997) Activation mechanism of the MAP kinase ERK2 by dual phosphorylation. Cell 90, 859–869.
21. Shen, Y., and Bax, A. (2007) Protein backbone chemical shifts predicted from searching a database for torsion angle and sequence homology. J. Biomol. NMR 38, 289–302.
22. Fushman, D., Xu, R., and Cowburn, D. (1999) Direct determination of changes of interdomain orientation on ligation: use of the orientational dependence of 15N NMR relaxation in Abl SH(32). Biochemistry 38, 10225–10230.
23. Piserchio, A., Nair, P. A., Shuman, S., and Ghose, R. (2010) Solution NMR studies of Chlorella virus DNA ligase-adenylate. J. Mol. Biol. 395, 291–308.
24. Wüthrich, K. (1986) NMR of proteins and nucleic acids, John Wiley and Sons, New York.
25. Muchmore, D. C., McIntosh, L. P., Russell, C. B., Anderson, D. E., and Dahlquist, F. W. (1989) Expression and nitrogen-15 labeling of proteins for proton and nitrogen-15 nuclear magnetic resonance. Meth. Enzymol. 177, 44–73.
26. Englander, J., Cohen, L., Arshava, B., Estephan, R., Becker, J. M., and Naider, F. (2006) Selective labeling of a membrane peptide with 15N-amino acids using cells grown in rich medium. Biopolymers 84, 508–518.
27. LeMaster, D. M., and Cronan, J. E., Jr. (1982) Biosynthetic production of 13C-labeled amino acids with site-specific enrichment. J. Biol. Chem. 257, 1224–1230.
28. Takeuchi, K., Ng, E., Malia, T. J., and Wagner, G. (2007) 1-13C amino acid selective labeling in a 2H15N background for NMR studies of large proteins. J. Biomol. NMR 38, 89–98.
29. Vogel-Claude, P., Schafer, G., and Trommer, W. E. (1988) Synthesis of a photoaffinity-spin-labeled derivative of ATP and its first application to F1-ATPase. FEBS Lett. 227, 107–109.
30. Prowse, C. N., and Lew, J. (2001) Mechanism of activation of ERK2 by dual phosphorylation. J. Biol. Chem. 276, 99–103.
31. Li, Y., and Palmer, A. G., III. (2010) Narrowing of protein NMR spectral lines broadened by chemical exchange. J. Am. Chem. Soc. 132, 8856–8857.
32. Piserchio, A., Warthaka, M., Devkota, A. K., Kaoud, T. S., Lee, S., Abramczyk, O., Ren, P., Dalby, K. N., and Ghose, R. (2011) Solution NMR insights into docking interactions involving inactive ERK2. Biochemistry 50, 3660–3672.
Chapter 20
Electrostatics of Hydrogen Exchange for Analyzing Protein Flexibility
Griselda Hernández, Janet S. Anderson, and David M. LeMaster
Abstract
Electrostatic interactions at the protein–aqueous interface modulate the reactivity of solvent-exposed backbone
amides by at least a billion-fold. The brief (~10 ps) lifetime of the peptide anion formed during
the hydroxide-catalyzed exchange reaction helps enable the experimental rates to be robustly predictable
by continuum dielectric methods. Since this ability to predict the structural dependence of exchange reac-
tivity also applies to the protein amide hydrogens that are only rarely exposed to the bulk solvent phase,
electrostatic analysis of the experimental exchange rates provides an effective assessment of whether a given
model ensemble is consistent with the properly weighted Boltzmann conformational distribution of the
protein native state.
Key words: Hydrogen exchange, Protein flexibility, Electrostatics, Conformational distribution,
Dielectric shielding, Poisson–Boltzmann, Protein ensemble
1. Introduction
Both the flexibility and the conformational dynamics of proteins
are generally thought to play critical roles in biological function.
Accurate experimental and computational characterization of these
properties for any given protein remains challenging. In the equi-
librium distribution of the protein native state, every energetically
feasible conformation has a nonzero probability. As a result, the
quantitative analysis of protein flexibility is synonymous with deter-
mining the proper Boltzmann-weighting of this conformational
distribution.
To compare experimental measurements effectively with
computational modeling of the conformational distribution
of the protein native state, several conditions should be met. The
computational modeling must be sufficiently detailed so that a
quantitative structure-based prediction of the observed experimental
data can be made. Conversely, given a set of protein conformations
against which to test, the experimental method needs to be pre-
dictable on the basis of that distribution. If instead, interpretation
of the experimental measurements depends not only upon the con-
formational distribution but also upon the rate of interchange
between those conformations, then the computational modeling
approach must encompass the complete conformational dynamics
of the system. Although a full dynamical analysis is appealing in
principle, in practice, the ability to use experimental data to distin-
guish the degree to which a given dynamical simulation is consis-
tent with physical reality is often significantly decreased.
The kinetics of the hydrogen exchange reaction for the amides
along the protein backbone have long been interpreted as provid-
ing a passive monitor of what fraction of time a given amide hydro-
gen is directly exposed to the solvent phase. In reality, the reactivity
to exchange for a solvent-exposed amide is acutely sensitive to its
electrostatic environment. Poisson–Boltzmann continuum dielec-
tric methods offer usefully accurate predictions of those electro-
static environments. Owing to the highly transient peptide anion
charge state, the kinetics of hydroxide-catalyzed amide hydrogen
exchange provide a “snapshot” of the Boltzmann conformational
distribution which is nearly independent from the dynamics of
interchange between protein conformations. Not only is hydrogen
exchange analysis of the well-exposed backbone amides acutely
sensitive to the detailed conformations of the highly populated
states, these data also reflect both the frequency and structural
detail of the exchange-competent states that arise from rare con-
formational transitions of the structurally buried amides. As a
result, electrostatic analysis of amide hydrogen exchange provides
a robust experimental basis upon which to assess the consistency of
any given model ensemble with the properly weighted Boltzmann
conformational distribution.
2. Steric Interpretation of Protein Hydrogen Exchange
2.1. Hydrogen Exchange as a Measure of Solvent Accessibility
Before the first protein X-ray structure was reported, Linderstrøm-Lang
and colleagues (1) described the so-called EX2 analysis of
hydrogen exchange from structurally buried backbone amides, as
summarized in the following kinetic scheme:
$$\mathrm{closed} \underset{k_{cl}}{\overset{k_{op}}{\rightleftharpoons}} \mathrm{open} \xrightarrow{k_{ch}} \mathrm{exchanged}$$
If the rate of the closing reaction is rapid compared to the open
state chemical exchange step (i.e., kcl >> kch), a preequilibrium of
the open and closed conformational states is established and the
overall exchange rate constant kex equals (kop/kcl) kch, in which
kop/kcl is the equilibrium constant for the conformational opening
transition. Since that time, the conventional steric interpretation of
hydrogen exchange has identified the rate constant kch with the
corresponding kinetics of exchange in model peptides under analo-
gous sample conditions (2, 3). The ratio of the observed exchange
rate to that of the model peptide defines a protection factor that is
assumed to specify the fraction of the population in the open-state
and thus a residue-specific free energy [i.e., ΔG = −RT ln(kex/kpep)]
for the conformational transition that gives rise to exchange (3).
Central to the peptide normalization analysis is the assumption
that the residual tertiary structure in the exchange-competent open
state does not influence the kinetics of exchange.
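As a minimal numerical sketch of the EX2 relations above, the following Python fragment evaluates kex = (kop/kcl) kch and the protection-factor free energy ΔG = −RT ln(kex/kpep). The function names and the example rate constants are illustrative assumptions rather than values from the text, and kch is set equal to the model peptide rate kpep, as the peptide normalization analysis assumes.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ex2_rate(k_op, k_cl, k_ch):
    """EX2 limit (k_cl >> k_ch): k_ex = (k_op / k_cl) * k_ch."""
    if k_cl <= 10.0 * k_ch:
        # Arbitrary illustrative threshold for "k_cl >> k_ch".
        raise ValueError("EX2 limit requires k_cl >> k_ch")
    return (k_op / k_cl) * k_ch


def protection_free_energy(k_ex, k_pep, temp_k=298.15):
    """Apparent opening free energy, dG = -RT ln(k_ex / k_pep), in kcal/mol."""
    return -R_KCAL * temp_k * math.log(k_ex / k_pep)


# Hypothetical rate constants (s^-1), chosen only for illustration.
k_op, k_cl, k_pep = 1.0e-2, 1.0e4, 10.0
k_ex = ex2_rate(k_op, k_cl, k_ch=k_pep)
dG = protection_free_energy(k_ex, k_pep)
print(f"k_ex = {k_ex:.2e} s^-1, dG = {dG:.2f} kcal/mol")
```

In this sketch the free energy depends only on the ratio kop/kcl, which is why the conventional analysis reads the protection factor directly as an opening equilibrium constant.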
When applied to the slowest exchanging amides, the peptide
normalization analysis has been shown to yield reasonable predic-
tions of global thermodynamic stability for a number of proteins
(4), as might be expected when hydrogen exchange is occurring
from a conformationally disordered unfolded state. However,
under physiological conditions, the amides of well folded proteins
that exchange via a global unfolding transition generally constitute
only a small fraction of the peptide backbone. A survey of 20 well-
studied proteins identified less than 10% of all backbone amides as
such “core” amides (5).
The desire to extend the residue-specific stability interpretation
of experimental hydrogen exchange data has stimulated the devel-
opment of a number of structure-based analysis algorithms (6–12)
to predict the population of solvent-exposed conformations for
each backbone amide. As illustrated by the COREX conformational
sampling algorithm (6), the reported success in predicting protein
hydrogen exchange rates has been invoked to validate the applica-
tion of this algorithm to a broad range of questions including the
structural propagation of ligand binding effects (13), analysis of the
localized energetics of allosteric coupling pathways (14), partition-
ing of protein structures into high, medium and low thermody-
namic stability environments (15) and characterizing the
determinants of fold specificity (16) as well as structurally interpret-
ing protein cold denaturation (17), the framework model for fold-
ing (18) and pathological protein misfolding transitions (19). Yet
when the protection factor predictions given in the initial COREX
manuscript (6) were directly compared against the corresponding
experimental hydrogen exchange values, no net correlation was
observed (20, 21). Although this lack of predictive capability sug-
gests a limited utility for the specific conformational sampling algo-
rithm used, it does not provide an unambiguous test for the validity
of the peptide normalization analysis. Since independent experi-
mental data characterizing the properties of the transient, partially
ordered conformations that give rise to hydrogen exchange are rarely
available, the predicted residue-specific conformational free energies
cannot generally be directly verified or refuted.
[Figure 1 plot: log kex (s−1) versus pH (5.0–12.0) for the solvent-exposed amides K2, K3, I12, D14, D21, S25, K29, D35, D36, V38*, K46, S47, K51, and E53.]
Fig. 1. Magnetization transfer-based hydrogen exchange rate measurements on the solvent-exposed amides of P. furiosus
A2K rubredoxin. CLEANEX-PM [71, 72] measurements were carried out at 25°C. Dashed lines with slope of 1.0 were
drawn for the pH dependent data of each solvent-exposed amide, indicating a simple hydroxide ion dependence on the
exchange rates over most of the pH range. The exchange rate value for Val 38, marked with an asterisk, is derived by
extrapolation from measurements at 52°C. Reprinted from ref. 22 with permission from the American Chemical Society.
In contrast, the physical plausibility of the peptide
normalization analysis can be straightforwardly examined.
Specifically, is it chemically reasonable to assume that exposure of
an amide hydrogen to the bulk solvent phase is sufficient to estab-
lish exchange kinetics equivalent to those of the corresponding
model peptide? This assumption can be directly examined by con-
sideration of the exchange behavior for protein backbone amide
hydrogens that are well-exposed to solvent in the high resolution
X-ray structure so that no conformational transition is required for
the exchange reaction.
The well exposed Val 38 amide hydrogen of rubredoxin from
Pyrococcus furiosus (22) exchanges at a rate that is nearly 10^7-fold
slower than that of the corresponding Trp-Val model peptide (23).
Conversely, His 38 in the active site of the a domain of the human
protein disulfide isomerase exchanges at a rate 400-fold faster than
the corresponding model peptide value (24). Application of the
standard protection factor analysis to these two static solvent-
exposed amides yields a 13 kcal/mol range of apparent conforma-
tional stabilities. This range is at least as large as the maximal global
stability of any protein predicted from hydrogen exchange mea-
surements, which has been independently verified by either calori-
metric or spectroscopic methods (4, 25, 26).
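The 13 kcal/mol range can be checked with a few lines of Python. This is a sketch that uses the rounded rate ratios quoted above and assumes a temperature of 298.15 K for both amides (25°C is stated for the rubredoxin data; applying it to the protein disulfide isomerase value is an assumption). The variable names are illustrative.

```python
import math

RT = 1.987204e-3 * 298.15  # kcal/mol at 25 °C (assumed temperature)

# Rate ratios k_ex / k_pep quoted in the text (rounded values).
val38_ratio = 1.0e-7   # rubredoxin Val 38: ~10^7-fold slower than the peptide
his38_ratio = 400.0    # PDI His 38: 400-fold faster than the peptide

dG_val38 = -RT * math.log(val38_ratio)  # apparent "stability", kcal/mol
dG_his38 = -RT * math.log(his38_ratio)
dG_range = dG_val38 - dG_his38
print(f"{dG_val38:.1f}  {dG_his38:.1f}  range = {dG_range:.1f} kcal/mol")
```

The negative value obtained for His 38 illustrates the difficulty: a static, solvent-exposed amide cannot sensibly have an open-state population greater than one.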
Figure 1 illustrates the exchange rates for all of the backbone
amide hydrogens of Pyrococcus furiosus rubredoxin that are exposed
to solvent in the 1.1 Å resolution X-ray structure (27). At each pH,
exchange rates can be accurately quantified over the range of 0.2–
70 s−1 (28). The exchange rates of the individual static solvent
accessible (29) amides increase directly with the hydroxide ion
concentration up to pH 10.77. At the highest pH values, the slope
for the amides of Lys 29 and Lys 46 decreases, presumably reflecting
the increasingly negative protein charge arising from the neutralization
of these amino acid side chains. These data yield
hydroxide-catalyzed rate constants kOH− at 25°C ranging from
10^9.68 M−1 s−1 for Lys 2 to 10^1.55 M−1 s−1 for Ile 12. The exchange of
Val 38 at pH 11.85 and 25°C is too slow to be observed under
these conditions by the magnetization transfer method used.
Taking advantage of the fact that this protein is stable to 94°C at pH 11.6
(30), exchange measurements on the pH 11.85 sample were carried
out at elevated temperature and extrapolated to a rate constant
of 10^0.67 M−1 s−1 at 25°C, a billion-fold slower than that of Lys 2.
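Because the rates scale linearly with hydroxide concentration, kex = kOH−[OH−], the pH range over which a given amide falls inside the measurable 0.2–70 s−1 window follows directly from its rate constant. The sketch below assumes pKw = 14.0 at 25°C and ideal solution behavior; the function name is illustrative, and the log kOH− values are those quoted above (10^9.68, 10^1.55, and 10^0.67 M−1 s−1).

```python
import math

PKW = 14.0  # assumed ion product of water at 25 °C


def ph_for_rate(log_k_oh, k_ex):
    """pH at which k_ex = k_OH * [OH-], with [OH-] = 10**(pH - pKw)."""
    return PKW + math.log10(k_ex) - log_k_oh


log_k_oh = {"Lys 2": 9.68, "Ile 12": 1.55, "Val 38": 0.67}  # from the text
for residue, lk in log_k_oh.items():
    lo, hi = ph_for_rate(lk, 0.2), ph_for_rate(lk, 70.0)
    print(f"{residue}: measurable between pH {lo:.2f} and {hi:.2f}")

# Ratio of the Lys 2 and Val 38 rate constants (about a billion-fold).
ratio = 10 ** (log_k_oh["Lys 2"] - log_k_oh["Val 38"])
print(f"k_OH(Lys 2) / k_OH(Val 38) = {ratio:.2e}")

# Predicted Val 38 exchange rate at pH 11.85 under the same assumptions.
k_ex_val38 = 10 ** (log_k_oh["Val 38"] + 11.85 - PKW)
print(f"Val 38 at pH 11.85: k_ex = {k_ex_val38:.3f} s^-1")
```

Under these assumptions the Val 38 rate at pH 11.85 comes out near 0.03 s−1, below the 0.2 s−1 lower limit, in line with the statement that its exchange is too slow to observe at 25°C.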
2.2. Comparing Solvent Accessibility from Molecular Simulations to that Inferred from the Steric Interpretation of Hydrogen Exchange
The physical plausibility of the peptide normalization analysis of
hydrogen exchange can also be tested by comparison with molecular
simulation studies that strive to predict the correct Boltzmann-weighted
native state conformational distribution of a protein.
Ubiquitin has served as the primary model system for such detailed
studies of the conformational ensemble. To facilitate such comparisons,
our recent measurements for ubiquitin provide the first
reported data set describing the hydroxide-catalyzed exchange rate
constants kOH− for every backbone amide of a protein under near
physiological solution conditions (20).
One such model conformational ensemble has been deposited
in the Protein Data Bank by Vendruscolo and colleagues (PDB
code 2NR2 (31)) for which molecular simulations of ubiquitin
were restrained to match both experimental NOE and NMR relax-
ation data. More recently, de Groot and colleagues (32) also depos-
ited a molecular dynamics simulation of ubiquitin (PDB code
2K39) that was initially restrained to match the same set of NOE
restraints. This set of conformations was then used to iteratively
select subsets of conformations that were consistent with experi-
mental residual dipolar coupling data.
The 51 amide hydrogens of ubiquitin that become exposed to
solvent in at least one of the 144 structures of the 2NR2 ensemble
or one of the 116 structures of the 2K39 ensemble were compared
to the accessibility predictions derived from protection factor anal-
ysis of the amide hydrogen exchange (33) (Fig. 2). When the
experimental exchange rate constants for these 51 residues were
normalized against the model peptide values to obtain an estimate
of the population of exchange-competent conformations for each
residue, the fraction of solvent-exposed conformations varies by
more than a factor of 10^7. This variation corresponds to a range in
excess of 10 kcal/mol for the apparent residue-specific conformational
stabilities ΔGHX. Given the number of conformations
included, the molecular dynamics-derived ensembles can only sample
fractional accessibilities over a range of ~10^2. For this set of 51
amides, the fractional accessibility predictions from the ensembles,
as compared to the peptide normalization-based estimates, differ
by up to a factor of 10^5 (ΔΔG ~ 7 kcal/mol). Indeed, 21 of these
amides yield ΔGHX values that differ from the molecular simulation-derived
ensemble predictions by at least half that much
(ΔΔG ~ 3.5 kcal/mol). To the degree that these two
NMR-restrained molecular simulations faithfully model the
Boltzmann conformational distribution of ubiquitin, the conventional
interpretation of the hydrogen exchange data severely underestimates
the flexibility of this protein.
[Figure 2 plot: log solvent accessibility (0 to −8) versus residue number (0–70) for ubiquitin; secondary structure elements β1, β2, α, β3, β4, and β5 are marked along the top.]
Fig. 2. The fraction of conformations in which the backbone amide hydrogen is predicted
to be exposed to solvent for each residue of ubiquitin. Estimations based on protection
factor analysis [3, 23] of hydrogen exchange measurements [20], normalized to model
peptide values, are indicated (filled circle). Illustrated as well is the fraction of conformations
in the 2NR2 (filled triangle) and 2K39 (filled inverted triangle) NMR-restrained
ensembles for which the solvent accessibility of the amide hydrogen is greater than
0.5 Å2. The positions of the secondary structure elements of ubiquitin are indicated along
the top of the figure. Reprinted from ref. 33 with permission from Elsevier Limited.
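The correspondence between the population ratios and the free-energy differences quoted for the ubiquitin comparison can be reproduced with the short Python sketch below. It assumes a temperature of 25°C (not stated in the text for these data) and uses an illustrative helper name. It also shows why ensembles of 144 or 116 structures can report fractional accessibilities over only about two orders of magnitude.

```python
import math

RT = 1.987204e-3 * 298.15  # kcal/mol, assuming 25 °C


def ddG_from_factor(factor):
    """Free-energy difference corresponding to a population ratio."""
    return RT * math.log(factor)


# Smallest nonzero exposed fraction an ensemble of N structures can report.
for name, n_structures in (("2NR2", 144), ("2K39", 116)):
    print(f"{name}: log10(min fraction) = {math.log10(1 / n_structures):.2f}")

print(f"factor 10^5   -> {ddG_from_factor(1e5):.1f} kcal/mol")
print(f"factor 10^2.5 -> {ddG_from_factor(10 ** 2.5):.1f} kcal/mol")
print(f"factor 10^7   -> {ddG_from_factor(1e7):.1f} kcal/mol")
```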
As discussed in the following section, the physical basis for this
systematic error in flexibility predictions derived from the conven-
tional hydrogen exchange analysis is straightforward. Although
occasional exceptions arise due to specific local electrostatic inter-
actions (24), most solvent-exposed amides that lie along the sur-
face of a partially or fully folded protein will have lower acidities
than the corresponding model peptides, due to the presence of the
low dielectric volume of the adjacent protein interior. When the
peptide normalization analysis is applied to protein hydrogen
exchange data, these depressed ionization equilibria are misinter-
preted as a lower fraction of solvent-accessible conformations. The
direct implication of the electrostatic and conformational contri-
butions to hydrogen exchange kinetics is that normalization against
the model peptide exchange rates can only be expected to provide
useful conformational equilibria data when the exchange-compe-
tent state exhibits both solvation and conformational sampling
behavior similar to that of the model peptide (34, 35).
3. Kinetics and Electrostatics of Hydrogen Exchange
3.1. Implications of Amides Being Weak Normal Eigen Acids
Hydroxide-catalyzed amide hydrogen exchange is a straightforward
acid-base reaction. A number of researchers (36–41) have pointed
out that electrostatic interactions modulate the kinetics of amide
hydrogen exchange. Nevertheless, many of the earlier reported
electrostatic effects have appeared to be relatively modest when
compared with the 10^7- to 10^8-fold decrease in exchange rates,
which is commonly observed for the most slowly exchanging
amides of moderately stable proteins. Consistent with such an
assessment, it has been argued that titrating the formal charges of
the side chains modulates the observed hydrogen exchange rates
much more strongly via an indirect effect on protein stability than
it does via a direct electrostatic interaction (42).
As first predicted by Eigen (43), amides have been experimen-
tally demonstrated (44, 45) to act as normal Eigen acids such that
the reaction rate with hydroxide ions is attenuated from the diffu-
sion limit by the fraction of forward-reacting encounters Ki/
(Ki + 1), where Ki is the equilibrium constant for the transfer of a
proton from the amide to a hydroxide ion. Therefore, the thermodynamic
acidity of an amide directly predicts its kinetic acidity
as monitored by the hydrogen exchange reaction. Almost all protein
backbone amides have appreciably lower thermodynamic acidities
than that of water. As a result, nearly every collision with a neutral
water molecule will quench the peptide anion charge state. This
low acidity implies that near neutral pH, most backbone amides
will be in the peptide anion state at a fractional population of less
than one part in 10^10.
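The normal Eigen acid relation can be illustrated with a brief Python sketch. The diffusion-limited rate constant (10^10 M−1 s−1), the pKa of 15.7 used for water, the amide pKa values, and the function names are all illustrative assumptions rather than values taken from the text. Ki is approximated here as the ratio of the amide and water acid dissociation constants.

```python
def eigen_rate_constant(pka_amide, pka_water=15.7, k_diffusion=1.0e10):
    """k_OH = k_D * K_i / (K_i + 1) for a normal Eigen acid.

    K_i, the equilibrium constant for proton transfer from the amide to
    hydroxide, is taken here as 10**(pka_water - pka_amide). Both default
    values are illustrative assumptions, not values from the text.
    """
    k_i = 10.0 ** (pka_water - pka_amide)
    return k_diffusion * k_i / (k_i + 1.0)


def anion_fraction(pka_amide, ph):
    """Equilibrium fraction of the amide present as the peptide anion."""
    return 1.0 / (1.0 + 10.0 ** (pka_amide - ph))


for pka in (17.0, 18.0, 20.0):  # hypothetical amide pKa values
    k_oh = eigen_rate_constant(pka)
    frac = anion_fraction(pka, 7.0)
    print(f"pKa {pka:.1f}: k_OH = {k_oh:.2e} M^-1 s^-1, anion fraction = {frac:.1e}")
```

When Ki is much smaller than 1, the rate constant is approximately kD·Ki, so each unit increase in the amide pKa lowers the predicted exchange rate constant about tenfold. This is the sense in which the thermodynamic acidity predicts the kinetic acidity.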
The key advantage in predicting the ionization behavior of the
backbone amide, as compared to predicting ionization of protein
side chains, stems from the short lifetime of the peptide anion (22,
46, 47). In contrast to the μs–ms lifetimes for the charge states of
the ionizable side chains near neutral pH, the range of protein
conformational responses to the peptide anion charge state is
strongly limited by its brief lifetime. Although a direct measure-
ment of how rapidly the peptide anion is quenched by a neutral
water molecule has not been reported, NMR relaxation studies
indicate that the residence lifetime of a hydroxide ion in water
is ~5 ps (48), and lifetimes near 10 ps have often been observed
for photoactivated strong acids and bases (49, 50). Given that the
dominant phase of the Debye dielectric relaxation profile for water
has a time constant of 8 ps at 25°C (51), it has been argued that
the dynamics of water reorientation are limiting in these fast pro-
ton transfer reactions (49, 50). By analogy, the lifetime of the
peptide anion intermediate is likewise anticipated to be ~10 ps
(22, 46, 47).
3.2. Electronic Polarizability in the Dielectric Shielding of the Peptide Anion

As long discussed in electron transfer theory (52), dielectric shielding
is frequency dependent. The lifetime of a transient charge state
determines the range of conformational motions that can give rise to
effective dielectric shielding since conformational transitions that are
slower than the charge state lifetime cannot adjust rapidly enough to
stabilize that state. As a result, the dielectric shielding of the hydro-
gen exchange reaction that arises from the protein molecule is
expected to be dominated by electronic polarizability (47). Owing
to the highly transient peptide anion charge state, the kinetics of
amide hydrogen exchange provides a “snapshot” of the Boltzmann
conformational distribution that is nearly independent from the
dynamics of interchange between protein conformations.
The electrostatic free energy of a generalized-Born ion of
charge Q and radius R is given by the formula (53):
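$$\Delta G_{\mathrm{elec}} = -\left(\frac{1}{\varepsilon_{\mathrm{int}}} - \frac{1}{\varepsilon_{\mathrm{ext}}}\right)\frac{Q^{2}}{2R}$$
When such a low dielectric ion ($\varepsilon_{\mathrm{int}}$) is embedded in a high
dielectric solvent ($\varepsilon_{\mathrm{ext}}$), its electrostatic free energy is essentially
inversely proportional to the value of the internal dielectric ($\varepsilon_{\mathrm{int}}$). As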
discussed in Subheading 5.1, Poisson–Boltzmann continuum
dielectric calculations on the static solvent-accessible amides from a
set of four globular proteins have demonstrated that this inverse
proportionality is well preserved for these more complex geometries
(47). As a result, the slope of the correlation between the experi-
mental and predicted peptide acidities provides a sensitive monitor
of the optimal effective internal dielectric value, which was found to
equal 3 for these same four globular proteins (22, 47).
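The near inverse proportionality between the generalized-Born energy and the internal dielectric can be checked numerically. The sketch below evaluates −(1/ε_int − 1/ε_ext)Q²/2R and prints the product of the energy and ε_int, which should stay nearly constant. The ion radius of 2.0 Å and the external dielectric of 78.4 are illustrative assumptions rather than values taken from the text; 332.06 kcal Å mol⁻¹ e⁻² is the usual Coulomb conversion factor.

```python
COULOMB_KCAL = 332.06  # e^2/(4*pi*eps0) in kcal*Angstrom/(mol*e^2)


def born_energy(charge, radius, eps_int, eps_ext):
    """Generalized-Born electrostatic free energy in kcal/mol.

    charge is in units of e, radius in Angstrom.
    """
    return (-(1.0 / eps_int - 1.0 / eps_ext)
            * COULOMB_KCAL * charge ** 2 / (2.0 * radius))


# Assumed values for illustration only: a unit negative charge,
# a 2.0 Angstrom radius, and a water-like external dielectric.
Q, R, EPS_EXT = -1.0, 2.0, 78.4

for eps_int in (2.0, 3.0, 4.0):
    dg = born_energy(Q, R, eps_int, EPS_EXT)
    print(f"eps_int = {eps_int:.1f}: dG = {dg:7.2f} kcal/mol, "
          f"dG * eps_int = {dg * eps_int:7.2f}")
```

The product deviates from a constant only through the small 1/ε_ext term, which is why the slope of predicted against experimental acidities can serve as a monitor of the effective internal dielectric.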
An internal dielectric value of 2.0 is commonly used to model
the electronic polarizability of the protein interior. This value is
derived from refractive index measurements on typical organic liq-
uids that monitor the dielectric response at optical frequencies
(~10¹⁵ s⁻¹). However, noting that the density within the protein
interior is 30–40% higher than that of analogous small molecule
liquids (54, 55), Krishtalik and colleagues (56) have argued that
the average contribution of electronic polarizability implies a
dielectric shielding value of at least 2.5 for protein molecules. On
the slower time scale of ~10⁻¹³ s, nuclei respond to an altered electric
field by adjusting bond lengths and angles as well as the corresponding
vibrational frequencies. Although estimates vary, the
nuclear relaxation response may account for as little as 5% of the
total polarizability in the high frequency range (57). These studies
provide strong support for the interpretation that our experimen-
tally derived determination of an effective internal dielectric value
of 3 for the protein hydrogen exchange reaction indicates that the
dielectric shielding of the peptide anion is dominated by electronic
polarizability.
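One way to see how a 30–40% higher density could raise the electronic dielectric value from 2.0 toward 2.5 is the Lorentz–Lorenz (Clausius–Mossotti) relation, in which (ε − 1)/(ε + 2) scales with number density at fixed molecular polarizability. The sketch below assumes that this relation applies to the protein interior; it is offered only as a plausibility check and is not necessarily the argument made in ref. 56. The function names are hypothetical.

```python
def dielectric_from_refractive_index(n):
    """Optical-frequency dielectric value from the refractive index (eps = n^2)."""
    return n * n


def scale_dielectric_with_density(eps_ref, density_ratio):
    """Rescale a dielectric value for a change in density.

    Assumes the Lorentz-Lorenz relation: (eps - 1)/(eps + 2) is
    proportional to number density at fixed molecular polarizability.
    """
    f = density_ratio * (eps_ref - 1.0) / (eps_ref + 2.0)
    return (1.0 + 2.0 * f) / (1.0 - f)


# A refractive index near 1.41 corresponds to a dielectric value near 2.0.
print(f"n = 1.41 -> eps = {dielectric_from_refractive_index(1.41):.2f}")

# Density increases of 30% and 40% relative to the reference liquid.
for ratio in (1.3, 1.4):
    eps = scale_dielectric_with_density(2.0, ratio)
    print(f"density ratio {ratio:.1f}: eps = {eps:.2f}")
```

Under this assumption the rescaled values fall between roughly 2.4 and 2.6, which brackets the value of 2.5 mentioned above.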
Fig. 3. Electrostatic free energy of peptide ionization. The reaction of hydroxide ion forms
a peptide anion at one or another site along the protein backbone. The differential electro-
static free energy for these two species is given by the product of the charge and the
difference in electrostatic potential for the two sites ΔeV which, in turn, is proportional to
the ΔpK for these two amide nitrogens. Reprinted from ref. 22 with permission from the
American Chemical Society.
Not only do the continuum dielectric predictions of experimental
protein hydrogen exchange data provide a direct estimation of the
effective internal dielectric value, the quality of the individual resi-
due predictions provides an upper bound on how much the spatial
distribution of that dielectric shielding can deviate from uniformity.
The internal effective dielectric value primarily represents the vol-
ume polarizability of electronic shielding as averaged over the length
scale of the electrostatic interactions for the ionizing peptide. The
electrostatic potential around the amide nitrogen is sensitive to a
large set of significant nonbonded interactions that range in length
from van der Waals contact out to 14 Å or more (22, 58).
The assumption of a uniform internal effective dielectric value
proves to be a considerably more robust approximation in hydro-
gen exchange analysis than for the analogous protein side chain pK
predictions. It is well known that applying continuum dielectric
methods to protein side chain ionizations generally does not yield
predictions that are consistent with a well-determined uniform
internal dielectric value (59–64). Protein conformational reorganization
through reorientation of the various mobile charged side
chains is believed to provide the primary contribution to the
comparatively high dielectric shielding observed for the ionizable
side chains (65), although larger scale structural motion can play a
critical role, particularly for buried side chains (66).
If chemical induction effects do not differentially alter the
intrinsic acidity of the protein amides, the difference in electro-
static free energy for each such pair of amide anions will corre-
spond to the free energy of transferring an amide hydrogen from
one site to the other, which is proportional to the ΔpK between
those two amide nitrogens (Fig. 3) (22). The free energy of this
proton transfer is necessarily equivalent to the difference in free
energy of protonating each amide anion site, since in both cases
the identical neutral backbone protein structure is generated.
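The proportionality between a difference in electrostatic free energy and the corresponding ΔpK follows from ΔpK = ΔΔG/(RT ln 10). The sketch below converts either a free energy difference or a difference in electrostatic potential at the two amide nitrogens into pK units at 25°C. The sign convention (a positive free energy difference for ionizing one site relative to another means a higher pK for that site) and the function names are assumptions made for this example.

```python
import math

R_KCAL = 1.98720e-3        # gas constant, kcal/(mol*K)
K_B_EV = 8.617333e-5       # Boltzmann constant, eV/K (equals V/K per unit charge)


def delta_pk_from_free_energy(ddg_kcal, temp_k=298.15):
    """pK difference for a free energy difference given in kcal/mol."""
    return ddg_kcal / (R_KCAL * temp_k * math.log(10.0))


def delta_pk_from_potential(delta_v_volts, charge=1.0, temp_k=298.15):
    """pK difference for a potential difference (volts) acting on `charge` (in e)."""
    return charge * delta_v_volts / (K_B_EV * temp_k * math.log(10.0))


print(f"1 kcal/mol corresponds to {delta_pk_from_free_energy(1.0):.3f} pK units")
print(f"100 mV corresponds to {delta_pk_from_potential(0.100):.3f} pK units")
```

At 25°C one pK unit corresponds to about 1.36 kcal/mol, or equivalently to about 59 mV of potential difference acting on a unit charge.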
After setting all of the side chain and C-terminal carboxyls as
well as the side chain and N-terminal amines to the desired charge
state, the amide proton for each of the exchanging residues is
removed one at a time and the electrostatic free energy for each
species is calculated. One significant complication in the prediction
of protein side chain pK values is largely circumvented in the case
of hydrogen exchange. Since many of the ionizable side chains of a
protein have similar pK values, whether a given side chain is in the
neutral or charged state will alter the ionization behavior of these
other side chains so that the final population distribution of charge
states must generally be determined in an iterative fashion. Although
the ionization of individual backbone amides is obviously sensitive
to the charge distribution of the ionizable side chains, given peptide
anion concentrations of ~1 in 10¹⁰, the thermodynamics of the
side chain ionizations are insensitive to the backbone ionizations.
Most of the relevant hydrogen exchange data can be obtained in
the range of pH 7 to pH 10. Besides histidines, relatively few side
chains undergo ionization over this pH range in most small pro-
teins. In the case of side chain pK titrations in this pH range, the
pH dependence of the hydrogen exchange kinetics for nearby
amides can provide an effective means of determining the hydrox-
ide-catalyzed rate constants in both the neutral and charged state
of a given side chain (47).
When the protein conformational transition to the exchange-
competent state becomes rate-limiting for the hydrogen exchange
reaction (i.e., the EX1 condition), the kinetic acidity of an amide is
necessarily less than its thermodynamic acidity. The formally analo-
gous condition holds for the ionization of most carbon-bound
hydrogens. For example, the reaction rates of nitroalkanes with
hydroxide are more than 10¹⁰-fold slower than that predicted for a
normal Eigen acid (67). In this case, the charge delocalization that
provides resonance stabilization of the anion progresses more
slowly than does proton transfer (68), and heavy-atom intramo-
lecular reorganization is generally the rate-limiting process (69).
Establishing a clear physicochemical basis for interpreting protein
hydrogen exchange provides a means to characterize both the
structure and the population of protein heavy-atom reorganization
processes that facilitate solvent access for the structurally buried
amides.
4. Hydrogen Exchange Techniques

4.1. Magnetization Transfer Methods

Rapid hydrogen exchange can be monitored by magnetization
transfer techniques in which the water resonance is selectively
excited. The NMR experiment then monitors the transfer of this
magnetization to the amide resonances. A particularly robust
implementation of magnetization transfer-based hydrogen
exchange monitoring is that of CLEANEX-PM (70, 71), in which
most NOE/ROE and TOCSY-derived contributions to the
observed resonances are efficiently suppressed. We (72) have
introduced a modification to this experiment that compensates for
the effects arising from transverse relaxation, which limit the accu-
racy of the deduced exchange rates.
Elimination of NOE and ROE cross-relaxation effects in the
CLEANEX-PM sequence is based on their mutual cancelation in
the slow tumbling limit (73). Concerns over the applicability of the
CLEANEX-PM sequence to more mobile peptide groups recently
prompted Skrynnikov and colleagues to develop a modified
(HACACO)NH sequence to detect solvent exchange via the 15N
amide resonance (74). Although the lower power level used in
their SOLEXSY sequence yielded reduced sample heating in the
high ionic strength solutions, the exchange rates obtained for the
backbone amides of the denatured drkN SH3 domain were virtu-
ally indistinguishable from their CLEANEX-PM results. In the
present application, many of the potential complications to obtain-
ing accurate exchange rates were directly suppressed by the use of
perdeuterated protein samples which further served to enhance the
sensitivity of the CLEANEX-PM measurements by reducing 1H
transverse relaxation effects.
When the hydrogen exchange data of rubredoxin, summarized
in Fig. 1, were combined with those from three other model proteins
collected under analogous conditions, a total of 46 residues were
found for which a linear hydroxide-dependent rate constant can be
reliably fitted at two or more pH values with an overall uncertainty
of 0.053 in the log kOH− rate constants, at least an order of magnitude
more accurate than the current ability to predict these data by
continuum dielectric methods (20).
4.2. Solvent Exchange by 1H Exchange-In Protocol

In the case of ubiquitin, exchange kinetics for the amides that do
not exhibit exchange in the CLEANEX-PM experiments have been
analyzed using a 1H exchange-in protocol (20). By preexchange
of the amide hydrogen positions with deuterium and then dissolu-
tion of the protein sample in a 1H2O-containing buffer, one can
circumvent the complications from the isotope dependence of sol-
vent, buffer and protein side chain ionizations that plague quanti-
tative interpretation of exchange rates measured using the
conventional 2H exchange-in protocol. Furthermore, no correc-
tion is needed for the significant differences in protein stability that
can result from comparing measurements in normal and heavy
water buffer solutions (75).
When compared to the magnetization transfer-based hydro-
gen exchange measurements (47), the 1H exchange-in protocol
suffers mainly from the differential effect in breakage of an N–D or
an N–H amide bond. Measurements on poly d,l-alanine indicate a
0.08 shift in the log rate constant for this isotope effect in the
hydroxide-catalyzed exchange reaction (76). An additional benefit
of the 1H exchange-in protocol is that the final buffer conditions
correspond to those used to produce the earlier reported magnetization
transfer-based measurements (47). As a result, CLEANEX-PM
experiments (71, 72) can be carried out on the 1H exchange-in
sample so as to provide a precise calibration of the relative pH val-
ues between the two sets of measurements.
Based on extrapolation from unfolding measurements in
guanidinium chloride, transition to the EX1 kinetic condition, in
which protein unfolding limits the hydrogen exchange rate, does
not apply to the most slowly exchanging amides of ubiquitin in
normal buffer conditions for pH values less than 9.5 at 25°C (77).
The average rmsd fit to the $[1 - \exp(-k_{\mathrm{ex}}t)]$ dependence of the
amide 1H peak intensities in the ubiquitin 1H exchange-in experi-
ments was 1.2%. Only the amides of Thr 22 and Leu 50 provided
robust rate constants in both sets of experimental measurements,
yielding log kOH− values from the CLEANEX-PM and 1H exchange-
in experiments of 3.66 and 3.62 for Thr 22 as well as 3.79 and
3.67 for Leu 50, respectively.
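The following Python sketch illustrates how an exchange rate might be recovered from a 1H exchange-in build-up curve of the form I(t) = I∞[1 − exp(−k_ex t)] and then converted to a hydroxide-catalyzed rate constant. It rests on several assumptions that are not taken from the text: the plateau intensity I∞ is known, exchange is purely hydroxide catalyzed, the hydroxide concentration is 10^(pH − 14.0) with activity corrections ignored, and the data are synthetic. A log-linearized least-squares fit is used for brevity; a nonlinear fit would weight noisy data more appropriately.

```python
import math


def fit_k_ex(times, intensities, i_inf):
    """Estimate k_ex (s^-1) from a build-up curve.

    Fits -ln(1 - I/I_inf) = k_ex * t by least squares through the origin.
    """
    num = 0.0
    den = 0.0
    for t, i in zip(times, intensities):
        frac = i / i_inf
        if frac < 0.0 or frac >= 1.0:
            continue  # points at or beyond the plateau carry no rate information
        num += t * (-math.log(1.0 - frac))
        den += t * t
    return num / den


def k_oh_from_k_ex(k_ex, ph, pkw=14.0):
    """Second-order hydroxide rate constant, assuming [OH-] = 10**(pH - pKw)."""
    hydroxide = 10.0 ** (ph - pkw)
    return k_ex / hydroxide


# Synthetic, noise-free data with a hypothetical rate of 2e-4 s^-1.
k_true = 2.0e-4
times = [600.0 * n for n in range(1, 11)]            # seconds
data = [1.0 - math.exp(-k_true * t) for t in times]  # I/I_inf

k_fit = fit_k_ex(times, data, i_inf=1.0)
print(f"fitted k_ex = {k_fit:.3e} s^-1")
print(f"log k_OH at pH 8 = {math.log10(k_oh_from_k_ex(k_fit, ph=8.0)):.2f}")
```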
5. Continuum Dielectric Analysis for Exchange of Static Solvent Accessible Protein Amides

5.1. Electrostatic Parameter Set Dependence of Peptide Acidity Predictions

Hydroxide-catalyzed exchange rate constants were determined for
those amides of rubredoxin, FK506-binding protein (FKBP12),
ubiquitin and chymotrypsin inhibitor 2 (CI2) that are solvent-accessible
in the high-resolution X-ray structures (22, 47). The
acidities of these amides were calculated using the Poisson–Boltzmann
finite difference algorithm DelPhi (78) as a function of the nonpolarizable
electrostatic parameter set, the internal dielectric value
and the charge distribution of the peptide anion. As illustrated in
Fig. 4, the best performance was obtained using the CHARMM22
electrostatic atomic partial charge and radius parameters (79)
(these parameters are preserved in the current CHARMM27 force
field), an ab initio-derived peptide anion charge distribution (47),
and an internal dielectric value of 3. These parameters yielded an
rmsd value of 7 for the 56 amide exchange rate constants ranging
from $10^{0.67}$ to $10^{9.0}$ M⁻¹ s⁻¹. The optimal internal dielectric value
was obtained via its ($1/\varepsilon_{\mathrm{int}}$) scaling effect on the differences in electrostatic
potential for the various peptide anions predicted by the
Poisson–Boltzmann calculations and linear correlation against the
experimental hydrogen exchange rates.
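The scaling idea can be sketched as follows. If the predicted pK differences computed at a reference dielectric scale as 1/ε_int, and if the amides behave as normal Eigen acids with log kOH− falling by one unit per unit increase in pK, then the slope s of experimental log kOH− against the negative of the reference predictions gives ε_int = ε_ref/s. The Python sketch below uses synthetic values and hypothetical function names; the fitting procedure used in the cited work may differ in detail.

```python
def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def optimal_internal_dielectric(pk_pred_ref, log_k_exp, eps_ref):
    """Estimate eps_int from the slope of experiment against prediction.

    pk_pred_ref: pK values predicted at the reference dielectric eps_ref.
    log_k_exp:   experimental log10 rate constants for the same amides.
    """
    slope = regression_slope([-pk for pk in pk_pred_ref], log_k_exp)
    return eps_ref / slope


# Synthetic data constructed so that the true internal dielectric is 3
# when the reference calculation uses a dielectric of 1.
pk_ref = [30.0, 36.0, 42.0, 48.0]
log_k = [10.0, 8.0, 6.0, 4.0]
print(f"estimated eps_int = {optimal_internal_dielectric(pk_ref, log_k, 1.0):.2f}")
```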
The OPLS-AA electrostatic parameter set (80) yielded comparably
robust predictions, as might be expected from its strong similarity
to the CHARMM atomic charge and radii set. By contrast,
the nonpolarizable AMBER parm99 (81) and AMBER ff03 (82)
parameter sets performed more poorly. As illustrated in Fig. 5, the
parm99 electrostatic parameters from the AMBER force field do
not reliably predict the experimental hydrogen exchange data.
Many of the outlying predicted values arise from peptide groups
that have neighboring charged side chains. In contrast to the
CHARMM22 and OPLS-AA parameter sets considered above, the
parm99 set does not assume that the charges on the O, C, N, and
H of the peptide group are common to every residue type. For the
lysine and arginine side chains, 11.5% of the formal charge of the
side chain is assumed to reside on the backbone carbonyl group,
and an additional 7.1% of the formal charge resides on the backbone
nitrogen and amide hydrogen. Similarly, for the aspartate and
glutamate residues, the parm99 set assumes that, relative to the
atomic charges of the neutral amino acid types, the backbone carbonyl
group bears 7.5% of the side chain formal negative charge,
while together the backbone nitrogen and amide hydrogen bear an
additional 7.9%. The partial charges of the AMBER force field are
derived by fitting to the distribution of the ab initio-derived electrostatic
potentials surrounding each atom (83). However, in the
present context, the projection of these charges onto each nucleus
operationally corresponds to a modeling of chemical induction
effects. The ability to predict amide exchange in both proteins and
peptides much more accurately with the electrostatic parameter
sets that do not incorporate large shifts in the atomic charges of the
backbone atoms indicates that the magnitude of charge migration
within the individual amino acids that is modeled into the AMBER
parm99 electrostatic parameter set is well beyond what
might be needed to rationalize local sequence-dependent variations.
These considerations apply even more markedly to calculations
using the AMBER ff03 electrostatic parameters (47).

Fig. 4. Dependence of amide acidity predictions on the atomic charge distribution of the
peptide anion. Protein amide pK values predicted using CHARMM22 atomic charge and
radius parameters [79] and an internal dielectric constant of three at 25°C, with the
excess anion charge density distributed throughout the peptide unit as predicted from
B3LYP DFT calculations [47]. Reprinted from ref. 47 with permission from the American
Chemical Society.

Fig. 5. Correlation of hydroxide-catalyzed hydrogen exchange rate constants with
Poisson–Boltzmann-derived pK values, using AMBER parm99 electrostatic parameters
[81] and an internal dielectric constant of 3 at 25°C. The open symbols represent amides
for which either of the two neighboring side chains is ionized. Only amides surrounded by
neutral side chains were used in the scaling of the predicted pK values. Reprinted from
ref. 47 with permission from the American Chemical Society.
5.2. Atomic Charge Distribution in the Peptide Anion

Our initial rubredoxin hydrogen exchange predictions assumed
that the excess negative charge of the peptide anion resides exclusively
on the nitrogen, following the earlier results from continuum
dielectric modeling of hydrogen exchange in simple peptides by
McCammon and colleagues (41). That assumption conflicts with
the long-standing tradition of representing the product formed by
deprotonation of an amide as an imidate anion. However, in contrast
to predictions from early valence bond theory studies, there
are no experimental results or high level quantum mechanical
calculations to support the interpretation that a dominant fraction
of the excess negative charge for a secondary alkyl amide anion
shifts to the oxygen atom (21). Recently, we reported (47) B3LYP
(84) density functional theory calculations at the aug-cc-pVTZ
basis set level on the neutral and anionic states of N-methylacetamide.
These calculations predicted an electron charge distribution for the
peptide anion that assigns a threefold higher excess charge density
for the nitrogen atom than for the oxygen atom. The DFT-derived
peptide anion charge distribution provided somewhat better protein
hydrogen exchange predictions than did an assignment of the
excess charge to the nitrogen (47). By contrast, assignment of the
excess negative charge to the oxygen, so as to generate the imidate
form, yielded markedly poorer predictions of hydrogen exchange
for the four model proteins (Fig. 6) than those obtained from
either nitrogen-centered or ab initio-derived peptide anion charge
distributions (47).

Fig. 6. Dependence of amide acidity predictions on the atomic charge distribution of the
peptide anion. Protein amide pK values predicted using CHARMM22 electrostatic parameters
and an internal dielectric constant of 3 at 25°C, with the excess anion charge density
localized to the carbonyl oxygen. Reprinted from ref. 47 with permission from the
American Chemical Society.
5.3. Dominant Acidic Conformer Analysis

Although these continuum dielectric calculations were based on
high resolution X-ray structures, a discrete set of adjustments was
applied to several side chain types during the calculation of the
intraresidue peptide acidity. The most significant case involved
aspartate side chains in which the χ1 side chain torsion angle is
gauche to the backbone nitrogen. This orientation places the neg-
atively charged carboxylate near the intraresidue amide, thus
strongly suppressing its predicted ionization. The sterically unhin-
dered rotation of an Asp carboxylate to a trans rotamer can
enhance the acidity of the intraresidue amide by 5 pH units or
more (22, 35, 47).
A second type of systematic modulation in the predicted elec-
trostatic free energy of the peptide anions as a function of the resi-
due side chain conformation was applied when the χ1 side chain
torsion angle is near +60°. In this rotamer, the Cγ is gauche to both
the main chain nitrogen and carbonyl carbon and is often in van
der Waals contact with the amide hydrogen. Unhindered rotation
to another χ1 rotamer tends to increase the solvation of the amide
anion with a resultant increase in the peptide acidity of that residue.
Regarding the physical validity of these ad hoc side chain rota-
tions for identifying energetically favorable conformations near the
X-ray coordinates that have enhanced peptide acidities, it should
be noted that the model conformation (molecule 92) within the
independently generated 2NR2 ubiquitin ensemble (further dis-
cussed below) that most accurately predicts the experimental
hydrogen exchange has undergone the Asp and gauche+ side chain
rotamer transitions that were identified by this earlier published
side chain reorientation protocol (33).
The assumption of limited protein conformational reorganization
during the lifetime of the peptide anion surely cannot apply
generally to the side chain hydroxyl hydrogens, since the analo-
gous reorientation of the hydrogens on water molecules gives rise
to the dominant dielectric shielding of that phase. Particularly for
side chain hydroxyl hydrogens that are not involved in an intramo-
lecular hydrogen bond, continuum dielectric calculations based on
a fixed orientation are potentially misleading. This is most notably
the case when amide acidity is estimated with an intraresidue serine
or threonine hydroxyl in either a gauche+ or gauche− χ1 rotamer.
Given that the exchange rates are similar for serine- and threonine-
containing model peptides, as compared to the alanine reference
(23), the side chain hydroxyl does not generally serve as a catalyst
for peptide hydrogen exchange. Consistent with that observation,
the peptide acidity analyses for serine and threonine residues with
a gauche χ1 rotamer assume that the dielectric shielding of the side
chain hydroxyl is equal to that of the equivalent volume of water.
In such cases, the serine side chain is computationally truncated to
alanine, and threonine is transformed into α-aminobutyrate.
6. Ensemble Averaging in Prediction of Ubiquitin Hydrogen Exchange

6.1. Population Averaging of Conformer Acidities

All model ensembles that are justified by their collective ability to
predict experimental measurements necessarily invoke the assumption
that they represent an accurate Boltzmann sampling of conformational
space. Yet not all experimental approaches offer
comparable sensitivity to variations in the model distribution. In
contrast to experimental techniques that are equally sensitive to
every protein conformation and thus are generally dominated by
the most populated states, hydrogen exchange reactivity is highly
sensitive to conformation. The fact that structurally buried amides
are effectively unreactive to hydrogen exchange forms the basis for
the widespread application of this experimental technique for
monitoring rare conformational states. On the one hand, the highly
exposed amides exhibit exchange rates that are quite sensitive to
the well-populated protein conformations. On the other hand, this
sensitivity to conformation applies to the rarely exposed amides as
well so that hydrogen exchange measurements for these sites pro-
vide a powerful experimental monitor of both the population and
conformation of the transient exchange-competent state.
Although population averaging of the conformer acidities
(ΣKi) is more formally correct (85), population averaging of the
conformer pKi values (ΣpKi or, equivalently, averaging of the con-
former electrostatic potential values) has occasionally been used to
estimate the effect of conformer sampling in the prediction of pro-
tein side chain ionization. Karplus and colleagues (86) have con-
cluded that, in assessment of the ionization midpoint for each
titrating residue, averaging over the Ki values or averaging over the
log Ki values usually has little effect on the predicted pK values.
The issue of Ki vs. pKi averaging is markedly different in the
estimation of hydroxide-catalyzed hydrogen exchange rates near
neutral pH where, in most cases, less than 1 out of every 10¹⁰ molecules
will have a given amide in the ionized state. As a result,
whenever there is a substantial range in conformer acidities, the
most acidic conformers can make the dominant contribution to
the observed hydrogen exchange rate, even if they constitute only
a modest fraction of the overall conformer population.
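The difference between the two averaging schemes is easy to see numerically. The Python sketch below compares the effective pK obtained by population averaging of the conformer acidities (ΣKi) with the population average of the conformer pKi values, for a hypothetical ensemble in which nine of ten conformers have a pK of 24 and one has a pK of 20. The function names and pK values are illustrative only.

```python
import math


def _normalize(weights, n):
    """Return populations that sum to one (equal populations if none given)."""
    if weights is None:
        return [1.0 / n] * n
    total = sum(weights)
    return [w / total for w in weights]


def pk_from_averaged_k(pks, weights=None):
    """Effective pK from population averaging of the conformer acidities Ki."""
    w = _normalize(weights, len(pks))
    mean_k = sum(wi * 10.0 ** (-pk) for wi, pk in zip(w, pks))
    return -math.log10(mean_k)


def averaged_pk(pks, weights=None):
    """Population average of the conformer pKi values."""
    w = _normalize(weights, len(pks))
    return sum(wi * pk for wi, pk in zip(w, pks))


# Hypothetical ensemble: one conformer in ten is 10^4-fold more acidic.
conformer_pks = [24.0] * 9 + [20.0]
print(f"pK from averaged Ki : {pk_from_averaged_k(conformer_pks):.2f}")
print(f"averaged pKi        : {averaged_pk(conformer_pks):.2f}")
```

In this example the single acidic conformer accounts for nearly all of the averaged acidity, so the Ki-averaged pK lies close to 21 while the averaged pKi is 23.6, a difference of more than two orders of magnitude in the predicted exchange rate.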
Molecular simulation techniques have been increasingly
employed to predict the Boltzmann-weighted conformational dis-
tribution of the protein native state. In principle, under the assump-
tion of ergodicity, an unconstrained constant temperature molecular
dynamics simulation can provide the Boltzmann conformational
distribution. In practice, given the roughness of protein energy
landscapes, even simulations extending for hundreds of nanosec-
onds will generally suffer from incomplete conformational sam-
pling. Furthermore, force field parameterizations are only
approximate. As a result, the predicted conformational distribution
can drift away from the physical values.
6.2. Consistency of Ubiquitin Model Ensembles with the Native State Conformational Distribution

Concerns arising from approximate force fields and incomplete
sampling have been approached by incorporating experimentally
derived restraints into molecular dynamics simulations, as applied
to ubiquitin. The MUMO algorithm of Vendruscolo and colleagues
(31) introduced NOE-derived distance bound restraints,
averaged over subsets of protein conformations, as a mechanism
for maintaining the predicted molecular dynamics ensemble distribution
to within the neighborhood of the experimentally determined
structure. In parallel, order parameters S², derived from
backbone 15N and side chain 13C methyl NMR relaxation measure-
ments, were incorporated into the restrained molecular simulation
where they enforce enhanced conformational sampling. The resul-
tant set of 144 protein conformations (PDB code 2NR2 (31))
serves as a model for the random sampling of the native state
Boltzmann distribution of ubiquitin.
In generating an alternate model ensemble (PDB code 2K39),
de Groot and colleagues (32) applied the CONCOORD algorithm
(87) using the same 2,727 NOE constraints from the 1D3Z solu-
tion structure analysis (88) to generate 1,000 model conforma-
tions of ubiquitin. In the EROS (ensemble refinement with
orientational restraints) protocol a subset of 400 conformations
were initially selected as most consistent with the residual dipolar
coupling (RDC) data. An iterative process of simulated annealing
followed by reselection against the RDC data was then applied
until the initial set of 1,000 conformations was winnowed down to
a final set of 116 conformations.
When ensemble averaging of hydrogen exchange reactivity was
applied to the NOE, S²-restrained 2NR2 ubiquitin ensemble (31),
the hydroxide-catalyzed exchange rates for nearly all of the highly
exposed amide hydrogens (solvent-accessible in >50% of confor-
mations) were quite accurately predicted (black circles in Fig. 7)
(20, 33). For 16 of these highly exposed amides (Gly 47 and Asp
52 are discussed below), the 2NR2 ensemble predicted the 10⁵-fold
range in experimental rates, yielding an rmsd of 0.51 and a correla-
tion coefficient r = 0.94 for the log kOH− values. This correlation is
markedly better than that obtained using a single crystallographi-
cally derived ubiquitin structure (47).
Most strikingly, for the backbone amides that are exposed to
solvent above 0.5 Å² in more than one but less than half of the
models in the NMR relaxation-restrained ensemble, with the
exception of Lys 48, the amide pKa predictions are nearly as accu-
rate (rmsd for log kOH− of 0.69) as those for the more highly
exposed sites (Fig. 7). Despite being structurally buried by most
conventional criteria, the exchange rate constants for these 12 resi-
dues, spanning nearly a million-fold range, are predictable to within
a factor of 5.
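Because the rmsd is computed on log kOH− values, it converts directly into a typical multiplicative error in the rate constant. The short Python sketch below shows the conversion for the two rmsd values quoted above; the helper names are hypothetical.

```python
import math


def rmsd(predicted, observed):
    """Root-mean-square deviation between two equal-length sequences."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def fold_error(rmsd_log10):
    """Multiplicative error factor corresponding to an rmsd in log10 units."""
    return 10.0 ** rmsd_log10


for value in (0.51, 0.69):
    print(f"rmsd of {value:.2f} in log k -> error factor of {fold_error(value):.1f}")
```

An rmsd of 0.69 in log kOH− corresponds to a factor of about 4.9, which is the basis for the statement that these rate constants are predictable to within a factor of 5.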
[Plot: log kOH− (PB) versus log kOH− (M⁻¹ s⁻¹); labeled points G47, K48, and D52]
Fig. 7. Hydroxide-catalyzed rate constants predicted from the NMR relaxation-restrained
2NR2 ensemble of ubiquitin. For residues in which the amide hydrogen is exposed to
solvent by more than 0.5 Å² in at least one ensemble model, conformer acidities were
predicted for all solvent-exposed amides. Each residue is distinguished according to
whether the amide hydrogen is exposed to solvent by more than 0.5 Å² in at least 50% of
the models (filled circle) or exposed in only a single model (filled square). The other transiently
exposed amides are denoted as (filled diamond). Residues Gly 47, Lys 48, and Asp
52 are denoted with open symbols. Reprinted from ref. 20 with permission from the
American Chemical Society.

The underestimation of the hydrogen exchange rates for residues
in which only one model conformation has an amide hydrogen
accessibility above 0.5 Å² (filled square in Fig. 7) is consistent with
that expected from the statistics of undersampling. As a result of
the ΣKi averaging of conformer acidities discussed above, a sufficient
number of models must be sampled not only to establish the
relative fraction of solvent-exposed conformations but also to
approximate the distribution of conformer acidities within the
solvent-exposed conformations.
Indeed, many of the residues in the 2NR2 ensemble exhibit a range
of conformer acidity values (10³–10⁶) comparable to those of the
four residues in the conformationally disordered C-terminus,
despite remaining well-ordered as indicated by the Cα rmsd of
0.56 Å for residues 1–72 in this ensemble (Fig. 8).
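The statistics of undersampling can be illustrated with a simple model. The Python sketch below makes two assumptions that are not taken from the text: the ensemble models are treated as independent draws from the Boltzmann distribution, and the conformer pK values within the solvent-exposed state are taken to be uniformly distributed over a given span. Under those assumptions it gives the probability of finding a given number of exposed models in the ensemble and the probability that a single exposed model underestimates the ΣKi-averaged acidity of the exposed state.

```python
import math


def prob_exposed_count(k, n_models, p_exposed):
    """Binomial probability of exactly k exposed models among n_models draws."""
    return (math.comb(n_models, k)
            * p_exposed ** k * (1.0 - p_exposed) ** (n_models - k))


def mean_k_relative(span):
    """Mean Ki relative to the most acidic conformer.

    Assumes conformer pK values are uniform over `span` pK units.
    """
    return (1.0 - 10.0 ** (-span)) / (span * math.log(10.0))


def prob_single_draw_underestimates(span):
    """Chance that one sampled conformer is less acidic than the mean Ki."""
    pk_offset_of_mean = -math.log10(mean_k_relative(span))
    return (span - pk_offset_of_mean) / span


# Hypothetical exposed-state population of 1% in a 144-model ensemble.
for k in (0, 1, 2):
    print(f"P({k} exposed models) = {prob_exposed_count(k, 144, 0.01):.3f}")

# Conformer acidities spanning 3 and 6 orders of magnitude.
for span in (3.0, 6.0):
    print(f"span of {span:.0f} pK units: P(underestimate) = "
          f"{prob_single_draw_underestimates(span):.2f}")
```

Under these assumptions a single exposed model underestimates the averaged acidity more often than not, and the bias grows with the range of conformer acidities, since the ΣKi average is weighted toward the most acidic conformers.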
Regarding the overestimated peptide acidities for Gly 47 and
Lys 48 predicted from the 2NR2 ensemble, a more marked over-
estimation is also obtained from the 2K39 ensemble discussed
below that spans the segment Ile 44 to Lys 48, which constitutes a
major portion of the recognition site for enzymes involved in for-
mation of Lys 48-linked poly ubiquitylation signals for proteasomal
targeting. In the 144 structures of the 2NR2 ensemble, not a single
Asp 52 side chain is predicted to occupy a trans rotamer and all
except 8 of these side chain conformations have the carboxylate
bound in a salt bridge to the Lys 27 side chain so that no sampling
of the higher acidity conformations is included.

[Plot: pKPB versus residue number]
Fig. 8. The range of conformer acidities for amide hydrogens exposed to solvent by at least 0.5 Å² in the NMR relaxation-
restrained 2NR2 ensemble of ubiquitin. Residues for which the amide hydrogen is solvent-exposed in less than 50% of
the ensemble models are indicated in gray, while those that are solvent-accessible in more than 50% of the models
are marked in black. The peptide acidities are placed on an absolute scale based on their normal Eigen acid behavior, the
diffusion-limited rate for hydroxide-catalyzed exchange of 2 × 10¹⁰ M⁻¹ s⁻¹, and the pK of 15.7 for water at 25°C. These
properties imply an exchange rate constant of 1.0 M⁻¹ s⁻¹ for an amide with a pK value of 26.0 [47]. Reprinted from ref. 20
with permission from the American Chemical Society.
Although the NOE-restrained, RDC-selected 2K39 ensemble
yields predictions for the highly exposed amides that are nearly as
robust as those from the 2NR2 ensemble (Fig. 9), the 2K39
ensemble provides substantially overestimated exchange rates for a
number of the more weakly exposed amide sites (20, 33). All eight
residues for which only a single ensemble model has an amide
hydrogen accessibility above 0.5 Å² have predicted exchange rates
that exceed the experimental results. As indicated in the discussion
above, it is highly unlikely that all of these overestimates arise as the
result of undersampling.
6.3. Comparison Against the Set of Known Ubiquitin-Protein Complexes
Of particular significance is the pattern of exchange rates seen for
the proteasome targeting interaction site around residue Lys 48.
In explicit contrast to the 2NR2 ubiquitin ensemble of Vendruscolo
and colleagues (31), de Groot and colleagues (32) contended that
their 2K39 ensemble spans a conformational space that includes
the ubiquitin structures found in all of the available X-ray studies
of ubiquitin-protein complexes (41 complexed-ubiquitin molecules
+ 5 X-ray structures of uncomplexed ubiquitin). These authors
further claimed that conformations of ubiquitin found in these
protein complexes are well represented in their 2K39 conformational
ensemble of free ubiquitin, providing what has been widely
regarded to be a compelling demonstration of the conformational
selection mechanism of protein-protein recognition (89–91).

[Figure 9 plot: log kOH− (PB) (y-axis) versus log kOH− (M⁻¹ s⁻¹) (x-axis), both axes spanning 0 to 10; labeled points: I44, F45, G47, K48, D52.]
Fig. 9. Hydroxide-catalyzed rate constants predicted from the NMR residual dipolar coupling-restrained 2K39 ensemble of ubiquitin. For residues in which the amide hydrogen is exposed to solvent by more than 0.5 Å² in at least one ensemble model, conformer acidities were predicted for all solvent-exposed amides. Each residue is distinguished according to whether the amide hydrogen is exposed to solvent by more than 0.5 Å² in at least 50% of the models (filled circle) or exposed in only a single model (filled square). The other transiently exposed amides are denoted as (filled diamond). Asp 52 and the residues involved in the primary interaction site for proteasome targeting are individually identified. Reprinted from ref. 20 with permission from the American Chemical Society.
However, the log kOH− values for Ile 44, Phe 45, Gly 47, and
Lys 48 predicted from the 2K39 ensemble exceed the experimental
results by 2.4, 3.2, 1.7, and 3.1, respectively. An indication that
anomalous sampling statistics do not explain these discrepancies is
that six of the seven ensemble models in which the Lys 48 amide
hydrogen is exposed to solvent by more than 0.5 Å² each predict a
conformer hydrogen exchange rate that is more than 4,000-fold above
the experimentally observed value. As a result, even after
normalization to the 116 models in the ensemble, the predicted
amide acidity is significantly above the experimental value.
Furthermore, the experimental S² order parameter values of
0.838, 0.872, 0.840, 0.821, and 0.843 for the N–H bond vectors
of residues Ile 44 to Lys 48 (92) indicate that any substantial
internal motion in this segment must be very weakly populated on
the ps-ns timescale.
This discrepancy between the claims for conformational sampling
in the 2K39 ensemble study and the prediction of hydrogen
exchange reactivity from this ensemble prompted a reexamination
of the conformational distribution of the 2K39 ensemble. One line
of evidence that these authors (32) provided to indicate that the
NOE-restrained, RDC-selected 2K39 ensemble spans the confor-
mational space of the ubiquitin-protein complexes is that each of
the X-ray structures is within a backbone rmsd of 0.8 Å from at
least 1 of the 116 members of the ensemble (for the Cα atoms of
residues 1–70). Yet when the analogous calculation was carried out
on the 2NR2 ensemble, a maximum backbone rmsd value of 0.7 Å
was obtained between each of the 46 X-ray structures and the near-
est member of the 2NR2 ensemble (33).
More significantly, for 36 of the 46 ubiquitin X-ray structures,
each of the 116 backbone conformations in the 2K39 ensemble is
farther from that X-ray structure than is the 1D3Z solution struc-
ture model (88) from which the 2K39 ensemble was initiated. In
comparison, for 41 of the 46 ubiquitin X-ray structures, the 2NR2
ensemble contains a backbone conformation that is closer to the
X-ray structure than is any member of the 2K39 ensemble. Overall,
the 2K39 ensemble clearly represents a drifting away from the con-
formational space spanned by the crystal structures of ubiquitin-
protein complexes.
6.4. Hydrogen Exchange Analysis as a Monitor for Completeness of Ensemble Sampling
Ideally, these two ubiquitin ensembles represent ~10² random
samplings of the Boltzmann conformational distribution, so that
amides that become exposed to solvent at less than a 1% frequency
will generally be unrepresented in these peptide acidity predictions.
As indicated in Fig. 2, for nearly every case in which an amide
hydrogen is exposed to solvent in at least one conformation from either
the 2NR2 or 2K39 ensembles, the experimental exchange rate is
less than what would be predicted for a model peptide having the
same fraction of solvent-exposed conformations (only for Thr 9 is
the apparent solvent accessibility estimated from peptide normaliza-
tion significantly above that from both of the ensembles).
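The 1% threshold invoked above follows from simple sampling statistics. As a minimal sketch, assuming that each ensemble model is an independent draw from the Boltzmann distribution (an idealization), the probability that an amide exposed in a fraction p of conformations appears exposed in at least one of N models is 1 − (1 − p)^N. The helper name `prob_sampled` is hypothetical; the ensemble sizes 116 and 144 are those quoted in the text for 2K39 and 2NR2.

```python
def prob_sampled(exposed_fraction: float, n_models: int) -> float:
    """Probability that at least one of n_models independent draws is exposed."""
    if not 0.0 <= exposed_fraction <= 1.0:
        raise ValueError("exposed_fraction must lie in [0, 1]")
    if n_models < 0:
        raise ValueError("n_models must be non-negative")
    return 1.0 - (1.0 - exposed_fraction) ** n_models


if __name__ == "__main__":
    for n_models in (116, 144):
        for fraction in (0.001, 0.01, 0.05):
            p = prob_sampled(fraction, n_models)
            print(f"N = {n_models:3d}, exposed fraction = {fraction:5.3f}: "
                  f"P(at least one exposed model) = {p:.2f}")
```

Under this idealization, an amide exposed well below the 1% level is unlikely to be represented in an ensemble of roughly 10² models, while one exposed at several percent is very likely to appear.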
The log exchange rate constants for most model peptides are
>8 (23). Hence, one may anticipate that for a proper 1% Boltzmann
sampling of the conformational distribution nearly all backbone
amides having log kOH− values >6 should have solvent-accessible
conformations within that 1% sampling. Indeed, each of the 23
ubiquitin amides that have experimental log kOH− values >6 is
exposed to solvent in at least one conformation in both the 2NR2
and 2K39 ensembles (20). Conversely, some amides of ubiquitin that
are exposed to solvent in these two ensembles have predicted and
observed log exchange rate constants that are significantly less
than 6, reflecting the fact that their exchange-competent
conformations have strongly depressed exchange reactivities.
Nevertheless, a number of backbone amides that are solvent
inaccessible in the X-ray structures of ubiquitin have conformers
within these two model ensembles with amide acidities that are
similar to those of simple peptides (Fig. 8) (20). For such a residue,
if its log kOH− value is below 6, then its amide hydrogen could be
expected to remain solvent inaccessible in most Boltzmann
samplings at a 1% level.
There are 12 amides in ubiquitin that have log exchange rate
constants between 5 and 6. For 8 of these 12 residues, the amide
hydrogen is solvent inaccessible in every conformation of the 2NR2
ensemble, despite the fact that all 23 amides with log kOH− values
>6 are solvent accessible. By contrast, only 3 of the 12 residues
with log exchange rate constants between 5 and 6 are solvent-
inaccessible in every 2K39 conformation, consistent with an overly
expanded sampling of conformational space for that ensemble.
This analysis supports the expectation that the fraction of solvent-
accessible conformations as a function of the log kOH− values can
provide a useful monitor of the degree of completeness with which
a given ensemble has sampled the energetically accessible confor-
mational space.
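The monitor proposed above can be sketched as a simple tabulation: for each interval of experimental log kOH−, count the fraction of amides that are solvent accessible in at least one ensemble model. The function name and data layout below are hypothetical. The demonstration uses only the counts stated in the text (12 amides between 5 and 6, 23 amides above 6); the individual log kOH− values are placeholders set to nominal bin positions, not measured data.

```python
from collections import defaultdict
from typing import Iterable


def accessible_fraction_by_bin(
    amides: Iterable[tuple[float, bool]], bin_width: float = 1.0
) -> dict[float, float]:
    """Map each log kOH- bin (lower edge) to the fraction of exposed amides.

    Each input item is (experimental log kOH-, True if the amide hydrogen is
    solvent accessible in at least one model of the ensemble).
    """
    counts = defaultdict(lambda: [0, 0])  # lower edge -> [exposed, total]
    for log_k, exposed in amides:
        lower_edge = (log_k // bin_width) * bin_width
        counts[lower_edge][0] += int(exposed)
        counts[lower_edge][1] += 1
    return {edge: n_exp / n_tot for edge, (n_exp, n_tot) in sorted(counts.items())}


if __name__ == "__main__":
    # Counts from the text; the log kOH- values themselves are placeholders.
    ensemble_2nr2 = [(5.5, True)] * 4 + [(5.5, False)] * 8 + [(6.5, True)] * 23
    ensemble_2k39 = [(5.5, True)] * 9 + [(5.5, False)] * 3 + [(6.5, True)] * 23
    for name, data in (("2NR2", ensemble_2nr2), ("2K39", ensemble_2k39)):
        for edge, fraction in accessible_fraction_by_bin(data).items():
            print(f"{name}: log kOH- from {edge:.0f}: accessible fraction = {fraction:.2f}")
```

A sharp drop in the accessible fraction just below the log kOH− value expected for the sampling depth is the behavior the text associates with appropriately limited sampling, whereas a persistently high fraction suggests an overly expanded ensemble.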
7. Continuum Dielectric Analysis of Hydrogen Exchange in Model Peptides
7.1. Backbone Conformation Dependence of Side Chain Correction Factors for Hydrogen Exchange
Predicting the experimental side-chain-dependent differential
exchange rates for conformationally unstructured peptides presents
a rather stringent challenge. Although early studies argued
that the differences in exchange rates among various simple model
peptides arise from chemical induction effects (44, 93), more
recent electrostatic calculations have indicated that amide hydrogen
exchange rates are strongly dependent upon the relative orientation
of the adjacent peptide groups (41, 94).
Experimentally measured hydroxide-catalyzed amide exchange
rates for conformationally unstructured alanine peptides are
essentially unaffected by the intraresidue substitution of a methionine
side chain, while substitution of a phenylalanine, tyrosine, or tryp-
tophan side chain decreases the exchange rate by approximately
twofold (23). A significantly larger (~fivefold) attenuation of the
exchange rates results from substituting any of the branched side
chains from valine, leucine, or isoleucine. A similar pattern is
obtained when these side chain substitutions are introduced into
the residue preceding the site of amide exchange, although the
magnitude of the variations in exchange rates is approximately four-
fold smaller. As a result of these fairly small effects, a compelling
correlation between predicted and observed side-chain-dependent
hydrogen exchange rate differences for conformationally unstruc-
tured peptides requires substantially more accurate predictions than
have yet been demonstrated in protein studies.
Applying continuum dielectric methods with an internal dielectric
value of 4, Fogolari et al. (41) reported a substantial dependence
of peptide acidity on the conformation of the backbone. Using a
70 to 30% weighted average from a pair of calculations with an
extended backbone and an α-helical backbone conformation,
respectively, they found that the log correction factors for the side
chains preceding the exchanging amide could be predicted with an
rmsd of 0.17 for the set of experimental hydroxide-catalyzed
exchange rate constant values. For the larger range of differential
log exchange rates that arise from altering the intraresidue side
chain, those authors obtained an appreciably worse fit with an rmsd
of 0.38, comparable to the range of experimental log rate values.
Avbelj and Baldwin (94) presented a Poisson–Boltzmann anal-
ysis of the steric contributions to hydrogen exchange that demon-
strates the dependence of the predicted side chain correction
factors on the assumed backbone conformation. For the various
side chain types, these authors predicted differences in electrostatic
free energies, relative to the alanine reference, that were roughly
twice as large when the residue bearing the exchanging amide was
placed in a polyproline II conformation as compared to when it
was placed in an extended conformation. In both of these earlier
studies of the side-chain-dependent exchange rates, the problem
was simplified by comparing between residue types assuming the
same backbone conformational distribution. As a result, insight
into the impact of the different side chain types upon the confor-
mational distribution of the backbone is lost.
As compared to the conformational complexity of the protein
native state, it might be anticipated that accurate prediction of the
Boltzmann-weighted conformational distribution for simple model
peptides should be relatively straightforward. In fact, even the basic
question of the relative fraction of extended vs. α conformational
populations in model peptides remains an actively debated issue
(95–97). Current implementations of classical molecular dynamics
simulations (98, 99), as well as density functional theory-based
modeling (100), continue to yield disparate predictions for the
backbone conformational distributions of simple model peptides.
7.2. Dependence of Peptide Acidity on Backbone Conformation
Poisson–Boltzmann analysis was carried out utilizing the Protein
Coil Library of Rose and colleagues (101) as a model for the
Boltzmann-weighted distribution of the unstructured state. In this
structural library, protein segments lying outside of regular
secondary structures were identified from high resolution X-ray analysis.
Generally implicit in the application of these coil libraries is the
assumption that the other forms of long range interactions that are
present in the protein crystal structure do not systematically shift the
average conformational distribution of the individual residue types
away from that of conformationally disordered polypeptides.
[Figure 10 plot: histogram for N-acetyl-Ala-Ala-N-methylamide showing the number of coil library entries (y-axis, 0 to 100) versus ΔpK (NMA) (x-axis, −3.0 to 2.0).]
Fig. 10. Peptide acidities of Ala–Ala conformers. The electrostatic potential was calculated
for the central peptide anion (asterisks) formed from blocked peptides derived from the
679 Ala–Ala segments found in the Protein Coil Library [101], utilizing the CHARMM22
atomic charge and radius parameters [79] and an internal dielectric value of 3. The
N-methylacetamide anion was used as reference for the electrostatic potential calcula-
tions. Reprinted from ref. 34 with permission from Elsevier Limited.
The 17,422 protein segments in this coil library, taken from
X-ray structures having at least 1.6 Å resolution and R values of
0.25 or better, were divided into dipeptide segments. From the
atomic coordinates of these segments, model N-acetyl-[X-Ala]-N-
methylamides and N-acetyl-[Ala-Y]-N-methylamides were con-
structed and then used to predict the experimental side chain
correction factors for hydrogen exchange (23), while the other
N-acetyl-[X–Y]-N-methylamides were subsequently used to assess
whether anticipated deviations from additivity for these correction
factors (95, 102) are observed.
When the CHARMM22 electrostatic parameters were applied
to the coil library-derived Ala-Ala peptides with the internal dielec-
tric set to 3, the conformer acidities were found to span a range of
6 pH units (Fig. 10). A similar range of conformer acidities was
predicted for the other coil library peptides considered here.
Hence, within the range of conformations observed in native pro-
tein structures, the local backbone conformation of the adjacent
peptide groups is predicted to give rise to a million-fold range in
hydroxide-catalyzed amide hydrogen exchange rates.
As illustrated in Fig. 11, nearly all of the Ala-Ala peptides predicted
to be highly acidic have the N-terminal residue in either a β (center
at φ ≈ −130°, ψ ≈ 125°) or a polyproline II (center at φ ≈ −80°,
ψ ≈ 145°) conformation and the C-terminal residue in the α
conformation. The reverse pattern holds true for the least acidic
peptide conformers. It is the residue with a conformation near the
α-helix basin of the Ramachandran map that dominates this behavior.
When the C-terminal residue is in an α conformation, the positive
end of this peptide dipole points toward the ionizing nitrogen,
thus stabilizing the anionic intermediate. Similarly, the negative
end of that peptide dipole points toward the ionizing nitrogen
when the N-terminal residue is in an α conformation.

[Figure 11 plot: Ramachandran map with phi (x-axis, −180° to 0°) and psi (y-axis, −180° to 180°).]
Fig. 11. The backbone conformational distribution of the most acidic and least acidic N-acetyl-Ala-Ala-N-methylamide conformers in the Protein Coil Library. The (φ,ψ) torsion angle values for the 50 most acidic peptides are plotted in gray, while the values for the 30 least acidic peptides are plotted in black. The N-terminal residues are denoted by circles and the C-terminal residues by triangles. Dotted lines are used to correlate the N- and C-terminal residue backbone torsion angles for peptides that do not bridge between the extended and α conformational regions. None of the most acidic peptides have positive φ torsion angles. Reprinted from ref. 34 with permission from Elsevier Limited.
The subset of acidic Ala-Ala conformers with an extended
N-terminal residue and a C-terminal residue in an α conformation
is predicted to account for over 60% of the total hydrogen exchange,
while constituting only 12% of the population. By contrast, over
half of the Ala-Ala peptides in the coil library have either both resi-
dues in the extended conformation or both in the α conformation.
However, when combined, these two sets of conformers are pre-
dicted to account for only 11% of the total hydrogen exchange
reaction.
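The way a small subpopulation can dominate the total exchange is a direct consequence of averaging rates rather than pK values. The sketch below assumes that each conformer contributes a rate proportional to 10^(−pK), which holds far from the diffusion limit, and that all library conformers carry equal weight; the function names and the pK values in the demonstration are hypothetical and are not taken from the coil library calculations.

```python
import math
from typing import Sequence


def conformer_rates(pks: Sequence[float]) -> list[float]:
    """Relative exchange rates, taking rate proportional to 10**(-pK)."""
    return [10.0 ** (-pk) for pk in pks]


def exchange_share(pks: Sequence[float], in_subset: Sequence[bool]) -> float:
    """Fraction of the ensemble-summed exchange carried by a conformer subset."""
    rates = conformer_rates(pks)
    subset_total = sum(rate for rate, flag in zip(rates, in_subset) if flag)
    return subset_total / sum(rates)


def ensemble_pk(pks: Sequence[float]) -> float:
    """Effective pK of an equally weighted ensemble (average of the rates)."""
    rates = conformer_rates(pks)
    return -math.log10(sum(rates) / len(rates))


if __name__ == "__main__":
    # Hypothetical ensemble: 3 acidic conformers among 25 (12% of the population).
    pks = [17.5] * 3 + [19.5] * 22
    acidic = [True] * 3 + [False] * 22
    print(f"share of exchange from acidic subset: {exchange_share(pks, acidic):.2f}")
    print(f"effective ensemble pK: {ensemble_pk(pks):.2f}")
```

Because the average is taken over rates, the effective ensemble pK lies much closer to that of the most acidic conformers than to the arithmetic mean of the conformer pK values.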
[Figure 12 plot: pKPB (Ala-Val) (y-axis) versus pKPB (Ala-Ala) (x-axis), both axes spanning 15 to 23.]
Fig. 12. Relative contributions of side chain and main chain interactions determining the
peptide acidities of N-acetyl-[Ala-Val]-N-methylamide conformers. For each of the
N-acetyl-[Ala-Val]-N-methylamide conformers, the methyl groups were truncated to form
an [Ala-Ala] conformer, and the electrostatic free energy was calculated. The correspond-
ing pairs of amide pK values are denoted by their χ1 side chain rotamer (g− as diamonds,
g+ as squares, and t as circles). Reprinted from ref. 35 with permission from Elsevier
Limited.
7.3. Hydrogen Exchange for Nonpolar Side Chains and Nonadditivity of Correction Factors
The degree to which the backbone geometry contributes to the
predicted conformer acidities can be assessed by analysis of
the Protein Coil Library [Ala-Val] peptide conformations with the
valine side chain truncated to alanine. For each of the three χ1
rotamers of the valine side chain, the acidity of the central amide is
closely correlated with that of the [Ala-Ala] peptide in the same
backbone geometry (Fig. 12). However, conformers with the
gauche− side chain rotamer, in which both methyl groups are
oriented gauche to the backbone nitrogen, are predicted to have
appreciably lower amide acidities (on average ~0.7 pH units).
Throughout the range of amide acidities, differences in the side
chain χ1 torsion angle give rise to variations in pK values spanning
~1 pH unit.
For each of the nonpolar N-acetyl-[X-Ala]-N-methylamides
and N-acetyl-[Ala-Y]-N-methylamides, the predicted acidities of
the central amide in the individual conformers of each peptide span
nearly a million-fold range. Nevertheless, population averaging of
the conformer reactivities predicts the standard side-chain-depen-
dent hydrogen exchange correction factors for model peptides to
within a factor of 30% (10^0.11) with a correlation coefficient r = 0.91
(Fig. 13).
[Figure 13 plot: Δ log kOH− (PB) (y-axis) versus Δ log kOH− (exp) (x-axis), both axes spanning −0.8 to 0.2; points are labeled by dipeptide: YA, FA, WA, MA, LA, IA, AA, VA, AY, AM, AF, AL, AW, AV, AI.]
Fig. 13. Predicted and observed nonpolar side-chain-dependent differences in the hydrox-
ide-catalyzed log rate constants for model peptides. Poisson–Boltzmann electrostatic free
energies were calculated for the N-acetyl-[X-Ala]-N-methylamide conformers and
N-acetyl-[Ala-Y]-N-methylamide conformers derived from the Protein Coil Library [101].
Hydrogen exchange rate constants were predicted from the ensemble averaging of the
conformer exchange reactivities and were then compared to the standard experimental
side-chain-dependent hydrogen exchange correction factors [23]. Reprinted from ref. 35
with permission from Elsevier Limited.
Hydrogen exchange rate predictions for unstructured peptides
are standardly derived by adding the side-chain-dependent
exchange rate correction factors for the side chain preceding and
the side chain following the peptide group undergoing exchange
(23). This assumption of additivity for the individual side chain
correction factors has its conceptual justification in the isolated
residue hypothesis of Flory, developed in his classic analysis of the
statistical mechanics of random coil polymers (103). In that para-
digm, the conformational distribution of each residue is assumed
to be independent of the conformational distribution of any other
residue in the chain.
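The additivity assumption described above amounts to summing two log-scale correction factors onto a reference rate. The following sketch illustrates the bookkeeping only. The intraresidue (following) entries for Val, Phe, and Met are rounded from the approximate fold changes quoted earlier in the text (fivefold, twofold, and no effect); the preceding-residue entries and the reference log kOH− of 8.5 are placeholders chosen for illustration and are not the tabulated values of ref. 23. All names are hypothetical.

```python
# Illustrative log-scale correction factors relative to alanine.
# FOLLOWING: side chain on the residue bearing the exchanging amide.
# PRECEDING: side chain on the residue before the exchanging peptide group.
FOLLOWING_CORRECTION = {"Ala": 0.0, "Met": 0.0, "Phe": -0.3, "Val": -0.7}
PRECEDING_CORRECTION = {"Ala": 0.0, "Met": 0.0, "Phe": -0.1, "Val": -0.2}  # placeholders


def predicted_log_k(log_k_reference: float, preceding: str, following: str) -> float:
    """Additive prediction: reference log rate plus the two side chain terms."""
    return (
        log_k_reference
        + PRECEDING_CORRECTION[preceding]
        + FOLLOWING_CORRECTION[following]
    )


if __name__ == "__main__":
    reference = 8.5  # placeholder log kOH- for the Ala-Ala reference peptide
    for pair in (("Ala", "Ala"), ("Val", "Ala"), ("Ala", "Val"), ("Val", "Val")):
        print(f"{pair[0]}-{pair[1]}: log kOH- = {predicted_log_k(reference, *pair):.2f}")
```

In this scheme the Val-Val prediction is fixed entirely by the Val-Ala and Ala-Val entries, which is exactly the property that nearest-neighbor conformational effects would be expected to violate.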
However, direct evidence for violation of the isolated residue
hypothesis (103) was first reported by Penkett et al. (104) who
observed that the backbone ϕ torsion angle of a given residue, as
assessed on the basis of the ³J(HNα) NMR scalar coupling constant, is
dependent on whether the preceding residue has a β-branched or
an aromatic side chain. The presence of a bulky nonpolar side chain
on the preceding residue was found to increase the population of
conformers having extended main chain torsion angles. Studies of
coil libraries drawn from high resolution protein X-ray structures
have found nearest-neighbor effects in the backbone torsion angle
preferences (105, 106).
If the conformer distributions from the Protein Coil Library
(101) perfectly conformed to the isolated residue hypothesis, then
the differential hydrogen exchange rates predicted for each
N-acetyl-[X–Y]-N-methylamide might be expected to be precisely
determined by the sum of the Δlog kOH− values for the correspond-
ing N-acetyl-[X-Ala]-N-methylamide and N-acetyl-[Ala-Y]-N-
methylamide. However, the predictions assuming additivity in the
calculated nonpolar side-chain-dependent hydrogen exchange cor-
rection factors deviate from the N-acetyl-[X–Y]-N-methylamide
calculations with an rmsd of 0.14 and a correlation coefficient
r = 0.78. This correlation is substantially worse than that for the
predictions of the experimental nonpolar side-chain-dependent
hydrogen exchange correction factors (Fig. 13), despite the fact
that the calculations of the exchange reactivities for the N-acetyl-
[X-Ala]-N-methylamides and N-acetyl-[Ala-Y]-N-methylamides
have several additional sources of uncertainty.
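The two statistics used in this comparison, the rmsd and the correlation coefficient r between additive predictions and direct dipeptide calculations, can be computed as in the sketch below. The function names are hypothetical, and the input lists in the demonstration are placeholder numbers, not values from the study.

```python
import math
from typing import Sequence


def rmsd(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Root-mean-square deviation between paired values (log units here)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between paired values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder values: additive sums versus direct [X-Y] calculations.
    additive = [-0.9, -0.5, -0.3, -0.1, 0.0]
    direct = [-0.7, -0.6, -0.2, -0.2, 0.1]
    deviation = rmsd(additive, direct)
    print(f"rmsd = {deviation:.2f} log units (factor of {10 ** deviation:.2f})")
    print(f"r = {pearson_r(additive, direct):.2f}")
```

Reporting the rmsd as 10 raised to that value converts a log-scale deviation into the multiplicative error in the rate constant, the same convention used for the 30% figure quoted for Fig. 13.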
7.4. Evidence for Dielectric Shielding from Conformational Reorganization in Carboxamide Side Chains
It should be noted that for the nonpolar side chains illustrated in
Fig. 13, an assumed internal dielectric value of 3 provides an
optimal correlation between the experimental and predicted peptide
acidities. As expected, the slope of this correlation exhibits an
inverse dependence on that dielectric value. Hence, even in the
case of the highly mobile nonpolar peptides, the conformational
contribution to dielectric shielding of amide ionization appears
to be severely limited. However, evidence for a modest
contribution from conformational reorganization was observed for
the Asn and Gln side chains. Poisson–Boltzmann calculations on
the coordinates of the Asn and Gln residues in the Protein Coil
Library distribution yielded amide reactivity predictions that were
appreciably less than the experimentally determined values (open
symbols in Fig. 14). In contrast to unhindered rotamer transitions
around sp³–sp³ bonds, which generally occur in the timeframe of
hundreds of picoseconds to nanoseconds (107, 108), the sp³–sp²
hybridization of the carboxamide side chain results in more rapid
dihedral angle transitions due to the lower intrinsic torsional
potential barrier. Quantum mechanical analysis indicates a barrier
of only 0.15 kcal/mol for acetamide (109). As a result, within each
χ1 rotamer state of Asn, extensive sampling of the χ2 torsion angle
can potentially occur during the peptide anion lifetime. Given the large
dipole of the side chain carboxamide group, such a bond rotation
can substantially alter the degree of stabilization provided for the
peptide anion.
Calculations were conducted to estimate the magnitude of the
shielding effect from conformational reorganization of the Asn
side chain by assuming rapid averaging around the χ2 torsion angle.
Upon deprotonation of the peptide unit, the statistical weighting
of this distribution will shift according to the conformer-dependent
strength of the electrostatic interaction between the carboxamide
group and the peptide anion. The differences in energy among all
of the conformer-dependent electrostatic interactions within a
given χ1 rotamer state were used to assign a Boltzmann factor
weighting to each conformer within that χ1 rotamer state. For the
Asn peptides, this correction for conformational reorganization
brings the predicted peptide acidity fully in line with the
experimental results (Fig. 14). A smaller correction is predicted for the
Gln side chain, reflecting the larger average distance between the
backbone nitrogen and the side chain carboxamide.

[Figure 14 plot: Δ log kOH− (PB) (y-axis) versus Δ log kOH− (exp) (x-axis), both axes spanning −0.8 to 0.6; labeled points: AN, NA, QA, AQ.]
Fig. 14. The effect of conformational reorganization within the individual side chain rotamer states for Asn (χ1) and Gln (χ1 and χ2). Electrostatic free energies were calculated for the [Asn-Ala], [Ala-Asn], [Gln-Ala], and [Ala-Gln] methylamide conformers derived from the Protein Coil Library [101]. Hydrogen exchange rate constants were predicted from the ensemble averaging of these conformer exchange reactivities with (filled circle) and without (open circle) allowance for conformational reorganization of the peptide anions within each side chain rotamer state. The other data are displayed as given in Fig. 13. Reprinted from ref. 35 with permission from Elsevier Limited.
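The Boltzmann factor weighting described for the Asn side chain can be sketched as follows. This is an illustration of the weighting step only, under stated assumptions: energies are in kcal/mol, the temperature is taken as 25°C, and the weights are applied to a generic per-conformer quantity because the text does not specify exactly how the reweighted conformers enter the final acidity. The function names and the energies in the demonstration are hypothetical.

```python
import math
from typing import Sequence

# Gas constant (kcal mol^-1 K^-1) times an assumed temperature of 298.15 K.
RT_KCAL_PER_MOL = 0.001987204 * 298.15


def boltzmann_weights(energies_kcal: Sequence[float]) -> list[float]:
    """Normalized Boltzmann weights for the conformers of one rotamer state."""
    if not energies_kcal:
        raise ValueError("at least one conformer energy is required")
    lowest = min(energies_kcal)  # shift energies for numerical stability
    factors = [math.exp(-(e - lowest) / RT_KCAL_PER_MOL) for e in energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def weighted_mean(weights: Sequence[float], values: Sequence[float]) -> float:
    """Weighted average of a per-conformer quantity."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have equal length")
    return sum(w * v for w, v in zip(weights, values))


if __name__ == "__main__":
    # Hypothetical electrostatic interaction energies (kcal/mol) between the
    # carboxamide group and the peptide anion for conformers sharing one chi1 state.
    energies = [-1.2, -0.4, 0.3, 0.9]
    for energy, weight in zip(energies, boltzmann_weights(energies)):
        print(f"E = {energy:5.2f} kcal/mol -> weight = {weight:.3f}")
```

Conformers whose carboxamide orientation stabilizes the peptide anion gain statistical weight after deprotonation, which is the origin of the shielding correction attributed to conformational reorganization.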
7.5. Deviations in Dielectric Continuum Modeling of Hydrogen Exchange Arising from Chemical Induction and Aspartate Side Chain Interactions
The Ser, Thr, Cys and His+ intraresidue side chains all accelerate
hydroxide-catalyzed peptide hydrogen exchange (23), consistent
with an electron-withdrawing effect from the substituent. However,
the challenges facing an adequate modeling of the electrostatic
potential for these side chains complicate the deconvolution of an
additional contribution to peptide acidity arising from chemical
induction (35).
With the exception of the Asp residue, the other charged side
chains yield correction factors for model peptide hydrogen
exchange that are robustly predictable from the Protein Coil
Library (21). However, in marked contrast to each of the other
residue types, the hydrogen exchange prediction derived from the
calculated peptide acidities for the Asp conformers is significantly
lower than the experimental result, even after the elevated ionic
strength of the hydrogen exchange measurements is taken into
account. These deviations strongly suggest that the present
continuum dielectric calculations overestimate the effective strength of
the electrostatic interaction between the peptide anion and the
negatively charged Asp side chain. Insight into these interactions
can be gained from consideration of the distribution of conformer
acidities as a function of side chain rotamer state. When the
[Ala-Asp] peptide conformers and the [Ala-Ala] peptide conformers
generated by truncation of the carboxylate were compared
(Fig. 15), both gauche χ1 rotamer states yielded side chain
conformation-dependent variations in peptide acidities that were
substantially larger than those obtained from the analogous predictions for
the [Ala-Val] peptide conformers (Fig. 12). This effect is
particularly marked for the g+ rotamer state in which a substantial
proportion of conformers have a carboxylate oxygen positioned close to
the peptide nitrogen. By contrast, the [Ala-Asp] peptide conformers
in the trans χ1 rotamer state yield peptide conformer acidities
that tightly correlate with the electrostatic interactions of the
backbone, although the predictions are uniformly shifted to lower
acidities, due to the long range interaction between the peptide
anion and the trans carboxylate. The strongly enhanced peptide
acidities predicted from rotating to the trans χ1 rotamer reflect the
effects previously noted in our analysis of hydrogen exchange in
rubredoxin (22) and other model proteins (47).

[Figure 15 plot: pKPB (Ala-Asp) (y-axis) versus pKPB (Ala-Ala) (x-axis), both axes spanning 16 to 24.]
Fig. 15. The relative contributions of side chain and main chain interactions in determining the peptide acidities of N-acetyl-[Ala-Asp]-N-methylamide conformers. For each N-acetyl-[Ala-Asp]-N-methylamide, the carboxylate group was truncated to form an [Ala-Ala] conformer, and the electrostatic free energy was then calculated. The corresponding pairs of amide pK values are denoted by their χ1 side chain rotamer (g− as diamonds, g+ as squares, and t as circles). Reprinted from ref. 35 with permission from Elsevier Limited.
8. Future Directions
Future studies will provide insight into the degree to which the
residual inaccuracies in predictions of hydrogen exchange for either
proteins or model peptides reflect errors in the modeling of the
Boltzmann conformational distribution or rather reflect inadequa-
cies in the electrostatic modeling used to analyze those conforma-
tions. Both of these avenues for improved predictive capability will
require pursuit. However, as recently observed by Senn and Thiel
(110), despite many years of intense research effort, there are as yet
no generally established polarizable biomolecular force fields.
Fortunately, the present studies further demonstrate that, in the
context of dielectric shielding without substantial conformational
reorganization, the classic paradigm of uniform volume polariz-
ability is strikingly robust. As such, continued insights into model-
ing of the Boltzmann conformational distribution from hydrogen
exchange analysis can be anticipated on the basis of the continuum
dielectric representation.
References
1. Berger, A., and Linderstrøm-Lang, K. (1957) Deuterium exchange of poly-DL-alanine in aqueous solution. Arch. Biochem. Biophys. 69, 106–118.
2. Hvidt, A., and Nielsen, S.O. (1966) Hydrogen exchange in proteins. Advances in Protein Chem. 21, 287–386.
3. Bai, Y.W., Milne, J.S., Mayne, L., and Englander, S.W. (1994) Protein stability parameters measured by hydrogen exchange. Proteins: Struct., Funct., Genet. 20, 4–14.
4. Huyghues-Despointes, B.M.P., Scholtz, J.M., and Pace, C.N. (1999) Protein conformational stabilities can be determined from hydrogen exchange rates. Nat. Struct. Biol. 6, 910–912.
5. Li, R., and Woodward, C. (1999) The hydrogen exchange core and protein folding. Prot. Sci. 8, 1571–1591.
6. Hilser, V.J., and Freire, E. (1996) Structure-based calculations of the equilibrium folding pathway of proteins. Correlation with hydrogen exchange protection factors. J. Mol. Biol. 262, 756–772.
7. Wallqvist, A., Smythers, G.W., and Covell, D.G. (1997) Identification of cooperative folding units in a set of native proteins. Prot. Sci. 6, 1627–1642.
8. Bahar, I., Wallqvist, A., Covell, D.G., and Jernigan, R.L. (1998) Correlation between native-state hydrogen exchange and cooperative residue fluctuations from a simple model. Biochemistry 37, 1067–1075.
9. Sheinerman, F.B., and Brooks, C.L. (1998) Molecular picture of folding of a small α/β protein. Proc. Natl. Acad. Sci. USA 95, 1562–1567.
10. Garcia, A.E., and Hummer, G. (1999) Conformational dynamics of cytochrome c: Correlation to hydrogen exchange. Prot. Struct. Funct. Genet. 36, 175–191.
11. Dixon, R.D.S., Chen, Y., Ding, F., Khare, S.D., Prutzman, K.C., Schaller, M.D., Campbell, S.L., and Dokholyan, N.V. (2004) New insights into FAK signaling and localization based on detection of a FAT domain folding intermediate. Structure 12, 2161–2171.
12. Livesay, D.R., Dallakyan, S., Wood, G.G., and Jacobs, D.J. (2004) A flexible approach for understanding protein stability. FEBS Lett. 576, 468–476.
13. Freire, E. (1999) The propagation of binding interactions to remote sites in proteins: Analysis of the binding of the monoclonal antibody D1.3 to lysozyme. Proc. Natl. Acad. Sci. USA 96, 10118–10122.
14. Pan, H., Lee, J.C., and Hilser, V.J. (2000) Binding sites in Escherichia coli dihydrofolate reductase communicate by modulating the conformational ensemble. Proc. Natl. Acad. Sci. USA 97, 12020–12025.
15. Wrabl, J.O., Larson, S.A., and Hilser, V.J. (2001) Thermodynamic propensities of amino acids in the native state ensemble: Implications for fold recognition. Prot. Sci. 10, 1032–1045.
16. Wrabl, J.O., Larson, S.A., and Hilser, V.J. (2002) Thermodynamic environments in proteins: Fundamental determinants of fold specificity. Prot. Sci. 11, 1945–1957.
17. Babu, C.R., Hilser, V.J., and Wand, A.J. (2004) Direct access to the cooperative substructure of proteins and the protein ensemble via cold denaturation. Nat. Struct. Mol. Biol. 11, 352–357.
18. Wang, S.W., Gu, J., Larson, S.A., Whitten, S.T., and Hilser, V.J. (2008) Denatured-state energy landscapes of a protein structural database reveal the energetic determinants of a framework model for folding. J. Mol. Biol. 381, 1184–1201.
19. Cremades, N., Sancho, J., and Freire, E. (2006) The native-state ensemble of proteins provides clues for folding, misfolding and function. Trends Biochem. Sci. 31, 494–496.
24. Hernández, G., Anderson, J.S., and LeMaster, D.M. (2008) Electrostatic stabilization and general base catalysis in the active site of the human protein disulfide isomerase a domain monitored by hydrogen exchange. ChemBioChem 9, 768–778.
25. Radford, S.E., Buck, M., Topping, K.D., Dobson, C.M., and Evans, P.A. (1992) Hydrogen exchange in native and denatured states of hen egg-white lysozyme. Proteins 14, 237–248.
26. Hollien, J., and Marqusee, S. (2002) Comparison of the folding processes of T. thermophilus and E. coli Ribonucleases H. J. Mol. Biol. 316, 327–340.
27. Bau, R., Rees, D.C., Kurtz-Jr., D.M., Scott, R.A., Huang, H.S., Adams, M.W.W., and Eidsness, M.K. (1998) Crystal-structure of rubredoxin from Pyrococcus furiosus at 0.95 Angstrom resolution, and the structures of N-terminal methionine and formylmethionine variants of Pf Rd. Contributions of N-terminal interactions to thermostability. J. Biol. Inorg. Chem. 3, 484–493.
28. LeMaster, D.M., Tang, J., Paredes, D.I., and Hernández, G. (2005) Enhanced thermal stability achieved without increased conformational rigidity at physiological temperatures: Spatial propagation of differential flexibility in rubredoxin hybrids. Proteins 61, 608–616.
29. Lee, B., and Richards, F.M. (1971) The Interpretation of Protein Structures: Estimation of Static Accessibility. J. Mol. Biol. 55, 379–400.
30. Hiller, R., Zhou, Z.H., Adams, M.W.W., and Englander, S.W. (1997) Stability and dynamics in a hyperthermophilic protein with melting temperature close to 200 degrees C. Proc. Natl. Acad. Sci. USA 94, 11329–11332.
31. Richter, B., Gsponer, J., Varnai, P., Salvatella, X., and Vendruscolo, M. (2007) The MUMO (minimal under-restraining minimal over-restraining) method for the determination of native state ensembles of proteins. J. Biomol. NMR 37, 117–135.
20. LeMaster, D.M., Anderson, J.S., and Hernández,
G. (2009) Peptide conformer acidity analysis of 32. Lange, O.F., Lakomek, N.A., Fares, C.,
protein flexibility monitored by hydrogen Schroder, G.F., Walter, K.F.A., Becker, S.,
exchange. Biochemistry 48, 9256–9265. Meiler, J., Grubmuller, H., Griesinger, C.,
21. Anderson, J.S., Hernandez, G., and LeMaster, and deGroot, B.L. (2008) Recognition
D.M. (2010) Conformational Electrostatics dynamics up to microseconds revealed from
in the Stabilization of the Peptide Anion. an RDC-derived ubiquitin ensemble in solu-
Curr. Org. Chem. 14, 162–180. tion. Science 320, 1471–1475.
22. Anderson, J.S., Hernández, G., and LeMaster, 33. Hernández, G., Anderson, J.S., and LeMaster,
D.M. (2008) A billion-fold range in acidity D.M. (2010) Assessing the native state con-
for the solvent-exposed amides of Pyrococcus formational distribution of ubiquitin by pep-
furiosus rubredoxin. Biochemistry 47, tide acidity. Biophys. Chem., doi: 10.1016/j.
6178–6188. bpc.2010.10.006.
23. Bai, Y.W., Milne, J.S., Mayne, L., and 34. Anderson, J.S., Hernández, G., and LeMaster,
Englander, S.W. (1993) Primary structure D.M. (2009) Backbone conformational
effects on peptide group hydrogen-exchange. dependence of peptide acidity. Biophys. Chem.
Proteins: Struct., Funct., Genet. 17, 75–86. 141, 124–130.
35. Anderson, J.S., Hernandez, G., and LeMaster, 48. Luz, Z., and Meiboom, S. (1964) The activa-
D.M. (2010) Sidechain conformational tion energies of proton transfer reactions in
dependence of hydrogen exchange in model water. J. Am. Chem. Soc. 86, 4768–4769.
peptides. Biophys. Chem. 151, 61–70. 49. Tolbert, L.M., and Solntsev, K.M. (2002)
36. Kim, P.S., and Baldwin, R.L. (1982) Influence Excited-state proton transfer: From con-
of charge on the rate of amide proton strained systems to “Super” photoacids to
exchange. Biochemistry 21, 1–5. superfast proton transfer. Acc. Chem. Res. 35,
37. Tüchsen, E., and Woodward, C. (1985) 19–27.
Hydrogen kinetics of peptide amide protons 50. Leiderman, P., Genosar, L., and Huppert, D.
at the bovine pancreatic trypsin inhibitor pro- (2005) Excited-state proton transfer:
tein-solvent interface. J. Mol. Biol. 185, Indication of three steps in the dissociation
405–419. and recombination process. J. Phys. Chem A
38. Delepierre, M., Dobson, C.M., Karplus, M., 109, 5965–5977.
Poulsen, F.M., States, D.J., and Wedin, R.E. 51. Ellison, W.J., Lamkaouchi, K., and Moreau,
(1987) Electrostatic effects and hydrogen J.M. (1996) Water: A dielectric reference.
exchange behavior in proteins. The pH depen- J. Molec. Liquids 68, 171–279.
dence of exchange rates in lysozyme. J. Mol. 52. Marcus, R.A. (1964) Chemical and electro-
Biol. 197, 111–122. chemical electron-transfer theory. Annu. Rev.
39. Dempsey, C.E. (1995) Hydrogen bond sta- Phys. Chem. 15, 155–196.
bilities in the isolated alamethicin helix: pH- 53. Schaefer, M., and Karplus, M. (1996) A com-
dependent amide exchange measurements prehensive analytical treatment of continuum
in methanol. J. Am. Chem. Soc. 117, electrostatics. J. Phys. Chem. 100,
7526–7534. 1578–1599.
40. Forsyth, W.R., and Robertson, A.D. (1996) 54. Richards, F.M. (1974) The interpretation of
Intramolecular electrostatic interactions accel- protein structures: Total volume, group vol-
erate hydrogen exchange in diketopiperazine ume distributions and packing density. J. Mol.
relative to 2-piperidone. J. Am. Chem. Soc. Biol. 82, 1–14.
118, 2694–2698. 55. Tsai, J., Taylor, R., Chothia, C., and Gerstein,
41. Fogolari, F., Esposito, G., Viglino, P., Briggs, M. (1999) The packing density in proteins:
J.M., and McCammon, J.A. (1998) pKa shift Standard radii and volumes. J. Mol. Biol. 290,
effects on backbone amide base-catalyzed 253–266.
hydrogen exchange rates in peptides. J. Am. 56. Mertz, E.L., and Krishtalik, L.I. (2000) Low
Chem. Soc. 120, 3735–3738. dielectric response in enzyme active site. Proc.
42. Matthew, J.B., and Richards, F.M. (1983) Natl. Acad. Sci. USA 97, 2081–2086.
The pH dependence of hydrogen exchange in 57. Hawranek, J.P., Wrzeszcz, W., Muszyñski,
proteins. J. Biol. Chem. 258, 3039–3044. A.S., and Pajdowska, M. (2002) Infrared dis-
43. Eigen, M. (1964) Proton transfer, acid-base persion of liquid triethylamine. J. Non-Crystal.
catalysis, and enzymatic hydrolysis. (I) Solids 305, 62–70.
Elementary processes. Angew. Chem. Int. Ed. 58. LeMaster, D.M., Anderson, J.S., and
3, 1–19. Hernández, G. (2006) Role of native-state
44. Molday, R.S., and Kallen, R.G. (1972) structure in rubredoxin native-state hydrogen
Substituent effects on amide hydrogen exchange. Biochemistry 45, 9956–9963.
exchange rates in aqueous solution. J. Am. 59. Antosiewicz, J., McCammon, J.A., and Gilson,
Chem. Soc. 94, 6739–6745. M.K. (1994) Prediction of pH dependent
45. Wang, W.H., and Cheng, C.C. (1994) General properties of proteins. J. Mol. Biol. 238,
base catalyzed proton exchange in amides. 415–436.
Bull. Chem. Soc. Jpn. 67, 1054–1057. 60. Antosiewicz, J., McCammon, J.A., and Gilson,
46. LeMaster, D.M., Anderson, J.S., and M.K. (1996) The determinants of pKas in
Hernández, G. (2007) Spatial distribution of proteins. Biochemistry 35, 7819–7833.
dielectric shielding in the interior of Pyrococcus 61. Demchuk, E., and Wade, R.C. (1996)
furiosus rubredoxin as sampled in the sub- Improving the continuum dielectric
nanosecond timeframe by hydrogen exchange. approach to calculating pKa’s of ionizable
Biophys. Chem. 129, 43–48. groups in proteins. J. Phys. Chem. 100,
47. Hernández, G., Anderson, J.S., and LeMaster, 17373–17387.
D.M. (2009) Polarization and polarizability 62. Georgescu, R.E., Alexov, E.G., and Gunner,
assessed by protein amide acidity. Biochemistry M.R. (2002) Combining conformational
48, 6482–6494. flexibility and continuum electrostatics for
calculating pKas in proteins. Biophys. J. 83, 74. Chevelkov, V., Xue, Y., Rao, D.K., Forman-
1731–1748. Kay, J.D., and Skrynnikov, N.R. (2010)
63. Wisz, M.S., and Hellinga, H.W. (2003) An N-15(H/D)-SOLEXSY experiment for accu-
empirical model for electrostatic interactions rate measurement of amide solvent exchange
in proteins incorporating multiple geometry- rates: application to denatured drkN SH3.
dependent dielectric constants. Proteins 51, J. Biomolec. NMR 46, 227–244.
360–377. 75. Makhatadze, G.I., Clore, G.M., and
64. Song, Y., Mao, J., and Gunner, M.R. (2009) Gronenborn, A.M. (1995) Solvent isotope
MCCE2: Improving protein pKa calculations effect and protein stability. Nat. Struct. Biol.
with extensive side chain rotamer sampling. J. 2, 852–855.
Comput. Chem. 30, 2231–2247. 76. Connelly, G.P., Bai, Y.W., Jeng, M.F., and
65. Simonson, T., and Perahia, D. (1995) Internal Englander, S.W. (1993) Isotope effects in
and interfacial dielectric properties of cyto- peptide group hydrogen-exchange. Proteins:
chrome c from molecular dynamics in aque- Struct., Funct., Genet. 17, 87–92.
ous solution. Proc. Natl. Acad. Sci. USA 92, 77. Sivaraman, T., Arrington, C.B., and Robertson,
1082–1086. A.D. (2001) Kinetics of unfolding and folding
66. Harms, M.J., Schlessman, J.L., Chimenti, from amide hydrogen exchange in native
M.S., Sue, G.R., Damjanovic, A., and García- ubiquitin. Nat. Struct. Biol. 8, 331–333.
Moreno, B.E. (2008) A buried lysine that 78. Rocchia, W., Sridharan, S., Nicholls, A., Alexov,
titrates with a normal pKa: Role of conforma- E., Chiabrera, A., and Honig, B. (2002) Rapid
tional flexibility at the protein-water interface grid-based construction of the molecular sur-
as a determinant of pKa values, Prot. Sci. 17, face and the use of induced surface charge to
833–845. calculate reaction field energies: Applications to
67. Bordwell, F.G., W. J. Boyle, J., and Yee, K.C. the molecular systems and geometric objects.
(1970) Equilibrium and kinetic acidities of J. Comput. Chem. 23, 128–137.
nitroalkanes and their relationship to transi- 79. MacKerell.Jr., A.D., Bashford, D., Bellott,
tion state structures. J. Am. Chem. Soc. 92, M., Dunbrack.Jr., R.L., Evanseck, J.D., Field,
5926–5932. M.J., Fischer, S., Gao, J., Guo, H., Ha, S.,
Joseph-McCarthy, D., Kuchnir, L., Kuczera,
68. Bernasconi, C.F. (1987) Intrinsic barriers of
K., Lau, F.T.K., Mattos, C., Michnick, S.,
reactions and the principle of nonperfect syn-
Ngo, T., Nguyen, D.T., Prodhom, B., Reiher-
chronization. Acc. Chem. Res. 20, 301–308.
III, W.E., Roux, B., Schlenkrich, M., Smith,
69. Costentin, C., and Saveant, J.M. (2004) Why J.C., Stote, R., Straub, J., Watanabe, M.,
are proton transfers at carbon slow? Self- Wiorkiewicz-Kuczera, J., Yin, D., and Karplus,
exchange reactions. J. Am. Chem. Soc. 126, M. (1998) All-atom empirical potential for
14787–14795. molecular modeling and dynamics studies of
70. Hwang, T.L., Mori, S., Shaka, A.J., and vanZijl, proteins. J. Phys. Chem. B 102, 3586–3616.
P.C.M. (1997) Application of Phase-Modulated 80. Jorgensen, W.L., Maxwell, D.S., and Tirado-
CLEAN Chemical EXchange Spectroscopy Rives, J. (1996) Development and testing of
(CLEANEX-PM) to detect water-protein pro- the OPLS all-atom force field on conforma-
ton exchange and intermolecular NOEs. J. Am. tional energetics and properties of organic liq-
Chem. Soc. 119, 6203–6204. uids. J. Am. Chem. Soc. 118, 11225–11236.
71. Hwang, T.L., vanZijl, P.C.M., and Mori, S. 81. Cheatham, T.E., Cieplak, P., and Kollman,
(1998) Accurate quantitation of water-amide P.A. (1999) A modified version of the Cornell
proton exchange rates using the phase-modu- et al. force field with improved sugar pucker
lated CLEAN chemical EXchange phases and helical repeat. J. Biomolec. Struct.
(CLEANEX-PM) approach with a fast-HSQC Dynamics 16, 845–862.
(FHSQC) detection scheme. J. Biomol. NMR 82. Duan, Y., Wu, C., Chowdhury, S., Lee, M.C.,
11, 221–226. Xiong, G.M., Zhang, W., Yang, R., Cieplak,
72. Hernández, G., and LeMaster, D.M. (2003) P., Lou, R., Lee, T., Caldwell, J., Wang, J.M.,
Relaxation compensation in chemical exchange and Kollman, P. (2003) A point-charge force
measurements for the quantitation of amide field for molecular mechanics simulations of
hydrogen exchange in larger proteins. Magn. proteins based on condensed-phase quantum
Reson. Chem. 41, 699–702. mechanical calculations. J. Comp. Chem. 24,
73. Griesinger, C., and Ernst, R.R. (1987) 1999–2012.
Frequency offset effects and their elimination 83. Bayly, C.I., Cieplak, P., Cornell, W.D., and
in NMR rotating-frame cross-relaxation spec- Kollman, P.A. (1993) A well behaved electro-
troscopy. J. Magn. Reson. 75, 261–271. static potential-based method using charge
restraints for deriving atomic charges: The RESP 96. Chen, K., Liu, Z., Zhou, C., Bracken, W.C.,
model. J. Phys. Chem. 97, 10269–10280. and Kallenbach, N.R. (2007) Spin relaxation
84. Becke, A.D. (1993) Density-functional ther- enhancement confirms dominance of extended
mochemistry III. The role of exact exchange. conformations in short alanine peptides.
J. Chem. Phys. 98, 5648–5652. Angew. Chem. Int. Ed. 46, 9036–9039.
85. You, T.J., and Bashford, D. (1995) 97. Graf, J., Nguyen, P.H., Stock, G., and
Conformation and hydrogen ion titration of Schwalbe, H. (2007) Structure and dynamics
proteins: A continuum electrostatic model of the homologous series of alanine peptides:
with conformational flexibility. Biophys. J. 69, A joint molecular dynamics/NMR study. J.
1721–1733. Am. Chem. Soc. 129, 1179–1189.
86. vanVlijmen, H.W.T., Schaefer, M., and 98. Wickstrom, L., Okur, A., and Simmerling, C.
Karplus, M. (1998) Improving the accuracy of (2009) Evaluating the Performance of the
protein pKa calculations: Conformational ff99SB Force Field Based on NMR Scalar
averaging versus the average structure. Proteins Coupling Data. Biophys. J. 97, 853–856.
33, 145–158. 99. Pizzanelli, S., Forte, C., Monti, S.,
87. deGroot, B.L., vanAalten, D.M.F., Scheek, Zandomeneghi, G., Hagarman, A., Measey,
R.M., Amadei, A., Vriend, G., and Berendsen, T.J., and Schweitzer-Stenner, R. (2010)
H.J.C. (1997) Prediction of protein confor- Conformations of Phenylalanine in the
mational freedom from distance constraints. Tripeptides AFA and GFG Probed by
Proteins 29, 240–251. Combining MD Simulations with NMR,
88. Cornilescu, G., Marquardt, J.L., Ottiger, M., FTIR, Polarized Raman, and VCD
and Bax, A. (1998) Validation of Protein Spectroscopy. J. Phys. Chem B 114,
Structure from Anisotropic Carbonyl Chemical 3965–3978.
Shifts in a Dilute Liquid Crystalline Phase. 100. Tsai, M., Xu, Y.J., and Dannenberg, J.J.
J. Am. Chem. Soc. 120, 6836–6837. (2009) Ramachandran Revisited. DFT Energy
89. Boehr, D.D., Nussinov, R., and Wright, P.E. Surfaces of Diastereomeric Trialanine Peptides
(2009) The role of dynamic conformational in the Gas Phase and Aqueous Solution.
ensembles in biomolecular recognition. J. Phys. Chem B 113, 309–318.
Nature Chem. Biol. 5, 789–796. 101. Fitzkee, N.C., Fleming, P.J., and Rose, G.D.
90. Mittermaier, A.K., and Kay, L.E. (2009) (2005) The Protein Coil Library: A structural
Observing biological dynamics at atomic reso- database of nonhelix, nonstrand fragments
lution using NMR. Trends Biochem. Sci. 34, derived from the PDB. Prot. Struct. Funct.
601–611. Bioinform. 58, 852–854.
91. Dikic, I., Wakatsuki, S., and Walters, K.J. 102. Avbelj, F., and Baldwin, R.L. (2006) Limited
(2009) Ubiquitin-binding domains - from validity of group additivity for the folding
structures to functions. Nature Rev. Molec. energetics of the peptide group. Prot. Struct.
Cell Biol. 10, 659–671. Funct. Bioinform. 63, 283–289.
92. Tjandra, N., Feller, S.E., Pastor, R.W., and 103. Flory, P.J. Statistical mechanics of chain mole-
Bax, A. (1995) Rotational diffusion anisot- cules, (Wiley Interscience, New York, 1969).
ropy of human ubiquitin from 15 N NMR 104. Penkett, C.J., Redfield, C., Dodd, I., Hubbard,
relaxation. J. Am. Chem. Soc. 117, J., McBay, D.L., Mossakowska, D.E., Smith,
12562–12566. R.A.G., Dobson, C.M., and Smith, L.J.
93. Sheinblatt, M. (1970) Determination of an (1997) NMR analysis of main-chain confor-
acidity scale for peptide hydrogens from mational preferences in an unfolded fibronec-
nuclear magnetic resonance kinetic studies. J. tin-binding protein. J. Mol. Biol. 274,
Am. Chem. Soc. 92, 2505–2509. 152–159.
94. Avbelj, F., and Baldwin, R.L. (2004) Origin 105. Keskin, O., Yuret, D., Gursoy, A., Turkay, M.,
of the neighboring residue effect on peptide and Erman, B. (2004) Relationships between
backbone conformation. Proc. Natl. Acad. amino acid sequences and backbone torsion
Sci. USA 101, 10967–10972. angle preferences. Prot. Struct. Funct.
95. Makowska, J., Rodziewicz-Motowid³o, S., Bioinform. 55, 992–998.
Bagiñska, K., Vila, J.A., Liwo, A., Chmurzyñski, 106. Jha, A.K., Colubri, A., Zaman, M.H., Koide,
L., and Scheraga, H.A. (2006) Polyproline II S., Sosnick, T.R., and Freed, K.F. (2005)
conformation is one of many local conforma- Helix, sheet, and polyproline II frequencies
tional states and is not an overall conforma- and strong nearest neighbor effects in a
tion of unfolded peptides and proteins. Proc. restricted coil library. Biochemistry 44,
Natl. Acad. Sci. USA 103, 1744–1749. 9691–9702.
107. LeMaster, D.M. (1999) NMR relaxation time-scale side-chain motions. J. Am. Chem.
order parameter analysis of the dynamics of Soc. 124, 6449–6460.
protein sidechains. J. Am. Chem. Soc. 121, 109. Darley, M.G., and Popelier, P.L.A. (2008) Role
1726–1742. of short-range electrostatics in torsional poten-
108. Skrynnikov, N.R., Millet, O., and Kay, L.E. tials. J. Phys. Chem A 112, 12954–12965.
(2002) Deuterium spin probes of side-chain 110. Senn, H.M., and Thiel, W. (2009) QM/MM
dynamics in proteins. 2. Spectral density methods for biomolecular systems. Angew.
mapping and identification of nanosecond Chem. Int. Ed. 48, 1198–1229.
Chapter 21
Fast Protein Backbone NMR Resonance Assignment Using the BATCH Strategy
Bernhard Brutscher and Ewen Lescop
Abstract
Probing protein structure, dynamics, and interaction surfaces by NMR requires initial backbone resonance
assignment. The protocol for this step has been progressively developed in the last 15 years to provide
robust assignments. However, even in the case of favorable conditions (high field magnets and cryogenically
cooled probes, small globular proteins, high sample concentration), the assignment step generally takes
several days of data collection and analysis, thus precluding studies of unstable proteins and limiting high-
throughput applications. Recently, we have introduced the BATCH strategy for fast protein backbone
resonance assignment. BATCH benefits from the combination of several tools (BEST/ASCOM/Targeted-
Sampling/COBRA/HADAMAC) for time-optimized and highly automated NMR data acquisition, processing,
and analysis. In this chapter, we discuss the individual steps of the BATCH method and describe its practical
implementation to obtain the backbone resonance assignment of small globular proteins within a few hours.
Key words: Protein, Fast NMR, Resonance assignment, Chemical shift, Amino acid type discrimination,
Algorithm
1. Introduction
Protein NMR studies usually start by assigning the backbone resonances
required for the subsequent measurement of residue-specific
parameters that are related to protein structure and dynamics.
In many cases, the assignment of the 15N HSQC spectrum is
sufficient, although 13CO, 13Cα, and 13Cβ chemical shifts are also
desirable for secondary structure assessment and for structure
determination. Over the years, a robust strategy has been developed
to assign backbone resonances of uniformly 15N/13C labeled proteins.
In this strategy, a series of pairs of triple resonance experiments
are collected. These experiments are usually performed to record
three-dimensional (3D) datasets correlating the 1H, 15N chemical
shifts of one amino acid and the 13CO, 13Cα, and/or 13Cβ chemical
shifts of the same amino acid (intraresidual correlation) or of the
previous amino acid (sequential correlation) in the peptide sequence.
The analyses of these 3D spectra provide the connectivity informa-
tion for 1H/15N frequency pairs corresponding to neighboring
amino acids, as well as the 13C chemical shifts that contain some
information on the amino acid type of the corresponding residue.
The combination of sequential connectivity and amino acid type
information, in addition to the known protein sequence, is virtu-
ally sufficient to obtain complete backbone assignment and is easily
obtained using one of the numerous dedicated computer software
packages available for this purpose. This standard strategy is very
robust for a wide range of protein samples (MW, concentration).
However, for a favorable experimental setup providing high sensi-
tivity (related to protein size, concentration, and NMR hardware),
this strategy is far from optimal in terms of overall experimental
time. In such a situation, the acquisition time needed for many 3D
triple resonance experiments is essentially limited by the time-
consuming sampling of the incremented time domains needed to
achieve sufficient resolution along all spectral dimensions. In addition,
amino acid type discrimination based on 13C chemical shifts only is
highly ambiguous: while glycine, alanine, serine, and threonine
residues can be reasonably well differentiated, other amino acid
types cannot be unambiguously identified, and a probabilistic treat-
ment is required for the assignment algorithm.
Bearing in mind the limitations of the commonly used assign-
ment protocols, we and others have developed techniques to
improve data collection and analysis for fast and highly automated
backbone resonance assignment (1–3). In this chapter, we focus on
the BATCH strategy (3) recently developed in our laboratories.
The BATCH strategy has been designed to collect a minimal data-
set sufficient to achieve complete backbone resonance assignment.
BATCH benefits from a suite of individual spectroscopic and com-
putational tools, each devoted to the time optimization of a spe-
cific task. In BATCH, sequential connectivity and amino acid type
information are obtained from separate types of experiments, a
situation that contrasts with the conventional strategy. The BATCH
strategy is particularly efficient and robust for fast assignment of
small-to-medium sized 15N/13C labeled proteins.
2. Description of the Various Tools Implemented in BATCH
In this section, we briefly describe the set of methods implemented
in BATCH allowing for significantly faster data collection and analysis.
We emphasize the main characteristics of the introduced techniques
and describe their benefits in the context of the BATCH strategy.
For more details, the interested reader is advised to refer to the
original publications of the BATCH protocol (3) and of the respective
tools (4–8).
2.1. The BEST Principle for Fast Pulsing Multidimensional NMR
Sequential resonance assignment is based on a set of 3D triple resonance
(H–N–C) experiments. One crucial parameter governing the
overall experimental time is the interscan (or recycling) delay that is
required for magnetization recovery between successive repetitions
of the pulse sequence. In order to maximize the signal-to-noise ratio
per unit of time (sensitivity), this delay is usually set to ~1–1.5 s,
accounting for the average 1H T1 in proteins. It has been demon-
strated that in experiments that excite and detect amide protons,
amide 1H T1 can be significantly reduced by leaving all nonamide
protons unperturbed throughout the pulse sequence (9–11). This
so-called longitudinal relaxation enhancement effect is mainly due
to 1H–1H dipolar interaction-mediated polarization (energy) trans-
fer from the excited amide 1H to other nearby 1H in thermal equi-
librium and chemical exchange between labile amide protons and
water protons. This observation has led to the development of
BEST-type HNC correlation experiments used for backbone assign-
ment (6, 7). BEST pulse sequences differ from conventional pulse
sequences by the extensive use of shaped pulses for amide 1H band
selective excitation, inversion, and refocusing (PC9 (12), EBURP2
(13) and REBURP (13) shapes), as well as pairs of broadband inver-
sion pulses (BIP (14)) that achieve minimal perturbation of aliphatic
and water 1H magnetizations. Composite broadband 1H decoupling
sequences, such as WALTZ or DIPSI, are incompatible with the
BEST effect. Therefore, another difference in BEST pulse sequences
is the use of a simple 1H inversion (BIP) pulse for refocusing 1H–15N
or 1H–13C coupling evolution. This slightly reduces the overall sen-
sitivity of BEST triple-resonance experiments for fast relaxing sys-
tems such as high molecular weight proteins. In BEST experiments,
maximal sensitivity is obtained for interscan delays of ~200–400 ms.
This yields a reduction in the overall experimental time by a factor of
~3 without compromising spectral resolution and sensitivity.
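The shift of the optimal interscan delay can be illustrated with a short numerical sketch. It assumes mono-exponential recovery with a single effective amide 1H T1 and neglects the duration of the pulse sequence itself; the two T1 values used (1.0 s and 0.25 s) are illustrative assumptions, not values taken from this chapter.

```python
import numpy as np

def optimal_recycle_delay(t1_eff, delays=None):
    """Return the interscan delay (s) that maximizes sensitivity per unit time.

    Assumptions (not from the chapter): mono-exponential recovery,
    M(d) = 1 - exp(-d / T1), and a time per scan equal to the delay d.
    """
    if delays is None:
        delays = np.linspace(0.01, 5.0, 5000)
    # Signal per scan divided by the square root of the time per scan.
    sensitivity = (1.0 - np.exp(-delays / t1_eff)) / np.sqrt(delays)
    return float(delays[np.argmax(sensitivity)])

conventional = optimal_recycle_delay(1.0)   # assumed T1 without enhancement
best = optimal_recycle_delay(0.25)          # assumed effective T1 under BEST conditions
print(f"conventional: {conventional:.2f} s, BEST: {best:.2f} s")
```

Under these assumptions the optimum lies near 1.25 times the effective T1, so shortening T1 by longitudinal relaxation enhancement moves the best delay from the second range into the range of a few hundred milliseconds.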
2.2. ASCOM 15N Spectral Width for Rapid Sampling of the 15N Dimension
3D triple resonance experiments can be viewed as a repetitive
recording of 15N HSQC spectra with an amplitude modulation of
the 1H–15N correlation peaks according to the 13C chemical shift
evolution during an incremented time delay. Therefore, once the
1H–15N chemical shift pairs are known from a single 15N HSQC
spectrum, this information can be exploited to optimize the 15N
spectral width for subsequent triple-resonance experiments. For a
spectral width that has been deliberately chosen to be smaller than
the actual chemical shift range, the position of all peaks in the spec-
trum can be accurately predicted from the well-known aliasing
property of complex Fourier transformation. The ASCOM tool (8)
has been developed for optimal spectral compression leading to the
smallest possible 15N spectral width without creating any additional
peak overlap. In practice, the 15N HSQC spectrum is subjected to
peak-picking. The resulting peak list is then used to simulate and
analyze the 15N HSQC spectrum with various 15N spectral widths.
The smallest 15N spectral width value providing no new peak over-
lap is selected. The optimized 15N spectral width can then be used
in subsequent 3D triple resonance experiments to reach the same
15N spectral resolution with a reduced number of increments
(reduced experimental time).
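The search described above can be sketched in a few lines. This is not the ASCOM implementation: the function names, the overlap tolerances (0.03 ppm in 1H, 0.3 ppm in 15N), and the rule of centering the window on the mean 15N shift are all assumptions made for illustration.

```python
import numpy as np

def fold(shifts, center, sw):
    """Aliased position (ppm) of each shift for a window of width sw centred on center."""
    return (shifts - center + sw / 2.0) % sw - sw / 2.0 + center

def count_overlaps(h, n, sw=None, center=None, tol_h=0.03, tol_n=0.3):
    """Count peak pairs closer than (tol_h, tol_n) ppm; 15N is aliased if sw is given."""
    h = np.asarray(h, float)
    n = np.asarray(n, float)
    dh = np.abs(h[:, None] - h[None, :])
    dn = np.abs(n[:, None] - n[None, :])
    if sw is not None:
        nf = fold(n, center, sw)
        dn = np.abs(nf[:, None] - nf[None, :])
        dn = np.minimum(dn, sw - dn)  # distance measured across the aliasing boundary
    close = (dh < tol_h) & (dn < tol_n)
    return int(np.triu(close, k=1).sum())

def smallest_sw(h, n, candidates, center=None, **tol):
    """Smallest candidate 15N width (ppm) that creates no overlap beyond the full spectrum."""
    center = float(np.mean(n)) if center is None else center
    reference = count_overlaps(h, n, **tol)
    for sw in sorted(candidates):
        # Aliasing can only add overlaps, so an equal count means no new overlap.
        if count_overlaps(h, n, sw, center, **tol) <= reference:
            return sw
    return None

# Hypothetical peak list: peaks 0 and 1 share a 1H shift and differ by 10 ppm in 15N.
h_ppm = [8.1, 8.1, 7.5, 9.0]
n_ppm = [110.0, 120.0, 125.0, 130.0]
chosen = smallest_sw(h_ppm, n_ppm, [2.0, 2.5, 3.0, 4.0])
```

In this toy list, widths of 2.0 and 2.5 ppm alias the two peaks that share a 1H shift onto the same 15N position, so the search moves on to the next candidate.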
2.3. COBRA for Automated Extraction of Sequential Connectivities
In the conventional approach, peak picking of the 3D Fourier-transformed
(FT) matrices is usually performed after data collection,
and 13C chemical shifts are compared between spectra to
identify sequential connectivities. To facilitate this assignment step,
we introduced the COBRA algorithm (5) that directly extracts
sequential correlation information from the raw data. During the
COBRA procedure, the 3D time domain data are Fourier trans-
formed along the 1H and 15N dimensions, and the 13C time domain
signal is extracted for each of the n individual 15N HSQC cross
peaks (residues). Then, for a pair of intraresidue and sequential
experiments, a correlation (COBRA) map is computed to yield an
n × n matrix with elements given by the (weighted) correlation
coefficients LC(i,j) of the two time domain signals, Si(t) and Ij(t),
extracted from the sequential and intraresidual experiments at the
1H/15N positions of cross peaks i and j, respectively. The following
equation is used for LC(i,j):
$$LC(i,j) = \mathrm{corr}(S_i, I_j)\, e^{-\left(\varphi/\varphi_{cut}\right)^{4}}$$
where φ refers to the angular phase of the complex correlation
coefficient (corr) of the two traces Si(t) and Ij(t). corr is calculated
as $\mathrm{cov}(S_i, I_j)/\sqrt{\mathrm{var}(S_i)\,\mathrm{var}(I_j)}$, with cov and var being the covariance
and variance of the two signals, respectively. The LC coefficient
is a real number and ranges from 0 (for uncorrelated signals,
i.e., weak “probability” for sequential connectivity between
1H/15N cross peaks j and i) to 1 (for perfect correlation, i.e., possible
“physical” connectivity). The phase weighting function depends on
a user-adjustable φcut phase cutoff parameter and allows better frequency
discrimination (resolution) for close but distinct frequencies
(corr close to 1 and φ close to 0°) while leaving the coefficient
unchanged for identical frequency composition (φ = 0°). A small
value of φcut leads to improved frequency discrimination. However,
when taking noise into account, too small a φcut value may lead to
vanishing COBRA element values for traces containing peaks of
identical frequency but low signal intensity. Overall, the informa-
tional content of the COBRA map is limited by the experiment
with the lowest signal-to-noise ratio. As an intuitive but extreme
illustration, the COBRA map calculated on two H–N–C experi-
ments with one of them containing only noise is completely unin-
formative. In the current version of COBRA, the phase cutoff
parameter is set by the user and has to be manually optimized for
the signal-to-noise ratio of the two experiments. An attempt for
automated optimization of this adjustable parameter is also
included in the current version of BATCH and will be presented
elsewhere. When several pairs of experiments are available, all indi-
vidual maps (corresponding to different 13C nuclei) are combined
together, resulting in a final COBRA map representing the sequen-
tial connectivity information contained in the whole set of NMR
data. COBRA performs best with traces containing a minimal
number of signals. Therefore, intraresidue experiments (e.g., iHNCA)
are preferred over their bidirectional counterparts (e.g., HNCA),
and the CA→CB transfer in HN(CO)CACB and iHNCACB is
adjusted so as to select only correlations with CB and not with CA
frequencies (except for glycine residues). The extraction of sequen-
tial connectivities from triple resonance experiments using the
COBRA approach is very fast (a few seconds) and is used
in the BATCH strategy to significantly speed up NMR data
analysis.
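The phase-weighted correlation map can be sketched as follows. This is a minimal illustration rather than the COBRA code: the function name and the synthetic traces are hypothetical, and the modulus of corr is taken so that LC is real, which is an assumption about the intended form of the equation.

```python
import numpy as np

def cobra_map(seq_traces, intra_traces, phi_cut_deg=10.0):
    """Phase-weighted correlation map LC(i, j) between two sets of 13C time-domain traces.

    seq_traces, intra_traces: complex arrays of shape (n_peaks, n_points).
    Assumption: LC uses |corr| so that the result is real and lies in [0, 1].
    """
    S = np.asarray(seq_traces, dtype=complex)
    I = np.asarray(intra_traces, dtype=complex)
    S = S - S.mean(axis=1, keepdims=True)
    I = I - I.mean(axis=1, keepdims=True)
    cov = S @ I.conj().T / S.shape[1]            # cov[i, j] = <S_i(t) I_j(t)*>
    var_s = np.mean(np.abs(S) ** 2, axis=1)
    var_i = np.mean(np.abs(I) ** 2, axis=1)
    corr = cov / np.sqrt(np.outer(var_s, var_i))
    phi = np.degrees(np.angle(corr))             # angular phase of corr, in degrees
    return np.abs(corr) * np.exp(-(phi / phi_cut_deg) ** 4)

# Synthetic, noise-free traces (hypothetical frequencies in Hz, 0.5 ms dwell time).
t = np.arange(64) * 0.5e-3
intra = np.exp(2j * np.pi * np.outer([200.0, 260.0, -150.0], t))
seq = np.exp(2j * np.pi * np.outer([-150.0, 200.0, 260.0], t))
lc = cobra_map(seq, intra)
predecessors = lc.argmax(axis=1)  # for each peak, index of the best-matching intraresidue trace
```

Each row of the map then points to the cross peak whose intraresidue trace best matches the sequential trace of that row, which is the connectivity information used in the subsequent assignment step.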
2.4. Targeted Sampling Approach for Rapid Sampling of the 13C Dimension
In contrast to FT processing, the COBRA processing tool is compatible
with regular but incomplete sampling of the indirect 13C
dimensions (5). Typically, the 13C time domain is sampled over two
time regions (targeted sampling): t1 = 0 to 3 ms (standard STD
region) and t1 = 25 to 28 ms (time-shifted TS region). This second time
window is chosen such that the signal attenuation due to evolution
under 13Cα–13Cβ scalar coupling is negligible ($\cos(\pi J_{CC} t_1) \cong -1$).
A COBRA map is calculated separately for each time window and
subsequently combined. The additional sampling of long t1 values
in the TS time window improves frequency discrimination in the
final COBRA map compared with the STD window-only based
COBRA map. The targeted sampling approach combined with
COBRA analysis provides a high level of frequency discrimination
while requiring only a small number of t1 increments in the 13C
dimension. In practice, the targeted sampling scheme is applied to
the most sensitive pair of experiments (H–N–CA).
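The choice of the two time windows can be checked with a short calculation of the coupling modulation. The coupling constant of 35 Hz used below is an assumed typical one-bond 13Cα–13Cβ value and is not stated in the text.

```python
import numpy as np

J_CC = 35.0  # Hz; assumed typical one-bond 13Ca-13Cb coupling (not given in the chapter)

def coupling_modulation(t1_ms, j_hz=J_CC):
    """cos(pi * J * t1) factor for evolution under a single 13C-13C scalar coupling."""
    return np.cos(np.pi * j_hz * np.asarray(t1_ms, dtype=float) * 1e-3)

std_window = coupling_modulation([0.0, 3.0])    # edges of the STD region (ms)
ts_window = coupling_modulation([25.0, 28.0])   # edges of the TS region (ms)
print(std_window, ts_window)
```

With this coupling, 1/J is about 28.6 ms, so the TS window sits just before the point where the cosine reaches −1: the signal is inverted but close to full amplitude, whereas intermediate t1 values near 1/(2J) would give almost no signal.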
2.5. The HADAMAC Experiment for Amino Acid Typing
Amino acid type information for individual residues (15N HSQC
cross peaks) is required to assign fragments of sequentially connected
1H–15N frequency pairs to a particular location on the protein
sequence. In the BATCH strategy, we use the recently
introduced HADAMAC experiment (4) for amino acid-type dis-
crimination. In this experiment, the 15N HSQC cross peaks are
edited along an additional “amino acid-type” dimension according
to the amino acid type of the preceding residue in the protein

sequence. The HADAMAC experiment results in six 1H–15N
planes corresponding to the following amino acid groups: (1)
Gly, (2) Ser, (3) Thr, (4) Ala/Val/Ile(AVI), (5) Asn/Asp(Asx),
and (6) Cys and Aromatic residues (Cys-Arom), and all other
residues (Rest). In plane (6) cross peaks corresponding to the
Cys-Arom and Rest groups have opposite signs, allowing their
separation. The HADAMAC pulse sequence, based on a
CBCACONH transfer experiment, exploits differences
in the 13Cβ chemical shift range and in spin topology (numbers
of 1H attached to 13Cβ and 13Cα atoms; numbers and types
(carbonyl, aromatic, or aliphatic) of 13Cγ atoms) to achieve this
selection. Four spin manipulations (filters) are independently
applied to change the relative signs of signals originating from
different amino acid groups. The filters are applied using a
Hadamard encoding scheme to ensure maximal sensitivity in a
short overall experimental time. Recently, we have introduced an
improved version of the HADAMAC experiment that yields
slightly improved sensitivity (HADAMAC-2 (3)), especially for
larger proteins. The principal advantage of HADAMAC with
respect to conventional 13C chemical shift based methods is that a
high level of amino acid type discrimination is achieved from a
simple visual (or automated) inspection of the six HADAMAC
1H–15N planes.
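The principle of Hadamard encoding and decoding can be illustrated with a generic example. The sketch below is not the HADAMAC sign table: the order-8 Sylvester matrix, the component intensities, and the function names are all assumptions chosen for illustration. It only shows that, when every recorded scan contains all components with known relative signs, each component can be recovered by a signed sum of the scans.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n must be a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def encode(components, h):
    """Each recorded scan is a signed sum of all component signals."""
    return [sum(sign * c for sign, c in zip(row, components)) for row in h]

def decode(scans, h):
    """Recover component j as sum_k h[k][j] * scans[k] / n."""
    n = len(h)
    return [sum(h[k][j] * scans[k] for k in range(n)) / n for j in range(n)]

h8 = hadamard(8)
# Hypothetical intensities of one cross peak in eight encoded components.
components = [5.0, 0.0, 2.0, 0.0, 0.0, 1.5, 0.0, 3.0]
scans = encode(components, h8)
recovered = decode(scans, h8)
print(recovered)
```

Because all scans contribute to every decoded component, no measurement time is spent on scans that carry signal from only one amino acid group.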
2.6. 13C Chemical Shift Extraction
To extract the 13C chemical shift information present in the available
triple resonance experiments, an efficient algorithm has been
developed (3). Briefly, 1D 13C time domain data are extracted at
the 1H/15N position of each 15N HSQC cross peak from a given 3D
experiment. These traces are then subjected to Fourier transformation,
resulting in complex-valued S(ω) traces. Analogous to COBRA
processing, the traces are weighted as

$F(\omega) = \Re(S(\omega)) \, e^{-\left(\phi(\omega)/\phi_{cut}\right)^{4}}$

where φ(ω) corresponds to the angular phase of S(ω). The phase
weighting transformation greatly improves frequency resolution.
Using the same φcut as for COBRA processing is recommended. In
the case of delayed acquisition (TS region defined as
$t_1 = t_1^0 + k \, \Delta t_1$), zero and first order phase corrections
are required and are calculated as $\phi_0 = \pi \, t_1^0 / \Delta t_1$
and $\phi_1 = 2\pi \, t_1^0 / \Delta t_1$. An additional signal
inversion (180° zero-order phase correction) is applied to account
for sign inversion due to 13C–13C scalar coupling evolution. When
the STD and TS regions are collected for the H–N–CA pair, the
traces obtained from the two regions are multiplied point-by-point,
resulting in a single trace. In principle, a given 13C nucleus gives
rise to peaks in sequential and intraresidue experiments at the
1H/15N positions of sequential residues. Prior to chemical shift
extraction, the two corresponding traces are also multiplied. One
13C chemical shift is extracted for every trace (residue) as the
frequency of maximum (absolute) amplitude.
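The phase-weighted extraction of a single 13C frequency can be sketched as follows. This is a minimal illustration on synthetic data, not the BATCH implementation: the real part of each spectral point is damped by an exponential function of its phase relative to the cutoff, and the frequency of maximum absolute amplitude is reported. The exponent (4 by default here), the spectral width, the decay constant, and all function names are assumptions.

```python
import cmath
import math

def dft(trace, n_freq):
    """Zero-filled discrete Fourier transform evaluated at n_freq frequencies."""
    return [sum(x * cmath.exp(-2j * math.pi * k * m / n_freq)
                for m, x in enumerate(trace))
            for k in range(n_freq)]

def phase_weight(spectrum, phi_cut_deg, power=4):
    """Real part of each point, damped according to its phase angle.
    The exponent 'power' is an assumed value exposed as a parameter."""
    phi_cut = math.radians(phi_cut_deg)
    return [s.real * math.exp(-(abs(cmath.phase(s)) / phi_cut) ** power)
            for s in spectrum]

def peak_frequency(weighted, sw_hz):
    """Frequency (Hz, relative to the carrier) of maximum absolute amplitude."""
    n = len(weighted)
    k = max(range(n), key=lambda i: abs(weighted[i]))
    if k >= n // 2:          # upper half of the DFT holds negative frequencies
        k -= n
    return k * sw_hz / n

# Synthetic trace: 10 complex points, one decaying signal 750 Hz from the carrier.
sw_hz = 3000.0
trace = [cmath.exp(2j * math.pi * 750.0 * m / sw_hz) * math.exp(-m / 20.0)
         for m in range(10)]
weighted = phase_weight(dft(trace, 64), phi_cut_deg=15.0)
shift_hz = peak_frequency(weighted, sw_hz)
print(shift_hz)
```

With only 10 time-domain points a plain Fourier transform gives a broad line. The phase weighting keeps only points whose phase is close to zero, which narrows the response around the true frequency.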
3. Experimental Setup
3.1. Protein Sample
Protein backbone resonance assignment using the BATCH strategy
requires a 15N/13C labeled protein sample, typically in the
concentration range of 100 μM to a few mM. Since high-level deuteration
of aliphatic protons is incompatible with the BATCH strategy, this
approach is best suited for proteins below ~20 kDa. Aliphatic pro-
tons are required as a relaxation source for amide protons in BEST-
type triple-resonance experiments, and as starting magnetization
for the amino acid-type edited HADAMAC experiment. A file
containing the sequence of the protein in NMRView format is also
needed.
3.2. NMR Hardware and Pulse Sequences
The time efficiency of the assignment relies on the spectrometer
sensitivity. Therefore, high-field spectrometers equipped with cryogenic
probes are preferred. A 600 MHz spectrometer may represent a
good trade-off to limit the loss of sensitivity due to the CSA-induced
increase in 13CO transverse relaxation at high field
strengths. In the following sections, we assume that the spectrometer
(Bruker or Agilent) is equipped with Topspin 2.1 (or newer)
or VNMRJ (installed with the latest version of BioPack) software.
We also assume that pulses on the 15N and 13C channels have been
correctly calibrated and that 1H amplifiers have been linearized.
This is particularly important for automated calibration of shaped
pulses in BEST-type sequences.
For the BATCH strategy, several NMR pulse sequences should
be available on the spectrometer. They include the BEST versions
for 15N HSQC and the following triple resonance experiments:
HN(CO)CA, iHNCA, HN(CO)CB, iHNCB, HNCO, and
iHNCO as well as HADAMAC. The iHNCA (iHNCB, and
iHNCO) only retains the intraresidual H–N–C coherence transfers
as described elsewhere (15). The HN(CO)CB experiment is identical
to HN(CO)CACB with the exception of the longer Cα→Cβ
transfer delay for performing a full CA→CB transfer. Therefore, in
HN(CO)CB (and iHNCB) only one cross peak is present for each
1H/15N frequency pair. In principle, the sensitivity enhanced
versions of these pulse sequences should be used. However, for fast
relaxing systems, shorter INEPT-based versions can be used to
obtain higher signal to noise ratios. The HADAMAC experiment
should be collected first on a protein sample with known resonance
assignments for the assignment of each plane to the corresponding
amino acid group. For VNMRJ users, all pulse sequences are provided
within the latest updated BioPack pulse sequence library. For
Bruker users, most experiments are directly available for Topspin
2.1 and later versions. Pulse sequences are also available upon
request from the authors of this chapter.
3.3. Software
The BATCH strategy requires the ASCOM tool for 15N spectral
optimization, which consists of a simple Perl script that can be run
on the spectrometer. The software is available at the Web site
http://www.icsn.cnrs-gif.fr/download/nmr. For Agilent spectrometers
equipped with the most recent versions of BioPack, the macro
BestSW allows for automated peak picking of 2D 15N HSQC spectra
and provides the ASCOM optimized 15N spectral width. The
BATCH software platform is written in Tcl and is embedded
in the NMRView software (http://www.onemoonscientific.com/nmrview)
(16) as a new functionality to make use of the variety
of NMRView native functions. BATCH was validated for the Aqua
(for MacOS) and C (Linux system) versions of the NMRView software
and is also available for the platform independent Java version.
The BATCH software package can be downloaded from the Web site
http://www.icsn.cnrs-gif.fr/download/nmr. The downloaded package
includes instructions for installation, a manual, and a tutorial.
The NMRPipe processing software
(http://www.nmrscience.com/nmrpipe.html) (17) is also required to
execute scripts generated by BATCH. The final assignment can be
carried out using the native algorithm embedded in the BATCH software.
Alternatively, the BATCH software provides a convenient interface for
the Mars
(http://www.mpibpc.mpg.de/groups/zweckstetter/_links/software_mars.htm)
(18) or SmartNotebook (http://www.bionmr.ualberta.ca/bds/software/snb/)
(19) software packages that may also be installed. For the following
description, we assume that the user has a minimal proficiency with
NMRView and NMRPipe software.
4. NMR Data Collection
In the following sections, specific commands for Topspin and
VNMRJ are given in small caps and italics, respectively, for exam-
ple: PULSECAL and pulsecal.
4.1. Experimental Setup and Pulse Calibration
After sample injection and temperature regulation, the probe is
tuned and shimmed. The 90° 1H pulse is calibrated, typically by
determining the 360° pulse duration measured at the on-resonance
water frequency or by using the Bruker PULSECAL tool. A
water-suppressed 1H spectrum is collected as a first evaluation of
spectral quality, and for 1H chemical shift calibration if an internal
reference (e.g., DSS, TSP) is present.
4.2. The 15N HSQC Experiment
In a new folder, load the 15N HSQC experiment (RPAR BHSQC,
best_Nhsqc). Amide 1HN band selective pulses are adjusted to cover
the 1HN spectral width (typically 4 ppm centered at 8.5 ppm). Care
is taken, however, to avoid water saturation for an efficient BEST
effect and solvent suppression. The correct pulse calibrations for the
1H and 15N channels are set (GETPROSOL 1H 9 1.5; pw = 9). For Bruker, the
getprosol command automatically generates the power level for all
shaped pulses present in the loaded pulse sequence. For VNMRJ,
the power levels are automatically adjusted “on the fly” using the
Pbox tool. The 15N HSQC experiment is run using the following
parameters by default: recycling delay (d1) of 250 ms, 15N spectral
width of 35 ppm centered at 118 ppm, 2 scans, 16 dummy scans,
and 80–100 increments in the 15N dimension. For cryogenic probe
safety, it is advised to reduce the acquisition time (AQ, at) to less
than 100 ms and the power level of the 15N decoupling sequence
applied during acquisition. The 15N HSQC is acquired within
~1–2 min. After double FT processing, the spectrum is visually
inspected in terms of averaged signal-to-noise ratio, intensity het-
erogeneity, and spectral dispersion to assess the amenability of
sequential resonance assignment through the BATCH approach.
Possible aliasing of backbone 1H/15N cross peaks is detected and
the experiment is possibly run again with increased 15N spectral
width. This step is important to extract exact 15N frequencies before
ASCOM optimization. Care should be taken with side chain H–N
moieties (arginine, asparagine, and glutamine residues). The
automatically optimized 15N spectral width is obtained by ASCOM either
“on the fly” (BestSW) or after peak picking and analysis (Bruker),
and should be noted for use in the following experiments. The
extent of spectral compression by the ASCOM software should be
carefully set. In the course of COBRA processing, and to increase
the signal-to-noise ratio, the residue-specific 13C time domain is
obtained by the integration of 3D matrices along the 1H and 15N
dimensions (for the typical 1H/15N line widths) around the 1H/15N
chemical shift position. To avoid cross talk between 13C time
domains corresponding to different residues, the cross peaks should
be as well separated as possible in the 1H/15N planes. Therefore, the
cutoff parameters (RH and RN), which define whether two 1H/15N cross
peaks overlap in the ASCOM software, are set to the averaged
1H and 15N line widths to limit partial peak overlap.
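The idea of choosing a reduced 15N spectral width that does not create overlap after aliasing can be sketched as below. This is not the ASCOM Perl script. It is a simplified illustration in which peaks are aliased into each candidate window and pairs closer than RH and RN are counted. The peak list, the candidate widths, the cutoff values, and all function names are hypothetical.

```python
def alias(shift_ppm, center_ppm, sw_ppm):
    """Position of a peak after aliasing into a window of width sw_ppm."""
    low = center_ppm - sw_ppm / 2.0
    return low + (shift_ppm - low) % sw_ppm

def count_overlaps(peaks, center_ppm, sw_ppm, r_h, r_n):
    """Number of peak pairs closer than r_h (1H) and r_n (15N) after aliasing."""
    folded = [(h, alias(n, center_ppm, sw_ppm)) for h, n in peaks]
    count = 0
    for a in range(len(folded)):
        for b in range(a + 1, len(folded)):
            d_h = abs(folded[a][0] - folded[b][0])
            d_n = abs(folded[a][1] - folded[b][1])
            d_n = min(d_n, sw_ppm - d_n)   # the aliased 15N axis is circular
            if d_h < r_h and d_n < r_n:
                count += 1
    return count

def best_spectral_width(peaks, center_ppm, candidates, r_h, r_n):
    """Candidate with the fewest overlaps; ties are resolved by the smallest width."""
    return min(candidates,
               key=lambda sw: (count_overlaps(peaks, center_ppm, sw, r_h, r_n), sw))

# Hypothetical (1H ppm, 15N ppm) peak list and cutoffs set to assumed line widths.
peaks = [(8.20, 110.0), (8.21, 120.0), (8.90, 125.0), (7.50, 115.0)]
best = best_spectral_width(peaks, center_ppm=118.0,
                           candidates=[5.0, 8.0, 10.0, 35.0], r_h=0.03, r_n=0.3)
print(best)
```

In this toy list the two peaks near 8.2 ppm 1H collapse onto each other for some candidate widths but remain separated for others, which is why the cutoffs RH and RN matter for the choice of the compressed width.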
4.3. The HADAMAC Experiment
In a new folder, the HADAMAC pulse sequence (RPAR HADAMAC,
hadamac) is loaded, preferably the HADAMAC-2 version for
increased sensitivity. Set the correct pulse power values (GETPROSOL
1H 9 1.5; pw = 9), the number of repetitions to 2, the relaxation
delay d1 to 0.8–1 s, the 15N spectral width (ASCOM-optimized or
not), and the maximum number of 15N increments as allowed by
the constant-time period inserted in the pulse sequence for
increased resolution. For VNMRJ, set had_flg to (1, 2, …, 8),
phase = 1,2 and array=“(had_flg, phase)”. Depending on the chosen
15N spectral width, the HADAMAC experiment lasts from
~30 min to 1 h. This time can be used for the initial setup of
BATCH, inspection of the 15N HSQC spectrum (Subheading 5.2),
and the setup of triple resonance experiments.
4.4. The Triple Resonance Experiments
In a new folder, the sequential BEST HN(CO)CA pulse sequence
is loaded (RPAR BHNCOCA, best_hncocaP). Set the correct pulse
power values (vide supra), the number of repetitions to 2, the
relaxation delay (d1) to 250 ms, the ASCOM-optimized 15N spectral
width with the maximum number of increments as allowed by
the constant-time period inserted in the pulse sequence, and a typi-
cal 13C spectral window (20 ppm) centered at 56 ppm. The initial
evolution delay under 13C chemical shifts is set to 0 (d0 = 0, d1 = 0),
and 10 complex points are recorded to sample the 0…3 ms time
region (for a 600 MHz 1H field strength). Repeat the procedure
for the BEST iHNCA pulse sequence. This pair of experiments
represents the standard (STD) H–N–CA pair. For correct COBRA
processing, the 13C time domain of the sequential and intraresidual
experiments should be collected in an exactly identical manner,
including the pulse sequence element for 13C chemical shift evolu-
tion, sampling points, and the 13C carrier frequency. The phase
weighted correlation coefficient embedded in COBRA is able to
discriminate 13C signals that are 180° phase shifted (corresponding
to peaks of opposite sign in frequency domain). Such sign inver-
sion occurs, for example, when the first t1 time value is set to half
dwell in the 15N dimension in the case of aliased cross peaks. For
correct COBRA processing, the first t1 time value should be set to
0 ms. Using the same procedure, the time shifted (TS) H–N–CA
pair of experiments is prepared in two additional folders. This pair
is exactly identical to the previous STD H–N–CA pair except that
the initial evolution delay under 13C chemical shifts is set to 25 ms
(d0 = 25 m, d1 = 25 m) instead of 0 ms. Alternatively, the TS time
region can be stored in the same files as the STD H–N–CA and can
also be analyzed by the BATCH software. The same procedure is
then applied to prepare the H–N–CB pair, consisting of the BEST
HN(CO)CB and iHNCB pulse sequences and the H–N–CO pair,
consisting of the BEST HNCO and iHNCO pulse sequences. For
the 13Cb dimension, set the spectral window to 60 ppm centered at
46 ppm and collect 30 complex points to sample the t = 0…3 ms
time region (for a 600 MHz 1H frequency). For the carbonyl 13C
dimension, set the spectral window to 10 ppm centered at 176 ppm
and collect 10 complex increments. Owing to the intrinsic low
sensitivity of the iHNCB and the iHNCO experiments, the num-
ber of transients may be set to 4. Launch all experiments in a row.
The time required for acquisition of the triple resonance data can
be used to process and analyze the HADAMAC experiment
(Subheading 5.2).
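The relation between spectral window, number of complex points, and sampled time region can be verified with a short calculation. The sketch below assumes a 13C frequency of 150.9 MHz (corresponding to 600 MHz 1H) and that the first point is sampled at t1 = 0. The function name is hypothetical.

```python
def last_t1_ms(n_points, sw_ppm, carbon_freq_mhz=150.9):
    """Last sampled t1 value (ms) for n_points complex points starting at t1 = 0.
    The default 13C frequency is an assumed value for a 600 MHz spectrometer."""
    sw_hz = sw_ppm * carbon_freq_mhz      # ppm * MHz gives Hz
    dwell_s = 1.0 / sw_hz
    return (n_points - 1) * dwell_s * 1e3

for label, n_points, sw_ppm in (("CA", 10, 20.0), ("CB", 30, 60.0), ("CO", 10, 10.0)):
    print(label, f"{last_t1_ms(n_points, sw_ppm):.2f} ms")
```

Under these assumptions the CA and CB settings both cover roughly the 0 to 3 ms region, whereas 10 points over a 10 ppm carbonyl window extend to about 6 ms.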
After completion of the first pair of triple-resonance experi-
ments (STD H–N–CA), they can be processed (Subheading 5.2),
analyzed using the COBRA algorithm (Subheading 6.2), and a
first attempt at resonance assignment can be done (Subheading 7).
Every newly collected pair can be immediately processed and
analyzed with the BATCH software. Data are collected in an iterative
way until complete assignment is achieved. If the resonance
assignment is not complete after collection of the entire series
of experiments, the spectroscopist has to identify the reasons and
possibly collect additional data for increased signal-to-noise ratio.
For example, fewer than 10–20% assigned resonances likely indicates a
low signal-to-noise ratio in at least one of the datasets, which may be
collected again and added to the previously collected datasets.
5. Setup and Raw Data Processing in BATCH
Here we assume that NMRView, including the BATCH module,
has been successfully installed on a computer (either on the spec-
trometer or a remote personal computer), and that the collected
raw data are accessible on the same computer. We further assume
that all other software packages, mentioned in Subheading 3.3,
have been successfully installed and that the main BATCH win-
dow is open (Analysis > BATCH, Fig. 1a). This window consists of
several sections for parameter settings and specific actions required
for data processing and analysis. It provides a convenient interface
to the external NMRPipe, MARS, and SmartNoteBook software
packages.
We illustrate the BATCH backbone assignment strategy on
NMR data collected on the 77-residue Hyl1 protein (20). The data
include the 15N HSQC, the HADAMAC and the pair of H–N–CA
experiments with 13C dimensions sampled on the STD (0…3 ms)
and TS (25…28 ms) time regions. The experimental data are also
provided in the BATCH software package as a tutorial for
training. In the following sections, a button click
is indicated as an underlined command (e.g., Load Setup).
5.1. General Parameters
A few paths have to be set in the PATH Window (Fig. 1b): the
location of the directory containing BATCH scripts (see installation
procedure), the base directory for NMR data containing one
folder per experiment, and the sequence file. An additional path can
be set for already existing 2D peak lists in the xpk (NMRView)
format. Define the names of the folders containing the collected
experiments. Additional parameter definitions will be automatically
set in the course of preprocessing. The last section of this window
allows the setting of a signal-free region of the 15N HSQC and of
the integration box. The signal-free region is defined by a 1H/15N
cross peak position and is used to estimate the noise in H–N–C spectra
for the automated phase cutoff setting algorithm during COBRA
processing. Owing to the extensive folding in the 15N dimension in
H–N–C experiments, it is advised to locate the cross peak in a
signal-free region of the 1H projection of the 15N HSQC. The
integration box refers to the spectral integration of the 3D H–N–C
spectra around each 1H/15N cross peak for 13C time domain extraction
(by default 0.015 ppm and 0.1 ppm in 1H and 15N dimensions, respectively).
All parameter settings can be saved (Save Setup) in the current
directory for future usage (Load Setup). If available, a 2D peak list
can be loaded from the main window (ReadXPK), and the FT processed
15N HSQC spectrum can be automatically loaded in a new
window (LoadHSQC).
Fig. 1. (a) Main graphical interface of the BATCH software. (b) Window for paths definition and peak integration. (c) Window
for rapid preprocessing of NMR data.
5.2. Preprocessing of the Raw Data
The 15N HSQC, the HADAMAC and the triple resonance experiments
require different preprocessing steps: the 15N HSQC spectrum
is processed using conventional double FT, the HADAMAC
experiment is subjected to double FT followed by Hadamard
decoding, and triple resonance experiments are Fourier trans-
formed along the 1H and 15N dimensions, while leaving the 13C
dimension in the time domain for COBRA processing (files stored
in the cobra folder). An additional Fourier transformation is also
applied along the 13C dimension of triple resonance experiments
(files stored in the ft folder and converted to the NMRView format) for
visualization (Load NvFile). Preprocessing is performed with
NMRPipe in two steps: data conversion to NMRPipe format (fid.com
script file) and data manipulation (nmrproc.com script file).
Since, for a given experimental setup (spectrometer and pulse
sequences), the same NMRPipe processing script files can be used
with minor modifications, the BATCH scripts directory contains
one nmrproc.com-type file for each experiment. These files are
modified and validated for the specific experimental setup only
once and can then be reliably reused for subsequent protein studies.
We emphasize the importance of adequately processing the HADAMAC
experiment for correct subspectra assignment.
The preprocessing step, facilitated by the “Process Window”
(Fig. 1c), consists of going through each collected and previously
defined experiment. Basically, for each recorded dataset, an
nmrDraw window is opened (fid.com) for the preparation of the
conversion script. The label of the 1H and 15N dimensions should
be consistent over all experiments and should be set to “HN” and
“N15” respectively. After adequate parameter adjustment, the
conversion is executed. In the next step, the processing file is
opened (nmrproc.com). This file may be chosen from different
locations: default script library, folder of the current experiment
or from the STD H–N–CA experiment (in the case of TS
H–N–CA). The nmrproc.com script is then adjusted for the spe-
cific protein (mainly 1H phase correction), saved in the correct
subdirectory and executed by NMRPipe (Save & Execute). The
individual spectra can be inspected in NMRDraw (Check in
NMRDraw) for optimal processing. This initial setting is carried
out without linear prediction (LP) in the 15N dimensions to save
time. An additional button (Process All with LP) allows repro-
cessing of all datasets with additional LP. During the preprocess-
ing step, the paths to all newly generated files are automatically set
(see Path window).
6. 2D Peak-Picking, HADAMAC Analysis and COBRA Calculation
6.1. 2D Peak List Generation and Extraction of Amino Acid Type
One unique (2D) peak list is required in BATCH, containing the
protein’s 1H/15N chemical shifts. Although such a peak list can be
obtained directly from the 15N HSQC through the 2D peak-picking
procedure embedded in NMRView, the better spectral dispersion
in the pseudo-3D HADAMAC spectra makes this experiment
well suited to discriminate partially overlapping peaks. The
HADAMAC subspectra are loaded (Show HADAMAC spectra)
and inspected for spectral quality. A dedicated algorithm for the
HADAMAC-based 2D 1H/15N peak list generation is called
(AutoPeakPick), and the resulting peak list is inspected and manually
adjusted (Fig. 2b). The peak list (defined as the Current 2D
peak list in Fig. 1a) is then visualized on the 15N HSQC spectrum
(LoadHSQC). Possible folded peaks in the 15N dimensions are
identified and unfolded. More details about this procedure (spe-
cific to NMRView) are provided in the BATCH manual. The 2D
peak list should contain exact (nonfolded) 15N frequencies for the
subsequent extraction of 13C time domain at the correct 1H/15N
chemical shifts position in triple resonance experiments. The indi-
vidual box sizes, used for the extraction of amino acid type infor-
mation, should be manually adjusted for partially overlapping cross
peaks. The cross-peak-dependent amino acid type information
is stored into the Comment section of the NMRView peak list
window (Transfer HADAMAC information to Current Peaklist).
Fig. 2. (a) 15N HSQC spectrum of Hyl1. (b) Overlay of the HADAMAC subspectra of Hyl1. The amino acid groups are colored
as follows: AVI (black), Gly (red), Thr (yellow), Ser (Blue), Asx (Magenta), Cys-Arom (Green), and Rest (Cyan). Of note, the
cross peaks corresponding to the Cys-Arom and Rest groups are present in the same plane but with opposite signs. The
result from the automated peak picking is shown as boxes with (random) numbering.
The algorithm for amino acid type extraction is fairly conservative,
and for the difficult cases of overlapping peaks (as listed in the
NMRView Console), the assignment should be checked manually.
The HADAMAC-derived information provides a very strong constraint
for the backbone assignment and should be carefully validated
before proceeding to the COBRA analysis.
6.2. COBRA Analysis
The COBRA maps are calculated from the main BATCH window
(Fig. 1a). Each pair of H–N–C experiments (i.e., one COBRA
map) corresponds to one line allowing the setting of three
parameters: the first point (usually 1), the number of points in the 13C
time domain (−1 to use all available data) to be used for COBRA
calculation, and the phase cutoff parameter φcut. In case the STD
and TS sampled regions are stored in the same file for the H–N–CA
pair, an additional list of 13C evolution delays can be defined on the
same dataset. Only selected pairs of experiments are used for
COBRA calculation (Calculate COBRA maps) and the individual
and final maps are displayed (Show COBRA maps). At the same
time the contour level for the visualization is automatically set to
have on average ~1–2 visible COBRA elements per column and
row (Fig. 3a), and can be further adjusted using classical NMRView
tools. The contour level will be used later for binning. Spectral
noise present in the individual triple resonance experiments
propagates through the COBRA calculation. Noise in experimental 13C
traces translates into an overall reduction of the COBRA element values.
In addition, noise leads to a deviation of the angular phase φ
of the correlation coefficient corr when compared with noise-free
13C traces. This deviation is larger for low “averaged” signal-to-noise
ratio and has to be taken into account for setting the phase
cutoff parameter φcut. Default initial cutoff phase values have to be
set to low values (e.g., 15°).
In absence of assignment, the cross peak numbering is random
and real (physical) connectivities are not apparent as a (shifted)
diagonal. In addition, for nonoptimized phase settings, several
nonphysical connectivities (reconstruction artifacts) may appear as
additional nonzero values in the COBRA map and/or correct
connectivities may be absent. For these reasons, the informational
content of a given map is difficult to assess visually. To facilitate the
COBRA processing, a new peak list can be generated (Order
Peaklist) that differs from the current one by the cross peak num-
bering (ordering). The new order of the peaks is guided by the
unambiguous connectivities extracted from the current COBRA
map to create unambiguously defined fragments of different
lengths. An unambiguous connectivity between cross peaks i and j
is obtained if the M(i,j) element is larger than the cutoff value
(defined in the main BATCH window) and if no other values
higher than the cutoff value exist along the ith column and the jth
row. This intensity cutoff value is automatically set during processing
and is also used for contour level definition. Optionally, it can also
be adjusted manually. The same cutoff value is used for binning in
the assignment step (vide infra). After peak ordering and calculation
of the new COBRA maps (Calculate COBRA maps), the identified
fragments are ordered from the longest one to the shortest one as
illustrated in Fig. 3b, c. This ordering step is recommended once a
reasonably good COBRA map is obtained (see rule 1 below).
Fig. 3. COBRA maps calculated from the H–N–CA experiments collected on Hyl1 with different phase cutoff values for the
STD (φSTD) and TS (φTS) region and different peak ordering. (a) (φSTD = 15°; φTS = 15°) and original (random) peak ordering.
(b) (φSTD = 15°; φTS = 15°) and peak ordering obtained after unambiguous fragment identification. (c) (φSTD = 15°; φTS = 45°)
and same peak order as in (b). (d) (φSTD = 15°; φTS = 45°) and peaks ordered according to the final assignment.
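The selection of unambiguous connectivities and their chaining into fragments can be sketched as follows. This is an illustration of the criterion described above, not the BATCH Tcl code. It assumes that element m[i][j] links peak i to peak j, and that a link is unambiguous when it is the only above-cutoff element in both its row and its column. The example matrix and function names are hypothetical.

```python
def unambiguous_links(m, cutoff):
    """Return {i: j} for elements m[i][j] above the cutoff that are the only
    above-cutoff element in their row and in their column."""
    n = len(m)
    above = [[m[i][j] > cutoff for j in range(n)] for i in range(n)]
    links = {}
    for i in range(n):
        for j in range(n):
            if not above[i][j]:
                continue
            row_hits = sum(above[i])
            col_hits = sum(above[k][j] for k in range(n))
            if row_hits == 1 and col_hits == 1:
                links[i] = j
    return links

def build_fragments(links):
    """Chain links i -> j into fragments, sorted from longest to shortest."""
    targets = set(links.values())
    starts = [i for i in links if i not in targets]
    fragments = []
    for start in starts:
        fragment = [start]
        while fragment[-1] in links:
            fragment.append(links[fragment[-1]])
        fragments.append(fragment)
    return sorted(fragments, key=len, reverse=True)

# Hypothetical 5 x 5 COBRA map with values between 0 and 1.
cobra_map = [
    [0.0, 0.0,  0.0, 0.9, 0.0],
    [0.0, 0.0,  0.6, 0.0, 0.7],   # two candidates: ambiguous row
    [0.0, 0.0,  0.0, 0.0, 0.8],   # shares its column with the row above
    [0.0, 0.95, 0.0, 0.0, 0.0],
    [0.0, 0.0,  0.0, 0.0, 0.0],
]
links = unambiguous_links(cobra_map, cutoff=0.5)
fragments = build_fragments(links)
print(links, fragments)
```

Renumbering the peak list so that the members of each fragment are consecutive would place these connectivities next to the diagonal of the recalculated map.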
Table 1
Strategy for setting cutoff phases in COBRA algorithm

| Rule | Observation | Origin | Action |
|------|-------------|--------|--------|
| 1 | A “good” final COBRA map is obtained if the majority (> ~80–90%) of columns and rows contains on average only ~1 element with nonzero intensity | | Try the assignment (Subheading 7) |
| 2A | The COBRA product map contains several (a majority of) columns/rows with only zero element values. This situation corresponds to experiments of low SNR | A subset of COBRA maps shows low SNR | Increase by 15° the phase cutoff parameter for the corresponding pair |
| 2B | Same observation as for 2A | All COBRA maps show low SNR | Increase by 15° the phase cutoff parameters for all pairs |
| 3A | The COBRA product map contains a majority of columns/rows with two or more nonzero values. This situation corresponds to experiments of high SNR but weak frequency discrimination | This observation applies only to a subset of COBRA maps | Decrease by 15° the phase cutoff parameter for the corresponding pairs |
| 3B | Same observation as for 3A | This observation applies to all COBRA maps | Decrease by 15° the phase cutoff parameter for all pairs |
For phase cutoff adjustment, a set of observation rules
and corresponding actions for phase optimization is gathered
in Table 1. One example of a strategy for setting phases in
COBRA processing is described for the case of the Hyl1 protein.
The phase cutoff parameter is initially set to 15° for both pairs
(STD and TS H–N–CA) as illustrated in Fig. 3a. This map con-
tains (visually) a high number of row/columns with only ~1 ele-
ment with a value larger than the intensity cutoff. After peak
ordering (Fig. 3b) several fragments of unambiguously connected
cross peaks are clearly apparent. Nevertheless, several columns at
the extremity of these fragments contain only zero values. By apply-
ing rule 2A, we increase the phase parameter of the TS H–N–CA
pair to 45° (Fig. 3c). This map follows now rule 1 and the first
round of assignment is carried out (Subheading 7). The COBRA
map obtained after assignment based on the available data and
ordering of the cross peaks is shown in Fig. 3d.
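The rules of Table 1 can be expressed as a simple decision function. The sketch below is a simplified reading of those rules, not part of the BATCH software: it classifies rows and columns of a map as empty, single, or multiple, and suggests a change of the phase cutoff. The 80% threshold is taken from the lower bound in rule 1, and the comparison between empty and multiple fractions is an assumption made to keep the example short.

```python
def classify_map(m, cutoff):
    """Fractions of rows and columns with zero, one, or several above-cutoff elements."""
    n = len(m)
    counts = [sum(v > cutoff for v in row) for row in m]
    counts += [sum(m[i][j] > cutoff for i in range(n)) for j in range(n)]
    total = len(counts)
    return {
        "empty": sum(c == 0 for c in counts) / total,
        "single": sum(c == 1 for c in counts) / total,
        "multiple": sum(c >= 2 for c in counts) / total,
    }

def suggest_phase_change(stats, good_fraction=0.8, step_deg=15):
    """Suggested change (degrees) of the phase cutoff, following Table 1 loosely."""
    if stats["single"] >= good_fraction:
        return 0             # rule 1: map is good, try the assignment
    if stats["empty"] > stats["multiple"]:
        return +step_deg     # rules 2A/2B: too many empty rows/columns
    return -step_deg         # rules 3A/3B: too many ambiguous rows/columns

n = 4
good_map = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
empty_map = [[0.0] * n for _ in range(n)]
crowded_map = [[1.0] * n for _ in range(n)]
for name, m in (("good", good_map), ("empty", empty_map), ("crowded", crowded_map)):
    print(name, suggest_phase_change(classify_map(m, cutoff=0.5)))
```

Whether the change is applied to one pair of experiments (rules 2A, 3A) or to all pairs (rules 2B, 3B) depends on inspecting the individual COBRA maps, which this sketch does not attempt.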
Additional considerations are listed below:
(a) Sensitive pairs of experiments (such as STD H–N–CA and
H–N–CB) are processed with lower phase cutoff values com-
pared to less sensitive datasets (TS H–N–CA and H–N–CO).
(b) If a 1H/15N peak gives rise to no signal at all in COBRA maps, it
may correspond to a side chain H–N moiety (in which case it should be
removed from the list) or to a residue with low sensitivity due, for
example, to line broadening (such a peak will need to be treated manually).
(c) If the right part of the COBRA product map (after ordering)
contains many columns with more than 2 possible connectivities,
this suggests either a lack of 13C frequency discrimination
due to close or overlapping 13C chemical shifts for different
residues in the protein or partially overlapping 1H/15N cross peaks. In
the former case, the HADAMAC information may be sufficient
to alleviate the ambiguities in the course of the assignment.
The latter case can be identified from the inspection of
the 15N HSQC spectrum and the integration box around
1H/15N frequencies can be reduced (see Path window).
(d) The HNCO/iHNCO pair of experiments should be used
carefully during the course of assignment using COBRA for
glycine residues. The iHNCO experiment contains one Cα
selective pulse designed to optimize the Cα→CO transfer
while limiting Cα→Cβ transfer. As a consequence, signals
corresponding to glycine residues may be severely attenuated
in the iHNCO experiment. This translates into missing COBRA
connectivities for glycine residues in the final COBRA maps
(after combination of all available COBRA maps).
(e) Overall, less than ~5 COBRA calculations should in principle be
sufficient to obtain a COBRA map that reflects the information
content of the triple resonance experiments. Otherwise, the
suitability of the BATCH method for the particular protein
under investigation should be questioned.
7. Backbone Assignment
Based on the previous steps, a 2D 1H–15N peak list containing the
cross-peak-dependent HADAMAC amino acid group and the cor-
responding COBRA sequential connectivity map are available.
Three alternative options with different degrees of automation
have been developed to facilitate sequential resonance assignment.
They include the embedded BATCH algorithm (3), the interface
to MARS (18) and SmartNotebook (19) software packages. The
COBRA product map (Subheading 6.2) contains element values
between 0 and 1. Before application of any of the three methods,
this map is first binned to 0 and 1 values according to the intensity
cutoff parameter defined in the main BATCH window. The cutoff
parameter automatically proposed after COBRA map calculation
often represents an excellent starting value. However, it can be
manually adjusted from the (visual) inspection of the COBRA
maps. Based on the newly calculated matrix, fragments of unam-
biguously connected cross peaks are built and fed to the assign-
ment algorithm together with the additional (ambiguous)
sequential connectivities and the HADAMAC information.
An automated assignment algorithm, based on a Best-First algorithm,
is integrated in the BATCH software. This
algorithm quickly converges to a solution when fragments of suf-
ficient lengths can be unambiguously located onto the protein
sequence. Additional cross peaks are also assigned based on the
HADAMAC information and the compatible COBRA connectivi-
ties to allow the extension of already assigned fragments. The
BATCH assignment mode is accessible by BATCH assign.
Additional information can be found in separate windows for the
further manual assignment. MARS embeds an assignment proto-
col widely exploring the assignment space and can also be chosen
for assignment (Start MARS Calculation). SmartNotebook
(Launch SNB) is another assignment module available for
NMRView that significantly helps the manual assignment by pro-
posing allowed protein locations for a given chain.
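How a fragment of connected cross peaks can be located on the protein sequence from HADAMAC information alone can be illustrated with a short example. This is a sketch, not the Best-First algorithm of BATCH or the MARS protocol. It assumes that each peak carries the amino acid group of its preceding residue, that histidine is counted among the aromatic residues, and that proline cannot give an amide cross peak. The sequence and function names are hypothetical.

```python
# Assumed mapping of one-letter residue codes to HADAMAC groups.
GROUPS = {
    "G": "Gly", "S": "Ser", "T": "Thr",
    "A": "AVI", "V": "AVI", "I": "AVI",
    "N": "Asx", "D": "Asx",
    "C": "Cys-Arom", "F": "Cys-Arom", "Y": "Cys-Arom",
    "W": "Cys-Arom", "H": "Cys-Arom",   # histidine as aromatic is an assumption
}

def group_of(residue):
    """HADAMAC group of a residue; anything not listed falls into 'Rest'."""
    return GROUPS.get(residue, "Rest")

def fragment_positions(sequence, fragment_groups):
    """0-based sequence indices where the first peak of the fragment may sit.
    fragment_groups[k] is the HADAMAC group of the residue preceding peak k."""
    hits = []
    n = len(fragment_groups)
    for start in range(1, len(sequence) - n + 1):
        compatible = True
        for k in range(n):
            pos = start + k
            if sequence[pos] == "P":       # proline has no amide 1H/15N peak
                compatible = False
                break
            if group_of(sequence[pos - 1]) != fragment_groups[k]:
                compatible = False
                break
        if compatible:
            hits.append(start)
    return hits

sequence = "MKGSAVTDLEGKAY"   # hypothetical sequence
long_fragment = fragment_positions(sequence, ["Gly", "Ser", "AVI"])
short_fragment = fragment_positions(sequence, ["Gly"])
print(long_fragment, short_fragment)
```

In this toy case the three-peak fragment fits a single position whereas the one-peak fragment fits two, which illustrates why fragments of sufficient length are needed for an unambiguous placement.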
For each assignment mode, cross peak assignments are stored
directly in the current peak list. An additional tool (Create Ordered
Peaklist From Assigned Peaks) is also provided for reordering the
cross peaks according to the amino acid sequence. The new
COBRA map calculated using the reordered peak list allows the
rapid analysis of still unassigned cross peaks. Fully automated meth-
ods are of great help to quickly assign a large part of the protein.
However, the completion of the assignment may require human
intervention in particular for the identification of missing or over-
lapping 1H/15N cross peaks. As an advantage of the BATCH strat-
egy, the triple resonance experiments can also be fully
Fourier-transformed and analyzed using the conventional meth-
ods. The default processing scripts contain the additional 13C
dimension Fourier transformation step. The resulting 3D matrices
can be directly loaded in NMRView (Load NvFile in the Path
Window) for manual inspection and identification of overlapping
peaks for example.
8. 1H, 15N, and 13C Chemical Shift Extraction
After completion of the BATCH assignment only 1H and 15N reso-
nances are assigned. This contrasts with the conventional strategy
where 13C chemical shifts are at the core of the assignment step. In
the BATCH strategy, 13C chemical shifts can also be easily extracted
from the H–N–C experiments. The lower part of the main BATCH
graphical interface (Fig. 1a) allows parameters to be set for the pro-
cessing method (Subheading 2.6) including the first t1 value (used
for first order phase correction), possible spectral reverse operation
and 13C calibration. One pseudo spectrum is calculated for every
nucleus Cα, Cβ, and CO (Process spectra) that contains the 13C
traces extracted from the intraresidual (left half of the spectrum)
and the sequential (right half) experiments for every 15N HSQC
cross peak (Fig. 4). This calculation is based on the same phase
cutoff parameters as defined for COBRA map calculation. Residue-specific
13C chemical shifts are automatically extracted for the
assigned cross peaks from the pseudo-spectra (Get Shifts) and, if
necessary, unfolded to account for possible aliasing in the 13C
dimension (Unfold?). This operation is based on the expected
chemical shifts for known amino acids. 1H, 15N, and available 13C
chemical shift data are stored in the NMRView assignment table
and can be directly analyzed in terms of secondary structure
prediction by CSI (21) within NMRView.
Fig. 4. Pseudo 2D spectrum processed as described in Subheading 2.6. 13C frequency traces from the intraresidual (left
half) and sequential (right half) H–N–CA experiments are shown for each 1H–15N cross peak (ordered according to the final
assignment). The two halves of the spectrum show similar patterns due to the presence of traces containing signals at the
same 13C frequency in the two experiments. For the sequential experiment, cross peak numbers are incremented by n (total
number of cross peaks).
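The unfolding step can be pictured with a short Python sketch. It is a simplified assumption, not the BATCH code: aliasing is modeled as a shift by an integer multiple of the 13C spectral width, and the expected shift for the residue type is supplied by the caller. The function name and the numbers are illustrative.

```python
def unfold_shift(observed_ppm, expected_ppm, spectral_width_ppm, max_folds=3):
    """Return observed + k * spectral width (integer k) closest to the expected shift."""
    candidates = [observed_ppm + k * spectral_width_ppm
                  for k in range(-max_folds, max_folds + 1)]
    return min(candidates, key=lambda shift: abs(shift - expected_ppm))


# Illustrative values: a peak seen at 49.0 ppm in a 30 ppm wide window,
# for a nucleus expected near 19 ppm, is moved by one spectral width.
print(unfold_shift(49.0, 19.0, 30.0))  # 19.0
```

A peak that already lies close to its expected value is returned unchanged, since k = 0 is among the candidates.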
9. Conclusions
In this chapter, we describe the BATCH protocol for fast and
highly automated sequential protein backbone resonance assignment.
In the most favorable situations, the entire assignment step,
from sample injection to complete analysis, may take only a few
hours. However, the tools forming the basis of the BATCH
strategy have intrinsic limitations. The COBRA and HADAMAC
methods assume a minimal number of 15N HSQC cross peak overlaps
for efficient extraction of single-residue 13C traces and amino
acid type discrimination. In addition, the minimal experimental
times reported herein for the HADAMAC and 3D triple resonance
experiments are only sufficient for slowly relaxing systems, e.g.,
small proteins. Altogether, these restrictions make BATCH opti-
mal for well-folded small-to-medium size (diamagnetic) proteins
that yield well-dispersed NMR spectra while retaining high sensi-
tivity for triple resonance experiments. In our hands, proteins up to
10–15 kDa may be assigned using the BATCH strategy. Of equal
importance, the quality of the 15N HSQC in terms of spectral dis-
persion and assessment of line-broadening serves as a critical test
for the possible outcome of BATCH. Future developments are
expected to extend the applicability of the BATCH method to
higher molecular weight proteins, as well as to intrinsically disor-
dered proteins characterized by low chemical shift dispersion.
Acknowledgments
Special thanks to Rodolfo Rasia and Jérôme Boisbouvier (IBS,
Grenoble) for allowing us to use their NMR data on Hyl1 to illustrate
this book chapter.
References
1. Jaravine, V. A., Zhuravleva, A. V., Permi, P., Ibraghimov, I., and Orekhov, V. Y. (2008) Hyperdimensional NMR spectroscopy with nonlinear sampling. J. Am. Chem. Soc. 130, 3927–3936.
2. Hiller, S., Fiorito, F., Wuthrich, K., and Wider, G. (2005) Automated projection spectroscopy (APSY). Proc. Natl. Acad. Sci. USA 102, 10876–10881.
3. Lescop, E., and Brutscher, B. (2009) Highly automated protein backbone resonance assignment within a few hours: the “BATCH” strategy and software package. J. Biomol. NMR 44, 43–57.
4. Lescop, E., Rasia, R., and Brutscher, B. (2008) Hadamard amino-acid-type edited NMR experiment for fast protein resonance assignment. J. Am. Chem. Soc. 130, 5014–5015.
5. Lescop, E., and Brutscher, B. (2007) Hyperdimensional protein NMR spectroscopy in peptide-sequence space. J. Am. Chem. Soc. 129, 11916–11917.
6. Lescop, E., Schanda, P., and Brutscher, B. (2007) A set of BEST triple-resonance experiments for time-optimized protein resonance assignment. J. Magn. Reson. 187, 163–169.
7. Schanda, P., Van Melckebeke, H., and Brutscher, B. (2006) Speeding up three-dimensional protein NMR experiments to a few minutes. J. Am. Chem. Soc. 128, 9042–9043.
8. Lescop, E., Schanda, P., Rasia, R., and Brutscher, B. (2007) Automated spectral compression for fast multidimensional NMR and increased time resolution in real-time NMR spectroscopy. J. Am. Chem. Soc. 129, 2756–2757.
9. Deschamps, M., and Campbell, I. D. (2006) Cooling overall spin temperature: protein NMR experiments optimized for longitudinal relaxation effects. J. Magn. Reson. 178, 206–211.
10. Pervushin, K., Vogeli, B., and Eletsky, A. (2002) Longitudinal (1)H relaxation optimization in TROSY NMR spectroscopy. J. Am. Chem. Soc. 124, 12898–12902.
11. Diercks, T., Daniels, M., and Kaptein, R. (2005) Extended flip-back schemes for sensitivity enhancement in multidimensional HSQC-type out-and-back experiments. J. Biomol. NMR 33, 243–259.
12. Kupce, E., and Freeman, R. (1993) Polychromatic Selective Pulses. J. Magn. Reson. 102A, 122–126.
13. Geen, H., and Freeman, R. (1991) Band-selective radiofrequency pulses. J. Magn. Reson. 93, 93–141.
14. Smith, M. A., Hu, H., and Shaka, A. J. (2001) Improved Broadband Inversion Performance for NMR in Liquids. J. Magn. Reson. 151, 269–283.
15. Brutscher, B. (2002) Intraresidue HNCA and COHNCA experiments for protein backbone resonance assignment. J. Magn. Reson. 156, 155–159.
16. Blevins, R. A., and Johnson, B. A. (1994) NMRView: a computer program for the visualization and analysis of NMR data. J. Biomol. NMR 4, 603–614.
17. Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A. (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293.
18. Jung, Y. S., and Zweckstetter, M. (2004) Mars – robust automatic backbone assignment of proteins. J. Biomol. NMR 30, 11–23.
19. Slupsky, C. M., Boyko, R. F., Booth, V. K., and Sykes, B. D. (2003) Smartnotebook: a semi-automated approach to protein sequential NMR resonance assignments. J. Biomol. NMR 27, 313–321.
20. Rasia, R. M., Mateos, J., Bologna, N. G., Burdisso, P., Imbert, L., Palatnik, J. F., and Boisbouvier, J. (2010) Structure and RNA interactions of the plant MicroRNA processing-associated protein HYL1. Biochemistry 49, 8237–8239.
21. Wishart, D. S., and Sykes, B. D. (1994) The 13C chemical-shift index: a simple method for the identification of protein secondary structure using 13C chemical-shift data. J. Biomol. NMR 4, 171–180.
Chapter 22
Comprehensive Automation for NMR Structure Determination of Proteins
Paul Guerry and Torsten Herrmann
Abstract
This chapter gives an overview of automated protein structure determination by nuclear magnetic resonance
(NMR) with the UNIO protocol that enables high to full automation of all NMR data analysis steps
involved. Four established algorithms, namely, the MATCH algorithm for sequence-specific resonance
assignment, the ASCAN algorithm for side-chain resonance assignment, the CANDID algorithm for NOE
assignment, and the ATNOS algorithm for signal identification in NMR spectra, are assembled into three
principal UNIO NMR data analysis components (MATCH, ATNOS/ASCAN, and ATNOS/CANDID)
that are accessed thanks to a particularly intuitive and flexible, yet powerful graphical user interface (GUI).
UNIO is designed to work independently or in association with other NMR software. The principal data
analysis components for sequence-specific backbone, side-chain and NOE assignment may be run separately
or out of sequence. User-intervention at individual stages is encouraged and facilitated by graphical tools
included for the preparation, analysis, validation, and subsequent presentation of the NMR structure.
Key words: Protein structure, NMR structure determination, Resonance assignment, NOE assign-
ment, Automated NMR structure determination, MATCH, ASCAN, ATNOS, CANDID, UNIO
protocol
1. Introduction
Little more than 10 years ago, protein NMR structure determination
projects were framed in terms of months if not years of laborious,
interactive work that required the expertise of a well-trained
NMR spectroscopist. Nowadays, owing to stunning advances in
NMR experiments, instrumentation and computational data analy-
sis, a relatively propitious protein candidate may be solved in a few
weeks (1, 2). More importantly, even newcomers to the NMR field
are increasingly able to pursue a small- to medium-sized protein
NMR structure determination, with a minimum of supervised train-
ing, by following standard protocols for NMR data acquisition and
data interpretation. Despite these significant advances that promote
NMR spectroscopy as a universal tool for the broader structural
biology community, there remains strong motivation to
describe and establish a general and robust NMR structure determination
protocol in terms of man-hours, not man-weeks.
Fig. 1. Scheme of the stepwise standard protocol for protein structure determination by NMR.
The commonly exploited protocol for NMR structure deter-
mination (Fig. 1) includes the preparation of the protein sample,
the acquisition of multidimensional NMR experiments, NMR data
processing, signal identification (peak picking), sequence-specific
resonance assignment, NOE assignment as the primary source of
conformational restraints and structure calculation followed by
structure refinement and structure validation (3). This well estab-
lished stepwise protocol has successfully been applied to thousands
of de novo structure determinations but results in lengthy NMR
data analysis requiring massive human manpower and expertise on
top of the time-consuming data acquisition process.
In recent years, solution NMR has attained a level of develop-
ment where much interest is focused on replacing laborious and
time-consuming manual NMR data analysis with computational-
theoretical approaches (1, 2, 4–8). This interest in streamlining the
process of NMR structure determination, to achieve a level of sophistication
presently attained in X-ray crystallography, has been further
enhanced by the demands for high-throughput NMR studies on
proteins in Structural Proteomics Initiatives (1). Currently, various
computational expert systems for solution NMR structure determi-
nation are available aiming either at supporting the interactive
spectral analysis by visualization tools and systematic bookkeeping
of the collected spectral data (computer-aided approach) (9–16) or
at providing semi- or full automation for specific parts of an NMR
structure determination (7). Most progress has been achieved for
the final part of NOE assignment and structure calculation (17–22).
Although expert systems for NMR data analysis are commonly
exploited for the collection of conformational NOE-derived dis-
tance restraints, most of the proposed approaches operate on list-
ings of peak positions and volumes (peak lists) rather than on the
raw NMR spectra, and their performance critically depends on
careful preprocessing of the input data with the notable caveat that
this laborious task is still mainly performed interactively and hence
subjectively (23). In common practice, several rounds of NOE
assignment and structure calculation with steadily refined NOE
peak lists are required to obtain an accurate and precise three-
dimensional protein structure (23). The preceding step of sequence-
specific resonance assignment has also been subjected to the
development of data analysis models (24–48). Here, most of the
proposed algorithms target the assignment of the polypeptide
backbone atoms, where again, extensively preprocessed input data
are typically required to lead to satisfactory results. Despite many
promising attempts, manual or semi-automated approaches still
prevail, and the critically important chemical shift assignment of the
amino acid side-chain atoms represents a major bottleneck for effi-
cient NMR data analysis and structure determination. Initially,
expert systems were expected to be most suitable for the first step of
NMR signal identification (peak picking) providing efficient, objec-
tive and reliable handling of the large sets of NMR signals compris-
ing thousands of resonance frequencies (6, 27, 38, 49–55). In
practice however it has turned out that robust peak picking is lim-
ited to spectral regions with scarcity of signal overlap and artifacts,
with manual reinspection of the results being generally advised.
Nonetheless, and the impressive progress of research on the
subject notwithstanding, it could be argued that the major stum-
bling block for automated NMR structure determination has
proven to be experimental in that a large number of NMR spectra,
i.e., high data redundancy, is generally required for robust perfor-
mance of the individual automated approaches. Indeed, it is often
the case that an interactive data analysis approach, on fewer NMR
spectra, is more attractive and more time-efficient once the tedious
demands of the proposed automated approaches (i.e., labor-inten-
sive data preprocessing and collection of a comprehensive, highly
redundant set of NMR spectra) are taken into consideration.
In this chapter we describe the UNIO protocol for highly to
fully automated protein NMR structure determination that per-
forms all NMR data analysis tasks, i.e., sequence-specific backbone
and side-chain resonance assignment, and NOE assignment, reliably
and efficiently. UNIO associates previously published algorithms
within a single computational framework and emphasizes ease of
use through intuitive and powerful graphical interfaces and utili-
ties, making it attractive to aficionados and casual users alike. Most
importantly, UNIO is undemanding in terms of experimental
input, both in the number of NMR spectra and the subsequent
data preprocessing required for proper performance. The standard
UNIO protocol requires the acquisition of only six NMR spectra
(three APSY and three NOESY spectra), and the flexibility of
UNIO means that the setup for any part of the step-wise protocol
for NMR structure determination can be tailored according to the
particular problem at hand. Time is therefore gained during data
analysis without increasing the experimental load.
UNIO has been designed for immediate and general applicabil-
ity to problems in structural biology. The underlying ethos is fun-
damentally pragmatic, and the algorithms and protocols described
in this chapter follow the path of optimal robustness and efficiency.
In this context, expert algorithms solve those problems that are
tedious and time-consuming whereas the user remains in control of
the overall process and if necessary, performs tasks where human
judgment and intuition (currently) know no numerical equal. Last
but not least, the simplicity and ease of use of the software ensure
that an entire NMR structure determination can be completed
within a few man-hours of processing the experimental data.
2. Specificities of the UNIO Program
The simplicity and ease of use of the UNIO suite ensure that a
structure calculation is up and running within only a few man-
hours of installation, making UNIO attractive to expert and casual
users alike. The following list provides an overview of computer
requirements, compatible input file formats, and molecular dynam-
ics programs that can be used in combination with UNIO.
1. Display resolution of 1,024 × 768 pixels or higher. True color
display (16-bit or 32-bit depth). Computer with either Linux
kernel 2.4 or above, or Mac OS X 10.5 or
higher with Intel processors. A minimum of 100 megabytes of
disk space is required.
2. UNIO software application suite for automated protein NMR
structure determination. UNIO is free-of-charge for academic
use at http://www.unio-nmr.eu.
3. CNS (56), XPLOR-NIH (57), or CYANA (58) software package
for NMR structure calculation by simulated annealing.
4. Input file for the amino acid sequence in any of the following for-
mats: BioMagResBank (59), XEASY (11), FASTA, ANSIG (9),
NMRVIEW (10), SPARKY (14), CYANA, CNS, XPLOR-NIH.
5. For MATCH, input peak lists containing information about
the frequency coordinates of the NMR signals in APSY file
format.
6. For ATNOS/ASCAN and ATNOS/CANDID: input chemi-
cal shift list in any of the following formats: BioMagResBank,
XEASY, NMRVIEW, SPARKY, CYANA, CNS, XPLOR-NIH;
3D 13C or 15N-resolved [1H, 1H]-NOESY spectra in either
BRUKER or XEASY file format.
3. Standard UNIO Protocol for Protein NMR Structure Determination
The UNIO protocol for substantially automated protein NMR
structure determination comprises sequence-specific backbone and
side-chain assignment followed by NOE assignment and NMR
structure calculation (Fig. 2).
1. Sequence-specific backbone resonance assignment with MATCH.
The standard experimental input data for MATCH (47) con-
sists of a set of three APSY datasets: 4D APSY-HACANH, 5D
APSY-HACACONH, and 5D APSY-CBCACONH (60, 61).
The MATCH algorithm yields polypeptide backbone resonance
assignments for the 1HN, 1Hα, 15N, 13Cα, 13Cβ, and 13C′
atoms.
2. Side-chain resonance assignment with ATNOS/ASCAN. The
standard experimental input data for ATNOS/ASCAN (48,
54) comprises the previously obtained sequence-specific back-
bone resonance assignments and a set of three NOESY spectra:
3D [1H,1H]-NOESY-15N-HSQC and two 3D [1H,1H]-
NOESY-13C-HSQC with the 13C carrier frequency in the ali-
phatic and the aromatic spectral regions. The ATNOS/ASCAN
approach yields meaningful side-chain resonance assignments,
i.e., resonance frequencies of atoms involved in many NOEs.
3. Automated NOESY assignment and NMR structure calcula-
tion. The standard experimental input data for ATNOS/
CANDID (19, 54) consists of the previously determined back-
bone and side-chain resonance assignments, and the three
aforementioned 3D NOESY spectra. ATNOS/CANDID in
combination with a simulated annealing program yields listings
of assigned NOESY peaks and the 3D protein structure.
In the following sections, the individual UNIO data analysis
components are presented in detail. Automated NMR signal identifi-
cation is described in Subheading 4, automated backbone assignment
in Subheading 5, automated side-chain assignment in Subheading 6,
and automated NOESY assignment in Subheading 7. The entire
UNIO protocol presented here has been successfully applied to more
than a dozen de novo NMR structure determination projects.
Individual NMR data analysis components of UNIO such as the
ATNOS/CANDID (19, 54) approach for combined automated
signal identification and NOE assignment, or the CANDID module
alone, have already evolved into standard processing tools routinely
applied by the biology-oriented NMR community and have contributed
to several hundred protein NMR structure determinations
with hitherto unknown protein folds.
Fig. 2. Schematic outline of the UNIO protocol for highly automatic NMR protein structure determination. The three principal
UNIO modules are in bold font inside solid boxes along with the standard input data and output, in dashed and shaded
rectangles, respectively. The input data common to all three modules is shown in a thick dashed box at the top. Cyclic
symbols denote reevaluation of the experimental input data at the start of each iteration guided by the output of the previous
cycle (new resonance assignments for ATNOS/ASCAN, intermediate protein structures for ATNOS/CANDID). If the
UNIO validation criteria are not met, the required interactive refinement is facilitated by UNIO reports.
Note that although the three principal NMR data analysis
components of UNIO are designed to work one after the other,
one of UNIO’s strengths is its flexibility; the different modules
may be launched separately, out of sequence and with different
experimental input data from those listed above: MATCH also
supports peak lists from conventional triple-resonance experiments
as input; the ATNOS/ASCAN approach can also be used with
TOCSY input datasets; the ATNOS/CANDID approach can be
supplemented by additional conformational restraints such as
residual dipolar couplings, pseudo contact shifts, torsion angle
restraints, hydrogen and disulfide bond restraints. These optional
conformational restraints are not used for any of the NMR data
analysis tasks performed by UNIO, but are directly passed on to
the structure calculation algorithm.
4. Automated NMR Signal Identification
In the UNIO protocol, automated NOESY peak picking and
NOE signal identification in 2D homonuclear and heteronuclear-
resolved 3D [1H, 1H]-NOESY spectra is performed with the
ATNOS algorithm (54) in association with either ASCAN (48)
automated side-chain assignment (see Subheading 6) or CANDID
(19) automated NOE assignment and NMR structure calculation
(see Subheading 7).
4.1. Overview of the ATNOS Algorithm
The main elements of ATNOS for NOESY spectral analysis are
local baseline correction and evaluation of local noise level amplitudes,
automated determination of spectrum-specific threshold
parameters, the use of symmetry relations, and the inclusion of
chemical shift information and the intermediate protein structures
to distinguish between NOE cross peaks and artifacts.
1. Input data for ATNOS. The input data consists of the amino
acid sequence of the protein, resonance frequencies of the
assigned atoms, and 2D or 3D NOESY spectra.
2. Determination of local baseline and local noise level. These tech-
niques are based on those previously introduced by the FLATT
(62) and AUTOPSY (53) algorithms.
3. Generation of a comprehensive set of NMR signals. Highly per-
missive criteria are applied that only require an initial minimal
signal-to-noise ratio and a local minimum.
4. Identification of “covalent NMR signals”. Assignment of NMR
signals to atom pairs with covalent structure-imposed upper
distance limits is based on compatibility with the input chemical
shifts: the fixed bond lengths, bond angles, and chiralities
of the covalent polypeptide structure impose NOE-observable
upper limits on certain intraresidual and sequential 1H–1H dis-
tances. These conformation-independent upper distance limits
can be calculated analytically for all atom pairs that are sepa-
rated by one or two dihedral angles. A covalent NMR signal is
defined such that in its initial list of chemical-shift based assign-
ments there is at least one assignment possibility that corre-
sponds to a hydrogen pair with maximal upper distance limit
smaller than 5 Å. The set of identified covalent NMR signals in
a given NOESY spectrum can then be used to derive spectrum-
specific threshold parameters for minimal signal-to-noise ratio
and adaptation of the input chemical shift.
5. Determination of spectrum-specific threshold parameters.
Threshold values for minimally required signal-to-noise ratio
and peak volume are determined using the previously identi-
fied covalent NMR signals as a reference.
6. Adaptation of input chemical shifts for each individual NOESY
spectrum.
7. Peak validation. The first validation filter is based on peak clas-
sification, compatibility with adapted chemical shifts, network-
anchoring, and symmetry considerations (ATNOS/ASCAN
and ATNOS/CANDID cycles 1, 2, …).
8. Peak validation. The second validation filter is based on com-
patibility with the intermediate protein structure (ATNOS/
CANDID cycles 2, 3, …).
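The definition of a covalent NMR signal in step 4 can be written down as a small Python sketch. It is a simplified illustration, not ATNOS code: peaks are reduced to their two 1H coordinates, the chemical shift tolerance is an assumed value, and the table of covalent upper distance limits (here with one made-up entry) would in practice be computed from the covalent geometry.

```python
def candidate_pairs(peak, shifts, tolerance):
    """Atom pairs whose two shifts match the peak's two 1H coordinates."""
    w1, w2 = peak
    return [(a, b) for a in shifts for b in shifts
            if a != b
            and abs(shifts[a] - w1) <= tolerance
            and abs(shifts[b] - w2) <= tolerance]


def is_covalent_signal(peak, shifts, upper_limits, tolerance=0.02, max_limit=5.0):
    """True if any shift-compatible assignment has a covalent upper limit < max_limit."""
    for pair in candidate_pairs(peak, shifts, tolerance):
        limit = upper_limits.get(frozenset(pair))
        if limit is not None and limit < max_limit:
            return True
    return False


# Hypothetical atoms and values, for illustration only.
shifts = {"A5.HN": 8.20, "A5.HA": 4.30, "L20.HD1": 0.85}
upper_limits = {frozenset({"A5.HN", "A5.HA"}): 3.0}  # in angstroms, illustrative

print(is_covalent_signal((8.20, 4.31), shifts, upper_limits))  # True
print(is_covalent_signal((8.20, 0.85), shifts, upper_limits))  # False
```

Signals flagged in this way are the reference set from which the spectrum-specific thresholds of step 5 are derived.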
The ATNOS approach for automated NOESY signal identifi-
cation differs from most conventional automated peak picking pro-
grams by incorporating chemical shift information (ATNOS in
combination with ASCAN or CANDID) and intermediate protein
structures (ATNOS in combination with CANDID) into the pro-
cess of NMR signal identification. Most of the routinely used algo-
rithms for automated NOE assignments operate on intermediate
listings of NOE cross peak positions and volumes, such as NOAH
(17), ARIA (20, 63), AUTOSTRUCTURE (22), KNOWNOE
(18), CANDID (19), and PASD (21, 64). In practice, the use of
intermediate listings of NOE cross peak positions entails that the
automated NOE routines are applied in several rounds with suc-
cessively refined NOE peak lists as input data. This is clearly a con-
ceptual limitation of the present practice of automated NOE
assignment and results in time-consuming, laborious editing of the
input data for most automated NOE assignment programs to
obtain an accurate and precise 3D protein structure. The listings of
NOE cross peak positions can also be obtained by automated peak
picking methods. However, even sophisticated pattern recognition
methods easily fail for all but ideal artifact-free, well-separated
NMR signals. Under realistic, experimental conditions, difficulties
in automated NMR signal identification arise mainly from signal
overlap and spectral distortions due to artifacts. Sophisticated algo-
rithms have been introduced at the outset of a spectral analysis, but
in practice their use in spectral regions of strong peak overlap and
weak noisy peaks is limited, and manual reinspection of the result-
ing listings of NOE cross peaks is generally advised. Therefore, in
present practice, NOESY peak picking is still dominantly performed
with interactive graphic computer programs. Automated and interactive
NOE cross peak identification must be able to clearly distinguish
between real and artifact signals, with the signal-to-noise
ratio as the primary filter. Because of the inverse 6th power-rela-
tionship between NOE cross peak intensity and interatomic dis-
tance between the pair of hydrogen atoms giving rise to a NOE
cross peak, a significant fraction of the most informative long-range
NOE signals in a NOESY spectrum may have signal-to-noise ratios
only slightly above the average noise level, which emphasizes the
importance of working with powerful and sophisticated signal fil-
tering procedures. A weakness of many automated peak picking
routines is introduced by the underlying recognition technique
focusing only on limited regions in close proximity to a local extre-
mum, without taking into account mutually inclusive peak patterns
in a given NOESY spectrum, or across several NOESY spectra.
More recent algorithms for automated peak picking perform
better when the spectral data are supplemented with additional
information, such as chemical shift lists of atoms that are correlated
with as yet unidentified signals in the spectra to be analyzed, or
information on expected peak patterns derived from the magneti-
zation pathways in the NMR experiments used. These “con-
strained” peak picking algorithms mimic the modus operandi of an
experienced spectroscopist, who analyzes new signals in the con-
text of previously assigned resonances.
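The consequence of the inverse sixth-power relationship mentioned above can be made concrete with a short calculation. The sketch assumes the isolated spin-pair approximation (intensity proportional to r to the power of -6, with spin diffusion and differential relaxation neglected); the function name and the distances are illustrative.

```python
def relative_noe_intensity(distance, reference_distance):
    """Intensity relative to a reference pair, assuming I is proportional to r**-6."""
    return (reference_distance / distance) ** 6


# A hydrogen pair at 5.0 angstroms compared with a reference pair at 2.5 angstroms:
print(relative_noe_intensity(5.0, 2.5))  # 0.015625, i.e. 1/64 of the reference
```

Doubling the distance thus reduces the cross peak intensity 64-fold, which is why long-range NOE signals often lie only slightly above the noise level.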
When used in the context of NOESY assignment (Subheading 7),
the ATNOS approach performs multiple cycles of NOE peak iden-
tification in concert with automated NOE assignment with the
CANDID algorithm followed by protein structure calculation by
simulated annealing using either CNS (56) or XPLOR-NIH (65)
or CYANA (58). At the outset of a de novo structure calculation
(ATNOS/CANDID cycle 1), ATNOS NOE peak validation is pri-
marily guided by the input chemical shifts. In the second and sub-
sequent cycles of automated ATNOS/CANDID NOESY analysis,
intermediate protein structures are used as an additional guide for
the interpretation of the NOESY spectra. Since the precision and
accuracy of the intermediate protein structures tend to improve
from cycle to cycle, the structure-based criteria for ATNOS NOE
identification are loosened to facilitate identification of weaker
signals. By incorporating the analysis of raw NMR data into the
process of automated protein structure determination, ATNOS
enables direct feedback between the protein structure, the NOE
assignments and the experimental NOESY spectra. Thereby the list
of verified NOE peaks is updated between subsequent cycles of
combined NOE assignment and structure calculation based on the
intermediate protein structures. Notably, within this scheme of
using chemical shifts and intermediate 3D structures, ATNOS
achieves more extensive and reliable NOE cross peak identification
than routines that rely exclusively on the information content of
the NOESY spectrum without further guidance by already avail-
able chemical and structural information. The combination of
ATNOS with an automated NOE assignment routine avoids the
iterative refinement of static peak lists common to most other pop-
ular NOE assignment programs.
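The feedback loop between peak identification, NOE assignment, and structure calculation can be summarized as a control-flow sketch. It only shows the data flow described above. The three callables are hypothetical placeholders supplied by the caller, not UNIO, CANDID, or CYANA functions, and the number of cycles is left as a parameter because it is not specified here.

```python
def run_cycles(spectra, shifts, pick_peaks, assign_noes, calculate_structure, n_cycles):
    """Re-pick peaks in every cycle, guided by the structure of the previous cycle."""
    structure = None  # cycle 1 is guided by the chemical shifts only
    history = []
    for cycle in range(1, n_cycles + 1):
        peaks = pick_peaks(spectra, shifts, structure)
        restraints = assign_noes(peaks, shifts, structure)
        structure = calculate_structure(restraints)
        history.append((cycle, len(peaks), len(restraints)))
    return structure, history


# Toy stand-ins: one extra peak is accepted once a structure is available.
def toy_pick(spectra, shifts, structure):
    return ["p1"] if structure is None else ["p1", "p2"]


def toy_assign(peaks, shifts, structure):
    return list(peaks)


def toy_structure(restraints):
    return {"n_restraints": len(restraints)}


structure, history = run_cycles(None, {}, toy_pick, toy_assign, toy_structure, 3)
print(history)  # [(1, 1, 1), (2, 2, 2), (3, 2, 2)]
```

The point of the design is that the peak list is regenerated from the spectra in each cycle rather than edited by hand between cycles.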
5. Automated Backbone Assignment
In the UNIO protocol, automated sequence-specific polypeptide
backbone NMR assignment is performed with the MATCH algo-
rithm (47). MATCH employs local optimization for tracing partial
sequence-specific assignments within a global, population-based
search environment, where the simultaneous application of local
and global optimization heuristics guarantees high efficiency and
robustness. MATCH thus makes combined use of the two pre-
dominant concepts in use for automated assignment of proteins
(see Subheading 5.2).
5.1. Overview of the MATCH Algorithm
The MATCH algorithm is founded on two main building blocks:
initialization and optimization. Novel concepts in MATCH are
dynamic transition and inherent mutation that enable automatic
adaptation to the variable quality of experimental input data. The
concept of dynamic transition is incorporated in all major building
blocks of the MATCH algorithm, where it enables switching
between local and global optimization heuristics at any time dur-
ing the assignment process. Inherent mutation restricts the intrin-
sically required randomness of the evolutionary algorithm to those
regions of the conformation space that are compatible with the
experimental input data.
1. Input data for MATCH. The input consists of the amino acid
sequence of the protein, a statistical analysis of chemical shift
values of proteins contained in the BioMagResBank, and the
experimental NMR data in form of the frequency coordinates
of the NMR correlation signals.
2. Generation of generic spin-systems. The experimental input peak
lists are consolidated and transformed into a single set of
generic spin systems containing all available intra- and inter-
residual chemical shifts for a given spin system.
3. Buffer of candidate fragments. A graph exploration routine
identifies all possible sequential connectivities between generic
spin-systems up to a user-specified maximal fragment length.
4. Calibration of control parameters. All MATCH control param-
eters used in the optimization routine are automatically adapted
to the degree of ambiguity contained in the experimental input
data.
5. Genesis. The generation of an initial population of arbitrarily
generated sequence-specific resonance assignments (set of
individuals) represents the start of the optimization routine.
6. Assignment optimization. Local optimization is applied by
repositioning candidate fragments. A candidate fragment is
relocated if and only if its sequence-specific assignment score
increases for the newly proposed sequence position.
7. Assignment management. Temporary and permanent sequence-
specific assignments are performed for a given fragment based
on the sequence-specific scoring function and the presence in
the population of the same fragment at the same sequence
position.
8. Cross-over. This key module of the evolutionary MATCH algo-
rithm identifies the most promising individuals that are used
subsequently for the generation of a new population.
9. Intervention. Control parameters are adapted based on the
progress of the optimization process. Return to step 6.
10. Elite buffer. The optimization for a given individual is com-
pleted either when all generic spin-systems are permanently
assigned, or when the total sequence-specific score of all indi-
viduals is equal. The final result is then stored and the optimi-
zation restarts with step 5 until a predetermined number of
elite individuals have been generated.
MATCH initialization (steps 1–4) is needed to load all the
necessary input data, to consolidate the experimental NMR data,
to generate an initial set of measured graphs, and to calibrate intrin-
sic MATCH control parameters. The result of the initialization
process represents the input for the first cycle of optimization
(steps 5–10). Each MATCH optimization cycle starts with the cre-
ation of an initial population of individuals. This is followed by
multiple evolutionary cycles, each consisting of local optimization
and a global “cross-over,” where new individuals are created and
low-scoring individuals are eliminated. Within each evolutionary
cycle, the configuration space is reduced whenever possible, and, if
necessary, the threshold for the assignment of a generic spin-system

to a specific sequence position is decreased. The result of each
round of optimization, which typically includes a large number of
evolutionary cycles, is stored as a new population of “elite indi-
viduals.” Subsequent optimization rounds always start from a
newly created population.
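The graph exploration of step 3 can be illustrated with a minimal sketch. It assumes, purely for illustration, that each generic spin system carries one intra-residual and one inter-residual 13Cα shift and that two spin systems are sequentially connected when these agree within a tolerance. The names `is_predecessor` and `candidate_fragments`, the dictionary layout, and the 0.2 ppm tolerance are hypothetical and are not taken from MATCH.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical generic spin system: own CA shift ("ca") and the CA shift
# observed for the preceding residue ("ca_prev"), both in ppm.
SpinSystem = Dict[str, Optional[float]]

def is_predecessor(prev: SpinSystem, curr: SpinSystem, tol: float) -> bool:
    """True if curr's inter-residual shift matches prev's intra-residual shift."""
    if prev["ca"] is None or curr["ca_prev"] is None:
        return False
    return abs(curr["ca_prev"] - prev["ca"]) <= tol

def candidate_fragments(systems: List[SpinSystem], max_len: int,
                        tol: float = 0.2) -> List[Tuple[int, ...]]:
    """Enumerate all chains of connected spin systems with 2..max_len members."""
    n = len(systems)
    successors = {
        i: [j for j in range(n)
            if j != i and is_predecessor(systems[i], systems[j], tol)]
        for i in range(n)
    }
    fragments: List[Tuple[int, ...]] = []

    def extend(chain: List[int]) -> None:
        if len(chain) >= 2:
            fragments.append(tuple(chain))
        if len(chain) == max_len:  # user-specified maximal fragment length
            return
        for j in successors[chain[-1]]:
            if j not in chain:  # a spin system is used once per fragment
                extend(chain + [j])

    for i in range(n):
        extend([i])
    return fragments

systems = [
    {"ca": 55.0, "ca_prev": None},
    {"ca": 58.3, "ca_prev": 55.1},
    {"ca": 62.0, "ca_prev": 58.2},
    {"ca": 45.5, "ca_prev": 50.0},
]
fragments = candidate_fragments(systems, max_len=3)
print(fragments)
```

In this toy input the first three spin systems form a chain, while the fourth has no matching neighbor and therefore appears in no fragment.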
5.2. Memetic Algorithm
In general, algorithms for solving the resonance assignment problem
employ either local or global optimization. Local optimization
algorithms refine a preliminary solution by screening the adjacent
configuration space in search of information on the best candidate
solution. They can work in a highly deterministic fashion, follow-
ing a concrete optimization strategy. The benefit of local optimiza-
tion is high efficiency resulting from the assumption that the
underlying data do not contain information that is incompatible
with the rationale used by the algorithm. This efficiency is predict-
ably gained at the expense of robustness. Global optimization
algorithms, on the contrary, solve combinatorial problems by opti-
mizing all problem parameters independently and at once. They
are usually implemented in a population-based fashion such that
multiple candidate solutions located in different regions of the
configuration space are optimized simultaneously. A certain degree
of randomness may be involved, analogous to mutation in biological
evolution, as in genetic algorithms. The deteriorating influence
of misleading experimental input data is thus muted, and the risk
of getting trapped in local minima is greatly reduced. Overall, a
population-based global optimization approach has high robust-
ness but low efficiency due to the fact that numerous candidate
solutions have to be managed concurrently.
A memetic algorithm is the logical attempt to merge both
approaches, since it contains a local optimization routine embed-
ded in an evolutionary, global optimization algorithm. The evolu-
tionary algorithm is meant to explore the overall problem space,
while the local search heuristic refines discrete areas of this space.
With MATCH, the advantages of both approaches are exploited to
the fullest extent as local optimization efficiently traces partial solu-
tions inside a population-based (genetic) environment that pre-
serves robustness.
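The general structure of a memetic algorithm can be sketched on a toy problem. The example below is not the MATCH implementation: the scoring function simply counts items placed at their correct position in a known target ordering, whereas a real assignment score would be derived from the experimental data. All function names, the population size, and the number of generations are assumptions chosen for illustration.

```python
import random

def score(assignment, target):
    """Toy score: number of items placed at their correct position."""
    return sum(a == t for a, t in zip(assignment, target))

def local_optimize(assignment, target, rng, attempts=10):
    """Local search: try random swaps, keep a swap only if the score increases."""
    best = list(assignment)
    for _ in range(attempts):
        i, j = rng.sample(range(len(best)), 2)
        trial = list(best)
        trial[i], trial[j] = trial[j], trial[i]
        if score(trial, target) > score(best, target):
            best = trial
    return best

def cross_over(parent_a, parent_b, rng):
    """Keep positions on which both parents agree, fill the rest randomly."""
    child = [a if a == b else None for a, b in zip(parent_a, parent_b)]
    used = {c for c in child if c is not None}
    remaining = [x for x in parent_a if x not in used]
    rng.shuffle(remaining)
    filler = iter(remaining)
    return [c if c is not None else next(filler) for c in child]

def memetic_search(target, pop_size=8, generations=30, seed=0):
    rng = random.Random(seed)
    items = list(target)
    population = [rng.sample(items, len(items)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Local optimization of every individual in the population.
        population = [local_optimize(ind, target, rng) for ind in population]
        population.sort(key=lambda ind: score(ind, target), reverse=True)
        history.append(score(population[0], target))
        # Global step: keep the best half, replace the rest by cross-over.
        elite = population[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            children.append(cross_over(parent_a, parent_b, rng))
        population = elite + children
    population.sort(key=lambda ind: score(ind, target), reverse=True)
    return population[0], history

target = list(range(10))
best, history = memetic_search(target)
print(score(best, target), history)
```

Because the best half of the population is always retained and the local step only accepts improvements, the best score in `history` cannot decrease from one generation to the next.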
5.3. APSY-NMR Input Data
The NMR method APSY (Automated Projection Spectroscopy) (61) enables
the automatic generation of high-dimensional heteronuclear-resolved
correlation peak lists from the analysis of a suitably selected
group of experimental 2D projections of the
higher-dimensional experiment. The use of high dimensions
enables a significant reduction of the number of spectra needed for
the resonance assignment. A further important merit of APSY
spectroscopy is the determination of highly precise correlation
peak chemical shifts (66), which is a key asset for fully automated
sequence-specific resonance assignment. MATCH has been

optimized for high efficiency and reliability of automatic backbone
NMR assignment of proteins when using input from APSY-NMR
experiments. Note that MATCH can also be used with conven-
tional triple-resonance data.
6. Automated Side-Chain Assignment
In the UNIO protocol, automated sequence-specific NMR assign-
ment of amino acid side-chain atoms is performed according to the
ATNOS/ASCAN approach (48, 54). ATNOS/ASCAN operates
on the 3D heteronuclear-resolved [1H,1H]-NOESY datasets that
are subsequently used to collect the input of NOE-distance con-
straints for the structure calculation. ATNOS/ASCAN makes use
of the chemical shift lists for the previously assigned backbone
atoms, and the knowledge of the covalent polypeptide structure. To
make inevitable imperfections of experimental input NMR data
tractable, the chemical shifts of the previously assigned backbone
and Cβ atoms are used to guide both the peak-picking of the NOESY
spectra and the search for new side-chain resonance assignments.
6.1. Overview of the ATNOS/ASCAN Approach
The ATNOS/ASCAN procedure assigns new resonances based on
a comparison of the NMR signals expected from the chemical
structure (“covalent peaks”) with the experimentally observed
NOESY peak patterns. The ATNOS/ASCAN approach differs
from most previous procedures for automated resonance assign-
ment of backbone and/or side-chain atoms in that it operates on
raw NMR data rather than on interactively generated peak lists,
which often require extensive preprocessing to lead to satisfactory
results. The underlying techniques of the ATNOS/ASCAN
approach are a procedure for generating expected peak positions,
and a corresponding set of acceptance criteria for assignments
based on the NMR experiments used. Expected patterns of NOESY
cross peaks involving unassigned resonances are generated using
the list of previously assigned resonances, and tentative chemical
shift values taken from the Biological Magnetic Resonance Data
Bank (BMRB) statistics for globular proteins.
1. Input data for ATNOS/ASCAN. The input consists of the
amino acid sequence of the protein, a statistical analysis of chem-
ical shift values of proteins contained in the BioMagResBank,
and the experimental NMR data comprising chemical shift
lists of the 1HN, 15N, 13Cα, 13Cβ, and possibly 1Hα atoms, and
one or several 3D 13C- or 15N-resolved [1H,1H]-NOESY spectra.
Optional input data can be provided.
2. Generation of expected 3D NOESY peak patterns. The unas-
signed atoms are correlated with the previously assigned
atoms via the chemical structure to generate the set of expected

peaks.
3. Automated signal identification in the NOESY spectra.
Chemical-shift guided peak picking with ATNOS is performed
in which each cycle yields an updated set of identified signals.
4. Determination of experimental peak patterns. The set of
observed peaks is generated based on the knowledge of the
magnetization pathway employed by the NMR experiment.
5. Mapping of expected peak onto observed peak pattern. A set of
potential resonance frequencies for each unassigned atom is
generated by best fit of the expected onto the observed peak
pattern.
6. Acceptance criteria for resonance assignment. First, the agree-
ment between expected and observed peaks is assessed using
an iteration-dependent threshold value. Second, the agree-
ment between the remaining potential resonance frequencies
and predictions based on chemical shift statistics is assessed.
Third, a new resonance assignment is stored if only one poten-
tial resonance frequency is retained. Return to step 1.
Starting from the backbone chemical shift lists, ATNOS/
ASCAN generates expected NOE-signal patterns that are subse-
quently compared with those observed in the experimental NOESY
spectra. In each cycle of the iterative ATNOS/ASCAN protocol,
the information on expected signal patterns is updated based on
the new assignments obtained in the preceding cycle. Experimental
peak patterns to be compared with these predicted peak patterns
are also updated after each iteration cycle, making use of the new
assignments obtained in the previous cycles for chemical shift-
guided ATNOS signal identification. The next step consists in an
evaluation of the closeness of fit between expected peak patterns
and the experimental data, which in turn generates the input for a
set of acceptance criteria for new resonance assignments.
The ATNOS/ASCAN protocol for automated side-chain res-
onance assignment is composed of two assignment phases. During
the first assignment phase, only 1H atoms for which the resonance
frequency of the covalently bound heavy atom was present in the
input are assigned. Thus, the first phase completes the assignment
of the 1Hα and 1Hβ atoms before the second phase aims at assigning
the more peripheral amino acid side-chain atoms. The entire assign-
ment procedure makes use of fixed as well as iteration-dependent
parameters. These control parameters are chosen in such a way that
initially atoms can only be assigned if the agreement between
experimental peak positions and chemical shift values of previously
assigned atoms is extremely good. Moreover, the initial peak pat-
tern is composed exclusively of NMR signals with a high signal-
to-noise ratio. In later iterations the control parameters are loosened
to allow the assignment of atoms for which the agreement is not so
good. NMR signals with reduced intensity due to line broadening,

solvent suppression, or inefficient magnetization transfer can at
this stage be used for obtaining new resonance assignments.
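The interplay between pattern matching and iteration-dependent acceptance can be sketched as follows. The example is a strong simplification: peaks are reduced to two 1H dimensions, the expected pattern of an unassigned proton consists of cross peaks to a few previously assigned partner frequencies, and the threshold schedule is invented for illustration. None of the names or numbers are taken from ATNOS/ASCAN.

```python
def match_fraction(candidate, partners, observed, tol):
    """Fraction of expected peaks (candidate, partner) found among observed peaks."""
    hits = sum(
        any(abs(w1 - candidate) <= tol and abs(w2 - partner) <= tol
            for w1, w2 in observed)
        for partner in partners
    )
    return hits / len(partners)

def assign_frequency(candidates, partners, observed, iteration, tol=0.02):
    """Return a frequency only if exactly one candidate passes the threshold."""
    # Hypothetical schedule: strict at first, loosened in later iterations.
    threshold = max(0.5, 1.0 - 0.25 * iteration)
    kept = [c for c in candidates
            if match_fraction(c, partners, observed, tol) >= threshold]
    return kept[0] if len(kept) == 1 else None

# Previously assigned partner frequencies (ppm) and observed peaks (ppm, ppm).
partners = [8.20, 4.35, 1.90]
observed = [(3.10, 8.20), (3.10, 4.35),
            (2.75, 8.21), (2.75, 4.36), (2.75, 1.91)]
candidates = [3.10, 2.75]

strict = assign_frequency(candidates, partners, observed, iteration=0)
loose = assign_frequency(candidates, partners, observed, iteration=2)
print(strict, loose)
```

With the strict threshold only the candidate at 2.75 ppm explains all expected peaks and is accepted. With the loosened threshold both candidates pass, so no assignment is stored in this sketch, which mirrors the requirement that only one potential resonance frequency be retained.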
6.2. Resonance Assignments Obtained by the ATNOS/ASCAN Approach
As is the case for interactive side-chain assignment, the ATNOS/ASCAN
procedure performs better on interior, buried residues than on
extensively solvent-exposed residues. In general, for both approaches,
the completeness of the side chain assignments correlates inversely
with the degree of solvent-accessibility. This is readily rationalized
if one considers that a much larger number of NOEs
is generally observed for interior atoms than for atoms at or near
the protein surface. ATNOS/ASCAN thus primarily assigns side-
chain atoms that are involved in numerous inter-residue NOEs.
Note that it is a special advantage of the [1H,1H]-NOESY-
based ATNOS/ASCAN approach that the same datasets are used
for the amino acid side chain assignments and for the collection of
NOE upper distance constraints. Since adjustments of polypeptide
backbone chemical shifts have already been made when preparing
the input for ATNOS/ASCAN, this eliminates the need for further
chemical shift adjustments between datasets recorded with differ-
ent experimental conditions, which is an intrinsically laborious
procedure that may introduce unnecessary ambiguity into the fol-
lowing step of NOESY assignment and structure calculation.
7. Automated NOESY Assignment and NMR Structure Calculation
In the UNIO protocol, automated NOESY spectral analysis follows
the ATNOS/CANDID approach (19, 54) that proceeds, as
all commonly used NOE assignment algorithms, in iterative cycles,
each consisting of exhaustive NOE signal identification and, in
part, ambiguous NOE assignments followed by a structure calcula-
tion. But in contrast to many other NOE assignment approaches
that operate on listings of peak positions and chemical shifts invari-
ant in all NOE assignment cycles, the combined use of ATNOS
NOE signal identification and CANDID NOE assignment waives
the common requirement for multiple rounds of manual peak list
preparation and refinement, and leads to a dramatic increase in the
efficiency and reproducibility of the NOESY spectral analysis.
7.1. Overview of the ATNOS/CANDID Approach
Each cycle of the iteratively performed NOESY spectral analysis
consists of automated NOESY peak picking with ATNOS, use of
the resulting lists of peak positions and peak intensities as input for
CANDID automated NOE assignment, and use of a set of NOE
distance restraints from CANDID as input for the structure calcu-
lation. Between subsequent ATNOS/CANDID cycles, information
is transferred exclusively through the intermediate 3D structures,
in that the protein molecular structure obtained in a given cycle is
used to guide NOE signal identification and NOE assignment in

the following cycle. The three main techniques that form the basis
of the automated CANDID NOE assignment algorithm are
ambiguous distance restraints (Subheading 7.2), network-anchored
assignment (Subheading 7.3) and constraint combination
(Subheading 7.4). The latter two concepts make the ATNOS/
CANDID approach robust with respect to the inevitable imperfec-
tions of NMR spectra. Network-anchored assignment and con-
straint combination ensure that the correct protein fold is already
obtained after the first ATNOS/CANDID cycle.
1. Input data for ATNOS/CANDID. The input consists of the
amino acid sequence of the protein, the resonance frequencies
of the previously assigned atoms, and one or several 2D or 3D
NOESY spectra. Optional conformational restraints can be
provided.
2. Automated ATNOS signal identification. The ATNOS algo-
rithm yields a listing of NOE cross peak positions and volumes.
3. Generation of initial assignment possibilities. For each NOESY
cross peak, one or multiple assignments are determined based
on chemical shift fitting within a user-defined tolerance range.
4. Ranking and elimination of initial assignment possibilities.
Only those initial assignment possibilities that contribute more
than an iteration-dependent threshold to the overall peak vol-
ume are retained. Thereby, the contribution of each initial
assignment possibility to a given peak volume is calculated as
function of the closeness of the chemical shift fit, the compat-
ibility with the covalent polypeptide structure, the network-
anchored score (see Subheading 7.3) and the compatibility
with the intermediate protein 3D structures (in ATNOS/
CANDID cycles 2, 3, …).
5. Calibration of NOE upper distance restraints. Upper unambig-
uous or ambiguous distance bounds (see Subheading 7.2) are
derived from the NOESY peak intensities.
6. Elimination of spurious NOESY cross peaks. Only those NOE
cross peaks that have at least one assignment possibility with a
network-anchored score above an iteration-dependent thresh-
old and are compatible with the intermediate 3D protein struc-
ture of the preceding cycle are retained (ATNOS/CANDID
cycles 2,3, …).
7. Constraint combination. In the first ATNOS/CANDID cycle,
unrelated long-range distance restraints are randomly combined
into new virtual distance restraints (see Subheading 7.4).
8. Structure calculation. A 3D protein structure is calculated
using torsion angle dynamics. The UNIO-ATNOS/CANDID
approach interfaces with either CNS or XPLOR-NIH or
CYANA. Return to step 1.
After ATNOS NOE signal identification, each CANDID cycle

starts with the generation for each NOESY cross peak of an initial
chemical shift-based assignment list, i.e., hydrogen atom pairs,
within the user-defined tolerance range, that contribute to the peak
are identified from the fit of chemical shifts. Subsequently, for each
cross peak these initial chemical shift-based assignments are
weighted with respect to several criteria, and initial assignments
with low overall scores are discarded. For each cross peak, the
retained assignments are interpreted in the form of an upper dis-
tance limit derived from the NOE cross peak volume. Thereby, a
conventional distance restraint is obtained for cross peaks with a
single retained assignment, and otherwise an ambiguous distance
restraint (see Subheading 7.2) is generated that embodies several
assignments. In addition, all NOE cross peaks with a poor score are
temporarily discarded. In order to reduce deleterious effects on the
resulting structure from erroneous distance restraints that may pass
the preceding filter step, long-range distance restraints are com-
bined into new virtual distance restraints in ATNOS/CANDID
cycle 1 (see Subheading 7.4). The standard ATNOS/CANDID
protocol consists of seven cycles. The second and subsequent cycles
differ from the first cycle by the use of additional selection criteria
for NOE assignment that are based on assessments relative to the
intermediate protein 3D structure from the preceding cycle. Since
the precision of the protein 3D structure model normally improves
with each subsequent cycle, the criteria for accepting cross peaks
and NOE assignments are successively tightened in more advanced
stages of the calculation.
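The generation of initial chemical shift-based assignment possibilities can be sketched for a single 2D cross peak. The atom labels, chemical shift values, and the 0.03 ppm tolerance below are invented for illustration, and the function name `initial_assignments` is hypothetical. The ranking by network-anchoring and structure compatibility described above is deliberately left out.

```python
from itertools import product

def initial_assignments(peak, shifts, tol=0.03):
    """All hydrogen atom pairs whose shifts fit both peak coordinates within tol."""
    w1, w2 = peak
    first = [atom for atom, w in shifts.items() if abs(w - w1) <= tol]
    second = [atom for atom, w in shifts.items() if abs(w - w2) <= tol]
    return [(a, b) for a, b in product(first, second) if a != b]

# Hypothetical chemical shift list (ppm) of previously assigned protons.
shifts = {"A12.HN": 8.31, "L45.HN": 8.33, "V7.HA": 4.10, "K30.HB2": 1.85}

ambiguous = initial_assignments((8.32, 4.11), shifts)
unique = initial_assignments((1.86, 4.11), shifts)
print(ambiguous)
print(unique)
```

The first peak retains two assignment possibilities and would give rise to an ambiguous distance restraint, whereas the second peak has a single possibility and would yield a conventional distance restraint.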
For proper performance and structure validation the following
two input criteria must be fulfilled prior to starting the ATNOS/
CANDID procedure: (1) The input chemical shift list must con-
tain more than 90% of the nonlabile and backbone amide 1H
chemical shifts. If 3D heteronuclear-resolved NOESY spectra are used,
more than 90% of the 15N and/or 13C chemical shifts must be avail-
able. (2) ATNOS must validate NOE signals for at least 85% of all
pairwise combinations of protons for which sequence-specific
NMR assignments are available, and which have covalent struc-
ture-imposed upper distance limits shorter than 5 Å. This condi-
tion requires high quality of the NOESY spectra and accurate
calibration of the input chemical shifts to the NOESY spectra.
A low percentage of validated NOE cross peaks typically results
when the signal-to-noise ratio is too poor for automated spectral
analysis, or the input chemical shifts are not well-calibrated to the
NOESY spectra. In this situation, the input data need to be critically
reevaluated before attempting a new automated NOESY interpre-
tation. In particular, the adaptation of the chemical shifts to the
NOESY spectra needs to be improved.
The following three criteria have to be met for validation of the
resulting 3D structure. (1) The average final target function value
from the first ATNOS/CANDID cycle should be below 250 Å²,

and the corresponding value for the last ATNOS/CANDID cycle
should be below 10 Å², with more than 80% of all picked NOESY
cross peaks assigned and less than 20% of the peaks with exclusively
long-range assignments eliminated by the filtering step applied in
CANDID. (2) The average backbone RMSD to the mean coordi-
nates for the structured parts of the polypeptide chain should be
below 3 Å for the bundle of conformers used to represent the pro-
tein structure from the first ATNOS/CANDID cycle. (3) The
RMSD drift between the mean atom coordinates after the first and
the last ATNOS/CANDID cycles calculated for the backbone
heavy atoms of the structured part of the polypeptide chain should
be smaller than 3 Å. The three output criteria emphasize the cru-
cial importance of getting the correct protein fold already after
ATNOS/CANDID cycle 1. For reliable automated NOESY analy-
sis, the initial 3D structure obtained should be reasonably compat-
ible with the input data and show a defined fold of the protein.
Structural changes between the first and subsequent ATNOS/
CANDID cycles should only occur within the conformation space
determined by the initial bundle of conformers obtained after
ATNOS/CANDID cycle 1.
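The input and output criteria listed above lend themselves to a simple automated check. The sketch below encodes only the numerical thresholds stated in the text. The function names and dictionary keys are hypothetical, and the way the individual quantities are extracted from a calculation is not covered.

```python
def failed_input_criteria(shift_completeness, validated_noe_fraction):
    """Return the input criteria that are not met (fractions between 0 and 1)."""
    failed = []
    if not shift_completeness > 0.90:
        failed.append("more than 90% of the chemical shifts assigned")
    if not validated_noe_fraction >= 0.85:
        failed.append("at least 85% of short covalent distances validated")
    return failed

def failed_output_criteria(result):
    """Return the output criteria that are not met for a finished calculation."""
    checks = {
        "target function, cycle 1 < 250 A^2": result["tf_first"] < 250.0,
        "target function, last cycle < 10 A^2": result["tf_last"] < 10.0,
        "> 80% of picked peaks assigned": result["assigned_fraction"] > 0.80,
        "< 20% of long-range-only peaks eliminated":
            result["eliminated_long_range_fraction"] < 0.20,
        "backbone RMSD to mean, cycle 1 < 3 A": result["rmsd_first"] < 3.0,
        "RMSD drift, first to last cycle < 3 A": result["rmsd_drift"] < 3.0,
    }
    return [name for name, passed in checks.items() if not passed]

# Invented example values for one calculation.
result = {
    "tf_first": 310.0,
    "tf_last": 4.2,
    "assigned_fraction": 0.88,
    "eliminated_long_range_fraction": 0.12,
    "rmsd_first": 2.1,
    "rmsd_drift": 1.4,
}
print(failed_input_criteria(0.94, 0.87))
print(failed_output_criteria(result))
```

In this invented example the input criteria are met, but the target function of the first cycle exceeds the limit, which would call for a critical reevaluation of the input data.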
7.2. Ambiguous Distance Restraints
The high NOE assignment ambiguity at the outset of a protein
structure determination can be resolved by temporarily ignoring
cross peaks with too many (typically, more than two) assignment
possibilities and instead generating distance restraints for all assign-
ment possibilities of the remaining cross peaks. However, such a
procedure requires highly accurate chemical shift values and NOE
cross peak positions to be present in the input data and is hardly
achievable under realistic, experimental conditions. A more elegant
way for handling the initial chemical shift-based assignment ambi-
guity is given by the concept of ambiguous distance restraints
(63, 67). When using ambiguous distance restraints, each individ-
ual NOE cross peak is treated as the superposition of n degenerate
signals arising from each of its multiple initial chemical shift-based
assignments, using relative weights proportional to the inverse
sixth power of the corresponding interatomic distance. A NOE
cross peak uniquely assigned to a pair of hydrogen atoms, $\alpha$ and
$\beta$, gives rise to an upper distance limit $b$ for the corresponding
distance, $d_{\alpha\beta} \le b$. A NOESY cross peak with two or more
assignment possibilities ($n \ge 2$) is then interpreted as an ambiguous
distance restraint with an effective, or $r^{-6}$-summed, distance
$d_{\mathrm{eff}}$:

$$d_{\mathrm{eff}} = \left( \sum_{i=1}^{n} d_i^{-6} \right)^{-1/6}$$

The sum runs over all distances $d_i = d(\alpha, \beta)_i$ corresponding to
the given chemical shift-based assignment possibility between the
two hydrogen atoms, $\alpha$ and $\beta$. In this way, information from

NOE cross peaks with multiple initial assignment possibilities can
be used for the structure calculation, and although inclusion of
erroneous assignments for a given cross peak results in a loss of
information, it will not lead to inconsistencies as long as one or
several correct assignments are among the initial assignments. This
is due to the fact that the effective distance $d_{\mathrm{eff}}$ is always shorter
than any of the individual distances $d_i = d(\alpha, \beta)_i$.
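A short numerical illustration of the effective distance may be helpful. The function name `effective_distance` and the three example distances are chosen freely; only the inverse sixth power summation itself follows the definition given above.

```python
def effective_distance(distances):
    """r^-6-summed effective distance over all assignment possibilities."""
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)

# Three assignment possibilities with interatomic distances in angstroms.
distances = [3.0, 5.0, 8.0]
d_eff = effective_distance(distances)
print(round(d_eff, 3), min(distances))
```

The effective distance of about 2.98 Å lies slightly below the shortest individual distance of 3.0 Å. The sum is dominated by the shortest distance, so an upper limit on the effective distance can be satisfied by the one short contact regardless of the other, possibly erroneous, assignment possibilities.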
7.3. Network-Anchored NOE Assignment
The concept of ambiguous distance restraints is quite efficient for
improving and completing the NOESY assignment once a correct
preliminary polypeptide 3D fold is available, e.g., based on a limited
set of interactively assigned NOESY cross peaks. However,
obtaining a correct initial protein fold at the outset of a de novo
structure determination often proves to be difficult, because struc-
ture-based filters used for the detection and elimination of errone-
ous cross peaks in the input data and for the discrimination between
multiple initial chemical shift-based cross peak assignments are not
yet operational.
To achieve reliable and robust automated NOE assignment for
de novo protein NMR structure determination, the NOE assign-
ment process cannot solely rely on chemical shift agreement between
resonance frequencies of assigned atoms and frequency coordinates
of the NMR signals, and the subsequent use of ambiguous distance
restraints. Indeed, techniques to remove artifacts prior to any
knowledge of a structure model must also be included.
One powerful concept for robust automated NOE assignment
is network-anchored assignment (19). Network-anchoring imitates
the modus operandi of an experienced spectroscopist who typically
decides on the assignment of an individual NOE cross peak on the
basis of the set of already assigned NOE cross peaks. Network-
anchored assignment exploits the observation that the correctly
assigned restraints form a self-consistent subset in any network of
distance restraints that is sufficiently dense for the determination of
a protein 3D structure. Network-anchoring thereby evaluates the
self-consistency of the NOE assignments independently of any
knowledge about the 3D protein structure, and in this way com-
pensates for the absence of 3D structural information at the outset
of a de novo structure determination. The requirement that each
NOE assignment must be embedded in the network of all other
assignments makes network-anchoring a sensitive approach for
detecting erroneous restraints that might artificially constrain
unstructured parts of the protein. Such restraints might not lead to
systematic constraint violation during the structure calculation,
and therefore might also escape 3D structure-based filtering meth-
ods. The concept of network-anchored assignment has proved effi-
cient and reliable in searching for the correct fold especially in the
initial phase of de novo NMR structure determinations.
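The idea of embedding an assignment in the network of other assignments can be conveyed by a deliberately simplified sketch. Here the support for a candidate assignment between two atoms is taken as the number of third atoms that are already connected to both of them, either by other assignments or by covalent structure. This counting rule, the function name, and the atom labels are assumptions made for illustration. The scoring actually used in CANDID is more elaborate and is not reproduced here.

```python
def network_support(candidate, contacts):
    """Count third atoms connected to both atoms of the candidate assignment."""
    a, b = candidate
    neighbours = {}
    for x, y in contacts:
        neighbours.setdefault(x, set()).add(y)
        neighbours.setdefault(y, set()).add(x)
    shared = neighbours.get(a, set()) & neighbours.get(b, set())
    return len(shared - {a, b})

# Hypothetical contacts already supported by other assignments
# or by the covalent structure.
contacts = [
    ("L5.HA", "V20.HB"), ("V20.HB", "F33.HD1"),
    ("L5.HA", "I21.HN"), ("I21.HN", "F33.HD1"),
    ("L5.HA", "K60.HE2"),
]

supported = network_support(("L5.HA", "F33.HD1"), contacts)
isolated = network_support(("L5.HA", "D48.HB2"), contacts)
print(supported, isolated)
```

The first candidate is anchored by two third atoms, whereas the second has no support from the network and would be a likely candidate for elimination under a threshold on this score.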
7.4. Constraint Combination
In the practice of NMR structure determination with biological
macromolecules, the presence of spurious distance restraints is
hardly avoidable in the input for a structure calculation at the out-
set of the NOESY analysis, i.e., before a 3D structure is available to
filter artifacts. A key technique to weaken structural distortions
caused by erroneous distance restraints is constraint combination
(19). Constraint combination generates virtual distance restraints
with combined assignments from different, in general unrelated
(medium- and long-range) NOE cross peaks. Constraint combination
is thus an extension of the concept introduced by ambiguous
distance restraints. The basic property of an ambiguous distance
restraint is that the restraint will be satisfied by the correct protein
structure provided that at least one of the assignments is correct.
Combined restraints therefore have a correspondingly lower prob-
ability of being erroneous than individual ones. Constraint combi-
nation, thus, significantly reduces the impact of artifacts on the
resulting 3D protein structure, at the expense, however, of a tem-
porary loss of information.
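The effect of constraint combination can be sketched with a random pairwise combination of long-range restraints. The sketch assumes that two restraints are merged by pooling their assignments and keeping the larger of the two upper bounds, and that errors in different restraints are independent. Both assumptions, as well as the function name and the example restraints, are illustrative and are not taken from the text.

```python
import random

def combine_restraints(restraints, seed=0):
    """Randomly pair restraints and merge each pair into one virtual restraint."""
    rng = random.Random(seed)
    shuffled = list(restraints)
    rng.shuffle(shuffled)
    combined = []
    # An unpaired last restraint (odd count) is dropped in this sketch.
    for first, second in zip(shuffled[0::2], shuffled[1::2]):
        combined.append({
            "assignments": first["assignments"] + second["assignments"],
            "upper": max(first["upper"], second["upper"]),  # assumed rule
        })
    return combined

# Hypothetical long-range restraints with upper bounds in angstroms.
restraints = [
    {"assignments": [("L5.HA", "F33.HD1")], "upper": 4.5},
    {"assignments": [("V7.HB", "I40.HG12"), ("V7.HB", "L52.HG")], "upper": 5.0},
    {"assignments": [("A12.HB1", "W61.HE3")], "upper": 5.5},
    {"assignments": [("T18.HG21", "Y70.HD1")], "upper": 4.0},
]
combined = combine_restraints(restraints)

# If a single restraint is erroneous with probability p, a combination of
# two independent restraints is erroneous only if both are: p squared.
p_single = 0.10
p_combined = p_single ** 2
print(len(combined), p_combined)
```

Four restraints are reduced to two virtual restraints, which illustrates the temporary loss of information, while the assumed error probability per restraint drops from 10% to about 1%.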
References
1. Billeter, M., Wagner, G., and Wüthrich, K. (2008) Solution NMR structure determination of proteins revisited. J. Biomol. NMR 42, 155–158.
2. Williamson, M. P., and Craven, C. J. (2009) Automated protein structure calculation from NMR data. J. Biomol. NMR 43, 131–143.
3. Wüthrich, K. (1986) NMR of Proteins and Nucleic Acids. Wiley, New York.
4. Altieri, A. S., and Byrd, R. A. (2004) Automation of NMR structure determination of proteins. Curr. Opin. Struct. Biol. 14, 547–553.
5. Baran, M. C., Huang, Y. J., Moseley, H. N. B., and Montelione, G. T. (2004) Automated analysis of protein NMR assignments and structures. Chem. Rev. 104, 3541–3555.
6. Huang, Y. P. J., Moseley, H. N. B., Baran, M. C., Arrowsmith, C., Powers, R., Tejero, R., et al. (2005) An integrated platform for automated analysis of protein NMR structures. Methods Enzymol. 394, 111–141.
7. Gronwald, W., and Kalbitzer, H. R. (2004) Automated structure determination of proteins by NMR spectroscopy. Prog. Nucl. Magn. Reson. Spectrosc. 44, 33–96.
8. Güntert, P. (2009) Automated structure determination from NMR spectra. Eur. Biophys. J. 38, 129–143.
9. Kraulis, P. J. (1989) Ansig - a Program for the Assignment of Protein H-1 2d-Nmr Spectra by Interactive Computer-Graphics. J. Magn. Reson. 84, 627–633.
10. Johnson, B. A., and Blevins, R. A. (1994) Nmr View - a Computer-Program for the Visualization and Analysis of Nmr Data. J. Biomol. NMR 4, 603–614.
11. Bartels, C., Xia, T. H., Billeter, M., Güntert, P., and Wüthrich, K. (1995) The Program Xeasy for Computer-Supported Nmr Spectral-Analysis of Biological Macromolecules. J. Biomol. NMR 6, 1–10.
12. Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A. (1995) Nmrpipe - a Multidimensional Spectral Processing System Based on Unix Pipes. J. Biomol. NMR 6, 277–293.
13. Neidig, K. P., Geyer, M., Gorler, A., Antz, C., Saffrich, R., Beneicke, W., et al. (1995) Aurelia, a Program for Computer-Aided Analysis of Multidimensional Nmr-Spectra. J. Biomol. NMR 6, 255–270.
14. Goddard, T. D., and Kneller, D. G. (2001) SPARKY 3. University of California, San Francisco.
15. Keller, R. L. J. (2004) Optimizing the process of nuclear magnetic resonance spectrum analysis and computer aided resonance assignment. Ph.D. thesis. Diss. ETH Nr. 15947. ETH Zurich, Zurich, Switzerland.
16. Kobayashi, N., Iwahara, J., Koshiba, S., Tomizawa, T., Tochio, N., Güntert, P., et al. (2007) KUJIRA, a package of integrated modules for systematic and interactive analysis of NMR data directed to high-throughput NMR structure studies. J. Biomol. NMR 39, 31–52.
17. Mumenthaler, C., Güntert, P., Braun, W., and Wüthrich, K. (1997) Automated combined assignment of NOESY spectra and three-dimensional protein structure determination. J. Biomol. NMR 10, 351–362.
18. Gronwald, W., Moussa, S., Elsner, R., Jung, A., Ganslmeier, B., Trenner, J., et al. (2002) Automated assignment of NOESY NMR spectra using a knowledge based method (KNOWNOE). J. Biomol. NMR 23, 271–287.
19. Herrmann, T., Güntert, P., and Wüthrich, K. (2002) Protein NMR structure determination with automated NOE assignment using the new software CANDID and the torsion angle dynamics algorithm DYANA. J. Mol. Biol. 319, 209–227.
20. Linge, J. P., Habeck, M., Rieping, W., and Nilges, M. (2003) ARIA: automated NOE assignment and NMR structure calculation. Bioinformatics 19, 315–316.
21. Kuszewski, J., Schwieters, C. D., Garrett, D. S., Byrd, R. A., Tjandra, N., and Clore, G. M. (2004) Completely automated, highly error-tolerant macromolecular structure determination from multidimensional nuclear overhauser enhancement spectra and chemical shift assignments. J. Am. Chem. Soc. 126, 6258–6273.
22. Huang, Y. J., Tejero, R., Powers, R., and Montelione, G. T. (2006) A topology-constrained distance network algorithm for protein structure determination from NOESY data. Methods Enzymol 62, 587–603.
23. Güntert, P. (2003) Automated NMR protein structure calculation. Prog. Nucl. Magn. Reson. Spectrosc. 43, 105–125.
24. Bernstein, R., Cieslar, C., Ross, A., Oschkinat, H., Freund, J., and Holak, T. A. (1993) Computer-Assisted Assignment of Multidimensional Nmr-Spectra of Proteins - Application to 3d Noesy-Hmqc and Tocsy-Hmqc Spectra. J. Biomol. NMR 3, 245–251.
25. Olson, J. B., and Markley, J. L. (1994) Evaluation of an Algorithm for the Automated Sequential Assignment of Protein Backbone Resonances - a Demonstration of the Connectivity Tracing Assignment Tools (Contrast) Software Package. J. Biomol. NMR 4, 385–410.
26. Lukin, J. A., Gove, A. P., Talukdar, S. N., and Ho, C. (1997) Automated probabilistic method for assigning backbone resonances of (C-13,N-15)-labeled proteins. J. Biomol. NMR 9, 151–166.
27. Bartels, C., Güntert, P., Billeter, M., and Wüthrich, K. (1997) GARANT - A general algorithm for resonance assignment of multidimensional nuclear magnetic resonance spectra. J. Comput. Chem. 18, 139–149.
28. Choy, W. Y., Sanctuary, B. C., and Zhu, G. (1997) Using neural network predicted secondary structure information in automatic protein NMR assignment. J. Chem. Inf. Comput. Sci. 37, 1086–1094.
29. Buchler, N. E. G., Zuiderweg, E. R. P., Wang, H., and Goldstein, R. A. (1997) Protein NMR assignments using mean-field simulated annealing. Biophys. J. 72, Wp447–Wp447.
30. Croft, D., Kemmink, J., Neidig, K. P., and Oschkinat, H. (1997) Tools for the automated assignment of high-resolution three-dimensional protein NMR spectra based on pattern recognition techniques. J. Biomol. NMR 10, 207–219.
31. Zimmerman, D. E., Kulikowski, C. A., Huang, Y. P., Feng, W. Q., Tashiro, M., Shimotakahara, S., et al. (1997) Automated analysis of protein NMR assignments using methods from artificial intelligence. J. Mol. Biol. 269, 592–610.
32. Gronwald, W., Willard, L., Jellard, T., Boyko, R. E., Rajarathnam, K., Wishart, D. S., et al. (1998) CAMRA: Chemical shift based computer aided protein NMR assignments. J. Biomol. NMR 12, 395–405.
33. Leutner, M., Gschwind, R. M., Liermann, J., Schwarz, C., Gemmecker, G., and Kessler, H. (1998) Automated backbone assignment of labeled proteins using the threshold accepting algorithm. J. Biomol. NMR 11, 31–43.
34. Moseley, H. N. B., Monleon, D., and Montelione, G. T. (2001) Automatic determination of protein backbone resonance assignments from triple resonance nuclear magnetic resonance data. Nuc. Magn. Reson. Biol. Macromol. 339, 91–108.
35. Coggins, B. E., and Zhou, P. (2003) PACES: Protein sequential assignment by computer-assisted exhaustive search. J. Biomol. NMR 26, 93–111.
36. Malmodin, D., Papavoine, C. H. M., and Billeter, M. (2003) Fully automated sequence-specific resonance assignments of heteronuclear protein spectra. J. Biomol. NMR 27, 69–79.
37. Hitchens, T. K., Lukin, J. A., Zhan, Y. P., McCallum, S. A., and Rule, G. S. (2003) MONTE: An automated Monte Carlo based approach to nuclear magnetic resonance assignment of proteins. J. Biomol. NMR 25, 1–9.
38. Moseley, H. N. B., Riaz, N., Aramini, J. M., Szyperski, T., and Montelione, G. T. (2004) A generalized approach to automated NMR peak list editing: application to reduced dimensionality triple resonance spectra. J. Magn. Reson. 170, 263–277.
39. Eghbalnia, H. R., Bahrami, A., Wang, L. Y., Using Automatic Computer-Analysis of Contour
Assadi, A., and Markley, J. L. (2005) Diagrams. J. Magn. Reson. 95, 214–220.
Probabilistic identification of spin systems and 51. Carrara, E. A., Pagliari, F., and Nicolini, C.
their assignments including coil-helix inference (1993) Neural Networks for the Peak-Picking
as output (PISTACHIO). J. Biomol. NMR 32, of Nuclear-Magnetic-Resonance Spectra.
219–233. Neural Networks 6, 1023–1032.
40. Lin, H. N., Wu, K. P., Chang, J. M., Sung, T. 52. Antz, C., Neidig, K. P., and Kalbitzer, H. R.
Y., and Hsu, W. L. (2005) GANA - a genetic (1995) A General Bayesian Method for an
algorithm for NMR backbone resonance assign- Automated Signal Class Recognition in 2d
ment. Nucleic Acids Res. 33, 4593–4601. Nmr-Spectra Combined with a Multivariate
41. Masse, J. E., Keller, R., and Pervushin, K. Discriminant-Analysis. J. Biomol. NMR 5,
(2006) SideLink: Automated side-chain assign- 287–296.
ment of biopolymers from NMR data by rela- 53. Koradi, R., Billeter, M., Engeli, M., Güntert,
tive-hypothesis-prioritization-based simulated P., and Wüthrich, K. (1998) Automated peak
logic. J. Magn. Reson. 181, 45–67. picking and peak integration in macromolecu-
42. Masse, J. E., and Keller, R. (2005) AutoLink: lar NMR spectra using AUTOPSY. J. Magn.
Automated sequential resonance assignment of Reson. 135, 288–297.
biopolymers from NMR data by relative- 54. Herrmann, T., Güntert, P., and Wüthrich, K.
hypothesis-prioritization-based simulated logic. (2002) Protein NMR structure determination
J. Magn. Reson. 174, 133–151. with automated NOE-identification in the
43. Wang, J. Y., Wang, T. Z., Zuiderweg, E. R. P., NOESY spectra using the new software
and Crippen, G. M. (2005) CASA: An efficient ATNOS. J. Biomol. NMR 24, 171–189.
automated assignment of protein mainchain 55. Dancea, F., and Gunther, U. (2005) Automated
NMR data using an ordered tree search algo- protein NMR structure determination using
rithm. J. Biomol. NMR 33, 261–279. wavelet de-noised NOESY spectra. J. Biomol.
44. Kamisetty, H., Bailey-Kellogg, C., and NMR 33, 139–152.
Pandurangan, G. (2006) An efficient random- 56. Brünger, A. T., Adams, P. D., Clore, G. M.,
ized algorithm for contact-based NMR back- DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W.,
bone resonance assignment. Bioinformatics 22, et al. (1998) Crystallography & NMR system: A
172–180. new software suite for macromolecular structure
45. Vitek, O., Bailey-Kellogg, C., Craig, B., and determination. Acta Cryst. D 54, 905–921.
Vitek, J. (2006) Inferential backbone assign- 57. Schwieters, C. D., Kuszewski, J. J., Tjandra,
ment for sparse data. J. Biomol. NMR 35, N., and Clore, G. M. (2003) The Xplor-NIH
187–208. NMR molecular structure determination pack-
46. Wu, K. P., Chang, J. M., Chen, J. B., Chang, C. age. J. Magn. Reson. 160, 65–73.
F., Wu, W. J., Huang, T. H., et al. (2006) 58. Güntert, P., Mumenthaler, C., and Wüthrich,
RIBRA - An error-tolerant algorithm for K. (1997) Torsion angle dynamics for NMR
the NMR backbone assignment problem. structure calculation with the new program
J. Comput. Biol. 13, 229–244. DYANA. J. Mol. Biol. 273, 283–298.
47. Volk, J., Herrmann, T., and Wüthrich, K. 59. Doreleijers, J. F., Mading, S., Maziuk, D.,
(2008) Automated sequence-specific protein Sojourner, K., Yin, L., Zhu, J., et al. (2003)
NMR assignment using the memetic algorithm BioMagResBank database with sets of experi-
MATCH. J. Biomol. NMR 41, 127–138. mental NMR constraints corresponding to the
48. Fiorito, F., Herrmann, T., Damberger, F. F., structures of over 1400 biomolecules deposited
and Wüthrich, K. (2008) Automated amino in the Protein Data Bank. J. Biomol. NMR 26,
acid side-chain NMR assignment of proteins 139–146.
using C-13- and N-15-resolved 3D [H-1,H- 60. Hiller, S., Wider, G., and Wüthrich, K. (2008)
1]-NOESY. J. Biomol. NMR 42, 23–33. APSY-NMR with proteins: practical aspects and
49. Neidig, K. P., Saffrich, R., Lorenz, M., and backbone assignment. J. Biomol. NMR 42,
Kalbitzer, H. R. (1990) Cluster-Analysis and 179–195.
Multiplet Pattern-Recognition in 2-Dimensional 61. Hiller, S., Fiorito, F., Wüthrich, K., and Wider,
Nmr-Spectra. J. Magn. Reson. 89, 543–552. G. (2005) Automated projection spectroscopy
50. Garrett, D. S., Powers, R., Gronenborn, A. M., (APSY). Proc. Natl. Acad. Sci. USA 102,
and Clore, G. M. (1991) A Common-Sense 10876–10881.
Approach to Peak Picking in 2-Dimensional, 62. Güntert, P., and Wüthrich, K. (1992) Flatt - a
3-Dimensional, and 4-Dimensional Spectra New Procedure for High-Quality Base-Line
Correction of Multidimensional Nmr-Spectra. of the PASD algorithm. J. Biomol. NMR 41,

J. Magn. Reson. 96, 403–407. 221–239.
63. Nilges, M. (1995) Calculation of Protein 65. Schwieters, C. D., Kuszewski, J. J., and Clore,
Structures with Ambiguous Distance Restraints - G. M. (2006) Using Xplor-NIH for NMR
Automated Assignment of Ambiguous Noe molecular structure determination. Prog. Nucl.
Crosspeaks and Disulfide Connectivities. J. Mol. Magn. Reson. Spectrosc. 48, 47–62.
Biol. 245, 645–660. 66. Fiorito, F., Hiller, S., Wider, G., and
64. Kuszewski, J. J., Thottungal, R. A., Clore, G. M., Wüthrich, K. (2006) Automated resonance
and Schwieters, C. D. (2008) Automated error- assignment of proteins: 6D APSY-NMR.
tolerant macromolecular structure determina- J. Biomol. NMR 35, 27–37.
tion from multidimensional nuclear Overhauser 67. Nilges, M. (1993) A Calculation Strategy for
enhancement spectra and chemical shift assign- the Structure Determination of Symmetrical
ments: improved robustness and performance Dimers by H-1-Nmr. Proteins 17, 297–309.
Chapter 23
ARIA for Solution and Solid-State NMR

Benjamin Bardiaux, Thérèse Malliavin, and Michael Nilges
Abstract
In solution or solid-state, determining the three-dimensional structure of biomolecules by Nuclear
Magnetic Resonance (NMR) normally requires the collection of distance information. The interpretation
of the spectra containing this distance information is a critical step in an NMR structure determination. In
this chapter, we present the Ambiguous Restraints for Iterative Assignment (ARIA) program for auto-
mated cross-peak assignment and determination of macromolecular structure from solution and solid-state
NMR experiments. While the program was initially designed for the assignment of nuclear Overhauser
effect (NOE) resonances, it has been extended to the interpretation of magic-angle spinning (MAS) solid-
state NMR data. This chapter first details the concepts and procedures carried out by the program. Then,
we describe both the general strategy for structure determination with ARIA 2.3 and practical aspects of
the technique. ARIA 2.3 includes all recent developments, such as an extended integration of the
Collaborative Computing Project for the NMR community (CCPN), the incorporation of the log-har-
monic distance restraint potential and an automated treatment of symmetric oligomers.
Key words: Ambiguous distance restraint, Structure calculation, Automated assignment, MAS,
Solid-state NMR, CCPN, NOE, ARIA, PDSD, CHHC
1. Introduction
Nuclear Magnetic Resonance (NMR) is widely used in the field of
structural biology. Most structure determinations by NMR rely on
the measurement of distances and angles between nuclei, the dis-
tances playing a crucial role in the fold determination. In solution,
these distances are measured by nuclear Overhauser effect spec-
troscopy (NOESY) (1). The intensity of the nuclear Overhauser
effect (NOE), produced by the magnetization transfer through the
dipolar coupling between the observed spins, is related to the dis-
tance between the two interacting spins. The qualitative estimate of
distances from NOE intensities is then translated into interatomic
restraints and the structure is calculated from these restraints.
Structure determination from NOEs thus requires the assignment
of the NOE cross-peaks to pairs of magnetically interacting spins.
However, this assignment cannot generally be obtained without
the knowledge of the structure. In fact, unambiguously assigning
NOE cross-peaks is sometimes very difficult due to inadequate
spectral resolution, chemical shift degeneracy and potentially over-
lapped cross-peaks.
The introduction of the concept of Ambiguous Distance
Restraints (ADR) (2) was a breakthrough in the treatment of
degenerate NOE assignments, since it actually derives distance
information from ambiguously assigned cross-peaks. The intricate
relationship between the structure determination and the NOE
assignment led to the development of an iterative automatic proce-
dure to simultaneously calculate the structure and assign the NOEs.
In this procedure, structure calculation from ADRs and cross-peak
assignment are performed alternately by comparing the tentative
ambiguous assignment to an ensemble of molecular conformations
determined on the basis of ADRs. The implementation of this iter-
ative strategy is largely automated in the software ARIA (Ambiguous
Restraints for Iterative Assignment) (3–6), which is described in
more detail below. ARIA is open-source software, widely disseminated
in the biological NMR community.
Magic-Angle Spinning (MAS) solid-state NMR (ssNMR) spectroscopy
has been applied to the structure determination
of proteins in microcrystalline or fibrillar form. Long-range
structural restraints can also be obtained from proton-driven spin
diffusion experiments or proton-mediated rare-spin detected cor-
relation experiments. However, cross-peak assignment is compli-
cated by the larger band widths that induce substantial ambiguities
in resonance assignments. The first de novo determination of a pro-
tein structure from MAS ssNMR marked an important step in the
field (7). Shortly after, it was demonstrated that automated meth-
ods for cross-peak assignment, such as ARIA or CYANA (8), could
be successfully applied to carbon–carbon or proton–proton corre-
lation NMR experiments in the solid-state (9–12). ARIA now
incorporates routines for ssNMR structure determination by using
various solid-state NMR spectra.
In addition, the in-depth integration of the Collaborative
Computing Project for the NMR community in ARIA streamlines
the structure determination process by NMR by facilitating import
and export of data. Other recent improvements to ARIA include:
(i) implementation of the network anchoring approach (8, 13)
adapted to the ARIA philosophy, (ii) automated treatment of
symmetric oligomers (13) and (iii) the availability of the log-
harmonic potential and the Bayesian estimation of optimal
restraint weight (14).
2. Materials
2.1. ARIA Software Package
The following software packages are required to use ARIA.
1. ARIA software package. ARIA (6) is written in the program-
ming language Python (15). The current version is 2.3.
Instructions on how to install ARIA can be found in the ARIA
installation archive, which should be downloaded from
http://aria.pasteur.fr. ARIA can be installed on computers operating
under Linux, Windows or Mac OS X.
2. CNS software. To enable specific features used by ARIA, it is
necessary to compile the CNS program (16) with libraries pro-
vided within the ARIA package.
3. Optional: CCPNmr Analysis software package (version 2 or
later) (17). ARIA uses the CCPN data model to read input
data and to store all results in a general format.
4. Optional: Access to a computer cluster for distributed
calculation.
2.2. Input Data
The minimal set of data required by ARIA consists of (see Note 1):
1. Definition of the molecular system.
2. List(s) of chemical shift assignments of 1H (for 2D-NOESY)
and 13C/15N if necessary for 3D-NOESY or for MAS solid-
state NMR spectra (see Note 2).
3. One or more lists of cross-peaks with chemical shift positions
in each dimension and peak volumes/intensities. Individual
peaks can be either fully assigned, partially assigned or com-
pletely unassigned. A list of cross-peaks generally corresponds
to the peaks picked in a particular spectrum. It is recommended
that similar experiments performed with different mixing times
are entered as separated lists.
ARIA also integrates various data types for additional experi-
mental information. All restraints must be in CNS “tbl” format
(see Note 1).
1. Hydrogen bonds: The distance between hydrogen donor and
acceptor as well as the distance between acceptor and hydrogen.
2. Dihedral angles: Dihedral angle restraints incorporated using a
flat-bottom harmonic-wall potential.
3. J-couplings: Calculated J-couplings are directly refined against
observed J-couplings.
4. Residual dipolar couplings: Residual dipolar coupling (RDC)
data as restraints.
5. Distance restraints: Preformatted distance restraints, e.g., from
manual assignments.
6. Preliminary structure or structure ensemble: PDB formatted
file(s) from a previous calculation or models (see Note 1).
7. CCPN project: CCPN project containing the same data as
listed above but directly imported into ARIA without format
conversion.
2.3. Software for Structural Quality Checks
Additional software required to analyze the quality of the final
structure ensembles:
1. PROCHECK (18).
2. WHAT IF (or WHAT_CHECK) (19).
3. ProSa II (or ProSa 2003) (20).
4. MolProbity suite (21).
3. Methods
The general workflow of the ARIA methodology is presented in
Fig. 1. After an initial chemical-shift based cross-peak assignment
and a calibration step, ambiguous distance restraints are derived
from the cross-peaks (NOEs or C-C correlations). From these
restraints, an ensemble of conformers is calculated. On the basis of
these structures, noise peaks are detected with a violation analysis,
and unlikely assignment possibilities are discarded. This process is
iterated several times (nine by default) with optimized parameters
for each iteration. Each step of the protocol is described in detail in
the sections below.
3.1. Preparation Phase
Before cross-peak assignment and structure calculation, the following
steps are automatically performed by ARIA. First, the data
are checked and filtered for errors and inconsistencies. The pro-
gram then creates the molecular topology of the system.
3.1.1. Data Filtering
When checking the chemical shift assignments for consistency,
ARIA considers three possible situations:
1. A unique assignment consisting of a single atom and a single
chemical shift.
2. A degenerate chemical shift assignment, where one group of
equivalent atoms is assigned to exactly one chemical shift.
3. An assignment of the two substituents of a prochiral group,
which can have one or two chemical shifts.
In the last case, floating chirality assignment (22) is used in
the resulting restraints (cf. Subheading 3.3.5). Peaks that lack frequency
information or that have incorrect or missing peak sizes are
removed (see Note 3).
[Figure 1: workflow diagram, not reproduced here. User-provided input: cross-peak lists, chemical shift assignments, molecular definition, additional restraints (dihedrals, RDC, distance). Steps performed by ARIA: filtering, initial cross-peak assignment, molecular topology creation, calibration, violation analysis, noise peak removal, partial assignment, restraints merging, structure calculation, floating chirality assignment, refinement in explicit solvent, quality analysis, generation of report. Results: cross-peak assignments, restraint violation list, PDB structure ensemble, structure quality statistics.]
Fig. 1. Description of the ARIA protocol workflow. Rounded rectangles indicate steps performed by ARIA, folded rectangles
correspond to user provided input-data and trapezoids represent results.
3.1.2. Molecular Topology Creation
From the definition of the molecular system provided as input data,
ARIA creates a molecular topology file (MTF) with the program
CNS (16). Name, chemical type, charge and mass of each atom as
well as the covalent connectivity are defined in the MTF. An
extended conformation of the molecule is then generated by CNS
and the coordinates are stored in a PDB file (cf. Subheading 3.8).
The molecular topology is created automatically for standard bio-
polymers. If applicable, topological features can be easily defined
by the user through the graphical interface (cf. Subheading 3.7).
3.2. Initial Cross-Peak Assignment
3.2.1. Chemical-Shift Based Assignment
For every cross-peak, ARIA uses the chemical shift lists from the
sequential resonance assignment to derive possible assignments. As
illustrated in Fig. 2, the peak position is defined by its frequency
coordinates (c1, c2) in each dimension of the spectrum. To account
for the limited precision in chemical shift measurements, for the
uncertainty of the cross-peak coordinates and for systematic experimental
errors, chemical shift tolerances (δ1, δ2) are applied around
the peak position. The tolerances should be chosen to be sufficiently
large to obtain frequency windows that can compensate for all
sources of inconsistencies between the list of resonance assignments
and the cross-peak lists.
[Figure 2: schematic of a cross-peak at (c1, c2) with its tolerance window spanning c1−δ1 to c1+δ1 in dimension 1 and c2−δ2 to c2+δ2 in dimension 2; resonances pa, pb, pc, pd fall within the window in dimension 1 and px, py, pz in dimension 2.]
Fig. 2. Illustration of the assignment of a cross-peak. c1, c2 denote the peak coordinates in
frequency space. The assignment frequency window is indicated by the solid black square,
defined from the chemical shift tolerances δ1 and δ2. The coordinates of the (hypothetical)
correct assignment are represented by the gray dashed lines (pb, py). Multiple resonances
within the tolerance window (pa, pb, pc, pd in dimension 1 and px, py, pz in the other
dimension) give rise to 12 assignment possibilities.
Then, for each peak dimension, all
protons (or 13C/15N spins for MAS ssNMR) whose chemical shifts
fall in the peak frequency windows are collected (see Note 4). In
the case of 3D or 4D heteronuclear spectra, the hetero atom
attached to the proton must also match the corresponding chemi-
cal shift window. The list of all assignment possibilities (or contri-
butions) for a cross-peak is generated from the combination of the
resonance assignments (Fig. 2). The sizes of the frequency windows
play an important role in the initial cross-peak assignment
step (see Note 5). In addition, the completeness of the chemical
shift assignments influences the accuracy of the initial assignment
(see Note 6). For symmetric oligomers, since symmetric nuclei will
have the same chemical shifts, ARIA will collect possible assign-
ments for all monomers. To simplify the treatment of the resulting
highly ambiguous assignments (see Note 7), ARIA considers only
one dimension (of the two corresponding to the through-space
correlation) as ambiguous in terms of chain assignment. Later on,
the corresponding symmetric restraints will be automatically gen-
erated by ARIA prior to structure calculation. ARIA also takes into
account information about the intramolecular or intermolecular
nature of the experiment (if applicable and specified by the user) by
excluding the nonvisible contributions.
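As an illustration only, the window matching described above can be sketched in a few lines of Python. The function names, resonance labels and shift values below are hypothetical and are not taken from the ARIA code base; the sketch covers a homonuclear 2D peak and ignores the heteronuclear dimensions, symmetry and intra/intermolecular filtering.

```python
from itertools import product

def candidates(shifts, center, tolerance):
    """Names of resonances whose chemical shift lies within center +/- tolerance."""
    return [name for name, ppm in shifts.items() if abs(ppm - center) <= tolerance]

def assignment_possibilities(shifts, peak, tolerances):
    """Combine the per-dimension candidates into all possible contributions."""
    per_dimension = [candidates(shifts, c, d) for c, d in zip(peak, tolerances)]
    return list(product(*per_dimension))

# Hypothetical proton chemical shift list (ppm); labels and values are made up.
shifts = {
    "A5.HA": 4.21, "L7.HA": 4.22,      # two resonances close to c1
    "K9.HB2": 1.80, "V3.HB": 1.81,     # two resonances close to c2
    "I12.HG12": 1.45,                  # outside both windows
}
peak = (4.215, 1.805)        # peak coordinates (c1, c2)
tolerances = (0.02, 0.02)    # (delta1, delta2), the proton window listed in Table 1

possibilities = assignment_possibilities(shifts, peak, tolerances)
for atom1, atom2 in possibilities:
    print(atom1, "<->", atom2)
```

With two candidates in each dimension, the peak receives four contributions; widening the tolerances increases this number multiplicatively, which is why the window sizes matter so much at this stage.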
3.2.2. Structural Rules for Symmetric Oligomers
ARIA can use information about the secondary structure organization
of the system under investigation to remove unlikely assignments.
ARIA uses simple rules (23) to assign some cross-peaks as
intermonomer before the structure calculation, using the predicted
secondary structure elements (see Note 8). If two symmetric
secondary structure elements are facing each other in the interface,
cross-peaks observed within the same element between residues
separated by more than five residues in sequence cannot arise from
intramolecular contacts and are thus unambiguously classified as
intermolecular.
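The sequence-separation rule stated above can be written as a small predicate. This is a minimal sketch under our own assumptions: the function name and the representation of a secondary structure element as a (first, last) residue range are hypothetical, and the element is assumed to be one that faces its symmetric copy at the interface.

```python
def is_intermolecular(res_i, res_j, element, min_separation=5):
    """Return True if a cross-peak between residues res_i and res_j must be
    intermolecular according to the sequence-separation rule.

    element: (first, last) residue numbers of a secondary structure element
             assumed to face its symmetric copy at the interface.
    """
    first, last = element
    same_element = first <= res_i <= last and first <= res_j <= last
    # Within such an element, contacts between residues separated by more
    # than five residues in sequence cannot be intramolecular.
    return same_element and abs(res_i - res_j) > min_separation

helix = (8, 20)  # hypothetical interface helix
print(is_intermolecular(10, 18, helix))  # both in the helix, separation 8
print(is_intermolecular(10, 13, helix))  # separation 3: stays ambiguous
```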
3.2.3. Network Anchoring
ARIA implements a network anchoring approach (8) to reduce the
number of possibilities of cross-peak assignments prior to structure
calculation. The approach is based on the ranking of each assign-
ment, calculated using the information about the assignments of
neighboring nuclei in 3D space, and is efficient because true assign-
ments form a self-consistent subset of the network of all possible
assignments (see refs. 8, 13 for details). The behavior of network
anchoring is controlled by a set of user-defined parameters:
1. “High network-anchoring (NA) score per residue threshold” ($N_{\mathrm{res}}^{\mathrm{high}}$).
2. “Minimal NA score per residue threshold” ($N_{\mathrm{res}}^{\mathrm{min}}$).
3. “Minimal NA score per atom threshold” ($N_{\mathrm{atom}}^{\mathrm{min}}$).
A peak is conserved if one of the following rules is verified:
$$ S_{\mathrm{res}} \geq N_{\mathrm{res}}^{\mathrm{high}} \qquad (1) $$
$$ S_{\mathrm{res}} \geq N_{\mathrm{res}}^{\mathrm{min}} \quad \text{and} \quad S_{\mathrm{atom}} \geq N_{\mathrm{atom}}^{\mathrm{min}} \qquad (2) $$
where $S_{\mathrm{res}}$ and $S_{\mathrm{atom}}$ are the residue-wise and atom-wise
network anchoring scores, respectively. Even though the network anchoring
approach does not directly rely on 3D structure information, it is
still possible to use it after the first ARIA iteration.
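Rules (1) and (2) amount to a simple Boolean test on the two scores. The sketch below only illustrates that test; the function name is ours, the threshold values in the usage lines are hypothetical, and the computation of the scores themselves (refs. 8, 13) is not shown.

```python
def keep_peak(s_res, s_atom, n_res_high, n_res_min, n_atom_min):
    """Apply rules (1) and (2): a peak is conserved if either rule holds."""
    rule_1 = s_res >= n_res_high
    rule_2 = s_res >= n_res_min and s_atom >= n_atom_min
    return rule_1 or rule_2

# Hypothetical thresholds, chosen only to exercise both rules.
thresholds = dict(n_res_high=4.0, n_res_min=1.0, n_atom_min=0.25)

print(keep_peak(5.0, 0.0, **thresholds))   # kept by rule (1)
print(keep_peak(2.0, 0.5, **thresholds))   # kept by rule (2)
print(keep_peak(2.0, 0.1, **thresholds))   # rejected by both rules
```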
3.3. Iterative Structure Calculation
3.3.1. Ambiguous Distance Restraints
The most important idea that underlies the ARIA methodology is
the concept of Ambiguous Distance Restraints (ADR) (2). In the
framework of the ADR, each NOESY cross-peak is treated as the
superposition of the signals from each of its multiple assignment
possibilities: the NOE intensity depends on the sum of the inverse
sixth power of all the individual proton–proton distances that contribute
to the signal. An effective distance D is thus derived as:
$$ D = \left( \sum_{c=1}^{N_c} d_c^{-6} \right)^{-1/6} \qquad (3) $$
where c runs through all $N_c$ assignment possibilities and $d_c$ is the
interatomic distance between the two protons corresponding to
the c-th contribution. During structure calculation, in a similar
fashion as for unambiguous distance constraints, the distance D in
the molecular coordinates is restrained through the distance target
energy function (cf. Subheading 3.3.5).
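To make the behavior of Eq. 3 concrete, the following minimal sketch evaluates the effective distance for a list of contribution distances. The function name and the numerical values are ours and serve only as an illustration.

```python
def effective_distance(distances):
    """Eq. 3: D = (sum over contributions of d_c^-6)^(-1/6)."""
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)

# A single contribution reproduces the ordinary distance.
print(effective_distance([3.0]))

# Adding a distant contribution barely changes D: the sum is dominated
# by the shortest distance because of the inverse sixth power.
print(effective_distance([3.0, 6.0]))
```

Because D is never larger than the shortest contributing distance, an ambiguous restraint can be satisfied as soon as any one of its assignment possibilities is close enough, which is what allows ambiguous cross-peaks to be used before they are assigned.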
3.3.2. Distance Calibration
The simplest model to derive distances from NOE signal intensity
is the Isolated Spin Pair Approximation (ISPA), which considers
only the observed spin pair, neglecting spin diffusion through third
nuclei. For short mixing times, ISPA provides a good approximation
to relate an NOE volume ($V_{ij}$) to the distance $d_{ij}$ of two interacting
spins i and j:
$$ V_{ij} = C\, d_{ij}^{-6} \qquad (4) $$
The scale factor C (also named calibration factor) cannot be
measured directly since it depends on the system under investigation
and on the experimental setup. The calibration factor is estimated
for all NOEs from the ratio of the average of the experimental
volume, $V^{\mathrm{exp}}$, to the average of the theoretical volume:
$$ C = \frac{\sum_i V_i^{\mathrm{exp}}}{\sum_i \hat{d}_i^{-6}} \qquad (5) $$
where $\hat{d}_i$ is the average effective distance for NOE i in the conformer
ensemble. In the case of multiple assignment possibilities,
$\hat{d}_i$ is calculated according to the ADR expression (Eq. 3). Finally, the
calibrated distance is obtained by:
$$ d = \left( C^{-1} V^{\mathrm{exp}} \right)^{-1/6} \qquad (6) $$
In the case of NOE between two groups of magnetically equiv-
alent spins (e.g., methyl groups and aromatic rings), averaging
effects are taken into account by expanding Eq. 4 (see Note 9).
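The ISPA calibration of Eqs. 5 and 6 can be sketched as follows. This is an illustration under simplifying assumptions: the function names are ours, the volumes are synthetic (generated from Eq. 4 with a made-up scale factor), and the averaging over equivalent spins mentioned above is not included.

```python
def calibration_factor(volumes, mean_distances):
    """Eq. 5: C = sum_i V_i^exp / sum_i d_hat_i^-6."""
    return sum(volumes) / sum(d ** -6 for d in mean_distances)

def calibrated_distance(volume, c):
    """Eq. 6: d = (V^exp / C)^(-1/6)."""
    return (volume / c) ** (-1.0 / 6.0)

# Synthetic test case: volumes generated from Eq. 4 with a made-up C.
c_true = 1.0e6
ensemble_distances = [2.5, 3.0, 4.0]          # d_hat_i from the conformers
volumes = [c_true * d ** -6 for d in ensemble_distances]

c_est = calibration_factor(volumes, ensemble_distances)
for v in volumes:
    print(round(calibrated_distance(v, c_est), 3))
```

In this noise-free case the calibration recovers the input distances; with real data the estimated C reflects the average agreement between the volumes and the current ensemble.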
Magnetization can be transferred from one spin to another
not only directly but also by spin diffusion, i.e., indirectly via other
spins in the vicinity. For longer mixing times, the spin-diffusion
phenomenon must be considered in the estimation of the distance.
When applying ISPA the resulting interproton distances are there-
fore mostly underestimated. ARIA employs relaxation matrix the-
ory to account for indirect magnetization transfer. In this formalism,
cross-peak volumes at mixing time $t_m$ can be calculated given the
volumes at $t_m = 0$ and the matrix of auto- and cross-relaxation rates,
R (24):
$$ V_{ij}(t_m) = C\, V_{ij}(0) \left( \exp(-R\, t_m) \right)_{ij} \qquad (7) $$
The resulting NOE back-calculated volumes, which take into
account the bias induced by spin-diffusion, are then converted into
corrected target distances d:
$$ d = \hat{d} \left( C^{-1} \frac{V^{\mathrm{exp}}}{V^{\mathrm{th}}} \right)^{-1/6} \qquad (8) $$
where $\hat{d}$ is the average effective distance, and $V^{\mathrm{exp}}$ and $V^{\mathrm{th}}$ are the
experimental and theoretical NOE volumes, respectively. When
using spin-diffusion corrected distances, the distance bounds cal-
culated from the theoretical volume may also be of use for the
structure calculation (25). In ARIA 2.3, the spin-diffusion correction
is performed by the Python core of ARIA and not by CNS
routines. It is also important to note that every spectrum is inde-
pendently calibrated. Still, these models are approximate and it is
common practice to restrain the distance to an interval to account
for uncertainties in the distances (see Note 10). This interval is
thus defined by lower and upper distance bounds, L and U:
$$ L = d - \Delta, \quad U = d + \Delta, \quad \text{where } \Delta = 0.125\, d^2 \qquad (9) $$
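Equations 8 and 9 can be illustrated with a short sketch. The function names are ours, and the theoretical volume is taken as a given number; the relaxation-matrix calculation of Eq. 7 that would normally supply it is not reproduced here.

```python
def corrected_distance(d_hat, v_exp, v_th, c):
    """Eq. 8: d = d_hat * (C^-1 * V^exp / V^th)^(-1/6)."""
    return d_hat * (v_exp / (c * v_th)) ** (-1.0 / 6.0)

def distance_bounds(d):
    """Eq. 9: L = d - Delta and U = d + Delta, with Delta = 0.125 * d^2."""
    delta = 0.125 * d * d
    return d - delta, d + delta

# When the calibrated experimental volume equals the theoretical one,
# the target distance equals the ensemble distance.
print(corrected_distance(3.0, v_exp=10.0, v_th=10.0, c=1.0))

# A larger experimental volume pulls the target distance below d_hat.
print(corrected_distance(3.0, v_exp=20.0, v_th=10.0, c=1.0))

# Bounds for a 4.0 A target distance.
print(distance_bounds(4.0))
```

Since the error term grows with the square of the distance, long restraints receive much wider intervals than short ones.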
3.3.3. Violation Analysis and Noise Peak Removal
To identify incorrect assignments and noise peaks, the calibrated
restraints are treated with a violation analysis, following the structural
consistency hypothesis (3, 26): incorrectly assigned peaks or
noise peaks are not consistent with the 3D structure determined
with all experimental data. To assess whether a particular restraint
follows the “general trends” imposed on the structures by the entire
data set, the obtained distance bounds are compared to the corre-
sponding distances found in the conformer ensemble. A restraint is
considered as violated if the distance found in the structure lies
outside the bounds by more than a user-defined violation tolerance,
t. To identify systematically violated restraints, each conformer in
the ensemble is analyzed. The fraction, $f_i$, of conformers violating
restraint i is calculated according to:
$$ f_i = \frac{1}{S} \sum_{k=1}^{S} \max\left( \Theta\left(L_i - t - d_i^{(k)}\right),\ \Theta\left(d_i^{(k)} - U_i - t\right) \right) \qquad (10) $$
where $L_i$ and $U_i$ denote the lower and upper bounds of the i-th
restraint, $d_i^{(k)}$ designates the distance found in the k-th conformer,
Θ is the Heaviside step function and S is the total number of conformers
analyzed. A restraint is classified as violated if $f_i$ exceeds a
user-defined violation threshold (50% by default). The corresponding
cross-peak is then removed from the list of active peaks for the
next iteration. During the course of the protocol, the violation
tolerance t is reduced from iteration to iteration to ensure that most
of the inconsistent peaks are removed.
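The violation analysis of Eq. 10 can be sketched as below. The function names and the example distances are hypothetical; we also assume that the step function is zero at the origin, so that a distance lying exactly at the tolerance limit is not counted as a violation.

```python
def violated_fraction(lower, upper, distances, tolerance):
    """Eq. 10: fraction of conformers in which the restraint is violated.

    distances: the restraint distance measured in each of the S conformers.
    """
    n_violated = sum(
        1 for d in distances
        if d < lower - tolerance or d > upper + tolerance
    )
    return n_violated / len(distances)

def is_violated(lower, upper, distances, tolerance, threshold=0.5):
    """A restraint is rejected if the violated fraction exceeds the threshold."""
    return violated_fraction(lower, upper, distances, tolerance) > threshold

# Hypothetical restraint with bounds 2.0-4.0 A, measured in five conformers.
distances = [3.0, 4.05, 4.5, 5.0, 1.5]

print(violated_fraction(2.0, 4.0, distances, tolerance=0.1))     # 3 of 5 conformers
print(is_violated(2.0, 4.0, distances, tolerance=0.1))
print(is_violated(2.0, 4.0, distances, tolerance=1000.0))        # very permissive tolerance
```

With the large tolerance used at the start of the protocol (Table 1) essentially nothing is rejected; as the tolerance shrinks, the same restraint becomes a candidate for removal.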
3.3.4. Partial Assignment
The assignment of cross-peaks is made in an indirect fashion by
progressively eliminating unlikely assignment possibilities. Due to
the $r^{-6}$ dependence, assignments with large distances contribute
only slightly to the NOE intensity. Thus, for a particular cross-peak,
each assignment possibility is weighted by its normalized partial
volume, $w_c$, calculated as follows:
$$ w_c \propto d_c^{-6} \qquad (11) $$
$$ \sum_{c=1}^{N_c} w_c = 1 \qquad (12) $$
where $d_c$ is the average distance of the contribution c in the structure
ensemble and $N_c$, the number of contributions for the cross-peak.
To reduce the number of assignment possibilities, only the m
largest contributions satisfying the following condition are kept:
$$ \sum_{c=1}^{m} w_c \geq p \qquad (13) $$
where p designates a user-defined ambiguity cut-off. This cut-off is
set to 1.0 in the first iteration and progressively reduced to 0.8 so
that for most peaks unambiguous assignments can be derived in
the last iteration. The quality of NMR structure ensembles might
also be improved by excluding peaks that involve a large number of
contributions. This function is controlled by the parameter max_n,
which defines the maximum number of assignment possibilities
(4). Symmetric peaks or duplicate peaks from different experiments
lead to equivalent restraints (restraints involving the same set of
atoms). To avoid overrepresentation of certain distance data, non-
violated restraints with equivalent atom content are detected. The
restraint with the smallest distance is kept, while the others are dis-
carded for the rest of the protocol. For every iteration, the file
noe_restraints.merged lists restraints discarded by the
merging procedure.
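The partial assignment filter of Eqs. 11–13 can be sketched as follows. The function name, contribution labels and distances are hypothetical; the max_n filter and the restraint merging step are deliberately left out of this illustration.

```python
def partial_assignment(mean_distances, p):
    """Keep the largest contributions whose cumulated weight reaches p.

    mean_distances: dict mapping a contribution label to its average
                    distance in the structure ensemble.
    Returns the list of kept contribution labels, largest weight first.
    """
    raw = {c: d ** -6 for c, d in mean_distances.items()}    # Eq. 11
    total = sum(raw.values())                                 # Eq. 12 normalization
    ranked = sorted(raw, key=raw.get, reverse=True)

    kept, cumulated = [], 0.0
    for c in ranked:
        kept.append(c)
        cumulated += raw[c] / total
        if cumulated >= p:                                    # Eq. 13
            break
    return kept

# Hypothetical cross-peak with three assignment possibilities.
contributions = {"a": 3.0, "b": 4.0, "c": 6.0}

print(partial_assignment(contributions, p=1.0))    # first iteration: all kept
print(partial_assignment(contributions, p=0.95))
print(partial_assignment(contributions, p=0.8))    # last iteration: unambiguous
```

In this example the shortest contribution carries about 84% of the partial volume, so lowering the cut-off to 0.8 leaves a single, unambiguous assignment.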
3.3.5. Calculation of Structure Ensemble
On the basis of the merged restraints list, a new structure ensemble
is calculated with the program CNS (16) through a molecular
dynamics simulated annealing (MDSA) protocol. ARIA provides
two forms of molecular dynamics: in Cartesian or torsion angle
space. Torsion angle molecular dynamics (TAD) (27) reduces the
calculation time and allows for higher MDSA temperatures, while
generally increasing the convergence radius. The molecular struc-
tures obtained with TAD also provide better local geometries. The
MDSA protocol used in ARIA is divided into two phases: an initial
high temperature search phase, and a cooling phase where the tem-
perature slowly decreases. The second part of the cooling stage is
performed in Cartesian coordinates. The length of the cooling
stages determines the slope of the bath temperature cooling func-
tion. It has been shown that this parameter plays an important role
in the convergence properties of the ARIA calculation for highly
ambiguous data (28). The MDSA protocols implemented in ARIA
(3) are optimized for the application of ambiguous distance
restraints and for the violation analysis method. The minimization
protocols are based primarily on separate scaling of different energy
terms with relatively low force constants. Any other structural
restraints available are also used during the structure calculation
(hydrogen bond restraints, dihedral angles and RDCs). The number
of calculated conformers is an important parameter of the structure
calculation protocol. Among all calculated conformers, only the
n lowest-energy ones (usually n = 30% of the ensemble) will be used
in the next ARIA iteration to recalibrate and reassign NOEs. For every
iteration, the number of structures is a user-defined parameter (see
Table 1).

Table 1
Important protocol parameters, their location in the GUI, and
default values (if applicable)

Parameter                                  GUI item                  Default value
Project environment                        Project
  Project name                                                       1
  File root
  Working directory
  Temporary directory
Data specification                         Data
  Frequency window (proton)                Spectra                   0.02
  Frequency window (hetero)                Spectra                   0.5
  Trust assignments                        Spectra                   No
  Use only assigned                        Spectra                   No
  Symmetry                                 Symmetry                  None
  CNS topology file                        Molecular system          topallhdg5.3.pro
  CNS parameter file                       Molecular system          parallhdg5.3.pro
Protocol parameters                        Protocol
  Number of structures (n_structures)      Iterations                20
  Violation tolerance (t)                  Iterations                1000.0–0.1
  Violation threshold                      Iterations                0.5
  Ambiguity cutoff (wc)                    Iterations                1.0–0.8
  Maximum nb. of contributions (max_n)     Iterations                20
  Number of lowest energy structures (S)   Iterations                7
  Solvent for refinement                   Water refinement          Water
Structure calculation                      Structure Generation
  Local CNS executable                     CNS
  Command to start remote calculation      Job Manager
  High temperature steps                   CNS Dynamics              10,000
  Cooling 1 steps                          CNS Dynamics              5,000
  Cooling 2 steps                          CNS Dynamics              4,000
  Log-Harmonic potential                   CNS Annealing Parameters  No
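The selection of the lowest-energy conformers described above can be sketched in a few lines of Python. This is only an illustration, not ARIA's own code: the function name, the (name, energy) tuple layout, and the example energies are hypothetical, while the default of 7 retained structures is the value listed in Table 1.

```python
def select_lowest_energy(conformers, n_keep=7):
    """Return the n_keep conformers with the lowest total energy.

    conformers: list of (name, energy) tuples (hypothetical layout).
    n_keep: number of structures passed on to the next iteration
            (Table 1 lists 7 out of 20 as the default).
    """
    if n_keep < 1:
        raise ValueError("n_keep must be at least 1")
    ranked = sorted(conformers, key=lambda item: item[1])
    return ranked[:n_keep]


# Illustrative energies (kcal/mol) for five conformers; the values are made up.
ensemble = [("c1", 1250.0), ("c2", 980.5), ("c3", 1103.2),
            ("c4", 2040.7), ("c5", 1011.9)]
best = select_lowest_energy(ensemble, n_keep=2)
print([name for name, _ in best])
```

If fewer conformers are available than requested, the function simply returns all of them, sorted by energy.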
3.3.6. Restraint Energy Function
The aim of the MDSA protocol is to find a global energy minimum
of an objective function that incorporates experimental data and
physical energy. The latter is quantified by using a molecular
dynamics force field. Experimental data are integrated in the form
of conformational restraints entering the objective function via an
energy potential. For distance restraints, ARIA employs a
flat-bottom harmonic-wall potential with zero energy between the
distance bounds and linear asymptotes (3). This potential allows for
large distance violations as may occur in an automated assignment
procedure. Nevertheless, it is still difficult to correctly evaluate the
bounds and the relative weight to apply to the data. Recently, we
have introduced a new error-tolerant potential where lower and
upper bounds are replaced by a bounds-free log-harmonic potential
(14). This potential derives from a Bayesian analysis showing that
NOEs and the derived distances ideally follow the log-normal
distribution (29, 30). In ARIA, we also retain another important
feature of this Bayesian approach: automatic determination of the
optimal weight for the experimental data (31). The log-harmonic
potential is applied during the second cooling stage of the MDSA
and during water refinement. The weight for the distance restraints,
w_data, is iteratively evaluated as:
\[ w_{\mathrm{data}} = \frac{n}{\chi^{2}(X)} \qquad (14) \]

where n is the number of restraints, and:

\[ \chi^{2}(X) = \sum_{i} \log^{2}\left[\frac{d_i}{\hat{d}_i}\right] \qquad (15) \]

where, for each restraint i, \hat{d}_i is the effective distance (Eq. 3)
calculated from the current structure, and d_i is the target distance of
the restraint. This approach was shown to generally improve the
accuracy as well as the quality of the structures calculated from
assigned restraints (14). Our initial experience in using ARIA with
real (noisy and ambiguous) data indicates that the log-harmonic
restraint potential is preferable.
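As a minimal numerical sketch of Eqs. 14 and 15, the following Python functions evaluate the chi-squared term and the resulting data weight for a list of restraints. The function names and the example distances are hypothetical and only illustrate the arithmetic; this is not the ARIA or CNS implementation.

```python
import math


def chi_squared(target, effective):
    """Eq. 15: sum over restraints of log^2(d_i / d_hat_i)."""
    if len(target) != len(effective):
        raise ValueError("distance lists must have the same length")
    return sum(math.log(d / d_hat) ** 2 for d, d_hat in zip(target, effective))


def data_weight(target, effective):
    """Eq. 14: w_data = n / chi^2(X), with n the number of restraints."""
    chi2 = chi_squared(target, effective)
    if chi2 == 0.0:
        raise ZeroDivisionError("all restraints exactly satisfied; weight undefined")
    return len(target) / chi2


# Made-up target and back-calculated effective distances (in Å).
d_target = [3.0, 4.0, 5.0]
d_model = [3.3, 3.6, 5.0]
w = data_weight(d_target, d_model)
print(f"chi2 = {chi_squared(d_target, d_model):.4f}, w_data = {w:.1f}")
```

Because the weight is the inverse of the mean squared log-ratio, a structure that agrees better with the target distances receives a larger weight in the next evaluation.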
3.3.7. Symmetric Oligomers
The symmetry of the system is maintained during the calculation
by adding a symmetry target function to the objective energy
function (32). This target function contains terms that ensure the
symmetry relation between the monomers and keep them in the
vicinity of each other (Packing, see Note 11).
3.3.8. Floating Chirality Assignment
The treatment of unassigned prochiral groups is realized with a
floating chirality assignment approach (22). The two substituents
of a prochiral center (methylene protons or methyl protons of
isopropyl groups) are often difficult to assign stereo-specifically, in
terms of chemical shifts. In each proton dimension, a resonance
matching one of the chemical shifts may potentially involve either
of the two prochiral substituents. In ARIA, the two assignment
alternatives are tested during the course of the structure calculation
and the most energetically favorable possibility is used. The result
is written for each conformer in a file with a .float extension.
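The idea of testing both assignment alternatives and keeping the energetically favorable one can be illustrated with a small, self-contained Python sketch. It is a conceptual stand-in only: the squared upper-bound violation used as the energy, the atom labels, the coordinates, and all function names are assumptions made for the example and do not reproduce the potential or the swapping procedure used during the actual calculation.

```python
import math


def restraint_energy(coords, restraints):
    """Sum of squared upper-bound violations (simplified stand-in potential)."""
    energy = 0.0
    for atom_i, atom_j, upper in restraints:
        excess = math.dist(coords[atom_i], coords[atom_j]) - upper
        if excess > 0.0:
            energy += excess ** 2
    return energy


def float_chirality(coords, restraints, pair):
    """Test both labelings of a prochiral pair and keep the lower-energy one.

    Returns (swapped, energy), where swapped is True if exchanging the
    two labels gives the lower restraint energy.
    """
    first, second = pair
    swapped_coords = dict(coords)
    swapped_coords[first], swapped_coords[second] = coords[second], coords[first]
    e_keep = restraint_energy(coords, restraints)
    e_swap = restraint_energy(swapped_coords, restraints)
    return (e_swap < e_keep, min(e_keep, e_swap))


# Hypothetical coordinates (Å) and one restraint with a 3.5 Å upper bound.
coords = {"HB2": (0.0, 0.0, 0.0), "HB3": (2.0, 0.0, 0.0), "HA": (5.0, 0.0, 0.0)}
restraints = [("HB2", "HA", 3.5)]
swapped, energy = float_chirality(coords, restraints, ("HB2", "HB3"))
print(swapped, energy)
```

In this toy case the restraint is violated with the original labels and satisfied after the swap, so the swapped labeling is retained.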
3.4. Solvent Refinement
The simplified force field parameters for nonbonded contacts
applied to structure calculations in vacuo often produce structures
that contain artifacts (unrealistic side-chain packing and unsatisfied
hydrogen bond donors or acceptors). Therefore, the final structures
of the last ARIA iteration are automatically refined in a shell
of explicit solvent (water or DMSO molecules). This refinement
consists of a short MD run with a complete force field, which includes
Coulombic and Lennard-Jones potentials. The covalent parameters
used in the refinement (33) are consistent with the force field used
for structure calculation and validation, thus avoiding systematic
differences that could influence validation results. It has been
shown that the refinement in solution significantly improves the
quality of the structure (33–35).
3.5. Results Export and Generation of Output Files

3.5.1. Export to CCPN
At the end of the ARIA protocol, assigned peak lists, restraint lists,
along with violations, and final structure ensembles (last iteration
and solvent refined) are automatically exported into a CCPN project
(see Fig. 3). Data exchange, further analysis of results, and
management of ARIA runs are then facilitated through the use of
the CCPN program suite (cf. Subheading 3.11).
[Figure 3: diagram of the data flow between a CCPN project (CCPN Analysis) and ARIA. Import into ARIA: cross-peak lists, chemical shift assignments, distance restraints, hydrogen bond restraints, dihedral angle restraints, RDC restraints, initial structure ensemble. Export to CCPN: cross-peak assignments, distance restraints, violations, final structure ensemble.]
Fig. 3. Communication interface between ARIA and CCPN for import of input data and export of results.
3.5.2. Report Files
For every iteration, ARIA creates the following report files:
1. report summarizes analyses of the restraint lists and the
structure ensemble (number of restraints applied, violations,
ensemble precision).
2. noe_restraints.unambig, noe_restraints.ambig
tabulates information about unambiguous and ambiguous
restraints, respectively. For each restraint, the reference cross-
peak, restraint bounds and the average distance found in the
ensemble are provided. The result of violation analysis is also
given here (see Note 12).
3. noe_restraints.violations lists all violated restraints.
4. noe_restraints.assignments lists the tentative assign-
ments corresponding to every restraint. The nature of the
assignment(s) is also given (fully, partially or unassigned cross-
peaks).
5. noe_restraints.xml, noe_restraints.pickle stores
the complete list of cross-peak based distance restraints in
XML format and Python binary format. The latter is required
for further assignment analysis in the ARIA GUI (cf.
Subheading 3.10.2).
3.5.3. Quality Checks
To evaluate the structural quality of both the final set of structures
and the solvent-refined ensemble, ARIA makes use of the programs
WHAT IF (19), PROCHECK (18), ProSa (20) and MolProbity
(21). Separate report files are generated for every program, named
quality_checks.*, and are stored in the directories of the
respective ensembles (last iteration and solvent-refined). Overall
quality scores are tabulated in the file quality_checks, whereas
WHAT-IF score profiles along the molecular sequence are gener-
ated in both textual and graphical forms (cf. Subheading 3.10.3).
3.5.4. CNS Analyses
CNS scripts calculate restraint energies, ensemble RMSDs, an optimal
superposition of the final structure ensemble (with automated
determination of flexible and rigid regions), and an unminimized
average structure. Analyses of restraints from complementary
experimental data are also given. Results are stored in the directory
analysis/.
In the following sections, we detail the typical procedure to be
followed by a user to perform an ARIA calculation. In a structure
determination project, the general procedure consists of repeated
ARIA runs using revised results from a previous calculation as input
data (Fig. 4).
3.6. Conversion of Input Data
Since most NMR software packages use proprietary formats for
data storage, the interconversion step required to transfer data with
other applications such as ARIA can lead to a loss of information.
[Figure 4: flow chart. Initial stage: preparation of input data. Series of ARIA runs: setup of a new run (parameters and project setup, adjustment of frequency windows, revision of input data); ARIA (automated cross-peak assignment and structure calculation); analysis of violations and proposed assignments; completion of cross-peak assignments; removal of potential noise peaks; examination of quality checks; final result.]
Fig. 4. A series of ARIA runs in a typical structure determination project, with several cycles of structure calculations and
cross-peak assignments punctuated by manual inspection and correction of experimental input data.
To facilitate data validation and integration, ARIA uses a data
format based on the extensible markup language (XML) (36) to
describe molecular systems, chemical shifts, and cross-peaks lists. If
input data are intended to be read from a previously created CCPN
project, the conversion step described here is no longer required
(see Fig. 3). Input data will be read directly and internally con-
verted from the CCPN data model into ARIA at run-time. It is
otherwise necessary to convert input data to ARIA XML format
before starting the ARIA program per se. This step is simplified by
the internal conversion routine provided by ARIA. To use this rou-
tine, one must prepare a simple XML conversion file.
1. Conversion template. A preformatted conversion file can be
auto-generated by typing the following command in a
terminal:
aria2 --convert -t conversion.xml
An empty conversion file template, “conversion.xml” is then
created and must be completed.
2. Editing the conversion file. In addition to formats and filenames
of the raw data (sequence, chemical shift lists and spectra) (see
Note 13), the user has to specify the mapping between nuclei
and frequency dimensions. If the molecular system is a sym-
metric multimer, it is mandatory to specify the molecular chains
involved (segment id or segid). For the cross-peak lists, the user
needs to indicate the chains involved and the level of chain-wise
ambiguity. Possibilities are intramolecular, intermolecular, or
unknown (cf. Subheading 3.7). For solid-state NMR experiments,
a parameter has to be filled in by the user to designate
the type of experiment and transfer (see Note 14).
3. Conversion step. Then, invoke the command
aria2 --convert conversion.xml
to start the data conversion. Converted data will be written in
ARIA XML format; a project file, which has to be completed by
the user, will be generated as well.
3.7. Specification of ARIA Project Parameters
1. Project creation. All program parameters and locations of the
input data are stored in a single project file (in XML format). To
conveniently change or review the project settings, ARIA provides
a Graphical User Interface (GUI) (Fig. 5). Entering the
following command will start the GUI and load the project
definition from project.xml (see Note 15):
aria2 --gui project.xml
Important program and protocol parameters are listed in
Table 1. Default settings are provided for the rest of the
parameters.
Fig. 5. Graphical User Interface of ARIA 2.3 for project management, where data and protocol settings can be modified
graphically.
2. General settings. Mandatory parameters are related to the gen-
eral infrastructure of the project, e.g., the name of the project,
the directory where an ARIA run will be stored (Working direc-
tory) or the prefix (File root) used by ARIA throughout the
project for naming PDB files.
3. Sequence definition. It is necessary to provide here the defini-
tion of the molecular system. A project file created during the
conversion step will already display the location of the XML
file of the molecular sequence. Otherwise, the “Browse” but-
ton assists in locating the sequence definition XML file. If the
sequence has to be read from a CCPN project, the user should
first locate the CCPN project in the CCPN data model panel
in the GUI. Then, the “CCPN” format has to be chosen for
the sequence, and hitting the “Select” button will open a pop-
up window displaying available molecular systems contained in
the CCPN project. This procedure is common to all steps
where import of data from a CCPN project is available.
4. Adding input data. Spectra and additional experimental data
can be added by clicking the “Add” button in the GUI menu.
When adding a spectrum, it is necessary to provide both the
location of the cross-peak list and the corresponding chemical
shift list. Additional experimental data can be supplied in the
form of CNS “tbl” formatted files or from a CCPN Project. In
the latter case, supplementary options are offered when the
distance restraints list is added. For instance, distance restraints
can be selected to enter the iterative protocol, where they will
be recalibrated and filtered like restraints derived from the
internal ARIA cross-peak assignments procedure. Otherwise,
they will be kept untouched by the program during the entire
protocol.
5. Adjusting data parameters. For each spectrum the default fre-
quency window sizes should be adjusted. When a user wants to
apply spin-diffusion correction, the necessary parameters need
to be entered (molecule correlation time, spectrometer fre-
quency and mixing time). The nature of the cross-peak in
terms of possible chain assignment should also be specified
here in the case of symmetric oligomers. This option intends to
make better use of possible information arising from filtered/
separated experiments recorded on asymmetrically labeled
samples. Finally, for solid-state NMR spectra, we recommend
specifying lower and upper distance bounds that will be applied
to the cross-peak derived restraints (see Note 10). If applicable,
it is furthermore possible to pick an appropriate labeling
scheme (see Note 14). In addition, parameters relative to dihedral
angle restraints (see Note 8), RDC (see Note 16) and J-couplings
(see Note 17) should be defined.
6. Symmetry. ARIA can treat oligomers with C2, C3, C5 or D2
symmetry (see Note 11).
7. Specifying topology patches. By default, ARIA supports the fol-
lowing cases: Disulfide bridges (unambiguous or ambiguous)
(2), Histidine protonation states, cis-proline and tetrahedral
coordination of Zinc ions. In the case of nonstandard residues
or other chemical compounds, manual intervention of the user
is required (see Note 18).
8. Iteration parameters. The mode of restraint calibration has to
be specified: ratio of average (default), spin-diffusion correction
or fixed bounds (see Note 10). For every iteration, default val-
ues are provided for protocol parameters (Table 1) and the
network-anchoring thresholds (see Note 19).
9. Job Manager. Distributing structure calculations to multiple
processors speeds up the ARIA protocol. ARIA provides sup-
port for several job submission modes (see Note 20). The
appropriate command should be entered and the correct path
to the remote CNS program executable should be specified.
10. Structure calculation parameters. The remaining parameters
are related to the molecular dynamics simulated annealing, and
in particular the number of steps, restraint force constants and
potential shape (flat-bottom-harmonic-wall and log-harmonic).
3.8. Project Setup
At this point, the project must be set up with the following
command:
aria2 --setup project.xml
The project is then validated and ARIA creates the directory
tree for the project (directory run1). As shown in Fig. 6, the results
of the successive iterations are stored in structures/, each iteration
having its own subdirectory, e.g., structures/it0/. Experimental
data files are copied into their respective directory in data/ (see
Note 21). Report files for the cross-peak filtering procedure are
stored in data/spectra/. All data, protocols, parameters, and topol-
ogy files used by CNS reside in the cns/ subdirectory.
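The directory layout described above can be captured in a short Python helper, for instance to locate the results of a given iteration in a script. This is only a sketch: the function name and dictionary keys are hypothetical, and the default of nine iterations (it0 to it8) is an assumption based on the iteration numbering used elsewhere in this chapter.

```python
from pathlib import PurePosixPath


def project_layout(run_dir="run1", n_iterations=9):
    """Build the main paths of an ARIA run directory as described in the text.

    n_iterations=9 (it0 ... it8) is an assumption; adapt it to the project.
    """
    run = PurePosixPath(run_dir)
    return {
        "data": run / "data",                      # copies of experimental data
        "spectra_reports": run / "data" / "spectra",  # cross-peak filtering reports
        "cns": run / "cns",                        # files used by CNS
        "iterations": [run / "structures" / f"it{i}" for i in range(n_iterations)],
    }


layout = project_layout()
print(layout["iterations"][0], layout["iterations"][-1])
```

PurePosixPath is used so that the example only manipulates path names and does not touch the file system.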
3.9. Starting an ARIA Run
It is now possible to launch the ARIA calculation, using the
following command:
aria2 project.xml
ARIA will then automatically perform all the steps listed in
Subheadings 3.1–3.5. The main ARIA job will be executed on the
local machine where it has been started. According to the job man-
ager settings of the project, the structure calculations will be
successively launched on the local processor (default behavior) or
dispatched to a computer cluster (see Note 20).
[Figure 6: directory tree of an ARIA project. Recoverable entries:
run1/ (ARIA run directory)
  data/ (location where ARIA stores input data): sequence (XML), templates (template structures, PDB), spectra (cross-peak and chemical shift lists, XML), ssbonds (TBL), hbonds (distance restraints for hydrogen bonds, TBL), jcouplings (restraints for J-couplings, TBL), rdcs (residual dipolar coupling restraints, TBL), dihedrals (dihedral angle restraints, TBL), distances (user-provided distance restraints, TBL)
  structures/ (files for each iteration are stored here): it0 (first ARIA iteration), it1, ..., it8 (last iteration), with analysis (various analysis results, performed by CNS) and graphics (PostScript) entries, plus a directory for water-refined structures and quality-checks analysis
  cns/ (files used for structure calculation): protocols (simulated-annealing protocols, CNS), data (input data for simulated annealing), toppar, begin (template structure for simulated annealing)]
Fig. 6. Illustration of the directory tree of an ARIA project and details about the content. Final results can be found in the
directories marked in gray.
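For reference, the command lines quoted in Subheadings 3.6 to 3.9 can be collected in the order in which they are typically used. The Python sketch below only assembles and prints them; the function name is hypothetical, and the manual editing of conversion.xml and of the project file between the steps still has to be done by the user.

```python
def aria_workflow(project="project.xml", conversion="conversion.xml"):
    """Return the command lines quoted in Subheadings 3.6-3.9, in order."""
    return [
        ["aria2", "--convert", "-t", conversion],  # write an empty conversion template
        ["aria2", "--convert", conversion],        # convert raw data to ARIA XML
        ["aria2", "--gui", project],               # review or edit the project settings
        ["aria2", "--setup", project],             # validate project, create run directory
        ["aria2", project],                        # start the ARIA calculation
    ]


for argv in aria_workflow():
    print(" ".join(argv))
```

Each entry is a list of arguments, so it could be passed to subprocess.run(argv, check=True) on a machine where the aria2 command is installed.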
3.10. Checking the Results
In the next paragraphs, we list the points of interest when inspecting
the calculation results, along with some guidance on how to
correct input data and adapt the protocol parameters.

3.10.1. Convergence
The level of convergence indicates how well the protocol managed
to find a well-defined structure and a consistent set of assignments.
Convergence can be estimated with two indicators:
1. The average (and variation) of the total energy of the structure
ensemble.
2. The conformational variance of the structure ensemble (or
precision), expressed as an RMSD.
A low average energy (see Note 22) and a high precision
(RMSD < 1.5 Å) generally mean that convergence has been reached.
Other situations may stem from unsuitable protocol settings or
incomplete or low-quality data. The average energy can be found in
structures/it8/analysis/energy.disp and the precision in the report
file or in structures/it8/analysis/rmsdave.disp.
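The two convergence indicators can be summarized with a few lines of Python once the conformer energies and the ensemble RMSD have been read from the analysis files. The sketch below is an assumption-laden illustration: the function and key names are hypothetical, the parsing of the .disp files is not shown, and the energy reference simply applies the rule of thumb of Note 22 (about 1,000 kcal/mol per 100 residues, scaling roughly linearly with size).

```python
from statistics import mean, pstdev


def convergence_summary(energies, rmsd, n_residues,
                        rmsd_cutoff=1.5, energy_per_100_res=1000.0):
    """Summarize the two convergence indicators for a structure ensemble.

    energies: total energies of the conformers (kcal/mol).
    rmsd: ensemble precision in Å (computed elsewhere).
    The reference energy follows the rule of thumb of Note 22.
    """
    avg = mean(energies)
    reference = energy_per_100_res * n_residues / 100.0
    return {
        "mean_energy": avg,
        "relative_spread": pstdev(energies) / avg,
        "energy_vs_reference": avg / reference,
        "precision_ok": rmsd < rmsd_cutoff,
    }


# Made-up values for a 100-residue protein.
summary = convergence_summary([900.0, 1000.0, 1100.0], rmsd=0.8, n_residues=100)
print(summary)
```

A ratio close to 1 together with a relative spread near 10% and a satisfied precision criterion would be in line with the guidance given in the text; no hard pass/fail threshold on the energy is implied.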
3.10.2. Automated Assignments
1. The report files listed in Subheading 3.5 provide analyses on all
restraints and particularly which restraints have been classified
as violated. Restraints showing consistent violation greater
than 0.1 Å should be inspected manually. Restraints with large
upper-bound violations (≥5 Å) in the majority of the conformers
(≥85%) usually result from incorrect assignments. Restraints
detected as such should not be used in a later ARIA run and
the corresponding cross-peak should be removed from its respective
spectrum. Other assignments should be considered as “reliable”
in a subsequent run.
2. Analyzing text files for violations and assignments can be a
tedious task. ARIA also provides ways to investigate this in a
graphical manner (37). Postscript files describing the restraints,
based on the RMS of violations are generated automatically
during a run. These values are displayed at the residue level, in
the form of a profile along the protein sequence, or as a contact
map for the RMS of violations per residue pair (Fig. 7a). The
contact map displays the sum of the RMS of violations per resi-
due pair. In the profile, the sum of the RMS of violations per
residues is plotted along the protein sequence. In addition, the
program provides an interactive tool to browse assignments at
the residue level (Peak map). A peak-map can be viewed for all
iterations in the ARIA GUI (Fig. 8). Clicking on a contact
between residues i and j opens a pop-up window that shows a
list of ARIA restraints involving atoms from both residues,
where restraints are labeled. Such graphical representations can
be useful to detect regions of the structure where violations are
concentrated, indicating where restraints and assignments
should be more thoroughly investigated.
3. Finally, the resulting restraints and assignments that are
exported to a CCPN project can be later investigated with the
CCPNmr Analysis software. As illustrated in Fig. 9, CCPNmr
Analysis offers utilities to inspect restraints through a customizable
user interface. Moreover, a user will be able to examine
the proposed resonance assignments directly in a spectral display
window at the positions in frequency space where the
peaks were picked.
Fig. 7. Per-residue quality plots. (a) Contact map displaying the sums of RMS deviations and a profile of the RMS deviations.
(b) WHATIF score profiles along the sequence. The RMS deviations are plotted on a color scale (figure adapted from ref. 25).
Fig. 8. Interactive peak map. Right panel of the ARIA 2.3 GUI showing the interactive peak map at iteration 8 of an ARIA run.
Each pixel of the map located between residues i and j is clickable and opens an assignment report, which contains the
list of peaks that exist between residues i and j, along with their contributions (figure adapted from ref. 25).
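The rule of thumb given in item 1 above (upper-bound violations of at least 5 Å in at least 85% of the conformers) lends itself to a simple filter. The following Python sketch assumes that the per-conformer effective distances and the upper bounds have already been extracted from the report files; the dictionary layout, function name, and example numbers are hypothetical.

```python
def flag_suspect_restraints(distances, upper_bounds,
                            min_violation=5.0, min_fraction=0.85):
    """Return ids of restraints violated by >= min_violation Å in at
    least min_fraction of the conformers.

    distances: dict mapping restraint id -> list of effective distances (Å),
               one per conformer (hypothetical layout).
    upper_bounds: dict mapping restraint id -> upper bound (Å).
    """
    suspects = []
    for rid, per_conformer in distances.items():
        if not per_conformer:
            continue
        upper = upper_bounds[rid]
        n_violated = sum(1 for d in per_conformer if d - upper >= min_violation)
        if n_violated / len(per_conformer) >= min_fraction:
            suspects.append(rid)
    return suspects


# Made-up distances for two restraints in a 10-conformer ensemble.
distances = {1: [11.0] * 9 + [5.0], 2: [5.2] * 10}
upper_bounds = {1: 5.5, 2: 5.0}
print(flag_suspect_restraints(distances, upper_bounds))
```

Restraint 1 exceeds its bound by 5.5 Å in 9 of 10 conformers and is flagged as a likely misassignment, whereas restraint 2 shows only small violations and is kept for manual inspection.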
Fig. 9. Screenshot of CCPNmr Analysis windows showing the result of an ARIA run.

3.10.3. Quality Indices
The quality of structure ensembles as determined by independent
structure validation is widely acknowledged as a good indicator of
the performance of the structure calculation protocol and of the
reliability of the structure. The application of NMR restraints for
structure calculation may induce distortions in the geometry of the
molecular structure. For this purpose, ARIA applies four major
programs (PROCHECK (18), WHAT IF (19), ProSa (20) and
MolProbity (21)) that aim at detecting outliers and abnormalities
in macromolecular structure by comparing several characteristic
geometric properties to a database of small molecules and/or
high-resolution X-ray structures. The summary of all global quality
indices is given in the quality_checks file. For thorough reviews of
tools to evaluate the quality of NMR structures, we suggest
consulting the following references (38, 39). We would like to stress
here that despite the apparent lower resolution of solid-state NMR
data, a great deal of attention should still be given to the inspection
of such quality checks. The following scores should be investigated
further (see Note 23).
1. Procheck Ramachandran percentage. For typical NMR struc-
tures deposited in the PDB, 80% of the dihedral angles lie
within the preferred region of the Ramachandran plot. For
high-resolution NMR structures, a higher percentage is
expected (90%).
2. WHAT-IF Z-scores. WHAT-IF results are presented in the form
of overall Z-scores. In general, structures with Z-scores
between −2 and +2 are considered to be within a normal range
and are thus good structures, while structures with Z-scores
lower than −2 should be inspected further. Useful indicators of
good quality are “Backbone conformation” and “Packing
quality”. The “bump-score” also reports the number of van
der Waals violations per 100 residues.
3. WHAT-IF profiles. Recently, some studies have stressed that
global structural indicators are not sufficient to detect errors in
structures and suggested examining parameters on a per-residue
basis (40, 41). Such profiles for the WHAT-IF scores are pro-
duced by ARIA in the form of a PostScript file (Fig. 7b). Thus,
poor quality regions can be precisely identified (see Note 24).
4. Molprobity clashscore. This reports the number of overlaps
>0.4 Å per thousand atoms. For typical NMR structures depos-
ited in the PDB, this score is generally high (>10). From our
experience, the application of the log-harmonic potential along
with automated weight estimation significantly improves this
situation.
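The rules of thumb listed above can be gathered into a small screening function. The Python sketch below is illustrative only: it assumes the scores have already been read from the quality_checks file (whose format is not described here), the function name and argument layout are hypothetical, and the thresholds are the indicative values quoted in the text rather than hard limits.

```python
def review_quality(z_scores, rama_preferred_pct, clashscore):
    """Return remarks for quality scores that deserve a closer look.

    z_scores: dict of WHAT IF check name -> overall Z-score.
    rama_preferred_pct: PROCHECK percentage in the preferred region.
    clashscore: MolProbity overlaps > 0.4 Å per thousand atoms.
    """
    remarks = []
    for name, z in sorted(z_scores.items()):
        if z < -2.0:
            remarks.append(f"{name}: Z-score {z:.2f} is below -2, inspect further")
    if rama_preferred_pct < 80.0:
        remarks.append(
            f"Ramachandran: {rama_preferred_pct:.1f}% is below the typical 80%")
    if clashscore > 10.0:
        remarks.append(f"Clashscore: {clashscore:.1f} is high (> 10)")
    return remarks


# Made-up scores for one ensemble.
for remark in review_quality({"Packing quality": -2.5,
                              "Backbone conformation": 0.3}, 85.0, 4.0):
    print(remark)
```

Such a screen only points to global scores worth revisiting; the per-residue profiles remain necessary to locate the regions responsible.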
3.11. Preparing a New Run
To use the result of an ARIA run to further improve the structure,
it may be necessary to correct the input data. At this stage, we
recommend preparing a new ARIA project for better bookkeeping.
CCPNmr Analysis also offers a utility to manage the input and
output of successive ARIA runs (Fig. 9). The same CCPN project
can be used in multiple ARIA runs.
3.11.1. Correction of Input Data
1. Peaks identified as erroneous (noise peaks) should be deleted
from the input data.
2. Automated assignments may be added in the initial cross-peak
assignment and incorrect assignments removed.
3. To improve convergence, reliable assignments can be used
either as distance restraints or set individually as reliable in
the input XML file.
3.11.2. Adjusting Parameters
In the new project file, protocol parameters may also be changed
according to the result of a previous calculation. We list here the
most important parameters that ought to be adapted.
1. The number of dynamic steps required for convergence is
determined by the system size and the level of ambiguity or
incompleteness of the input data. Default values work well for
systems up to about 100 residues studied with NOESY.
However, for larger systems (e.g., symmetric oligomers) or
when MAS solid-state NMR data are used, it might become
necessary to increase the number of steps in the cooling stage
of the simulated annealing protocol. On the one hand, the
computation time to calculate a structure will increase with the
length of the dynamics. On the other hand, a slow-cooling
strategy substantially increases the probability of success of the
minimization protocol (see Note 25).
2. In case of poor convergence, one should also check frequency
window sizes. Narrow windows affect the completeness of a
cross-peak assignment. It may therefore be judicious to slightly
increase the individual window size (e.g., by 10%). Conversely,
when the final set of restraints is still largely ambiguous, it is
reasonable to reduce the window sizes.
3. Achieving convergence may also be hampered by a tight viola-
tion tolerance. If a large number of restraints are rejected, the
data may become too sparse. Also, if an initial ensemble of
template structures (from a previous calculation for instance) is
specified, the default tolerance must be reduced for the first
iteration (e.g., 5 Å).
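The trade-off between window size and assignment ambiguity mentioned in item 2 can be made concrete with a toy example. The Python sketch below assumes a simple symmetric tolerance around the peak position, which is an illustrative simplification and not necessarily the exact matching rule used by ARIA; the atom labels, chemical shifts, and function names are hypothetical.

```python
def candidates(peak_shift, shift_list, window):
    """Return atoms whose chemical shift lies within +/- window of the peak."""
    return [atom for atom, shift in shift_list.items()
            if abs(shift - peak_shift) <= window]


def widen(window, percent=10.0):
    """Increase a frequency window by the given percentage."""
    return window * (1.0 + percent / 100.0)


# Made-up proton shifts (ppm); atom labels are illustrative only.
shifts = {"A12.HA": 4.209, "L45.HA": 4.231, "K7.HA": 4.254}
narrow = candidates(4.230, shifts, 0.02)
wider = candidates(4.230, shifts, widen(0.02))
print(narrow, wider)
```

With the default proton window of 0.02 ppm only one candidate matches the peak; a 10% increase admits a second candidate, which makes the assignment more complete but also more ambiguous.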
4. Notes
1. Data can be read from common NMR formats or via the
CCPN program suite. Compliant formats are the following:
Ansig (42), NMRDraw (43), NMRView (44), Pipp (45),
Pronto (46), Sparky (T. D. Goddard and D. G. Kneller,
University of California), XEasy (47), Diana (48), and NMRStar
(49). PDB files with CNS (16), IUPAC (50), or DYANA (51)
atom name nomenclatures can be read by ARIA. Restraints
files should follow the CNS/XPLOR syntax and nomencla-
ture. Mismatch in segment id (segid) between the restraints
and the molecular definition is often a source of errors. ARIA
internally follows the IUPAC (50) recommendations for the
atom name nomenclature. Most common naming problems
are the following:
● The C-terminal carboxyl oxygens are named O' and O''. O''
contains two apostrophes (ASCII 39), not a quotation
mark (ASCII 34). The PDB uses O and OXT or OT1 and
OT2 instead.
● The N-terminus consists of H1, H2, and H3 (not HT1,
HT2 and HT3).
● The protein backbone amide proton is called H (instead
of HN).
● The glycine alpha protons are HA2 and HA3.
● Pseudoatoms (52) are not supported; r^-6 averaging is
applied to equivalent groups.
2. ARIA supports CHHC/NHHC (53) and 2D/3D 13C-13C
correlation spectra, i.e., PDSD (54, 55), DARR (56), and
PAR (57).
3. We always use absolute values of peak sizes (volume or
intensity).
4. For C/NHHC experiments, cross-peak assignment is per-
formed on the basis of 13C/15N chemical shifts, but later trans-
formed in proton–proton distance restraints.
5. Windows that are too narrow induce potentially incomplete
assignments, while large window sizes lead to highly ambigu-
ous initial assignments, which are often the source of severe
convergence issues during the ARIA protocol. Therefore, win-
dow size must be chosen carefully; the ideal situation is reached
when the windows size is sufficiently large to contain the cor-
rect assignments, but without unduly increasing the number of
assignment possibilities. Typical window size values for NOESY
spectra are 0.02 and 0.04 ppm for the direct and indirect pro-
ton dimensions, respectively, and 0.5 ppm for the heteronu-
clear dimensions. The maximum number of assignment
possibilities (max_n) also affects the quality of the initial assign-
ment, since some peaks that could correctly be assigned are
rejected due to an excessively large number of assignment pos-
sibilities. Fossi et al. have developed a strategy, based on a pre-
calculation analysis, for choosing optimal values for d and and
max_n for a particular data set (58). The size of the windows
is directly linked to the line-width of the spectra. Thus, for
MAS solid-state NMR experiments, line broadening would
require larger assignment windows. From the literature, typical
values for proton-driven spin diffusion experiments or proton-
mediated rare-spin correlation experiments are in the range of
0.25–0.6 ppm.
6. Atoms with missing resonance assignments will not be assigned
to any cross-peak. In this case, automatically generated assign-
ments are almost certainly wrong. From our experience, to
achieve reasonable convergence, the completeness of a chemi-
cal shift list should not be less than 90%.
7. In addition to the standard ambiguity arising from chemical
shift degeneracy, symmetry degeneracy leads to a larger num-
ber of assignment possibilities.
8. Different methods can be used to estimate secondary struc-
tures. For instance, CSI (59), TALOS (60) or DANGLE (61)
predict likely values of phi/psi main-chain dihedral angles from
a list of chemical shift assignments. Such predictions can be
incorporated as dihedral angle restraints using an harmonic
square-well potential.
9. The theoretical cross-peak volume is then calculated as an
r^-6 average over all pairwise contributions:

\[ V_{IJ} = C\, n_I n_J\, \hat{d}_{IJ}^{-6} \quad \text{where} \quad \hat{d}_{IJ}^{-6} = \frac{1}{N_I N_J} \sum_{I \times J} d_{ij}^{-6} \qquad (16) \]

and where I and J denote two groups of spins having n_I and n_J
members, respectively. Introduction of the effective distance
\hat{d}_{IJ} retains the functional form of Eq. 4. Equation 16 relies on
a discrete slow jump model where spins I and J jump between
N_I and N_J equilibrium sites, respectively (24).
10. For solid-state NMR data, the approximation is more severe.
Because of additional effects that influence the relation
between peak intensity and the actual distance (dipolar trun-
cation, partial mobility, transfer efficiency), the calibration
routine implemented in ARIA may not be adapted to cor-
rectly model the cross-peak signals. However, the use of fixed
distance bounds has been shown to be sufficient in numerous
solid-state NMR studies. In fact, the calibration is less impor-
tant since the essential feature of the ambiguous distance
restraint remains valid: if at least one of the assignment possi-
bilities is smaller than the upper limit, the restraint is satisfied.
Bounds can be estimated, for instance, from buildup curves.
We recommend consulting the following references for details
(7, 9, 55, 62, 63).
11. The packing restraint intends to compensate for lack of unam-
biguous intermonomer restraints in early ARIA iterations. If
convergence is achieved and a sufficient number of meaningful
intermonomer cross-peaks have been assigned, we advise not
to use this restraint.
12. Restraints discarded by the merging procedure are excluded
from the list.
13. To use the CCPNmr FormatConverter (17) for data conver-
sion with file formats not natively supported by ARIA, it is
necessary to use the following command
aria2 --convert_ccpn conversion.xml
14. If solid-state NMR experiments are performed on site-directed
13C-enriched samples (7, 64), it is necessary to specify the
appropriate labeling scheme, i.e., [1,3-13C]-glycerol and
[2-13C]-glycerol. ARIA automatically removes assignment
options that are not permitted by the labeling pattern, as first
described in the SOLARIA program (9). Alternatively,
CCPNmr Analysis provides routines to create ambiguous dis-
tance restraints respecting the labeling patterns. Such restraints
can be then imported into ARIA.
15. A user can also choose the “New” item in the GUI menu
“Project” to create a new project. As an alternative, the follow-
ing command
aria2 --project_template project.xml
will create a new project file.
16. Residual dipolar coupling data can be incorporated as restraints
following two alternative approaches: direct (SANI) or indirect
(VEAN). For SANI, the user has to specify the rhombicity and
magnitude of the alignment tensor (65). Several methods exist
to predict these parameters, from the distribution of the RDC
values (66) or from the shape of the molecule (67). VEAN
uses intervector projection angle restraints which must be gen-
erated with a separate program (68).
17. The correlation between a three-bond measured J-coupling
and the corresponding dihedral angle is modeled by the Karplus
curve. Default values for the parameters of the Karplus curve
are given for 3J(HNHα).
18. An MTF can be specified in the project file. Changes must also be
made to the CNS topology, linkage, and parameter files. Definitions
of the additional residues or compounds must be added to the
ARIA dictionary (files atomnames.xml and iupac.xml).
A detailed explanation is given on the ARIA Web site.
19. We recommend the use of the network-anchoring only for the
first 3 iterations. Too stringent thresholds or an application of
network-anchoring during more ARIA iterations may bias the
assignment process toward an incorrect structure (13).
20. Jobs can be submitted via ssh commands or with the follow-
ing batch queuing systems: PBS (69), SGE (70) or Condor
(71). Alternatively, CCPN users can submit their ARIA calcu-
lation to the CCPNGrid portal server at http://www.webapps.
ccpn.ac.uk/ccpngrid/.
21. Only local copies of data files are used for structure calculation.
Changes in the original files will thus become active only in the
next project setup.
22. For systems of about 100 residues, well converged ensembles
show average energies of the order of 1,000 kcal/mol. Normal
energy variation is about 10%, and the total average energy scales
approximately linearly with the system size.
23. Other methods are available to estimate the credibility of the
structures, notably by scrutinizing the information content of
the data (72). For instance, the completeness (73) of a restraint
set provides insight into the local reliability of each structure.
The completeness is the ratio between the number of observed
restraints and the number of expected restraints. We recom-
mend the method AQUA (73) to perform such analysis.
Moreover, several Web servers exist where a user can submit
structures for quality checking and validation, e.g., PSVS (74)
and Cing (75).
24. Comparing such quality profiles can be very helpful to detect reli-
able solutions when multiple conformations are obtained (13).
25. A recent study on the effect of the cooling rate of the simulated-
annealing with highly ambiguous data reported an increased
efficiency of slower cooling, e.g., 100,000 (equivalent Cartesian)
steps (28). A number of steps of the same order was successfully used to
determine the structure of the SH3 domain (9), Crh (10), and the
αB-crystallin dimer from MAS solid-state NMR data (76). Note
that ARIA divides the number of steps for the torsion angle
phase by the value of the parameter TAD time-steps factor to
allow a larger time-step (default factor value is 9).
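The Karplus relation mentioned in Note 17 can be illustrated with a minimal sketch. The functional form 3J = A cos²θ + B cos θ + C with θ = φ − 60° is the standard one for 3J(HNHα); the coefficients used below (A = 6.51, B = −1.76, C = 1.60 Hz) are a widely used literature parameterization and are an assumption here, since the text does not state the ARIA default values. The function name is hypothetical.

```python
import math

def karplus_3j_hnha(phi_deg, a=6.51, b=-1.76, c=1.60):
    """Return 3J(HN,Halpha) in Hz for a backbone phi angle given in degrees.

    The coefficients a, b, c are assumed literature values, not necessarily
    the defaults shipped with ARIA.
    """
    theta = math.radians(phi_deg - 60.0)  # phase shift for the HN-N-CA-HA torsion
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c

# Typical helical (phi near -60) and extended (phi near -120) conformations
for phi in (-60.0, -120.0):
    print(f"phi = {phi:7.1f} deg -> 3J = {karplus_3j_hnha(phi):.2f} Hz")
```

With these assumed coefficients, helical φ angles give small couplings and extended conformations give large ones, which is the qualitative behavior exploited by J-coupling restraints.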
Acknowledgments
This work was supported by the EU grants SPINE (QLG2-CT-2002-00988)
and ExtendNMR (LSHG-CT-2005-018988).
The Ministère de l’Enseignement Supérieur (ACI IMPBio, project
ICMD-RMN) and Institut Pasteur are also acknowledged for
financial support. The authors would like to thank Wolfgang
Rieping, Michael Habeck, Aymeric Bernard, and the CCPN team
for their active participation in the development of ARIA, as well
as Anja Böckmann and Barth-Jan van Rossum for fruitful collabo-
rations on solid-state NMR. Benjamin Bardiaux thanks Hartmut
Oschkinat for support.
References
1. Wüthrich, K. (1986) NMR of Proteins and Nucleic Acids, Wiley-Interscience, New York.
2. Nilges, M. (1995) Calculation of protein structures with ambiguous distance restraints. Automated assignment of ambiguous NOE crosspeaks and disulphide connectivities. J. Mol. Biol. 245, 645–660.
3. Nilges, M. and O’Donoghue, S. I. (1998) Ambiguous NOEs and automated NOESY assignment. Prog. NMR Spec. 32, 107–139.
4. Linge, J. P., O’Donoghue, S. I., and Nilges, M. (2001) Automated assignment of ambiguous nuclear Overhauser effects with ARIA. Methods Enzymol. 339, 71–90.
5. Linge, J. P., Habeck, M., Rieping, W., and Nilges, M. (2003) ARIA: automated NOE assignment and NMR structure calculation. Bioinformatics 19, 315–316.
6. Rieping, W., Habeck, M., Bardiaux, B., Bernard, A., Malliavin, T., and Nilges, M. (2007) ARIA2: automated NOE assignment and data integration in NMR structure calculation. Bioinformatics 23, 381–382.
7. Castellani, F., van Rossum, B., Diehl, A., Schubert, M., Rehbein, K., and Oschkinat, H. (2002) Structure of a protein determined by solid-state magic-angle-spinning NMR spectroscopy. Nature 420, 98–102.
8. Herrmann, T., Güntert, P., and Wüthrich, K. (2002) Protein NMR structure determination with automated NOE assignment using the new software CANDID and the torsion angle dynamics algorithm DYANA. J. Mol. Biol. 319, 209–227.
9. Fossi, M., Castellani, F., Nilges, M., Oschkinat, H., and van Rossum, B. (2005) SOLARIA: a protocol for automated cross-peak assignment and structure calculation for solid-state magic-angle spinning NMR spectroscopy. Angew. Chem. Int. Ed. Engl. 44, 6151–6154.
10. Loquet, A., Bardiaux, B., Gardiennet, C., Blanchet, C., Baldus, M., Nilges, M., Malliavin, T., and Böckmann, A. (2008) 3D Structure Determination of the Crh Protein from Highly Ambiguous Solid-State NMR Restraints. J. Am. Chem. Soc. 130, 3579–3589.
11. Manolikas, T., Herrmann, T., and Meier, B. (2008) Protein structure determination from (13)C spin-diffusion solid-state NMR spectroscopy. J. Am. Chem. Soc. 130, 3959–3966.
12. Wasmer, C., Lange, A., Melckebeke, H. V., Siemer, A., Riek, R., and Meier, B. (2008) Amyloid fibrils of the HET-s(218–289) prion form a beta solenoid with a triangular hydrophobic core. Science 319, 1523–1526.
13. Bardiaux, B., Bernard, A., Rieping, W., Habeck, M., Malliavin, T. E., and Nilges, M. (2009) Influence of different assignment conditions on the determination of symmetric homodimeric structures with ARIA. Proteins 75, 569–585.
14. Nilges, M., Bernard, A., Bardiaux, B., Malliavin, T., Habeck, M., and Rieping, W. (2008) Accurate NMR structures through minimisation of an extended hybrid energy. Structure 16, 1305–1312.
15. van Rossum, G., http://www.python.org/.
16. Brünger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J.-S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T., and Warren, G. L. (1998) Crystallography and NMR system (CNS): A new software suite for macromolecular structure determination. Acta Cryst. sect. D 54, 905–921.
17. Vranken, W. F., Boucher, W., Stevens, T. J., Fogh, R. H., Pajon, A., Llinas, M., Ulrich, E. L., Markley, J. L., Ionides, J., and Laue, E. D. (2005) The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins 59, 687–696.
18. Laskowski, R. A., MacArthur, M. W., Moss, D. S., and Thornton, J. M. (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Cryst. 26, 283–291.
19. Vriend, G. (1990) WHAT IF: a molecular modeling and drug design program. J. Mol. Graph. 8, 52–56.
20. Sippl, M. J. (1993) Recognition of errors in three-dimensional structures of proteins. Proteins Struct. Funct. Genet. 17, 355–362.
21. Davis, I. W., Leaver-Fay, A., Chen, V. B., Block, J. N., Kapral, G. J., Wang, X., Murray, L. W., Arendall, W. B., Snoeyink, J., Richardson, J. S., and Richardson, D. C. (2007) MolProbity: all-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res. 35, W375–383.
22. Folmer, R. H., Hilbers, C. W., Konings, R. N., and Nilges, M. (1997) Floating stereospecific assignment revisited: application to an 18 kDa protein and comparison with J-coupling data. J. Biomol. NMR 9, 245–258.
23. Duggan, B., Legge, G., Dyson, H., and Wright, P. (2001) SANE (Structure Assisted NOE Evaluation): an automated model-based approach for NOE assignment. J. Biomol. NMR 19, 321–329.
24. Görler, A. and Kalbitzer, H. R. (1997) Relax, a flexible program for the back calculation of NOESY spectra based on complete relaxation matrix formalism. J. Magn. Reson. 124, 177–188.
25. Linge, J., Habeck, M., Rieping, W., and Nilges, M. (2004) Correction of spin diffusion during iterative automated NOE assignment. J. Magn. Reson. 167, 334–342.
26. Mumenthaler, C. and Braun, W. (1995) Automated assignment of simulated and experimental NOESY spectra of proteins by feedback filtering and self-correcting distance geometry. J. Mol. Biol. 254, 465–480.
27. Stein, E. G., Rice, L. M., and Brünger, A. T. (1997) Torsion-angle molecular dynamics as a new efficient tool for NMR structure calculation. J. Magn. Reson. 124, 154–164.
28. Fossi, M., Oschkinat, F., Nilges, M., and Ball, L. (2005) Quantitative study of the effects of chemical shift tolerances and rates of SA cooling on structure calculation from automatically assigned NOE data. J. Magn. Reson. 175, 92–102.
29. Rieping, W., Habeck, M., and Nilges, M. (2005) Modeling errors in NOE data with a log-normal distribution improves the quality of NMR structures. J. Am. Chem. Soc. 127, 16026–16027.
30. Rieping, W., Habeck, M., and Nilges, M. (2005) Inferential Structure Determination. Science 309, 303–306.
31. Habeck, M., Rieping, W., and Nilges, M. (2006) Weighting of experimental evidence in macromolecular structure determination. Proc. Natl. Acad. Sci. USA 103, 1756–1761.
32. Nilges, M. (1993) A calculation strategy for the structure determination of symmetric dimers by 1H NMR. Proteins 17, 297–309.
33. Linge, J. P., Williams, M. A., Spronk, C. A., Bonvin, A. M., and Nilges, M. (2003) Refinement of protein structures in explicit solvent. Proteins Struct. Funct. Genet. 20, 496–506.
34. Linge, J. P. and Nilges, M. (1999) Influence of non-bonded parameters on the quality of NMR structures: a new force-field for NMR structure calculation. J. Biomol. NMR 13, 51–59.
35. Nederveen, A., Doreleijers, J., Vranken, W., Miller, Z., Spronk, C., Nabuurs, S., Guntert, P., Livny, M., Markley, J., Nilges, M., Ulrich, E., Kaptein, R., and Bonvin, A. M. (2005) RECOORD: a REcalculated COORdinates Database of 500+ proteins from the PDB using restraints from the BioMagResBank. Proteins 59, 662–672.
36. The World Wide Web Consortium (2008), Extensible Markup Language (XML) 1.0 (Fifth Edition), http://www.w3.org/TR/xml/.
37. Bardiaux, B., Bernard, A., Rieping, W., Habeck, M., Malliavin, T., and Nilges, M. (2008) Graphical analysis of NMR structural quality and interactive contact map of NOE assignments in ARIA. BMC Struct. Biol. 8, 30–34.
38. Spronk, C. A. E. M., Nabuurs, S. B., Krieger, E., Vriend, G., and Vuister, G. W. (2004) Validation of protein structures derived by NMR spectroscopy. Progress in Nuclear Magnetic Resonance Spectroscopy 45, 315–337.
39. Saccenti, E. and Rosato, A. (2008) The war of tools: how can NMR spectroscopists detect errors in their structures? J. Biomol. NMR 40, 251–261.
40. Nabuurs, S., Krieger, E., Spronk, C., Nederveen, A., Vriend, G., and Vuister, G. (2005) Definition of a new information-based per-residue quality parameter. J. Biomol. NMR 33, 123–134.
41. Nabuurs, S., Spronk, C., Vuister, G., and Vriend, G. (2006) Traditional biomolecular structure determination by NMR spectroscopy allows for major errors. PLoS Comput. Biol. 2, e9.
42. Kraulis, P., Domaille, P. J., Campbell-Burk, S. L., van Aken, T., and Laue, E. D. (1994) Solution structure and dynamics of ras p21.GDP determined by heteronuclear three- and four-dimensional NMR spectroscopy. Biochemistry 33, 3515–3531.
43. Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A. (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293.
44. Johnson, B. A. and Blevins, R. A. (1994) NMRView: A computer program for the visualization and analysis of NMR data. J. Biomol. NMR 4, 603–614.
45. Garrett, D., Powers, R., Gronenborn, A., and Clore, G. (1991) A common sense approach to peak picking two-, three- and four-dimensional spectra using automatic computer analysis of contour diagrams. J. Magn. Reson. 95, 214–220.
46. Kjær, M., Andersen, K. V., and Poulsen, F. M. (1994) Automated and semiautomated analysis of homo- and heteronuclear multidimensional nuclear magnetic resonance spectra of proteins: the program PRONTO. Methods Enzymol. 239, 288–308.
47. Bartels, C., Xia, T.-H., Billeter, M., Güntert, P., and Wüthrich, K. (1995) The program XEASY for computer-supported NMR spectral analysis of biological macromolecules. J. Biomol. NMR 5, 1–10.
48. Güntert, P., Braun, W., and Wüthrich, K. (1991) Efficient computation of three-dimensional protein structures in solution from nuclear magnetic resonance data using the program DIANA and the supporting programs CALIBA, HABAS and GLOMSA. J. Mol. Biol. 217, 517–530.
49. Hall, S. R. and Cook, A. P. F. (1995) STAR dictionary definition language: Initial specification. J. Chem. Inf. Comput. Sci. 35, 819–825.
50. Markley, J. L., Bax, A., Arata, Y., Hilbers, C. W., Kaptein, R., Sykes, B. D., Wright, P. E., and Wüthrich, K. (1998) Recommendations for the presentation of NMR structures of proteins and nucleic acids. J. Mol. Biol. 280, 933–952.
51. Güntert, P., Mumenthaler, C., and Wüthrich, K. (1997) Torsion Angle Dynamics for NMR Structure Calculation with the New Program DYANA. J. Mol. Biol. 273, 283–298.
52. Wüthrich, K., Billeter, M., and Braun, W. (1983) Pseudo-structures for the 20 common amino acids for use in studies of protein conformations by measurements of intramolecular proton-proton distance constraints with nuclear magnetic resonance. J. Mol. Biol. 169, 949–961.
53. Lange, A., Luca, S., and Baldus, M. (2002) Structural constraints from proton-mediated rare-spin correlation spectroscopy in rotating solids. J. Am. Chem. Soc. 124, 9704–9705.
54. Szeverenyi, N., Sullivan, M., and Maciel, G. (1982) Observation of spin exchange by two-dimensional fourier transform 13C cross polarization-magic-angle spinning. J. Magn. Reson. 47, 462–475.
55. Castellani, F., van Rossum, B., Diehl, A., Rehbein, K., and Oschkinat, H. (2003) Determination of solid-state NMR structures of proteins by means of three-dimensional 15N-13C-13C dipolar correlation spectroscopy and chemical shift analysis. Biochemistry 42, 11476–11483.
56. Takegoshi, K., Nakamura, S., and Terao, T. (2003) 13C-1H dipolar-driven 13C-13C recoupling without 13C rf irradiation in nuclear magnetic resonance of rotating solids. J. Chem. Phys. 118, 2325–2341.
57. Lewandowski, J. R., Paëpe, G. D., Eddy, M. T., and Griffin, R. G. (2009) (15)N-(15)N proton assisted recoupling in magic angle spinning NMR. J. Am. Chem. Soc. 131, 5769–5776.
58. Fossi, M., Linge, J., Labudde, D., Leitner, D., Nilges, M., and Oschkinat, H. (2005) Influence of chemical shift tolerances on NMR structure calculations using ARIA protocols for assigning NOE data. J. Biomol. NMR 31, 21–34.
59. Wishart, D. S. and Sykes, B. D. (1994) The 13C chemical-shift index: a simple method for the identification of protein secondary structure using 13C chemical-shift data. J. Biomol. NMR 4, 171–180.
60. Cornilescu, G., Delaglio, F., and Bax, A. (1999) Protein backbone angle restraints from searching a database for chemical shift and sequence homology. J. Biomol. NMR 13, 289–302.
61. Cheung, M.-S., Maguire, M. L., Stevens, T. J., and Broadhurst, R. W. (2010) DANGLE: A Bayesian inferential method for predicting protein backbone dihedral angles and secondary structure. J. Magn. Reson. 202, 223–33.
62. Loquet, A., Gardiennet, C., and Böckmann, A. (2010) Protein 3D structure determination by high-resolution solid-state NMR. Comptes. Rendus - Chimie 13, 423–430.
63. Gardiennet, C., Loquet, A., Etzkorn, M., Heise, H., Baldus, M., and Böckmann, A. (2008) Structural constraints for the Crh protein from solid-state NMR experiments. J. Biomol. NMR 40, 239–250.
64. LeMaster, D. M. and Kushlan, D. M. (1996) Dynamical mapping of E. coli thioredoxin via 13C NMR relaxation analysis. J. Am. Chem. Soc. 118, 9255–9264.
65. Tjandra, N., Garrett, D. S., Gronenborn, A. M., Bax, A., and Clore, G. M. (1997) Defining long range order in NMR structure determination from the dependence of heteronuclear relaxation times on rotational diffusion anisotropy. Nature Struct. Biol. 4, 443–449.
66. Clore, G., Gronenborn, A., and Bax, A. (1998) A robust method for determining the magnitude of the fully asymmetric alignment tensor of oriented macromolecules in the absence of structural information. J. Magn. Reson. 133, 216–221.
67. Zweckstetter, M. and Bax, A. (2000) Prediction of sterically induced alignment in a dilute liquid crystalline phase: Aid to protein structure determination by NMR. J. Am. Chem. Soc. 122, 3791–3792.
68. Meiler, J., Blomberg, N., Nilges, M., and Griesinger, C. (2000) A new approach for applying residual dipolar couplings as restraints in structure calculations. J. Biomol. NMR 16, 245–252.
69. Jones, J. P. (2002) PBS: portable batch system, Beowulf cluster computing with Linux, MIT Press, Cambridge, MA, USA, 369–390.
70. Gentzsch, W. (2001) Sun Grid Engine: Towards creating a compute power grid, CCGRID ’01: Proceedings of the 1st International Symposium on Cluster Computing and the Grid, IEEE Computer Society, Washington, DC, USA, 35.
71. Thain, D., Tannenbaum, T., and Livny, M. (2005) Distributed computing in practice: the Condor experience. Concurr. Comput.: Pract. Exper. 17, 323–356.
72. Nabuurs, S., Spronk, C., Krieger, E., Maassen, H., Vriend, G., and Vuister, G. (2003) Quantitative evaluation of experimental NMR restraints. J. Am. Chem. Soc. 125, 12026–12034.
73. Doreleijers, J. F., Raves, M. L., Rullmann, T., and Kaptein, R. (1999) Completeness of NOEs in protein structure: a statistical analysis of NMR data. J. Biomol. NMR 14, 123–132.
74. Bhattacharya, A., Tejero, R., and Montelione, G. T. (2007) Evaluating protein structures determined by structural genomics consortia. Proteins 66, 778–795.
75. Doreleijers, J. F., Vranken, W. F., Schulte, C., Lin, J., Wedell, J. R., Penkett, C. J., Vuister, G. W., Vriend, G., Markley, J. L., and Ulrich, E. L. (2009) The NMR restraints grid at BMRB for 5,266 protein and nucleic acid PDB entries. J. Biomol. NMR 45, 389–396.
76. Jehle, S., Rajagopal, P., Bardiaux, B., Markovic, S., Kühne, R., Stout, J. R., Higman, V. A., Klevit, R. E., van Rossum, B.-J., and Oschkinat, H. (2010) Solid-state NMR and SAXS studies provide a structural basis for the activation of alphaB-crystallin oligomers. Nat. Struct. Mol. Biol. 17, 1037–1042.
Chapter 24
Determining Protein Dynamics from 15N Relaxation Data by Using DYNAMICS
David Fushman
Abstract
Motions are essential for protein function, and knowledge of protein dynamics is key to understanding
the mechanisms underlying protein folding and stability, ligand recognition, allostery, and catalysis. In the
last two decades, NMR relaxation measurements have become a powerful tool for characterizing backbone
and side chain dynamics in complex biological macromolecules such as proteins and nucleic acids. Accurate
analysis of the experimental data in terms of motional parameters is an essential prerequisite for developing
physical models of motions to paint an adequate picture of protein dynamics. Here, I describe in detail how
to use the software package DYNAMICS that was developed for accurate characterization of the overall
tumbling and local dynamics in a protein from nuclear spin-relaxation rates measured by NMR. Step-by-
step instructions are provided and illustrated through an analysis of 15N relaxation data for protein G.
Key words: Relaxation, Protein dynamics, Order parameter, Spectral density, Dipolar coupling,
Chemical shift anisotropy, CSA, Overall tumbling, Rotational diffusion tensor, Monomer–dimer
equilibrium
1. Introduction
Proteins are molecular nanomachines. Understanding how they
work requires detailed knowledge not only of their three-dimensional
structure but also of the various motions that take place in a
protein and the roles they play in the protein’s folding and stability,
ligand recognition, allostery, and catalysis. NMR is perhaps the
most powerful analytical tool in structural biology, because it is
capable of providing site-specific information on the structure,
dynamics, and electronic environment of essentially any nucleus in
a molecule, even as complex as a protein or nucleic acid. Moreover,
solution NMR methods allow studies of molecules in their native
milieu, the reporter groups used do not cause any structural
perturbations, and the applied magnetic fields are still so weak
compared to the thermal energy that they do not affect molecular
structure or dynamics.
Recent decades have witnessed a burst of NMR studies of protein
motions (reviewed in ref. 1). The N–H group has been a popular
reporter group for protein dynamics studies for several reasons.
Most importantly, 1H–15N is, to a good approximation, an isolated
spin pair that is conveniently located in the backbone and abundant
in proteins. Combined with the often good spread of NMR signals in
1H–15N correlation spectra, this allows almost complete coverage of
a protein sequence (except prolines). In addition, 15N enrichment is
relatively easy and affordable. There is growing interest in
developing and extending the NMR methodology (e.g., see refs. 2, 3)
to understand motions in other groups in a protein, both in the
backbone (CO, C–Cα, CαHα) and in the side chains (e.g., methyl
groups).
The underlying concept of NMR being a sensitive tool for
accessing equilibrium protein dynamics is that nuclear spin relax-
ation is caused by modulation (by the overall and internal motions)
of the magnetic field sensed by a nucleus under observation. This
field is a result of magnetic interactions (dipolar and scalar) with the
magnetic moments of surrounding nuclei and of perturbations in
the electronic environment of the nucleus, resulting in the shielding
effect. All of these mechanisms contribute to nuclear-spin relaxation,
albeit to different extents. Modulation of dipolar coupling
can be caused by reorientation of the internuclear vector as well as
variation in its length. For a bonded pair of atoms, bond vibrations
are usually very fast (on the NMR time scale) and therefore do not
contribute to nuclear-spin relaxation rates directly (except for an
altered effective bond length). Likewise, the effect of motions on
the chemical shift (shielding) tensor could be envisioned as a reori-
entation of the tensor, as well as a modulation of its principal com-
ponents. The former contributes in a similar way as reorientation of
the dipolar coupling while the latter could lead to the so-called
conformational (or chemical) exchange contributions to relaxation.
Note that unlike the dipolar coupling, the shielding tensor generally
is not axially symmetric; however, a fully anisotropic tensor can
always be represented as a sum of two axially symmetric tensors, and
for each of them the above statement applies.
I do not describe in this chapter how relaxation rates are mea-
sured; this information can be found in various sources, e.g., see
ref. 4. Instead, here I focus on how these rates can be analyzed.
1.1. The Underlying Equations
The experimental spin relaxation parameters (longitudinal and
transverse relaxation rates, R1 and R2, and the steady-state
heteronuclear NOE) are directly related to power spectral densities,
J(ω), which are Fourier transforms of the corresponding correlation
functions describing reorientations of the internuclear vector of
interest. In the case of the backbone amide 15N nucleus, the major
sources of 15N spin relaxation are modulations by motions of (1) the
dipole–dipole interaction (=dipolar coupling) of the nuclear
magnetic moment of 15N with that of the directly bonded 1H,
and (2) the anisotropy of the 15N chemical shift tensor (CSA). The
standard equations read as follows:
$$R_1 = 3(d^2 + c^2)\,J(\omega_N) + d^2\,[J(\omega_H - \omega_N) + 6J(\omega_H + \omega_N)] \tag{1}$$
$$R_2 = \tfrac{1}{2}(d^2 + c^2)\,[4J(0) + 3J(\omega_N)] + \tfrac{1}{2}d^2\,[J(\omega_H - \omega_N) + 6J(\omega_H) + 6J(\omega_H + \omega_N)] + R_{ex} \tag{2}$$
$$\mathrm{NOE} = 1 - |\gamma_H/\gamma_N|\,d^2\,[6J(\omega_H + \omega_N) - J(\omega_H - \omega_N)]/R_1 \tag{3}$$
Here $d = -(\mu_0/(4\pi))\,\gamma_H\gamma_N h/(4\pi r_{HN}^3)$ is the strength of the
15N–1H dipolar coupling, c = −ωN·CSA/3, ωH and ωN are the resonance
frequencies of 1H and 15N, respectively, and Rex is the conformational
exchange contribution (if any) to measured R2. These
equations assume that the effects of reorientational motion on the
1H–15N dipolar interaction and on the 15N CSA can be described
by the same autocorrelation function. Corrections to the above
equations that account for noncollinearity of these two interactions
are discussed in (5).
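Equations 1–3 can be evaluated numerically once a spectral density function is supplied. The following is a minimal sketch, not the DYNAMICS implementation. The function names, the N–H distance of 1.02 Å, the 15N CSA of −160 ppm, the use of a signed 15N frequency (ωN = ωH·γN/γH, negative for 15N), the absolute value of the gyromagnetic ratio in the NOE expression, and the rigid isotropic spectral density used as a placeholder are all assumptions made for illustration.

```python
import math

# Physical constants in SI units
MU0_OVER_4PI = 1.0e-7        # T m / A
H_PLANCK = 6.62607015e-34    # J s
GAMMA_H = 2.6752218744e8     # rad s^-1 T^-1 (1H)
GAMMA_N = -2.7126e7          # rad s^-1 T^-1 (15N, negative)

def relaxation_rates(J, h_freq_mhz, r_hn=1.02e-10, csa=-160.0e-6, rex=0.0):
    """Evaluate Eqs. 1-3 for a callable spectral density J(omega).

    r_hn (m) and csa (dimensionless, i.e., ppm * 1e-6) are assumed values.
    Returns (R1, R2, NOE) with rates in 1/s.
    """
    w_h = 2.0 * math.pi * h_freq_mhz * 1.0e6
    w_n = w_h * GAMMA_N / GAMMA_H                 # signed 15N frequency (assumption)
    d = -MU0_OVER_4PI * GAMMA_H * GAMMA_N * H_PLANCK / (4.0 * math.pi * r_hn ** 3)
    c = -w_n * csa / 3.0
    d2, c2 = d * d, c * c
    r1 = 3.0 * (d2 + c2) * J(w_n) + d2 * (J(w_h - w_n) + 6.0 * J(w_h + w_n))
    r2 = (0.5 * (d2 + c2) * (4.0 * J(0.0) + 3.0 * J(w_n))
          + 0.5 * d2 * (J(w_h - w_n) + 6.0 * J(w_h) + 6.0 * J(w_h + w_n))
          + rex)
    noe = 1.0 - abs(GAMMA_H / GAMMA_N) * d2 * (6.0 * J(w_h + w_n) - J(w_h - w_n)) / r1
    return r1, r2, noe

def rigid_isotropic_J(tau_c):
    """Placeholder J(omega) = (2/5) tau_c / (1 + (omega tau_c)^2) for a rigid rotor."""
    return lambda w: 0.4 * tau_c / (1.0 + (w * tau_c) ** 2)

r1, r2, noe = relaxation_rates(rigid_isotropic_J(5.0e-9), 600.0)
print(f"R1 = {r1:.2f} 1/s, R2 = {r2:.2f} 1/s, NOE = {noe:.2f}")
```

Passing J as a callable keeps the rate equations separate from the choice of motional model, so the same function can be reused with any parameterization of the spectral density.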
Equations 1–3 provide the basis for extracting information on
protein dynamics from NMR relaxation measurements. Given the
experimental data, the primary objectives here are (1) to determine
the spectral densities J(ω) and, most importantly, (2) to translate
them into an adequate physical picture of protein dynamics. As it is
generally impossible to determine all parameters of complex
motions from a limited set of measurements, the latter objective
requires adequate theoretical models of motion that can be obtained
from comparison with molecular dynamics simulations (e.g., see
refs. 6–8). Nevertheless, accurate analysis of experimental data
(objective (1)) is an essential prerequisite for such a comparison.
Extracting the spectral densities directly from Eqs. 1–3 is
problematic because this system of equations is underdetermined: the
number of unknowns (d, c, J(ω)’s, and possibly Rex) exceeds the
number of available experimental data (e.g., see ref. 9). A widely
accepted way to circumvent this problem, the so-called model-free
approach (10, 11), is based on a rather simple parameterization of
the spectral density function by approximating the correlation
function describing local dynamics as monoexponential:
$$C_{loc}(t) = S^2 + (1 - S^2)\exp\left(-t/\tau_{loc}\right). \tag{4}$$
In this parameterization, τloc has the meaning of the correlation
time of the bond’s motion, and the angular amplitude of bond
reorientations is characterized by the so-called squared order
parameter, S², a dimensionless measure of the amplitude on a
scale from 0 to 1: S² = 0 for unrestricted bond motions, while S² = 1
when this motion is completely restricted. Assuming that the
strength of the dipolar coupling (d) and the 15N CSA term (c) are
known, this leaves only two fitting parameters (three if Rex is
present), S² and τloc, to be determined for each residue, since the
overall tumbling of the molecule is described by a small number of
global parameters (see below).
It was found, however, that analysis of 15N relaxation data in
proteins sometimes requires a more complex, dual-exponential
parameterization (the so-called extended model-free model (12)):
$$C_{loc}(t) = S^2 + (S_{fast}^2 - S^2)\exp\left(-t/\tau_{slow}\right) + (1 - S_{fast}^2)\exp\left(-t/\tau_{fast}\right). \tag{5}$$
This correlation function represents a superposition of two
independent motions, “fast” and “slow,” characterized by the
corresponding order parameters (Sfast² and Sslow²) and correlation times (τfast
and τslow) and occurring on entirely separated time scales: τfast ≪ τslow.
Note that for consistency with Eq. 4, here I introduced the
generalized order parameter S² that represents the total amplitude of the
combined motion: S² = Sslow² Sfast². In this chapter, I refer to the
model-free characteristics of local motion (S², τloc or Sfast², τfast, S², τslow) as well
as Rex as microdynamic parameters. Various parameterizations of the
correlation function Cloc(t) are referred to as models of local motion.
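The two parameterizations of the local correlation function (Eqs. 4 and 5) can be compared directly in a few lines. The sketch below is illustrative only; the function names and the numerical values (Sfast² = 0.85, Sslow² = 0.70, τfast = 20 ps, τslow = 2 ns) are hypothetical and are not taken from DYNAMICS. It shows the limiting behavior: Cloc(0) = 1, Cloc(t) tends to S² at long times, and Eq. 5 reduces to Eq. 4 when Sfast² = 1.

```python
import math

def c_loc_simple(t, s2, tau_loc):
    """Eq. 4: monoexponential model-free correlation function."""
    return s2 + (1.0 - s2) * math.exp(-t / tau_loc)

def c_loc_extended(t, s2_fast, s2_slow, tau_fast, tau_slow):
    """Eq. 5: dual-exponential (extended model-free) correlation function."""
    s2 = s2_fast * s2_slow  # generalized order parameter, S^2 = Sslow^2 * Sfast^2
    return (s2
            + (s2_fast - s2) * math.exp(-t / tau_slow)
            + (1.0 - s2_fast) * math.exp(-t / tau_fast))

# Hypothetical microdynamic parameters (times in seconds)
params = dict(s2_fast=0.85, s2_slow=0.70, tau_fast=20.0e-12, tau_slow=2.0e-9)
for t in (0.0, 20.0e-12, 2.0e-9, 1.0e-6):
    print(f"t = {t:.1e} s, C_loc = {c_loc_extended(t, **params):.4f}")
```

The decay proceeds in two stages: a rapid drop from 1 toward Sfast² on the τfast time scale, followed by a slower decay to the plateau S² on the τslow time scale.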
If there is no correlation between the local dynamics and the
overall rotational diffusion of a molecule, as assumed in the
model-free approach, the total correlation function that determines J(ω)
and hence the rates of 15N relaxation (Eqs. 1–3) can be written in the
following form:
$$C(t) = C_{ovrl}(t)\,C_{loc}(t), \tag{6}$$
where Covrl(t) is the autocorrelation function describing the overall
tumbling of a rigid molecule characterized by the (generally
anisotropic) rotational diffusion tensor D (13, 14). In the simplest
case of isotropic overall tumbling with a correlation time τc,
$C_{ovrl}(t) = \tfrac{1}{5}\,e^{-t/\tau_c}$. In the case of rotational anisotropy, the
expressions for Covrl(t) are more complex and depend on the bond’s
orientation with respect to the diffusion tensor frame (e.g., see ref.
15). The corresponding equations can be found, for example, in
(13, 14, 16). Recall that Fourier transforms of C(t) give the power
spectral densities J(ω) in Eqs. 1–3. For example, in the case of
isotropic rotational diffusion, combining Eqs. 4 and 6 gives
$$J(\omega) = S^2\,j(\omega, \tau_c) + (1 - S^2)\,j(\omega, \tau_e), \tag{7}$$
where $j(\omega, \tau) = \frac{2}{5}\,\frac{\tau}{1 + (\omega\tau)^2}$ and $1/\tau_e = 1/\tau_c + 1/\tau_{loc}$.
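The step from Eqs. 4 and 6 to Eq. 7 can be written out explicitly. The sketch below assumes the cosine-transform convention J(ω) = 2∫₀^∞ C(t) cos(ωt) dt, which is not stated in the text but reproduces the 2/5 prefactor of j(ω,τ).

```latex
\begin{align*}
C(t) &= \tfrac{1}{5}\,e^{-t/\tau_c}\left[S^2 + (1-S^2)\,e^{-t/\tau_{loc}}\right]
      = \tfrac{1}{5}\left[S^2 e^{-t/\tau_c} + (1-S^2)\,e^{-t/\tau_e}\right],
      \qquad \frac{1}{\tau_e} = \frac{1}{\tau_c} + \frac{1}{\tau_{loc}},\\
J(\omega) &= 2\int_0^\infty C(t)\cos(\omega t)\,dt,
      \qquad \int_0^\infty e^{-t/\tau}\cos(\omega t)\,dt = \frac{\tau}{1+(\omega\tau)^2},\\
J(\omega) &= S^2\,\frac{2}{5}\,\frac{\tau_c}{1+(\omega\tau_c)^2}
      + (1-S^2)\,\frac{2}{5}\,\frac{\tau_e}{1+(\omega\tau_e)^2}
      = S^2\,j(\omega,\tau_c) + (1-S^2)\,j(\omega,\tau_e).
\end{align*}
```

Each exponential term of the correlation function thus contributes one Lorentzian to the spectral density, and the product of the overall and local exponentials decays with the effective time τe.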
Numerous studies in the last two decades (by NMR relaxation
as well as MD simulations; e.g., see refs. 6, 7) revealed quite
restricted backbone motions in well-ordered regions (secondary
structure) in proteins, with S² ~ 0.87, whereas
significantly lower S² values (reflecting greater amplitudes) are often
observed in the flexible unstructured regions, such as loops and
termini. Likewise, the associated time scales (τloc or τfast) are in the
1–100 ps range for protein-core elements and slower, up to several
nanoseconds (τslow), in the flexible parts, possibly reflecting
concerted motion of several residues or segments.
It should be pointed out that the information about protein
motions is limited by the time window imposed by the overall
tumbling. Being unrestricted, the overall tumbling eventually averages
both the dipolar- and the CSA-related energies to zero such that there
is essentially nothing left for slower motions to modulate, except
for the principal values of the dipolar or chemical shift tensors. The
latter modulation manifests itself in the so-called chemical or
conformational exchange processes, which provide access to motions
slower than τc and have been studied quite extensively recently
(e.g., see ref. 17).
In this chapter, I describe the use of the computer program
DYNAMICS (18, 19), designed to extract parameters characterizing
protein motions from NMR-measured spin-relaxation
parameters. For convenience, a distinct font (e.g., kplot) is used to
indicate the names of the parameters in DYNAMICS. Courier font is
used to indicate Matlab commands and screen output messages; the
lines containing Matlab commands throughout this chapter begin with the
Matlab prompt (>>).
2. Description of DYNAMICS Software
2.1. Highlights of the Program DYNAMICS
DYNAMICS is a computer program for model-free analysis of spin
relaxation data. The current version of the program (version 3.0)
includes the following features:
● Overall tumbling. All possible models of the overall rotational
diffusion are allowed: isotropic, axially symmetric, and fully
anisotropic. The overall rotational diffusion tensor can be an
input variable, but can also be determined simultaneously with
the model-free analysis of the relaxation data.
● Multiple-field data. Simultaneous or separate analysis of experi-
mental data from measurements at multiple magnetic fields.
Data sets at various fields do not have to be complete.
● Chemical shift anisotropy. The CSA can be treated as uniform
(fixed) or site-specific; the program also allows determining
site-specific CSA values simultaneously with data analysis.
● Monomer–dimer equilibrium. The program allows data analysis
when the molecule of interest exists in a fast dynamic equilibrium
between monomeric and dimeric states.
2.2. The Overall Organization of the Program
The overall organization of DYNAMICS is depicted in the flowchart
in Fig. 1. The program is written in Matlab (The MathWorks,
Inc.); the current version of the program is compatible with Matlab
versions 6.5 and newer. It is assumed here that the user is familiar
with very basic Matlab commands that allow loading and saving
data and navigation to the desired folder/directory.
Fig. 1. Flowchart of the program DYNAMICS.

[Fig. 2 plot: j(ω,τ) in arbitrary units versus ω/2π in MHz (axis ticks
at 1, 10, 100, and 900 MHz), with an inset covering 100–900 MHz.
Curves: j(ω,τc) for τc = 10 ns; j(ω,τc) for τc = 5 ns; j(ω,τe) for
τc = 10 ns and τloc = 100 ps.]
Fig. 2. Relative contributions to the power spectral density J(ω) from the overall tumbling
and local motion. Shown as a function of frequency ω are j(ω,τc) (see Eq. 7) for τc = 5 ns
(green) and 10 ns (blue), and j(ω,τe) for τloc = 100 ps and τc = 10 ns (red). The factors S2
and (1−S2) are not included.
2.3. Treating the Overall Rotational Diffusion of a Molecule
As shown in Fig. 2, the contribution to the spectral density
function from the overall tumbling is quite substantial (if not
dominant) and often overshadows that from local motions. As our main
goal here is to characterize internal motions, accurate treatment of
the overall tumbling is absolutely critical for accurate analysis of the
local dynamics in a protein (20). Thus, the first and foremost step
in relaxation data analysis is to determine, and “subtract,” the
contribution from the overall tumbling. Significant attention in the
past was paid to developing tools for accurate analysis of the overall
rotational diffusion (16, 21–28).
In principle, the overall rotational diffusion can also be
characterized simultaneously with the analysis of local dynamics, and in
fact, DYNAMICS includes a mechanism for doing this (see
Subheadings 3.2 and 3.3). However, beyond the simplest case of
isotropic tumbling, this determination becomes less straightforward
and can require significant effort, as multiple parameters need
to be optimized manually. Therefore, if protein atom coordinates
are available, the most straightforward and reliable way to
characterize the overall rotational diffusion is directly from relaxation
data and separately from (and prior to) the analysis of local motions.
This is possible because, for well-defined structural regions in a
protein, the “reduced” relaxation rates R1′ and R2′ are both
proportional (to a good approximation) to the squared order parameter.
(The “reduction” is achieved by subtracting from Eqs. 1–2 the
contributions from the high-frequency components, J(ωH) and
J(ωH ± ωN), of the spectral density function, e.g., see ref. 27). Thus
the R2′/R1′ ratio is S2-independent, and the determination of the
overall motion can be deconvoluted (hence performed separately)
from the analysis of local protein dynamics (22, 26, 27, 29).
Moreover, as discussed in (26, 27, 29, 30), the R2′/R1′ ratio is
independent of site-specific variations in the actual values of d and c,
Eqs. 1–2, and therefore depends solely on the structure of a protein
(i.e., orientations of the NH bonds with respect to the diffusion
tensor axes) and on the diffusion tensor itself (22).
This concept is implemented in the computer program RotDif
(26) available online from our Web site:
http://www.gandalf.umd.edu/FushmanLab/. The use of this program is
illustrated in (26–28); therefore, we do not describe these steps in
detail here. Briefly, the RotDif program uses NH-vector coordinates and
relaxation rates, R1, R2, as well as NOE (if available) as input
parameters, and outputs the principal components of the diffusion
tensor D (Dx, Dy, Dz) along with the orientation (given by the
Euler angles α, β, and γ) of the principal axes of the tensor with
respect to the protein coordinate frame. The output also includes
the overall rotational correlation time τc (=TAUc) and the
anisotropy of the tensor (Dz/Dx, Dz/Dy), to be directly entered as input
to DYNAMICS. For the B3 domain of protein G (GB3), used
here as an example, RotDif analysis of 15N relaxation data
measured at 14.1 T (600 MHz 1H frequency) yielded the following
characteristics of the (axially symmetric) diffusion tensor:
D|| ≡ Dz = (6.05 ± 0.44) × 10^7 s−1, D⊥ ≡ Dx = Dy = (4.45 ± 0.15) × 10^7 s−1,
α = 90° ± 8°, β = 70° ± 10°, which give τc = 3.34 ± 0.14 ns and the
anisotropy D||/D⊥ = 1.36 ± 0.09. The results at other fields are very
similar (19).
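As a quick consistency check on the GB3 numbers quoted above, the correlation time and anisotropy can be recomputed from the principal values of the tensor. This is only a sketch: it assumes the standard relation τc = 1/(2(Dx + Dy + Dz)), takes the principal values in units of 10^7 s−1 (the magnitude that reproduces the quoted τc), and uses illustrative variable names that are not RotDif or DYNAMICS variables.

```matlab
% Principal values of the GB3 diffusion tensor, in 1/s
% (assumed to be in units of 1e7 1/s, see the lead-in)
Dz = 6.05e7;          % D_parallel
Dx = 4.45e7;          % D_perpendicular
Dy = Dx;              % axial symmetry: Dx = Dy

% Overall correlation time, assuming tau_c = 1/(2*trace(D)); converted to ns
tauc_ns = 1e9 / (2 * (Dx + Dy + Dz));

% Anisotropy of an axially symmetric tensor
aniso = Dz / Dx;

fprintf('tau_c = %.2f ns, D_par/D_perp = %.2f\n', tauc_ns, aniso);
% prints: tau_c = 3.34 ns, D_par/D_perp = 1.36
```

Both values agree with the τc and D||/D⊥ reported for GB3, which is a convenient sanity check before entering TAUc and the Dz/Dx ratio into DYNAMICS.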
2.4. Selection of the Appropriate Model for Local Motion
At the heart of the DYNAMICS program is the model selection
algorithm, which, based on how a particular model of local motion
fits experimental data, selects the most appropriate model. It is
similar to the approach described in (31) and is based on the
Occam’s razor principle, in that the simplest model that fits the
data is considered sufficient. All models of local motion used in
DYNAMICS are listed in Table 1. The model selection process
starts with the simplest model, LS_00, and first determines if it is
acceptable, i.e., the following two criteria are satisfied: (1) the
model yields physically reasonable values of the microdynamic
parameters (in this case, 0 ≤ S2 ≤ 1, but more generally for all
models: 0 ≤ S2, Sfast2 ≤ 1; τloc, τfast > 5 ps; 100 ps < τslow < τc, and
Rex ≥ sR2) and (2) it provides a reasonable fit to the experimental
data, i.e., passes the goodness-of-fit test (32). If this model is
acceptable, the program proceeds to the next-level-complexity models
(in this case, LS_tl and LS_ex) and applies the same acceptance rules
as above. If any of these models are acceptable and yield lower
residuals of fit (χ2) than the lower-complexity model (in this case,
LS_00), the program uses the F-statistics test to determine if this
improvement in the fit is genuine and reflects a better-fit model or merely
reflects the greater number of fitting parameters (32). This is
possible because the models being tested are nested: each higher
complexity model retains the same parameters as the lower complexity
model and introduces an additional parameter. If neither model
provides a better fit than LS_00, the latter model is accepted, and
the program moves to the next residue. If not, the program
proceeds to higher complexity models and so on. The program keeps
increasing the level of complexity until the number of fitting
parameters (Npar) reaches the number of experimental data (Ndat)
for a given residue, hence the number of degrees of freedom
(df = Ndat − Npar) becomes 0. Note that when df = 0, the F-statistics
test does not work. In this case we implement a simple rule: if
χ2 < 0.01, the model is accepted. This is somewhat arbitrary, and
therefore selection of a model with df = 0 (e.g., LS_tx or CL_00
for a set of R1, R2, NOE measured at a single field) should be taken
with some caution.
Table 1
Microdynamic parameters for the various models of local motion used
in DYNAMICS
Model(a)     S2 or Sslow2(b)  τloc or τslow(b)    Sfast2  τfast  Rex  #exp(c)  Npar(d)  Index(e)
LS_00        V(f)             0                   1       N/A    0    1        1        0
LS_tl        V                V                   1       N/A    0    1        2        1
LS_ex        V                0                   1       N/A    V    1        2        2
LS_tx        V                V                   1       N/A    V    1        3        3
CL_00        V                0                   V       V      0    2        3        4
CL_tl        V                V                   V       V      0    2        4        5
CL_ex        V                0                   V       V      V    2        4        6
CL_tx        V                V                   V       V      V    2        5        7
Matlab name  S2(g)            TAUloc or TAUsl(b)  S2f     TAUf   Rex
(a) The name of the corresponding model of local motion as used in DYNAMICS
(b) Naming convention used in DYNAMICS: the corresponding motional parameters in the
monoexponential model are S2 and τloc, whereas in the double-exponential (“extended”)
model these parameters are called Sslow2 and τslow
(c) The number of exponentials in the corresponding correlation function of local motion, see Eqs. 4 and 5
(d) The total number of fitting parameters in a given model
(e) A numerical index of the model in DYNAMICS, plotted in the output graphs (see Figs. 3 and 4)
(f) “V” indicates that the corresponding parameter is present in a given model and is fitted (not fixed);
N/A = not applied
(g) The name of the corresponding Matlab variable in DYNAMICS output. Note that in the case of
“extended” models the reported S2 value is in fact S2 = Sfast2 × Sslow2
It could happen that none of the models of local motion for a
given residue pass the goodness-of-fit test because of a poor fit or
underestimated experimental errors, resulting in χ2 values higher than
the acceptance level. In this case, if at least one model yielded a
physically meaningful set of microdynamic parameters, the lowest-χ2/df
model that satisfies the latter criterion is selected, and the residue will
be marked as belonging to the NOMOD category. If none of the
models yield a physically meaningful solution, the residue is marked as
program-excluded residue (EXCL category), and DYNAMICS will
output a message: no model found, at all.
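The comparison of nested models described in this subheading can be illustrated with a generic F-test. The sketch below is a textbook formulation and not the DYNAMICS source code: the χ2 values, the number of data points, and the 0.05 significance level are all assumptions chosen for illustration. The tail probability is evaluated with Matlab's built-in betainc function, so no toolbox is needed.

```matlab
% Hypothetical fit results for one residue (illustrative numbers only)
Ndat   = 6;      % e.g., R1, R2, NOE at two fields
chi2_A = 14.2;   % simpler model, e.g., LS_00
npar_A = 1;
chi2_B = 3.1;    % nested, more complex model, e.g., LS_tl
npar_B = 2;

% F statistic for adding (npar_B - npar_A) parameters
df1 = npar_B - npar_A;          % number of extra parameters
df2 = Ndat - npar_B;            % degrees of freedom of the complex model
F   = ((chi2_A - chi2_B) / df1) / (chi2_B / df2);

% Upper-tail probability of the F distribution via the regularized
% incomplete beta function: P(F' > F) = I_x(df2/2, df1/2),
% with x = df2/(df2 + df1*F)
p = betainc(df2 / (df2 + df1 * F), df2 / 2, df1 / 2);

alpha = 0.05;                   % assumed significance level
if p < alpha
    disp('Improvement is statistically significant: keep the complex model');
else
    disp('Improvement is not significant: keep the simpler model');
end
```

Note that df2 appears in a denominator, which is why a test of this kind cannot be applied when df = 0, as stated above.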
2.5. Running DYNAMICS
2.5.1. Getting Started
All the scripts and functions of the DYNAMICS package come in a
single compressed file. When you uncompress it (using one of the
standard programs), it will by default put all the content of the
package in a folder called dynamics.
I recommend that you run all the analysis from the directory
containing your relaxation data, which is separate from the dynamics
directory: this will prevent you from “littering” the latter with
output files that DYNAMICS creates automatically (see Subheading 3.5).
For this, you will need to add the dynamics directory to your Matlab
path, for example by using the following command:
>> path(path,'c:/MyMatlab/dynamics')
(here I assumed that all DYNAMICS scripts are located in the
folder c:/MyMatlab/dynamics on your computer).
2.5.2. Before You Run the Program
Navigate to your data directory and load all required input
parameters (Table 2) into the Matlab workspace (use Matlab function
load for this). Make sure that all of the parameters are in the
proper format and units as specified in Table 2. The auxiliary
program pdb2nh (see Subheading 3.6.1) will help you retrieve
NH-vector coordinates from the protein coordinate file.
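To make the input requirements concrete, here is a minimal sketch of a single-field workspace built by hand, following the array layouts in Table 2. All numerical values and the file name gb3_input.mat are hypothetical; placing NaN in the error column as well as in the value column for a missing measurement is an assumption (Table 2 only specifies NaN for the value).

```matlab
% Hypothetical single-field 15N data set for three residues
freq = 600.13;                      % 1H frequency, MHz

% Each array: [Residue#  value  error]; rates in 1/s, NOE dimensionless
r11 = [ 2  2.10  0.04 ;             % R1
        3  2.05  0.04 ;
        4  NaN   NaN  ];            % R1 not available for residue 4
r21 = [ 2  4.60  0.09 ;             % R2
        3  4.75  0.10 ;
        4  4.70  0.10 ];
r31 = [ 2  0.72  0.03 ;             % NOE
        3  0.70  0.03 ;
        4  0.68  0.03 ];

TAUc  = 3.3;                        % overall correlation time, ns
kovrl = 0;                          % isotropic overall tumbling
kcsa  = 0;                          % fixed uniform CSA
kplot = 1;                          % allow plots

% Keep the inputs together so they can be reloaded with load before a run
save('gb3_input.mat', 'freq', 'r11', 'r21', 'r31', ...
     'TAUc', 'kovrl', 'kcsa', 'kplot');
```

Typing load gb3_input.mat in a later session restores all of these variables to the workspace under the names DYNAMICS expects.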
2.5.3. Run-Time Dialog
To start the program type the following command in the Matlab
Command window:
>> dynamics
If you added the dynamics directory to the Matlab path, you
can type this command directly from your data directory
(recommended). At the start, the program performs preliminary analysis
of the input data and outputs on the screen various estimates of the
overall rotational correlation time and the statistics of the
distribution of the R2/R1 ratios. Here is an example of such output for 15N
relaxation data at 14.1 Tesla (1H frequency = 600.13 MHz) for
GB3 (19). We use these data throughout this chapter. If relaxation
data at more than one field are included, the analysis and the
output will be done for each field separately.
Table 2
Input parameters for DYNAMICS
Each entry lists the parameter or data array name (case sensitive),
its meaning, the data format/structure, and whether it is required.

freq
  Meaning: 1H frequency, in MHz; could be several (Nfreq) frequencies, if data at multiple fields
  Data format/structure: Vector of length Nfreq (a)
  Required? Yes
r11 (if single field) or several arrays r11, r12, etc., generally: r1i where i = 1,2,…,Nfreq
  Meaning: R1 data for Nres residues at freq(i)
  Data format/structure: Array Nres (b) x 3: [Residue# R1 (c) sR1 (c)]
  Required? Yes
r21 (if single field) or several arrays r21, r22, etc., generally: r2i where i = 1,2,…,Nfreq
  Meaning: R2 data for Nres residues at freq(i)
  Data format/structure: Array Nres x 3: [Residue# R2 (c) sR2 (c)]
  Required? Yes
r31 (if single field) or several arrays r31, r32, etc., generally: r3i where i = 1,2,…,Nfreq
  Meaning: NOE data for Nres residues at freq(i)
  Data format/structure: Array Nres x 3: [Residue# NOE sNOE]
  Required? Yes
vNH
  Meaning: NH-vectors (normalized) for Nres residues
  Data format/structure: Array Nres x 4: [Residue# x y z (d)]
  Required? Only for anisotropic diffusion models (kovrl = 1 or −1)
csa
  Meaning: CSA values for Nres residues
  Data format/structure: Array Nres x 3: [Residue# CSA (e) sCSA (e)]
  Required? Only for fixed site-specific CSA (kcsa = −1). If kcsa = 0, the program will ask you to input CSA manually
TAUc
  Meaning: Overall rotational correlation time, τc, in ns
  Data format/structure: Scalar or vector
  Required? Yes. If TAUc is missing in the workspace, the program will ask you to input it manually
Dz2Dx, Dy2Dx
  Meaning: Ratios of the principal values of the diffusion tensor (Dz/Dx, Dy/Dx)
  Data format/structure: Scalars
  Required? Only for anisotropic diffusion models (kovrl = 1 or −1). The program will ask you to input them manually
alpha, beta, gamma
  Meaning: Euler angles {α,β,γ}, in degrees, that define the orientation of the diffusion tensor axes with respect to the protein coordinate frame
  Data format/structure: Scalars or vectors
  Required? Only for anisotropic diffusion models (kovrl = 1 or −1). The program will ask you to input them manually
Ct, Kd
  Meaning: Molar concentration of the protein (Ct) and the dimer’s dissociation constant (Kd), both in mM
  Data format/structure: Scalars
  Required? Only for monomer–dimer equilibrium model (kovrl = 2). The program will ask you to input them manually
kovrl
  Meaning: Flag indicating various rotational diffusion models
  Data format/structure: =0 for isotropic (default); =1 for axially symmetric; =−1 for fully anisotropic; =2 for monomer–dimer equilibrium
  Required? Only for nonisotropic motion, otherwise set to 0 by default
kcsa
  Meaning: Flag for selecting various CSA options
  Data format/structure: =0 for fixed uniform CSA (default); =−1 for fixed site-specific CSA; =1 to fit site-specific CSA
  Required? Only for nonuniform CSA model, set to 0 by default
kplot
  Meaning: Flag to suppress (0) or allow (1) visual output in a form of data plots
  Data format/structure: =0; =1 (default)
  Required? Only for suppressing plot, otherwise set to 1 by default
kfig
  Meaning: Flag controlling figure numbers for plotting the results
  Data format/structure: =0 open a new figure (default); =−1 plot to the same figure; otherwise figure # = kfig
  Required? Only to output to a specific figure#, otherwise set to 0 by default
ML_ver
  Meaning: Matlab version
  Data format/structure: e.g., for Matlab version 7.01, ML_ver = 7.0
  Required? Set manually or let the program determine
Exclude
  Meaning: List of residues you want to exclude from the analysis
  Data format/structure: Vector
  Required? No

(a) Nfreq = number of frequencies in the freq list
(b) Nres = number of residues in the list. If data for some residues are unavailable, do not include these residues in the list or use NaN (see footnote c)
(c) The values of relaxation rates R1, R2 and their experimental errors, sR1, sR2, should be in 1/s; the values of NOE and the experimental error, sNOE, are dimensionless. If for
a given residue the relaxation parameter (R1, R2, or NOE) is not available, input NaN (“not a number”) in the corresponding position (second column) in the array
(d) x,y,z should be coordinates of a unit vector in the direction of the NH bond (can be obtained by running an auxiliary program pdb2nh, see Subheading 3.6.1)
(e) CSA values and their errors (sCSA) should be in ppm
- - - - - - - 600.13 MHz - - - - - - - - -
TOTAL: MEAN = 2.2131 SD(MEAN) = 0.12582 TAU = 3.3506
L&S: MEAN = 2.1806 SD(MEAN) = 0.062933 TAU(MEAN) = 3.2995
MEAN(TAU) = 3.2981 SD_TAU = 0.098668
TAUmc = 3.3001 SD_TAUmc = 0.11508
resid. with the R2/R1 within MEAN +/− SD: 37
resid. with the R2/R1 above MEAN + SD: 9
resid. with the R2/R1 below MEAN - SD: 5
The purpose of this analysis is to estimate the overall rotational
correlation time (as $\tau_c = \frac{1}{2\omega_N}\sqrt{\frac{6R_2}{R_1} - 7}$, (33)) and to count, in
the spirit of (34), how many residues have the R2/R1 ratio within
one standard deviation (SD) from the mean R2/R1 value. These
residues are expected to fit into the “standard” Lipari & Szabo
model (10, 11). Residues with the R2/R1 ratio more than one
standard deviation above the mean value could require the Rex
term, see Eq. 2, because conformational exchange adds to R2, while
those residues that have R2/R1 more than one standard deviation
below the mean value might need the
“extended” model-free model (12).
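The preliminary statistics can be mimicked with a few lines of Matlab. The sketch below uses the relation τc = sqrt(6 R2/R1 − 7)/(2ωN) and simple counting around the mean R2/R1; the relaxation rates are hypothetical, and the value used for the 15N/1H gyromagnetic ratio is an assumption, so the result need not reproduce the screen output of DYNAMICS exactly.

```matlab
% Hypothetical R1 and R2 values (1/s) for a handful of residues
R1 = [2.10 2.05 2.12 2.08 1.95 2.20];
R2 = [4.60 4.75 4.55 4.70 5.60 4.10];

freqH       = 600.13;                          % 1H frequency, MHz
gamma_ratio = 0.10137;                         % |gamma_N/gamma_H|, assumed
wN = 2 * pi * freqH * 1e6 * gamma_ratio;       % 15N Larmor frequency, rad/s

rho = R2 ./ R1;                                % per-residue R2/R1 ratio
m   = mean(rho);
sd  = std(rho);

% Correlation time from the mean ratio, converted to ns
tauc_ns = 1e9 * sqrt(6 * m - 7) / (2 * wN);

n_within = sum(abs(rho - m) <= sd);            % within MEAN +/- SD
n_above  = sum(rho > m + sd);                  % above MEAN + SD
n_below  = sum(rho < m - sd);                  % below MEAN - SD

fprintf('mean R2/R1 = %.3f, tau_c = %.2f ns\n', m, tauc_ns);
fprintf('within: %d, above: %d, below: %d\n', n_within, n_above, n_below);
```

The three counts correspond to the last three lines of the screen output shown above.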
At the start the program plots the experimental data (R2, R1,
NOE, and residue-specific CSAs, if applicable) as a function of resi-
due number (see Figs. 3 and 4). This output can be suppressed by
setting kplot to 0 (or any number other than 1).
The run-time dialog that follows is shown step-by-step below.
Note that many questions that appear on the screen have a default
answer (indicated in the square brackets): this answer will be assumed
if you press ENTER, and if the question was about a parameter
involved in computations, the program will output a message con-
firming that the corresponding value was assumed.
Input a CSA value [−160] ==>
This line appears if kcsa is set to 0 (default), i.e., a fixed uni-
form CSA value will be used. Input the desired value (only numeric
input) or simply press ENTER: in this case CSA = −160 ppm will be
assumed. Note that if kcsa was set to −1, a list of fixed (site-spe-
cific) CSA values must exist in the workspace; otherwise, the pro-
gram will output an error message and exit.
If you did not define TAUc value(s), the program will ask you
the following:
Input TAUc value(s) (in ns) ==>
Here you can input a single value (e.g., 3.3) or a list of values,
e.g., [3.28 3.3 3.32].
If you selected the isotropic rotational diffusion model (i.e.,
kovrl was set to 0 or undefined), the program will proceed to actual
model-free analysis and model selection on a residue-by-residue
basis (see below). However, if the anisotropic diffusion model was
selected (i.e., kovrl was set to −1 or 1), additional input requests
will appear.
Fig. 3. Output of DYNAMICS analysis of backbone motions in GB3 from 15N relaxation data at 600 MHz. (a) Input data; (b)
the results of analysis assuming isotropic overall tumbling with TAUc = 3.33 ns; and (c) the results of analysis assuming
anisotropic (axially symmetric) overall tumbling with TAUc = 3.33 ns and other diffusion tensor characteristics presented in
Subheading 2.3. A uniform 15N CSA value of −174.2 ppm was assumed throughout the protein. The circles on the “model”
plot in b indicate residues that fall into the NOMOD category.
If kovrl was set to 1, you will see the following messages:
<<<<<axially symmetric model selected >>>>>>
Input the Dz/Dx ratio (Dz/Dx < =0 -stop) [1]==>
Enter the actual value of the ratio (in the case of axial symmetry,
it is the same as D||/D⊥).
Input a range of BETA values [0:10:90]==>
Enter a range of β values, if you want to screen different
orientations of the diffusion tensor, or just a single value (for example,
the output of RotDif analysis).
Input the ALPHA angle [0]==>
Enter the value of angle α, in degrees.
Input a starting BETA angle [0]==>
If you entered a single β value when answering the BETA
question above, reenter it here. Otherwise enter a single value of β
that you want to start with. Note that the β value you enter in this
line will be used first, even if you entered a different value or a
range of values above. You will be then given the option to proceed
with the above-entered β values.
If kovrl was set to −1, the dialog will be similar, except that you
will see the following message:
<<<<<<<< anisotropic model selected >>>>>>>>
In addition to the questions listed above, you will be asked
to enter Dy/Dx and the angle γ:
Input Dy/Dx ratio [1]==>
Input the GAMMA angle [0]==>
Fig. 4. Output of DYNAMICS analysis of backbone motions in GB3 from 15N relaxation data at five magnetic fields. (a) Input
data. (b–c) The results of analysis assuming (axially symmetric) anisotropic overall tumbling (b) with a uniform (fixed)
CSA = −174.2 ppm (as in Fig. 3) and (c) site-specific CSAs obtained simultaneously with the microdynamic parameters
from fitting these relaxation data. The circles on the “model” plots indicate residues that fall into the NOMOD category.
If you selected the monomer–dimer equilibrium model (18)
(i.e., kovrl = 2), you will be asked to input the total protein concen-
tration and the dissociation constant (both in mM) prior to start-
ing the analysis:
<<<<<<<< monomer-dimer equilibrium >>>>>>>>
Input protein concentration (in mM) ==>
Input the dissociation constant, Kd (in mM) ==>
The program will then compute and output the [monomer]/
[dimer] molar ratio and proceed to the data analysis (as described
above for the isotropic tumbling option).
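The [monomer]/[dimer] ratio reported at this step follows from simple mass balance. The sketch below is not taken from the DYNAMICS code: it assumes that Ct counts protein in monomer units (Ct = [M] + 2[D]) and that Kd = [M]^2/[D], and the concentrations are hypothetical. If the program defines Ct or Kd differently, the numbers will differ.

```matlab
% Hypothetical inputs, both in mM
Ct = 1.0;     % total protein concentration in monomer units (assumption)
Kd = 0.5;     % dimer dissociation constant, Kd = [M]^2/[D] (assumption)

% Mass balance Ct = [M] + 2[D] with [D] = [M]^2/Kd gives
% 2[M]^2/Kd + [M] - Ct = 0; take the positive root
M = (Kd / 4) * (-1 + sqrt(1 + 8 * Ct / Kd));
D = M^2 / Kd;

ratio = M / D;   % [monomer]/[dimer] molar ratio
fprintf('[M] = %.3f mM, [D] = %.3f mM, [M]/[D] = %.2f\n', M, D, ratio);
```

Raising Ct at fixed Kd lowers the ratio, i.e., shifts the population toward the dimer, which is the behavior the equilibrium model has to capture.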
After all required parameters (depending on the overall tum-
bling model) have been entered, the program will start model-free
analysis. This analysis is performed on a per-residue basis, and for
each residue the program outputs the results in the following
format (these data are taken from the GB3 analysis):
res# 48 LS_tx -model, S2 = 0.74325 TAUloc = 0.015475 Rex = 0.10926 chi2 = 5.904e-006
res# 49 LS_00 -model, S2 = 0.78293 TAUloc = 0 Rex = 0 chi2 = 0.61294
res# 50 CL_00 -model, S2 = 0.7772 TAUsl = 1.9757 Rex = 0 S2f = 0.85552 TAUf = 0 chi2 = 4.6997e-009
res# 52 LS_tl -model, S2 = 0.8388 TAUloc = 0.0064478 Rex = 0 chi2 = 2.6929e-005
res# 55 LS_tl -model, S2 = 0.80301 TAUloc = 0.016452 Rex = 0 chi2 = 0.48655
The models of local motion and the corresponding parameters
are defined in Table 1; chi2 represents the residuals of fit (χ2) for a
given residue. In addition to numeric output, DYNAMICS
visualizes/plots some of the results of the latest run on the screen (see
examples in Figs. 3 and 4): the relevant microdynamic parameters
(e.g., S2, τloc, Rex) and the selected local motion model (represented
by its Index, see Table 1). As mentioned above, the plot option can
be suppressed by the user by setting kplot = 0. If CSA was among
the fitting parameters (i.e., kcsa = 1), the output also shows the
resulting CSA values.
After completing a run through all nonexcluded residues, the
program outputs a summary of the results (see Subheading 2.5.4)
and either continues the calculations for all other TAUc and/or β
values (if there is more than one value for each of these
parameters), or stops and waits for the user’s input. If the isotropic
tumbling model was selected (kovrl = 0 or 2), the message on the
screen will read as follows:
Input TAUc (TAUc < = [0] - break)==>
Entering a positive number will trigger another round of cal-
culations with this TAUc value, whereas zero or a negative number
will be interpreted as the signal to proceed to exit or error analysis.
Note that the latest positive TAUc value will be taken as the final/
accepted value and used for error analysis. If the TAUc value that
you want to accept is not the latest one, you need to reenter the
desired value, let the program run through all residues again (this
is quite fast anyway), and only after that enter 0 or a negative TAUc
to exit or proceed to error analysis.
In the case of anisotropic tumbling (kovrl = −1 or 1), the mes-
sage on the screen reads as follows:
Satisfied? (1-yes(calc.err), [0]-cont.(beta-range), 2-man.input, -1-stop/exit)==>
Enter 1 here to proceed to error analysis, 0 to continue
computations with other β values (if more than one β value was
entered above), 2 if you want to return to manual input of the
diffusion tensor parameters (see above), and −1 to exit the program.
If you choose to exit the program, it will automatically remove
unnecessary (run-time) variables from the workspace and finish.
If you choose to proceed to error analysis, the program will ask
you to select the method of error estimation:
Choose MC simulation of exper.data(0) or fitted params(1 or 2(vis = on)) ==>
Selecting option 0 will generate synthetic experimental data
(assuming normally distributed noise with the standard deviation
sR1, sR2, or sNOE), and for each set of generated data will
perform the fit using the same model of local motion as selected for
the real data. By default, 500 runs will be performed for each
residue, and the standard deviation will be displayed and included in
the ERR array and in the final report RESERR. If you select option
1, the program will determine the errors in the fitted parameters
using the constant χ2-boundaries method (32), which assumes that
the residuals of fit are distributed according to a χ2 distribution,
and therefore a deviation of the fitted parameters from the optimal
value by one standard deviation would result in a specific increase
in χ2 that depends on the number of fitting parameters (e.g.,
Δχ2 = 1, 2.3, or 3.53 for Npar = 1, 2, or 3, respectively). Thus the
program determines the confidence boundaries for the fitted
parameters by generating their values randomly and keeping only those
values that led to Δχ2 below the corresponding threshold. By default,
the simulation runs until 500 generated points fall into the defined Δχ2
region. This method is usually faster, except for those rare cases
when the errors in fitting parameters are extremely small. You can
visualize the confidence regions for selected parameters if you select
option 2. As the error estimation proceeds, the program will
output on the screen the results (standard deviation, SD) for every
residue and, after it is finished, will also update the results plot with
error bars. If you choose options 1 or 2 in the isotropic tumbling
mode, the program will present you with an option to vary TAUc
together with the other parameters such that the estimated errors
reflect the possible uncertainty in τc as well. However, since τc is
not a fitting parameter, this option should be used for evaluation
purposes only.
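The Δχ2 thresholds quoted for the constant χ2-boundaries method can be reproduced numerically. The sketch below assumes that they correspond to the 68.27% point of a χ2 distribution whose number of degrees of freedom equals the number of fitting parameters; this is the usual textbook interpretation and is not taken from the DYNAMICS code. It uses only the built-in gammaincinv function.

```matlab
p    = 0.6827;                 % confidence level of one standard deviation
npar = 1:3;                    % number of jointly varied fitting parameters

% The chi-square CDF with k degrees of freedom is gammainc(x/2, k/2),
% so the threshold is twice the inverse incomplete gamma function
dchi2 = 2 * gammaincinv(p, npar / 2);

for k = npar
    fprintf('Npar = %d: delta chi2 = %.2f\n', k, dchi2(k));
end
% prints 1.00, 2.30, and 3.53 for Npar = 1, 2, and 3
```

The same expression gives the threshold for models with more parameters (e.g., npar = 4 or 5 for the CL_tl, CL_ex, and CL_tx models), under the same assumption.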
After the program run is finished, you can save the results that
you want to keep by using Matlab’s save command, for example:
>> save results.mat RESERR NOMOD EXCL TAUCHI ANISO
This will save RESERR, NOMOD, and other parameters listed
in that command line to a Matlab file results.mat (which stores data
in a binary format). If you want to save your results in ASCII format
(to be easily opened by a text editor or any spreadsheet program),
type:
>> save results.dat RESERR -ascii
Type help save to see other saving options that Matlab
provides. See also the description of DYNAMICS’s automatic saving
feature in Subheading 3.5.
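Once the results are saved (or still in the workspace), individual quantities can be pulled out of RESERR by column, following the 16-column layout given in Table 3. In the sketch below the RESERR array is a hypothetical two-residue stand-in: the S2, TAUloc, and chi2 entries are rounded from the sample output above, while the error columns and the CSA value are invented for illustration.

```matlab
% Hypothetical two-row RESERR array laid out as in Table 3:
% [Res# tc dtc S2 dS2 tloc dtloc Rex dRex S2f dS2f tf dtf Index chi2 CSA]
RESERR = [ 49  3.33 0  0.783 0.012  0      0      0 0  1 0  0 0  0  0.613  -160 ;
           55  3.33 0  0.803 0.015  0.0165 0.004  0 0  1 0  0 0  1  0.487  -160 ];

resid = RESERR(:, 1);     % residue numbers
S2    = RESERR(:, 4);     % order parameters
dS2   = RESERR(:, 5);     % their estimated errors
model = RESERR(:, 14);    % model Index (see Table 1)

% Print a compact per-residue summary
for k = 1:numel(resid)
    fprintf('res %d: S2 = %.3f +/- %.3f (model index %d)\n', ...
            resid(k), S2(k), dS2(k), model(k));
end
```

The same indexing applies to an array restored with load results.mat, provided CSA was not among the fitting parameters (in which case additional columns are present).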
2.5.4. Understanding the Output
DYNAMICS outputs the results of the analysis on the screen, both
in numerical format and as plots, and stores them in several output
parameters/arrays, summarized in Table 3.
As discussed above, during each run the program outputs the
results of analysis for each residue. In addition to the obvious
items, such as the model and the actual values of the microdynamic
parameters, the user should also pay attention to the residuals of fit
(chi2 = χ2), which are represented by the last number in each row
(or the one before last if CSA is also a fitting parameter). The chi2
information is important, because it tells you how well the data fit
the model. Ideally, a good fit would give chi2 values of about 1 per
degree of freedom. Thus, chi2 numbers in the range of single
digits (≈ df) or lower indicate a reasonable fit, whereas much higher
chi2 values indicate a potential problem with data analysis for a
particular residue: either the “best”-fit model is not ideal for that
residue or perhaps the experimental errors are underestimated
(hence elevated chi2).
Table 3
Output parameters created by program DYNAMICS
Each entry lists the parameter, followed by its meaning and data format.

NOMOD
  List of residues in the NOMOD category (see Subheading 2.4), i.e., residues for which
  none of the models of local motion passed the goodness-of-fit test (χ2 too high),
  although at least one model provides a physically meaningful set of microdynamic
  parameters
EXCL
  List of residues that have been excluded by the program because none of the tested
  models of local motion are able to provide a physically meaningful set of
  microdynamic parameters (see Subheading 2.4)
RES
  The results of fit in the following format (array Nres x 10): [Residue# τc S2 τloc Rex Sfast2
  τfast model-Index χ2 CSA]
ERR
  The results of error analysis in the following format (array Nres x 7): [Residue# dτc dS2
  dτloc dRex dSfast2 dτfast]
RESERR
  Combined results of fit (RES) and error analysis (ERR) in the following format (array
  Nres x 16): [Residue# τc dτc S2 dS2 τloc dτloc Rex dRex Sfast2 dSfast2 τfast dτfast model-Index
  χ2 CSA]
TAUCHI
  Record of all evaluations performed during the current DYNAMICS run in the isotropic
  tumbling mode or monomer–dimer equilibrium (empty if anisotropic tumbling).
  Each line is a summary of statistics for all residues, in the following format: τc, χ2(mod),
  Nnomod, χ2(nomod), runs, df, χ2(total), χ2(total)/df, Nexcl
ANISO
  Record of all evaluations performed during the current DYNAMICS run in the
  anisotropic mode (empty matrix in the isotropic or monomer–dimer equilibrium). Each line
  is a summary of statistics for all residues, in the following format (for axially symmetric
  model): τc, Dz/Dx, β, α, χ2(mod), Nnomod, χ2(nomod), runs, df, χ2(total), χ2(total)/df,
  Nexcl. In the case of the fully anisotropic tumbling model, the format is
  τc, Dz/Dx, Dy/Dx, β, α, γ, χ2(mod), Nnomod, χ2(nomod), runs, df, χ2(total), χ2(total)/df, Nexcl
runs
  This parameter counts how many times the selected model switches between model-free
  and “extended” model-free models in adjacent residues along the protein sequence
df
  Total number of degrees of freedom, df = Ndat − Npar
chi2
  Residuals of fit, χ2:
  $$\chi^2 = \sum_{\mathrm{freq}} \left[ \left( \frac{R_1^{\mathrm{exp}} - R_1^{\mathrm{calc}}}{sR_1} \right)^2 + \left( \frac{R_2^{\mathrm{exp}} - R_2^{\mathrm{calc}}}{sR_2} \right)^2 + \left( \frac{\mathrm{NOE}^{\mathrm{exp}} - \mathrm{NOE}^{\mathrm{calc}}}{s\mathrm{NOE}} \right)^2 \right]$$
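The residuals of fit defined at the end of Table 3 are an error-weighted sum of squared differences between experimental and back-calculated values, taken over all observables and fields. A minimal sketch for one residue measured at two fields is given below; all numbers are hypothetical, and the back-calculated values stand in for what a model of local motion would predict.

```matlab
% Hypothetical experimental and back-calculated values for one residue
% at two fields; columns are [R1 R2 NOE]
expt = [ 2.10 4.60 0.72 ;     % field 1
         1.75 4.95 0.76 ];    % field 2
calc = [ 2.08 4.66 0.71 ;
         1.77 4.90 0.78 ];
err  = [ 0.04 0.09 0.03 ;     % sR1, sR2, sNOE
         0.04 0.10 0.03 ];

% Sum of squared, error-weighted residuals over observables and fields
chi2 = sum(sum(((expt - calc) ./ err).^2));

Npar = 2;                      % e.g., the LS_tl model
df   = numel(expt) - Npar;     % df = Ndat - Npar
fprintf('chi2 = %.3f, df = %d, chi2/df = %.3f\n', chi2, df, chi2 / df);
```

Here chi2/df is below 1, which by the rule of thumb given above would indicate a reasonable fit for this residue.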
In addition to results for each residue, after each run through
all residues is completed, the program outputs a summary of the
results of the current run, which looks like this:
NOMODEL 2 res.: 36 45 chi2 = 52.8946
TAUc = 3.33 MODEL chi2 = 107.0356 runs = 14 df = 46 TOTAL Chi2 = 159.9302 Chi2/df = 3.4767
The first line here is optional and appears only if there is at
least one residue that falls into the NOMOD category (see
Subheading 2.4): it lists those residues and their total chi2. The
second line reports the current τc value, the total chi2 value for
“MODEL” residues, i.e., those for which proper model selection
was obtained, and a summary of other statistics of the results
(see Table 3). The last entry in this line is the total chi2 divided by
the total number of degrees of freedom.
The program also summarizes similar statistics for all iterations
(various TAUc values) performed so far in a form shown below for
the isotropic tumbling model:
TAUc chi2mod nomod chi2nomod runs df total_chi2 totalCHI2/df nexcl
3.15 87.4701 5 72.5883 0 39 160.0584 4.1041 0
3.33 107.0356 2 52.8946 14 46 159.9302 3.4767 0 <--
3.45 100.6942 5 157.2367 22 42 257.9309 6.1412 0
Here mod and nomod refer to MODEL and NOMOD
categories, and nexcl is the number of residues in the EXCL category,
i.e., for which no model could be found; the rest of the parameters
are defined in Table 3. These data are also stored in array TAUCHI.
The horizontal arrow on the right indicates the entry with the
lowest total χ2/df.
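Because these rows are also stored in TAUCHI, the entry marked by the arrow can be located programmatically. The sketch below re-enters the three rows shown above by hand, in the column order given for TAUCHI in Table 3; the variable names best_ratio, irow, and best_tauc are illustrative.

```matlab
% The three summary rows shown above, in the TAUCHI column order of Table 3:
% [tc chi2(mod) Nnomod chi2(nomod) runs df chi2(total) chi2(total)/df Nexcl]
TAUCHI = [ 3.15   87.4701  5   72.5883   0  39  160.0584  4.1041  0 ;
           3.33  107.0356  2   52.8946  14  46  159.9302  3.4767  0 ;
           3.45  100.6942  5  157.2367  22  42  257.9309  6.1412  0 ];

% Column 8 holds chi2(total)/df; pick the row where it is smallest
[best_ratio, irow] = min(TAUCHI(:, 8));
best_tauc = TAUCHI(irow, 1);

fprintf('lowest chi2/df = %.4f at TAUc = %.2f ns\n', best_ratio, best_tauc);
% prints: lowest chi2/df = 3.4767 at TAUc = 3.33 ns
```

A low χ2/df alone is not the only criterion worth inspecting: the number of NOMOD residues (column 3) and runs (column 5) in the same row are also informative.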
For the anisotropic diffusion model, the summary table is slightly
different (because of the additional overall-tumbling-related
parameters) and looks like this (also saved in array ANISO, see
Table 3):
TAUc Dz/Dx beta alpha -> chi2mod nomod chi2nomod runs df total_chi2 totalCHI2/df nexcl
3.33 1.36 65 90 -> 76.8789 0 0 6 49 76.8789 1.569 0
3.33 1.36 70 90 -> 78.0965 0 0 4 50 78.0965 1.5619 0
For the fully anisotropic diffusion model, the summary table also
includes Dy/Dx and angle gamma.
2.6. Practical Examples
Examples of DYNAMICS graphics output, showing the
input 15N relaxation data for GB3 at 14.1 T (600 MHz) and the
resulting microdynamic parameters and models of local motion,
are presented in Fig. 3. The analysis was performed assuming
isotropic or anisotropic overall tumbling of the protein. Differences in
the microdynamic parameters and selected models of local
motion illustrate the need to use an adequate model for the
overall tumbling (anisotropic rotational diffusion in the case of GB3
(35)). The analysis of GB3 data measured at five fields (9.4, 11.7,
14.1, 16.4, and 18.8 T) assuming a uniform (fixed) 15N CSA or
including site-specific 15N CSA as a fitting parameter is illustrated
in Fig. 4.
3. Miscellaneous
Issues
3.1. General Notes 1. Make sure the input list of frequencies freq contains all the
on Using DYNAMICS pertinent 1H frequencies (magnetic fields) for the data that you
want to analyze. It is critical that the order in which the fre-
quencies are listed in the freq-list is coordinated with the sec-
ond “index” in the relaxation parameters names. For example,
if freq = [500, 600], then r11, r21, and r31 should be R1, R2,
and NOE data at 500 MHz, respectively, while r12, r22, and
r32 should be R1, R2, and NOE data at 600 MHz.
2. Small Rex values (<1 s−1) could be artifacts of model selection
(for example, due to an inadequate overall tumbling model, as
illustrated in Fig. 3) or a result of an elevated 15N CSA, rather
than real exchange contributions. Relaxation data at more than
one magnetic field (analyzed separately or together) are required
to verify that the model selection is consistent and that the
resulting Rex term has the expected field dependence (∝ B0²).
3. To fit CSA values, relaxation data at more than one magnetic
field are required.
4. When conformational exchange (Rex) is present, the CSA values
obtained from the analysis could be biased, since both the
Rex and the CSA contributions to relaxation rates have a similar
field dependence (∝ B0²).
5. Errors in the derived CSA values, as well as the associated
errors in microdynamic parameters from such analysis
(kcsa = 1), can currently be calculated only using the “experi-
mental data simulation” option.
6. In the output RES matrix, the CSA values are included as
column #10, following the χ² values. In the output RESERR
matrix the CSA values and errors in CSA are included as the
last two columns (#16, 17).
7. Even if you input a single TAUc value for the isotropic or
monomer–dimer equilibrium model, the program will run cal-
culations for this TAUc twice. This is done to compare the
results of at least two runs and, because DYNAMICS is fast,
does not take too much time.
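Notes 1 and 2 above can be illustrated with a short Python sketch (it is not DYNAMICS code). The first function spells out the naming convention that links the order of freq to the second index of the relaxation data names; the second rescales an exchange term between two fields under the assumption that Rex is proportional to B0². Both function names are hypothetical, and the naming helper is only unambiguous for up to nine fields.

```python
def relaxation_names(freq):
    """Map each 1H frequency (MHz) to the names of its R1, R2, and NOE data.

    The first digit encodes the observable (1 = R1, 2 = R2, 3 = NOE); the
    second digit is the 1-based position of the field in the freq list.
    """
    return {f: {"R1": f"r1{k}", "R2": f"r2{k}", "NOE": f"r3{k}"}
            for k, f in enumerate(freq, start=1)}


def scale_rex(rex, field_from, field_to):
    """Rescale an exchange term between two fields, assuming Rex ~ B0**2."""
    return rex * (field_to / field_from) ** 2


names = relaxation_names([500, 600])
print(names[500])                 # names expected for the 500 MHz data
print(names[600])                 # names expected for the 600 MHz data
print(scale_rex(1.0, 500, 600))   # Rex expected at 600 MHz if it is 1 s^-1 at 500 MHz
```

If an Rex value fitted at one field, rescaled in this way, disagrees strongly with the value fitted at another field, the term is more likely an artifact of model selection than a real exchange contribution.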
3.2. Strategy for Determining the Overall Rotational Correlation Time When the Structure is Unknown or RotDif Results are Unreliable
What can one do if protein coordinates are not available? One can
always use the isotropic overall tumbling model in DYNAMICS,
which does not require knowledge of protein structure. However,
one should be aware of the fact that the microdynamic parameters
derived from such analysis could be biased by the possible
oversimplification of the tumbling model used, particularly if the
protein shape turns out to be far from spherical.
While RotDif provides an efficient and reliable way to determine
the overall rotational diffusion tensor, it is also possible to
estimate the tensor (or at least some of its characteristics)
simultaneously with the DYNAMICS analysis. This feature was built
into DYNAMICS before RotDif became available, and it remains in the
program. Here is the recipe that I recommend if you want to
determine τc from 15N relaxation data at a single field.
1. Use only residues from structurally well-defined regions
(“core”) of the protein. Exclude any residues that are in highly
flexible/unstructured parts (e.g., loops and termini) of your
protein: they would likely require the “extended”-model-free
models. You can do this based on the structure of the protein,
or using only residues for which the R2/R1 ratio falls within
one standard deviation from the mean (the estimates are pro-
vided at the very beginning of each DYNAMICS run). You
also might want to exclude residues with noticeably high R2
values but “average” R1 values. It might be helpful to plot the
R2/R1 ratio for such an analysis. The rationale for these exclu-
sions is that at a single field there is no reliable F-statistics test
for models (CL_00, LS_tx) with df = 0, so you try to minimize
these contributions.
2. Start with the estimate of TAUc that DYNAMICS provides at
the very beginning of the run.
3. Run DYNAMICS once at this TAUc, and analyze the output
results for various residues. Identify residues that give very
high chi2, compared to the rest of residues. You will want to
exclude these residues during the next run. The rationale is
that these residues will dominate the total chi2, thus biasing
the optimization based on the residuals of fit. For example, if
the average chi2 level is in the single digits, it might be prudent
to exclude residues that gave chi2 above 30–50.
4. With the new set of residues, run DYNAMICS while varying
TAUc. The goal is to find the TAUc that minimizes the total chi2/df.
This optimization can be tricky because, when the input TAUc
deviates from the actual τc, the model selection procedure can
compensate by choosing a different model. For example, as TAUc
increases, there is a tendency toward the CL_00 model, for which
chi2 could be very low (recall that for this model df = 0 when using
data at a single field, so the F-statistics test does not work and
acceptance of the model is somewhat arbitrary). The same happens
when TAUc is lower than the actual τc: the model selection will
compensate by choosing the LS_ex or LS_tx models (recall that for
the latter df = 0, hence the F-statistics test does not work). This
would also result in a decrease in the total number of degrees
of freedom.
[Fig. 5: two panels, a and b. Lower plots: χ²/df (left axis) and df (right axis) versus TAUc, ns. Upper plots: number of residues assigned Rex models, the CL model, or NOMOD at each TAUc.]
Fig. 5. Illustration of the use of DYNAMICS to optimize τc simultaneously with model-free analysis of 15N relaxation data for
GB3. (a) The analysis assumed an anisotropic tumbling model with the diffusion tensor’s anisotropy and orientation as
specified in Subheading 2.3. (b) The analysis assumed an isotropic tumbling model. The 15N CSA was set to −174.2 ppm
(fixed) in both cases. Only secondary structure residues were included; in addition, residues D36 and Y45 were excluded
from this analysis because of high χ² values. The solid line in both panels shows χ²/df, while the dashed line depicts the total
number of degrees of freedom, df. The vertical bars in the top plots show the number of residues with Rex (blue) (LS_ex
and LS_tx models) or with the extended model (CL_00) (red) selected for each value of TAUc, as well as the number of
NOMOD residues (black). There were no NOMOD residues in the analysis in a. The RotDif analysis gives a τc value of
3.34 ± 0.14 ns (19) (or 3.37 ± 0.20 ns for these selected residues); a very similar value (3.36 ns), calculated from the mean
R2/R1 value, was reported at the beginning of the DYNAMICS run.
Therefore the name of the game here is to find a
TAUc value (or range) in the “middle” region, where the total
chi2 is at or close to its minimum and at the same time the
total df is at or close to its maximum. There might not be a
clear single TAUc value, but one should remember there is
always an uncertainty in τc even from RotDif analysis. Examples
of such analyses are shown in Fig. 5. In addition, to help deal
with an artificial selection of the CL_00 model, DYNAMICS
also uses another parameter, called runs (Table 3). The ratio-
nale behind it is to avoid an artificial situation when a single
residue shows large-amplitude motions (this is what the
extended model is usually needed for) while its neighbors do
not. To avoid this, runs counts how many times the selected
model switches between “LS” and “CL” models in adjacent
residues. Naturally, one wants to minimize the number of such
switches, because intuitively one would expect that large-
amplitude fluctuations involve several adjacent residues in the
polypeptide chain, and not just a single one.
One should also bear in mind that the total chi2/df (as well
as chi2) is generally not a smooth function because changes in
the selected model of local motion for individual residues
would result in abrupt changes in the residuals of fit and df.
5. After the TAUc value has been optimized, use this value to run
DYNAMICS again, this time for all residues in the protein.
You might need to do several iterations of such an analysis.
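Two of the bookkeeping steps in this recipe can be sketched in Python (this is not DYNAMICS code, and both function names are hypothetical). The first function keeps residues whose R2/R1 ratio lies within one standard deviation of the mean, as suggested in step 1; whether DYNAMICS uses the sample or the population standard deviation is not stated here, so the sample form is an assumption. The second function counts switches between the “LS” and “CL” model families in adjacent residues, which is the idea behind the runs parameter as described in step 4.

```python
from statistics import mean, stdev


def core_residues(r1, r2):
    """Indices of residues whose R2/R1 ratio is within one SD of the mean ratio."""
    ratios = [b / a for a, b in zip(r1, r2)]
    m, s = mean(ratios), stdev(ratios)   # sample SD: an assumption of this sketch
    return [i for i, q in enumerate(ratios) if abs(q - m) <= s]


def count_runs(models):
    """Count LS <-> CL switches between adjacent residues in a list of model names."""
    family = [name.split("_")[0] for name in models]
    return sum(1 for a, b in zip(family, family[1:]) if {a, b} == {"LS", "CL"})


# Made-up rates (s^-1) for five residues; the fourth has a noticeably high R2.
r1 = [2.0, 2.0, 2.0, 2.0, 2.0]
r2 = [5.0, 5.2, 4.8, 9.0, 5.0]
print(core_residues(r1, r2))

# Model names as they appear in the text; one isolated CL_00 gives two switches.
print(count_runs(["LS_ex", "LS_tx", "CL_00", "LS_ex", "LS_ex"]))
```

An isolated CL_00 assignment flanked by LS residues contributes two switches, which is exactly the situation the runs parameter is meant to flag.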
3.3. Strategy for Determining the Overall Rotational Diffusion Tensor when RotDif Results are Unreliable
The strategy here is similar to that described in the previous
section, except that (1) protein atom coordinates are required and
(2) you will need to vary several parameters at the same time. For
example, for the axially symmetric model, you might want to vary
Dz/Dx and the angles α and β. While doable, this optimization is
not straightforward, can require significant effort, and can
converge to a local rather than the global minimum. Thus, I would
recommend using it only when no other option is available.
3.4. Cleaning the Workspace
DYNAMICS generates and keeps most of the necessary intermediate
variables in the current Matlab workspace (computer memory). They
are automatically removed from the workspace when the program
terminates successfully. If the run was terminated prematurely,
either by the user (e.g., via CTRL/C) or because of a run-time
error, these variables will remain in the workspace. This might
interrupt normal program execution the next time you start it
during the same Matlab session. To ensure uninterrupted program
flow, it is recommended to remove the remaining intermediate
variables from the computer memory before restarting DYNAMICS.
Only the intermediate variables are removed by issuing the
following command:
>> dynclean
3.5. Automatic Saving of the Results
To prevent accidental loss of the computed data, the results (RES,
NOMOD, EXCL, TAUCHI, ANISO) are automatically saved to a Matlab
file after completion (and acceptance) of the model-free analysis
and again after error analysis (the same parameters as above
plus ERR and RESERR). To reduce the chance of overwriting this
file when you run DYNAMICS again, the name of the file contains
the current date followed by a random number from 0 to 99, e.g.,
dyn16jan2011_92.mat.
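The naming scheme can be mimicked with a few lines of Python (a sketch only; the actual Matlab code in DYNAMICS is not shown in this chapter). The function name result_filename is hypothetical, and the absence of zero padding for single-digit days is an assumption. Note that with only 100 possible suffixes, two runs on the same day can still produce the same name, so results worth keeping are best renamed.

```python
import random
from datetime import date

MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def result_filename(day, rng=random):
    """Build a name like dyn16jan2011_92.mat: the date plus a random integer 0-99."""
    stamp = f"{day.day}{MONTHS[day.month - 1]}{day.year}"
    return f"dyn{stamp}_{rng.randint(0, 99)}.mat"   # randint includes both ends


name = result_filename(date(2011, 1, 16))
print(name)
```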
3.6. Auxiliary Programs
The DYNAMICS package includes several programs designed to help the
user prepare data for running the calculations. Some of these
programs are briefly described below. Their use and the actual
command lines are not detailed here: the reader can find all
relevant information in the header of each program using any text
editor (e.g., Matlab editor).
3.6.1. pdb2nh
This program extracts the coordinates of backbone NH vectors from a
given protein atom coordinates file and normalizes these vectors
(to be used as input for DYNAMICS). If hydrogens cannot be found
(e.g., in a crystal structure), the program builds the amide
hydrogens from the coordinates of the heavy atoms C′, O, N, and Cα
in the corresponding peptide plane using conventional rules.
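The two operations described here can be sketched in Python as follows (this is not the pdb2nh code). The rule used to place a missing amide hydrogen is an assumption: the hydrogen is put in the peptide plane on the bisector of the C′(i−1)–N–Cα angle, pointing away from both heavy atoms, at an assumed N–H distance of 1.02 Å. The rules actually implemented in pdb2nh are not given in this chapter and may differ, for example by also using the carbonyl O. All function names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def nh_unit_vector(n, h):
    """Normalized N->H vector from the amide N and H coordinates."""
    return _unit(_sub(h, n))


def build_amide_h(c_prev, n, ca, bond=1.02):
    """Place H on the bisector of the C'(i-1)-N-CA angle, away from both atoms.

    This is a simplified, assumed rule, not necessarily the one used by pdb2nh.
    """
    away = tuple(p + q for p, q in zip(_unit(_sub(n, c_prev)), _unit(_sub(n, ca))))
    direction = _unit(away)
    return tuple(x + bond * d for x, d in zip(n, direction))


# Idealized planar geometry (coordinates in angstroms) with N at the origin.
n = (0.0, 0.0, 0.0)
c_prev = (-0.5 * 1.33, math.sqrt(3) / 2 * 1.33, 0.0)
ca = (-0.5 * 1.46, -math.sqrt(3) / 2 * 1.46, 0.0)

h = build_amide_h(c_prev, n, ca)
v = nh_unit_vector(n, h)
print(h, v)
```

With this symmetric test geometry the hydrogen lands on the x axis, so the resulting unit vector is easy to check by eye.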
3.6.2. Reldata, Reldatae
Given all pertinent parameters of the overall and local dynamics, as
well as the orientation of the NH vector (if necessary), the reldata
program computes the 15N relaxation parameters R1, R2, and NOE. The
input options also include the ability to add random noise to the
data. The program reldatae performs the same task as reldata but
also computes the longitudinal (ηz) and transverse (ηxy)
cross-correlation rates between the 1H–15N dipolar interaction and
the 15N CSA; see, e.g., refs. 19, 36.
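To give a feel for this kind of calculation, here is a Python sketch (it is not the reldata code) that evaluates R1, R2, and NOE for the simplest case: isotropic overall tumbling with the Lipari-Szabo spectral density (10, 11) and the standard expressions for 15N dipolar and CSA relaxation. The function names, the N–H distance of 1.02 Å, and the example values of S2 and the local correlation time are assumptions of this sketch; the CSA of −174.2 ppm is the value quoted in the caption of Fig. 5. Anisotropic tumbling, extended models, exchange, and noise, which reldata handles, are not included.

```python
import math

MU0_OVER_4PI = 1e-7            # T m / A
HBAR = 1.054571817e-34         # J s
GAMMA_H = 2.6752218744e8       # 1H gyromagnetic ratio, rad s^-1 T^-1
GAMMA_N = -2.7126e7            # 15N gyromagnetic ratio, rad s^-1 T^-1


def spectral_density(omega, tau_c, s2, tau_e):
    """Lipari-Szabo J(omega) for isotropic tumbling, including the 2/5 factor."""
    tau = 1.0 / (1.0 / tau_c + 1.0 / tau_e)
    return 0.4 * (s2 * tau_c / (1.0 + (omega * tau_c) ** 2)
                  + (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2))


def relaxation_rates(b0, tau_c, s2, tau_e, csa_ppm=-174.2, r_nh=1.02e-10):
    """Return (R1, R2, NOE) for a backbone 15N at field b0 (tesla); times in seconds."""
    w_h = GAMMA_H * b0
    w_n = abs(GAMMA_N) * b0
    d2 = (MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_N / r_nh ** 3) ** 2   # dipolar
    c2 = (w_n * csa_ppm * 1e-6) ** 2 / 3.0                            # CSA

    def j(w):
        return spectral_density(w, tau_c, s2, tau_e)

    r1 = d2 / 4 * (j(w_h - w_n) + 3 * j(w_n) + 6 * j(w_h + w_n)) + c2 * j(w_n)
    r2 = (d2 / 8 * (4 * j(0.0) + j(w_h - w_n) + 3 * j(w_n)
                    + 6 * j(w_h) + 6 * j(w_h + w_n))
          + c2 / 6 * (4 * j(0.0) + 3 * j(w_n)))
    noe = 1.0 + (GAMMA_H / GAMMA_N) * d2 / 4 * (6 * j(w_h + w_n) - j(w_h - w_n)) / r1
    return r1, r2, noe


# Example: 14.1 T, overall correlation time 3.33 ns, S2 = 0.85, local time 20 ps.
r1, r2, noe = relaxation_rates(14.1, 3.33e-9, 0.85, 20e-12)
print(r1, r2, noe)
```

Synthetic data generated in this way are useful for checking that an analysis recovers the parameters that went in.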
3.6.3. conv2temp
This program allows conversion between τc values at different
temperatures, by taking into account the temperature dependence of
water viscosity, see e.g., ref. 37.
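The idea can be sketched in Python as follows (this is not the conv2temp code). The sketch assumes that the correlation time scales as viscosity divided by absolute temperature (Stokes-Einstein-Debye behavior) and uses a common empirical fit for the viscosity of light water; the parameterization used by conv2temp itself may differ, and samples containing D2O are somewhat more viscous. Both function names are hypothetical.

```python
def water_viscosity(temp_k):
    """Approximate viscosity of liquid H2O in Pa s (empirical fit, assumed here)."""
    return 2.414e-5 * 10 ** (247.8 / (temp_k - 140.0))


def convert_tauc(tau_c, temp_from_k, temp_to_k):
    """Rescale a correlation time between temperatures, assuming tau_c ~ eta(T) / T."""
    scale_to = water_viscosity(temp_to_k) / temp_to_k
    scale_from = water_viscosity(temp_from_k) / temp_from_k
    return tau_c * scale_to / scale_from


# Example: a correlation time of 3.33 ns at 298.15 K converted to 308.15 K.
tau_warm = convert_tauc(3.33, 298.15, 308.15)
print(tau_warm)
```

Because both the viscosity and the 1/T factor decrease on heating, the converted correlation time is shorter at the higher temperature.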
3.6.4. Demo Scripts
The package includes several demo scripts designed to help the user
learn how to run DYNAMICS:
demo_iso.m       Isotropic overall tumbling
demo_ani.m       Anisotropic overall tumbling
demomdeq.m       Monomer–dimer equilibrium
demo_csa_iso.m   CSA fit + isotropic overall tumbling
demo_csa_ax.m    CSA fit + axially symmetric anisotropic overall tumbling
demo_csa_ani.m   CSA fit + fully anisotropic overall tumbling
All “demo_csa” scripts use data at five magnetic fields and include
CSA as a fitting parameter. Note that these scripts can be modified
to use site-specific CSA values as an external fixed parameter
(kcsa = −1) rather than as an adjustable parameter. All you need to
do is open the script in any text editor and uncomment (remove the %
from) the line kcsa = −1.
Each of these scripts starts by generating synthetic sets of
relaxation data using reldata.m and then runs DYNAMICS using these
data as input. Three additional text files, demo_iso.txt,
demo_ani.txt, and demomdeq.txt, contain copies of the screen outputs
and the dialog, to illustrate the main steps in data analysis using
DYNAMICS.
Acknowledgments
The development of the DYNAMICS program was supported by NIH grant
GM 065334. My work on this chapter has led to several modifications
of the program, which hopefully made it more user-friendly, and I
would like to thank the editors, Alex Shekhtman and David Burz, for
being so patient with me during this process.
References
1. Palmer, A. G., 3rd. (2004) NMR characterization of the dynamics of biomacromolecules. Chem. Rev. 104, 3623–3640.
2. Sheppard, D., Sprangers, R., and Tugarinov, V. (2010) Experimental approaches for NMR studies of side-chain dynamics in high-molecular-weight proteins. Prog. Nucl. Magn. Reson. Spectrosc. 56, 1–45.
3. Godoy-Ruiz, R., Guo, C., and Tugarinov, V. (2010) Alanine methyl groups as NMR probes of molecular structure and dynamics in high-molecular-weight proteins. J. Am. Chem. Soc. 132, 18340–18350.
4. Cavanagh, J., Fairbrother, W. J., Palmer, A. G., III, and Skelton, N. J. (1996) Protein NMR Spectroscopy, Academic Press, San Diego.
5. Fushman, D., and Cowburn, D. (1999) The effect of noncollinearity of 15N-1H dipolar and 15N CSA tensors and rotational anisotropy on 15N relaxation rates, CSA/DD cross correlation, and TROSY. J. Biomol. NMR 13, 139–147.
6. Fushman, D., Ohlenschlager, O., and Rüterjans, H. (1994) Determination of the backbone mobility of ribonuclease T1 and its 2’GMP complex using molecular dynamics simulations and NMR relaxation data. J. Biomol. Struct. Dyn. 11, 1377–1402.
7. Pfeiffer, S., Fushman, D., and Cowburn, D. (2001) Simulated and NMR derived backbone dynamics of a protein with significant flexibility: A comparison of spectral densities for the βARK PH domain. J. Am. Chem. Soc. 123, 3021–3036.
8. Maragakis, P., Lindorff-Larsen, K., Eastwood, M. P., Dror, R. O., Klepeis, J. L., Arkin, I. T., Jensen, M. O., Xu, H., Trbovic, N., Friesner, R. A., Palmer, A. G., and Shaw, D. E. (2008) Microsecond molecular dynamics simulation shows effect of slow loop dynamics on backbone amide order parameters of proteins. J. Phys. Chem. B 112, 6155–6158.
9. Fushman, D., and Cowburn, D. (2001) Nuclear magnetic resonance relaxation in determination of residue-specific 15N chemical shift tensors in proteins in solution: protein dynamics, structure, and applications of transverse relaxation optimized spectroscopy, in Methods in Enzymology (James, T., Schmitz, U., and Doetsch, V., Eds.), 339, 109–126.
10. Lipari, G., and Szabo, A. (1982) Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. 2. J. Am. Chem. Soc. 104, 4559–4570.
11. Lipari, G., and Szabo, A. (1982) Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. 1. Theory and range of validity. J. Am. Chem. Soc. 104, 4546–4559.
12. Clore, G. M., Szabo, A., Bax, A., Kay, L. E., Driscoll, P. C., and Gronenborn, A. M. (1990) Deviations from the simple two-parameter model-free approach to the interpretation of nitrogen-15 nuclear magnetic relaxation of proteins. J. Am. Chem. Soc. 112, 4989–4991.
13. Woessner, D. (1962) Nuclear spin relaxation in ellipsoids undergoing rotational Brownian motion. J. Chem. Phys. 37, 647–654.
14. Favro, D. L. (1960) Theory of the Rotational Brownian Motion of a Free Rigid Body. Phys. Rev. 119, 53–62.
15. Ryabov, Y. E., and Fushman, D. (2007) A Model of Interdomain Mobility in a Multidomain Protein. J. Am. Chem. Soc. 129, 3315–3327.
16. Tjandra, N., Feller, S. E., Pastor, R. W., and Bax, A. (1995) Rotational diffusion anisotropy of human ubiquitin from 15N NMR relaxation. J. Am. Chem. Soc. 117, 12562–12566.
17. Palmer, A. G., 3rd, Grey, M. J., and Wang, C. (2005) Solution NMR spin relaxation methods for characterizing chemical exchange in high-molecular-weight systems. Methods Enzymol. 394, 430–465.
18. Fushman, D., Cahill, S., and Cowburn, D. (1997) The main chain dynamics of the dynamin pleckstrin homology (PH) domain in solution: Analysis of 15N relaxation with monomer/dimer equilibration. J. Mol. Biol. 266, 173–194.
19. Hall, J. B., and Fushman, D. (2006) Variability of the 15N Chemical Shielding Tensors in the B3 Domain of Protein G from 15N Relaxation Measurements at Several Fields. Implications for Backbone Order Parameters. J. Am. Chem. Soc. 128, 7855–7870.
20. Fushman, D., and Cowburn, D. (1998) Studying protein dynamics with NMR relaxation, in Structure, Motion, Interaction and Expression of Biological Macromolecules (Sarma, R., and Sarma, M., Eds.), pp 63–77, Adenine Press, Albany, NY.
21. Blackledge, M., Cordier, F., Dosset, P., and Marion, D. (1998) Precision and uncertainty in the characterization of anisotropic rotational diffusion by 15N relaxation. J. Am. Chem. Soc. 120, 4538–4539.
22. Fushman, D., Xu, R., and Cowburn, D. (1999) Direct determination of changes of interdomain orientation on ligation: use of the orientational dependence of 15N NMR relaxation in Abl SH(32). Biochemistry 38, 10225–10230.
23. Fushman, D., Ghose, R., and Cowburn, D. (2000) The effect of finite sampling on the determination of orientational properties: A theoretical treatment with application to interatomic vectors in proteins. J. Am. Chem. Soc. 122, 10640–10649.
24. Dosset, P., Hus, J. C., Blackledge, M., and Marion, D. (2000) Efficient analysis of macromolecular rotational diffusion from heteronuclear relaxation data. J. Biomol. NMR 16, 23–28.
25. Ghose, R., Fushman, D., and Cowburn, D. (2001) Determination of the Rotational Diffusion Tensor of Macromolecules in Solution from NMR Relaxation Data with a Combination of Exact and Approximate Methods - Application to the Determination of Interdomain Orientation in Multidomain Proteins. J. Magn. Reson. 149, 214–217.
26. Walker, O., Varadan, R., and Fushman, D. (2004) Efficient and accurate determination of the overall rotational diffusion tensor of a molecule from 15N relaxation data using computer program ROTDIF. J. Magn. Reson. 168, 336–345.
27. Fushman, D., Varadan, R., Assfalg, M., and Walker, O. (2004) Determining domain orientation in macromolecules by using spin-relaxation and residual dipolar coupling measurements. Prog. NMR Spectros. 44, 189–214.
28. Hall, J. B., Walker, O., and Fushman, D. (2004) Characterization of the overall rotational diffusion of a protein from 15N relaxation measurements and hydrodynamic calculations, in Protein NMR techniques (Methods in Molecular Biology) (A. K. Downing, Ed.), pp 139–160, Humana Press Inc.
29. Fushman, D., and Cowburn, D. (2002) Characterization of Inter-Domain Orientations in Solution Using the NMR Relaxation Approach, in Protein NMR for the Millenium (Biological Magnetic Resonance Vol 20) (N. R. Krishna, L. B., Ed.), pp 53–78, Kluwer.
30. Fushman, D. (2002) Determination of protein dynamics using 15N relaxation measurements, in BioNMR in drug research (O. Zerbe, Ed.), pp 283–308, Wiley-VCH.
31. Mandel, A. M., Akke, M., and Palmer, A. G., III (1995) Backbone dynamics of E. coli Ribonuclease HI: correlations with structure and function in an active enzyme. J. Mol. Biol. 246, 144–163.
32. Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. (1992) Numerical Recipes in C, Cambridge University Press, NY.
33. Fushman, D., Weisemann, R., Thüring, H., and Rüterjans, H. (1994) Backbone dynamics of ribonuclease T1 and its complex with 2’GMP studied by two-dimensional heteronuclear NMR spectroscopy. J. Biomol. NMR 4, 61–78.
34. Kay, L. E., Torchia, D. A., and Bax, A. (1989) Backbone dynamics of proteins as studied by 15N inverse detected heteronuclear NMR spectroscopy: application to staphylococcal nuclease. Biochemistry 28, 8972–8979.
35. Hall, J. B., and Fushman, D. (2003) Characterization of the overall and local dynamics of a protein with intermediate rotational anisotropy: Differentiating between conformational exchange and anisotropic diffusion in the B3 domain of protein G. J. Biomol. NMR 27, 261–275.
36. Hall, J. B., and Fushman, D. (2003) Direct measurement of the transverse and longitudinal 15N chemical shift anisotropy-dipolar cross-correlation rate constants using 1H-coupled HSQC spectra. Mag. Res. in Chemistry 41, 837–842.
37. Ryabov, Y. E., Geraghty, C., Varshney, A., and Fushman, D. (2006) An efficient computational method for predicting rotational diffusion tensors of globular proteins using an ellipsoid representation. J. Am. Chem. Soc. 128, 15432–15444.
INDEX
A Baculovirus-mediated insect cells .......................... 38, 39, 43

BATCH protocol (strategy for resonance assignments)
Abelson Kinase domain (13C–15N labeled)......................... 43 ASCOM............................................ 409–410, 414–416
Acetamidase gene (amdS), 21 BEST... ...............................................409, 413, 414, 416
ADR. See Ambiguous distance restraints COBRA .............................410–412, 415–417, 419–427
Affinity/solubility tags HADAMAC .............. 411–413, 415–417, 419–425, 427
Flag............................................................................ 335 targeted-sampling ...................................................... 411
glutathione-S-transferase (GST) ............................... 335 B3 domain of protein G (GB3) .............................. 492, 494,
hexahistidine (His) .................................................... 335 498–500, 504, 507
maltose binding protein (MBP) ........................ 335–336 Boltzmann conformational distribution ................. 370, 374,
trpLE-tag .................................................................. 336 376, 385, 390, 400
AMBER force field ................................................. 380–381
Ambiguous distance restraints (ADR) ................... 444–448, C
454, 456, 459, 460, 462, 478
Calibration factor (C) ...................................................... 460
Ambiguous Restraints for Iterative Assignment
CCPN. See Collaborative Computing Project for the NMR
(ARIA) .......................................... 436, 453–480
Cdc37 (kinase-specific chaperone) .......................... 117–118
Amino acid specific labeling ............................................ 236 13
C-detected 1H, 13C correlation spectrum.............. 285–286
Amino acid type selective (AATS) isotope labeling .......... 59 13
C-detected 2H-DQ, 13C correlation ..................... 284, 285
Amplification of recombinant virus ............................. 45, 47
Cell-free membrane protein expression
APSY. See Automated projection spectroscopy
in detergent micelles ........... 87–90, 94–97, 100, 350–351
ARIA. See Ambiguous Restraints for Iterative Assignment
in liposomes ............................................................... 100
Automated backbone assignment .................... 433, 438–441
of MscL in liposomes ................................ 90–91, 98–99
Automated projection spectroscopy (APSY )
Cell free synthesis/expression ........ 71, 74–77, 81–82, 85–91,
5D APSY-CBCACONH ......................................... 433
94–100, 106
5D APSY-HACACONH ......................................... 433
Cell viability assay ........................................... 269, 272, 274
4D APSY-HACANH ............................................... 433
CELTONE® ............................................120, 123, 126, 364
Auto relaxation rates ........................................ 142, 146–151
Charge distribution
atomic.. .............................................................. 381–383
B
electron ...................................................................... 383
Backbone resonance assignment ...............286, 408, 413, 433 CHARMM ..............................................380, 381, 383, 393
BacMagic....................................................41–42, 44–47, 50 Chemical exchange
BacMams........................................................................... 38 fast exchange regime .................................................. 238
Bacmid............................................................................... 42 intermediate exchange regime ................................... 263
Bacterial over-expression slow exchange regime ................................ 220, 238, 255
apoE.... .................................................................... 9, 15 Chemical shift anisotropy (CSA) ....................142, 287–289,
calmodulin (CaM) ........................ 72–73, 78, 80, 83, 263 322, 323, 413, 487–489, 497–500,
MscL.................................. 86–91, 94–101, 104, 106, 107 502–505, 509
replication protein A (RPA) .............................. 185, 189 Chemical shift perturbation (CSP) ........................ 138, 212,
rhodopsin..................................................................... 59 216, 237–239, 246–250, 255, 307, 365
SrcCD-MBP fusion .................................. 115–118, 127 Chemical shifts .......................... 72, 142, 143, 169, 172–175,
trpLE-M2 ......................................................... 168, 171 198, 210, 220, 235, 238, 324, 361–362, 407–410,
Bacterial transformation ......................................................9 412, 416, 425–426, 435–447, 455–458, 464–465,
Baculovirus-mediated expression (BvE) 467, 469, 477, 486
system ............................................ 38–41, 43, 45 CHHC...... ...................................................................... 477
DOI 10.1007/978-1-61779-480-3, © Springer Science+Business Media, LLC 2012
513
PROTEIN NMR TECHNIQUES
514 Index
Chromatography n-dodecyl-β-D-maltopyranoside
affinity, Ni-NTA........................................................ 189 (DDM) .......................................... 345, 347, 348
desalting..................................................... 185, 190–191 dodecylphosphocholine (DPC) ..........339–341, 343, 345
heparin....................................................... 185, 189–191 n-octyl-β-D-glucopyranoside (β-OG) ...................... 345
ion exchange (IEC) ............................116, 118, 120, 124 Deuterated target proteins ................................................. 28
reverse phase ............................... 168, 172, 229, 336, 352 Dielectric shielding...........................376–378, 384, 397–398
size-exclusion (SEC) .................................118, 120, 121, 4,4-Dimethyl-4-silapentane-1-sulfonic acid
125, 189, 198, 210, 310, 313, 315, 317, 339 (DSS) ........................ 92, 100–101, 346, 351, 414
Circular dichroism (CD) spectroscopy .................... 340–341 Dimyristoylphosphatidylcholine (DMPC)
CLEAN chemical EXchange-phase modulator bilayer ............................................ 340, 345, 349
(CLEANEX-PM) ......................... 372, 378–380 Dipolar coupling
1
CMC. See Critical micelle concentration H, 1H dipolar coupling ..................................... 281, 487
Collaborative Computing Project for the NMR N–H dipolar coupling, d .................................... 288–289
(CCPN) .................................454–456, 465, 467, Dipole-dipole (DD) coupling.......................................... 142
469, 473–476, 478, 479 Discoidin domain of DDR2 .............................................. 28
Combined rotation and multiple pulse spectroscopy Dissociation constant ....................... 225, 239, 243, 496, 500
(CRAMPS) technique ................................... 281 DNA processing ...................................................... 181, 182
Confocal microscopy DNA template...................... 75, 83, 181, 198, 199, 201–203
HIV-1 CA assemblies ........ 305, 307, 310–313, 316–318 DOPC liposome preparation ................................ 91, 97, 98
Conformation distribution ..............................369, 370, 373, Double colony selection ................................2, 3, 5, 7–10, 12
374, 376, 385–388, 390, 392, 394, 396, 400 Double cross polarization (DCP) block .......................... 321
Conformer acidities .................. 385, 387–389, 393, 395, 399 1D proton experiment ..................................................... 144
Constraint combination........................................... 444, 448 DREAM sequence .......................................... 321, 326–327
COREX algorithm .......................................................... 371 3D ROCSA-NCA experiment ............................... 322–323
Correlation spectroscopy ..........................281–286, 304, 321 DYNAMICS protocol ............................................ 485–509
Co-transfection of insect cells ......................... 43, 44, 46–47
CP. See Cross polarization E
1
H–15N CP block ..................................................... 320, 321 Eigen acid... ..................................................... 375, 378, 388
Critical micelle concentration (CMC) ................... 339–341, Electronic polarizability........................................... 376–378
343, 345, 347–348 Electrophoretic mobility shift assay (EMSA).................. 224
Cross polarization (CP) ................................... 281, 304, 327 Episomal vectors.......................................................... 56–57
Cryo-SEM Eukaryotic protein kinase-2 (ERK2)....................... 359–367
HIV-1 CA assemblies ....................................... 318–319 Expression system
Crystallography and NMR system (CNS) bacteria, E. coli
software.................................................. 138, 455 BL21(DE3) ........................... 7, 74, 77, 79, 135, 168,
CSA. See Chemical shift anisotropy 171, 268, 273, 312, 314, 315, 319, 344, 346, 351
CSA/DD cross-correlated cross-relaxation rates ..... 153–157 BL21(DE3) pLysS .......................168, 171, 183, 184
15
N CSA shielding tensor ................................................ 290 Rosetta 2 (DE3) .......................................... 313, 316
CSP. See Chemical shift perturbation baculovirus
CYANA............................................432–433, 437, 444, 454 Autographa californica ............................................. 40
multicapsid nucleopolyhedrovirus
D
(AcMNPV) ............................................... 40, 42
DARR sequence ............................... 304, 306, 307, 321, 326 insect cells
DD coupling. See Dipole-dipole coupling Spodoptera frugiperda Sf9/Sf21 ......................... 43–49
DelPhi....... ...................................................................... 380 Trichoplusia ni BTI 5B1-4 (High Five™)
Detergent micelles cells .................................................................. 43
reconstitution of M2 into .................................. 169, 172 mammalian cells
Detergents (membrane proteins) human embryonic kidney 293
d38-DPC ........................................................... 343, 345 (HEK293) cells .........................39, 57–59, 61, 63
dihexanoyl-sn-glycerol-3-phosphocholine optimization .............................................................. 125
(DHPC) .........................167, 169, 172–174, 177 yeast, Kluyveromyces lactis (K. lactis)
dimyristoylphosphatidylcholine acetamidase gene (amdS) ....................................... 21
(DMPC) ........................................ 340, 345, 349 Lac4 promoter........................................................ 21
1,2-dioleoyl-sn-glycero-3-phosphocholine yeast, Pichia pastoris (P. pastoris)
(DOPC) ...................................60, 90, 91, 97, 98 AOX1 promoter/gene............................................. 20
Index
515
F
FAST-heteronuclear single quantum correlation (FAST-HSQC) ........ 247
Fast NMR methodologies
  concentric ring sampling ........ 266, 267
  hybrid back projection/lower value (HBLV) reconstruction algorithm ........ 267
  projection reconstruction NMR (PR-NMR) ........ 267
  radial sampling pattern ........ 266
  random sampling ........ 266, 267
  sparse sampling pattern ........ 266
Fed-batch fermentation ........ 30, 31
FlashBAC ........ 41–42, 45, 47, 48, 50
Floating chirality assignment ........ 464–465
FSLG-based 1H–15N HETCOR experiment ........ 324

G
Gp55-P, viral membrane protein from murine spleen focus-forming virus ........ 334, 340–342
GroEL
  pREP4-groELS electrocompetent BL21 (DE3) ........ 119, 121
GroES ........ 115–117, 125, 127
Growth culture media
  deuterated minimal medium ........ 184, 186, 188, 189, 193
  Luria Bertani (LB) medium ........ 119, 171, 314
  minimal medium (M9) ........ 3, 8, 9, 11, 12, 16, 115, 183–184, 188, 210, 268, 271, 310, 315
Gyromagnetic ratio ........ 143, 158, 280–281, 289–290

H
Hansenula polymorpha ........ 38
1H–13C-HMQC ........ 133–134, 136–138
1H–13C-HSQC ........ 173, 262, 270
Hemiascomycete yeast ........ 21
Heteronuclear spectroscopy ........ 235
High cell-density expression ........ 2, 5, 8, 10–16, 19
High cell-density induction ........ 2–5, 10–11, 14–16
1H–15N-HSQC ........ 77, 78, 220, 228, 230, 236, 237, 247–249, 262–264, 266, 270, 343, 363
1H 15N NOE ........ 151, 174
Human apolipoprotein A-I (apoAI) ........ 13, 15
Hydrogen exchange
  electrostatics of ........ 369–400
  kinetics of ........ 370, 374–378

I
In-cell NMR ........ 125–126, 261–275
Inducible stable cell line of HEK293 ........ 59, 63
INEPT ........ 150, 236, 237, 284, 413
Integral membrane proteins ........ 85
Intermolecular NOEs ........ 134, 198, 228–229
Intramolecular NOEs ........ 198, 240
In vitro synthesis ........ 86, 198, 201, 214
In vitro transcription ........ 198, 201, 214
Isolated Spin Pair Approximation (ISPA) ........ 460, 486
Isopropyl-β-D-thiogalactopyranoside (IPTG)
  induction ........ 2–5, 9–12, 14–16, 136
Isotope labeling
  double
    13C 15N labeling ........ 1, 15, 20–32, 39, 43, 58–60, 78, 80, 88, 91, 94, 98, 173, 174, 197–198, 202, 212, 229, 235, 262, 263, 305, 312, 316, 319, 327–328, 343, 360, 407–408, 413
  single
    13C labeling ........ 1, 21, 38, 59, 138, 145, 168, 236, 262, 294, 363
    2H labeling ........ 28, 38, 172
    methyl labeling ........ 133, 134, 136–139, 236, 262–263, 270, 282, 294
    15N labeling ........ 21, 52, 59, 76, 144, 168, 210, 236, 263, 324, 342, 363, 419, 485
    perdeuteration ........ 67
  triple
    2H 13C 15N labeling ........ 1, 16, 38, 72, 79, 120, 123, 144, 169–170, 172–175, 284, 363–365
ISPA. See Isolated Spin Pair Approximation

J
J-couplings ........ 235–236, 455, 469–470, 479

L
Ligand observed experiments ........ 239–245
Ligation independent cloning (LIC) vector ........ 336
Liposomes, reconstitution into ........ 94, 96–97
Longitudinal cross-correlated cross-relaxation rates ........ 143, 155–156
Longitudinal relaxation rate (T1) ........ 146–149, 152, 157–158, 292, 321

M
Magic angle spinning (MAS) ........ 93, 105, 279–294, 304–307, 309–311, 320–323, 325–327, 341, 342, 454, 455, 458, 475, 477, 480
  rotor ........ 280, 325, 350, 353
Magnetization transfer ........ 372, 378–379, 460
Maltose-binding protein (MBP)
  fusion protein ........ 115–118, 123–125, 337, 347, 352
MARS ........ 417, 424, 425
MAS. See Magic angle spinning
Mass spectrometry (MS)
  MALDI-TOF mass spectrometry ........ 338, 347, 349
Matlab ........ 310, 489, 490, 493, 494, 496, 502, 508
MBP. See Maltose-binding protein
Membrane proteins ........ 39, 56, 59, 65, 67, 85–108, 165–167, 170, 176, 280, 293–294, 333–334, 336, 337, 339, 341
  reconstitution ........ 104, 108
Membrane solubilization
  hexafluoroisopropanol (HFIP) ........ 353
Methylotrophic yeast ........ 19
Microcrystalline proteins ........ 279, 285, 287, 454
Microdynamic parameters
  order parameters (S2) ........ 488
  S2fast ........ 488, 492, 493, 503
  TAUC ........ 492, 495, 497, 498, 500–504
  τfast ........ 488, 492, 493, 503
  τloc ........ 487–489, 492, 493, 500, 503
  τslow ........ 488, 492
Mitogen-activated protein (MAP) kinase ........ 359, 364
Model-free analysis ........ 157–158, 292, 489, 497–498, 500, 507, 508
Models of local motion ........ 488, 492–494, 500, 503, 504, 508
Molecular dynamics simulated annealing (MDSA) protocol ........ 462, 464
MolProbity ........ 456, 466, 473–475
Monomer-dimer equilibrium ........ 490, 496, 500, 503, 505, 509
M2 proton channel ........ 167, 170
MS. See Mass spectrometry
Multiple field data ........ 489
Murine erythropoietin receptor (Epo receptor) ........ 334

N
NCA experiment
  3D DIPSHIFT-NCA experiment ........ 322–323
Network-anchored assignment/network anchoring (NA) ........ 436, 444, 447, 454, 459, 470, 479
NMR assignment of/spectra of
  SecA ........ 134, 137, 140
  STP
    prolyl oligopeptidase (POP)-baicalin (flavonoid) ........ 246, 250–256
    vascular endothelial growth factor (VEGF)-P-7i (peptidic ligand) ........ 245–251
NMRPipe ........ 144, 148, 149, 169, 172–173, 307–308, 414, 417, 419
NMR sample preparation
  dsDNA-Int ........ 224, 227
  solid-state NMR ........ 72, 93, 94, 97, 99, 105–106, 116, 310–312
NMRView ........ 255, 307–308, 413, 414, 417, 419–421, 425, 426, 432, 433, 476
Nuclear Overhauser effect (NOE) assignments ........ 137, 169, 172–174, 198, 362–363, 430, 431, 433–438, 443–447, 454, 459, 466
Nuclear Overhauser effect spectroscopy (NOESY)
  13C-edited NOESY ........ 173, 174, 177
  13C HMQC-NOESY-HMQC ........ 137
  13C-separated NOESY ........ 170–171
  3D [1H,1H]-NOESY-13C-HSQC ........ 433
  3D [1H,1H]-NOESY-15N-HSQC ........ 433
  2D proton NOESY ........ 209
  15N[1H 1H] nuclear Overhauser effect ........ 433, 435, 441, 443
  15N-separated NOESY ........ 170–171, 173
  relaxation-compensated CPMG NOESY experiment ........ 176
  transferred NOESY ........ 134, 138, 239, 240

O
Oligomeric state characterization ........ 100–101
Overall tumbling ........ 285, 287, 293, 488, 489, 491, 498, 499, 504, 505, 509

P
2H Pake tensor ........ 293
Paramagnetic relaxation enhancement (PRE) ........ 134, 137–138, 140, 220, 366
PDSD sequence. See Proton-driven spin diffusion sequence
Plasmids
  pACYCDuet ........ 118, 124, 126, 128
  pET15b ........ 182, 183, 185–188, 192, 193
  pET vectors ........ 7, 72, 182–188, 192, 193, 312, 316
  pIVEX2 ........ 72
  pKLAC ........ 21
  pMMHb ........ 170
  pPIC3.5K ........ 20
  pPIC9K ........ 20
  pPICZ ........ 20, 21
  pPICZα ........ 20, 21
  pTriEx ........ 46
  pTYB1 ........ 9
Poisson–Boltzmann ........ 369, 370, 376, 380, 382, 392, 396, 397
Polarized attenuated total reflection (ATR) Fourier transform infrared (FTIR) spectroscopy ........ 339–340
Polyhedra ........ 40
Polyhedrin gene (polh)/promoter ........ 40–41
PRE. See Paramagnetic relaxation enhancement
PROCHECK ........ 456, 466, 473
ProSa ........ 456, 466, 473
Protein
  dynamics ........ 141–161, 238, 322, 485–509
  flexibility ........ 369–400
  modularity ........ 181–194
  observed experiments ........ 235–239, 245
  purification/preparation of
    CAP-Gly/microtubule complexes ........ 314, 319–320
    detergent/protein micelles ........ 94–96, 100
    duplex DNA ........ 223, 224, 226
    HIV-1 CA assemblies ........ 305, 307, 310–313, 316–318
    replication protein A (RPA) ........ 181–194
    RNA ........ 197–216
    SrcCD ........ 115–118, 121–125
    SrcCD from MBP fusion ........ 115–117, 123–124, 127
    ssDNA ........ 225–226
    thioredoxin reassemblies ........ 305, 307, 309–312, 315–316
    T7 RNA polymerase ........ 74, 79–81
    trpLE-M2 ........ 168, 171
  tyrosine kinases ........ 111–114
Protein–DNA interactions ........ 262, 266
Protein–protein interactions ........ 112, 125, 262, 266, 270, 274, 303, 367
Protein–RNA interactions ........ 197–216
Proteolytic separation of affinity/solubility tags
  cyanogen bromide ........ 336
  enterokinase ........ 336
  factor Xa ........ 336
  thrombin ........ 336
  tobacco etch virus (TEV) protease ........ 336
Proton-driven spin diffusion (PDSD) sequence ........ 321
Pseudoproline ........ 335

Q
Quantification
  of lipids in proteoliposomes ........ 104
  of proteins in detergent ........ 103
  of proteins in proteoliposomes ........ 103–104
  of proteoliposome density ........ 104–105

R
RDC. See Residual dipolar coupling
Recombinant protein expression ........ 38, 40–41, 43–48, 335, 343
REDOR block ........ 323
REDOR-HETCOR ........ 305, 308, 309, 324–325
REDOR-PAINCP ........ 305, 308, 309, 324
REDOR-PDSD ........ 305, 308, 309, 324, 327
Relaxation
  13C relaxation ........ 137
  cross-correlated relaxation ........ 142, 143, 145, 146, 153–156, 290–292
  2H relaxation ........ 294
  15N CSA, 15N-1H dipole cross-correlated relaxation rate ........ 290–291
  15N relaxation ........ 141–161
Residual dipolar coupling (RDC) ........ 169, 171, 174–176, 373, 386, 388–390, 435, 455, 463, 470, 479
Restraint energy function
  flat-bottom harmonic-wall potential ........ 455, 464, 470
  log-harmonic potential ........ 454, 463, 464, 470, 475
Reverse phase HPLC ........ 172, 229, 336, 352
RNA synthesis ........ 198, 199, 201–205
RNA-templated RNA addition ........ 201
RN-type recoupling block ........ 322
ROCSA CSA recoupling block ........ 322
Rotamer ........ 173–174, 175–176, 384, 387, 395, 397–400
Rotational diffusion tensor ........ 506, 508
RotDif ........ 492, 498, 505–508
Rubredoxin (Pyrococcus furiosus) ........ 372

S
Saturation transfer difference (STD)
  amplification factor ........ 243–244, 257
  binding epitope ........ 234, 235, 241, 243, 245, 253–254, 256
  off-resonance ........ 241–242, 252, 253, 256, 327
  on-resonance ........ 152, 241–243, 251–253, 256, 414
Scalar coupling (J)
  2H 13C scalar coupling ........ 283
S30 cell extract (E. coli)
  preparation ........ 73, 77–79
SDS-PAGE
  protein expressed in liposomes ........ 100
  protein expressed in micelles ........ 100
Secretion of target proteins
  α-MF sequence ........ 20
  Saccharomyces cerevisiae α-mating factor (α-MF) ........ 20
Selective labeling ........ 236
Shigemi tube ........ 159, 254–255, 350, 351
Single nucleotide resolution ........ 204, 205, 215
Solid-state NMR ........ 85–108, 279–275, 279–294, 303–328, 334, 342, 349, 353, 452–480
Solid state peptide synthesis ........ 245, 334–335
Solvent
  accessibility ........ 137, 370–374, 376, 380–384, 390, 391, 443
  exchange ........ 379–380
  refinement ........ 465
Spectral density ........ 292, 487
Spectral density function, J(ω) ........ 142, 290, 292, 486–488, 491
Spin label
  ATP-sl-N3-ATP ........ 365
  nitroxide spin-label ........ 137–138, 365
  2,2,5,5 tetramethyl 3-pyrroline scaffold ........ 365–366
Src-family kinases ........ 111–128
Stable isotope labeling ........ 20, 43, 263, 268
Structure, determination/calculation of
  CPMGfit ........ 169, 176
  Pearson correlation coefficient ........ 175
  TALOS ........ 169, 173, 477
Structure ensemble ........ 456, 462–463, 465, 466, 471, 473
Subculturing
  adapting to deuterated medium ........ 33
Symmetry target function ........ 464
T
TEM. See Transmission electron microscopy
TOCSY ........ 208, 226, 363, 364, 379, 435
TPPM decoupling ........ 320, 327
Transfection ........ 46, 50, 51, 55–57, 59, 63
Transmembrane (TM) helix protein ........ 334
Transmission electron microscopy (TEM)
  CAP-Gly/microtubule assemblies ........ 317–318, 320
  HIV-1 CA assemblies ........ 317–318
Transverse cross-correlated cross-relaxation rates ........ 143, 155
Transverse relaxation optimized spectroscopy (TROSY)
  methyl-TROSY ........ 134, 137, 139
  15N-edited NOESY-TROSY ........ 362–364
  15N relaxation dispersion CPMG TROSY experiment ........ 176
  TROSY-HNCO ........ 360
Transverse relaxation rate (T2)
  Carr-Purcell-Meiboom-Gill (CPMG) echo train ........ 143, 147, 150–151, 157, 158
  single echo ........ 143, 147, 149–150, 157–159
Triple resonance NMR experiments
  HA(CA)NH ........ 267
  HNCA ........ 172–173, 267, 286, 361
  HNCACB ........ 172–173, 286, 361
  HNCACO ........ 286
  HNCO ........ 145, 172–173, 267, 360–361, 365, 366, 413, 416, 424
  HNCOCA ........ 286
  iHNCA ........ 411, 413, 416
  iHNCB ........ 413, 416
  iHNCO ........ 413, 416, 424
T7 RNA polymerase
  preparation of ........ 74, 79–81
TROSY. See Transverse relaxation optimized spectroscopy

U
Ubiquitin ........ 373, 374, 379, 380, 384–391
UNIO protocol
  ASCAN algorithm ........ 433–436, 441–443
  ATNOS algorithm ........ 435–438
  CANDID algorithm ........ 433–437, 443–446
  MATCH algorithm ........ 433, 435, 438–441

V
Violation analysis ........ 456, 461, 462, 466
Violation tolerance ........ 461, 463

W
WATERGATE ........ 247
Water-Ligand Observed via Gradient Spectroscopy (WaterLOGSY) ........ 240, 244
Water suppression ........ 159, 209, 242, 247, 251, 256, 281
WHAT IF ........ 456, 466, 472, 473, 475

X
Xplor-NIH ........ 138, 169, 175, 432, 433, 437, 444

Z
Zeocin ........ 20–22, 27, 29

