in Quantum Mechanics
With Applications to Schrödinger Operators
Gerald Teschl
Fakultät für Mathematik
Nordbergstraße 15
Universität Wien
1090 Wien, Austria
Email address: Gerald.Teschl@univie.ac.at
URL: http://www.mat.univie.ac.at/~gerald/
2000 Mathematics subject classification. 81-01, 81Qxx, 46-01
Abstract. This manuscript provides a self-contained introduction to mathematical methods in quantum mechanics (spectral theory) with applications to Schrödinger operators. The first part covers mathematical foundations of quantum mechanics from self-adjointness, the spectral theorem, quantum dynamics (including Stone's and the RAGE theorem) to perturbation theory for self-adjoint operators.
The second part starts with a detailed study of the free Schrödinger operator respectively position, momentum and angular momentum operators. Then we develop Weyl-Titchmarsh theory for Sturm-Liouville operators and apply it to spherically symmetric problems, in particular to the hydrogen atom. Next we investigate self-adjointness of atomic Schrödinger operators and their essential spectrum, in particular the HVZ theorem. Finally we have a look at scattering theory and prove asymptotic completeness in the short range case.
Keywords and phrases. Schrödinger operators, quantum mechanics, unbounded operators, spectral theory.
Typeset by AMS-LaTeX and Makeindex.
Version: August 14, 2005
Copyright © 1999-2005 by Gerald Teschl
Contents
Preface
Part 0. Preliminaries
Chapter 0. A first look at Banach and Hilbert spaces
0.1. Warm up: Metric and topological spaces
0.2. The Banach space of continuous functions
0.3. The geometry of Hilbert spaces
0.4. Completeness
0.5. Bounded operators
0.6. Lebesgue $L^p$ spaces
0.7. Appendix: The uniform boundedness principle
Part 1. Mathematical Foundations of Quantum Mechanics
Chapter 1. Hilbert spaces
1.1. Hilbert spaces
1.2. Orthonormal bases
1.3. The projection theorem and the Riesz lemma
1.4. Orthogonal sums and tensor products
1.5. The C
[…] $\bigl(\sum_{k=1}^{n} (x_k - y_k)^2\bigr)^{1/2}$ is a metric space and so is $\mathbb{C}^n$ together with $d(x, y) = \bigl(\sum_{k=1}^{n} |x_k - y_k|^2\bigr)^{1/2}$.
0. A first look at Banach and Hilbert spaces
The set
$$B_r(x) = \{ y \in X \mid d(x, y) < r \} \tag{0.1}$$
is called an open ball around $x$ with radius $r > 0$. A point $x$ of some set $U$ is called an interior point of $U$ if $U$ contains some ball around $x$. If $x$ is an interior point of $U$, then $U$ is also called a neighborhood of $x$. A point $x$ is called a limit point of $U$ if $B_r(x) \cap (U \setminus \{x\}) \neq \emptyset$ for every ball around $x$. Note that a limit point need not lie in $U$, but $U$ contains points arbitrarily close to $x$. Moreover, $x$ is not a limit point of $U$ if and only if it is an interior point of the complement of $U$. A set consisting only of interior points is called open.
The family of open sets $\mathcal{O}$ satisfies the following properties
(i) $\emptyset, X \in \mathcal{O}$
(ii) $O_1, O_2 \in \mathcal{O}$ implies $O_1 \cap O_2 \in \mathcal{O}$
(iii) $\{O_\alpha\} \subseteq \mathcal{O}$ implies $\bigcup_\alpha O_\alpha \in \mathcal{O}$
That is, $\mathcal{O}$ is closed under finite intersections and arbitrary unions.
In general, a space $X$ together with a family of sets $\mathcal{O}$, the open sets, satisfying (i)-(iii) is called a topological space. Every subspace $Y$ of a topological space $X$ becomes a topological space of its own if we call $O \subseteq Y$ open if there is some open set $\tilde{O} \subseteq X$ such that $O = \tilde{O} \cap Y$ (induced topology).
A family of open sets $\mathcal{B} \subseteq \mathcal{O}$ is called a base for the topology if for each $x$ and each neighborhood $U(x)$, there is some set $O \in \mathcal{B}$ with $x \in O \subseteq U(x)$. Since $O = \bigcup_{x \in O} U(x)$ we have
Lemma 0.1. If $\mathcal{B} \subseteq \mathcal{O}$ is a base for the topology, then every open set can be written as a union of elements from $\mathcal{B}$.
If there exists a countable base, then $X$ is called second countable.
Example. By construction the open balls $B_{1/n}(x)$ are a base for the topology in a metric space. In the case of $\mathbb{R}^n$ (or $\mathbb{C}^n$) it even suffices to take balls with rational center and hence $\mathbb{R}^n$ (and $\mathbb{C}^n$) are second countable.
A topological space is called a Hausdorff space if for two different points there are always two disjoint neighborhoods.
Example. Any metric space is a Hausdorff space: Given two different points $x$ and $y$ the balls $B_{d/2}(x)$ and $B_{d/2}(y)$, where $d = d(x, y) > 0$, are disjoint neighborhoods (a semi-metric space will not be Hausdorff).
Example. Note that different metrics can give rise to the same topology. For example, we can equip $\mathbb{R}^n$ (or $\mathbb{C}^n$) with the Euclidean distance as before, or we could also use
$$\tilde{d}(x, y) = \sum_{k=1}^{n} |x_k - y_k|. \tag{0.2}$$
Since
$$\frac{1}{\sqrt{n}} \sum_{k=1}^{n} |x_k| \le \sqrt{\sum_{k=1}^{n} |x_k|^2} \le \sum_{k=1}^{n} |x_k| \tag{0.3}$$
shows $B_{r/\sqrt{n}}(x) \subseteq \tilde{B}_r(x) \subseteq B_r(x)$, where $B$, $\tilde{B}$ are balls computed using $d$, $\tilde{d}$, respectively. Hence the topology is the same for both metrics.
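The comparison (0.3) between the two distances can be probed numerically. The following is only a minimal sketch: the function names `d_euclid`, `d_sum` and `check_comparison` are illustrative and not part of the text, and random sampling can of course support the inequality but never prove it.

```python
import math
import random

def d_euclid(x, y):
    """Euclidean distance d on R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_sum(x, y):
    """The alternative distance (0.2): sum of the coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def check_comparison(n, trials=1000, seed=0):
    """Check d_sum / sqrt(n) <= d_euclid <= d_sum on random pairs of points."""
    rng = random.Random(seed)
    tol = 1e-12  # allowance for floating point rounding
    for _ in range(trials):
        x = [rng.uniform(-1, 1) for _ in range(n)]
        y = [rng.uniform(-1, 1) for _ in range(n)]
        e, s = d_euclid(x, y), d_sum(x, y)
        if not (s / math.sqrt(n) <= e + tol and e <= s + tol):
            return False
    return True

print(check_comparison(5))
```

Both bounds are what give the inclusions of the balls: a small ball for one distance always contains a (possibly smaller) ball for the other.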
Example. We can always replace a metric $d$ by the bounded metric
$$\tilde{d}(x, y) = \frac{d(x, y)}{1 + d(x, y)} \tag{0.4}$$
without changing the topology.
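The complement of an open set is called a closed set. It follows from de Morgan's rules that the family of closed sets $\mathcal{C}$ satisfies
(i) $\emptyset, X \in \mathcal{C}$
(ii) $C_1, C_2 \in \mathcal{C}$ implies $C_1 \cup C_2 \in \mathcal{C}$
(iii) $\{C_\alpha\} \subseteq \mathcal{C}$ implies $\bigcap_\alpha C_\alpha \in \mathcal{C}$
That is, closed sets are closed under finite unions and arbitrary intersections.
The smallest closed set containing a given set $U$ is called the closure
$$\overline{U} = \bigcap_{C \in \mathcal{C},\, U \subseteq C} C, \tag{0.5}$$
and the largest open set contained in a given set $U$ is called the interior
$$U^\circ = \bigcup_{O \in \mathcal{O},\, O \subseteq U} O. \tag{0.6}$$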
It is straightforward to check that
Lemma 0.2. Let X be a metric space, then the interior of U is the set of
all interior points of U and the closure of U is the set of all limit points of
U.
A sequence $(x_n)_{n=1}^{\infty} \subseteq X$ is said to converge to some point $x \in X$ if $d(x, x_n) \to 0$. We write $\lim_{n\to\infty} x_n = x$ as usual in this case. Clearly the limit is unique if it exists (this is not true for a semi-metric).
Every convergent sequence is a Cauchy sequence, that is, for every $\varepsilon > 0$ there is some $N \in \mathbb{N}$ such that
$$d(x_n, x_m) \le \varepsilon, \qquad n, m \ge N. \tag{0.7}$$
If the converse is also true, that is, if every Cauchy sequence has a limit, then $X$ is called complete.
Example. Both $\mathbb{R}^n$ and $\mathbb{C}^n$ are complete metric spaces.
A point $x$ is clearly a limit point of $U$ if and only if there is some sequence $x_n \in U$ converging to $x$. Hence
Lemma 0.3. A closed subset of a complete metric space is again a complete metric space.
Note that convergence can also be equivalently formulated in topological terms: A sequence $x_n$ converges to $x$ if and only if for every neighborhood $U$ of $x$ there is some $N \in \mathbb{N}$ such that $x_n \in U$ for $n \ge N$. In a Hausdorff space the limit is unique.
A metric space is called separable if it contains a countable dense set. A set $U$ is called dense if its closure is all of $X$, that is, if $\overline{U} = X$.
Lemma 0.4. Let X be a separable metric space. Every subset of X is again
separable.
Proof. Let $A = \{x_n\}_{n \in \mathbb{N}}$ be a dense set in $X$ and let $Y \subseteq X$ be the given subset. The only problem is that $A \cap Y$ might contain no elements at all. However, some elements of $A$ must be at least arbitrarily close: Let $J \subseteq \mathbb{N}^2$ be the set of all pairs $(n, m)$ for which $B_{1/m}(x_n) \cap Y \neq \emptyset$ and choose some $y_{n,m} \in B_{1/m}(x_n) \cap Y$ for all $(n, m) \in J$. Then $B = \{y_{n,m}\}_{(n,m) \in J} \subseteq Y$ is countable. To see that $B$ is dense choose $y \in Y$. Then there is some sequence $x_{n_k}$ with $d(x_{n_k}, y) < 1/k$. Hence $(n_k, k) \in J$ and $d(y_{n_k,k}, y) \le d(y_{n_k,k}, x_{n_k}) + d(x_{n_k}, y) \le 2/k \to 0$.
A function $f : X \to Y$ between metric spaces $X$ and $Y$ is called continuous at a point $x \in X$ if for every $\varepsilon > 0$ we can find a $\delta > 0$ such that
$$d_Y(f(x), f(y)) \le \varepsilon \quad \text{if} \quad d_X(x, y) < \delta. \tag{0.8}$$
If $f$ is continuous at every point it is called continuous.
Lemma 0.5. Let $X$ be a metric space. The following are equivalent
(i) $f$ is continuous at $x$ (i.e., (0.8) holds).
(ii) $f(x_n) \to f(x)$ whenever $x_n \to x$
(iii) For every neighborhood $V$ of $f(x)$, $f^{-1}(V)$ is a neighborhood of $x$.
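Proof. (i) $\Rightarrow$ (ii) is obvious. (ii) $\Rightarrow$ (iii): If (iii) does not hold there is a neighborhood $V$ of $f(x)$ such that $B_\delta(x) \not\subseteq f^{-1}(V)$ for every $\delta$. Hence we can choose a sequence $x_n \in B_{1/n}(x)$ such that $x_n \notin f^{-1}(V)$, that is, $f(x_n) \notin V$. Thus $x_n \to x$ but $f(x_n) \not\to f(x)$. (iii) $\Rightarrow$ (i): Choose $V = B_\varepsilon(f(x))$ and observe that by (iii) $B_\delta(x) \subseteq f^{-1}(V)$ for some $\delta$.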
The last item implies that f is continuous if and only if the inverse image
of every open (closed) set is again open (closed).
Note: In a topological space, (iii) is used as the definition of continuity. However, in general (ii) and (iii) will no longer be equivalent unless one uses generalized sequences, so-called nets, where the index set $\mathbb{N}$ is replaced by arbitrary directed sets.
If $X$ and $Y$ are metric spaces then $X \times Y$ together with
$$d((x_1, y_1), (x_2, y_2)) = d_X(x_1, x_2) + d_Y(y_1, y_2) \tag{0.9}$$
is a metric space. A sequence $(x_n, y_n)$ converges to $(x, y)$ if and only if $x_n \to x$ and $y_n \to y$. In particular, the projections onto the first coordinate, $(x, y) \mapsto x$, respectively onto the second coordinate, $(x, y) \mapsto y$, are continuous.
In particular, by
$$|d(x_n, y_n) - d(x, y)| \le d(x_n, x) + d(y_n, y) \tag{0.10}$$
we see that $d : X \times X \to \mathbb{R}$ is continuous.
Example. If we consider $\mathbb{R} \times \mathbb{R}$ we do not get the Euclidean distance of $\mathbb{R}^2$ unless we modify (0.9) as follows:
$$\tilde{d}((x_1, y_1), (x_2, y_2)) = \sqrt{d_X(x_1, x_2)^2 + d_Y(y_1, y_2)^2}. \tag{0.11}$$
As noted in our previous example, the topology (and thus also convergence/continuity) is independent of this choice.
If $X$ and $Y$ are just topological spaces, the product topology is defined by calling $O \subseteq X \times Y$ open if for every point $(x, y) \in O$ there are open neighborhoods $U$ of $x$ and $V$ of $y$ such that $U \times V \subseteq O$. In the case of metric spaces this clearly agrees with the topology defined via the product metric (0.9).
A cover of a set $Y \subseteq X$ is a family of sets $\{U_\alpha\}$ such that $Y \subseteq \bigcup_\alpha U_\alpha$. A cover is called open if all $U_\alpha$ are open. A subset of $\{U_\alpha\}$ that still covers $Y$ is called a subcover.
A subset $K \subseteq X$ is called compact if every open cover has a finite subcover.
Lemma 0.6. A topological space is compact if and only if it has the finite intersection property: The intersection of a family of closed sets is empty if and only if the intersection of some finite subfamily is empty.
Proof. By taking complements, to every family of open sets there is a corresponding family of closed sets and vice versa. Moreover, the open sets are a cover if and only if the corresponding closed sets have empty intersection.
A subset $K \subseteq X$ is called sequentially compact if every sequence has a convergent subsequence.
Lemma 0.7. Let X be a topological space.
(i) The continuous image of a compact set is compact.
(ii) Every closed subset of a compact set is compact.
(iii) If $X$ is Hausdorff, any compact set is closed.
(iv) The product of compact sets is compact.
(v) A compact set is also sequentially compact.
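Proof. (i) Just observe that if $\{O_\alpha\}$ is an open cover for $f(Y)$, then $\{f^{-1}(O_\alpha)\}$ is one for $Y$.
(ii) Let $\{O_\alpha\}$ be an open cover for the closed subset $Y$. Then $\{O_\alpha\} \cup \{X \setminus Y\}$ is an open cover for $X$.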
(iii) Let $Y \subseteq X$ be compact. We show that $X \setminus Y$ is open. Fix $x \in X \setminus Y$ (if $Y = X$ there is nothing to do). By the definition of Hausdorff, for every $y \in Y$ there are disjoint neighborhoods $V(y)$ of $y$ and $U_y(x)$ of $x$. By compactness of $Y$, there are $y_1, \dots, y_n$ such that the $V(y_j)$ cover $Y$. But then $U(x) = \bigcap_{j=1}^{n} U_{y_j}(x)$ is a neighborhood of $x$ which does not intersect $Y$.
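(iv) Let $\{O_\alpha\}$ […] $\bigcup_{m=1}^{n-1} B_\varepsilon(x_m)$ such that $d(x_n, x_m) > \varepsilon$ for $m < n$.
In particular, we are done if we can show that for every open cover $\{O_\alpha\}$ there is some $\varepsilon > 0$ such that for every $x$ we have $B_\varepsilon(x) \subseteq O_\alpha$ for some $\alpha = \alpha(x)$. Indeed, choosing $\{x_k\}_{k=1}^{n}$ such that $B_\varepsilon(x_k)$ is a cover, we have that $O_{\alpha(x_k)}$ is a cover as well.
So it remains to show that there is such an $\varepsilon$. If there were none, for every $\varepsilon > 0$ there must be an $x$ such that $B_\varepsilon(x) \not\subseteq O_\alpha$ […] and hence $B_\varepsilon(x) \subseteq O_\alpha$. But choosing $n$ so large that $\frac{1}{n} < \frac{\varepsilon}{2}$ and $d(x_n, x) < \frac{\varepsilon}{2}$ we have $B_{1/n}(x_n) \subseteq B_\varepsilon(x) \subseteq O_\alpha$ […] $(x)$ such that $B_\varepsilon(x) \subseteq X \setminus C_2$. Since $U$ is compact, finitely many of them cover $C_1$ and we can choose the union of those balls to be $O$. Now replace $C_2$ by $X \setminus O$.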
Note that Urysohn's lemma implies that a metric space is normal, that is, for any two disjoint closed sets $C_1$ and $C_2$, there are disjoint open sets $O_1$ and $O_2$ such that $C_j \subseteq O_j$, $j = 1, 2$. In fact, choose $f$ as in Urysohn's lemma and set $O_1 = f^{-1}([0, 1/2))$ respectively $O_2 = f^{-1}((1/2, 1])$.
0.2. The Banach space of continuous functions
Now let us have a first look at Banach spaces by investigating the set of continuous functions $C(I)$ on a compact interval $I = [a, b] \subset \mathbb{R}$. Since we want to handle complex models, we will always consider complex valued functions!
One way of declaring a distance, well-known from calculus, is the maximum norm:
$$\|f - g\|_\infty = \max_{x \in I} |f(x) - g(x)|. \tag{0.14}$$
It is not hard to see that with this definition $C(I)$ becomes a normed linear space:
A normed linear space $X$ is a vector space $X$ over $\mathbb{C}$ (or $\mathbb{R}$) with a real-valued function (the norm) $\|.\|$ such that
• $\|f\| \ge 0$ for all $f \in X$ and $\|f\| = 0$ if and only if $f = 0$,
• $\|\lambda f\| = |\lambda| \, \|f\|$ for all $\lambda \in \mathbb{C}$ and $f \in X$, and
• $\|f + g\| \le \|f\| + \|g\|$ for all $f, g \in X$ (triangle inequality).
From the triangle inequality we also get the inverse triangle inequality (Problem 0.1)
$$\bigl| \|f\| - \|g\| \bigr| \le \|f - g\|. \tag{0.15}$$
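To get a feeling for the maximum norm, one can approximate it by sampling on a grid and test the triangle inequality and the inverse triangle inequality (0.15) on concrete functions. This is a rough sketch under stated assumptions: the helper `max_norm`, the grid size, the interval $[0, 1]$ and the two test functions are illustrative choices, and a sampled maximum is only an approximation of the true maximum.

```python
import math

def max_norm(f, a=0.0, b=1.0, samples=1001):
    """Approximate the maximum norm of f on [a, b] by sampling a uniform grid."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + k * step)) for k in range(samples))

f = lambda x: math.sin(2 * math.pi * x)
g = lambda x: complex(x, 1 - x)  # complex valued, as in the text

norm_f = max_norm(f)
norm_g = max_norm(g)
norm_sum = max_norm(lambda x: f(x) + g(x))
norm_diff = max_norm(lambda x: f(x) - g(x))

print(norm_sum <= norm_f + norm_g)          # triangle inequality
print(abs(norm_f - norm_g) <= norm_diff)    # inverse triangle inequality (0.15)
```

Note that the sampled maximum itself satisfies both inequalities (up to rounding), since it is a maximum of absolute values over a fixed finite set of points.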
Once we have a norm, we have a distance $d(f, g) = \|f - g\|$ and hence we know when a sequence of vectors $f_n$ converges to a vector $f$. We will write $f_n \to f$ or $\lim_{n\to\infty} f_n = f$, as usual, in this case. Moreover, a mapping $F : X \to Y$ between two normed spaces is called continuous if $f_n \to f$ implies $F(f_n) \to F(f)$. In fact, it is not hard to see that the norm, vector addition, and multiplication by scalars are continuous (Problem 0.2).
In addition to the concept of convergence we also have the concept of a Cauchy sequence and hence the concept of completeness: A normed space is called complete if every Cauchy sequence has a limit. A complete normed space is called a Banach space.
Example. The space $\ell^1(\mathbb{N})$ of all sequences $a = (a_j)_{j=1}^{\infty}$ for which the norm
$$\|a\|_1 = \sum_{j=1}^{\infty} |a_j| \tag{0.16}$$
is finite, is a Banach space.
To show this, we need to verify three things: (i) $\ell^1(\mathbb{N})$ is a vector space, that is, closed under addition and scalar multiplication, (ii) $\|.\|_1$ satisfies the three requirements for a norm, and (iii) $\ell^1(\mathbb{N})$ is complete.
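First of all observe
$$\sum_{j=1}^{k} |a_j + b_j| \le \sum_{j=1}^{k} |a_j| + \sum_{j=1}^{k} |b_j| \le \|a\|_1 + \|b\|_1 \tag{0.17}$$
for any finite $k$. Letting $k \to \infty$ we conclude that $\ell^1(\mathbb{N})$ is closed under addition and that the triangle inequality holds. That $\ell^1(\mathbb{N})$ is closed under scalar multiplication and the two other properties of a norm are straightforward. It remains to show that $\ell^1(\mathbb{N})$ is complete. Let $a^n = (a^n_j)_{j=1}^{\infty}$ be a Cauchy sequence, that is, for given $\varepsilon > 0$ we can find an $N_\varepsilon$ such that $\|a^m - a^n\|_1 \le \varepsilon$ for $m, n \ge N_\varepsilon$. In particular $|a^m_j - a^n_j| \le \varepsilon$ for every fixed $j$, so $a^n_j$ is a Cauchy sequence in $\mathbb{C}$ for fixed $j$ and has a limit $a_j = \lim_{n\to\infty} a^n_j$. Now consider
$$\sum_{j=1}^{k} |a^m_j - a^n_j| \le \varepsilon \tag{0.18}$$
and take $m \to \infty$:
$$\sum_{j=1}^{k} |a_j - a^n_j| \le \varepsilon. \tag{0.19}$$
Since this holds for any finite $k$ we even have $\|a - a^n\|_1 \le \varepsilon$. Hence $(a - a^n) \in \ell^1(\mathbb{N})$ and since $a^n \in \ell^1(\mathbb{N})$ we finally conclude $a = a^n + (a - a^n) \in \ell^1(\mathbb{N})$.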
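Example. The space $\ell^\infty(\mathbb{N})$ of all bounded sequences $a = (a_j)_{j=1}^{\infty}$ together with the norm
$$\|a\|_\infty = \sup_{j \in \mathbb{N}} |a_j| \tag{0.20}$$
is a Banach space (Problem 0.3).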
Now what about convergence in this space? A sequence of functions $f_n(x)$ converges to $f$ if and only if
$$\lim_{n\to\infty} \|f - f_n\| = \lim_{n\to\infty} \sup_{x \in I} |f_n(x) - f(x)| = 0. \tag{0.21}$$
That is, in the language of real analysis, $f_n$ converges uniformly to $f$. Now let us look at the case where $f_n$ is only a Cauchy sequence. Then $f_n(x)$ is clearly a Cauchy sequence of complex numbers for any fixed $x \in I$. In particular, by completeness of $\mathbb{C}$, there is a limit $f(x)$ for each $x$. Thus we get a limiting function $f(x)$. Moreover, letting $m \to \infty$ in
$$|f_m(x) - f_n(x)| \le \varepsilon \qquad \forall\, m, n > N_\varepsilon,\ x \in I \tag{0.22}$$
we see
$$|f(x) - f_n(x)| \le \varepsilon \qquad \forall\, n > N_\varepsilon,\ x \in I, \tag{0.23}$$
that is, $f_n(x)$ converges uniformly to $f(x)$. However, up to this point we don't know whether it is in our vector space $C(I)$ or not, that is, whether it is continuous or not. Fortunately, there is a well-known result from real analysis which tells us that the uniform limit of continuous functions is again continuous. Hence $f(x) \in C(I)$ and thus every Cauchy sequence in $C(I)$ converges. Or, in other words
Theorem 0.12. $C(I)$ with the maximum norm is a Banach space.
Next we want to know if there is a basis for $C(I)$. In order to have only countable sums, we would even prefer a countable basis. If such a basis exists, that is, if there is a set $\{u_n\} \subset X$ of linearly independent vectors such that every element $f \in X$ can be written as
$$f = \sum_{n} c_n u_n, \qquad c_n \in \mathbb{C}, \tag{0.24}$$
then the span $\mathrm{span}\{u_n\}$ (the set of all finite linear combinations) of $\{u_n\}$ is dense in $X$. A set whose span is dense is called total and if we have a total set, we also have a countable dense set (consider only linear combinations with rational coefficients; show this). A normed linear space containing a countable dense set is called separable.
Example. The Banach space $\ell^1(\mathbb{N})$ is separable. In fact, the set of vectors $\delta^n$, with $\delta^n_n = 1$ and $\delta^n_m = 0$, $n \neq m$, is total: Let $a \in \ell^1(\mathbb{N})$ be given and set $a^n = \sum_{k=1}^{n} a_k \delta^k$, then
$$\|a - a^n\|_1 = \sum_{j=n+1}^{\infty} |a_j| \to 0 \tag{0.25}$$
since $a^n_j = a_j$ for $1 \le j \le n$ and $a^n_j = 0$ for $j > n$.
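The tail estimate (0.25) is easy to watch numerically. The sketch below is an illustration only: the sequence $a_j = 1/j^2$ and the cutoff `N`, which stands in for the infinite sequence, are assumptions made here, and the helper names are hypothetical.

```python
def l1_norm(a):
    """l^1 norm of a finitely supported sequence given as a list."""
    return sum(abs(x) for x in a)

def truncate(a, n):
    """a^n = sum over k <= n of a_k delta^k: keep the first n entries, zero the rest."""
    return [x if j < n else 0.0 for j, x in enumerate(a)]

def tail_norm(a, n):
    """The distance ||a - a^n||_1, which equals the tail sum in (0.25)."""
    return l1_norm([x - y for x, y in zip(a, truncate(a, n))])

# a_j = 1/j^2 for j = 1..N; the cutoff N is an assumption replacing the infinite sequence
N = 10000
a = [1.0 / j ** 2 for j in range(1, N + 1)]

for n in (10, 100, 1000):
    print(n, tail_norm(a, n))
```

The printed distances shrink as $n$ grows, which is exactly the statement that finite linear combinations of the $\delta^k$ come arbitrarily close to $a$.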
Luckily this is also the case for $C(I)$:
Theorem 0.13 (Weierstraß). Let $I$ be a compact interval. Then the set of polynomials is dense in $C(I)$.
Proof. Let $f(x) \in C(I)$ be given. By considering $f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a)$ it is no loss to assume that $f$ vanishes at the boundary points. Moreover, without restriction we only consider $I = [-\frac{1}{2}, \frac{1}{2}]$ (why?).
Now the claim follows from the lemma below using
$$u_n(x) = \frac{1}{I_n} (1 - x^2)^n, \tag{0.26}$$
where
$$I_n = \int_{-1}^{1} (1 - x^2)^n \, dx = \frac{n!}{\frac{1}{2}(\frac{1}{2} + 1) \cdots (\frac{1}{2} + n)} = \sqrt{\pi}\, \frac{\Gamma(1 + n)}{\Gamma(\frac{3}{2} + n)} = \sqrt{\frac{\pi}{n}} \bigl(1 + O(\tfrac{1}{n})\bigr). \tag{0.27}$$
(Remark: The integral is known as Beta function and the asymptotics follow from Stirling's formula.)
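The different expressions for $I_n$ in (0.27) can be compared numerically. The following sketch is only a sanity check under stated assumptions: the function names and the step count of the midpoint rule are illustrative choices, not taken from the text.

```python
import math

def I_product(n):
    """n! / ((1/2)(1/2 + 1)...(1/2 + n)), the product formula in (0.27)."""
    value = float(math.factorial(n))
    for k in range(n + 1):
        value /= 0.5 + k
    return value

def I_gamma(n):
    """sqrt(pi) * Gamma(1 + n) / Gamma(3/2 + n), evaluated via log-gamma."""
    return math.sqrt(math.pi) * math.exp(math.lgamma(1 + n) - math.lgamma(1.5 + n))

def I_midpoint(n, steps=20000):
    """Midpoint-rule approximation of the integral of (1 - x^2)^n over [-1, 1]."""
    h = 2.0 / steps
    return h * sum((1 - (-1 + (k + 0.5) * h) ** 2) ** n for k in range(steps))

for n in (1, 5, 50):
    print(n, I_product(n), I_gamma(n), I_midpoint(n), math.sqrt(math.pi / n))
```

For small $n$ the last column is visibly off, which is consistent with the $O(1/n)$ correction in the asymptotic formula; the first three columns should agree up to numerical error.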
Lemma 0.14 (Smoothing). Let $u_n(x)$ be a sequence of nonnegative continuous functions on $[-1, 1]$ such that
$$\int_{|x| \le 1} u_n(x) \, dx = 1 \quad \text{and} \quad \int_{\delta \le |x| \le 1} u_n(x) \, dx \to 0, \quad \delta > 0. \tag{0.28}$$
(In other words, $u_n$ has mass one and concentrates near $x = 0$ as $n \to \infty$.)
Then for every $f \in C[-\frac{1}{2}, \frac{1}{2}]$ which vanishes at the endpoints, $f(-\frac{1}{2}) = f(\frac{1}{2}) = 0$, we have that
$$f_n(x) = \int_{-1/2}^{1/2} u_n(x - y) f(y) \, dy \tag{0.29}$$
converges uniformly to $f(x)$.
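Proof. Since $f$ is uniformly continuous, for given $\varepsilon$ we can find a $\delta$ (independent of $x$) such that $|f(x) - f(y)| \le \varepsilon$ whenever $|x - y| \le \delta$. Moreover, we can choose $n$ such that $\int_{\delta \le |y| \le 1} u_n(y) \, dy \le \varepsilon$. Now abbreviate $M = \max |f|$ and note
$$\Bigl| f(x) - \int_{-1/2}^{1/2} u_n(x - y) f(x) \, dy \Bigr| = |f(x)| \, \Bigl| 1 - \int_{-1/2}^{1/2} u_n(x - y) \, dy \Bigr| \le M \varepsilon. \tag{0.30}$$
In fact, either the distance of $x$ to one of the boundary points $\pm\frac{1}{2}$ is smaller than $\delta$ and hence $|f(x)| \le \varepsilon$, or otherwise the difference between one and the integral is smaller than $\varepsilon$.
Using this we have
$$\begin{aligned}
|f_n(x) - f(x)| &\le \int_{-1/2}^{1/2} u_n(x - y) |f(y) - f(x)| \, dy + M \varepsilon \\
&\le \int_{|y| \le 1/2,\, |x - y| \le \delta} u_n(x - y) |f(y) - f(x)| \, dy \\
&\quad + \int_{|y| \le 1/2,\, |x - y| \ge \delta} u_n(x - y) |f(y) - f(x)| \, dy + M \varepsilon \\
&\le \varepsilon + 2 M \varepsilon + M \varepsilon = (1 + 3M) \varepsilon,
\end{aligned} \tag{0.31}$$
which proves the claim.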
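The smoothing procedure (0.29) with the kernel (0.26) can be tried out on a concrete function. This is a minimal numerical sketch, not part of the proof: the test function $\cos(\pi x)$, the grid sizes and all helper names are assumptions made for illustration, and the integral is replaced by a midpoint rule.

```python
import math

def landau_kernel(n):
    """u_n(x) = (1 - x^2)^n / I_n on [-1, 1], with I_n from the Gamma formula (0.27)."""
    I_n = math.sqrt(math.pi) * math.exp(math.lgamma(1 + n) - math.lgamma(1.5 + n))
    return lambda x: (1 - x * x) ** n / I_n if abs(x) <= 1 else 0.0

def smooth(f, n, x, steps=400):
    """Midpoint-rule approximation of f_n(x), the integral of u_n(x - y) f(y) over [-1/2, 1/2]."""
    u = landau_kernel(n)
    h = 1.0 / steps
    return h * sum(u(x - y) * f(y) for y in (-0.5 + (k + 0.5) * h for k in range(steps)))

def sup_error(f, n, points=41):
    """Largest deviation |f_n(x) - f(x)| over a uniform grid in [-1/2, 1/2]."""
    grid = [-0.5 + k / (points - 1) for k in range(points)]
    return max(abs(smooth(f, n, x) - f(x)) for x in grid)

f = lambda x: math.cos(math.pi * x)  # continuous and zero at x = -1/2 and x = 1/2

errors = [sup_error(f, n) for n in (4, 16, 64)]
print(errors)
```

The sampled error decreases with $n$, though rather slowly; since each $f_n$ is a polynomial in $x$, this is the approximation by polynomials asserted in the Weierstraß theorem.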
Note that $f_n$ will be as smooth as $u_n$, hence the title smoothing lemma. The same idea is used to approximate noncontinuous functions by smooth ones (of course the convergence will no longer be uniform in this case).
Corollary 0.15. C(I) is separable.
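The same is true for $\ell^1(\mathbb{N})$, but not for […]
$$\begin{aligned}
\langle \alpha_1 f_1 + \alpha_2 f_2, g \rangle &= \alpha_1^* \langle f_1, g \rangle + \alpha_2^* \langle f_2, g \rangle \\
\langle f, \alpha_1 g_1 + \alpha_2 g_2 \rangle &= \alpha_1 \langle f, g_1 \rangle + \alpha_2 \langle f, g_2 \rangle
\end{aligned}, \qquad \alpha_1, \alpha_2 \in \mathbb{C}, \tag{0.32}$$
where $^*$ denotes complex conjugation. A skew linear form satisfying the requirements
(i) $\langle f, f \rangle > 0$ for $f \neq 0$ (positive definite)
(ii) $\langle f, g \rangle = \langle g, f \rangle^*$ (symmetry)
is called inner product or scalar product. Associated with every scalar product is a norm
$$\|f\| = \sqrt{\langle f, f \rangle}. \tag{0.33}$$
The pair $(H, \langle ., .. \rangle)$ is called inner product space. If $H$ is complete it is called a Hilbert space.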
Example. Clearly $\mathbb{C}^n$ with the usual scalar product
$$\langle a, b \rangle = \sum_{j=1}^{n} a_j^* b_j \tag{0.34}$$
is a (finite dimensional) Hilbert space.
Example. A somewhat more interesting example is the Hilbert space $\ell^2(\mathbb{N})$, that is, the set of all sequences
$$\Bigl\{ (a_j)_{j=1}^{\infty} \Bigm| \sum_{j=1}^{\infty} |a_j|^2 < \infty \Bigr\} \tag{0.35}$$
with scalar product
$$\langle a, b \rangle = \sum_{j=1}^{\infty} a_j^* b_j. \tag{0.36}$$
(Show that this is in fact a separable Hilbert space! Problem 0.5)
Of course I still owe you a proof for the claim that $\sqrt{\langle f, f \rangle}$ is indeed a norm. Only the triangle inequality is nontrivial, which will follow from the Cauchy-Schwarz inequality below.
A vector $f \in H$ is called normalized or unit vector if $\|f\| = 1$. Two vectors $f, g \in H$ are called orthogonal or perpendicular ($f \perp g$) if $\langle f, g \rangle = 0$ and parallel if one is a multiple of the other.
For two orthogonal vectors we have the Pythagorean theorem:
$$\|f + g\|^2 = \|f\|^2 + \|g\|^2, \qquad f \perp g, \tag{0.37}$$
which is one line of computation.
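Suppose $u$ is a unit vector, then the projection of $f$ in the direction of $u$ is given by
$$f_\parallel = \langle u, f \rangle u \tag{0.38}$$
and $f_\perp$ defined via
$$f_\perp = f - \langle u, f \rangle u \tag{0.39}$$
is perpendicular to $u$ since $\langle u, f_\perp \rangle = \langle u, f - \langle u, f \rangle u \rangle = \langle u, f \rangle - \langle u, f \rangle \langle u, u \rangle = 0$.
[Figure: the vector $f$, its projection $f_\parallel$ onto the direction of $u$, and the perpendicular component $f_\perp$.]
[…]
$$\ldots + (f_\parallel - \alpha u)\|^2 = \|f_\perp\|^2 + |\langle u, f \rangle - \alpha|^2 \tag{0.40}$$
and hence […].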
Note that the Cauchy-Schwarz inequality entails that the scalar product is continuous in both variables, that is, if $f_n \to f$ and $g_n \to g$ we have $\langle f_n, g_n \rangle \to \langle f, g \rangle$.
As another consequence we infer that the map $\|.\|$ is indeed a norm.
$$\|f + g\|^2 = \|f\|^2 + \langle f, g \rangle + \langle g, f \rangle + \|g\|^2 \le (\|f\| + \|g\|)^2. \tag{0.42}$$
But let us return to $C(I)$. Can we find a scalar product which has the maximum norm as associated norm? Unfortunately the answer is no! The reason is that the maximum norm does not satisfy the parallelogram law (Problem 0.7).
Theorem 0.17 (Jordan-von Neumann). A norm is associated with a scalar product if and only if the parallelogram law
$$\|f + g\|^2 + \|f - g\|^2 = 2\|f\|^2 + 2\|g\|^2 \tag{0.43}$$
holds.
In this case the scalar product can be recovered from its norm by virtue of the polarization identity
$$\langle f, g \rangle = \frac{1}{4} \left( \|f + g\|^2 - \|f - g\|^2 + \mathrm{i} \|f - \mathrm{i} g\|^2 - \mathrm{i} \|f + \mathrm{i} g\|^2 \right). \tag{0.44}$$
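Both the parallelogram law (0.43) and the polarization identity (0.44) can be checked on concrete vectors in $\mathbb{C}^n$. The sketch below assumes the scalar product is conjugate linear in its first argument, which is the convention under which (0.44) holds in this form; the vectors and helper names are illustrative.

```python
def inner(f, g):
    """Scalar product on C^n, conjugate linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def norm_sq(f):
    """Squared norm <f, f>, which is real."""
    return inner(f, f).real

def combo(f, g, c):
    """The vector f + c*g."""
    return [a + c * b for a, b in zip(f, g)]

def polarization(f, g):
    """Right-hand side of (0.44): recover <f, g> from norms alone."""
    return 0.25 * (norm_sq(combo(f, g, 1)) - norm_sq(combo(f, g, -1))
                   + 1j * norm_sq(combo(f, g, -1j)) - 1j * norm_sq(combo(f, g, 1j)))

f = [1 + 2j, -0.5j, 3.0 + 0j]
g = [2 - 1j, 1 + 1j, -1j]

lhs = norm_sq(combo(f, g, 1)) + norm_sq(combo(f, g, -1))   # left side of (0.43)
rhs = 2 * norm_sq(f) + 2 * norm_sq(g)                      # right side of (0.43)

print(abs(polarization(f, g) - inner(f, g)), abs(lhs - rhs))
```

With the opposite convention (conjugate linear in the second argument) the two imaginary terms in `polarization` would have to swap signs.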
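Proof. If an inner product space is given, verification of the parallelogram law and the polarization identity is straightforward (Problem 0.6).
To show the converse, we define
$$s(f, g) = \frac{1}{4} \left( \|f + g\|^2 - \|f - g\|^2 + \mathrm{i} \|f - \mathrm{i} g\|^2 - \mathrm{i} \|f + \mathrm{i} g\|^2 \right). \tag{0.45}$$
Then $s(f, f) = \|f\|^2$ and $s(f, g) = s(g, f)^*$ […]
$$\langle f, g \rangle = \int_a^b f^*(x) g(x) \, dx. \tag{0.47}$$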
The corresponding inner product space is denoted by $L^2_{cont}(I)$. Note that we have
$$\|f\| \le \sqrt{|b - a|} \, \|f\|_\infty \tag{0.48}$$
and hence the maximum norm is stronger than the $L^2_{cont}$ norm.
Suppose we have two norms $\|.\|_1$ and $\|.\|_2$ on a space $X$. Then $\|.\|_2$ is said to be stronger than $\|.\|_1$ if there is a constant $m > 0$ such that
$$\|f\|_1 \le m \|f\|_2. \tag{0.49}$$
It is straightforward to check that
Lemma 0.18. If $\|.\|_2$ is stronger than $\|.\|_1$, then any $\|.\|_2$ Cauchy sequence is also a $\|.\|_1$ Cauchy sequence.
Hence if a function $F : X \to Y$ is continuous in $(X, \|.\|_1)$ it is also continuous in $(X, \|.\|_2)$ and if a set is dense in $(X, \|.\|_2)$ it is also dense in $(X, \|.\|_1)$.
In particular, $L^2_{cont}$ is separable. But is it also complete? Unfortunately the answer is no:
Example. Take $I = [0, 2]$ and define
$$f_n(x) = \begin{cases} 0, & 0 \le x \le 1 - \frac{1}{n} \\ 1 + n(x - 1), & 1 - \frac{1}{n} \le x \le 1 \\ 1, & 1 \le x \le 2 \end{cases} \tag{0.50}$$
then $f_n(x)$ is a Cauchy sequence in $L^2_{cont}$, but there is no limit in $L^2_{cont}$! Clearly the limit should be the step function which is 0 for $0 \le x < 1$ and 1 for $1 \le x \le 2$, but this step function is discontinuous (Problem 0.8)!
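The behaviour of the sequence (0.50) in the two norms can be observed numerically. This is only an illustrative sketch with assumed helper names and grid sizes; it compares $f_n$ with $f_{2n}$, for which a direct computation gives an $L^2$ distance of $1/\sqrt{12n}$ and a maximum distance of $1/2$.

```python
import math

def f(n, x):
    """The function f_n from (0.50) on I = [0, 2]."""
    if x <= 1 - 1.0 / n:
        return 0.0
    if x <= 1:
        return 1 + n * (x - 1)
    return 1.0

def l2_distance(n, m, steps=20000):
    """Midpoint-rule approximation of the L^2 norm of f_n - f_m on [0, 2]."""
    h = 2.0 / steps
    total = sum((f(n, (k + 0.5) * h) - f(m, (k + 0.5) * h)) ** 2 for k in range(steps))
    return math.sqrt(h * total)

def max_distance(n, m, steps=20000):
    """Sampled maximum norm of f_n - f_m on [0, 2]."""
    h = 2.0 / steps
    return max(abs(f(n, (k + 0.5) * h) - f(m, (k + 0.5) * h)) for k in range(steps))

for n in (10, 100):
    print(n, l2_distance(n, 2 * n), max_distance(n, 2 * n))
```

The $L^2$ distance tends to zero while the maximum distance stays near $1/2$, so the sequence is Cauchy in the weaker norm but not in the maximum norm.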
This shows that in infinite dimensional spaces different norms will give rise to different convergent sequences! In fact, the key to solving problems in infinite dimensional spaces is often finding the right norm! This is something which cannot happen in the finite dimensional case.
Theorem 0.19. If $X$ is a finite dimensional space, then all norms are equivalent. That is, for given two norms $\|.\|_1$ and $\|.\|_2$ there are constants $m_1$ and $m_2$ such that
$$\frac{1}{m_2} \|f\|_1 \le \|f\|_2 \le m_1 \|f\|_1. \tag{0.51}$$
Proof. Clearly we can choose a basis $u_j$, $1 \le j \le n$, and assume that $\|.\|_2$ is the usual Euclidean norm, $\|\sum_j \alpha_j u_j\|_2^2 = \sum_j |\alpha_j|^2$. Let $f = \sum_j \alpha_j u_j$, then by the triangle and Cauchy-Schwarz inequalities
$$\|f\|_1 \le \sum_j |\alpha_j| \, \|u_j\|_1 \le \sqrt{\sum_j \|u_j\|_1^2} \; \|f\|_2 \tag{0.52}$$
and we can choose $m_2 = \sqrt{\sum_j \|u_j\|_1^2}$.
In particular, if $f_n$ is convergent with respect to $\|.\|_2$ it is also convergent with respect to $\|.\|_1$. Thus $\|.\|_1$ is continuous with respect to $\|.\|_2$ and attains its minimum $m > 0$ on the unit sphere (which is compact by the Heine-Borel theorem). Now choose $m_1 = 1/m$.
Problem 0.5. Show that $\ell^2(\mathbb{N})$ is a separable Hilbert space.
Problem 0.6. Let $s(f, g)$ be a skew linear form and $p(f) = s(f, f)$ the associated quadratic form. Prove the parallelogram law
$$p(f + g) + p(f - g) = 2p(f) + 2p(g) \tag{0.53}$$
and the polarization identity
$$s(f, g) = \frac{1}{4} \left( p(f + g) - p(f - g) + \mathrm{i}\, p(f - \mathrm{i} g) - \mathrm{i}\, p(f + \mathrm{i} g) \right). \tag{0.54}$$
Problem 0.7. Show that the maximum norm (on $C[0, 1]$) does not satisfy the parallelogram law.
Problem 0.8. Prove the claims made about $f_n$, defined in (0.50), in the last example.
0.4. Completeness
Since $L^2_{cont}$ is not complete, how can we obtain a Hilbert space out of it? Well, the answer is simple: take the completion.
If $X$ is a (incomplete) normed space, consider the set of all Cauchy sequences $\tilde{X}$. Call two Cauchy sequences equivalent if their difference converges to zero and denote by $\bar{X}$ the set of all equivalence classes. It is easy to see that $\bar{X}$ (and $\tilde{X}$) inherit the vector space structure from $X$. Moreover,
Lemma 0.20. If $x_n$ is a Cauchy sequence, then $\|x_n\|$ converges.
Consequently the norm of a Cauchy sequence $(x_n)_{n=1}^{\infty}$ can be defined by $\|(x_n)_{n=1}^{\infty}\| = \lim_{n\to\infty} \|x_n\|$ and is independent of the equivalence class (show this!). Thus $\bar{X}$ is a normed space ($\tilde{X}$ is not! why?).
Theorem 0.21. $\bar{X}$ is a Banach space containing $X$ as a dense subspace if we identify $x \in X$ with the equivalence class of all sequences converging to $x$.
Proof. (Outline) It remains to show that $\bar{X}$ is complete. Let $\xi_n = [(x_{n,j})_{j=1}^{\infty}]$ be a Cauchy sequence in $\bar{X}$. Then it is not hard to see that $\xi = [(x_{j,j})_{j=1}^{\infty}]$ is its limit.
Let me remark that the completion $\bar{X}$ is unique. More precisely, any other complete space which contains $X$ as a dense subset is isomorphic to $\bar{X}$. This can for example be seen by showing that the identity map on $X$ has a unique extension to $\bar{X}$ (compare Theorem 0.24 below).
In particular it is no restriction to assume that a normed linear space or an inner product space is complete. However, in the important case of $L^2_{cont}$ it is somewhat inconvenient to work with equivalence classes of Cauchy sequences and hence we will give a different characterization using the Lebesgue integral later.
0.5. Bounded operators
A linear map $A$ between two normed spaces $X$ and $Y$ will be called a (linear) operator
$$A : D(A) \subseteq X \to Y. \tag{0.55}$$
The linear subspace $D(A)$ on which $A$ is defined is called the domain of $A$ and is usually required to be dense. The kernel
$$\mathrm{Ker}(A) = \{ f \in D(A) \mid Af = 0 \} \tag{0.56}$$
and range
$$\mathrm{Ran}(A) = \{ Af \mid f \in D(A) \} = A D(A) \tag{0.57}$$
are defined as usual. The operator $A$ is called bounded if the following operator norm
$$\|A\| = \sup_{\|f\|_X = 1} \|Af\|_Y \tag{0.58}$$
is finite.
The set of all bounded linear operators from $X$ to $Y$ is denoted by $L(X, Y)$. If $X = Y$ we write $L(X, X) = L(X)$.
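In finite dimensions the operator norm (0.58) can be computed explicitly for some norms. The sketch below makes an assumption not taken from the text: it equips $\mathbb{R}^n$ and $\mathbb{R}^m$ with the maximum norm, for which the operator norm of a matrix is its largest absolute row sum. The matrix and helper names are illustrative, and random sampling only yields a lower bound.

```python
import random

def max_norm(v):
    """Maximum norm of a vector."""
    return max(abs(x) for x in v)

def apply(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def row_sum_norm(A):
    """Largest absolute row sum: the operator norm of A for the maximum norm."""
    return max(sum(abs(a) for a in row) for row in A)

def sampled_norm(A, trials=2000, seed=1):
    """Lower bound for the operator norm (0.58) obtained from random unit vectors."""
    rng = random.Random(seed)
    n = len(A[0])
    best = 0.0
    for _ in range(trials):
        v = [rng.uniform(-1, 1) for _ in range(n)]
        scale = max_norm(v)
        if scale == 0:
            continue
        best = max(best, max_norm(apply(A, [x / scale for x in v])))
    return best

A = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0]]

# a unit vector carrying the sign pattern of the row with the largest absolute row sum
v_opt = [0.0, 1.0, -1.0]

print(row_sum_norm(A), max_norm(apply(A, v_opt)), sampled_norm(A))
```

The vector `v_opt` attains the supremum, while the sampled value stays below it; in infinite dimensions the supremum in (0.58) need not be attained at all.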
Theorem 0.22. The space $L(X, Y)$ together with the operator norm (0.58) is a normed space. It is a Banach space if $Y$ is.
Proof. That (0.58) is indeed a norm is straightforward. If $Y$ is complete and $A_n$ is a Cauchy sequence of operators, then $A_n f$ converges to an element $g$ for every $f$. Define a new operator $A$ via $Af = g$. By continuity of the vector operations, $A$ is linear and by continuity of the norm $\|Af\| = \lim_{n\to\infty} \|A_n f\| \le (\lim_{n\to\infty} \|A_n\|) \|f\|$ it is bounded. Furthermore, given $\varepsilon > 0$ there is some $N$ such that $\|A_n - A_m\| \le \varepsilon$ for $n, m \ge N$ and thus $\|A_n f - A_m f\| \le \varepsilon \|f\|$. Taking the limit $m \to \infty$ we see $\|A_n f - Af\| \le \varepsilon \|f\|$, that is, $A_n \to A$.
By construction, a bounded operator is Lipschitz continuous
$$\|Af\|_Y \le \|A\| \, \|f\|_X \tag{0.59}$$
and hence continuous. The converse is also true
Theorem 0.23. An operator A is bounded if and only if it is continuous.
Proof. Suppose $A$ is continuous but not bounded. Then there is a sequence of unit vectors $u_n$ such that $\|A u_n\| \ge n$. Then $f_n = \frac{1}{n} u_n$ converges to 0 but $\|A f_n\| \ge 1$ does not converge to 0.
Moreover, if $A$ is bounded and densely defined, it is no restriction to assume that it is defined on all of $X$.
Theorem 0.24. Let $A \in L(X, Y)$ and let $Y$ be a Banach space. If $D(A)$ is dense, there is a unique (continuous) extension of $A$ to $X$, which has the same norm.
Proof. Since a bounded operator maps Cauchy sequences to Cauchy sequences, this extension can only be given by
$$Af = \lim_{n\to\infty} A f_n, \qquad f_n \in D(A), \quad f \in X, \tag{0.60}$$
where $f_n \to f$. To show that this definition is independent of the sequence $f_n \to f$, let $g_n \to f$ be a second sequence and observe
$$\|A f_n - A g_n\| = \|A(f_n - g_n)\| \le \|A\| \, \|f_n - g_n\| \to 0. \tag{0.61}$$
From continuity of vector addition and scalar multiplication it follows that our extension is linear. Finally, from continuity of the norm we conclude that the norm does not increase.
An operator in $L(X, \mathbb{C})$ is called a bounded linear functional and the space $X^* = L(X, \mathbb{C})$ is called the dual space of $X$.
The Banach space of bounded linear operators $L(X)$ even has a multiplication given by composition. Clearly this multiplication satisfies
$$(A + B)C = AC + BC, \qquad A(B + C) = AB + AC, \qquad A, B, C \in L(X) \tag{0.62}$$
and
$$(AB)C = A(BC), \qquad \alpha (AB) = (\alpha A)B = A(\alpha B), \qquad \alpha \in \mathbb{C}. \tag{0.63}$$
Moreover, it is easy to see that we have
$$\|AB\| \le \|A\| \, \|B\|. \tag{0.64}$$
However, note that our multiplication is not commutative (unless $X$ is one dimensional). We even have an identity, the identity operator $I$ satisfying $\|I\| = 1$.
A Banach space together with a multiplication satisfying the above requirements is called a Banach algebra. In particular, note that (0.64) ensures that multiplication is continuous.
Problem 0.9. Show that the integral operator
$$(Kf)(x) = \int_0^1 K(x, y) f(y) \, dy, \tag{0.65}$$
where $K(x, y) \in C([0, 1] \times [0, 1])$, defined on $D(K) = C[0, 1]$ is a bounded operator both in $X = C[0, 1]$ (max norm) and $X = L^2_{cont}(0, 1)$.
Problem 0.10. Show that the differential operator $A = \frac{d}{dx}$ defined on $D(A) = C^1[0, 1] \subset C[0, 1]$ is an unbounded operator.
Problem 0.11. Show that $\|AB\| \le \|A\| \, \|B\|$ for every $A, B \in L(X)$.
Problem 0.12. Show that the multiplication in a Banach algebra $X$ is continuous: $x_n \to x$ and $y_n \to y$ implies $x_n y_n \to xy$.
0.6. Lebesgue $L^p$ spaces
We fix some measure space $(X, \Sigma, \mu)$ and define the $L^p$ norm by
$$\|f\|_p = \left( \int_X |f|^p \, d\mu \right)^{1/p}, \qquad 1 \le p \tag{0.66}$$
and denote by $\mathcal{L}^p(X, d\mu)$ the set of all complex valued measurable functions for which $\|f\|_p$ is finite. First of all note that $\mathcal{L}^p(X, d\mu)$ is a linear space, since $|f + g|^p \le 2^p \max(|f|, |g|)^p = 2^p \max(|f|^p, |g|^p) \le 2^p (|f|^p + |g|^p)$. Of course our hope is that $\mathcal{L}^p(X, d\mu)$ is a Banach space. However, there is a small technical problem (recall that a property is said to hold almost everywhere if the set where it fails to hold is contained in a set of measure zero):
Lemma 0.25. Let $f$ be measurable, then
$$\int_X |f|^p \, d\mu = 0 \tag{0.67}$$
if and only if $f(x) = 0$ almost everywhere with respect to $\mu$.
Proof. Observe that we have $A = \{x \mid f(x) \neq 0\} = \bigcup_n A_n$, where $A_n = \{x \mid |f(x)| > \frac{1}{n}\}$. If $\int |f|^p \, d\mu = 0$ we must have $\mu(A_n) = 0$ for every $n$ and hence $\mu(A) = \lim_{n\to\infty} \mu(A_n) = 0$. The converse is obvious.
Note that the proof also shows that if $f$ is not 0 almost everywhere, there is an $\varepsilon > 0$ such that $\mu(\{x \mid |f(x)| \ge \varepsilon\}) > 0$.
Example. Let $\lambda$ be the Lebesgue measure on $\mathbb{R}$. Then the characteristic function of the rationals $\chi_{\mathbb{Q}}$ is zero a.e. (with respect to $\lambda$). Let $\Theta$ be the Dirac measure centered at 0, then $f(x) = 0$ a.e. (with respect to $\Theta$) if and only if $f(0) = 0$.
Thus $\|f\|_p = 0$ only implies $f(x) = 0$ for almost every $x$, but not for all! Hence $\|.\|_p$ is not a norm on $\mathcal{L}^p(X, d\mu)$. The way out of this misery is to identify functions which are equal almost everywhere: Let
$$\mathcal{N}(X, d\mu) = \{ f \mid f(x) = 0 \ \mu\text{-almost everywhere} \}. \tag{0.68}$$
Then $\mathcal{N}(X, d\mu)$ is a linear subspace of $\mathcal{L}^p(X, d\mu)$ and we can consider the quotient space
$$L^p(X, d\mu) = \mathcal{L}^p(X, d\mu) / \mathcal{N}(X, d\mu). \tag{0.69}$$
If $d\mu$ is the Lebesgue measure on $X \subseteq \mathbb{R}^n$ we simply write $L^p(X)$. Observe that $\|f\|_p$ is well defined on $L^p(X, d\mu)$.
Even though the elements of $L^p(X, d\mu)$ are strictly speaking equivalence classes of functions, we will still call them functions for notational convenience. However, note that for $f \in L^p(X, d\mu)$ the value $f(x)$ is not well defined (unless there is a continuous representative and different continuous functions are in different equivalence classes, e.g., in the case of Lebesgue measure).
With this modification we are back in business since $L^p(X, d\mu)$ turns out to be a Banach space. We will show this in the following sections.
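But before that let us also define $L^\infty$ […]
$$a^{1/p} b^{1/q} \le \frac{1}{p} a + \frac{1}{q} b, \qquad a, b \ge 0, \tag{0.74}$$
with $a = |f|^p$ and $b = |g|^q$ and integrating over $X$ gives
$$\int_X |f g| \, d\mu \le \frac{1}{p} \int_X |f|^p \, d\mu + \frac{1}{q} \int_X |g|^q \, d\mu = 1 \tag{0.75}$$
and finishes the proof.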
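The elementary inequality (0.74) underlying this argument can be spot-checked numerically. The sketch below is illustrative only: it assumes $1 < p < \infty$ with the conjugate exponent $q = p/(p-1)$, and the helper name `young_gap` and the sampling ranges are arbitrary choices.

```python
import random

def young_gap(a, b, p):
    """(1/p) a + (1/q) b - a^(1/p) b^(1/q) with 1/p + 1/q = 1; nonnegative by (0.74)."""
    q = p / (p - 1)
    return a / p + b / q - a ** (1 / p) * b ** (1 / q)

rng = random.Random(0)
gaps = [young_gap(rng.uniform(0, 10), rng.uniform(0, 10), rng.uniform(1.1, 5))
        for _ in range(1000)]

print(min(gaps))
```

The gap vanishes for $a = b$, which is the case of equality in (0.74).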
As a consequence we also get
Theorem 0.27 (Minkowski's inequality). Let $f, g \in L^p(X, d\mu)$, then
$$\|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{0.76}$$
Proof. Since the cases $p = 1, \infty$ are straightforward, we only consider $1 < p < \infty$. Using $|f + g|^p \le |f| \, |f + g|^{p-1} + |g| \, |f + g|^{p-1}$ we obtain from Hölder's inequality (note $(p - 1)q = p$)
$$\|f + g\|_p^p \le \|f\|_p \, \|(f + g)^{p-1}\|_q + \|g\|_p \, \|(f + g)^{p-1}\|_q = (\|f\|_p + \|g\|_p) \, \|f + g\|_p^{p-1}. \tag{0.77}$$
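For the counting measure on a finite set, the $L^p$ norm is just $(\sum_k |f_k|^p)^{1/p}$ and Minkowski's inequality (0.76) can be tested on random vectors. This is a sketch under that assumption; the helper names, vector length and range of exponents are illustrative.

```python
import random

def p_norm(f, p):
    """L^p norm for counting measure on a finite set: (sum |f_k|^p)^(1/p)."""
    return sum(abs(x) ** p for x in f) ** (1.0 / p)

def minkowski_holds(f, g, p, tol=1e-12):
    """Check ||f + g||_p <= ||f||_p + ||g||_p, cf. (0.76)."""
    total = [x + y for x, y in zip(f, g)]
    return p_norm(total, p) <= p_norm(f, p) + p_norm(g, p) + tol

rng = random.Random(0)
ok = True
for _ in range(500):
    f = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(6)]
    g = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(6)]
    ok = ok and minkowski_holds(f, g, rng.uniform(1, 6))

print(ok)
print(minkowski_holds([1.0, 0.0], [0.0, 1.0], 0.5))  # the exponent 1/2 is outside the range p >= 1
```

The second check shows why the restriction $p \ge 1$ matters: for $p = 1/2$ the same expression violates the triangle inequality on the two unit vectors.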
[…]
$$G(x) = \sum_{k=1}^{\infty} |g_k(x)| \tag{0.79}$$
is in $L^p$. This follows from
$$\Bigl\| \sum_{k=1}^{n} |g_k| \Bigr\|_p \le \sum_{k=1}^{n} \|g_k(x)\|_p \le \|f_1\|_p + \frac{1}{2} \tag{0.80}$$
using the monotone convergence theorem. In particular, $G(x) < \infty$ almost everywhere and the sum
$$\sum_{n=1}^{\infty} g_n(x) = \lim_{n\to\infty} f_n(x) \tag{0.81}$$
is absolutely convergent for those $x$. Now let $f(x)$ be this limit. Since $|f(x) - f_n(x)|^p$ converges to zero almost everywhere and $|f(x) - f_n(x)|^p \le 2^p G(x)^p \in L^1$, dominated convergence shows $\|f - f_n\|_p \to 0$.
In particular, in the proof of the last theorem we have seen:

Corollary 0.29. If $\|f_n - f\|_p \to 0$ then there is a subsequence which converges pointwise almost everywhere.

It even turns out that $L^p$ is separable.

Lemma 0.30. Suppose $X$ is a second countable topological space (i.e., it has a countable basis) and $\mu$ is a regular Borel measure. Then $L^p(X, d\mu)$, $1 \le p < \infty$, is separable.
Proof. The set of all characteristic functions $\chi_A(x)$ with $A \in \Sigma$ and $\mu(A) < \infty$ is total by construction of the integral. Now our strategy is as follows: Using outer regularity we can restrict $A$ to open sets, and using the existence of a countable base we can restrict $A$ to open sets from this base.

Fix $A$. By outer regularity, there is a decreasing sequence of open sets $O_n$ such that $\mu(O_n) \to \mu(A)$. Since $\mu(A) < \infty$ it is no restriction to assume $\mu(O_n) < \infty$, and thus $\mu(O_n \setminus A) = \mu(O_n) - \mu(A) \to 0$. Now dominated convergence implies $\|\chi_A - \chi_{O_n}\|_p \to 0$. Thus the set of all characteristic functions $\chi_O(x)$ with $O$ open and $\mu(O) < \infty$ is total. Finally let $\mathcal{B}$ be a countable basis for the topology. Then every open set $O$ can be written as $O = \bigcup_{j=1}^\infty \tilde{O}_j$ with $\tilde{O}_j \in \mathcal{B}$. Moreover, by considering the set of all finite unions of elements from $\mathcal{B}$ it is no restriction to assume $\bigcup_{j=1}^n \tilde{O}_j \in \mathcal{B}$. Hence there is an increasing sequence $\tilde{O}_n \nearrow O$ with $\tilde{O}_n \in \mathcal{B}$. By monotone convergence, $\|\chi_O - \chi_{\tilde{O}_n}\|_p \to 0$ and hence the set of all characteristic functions $\chi_{\tilde{O}}$ with $\tilde{O} \in \mathcal{B}$ is total.
To finish this chapter, let us show that continuous functions are dense in $L^p$.

Theorem 0.31. Let $X$ be a locally compact metric space and let $\mu$ be a $\sigma$-finite regular Borel measure. Then the set $C_c(X)$ of continuous functions with compact support is dense in $L^p(X, d\mu)$, $1 \le p < \infty$.
Proof. As in the previous proof the set of all characteristic functions $\chi_K(x)$ with $K$ compact is total (using inner regularity). Hence it suffices to show that $\chi_K(x)$ can be approximated by continuous functions. By outer regularity there is an open set $O \supset K$ such that $\mu(O \setminus K) \le \varepsilon$. By Urysohn's lemma (Lemma 0.11) there is a continuous function $f_\varepsilon$ which is one on $K$ and $0$ outside $O$. Since
\[ \int_X |\chi_K - f_\varepsilon|^p\, d\mu = \int_{O \setminus K} |f_\varepsilon|^p\, d\mu \le \mu(O \setminus K) \le \varepsilon \tag{0.82} \]
we have $\|f_\varepsilon - \chi_K\|_p \to 0$ and we are done.
If $X$ is some subset of $\mathbb{R}^n$ we can do even better.

A nonnegative function $u \in C_c^\infty(\mathbb{R}^n)$ is called a mollifier if
\[ \int_{\mathbb{R}^n} u(x)\, dx = 1. \tag{0.83} \]
The standard mollifier is $u(x) = \exp\big(\frac{1}{|x|^2 - 1}\big)$ for $|x| < 1$ and $u(x) = 0$ else (up to the normalization constant required by (0.83)).

If we scale a mollifier according to $u_k(x) = k^n u(k\,x)$ such that its mass is preserved ($\|u_k\|_1 = 1$) and it concentrates more and more around the origin

[Figure: graph of $u_k$ concentrating around the origin]

we have the following result (Problem 0.17):
Lemma 0.32. Let $u$ be a mollifier in $\mathbb{R}^n$ and set $u_k(x) = k^n u(k\,x)$. Then for any (uniformly) continuous function $f : \mathbb{R}^n \to \mathbb{C}$ we have that
\[ f_k(x) = \int_{\mathbb{R}^n} u_k(x - y) f(y)\, dy \tag{0.84} \]
is in $C^\infty(\mathbb{R}^n)$ and converges to $f$ (uniformly).
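The convergence in Lemma 0.32 can be observed numerically. The following is only a rough one-dimensional sketch, not part of the text's argument: the helper names `bump` and `mollify`, the midpoint rule, and the number of nodes `m` are all assumptions made for illustration. The standard mollifier is normalized discretely, so the quadrature weights sum to one and the error at a point $x$ is bounded by the modulus of continuity of $f$ on $[x - 1/k, x + 1/k]$.

```python
import math

def bump(s):
    """Unnormalized standard mollifier exp(1/(s^2 - 1)) on (-1, 1), zero elsewhere."""
    return math.exp(1.0 / (s * s - 1.0)) if abs(s) < 1.0 else 0.0

def mollify(f, x, k, m=2000):
    """Midpoint-rule approximation of f_k(x) = int u_k(x - y) f(y) dy for n = 1.

    After the substitution y = x - s/k the integral becomes int u(s) f(x - s/k) ds
    over (-1, 1); dividing by the sum of the weights normalizes u discretely.
    """
    h = 2.0 / m
    nodes = [-1.0 + (i + 0.5) * h for i in range(m)]   # midpoints inside (-1, 1)
    weights = [bump(s) for s in nodes]
    total = sum(weights)
    return sum(w * f(x - s / k) for w, s in zip(weights, nodes)) / total

# f(x) = |x| is uniformly continuous but not differentiable at 0.
for k in (1, 2, 4, 8):
    print(k, abs(mollify(abs, 0.0, k) - abs(0.0)))
```

Since $|f(x - s/k) - f(x)| \le 1/k$ for $f(x) = |x|$, the printed errors at $x = 0$ decrease like $1/k$.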
Now we are ready to prove

Theorem 0.33. If $X \subseteq \mathbb{R}^n$ and $\mu$ is a Borel measure, then the set $C_c^\infty(X)$ of all smooth functions with compact support is dense in $L^p(X, d\mu)$, $1 \le p < \infty$.

Proof. By our previous result it suffices to show that any continuous function $f(x)$ with compact support can be approximated by smooth ones. By setting $f(x) = 0$ for $x \notin X$, it is no restriction to assume $X = \mathbb{R}^n$. Now choose a mollifier $u$ and observe that $f_k$ has compact support (since $f$ has). Moreover, since $f$ has compact support it is uniformly continuous and $f_k \to f$ uniformly. But this implies $f_k \to f$ in $L^p$.
Problem 0.13. Suppose $\mu(X) < \infty$. Show that
\[ \lim_{p \to \infty} \|f\|_p = \|f\|_\infty \tag{0.85} \]
for any bounded measurable function.

Problem 0.14. Prove (0.74). (Hint: Show that $f(x) = (1 - t) + t x - x^t$, $x > 0$, $0 < t < 1$, satisfies $f(a/b) \ge 0 = f(1)$.)

Problem 0.15. Show the following generalization of Hölder's inequality
\[ \|f g\|_r \le \|f\|_p \|g\|_q, \tag{0.86} \]
where $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$.

Problem 0.16 (Lyapunov inequality). Let $0 < \theta < 1$. Show that if $f \in L^{p_1} \cap L^{p_2}$, then $f \in L^p$ and
\[ \|f\|_p \le \|f\|_{p_1}^{\theta} \|f\|_{p_2}^{1-\theta}, \tag{0.87} \]
where $\frac{1}{p} = \frac{\theta}{p_1} + \frac{1-\theta}{p_2}$.

Problem 0.17. Prove Lemma 0.32. (Hint: To show that $f_k$ is smooth use Problem A.7 and A.8.)

Problem 0.18. Construct a function $f \in L^p(0, 1)$ which has a pole at every rational number in $[0, 1]$. (Hint: Start with the function $f_0(x) = |x|^{-\alpha}$ which has a single pole at $0$, then $f_j(x) = f_0(x - x_j)$ has a pole at $x_j$.)
0.7. Appendix: The uniform boundedness principle

Recall that the interior of a set is the largest open subset (that is, the union of all open subsets). A set is called nowhere dense if its closure has empty interior. The key to several important theorems about Banach spaces is the observation that a Banach space cannot be the countable union of nowhere dense sets.

Theorem 0.34 (Baire category theorem). Let $X$ be a complete metric space, then $X$ cannot be the countable union of nowhere dense sets.
Proof. Suppose $X = \bigcup_{n=1}^\infty X_n$. We can assume that the sets $X_n$ are closed and none of them contains a ball, that is, $X \setminus X_n$ is open and nonempty for every $n$. We will construct a Cauchy sequence $x_n$ which stays away from all $X_n$.

Since $X \setminus X_1$ is open and nonempty there is a ball $B_{r_1}(x_1) \subseteq X \setminus X_1$. Reducing $r_1$ a little, we can even assume that the closed ball satisfies $\overline{B_{r_1}(x_1)} \subseteq X \setminus X_1$. Moreover, since $X_2$ cannot contain $B_{r_1}(x_1)$ there is some $x_2 \in B_{r_1}(x_1)$ that is not in $X_2$. Since $B_{r_1}(x_1) \cap (X \setminus X_2)$ is open there is a closed ball $\overline{B_{r_2}(x_2)} \subseteq B_{r_1}(x_1) \cap (X \setminus X_2)$. Proceeding by induction we obtain a sequence of balls such that
\[ \overline{B_{r_n}(x_n)} \subseteq B_{r_{n-1}}(x_{n-1}) \cap (X \setminus X_n). \tag{0.88} \]
Now observe that in every step we can choose $r_n$ as small as we please, hence without loss of generality $r_n \to 0$. Since by construction $x_n \in \overline{B_{r_N}(x_N)}$ for $n \ge N$, we conclude that $x_n$ is Cauchy and converges to some point $x \in X$. But $x \in \overline{B_{r_n}(x_n)} \subseteq X \setminus X_n$ for every $n$, contradicting our assumption that the $X_n$ cover $X$.
(Sets which can be written as a countable union of nowhere dense sets are called of first category. All other sets are second category. Hence the name category theorem.)

In other words, if $X_n \subseteq X$ is a sequence of closed subsets which cover $X$, at least one $X_n$ contains a ball of radius $\varepsilon > 0$.

Now we come to the first important consequence, the uniform boundedness principle.
Theorem 0.35 (Banach–Steinhaus). Let $X$ be a Banach space and $Y$ some
normed linear space. Let A
 C is
uniformly bounded.
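Proof. Let
\[ X_n = \{ x \mid \|A_\alpha x\| \le n \text{ for all } \alpha \} = \bigcap_\alpha \{ x \mid \|A_\alpha x\| \le n \}, \tag{0.89} \]
then $\bigcup_n X_n = X$ by assumption. Moreover, by continuity of $A_\alpha$ and the norm, each $X_n$ is an intersection of closed sets and hence closed. By Baire's theorem at least one contains a ball of positive radius: $B_\varepsilon(x_0) \subset X_n$. Now observe
\[ \|A_\alpha y\| \le \|A_\alpha (y + x_0)\| + \|A_\alpha x_0\| \le n + \|A_\alpha x_0\| \tag{0.90} \]
for $\|y\| < \varepsilon$. Setting $y = \varepsilon \frac{x}{\|x\|}$ we obtain
\[ \|A_\alpha x\| \le \frac{n + C(x_0)}{\varepsilon} \|x\| \tag{0.91} \]
for any $x$.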
Part 1. Mathematical Foundations of Quantum Mechanics

Chapter 1. Hilbert spaces

The phase space in classical mechanics is the Euclidean space $\mathbb{R}^{2n}$ (for the $n$ position and $n$ momentum coordinates). In quantum mechanics the phase space is always a Hilbert space $\mathfrak{H}$. Hence the geometry of Hilbert spaces stands at the outset of our investigations.
1.1. Hilbert spaces

Suppose $\mathfrak{H}$ is a vector space. A map $\langle \cdot, \cdot\cdot \rangle : \mathfrak{H} \times \mathfrak{H} \to \mathbb{C}$ is called a skew linear form if it is conjugate linear in the first and linear in the second argument. A positive definite skew linear form is called inner product or scalar product. Associated with every scalar product is a norm
\[ \|\psi\| = \sqrt{\langle \psi, \psi \rangle}. \tag{1.1} \]
The triangle inequality follows from the Cauchy–Schwarz–Bunjakowski inequality:
\[ |\langle \psi, \varphi \rangle| \le \|\psi\| \, \|\varphi\| \tag{1.2} \]
with equality if and only if $\psi$ and $\varphi$ are parallel.

If $\mathfrak{H}$ is complete with respect to the above norm, it is called a Hilbert space. It is no restriction to assume that $\mathfrak{H}$ is complete since one can easily replace it by its completion.
Example. The space $L^2(M, d\mu)$ is a Hilbert space with scalar product given by
\[ \langle f, g \rangle = \int_M f(x)^* g(x)\, d\mu(x). \tag{1.3} \]
Similarly, the set of all square summable sequences $\ell^2(\mathbb{N})$ is a Hilbert space with scalar product
\[ \langle f, g \rangle = \sum_{j \in \mathbb{N}} f_j^* g_j. \tag{1.4} \]
(Note that the second example is a special case of the first one; take $M = \mathbb{R}$ and $\mu$ a sum of Dirac measures.)
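A vector $\psi \in \mathfrak{H}$ is called normalized or unit vector if $\|\psi\| = 1$. Two vectors $\psi, \varphi \in \mathfrak{H}$ are called orthogonal or perpendicular ($\psi \perp \varphi$) if $\langle \psi, \varphi \rangle = 0$ and parallel if one is a multiple of the other. If $\psi$ and $\varphi$ are orthogonal we have the Pythagorean theorem:
\[ \|\psi + \varphi\|^2 = \|\psi\|^2 + \|\varphi\|^2, \qquad \psi \perp \varphi, \tag{1.5} \]
which is straightforward to check.

Suppose $\varphi$ is a unit vector, then the projection of $\psi$ in the direction of $\varphi$ is given by
\[ \psi_\parallel = \langle \varphi, \psi \rangle \varphi \tag{1.6} \]
and $\psi_\perp$ defined via
\[ \psi_\perp = \psi - \langle \varphi, \psi \rangle \varphi \tag{1.7} \]
is perpendicular to $\varphi$.

These results can also be generalized to more than one vector. A set of vectors $\{\varphi_j\}$ is called an orthonormal set if $\langle \varphi_j, \varphi_k \rangle = 0$ for $j \ne k$ and $\langle \varphi_j, \varphi_j \rangle = 1$.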
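Lemma 1.1. Suppose $\{\varphi_j\}_{j=0}^n$ is an orthonormal set. Then every $\psi \in \mathfrak{H}$ can be written as
\[ \psi = \psi_\parallel + \psi_\perp, \qquad \psi_\parallel = \sum_{j=0}^n \langle \varphi_j, \psi \rangle \varphi_j, \tag{1.8} \]
where $\psi_\parallel$ and $\psi_\perp$ are orthogonal. Moreover, $\langle \varphi_j, \psi_\perp \rangle = 0$ for all $0 \le j \le n$. In particular,
\[ \|\psi\|^2 = \sum_{j=0}^n |\langle \varphi_j, \psi \rangle|^2 + \|\psi_\perp\|^2. \tag{1.9} \]
Moreover, every $\hat\psi$ in the span of $\{\varphi_j\}_{j=0}^n$ satisfies
\[ \|\psi - \hat\psi\| \ge \|\psi_\perp\| \tag{1.10} \]
with equality holding if and only if $\hat\psi = \psi_\parallel$. In other words, $\psi_\parallel$ is uniquely characterized as the vector in the span of $\{\varphi_j\}_{j=0}^n$ being closest to $\psi$.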
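Proof. A straightforward calculation shows $\langle \varphi_j, \psi - \psi_\parallel \rangle = 0$ and hence $\psi_\parallel$ and $\psi_\perp = \psi - \psi_\parallel$ are orthogonal. The formula for the norm follows by applying (1.5) iteratively. Now fix a vector
\[ \hat\psi = \sum_{j=0}^n c_j \varphi_j \tag{1.11} \]
in the span of $\{\varphi_j\}_{j=0}^n$. Then one computes
\[ \|\psi - \hat\psi\|^2 = \|\psi_\parallel + \psi_\perp - \hat\psi\|^2 = \|\psi_\perp\|^2 + \|\psi_\parallel - \hat\psi\|^2 = \|\psi_\perp\|^2 + \sum_{j=0}^n |c_j - \langle \varphi_j, \psi \rangle|^2 \tag{1.12} \]
from which the last claim follows.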
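From (1.9) we obtain Bessel's inequality
\[ \sum_{j=0}^n |\langle \varphi_j, \psi \rangle|^2 \le \|\psi\|^2 \tag{1.13} \]
with equality holding if and only if $\psi$ lies in the span of $\{\varphi_j\}_{j=0}^n$.

Recall that a scalar product can be recovered from its norm by virtue of the polarization identity
\[ \langle \varphi, \psi \rangle = \frac{1}{4} \Big( \|\varphi + \psi\|^2 - \|\varphi - \psi\|^2 + \mathrm{i} \|\varphi - \mathrm{i}\psi\|^2 - \mathrm{i} \|\varphi + \mathrm{i}\psi\|^2 \Big). \tag{1.14} \]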
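The polarization identity (1.14) is easy to sanity-check in $\mathbb{C}^3$. The snippet below is a small illustrative sketch (the helper names `inner`, `norm_sq`, `polarization` and the sample vectors are assumptions chosen for the example); it uses the convention of the text, conjugate linear in the first argument.

```python
def inner(phi, psi):
    """Scalar product on C^n, conjugate linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def norm_sq(v):
    """Squared norm <v, v>, which is real."""
    return inner(v, v).real

def combine(u, v, c):
    """Return the vector u + c v."""
    return [a + c * b for a, b in zip(u, v)]

def polarization(phi, psi):
    """Right-hand side of the polarization identity (1.14)."""
    return 0.25 * (norm_sq(combine(phi, psi, 1)) - norm_sq(combine(phi, psi, -1))
                   + 1j * norm_sq(combine(phi, psi, -1j))
                   - 1j * norm_sq(combine(phi, psi, 1j)))

phi = [1 + 2j, -1j, 0.5]
psi = [2 - 1j, 3, 1j]
print(inner(phi, psi), polarization(phi, psi))
```

Both printed values agree up to rounding, which is exactly the statement that the scalar product is determined by the norm.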
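A bijective operator $U \in \mathfrak{L}(\mathfrak{H}_1, \mathfrak{H}_2)$ is called unitary if $U$ preserves scalar products:
\[ \langle U\varphi, U\psi \rangle_2 = \langle \varphi, \psi \rangle_1, \qquad \varphi, \psi \in \mathfrak{H}_1. \tag{1.15} \]
By the polarization identity this is the case if and only if $U$ preserves norms: $\|U\psi\|_2 = \|\psi\|_1$ for all $\psi \in \mathfrak{H}_1$. The two Hilbert spaces $\mathfrak{H}_1$ and $\mathfrak{H}_2$ are called unitarily equivalent in this case.

Problem 1.1. The operator $S : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$, $(a_1, a_2, a_3, \dots) \mapsto (0, a_1, a_2, \dots)$ clearly satisfies $\|Sa\| = \|a\|$. Is it unitary?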
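1.2. Orthonormal bases

Of course, since we cannot assume $\mathfrak{H}$ to be a finite dimensional vector space, we need to generalize Lemma 1.1 to arbitrary orthonormal sets $\{\varphi_j\}_{j \in J}$. We start by assuming that $J$ is countable. Then Bessel's inequality (1.13) shows that
\[ \sum_{j \in J} |\langle \varphi_j, \psi \rangle|^2 \tag{1.16} \]
converges absolutely. Moreover, for any finite subset $K \subset J$ we have
\[ \Big\| \sum_{j \in K} \langle \varphi_j, \psi \rangle \varphi_j \Big\|^2 = \sum_{j \in K} |\langle \varphi_j, \psi \rangle|^2 \tag{1.17} \]
by the Pythagorean theorem and thus $\sum_{j \in J} \langle \varphi_j, \psi \rangle \varphi_j$ is Cauchy if and only if $\sum_{j \in J} |\langle \varphi_j, \psi \rangle|^2$ is. Now let $J$ be arbitrary. Again, Bessel's inequality shows that for any given $\varepsilon > 0$ there are at most finitely many $j$ for which $|\langle \varphi_j, \psi \rangle| \ge \varepsilon$. Hence there are at most countably many $j$ for which $|\langle \varphi_j, \psi \rangle| > 0$. Thus it follows that
\[ \sum_{j \in J} |\langle \varphi_j, \psi \rangle|^2 \tag{1.18} \]
is well-defined and so is
\[ \sum_{j \in J} \langle \varphi_j, \psi \rangle \varphi_j. \tag{1.19} \]
In particular, by continuity of the scalar product we see that Lemma 1.1 holds for arbitrary orthonormal sets without modifications.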
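Theorem 1.2. Suppose $\{\varphi_j\}_{j \in J}$ is an orthonormal set. Then every $\psi \in \mathfrak{H}$ can be written as
\[ \psi = \psi_\parallel + \psi_\perp, \qquad \psi_\parallel = \sum_{j \in J} \langle \varphi_j, \psi \rangle \varphi_j, \tag{1.20} \]
where $\psi_\parallel$ and $\psi_\perp$ are orthogonal. Moreover, $\langle \varphi_j, \psi_\perp \rangle = 0$ for all $j \in J$. In particular,
\[ \|\psi\|^2 = \sum_{j \in J} |\langle \varphi_j, \psi \rangle|^2 + \|\psi_\perp\|^2. \tag{1.21} \]
Moreover, every $\hat\psi$ in the span of $\{\varphi_j\}_{j \in J}$ satisfies
\[ \|\psi - \hat\psi\| \ge \|\psi_\perp\| \tag{1.22} \]
with equality holding if and only if $\hat\psi = \psi_\parallel$. In other words, $\psi_\parallel$ is uniquely characterized as the vector in the span of $\{\varphi_j\}_{j \in J}$ being closest to $\psi$.

Note that from Bessel's inequality (which of course still holds) it follows that the map $\psi \mapsto \psi_\parallel$ is continuous.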
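An orthonormal set which is not a proper subset of any other orthonormal set is called an orthonormal basis due to the following result:

Theorem 1.3. For an orthonormal set $\{\varphi_j\}_{j \in J}$ the following conditions are equivalent:
(i) $\{\varphi_j\}_{j \in J}$ is a maximal orthonormal set.
(ii) For every vector $\psi \in \mathfrak{H}$ we have
\[ \psi = \sum_{j \in J} \langle \varphi_j, \psi \rangle \varphi_j. \tag{1.23} \]
(iii) For every vector $\psi \in \mathfrak{H}$ we have
\[ \|\psi\|^2 = \sum_{j \in J} |\langle \varphi_j, \psi \rangle|^2. \tag{1.24} \]
(iv) $\langle \varphi_j, \psi \rangle = 0$ for all $j \in J$ implies $\psi = 0$.

Proof. We will use the notation from Theorem 1.2.
(i) $\Rightarrow$ (ii): If $\psi_\perp \ne 0$ then we can normalize $\psi_\perp$ to obtain a unit vector $\tilde\psi_\perp$ which is orthogonal to all vectors $\varphi_j$. But then $\{\varphi_j\}_{j \in J} \cup \{\tilde\psi_\perp\}$ would be a larger orthonormal set, contradicting maximality of $\{\varphi_j\}_{j \in J}$.
(ii) $\Rightarrow$ (iii): Follows since (ii) implies $\psi_\perp = 0$.
(iii) $\Rightarrow$ (iv): If $\langle \psi, \varphi_j \rangle = 0$ for all $j \in J$ we conclude $\|\psi\|^2 = 0$ and hence $\psi = 0$.
(iv) $\Rightarrow$ (i): If $\{\varphi_j\}_{j \in J}$ were not maximal, there would be a unit vector $\varphi$ such that $\{\varphi_j\}_{j \in J} \cup \{\varphi\}$ is a larger orthonormal set. But $\langle \varphi_j, \varphi \rangle = 0$ for all $j \in J$ implies $\varphi = 0$ by (iv), a contradiction.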
Since
\[ \varphi_n(x) = \frac{1}{\sqrt{2\pi}}\, \mathrm{e}^{\mathrm{i} n x}, \qquad n \in \mathbb{Z}, \tag{1.25} \]
forms an orthonormal basis for $\mathfrak{H} = L^2(0, 2\pi)$. The corresponding orthogonal expansion is just the ordinary Fourier series. (Problem 1.17)
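A Hilbert space is separable if and only if there is a countable orthonormal basis. In fact, if $\mathfrak{H}$ is separable, then there exists a countable total set $\{\psi_j\}_{j=0}^N$. After throwing away some vectors we can assume that $\psi_{n+1}$ cannot be expressed as a linear combination of the vectors $\psi_0, \dots, \psi_n$. Now we can construct an orthonormal basis as follows: We begin by normalizing $\psi_0$
\[ \varphi_0 = \frac{\psi_0}{\|\psi_0\|}. \tag{1.26} \]
Next we take $\psi_1$ and remove the component parallel to $\varphi_0$ and normalize again
\[ \varphi_1 = \frac{\psi_1 - \langle \varphi_0, \psi_1 \rangle \varphi_0}{\|\psi_1 - \langle \varphi_0, \psi_1 \rangle \varphi_0\|}. \tag{1.27} \]
Proceeding like this we define recursively
\[ \varphi_n = \frac{\psi_n - \sum_{j=0}^{n-1} \langle \varphi_j, \psi_n \rangle \varphi_j}{\big\|\psi_n - \sum_{j=0}^{n-1} \langle \varphi_j, \psi_n \rangle \varphi_j\big\|}. \tag{1.28} \]
This procedure is known as Gram–Schmidt orthogonalization. Hence we obtain an orthonormal set $\{\varphi_j\}_{j=0}^N$ such that $\operatorname{span}\{\varphi_j\}_{j=0}^n = \operatorname{span}\{\psi_j\}_{j=0}^n$ for any finite $n$ and thus also for $N$. Since $\{\psi_j\}_{j=0}^N$ is total, we infer that $\{\varphi_j\}_{j=0}^N$ is an orthonormal basis.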
Example. In $L^2(-1, 1)$ we can orthogonalize the polynomials $f_n(x) = x^n$. The resulting polynomials are up to a normalization equal to the Legendre polynomials
\[ P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \frac{3 x^2 - 1}{2}, \quad \dots \tag{1.29} \]
(which are normalized such that $P_n(1) = 1$).
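The Gram–Schmidt procedure applied to the monomials in $L^2(-1,1)$ can be carried out exactly with rational arithmetic. The following is a minimal sketch under stated assumptions: polynomials are represented as coefficient lists (lowest degree first), the function names are made up for this illustration, and instead of normalizing in $L^2$ as in (1.28) each polynomial is rescaled at the end so that $P_n(1) = 1$, matching the normalization in (1.29).

```python
from fractions import Fraction

def inner(p, q):
    """Exact L^2(-1,1) scalar product of real polynomials given as coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                     # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

def legendre_via_gram_schmidt(n_max):
    """Orthogonalize 1, x, ..., x^n_max and rescale each result so that P_n(1) = 1."""
    basis = []
    for n in range(n_max + 1):
        p = [Fraction(0)] * n + [Fraction(1)]        # the monomial x^n
        for q in basis:
            c = inner(q, p) / inner(q, q)            # component of p along q
            p = [a - c * (q[k] if k < len(q) else 0) for k, a in enumerate(p)]
        basis.append(p)
    # sum(p) is the value p(1); it is nonzero since all roots lie in (-1, 1)
    return [[a / sum(p) for a in p] for p in basis]

for n, p in enumerate(legendre_via_gram_schmidt(3)):
    print(n, [str(a) for a in p])
```

For $n = 2$ this reproduces the coefficients $-\frac12, 0, \frac32$ of $P_2$ in (1.29).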
In fact, if there is one countable basis, then it follows that any other basis is countable as well.

Theorem 1.4. If $\mathfrak{H}$ is separable, then every orthonormal basis is countable.

Proof. We know that there is at least one countable orthonormal basis $\{\varphi_j\}_{j \in J}$. Now let $\{\phi_k\}_{k \in K}$ be a second basis and consider the set $K_j = \{ k \in K \mid \langle \phi_k, \varphi_j \rangle \ne 0 \}$. Since these are the expansion coefficients of $\varphi_j$ with respect to $\{\phi_k\}_{k \in K}$, this set is countable. Hence the set $\tilde{K} = \bigcup_{j \in J} K_j$ is countable as well. But $k \in K \setminus \tilde{K}$ implies $\phi_k = 0$ and hence $\tilde{K} = K$.
We will assume all Hilbert spaces to be separable.

In particular, it can be shown that $L^2(M, d\mu)$ is separable. Moreover, it turns out that, up to unitary equivalence, there is only one (separable) infinite dimensional Hilbert space:

Let $\mathfrak{H}$ be an infinite dimensional Hilbert space and let $\{\varphi_j\}_{j \in \mathbb{N}}$ be any orthonormal basis. Then the map $U : \mathfrak{H} \to \ell^2(\mathbb{N})$, $\psi \mapsto (\langle \varphi_j, \psi \rangle)_{j \in \mathbb{N}}$ is unitary (by Theorem 1.3 (iii)). In particular,

Theorem 1.5. Any separable infinite dimensional Hilbert space is unitarily equivalent to $\ell^2(\mathbb{N})$.

Let me remark that if $\mathfrak{H}$ is not separable, there still exists an orthonormal basis. However, the proof requires Zorn's lemma: The collection of all orthonormal sets in $\mathfrak{H}$ can be partially ordered by inclusion. Moreover, any linearly ordered chain has an upper bound (the union of all sets in the chain). Hence Zorn's lemma implies the existence of a maximal element, that is, an orthonormal basis.
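1.3. The projection theorem and the Riesz lemma

Let $M \subseteq \mathfrak{H}$ be a subset, then $M^\perp = \{ \psi \mid \langle \varphi, \psi \rangle = 0, \ \forall \varphi \in M \}$ is called the orthogonal complement of $M$. By continuity of the scalar product it follows that $M^\perp$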
= M
with
M and
. One writes
\[ M \oplus M^\perp = \mathfrak{H} \tag{1.30} \]
in this situation.

Proof. Since $M$ is closed, it is a Hilbert space and has an orthonormal basis $\{\varphi_j\}_{j \in J}$. Hence the result follows from Theorem 1.2.
In other words, to every $\psi \in \mathfrak{H}$ we can assign a unique vector $\psi_\parallel$ which is the vector in $M$ closest to $\psi$. The rest $\psi - \psi_\parallel$ lies in $M^\perp$
. The operator
P
M
=
) = , P
M
). Clearly we have P
M
=
P
M
=
.
Moreover, we see that the vectors in a closed subspace $M$ are precisely those which are orthogonal to all vectors in $M^\perp$, that is, $M^{\perp\perp} = M$. If $M$ is an arbitrary subset we have at least
\[ M^{\perp\perp} = \overline{\operatorname{span}(M)}. \tag{1.32} \]
Note that by H
= we see that M
: , ) is a
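bounded linear functional (with norm $\|\psi\|$). It turns out that in a Hilbert space every bounded linear functional can be written in this way.

Theorem 1.7 (Riesz lemma). Suppose $\ell$ is a bounded linear functional on a Hilbert space $\mathfrak{H}$. Then there is a vector $\varphi \in \mathfrak{H}$ such that $\ell(\psi) = \langle \varphi, \psi \rangle$ for all $\psi \in \mathfrak{H}$. In other words, a Hilbert space is equivalent to its own dual space $\mathfrak{H}^* = \mathfrak{H}$.

Proof. If $\ell \equiv 0$ we can choose $\varphi = 0$. Otherwise $\operatorname{Ker}(\ell) = \{ \psi \mid \ell(\psi) = 0 \}$ is a proper subspace and we can find a unit vector $\tilde\varphi \in \operatorname{Ker}(\ell)^\perp$. For every $\psi \in \mathfrak{H}$ we have $\ell(\psi)\tilde\varphi - \ell(\tilde\varphi)\psi \in \operatorname{Ker}(\ell)$ and hence
\[ 0 = \langle \tilde\varphi, \ell(\psi)\tilde\varphi - \ell(\tilde\varphi)\psi \rangle = \ell(\psi) - \ell(\tilde\varphi) \langle \tilde\varphi, \psi \rangle. \tag{1.33} \]
In other words, we can choose $\varphi = \ell(\tilde\varphi)^* \tilde\varphi$.

The following easy consequence is left as an exercise.

Corollary 1.8. Suppose $B$ is a bounded skew linear form, that is,
\[ |B(\psi, \varphi)| \le C \|\psi\| \, \|\varphi\|. \tag{1.34} \]
Then there is a unique bounded operator $A$ such that
\[ B(\psi, \varphi) = \langle A\psi, \varphi \rangle. \tag{1.35} \]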
Problem 1.2. Show that an orthogonal projection $P_M \ne 0$ has norm one.

Problem 1.3. Suppose $P_1$ and $P_2$ are orthogonal projections. Show that $P_1 \le P_2$ (that is, $\langle \psi, P_1 \psi \rangle \le \langle \psi, P_2 \psi \rangle$) is equivalent to $\operatorname{Ran}(P_1) \subseteq \operatorname{Ran}(P_2)$.

Problem 1.4. Prove Corollary 1.8.

Problem 1.5. Let $\{\varphi_j\}$ be some orthonormal basis. Show that a bounded linear operator $A$ is uniquely determined by its matrix elements $A_{jk} = \langle \varphi_j, A \varphi_k \rangle$ with respect to this basis.

Problem 1.6. Show that $\mathfrak{L}(\mathfrak{H})$ is not separable if $\mathfrak{H}$ is infinite dimensional.

Problem 1.7. Show that $P : L^2(\mathbb{R}) \to L^2(\mathbb{R})$, $f(x) \mapsto \frac{1}{2}(f(x) + f(-x))$ is a projection. Compute its range and kernel.
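1.4. Orthogonal sums and tensor products

Given two Hilbert spaces $\mathfrak{H}_1$ and $\mathfrak{H}_2$ we define their orthogonal sum $\mathfrak{H}_1 \oplus \mathfrak{H}_2$ to be the set of all pairs $(\psi_1, \psi_2) \in \mathfrak{H}_1 \times \mathfrak{H}_2$ together with the scalar product
\[ \langle (\varphi_1, \varphi_2), (\psi_1, \psi_2) \rangle = \langle \varphi_1, \psi_1 \rangle_1 + \langle \varphi_2, \psi_2 \rangle_2. \tag{1.36} \]
It is left as an exercise to verify that $\mathfrak{H}_1 \oplus \mathfrak{H}_2$ is again a Hilbert space. Moreover, $\mathfrak{H}_1$ can be identified with $\{ (\psi_1, 0) \mid \psi_1 \in \mathfrak{H}_1 \}$ and we can regard $\mathfrak{H}_1$ as a subspace of $\mathfrak{H}_1 \oplus \mathfrak{H}_2$. Similarly for $\mathfrak{H}_2$. It is also customary to write $\psi_1 + \psi_2$ instead of $(\psi_1, \psi_2)$.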
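More generally, let $\mathfrak{H}_j$, $j \in \mathbb{N}$, be a countable collection of Hilbert spaces and define
\[ \bigoplus_{j=1}^\infty \mathfrak{H}_j = \Big\{ \sum_{j=1}^\infty \psi_j \ \Big|\ \psi_j \in \mathfrak{H}_j, \ \sum_{j=1}^\infty \|\psi_j\|_j^2 < \infty \Big\}, \tag{1.37} \]
which becomes a Hilbert space with the scalar product
\[ \Big\langle \sum_{j=1}^\infty \varphi_j, \sum_{j=1}^\infty \psi_j \Big\rangle = \sum_{j=1}^\infty \langle \varphi_j, \psi_j \rangle_j. \tag{1.38} \]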
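Suppose $\mathfrak{H}$ and $\tilde{\mathfrak{H}}$ are two Hilbert spaces. Our goal is to construct their tensor product. The elements should be products $\psi \otimes \tilde\psi$ of elements $\psi \in \mathfrak{H}$ and $\tilde\psi \in \tilde{\mathfrak{H}}$. Hence we start with the set of all finite linear combinations of elements of $\mathfrak{H} \times \tilde{\mathfrak{H}}$
\[ \mathcal{F}(\mathfrak{H}, \tilde{\mathfrak{H}}) = \Big\{ \sum_{j=1}^n \alpha_j (\psi_j, \tilde\psi_j) \ \Big|\ (\psi_j, \tilde\psi_j) \in \mathfrak{H} \times \tilde{\mathfrak{H}}, \ \alpha_j \in \mathbb{C} \Big\}. \tag{1.39} \]
Since we want $(\psi_1 + \psi_2) \otimes \tilde\psi = \psi_1 \otimes \tilde\psi + \psi_2 \otimes \tilde\psi$, $\psi \otimes (\tilde\psi_1 + \tilde\psi_2) = \psi \otimes \tilde\psi_1 + \psi \otimes \tilde\psi_2$, and $(\alpha\psi) \otimes \tilde\psi = \psi \otimes (\alpha\tilde\psi)$ we consider $\mathcal{F}(\mathfrak{H}, \tilde{\mathfrak{H}}) / \mathcal{N}(\mathfrak{H}, \tilde{\mathfrak{H}})$, where
\[ \mathcal{N}(\mathfrak{H}, \tilde{\mathfrak{H}}) = \operatorname{span} \Big\{ \sum_{j,k=1}^n \alpha_j \beta_k (\psi_j, \tilde\psi_k) - \Big( \sum_{j=1}^n \alpha_j \psi_j, \sum_{k=1}^n \beta_k \tilde\psi_k \Big) \Big\} \tag{1.40} \]
and write $\psi \otimes \tilde\psi$ for the equivalence class of $(\psi, \tilde\psi)$.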
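Next we define
\[ \langle \psi \otimes \tilde\psi, \varphi \otimes \tilde\varphi \rangle = \langle \psi, \varphi \rangle \langle \tilde\psi, \tilde\varphi \rangle \tag{1.41} \]
which extends to a skew linear form on $\mathcal{F}(\mathfrak{H}, \tilde{\mathfrak{H}}) / \mathcal{N}(\mathfrak{H}, \tilde{\mathfrak{H}})$. To show that we obtain a scalar product, we need to ensure positivity. Let $\psi = \sum_i \alpha_i \psi_i \otimes \tilde\psi_i \ne 0$ and pick orthonormal bases $\varphi_j$, $\tilde\varphi_k$ for $\operatorname{span}\{\psi_i\}$, $\operatorname{span}\{\tilde\psi_i\}$, respectively. Then
\[ \psi = \sum_{j,k} \alpha_{jk}\, \varphi_j \otimes \tilde\varphi_k, \qquad \alpha_{jk} = \sum_i \alpha_i \langle \varphi_j, \psi_i \rangle \langle \tilde\varphi_k, \tilde\psi_i \rangle \tag{1.42} \]
and we compute
\[ \langle \psi, \psi \rangle = \sum_{j,k} |\alpha_{jk}|^2 > 0. \tag{1.43} \]
The completion of $\mathcal{F}(\mathfrak{H}, \tilde{\mathfrak{H}}) / \mathcal{N}(\mathfrak{H}, \tilde{\mathfrak{H}})$ with respect to the induced norm is called the tensor product $\mathfrak{H} \otimes \tilde{\mathfrak{H}}$ of $\mathfrak{H}$ and $\tilde{\mathfrak{H}}$.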
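Lemma 1.9. If $\varphi_j$, $\tilde\varphi_k$ are orthonormal bases for $\mathfrak{H}$, $\tilde{\mathfrak{H}}$, respectively, then $\varphi_j \otimes \tilde\varphi_k$ is an orthonormal basis for $\mathfrak{H} \otimes \tilde{\mathfrak{H}}$.

Proof. That $\varphi_j \otimes \tilde\varphi_k$ is an orthonormal set is immediate from (1.41). Moreover, since $\operatorname{span}\{\varphi_j\}$, $\operatorname{span}\{\tilde\varphi_k\}$ is dense in $\mathfrak{H}$, $\tilde{\mathfrak{H}}$, respectively, it is easy to see that the span of $\varphi_j \otimes \tilde\varphi_k$ is dense in $\mathcal{F}(\mathfrak{H}, \tilde{\mathfrak{H}}) / \mathcal{N}(\mathfrak{H}, \tilde{\mathfrak{H}})$. But the latter is dense in $\mathfrak{H} \otimes \tilde{\mathfrak{H}}$.

Example. We have $\mathfrak{H} \otimes \mathbb{C}^n = \mathfrak{H}^n$.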
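Example. Let $(M, d\mu)$ and $(\tilde{M}, d\tilde\mu)$ be two measure spaces. Then we have $L^2(M, d\mu) \otimes L^2(\tilde{M}, d\tilde\mu) = L^2(M \times \tilde{M}, d\mu \times d\tilde\mu)$.

Clearly we have $L^2(M, d\mu) \otimes L^2(\tilde{M}, d\tilde\mu) \subseteq L^2(M \times \tilde{M}, d\mu \times d\tilde\mu)$. Now take an orthonormal basis $\varphi_j \otimes \tilde\varphi_k$ for $L^2(M, d\mu) \otimes L^2(\tilde{M}, d\tilde\mu)$ as in our previous lemma. Then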
_
M
_
M
(
j
(x)
k
(y))
j
(x)
f
k
(x)d(x) = 0, f
k
(x) =
_
M
k
(y)
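\[ \Big( \bigoplus_{j=1}^\infty \mathfrak{H}_j \Big) \otimes \mathfrak{H} = \bigoplus_{j=1}^\infty (\mathfrak{H}_j \otimes \mathfrak{H}), \tag{1.46} \]
where equality has to be understood in the sense that both spaces are unitarily equivalent by virtue of the identification
\[ \Big( \sum_{j=1}^\infty \psi_j \Big) \otimes \psi = \sum_{j=1}^\infty \psi_j \otimes \psi. \tag{1.47} \]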
Problem 1.8. We have
.
Problem 1.9. Show (1.46)
1.5. The C
) = A, ) (1.48)
(compare Corollary 1.8).
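Example. If $\mathfrak{H} = \mathbb{C}^n$ and $A = (a_{jk})_{1 \le j,k \le n}$, then $A^* = (a_{kj}^*)_{1 \le j,k \le n}$.

Lemma 1.10. Let $A, B \in \mathfrak{L}(\mathfrak{H})$, then
(i) $(A + B)^* = A^* + B^*$, $(\alpha A)^* = \alpha^* A^*$,
(ii) $A^{**} = A$,
(iii) $(AB)^* = B^* A^*$,
(iv) $\|A\| = \|A^*\|$ and $\|A\|^2 = \|A^* A\| = \|A A^*\|$.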
1.5. The C
, B) =
B
 = sup
==1
[, A
)[ = sup
==1
[A, )[ = A. (1.49)
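and
\[ \|A^* A\| = \sup_{\|\varphi\| = \|\psi\| = 1} |\langle \varphi, A^* A \psi \rangle| = \sup_{\|\varphi\| = \|\psi\| = 1} |\langle A\varphi, A\psi \rangle| = \sup_{\|\varphi\| = 1} \|A\varphi\|^2 = \|A\|^2, \tag{1.50} \]
where we have used $\|\varphi\| = \sup_{\|\psi\| = 1} |\langle \psi, \varphi \rangle|$.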
As a consequence of A
= a
+b
, (a)
, a
= a, (ab)
= b
, (1.51)
satisfying
a
2
= a
a (1.52)
is called a C
 aa
.
Any subalgebra which is also closed under involution is called a $*$-algebra. An ideal is a subspace $\mathcal{I} \subseteq \mathcal{A}$ such that $a \in \mathcal{I}$, $b \in \mathcal{A}$ implies $ab \in \mathcal{I}$ and $ba \in \mathcal{I}$. If it is closed under the adjoint map it is called a $*$-ideal. Note that if there is an identity $e$ we have $e^* = e$ and hence $(a^{-1})^* = (a^*)^{-1}$ (show this).

Example. The continuous functions $C(I)$ together with complex conjugation form a commutative $C^*$ algebra.

An element $a \in \mathcal{A}$ is called normal if $a a^* = a^* a$, self-adjoint if $a = a^*$, unitary if $a a^* = a^* a = \mathbb{I}$, (orthogonal) projection if $a = a^* = a^2$, and
positive if a = bb
, H. (1.53)
(Hint: Problem 0.6)
Problem 1.11. Show that $U : \mathfrak{H} \to \mathfrak{H}$ is unitary if and only if $U^{-1} = U^*$.

Problem 1.12. Compute the adjoint of $S : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$, $(a_1, a_2, a_3, \dots) \mapsto (0, a_1, a_2, \dots)$.
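1.6. Weak and strong convergence

Sometimes a weaker notion of convergence is useful: We say that $\psi_n$ converges weakly to $\psi$ and write
\[ \operatorname*{w-lim}_{n \to \infty} \psi_n = \psi \quad \text{or} \quad \psi_n \rightharpoonup \psi \tag{1.54} \]
if $\langle \varphi, \psi_n \rangle \to \langle \varphi, \psi \rangle$ for every $\varphi \in \mathfrak{H}$ (show that a weak limit is unique).

Example. Let $\varphi_n$ be an (infinite) orthonormal set. Then $\langle \psi, \varphi_n \rangle \to 0$ for every $\psi$ since these are just the expansion coefficients of $\psi$. ($\varphi_n$ does not converge to $0$, since $\|\varphi_n\| = 1$.)

Clearly $\psi_n \to \psi$ implies $\psi_n \rightharpoonup \psi$ and hence this notion of convergence is indeed weaker. Moreover, the weak limit is unique, since $\langle \varphi, \psi_n \rangle \to \langle \varphi, \psi \rangle$ and $\langle \varphi, \psi_n \rangle \to \langle \varphi, \tilde\psi \rangle$ implies $\langle \varphi, (\psi - \tilde\psi) \rangle = 0$. A sequence $\psi_n$ is called a weak Cauchy sequence if $\langle \varphi, \psi_n \rangle$ is Cauchy for every $\varphi \in \mathfrak{H}$.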
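Lemma 1.11. Let $\mathfrak{H}$ be a Hilbert space.
(i) $\psi_n \rightharpoonup \psi$ implies $\|\psi\| \le \liminf \|\psi_n\|$.
(ii) Every weak Cauchy sequence $\psi_n$ is bounded: $\|\psi_n\| \le C$.
(iii) Every weak Cauchy sequence converges weakly.
(iv) For a weakly convergent sequence $\psi_n \rightharpoonup \psi$ we have: $\psi_n \to \psi$ if and only if $\limsup \|\psi_n\| \le \|\psi\|$.

Proof. (i) Observe
\[ \|\psi\|^2 = \langle \psi, \psi \rangle = \liminf \langle \psi, \psi_n \rangle \le \|\psi\| \liminf \|\psi_n\|. \tag{1.55} \]
(ii) For every $\varphi$ we have that $|\langle \varphi, \psi_n \rangle| \le C(\varphi)$ is bounded. Hence by the uniform boundedness principle we have $\|\psi_n\| = \|\langle \psi_n, \cdot \rangle\| \le C$.
(iii) Let $\varphi_m$ be an orthonormal basis and define $c_m = \lim_{n \to \infty} \langle \varphi_m, \psi_n \rangle$. Then $\psi = \sum_m c_m \varphi_m$ is the desired limit.
(iv) By (i) we have $\lim \|\psi_n\| = \|\psi\|$ and hence
\[ \|\psi - \psi_n\|^2 = \|\psi\|^2 - 2 \operatorname{Re}(\langle \psi, \psi_n \rangle) + \|\psi_n\|^2 \to 0. \tag{1.56} \]
The converse is straightforward.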
Clearly an orthonormal basis does not have a norm convergent subsequence. Hence the unit ball in an infinite dimensional Hilbert space is never compact. However, we can at least extract weakly convergent subsequences:

Lemma 1.12. Let $\mathfrak{H}$ be a Hilbert space. Every bounded sequence $\psi_n$ has a weakly convergent subsequence.

Proof. Let $\varphi_k$ be an ONB, then by the usual diagonal sequence argument we can find a subsequence $\psi_{n_m}$ such that $\langle \varphi_k, \psi_{n_m} \rangle$ converges for all $k$. Since $\psi_n$ is bounded, $\langle \varphi, \psi_{n_m} \rangle$ converges for every $\varphi \in \mathfrak{H}$ and hence $\psi_{n_m}$ is a weak Cauchy sequence.
Finally, let me remark that similar concepts can be introduced for operators. This is of particular importance for the case of unbounded operators, where convergence in the operator norm makes no sense at all.

A sequence of operators $A_n$ is said to converge strongly to $A$,
\[ \operatorname*{s-lim}_{n \to \infty} A_n = A \quad :\Leftrightarrow \quad A_n \psi \to A\psi \quad \forall \psi \in D(A) \subseteq D(A_n). \tag{1.57} \]
It is said to converge weakly to $A$,
\[ \operatorname*{w-lim}_{n \to \infty} A_n = A \quad :\Leftrightarrow \quad A_n \psi \rightharpoonup A\psi \quad \forall \psi \in D(A) \subseteq D(A_n). \tag{1.58} \]
Clearly norm convergence implies strong convergence and strong convergence implies weak convergence.
Example. Consider the operator $S_n \in \mathfrak{L}(\ell^2(\mathbb{N}))$ which shifts a sequence $n$ places to the left
\[ S_n (x_1, x_2, \dots) = (x_{n+1}, x_{n+2}, \dots) \tag{1.59} \]
and the operator $S_n^* \in \mathfrak{L}(\ell^2(\mathbb{N}))$ which shifts a sequence $n$ places to the right and fills up the first $n$ places with zeros
\[ S_n^* (x_1, x_2, \dots) = (\underbrace{0, \dots, 0}_{n \text{ places}}, x_1, x_2, \dots). \tag{1.60} \]
Then $S_n$ converges to zero strongly but not in norm and $S_n^*$ converges weakly to zero but not strongly.
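The behaviour of the two shifts can be made tangible on a concrete vector. The sketch below is only an illustration under explicit assumptions: a real sequence is truncated to its first `N` entries (so it merely approximates an element of $\ell^2(\mathbb{N})$), and the helper names `shift_left`, `shift_right`, `inner`, `norm` are made up for this example.

```python
import math

def shift_left(x, n):
    """S_n: drop the first n entries."""
    return x[n:]

def shift_right(x, n):
    """S_n^*: prepend n zeros."""
    return [0.0] * n + x

def inner(x, y):
    """Real scalar product; zip stops at the shorter list, the missing entries are zero."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

N = 2000
x = [1.0 / j for j in range(1, N + 1)]      # truncated version of (1, 1/2, 1/3, ...)

for n in (10, 100, 1000):
    print(n,
          norm(shift_left(x, n)),           # tends to 0: strong convergence of S_n
          norm(shift_right(x, n)),          # stays equal to norm(x): no strong convergence
          inner(x, shift_right(x, n)))      # tends to 0: weak convergence of S_n^*
```

The middle column never changes since $S_n^*$ is an isometry, while the first and last columns shrink as $n$ grows. That $S_n$ does not converge in norm is not visible on a single vector: for every $n$ the unit vector with a one in position $n+1$ is mapped to a unit vector.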
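Note that this example also shows that taking adjoints is not continuous with respect to strong convergence! If $A_n \overset{s}{\to} A$ we only have
\[ \langle \varphi, A_n^* \psi \rangle = \langle A_n \varphi, \psi \rangle \to \langle A\varphi, \psi \rangle = \langle \varphi, A^* \psi \rangle \tag{1.61} \]
and hence $A_n^* \rightharpoonup A^*$ in general. However, if $A_n$ and $A$ are normal we have
\[ \|(A_n - A)^* \psi\| = \|(A_n - A) \psi\| \tag{1.62} \]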
and hence A
n
s
A
j=1
be some orthonormal basis and dene

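\[ \|\psi\|_w = \sum_{j=1}^\infty \frac{1}{2^j} |\langle \varphi_j, \psi \rangle|. \tag{1.64} \]
Show that $\|\cdot\|_w$ is a norm. Show that $\psi_n \rightharpoonup \psi$ if and only if $\|\psi_n - \psi\|_w \to 0$.

Problem 1.16. A subspace $M \subseteq \mathfrak{H}$ is closed if and only if every weak Cauchy sequence in $M$ has a limit in $M$. (Hint: $M = M^{\perp\perp}$.)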
1.7. Appendix: The Stone–Weierstraß theorem

In case of a self-adjoint operator, the spectral theorem will show that the closed $*$-algebra generated by this operator is isomorphic to the $C^*$ algebra of continuous functions $C(K)$ over some compact set. Hence it is important to be able to identify dense sets:

Theorem 1.14 (Stone–Weierstraß, real version). Suppose $K$ is a compact set and let $C(K, \mathbb{R})$ be the Banach algebra of continuous functions (with the sup norm).
If $F \subset C(K, \mathbb{R})$ contains the identity $1$ and separates points (i.e., for every $x_1 \ne x_2$ there is some function $f \in F$ such that $f(x_1) \ne f(x_2)$), then the algebra generated by $F$ is dense.

Proof. Denote by $A$ the algebra generated by $F$. Note that if $f \in \overline{A}$, we have $|f| \in \overline{A}$: By the Weierstraß approximation theorem (Theorem 0.13) there is a polynomial $p_n(t)$ such that $\big| |t| - p_n(t) \big| < \frac{1}{n}$ on the range of $f$ and hence $p_n(f) \to |f|$.
In particular, if $f, g$ are in $\overline{A}$, we also have
\[ \max\{f, g\} = \frac{(f + g) + |f - g|}{2}, \qquad \min\{f, g\} = \frac{(f + g) - |f - g|}{2} \tag{1.65} \]
in $\overline{A}$.
Now fix $f \in C(K, \mathbb{R})$. We need to find some $f_\varepsilon \in \overline{A}$ with $\|f - f_\varepsilon\|_\infty < \varepsilon$.
First of all, since $A$ separates points, observe that for given $y, z \in K$ there is a function $f_{y,z} \in A$ such that $f_{y,z}(y) = f(y)$ and $f_{y,z}(z) = f(z)$ (show this). Next, for every $y \in K$ there is a neighborhood $U(y)$ such that
\[ f_{y,z}(x) > f(x) - \varepsilon, \qquad x \in U(y), \tag{1.66} \]
and since $K$ is compact, finitely many, say $U(y_1), \dots, U(y_j)$, cover $K$. Then
\[ f_z = \max\{ f_{y_1, z}, \dots, f_{y_j, z} \} \in \overline{A} \tag{1.67} \]
and satisfies $f_z > f - \varepsilon$ by construction. Since $f_z(z) = f(z)$ for every $z \in K$ there is a neighborhood $V(z)$ such that
\[ f_z(x) < f(x) + \varepsilon, \qquad x \in V(z), \tag{1.68} \]
and a corresponding finite cover $V(z_1), \dots, V(z_k)$. Now
\[ f_\varepsilon = \min\{ f_{z_1}, \dots, f_{z_k} \} \in \overline{A} \tag{1.69} \]
satises f
algebra
of continuous functions (with the sup norm).
If $F \subset C(K)$ separates points, then the closure of the $*$-algebra generated by $F$ is either $C(K)$ or $\{ f \in C(K) \mid f(t_0) = 0 \}$ for some $t_0 \in K$.

Proof. There are two possibilities, either all $f \in F$ vanish at one point $t_0 \in K$ (there can be at most one such point since $F$ separates points) or there is no such point. If there is no such point we can proceed as in the proof of the Stone–Weierstraß theorem to show that the identity can be approximated by elements in $A$ (note that to show $|f| \in \overline{A}$ if $f \in \overline{A}$ we do not need the identity, since $p_n$ can be chosen to contain no constant term). If there is such a $t_0$, the identity is clearly missing from $A$. However, adding the identity to $A$ we get $\overline{A + \mathbb{C}} = C(K)$ and it is easy to see that $\overline{A} = \{ f \in C(K) \mid f(t_0) = 0 \}$.
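Problem 1.17. Show that the functions $\varphi_n(x) = \frac{1}{\sqrt{2\pi}} \mathrm{e}^{\mathrm{i} n x}$, $n \in \mathbb{Z}$, form an orthonormal basis for $\mathfrak{H} = L^2(0, 2\pi)$.

Problem 1.18. Show that the $*$-algebra generated by $f_z(t) = \frac{1}{t - z}$, $z \in \mathbb{C}$, is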
dense in the C
algebra C
\[ E_\psi(x) = \int_{\mathbb{R}^3} x\, |\psi(x, t)|^2\, d^3x. \tag{2.3} \]
In a real life setting, it will not be possible to measure $x$ directly and one will only be able to measure certain functions of $x$. For example, it is possible to check whether the particle is inside a certain area $\Omega$ of space (e.g., inside a detector). The corresponding observable is the characteristic function $\chi_\Omega(x)$ of this set. In particular, the number
\[ E_\psi(\chi_\Omega) = \int_{\mathbb{R}^3} \chi_\Omega(x) |\psi(x, t)|^2\, d^3x = \int_\Omega |\psi(x, t)|^2\, d^3x \tag{2.4} \]
corresponds to the probability of finding the particle inside $\Omega \subseteq \mathbb{R}^3$. An important point to observe is that, in contradistinction to classical mechanics, the particle is no longer localized at a certain point. In particular, the mean-square deviation (or variance) $\Delta_\psi(x)^2 = E_\psi(x^2) - E_\psi(x)^2$ is always nonzero.
In general, the configuration space (or phase space) of a quantum system is a (complex) Hilbert space $\mathfrak{H}$ and the possible states of this system are represented by the elements $\psi$ having norm one, $\|\psi\| = 1$.

An observable $a$ corresponds to a linear operator $A$ in this Hilbert space and its expectation, if the system is in the state $\psi$, is given by the real number
\[ E_\psi(A) = \langle \psi, A\psi \rangle = \langle A\psi, \psi \rangle, \tag{2.5} \]
where $\langle \cdot, \cdot\cdot \rangle$ denotes the scalar product of $\mathfrak{H}$. Similarly, the mean-square deviation is given by
\[ \Delta_\psi(A)^2 = E_\psi(A^2) - E_\psi(A)^2 = \|(A - E_\psi(A))\psi\|^2. \tag{2.6} \]
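For a finite dimensional toy model both expressions for the mean-square deviation in (2.6) can be compared directly. The following sketch assumes $\mathfrak{H} = \mathbb{C}^2$, a Hermitian $2 \times 2$ matrix `A`, and a unit vector `psi`, all chosen only for illustration; the helper names are likewise hypothetical.

```python
def apply(A, psi):
    """Matrix-vector product A psi."""
    return [sum(a * x for a, x in zip(row, psi)) for row in A]

def inner(phi, psi):
    """Scalar product on C^n, conjugate linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

A = [[1, 1j], [-1j, -1]]          # Hermitian: equal to its conjugate transpose
psi = [3 / 5, 4j / 5]             # unit vector: (3/5)^2 + (4/5)^2 = 1

expectation = inner(psi, apply(A, psi))                 # E_psi(A) = <psi, A psi>
second_moment = inner(psi, apply(A, apply(A, psi)))     # E_psi(A^2)

# First expression in (2.6): E_psi(A^2) - E_psi(A)^2
var_moments = (second_moment - expectation ** 2).real

# Second expression in (2.6): ||(A - E_psi(A)) psi||^2
shifted = [a - expectation * x for a, x in zip(apply(A, psi), psi)]
var_norm = inner(shifted, shifted).real

print(expectation, var_moments, var_norm)
```

The expectation comes out real (its imaginary part vanishes up to rounding), as it must for a Hermitian matrix, and the two variance expressions agree.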
Note that
(A).
From a physical point of view, (2.5) should make sense for any $\psi \in \mathfrak{H}$. However, this is not in the cards as our simple example of one particle already shows. In fact, the reader is invited to find a square integrable function $\psi(x)$ for which $x\psi(x)$ is no longer square integrable. The deeper reason behind
this nuisance is that E
1
(0) +
2
2
(0)) =
1
1
(t) +
2
2
(t) ([
1
[
2
+
[
2
[
2
= 1). In other words, U(t) should be a linear operator. Moreover,
since $\psi(t)$ is a state (i.e., $\|\psi(t)\| = 1$), we have
\[ \|U(t)\psi\| = \|\psi\|. \tag{2.8} \]
Such operators are called unitary. Next, since we have assumed uniqueness of solutions to the initial value problem, we must have
\[ U(0) = \mathbb{I}, \qquad U(t + s) = U(t) U(s). \tag{2.9} \]
A family of unitary operators $U(t)$ having this property is called a one-parameter unitary group. In addition, it is natural to assume that this group is strongly continuous
\[ \lim_{t \to t_0} U(t)\psi = U(t_0)\psi, \qquad \psi \in \mathfrak{H}. \tag{2.10} \]
Each such group has an infinitesimal generator defined by
\[ H\psi = \lim_{t \to 0} \frac{\mathrm{i}}{t} (U(t)\psi - \psi), \qquad D(H) = \Big\{ \psi \in \mathfrak{H} \ \Big|\ \lim_{t \to 0} \frac{1}{t} (U(t)\psi - \psi) \text{ exists} \Big\}. \tag{2.11} \]
This operator is called the Hamiltonian and corresponds to the energy of the system. If $\psi(0) \in D(H)$, then $\psi(t)$ is a solution of the Schrödinger equation (in suitable units)
\[ \mathrm{i} \frac{d}{dt} \psi(t) = H \psi(t). \tag{2.12} \]
This equation will be the main subject of our course.
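The group law (2.9) and the generator formula (2.11) can be checked on the simplest nontrivial example. The sketch below assumes $\mathfrak{H} = \mathbb{C}^2$ and the Hamiltonian $H = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, for which $U(t) = \mathrm{e}^{-\mathrm{i} t H} = \cos(t)\, \mathbb{I} - \mathrm{i} \sin(t)\, H$ in closed form (since $H^2 = \mathbb{I}$); these choices and all helper names are illustrative assumptions, not taken from the text.

```python
import math

H = [[0, 1], [1, 0]]                      # illustrative Hamiltonian with H^2 = I

def U(t):
    """Closed form of exp(-i t H) for the H above."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -1j * s], [-1j * s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(A, psi):
    return [sum(a * x for a, x in zip(row, psi)) for row in A]

def norm(psi):
    return math.sqrt(sum(abs(x) ** 2 for x in psi))

psi = [0.6, 0.8j]

# Group law (2.9): U(t + s) = U(t) U(s)
t, s = 0.3, 0.5
group_defect = max(abs(U(t + s)[i][j] - matmul(U(t), U(s))[i][j])
                   for i in range(2) for j in range(2))

# Generator (2.11): (i / h) (U(h) psi - psi) approximates H psi for small h
h = 1e-6
quotient = [1j / h * (a - b) for a, b in zip(apply(U(h), psi), psi)]
generator_defect = max(abs(a - b) for a, b in zip(quotient, apply(H, psi)))

print(group_defect, abs(norm(apply(U(t), psi)) - norm(psi)), generator_defect)
```

All three printed numbers are small: the first two are at the level of rounding errors, the third is of order $h$.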
In summary, we have the following axioms of quantum mechanics.

Axiom 1. The configuration space of a quantum system is a complex separable Hilbert space $\mathfrak{H}$ and the possible states of this system are represented by the elements of $\mathfrak{H}$ which have norm one.

Axiom 2. Each observable $a$ corresponds to a linear operator $A$ defined maximally on a dense subset $D(A)$. Moreover, the operator corresponding to a polynomial $P_n(a) = \sum_{j=0}^n \alpha_j a^j$, $\alpha_j \in \mathbb{R}$, is $P_n(A) = \sum_{j=0}^n \alpha_j A^j$, $D(P_n(A)) = D(A^n) = \{ \psi \in D(A) \mid A\psi \in D(A^{n-1}) \}$ ($A^0 = \mathbb{I}$).

Axiom 3. The expectation value for a measurement of $a$, when the system is in the state $\psi \in D(A)$, is given by (2.5), which must be real for all $\psi \in D(A)$.

Axiom 4. The time evolution is given by a strongly continuous one-parameter unitary group $U(t)$. The generator of this group corresponds to the energy of the system.

In the following sections we will try to draw some mathematical consequences from these assumptions:

First we will see that Axiom 2 and 3 imply that observables correspond to self-adjoint operators. Hence these operators play a central role in quantum mechanics and we will derive some of their basic properties. Another crucial role is played by the set of all possible expectation values for the measurement of $a$, which is connected with the spectrum $\sigma(A)$ of the corresponding operator $A$.

The problem of defining functions of an observable will lead us to the spectral theorem (in the next chapter), which generalizes the diagonalization of symmetric matrices.

Axiom 4 will be the topic of Chapter 5.
2.2. Self-adjoint operators
Let $\mathfrak{H}$ be a (complex separable) Hilbert space. A linear operator is a linear mapping
\[ A : D(A) \to \mathfrak{H}, \tag{2.13} \]
where $D(A)$ is a linear subspace of $\mathfrak{H}$, called the domain of $A$. It is called bounded if the operator norm
\[ \|A\| = \sup_{\|\psi\|=1} \|A\psi\| = \sup_{\|\varphi\|=\|\psi\|=1} |\langle \psi, A\varphi\rangle| \tag{2.14} \]
is finite. The second equality follows since equality in $|\langle \psi, A\varphi\rangle| \le \|\psi\|\,\|A\varphi\|$ is attained when $A\varphi = z\psi$ for some $z \in \mathbb{C}$. If $A$ is bounded, it is no restriction to assume $D(A) = \mathfrak{H}$ and we will always do so. The Banach space of all bounded linear operators is denoted by $\mathfrak{L}(\mathfrak{H})$.
The expression $\langle \psi, A\psi\rangle$ encountered in the previous section is called the quadratic form
\[ q_A(\psi) = \langle \psi, A\psi\rangle, \qquad \psi \in D(A), \tag{2.15} \]
associated to $A$. An operator can be reconstructed from its quadratic form via the polarization identity
\[ \langle \varphi, A\psi\rangle = \frac{1}{4}\bigl( q_A(\varphi+\psi) - q_A(\varphi-\psi) + i\,q_A(\varphi - i\psi) - i\,q_A(\varphi + i\psi) \bigr). \tag{2.16} \]
A densely defined linear operator $A$ is called symmetric (or Hermitian) if
\[ \langle \varphi, A\psi\rangle = \langle A\varphi, \psi\rangle, \qquad \psi, \varphi \in D(A). \tag{2.17} \]
The justification for this definition is provided by the following
Lemma 2.1. A densely defined operator $A$ is symmetric if and only if the corresponding quadratic form is real-valued.
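Proof. Clearly (2.17) implies that $\mathrm{Im}(q_A(\psi)) = 0$. Conversely, taking the imaginary part of the identity
\[ q_A(\psi + i\varphi) = q_A(\psi) + q_A(\varphi) + i\bigl(\langle \psi, A\varphi\rangle - \langle \varphi, A\psi\rangle\bigr) \tag{2.18} \]
shows $\mathrm{Re}\langle A\varphi, \psi\rangle = \mathrm{Re}\langle \varphi, A\psi\rangle$. Replacing $\varphi$ by $i\varphi$ in this last equation shows $\mathrm{Im}\langle A\varphi, \psi\rangle = \mathrm{Im}\langle \varphi, A\psi\rangle$ and finishes the proof.
In other words, a densely defined operator $A$ is symmetric if and only if
\[ \langle \psi, A\psi\rangle = \langle A\psi, \psi\rangle, \qquad \psi \in D(A). \tag{2.19} \]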
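The polarization identity (2.16) and Lemma 2.1 can be checked numerically in finite dimensions. The following is only a sketch on $\mathbb{C}^2$, where every operator is bounded and domain questions disappear; the sample matrices and vectors are arbitrary choices made for illustration and do not come from the text.

```python
def inner(phi, psi):
    """Inner product on C^n, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

def apply(A, psi):
    """Matrix-vector product A psi."""
    return [sum(a * x for a, x in zip(row, psi)) for row in A]

def q(A, psi):
    """Quadratic form q_A(psi) = <psi, A psi> from (2.15)."""
    return inner(psi, apply(A, psi))

def combine(phi, psi, c):
    """Return the vector phi + c * psi."""
    return [a + c * b for a, b in zip(phi, psi)]

def polarization(A, phi, psi):
    """Right-hand side of the polarization identity (2.16)."""
    return 0.25 * (q(A, combine(phi, psi, 1)) - q(A, combine(phi, psi, -1))
                   + 1j * q(A, combine(phi, psi, -1j))
                   - 1j * q(A, combine(phi, psi, 1j)))

# Arbitrary sample data (assumptions for illustration only).
A_sym = [[2, 1 - 1j], [1 + 1j, -3]]   # equals its conjugate transpose
A_non = [[2, 1j], [1j, -3]]           # does not
phi = [1 + 2j, -1j]
psi = [0.5, 3 - 1j]

# (2.16) holds for every matrix, symmetric or not.
direct = inner(phi, apply(A_non, psi))
via_forms = polarization(A_non, phi, psi)
print(abs(direct - via_forms))

# Lemma 2.1: the quadratic form is real for the symmetric matrix only.
print(q(A_sym, psi).imag, q(A_non, psi).imag)
```

The first printed number should be at rounding level, and only the symmetric matrix should produce a vanishing imaginary part.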
This already narrows the class of admissible operators to the class of
symmetric operators by Axiom 3. Next, let us tackle the issue of the correct
domain.
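By Axiom 2, $A$ should be defined maximally, that is, if $\tilde{A}$ is another symmetric operator such that $A \subseteq \tilde{A}$, then $A = \tilde{A}$. Here we write $A \subseteq \tilde{A}$ if $D(A) \subseteq D(\tilde{A})$ and $A\psi = \tilde{A}\psi$ for all $\psi \in D(A)$. In addition, we write $A = \tilde{A}$ if both $\tilde{A} \subseteq A$ and $A \subseteq \tilde{A}$ hold.
The adjoint operator $A^*$ of a densely defined linear operator $A$ is defined by
\[ D(A^*) = \{\psi \in \mathfrak{H} \mid \exists \tilde{\psi} \in \mathfrak{H} : \langle \psi, A\varphi\rangle = \langle \tilde{\psi}, \varphi\rangle,\ \forall \varphi \in D(A)\}, \qquad A^*\psi = \tilde{\psi}. \tag{2.20} \]
The requirement that $D(A)$ is dense implies that $A^*$ is well-defined. However, note that $D(A^*)$ might not be dense in general. Clearly we have $(\alpha A)^* = \alpha^* A^*$ for $\alpha \in \mathbb{C}$ and $(A + B)^* \supseteq A^* + B^*$ provided $D(A+B) = D(A) \cap D(B)$ is dense. For later use, note that
\[ \mathrm{Ker}(A^*) = \mathrm{Ran}(A)^\perp. \tag{2.21} \]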
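For symmetric operators we clearly have $A \subseteq A^*$. If, in addition, $A = A^*$ holds, then $A$ is called self-adjoint.
Example. (Multiplication operator) Consider the multiplication operator
\[ (Af)(x) = A(x) f(x), \qquad D(A) = \{ f \in L^2(\mathbb{R}^n, d\mu) \mid Af \in L^2(\mathbb{R}^n, d\mu)\}, \tag{2.22} \]
given by multiplication with the measurable function $A : \mathbb{R}^n \to \mathbb{C}$. Note that $D(A)$ is dense: with $\Omega_n = \{x \in \mathbb{R}^n \mid |A(x)| \le n\}$, every $f \in L^2(\mathbb{R}^n, d\mu)$ is the limit of the functions $f_n = \chi_{\Omega_n} f \in D(A)$ by dominated convergence. [...]
For $h, f \in D(A)$ we have
\[ \langle h, Af\rangle = \int h(x)^* A(x) f(x)\,d\mu(x) = \int (A(x)^* h(x))^* f(x)\,d\mu(x) = \langle \tilde{A}h, f\rangle, \tag{2.23} \]
where $\tilde{A}$ is multiplication by $A(x)^*$,
\[ (\tilde{A}f)(x) = A(x)^* f(x), \qquad D(\tilde{A}) = \{ f \in L^2(\mathbb{R}^n, d\mu) \mid \tilde{A}f \in L^2(\mathbb{R}^n, d\mu)\}. \tag{2.24} \]
Note $D(\tilde{A}) = D(A)$. At first sight this seems to show that the adjoint of $A$ is $\tilde{A}$. But for our calculation we had to assume $h \in D(A)$, and there might be some functions in $D(A^*)$ which do not satisfy this requirement. So far the calculation only shows $\tilde{A} \subseteq A^*$. To see that equality holds, note that for $h \in D(A^*)$ there is some $g \in L^2(\mathbb{R}^n, d\mu)$ such that
\[ \int h(x)^* A(x) f(x)\,d\mu(x) = \int g(x)^* f(x)\,d\mu(x), \qquad f \in D(A), \tag{2.25} \]
and thus
\[ \int (h(x) A(x)^* - g(x))^* f(x)\,d\mu(x) = 0, \qquad f \in D(A). \tag{2.26} \]
In particular,
\[ \int \chi_{\Omega_n}(x) (h(x) A(x)^* - g(x))^* f(x)\,d\mu(x) = 0, \qquad f \in L^2(\mathbb{R}^n, d\mu), \tag{2.27} \]
which shows that $\chi_{\Omega_n}(x)(h(x)A(x)^* - g(x))^* \in L^2(\mathbb{R}^n, d\mu)$ vanishes. Since $n$ is arbitrary, we even have $h(x)A(x)^* = g(x) \in L^2(\mathbb{R}^n, d\mu)$ and thus $A^*$ is multiplication by $A(x)^*$ and $D(A^*) = D(A)$.
In particular, $A$ is self-adjoint if $A$ is real-valued. In the general case we have at least $\|Af\| = \|A^*f\|$ for all $f \in D(A) = D(A^*)$. Such operators are called normal.
Now note that
\[ A \subseteq B \quad\Longrightarrow\quad B^* \subseteq A^*, \tag{2.28} \]
that is, increasing the domain of $A$ implies decreasing the domain of $A^*$. Thus there is no point in trying to extend the domain of a self-adjoint operator further. In fact, if $A$ is self-adjoint and $B$ is a symmetric extension, we infer $A \subseteq B \subseteq B^* \subseteq A^* = A$, implying $A = B$.
Corollary 2.2. Self-adjoint operators are maximal, that is, they do not have any symmetric extensions.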
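Furthermore, if $A^*$ is densely defined (which is the case if $A$ is symmetric), we can consider $A^{**}$. From the definition (2.20) it is clear that $A \subseteq A^{**}$, and thus $A^{**}$ is an extension of $A$. [...] It is called the closure $\overline{A} = A^{**}$ of $A$.
If $A$ is symmetric we have $A \subseteq A^*$ and hence $\overline{A} = A^{**} \subseteq A^*$, that is, $\overline{A}$ lies between $A$ and $A^*$. Moreover, $\langle \psi, A^*\varphi\rangle = \langle \overline{A}\psi, \varphi\rangle$ for all $\psi \in D(\overline{A})$, $\varphi \in D(A^*)$ implies that $\overline{A}$ is symmetric, since $A^*\varphi = \overline{A}\varphi$ for $\varphi \in D(\overline{A})$.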
Example. (Differential operator) Take $\mathfrak{H} = L^2(0, 2\pi)$.
(i). Consider the operator
\[ A_0 f = -i \frac{d}{dx} f, \qquad D(A_0) = \{ f \in C^1[0, 2\pi] \mid f(0) = f(2\pi) = 0\}. \tag{2.29} \]
That $A_0$ is symmetric can be shown by a simple integration by parts (do this). Note that the boundary conditions $f(0) = f(2\pi) = 0$ are chosen such that the boundary terms occurring from integration by parts vanish. However, this will also follow once we have computed $A_0^*$. If $g \in D(A_0^*)$ we must have
\[ \int_0^{2\pi} g(x)^* (-i f'(x))\,dx = \int_0^{2\pi} \tilde{g}(x)^* f(x)\,dx \tag{2.30} \]
for some $\tilde{g} \in L^2(0, 2\pi)$. Integration by parts shows
\[ \int_0^{2\pi} f'(x) \Bigl( g(x) - i \int_0^x \tilde{g}(t)\,dt \Bigr)^* dx = 0 \tag{2.31} \]
and hence $g(x) - i\int_0^x \tilde{g}(t)\,dt \in \{f' \mid f \in D(A_0)\}^\perp$. But $\{f' \mid f \in D(A_0)\} = \{h \in C(0, 2\pi) \mid \int_0^{2\pi} h(t)\,dt = 0\}$, implying $g(x) = g(0) + i\int_0^x \tilde{g}(t)\,dt$, since $\overline{\{f' \mid f \in D(A_0)\}} = \{h \in \mathfrak{H} \mid \langle 1, h\rangle = 0\} = \{1\}^\perp$ and $\{1\}^{\perp\perp} = \mathrm{span}\{1\}$. Thus $g \in AC[0, 2\pi]$, where
\[ AC[a, b] = \Bigl\{ f \in C[a, b] \Bigm| f(x) = f(a) + \int_a^x g(t)\,dt,\ g \in L^1(a, b) \Bigr\} \tag{2.32} \]
denotes the set of all absolutely continuous functions (see Section 2.6). In summary, $g \in D(A_0^*)$ implies $g \in AC[0, 2\pi]$ and $A_0^* g = \tilde{g} = -i g'$. Conversely, for every $g \in H^1(0, 2\pi) = \{f \in AC[0, 2\pi] \mid f' \in L^2(0, 2\pi)\}$, (2.30) holds with $\tilde{g} = -i g'$ and we conclude
\[ A_0^* f = -i \frac{d}{dx} f, \qquad D(A_0^*) = H^1(0, 2\pi). \tag{2.33} \]
In particular, $A_0$ is symmetric but not self-adjoint. Since $\overline{A_0} = A_0^{**} \subseteq A_0^*$ we compute
\[ 0 = \langle g, \overline{A_0} f\rangle - \langle A_0^* g, f\rangle = i\bigl( f(0) g(0)^* - f(2\pi) g(2\pi)^* \bigr) \tag{2.34} \]
and since the boundary values of $g \in D(A_0^*)$ can be prescribed arbitrarily, we must have $f(0) = f(2\pi) = 0$. Thus
\[ \overline{A_0} f = -i \frac{d}{dx} f, \qquad D(\overline{A_0}) = \{ f \in D(A_0^*) \mid f(0) = f(2\pi) = 0\}. \tag{2.35} \]
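(ii). Now let us take
\[ A f = -i \frac{d}{dx} f, \qquad D(A) = \{ f \in C^1[0, 2\pi] \mid f(0) = f(2\pi)\}, \tag{2.36} \]
which is clearly an extension of $A_0$. Thus $A^* \subseteq A_0^*$ and we compute
\[ 0 = \langle g, Af\rangle - \langle A^* g, f\rangle = i f(0) \bigl( g(0)^* - g(2\pi)^* \bigr). \tag{2.37} \]
Since this must hold for all $f \in D(A)$ we conclude $g(0) = g(2\pi)$ and
\[ A^* f = -i \frac{d}{dx} f, \qquad D(A^*) = \{ f \in H^1(0, 2\pi) \mid f(0) = f(2\pi)\}. \tag{2.38} \]
Similarly, as before, $\overline{A} = A^*$ and thus $\overline{A}$ is self-adjoint.
[...] The eigenvectors of $A$ are given by
\[ u_n(x) = \frac{1}{\sqrt{2\pi}} e^{inx}, \qquad n \in \mathbb{Z}, \tag{2.40} \]
which are well known to form an orthonormal basis.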
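The contrast between $A_0$ and $A$ shows up directly when the eigenvalue equation is solved by hand. The following worked computation is a sketch, with $u$ denoting a candidate eigenfunction and $z \in \mathbb{C}$ a candidate eigenvalue (both names are chosen here for illustration):

```latex
\begin{align*}
-i\,u'(x) = z\,u(x)
  \quad&\Longrightarrow\quad u(x) = u(0)\, e^{izx},\\
u \in D(A_0):\ u(0) = u(2\pi) = 0
  \quad&\Longrightarrow\quad u \equiv 0 \quad \text{(no eigenvectors at all)},\\
u \in D(A):\ u(0) = u(2\pi)
  \quad&\Longrightarrow\quad u(0)\bigl(1 - e^{2\pi i z}\bigr) = 0
  \quad\Longrightarrow\quad u(0) = 0 \ \text{ or } \ z \in \mathbb{Z},\\
\bigl\| e^{inx} \bigr\|^2 = \int_0^{2\pi} 1\,dx = 2\pi
  \quad&\Longrightarrow\quad u_n(x) = \tfrac{1}{\sqrt{2\pi}}\, e^{inx}, \quad n \in \mathbb{Z}.
\end{align*}
```

So two symmetric operators that agree on a dense set can still differ completely in their eigenvectors: the Dirichlet-type conditions of $A_0$ leave none, while the periodic condition of $A$ yields a full orthonormal basis.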
We will see a bit later that this is a consequence of self-adjointness of $\overline{A}$. Hence it will be important to know whether a given operator is self-adjoint or not. Our example shows that symmetry is easy to check (in case of differential operators it usually boils down to integration by parts), but computing the adjoint of an operator is a nontrivial job even in simple situations. However, we will learn soon that self-adjointness is a much stronger property than symmetry, justifying the additional effort needed to prove it.
On the other hand, if a given symmetric operator $A$ turns out not to be self-adjoint, this raises the question of self-adjoint extensions. Two cases need to be distinguished. If $\overline{A}$ is self-adjoint, then there is only one self-adjoint extension (if $B$ is another one, we have $\overline{A} \subseteq B$ and hence $\overline{A} = B$ by Corollary 2.2). In this case $A$ is called essentially self-adjoint and $D(A)$ is called a core for $\overline{A}$. Otherwise there might be more than one self-adjoint extension or none at all. This situation is more delicate and will be investigated in Section 2.5.
Since we have seen that computing $A^*$ is not always easy, a criterion for self-adjointness not involving $A^*$ will be useful.
Lemma 2.3. Let $A$ be symmetric such that $\mathrm{Ran}(A + z) = \mathrm{Ran}(A + z^*) = \mathfrak{H}$ for one $z \in \mathbb{C}$. Then $A$ is self-adjoint.
Proof. Let $\psi \in D(A^*)$ and $A^*\psi = \tilde{\psi}$. Since $\mathrm{Ran}(A + z^*) = \mathfrak{H}$, there is a $\vartheta \in D(A)$ such that $(A + z^*)\vartheta = \tilde{\psi} + z^*\psi$. Now we compute
\[ \langle \psi, (A + z)\varphi\rangle = \langle \tilde{\psi} + z^*\psi, \varphi\rangle = \langle (A + z^*)\vartheta, \varphi\rangle = \langle \vartheta, (A + z)\varphi\rangle, \qquad \varphi \in D(A), \tag{2.41} \]
and hence $\psi = \vartheta \in D(A)$ since $\mathrm{Ran}(A + z) = \mathfrak{H}$.
To proceed further, we will need more information on the closure of an operator. We will use a different approach which avoids the use of the adjoint operator. We will establish equivalence with our original definition in Lemma 2.4.
The simplest way of extending an operator $A$ is to take the closure of its graph $\Gamma(A) = \{(\psi, A\psi) \mid \psi \in D(A)\} \subset \mathfrak{H}^2$. That is, if $(\psi_n, A\psi_n) \to (\psi, \tilde{\psi})$ we might try to define $A\psi = \tilde{\psi}$. For $A\psi$ to be well-defined, we need that $(\psi_n, A\psi_n) \to (0, \tilde{\psi})$ implies $\tilde{\psi} = 0$. In this case $A$ is called closable and the unique operator $\overline{A}$ which satisfies $\Gamma(\overline{A}) = \overline{\Gamma(A)}$ is called the closure of $A$. Clearly, $A$ is called closed if $A = \overline{A}$, which is the case if and only if the graph of $A$ is closed. Equivalently, $A$ is closed if and only if $\Gamma(A)$ equipped with the graph norm $\|\psi\|^2_{\Gamma(A)} = \|\psi\|^2 + \|A\psi\|^2$ is a Hilbert space (i.e., closed). A bounded operator is closed if and only if its domain is closed (show this!).
Example. Let us compute the closure of the operator $A_0$ from the previous example without the use of the adjoint operator. Let $f \in D(\overline{A_0})$ and let $f_n \in D(A_0)$ be a sequence such that $f_n \to f$, $A_0 f_n \to -ig$. Then $f_n' \to g$ and hence $f(x) = \int_0^x g(t)\,dt$. Thus $f \in AC[0, 2\pi]$ and $f(0) = 0$. Moreover, $f(2\pi) = \lim_{n\to\infty} \int_0^{2\pi} f_n'(t)\,dt = 0$. Conversely, any such $f$ can be approximated by functions in $D(A_0)$ (show this).
Next, let us collect a few important results.
Lemma 2.4. Suppose $A$ is a densely defined operator.
(i) $A^*$ is closed.
(ii) $A$ is closable if and only if $D(A^*)$ is dense, and $\overline{A} = A^{**}$ respectively $(\overline{A})^* = A^*$ in this case.
(iii) If $A$ is injective and $\mathrm{Ran}(A)$ is dense, then $(A^*)^{-1} = (A^{-1})^*$. If $A$ is closable and $\overline{A}$ is injective, then $\overline{A}^{-1} = \overline{A^{-1}}$.
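Proof. Let us consider the following two unitary operators from $\mathfrak{H}^2$ to itself
\[ U(\varphi, \psi) = (\psi, -\varphi), \qquad V(\varphi, \psi) = (\psi, \varphi). \tag{2.42} \]
(i). From
\begin{align*}
\Gamma(A^*) &= \{ (\varphi, \tilde{\varphi}) \in \mathfrak{H}^2 \mid \langle \varphi, A\psi\rangle = \langle \tilde{\varphi}, \psi\rangle\ \ \forall \psi \in D(A) \} \\
&= \{ (\varphi, \tilde{\varphi}) \in \mathfrak{H}^2 \mid \langle (-\tilde{\varphi}, \varphi), (\psi, \tilde{\psi})\rangle_{\mathfrak{H}^2} = 0\ \ \forall (\psi, \tilde{\psi}) \in \Gamma(A) \} \\
&= U(\Gamma(A)^\perp) = (U\Gamma(A))^\perp \tag{2.43}
\end{align*}
we conclude that $A^*$ is closed.
(ii). From
\[ \overline{\Gamma(A)} = \Gamma(A)^{\perp\perp} = (U\Gamma(A^*))^\perp = \{ (\psi, \tilde{\psi}) \mid \langle \psi, A^*\varphi\rangle - \langle \tilde{\psi}, \varphi\rangle = 0,\ \forall \varphi \in D(A^*) \} \tag{2.44} \]
we see that $(0, \tilde{\psi}) \in \overline{\Gamma(A)}$ if and only if $\tilde{\psi} \in D(A^*)^\perp$. Hence $A$ is closable if and only if $D(A^*)$ is dense. In this case, equation (2.43) also shows $(\overline{A})^* = A^*$. Moreover, replacing $A$ by $A^*$ in (2.43) and comparing with the last formula shows $A^{**} = \overline{A}$.
(iii). Next note that (provided $A$ is injective)
\[ \Gamma(A^{-1}) = V\Gamma(A). \tag{2.45} \]
Hence if $\mathrm{Ran}(A)$ is dense, then $\mathrm{Ker}(A^*) = \mathrm{Ran}(A)^\perp = \{0\}$ and
\[ \Gamma((A^*)^{-1}) = V\Gamma(A^*) = VU\Gamma(A)^\perp = UV\Gamma(A)^\perp = U(V\Gamma(A))^\perp \tag{2.46} \]
shows that $(A^*)^{-1} = (A^{-1})^*$. [...] we obtain
Theorem 2.5. We have $A \in \mathfrak{L}(\mathfrak{H})$ if and only if $A^* \in \mathfrak{L}(\mathfrak{H})$.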
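Now we can also generalize Lemma 2.3 to the case of essentially self-adjoint operators.
Lemma 2.6. A symmetric operator $A$ is essentially self-adjoint if and only if one of the following conditions holds for one $z \in \mathbb{C}\setminus\mathbb{R}$.
$\overline{\mathrm{Ran}(A + z)} = \overline{\mathrm{Ran}(A + z^*)} = \mathfrak{H}$.
$\mathrm{Ker}(A^* + z) = \mathrm{Ker}(A^* + z^*) = \{0\}$.
If $A$ is non-negative, that is, $\langle \psi, A\psi\rangle \ge 0$ for all $\psi \in D(A)$, we can also admit $z \in (0, \infty)$.
Proof. As noted earlier, $\mathrm{Ker}(A^*) = \mathrm{Ran}(A)^\perp$ [...]
[...] $\ell_\varphi(\psi) = \langle A^*\varphi, \psi\rangle$ is pointwise bounded [...] $\|\ell_\varphi\| = \|A^*\varphi\| \le C$. That is, $A^*$ is bounded and so is $A = A^{**}$.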
Finally we want to draw some further consequences of Axiom 2 and show that observables correspond to self-adjoint operators. Since self-adjoint operators are already maximal, the difficult part remaining is to show that an observable has at least one self-adjoint extension. There is a good way of doing this for non-negative operators and hence we will consider this case first.
An operator is called non-negative (resp. positive) if $\langle \psi, A\psi\rangle \ge 0$ (resp. $> 0$ for $\psi \ne 0$) for all $\psi \in D(A)$. If $A$ is positive, the map $(\varphi, \psi) \mapsto \langle \varphi, A\psi\rangle$ is a scalar product. However, there might be sequences which are Cauchy with respect to this scalar product but not with respect to our original one. To avoid this, we introduce the scalar product
\[ \langle \varphi, \psi\rangle_A = \langle \varphi, (A + 1)\psi\rangle, \qquad A \ge 0, \tag{2.50} \]
defined on $D(A)$, which satisfies $\|\psi\| \le \|\psi\|_A$. Let $\mathfrak{H}_A$ be the completion of $D(A)$ with respect to the above scalar product. We claim that $\mathfrak{H}_A$ can be regarded as a subspace of $\mathfrak{H}$, that is, $D(A) \subseteq \mathfrak{H}_A \subseteq \mathfrak{H}$.
If $(\psi_n)$ is a Cauchy sequence in $D(A)$ with respect to $\|\cdot\|_A$, then it is also Cauchy in $\mathfrak{H}$ (since $\|\psi\| \le \|\psi\|_A$ by assumption) and hence we can identify it with the limit of $(\psi_n)$ regarded as a sequence in $\mathfrak{H}$. For this identification to be unique, we need to show that if $(\psi_n) \subset D(A)$ is a Cauchy sequence in $\mathfrak{H}_A$ such that $\|\psi_n\| \to 0$, then $\|\psi_n\|_A \to 0$. This follows from
\[ \|\psi_n\|_A^2 = \langle \psi_n, \psi_n - \psi_m\rangle_A + \langle \psi_n, \psi_m\rangle_A \le \|\psi_n\|_A \|\psi_n - \psi_m\|_A + \|\psi_n\|\,\|(A + 1)\psi_m\| \tag{2.51} \]
since the right hand side can be made arbitrarily small by choosing $m, n$ large.
Clearly the quadratic form $q_A$ can be extended to every $\psi \in \mathfrak{H}_A$ by setting
\[ q_A(\psi) = \langle \psi, \psi\rangle_A - \|\psi\|^2, \qquad \psi \in Q(A) = \mathfrak{H}_A. \tag{2.52} \]
The set $Q(A)$ is also called the form domain of $A$.
Example. (Multiplication operator) Let $A$ be multiplication by $A(x) \ge 0$ in $L^2(\mathbb{R}^n, d\mu)$. Then
\[ Q(A) = D(A^{1/2}) = \{ f \in L^2(\mathbb{R}^n, d\mu) \mid A^{1/2} f \in L^2(\mathbb{R}^n, d\mu)\} \tag{2.53} \]
and
\[ q_A(f) = \int_{\mathbb{R}^n} A(x) |f(x)|^2\,d\mu(x) \tag{2.54} \]
(show this).
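Now we come to our extension result. Note that $A + 1$ is injective and the best we can hope for is that for a non-negative extension $\tilde{A}$, $\tilde{A} + 1$ is a bijection from $D(\tilde{A})$ onto $\mathfrak{H}$.
Lemma 2.8. Suppose $A$ is a non-negative operator. Then there is a non-negative extension $\tilde{A}$ such that $\mathrm{Ran}(\tilde{A} + 1) = \mathfrak{H}$.
Proof. Let us define an operator $\tilde{A}$ by
\[ D(\tilde{A}) = \{ \psi \in \mathfrak{H}_A \mid \exists \tilde{\psi} \in \mathfrak{H} : \langle \varphi, \psi\rangle_A = \langle \varphi, \tilde{\psi}\rangle,\ \forall \varphi \in \mathfrak{H}_A \}, \qquad \tilde{A}\psi = \tilde{\psi} - \psi. \tag{2.55} \]
Since $\mathfrak{H}_A$ is dense, $\tilde{\psi}$ is well-defined. Moreover, it is straightforward to see that $\tilde{A}$ is a non-negative extension of $A$.
It is also not hard to see that $\mathrm{Ran}(\tilde{A} + 1) = \mathfrak{H}$. Indeed, for any $\tilde{\psi} \in \mathfrak{H}$, the map $\varphi \mapsto \langle \tilde{\psi}, \varphi\rangle$ is a bounded linear functional on $\mathfrak{H}_A$. Hence, by the Riesz lemma, there is an element $\psi \in \mathfrak{H}_A$ such that $\langle \tilde{\psi}, \varphi\rangle = \langle \psi, \varphi\rangle_A$ for all $\varphi \in \mathfrak{H}_A$. By the definition of $\tilde{A}$, $(\tilde{A} + 1)\psi = \tilde{\psi}$ and hence $\tilde{A} + 1$ is onto.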
Now it is time for another
Example. Let us take $\mathfrak{H} = L^2(0, \pi)$ and consider the operator
\[ A f = -\frac{d^2}{dx^2} f, \qquad D(A) = \{ f \in C^2[0, \pi] \mid f(0) = f(\pi) = 0\}, \tag{2.56} \]
which corresponds to the one-dimensional model of a particle confined to a box.
(i). First of all, using integration by parts twice, it is straightforward to check that $A$ is symmetric
\[ \int_0^\pi g(x)^* (-f'')(x)\,dx = \int_0^\pi g'(x)^* f'(x)\,dx = \int_0^\pi (-g'')(x)^* f(x)\,dx. \tag{2.57} \]
Note that the boundary conditions $f(0) = f(\pi) = 0$ are chosen such that the boundary terms occurring from integration by parts vanish. Moreover, the same calculation also shows that $A$ is positive
\[ \int_0^\pi f(x)^* (-f'')(x)\,dx = \int_0^\pi |f'(x)|^2\,dx > 0, \qquad f \ne 0. \tag{2.58} \]
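The symmetry (2.57) and positivity (2.58) have a simple discrete analogue. The sketch below replaces $-d^2/dx^2$ on $(0, \pi)$ by the standard second-difference quotient on a uniform grid with zero boundary values; the grid size and the two test functions are arbitrary choices made for illustration, and the finite-difference model is an assumption of this sketch, not something taken from the text.

```python
import math

def dirichlet_laplacian(f, h):
    """Second-difference approximation of -f'' on interior grid points,
    using the boundary values f(0) = f(pi) = 0."""
    padded = [0.0] + list(f) + [0.0]
    return [(2 * padded[j] - padded[j - 1] - padded[j + 1]) / h ** 2
            for j in range(1, len(padded) - 1)]

def inner(g, f, h):
    """Riemann-sum approximation of the L^2(0, pi) inner product (real data)."""
    return h * sum(a * b for a, b in zip(g, f))

n = 200                                  # interior grid points (arbitrary)
h = math.pi / (n + 1)
x = [h * (j + 1) for j in range(n)]
f = [t * (math.pi - t) for t in x]       # vanishes at 0 and pi
g = [math.sin(t) ** 2 for t in x]        # vanishes at 0 and pi

lhs = inner(g, dirichlet_laplacian(f, h), h)          # discrete <g, A f>
rhs = inner(dirichlet_laplacian(g, h), f, h)          # discrete <A g, f>
positivity = inner(f, dirichlet_laplacian(f, h), h)   # discrete <f, A f>
print(lhs, rhs, positivity)
```

The first two numbers agree up to rounding because the difference matrix is symmetric, which mirrors the vanishing boundary terms in (2.57), and the third is strictly positive as in (2.58).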
(ii). Next let us show $\mathfrak{H}_A = \{ f \in H^1(0, \pi) \mid f(0) = f(\pi) = 0\}$. In fact, since
\[ \langle g, f\rangle_A = \int_0^\pi \bigl( g'(x)^* f'(x) + g(x)^* f(x) \bigr)\,dx, \tag{2.59} \]
we see that $f_n$ is Cauchy in $\mathfrak{H}_A$ if and only if both $f_n$ and $f_n'$ are Cauchy in $L^2(0, \pi)$. Thus $f_n \to f$ and $f_n' \to g$ in $L^2(0, \pi)$, and $f_n(x) = \int_0^x f_n'(t)\,dt$ implies $f(x) = \int_0^x g(t)\,dt$. Thus $f \in AC[0, \pi]$. Moreover, $f(0) = 0$ is obvious and from $0 = f_n(\pi) = \int_0^\pi f_n'(t)\,dt$ we have $f(\pi) = \lim_{n\to\infty} \int_0^\pi f_n'(t)\,dt = 0$. So we have $\mathfrak{H}_A \subseteq \{ f \in H^1(0, \pi) \mid f(0) = f(\pi) = 0\}$. To see the converse, approximate $f'$ by smooth functions $g_n$. Using $g_n - \frac{1}{\pi}\int_0^\pi g_n(t)\,dt$ instead of $g_n$, it is no restriction to assume $\int_0^\pi g_n(t)\,dt = 0$. Now define $f_n(x) = \int_0^x g_n(t)\,dt$ and note $f_n \in D(A)$, $f_n \to f$.
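(iii). Finally, let us compute the extension $\tilde{A}$. We have $f \in D(\tilde{A})$ if there is an $\tilde{f}$ such that $\langle g, f\rangle_A = \langle g, \tilde{f}\rangle$ for all $g \in \mathfrak{H}_A$. That is,
\[ \int_0^\pi g'(x)^* f'(x)\,dx = \int_0^\pi g(x)^* (\tilde{f}(x) - f(x))\,dx. \tag{2.60} \]
Integration by parts on the right hand side shows
\[ \int_0^\pi g'(x)^* f'(x)\,dx = -\int_0^\pi g'(x)^* \int_0^x (\tilde{f}(t) - f(t))\,dt\,dx \tag{2.61} \]
or equivalently
\[ \int_0^\pi g'(x)^* \Bigl( f'(x) + \int_0^x (\tilde{f}(t) - f(t))\,dt \Bigr) dx = 0. \tag{2.62} \]
Now observe $\{ g' \in \mathfrak{H} \mid g \in \mathfrak{H}_A \} = \{ h \in \mathfrak{H} \mid \int_0^\pi h(t)\,dt = 0\} = \{1\}^\perp$ and thus $f'(x) + \int_0^x (\tilde{f}(t) - f(t))\,dt \in \{1\}^{\perp\perp} = \mathrm{span}\{1\}$. So we see $f \in H^2(0, \pi) = \{ f \in AC[0, \pi] \mid f' \in H^1(0, \pi)\}$ and $\tilde{A} f = \tilde{f} - f = -f''$. The converse is straightforward to check, and hence
\[ \tilde{A} f = -\frac{d^2}{dx^2} f, \qquad D(\tilde{A}) = \{ f \in H^2[0, \pi] \mid f(0) = f(\pi) = 0\}. \tag{2.63} \]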
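[...] and $(A + B)^* \supseteq A^* + B^*$ (where $D(A^* + B^*) = D(A^*) \cap D(B^*)$) [...]
[...] $A^*A$ (with $D(A^*A) = \{ \psi \in D(A) \mid A\psi \in D(A^*)\}$) is self-adjoint. (Hint: $A^*A \ge 0$.)
Problem 2.9. Show that $A$ is normal if and only if $AA^* = A^*A$.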
2.3. Resolvents and spectra
Let $A$ be a (densely defined) closed operator. The resolvent set of $A$ is defined by
\[ \rho(A) = \{ z \in \mathbb{C} \mid (A - z)^{-1} \in \mathfrak{L}(\mathfrak{H}) \}. \tag{2.66} \]
More precisely, $z \in \rho(A)$ if and only if $(A - z) : D(A) \to \mathfrak{H}$ is bijective and its inverse is bounded. By the closed graph theorem (Theorem 2.7), it suffices to check that $A - z$ is bijective. The complement of the resolvent set is called the spectrum
\[ \sigma(A) = \mathbb{C} \setminus \rho(A) \tag{2.67} \]
of $A$. In particular, $z \in \sigma(A)$ if $A - z$ has a nontrivial kernel. A vector $\psi \in \mathrm{Ker}(A - z)$ is called an eigenvector and $z$ is called an eigenvalue in this case.
The function
\[ R_A : \rho(A) \to \mathfrak{L}(\mathfrak{H}), \qquad z \mapsto (A - z)^{-1} \tag{2.68} \]
is called the resolvent of $A$. Note the convenient formula
\[ R_A(z)^* = ((A - z)^{-1})^* = ((A - z)^*)^{-1} = (A^* - z^*)^{-1} = R_{A^*}(z^*). \tag{2.69} \]
In particular,
\[ \rho(A^*) = \rho(A)^*. \tag{2.70} \]
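Example. (Multiplication operator) Consider again the multiplication operator
\[ (Af)(x) = A(x) f(x), \qquad D(A) = \{ f \in L^2(\mathbb{R}^n, d\mu) \mid Af \in L^2(\mathbb{R}^n, d\mu)\}, \tag{2.71} \]
given by multiplication with the measurable function $A : \mathbb{R}^n \to \mathbb{C}$. Clearly $(A - z)^{-1}$ is given by the multiplication operator
\[ (A - z)^{-1} f(x) = \frac{1}{A(x) - z} f(x), \qquad D((A - z)^{-1}) = \Bigl\{ f \in L^2(\mathbb{R}^n, d\mu) \Bigm| \frac{1}{A - z} f \in L^2(\mathbb{R}^n, d\mu) \Bigr\} \tag{2.72} \]
whenever this operator is bounded. But $\|(A - z)^{-1}\| = \|\frac{1}{A - z}\|_\infty \le \frac{1}{\varepsilon}$ is equivalent to $\mu(\{x \mid |A(x) - z| < \varepsilon\}) = 0$ and hence
\[ \rho(A) = \{ z \in \mathbb{C} \mid \exists \varepsilon > 0 : \mu(\{x \mid |A(x) - z| < \varepsilon\}) = 0 \}. \tag{2.73} \]
Moreover, $z$ is an eigenvalue of $A$ if $\mu(A^{-1}(\{z\})) > 0$, and $\chi_{A^{-1}(\{z\})}$ is a corresponding eigenfunction in this case.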
Example. (Differential operator) Consider again the differential operator
\[ A f = -i \frac{d}{dx} f, \qquad D(A) = \{ f \in AC[0, 2\pi] \mid f' \in L^2,\ f(0) = f(2\pi)\} \tag{2.74} \]
in $L^2(0, 2\pi)$. We already know that the eigenvalues of $A$ are the integers and that the corresponding normalized eigenfunctions
\[ u_n(x) = \frac{1}{\sqrt{2\pi}} e^{inx} \tag{2.75} \]
form an orthonormal basis.
To compute the resolvent we must find the solution of the corresponding inhomogeneous equation $-i f'(x) - z f(x) = g(x)$. [...]
For $z, z' \in \rho(A)$ we have the first resolvent formula
\[ R_A(z) - R_A(z') = (z - z') R_A(z) R_A(z') = (z - z') R_A(z') R_A(z). \tag{2.80} \]
In fact,
\[ (A - z)^{-1} - (z - z')(A - z)^{-1}(A - z')^{-1} = (A - z)^{-1}\bigl( 1 - (z - A + A - z')(A - z')^{-1} \bigr) = (A - z')^{-1}, \tag{2.81} \]
which proves the first equality. The second follows after interchanging $z$ and $z'$. Now fix $z' = z_0$ and use (2.80) recursively to obtain
\[ R_A(z) = \sum_{j=0}^n (z - z_0)^j R_A(z_0)^{j+1} + (z - z_0)^{n+1} R_A(z_0)^{n+1} R_A(z). \tag{2.82} \]
The sequence of bounded operators
\[ R_n = \sum_{j=0}^n (z - z_0)^j R_A(z_0)^{j+1} \tag{2.83} \]
converges to a bounded operator if $|z - z_0| < \|R_A(z_0)\|^{-1}$, and clearly we expect $z \in \rho(A)$ and $R_n \to R_A(z)$ in this case. Let $R_\infty = \lim_{n\to\infty} R_n$ and set $\varphi_n = R_n\psi$, $\varphi = R_\infty\psi$ [...] $R_\infty = R_A(z)$ as anticipated.
If $A$ is bounded, a similar argument verifies the Neumann series for the resolvent
\[ R_A(z) = -\sum_{j=0}^{n-1} \frac{A^j}{z^{j+1}} + \frac{1}{z^n} A^n R_A(z) = -\sum_{j=0}^{\infty} \frac{A^j}{z^{j+1}}, \qquad |z| > \|A\|. \tag{2.86} \]
In summary we have proved the following
Theorem 2.11. The resolvent set $\rho(A)$ is open and $R_A : \rho(A) \to \mathfrak{L}(\mathfrak{H})$ is holomorphic, that is, it has an absolutely convergent power series expansion around every point $z_0 \in \rho(A)$. In addition,
\[ \|R_A(z)\| \ge \mathrm{dist}(z, \sigma(A))^{-1} \tag{2.87} \]
and if $A$ is bounded we have $\{ z \in \mathbb{C} \mid |z| > \|A\| \} \subseteq \rho(A)$.
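For matrices both the first resolvent formula (2.80) and the Neumann series (2.86) can be tested directly. The following sketch works with $2 \times 2$ complex matrices in plain Python; the matrix `A`, the points `z` and `w`, and the truncation length are arbitrary choices made for illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def max_abs(A):
    return max(abs(a) for row in A for a in row)

IDENTITY = [[1, 0], [0, 1]]

def resolvent(A, z):
    """R_A(z) = (A - z)^{-1}."""
    return inverse(sub(A, scale(z, IDENTITY)))

A = [[0, 1], [1, 0]]        # sample matrix with eigenvalues +1, -1 and norm 1
z, w = 2 + 1j, -0.5j        # two points of the resolvent set

# First resolvent formula (2.80): R(z) - R(w) = (z - w) R(z) R(w).
lhs = sub(resolvent(A, z), resolvent(A, w))
rhs = scale(z - w, matmul(resolvent(A, z), resolvent(A, w)))
err_first = max_abs(sub(lhs, rhs))

# Truncated Neumann series (2.86): -sum_j A^j / z^(j+1), valid as |z| > ||A||.
partial = [[0, 0], [0, 0]]
power = IDENTITY
for j in range(60):
    partial = sub(partial, scale(1 / z ** (j + 1), power))
    power = matmul(power, A)
err_neumann = max_abs(sub(partial, resolvent(A, z)))
print(err_first, err_neumann)
```

Both errors should be at rounding level; moving `z` inside the disc $|z| \le \|A\|$ makes the truncated series diverge, in line with the restriction in (2.86).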
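As a consequence we obtain the useful
Lemma 2.12. We have $z \in \sigma(A)$ if there is a sequence $\psi_n \in D(A)$ such that $\|\psi_n\| = 1$ and $\|(A - z)\psi_n\| \to 0$. If $z$ is a boundary point of $\rho(A)$, then the converse is also true. Such a sequence is called a Weyl sequence.
Proof. Let $\psi_n$ be a Weyl sequence. Then $z \in \rho(A)$ is impossible by $1 = \|\psi_n\| = \|R_A(z)(A - z)\psi_n\| \le \|R_A(z)\|\,\|(A - z)\psi_n\| \to 0$. Conversely, by (2.87) there is a sequence $z_n \to z$ in $\rho(A)$ and corresponding vectors $\varphi_n \in \mathfrak{H}$ such that $\|R_A(z_n)\varphi_n\|\,\|\varphi_n\|^{-1} \to \infty$. Let $\psi_n = R_A(z_n)\varphi_n$ and rescale $\varphi_n$ such that $\|\psi_n\| = 1$. Then $\|\varphi_n\| \to 0$ and hence
\[ \|(A - z)\psi_n\| = \|\varphi_n + (z_n - z)\psi_n\| \le \|\varphi_n\| + |z - z_n| \to 0 \tag{2.88} \]
shows that $\psi_n$ is a Weyl sequence.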
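A concrete Weyl sequence may help to see how a point can lie in the spectrum without being an eigenvalue. The following worked example is a sketch under the assumption that $A$ is multiplication by $x$ on $L^2(0, 1)$ with Lebesgue measure (a special case of the multiplication operator (2.71)), with $z \in [0, 1)$ and $n$ large enough that $z + \frac{1}{n} \le 1$:

```latex
\begin{align*}
\psi_n &= \sqrt{n}\,\chi_{[z,\,z+1/n]},
  & \|\psi_n\|^2 &= n \cdot \tfrac{1}{n} = 1,\\
\|(A - z)\psi_n\|^2 &= n \int_z^{z+1/n} (x - z)^2\,dx
  = n \cdot \frac{1}{3n^3} = \frac{1}{3n^2} \;\longrightarrow\; 0.
\end{align*}
```

Hence $z \in \sigma(A)$ by Lemma 2.12, although $(x - z) f(x) = 0$ almost everywhere forces $f = 0$, so $z$ is not an eigenvalue. This agrees with (2.73), since every interval around $z$ has positive measure.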
Let us also note the following spectral mapping result.
Lemma 2.13. Suppose $A$ is injective. Then
\[ \sigma(A^{-1}) \setminus \{0\} = (\sigma(A) \setminus \{0\})^{-1}. \tag{2.89} \]
In addition, we have $A\psi = z\psi$ if and only if $A^{-1}\psi = z^{-1}\psi$.
Proof. Suppose $z \in \rho(A) \setminus \{0\}$. Then we claim
\[ R_{A^{-1}}(z^{-1}) = -z A R_A(z) = -z - z^2 R_A(z). \tag{2.90} \]
In fact, the right hand side is a bounded operator from $\mathfrak{H} \to \mathrm{Ran}(A) = D(A^{-1})$ and
\[ (A^{-1} - z^{-1})(-z A R_A(z))\varphi = (-z + A) R_A(z)\varphi = \varphi, \qquad \varphi \in \mathfrak{H}. \tag{2.91} \]
Conversely, if $\psi \in D(A^{-1}) = \mathrm{Ran}(A)$ we have $\psi = A\varphi$ and hence
\[ (-z A R_A(z))(A^{-1} - z^{-1})\psi = A R_A(z)((A - z)\varphi) = A\varphi = \psi. \tag{2.92} \]
Thus $z^{-1} \in \rho(A^{-1})$. The rest follows after interchanging the roles of $A$ and $A^{-1}$.
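Next, let us characterize the spectra of self-adjoint operators.
Theorem 2.14. Let $A$ be symmetric. Then $A$ is self-adjoint if and only if $\sigma(A) \subseteq \mathbb{R}$, and $A \ge 0$ if and only if $\sigma(A) \subseteq [0, \infty)$. Moreover, $\|R_A(z)\| \le |\mathrm{Im}(z)|^{-1}$ and, if $A \ge 0$, $\|R_A(\lambda)\| \le |\lambda|^{-1}$, $\lambda < 0$.
Proof. If $\sigma(A) \subseteq \mathbb{R}$, then $\mathrm{Ran}(A + z) = \mathfrak{H}$, $z \in \mathbb{C}\setminus\mathbb{R}$, and hence $A$ is self-adjoint by Lemma 2.6. Conversely, if $A$ is self-adjoint (resp. $A \ge 0$), then $R_A(z)$ exists for $z \in \mathbb{C}\setminus\mathbb{R}$ (resp. $z \in \mathbb{C}\setminus[0, \infty)$) and satisfies the given estimates, as has been shown in the proof of Lemma 2.6.
In particular, we obtain
Theorem 2.15. Let $A$ be self-adjoint. Then
\[ \inf \sigma(A) = \inf_{\psi \in D(A),\ \|\psi\| = 1} \langle \psi, A\psi\rangle. \tag{2.93} \]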
For the eigenvalues and corresponding eigenfunctions we have:
Lemma 2.16. Let $A$ be symmetric. Then all eigenvalues are real and eigenvectors corresponding to different eigenvalues are orthogonal.
Proof. If $A\psi_j = \lambda_j\psi_j$, $j = 1, 2$, we have
\[ \lambda_1 \|\psi_1\|^2 = \langle \psi_1, \lambda_1\psi_1\rangle = \langle \psi_1, A\psi_1\rangle = \langle A\psi_1, \psi_1\rangle = \langle \lambda_1\psi_1, \psi_1\rangle = \lambda_1^* \|\psi_1\|^2 \tag{2.94} \]
and
\[ (\lambda_1 - \lambda_2)\langle \psi_1, \psi_2\rangle = \langle A\psi_1, \psi_2\rangle - \langle \psi_1, A\psi_2\rangle = 0, \tag{2.95} \]
finishing the proof.
The result does not imply that two linearly independent eigenfunctions to the same eigenvalue are orthogonal. However, it is no restriction to assume that they are, since we can use Gram-Schmidt to find an orthonormal basis for $\mathrm{Ker}(A - \lambda)$. If $\mathfrak{H}$ is finite dimensional, we can always find an orthonormal basis of eigenvectors. In the infinite dimensional case this is no longer true in general. However, if there is an orthonormal basis of eigenvectors, then $A$ is essentially self-adjoint.
Theorem 2.17. Suppose $A$ is a symmetric operator which has an orthonormal basis of eigenfunctions $\{\varphi_j\}$. Then $A$ is essentially self-adjoint. In particular, it is essentially self-adjoint on $\mathrm{span}\{\varphi_j\}$.
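Proof. Consider the set of all finite linear combinations $\psi = \sum_{j=0}^n c_j \varphi_j$, which is dense in $\mathfrak{H}$. Then $\phi = \sum_{j=0}^n \frac{c_j}{\lambda_j \pm i} \varphi_j \in D(A)$ and $(A \pm i)\phi = \psi$ shows that $\mathrm{Ran}(A \pm i)$ is dense.
In addition, we note the following asymptotic expansion for the resolvent.
Lemma 2.18. Suppose $A$ is self-adjoint. For every $\psi \in \mathfrak{H}$ we have
\[ \lim_{\mathrm{Im}(z) \to \infty} \|A R_A(z)\psi\| = 0. \tag{2.96} \]
In particular, if $\psi \in D(A^n)$, then
\[ R_A(z)\psi = -\sum_{j=0}^n \frac{A^j\psi}{z^{j+1}} + o\Bigl(\frac{1}{z^{n+1}}\Bigr), \qquad \text{as } \mathrm{Im}(z) \to \infty. \tag{2.97} \]
Proof. It suffices to prove the first claim since the second then follows as in (2.86).
Write $\psi = \tilde{\psi} + \varphi$, where $\tilde{\psi} \in D(A)$ and $\|\varphi\| \le \varepsilon$. Then
\[ \|A R_A(z)\psi\| \le \|R_A(z) A\tilde{\psi}\| + \|A R_A(z)\varphi\| \le \frac{\|A\tilde{\psi}\|}{\mathrm{Im}(z)} + \|\varphi\|, \tag{2.98} \]
by (2.48), finishing the proof.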
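Similarly, we can characterize the spectra of unitary operators. Recall that a bijection $U$ is called unitary if $\langle U\psi, U\psi\rangle = \langle \psi, U^*U\psi\rangle = \langle \psi, \psi\rangle$. Thus $U$ is unitary if and only if
\[ U^* = U^{-1}. \tag{2.99} \]
Theorem 2.19. Let $U$ be unitary. Then $\sigma(U) \subseteq \{ z \in \mathbb{C} \mid |z| = 1\}$. All eigenvalues have modulus one and eigenvectors corresponding to different eigenvalues are orthogonal.
Proof. Since $\|U\| \le 1$ we have $\sigma(U) \subseteq \{ z \in \mathbb{C} \mid |z| \le 1\}$. Moreover, $U^{-1}$ is also unitary and hence $\sigma(U) \subseteq \{ z \in \mathbb{C} \mid |z| \ge 1\}$ by Lemma 2.13. If $U\psi_j = z_j\psi_j$, $j = 1, 2$, we have
\[ (z_1 - z_2)\langle \psi_1, \psi_2\rangle = \langle U^*\psi_1, \psi_2\rangle - \langle \psi_1, U\psi_2\rangle = 0 \tag{2.100} \]
since $U\psi = z\psi$ implies $U^*\psi = U^{-1}\psi = z^{-1}\psi = z^*\psi$.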
Problem 2.10. What is the spectrum of an orthogonal projection?
Problem 2.11. Compute the resolvent of $Af = f'$, $D(A) = \{ f \in H^1[0, 1] \mid f(0) = 0\}$ and show that unbounded operators can have empty spectrum.
Problem 2.12. Compute the eigenvalues and eigenvectors of $A = -\frac{d^2}{dx^2}$, $D(A) = \{ f \in H^2(0, \pi) \mid f(0) = f(\pi) = 0\}$. Compute the resolvent of $A$.
Problem 2.13. Find a Weyl sequence for the self-adjoint operator $A = -\frac{d^2}{dx^2}$, $D(A) = H^2(\mathbb{R})$ for $z \in (0, \infty)$. What is $\sigma(A)$? (Hint: Cut off the solutions of $-u''(x) = z\,u(x)$ [...])
Problem 2.14. [...]
\[ R_{AA^*}(z) = \frac{1}{z}\bigl( A R_{A^*A}(z) A^* - 1 \bigr), \qquad R_{A^*A}(z) = \frac{1}{z}\bigl( A^* R_{AA^*}(z) A - 1 \bigr). \tag{2.101} \]
2.4. Orthogonal sums of operators
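Let $\mathfrak{H}_j$, $j = 1, 2$, be two given Hilbert spaces and let $A_j : D(A_j) \to \mathfrak{H}_j$ be two given operators. Setting $\mathfrak{H} = \mathfrak{H}_1 \oplus \mathfrak{H}_2$ we can define an operator
\[ A = A_1 \oplus A_2, \qquad D(A) = D(A_1) \oplus D(A_2) \tag{2.102} \]
by setting $A(\psi_1 + \psi_2) = A_1\psi_1 + A_2\psi_2$ for $\psi_j \in D(A_j)$. Clearly $A$ is closed, (essentially) self-adjoint, etc., if and only if both $A_1$ and $A_2$ are. The same considerations apply to countable orthogonal sums
\[ A = \bigoplus_j A_j, \qquad D(A) = \bigoplus_j D(A_j) \tag{2.103} \]
and we have
Theorem 2.20. Suppose $A_j$ are self-adjoint operators on $\mathfrak{H}_j$. Then $A = \bigoplus_j A_j$ is self-adjoint and
\[ R_A(z) = \bigoplus_j R_{A_j}(z), \qquad z \in \rho(A) = \mathbb{C} \setminus \sigma(A), \tag{2.104} \]
where
\[ \sigma(A) = \overline{\bigcup_j \sigma(A_j)} \tag{2.105} \]
(the closure can be omitted if there are only finitely many terms).
Proof. By $\mathrm{Ran}(A \pm i) = (A \pm i)D(A) = \bigoplus_j (A_j \pm i)D(A_j) = \bigoplus_j \mathfrak{H}_j = \mathfrak{H}$ we see that $A$ is self-adjoint. Moreover, if $z \in \sigma(A_j)$ there is a corresponding Weyl sequence $\psi_n \in D(A_j) \subseteq D(A)$ and hence $z \in \sigma(A)$. Conversely, if $z \notin \overline{\bigcup_j \sigma(A_j)}$, set $\varepsilon = d(z, \overline{\bigcup_j \sigma(A_j)}) > 0$. Then $\|R_{A_j}(z)\| \le \varepsilon^{-1}$ and hence $\|\bigoplus_j R_{A_j}(z)\| \le \varepsilon^{-1}$ shows that $z \in \rho(A)$.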
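Conversely, given an operator $A$ it might be useful to write $A$ as an orthogonal sum and investigate each part separately.
Let $\mathfrak{H}_1 \subseteq \mathfrak{H}$ be a closed subspace and let $P_1$ be the corresponding projector. We say that $\mathfrak{H}_1$ reduces the operator $A$ if $P_1 A \subseteq A P_1$. Note that this implies $P_1 D(A) \subseteq D(A)$. Moreover, if we set $\mathfrak{H}_2 = \mathfrak{H}_1^\perp$, we have $\mathfrak{H} = \mathfrak{H}_1 \oplus \mathfrak{H}_2$ and $P_2 = I - P_1$ reduces $A$ as well.
Lemma 2.21. Suppose $\mathfrak{H}_1 \subseteq \mathfrak{H}$ reduces $A$. Then $A = A_1 \oplus A_2$, where
\[ A_j\psi = A\psi, \qquad D(A_j) = P_j D(A) \subseteq D(A). \tag{2.106} \]
If $A$ is closable, then $\mathfrak{H}_1$ also reduces $\overline{A}$ and
\[ \overline{A} = \overline{A_1} \oplus \overline{A_2}. \tag{2.107} \]
Proof. As already noted, $P_1 D(A) \subseteq D(A)$ and hence $P_2 D(A) = (I - P_1)D(A) \subseteq D(A)$. Thus we see $D(A) = D(A_1) \oplus D(A_2)$. Moreover, if $\psi \in D(A_j)$ we have $A\psi = A P_j\psi = P_j A\psi \in \mathfrak{H}_j$ and thus $A_j : D(A_j) \to \mathfrak{H}_j$, which proves the first claim.
Now let us turn to the second claim. Clearly $\overline{A} \subseteq \overline{A_1} \oplus \overline{A_2}$. Conversely, suppose $\psi \in D(\overline{A})$. Then there is a sequence $\psi_n \in D(A)$ such that $\psi_n \to \psi$ and $A\psi_n \to \overline{A}\psi$. Then $P_j\psi_n \to P_j\psi$ and $A P_j\psi_n = P_j A\psi_n \to P_j\overline{A}\psi$. In particular, $P_j\psi \in D(\overline{A})$ and $\overline{A} P_j\psi = P_j\overline{A}\psi$.
If $A$ is self-adjoint, then $\mathfrak{H}_1$ reduces $A$ if $P_1 D(A) \subseteq D(A)$ and $A P_1\psi \in \mathfrak{H}_1$ for every $\psi \in D(A)$. In fact, if $\psi \in D(A)$ we can write $\psi = \psi_1 \oplus \psi_2$, $\psi_j = P_j\psi \in D(A)$. Since $A P_1\psi = A\psi_1$ and $P_1 A\psi = P_1 A\psi_1 + P_1 A\psi_2 = A\psi_1 + P_1 A\psi_2$, we need to show $P_1 A\psi_2 = 0$. But this follows since
\[ \langle \varphi, P_1 A\psi_2\rangle = \langle A P_1\varphi, \psi_2\rangle = 0 \tag{2.108} \]
for every $\varphi \in D(A)$.
Problem 2.15. Show $(A_1 \oplus A_2)^* = A_1^* \oplus A_2^*$.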
2.5. Self-adjoint extensions
It is safe to skip this entire section on first reading.
In many physical applications a symmetric operator is given. If this operator turns out to be essentially self-adjoint, there is a unique self-adjoint extension and everything is fine. However, if it is not, it is important to find out if there are self-adjoint extensions at all (for physical problems there better are) and to classify them.
In Section 2.2 we have seen that $A$ is essentially self-adjoint if $\mathrm{Ker}(A^* - z) = \mathrm{Ker}(A^* - z^*) = \{0\}$ for one $z \in \mathbb{C}\setminus\mathbb{R}$. Hence self-adjointness is related to the dimension of these spaces and one calls the numbers
\[ d_\pm(A) = \dim K_\pm, \qquad K_\pm = \mathrm{Ran}(A \pm i)^\perp = \mathrm{Ker}(A^* \mp i), \tag{2.109} \]
defect indices of $A$ (we have chosen $z = i$ for simplicity, any other $z \in \mathbb{C}\setminus\mathbb{R}$ would be as good). If $d_-(A) = d_+(A) = 0$ there is one self-adjoint extension of $A$, namely $\overline{A}$. But what happens in the general case? Is there more than one extension, or maybe none at all? These questions can be answered by virtue of the Cayley transform
\[ V = (A - i)(A + i)^{-1} : \mathrm{Ran}(A + i) \to \mathrm{Ran}(A - i). \tag{2.110} \]
Theorem 2.22. The Cayley transform is a bijection from the set of all symmetric operators $A$ to the set of all isometric operators $V$ (i.e., $\|V\varphi\| = \|\varphi\|$ for all $\varphi \in D(V)$) for which $\mathrm{Ran}(1 - V)$ is dense.
Proof. Since $A$ is symmetric we have $\|(A \pm i)\psi\|^2 = \|A\psi\|^2 + \|\psi\|^2$ for all $\psi \in D(A)$ by a straightforward computation. And thus for every $\varphi = (A + i)\psi \in D(V) = \mathrm{Ran}(A + i)$ we have
\[ \|V\varphi\| = \|(A - i)\psi\| = \|(A + i)\psi\| = \|\varphi\|. \tag{2.111} \]
Next observe
\[ 1 \pm V = ((A + i) \pm (A - i))(A + i)^{-1} = \begin{cases} 2A(A + i)^{-1} \\ 2i(A + i)^{-1} \end{cases}, \tag{2.112} \]
which shows $D(A) = \mathrm{Ran}(1 - V)$ and
\[ A = i(1 + V)(1 - V)^{-1}. \tag{2.113} \]
Conversely, let $V$ be given and use the last equation to define $A$.
Since $V$ is isometric we have $\langle (1 \pm V)\varphi, (1 \mp V)\varphi\rangle = \pm 2i\,\mathrm{Im}\langle V\varphi, \varphi\rangle$ for all $\varphi \in D(V)$ by a straightforward computation. And thus for every $\psi = (1 - V)\varphi \in D(A) = \mathrm{Ran}(1 - V)$ we have
\[ \langle A\psi, \psi\rangle = -i\langle (1 + V)\varphi, (1 - V)\varphi\rangle = i\langle (1 - V)\varphi, (1 + V)\varphi\rangle = \langle \psi, A\psi\rangle, \tag{2.114} \]
that is, $A$ is symmetric. Finally observe
\[ A \pm i = i((1 + V) \pm (1 - V))(1 - V)^{-1} = \begin{cases} 2i(1 - V)^{-1} \\ 2iV(1 - V)^{-1} \end{cases}, \tag{2.115} \]
which shows that $V$ is the Cayley transform of $A$ and finishes the proof.
Thus $A$ is self-adjoint if and only if its Cayley transform $V$ is unitary. Moreover, finding a self-adjoint extension of $A$ is equivalent to finding a unitary extension of $V$, and this in turn is equivalent to (taking the closure and) finding a unitary operator from $D(V)^\perp$ to $\mathrm{Ran}(V)^\perp$. This is possible if and only if both spaces have the same dimension, that is, if and only if $d_+(A) = d_-(A)$.
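In finite dimensions every symmetric operator is self-adjoint, so its Cayley transform (2.110) should be unitary and (2.113) should recover the operator. The sketch below checks this for one Hermitian $2 \times 2$ matrix in plain Python; the matrix and the helper names are arbitrary choices made for illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B, c=1):
    """Return A + c * B."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def dagger(A):
    """Conjugate transpose."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def max_abs(A):
    return max(abs(a) for row in A for a in row)

I2 = [[1, 0], [0, 1]]
A = [[1, 2 - 1j], [2 + 1j, -1]]      # Hermitian sample matrix

# Cayley transform (2.110): V = (A - i)(A + i)^{-1}.
V = matmul(add(A, I2, -1j), inverse(add(A, I2, 1j)))
unitarity_defect = max_abs(add(matmul(dagger(V), V), I2, -1))

# Inverse transform (2.113): A = i (1 + V)(1 - V)^{-1}.
A_back = [[1j * x for x in row]
          for row in matmul(add(I2, V), inverse(add(I2, V, -1)))]
recovery_defect = max_abs(add(A_back, A, -1))
print(unitarity_defect, recovery_defect)
```

Both defects should be at rounding level. The interesting phenomena of this section (nonzero defect indices, several extensions) cannot occur for matrices, so this only illustrates the algebra of (2.110) to (2.113).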
Theorem 2.23. A symmetric operator has self-adjoint extensions if and only if its defect indices are equal.
In this case let $A_1$ be a self-adjoint extension, $V_1$ its Cayley transform. Then
\[ D(A_1) = D(A) + (1 - V_1)K_+ = \{ \psi + \varphi_+ - V_1\varphi_+ \mid \psi \in D(A),\ \varphi_+ \in K_+ \} \tag{2.116} \]
and
\[ A_1(\psi + \varphi_+ - V_1\varphi_+) = A\psi + i\varphi_+ + iV_1\varphi_+. \tag{2.117} \]
Moreover,
\[ (A_1 \pm i)^{-1} = (A \pm i)^{-1} \oplus \frac{\mp i}{2} \sum_j \langle \varphi_j^\pm, \cdot\rangle (\varphi_j^\pm - \varphi_j^\mp), \tag{2.118} \]
where $\{\varphi_j^+\}$ is an orthonormal basis for $K_+$ and $\varphi_j^- = V_1\varphi_j^+$.
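Corollary 2.24. Suppose $A$ is a closed symmetric operator with equal defect indices $d = d_+(A) = d_-(A)$. [...]
[...] $(A + z^*)(A + z)^{-1}$ for any $z \in \mathbb{C}\setminus\mathbb{R}$. Let $d_\pm(z) = \dim K_\pm(z)$, $K_+(z) = \mathrm{Ran}(A + z)^\perp$ respectively $K_-(z) = \mathrm{Ran}(A + z^*)^\perp$. [...] $d_\pm(A)$.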
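Example. Recall the operator $A = -i\frac{d}{dx}$, $D(A) = \{ f \in H^1(0, 2\pi) \mid f(0) = f(2\pi) = 0\}$ with adjoint $A^* = -i\frac{d}{dx}$, $D(A^*) = H^1(0, 2\pi)$.
Clearly
\[ K_\pm = \mathrm{span}\{e^{\mp x}\} \tag{2.119} \]
is one dimensional and hence all unitary maps are of the form
\[ V_\theta e^{2\pi - x} = e^{i\theta} e^{x}, \qquad \theta \in [0, 2\pi). \tag{2.120} \]
The functions in the domain of the corresponding operator $A_\theta$ are given by
\[ f_\theta(x) = f(x) + \alpha\bigl( e^{2\pi - x} - e^{i\theta} e^{x} \bigr), \qquad f \in D(A),\ \alpha \in \mathbb{C}. \tag{2.121} \]
In particular, $f_\theta$ satisfies
\[ f_\theta(2\pi) = e^{i\tilde{\theta}} f_\theta(0), \qquad e^{i\tilde{\theta}} = \frac{1 - e^{i\theta} e^{2\pi}}{e^{2\pi} - e^{i\theta}} \tag{2.122} \]
and thus we have
\[ D(A_\theta) = \{ f \in H^1(0, 2\pi) \mid f(2\pi) = e^{i\tilde{\theta}} f(0)\}. \tag{2.123} \]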
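That the quotient in (2.122) really is a phase, so that (2.123) is a boundary condition of the form $f(2\pi) = e^{i\tilde\theta} f(0)$, can be checked numerically. The following sketch evaluates it on a few sample angles; the function name `boundary_phase` and the sample grid are choices made here for illustration.

```python
import cmath
import math

def boundary_phase(theta):
    """The quotient (1 - e^{i theta} e^{2 pi}) / (e^{2 pi} - e^{i theta}) from (2.122)."""
    w = cmath.exp(1j * theta)
    e2pi = math.exp(2 * math.pi)
    return (1 - w * e2pi) / (e2pi - w)

thetas = [k * math.pi / 4 for k in range(8)]
moduli = [abs(boundary_phase(t)) for t in thetas]
print(moduli)

# Two special cases: theta = pi gives the periodic condition f(2 pi) = f(0),
# theta = 0 gives the antiperiodic condition f(2 pi) = -f(0).
print(boundary_phase(math.pi), boundary_phase(0.0))
```

All moduli should equal one up to rounding, and the value at $\theta = \pi$ reproduces the periodic boundary condition of the self-adjoint operator found in (2.38).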
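[...] Then $\{K\varphi_j\}$ is an orthonormal set in $\mathrm{Ran}(A - i)^\perp$. Hence $\{\varphi_j\}$ is an orthonormal basis for $\mathrm{Ran}(A + i)^\perp$ if and only if $\{K\varphi_j\}$ is an orthonormal basis for $\mathrm{Ran}(A - i)^\perp$. Hence the two spaces have the same dimension.
Finally, we note the following useful formula for the difference of resolvents of self-adjoint extensions.
Lemma 2.27. If $A_j$, $j = 1, 2$, are self-adjoint extensions and if $\{\varphi_j(z)\}$ is an orthonormal basis for $\mathrm{Ker}(A^* - z^*)$, then
\[ (A_1 - z)^{-1} - (A_2 - z)^{-1} = \sum_{j,k} \bigl( \alpha^1_{jk}(z) - \alpha^2_{jk}(z) \bigr) \langle \varphi_k(z), \cdot\rangle\, \varphi_j(z^*), \tag{2.125} \]
where
\[ \alpha^l_{jk}(z) = \langle \varphi_j(z^*), (A_l - z)^{-1}\varphi_k(z)\rangle. \tag{2.126} \]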
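Proof. First observe that $((A_1 - z)^{-1} - (A_2 - z)^{-1})\varphi$ is zero for every $\varphi \in \mathrm{Ran}(A - z)$. Hence it suffices to consider it for vectors $\varphi = \sum_j \langle \varphi_j(z), \varphi\rangle \varphi_j(z) \in \mathrm{Ran}(A - z)^\perp$. Hence we have
\[ (A_1 - z)^{-1} - (A_2 - z)^{-1} = \sum_j \langle \varphi_j(z), \cdot\rangle\, \psi_j(z), \tag{2.127} \]
where
\[ \psi_j(z) = ((A_1 - z)^{-1} - (A_2 - z)^{-1})\varphi_j(z). \tag{2.128} \]
Now computing the adjoint once using $((A_j - z)^{-1})^* = (A_j - z^*)^{-1}$ and once using $(\sum_j \langle \varphi_j, \cdot\rangle \psi_j)^* = \sum_j \langle \psi_j, \cdot\rangle \varphi_j$ we obtain
\[ \sum_j \langle \varphi_j(z^*), \cdot\rangle\, \psi_j(z^*) = \sum_j \langle \psi_j(z), \cdot\rangle\, \varphi_j(z). \tag{2.129} \]
Interchanging $z$ and $z^*$ and evaluating at $\varphi_k(z)$ implies
\[ \psi_k(z) = \sum_j \langle \psi_j(z^*), \varphi_k(z)\rangle\, \varphi_j(z^*) \tag{2.130} \]
and finishes the proof.
Problem 2.16. Compute the defect indices of $A_0 = i\frac{d}{dx}$, $D(A_0) = C_c^\infty((0, \infty))$. Can you give a self-adjoint extension of $A_0$?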
2.6. Appendix: Absolutely continuous functions
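Let $(a, b) \subseteq \mathbb{R}$ be some interval. We denote by
\[ AC(a, b) = \Bigl\{ f \in C(a, b) \Bigm| f(x) = f(c) + \int_c^x g(t)\,dt,\ c \in (a, b),\ g \in L^1_{loc}(a, b) \Bigr\} \tag{2.131} \]
the set of all absolutely continuous functions. That is, $f$ is absolutely continuous if and only if it can be written as the integral of some locally integrable function. Note that $AC(a, b)$ is a vector space.
By Corollary A.33, $f(x) = f(c) + \int_c^x g(t)\,dt$ is differentiable a.e. (with respect to Lebesgue measure) and $f'(x) = g(x)$. [...] If $f, g \in AC[a, b]$ we have the formula of partial integration
\[ \int_a^b f(x) g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x) g(x)\,dx. \tag{2.133} \]
We set
\[ H^m(a, b) = \{ f \in L^2(a, b) \mid f^{(j)} \in AC(a, b),\ f^{(j+1)} \in L^2(a, b),\ 0 \le j \le m - 1 \}. \tag{2.134} \]
Then we have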
Lemma 2.28. Suppose $f \in H^m(a, b)$, $m \ge 1$. Then $f$ is bounded and $\lim_{x \downarrow a} f^{(j)}(x)$ respectively $\lim_{x \uparrow b} f^{(j)}(x)$ exist for $0 \le j \le m - 1$. Moreover, the limit is zero if the endpoint is infinite.
Proof. If the endpoint is finite, then $f^{(j+1)}$ is integrable near this endpoint and hence the claim follows. If the endpoint is infinite, note that
\[ |f^{(j)}(x)|^2 = |f^{(j)}(c)|^2 + 2\int_c^x \mathrm{Re}\bigl( f^{(j)}(t)^* f^{(j+1)}(t) \bigr)\,dt \tag{2.135} \]
shows that the limit exists (dominated convergence). Since $f^{(j)}$ is square integrable the limit must be zero.
Let me remark that it suffices to check that the function and its highest derivative are in $L^2$; the lower derivatives are then automatically in $L^2$. That is,
\[ H^m(a, b) = \{ f \in L^2(a, b) \mid f^{(j)} \in AC(a, b),\ 0 \le j \le m - 1,\ f^{(m)} \in L^2(a, b) \}. \tag{2.136} \]
For a finite endpoint this is straightforward. For an infinite endpoint this can also be shown directly, but it is much easier to use the Fourier transform (compare Section 7.1).
Problem 2.17. Show (2.133). (Hint: Fubini)
Problem 2.18. Show that $H^1(a, b)$ together with the norm
\[ \|f\|_{2,1}^2 = \int_a^b |f(t)|^2\,dt + \int_a^b |f'(t)|^2\,dt \tag{2.137} \]
is a Hilbert space.
Problem 2.19. What is the closure of $C_0^\infty(a, b)$ in $H^1(a, b)$? (Hint: Start with the case where $(a, b)$ is finite.)
Chapter 3
The spectral theorem
The time evolution of a quantum mechanical system is governed by the Schrödinger equation
\[ i\frac{d}{dt}\psi(t) = H\psi(t). \tag{3.1} \]
If $\mathfrak{H} = \mathbb{C}^n$, and $H$ is hence a matrix, this system of ordinary differential equations is solved by the matrix exponential
\[ \psi(t) = \exp(-itH)\psi(0). \tag{3.2} \]
This matrix exponential can be defined by a convergent power series
\[ \exp(-itH) = \sum_{n=0}^\infty \frac{(-it)^n}{n!} H^n. \tag{3.3} \]
For this approach the boundedness of $H$ is crucial, which might not be the case for a quantum system. However, the best way to compute the matrix exponential, and to understand the underlying dynamics, is to diagonalize $H$. But how do we diagonalize a self-adjoint operator? The answer is known as the spectral theorem.
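For a matrix, the two routes to $\exp(-itH)$ mentioned above can be compared directly. The sketch below sums the power series (3.3) and compares it with the expression obtained from diagonalization, $\sum_\lambda e^{-it\lambda} P_\lambda$, where $P_\lambda$ are the eigenprojections. The matrix (the Pauli matrix $\sigma_x$), the time `t`, and the truncation length are arbitrary choices made for illustration.

```python
import cmath

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(H, t, terms=40):
    """Partial sum of (3.3): sum over k < terms of (-it)^k / k! * H^k."""
    n = len(H)
    result = [[0j] * n for _ in range(n)]
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(terms):
        result = [[r + s for r, s in zip(rr, tr)] for rr, tr in zip(result, term)]
        # Next term: multiply by H and by (-it) / (k + 1).
        term = [[(-1j * t) / (k + 1) * x for x in row] for row in matmul(term, H)]
    return result

H = [[0, 1], [1, 0]]    # sigma_x: eigenvalues +1, -1, eigenvectors (1, +-1)/sqrt(2)
t = 0.7

# Eigenprojections of sigma_x onto the eigenvalues +1 and -1.
P_plus = [[0.5, 0.5], [0.5, 0.5]]
P_minus = [[0.5, -0.5], [-0.5, 0.5]]

U_series = exp_series(H, t)
U_diag = [[cmath.exp(-1j * t) * p + cmath.exp(1j * t) * m
           for p, m in zip(rp, rm)] for rp, rm in zip(P_plus, P_minus)]
error = max(abs(a - b) for ra, rb in zip(U_series, U_diag) for a, b in zip(ra, rb))
print(error)
```

The diagonalized form is the one that survives for unbounded operators: the power series needs $\|H\| < \infty$, whereas $e^{-it\lambda}$ is a bounded function of $\lambda$, which is the point of view the spectral theorem makes precise.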
3.1. The spectral theorem
In this section we want to address the problem of defining functions of a self-adjoint operator $A$ in a natural way, that is, such that
\[ (f + g)(A) = f(A) + g(A), \qquad (fg)(A) = f(A)g(A), \qquad (f^*)(A) = f(A)^*. \tag{3.4} \]
As long as $f$ and $g$ are polynomials, no problems arise. If we want to extend this definition to a larger class of functions, we will need to perform some limiting procedure. Hence we could consider convergent power series or equip the space of polynomials with the sup norm. In both cases this only works if the operator $A$ is bounded. To overcome this limitation, we will use characteristic functions $\chi_\Omega(A)$ [...] Since $\chi_\Omega(\lambda)^2 = \chi_\Omega(\lambda)$, the corresponding operators should be orthogonal projections. Moreover, we should also have $\chi_{\mathbb{R}}(A) = I$ and $\chi_\Omega(A) = \sum_{j=1}^n \chi_{\Omega_j}(A)$ for any finite union $\Omega = \bigcup_{j=1}^n \Omega_j$ of disjoint sets. The only remaining problem is of course the definition of $\chi_\Omega(A)$.
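Denote the Borel sigma algebra of $\mathbb{R}$ by $\mathfrak{B}$. A projection-valued measure is a map
\[ P: \mathfrak{B} \to \mathfrak{L}(\mathfrak{H}), \quad \Omega \mapsto P(\Omega), \tag{3.5} \]
from the Borel sets to the set of orthogonal projections, that is, $P(\Omega)^* = P(\Omega)$ and $P(\Omega)^2 = P(\Omega)$, such that the following two conditions hold:
(i) $P(\mathbb{R}) = \mathbb{I}$.
(ii) If $\Omega = \bigcup_n \Omega_n$ with $\Omega_n \cap \Omega_m = \emptyset$ for $n \ne m$, then $\sum_n P(\Omega_n)\psi = P(\Omega)\psi$ for every $\psi \in \mathfrak{H}$ (strong $\sigma$-additivity).
Note that we require strong convergence, $\sum_n P(\Omega_n)\psi = P(\Omega)\psi$, rather than norm convergence, $\sum_n P(\Omega_n) = P(\Omega)$. In fact, norm convergence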
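does not even hold in the simplest case where $\mathfrak{H} = L^2(I)$ and $P(\Omega) = \chi_\Omega$ (multiplication by the characteristic function of $\Omega$). Furthermore, it even suffices to require weak convergence, since weak convergence of orthogonal projections to an orthogonal projection implies strong convergence: $\|P_n\psi\|^2 = \langle\psi, P_n\psi\rangle \to \langle\psi, P\psi\rangle = \|P\psi\|^2$.
Example. Let $\mathfrak{H} = \mathbb{C}^n$ and let $A \in \mathrm{GL}(n)$ be some symmetric matrix. Let $\lambda_1, \dots, \lambda_m$ be its (distinct) eigenvalues and let $P_j$ be the projections onto the corresponding eigenspaces. Then
\[ P_A(\Omega) = \sum_{\{j \,|\, \lambda_j \in \Omega\}} P_j \tag{3.6} \]
is a projection-valued measure.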
Example. Let $\mathfrak{H} = L^2(\mathbb{R})$ and let $f$ be a real-valued measurable function. Then
\[ P(\Omega) = \chi_{f^{-1}(\Omega)} \tag{3.7} \]
is a projection-valued measure (Problem 3.2).
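It is straightforward to verify that any projection-valued measure satisfies
\[ P(\emptyset) = 0, \quad P(\mathbb{R}\setminus\Omega) = \mathbb{I} - P(\Omega), \tag{3.8} \]
and
\[ P(\Omega_1 \cup \Omega_2) + P(\Omega_1 \cap \Omega_2) = P(\Omega_1) + P(\Omega_2). \tag{3.9} \]
Moreover, we also have
\[ P(\Omega_1)P(\Omega_2) = P(\Omega_1 \cap \Omega_2). \tag{3.10} \]
Indeed, suppose $\Omega_1 \cap \Omega_2 = \emptyset$ first. Then, taking the square of (3.9) we infer
\[ P(\Omega_1)P(\Omega_2) + P(\Omega_2)P(\Omega_1) = 0. \tag{3.11} \]
Multiplying this equation from the right by $P(\Omega_2)$ shows that $P(\Omega_1)P(\Omega_2) = -P(\Omega_2)P(\Omega_1)P(\Omega_2)$ is self-adjoint and thus $P(\Omega_1)P(\Omega_2) = P(\Omega_2)P(\Omega_1) = 0$. For the general case $\Omega_1 \cap \Omega_2 \ne \emptyset$ we now have
\[ P(\Omega_1)P(\Omega_2) = \big(P(\Omega_1\setminus\Omega_2) + P(\Omega_1\cap\Omega_2)\big)\big(P(\Omega_2\setminus\Omega_1) + P(\Omega_1\cap\Omega_2)\big) = P(\Omega_1 \cap \Omega_2) \tag{3.12} \]
as stated.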
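To every projection-valued measure there corresponds a resolution of the identity
\[ P(\lambda) = P((-\infty, \lambda]) \tag{3.13} \]
which has the properties (Problem 3.3):
(i) $P(\lambda)$ is an orthogonal projection.
(ii) $P(\lambda_1) \le P(\lambda_2)$ (that is, $\langle\psi, P(\lambda_1)\psi\rangle \le \langle\psi, P(\lambda_2)\psi\rangle$) or equivalently $\mathrm{Ran}(P(\lambda_1)) \subseteq \mathrm{Ran}(P(\lambda_2))$ for $\lambda_1 \le \lambda_2$.
(iii) $\text{s-}\lim_{\lambda_n \downarrow \lambda} P(\lambda_n) = P(\lambda)$ (strong right continuity).
(iv) $\text{s-}\lim_{\lambda \to -\infty} P(\lambda) = 0$ and $\text{s-}\lim_{\lambda \to +\infty} P(\lambda) = \mathbb{I}$.
Picking $\psi \in \mathfrak{H}$ we obtain a finite Borel measure $\mu_\psi(\Omega) = \langle\psi, P(\Omega)\psi\rangle = \|P(\Omega)\psi\|^2$ with $\mu_\psi(\mathbb{R}) = \|\psi\|^2 < \infty$. The corresponding distribution function is given by $\mu_\psi(\lambda) = \langle\psi, P(\lambda)\psi\rangle$ and since for every distribution function there is a unique Borel measure (Theorem A.2), for every resolution of the identity there is a unique projection-valued measure.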
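Using the polarization identity (2.16) we also have the following complex Borel measures
\[ \mu_{\varphi,\psi}(\Omega) = \langle\varphi, P(\Omega)\psi\rangle = \frac{1}{4}\big(\mu_{\varphi+\psi}(\Omega) - \mu_{\varphi-\psi}(\Omega) + \mathrm{i}\mu_{\varphi-\mathrm{i}\psi}(\Omega) - \mathrm{i}\mu_{\varphi+\mathrm{i}\psi}(\Omega)\big). \tag{3.14} \]
Note also that, by Cauchy-Schwarz, $|\mu_{\varphi,\psi}(\Omega)| \le \|\varphi\|\,\|\psi\|$.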
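Now let us turn to integration with respect to our projection-valued measure. For any simple function $f = \sum_{j=1}^n \alpha_j \chi_{\Omega_j}$ (where $\Omega_j = f^{-1}(\alpha_j)$) we set
\[ P(f) \equiv \int_{\mathbb{R}} f(\lambda)\,dP(\lambda) = \sum_{j=1}^n \alpha_j P(\Omega_j). \tag{3.15} \]
In particular, $P(\chi_\Omega) = P(\Omega)$. Then $\langle\varphi, P(f)\psi\rangle = \sum_j \alpha_j \mu_{\varphi,\psi}(\Omega_j)$ shows
\[ \langle\varphi, P(f)\psi\rangle = \int_{\mathbb{R}} f(\lambda)\,d\mu_{\varphi,\psi}(\lambda) \tag{3.16} \]
and, by linearity of the integral, the operator $P$ is a linear map from the set of simple functions into the set of bounded linear operators on $\mathfrak{H}$. Moreover, $\|P(f)\psi\|^2 = \sum_j |\alpha_j|^2 \mu_\psi(\Omega_j)$ (the sets $\Omega_j$ are disjoint) shows
\[ \|P(f)\psi\|^2 = \int_{\mathbb{R}} |f(\lambda)|^2\,d\mu_\psi(\lambda). \tag{3.17} \]
Equipping the set of simple functions with the sup norm this implies
\[ \|P(f)\psi\| \le \|f\|_\infty \|\psi\| \tag{3.18} \]
that $P$ has norm one. Since the simple functions are dense in the Banach space of bounded Borel functions $B(\mathbb{R})$, there is a unique extension of $P$ to a bounded linear operator $P: B(\mathbb{R}) \to \mathfrak{L}(\mathfrak{H})$ (whose norm is one) from the bounded Borel functions on $\mathbb{R}$ (with sup norm) to the set of bounded linear operators on $\mathfrak{H}$. In particular, (3.16) and (3.17) remain true.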
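There is some additional structure behind this extension. Recall that the set $\mathfrak{L}(\mathfrak{H})$ of all bounded linear mappings on $\mathfrak{H}$ forms a $C^*$ algebra. A $C^*$ algebra homomorphism $\phi$ is a linear map between two $C^*$ algebras which respects both the multiplication and the adjoint, that is, $\phi(ab) = \phi(a)\phi(b)$ and $\phi(a^*) = \phi(a)^*$.
Theorem 3.1. Let $P(\Omega)$ be a projection-valued measure on $\mathfrak{H}$. Then the operator
\[ P: B(\mathbb{R}) \to \mathfrak{L}(\mathfrak{H}), \quad f \mapsto \int_{\mathbb{R}} f(\lambda)\,dP(\lambda) \tag{3.19} \]
is a $C^*$ algebra homomorphism with norm one. In addition, if $f_n(\lambda) \to f(\lambda)$ pointwise and if the sup norms $\|f_n\|_\infty$ are uniformly bounded, then $P(f_n) \to P(f)$ strongly.
Proof. The properties $P(1) = \mathbb{I}$, $P(f^*) = P(f)^*$, and $P(fg) = P(f)P(g)$ are straightforward for simple functions $f$. For general $f$ they follow from continuity. Hence $P$ is a $C^*$ algebra homomorphism. The last claim follows from the dominated convergence theorem and (3.17).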
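As a consequence, observe
\[ \langle P(g)\varphi, P(f)\psi\rangle = \int_{\mathbb{R}} g^*(\lambda) f(\lambda)\,d\mu_{\varphi,\psi}(\lambda) \tag{3.20} \]
and thus
\[ d\mu_{P(g)\varphi, P(f)\psi} = g^* f\,d\mu_{\varphi,\psi}. \tag{3.21} \]
Example. Let $\mathfrak{H} = \mathbb{C}^n$ and $A \in \mathrm{GL}(n)$ respectively $P_A$ as in the previous example. Then
\[ P_A(f) = \sum_{j=1}^m f(\lambda_j) P_j. \tag{3.22} \]
In particular, $P_A(f) = A$ for $f(\lambda) = \lambda$.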
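As a small illustration of (3.6) and (3.22), here is a sketch for one concrete symmetric matrix. The matrix $A$ with rows $(2, 1)$ and $(1, 2)$, its eigenvalues 1 and 3, the hand-computed eigenprojections, and the function names are all assumptions made for this example.

```python
import math

# Assumed example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
EIGENVALUES = [1.0, 3.0]
PROJECTIONS = [
    [[0.5, -0.5], [-0.5, 0.5]],   # onto span{(1, -1)}, eigenvalue 1
    [[0.5, 0.5], [0.5, 0.5]],     # onto span{(1, 1)},  eigenvalue 3
]

def apply_function(f):
    """f(A) = sum_j f(lambda_j) P_j as a 2x2 nested list, cf. (3.22)."""
    return [[sum(f(lam) * p[i][j] for lam, p in zip(EIGENVALUES, PROJECTIONS))
             for j in range(2)] for i in range(2)]

def spectral_projection(omega):
    """P_A(Omega) for a set Omega given as a predicate, cf. (3.6)."""
    return apply_function(lambda lam: 1.0 if omega(lam) else 0.0)

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

identity_function = apply_function(lambda lam: lam)            # gives back A
product_rule_lhs = apply_function(lambda lam: math.exp(lam) * lam ** 2)
product_rule_rhs = matmul(apply_function(math.exp),
                          apply_function(lambda lam: lam ** 2))
below_two = spectral_projection(lambda lam: lam < 2)           # equals P_1
```

The two `product_rule_*` matrices agree up to rounding, which is the multiplicativity $(fg)(A) = f(A)g(A)$ from (3.4) in this finite-dimensional setting.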
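Next we want to define this operator for unbounded Borel functions. Since we expect the resulting operator to be unbounded, we need a suitable domain first. Motivated by (3.17) we set
\[ \mathfrak{D}_f = \Big\{\psi \in \mathfrak{H} \,\Big|\, \int_{\mathbb{R}} |f(\lambda)|^2\,d\mu_\psi(\lambda) < \infty\Big\}. \tag{3.23} \]
This is clearly a linear subspace of $\mathfrak{H}$ since $\mu_{\alpha\psi}(\Omega) = |\alpha|^2\mu_\psi(\Omega)$ and since $\mu_{\varphi+\psi}(\Omega) \le 2(\mu_\varphi(\Omega) + \mu_\psi(\Omega))$ (by the triangle inequality).
For every $\psi \in \mathfrak{D}_f$, the bounded Borel function
\[ f_n = \chi_{\Omega_n} f, \quad \Omega_n = \{\lambda \,|\, |f(\lambda)| \le n\}, \tag{3.24} \]
converges to $f$ in the sense of $L^2(\mathbb{R}, d\mu_\psi)$. Moreover, because of (3.17), $P(f_n)\psi$ converges to some vector $\tilde\psi$. We define $P(f)\psi = \tilde\psi$. By construction, $P(f)$ is a linear operator such that (3.16) and (3.17) hold.
In addition, $\mathfrak{D}_f$ is dense. Indeed, let $\Omega_n$ be defined as in (3.24) and abbreviate $\psi_n = P(\Omega_n)\psi$. Now observe that $d\mu_{\psi_n} = \chi_{\Omega_n}\,d\mu_\psi$ and hence $\psi_n \in \mathfrak{D}_f$. Moreover, $\psi_n \to \psi$ by (3.17) since $\chi_{\Omega_n} \to 1$ in $L^2(\mathbb{R}, d\mu_\psi)$.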
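The operator $P(f)$ has some additional properties. One calls an unbounded operator $A$ normal if $\mathfrak{D}(A) = \mathfrak{D}(A^*)$ and $\|A\psi\| = \|A^*\psi\|$ for all $\psi \in \mathfrak{D}(A)$.
Theorem 3.2. For any Borel function $f$, the operator
\[ P(f) = \int_{\mathbb{R}} f(\lambda)\,dP(\lambda), \quad \mathfrak{D}(P(f)) = \mathfrak{D}_f, \tag{3.25} \]
is normal and satisfies
\[ P(f)^* = P(f^*). \tag{3.26} \]
Proof. Let $f$ be given and define $f_n$, $\Omega_n$ as above. Since (3.26) holds for $f_n$ by our previous theorem, we get
\[ \langle\varphi, P(f)\psi\rangle = \langle P(f^*)\varphi, \psi\rangle \tag{3.27} \]
for any $\varphi, \psi \in \mathfrak{D}_f = \mathfrak{D}_{f^*}$ by continuity. Thus it remains to show that $\mathfrak{D}(P(f)^*) \subseteq \mathfrak{D}_f$. If $\psi \in \mathfrak{D}(P(f)^*)$ we have $\langle\psi, P(f)\varphi\rangle = \langle\tilde\psi, \varphi\rangle$ for all $\varphi \in \mathfrak{D}_f$ by definition. Now observe that $P(f_n^*)\psi = P(\Omega_n)\tilde\psi$ since we have
\[ \langle P(f_n^*)\psi, \varphi\rangle = \langle\psi, P(f_n)\varphi\rangle = \langle\psi, P(f)P(\Omega_n)\varphi\rangle = \langle P(\Omega_n)\tilde\psi, \varphi\rangle \tag{3.28} \]
for any $\varphi \in \mathfrak{H}$. To see the second equality use $P(f_n)\varphi = P(f_m\chi_{\Omega_n})\varphi = P(f_m)P(\Omega_n)\varphi$ for $m \ge n$ and let $m \to \infty$. This proves existence of the limit
\[ \lim_{n\to\infty} \int_{\mathbb{R}} |f_n|^2\,d\mu_\psi(\lambda) = \lim_{n\to\infty} \|P(f_n^*)\psi\|^2 = \lim_{n\to\infty} \|P(\Omega_n)\tilde\psi\|^2 = \|\tilde\psi\|^2, \tag{3.29} \]
which implies $f \in L^2(\mathbb{R}, d\mu_\psi)$, that is, $\psi \in \mathfrak{D}_f$. That $P(f)$ is normal follows from $\|P(f)\psi\|^2 = \|P(f^*)\psi\|^2 = \int_{\mathbb{R}} |f(\lambda)|^2\,d\mu_\psi$.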
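These considerations seem to indicate some kind of correspondence between the operators $P(f)$ in $\mathfrak{H}$ and $f$ in $L^2(\mathbb{R}, d\mu_\psi)$. Recall that $U: \mathfrak{H} \to \tilde{\mathfrak{H}}$ is called unitary if it is a bijection which preserves scalar products, $\langle U\varphi, U\psi\rangle = \langle\varphi, \psi\rangle$. The operators $A$ in $\mathfrak{H}$ and $\tilde A$ in $\tilde{\mathfrak{H}}$ are said to be unitarily equivalent if
\[ UA = \tilde A U, \quad U\mathfrak{D}(A) = \mathfrak{D}(\tilde A). \tag{3.30} \]
Clearly, $A$ is self-adjoint if and only if $\tilde A$ is and $\sigma(A) = \sigma(\tilde A)$.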
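Now let us return to our original problem and consider the subspace
\[ \mathfrak{H}_\psi = \{P(f)\psi \,|\, f \in L^2(\mathbb{R}, d\mu_\psi)\} \subseteq \mathfrak{H}. \tag{3.31} \]
Observe that this subspace is closed: If $\psi_n = P(f_n)\psi$ converges in $\mathfrak{H}$, then $f_n$ converges to some $f$ in $L^2$ (since $\|\psi_n - \psi_m\|^2 = \int |f_n - f_m|^2\,d\mu_\psi$) and hence $\psi_n \to P(f)\psi$.
The vector $\psi$ is called cyclic if $\mathfrak{H}_\psi = \mathfrak{H}$. By (3.17), the relation
\[ U_\psi(P(f)\psi) = f \tag{3.32} \]
defines a unique unitary operator $U_\psi: \mathfrak{H}_\psi \to L^2(\mathbb{R}, d\mu_\psi)$ such that
\[ U_\psi P(f) = f U_\psi, \tag{3.33} \]
where $f$ is identified with its corresponding multiplication operator. Moreover, if $f$ is unbounded we have $U_\psi(\mathfrak{D}_f \cap \mathfrak{H}_\psi) = \mathfrak{D}(f) = \{g \in L^2(\mathbb{R}, d\mu_\psi) \,|\, fg \in L^2(\mathbb{R}, d\mu_\psi)\}$ (since $\varphi = P(g)\psi$ implies $d\mu_\varphi = |g|^2\,d\mu_\psi$) and the above equation still holds.
If $\psi$ is cyclic, our picture is complete. Otherwise we need to extend this approach. A set $\{\psi_j\}_{j\in J}$ ($J$ some index set) is called a set of spectral vectors if $\|\psi_j\| = 1$ and $\mathfrak{H}_{\psi_i} \perp \mathfrak{H}_{\psi_j}$ for all $i \ne j$. A set of spectral vectors is called a spectral basis if $\bigoplus_j \mathfrak{H}_{\psi_j} = \mathfrak{H}$. Luckily a spectral basis always exists: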
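Lemma 3.3. For every projection-valued measure $P$, there is an (at most countable) spectral basis $\{\psi_n\}$ such that
\[ \mathfrak{H} = \bigoplus_n \mathfrak{H}_{\psi_n} \tag{3.34} \]
and a corresponding unitary operator
\[ U = \bigoplus_n U_{\psi_n}: \mathfrak{H} \to \bigoplus_n L^2(\mathbb{R}, d\mu_{\psi_n}) \tag{3.35} \]
such that for any Borel function $f$,
\[ UP(f) = fU, \quad U\mathfrak{D}_f = \mathfrak{D}(f). \tag{3.36} \]
Proof. It suffices to show that a spectral basis exists. This can be easily done using a Gram-Schmidt type construction. First of all observe that if $\{\psi_j\}_{j\in J}$ is a spectral set and $\psi \perp \mathfrak{H}_{\psi_j}$ for all $j$ we have $\mathfrak{H}_\psi \perp \mathfrak{H}_{\psi_j}$ for all $j$. Indeed, $\psi \perp \mathfrak{H}_{\psi_j}$ implies $P(g)\psi \perp \mathfrak{H}_{\psi_j}$ for every bounded function $g$ since $\langle P(g)\psi, P(f)\psi_j\rangle = \langle\psi, P(g^* f)\psi_j\rangle = 0$. But $P(g)\psi$ with $g$ bounded is dense in $\mathfrak{H}_\psi$, implying $\mathfrak{H}_\psi \perp \mathfrak{H}_{\psi_j}$.
Now start with some total set $\{\tilde\psi_j\}$. Normalize $\tilde\psi_1$ and choose this to be $\psi_1$. Move to the first $\tilde\psi_j$ which is not in $\mathfrak{H}_{\psi_1}$, take the orthogonal complement with respect to $\mathfrak{H}_{\psi_1}$ and normalize it. Choose the result to be $\psi_2$. Proceeding like this we get a set of spectral vectors $\{\psi_j\}$ such that $\mathrm{span}\{\tilde\psi_j\} \subseteq \bigoplus_j \mathfrak{H}_{\psi_j}$. Hence $\mathfrak{H} = \overline{\mathrm{span}\{\tilde\psi_j\}} \subseteq \bigoplus_j \mathfrak{H}_{\psi_j}$.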
It is important to observe that the cardinality of a spectral basis is not well-defined (in contradistinction to the cardinality of an ordinary basis of the Hilbert space). However, it can be at most equal to the cardinality of an ordinary basis. In particular, since $\mathfrak{H}$ is separable, it is at most countable. The minimal cardinality of a spectral basis is called spectral multiplicity of $P$. If the spectral multiplicity is one, the spectrum is called simple.
Example. Let $\mathfrak{H} = \mathbb{C}^2$ and $A = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$. Then $\psi_1 = (1, 0)$ and $\psi_2 = (0, 1)$ are a spectral basis. However, $\psi = (1, 1)$ is cyclic and hence the spectrum of $A$ is simple. If $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ there is no cyclic vector (why?) and hence the spectral multiplicity is two.
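The example above can be checked mechanically. The sketch below is an illustration, not part of the original text, and `is_cyclic` is a name introduced here. It relies on the fact that in $\mathbb{C}^2$ every $f(A)$ is a polynomial in $A$ of degree at most one, so $\psi$ is cyclic exactly when $\psi$ and $A\psi$ are linearly independent.

```python
def is_cyclic(a, psi, tol=1e-12):
    """Test cyclicity of psi for a 2x2 matrix a.

    In two dimensions the subspace generated by psi is span{psi, a psi},
    so psi is cyclic iff det[psi, a psi] is nonzero.
    """
    a_psi = [a[0][0] * psi[0] + a[0][1] * psi[1],
             a[1][0] * psi[0] + a[1][1] * psi[1]]
    det = psi[0] * a_psi[1] - psi[1] * a_psi[0]
    return abs(det) > tol

A_simple = [[0, 0], [0, 1]]     # distinct eigenvalues 0 and 1
A_identity = [[1, 0], [0, 1]]   # eigenvalue 1 with multiplicity two

cyclic_for_simple = is_cyclic(A_simple, [1, 1])
basis_vector_cyclic = is_cyclic(A_simple, [1, 0])
cyclic_for_identity = is_cyclic(A_identity, [1, 1])
```

For the identity matrix $A\psi = \psi$ for every $\psi$, so the determinant always vanishes, which answers the "why?" in the example for this special case.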
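Using this canonical form of projection-valued measures it is straightforward to prove
Lemma 3.4. Let $f, g$ be Borel functions and $\alpha, \beta \in \mathbb{C}$. Then we have
\[ \alpha P(f) + \beta P(g) \subseteq P(\alpha f + \beta g), \quad \mathfrak{D}(\alpha P(f) + \beta P(g)) = \mathfrak{D}_{|f|+|g|} \tag{3.37} \]
and
\[ P(f)P(g) \subseteq P(fg), \quad \mathfrak{D}(P(f)P(g)) = \mathfrak{D}_g \cap \mathfrak{D}_{fg}. \tag{3.38} \]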
Now observe that to every projection-valued measure $P$ we can assign a self-adjoint operator $A = \int_{\mathbb{R}} \lambda\,dP(\lambda)$. The question is whether we can invert this map. To do this, we consider the resolvent $R_A(z) = \int_{\mathbb{R}} (\lambda - z)^{-1}\,dP(\lambda)$. By (3.16) the corresponding quadratic form is given by
\[ F_\psi(z) = \langle\psi, R_A(z)\psi\rangle = \int_{\mathbb{R}} \frac{1}{\lambda - z}\,d\mu_\psi(\lambda), \tag{3.39} \]
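which is known as the Borel transform of the measure $\mu_\psi$. It can be shown (see Section 3.4) that $F_\psi(z)$ is a holomorphic map from the upper half plane to itself. Such functions are called Herglotz functions. Moreover, the measure $\mu_\psi$ can be reconstructed from $F_\psi(z)$ by Stieltjes inversion formula
\[ \mu_\psi(\lambda) = \lim_{\delta\downarrow 0}\lim_{\varepsilon\downarrow 0} \frac{1}{\pi}\int_{-\infty}^{\lambda+\delta} \mathrm{Im}(F_\psi(t + \mathrm{i}\varepsilon))\,dt. \tag{3.40} \]
Conversely, if $F_\psi(z)$ is a Herglotz function satisfying $|F_\psi(z)| \le \frac{M}{\mathrm{Im}(z)}$, then it is the Borel transform of a unique measure $\mu_\psi$ (given by Stieltjes inversion formula).
So let $A$ be a given self-adjoint operator and consider the expectation of the resolvent of $A$,
\[ F_\psi(z) = \langle\psi, R_A(z)\psi\rangle. \tag{3.41} \]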
This function is holomorphic for $z \in \rho(A)$ and satisfies
\[ F_\psi(z^*) = F_\psi(z)^* \quad\text{and}\quad |F_\psi(z)| \le \frac{\|\psi\|^2}{\mathrm{Im}(z)} \tag{3.42} \]
(see Theorem 2.14). Moreover, the first resolvent formula (2.80) shows
\[ \mathrm{Im}(F_\psi(z)) = \mathrm{Im}(z)\|R_A(z)\psi\|^2 \tag{3.43} \]
that it maps the upper half plane to itself, that is, it is a Herglotz function. So by our above remarks, there is a corresponding measure $\mu_\psi(\lambda)$ given by Stieltjes inversion formula. It is called the spectral measure corresponding to $\psi$.
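More generally, by polarization, for each $\varphi, \psi \in \mathfrak{H}$ we can find a corresponding complex measure $\mu_{\varphi,\psi}$ such that
\[ \langle\varphi, R_A(z)\psi\rangle = \int_{\mathbb{R}} \frac{1}{\lambda - z}\,d\mu_{\varphi,\psi}(\lambda). \tag{3.44} \]
The measure $\mu_{\varphi,\psi}$ is conjugate linear in $\varphi$ and linear in $\psi$. Moreover, a comparison with our previous considerations begs us to define a family of operators $P_A(\Omega)$ via
\[ \langle\varphi, P_A(\Omega)\psi\rangle = \int_{\mathbb{R}} \chi_\Omega(\lambda)\,d\mu_{\varphi,\psi}(\lambda). \tag{3.45} \]
This is indeed possible by Corollary 1.8 since $|\langle\varphi, P_A(\Omega)\psi\rangle| = |\mu_{\varphi,\psi}(\Omega)| \le \|\varphi\|\,\|\psi\|$. The operators $P_A(\Omega)$ are non-negative ($0 \le \langle\psi, P_A(\Omega)\psi\rangle \le 1$) and hence self-adjoint.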
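Lemma 3.5. The family of operators $P_A(\Omega)$ forms a projection-valued measure.
Proof. We first show $P_A(\Omega_1)P_A(\Omega_2) = P_A(\Omega_1 \cap \Omega_2)$ in two steps. First observe (using the first resolvent formula (2.80))
\[ \int_{\mathbb{R}} \frac{1}{\lambda - \tilde z}\,d\mu_{R_A(z^*)\varphi,\psi}(\lambda) = \langle R_A(z^*)\varphi, R_A(\tilde z)\psi\rangle = \langle\varphi, R_A(z)R_A(\tilde z)\psi\rangle \]
\[ = \frac{1}{z - \tilde z}\big(\langle\varphi, R_A(z)\psi\rangle - \langle\varphi, R_A(\tilde z)\psi\rangle\big) \]
\[ = \frac{1}{z - \tilde z}\int_{\mathbb{R}} \Big(\frac{1}{\lambda - z} - \frac{1}{\lambda - \tilde z}\Big)\,d\mu_{\varphi,\psi}(\lambda) = \int_{\mathbb{R}} \frac{1}{\lambda - \tilde z}\,\frac{d\mu_{\varphi,\psi}(\lambda)}{\lambda - z} \tag{3.46} \]
implying $d\mu_{R_A(z^*)\varphi,\psi}(\lambda) = (\lambda - z)^{-1}\,d\mu_{\varphi,\psi}(\lambda)$ since a Herglotz function is uniquely determined by its measure. Secondly we compute
\[ \int_{\mathbb{R}} \frac{1}{\lambda - z}\,d\mu_{\varphi,P_A(\Omega)\psi}(\lambda) = \langle\varphi, R_A(z)P_A(\Omega)\psi\rangle = \langle R_A(z^*)\varphi, P_A(\Omega)\psi\rangle \]
\[ = \int_{\mathbb{R}} \chi_\Omega(\lambda)\,d\mu_{R_A(z^*)\varphi,\psi}(\lambda) = \int_{\mathbb{R}} \frac{1}{\lambda - z}\,\chi_\Omega(\lambda)\,d\mu_{\varphi,\psi}(\lambda) \]
implying $d\mu_{\varphi,P_A(\Omega)\psi}(\lambda) = \chi_\Omega(\lambda)\,d\mu_{\varphi,\psi}(\lambda)$. Equivalently we have
\[ \langle\varphi, P_A(\Omega_1)P_A(\Omega_2)\psi\rangle = \langle\varphi, P_A(\Omega_1 \cap \Omega_2)\psi\rangle \tag{3.47} \]
since $\chi_{\Omega_1}\chi_{\Omega_2} = \chi_{\Omega_1\cap\Omega_2}$. In particular, choosing $\Omega_1 = \Omega_2$, we see that $P_A(\Omega_1)$ is a projector.
The relation $P_A(\mathbb{R}) = \mathbb{I}$ follows from (3.93) below and Lemma 2.18 which imply $\mu_\psi(\mathbb{R}) = \|\psi\|^2$.
Now let $\Omega = \bigcup_{n=1}^\infty \Omega_n$ with $\Omega_n \cap \Omega_m = \emptyset$ for $n \ne m$. Then
\[ \sum_{j=1}^n \langle\psi, P_A(\Omega_j)\psi\rangle = \sum_{j=1}^n \mu_\psi(\Omega_j) \to \langle\psi, P_A(\Omega)\psi\rangle = \mu_\psi(\Omega) \tag{3.48} \]
by $\sigma$-additivity of $\mu_\psi$. Hence $P_A$ is weakly $\sigma$-additive which implies strong $\sigma$-additivity, as pointed out earlier.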
Now we can prove the spectral theorem for self-adjoint operators.
Theorem 3.6 (Spectral theorem). To every self-adjoint operator $A$ there corresponds a unique projection-valued measure $P_A$ such that
\[ A = \int_{\mathbb{R}} \lambda\,dP_A(\lambda). \tag{3.49} \]
Proof. Existence has already been established. Moreover, Lemma 3.4 shows that $P_A((\lambda - z)^{-1}) = R_A(z)$, $z \in \mathbb{C}\setminus\mathbb{R}$. Since the measures $\mu_{\varphi,\psi}$ are uniquely determined by the resolvent and the projection-valued measure is uniquely determined by the measures $\mu_{\varphi,\psi}$ we are done.
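The quadratic form of $A$ is given by
\[ q_A(\psi) = \int_{\mathbb{R}} \lambda\,d\mu_\psi(\lambda) \tag{3.50} \]
and can be defined for every $\psi$ in the form domain of the self-adjoint operator $A$,
\[ \mathfrak{Q}(A) = \Big\{\psi \in \mathfrak{H} \,\Big|\, \int_{\mathbb{R}} |\lambda|\,d\mu_\psi(\lambda) < \infty\Big\} \tag{3.51} \]
(which is larger than the domain $\mathfrak{D}(A) = \{\psi \in \mathfrak{H} \,|\, \int_{\mathbb{R}} \lambda^2\,d\mu_\psi(\lambda) < \infty\}$). This extends our previous definition for non-negative operators.
Note that if $A$ and $\tilde A$ are unitarily equivalent as in (3.30), then $UR_A(z) = R_{\tilde A}(z)U$ and hence
\[ d\mu_\psi = d\tilde\mu_{U\psi}. \tag{3.52} \]
In particular, we have $UP_A(f) = P_{\tilde A}(f)U$, $U\mathfrak{D}(P_A(f)) = \mathfrak{D}(P_{\tilde A}(f))$.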
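Finally, let us give a characterization of the spectrum of $A$ in terms of the associated projectors.
Theorem 3.7. The spectrum of $A$ is given by
\[ \sigma(A) = \{\lambda \in \mathbb{R} \,|\, P_A((\lambda - \varepsilon, \lambda + \varepsilon)) \ne 0 \text{ for all } \varepsilon > 0\}. \tag{3.53} \]
Proof. Let $\Omega_n = (\lambda_0 - \frac{1}{n}, \lambda_0 + \frac{1}{n})$. Suppose $P_A(\Omega_n) \ne 0$. Then we can find a $\psi_n \in P_A(\Omega_n)\mathfrak{H}$ with $\|\psi_n\| = 1$. Since
\[ \|(A - \lambda_0)\psi_n\|^2 = \|(A - \lambda_0)P_A(\Omega_n)\psi_n\|^2 = \int_{\mathbb{R}} (\lambda - \lambda_0)^2 \chi_{\Omega_n}(\lambda)\,d\mu_{\psi_n}(\lambda) \le \frac{1}{n^2} \tag{3.54} \]
we conclude $\lambda_0 \in \sigma(A)$ by Lemma 2.12.
Conversely, if $P_A((\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)) = 0$, set $f_\varepsilon(\lambda) = \chi_{\mathbb{R}\setminus(\lambda_0-\varepsilon,\lambda_0+\varepsilon)}(\lambda)(\lambda - \lambda_0)^{-1}$. Then
\[ (A - \lambda_0)P_A(f_\varepsilon) = P_A((\lambda - \lambda_0)f_\varepsilon(\lambda)) = P_A(\mathbb{R}\setminus(\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)) = \mathbb{I}. \tag{3.55} \]
Similarly $P_A(f_\varepsilon)(A - \lambda_0) = \mathbb{I}|_{\mathfrak{D}(A)}$ and hence $\lambda_0 \in \rho(A)$.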
Thus $P_A((\lambda_1, \lambda_2)) = 0$ if and only if $(\lambda_1, \lambda_2) \subseteq \rho(A)$ and we have
\[ P_A(\sigma(A)) = \mathbb{I} \quad\text{and}\quad P_A(\mathbb{R} \cap \rho(A)) = 0 \tag{3.56} \]
and consequently
\[ P_A(f) = P_A(\sigma(A))P_A(f) = P_A(\chi_{\sigma(A)} f). \tag{3.57} \]
In other words, $P_A(f)$ is not affected by the values of $f$ on $\mathbb{R}\setminus\sigma(A)$!
It is clearly more intuitive to write $P_A(f) = f(A)$ and we will do so from now on. This notation is justified by the elementary observation
\[ P_A\Big(\sum_{j=0}^n \alpha_j \lambda^j\Big) = \sum_{j=0}^n \alpha_j A^j. \tag{3.58} \]
Moreover, this also shows that if $A$ is bounded and $f(A)$ can be defined via a convergent power series, then this agrees with our present definition by Theorem 3.1.
Problem 3.1. Show that a self-adjoint operator $P$ is a projection if and only if $\sigma(P) \subseteq \{0, 1\}$.
Problem 3.2. Show that (3.7) is a projection-valued measure. What is the corresponding operator?
Problem 3.3. Show that $P(\lambda)$ satisfies the properties (i)-(iv).
3.2. More on Borel measures
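Section 3.1 showed that in order to understand self-adjoint operators, one needs to understand multiplication operators on $L^2(\mathbb{R}, d\mu)$, where $d\mu$ is a finite Borel measure. This is the purpose of the present section.
The set of all growth points, that is,
\[ \sigma(\mu) = \{\lambda \in \mathbb{R} \,|\, \mu((\lambda - \varepsilon, \lambda + \varepsilon)) > 0 \text{ for all } \varepsilon > 0\}, \tag{3.59} \]
is called the spectrum of $\mu$. Invoking Morera's together with Fubini's theorem shows that the Borel transform
\[ F(z) = \int_{\mathbb{R}} \frac{1}{\lambda - z}\,d\mu(\lambda) \tag{3.60} \]
is holomorphic for $z \in \mathbb{C}\setminus\sigma(\mu)$. The converse follows from Stieltjes inversion formula. Associated with this measure is the operator
\[ Af(\lambda) = \lambda f(\lambda), \quad \mathfrak{D}(A) = \{f \in L^2(\mathbb{R}, d\mu) \,|\, \lambda f(\lambda) \in L^2(\mathbb{R}, d\mu)\}. \tag{3.61} \]
By Theorem 3.7 the spectrum of $A$ is precisely the spectrum of $\mu$, that is,
\[ \sigma(A) = \sigma(\mu). \tag{3.62} \]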
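Note that $1 \in L^2(\mathbb{R}, d\mu)$ is a cyclic vector for $A$ and that
\[ d\mu_{g,f}(\lambda) = g(\lambda)^* f(\lambda)\,d\mu(\lambda). \tag{3.63} \]
Now what can we say about the function $f(A)$ (which is precisely the multiplication operator by $f$) of $A$? We are only interested in the case where $f$ is real-valued. Introduce the measure
\[ (f_\star\mu)(\Omega) = \mu(f^{-1}(\Omega)), \tag{3.64} \]
then
\[ \int_{\mathbb{R}} g(\lambda)\,d(f_\star\mu)(\lambda) = \int_{\mathbb{R}} g(f(\lambda))\,d\mu(\lambda). \tag{3.65} \]
In fact, it suffices to check this formula for simple functions $g$ which follows since $\chi_\Omega \circ f = \chi_{f^{-1}(\Omega)}$. In particular, we have
\[ P_{f(A)}(\Omega) = \chi_{f^{-1}(\Omega)}. \tag{3.66} \]
It is tempting to conjecture that $f(A)$ is unitarily equivalent to multiplication by $\lambda$ in $L^2(\mathbb{R}, d(f_\star\mu))$ via the map
\[ L^2(\mathbb{R}, d(f_\star\mu)) \to L^2(\mathbb{R}, d\mu), \quad g \mapsto g \circ f. \tag{3.67} \]
However, this map is only unitary if its range is $L^2(\mathbb{R}, d\mu)$, that is, if $f$ is injective.
Lemma 3.8. Suppose $f$ is injective, then
\[ U: L^2(\mathbb{R}, d\mu) \to L^2(\mathbb{R}, d(f_\star\mu)), \quad g \mapsto g \circ f^{-1} \tag{3.68} \]
is a unitary map such that $U f(A) = \lambda U$, that is, $U$ transforms multiplication by $f$ into multiplication by $\lambda$.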
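To make the push-forward measure (3.64) and the change of variables (3.65) concrete, here is a sketch for a finitely supported measure. Representing $\mu$ as a dictionary from points to weights, and the names `push_forward` and `integrate`, are assumptions of this illustration.

```python
from collections import defaultdict

def push_forward(mu, f):
    """Push-forward of a finitely supported measure {point: weight} under f.

    Implements (f_* mu)(Omega) = mu(f^{-1}(Omega)), cf. (3.64): the weight
    of each point is moved to its image, and weights of points with the
    same image are added.
    """
    result = defaultdict(float)
    for point, weight in mu.items():
        result[f(point)] += weight
    return dict(result)

def integrate(g, mu):
    """Integral of g with respect to a finitely supported measure."""
    return sum(g(point) * weight for point, weight in mu.items())

mu = {-1: 0.25, 1: 0.5, 2: 0.25}
square = lambda lam: lam ** 2
g = lambda lam: lam + 1

image_measure = push_forward(mu, square)           # atoms at 1 and 4
lhs = integrate(g, image_measure)                  # left side of (3.65)
rhs = integrate(lambda lam: g(square(lam)), mu)    # right side of (3.65)
```

Since $\lambda \mapsto \lambda^2$ is not injective, the atoms at $-1$ and $1$ are merged into a single atom at $1$; this loss of information is why the map (3.67) fails to be unitary for such $f$.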
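Example. Let $f(\lambda) = \lambda^2$, then $(g \circ f)(\lambda) = g(\lambda^2)$ and the range of the above map is given by the symmetric functions. Note that we can still get a unitary map $L^2(\mathbb{R}, d(f_\star\mu)) \oplus L^2(\mathbb{R}, d(f_\star\mu)) \to L^2(\mathbb{R}, d\mu)$, $(g_1, g_2) \mapsto g_1(\lambda^2) + g_2(\lambda^2)(\chi(\lambda) - \chi(-\lambda))$, where $\chi = \chi_{(0,\infty)}$.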
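Lemma 3.9. Let $f$ be real-valued. The spectrum of $f(A)$ is given by
\[ \sigma(f(A)) = \sigma(f_\star\mu). \tag{3.69} \]
In particular,
\[ \sigma(f(A)) \subseteq \overline{f(\sigma(A))}, \tag{3.70} \]
where equality holds if $f$ is continuous and the closure can be dropped if, in addition, $\sigma(A)$ is bounded (i.e., compact).
Proof. If $\lambda_0 \in \sigma(f_\star\mu)$, then the sequence $g_n = \mu(\Omega_n)^{-1/2}\chi_{\Omega_n}$, $\Omega_n = f^{-1}((\lambda_0 - \frac{1}{n}, \lambda_0 + \frac{1}{n}))$, satisfies $\|g_n\| = 1$, $\|(f(A) - \lambda_0)g_n\| < n^{-1}$ and hence $\lambda_0 \in \sigma(f(A))$. Conversely, if $\lambda_0 \notin \sigma(f_\star\mu)$, then $\mu(\Omega_n) = 0$ for some $n$ and hence we can change $f$ on $\Omega_n$ such that $f(\mathbb{R}) \cap (\lambda_0 - \frac{1}{n}, \lambda_0 + \frac{1}{n}) = \emptyset$ without changing the corresponding operator. Thus $(f(A) - \lambda_0)^{-1} = (f(\lambda) - \lambda_0)^{-1}$ exists and is bounded, implying $\lambda_0 \notin \sigma(f(A))$.
If $f$ is continuous, $f^{-1}((f(\lambda) - \varepsilon, f(\lambda) + \varepsilon))$ contains an open interval around $\lambda$ and hence $f(\lambda) \in \sigma(f(A))$ if $\lambda \in \sigma(A)$. If, in addition, $\sigma(A)$ is compact, then $f(\sigma(A))$ is compact and hence closed.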
Whether two operators with simple spectrum are unitarily equivalent can be read off from the corresponding measures:
Lemma 3.10. Let $A_1$, $A_2$ be self-adjoint operators with simple spectrum and corresponding spectral measures $\mu_1$ and $\mu_2$ of cyclic vectors. Then $A_1$ and $A_2$ are unitarily equivalent if and only if $\mu_1$ and $\mu_2$ are mutually absolutely continuous.
Proof. Without restriction we can assume that $A_j$ is multiplication by $\lambda$ in $L^2(\mathbb{R}, d\mu_j)$. Let $U: L^2(\mathbb{R}, d\mu_1) \to L^2(\mathbb{R}, d\mu_2)$ be a unitary map such that $UA_1 = A_2U$. Then we also have $Uf(A_1) = f(A_2)U$ for any bounded Borel function $f$ and hence
\[ Uf(\lambda) = U(f(\lambda)\cdot 1) = f(\lambda)U(1)(\lambda) \tag{3.71} \]
and thus $U$ is multiplication by $u(\lambda) = U(1)(\lambda)$. Moreover, since $U$ is unitary we have
\[ \mu_1(\Omega) = \int_{\mathbb{R}} |\chi_\Omega|^2\,d\mu_1 = \int_{\mathbb{R}} |u\chi_\Omega|^2\,d\mu_2 = \int_\Omega |u|^2\,d\mu_2, \tag{3.72} \]
that is, $d\mu_1 = |u|^2\,d\mu_2$. Reversing the role of $A_1$ and $A_2$ we obtain $d\mu_2 = |v|^2\,d\mu_1$, where $v = U^{-1}1$.
The converse is left as an exercise (Problem 3.8).
Next we recall the unique decomposition of $\mu$ with respect to Lebesgue measure,
\[ d\mu = d\mu_{ac} + d\mu_s, \tag{3.73} \]
where $\mu_{ac}$ is absolutely continuous with respect to Lebesgue measure (i.e., we have $\mu_{ac}(B) = 0$ for all $B$ with Lebesgue measure zero) and $\mu_s$ is singular with respect to Lebesgue measure (i.e., $\mu_s$ is supported, $\mu_s(\mathbb{R}\setminus B) = 0$, on a set $B$ with Lebesgue measure zero). The singular part $\mu_s$ can be further decomposed into a (singularly) continuous and a pure point part,
\[ d\mu_s = d\mu_{sc} + d\mu_{pp}, \tag{3.74} \]
where $\mu_{sc}$ is continuous on $\mathbb{R}$ and $\mu_{pp}$ is a step function. Since the measures $d\mu_{ac}$, $d\mu_{sc}$, and $d\mu_{pp}$ are mutually singular, they have mutually disjoint supports $M_{ac}$, $M_{sc}$, and $M_{pp}$. Note that these sets are not unique. We will choose them such that $M_{pp}$ is the set of all jumps of $\mu(\lambda)$ and such that $M_{sc}$ has Lebesgue measure zero.
To the sets $M_{ac}$, $M_{sc}$, and $M_{pp}$ correspond projectors $P^{ac} = \chi_{M_{ac}}(A)$, $P^{sc} = \chi_{M_{sc}}(A)$, and $P^{pp} = \chi_{M_{pp}}(A)$ satisfying $P^{ac} + P^{sc} + P^{pp} = \mathbb{I}$. In other words, we have a corresponding direct sum decomposition of both our Hilbert space
\[ L^2(\mathbb{R}, d\mu) = L^2(\mathbb{R}, d\mu_{ac}) \oplus L^2(\mathbb{R}, d\mu_{sc}) \oplus L^2(\mathbb{R}, d\mu_{pp}) \tag{3.75} \]
and our operator $A$
\[ A = (AP^{ac}) \oplus (AP^{sc}) \oplus (AP^{pp}). \tag{3.76} \]
The corresponding spectra, $\sigma_{ac}(A) = \sigma(\mu_{ac})$, $\sigma_{sc}(A) = \sigma(\mu_{sc})$, and $\sigma_{pp}(A) = \sigma(\mu_{pp})$ are called the absolutely continuous, singularly continuous, and pure point spectrum of $A$, respectively.
It is important to observe that $\sigma_{pp}(A)$ is in general not equal to the set of eigenvalues
\[ \sigma_p(A) = \{\lambda \in \mathbb{R} \,|\, \lambda \text{ is an eigenvalue of } A\} \tag{3.77} \]
since we only have $\sigma_{pp}(A) = \overline{\sigma_p(A)}$.
Example. Let $\mathfrak{H} = \ell^2(\mathbb{N})$ and let $A$ be given by $A\delta_n = \frac{1}{n}\delta_n$, where $\delta_n$ is the sequence which is 1 at the $n$th place and zero else (that is, $A$ is a diagonal matrix with diagonal elements $\frac{1}{n}$). Then $\sigma_p(A) = \{\frac{1}{n} \,|\, n \in \mathbb{N}\}$ but $\sigma(A) = \sigma_{pp}(A) = \sigma_p(A) \cup \{0\}$. To see this, just observe that $\delta_n$ is the eigenvector corresponding to the eigenvalue $\frac{1}{n}$ and for $z \notin \sigma(A)$ we have $R_A(z)\delta_n = \frac{n}{1 - nz}\delta_n$. At $z = 0$ this formula still gives the inverse of $A$, but it is unbounded and hence $0 \in \sigma(A)$ but $0 \notin \sigma_p(A)$. Since a continuous measure cannot live on a single point and hence also not on a countable set, we have $\sigma_{ac}(A) = \sigma_{sc}(A) = \emptyset$.
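The behaviour at $z = 0$ in this example can be seen numerically on finite truncations. The sketch below is an illustration with names introduced here (`resolvent_coefficient`, `truncated_resolvent_norm`); restricting to the first $N$ basis vectors is an assumption made so that everything is a finite computation.

```python
def resolvent_coefficient(n, z):
    """Eigenvalue n / (1 - n z) of R_A(z) on delta_n, i.e. 1 / (1/n - z)."""
    return n / (1 - n * z)

def truncated_resolvent_norm(z, n_max):
    """Norm of R_A(z) restricted to span{delta_1, ..., delta_{n_max}}.

    For a diagonal operator this is the largest modulus on the diagonal.
    """
    return max(abs(resolvent_coefficient(n, z)) for n in range(1, n_max + 1))

# At z = 0 the norm on the first N basis vectors equals N: unbounded in N.
norms_at_zero = [truncated_resolvent_norm(0.0, n_max) for n_max in (10, 100, 1000)]

# Away from the spectrum, e.g. at z = -1, the norms stay below 1 / dist(z, spectrum) = 1.
norms_at_minus_one = [truncated_resolvent_norm(-1.0, n_max) for n_max in (10, 100, 1000)]

# delta_n is an approximate eigenvector for 0: ||(A - 0) delta_n|| = 1/n.
weyl_residuals = [1.0 / n for n in (10, 100, 1000)]
```

The growing norms at $z = 0$ together with the vanishing residuals are the finite-dimensional shadow of $0 \in \sigma(A)$ while $0$ is not an eigenvalue.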
Example. An example with purely absolutely continuous spectrum is given by taking $\mu$ to be Lebesgue measure. An example with purely singularly continuous spectrum is given by taking $\mu$ to be the Cantor measure.
Problem 3.4. Construct a multiplication operator on $L^2(\mathbb{R})$ which has dense point spectrum.
Problem 3.5. Let $\lambda$ be Lebesgue measure on $\mathbb{R}$. Show that if $f \in AC(\mathbb{R})$ with $f' > 0$, then
\[ d(f_\star\lambda) = \frac{1}{f'(\lambda)}\,d\lambda. \tag{3.78} \]
Problem 3.6. Let $d\mu(\lambda) = \chi_{[0,1]}(\lambda)\,d\lambda$ and $f(\lambda) = \chi_{(-\infty,t]}(\lambda)$, $t \in \mathbb{R}$. Compute $f_\star\mu$.
Problem 3.7. Let $A$ be the multiplication operator by the Cantor function in $L^2(0, 1)$. Compute the spectrum of $A$. Determine the spectral types.
Problem 3.8. Show the missing direction in the proof of Lemma 3.10.
3.3. Spectral types
Our next aim is to transfer the results of the previous section to arbitrary self-adjoint operators $A$ using Lemma 3.3. To do this we will need a spectral measure which contains the information from all measures in a spectral basis. This will be the case if there is a vector $\psi$ such that for every $\varphi \in \mathfrak{H}$ its spectral measure $\mu_\varphi$ is absolutely continuous with respect to $\mu_\psi$. Such a vector will be called a maximal spectral vector of $A$ and $\mu_\psi$ will be called a maximal spectral measure of $A$.
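Lemma 3.11. For every self-adjoint operator $A$ there is a maximal spectral vector.
Proof. Let $\{\psi_j\}_{j\in J}$ be a spectral basis and choose nonzero numbers $\varepsilon_j$ with $\sum_{j\in J} |\varepsilon_j|^2 = 1$. Then I claim that
\[ \psi = \sum_{j\in J} \varepsilon_j\psi_j \tag{3.79} \]
is a maximal spectral vector. Let $\varphi$ be given, then we can write it as $\varphi = \sum_j f_j(A)\psi_j$ and hence $d\mu_\varphi = \sum_j |f_j|^2\,d\mu_{\psi_j}$. But $\mu_\psi(\Omega) = \sum_j |\varepsilon_j|^2\mu_{\psi_j}(\Omega) = 0$ implies $\mu_{\psi_j}(\Omega) = 0$ for every $j \in J$ and thus $\mu_\varphi(\Omega) = 0$.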
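A set $\{\psi_j\}$ of spectral vectors is called ordered if $\psi_k$ is a maximal spectral vector for $A$ restricted to $(\bigoplus_{j=1}^{k-1} \mathfrak{H}_{\psi_j})^\perp$. As in the unordered case one can show
Theorem 3.12. For every self-adjoint operator there is an ordered spectral basis.
Observe that if $\{\psi_j\}$ is an ordered spectral basis, then $\mu_{\psi_{j+1}}$ is absolutely continuous with respect to $\mu_{\psi_j}$.
If $\mu$ is a maximal spectral measure we have $\sigma(A) = \sigma(\mu)$ and the following generalization of Lemma 3.9 holds.
Theorem 3.13 (Spectral mapping). Let $\mu$ be a maximal spectral measure and let $f$ be real-valued. Then the spectrum of $f(A)$ is given by
\[ \sigma(f(A)) = \{\lambda \in \mathbb{R} \,|\, \mu(f^{-1}((\lambda - \varepsilon, \lambda + \varepsilon))) > 0 \text{ for all } \varepsilon > 0\}. \tag{3.80} \]
In particular,
\[ \sigma(f(A)) \subseteq \overline{f(\sigma(A))}, \tag{3.81} \]
where equality holds if $f$ is continuous and the closure can be dropped if, in addition, $\sigma(A)$ is bounded.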
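Next, we want to introduce the splitting (3.75) for arbitrary self-adjoint operators $A$. It is tempting to pick a spectral basis and treat each summand in the direct sum separately. However, since it is not clear that this approach is independent of the spectral basis chosen, we use the more sophisticated definition
\[ \mathfrak{H}_{ac} = \{\psi \in \mathfrak{H} \,|\, \mu_\psi \text{ is absolutely continuous}\}, \]
\[ \mathfrak{H}_{sc} = \{\psi \in \mathfrak{H} \,|\, \mu_\psi \text{ is singularly continuous}\}, \]
\[ \mathfrak{H}_{pp} = \{\psi \in \mathfrak{H} \,|\, \mu_\psi \text{ is pure point}\}. \tag{3.82} \]
Lemma 3.14. We have
\[ \mathfrak{H} = \mathfrak{H}_{ac} \oplus \mathfrak{H}_{sc} \oplus \mathfrak{H}_{pp}. \tag{3.83} \]
There are Borel sets $M_{xx}$ such that the projector onto $\mathfrak{H}_{xx}$ is given by $P^{xx} = \chi_{M_{xx}}(A)$, $xx \in \{ac, sc, pp\}$. In particular, the subspaces $\mathfrak{H}_{xx}$ reduce $A$. For the sets $M_{xx}$ one can choose the corresponding supports of some maximal spectral measure $\mu$.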
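Proof. We will use the unitary operator $U$ of Lemma 3.3. Pick $\varphi \in \mathfrak{H}$ and write $\varphi = \sum_n \varphi_n$ with $\varphi_n \in \mathfrak{H}_{\psi_n}$. Let $f_n = U\varphi_n$, then, by construction of the unitary operator $U$, $\varphi_n = f_n(A)\psi_n$ and hence $d\mu_{\varphi_n} = |f_n|^2\,d\mu_{\psi_n}$. Moreover, since the subspaces $\mathfrak{H}_{\psi_n}$ are orthogonal, we have
\[ d\mu_\varphi = \sum_n |f_n|^2\,d\mu_{\psi_n} \tag{3.84} \]
and hence
\[ d\mu_{\varphi,xx} = \sum_n |f_n|^2\,d\mu_{\psi_n,xx}, \quad xx \in \{ac, sc, pp\}. \tag{3.85} \]
This shows
\[ U\mathfrak{H}_{xx} = \bigoplus_n L^2(\mathbb{R}, d\mu_{\psi_n,xx}), \quad xx \in \{ac, sc, pp\} \tag{3.86} \]
and reduces our problem to the considerations of the previous section.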
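The absolutely continuous, singularly continuous, and pure point spectrum of $A$ are defined as
\[ \sigma_{ac}(A) = \sigma(A|_{\mathfrak{H}_{ac}}), \quad \sigma_{sc}(A) = \sigma(A|_{\mathfrak{H}_{sc}}), \quad\text{and}\quad \sigma_{pp}(A) = \sigma(A|_{\mathfrak{H}_{pp}}), \tag{3.87} \]
respectively. If $\mu$ is a maximal spectral measure we have $\sigma_{ac}(A) = \sigma(\mu_{ac})$, $\sigma_{sc}(A) = \sigma(\mu_{sc})$, and $\sigma_{pp}(A) = \sigma(\mu_{pp})$.
If $A$ and $\tilde A$ are unitarily equivalent via $U$, then so are $A|_{\mathfrak{H}_{xx}}$ and $\tilde A|_{\tilde{\mathfrak{H}}_{xx}}$ by (3.52). In particular, $\sigma_{xx}(A) = \sigma_{xx}(\tilde A)$.
Problem 3.9. Compute $\sigma(A)$, $\sigma_{ac}(A)$, $\sigma_{sc}(A)$, and $\sigma_{pp}(A)$ for the multiplication operator $A = \frac{1}{1+x^2}$ in $L^2(\mathbb{R})$. What is its spectral multiplicity?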
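3.4. Appendix: The Herglotz theorem
A holomorphic function $F: \mathbb{C}_+ \to \mathbb{C}_+$, $\mathbb{C}_\pm = \{z \in \mathbb{C} \,|\, \pm\mathrm{Im}(z) > 0\}$, is called a Herglotz function. We can define $F$ on $\mathbb{C}_-$ using $F(z^*) = F(z)^*$.
Suppose $\mu$ is a finite Borel measure. Then its Borel transform is defined via
\[ F(z) = \int_{\mathbb{R}} \frac{d\mu(\lambda)}{\lambda - z}. \tag{3.88} \]
Theorem 3.15. The Borel transform of a finite Borel measure is a Herglotz function satisfying
\[ |F(z)| \le \frac{\mu(\mathbb{R})}{\mathrm{Im}(z)}, \quad z \in \mathbb{C}_+. \tag{3.89} \]
Moreover, the measure $\mu$ can be reconstructed via Stieltjes inversion formula
\[ \frac{1}{2}\big(\mu((\lambda_1, \lambda_2)) + \mu([\lambda_1, \lambda_2])\big) = \lim_{\varepsilon\downarrow 0} \frac{1}{\pi}\int_{\lambda_1}^{\lambda_2} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon))\,d\lambda. \tag{3.90} \]
Proof. By Morera's and Fubini's theorem, $F$ is holomorphic on $\mathbb{C}_+$ and the remaining properties follow from $0 < \mathrm{Im}((\lambda - z)^{-1})$ and $|\lambda - z|^{-1} \le \mathrm{Im}(z)^{-1}$. Stieltjes inversion formula follows from Fubini's theorem and the dominated convergence theorem since
\[ \frac{1}{2\pi\mathrm{i}}\int_{\lambda_1}^{\lambda_2} \Big(\frac{1}{x - \lambda - \mathrm{i}\varepsilon} - \frac{1}{x - \lambda + \mathrm{i}\varepsilon}\Big)\,d\lambda \to \frac{1}{2}\big(\chi_{[\lambda_1,\lambda_2]}(x) + \chi_{(\lambda_1,\lambda_2)}(x)\big) \tag{3.91} \]
pointwise.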
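Observe
\[ \mathrm{Im}(F(z)) = \mathrm{Im}(z)\int_{\mathbb{R}} \frac{d\mu(\lambda)}{|\lambda - z|^2} \tag{3.92} \]
and
\[ \lim_{\lambda\to\infty} \lambda\,\mathrm{Im}(F(\mathrm{i}\lambda)) = \mu(\mathbb{R}). \tag{3.93} \]
The converse of the previous theorem is also true.
Theorem 3.16. Suppose $F$ is a Herglotz function satisfying
\[ |F(z)| \le \frac{M}{\mathrm{Im}(z)}, \quad z \in \mathbb{C}_+. \tag{3.94} \]
Then there is a unique Borel measure $\mu$, satisfying $\mu(\mathbb{R}) \le M$, such that $F$ is the Borel transform of $\mu$.
Proof. We abbreviate $F(z) = v(z) + \mathrm{i}w(z)$ and $z = x + \mathrm{i}y$. Next we choose a contour
\[ \Gamma = \{x + \mathrm{i}\varepsilon + \lambda \,|\, \lambda \in [-R, R]\} \cup \{x + \mathrm{i}\varepsilon + Re^{\mathrm{i}\varphi} \,|\, \varphi \in [0, \pi]\} \tag{3.95} \]
and note that $z$ lies inside $\Gamma$ and $z^* + 2\mathrm{i}\varepsilon$ lies outside $\Gamma$ if $0 < \varepsilon < y < R$. Hence we have by Cauchy's formula
\[ F(z) = \frac{1}{2\pi\mathrm{i}}\int_\Gamma \Big(\frac{1}{\zeta - z} - \frac{1}{\zeta - z^* - 2\mathrm{i}\varepsilon}\Big)F(\zeta)\,d\zeta. \tag{3.96} \]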
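Inserting the explicit form of $\Gamma$ we see
\[ F(z) = \frac{1}{\pi}\int_{-R}^{R} \frac{y - \varepsilon}{\lambda^2 + (y - \varepsilon)^2}\,F(x + \mathrm{i}\varepsilon + \lambda)\,d\lambda + \frac{\mathrm{i}}{\pi}\int_0^\pi \frac{y - \varepsilon}{R^2e^{2\mathrm{i}\varphi} + (y - \varepsilon)^2}\,F(x + \mathrm{i}\varepsilon + Re^{\mathrm{i}\varphi})\,Re^{\mathrm{i}\varphi}\,d\varphi. \tag{3.97} \]
The integral over the semi circle vanishes as $R \to \infty$ and hence we obtain
\[ F(z) = \frac{1}{\pi}\int_{\mathbb{R}} \frac{y - \varepsilon}{(\lambda - x)^2 + (y - \varepsilon)^2}\,F(\lambda + \mathrm{i}\varepsilon)\,d\lambda \tag{3.98} \]
and taking imaginary parts
\[ w(z) = \int_{\mathbb{R}} \phi_\varepsilon(\lambda)w_\varepsilon(\lambda)\,d\lambda, \tag{3.99} \]
where $\phi_\varepsilon(\lambda) = (y - \varepsilon)/((\lambda - x)^2 + (y - \varepsilon)^2)$ and $w_\varepsilon(\lambda) = w(\lambda + \mathrm{i}\varepsilon)/\pi$. Letting $y \to \infty$ we infer from our bound
\[ \int_{\mathbb{R}} w_\varepsilon(\lambda)\,d\lambda \le M. \tag{3.100} \]
In particular, since $|\phi_\varepsilon(\lambda) - \phi_0(\lambda)| \le \mathrm{const}\,\varepsilon$ we have
\[ w(z) = \lim_{\varepsilon\downarrow 0}\int_{\mathbb{R}} \phi_0(\lambda)\,d\mu_\varepsilon(\lambda), \tag{3.101} \]
where $\mu_\varepsilon(\lambda) = \int_{-\infty}^{\lambda} w_\varepsilon(x)\,dx$. It remains to establish that the monotone functions $\mu_\varepsilon$ converge properly. Since $0 \le \mu_\varepsilon(\lambda) \le M$, there is a convergent subsequence for fixed $\lambda$. Moreover, by the standard diagonal trick, there is even a subsequence $\varepsilon_n$ such that $\mu_{\varepsilon_n}(\lambda)$ converges for each rational $\lambda$. For irrational $\lambda$ we set $\mu(\lambda_0) = \inf_{\lambda\ge\lambda_0}\{\mu(\lambda) \,|\, \lambda \text{ rational}\}$. Then $\mu(\lambda)$ is monotone, $0 \le \mu(\lambda_1) \le \mu(\lambda_2) \le M$, $\lambda_1 \le \lambda_2$, and we claim
\[ w(z) = \int_{\mathbb{R}} \phi_0(\lambda)\,d\mu(\lambda). \tag{3.102} \]
Fix $\delta > 0$ and let $\lambda_1 < \lambda_2 < \cdots < \lambda_{m+1}$ be rational numbers such that
\[ |\lambda_{j+1} - \lambda_j| \le \delta \quad\text{and}\quad \lambda_1 \le x - \frac{y^3}{\delta}, \quad \lambda_{m+1} \ge x + \frac{y^3}{\delta}. \tag{3.103} \]
Then
\[ |\phi_0(\lambda) - \phi_0(\lambda_j)| \le \frac{\delta}{y^2}, \quad \lambda_j \le \lambda \le \lambda_{j+1}, \tag{3.104} \]
and
\[ |\phi_0(\lambda)| \le \frac{\delta}{y^2}, \quad \lambda \le \lambda_1 \text{ or } \lambda_{m+1} \le \lambda. \tag{3.105} \]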
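Now observe
\[ \Big|\int_{\mathbb{R}} \phi_0(\lambda)\,d\mu(\lambda) - \int_{\mathbb{R}} \phi_0(\lambda)\,d\mu_{\varepsilon_n}(\lambda)\Big| \le \Big|\int_{\mathbb{R}} \phi_0(\lambda)\,d\mu(\lambda) - \sum_{j=1}^m \phi_0(\lambda_j)(\mu(\lambda_{j+1}) - \mu(\lambda_j))\Big| \]
\[ + \Big|\sum_{j=1}^m \phi_0(\lambda_j)\big(\mu(\lambda_{j+1}) - \mu(\lambda_j) - \mu_{\varepsilon_n}(\lambda_{j+1}) + \mu_{\varepsilon_n}(\lambda_j)\big)\Big| \]
\[ + \Big|\int_{\mathbb{R}} \phi_0(\lambda)\,d\mu_{\varepsilon_n}(\lambda) - \sum_{j=1}^m \phi_0(\lambda_j)(\mu_{\varepsilon_n}(\lambda_{j+1}) - \mu_{\varepsilon_n}(\lambda_j))\Big|. \tag{3.106} \]
The first and third term can be bounded by $2M\delta/y^2$. Moreover, since $\phi_0(\lambda) \le 1/y$ we can find an $N \in \mathbb{N}$ such that
\[ |\mu(\lambda_j) - \mu_{\varepsilon_n}(\lambda_j)| \le \frac{y}{2m}\delta, \quad n \ge N, \tag{3.107} \]
and hence the second term is bounded by $\delta$. In summary, the difference in (3.106) can be made arbitrarily small.
Now $F(z)$ and $\int_{\mathbb{R}} (\lambda - z)^{-1}\,d\mu(\lambda)$ have the same imaginary part and thus they only differ by a real constant. By our bound this constant must be zero.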
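The Radon-Nikodym derivative of $\mu$ can be obtained from the boundary values of $F$.
Theorem 3.17. Let $\mu$ be a finite Borel measure and $F$ its Borel transform, then
\[ (\underline{D}\mu)(\lambda) \le \liminf_{\varepsilon\downarrow 0} \frac{1}{\pi}\mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) \le \limsup_{\varepsilon\downarrow 0} \frac{1}{\pi}\mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) \le (\overline{D}\mu)(\lambda). \tag{3.108} \]
Proof. We need to estimate
\[ \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) = \int_{\mathbb{R}} K_\varepsilon(t - \lambda)\,d\mu(t), \quad K_\varepsilon(t) = \frac{\varepsilon}{t^2 + \varepsilon^2}. \tag{3.109} \]
We first split the integral into two parts
\[ \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) = \int_{I_\delta} K_\varepsilon(t - \lambda)\,d\mu(t) + \int_{\mathbb{R}\setminus I_\delta} K_\varepsilon(t - \lambda)\,d\mu(t), \quad I_\delta = (\lambda - \delta, \lambda + \delta). \tag{3.110} \]
Clearly the second part can be estimated by
\[ \int_{\mathbb{R}\setminus I_\delta} K_\varepsilon(t - \lambda)\,d\mu(t) \le K_\varepsilon(\delta)\mu(\mathbb{R}). \tag{3.111} \]
To estimate the first part we integrate
\[ K_\varepsilon'(s)\,ds\,d\mu(t) \tag{3.112} \]
over the triangle $\{(s, t) \,|\, \lambda - s < t < \lambda + s,\ 0 < s < \delta\} = \{(s, t) \,|\, \lambda - \delta < t < \lambda + \delta,\ |t - \lambda| < s < \delta\}$ and obtain
\[ \int_0^\delta \mu(I_s)K_\varepsilon'(s)\,ds = \int_{I_\delta} (K_\varepsilon(\delta) - K_\varepsilon(t - \lambda))\,d\mu(t). \tag{3.113} \]
Now suppose there are constants $c$ and $C$ such that $c \le \frac{\mu(I_s)}{2s} \le C$, $0 \le s \le \delta$, then
\[ 2c\arctan\Big(\frac{\delta}{\varepsilon}\Big) \le \int_{I_\delta} K_\varepsilon(t - \lambda)\,d\mu(t) \le 2C\arctan\Big(\frac{\delta}{\varepsilon}\Big) \tag{3.114} \]
since
\[ \delta K_\varepsilon(\delta) + \int_0^\delta (-s)K_\varepsilon'(s)\,ds = \arctan\Big(\frac{\delta}{\varepsilon}\Big). \tag{3.115} \]
Thus the claim follows combining both estimates.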
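As a consequence of Theorem A.34 and Theorem A.35 we obtain
Theorem 3.18. Let $\mu$ be a finite Borel measure and $F$ its Borel transform, then the limit
\[ \mathrm{Im}(F(\lambda)) = \lim_{\varepsilon\downarrow 0} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) \tag{3.116} \]
exists a.e. with respect to both $\mu$ and Lebesgue measure (finite or infinite) and
\[ (D\mu)(\lambda) = \frac{1}{\pi}\mathrm{Im}(F(\lambda)) \tag{3.117} \]
whenever $(D\mu)(\lambda)$ exists.
Moreover, the set $\{\lambda \,|\, \mathrm{Im}(F(\lambda)) = \infty\}$ is a support for the singular part and $\{\lambda \,|\, \mathrm{Im}(F(\lambda)) < \infty\}$ is a support for the absolutely continuous part.
In particular,
Corollary 3.19. The measure $\mu$ is purely absolutely continuous on $I$ if $\limsup_{\varepsilon\downarrow 0} \mathrm{Im}(F(\lambda + \mathrm{i}\varepsilon)) < \infty$ for all $\lambda \in I$.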
Chapter 4
Applications of the spectral theorem
This chapter can be mostly skipped on first reading. You might want to have a look at the first section and then come back to the remaining ones later.
Now let us show how the spectral theorem can be used. We will give a few typical applications:
Firstly we will derive an operator-valued version of Stieltjes inversion formula. To do this, we need to show how to integrate a family of functions of $A$ with respect to a parameter. Moreover, we will show that these integrals can be evaluated by computing the corresponding integrals of the complex-valued functions.
Secondly we will consider commuting operators and show how certain facts, which are known to hold for the resolvent of an operator $A$, can be established for a larger class of functions.
Then we will show how the eigenvalues below the essential spectrum and the dimension of $\mathrm{Ran}\,P_A(\Omega)$ can be estimated using the quadratic form.
Finally, we will investigate tensor products of operators.
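4.1. Integral formulas
We begin with the first task by having a closer look at the projectors $P_A(\Omega)$. They project onto subspaces corresponding to expectation values in the set $\Omega$. In particular, the number
\[ \langle\psi, \chi_\Omega(A)\psi\rangle \tag{4.1} \]
is the probability for a measurement of $a$ to lie in $\Omega$. In addition, we have
\[ \langle\psi, A\psi\rangle = \int_\Omega \lambda\,d\mu_\psi(\lambda) \in \mathrm{hull}(\Omega), \quad \psi \in P_A(\Omega)\mathfrak{H},\ \|\psi\| = 1, \tag{4.2} \]
where $\mathrm{hull}(\Omega)$ is the convex hull of $\Omega$.
The space $\mathrm{Ran}\,\chi_{\{\lambda_0\}}(A)$ is called the eigenspace corresponding to $\lambda_0$ since we have
\[ \langle\varphi, A\psi\rangle = \int_{\mathbb{R}} \lambda\,\chi_{\{\lambda_0\}}(\lambda)\,d\mu_{\varphi,\psi}(\lambda) = \lambda_0\int_{\mathbb{R}} d\mu_{\varphi,\psi}(\lambda) = \lambda_0\langle\varphi, \psi\rangle \tag{4.3} \]
and hence $A\psi = \lambda_0\psi$ for all $\psi \in \mathrm{Ran}\,\chi_{\{\lambda_0\}}(A)$. The dimension of the eigenspace is called the multiplicity of the eigenvalue.
Moreover, since
\[ \lim_{\varepsilon\downarrow 0} \frac{-\mathrm{i}\varepsilon}{\lambda - \lambda_0 - \mathrm{i}\varepsilon} = \chi_{\{\lambda_0\}}(\lambda) \tag{4.4} \]
we infer from Theorem 3.1 that
\[ \lim_{\varepsilon\downarrow 0} -\mathrm{i}\varepsilon R_A(\lambda_0 + \mathrm{i}\varepsilon)\psi = \chi_{\{\lambda_0\}}(A)\psi. \tag{4.5} \]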
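The scalar limit (4.4) is easy to observe numerically. The sketch below is an illustration only; the function name `approximate_indicator` and the sample values are assumptions of this example.

```python
def approximate_indicator(lam, lam0, eps):
    """The function -i eps / (lam - lam0 - i eps) appearing in (4.4)."""
    return -1j * eps / (lam - lam0 - 1j * eps)

lam0 = 2.0
at_the_point = [approximate_indicator(lam0, lam0, eps)
                for eps in (1e-1, 1e-3, 1e-6)]
away_from_point = [abs(approximate_indicator(2.5, lam0, eps))
                   for eps in (1e-1, 1e-3, 1e-6)]
```

At $\lambda = \lambda_0$ the value is exactly 1 for every $\varepsilon$, while for $\lambda \ne \lambda_0$ the modulus is at most $\varepsilon/|\lambda - \lambda_0|$. The functions are also uniformly bounded by 1, which is the hypothesis needed to pass from pointwise convergence to strong operator convergence.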
Similarly, we can obtain an operator-valued version of Stieltjes inversion formula. But first we need to recall a few facts from integration in Banach spaces.
We will consider the case of mappings $f: I \to X$ where $I = [t_0, t_1] \subset \mathbb{R}$ is a compact interval and $X$ is a Banach space. As before, a function $f: I \to X$ is called simple if the image of $f$ is finite, $f(I) = \{x_i\}_{i=1}^n$, and if each inverse image $f^{-1}(x_i)$, $1 \le i \le n$, is a Borel set. The set of simple functions $S(I, X)$ forms a linear space and can be equipped with the sup norm
\[ \|f\|_\infty = \sup_{t\in I} \|f(t)\|. \tag{4.6} \]
The corresponding Banach space obtained after completion is called the set of regulated functions $R(I, X)$.
Observe that $C(I, X) \subset R(I, X)$. In fact, consider the simple function $f_n = \sum_{i=0}^{n-1} f(s_i)\chi_{[s_i, s_{i+1})}$, where $s_i = t_0 + i\frac{t_1 - t_0}{n}$. Since $f \in C(I, X)$ is uniformly continuous, we infer that $f_n$ converges uniformly to $f$.
For $f \in S(I, X)$ we can define a linear map $\int: S(I, X) \to X$ by
\[ \int_I f(t)\,dt = \sum_{i=1}^n x_i\,|f^{-1}(x_i)|, \tag{4.7} \]
where $|\Omega|$ denotes the Lebesgue measure of $\Omega$. This map satisfies
\[ \Big\|\int_I f(t)\,dt\Big\| \le \|f\|_\infty (t_1 - t_0) \tag{4.8} \]
and hence it can be extended uniquely to a linear map $\int: R(I, X) \to X$ with the same norm $(t_1 - t_0)$ by Theorem 0.24. We even have
\[ \Big\|\int_I f(t)\,dt\Big\| \le \int_I \|f(t)\|\,dt, \tag{4.9} \]
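which clearly holds for $f \in S(I, X)$ and thus for all $f \in R(I, X)$ by continuity. In addition, if $\ell \in X^*$ is a continuous linear functional, then
\[ \ell\Big(\int_I f(t)\,dt\Big) = \int_I \ell(f(t))\,dt, \quad f \in R(I, X). \tag{4.10} \]
In particular, if $A(t) \in R(I, \mathfrak{L}(\mathfrak{H}))$, then
\[ \Big(\int_I A(t)\,dt\Big)\psi = \int_I (A(t)\psi)\,dt. \tag{4.11} \]
If $I = \mathbb{R}$, we say that $f: I \to X$ is integrable if $f \in R([-r, r], X)$ for all $r > 0$ and if $\|f(t)\|$ is integrable. In this case we can set
\[ \int_{\mathbb{R}} f(t)\,dt = \lim_{r\to\infty}\int_{[-r,r]} f(t)\,dt \tag{4.12} \]
and (4.9) and (4.10) still hold.
We write $f \in C^1(I, X)$ if
\[ \frac{d}{dt}f(t) = \lim_{\varepsilon\to 0} \frac{f(t + \varepsilon) - f(t)}{\varepsilon} \tag{4.13} \]
exists for all $t \in I$. In particular, if $f \in C(I, X)$, then $F(t) = \int_{t_0}^t f(s)\,ds \in C^1(I, X)$ and $dF/dt = f$ as can be seen from
\[ \|F(t + \varepsilon) - F(t) - f(t)\varepsilon\| = \Big\|\int_t^{t+\varepsilon} (f(s) - f(t))\,ds\Big\| \le |\varepsilon| \sup_{s\in[t,t+\varepsilon]} \|f(s) - f(t)\|. \tag{4.14} \]
The important facts for us are the following two results.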
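Lemma 4.1. Suppose $f: I \times \mathbb{R} \to \mathbb{C}$ is a bounded Borel function and set $F(\lambda) = \int_I f(t, \lambda)\,dt$. Let $A$ be self-adjoint. Then $f(t, A) \in R(I, \mathfrak{L}(\mathfrak{H}))$ and
\[ F(A) = \int_I f(t, A)\,dt \quad\text{respectively}\quad F(A)\psi = \int_I f(t, A)\psi\,dt. \tag{4.15} \]
Proof. That $f(t, A) \in R(I, \mathfrak{L}(\mathfrak{H}))$ follows from the spectral theorem, since it is no restriction to assume that $A$ is multiplication by $\lambda$ in some $L^2$ space. We compute
\[ \Big\langle\varphi, \Big(\int_I f(t, A)\,dt\Big)\psi\Big\rangle = \int_I \langle\varphi, f(t, A)\psi\rangle\,dt = \int_I\int_{\mathbb{R}} f(t, \lambda)\,d\mu_{\varphi,\psi}(\lambda)\,dt \]
\[ = \int_{\mathbb{R}}\int_I f(t, \lambda)\,dt\,d\mu_{\varphi,\psi}(\lambda) = \int_{\mathbb{R}} F(\lambda)\,d\mu_{\varphi,\psi}(\lambda) = \langle\varphi, F(A)\psi\rangle \tag{4.16} \]
by Fubini's theorem and hence the first claim follows.
Lemma 4.2. Suppose $f: \mathbb{R} \to \mathfrak{L}(\mathfrak{H})$ is integrable and $A \in \mathfrak{L}(\mathfrak{H})$. Then
\[ A\int_{\mathbb{R}} f(t)\,dt = \int_{\mathbb{R}} Af(t)\,dt \quad\text{respectively}\quad \int_{\mathbb{R}} f(t)\,dt\,A = \int_{\mathbb{R}} f(t)A\,dt. \tag{4.17} \]
Proof. It suffices to prove the case where $f$ is simple and of compact support. But for such functions the claim is straightforward.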
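Now we can prove Stone's formula.
Theorem 4.3 (Stone's formula). Let $A$ be self-adjoint, then
\[ \frac{1}{2\pi\mathrm{i}}\int_{\lambda_1}^{\lambda_2} \big(R_A(\lambda + \mathrm{i}\varepsilon) - R_A(\lambda - \mathrm{i}\varepsilon)\big)\,d\lambda \to \frac{1}{2}\big(P_A([\lambda_1, \lambda_2]) + P_A((\lambda_1, \lambda_2))\big) \tag{4.18} \]
strongly.
Proof. The result follows combining Lemma 4.1 with Theorem 3.1 and (3.91).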
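For an operator with a single eigenvalue $\lambda_0$, Stone's formula reduces to the scalar limit (3.91), which can be evaluated in closed form. The sketch below is an illustration under that assumption; the function name `smoothed_indicator` is introduced here. It uses $\frac{1}{2\pi\mathrm{i}}\big(\frac{1}{\lambda_0 - \lambda - \mathrm{i}\varepsilon} - \frac{1}{\lambda_0 - \lambda + \mathrm{i}\varepsilon}\big) = \frac{1}{\pi}\frac{\varepsilon}{(\lambda - \lambda_0)^2 + \varepsilon^2}$, whose antiderivative in $\lambda$ is $\frac{1}{\pi}\arctan\frac{\lambda - \lambda_0}{\varepsilon}$.

```python
import math

def smoothed_indicator(lam0, lam1, lam2, eps):
    """Left side of (4.18) for a scalar 'operator' with the single eigenvalue lam0.

    Closed form of (1/pi) * integral over [lam1, lam2] of
    eps / ((lam - lam0)^2 + eps^2) d lam.
    """
    return (math.atan((lam2 - lam0) / eps)
            - math.atan((lam1 - lam0) / eps)) / math.pi

eps = 1e-9
inside = smoothed_indicator(0.5, 0.0, 1.0, eps)     # eigenvalue in (0, 1)
endpoint = smoothed_indicator(0.0, 0.0, 1.0, eps)   # eigenvalue at the endpoint 0
outside = smoothed_indicator(2.0, 0.0, 1.0, eps)    # eigenvalue outside [0, 1]
```

The three values approach 1, 1/2 and 0 respectively, matching the average of the closed and open interval on the right-hand side of (4.18).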
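Problem 4.1. Let $\Gamma$ be a differentiable, positively oriented Jordan curve in $\rho(A)$. Show
\[ \chi_\Omega(A) = -\frac{1}{2\pi\mathrm{i}}\int_\Gamma R_A(z)\,dz, \tag{4.19} \]
where $\Omega$ is the intersection of the interior of $\Gamma$ with $\mathbb{R}$.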
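4.2. Commuting operators
Now we come to commuting operators. As a preparation we can now prove
Lemma 4.4. Let $K \subseteq \mathbb{R}$ be closed. And let $C_\infty(K)$ be the set of all continuous functions on $K$ which vanish at $\infty$ (if $K$ is unbounded) with the sup norm. The $*$-algebra generated by the function
\[ \lambda \mapsto \frac{1}{\lambda - z} \tag{4.20} \]
for one $z \in \mathbb{C}\setminus K$ is dense in $C_\infty(K)$.
Proof. If $K$ is compact, the claim follows directly from the complex Stone-Weierstraß theorem since $(\lambda_1 - z)^{-1} = (\lambda_2 - z)^{-1}$ implies $\lambda_1 = \lambda_2$. Otherwise, replace $K$ by $\tilde K = K \cup \{\infty\}$, which is compact, and set $(\infty - z)^{-1} = 0$. Then we can again apply the complex Stone-Weierstraß theorem to conclude that our $*$-subalgebra is equal to $\{f \in C(\tilde K) \,|\, f(\infty) = 0\}$ which is equivalent to $C_\infty(K)$.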
We say that two bounded operators $A$, $B$ commute if
\[ [A, B] = AB - BA = 0. \tag{4.21} \]
If $A$ or $B$ is unbounded, we soon run into trouble with this definition since the above expression might not even make sense for any nonzero vector (e.g., take $B = \langle\varphi, .\rangle\psi$ with $\psi \notin \mathfrak{D}(A)$). To avoid this nuisance we will replace $A$ by a bounded function of $A$. A good candidate is the resolvent. Hence if $A$ is self-adjoint and $B$ is bounded we will say that $A$ and $B$ commute if
\[ [R_A(z), B] = [R_A(z^*), B] = 0 \tag{4.22} \]
for one $z \in \rho(A)$.
Lemma 4.5. Suppose $A$ is self-adjoint and commutes with the bounded operator $B$. Then
\[ [f(A), B] = 0 \tag{4.23} \]
for any bounded Borel function $f$. If $f$ is unbounded, the claim holds for any $\psi \in \mathfrak{D}(f(A))$.
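Proof. Equation (4.22) tells us that (4.23) holds for any $f$ in the $*$-subalgebra generated by $R_A(z)$. Since this subalgebra is dense in $C_\infty(\sigma(A))$, the claim follows for all such $f \in C_\infty(\sigma(A))$. Next fix $\psi \in \mathfrak{H}$ and let $f$ be bounded. Choose a sequence $f_n \in C_\infty(\sigma(A))$ converging to $f$ in $L^2(\mathbb{R}, d\mu_\psi)$. Then
\[ B f(A)\psi = \lim_{n\to\infty} B f_n(A)\psi = \lim_{n\to\infty} f_n(A) B\psi = f(A) B\psi. \tag{4.24} \]
If $f$ is unbounded, let $\psi \in \mathfrak{D}(f(A))$ and choose $f_n$ as in (3.24). Then
\[ f(A) B\psi = \lim_{n\to\infty} f_n(A) B\psi = \lim_{n\to\infty} B f_n(A)\psi \tag{4.25} \]
shows $f \in L^2(\mathbb{R}, d\mu_{B\psi})$ (i.e., $B\psi \in \mathfrak{D}(f(A))$) and $f(A) B\psi = B f(A)\psi$.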
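For matrices, Lemma 4.5 can be observed directly. The following sketch is an illustration only: the 4x4 matrices `A` and `B`, the helper `apply_function`, and the chosen step function are assumptions made here. `B` is built so that it commutes with the resolvent of `A`, and the check is that it then also commutes with a discontinuous Borel function of `A`.

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
# A has the eigenvalue 1 with multiplicity two, so a non-diagonal B can commute with it.
A = Q @ np.diag([1.0, 1.0, 2.0, 5.0]) @ Q.T
# B acts arbitrarily inside the two-dimensional eigenspace and diagonally elsewhere.
block = np.zeros((4, 4))
block[:2, :2] = rng.standard_normal((2, 2))
block[2, 2], block[3, 3] = -1.0, 0.5
B = Q @ block @ Q.T

def commutator(X, Y):
    return X @ Y - Y @ X

def apply_function(f, A):
    """f(A) via the spectral decomposition of the symmetric matrix A."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

resolvent = np.linalg.inv(A - 1j * np.eye(4))                 # R_A(i)
step = apply_function(lambda x: (x > 1.5).astype(float), A)   # chi_{(1.5, oo)}(A)

comm_resolvent = np.linalg.norm(commutator(resolvent, B))
comm_step = np.linalg.norm(commutator(step, B))
print(comm_resolvent, comm_step)
```

Both commutators vanish up to rounding, in line with (4.22) implying (4.23).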
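Corollary 4.6. If $A$ is self-adjoint and bounded, then (4.22) holds if and only if (4.21) holds.
Proof. Since $\sigma(A)$ is compact, we have $\lambda \in C_\infty(\sigma(A))$ and hence (4.21) follows from (4.23) by our lemma. Conversely, since $B$ commutes with any polynomial of $A$, the claim follows from the Neumann series.
As another consequence we obtain
Theorem 4.7. Suppose $A$ is self-adjoint and has simple spectrum. A bounded operator $B$ commutes with $A$ if and only if $B = f(A)$ for some bounded Borel function $f$.
Proof. Let $\psi$ be a cyclic vector for $A$. By our unitary equivalence it is no restriction to assume $\mathfrak{H} = L^2(\mathbb{R}, d\mu_\psi)$. Then
\[ B g(\lambda) = B g(\lambda) \cdot 1 = g(\lambda) (B1)(\lambda) \tag{4.26} \]
since $B$ commutes with the multiplication operator $g(\lambda)$. Hence $B$ is multiplication by $f(\lambda) = (B1)(\lambda)$.
The assumption that the spectrum of $A$ is simple is crucial as the example $A = \mathbb{I}$ shows. Note also that the functions $\exp(-itA)$ can be used instead of resolvents.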
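Lemma 4.8. Suppose $A$ is self-adjoint and $B$ is bounded. Then $B$ commutes with $A$ if and only if
\[ [e^{-iAt}, B] = 0 \tag{4.27} \]
for all $t \in \mathbb{R}$.
Proof. It suffices to show $[\hat{f}(A), B] = 0$ for $f \in \mathcal{S}(\mathbb{R})$, since these functions are dense in $C_\infty(\mathbb{R})$. Here $\hat{f}$ denotes the Fourier transform of $f$. But for such $f$ we have
\[ [\hat{f}(A), B] = \frac{1}{\sqrt{2\pi}} \Big[ \int_{\mathbb{R}} f(t) e^{-iAt}\,dt, B \Big] = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(t) [e^{-iAt}, B]\,dt = 0 \tag{4.28} \]
by Lemma 4.2.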
The extension to the case where $B$ is self-adjoint and unbounded is straightforward. We say that $A$ and $B$ commute in this case if
\[ [R_A(z_1), R_B(z_2)] = [R_A(z_1^*), R_B(z_2)] = 0 \tag{4.29} \]
for one $z_1 \in \rho(A)$ and one $z_2 \in \rho(B)$ (the claim for $z_2^*$ follows by taking adjoints). From our above analysis it follows that this is equivalent to
\[ [e^{-iAt}, e^{-iBs}] = 0, \qquad t, s \in \mathbb{R}, \tag{4.30} \]
respectively
\[ [f(A), g(B)] = 0 \tag{4.31} \]
for arbitrary bounded Borel functions $f$ and $g$.
4.3. The min-max theorem
In many applications a self-adjoint operator has a number of eigenvalues below the bottom of the essential spectrum. The essential spectrum is obtained from the spectrum by removing all discrete eigenvalues with finite multiplicity (we will have a closer look at it in Section 6.2). In general there is no way of computing the lowest eigenvalues and their corresponding eigenfunctions explicitly. However, one often has some idea of what the eigenfunctions might approximately look like.
So suppose we have a normalized function $\psi_1$ which is an approximation for the eigenfunction $\varphi_1$ of the lowest eigenvalue $E_1$. Then by Theorem 2.15 we know that
\[ \langle \psi_1, A\psi_1 \rangle \ge \langle \varphi_1, A\varphi_1 \rangle = E_1. \tag{4.32} \]
If we add some free parameters to $\psi_1$, one can optimize them and obtain quite good upper bounds for the first eigenvalue.
But is there also something one can say about the next eigenvalues? Suppose we know the first eigenfunction $\varphi_1$, then we can restrict $A$ to the orthogonal complement of $\varphi_1$ and proceed as before: $E_2$ will be the infimum over all expectations restricted to this subspace. If we restrict to the orthogonal complement of an approximating eigenfunction $\psi_1$, there will still be a component in the direction of $\varphi_1$ left and hence the infimum of the expectations will be lower than $E_2$. Thus the optimal choice $\psi_1 = \varphi_1$ will give the maximal value $E_2$.
More precisely, let $\{\varphi_j\}_{j=1}^N$ be an orthonormal basis for the space spanned by the eigenfunctions corresponding to eigenvalues below the essential spectrum. Assume they satisfy $(A - E_j)\varphi_j = 0$, where $E_j \le E_{j+1}$ are the eigenvalues (counted according to their multiplicity). If the number of eigenvalues $N$ is finite we set $E_j = \inf \sigma_{ess}(A)$ for $j > N$ and choose $\varphi_j$ orthonormal such that $\|(A - E_j)\varphi_j\| \le \varepsilon$.
Define
\[ U(\psi_1, \dots, \psi_n) = \{ \psi \in \mathfrak{D}(A) \mid \|\psi\| = 1,\ \psi \in \operatorname{span}\{\psi_1, \dots, \psi_n\}^\perp \}. \tag{4.33} \]
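(i) We have
\[ \inf_{\psi \in U(\psi_1, \dots, \psi_{n-1})} \langle \psi, A\psi \rangle \le E_n + O(\varepsilon). \tag{4.34} \]
In fact, set $\psi = \sum_{j=1}^n \alpha_j \varphi_j$ and choose $\alpha_j$ such that $\psi \in U(\psi_1, \dots, \psi_{n-1})$, then
\[ \langle \psi, A\psi \rangle = \sum_{j=1}^n |\alpha_j|^2 E_j + O(\varepsilon) \le E_n + O(\varepsilon) \tag{4.35} \]
and the claim follows.
(ii) We have
\[ \inf_{\psi \in U(\varphi_1, \dots, \varphi_{n-1})} \langle \psi, A\psi \rangle \ge E_n - O(\varepsilon). \tag{4.36} \]
In fact, set $\psi = \varphi_n$.
Since $\varepsilon$ can be chosen arbitrarily small we have proven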
Theorem 4.9 (Min-Max). Let $A$ be self-adjoint and let $E_1 \le E_2 \le E_3 \le \cdots$ be the eigenvalues of $A$ below the essential spectrum respectively the infimum of the essential spectrum once there are no more eigenvalues left. Then
\[ E_n = \sup_{\psi_1, \dots, \psi_{n-1}} \ \inf_{\psi \in U(\psi_1, \dots, \psi_{n-1})} \langle \psi, A\psi \rangle. \tag{4.37} \]
Clearly the same result holds if $\mathfrak{D}(A)$ is replaced by the quadratic form domain $\mathfrak{Q}(A)$ in the definition of $U$. In addition, as long as $E_n$ is an eigenvalue, the sup and inf are in fact max and min, explaining the name.
Corollary 4.10. Suppose $A$ and $B$ are self-adjoint operators with $A \ge B$ (i.e. $A - B \ge 0$), then $E_n(A) \ge E_n(B)$.
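In finite dimensions, the variational bound (4.32), the optimal choice $\psi_1 = \varphi_1$ in the min-max principle, and Corollary 4.10 can all be checked numerically. The sketch below is an illustration under assumptions: the random symmetric matrices `A` and `B`, the trial vector `psi`, and all variable names are chosen here.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
S = rng.standard_normal((n, n))
B = (S + S.T) / 2                       # symmetric test matrix
C = rng.standard_normal((n, n))
A = B + C @ C.T                         # A - B = C C^T >= 0, hence A >= B

E_A = np.linalg.eigvalsh(A)             # ascending: E_1(A) <= E_2(A) <= ...
E_B = np.linalg.eigvalsh(B)

# Variational bound (4.32): any normalized trial vector gives <psi, A psi> >= E_1(A).
psi = rng.standard_normal(n)
psi /= np.linalg.norm(psi)
rayleigh = psi @ A @ psi

# Optimal choice psi_1 = phi_1: minimizing over the orthogonal complement of the
# first eigenvector reproduces E_2(A).
_, vecs = np.linalg.eigh(A)
P = np.eye(n) - np.outer(vecs[:, 0], vecs[:, 0])    # projector onto span{phi_1}^perp
# Push the phi_1 direction up to the top eigenvalue so it cannot be the minimizer.
shifted = P @ A @ P + E_A[-1] * (np.eye(n) - P)
E2_from_restriction = np.linalg.eigvalsh(shifted)[0]

print(rayleigh - E_A[0], E2_from_restriction - E_A[1], np.min(E_A - E_B))
```

The last printed number is the smallest gap $E_n(A) - E_n(B)$, which Corollary 4.10 predicts to be nonnegative.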
Problem 4.2. Suppose $A$, $A_n$ are bounded and $A_n \to A$. Then $E_k(A_n) \to E_k(A)$. (Hint: $\|A - A_n\| \le \varepsilon$ is equivalent to $A - \varepsilon \le A_n \le A + \varepsilon$.)
4.4. Estimating eigenspaces
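Next, we show that the dimension of the range of $P_A(\Omega)$ can be estimated if we have some functions which lie approximately in this space.
Theorem 4.11. Suppose $A$ is a bounded self-adjoint operator and $\psi_j$, $1 \le j \le k$, are linearly independent elements of $\mathfrak{H}$.
(i). Let $\lambda \in \mathbb{R}$, $\psi_j \in \mathfrak{Q}(A)$. If
\[ \langle \psi, A\psi \rangle < \lambda \|\psi\|^2 \tag{4.38} \]
for any nonzero linear combination $\psi = \sum_{j=1}^k c_j \psi_j$, then
\[ \dim \operatorname{Ran} P_A((-\infty, \lambda)) \ge k. \tag{4.39} \]
Similarly, $\langle \psi, A\psi \rangle > \lambda \|\psi\|^2$ implies $\dim \operatorname{Ran} P_A((\lambda, \infty)) \ge k$.
(ii). Let $\lambda_1 < \lambda_2$, $\psi_j \in \mathfrak{D}(A)$. If
\[ \Big\| \Big(A - \frac{\lambda_2 + \lambda_1}{2}\Big)\psi \Big\| < \frac{\lambda_2 - \lambda_1}{2} \|\psi\| \tag{4.40} \]
for any nonzero linear combination $\psi = \sum_{j=1}^k c_j \psi_j$, then
\[ \dim \operatorname{Ran} P_A((\lambda_1, \lambda_2)) \ge k. \tag{4.41} \]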
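Proof. (i). Let $M = \operatorname{span}\{\psi_j\} \subseteq \mathfrak{H}$. We claim $\dim P_A((-\infty, \lambda))M = \dim M = k$. For this it suffices to show $\operatorname{Ker} P_A((-\infty, \lambda))|_M = \{0\}$. Suppose $P_A((-\infty, \lambda))\psi = 0$, $\psi \ne 0$. Then we see that for any nonzero linear combination $\psi$
\[ \langle \psi, A\psi \rangle = \int_{\mathbb{R}} \eta \, d\mu_\psi(\eta) = \int_{[\lambda, \infty)} \eta \, d\mu_\psi(\eta) \ge \lambda \int_{[\lambda, \infty)} d\mu_\psi(\eta) = \lambda \|\psi\|^2. \tag{4.42} \]
This contradicts our assumption (4.38).
(ii). Using the same notation as before we need to show $\operatorname{Ker} P_A((\lambda_1, \lambda_2))|_M = \{0\}$. If $P_A((\lambda_1, \lambda_2))\psi = 0$, $\psi \ne 0$, then
\[ \Big\| \Big(A - \frac{\lambda_2 + \lambda_1}{2}\Big)\psi \Big\|^2 = \int_{\mathbb{R}} \Big(x - \frac{\lambda_2 + \lambda_1}{2}\Big)^2 d\mu_\psi(x) = \int_\Omega x^2 \, d\mu_\psi\Big(x + \frac{\lambda_2 + \lambda_1}{2}\Big) \]
\[ \ge \frac{(\lambda_2 - \lambda_1)^2}{4} \int_\Omega d\mu_\psi\Big(x + \frac{\lambda_2 + \lambda_1}{2}\Big) = \frac{(\lambda_2 - \lambda_1)^2}{4} \|\psi\|^2, \tag{4.43} \]
where $\Omega = (-\infty, -(\lambda_2 - \lambda_1)/2] \cup [(\lambda_2 - \lambda_1)/2, \infty)$. But this is a contradiction as before.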
4.5. Tensor products of operators
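Suppose $A_j$, $1 \le j \le n$, are self-adjoint operators on $\mathfrak{H}_j$. For every monomial $\lambda_1^{n_1} \cdots \lambda_n^{n_n}$ we can define
\[ (A_1^{n_1} \otimes \cdots \otimes A_n^{n_n})\, \psi_1 \otimes \cdots \otimes \psi_n = (A_1^{n_1}\psi_1) \otimes \cdots \otimes (A_n^{n_n}\psi_n), \qquad \psi_j \in \mathfrak{D}(A_j^{n_j}). \tag{4.44} \]
Hence for every polynomial $P(\lambda_1, \dots, \lambda_n)$ of degree $N$ we can define
\[ P(A_1, \dots, A_n)\, \psi_1 \otimes \cdots \otimes \psi_n, \qquad \psi_j \in \mathfrak{D}(A_j^N), \tag{4.45} \]
and extend this definition to obtain a linear operator on the set
\[ \mathfrak{D} = \operatorname{span}\{ \psi_1 \otimes \cdots \otimes \psi_n \mid \psi_j \in \mathfrak{D}(A_j^N) \}. \tag{4.46} \]
Moreover, if $P$ is real-valued, then the operator $P(A_1, \dots, A_n)$ on $\mathfrak{D}$ is symmetric and we can consider its closure, which will again be denoted by $P(A_1, \dots, A_n)$.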
Theorem 4.12. Suppose $A_j$, $1 \le j \le n$, are self-adjoint operators on $\mathfrak{H}_j$ and let $P(\lambda_1, \dots, \lambda_n)$ be a real-valued polynomial and define $P(A_1, \dots, A_n)$ as above.
Then $P(A_1, \dots, A_n)$ is self-adjoint and its spectrum is the closure of the range of $P$ on the product of the spectra of the $A_j$, that is,
\[ \sigma(P(A_1, \dots, A_n)) = \overline{P(\sigma(A_1), \dots, \sigma(A_n))}. \tag{4.47} \]
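Proof. By the spectral theorem it is no restriction to assume that $A_j$ is multiplication by $\lambda_j$ on $L^2(\mathbb{R}, d\mu_j)$ and $P(A_1, \dots, A_n)$ is hence multiplication by $P(\lambda_1, \dots, \lambda_n)$ on $L^2(\mathbb{R}^n, d\mu_1 \times \cdots \times d\mu_n)$. Since $\mathfrak{D}$ contains the set of all functions $\psi_1(\lambda_1) \cdots \psi_n(\lambda_n)$ for which $\psi_j \in L^2_c(\mathbb{R}, d\mu_j)$ it follows that the domain of the closure of $P$ contains $L^2_c(\mathbb{R}^n, d\mu_1 \times \cdots \times d\mu_n)$. Hence $P$ is the maximally defined multiplication operator by $P(\lambda_1, \dots, \lambda_n)$, which is self-adjoint.
Now let $\lambda = P(\lambda_1, \dots, \lambda_n)$ with $\lambda_j \in \sigma(A_j)$. Then there exist Weyl sequences $\psi_{j,k} \in \mathfrak{D}(A_j^N)$ with $(A_j - \lambda_j)\psi_{j,k} \to 0$ as $k \to \infty$. Then $(P - \lambda)\psi_k \to 0$, where $\psi_k = \psi_{1,k} \otimes \cdots \otimes \psi_{n,k}$, and hence $\lambda \in \sigma(P)$. Conversely, if $\lambda \notin \overline{P(\sigma(A_1), \dots, \sigma(A_n))}$, then $|P(\lambda_1, \dots, \lambda_n) - \lambda| \ge \varepsilon$ for a.e. $\lambda_j$ with respect to $\mu_j$ and hence $(P - \lambda)^{-1}$ exists and is bounded, that is, $\lambda \in \rho(P)$.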
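The two main cases of interest are $A_1 \otimes A_2$, in which case
\[ \sigma(A_1 \otimes A_2) = \overline{\sigma(A_1)\sigma(A_2)} = \overline{\{ \lambda_1 \lambda_2 \mid \lambda_j \in \sigma(A_j) \}}, \tag{4.48} \]
and $A_1 \otimes \mathbb{I} + \mathbb{I} \otimes A_2$, in which case
\[ \sigma(A_1 \otimes \mathbb{I} + \mathbb{I} \otimes A_2) = \overline{\sigma(A_1) + \sigma(A_2)} = \overline{\{ \lambda_1 + \lambda_2 \mid \lambda_j \in \sigma(A_j) \}}. \tag{4.49} \]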
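For matrices the tensor product is the Kronecker product, so (4.48) and (4.49) can be verified directly. The sketch below is only an illustration: the two 2x2 matrices `A1` and `A2` and all variable names are assumptions made here.

```python
import numpy as np

A1 = np.diag([0.0, 1.0])                   # sigma(A1) = {0, 1}
A2 = np.array([[2.0, 1.0], [1.0, 2.0]])    # sigma(A2) = {1, 3}
I1, I2 = np.eye(2), np.eye(2)

product = np.kron(A1, A2)                        # A1 (x) A2
sum_op = np.kron(A1, I2) + np.kron(I1, A2)       # A1 (x) I + I (x) A2

s1, s2 = np.linalg.eigvalsh(A1), np.linalg.eigvalsh(A2)
expected_products = np.sort(np.multiply.outer(s1, s2).ravel())
expected_sums = np.sort(np.add.outer(s1, s2).ravel())

spec_product = np.linalg.eigvalsh(product)       # eigenvalues in ascending order
spec_sum = np.linalg.eigvalsh(sum_op)
print(spec_product, spec_sum)
```

In finite dimensions the spectra are finite sets, so no closure is needed; the closure in (4.48) and (4.49) only matters for unbounded or infinite-dimensional operators.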
Chapter 5
Quantum dynamics
As in the finite dimensional case, the solution of the Schrödinger equation
\[ i \frac{d}{dt} \psi(t) = H\psi(t) \tag{5.1} \]
is given by
\[ \psi(t) = \exp(-itH)\psi(0). \tag{5.2} \]
A detailed investigation of this formula will be our first task. Moreover, in the finite dimensional case the dynamics is understood once the eigenvalues are known and the same is true in our case once we know the spectrum. Note that, like any Hamiltonian system from classical mechanics, our system is not hyperbolic (i.e., the spectrum is not away from the real axis) and hence simple results like "all solutions tend to the equilibrium position" cannot be expected.
5.1. The time evolution and Stone's theorem
In this section we want to have a look at the initial value problem associated with the Schrödinger equation (2.12) in the Hilbert space $\mathfrak{H}$. If $\mathfrak{H}$ is one-dimensional (and hence $A$ is a real number), the solution is given by
\[ \psi(t) = e^{-itA}\psi(0). \tag{5.3} \]
Our hope is that this formula also applies in the general case and that we can reconstruct a one-parameter unitary group $U(t)$ from its generator $A$ (compare (2.11)) via $U(t) = \exp(-itA)$. We first investigate the family of operators $\exp(-itA)$.
Theorem 5.1. Let $A$ be self-adjoint and let $U(t) = \exp(-itA)$.
(i). $U(t)$ is a strongly continuous one-parameter unitary group.
(ii). The limit $\lim_{t \to 0} \frac{1}{t}(U(t)\psi - \psi)$ exists if and only if $\psi \in \mathfrak{D}(A)$, in which case $\lim_{t \to 0} \frac{1}{t}(U(t)\psi - \psi) = -iA\psi$.
(iii). $U(t)\mathfrak{D}(A) = \mathfrak{D}(A)$ and $AU(t) = U(t)A$.
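Proof. The group property (i) follows directly from Theorem 3.1 and the corresponding statements for the function $\exp(-it\lambda)$. To prove strong continuity observe that
\[ \lim_{t \to t_0} \| e^{-itA}\psi - e^{-it_0A}\psi \|^2 = \lim_{t \to t_0} \int_{\mathbb{R}} | e^{-it\lambda} - e^{-it_0\lambda} |^2 \, d\mu_\psi(\lambda) = \int_{\mathbb{R}} \lim_{t \to t_0} | e^{-it\lambda} - e^{-it_0\lambda} |^2 \, d\mu_\psi(\lambda) = 0 \tag{5.4} \]
by the dominated convergence theorem.
Similarly, if $\psi \in \mathfrak{D}(A)$ we obtain
\[ \lim_{t \to 0} \Big\| \frac{1}{t}(e^{-itA}\psi - \psi) + iA\psi \Big\|^2 = \lim_{t \to 0} \int_{\mathbb{R}} \Big| \frac{1}{t}(e^{-it\lambda} - 1) + i\lambda \Big|^2 d\mu_\psi(\lambda) = 0 \tag{5.5} \]
since $|e^{it\lambda} - 1| \le |t\lambda|$. Now let $\tilde{A}$ be the generator defined as in (2.11). Then $\tilde{A}$ is a symmetric extension of $A$ since we have
\[ \langle \varphi, \tilde{A}\psi \rangle = \lim_{t \to 0} \Big\langle \varphi, \frac{i}{t}(U(t) - 1)\psi \Big\rangle = \lim_{t \to 0} \Big\langle \frac{i}{-t}(U(-t) - 1)\varphi, \psi \Big\rangle = \langle \tilde{A}\varphi, \psi \rangle \tag{5.6} \]
and hence $\tilde{A} = A$ by Corollary 2.2. This settles (ii).
To see (iii) replace $\psi \to U(s)\psi$ in (ii).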
For our original problem this implies that formula (5.3) is indeed the solution to the initial value problem of the Schrödinger equation. Moreover,
\[ \langle U(t)\psi, AU(t)\psi \rangle = \langle U(t)\psi, U(t)A\psi \rangle = \langle \psi, A\psi \rangle \tag{5.7} \]
shows that the expectations of $A$ are time independent. This corresponds to conservation of energy.
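The statements of Theorem 5.1 and the conservation law (5.7) are easy to observe for a Hermitian matrix. The sketch below is an illustration under assumptions: the random 4x4 matrix `A`, the vector `psi`, the sample times, and the helper `U` are chosen here; in finite dimensions every vector lies in the domain of `A`.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (M + M.conj().T) / 2                    # Hermitian test matrix
w, V = np.linalg.eigh(A)

def U(t):
    """U(t) = exp(-itA), built from the spectral decomposition of A."""
    return (V * np.exp(-1j * t * w)) @ V.conj().T

psi = rng.standard_normal(4) + 1j * rng.standard_normal(4)
psi /= np.linalg.norm(psi)

# (i) unitarity and group property
unitarity_error = np.linalg.norm(U(0.7).conj().T @ U(0.7) - np.eye(4))
group_error = np.linalg.norm(U(0.3) @ U(0.4) - U(0.7))

# (ii) difference quotient (U(t)psi - psi)/t approaches -iA psi
t = 1e-6
generator_error = np.linalg.norm((U(t) @ psi - psi) / t + 1j * (A @ psi))

# (5.7) expectation of A is conserved along the evolution
energy_0 = np.vdot(psi, A @ psi).real
psi_t = U(2.5) @ psi
energy_t = np.vdot(psi_t, A @ psi_t).real
print(unitarity_error, group_error, generator_error, energy_t - energy_0)
```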
On the other hand, the generator of the time evolution of a quantum mechanical system should always be a self-adjoint operator since it corresponds to an observable (energy). Moreover, there should be a one-to-one correspondence between the unitary group and its generator. This is ensured by Stone's theorem.
Theorem 5.2 (Stone). Let $U(t)$ be a weakly continuous one-parameter unitary group. Then its generator $A$ is self-adjoint and $U(t) = \exp(-itA)$.
Proof. First of all observe that weak continuity together with Lemma 1.11 (iv) shows that $U(t)$ is in fact strongly continuous.
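Next we show that $A$ is densely defined. Pick $\psi \in \mathfrak{H}$ and set
\[ \psi_\tau = \int_0^\tau U(t)\psi \, dt \tag{5.8} \]
(the integral is defined as in Section 4.1) implying $\lim_{\tau \to 0} \tau^{-1}\psi_\tau = \psi$. Moreover,
\[ \frac{1}{t}(U(t)\psi_\tau - \psi_\tau) = \frac{1}{t}\int_t^{t+\tau} U(s)\psi \, ds - \frac{1}{t}\int_0^\tau U(s)\psi \, ds = \frac{1}{t}\int_\tau^{\tau+t} U(s)\psi \, ds - \frac{1}{t}\int_0^t U(s)\psi \, ds \]
\[ = \frac{1}{t} U(\tau) \int_0^t U(s)\psi \, ds - \frac{1}{t}\int_0^t U(s)\psi \, ds \to U(\tau)\psi - \psi \tag{5.9} \]
as $t \to 0$ shows $\psi_\tau \in \mathfrak{D}(A)$. As in the proof of the previous theorem, we can show that $A$ is symmetric and that $U(t)\mathfrak{D}(A) = \mathfrak{D}(A)$.
Next, let us prove that $A$ is essentially self-adjoint. It suffices to prove $\operatorname{Ker}(A^* - z^*) = \{0\}$ for $z \in \mathbb{C} \setminus \mathbb{R}$. Suppose $A^*\varphi = z^*\varphi$, then for each $\psi \in \mathfrak{D}(A)$ we have
\[ \frac{d}{dt} \langle \varphi, U(t)\psi \rangle = \langle \varphi, -iAU(t)\psi \rangle = -i \langle A^*\varphi, U(t)\psi \rangle = -iz \langle \varphi, U(t)\psi \rangle \tag{5.10} \]
and hence $\langle \varphi, U(t)\psi \rangle = \exp(-izt)\langle \varphi, \psi \rangle$. Since the left hand side is bounded for all $t \in \mathbb{R}$ and the exponential on the right hand side is not, we must have $\langle \varphi, \psi \rangle = 0$, implying $\varphi = 0$ since $\mathfrak{D}(A)$ is dense.
So $A$ is essentially self-adjoint and we can introduce $V(t) = \exp(-it\overline{A})$. We are done if we can show $U(t) = V(t)$.
Let $\psi \in \mathfrak{D}(A)$ and abbreviate $\psi(t) = (U(t) - V(t))\psi$. Then
\[ \lim_{s \to 0} \frac{\psi(t+s) - \psi(t)}{s} = -i\overline{A}\psi(t) \tag{5.11} \]
and hence $\frac{d}{dt}\|\psi(t)\|^2 = 2\operatorname{Re}\langle \psi(t), -i\overline{A}\psi(t) \rangle = 0$. Since $\psi(0) = 0$ we have $\psi(t) = 0$ and hence $U(t)$ and $V(t)$ coincide on $\mathfrak{D}(A)$. Furthermore, since $\mathfrak{D}(A)$ is dense we have $U(t) = V(t)$ by continuity.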
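As an immediate consequence of the proof we also note the following useful criterion.
Corollary 5.3. Suppose $\mathfrak{D} \subseteq \mathfrak{D}(A)$ is dense and invariant under $U(t)$. Then $A$ is essentially self-adjoint on $\mathfrak{D}$.
Proof. As in the above proof it follows that $\langle \varphi, \psi \rangle = 0$ for any $\varphi \in \operatorname{Ker}(A^* - z^*)$ and $\psi \in \mathfrak{D}$.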
Note that by Lemma 4.8 two strongly continuous one-parameter groups commute,
\[ [e^{-itA}, e^{-isB}] = 0, \tag{5.12} \]
if and only if the generators commute.
Clearly, for a physicist, one of the goals must be to understand the time evolution of a quantum mechanical system. We have seen that the time evolution is generated by a self-adjoint operator, the Hamiltonian, and is given by a linear first order differential equation, the Schrödinger equation. To understand the dynamics of such a first order differential equation, one must understand the spectrum of the generator. Some general tools for this endeavor will be provided in the following sections.
Problem 5.1. Let $\mathfrak{H} = L^2(0, 2\pi)$ and consider the one-parameter unitary group given by $U(t)f(x) = f(x - t \bmod 2\pi)$. What is the generator of $U$?
5.2. The RAGE theorem
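Now, let us discuss why the decomposition of the spectrum introduced in Section 3.3 is of physical relevance. Let $\|\varphi\| = \|\psi\| = 1$. The vector $\langle \varphi, \psi \rangle \varphi$ is the projection of $\psi$ onto the (one-dimensional) subspace spanned by $\varphi$. Hence $|\langle \varphi, \psi \rangle|^2$ can be viewed as the part of $\psi$ which is in the state $\varphi$. A first question one might raise is, how does
\[ |\langle \varphi, U(t)\psi \rangle|^2, \qquad U(t) = e^{-itA}, \tag{5.13} \]
behave as $t \to \infty$? By the spectral theorem,
\[ \hat{\mu}_{\varphi,\psi}(t) = \langle \varphi, U(t)\psi \rangle = \int_{\mathbb{R}} e^{-it\lambda} \, d\mu_{\varphi,\psi}(\lambda) \tag{5.14} \]
is the Fourier transform of the measure $\mu_{\varphi,\psi}$. Thus our question is answered by Wiener's theorem.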
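Theorem 5.4 (Wiener). Let $\mu$ be a finite complex Borel measure on $\mathbb{R}$ and let
\[ \hat{\mu}(t) = \int_{\mathbb{R}} e^{-it\lambda} \, d\mu(\lambda) \tag{5.15} \]
be its Fourier transform. Then the Cesàro time average of $\hat{\mu}(t)$ has the following limit
\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T |\hat{\mu}(t)|^2 \, dt = \sum_{\lambda \in \mathbb{R}} |\mu(\{\lambda\})|^2, \tag{5.16} \]
where the sum on the right hand side is finite.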
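Proof. By Fubini we have
\[ \frac{1}{T} \int_0^T |\hat{\mu}(t)|^2 \, dt = \frac{1}{T} \int_0^T \int_{\mathbb{R}} \int_{\mathbb{R}} e^{-i(x-y)t} \, d\mu(x) \, d\mu^*(y) \, dt = \int_{\mathbb{R}} \int_{\mathbb{R}} \Big( \frac{1}{T} \int_0^T e^{-i(x-y)t} \, dt \Big) d\mu(x) \, d\mu^*(y). \tag{5.17} \]
The function in parentheses is bounded by one and converges pointwise to $\chi_{\{0\}}(x - y)$ as $T \to \infty$. Thus, by the dominated convergence theorem, the limit of the above expression is given by
\[ \int_{\mathbb{R}} \int_{\mathbb{R}} \chi_{\{0\}}(x - y) \, d\mu(x) \, d\mu^*(y) = \int_{\mathbb{R}} \mu(\{y\}) \, d\mu^*(y) = \sum_{y \in \mathbb{R}} |\mu(\{y\})|^2. \tag{5.18} \]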
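For a pure point measure the Cesàro average in (5.16) can be evaluated in closed form, which makes the convergence in Wiener's theorem visible. The sketch below is an illustration under assumptions: the measure $\mu = \sum_k w_k \delta_{\lambda_k}$ with the three atoms and weights shown, and the helper name `cesaro_average`, are chosen here.

```python
import numpy as np

def cesaro_average(points, weights, T):
    """Exact (1/T) * int_0^T |mu_hat(t)|^2 dt for mu = sum_k weights[k] * delta_{points[k]}."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=complex)
    delta = points[:, None] - points[None, :]
    safe = np.where(delta == 0.0, 1.0, delta)       # avoid 0/0 on the diagonal
    # (1/T) int_0^T exp(-i*delta*t) dt, equal to 1 when delta == 0
    kernel = np.where(delta == 0.0, 1.0,
                      (1.0 - np.exp(-1j * safe * T)) / (1j * safe * T))
    return float(np.real(weights @ kernel @ weights.conj()))

points = [0.0, 1.0, 2.5]
weights = [0.5, 0.3, -0.2]
limit = sum(abs(w) ** 2 for w in weights)            # right-hand side of (5.16)

averages = {T: cesaro_average(points, weights, T) for T in (1e1, 1e3, 1e5)}
for T, value in averages.items():
    print(T, value, abs(value - limit))
```

The off-diagonal terms decay like $1/T$, so only the atoms contribute in the limit; a continuous measure would give zero.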
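To apply this result to our situation, observe that the subspaces $\mathfrak{H}_{ac}$, $\mathfrak{H}_{sc}$, and $\mathfrak{H}_{pp}$ are invariant with respect to time evolution since
\[ P^{xx} U(t) = \chi_{M_{xx}}(A) \exp(-itA) = \exp(-itA) \chi_{M_{xx}}(A) = U(t) P^{xx}, \qquad xx \in \{ac, sc, pp\}. \]
Moreover, if $\psi \in \mathfrak{H}_{xx}$ we have $P^{xx}\psi = \psi$, which shows $\langle \varphi, f(A)\psi \rangle = \langle \varphi, P^{xx} f(A)\psi \rangle = \langle P^{xx}\varphi, f(A)\psi \rangle$, implying $d\mu_{\varphi,\psi} = d\mu_{P^{xx}\varphi,\psi}$. Thus if $\mu_\psi$ is ac, sc, or pp, so is $\mu_{\varphi,\psi}$ for every $\varphi \in \mathfrak{H}$.
An operator $K \in \mathfrak{L}(\mathfrak{H})$ is called a finite rank operator if its range is finite dimensional. If $\{\varphi_j\}_{j=1}^n$ is an orthonormal basis for $\operatorname{Ran}(K)$ we have
\[ K\psi = \sum_{j=1}^n \langle \varphi_j, K\psi \rangle \varphi_j = \sum_{j=1}^n \langle \psi_j, \psi \rangle \varphi_j, \tag{5.19} \]
where $\psi_j = K^*\varphi_j$. The elements $\psi_j$ are linearly independent since $\operatorname{Ran}(K) = \operatorname{Ker}(K^*)^\perp$. Hence every finite rank operator is of the form (5.19). In addition, the adjoint of $K$ is also finite rank and given by
\[ K^*\psi = \sum_{j=1}^n \langle \varphi_j, \psi \rangle \psi_j. \tag{5.20} \]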
The closure of the set of all finite rank operators in $\mathfrak{L}(\mathfrak{H})$ is called the set of compact operators $\mathfrak{C}(\mathfrak{H})$. It is straightforward to verify (Problem 5.2)
Lemma 5.5. The set of all compact operators $\mathfrak{C}(\mathfrak{H})$ is a closed $*$-ideal in $\mathfrak{L}(\mathfrak{H})$.
There is also a weaker version of compactness which is useful for us. The operator $K$ is called relatively compact with respect to $A$ if
\[ K R_A(z) \in \mathfrak{C}(\mathfrak{H}) \tag{5.21} \]
for one $z \in \rho(A)$. By the first resolvent identity this then follows for all $z \in \rho(A)$. In particular we have $\mathfrak{D}(A) \subseteq \mathfrak{D}(K)$.
Now let us return to our original problem.
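Theorem 5.6. Let $A$ be self-adjoint and suppose $K$ is relatively compact. Then
\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T \| K e^{-itA} P^c \psi \|^2 \, dt = 0 \quad\text{and}\quad \lim_{t \to \infty} \| K e^{-itA} P^{ac} \psi \| = 0 \tag{5.22} \]
for every $\psi \in \mathfrak{D}(A)$. In particular, if $K$ is also bounded, then the result holds for any $\psi \in \mathfrak{H}$.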
Proof. Let $\psi \in \mathfrak{H}_c$ respectively $\psi \in \mathfrak{H}_{ac}$ and drop the projectors. Then, if $K$ is a rank one operator (i.e., $K = \langle \varphi_1, . \rangle \varphi_2$), the claim follows from Wiener's theorem respectively the Riemann-Lebesgue lemma. Hence it holds for any finite rank operator $K$.
If $K$ is compact, there is a sequence $K_n$ of finite rank operators such that $\|K - K_n\| \le 1/n$ and hence
\[ \| K e^{-itA}\psi \| \le \| K_n e^{-itA}\psi \| + \frac{1}{n}\|\psi\|. \tag{5.23} \]
Thus the claim holds for any compact operator $K$.
If $\psi \in \mathfrak{D}(A)$ we can set $\psi = (A - i)^{-1}\varphi$, where $\varphi \in \mathfrak{H}_c$ if and only if $\psi \in \mathfrak{H}_c$ (since $\mathfrak{H}_c$ reduces $A$). Since $K(A - i)^{-1}$ is compact by assumption, the claim can be reduced to the previous situation. If, in addition, $K$ is bounded, we can find a sequence $\psi_n \in \mathfrak{D}(A)$ such that $\|\psi - \psi_n\| \le 1/n$ and hence
\[ \| K e^{-itA}\psi \| \le \| K e^{-itA}\psi_n \| + \frac{1}{n}\|K\|, \tag{5.24} \]
concluding the proof.
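With the help of this result we can now prove an abstract version of the RAGE theorem.
Theorem 5.7 (RAGE). Let $A$ be self-adjoint. Suppose $K_n \in \mathfrak{L}(\mathfrak{H})$ is a sequence of relatively compact operators which converges strongly to the identity. Then
\[ \mathfrak{H}_c = \Big\{ \psi \in \mathfrak{H} \,\Big|\, \lim_{n \to \infty} \lim_{T \to \infty} \frac{1}{T} \int_0^T \| K_n e^{-itA}\psi \| \, dt = 0 \Big\}, \]
\[ \mathfrak{H}_{pp} = \Big\{ \psi \in \mathfrak{H} \,\Big|\, \lim_{n \to \infty} \sup_{t \ge 0} \| (\mathbb{I} - K_n) e^{-itA}\psi \| = 0 \Big\}. \tag{5.25} \]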
Proof. Abbreviate $\psi(t) = \exp(-itA)\psi$. We begin with the first equation.
Let $\psi \in \mathfrak{H}_c$, then
\[ \frac{1}{T} \int_0^T \| K_n \psi(t) \| \, dt \le \Big( \frac{1}{T} \int_0^T \| K_n \psi(t) \|^2 \, dt \Big)^{1/2} \to 0 \tag{5.26} \]
by Cauchy-Schwarz and the previous theorem. Conversely, if $\psi \notin \mathfrak{H}_c$ we can write $\psi = \psi^c + \psi^{pp}$. By our previous estimate it suffices to show $\| K_n \psi^{pp}(t) \| \ge \varepsilon > 0$ for $n$ large. In fact, we even claim
\[ \lim_{n \to \infty} \sup_{t \ge 0} \| K_n \psi^{pp}(t) - \psi^{pp}(t) \| = 0. \tag{5.27} \]
By the spectral theorem, we can write $\psi^{pp}(t) = \sum_j \alpha_j(t)\psi_j$, where the $\psi_j$ are orthonormal eigenfunctions and $\alpha_j(t) = \exp(-it\lambda_j)\alpha_j$. Truncate this expansion after $N$ terms, then this part converges uniformly to the desired limit by strong convergence of $K_n$. Moreover, by Lemma 1.13 we have $\|K_n\| \le M$, and hence the error can be made arbitrarily small by choosing $N$ large.
Now let us turn to the second equation. If $\psi \in \mathfrak{H}_{pp}$ the claim follows by (5.27). Conversely, if $\psi \notin \mathfrak{H}_{pp}$ we can write $\psi = \psi^c + \psi^{pp}$ and by our previous estimate it suffices to show that $\| (\mathbb{I} - K_n)\psi^c(t) \|$ does not tend to $0$ as $n \to \infty$. If it did, we would have
\[ 0 = \lim_{T \to \infty} \frac{1}{T} \int_0^T \| (\mathbb{I} - K_n)\psi^c(t) \|^2 \, dt \ge \| \psi^c(t) \|^2 - \lim_{T \to \infty} \frac{1}{T} \int_0^T \| K_n \psi^c(t) \|^2 \, dt = \| \psi^c(t) \|^2, \tag{5.28} \]
a contradiction.
In summary, regularity properties of spectral measures are related to the long time behavior of the corresponding quantum mechanical system. However, a more detailed investigation of this topic is beyond the scope of this manuscript. For a survey containing several recent results see [9].
It is often convenient to treat the observables as time dependent rather than the states. We set
\[ K(t) = e^{itA} K e^{-itA} \tag{5.29} \]
and note
\[ \langle \psi(t), K\psi(t) \rangle = \langle \psi, K(t)\psi \rangle, \qquad \psi(t) = e^{-itA}\psi. \tag{5.30} \]
This point of view is often referred to as the Heisenberg picture in the physics literature. If $K$ is unbounded we will assume $\mathfrak{D}(A) \subseteq \mathfrak{D}(K)$ such that the above equations make sense at least for $\psi \in \mathfrak{D}(A)$. The main interest is the behavior of $K(t)$ for large $t$. The strong limits are called asymptotic observables if they exist.
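Theorem 5.8. Suppose $A$ is self-adjoint and $K$ is relatively compact. Then
\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T e^{itA} K e^{-itA}\psi \, dt = \sum_{\lambda \in \sigma_p(A)} P_A(\{\lambda\}) K P_A(\{\lambda\})\psi, \qquad \psi \in \mathfrak{D}(A). \tag{5.31} \]
If $K$ is in addition bounded, the result holds for any $\psi \in \mathfrak{H}$.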
Proof. We will assume that $K$ is bounded. To obtain the general result, use the same trick as before and replace $K$ by $KR_A(z)$. Write $\psi = \psi^c + \psi^{pp}$. Then
\[ \lim_{T \to \infty} \frac{1}{T} \Big\| \int_0^T K(t)\psi^c \, dt \Big\| \le \lim_{T \to \infty} \frac{1}{T} \int_0^T \| K(t)\psi^c \| \, dt = 0 \tag{5.32} \]
by Theorem 5.6. As in the proof of the previous theorem we can write $\psi^{pp} = \sum_j \alpha_j \psi_j$ and hence
\[ \sum_j \alpha_j \frac{1}{T} \int_0^T K(t)\psi_j \, dt = \sum_j \alpha_j \Big( \frac{1}{T} \int_0^T e^{it(A - \lambda_j)} \, dt \Big) K\psi_j. \tag{5.33} \]
As in the proof of Wiener's theorem, we see that the operator in parentheses tends to $P_A(\{\lambda_j\})$ strongly as $T \to \infty$. Since this operator is also bounded by $1$ for all $T$, we can interchange the limit with the summation and the claim follows.
We also note the following corollary.
Corollary 5.9. Under the same assumptions as in the RAGE theorem we have
\[ \lim_{n \to \infty} \lim_{T \to \infty} \frac{1}{T} \int_0^T e^{itA} K_n e^{-itA}\psi \, dt = P^{pp}\psi \tag{5.34} \]
respectively
\[ \lim_{n \to \infty} \lim_{T \to \infty} \frac{1}{T} \int_0^T e^{itA} (\mathbb{I} - K_n) e^{-itA}\psi \, dt = P^c\psi. \tag{5.35} \]
Problem 5.2. Prove Lemma 5.5.
Problem 5.3. Prove Corollary 5.9.
5.3. The Trotter product formula
In many situations the operator is of the form $A + B$, where $e^{itA}$ and $e^{itB}$ can be computed explicitly. Since $A$ and $B$ will not commute in general, we cannot obtain $e^{it(A+B)}$ from $e^{itA}e^{itB}$. However, we at least have:
Theorem 5.10 (Trotter product formula). Suppose $A$, $B$, and $A + B$ are self-adjoint. Then
\[ e^{it(A+B)} = \operatorname*{s-lim}_{n \to \infty} \Big( e^{i\frac{t}{n}A} e^{i\frac{t}{n}B} \Big)^n. \tag{5.36} \]
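Proof. First of all note that we have
\[ \big( e^{i\tau A} e^{i\tau B} \big)^n - e^{it(A+B)} = \sum_{j=0}^{n-1} \big( e^{i\tau A} e^{i\tau B} \big)^{n-1-j} \big( e^{i\tau A} e^{i\tau B} - e^{i\tau(A+B)} \big) \big( e^{i\tau(A+B)} \big)^j, \tag{5.37} \]
where $\tau = \frac{t}{n}$, and hence
\[ \big\| \big( (e^{i\tau A} e^{i\tau B})^n - e^{it(A+B)} \big)\psi \big\| \le |t| \max_{|s| \le |t|} F_\tau(s), \tag{5.38} \]
where
\[ F_\tau(s) = \Big\| \frac{1}{\tau} \big( e^{i\tau A} e^{i\tau B} - e^{i\tau(A+B)} \big) e^{is(A+B)}\psi \Big\|. \tag{5.39} \]
Now for $\psi \in \mathfrak{D}(A+B) = \mathfrak{D}(A) \cap \mathfrak{D}(B)$ we have
\[ \frac{1}{\tau} \big( e^{i\tau A} e^{i\tau B} - e^{i\tau(A+B)} \big)\psi \to iA\psi + iB\psi - i(A+B)\psi = 0 \tag{5.40} \]
as $\tau \to 0$. So $\lim_{\tau \to 0} F_\tau(s) = 0$ at least pointwise, but we need this uniformly with respect to $s \in [-|t|, |t|]$.
Pointwise convergence implies
\[ \Big\| \frac{1}{\tau} \big( e^{i\tau A} e^{i\tau B} - e^{i\tau(A+B)} \big)\psi \Big\| \le C(\psi) \tag{5.41} \]
and, since $\mathfrak{D}(A+B)$ is a Hilbert space when equipped with the graph norm $\|\psi\|^2_{\Gamma(A+B)} = \|\psi\|^2 + \|(A+B)\psi\|^2$, we can invoke the uniform boundedness principle to obtain
\[ \Big\| \frac{1}{\tau} \big( e^{i\tau A} e^{i\tau B} - e^{i\tau(A+B)} \big)\psi \Big\| \le C \|\psi\|_{\Gamma(A+B)}. \tag{5.42} \]
Now
\[ |F_\tau(s) - F_\tau(r)| \le \Big\| \frac{1}{\tau} \big( e^{i\tau A} e^{i\tau B} - e^{i\tau(A+B)} \big) \big( e^{is(A+B)} - e^{ir(A+B)} \big)\psi \Big\| \le C \big\| \big( e^{is(A+B)} - e^{ir(A+B)} \big)\psi \big\|_{\Gamma(A+B)} \tag{5.43} \]
shows that $F_\tau(.)$ is uniformly continuous and the claim follows by a standard $\frac{\varepsilon}{2}$ argument.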
If the operators are semi-bounded from below the same proof shows
Theorem 5.11 (Trotter product formula). Suppose $A$, $B$, and $A + B$ are self-adjoint and semi-bounded from below. Then
\[ e^{-t(A+B)} = \operatorname*{s-lim}_{n \to \infty} \Big( e^{-\frac{t}{n}A} e^{-\frac{t}{n}B} \Big)^n, \qquad t \ge 0. \tag{5.44} \]
Problem 5.4. Prove Theorem 5.11.
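The product formula (5.36) can be observed numerically for two non-commuting Hermitian matrices. The sketch below is an illustration under assumptions: the Pauli matrices as `A` and `B`, the time `t = 1`, and the helper names `expi` and `trotter_error` are chosen here. For bounded operators the error is expected to decay roughly like $1/n$.

```python
import numpy as np

def expi(H, t):
    """exp(i t H) for a Hermitian matrix H, via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(1j * t * w)) @ V.conj().T

A = np.array([[0.0, 1.0], [1.0, 0.0]])      # Pauli sigma_x
B = np.array([[1.0, 0.0], [0.0, -1.0]])     # Pauli sigma_z, does not commute with A
t = 1.0
exact = expi(A + B, t)

def trotter_error(n):
    """Operator-norm distance between the n-step Trotter product and exp(it(A+B))."""
    step = expi(A, t / n) @ expi(B, t / n)
    return np.linalg.norm(np.linalg.matrix_power(step, n) - exact, 2)

errors = {n: trotter_error(n) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(n, err)
```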
Chapter 6
Perturbation theory for self-adjoint operators
The Hamiltonian of a quantum mechanical system is usually the sum of the kinetic energy $H_0$ (free Schrödinger operator) plus an operator $V$ corresponding to the potential energy. Since $H_0$ is easy to investigate, one usually tries to consider $V$ as a perturbation of $H_0$. This will only work if $V$ is small with respect to $H_0$. Hence we study such perturbations of self-adjoint operators next.
6.1. Relatively bounded operators and the Kato-Rellich theorem
An operator $B$ is called $A$ bounded or relatively bounded with respect to $A$ if $\mathfrak{D}(A) \subseteq \mathfrak{D}(B)$ and if there are constants $a, b \ge 0$ such that
\[ \|B\psi\| \le a\|A\psi\| + b\|\psi\|, \qquad \psi \in \mathfrak{D}(A). \tag{6.1} \]
The infimum of all such constants $a$ is called the $A$-bound of $B$.
The triangle inequality implies
Lemma 6.1. Suppose $B_j$, $j = 1, 2$, are $A$ bounded with respective $A$-bounds $a_j$, $j = 1, 2$. Then $\alpha_1 B_1 + \alpha_2 B_2$ is also $A$ bounded with $A$-bound less than $|\alpha_1| a_1 + |\alpha_2| a_2$. In particular, the set of all $A$ bounded operators forms a linear space.
There are also the following equivalent characterizations:
Lemma 6.2. Suppose $A$ is closed and $B$ is closable. Then the following are equivalent:
(i) $B$ is $A$ bounded.
(ii) $\mathfrak{D}(A) \subseteq \mathfrak{D}(B)$.
(iii) $BR_A(z)$ is bounded for one (and hence for all) $z \in \rho(A)$.
Moreover, the $A$-bound of $B$ is no larger than $\inf_{z \in \rho(A)} \|BR_A(z)\|$.
Proof. (i) $\Rightarrow$ (ii) is true by definition. (ii) $\Rightarrow$ (iii) since $BR_A(z)$ is a closed (Problem 2.6) operator defined on all of $\mathfrak{H}$ and hence bounded by the closed graph theorem (Theorem 2.7). To see (iii) $\Rightarrow$ (i) let $\psi \in \mathfrak{D}(A)$, then
\[ \|B\psi\| = \|BR_A(z)(A - z)\psi\| \le a\|(A - z)\psi\| \le a\|A\psi\| + (a|z|)\|\psi\|, \tag{6.2} \]
where $a = \|BR_A(z)\|$. Finally, note that if $BR_A(z)$ is bounded for one $z \in \rho(A)$, it is bounded for all $z \in \rho(A)$ by the first resolvent formula.
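We are mainly interested in the situation where $A$ is self-adjoint and $B$ is symmetric. Hence we will restrict our attention to this case.
Lemma 6.3. Suppose $A$ is self-adjoint and $B$ relatively bounded. The $A$-bound of $B$ is given by
\[ \lim_{\lambda \to \infty} \|BR_A(\pm i\lambda)\|. \tag{6.3} \]
If $A$ is bounded from below, we can also replace $\pm i\lambda$ by $-\lambda$.
Proof. Let $\varphi = R_A(\pm i\lambda)\psi$, $\lambda > 0$, and let $a_\infty$ be the $A$-bound of $B$. If $B$ is $A$ bounded, then (use the spectral theorem to estimate the norms)
\[ \|BR_A(\pm i\lambda)\psi\| \le a\|AR_A(\pm i\lambda)\psi\| + b\|R_A(\pm i\lambda)\psi\| \le \Big(a + \frac{b}{\lambda}\Big)\|\psi\|. \tag{6.4} \]
Hence $\limsup_\lambda \|BR_A(\pm i\lambda)\| \le a_\infty$ which, together with $a_\infty \le \inf_\lambda \|BR_A(\pm i\lambda)\|$ from the previous lemma, proves the claim.
The case where $A$ is bounded from below is similar.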
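Now we will show the basic perturbation result due to Kato and Rellich.
Theorem 6.4 (Kato-Rellich). Suppose $A$ is (essentially) self-adjoint and $B$ is symmetric with $A$-bound less than one. Then $A + B$, $\mathfrak{D}(A + B) = \mathfrak{D}(A)$, is (essentially) self-adjoint. If $A$ is essentially self-adjoint we have $\mathfrak{D}(\overline{A}) \subseteq \mathfrak{D}(\overline{B})$ and $\overline{A + B} = \overline{A} + \overline{B}$.
If $A$ is bounded from below by $\gamma$, then $A + B$ is bounded from below by $\min(\gamma, b/(a - 1))$.
Proof. Since $\mathfrak{D}(\overline{A}) \subseteq \mathfrak{D}(\overline{B})$ and $\mathfrak{D}(\overline{A}) \subseteq \mathfrak{D}(\overline{A + B})$ by (6.1) we can assume that $A$ is closed (i.e., self-adjoint). It suffices to show that $\operatorname{Ran}(A + B \pm i\lambda) = \mathfrak{H}$. By the above lemma we can find a $\lambda > 0$ such that $\|BR_A(\pm i\lambda)\| < 1$. Hence $-1 \in \rho(BR_A(\pm i\lambda))$ and thus $\mathbb{I} + BR_A(\pm i\lambda)$ is onto. Thus
\[ (A + B \pm i\lambda) = (\mathbb{I} + BR_A(\mp i\lambda))(A \pm i\lambda) \tag{6.5} \]
is onto and the proof of the first part is complete.
If $A$ is bounded from below we can replace $\pm i\lambda$ by $-\lambda$ and the above equation shows that $R_{A+B}$ exists for $\lambda$ sufficiently large. By the proof of the previous lemma we can choose $-\lambda < \min(\gamma, b/(a - 1))$.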
Finally, let us show that there is also a connection between the resolvents.
Lemma 6.5. Suppose $A$ and $B$ are closed and $\mathfrak{D}(A) \subseteq \mathfrak{D}(B)$. Then we have the second resolvent formula
\[ R_{A+B}(z) - R_A(z) = -R_A(z) B R_{A+B}(z) = -R_{A+B}(z) B R_A(z) \tag{6.6} \]
for $z \in \rho(A) \cap \rho(A + B)$.
Proof. We compute
\[ R_{A+B}(z) + R_A(z) B R_{A+B}(z) = R_A(z)(A + B - z)R_{A+B}(z) = R_A(z). \tag{6.7} \]
The second identity is similar.
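The second resolvent formula (6.6) is an algebraic identity and can be confirmed for matrices. The sketch below is only an illustration: the random Hermitian matrices `A` and `B`, the point `z`, and the helper names are assumptions made here, with $R_T(z) = (T - z)^{-1}$ as in the text.

```python
import numpy as np

rng = np.random.default_rng(4)

def random_hermitian(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

def resolvent(T, z):
    """R_T(z) = (T - z)^{-1}."""
    return np.linalg.inv(T - z * np.eye(T.shape[0]))

A, B = random_hermitian(5), random_hermitian(5)
z = 0.3 + 1.0j        # non-real, hence in the resolvent set of A and of A + B

lhs = resolvent(A + B, z) - resolvent(A, z)
rhs1 = -resolvent(A, z) @ B @ resolvent(A + B, z)
rhs2 = -resolvent(A + B, z) @ B @ resolvent(A, z)
err1 = np.linalg.norm(lhs - rhs1)
err2 = np.linalg.norm(lhs - rhs2)
print(err1, err2)
```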
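Problem 6.1. Compute the resolvent of $A + \alpha\langle \psi, . \rangle\psi$. (Hint: Show
\[ (\mathbb{I} + \alpha\langle \varphi, . \rangle\psi)^{-1} = \mathbb{I} - \frac{\alpha}{1 + \alpha\langle \varphi, \psi \rangle}\langle \varphi, . \rangle\psi \tag{6.8} \]
and use the second resolvent formula.)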
6.2. More on compact operators
Recall from Section 5.2 that we have introduced the set of compact operators $\mathfrak{C}(\mathfrak{H})$ as the closure of the set of all finite rank operators in $\mathfrak{L}(\mathfrak{H})$. Before we can proceed, we need to establish some further results for such operators. We begin by investigating the spectrum of self-adjoint compact operators and show that the spectral theorem takes a particularly simple form in this case.
We introduce some notation first. The discrete spectrum $\sigma_d(A)$ is the set of all eigenvalues which are discrete points of the spectrum and whose corresponding eigenspace is finite dimensional. The complement of the discrete spectrum is called the essential spectrum $\sigma_{ess}(A) = \sigma(A) \setminus \sigma_d(A)$. If $A$ is self-adjoint we might equivalently set
\[ \sigma_d(A) = \{ \lambda \in \sigma_p(A) \mid \operatorname{rank}(P_A((\lambda - \varepsilon, \lambda + \varepsilon))) < \infty \text{ for some } \varepsilon > 0 \} \tag{6.9} \]
respectively
\[ \sigma_{ess}(A) = \{ \lambda \in \mathbb{R} \mid \operatorname{rank}(P_A((\lambda - \varepsilon, \lambda + \varepsilon))) = \infty \text{ for all } \varepsilon > 0 \}. \tag{6.10} \]
Theorem 6.6 (Spectral theorem for compact operators). Suppose the operator $K$ is self-adjoint and compact. Then the spectrum of $K$ consists of an at most countable number of eigenvalues which can only cluster at $0$. Moreover, the eigenspace to each nonzero eigenvalue is finite dimensional. In other words,
\[ \sigma_{ess}(K) \subseteq \{0\}, \tag{6.11} \]
where equality holds if and only if $\mathfrak{H}$ is infinite dimensional.
In addition, we have
\[ K = \sum_{\lambda \in \sigma(K)} \lambda P_K(\{\lambda\}). \tag{6.12} \]
Proof. It suffices to show $\operatorname{rank}(P_K((\lambda - \varepsilon, \lambda + \varepsilon))) < \infty$ for $0 < \varepsilon < |\lambda|$.
Let $K_n$ be a sequence of finite rank operators such that $\|K - K_n\| \le 1/n$. If $\operatorname{Ran} P_K((\lambda - \varepsilon, \lambda + \varepsilon))$ is infinite dimensional we can find a vector $\psi_n$ in this range such that $\|\psi_n\| = 1$ and $K_n\psi_n = 0$. But this yields a contradiction
\[ \frac{1}{n} \ge |\langle \psi_n, (K - K_n)\psi_n \rangle| = |\langle \psi_n, K\psi_n \rangle| \ge |\lambda| - \varepsilon > 0 \tag{6.13} \]
by (4.2).
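As a consequence we obtain the canonical form of a general compact operator.
Theorem 6.7 (Canonical form of compact operators). Let $K$ be compact. There exist orthonormal sets $\{\hat{\varphi}_j\}$, $\{\varphi_j\}$ and positive numbers $s_j = s_j(K)$ such that
\[ K = \sum_j s_j \langle \varphi_j, . \rangle \hat{\varphi}_j, \qquad K^* = \sum_j s_j \langle \hat{\varphi}_j, . \rangle \varphi_j. \tag{6.14} \]
Note $K\varphi_j = s_j\hat{\varphi}_j$ and $K^*\hat{\varphi}_j = s_j\varphi_j$, and hence $K^*K\varphi_j = s_j^2\varphi_j$ and $KK^*\hat{\varphi}_j = s_j^2\hat{\varphi}_j$.
The numbers $s_j(K)^2 > 0$ are the nonzero eigenvalues of $KK^*$ respectively $K^*K$ (counted with multiplicity) and $s_j(K) = s_j(K^*) = s_j$ are called singular values of $K$. There are either finitely many singular values (if $K$ is finite rank) or they converge to zero.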
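Proof. By Lemma 5.5 $K^*K$ is compact and hence Theorem 6.6 applies. Let $\{\varphi_j\}$ be an orthonormal basis of eigenvectors for $P_{K^*K}((0, \infty))\mathfrak{H}$ and let $s_j^2$ be the eigenvalue corresponding to $\varphi_j$. Then, for any $\psi \in \mathfrak{H}$ we can write
\[ \psi = \sum_j \langle \varphi_j, \psi \rangle \varphi_j + \tilde{\psi} \tag{6.15} \]
with $\tilde{\psi} \in \operatorname{Ker}(K^*K) = \operatorname{Ker}(K)$. Then
\[ K\psi = \sum_j s_j \langle \varphi_j, \psi \rangle \hat{\varphi}_j, \tag{6.16} \]
where $\hat{\varphi}_j = s_j^{-1}K\varphi_j$, since $\|K\tilde{\psi}\|^2 = \langle \tilde{\psi}, K^*K\tilde{\psi} \rangle = 0$. By $\langle \hat{\varphi}_j, \hat{\varphi}_k \rangle = (s_j s_k)^{-1}\langle K\varphi_j, K\varphi_k \rangle = (s_j s_k)^{-1}\langle K^*K\varphi_j, \varphi_k \rangle = s_j s_k^{-1}\langle \varphi_j, \varphi_k \rangle$ we see that the $\{\hat{\varphi}_j\}$ are orthonormal and the formula for $K^*$ follows by taking the adjoint of the formula for $K$ (Problem 6.2).
If $K$ is self-adjoint then $\varphi_j = \sigma_j\hat{\varphi}_j$, $\sigma_j^2 = 1$, are the eigenvectors of $K$ and $\sigma_j s_j$ are the corresponding eigenvalues.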
Moreover, note that we have
\[ \|K\| = \max_j s_j(K). \tag{6.17} \]
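For a matrix, the canonical form (6.14) is the singular value decomposition. The sketch below is an illustration under assumptions: the random 4x3 complex matrix `K` and the names `phi`, `phi_hat`, `K_rebuilt` are chosen here; NumPy's `svd` returns `K = U diag(s) Vh`, so the columns of `U` play the role of $\hat{\varphi}_j$ and the conjugated rows of `Vh` the role of $\varphi_j$.

```python
import numpy as np

rng = np.random.default_rng(5)
K = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))

U, s, Vh = np.linalg.svd(K, full_matrices=False)
phi = Vh.conj().T          # columns: phi_j
phi_hat = U                # columns: phi_hat_j

# K = sum_j s_j <phi_j, .> phi_hat_j, written as a sum of rank-one matrices
K_rebuilt = sum(s[j] * np.outer(phi_hat[:, j], phi[:, j].conj())
                for j in range(len(s)))

# K phi_j = s_j phi_hat_j and K* K phi_j = s_j^2 phi_j
action_error = np.linalg.norm(K @ phi - phi_hat * s)
eigen_error = np.linalg.norm(K.conj().T @ K @ phi - phi * s**2)

# (6.17): the operator norm equals the largest singular value
operator_norm = np.linalg.norm(K, 2)
print(action_error, eigen_error, operator_norm - s.max())
```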
Finally, let me remark that there are a number of other equivalent definitions for compact operators.
Lemma 6.8. For $K \in \mathfrak{L}(\mathfrak{H})$ the following statements are equivalent:
(i) $K$ is compact.
(i') $K^*$ is compact.
(ii) $A_n \in \mathfrak{L}(\mathfrak{H})$ and $A_n \to A$ strongly implies $A_n K \to AK$.
(iii) $\psi_n \rightharpoonup \psi$ weakly implies $K\psi_n \to K\psi$ in norm.
(iv) $\psi_n$ bounded implies that $K\psi_n$ has a (norm) convergent subsequence.
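Proof. (i) $\Leftrightarrow$ (i'). This is immediate from Theorem 6.7.
(i) $\Rightarrow$ (ii). Translating $A_n \to A_n - A$ it is no restriction to assume $A = 0$. Since $\|A_n\| \le M$ it suffices to consider the case where $K$ is finite rank. Then (by (6.14))
\[ \|A_n K\|^2 = \sup_{\|\psi\| = 1} \sum_{j=1}^N s_j |\langle \varphi_j, \psi \rangle|^2 \|A_n\hat{\varphi}_j\|^2 \le \sum_{j=1}^N s_j \|A_n\hat{\varphi}_j\|^2 \to 0. \tag{6.18} \]
(ii) $\Rightarrow$ (iii). Again, replace $\psi_n \to \psi_n - \psi$ and assume $\psi = 0$. Choose $A_n = \langle \psi_n, . \rangle\varphi$, $\|\varphi\| = 1$, then $\|K\psi_n\| = \|A_n K^*\| \to 0$.
(iii) $\Rightarrow$ (iv). If $\psi_n$ is bounded it has a weakly convergent subsequence by Lemma 1.12. Now apply (iii) to this subsequence.
(iv) $\Rightarrow$ (i). Let $\varphi_j$ be an orthonormal basis and set
\[ K_n = \sum_{j=1}^n \langle \varphi_j, . \rangle K\varphi_j. \tag{6.19} \]
Then
\[ \gamma_n = \|K - K_n\| = \sup_{\psi \in \operatorname{span}\{\varphi_j\}_{j=n}^\infty,\ \|\psi\| = 1} \|K\psi\| \tag{6.20} \]
is a decreasing sequence tending to a limit $\varepsilon \ge 0$. Moreover, we can find a sequence of unit vectors $\psi_n \in \operatorname{span}\{\varphi_j\}_{j=n}^\infty$ for which $\|K\psi_n\| \ge \varepsilon$. By assumption, $K\psi_n$ has a convergent subsequence which, since $\psi_n$ converges weakly to $0$, converges to $0$. Hence $\varepsilon$ must be $0$ and we are done.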
The last condition explains the name compact. Moreover, note that you cannot replace $A_n K \to AK$ by $KA_n \to KA$ unless you additionally require $A_n$ to be normal (then this follows by taking adjoints; recall that only for normal operators taking adjoints is continuous with respect to strong convergence). Without the requirement that $A_n$ is normal the claim is wrong as the following example shows.
Example. Let $\mathfrak{H} = \ell^2(\mathbb{N})$, $A_n$ the operator which shifts each sequence $n$ places to the left, and $K = \langle \delta_1, . \rangle\delta_1$, where $\delta_1 = (1, 0, \dots)$. Then $\operatorname{s-lim} A_n = 0$ but $\|KA_n\| = 1$.
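Problem 6.2. Deduce the formula for $K^*$ from the one for $K$ in (6.14).
6.3. Hilbert-Schmidt and trace class operators
Among the compact operators two special classes are of particular importance. The first one are integral operators
\[ K\psi(x) = \int_M K(x, y)\psi(y) \, d\mu(y), \qquad \psi \in L^2(M, d\mu), \tag{6.21} \]
where $K(x, y) \in L^2(M \times M, d\mu \otimes d\mu)$. Such an operator is called a Hilbert-Schmidt operator. Using Cauchy-Schwarz,
\[ \int_M |K\psi(x)|^2 \, d\mu(x) = \int_M \Big| \int_M K(x, y)\psi(y) \, d\mu(y) \Big|^2 d\mu(x) \le \int_M \Big( \int_M |K(x, y)|^2 \, d\mu(y) \Big) \Big( \int_M |\psi(y)|^2 \, d\mu(y) \Big) d\mu(x) \]
\[ = \Big( \int_M \int_M |K(x, y)|^2 \, d\mu(y) \, d\mu(x) \Big) \Big( \int_M |\psi(y)|^2 \, d\mu(y) \Big) \tag{6.22} \]
we see that $K$ is bounded. Next, pick an orthonormal basis $\varphi_j(x)$ for $L^2(M, d\mu)$. Then, by Lemma 1.9, $\varphi_i(x)\varphi_j(y)$ is an orthonormal basis for $L^2(M \times M, d\mu \otimes d\mu)$ and
\[ K(x, y) = \sum_{i,j} c_{i,j} \varphi_i(x)\varphi_j(y), \qquad c_{i,j} = \langle \varphi_i, K\varphi_j^* \rangle, \tag{6.23} \]
where
\[ \sum_{i,j} |c_{i,j}|^2 = \int_M \int_M |K(x, y)|^2 \, d\mu(y) \, d\mu(x) < \infty. \tag{6.24} \]
In particular,
\[ K\psi(x) = \sum_{i,j} c_{i,j} \langle \varphi_j^*, \psi \rangle \varphi_i(x) \tag{6.25} \]
shows that $K$ can be approximated by finite rank operators (take finitely many terms in the sum) and is hence compact.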
Using (6.14) we can also give a different characterization of Hilbert-Schmidt operators.
Lemma 6.9. If $\mathfrak{H} = L^2(M, d\mu)$, then a compact operator $K$ is Hilbert-Schmidt if and only if $\sum_j s_j(K)^2 < \infty$ and
\[ \sum_j s_j(K)^2 = \int_M \int_M |K(x, y)|^2 \, d\mu(x) \, d\mu(y), \tag{6.26} \]
in this case.
Proof. If $K$ is compact we can define approximating finite rank operators $K_n$ by considering only finitely many terms in (6.14):
\[ K_n = \sum_{j=1}^n s_j \langle \varphi_j, . \rangle \hat{\varphi}_j. \tag{6.27} \]
Then $K_n$ has the kernel $K_n(x, y) = \sum_{j=1}^n s_j \varphi_j(y)^* \hat{\varphi}_j(x)$ and
\[ \int_M \int_M |K_n(x, y)|^2 \, d\mu(x) \, d\mu(y) = \sum_{j=1}^n s_j(K)^2. \tag{6.28} \]
Now if one side converges, so does the other and in particular, (6.26) holds in this case.
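Hence we will call a compact operator Hilbert-Schmidt if its singular values satisfy
\[ \sum_j s_j(K)^2 < \infty. \tag{6.29} \]
By our lemma this coincides with our previous definition if $\mathfrak{H} = L^2(M, d\mu)$.
Since every Hilbert space is isomorphic to some $L^2(M, d\mu)$ we see that the Hilbert-Schmidt operators together with the norm
\[ \|K\|_2 = \Big( \sum_j s_j(K)^2 \Big)^{1/2} \tag{6.30} \]
form a Hilbert space (isomorphic to $L^2(M \times M, d\mu \otimes d\mu)$). Note that $\|K\|_2 = \|K^*\|_2$ (since $s_j(K) = s_j(K^*)$). There is another useful characterization for identifying Hilbert-Schmidt operators:
Lemma 6.10. A compact operator $K$ is Hilbert-Schmidt if and only if
\[ \sum_n \|K\psi_n\|^2 < \infty \tag{6.31} \]
for some orthonormal basis $\{\psi_n\}$ and
\[ \sum_n \|K\psi_n\|^2 = \|K\|_2^2 \tag{6.32} \]
in this case.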
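Proof. This follows from
\[ \sum_n \|K\psi_n\|^2 = \sum_{n,j} |\langle \hat{\varphi}_j, K\psi_n \rangle|^2 = \sum_{n,j} |\langle K^*\hat{\varphi}_j, \psi_n \rangle|^2 = \sum_j \|K^*\hat{\varphi}_j\|^2 = \sum_j s_j(K)^2. \tag{6.33} \]
Corollary 6.11. The set of Hilbert-Schmidt operators forms a $*$-ideal in $\mathfrak{L}(\mathfrak{H})$ and
\[ \|KA\|_2 \le \|A\| \|K\|_2 \quad\text{respectively}\quad \|AK\|_2 \le \|A\| \|K\|_2. \tag{6.34} \]
Proof. Let $K$ be Hilbert-Schmidt and $A$ bounded. Then $AK$ is compact and
\[ \sum_n \|AK\psi_n\|^2 \le \|A\|^2 \sum_n \|K\psi_n\|^2 = \|A\|^2 \|K\|_2^2. \tag{6.35} \]
For $KA$ just consider adjoints.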
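This approach can be generalized by defining
\[ \|K\|_p = \Big( \sum_j s_j(K)^p \Big)^{1/p} \tag{6.36} \]
plus corresponding spaces
\[ \mathcal{J}_p(\mathfrak{H}) = \{ K \in \mathfrak{C}(\mathfrak{H}) \mid \|K\|_p < \infty \}, \tag{6.37} \]
which are known as Schatten $p$-class. Note that
\[ \|K\| \le \|K\|_p \tag{6.38} \]
and that by $s_j(K) = s_j(K^*)$ we have
\[ \|K\|_p = \|K^*\|_p. \tag{6.39} \]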
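For matrices the Schatten norms (6.36) are computed from the singular values, and the Hilbert-Schmidt norm reduces to the Frobenius norm, the discrete counterpart of the kernel integral in (6.26). The sketch below is an illustration under assumptions: the helper `schatten_norm` and the random 5x5 matrix `K` are chosen here.

```python
import numpy as np

def schatten_norm(K, p):
    """(sum_j s_j(K)^p)^(1/p), computed from the singular values of the matrix K."""
    s = np.linalg.svd(K, compute_uv=False)
    return float(np.sum(s ** p) ** (1.0 / p))

rng = np.random.default_rng(6)
K = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

operator_norm = np.linalg.norm(K, 2)             # largest singular value
hs_norm = schatten_norm(K, 2)                    # Hilbert-Schmidt norm
trace_norm = schatten_norm(K, 1)                 # trace norm
frobenius = np.sqrt(np.sum(np.abs(K) ** 2))      # sum over all |K_ij|^2

print(operator_norm, hs_norm, trace_norm, frobenius)
```

The output illustrates (6.38) for $p = 1, 2$ and the equality of `hs_norm` with `frobenius`.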
Lemma 6.12. The spaces $\mathcal{J}_p(\mathfrak{H})$ together with the norm $\|.\|_p$ are Banach spaces. Moreover,
\[ \|K\|_p = \sup \Big\{ \Big( \sum_j |\langle \psi_j, K\varphi_j \rangle|^p \Big)^{1/p} \,\Big|\, \{\psi_j\}, \{\varphi_j\} \text{ ONS} \Big\}, \tag{6.40} \]
where the sup is taken over all orthonormal systems.
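Proof. The hard part is to prove (6.40): Choose $q$ such that $\frac{1}{p} + \frac{1}{q} = 1$ and use Hölder's inequality to obtain ($s_j|...|^2 = (s_j^p|...|^2)^{1/p}|...|^{2/q}$)
\[ \sum_j s_j |\langle \varphi_n, \varphi_j \rangle|^2 \le \Big( \sum_j s_j^p |\langle \varphi_n, \varphi_j \rangle|^2 \Big)^{1/p} \Big( \sum_j |\langle \varphi_n, \varphi_j \rangle|^2 \Big)^{1/q} \le \Big( \sum_j s_j^p |\langle \varphi_n, \varphi_j \rangle|^2 \Big)^{1/p}. \tag{6.41} \]
Here $\varphi_n$, $\psi_n$ denote the arbitrary orthonormal systems from (6.40), while $\varphi_j$, $\hat{\varphi}_j$ inside the sums are the vectors from the canonical form (6.14). Clearly the analogous equation holds for $\hat{\varphi}_j$, $\psi_n$. Now using Cauchy-Schwarz we have
\[ |\langle \psi_n, K\varphi_n \rangle|^p = \Big| \sum_j s_j^{1/2} \langle \varphi_n, \varphi_j \rangle s_j^{1/2} \langle \hat{\varphi}_j, \psi_n \rangle \Big|^p \le \Big( \sum_j s_j^p |\langle \varphi_n, \varphi_j \rangle|^2 \Big)^{1/2} \Big( \sum_j s_j^p |\langle \psi_n, \hat{\varphi}_j \rangle|^2 \Big)^{1/2}. \tag{6.42} \]
Summing over $n$, a second appeal to Cauchy-Schwarz and interchanging the order of summation finally gives
\[ \sum_n |\langle \psi_n, K\varphi_n \rangle|^p \le \Big( \sum_{n,j} s_j^p |\langle \varphi_n, \varphi_j \rangle|^2 \Big)^{1/2} \Big( \sum_{n,j} s_j^p |\langle \psi_n, \hat{\varphi}_j \rangle|^2 \Big)^{1/2} \le \Big( \sum_j s_j^p \Big)^{1/2} \Big( \sum_j s_j^p \Big)^{1/2} = \sum_j s_j^p. \tag{6.43} \]
Since equality is attained when the two orthonormal systems are chosen as the vectors $\varphi_n$ and $\psi_n = \hat{\varphi}_n$ from (6.14), equation (6.40) holds.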
Now the rest is straightforward. From
_
j
[
j
, (K
1
+K
2
)
j
)[
p
_
1/p
j
[
j
, K
1
j
)[
p
_
1/p
+
_
j
[
j
, K
2
j
)[
p
_
1/p
K
1

p
+K
2

p
(6.44)
we infer that
p
(H) is a vector space and the triangle inequality. The other
requirements for are norm are obvious and it remains to check completeness.
If K
n
is a Cauchy sequence with respect to .
p
it is also a Cauchy sequence
124 6. Perturbation theory for selfadjoint operators
with respect to . (K K
p
). Since C(H) is closed, there is a compact
K with K K
n
 0 and by K
n

p
C we have
_
j
[
j
, K
j
)[
p
_
1/p
C (6.45)
for any nite ONS. Since the right hand side is independent of the ONS
(and in particular on the number of vectors), K is in
p
(H).
The two most important cases are $p = 1$ and $p = 2$: $\mathcal{J}_2(\mathfrak{H})$ is the space of Hilbert-Schmidt operators investigated in the previous section and $\mathcal{J}_1(\mathfrak{H})$ is the space of trace class operators. Since Hilbert-Schmidt operators are easy to identify it is important to relate $\mathcal{J}_1(\mathfrak{H})$ with $\mathcal{J}_2(\mathfrak{H})$:
Lemma 6.13. An operator is trace class if and only if it can be written as the product of two Hilbert-Schmidt operators, $K = K_1 K_2$, and we have
\[ \|K\|_1 \le \|K_1\|_2 \|K_2\|_2 \tag{6.46} \]
in this case.
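Proof. By Cauchy-Schwarz we have
\[ \sum_n |\langle \varphi_n, K\psi_n \rangle| = \sum_n |\langle K_1^*\varphi_n, K_2\psi_n \rangle| \le \Big( \sum_n \|K_1^*\varphi_n\|^2 \sum_n \|K_2\psi_n\|^2 \Big)^{1/2} = \|K_1\|_2 \|K_2\|_2 \tag{6.47} \]
and hence $K = K_1 K_2$ is trace class if both $K_1$ and $K_2$ are Hilbert-Schmidt operators. To see the converse let $K$ be given by (6.14) and choose $K_1 = \sum_j \sqrt{s_j(K)} \langle \phi_j, \cdot \rangle \hat\phi_j$ respectively $K_2 = \sum_j \sqrt{s_j(K)} \langle \phi_j, \cdot \rangle \phi_j$.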
Corollary 6.14. The set of trace class operators forms a $*$-ideal in $\mathfrak{L}(\mathfrak{H})$ and
\[ \|KA\|_1 \le \|A\| \|K\|_1 \quad\text{respectively}\quad \|AK\|_1 \le \|A\| \|K\|_1. \tag{6.48} \]
Proof. Write $K = K_1 K_2$ with $K_1$, $K_2$ Hilbert-Schmidt and use Corollary 6.11.
Now we can also explain the name trace class:
Lemma 6.15. If $K$ is trace class, then for any ONB $\{\varphi_n\}$ the trace
\[ \mathrm{tr}(K) = \sum_n \langle \varphi_n, K\varphi_n \rangle \tag{6.49} \]
is finite and independent of the ONB.
Moreover, the trace is linear and if $K_1 \le K_2$ are both trace class we have $\mathrm{tr}(K_1) \le \mathrm{tr}(K_2)$.
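Proof. Let $\{\psi_n\}$ be another ONB. If we write $K = K_1 K_2$ with $K_1$, $K_2$ Hilbert-Schmidt we have
\[ \sum_n \langle \varphi_n, K_1 K_2 \varphi_n \rangle = \sum_n \langle K_1^*\varphi_n, K_2\varphi_n \rangle = \sum_{n,m} \langle K_1^*\varphi_n, \psi_m \rangle \langle \psi_m, K_2\varphi_n \rangle \]
\[ = \sum_{m,n} \langle K_2^*\psi_m, \varphi_n \rangle \langle \varphi_n, K_1\psi_m \rangle = \sum_m \langle K_2^*\psi_m, K_1\psi_m \rangle = \sum_m \langle \psi_m, K_2 K_1 \psi_m \rangle. \tag{6.50} \]
Hence the trace is independent of the ONB and we even have $\mathrm{tr}(K_1 K_2) = \mathrm{tr}(K_2 K_1)$.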
Clearly for self-adjoint trace class operators, the trace is the sum over all eigenvalues (counted with their multiplicity). To see this you just have to choose the ONB to consist of eigenfunctions. This is even true for all trace class operators and is known as the Lidskij trace theorem (see [17] or [6] for an easy to read introduction).
Problem 6.3. Show that $A \ge 0$ is trace class if (6.49) is finite for one ONB. (Hint: $A$ is self-adjoint (why?) and $A = \sqrt{A}\sqrt{A}$.)
Problem 6.4. Show that $K : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$, $f(n) \mapsto \sum_{j \in \mathbb{N}} k(n+j) f(j)$ is Hilbert-Schmidt if $|k(n)| \le C(n)$, where $C(n)$ is decreasing and summable.
6.4. Relatively compact operators and Weyl's theorem
In the previous section we have seen that the sum of a self-adjoint and a symmetric operator is again self-adjoint if the perturbing operator is small. In this section we want to study the influence of perturbations on the spectrum. Our hope is that at least some parts of the spectrum remain invariant.
Let $A$ be self-adjoint. Note that if we add a multiple of the identity to $A$, we shift the entire spectrum. Hence, in general, we cannot expect a (relatively) bounded perturbation to leave any part of the spectrum invariant. Next, if $\lambda_0$ is in the discrete spectrum, we can easily remove this eigenvalue with a finite rank perturbation of arbitrarily small norm. In fact, consider
\[ A + \varepsilon P_A(\{\lambda_0\}). \tag{6.51} \]
Hence our only hope is that the remainder, namely the essential spectrum, is stable under finite rank perturbations. To show this, we first need a good criterion for a point to be in the essential spectrum of $A$.
Lemma 6.16 (Weyl criterion). A point $\lambda$ is in the essential spectrum of a self-adjoint operator $A$ if and only if there is a sequence $\psi_n$ such that $\|\psi_n\| = 1$, $\psi_n$ converges weakly to 0, and $\|(A - \lambda)\psi_n\| \to 0$. Moreover, the sequence can be chosen to be orthonormal. Such a sequence is called a singular Weyl sequence.
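Proof. Let $\psi_n$ be a singular Weyl sequence for the point $\lambda_0$. By Lemma 2.12 we have $\lambda_0 \in \sigma(A)$ and hence it suffices to show $\lambda_0 \notin \sigma_d(A)$. If $\lambda_0 \in \sigma_d(A)$ we can find an $\varepsilon > 0$ such that $P_\varepsilon = P_A((\lambda_0 - \varepsilon, \lambda_0 + \varepsilon))$ is finite rank. Consider $\tilde\psi_n = P_\varepsilon \psi_n$. Clearly $\tilde\psi_n$ converges weakly to zero and $\|(A - \lambda_0)\tilde\psi_n\| \to 0$. Moreover,
\[ \|\psi_n - \tilde\psi_n\|^2 = \int_{\mathbb{R} \setminus (\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)} d\mu_{\psi_n}(\lambda) \le \frac{1}{\varepsilon^2} \int_{\mathbb{R} \setminus (\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)} (\lambda - \lambda_0)^2 \, d\mu_{\psi_n}(\lambda) \le \frac{1}{\varepsilon^2} \|(A - \lambda_0)\psi_n\|^2 \tag{6.52} \]
and hence $\|\tilde\psi_n\| \to 1$. Thus $\varphi_n = \|\tilde\psi_n\|^{-1} \tilde\psi_n$ is also a singular Weyl sequence. But $\varphi_n$ is a sequence of unit length vectors which lives in a finite dimensional space and converges to 0 weakly, a contradiction.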
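Conversely, if $\lambda_0 \in \sigma_{ess}(A)$, consider $P_n = P_A([\lambda_0 - \frac{1}{n}, \lambda_0 - \frac{1}{n+1}) \cup (\lambda_0 + \frac{1}{n+1}, \lambda_0 + \frac{1}{n}])$. Then $\mathrm{rank}(P_{n_j}) > 0$ for an infinite subsequence $n_j$. Now pick $\psi_j \in \mathrm{Ran}\, P_{n_j}$.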
Now let $K$ be a self-adjoint compact operator and $\psi_n$ a singular Weyl sequence for $A$. Then $\psi_n$ converges weakly to zero and hence
\[ \|(A + K - \lambda)\psi_n\| \le \|(A - \lambda)\psi_n\| + \|K\psi_n\| \to 0 \tag{6.53} \]
since $\|(A - \lambda)\psi_n\| \to 0$ by assumption and $\|K\psi_n\| \to 0$ by Lemma 6.8 (iii). Hence $\sigma_{ess}(A) \subseteq \sigma_{ess}(A + K)$. Reversing the roles of $A + K$ and $A$ shows $\sigma_{ess}(A + K) = \sigma_{ess}(A)$. Since we have shown that we can remove any point in the discrete spectrum by a self-adjoint finite rank operator we obtain the following equivalent characterization of the essential spectrum.
Lemma 6.17. The essential spectrum of a self-adjoint operator $A$ is precisely the part which is invariant under rank-one perturbations. In particular,
\[ \sigma_{ess}(A) = \bigcap_{K \in C(\mathfrak{H}),\, K^* = K} \sigma(A + K). \tag{6.54} \]
There is even a larger class of operators under which the essential spectrum is invariant.
Theorem 6.18 (Weyl). Suppose $A$ and $B$ are self-adjoint operators. If
\[ R_A(z) - R_B(z) \in C(\mathfrak{H}) \tag{6.55} \]
for one $z \in \rho(A) \cap \rho(B)$, then
\[ \sigma_{ess}(A) = \sigma_{ess}(B). \tag{6.56} \]
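Proof. In fact, suppose $\lambda \in \sigma_{ess}(A)$ and let $\psi_n$ be a corresponding singular Weyl sequence. Then $(R_A(z) - \frac{1}{\lambda - z})\psi_n = -\frac{R_A(z)}{\lambda - z}(A - \lambda)\psi_n$ and thus $\|(R_A(z) - \frac{1}{\lambda - z})\psi_n\| \to 0$. Moreover, by our assumption we also have $\|(R_B(z) - \frac{1}{\lambda - z})\psi_n\| \to 0$ and thus $\|(B - \lambda)\varphi_n\| \to 0$, where $\varphi_n = R_B(z)\psi_n$. Since $\lim_{n\to\infty} \|\varphi_n\| = \lim_{n\to\infty} \|R_A(z)\psi_n\| = |\lambda - z|^{-1} \ne 0$ (since $\|(R_A(z) - \frac{1}{\lambda - z})\psi_n\| = \|\frac{1}{\lambda - z} R_A(z)(A - \lambda)\psi_n\| \to 0$) we obtain, after normalizing, a singular Weyl sequence for $B$, showing $\lambda \in \sigma_{ess}(B)$. Now interchange the roles of $A$ and $B$.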
As a first consequence note the following result.
Theorem 6.19. Suppose $A$ is symmetric with equal finite defect indices, then all self-adjoint extensions have the same essential spectrum.
Proof. By Lemma 2.27 the resolvent difference of two self-adjoint extensions is a finite rank operator if the defect indices are finite.
In addition, the following result is of interest.
Lemma 6.20. Suppose
\[ R_A(z) - R_B(z) \in C(\mathfrak{H}) \tag{6.57} \]
for one $z \in \rho(A) \cap \rho(B)$, then this holds for all $z \in \rho(A) \cap \rho(B)$. In addition, if $A$ and $B$ are self-adjoint, then
\[ f(A) - f(B) \in C(\mathfrak{H}) \tag{6.58} \]
for any $f \in C_\infty(\mathbb{R})$.
Proof. If the condition holds for one $z$ it holds for all since we have (using both resolvent formulas)
\[ R_A(z') - R_B(z') = (1 - (z - z') R_B(z'))(R_A(z) - R_B(z))(1 - (z - z') R_A(z')). \tag{6.59} \]
Let A and B be selfadjoint. The set of all functions f for which the
claim holds is a closed subalgebra of C
n
(A
n
).
6.5. Strong and norm resolvent convergence 129
Using the Stone-Weierstrass theorem we obtain as a first consequence
Theorem 6.23. Suppose $A_n$ converges to $A$ in norm resolvent sense, then $f(A_n)$ converges to $f(A)$ in norm for any bounded continuous function $f : \Sigma \to \mathbb{C}$ with $\lim_{\lambda \to -\infty} f(\lambda) = \lim_{\lambda \to \infty} f(\lambda)$. If $A_n$ converges to $A$ in strong resolvent sense, then $f(A_n)$ converges to $f(A)$ strongly for any bounded continuous function $f : \Sigma \to \mathbb{C}$.
Proof. The set of functions for which the claim holds clearly forms a $*$-algebra (since resolvents are normal, taking adjoints is continuous even with respect to strong convergence) and since it contains $f(\lambda) = 1$ and $f(\lambda) = \frac{1}{\lambda - z_0}$ this $*$-algebra is dense by the Stone-Weierstrass theorem. The usual $\frac{\varepsilon}{3}$ argument shows that this $*$-algebra is also closed.
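To see the last claim let $\chi_m$ be a compactly supported continuous function ($0 \le \chi_m \le 1$) which is one on the interval $[-m, m]$. Then $f(A_n)\chi_m(A_n) \overset{s}{\to} f(A)\chi_m(A)$ by the first part and hence
\[ \|(f(A_n) - f(A))\psi\| \le \|f(A_n)\| \, \|(1 - \chi_m(A_n))\psi\| + \|f(A_n)\| \, \|(\chi_m(A_n) - \chi_m(A))\psi\| + \|(f(A_n)\chi_m(A_n) - f(A)\chi_m(A))\psi\| + \|f(A)\| \, \|(1 - \chi_m(A))\psi\| \tag{6.64} \]
can be made arbitrarily small since $\|f(\cdot)\| \le \|f\|_\infty$ and $\chi_m(\cdot) \overset{s}{\to} I$ by Theorem 3.1.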
As a consequence note that the point $z$ is of no importance
Corollary 6.24. Suppose $A_n$ converges to $A$ in norm or strong resolvent sense for one $z_0 \in \Gamma$, then this holds for all $z \in \Gamma$.
and that we have
Corollary 6.25. Suppose $A_n$ converges to $A$ in strong resolvent sense, then
\[ e^{itA_n} \overset{s}{\to} e^{itA}, \quad t \in \mathbb{R}, \tag{6.65} \]
and if all operators are semi-bounded
\[ e^{-tA_n} \overset{s}{\to} e^{-tA}, \quad t \ge 0. \tag{6.66} \]
Finally we need some good criteria to check for norm respectively strong resolvent convergence.
Lemma 6.26. Let $A_n$, $A$ be self-adjoint operators with $\mathfrak{D}(A_n) = \mathfrak{D}(A)$. Then $A_n$ converges to $A$ in norm resolvent sense if there are sequences $a_n$ and $b_n$ converging to zero such that
\[ \|(A_n - A)\psi\| \le a_n \|\psi\| + b_n \|A\psi\|, \quad \psi \in \mathfrak{D}(A) = \mathfrak{D}(A_n). \tag{6.67} \]
Proof. From the second resolvent identity
\[ R_{A_n}(z) - R_A(z) = R_{A_n}(z)(A - A_n)R_A(z) \tag{6.68} \]
we infer
\[ \|(R_{A_n}(i) - R_A(i))\psi\| \le \|R_{A_n}(i)\| \big( a_n \|R_A(i)\psi\| + b_n \|A R_A(i)\psi\| \big) \le (a_n + b_n)\|\psi\| \tag{6.69} \]
and hence $\|R_{A_n}(i) - R_A(i)\| \le a_n + b_n \to 0$.
In particular, norm convergence implies norm resolvent convergence:
Corollary 6.27. Let $A_n$, $A$ be bounded self-adjoint operators with $A_n \to A$, then $A_n$ converges to $A$ in norm resolvent sense.
Similarly, if no domain problems get in the way, strong convergence implies strong resolvent convergence:
Lemma 6.28. Let $A_n$, $A$ be self-adjoint operators. Then $A_n$ converges to $A$ in strong resolvent sense if there is a core $\mathfrak{D}_0$ of $A$ such that for any $\psi \in \mathfrak{D}_0$ we have $\psi \in \mathfrak{D}(A_n)$ for $n$ sufficiently large and $A_n\psi \to A\psi$.
Proof. Using the second resolvent identity we have
\[ \|(R_{A_n}(i) - R_A(i))\psi\| \le \|(A - A_n)R_A(i)\psi\| \to 0 \tag{6.70} \]
for $\psi \in (A - i)\mathfrak{D}_0$ which is dense, since $\mathfrak{D}_0$ is a core. The rest follows from Lemma 1.13.
If you wonder why we did not define weak resolvent convergence, here is the answer: it is equivalent to strong resolvent convergence.
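Lemma 6.29. Suppose $\text{w-}\lim_{n\to\infty} R_{A_n}(z) = R_A(z)$ for some $z \in \Gamma$, then also $\text{s-}\lim_{n\to\infty} R_{A_n}(z) = R_A(z)$.
Proof. By $R_{A_n}(z) \rightharpoonup R_A(z)$ we have also $R_{A_n}(z)^* \rightharpoonup R_A(z)^*$ and thus by the first resolvent identity
\[ \|R_{A_n}(z)\psi\|^2 - \|R_A(z)\psi\|^2 = \langle \psi, R_{A_n}(z^*)R_{A_n}(z)\psi - R_A(z^*)R_A(z)\psi \rangle \]
\[ = \frac{1}{z - z^*} \langle \psi, (R_{A_n}(z) - R_{A_n}(z^*) - R_A(z) + R_A(z^*))\psi \rangle \to 0. \tag{6.71} \]
Together with $R_{A_n}(z)\psi \rightharpoonup R_A(z)\psi$ we have $R_{A_n}(z)\psi \to R_A(z)\psi$ by virtue of Lemma 1.11 (iv).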
Now what can we say about the spectrum?
Theorem 6.30. Let $A_n$ and $A$ be self-adjoint operators. If $A_n$ converges to $A$ in strong resolvent sense we have $\sigma(A) \subseteq \lim_{n\to\infty} \sigma(A_n)$. If $A_n$ converges to $A$ in norm resolvent sense we have $\sigma(A) = \lim_{n\to\infty} \sigma(A_n)$.
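Proof. Suppose the first claim were wrong. Then we can find a $\lambda \in \sigma(A)$ and some $\varepsilon > 0$ such that $\sigma(A_n) \cap (\lambda - \varepsilon, \lambda + \varepsilon) = \emptyset$. Choose a bounded continuous function $f$ which is one on $(\lambda - \frac{\varepsilon}{2}, \lambda + \frac{\varepsilon}{2})$ and vanishes outside $(\lambda - \varepsilon, \lambda + \varepsilon)$. Then $f(A_n) = 0$ and hence $f(A)\psi = \lim f(A_n)\psi = 0$ for every $\psi$. On the other hand, since $\lambda \in \sigma(A)$ there is a nonzero $\psi \in \mathrm{Ran}\, P_A((\lambda - \frac{\varepsilon}{2}, \lambda + \frac{\varepsilon}{2}))$ implying $f(A)\psi = \psi$, a contradiction.
To see the second claim, recall that the norm of $R_A(z)$ is just one over the distance from the spectrum. In particular, $\lambda \notin \sigma(A)$ if and only if $\|R_A(\lambda + i)\| < 1$. So $\lambda \notin \sigma(A)$ implies $\|R_A(\lambda + i)\| < 1$, which implies $\|R_{A_n}(\lambda + i)\| < 1$ for $n$ sufficiently large, which implies $\lambda \notin \sigma(A_n)$ for $n$ sufficiently large.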
Note that the spectrum can contract if we only have strong resolvent convergence: Let $A_n$ be multiplication by $\frac{1}{n}x$ in $L^2(\mathbb{R})$. Then $A_n$ converges to 0 in strong resolvent sense, but $\sigma(A_n) = \mathbb{R}$ and $\sigma(0) = \{0\}$.
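The strong resolvent convergence in this example can be observed numerically. The sketch below assumes the test vector $\psi(x) = e^{-x^2/2}$ and the point $z = i$ (both are choices made for illustration) and approximates the $L^2$ norm of $(R_{A_n}(i) - R_0(i))\psi$ with a trapezoidal rule.

```python
import math

def resolvent_difference_norm(n, half_width=10.0, steps=4000):
    """Approximate the L^2 norm of (R_{A_n}(i) - R_0(i)) psi for psi(x) = exp(-x^2/2)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        psi = math.exp(-0.5 * x * x)
        # R_{A_n}(i) multiplies by 1/(x/n - i); R_0(i) multiplies by 1/(0 - i) = i.
        diff = 1.0 / complex(x / n, -1.0) - 1j
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * abs(diff * psi) ** 2 * h
    return math.sqrt(total)

norms = [resolvent_difference_norm(n) for n in (1, 10, 100, 1000)]
```

The norms should decrease towards zero as $n$ grows, although every $A_n$ has spectrum $\mathbb{R}$; the convergence is not uniform in $\psi$, which is why only strong (and not norm) resolvent convergence holds.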
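Lemma 6.31. Suppose $A_n$ converges in strong resolvent sense to $A$. If $P_A(\{\lambda\}) = 0$, then
\[ \text{s-}\lim_{n\to\infty} P_{A_n}((-\infty, \lambda)) = \text{s-}\lim_{n\to\infty} P_{A_n}((-\infty, \lambda]) = P_A((-\infty, \lambda)) = P_A((-\infty, \lambda]). \tag{6.72} \]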
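Proof. The idea is to approximate $\chi_{(-\infty,\lambda)}$ by a continuous function $f$, say $0 \le f \le \chi_{(-\infty,\lambda)}$. Then
\[ \|(P_A((-\infty,\lambda)) - P_{A_n}((-\infty,\lambda)))\psi\| \le \|(P_A((-\infty,\lambda)) - f(A))\psi\| + \|(f(A) - f(A_n))\psi\| + \|(f(A_n) - P_{A_n}((-\infty,\lambda)))\psi\|. \tag{6.73} \]
The first term can be made arbitrarily small if we let $f$ converge pointwise to $\chi_{(-\infty,\lambda)}$ (Theorem 3.1) and the same is true for the second if we choose $n$ large (Theorem 6.23). However, the third term can only be made small for fixed $n$. To overcome this problem let us choose another continuous function $g$ with $\chi_{(-\infty,\lambda]} \le g \le 1$. Then
\[ \|(f(A_n) - P_{A_n}((-\infty,\lambda)))\psi\| \le \|(g(A_n) - f(A_n))\psi\| \tag{6.74} \]
since $f \le \chi_{(-\infty,\lambda)} \le \chi_{(-\infty,\lambda]} \le g$. Furthermore,
\[ \|(g(A_n) - f(A_n))\psi\| \le \|(g(A_n) - g(A))\psi\| + \|(g(A) - f(A))\psi\| + \|(f(A) - f(A_n))\psi\| \tag{6.75} \]
and now all terms are under control. Since we can replace $P_\cdot((-\infty,\lambda))$ by $P_\cdot((-\infty,\lambda])$ in all calculations we are done.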
Example. The following example shows that the requirement $P_A(\{\lambda\}) = 0$ is crucial, even if we have bounded operators and norm convergence. In fact, let $\mathfrak{H} = \mathbb{C}^2$ and
\[ A_n = \frac{1}{n} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{6.76} \]
Then $A_n \to 0$ and
\[ P_{A_n}((-\infty, 0)) = P_{A_n}((-\infty, 0]) = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \tag{6.77} \]
but $P_0((-\infty, 0)) = 0$ and $P_0((-\infty, 0]) = I$.
Problem 6.6. Show that for self-adjoint operators, strong resolvent convergence is equivalent to convergence with respect to the metric
\[ d(A, B) = \sum_{n \in \mathbb{N}} \frac{1}{2^n} \|(R_A(i) - R_B(i))\varphi_n\|, \tag{6.78} \]
where $\{\varphi_n\}_{n \in \mathbb{N}}$ is some ONB.
Part 2. Schrödinger Operators
Chapter 7. The free Schrödinger operator
7.1. The Fourier transform
We first review some basic facts concerning the Fourier transform which will be needed in the following section.
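Let $C^\infty(\mathbb{R}^n)$ be the set of all complex-valued functions which have partial derivatives of arbitrary order. For $f \in C^\infty(\mathbb{R}^n)$ and $\alpha \in \mathbb{N}_0^n$ we set
\[ \partial_\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}, \quad x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}, \quad |\alpha| = \alpha_1 + \cdots + \alpha_n. \tag{7.1} \]
An element $\alpha \in \mathbb{N}_0^n$ is called a multi-index and $|\alpha|$ is called its order. Recall the Schwartz space
\[ \mathcal{S}(\mathbb{R}^n) = \{ f \in C^\infty(\mathbb{R}^n) \,|\, \sup_x |x^\alpha (\partial_\beta f)(x)| < \infty, \ \alpha, \beta \in \mathbb{N}_0^n \} \tag{7.2} \]
which is dense in $L^2(\mathbb{R}^n)$ (since $C_c^\infty(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n)$ is). For $f \in \mathcal{S}(\mathbb{R}^n)$ we define
\[ \mathcal{F}(f)(p) \equiv \hat f(p) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{-ipx} f(x) \, d^n x. \tag{7.3} \]
Then it is an exercise in partial integration to prove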
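Lemma 7.1. For any multi-index $\alpha \in \mathbb{N}_0^n$ and any $f \in \mathcal{S}(\mathbb{R}^n)$ we have
\[ (\partial_\alpha f)^\wedge(p) = (ip)^\alpha \hat f(p), \quad (x^\alpha f(x))^\wedge(p) = i^{|\alpha|} \partial_\alpha \hat f(p). \tag{7.4} \]
Hence we will sometimes write $pf(x)$ for $-i\partial f(x)$, where $\partial = (\partial_1, \ldots, \partial_n)$ is the gradient.
In particular $\mathcal{F}$ maps $\mathcal{S}(\mathbb{R}^n)$ into itself. Another useful property is the convolution formula.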
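Lemma 7.2. The Fourier transform of the convolution
\[ (f * g)(x) = \int_{\mathbb{R}^n} f(y) g(x - y) \, d^n y = \int_{\mathbb{R}^n} f(x - y) g(y) \, d^n y \tag{7.5} \]
of two functions $f, g \in \mathcal{S}(\mathbb{R}^n)$ is given by
\[ (f * g)^\wedge(p) = (2\pi)^{n/2} \hat f(p) \hat g(p). \tag{7.6} \]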
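Proof. We compute
\[ (f * g)^\wedge(p) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{-ipx} \int_{\mathbb{R}^n} f(y) g(x - y) \, d^n y \, d^n x \]
\[ = \int_{\mathbb{R}^n} e^{-ipy} f(y) \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{-ip(x - y)} g(x - y) \, d^n x \, d^n y \]
\[ = \int_{\mathbb{R}^n} e^{-ipy} f(y) \hat g(p) \, d^n y = (2\pi)^{n/2} \hat f(p) \hat g(p), \tag{7.7} \]
where we have used Fubini's theorem.
Next, we want to compute the inverse of the Fourier transform. For this the following lemma will be needed.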
Lemma 7.3. We have $e^{-zx^2/2} \in \mathcal{S}(\mathbb{R}^n)$ for $\mathrm{Re}(z) > 0$ and
\[ \mathcal{F}(e^{-zx^2/2})(p) = \frac{1}{z^{n/2}} e^{-p^2/(2z)}. \tag{7.8} \]
Here $z^{n/2}$ has to be understood as $(\sqrt{z})^n$, where the branch cut of the root is chosen along the negative real axis.
Proof. Due to the product structure of the exponential, one can treat each coordinate separately, reducing the problem to the case $n = 1$.
Let $\phi_z(x) = \exp(-zx^2/2)$. Then $\phi_z'(x) + zx\phi_z(x) = 0$ and hence $i(p\hat\phi_z(p) + z\hat\phi_z'(p)) = 0$. Thus $\hat\phi_z(p) = c\,\phi_{1/z}(p)$ and (Problem 7.1)
\[ c = \hat\phi_z(0) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \exp(-zx^2/2) \, dx = \frac{1}{\sqrt{z}} \tag{7.9} \]
at least for $z > 0$. However, since the integral is holomorphic for $\mathrm{Re}(z) > 0$, this holds for all $z$ with $\mathrm{Re}(z) > 0$ if we choose the branch cut of the root along the negative real axis.
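Formula (7.8) can be checked numerically for $n = 1$ and a complex $z$. The sketch below is an illustration only: it approximates the integral in (7.3) by a trapezoidal rule on a truncated interval (the truncation width and step count are arbitrary choices) and compares with the closed form, using `cmath.sqrt`, whose branch cut lies along the negative real axis.

```python
import cmath
import math

def fourier_gaussian_numeric(z, p, half_width=12.0, steps=6000):
    """Trapezoidal approximation of (2 pi)^(-1/2) * integral of exp(-i p x - z x^2 / 2) dx."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * p * x - 0.5 * z * x * x) * h
    return total / math.sqrt(2.0 * math.pi)

def fourier_gaussian_exact(z, p):
    """Right-hand side of (7.8) for n = 1, with the principal branch of the square root."""
    return cmath.exp(-p * p / (2.0 * z)) / cmath.sqrt(z)

z, p = 1.0 + 2.0j, 0.7
error = abs(fourier_gaussian_numeric(z, p) - fourier_gaussian_exact(z, p))
```

The quadrature converges very quickly here because the integrand is analytic and decays like a Gaussian; for $\mathrm{Re}(z)$ close to zero a wider interval would be needed.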
Now we can show
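Theorem 7.4. The Fourier transform $\mathcal{F} : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ is a bijection. Its inverse is given by
\[ \mathcal{F}^{-1}(g)(x) \equiv \check g(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{ipx} g(p) \, d^n p. \tag{7.10} \]
We have $\mathcal{F}^2(f)(x) = f(-x)$ and thus $\mathcal{F}^4 = I$.
Proof. It suffices to show $\mathcal{F}^2(f)(x) = f(-x)$. Consider $\phi_z(x)$ from the proof of the previous lemma and observe $\mathcal{F}^2(\phi_z)(x) = \phi_z(-x)$. Moreover, using Fubini this even implies $\mathcal{F}^2(f_\varepsilon)(x) = f_\varepsilon(-x)$ for any $\varepsilon > 0$, where
\[ f_\varepsilon(x) = \frac{1}{(2\pi\varepsilon^2)^{n/2}} \int_{\mathbb{R}^n} \phi_{1/\varepsilon^2}(x - y) f(y) \, d^n y. \tag{7.11} \]
Since $\lim_{\varepsilon \downarrow 0} f_\varepsilon(x) = f(x)$ for every $x \in \mathbb{R}^n$, dominated convergence gives $\mathcal{F}^2(f)(x) = \lim_{\varepsilon \downarrow 0} \mathcal{F}^2(f_\varepsilon)(x) = \lim_{\varepsilon \downarrow 0} f_\varepsilon(-x) = f(-x)$.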
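From Fubini's theorem we also obtain Parseval's identity
\[ \int_{\mathbb{R}^n} |\hat f(p)|^2 \, d^n p = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(x)^* \hat f(p) e^{ipx} \, d^n p \, d^n x = \int_{\mathbb{R}^n} |f(x)|^2 \, d^n x \tag{7.12} \]
for $f \in \mathcal{S}(\mathbb{R}^n)$ and thus we can extend $\mathcal{F}$ to a unitary operator:
Theorem 7.5. The Fourier transform $\mathcal{F}$ extends to a unitary operator $\mathcal{F} : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$. Its spectrum satisfies
\[ \sigma(\mathcal{F}) = \{ z \in \mathbb{C} \,|\, z^4 = 1 \} = \{ 1, -1, i, -i \}. \tag{7.13} \]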
Proof. It remains to compute the spectrum. In fact, if $\psi_n$ is a Weyl sequence, then $(\mathcal{F}^2 + z^2)(\mathcal{F} + z)(\mathcal{F} - z)\psi_n = (\mathcal{F}^4 - z^4)\psi_n = (1 - z^4)\psi_n \to 0$ implies $z^4 = 1$. Hence $\sigma(\mathcal{F}) \subseteq \{ z \in \mathbb{C} \,|\, z^4 = 1 \}$. We defer the proof for equality to Section 8.3, where we will explicitly compute an orthonormal basis of eigenfunctions.
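Lemma 7.1 also allows us to extend differentiation to a larger class. Let us introduce the Sobolev space
\[ H^r(\mathbb{R}^n) = \{ f \in L^2(\mathbb{R}^n) \,|\, |p|^r \hat f(p) \in L^2(\mathbb{R}^n) \}. \tag{7.14} \]
We will abbreviate
\[ \partial_\alpha f = ((ip)^\alpha \hat f(p))^\vee, \quad f \in H^r(\mathbb{R}^n), \ |\alpha| \le r \tag{7.15} \]
which implies
\[ \int_{\mathbb{R}^n} g(x) (\partial_\alpha f)(x) \, d^n x = (-1)^{|\alpha|} \int_{\mathbb{R}^n} (\partial_\alpha g)(x) f(x) \, d^n x, \tag{7.16} \]
for $f \in H^r(\mathbb{R}^n)$ and $g \in \mathcal{S}(\mathbb{R}^n)$. That is, $\partial_\alpha f$ is the derivative of $f$ in the sense of distributions.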
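Lemma 7.6 (Riemann-Lebesgue). Let $C_\infty(\mathbb{R}^n)$ denote the Banach space of all continuous functions $f : \mathbb{R}^n \to \mathbb{C}$ which vanish at $\infty$ equipped with the sup norm. Then the Fourier transform is a bounded map from $L^1(\mathbb{R}^n)$ into $C_\infty(\mathbb{R}^n)$ satisfying
\[ \|\hat f\|_\infty \le (2\pi)^{-n/2} \|f\|_1. \tag{7.17} \]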
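Proof. Clearly we have $\hat f \in C_\infty(\mathbb{R}^n)$ if $f \in \mathcal{S}(\mathbb{R}^n)$. Moreover, the estimate
\[ \sup_p |\hat f(p)| \le \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} |e^{-ipx} f(x)| \, d^n x = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} |f(x)| \, d^n x \tag{7.18} \]
shows $\hat f \in C_\infty(\mathbb{R}^n)$ for arbitrary $f \in L^1(\mathbb{R}^n)$ since $\mathcal{S}(\mathbb{R}^n)$ is dense in $L^1(\mathbb{R}^n)$.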
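\[ \Delta = \sum_{j=1}^n \frac{\partial^2}{\partial x_j^2}. \tag{7.20} \]
Our first task is to find a good domain such that $H_0$ is a self-adjoint operator. By Lemma 7.1 we have that
\[ -\Delta\psi(x) = (p^2 \hat\psi(p))^\vee(x), \quad \psi \in H^2(\mathbb{R}^n), \tag{7.21} \]
and hence the operator
\[ H_0\psi = -\Delta\psi, \quad \mathfrak{D}(H_0) = H^2(\mathbb{R}^n), \tag{7.22} \]
is unitarily equivalent to the maximally defined multiplication operator
\[ (\mathcal{F} H_0 \mathcal{F}^{-1})\varphi(p) = p^2 \varphi(p), \quad \mathfrak{D}(p^2) = \{ \varphi \in L^2(\mathbb{R}^n) \,|\, p^2\varphi(p) \in L^2(\mathbb{R}^n) \}. \tag{7.23} \]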
7.2. The free Schrodinger operator 139
Theorem 7.7. The free Schrödinger operator $H_0$ is self-adjoint and its spectrum is characterized by
\[ \sigma(H_0) = \sigma_{ac}(H_0) = [0, \infty), \quad \sigma_{sc}(H_0) = \sigma_{pp}(H_0) = \emptyset. \tag{7.24} \]
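Proof. It suffices to show that $d\mu_\psi$ is purely absolutely continuous for every $\psi$. First observe that
\[ \langle \psi, R_{H_0}(z)\psi \rangle = \langle \hat\psi, R_{p^2}(z)\hat\psi \rangle = \int_{\mathbb{R}^n} \frac{|\hat\psi(p)|^2}{p^2 - z} \, d^n p = \int_{\mathbb{R}} \frac{1}{r^2 - z} \, d\tilde\mu_\psi(r), \tag{7.25} \]
where
\[ d\tilde\mu_\psi(r) = \chi_{[0,\infty)}(r) \, r^{n-1} \Big( \int_{S^{n-1}} |\hat\psi(r\omega)|^2 \, d^{n-1}\omega \Big) dr. \tag{7.26} \]
Hence, after a change of coordinates, we have
\[ \langle \psi, R_{H_0}(z)\psi \rangle = \int_{\mathbb{R}} \frac{1}{\lambda - z} \, d\mu_\psi(\lambda), \tag{7.27} \]
where
\[ d\mu_\psi(\lambda) = \frac{1}{2} \chi_{[0,\infty)}(\lambda) \, \lambda^{n/2 - 1} \Big( \int_{S^{n-1}} |\hat\psi(\sqrt{\lambda}\,\omega)|^2 \, d^{n-1}\omega \Big) d\lambda, \tag{7.28} \]
proving the claim.
Finally, we note that the compactly supported smooth functions are a core for $H_0$.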
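Lemma 7.8. The set $C_c^\infty(\mathbb{R}^n) = \{ f \in \mathcal{S}(\mathbb{R}^n) \,|\, \mathrm{supp}(f) \text{ is compact} \}$ is a core for $H_0$.
Proof. It is not hard to see that $\mathcal{S}(\mathbb{R}^n)$ is a core (Problem 7.3) and hence it suffices to show that the closure of $H_0|_{C_c^\infty(\mathbb{R}^n)}$ contains $H_0|_{\mathcal{S}(\mathbb{R}^n)}$. To see this, let $\varphi(x) \in C_c^\infty(\mathbb{R}^n)$ which is one for $|x| \le 1$ and vanishes for $|x| \ge 2$. Set $\varphi_n(x) = \varphi(\frac{1}{n}x)$, then $\psi_n(x) = \varphi_n(x)\psi(x)$ is in $C_c^\infty(\mathbb{R}^n)$ for every $\psi \in \mathcal{S}(\mathbb{R}^n)$ and $\psi_n \to \psi$ respectively $\Delta\psi_n \to \Delta\psi$.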
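Note also that the quadratic form of $H_0$ is given by
\[ q_{H_0}(\psi) = \sum_{j=1}^n \int_{\mathbb{R}^n} |\partial_j\psi(x)|^2 \, d^n x, \quad \psi \in \mathfrak{Q}(H_0) = H^1(\mathbb{R}^n). \tag{7.29} \]
Problem 7.3. Show that $\mathcal{S}(\mathbb{R}^n)$ is a core for $H_0$. (Hint: Show that the closure of $H_0|_{\mathcal{S}(\mathbb{R}^n)}$ contains $H_0$.)
Problem 7.4. Show that $\{ \psi \in \mathcal{S}(\mathbb{R}) \,|\, \psi(0) = 0 \}$ is dense but not a core for $H_0 = -\frac{d^2}{dx^2}$.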
7.3. The time evolution in the free case
Now let us look at the time evolution. We have
\[ e^{-itH_0}\psi(x) = \mathcal{F}^{-1} e^{-itp^2} \hat\psi(p). \tag{7.30} \]
The right hand side is a product and hence our operator should be expressible as an integral operator via the convolution formula. However, since $e^{-itp^2}$ is not in $L^2$, a more careful analysis is needed.
Consider
\[ f_\varepsilon(p^2) = e^{-(it + \varepsilon)p^2}, \quad \varepsilon > 0. \tag{7.31} \]
Then $f_\varepsilon(H_0)\psi \to e^{-itH_0}\psi$ by Theorem 3.1. Moreover, by Lemma 7.3 and the convolution formula we have
\[ f_\varepsilon(H_0)\psi(x) = \frac{1}{(4\pi(it + \varepsilon))^{n/2}} \int_{\mathbb{R}^n} e^{-\frac{|x - y|^2}{4(it + \varepsilon)}} \psi(y) \, d^n y \tag{7.32} \]
and hence
\[ e^{-itH_0}\psi(x) = \frac{1}{(4\pi it)^{n/2}} \int_{\mathbb{R}^n} e^{i\frac{|x - y|^2}{4t}} \psi(y) \, d^n y \tag{7.33} \]
for $t \ne 0$ and $\psi \in L^1 \cap L^2$. For general $\psi \in L^2$ the integral has to be understood as a limit.
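As a plausibility check of (7.33), one can take $n = 1$ and the Gaussian $\psi(y) = e^{-y^2/2}$, for which (7.30) together with Lemma 7.3 gives the closed form $(1 + 2it)^{-1/2} e^{-x^2/(2(1 + 2it))}$. The following sketch compares this with a trapezoidal approximation of the integral in (7.33); the choice of $\psi$, the restriction to $t > 0$ and the principal branch of the square roots are assumptions of the example.

```python
import cmath
import math

def free_evolution_kernel(x, t, half_width=12.0, steps=12000):
    """Right-hand side of (7.33) for n = 1 and psi(y) = exp(-y^2/2), by the trapezoidal rule."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(1j * (x - y) ** 2 / (4.0 * t) - 0.5 * y * y) * h
    return total / cmath.sqrt(4j * math.pi * t)

def free_evolution_exact(x, t):
    """Closed form obtained from (7.30): multiply the Gaussian in momentum space and invert."""
    w = 1.0 + 2.0j * t
    return cmath.exp(-x * x / (2.0 * w)) / cmath.sqrt(w)

x, t = 0.3, 0.5
kernel_error = abs(free_evolution_kernel(x, t) - free_evolution_exact(x, t))
```

The modulus of the closed form decays like $(1 + 4t^2)^{-1/4}$ at $x = 0$, which illustrates the spreading of the wave packet.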
Using this explicit form, it is not hard to draw some immediate consequences. For example, if $\psi \in L^2(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$, then $\psi(t) \in C(\mathbb{R}^n)$ for $t \ne 0$ (use dominated convergence and continuity of the exponential) and satisfies
\[ \|\psi(t)\|_\infty \le \frac{1}{|4\pi t|^{n/2}} \|\psi(0)\|_1 \tag{7.34} \]
by the Riemann-Lebesgue lemma. Thus we have spreading of wave functions in this case. Moreover, it is even possible to determine the asymptotic form of the wave function for large $t$ as follows. Observe
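\[ e^{-itH_0}\psi(x) = \frac{e^{i\frac{x^2}{4t}}}{(4\pi it)^{n/2}} \int_{\mathbb{R}^n} e^{i\frac{y^2}{4t}} \psi(y) e^{-i\frac{xy}{2t}} \, d^n y = \Big( \frac{1}{2it} \Big)^{n/2} e^{i\frac{x^2}{4t}} \Big( e^{i\frac{y^2}{4t}} \psi(y) \Big)^\wedge \Big( \frac{x}{2t} \Big). \tag{7.35} \]
Moreover, since $\exp(i\frac{y^2}{4t})\psi(y) \to \psi(y)$ in $L^2$ as $|t| \to \infty$ (dominated convergence) we obtain
Lemma 7.9. For any $\psi \in L^2(\mathbb{R}^n)$ we have
\[ e^{-itH_0}\psi(x) - \Big( \frac{1}{2it} \Big)^{n/2} e^{i\frac{x^2}{4t}} \hat\psi\Big( \frac{x}{2t} \Big) \to 0 \tag{7.36} \]
in $L^2$ as $|t| \to \infty$.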
Next we want to apply the RAGE theorem in order to show that for any initial condition, a particle will escape to infinity.
Lemma 7.10. Let $g(x)$ be the multiplication operator by $g$ and let $f(p)$ be the operator given by $f(p)\psi(x) = \mathcal{F}^{-1}(f(p)\hat\psi(p))(x)$. Denote by $L^\infty_\infty(\mathbb{R}^n)$ the bounded Borel functions which vanish at infinity. Then
\[ f(p)g(x) \quad\text{and}\quad g(x)f(p) \tag{7.37} \]
are compact if $f, g \in L^\infty_\infty(\mathbb{R}^n)$ and (extend to) Hilbert-Schmidt operators if $f, g \in L^2(\mathbb{R}^n)$.
Proof. By symmetry it suffices to consider $g(x)f(p)$. Let $f, g \in L^2$, then
\[ g(x)f(p)\psi(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} g(x) \check f(x - y) \psi(y) \, d^n y \tag{7.38} \]
shows that $g(x)f(p)$ is Hilbert-Schmidt since $g(x)\check f(x - y) \in L^2(\mathbb{R}^n \times \mathbb{R}^n)$.
If $f, g$ are bounded then the functions $f_R(p) = \chi_{\{p \,|\, p^2 \le R\}}(p) f(p)$ and $g_R(x) = \chi_{\{x \,|\, x^2 \le R\}}(x) g(x)$ are in $L^2$. Thus $g_R(x)f_R(p)$ is compact and tends to $g(x)f(p)$ in norm since $f, g$ vanish at infinity.
In particular, this lemma implies that
\[ \chi_\Omega (H_0 + i)^{-1} \tag{7.39} \]
is compact if $\Omega \subseteq \mathbb{R}^n$ is bounded and hence
\[ \lim_{t \to \infty} \|\chi_\Omega e^{-itH_0}\psi\|^2 = 0 \tag{7.40} \]
for any $\psi \in L^2(\mathbb{R}^n)$ and any bounded subset $\Omega$ of $\mathbb{R}^n$. In other words, the particle will eventually escape to infinity since the probability of finding the particle in any bounded set tends to zero. (If $\psi \in L^1(\mathbb{R}^n)$ this of course also follows from (7.34).)
7.4. The resolvent and Green's function
Now let us compute the resolvent of $H_0$. We will try to use a similar approach as for the time evolution in the previous section. However, since it is highly nontrivial to compute the inverse Fourier transform of $\exp(-\varepsilon p^2)(p^2 - z)^{-1}$ directly, we will use a small ruse.
Note that
\[ R_{H_0}(z) = \int_0^\infty e^{zt} e^{-tH_0} \, dt, \quad \mathrm{Re}(z) < 0 \tag{7.41} \]
by Lemma 4.1. Moreover,
\[ e^{-tH_0}\psi(x) = \frac{1}{(4\pi t)^{n/2}} \int_{\mathbb{R}^n} e^{-\frac{|x - y|^2}{4t}} \psi(y) \, d^n y, \quad t > 0, \tag{7.42} \]
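by the same analysis as in the previous section. Hence, by Fubini, we have
\[ R_{H_0}(z)\psi(x) = \int_{\mathbb{R}^n} G_0(z, |x - y|) \psi(y) \, d^n y, \tag{7.43} \]
where
\[ G_0(z, r) = \int_0^\infty \frac{1}{(4\pi t)^{n/2}} e^{-\frac{r^2}{4t} + zt} \, dt, \quad r > 0, \ \mathrm{Re}(z) < 0. \tag{7.44} \]
The function $G_0(z, r)$ is called Green's function of $H_0$. The integral can be evaluated in terms of modified Bessel functions of the second kind
\[ G_0(z, r) = \frac{1}{2\pi} \Big( \frac{-z}{4\pi^2 r^2} \Big)^{\frac{n-2}{4}} K_{\frac{n}{2} - 1}(\sqrt{-z}\, r). \tag{7.45} \]
The functions $K_\nu(x)$ satisfy the differential equation
\[ \Big( \frac{d^2}{dx^2} + \frac{1}{x}\frac{d}{dx} - 1 - \frac{\nu^2}{x^2} \Big) K_\nu(x) = 0 \tag{7.46} \]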
K
(x) =
_
()
2
_
x
2
_
+O(x
+1
) ,= 0
ln(
x
2
) +O(1) = 0
(7.47)
for [x[ 0 and
K
(x) =
_
2x
e
x
(1 +O(x
1
)) (7.48)
for [x[ . For more information see for example [22]. In particular,
G
0
(z, r) has an analytic continuation for z C[0, ) = (H
0
). Hence we
can dene the right hand side of (7.43) for all z (H
0
) such that
_
R
n
_
R
n
(x)G
0
(z, [x y[)(y)d
n
yd
n
x (7.49)
is analytic for z (H
0
) and , o(R
n
) (by Moreas theorem). Since
it is equal to , R
H
0
(z)) for Re(z) < 0 it is equal to this function for all
z (H
0
), since both functions are analytic in this domain. In particular,
(7.43) holds for all z (H
0
).
If $n$ is odd, we have the case of spherical Bessel functions which can be expressed in terms of elementary functions. For example, we have
\[ G_0(z, r) = \frac{1}{2\sqrt{-z}} e^{-\sqrt{-z}\, r}, \quad n = 1, \tag{7.50} \]
and
\[ G_0(z, r) = \frac{1}{4\pi r} e^{-\sqrt{-z}\, r}, \quad n = 3. \tag{7.51} \]
Problem 7.5. Verify (7.43) directly in the case n = 1.
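The closed form (7.50) can be compared numerically with the integral representation (7.44). The sketch below restricts to $n = 1$ and real $z = -\kappa^2 < 0$ (an assumption made so that everything is real-valued) and substitutes $t = s^2$, which removes the integrable singularity of $(4\pi t)^{-1/2}$ at $t = 0$.

```python
import math

def green_function_1d_numeric(r, kappa=1.0, s_max=8.0, steps=8000):
    """(7.44) for n = 1 and z = -kappa^2, after the substitution t = s^2."""
    h = s_max / steps
    total = 0.0
    for k in range(steps + 1):
        s = k * h
        if s == 0.0:
            value = 0.0   # exp(-r^2 / (4 s^2)) tends to 0 as s -> 0 for r > 0
        else:
            value = math.exp(-r * r / (4.0 * s * s) - kappa * kappa * s * s)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * value * h
    return total / math.sqrt(math.pi)

def green_function_1d_exact(r, kappa=1.0):
    """(7.50) with sqrt(-z) = kappa."""
    return math.exp(-kappa * r) / (2.0 * kappa)

green_error = abs(green_function_1d_numeric(1.0, 1.5) - green_function_1d_exact(1.0, 1.5))
```

This only tests the kernel itself; verifying (7.43) as an operator identity, as asked in Problem 7.5, additionally requires checking that the kernel inverts $-\frac{d^2}{dx^2} - z$.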
Chapter 8. Algebraic methods
8.1. Position and momentum
Apart from the Hamiltonian $H_0$, which corresponds to the kinetic energy, there are several other important observables associated with a single particle in three dimensions. Using commutation relations between these observables, many important consequences about these observables can be derived.
First consider the one-parameter unitary group
\[ (U_j(t)\psi)(x) = e^{-itx_j} \psi(x), \quad 1 \le j \le 3. \tag{8.1} \]
For $\psi \in \mathcal{S}(\mathbb{R}^3)$ we compute
\[ \lim_{t \to 0} i \frac{e^{-itx_j}\psi(x) - \psi(x)}{t} = x_j \psi(x) \tag{8.2} \]
and hence the generator is the multiplication operator by the $j$-th coordinate function. By Corollary 5.3 it is essentially self-adjoint on $\mathcal{S}(\mathbb{R}^3)$. It is customary to combine all three operators into one vector valued operator $x$, which is known as the position operator. Moreover, it is not hard to see that the spectrum of $x_j$ is purely absolutely continuous and given by $\sigma(x_j) = \mathbb{R}$. In fact, let $\{\varphi_i(x)\}$ be an orthonormal basis for $L^2(\mathbb{R})$. Then $\varphi_i(x_1)\varphi_j(x_2)\varphi_k(x_3)$ is an orthonormal basis for $L^2(\mathbb{R}^3)$ and $x_1$ can be written as an orthogonal sum of operators restricted to the subspaces spanned by $\varphi_j(x_2)\varphi_k(x_3)$. Each subspace is unitarily equivalent to $L^2(\mathbb{R})$ and $x_1$ is given by multiplication with the identity. Hence the claim follows (or use Theorem 4.12).
Next, consider the one-parameter unitary group of translations
\[ (U_j(t)\psi)(x) = \psi(x - te_j), \quad 1 \le j \le 3, \tag{8.3} \]
where $e_j$ is the unit vector in the $j$-th coordinate direction. For $\psi \in \mathcal{S}(\mathbb{R}^3)$ we compute
\[ \lim_{t \to 0} i \frac{\psi(x - te_j) - \psi(x)}{t} = \frac{1}{i} \frac{\partial}{\partial x_j} \psi(x) \tag{8.4} \]
and hence the generator is $p_j = \frac{1}{i}\frac{\partial}{\partial x_j}$. Again it is essentially self-adjoint on $\mathcal{S}(\mathbb{R}^3)$. Moreover, since it is unitarily equivalent to $x_j$ by virtue of the Fourier transform we conclude that the spectrum of $p_j$ is again purely absolutely continuous and given by $\sigma(p_j) = \mathbb{R}$. The operator $p$ is known as the momentum operator. Note that since
\[ [H_0, p_j]\psi(x) = 0, \quad \psi \in \mathcal{S}(\mathbb{R}^3) \tag{8.5} \]
we have
\[ \frac{d}{dt} \langle \psi(t), p_j\psi(t) \rangle = 0, \quad \psi(t) = e^{-itH_0}\psi(0) \in \mathcal{S}(\mathbb{R}^3), \tag{8.6} \]
that is, the momentum is a conserved quantity for the free motion. Similarly one has
\[ [p_j, x_k]\psi(x) = -i\delta_{jk}\psi(x), \quad \psi \in \mathcal{S}(\mathbb{R}^3), \tag{8.7} \]
which is known as the Weyl relation.
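The Weyl relations also imply that the mean-square deviation of position and momentum cannot be made arbitrarily small simultaneously:
Theorem 8.1 (Heisenberg Uncertainty Principle). Suppose $A$ and $B$ are two symmetric operators, then for any $\psi \in \mathfrak{D}(AB) \cap \mathfrak{D}(BA)$ we have
\[ \Delta_\psi(A) \Delta_\psi(B) \ge \frac{1}{2} |E_\psi([A, B])| \tag{8.8} \]
with equality if
\[ (B - E_\psi(B))\psi = i\lambda (A - E_\psi(A))\psi, \quad \lambda \in \mathbb{R} \setminus \{0\}, \tag{8.9} \]
or if $\psi$ is an eigenstate of $A$ or $B$.
Proof. Let us fix $\psi \in \mathfrak{D}(AB) \cap \mathfrak{D}(BA)$ and abbreviate
\[ \hat A = A - E_\psi(A), \quad \hat B = B - E_\psi(B). \tag{8.10} \]
Then $\Delta_\psi(A) = \|\hat A\psi\|$, $\Delta_\psi(B) = \|\hat B\psi\|$ and hence by Cauchy-Schwarz
\[ |\langle \hat A\psi, \hat B\psi \rangle| \le \Delta_\psi(A) \Delta_\psi(B). \tag{8.11} \]
Now note that
\[ \hat A \hat B = \frac{1}{2} \{\hat A, \hat B\} + \frac{1}{2} [A, B], \quad \{\hat A, \hat B\} = \hat A \hat B + \hat B \hat A \tag{8.12} \]
where $\{\hat A, \hat B\}$ and $i[A, B]$ are symmetric. So
\[ |\langle \hat A\psi, \hat B\psi \rangle|^2 = |\langle \psi, \hat A \hat B\psi \rangle|^2 = \Big| \frac{1}{2} \langle \psi, \{\hat A, \hat B\}\psi \rangle \Big|^2 + \Big| \frac{1}{2} \langle \psi, [A, B]\psi \rangle \Big|^2 \tag{8.13} \]
which proves (8.8).
To have equality if $\psi$ is not an eigenstate we need $\hat B\psi = z\hat A\psi$ for equality in Cauchy-Schwarz and $\langle \psi, \{\hat A, \hat B\}\psi \rangle = 0$. Inserting the first into the second requirement gives $0 = (z + z^*)\|\hat A\psi\|^2$ and shows $\mathrm{Re}(z) = 0$.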
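In case of position and momentum we have ($\|\psi\| = 1$)
\[ \Delta_\psi(p_j) \Delta_\psi(x_k) \ge \frac{\delta_{jk}}{2} \tag{8.14} \]
and the minimum is attained for the Gaussian wave packets
\[ \psi(x) = \Big( \frac{\lambda}{\pi} \Big)^{n/4} e^{-\frac{\lambda}{2}|x - x_0|^2 + ip_0 x}, \tag{8.15} \]
which satisfy $E_\psi(x) = x_0$ and $E_\psi(p) = p_0$ respectively $\Delta_\psi(p_j)^2 = \frac{\lambda}{2}$ and $\Delta_\psi(x_k)^2 = \frac{1}{2\lambda}$.
Problem 8.1. Check that (8.15) realizes the minimum.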
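The moments of a one-dimensional Gaussian wave packet can be approximated by quadrature, which gives a quick sanity check of the values quoted above. This is a sketch under explicit assumptions: $n = 1$, $p = -i\frac{d}{dx}$, the phase convention $e^{+ip_0x}$ (so that $E_\psi(p) = p_0$), and illustrative parameter values.

```python
import cmath
import math

def gaussian_packet_moments(lam, x0, p0, half_width=12.0, steps=6000):
    """Return (E(x), E(p), Var(x), Var(p)) for psi(x) = (lam/pi)^(1/4) exp(-lam (x-x0)^2 / 2 + i p0 x)."""
    h = 2.0 * half_width / steps
    ex = ex2 = ep = ep2 = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = (0.5 if k in (0, steps) else 1.0) * h
        psi = (lam / math.pi) ** 0.25 * cmath.exp(-0.5 * lam * (x - x0) ** 2 + 1j * p0 * x)
        dpsi = (-lam * (x - x0) + 1j * p0) * psi          # exact derivative of psi
        density = abs(psi) ** 2
        ex += weight * x * density
        ex2 += weight * x * x * density
        ep += weight * (psi.conjugate() * (-1j) * dpsi).real   # <psi, p psi>
        ep2 += weight * abs(dpsi) ** 2                         # <p psi, p psi>
    return ex, ep, ex2 - ex ** 2, ep2 - ep ** 2

mean_x, mean_p, var_x, var_p = gaussian_packet_moments(lam=2.0, x0=0.5, p0=1.5)
uncertainty_product = math.sqrt(var_x * var_p)
```

For these parameters the product of the standard deviations should come out as $\frac{1}{2}$ up to quadrature error, the lower bound in (8.14).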
8.2. Angular momentum
Now consider the oneparameter unitary group of rotations
(U
j
(t))(x) = (M
j
(t)x), 1 j 3, (8.16)
where M
j
(t) is the matrix of rotation around e
j
by an angle of t. For
o(R
3
) we compute
lim
t0
i
(M
i
(t)x) (x)
t
=
3
j,k=1
ijk
x
j
p
k
(x), (8.17)
where
ijk
=
_
_
_
1 if ijk is an even permutation of 123
1 if ijk is an odd permutation of 123
0 else
. (8.18)
Again one combines the three components to one vector valued operator
L = x p, which is known as angular momentum operator. Since
e
i2L
j
= I, we see that the spectrum is a subset of Z. In particular, the
continuous spectrum is empty. We will show below that we have (L
j
) = Z.
Note that since
[H
0
, L
j
](x) = 0, o(R
3
), (8.19)
we have again
d
dt
(t), L
j
(t)) = 0, (t) = e
itH
0
(0) o(R
3
), (8.20)
that is, the angular momentum is a conserved quantity for the free motion
as well.
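Moreover, we even have
\[ [L_i, K_j]\psi(x) = i \sum_{k=1}^3 \varepsilon_{ijk} K_k \psi(x), \quad \psi \in \mathcal{S}(\mathbb{R}^3), \ K_j \in \{L_j, p_j, x_j\}, \tag{8.21} \]
and these algebraic commutation relations are often used to derive information on the point spectra of these operators. In this respect the following domain
\[ \mathfrak{D} = \mathrm{span}\{ x^\alpha e^{-\frac{x^2}{2}} \,|\, \alpha \in \mathbb{N}_0^n \} \subset \mathcal{S}(\mathbb{R}^n) \tag{8.22} \]
is often used. It has the nice property that the finite dimensional subspaces
\[ \mathfrak{D}_k = \mathrm{span}\{ x^\alpha e^{-\frac{x^2}{2}} \,|\, |\alpha| \le k \} \tag{8.23} \]
are invariant under $L_j$ (and hence they reduce $L_j$).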
Lemma 8.2. The subspace $\mathfrak{D} \subset L^2(\mathbb{R}^n)$ defined in (8.22) is dense.
Proof. By Lemma 1.9 it suffices to consider the case $n = 1$. Suppose $\langle \varphi, \psi \rangle = 0$ for every $\psi \in \mathfrak{D}$. Then
\[ \frac{1}{\sqrt{2\pi}} \int \varphi(x)^* e^{-\frac{x^2}{2}} \sum_{j=0}^k \frac{(itx)^j}{j!} \, dx = 0 \tag{8.24} \]
for any finite $k$ and hence also in the limit $k \to \infty$ by the dominated convergence theorem. But the limit is the Fourier transform of $\varphi(x)^* e^{-\frac{x^2}{2}}$ (evaluated at $-t$), which shows that this function is zero. Hence $\varphi(x) = 0$.
Since $\mathfrak{D}$ is invariant under the unitary groups generated by $L_j$, the operators $L_j$ are essentially self-adjoint on $\mathfrak{D}$ by Corollary 5.3.
Introducing $L^2 = L_1^2 + L_2^2 + L_3^2$ it is straightforward to check
\[ [L^2, L_j]\psi(x) = 0, \quad \psi \in \mathcal{S}(\mathbb{R}^3). \tag{8.25} \]
Moreover, $\mathfrak{D}_k$ is invariant under $L^2$ and $L_3$ and hence $\mathfrak{D}_k$ reduces $L^2$ and $L_3$. In particular, $L^2$ and $L_3$ are given by finite matrices on $\mathfrak{D}_k$. Now let $\mathfrak{H}_m = \mathrm{Ker}(L_3 - m)$ and denote by $P_k$ the projector onto $\mathfrak{D}_k$. Since $L^2$ and $L_3$ commute on $\mathfrak{D}_k$, the space $P_k\mathfrak{H}_m$ is invariant under $L^2$ which shows that we can choose an orthonormal basis consisting of eigenfunctions of $L^2$ for $P_k\mathfrak{H}_m$. Increasing $k$ we get an orthonormal set of simultaneous eigenfunctions whose span is equal to $\mathfrak{D}$. Hence there is an orthonormal basis of simultaneous eigenfunctions of $L^2$ and $L_3$.
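Now let us try to draw some further consequences by using the commutation relations (8.21). (All commutation relations below hold for $\psi \in \mathcal{S}(\mathbb{R}^3)$.) Denote by $\mathfrak{H}_{l,m}$ the set of all functions in $\mathfrak{D}$ satisfying
\[ L_3\psi = m\psi, \quad L^2\psi = l(l + 1)\psi. \tag{8.26} \]
By $L^2 \ge 0$ and $\sigma(L_3) \subseteq \mathbb{Z}$ we can restrict our attention to the case $l \ge 0$ and $m \in \mathbb{Z}$.
First introduce two new operators
\[ L_\pm = L_1 \pm iL_2, \quad [L_3, L_\pm] = \pm L_\pm. \tag{8.27} \]
Then, for every $\psi \in \mathfrak{H}_{l,m}$ we have
\[ L_3(L_\pm\psi) = (m \pm 1)(L_\pm\psi), \quad L^2(L_\pm\psi) = l(l + 1)(L_\pm\psi), \tag{8.28} \]
that is, $L_\pm\mathfrak{H}_{l,m} \to \mathfrak{H}_{l,m\pm1}$. Moreover, since
\[ L^2 = L_3^2 \pm L_3 + L_\mp L_\pm \tag{8.29} \]
we obtain
\[ \|L_\pm\psi\|^2 = \langle \psi, L_\mp L_\pm\psi \rangle = (l(l + 1) - m(m \pm 1))\|\psi\|^2 \tag{8.30} \]
for every $\psi \in \mathfrak{H}_{l,m}$. If $\psi \ne 0$ we must have $l(l + 1) - m(m \pm 1) \ge 0$, which shows $\mathfrak{H}_{l,m} = \{0\}$ for $|m| > l$. Moreover, $L_\pm\mathfrak{H}_{l,m} \to \mathfrak{H}_{l,m\pm1}$ is injective unless $|m| = l$. Hence we must have $\mathfrak{H}_{l,m} = \{0\}$ for $l \notin \mathbb{N}_0$.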
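The algebra in (8.27) to (8.30) can be illustrated with finite matrices. The sketch below does not use the differential operators of the text; it assumes the standard $(2l+1)$-dimensional matrix representation in which $L_3$ is diagonal with entries $m = l, l-1, \ldots, -l$ and $L_+$ has the matrix elements $\sqrt{l(l+1) - m(m+1)}$ suggested by (8.30). All function names are hypothetical.

```python
import math

def angular_momentum_matrices(l):
    """Matrices of L_3, L_+ and L_- on the basis ordered by m = l, l-1, ..., -l."""
    dim = 2 * l + 1
    ms = [l - k for k in range(dim)]
    L3 = [[float(ms[r]) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    Lp = [[0.0] * dim for _ in range(dim)]
    for c in range(1, dim):
        m = ms[c]
        Lp[c - 1][c] = math.sqrt(l * (l + 1) - m * (m + 1))   # L_+ raises m to m + 1
    Lm = [[Lp[c][r] for c in range(dim)] for r in range(dim)]  # L_- is the transpose of L_+
    return L3, Lp, Lm

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def max_abs_diff(A, B):
    n = len(A)
    return max(abs(A[r][c] - B[r][c]) for r in range(n) for c in range(n))

def ladder_check(l):
    """Return the errors in [L_3, L_+] = L_+ and in L^2 = L_3^2 + L_3 + L_- L_+ = l(l+1) I."""
    L3, Lp, Lm = angular_momentum_matrices(l)
    dim = 2 * l + 1
    a, b = matmul(L3, Lp), matmul(Lp, L3)
    commutator = [[a[r][c] - b[r][c] for c in range(dim)] for r in range(dim)]
    L3sq, LmLp = matmul(L3, L3), matmul(Lm, Lp)
    casimir = [[L3sq[r][c] + L3[r][c] + LmLp[r][c] for c in range(dim)] for r in range(dim)]
    target = [[float(l * (l + 1)) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    return max_abs_diff(commutator, Lp), max_abs_diff(casimir, target)

commutator_error, casimir_error = ladder_check(2)
```

Both errors should vanish up to rounding. The first column of $L_+$ is zero, which reflects that $L_+$ annihilates the vector with $m = l$.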
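Up to this point we know $\sigma(L^2) \subseteq \{ l(l + 1) \,|\, l \in \mathbb{N}_0 \}$, $\sigma(L_3) \subseteq \mathbb{Z}$. In order to show that equality holds in both cases, we need to show that $\mathfrak{H}_{l,m} \ne \{0\}$ for $l \in \mathbb{N}_0$, $m = -l, -l + 1, \ldots, l - 1, l$. First of all we observe
\[ \psi_{0,0}(x) = \frac{1}{\sqrt{\pi^{3/2}}} e^{-\frac{x^2}{2}} \in \mathfrak{H}_{0,0}. \tag{8.31} \]
Next, we note that (8.21) implies
\[ [L_3, x_\pm] = \pm x_\pm, \quad x_\pm = x_1 \pm ix_2, \]
\[ [L_\pm, x_\pm] = 0, \quad [L_\pm, x_\mp] = \pm 2x_3, \]
\[ [L^2, x_\pm] = 2x_\pm(1 \pm L_3) \mp 2x_3 L_\pm. \tag{8.32} \]
Hence if $\psi \in \mathfrak{H}_{l,l}$, then $(x_1 + ix_2)\psi \in \mathfrak{H}_{l+1,l+1}$. And thus
\[ \psi_{l,l}(x) = \frac{1}{\sqrt{l!}} (x_1 + ix_2)^l \psi_{0,0}(x) \in \mathfrak{H}_{l,l}, \tag{8.33} \]
respectively
\[ \psi_{l,m}(x) = \sqrt{\frac{(l + m)!}{(l - m)!(2l)!}} \, L_-^{l-m} \psi_{l,l}(x) \in \mathfrak{H}_{l,m}. \tag{8.34} \]
The constants are chosen such that $\|\psi_{l,m}\| = 1$.
In summary,
Theorem 8.3. There exists an orthonormal basis of simultaneous eigenvectors for the operators $L^2$ and $L_j$. Moreover, their spectra are given by
\[ \sigma(L^2) = \{ l(l + 1) \,|\, l \in \mathbb{N}_0 \}, \quad \sigma(L_3) = \mathbb{Z}. \tag{8.35} \]
We will rederive this result using different methods in Section 10.3.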
8.3. The harmonic oscillator
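Finally, let us consider another important model whose algebraic structure is similar to that of the angular momentum, the harmonic oscillator
\[ H = H_0 + \omega^2 x^2, \quad \omega > 0. \tag{8.36} \]
As domain we will choose
\[ \mathfrak{D}(H) = \mathfrak{D} = \mathrm{span}\{ x^\alpha e^{-\frac{x^2}{2}} \,|\, \alpha \in \mathbb{N}_0^3 \} \subseteq L^2(\mathbb{R}^3) \tag{8.37} \]
from our previous section.
We will first consider the one-dimensional case. Introducing
\[ A_\pm = \frac{1}{\sqrt{2}} \Big( \sqrt{\omega}\, x \mp \frac{1}{\sqrt{\omega}} \frac{d}{dx} \Big) \tag{8.38} \]
we have
\[ [A_-, A_+] = 1 \tag{8.39} \]
and
\[ H = \omega(2N + 1), \quad N = A_+ A_-, \tag{8.40} \]
for any function in $\mathfrak{D}$.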
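Moreover, since
$$[N, A_\pm] = \pm A_\pm, \tag{8.41}$$
we see that $N\psi = n\psi$ implies $N A_\pm\psi = (n \pm 1) A_\pm\psi$. Moreover, $\|A_+\psi\|^2 = \langle\psi, A_- A_+\psi\rangle = (n+1)\|\psi\|^2$ respectively $\|A_-\psi\|^2 = n\|\psi\|^2$ in this case, and hence we conclude that $\sigma(N) \subseteq \mathbb{N}_0$.
If $N\psi_0 = 0$, then we must have $A_-\psi_0 = 0$, and the normalized solution of this last equation is
$$\psi_0(x) = \left(\frac{\omega}{\pi}\right)^{1/4} e^{-\frac{\omega x^2}{2}} \in D. \tag{8.42}$$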
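Hence
$$\psi_n(x) = \frac{1}{\sqrt{n!}}\, A_+^n\, \psi_0(x) \tag{8.43}$$
is a normalized eigenfunction of $N$ corresponding to the eigenvalue $n$. Moreover, since
$$\psi_n(x) = \frac{1}{\sqrt{2^n n!}} \left(\frac{\omega}{\pi}\right)^{1/4} H_n(\sqrt{\omega}\, x)\, e^{-\frac{\omega x^2}{2}} \tag{8.44}$$
where $H_n(x)$ is a polynomial of degree $n$ given by
$$H_n(x) = e^{\frac{x^2}{2}} \left(x - \frac{d}{dx}\right)^n e^{-\frac{x^2}{2}} = (-1)^n e^{x^2} \frac{d^n}{dx^n} e^{-x^2}, \tag{8.45}$$
we conclude $\mathrm{span}\{\psi_n\} = D$. The polynomials $H_n(x)$ are called Hermite polynomials.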
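As a small illustration of the algebra above, the action of the raising operator on polynomial prefactors can be carried out exactly. The Python sketch below assumes $\omega = 1$ and represents a function $p(x) e^{-x^2/2}$ by the coefficient list of $p$; the helper names are hypothetical. Applying $x - \frac{d}{dx}$ to $p\, e^{-x^2/2}$ gives $(2xp - p')e^{-x^2/2}$, so $H_{n+1} = 2xH_n - H_n'$, and $N = A_+A_-$ acts on the prefactor as $p \mapsto x p' - p''/2$.

```python
def poly_mul_x(p):
    """Multiply a polynomial (coefficients, lowest degree first) by x."""
    return [0] + p

def poly_diff(p):
    """Derivative of a polynomial; the zero polynomial is returned as [0]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def poly_add(p, q, s=1):
    """Return p + s * q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + s * b for a, b in zip(p, q)]

def hermite(n):
    """Coefficients of H_n, generated by H_{k+1} = 2 x H_k - H_k'."""
    h = [1]
    for _ in range(n):
        h = poly_add([2 * c for c in poly_mul_x(h)], poly_diff(h), -1)
    return h

def number_operator(p):
    """Polynomial part of N (p e^{-x^2/2}) for omega = 1: x p' - p''/2."""
    dp = poly_diff(p)
    ddp = poly_diff(dp)
    return poly_add(poly_mul_x(dp), [c / 2 for c in ddp], -1)
```

With these helpers, `number_operator(hermite(n))` should equal `n` times `hermite(n)`, which is the eigenvalue relation $N\psi_n = n\psi_n$ restricted to the polynomial prefactor.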
In summary,
Theorem 8.4. The harmonic oscillator $H$ is essentially self-adjoint on $D$ and has an orthonormal basis of eigenfunctions
$$\psi_{n_1,n_2,n_3}(x) = \psi_{n_1}(x_1)\,\psi_{n_2}(x_2)\,\psi_{n_3}(x_3), \tag{8.46}$$
with $\psi_{n_j}(x_j)$ from (8.44). The spectrum is given by
$$\sigma(H) = \{(2n+3)\omega \mid n \in \mathbb{N}_0\}. \tag{8.47}$$
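Finally, there is also a close connection with the Fourier transformation. Without restriction we choose $\omega = 1$ and consider only one dimension. Then it is easy to verify that $H$ commutes with the Fourier transformation
$$\mathcal{F} H = H \mathcal{F}. \tag{8.48}$$
Moreover, by $\mathcal{F} A_\pm = \mp i A_\pm \mathcal{F}$ we even infer
$$\mathcal{F}\psi_n = \frac{1}{\sqrt{n!}}\, \mathcal{F} A_+^n \psi_0 = \frac{(-i)^n}{\sqrt{n!}}\, A_+^n \mathcal{F}\psi_0 = (-i)^n \psi_n, \tag{8.49}$$
since $\mathcal{F}\psi_0 = \psi_0$ by Lemma 7.3. In particular,
$$\sigma(\mathcal{F}) = \{z \in \mathbb{C} \mid z^4 = 1\}. \tag{8.50}$$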
Problem 8.2. Show that H =
d
2
dx
2
+q can be written as H = AA
, where
A =
d
dx
+, if the dierential equation
A. (Hint: =
.)
Chapter 9
One-dimensional Schrödinger operators
9.1. Sturm-Liouville operators
In this section we want to illustrate some of the results obtained thus far by investigating a specific example, the Sturm-Liouville equations
$$\tau f(x) = \frac{1}{r(x)}\left(-\frac{d}{dx}\, p(x) \frac{d}{dx} f(x) + q(x) f(x)\right), \qquad f,\ pf' \in AC_{loc}(I). \tag{9.1}$$
The case p = r = 1 can be viewed as the model of a particle in one
dimension in the external potential q. Moreover, the case of a particle in
three dimensions can in some situations be reduced to the investigation of
Sturm-Liouville equations. In particular, we will see how this works when
explicitly solving the hydrogen atom.
The suitable Hilbert space is
$$L^2((a,b), r(x)\,dx), \qquad \langle f, g\rangle = \int_a^b f(x)^* g(x)\, r(x)\,dx, \tag{9.2}$$
where $I = (a,b) \subseteq \mathbb{R}$ is an arbitrary open interval.
We require
(i) $p^{-1} \in L^1_{loc}(I)$, real-valued
(ii) $q \in L^1_{loc}(I)$, real-valued
(iii) $r \in L^1_{loc}(I)$, positive
If $a$ is finite and if $p^{-1}, q, r \in L^1((a,c))$ ($c \in I$), then the Sturm-Liouville equation (9.1) is called regular at $a$. Similarly for $b$. If it is regular at both $a$ and $b$, it is called regular.
The maximal domain of definition for $\tau$ in $L^2(I, r\,dx)$ is given by
$$D(\tau) = \{f \in L^2(I, r\,dx) \mid f,\ pf' \in AC_{loc}(I),\ \tau f \in L^2(I, r\,dx)\}. \tag{9.3}$$
It is not clear that $D(\tau)$ is dense unless (e.g.) $p \in AC_{loc}(I)$, $p', q \in L^2_{loc}(I)$, $r^{-1} \in L^\infty_{loc}(I)$, since $C_0^\infty(I) \subset D(\tau)$ in this case. We will defer the general case to Lemma 9.4 below.
Since we are interested in self-adjoint operators $H$ associated with (9.1), we perform a little calculation. Using integration by parts (twice) we obtain ($a < c < d < b$):
$$\int_c^d g^* (\tau f)\, r\,dy = W_d(g^*, f) - W_c(g^*, f) + \int_c^d (\tau g)^* f\, r\,dy, \tag{9.4}$$
for $f, g, pf', pg' \in AC_{loc}(I)$, where
$$W_x(f_1, f_2) = \Big(p\,(f_1 f_2' - f_1' f_2)\Big)(x) \tag{9.5}$$
is called the modified Wronskian.
Equation (9.4) also shows that the Wronskian of two solutions of $\tau u = zu$ is constant
$$W_x(u_1, u_2) = W(u_1, u_2), \qquad \tau u_{1,2} = z u_{1,2}. \tag{9.6}$$
Moreover, it is nonzero if and only if $u_1$ and $u_2$ are linearly independent (compare Theorem 9.1 below).
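A quick numerical illustration of the constancy of the Wronskian, under the simplifying assumptions $p = r = 1$, $q = 0$ and $z = k^2$ (so that $\cos(kx)$ and $\sin(kx)/k$ solve $\tau u = zu$). The function name `wronskian` and the sample points are choices made for this sketch only.

```python
import math

def wronskian(u1, du1, u2, du2, x, p=lambda x: 1.0):
    """Modified Wronskian p (u1 u2' - u1' u2) evaluated at x."""
    return p(x) * (u1(x) * du2(x) - du1(x) * u2(x))

k = 3.0  # illustrative choice, z = k^2
u1, du1 = (lambda x: math.cos(k * x)), (lambda x: -k * math.sin(k * x))
u2, du2 = (lambda x: math.sin(k * x) / k), (lambda x: math.cos(k * x))

# The value should not depend on the evaluation point.
values = [wronskian(u1, du1, u2, du2, x) for x in (0.0, 0.7, 2.5, 10.0)]
```

All entries of `values` agree up to rounding, and the common value is nonzero because the two solutions are linearly independent.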
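If we choose $f, g \in D(\tau)$ in (9.4), then we can take the limits $c \to a$ and $d \to b$, which results in
$$\langle g, \tau f\rangle = W_b(g^*, f) - W_a(g^*, f) + \langle \tau g, f\rangle, \qquad f, g \in D(\tau). \tag{9.7}$$
Here $W_{a,b}(g^*, f)$ has to be understood as a limit.
Finally, we recall the following well-known result from ordinary differential equations.
Theorem 9.1. Suppose $rg \in L^1_{loc}(I)$. Then there exists a unique solution $f$ with $f, pf' \in AC_{loc}(I)$ of the differential equation
$$(\tau - z) f = g, \qquad z \in \mathbb{C}, \tag{9.8}$$
satisfying the initial condition
$$f(c) = \alpha, \qquad (pf')(c) = \beta, \qquad \alpha, \beta \in \mathbb{C},\ c \in I. \tag{9.9}$$
In addition, $f$ is holomorphic with respect to $z$.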
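Note that $f, pf'$ can be extended continuously to a regular end point.
Lemma 9.2. Suppose $u_1, u_2$ are two solutions of $(\tau - z)u = 0$ with $W(u_1, u_2) = 1$. Then any other solution of (9.8) can be written as ($\alpha, \beta \in \mathbb{C}$)
$$f(x) = u_1(x)\left(\alpha + \int_c^x u_2\, g\, r\,dy\right) + u_2(x)\left(\beta - \int_c^x u_1\, g\, r\,dy\right),$$
$$f'(x) = u_1'(x)\left(\alpha + \int_c^x u_2\, g\, r\,dy\right) + u_2'(x)\left(\beta - \int_c^x u_1\, g\, r\,dy\right). \tag{9.10}$$
Note that the constants $\alpha, \beta$ coincide with those from Theorem 9.1 if $u_1(c) = (pu_2')(c) = 1$ and $(pu_1')(c) = u_2(c) = 0$.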
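Proof. It suffices to check $\tau f - z f = g$. Differentiating the first equation of (9.10) gives the second. Next we compute
$$\begin{aligned}
(pf')' &= (pu_1')'\left(\alpha + \int u_2\, g\, r\,dy\right) + (pu_2')'\left(\beta - \int u_1\, g\, r\,dy\right) - W(u_1, u_2)\, g\, r \\
&= (q - zr)\,u_1\left(\alpha + \int u_2\, g\, r\,dy\right) + (q - zr)\,u_2\left(\beta - \int u_1\, g\, r\,dy\right) - g\, r \\
&= (q - zr) f - g\, r,
\end{aligned} \tag{9.11}$$
which proves the claim.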
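The variation of constants formula can be tried out numerically. The sketch below assumes the simplest setting $p = r = 1$, $q = 0$, $z = k^2$, with the fundamental system $u_1(x) = \cos(k(x-c))$, $u_2(x) = \sin(k(x-c))/k$ (so $W(u_1,u_2) = 1$, $u_1(c) = u_2'(c) = 1$, $u_1'(c) = u_2(c) = 0$). The function name and the midpoint quadrature are choices made for this illustration.

```python
import math

def solve_inhomogeneous(g, k, alpha, beta, c, x, steps=2000):
    """Evaluate f(x) = u1 (alpha + int u2 g) + u2 (beta - int u1 g) for -f'' - k^2 f = g."""
    u1 = lambda t: math.cos(k * (t - c))
    u2 = lambda t: math.sin(k * (t - c)) / k
    h = (x - c) / steps
    i1 = i2 = 0.0
    for j in range(steps):          # composite midpoint rule on [c, x]
        t = c + (j + 0.5) * h
        i2 += u2(t) * g(t) * h      # integral of u2 g r dy with r = 1
        i1 += u1(t) * g(t) * h      # integral of u1 g r dy with r = 1
    return u1(x) * (alpha + i2) + u2(x) * (beta - i1)
```

For the constant right-hand side $g = 1$ the exact solution with $f(c) = \alpha$, $f'(c) = \beta$ is $(\alpha + k^{-2})\cos(k(x-c)) + \beta k^{-1}\sin(k(x-c)) - k^{-2}$, which gives a direct comparison value.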
Now we want to obtain a symmetric operator and hence we choose
$$A_0 f = \tau f, \qquad D(A_0) = D(\tau) \cap AC_c(I), \tag{9.12}$$
where $AC_c(I)$ are the functions in $AC(I)$ with compact support. This definition clearly ensures that the Wronskian of two such functions vanishes on the boundary, implying that $A_0$ is symmetric. Our first task is to compute the closure of $A_0$ and its adjoint. For this the following elementary fact will be needed.
Lemma 9.3. Suppose $V$ is a vector space and $l, l_1, \dots, l_n$ are linear functionals (defined on all of $V$) such that $\bigcap_{j=1}^n \mathrm{Ker}(l_j) \subseteq \mathrm{Ker}(l)$. Then $l = \sum_{j=1}^n \alpha_j l_j$ for some constants $\alpha_j \in \mathbb{C}$.
Proof. First of all it is no restriction to assume that the functionals $l_j$ are linearly independent. Then the map $L: V \to \mathbb{C}^n$, $f \mapsto (l_1(f), \dots, l_n(f))$ is surjective (since $x \in \mathrm{Ran}(L)^\perp$ implies $\sum_{j=1}^n x_j^* l_j(f) = 0$ for all $f$). Hence there are vectors $f_k \in V$ such that $l_j(f_k) = 0$ for $j \ne k$ and $l_j(f_j) = 1$. Then $f - \sum_{j=1}^n l_j(f) f_j \in \bigcap_{j=1}^n \mathrm{Ker}(l_j)$ and hence $l(f) - \sum_{j=1}^n l_j(f)\, l(f_j) = 0$. Thus we can choose $\alpha_j = l(f_j)$.
Now we are ready to prove
Lemma 9.4. The operator $A_0$ is densely defined and its closure is given by
$$\overline{A_0} f = \tau f, \qquad D(\overline{A_0}) = \{f \in D(\tau) \mid W_a(f,g) = W_b(f,g) = 0,\ \forall g \in D(\tau)\}. \tag{9.13}$$
Its adjoint is given by
$$A_0^* f = \tau f, \qquad D(A_0^*) = D(\tau). \tag{9.14}$$
Proof. We start by computing $A_0^*$ and ignore the fact that we don't know whether $D(A_0)$ is dense for now.
By (9.7) we have $D(\tau) \subseteq D(A_0^*)$ and it remains to show $D(A_0^*) \subseteq D(\tau)$. If $h \in D(A_0^*)$ we must have
$$\langle h, A_0 f\rangle = \langle k, f\rangle, \qquad \forall f \in D(A_0) \tag{9.15}$$
for some $k \in L^2(I, r\,dx)$. Using (9.10) we can find an $\tilde h$ such that $\tau\tilde h = k$, and from integration by parts we obtain
$$\int_a^b \big(h(x) - \tilde h(x)\big)^* (\tau f)(x)\, r(x)\,dx = 0, \qquad \forall f \in D(A_0). \tag{9.16}$$
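Clearly we expect that $h - \tilde h$ will be a solution of $\tau u = 0$, and to prove this we will invoke Lemma 9.3. Therefore we consider the linear functionals
$$l(g) = \int_a^b \big(h(x) - \tilde h(x)\big)^* g(x)\, r(x)\,dx, \qquad l_j(g) = \int_a^b u_j(x)^* g(x)\, r(x)\,dx, \tag{9.17}$$
on $L^2_c(I, r\,dx)$, where $u_j$ are two solutions of $\tau u = 0$ with $W(u_1, u_2) \ne 0$.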
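We have $\mathrm{Ker}(l_1) \cap \mathrm{Ker}(l_2) \subseteq \mathrm{Ker}(l)$. In fact, if $g \in \mathrm{Ker}(l_1) \cap \mathrm{Ker}(l_2)$, then
$$f(x) = u_1(x)\int_a^x u_2(y) g(y) r(y)\,dy + u_2(x)\int_x^b u_1(y) g(y) r(y)\,dy \tag{9.18}$$
is in $D(A_0)$ and $g = \tau f \in \mathrm{Ker}(l)$ by (9.16). Now Lemma 9.3 implies
$$\int_a^b \big(h(x) - \tilde h(x) + \alpha_1 u_1(x) + \alpha_2 u_2(x)\big)^* g(x)\, r(x)\,dx = 0, \qquad g \in L^2_c(I, r\,dx) \tag{9.19}$$
and hence $h = \tilde h + \alpha_1 u_1 + \alpha_2 u_2 \in D(\tau)$.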
Now what if $D(A_0)$ were not dense? Then there would be some freedom in the choice of $k$, since we could always add a component in $D(A_0)^\perp$. So suppose we have two choices $k_1 \ne k_2$. Then by the above calculation, there are corresponding functions $\tilde h_1$ and $\tilde h_2$ such that $h = \tilde h_1 + \alpha_{1,1} u_1 + \alpha_{1,2} u_2 = \tilde h_2 + \alpha_{2,1} u_1 + \alpha_{2,2} u_2$. In particular, $\tilde h_1 - \tilde h_2$ is in the kernel of $\tau$ and hence $k_1 = \tau\tilde h_1 = \tau\tilde h_2 = k_2$, contradicting our assumption.
Next we turn to $\overline{A_0}$. Denote the set on the right-hand side of (9.13) by $D$. Then we have $D \subseteq D(A_0^{**}) = D(\overline{A_0})$ by (9.7). Conversely, since $\overline{A_0} \subseteq A_0^*$ we can use (9.7) to conclude
$$W_a(f,h) + W_b(f,h) = 0, \qquad f \in D(\overline{A_0}),\ h \in D(A_0^*). \tag{9.20}$$
Now replace $h$ by an $\tilde h \in D(A_0^*)$ which coincides with $h$ near $a$ and vanishes identically near $b$ (Problem 9.1). Then $W_a(f,h) = W_a(f,\tilde h) + W_b(f,\tilde h) = 0$. Finally, $W_b(f,h) = -W_a(f,h) = 0$ shows $f \in D$.
Example. If $\tau$ is regular at $a$, then $f \in D(\overline{A_0})$ if and only if $f(a) = (pf')(a) = 0$. This follows since we can prescribe the values of $g(a)$, $(pg')(a)$ for $g \in D(\tau)$ arbitrarily.
This result shows that any self-adjoint extension of $A_0$ must lie between $\overline{A_0}$ and $A_0^*$. Moreover, self-adjointness seems to be related to the Wronskian of two functions at the boundary. Hence we collect a few properties first.
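Lemma 9.5. Suppose $v \in D(\tau)$ with $W_a(v^*, v) = 0$ and there is an $\hat f \in D(\tau)$ with $W_a(v^*, \hat f) \ne 0$. Then we have
$$W_a(v, f) = 0 \quad\Leftrightarrow\quad W_a(v, f^*) = 0 \qquad \forall f \in D(\tau) \tag{9.21}$$
and
$$W_a(v, f) = W_a(v, g) = 0 \quad\Rightarrow\quad W_a(g^*, f) = 0 \qquad \forall f, g \in D(\tau). \tag{9.22}$$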
Proof. For all $f_1, \dots, f_4 \in D(\tau)$ we have the Plücker identity
$$W_x(f_1,f_2)W_x(f_3,f_4) + W_x(f_1,f_3)W_x(f_4,f_2) + W_x(f_1,f_4)W_x(f_2,f_3) = 0, \tag{9.23}$$
which remains valid in the limit $x \to a$. Choosing $f_1 = v$, $f_2 = f$, $f_3 = v^*$, $f_4 = \hat f$ we infer (9.21). Choosing $f_1 = f$, $f_2 = g^*$, $f_3 = v$, $f_4 = \hat f$ we infer (9.22).
Problem 9.1. Given $\alpha, \beta, \gamma, \delta$, show that there is a function $f \in D(\tau)$ restricted to $[c,d] \subseteq (a,b)$ such that $f(c) = \alpha$, $(pf')(c) = \beta$ and $f(d) = \gamma$, $(pf')(d) = \delta$. (Hint: Lemma 9.2)
Problem 9.2. Let $A_0 = -\frac{d^2}{dx^2}$, $D(A_0) = \{f \in H^2[0,1] \mid f(0) = f(1) = 0\}$ and $B = q$, $D(B) = \{f \in L^2(0,1) \mid qf \in L^2(0,1)\}$. Find a $q \in L^1(0,1)$ such that $D(A_0) \cap D(B) = \{0\}$. (Hint: Problem 0.18)
9.2. Weyl's limit circle, limit point alternative
We call $\tau$ limit circle (l.c.) at $a$ if there is a $v \in D(\tau)$ with $W_a(v^*, v) = 0$ such that $W_a(v, f) \ne 0$ for at least one $f \in D(\tau)$. Otherwise $\tau$ is called limit point (l.p.) at $a$. Similarly for $b$.
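Example. If $\tau$ is regular at $a$, it is limit circle at $a$. Since
$$W_a(v, f) = (pf')(a)\,v(a) - (pv')(a)\,f(a) \tag{9.24}$$
any real-valued $v$ with $(v(a), (pv')(a)) \ne (0,0)$ works.
Note that if $W_a(f, v) \ne 0$, then $W_a(f, \mathrm{Re}(v)) \ne 0$ or $W_a(f, \mathrm{Im}(v)) \ne 0$. Hence it is no restriction to assume that $v$ is real, and $W_a(v^*, v) = 0$ is trivially satisfied in this case. In particular, $\tau$ is limit point at $a$ if and only if $W_a(f, g) = 0$ for all $f, g \in D(\tau)$.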
Theorem 9.6. If $\tau$ is l.c. at $a$, then let $v \in D(\tau)$ with $W_a(v^*, v) = 0$ and $W_a(v, f) \ne 0$ for some $f \in D(\tau)$. Similarly, if $\tau$ is l.c. at $b$, let $w$ be an analogous function. Then the operator
$$A: D(A) \to L^2(I, r\,dx), \qquad f \mapsto \tau f \tag{9.25}$$
with
$$D(A) = \{f \in D(\tau) \mid W_a(v, f) = 0 \text{ if l.c. at } a,\ \ W_b(w, f) = 0 \text{ if l.c. at } b\} \tag{9.26}$$
is self-adjoint.
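Proof. Clearly $A \subseteq A_0^*$. Let $g \in D(A^*)$. As in the computation of $\overline{A_0}$ we conclude $W_a(f, g) = 0$ for all $f \in D(A)$. Moreover, we can choose $f$ such that it coincides with $v$ near $a$ and hence $W_a(v, g) = 0$, that is, $g \in D(A)$.
The name limit circle respectively limit point stems from the original approach of Weyl, who considered the set of solutions $\tau u = zu$, $z \in \mathbb{C}\setminus\mathbb{R}$, which satisfy $W_c(u^*, u) = 0$. They can be shown to lie on a circle which converges to a circle respectively a point as $c \to a$ (or $c \to b$).
Example. If $\tau$ is regular at $a$ we can choose $v$ such that $(v(a), (pv')(a)) = (\sin(\alpha), -\cos(\alpha))$, $\alpha \in [0, \pi)$, such that
$$W_a(v, f) = \cos(\alpha) f(a) + \sin(\alpha)(pf')(a). \tag{9.27}$$
The most common choice $\alpha = 0$ is known as the Dirichlet boundary condition $f(a) = 0$.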
Next we want to compute the resolvent of $A$.
Lemma 9.7. Suppose $z \in \rho(A)$, then there exists a solution $u_a(z,x)$ which is in $L^2((a,c), r\,dx)$ and which satisfies the boundary condition at $a$ if $\tau$ is l.c. at $a$. It can be chosen locally holomorphic with respect to $z$ such that
$$u_a(z,x)^* = u_a(z^*, x). \tag{9.28}$$
Similarly, there exists a solution $u_b(z,x)$ with the analogous properties near $b$.
The resolvent of $A$ is given by
$$(A - z)^{-1} g(x) = \int_a^b G(z,x,y)\, g(y)\, r(y)\,dy, \tag{9.29}$$
where
$$G(z,x,y) = \frac{1}{W(u_b(z), u_a(z))} \begin{cases} u_b(z,x)\,u_a(z,y), & x \ge y, \\ u_a(z,x)\,u_b(z,y), & x \le y. \end{cases} \tag{9.30}$$
Proof. Let $g \in L^2_c(I, r\,dx)$ be real-valued and consider $f = (A - z)^{-1} g \in D(A)$. Since $(\tau - z) f = 0$ near $a$ respectively $b$, we obtain $u_a(z,x)$ by setting it equal to $f$ near $a$ and using the differential equation to extend it to the rest of $I$. Similarly we obtain $u_b$. The only problem is that $u_a$ or $u_b$ might be identically zero. Hence we need to show that this can be avoided by choosing $g$ properly.
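Fix $z$ and let $g$ be supported in $(c,d) \subset I$. Since $(\tau - z) f = g$ we have
$$f(x) = u_1(x)\left(\alpha + \int_a^x u_2\, g\, r\,dy\right) + u_2(x)\left(\beta + \int_x^b u_1\, g\, r\,dy\right). \tag{9.31}$$
Near $a$ ($x < c$) we have $f(x) = \alpha u_1(x) + \tilde\beta u_2(x)$ and near $b$ ($x > d$) we have $f(x) = \tilde\alpha u_1(x) + \beta u_2(x)$, where $\tilde\alpha = \alpha + \int_a^b u_2\, g\, r\,dy$ and $\tilde\beta = \beta + \int_a^b u_1\, g\, r\,dy$. If $f$ vanishes identically near both $a$ and $b$ we must have $\alpha = \beta = \tilde\alpha = \tilde\beta = 0$ and thus $\alpha = \beta = 0$ and $\int_a^b u_j(y) g(y) r(y)\,dy = 0$, $j = 1, 2$. This case can be avoided by choosing $g$ suitably, and hence there is at least one solution, say $u_b(z)$.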
Now choose $u_1 = u_b$ and consider the behavior near $b$. If $u_2$ is not square integrable on $(d,b)$, we must have $\beta = 0$ since $\beta u_2 = f - \tilde\alpha u_b$ is. If $u_2$ is square integrable, we can find two functions in $D(\tau)$ which coincide with $u_b$ and $u_2$ near $b$. Since $W(u_b, u_2) = 1$ we see that $\tau$ is l.c. at $b$ and hence $0 = W_b(u_b, f) = W_b(u_b, \tilde\alpha u_b + \beta u_2) = \beta$. Thus $\beta = 0$ in both cases and we have
$$f(x) = u_b(x)\left(\alpha + \int_a^x u_2\, g\, r\,dy\right) + u_2(x)\int_x^b u_b\, g\, r\,dy. \tag{9.32}$$
Now choosing $g$ such that $\int_a^b u_b\, g\, r\,dy \ne 0$ we infer existence of $u_a(z)$. Choosing $u_2 = u_a$ and arguing as before we see $\alpha = 0$ and hence
$$f(x) = u_b(x)\int_a^x u_a(y) g(y) r(y)\,dy + u_a(x)\int_x^b u_b(y) g(y) r(y)\,dy = \int_a^b G(z,x,y)\, g(y)\, r(y)\,dy \tag{9.33}$$
for any $g \in L^2_c(I, r\,dx)$. Since this set is dense the claim follows.
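Example. We already know that $\tau = -\frac{d^2}{dx^2}$ on $I = (-\infty, \infty)$ gives rise to the free Schrödinger operator $H_0$. Furthermore,
$$u_\pm(z,x) = e^{\mp\sqrt{-z}\,x}, \qquad z \in \mathbb{C}, \tag{9.34}$$
are two linearly independent solutions (for $z \ne 0$) and since $\mathrm{Re}(\sqrt{-z}) > 0$ for $z \in \mathbb{C}\setminus[0,\infty)$, there is precisely one solution (up to a constant multiple) which is square integrable near $\pm\infty$, namely $u_\pm$. In particular, the only choice for $u_a$ is $u_-$ and for $u_b$ is $u_+$, and we get
$$G(z,x,y) = \frac{1}{2\sqrt{-z}}\, e^{-\sqrt{-z}\,|x-y|} \tag{9.35}$$
which we already found in Section 7.4.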
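The free resolvent kernel can be tested against a case with a closed-form answer. The sketch below is an illustration under stated assumptions: it takes $z = -\kappa^2 < 0$ and $g(y) = e^{-|y|}$, for which solving $-f'' + \kappa^2 f = g$ by hand gives $f(x) = \big(e^{-|x|} - e^{-\kappa|x|}/\kappa\big)/(\kappa^2 - 1)$ when $\kappa \ne 1$. The function names, the truncation `cutoff`, and the midpoint rule are choices made here.

```python
import cmath
import math

def green_free(z, x, y):
    """Kernel exp(-sqrt(-z)|x - y|) / (2 sqrt(-z)); cmath.sqrt picks the root with Re >= 0."""
    k = cmath.sqrt(-z)
    return cmath.exp(-k * abs(x - y)) / (2 * k)

def apply_resolvent(z, g, x, cutoff=30.0, steps=60000):
    """Midpoint-rule approximation of the integral of G(z, x, y) g(y) over [-cutoff, cutoff]."""
    h = 2 * cutoff / steps
    total = 0.0
    for j in range(steps):
        y = -cutoff + (j + 0.5) * h
        total += green_free(z, x, y) * g(y) * h
    return total
```

For $\kappa = 2$ the comparison value at $x = 0$ is $1/6$.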
If, as in the previous example, there is only one square integrable solution, there is no choice for $G(z,x,y)$. But since different boundary conditions must give rise to different resolvents, there is no room for boundary conditions in this case. This indicates a connection between our l.c., l.p. distinction and square integrability of solutions.
Theorem 9.8 (Weyl alternative). The operator $\tau$ is l.c. at $a$ if and only if for one $z_0 \in \mathbb{C}$ all solutions of $(\tau - z_0)u = 0$ are square integrable near $a$. This then holds for all $z \in \mathbb{C}$. Similarly for $b$.
Proof. If all solutions are square integrable near $a$, $\tau$ is l.c. at $a$ since the Wronskian of two linearly independent solutions does not vanish.
Conversely, take two functions $v, \tilde v \in D(\tau)$ with $W_a(v, \tilde v) \ne 0$. By considering real and imaginary parts it is no restriction to assume that $v$ and $\tilde v$ are real-valued. Thus they give rise to two different self-adjoint operators $A$ and $\tilde A$ (choose any fixed $w$ for the other endpoint). Let $u_a$ and $\tilde u_a$ be the corresponding solutions from above, then $W(u_a, \tilde u_a) \ne 0$ (since otherwise $A = \tilde A$ by Lemma 9.5) and thus there are two linearly independent solutions which are square integrable near $a$. Since any other solution can be written as a linear combination of those two, every solution is square integrable near $a$.
It remains to show that all solutions of $(\tau - z)u = 0$ for all $z \in \mathbb{C}$ are square integrable near $a$ if $\tau$ is l.c. at $a$. In fact, the above argument ensures this for every $z \in \rho(A) \cap \rho(\tilde A)$, that is, at least for all $z \in \mathbb{C}\setminus\mathbb{R}$.
Suppose $(\tau - z)u = 0$ and choose two linearly independent solutions $u_j$, $j = 1, 2$, of $(\tau - z_0)u = 0$ with $W(u_1, u_2) = 1$. Using $(\tau - z_0)u = (z - z_0)u$ and (9.10) we have ($a < c < x < b$)
$$u(x) = \alpha u_1(x) + \beta u_2(x) + (z - z_0)\int_c^x \big(u_1(x)u_2(y) - u_1(y)u_2(x)\big)\, u(y)\, r(y)\,dy. \tag{9.36}$$
Since $u_j \in L^2((c,b), r\,dx)$ we can find a constant $M \ge 0$ such that
$$\int_c^b |u_{1,2}(y)|^2\, r(y)\,dy \le M. \tag{9.37}$$
Now choose $c$ close to $b$ such that $|z - z_0| M^2 \le 1/4$. Next, estimating the integral using Cauchy-Schwarz gives
$$\begin{aligned}
\left|\int_c^x \big(u_1(x)u_2(y) - u_1(y)u_2(x)\big)\, u(y)\, r(y)\,dy\right|^2
&\le \int_c^x |u_1(x)u_2(y) - u_1(y)u_2(x)|^2\, r(y)\,dy \int_c^x |u(y)|^2\, r(y)\,dy \\
&\le M\big(|u_1(x)|^2 + |u_2(x)|^2\big) \int_c^x |u(y)|^2\, r(y)\,dy
\end{aligned} \tag{9.38}$$
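and hence
$$\begin{aligned}
\int_c^x |u(y)|^2\, r(y)\,dy &\le (|\alpha|^2 + |\beta|^2) M + 2|z - z_0| M^2 \int_c^x |u(y)|^2\, r(y)\,dy \\
&\le (|\alpha|^2 + |\beta|^2) M + \frac{1}{2}\int_c^x |u(y)|^2\, r(y)\,dy.
\end{aligned} \tag{9.39}$$
Thus
$$\int_c^x |u(y)|^2\, r(y)\,dy \le 2(|\alpha|^2 + |\beta|^2) M \tag{9.40}$$
and since $u \in AC_{loc}(I)$ we have $u \in L^2((c,b), r\,dx)$ for every $c \in (a,b)$.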
Note that all eigenvalues are simple. If $\tau$ is l.p. at one endpoint this is clear, since there is at most one solution of $(\tau - \lambda)u = 0$ which is square integrable near this end point. If $\tau$ is l.c. this also follows, since the fact that two solutions of $(\tau - \lambda)u = 0$ satisfy the same boundary condition implies that their Wronskian vanishes.
Finally, let us shed some additional light on the number of possible boundary conditions. Suppose $\tau$ is l.c. at $a$ and let $u_1, u_2$ be two solutions of $\tau u = 0$ with $W(u_1, u_2) = 1$. Abbreviate
$$BC^j_x(f) = W_x(u_j, f), \qquad f \in D(\tau). \tag{9.41}$$
Let $v$ be as in Theorem 9.6, then, using Lemma 9.5 it is not hard to see that
$$W_a(v, f) = 0 \quad\Leftrightarrow\quad \cos(\alpha)\, BC^1_a(f) + \sin(\alpha)\, BC^2_a(f) = 0, \tag{9.42}$$
where $\tan(\alpha) = -\frac{BC^1_a(v)}{BC^2_a(v)}$. Hence all possible boundary conditions can be parametrized by $\alpha \in [0, \pi)$. If $\tau$ is regular at $a$ and if we choose $u_1(a) = p(a)u_2'(a) = 1$ and $p(a)u_1'(a) = u_2(a) = 0$, then
$$BC^1_a(f) = f(a), \qquad BC^2_a(f) = p(a) f'(a) \tag{9.43}$$
and the boundary condition takes the simple form
$$\cos(\alpha) f(a) + \sin(\alpha)\, p(a) f'(a) = 0. \tag{9.44}$$
Finally, note that if $\tau$ is l.c. at both $a$ and $b$, then Theorem 9.6 does not give all possible self-adjoint extensions. For example, one could also choose
$$BC^1_a(f) = e^{i\alpha} BC^1_b(f), \qquad BC^2_a(f) = e^{i\alpha} BC^2_b(f). \tag{9.45}$$
The case $\alpha = 0$ gives rise to periodic boundary conditions in the regular case.
Now we turn to the investigation of the spectrum of $A$. If $\tau$ is l.c. at both endpoints, then the spectrum of $A$ is very simple.
Theorem 9.9. If $\tau$ is l.c. at both end points, then the resolvent is a Hilbert-Schmidt operator, that is,
$$\int_a^b \int_a^b |G(z,x,y)|^2\, r(y)\,dy\; r(x)\,dx < \infty. \tag{9.46}$$
In particular, the spectrum of any self-adjoint extension is purely discrete and the eigenfunctions (which are simple) form an orthonormal basis.
Proof. This follows from the estimate
$$\int_a^b \left(\int_a^x |u_b(x)u_a(y)|^2\, r(y)\,dy + \int_x^b |u_b(y)u_a(x)|^2\, r(y)\,dy\right) r(x)\,dx \le 2\int_a^b |u_a(y)|^2\, r(y)\,dy \int_a^b |u_b(y)|^2\, r(y)\,dy, \tag{9.47}$$
which shows that the resolvent is Hilbert-Schmidt and hence compact.
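A concrete regular example makes the Hilbert-Schmidt property tangible. The sketch below assumes $\tau = -\frac{d^2}{dx^2}$ on $(0,1)$ with Dirichlet conditions at both ends and $z = 0$, so that one may take $u_a(x) = x$, $u_b(x) = 1 - x$ with $W(u_b, u_a) = 1$. The eigenvalues of this operator are the standard values $(n\pi)^2$, which is a fact used here and not derived in the text. The squared Hilbert-Schmidt norm of the resolvent should then match $\sum_n (n\pi)^{-4} = 1/90$. Function names and grid sizes are illustrative.

```python
import math

def green_dirichlet(x, y):
    """G(0, x, y) built from u_a(x) = x, u_b(x) = 1 - x and W(u_b, u_a) = 1."""
    return (1 - x) * y if x >= y else x * (1 - y)

def hilbert_schmidt_norm_sq(n=400):
    """Midpoint-rule approximation of the double integral of |G|^2 for r = 1."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    return sum(green_dirichlet(x, y) ** 2 for x in pts for y in pts) * h * h

def eigenvalue_sum(terms=100000):
    """Sum of the squared resolvent eigenvalues 1 / (k pi)^4."""
    return sum(1.0 / (k * math.pi) ** 4 for k in range(1, terms + 1))
```

Both `hilbert_schmidt_norm_sq()` and `eigenvalue_sum()` approximate the same number, which reflects that the Hilbert-Schmidt norm is the $\ell^2$ norm of the eigenvalues of the (self-adjoint) resolvent.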
If $\tau$ is not l.c. the situation is more complicated and we can only say something about the essential spectrum.
Theorem 9.10. All self-adjoint extensions have the same essential spectrum. Moreover, if $A_{ac}$ and $A_{cb}$ are self-adjoint extensions of $\tau$ restricted to $(a,c)$ and $(c,b)$ (for any $c \in I$), then
$$\sigma_{ess}(A) = \sigma_{ess}(A_{ac}) \cup \sigma_{ess}(A_{cb}). \tag{9.48}$$
Proof. Since $(\tau - i)u = 0$ has two linearly independent solutions, the defect indices are at most two (they are zero if $\tau$ is l.p. at both end points, one if $\tau$ is l.c. at one and l.p. at the other end point, and two if $\tau$ is l.c. at both endpoints). Hence the first claim follows from Theorem 6.19.
For the second claim restrict $\tau$ to the functions with compact support in $(a,c) \cup (c,b)$. Then, this operator is the orthogonal sum of the operators $A_{0,ac}$ and $A_{0,cb}$. Hence the same is true for the adjoints, and hence the defect indices of $A_{0,ac} \oplus A_{0,cb}$ are at most four. Now note that $A$ and $A_{ac} \oplus A_{cb}$ are both self-adjoint extensions of this operator. Thus the second claim also follows from Theorem 6.19.
Problem 9.3. Compute the spectrum and the resolvent of $\tau = -\frac{d^2}{dx^2}$, $I = (0, \infty)$, defined on $D(A) = \{f \in D(\tau) \mid f(0) = 0\}$.
9.3. Spectral transformations
In this section we want to provide some fundamental tools for investigating the spectra of Sturm-Liouville operators and, at the same time, give some nice illustrations of the spectral theorem.
Example. Consider again $\tau = -\frac{d^2}{dx^2}$ on $I = (-\infty, \infty)$. From Section 7.2 we know that the Fourier transform maps the associated operator $H_0$ to the multiplication operator with $p^2$ in $L^2(\mathbb{R})$. To get multiplication by $\lambda$, as in the spectral theorem, we set $p = \sqrt{\lambda}$ and split the Fourier integral into a positive and negative part
$$(Uf)(\lambda) = \begin{pmatrix} \int_{\mathbb{R}} e^{i\sqrt{\lambda}\,x} f(x)\,dx \\ \int_{\mathbb{R}} e^{-i\sqrt{\lambda}\,x} f(x)\,dx \end{pmatrix}, \qquad \lambda \in \sigma(H_0) = [0, \infty). \tag{9.49}$$
U : L
2
(R)
2
j=1
L
2
(R,
[0,)
()
2
d) (9.50)
is the spectral transformation whose existence is guaranteed by the spectral
theorem (Lemma 3.3).
Note that in the previous example the kernel e
i
x
of the integral trans
form U is just a pair of linearly independent solutions of the underlying
dierential equation (though no eigenfunctions, since they are not square
integrable).
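More generally, if
$$U: L^2(I, r\,dx) \to L^2(\mathbb{R}, d\mu), \qquad f(x) \mapsto \int_{\mathbb{R}} u(\lambda, x) f(x)\, r(x)\,dx \tag{9.51}$$
is an integral transformation which maps a self-adjoint Sturm-Liouville operator $A$ to multiplication by $\lambda$, then its kernel $u(\lambda, x)$ is a solution of the underlying differential equation. This formally follows from $UAf = \lambda Uf$, which implies
$$\int_{\mathbb{R}} u(\lambda, x)(\tau - \lambda) f(x)\, r(x)\,dx = \int_{\mathbb{R}} (\tau - \lambda)u(\lambda, x)\, f(x)\, r(x)\,dx \tag{9.52}$$
and hence $(\tau - \lambda)u(\lambda, .) = 0$.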
Lemma 9.11. Suppose
$$U: L^2(I, r\,dx) \to \bigoplus_{j=1}^k L^2(\mathbb{R}, d\mu_j) \tag{9.53}$$
is a spectral mapping as in Lemma 3.3. Then $U$ is of the form
$$(Uf)(\lambda) = \int_a^b u(\lambda, x) f(x)\, r(x)\,dx, \tag{9.54}$$
where $u(\lambda, x) = (u_1(\lambda, x), \dots, u_k(\lambda, x))$ and each $u_j(\lambda, .)$ is a solution of $\tau u_j = \lambda u_j$ for a.e. $\lambda$ (with respect to $\mu_j$). The inverse is given by
$$(U^{-1}F)(x) = \sum_{j=1}^k \int_{\mathbb{R}} u_j(\lambda, x)^* F_j(\lambda)\, d\mu_j(\lambda). \tag{9.55}$$
Moreover, the solutions $u_j(\lambda)$ are linearly independent if the spectral measures are ordered and, if $\tau$ is l.c. at some endpoint, they satisfy the boundary condition. In particular, for ordered spectral measures we have always $k \le 2$ and even $k = 1$ if $\tau$ is l.c. at one endpoint.
Proof. Using $U_j R_A(z) = \frac{1}{\lambda - z} U_j$ we have
$$U_j f(\lambda) = (\lambda - z)\, U_j \int_a^b G(z,x,y) f(y)\, r(y)\,dy. \tag{9.56}$$
If we restrict $R_A(z)$ to a compact interval $[c,d] \subset (a,b)$, then $R_A(z)\chi_{[c,d]}$ is Hilbert-Schmidt since $G(z,x,y)\chi_{[c,d]}(y)$ is square integrable over $(a,b) \times (a,b)$. Hence $U_j \chi_{[c,d]} = (\lambda - z) U_j R_A(z)\chi_{[c,d]}$ is Hilbert-Schmidt as well, and by Lemma 6.9 there is a corresponding kernel $u_j^{[c,d]}(\lambda, y)$ such that
$$(U_j \chi_{[c,d]} f)(\lambda) = \int_a^b u_j^{[c,d]}(\lambda, x) f(x)\, r(x)\,dx. \tag{9.57}$$
Now take a larger compact interval $[\hat c, \hat d] \supseteq [c,d]$, then the kernels coincide on $[c,d]$, $u_j^{[c,d]}(\lambda, .) = u_j^{[\hat c, \hat d]}(\lambda, .)\chi_{[c,d]}$, since we have $U_j\chi_{[c,d]} = U_j\chi_{[\hat c,\hat d]}\chi_{[c,d]}$. In particular, there is a kernel $u_j(\lambda, x)$ such that
$$U_j f(\lambda) = \int_a^b u_j(\lambda, x) f(x)\, r(x)\,dx \tag{9.58}$$
for every $f$ with compact support in $(a,b)$. Since functions with compact support are dense and $U_j$ is continuous, this formula holds for any $f$ (provided the integral is understood as the corresponding limit).
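Using the fact that $U$ is unitary, $\langle F, Ug\rangle = \langle U^{-1}F, g\rangle$, we see
$$\sum_j \int_{\mathbb{R}} F_j(\lambda)^* \int_a^b u_j(\lambda, x) g(x)\, r(x)\,dx\, d\mu_j(\lambda) = \int_a^b (U^{-1}F)(x)^* g(x)\, r(x)\,dx. \tag{9.59}$$
Interchanging integrals on the left-hand side (which is permitted at least for $g$, $F$ with compact support), the formula for the inverse follows.
Next, from $U_j A f = \lambda U_j f$ we have
$$\int_a^b u_j(\lambda, x)(\tau f)(x)\, r(x)\,dx = \lambda \int_a^b u_j(\lambda, x) f(x)\, r(x)\,dx \tag{9.60}$$
for a.e. $\lambda$ and every $f \in D(A_0)$. Restricting everything to $[c,d] \subset (a,b)$, the above equation implies $u_j(\lambda, .)|_{[c,d]} \in D(A_{cd,0}^*)$ and $A_{cd,0}^*\, u_j(\lambda, .)|_{[c,d]} = \lambda\, u_j(\lambda, .)|_{[c,d]}$. In particular, $u_j(\lambda, .)$ is a solution of $\tau u_j = \lambda u_j$. Moreover, if $\tau$ is l.c. near $a$, we can choose $c = a$ and $f$ to satisfy the boundary condition.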
Finally, fix $l \le k$. If we assume the $\mu_j$ are ordered, there is a set $\Omega_l$ such that $\mu_j(\Omega_l) \ne 0$ for $1 \le j \le l$. Suppose
$$\sum_{j=1}^l c_j(\lambda)\, u_j(\lambda, x) = 0 \tag{9.61}$$
then we have
$$\sum_{j=1}^l c_j(\lambda)\, F_j(\lambda) = 0, \qquad F_j = U_j f, \tag{9.62}$$
for every $f$. Since $U$ is surjective, we can prescribe $F_j$ arbitrarily, e.g., $F_j(\lambda) = 1$ for $j = j_0$ and $F_j(\lambda) = 0$ else, which shows $c_{j_0}(\lambda) = 0$. Hence $u_j(\lambda, x)$, $1 \le j \le l$, are linearly independent for $\lambda \in \Omega_l$, which shows $k \le 2$ since there are at most two linearly independent solutions. If $\tau$ is l.c. and $u_j(\lambda, x)$ must satisfy the boundary condition, there is only one linearly independent solution and thus $k = 1$.
Please note that the integral in (9.54) has to be understood as
$$U_j f(\lambda) = \lim_{\alpha \downarrow a,\ \beta \uparrow b} \int_\alpha^\beta u_j(\lambda, x) f(x)\, r(x)\,dx, \tag{9.63}$$
where the limit is taken in $L^2(\mathbb{R}, d\mu_j)$. Similarly for (9.55).
For simplicity we will only pursue the case where one endpoint, say $a$, is regular. The general case can usually be reduced to this case by choosing $c \in (a,b)$ and splitting $A$ as in Theorem 9.10.
We choose a boundary condition
$$\cos(\alpha) f(a) + \sin(\alpha)\, p(a) f'(a) = 0 \tag{9.64}$$
and introduce two solutions $s(z,x)$ and $c(z,x)$ of $\tau u = zu$ satisfying the initial conditions
$$s(z,a) = -\sin(\alpha), \quad p(a)s'(z,a) = \cos(\alpha), \qquad c(z,a) = \cos(\alpha), \quad p(a)c'(z,a) = \sin(\alpha). \tag{9.65}$$
If $A$ has pure point spectrum, the spectral measure is given by
$$d\mu(\lambda) = \sum_{j=1}^\infty \frac{1}{\|s(E_j)\|^2}\, d\Theta(\lambda - E_j), \qquad \sigma_p(A) = \{E_j\}_{j=1}^\infty, \tag{9.70}$$
where $d\Theta$ is the Dirac measure centered at 0. For arbitrary $A$, the above formula holds at least for the pure point part $\mu_{pp}$.
In the general case we have to work a bit harder. Since $c(z,x)$ and $s(z,x)$ are linearly independent solutions,
$$W(c(z), s(z)) = 1, \tag{9.71}$$
we can write $u_b(z,x) = \gamma_b(z)\big(c(z,x) + m_b(z)s(z,x)\big)$, where
$$m_b(z) = \frac{\cos(\alpha)\, p(a)u_b'(z,a) - \sin(\alpha)\, u_b(z,a)}{\cos(\alpha)\, u_b(z,a) + \sin(\alpha)\, p(a)u_b'(z,a)}, \qquad z \in \rho(A), \tag{9.72}$$
is known as the Weyl-Titchmarsh $m$-function. Note that $m_b(z)$ is holomorphic in $\rho(A)$ and that
$$m_b(z)^* = m_b(z^*) \tag{9.73}$$
since the same is true for $u_b(z,x)$ (the denominator in (9.72) only vanishes if $u_b(z,x)$ satisfies the boundary condition at $a$, that is, if $z$ is an eigenvalue). Moreover, the constant $\gamma_b(z)$ is of no importance and can be chosen equal to one,
$$u_b(z,x) = c(z,x) + m_b(z)s(z,x). \tag{9.74}$$
Lemma 9.12. The Weyl $m$-function is a Herglotz function and satisfies
$$\mathrm{Im}(m_b(z)) = \mathrm{Im}(z)\int_a^b |u_b(z,x)|^2\, r(x)\,dx, \tag{9.75}$$
where $u_b(z,x)$ is normalized as in (9.74).
Proof. Given two solutions $u(x)$, $v(x)$ of $\tau u = zu$, $\tau v = \hat z v$ it is straightforward to check
$$(z - \hat z)\int_a^x u(y)v(y)\, r(y)\,dy = W_x(u,v) - W_a(u,v) \tag{9.76}$$
(clearly it is true for $x = a$, now differentiate with respect to $x$). Now choose $u(x) = u_b(z,x)$ and $v(x) = u_b(z,x)^* = u_b(z^*,x)$. Since $W_a(u_b(z), u_b(z)^*) = m_b(z)^* - m_b(z)$ by (9.71) and (9.74), we obtain
$$2i\,\mathrm{Im}(z)\int_a^x |u_b(z,y)|^2\, r(y)\,dy = W_x(u_b(z), u_b(z)^*) + 2i\,\mathrm{Im}(m_b(z)), \tag{9.77}$$
and observe that $W_x(u_b, u_b^*)$ vanishes as $x \uparrow b$, since both $u_b$ and $u_b^*$ are in $D(\tau)$ near $b$.
Lemma 9.13. We have
$$(U u_b(z))(\lambda) = \frac{1}{\lambda - z}, \tag{9.78}$$
where $u_b(z,x)$ is normalized as in (9.74).
Proof. First of all note that from $R_A(z) f = U^{-1}\frac{1}{\lambda - z} U f$ we have
$$\int_a^b G(z,x,y) f(y)\, r(y)\,dy = \int_{\mathbb{R}} \frac{s(\lambda, x) F(\lambda)}{\lambda - z}\, d\mu(\lambda), \tag{9.79}$$
where $F = Uf$. Here equality is to be understood in $L^2$, that is, for a.e. $x$. However, the right-hand side is continuous with respect to $x$ and so is the left-hand side, at least if $F$ has compact support. Hence in this case the formula holds for all $x$ and we can choose $x = a$ to obtain
$$\sin(\alpha)\int_a^b u_b(z,y) f(y)\, r(y)\,dy = \sin(\alpha)\int_{\mathbb{R}} \frac{F(\lambda)}{\lambda - z}\, d\mu(\lambda), \tag{9.80}$$
for all $f$, where $F$ has compact support. Since these functions are dense, the claim follows if we can cancel $\sin(\alpha)$, that is, $\alpha \ne 0$. To see the case $\alpha = 0$, first differentiate (9.79) with respect to $x$ before setting $x = a$.
Now combining the last two lemmas we infer from unitarity of $U$ that
$$\mathrm{Im}(m_b(z)) = \mathrm{Im}(z)\int_a^b |u_b(z,x)|^2\, r(x)\,dx = \mathrm{Im}(z)\int_{\mathbb{R}} \frac{1}{|\lambda - z|^2}\, d\mu(\lambda) \tag{9.81}$$
and since a holomorphic function is determined up to a real constant by its imaginary part we obtain
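Theorem 9.14. The Weyl $m$-function is given by
$$m_b(z) = a + \int_{\mathbb{R}} \left(\frac{1}{\lambda - z} - \frac{\lambda}{1 + \lambda^2}\right) d\mu(\lambda), \qquad a \in \mathbb{R}, \tag{9.82}$$
and
$$a = \mathrm{Re}(m_b(i)), \qquad \int_{\mathbb{R}} \frac{1}{1 + \lambda^2}\, d\mu(\lambda) = \mathrm{Im}(m_b(i)) < \infty. \tag{9.83}$$
Moreover, $\mu$ is given by the Stieltjes inversion formula
$$\mu(\lambda) = \lim_{\delta \downarrow 0}\, \lim_{\varepsilon \downarrow 0}\, \frac{1}{\pi}\int_\delta^{\lambda + \delta} \mathrm{Im}(m_b(\lambda + i\varepsilon))\, d\lambda, \tag{9.84}$$
where
$$\mathrm{Im}(m_b(\lambda + i\varepsilon)) = \varepsilon \int_a^b |u_b(\lambda + i\varepsilon, x)|^2\, r(x)\,dx. \tag{9.85}$$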
Proof. Choosing $z = i$ in (9.81) shows (9.83), and hence the right-hand side of (9.82) is a well-defined holomorphic function in $\mathbb{C}\setminus\mathbb{R}$. By $\mathrm{Im}\big(\frac{1}{\lambda - z} - \frac{\lambda}{1 + \lambda^2}\big) = \frac{\mathrm{Im}(z)}{|\lambda - z|^2}$ its imaginary part coincides with that of $m_b(z)$ and hence equality follows. The Stieltjes inversion formula follows as in the case where the measure is bounded.
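Example. Consider $\tau = -\frac{d^2}{dx^2}$ on $I = (0, \infty)$. Then
$$c(\lambda, x) = \cos(\alpha)\cos(\sqrt{\lambda}\,x) + \sin(\alpha)\frac{\sin(\sqrt{\lambda}\,x)}{\sqrt{\lambda}} \tag{9.86}$$
and
$$s(\lambda, x) = -\sin(\alpha)\cos(\sqrt{\lambda}\,x) + \cos(\alpha)\frac{\sin(\sqrt{\lambda}\,x)}{\sqrt{\lambda}}. \tag{9.87}$$
Moreover,
$$u_b(z,x) = u_b(z,0)\, e^{-\sqrt{-z}\,x} \tag{9.88}$$
and thus
$$m_b(z) = \frac{-\sqrt{-z}\cos(\alpha) - \sin(\alpha)}{\cos(\alpha) - \sqrt{-z}\sin(\alpha)} \tag{9.89}$$
respectively
$$d\mu(\lambda) = \frac{\sqrt{\lambda}}{\pi\big(\cos(\alpha)^2 + \lambda\sin(\alpha)^2\big)}\, d\lambda. \tag{9.90}$$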
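The passage from the $m$-function to the spectral density via the Stieltjes inversion formula can be checked numerically for this half-line example. The sketch below is an illustration under stated assumptions: it computes $m_b$ from the normalization $u_b = c + m_b s$ with $c(z,0) = \cos\alpha$, $c'(z,0) = \sin\alpha$, $s(z,0) = -\sin\alpha$, $s'(z,0) = \cos\alpha$ (so that $W(c,s) = 1$ and $s$ satisfies the boundary condition), and compares $\frac{1}{\pi}\mathrm{Im}\, m_b(\lambda + i\varepsilon)$ for small $\varepsilon$ with the density $\sqrt{\lambda}/(\pi(\cos^2\alpha + \lambda\sin^2\alpha))$ on $\lambda > 0$. The function names are hypothetical.

```python
import cmath
import math

def weyl_m(z, alpha):
    """m_b(z) for -d^2/dx^2 on (0, infinity), from u_b(z, x) = exp(-sqrt(-z) x)."""
    k = cmath.sqrt(-z)              # root with Re(k) >= 0, so u_b decays at infinity
    ub, dub = 1.0, -k               # u_b(z, 0) and u_b'(z, 0)
    # Solve u_b = gamma (c + m s) at x = 0 for m using the initial values of c and s.
    num = math.cos(alpha) * dub - math.sin(alpha) * ub
    den = math.cos(alpha) * ub + math.sin(alpha) * dub
    return num / den

def spectral_density(lam, alpha):
    """Density of the spectral measure for lam > 0."""
    s = math.sqrt(lam)
    return s / (math.pi * (math.cos(alpha) ** 2 + lam * math.sin(alpha) ** 2))
```

For $\alpha = 0$ (Dirichlet) this reduces to $m_b(z) = -\sqrt{-z}$ and density $\sqrt{\lambda}/\pi$. The positivity of $\mathrm{Im}\, m_b$ in the upper half plane is the Herglotz property.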
$(\mathbb{R}^n)$ and for any $a > 0$ there is a $b > 0$ such that
$$\|\psi\|_\infty \le a\|H_0\psi\| + b\|\psi\|. \tag{10.2}$$
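Proof. The important observation is that $(p^2 + \gamma^2)^{-1} \in L^2(\mathbb{R}^n)$ if $n \le 3$. Hence, since $(p^2 + \gamma^2)\hat\psi \in L^2(\mathbb{R}^n)$, the Cauchy-Schwarz inequality
$$\|\hat\psi\|_1 = \|(p^2 + \gamma^2)^{-1}(p^2 + \gamma^2)\hat\psi(p)\|_1 \le \|(p^2 + \gamma^2)^{-1}\|\; \|(p^2 + \gamma^2)\hat\psi(p)\| \tag{10.3}$$
shows $\hat\psi \in L^1(\mathbb{R}^n)$. But now everything follows from the Riemann-Lebesgue lemma:
$$\begin{aligned}
\|\psi\|_\infty &\le (2\pi)^{-n/2}\,\|(p^2 + \gamma^2)^{-1}\|\,\big(\|p^2\hat\psi(p)\| + \gamma^2\|\hat\psi(p)\|\big) \\
&= \left(\frac{\gamma}{2\pi}\right)^{n/2} \|(p^2 + 1)^{-1}\|\,\big(\gamma^{-2}\|H_0\psi\| + \|\psi\|\big),
\end{aligned} \tag{10.4}$$
which finishes the proof.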
Now we come to our first result.
Theorem 10.2. Let $V$ be real-valued and $V \in L^\infty_\infty(\mathbb{R}^n)$ if $n > 3$ and $V \in L^\infty_\infty(\mathbb{R}^n) + L^2(\mathbb{R}^n)$ if $n \le 3$. Then $V$ is relatively compact with respect to $H_0$. In particular,
$$H = H_0 + V, \qquad D(H) = H^2(\mathbb{R}^n), \tag{10.5}$$
is self-adjoint, bounded from below and
$$\sigma_{ess}(H) = [0, \infty). \tag{10.6}$$
Moreover, $C_0^\infty(\mathbb{R}^n)$ is a core for $H$.
Proof. Our previous lemma shows $D(H_0) \subseteq D(V)$ and the rest follows from Lemma 7.10 using $f(p) = (p^2 - z)^{-1}$ and $g(x) = V(x)$. Note that $f \in L^\infty_\infty(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$ for $n \le 3$.
Observe that since $C_c^\infty(\mathbb{R}^n) \subseteq D(H_0)$, we must have $V \in L^2_{loc}(\mathbb{R}^n)$ if $D(V) \supseteq D(H_0)$.
10.2. The hydrogen atom
We begin with the simple model of a single electron in $\mathbb{R}^3$ moving in the external potential $V$ generated by a nucleus (which is assumed to be fixed at the origin). If one takes only the electrostatic force into account, then $V$ is given by the Coulomb potential and the corresponding Hamiltonian is given by
$$H^{(1)} = -\Delta - \frac{\gamma}{|x|}, \qquad D(H^{(1)}) = H^2(\mathbb{R}^3). \tag{10.7}$$
If the potential is attracting, that is, if $\gamma > 0$, then it describes the hydrogen atom and is probably the most famous model in quantum mechanics.
As domain we have chosen $D(H^{(1)}) = D(H_0) \cap D(\frac{1}{|x|}) = D(H_0)$ and by Theorem 10.2 we conclude that $H^{(1)}$ is self-adjoint. Moreover, Theorem 10.2 also tells us
$$\sigma_{ess}(H^{(1)}) = [0, \infty) \tag{10.8}$$
and that $H^{(1)}$ is bounded from below
$$E_0 = \inf \sigma(H^{(1)}) > -\infty. \tag{10.9}$$
If $\gamma \le 0$ we have $H^{(1)} \ge 0$ and hence $E_0 = 0$, but if $\gamma > 0$, we might have $E_0 < 0$ and there might be some discrete eigenvalues below the essential spectrum.
In order to say more about the eigenvalues of $H^{(1)}$ we will use the fact that both $H_0$ and $V^{(1)} = -\gamma/|x|$ have a simple behavior with respect to scaling. Consider the dilation group
$$U(s)\psi(x) = e^{-ns/2}\,\psi(e^{-s}x), \qquad s \in \mathbb{R}, \tag{10.10}$$
which is a strongly continuous one-parameter unitary group. The generator can be easily computed
$$D\psi(x) = \frac{1}{2}(xp + px)\psi(x) = \Big(xp - \frac{in}{2}\Big)\psi(x), \qquad \psi \in \mathcal{S}(\mathbb{R}^n). \tag{10.11}$$
Now let us investigate the action of $U(s)$ on $H^{(1)}$
$$H^{(1)}(s) = U(-s)H^{(1)}U(s) = e^{-2s}H_0 + e^{-s}V^{(1)}, \qquad D(H^{(1)}(s)) = D(H^{(1)}). \tag{10.12}$$
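Now suppose $H\psi = \lambda\psi$, then
$$\langle\psi, [U(s), H]\psi\rangle = \langle U(-s)\psi, H\psi\rangle - \langle H\psi, U(s)\psi\rangle = 0 \tag{10.13}$$
and hence
$$0 = \lim_{s \to 0}\frac{1}{s}\langle\psi, [U(s), H]\psi\rangle = \lim_{s \to 0}\Big\langle U(-s)\psi, \frac{H - H(s)}{s}\psi\Big\rangle = \langle\psi, (2H_0 + V^{(1)})\psi\rangle. \tag{10.14}$$
Thus we have proven the virial theorem.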
Theorem 10.3. Suppose $H = H_0 + V$ with $U(-s)VU(s) = e^{-s}V$. Then any normalized eigenfunction $\psi$ corresponding to an eigenvalue $\lambda$ satisfies
$$\lambda = -\langle\psi, H_0\psi\rangle = \frac{1}{2}\langle\psi, V\psi\rangle. \tag{10.15}$$
In particular, all eigenvalues must be negative.
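The virial relation can be tested on the Coulomb ground state. The sketch below relies on a fact that is not derived at this point of the text and is used here as an assumption: for $H = -\Delta - \gamma/|x|$ in three dimensions, $\psi(x) = e^{-\gamma|x|/2}$ is an eigenfunction with eigenvalue $-\gamma^2/4$. Radial midpoint quadrature (the angular factor $4\pi$ cancels in the quotients) then gives $\langle H_0\rangle$ and $\langle V\rangle$. The function name, cutoff and step count are illustrative choices.

```python
import math

def radial_expectations(gamma, rmax=60.0, steps=200000):
    """Return (<H_0>, <V>) for psi(r) = exp(-gamma r / 2) with V = -gamma / r."""
    h = rmax / steps
    norm = kin = pot = 0.0
    for j in range(steps):
        r = (j + 0.5) * h
        psi = math.exp(-gamma * r / 2)
        dpsi = -gamma / 2 * psi
        norm += psi * psi * r * r * h
        kin += dpsi * dpsi * r * r * h                # <psi, H_0 psi> = ||grad psi||^2
        pot += -gamma / r * psi * psi * r * r * h     # <psi, V psi>
    return kin / norm, pot / norm
```

For $\gamma = 1$ one expects $\langle H_0\rangle = 1/4$ and $\langle V\rangle = -1/2$, so that $-\langle H_0\rangle = \frac{1}{2}\langle V\rangle = -1/4$ agrees with the eigenvalue.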
This result even has some further consequences for the point spectrum of $H^{(1)}$.
Corollary 10.4. Suppose $\gamma > 0$. Then
$$\sigma_p(H^{(1)}) = \sigma_d(H^{(1)}) = \{E_j\}_{j \in \mathbb{N}_0}, \qquad E_0 < E_j < E_{j+1} < 0, \tag{10.16}$$
with $\lim_{j \to \infty} E_j = 0$.
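Proof. Choose $\psi \in C_c^\infty(\mathbb{R}^3\setminus\{0\})$ and set $\psi(s) = U(s)\psi$. Then
$$\langle\psi(s), H^{(1)}\psi(s)\rangle = e^{-2s}\langle\psi, H_0\psi\rangle + e^{-s}\langle\psi, V^{(1)}\psi\rangle \tag{10.17}$$
which is negative for $s$ large. Now choose a sequence $s_n \to \infty$ such that we have $\mathrm{supp}(\psi(s_n)) \cap \mathrm{supp}(\psi(s_m)) = \emptyset$ for $n \ne m$. Then Theorem 4.11 (i) shows that $\mathrm{rank}(P_{H^{(1)}}((-\infty, 0))) = \infty$. Since each eigenvalue $E_j$ has finite multiplicity (it lies in the discrete spectrum) there must be an infinite number of eigenvalues which accumulate at 0.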
If $\gamma \le 0$ we have $\sigma_d(H^{(1)}) = \emptyset$ since $H^{(1)} \ge 0$ in this case.
Hence we have gotten a quite complete picture of the spectrum of $H^{(1)}$. Next, we could try to compute the eigenvalues of $H^{(1)}$ (in the case $\gamma > 0$) by solving the corresponding eigenvalue equation, which is given by the partial differential equation
$$-\Delta\psi(x) - \frac{\gamma}{|x|}\psi(x) = \lambda\psi(x). \tag{10.18}$$
For a general potential this is hopeless, but in our case we can use the rotational symmetry of our operator to reduce our partial differential equation to ordinary ones.
First of all, it suggests itself to switch to spherical coordinates $(x_1, x_2, x_3) \mapsto (r, \theta, \varphi)$
$$x_1 = r\sin(\theta)\cos(\varphi), \qquad x_2 = r\sin(\theta)\sin(\varphi), \qquad x_3 = r\cos(\theta), \tag{10.19}$$
which correspond to a unitary transform
$$L^2(\mathbb{R}^3) \to L^2((0,\infty), r^2 dr) \otimes L^2((0,\pi), \sin(\theta)d\theta) \otimes L^2((0,2\pi), d\varphi). \tag{10.20}$$
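In these new coordinates $(r, \theta, \varphi)$ our operator reads
$$H^{(1)} = -\frac{1}{r^2}\frac{\partial}{\partial r} r^2 \frac{\partial}{\partial r} + \frac{1}{r^2}L^2 + V(r), \qquad V(r) = -\frac{\gamma}{r}, \tag{10.21}$$
where
$$L^2 = L_1^2 + L_2^2 + L_3^2 = -\frac{1}{\sin(\theta)}\frac{\partial}{\partial\theta}\sin(\theta)\frac{\partial}{\partial\theta} - \frac{1}{\sin(\theta)^2}\frac{\partial^2}{\partial\varphi^2}. \tag{10.22}$$
(Recall the angular momentum operators $L_j$ from Section 8.2.)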
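Making the product ansatz (separation of variables)
$$\psi(r, \theta, \varphi) = R(r)\Theta(\theta)\Phi(\varphi) \tag{10.23}$$
we obtain the following three Sturm-Liouville equations
$$\begin{aligned}
\left(-\frac{1}{r^2}\frac{d}{dr} r^2 \frac{d}{dr} + \frac{l(l+1)}{r^2} + V(r)\right) R(r) &= \lambda R(r), \\
\frac{1}{\sin(\theta)}\left(-\frac{d}{d\theta}\sin(\theta)\frac{d}{d\theta} + \frac{m^2}{\sin(\theta)}\right)\Theta(\theta) &= l(l+1)\Theta(\theta), \\
-\frac{d^2}{d\varphi^2}\Phi(\varphi) &= m^2\Phi(\varphi).
\end{aligned} \tag{10.24}$$
The form chosen for the constants $l(l+1)$ and $m^2$ is for convenience later on. These equations will be investigated in the following sections.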
10.3. Angular momentum
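We start by investigating the equation for $\Phi(\varphi)$, which is associated with the Sturm-Liouville equation
$$\tau\Phi = -\Phi'', \qquad I = (0, 2\pi). \tag{10.25}$$
Since we want $\psi$ defined via (10.23) to be continuous, we choose periodic boundary conditions
$$A\Phi = \tau\Phi, \qquad D(A) = \{\Phi \in L^2(0, 2\pi) \mid \Phi \in AC^1[0, 2\pi],\ \Phi(0) = \Phi(2\pi),\ \Phi'(0) = \Phi'(2\pi)\}. \tag{10.26}$$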
From our analysis in Section 9.1 we immediately obtain
Theorem 10.5. The operator $A$ defined via (10.25) is self-adjoint. Its spectrum is purely discrete
$$\sigma(A) = \sigma_d(A) = \{m^2 \mid m \in \mathbb{Z}\} \tag{10.27}$$
and the corresponding eigenfunctions
$$\Phi_m(\varphi) = \frac{1}{\sqrt{2\pi}}\, e^{im\varphi}, \qquad m \in \mathbb{Z}, \tag{10.28}$$
form an orthonormal basis for $L^2(0, 2\pi)$.
Note that except for the lowest eigenvalue, all eigenvalues are twice degenerate.
We note that this operator is essentially the square of the angular momentum in the third coordinate direction, since in polar coordinates
$$L_3 = \frac{1}{i}\frac{\partial}{\partial\varphi}. \tag{10.29}$$
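Now we turn to the equation for $\Theta(\theta)$,
\[ \tau_m\Theta(\theta) = \frac{1}{\sin(\theta)}\left(-\frac{d}{d\theta}\sin(\theta)\frac{d}{d\theta} + \frac{m^2}{\sin(\theta)}\right)\Theta(\theta), \qquad I = (0, \pi),\ m \in \mathbb{N}_0. \tag{10.30} \]
For the investigation of the corresponding operator we use the unitary transform
\[ L^2((0,\pi), \sin(\theta)d\theta) \to L^2((-1,1), dx), \qquad \Theta(\theta) \mapsto f(x) = \Theta(\arccos(x)). \tag{10.31} \]
The operator $\tau_m$ transforms to the somewhat simpler form
\[ \tau_m = -\frac{d}{dx}(1-x^2)\frac{d}{dx} + \frac{m^2}{1-x^2}. \tag{10.32} \]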
The corresponding eigenvalue equation
\[ \tau_m u = l(l+1)u \tag{10.33} \]
is the associated Legendre equation. For $l \in \mathbb{N}_0$ it is solved by the associated Legendre functions
\[ P_{lm}(x) = (1-x^2)^{m/2}\frac{d^m}{dx^m}P_l(x), \tag{10.34} \]
where
\[ P_l(x) = \frac{1}{2^l l!}\frac{d^l}{dx^l}(x^2-1)^l \tag{10.35} \]
are the Legendre polynomials. This is straightforward to check. Moreover, note that $P_l(x)$ are (nonzero) polynomials of degree $l$. A second, linearly independent solution is given by
\[ Q_{lm}(x) = P_{lm}(x)\int_0^x \frac{dt}{(1-t^2)P_{lm}(t)^2}. \tag{10.36} \]
In fact, for every Sturm-Liouville equation, $v(x) = u(x)\int^x \frac{dt}{p(t)u(t)^2}$ satisfies $\tau v = 0$ whenever $\tau u = 0$. Now fix $l = 0$ and note $P_0(x) = 1$. For $m = 0$ we have $Q_{00} = \operatorname{arctanh}(x) \in L^2$ and so $\tau_0$ is l.c. at both endpoints. For $m > 0$ we have $Q_{0m} = (x \pm 1)^{-m/2}(C + O(x \pm 1))$, which shows that it is not square integrable. Thus $\tau_m$ is l.c. for $m = 0$ and l.p. for $m > 0$ at both endpoints. In order to make sure that the eigenfunctions for $m = 0$ are continuous (such that $\psi$ defined via (10.23) is continuous) we choose the boundary condition generated by $P_0(x) = 1$ in this case:
\[ A_m f = \tau_m f, \qquad D(A_m) = \{ f \in L^2(-1,1) \mid f \in AC^1(-1,1),\ \tau_m f \in L^2(-1,1),\ \lim_{x\to\pm1}(1-x^2)f'(x) = 0 \}. \tag{10.37} \]
Theorem 10.6. The operator $A_m$, $m \in \mathbb{N}_0$, defined via (10.37) is self-adjoint. Its spectrum is purely discrete,
\[ \sigma(A_m) = \sigma_d(A_m) = \{ l(l+1) \mid l \in \mathbb{N}_0,\ l \ge m \}, \tag{10.38} \]
and the corresponding eigenfunctions
\[ u_{lm}(x) = \sqrt{\frac{2l+1}{2}\,\frac{(l-m)!}{(l+m)!}}\, P_{lm}(x), \qquad l \in \mathbb{N}_0,\ l \ge m, \tag{10.39} \]
form an orthonormal basis for $L^2(-1,1)$.
Proof. By Theorem 9.6, $A_m$ is self-adjoint. Moreover, $P_{lm}$ is an eigenfunction corresponding to the eigenvalue $l(l+1)$ and it suffices to show that the $P_{lm}$ form a basis. To prove this, it suffices to show that the span of the functions $P_{lm}(x)$ is dense. Since $(1-x^2) > 0$ for $x \in (-1,1)$, it suffices to show that the span of the functions $(1-x^2)^{-m/2}P_{lm}(x)$ is dense. But the span of these functions contains every polynomial. Every continuous function can be approximated by polynomials (in the sup norm and hence in the $L^2$ norm) and since the continuous functions are dense, so are the polynomials.
The only thing remaining is the normalization of the eigenfunctions, which can be found in any book on special functions.
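The normalization constants can also be checked by exact polynomial arithmetic. The following Python sketch is only an illustration: it assumes the Rodrigues formula $P_l(x) = \frac{1}{2^l l!}\frac{d^l}{dx^l}(x^2-1)^l$ and the definition (10.34), and all helper names (`poly_mul`, `poly_diff`, `legendre`, `norm_squared`, `expected`) are hypothetical. It computes $\int_{-1}^{1} P_{lm}(x)^2 dx$ exactly and compares it with the closed form $\frac{2}{2l+1}\frac{(l+m)!}{(l-m)!}$, whose reciprocal is the square of the norming constant.

```python
from fractions import Fraction
from math import factorial


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_diff(p):
    """Differentiate a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def legendre(l):
    """Legendre polynomial P_l via the Rodrigues formula (assumed convention)."""
    p = [Fraction(1)]
    for _ in range(l):
        p = poly_mul(p, [Fraction(-1), Fraction(0), Fraction(1)])  # times (x^2 - 1)
    for _ in range(l):
        p = poly_diff(p)
    return [c / (2 ** l * factorial(l)) for c in p]


def norm_squared(l, m):
    """Exact integral over (-1, 1) of P_lm^2 = (1 - x^2)^m (d^m P_l / dx^m)^2."""
    d = legendre(l)
    for _ in range(m):
        d = poly_diff(d)
    integrand = poly_mul(d, d)
    for _ in range(m):
        integrand = poly_mul(integrand, [Fraction(1), Fraction(0), Fraction(-1)])
    # Odd powers integrate to zero; x^k integrates to 2 / (k + 1) for even k.
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(integrand) if k % 2 == 0)


def expected(l, m):
    """Closed form 2/(2l+1) * (l+m)!/(l-m)! for the squared norm."""
    return Fraction(2, 2 * l + 1) * Fraction(factorial(l + m), factorial(l - m))


if __name__ == "__main__":
    for l in range(4):
        for m in range(l + 1):
            print(l, m, norm_squared(l, m), expected(l, m))
```

Because the arithmetic is done with `Fraction`, the comparison is exact rather than up to rounding.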
Returning to our original setting we conclude that
\[ \Theta_{lm}(\theta) = \sqrt{\frac{2l+1}{2}\,\frac{(l-m)!}{(l+m)!}}\, P_{lm}(\cos(\theta)), \qquad l = m, m+1, \dots \tag{10.40} \]
form an orthonormal basis for $L^2((0,\pi), \sin(\theta)d\theta)$ for any fixed $m \in \mathbb{N}_0$.
Theorem 10.7. The operator $L^2$ on $L^2((0,\pi), \sin(\theta)d\theta) \otimes L^2((0,2\pi))$ has a purely discrete spectrum given by
\[ \sigma(L^2) = \{ l(l+1) \mid l \in \mathbb{N}_0 \}. \tag{10.41} \]
The spherical harmonics
\[ Y_{lm}(\theta,\varphi) = \Theta_{l|m|}(\theta)\Phi_m(\varphi) = \sqrt{\frac{2l+1}{4\pi}\,\frac{(l-|m|)!}{(l+|m|)!}}\, P_{l|m|}(\cos(\theta))\, e^{\mathrm{i}m\varphi}, \qquad |m| \le l, \tag{10.42} \]
form an orthonormal basis and satisfy $L^2 Y_{lm} = l(l+1)Y_{lm}$ and $L_3 Y_{lm} = mY_{lm}$.

Proof. Everything follows from our construction, if we can show that the $Y_{lm}$ form a basis. But this follows as in the proof of Lemma 1.9.
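Note that transforming $Y_{lm}$ back to cartesian coordinates gives, for $m \ge 0$,
\[ Y_{l,\pm m}(x) = \sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\, \tilde{P}_{lm}\!\left(\frac{x_3}{r}\right)\left(\frac{x_1 \pm \mathrm{i}x_2}{r}\right)^m, \qquad r = |x|, \tag{10.43} \]
where $\tilde{P}_{lm}$ is a polynomial of degree $l - m$ given by
\[ \tilde{P}_{lm}(x) = (1-x^2)^{-m/2}P_{lm}(x) = \frac{1}{2^l l!}\frac{d^{l+m}}{dx^{l+m}}(x^2-1)^l. \tag{10.44} \]
In particular, the $Y_{lm}$ are smooth away from the origin and by construction they satisfy
\[ -\Delta Y_{lm} = \frac{l(l+1)}{r^2}Y_{lm}. \tag{10.45} \]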
10.4. The eigenvalues of the hydrogen atom
Now we want to use the considerations from the previous section to decompose the Hamiltonian of the hydrogen atom. In fact, we can even admit any spherically symmetric potential $V(x) = V(|x|)$ with
\[ V(r) \in L^\infty(\mathbb{R}) + L^2((0,\infty), r^2 dr). \tag{10.46} \]
The important observation is that the spaces
\[ \mathfrak{H}_{lm} = \{ \psi(x) = R(r)Y_{lm}(\theta,\varphi) \mid R(r) \in L^2((0,\infty), r^2 dr) \} \tag{10.47} \]
reduce our operator $H = H_0 + V$. Hence
\[ H = H_0 + V = \bigoplus_{l,m} \tilde{H}_l, \tag{10.48} \]
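where
\[ \tilde{H}_l R(r) = \tilde{\tau}_l R(r), \qquad \tilde{\tau}_l = -\frac{1}{r^2}\frac{d}{dr}r^2\frac{d}{dr} + \frac{l(l+1)}{r^2} + V(r), \qquad D(\tilde{H}_l) \subseteq L^2((0,\infty), r^2 dr). \tag{10.49} \]
Using the unitary transformation
\[ L^2((0,\infty), r^2 dr) \to L^2((0,\infty)), \qquad R(r) \mapsto u(r) = rR(r), \tag{10.50} \]
our operator transforms to
\[ A_l f = \tau_l f, \qquad \tau_l = -\frac{d^2}{dr^2} + \frac{l(l+1)}{r^2} + V(r), \qquad D(A_l) \subseteq L^2((0,\infty)). \tag{10.51} \]
It remains to investigate this operator.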
Theorem 10.8. The domain of the operator $A_l$ is given by
\[ D(A_l) = \{ f \in L^2(I) \mid f, f' \in AC(I),\ \tau_l f \in L^2(I),\ \lim_{r\to0}(f(r) - rf'(r)) = 0 \text{ if } l = 0 \}, \tag{10.52} \]
where $I = (0,\infty)$. Moreover, $\sigma_{ess}(A_l) = [0,\infty)$.
Proof. By construction of $A_l$ we know that it is self-adjoint and satisfies $\sigma_{ess}(A_l) = [0,\infty)$. Hence it remains to compute the domain. We know at least $D(A_l) \subseteq D(\tau_l)$ and since $D(H) = D(H_0)$ it suffices to consider the case $V = 0$. In this case the solutions of $-u''(r) + \frac{l(l+1)}{r^2}u(r) = 0$ are given by $u(r) = \alpha r^{l+1} + \beta r^{-l}$. Thus we are in the l.p. case at $\infty$ for any $l \in \mathbb{N}_0$. However, at 0 we are in the l.p. case only if $l > 0$, that is, we need an additional boundary condition at 0 if $l = 0$. Since we need $R(r) = \frac{u(r)}{r}$ to be bounded (such that (10.23) is in the domain of $H_0$), we have to take the boundary condition generated by $u(r) = r$.
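The limit point / limit circle classification used in this proof can be traced with a few lines of Python. This is only a sketch with hypothetical helper names: it checks that $r^{l+1}$ and $r^{-l}$ solve $-u'' + \frac{l(l+1)}{r^2}u = 0$ and then decides square integrability from the exponents, using that $r^p$ lies in $L^2$ near 0 if and only if $2p > -1$ and in $L^2$ near $\infty$ if and only if $2p < -1$.

```python
def residual_coefficient(p, l):
    """Coefficient c in -u'' + l(l+1)/r^2 u = c * r^(p-2) for u(r) = r^p."""
    return -p * (p - 1) + l * (l + 1)


def in_l2_near_zero(p):
    """r^p is square integrable near 0 iff 2p > -1."""
    return 2 * p > -1


def in_l2_near_infinity(p):
    """r^p is square integrable near infinity iff 2p < -1."""
    return 2 * p < -1


def classify(l):
    """Return the (endpoint 0, endpoint infinity) classification for tau_l with V = 0."""
    exponents = (l + 1, -l)
    # Both powers must solve the homogeneous equation.
    assert all(residual_coefficient(p, l) == 0 for p in exponents)
    # Limit circle means that all solutions are square integrable near the endpoint.
    at_zero = "l.c." if all(in_l2_near_zero(p) for p in exponents) else "l.p."
    at_infinity = "l.c." if all(in_l2_near_infinity(p) for p in exponents) else "l.p."
    return at_zero, at_infinity


if __name__ == "__main__":
    for l in range(3):
        print(l, classify(l))
```

Only $l = 0$ is limit circle at 0, which is why a boundary condition is needed exactly in that case.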
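Finally let us turn to some explicit choices for $V$, where the corresponding differential equation can be explicitly solved. The simplest case is $V = 0$. In this case the solutions of
\[ -u''(r) + \frac{l(l+1)}{r^2}u(r) = zu(r) \tag{10.53} \]
are given in terms of the spherical Bessel respectively spherical Neumann functions,
\[ u(r) = \alpha\, r\, j_l(\sqrt{z}r) + \beta\, r\, n_l(\sqrt{z}r), \tag{10.54} \]
where
\[ j_l(r) = (-r)^l\left(\frac{1}{r}\frac{d}{dr}\right)^l \frac{\sin(r)}{r}. \tag{10.55} \]
(The factor $r$ stems from the transformation (10.50).) In particular,
\[ u_a(z,r) = r\, j_l(\sqrt{z}r) \quad\text{and}\quad u_b(z,r) = r\left(j_l(\sqrt{z}r) + \mathrm{i}\,n_l(\sqrt{z}r)\right) \tag{10.56} \]
are the functions which are square integrable and satisfy the boundary condition (if any) near $a = 0$ and $b = \infty$, respectively (with the branch of the square root chosen such that $\operatorname{Im}\sqrt{z} > 0$).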
The second case is that of our Coulomb potential
\[ V(r) = -\frac{\gamma}{r}, \qquad \gamma > 0, \tag{10.57} \]
where we will try to compute the eigenvalues plus corresponding eigenfunctions. It turns out that they can be expressed in terms of the Laguerre polynomials
\[ L_j(r) = e^{r}\frac{d^j}{dr^j}e^{-r}r^j \tag{10.58} \]
and the associated Laguerre polynomials
\[ L_j^k(r) = \frac{d^k}{dr^k}L_j(r). \tag{10.59} \]
Note that $L_j^k$ is a polynomial of degree $j - k$.
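Theorem 10.9. The eigenvalues of $H^{(1)}$ are explicitly given by
\[ E_n = -\left(\frac{\gamma}{2(n+1)}\right)^2, \qquad n \in \mathbb{N}_0. \tag{10.60} \]
An orthonormal basis for the corresponding eigenspace is given by
\[ \psi_{nlm}(x) = R_{nl}(r)Y_{lm}(x), \qquad 0 \le l \le n,\ |m| \le l, \tag{10.61} \]
where
\[ R_{nl}(r) = \sqrt{\frac{\gamma^3 (n-l)!}{2(n+1)^4((n+l+1)!)^3}}\left(\frac{\gamma r}{n+1}\right)^l e^{-\frac{\gamma r}{2(n+1)}} L_{n+l+1}^{2l+1}\!\left(\frac{\gamma r}{n+1}\right). \tag{10.62} \]
In particular, the lowest eigenvalue $E_0 = -\frac{\gamma^2}{4}$ is simple and the corresponding eigenfunction $\psi_{000}(x) = \sqrt{\frac{\gamma^3}{8\pi}}\, e^{-\gamma r/2}$ (fixing the sign) is positive.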
Proof. It is a straightforward calculation to check that $R_{nl}$ are indeed eigenfunctions of $\tilde{H}_l$ corresponding to the eigenvalue $-(\frac{\gamma}{2(n+1)})^2$ and for the norming constants we refer to any book on special functions. The only problem is to show that we have found all eigenvalues.
Since all eigenvalues are negative, we need to look at the equation
\[ -u''(r) + \left(\frac{l(l+1)}{r^2} - \frac{\gamma}{r}\right)u(r) = \lambda u(r) \tag{10.63} \]
for $\lambda < 0$. Introducing new variables $x = \sqrt{-\lambda}\,r$ and $v(x) = x^{-(l+1)}e^{x}u(x/\sqrt{-\lambda})$, this equation transforms into
\[ xv''(x) + 2(l+1-x)v'(x) + 2nv(x) = 0, \qquad n = \frac{\gamma}{2\sqrt{-\lambda}} - (l+1). \tag{10.64} \]
Now let us search for a solution which can be expanded into a convergent power series
\[ v(x) = \sum_{j=0}^\infty v_j x^j, \qquad v_0 = 1. \tag{10.65} \]
The corresponding $u(r)$ is square integrable near 0 and satisfies the boundary condition (if any). Thus we need to find those values of $\lambda$ for which it is square integrable near $+\infty$.
Substituting the ansatz (10.65) into our differential equation and comparing powers of $x$ gives the following recursion for the coefficients
\[ v_{j+1} = \frac{2(j-n)}{(j+1)(j+2(l+1))}v_j \tag{10.66} \]
and thus
\[ v_j = \frac{1}{j!}\prod_{k=0}^{j-1}\frac{2(k-n)}{k+2(l+1)}. \tag{10.67} \]
Now there are two cases to distinguish. If $n \in \mathbb{N}_0$, then $v_j = 0$ for $j > n$ and $v(x)$ is a polynomial. In this case $u(r)$ is square integrable and hence an eigenfunction corresponding to the eigenvalue $\lambda_n = -(\frac{\gamma}{2(n+l+1)})^2$. Otherwise we have $v_j \ge \frac{(2-\varepsilon)^j}{j!}$ for $j$ sufficiently large. Hence by adding a polynomial to $v(x)$ we can get a function $\tilde{v}(x)$ such that $\tilde{v}_j \ge \frac{(2-\varepsilon)^j}{j!}$ for all $j$. But then $\tilde{v}(x) \ge \exp((2-\varepsilon)x)$ and thus the corresponding $u(r)$ is not square integrable near $\infty$.
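The termination argument can be made concrete with a small computation. The following Python sketch (function names are hypothetical) iterates the recursion (10.66) in exact rational arithmetic, starting from $v_0 = 1$, and evaluates the eigenvalue formula $-(\gamma/(2(n+l+1)))^2$ obtained in the proof; it only illustrates the two cases and is not part of the proof.

```python
from fractions import Fraction


def series_coefficients(n, l, count):
    """First `count` coefficients v_j of the power series, via recursion (10.66)."""
    v = [Fraction(1)]
    for j in range(count - 1):
        factor = Fraction(2) * (j - n) / ((j + 1) * (j + 2 * (l + 1)))
        v.append(factor * v[j])
    return v


def eigenvalue(n, l, gamma=Fraction(1)):
    """Eigenvalue -(gamma / (2 (n + l + 1)))^2 belonging to a terminating series."""
    return -(gamma / (2 * (n + l + 1))) ** 2


if __name__ == "__main__":
    # Integer n: the series terminates, so v is a polynomial of degree n.
    print(series_coefficients(2, 0, 6), eigenvalue(2, 0))
    # Non-integer n: no coefficient vanishes.
    print(series_coefficients(Fraction(1, 2), 0, 6))
```

For integer $n$ all coefficients beyond $v_n$ vanish, whereas for non-integer $n$ the factor $j - n$ never vanishes, in line with the two cases distinguished above.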
10.5. Nondegeneracy of the ground state
The lowest eigenvalue (below the essential spectrum) of a Schrödinger operator is called ground state. Since the laws of physics state that a quantum system will transfer energy to its surroundings (e.g., an atom emits radiation) until it eventually reaches its ground state, this state is in some sense the most important state. We have seen that the hydrogen atom has a nondegenerate (simple) ground state with a corresponding positive eigenfunction. In particular, the hydrogen atom is stable in the sense that there is a lowest possible energy. This is quite surprising since the corresponding classical mechanical system is not: the electron could fall into the nucleus!
Our aim in this section is to show that the ground state is simple with a corresponding positive eigenfunction. Note that it suffices to show that any ground state eigenfunction is positive since nondegeneracy then follows for free: two positive functions cannot be orthogonal.
To set the stage let us introduce some notation. Let $\mathfrak{H} = L^2(\mathbb{R}^n)$. We call $f \in L^2(\mathbb{R}^n)$ positive if $f \ge 0$ a.e. and $f \ne 0$. We call $f$ strictly positive if $f > 0$ a.e. A bounded operator $A$ is called positivity preserving if $f \ge 0$ implies $Af \ge 0$ and positivity improving if $f \ge 0$ implies $Af > 0$. Clearly $A$ is positivity preserving (improving) if and only if $\langle f, Ag\rangle \ge 0$ ($> 0$) for $f, g \ge 0$.

Example. Multiplication by a positive function is positivity preserving (but not improving). Convolution with a strictly positive function is positivity improving.

We first show that positivity improving operators have positive eigenfunctions.

Theorem 10.10. Suppose $A \in \mathfrak{L}(L^2(\mathbb{R}^n))$ is a self-adjoint, positivity improving and real (i.e., it maps real functions to real functions) operator. If $\|A\|$ is an eigenvalue, then it is simple and the corresponding eigenfunction is strictly positive.
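Proof. Let $\psi$ be an eigenfunction. Then it is no restriction to assume that $\psi$ is real (since $A$ is real, both real and imaginary part of $\psi$ are eigenfunctions as well). We assume $\|\psi\| = 1$ and denote by $\psi_\pm = \frac{|\psi| \pm \psi}{2}$ the positive and negative parts of $\psi$, so that $\psi = \psi_+ - \psi_-$ and $|\psi| = \psi_+ + \psi_-$. Then by $|A\psi| = |A\psi_+ - A\psi_-| \le A\psi_+ + A\psi_- = A|\psi|$ we have
\[ \|A\| = \langle\psi, A\psi\rangle \le \langle|\psi|, |A\psi|\rangle \le \langle|\psi|, A|\psi|\rangle \le \|A\|, \tag{10.68} \]
that is, $\langle\psi, A\psi\rangle = \langle|\psi|, A|\psi|\rangle$ and thus
\[ \langle\psi_+, A\psi_-\rangle = \frac{1}{4}\left(\langle|\psi|, A|\psi|\rangle - \langle\psi, A\psi\rangle\right) = 0. \tag{10.69} \]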
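Consequently $\psi_- = 0$ or $\psi_+ = 0$ since otherwise $A\psi_- > 0$ and hence also $\langle\psi_+, A\psi_-\rangle > 0$. Without restriction $\psi = \psi_+ \ge 0$ and since $A$ is positivity improving we even have $\psi = \|A\|^{-1}A\psi > 0$.

So we need a positivity improving operator. Both $e^{-tH_0}$, $t > 0$, and $R_{H_0}(\lambda)$, $\lambda < 0$, are, since they are given by convolution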
with a strictly positive function. Our hope is that this property carries over to $H = H_0 + V$.

Theorem 10.11. Suppose $H = H_0 + V$ is self-adjoint and bounded from below with $C_c^\infty(\mathbb{R}^n)$ as a core. If $E_0 = \min\sigma(H)$ is an eigenvalue, it is simple and the corresponding eigenfunction is strictly positive.
Proof. We first show that $e^{-tH}$, $t > 0$, is positivity preserving. If we set $V_n = V\chi_{\{x \mid |V(x)| \le n\}}$, then $V_n$ is bounded and, with $H_n = H_0 + V_n$, the operator $e^{-tH_n}$ is positivity preserving by the Trotter product formula since both $e^{-tH_0}$ and $e^{-tV_n}$ are. Moreover, we have $H_n\psi \to H\psi$ for $\psi \in C_c^\infty(\mathbb{R}^n)$ (note that necessarily $V \in L^2_{loc}$) and hence $H_n \to H$ in strong resolvent sense by Lemma 6.28. Hence $e^{-tH_n} \to e^{-tH}$ strongly by Theorem 6.23, which shows that $e^{-tH}$ is at least positivity preserving (since 0 cannot be an eigenvalue of $e^{-tH}$ it cannot map a positive function to 0).
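Next I claim that for $\psi$ positive the closed set
\[ N(\psi) = \{ \varphi \in L^2(\mathbb{R}^n) \mid \varphi \ge 0,\ \langle\varphi, e^{-sH}\psi\rangle = 0\ \forall s \ge 0 \} \tag{10.70} \]
is just $\{0\}$. If $\varphi \in N(\psi)$ we have by $e^{-sH}\psi \ge 0$ that $\varphi\, e^{-sH}\psi = 0$. Hence $e^{tV_n}\varphi\, e^{-sH}\psi = 0$, that is, $e^{tV_n}\varphi \in N(\psi)$. In other words, both $e^{tV_n}$ and $e^{-tH}$ leave $N(\psi)$ invariant and invoking again Trotter's formula the same is true for
\[ e^{-t(H-V_n)} = \operatorname*{s-lim}_{k\to\infty}\left(e^{-\frac{t}{k}H}e^{\frac{t}{k}V_n}\right)^k. \tag{10.71} \]
Since $e^{-t(H-V_n)} \to e^{-tH_0}$ strongly, we finally obtain that $e^{-tH_0}$ leaves $N(\psi)$ invariant, but this operator is positivity improving and thus $N(\psi) = \{0\}$.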
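Now it remains to use (7.41), which shows
\[ \langle\varphi, R_H(\lambda)\psi\rangle = \int_0^\infty e^{\lambda t}\langle\varphi, e^{-tH}\psi\rangle\, dt > 0, \qquad \lambda < E_0, \tag{10.72} \]
for $\varphi$, $\psi$ positive. So $R_H(\lambda)$ is positivity improving for $\lambda < E_0$.
If $\psi$ is an eigenfunction of $H$ corresponding to $E_0$, it is an eigenfunction of $R_H(\lambda)$ corresponding to $\frac{1}{E_0 - \lambda}$. Since $\|R_H(\lambda)\| = \frac{1}{E_0 - \lambda}$, the claim follows from Theorem 10.10.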
Chapter 11
Atomic Schrödinger operators
11.1. Self-adjointness
In this section we want to have a look at the Hamiltonian corresponding to more than one interacting particle. It is given by
\[ H = -\sum_{j=1}^N \Delta_j + \sum_{j<k}^N V_{j,k}(x_j - x_k). \tag{11.1} \]
We first consider the case of two particles, which will give us a feeling for how the many particle case differs from the one particle case and how the difficulties can be overcome.
We denote the coordinates corresponding to the first particle by $x_1 = (x_{1,1}, x_{1,2}, x_{1,3})$ and those corresponding to the second particle by $x_2 = (x_{2,1}, x_{2,2}, x_{2,3})$. If we assume that the interaction is again of the Coulomb type, the Hamiltonian is given by
\[ H = -\Delta_1 - \Delta_2 - \frac{\gamma}{|x_1 - x_2|}, \qquad D(H) = H^2(\mathbb{R}^6). \tag{11.2} \]
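Since Theorem 10.2 does not allow singularities for $n \ge 3$, it does not tell us whether $H$ is self-adjoint or not. Let
\[ (y_1, y_2) = \frac{1}{\sqrt{2}}\begin{pmatrix} I & I \\ -I & I \end{pmatrix}(x_1, x_2), \tag{11.3} \]
then $H$ reads in these new coordinates
\[ H = (-\Delta_1) + \left(-\Delta_2 - \frac{\gamma/\sqrt{2}}{|y_2|}\right). \tag{11.4} \]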
In particular, it is the sum of a free particle plus a particle in an external Coulomb field. From a physics point of view, the first part corresponds to the center of mass motion and the second part to the relative motion.
Using that $\gamma/(\sqrt{2}|y_2|)$ has $(-\Delta_2)$-bound 0 in $L^2(\mathbb{R}^3)$, it is not hard to see that the same is true for the $(-\Delta_1 - \Delta_2)$-bound in $L^2(\mathbb{R}^6)$ (details will follow in the next section). In particular, $H$ is self-adjoint and semi-bounded for any $\gamma \in \mathbb{R}$. Moreover, you might suspect that $\gamma/(\sqrt{2}|y_2|)$ is relatively compact with respect to $-\Delta_1 - \Delta_2$ in $L^2(\mathbb{R}^6)$ since it is with respect to $-\Delta_2$ in $L^2(\mathbb{R}^3)$. However, this is not true! This is due to the fact that $\gamma/(\sqrt{2}|y_2|)$ does not vanish as $|y| \to \infty$.
Let us look at this problem from the physical view point. If $\lambda \in \sigma_{ess}(H)$, this means that the movement of the whole system is somehow unbounded. There are two possibilities for this.
Firstly, both particles are far away from each other (such that we can neglect the interaction) and the energy corresponds to the sum of the kinetic energies of both particles. Since both can be arbitrarily small (but positive), we expect $[0,\infty) \subseteq \sigma_{ess}(H)$.
Secondly, both particles remain close to each other and move together. In the last coordinates this corresponds to a bound state of the second operator. Hence we expect $[\lambda_0,\infty) \subseteq \sigma_{ess}(H)$, where $\lambda_0 = -\gamma^2/8$ is the smallest eigenvalue of the second operator if the forces are attracting ($\gamma \ge 0$) and $\lambda_0 = 0$ if they are repelling ($\gamma \le 0$).
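It is not hard to translate these intuitive ideas into a rigorous proof. Let $\psi_1(y_1)$ be a Weyl sequence corresponding to $\lambda \in [0,\infty)$ for $-\Delta_1$ and $\psi_2(y_2)$ be a Weyl sequence corresponding to $\lambda_0$ for $-\Delta_2 - \gamma/(\sqrt{2}|y_2|)$. Then $\psi_1(y_1)\psi_2(y_2)$ is a Weyl sequence corresponding to $\lambda + \lambda_0$ for $H$ and thus $[\lambda_0,\infty) \subseteq \sigma_{ess}(H)$. Conversely, we have $-\Delta_1 \ge 0$ respectively $-\Delta_2 - \gamma/(\sqrt{2}|y_2|) \ge \lambda_0$ and hence $H \ge \lambda_0$. Thus we obtain
\[ \sigma(H) = \sigma_{ess}(H) = [\lambda_0,\infty), \qquad \lambda_0 = \begin{cases} -\gamma^2/8, & \gamma \ge 0, \\ 0, & \gamma \le 0. \end{cases} \tag{11.5} \]
Clearly, the physically relevant information is the spectrum of the operator $-\Delta_2 - \gamma/(\sqrt{2}|y_2|)$, which is hidden by the spectrum of $-\Delta_1$. Hence, in order to reveal the physics, one first has to remove the center of mass motion.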
To avoid clumsy notation, we will restrict ourselves to the case of one atom with $N$ electrons whose nucleus is fixed at the origin. In particular, this implies that we do not have to deal with the center of mass motion encountered in our example above. The Hamiltonian is given by
\[ H^{(N)} = -\sum_{j=1}^N \Delta_j - \sum_{j=1}^N V_{ne}(x_j) + \sum_{j<k}^N V_{ee}(x_j - x_k), \qquad D(H^{(N)}) = H^2(\mathbb{R}^{3N}), \tag{11.6} \]
where $V_{ne}$ describes the interaction of one electron with the nucleus and $V_{ee}$ describes the interaction of two electrons. Explicitly we have
\[ V_j(x) = \frac{\gamma_j}{|x|}, \qquad \gamma_j > 0,\ j = ne, ee. \tag{11.7} \]
We first need to establish self-adjointness of $H^{(N)}$. This will follow from Kato's theorem.

Theorem 11.1 (Kato). Let $V_k \in L^\infty(\mathbb{R}^d) + L^2(\mathbb{R}^d)$, $d \le 3$, be real-valued and let $V_k(y^{(k)})$ be the multiplication operator in $L^2(\mathbb{R}^n)$, $n = Nd$, obtained by letting $y^{(k)}$ be the first $d$ coordinates of a unitary transform of $\mathbb{R}^n$. Then $V_k$ is $H_0$ bounded with $H_0$-bound 0. In particular,
\[ H = H_0 + \sum_k V_k(y^{(k)}), \qquad D(H) = H^2(\mathbb{R}^n), \tag{11.8} \]
is self-adjoint and $C_0^\infty(\mathbb{R}^n)$ is a core.
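Proof. It suffices to consider one $k$. After a unitary transform of $\mathbb{R}^n$ we can assume $y^{(1)} = (x_1, \dots, x_d)$ since such transformations leave both the scalar product of $L^2(\mathbb{R}^n)$ and $H_0$ invariant. Now let $\psi \in \mathcal{S}(\mathbb{R}^n)$, then
\[ \|V_k\psi\|^2 \le a^2\int_{\mathbb{R}^n}|\Delta_1\psi(x)|^2 d^nx + b^2\int_{\mathbb{R}^n}|\psi(x)|^2 d^nx, \tag{11.9} \]
where $\Delta_1 = \sum_{j=1}^d \partial^2/\partial x_j^2$, by our previous lemma. Hence we obtain
\[ \begin{aligned}
\|V_k\psi\|^2 &\le a^2\int_{\mathbb{R}^n}\Big|\sum_{j=1}^d p_j^2\hat{\psi}(p)\Big|^2 d^np + b^2\|\psi\|^2 \\
&\le a^2\int_{\mathbb{R}^n}\Big|\sum_{j=1}^n p_j^2\hat{\psi}(p)\Big|^2 d^np + b^2\|\psi\|^2 \\
&= a^2\|H_0\psi\|^2 + b^2\|\psi\|^2,
\end{aligned} \tag{11.10} \]
which implies that $V_k$ is relatively bounded with bound 0.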
11.2. The HVZ theorem
The considerations of the previous section show that it is not so easy to determine the essential spectrum of $H^{(N)}$ since the potential does not decay in all directions as $|x| \to \infty$. However, there is still something we can do. Denote the infimum of the spectrum of $H^{(N)}$ by $\lambda^N$. Then, let us split the system into $H^{(N-1)}$ plus a single electron. If the single electron is far away from the remaining system such that there is little interaction, the energy should be the sum of the kinetic energy of the single electron and the energy of the remaining system. Hence, arguing as in the two electron example of the previous section, we expect
Theorem 11.2 (HVZ). Let $H^{(N)}$ be the self-adjoint operator given in (11.6). Then $H^{(N)}$ is bounded from below and
\[ \sigma_{ess}(H^{(N)}) = [\lambda^{N-1}, \infty), \tag{11.11} \]
where $\lambda^N = \min\sigma(H^{(N)}) < 0$.

In particular, the ionization energy (i.e., the energy needed to remove one electron from the atom in its ground state) of an atom with $N$ electrons is given by $\lambda^{N-1} - \lambda^N$.
Our goal for the rest of this section is to prove this result, which is due to Zhislin, van Winter and Hunziker and known as the HVZ theorem. In fact, there is a version which holds for general $N$-body systems. The proof is similar but involves some additional notation.
The idea of proof is the following. To prove $[\lambda^{N-1},\infty) \subseteq \sigma_{ess}(H^{(N)})$ we choose Weyl sequences for $H^{(N-1)}$ and $-\Delta_N$ and proceed according to our intuitive picture from above. To prove $\sigma_{ess}(H^{(N)}) \subseteq [\lambda^{N-1},\infty)$ we will localize $H^{(N)}$ on sets where either one electron is far away from the others or all electrons are far away from the nucleus. Since the error turns out relatively compact, it remains to consider the infimum of the spectra of these operators. For all cases where one electron is far away it is $\lambda^{N-1}$ and for the case where all electrons are far away from the nucleus it is 0 (since the electrons repel each other).
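We begin with the first inclusion. Let $\psi^{N-1}(x_1, \dots, x_{N-1}) \in H^2(\mathbb{R}^{3(N-1)})$ such that $\|\psi^{N-1}\| = 1$, $\|(H^{(N-1)} - \lambda^{N-1})\psi^{N-1}\| \le \varepsilon$ and $\psi^1 \in H^2(\mathbb{R}^3)$ such that $\|\psi^1\| = 1$, $\|(-\Delta_N - \lambda)\psi^1\| \le \varepsilon$ for some $\lambda \ge 0$. Now consider $\psi_r(x_1, \dots, x_N) = \psi^{N-1}(x_1, \dots, x_{N-1})\psi^1_r(x_N)$, $\psi^1_r(x_N) = \psi^1(x_N - r)$, then
\[ \begin{aligned}
\|(H^{(N)} - \lambda - \lambda^{N-1})\psi_r\| \le{}& \|(H^{(N-1)} - \lambda^{N-1})\psi^{N-1}\|\,\|\psi^1_r\| \\
&+ \|\psi^{N-1}\|\,\|(-\Delta_N - \lambda)\psi^1_r\| \\
&+ \Big\|\Big(V_N - \sum_{j=1}^{N-1}V_{N,j}\Big)\psi_r\Big\|,
\end{aligned} \tag{11.12} \]
where $V_N = V_{ne}(x_N)$ and $V_{N,j} = V_{ee}(x_N - x_j)$. Since $(V_N - \sum_{j=1}^{N-1}V_{N,j})\psi^{N-1} \in L^2(\mathbb{R}^{3N})$ and $|\psi^1_r| \to 0$ pointwise as $|r| \to \infty$ (by Lemma 10.1), the third term can be made smaller than $\varepsilon$ by choosing $|r|$ large (dominated convergence). In summary,
\[ \|(H^{(N)} - \lambda - \lambda^{N-1})\psi_r\| \le 3\varepsilon, \tag{11.13} \]
proving $[\lambda^{N-1},\infty) \subseteq \sigma_{ess}(H^{(N)})$.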
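The second inclusion is more involved. We begin with a localization formula, which can be verified by a straightforward computation.

Lemma 11.3 (IMS localization formula). Suppose $\phi_j \in C^\infty(\mathbb{R}^n)$, $0 \le j \le N$, is such that
\[ \sum_{j=0}^N \phi_j(x)^2 = 1, \qquad x \in \mathbb{R}^n, \tag{11.14} \]
then
\[ \Delta\psi = \sum_{j=0}^N\left(\phi_j\Delta(\phi_j\psi) + |\partial\phi_j|^2\psi\right), \qquad \psi \in H^2(\mathbb{R}^n). \tag{11.15} \]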
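The localization formula is easy to test numerically in one dimension. The following Python sketch assumes the identity in the form $\Delta\psi = \sum_j (\phi_j\Delta(\phi_j\psi) + |\partial\phi_j|^2\psi)$ and uses a hypothetical two-element partition $\phi_0 = \cos(g)$, $\phi_1 = \sin(g)$ with $g = \arctan$, so that $\phi_0^2 + \phi_1^2 = 1$ holds automatically; derivatives are approximated by central differences, so agreement is only up to discretization error.

```python
import math


def first_derivative(f, x, h=1e-3):
    """Central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def second_derivative(f, x, h=1e-3):
    """Central difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def g(x):
    """Hypothetical angle function defining the partition."""
    return math.atan(x)


# phi_0^2 + phi_1^2 = cos^2 + sin^2 = 1, as required by (11.14).
PHIS = [lambda x: math.cos(g(x)), lambda x: math.sin(g(x))]


def psi(x):
    """Test function (a Gaussian)."""
    return math.exp(-x * x)


def ims_lhs(x):
    """Left-hand side: the Laplacian of psi (second derivative in 1D)."""
    return second_derivative(psi, x)


def ims_rhs(x):
    """Right-hand side: sum of phi_j (phi_j psi)'' + (phi_j')^2 psi."""
    total = 0.0
    for phi in PHIS:
        total += phi(x) * second_derivative(lambda y: phi(y) * psi(y), x)
        total += first_derivative(phi, x) ** 2 * psi(x)
    return total


if __name__ == "__main__":
    for x in (-1.3, 0.0, 0.5, 2.0):
        print(x, ims_lhs(x), ims_rhs(x))
```

The gradient term enters with a plus sign for $\Delta$, hence with a minus sign for $-\Delta$; flipping the sign in `ims_rhs` changes the result by $2g'(x)^2\psi(x)$, which is clearly visible at $x = 0.5$.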
Abbreviate $B = \{ x \in \mathbb{R}^{3N} \mid |x| \ge 1 \}$. Now we will choose $\phi_j$, $1 \le j \le N$, in such a way that $x \in \operatorname{supp}(\phi_j) \cap B$ implies that the $j$-th particle is far away from all the others and from the nucleus. Similarly, we will choose $\phi_0$ in such a way that $x \in \operatorname{supp}(\phi_0) \cap B$ implies that all particles are far away from the nucleus.
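Lemma 11.4. There exist functions $\phi_j \in C^\infty(\mathbb{R}^n, [0,1])$, $0 \le j \le N$, such that (11.14) holds, the supports satisfy the localization properties just described, and $|\partial\phi_j(x)| \to 0$ as $|x| \to \infty$.

Proof. Consider the sets
\[ \begin{aligned}
U_j^n &= \{ x \in S^{3N-1} \mid |x_j - x_\ell| > n^{-1} \text{ for all } \ell \ne j, \text{ and } |x_j| > n^{-1} \}, \\
U_0^n &= \{ x \in S^{3N-1} \mid |x_\ell| > n^{-1} \text{ for all } \ell \}.
\end{aligned} \tag{11.17} \]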
We claim that
\[ \bigcup_{n=1}^\infty \bigcup_{j=0}^N U_j^n = S^{3N-1}. \tag{11.18} \]
Indeed, suppose there is an $x \in S^{3N-1}$ which is not an element of this union. Then $x \notin U_0^n$ for all $n$ implies $0 = |x_j|$ for some $j$, say $j = 1$. Next, $x \notin U_1^n$ for all $n$ implies $0 = |x_j - x_1| = |x_j|$ for some $j > 1$, say $j = 2$. Proceeding like this we end up with $x = 0$, a contradiction. By compactness of $S^{3N-1}$ we even have
\[ \bigcup_{j=0}^N U_j^n = S^{3N-1} \tag{11.19} \]
for $n$ sufficiently large. It is well-known that there is a partition of unity $\tilde{\phi}_j(x)$ subordinate to this cover. Extend $\tilde{\phi}_j(x)$ to a smooth function from $\mathbb{R}^{3N} \setminus \{0\}$ to $[0,1]$ by
\[ \tilde{\phi}_j(\lambda x) = \tilde{\phi}_j(x), \qquad x \in S^{3N-1},\ \lambda > 0, \tag{11.20} \]
and pick a function $\tilde{\phi} \in C^\infty(\mathbb{R}^{3N}, [0,1])$ with support inside the unit ball which is 1 in a neighborhood of the origin. Then
\[ \phi_j = \frac{\tilde{\phi} + (1-\tilde{\phi})\tilde{\phi}_j}{\sqrt{\sum_{\ell=0}^N\left(\tilde{\phi} + (1-\tilde{\phi})\tilde{\phi}_\ell\right)^2}} \tag{11.21} \]
are the desired functions. The gradient tends to zero since $\phi_j(\lambda x) = \phi_j(x)$ for $\lambda \ge 1$ and $|x| \ge 1$, which implies $(\partial\phi_j)(\lambda x) = \lambda^{-1}(\partial\phi_j)(x)$.
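By our localization formula we have
\[ H^{(N)} = \sum_{j=0}^N \phi_j H^{(N,j)}\phi_j + K, \qquad K = \sum_{j=0}^N\left(\phi_j^2 V^{(N,j)} - |\partial\phi_j|^2\right), \tag{11.22} \]
where, with $V_\ell = V_{ne}(x_\ell)$ and $V_{k,\ell} = V_{ee}(x_k - x_\ell)$,
\[ \begin{aligned}
H^{(N,j)} &= -\sum_{\ell=1}^N \Delta_\ell - \sum_{\ell \ne j}^N V_\ell + \sum_{k<\ell,\ k,\ell \ne j}^N V_{k,\ell}, & H^{(N,0)} &= -\sum_{\ell=1}^N \Delta_\ell + \sum_{k<\ell}^N V_{k,\ell}, \\
V^{(N,j)} &= -V_j + \sum_{\ell \ne j}^N V_{j,\ell}, & V^{(N,0)} &= -\sum_{\ell=1}^N V_\ell.
\end{aligned} \tag{11.23} \]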
To show that our choice of the functions $\phi_j$ implies that $K$ is relatively compact with respect to $H$ we need the following

Lemma 11.5. Let $V$ be $H_0$ bounded with $H_0$-bound 0 and suppose that $\|\chi_{\{x \mid |x| \ge R\}}VR_{H_0}(z)\| \to 0$ as $R \to \infty$. Then $V$ is relatively compact with respect to $H_0$.
Proof. Let $\psi_n$ converge to 0 weakly. Note that $\|\psi_n\| \le M$ for some $M > 0$. It suffices to show that $\|VR_{H_0}(z)\psi_n\|$ converges to 0. Choose $\phi \in C_0^\infty(\mathbb{R}^n, [0,1])$ such that it is one for $|x| \le R$. Then
\[ \begin{aligned}
\|VR_{H_0}(z)\psi_n\| &\le \|(1-\phi)VR_{H_0}(z)\psi_n\| + \|V\phi R_{H_0}(z)\psi_n\| \\
&\le \|(1-\phi)VR_{H_0}(z)\|\,\|\psi_n\| + a\|H_0\phi R_{H_0}(z)\psi_n\| + b\|\phi R_{H_0}(z)\psi_n\|.
\end{aligned} \tag{11.24} \]
By assumption, the first term can be made smaller than $\varepsilon$ by choosing $R$ large. Next, the same is true for the second term choosing $a$ small. Finally, the last term can also be made smaller than $\varepsilon$ by choosing $n$ large since $\phi$ is $H_0$ compact.
The terms $|\partial\phi_j|^2$ are bounded and vanish at $\infty$, hence they are $H_0$ compact by Lemma 7.10. The terms $\phi_j^2 V^{(N,j)}$ are relatively compact by the lemma and hence $K$ is relatively compact with respect to $H_0$. By Lemma 6.22, $K$ is also relatively compact with respect to $H^{(N)}$ since $V^{(N)}$ is relatively bounded with respect to $H_0$.
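In particular $H^{(N)} - K$ is self-adjoint on $H^2(\mathbb{R}^{3N})$ and $\sigma_{ess}(H^{(N)}) = \sigma_{ess}(H^{(N)} - K)$. Since the operators $H^{(N,j)}$, $1 \le j \le N$, are all of the form $H^{(N-1)}$ plus one particle which does not interact with the others and the nucleus, we have $H^{(N,j)} - \lambda^{N-1} \ge 0$, $1 \le j \le N$. Moreover, we have $H^{(N,0)} \ge 0$ since $V_{j,k} \ge 0$ and hence
\[ \langle\psi, (H^{(N)} - K - \lambda^{N-1})\psi\rangle = \sum_{j=0}^N \langle\phi_j\psi, (H^{(N,j)} - \lambda^{N-1})\phi_j\psi\rangle \ge 0. \tag{11.25} \]
Thus we obtain the remaining inclusion
\[ \sigma_{ess}(H^{(N)}) = \sigma_{ess}(H^{(N)} - K) \subseteq \sigma(H^{(N)} - K) \subseteq [\lambda^{N-1}, \infty), \tag{11.26} \]
which finishes the proof of the HVZ theorem.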
Note that the same proof works if we add additional nuclei at fixed locations. That is, we can also treat molecules if we assume that the nuclei are fixed in space.
Finally, let us consider the example of Helium-like atoms ($N = 2$). By the HVZ theorem and the considerations of the previous section we have
\[ \sigma_{ess}(H^{(2)}) = \left[-\frac{\gamma_{ne}^2}{4}, \infty\right). \tag{11.27} \]
Moreover, if $\gamma_{ee} = 0$ (no electron interaction), we can take products of one particle eigenfunctions to show that
\[ -\gamma_{ne}^2\left(\frac{1}{4n^2} + \frac{1}{4m^2}\right) \in \sigma_p(H^{(2)}(\gamma_{ee} = 0)), \qquad n, m \in \mathbb{N}. \tag{11.28} \]
In particular, there are eigenvalues embedded in the essential spectrum in this case. Moreover, since the electron interaction term is positive, we see
\[ H^{(2)} \ge -\frac{\gamma_{ne}^2}{2}. \tag{11.29} \]
Note that there can be no positive eigenvalues by the virial theorem. This even holds for arbitrary $N$,
\[ \sigma_p(H^{(N)}) \subset (-\infty, 0). \tag{11.30} \]
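To see which of the product eigenvalues in (11.28) are embedded, one can simply tabulate them. The Python sketch below (function names are hypothetical) uses the one-particle levels $-\gamma_{ne}^2/(4n^2)$, $n \in \mathbb{N}$, and the threshold $-\gamma_{ne}^2/4$ from (11.27), and lists the pairs $(n, m)$ with $n \le m$ whose sum lies at or above the threshold.

```python
from fractions import Fraction


def one_particle_level(n, gamma):
    """One-particle eigenvalue -gamma^2 / (4 n^2), n = 1, 2, ..."""
    return -gamma * gamma / (4 * n * n)


def embedded_levels(gamma, max_n):
    """Pairs (n, m, E) with n <= m <= max_n whose energy E lies in [-gamma^2/4, oo)."""
    threshold = one_particle_level(1, gamma)  # bottom of the essential spectrum
    result = []
    for n in range(1, max_n + 1):
        for m in range(n, max_n + 1):
            energy = one_particle_level(n, gamma) + one_particle_level(m, gamma)
            if energy >= threshold:
                result.append((n, m, energy))
    return result


if __name__ == "__main__":
    for n, m, energy in embedded_levels(Fraction(1), 3):
        print(n, m, energy)
```

With one electron in the lowest level the sum always stays strictly below the threshold, so in this tabulation the embedded eigenvalues are exactly those with $n, m \ge 2$; the pair $(1,1)$ gives the lower bound $-\gamma_{ne}^2/2$ appearing in (11.29).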
Chapter 12
Scattering theory
12.1. Abstract theory
In physical measurements one often has the following situation. A particle is shot into a region where it interacts with some forces and then leaves the region again. Outside this region the forces are negligible and hence the time evolution should be asymptotically free. Hence one expects asymptotic states $\psi_\pm(t) = \exp(-\mathrm{i}tH_0)\psi_\pm(0)$ to exist such that
\[ \|\psi(t) - \psi_\pm(t)\| \to 0 \quad\text{as } t \to \pm\infty. \tag{12.1} \]
[Figure: the trajectory $\psi(t)$ together with its free asymptotes $\psi_-(t)$ and $\psi_+(t)$.]
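Rewriting this condition we see
\[ 0 = \lim_{t\to\pm\infty}\|e^{-\mathrm{i}tH}\psi(0) - e^{-\mathrm{i}tH_0}\psi_\pm(0)\| = \lim_{t\to\pm\infty}\|\psi(0) - e^{\mathrm{i}tH}e^{-\mathrm{i}tH_0}\psi_\pm(0)\| \tag{12.2} \]
and motivated by this we define the wave operators by
\[ D(\Omega_\pm) = \{ \psi \in \mathfrak{H} \mid \exists \lim_{t\to\pm\infty} e^{\mathrm{i}tH}e^{-\mathrm{i}tH_0}\psi \}, \qquad \Omega_\pm\psi = \lim_{t\to\pm\infty} e^{\mathrm{i}tH}e^{-\mathrm{i}tH_0}\psi. \tag{12.3} \]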
190 12. Scattering theory
The set D(
and
Ran(
), it is called a
scattering state.
By construction we have

 = lim
t
e
itH
e
itH
0
 = lim
t
 =  (12.4)
and it is not hard to see that D(
by
1
) is
also closed. In summary,
Lemma 12.1. The sets D(
) and Ran(
: D(
)
Ran(
) is unitary.
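Next, observe that
\[ \lim_{t\to\pm\infty} e^{\mathrm{i}tH}e^{-\mathrm{i}tH_0}(e^{-\mathrm{i}sH_0}\psi) = \lim_{t\to\pm\infty} e^{-\mathrm{i}sH}(e^{\mathrm{i}(t+s)H}e^{-\mathrm{i}(t+s)H_0}\psi) \tag{12.5} \]
and hence
\[ \Omega_\pm e^{-\mathrm{i}tH_0}\psi = e^{-\mathrm{i}tH}\Omega_\pm\psi, \qquad \psi \in D(\Omega_\pm). \tag{12.6} \]
In addition, $D(\Omega_\pm)$ is invariant under $\exp(-\mathrm{i}tH_0)$ and $\operatorname{Ran}(\Omega_\pm)$ is invariant under $\exp(-\mathrm{i}tH)$. Moreover, if $\psi \in D(\Omega_\pm)^\perp$ then
\[ \langle\varphi, \exp(-\mathrm{i}tH_0)\psi\rangle = \langle\exp(\mathrm{i}tH_0)\varphi, \psi\rangle = 0, \qquad \varphi \in D(\Omega_\pm). \tag{12.7} \]
Hence $D(\Omega_\pm)^\perp$ is invariant under $\exp(-\mathrm{i}tH_0)$ and $\operatorname{Ran}(\Omega_\pm)^\perp$ is invariant under $\exp(-\mathrm{i}tH)$. Consequently, $D(\Omega_\pm)$ reduces $\exp(-\mathrm{i}tH_0)$ and $\operatorname{Ran}(\Omega_\pm)$ reduces $\exp(-\mathrm{i}tH)$. Moreover, differentiating (12.6) with respect to $t$ we obtain from Theorem 5.1 the intertwining property of the wave operators.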
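Theorem 12.2. The subspaces $D(\Omega_\pm)$ respectively $\operatorname{Ran}(\Omega_\pm)$ reduce $H_0$ respectively $H$ and the operators restricted to these subspaces are unitarily equivalent,
\[ \Omega_\pm H_0\psi = H\Omega_\pm\psi, \qquad \psi \in D(\Omega_\pm) \cap D(H_0). \tag{12.8} \]
It is interesting to know the correspondence between incoming and outgoing states. Hence we define the scattering operator
\[ S = \Omega_+^{-1}\Omega_-, \qquad D(S) = \{ \psi \in D(\Omega_-) \mid \Omega_-\psi \in \operatorname{Ran}(\Omega_+) \}. \tag{12.9} \]
Note that we have $D(S) = D(\Omega_-)$ if and only if $\operatorname{Ran}(\Omega_-) \subseteq \operatorname{Ran}(\Omega_+)$ and $\operatorname{Ran}(S) = D(\Omega_+)$ if and only if $\operatorname{Ran}(\Omega_+) \subseteq \operatorname{Ran}(\Omega_-)$. Moreover, $S$ is unitary from $D(S)$ onto $\operatorname{Ran}(S)$ and we have
\[ H_0 S\psi = SH_0\psi, \qquad \psi \in D(H_0) \cap D(S). \tag{12.10} \]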
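However, note that this whole theory is meaningless until we can show that $D(\Omega_\pm)$ are nontrivial. We first show a criterion due to Cook.

Lemma 12.3 (Cook). Suppose $D(H_0) \subseteq D(H)$. If
\[ \int_0^\infty \|(H - H_0)\exp(\mp\mathrm{i}tH_0)\psi\|\, dt < \infty, \qquad \psi \in D(H_0), \tag{12.11} \]
then $\psi \in D(\Omega_\pm)$, respectively. Moreover, we even have
\[ \|(\Omega_\pm - I)\psi\| \le \int_0^\infty \|(H - H_0)\exp(\mp\mathrm{i}tH_0)\psi\|\, dt \tag{12.12} \]
in this case.

Proof. The result follows from
\[ e^{\mathrm{i}tH}e^{-\mathrm{i}tH_0}\psi = \psi + \mathrm{i}\int_0^t \exp(\mathrm{i}sH)(H - H_0)\exp(-\mathrm{i}sH_0)\psi\, ds, \tag{12.13} \]
which holds for $\psi \in D(H_0)$.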
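As a simple consequence we obtain the following result for Schrödinger operators in $\mathbb{R}^3$.

Theorem 12.4. Suppose $H_0$ is the free Schrödinger operator and $H = H_0 + V$ with $V \in L^2(\mathbb{R}^3)$, then the wave operators exist and $D(\Omega_\pm) = \mathfrak{H}$.

Proof. Since we want to use Cook's lemma, we need to estimate
\[ \|V\psi(s)\|^2 = \int_{\mathbb{R}^3}|V(x)\psi(s,x)|^2 dx, \qquad \psi(s) = \exp(\mathrm{i}sH_0)\psi, \tag{12.14} \]
for given $\psi \in D(H_0)$. Invoking (7.34) we get
\[ \|V\psi(s)\| \le \|\psi(s)\|_\infty\|V\| \le \frac{1}{(4\pi s)^{3/2}}\|\psi\|_1\|V\|, \qquad s > 0, \tag{12.15} \]
at least for $\psi \in L^1(\mathbb{R}^3)$. Moreover, this implies
\[ \int_1^\infty \|V\psi(s)\|\, ds \le \frac{1}{4\pi^{3/2}}\|\psi\|_1\|V\| \tag{12.16} \]
and thus any such $\psi$ is in $D(\Omega_+)$. Since such functions are dense, we obtain $D(\Omega_+) = \mathfrak{H}$. Similarly for $\Omega_-$.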
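By the intertwining property $\psi$ is an eigenfunction of $H_0$ if and only if it is an eigenfunction of $H$. Hence for $\psi \in \mathfrak{H}_{pp}(H_0)$ it is easy to check whether it is in $D(\Omega_\pm)$ or not, and only the continuous subspace is of interest. We will say that the wave operators exist if all elements of $\mathfrak{H}_{ac}(H_0)$ are asymptotic states, that is,
\[ \mathfrak{H}_{ac}(H_0) \subseteq D(\Omega_\pm), \tag{12.17} \]
and that they are complete if, in addition, all elements of $\mathfrak{H}_{ac}(H)$ are scattering states, that is,
\[ \mathfrak{H}_{ac}(H) \subseteq \operatorname{Ran}(\Omega_\pm). \tag{12.18} \]
If we even have
\[ \mathfrak{H}_c(H) \subseteq \operatorname{Ran}(\Omega_\pm), \tag{12.19} \]
they are called asymptotically complete. We will be mainly interested in the case where $H_0$ is the free Schrödinger operator and hence $\mathfrak{H}_{ac}(H_0) = \mathfrak{H}$. In this latter case the wave operators exist if $D(\Omega_\pm) = \mathfrak{H}$, they are complete if $\mathfrak{H}_{ac}(H) = \operatorname{Ran}(\Omega_\pm)$, and asymptotically complete if $\mathfrak{H}_c(H) = \operatorname{Ran}(\Omega_\pm)$. In particular, asymptotic completeness implies $\mathfrak{H}_{sc}(H) = \{0\}$ since $H$ restricted to $\operatorname{Ran}(\Omega_\pm)$ is unitarily equivalent to $H_0$.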
12.2. Incoming and outgoing states
In the remaining sections we want to apply this theory to Schrödinger operators. Our first goal is to give a precise meaning to some terms in the intuitive picture of scattering theory introduced in the previous section.
This physical picture suggests that we should be able to decompose $\psi \in \mathfrak{H}$ into an incoming and an outgoing part. But how should incoming respectively outgoing be defined for $\psi \in \mathfrak{H}$? Well, incoming (outgoing) means that the expectation of $x^2$ should decrease (increase). Set $x(t)^2 = \exp(\mathrm{i}H_0t)x^2\exp(-\mathrm{i}H_0t)$, then, abbreviating $\psi(t) = e^{-\mathrm{i}tH_0}\psi$,
\[ \frac{d}{dt}E_\psi(x(t)^2) = \langle\psi(t), \mathrm{i}[H_0, x^2]\psi(t)\rangle = 4\langle\psi(t), D\psi(t)\rangle, \qquad \psi \in \mathcal{S}(\mathbb{R}^n), \tag{12.20} \]
where $D$ is the dilation operator introduced in (10.11). Hence it is natural to consider $\psi \in \operatorname{Ran}(P_\pm)$,
\[ P_\pm = P_D((0, \pm\infty)), \tag{12.21} \]
as outgoing respectively incoming states. If we project a state in $\operatorname{Ran}(P_\pm)$ to energies in the interval $(a^2, b^2)$, we expect that it cannot be found in a ball of radius proportional to $a|t|$ as $t \to \pm\infty$ ($2a$ is the minimal velocity of the particle, since $H_0 = -\Delta$ corresponds to mass $1/2$). In fact, we will show below that the tail decays faster than any inverse power of $|t|$.
We first collect some properties of $D$ which will be needed later on. Note
\[ \mathcal{F}D = -D\mathcal{F} \tag{12.22} \]
and hence $\mathcal{F}f(D) = f(-D)\mathcal{F}$. To say more we will look for a transformation which maps $D$ to a multiplication operator.
Since the dilation group acts on $|x|$ only, it seems reasonable to switch to polar coordinates $x = r\omega$, $(r, \omega) \in \mathbb{R}^+ \times S^{n-1}$. Since $U(s)$ essentially transforms $r$ into $r\exp(s)$ we will replace $r$ by $\rho = \ln(r)$. In these coordinates we have
\[ U(s)\psi(e^\rho\omega) = e^{-ns/2}\psi(e^{(\rho - s)}\omega) \tag{12.23} \]
and hence $U(s)$ corresponds to a shift of $\rho$ (the constant in front is absorbed by the volume element). Thus $D$ corresponds to differentiation with respect to this coordinate and all we have to do to make it a multiplication operator is to take the Fourier transform with respect to $\rho$.
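This leads us to the Mellin transform
\[ \mathcal{M} : L^2(\mathbb{R}^n) \to L^2(\mathbb{R} \times S^{n-1}), \qquad \psi(r\omega) \mapsto (\mathcal{M}\psi)(\lambda, \omega) = \frac{1}{\sqrt{2\pi}}\int_0^\infty r^{-\mathrm{i}\lambda}\psi(r\omega)r^{\frac{n}{2}-1}dr. \tag{12.24} \]
By construction, $\mathcal{M}$ is unitary, that is,
\[ \int_{\mathbb{R}}\int_{S^{n-1}}|(\mathcal{M}\psi)(\lambda,\omega)|^2 d\lambda\, d^{n-1}\omega = \int_{\mathbb{R}^+}\int_{S^{n-1}}|\psi(r\omega)|^2 r^{n-1}dr\, d^{n-1}\omega, \tag{12.25} \]
where $d^{n-1}\omega$ is the normalized surface measure on $S^{n-1}$. Moreover,
\[ \mathcal{M}U(s)\mathcal{M}^{-1} = e^{-\mathrm{i}s\lambda} \tag{12.26} \]
and hence
\[ \mathcal{M}D\mathcal{M}^{-1} = \lambda. \tag{12.27} \]
From this it is straightforward to show that
\[ \sigma(D) = \sigma_{ac}(D) = \mathbb{R}, \qquad \sigma_{sc}(D) = \sigma_{pp}(D) = \emptyset \tag{12.28} \]
and that $\mathcal{S}(\mathbb{R}^n)$ is a core for $D$. In particular we have $P_+ + P_- = I$.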
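Using the Mellin transform we can now prove Perry's estimate [11].

Lemma 12.5. Suppose $f \in C_c^\infty(\mathbb{R})$ with $\operatorname{supp}(f) \subset (a^2, b^2)$ for some $a, b > 0$. For any $R \in \mathbb{R}$, $N \in \mathbb{N}$ there is a constant $C$ such that
\[ \|\chi_{\{x \mid |x| < 2a|t|\}}\, e^{-\mathrm{i}tH_0}f(H_0)P_D((\pm R, \pm\infty))\| \le \frac{C}{(1+|t|)^N}, \qquad \pm t \ge 0, \tag{12.29} \]
respectively.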
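Proof. We prove only the + case, the remaining one being similar. Consider $\psi \in \mathcal{S}(\mathbb{R}^n)$. Introducing
\[ \begin{aligned}
\psi(t,x) &= e^{-\mathrm{i}tH_0}f(H_0)P_D((R,\infty))\psi(x) = \langle K_{t,x}, \mathcal{F}P_D((R,\infty))\psi\rangle \\
&= \langle K_{t,x}, P_D((-\infty,-R))\hat{\psi}\rangle,
\end{aligned} \tag{12.30} \]
where
\[ K_{t,x}(p) = \frac{1}{(2\pi)^{n/2}}e^{\mathrm{i}(tp^2 - px)}f(p^2)^*, \tag{12.31} \]
we see that it suffices to show
\[ \|P_D((-\infty,-R))K_{t,x}\|^2 \le \frac{\text{const}}{(1+|t|)^{2N}}, \qquad \text{for } |x| < 2a|t|,\ t > 0. \tag{12.32} \]
Now we invoke the Mellin transform to estimate this norm,
\[ \|P_D((-\infty,-R))K_{t,x}\|^2 = \int_{-\infty}^{-R}\int_{S^{n-1}}|(\mathcal{M}K_{t,x})(\lambda,\omega)|^2 d\lambda\, d^{n-1}\omega. \tag{12.33} \]
Since
\[ (\mathcal{M}K_{t,x})(\lambda,\omega) = \frac{1}{(2\pi)^{(n+1)/2}}\int_0^\infty \tilde{f}(r)e^{\mathrm{i}\alpha(r)}dr \tag{12.34} \]
with $\tilde{f}(r) = f(r^2)^* r^{n/2-1} \in C_c^\infty((a,b))$ and $\alpha(r) = tr^2 - r\omega x - \lambda\ln(r)$, estimating the derivative of $\alpha$ we see
\[ \alpha'(r) = 2tr - \omega x - \frac{\lambda}{r} > 0, \qquad r \in \operatorname{supp}(\tilde{f}), \tag{12.35} \]
for $\lambda \le -R$, $|x| < 2a|t|$ and $t$ sufficiently large. Hence we can find a constant such that
\[ \frac{1}{|\alpha'(r)|} \le \frac{\text{const}}{1+|\lambda|+|t|}, \qquad r \in \operatorname{supp}(\tilde{f}), \tag{12.36} \]
and $\lambda$, $t$ as above. Using this we can estimate the integral in (12.34),
\[ \left|\int_0^\infty \tilde{f}(r)\frac{1}{\alpha'(r)}\frac{d}{dr}e^{\mathrm{i}\alpha(r)}dr\right| \le \frac{\text{const}}{1+|\lambda|+|t|}\left|\int_0^\infty \tilde{f}'(r)e^{\mathrm{i}\alpha(r)}dr\right| \tag{12.37} \]
(the last step uses integration by parts) for $\lambda$, $t$ as above. By increasing the constant we can even assume that it holds for $t \ge 0$ and $\lambda \le -R$. Moreover, by iterating the last estimate we see
\[ |(\mathcal{M}K_{t,x})(\lambda,\omega)| \le \frac{\text{const}}{(1+|\lambda|+|t|)^N} \tag{12.38} \]
for any $N \in \mathbb{N}$ and $t \ge 0$ and $\lambda \le -R$. This finishes the proof.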
Corollary 12.6. Suppose that f C
c
((0, )) and R R. Then the
operator P
D
((R, ))f(H
0
) exp(itH
0
) converges strongly to 0 as t
.
Proof. Abbreviating P
D
= P
D
((R, )) and =
{x x<2at}
we have
P
D
f(H
0
)e
itH
0
 e
itH
0
f(H
0
)
P
D
 +f(H
0
)(I). (12.39)
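since $\|A\| = \|A^*\|$. Taking $t \to \mp\infty$, the first term goes to zero by Lemma 12.5 and the second goes to zero since $\chi\psi \to \psi$.
12.3. Schrödinger operators with short range potentials
To measure the decay of the potential $V$ we use the quantities
\[ h_1(r) = \| V R_{H_0}(z) \chi_r \|, \qquad h_2(r) = \| \chi_r V R_{H_0}(z) \|, \qquad r \ge 0, \tag{12.40} \]
where
\[ \chi_r = \chi_{\{x \,|\, |x| \ge r\}}. \tag{12.41} \]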
The potential $V$ will be called short range if these quantities are integrable. We first note that it suffices to check this for $h_1$ or $h_2$ and for one $z \in \rho(H_0)$.
Lemma 12.7. The function $h_1$ is integrable if and only if $h_2$ is. Moreover, $h_j$ integrable for one $z_0 \in \rho(H_0)$ implies $h_j$ integrable for all $z \in \rho(H_0)$.
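Proof. Pick $\varphi \in C^\infty(\mathbb{R}^n, [0,1])$ such that $\varphi(x) = 0$ for $0 \le |x| \le 1/2$ and $\varphi(x) = 1$ for $1 \le |x|$. Then it is not hard to see that $h_j$ is integrable if and only if $\tilde{h}_j$ is integrable, where
\[ \tilde{h}_1(r) = \| V R_{H_0}(z) \varphi_r \|, \qquad \tilde{h}_2(r) = \| \varphi_r V R_{H_0}(z) \|, \qquad r \ge 1, \tag{12.42} \]
and $\varphi_r(x) = \varphi(x/r)$. Using
\[ [R_{H_0}(z), \varphi_r] = -R_{H_0}(z) [H_0 - z, \varphi_r] R_{H_0}(z) = R_{H_0}(z) \big( \Delta\varphi_r + 2(\partial\varphi_r)\partial \big) R_{H_0}(z) \tag{12.43} \]
and $\Delta\varphi_r = \varphi_{r/2}\Delta\varphi_r$, $\|\Delta\varphi_r\|_\infty \le \|\Delta\varphi\|_\infty / r^2$ respectively $(\partial\varphi_r) = \varphi_{r/2}(\partial\varphi_r)$, $\|\partial\varphi_r\|_\infty \le \|\partial\varphi\|_\infty / r$ we see
\[ |\tilde{h}_1(r) - \tilde{h}_2(r)| \le \frac{c}{r} \tilde{h}_1(r/2), \qquad r \ge 1. \tag{12.44} \]
Hence $\tilde{h}_2$ is integrable if $\tilde{h}_1$ is. Conversely,
\[ \tilde{h}_1(r) \le \tilde{h}_2(r) + \frac{c}{r} \tilde{h}_1(r/2) \le \tilde{h}_2(r) + \frac{c}{r} \tilde{h}_2(r/2) + \frac{2c^2}{r^2} \tilde{h}_1(r/4) \tag{12.45} \]
shows that $\tilde{h}_1$ is integrable if $\tilde{h}_2$ is.
Invoking the first resolvent formula
\[ \| \varphi_r V R_{H_0}(z) \| \le \| \varphi_r V R_{H_0}(z_0) \| \, \| \mathbb{I} + (z - z_0) R_{H_0}(z) \| \tag{12.46} \]
finishes the proof.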
As a first consequence note
Lemma 12.8. If $V$ is short range, then $R_H(z) - R_{H_0}(z)$ is compact.
Proof. The operator $R_H(z) V (\mathbb{I} - \chi_r) R_{H_0}(z)$ is compact since $(\mathbb{I} - \chi_r) R_{H_0}(z)$ is by Lemma 7.10 and $R_H(z)V$ is bounded by Lemma 6.22. Moreover, by our short range condition it converges in norm to
\[ R_H(z) V R_{H_0}(z) = R_{H_0}(z) - R_H(z) \tag{12.47} \]
as $r \to \infty$ (at least for some subsequence).
In particular, by Weyl's theorem we have $\sigma_{ess}(H) = [0,\infty)$. Moreover, $V$ short range implies that $H$ and $H_0$ look alike far outside.
Lemma 12.9. Suppose $R_H(z) - R_{H_0}(z)$ is compact, then so is $f(H) - f(H_0)$ for any $f \in C_\infty(\mathbb{R})$ and
\[ \lim_{r\to\infty} \| (f(H) - f(H_0)) \chi_r \| = 0. \tag{12.48} \]
Proof. The first part is Lemma 6.20 and the second part follows from part (ii) of Lemma 6.8 since $\chi_r$ converges strongly to 0.
However, this is clearly not enough to prove asymptotic completeness and we need a more careful analysis. The main ideas are due to Enß [4].
We begin by showing that the wave operators exist. By Cook's criterion (Lemma 12.3) we need to show that
\[ \| V \exp(\mp\mathrm{i}tH_0)\psi \| \le \| V R_{H_0}(-1) \| \, \| (\mathbb{I} - \chi_{2a|t|}) \exp(\mp\mathrm{i}tH_0)(H_0 + \mathbb{I})\psi \| + \| V R_{H_0}(-1) \chi_{2a|t|} \| \, \| (H_0 + \mathbb{I})\psi \| \tag{12.49} \]
is integrable for a dense set of vectors $\psi$. The second term is integrable by our short range assumption. The same is true by Perry's estimate (Lemma 12.5) for the first term if we choose $\psi = f(H_0) P_D((\pm R, \pm\infty))\varphi$. Since vectors of this form are dense, we see that the wave operators exist,
\[ \mathfrak{D}(\Omega_\pm) = \mathfrak{H}. \tag{12.50} \]
Since $H$ restricted to $\mathrm{Ran}(\Omega_\pm)$ is unitarily equivalent to $H_0$, we obtain $[0,\infty) = \sigma_{ac}(H_0) \subseteq \sigma_{ac}(H)$. And by $\sigma_{ac}(H) \subseteq \sigma_{ess}(H) = [0,\infty)$ we even have $\sigma_{ac}(H) = [0,\infty)$.
To prove asymptotic completeness of the wave operators we will need that $(\Omega_\pm - \mathbb{I}) f(H_0) P_\pm$ are compact.
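Lemma 12.10. Let $f \in C_c^\infty((0,\infty))$ and suppose $\psi_n$ converges weakly to 0. Then
\[ \lim_{n\to\infty} \| (\Omega_\pm - \mathbb{I}) f(H_0) P_\pm \psi_n \| = 0, \tag{12.51} \]
that is, $(\Omega_\pm - \mathbb{I}) f(H_0) P_\pm$ is compact.
Proof. By (12.13) we see
\[ \| R_H(z) (\Omega_\pm - \mathbb{I}) f(H_0) P_\pm \psi_n \| \le \int_0^\infty \| R_H(z) V \exp(\mp\mathrm{i}sH_0) f(H_0) P_\pm \psi_n \| \, ds. \tag{12.52} \]
Since $R_H(z) V R_{H_0}$ is compact we see that the integrand
\[ R_H(z) V \exp(\mp\mathrm{i}sH_0) f(H_0) P_\pm \psi_n = R_H(z) V R_{H_0} \exp(\mp\mathrm{i}sH_0) (H_0 + 1) f(H_0) P_\pm \psi_n \tag{12.53} \]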
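converges pointwise to 0. Moreover, arguing as in (12.49) the integrand is bounded by an $L^1$ function depending only on $\|\psi_n\|$. Thus $R_H(z) (\Omega_\pm - \mathbb{I}) f(H_0) P_\pm$ is compact by the dominated convergence theorem. Furthermore, using the intertwining property (with $z = -1$) we see that
\[ (\Omega_\pm - \mathbb{I}) f(H_0) P_\pm = R_H(z) (\Omega_\pm - \mathbb{I}) \tilde{f}(H_0) P_\pm + (R_H(z) - R_{H_0}(z)) \tilde{f}(H_0) P_\pm \tag{12.54} \]
is compact by Lemma 6.20 (apply the previous step to $\tilde{f}$ in place of $f$), where $\tilde{f}(\lambda) = (\lambda + 1) f(\lambda)$.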
Now we have gathered enough information to tackle the problem of asymptotic completeness.
We first show that the singular continuous spectrum is absent. This is not really necessary, but avoids the use of Cesàro means in our main argument.
Abbreviate $P = P_H^{sc} P_H((a,b))$, $0 < a < b$. Since $H$ restricted to $\mathrm{Ran}(\Omega_\pm)$ is unitarily equivalent to $H_0$ (which has purely absolutely continuous spectrum), the singular part must live on $\mathrm{Ran}(\Omega_\pm)^\perp$, that is, $P_H^{sc} \Omega_\pm = 0$. Thus $P f(H_0) = P(\mathbb{I} - \Omega_+) f(H_0) P_+ + P(\mathbb{I} - \Omega_-) f(H_0) P_-$ is compact. Since $f(H) - f(H_0)$ is compact, it follows that $P f(H)$ is also compact. Choosing $f$ such that $f(\lambda) = 1$ for $\lambda \in [a,b]$ we see that $P = P f(H)$ is compact and hence finite dimensional. In particular $\sigma_{sc}(H) \cap (a,b)$ is a finite set. But a continuous measure cannot be supported on a finite set, showing $\sigma_{sc}(H) \cap (a,b) = \emptyset$. Since $0 < a < b$ are arbitrary we even have $\sigma_{sc}(H) \cap (0,\infty) = \emptyset$ and by $\sigma_{sc}(H) \subseteq \sigma_{ess}(H) = [0,\infty)$ we obtain $\sigma_{sc}(H) = \emptyset$.
Observe that replacing $P_H^{sc}$ by $P_H^{pp}$ the same argument shows that all nonzero eigenvalues have finite multiplicity and cannot accumulate in $(0,\infty)$.
In summary we have shown
Theorem 12.11. Suppose $V$ is short range. Then
\[ \sigma_{ac}(H) = \sigma_{ess}(H) = [0,\infty), \qquad \sigma_{sc}(H) = \emptyset. \tag{12.55} \]
All nonzero eigenvalues have finite multiplicity and cannot accumulate in $(0,\infty)$.
Now we come to the anticipated asymptotic completeness result of Enß. Choose
\[ \psi \in \mathfrak{H}_c(H) = \mathfrak{H}_{ac}(H) \quad \text{such that} \quad \psi = f(H)\psi \tag{12.56} \]
for some $f \in C_c^\infty((0,\infty))$. By the RAGE theorem the sequence $\psi(t)$ converges weakly to zero as $t \to \pm\infty$. Abbreviate $\psi(t) = \exp(-\mathrm{i}tH)\psi$. Introduce
\[ \varphi_\pm(t) = f(H_0) P_\pm \psi(t), \tag{12.57} \]
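which satisfy
\[ \lim_{t\to\pm\infty} \| \psi(t) - \varphi_+(t) - \varphi_-(t) \| = 0. \tag{12.58} \]
Indeed this follows from
\[ \psi(t) = \varphi_+(t) + \varphi_-(t) + (f(H) - f(H_0))\psi(t) \tag{12.59} \]
and the fact that $f(H) - f(H_0)$ is compact while $\psi(t)$ converges weakly to zero. Moreover, we even have
\[ \lim_{t\to\pm\infty} \| (\Omega_\pm - \mathbb{I}) \varphi_\pm(t) \| = 0 \tag{12.60} \]
by Lemma 12.10. Now suppose $\psi \in \mathrm{Ran}(\Omega_\pm)^\perp$, then
\begin{align*}
\|\psi\|^2 &= \lim_{t\to\pm\infty} \langle \psi(t), \psi(t) \rangle \\
&= \lim_{t\to\pm\infty} \langle \psi(t), \varphi_+(t) + \varphi_-(t) \rangle \\
&= \lim_{t\to\pm\infty} \langle \psi(t), \Omega_+\varphi_+(t) + \Omega_-\varphi_-(t) \rangle. \tag{12.61}
\end{align*}
By Theorem 12.2, $\mathrm{Ran}(\Omega_\pm)^\perp$ is invariant under $H$ and hence $\psi(t) \in \mathrm{Ran}(\Omega_\pm)^\perp$, implying
\begin{align*}
\|\psi\|^2 &= \lim_{t\to\pm\infty} \langle \psi(t), \Omega_\mp\varphi_\mp(t) \rangle \tag{12.62} \\
&= \lim_{t\to\pm\infty} \langle P_\mp f(H_0)^* \Omega_\mp^* \psi(t), \psi(t) \rangle.
\end{align*}
Invoking the intertwining property we see
\[ \|\psi\|^2 = \lim_{t\to\pm\infty} \langle P_\mp f(H_0)^* e^{-\mathrm{i}tH_0} \Omega_\mp^* \psi, \psi(t) \rangle = 0 \tag{12.63} \]
by Corollary 12.6. Hence $\mathrm{Ran}(\Omega_\pm) = \mathfrak{H}_{ac}(H) = \mathfrak{H}_c(H)$ and we thus have shown
Theorem 12.12 (Enß). Suppose $V$ is short range, then the wave operators are asymptotically complete.
For further results and references see for example [3].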
Part 3. Appendix
Appendix A. Almost everything about Lebesgue integration
In this appendix I give a brief introduction to measure theory. Good references are [2] or [18].
A.1. Borel measures in a nut shell
The first step in defining the Lebesgue integral is extending the notion of size from intervals to arbitrary sets. Unfortunately, this turns out to be too much, since a classical paradox by Banach and Tarski shows that one can break the unit ball in $\mathbb{R}^3$ into a finite number of pieces, rotate and translate them, and reassemble them to get two copies of the unit ball (compare Problem A.1). The pieces are wild: choosing them uses the Axiom of Choice and cannot be done with a jigsaw. Hence any reasonable notion of size (i.e., one which is translation and rotation invariant) cannot be defined for all sets!
A collection of subsets $\mathcal{A}$ of a given set $X$ such that
• $X \in \mathcal{A}$,
• $\mathcal{A}$ is closed under finite unions,
• $\mathcal{A}$ is closed under complements
is called an algebra. Note that $\emptyset \in \mathcal{A}$ and that, by de Morgan, $\mathcal{A}$ is also closed under finite intersections. If an algebra is closed under countable unions (and hence also countable intersections), it is called a $\sigma$-algebra.
Moreover, the intersection of any family of ($\sigma$-)algebras $\{\Sigma_\alpha\}$ is again a ($\sigma$-)algebra and for any collection $S$ of subsets there is a unique smallest ($\sigma$-)algebra $\Sigma(S)$ containing $S$ (namely the intersection of all ($\sigma$-)algebras containing $S$). It is called the ($\sigma$-)algebra generated by $S$.
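For a finite set $X$ the generated algebra can be computed by brute force, which may help to build intuition. The following is a minimal sketch and not part of the original text; the function name `generated_algebra` and the example sets are hypothetical choices made for illustration. On a finite set every algebra is automatically a $\sigma$-algebra, so closing under complements and finite unions is enough.

```python
from itertools import combinations


def generated_algebra(X, S):
    """Smallest algebra of subsets of the finite set X that contains every set in S."""
    X = frozenset(X)
    algebra = {frozenset(), X} | {frozenset(A) for A in S}
    changed = True
    while changed:
        changed = False
        current = list(algebra)
        # close under complements
        for A in current:
            complement = X - A
            if complement not in algebra:
                algebra.add(complement)
                changed = True
        # close under (pairwise, hence finite) unions
        for A, B in combinations(current, 2):
            union = A | B
            if union not in algebra:
                algebra.add(union)
                changed = True
    return algebra


# Hypothetical example: the generators {1} and {1, 2} inside X = {1, 2, 3, 4}.
example = generated_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
```

In this example the points 3 and 4 are never separated by the generators, so the resulting algebra consists of all unions of the blocks {1}, {2} and {3, 4}.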
If $X$ is a topological space, the Borel $\sigma$-algebra of $X$ is defined to be the $\sigma$-algebra generated by all open (respectively all closed) sets. Sets in the Borel $\sigma$-algebra are called Borel sets.
Example. In the case $X = \mathbb{R}^n$ the Borel $\sigma$-algebra will be denoted by $\mathfrak{B}^n$ and we will abbreviate $\mathfrak{B} = \mathfrak{B}^1$.
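Now let us turn to the definition of a measure: A set $X$ together with a $\sigma$-algebra $\Sigma$ is called a measure space. A measure $\mu$ is a map $\mu : \Sigma \to [0,\infty]$ on a $\sigma$-algebra $\Sigma$ such that
• $\mu(\emptyset) = 0$,
• $\mu(\bigcup_{j=1}^\infty A_j) = \sum_{j=1}^\infty \mu(A_j)$ if $A_j \cap A_k = \emptyset$ for all $j \ne k$ ($\sigma$-additivity).
It is called $\sigma$-finite if there is a countable cover $\{X_j\}_{j=1}^\infty$ of $X$ with $\mu(X_j) < \infty$ for all $j$. (Note that it is no restriction to assume $X_j \nearrow X$.) It is called finite if $\mu(X) < \infty$. The sets in $\Sigma$ are called measurable sets.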
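If we replace the $\sigma$-algebra by an algebra $\mathcal{A}$, then $\mu$ is called a premeasure. In this case $\sigma$-additivity clearly only needs to hold for disjoint sets $A_n$ for which $\bigcup_n A_n \in \mathcal{A}$.
We will write $A_n \nearrow A$ if $A_n \subseteq A_{n+1}$ (note $A = \bigcup_n A_n$) and $A_n \searrow A$ if $A_{n+1} \subseteq A_n$ (note $A = \bigcap_n A_n$).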
Theorem A.1. Any measure $\mu$ satisfies the following properties:
(i) $A \subseteq B$ implies $\mu(A) \le \mu(B)$ (monotonicity).
(ii) $\mu(A_n) \to \mu(A)$ if $A_n \nearrow A$ (continuity from below).
(iii) $\mu(A_n) \to \mu(A)$ if $A_n \searrow A$ and $\mu(A_1) < \infty$ (continuity from above).
Proof. The first claim is obvious. The second follows using $\tilde{A}_n = A_n \setminus A_{n-1}$ and $\sigma$-additivity. The third follows from the second using $\tilde{A}_n = A_1 \setminus A_n$ and $\mu(\tilde{A}_n) = \mu(A_1) - \mu(A_n)$.
Example. Let $A \in \mathfrak{P}(M)$ and set $\mu(A)$ to be the number of elements of $A$ (respectively $\infty$ if $A$ is infinite). This is the so-called counting measure.
Note that if $X = \mathbb{N}$ and $A_n = \{j \in \mathbb{N} \,|\, j \ge n\}$, then $\mu(A_n) = \infty$, but $\mu(\bigcap_n A_n) = \mu(\emptyset) = 0$ which shows that the requirement $\mu(A_1) < \infty$ in the last claim of Theorem A.1 is not superfluous.
A measure on the Borel $\sigma$-algebra is called a Borel measure if $\mu(C) < \infty$ for any compact set $C$. A Borel measure is called outer regular if
\[ \mu(A) = \inf_{A \subseteq O,\ O \text{ open}} \mu(O) \tag{A.1} \]
and inner regular if
\[ \mu(A) = \sup_{C \subseteq A,\ C \text{ compact}} \mu(C). \tag{A.2} \]
It is called regular if it is both outer and inner regular.
But how can we obtain some more interesting Borel measures? We will restrict ourselves to the case of $X = \mathbb{R}$ for simplicity. Then the strategy is as follows: Start with the algebra of finite unions of disjoint intervals and define $\mu$ for those sets (as the sum over the intervals). This yields a premeasure. Extend this to an outer measure for all subsets of $\mathbb{R}$. Show that the restriction to the Borel sets is a measure.
Let us first show how we should define $\mu$ for intervals: To every Borel measure on $\mathfrak{B}$ we can assign its distribution function
\[ \mu(x) = \begin{cases} -\mu((x,0]), & x < 0 \\ 0, & x = 0 \\ \mu((0,x]), & x > 0 \end{cases} \tag{A.3} \]
which is right continuous and nondecreasing. Conversely, given a right continuous nondecreasing function $\mu : \mathbb{R} \to \mathbb{R}$ we can set
\[ \mu(A) = \begin{cases} \mu(b) - \mu(a), & A = (a,b] \\ \mu(b) - \mu(a-), & A = [a,b] \\ \mu(b-) - \mu(a), & A = (a,b) \\ \mu(b-) - \mu(a-), & A = [a,b) \end{cases}, \tag{A.4} \]
where $\mu(a-) = \lim_{\varepsilon \downarrow 0} \mu(a - \varepsilon)$. In particular, this gives a premeasure on the algebra of finite unions of intervals which can be extended to a measure:
Theorem A.2. For every right continuous nondecreasing function $\mu : \mathbb{R} \to \mathbb{R}$ there exists a unique regular Borel measure $\mu$ which extends (A.4). Two different functions generate the same measure if and only if they differ by a constant.
Since the proof of this theorem is rather involved, we defer it to the next section and look at some examples first.
Example. Suppose $\Theta(x) = 0$ for $x < 0$ and $\Theta(x) = 1$ for $x \ge 0$. Then we obtain the so-called Dirac measure at 0, which is given by $\Theta(A) = 1$ if $0 \in A$ and $\Theta(A) = 0$ if $0 \notin A$.
Example. Suppose $\lambda(x) = x$, then the associated measure is the ordinary Lebesgue measure on $\mathbb{R}$. We will abbreviate the Lebesgue measure of a Borel set $A$ by $\lambda(A) = |A|$.
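The first line of (A.4) is easy to try out numerically. The following sketch is not part of the original text; the names `measure_half_open`, `heaviside` and `identity` are hypothetical, and only half-open intervals $(a,b]$ are treated so that no one-sided limits are needed.

```python
def measure_half_open(distribution, a, b):
    """Measure of the half-open interval (a, b] for a given distribution function."""
    return distribution(b) - distribution(a)


def heaviside(x):
    """Distribution function of the Dirac measure at 0 (right continuous)."""
    return 1 if x >= 0 else 0


def identity(x):
    """Distribution function of Lebesgue measure."""
    return x


dirac_of_interval_containing_zero = measure_half_open(heaviside, -1, 0)   # 0 lies in (-1, 0]
dirac_of_interval_missing_zero = measure_half_open(heaviside, 0, 1)       # 0 does not lie in (0, 1]
lebesgue_length = measure_half_open(identity, 2, 5)
```

Right continuity of the distribution function is what makes the point 0 count for $(-1,0]$ but not for $(0,1]$.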
It can be shown that Borel measures on a separable metric space are always regular.
A set $A \in \Sigma$ is called a support for $\mu$ if $\mu(X \setminus A) = 0$. A property is said to hold $\mu$-almost everywhere (a.e.) if it holds on a support for $\mu$ or, equivalently, if the set where it does not hold is contained in a set of measure zero.
Example. The set of rational numbers has Lebesgue measure zero: $\lambda(\mathbb{Q}) = 0$. In fact, any single point has Lebesgue measure zero, and so has any countable union of points (by countable additivity).
Example. The Cantor set is an example of a closed uncountable set of Lebesgue measure zero. It is constructed as follows: Start with $C_0 = [0,1]$ and remove the middle third to obtain $C_1 = [0,\frac{1}{3}] \cup [\frac{2}{3},1]$. Next, again remove the middle thirds of the remaining sets to obtain $C_2 = [0,\frac{1}{9}] \cup [\frac{2}{9},\frac{1}{3}] \cup [\frac{2}{3},\frac{7}{9}] \cup [\frac{8}{9},1]$.
[Figure: the first stages $C_0$, $C_1$, $C_2$, $C_3$, ... of the construction.]
Proceeding like this we obtain a sequence of nesting sets $C_n$ and the limit $C = \bigcap_n C_n$ is the Cantor set. Since $C_n$ is compact, so is $C$. Moreover, $C_n$ consists of $2^n$ intervals of length $3^{-n}$, and thus its Lebesgue measure is $\lambda(C_n) = (2/3)^n$. In particular, $\lambda(C) = \lim_{n\to\infty} \lambda(C_n) = 0$. Using the ternary expansion it is extremely simple to describe: $C$ is the set of all $x \in [0,1]$ whose ternary expansion contains no ones, which shows that $C$ is uncountable (why?). It has some further interesting properties: it is totally disconnected (i.e., it contains no subintervals) and perfect (it has no isolated points).
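The construction of the stages $C_n$ can be reproduced with exact rational arithmetic. The following is a small sketch and not part of the original text; the helper names `cantor_stage` and `total_length` are hypothetical.

```python
from fractions import Fraction


def remove_middle_thirds(intervals):
    """Replace each closed interval [a, b] by its left and right thirds."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result


def cantor_stage(n):
    """List of the 2**n closed intervals whose union is C_n."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = remove_middle_thirds(intervals)
    return intervals


def total_length(intervals):
    """Lebesgue measure of a finite union of disjoint intervals."""
    return sum(b - a for a, b in intervals)


stage_two = cantor_stage(2)
length_of_stage_five = total_length(cantor_stage(5))
```

Using `Fraction` avoids rounding errors, so the lengths agree exactly with $(2/3)^n$.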
Problem A.1 (Vitali set). Call two numbers $x, y \in [0,1)$ equivalent if $x - y$ is rational. Construct the set $V$ by choosing one representative from each equivalence class. Show that $V$ cannot be measurable with respect to any finite translation invariant measure on $[0,1)$. (Hint: How can you build up $[0,1)$ from $V$?)
A.2. Extending a premeasure to a measure
The purpose of this section is to prove Theorem A.2. It is rather technical and should be skipped on first reading.
In order to prove Theorem A.2 we need to show how a premeasure can be extended to a measure. As a prerequisite we first establish that it suffices to check increasing (or decreasing) sequences of sets when checking whether a given algebra is in fact a $\sigma$-algebra:
A collection of sets $\mathcal{M}$ is called a monotone class if $A_n \nearrow A$ implies $A \in \mathcal{M}$ whenever $A_n \in \mathcal{M}$ and $A_n \searrow A$ implies $A \in \mathcal{M}$ whenever $A_n \in \mathcal{M}$. Every $\sigma$-algebra is a monotone class and the intersection of monotone classes is a monotone class. Hence every collection of sets $S$ generates a smallest monotone class $\mathcal{M}(S)$.
Theorem A.3. Let $\mathcal{A}$ be an algebra. Then $\mathcal{M}(\mathcal{A}) = \Sigma(\mathcal{A})$.
Proof. We first show that $\mathcal{M} = \mathcal{M}(\mathcal{A})$ is an algebra.
Put $M(A) = \{B \in \mathcal{M} \,|\, A \cup B \in \mathcal{M}\}$. If $B_n$ is an increasing sequence of sets in $M(A)$ then $A \cup B_n$ is an increasing sequence in $\mathcal{M}$ and hence $\bigcup_n (A \cup B_n) \in \mathcal{M}$. Now
\[ A \cup \Big( \bigcup_n B_n \Big) = \bigcup_n (A \cup B_n) \tag{A.5} \]
shows that $M(A)$ is closed under increasing sequences. Similarly, $M(A)$ is closed under decreasing sequences and hence it is a monotone class. But does it contain any elements? Well if $A \in \mathcal{A}$ we have $\mathcal{A} \subseteq M(A)$ implying $M(A) = \mathcal{M}$. Hence $A \cup B \in \mathcal{M}$ if at least one of the sets is in $\mathcal{A}$. But this shows $\mathcal{A} \subseteq M(A)$ and hence $M(A) = \mathcal{M}$ for any $A \in \mathcal{M}$. So $\mathcal{M}$ is closed under finite unions.
To show that we are closed under complements consider $M = \{A \in \mathcal{M} \,|\, X \setminus A \in \mathcal{M}\}$. If $A_n$ is an increasing sequence then $X \setminus A_n$ is a decreasing sequence and $X \setminus \bigcup_n A_n = \bigcap_n X \setminus A_n \in \mathcal{M}$ if $A_n \in M$. Similarly for decreasing sequences. Hence $M$ is a monotone class and must be equal to $\mathcal{M}$ since it contains $\mathcal{A}$.
So we know that $\mathcal{M}$ is an algebra. To show that it is a $\sigma$-algebra let $A_n$ be given and put $\tilde{A}_n = \bigcup_{k \le n} A_k$. Then $\tilde{A}_n$ is increasing and $\bigcup_n \tilde{A}_n = \bigcup_n A_n \in \mathcal{M}$.
The typical use of this theorem is as follows: First verify some property for sets in an algebra $\mathcal{A}$. In order to show that it holds for any set in $\Sigma(\mathcal{A})$, it suffices to show that the collection of sets for which it holds is closed under countable increasing and decreasing sequences (i.e., is a monotone class).
Now we start by proving that (A.4) indeed gives rise to a premeasure.
Lemma A.4. $\mu$ as defined in (A.4) gives rise to a unique $\sigma$-finite regular premeasure on the algebra $\mathcal{A}$ of finite unions of disjoint intervals.
Proof. First of all, (A.4) can be extended to finite unions of disjoint intervals by summing over all intervals. It is straightforward to verify that $\mu$ is well defined (one set can be represented by different unions of intervals) and by construction additive.
To show regularity, we can assume any such union to consist of open intervals and points only. To show outer regularity replace each point $\{x\}$ by a small open interval $(x-\varepsilon, x+\varepsilon)$ and use that $\mu(\{x\}) = \lim_{\varepsilon \downarrow 0} \mu(x+\varepsilon) - \mu(x-\varepsilon)$. Similarly, to show inner regularity, replace each open interval $(a,b)$ by a compact one $[a_n, b_n] \subseteq (a,b)$ and use $\mu((a,b)) = \lim_{n\to\infty} \mu(b_n) - \mu(a_n)$ if $a_n \downarrow a$ and $b_n \uparrow b$.
It remains to verify $\sigma$-additivity. We need to show
\[ \mu\Big( \bigcup_k I_k \Big) = \sum_k \mu(I_k) \tag{A.6} \]
whenever $I_n \in \mathcal{A}$ and $I = \bigcup_k I_k \in \mathcal{A}$. Since each $I_n$ is a finite union of intervals, we can as well assume each $I_n$ is just one interval (just split $I_n$ into its subintervals and note that the sum does not change by additivity). Similarly, we can assume that $I$ is just one interval (just treat each subinterval separately).
By additivity $\mu$ is monotone and hence
\[ \sum_{k=1}^n \mu(I_k) = \mu\Big( \bigcup_{k=1}^n I_k \Big) \le \mu(I) \tag{A.7} \]
which shows
\[ \sum_{k=1}^\infty \mu(I_k) \le \mu(I). \tag{A.8} \]
To get the converse inequality we need to work harder.
By outer regularity we can cover each $I_k$ by an open interval $J_k$ such that $\mu(J_k) \le \mu(I_k) + \frac{\varepsilon}{2^k}$. Suppose $I$ is compact first. Then finitely many of the $J_k$, say the first $n$, cover $I$ and we have
\[ \mu(I) \le \mu\Big( \bigcup_{k=1}^n J_k \Big) \le \sum_{k=1}^n \mu(J_k) \le \sum_{k=1}^\infty \mu(I_k) + \varepsilon. \tag{A.9} \]
Since $\varepsilon > 0$ is arbitrary, this shows $\sigma$-additivity for compact intervals. By additivity we can always add/subtract the end points of $I$ and hence $\sigma$-additivity holds for any bounded interval. If $I$ is unbounded, say $I = [a,\infty)$, then given $x > a$ we can find an $n$ such that $J_1, \dots, J_n$ cover at least $[a,x]$ and hence
\[ \sum_{k=1}^n \mu(I_k) \ge \sum_{k=1}^n \mu(J_k) - \varepsilon \ge \mu([a,x]) - \varepsilon. \tag{A.10} \]
Since $x > a$ and $\varepsilon > 0$ are arbitrary we are done.
This premeasure determines the corresponding measure $\mu$ uniquely (if there is one at all):
Theorem A.5 (Uniqueness of measures). Let $\mu$ be a $\sigma$-finite premeasure on an algebra $\mathcal{A}$. Then there is at most one extension to $\Sigma(\mathcal{A})$.
Proof. We first assume that $\mu(X) < \infty$. Suppose there is another extension $\tilde{\mu}$ and consider the set
\[ S = \{ A \in \Sigma(\mathcal{A}) \,|\, \mu(A) = \tilde{\mu}(A) \}. \tag{A.11} \]
I claim $S$ is a monotone class and hence $S = \Sigma(\mathcal{A})$ since $\mathcal{A} \subseteq S$ by assumption (Theorem A.3).
Let $A_n \nearrow A$. If $A_n \in S$ we have $\mu(A_n) = \tilde{\mu}(A_n)$ and taking limits (Theorem A.1 (ii)) we conclude $\mu(A) = \tilde{\mu}(A)$. Next let $A_n \searrow A$ and take limits again (Theorem A.1 (iii)). This finishes the finite case. To extend our result to the $\sigma$-finite case let $X_j \nearrow X$ be an increasing sequence such that $\mu(X_j) < \infty$. By the finite case $\mu(A \cap X_j) = \tilde{\mu}(A \cap X_j)$ (just restrict $\mu$, $\tilde{\mu}$ to $X_j$). Hence
\[ \mu(A) = \lim_{j\to\infty} \mu(A \cap X_j) = \lim_{j\to\infty} \tilde{\mu}(A \cap X_j) = \tilde{\mu}(A) \tag{A.12} \]
and we are done.
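Note that if our premeasure is regular, so will be the extension:
Lemma A.6. Suppose $\mu$ is a $\sigma$-finite premeasure on some algebra $\mathcal{A}$ generating the Borel sets $\mathfrak{B}$. Then outer (inner) regularity holds for all Borel sets if it holds for all sets in $\mathcal{A}$.
Proof. We first assume that $\mu(X) < \infty$. Set
\[ \mu^\circ(A) = \inf_{A \subseteq O,\ O \text{ open}} \mu(O) \ge \mu(A) \tag{A.13} \]
and let $M = \{A \in \mathfrak{B} \,|\, \mu^\circ(A) = \mu(A)\}$. Since $M$ contains $\mathcal{A}$ by assumption, it suffices to show that $M$ is a monotone class. Let $A_n \in M$ be a monotone sequence and let $O_n \supseteq A_n$ be open sets such that
\[ \mu(A_n) \le \mu(O_n) \le \mu(A_n) + \frac{1}{n}. \tag{A.14} \]
Now if $A_n \searrow A$ just take limits and use continuity from below of $\mu$. Similarly if $A_n \nearrow A$.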
Now let $\mu$ be arbitrary. Let $X_j$ be a cover with $\mu(X_j) < \infty$. By regularity, we can assume $X_j$ open. Given $A$ we can split it into disjoint sets $A_j$ such that $A_j \subseteq X_j$ ($A_1 = A \cap X_1$, $A_2 = (A \setminus A_1) \cap X_2$, etc.). Thus there are open (in $X$) sets $O_j$ covering $A_j$ such that $\mu(O_j) \le \mu(A_j) + \frac{\varepsilon}{2^j}$. Then $O = \bigcup_j O_j$ is open, covers $A$, and satisfies
\[ \mu(A) \le \mu(O) \le \sum_j \mu(O_j) \le \mu(A) + \varepsilon. \tag{A.15} \]
This settles outer regularity.
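Next let us turn to inner regularity. If $\mu(X) < \infty$ one can show as before that $M = \{A \in \mathfrak{B} \,|\, \mu_\circ(A) = \mu(A)\}$, where
\[ \mu_\circ(A) = \sup_{C \subseteq A,\ C \text{ compact}} \mu(C) \le \mu(A), \tag{A.16} \]
is a monotone class. This settles the finite case.
For the $\sigma$-finite case split again $A$ as before. Since $X_j$ has finite measure, there are compact subsets $K_j$ of $A_j$ such that $\mu(A_j) \le \mu(K_j) + \frac{\varepsilon}{2^j}$. Now we need to distinguish two cases: If $\mu(A) = \infty$, the sum $\sum_j \mu(A_j)$ will diverge and so will $\sum_j \mu(K_j)$. Hence $\tilde{K}_n = \bigcup_{j=1}^n K_j \subseteq A$ is compact with $\mu(\tilde{K}_n) \to \infty = \mu(A)$. If $\mu(A) < \infty$, the sum $\sum_j \mu(A_j)$ will converge and choosing $n$ sufficiently large we will have
\[ \mu(\tilde{K}_n) \le \mu(A) \le \mu(\tilde{K}_n) + 2\varepsilon. \tag{A.17} \]
This finishes the proof.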
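So it remains to ensure that there is an extension at all. For any premeasure $\mu$ we define
\[ \mu^*(A) = \inf \Big\{ \sum_{n=1}^\infty \mu(A_n) \,\Big|\, A \subseteq \bigcup_{n=1}^\infty A_n,\ A_n \in \mathcal{A} \Big\} \tag{A.18} \]
where the infimum extends over all countable covers from $\mathcal{A}$. Then the function $\mu^* : \mathfrak{P}(X) \to [0,\infty]$ is an outer measure, that is, it has the properties
• $\mu^*(\emptyset) = 0$,
• $A_1 \subseteq A_2 \Rightarrow \mu^*(A_1) \le \mu^*(A_2)$, and
• $\mu^*(\bigcup_{n=1}^\infty A_n) \le \sum_{n=1}^\infty \mu^*(A_n)$ (subadditivity).
Note that $\mu^*(A) = \mu(A)$ for $A \in \mathcal{A}$.
Theorem A.7. Let $\mu^*$ be an outer measure. Then the set $\Sigma$ of all sets $A$ satisfying the Carathéodory condition
\[ \mu^*(E) = \mu^*(A \cap E) + \mu^*(A' \cap E) \qquad \forall E \subseteq X \tag{A.19} \]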
(where $A' = X \setminus A$ is the complement of $A$) forms a $\sigma$-algebra and $\mu^*$ restricted to this $\sigma$-algebra is a measure.
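Proof. We first show that $\Sigma$ is an algebra. It clearly contains $X$ and is closed under complements. Let $A, B \in \Sigma$. Applying Carathéodory's condition twice finally shows
\begin{align*}
\mu^*(E) &= \mu^*(A \cap B \cap E) + \mu^*(A' \cap B \cap E) + \mu^*(A \cap B' \cap E) + \mu^*(A' \cap B' \cap E) \\
&\ge \mu^*((A \cup B) \cap E) + \mu^*((A \cup B)' \cap E), \tag{A.20}
\end{align*}
where we have used De Morgan and
\[ \mu^*(A \cap B \cap E) + \mu^*(A' \cap B \cap E) + \mu^*(A \cap B' \cap E) \ge \mu^*((A \cup B) \cap E) \tag{A.21} \]
which follows from subadditivity and $(A \cup B) \cap E = (A \cap B \cap E) \cup (A' \cap B \cap E) \cup (A \cap B' \cap E)$. Since the reverse inequality is just subadditivity, we conclude that $\Sigma$ is an algebra.
Next, let $A_n$ be a sequence of sets from $\Sigma$. Without restriction we can assume that they are disjoint (compare the last argument in the proof of Theorem A.3). Abbreviate $\tilde{A}_n = \bigcup_{k \le n} A_k$, $A = \bigcup_n A_n$. Then for any set $E$ we have
\begin{align*}
\mu^*(\tilde{A}_n \cap E) &= \mu^*(A_n \cap \tilde{A}_n \cap E) + \mu^*(A_n' \cap \tilde{A}_n \cap E) \\
&= \mu^*(A_n \cap E) + \mu^*(\tilde{A}_{n-1} \cap E) \\
&= \dots = \sum_{k=1}^n \mu^*(A_k \cap E). \tag{A.22}
\end{align*}
Using $\tilde{A}_n \in \Sigma$ and monotonicity of $\mu^*$, we infer
\[ \mu^*(E) = \mu^*(\tilde{A}_n \cap E) + \mu^*(\tilde{A}_n' \cap E) \ge \sum_{k=1}^n \mu^*(A_k \cap E) + \mu^*(A' \cap E). \tag{A.23} \]
Letting $n \to \infty$ and using subadditivity finally gives
\[ \mu^*(E) \ge \sum_{k=1}^\infty \mu^*(A_k \cap E) + \mu^*(A' \cap E) \ge \mu^*(A \cap E) + \mu^*(A' \cap E) \ge \mu^*(E) \tag{A.24} \]
and we infer that $\Sigma$ is a $\sigma$-algebra.
Finally, setting $E = A$ in (A.24) we have
\[ \mu^*(A) = \sum_{k=1}^\infty \mu^*(A_k \cap A) + \mu^*(A' \cap A) = \sum_{k=1}^\infty \mu^*(A_k) \tag{A.25} \]
and we are done.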
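Remark: The constructed measure $\mu$ is complete, that is, for any measurable set $A$ of measure zero, any subset of $A$ is again measurable (Problem A.4).
The only remaining question is whether there are any nontrivial sets satisfying the Carathéodory condition.
Lemma A.8. Let $\mu$ be a premeasure on $\mathcal{A}$ and let $\mu^*$ be the associated outer measure. Then every set in $\mathcal{A}$ satisfies the Carathéodory condition.
Proof. Let $A_n \in \mathcal{A}$ be a countable cover for $E$. Then for any $A \in \mathcal{A}$ we have
\[ \sum_{n=1}^\infty \mu(A_n) = \sum_{n=1}^\infty \mu(A_n \cap A) + \sum_{n=1}^\infty \mu(A_n \cap A') \ge \mu^*(E \cap A) + \mu^*(E \cap A') \tag{A.26} \]
since $A_n \cap A \in \mathcal{A}$ is a cover for $E \cap A$ and $A_n \cap A' \in \mathcal{A}$ is a cover for $E \cap A'$. Taking the infimum we have $\mu^*(E) \ge \mu^*(E \cap A) + \mu^*(E \cap A')$ which finishes the proof.
Thus, as a consequence we obtain Theorem A.2.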
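Problem A.2. Show that $\mu^*$ defined in (A.18) is an outer measure. (Hint for the last property: Take a cover $\{B_{nk}\}_{k=1}^\infty$ for $A_n$ such that $\sum_{k=1}^\infty \mu(B_{nk}) \le \mu^*(A_n) + \frac{\varepsilon}{2^n}$ and note that $\{B_{nk}\}_{n,k=1}^\infty$ is a cover for $\bigcup_n A_n$.)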
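Problem A.3. Show that $\mu^*$ defined in (A.18) extends $\mu$, that is, $\mu^*(A) = \mu(A)$ for $A \in \mathcal{A}$.
Problem A.4. Show that the measure constructed in Theorem A.7 is complete.
A.3. Measurable functions
A function $f : X \to \mathbb{R}^n$ is called measurable if $f^{-1}(A) \in \Sigma$ for every $A \in \mathfrak{B}^n$. It suffices to check this for a generating family of sets:
Lemma A.9. A function $f : X \to \mathbb{R}^n$ is measurable if and only if
\[ f^{-1}(I) \in \Sigma \qquad \forall\, I = \prod_{j=1}^n (a_j, \infty). \tag{A.27} \]
In particular, a function $f : X \to \mathbb{R}^n$ is measurable if and only if every component is measurable.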
Proof. All you have to use is $f^{-1}(\mathbb{R}^n \setminus A) = X \setminus f^{-1}(A)$, $f^{-1}(\bigcup_j A_j) = \bigcup_j f^{-1}(A_j)$ and the fact that any open set is a countable union of open intervals.
If $\Sigma$ is the Borel $\sigma$-algebra, we will call a measurable function also Borel function. Note that, in particular,
Lemma A.10. Any continuous function is measurable and the composition of two measurable functions is again measurable.
Moreover, sometimes it is also convenient to allow $\pm\infty$ as possible values for $f$, that is, functions $f : X \to \overline{\mathbb{R}}$, $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$. In this case $A \subseteq \overline{\mathbb{R}}$ is called Borel if $A \cap \mathbb{R}$ is.
The set of all measurable functions forms an algebra.
Lemma A.11. Suppose $f, g : X \to \mathbb{R}$ are measurable functions. Then the sum $f + g$ and the product $fg$ are measurable.
Proof. Note that addition and multiplication are continuous functions from $\mathbb{R}^2 \to \mathbb{R}$ and hence the claim follows from the previous lemma.
Moreover, the set of all measurable functions is closed under all important limiting operations.
Lemma A.12. Suppose $f_n : X \to \mathbb{R}$ is a sequence of measurable functions, then
\[ \inf_{n\in\mathbb{N}} f_n, \quad \sup_{n\in\mathbb{N}} f_n, \quad \liminf_{n\to\infty} f_n, \quad \limsup_{n\to\infty} f_n \tag{A.28} \]
are measurable as well.
Proof. It suffices to prove that $\sup f_n$ is measurable since the rest follows from $\inf f_n = -\sup(-f_n)$, $\liminf f_n = \sup_k \inf_{n \ge k} f_n$, and $\limsup f_n = \inf_k \sup_{n \ge k} f_n$. But $(\sup f_n)^{-1}((a,\infty)) = \bigcup_n f_n^{-1}((a,\infty))$ and we are done.
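In particular, the positive and negative parts $f^\pm = \max(\pm f, 0)$ of a measurable function $f$ are measurable.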
A.4. The Lebesgue integral
Now we can define the integral for measurable functions as follows. A measurable function $s : X \to \mathbb{R}$ is called simple if its range is finite, $s(X) = \{\alpha_j\}_{j=1}^p$, that is, if
\[ s = \sum_{j=1}^p \alpha_j \chi_{A_j}, \qquad A_j = s^{-1}(\alpha_j) \in \Sigma. \tag{A.29} \]
Here $\chi_A$ is the characteristic function of $A$, that is, $\chi_A(x) = 1$ if $x \in A$ and $\chi_A(x) = 0$ else.
For a positive simple function we define its integral as
\[ \int_A s \, d\mu = \sum_{j=1}^p \alpha_j \mu(A_j \cap A). \tag{A.30} \]
Here we use the convention $0 \cdot \infty = 0$.
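Definition (A.30) translates directly into a few lines of code when the underlying set is finite. The following is a minimal sketch and not part of the original text; the function `integrate_simple`, the two example measures and the sample data are hypothetical. A measure is modeled as a Python function that maps a set to a number (possibly `math.inf`).

```python
import math


def integrate_simple(values, level_sets, measure, A):
    """Integral over A of the simple function sum_j values[j] * chi_{level_sets[j]}."""
    total = 0
    for alpha, A_j in zip(values, level_sets):
        if alpha == 0:
            continue  # convention 0 * infinity = 0
        total += alpha * measure(A_j & A)
    return total


def counting_measure(S):
    """Number of elements of the finite set S."""
    return len(S)


def infinite_mass_at_one(S):
    """Hypothetical measure: counting measure, except that the point 1 has infinite mass."""
    return math.inf if 1 in S else len(S)


# s takes the value 2 on {0, 1}, the value 5 on {2, 3, 4} and the value 0 on {5}.
integral_counting = integrate_simple(
    [2, 5, 0], [{0, 1}, {2, 3, 4}, {5}], counting_measure, {1, 2, 3, 5}
)
# The value 0 is taken on a set of infinite measure; the convention makes this term vanish.
integral_with_infinite_set = integrate_simple(
    [0, 3], [{1}, {2}], infinite_mass_at_one, {1, 2}
)
```

The explicit check for `alpha == 0` is needed because `0 * math.inf` evaluates to `nan` in floating point arithmetic rather than to 0.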
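Lemma A.13. The integral has the following properties:
(i) $\int_A s \, d\mu = \int_X \chi_A s \, d\mu$.
(ii) $\int_{\bigcup_{j=1}^\infty A_j} s \, d\mu = \sum_{j=1}^\infty \int_{A_j} s \, d\mu$, $A_j \cap A_k = \emptyset$ for $j \ne k$.
(iii) $\int_A \alpha s \, d\mu = \alpha \int_A s \, d\mu$.
(iv) $\int_A (s + t) \, d\mu = \int_A s \, d\mu + \int_A t \, d\mu$.
(v) $A \subseteq B \Rightarrow \int_A s \, d\mu \le \int_B s \, d\mu$.
(vi) $s \le t \Rightarrow \int_A s \, d\mu \le \int_A t \, d\mu$.
Proof. (i) is clear from the definition. (ii) follows from $\sigma$-additivity of $\mu$. (iii) is obvious. (iv) Let $s = \sum_j \alpha_j \chi_{A_j}$, $t = \sum_j \beta_j \chi_{B_j}$ and abbreviate $C_{jk} = (A_j \cap B_k) \cap A$. Then
\begin{align*}
\int_A (s + t) \, d\mu &= \sum_{j,k} \int_{C_{jk}} (s + t) \, d\mu = \sum_{j,k} (\alpha_j + \beta_k) \mu(C_{jk}) \\
&= \sum_{j,k} \Big( \int_{C_{jk}} s \, d\mu + \int_{C_{jk}} t \, d\mu \Big) = \int_A s \, d\mu + \int_A t \, d\mu. \tag{A.31}
\end{align*}
(v) follows from monotonicity of $\mu$. (vi) follows using $t - s \ge 0$ and (iv).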
Our next task is to extend this definition to arbitrary positive functions by
\[ \int_A f \, d\mu = \sup_{s \le f} \int_A s \, d\mu, \tag{A.32} \]
where the supremum is taken over all simple functions $s \le f$. Note that, except for possibly (ii) and (iv), Lemma A.13 still holds for this extension.
Theorem A.14 (monotone convergence). Let $f_n$ be a monotone nondecreasing sequence of positive measurable functions, $f_n \nearrow f$. Then
\[ \int_A f_n \, d\mu \to \int_A f \, d\mu. \tag{A.33} \]
Proof. By property (vi) $\int_A f_n \, d\mu$ is monotone and converges to some number $\alpha$. By $f_n \le f$ and again (vi) we have
\[ \alpha \le \int_A f \, d\mu. \tag{A.34} \]
To show the converse let $s$ be simple such that $s \le f$ and let $\theta \in (0,1)$. Put $A_n = \{x \in A \,|\, f_n(x) \ge \theta s(x)\}$ and note $A_n \nearrow A$ (show this). Then
\[ \int_A f_n \, d\mu \ge \int_{A_n} f_n \, d\mu \ge \theta \int_{A_n} s \, d\mu. \tag{A.35} \]
Letting $n \to \infty$ we see
\[ \alpha \ge \theta \int_A s \, d\mu. \tag{A.36} \]
Since this is valid for any $\theta < 1$, it still holds for $\theta = 1$. Finally, since $s \le f$ is arbitrary, the claim follows.
In particular
\[ \int_A f \, d\mu = \lim_{n\to\infty} \int_A s_n \, d\mu, \tag{A.37} \]
for any monotone sequence $s_n \nearrow f$ of simple functions. Note that there is always such a sequence, for example,
\[ s_n(x) = \sum_{k=0}^{n^2} \frac{k}{n} \chi_{f^{-1}(A_k)}(x), \qquad A_k = \Big[ \frac{k}{n}, \frac{k+1}{n} \Big), \quad A_{n^2} = [n, \infty). \tag{A.38} \]
By construction $s_n$ converges uniformly if $f$ is bounded, since $s_n(x) = n$ if $f(x) = \infty$ and $f(x) - s_n(x) < \frac{1}{n}$ if $f(x) < n$.
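The error bound stated for (A.38) can be checked pointwise. The sketch below is not part of the original text; the function name `simple_approximation` is hypothetical, and exact fractions are used so that the comparison with $1/n$ is not disturbed by rounding.

```python
import math
from fractions import Fraction


def simple_approximation(f_value, n):
    """Value s_n(x) from (A.38), given the value f_value = f(x) >= 0."""
    if f_value >= n:
        # f(x) lies in A_{n^2} = [n, infinity); this also covers f(x) = infinity
        return Fraction(n)
    # f(x) lies in [k/n, (k+1)/n) with k = floor(n * f(x)) < n^2
    return Fraction(math.floor(n * f_value), n)


f_of_x = Fraction(7, 3)
approximation = simple_approximation(f_of_x, 4)
```

For $f(x) = 7/3$ and $n = 4$ one gets $k = \lfloor 28/3 \rfloor = 9$, so $s_4(x) = 9/4$ and the error is $1/12 < 1/4$. A dyadic variant (steps of size $2^{-n}$) is a common alternative when monotonicity in $n$ has to be read off directly.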
Now what about the missing items (ii) and (iv) from Lemma A.13? Since limits can be spread over sums, the extension is linear (i.e., item (iv) holds) and (ii) also follows directly from the monotone convergence theorem. We even have the following result:
Lemma A.15. If $f \ge 0$ is measurable, then $d\nu = f \, d\mu$ defined via
\[ \nu(A) = \int_A f \, d\mu \tag{A.39} \]
is a measure such that
\[ \int g \, d\nu = \int g f \, d\mu. \tag{A.40} \]
Proof. As already mentioned, additivity of $\nu$ is equivalent to linearity of the integral and $\sigma$-additivity follows from the monotone convergence theorem
\[ \nu\Big( \bigcup_{n=1}^\infty A_n \Big) = \int \Big( \sum_{n=1}^\infty \chi_{A_n} \Big) f \, d\mu = \sum_{n=1}^\infty \int \chi_{A_n} f \, d\mu = \sum_{n=1}^\infty \nu(A_n). \tag{A.41} \]
The second claim holds for simple functions and hence for all functions by construction of the integral.
If $f_n$ is not necessarily monotone we have at least
Theorem A.16 (Fatou's Lemma). If $f_n$ is a sequence of nonnegative measurable functions, then
\[ \int_A \liminf_{n\to\infty} f_n \, d\mu \le \liminf_{n\to\infty} \int_A f_n \, d\mu. \tag{A.42} \]
Proof. Set $g_n = \inf_{k \ge n} f_k$. Then $g_n \le f_n$ implying
\[ \int_A g_n \, d\mu \le \int_A f_n \, d\mu. \tag{A.43} \]
Now take the $\liminf$ on both sides and note that by the monotone convergence theorem
\[ \liminf_{n\to\infty} \int_A g_n \, d\mu = \lim_{n\to\infty} \int_A g_n \, d\mu = \int_A \lim_{n\to\infty} g_n \, d\mu = \int_A \liminf_{n\to\infty} f_n \, d\mu, \tag{A.44} \]
proving the claim.
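The inequality in Fatou's lemma can be strict. A standard illustration, added here as a sketch and not taken from the original text, uses Lebesgue measure $\lambda$ on $\mathbb{R}$ and a unit bump that escapes to infinity:

```latex
\[
f_n = \chi_{[n,n+1]}, \qquad
\liminf_{n\to\infty} f_n(x) = 0 \ \text{ for every } x \in \mathbb{R}, \qquad
\int_{\mathbb{R}} \liminf_{n\to\infty} f_n \, d\lambda = 0
\;<\; 1 = \liminf_{n\to\infty} \int_{\mathbb{R}} f_n \, d\lambda .
\]
```

Here every fixed $x$ lies in $[n,n+1]$ for at most two values of $n$, which is why the pointwise limit vanishes while each integral equals 1.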
If the integral is finite for both the positive and negative part $f^\pm$ of an arbitrary measurable function $f$, we call $f$ integrable and set
\[ \int_A f \, d\mu = \int_A f^+ \, d\mu - \int_A f^- \, d\mu. \tag{A.45} \]
The set of all integrable functions is denoted by $L^1(X, d\mu)$.
Lemma A.17. Lemma A.13 holds for integrable functions $s$, $t$.
Similarly, we handle the case where $f$ is complex-valued by calling $f$ integrable if both the real and imaginary part are and setting
\[ \int_A f \, d\mu = \int_A \mathrm{Re}(f) \, d\mu + \mathrm{i} \int_A \mathrm{Im}(f) \, d\mu. \tag{A.46} \]
Clearly $f$ is integrable if and only if $|f|$ is.
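Lemma A.18. For any integrable functions $f$, $g$ we have
\[ \Big| \int_A f \, d\mu \Big| \le \int_A |f| \, d\mu \tag{A.47} \]
and (triangle inequality)
\[ \int_A |f + g| \, d\mu \le \int_A |f| \, d\mu + \int_A |g| \, d\mu. \tag{A.48} \]
Proof. Put $\alpha = \frac{z^*}{|z|}$, where $z = \int_A f \, d\mu$ (without restriction $z \ne 0$). Then
\[ \Big| \int_A f \, d\mu \Big| = \alpha \int_A f \, d\mu = \int_A \alpha f \, d\mu = \int_A \mathrm{Re}(\alpha f) \, d\mu \le \int_A |f| \, d\mu, \tag{A.49} \]
proving the first claim. The second follows from $|f + g| \le |f| + |g|$.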
In addition, our integral is well behaved with respect to limiting operations.
Theorem A.19 (dominated convergence). Let $f_n$ be a convergent sequence of measurable functions and set $f = \lim_{n\to\infty} f_n$. Suppose there is an integrable function $g$ such that $|f_n| \le g$. Then $f$ is integrable and
\[ \lim_{n\to\infty} \int f_n \, d\mu = \int f \, d\mu. \tag{A.50} \]
Proof. The real and imaginary parts satisfy the same assumptions and so do the positive and negative parts. Hence it suffices to prove the case where $f_n$ and $f$ are nonnegative.
By Fatou's lemma
\[ \liminf_{n\to\infty} \int_A f_n \, d\mu \ge \int_A f \, d\mu \tag{A.51} \]
and
\[ \liminf_{n\to\infty} \int_A (g - f_n) \, d\mu \ge \int_A (g - f) \, d\mu. \tag{A.52} \]
Subtracting $\int_A g \, d\mu$ on both sides of the last inequality finishes the proof since $\liminf(-f_n) = -\limsup f_n$.
Remark: Since sets of measure zero do not contribute to the value of the integral, it clearly suffices if the requirements of the dominated convergence theorem are satisfied almost everywhere (with respect to $\mu$).
Note that the existence of $g$ is crucial, as the example $f_n(x) = \frac{1}{n} \chi_{[-n,n]}(x)$ on $\mathbb{R}$ with Lebesgue measure shows.
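The computation behind this counterexample is short; it is written out here as a sketch, with $\lambda$ denoting Lebesgue measure:

```latex
\[
\int_{\mathbb{R}} f_n \, d\lambda = \frac{1}{n} \cdot \lambda([-n,n]) = \frac{1}{n} \cdot 2n = 2
\quad \text{for all } n,
\qquad \text{but} \qquad
\sup_{x} |f_n(x)| = \frac{1}{n} \to 0 .
\]
\[
\text{Any } g \ge f_n \text{ for all } n \text{ satisfies }
g(x) \ge \sup_{n \ge |x|} \frac{1}{n} \ge \frac{1}{|x| + 1},
\qquad
\int_{\mathbb{R}} \frac{d\lambda(x)}{|x| + 1} = \infty .
\]
```

So $f_n \to 0$ even uniformly, while the integrals stay equal to 2, and the second line shows that no integrable dominating function can exist.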
Example. If $\mu(x) = \sum_n \alpha_n \Theta(x - x_n)$ is a sum of Dirac measures $\Theta(x)$ centered at $x = 0$, then
\[ \int f(x) \, d\mu(x) = \sum_n \alpha_n f(x_n). \tag{A.53} \]
Hence our integral contains sums as special cases.
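For finitely many point masses, integration against such a measure is literally a weighted sum. The following is a small sketch and not part of the original text; the function name `integrate_against_point_masses` and the sample weights and points are hypothetical.

```python
def integrate_against_point_masses(f, weights, points):
    """Integral of f with respect to the measure sum_n weights[n] * (Dirac measure at points[n])."""
    return sum(alpha * f(x) for alpha, x in zip(weights, points))


def square(x):
    return x * x


# Weights 1, 2, 3 sitting at the points 0, 1, 2.
weighted_sum = integrate_against_point_masses(square, [1, 2, 3], [0, 1, 2])
```

With these sample values the integral is $1 \cdot 0 + 2 \cdot 1 + 3 \cdot 4 = 14$.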
Problem A.5. Show that the set $B(X)$ of bounded measurable functions is a Banach space. Show that the set $S(X)$ of simple functions is dense in $B(X)$. Show that the integral is a bounded linear functional on $B(X)$. (Hence Theorem 0.24 could be used to extend the integral from simple to bounded measurable functions.)
Problem A.6. Show that the dominated convergence theorem implies (under the same assumptions)
\[ \lim_{n\to\infty} \int |f_n - f| \, d\mu = 0. \tag{A.54} \]
Problem A.7. Suppose $y \mapsto f(x,y)$ is measurable for every $x$ and $x \mapsto f(x,y)$ is continuous for every $y$. Show that
\[ F(x) = \int_A f(x,y) \, d\mu(y) \tag{A.55} \]
is continuous if there is an integrable function $g(y)$ such that $|f(x,y)| \le g(y)$.
Problem A.8. Suppose $y \mapsto f(x,y)$ is measurable for fixed $x$ and $x \mapsto f(x,y)$ is differentiable for fixed $y$. Show that
\[ F(x) = \int_A f(x,y) \, d\mu(y) \tag{A.56} \]
is differentiable if there is an integrable function $g(y)$ such that $|\frac{\partial}{\partial x} f(x,y)| \le g(y)$. Moreover, $y \mapsto \frac{\partial}{\partial x} f(x,y)$ is measurable and
\[ F'(x) = \int_A \frac{\partial}{\partial x} f(x,y) \, d\mu(y) \tag{A.57} \]
in this case.
A.5. Product measures
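Let $\mu_1$ and $\mu_2$ be two measures on $\Sigma_1$ and $\Sigma_2$, respectively. Let $\Sigma_1 \otimes \Sigma_2$ be the $\sigma$-algebra generated by rectangles of the form $A_1 \times A_2$.
Example. Let $\mathfrak{B}$ be the Borel sets in $\mathbb{R}$ then $\mathfrak{B}^2 = \mathfrak{B} \otimes \mathfrak{B}$ are the Borel sets in $\mathbb{R}^2$ (since the rectangles are a basis for the product topology).
Any set in $\Sigma_1 \otimes \Sigma_2$ has the section property, that is,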
Lemma A.20. Suppose $A \in \Sigma_1 \otimes \Sigma_2$ then its sections
\[ A_1(x_2) = \{x_1 \,|\, (x_1,x_2) \in A\} \quad \text{and} \quad A_2(x_1) = \{x_2 \,|\, (x_1,x_2) \in A\} \tag{A.58} \]
are measurable.
Proof. Denote all sets $A \in \Sigma_1 \otimes \Sigma_2$ with the property that $A_1(x_2) \in \Sigma_1$ by $S$. Clearly all rectangles are in $S$ and it suffices to show that $S$ is a $\sigma$-algebra. Moreover, if $A \in S$, then $(A')_1(x_2) = (A_1(x_2))' \in \Sigma_1$ and thus $S$ is closed under complements. Similarly, if $A_n \in S$, then $(\bigcup_n A_n)_1(x_2) = \bigcup_n (A_n)_1(x_2)$ shows that $S$ is closed under countable unions.
This implies that if $f$ is a measurable function on $X_1 \times X_2$, then $f(., x_2)$ is measurable on $X_1$ for every $x_2$ and $f(x_1, .)$ is measurable on $X_2$ for every $x_1$ (observe $A_1(x_2) = \{x_1 \,|\, f(x_1,x_2) \in B\}$, where $A = \{(x_1,x_2) \,|\, f(x_1,x_2) \in B\}$). In fact, this is even equivalent since $\chi_{A_1(x_2)}(x_1) = \chi_{A_2(x_1)}(x_2) = \chi_A(x_1,x_2)$.
Given two measures $\mu_1$ on $\Sigma_1$ and $\mu_2$ on $\Sigma_2$ we now want to construct the product measure, $\mu_1 \otimes \mu_2$ on $\Sigma_1 \otimes \Sigma_2$ such that
\[ \mu_1 \otimes \mu_2 (A_1 \times A_2) = \mu_1(A_1) \mu_2(A_2), \qquad A_j \in \Sigma_j,\ j = 1, 2. \tag{A.59} \]
Theorem A.21. Let $\mu_1$ and $\mu_2$ be two $\sigma$-finite measures on $\Sigma_1$ and $\Sigma_2$, respectively. Let $A \in \Sigma_1 \otimes \Sigma_2$. Then $\mu_2(A_2(x_1))$ and $\mu_1(A_1(x_2))$ are measurable and
\[ \int_{X_1} \mu_2(A_2(x_1)) \, d\mu_1(x_1) = \int_{X_2} \mu_1(A_1(x_2)) \, d\mu_2(x_2). \tag{A.60} \]
Proof. Let $S$ be the set of all subsets for which our claim holds. Note that $S$ contains at least all rectangles. It even contains the algebra of finite disjoint unions of rectangles. Thus it suffices to show that $S$ is a monotone class. If $\mu_1$ and $\mu_2$ are finite, this follows from continuity from above and below of measures. The case where $\mu_1$ and $\mu_2$ are $\sigma$-finite can be handled as in Theorem A.5.
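Hence we can define
\[ \mu_1 \otimes \mu_2 (A) = \int_{X_1} \mu_2(A_2(x_1)) \, d\mu_1(x_1) = \int_{X_2} \mu_1(A_1(x_2)) \, d\mu_2(x_2) \tag{A.61} \]
or equivalently
\begin{align*}
\mu_1 \otimes \mu_2 (A) &= \int_{X_1} \Big( \int_{X_2} \chi_A(x_1,x_2) \, d\mu_2(x_2) \Big) d\mu_1(x_1) \\
&= \int_{X_2} \Big( \int_{X_1} \chi_A(x_1,x_2) \, d\mu_1(x_1) \Big) d\mu_2(x_2). \tag{A.62}
\end{align*}
Additivity of $\mu_1 \otimes \mu_2$ follows from the monotone convergence theorem.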
Note that (A.59) uniquely defines $\mu_1 \otimes \mu_2$ as a $\sigma$-finite premeasure on the algebra of finite disjoint unions of rectangles. Hence by Theorem A.5 it is the only measure on $\Sigma_1 \otimes \Sigma_2$ satisfying (A.59).
Finally we have:
Theorem A.22 (Fubini). Let $f$ be a measurable function on $X_1 \times X_2$ and let $\mu_1$, $\mu_2$ be $\sigma$-finite measures on $X_1$, $X_2$, respectively.
(i) If $f \ge 0$ then $\int f(., x_2)\, d\mu_2(x_2)$ and $\int f(x_1, .)\, d\mu_1(x_1)$ are both measurable and
$$\begin{aligned} \iint f(x_1, x_2)\, d\mu_1 \otimes \mu_2(x_1, x_2) &= \int \left( \int f(x_1, x_2)\, d\mu_1(x_1) \right) d\mu_2(x_2) \\ &= \int \left( \int f(x_1, x_2)\, d\mu_2(x_2) \right) d\mu_1(x_1). \end{aligned} \tag{A.63}$$
(ii) If $f$ is complex then
$$\int |f(x_1, x_2)|\, d\mu_1(x_1) \in L^1(X_2, d\mu_2) \tag{A.64}$$
respectively
$$\int |f(x_1, x_2)|\, d\mu_2(x_2) \in L^1(X_1, d\mu_1) \tag{A.65}$$
if and only if $f \in L^1(X_1 \times X_2, d\mu_1 \otimes d\mu_2)$. In this case (A.63) holds.
Proof. By Theorem A.21 the claim holds for simple functions. Now (i)
follows from the monotone convergence theorem and (ii) from the dominated
convergence theorem.
In particular, if $f(x_1, x_2)$ is either nonnegative or integrable, then the order of integration can be interchanged.
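As a small sanity check of (A.63), the following sketch uses two finite sets with point masses, where every integral is a finite sum. The sets, the weights `mu1`, `mu2` and the function `f` are made-up illustration data, not taken from the text.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite measure spaces: point masses on X1 and X2.
mu1 = {"a": Fraction(1, 2), "b": Fraction(3, 2)}
mu2 = {0: Fraction(2), 1: Fraction(1, 3), 2: Fraction(1)}


def f(x1, x2):
    """An arbitrary nonnegative function on X1 x X2 (illustration only)."""
    return Fraction(x2 + 1) if x1 == "a" else Fraction(x2 * x2)


# Integral against the product measure, using (A.59) on singletons:
# (mu1 x mu2)({(x1, x2)}) = mu1({x1}) * mu2({x2}).
double_integral = sum(f(x1, x2) * mu1[x1] * mu2[x2]
                      for x1, x2 in product(mu1, mu2))

# Iterated integrals in both orders, as in (A.63).
integrate_x1_first = sum(mu2[x2] * sum(f(x1, x2) * mu1[x1] for x1 in mu1)
                         for x2 in mu2)
integrate_x2_first = sum(mu1[x1] * sum(f(x1, x2) * mu2[x2] for x2 in mu2)
                         for x1 in mu1)

print(double_integral, integrate_x1_first, integrate_x2_first)
```

Exact rational arithmetic is used so that the three values can be compared with `==` rather than up to rounding.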
Lemma A.23. If $\mu_1$ and $\mu_2$ are $\sigma$-finite regular Borel measures, so is $\mu_1 \otimes \mu_2$.
Proof. Regularity holds for every rectangle and hence also for the algebra of finite disjoint unions of rectangles. Thus the claim follows from Lemma A.6.
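Lemma A.24. Suppose $\mu_j$, $j = 1, 2, 3$, are $\sigma$-finite measures. Then
$$(\mu_1 \otimes \mu_2) \otimes \mu_3 = \mu_1 \otimes (\mu_2 \otimes \mu_3). \tag{A.66}$$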
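Proof. First of all note that $(\Sigma_1 \otimes \Sigma_2) \otimes \Sigma_3 = \Sigma_1 \otimes (\Sigma_2 \otimes \Sigma_3)$ is the sigma algebra generated by the cuboids $A_1 \times A_2 \times A_3$ in $X_1 \times X_2 \times X_3$. Moreover, since
$$\begin{aligned} ((\mu_1 \otimes \mu_2) \otimes \mu_3)(A_1 \times A_2 \times A_3) &= \mu_1(A_1)\, \mu_2(A_2)\, \mu_3(A_3) \\ &= (\mu_1 \otimes (\mu_2 \otimes \mu_3))(A_1 \times A_2 \times A_3) \end{aligned} \tag{A.67}$$
the two measures coincide on the algebra of finite disjoint unions of cuboids. Hence they coincide everywhere by Theorem A.5.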
Example. If $\lambda$ is Lebesgue measure on $\mathbb{R}$, then $\lambda^n = \lambda \otimes \cdots \otimes \lambda$ is Lebesgue measure on $\mathbb{R}^n$. Since $\lambda$ is regular, so is $\lambda^n$.
A.6. Decomposition of measures
Let $\mu$, $\nu$ be two measures on a measure space $(X, \Sigma)$. They are called mutually singular (in symbols $\mu \perp \nu$) if they are supported on disjoint sets. That is, there is a measurable set $N$ such that $\mu(N) = 0$ and $\nu(X \setminus N) = 0$.
Example. Let $\lambda$ be the Lebesgue measure and $\Theta$ the Dirac measure (centered at 0), then $\lambda \perp \Theta$: Just take $N = \{0\}$, then $\lambda(\{0\}) = 0$ and $\Theta(\mathbb{R} \setminus \{0\}) = 0$.
On the other hand, $\nu$ is called absolutely continuous with respect to $\mu$ (in symbols $\nu \ll \mu$) if $\mu(A) = 0$ implies $\nu(A) = 0$.
Example. The prototypical example is the measure $d\nu = f\, d\mu$ (compare Lemma A.15). Indeed $\mu(A) = 0$ implies
$$\nu(A) = \int_A f\, d\mu = 0 \tag{A.68}$$
and shows that $\nu$ is absolutely continuous with respect to $\mu$. In fact, we will show below that every absolutely continuous measure is of this form.
The two main results will follow as simple consequences of the following result:
Theorem A.25. Let $\mu$, $\nu$ be $\sigma$-finite measures. Then there exists a unique (a.e.) nonnegative function $f$ and a set $N$ of $\mu$ measure zero, such that
$$\nu(A) = \nu(A \cap N) + \int_A f\, d\mu. \tag{A.69}$$
Proof. We first assume $\mu$, $\nu$ to be finite measures. Let $\alpha = \mu + \nu$ and consider the Hilbert space $L^2(X, d\alpha)$. Then
$$\ell(h) = \int_X h\, d\nu \tag{A.70}$$
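is a bounded linear functional by Cauchy-Schwarz:
$$\begin{aligned} |\ell(h)|^2 = \left| \int_X 1 \cdot h\, d\nu \right|^2 &\le \left( \int |1|^2\, d\nu \right) \left( \int |h|^2\, d\nu \right) \\ &\le \nu(X) \left( \int |h|^2\, d\alpha \right) = \nu(X)\, \|h\|^2. \end{aligned} \tag{A.71}$$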
Hence by the Riesz lemma (Theorem 1.7) there exists a $g \in L^2(X, d\alpha)$ such that
$$\ell(h) = \int_X h g\, d\alpha. \tag{A.72}$$
By construction
$$\nu(A) = \int \chi_A\, d\nu = \int \chi_A\, g\, d\alpha = \int_A g\, d\alpha. \tag{A.73}$$
In particular, g must be positive a.e. (take A the set where g is negative).
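Furthermore, let $N = \{x \mid g(x) \ge 1\}$, then
$$\nu(N) = \int_N g\, d\alpha \ge \alpha(N) = \mu(N) + \nu(N), \tag{A.74}$$
which shows $\mu(N) = 0$. Now set
$$f = \frac{g}{1 - g}\, \chi_{N'}, \qquad N' = X \setminus N. \tag{A.75}$$
Then, since (A.73) implies $d\nu = g\, d\alpha$ respectively $d\mu = (1 - g)\, d\alpha$, we have
$$\begin{aligned} \int_A f\, d\mu &= \int \chi_A\, \frac{g}{1 - g}\, \chi_{N'}\, d\mu \\ &= \int \chi_{A \cap N'}\, g\, d\alpha \\ &= \nu(A \cap N') \end{aligned} \tag{A.76}$$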
as desired. Clearly $f$ is unique, since if there is a second function $\tilde{f}$, then $\int_A (f - \tilde{f})\, d\mu = 0$ for every $A$ shows $f - \tilde{f} = 0$ a.e.
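To see the $\sigma$-finite case, observe that $X_n \nearrow X$, $\mu(X_n) < \infty$ and $Y_n \nearrow X$, $\nu(Y_n) < \infty$ implies $X_n \cap Y_n \nearrow X$ and $\alpha(X_n \cap Y_n) < \infty$. Hence when restricted to $X_n \cap Y_n$ we have sets $N_n$ and functions $f_n$. Now take $N = \bigcup N_n$ and choose $f$ such that $f|_{X_n} = f_n$ (this is possible since $f_{n+1}|_{X_n} = f_n$ a.e.). Then $\mu(N) = 0$ and
$$\nu(A \cap N') = \lim_{n \to \infty} \nu(A \cap (X_n \setminus N)) = \lim_{n \to \infty} \int_{A \cap X_n} f\, d\mu = \int_A f\, d\mu, \tag{A.77}$$
which finishes the proof.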
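The construction in the finite case can be traced by hand on a finite set, where all measures are sums of point masses and $g$ is just the ratio of weights. The sketch below follows (A.73) to (A.76) step by step; the set `X` and the weights `mu`, `nu` are hypothetical illustration data.

```python
from fractions import Fraction

# Hypothetical finite space X with point masses; both measures are finite.
X = ["a", "b", "c", "d"]
mu = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(0), "d": Fraction(3)}
nu = {"a": Fraction(2), "b": Fraction(0), "c": Fraction(5), "d": Fraction(1)}

# alpha = mu + nu; g is the density of nu with respect to alpha, cf. (A.73).
alpha = {x: mu[x] + nu[x] for x in X}
g = {x: (nu[x] / alpha[x] if alpha[x] > 0 else Fraction(0)) for x in X}

# N = {g >= 1} carries the singular part; by (A.74) it is a mu-null set.
N = {x for x in X if g[x] >= 1}

# f = g / (1 - g) on the complement of N and 0 on N, cf. (A.75).
f = {x: (Fraction(0) if x in N else g[x] / (1 - g[x])) for x in X}


def measure(m, A):
    """Measure of a subset A for a measure m given by point masses."""
    return sum(m[x] for x in A)


def decomposition(A):
    """Right-hand side of (A.69): nu(A & N) plus the integral of f over A."""
    singular = measure(nu, [x for x in A if x in N])
    absolutely_continuous = sum(f[x] * mu[x] for x in A)
    return singular + absolutely_continuous


print(N, f)
```

In this toy case `N` consists of the single point where `mu` vanishes but `nu` does not, and `f` is the ratio `nu/mu` wherever `mu` is positive, which is what one expects from (A.69).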
Now the anticipated results follow with no effort:
Theorem A.26 (Lebesgue decomposition). Let $\mu$, $\nu$ be two $\sigma$-finite measures on a measure space $(X, \Sigma)$. Then $\nu$ can be uniquely decomposed as $\nu = \nu_{ac} + \nu_{sing}$, where $\nu_{ac}$ and $\nu_{sing}$ are mutually singular and $\nu_{ac}$ is absolutely continuous with respect to $\mu$.
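Proof. Taking $\nu_{sing}(A) = \nu(A \cap N)$ and $d\nu_{ac} = f\, d\mu$ there is at least one such decomposition. To show uniqueness, let $\nu$ be finite first. If there is another one $\nu = \tilde{\nu}_{ac} + \tilde{\nu}_{sing}$, then let $\tilde{N}$ be such that $\mu(\tilde{N}) = 0$ and $\tilde{\nu}_{sing}(\tilde{N}') = 0$. Then $\nu_{sing}(A) - \tilde{\nu}_{sing}(A) = \int_A (\tilde{f} - f)\, d\mu$. In particular, $\int_{A \cap N' \cap \tilde{N}'} (\tilde{f} - f)\, d\mu = 0$ and hence $\tilde{f} = f$ a.e. away from $N \cup \tilde{N}$. Since $\mu(N \cup \tilde{N}) = 0$, we have $\tilde{f} = f$ a.e. and hence $\tilde{\nu}_{ac} = \nu_{ac}$ as well as $\tilde{\nu}_{sing} = \nu - \tilde{\nu}_{ac} = \nu - \nu_{ac} = \nu_{sing}$. The $\sigma$-finite case follows as usual.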
Theorem A.27 (Radon-Nikodym). Let $\mu$, $\nu$ be two $\sigma$-finite measures on a measure space $(X, \Sigma)$. Then $\nu$ is absolutely continuous with respect to $\mu$ if and only if there is a positive measurable function $f$ such that
$$\nu(A) = \int_A f\, d\mu \tag{A.78}$$
for every $A \in \Sigma$. The function $f$ is determined uniquely a.e. with respect to $\mu$ and is called the Radon-Nikodym derivative $\frac{d\nu}{d\mu}$ of $\nu$ with respect to $\mu$.
Proof. Just observe that in this case $\nu(A \cap N) = 0$ for every $A$, that is $\nu_{sing} = 0$.
Problem A.9. Let $\mu$ be a Borel measure on $\mathfrak{B}$ and suppose its distribution function $\mu(x)$ is differentiable. Show that the Radon-Nikodym derivative equals the ordinary derivative $\mu'(x)$.
A.7. Derivatives of measures
If $\mu$ is a Borel measure on $\mathfrak{B}$ and its distribution function $\mu(x)$ is differentiable, then the Radon-Nikodym derivative is just the ordinary derivative $\mu'(x)$. Our aim in this section is to generalize this result to arbitrary measures on $\mathfrak{B}^n$.
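We call
$$(D\mu)(x) = \lim_{\varepsilon \downarrow 0} \frac{\mu(B_\varepsilon(x))}{|B_\varepsilon(x)|} \tag{A.79}$$
the derivative of $\mu$ at $x \in \mathbb{R}^n$ provided the above limit exists. (Here $B_r(x) \subset \mathbb{R}^n$ is a ball of radius $r$ centered at $x \in \mathbb{R}^n$ and $|A|$ denotes the Lebesgue measure of $A \in \mathfrak{B}^n$.)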
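Note that for a Borel measure on $\mathfrak{B}$, $(D\mu)(x)$ exists if and only if $\mu(x)$ (as defined in (A.3)) is differentiable at $x$ and $(D\mu)(x) = \mu'(x)$ in this case. To compute the derivative of $\mu$ we introduce the upper and lower derivative
$$(\overline{D}\mu)(x) = \limsup_{\varepsilon \downarrow 0} \frac{\mu(B_\varepsilon(x))}{|B_\varepsilon(x)|} \quad\text{and}\quad (\underline{D}\mu)(x) = \liminf_{\varepsilon \downarrow 0} \frac{\mu(B_\varepsilon(x))}{|B_\varepsilon(x)|}. \tag{A.80}$$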
Clearly $\mu$ is differentiable if $(\underline{D}\mu)(x) = (\overline{D}\mu)(x) < \infty$. First of all note that they are measurable:
Lemma A.28. The upper derivative is lower semicontinuous, that is the set $\{x \mid (\overline{D}\mu)(x) > \alpha\}$ is open for every $\alpha \in \mathbb{R}$. Similarly, the lower derivative is upper semicontinuous, that is $\{x \mid (\underline{D}\mu)(x) < \alpha\}$ is open.
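Proof. We only prove the claim for $\overline{D}\mu$, the case $\underline{D}\mu$ being similar. Abbreviate
$$M_r(x) = \sup_{0 < \varepsilon < r} \frac{\mu(B_\varepsilon(x))}{|B_\varepsilon(x)|} \tag{A.81}$$
and note that it suffices to show that $O_r = \{x \mid M_r(x) > \alpha\}$ is open.
If $x \in O_r$, there is some $\varepsilon < r$ such that
$$\frac{\mu(B_\varepsilon(x))}{|B_\varepsilon(x)|} > \alpha. \tag{A.82}$$
Let $\delta > 0$ and $y \in B_\delta(x)$. Then $B_\varepsilon(x) \subseteq B_{\varepsilon + \delta}(y)$ implying
$$\frac{\mu(B_{\varepsilon + \delta}(y))}{|B_{\varepsilon + \delta}(y)|} \ge \left( \frac{\varepsilon}{\varepsilon + \delta} \right)^n \frac{\mu(B_\varepsilon(x))}{|B_\varepsilon(x)|} > \alpha \tag{A.83}$$
for $\delta$ sufficiently small (in particular $\varepsilon + \delta < r$). That is, $B_\delta(x) \subseteq O_r$.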
In particular, both the upper and lower derivative are measurable. Next, the following geometric fact of $\mathbb{R}^n$ will be needed.
Lemma A.29. Given open balls $B_1, \dots, B_m$ in $\mathbb{R}^n$, there is a subset of disjoint balls $B_{j_1}, \dots, B_{j_k}$ such that
$$\left| \bigcup_{i=1}^m B_i \right| \le 3^n \sum_{i=1}^k |B_{j_i}|. \tag{A.84}$$
Proof. Assume that the balls are ordered by decreasing radius. Start with $B_{j_1} = B_1 = B_{r_1}(x_1)$ and remove all balls from our list which intersect $B_{j_1}$. Observe that the removed balls are all contained in $3B_1 = B_{3r_1}(x_1)$, since none of them has a radius larger than $r_1$. Proceeding like this we obtain $B_{j_1}, \dots, B_{j_k}$ such that
$$\bigcup_{i=1}^m B_i \subseteq \bigcup_{i=1}^k B_{3r_{j_i}}(x_{j_i}) \tag{A.85}$$
and the claim follows since $|B_{3r}(x)| = 3^n |B_r(x)|$.
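The selection procedure from this proof can be sketched in dimension $n = 1$, where balls are open intervals. The sketch processes the balls by decreasing radius, which is what guarantees that every discarded ball lies inside the threefold enlargement of a chosen one. The function names and the sample list `balls` are illustrative assumptions.

```python
def select_disjoint(balls):
    """Greedy selection of disjoint balls in dimension n = 1.

    `balls` is a list of (center, radius) pairs describing open intervals.
    The balls are processed by decreasing radius; each chosen ball removes
    all remaining balls that intersect it.
    """
    remaining = sorted(balls, key=lambda b: b[1], reverse=True)
    chosen = []
    while remaining:
        x, r = remaining.pop(0)
        chosen.append((x, r))
        # Two open intervals are disjoint iff the centers are at least r + s apart.
        remaining = [(y, s) for (y, s) in remaining if abs(x - y) >= r + s]
    return chosen


def union_length(balls):
    """Lebesgue measure of a finite union of open intervals."""
    total, right_end = 0.0, float("-inf")
    for a, b in sorted((x - r, x + r) for x, r in balls):
        if b > right_end:
            total += b - max(a, right_end)
            right_end = b
    return total


# Hypothetical sample data.
balls = [(0.0, 1.0), (1.5, 1.0), (2.5, 0.5), (6.0, 2.0), (7.0, 0.25), (-3.0, 0.5)]
chosen = select_disjoint(balls)

# Compare both sides of (A.84) with n = 1.
lhs = union_length(balls)
rhs = 3 * sum(2 * r for _, r in chosen)
print(chosen, lhs, rhs)
```

For this sample the union has length 9 while the chosen balls have total length 8, so the factor $3^n$ is far from sharp here; it is only needed in the worst case.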
Now we can show
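Lemma A.30. Let $\alpha > 0$. For any Borel set $A$ we have
$$|\{x \in A \mid (\overline{D}\mu)(x) > \alpha\}| \le 3^n \frac{\mu(A)}{\alpha} \tag{A.86}$$
and
$$|\{x \in A \mid (\overline{D}\mu)(x) > 0\}| = 0, \quad\text{whenever } \mu(A) = 0. \tag{A.87}$$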
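Proof. Let $A_\alpha = \{x \in A \mid (\overline{D}\mu)(x) > \alpha\}$. We will show
$$|K| \le 3^n \frac{\mu(O)}{\alpha} \tag{A.88}$$
for any compact set $K$ and open set $O$ with $K \subseteq A_\alpha \subseteq O$. The first claim then follows from regularity of $\mu$ and the Lebesgue measure.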
Given fixed $K$, $O$, for every $x \in K$ there is some $r_x$ such that $B_{r_x}(x) \subseteq O$ and $|B_{r_x}(x)| < \alpha^{-1} \mu(B_{r_x}(x))$. Since $K$ is compact, we can choose a finite subcover of $K$. Moreover, by Lemma A.29 we can refine our set of balls such that
$$|K| \le 3^n \sum_{i=1}^k |B_{r_i}(x_i)| < \frac{3^n}{\alpha} \sum_{i=1}^k \mu(B_{r_i}(x_i)) \le 3^n \frac{\mu(O)}{\alpha}. \tag{A.89}$$
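To see the second claim, observe that
$$\{x \in A \mid (\overline{D}\mu)(x) > 0\} = \bigcup_{j=1}^\infty \{x \in A \mid (\overline{D}\mu)(x) > \tfrac{1}{j}\} \tag{A.90}$$
and by the first part $|\{x \in A \mid (\overline{D}\mu)(x) > \tfrac{1}{j}\}| = 0$ for any $j$ if $\mu(A) = 0$.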
Theorem A.31 (Lebesgue). Let $f$ be (locally) integrable, then for a.e. $x \in \mathbb{R}^n$ we have
$$\lim_{r \downarrow 0} \frac{1}{|B_r(x)|} \int_{B_r(x)} |f(y) - f(x)|\, dy = 0. \tag{A.91}$$
Proof. Decompose $f$ as $f = g + h$, where $g$ is continuous and $\|h\|_1 < \varepsilon$ (Theorem 0.31) and abbreviate
$$D_r(f)(x) = \frac{1}{|B_r(x)|} \int_{B_r(x)} |f(y) - f(x)|\, dy. \tag{A.92}$$
Then, since $\lim D_r(g)(x) = 0$ (for every $x$) and $D_r(f) \le D_r(g) + D_r(h)$ we have
$$\limsup_{r \downarrow 0} D_r(f) \le \limsup_{r \downarrow 0} D_r(h) \le (\overline{D}\mu)(x) + |h(x)|, \tag{A.93}$$
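where $d\mu = |h|\, dx$. Using $|\{x \mid |h(x)| \ge \alpha\}| \le \alpha^{-1} \|h\|_1$ and the first part of Lemma A.30 we see
$$|\{x \mid \limsup_{r \downarrow 0} D_r(f)(x) \ge 2\alpha\}| \le (3^n + 1) \frac{\varepsilon}{\alpha}. \tag{A.94}$$
Since $\varepsilon$ is arbitrary, the Lebesgue measure of this set must be zero for every $\alpha$. That is, the set where the limsup is positive has Lebesgue measure zero.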
The points where (A.91) holds are called Lebesgue points of f.
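To see what can go wrong at a point that is not a Lebesgue point, one can approximate $D_r(f)(x)$ from (A.92) numerically in dimension one. The sketch below uses a midpoint rule and the step function $\chi_{[0,\infty)}$; the helper names `mean_oscillation` and `heaviside` and the sample count are assumptions made for illustration.

```python
def mean_oscillation(f, x, r, samples=10_000):
    """Midpoint-rule approximation of D_r(f)(x) from (A.92) for n = 1."""
    h = 2 * r / samples
    total = sum(abs(f(x - r + (k + 0.5) * h) - f(x)) for k in range(samples))
    return total * h / (2 * r)


def heaviside(y):
    """Characteristic function of [0, infinity)."""
    return 1.0 if y >= 0 else 0.0


for r in (1.0, 0.1, 0.001):
    at_jump = mean_oscillation(heaviside, 0.0, r)
    away = mean_oscillation(heaviside, 1.0, min(r, 0.5))
    print(f"r={r:<6} D_r at 0: {at_jump:.4f}   D_r at 1: {away:.4f}")
```

At the jump the averages stay at one half however small $r$ is, so $0$ is not a Lebesgue point of this function, while at $x = 1$ the averages vanish once the interval no longer reaches the jump. This is consistent with the theorem, since a single point has Lebesgue measure zero.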
Note that the balls can be replaced by more general sets: A sequence of sets $A_j(x)$ is said to shrink to $x$ nicely if there are balls $B_{r_j}(x)$ with $r_j \to 0$ and a constant $\varepsilon > 0$ such that $A_j(x) \subseteq B_{r_j}(x)$ and $|A_j| \ge \varepsilon |B_{r_j}(x)|$. For example $A_j(x)$ could be some balls or cubes (not necessarily containing $x$). However, the portion of $B_{r_j}(x)$ which they occupy must not go to zero! For example the rectangles $(0, \frac{1}{j}) \times (0, \frac{2}{j}) \subset \mathbb{R}^2$ do shrink nicely to 0, but the rectangles $(0, \frac{1}{j}) \times (0, \frac{2}{j^2})$ don't.
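The two rectangle families can be compared by computing which fraction of the smallest disc centered at $0$ each rectangle occupies. Since any admissible ball $B_{r_j}(0)$ must contain that smallest disc, this fraction is an upper bound for the constant $\varepsilon$. The helper `occupied_fraction` is a hypothetical name introduced for this sketch.

```python
import math


def occupied_fraction(width, height):
    """Area of the rectangle (0, width) x (0, height) divided by the area
    of the smallest disc centered at 0 that contains it."""
    radius = math.hypot(width, height)
    return (width * height) / (math.pi * radius ** 2)


for j in (1, 10, 100, 1000):
    nice = occupied_fraction(1 / j, 2 / j)       # (0, 1/j) x (0, 2/j)
    thin = occupied_fraction(1 / j, 2 / j ** 2)  # (0, 1/j) x (0, 2/j^2)
    print(f"j={j:5d}  nice={nice:.6f}  thin={thin:.6f}")
```

For the first family the fraction is the constant $2/(5\pi)$, so any $\varepsilon$ up to that value works. For the second family the fraction behaves like $2/(\pi j)$ and tends to zero, so no positive $\varepsilon$ can work.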
Lemma A.32. Let $f$ be (locally) integrable, then at every Lebesgue point we have
$$f(x) = \lim_{j \to \infty} \frac{1}{|A_j(x)|} \int_{A_j(x)} f(y)\, dy \tag{A.95}$$
whenever $A_j(x)$ shrinks to $x$ nicely.
Proof. Let $x$ be a Lebesgue point and choose some nicely shrinking sets $A_j(x)$ with corresponding $B_{r_j}(x)$ and $\varepsilon$. Then
$$\frac{1}{|A_j(x)|} \int_{A_j(x)} |f(y) - f(x)|\, dy \le \frac{1}{\varepsilon |B_{r_j}(x)|} \int_{B_{r_j}(x)} |f(y) - f(x)|\, dy \tag{A.96}$$
and the claim follows.
Corollary A.33. Suppose $\mu$ is an absolutely continuous Borel measure on $\mathbb{R}$, then its distribution function is differentiable a.e. and $d\mu(x) = \mu'(x)\, dx$.
As another consequence we obtain
Theorem A.34. Let $\mu$ be a Borel measure on $\mathbb{R}^n$. The derivative $D\mu$ exists a.e. with respect to Lebesgue measure and equals the Radon-Nikodym derivative of the absolutely continuous part of $\mu$ with respect to Lebesgue measure, that is,
$$\mu_{ac}(A) = \int_A (D\mu)(x)\, dx. \tag{A.97}$$
Proof. If $d\mu = f\, dx$ is absolutely continuous with respect to Lebesgue measure the claim follows from Theorem A.31. To see the general case use the Lebesgue decomposition of $\mu$ and let $N$ be a support for the singular part with $|N| = 0$. Then $(D\mu_{sing})(x) = 0$ for a.e. $x \in N'$
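[…] $B_\varepsilon(x) \subseteq V_j$ and $\mu(B_\varepsilon(x)) \le k |B_\varepsilon(x)|$ […]
$$\sum_i \mu(B_{\varepsilon_i}(x_i)) \le k \sum_i |B_{\varepsilon_i}(x_i)|. \tag{A.98}$$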
Selecting disjoint balls as in Lemma A.29 further shows
$$\mu(K) \le k 3^n \sum_\ell |B_{\varepsilon_{i_\ell}}(x_{i_\ell})| \le k 3^n |V_j|. \tag{A.99}$$
Letting $j \to \infty$ we see $\mu(K) \le k 3^n |K|$ and by regularity we even have $\mu(A) \le k 3^n |A|$ for every $A \subseteq O_k$. Hence $\mu$ is absolutely continuous on $O_k$ and since we assumed $\mu$ to be singular we must have $\mu(O_k) = 0$.
Thus $(D\mu_{sing})(x) = \infty$ for a.e. $x$ with respect to $\mu_{sing}$ and we are done.
$2^{-\min(n,m)}$ and hence we can define the Cantor function as $f = \lim_{n \to \infty} f_n$. By construction $f$ is a continuous function which is constant on every subinterval of $[0, 1] \setminus C$. Since $C$ is of Lebesgue measure zero, this set is of full Lebesgue measure and hence $f$
$C_c^\infty(U, V)$ . . . set of compactly supported smooth functions
$L^\infty_\infty(\mathbb{R}^n)$ . . . Lebesgue space of bounded functions vanishing at $\infty$
$\lambda$ . . . a real number
max . . . maximum
$\mathcal{M}$ . . . Mellin transform, 193
$\mu_\psi$ . . . spectral measure, 82
$\mathbb{N}$ . . . the set of positive integers
$\mathbb{N}_0 = \mathbb{N} \cup \{0\}$
$\Omega$ . . . a Borel set
$\sigma_{ac}(A)$ . . . absolutely continuous spectrum of $A$, 90
$\sigma_{sc}(A)$ . . . singular continuous spectrum of $A$, 90
$\sigma_{pp}(A)$ . . . pure point spectrum of $A$, 90
$\sigma_p(A)$ . . . point spectrum (set of eigenvalues) of $A$, 88
$\sigma_d(A)$ . . . discrete spectrum of $A$, 117
$\sigma_{ess}(A)$ . . . essential spectrum of $A$, 117
span($M$) . . . set of finite linear combinations from $M$, 12
$\mathbb{Z}$ . . . the set of integers
$z$ . . . a complex number
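$\mathbb{I}$ . . . identity operator
$z^*$ . . . complex conjugation
$A^*$ . . . adjoint of $A$, 51
$\overline{A}$ . . . closure of $A$, 55
$\check{f} = \mathcal{F}^{-1} f$, inverse Fourier transform of $f$
$\|.\|$ . . . norm in the Hilbert space $\mathfrak{H}$
$\|.\|_p$ . . . norm in the Banach space $L^p$, 22
$\langle ., .. \rangle$ . . . scalar product in $\mathfrak{H}$
$\Delta_\psi(A) = \mathbb{E}_\psi(A^2) - \mathbb{E}_\psi(A)^2$ variance
$\oplus$ . . . orthogonal sum of linear spaces or operators, 38
$\Delta$ . . . Laplace operator, 138
$\partial$ . . . gradient, 135
$\partial_\alpha$ . . . derivative, 135
$M^\perp$ . . . orthogonal complement, 36
$(\lambda_1, \lambda_2) = \{\lambda \in \mathbb{R} \mid \lambda_1 < \lambda < \lambda_2\}$, open interval
$[\lambda_1, \lambda_2] = \{\lambda \in \mathbb{R} \mid \lambda_1 \le \lambda \le \lambda_2\}$, closed interval
$\psi_n \to \psi$ . . . norm convergence
$\psi_n \rightharpoonup \psi$ . . . weak convergence, 42
$A_n \to A$ . . . norm convergence
$A_n \xrightarrow{s} A$ . . . strong convergence, 43
$A_n \rightharpoonup A$ . . . weak convergence, 43
$A_n \xrightarrow{nr} A$ . . . norm resolvent convergence, 128
$A_n \xrightarrow{sr} A$ . . . strong resolvent convergence, 128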
Index
a.e., see almost everywhere
Absolutely continuous
function, 72
measure, 219
Adjoint, 40
Algebra, 201
Almost everywhere, 204
Angular momentum operator, 145
Banach algebra, 21
Banach space, 11
Basis
orthonormal, 34
spectral, 80
Bessel function, 142
spherical, 177
Bessel inequality, 33
Borel
function, 211
measure, 203
set, 202
σ-algebra, 202
Borel measure
regular, 203
Borel transform, 82
C-real, 71
Cantor function, 225
Cantor measure, 226
Cantor set, 204
Cauchy-Schwarz inequality, 16
Cayley transform, 69
Cesàro average, 108
Characteristic function, 212
Closed set, 5
Closure, 5
Commute, 99
Compact, 7
locally, 9
sequentially, 8
Complete, 6, 11
Configuration space, 48
Conjugation, 71
Continuous, 6
Convolution, 136
Core, 55
Cover, 7
C*-algebra, 41
Cyclic vector, 80
Dense, 6
Dilation group, 171
Dirac measure, 203, 216
Dirichlet boundary condition, 156
Distance, 9
Distribution function, 203
Domain, 20, 48, 50
Eigenspace, 96
Eigenvalue, 61
multiplicity, 96
Eigenvector, 61
Element
adjoint, 41
normal, 41
positive, 41
self-adjoint, 41
unitary, 41
Essential supremum, 23
Expectation, 47
First resolvent formula, 63
Form domain, 58, 84
Fourier series, 35
Fourier transform, 108, 135
Friedrichs extension, 61
Function
absolutely continuous, 72
Gaussian wave packet, 145
Gradient, 135
Gram-Schmidt orthogonalization, 35
Graph, 55
Graph norm, 55
Green's function, 142
Ground state, 178
Hölder's inequality, 23
Hamiltonian, 49
Harmonic oscillator, 148
Hausdorff space, 4
Heisenberg picture, 112
Herglotz functions, 82
Hermite polynomials, 148
Hilbert space, 15, 31
separable, 35
Hydrogen atom, 170
Ideal, 41
Induced topology, 4
Inner product, 15
Inner product space, 15
Integrable, 214
Integral, 212
Interior, 5
Interior point, 4
Intertwining property, 190
Involution, 41
Ionisation, 184
Kernel, 20
l.c., see Limit circle
l.p., see Limit point
Laguerre polynomial, 177
associated, 177
Lebesgue measure, 204
Lebesgue point, 224
Legendre equation, 174
Lemma
Riemann-Lebesgue, 138
Lidskij trace theorem, 125
Limit circle, 155
Limit point, 4, 155
Linear functional, 21, 37
Localization formula, 185
Mean-square deviation, 48
Measurable
function, 210
set, 202
Measure, 202
absolutely continuous, 219
complete, 210
finite, 202
growth point, 85
mutually singular, 219
product, 217
projection-valued, 76
spectral, 82
support, 204
Measure space, 202
Mellin transform, 193
Metric space, 3
Minkowski's inequality, 24
Mollifier, 26
Momentum operator, 144
Multi-index, 135
order, 135
Multiplicity
spectral, 81
Neighborhood, 4
Neumann function
spherical, 177
Neumann series, 64
Norm, 10
operator, 20
Norm resolvent convergence, 128
Normal, 10
Normalized, 15, 32
Normed space, 10
Nowhere dense, 27
Observable, 47
One-parameter unitary group, 49
Open ball, 4
Open set, 4
Operator
adjoint, 51
bounded, 20
closable, 55
closed, 55
closure, 55
compact, 110
domain, 20, 50
finite rank, 109
Hermitian, 50
Hilbert-Schmidt, 120
linear, 20, 50
nonnegative, 58
normal, 79
positive, 58
relatively bounded, 115
relatively compact, 110
self-adjoint, 51
semibounded, 61
strong convergence, 43
symmetric, 50
unitary, 33, 49
weak convergence, 43
Orthogonal, 15, 32
Orthogonal complement, 36
Orthogonal projection, 37
Orthogonal sum, 38
Outer measure, 208
Parallel, 15, 32
Parallelogram law, 17
Parseval's identity, 137
Perpendicular, 15, 32
Phase space, 48
Plücker identity, 155
Polarization identity, 17, 33, 50
Position operator, 143
Positivity improving, 179
Positivity preserving, 179
Premeasure, 202
Probability density, 47
Product measure, 217
Projection, 41
Pythagorean theorem, 15
Quadratic form, 50
Range, 20
Rank, 109
Regulated function, 96
Relatively compact, 110
Resolution of the identity, 77
Resolvent, 62
Neumann series, 64
Resolvent set, 61
Riesz lemma, 37
Scalar product, 15
Scattering operator, 190
Scattering state, 190
Schatten p-class, 122
Schrödinger equation, 49
Second countable, 4
Second resolvent formula, 117
Self-adjoint
essentially, 55
Semimetric, 3
Separable, 6, 12
Short range, 195
σ-algebra, 201
σ-finite, 202
Simple function, 96, 212
Simple spectrum, 81
Singular values, 118
Span, 12
Spectral basis, 80
ordered, 89
Spectral measure
maximal, 89
Spectral theorem, 83
compact operators, 118
Spectral vector, 80
maximal, 89
Spectrum, 61
absolutely continuous, 90
discrete, 117
essential, 117
pure point, 90
singularly continuous, 90
Spherical harmonics, 175
*-algebra, 41
*-ideal, 41
Stieltjes inversion formula, 82
Stone's formula, 98
Strong resolvent convergence, 128
Sturm-Liouville equation, 151
regular, 152
Subcover, 7
Subspace
reducing, 67
Superposition, 48
Tensor product, 39
Theorem
Banach-Steinhaus, 28
closed graph, 57
dominated convergence, 215
Fubini, 218
Heine-Borel, 9
HVZ, 184
Kato-Rellich, 116
Lebesgue decomposition, 221
monotone convergence, 213
Pythagorean, 32
Radon-Nikodym, 221
RAGE, 111
spectral, 84
spectral mapping, 89
Stone, 106
Stone-Weierstraß, 45
virial, 171
Weierstraß, 13
Weyl, 126
Wiener, 108
Topological space, 4
Topology
base, 4
product, 7
Total, 12
Trace, 124
Trace class, 124
Triangle inequality, 10
Trotter product formula, 113
Uncertainty principle, 144
Uniform boundedness principle, 28
Unit vector, 15, 32
Unitary group
Generator, 49
Urysohn lemma, 10
Variance, 48
Vitali set, 204
Wave function, 47
Wave operators, 189
Weak convergence, 21, 42
Weierstraß approximation, 13
Weyl relation, 144
Weyl sequence, 64
singular, 126
Weyl-Titchmarsh m-function, 165
Wronskian, 152