
Foundations of Mathematical Physics

Paul P. Cook

and Neil Lambert

Department of Mathematics, King's College London


The Strand, London WC2R 2LS, UK

email: paul.cook@kcl.ac.uk

email: neil.lambert@kcl.ac.uk
Contents

1 Classical Mechanics
1.1 Lagrangian Mechanics
1.1.1 Conserved Quantities
1.2 Noether's Theorem
1.3 Hamiltonian Mechanics
1.3.1 Hamilton's equations
1.3.2 Poisson Brackets
1.3.3 Duality and the Harmonic Oscillator
1.3.4 Noether's theorem in the Hamiltonian formulation

2 Special Relativity and Component Notation
2.1 The Special Theory of Relativity
2.1.1 The Lorentz Group and the Minkowski Inner Product
2.2 Component Notation
2.2.1 Matrices and Matrix Multiplication
2.2.2 Common Four-Vectors
2.2.3 Classical Field Theory
2.2.4 Maxwell's Equations
2.2.5 Electromagnetic Duality

3 Quantum Mechanics
3.1 Canonical Quantisation
3.1.1 The Hilbert Space and Observables
3.1.2 Eigenvectors and Eigenvalues
3.1.3 A Countable Basis
3.1.4 A Continuous Basis
3.2 The Schrödinger Equation
3.2.1 The Heisenberg and Schrödinger Pictures

4 Group Theory
4.1 The Basics
4.2 Common Groups
4.2.1 The Symmetric Group $S_n$
4.2.2 Back to Basics
4.3 Group Homomorphisms
4.3.1 The First Isomorphism Theorem
4.4 Some Representation Theory
4.4.1 Schur's Lemma
4.4.2 The Direct Sum and Tensor Product
4.5 Lie Groups
4.6 Lie Algebras: Infinitesimal Generators
4.7 Everything you wanted to know about SU(2) and SO(3) but were afraid to ask
4.7.1 $SO(3) \cong SU(2)/\mathbb{Z}_2$
4.7.2 Representations
4.7.3 Representations Revisited
4.8 The Invariance of Physical Law
4.8.1 Translations
4.8.2 Special Relativity and the Infinitesimal Generators of SO(1,3)
4.8.3 The Proper Lorentz Group and SL(2,C)
4.8.4 Representations of the Lorentz Group and Lorentz Tensors
Chapter 1
Classical Mechanics
1.1 Lagrangian Mechanics
Newton's second law of motion states that for a body of constant mass m acted on by a force F

\[ F = \frac{d}{dt}(p) = m\ddot{x} \tag{1.1} \]

where p is the linear momentum ($p \equiv m\dot{x}$), x is the position of the body and $\dot{x} \equiv \frac{dx}{dt}$. Hence if F = 0 then the linear momentum is conserved: $\dot{p} = 0$.

F is called a conservative force if the following two equivalent statements hold:

(i) The work done under the force is path-independent, and

(ii) The force may be derived from a scalar field: $F = -\nabla V$.

If so then the energy, defined as $E = \frac{1}{2}m|\dot{x}|^2 + V$, is constant.
The work done by a mass m subject to a force F moving on a path from $x(t_1)$ to $x(t_2)$ is

\[
\begin{aligned}
W &= \int_{x(t_1)}^{x(t_2)} F \cdot dx \\
&= \int_{t_1}^{t_2} F \cdot \dot{x}\, dt \\
&= \int_{t_1}^{t_2} m\ddot{x} \cdot \dot{x}\, dt \tag{1.2} \\
&= \int_{t_1}^{t_2} m \frac{d}{dt}\Big( \frac{1}{2}\dot{x}^2 \Big)\, dt \\
&= \frac{1}{2} m \dot{x}^2(t_2) - \frac{1}{2} m \dot{x}^2(t_1) \equiv \Delta T
\end{aligned}
\]

where $T \equiv \frac{1}{2}m\dot{x}^2$ is the kinetic energy. One sees that if $F = -\nabla V$ then we immediately have

\[ W = \int_{x_1}^{x_2} F \cdot dx = -\int_{x_1}^{x_2} \nabla V \cdot dx = V(x_1) - V(x_2), \tag{1.3} \]
which is path-independent.

In general the work done depends on the precise path taken from $x(t_1)$ to $x(t_2)$. It would seem common sense that to push a supermarket trolley from $x(t_1)$ to $x(t_2)$ requires an amount of work that is path-dependent - a path may be short or long, it might traverse a hill or go around it - and one might expect the amount of work to vary for each path. Nevertheless, for many theoretical examples, including these where work has to be done against and by the force of gravity, the work function is path-independent. An example of a path-dependent work function is the work done against friction.¹

Whenever W is path-independent the force F is called conservative. If the force only depends on positions, but not velocities, then it can always be derived from a scalar field V, called the potential, as

\[ F = -\nabla V. \tag{1.4} \]

When F is conservative the work function W depends only on the values of V at the endpoints of the path:

\[
\begin{aligned}
W &= -\int_{t_1}^{t_2} \nabla V \cdot \dot{x}\, dt \\
&= -\int_{t_1}^{t_2} \Big( \frac{\partial V}{\partial x}\frac{dx}{dt} + \frac{\partial V}{\partial y}\frac{dy}{dt} + \frac{\partial V}{\partial z}\frac{dz}{dt} \Big)\, dt \\
&= -\int_{t_1}^{t_2} \frac{dV}{dt}\, dt \tag{1.5} \\
&= -(V(t_2) - V(t_1)).
\end{aligned}
\]
In terms of kinetic energy we had $W = T(t_2) - T(t_1)$, hence

\[
\begin{aligned}
T(t_2) - T(t_1) &= V(t_1) - V(t_2) \\
\Rightarrow \quad (T + V)(t_1) &= (T + V)(t_2). \tag{1.6}
\end{aligned}
\]

Hence a conservative force conserves the energy $E \equiv T + V$ over time.

In terms of the potential V, Newton's second law of motion (for a constant mass) becomes:

\[ -\frac{\partial V}{\partial x_i} \equiv -\partial_i V = m\ddot{x}_i \tag{1.7} \]

where $x_i$ are the components of the vector x (i.e. $i \in \{1, 2, 3\}$) and we have introduced the notation $\partial_i$ for $\frac{\partial}{\partial x_i}$. This law of motion may be derived from a variational principle on the functional²

\[ S = \int_{t_1}^{t_2} dt\, L \tag{1.8} \]
¹You might consider the work done moving around a closed loop. For a conservative force the work is zero: split the closed loop into two journeys, from A to B and from B to A; as the work done by a conservative force depends only on A and B we have $W_{AB} = V_A - V_B = -W_{BA}$, hence the total work around the loop equals $W_{AB} + W_{BA} = 0$. For work against a friction force there is a positive contribution to the work around every leg of the journey which does not vanish when summed.

²A functional takes a function as its argument and returns a number. The action is a function of the vectors x, $\dot{x}$ as well as the scalar time t, and returns a real-valued number.
called the action, where L is the Lagrangian. To each path the action assigns a number using the Lagrangian.

[Figure (1.9): a family of paths between the fixed endpoints $x(t_1)$ and $x(t_2)$.]
You may recall from optics the principle of least time which is used to discover which path a photon travels in moving from A to B. The path a photon takes when it is diffracted as it moves between two media is dictated by this principle. The situation for diffraction is analogous to the physicist on the beach who observes a drowning swimmer out at sea. The physicist knows that she can travel faster on the sand than she can swim, so her optimal route will travel not in a straight line towards the swimmer but along a line which minimises the journey to the swimmer. This line will be bent in the middle and composed of two straight lines which change direction at the boundary between the sand and the sea. How does she work out which path she should follow to get to the swimmer in optimal time? Well, she first derives a function which for each path to the swimmer computes the time the path takes to travel. Then she considers the infinitude of all possible paths to the swimmer and reads off from her function the time each path will take. The path that takes the shortest time will extremise her function (as will the longest time, if it exists), and she can find the quickest path to take in this way. Of course the swimmer may not thank her for taking so long. In a similar manner the action assigns a number to each motion a system may make, and the dynamical motion is determined when the action is extremised. The action contains the Lagrangian which is defined by

\[ L(x, \dot{x}; t) \equiv T - V = \sum_{i=1}^{n} \frac{1}{2} m_i \dot{x}_i^2 - \sum_{i=1}^{n} V_i \tag{1.10} \]

for a system of n particles of masses $m_i$ with position vectors $x_i$ and velocities $\dot{x}_i$. Note that here we are not referring to the i-th component of a vector but rather the properties of the i-th particle. The equations of motion are found by extremising the action S. For simplicity of notation we will consider only a one-particle system (i.e. n = 1):
\[
\begin{aligned}
S &= \int_{t_1}^{t_2} dt\, L = \int_{t_1}^{t_2} dt\, \Big( \frac{1}{2} m\dot{x}^2 - V(x) \Big) \\
\delta S &= \int_{t_1}^{t_2} dt\, \big[ m\dot{x} \cdot \delta\dot{x} - \delta V(x) \big] \tag{1.11} \\
&= \int_{t_1}^{t_2} dt\, \Big[ m\dot{x}_i \frac{d}{dt}(\delta x_i) - \partial_i V\, \delta x_i \Big] \\
&= \int_{t_1}^{t_2} dt\, \Big[ -\frac{d}{dt}(m\dot{x}_i) - \partial_i V \Big] \delta x_i + \Big[ \delta x_i\, m\dot{x}_i \Big]_{t_1}^{t_2}
\end{aligned}
\]
where we have used integration by parts in the final line. Under the variation the action is expected to change at all orders:

\[ S(x + \delta x) = S(x) + \frac{\delta S}{\delta x}\, \delta x + O((\delta x)^2) \equiv S + \delta S + O((\delta x)^2) \tag{1.12} \]

When the first order variation of S vanishes ($\delta S = 0$) the action is extremised. Each path from $x(t_1)$ to $x(t_2)$ gives a different value of the action, and the extremisation of the action occurs only for certain paths between the fixed points. From above we see that when $\delta S = 0$ (and noting that the endpoints of the path are fixed, hence $\delta x(t_1) = \delta x(t_2) = 0$) then

\[ \delta S = \int_{t_1}^{t_2} dt\, \Big[ -\frac{d}{dt}(m\dot{x}_i) - \partial_i V \Big] \delta x_i = 0 \tag{1.13} \]

for all $\delta x_i$, which is satisfied only when Newton's law of motion is satisfied for the path with components $x_i$ (i.e. when $-\partial_i V = \frac{d}{dt}(m\dot{x}_i)$). This is no coincidence, as Lagrange's equations may be derived from Newton's second law.
More generally a generic dynamical system may be described by n generalised coordinates $q_i$ and n generalised velocities $\dot{q}_i$, where $i = 1, 2, 3, \ldots, n$ and n is the number of independent degrees of freedom of the system. The choice of generalised coordinates is where the art of dynamics resides. Imagine a system of N particles moving in a three-dimensional space V. There are $2 \times 3N$ Cartesian coordinates and velocities which describe this system. Now suppose further that the particles are all constrained to move on the surface of a sphere of radius R. One could make the change of coordinates to spherical coordinates, but for each particle the radial coordinate would be redundant (since it is fixed to equal the sphere's radius R) and the new coordinates would be awash with trigonometric functions. As the surface of the sphere is two-dimensional, only two coordinates on the surface of the sphere are needed to identify a unique position. One reasonable choice is the angular variables θ and φ, defined relative to the x-axis and the z-axis for example. These are independent coordinates and are an example of generalised coordinates. To summarise the example, each particle has three Cartesian coordinates which must satisfy one constraint, the equation $x^2 + y^2 + z^2 = R^2$; hence there are only two generalised coordinates per particle, which may be chosen as (θ, φ).

The Lagrangian function is defined via Cartesian coordinates, but constraint equations allow one to rewrite the Lagrangian in terms of $q_i$ and $\dot{q}_i$, i.e. $L = L(q_i, \dot{q}_i; t)$. The equations of motion for the system are the (Euler-)Lagrange equations:

\[ \frac{d}{dt}\Big( \frac{\partial L}{\partial \dot{q}_i} \Big) - \frac{\partial L}{\partial q_i} = 0 \tag{1.14} \]

Problem 1.1.1. Derive the Lagrange equations for an abstract Lagrangian $L(q_i, \dot{q}_i)$ by extremizing the action S.
Example 1: The free particle.

For a single free particle in $\mathbb{R}^3$ we have:

\[
\begin{aligned}
L &= T - V \tag{1.15} \\
&= \frac{1}{2} m (\dot{x}^2 + \dot{y}^2 + \dot{z}^2) - V \tag{1.16}
\end{aligned}
\]

The generalised coordinates may be picked to be any n quantities which completely parameterise the resulting path of the particle; in this case Cartesian coordinates suffice (i.e. let $q_1 \equiv x$, $q_2 \equiv y$, $q_3 \equiv z$). The particle is not subject to a force, hence V = 0, and hence the Lagrange equations (1.14) give

\[ \frac{d}{dt}(m\dot{q}_i) = 0 \tag{1.17} \]

i.e. that linear momentum is conserved.

Example 2: The linear harmonic oscillator.

The system has one coordinate, q, and the potential is $V(q) = \frac{1}{2}kq^2$ where k > 0 (n.b. $F = -kq$). The Lagrangian is

\[ L = \frac{1}{2} m\dot{q}^2 - \frac{1}{2} kq^2 \tag{1.18} \]

and the equation of motion (1.14) gives

\[ \frac{d}{dt}(m\dot{q}) + kq = 0 \quad \Rightarrow \quad \ddot{q} = -\frac{k}{m}\, q \tag{1.19} \]

Hence we find

\[ q(t) = A\cos(\omega t) + B\sin(\omega t) \tag{1.20} \]

where $\omega \equiv \sqrt{\frac{k}{m}}$ is the (angular) frequency of oscillation and A and B are real constants. The energy for these solutions is

\[ E = \frac{1}{2} m\dot{q}^2 + \frac{1}{2} kq^2 = \frac{1}{2} k (A^2 + B^2) \tag{1.21} \]
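As a quick numerical aside, the following minimal Python sketch (the parameter values are arbitrary illustrative choices) evaluates the solution (1.20) and confirms that the energy (1.21) is constant along it:

```python
import numpy as np

# Illustrative parameters (arbitrary choices)
m, k = 2.0, 8.0
omega = np.sqrt(k / m)
A, B = 0.3, -0.5

t = np.linspace(0.0, 10.0, 1001)
q = A * np.cos(omega * t) + B * np.sin(omega * t)        # solution (1.20)
qdot = omega * (-A * np.sin(omega * t) + B * np.cos(omega * t))

E = 0.5 * m * qdot**2 + 0.5 * k * q**2                    # energy along the path
assert np.allclose(E, 0.5 * k * (A**2 + B**2))            # matches (1.21)
print(E[0])
```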
Example 3: Circular motion.

Consider a bead of mass m constrained to move under gravity on a frictionless, circular, immobile, rigid hoop of radius R such that the hoop lies in a vertical plane.

The Lagrangian formulation offers a neat way to ignore the forces of constraint (which keep the bead attached to the hoop) via the use of generalised coordinates. If the hoop rests in the xz-plane and is centred at z = R, then the Cartesian coordinates (in terms of a suitably chosen generalised coordinate $q \equiv \theta$) of the bead are:

\[
\begin{aligned}
x &= R\cos\theta & \dot{x} &= -R\sin\theta\, \dot{\theta} \\
y &= 0 & \dot{y} &= 0 \tag{1.22} \\
z &= R + R\sin\theta & \dot{z} &= R\cos\theta\, \dot{\theta}
\end{aligned}
\]

These encode the statement that the bead is constrained to move on the hoop, but without needing to consider any of the forces acting to keep the bead on the hoop. The Lagrangian is

\[
\begin{aligned}
L &= \frac{1}{2} m (\dot{x}^2 + \dot{y}^2 + \dot{z}^2) - V \tag{1.23} \\
&= \frac{1}{2} m R^2 \dot{\theta}^2 - mg(R\sin\theta + R) \tag{1.24}
\end{aligned}
\]

where we have used the gravitational potential $V = mgz$ ($-\partial_z V = -mg \equiv F_G$). The equations of motion (1.14) are

\[
\begin{aligned}
\frac{d}{dt}\big( m R^2 \dot{\theta} \big) + mgR\cos\theta &= 0 \tag{1.25} \\
\Rightarrow \quad m R^2 \ddot{\theta} &= -mgR\cos\theta \\
\Rightarrow \quad \ddot{\theta} &= -\Big( \frac{g}{R} \Big)\cos\theta \\
&= -\Big( \frac{g}{R} \Big)\Big( 1 - \frac{\theta^2}{2} + O(\theta^4) \Big)
\end{aligned}
\]

For $\theta \ll 1$ we have $\ddot{\theta} \approx -\frac{g}{R}$, so $\theta \approx -\frac{1}{2}\big( \frac{g}{R} \big) t^2 + At + B$ where A and B are real constants. Obviously the assumption used for this approximation fails after a short time!
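The Euler-Lagrange machinery of (1.14) is mechanical enough to delegate to a computer algebra system. As an illustrative sketch (not a required part of the development), the following sympy code rederives (1.25) from the Lagrangian (1.24):

```python
import sympy as sp

t = sp.symbols('t')
m, g, R = sp.symbols('m g R', positive=True)
theta = sp.Function('theta')(t)

# Lagrangian (1.24) for the bead on the hoop
L = sp.Rational(1, 2) * m * R**2 * theta.diff(t)**2 - m * g * (R * sp.sin(theta) + R)

# Euler-Lagrange equation (1.14): d/dt(dL/d(qdot)) - dL/dq = 0
eom = sp.diff(L.diff(theta.diff(t)), t) - L.diff(theta)
print(sp.simplify(eom))   # m*R**2*theta'' + m*g*R*cos(theta), i.e. (1.25)
```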
1.1.1 Conserved Quantities

For every ignorable coordinate in the Lagrangian there is an associated conserved quantity. That is, if $L(q_i, \dot{q}_i; t)$ satisfies $\frac{\partial L}{\partial q_i} = 0$ then, as a consequence of (1.14),

\[ \frac{d}{dt}\Big( \frac{\partial L}{\partial \dot{q}_i} \Big) = 0 \tag{1.26} \]

and $\frac{\partial L}{\partial \dot{q}_i}$ is conserved. This quantity is called the generalised momentum $p_i$ associated to the generalised coordinate $q_i$:

\[ p_i \equiv \frac{\partial L}{\partial \dot{q}_i}. \tag{1.27} \]

For example, consider free circular motion (set V = 0 in the last example), where we have:

\[ L = \frac{1}{2} m R^2 \dot{\theta}^2. \tag{1.28} \]

We observe that θ is an ignorable coordinate, as $\frac{\partial L}{\partial \theta} = 0$, and hence $p_\theta = m R^2 \dot{\theta}$ is conserved. This is the conservation of angular momentum, as $|r \times p| = p_\theta$, as you may confirm.
1.2 Noether's Theorem

Theorem 1.2.1. (Noether) To every continuous symmetry of an action there is an associated conserved quantity.

Let us denote the action by $S_R[q]$ where

\[ S_R[q] \equiv \int_R dt\, L(q, \dot{q}) \quad \text{where } R = [t_1, t_2]. \tag{1.29} \]

There are two types of symmetry that we would like to consider,

(i.) Spatial: $S_R[q'] = S_R[q]$ and

(ii.) Space-time: $S_{R'}[q'] = S_R[q]$.

These two types foreshadow the symmetries that appear in field theory, where an internal symmetry such as an SO(n) scalar symmetry rotates the Lagrangian into itself; other types of symmetry of the action are called external. The spatial symmetries above are a symmetry of the Lagrangian alone and would be the prototype of an internal symmetry. We will consider Noether's theorem for a spatial symmetry, case (i), first and find the associated conserved quantity (also called the conserved charge).

Case (i) occurs if there is a symmetry of the Lagrangian:

\[ L[q'_i, \dot{q}'_i] = L[q_i, \dot{q}_i], \tag{1.30} \]

where the symmetry acts as

\[ q_i \rightarrow q'_i = q_i + \epsilon_i(q) \equiv q_i + \delta q_i \tag{1.31} \]
In fact all that is required is that we have a symmetry of the action, so it is possible that L is only invariant up to a boundary term:

\[ L[q'_i, \dot{q}'_i] = L[q_i, \dot{q}_i] + \frac{dK}{dt}, \tag{1.32} \]

for some expression K.

Now

\[ L(q_i + \delta q_i, \dot{q}_i + \delta\dot{q}_i) = L(q_i, \dot{q}_i) + \sum_i \Big( \delta q_i \frac{\partial L}{\partial q_i} + \delta\dot{q}_i \frac{\partial L}{\partial \dot{q}_i} \Big) + O(\delta q)^2. \tag{1.33} \]
If the transformation $q_i \rightarrow q'_i$ is a symmetry then by definition $\delta L = \frac{dK}{dt}$ up to terms of $O(\delta q_i^2)$, so that

\[ \sum_i \Big( \epsilon_i \frac{\partial L}{\partial q_i} + \dot{\epsilon}_i \frac{\partial L}{\partial \dot{q}_i} \Big) = \frac{dK}{dt}. \tag{1.34} \]

The conserved quantity is explicitly given by

\[ Q \equiv \sum_i \epsilon_i \frac{\partial L}{\partial \dot{q}_i} - K \tag{1.35} \]

and all we need to do is compute:

\[
\begin{aligned}
\frac{dQ}{dt} &= \sum_i \dot{\epsilon}_i \frac{\partial L}{\partial \dot{q}_i} + \sum_i \epsilon_i \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{dK}{dt} \\
&= \sum_i \dot{\epsilon}_i \frac{\partial L}{\partial \dot{q}_i} + \sum_i \epsilon_i \frac{\partial L}{\partial q_i} - \frac{dK}{dt} \\
&= 0, \tag{1.36}
\end{aligned}
\]

where we have used the equation of motion to get to the second line and (1.34) to get to the third line.
Next we turn to case (ii). In fact this can be treated in the same way by including a correction to K. To see this note that, to lowest order,

\[
\begin{aligned}
S'_{R'} &= \int_{t_1}^{t_2} L + \int_{t_1}^{t_2} \delta L + \int_{t_2}^{t_2 + \delta t_2} L + \int_{t_1 + \delta t_1}^{t_1} L \\
&= S_R + \int_{t_1}^{t_2} \delta L + L(t_2)\,\delta t_2 - L(t_1)\,\delta t_1 \tag{1.37} \\
&= S_R + \int_{t_1}^{t_2} \delta L + \int_{t_1}^{t_2} \frac{d}{dt}(L\,\delta t)\, dt.
\end{aligned}
\]

Thus it is just as if $K \rightarrow K + L\,\delta t$.
Example 1

Suppose that the spatial translation given by

\[ q_i \rightarrow q'_i = q_i + a_i, \tag{1.38} \]

where $a_i$ is a constant shift in the i-th generalised coordinate, is a symmetry of the action. Then we see that the conserved charge is

\[ Q = \sum_i a_i \frac{\partial L}{\partial \dot{q}_i} = \sum_i a_i p_i \tag{1.39} \]

where $p_i$ are the generalised momenta. The conserved quantity is a linear sum of the generalised momenta, which are all independently conserved.
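To see the charge (1.39) at work, here is a minimal numerical sketch (the masses, potential and initial data are arbitrary illustrative choices): two unit-mass particles interacting through a potential depending only on $q_1 - q_2$ have a translation-invariant Lagrangian, and the total momentum $p_1 + p_2$ stays constant along the integrated trajectory.

```python
import numpy as np

# Two unit-mass particles with V = (q1 - q2)^2; L is invariant under q_i -> q_i + a
def accel(q):
    f = -2.0 * (q[0] - q[1])       # F_1 = -dV/dq1; F_2 = -dV/dq2 = -F_1
    return np.array([f, -f])

q = np.array([0.0, 1.0])
p = np.array([0.3, -0.1])
dt = 1e-3
for _ in range(10000):             # leapfrog-style integration
    p = p + 0.5 * dt * accel(q)
    q = q + dt * p
    p = p + 0.5 * dt * accel(q)

print(p.sum())                     # stays at 0.2, the charge (1.39) with a_i = 1
```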
Example 2

Suppose that temporal translation is a symmetry of the action, i.e. there is no explicit time dependence so $\frac{\partial L}{\partial t} = 0$. Let the translation be

\[ t \rightarrow t' = t + \epsilon \tag{1.40} \]

where ε is a constant. The coordinates shift as follows:

\[ q_i \rightarrow q'_i = q_i(t + \epsilon) = q_i + \epsilon \dot{q}_i, \quad \text{i.e. } \delta q_i = \epsilon \dot{q}_i \tag{1.41} \]

and similarly $\delta\dot{q}_i = \epsilon \ddot{q}_i$. Following the discussion above, the change in the boundary conditions means that we need to use the corrected formula for the conserved quantity, with $K = \epsilon L$:

\[
\begin{aligned}
Q &= \sum_i \frac{\partial L}{\partial \dot{q}_i}\, \epsilon \dot{q}_i - \epsilon L \\
&= \epsilon \Big( \sum_i p_i \dot{q}_i - L \Big) \tag{1.42} \\
&= \epsilon H.
\end{aligned}
\]

Thus for time translations the Hamiltonian is the conserved quantity.
Problem 1.2.1. The Lagrangian for a two-dimensional harmonic oscillator is

\[ L = \frac{m}{2}(\dot{x}^2 + \dot{y}^2) - \frac{k}{2}(x^2 + y^2) \]

where x and y are Cartesian coordinates, $\dot{x}$ and $\dot{y}$ are their time-derivatives, m is the mass of the oscillator and k is a constant.

(a.) Rewrite the Lagrangian in terms of the complex coordinate $z = x + iy$, its complex conjugate $\bar{z}$ and their time-derivatives.

(b.) Show that

\[ z \rightarrow z' = e^{i\theta} z = z + i\theta z + O(\theta^2) \]

is a symmetry of the Lagrangian.

(c.) Consider the infinitesimal version of the transformation given in part (b.), so that $\delta z = i\theta z$. Find the conserved quantity Q associated to this transformation and use the equations of motion to prove directly that its time-derivative $\frac{dQ}{dt}$ is zero.
1.3 Hamiltonian Mechanics

Hamiltonians also encode the dynamics of a physical system. There is an invertible map from a Lagrangian to a Hamiltonian, so no information is lost. The map is the Legendre transform and is used to define the Hamiltonian H:

\[ H(q_i, p_i; t) = \sum_i \dot{q}_i p_i - L \tag{1.43} \]

where

\[ p_i = \frac{\partial L}{\partial \dot{q}_i} \tag{1.44} \]

is the conjugate momentum. N.B. the Hamiltonian is a function of $q_i$ and $p_i$ and not of $q_i$ and $\dot{q}_i$. In particular we use this equation to solve for $\dot{q}_i$ as a function of $p_i$, and then we do not see $\dot{q}_i$ again (except when we look at the time-evolution equations).

The Hamiltonian is closely related to the energy of the system. While the dynamics of the Lagrangian system are described by a single point (q) in an n-dimensional vector space called configuration space, the equivalent structure for Hamiltonian dynamics is the 2n-dimensional phase space, where a single point is described by the vector (q, p). This is a little more than cosmetics, as the equations of motion describing the two systems differ. The Lagrangian system has n second order differential equations describing the motion, while the Hamiltonian system has 2n first order equations of motion. In both cases 2n boundary conditions are required to completely solve the equations of motion.

Example.

Let $L = \sum_i \frac{1}{2} m \dot{q}_i^2 - V(q)$; then $p_i = \frac{\partial L}{\partial \dot{q}_i} = m\dot{q}_i$, so that

\[
\begin{aligned}
H &= \sum_i \dot{q}_i (m\dot{q}_i) - \sum_i \frac{1}{2} m \dot{q}_i^2 + V(q) \tag{1.45} \\
&= \sum_i \frac{1}{2} m \dot{q}_i^2 + V(q) \\
&= \sum_i \frac{p_i^2}{2m} + V(q).
\end{aligned}
\]
1.3.1 Hamilton's equations.

As $H \equiv H(q_i, p_i; t)$, then

\[ dH = \sum_i \Big( \frac{\partial H}{\partial q_i} dq_i + \frac{\partial H}{\partial p_i} dp_i \Big) + \frac{\partial H}{\partial t} dt. \tag{1.46} \]

While as $H = \sum_i \dot{q}_i p_i - L$ we also have

\[
\begin{aligned}
dH &= \sum_i \Big( d\dot{q}_i\, p_i + \dot{q}_i\, dp_i - \frac{\partial L}{\partial q_i} dq_i - \frac{\partial L}{\partial \dot{q}_i} d\dot{q}_i \Big) - \frac{\partial L}{\partial t} dt \tag{1.47} \\
&= \sum_i \Big( \dot{q}_i\, dp_i - \frac{\partial L}{\partial q_i} dq_i \Big) - \frac{\partial L}{\partial t} dt
\end{aligned}
\]

where we have used the definition of the conjugate momentum $p_i = \frac{\partial L}{\partial \dot{q}_i}$ to eliminate two terms in the final line. By comparing the coefficients of $dq_i$, $dp_i$ and $dt$ in the two expressions for dH we find

\[ \dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}, \qquad \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \tag{1.48} \]

where we have used Lagrange's equation (1.14) to observe that $\dot{p}_i = \frac{\partial L}{\partial q_i}$. The first two of the above equations are usually referred to as Hamilton's equations of motion. Notice that these are 2n first order differential equations, compared to Lagrange's equations which are n second-order differential equations.
Example.

If

\[ H = \frac{p^2}{2m} + V(q) \tag{1.49} \]

then

\[ \dot{q} = \frac{\partial H}{\partial p} = \frac{p}{m} \qquad \text{and} \qquad \dot{p} = -\frac{\partial H}{\partial q} = -\frac{\partial V}{\partial q}. \tag{1.50} \]

In other words we find, for this simple system, $p = m\dot{q}$ (the definition of linear momentum if q is a Cartesian coordinate) and $F = -\frac{\partial V}{\partial q} = \dot{p}$ (Newton's second law).
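Hamilton's first order equations are also convenient starting points for numerical integration. The sketch below (an illustration; the potential, step size and parameter values are arbitrary choices) integrates (1.50) for $V(q) = \frac{1}{2}kq^2$ with the semi-implicit (symplectic) Euler method, whose energy error stays bounded rather than drifting:

```python
import numpy as np

m, k = 1.0, 4.0
q, p = 1.0, 0.0
dt = 1e-3

def H(q, p):
    return p**2 / (2 * m) + 0.5 * k * q**2

E0 = H(q, p)
for _ in range(100000):
    p += dt * (-k * q)      # pdot = -dH/dq
    q += dt * (p / m)       # qdot =  dH/dp
print(abs(H(q, p) - E0))    # remains small: no secular energy drift
```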
1.3.2 Poisson Brackets

The Hamiltonian formulation of mechanics, while equivalent to the Lagrangian formulation, makes manifest a symmetry of the dynamical system. Notice that if we interchange $q_i$ and $p_i$ in Hamilton's equations, the two equations are interchanged up to a minus sign. This kind of skew-symmetry indicates that Hamiltonian dynamical systems possess a symplectic structure, and the phase space is related to a symplectic manifold (see the group theory chapter for the definition of the symplectic group Sp(2n)). There is, consequently, a useful skew-symmetric structure that exists on the phase space. It is called the Poisson bracket and is defined by

\[ \{f, g\} \equiv \sum_i \Big( \frac{\partial f}{\partial q_i} \frac{\partial g}{\partial p_i} - \frac{\partial g}{\partial q_i} \frac{\partial f}{\partial p_i} \Big) \tag{1.51} \]

where $f = f(q_i, p_i)$ and $g = g(q_i, p_i)$ are arbitrary functions on phase space.
One can write the equations of motion using the Poisson bracket as

\[ \dot{q}_i = \{q_i, H\} = \frac{\partial H}{\partial p_i} \qquad \text{and} \qquad \dot{p}_i = \{p_i, H\} = -\frac{\partial H}{\partial q_i}. \tag{1.52} \]

Being curious pattern-spotters we may wonder whether it is generally the case that $\dot{f} \overset{?}{=} \{f, H\}$ for an arbitrary function $f(q_i, p_i)$ on phase space. It is indeed the case, as

\[
\begin{aligned}
\{f, H\} &= \sum_i \Big( \frac{\partial f}{\partial q_i} \frac{\partial H}{\partial p_i} - \frac{\partial H}{\partial q_i} \frac{\partial f}{\partial p_i} \Big) \tag{1.53} \\
&= \sum_i \Big( \frac{\partial f}{\partial q_i} \frac{dq_i}{dt} + \frac{dp_i}{dt} \frac{\partial f}{\partial p_i} \Big) \\
&= \frac{df}{dt} \quad \text{if } f = f(q_i, p_i).
\end{aligned}
\]

The set of Poisson brackets acting simply on $q_i$ and $p_j$ are known as the fundamental or canonical Poisson brackets. They have a simple form:

\[
\begin{aligned}
\{q_i, p_j\} &= \delta_{ij} \tag{1.54} \\
\{q_i, q_j\} &= 0 \\
\{p_i, p_j\} &= 0
\end{aligned}
\]

which one may confirm by direct computation.
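The bracket (1.51) is straightforward to implement symbolically. As an illustrative sketch, the following sympy code defines it for one degree of freedom, confirms the canonical bracket (1.54), and checks that {f, H} reproduces the time-derivatives above:

```python
import sympy as sp

q, p, m, k = sp.symbols('q p m k')

def poisson(f, g):
    # Poisson bracket (1.51) for a single degree of freedom
    return sp.diff(f, q) * sp.diff(g, p) - sp.diff(g, q) * sp.diff(f, p)

H = p**2 / (2 * m) + k * q**2 / 2

print(poisson(q, p))              # 1, the canonical bracket (1.54)
print(poisson(q, H))              # p/m  = qdot, as in (1.52)
print(poisson(p, H))              # -k*q = pdot
f = q * p                         # an arbitrary phase-space function
print(sp.expand(poisson(f, H)))   # p**2/m - k*q**2 = df/dt on-shell
```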
1.3.3 Duality and the Harmonic Oscillator

In string theory there are a number of surprising transformations called T-duality which leave the theory unchanged but give a new interpretation to the setting. By T-duality one observes that the theory is unchanged whether the fundamental distance is R or $\frac{1}{R}$. This is a most unusual statement³ which you will learn more about elsewhere. The prototype for duality transformations in a physical theory is the electromagnetic duality which we will look at briefly after we have discussed special relativity and tensor notation. The simplest duality transformation is exhibited by the harmonic oscillator. We have seen that the Lagrangian and Hamiltonian of the harmonic oscillator are

\[
\begin{aligned}
L &= \frac{1}{2} m\dot{q}^2 - \frac{1}{2} kq^2 \tag{1.55} \\
H &= \frac{p^2}{2m} + \frac{kq^2}{2}.
\end{aligned}
\]

Hamilton's equations are

\[ \dot{q} = \frac{p}{m} \quad \text{and} \quad \dot{p} = -kq \quad \Rightarrow \quad \ddot{q} = -\Big( \frac{k}{m} \Big) q \tag{1.56} \]

and these have the solution

\[ q = A\cos(\omega t) + B\sin(\omega t) \quad \text{where } \omega = \sqrt{\frac{k}{m}}. \tag{1.57} \]

³If we were able to make such a transformation of the world we observe we would expect it to appear very different - if we survived.
The solution is unchanged under the transformation

\[ (m, k) \rightarrow \Big( \frac{1}{k}, \frac{1}{m} \Big) \tag{1.58} \]

as $\sqrt{\big( \frac{1}{m} \big) k} = \omega$. The transformation, which we call a duality, leaves the solution of the equations of motion unchanged. However the Lagrangian is transformed as

\[ L \rightarrow L' = \frac{\dot{q}^2}{2k} - \frac{q^2}{2m} \tag{1.59} \]

and looks rather different. The Hamiltonian is transformed as

\[ H \rightarrow H' = \frac{kp^2}{2} + \frac{q^2}{2m} \tag{1.60} \]

which, up to a canonical transformation, is identical to the original Hamiltonian H. The precise canonical transformation is

\[
\begin{aligned}
q &\rightarrow q' = -p \tag{1.61} \\
p &\rightarrow p' = q
\end{aligned}
\]

which takes $H' \rightarrow H$. The transformation above is canonical as the Poisson brackets are preserved: $\{q', p'\} = \{-p, q\} = 1$. The Hamiltonian with dual parameters is canonically equivalent to the original Hamiltonian. Investigation of dualities can be rewarding; for example it is surprising to realise that the harmonic oscillator with large mass m and large spring constant k is equivalent to the same system with small mass $\frac{1}{k}$ and small spring constant $\frac{1}{m}$.
1.3.4 Noether's theorem in the Hamiltonian formulation.

Canonical transformations $(q_i \rightarrow q'_i,\; p_i \rightarrow p'_i)$ are those transformations which preserve the form of the equations of motion written in the transformed variables, i.e. under a canonical transformation the equations of motion are transformed into

\[ \dot{q}'_i = \frac{\partial H(q'_i, p'_i)}{\partial p'_i} \qquad \text{and} \qquad \dot{p}'_i = -\frac{\partial H(q'_i, p'_i)}{\partial q'_i}. \tag{1.62} \]

A necessary and sufficient condition for a transformation to be canonical is that the fundamental Poisson brackets are preserved under the transformation, i.e.

\[ \{q'_i, p'_j\} = \delta_{ij}, \qquad \{q'_i, q'_j\} = 0 \qquad \text{and} \qquad \{p'_i, p'_j\} = 0. \tag{1.63} \]

In fact a canonical transformation may be generated by an arbitrary function $f(q_i, p_i)$ on phase space via

\[
\begin{aligned}
q_i \rightarrow q'_i &= q_i + \epsilon \{q_i, f\} \equiv q_i + \delta q_i \tag{1.64} \\
p_i \rightarrow p'_i &= p_i + \epsilon \{p_i, f\} \equiv p_i + \delta p_i
\end{aligned}
\]

Note that

\[
\begin{aligned}
\delta q_i &= \epsilon \{q_i, f\} = \epsilon \frac{\partial f}{\partial p_i} \tag{1.65} \\
\delta p_i &= \epsilon \{p_i, f\} = -\epsilon \frac{\partial f}{\partial q_i} \tag{1.66}
\end{aligned}
\]
In fact if $\epsilon \ll 1$ then the transformation is an infinitesimal canonical transformation. It is easy to check that this preserves the fundamental Poisson brackets up to terms of order $O(\epsilon^2)$, e.g.

\[
\begin{aligned}
\{q'_i, p'_j\} &= \{q_i + \epsilon\{q_i, f\},\; p_j + \epsilon\{p_j, f\}\} \tag{1.67} \\
&= \{q_i, p_j\} + \epsilon\big( \{\{q_i, f\}, p_j\} + \{q_i, \{p_j, f\}\} \big) + O(\epsilon^2) \\
&= \{q_i, p_j\} + \epsilon\Big( \Big\{ \frac{\partial f}{\partial p_i}, p_j \Big\} + \Big\{ q_i, -\frac{\partial f}{\partial q_j} \Big\} \Big) + O(\epsilon^2) \\
&= \delta_{ij} + \epsilon\Big( \frac{\partial^2 f}{\partial q_j \partial p_i} - \frac{\partial^2 f}{\partial p_i \partial q_j} \Big) + O(\epsilon^2) \\
&= \delta_{ij} + O(\epsilon^2).
\end{aligned}
\]
If the infinitesimal canonical transformation generated by f is a symmetry of the Hamiltonian then δH = 0 under the transformation. Now,

\[
\begin{aligned}
\delta H &= \sum_i \Big( \frac{\partial H}{\partial q_i} \delta q_i + \frac{\partial H}{\partial p_i} \delta p_i \Big) \tag{1.68} \\
&= \epsilon \sum_i \Big( \frac{\partial H}{\partial q_i} \frac{\partial f}{\partial p_i} - \frac{\partial H}{\partial p_i} \frac{\partial f}{\partial q_i} \Big) \\
&= \epsilon \{H, f\} \\
&= -\epsilon \frac{df}{dt}
\end{aligned}
\]

where we have assumed that f is an explicit function of the phase space variables and not time, i.e. $\frac{\partial f}{\partial t} = 0$. Hence if the transformation is a symmetry, δH = 0, then $f(q_i, p_i)$ is a conserved quantity.
Chapter 2
Special Relativity and
Component Notation
In 1905 Einstein published four papers which each changed the world. In the first he established that energy occurs in discrete quanta, which since the work of Max Planck had been thought to be a property of the energy transfer mechanism rather than energy itself - this work really opened the door for the development of quantum mechanics. In his second paper Einstein used an analysis of Brownian motion to establish the physical existence of atoms. In his third and fourth papers he set out the special theory of relativity and derived the most famous equation in physics, if not mathematics, relating energy to rest mass: $E = mc^2$. Hence 1905 is often referred to as Einstein's annus mirabilis.

At the time Einstein had been refused a number of academic positions and was working in the patent office in Bern. He was living with his wife and two young children while he was writing these historic papers. Not only was he insightful but perhaps, more importantly, he was dedicated and industrious. He must also have been pretty tired too.

In 1921 Einstein was awarded the Nobel prize for his work on the photoelectric effect (the work in the first of his four papers that year) but special relativity was overlooked (partly because it was very difficult to verify its predictions accurately at the time). If there is any message to be taken from the decision of the Nobel committee it is probably that you should keep your own counsel with regard to the quality of your work.

In this chapter we will give a brief description of the special theory of relativity - a more complete description of the theory will require group theory and will be covered again in the group theory chapter. One consequence of relativity is that time and space are put on an equal footing, and we will need to develop the notation we have used for classical mechanics, in which time was a special variable. Consequently we will spend some time developing our notation and will also consider the component notation for tensors. Sometimes a good notation is as good as a new idea.
2.1 The Special Theory of Relativity

The theory was constructed on two simple postulates:

(1.) the laws of physics are independent of the inertial reference frame of the observer, and

(2.) the speed of light is a constant for all observers.

Surprisingly these simple postulates necessitated that the coordinate and time transformations between two different frames F and F′, moving at relative speed v in the x-direction, were no longer the Galilean transformations but rather the Lorentz transformations:

\[
\begin{aligned}
t' &= \gamma \Big( t - \frac{xv}{c^2} \Big) \tag{2.1} \\
x' &= \gamma (x - vt) \\
y' &= y \\
z' &= z
\end{aligned}
\]

where

\[ \gamma \equiv \Big( \sqrt{1 - \frac{v^2}{c^2}} \Big)^{-1}. \tag{2.2} \]
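As a concrete numerical check of (2.1) (an illustrative sketch in units where c = 1, with an arbitrary event and boost speed), the following code boosts an event and confirms the invariance of $c^2t^2 - x^2 - y^2 - z^2$, anticipating the Minkowski inner product of section 2.1.1:

```python
import numpy as np

c = 1.0  # work in units where c = 1

def boost(event, v):
    # Lorentz transformation (2.1) along the x-axis
    t, x, y, z = event
    g = 1.0 / np.sqrt(1.0 - v**2 / c**2)
    return np.array([g * (t - v * x / c**2), g * (x - v * t), y, z])

def interval(event):
    t, x, y, z = event
    return (c * t)**2 - x**2 - y**2 - z**2

e = np.array([2.0, 0.5, -1.0, 3.0])
print(interval(e), interval(boost(e, 0.8)))  # equal: the interval is invariant
```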
Let us consider two thought experiments to motivate these transformations; the first will demonstrate time dilation and the second the shortening of length. Consider a clock formed of two perfect mirrors separated vertically such that a photon bouncing between the mirrors takes one second to travel from the bottom mirror to the top mirror and back again. It is consequently a very tall clock: it has height $h = \frac{c}{2}$ metres where c is the speed of light (hence $h \approx \frac{299792458}{2} = 149{,}896{,}229$ metres in a vacuum!). Let us set the clock in motion with a speed v in the +x-direction and consider two observers: one in the rest frame of the clock F, and a second observer in frame F′ which moves at speed v along the x-axis. Suppose at time t = 0 the two clocks are at the origin of frame F (i.e. the origins of both frames F and F′ coincide at t = 0). As the clock moves off at speed v relative to F′, the observer in frame F′ observes the ticking of the relatively moving photon clock slow down. Schematically we indicate a view of the moving clock as seen from frame F′ below:

[Figure: the photon's zig-zag path in the clock of height h = c/2, which translates along the x-axis.]
The photon in the moving clock is now seen to move along the hypotenuse of a right-angled triangle as the clock moves horizontally. What are the dimensions of this triangle as seen from frame F′? The height is the same as that of the clock, $\frac{c}{2}$. As viewed from the frame F′, where the clock appears to be moving, t′ seconds are observed to pass, in which time the clock's base has moved a distance vt′. Now using the Pythagorean formula and the second postulate of special relativity (that the speed of light is a constant) we find that the photon travels a distance x = ct′ where

\[ ct' = 2\sqrt{ \frac{c^2}{4} + \frac{v^2 t'^2}{4} } = \sqrt{ c^2 + v^2 t'^2 }. \tag{2.3} \]

Rearranging we find that, after one second has passed as measured in the rest frame of the clock, t′ seconds have passed as viewed from the frame F′ in which the clock is moving, where

\[ 1 = \sqrt{1 - \frac{v^2}{c^2}}\; t' = \frac{t'}{\gamma}. \tag{2.4} \]

We deduce that after t oscillations of the moving photon clock

\[ ct' = \sqrt{ c^2 t^2 + v^2 t'^2 } \quad \Rightarrow \quad t' = \gamma t. \tag{2.5} \]

As $\gamma \geq 1$ the time measured on a moving clock has slowed, because the same physical process, namely the propagation of the light signal, has taken longer. This derivation of time dilation is only a toy model, as we assumed we could instantaneously know when the photon on the moving clock had completed its oscillation. In practice the observer would sit at the origin of frame F′ and record measurements from there; information would take time to be transported back to their frame's origin and a second property of special relativity would need to be considered, that of length contraction.
Let us consider a second toy model that will indicate length contraction as a consequence of the postulates of special relativity.

Suppose we construct a contraption, consisting of a straight rigid rod with a perfect mirror attached to one end (as drawn below), whose rest length is l. We will aim to measure its length using a photon, whose arrival and departure times we will suppose we can measure accurately. The experiment will involve the photon traversing the length of the rod, being reflected by the perfect mirror and returning to its starting point. When conducted at rest the photon returns to its starting point in time $t_1 + t_2 = \frac{2l}{c}$, where $t_1$ is the time to go to the mirror and $t_2$ the time to come back, so in fact $t_1 = t_2$. Now we will change frames so that in F′ the contraption is seen to be moving with speed v in the positive x direction (left-to-right horizontally across the page as drawn below) and repeat the experiment.

[Figure: the contraption of rest length l with a perfect mirror at one end, moving at speed v while a photon of speed c runs along it.]
Now we know that on the first leg of the journey the photon will take a longer time to reach the mirror, as the mirror is travelling away from the photon. However on the return leg the photon's starting point at the other end of our contraption is moving towards the photon. So we may wonder if the total journey time for the photon has changed overall. We compute the time taken for each of the two legs. In the moving frame

\[ ct'_1 = l' + vt'_1 \quad \Rightarrow \quad t'_1 = \frac{l'}{c - v} \tag{2.6} \]
\[ ct'_2 = l' - vt'_2 \quad \Rightarrow \quad t'_2 = \frac{l'}{c + v}, \tag{2.7} \]

where l′ is the length that the moving observer sees. So the total time taken for the photon to traverse twice the contraption length when it is moving at speed v is

\[ t'_1 + t'_2 = \Big( \frac{l'}{c - v} + \frac{l'}{c + v} \Big) = \frac{2 l' c}{c^2 - v^2} = \frac{2}{c}\, l' \gamma^2. \tag{2.8} \]

On the other hand, using the Lorentz transformations for time between frames, we have that

\[ l = \frac{c}{2}(t_1 + t_2) = \frac{c}{2\gamma}(t'_1 + t'_2) = l' \gamma. \tag{2.9} \]

So the length that the moving observer will see is $l' = l/\gamma$. As $\gamma \geq 1$, $l' \leq l$. Thus the length appears to have contracted in the moving frame.
Let us complete this thought experiment by bringing together time dilation and length contraction to find the Lorentz transformations given in equation (2.1). Consider an event occurring in the stationary frame at the spacetime point (t, x).¹ The event is the arrival of a photon having started at the origin at t = 0, i.e. x = ct. Observing the same motion of the photon in the moving frame we deduce (as for the first leg in the thought experiment used to derive length contraction):

\[ x' + vt' = ct' \quad \Rightarrow \quad x' = (c - v)t' \tag{2.10} \]

Using the time dilation $t' = \gamma t$ gives

\[ x' = \gamma(ct - vt) = \gamma(x - vt) \tag{2.11} \]

since x = ct. As the speed of light is unchanged in either frame we have $\frac{x}{t} = \frac{x'}{t'}$, and using equation (2.11) we have

\[ t' = x' \frac{t}{x} = \gamma(x - vt)\frac{t}{x} = \gamma\Big( t - \frac{vt^2}{x} \Big) = \gamma\Big( t - \frac{vx}{c^2} \Big) \tag{2.12} \]

where we have used $t = \frac{x}{c}$, which is valid for photon motion. Thus we have arrived at the Lorentz transformations of equation (2.1).

These simple thought experiments changed the world and demonstrate the possibility for thought alone to outstrip intuition and experiment.
Problem 2.1.1. The Lagrangian of a relativistic particle with mass m and charge e, coupled to an electromagnetic field, is

\[ L = -\frac{mc^2}{\gamma} - e\phi(\mathbf{x}, t) + \sum_i e A_i(\mathbf{x}, t)\, \dot{x}_i \]

where $x_i$ are the coordinates of the particle with $i = 1, 2, 3$, $\gamma = \big( 1 - \frac{\dot{\mathbf{x}}^2}{c^2} \big)^{-\frac{1}{2}}$, $\dot{x}_i$ is the time derivative of the coordinate $x_i$, $\phi(\mathbf{x}, t)$ is the electric scalar potential and $\mathbf{A}(\mathbf{x}, t)$ is the magnetic vector potential.

¹We suppress the y and z coordinates as they are unchanged for a Lorentz transformation in the x-direction only.
(a.) Show that the equations of motion may be written in vector form as

\[ \frac{d}{dt}\big( \gamma m \dot{\mathbf{x}} \big) = -e\frac{\partial \mathbf{A}}{\partial t} - e\nabla\phi + e\, \dot{\mathbf{x}} \times (\nabla \times \mathbf{A}). \]

(b.) Find the Hamiltonian of the system.

(c.) Show that the rest energy of the system (i.e. when $\mathbf{p} = 0$) is

\[ mc^2 + \frac{1}{2}\frac{e^2}{m}\mathbf{A}^2 + e\phi + O\Big( \frac{1}{c^2} \Big). \]
2.1.1 The Lorentz Group and the Minkowski Inner Product.

As we will see in the chapter on group theory, the Lorentz transformations form a group denoted O(1,3). The subgroup of proper Lorentz transformations has determinant one and is denoted SO(1,3). When the Lorentz transformations are combined with the translations in space and time, the new larger group formed is called the Poincaré group. It is the relativistic analogue of the Galilean group which maps between inertial frames in Newtonian mechanics.² The Lorentz group O(1,3) is defined by

\[ O(1,3) \equiv \{ \Lambda \in GL(4, \mathbb{R}) \;|\; \Lambda^T \eta \Lambda = \eta \}; \qquad \eta \equiv \text{diag}(1, -1, -1, -1) \]

where $GL(4, \mathbb{R})$ is the set of invertible four-by-four matrices whose entries are elements of ℝ, $\Lambda^T$ is the transpose of the matrix Λ, and η, the Minkowski metric, is a four-by-four matrix whose only non-zero elements are on the diagonal, given in full matrix notation by

\[ \eta \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{2.13} \]

It is not yet obvious either that the Lorentz transformations form a group, or that the definition of O(1,3) encodes the Lorentz transformations as given in section 2.1. We will wait until we encounter the definition of a group before checking the first assertion. The group SO(1,3) itself is the rotation group in a Minkowski space; the numbers (1,3) indicate the signature of the spacetime and correspond to a spacetime with one timelike coordinate and three spatial coordinates, i.e. $\mathbb{R}^{1,3}$. Rather more mathematically, the matrix η defines the signature of the Minkowski metric³ which is preserved by the Lorentz transformations. It is the insightful observation that the Lorentz transformations leave invariant the Minkowski inner product between two four-vectors that will give the first hint that Lorentz transformations are related to the definition of O(1,3). The equivalent

²The Galilean group consists of 10 transformations: 3 space rotations, 3 space translations, 3 Galilean velocity boosts v → v + u and one time translation.

³We commence the abuse of our familiar mathematical definitions here, as the Minkowski metric is not positive definite as is implied by the definition of a metric; similarly the Minkowski inner product is also not positive definite. But the constructions of both the Minkowski inner product and the Minkowski metric are close enough to the standard definitions that the misnomers have remained, and the lack of vocabulary will not confuse our work. Properly, Minkowski space is a pseudo-Riemannian manifold, in contrast to Euclidean space equipped with the standard metric, which is a Riemannian manifold.
statement in Euclidean space $\mathbb{R}^3$ is that rotations leave distances unchanged. The inner product on $\mathbb{R}^{1,3}$ is defined between any two four-vectors

\[ v = \begin{pmatrix} v^0 \\ v^1 \\ v^2 \\ v^3 \end{pmatrix} \quad \text{and} \quad w = \begin{pmatrix} w^0 \\ w^1 \\ w^2 \\ w^3 \end{pmatrix} \tag{2.14} \]

in $\mathbb{R}^{1,3}$ by

\[
\begin{aligned}
\langle v, w \rangle &\equiv v^T \eta\, w \tag{2.15} \\
&= (v^0, v^1, v^2, v^3) \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} w^0 \\ w^1 \\ w^2 \\ w^3 \end{pmatrix} \tag{2.16} \\
&= v^0 w^0 - v^1 w^1 - v^2 w^2 - v^3 w^3. \tag{2.17}
\end{aligned}
\]

Now we can see clearly that the Minkowski inner product ⟨v, w⟩ is not positive for all vectors v and w.
Problem 2.1.2. Show that under the Lorentz transformations $x^2 \equiv \langle x, x \rangle$ is invariant, where $x^0 = ct$, $x^1 = x$, $x^2 = y$ and $x^3 = z$.
It is worthwhile keeping the comparison with $\mathbb{R}^3$ in mind. The equivalent group would be SO(3), and its elements are the rotations in three-dimensional space; the inner product on the space is defined using the identity matrix I, whose diagonal entries are all one and whose off-diagonal entries are zero. The Euclidean inner product on $\mathbb{R}^3$ between two vectors x and y is $x^T I y \equiv x_1 y_1 + x_2 y_2 + x_3 y_3$. The vector length squared $x^2 = x^T I x \equiv x \cdot x$ is positive definite when $x \neq 0$. The rotation of a vector leaves invariant the length of any vector in the space, or in other words leaves the inner product invariant. In the comparison with Lorentz transformations in Minkowski space the crucial difference is that the metric is no longer positive definite, and hence four-vectors fall into one of three classes:

\[ \langle v, v \rangle \; \begin{cases} > 0 & v \text{ is called timelike} \\ = 0 & v \text{ is called lightlike or null} \\ < 0 & v \text{ is called spacelike} \end{cases} \tag{2.18} \]

Consider the subspace of $\mathbb{R}^{1,3}$ consisting of the $x^0$ and $x^1$ axes. Vectors in this two-dimensional sub-space are labelled by points which lie in one of, or at the meeting points of, the four sectors indicated below:

[Figure: the $(x^0, x^1)$ plane divided by the lines $x^0 = \pm x^1$ into four sectors.]
Let

\[ v = \begin{pmatrix} v^0 \\ v^1 \\ 0 \\ 0 \end{pmatrix} \tag{2.19} \]

be an arbitrary vector in $\mathbb{R}^{1,3}$, also lying entirely within $\mathbb{R}^{1,1}$ due to the zeroes in the third and fourth components. So

\[ \langle v, v \rangle = (v^0)^2 - (v^1)^2 \tag{2.20} \]

and hence if

\[
\begin{aligned}
|v^0| &> |v^1| & &v \text{ is timelike,} \\
|v^0| &= |v^1| & &v \text{ is lightlike or null,} \tag{2.21} \\
|v^0| &< |v^1| & &v \text{ is spacelike.}
\end{aligned}
\]
In relativity Minkowski space - $\mathbb{R}^{1,3}$ equipped with the Minkowski metric η - is used to model spacetime. Spacetime, which we have taken for granted so far, has a local basis of coordinates which are associated with time t and the Cartesian coordinates (x, y, z) by

\[ x^0 = ct, \quad x^1 = x, \quad x^2 = y \quad \text{and} \quad x^3 = z \tag{2.22} \]

where $(x^0, x^1, x^2, x^3)$ are the components of a four-vector x, and c is the speed of light - a useful constant that ensures that the dimensional units of $x^0$ are metres, the same as $x^1$, $x^2$ and $x^3$.

If we plot the graph of a one-dimensional (here $x^1$) motion of a particle against $x^0 = ct$, the resulting curve is called the worldline of the particle. We measure the position $x^1$ of the particle at a sequence of times and plot it; we might find a graph that looks like:

[Figure: a worldline plotted in the $(x^1, ct)$ plane.]
What is the gradient of the worldline?

\[ \text{Gradient} = \frac{\Delta(ct)}{\Delta(x^1)} = \frac{c}{v^1} \tag{2.23} \]

where $v^1$ is the speed of the particle in the $x^1$ direction. Hence if the particle moves at the speed of light, c, then the gradient of the worldline is 1. In this case, when $x^1 = v^1 t = ct$ (and recalling the particle is only moving in the $x^1$ direction), then

\[ x^2 = (x^0)^2 - (x^1)^2 = (ct)^2 - (x^1)^2 = 0 \tag{2.24} \]

so x is a lightlike or null vector. If the gradient of the worldline is greater than one then $v^1 < c$ and x is timelike; otherwise, if the gradient is less than one then $v^1 > c$ and x is a spacelike vector. One of the consequences of the special theory of relativity is that objects cannot cross the lightspeed barrier, and objects with non-zero rest-mass cannot be accelerated to the speed of light.
Problem 2.1.3. Compute the transformation of the space-time coordinates given by two consecutive Lorentz boosts along the x-axis, the first with speed v and the second with speed u.

Problem 2.1.4. Compare your answer to problem 2.1.3 to the single Lorentz transformation given by $\Lambda(u \oplus v)$, where ⊕ denotes the relativistic addition of velocities. Hence show that

\[ u \oplus v = \frac{u + v}{1 + \frac{uv}{c^2}}. \]
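A quick numerical experiment (an illustrative sketch in units where c = 1, with arbitrary speeds) makes the content of problems 2.1.3 and 2.1.4 plausible: composing two boosts along the x-axis reproduces the single boost with the relativistically added velocity.

```python
import numpy as np

def boost(v, c=1.0):
    # 2x2 block of the Lorentz boost acting on (ct, x)
    g = 1.0 / np.sqrt(1.0 - v**2 / c**2)
    return np.array([[g, -g * v / c], [-g * v / c, g]])

u, v = 0.6, 0.7
w = (u + v) / (1.0 + u * v)                         # relativistic velocity addition
print(np.allclose(boost(u) @ boost(v), boost(w)))   # True
```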
The spacetime at each point is split into four pieces. In the sketch above the set of null vectors forms the boundaries of the light-cone for the origin. Given any arbitrary point p in spacetime, the set of vectors x − p are all either timelike, spacelike or null. In the diagram above this would correspond to shifting the origin to the point p, with spacetime again split into four pieces and their boundaries. The points which are connected to p by a timelike vector lie in the future or past lightcone of p, those connected by a null vector lie on the surface of the lightcone of p, and those connected by a spacelike vector to p are outside the lightcone. As nothing may cross the lightspeed barrier, any point in spacetime can only exchange information with other points in spacetime which lie within or on its past or future lightcone.

In the two-dimensional spacetime that we have sketched it would be proper to refer to the forward or past light-triangle. The extension to four-dimensional spacetime is not easy to visualise. First consider extending the picture to a three-dimensional spacetime: add a second spatial axis $x^2$; as no spatial direction is singled out (there is a symmetry in the two spatial coordinates) the light-triangle of two dimensions extends by rotating the light-triangle around the temporal axis into the $x^2$ direction.⁴ Rotating the light-triangle through three dimensions gives the light-cone. The full picture for four-dimensional spacetime (being four-dimensional) is not possible to visualise and we refer still to the light-cone. However it is useful to be cautious when considering a drawing of a light cone and understand which dimensions (and how many) it really represents, e.g. a light-cone in four dimensions could be indicated by drawing a cone in three dimensions with the implicit understanding that each point in the cone represents a two-dimensional space the drawing of which has been suppressed.

In all dimensions the lightcone at a point p is traced out by all the lightlike vectors connected to p. No spacelike separated points can exchange a signal since the message would have to travel at a speed exceeding that of light.
We finish this section by making an observation that will make the connection between the definition of O(1,3) and the Lorentz transformations explicit, but which will be most usefully digested a second time after having read through the group theory chapter. Consider again the Lorentz boost transformation shown in equation (2.1). By making the substitution $\gamma = \cosh\phi$ the transformations are re-written in a way that looks a little like a rotation; it is in fact a hyperbolic rotation. We note that $\cosh^2\phi - \sinh^2\phi = 1 = \gamma^2 - \sinh^2\phi$, i.e. $\sinh^2\phi = \gamma^2 - 1$; therefore we have the useful relation

\[ \tanh\phi = \frac{1}{\gamma}\big( \gamma^2 - 1 \big)^{\frac{1}{2}} = \Big( 1 - \frac{1}{\gamma^2} \Big)^{\frac{1}{2}} = \Big( 1 - \Big( 1 - \frac{v^2}{c^2} \Big) \Big)^{\frac{1}{2}} = \frac{v}{c}. \tag{2.25} \]
Hence we can rewrite the Lorentz boost in (2.1) as

\[
\begin{aligned}
ct' &= c\cosh\phi\, \Big( t - \frac{x}{c}\tanh\phi \Big) = ct\cosh\phi - x\sinh\phi \tag{2.26} \\
x' &= \cosh\phi\, \big( x - ct\tanh\phi \big) = x\cosh\phi - ct\sinh\phi \tag{2.27} \\
y' &= y \tag{2.28} \\
z' &= z \tag{2.29}
\end{aligned}
\]

or in matrix form as

\[ x' = \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cosh\phi & -\sinh\phi & 0 & 0 \\ -\sinh\phi & \cosh\phi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \Lambda(\phi)\, x \tag{2.30} \]

where Λ is the four-by-four matrix indicated above and is a group element of SO(1,3). The Lorentz boost is a hyperbolic rotation of x into ct and vice-versa.

⁴By taking a slice of the three-dimensional graph through ct and perpendicular to the $(x^1, x^2)$ plane the two-dimensional light-triangle structure reappears.

Problem 2.1.5. Show that $\Lambda(\phi) \in SO(1,3)$.
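Both defining conditions of SO(1,3) are easy to verify numerically for the matrix in (2.30); an illustrative sketch with an arbitrary rapidity:

```python
import numpy as np

phi = 0.9                      # an arbitrary rapidity
ch, sh = np.cosh(phi), np.sinh(phi)
Lam = np.array([[ch, -sh, 0, 0],
                [-sh, ch, 0, 0],
                [0, 0, 1, 0],
                [0, 0, 0, 1]])
eta = np.diag([1.0, -1.0, -1.0, -1.0])

print(np.allclose(Lam.T @ eta @ Lam, eta))   # Lorentz condition: True
print(np.isclose(np.linalg.det(Lam), 1.0))   # proper: det = 1
```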
2.2 Component Notation.

We have introduced the concept of the position four-vector implicitly as the extension of the usual three-vector in Cartesian coordinates to include a temporal coordinate. The position four-vector is a particular four-vector x which specifies a unique position in space-time:

\[ x = \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}. \tag{2.31} \]

The components of the position four-vector are denoted $x^\mu$ where $\mu \in \{0, 1, 2, 3\}$, such that

\[ x^0 = ct, \quad x^1 = x, \quad x^2 = y \quad \text{and} \quad x^3 = z. \tag{2.32} \]

It is frequently more useful to work with the components $x^\mu$ of the vector rather than the abstract vector x or the column vector in full. Consequently we will now develop a formalism for denoting vectors, their transposes, matrices, matrix multiplication and matrix action on vectors all in terms of component notation.

The notation $x^\mu$ with a single raised index we have defined to mean the entries in a single-column vector; hence the raised index denotes a row number (the components of a vector are labelled by their row). We have already met the Minkowski inner product, which may be used to find the length-squared of a four-vector: it maps a pair of vectors to a single scalar. Now a scalar object needs no index notation, it is specified by a single number, i.e.

\[ \langle x, x \rangle = x^2 = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2. \tag{2.33} \]

On the right-hand-side we see the distribution of the components of the vector. Our aim is to develop a notation that is useful, intuitive and carries some meaning within it. A good notation will improve our computation. We propose to develop the notation so that

\[ x^2 = x_\mu x^\mu \tag{2.34} \]

where $x_\mu$ is a row vector, although not always the simple transpose of x. To do this we will develop matrix multiplication and the Einstein summation convention in the component notation.
2.2.1 Matrices and Matrix Multiplication.

Let us think gently about index notation and develop our component notation. Let A be an invertible four-by-four matrix with real entries (i.e. $A \in GL(4, \mathbb{R})$). The matrix may multiply the four-vector x to give a new four-vector x′, i.e. x′ = Ax. This means that in component notation matrix multiplication takes the component $x^\mu$ to $x'^\mu$. In terms of components we write the matrix entry for the μ-th row and ν-th column as $A^\mu{}_\nu$, and matrix multiplication is written as

\[ x'^\mu = \sum_{\nu=0}^{3} A^\mu{}_\nu\, x^\nu. \tag{2.35} \]

This notation for matrix multiplication is consistent with our notation for a column vector $x^\mu$ and row vector $x_\mu$: raised indices indicate a row number while lowered indices indicate a column number. Hence the summation above is a sum of a product of entries in a row of the matrix and the column of the vector - as the summation index ν is a column label (the matrix row stays constant in the sum). The special feature we have developed here is to distinguish the meaning of a raised and a lowered index; otherwise the expressions above are very familiar.

In more involved computations it becomes onerous to write out multiple summation symbols. So we adopt in most cases the Einstein summation convention, so called because it was notably adopted by Einstein in a 1916 paper on general relativity. As can be seen above, the summation occurs over a pair of repeated indices, so it is not necessary to use the summation sign. Instead the Einstein summation convention assumes that there is an implicit summation over any pair of repeated indices in an expression. Hence the matrix multiplication written above becomes

\[ x'^\mu = A^\mu{}_\nu\, x^\nu \tag{2.36} \]

when the Einstein summation convention is assumed. In four dimensions this means explicitly

\[ x'^\mu = A^\mu{}_\nu\, x^\nu = A^\mu{}_0\, x^0 + A^\mu{}_1\, x^1 + A^\mu{}_2\, x^2 + A^\mu{}_3\, x^3. \tag{2.37} \]

The summed-over indices no longer play any role on the right hand side, and the index structure matches on either side of the expression: on both sides there is one free raised index, indicating that we have the components of a vector on both sides of the equality. The repeated pair of indices which will be summed over and are missing from the final expression are called dummy indices. It does not matter which symbol is used to denote a pair of indices to be summed over, as they will vanish in the final expression, that is

\[ x'^\mu = A^\mu{}_\nu\, x^\nu = A^\mu{}_\rho\, x^\rho = A^\mu{}_\sigma\, x^\sigma = A^\mu{}_0\, x^0 + A^\mu{}_1\, x^1 + A^\mu{}_2\, x^2 + A^\mu{}_3\, x^3. \tag{2.38} \]

The index notation we have adopted is useful as free indices are matched on either side, as are the positions of the indices.
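numpy's einsum function mirrors the Einstein summation convention directly: repeated index letters are summed over and free letters survive. An illustrative sketch (the sample matrix and vector are arbitrary), which also anticipates the index lowering introduced below:

```python
import numpy as np

A = np.arange(16.0).reshape(4, 4)    # a sample matrix A^mu_nu
x = np.array([1.0, 2.0, 3.0, 4.0])   # components x^nu

xp = np.einsum('mn,n->m', A, x)      # x'^mu = A^mu_nu x^nu, as in (2.36)
print(np.allclose(xp, A @ x))        # True: the same as matrix multiplication

eta = np.diag([1.0, -1.0, -1.0, -1.0])
x_low = np.einsum('mn,n->m', eta, x)  # x_mu = eta_{mu nu} x^nu
print(np.einsum('m,m->', x_low, x))   # x_mu x^mu = (x^0)^2 - (x^1)^2 - ...
```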
So far so good; now we will run into an oddity in our conventions: the Minkowski metric does not have the index structure of a matrix in our conventions, even though we wrote η as a matrix previously! Recall that we aimed to be able to write $x^2 = x_\mu x^\mu$. Now we understand the meaning of the right-hand-side; applying the Einstein summation convention we have

\[ x_\mu x^\mu = x_0 x^0 + x_1 x^1 + x_2 x^2 + x_3 x^3 \tag{2.39} \]

but we have seen already that the Minkowski inner product is

\[ \langle x, x \rangle = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 \tag{2.40} \]

so we gather that $x_0 = x^0$, $x_1 = -x^1$, $x_2 = -x^2$ and $x_3 = -x^3$, and, as we hinted, $x_\mu$ is not simply the components of the transpose of x. It is the Minkowski metric on Minkowski space that we may use to lower indices on vectors:

\[ x_\mu = \eta_{\mu\nu}\, x^\nu. \tag{2.41} \]

This is the analogue of the vector transpose in Euclidean space (where the natural inner product is the identity matrix $\delta_{ij}$ and the transpose does not change the sign of the components, as $x_i = \delta_{ij} x^j$). Now we note the flaw in our notation: as η can lower indices, we could form an object $A_{\mu\nu} = \eta_{\mu\rho} A^\rho{}_\nu$ which is obviously related to the matrix $A^\mu{}_\nu$. So when we write η as a matrix

\[ \eta = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{2.42} \]

we are forced to defy our own conventions and understand $\eta_{\mu\nu}$ to mean the entry in the μ-th row and ν-th column of the matrix above.
Now we can write the Minkowski inner product in component notation:

\[ \eta_{\mu\nu}\, x^\mu x^\nu = x_\nu x^\nu = x_\mu x^\mu = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 = \langle x, x \rangle. \tag{2.43} \]

The transpose has generalised to the raising and lowering of indices using the Minkowski metric: $(x^\mu)^T = x_\mu = \eta_{\mu\nu} x^\nu$. To raise indices we use the inverse Minkowski metric, denoted $\eta^{\mu\nu}$ and defined by

\[ \eta^{\mu\nu}\, \eta_{\nu\rho} = \delta^\mu{}_\rho \tag{2.44} \]

which is the component form of $\eta^{-1}\eta = I$. From the matrix form of η we note that $\eta^{-1} = \eta$. We can raise indices with the inverse Minkowski metric: $x^\mu = \eta^{\mu\nu} x_\nu$.

Exercise. Show that the matrix multiplication $\Lambda^T \eta \Lambda = \eta$ used to define the matrices of O(1,3) may, in component notation, be written as $\Lambda^\rho{}_\mu\, \eta_{\rho\sigma}\, \Lambda^\sigma{}_\nu = \eta_{\mu\nu}$.

Solution. $(\Lambda^T \eta \Lambda)_{\mu\nu} = (\Lambda^T)_\mu{}^\rho\, \eta_{\rho\sigma}\, \Lambda^\sigma{}_\nu = \Lambda^\rho{}_\mu\, \eta_{\rho\sigma}\, \Lambda^\sigma{}_\nu = \eta_{\mu\nu}$, where we have used the Minkowski metric to take the matrix transpose.
Since the components of vectors and matrices are numbers, the order of terms in products is irrelevant in component notation, e.g.

\[ \eta_{\mu\nu}\, x^\nu = x^\nu\, \eta_{\mu\nu} \]

or

\[ x_\mu A^\mu{}_\nu = (x^T A)_\nu = A^\mu{}_\nu\, x_\mu. \]

We are also free to raise and lower simultaneously pairs of dummy indices:

\[ x_\mu y^\mu = \eta_{\mu\nu}\, x^\nu y^\mu = x^\nu\, \eta_{\nu\mu}\, y^\mu = x^\nu y_\nu. \]

So we have many ways to write the same expression, but the key point for us are the things that do not vary: the objects involved in the expression (x and A below) and the free indices (although the dummy indices may be redistributed):

\[ (x^T A)_\nu = x_\mu A^\mu{}_\nu = A^\mu{}_\nu\, x_\mu = \eta_{\mu\rho}\, A^\rho{}_\nu\, x^\mu = A_{\mu\nu}\, x^\mu = \ldots \]
2.2.2 Common Four-Vectors

We have seen that the Minkowski inner product gives a Lorentz-invariant quantity for any pair of four-vectors. We can make use of this Lorentz invariance to construct new but familiar four-vectors. Consider two events, one occurring at the 4-vector x and another at y, where

\[ x = \begin{pmatrix} ct_1 \\ x_1 \\ y_1 \\ z_1 \end{pmatrix} \quad \text{and} \quad y = \begin{pmatrix} ct_2 \\ x_2 \\ y_2 \\ z_2 \end{pmatrix}. \tag{2.45} \]
In Newtonian physics the difference in time $\Delta t \equiv |t_2 - t_1|$ between the two events and the distance in space between the locations of the two events $\Delta r \equiv \sqrt{ \sum_{i=1}^{3} |x^i - y^i|^2 }$ are both invariants of the Galilean transformations. As we have seen, under the Lorentz transformations a new single invariant emerges: $|x - y|^2 = c^2 \tau_{xy}^2$, where $\tau_{xy}$ is called the proper time between the two events x and y, i.e.

\[ c^2 \tau_{xy}^2 = c^2 (t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2. \tag{2.46} \]
Every point x in space-time has a proper time associated to it by

\[ c^2 \tau_x^2 = c^2 t_1^2 - x_1^2 - y_1^2 - z_1^2 = x_\mu x^\mu \tag{2.47} \]

We have already shown in problem 2.1.2 that this is invariant under the Lorentz transformations, and one can show that $\tau_{xy}$ is also invariant, as $c^2 \tau_{xy}^2 = \langle x - y, x - y \rangle = (x - y)_\mu (x - y)^\mu$. Now as $\langle x - y, x - y \rangle = x^2 - 2\langle x, y \rangle + y^2$ is invariant, then we can conclude that ⟨x, y⟩ is also an invariant, as $x^2$ and $y^2$ are also invariant under the Lorentz transformations.

Problem 2.2.1. Show explicitly that $\langle x, y \rangle = x_\mu y^\mu$ is invariant under the Lorentz group.
These quantities are all called Lorentz-invariant quantities. You will notice that they do not have any free indices for the Lorentz group to act on.

All four-vectors transform in the same way as the position four-vector x under a Lorentz transformation (just as 3D vectors all transform in the same way under SO(3) rotations). We can find other physically relevant four-vectors by combining the position four-vector x with Lorentz-invariant quantities. For example the Lorentz four-velocity u is defined using the proper time, which is Lorentz invariant, rather than time, which is not:

\[ u = \frac{dx}{d\tau} = \frac{dx}{dt}\frac{dt}{d\tau} = \frac{dt}{d\tau} \begin{pmatrix} c \\ u^1 \\ u^2 \\ u^3 \end{pmatrix} \tag{2.48} \]
where $(u^1, u^2, u^3)^T$ is the usual Newtonian velocity vector in $\mathbb{R}^3$. Let us compute $\frac{dt}{d\tau}$, starting from

\[ \tau = \frac{1}{c}\sqrt{ c^2 t^2 - x^2 - y^2 - z^2 } \tag{2.49} \]

for a particle moving with constant velocity, so that $x = u^1 t$, $y = u^2 t$ and $z = u^3 t$. Then

\[
\begin{aligned}
\frac{d\tau}{dt} &= \frac{1}{2c^2\tau}\big( 2c^2 t - 2xu^1 - 2yu^2 - 2zu^3 \big) \tag{2.50} \\
&= \frac{1}{\tau}\Big( t - \frac{xu^1}{c^2} - \frac{yu^2}{c^2} - \frac{zu^3}{c^2} \Big) \\
&= \frac{t\big( 1 - \frac{u^2}{c^2} \big)}{\tau} \\
&= \frac{1}{\gamma}
\end{aligned}
\]

where $u^2 = (u^1)^2 + (u^2)^2 + (u^3)^2$ and $\frac{1}{\gamma} = \sqrt{1 - \frac{u^2}{c^2}}$. Hence the four-velocity is given by

\[ u = \gamma \begin{pmatrix} c \\ u^1 \\ u^2 \\ u^3 \end{pmatrix}. \tag{2.51} \]
We can check that $u^2$ is invariant:

\[ u^2 = u_\mu u^\mu = \gamma^2 (c^2 - u^2) = c^2 \gamma^2 \Big( 1 - \frac{u^2}{c^2} \Big) = c^2 \tag{2.52} \]
The four-momentum is defined as p = mu, where m is the rest-mass. The spatial part of the four-momentum is the usual Newtonian momentum $p_N$ multiplied by γ, while the zeroth component is proportional to the energy:

\[ p^0 = \frac{E}{c} = \gamma mc. \tag{2.53} \]

The invariant quantity associated to p is

\[ p_\mu p^\mu = \Big( \frac{E}{c} \Big)^2 - \gamma^2 p_N^2 = m^2 c^2 \tag{2.54} \]

Rearranging gives

\[ E = \big( m^2 c^4 + \gamma^2 p_N^2 c^2 \big)^{\frac{1}{2}} \tag{2.55} \]

which is the relativistic version of $E = \frac{1}{2}mu^2$, and you could expand the above expression to find the usual kinetic energy term together with other less familiar terms. For a particle at rest we have γ = 1 and $p_N = 0$, hence we find a particle's rest energy $E_0$ is

\[ E_0 = mc^2. \tag{2.56} \]
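As an illustrative numerical check (in units where c = 1, with an arbitrary mass and velocity), one can confirm that $p_\mu p^\mu = m^2 c^2$ holds whatever the velocity:

```python
import numpy as np

c, m = 1.0, 2.0
u = np.array([0.3, -0.4, 0.5])               # Newtonian velocity, |u| < c
gamma = 1.0 / np.sqrt(1.0 - u @ u / c**2)

p = m * gamma * np.concatenate(([c], u))     # four-momentum p = m u^mu
eta = np.diag([1.0, -1.0, -1.0, -1.0])
print(p @ eta @ p, (m * c)**2)               # both equal m^2 c^2 = 4.0
```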
2.2.3 Classical Field Theory

In the first chapter we studied Lagrangians and Hamiltonians of systems with a finite
(or at least discrete) number of degrees of freedom, which we labelled by q_i(t). But in
modern physics, starting with Maxwell (did we mention yet that he was at King's -
probably), one thinks that space is filled with fields that then move in time. A field is
a function Φ(x, y, z, t) that takes values in some space (usually a real or complex vector
space). It may also carry a Lorentz index. The field is all around us and is allowed to
fluctuate according to some dynamical rule. The prime example is the electromagnetic
field A_μ that we will discuss in detail next. One can think of a field as a continuous
collection of degrees of freedom q_i(t) - one at each spacetime point. Then, roughly
speaking,

    Σ_i  →  ∫ d^3 x .    (2.57)
The action principle based on a Lagrangian is now lifted to one based on a Lagrangian
density:

    S = ∫ d^4 x  ℒ(Φ_I, ∂_μ Φ_I)    (2.58)

which depends on the fields Φ_I and their first derivatives along any of the spacetime
dimensions. Here I is an index, like i was, that allows us to consider theories with more
than one field. In a relativistic theory we require that ℒ is Lorentz invariant. If so, the
equations of motion that come from extremizing the action will be Lorentz covariant.
Problem 2.2.2. Show that the principle of least action leads to the Euler-Lagrange
equations

    ∂_μ ( ∂ℒ/∂(∂_μ Φ_I) ) − ∂ℒ/∂Φ_I = 0 .    (2.59)

To do this one must assume that the fields all vanish sufficiently quickly at spatial
infinity.
We can again consider infinitesimal symmetries of the form

    Φ_I → Φ_I + δΦ_I ,    ∂_μ Φ_I → ∂_μ Φ_I + ∂_μ δΦ_I    (2.60)

where δΦ_I is allowed to depend on the fields. A Lagrangian density is invariant if

    ℒ(Φ_I + δΦ_I, ∂_μ Φ_I + ∂_μ δΦ_I) = ℒ(Φ_I, ∂_μ Φ_I) + ∂_μ K^μ    (2.61)

where K^μ is some expression involving the fields. In this case the conserved Noether
charge becomes a conserved current J^μ defined by

    J^μ ≡ δΦ_I  ∂ℒ/∂(∂_μ Φ_I) − K^μ .    (2.62)
Problem 2.2.3. Show that, if Φ_I → Φ_I + δΦ_I is a symmetry and the equations of
motion are satisfied, then J^μ is conserved in the sense that

    ∂_μ J^μ = 0 .    (2.63)
Given a conserved current we can construct a conserved charge by taking

    Q = ∫ d^3 x  J^0 .    (2.64)

It then follows that

    ∂_0 Q = ∫ d^3 x ∂_0 J^0 = − ∫ d^3 x ∇·J = − ∫ d^2 x  J·dS = 0    (2.65)

where a bold face indicates the spatial components of a vector and dS is the area
element of the 2-sphere at spatial infinity. To obtain the final line we assume that the
fields all vanish at infinity.
One can think of the Lagrangian as

    L = ∫ d^3 x  ℒ    (2.66)

and similarly one can consider a Hamiltonian density

    ℋ = π_I ∂_0 Φ_I − ℒ    (2.67)

where

    π_I = ∂ℒ/∂(∂_0 Φ_I)    (2.68)

so that the Hamiltonian is

    H = ∫ d^3 x  ℋ .    (2.69)
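
As a simple illustration (a standard example, not worked in the notes at this point),
take a single real scalar field with ℒ = (1/2) ∂_μ φ ∂^μ φ − V(φ) =
(1/2)(∂_0 φ)^2 − (1/2)(∇φ)^2 − V(φ). Then

    π = ∂ℒ/∂(∂_0 φ) = ∂_0 φ ,    ℋ = π ∂_0 φ − ℒ = (1/2) π^2 + (1/2)(∇φ)^2 + V(φ) ,

which is manifestly the sum of kinetic, gradient and potential energy densities.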
Problem 2.2.4. Consider the action for a massless, real scalar field φ with a quartic
potential in Minkowski space-time:

    S = ∫ d^4 x ℒ = ∫ d^4 x ( (1/2) ∂_μ φ ∂^μ φ − λ φ^4 )

where λ is a constant. Under a conformal transformation the field transforms as
φ → φ + ε( x^μ ∂_μ φ + φ ), where ε is the infinitesimal parameter for the
transformation.

(d.) Show that the variation of the Lagrangian under the conformal transformation
is given by (up to order ε^2):

    ℒ → ℒ + ε ∂_μ ( x^μ ℒ ) .

(e.) Hence show that there is an associated conserved quantity

    j^μ ≡ ∂^μ φ ( x^ν ∂_ν φ + φ ) − x^μ ℒ .

(f.) Find the equation of motion for φ and use this to show explicitly that ∂_μ j^μ = 0.
2.2.4 Maxwell's Equations.

The first clue that there was a democracy between time and space came with the
discovery of Maxwell's equations. James Clerk Maxwell's work that led to his equations
began in his 1861 paper 'On Physical Lines of Force', which was written while he was at
King's College London (1860-1865). The equations include an invariant speed of
propagation for electromagnetic waves, c, the speed of light, which is one of the two
assumptions in Einstein's special theory of relativity. Consequently they have an elegant
formulation when written in terms of Lorentz tensors.

Half of Maxwell's equations can be solved by introducing an electrostatic potential φ
and a magnetic vector potential A, both of which depend on space and time. One then
writes the electric and magnetic fields as:

    E = −∇φ − (1/c) ∂A/∂t ,    B = ∇×A .    (2.70)
Note that φ and A are not uniquely determined by E and B. Given any pair φ and A
we can also take

    φ → φ − (1/c) ∂λ/∂t ,    A → A + ∇λ    (2.71)

and one finds the same E and B. Here λ is any function of space and time. Such a
symmetry is called a gauge symmetry. We can put these together to form a 4-vector:

    A^μ = (φ, A) .    (2.72)

In this case the gauge symmetry is

    A_μ → A_μ − ∂_μ λ .    (2.73)

The fact that one may arbitrarily shift the potential A_μ in this way without changing
E and B is an example of a gauge symmetry. These symmetries are a pivotal part of the
standard model of particle physics, and this U(1) gauge symmetry of electromagnetism
is the prototypical example of a gauge symmetry.
We want to derive Maxwell's theory of electromagnetism from a relativistically invariant
action S given by

    S = ∫ d^4 x ℒ    (2.74)

where ℒ is called a Lagrangian density. We have two requirements on ℒ. Firstly it needs
to be a Lorentz scalar. This means that all μ, ν indices must be appropriately contracted.
Secondly it should be invariant under (2.73).

To start we note that

    F_{μν} ≡ ∂_μ A_ν − ∂_ν A_μ    (2.75)

is invariant under (2.73).
Problem 2.2.5. Show that the transformation

    A_μ → A_μ − ∂_μ λ    (2.76)

where λ is an arbitrary function of x^μ leaves F_{μν} invariant.
Thus we can construct our action using Lorentz-invariant combinations of F_{μν} and
η^{μν}. Let us expand in powers of F_{μν}:

    ℒ = a η^{μν} F_{μν} − (1/4) F^{μν} F_{μν} + ...    (2.77)

The first term is zero, since η^{μν} is symmetric but F_{μν} is anti-symmetric. So we take

    ℒ = −(1/4) F^{μν} F_{μν} .    (2.78)
We would like to use the action above to find the equations of motion, but we are
immediately at a loss if we attempt to write Lagrange's equations. The problem is that
we have put space and time on an equal footing in relativity, and in the above action,
while in Lagrangian mechanics the temporal derivative plays a special role and is
distinguished from the spatial derivatives: Lagrange's equations are not covariant. We
will return to this problem and address how to upgrade Lagrange's equations to
space-time. Here we will vary the fields A_μ in the action directly and read off the
equation of motion. To simplify the expressions we begin by writing the variation of
the Lagrangian:

    δ_A ℒ = −(1/4) δ_A(F^{μν}) F_{μν} − (1/4) F^{μν} δ_A(F_{μν})    (2.79)
          = −(1/2) δ_A(F_{μν}) F^{μν} .    (2.80)

Now under a variation of A_μ the field strength F_{μν} transforms as

    F_{μν} → ∂_μ( A_ν + δA_ν ) − ∂_ν( A_μ + δA_μ ) = F_{μν} + δ_A(F_{μν})    (2.81)

so we read off

    δ_A(F_{μν}) = ∂_μ(δA_ν) − ∂_ν(δA_μ) .    (2.82)
So from the variation of the Lagrangian we have:

    δ_A ℒ = −(1/4) δ_A(F_{μν}) F^{μν} − (1/4) F^{μν} δ_A(F_{μν})    (2.83)
          = −(1/2) ( ∂_μ(δA_ν) − ∂_ν(δA_μ) ) F^{μν}    (2.84)
          = −∂_μ(δA_ν) F^{μν}    (2.85)

where we have used the antisymmetry F^{μν} = −F^{νμ} and a relabelling of the dummy
indices in the second term of the second line to arrive at the final expression. To take
the derivative off of δA_ν we use the same technique as when one integrates by parts
(although here there is no integral, but when we put the Lagrangian variation back into
the action there will be): namely we rewrite the expression using the observation that

    ∂_μ( δA_ν F^{μν} ) = ∂_μ(δA_ν) F^{μν} + δA_ν ∂_μ(F^{μν})    (2.86)

to give

    δ_A ℒ = −∂_μ( δA_ν F^{μν} ) + δA_ν ∂_μ(F^{μν}) .    (2.87)
Returning to the action we have

    δ_A S = ∫ d^4 x ( −∂_μ( δA_ν F^{μν} ) + δA_ν ∂_μ(F^{μν}) ) .    (2.88)
The first term we can integrate directly - it is called a boundary term as it is a total
derivative - but it vanishes, as the variation δA_ν vanishes at the fixed points of the
path (in field space) we are varying, leaving us with

    0 = δ_A S = ∫ d^4 x  δA_ν ∂_μ(F^{μν}) .    (2.89)

Hence the field equation is

    ∂_μ F^{μν} = 0 .    (2.90)
We could consider adding in a source term. Suppose that we have some background
electromagnetic current j^μ. Then we could add to the Lagrangian the term

    ℒ_source = −j^μ A_μ .    (2.91)

Note that this is not gauge invariant in general, but one has, under (2.73),

    ℒ_source → ℒ_source + j^μ ∂_μ λ = ℒ_source + ∂_μ( λ j^μ ) − λ ∂_μ j^μ .    (2.92)

The middle term is a total derivative and can be dropped. Therefore the source term
leads to a gauge invariant action if j^μ is a conserved current:

    ∂_μ j^μ = 0 .    (2.93)
Taking the variation of the source term in the action with respect to A_ν is easy and
simply changes the equation of motion to

    ∂_μ F^{μν} = j^ν .    (2.94)

Note that the conservation equation also follows from the equation of motion, since
∂_ν j^ν = ∂_ν ∂_μ F^{μν} = 0, where again we've used the fact that the partial derivatives
are symmetric but F^{μν} is anti-symmetric.
This is a space-time equation. If we split it up into spatial and temporal components
we can reconstruct Maxwell's equations in their familiar form. To do this we introduce
the electric field E and magnetic field B in terms of components of the field strength:

    F_{0i} = E_i    and    F_{ij} = −ε_{ijk} B_k    (2.95)

where E_i and B_i are the components of E and B respectively, i, j, k ∈ {1, 2, 3} and
ε_{ijk} is the Levi-Civita symbol normalised such that ε_{123} = 1. We will meet the
Levi-Civita symbol again when we study tensor representations in group theory; at this
point it is sufficient to know that it has six non-vanishing components, which take the
values:

    ε_{123} = 1 ,   ε_{231} = 1 ,   ε_{312} = 1    (2.96)
    ε_{213} = −1 ,  ε_{132} = −1 ,  ε_{321} = −1

Note that swapping any pair of neighbouring indices changes the sign of the Levi-Civita
symbol - the Levi-Civita symbol is totally antisymmetric.
We will split the equation of motion (2.94) into its temporal part ν = 0 and its spatial
part ν = i, where i ∈ {1, 2, 3}. Taking ν = 0 we have

    ∂_0 F^{00} + ∂_i F^{i0} = ∂_i E_i = j^0    (2.97)

that is

    ∇·E = j^0 .    (2.98)

From the spatial equations (ν = i) we have

    ∂_0 F^{0i} + ∂_j F^{ji} = −(1/c) ∂_t E_i + ε_{ijk} ∂_j B_k = j^i    (2.99)

i.e.

    ∇×B = (1/c) ∂E/∂t + j .    (2.100)
That is all we obtain from the equation of motion, so we seem to be two equations short!
However there is an identity that is valid for the field strength simply due to its
definition. Formally F_{μν} is an exact form, as it is the exterior derivative of the
one-form A_μ.⁵ Exact forms vanish when their exterior derivative, which is the
antisymmetrised partial derivative, is taken.
Problem 2.2.6. Show that

    3 ∂_{[μ} F_{νρ]} ≡ ∂_μ F_{νρ} + ∂_ν F_{ρμ} + ∂_ρ F_{μν} = 0 .    (2.101)
The identity ∂_{[μ} F_{νρ]} = 0 is called the Bianchi identity for the field strength and
is a consequence of its antisymmetric construction. However it is non-trivial, and it is
from the Bianchi identity for F_{μν} that the remaining two Maxwell equations emerge.
Let us consider all the non-trivial spatial and temporal components of
∂_{[μ} F_{νρ]} = 0. We note that we cannot have more than one temporal index before
the identity trivialises, e.g. let μ = ν = 0 and ρ = i; then we have

    ∂_0 F_{0i} + ∂_0 F_{i0} + ∂_i F_{00} = ∂_0 F_{0i} − ∂_0 F_{0i} = 0    (2.102)

from which we learn nothing. When we take μ = 0, ν = i and ρ = j we have

    ∂_0 F_{ij} + ∂_i F_{j0} + ∂_j F_{0i} = 0 .    (2.103)
We may use the Minkowski metric to find the components F^{μν} of the field strength
with upper indices in terms of E and B:

    F^{ij} = η^{iμ} η^{jν} F_{μν} = η^{ik} η^{jl} F_{kl} = F_{ij} = −ε_{ijk} B_k    (2.104)
    F^{0i} = η^{0μ} η^{iν} F_{μν} = η^{00} η^{ik} F_{0k} = −F_{0i} = −E_i .    (2.105)
Substituting F_{ij} = −ε_{ijk} B_k, F_{j0} = −E_j and F_{0i} = E_i into equation (2.103)
and multiplying by −1 gives

    ∂_0( ε_{ijk} B_k ) + ∂_i E_j − ∂_j E_i = 0 .    (2.106)
To reformulate this in a more familiar way we can make use of an identity on the
Levi-Civita symbol:

    ε_{ijm} ε_{ijk} = 2 δ_{mk} .    (2.107)

⁵ Differential forms are a subset of the tensors whose indices are antisymmetric. They are introduced
and studied in depth in the Manifolds course.
Problem 2.2.7. Prove that ε_{ijm} ε_{ijk} = 2 δ_{mk}.
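
A brute-force check of this identity (a Python sketch, not part of the notes; it assumes
numpy is available) simply builds the Levi-Civita symbol and contracts the repeated
indices:

    import numpy as np
    from itertools import permutations

    # Build eps_{ijk}: the sign of a permutation is the parity of its inversions.
    eps = np.zeros((3, 3, 3))
    for p in permutations(range(3)):
        inversions = sum(p[a] > p[b] for a in range(3) for b in range(a + 1, 3))
        eps[p] = (-1) ** inversions

    # Contract over i and j: eps_{ijm} eps_{ijk}
    print(np.einsum('ijm,ijk->mk', eps, eps))   # prints 2 * identity matrix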
Contracting ε_{ijm} with equation (2.106) gives

    ε_{ijm} ∂_0( ε_{ijk} B_k ) + ε_{ijm} ∂_i E_j − ε_{ijm} ∂_j E_i
        = 2 ∂_0( B_m ) + 2 ε_{ijm} ∂_i E_j = 0    (2.108)

which we recognise as

    ∇×E = −(1/c) ∂B/∂t .    (2.109)
The final Maxwell equation comes from setting μ = i, ν = j and ρ = k in equation
(2.101):

    ∂_i F_{jk} + ∂_j F_{ki} + ∂_k F_{ij}
        = −∂_i( ε_{jkl} B_l ) − ∂_j( ε_{kil} B_l ) − ∂_k( ε_{ijl} B_l ) = 0 .    (2.110)
Contracting this with ε_{ijk} gives

    ε_{ijk} ( ∂_i( ε_{jkl} B_l ) + ∂_j( ε_{kil} B_l ) + ∂_k( ε_{ijl} B_l ) )
        = ∂_i( 2 δ_{li} B_l ) + ∂_j( 2 δ_{lj} B_l ) + ∂_k( 2 δ_{lk} B_l )    (2.111)
        = 6 ∂_i B_i = 0 .

That is,

    ∇·B = 0 .    (2.112)
Indeed the whole point of introducing A^μ = (φ, A) was to ensure that (2.109) and
(2.112) were automatically solved. So that's it: we have recovered Maxwell's theory of
electromagnetism from simple symmetry reasoning and Lorentz invariance.
2.2.5 Electromagnetic Duality

The action for electromagnetism can be rewritten in terms of E and B, where it has a
very simple form. Now

    F^{μν} F_{μν} = F^{0ν} F_{0ν} + F^{iν} F_{iν}    (2.113)
                 = F^{00} F_{00} + F^{0i} F_{0i} + F^{i0} F_{i0} + F^{ij} F_{ij}    (2.114)
                 = −2 E_i E_i + ε_{ijk} B_k ε_{ijl} B_l    (2.115)
                 = −2 E_i E_i + 2 B_i B_i    (2.116)
                 = −2 E^2 + 2 B^2 .    (2.117)

Hence,

    ℒ = (1/2) ( E^2 − B^2 ) .    (2.118)
Some symmetry is apparent in the form of the Lagrangian and the equations of motion.
We notice (after some reflection) that if we interchange E → B and B → −E then, while
the Lagrangian changes sign, the equations of motion are unaltered. This is
electromagnetic duality: an ability to swap electric fields for magnetic fields while
preserving Maxwell's equations.⁶

⁶ The eagle-eyed reader will notice that the electromagnetic duality transformation exchanges
equations of motion for Bianchi identities.
As with the harmonic oscillator, electromagnetic duality is much more apparent in
the associated Hamiltonian, which takes the form

    H = (1/2) ( E^2 + B^2 )    (2.119)

which is itself invariant under (E, B) → (B, −E).
Chapter 3
Quantum Mechanics
Historically quantum mechanics was constructed rather than logically developed. The
mathematical procedure of quantisation was later rigorously developed by mathematicians
and physicists, for example by Weyl; Kohn and Nirenberg; Becchi, Rouet, Stora
and Tyutin (BRST quantisation for quantising a field theory); and Batalin and Vilkovisky
(the BV field-antifield formalism), as well as many others, and research into quantisation
methods continues to this day. The original development of quantum mechanics due to
Heisenberg is called canonical quantisation, and it is the approach we will follow here.

Atomic spectra are particular to specific elements; they are the fingerprints of atomic
forensics. An atomic spectrum is produced by bathing atoms in a continuous spectrum
of electromagnetic radiation. The electrons in the atom make only discrete jumps as
the electromagnetic energy is absorbed. This can be seen in the atomic spectra by the
absence of specific frequencies in the outgoing radiation and by recalling that E = hν,
where E is energy, h is Planck's constant and ν is the frequency.

In 1925 Heisenberg was working with Born in Göttingen. He was contemplating the
atomic spectrum of hydrogen but not making much headway, and he developed the most
famous bout of hayfever in theoretical physics. Complaining to Born, he was granted
a two-week holiday and escaped the pollen-filled inland air for the island of Helgoland.
There he continued to work in a systematic fashion. He arranged all the known
frequencies for the spectral lines of hydrogen into an array, or matrix, of frequencies ν_{ij}.
He was also able to write out matrices of numbers corresponding to the transition rates
between energy levels. Armed with this organisation of the data, but with no knowledge
of matrices, Heisenberg developed a correspondence between the harmonic oscillator
and the idea of an electron orbiting in an extremely eccentric orbit. Having arrived
at a consistent theory of observable quantities, Heisenberg climbed a rock overlooking
the sea and watched the sun rise in a moment of triumph. Heisenberg's triumph was
short-lived, as he quickly realised that his theory was based around non-commuting
variables. One can imagine his shock at realising that everything worked so long as the
multiplication was non-abelian; nevertheless Heisenberg persisted with his ideas. It was
soon pointed out to him by Born that the theory would be consistent if the variables
were matrices, to which Heisenberg replied that 'I do not even know what a matrix
is'. The oddity that matrices were seen as an unusual mathematical formalism and not
a natural setting for physics played an important part in the development of quantum
mechanics. As we will see, a wave equation describing the quantum theory was developed
by Schrödinger in apparent competition to Heisenberg's formulation. This was, in part,
a reaction to the appearance of matrices in the fundamental theory, as well as a rejection
of the discontinuities inherent in Heisenberg's quantum mechanics. Physicists much
more readily adopted Schrödinger's wave equation, which was written in the language
of differential operators, with which physicists were much more familiar. In this chapter
we will consider both the Heisenberg and Schrödinger pictures and we will see the
equivalence of the two approaches.
3.1 Canonical Quantisation

We commence by recalling the structures used in classical mechanics. Consider a
classical system described by n generalised coordinates q_i of mass m_i, subject to a
potential V(q_i) and described by the Lagrangian

    L = Σ_{i=1}^{n} (1/2) m_i q̇_i^2 − Σ_{i=1}^{n} V(q_i)    (3.1)

where V(q) = V(q_1, q_2, ..., q_n). The equations of motion are:

    m_i q̈_i + ∂V/∂q_i = 0   ⇔   F_i = m_i q̈_i .    (3.2)

The Hamiltonian is

    H = Σ_{i=1}^{n} p_i q̇_i − L = Σ_{i=1}^{n} p_i^2/(2 m_i) + V(q)    (3.3)

and Hamilton's equations make explicit that there exists a natural antisymmetric
(symplectic) structure on the phase space, the Poisson brackets:

    {q_i, p_j} = δ_{ij}    (3.4)

with all other brackets being trivial.
Canonical quantisation is the promotion of the positions q_i and momenta p_i to
operators (which we denote with a hat):

    (q_i, p_i) → (q̂_i, p̂_i)    (3.5)

together with the promotion of the Poisson bracket to the commutator by

    {A, B} → (1/(iħ)) [Â, B̂]    (3.6)

where A and B indicate arbitrary functions on phase space, while Â and B̂ are operators.
For example we have

    [q̂_i, p̂_j] = iħ δ_{ij}    (3.7)

where ħ ≡ h/2π and h is Planck's constant. In particular the classical Hamiltonian
becomes, under this promotion,

    H → Ĥ = Σ_{i=1}^{n} p̂_i^2/(2 m_i) + Σ_i V(q̂_i) .    (3.8)
While the classical q_i and p_i collect to form vectors in phase space, the quantum
operators q̂_i and p̂_i act on a Hilbert space. In quantum mechanics physical observables
are represented by operators which act on the Hilbert space of quantum states. The
states include eigenstates for the operators, and the corresponding eigenvalue represents
the value of a measurement. For example we might denote a position eigenstate with
eigenvalue q for the position operator q̂ by |q⟩, so that:

    q̂|q⟩ = q|q⟩ .    (3.9)

We will meet the bra-ket notation more formally later on, but it is customary to label
an eigenstate by its eigenvalue, hence the eigenstate is denoted |q⟩ here. More general
states are formed from superpositions of eigenstates, e.g.

    |ψ⟩ = ∫ dx ψ(x)|x⟩    or    |ψ⟩ = Σ_i ψ_i |q_i⟩    (3.10)

where we have taken |x⟩ as a continuous basis for the Hilbert space, while |q_i⟩ is a
discrete basis.

If we work using the eigenfunctions of the position operator as a basis for the Hilbert
space it is customary to refer to states in 'position space'. By expressing states as a
superposition of position eigenfunctions we determine an expression for the momentum
operator in position space. For simplicity, consider a single-particle state described
by a single coordinate, given by ψ = c(q)|q⟩, where |q⟩ is the eigenstate of the position
operator q̂ with q̂ψ = qψ. The commutator relation [q̂, p̂] = iħ fixes the momentum
operator to be

    p̂ = −iħ ∂/∂q    (3.11)

as

    [q̂, p̂]ψ = ( q̂ p̂ − p̂ q̂ ) c|q⟩    (3.12)
            = q̂ p̂ c|q⟩ − p̂ (q c)|q⟩
            = −iħ q (∂c/∂q)|q⟩ + iħ (∂(qc)/∂q)|q⟩
            = iħ ψ .

For many-particle systems we may take the position eigenstates as a basis for the Hilbert
space, and the state and momentum operators generalise to

    ψ = Σ_i c_i(q)|q_i⟩    and    p̂_i = −iħ ∂/∂q_i .    (3.13)

Note that the Hamiltonian operator in position space becomes

    Ĥ = −Σ_i (ħ^2/(2 m_i)) ∂^2/∂q_i^2 + Σ_i V(q̂_i) .    (3.14)
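
One can also check the fixing of p̂ symbolically (a sympy sketch, not part of the notes):
acting with q̂ p̂ − p̂ q̂ on an arbitrary function reproduces iħ times that function.

    import sympy as sp

    q, hbar = sp.symbols('q hbar', real=True)
    f = sp.Function('f')(q)

    # momentum operator in position space: p = -i*hbar*d/dq
    p = lambda psi: -sp.I * hbar * sp.diff(psi, q)

    commutator = q * p(f) - p(q * f)      # (q p - p q) acting on f
    print(sp.simplify(commutator))        # -> I*hbar*f(q)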
3.1.1 The Hilbert Space and Observables.

Definition A Hilbert space H is a complex vector space equipped with an inner product
⟨ , ⟩ satisfying:

(i.) ⟨φ, ψ⟩ = ⟨ψ, φ⟩*

(ii.) ⟨φ, a_1 ψ_1 + a_2 ψ_2⟩ = a_1 ⟨φ, ψ_1⟩ + a_2 ⟨φ, ψ_2⟩

(iii.) ⟨ψ, ψ⟩ ≥ 0 ∀ ψ ∈ H, where equality holds only if ψ = 0

where ⟨ψ, φ⟩* indicates the complex conjugate of ⟨ψ, φ⟩.
Note that as the inner product is linear in its second entry, it is conjugate linear in its
first entry, as

    ⟨a_1 ψ_1 + a_2 ψ_2, φ⟩ = ⟨φ, a_1 ψ_1 + a_2 ψ_2⟩*    (3.15)
                           = a_1* ⟨φ, ψ_1⟩* + a_2* ⟨φ, ψ_2⟩*
                           = a_1* ⟨ψ_1, φ⟩ + a_2* ⟨ψ_2, φ⟩

where we have used a_1* to indicate the complex conjugate of a_1. The physical states in
a system are described by normalised vectors in the Hilbert space, i.e. those ψ ∈ H such
that ⟨ψ, ψ⟩ = 1.
Observables are represented by Hermitian operators on H. Hermitian operators are
self-adjoint.

Definition An operator Â† is the adjoint operator of Â if

    ⟨Â† φ, ψ⟩ = ⟨φ, Â ψ⟩ .    (3.16)

From the definition it is rapidly observed that

    (Â†)† = Â
    (Â + B̂)† = Â† + B̂†
    (K Â)† = K* Â†
    (Â B̂)† = B̂† Â†
    If Â^{-1} exists then (Â^{-1})† = (Â†)^{-1} .

A self-adjoint operator satisfies Â† = Â. The prototype for the adjoint is the Hermitian
conjugate of a matrix, M† ≡ (M^T)*.
Example 1: C^n as a Hilbert Space

In a sense a Hilbert space is a generalization to infinite dimensions of simple C^n (if we
ignore lots of subtle mathematical details). The natural inner product is

    ⟨x, y⟩ ≡ Σ_i x_i* y_i = x† y .    (3.17)

Let Â denote a matrix; we can check that its adjoint is the Hermitian conjugate matrix
Â† = (Â^T)*:

    ⟨x, Â y⟩ = x† Â y = ( Â† x )† y = ⟨Â† x, y⟩ .    (3.18)
Example 2: L^2 as a Hilbert Space

Let H = L^2(R), i.e. H = {ψ : ⟨ψ, ψ⟩ < ∞}, where the inner product is

    ⟨φ, ψ⟩ ≡ ∫_R dq φ*(q) ψ(q) .    (3.19)

Using this inner product the momentum operator is a self-adjoint operator, as

    ⟨φ, p̂ ψ⟩ = ∫_R dq φ*(q) ( −iħ ∂ψ/∂q )    (3.20)
             = ∫_R dq iħ ( ∂φ*/∂q ) ψ(q)
             = ∫_R dq ( −iħ ∂φ/∂q )* ψ(q)
             = ⟨p̂ φ, ψ⟩ .

N.B. we have assumed that ψ → 0 and φ → 0 at q = ±∞, such that the boundary term
from the integration by parts vanishes.
3.1.2 Eigenvectors and Eigenvalues

In this section we will prove some simple properties of eigenvalues of self-adjoint
operators. Let u ∈ H be an eigenvector for the operator Â with eigenvalue λ ∈ C, such
that

    Â u = λ u .    (3.21)

The eigenvalues of a self-adjoint operator are real:

    ⟨u, Â u⟩ = ⟨u, λ u⟩ = λ ⟨u, u⟩    (3.22)
             = ⟨Â u, u⟩ = ⟨λ u, u⟩ = λ* ⟨u, u⟩

hence λ = λ* and λ ∈ R.

Eigenvectors which have different eigenvalues for a self-adjoint operator are orthogonal.
Let

    Â u = λ u    and    Â u' = λ' u'    (3.23)

where Â is a self-adjoint operator, and so λ, λ' ∈ R. Then we have

    ⟨u, Â u'⟩ = ⟨u, λ' u'⟩ = λ' ⟨u, u'⟩    (3.24)
              = ⟨Â u, u'⟩ = ⟨λ u, u'⟩ = λ ⟨u, u'⟩ .    (3.25)

Therefore,

    (λ − λ') ⟨u, u'⟩ = 0   ⇒   ⟨u, u'⟩ = 0 if λ ≠ λ' .    (3.26)

Theorem 3.1.1. For every self-adjoint operator there exists a complete set of
eigenvectors (i.e. a basis of the Hilbert space H).

The basis may be countable¹ or continuous.

¹ Countable means it can be put in one-to-one correspondence with the natural numbers.
3.1.3 A Countable Basis.

Let u_n denote the eigenvectors of a self-adjoint operator Â, i.e.

    Â u_n = λ_n u_n .    (3.27)

By the theorem above the u_n form a basis of H; let us suppose that it is a countable
basis. Let u_n be an orthonormal set, such that

    ⟨u_n, u_m⟩ = δ_{nm} .    (3.28)

Any state ψ may be written as a linear superposition of eigenvectors

    ψ = Σ_n ψ_n u_n    (3.29)

so that

    ⟨u_m, ψ⟩ = ⟨u_m, Σ_n ψ_n u_n⟩ = ψ_m .    (3.30)

Let us now adopt the useful bra-ket notation of Dirac, where the inner product is
denoted by

    ⟨u_n, ψ⟩ ≡ ⟨u_n|ψ⟩    (3.31)

so that, for example in C^n, vectors are denoted by kets, e.g.

    u_n → |u_n⟩    and    ψ → |ψ⟩    (3.32)

while adjoint vectors become bras:

    u_n† → ⟨u_n|    and    ψ† → ⟨ψ| .    (3.33)

One advantage of this notation is that, being based around the Hilbert space inner
product, it is universal for all explicit realisations of the Hilbert space. However its
main advantage is how simple it is to use.
Using equation (3.30) we can rewrite equation (3.29) in the bra-ket notation as

    |ψ⟩ = Σ_n ⟨u_n|ψ⟩ |u_n⟩ = Σ_n |u_n⟩⟨u_n|ψ⟩   ⇒   Σ_n |u_n⟩⟨u_n| = I_H    (3.34)

where I_H is known as the completeness operator. It is worth comparing with R^n, where
the identity matrix can be written Σ_n e_n e_n^T = I, where the e_n are the usual
orthonormal basis vectors for R^n, with zeroes in all components except the nth, which
is one.
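
The finite-dimensional statement is easy to verify numerically (a Python sketch, not
part of the notes): for any orthonormal basis of C^4, here obtained from a QR
decomposition of a random complex matrix, the sum of the projectors |u_n⟩⟨u_n| is the
identity.

    import numpy as np

    rng = np.random.default_rng(0)
    U, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))

    # sum_n |u_n><u_n| over the columns u_n of the unitary matrix U
    completeness = sum(np.outer(U[:, n], U[:, n].conj()) for n in range(4))
    print(np.allclose(completeness, np.eye(4)))   # True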
Using the properties of the Hilbert space inner product we observe that

    ψ_n* = ⟨u_n|ψ⟩* = ⟨ψ|u_n⟩    (3.35)

and further note that this is consistent with the insertion of the completeness operator
between two states:

    ⟨ψ|ψ⟩ = Σ_n ⟨ψ|u_n⟩⟨u_n|ψ⟩ = Σ_n ψ_n* ψ_n .    (3.36)
3.1. CANONICAL QUANTISATION 47
We may insert a general operator

B between two states:
< ,

B >= [

B[ =

n,m
[u
n
u
n
[

B[u
m
u
m
[ =

n,m

n
B
m
n

m
(3.37)
where B
m
n
are the matrix components of the operator

B written in the u
n
basis. For
example as u
n
are eigenvectors of

A with eigenvalues
n
then the matrix components
A
m
n
are

A =
_
_
_
_
_
_

1
0 . . . 0
0
2
. . . 0
.
.
.
.
.
.
.
.
.
0
0 0 . . .
n
_
_
_
_
_
_
i.e. A
m
n
=
n

nm
. (3.38)
Theorem 3.1.2. Given any two commuting self-adjoint operators Â and B̂, one can
find a basis u_n such that Â and B̂ are simultaneously diagonalisable.

Proof. As Â is self-adjoint, one can find a basis u_n such that

    Â u_n = λ_n u_n .    (3.39)

Now

    Â B̂ u_n = B̂ Â u_n = λ_n B̂ u_n    (3.40)

as [Â, B̂] = 0, and hence B̂ u_n lies in the eigenspace of Â with eigenvalue λ_n. For a
non-degenerate spectrum this eigenspace is spanned by u_n itself, hence

    B̂ u_n = μ_n u_n .    (3.41)
Example: Position operators in R^3.

Let (x̂, ŷ, ẑ) be the position operators of a particle moving in R^3; then

    [x̂, ŷ] = 0 ,   [x̂, ẑ] = 0   and   [ŷ, ẑ] = 0    (3.42)

using the canonical quantum commutation rules, and hence they are simultaneously
diagonalisable. One can say the same for p̂_x, p̂_y and p̂_z.
The Probabilistic Interpretation in a Countable Basis.

The Born rule gives the probability that a measurement of a quantum system will yield
a particular result. It was first put forward by Max Born in 1926, and it was principally
for this work that in 1954 he was awarded the Nobel prize. It states that if an observable
is associated with a self-adjoint operator Â, then the measured result will be one of the
eigenvalues λ_n of Â. Further it states that the probability that the measurement of |ψ⟩
will be λ_n is given by

    P(ψ, u_n) = ⟨ψ|P̂_n|ψ⟩ / ⟨ψ|ψ⟩    (3.43)

where P̂_n is the projection onto the eigenspace spanned by the normalised eigenvector
u_n of Â, i.e. P̂_n = |u_n⟩⟨u_n|, giving

    P(ψ, u_n) = ⟨ψ|u_n⟩⟨u_n|ψ⟩ / ⟨ψ|ψ⟩ = |⟨ψ|u_n⟩|^2 / ⟨ψ|ψ⟩ .    (3.44)
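
As a small worked illustration (not in the notes): for a two-state system with
|ψ⟩ = |u_1⟩ + 2|u_2⟩ and ⟨u_n|u_m⟩ = δ_{nm}, we have ⟨ψ|ψ⟩ = 5 and

    P(ψ, u_1) = |⟨ψ|u_1⟩|^2 / ⟨ψ|ψ⟩ = 1/5 ,    P(ψ, u_2) = 4/5 ,

so the probabilities correctly sum to one.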
Note that if the state ψ was an eigenstate of Â (i.e. ψ = ψ_n u_n), then P(ψ, u_n) = 1.
Following a measurement of a state the wavefunction collapses to the eigenstate that
was measured. Given the probability of measuring a system in a particular eigenstate,
one can evaluate the expected value when measuring an observable. The expected
value is a weighted average of the measurements (eigenvalues), where the weighting is
in proportion to the probability of observing each eigenvalue. That is, we may measure
the observable associated with the operator Â of a state ψ and find that λ_n occurs with
probability P(ψ, u_n); then the expected value for measuring Â is

    Σ_n λ_n P(ψ, u_n) .    (3.45)

Now given that Â|u_n⟩ = λ_n|u_n⟩, we have that the expectation value of a measurement
of the observable associated to Â is

    Σ_n λ_n |⟨ψ|u_n⟩|^2 / ⟨ψ|ψ⟩ = Σ_{n,m} ⟨ψ|u_n⟩⟨u_n|Â|u_m⟩⟨u_m|ψ⟩ / ⟨ψ|ψ⟩
                               = ⟨ψ|Â|ψ⟩ / ⟨ψ|ψ⟩    (3.46)

where we have used ⟨u_n|u_m⟩ = δ_{nm}. If ψ is a normalised state then ⟨Â⟩ = ⟨ψ|Â|ψ⟩.

The next most reasonable question we should ask ourselves at this point is: what is the
probability of measuring the observable of a self-adjoint operator B̂ which does not share
the eigenvectors of Â, i.e. what does the Born rule say about measuring observables
of operators which do not commute? The answer will lead to Heisenberg's uncertainty
principle, which we relegate to a (rather long) problem.
Problem 3.1.1. The expectation (or average) value of a self-adjoint operator Â acting
on a normalised state |ψ⟩ is defined by

    A_avg ≡ ⟨Â⟩ ≡ ⟨ψ|Â|ψ⟩ .    (3.47)

The uncertainty in the measurement of Â on the state |ψ⟩ is the average value of its
deviation from the mean and is defined by

    ΔA ≡ √( ⟨(A − A_avg)^2⟩ ) = √( ⟨ψ| ( Â − A_avg Î )^2 |ψ⟩ )    (3.48)

where Î is the completeness operator.

(a.) Show that for any two self-adjoint operators Â and B̂

    |⟨ψ|Â B̂|ψ⟩|^2 ≤ ⟨ψ|Â^2|ψ⟩ ⟨ψ|B̂^2|ψ⟩ .    (3.49)

Hint: Use the Schwarz inequality: |⟨x, y⟩|^2 ≤ ⟨x, x⟩⟨y, y⟩, where x, y are vectors in a
space with inner product ⟨ , ⟩.

(b.) Show that ⟨Â B̂ + B̂ Â⟩ is real and ⟨Â B̂ − B̂ Â⟩ is imaginary when Â and B̂ are
self-adjoint operators.

(c.) Prove the triangle inequality for two complex numbers z_1 and z_2:

    |z_1 + z_2|^2 ≤ ( |z_1| + |z_2| )^2 .    (3.50)
(d.) Use the triangle inequality and the inequality from part (a.) to show that

    |⟨ψ|[Â, B̂]|ψ⟩|^2 ≤ 4 ⟨ψ|Â^2|ψ⟩ ⟨ψ|B̂^2|ψ⟩ .    (3.51)

(e.) Define the operators Â' ≡ Â − α Î and B̂' ≡ B̂ − β Î, where α, β ∈ R. Show that
Â' and B̂' are self-adjoint and that [Â', B̂'] = [Â, B̂].

(f.) Use these results to show the uncertainty relation:

    (ΔA)(ΔB) ≥ (1/2) |⟨ψ|[Â, B̂]|ψ⟩| .    (3.52)

What does this give when Â = q̂ and B̂ = p̂?
3.1.4 A Continuous Basis.

If an operator Â has eigenstates u_ξ, where the eigenvalue ξ is a continuous variable,
then an arbitrary state in the Hilbert space is

    |ψ⟩ = ∫ dξ ψ_ξ |u_ξ⟩ .    (3.53)

Then

    ⟨u_η|ψ⟩ = ∫ dξ ψ_ξ ⟨u_η|u_ξ⟩ .    (3.54)

The mathematical object that satisfies the above statement (giving ⟨u_η|ψ⟩ = ψ_η) is
the Dirac delta function:

    ⟨u_η|u_ξ⟩ = δ(ξ − η) .    (3.55)
Formally the Dirac delta function is a distribution, or measure, that is equal to zero
everywhere apart from at 0, where δ(0) = ∞. Its defining property is that its integral
over R is one. One may regard it as the limit of a sequence of Gaussian functions of
width a having a maximum at the origin, i.e.

    δ_a(x) ≡ (1/(a√π)) exp( −x^2/a^2 )    (3.56)
so that as a → 0 the limit of the Gaussians is the Dirac delta function, as

    ∫ δ_a(x) dx = ∫ (1/(a√π)) exp( −x^2/a^2 ) dx = (1/(a√π)) · a√π = 1    (3.57)
which is unchanged when we take the limit a → 0, and so in the limit δ_a has the
properties of the Dirac delta function. We recall that the Gaussian integral

    I ≡ ∫_{−∞}^{∞} dx exp( −x^2/a^2 )    (3.58)
gives

    I^2 = ∫∫ dx dy exp( −(x^2 + y^2)/a^2 ) = ∫_0^{2π} dθ ∫_0^{∞} r dr exp( −r^2/a^2 )    (3.59)
        = ∫_0^{2π} dθ [ −(a^2/2) exp( −r^2/a^2 ) ]_0^{∞}    (3.60)
        = ∫_0^{2π} dθ (a^2/2)    (3.61)
        = π a^2    (3.62)
hence

    I = a√π .    (3.63)
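
The delta-like behaviour of δ_a is easy to see numerically (a Python sketch using scipy,
not part of the notes): the smeared integral ∫ δ_a(x) f(x) dx approaches f(0) as a → 0.

    import numpy as np
    from scipy.integrate import quad

    f = np.cos   # a smooth test function with f(0) = 1

    for a in [0.5, 0.1, 0.02]:
        delta_a = lambda x: np.exp(-(x / a) ** 2) / (a * np.sqrt(np.pi))
        val, _ = quad(lambda x: delta_a(x) * f(x), -2, 2, points=[0])
        print(a, val)   # 0.939..., 0.997..., 0.9999... -> f(0) = 1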
As a consequence the eigenstate |u_ξ⟩ on its own is not correctly normalised to be a
vector in the Hilbert space, as

    ⟨u_ξ|u_ξ⟩ = δ(ξ − ξ) = δ(0)   ⇒   ⟨u_ξ|u_ξ⟩ = ∞    (3.64)

however, used within an integral it is a normalised eigenvector for Â in the Hilbert
space:

    ∫ dη ⟨u_η|u_ξ⟩ = 1 .    (3.65)

We can show that the continuous eigenvectors form a complete basis for the Hilbert
space, as

    |ψ⟩ = ∫∫ dξ dη ψ_ξ δ(ξ − η) |u_η⟩    (3.66)
        = ∫∫ dξ dη ψ_ξ ⟨u_η|u_ξ⟩ |u_η⟩
        = ∫ dη |u_η⟩⟨u_η| ( ∫ dξ ψ_ξ |u_ξ⟩ )
        = ∫ dη |u_η⟩⟨u_η|ψ⟩ ,

hence we find the completeness relation for a continuous basis:

    ∫ dη |u_η⟩⟨u_η| = I_H .    (3.67)
The Probabilistic Interpretation in a Continuous Basis.

The formulation of Born's rule is only slightly changed in a continuous basis. It is now
stated as: the probability of finding a system described by a state |ψ⟩ to lie in the range
of eigenstates between |u_ξ⟩ and |u_{ξ+Δξ}⟩ is

    P(ψ, u_ξ) = ∫_ξ^{ξ+Δξ} dξ' ⟨ψ|u_{ξ'}⟩⟨u_{ξ'}|ψ⟩ / ⟨ψ|ψ⟩
              = ∫_ξ^{ξ+Δξ} dξ' |ψ_{ξ'}|^2 / ⟨ψ|ψ⟩ .    (3.68)
Transformations between Different Bases

We finish this section by demonstrating how a state |ψ⟩ ∈ H may be expressed using
different bases for H by using the completeness relation. In particular we show how one
may relate a discrete basis of eigenstates to a continuous basis of eigenstates.

Let |u_n⟩ be a countable basis for H and let |v_ξ⟩ be a continuous basis; then:

    ⟨u_n|ψ⟩ = ψ_n    and    ⟨v_ξ|ψ⟩ = ψ_ξ .    (3.69)

Hence we may expand each expression using the completeness operator for the
alternative basis to find:

    ψ_ξ = ⟨v_ξ|ψ⟩    (3.70)
        = Σ_n ⟨v_ξ|u_n⟩⟨u_n|ψ⟩
        = Σ_n u_n(ξ) ψ_n

where u_n(ξ) ≡ ⟨v_ξ|u_n⟩, and similarly,

    ψ_n = ⟨u_n|ψ⟩    (3.71)
        = ∫ dξ ⟨u_n|v_ξ⟩⟨v_ξ|ψ⟩
        = ∫ dξ u_n*(ξ) ψ_ξ .
3.2 The Schrödinger Equation.

Schrödinger developed a wave equation for quantum mechanics by building upon de
Broglie's wave-particle duality. Just as the (dynamical) time-evolution of a system
represented in phase space is given by Hamilton's equations, so the time evolution of a
quantum system is described by Schrödinger's equation:

    iħ ∂ψ/∂t = Ĥ ψ .    (3.72)

A typical Hamiltonian in position space has the form

    Ĥ = −(ħ^2/2) Σ_{i=1}^{n} (1/m_i) ∂^2/∂q_i^2 + Σ_{i=1}^{n} V_i(q)    (3.73)

where V(q) = V(q_1, q_2, ..., q_n) and Ĥ is Hermitian². We will make use of the
Hamiltonian in this form in the following.
Theorem 3.2.1. The inner product on the Hilbert space is time-independent.

Proof. We will prove this for the L^2 inner product and use the form of the Hamiltonian
Ĥ given above. As

    ⟨φ|ψ⟩ = ∫_{R^k} d^k q  φ* ψ    (3.74)

we have

    ∂_t ⟨φ|ψ⟩ = ∫_{R^k} d^k q ( (∂φ*/∂t) ψ + φ* (∂ψ/∂t) )    (3.75)
             = ∫_{R^k} d^k q ( (i/ħ)(Ĥφ)* ψ − (i/ħ) φ* (Ĥψ) )

where we have used Schrödinger's equation and its complex conjugate:
−iħ ∂φ*/∂t = (Ĥφ)*.
² This guarantees that the energy eigenstates have real eigenvalues and form a basis of the Hilbert
space. We will only consider Hermitian Hamiltonians in this course. However, while it is conventional to
consider only Hermitian Hamiltonians, it is by no means a logical consequence of canonical quantisation,
and one should be aware that non-Hermitian Hamiltonians are discussed occasionally at research level;
see for example the recent work of Professor Carl Bender.
As Ĥ is Hermitian we have Ĥ† = Ĥ, and the potential terms cancel, so

    ∂_t ⟨φ|ψ⟩ = (i/ħ) ∫_{R^k} d^k q ( (Ĥφ)* ψ − φ* (Ĥψ) )    (3.76)
             = −(iħ/2) ∫_{R^k} d^k q Σ_{i=1}^{n} (1/m_i) ( (∂^2φ*/∂q_i^2) ψ − φ* (∂^2ψ/∂q_i^2) )
             = −(iħ/2) ∫_{R^k} d^k q Σ_{i=1}^{n} (1/m_i) ∂/∂q_i ( (∂φ*/∂q_i) ψ − φ* (∂ψ/∂q_i) )
             = −(iħ/2) [ Σ_{i=1}^{n} (1/m_i) ( (∂φ*/∂q_i) ψ − φ* (∂ψ/∂q_i) ) ]_{boundary}
             = 0

if the boundary term vanishes: typically well-behaved wavefunctions have compact
support and vanish at ±∞. So to complete the proof we have assumed that both
wavefunctions go to zero at infinity while their first derivatives remain finite there.
From the calculation above we see that the probability density ρ ≡ ψ*ψ (N.B. just
the integrand above, with φ = ψ), which was used to normalise the probability expressed
by Born's rule, is conserved up to a probability current J_i corresponding to the
boundary term above:

    ∂ρ/∂t = −Σ_i ∂/∂q_i [ (iħ/(2m_i)) ( (∂ψ*/∂q_i) ψ − ψ* (∂ψ/∂q_i) ) ] ≡ −Σ_i ∂J_i/∂q_i    (3.77)

where J_i is called the probability current and is defined by

    J_i ≡ (iħ/(2m_i)) ( (∂ψ*/∂q_i) ψ − ψ* (∂ψ/∂q_i) ) .    (3.78)

Consequently we arrive at the continuity equation for quantum mechanics

    ∂ρ/∂t + ∇·J = 0    (3.79)

where J is the vector whose components are J_i.

While the setting was different, we note the similarity of this construction to the
derivation of a conserved charge in Noether's theorem as presented above.
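
As a simple illustration (not in the notes), take a free plane wave in one dimension,
ψ = A e^{i(kq − ωt)}. Then ρ = ψ*ψ = |A|^2 is constant in space and time, and

    J = (iħ/(2m)) ( (∂_q ψ*) ψ − ψ* ∂_q ψ ) = (ħk/m) |A|^2 = ρ v ,

with v = ħk/m the classical particle velocity, so the continuity equation is satisfied
trivially.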
3.2.1 The Heisenberg and Schrödinger Pictures.

Initially the two formulations of quantum mechanics were not understood to be
identical. The matrix mechanics of Heisenberg was widely thought to be mathematically
abstract, while the formulation of a wave equation by Schrödinger, although it appeared
later, was much more quickly accepted, as the community of physicists was much more
familiar with wave equations than with non-commuting matrix variables. However both
formulations were shown to be identical. Here we will discuss the two 'pictures' and
show the transformations which map them into each other.

The Schrödinger Picture

In the Schrödinger picture the states are time-dependent, ψ = ψ(q, t), but the operators
are not: dÂ/dt = 0. One can find the time-evolution of the states from the Schrödinger
equation:

    iħ ∂/∂t |ψ(t)⟩_S = Ĥ |ψ(t)⟩_S    (3.80)

which has the formal solution

    |ψ(t)⟩_S = e^{−iĤt/ħ} |ψ(t)⟩_S |_{t=0} = e^{−iĤt/ħ} |ψ(0)⟩_S .    (3.81)
Using the energy eigenvectors (the eigenvectors of the Hamiltonian) as a countable basis
for the Hilbert space, we have

    |ψ(t)⟩_S = Σ_n |E_n⟩⟨E_n|ψ(0)⟩_S e^{−iE_n t/ħ}    (3.82)

where Ĥ|E_n⟩ = E_n|E_n⟩. In particular, if |ψ(0)⟩_S is itself an energy eigenstate with
eigenvalue E, then |ψ(t)⟩_S = e^{−iEt/ħ} |ψ(0)⟩_S.
The Heisenberg Picture

In the Heisenberg picture the states are time-independent but the operators are
time-dependent:

    |ψ⟩_H ≡ e^{iĤt/ħ} |ψ(t)⟩_S = |ψ(0)⟩_S    (3.83)

while

    Â_H(t) ≡ e^{iĤt/ħ} Â_S e^{−iĤt/ħ} .    (3.84)

Note that the dynamics in the Heisenberg picture is described by

    d/dt Â_H(t) = (i/ħ) Ĥ Â_H(t) − Â_H(t) (i/ħ) Ĥ = (i/ħ) [Ĥ, Â_H(t)]    (3.85)

and we note the parallel with the statement from Hamiltonian mechanics that
df/dt = {f, H} for a function f(q, p) on phase space.
Theorem 3.2.2. The picture-changing transformations leave the inner product
invariant.

Proof.

    _H⟨φ|ψ⟩_H = _S⟨φ(t)| e^{−iĤt/ħ} e^{iĤt/ħ} |ψ(t)⟩_S = _S⟨φ(t)|ψ(t)⟩_S    (3.86)

Theorem 3.2.3. The operator matrix elements are also invariant under the
picture-changing transformations.
Proof.

    _H⟨φ|Â_H(t)|ψ⟩_H = _S⟨φ| e^{−iĤt/ħ} Â_H(t) e^{iĤt/ħ} |ψ⟩_S    (3.87)
                    = _S⟨φ| e^{−iĤt/ħ} e^{iĤt/ħ} Â_S e^{−iĤt/ħ} e^{iĤt/ħ} |ψ⟩_S
                    = _S⟨φ| Â_S |ψ⟩_S
Example: The Quantum Harmonic Oscillator. The Lagrangian for the harmonic
oscillator is

    L = (1/2) m q̇^2 − (1/2) k q^2 .    (3.88)

The equation of motion is

    q̈ = −(k/m) q    (3.89)

whose solution is

    q = A cos(ωt) + B sin(ωt)    (3.90)

where ω = √(k/m). The Legendre transform gives the Hamiltonian:

    H = p^2/(2m) + (k/2) q^2 = (1/2) m ω^2 q^2 + p^2/(2m) .    (3.91)

The canonical quantisation procedure gives the quantum Hamiltonian for the harmonic
oscillator:

    Ĥ = (1/2) m ω^2 q̂^2 + p̂^2/(2m) .    (3.92)
Let us first deal with this by directly trying to solve the Schrödinger equation.
Following the quantisation prescription above, the Schrödinger equation is

    iħ ∂ψ/∂t = −(ħ^2/(2m)) ∂^2ψ/∂q^2 + (1/2) k q^2 ψ .    (3.93)

First we look for energy eigenstates:

    −(ħ^2/(2m)) ∂^2ψ_n/∂q^2 + (1/2) k q^2 ψ_n = E_n ψ_n ,    (3.94)

so that the general solution is

    ψ(t) = Σ_n c_n e^{−iE_n t/ħ} ψ_n .    (3.95)

To continue we write ψ_n(q) = f(q) e^{−q^2 b^2}, where b is a constant and f an unknown
function. We find

    ∂^2ψ_n/∂q^2 = ( f'' − 4 b^2 q f' − 2 b^2 f + 4 b^4 q^2 f ) e^{−q^2 b^2}    (3.96)

and hence

    −(ħ^2/(2m)) ( f'' − 4 b^2 q f' − 2 b^2 f + 4 b^4 q^2 f ) + (1/2) k q^2 f = E_n f .    (3.97)
So far f was arbitrary, so we can choose b^4 = km/(4ħ^2) so that the terms involving
q^2 f are cancelled. This in turn means that a constant f = C_0 provides one solution:

    ψ_0 = C_0 e^{−√(km) q^2/(2ħ)} ,    E_0 = ħ^2 b^2/m = (1/2) ħω .    (3.98)
We can fix C_0 by demanding that

    1 = ∫_{−∞}^{∞} dq |ψ_0(q)|^2 = |C_0|^2 ∫_{−∞}^{∞} dq e^{−√(km) q^2/ħ}
      = |C_0|^2 ( ħ/√(km) )^{1/2} ∫_{−∞}^{∞} dx e^{−x^2} = |C_0|^2 ( πħ/√(km) )^{1/2} .    (3.99)

Thus we can take C_0 = ( √(km)/(πħ) )^{1/4}.
To find other solutions we note that the general equation for f is

    f'' − 4 b^2 q f' − 2 b^2 f = −(2m/ħ^2) E_n f .    (3.100)

It is not hard to convince yourself that polynomials of degree n in q will solve this
equation. One can then work out the E_n for low values of n. And although ψ_0 is indeed
the ground state, this is not obvious.
However there is a famous and very important algebraic way to solve the harmonic
oscillator. Let us make an inspired change of variables and rewrite the Hamiltonian in
terms of

    α = √( mω/(2ħ) ) ( q̂ + (i/(mω)) p̂ )    (3.101)
    α† = √( mω/(2ħ) ) ( q̂ − (i/(mω)) p̂ )

so that

    q̂ = √( ħ/(2mω) ) ( α + α† )    and    p̂ = −i √( mωħ/2 ) ( α − α† ) .    (3.102)

Therefore,

    Ĥ = (1/2) m ω^2 ( ħ/(2mω) ) ( α + α† )^2 − (1/(2m)) ( mωħ/2 ) ( α − α† )^2    (3.103)
      = (ħω/4) ( ( α + α† )^2 − ( α − α† )^2 )
      = (ħω/2) ( α α† + α† α ) .
Problem 3.2.1. Show that [α, α†] = 1.

Using [α, α†] = 1 we find that

    Ĥ = ħω ( (1/2) + α† α ) .    (3.104)
The Hilbert space of states may be constructed as follows. Let |n⟩ be an orthonormal
basis in which Ĥ is diagonalised - i.e. these are the energy eigenstates:

    Ĥ|n⟩ ≡ E_n |n⟩ .    (3.105)
Now we note that

    [Ĥ, α†] = ħω [ (1/2) + α†α , α† ]    (3.106)
            = ħω α† [α, α†]
            = ħω α†

and, similarly,

    [Ĥ, α] = −ħω α .    (3.107)

Consequently we may deduce that α† raises the eigenvalue of an energy eigenstate,
while α lowers it:

    Ĥ α†|n⟩ = ( α† Ĥ + ħω α† )|n⟩ = ( E_n + ħω ) α†|n⟩    (3.108)
    Ĥ α|n⟩ = ( α Ĥ − ħω α )|n⟩ = ( E_n − ħω ) α|n⟩ .

Consequently α† is called the creation operator, while α is called the annihilation
operator. Together α and α† are sometimes called the ladder operators.
It would appear that, given a single eigenstate, the ladder operators create an infinite
set of eigenstates; however, due to the positive definiteness of the Hilbert space inner
product, we see that the infinite tower of states must terminate at some point. Consider
the length squared of the state α|n⟩:

    0 ≤ ⟨n| α†α |n⟩ = ⟨n| ( (1/(ħω)) Ĥ − (1/2) ) |n⟩ = ( E_n/(ħω) − 1/2 ) ⟨n|n⟩    (3.109)

hence E_n ≥ (1/2) ħω. However, the energy eigenvalues of the states α^k|n⟩ are

    Ĥ α^k|n⟩ = ( E_n − kħω ) α^k|n⟩    (3.110)

where k ∈ Z and k > 0. We see that the eigenvalues of the states are continually
reduced, but we know that a minimum energy exists ((1/2)ħω) below which the
eigenstates would have negative length squared. Consequently we conclude that there
must exist a ground state eigenfunction |0⟩ such that α|0⟩ = 0. In fact if α|0⟩ = 0 then

    ⟨0| α†α |0⟩ = 0   ⇒   E_0 = (1/2) ħω .    (3.111)
Finally we comment on the normalisation of the energy eigenstates. Our aim is to find
the normalising constant κ, where

    κ |n − 1⟩ = α |n⟩ .    (3.112)

Then, as both |n − 1⟩ and |n⟩ are normalised, we have:

    |κ|^2 = |κ|^2 ⟨n − 1|n − 1⟩ = ⟨n| α†α |n⟩ = n ⟨n|n⟩ = n    (3.113)

where we have used the observation that α†α is the number operator.
Problem 3.2.2. Let the state |n⟩ be interpreted as an n-particle eigenstate with energy
E_n = ħω( (1/2) + n ). Show that the number operator N̂ ≡ α†α satisfies:

    ⟨n| N̂ |n⟩ = n .    (3.114)

Hence κ = √n and α|n⟩ = √n |n − 1⟩.
Problem 3.2.3. Show that α†|n⟩ = √(n + 1) |n + 1⟩.
Thus we see that the spectrum of the harmonic oscillator is

    E_n = ħω ( n + 1/2 ) ,    (3.115)

with n = 0, 1, 2, 3, .... So indeed the ψ_0 found above is the ground state. We could
have easily found it from this discussion, as α|0⟩ = 0 becomes the differential equation

    0 = ( q̂ + (i/(mω)) p̂ ) ψ_0 = q ψ_0 + (ħ/(mω)) ∂ψ_0/∂q .    (3.116)
Integrating this immediately gives the ψ_0(q) that we found above. Furthermore, the
higher eigenstates can be found by acting with powers of α†:

    ψ_{n+1} = (1/√(n+1)) α† ψ_n = (1/√(n+1)) √( mω/(2ħ) ) ( q ψ_n − (ħ/(mω)) ∂ψ_n/∂q ) .    (3.117)

These will be normalised, and ψ_n will clearly take the form of a polynomial of degree n
times ψ_0.
Compare this spectrum to the classical answer we had before:

    E = (1/2) k ( A^2 + B^2 ) .    (3.118)

This depends on the amplitude of the oscillation and on k (not ω) and takes any
non-negative value, whereas in the quantum theory there is a non-zero ground state
energy (1/2)ħω with a discrete spacing ħω above that. The ground state energy can in
fact be measured in what is known as the Casimir effect. It also plays an important role
in string theory, leading to the need to have 10 (or 26) dimensions.
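
The ladder-operator construction is easy to test numerically (a Python sketch, not part
of the notes; it uses numpy and sets ħ = ω = 1): truncating the Hilbert space to N
levels, α is the matrix with √n on the superdiagonal.

    import numpy as np

    N = 8
    # annihilation operator: alpha|n> = sqrt(n)|n-1>
    alpha = np.diag(np.sqrt(np.arange(1, N)), k=1)

    H = 0.5 * np.eye(N) + alpha.conj().T @ alpha   # H = 1/2 + alpha^dagger alpha
    print(np.diag(H))                               # energies n + 1/2

    comm = alpha @ alpha.conj().T - alpha.conj().T @ alpha
    print(np.diag(comm))   # equals 1 everywhere except at the truncation edge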
Chapter 4
Group Theory
The first investigations of groups are credited to the famously dead-at-twenty Évariste
Galois, who was killed in a duel in 1832. Groups were first used to map solutions of
polynomial equations into each other. For example, the quadratic equation

    y = a x^2 + b x + c    (4.1)

is solved when y = 0 by

    x = (1/(2a)) ( −b ± √( b^2 − 4ac ) ) .    (4.2)

It has two solutions (±), which may be mapped into each other by a Z_2 reflection which
swaps the + solution for the − solution. Z_2 is the cyclic group of order two (which is
sometimes denoted C_2), and similarly there exist groups which map the roots of a more
general polynomial equation into each other. Groups have a geometrical meaning too.
The symmetries which leave the n-polygons unchanged under rotation are also the cyclic
groups, Z_n (or C_n). For example, Z_3 rotates an equilateral triangle into itself using
rotations of 2π/3, 4π/3 and 6π/3 = 2π about the centre of the triangle, and Z_4 is the
group of rotations of the square onto itself.
The cyclic groups are examples of discrete symmetry groups. The action of the
discrete group takes a system (e.g. the square in R^2) and rotates it onto itself without
passing through any of the intervening orientations. The Z_4 group includes the rotation
by π/2, but it does not include any of the rotations through angles less than π/2 and
greater than 0. One may imagine that under the action of Z_4 the square jumps between
orientations:

    A B
    C D    (4.3)
On the other hand, continuous groups (such as the rotation group in R^2) move the
square continuously about the centre of rotation. The rotation is parameterised by a
continuous angle variable, often denoted θ. The Norwegian Sophus Lie began the study
of continuous groups, also known as Lie groups, in the second half of the 19th century.
Rather than thinking about geometry, Sophus Lie was interested in whether there were
some groups equivalent to Galois groups which mapped solutions of differential equations
into each other¹. Such groups were identified, classified and named Lie groups. The
rotation group SO(n) is a Lie group.

In the wider context, groups may act on more than algebraic equations or geometric
shapes in the plane, and the action of the group may be encoded in different ways. The
study of the ways groups may be represented is aptly named representation theory.
It is believed, and successfully tested (at the present energies of experiments), that
the constituent objects in the universe are invariant under certain symmetries. The
standard model of particle physics holds that all known particles are representations
of SU(3) × SU(2) × U(1). More simply, Einstein's special theory of relativity may be
studied as the theory of the Lorentz group.

We will make contact with most of these topics in this chapter, and we begin with
the preliminaries of group theory: just what is a group?
4.1 The Basics

Definition A group G is a set of elements {g_1, g_2, g_3, ...} together with a composition
law (∘) which maps G × G → G by (g_1, g_2) → g_1 ∘ g_2, such that:

(i) g_1 ∘ (g_2 ∘ g_3) = (g_1 ∘ g_2) ∘ g_3  ∀ g_1, g_2, g_3 ∈ G    ASSOCIATIVE
(ii) ∃ e ∈ G such that e ∘ g = g ∘ e = g  ∀ g ∈ G    IDENTITY
(iii) ∃ g^{-1} ∈ G such that g ∘ g^{-1} = g^{-1} ∘ g = e  ∀ g ∈ G    INVERSES

Consequently the most trivial group consists of just the identity element e. Within the
definition above, alongside the associative property of the group multiplication and the
existence of an identity element and an inverse element g^{-1} for each g, there is what
we might call the zeroth property of a group, namely the closure of the group: that
g_1 ∘ g_2 ∈ G.
Let us now define some of the most fundamental ideas in group theory.

Definition A group G is called commutative or abelian if g_1 ∘ g_2 = g_2 ∘ g_1
∀ g_1, g_2 ∈ G.

Definition The centre Z(G) of a group is:

    Z(G) ≡ { g_1 ∈ G | g_1 ∘ g_2 = g_2 ∘ g_1 ∀ g_2 ∈ G }    (4.4)

The centre of a group is the subset of elements in the group which commute with all
other elements in G. Trivially e ∈ Z(G), as e ∘ g = g ∘ e ∀ g ∈ G.

Definition The order |G| of a group G is the number of elements in the set {g_1, g_2, ...}.

For example, the order of the group Z_2 is |Z_2| = 2; we have also seen |Z_3| = 3,
|Z_4| = 4 and in general |Z_n| = n, where the elements are the rotations by 2πm/n with
m ∈ Z mod n.

Definition For each g ∈ G the conjugacy class C_g is the subset

    C_g ≡ { h ∘ g ∘ h^{-1} | h ∈ G } ⊂ G .    (4.5)

¹ Very loosely, as each solution to a differential equation is correct up to a constant, the solutions
contain a continuous parameter: the constant.
Exercise Show that the identity element of a group G is unique.

Solution Suppose e and f are two distinct identity elements in G. Then e ∘ g = g = f ∘ g,
so e ∘ (g ∘ g^{-1}) = f ∘ (g ∘ g^{-1}), i.e. e = f, contrary to the supposition.
4.2 Common Groups

A list of groups is shown in table 4.2.1, where the set and the group multiplication law
have been highlighted. A few remarks are in order.

(1, 6-10) are finite groups satisfying |G| < ∞.

(14-20) are called the classical groups.

Groups can be represented by giving their multiplication table. For example, consider
Z_3:

    ∘    | e    g    g^2
    -----+---------------
    e    | e    g    g^2
    g    | g    g^2  e
    g^2  | g^2  e    g

Arbitrary combinations of group elements are sometimes called words.
4.2.1 The Symmetric Group S_n

The symmetric group S_n is the group of permutations of n elements. For example, S_2
has order |S_2| = 2! and acts on the two orderings (1, 2) and (2, 1). The group action is
defined element by element and may be written as a two-row matrix with n columns,
where the permutation is defined per column, with the label in row one being substituted
for the label in row two. For S_2 consider the group element

    g_1 ≡ ( 1 2 ; 2 1 )    (4.6)

(here ( 1 2 ; 2 1 ) lists the top row before the semicolon and the bottom row after it).
This acts on the elements as

    g_1 (1, 2) = (2, 1) ,    g_1 (2, 1) = (1, 2)    (4.7)
    g_1^2 (1, 2) = (1, 2) ,    g_1^2 (2, 1) = (2, 1)    (4.8)

hence g_1 = g_1^{-1}, g_1^2 = e and S_2 = {e, g_1}. It is identical to Z_2.
More generally, for the group S_n having n! elements, an element is denoted by a
permutation P such as:

    P ≡ ( 1 2 3 ... n ; p_1 p_2 p_3 ... p_n )    (4.9)

where {p_1, p_2, p_3, ..., p_n} = {1, 2, 3, ..., n}. The permutation P takes (1, 2, 3, ..., n)
to (p_1, p_2, p_3, ..., p_n). In general, successive permutations do not commute. For
example, consider S_3 and let

    P ≡ ( 1 2 3 ; 2 3 1 )    and    Q ≡ ( 1 2 3 ; 1 3 2 ) .    (4.10)
1. G = {e}. Under multiplication.
2. F, where F = Z, Q, R, C. Under addition.
3. F* ≡ F \ {0}, where F = Q, R, C. Under multiplication.
4. F_{>0}, where F = Q, R. An abelian group under multiplication.
5. {0, ±n, ±2n, ±3n, ...} ≡ nZ, where n ∈ Z. An abelian group under addition.
6. {0, 1, 2, 3, ..., (n − 1)}. Addition mod n, e.g. a + b = c mod n.
7. {1, −1}. Under multiplication.
8. {e, g, g^2, g^3, ..., g^{n−1}}, with g^k ∘ g^l = g^{(k+l) mod n}.
   This is the cyclic group of order n, Z_n.
9. S_n, the symmetric group or permutation group of n elements.
   Under the composition of permutations.
10. D_n, the dihedral group: the group of rotations and reflections of an n-sided
    polygon with undirected edges. Under the composition of transformations.
11. Bijections f : X → X, where X is a set. Composition of maps.
12. GL(V) ≡ {f : V → V | f is linear and invertible}, where V is a vector space.
    Composition of maps.
13. A vector space V. An abelian group under vector addition.
14. GL(n, F) ≡ {M ∈ n × n matrices | M is invertible}, the general linear group,
    with matrix entries in F. Matrix multiplication.
15. SL(n, F) ≡ {M ∈ GL(n, F) | det M = 1}, the special linear group.
    Matrix multiplication.
16. O(n) ≡ {M ∈ GL(n, R) | M^T M = I_n}, the orthogonal group.
    Matrix multiplication.
17. SO(n) ≡ {M ∈ O(n) | det M = 1}, the special orthogonal group.
    Matrix multiplication.
18. U(n) ≡ {M ∈ GL(n, C) | M† M = I_n}, the unitary group.
    Matrix multiplication.
19. SU(n) ≡ {M ∈ U(n) | det M = 1}, the special unitary group.
    Matrix multiplication.
20. Sp(2n) ≡ {M ∈ GL(2n, R) | M^T J M = J}, where J ≡ ( 0_n  I_n ; −I_n  0_n ),
    the symplectic group. Matrix multiplication.
21. O(p, q) ≡ {M ∈ GL(p + q, R) | M^T η_{p,q} M = η_{p,q}}, where
    η_{p,q} ≡ ( I_p  0 ; 0  −I_q ). Matrix multiplication.
22. SL(2, Z) ≡ { ( a b ; c d ) | a, b, c, d ∈ Z, ad − bc = 1 }, the modular group.
    Matrix multiplication.

Table 4.2.1: A list of commonly occurring groups.
Then, adopting the usual convention that the rightmost permutation acts first,

    P ∘ Q = ( 1 2 3 ; 2 3 1 ) ∘ ( 1 2 3 ; 1 3 2 ) = ( 1 2 3 ; 2 1 3 )    (4.11)

while

    Q ∘ P = ( 1 2 3 ; 1 3 2 ) ∘ ( 1 2 3 ; 2 3 1 ) = ( 1 2 3 ; 3 2 1 ) .    (4.12)

Hence P ∘ Q ≠ Q ∘ P and S_3 is non-abelian. It follows that S_n is non-abelian for all
n > 2.
Alternatively, one may denote each permutation by the disjoint cycles of labels formed
by multiple actions of that permutation. For example, consider P ∈ S_3 as defined
above. Under successive actions of P we see that the label 1 is mapped as:

    1 → 2 → 3 → 1 .    (4.13)

We may denote this cycle as (1, 2, 3), and it defines P entirely. On the other hand Q,
as defined above, may be described by two disjoint cycles:

    1 → 1    (4.14)
    2 → 3 → 2 .    (4.15)

We may write Q as the two disjoint cycles (1)(2, 3). In this notation S_3 is written

    { (), (1, 2), (1, 3), (2, 3), (1, 2, 3), (1, 3, 2) }    (4.16)
where () denotes the trivial identity permutation. S_3 is identical to the dihedral group
D_3. The dihedral group D_n is sometimes defined as the symmetry group of rotations
of an n-sided polygon with undirected edges - this definition requires a bit of thought,
as some of the rotations are about an axis lying in the plane of the polygon and so act
as reflections within the plane. The dihedral group should be compared with the cyclic
group Z_n, which consists of the rotation symmetries of an n-polygon with directed
edges, while D_n includes the reflections in the plane as well. For example, if we label
the vertices of an equilateral triangle by 1, 2 and 3, we could denote D_3 as the following
permutations of the vertices:

    { ( 1 2 3 ; 1 2 3 ), ( 1 2 3 ; 2 1 3 ), ( 1 2 3 ; 3 2 1 ),
      ( 1 2 3 ; 1 3 2 ), ( 1 2 3 ; 3 1 2 ), ( 1 2 3 ; 2 3 1 ) }    (4.17)
    = { (), (1, 2), (1, 3), (2, 3), (1, 3, 2), (1, 2, 3) } .

So we see that D_3 is identical to S_3. There are three reflections and three rotations
within D_3 (the identity element is counted as a rotation for this purpose). In general,
D_n contains the n rotations of Z_n as well as n reflections. For even n there are n/2
axes of reflection symmetry passing through pairs of opposing vertices, and a further
n/2 reflections in the lines through the centres of opposing edges. For odd n there are
again n lines about which reflection is a symmetry; however, these lines now join a
vertex to the middle of an opposing edge. In both even and odd cases there are
therefore n rotations and n reflections. Hence |D_n| = 2n.
We may wonder if all dihedral groups D_n are identical to the permutation groups S_n.
The answer is no; it was a coincidence that S_3 ≅ D_3. We can convince ourselves of
this by considering the orders of S_n and D_n. As we have already observed, |S_n| = n!
while |D_n| = 2n. For the groups to be identical we at least require their orders to match,
and we note that we can only satisfy n! = 2n for n = 3.
Returning to the symmetric group, we will mention a third important notation for
permutations which is used to define symmetric and anti-symmetric tensors. Each
permutation P can be written as a combination of elements called transpositions
τ_{i,j}, which swap the elements i and j but leave the remainder untouched.
Consequently each transposition may be written as a 2-cycle, τ_{i,j} = (i, j). For
example,

    P ≡ ( 1 2 3 ; 2 3 1 ) = τ_{1,2} ∘ τ_{2,3} .    (4.18)

If N transpositions are required to replicate a permutation P ∈ S_n, then the sign of
the permutation is defined by

    Sign(P) ≡ (−1)^N .    (4.19)

You should convince yourself that this operation is well-defined and that each
permutation P has a unique value of Sign(P) - this is not obvious, as there are many
different combinations of transpositions which give the same overall permutation. The
canonical way to decompose permutations into transpositions is to consider only
transpositions which interchange consecutive labels, e.g. τ_{1,2}, τ_{2,3}, ..., τ_{n−1,n}.
A general r-cycle may be decomposed (not in the canonical way) into r − 1
transpositions:

    (n_1, n_2, n_3, ..., n_r) = (n_1, n_2)(n_2, n_3) ... (n_{r−1}, n_r)
                              = τ_{n_1,n_2} ∘ τ_{n_2,n_3} ∘ ... ∘ τ_{n_{r−1},n_r} .    (4.20)

Consequently an r-cycle corresponds to a permutation R such that Sign(R) = (−1)^{r−1}.
Therefore the elements of S_3 ≅ D_3 may be partitioned into those elements of sign 1,
{(), (1, 2, 3), (1, 3, 2)}, which geometrically correspond to the rotations of the
equilateral triangle in the plane, and those of sign −1, {(1, 2), (2, 3), (1, 3)}, which are
the reflections in the plane. The subset of permutations P ∈ S_n which have
Sign(P) = 1 forms a subgroup of S_n, which is called the alternating group and denoted
A_n.

We finish our discussion of the symmetric group by mentioning Cayley's theorem. It
states that every finite group of order n can be considered as a subgroup of S_n. Since
S_n contains all possible permutations of n labels, it is not a surprising theorem.
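
The sign of a permutation is easily computed (a Python sketch, not part of the notes):
the parity of the number of inversions equals the parity of the number of transpositions
in any decomposition, and for S_3 it cleanly separates the rotations from the reflections.

    from itertools import permutations

    def sign(p):
        # parity of the number of inversions of the tuple p
        inv = sum(p[a] > p[b] for a in range(len(p)) for b in range(a + 1, len(p)))
        return (-1) ** inv

    for p in permutations((1, 2, 3)):
        print(p, sign(p))
    # sign +1: (1,2,3), (2,3,1), (3,1,2)  -- the rotations, i.e. A_3
    # sign -1: (1,3,2), (2,1,3), (3,2,1)  -- the reflections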
Problem 4.2.1. D_n is the dihedral group: the set of rotation symmetries of an
n-polygon with undirected edges.

(i.) Write down the multiplication table for D_3, defined on the elements {e, a, b} by
a^2 = b^3 = (a ∘ b)^2 = e. Give a geometrical interpretation of a and b in terms of the
transformations of an equilateral triangle.

(ii.) Rewrite the group multiplication table of D_3 in terms of six disjoint cycles given
by repeated action of the basis elements on the identity until they return to the
identity, e.g. e → e under the action of e, e → a → e under the action of a.

(iii.) Label the vertices of the equilateral triangle by (1, 2, 3) and give permutations of
{1, 2, 3} for e, a and b which match the defining relations of D_3.

(iv.) Rewrite each of the cycles of part (ii.) in cyclic notation on the vertices (1, 2, 3)
to show this gives all the permutations of S_3.
4.2.2 Back to Basics

Definition A subgroup H of a group G is a subset of G such that e ∈ H, if h_1, h_2 ∈ H
then h_1 ∘ h_2 ∈ H, and if h ∈ H then h^{-1} ∈ H.

The identity element {e} and G itself are called the trivial subgroups of G. If a subgroup
H is not one of these two trivial cases, then it is called a proper subgroup, and this is
denoted H < G. For example, S_2 < S_3 as:

    S_2 = { (), (1, 2) }    and    S_3 = { (), (1, 2), (1, 3), (2, 3), (1, 2, 3), (1, 3, 2) } .    (4.21)
Definition Let H < G. The subsets g ∘ H ≡ { g ∘ h ∈ G | h ∈ H } are called left-cosets,
while the subsets H ∘ g ≡ { h ∘ g ∈ G | h ∈ H } are called right-cosets.

A more formal way to define a left coset is to consider an equivalence relation
g_1 ∼ g_2 iff g_1^{-1} ∘ g_2 ∈ H. Equivalence relations satisfy three properties:

    g ∼ g ;
    if g_1 ∼ g_2 then g_2 ∼ g_1 ;
    if g_1 ∼ g_2 and g_2 ∼ g_3 then g_1 ∼ g_3 .

It is easy to check these for our case. The left cosets g ∘ H are then the equivalence
classes, and the set of them is G/∼. Similarly, a right coset is defined by the equivalence
relation g_1 ∼ g_2 iff g_1 ∘ g_2^{-1} ∈ H.
The left-coset g ∘ H, where g ∈ G, contains the elements

    { g ∘ h_1, g ∘ h_2, ..., g ∘ h_r }    (4.22)

where r ≡ |H| and h_1, h_2, ..., h_r are the distinct elements of H. One might suppose
that r < |H|, which could occur if two or more elements of g ∘ H were identical, but if
that were the case we would have

    g ∘ h_1 = g ∘ h_2   ⇒   h_1 = h_2    (4.23)

but h_1 and h_2 are defined to be distinct. Hence all cosets of G have the same number
of elements, which is |H|, the order of H.
Consequently, any two cosets are either disjoint or coincide. For example, consider
the two left-cosets g_1 ∘ H and g_2 ∘ H and suppose that there existed some element g
in the intersection of both cosets, i.e. g ∈ g_1 ∘ H ∩ g_2 ∘ H. In this case we would have
g = g_1 ∘ h_1 = g_2 ∘ h_2 for some h_1, h_2 ∈ H. Then,

    g_1 ∘ H = (g ∘ h_1^{-1}) ∘ H = g ∘ H = (g ∘ h_2^{-1}) ∘ H = g_2 ∘ H .    (4.24)

Hence either the cosets are disjoint or, if they do have a non-empty intersection, they
are in fact coincident. This means that the cosets provide a disjoint partition of G:

    G = g_1 ∘ H ∪ g_2 ∘ H ∪ g_3 ∘ H ∪ ... ∪ g_n ∘ H    (4.25)

hence

    |G| = n |H|    (4.26)

for some n ∈ Z. This statement is known as Lagrange's theorem: the order of any
subgroup of G must be a divisor of |G|.

A corollary of Lagrange's theorem is that groups of prime order have no proper
subgroups (e.g. Z_n where n is prime).
Definition $H < G$ is called a normal subgroup of $G$ if
$$g \circ H = H \circ g \tag{4.27}$$
for all $g \in G$. This is denoted $H \lhd G$. The definition of a normal subgroup is equivalent to saying that $g \circ H \circ g^{-1} = H$.

Definition $G$ is called a simple group if it has no non-trivial normal subgroups (i.e. no normal subgroups besides $\{e\}$ and $G$ itself).
Theorem 4.2.1. If $H \lhd G$ then the set of cosets $\frac{G}{H}$ is itself a group with composition law
$$(g_1 \circ H) \circ (g_2 \circ H) = (g_1 \circ g_2) \circ H \qquad \forall g_1, g_2 \in G. \tag{4.28}$$
This group is called the quotient group, or factor group, and denoted $\frac{G}{H}$.
Note that the normal condition is needed to ensure that this product is well defined, i.e. independent of the choice of coset representative. To see this, suppose that we choose $g_1 \in G$ and $g_2 \in G$ as the coset representatives, so that the coset representative of $(g_1 \circ g_2) \circ H$ is $g_1 \circ g_2$. But we could also have chosen the representatives $g'_1 = h_1 \circ g_1$ and $g'_2 = h_2 \circ g_2$ with $h_1, h_2 \in H$. In this case the coset representative of the product is $h_1 \circ g_1 \circ h_2 \circ g_2$ and we require that this is equivalent to $g_1 \circ g_2$. This means that we need $g_2^{-1} \circ g_1^{-1} \circ h_1 \circ g_1 \circ h_2 \circ g_2 \in H$. If $H$ is normal then $g_2^{-1} \circ g_1^{-1} \circ h_1 \circ g_1 \circ g_2 = h' \in H$ and $g_2^{-1} \circ h_2 \circ g_2 = h'' \in H$, so that
$$g_2^{-1} \circ g_1^{-1} \circ h_1 \circ g_1 \circ h_2 \circ g_2 = (g_2^{-1} \circ g_1^{-1} \circ h_1 \circ g_1 \circ g_2) \circ (g_2^{-1} \circ h_2 \circ g_2) = h' \circ h'' \in H.$$
Proof. Evidently it is closed, as the group action takes $g \circ H \circ g' \circ H \to g \circ g' \circ H$. Let us check the three axioms that define a group.

(i.) Associativity:
$$\begin{aligned}(g_1 \circ H) \circ ((g_2 \circ H) \circ (g_3 \circ H)) &= (g_1 \circ H) \circ ((g_2 \circ g_3) \circ H) \\ &= (g_1 \circ (g_2 \circ g_3)) \circ H \\ &= ((g_1 \circ g_2) \circ g_3) \circ H \\ &= ((g_1 \circ g_2) \circ H) \circ (g_3 \circ H) \\ &= ((g_1 \circ H) \circ (g_2 \circ H)) \circ (g_3 \circ H)\end{aligned} \tag{4.29}$$

(ii.) Identity. The coset $e \circ H$ acts as the identity element:
$$(e \circ H) \circ (g \circ H) = (e \circ g) \circ H = g \circ H \qquad (g \circ H) \circ (e \circ H) = (g \circ e) \circ H = g \circ H \tag{4.30}$$

(iii.) Inverse. The inverse of the coset $g \circ H$ is the coset $g^{-1} \circ H$, as:
$$(g \circ H) \circ (g^{-1} \circ H) = e \circ H = H \tag{4.31}$$

N.B. the group composition law arises because $H \lhd G$, so $g_1 \circ H \circ g_2 \circ H = g_1 \circ g_2 \circ H$.
Let us give a simple example: modular arithmetic. We start with $\mathbb{Z}$ as an additive group. Fix an integer $p$ and let $H = p\mathbb{Z} = \{kp \mid k \in \mathbb{Z}\}$. It is easy to see that $p\mathbb{Z}$ is a subgroup of $\mathbb{Z}$ with the standard definition of addition. Since $\mathbb{Z}$ is abelian, $p\mathbb{Z}$ is a normal subgroup. Thus $\mathbb{Z}/p\mathbb{Z}$ is a group. In particular the cosets are
$$n + H = \{n + kp \mid k \in \mathbb{Z}\} \tag{4.32}$$
There are $p$ disjoint choices:
$$0 + H,\ 1 + H,\ 2 + H,\ \ldots,\ (p-1) + H, \tag{4.33}$$
since $p + H = 0 + H$, $(p+1) + H = 1 + H$, etc. The group product is just addition modulo $p$:
$$(n_1 + H) \circ (n_2 + H) = (n_1 + n_2) + H = \{n_1 + n_2 + kp \mid k \in \mathbb{Z}\} = ((n_1 + n_2) \bmod p) + H. \tag{4.34}$$
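The quotient construction is simple enough to mirror in code. A small sketch, with $p = 5$ as our arbitrary choice of modulus:

```python
# The quotient Z/pZ for p = 5: a coset n + 5Z is represented by n mod 5,
# and the induced product is addition mod 5.

p = 5
coset = lambda n: n % p                     # representative of n + pZ

# Well-definedness: representatives differing by a multiple of p label the
# same coset, so the product below does not depend on the choice made.
assert coset(3 + 7 * p) == coset(3)

def coset_product(n1, n2):
    return coset(n1 + n2)                   # (n1 + H) o (n2 + H) = (n1 + n2) + H

print(coset_product(3, 4))                  # 3 + 4 = 7 = 2 mod 5
```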
Let us look at another example, where the subgroup $H$ is not normal. We consider $S_3$, which has elements
$$S_3 = \left\{\begin{pmatrix}1&2&3\\1&2&3\end{pmatrix}, \begin{pmatrix}1&2&3\\2&1&3\end{pmatrix}, \begin{pmatrix}1&2&3\\3&2&1\end{pmatrix}, \begin{pmatrix}1&2&3\\1&3&2\end{pmatrix}, \begin{pmatrix}1&2&3\\3&1&2\end{pmatrix}, \begin{pmatrix}1&2&3\\2&3&1\end{pmatrix}\right\}. \tag{4.35}$$
Let us take the subgroup $H$ to be
$$H = \left\{\begin{pmatrix}1&2&3\\1&2&3\end{pmatrix}, \begin{pmatrix}1&2&3\\2&1&3\end{pmatrix}\right\}. \tag{4.36}$$
This is clearly a subgroup since it simply consists of two elements $e$ and $g$ with $g^2 = e$. In fact $H = S_2$, since it just permutes the first two elements. One can explicitly check that
$$\begin{pmatrix}1&2&3\\1&2&3\end{pmatrix} \circ H = H \circ \begin{pmatrix}1&2&3\\1&2&3\end{pmatrix} = H \tag{4.37}$$
as expected. And also that
$$\begin{pmatrix}1&2&3\\2&1&3\end{pmatrix} \circ H = H \circ \begin{pmatrix}1&2&3\\2&1&3\end{pmatrix} = H \tag{4.38}$$
as expected. But let us look at a non-trivial coset:
$$\begin{pmatrix}1&2&3\\1&3&2\end{pmatrix} \circ H = \left\{\begin{pmatrix}1&2&3\\1&3&2\end{pmatrix}\begin{pmatrix}1&2&3\\1&2&3\end{pmatrix},\ \begin{pmatrix}1&2&3\\1&3&2\end{pmatrix}\begin{pmatrix}1&2&3\\2&1&3\end{pmatrix}\right\} = \left\{\begin{pmatrix}1&2&3\\1&3&2\end{pmatrix},\ \begin{pmatrix}1&2&3\\3&1&2\end{pmatrix}\right\} \tag{4.39}$$
But the right coset is
$$H \circ \begin{pmatrix}1&2&3\\1&3&2\end{pmatrix} = \left\{\begin{pmatrix}1&2&3\\1&2&3\end{pmatrix}\begin{pmatrix}1&2&3\\1&3&2\end{pmatrix},\ \begin{pmatrix}1&2&3\\2&1&3\end{pmatrix}\begin{pmatrix}1&2&3\\1&3&2\end{pmatrix}\right\} = \left\{\begin{pmatrix}1&2&3\\1&3&2\end{pmatrix},\ \begin{pmatrix}1&2&3\\2&3&1\end{pmatrix}\right\} \tag{4.40}$$
and this is not the same as the left coset. So although $S_2$ is a subgroup of $S_3$ it is not a normal subgroup.
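This left/right asymmetry is mechanical to check. The sketch below redoes the computation with permutations in one-line notation on labels $0, 1, 2$ (rather than $1, 2, 3$), with composition $(\sigma \circ \tau)(i) = \sigma(\tau(i))$ matching the convention of (4.39) and (4.40):

```python
from itertools import permutations

# Left versus right cosets of the S_2 subgroup inside S_3.
# A permutation s is stored in one-line notation: s[i] is the image of i.

def compose(s, t):
    return tuple(s[t[i]] for i in range(3))   # (s o t)(i) = s(t(i))

S3 = [tuple(p) for p in permutations(range(3))]
H = [(0, 1, 2), (1, 0, 2)]                 # S_2: identity and the swap of the first two labels
g = (0, 2, 1)                              # the element used in the text

left = {compose(g, h) for h in H}
right = {compose(h, g) for h in H}
print("gH :", left)
print("Hg :", right)
print("normal?", left == right)            # False: S_2 is not normal in S_3
```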
4.3 Group Homomorphisms
Maps between groups are incredibly useful in recognising similar groups and constructing
new groups.
Definition A group homomorphism is a map $f: G \to G'$ between two groups $(G, \circ)$ and $(G', \bullet)$ such that
$$f(g_1 \circ g_2) = f(g_1) \bullet f(g_2) \qquad \forall g_1, g_2 \in G \tag{4.41}$$

Definition A group isomorphism is an invertible group homomorphism.

If an isomorphism exists between $G$ and $G'$ we write $G \cong G'$ and say that $G$ is isomorphic to $G'$.

Definition A group automorphism is an isomorphism $f: G \to G$.
Problem 4.3.1. If $f: G \to G'$ is a group homomorphism between the groups $G$ and $G'$, show that

(i.) $f(e) = e'$, where $e$ and $e'$ are the identity elements of $G$ and $G'$ respectively, and

(ii.) $f(g^{-1}) = (f(g))^{-1}$.

Theorem 4.3.1. If $f: G \to G'$ is a group homomorphism, then the kernel of $f$, defined as $\mathrm{Ker}(f) \equiv \{g \in G \mid f(g) = e'\}$, is a normal subgroup of $G$.

Problem 4.3.2. Prove Theorem 4.3.1.
The theorem above can be used to prove that $\frac{G}{\mathrm{Ker}(f)} \cong G'$ for a given surjective group homomorphism $f: G \to G'$, or conversely, given an isomorphism between $\frac{G}{\mathrm{Ker}(f)}$ and $G'$, to identify the group homomorphism $f$ (see section 4.3.1). A corollary of the theorem above is that simple groups, having no non-trivial normal subgroups, admit only trivial kernels, i.e. those for which $\mathrm{Ker}(f) = G$ or $\mathrm{Ker}(f) = \{e\}$.
Comments

- $(n\mathbb{Z}, +)$ are abelian groups and hence normal subgroups of $\mathbb{Z}$: $n\mathbb{Z} \lhd \mathbb{Z}$.
- $(\mathbb{F}_{>0}, \times) \lhd (\mathbb{F}^*, \times)$.
- Group 6 in table 4.2.1, $(\{0, 1, 2, 3, \ldots, (n-1)\}, +\ \mathrm{mod}\ n)$, is isomorphic to group 8, $(\{e, g, g^2, g^3, \ldots, g^{n-1}\},\ g^k \circ g^l = g^{(k+l)\ \mathrm{mod}\ n})$, with the group isomorphism being $f(1) = g$.
- $D_n < S_n$ and $D_n$ is not a normal subgroup in general.
- $\mathrm{Sign}(P \in S_n) \in \mathbb{Z}_2$ is a group homomorphism. Consequently the alternating group $A_n \equiv \{P \in S_n \mid \mathrm{Sign}(P) = 1\}$ is a normal subgroup of $S_n$, as $A_n \equiv \mathrm{Ker}(\mathrm{Sign})$.
- The determinant $\mathrm{Det}$ is a group homomorphism: $\mathrm{Det}: GL(n, \mathbb{F}) \to (\mathbb{F}^*, \times)$. Hence:
  - $SL(n, \mathbb{F}) \lhd GL(n, \mathbb{F})$ as $SL(n, \mathbb{F}) \equiv \mathrm{Ker}(\mathrm{Det})$,
  - $SO(n) \lhd O(n)$ and
  - $SU(n) \lhd U(n)$.
  And so
  - $\frac{GL(n,\mathbb{F})}{SL(n,\mathbb{F})} \cong (\mathbb{F}^*, \times)$,
  - $\frac{O(n)}{SO(n)} \cong \mathbb{Z}_2$ and
  - $\frac{U(n)}{SU(n)} \cong U(1) \equiv \{z \in \mathbb{C},\ |z| = 1\}$.
- The centre of $SU(2)$, denoted $Z(SU(2))$, is $\mathbb{Z}_2$, and one can show that the coset group $\frac{SU(2)}{\mathbb{Z}_2} \cong SO(3)$.
There are a number of simple ways to create new groups from known groups, for example:

(1.) Given a group $G$, identify a subgroup $H$. If this is normal, $H \lhd G$, then $\frac{G}{H}$ is a group.

(2.) Given two groups $G$ and $G'$, find a group homomorphism $f: G \to G'$ such that $\mathrm{Ker}(f) \lhd G$; then $\frac{G}{\mathrm{Ker}(f)} \cong G'$ and we observe as a corollary that $\mathrm{Ker}(f)$ is a group.

(3.) One can form the direct product of groups to create more complicated groups. The direct product of two groups $G$ and $H$ is denoted $G \times H$ and has composition law:
$$(g_1, h_1) \bullet (g_2, h_2) \equiv (g_1 \circ_G g_2,\ h_1 \circ_H h_2) \tag{4.42}$$
where $g_1, g_2 \in G$, $h_1, h_2 \in H$, $\circ_G$ is the composition law on $G$ and $\circ_H$ is the composition law on $H$. E.g. the direct product $\mathbb{R} \times \mathbb{R}$ has the composition law corresponding to two-dimensional real vector addition, i.e. $(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$. The direct product of a group $G$ with itself, $G \times G$, has a natural subgroup $\Delta(G)$ called the diagonal, defined by $\Delta(G) \equiv \{(g, g) \in G \times G \mid g \in G\}$.

(4.) If $X$ is a set and $G$ a group such that there exist maps $f: X \to G$, then the functions $f$ with the composition law
$$f_1 \bullet f_2(x) \equiv f_1(x) \circ_G f_2(x) \tag{4.43}$$
where $x \in X$ form a group. For example, if $X = S^1$ the set of maps of $X$ into $G$ forms the loop group of $G$.
The finite simple groups have now all been identified; the quest to classify them is universally accepted as having been completed in the 1980s. In addition to familiar infinite families, such as the cyclic groups $\mathbb{Z}_n$ with $n$ prime and the alternating groups $A_n$ ($n \geq 5$), there are further infinite series (the groups of Lie type) and twenty-six sporadic groups. These include:

- the Mathieu groups (e.g. $|M_{24}| = 2^{10} \cdot 3^3 \cdot 5 \cdot 7 \cdot 11 \cdot 23 = 244{,}823{,}040$),
- the Janko groups (e.g. $|J_4| \approx 8.67 \times 10^{19}$),
- the Conway groups (e.g. $|Co_1| \approx 4.16 \times 10^{18}$),
- the Fischer groups (e.g. $|Fi_{24}| \approx 1.26 \times 10^{24}$) and
- the Monster group ($|M| \approx 8.08 \times 10^{53}$).
Definition Let $G$ be a group and $X$ be a set. The (left) action of $G$ on $X$ is a map taking $G \times X \to X$ and denoted²
$$(g, x) \to g \cdot x \equiv T_g(x) \tag{4.44}$$
that satisfies

(i.) $(g_1 \circ g_2) \cdot x = g_1 \cdot (g_2 \cdot x)$ $\forall g_1, g_2 \in G$, $x \in X$ and

(ii.) $e \cdot x = x$ $\forall x \in X$, where $e$ is the identity element in $G$.

The set $X$ is called a (left) $G$-set.

²Here we use $T_g$ to denote the left-translation by $g$, but we could similarly define the right-translation with the group element acting on the set from the right-hand side.
Definition The orbit of $x \in X$ under the $G$-action is
$$G \cdot x \equiv \{x' \in X \mid x' = g \cdot x,\ g \in G\}. \tag{4.45}$$

Definition The stabiliser subgroup of $x \in X$ is the group of all $g \in G$ such that $g \cdot x = x$, i.e.
$$G_x \equiv \{g \in G \mid g \cdot x = x\}. \tag{4.46}$$

Definition The fundamental domain is the subset $X_F \subset X$ such that

(i.) $x \in X_F \Rightarrow g \cdot x \notin X_F$ $\forall g \in G \setminus \{e\}$ and

(ii.) $X = \bigcup_{g \in G} g \cdot X_F$.
Examples

(1.) $S_n$ acts on the set $\{1, 2, 3, \ldots, n\}$.

(2.) A group $G$ can act on itself in three canonical ways:

(i.) left translation: $T^{(L)}_{g_1}(g_2) = g_1 \circ g_2$,

(ii.) right translation: $T^{(R)}_{g_1}(g_2) = g_2 \circ g_1$ and

(iii.) by conjugation (also called the group adjoint action): $T^{(R)}_{g_1^{-1}} \circ T^{(L)}_{g_1}(g_2) = g_1 \circ g_2 \circ g_1^{-1} \equiv \mathrm{Ad}_{g_1}(g_2)$.

(3.) $SL(2, \mathbb{Z})$ acts on the set of points in the upper half-plane $H \equiv \{z \in \mathbb{C} \mid \mathrm{Im}(z) > 0\}$ by the Möbius transformations:
$$\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}, z\right) \to \frac{az + b}{cz + d} \in H \tag{4.47}$$
Problem 4.3.3. Consider the Klein four-group, $V_4$ (named after Felix Klein), consisting of the four elements $e, a, b, c$ and defined by the relations:
$$a^2 = b^2 = c^2 = e, \quad ab = c, \quad bc = a \quad\text{and}\quad ac = b$$

(i.) Show that $V_4$ is abelian.

(ii.) Show that $V_4$ is isomorphic to the direct product of cyclic groups $\mathbb{Z}_2 \times \mathbb{Z}_2$. To do this choose a suitable basis of $\mathbb{Z}_2 \times \mathbb{Z}_2$ and group composition rule, and use it to show that the basis elements of $\mathbb{Z}_2 \times \mathbb{Z}_2$ have the same relations as those of $V_4$.
4.3.1 The First Isomorphism Theorem

The first isomorphism theorem combines many of the observations we have made in the preceding section.

Theorem 4.3.2. (The First Isomorphism Theorem) Let $G$ and $G'$ be groups and let $f: G \to G'$ be a group homomorphism. Then the image of $f$ is isomorphic to the coset group $\frac{G}{\mathrm{Ker}(f)}$. If $f$ is a surjective map then $G' \cong \frac{G}{\mathrm{Ker}(f)}$.
Proof. Let $K$ denote the kernel of $f$ and $H$ denote the image of $f$. Define a map $\phi: \frac{G}{K} \to H$ by
$$\phi(g \circ K) = f(g) \tag{4.48}$$
where $g \in G$. Let us check that $\phi$ is well-defined, in that it maps different representatives of a coset $g \circ K$ to the same image $f(g)$. Suppose that $g_1 \circ K = g_2 \circ K$; then $g_1^{-1} \circ g_2 \in K$ and
$$\begin{aligned}\phi(g_1 \circ K) &= f(g_1) \\ &= f(g_1) \bullet e' \\ &= f(g_1) \bullet f(g_1^{-1} \circ g_2) \\ &= f(g_1 \circ g_1^{-1} \circ g_2) \\ &= f(g_2) \\ &= \phi(g_2 \circ K).\end{aligned} \tag{4.49}$$
$\phi$ is a group homomorphism, as
$$\begin{aligned}\phi(g_1 \circ K) \bullet \phi(g_2 \circ K) &= f(g_1) \bullet f(g_2) \\ &= f(g_1 \circ g_2) \\ &= \phi((g_1 \circ g_2) \circ K) \\ &= \phi((g_1 \circ K) \circ (g_2 \circ K))\end{aligned} \tag{4.50}$$
as $K \lhd G$. To prove that $\phi$ is an isomorphism we must show it is surjective (onto) and injective (one-to-one). For any $h \in H$ we have by the definition of $H$ that there exists $g \in G$ such that $f(g) = h$; hence $h = f(g) = \phi(g \circ K)$ and $\phi$ is surjective. To show that $\phi$ is injective let us assume the contrary statement, that two distinct cosets ($g_1 \circ K \neq g_2 \circ K$) are mapped to the same element, $f(g_1) = f(g_2)$. As $f$ is a homomorphism $f(g_1^{-1} \circ g_2) = e'$, hence $g_1^{-1} \circ g_2 \in K$ and so $g_1 \circ K = g_1 \circ (g_1^{-1} \circ g_2 \circ K) = g_2 \circ K$, contradicting our assumption that $g_1 \circ K \neq g_2 \circ K$. Hence $\phi$ is injective. As $\phi$ is both surjective and injective it is a bijection. The inverse map $\phi^{-1}(f(g)) = g \circ K$ is also a homomorphism:
$$\begin{aligned}\phi^{-1}(f(g_1) \bullet f(g_2)) &= \phi^{-1}(f(g_1 \circ g_2)) \\ &= (g_1 \circ g_2) \circ K \\ &= (g_1 \circ K) \circ (g_2 \circ K) \\ &= \phi^{-1}(f(g_1)) \circ \phi^{-1}(f(g_2))\end{aligned} \tag{4.51}$$
as well as a bijection. Hence $\phi$ is a group isomorphism and $\frac{G}{\mathrm{Ker}(f)} \cong H$. If $f$ is surjective onto $G'$ then $H = G'$ and $\frac{G}{\mathrm{Ker}(f)} \cong G'$.
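The theorem can also be checked concretely. The sketch below (our own construction, for illustration) takes $f = \mathrm{Sign}: S_3 \to \mathbb{Z}_2$, whose kernel is $A_3$, and verifies that $\phi(g \circ K) = \mathrm{Sign}(g)$ is constant on each of the two cosets, realising $S_3/A_3 \cong \mathbb{Z}_2$:

```python
from itertools import permutations

# First isomorphism theorem for f = Sign : S_3 -> Z_2 = {+1, -1}.
# Ker(Sign) = A_3, and the two cosets of A_3 map bijectively onto {+1, -1}.

def sign(p):
    # Sign via the number of inversions of the one-line permutation p.
    inv = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return (-1) ** inv

def compose(s, t):
    return tuple(s[t[i]] for i in range(3))

S3 = [tuple(q) for q in permutations(range(3))]
A3 = [p for p in S3 if sign(p) == 1]       # the kernel

cosets = {tuple(sorted(compose(g, h) for h in A3)) for g in S3}
print(len(cosets), "cosets")               # 2 cosets: A_3 and its complement
# phi(g A_3) = Sign(g) is well defined: the sign is constant on each coset.
for c in cosets:
    assert len({sign(p) for p in c}) == 1
```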
4.4 Some Representation Theory

Definition A representation of a group on a vector space $V$ is a group homomorphism $\Pi: G \to GL(V)$.

In other words, a representation is a way to write the group $G$ as matrices acting on a vector space which preserves the group composition law. Many groups are naturally written as matrices, e.g. $GL(n, \mathbb{F})$, $SL(n, \mathbb{F})$, $SO(n)$, $O(n)$, $U(n)$, $SU(n)$ etc. (where $\mathbb{F}$ stands for $\mathbb{Z}, \mathbb{R}, \mathbb{Q}, \mathbb{C}, \ldots$); however there may be numerous ways to write the group elements as matrices. In addition, not all groups can be represented as matrices, e.g. $S_\infty$ (the infinite symmetric group): try writing out an $\infty \times \infty$ matrix! Similarly $GL(\infty, \mathbb{F})$, $SL(\infty, \mathbb{F}), \ldots$ for that matter. Here $V$ is called the representation space and the dimension of the representation is the dimension of the vector space $V$, i.e. $\mathrm{Dim}(V)$.
Definition If a representation is such that $\mathrm{Ker}(\Pi) = \{e\}$, where $e$ is the identity element of $G$, then $\Pi$ is a faithful representation.

That $\mathrm{Ker}(\Pi)$ is trivial indicates that $\Pi$ is injective (one-to-one). For suppose $\Pi$ were not injective, so that $\Pi(g_1) = \Pi(g_2)$ where $g_1 \neq g_2$ for $g_1, g_2 \in G$; then as $\Pi$ is a homomorphism
$$\Pi(g_2^{-1} \circ g_1) = I \tag{4.52}$$
where $I$ is the identity matrix acting on $V$. Hence $g_2^{-1} \circ g_1 \in \mathrm{Ker}(\Pi)$ and the kernel would be non-trivial.

Definition A representation $\Pi_1: G \to GL(V_1)$ is equivalent to a second representation $\Pi_2: G \to GL(V_2)$ if there exists an invertible linear map $T: V_1 \to V_2$ such that
$$T\Pi_1(g) = \Pi_2(g)T \qquad \forall g \in G \tag{4.53}$$
The map $T$ is called the intertwiner of the representations $\Pi_1$ and $\Pi_2$.
Definition $W \subseteq V$ is an invariant subspace of a representation $\Pi: G \to GL(V)$ if $\Pi(g)W \subseteq W$ for all $g \in G$.

$W$ is called a subrepresentation space, and if such an invariant subspace exists one can evidently construct a representation of $G$ whose dimension is smaller than that of $\Pi$ (as $\mathrm{Dim}(W) < \mathrm{Dim}(V)$) by restricting the action of $\Pi$ to its action on $W$. The representations which possess no invariant subspaces are special.

Definition An irreducible representation $\Pi: G \to GL(V)$ contains no non-trivial invariant subspaces of $V$.

That is, there do not exist any subspaces $W \subset V$ such that $\Pi(g)W \subseteq W$ $\forall g \in G$, except $W = V$ or $W = \{0\}$. The irreducible representations are often referred to by the shorthand 'irrep', and they are the basic building blocks of all the other, reducible, representations of $G$. They are the prime numbers of representation theory.
4.4.1 Schur's Lemma

Theorem 4.4.1. (Schur's lemma, first form) Let $\Pi_1: G \to GL(V)$ and $\Pi_2: G \to GL(W)$ be irreducible representations of $G$ and let $T: V \to W$ be an intertwining map between $\Pi_1$ and $\Pi_2$. Then either $T = 0$ (the zero map) or $T$ is an isomorphism.

Proof. $T$ is an intertwining map, so $T\Pi_1(g) = \Pi_2(g)T$ for all $g \in G$. First we show that $\mathrm{Ker}(T)$ is an invariant subspace of $V$: if $v \in \mathrm{Ker}(T)$ then $Tv = 0$ (as the identity element on the vector space is the zero vector under vector addition), therefore
$$T\Pi_1(g)v = \Pi_2(g)T(v) = 0 \quad\Rightarrow\quad \Pi_1(g)v \in \mathrm{Ker}(T) \quad \forall v \in \mathrm{Ker}(T). \tag{4.54}$$
Hence $\mathrm{Ker}(T)$ is an invariant subspace of $V$ under the action of $\Pi_1(G)$. As $\Pi_1(G)$ is an irreducible representation of $G$, $\mathrm{Ker}(T) = \{0\}$ or $V$. If $\mathrm{Ker}(T) = V$ then $T$ is a map sending all $v \in V$ to $0 \in W$ (the zero map) and $T = 0$. If $\mathrm{Ker}(T) = \{0\} \subset V$ then $T$ is an injective map. If $T$ is injective and in addition surjective then it is an isomorphism, so it remains for us to show that if $T$ is not the zero map it is a surjective map. We will do this by proving that the image of $T$ is an invariant subspace of $W$. Let the image of a vector $v \in V$ be denoted $w \in W$, i.e. $T(v) = w$; then
$$\Pi_2(g)w = \Pi_2(g)T(v) = T(\Pi_1(g)v) \in \mathrm{Im}(T) \quad \forall g \in G \tag{4.55}$$
and so the image of $T$ is an invariant subspace of $W$. As $\Pi_2$ is an irreducible representation it has no non-trivial invariant subspaces; hence $\mathrm{Im}(T) = \{0\}$ or $W$. If the image of $T$ is the zero vector then $T$ is the zero map, otherwise if the image of $T$ is $W$ then $T$ is a surjective map. Consequently either $T = 0$ or $T$ is an isomorphism between $V$ and $W$.
Theorem 4.4.2. (Schur's lemma, second form) If $T: V \to V$ is an intertwiner from an irreducible representation to itself, and $V$ is a finite-dimensional complex vector space, then $T = \lambda I$ for some $\lambda \in \mathbb{C}$.

Proof. We have $T\Pi(g) = \Pi(g)T$, and as $V$ is a complex vector space one can always solve the equation $\det(T - \lambda I) = 0$ to find a complex eigenvalue $\lambda$⁴. Hence $Tv = \lambda v$, where $v$ is an eigenvector of $T$, and
$$T\Pi(g)v = \Pi(g)Tv = \lambda\Pi(g)v \quad \forall g \in G \tag{4.56}$$
So $\Pi(g)v$ is another eigenvector of $T$ with eigenvalue $\lambda$. Hence the $\lambda$-eigenspace of $T$ is an invariant subspace of $\Pi(G)$. As $\Pi$ is an irreducible representation, the $\lambda$-eigenspace of $T$ is either $\{0\}$ or $V$ itself. If we assume $V$ to be non-trivial then at least one eigenvalue exists, and so the $\lambda$-eigenspace of $T$ is $V$ itself. Therefore
$$Tv = \lambda v \quad \forall v \in V \quad\Rightarrow\quad T = \lambda I. \tag{4.57}$$
A corollary of Schur's lemma is that if there exists a pair of intertwining maps $T_1: V \to W$ and $T_2: V \to W$ which are both non-zero, then $T_1 = \lambda T_2$ for some $\lambda \in \mathbb{C}$. For if $T_2$ is non-zero then it is an isomorphism of $V$ and $W$, and its inverse map $T_2^{-1}: W \to V$ is also an intertwiner. Now
$$T_1T_2^{-1}\Pi_2(g) = T_1\Pi_1(g)T_2^{-1} = \Pi_2(g)T_1T_2^{-1} \tag{4.58}$$
hence $T_1T_2^{-1}: W \to W$ intertwines $\Pi_2$ with itself, and by Schur's lemma (second form) we have $T_1T_2^{-1} = \lambda I$ and so $T_1 = \lambda T_2$ for some $\lambda \in \mathbb{C}$.
Problem 4.4.1. If $\Pi(G)$ is a finite-dimensional representation of a group $G$, show that the matrices $\Pi^*(g)$ also form a representation, where $\Pi^*(g)$ is the complex-conjugate of $\Pi(g)$.

⁴This gives a polynomial in $\lambda$ which always has a solution over $\mathbb{C}$, or indeed over any algebraically closed field.
Problem 4.4.2. The representation $\Pi^*(g)$ may or may not be equivalent to $\Pi(g)$. If they are equivalent then there exists an intertwining map, $T$, such that:
$$\Pi^*(g) = T^{-1}\Pi(g)T$$
Show that if $\Pi(g)$ is irreducible then $TT^* = \lambda I$.

Problem 4.4.3. If $\Pi(g)$ is a unitary representation on $\mathbb{C}^n$, show that $TT^* = \lambda I$. (Hint: Make use of the fact that the inner product on $\mathbb{C}^n$ is $\langle v, w\rangle = v^\dagger w$, where $v, w \in \mathbb{C}^n$, to find a relation between $\Pi^\dagger$ and $\Pi$.) Show that $T$ may be redefined so that $\lambda = \pm 1$ and that $T$ is either symmetric or antisymmetric.
Problem 4.4.4. Let $G$ be an abelian group. Show that
$$\Pi(g_2) = \Pi(g_1)^{-1}\Pi(g_2)\Pi(g_1)$$
where $g_1, g_2 \in G$ and $\Pi$ is an irreducible representation of $G$. Hence show that every complex irreducible representation of an abelian group is one-dimensional, by proving that $\Pi(g) = \lambda I$ for all $g \in G$, where $\lambda \in \mathbb{C}$.
Problem 4.4.5. Prove that a representation of $G$ of dimension $n + m$ having the form:
$$\Pi(g) = \begin{pmatrix} A(g) & C(g) \\ 0 & B(g) \end{pmatrix} \qquad \forall g \in G$$
is reducible. Here $A(g)$ is an $n \times n$ matrix, $B(g)$ is an $m \times m$ matrix, $C(g)$ is an $n \times m$ matrix and $0$ is an empty $m \times n$ matrix, where $n$ and $m$ are integers and $n > 0$.

Problem 4.4.6. The affine group consists of affine transformations $(A, b)$ which act on a $D$-dimensional vector $x$ as:
$$(A, b)x = Ax + b$$
Find, with justification, a $(D+1)$-dimensional reducible representation of the affine group of transformations.
Definition Let $V$ be a vector space endowed with an inner product $\langle\ ,\ \rangle$. A representation $\Pi: G \to GL(V)$ is called unitary if the $\Pi(g)$ are unitary operators, i.e.
$$\langle \Pi(g)v, \Pi(g)w\rangle = \langle v, w\rangle \qquad \forall g \in G,\ v, w \in V. \tag{4.59}$$

Definition Let $\Pi: G \to GL(V)$ be a representation on a finite-dimensional vector space $V$; then the character of $\Pi$ is the function $\chi_\Pi: G \to \mathbb{C}$ defined by
$$\chi_\Pi(g) = \mathrm{Tr}(\Pi(g)) \tag{4.60}$$
where $\mathrm{Tr}$ is the trace.

Notice that $\chi_\Pi(e) = \mathrm{Tr}(\Pi(e)) = \mathrm{Tr}(I) = \mathrm{Dim}(V)$ is the dimension of the representation. The character is constant on the conjugacy classes of a group $G$, as
$$\begin{aligned}\chi_\Pi(g \circ h \circ g^{-1}) &= \mathrm{Tr}(\Pi(g \circ h \circ g^{-1})) \\ &= \mathrm{Tr}(\Pi(g)\Pi(h)\Pi(g^{-1})) \\ &= \mathrm{Tr}(\Pi(h)) \\ &= \chi_\Pi(h),\end{aligned} \tag{4.61}$$
where we have used the cyclicity of the trace. Any function which is invariant over the conjugacy classes is called a class function. If $\Pi$ is a unitary representation then
$$\chi_\Pi(g^{-1}) = \mathrm{Tr}(\Pi(g^{-1})) = \mathrm{Tr}(\Pi(g)^{-1}) = \mathrm{Tr}(\Pi(g)^\dagger) = \overline{\chi_\Pi(g)}. \tag{4.62}$$
If $\Pi_1$ and $\Pi_2$ are equivalent representations (with intertwining map $T$) then they have the same characters, as
$$\begin{aligned}\chi_{\Pi_1}(g) &= \mathrm{Tr}(\Pi_1(g)) \\ &= \mathrm{Tr}(T^{-1}\Pi_2(g)T) \\ &= \mathrm{Tr}(\Pi_2(g)) \\ &= \chi_{\Pi_2}(g)\end{aligned} \tag{4.63}$$
and conversely if two representations of $G$ have the same characters for all $g \in G$ then they are equivalent representations.
4.4.2 The Direct Sum and Tensor Product

Given two representations $\Pi_1: G \to GL(V_1)$ and $\Pi_2: G \to GL(V_2)$ of a group $G$, one can form two important representations:

1. The direct sum, $\Pi_1 \oplus \Pi_2: G \to GL(V_1 \oplus V_2)$, such that $(\Pi_1 \oplus \Pi_2)(g) = \Pi_1(g) \oplus \Pi_2(g)$. This is a homomorphism, as
$$\begin{aligned}(\Pi_1 \oplus \Pi_2)(g_1 \circ g_2) &= \begin{pmatrix} \Pi_1(g_1 \circ g_2) & 0 \\ 0 & \Pi_2(g_1 \circ g_2) \end{pmatrix} \\ &= \begin{pmatrix} \Pi_1(g_1)\Pi_1(g_2) & 0 \\ 0 & \Pi_2(g_1)\Pi_2(g_2) \end{pmatrix} \\ &= \begin{pmatrix} \Pi_1(g_1) & 0 \\ 0 & \Pi_2(g_1) \end{pmatrix}\begin{pmatrix} \Pi_1(g_2) & 0 \\ 0 & \Pi_2(g_2) \end{pmatrix} \\ &= (\Pi_1 \oplus \Pi_2)(g_1)\,(\Pi_1 \oplus \Pi_2)(g_2)\end{aligned} \tag{4.64}$$
If $V_1$ is the vector space with basis $\{e_1, e_2, \ldots, e_n\}$ and $V_2$ is the vector space with basis $\{f_1, f_2, \ldots, f_m\}$, then $V_1 \oplus V_2$ has the basis $\{e_1, e_2, \ldots, e_n, f_1, f_2, \ldots, f_m\}$, i.e. we can write this using the direct product as $V_1 \oplus V_2 \equiv \{(v_1, v_2) \in V_1 \times V_2 \mid v_1 \in V_1,\ v_2 \in V_2\}$, with vector addition and scalar multiplication acting as
$$(v_1, v_2) + (v'_1, v'_2) = (v_1 + v'_1,\ v_2 + v'_2), \qquad a(v_1, v_2) = (av_1, av_2) \tag{4.65}$$
where $v_1, v'_1 \in V_1$, $v_2, v'_2 \in V_2$ and $a$ is a constant. In this notation the basis of $V_1 \oplus V_2$ is
$$\{(e_1, 0), (e_2, 0), \ldots, (e_n, 0), (0, f_1), (0, f_2), \ldots, (0, f_m)\} \cong \{e_1, e_2, \ldots, e_n, f_1, f_2, \ldots, f_m\}.$$
Hence $\mathrm{Dim}(V_1 \oplus V_2) = \mathrm{Dim}(V_1) + \mathrm{Dim}(V_2) = n + m$.
Example Let $G$ be $\mathbb{Z}_2 \equiv \{e, g \mid e = \mathrm{Id},\ g^2 = e\}$, with $V_1 = \mathbb{R}^1$ and $V_2 = \mathbb{R}^2$, so that
$$\Pi_1(e) = 1, \qquad \Pi_1(g) = -1$$
$$\Pi_2(e) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \Pi_2(g) = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}; \tag{4.66}$$
now $V_1 \oplus V_2 = \mathbb{R}^3$, with
$$(\Pi_1 \oplus \Pi_2)(e) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad (\Pi_1 \oplus \Pi_2)(g) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \tag{4.67}$$
2. The tensor product, $\Pi_1 \otimes \Pi_2: G \to GL(V_1 \otimes V_2)$, such that $(\Pi_1 \otimes \Pi_2)(g) = \Pi_1(g) \otimes \Pi_2(g)$. The tensor product is the most general bilinear product, and so its definition may seem obscure at first sight. This is a homomorphism, as
$$\begin{aligned}(\Pi_1 \otimes \Pi_2)(g_1 \circ g_2) &= \Pi_1(g_1 \circ g_2) \otimes \Pi_2(g_1 \circ g_2) \\ &= \Pi_1(g_1)\Pi_1(g_2) \otimes \Pi_2(g_1)\Pi_2(g_2) \\ &= (\Pi_1 \otimes \Pi_2)(g_1)\,(\Pi_1(g_2) \otimes \Pi_2(g_2)) \\ &= (\Pi_1 \otimes \Pi_2)(g_1)\,(\Pi_1 \otimes \Pi_2)(g_2)\end{aligned} \tag{4.68}$$
If $V_1$ is the vector space with basis $\{e_1, e_2, \ldots, e_n\}$ and $V_2$ is the vector space with basis $\{f_1, f_2, \ldots, f_m\}$, then $V_1 \otimes V_2$ has the basis
$$\{e_1 \otimes f_1, e_1 \otimes f_2, \ldots, e_1 \otimes f_m,\ e_2 \otimes f_1, e_2 \otimes f_2, \ldots, e_2 \otimes f_m,\ \ldots,\ e_n \otimes f_1, e_n \otimes f_2, \ldots, e_n \otimes f_m\},$$
i.e. the basis is $\{e_i \otimes f_j \mid i = 1, 2, \ldots, \mathrm{Dim}(V_1),\ j = 1, 2, \ldots, \mathrm{Dim}(V_2)\}$. Hence $\mathrm{Dim}(V_1 \otimes V_2) = \mathrm{Dim}(V_1) \times \mathrm{Dim}(V_2) = nm$. The tensor product of two vector spaces $V$ and $W$ satisfies
$$\begin{aligned}(v_1 + v_2) \otimes w_1 &= v_1 \otimes w_1 + v_2 \otimes w_1 \\ v_1 \otimes (w_1 + w_2) &= v_1 \otimes w_1 + v_1 \otimes w_2 \\ av \otimes w &= v \otimes aw = a(v \otimes w)\end{aligned} \tag{4.69}$$
where $v, v_1, v_2 \in V$, $w, w_1, w_2 \in W$ and $a$ is a constant.
Example As for the direct sum, consider the example where $G$ is $\mathbb{Z}_2$ and $\Pi_1$ and $\Pi_2$ are the representations given explicitly in equation (4.66) above. Then the basis elements for $V_1 \otimes V_2$ are $\{e_1 \otimes f_1, e_1 \otimes f_2\}$, where $e_1$ is the basis vector for $\mathbb{R}$ and $f_1, f_2$ are the basis vectors for $\mathbb{R}^2$, and the tensor product representation is
$$(\Pi_1 \otimes \Pi_2)(e) = 1 \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad (\Pi_1 \otimes \Pi_2)(g) = -1 \otimes \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}.$$
These act on $\mathbb{R} \otimes \mathbb{R}^2$ by
$$(\Pi_1 \otimes \Pi_2)(e)(v_1 \otimes v_2) = v_1 \otimes v_2, \qquad (\Pi_1 \otimes \Pi_2)(g)(v_1 \otimes v_2) = -v_1 \otimes \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}v_2 = v_1 \otimes v_2, \tag{4.70}$$
which is the trivial representation acting on the two-dimensional vector space $\mathbb{R} \otimes \mathbb{R}^2 = \mathbb{R}^2$. A slightly less trivial example involves the representation $\Pi_3$ of $\mathbb{Z}_2$ on $\mathbb{R}^2$ given by
$$\Pi_3(e) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \Pi_3(g) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{4.71}$$
The tensor product representation $\Pi_1 \otimes \Pi_3$ acts on $\mathbb{R} \otimes \mathbb{R}^2$ as
$$(\Pi_1 \otimes \Pi_3)(e) = 1 \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad (\Pi_1 \otimes \Pi_3)(g) = -1 \otimes \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix};$$
these act on $\mathbb{R} \otimes \mathbb{R}^2$ by
$$(\Pi_1 \otimes \Pi_3)(e)(v_1 \otimes v_2) = v_1 \otimes v_2, \qquad (\Pi_1 \otimes \Pi_3)(g)(v_1 \otimes v_2) = -v_1 \otimes \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}v_2 = v_1 \otimes \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}v_2 \tag{4.72}$$
which is non-trivial.
One may introduce scalar products on the direct sum and tensor product spaces:
$$\begin{aligned}\langle v_1 \oplus w_1,\ v_2 \oplus w_2\rangle_{V \oplus W} &\equiv \langle v_1, v_2\rangle_V + \langle w_1, w_2\rangle_W \\ \langle v_1 \otimes w_1,\ v_2 \otimes w_2\rangle_{V \otimes W} &\equiv \langle v_1, v_2\rangle_V\,\langle w_1, w_2\rangle_W\end{aligned} \tag{4.73}$$
as well as the character function:
$$\begin{aligned}\chi_{\Pi_1 \oplus \Pi_2}(g) &= \mathrm{Tr}(\Pi_1(g)) + \mathrm{Tr}(\Pi_2(g)) \\ \chi_{\Pi_1 \otimes \Pi_2}(g) &= \mathrm{Tr}_V(\Pi_1(g))\,\mathrm{Tr}_W(\Pi_2(g)).\end{aligned} \tag{4.74}$$
One might think that all the information about these product representations is contained already in $V$ and $W$. However, consider the endomorphisms (the homomorphisms from a vector space to itself⁵) of $V \oplus W$, denoted $\mathrm{End}(V \oplus W)$. Any $A \in \mathrm{End}(V \oplus W)$ may be written
$$A = \begin{pmatrix} A_{VV} & A_{VW} \\ A_{WV} & A_{WW} \end{pmatrix} \tag{4.75}$$
where $A_{VV}: V \to V$, $A_{VW}: V \to W$ etc. That is, $A_{VV} \in \mathrm{End}(V)$ and $A_{WW} \in \mathrm{End}(W)$ do not generate all the endomorphisms of $V \oplus W$ (note that if $\mathrm{Dim}(V) = n$ and $\mathrm{Dim}(W) = m$ then $\mathrm{Dim}(\mathrm{End}(V \oplus W)) = (n+m)^2 \neq n^2 + m^2 = \mathrm{Dim}(\mathrm{End}(V)) + \mathrm{Dim}(\mathrm{End}(W))$). On the other hand, the endomorphisms of $V$ and $W$ do generate all the endomorphisms of the tensor product space $V \otimes W$, as $\mathrm{Dim}(\mathrm{End}(V \otimes W)) = n^2m^2 = \mathrm{Dim}(\mathrm{End}(V))\,\mathrm{Dim}(\mathrm{End}(W))$.

The direct sum never gives an irreducible representation, having two non-trivial invariant subspaces $V \oplus 0 \cong V$ and $0 \oplus W \cong W$. It is less straightforward with the tensor product to discover whether or not it gives an irreducible representation. Frequently one is interested in decomposing the tensor product into direct sums of irreducible sub-representations:
$$V \otimes W = U_1 \oplus U_2 \oplus \ldots \oplus U_n. \tag{4.76}$$

⁵If an endomorphism is invertible then the map is an automorphism.
To do this one must find an endomorphism (a change of basis) of $V \otimes W$ such that
$$T(\Pi_1 \otimes \Pi_2(g))T^{-1} = \tilde\Pi_1(g) \oplus \tilde\Pi_2(g) \oplus \ldots \oplus \tilde\Pi_n(g) \tag{4.77}$$
where $T \in \mathrm{End}(V \otimes W)$. The decomposition
$$\Pi(G) \otimes \Pi'(G) = \bigoplus_i a_i\Pi_i(G) \tag{4.78}$$
is called the Clebsch-Gordan decomposition. This is not always possible, but one can achieve this decomposition for one example central to quantum mechanics: $G = SU(2)$. It is a fact (which we will not prove here) that SU(2) has only one unitary irreducible representation for each vector space of dimension $\mathrm{Dim}(V) \equiv n + 1$. This $(n+1)$-dimensional representation is related to the irreducible representations of SO(3) associated to angular momentum in quantum mechanics via the group isomorphism $\frac{SU(2)}{\mathbb{Z}_2} \cong SO(3)$, which will be shown explicitly later in this chapter. In summary, representations of SU(2) may be labelled by $\mathrm{Dim}(V) = n + 1$, and the equivalent SO(3) representation is labelled by spin $j$. In fact $j = \frac{n}{2}$, hence as $n \in \mathbb{Z}^+$ then $j$ may take half-integer (fermions) as well as integer (bosons) values. When $j = 0$ then $n = 0$, so $\mathrm{Dim}(V) = 1$ is the trivial representation of SU(2); when $j = \frac{1}{2}$ then $n = 1$ and $\mathrm{Dim}(V) = 2$, giving the fundamental or standard representation of SU(2) as a two-by-two matrix; and when $j = 1$ then $n = 2$, giving $\mathrm{Dim}(V) = 3$, which is called the adjoint representation of SU(2). The Clebsch-Gordan decomposition rewrites the tensor product of two SU(2) irreducible representations $[j_1]$ and $[j_2]$, labelled using the spin, as a direct sum of irreducible representations:
$$[j_1] \otimes [j_2] = [j_1 + j_2] \oplus [j_1 + j_2 - 1] \oplus \ldots \oplus [|j_1 - j_2|]. \tag{4.79}$$
Some simple examples are
$$[0] \otimes [j] = [j] \tag{4.80}$$
One can quickly check that the tensor product has the same dimension as the direct sum. Note that $\mathrm{Dim}[j] = \mathrm{Dim}(V) = n + 1 = 2j + 1$, so that $\mathrm{Dim}([0] \otimes [j]) = 1 \times (2j + 1) = \mathrm{Dim}[j]$. Another short example is
$$[\tfrac{1}{2}] \otimes [j] = [\tfrac{1}{2} + j] \oplus [-\tfrac{1}{2} + j] \tag{4.81}$$
where we have $\mathrm{Dim}([\tfrac{1}{2}] \otimes [j]) = (2 \cdot \tfrac{1}{2} + 1)(2j + 1) = 4j + 2$, while the direct sum of representations has $\mathrm{Dim}([\tfrac{1}{2} + j] \oplus [-\tfrac{1}{2} + j]) = (2(\tfrac{1}{2} + j) + 1) + (2(-\tfrac{1}{2} + j) + 1) = 4j + 2$. Notice that the tensor products of the fundamental representation $[\tfrac{1}{2}]$ with itself generate all the other irreducible representations of SU(2), that is
$$[\tfrac{1}{2}] \otimes [\tfrac{1}{2}] = [1] \oplus [0] \qquad\text{Dimensions: } 2 \times 2 = 3 + 1$$
$$[1] \otimes [\tfrac{1}{2}] = [\tfrac{3}{2}] \oplus [\tfrac{1}{2}] \qquad\text{Dimensions: } 3 \times 2 = 4 + 2. \tag{4.82}$$
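This dimension bookkeeping is easy to automate. The sketch below checks that the rule (4.79) is dimensionally consistent for a few spins; the function name cg_spins is our own invention, purely for illustration.

```python
from fractions import Fraction

# Dimension check of the su(2) Clebsch-Gordan rule (4.79):
# dim([j1] x [j2]) = (2 j1 + 1)(2 j2 + 1) must equal the total dimension
# of the irreps [j1+j2], [j1+j2-1], ..., [|j1-j2|].

def cg_spins(j1, j2):
    spins, j = [], j1 + j2
    while j >= abs(j1 - j2):
        spins.append(j)
        j -= 1
    return spins

for j1, j2 in [(Fraction(1, 2), Fraction(1, 2)),
               (1, Fraction(1, 2)),
               (2, Fraction(3, 2))]:
    lhs = (2 * j1 + 1) * (2 * j2 + 1)
    rhs = sum(2 * j + 1 for j in cg_spins(j1, j2))
    print(j1, "x", j2, "->", cg_spins(j1, j2), lhs == rhs)
```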
For other groups the decomposition theory is more involved. To work out the Clebsch-Gordan coefficients one must know the inequivalent irreducible representations of the group, its conjugacy classes and its character table. If a representation of a group may itself be rewritten as a sum of representations, it is by definition not an irreducible representation: it is called a reducible representation.

Definition A representation $\Pi: G \to GL(V_n \oplus V_m)$ on a vector space of dimension $n + m$ is reducible if $\Pi(g)$ has the form
$$\Pi(g) = \begin{pmatrix} A(g) & C(g) \\ 0 & B(g) \end{pmatrix} \qquad \forall g \in G \tag{4.83}$$
where $A$ is an $n \times n$ matrix, $B$ is an $m \times m$ matrix, $C$ is an $n \times m$ matrix and $0$ is the empty $m \times n$ matrix.
Notice that
$$\begin{pmatrix} A(g) & C(g) \\ 0 & B(g) \end{pmatrix}\begin{pmatrix} v_n \\ 0_m \end{pmatrix} = \begin{pmatrix} A(g)v_n \\ 0_m \end{pmatrix} \tag{4.84}$$
where $0_m \in V_m$ is the $m$-dimensional zero vector and $v_n \in V_n$ is an $n$-dimensional vector. So we see that $V_n$ is an invariant subspace of $\Pi$, and so $\Pi$ is reducible. Furthermore, if we multiply two such matrices together we have
$$\begin{aligned}\Pi(g_1)\Pi(g_2) &= \begin{pmatrix} A(g_1) & C(g_1) \\ 0 & B(g_1) \end{pmatrix}\begin{pmatrix} A(g_2) & C(g_2) \\ 0 & B(g_2) \end{pmatrix} \\ &= \begin{pmatrix} A(g_1)A(g_2) & A(g_1)C(g_2) + C(g_1)B(g_2) \\ 0 & B(g_1)B(g_2) \end{pmatrix} \\ &= \Pi(g_1 \circ g_2) = \begin{pmatrix} A(g_1 \circ g_2) & C(g_1 \circ g_2) \\ 0 & B(g_1 \circ g_2) \end{pmatrix}\end{aligned} \tag{4.85}$$
hence we see that $A(g_1 \circ g_2) = A(g_1)A(g_2)$, and $A(g)$ is a representation of $G$ on the invariant subspace $V_n$. For finite groups the matrix $C$ is equivalent to the null matrix (by Maschke's theorem all reducible representations of a finite group are completely reducible). In this case the representation is said to be completely reducible:
$$\Pi(g) = A(g) \oplus B(g). \tag{4.86}$$
It does not follow that $A(G)$ and $B(G)$ are themselves irreducible, but if they are not then the process may be repeated until $\Pi(G)$ is expressed as a direct sum of irreducible representations.
4.5 Lie Groups

Many of the groups we have met so far have been parameterised by discrete variables, e.g. $\{e, g, g^2\}$ for $\mathbb{Z}_3$, but frequently the group actions we have met, e.g. $SO(n)$, $SU(n)$, $U(n)$, $Sp(n)$, have been described by continuous parameters. For example SO(2), describing rotations of $S^1$, is parameterised by $\theta$, which takes values in the continuous set $[0, 2\pi)$, and for each value of $\theta$ we find an element of SO(2):
$$R(\theta) = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \tag{4.87}$$
(one may check that $R(\theta)R^T(\theta) = I$ and $\mathrm{Det}(R(\theta)) = 1$). $R(\theta)$ is a two-dimensional representation of the abstract group SO(2). We may check that it is a faithful representation of SO(2): $R(0) = I$ and the kernel of the representation is trivial for $\theta \in [0, 2\pi)$. Incidentally, the two-dimensional representation is irreducible over $\mathbb{R}$ but it is reducible over $\mathbb{C}$. Over $\mathbb{C}$ we take as column vector
$$\begin{pmatrix} z \\ \bar z \end{pmatrix} = \begin{pmatrix} x + iy \\ x - iy \end{pmatrix} = \begin{pmatrix} re^{i\phi} \\ re^{-i\phi} \end{pmatrix}$$
and an SO(2) rotation takes
$$\begin{pmatrix} z \\ \bar z \end{pmatrix} \to \begin{pmatrix} z' \\ \bar z' \end{pmatrix} = \begin{pmatrix} re^{i(\phi+\theta)} \\ re^{-i(\phi+\theta)} \end{pmatrix} \tag{4.88}$$
that is,
$$R(\theta, \mathbb{C}) = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix} \tag{4.89}$$
There is a qualitative difference when we move from $\mathbb{R}$ to $\mathbb{C}$, as this matrix is block diagonal and hence reducible into two one-dimensional complex representations of $U(1) \cong SO(2)$. Geometrically the parameter $\theta$ defining the rotation parameterises the circle $S^1$.
For other continuous groups we may also make an identification with a geometry: e.g. $\mathbb{R} \setminus \{0\}$ under multiplication is associated with two open half-lines (the real line with zero removed); a second example is $SU(2) = \left\{\begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} \,\middle|\, |\alpha|^2 + |\beta|^2 = 1\right\}$, which as a set parameterises $S^3$. The proper notion for the geometric setting is the manifold, and each group discussed above is a manifold. Any geometric space one can imagine can be embedded in some Euclidean $\mathbb{R}^n$ as a surface of some dimension less than or equal to $n$. For example the circle $S^1 \subset \mathbb{R}^2$, and in general $S^{n-1} \subset \mathbb{R}^n$. No matter how extraordinary the curvature of the surface (so long as it remains well-defined), a manifold will have the appearance of being a Euclidean space at a sufficiently local scale. Consider $S^1 \subset \mathbb{R}^2$: sufficiently close to a point on $S^1$, the segment of $S^1$ appears identical to $\mathbb{R}^1$. The geometry of a manifold is found by piecing together these open and locally-Euclidean sets. Each open neighbourhood is called a chart and is equipped with a map that converts points $p \in M$, where $M$ is the manifold, to local Euclidean coordinates. Using these local coordinates one can carry out all the usual mathematics in $\mathbb{R}^n$. The global structure of a manifold is defined by how these open sets are glued together. Since a manifold is a very well-defined structure, these transition functions, encoding the gluing, are smooth. The study of manifolds is the beginning of learning about differential geometry.

Definition A Lie group is a differentiable manifold $G$ which is also a group, such that the group product $G \times G \to G$ and the inverse map $g \to g^{-1}$ are differentiable.

We will restrict our interest to matrix Lie groups in this foundational course; these are those Lie groups which are written as matrices, e.g. $SL(n, \mathbb{F})$, $SO(n)$, $SU(n)$, $Sp(n)$.
Definition A matrix Lie group $G$ is connected if, given any two matrices $A$ and $B$ in $G$, there exists a continuous path $A(t)$ with $0 \leq t \leq 1$ such that $A(0) = A$ and $A(1) = B$.

A matrix Lie group which is not connected can be decomposed into several connected pieces.

Theorem 4.5.1. If $G$ is a matrix Lie group then the component of $G$ connected to the identity is a subgroup of $G$. It is denoted $G_0$.

Proof. Let $A(t), B(t) \in G_0$ be continuous paths such that $A(0) = I$, $A(1) = A$, $B(0) = I$ and $B(1) = B$. Then $A(t)B(t)$ is a continuous path from $I$ to $AB$, hence $G_0$ is closed under the group product, and evidently $I \in G_0$. Also $A^{-1}(t)$, defined by $A(t)A^{-1}(t) = I$, is a continuous path from $I$ to $A^{-1}$, so $A^{-1} \in G_0$.

The groups $GL(n, \mathbb{C})$, $SL(n, \mathbb{C})$, $SL(n, \mathbb{R})$, $SO(n)$, $U(n)$ and $SU(n)$ are connected groups, while $GL(n, \mathbb{R})$ and $O(n)$ are not connected. For example, one can convince oneself that $O(n)$ is not connected by supposing that $A, B \in O(n)$ are such that $\mathrm{Det}(A) = +1$ and $\mathrm{Det}(B) = -1$. Then any path $A(t)$ such that $A(0) = A$ and $A(1) = B$ would give a continuous function $\mathrm{Det}(A(t))$ passing from $+1$ to $-1$. Since all $A \in O(n)$ satisfy $\mathrm{Det}(A) = \pm 1$, no such set of matrices forming a continuous path from $A$ to $B$ exists. A similar argument can be made for $GL(n, \mathbb{R})$, splitting it into components with $\mathrm{Det} > 0$ and $\mathrm{Det} < 0$.
4.6 Lie Algebras: Infinitesimal Generators

Let us now return to thinking like physicists. From this perspective we would like to think of Lie groups as continuous actions that can be realized by an infinitesimal transformation
$$g = 1 + i\epsilon T + \ldots, \tag{4.90}$$
where the ellipsis denotes higher order terms in $\epsilon \ll 1$. The factor of $i$ is for later convenience. Here we think of $g$ in terms of some representation. Thus we really should write
$$\Pi(g) = 1 + i\epsilon T + \ldots, \tag{4.91}$$
so that $T$ is a matrix and $1$ is the identity matrix. However, as physicists we will forget that we are talking about representations, since what we say applies to any representation. In general $g$ is subject to some restriction, such as unitarity. Thus the set of $T$'s that one finds is restricted. This defines the Lie algebra $\mathrm{Lie}(G)$: it is the set of operators $T$ that are required to generate the group infinitesimally.

There is an analogous notion of a representation of the Lie algebra to that of a representation of a group.

Definition A representation of a Lie algebra is a map $\Pi: \mathrm{Lie}(G) \to GL(V)$ such that $\Pi([A, B]) = [\Pi(A), \Pi(B)]$.
Let us look at an example: $U(N) = \{N \times N \text{ complex matrices } g \mid g^\dagger = g^{-1}\}$. This is a group since $1 \in U(N)$. By construction, if $g \in U(N)$ then $g^{-1} \in U(N)$, as $(g^{-1})^\dagger = g$. Finally, if $g_1, g_2 \in U(N)$ then $(g_1g_2)^{-1} = g_2^{-1}g_1^{-1} = g_2^\dagger g_1^\dagger = (g_1g_2)^\dagger$. What is the condition that $g = 1 + i\epsilon T \in U(N)$? Well, first note that the inverse of $g$ is $g^{-1} = 1 - i\epsilon T$, since
$$gg^{-1} = (1 + i\epsilon T)(1 - i\epsilon T) = 1 + \ldots \qquad g^{-1}g = (1 - i\epsilon T)(1 + i\epsilon T) = 1 + \ldots.$$
Thus for $g \in U(N)$ we require that
$$g^\dagger = g^{-1} \quad\Leftrightarrow\quad 1 - i\epsilon T^\dagger = 1 - i\epsilon T \quad\Leftrightarrow\quad T^\dagger = T \tag{4.92}$$
So the Lie algebra $\mathrm{Lie}(U(N))$ is the space of Hermitian matrices.
As we noted above, a group always acts on itself via conjugation. Thus if we have $g \in G$ we may consider an infinitesimal conjugation by $h = 1 + i\epsilon U$. Conjugation amounts to
$$g \to hgh^{-1} = (1 + i\epsilon U)g(1 - i\epsilon U) = g + i\epsilon(Ug - gU) + \ldots = g + i\epsilon[U, g] + \ldots. \tag{4.93}$$
If we further expand $g = 1 + iT$, the group action induces a commutator structure on the Lie algebra, since $i[U, T] \in \mathrm{Lie}(G)$. Thus if we have a basis $\{T_a\}$ of $\mathrm{Lie}(G)$ then there must exist constants, called structure constants, such that
$$[T_a, T_b] = if_{ab}{}^cT_c. \tag{4.94}$$
Since we are considering matrices, the product is automatically associative and a simple expansion shows that the brackets satisfy the Jacobi identity:
$$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0 \tag{4.95}$$
More generally (i.e. more abstractly) one must require this in addition. In other words, a Lie algebra is a vector space with an anti-symmetric product $[\ ,\ ]$ that satisfies the Jacobi identity. It turns out that the tangent space to a Lie group at the identity is a Lie algebra.

There is a classification of semi-simple Lie algebras, that is to say ones that are not direct sums of smaller Lie algebras. There are four infinite families along with five exceptional cases. These are listed in table 4.6.1.

1. $A_N = su(N+1) \equiv \{(N+1) \times (N+1) \text{ matrices } M \mid M^\dagger = M,\ \mathrm{tr}M = 0\}$
2. $B_N = so(2N+1) \equiv \{(2N+1) \times (2N+1) \text{ matrices } M \mid M^T = -M\}$
3. $C_N = sp(2N) \equiv \{2N \times 2N \text{ matrices } J \mid J^T\Omega + \Omega J = 0\}$, $\quad \Omega = \begin{pmatrix} 0 & 1_{N\times N} \\ -1_{N\times N} & 0 \end{pmatrix}$
4. $D_N = so(2N) \equiv \{2N \times 2N \text{ matrices } M \mid M^T = -M\}$
5. $E_6, E_7, E_8$
6. $F_4$
7. $G_2$

Table 4.6.1: The classification of semi-simple Lie algebras
You are presumably familiar with su(N+1), so(2N+1) and so(2N), which arise from the groups SU(N+1), SO(2N+1) and SO(2N). The symplectic algebra sp(2N) arises, for example, in Hamiltonian dynamics, where the vector space $\mathbb{R}^{2N}$ is the phase space that comes from combining $(q^i, p_i)$ into a single $2N$-vector. The matrix $\Omega$ then arises analogously to an inner product through $\langle q^i, p_j\rangle = -\langle p_j, q^i\rangle = \delta^i_j$ and is known as a symplectic product. Unfortunately the exceptional Lie algebras $E_6$, $E_7$, $E_8$, $F_4$, $G_2$ do not have a simple definition that we can give here.

What is the number $N$ associated to each Lie algebra? It is called the rank and is defined as the dimension of the Cartan subalgebra. What is the Cartan subalgebra? It is the maximal subspace of the Lie algebra that is spanned by mutually commuting generators.

Let us not continue with generalities and simply deal in detail with the simplest Lie groups, SU(2) and SO(3), and their Lie algebras su(2) and so(3). We will see that they have the same Lie algebra but they are not equal as groups. Rather, there is a 2-1 homomorphism $SU(2) \to SO(3)$. The reason that two different groups can have the same Lie algebra is that the Lie algebra only encodes infinitesimal transformations, and the finite transformations can differ.
4.7 Everything you wanted to know about SU(2) and SO(3) but were afraid to ask

First we start with SU(2). Definition: $SU(2) = \{2 \times 2 \text{ complex matrices } g \mid g^\dagger = g^{-1} \text{ and } \det g = 1\}$.

It is natural to think of this as also defining a representation of SU(2), in terms of its action on vectors in $\mathbb{C}^2$. But that would be getting ahead of ourselves: there are in fact infinitely many representations, which we will construct later.

Next we compute the Lie algebras. Clearly $SU(2) \subset U(2)$, and hence if we write $g = 1 + iT$ we require $T^\dagger = T$. We also have an extra condition:
$$\det g = \det(1 + iT) = 1 + i\,\mathrm{tr}(T) + \ldots \tag{4.96}$$
Thus we require that $\mathrm{tr}(T) = 0$ in addition to $T^\dagger = T$. The Pauli matrices form a natural basis for su(2):
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{4.97}$$
Thus any complex, traceless, Hermitian, $2 \times 2$ matrix is a real linear combination of the $\sigma_i$:
$$T \in su(2) \quad\Leftrightarrow\quad T = \frac{1}{2}\alpha^i\sigma_i = \frac{1}{2}\vec\alpha \cdot \vec\sigma. \tag{4.98}$$
The appearance of the $1/2$ will become apparent later. A little calculation shows that
$$\left[\frac{\sigma_i}{2}, \frac{\sigma_j}{2}\right] = i\epsilon_{ijk}\frac{\sigma_k}{2}. \tag{4.99}$$
To obtain group elements we exponentiate:
$$g = e^{i\alpha^i\sigma_i/2} \tag{4.100}$$
This is defined as an infinite sum, but it always converges. If we write
$$|\alpha| = \sqrt{(\alpha^1)^2 + (\alpha^2)^2 + (\alpha^3)^2} \qquad \vec n = \vec\alpha/|\alpha| \tag{4.101}$$
then an adaptation of the famous $e^{i\theta} = \cos\theta + i\sin\theta$ formula gives
$$g = \cos\left(\frac{|\alpha|}{2}\right) + i\,\vec n \cdot \vec\sigma\,\sin\left(\frac{|\alpha|}{2}\right). \tag{4.102}$$
In particular, all we have done is replace $i$ by $I = i\,\vec n \cdot \vec\sigma$, which still satisfies $I^2 = -1$. Here we see some global structure: $|\alpha| \in [0, 4\pi)$ covers all of SU(2).
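Equation (4.102) is easy to test numerically. The sketch below compares the closed form against a direct matrix exponential for an arbitrary test vector $\vec\alpha$ of our choosing:

```python
import numpy as np
from scipy.linalg import expm

# Check of (4.102): exp(i a.sigma/2) = cos(|a|/2) I + i (n.sigma) sin(|a|/2).

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
sigma = np.array([s1, s2, s3])

alpha = np.array([0.3, -1.1, 0.7])          # arbitrary test value
a = np.linalg.norm(alpha)
n = alpha / a

g_series = expm(1j * np.einsum('i,ijk->jk', alpha, sigma) / 2)
g_closed = (np.cos(a / 2) * np.eye(2)
            + 1j * np.sin(a / 2) * np.einsum('i,ijk->jk', n, sigma))

print(np.allclose(g_series, g_closed))                        # True
print(np.allclose(g_series @ g_series.conj().T, np.eye(2)))   # g is unitary
```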
Now let us turn to SO(3). Definition: $SO(3) = \{3 \times 3 \text{ real matrices } g \mid g^T = g^{-1} \text{ and } \det g = 1\}$.

In our conventions, with $g = 1 + iT$, we see that $T$ is pure imaginary and antisymmetric, $T^T = -T$. A natural basis is
$$L_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad L_2 = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix} \quad\text{and}\quad L_3 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{4.103}$$
so that
$$T = \vec\theta \cdot \vec L. \tag{4.104}$$
To find the group element we exponentiate again:
$$g = e^{i\vec\theta \cdot \vec L}; \tag{4.105}$$
this does not have a simple expression analogous to the one we found for SU(2). However, we observe that since $T^T = -T$ and $T$ is pure imaginary, $T$ is Hermitian. The eigenvalues of $T$ come in pairs differing by a sign. To see this we look at the characteristic polynomial:
$$0 = \det(T - \lambda 1) = \det((T - \lambda 1)^T) = \det(-T - \lambda 1) \quad\Rightarrow\quad 0 = \det(T + \lambda 1) \tag{4.106}$$
Thus in odd dimensions there must be a zero eigenvalue. The corresponding eigenvector is invariant under the rotation. Thus in three dimensions all rotations are the more familiar two-dimensional rotations about some fixed axis. Let us fix the rotation to be about the $x^3$ axis, so that
$$g = e^{i\theta^3L_3} = \exp\begin{pmatrix} 0 & \theta^3 & 0 \\ -\theta^3 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} \cos\theta^3 & \sin\theta^3 & 0 \\ -\sin\theta^3 & \cos\theta^3 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.107}$$
Thus we see that $|\theta| \in [0, 2\pi)$ covers the group.
4.7.1 $SO(3) = SU(2)/\mathbb{Z}_2$

Let us look at the Lie algebra so(3). By explicit calculation we can see that
$$[L_i, L_j] = i\epsilon_{ijk}L_k. \tag{4.108}$$
This is the same as su(2). Thus $su(2) \cong so(3)$.

Given the isomorphism between the two Lie algebras, we may wonder whether the two groups SU(2) and SO(3) are isomorphic. To decide this we look for a group homomorphism $\pi: SU(2) \to SO(3)$ derived from the Lie algebra isomorphism $\pi(\frac{\sigma_i}{2}) = L_i$ and given by
$$\pi\left(\exp\left(\frac{i|\alpha|}{2}\,\vec n \cdot \vec\sigma\right)\right) = \exp\left(i|\alpha|\,\vec n \cdot \vec L\right) \tag{4.109}$$
where $\vec L$ is the vector whose components are the matrices $L_i$ which form a basis for the Lie algebra of SO(3). The matrix $\exp(i|\alpha|\,\vec n \cdot \vec L)$ is a rotation about the axis parallel with $\vec n$ of angle $|\alpha|$. We know that
$$\exp\left(\frac{i|\alpha|}{2}\,\vec n \cdot \vec\sigma\right) = \cos\left(\frac{|\alpha|}{2}\right)I + i\,\vec n \cdot \vec\sigma\,\sin\left(\frac{|\alpha|}{2}\right) \tag{4.110}$$
which covers the group elements of SU(2) when $0 \leq \frac{|\alpha|}{2} < 2\pi$, i.e. when $0 \leq |\alpha| < 4\pi$. On the other hand, this range of $|\alpha|$ corresponds to rotations with angle $0 \leq |\alpha| < 4\pi$ in SO(3) under the homomorphism. That is, the homomorphism gives a double-covering of SO(3). The kernel of the homomorphism is non-trivial. Due to the geometrical intuition we have of the rotations in SO(3), we know that a rotation by $2\pi$ is the identity element; thus we quickly identify the kernel of $\pi$ to be where
$$|\alpha| = 0,\ 2\pi. \tag{4.111}$$
Although these are trivial rotations in SO(3), from (4.110) we see that
$$\exp\left(\frac{i|\alpha|}{2}\,\vec n \cdot \vec\sigma\right) \in \{I, -I\}. \tag{4.112}$$
This is the centre of SU(2), namely the set of elements in SU(2) that commute with all other elements. Thus the kernel of $\pi$ is $\{I, -I\} \cong \mathbb{Z}_2$. So by the first isomorphism theorem we have
$$\frac{SU(2)}{\mathbb{Z}_2} \cong SO(3). \tag{4.113}$$

Let us summarise our observations. We commenced with an isomorphism between representations of two Lie algebras, and we wondered whether it extended by the exponential map to an isomorphism between the representations of the Lie groups. However, the identification of the group representation (which is informed by the global group structure) with the exponentiation of the Lie algebra representation is only possible for a certain class of groups. Such groups are called simply-connected: in addition to being connected, every closed loop on them may be continuously shrunk to a point. In this class of groups one can make deductions about the global group structure from the local knowledge of the Lie algebra. We will not discuss simple-connectedness in any detail here, but in the example above both SU(2) and SO(3) are connected while only SU(2) is simply-connected. Hence for SU(2) we may identify the representations of the group with those of the algebra, but for SO(3) we may not. A Lie algebra homomorphism does not in general give a Lie group homomorphism. However, if $G$ is a connected group then there always exists a related simply-connected group $\tilde G$, called the universal covering group, for which the Lie algebra homomorphism does extend to a Lie group homomorphism. Above we see that SU(2) is the universal covering group of SO(3). The double cover of the group $SO(p, q)$ is the universal covering group of $SO(p, q)$ and is called $\mathrm{Spin}(p, q)$; hence here we see that $\mathrm{Spin}(3) \cong SU(2)$.
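The double covering can be seen concretely. In the sketch below (our own construction, using the standard formula $R_{ij} = \frac{1}{2}\mathrm{Tr}(\sigma_i U\sigma_j U^\dagger)$ for the rotation induced by $U$), a matrix $U \in SU(2)$ and its negative $-U$ are mapped to one and the same element of SO(3):

```python
import numpy as np
from scipy.linalg import expm

# The 2-to-1 homomorphism pi : SU(2) -> SO(3).
# R_ij = (1/2) Tr(sigma_i U sigma_j U^dagger) is the induced rotation;
# U and -U give the same R, exhibiting the kernel {I, -I}.

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]

def pi(U):
    return np.array([[0.5 * np.trace(s[i] @ U @ s[j] @ U.conj().T).real
                      for j in range(3)] for i in range(3)])

alpha = np.array([0.4, 0.2, -0.9])          # arbitrary test parameters
U = expm(0.5j * sum(a * si for a, si in zip(alpha, s)))

R = pi(U)
print(np.allclose(R @ R.T, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))
print(np.allclose(pi(U), pi(-U)))           # True: the double cover
```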
4.7.2 Representations

Next we wish to construct all finite-dimensional unitary representations of su(2). Exponentiation lifts these to representations of SU(2). We can then ask which ones lift to representations of SO(3). To do this we will proceed as we did above for the harmonic oscillator.

Let us suppose that we are given matrices $J_i$ that satisfy $[J_i, J_j] = i\epsilon_{ijk}J_k$. Since we want a unitary representation, we assume that $J_i^\dagger = J_i$, but we do not know anything else yet, and we certainly don't assume that they are $2 \times 2$ or $3 \times 3$ matrices as above. First note that
$$J^2 = (J_1)^2 + (J_2)^2 + (J_3)^2 \tag{4.114}$$
is a Casimir. That means it commutes with all the generators:
$$\begin{aligned}[J^2, J_i] &= \sum_j [J_j^2, J_i] \\ &= \sum_j \left(J_j[J_j, J_i] + [J_j, J_i]J_j\right) \\ &= \sum_{j,k}\left(J_j\,i\epsilon_{jik}J_k + i\epsilon_{jik}J_kJ_j\right) \\ &= \sum_{j,k} i\epsilon_{jik}(J_jJ_k + J_kJ_j) \\ &= 0. \end{aligned} \tag{4.115}$$
From Schur's lemma this means that $J^2 = \lambda I$ in any irreducible representation.

Since the $J_i$ are Hermitian we can choose to diagonalise one, but only one since su(2) has rank 1, say $J_3$. Thus the representation has a basis of states labelled by eigenvalues of $J_3$:
$$J_3|m\rangle = m|m\rangle. \tag{4.116}$$
In analogy to the harmonic oscillator, we swap $J_1$ and $J_2$ for the operators
$$J_\pm = J_1 \pm iJ_2 \qquad J_\pm^\dagger = J_\mp. \tag{4.117}$$
Notice that
$$\begin{aligned}[J_3, J_\pm] &= [J_3, J_1] \pm i[J_3, J_2] \\ &= iJ_2 \pm J_1 \\ &= \pm(J_1 \pm iJ_2) = \pm J_\pm. \end{aligned} \tag{4.118}$$
We can therefore use $J_\pm$ to raise and lower the eigenvalue of $J_3$:
$$\begin{aligned}J_3(J_\pm|m\rangle) &= ([J_3, J_\pm] + J_\pm J_3)|m\rangle \\ &= (\pm J_\pm + mJ_\pm)|m\rangle \\ &= (m \pm 1)(J_\pm|m\rangle) \end{aligned} \tag{4.119}$$
Therefore we have
$$J_+|m\rangle = c_m|m+1\rangle \qquad J_-|m\rangle = d_m|m-1\rangle, \tag{4.120}$$
where the constants $c_m$ and $d_m$ are chosen to ensure that the states are normalized (we are assuming for simplicity that the eigenspaces of $J_3$ are one-dimensional; we will return to this shortly).

To calculate $c_m$ we evaluate
$$\begin{aligned}|c_m|^2\langle m+1|m+1\rangle &= \langle m|J_+^\dagger J_+|m\rangle \\ &= \langle m|J_-J_+|m\rangle \\ &= \langle m|(J_1 - iJ_2)(J_1 + iJ_2)|m\rangle \\ &= \langle m|J_1^2 + J_2^2 + i[J_1, J_2]|m\rangle \\ &= \langle m|J^2 - J_3^2 - J_3|m\rangle \\ &= (\lambda - m^2 - m)\langle m|m\rangle \end{aligned} \tag{4.121}$$
Thus if $\langle m|m\rangle = \langle m+1|m+1\rangle = 1$ we find that
$$c_m = \sqrt{\lambda - m^2 - m}. \tag{4.122}$$
Similarly for $d_m$:
$$\begin{aligned}|d_m|^2\langle m-1|m-1\rangle &= \langle m|J_-^\dagger J_-|m\rangle \\ &= \langle m|J_+J_-|m\rangle \\ &= \langle m|(J_1 + iJ_2)(J_1 - iJ_2)|m\rangle \\ &= \langle m|J_1^2 + J_2^2 - i[J_1, J_2]|m\rangle \\ &= \langle m|J^2 - J_3^2 + J_3|m\rangle \\ &= (\lambda - m^2 + m)\langle m|m\rangle \end{aligned} \tag{4.123}$$
So that
$$d_m = \sqrt{\lambda - m^2 + m}. \tag{4.124}$$
Thus we see that any irrep of su(2) is labelled by $\lambda$ and has states with $J_3$ eigenvalues $m, m \pm 1, m \pm 2, \ldots$. If we look for finite-dimensional representations then there must be a highest value of the $J_3$-eigenvalue, $m_h$, and a lowest value, $m_l$. Furthermore, the corresponding states must satisfy
$$J_+|m_h\rangle = 0 \qquad J_-|m_l\rangle = 0 \tag{4.125}$$
This in turn requires that $c_{m_h} = d_{m_l} = 0$:
$$\lambda - m_h(m_h + 1) = 0 \quad\text{and}\quad \lambda - m_l(m_l - 1) = 0. \tag{4.126}$$
This implies that
$$\lambda = m_h(m_h + 1) \tag{4.127}$$
and also that
$$m_h(m_h + 1) = m_l(m_l - 1). \tag{4.128}$$
This is a quadratic equation for $m_l$ as a function of $m_h$, and hence has two solutions. Simple inspection tells us that
$$m_l = -m_h \quad\text{or}\quad m_l = m_h + 1. \tag{4.129}$$
The second solution is impossible since $m_l \leq m_h$, and hence the spectrum of $J_3$ eigenvalues is:
$$m_h,\ m_h - 1,\ \ldots,\ -m_h + 1,\ -m_h, \tag{4.130}$$
with a single state assigned to each eigenvalue. Furthermore, there are $2m_h + 1$ such eigenvalues and hence the representation has dimension $2m_h + 1$. This must be an integer, so we learn that
$$2m_h = 0, 1, 2, 3, \ldots. \tag{4.131}$$
We return to the issue of whether or not the eigenspaces of $|\lambda, m\rangle$ can be more than one-dimensional. If the space of eigenvectors with $m = m_h$ is $N$-dimensional, then when we act with $J_-$ we obtain $N$-dimensional eigenspaces for each eigenvalue $m$. This would lead to a reducible representation, where one could simply take one-dimensional subspaces of each eigenspace. Let us then suppose that there is only a one-dimensional eigenspace for $m = m_h$, spanned by $|\lambda, m_h\rangle$. It is then clear that acting with $J_-$ produces all states, and each eigenspace of $J_3$ has only a one-dimensional subspace spanned by $|\lambda, m\rangle \propto (J_-)^n|\lambda, m_h\rangle$ for some $n = 0, 1, \ldots, 2m_h$.

In summary, and changing notation slightly to match the norm, we have obtained a $(2l+1)$-dimensional unitary representation determined by any $l = 0, \frac{1}{2}, 1, \frac{3}{2}, \ldots$, having the Casimir $J^2 = l(l+1)$. The states can be labelled by $|l, m\rangle$ where $m = -l, -l+1, \ldots, l-1, l$.
Let us look at some examples.

$l = 0$: Here we have just one state $|0, 0\rangle$ and the matrices $J_i$ act trivially. This is the trivial representation.

$l = 1/2$: Here we have 2 states:
$$|1/2, 1/2\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad |1/2, -1/2\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{4.132}$$
By construction $J_3$ is diagonal:
$$J_3 = \begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix}. \tag{4.133}$$
We can determine $J_+$ through
$$J_+|1/2, 1/2\rangle = 0 \qquad J_+|1/2, -1/2\rangle = \sqrt{3/4 - 1/4 + 1/2}\,|1/2, 1/2\rangle = |1/2, 1/2\rangle \tag{4.134}$$
so that
$$J_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}. \tag{4.135}$$
And we can determine $J_-$ through
$$J_-|1/2, 1/2\rangle = \sqrt{3/4 - 1/4 + 1/2}\,|1/2, -1/2\rangle \qquad J_-|1/2, -1/2\rangle = 0 \tag{4.136}$$
so that
$$J_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{4.137}$$
Or alternatively
$$J_1 = \frac{1}{2}(J_+ + J_-) = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad J_2 = \frac{1}{2i}(J_+ - J_-) = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{4.138}$$
Thus we have recovered the Pauli matrices.
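As a quick consistency check, the sketch below verifies that the matrices just constructed satisfy the su(2) algebra and reproduce the Casimir $l(l+1) = 3/4$:

```python
import numpy as np

# Check that the ladder-built spin-1/2 matrices satisfy [J1, J2] = i J3
# and J^2 = l(l+1) I with l = 1/2.

Jp = np.array([[0, 1], [0, 0]], dtype=complex)   # J_+ from (4.135)
Jm = np.array([[0, 0], [1, 0]], dtype=complex)   # J_- from (4.137)
J3 = np.diag([0.5, -0.5]).astype(complex)
J1, J2 = (Jp + Jm) / 2, (Jp - Jm) / (2j)

comm = lambda A, B: A @ B - B @ A
print(np.allclose(comm(J1, J2), 1j * J3))        # [J1, J2] = i J3
print(np.allclose(J1 @ J1 + J2 @ J2 + J3 @ J3,
                  0.75 * np.eye(2)))             # J^2 = (3/4) I
```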
Problem: Obtain the $3 \times 3$ matrices $J_i$ in the $j = 1$ representation.
To obtain representations of SU(2) we simply exponentiate these matrices as before. Which of these representations are also representations of SO(3)? These will be the representations for which the centre of SU(2) is mapped to the identity. Since the non-trivial part of the centre corresponds to $|\alpha| = 2\pi$, we require, for example, that
$$e^{2\pi iJ_3} = I \tag{4.139}$$
This will be the case if the $J_3$ eigenvalues are all integers, and this in turn means that $l \in \mathbb{Z}$.

The $l = 1/2, 1$ representations are easy to visualize. They are known as the spinor (or sometimes fundamental) and vector representations respectively. One may also ask which representation of SU(2) has dimension 3; the hint is that 3 is also the dimension of su(2) itself. Any Lie algebra always admits the so-called adjoint representation, where the Lie algebra acts on itself. Indeed this is the Lie algebra version of conjugation in the group:
$$g \to hgh^{-1} \quad\Rightarrow\quad g \to g + i\epsilon[T, g] \tag{4.140}$$
if $h = 1 + i\epsilon T$. Thus in a Lie algebra we always have the adjoint representation:
$$\mathrm{ad}_T(X) = i[T, X]. \tag{4.141}$$
The Jacobi identity ensures that this is indeed a representation, as
$$\begin{aligned}\mathrm{ad}_{i[T_1,T_2]}X &= -[[T_1, T_2], X] \\ &= [[T_2, X], T_1] + [[X, T_1], T_2] \\ &= -[T_1, [T_2, X]] + [T_2, [T_1, X]] \\ &= \mathrm{ad}_{T_1}(\mathrm{ad}_{T_2}(X)) - \mathrm{ad}_{T_2}(\mathrm{ad}_{T_1}(X)) \end{aligned} \tag{4.142}$$
The dimension of this representation is therefore the dimension of the Lie algebra, and hence for su(2) it is 3, corresponding to $l = 1$. Here it is also apparent why the centre of SU(2) acts trivially and hence this also leads to a representation of SO(3).

More general representations arise by considering tensors $T_{\alpha_1,\ldots,\alpha_n}$ over $\mathbb{C}^2$ for su(2), or $\mathbb{R}^3$ for SO(3). The group elements act on each of the $\alpha_i$ indices in the natural way. In general this does not give an irreducible representation. For larger algebras such as su(N) and so(N), taking $T_{\alpha_1,\ldots,\alpha_n}$ to be totally antisymmetric does lead to an irreducible representation. So does totally symmetric and traceless on any pair of indices.
4.7.3 Representations Revisited

How does this work for more general Lie algebras? Let us re-do it using a slightly different notation. su(2) consists of three generators, which we now denote by $H$, $E_\alpha$ and $E_{-\alpha}$, that satisfy
$$[H, E_\alpha] = \alpha E_\alpha, \qquad [H, E_{-\alpha}] = -\alpha E_{-\alpha} \tag{4.143}$$
Thus we should think of $H$ as $J_3$ and $E_{\pm\alpha}$ as $J_\pm$. However, it is also common to rescale the generators so that $\alpha = \sqrt{2}$. In terms of Pauli matrices this means that we choose
$$J_i = \frac{1}{\sqrt{2}}\sigma_i. \tag{4.144}$$
This has the nice normalization that
$$\mathrm{tr}(J_iJ_j) = \delta_{ij}, \tag{4.145}$$
but at the end of the day it is just another choice of basis and is equivalent to any other choice. The corresponding $J_3$ eigenvalues are no longer half-integer but rather of the form $n/\sqrt{2}$ with $n \in \mathbb{Z}$, and the representation is labelled by $n_h/\sqrt{2}$, where $n_h/\sqrt{2}$ is the largest $J_3$ eigenvalue that appears. It is called the highest weight, and the representation is known as a highest-weight representation. One can also define a similar notion of lowest weight and lowest-weight representation.
What happens in a general Lie algebra? These have rank $r > 1$, and hence one can find $r$ simultaneously diagonalisable matrices $H_1, \ldots, H_r$ that commute with each other. We assemble these into a vector $\vec H$. The rest of the generators are split into positive and negative root generators $E_{\vec\alpha}$ and $E_{-\vec\alpha}$, which satisfy
$$[\vec H, E_{\vec\alpha}] = \vec\alpha E_{\vec\alpha}, \qquad [\vec H, E_{-\vec\alpha}] = -\vec\alpha E_{-\vec\alpha}. \tag{4.146}$$
Here $\vec\alpha$ is an $r$-dimensional vector and is known as a root; each Lie algebra will have a finite number of such roots. Furthermore, it is possible to split the set of roots in a Lie algebra into positive and negative roots, such that any root is either positive or negative. This choice is somewhat arbitrary but different choices do not affect the answers in the end. So for us $\vec\alpha$ is a positive root and $-\vec\alpha$ is a negative root.

Furthermore, the space of positive roots can be spanned by a basis of $r$ so-called simple roots. This means that all positive roots can be written as
$$\vec\alpha = n_1\vec\alpha_1 + \ldots + n_r\vec\alpha_r, \tag{4.147}$$
with $n_i$ non-negative integers.

Let us mention some definitions and a theorem you may have heard of: The Cartan matrix is
$$K_{ij} = 2\frac{\vec\alpha_i \cdot \vec\alpha_j}{\vec\alpha_i \cdot \vec\alpha_i}. \tag{4.148}$$
A Lie algebra is called simply laced if all simple roots have the same length, and usually one takes $\vec\alpha^2 = 2$. For the record, the A, D, E series of Lie algebras are simply laced, whereas the B, C, F, G series are not.
Theorem (not proven here): The set of all Lie algebras is completely determined and classified by the Cartan matrix.

Let us now look at representations. States in a representation are now labelled by a vector $\vec w$ known as a weight:
$$\vec H|\vec w\rangle = \vec w|\vec w\rangle. \tag{4.149}$$
The positive root generators play the role of raising the weight,
$$E_{\vec\alpha}|\vec w\rangle = c_{\vec\alpha}|\vec w + \vec\alpha\rangle, \tag{4.150}$$
whereas the negative root generators lower the weight,
$$E_{-\vec\alpha}|\vec w\rangle = c_{-\vec\alpha}|\vec w - \vec\alpha\rangle. \tag{4.151}$$
You might wonder what is meant by an ordering of weights, which are vectors in a higher-dimensional space. By defining a notion of positive root one can then say that, for two weights that appear in a representation, $\vec w_1 > \vec w_2$ iff $\vec w_1 - \vec w_2$ is a positive root. And similarly $\vec w_1 < \vec w_2$ if their difference is a negative root. In general the space of possible weights is infinite and forms a lattice, although of course in any given finite-dimensional representation only a finite number of weights appear.

One then has two theorems for unitary finite-dimensional representations (not proven here). The first is:

Theorem: The set of possible weights is dual to the set of roots, in the sense that
$$\vec\alpha \cdot \vec w \in \mathbb{Z}. \tag{4.152}$$
This motivates two definitions: The fundamental weights $\vec w_1, \ldots, \vec w_r$ satisfy
$$\vec\alpha_i \cdot \vec w_j = \delta_i^j, \tag{4.153}$$
where the $\vec\alpha_i$ are the simple roots. A weight $\vec w$ is called dominant iff
$$\vec w = n_1\vec w_1 + \ldots + n_r\vec w_r, \tag{4.154}$$
with $n_i$ non-negative integers.

And we now have the second theorem:

Theorem: The set of finite-dimensional irreducible representations is in one-to-one correspondence with the set of dominant weights. In particular, the highest weight of a given representation is a dominant weight, and every dominant weight defines an irreducible representation with itself as the highest weight.

It follows that the highest weight state is annihilated by the action of all positive root generators. One then obtains the remaining states by acting with the negative root generators. This is a well-defined process that, by the above theorem, always ends after a finite number of states.

Returning to su(2): the simple (and only positive) root is $\sqrt{2}$, and so the fundamental weight is $1/\sqrt{2}$. The dominant weights are just $n/\sqrt{2}$ with $n = 1, 2, \ldots$. Each of these defines an irreducible representation with states:
$$|n/\sqrt{2},\ n/\sqrt{2}\rangle,\ |n/\sqrt{2},\ n/\sqrt{2} - \sqrt{2}\rangle,\ \ldots,\ |n/\sqrt{2},\ -n/\sqrt{2}\rangle \tag{4.155}$$
since now the negative root generator $E_{-\alpha}$ lowers the $H$ eigenvalue by $\sqrt{2}$.
4.8 The Invariance of Physical Law

Let us now see how group theory arises in physical laws, at least in two fundamental notions: translational invariance and relativity. There are many other important examples of groups and symmetries in physics (the Standard Model is built on various symmetry principles), but let us just focus on these, which in effect determine the structure of spacetime.

4.8.1 Translations

We have seen that there are natural operators for momentum and energy in quantum mechanics:
$$\hat p_i = -i\hbar\frac{\partial}{\partial x^i} \qquad \hat E = i\hbar\frac{\partial}{\partial t} \tag{4.156}$$
As luck would have it, these combine into a nice relativistic 4-vector:
$$\hat p_\mu = -i\frac{\partial}{\partial x^\mu} \tag{4.157}$$
where $t = x^0$ and $c = 1$. As such, these operators form an infinite-dimensional representation of an abelian algebra:
$$[\hat p_\mu, \hat p_\nu] = 0. \tag{4.158}$$
As an algebra this is not so interesting, but clearly it plays an important role in physics. We have dropped the $\hbar$, or more precisely taken $\hbar = 1$, because these operators also appear as the generators of translations even in a classical field theory. To see this, consider an infinitesimal shift $x^\mu \to x^\mu + \epsilon^\mu$. Any function, not just a wavefunction, will then change according to
$$\phi(x^\mu + \epsilon^\mu) = \phi + \epsilon^\mu\partial_\mu\phi + \ldots = \phi + i\epsilon^\mu\hat p_\mu\phi + \ldots \tag{4.159}$$
The finite group action is then obtained by exponentiation:
$$e^{ia^\mu\hat p_\mu}\phi(x) = \sum_{n=0}^\infty \frac{1}{n!}(ia^\mu\hat p_\mu)^n\phi = \sum_{n=0}^\infty \frac{1}{n!}\,a^{\mu_1}a^{\mu_2}\cdots a^{\mu_n}\,\frac{\partial^n\phi}{\partial x^{\mu_1}\cdots\partial x^{\mu_n}} = \phi(x + a), \tag{4.160}$$
where the last line is simply Taylor's theorem.

It follows that any physical laws that are written down in terms of fields of $x^\mu$ will have translational invariance, provided that no specific potentials or other fixed functions arise.
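In one dimension, Taylor's theorem in the form (4.160) can even be watched terminating: on a polynomial only finitely many derivatives are non-zero, so the exponential series is exact after a few terms. A minimal sketch, with an arbitrary cubic of our choosing:

```python
import numpy as np
from math import factorial

# exp(a d/dx) phi = sum_n (a^n / n!) phi^(n): on a cubic the series stops
# at n = 3, and the result is exactly the translated function phi(x + a).

phi = np.polynomial.Polynomial([1.0, -2.0, 0.5, 3.0])  # 1 - 2x + 0.5 x^2 + 3 x^3
a = 0.7

translated = sum((a ** n / factorial(n)) * phi.deriv(n)
                 for n in range(4))                     # phi'''' = 0, series ends

x = np.linspace(-1.0, 1.0, 5)
print(np.allclose(translated(x), phi(x + a)))           # True
```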
4.8.2 Special Relativity and the Infinitesimal Generators of SO(1, 3)

In addition to translations in space and time, Special Relativity demands that the physical laws are invariant under Lorentz transformations.

Recall that the Lorentz group O(1, 3) is defined by
$$O(1, 3) \equiv \{\Lambda \in GL(4, \mathbb{R}) \mid \Lambda^T\eta\Lambda = \eta\}, \qquad \eta \equiv \mathrm{diag}(1, -1, -1, -1).$$
In addition to rotations (in the three-dimensional spatial subspace parameterised by $x, y, z$, which are generated by $L_1$, $L_2$ and $L_3$ in the notation of the previous section) and reflections ($t \to -t$, $x \to -x$, $y \to -y$, $z \to -z$), the Lorentz group includes three Lorentz boosts. The proper Lorentz group consists of those $\Lambda$ such that $\mathrm{Det}(\Lambda) = 1$ and is the group SO(1, 3). The orthochronous Lorentz group is the subgroup which preserves the direction of time, having $\Lambda^0{}_0 \geq 1$. The orthochronous proper Lorentz group is sometimes denoted $SO^+(1, 3)$ and consists of just the rotations and boosts. The Lorentz boosts are the 'rotations' which rotate each of $x$, $y$ and $z$ into the time direction and are represented by the generalisation of the matrix shown in equation (2.30):
$$\Lambda_1(\phi) = \begin{pmatrix} \cosh\phi & \sinh\phi & 0 & 0 \\ \sinh\phi & \cosh\phi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \Lambda_2(\phi) = \begin{pmatrix} \cosh\phi & 0 & \sinh\phi & 0 \\ 0 & 1 & 0 & 0 \\ \sinh\phi & 0 & \cosh\phi & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
and
$$\Lambda_3(\phi) = \begin{pmatrix} \cosh\phi & 0 & 0 & \sinh\phi \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \sinh\phi & 0 & 0 & \cosh\phi \end{pmatrix}. \tag{4.161}$$
We identify a basis for the Lorentz boosts in the Lie algebra so(1, 3):
$$Y_1 = \begin{pmatrix} 0 & -i & 0 & 0 \\ -i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad Y_2 = \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & 0 \\ -i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad Y_3 = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -i & 0 & 0 & 0 \end{pmatrix}, \qquad (4.162)$$
so that $\Lambda_j(\phi) = \exp(i\phi Y_j)$.
The remainder of the Lie algebra of the proper Lorentz group is made up of the generators of rotations:
$$L_1 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}, \qquad L_2 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & i \\ 0 & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \end{pmatrix} \quad \text{and} \quad L_3 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -i & 0 \\ 0 & i & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \qquad (4.163)$$
Computation of the commutators gives (after some time...)
$$[L_i, L_j] = i\epsilon_{ijk}L_k\,, \qquad [L_i, Y_j] = i\epsilon_{ijk}Y_k \qquad \text{and} \qquad [Y_i, Y_j] = -i\epsilon_{ijk}L_k\,. \qquad (4.164)$$
It is worth observing that the generators of the rotations are skew-symmetric matrices,
$L_i^T = -L_i$, while the boost generators are symmetric matrices, $Y_i^T = Y_i$, for $i \in \{1, 2, 3\}$.
This is a consequence of the rotations being an example of a compact transformation
(all the components of the matrix representation of a rotation, $\cos\phi$ and $\sin\phi$, are bounded) while the Lorentz boosts are non-compact transformations (some
of the components of the matrix representation of the boosts, $\cosh\phi$ and $\sinh\phi$, are unbounded: they may go to $\pm\infty$).
Notice that if one uses the combinations
$$W^\pm_i \equiv \frac{1}{2}\left(L_i \pm iY_i\right) \qquad (4.165)$$
as a basis of the Lie algebra then the commutator relations simplify:
$$[W^+_i, W^+_j] = i\epsilon_{ijk}W^+_k \qquad \cong su(2)$$
$$[W^-_i, W^-_j] = i\epsilon_{ijk}W^-_k \qquad \cong su(2) \qquad (4.166)$$
$$[W^+_i, W^-_j] = 0\,.$$
Via a change of basis for the Lie algebra we recognise that it encodes two copies of the
algebra su(2):
$$so(1,3) \cong su(2) \oplus su(2)\,. \qquad (4.167)$$
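These commutators are straightforward, if tedious, to check by hand. The following sketch (Python with numpy; an added illustration using the sign conventions of (4.162)-(4.164) as reconstructed above) verifies both the algebra (4.164) and the su(2) ⊕ su(2) split (4.166) directly:

```python
import numpy as np

def unit(a, b):
    """The matrix unit E_{ab}: a single 1 in row a, column b."""
    M = np.zeros((4, 4), dtype=complex)
    M[a, b] = 1
    return M

L = [-1j*(unit(2, 3) - unit(3, 2)),          # L_1
     -1j*(unit(3, 1) - unit(1, 3)),          # L_2
     -1j*(unit(1, 2) - unit(2, 1))]          # L_3
Y = [-1j*(unit(0, k) + unit(k, 0)) for k in (1, 2, 3)]

Wp = [(L[i] + 1j*Y[i]) / 2 for i in range(3)]    # W^+_i
Wm = [(L[i] - 1j*Y[i]) / 2 for i in range(3)]    # W^-_i

eps = np.zeros((3, 3, 3))                    # the Levi-Civita symbol
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[j, i, k] = 1, -1

comm = lambda A, B: A @ B - B @ A
for i in range(3):
    for j in range(3):
        assert np.allclose(comm(L[i], L[j]),  1j*sum(eps[i, j, k]*L[k] for k in range(3)))
        assert np.allclose(comm(L[i], Y[j]),  1j*sum(eps[i, j, k]*Y[k] for k in range(3)))
        assert np.allclose(comm(Y[i], Y[j]), -1j*sum(eps[i, j, k]*L[k] for k in range(3)))
        assert np.allclose(comm(Wp[i], Wp[j]), 1j*sum(eps[i, j, k]*Wp[k] for k in range(3)))
        assert np.allclose(comm(Wm[i], Wm[j]), 1j*sum(eps[i, j, k]*Wm[k] for k in range(3)))
        assert np.allclose(comm(Wp[i], Wm[j]), np.zeros((4, 4)))   # the two su(2)'s commute
```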
4.8.3 The Proper Lorentz Group and SL(2, C).
We will now show that $so(1,3) \cong sl(2,\mathbb{C})$ as Lie algebras, and that in terms of groups
$SO^+(1,3) \cong SL(2,\mathbb{C})/\mathbb{Z}_2$, where $\mathbb{Z}_2$ is the centre of $SL(2,\mathbb{C})$. Furthermore $SL(2,\mathbb{C})$ is
the double cover (universal cover) of SO(1, 3), known as Spin(1, 3).
Let us recall the Pauli matrices and introduce the identity matrix as $\sigma^0$:
$$\sigma^0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (4.168)$$
Consider for each Lorentz vector $x \in \mathbb{R}^{1,3}$ the two-by-two matrix given by
$$X \equiv x_\mu\sigma^\mu = \begin{pmatrix} x^0 + x^3 & x^1 - ix^2 \\ x^1 + ix^2 & x^0 - x^3 \end{pmatrix}. \qquad (4.169)$$
One easily sees that $X^\dagger = X$ and that such $X$ span all $2\times 2$ Hermitian matrices. One may confirm that
matrices $A \in GL(2,\mathbb{C})$ transforming $X \to X'$ by the action
$$X \to X' \equiv AXA^\dagger \qquad (4.170)$$
preserve $X^\dagger = X$.
Furthermore one has
$$\mathrm{Det}(X) = (x^0)^2 - (x^3)^2 - (x^1)^2 - (x^2)^2 = x_\mu x^\mu\,. \qquad (4.171)$$
Consequently the transformations of X which leave its determinant unaltered are Lorentz
transformations. What are these? Well, $\mathrm{Det}(X') = \mathrm{Det}(AXA^\dagger) = \mathrm{Det}(XA^\dagger A) = \mathrm{Det}(X)\mathrm{Det}(A^\dagger A)$. Thus we require $\mathrm{Det}(A^\dagger A) = |\mathrm{Det}(A)|^2 = 1$. If we write
$$A = e^{i\theta/2}A_0 \qquad (4.172)$$
with $A_0 \in SL(2,\mathbb{C})$, i.e. $\mathrm{Det}(A_0) = 1$, then $\mathrm{Det}(A) = e^{i\theta}$ and $A^\dagger = e^{-i\theta/2}A_0^\dagger$. The
factors of $e^{\pm i\theta/2}$ cancel in the action $X \to AXA^\dagger$, so that without loss of generality we may
simply take $A \in SL(2,\mathbb{C})$.
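A quick numerical sanity check of the map (4.169) and the invariances just derived (Python with numpy; an illustration added here, not part of the original notes):

```python
import numpy as np

sigma = [np.eye(2, dtype=complex),
         np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def X_of(x):
    """The Hermitian matrix of eq. (4.169) built from the 4-vector x."""
    return sum(x[mu] * sigma[mu] for mu in range(4))

rng = np.random.default_rng(0)
x = rng.standard_normal(4)
X = X_of(x)
assert np.allclose(X, X.conj().T)                         # X is Hermitian
assert np.isclose(np.linalg.det(X).real,
                  x[0]**2 - x[1]**2 - x[2]**2 - x[3]**2)  # Det X = x_mu x^mu

# A generic SL(2, C) element: normalise a random matrix by a square root of its determinant
M = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
A = M / np.sqrt(np.linalg.det(M))
assert np.isclose(np.linalg.det(A), 1)

Xp = A @ X @ A.conj().T
assert np.allclose(Xp, Xp.conj().T)                               # X' is still Hermitian
assert np.isclose(np.linalg.det(Xp).real, np.linalg.det(X).real)  # the interval is preserved
```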
Hence each $A \in SL(2,\mathbb{C})$ encodes a proper Lorentz transformation on $x^\mu$. However
it is also clear that if $A \in SL(2,\mathbb{C})$ then $-A \in SL(2,\mathbb{C})$, and both lead to the same
action on X. So at best we have $SO(1,3) \cong SL(2,\mathbb{C})/\mathbb{Z}_2$, but actually there is more.
Next we note that the sign of $x^0$ is never changed. To see this it is sufficient to take
only $x^0 \neq 0$, so that $X = x^0 I$. Consider the matrix
$$\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \in SO(1,3) \qquad (4.173)$$
which will change the sign of $x^0$ (and of $x^1$, but we have set $x^1 = 0$ for this). In the $SL(2,\mathbb{C})$
action above one has
$$X' = x^0 AA^\dagger\,. \qquad (4.174)$$
To change the sign of $x^0$ we would require an $A \in SL(2,\mathbb{C})$ with $AA^\dagger = -I$. But this is
impossible since $AA^\dagger$ is Hermitian and positive definite whereas $-I$ is Hermitian and
negative definite. Thus $SO^+(1,3) \cong SL(2,\mathbb{C})/\mathbb{Z}_2$.
To discover the precise transformation one considers the components of $x^\mu$, which are
simply related to X. By direct computation we can check that
$$\sigma^i\sigma^j = \delta^{ij}\sigma^0 + i\epsilon^{ijk}\sigma^k\,, \qquad \sigma^0\sigma^\mu = \sigma^\mu \qquad (4.175)$$
and
$$X\sigma^\nu = x_\mu\sigma^\mu\sigma^\nu = \begin{cases} x^0\sigma^0 + x^i\sigma^i & \nu = 0 \\ x^0\sigma^j + x^i\sigma^i\sigma^j & \nu = j \end{cases} = \begin{cases} x^0\sigma^0 + x^i\sigma^i & \nu = 0 \\ x^0\sigma^j + ix^i\epsilon^{ijk}\sigma^k + x^i\delta^{ij}\sigma^0 & \nu = j\,. \end{cases}$$
As $\mathrm{Tr}(\sigma^0) = 2$ while $\mathrm{Tr}(\sigma^i) = 0$ we have
$$\mathrm{Tr}(X\sigma^\nu) = 2x^\nu \qquad \Longrightarrow \qquad x^\nu = \frac{1}{2}\mathrm{Tr}(X\sigma^\nu) \qquad (4.176)$$
and we have used the Minkowski metric to lower indices where necessary. We leave the
exercise of finding the proper Lorentz transformation corresponding to each matrix of
$SL(2,\mathbb{C})$ to the following problem.
Problem 4.8.1. Let $X = x_\mu\sigma^\mu$ and show that the Lorentz transformation $x^\mu \to \Lambda^\mu{}_\nu x^\nu$ induced by $X' = AXA^\dagger$ has:
$$\Lambda^\mu{}_\nu(A) = \frac{1}{2}\mathrm{Tr}\left(\sigma^\mu A\sigma^\nu A^\dagger\right)$$
thus defining a map $A \to \Lambda(A)$ from $SL(2,\mathbb{C})$ into SO(1, 3), where $\sigma^0$ is the two-by-two
identity matrix and $\sigma^i$ are the Pauli matrices as defined in question 4.2. (Method: show
first that $\mathrm{Tr}(X\sigma^\nu) = 2x^\nu$, then find the expression for the Lorentz transform of $x^\mu$
associated to $X \to X'$. Finally set x to be the 4-vector with all components equal to zero
apart from the $x^\nu$ component, which is equal to one.)
By considering a further transformation $X'' = BX'B^\dagger$ show that:
$$\Lambda(BA) = \Lambda(B)\Lambda(A)$$
so that the mapping is a group homomorphism. Identify the kernel of the homomorphism
as the centre of $SL(2,\mathbb{C})$, i.e. $A = \pm I$, thus showing that the map is two-to-one.
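The content of this problem is easy to probe numerically. The following sketch (Python with numpy; an added illustration using the index conventions reconstructed above, not a substitute for the proof) implements the map $A \to \Lambda(A)$ and checks that it lands in the orthochronous proper Lorentz group, that it is a homomorphism, and that it is two-to-one:

```python
import numpy as np

sigma = [np.eye(2, dtype=complex),
         np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
eta = np.diag([1.0, -1.0, -1.0, -1.0])

def Lam(A):
    """Lambda^mu_nu(A) = Tr(sigma^mu A sigma^nu A^dagger)/2, which is real by construction."""
    return np.array([[0.5 * np.trace(sigma[m] @ A @ sigma[n] @ A.conj().T).real
                      for n in range(4)] for m in range(4)])

rng = np.random.default_rng(1)
def random_sl2c():
    M = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
    return M / np.sqrt(np.linalg.det(M))

A, B = random_sl2c(), random_sl2c()
LA = Lam(A)
assert np.allclose(LA.T @ eta @ LA, eta)        # Lambda(A) is a Lorentz transformation
assert np.isclose(np.linalg.det(LA), 1)         # proper
assert LA[0, 0] >= 1                            # orthochronous: the sign of x^0 is preserved
assert np.allclose(Lam(B @ A), Lam(B) @ LA)     # homomorphism: Lambda(BA) = Lambda(B)Lambda(A)
assert np.allclose(Lam(-A), LA)                 # A and -A have the same image: two-to-one
```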
Thus $SL(2,\mathbb{C})$ can be viewed as the double cover of $SO^+(1,3)$ and plays a role analogous
to the one $SU(2)$ plays with respect to SO(3). In particular, representations of $SL(2,\mathbb{C})$
are labelled by a pair of su(2) representations with highest weights $l_1$ and $l_2$ respectively.
Representations with integer values of $l_1 + l_2$ descend to representations of SO(1, 3), but
the ones where $l_1 + l_2$ is half-integer do not. In particular the spin-statistics theorem
states that the former correspond to bosons whereas the latter correspond to fermions.
Although we haven't shown it here, $SU(2)$ and $SL(2,\mathbb{C})$ are simply connected, meaning that any closed loop in them can be continuously contracted to a point. The
groups SO(3) and $SO^+(1,3)$ are not simply connected. $SU(2)$ and $SL(2,\mathbb{C})$ are known
as universal covering spaces. This is a general pattern, and the universal covering
spaces of SO(d) and $SO^+(1,d)$ are known as Spin(d) and Spin(1, d) respectively, i.e.
$Spin(3) = SU(2)$ and $Spin(1,3) = SL(2,\mathbb{C})$. These groups act on spinors and their
tensor products, whereas SO(d) and $SO^+(1,d)$ act on vectors and their tensor products.
Note that the tensor product of two spinors gives a vector. Again the spin-statistics
theorem states that in quantum field theory spinors must be fermions.
Finally we can marry translations and Lorentz transformations to obtain the Poincaré group. The Poincaré group is the group of isometries of Minkowski spacetime.
It includes the translations in Minkowski space in addition to the Lorentz transformations:
$$\{(\Lambda, a)\ |\ \Lambda \in O(1,3),\ a \in \mathbb{R}^{1,3}\}\,; \qquad (4.177)$$
a general transformation of the Poincaré group takes the form
$$x^\mu \to \Lambda^\mu{}_\nu x^\nu + a^\mu\,. \qquad (4.178)$$
It is known as a semi-direct product of translations and Lorentz transformations. Semi-direct product means the actions of translations and Lorentz transformations do not
simply commute with each other, as they would in a direct product.
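Concretely, composing two transformations of the form (4.178) gives $(\Lambda_1, a_1)(\Lambda_2, a_2) = (\Lambda_1\Lambda_2,\ \Lambda_1 a_2 + a_1)$: the Lorentz factor acts on the translation, which is precisely the semi-direct structure. A minimal sketch (Python with numpy; the particular sample elements are arbitrary choices for illustration):

```python
import numpy as np

def compose(g1, g2):
    """Poincare group law: (L1, a1)(L2, a2) = (L1 L2, L1 a2 + a1)."""
    (L1, a1), (L2, a2) = g1, g2
    return (L1 @ L2, L1 @ a2 + a1)

def act(g, x):
    L, a = g
    return L @ x + a

phi = 0.3                                   # a boost in the t-x plane
boost = np.eye(4)
boost[0, 0] = boost[1, 1] = np.cosh(phi)
boost[0, 1] = boost[1, 0] = np.sinh(phi)

g1 = (boost, np.array([1.0, 0.0, 0.0, 0.0]))       # boost followed by a time translation
g2 = (np.eye(4), np.array([0.0, 2.0, 0.0, 0.0]))   # a pure spatial translation

x = np.array([0.5, -1.0, 2.0, 0.0])
assert np.allclose(act(compose(g1, g2), x), act(g1, act(g2, x)))   # the group law
# Translations and boosts do not commute: the boost rotates the translation vector
assert not np.allclose(act(compose(g1, g2), x), act(compose(g2, g1), x))
```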
4.8.4 Representations of the Lorentz Group and Lorentz Tensors.
The simplest representations of the Lorentz group are scalars. Scalar objects, being
devoid of free Lorentz indices, form the trivial representation of the Lorentz group (objects
which are invariant under Lorentz transformations). The standard vector representation of the Lorentz group on $\mathbb{R}^{1,3}$ acts as
$$x^\mu \to \Lambda^\mu{}_\nu x^\nu\,. \qquad (4.179)$$
This is the familiar vector action of $\Lambda$ on x and we shall denote it by $\rho_{(1,0)}$.
Similarly one may define the contragredient, or co-vector, representation $\rho_{(0,1)}$ acting
on co-vectors as
$$x_\mu \to \Lambda_\mu{}^\nu x_\nu\,. \qquad (4.180)$$
Problem 4.8.2. Show that $\rho_{(1,0)}$ and $\rho_{(0,1)}$ are equivalent representations, with the
intertwining map being the Minkowski metric $\eta$.
More general tensor representations are constructed from tensor products of the
vector and co-vector representations of the Lorentz group and are called (r, s)-tensors:
$$\underbrace{\rho_{(1,0)} \otimes \rho_{(1,0)} \otimes \cdots \otimes \rho_{(1,0)}}_{r} \otimes \underbrace{\rho_{(0,1)} \otimes \rho_{(0,1)} \otimes \cdots \otimes \rho_{(0,1)}}_{s} \qquad (4.181)$$
(r, s)-tensors have components with r vector indices and s co-vector indices,
$T^{\mu_1\mu_2\ldots\mu_r}{}_{\nu_1\nu_2\ldots\nu_s}$, and under a Lorentz transformation the components transform as
$$T^{\mu_1\mu_2\ldots\mu_r}{}_{\nu_1\nu_2\ldots\nu_s} \to \Lambda^{\mu_1}{}_{\rho_1}\Lambda^{\mu_2}{}_{\rho_2}\cdots\Lambda^{\mu_r}{}_{\rho_r}\,\Lambda_{\nu_1}{}^{\sigma_1}\Lambda_{\nu_2}{}^{\sigma_2}\cdots\Lambda_{\nu_s}{}^{\sigma_s}\, T^{\rho_1\rho_2\ldots\rho_r}{}_{\sigma_1\sigma_2\ldots\sigma_s}\,. \qquad (4.182)$$
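As a sketch of (4.182) in code (Python with numpy; added here for illustration), here is the transformation of a (2, 1)-tensor under a boost, together with a check that contracting an upper with a lower index commutes with the transformation:

```python
import numpy as np

phi = 0.7                                  # a boost in the t-x plane plays the role of Lambda
Lam = np.eye(4)
Lam[0, 0] = Lam[1, 1] = np.cosh(phi)
Lam[0, 1] = Lam[1, 0] = np.sinh(phi)
eta = np.diag([1.0, -1.0, -1.0, -1.0])
Lam_co = eta @ Lam @ eta                   # the co-vector action, equal to (Lambda^{-1})^T

rng = np.random.default_rng(2)
T = rng.standard_normal((4, 4, 4))         # a (2,1)-tensor T^{mu nu}_rho

# eq. (4.182) with r = 2, s = 1: one factor of Lambda per index
Tp = np.einsum('ma,nb,rc,abc->mnr', Lam, Lam, Lam_co, T)

# Contracting nu with rho before or after transforming gives the same vector,
# i.e. T^{mu nu}_nu transforms in the vector representation:
assert np.allclose(np.einsum('mnn->m', Tp), Lam @ np.einsum('mnn->m', T))
```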
There are two natural operations on tensors that map them to other tensors:
(1.) One may act with the metric to raise and lower indices (raising an index maps
an (r, s) tensor to an (r + 1, s − 1) tensor while lowering an index maps an (r, s)
tensor to an (r − 1, s + 1) tensor):
$$\eta_{\nu\mu_k}T^{\mu_1\ldots\mu_k\ldots\mu_r}{}_{\nu_1\ldots\nu_s} = T^{\mu_1\ldots\mu_{k-1}\mu_{k+1}\ldots\mu_r}{}_{\nu\,\nu_1\ldots\nu_s}$$
$$\eta^{\mu\nu_k}T^{\mu_1\ldots\mu_r}{}_{\nu_1\ldots\nu_k\ldots\nu_s} = T^{\mu\,\mu_1\ldots\mu_r}{}_{\nu_1\ldots\nu_{k-1}\nu_{k+1}\ldots\nu_s} \qquad (4.183)$$
(2.) One can contract a pair of indices on an (r, s) tensor to obtain an (r − 1, s − 1)
tensor:
$$T^{\mu_1\mu_2\ldots\mu_{r-1}\nu}{}_{\nu_1\nu_2\ldots\nu_{s-1}\nu} = T^{\mu_1\mu_2\ldots\mu_{r-1}}{}_{\nu_1\nu_2\ldots\nu_{s-1}}\,. \qquad (4.184)$$
Both operations are illustrated in the short sketch below.
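A minimal numerical illustration of raising, lowering and contracting indices (Python with numpy; an addition to these notes):

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # eta^{mu nu} has the same components as eta_{mu nu}
rng = np.random.default_rng(3)
T = rng.standard_normal((4, 4))          # a (2,0)-tensor T^{mu nu}

T_mixed = np.einsum('an,mn->ma', eta, T) # lower the second index: the (1,1)-tensor T^mu_nu
scalar = np.einsum('mm->', T_mixed)      # contract the pair: a (0,0)-tensor, i.e. a scalar

# Raising the index again with the inverse metric undoes the lowering
assert np.allclose(np.einsum('an,mn->ma', eta, T_mixed), T)
assert np.isclose(scalar, np.einsum('mn,mn->', eta, T))   # equals eta_{mu nu} T^{mu nu}
```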
One may be interested in special subsets of tensors whose indices (or even a subset of
indices) are symmetrised or antisymmetrised. Given a tensor one can always symmetrise
or antisymmetrise a set of its indices:
A symmetric set of indices is denoted explicitly by a set of ordinary brackets ( )
surrounding the symmetrised indices, e.g. a symmetric (r, 0) tensor is denoted
$T^{(\mu_1\mu_2\ldots\mu_r)}$ and is constructed from the tensor $T^{\mu_1\mu_2\ldots\mu_r}$ using elements P of the
permutation group $S_r$:
$$T^{(\mu_1\mu_2\ldots\mu_r)} \equiv \frac{1}{r!}\sum_{P\in S_r} T^{\mu_{P(1)}\mu_{P(2)}\ldots\mu_{P(r)}} \qquad (4.185)$$
so that under an interchange of neighbouring indices the tensor is unaltered, e.g.
$$T^{(\mu_1\mu_2\ldots\mu_r)} = T^{(\mu_2\mu_1\ldots\mu_r)}\,. \qquad (4.186)$$
One may wish to symmetrise only a subset of indices; for example, symmetrising
only the first and last indices on the (r, 0) tensor is denoted by $T^{(\mu_1|\mu_2\ldots\mu_{r-1}|\mu_r)}$
and defined by
$$T^{(\mu_1|\mu_2\ldots\mu_{r-1}|\mu_r)} \equiv \frac{1}{2!}\sum_{P\in S_2} T^{\mu_{P(1)}\mu_2\ldots\mu_{r-1}\mu_{P(r)}} \qquad (4.187)$$
where the pair of vertical lines indicates the set of indices omitted from the symmetrisation.
An antisymmetric set of indices is denoted explicitly by a set of square brackets
[ ] surrounding the antisymmetrised indices, e.g. an antisymmetric (r, 0) tensor is
denoted $T^{[\mu_1\mu_2\ldots\mu_r]}$ and is constructed from the tensor $T^{\mu_1\mu_2\ldots\mu_r}$ using elements P
of the permutation group $S_r$:
$$T^{[\mu_1\mu_2\ldots\mu_r]} \equiv \frac{1}{r!}\sum_{P\in S_r}\mathrm{Sign}(P)\,T^{\mu_{P(1)}\mu_{P(2)}\ldots\mu_{P(r)}} \qquad (4.188)$$
so that under an interchange of neighbouring indices the tensor picks up a minus
sign, e.g.
$$T^{[\mu_1\mu_2\ldots\mu_r]} = -T^{[\mu_2\mu_1\ldots\mu_r]}\,. \qquad (4.189)$$
Both projections are implemented in the sketch below.
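A small sketch of (4.185) and (4.188) (Python with numpy; an added illustration) realising the symmetriser and antisymmetriser as sums over axis permutations:

```python
import math
import numpy as np
from itertools import permutations

def sign(p):
    """Sign of a permutation given as a tuple, computed by sorting into cycles."""
    s, p = 1, list(p)
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            s = -s
    return s

def symmetrise(T):
    r = T.ndim
    return sum(np.transpose(T, p) for p in permutations(range(r))) / math.factorial(r)

def antisymmetrise(T):
    r = T.ndim
    return sum(sign(p) * np.transpose(T, p) for p in permutations(range(r))) / math.factorial(r)

rng = np.random.default_rng(4)
T = rng.standard_normal((4, 4, 4))                # a generic (3,0)-tensor

S, A = symmetrise(T), antisymmetrise(T)
assert np.allclose(S, np.swapaxes(S, 0, 1))       # eq. (4.186): unchanged under interchange
assert np.allclose(A, -np.swapaxes(A, 0, 1))      # eq. (4.189): picks up a minus sign
assert np.allclose(symmetrise(S), S)              # the operations are projectors
assert np.allclose(antisymmetrise(S), 0)
```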
Frequently in theoretical physics the symmetry or antisymmetry of the indices on a
tensor will be assumed and not written explicitly (which can cause confusion). For
example we might define $g_{\mu\nu}$ to be a symmetric tensor, which means that $g_{[\mu\nu]} = 0$
while $g_{(\mu\nu)} = g_{\mu\nu}$. Similarly for the Maxwell field strength $F_{\mu\nu}$, which was defined to be
antisymmetric, hence $F_{[\mu\nu]} = F_{\mu\nu}$ while $F_{(\mu\nu)} = 0$.
We stated earlier that the tensor product of two irreducible representations is typically not irreducible. We can see that explicitly here for the case of a generic tensor
$T^{\mu\nu}$ which transforms in the tensor product of two vector representations. Let us write
$$T^{\mu\nu} = T^{(\mu\nu)} + T^{[\mu\nu]} \qquad (4.190)$$
where
$$T^{(\mu\nu)} \equiv \frac{1}{2}\left(T^{\mu\nu} + T^{\nu\mu}\right) \qquad\Rightarrow\qquad T^{(\mu\nu)} = T^{(\nu\mu)}$$
$$T^{[\mu\nu]} \equiv \frac{1}{2}\left(T^{\mu\nu} - T^{\nu\mu}\right) \qquad\Rightarrow\qquad T^{[\mu\nu]} = -T^{[\nu\mu]}\,. \qquad (4.191)$$
First let us show that $T^{(\mu\nu)}$ and $T^{[\mu\nu]}$ form separate representations, meaning that under
a Lorentz transformation $T^{(\mu\nu)}$ remains symmetric while $T^{[\mu\nu]}$ remains anti-symmetric.
First consider the Lorentz transformation of $T^{(\mu\nu)}$:
$$T^{(\mu\nu)} \to \frac{1}{2}\left(\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma T^{\rho\sigma} + \Lambda^\nu{}_\rho\Lambda^\mu{}_\sigma T^{\rho\sigma}\right) = \Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma\,\frac{1}{2}\left(T^{\rho\sigma} + T^{\sigma\rho}\right) = \Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma T^{(\rho\sigma)}\,. \qquad (4.192)$$
Thus after a Lorentz transformation the symmetric part remains symmetric. A similar
argument shows that the anti-symmetric part remains anti-symmetric after a Lorentz
transformation (you just replace the + by a −). Thus the representation is reducible:
the subspaces of symmetric and anti-symmetric tensors are invariant subspaces.
But there is a further reduction. The symmetric part can be written as
$$T^{(\mu\nu)} = \eta^{\mu\nu}T + \tilde{T}^{(\mu\nu)}\,, \qquad (4.193)$$
where $\tilde{T}^{(\mu\nu)}$ is traceless:
$$\eta_{\mu\nu}\tilde{T}^{(\mu\nu)} = 0\,. \qquad (4.194)$$
Thus
$$\eta_{\mu\nu}T^{\mu\nu} = \eta_{\mu\nu}\left(\eta^{\mu\nu}T + \tilde{T}^{(\mu\nu)} + T^{[\mu\nu]}\right) = (1 + d)\,T \qquad (4.195)$$
and
$$\tilde{T}^{(\mu\nu)} = T^{(\mu\nu)} - \eta^{\mu\nu}T = T^{(\mu\nu)} - \frac{1}{1+d}\,\eta^{\mu\nu}\eta_{\rho\sigma}T^{(\rho\sigma)}\,, \qquad (4.196)$$
where we have assumed that spacetime has dimension 1 + d. By construction T is
Lorentz invariant and therefore gives a separate, albeit trivial, Lorentz representation.
Thus even a symmetric tensor gives a reducible representation, with the pure-trace tensors,
i.e. those of the form $T^{\mu\nu} = \eta^{\mu\nu}T$, forming an invariant subspace. Finally we see that a traceless
symmetric tensor remains so after a Lorentz transformation:
$$\eta_{\mu\nu}\tilde{T}'^{(\mu\nu)} = \eta_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma\tilde{T}^{(\rho\sigma)} = \eta_{\rho\sigma}\tilde{T}^{(\rho\sigma)} = \eta_{\rho\sigma}T^{(\rho\sigma)} - (d+1)\,T = (d+1)\,T - (d+1)\,T = 0\,. \qquad (4.197)$$
Therefore we see that a tensor $T^{\mu\nu}$ splits into anti-symmetric, symmetric-traceless
and pure-trace pieces, each of which forms a representation of the Lorentz group.
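The decomposition and the invariance of each piece are easy to check numerically; here is a sketch (Python with numpy, added for illustration) in 1 + 3 dimensions:

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])
rng = np.random.default_rng(5)
T = rng.standard_normal((4, 4))

T_sym  = (T + T.T) / 2                       # eq. (4.191)
T_asym = (T - T.T) / 2
trace  = np.einsum('mn,mn->', eta, T) / 4    # T = eta_{mu nu} T^{mu nu} / (1 + d), d = 3
T_tilde = T_sym - trace * eta                # eta^{mu nu} has the same components as eta

assert np.allclose(T, T_asym + trace * eta + T_tilde)      # the three pieces reassemble T
assert np.isclose(np.einsum('mn,mn->', eta, T_tilde), 0)   # traceless, eq. (4.194)

# Each piece is preserved by a Lorentz transformation, here a boost in the t-x plane:
phi = 0.4
Lam = np.eye(4)
Lam[0, 0] = Lam[1, 1] = np.cosh(phi)
Lam[0, 1] = Lam[1, 0] = np.sinh(phi)

Tt = Lam @ T_tilde @ Lam.T                   # the (2,0)-tensor transformation (4.182)
assert np.allclose(Tt, Tt.T)                               # still symmetric
assert np.isclose(np.einsum('mn,mn->', eta, Tt), 0)        # still traceless
```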
Problem 4.8.3. Consider the space of rank (3, 0)-tensors $T^{\mu_1\mu_2\mu_3}$ forming a tensor
representation of the Lorentz group SO(1, 3), which transforms under the Lorentz transformation $\Lambda$ as
$$T'^{\mu_1\mu_2\mu_3} = \Lambda^{\mu_1}{}_{\nu_1}\Lambda^{\mu_2}{}_{\nu_2}\Lambda^{\mu_3}{}_{\nu_3}T^{\nu_1\nu_2\nu_3}\,.$$
(a.) Prove that
$$T^2 \equiv T^{\mu_1\mu_2\mu_3}T_{\mu_1\mu_2\mu_3}$$
is a Lorentz invariant. The Einstein summation convention for repeated indices is
assumed in the expression for $T^2$.
(b.) Give the definitions of the symmetric (3, 0)-tensors and of the antisymmetric (3, 0)-tensors and show that they form two invariant subspaces under the Lorentz transformations.
(c.) Prove that the symmetric (3, 0)-tensors form a reducible representation of the
Lorentz group.