Diff Geom

4P9
An Introduction to Differential Geometry.

Michael D. Alder
November 29, 2008
Contents
1 Introduction
2 Smooth Manifolds and Vector Fields
2.1 Introduction
2.2 Smooth Manifolds
2.3 Smooth maps and tangent vectors
2.4 Notation: Vector Fields
2.5 Cotangent Bundles
2.6 The Tangent Functor
2.6.1 The (non-existent) Cotangent Functor
2.7 Autonomous Systems of ODEs
2.7.1 Systems of ODEs and Vector Fields
2.7.2 Exponentiation of Things
2.7.3 Solving Linear Autonomous Systems
2.7.4 Existence and Uniqueness
2.8 Flows
2.9 Lie Brackets
2.10 Conclusion
3 Tensors and Tensor Fields
3.1 Tensors
3.1.1 Natural and Unnatural Isomorphisms
3.1.2 Multilinearity
3.1.3 Dimension of Tensor spaces
3.1.4 The Tensor Algebra
3.2 Tensor Fields on a Manifold
3.3 The Riemannian Metric Tensor
3.3.1 What this means: Ancient History
3.4 Geometry
3.5 The Exterior Algebra
3.6 The Exterior Calculus
3.7 Hodge Duality: The Hodge Operator
3.7.1 The Riemannian Case
3.7.2 The SemiRiemannian Case
4 Some Elementary Physics
4.1 Three weird forces
4.2 Fields
4.2.1 Gradient Fields
4.2.2 What are Flux?
4.3 Maxwell and Faraday
4.4 Invariance
4.4.1 The Idea of Invariance
4.4.2 The Lorentz Group
4.4.3 The Maxwell Equations
4.5 Saying it with Differential Forms
4.6 Lorentz Invariance
4.6.1 Special Relativity
5 DeRham Cohomology: Counting holes
5.1 Cultural Anthropology
5.2 Solutions
5.3 Infinite Variety
5.4 Gauge Freedom
5.5 Exact and Closed forms
5.6 Homotopies
5.7 Counting Holes
5.8 More Cultural Anthropology
5.9 Summary
6 Lie Groups
6.1 Introduction and Motivation
6.1.1 The rest of the course
6.2 Introduction to Lie Groups
6.3 Group Representations
6.3.1 Introduction
6.3.2 Irreducible Representations
6.3.3 Tensor Representations
6.3.4 Schur's Lemma
6.3.5 Representations of SU(2, C)
6.3.6 Representations of SU(2)
6.4 Lie Algebras
7 Fibre Bundles
7.1 Introduction
7.2 Principal Bundles
7.3 The Endomorphism Bundle
8 Connections
8.1 Fundamental Ideas
8.2 Back in $\mathbb{R}^n$
8.2.1 Covariant differentiation
8.2.2 Curves and transporting vectors
8.3 Covariance
8.4 Extensions to Tensor Fields on $\mathbb{R}^2$
8.5 The Koszul Connection
8.6 Vector Potentials
8.6.1 Tensor formulation
8.7 Concluding Remarks
Preface
This is a first course in Differential Geometry. I follow a number of sources: first the text for the course, Baez and Muniain's Gauge Fields, Knots and Gravity; second the unique Michael Spivak's Comprehensive Introduction to Differential Geometry, which is almost encyclopaedic and also readable, if at times demanding; third R.W.R. Darling's Differential Forms and Connections; and finally the rather old fashioned Sternberg's Lectures on Differential Geometry. I shall also make some allusions to Helgason's Differential Geometry, Lie Groups and Symmetric Spaces.
My aim is to cover some of the ideas with applications to theoretical physics
from the text book while covering basic ideas. I hope that the students will
have done my 3P0 course which introduces tensors and tensor fields, but it
seems unsafe to count on it having been absorbed as thoroughly as desired.
So some of the introductory material has been lifted from my 3P0 notes.
Mike Alder, February 2007
Chapter 1
Introduction
This course is about Differential Geometry and the text book is really important. You need your own copy unless you are sharing with a very good friend. It is particularly important if you want to see where the physical reasons for studying this topic tie in with the mathematics. We shan't get through the whole book but we shall have started on the journey.
I copied some of that from the introduction to 3P0. It's still true.
You can see what is in the course by reading the contents page. Not that
it will help; this is Mathematics, not one of those subjects where you learn
to say the right things without considering whether they are true or false or
even whether or not they mean anything.
There are many different reasons for studying this subject but one important one is that you will come to grips with some of the ideas that modern Physics needs to make sense of the universe. You may be a physicist or a mathematician or even an engineer (embryonically at least). The three subjects tend to attract slightly different kinds of people (only slightly different: compared with poets, pop-stars, princes, politicians and philosophers we are barely different at all). Engineers tend to see the world in terms of facts and protocols which they have to learn and which may or may not make much sense; physicists see the world in terms of facts and theories, the theories being there to summarise and predict the facts. Mathematicians expect to see reasons and logic and relatively few basic facts from which the others can be deduced. For a mathematician it has to make sense or it is definitely wrong. For a theoretical physicist it has to be elegant or it is definitely wrong.
Because I am a mathematician, I have put in material which is left out of
the text book where it is treated as some bunch of facts which you need to
know, whereas I want to show why these things are the case. Maybe I just
have a bad memory for facts but a good one for arguments. Whatever the reason, I am going to try to show you the essential beauty of the subject, to get you to agree that it is amazingly cool, because this is ultimately why mathematicians do it. The fact that it is also very useful is not why we do it, it is why we get paid to do it. Though not much [1].
This is very tough stuff, so don't expect an easy time. On the other hand it will be very exciting.
[1] This is a really bad subject to do if you want to get rich, or boss people about, but a very good one if you want to be happy and have lots of interesting and important things to think about.
Chapter 2
Smooth Manifolds and Vector Fields
2.1 Introduction
This chapter considers the machinery needed to say what we mean by a smooth manifold. We also look at vector fields on smooth manifolds and explain what this has to do with systems of ordinary differential equations.
The first idea is that a curve (in the plane or in three space) is a one dimensional object, a surface such as a sphere (the surface of a beach ball) or a torus (the surface of an American doughnut where they sell you a hole in the middle) is a two dimensional object. And there ought reasonably to be higher dimensional variants of these things, as for example the n-sphere $S^n$ given by
$$S^n = \{ x \in \mathbb{R}^{n+1} : \|x\| = 1 \}$$
It is also reasonable to look at smooth maps between manifolds. If you draw a smooth curve on a beach ball without stopping, which joins up to stop where it starts and has the same final and initial velocity, then we could think of this as a smooth map from $S^1$ to $S^2$. But we have a problem in dealing with what we mean by a differentiable map in this case since neither $S^1$ nor $S^2$ is a Banach space, and Banach spaces are the setting for talking about differentiation, since they contain the linear algebra and distance notions needed to talk of linear approximations, which is what derivatives are.
We might be able to make sense of this if the sphere is sitting in $\mathbb{R}^3$, and the circle in $\mathbb{R}^2$, in which case we have a way of defining smoothness extrinsically. But smoothness ought to make sense intrinsically, that is without reference to some external space in which the manifold may or may not be sitting. An important reason for this is that we live in what looks to be a 3-manifold called the physical universe, at least at some scales. Make it a 4-manifold if you want to throw in time. If the universe is sitting in some higher dimensional space, we can't know much about it, so it is slightly lunatic to believe it is there. Google 'branes' for an alternative viewpoint. Also 'string theory' for some disturbing ideas. But these do not contradict the idea that if we live in a three dimensional physical universe it makes sense to talk about smooth motion in it without having to postulate some inaccessible space external to the universe.
So we seek to specify enough extra structure on a manifold so that we can
talk about smooth maps between them without reference to any space in
which they may be sitting. This will certainly be necessary if we are to
suppose that we live in a 3-manifold and want to talk about geodesics in it,
curves of minimal length. We shall certainly want to do this if we are to talk
of the path of a photon in our universe.
All this means generalising ideas about maps from $\mathbb{R}^n$, or subsets of $\mathbb{R}^n$, to $\mathbb{R}^m$, which involve differentiability. Which we understand. Or do we?
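Recall that if $U, V$ are open subsets of $\mathbb{R}^n$ and $f : U \to V$ is a differentiable map we have that at each point $a \in U$, the derivative of $f$ at $a$ is the linear map $Df(a) : \mathbb{R}^n \to \mathbb{R}^n$ which is represented in the standard basis by the $n \times n$ matrix of partial derivatives:
$$
[Df_a] =
\begin{bmatrix}
D_1 f^1(a) & D_2 f^1(a) & \cdots & D_n f^1(a) \\
D_1 f^2(a) & D_2 f^2(a) & \cdots & D_n f^2(a) \\
\vdots & \vdots & \ddots & \vdots \\
D_1 f^n(a) & D_2 f^n(a) & \cdots & D_n f^n(a)
\end{bmatrix}
=
\begin{bmatrix}
\frac{\partial f^1}{\partial x^1} & \frac{\partial f^1}{\partial x^2} & \cdots & \frac{\partial f^1}{\partial x^n} \\
\frac{\partial f^2}{\partial x^1} & \frac{\partial f^2}{\partial x^2} & \cdots & \frac{\partial f^2}{\partial x^n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f^n}{\partial x^1} & \frac{\partial f^n}{\partial x^2} & \cdots & \frac{\partial f^n}{\partial x^n}
\end{bmatrix}_{x=a}
$$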
Usually I shan't bother to distinguish between the linear map and its matrix representation. You know how to compute this matrix if it should be absolutely necessary, and you should understand that the linear map is the linear part of the best affine approximation to $f$ at $a$. Note that I have used $f^i$ for the $n$ component functions of $f$ and $x^i$ for the components of a vector in $\mathbb{R}^n$. I shall explain this notation later.
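A small numerical sketch may make the phrase "linear part of the best affine approximation" concrete. Everything in it is an assumption made for illustration: the sample map `f` on $\mathbb{R}^2$, the point `a`, and the helper names are not taken from the notes. It compares the matrix of partial derivatives with a finite-difference estimate, and checks that the error of the affine approximation shrinks roughly like the square of the step.

```python
import math

def f(x):
    """A sample smooth map R^2 -> R^2, chosen only for illustration."""
    x1, x2 = x
    return [x1 * x1 * x2, math.sin(x1) + x2]

def jacobian_exact(a):
    """Matrix of partial derivatives of f at a; row i belongs to component f^i."""
    a1, a2 = a
    return [[2.0 * a1 * a2, a1 * a1],
            [math.cos(a1), 1.0]]

def jacobian_numeric(func, a, h=1e-6):
    """Central-difference estimate of the same matrix."""
    n, m = len(a), len(func(a))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        up, dn = list(a), list(a)
        up[j] += h
        dn[j] -= h
        fu, fd = func(up), func(dn)
        for i in range(m):
            J[i][j] = (fu[i] - fd[i]) / (2.0 * h)
    return J

def affine_approx(a, x):
    """The affine map x -> f(a) + Df(a)(x - a)."""
    J, fa = jacobian_exact(a), f(a)
    d = [x[k] - a[k] for k in range(len(a))]
    return [fa[i] + sum(J[i][k] * d[k] for k in range(len(a)))
            for i in range(len(fa))]

def approx_error(a, step):
    """Largest component of f(a + step) minus its affine approximation."""
    x = [a[k] + step[k] for k in range(len(a))]
    fx, ax = f(x), affine_approx(a, x)
    return max(abs(fx[i] - ax[i]) for i in range(len(fx)))

a = [1.0, 2.0]
print(jacobian_exact(a))
print(jacobian_numeric(f, a))
# Shrinking the step by a factor of 10 shrinks the error by roughly 100.
print(approx_error(a, [1e-2, 1e-2]), approx_error(a, [1e-3, 1e-3]))
```

The quadratic decay of the error is what singles out the derivative among all linear maps: any other choice of matrix leaves an error that is only linear in the step.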
An important point about smooth curves needs to be considered:
Figure 2.1.1: A smooth curve.
Exercise 2.1.1. Figure 2.1.1 shows two line segments joined together. The horizontal one is the set of points in $\mathbb{R}^2$ with $y = 0$ and $0 \le x \le 1$ and the vertical one is the set of points in $\mathbb{R}^2$ with $x = 1$ and $0 \le y \le 1$.
Show that there is a continuous but non-differentiable function from $[0, 2]$ to $\mathbb{R}^2$ which traces the curve formed by the two segments from the origin to the point $(1, 1)^T$.
Show that there is a differentiable function from $[0, 2]$ to $\mathbb{R}^2$ which does the same job.
Exercise 2.1.2. Show that $[-1, 1]$ is the image of $[-1, 1]$ by a continuous bijection which is not differentiable.
The conclusion you should draw from this is that you cannot decide if a curve
is smooth or not merely by looking at the image!
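The point about images can be explored numerically. The sketch below is one possible pair of parametrisations of the corner of Exercise 2.1.1, not the only one and not necessarily the intended solution; the function names are invented for the illustration. Both curves have the same image, but the first has different left and right velocities at the corner while the second slows to a stop there. The second is only once differentiable; an infinitely differentiable version would need a flatter slow-down.

```python
def corner_naive(t):
    """Trace the two segments at unit speed; the velocity jumps at t = 1."""
    return (t, 0.0) if t <= 1.0 else (1.0, t - 1.0)

def corner_slow(t):
    """Trace the same image but come to rest at the corner (t = 1)."""
    if t <= 1.0:
        return (1.0 - (1.0 - t) ** 2, 0.0)
    return (1.0, (t - 1.0) ** 2)

def one_sided_velocities(curve, t0, h=1e-6):
    """Left and right difference quotients of a plane curve at t0."""
    x0, y0 = curve(t0)
    xl, yl = curve(t0 - h)
    xr, yr = curve(t0 + h)
    left = ((x0 - xl) / h, (y0 - yl) / h)
    right = ((xr - x0) / h, (yr - y0) / h)
    return left, right

# Naive curve: left velocity is close to (1, 0), right velocity close to (0, 1).
print(one_sided_velocities(corner_naive, 1.0))
# Slow curve: both one-sided velocities are close to (0, 0), so they agree.
print(one_sided_velocities(corner_slow, 1.0))
```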
2.2 Smooth Manifolds
Definition 2.2.1. A chart on a topological space $X$ is a homeomorphism from some open subset of $X$ onto an open subset of $\mathbb{R}^n$. I shall call the inverse of such a homeomorphism a local parametrisation.
We show a typical local parametrisation map from a rectangular neighbourhood of the origin in $\mathbb{R}^2$ to a region on the surface in figure 2.2.1. A chart can be used to give coordinates for points of the space, at least some of them. Different charts will, of course, give different coordinates in general to those points on which the domains overlap.
Definition 2.2.2. Two charts on a space $X$, $f : U \to \mathbb{R}^n$ and $g : V \to \mathbb{R}^n$, are smoothly compatible iff the maps $f \circ g^{-1}$ and $g \circ f^{-1}$ are infinitely differentiable wherever they are defined.
Figure 2.2.1: A local coordinate map.
Figure 2.2.2: Two charts.
In other words, the composite map $f \circ g^{-1}$ must have partial derivatives of all orders at every point of the domain, and the same is true of the inverse map.
If $U$ and $V$ have empty intersection then this holds vacuously. If they do have an intersection, then $f \circ g^{-1}$ has domain and codomain some open subsets of $\mathbb{R}^n$ and is certainly continuous. It makes sense to demand that this map be smooth, that is, infinitely differentiable. The picture of figure 2.2.2 may help.
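For a concrete pair of compatible charts, here is a hedged sketch on the circle $S^1$. The two stereographic projections used are a standard choice, but they are an assumption here: the notes do not fix any particular charts, and the function names are made up. On the overlap the transition map works out to $u \mapsto 1/u$, which is infinitely differentiable wherever it is defined (that is, for $u \neq 0$).

```python
def chart_north(p):
    """Stereographic projection from (0, 1); defined on S^1 minus that point."""
    x, y = p
    return x / (1.0 - y)

def chart_south(p):
    """Stereographic projection from (0, -1); defined on S^1 minus that point."""
    x, y = p
    return x / (1.0 + y)

def param_north(u):
    """Inverse of chart_north: a local parametrisation of S^1."""
    d = u * u + 1.0
    return (2.0 * u / d, (u * u - 1.0) / d)

def transition(u):
    """chart_south composed with the inverse of chart_north, for u != 0."""
    return chart_south(param_north(u))

for u in (-3.0, -0.5, 0.25, 2.0):
    x, y = param_north(u)
    # Columns: u, x^2 + y^2 (should be 1), the transition value, and 1/u.
    print(u, x * x + y * y, transition(u), 1.0 / u)
```

The value `u = 0` is excluded because it corresponds to the point $(0, -1)$, which lies outside the domain of `chart_south`; the definition only asks for smoothness where the composite is defined.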
Definition 2.2.3. A smooth atlas for a space $X$ is a collection of smoothly compatible charts such that every point of $X$ is in the domain of at least one chart. Such an atlas is maximal iff every possible (smoothly compatible) chart is in it.
Definition 2.2.4. A smooth n-manifold is a Hausdorff topological space $X$ together with a maximal atlas of smoothly compatible charts. The atlas is said to define a smooth differential structure on $X$.
The reason for wanting the atlas to be maximal is just so that anyone wandering in with a new local coordinate map can't cause us trouble. Either it is compatible with our atlas in which case we already have it, or it is not, in which case it may be part of a different differential structure for the manifold.
Exercise 2.2.1.
1. Show that $S^1$ and $S^2$ as usually defined are smooth manifolds.
2. Show that the flat torus obtained by gluing opposite edges of a square is also a smooth manifold.
3. Show that $S^n$ is a smooth manifold for any $n \in \mathbb{Z}^+$. Hint: use the Implicit Function Theorem.
(Generic hint: you don't need to have many charts. Enough to cover the manifold will do, then just add the instruction to fill up with all other possible smoothly compatible charts.)
Exercise 2.2.2. Construct a definition of an orientable manifold.
2.3 Smooth maps and tangent vectors
Now we have enough to say what it means for a map $f : X \to Y$ to be smooth when $X$ and $Y$ are smooth manifolds:
Definition 2.3.1. A map $f : X \to Y$ between smooth manifolds is differentiable when $g \circ f \circ h^{-1}$ is differentiable for all charts $h$ on $X$ and all charts $g$ on $Y$ belonging to the differential structures.
The diagram of figure 2.3.1 gives the idea.
We can define higher order differentiability in the same way and we can say that a map $f : X \to Y$ is smooth whenever all composites $g \circ f \circ h^{-1}$ are smooth for all charts $h$ on $X$ and all charts $g$ on $Y$.
Exercise 2.3.1. Show that if $f : X \to Y$ has composite $g \circ f \circ h^{-1}$ differentiable at some point $a$ in $X$ then it is differentiable in any other pair of charts containing $a$, $f(a)$.
Figure 2.3.1: A smooth map.
Figure 2.3.2: Some tangent curves in a manifold.
Note that although we can say that $f$ is differentiable, we cannot provide a derivative, since this will generally be different in different charts. If we move away from simple linear spaces we must pay the price: there is no longer a best affine approximation because affine maps don't make sense between manifolds in general.
We can however say when two maps from $\mathbb{R}$ into a manifold $X$ are tangent. Let $f, g : (-1, 1) \to X$ be smooth maps into a manifold and without loss of generality let $f(0) = g(0) = a \in X$. Then we can say that $f$ and $g$ are tangent at $a$ iff the derivatives of $h \circ f$ and $h \circ g$ at $0$ are the same for any chart $h : U \to \mathbb{R}^n$ where $f(0) = g(0) = a \in U$. If they are the same in any one chart they must be the same in any other.
Exercise 2.3.2. Prove the last remark.
Exercise 2.3.3. Show that tangency is an equivalence relation on the set of
maps from R to X, and that we can do the same thing with maps from X
to R.
We can take a tangency equivalence class of maps from R to the manifold
X, and regard it as an object in its own right. The picture 2.3.2 shows some
members of a tangency equivalence class.
The curves can be thought of as the trajectories of moving points, and they are all moving through the point $a$ at the same speed, and in the same direction, although we cannot give the direction a particular vector to specify it, and the speed may also be different in different charts.
Denition 2.3.2. A tangency equivalence class at a point a in a manifold
X is called a tangent vector at a in X.
Remark 2.3.1. Watching the faces of students in class when giving this definition is a real treat. The look of stark horror and incomprehension is very encouraging, as it proves that some at least are listening. A small amount of imagination, however, goes a long way to making this definition quite reasonable.
Suppose that the North pole has been cleared of snow and turned into a skating rink for penguins [1], and the North pole itself is marked by a flashing red light. There are two space craft hovering up there, call them A and B. In space craft A, an astronaut leans out and takes a photograph of the region around the north pole. Suppose for simplicity he is directly above the north pole so his photograph, when enlarged, is a disc as in figure 2.3.4. Astronaut B is somewhere over Russia and he also takes a photograph of what he can see.
[1] It has been pointed out to me that there are no penguins at the North pole, only at the South pole. On the other hand there isn't a skating rink at the North pole either. So if we are going to make a skating rink we might as well import the penguins.
Figure 2.3.3: Penguins (imported from the antarctic) skating.
Now each astronaut looks at his photograph and lays it out flat and enlarges it to a nice size, and each marks on a coordinate grid using a ruler and pen, and so each has a chart of a bit of the polar regions, with the north pole in the domain of each chart. If both put the origin in the centre, astronaut A will have the flashing red light at the origin, and astronaut B will have a negative x coordinate for the red light if he puts his coordinates on the chart in the way suggested by the diagram. I regard the chart as both the bit of the earth the astronaut can see, and also the process of turning it into a flat picture with a coordinate grid on it. Call them $u$ and $v$ for the maps and $U$ and $V$ for the domains of the maps back on earth.
I claim it makes sense to talk of a penguin skating over the north pole as having a velocity vector as it passes through the north pole. Each astronaut can plot the position of the green penguin in his chart, and they will agree on whether the curve is differentiable. Note that if $g : (-1, 1) \to S^2$ describes the green penguin then astronaut A will plot the green curve at the top of the picture and will be able to give it a perfectly respectable velocity on his chart relative to the cartesian coordinates marked on the chart. Similarly astronaut B can do the same. The problem is that they will have, usually, different estimates of what the velocity is. If B is much higher up in space, his scale will be such that the penguins will seem to be moving more slowly, for instance.
Figure 2.3.4: Two penguins skating under the watchful eyes of two astronauts.
Does this mean my claim that we can assign a meaningful velocity vector to the penguin is just nonsense? No, for if $b$ is the blue penguin, also skating over the north pole at the same time as the green penguin (and mysteriously not knocking over the first penguin: maybe they are ghost penguins and can occupy the same space), it certainly makes sense to say if they are travelling in the same direction at the same speed. A penguin cutting across the path would obviously be travelling in a different direction, and a really slow penguin would be slower for both astronauts, and the fast penguins would pass through it. So I claim that the penguin velocity is a real thing which exists at the penguin level if not at the astronaut level. But if one astronaut said that the blue penguin and the green penguin had the same velocity at the instant they went through the pole, the other astronaut would agree even though disagreeing as to the actual value of the vector in both direction and speed, these things being properties of the charts, not the penguins.
The reason this happens is that there are two things going on here. The actual velocity at the north pole is a real thing, penguins are actually moving, and either they pass through the north pole at the same time in the same direction at the same speed or they don't. But attempts by the two astronauts to describe the penguin motion to each other with numbers involve inventing coordinate systems which are bits of language. So the numerical value of any vector is dependent on the language. But the fact that different languages agree on whether two penguins have the same velocity tells you that the velocity is real. It exists independent of the coordinate system, provided the two coordinate systems are related by a diffeomorphism. So there are moving penguins and there is language, and the penguins will have the same velocity at the pole or they won't, and this is true no matter what language you use to talk about it unless your language is really weird.
The problem then is to say what a velocity vector is given that any pair of astronauts can disagree about the actual numbers. And the most elegant solution is to say that it is what all the penguin trajectories, real and potential, have in common. And what they have in common is that every observer will agree that they pass through the north pole in the same direction at the same speed. This is the tangency equivalence class.
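The astronauts' disagreement about numbers, and their agreement about tangency, can be sketched numerically. The example below is an illustration under assumptions that are not in the notes: it uses the circle $S^1$ rather than the sphere, two stereographic charts play the roles of the two astronauts, and the three curves are invented. Curves `f` and `g` are tangent at the common point; curve `c` passes through the same point twice as fast.

```python
import math

THETA0 = math.pi / 6.0  # angle of the common point a on S^1 (arbitrary choice)

def on_circle(theta):
    return (math.cos(theta), math.sin(theta))

# Three curves (-1, 1) -> S^1, all passing through a at t = 0.
def f(t):
    return on_circle(THETA0 + t)

def g(t):
    return on_circle(THETA0 + math.sin(t))  # same velocity as f at t = 0

def c(t):
    return on_circle(THETA0 + 2.0 * t)      # twice as fast as f

def chart_a(p):
    """Astronaut A's chart: stereographic projection from (0, 1)."""
    x, y = p
    return x / (1.0 - y)

def chart_b(p):
    """Astronaut B's chart: stereographic projection from (0, -1)."""
    x, y = p
    return x / (1.0 + y)

def velocity(chart, curve, h=1e-5):
    """Central-difference derivative of chart(curve(t)) at t = 0."""
    return (chart(curve(h)) - chart(curve(-h))) / (2.0 * h)

def tangent(curve1, curve2, chart, tol=1e-6):
    """Do the two curves have the same derivative at t = 0 in this chart?"""
    return abs(velocity(chart, curve1) - velocity(chart, curve2)) < tol

# The numbers disagree between charts (about 2 in chart A, about -2/3 in chart B).
print(velocity(chart_a, f), velocity(chart_b, f))
# Both charts agree that f and g are tangent and that f and c are not.
print(tangent(f, g, chart_a), tangent(f, g, chart_b))
print(tangent(f, c, chart_a), tangent(f, c, chart_b))
```

The velocities reported in the two charts differ in both size and sign, yet the yes-or-no question of tangency gets the same answer in each, which is the content of the equivalence class.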
Note that I have assumed that all observers use synchronised clocks so they all agree that the time at which the penguins hit the north pole is time zero. This doesn't have to be the case either. They will all agree on the simultaneity of the events, whatever time they claim they occur. This is because two penguins either meet or they don't, and this is not a matter of language but of fact.
The ghost-penguins are negotiable. Having a nice vivid picture of some sort is essential: you should be prepared to invent your own, but this time you may borrow my penguins if they help. If I give you more definitions like this, it is your job to supply the penguins, or whatever it takes.
Exercise 2.3.4. Show that the claim that the two astronauts would agree if two penguins have the same velocity at the north pole is true provided that $u \circ v^{-1}$ and $v \circ u^{-1}$ are both differentiable.
Remark 2.3.2. There is, of course, a simpler way of defining tangent vectors on $S^2$. It is usually viewed as a subspace of $\mathbb{R}^3$, so a curve on $S^2$ is also a curve in $\mathbb{R}^3$ and we can define velocities on $S^2$ as tangent vectors in $\mathbb{R}^3$ in the sense of the derivatives of maps from $(-1, 1)$ to $\mathbb{R}^3$ which, for tangent vectors at a particular point, happen to lie in a plane in $\mathbb{R}^3$ which is tangent to $S^2$ at that point. This certainly removes some tricky conceptual problems but at the expense of making tangent vectors extrinsic rather than intrinsic to the space. The whole thrust of the text book is towards using intrinsic ideas for the very good reason that we live in a 3-manifold and cannot form any useful idea of an embedding of it in some higher dimensional space.
The next proposition tells us that the set of tangency equivalence classes at a fixed point $a$ in a manifold forms a vector space, the tangent space at $a$.
Proposition 2.3.1. The set of tangent vectors at a point $a$ of a smooth n-manifold $X$ forms a real vector space of dimension $n$.
Figure 2.3.5: The sum of tangent vectors.
Proof:
We have to produce sensible rules for adding and scaling tangent vectors. Then we have to show that the result satisfies the axioms for a real vector space. Suppose we have a tangency equivalence class $\mathbf{v}$ and that $v$ is an element of it, that is a curve $v : \mathbb{R} \to X$ with $v(0) = a$ and in any chart $w : W \to \mathbb{R}^n$ with $a \in W$ there is some derivative of $w \circ v$. Then we can scale the function $v$ by a scalar $k \in \mathbb{R}$ to get $v(kt)$ instead of $v(t)$ for $t \in \mathbb{R}$ and the derivative of $w \circ v$ will also be scaled by the factor $k$. This will be the same scaling in any chart, so it makes sense to call this new function $kv$. This has its own tangency equivalence class, $k\mathbf{v}$.
It would not make a difference if we had chosen another function $v' \in \mathbf{v}$: $kv'$ is a function tangent to $kv$ since they both have the same derivative no matter what chart we choose, although in different charts the derivatives will be different but still equal to each other.
So we can say that $k\mathbf{v}$ exists and we have scaled the equivalence class.
If $\mathbf{v}$ and $\mathbf{u}$ are distinct tangency equivalence classes through the point $a$ as in figure 2.3.5, we can take representative functions $v, u : \mathbb{R} \to X$ with $u(0) = v(0) = a$ and composing with $w : W \to \mathbb{R}^n$, a chart, we have two maps, $w \circ u$ and $w \circ v$ from $\mathbb{R}$ into $\mathbb{R}^n$. Such maps may be added: we take the map $w \circ u + w \circ v - w(a)$. At $t = 0$ this passes through $w(a) \in \mathbb{R}^n$. The resulting curve in $\mathbb{R}^n$ can be mapped back into the manifold by $w^{-1}$, or at least a bit of it in a neighbourhood of $w(a)$ can be. This gives a sort of sum curve of $u$ and $v$ in the manifold, $w^{-1}(w \circ u + w \circ v - w(a))$. The tangency class of this sum curve is defined to be the sum of $\mathbf{u}$ and $\mathbf{v}$. It is easy to see that the tangency equivalence class does not depend on the choice of chart. (Although the sum curve does.)
Nor does it depend on which representatives $u$ of $\mathbf{u}$ and $v$ of $\mathbf{v}$ we choose because they all have the same derivative. We may write that $\mathbf{u} + \mathbf{v}$ is the tangency equivalence class of $w^{-1}(w \circ u + w \circ v - w(a))$ therefore, and we may add tangency equivalence classes, otherwise known as tangent vectors at $a$.
(If in doubt about this argument, say it with penguins.)
It is clear that the sum is associative and commutative and there is a zero
which contains the constant function sending R to a. The rest of the axioms
for a vector space are easily checked.
The claim that it has dimension n the same as the dimension of X is left as
an exercise.
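The scaling and sum constructions used in the proof can be tried out numerically. The following is only a sketch under assumptions of my choosing: the manifold is $S^2$, the point $a$ is the north pole, the chart `chart_w` drops the $z$ coordinate on the upper hemisphere, and a second chart `chart_s` (stereographic projection from the south pole) is used to check chart independence. The curves `u` and `v` and all function names are illustrative.

```python
import math

def chart_w(p):
    """Chart on the upper hemisphere: drop the z coordinate."""
    x, y, z = p
    return (x, y)

def chart_w_inv(q):
    """Local parametrisation inverse to chart_w."""
    x, y = q
    return (x, y, math.sqrt(1.0 - x * x - y * y))

def chart_s(p):
    """A second chart: stereographic projection from the south pole."""
    x, y, z = p
    return (x / (1.0 + z), y / (1.0 + z))

A = (0.0, 0.0, 1.0)  # the point a: the north pole

def u(t):
    return (math.sin(t), 0.0, math.cos(t))

def v(t):
    return (0.0, math.sin(2.0 * t), math.cos(2.0 * t))

def sum_curve(t):
    """The sum curve w^{-1}(w(u(t)) + w(v(t)) - w(a)), built in chart_w."""
    wu, wv, wa = chart_w(u(t)), chart_w(v(t)), chart_w(A)
    return chart_w_inv((wu[0] + wv[0] - wa[0], wu[1] + wv[1] - wa[1]))

def scaled(curve, k):
    """The curve t -> curve(k t), a representative of k times the class."""
    return lambda t: curve(k * t)

def velocity(chart, curve, h=1e-5):
    """Central-difference derivative of chart(curve(t)) at t = 0."""
    hi, lo = chart(curve(h)), chart(curve(-h))
    return tuple((hi[i] - lo[i]) / (2.0 * h) for i in range(2))

for chart in (chart_w, chart_s):
    # In each chart the last velocity is the sum of the first two,
    # even though the sum curve was built using chart_w only.
    print(velocity(chart, u), velocity(chart, v), velocity(chart, sum_curve))
    # Scaling the parameter by 3 scales the velocity by 3 in each chart.
    print(velocity(chart, scaled(u, 3.0)))
```

The individual numbers differ between the two charts, but additivity and the scaling factor hold in both, which is what makes the operations well defined on tangency classes.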
Exercise 2.3.5. Check all the axioms for a vector space. This kind of thing
is called axiom bashing and is good for you.
The resulting vector space is called $T_a(X)$ and is isomorphic to $\mathbb{R}^n$ when $X$ is an n-manifold. I want to emphasise an important point: there is in general no particular or natural isomorphism between $T_a(X)$ and $\mathbb{R}^n$. If $X = \mathbb{R}^n$, then I can get away with calling $T_0(\mathbb{R}^n)$ by the slang name $\dot{\mathbb{R}}^n$, because in this case there is an obvious basis for the tangent space at the origin, I have the unit vectors along the axes. And by a simple translation I can carry $\mathbb{R}^n$ to $\mathbb{R}^n$ and take the origin to any point $a$, and this translation will also take curves through the origin (and hence vectors) to curves through $a$. So in this rather special case, I do have a natural basis for $T_a(X)$. But there is no natural basis for $T_a(S^2)$ for any $a \in S^2$; the best I could do is to fudge one by using the embedding in $\mathbb{R}^3$, but this is a property of the embedding, not of $S^2$. This loss of a natural basis, or if you prefer a natural isomorphism with $\mathbb{R}^n$, has important implications. It parallels the fact that there is no obvious choice of a coordinate frame in the space we inhabit [2].
Exercise 2.3.6. Prove the last statement. Hint: Do it for $X = \mathbb{R}^n$ first, then observe that locally any $X$ is $\mathbb{R}^n$ as near as dammit, and that tangency is a very local kind of business.
There is one such tangent space $T_a(X)$ for each point $a \in X$. There is, in general, no particular isomorphism between any $T_a(X)$ and $\mathbb{R}^n$. If $X = \mathbb{R}^n$ then there is (what?), but in general there is a huge choice and no way of picking any particular one.
[2] Although in earlier days, it was thought in some quarters that Jerusalem was a good place to put one. Where exactly in Jerusalem was not altogether clear.
Figure 2.3.6: The simplest tangent bundle.
Exercise 2.3.7. Show that the tangent plane at the north pole to $S^2$ as usually embedded in $\mathbb{R}^3$ can be mapped isomorphically to the tangent space as defined here. Is there an obvious isomorphism?
Exercise 2.3.8. Show that there is an isomorphism between $T_a(X)$ and $T_b(X)$ for any two points $a, b \in X$.
Examples 2.3.1.
1. The simplest case is where $X = \mathbb{R}$. The tangent space at the point 1 can be thought of as the space of velocities of moving points as they go through 1; the chart consisting of the identity map does it all nicely. So we have a line of possible velocities attached to each point of $\mathbb{R}$ and the tangent bundle is the collection of all the tangent spaces. We can draw it as $\mathbb{R} \times \dot{\mathbb{R}}$ where the first component is the space itself and the second is the space of velocities. I am making up the notation of $\dot{\mathbb{R}}$ for the space of velocities, and you won't find it in the books, but it makes sense and reduces confusion. Since $\dot{\mathbb{R}}$ is isomorphic to $\mathbb{R}$, $\mathbb{R} \times \dot{\mathbb{R}}$ looks an awful lot like $\mathbb{R}^2$. We think of the different tangent spaces $\dot{\mathbb{R}}_a$ attached to each $a \in \mathbb{R}$ and draw some of them as in figure 2.3.6. The reason it is called a bundle is because it looks like a bundle of (red) tangent spaces. The tangent spaces are called the fibres of the bundle. The manifold to which the fibres are attached is called the base space of the fibre bundle.
2. Let the manifold $X$ be $S^1$. Again it makes sense to have curves in $S^1$ which all pass through a point and have the same velocity vector at that point. The different velocities again form a vector space $\dot{\mathbb{R}}$ and there is one attached to each point of $S^1$. If we draw the possible tangents in the plane, they intersect; this is a property of the space we are trying to squash the tangent bundle into and if we turn them through a right angle as in the last example we get the fibre bundle of figure 2.3.7. Again the fibres are all copies of a line and the bundle is pretty much the same as $S^1 \times \mathbb{R}$. The red dot sitting over the black one represents a speed in the positive direction passing through the black point underneath it.
Figure 2.3.7: The next simplest tangent bundle.
3. We have now run out of cases where we can draw the pictures, since
$\mathbb{R}$ and $S^1$ are the only one dimensional manifolds, and if we go to $S^2$
we get a tangent bundle of dimension four. We can draw one tangent
plane, but any more would usually intersect, and this is what happens
when we try to embed a four dimensional space in $\mathbb{R}^3$. We can see
however that there is a collection of planes, one for each point of $S^2$,
and they form a four dimensional space. It is useful to visualise at least
a part of the tangent bundle of $S^2$ as a sphere in $\mathbb{R}^3$ with some bits of
tangent planes attached to it, as in figure 2.3.8, because it is better to
have a partial idea than to stick entirely to the algebra, but you should
be aware of the limitations of the picture.
The two earlier examples came out to be simple cartesian products of the
tangent space at any one point with the manifold. Such bundles are called
trivial bundles. An example of a non-trivial fibre bundle is the Möbius bundle
shown in figure 2.3.9.
This has fibre an interval, say $(-1, 1)$, from $\mathbb{R}$ and base space $S^1$. But it is
not the cartesian product of the two.
Figure 2.3.8: A bit of the tangent bundle for $S^2$.
Figure 2.3.9: A non-trivial fibre bundle.
Every tangent bundle has, however, a projection onto the base space, the
underlying manifold. We may write this as a vertical pair of spaces
$$\begin{array}{c} TM \\ \downarrow \pi \\ M \end{array}$$
where $M$ is the manifold, $TM$ is the tangent bundle and $\pi$ is the projection
which sends a tangent vector to the point in the manifold to which it is
attached.
Now we have described the tangent bundle as a union of all the tangent
spaces $T_a(M)$ for $a \in M$, but that does not specify a topology on it. To do
that we say a subset $U$ of $TM$ is open iff the projection $\pi(U)$ is open in $M$
and the intersection of $U$ with any fibre is open in the fibre. Since the fibres
are all real vector spaces we can give them the usual topology, obtained from
an isomorphism with $\mathbb{R}^n$.
Definition 2.3.3. The tangent bundle to a smooth manifold $M$ is the set
$$\bigcup_{a \in M} T_a(M)$$
with the topology specified by saying $U \subseteq TM$ is open whenever $\pi(U)$ is
open in $M$ and for every $a \in \pi(U)$, $U \cap T_a(M)$ is open in $T_a(M)$, where
$T_a(M)$ has a topology induced by any isomorphism with $\mathbb{R}^n$.
Note that this assumes that any two isomorphisms with $\mathbb{R}^n$ will induce the
same topology.
Exercise 2.3.9.
1. Show that a linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$ is continuous iff it is continuous
at the origin.
2. Show that any linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$ is continuous.
3. Show that any isomorphism from $\mathbb{R}^n$ to itself is a homeomorphism.
Now for some formal definitions:
Definition 2.3.4. A fibre bundle is a quartet $(E, B, F, \pi)$ where $E$ is called
the total space, $B$ is called the base space, $\pi : E \to B$ is a continuous map
called the projection and for every $b \in B$, $\pi^{-1}(b)$ is homeomorphic to $F$. The
spaces $\pi^{-1}(b)$ are called the fibres of the bundle.
Definition 2.3.5. A fibre bundle is called locally trivial iff for every $b \in B$
there is an open set $U \subseteq B$ containing $b$ such that $\pi^{-1}(U)$ is homeomorphic
to $U \times F$.
The bundle $B \times F$ is called a trivial bundle.
Exercise 2.3.10. Describe clearly the trivial bundle with base space $S^2$ and
fibre $S^1$ and give an example of a non-trivial bundle with the same base and
fibre. Hint: you might find it easier if you specify some gluings.
Exercise 2.3.11. Show that the tangent bundle of a smooth manifold is
locally trivial.
Exercise 2.3.12. Show there is a natural atlas on the tangent bundle which
makes it a smooth manifold. Is the bundle projection smooth?
Note that for a locally trivial fibre bundle a topology on the bundle must
have as base the cartesian product of sets which are open in $B$ (and over
which the bundle is locally trivial) with open sets in the fibre.
Definition 2.3.6. A section of a fibre bundle $E$ with projection $\pi$ to base
space $B$ is a map $s : B \to E$ such that $\pi \circ s$ is the identity on $B$.
Definition 2.3.7. A vector field on a manifold $M$ is a section of the tangent
bundle $TM$.
You should be able to see that this makes sense and we can talk about
continuous, differentiable and smooth vector fields according as the section
(which is after all a map) is continuous, differentiable or smooth.
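To make the idea of a section concrete, here is a minimal sketch in Python for the trivial bundle over the plane. The names `project` and `rotation_field` are made up for this illustration; a tangent vector is represented, by assumption, as a pair (base point, arrow).

```python
# Hypothetical names throughout; a tangent vector is modelled as (base_point, arrow).

def project(tangent_vector):
    """Bundle projection pi: forget the arrow and keep the point it is attached to."""
    base_point, _arrow = tangent_vector
    return base_point

def rotation_field(point):
    """A section s of the tangent bundle of the plane: attach the arrow (-y, x) to (x, y)."""
    x, y = point
    return (point, (-y, x))

# The defining property of a section: projecting after s gives back the point.
for p in [(1.0, 0.0), (0.0, 2.0), (-3.0, 0.5)]:
    assert project(rotation_field(p)) == p
```

The only thing the check uses is that the arrow stays attached to the point it was given, which is exactly what being a section means.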
Exercise 2.3.13.
1. Draw a vector field on $\mathbb{R}^2$ which is nice and easy and write it as a
section of the tangent bundle.
2. Show that the tangent bundle for $S^2$ is not trivial. Use the hairy ball
theorem which says that any continuous vector field on $S^2$ must have
at least one place where the vector is of length zero.
2.4 Notation: Vector Fields
On $\mathbb{R}^2$, I can write the tangent bundle as $\mathbb{R}^2 \times \vec{\mathbb{R}}^2$, which is mildly useful for
thinking about the meaning but not standard and not particularly useful for
computations. I shall extend this to talking about the standard basis for
$\vec{\mathbb{R}}^2$ and call it $(\vec{e}_1, \vec{e}_2)$. A vector field on $\mathbb{R}^2$ is an assignment to each point of $\mathbb{R}^2$
of a vector, and if it is a smooth vector field this vector changes smoothly as
we move around in $\mathbb{R}^2$. So there is a tangent vector, with two components,
which both depend smoothly on $x$ and $y$, and hence it is given by a pair of
functions $P(x, y)$, $Q(x, y)$. We might write the vector field as
$$P(x, y)\,\vec{e}_1 + Q(x, y)\,\vec{e}_2$$
but we don't. We write it as
$$P(x, y)\,\frac{\partial}{\partial x} + Q(x, y)\,\frac{\partial}{\partial y}$$
This notation takes a bit of explaining.
If we have a smooth function $f : \mathbb{R}^2 \to \mathbb{R}$ and a smooth vector field on $\mathbb{R}^2$ we
can take the directional derivative of $f$ at any point in the direction of the
vector field at the point, and multiply it by the length of the vector. This
will give us a new smooth function on $\mathbb{R}^2$. This means that such a vector
field can be thought of as an operator on the space of smooth functions from
$\mathbb{R}^2$ to $\mathbb{R}$, which is usually written as $\mathcal{C}^\infty(\mathbb{R}^2)$. The constant vector field which
assigns the vector $\vec{e}_1$ to every point of $\mathbb{R}^2$ can easily be seen to be the operator
$\partial/\partial x$ and similarly the orthogonal constant vector field which assigns $\vec{e}_2$ is
the operator $\partial/\partial y$. This explains the notation for vector fields on $\mathbb{R}^2$ and by
an obvious extension we can write a vector field $v$ on $\mathbb{R}^n$ as
$$\sum_{i \in [1:n]} v^i(x)\,\frac{\partial}{\partial x^i}$$
where each $v^i$ is a function from $\mathbb{R}^n$ to $\mathbb{R}$ and where I called $v^1$ the function
$P(x, y)$ and $v^2$ the function $Q(x, y)$ when $n = 2$.
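The operator point of view can be tried out numerically. The following is only a sketch: `apply_field` is a hypothetical helper, and the partial derivatives are approximated by central differences rather than computed exactly.

```python
def apply_field(P, Q, f, h=1e-5):
    """Return the function v f = P df/dx + Q df/dy, using central differences."""
    def vf(x, y):
        df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
        df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
        return P(x, y) * df_dx + Q(x, y) * df_dy
    return vf

# An example field v = -y d/dx + x d/dy (arrows circulating about the origin).
P = lambda x, y: -y
Q = lambda x, y: x

radius_squared = lambda x, y: x**2 + y**2
product = lambda x, y: x * y

v_radius = apply_field(P, Q, radius_squared)
v_product = apply_field(P, Q, product)

print(v_radius(1.0, 2.0))   # close to 0: x^2 + y^2 is constant along circles
print(v_product(1.0, 2.0))  # close to x^2 - y^2 evaluated at (1, 2)
```

The input is a function and the output is again a function on the plane, which is the sense in which the field acts as an operator.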
Since the procedure for interpreting a vector field as an operator on $\mathcal{C}^\infty(\mathbb{R}^2)$
is local, a vector field on a manifold $M$ is an operator on $\mathcal{C}^\infty(M)$, although
there is an issue involved in choosing a basis for each tangent space $T_a(M)$
if we wish to do calculations.
This gives two quite different ways of looking at a vector field on a smooth
manifold. We have the tangency equivalence classes which we may think of
as little arrows, each selected by a section of the tangent bundle. This is
a quite straightforward transfer of ideas from $\mathbb{R}^n$ and should seem natural
and reasonable once you have come to terms with the problem of having to
say everything via charts. But the other way of thinking of a vector field as
an operator on the space $\mathcal{C}^\infty(M)$ has some advantages. One of these is that
it makes sense without immediate reference to charts. Of course, we need
charts to say what it means for some map $f : M \to \mathbb{R}$ to be differentiable, but
given that, we have a pleasant freedom from particular coordinate systems.
Physicists are particularly interested in this, because the physical universe
does not come equipped with charts any more than it has an origin and axes
sticking out of it. Recall penguins, and what they do, versus the language for
talking about them given by charts. Now we want to focus on the behaviour
of the physical universe (penguins) and not be too distracted by the language
(charts). So an invariant description, that is one which does not depend
on choosing a particular language, is definitely more physical. Note Oliver
Heaviside's remarks quoted at the top of chapter three of the textbook.
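On $\mathbb{R}^n$ we can therefore write a vector field $v$ as a map
$$v : \mathcal{C}^\infty(\mathbb{R}^n) \to \mathcal{C}^\infty(\mathbb{R}^n)$$
with $vf$ the map
$$\sum_{i \in [1:n]} v^i(x)\,\frac{\partial f}{\partial x^i}.$$
This can be compressed into
$$v = \sum_{i \in [1:n]} v^i\,\frac{\partial}{\partial x^i}$$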
An even more compact form is
$$v = \sum_{i \in [1:n]} v^i\,\partial_i$$
We can make this even terser by using the Einstein Summation Convention,
which is that if an index is repeated as a superscript and a subscript then we
automatically sum over the possible values. This gives us
$$v = v^i\,\partial_i$$
where you have to know what the space is in which we are working to know
how many $i$s there are. For some reason physicists prefer to use Greek letters
as indices, which means that you are likely to find expressions such as
$$v = v^\mu\,\partial_\mu$$
instead. I fear that you will have to get used to this as the textbook is
committed to it.
This leads to a new definition of a vector field on a smooth manifold $M$.
First we define $\mathcal{C}^\infty(M)$ to be the set of smooth maps from $M$ to $\mathbb{R}$. This
is clearly a real vector space. It is certainly possible to add and scale the
functions, and the rest is simple axiom bashing, as done in second year. It
is rather more than just a vector space, it is an algebra, which is to say it is
possible to multiply any pair of elements, $fg$ being the function
$$\forall a \in M, \quad (fg)(a) = f(a)\,g(a)$$
where the right hand side of the equality means we just multiply the two real
numbers $f(a)$ and $g(a)$. The multiplication is associative, commutative, and
left and right distributive over addition. In other words, it is a real vector
space which is also a commutative ring, which is basically what we mean by
a real algebra. You should write down the complete list of axioms for such a
thing, not relying on the textbook too much.
Now we define a linear operator on such an algebra $A$ by saying it is a map
$$v : A \to A$$
which is linear, that is,
$$\forall f, g \in A, \quad v(f + g) = vf + vg$$
and
$$\forall f \in A,\ \forall t \in \mathbb{R}, \quad v(tf) = t\,v(f)$$
Such an operator is called a derivation if it also satisfies
$$\forall f, g \in A, \quad v(fg) = f\,v(g) + g\,v(f)$$
which you will recognise as Leibniz's Rule for differentiating a product function.
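A quick numerical sanity check of the Leibniz condition, at one point and for one direction, might look like the sketch below. The helper `derive` and the test functions are assumptions made for illustration, and central differences stand in for exact derivatives, so the two sides agree only up to discretisation error.

```python
import math

def derive(f, v, p, h=1e-5):
    """Directional derivative of f at the point p along the vector v (central difference)."""
    (x, y), (u, w) = p, v
    return (f(x + h * u, y + h * w) - f(x - h * u, y - h * w)) / (2 * h)

f = lambda x, y: math.sin(x) + y
g = lambda x, y: x * y
fg = lambda x, y: f(x, y) * g(x, y)      # pointwise product in the algebra

p, v = (0.3, -1.2), (2.0, 0.5)
lhs = derive(fg, v, p)                                     # v(fg)
rhs = f(*p) * derive(g, v, p) + g(*p) * derive(f, v, p)    # f v(g) + g v(f)
print(abs(lhs - rhs))   # small, limited by the finite-difference error
```

This proves nothing, of course, but it is a cheap way to catch a sign error before attempting the proof.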
Exercise 2.4.1. Take $M = \mathbb{R}^2$ and any smooth vector field on it. Show that
it is a derivation.
Note that it makes sense to define a derivation over any real algebra and
algebraists indeed do exactly this. This is a long way from differentiating
functions, but it gives all the essential properties, and algebraists have a
habit of studying the properties without much caring where they came from.
They have their uses. Algebraists, that is.
We can finally define a vector field on a manifold $M$ as a derivation on the real
algebra $\mathcal{C}^\infty(M)$. Such a definition has advantages and disadvantages. The
obvious disadvantage is that it is so abstract it seems to have nothing to do
with the things we care about, but the advantage is that the abstraction has
removed all the irrelevancies which get in the way of thinking about things
and left the bare essentials. Any lingering suspicion that the geometric baby
has been thrown out with all that bathwater may be put to rest by checking
through the last exercise carefully, and by doing it with $S^1$ instead of $\mathbb{R}^2$:
Exercise 2.4.2. Take $M = S^1$ and any smooth vector field $v$ on it regarded
as a section of the tangent bundle. Show that $v$ is a derivation: take some
simple functions from $S^1$ to $\mathbb{R}$ and operate on them by $v$. Confirm that all
the rules for a derivation are satisfied.
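We also need to be able to go in the opposite direction: if $v : \mathcal{C}^\infty(M) \to
\mathcal{C}^\infty(M)$ is a derivation, then it must be able to be expressed as a vector field
in the earlier sense.
Exercise 2.4.3. Do this on $\mathbb{R}^2$ at the origin. Suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is a
smooth map. Then we can write
$$f\begin{pmatrix} x \\ y \end{pmatrix} = f(0) + \left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]\begin{pmatrix} x \\ y \end{pmatrix} + ax^2 + 2bxy + cy^2$$
where the first partial derivatives are evaluated at the origin and $a, b, c$ are
(up to a factor of $\tfrac{1}{2}$) second order partial derivatives of $f$ evaluated at some point
between the origin and $(x, y)^T$ (and hence, we have to admit, depend upon
$x$ and $y$). This is just the Taylor expansion with Lagrange form of the
remainder in two dimensions.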
Now apply $v$ to $f$ to get a new function $g$: then $g(0)$ must be the limit of
$g(x, y)^T$ as $(x, y)^T \to 0$, as $g$ is certainly continuous, and show that since
$v$ is linear, $g(x, y)^T$ must be the sum of the action of $v$ on the above three
terms in a neighbourhood of the origin, that $v$ takes the constant first term
to zero, and that since $v$ satisfies the Leibniz condition, $g(0)$ must be
$$\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]\begin{pmatrix} u \\ v \end{pmatrix}$$
for some vector $(u, v)^T \in \vec{\mathbb{R}}^2$. Finally show that if it works on $\mathbb{R}^2$ it must
work on $\mathbb{R}^n$ and also on any smooth manifold.
We can now define $\mathrm{Vect}(M)$ or $\mathfrak{X}(M)$ as the set of all vector fields on the
smooth manifold $M$.
Exercise 2.4.4. Show that $\mathrm{Vect}(M)$ ($\mathfrak{X}(M)$) is a real vector space. Show
that it is a module over $\mathcal{C}^\infty(M)$, that is, it is like a vector space over $\mathcal{C}^\infty(M)$
except that $\mathcal{C}^\infty(M)$ is not a field but a ring.
Exercise 2.4.5. Show that $\mathfrak{X}(M)$ for $M = \mathbb{R}^n$, as a module over $\mathcal{C}^\infty(\mathbb{R}^n)$, is finite
dimensional and has the obvious basis.
2.5 Cotangent Bundles
I mentioned earlier that we could do the business of equivalence classes of
maps from the manifold to $\mathbb{R}$ in exactly the same way as we took maps from
$\mathbb{R}$ to the manifold. If we do this we get an exact parallel and a tangency
equivalence class of such maps at a point is called a cotangent or covector
at the point. Somewhat easier is to define the space of cotangents at $a \in X$
for a smooth manifold $X$ as the dual space of $T_a(X)$. Recall that the dual
(vector) space for a space $V$ is the space $V^*$ of linear maps from $V$ to $\mathbb{R}$. I
shall say more about this in the next chapter. We can do exactly the same
process of taking the union of all the $T_a(X)^*$ as we did for the tangent bundle
and this gives us a slightly different object called the cotangent bundle. It
has to be admitted that there is no difference between them as topological
spaces. All the difference is in the algebra and it manifests itself strongly
when we look to see what happens under maps between manifolds.
Exercise 2.5.1. I have given two different definitions of the cotangent space.
Show they are equivalent.
The same sort of considerations as worked for vector fields apply to covector
fields or differential 1-forms as they are more commonly known. At each
point of $\mathbb{R}^2$, we select an element of $\vec{\mathbb{R}}^{2*}$, the cotangent space, which again has
two components. I suppose we might call the standard basis for this space
$(e^1, e^2)$ where $e^1$ is the linear map from $\vec{\mathbb{R}}^2$ to $\mathbb{R}$ which projects everything
onto the first component and $e^2$ projects everything onto the second component.
But we actually call them $dx$, $dy$ to be loosely consistent with the
classical notation. So we interpret $dx$ as the linear map which takes $(x, y)^T$
to $x$ where $(x, y)^T$ is a point in the tangent space $T_a(\mathbb{R}^2)$ at some point $a$.
Similarly for $dy$. So a differential 1-form or covector field on $\mathbb{R}^2$ is written
$$P(x, y)\,dx + Q(x, y)\,dy$$
The generalisation to $\mathbb{R}^n$ is of course
$$\sum_{i \in [1:n]} \omega_i(x)\,dx^i \qquad (\text{or } \omega_i\,dx^i \text{ using the Einstein summation convention})$$
and this, for smooth functions $\omega_i$, $i \in [1 : n]$, represents a covector field
or differential 1-form on $\mathbb{R}^n$. The preference for letters towards the end of
the Greek alphabet to denote differential forms is widespread so again you
ought to get used to it. The subscripts instead of superscripts for indices
tell you something about the covariance or contravariance of the entities. I
shall explain this properly shortly.
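As a small illustration of how a 1-form eats a tangent vector, here is a sketch in Python. The helper `one_form` is a made-up name; the form is represented, by assumption, as a function of a base point and a tangent vector at that point.

```python
def one_form(P, Q):
    """omega = P dx + Q dy, returned as a function of (point, tangent vector)."""
    def omega(point, vector):
        x, y = point
        u, w = vector            # dx picks out u, dy picks out w
        return P(x, y) * u + Q(x, y) * w
    return omega

# Example: omega = y dx - x dy, evaluated on the vector (3, 4) attached at (1, 2).
omega = one_form(lambda x, y: y, lambda x, y: -x)
print(omega((1.0, 2.0), (3.0, 4.0)))   # 2*3 - 1*4
```

At each fixed point the form is linear in the vector, which is all that being an element of the dual space asks for.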
If you wonder why on earth anybody bothers to distinguish between vector
fields and differential 1-forms, one answer is that it is natural to differentiate
$k$-forms to get $(k + 1)$-forms for $k \in \mathbb{N}$. This is what Stokes' theorem is really
all about. As you ought to have learnt in second year but probably didn't.
2.6 The Tangent Functor
Suppose $f : X \to Y$ is a differentiable map between manifolds. Then for the
case where $X = \mathbb{R}^n$ and $Y = \mathbb{R}^m$ there is a map between the tangent spaces
at each point which takes the tangent space at $a \in X$ to the tangent space
at $f(a) \in Y$. To take a tangent vector $v_a$ in the tangent space $T_a(X)$ to one
in the tangent space $T_{f(a)}(Y)$ all we have to do is to operate on it by $Df(a)$,
which is by definition a linear map and has the right dimensions for domain
and codomain. If we are prepared to choose a basis for $T_a(X)$ and $T_{f(a)}(Y)$
we could represent $Df(a)$ by a matrix, and there is a perfectly sensible way
of choosing the same basis for tangent spaces over different points. All this
makes sense even if $X$ and $Y$ are just finite dimensional real vector spaces
without the extra structure of $\mathbb{R}^n$. In fact it makes sense in arbitrary Banach
spaces.
Of course, there is a slight problem of how to extend this to manifolds which
are not Banach spaces. Spheres and tori spring to mind.
If we take $v_a$, and recall that it is a tangency equivalence class of curves
$v : (-1, 1) \to X$ taking 0 to $a$, then $f \circ v$ is a curve through $f(a)$ and it
specifies a tangency class. Moreover if $v'$ is tangent to $v$ at $a$ then $f \circ v'$ is
tangent to $f \circ v$ at $f(a)$.
Exercise 2.6.1. Most of this should have been a second year exercise but
probably wasn't. Do it now and all about tangent vectors and maps will be
clear. Well, clearer.
1. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by
$$\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} x^2 + x + y + y^2 \\ 1 + xy \end{pmatrix}$$
Compute $f$ on the set of points $(t, 0)^T$ for $t \in [0, 1]$. Do this by choosing
ten points along the interval and evaluating $f$ on them and plot them
on a sheet of graph paper to obtain ten points which should lie on a
smooth curve. Do the same for points on the interval $(0, t)^T$ for $t \in [0, 1]$.
2. Calculate $f(1/10, 0)^T$ and $f(0, 1/10)^T$ if you haven't already.
3. Calculate $Df\begin{pmatrix} 1 \\ 0 \end{pmatrix}$.
4. Evaluate the above matrix on the tangent vector $\vec{e}_1$.
5. Evaluate the above matrix on the tangent vector $\vec{e}_2$.
6. Map the two tangent vectors obtained by the last two jobs on the same
graph.
7. Represent the tangent vector $\vec{e}_1$ by any curve $c_1$ in the tangency equivalence
class and compose with $f$. Differentiate to find a linear representative
of $Tf(0, \vec{e}_1)$.
8. Repeat for a curve $c_2$ representing $\vec{e}_2$.
9. Sketch the curves $f \circ c_1$ and $f \circ c_2$.
10. Prove the claim that if $v'$ is tangent to $v$ at $a$ then $f \circ v'$ is tangent to
$f \circ v$ at $f(a)$.
It follows that $f$ induces a map $Tf$ which takes tangent vectors at $a$ to tangent
vectors at $f(a)$. This process doesn't, on the face of things, involve
differentiation. Nor does it involve charts. Of course it does involve differentiation,
as the last series of exercises shows convincingly. And it is easy to
see that it goes through on charts for the usual reasons, which involve the
chain rule.
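The claim that pushing curves forward really is differentiation can be checked numerically. In the sketch below the map `g`, its hand-computed Jacobian and the two curves are all hypothetical choices made for illustration (deliberately not the map of the exercise), and velocities are approximated by central differences.

```python
def g(x, y):
    """A hypothetical smooth map from the plane to the plane, chosen only for illustration."""
    return (x * y, x + y**2)

def jacobian_g(x, y):
    """Dg(x, y), worked out by hand from the formula for g."""
    return [[y, x],
            [1.0, 2 * y]]

def velocity(curve, h=1e-5):
    """Velocity at t = 0 of a curve in the plane, by central difference."""
    (p1, p2), (q1, q2) = curve(h), curve(-h)
    return ((p1 - q1) / (2 * h), (p2 - q2) / (2 * h))

a = (1.0, 2.0)
v = (3.0, -1.0)

# Two different curves through a with the same velocity v at t = 0,
# that is, two representatives of one tangency equivalence class.
straight = lambda t: (a[0] + t * v[0], a[1] + t * v[1])
bent = lambda t: (a[0] + t * v[0] + t**2, a[1] + t * v[1] - 5 * t**2)

pushed_straight = velocity(lambda t: g(*straight(t)))
pushed_bent = velocity(lambda t: g(*bent(t)))

J = jacobian_g(*a)
Dg_v = (J[0][0] * v[0] + J[0][1] * v[1],
        J[1][0] * v[0] + J[1][1] * v[1])
print(pushed_straight, pushed_bent, Dg_v)   # three nearly equal vectors
```

Both representatives give the same pushed-forward velocity, and that velocity agrees with the Jacobian applied to `v`, up to the finite-difference error.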
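In the case when we have a differentiable $f : \mathbb{R}^n \to \mathbb{R}^m$ the last exercises
should convince you we have at each point $a \in X$ the diagram
$$\begin{array}{ccc}
T_a(X) & \xrightarrow{\;Df(a)\;} & T_{f(a)}(Y) \\
\downarrow \pi_X & & \downarrow \pi_Y \\
X & \xrightarrow{\;\;f\;\;} & Y
\end{array}$$
This diagram commutes, which means whichever way around you go you get
the same result. We can do this for every point $a \in X$ to get the commutative
diagram:
$$\begin{array}{ccc}
TX & \xrightarrow{\;Tf\;} & TY \\
\downarrow \pi_X & & \downarrow \pi_Y \\
X & \xrightarrow{\;\;f\;\;} & Y
\end{array}$$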
The process of taking a manifold and producing its tangent bundle is said to
be functorial because if we have two manifolds and a smooth map between
them the process gives a map between the bundles.
Instead of writing $Tf$ we often write $f_*$ for the same map. This is more
general because it makes sense for some other vector bundles and not just
the tangent bundle.
Such a map between tangent bundles is said to be fibre preserving, since
it takes anything in the fibre over $a$ to the fibre over $f(a)$. And we can
generalise this to maps between any fibre bundles, so they are also called
bundle maps. If the fibre is a vector space we talk of vector bundles and we
require the bundle maps to be linear, so the map Tf is also a vector bundle
map.
Note that the map $Tf$ contains all the information about the derivative and
also tells you where things are, which the derivative (being only the linear
part of an affine map) does not. So this is actually cleaner and conceptually
simpler than the usual description of the business of differentiation. Another
way to put this, in the light of the last exercise, is that when you calculate
lots of partial derivatives you are merely trying to calculate the linear part of
an affine map which specifies a tangency equivalence class, that is, a tangent
vector.
We can usefully think of $Tf$ as coming in two parts, since locally the tangent
bundle is simply a cartesian product of possible tangent vectors over a space
with a part of the space. On the first part $Tf$ is simply $f$ and on the second
part, the fibres, it is $Df$, the derivative of $f$. We can now choose to define the
derivative of a smooth map this way. I have hankered after teaching calculus
this way in first year. It is actually easier, probably because you need to
isolate the core ideas in order to generalise things, and fronting up to the core
ideas, although demanding at first, makes life a lot easier subsequently.
Note that the chain rule can now be formulated as
$$T(f \circ g) = Tf \circ Tg$$
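One way to see the tangent map at work on the real line is to carry a base point and a velocity through a computation together. The sketch below does this with a small hypothetical class `Tangent` (what programmers call dual numbers or forward-mode differentiation); it handles only sums and products, which is enough for polynomial maps, and `f` and `g` are arbitrary examples.

```python
class Tangent:
    """A point of the tangent bundle of the line: a base point x with a velocity dx."""
    def __init__(self, x, dx):
        self.x, self.dx = x, dx

    def __add__(self, other):
        other = lift(other)
        return Tangent(self.x + other.x, self.dx + other.dx)
    __radd__ = __add__

    def __mul__(self, other):
        other = lift(other)
        # Leibniz rule on the fibre component
        return Tangent(self.x * other.x, self.x * other.dx + self.dx * other.x)
    __rmul__ = __mul__

def lift(c):
    """Treat a plain number as a constant, that is, with zero velocity."""
    return c if isinstance(c, Tangent) else Tangent(c, 0.0)

def T(f):
    """Tangent map of f, sending (a, v) to (f(a), Df(a) v)."""
    def Tf(a, v):
        out = f(Tangent(a, v))
        return (out.x, out.dx)
    return Tf

f = lambda x: x * x + 1            # f(x) = x^2 + 1
g = lambda x: 3 * x + x * x * x    # g(x) = 3x + x^3
f_after_g = lambda x: f(g(x))

a, v = 2.0, 1.0
left = T(f_after_g)(a, v)          # T(f o g)
right = T(f)(*T(g)(a, v))          # Tf o Tg
print(left, right)
```

The first component of each result records where the point went and the second records what happened to the velocity, so the equality of `left` and `right` is the chain rule in the form stated above, for this one pair of maps.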
Exercise 2.6.2. Confirm that the chain rule holds. This is also a part of
saying that $T$ is functorial.
Exercise 2.6.3. Guess what a functor is and what it is a map between.
Confirm your guess by doing some googling. I warn against doing the googling
first.
Exercise 2.6.4. Take $f : \mathbb{R}^+ \to \mathbb{R}^+$ defined by $x \mapsto x^2$. Show this is a
diffeomorphism. Let $V$ be the vector field on $\mathbb{R}^+$ which has constant vectors
of length 1 at every point. Show that $Tf$ takes this into a new vector field on
$\mathbb{R}^+$, and say what the new vector field is. Regarding the two vector fields as
differential equations, find both solutions.
2.6.1 The (non-existent) Cotangent Functor
Suppose we have $f : X \to Y$ a smooth map between smooth manifolds,
and we look to see what happens in the cotangent bundle. Thinking of a
cotangent at $a \in X$ as a tangency equivalence class of maps from some
neighbourhood of $a$ to $\mathbb{R}$, we see that the map between the fibres goes in
the reverse direction. Given $v : W \subseteq Y \to \mathbb{R}$ as a representative function in
the tangency equivalence class at $f(a)$ (with $f(a) \in W$), $f$ induces
$v \circ f : f^{-1}(W) \to \mathbb{R}$ on $X$ which defines a cotangent vector at $a$. So we obtain the
diagram:
$$\begin{array}{ccc}
T^*_a X & \xleftarrow{\;f^*\;} & T^*_{f(a)} Y \\
\downarrow \pi_X & & \downarrow \pi_Y \\
X & \xrightarrow{\;\;f\;\;} & Y
\end{array}$$
This makes $T^*$, a hypothetical induced map on the whole cotangent space,
a mess, because it goes one way (left to right) on the space part and the
opposite way (right to left) on the cotangent part. If $f$ has a smooth inverse
we can get around this, but it is not so neat. Incidentally:
Definition 2.6.1. A smooth map with a smooth inverse is called a (smooth)
diffeomorphism.
Exercise 2.6.5. How, if at all, can we relate the derivative of $f$ to $f^*$ when
$X = \mathbb{R}^n$, $Y = \mathbb{R}^m$?
Remark 2.6.1. In older books, a covector field is called a contravariant
vector field and a vector field is called a covariant vector field. See for
example, Mackey's Theoretical Foundations of Quantum Mechanics. As we
shall see later, a covariant vector field is a contravariant tensor field. Don't
blame me for this.
This is all rather confusing on first encounter. Familiarity breeds acceptance
and the best way to become familiar with these ideas is to work them through
in very simple cases. So make up a set of exercises yourself in which you work
with particular simple maps between very simple manifolds ($\mathbb{R}^n$ and $\mathbb{R}^m$ for
$n, m$ small positive integers). As a start:
Exercise 2.6.6. Let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = x^2$. Put $a = 2$ and
investigate what happens if we take (a) a tangent vector at 2 and (b) a
cotangent vector at 4.
Now try it for $f : \mathbb{R}^2 \to \mathbb{R}^2$ with
$$\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} x^2 + y^2 \\ xy \end{pmatrix}$$
and some suitable points for $a$ and $f(a)$. In this case you can conveniently
represent tangent vectors as columns and cotangents as rows.
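The columns-and-rows bookkeeping can be sketched as follows. The matrix `J` stands for $Df(a)$ for some map $f$ and is a made-up example, not one computed from the exercise; the helper names are likewise hypothetical.

```python
def mat_vec(J, v):
    """Column picture: push a tangent vector forward, v -> J v."""
    return [sum(J[i][k] * v[k] for k in range(len(v))) for i in range(len(J))]

def vec_mat(w, J):
    """Row picture: pull a cotangent vector back, w -> w J."""
    return [sum(w[i] * J[i][k] for i in range(len(w))) for k in range(len(J[0]))]

def pair(w, v):
    """Evaluate the row w on the column v."""
    return sum(wi * vi for wi, vi in zip(w, v))

J = [[2.0, -1.0],
     [0.5, 3.0]]     # hypothetical Df(a), not taken from the text
v = [1.0, 4.0]       # a tangent vector at a (column)
w = [-2.0, 1.0]      # a cotangent vector at f(a) (row)

pushed = mat_vec(J, v)    # a tangent vector at f(a)
pulled = vec_mat(w, J)    # a cotangent vector at a

# Evaluating w on the pushed vector equals evaluating the pulled row on v.
print(pair(w, pushed), pair(pulled, v))
```

The same matrix serves both directions: it multiplies columns on the left to go forwards and rows on the right to go backwards.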
Exercise 2.6.7. Write out a lecture for first year students which describes
tangent vectors on $\mathbb{R}$ in a really simple way as possible velocities along the
line, and hence define the tangent bundle $\mathbb{R} \times \vec{\mathbb{R}}$. Define differentiation of
maps from $\mathbb{R}$ to $\mathbb{R}$ in terms of bundle maps. Prove the chain rule as
$T(f \circ g) = Tf \circ Tg$. Be prepared to answer any awful questions an intelligent student
might ask.
Write out a lecture on ordinary differential equations in terms of sections of
the tangent bundle. Set up and solve some easy ones in this notation.
Do you think this is easier or harder than the traditional way of doing it?
Assume that since Mathematica can solve ODEs, the idea is not to train
students to jump through hoops but to get them to understand what they
are doing.
2.7 Autonomous Systems of ODEs
2.7.1 Systems of ODEs and Vector Fields
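Consider the system of linear ordinary differential equations:
$$\dot{x} = -y, \qquad x(0) = 1$$
$$\dot{y} = x, \qquad y(0) = 0$$
We can write this as a two dimensional problem:
$$\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
or more succinctly:
$$\dot{\mathbf{x}} = A\mathbf{x} \qquad (2.7.1)$$
where $A$ is the above matrix.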
The matrix $A$ defines a vector field on $\mathbb{R}^2$ by taking the location $\mathbf{x}$ to the
vector $A(\mathbf{x})$. We are now used to the idea of a vector field on $\mathbb{R}^2$ both visually
in terms of lots of little arrows stuck on the space (which can incidentally be
generated quickly and painlessly using Mathematica), and algebraically as
a map from $\mathbb{R}^2$ to $\vec{\mathbb{R}}^2$ sending locations to arrows (with their tails attached
to those locations).
Such a system of ordinary differential equations is called autonomous, meaning
that the vector field specified by the system doesn't change in time.
Consequently we can either refer to an Autonomous System of Ordinary Differential
Equations defined on an open set $U \subseteq \mathbb{R}^n$, or we can talk about a
Smooth Vector Field on $U$. The second is much shorter and easier to think
about.
If we draw the vector field in the above case, we get arrows which go around
the space in a positive direction as in figure 2.7.1.
Figure 2.7.1: A vector field or system of ODEs in $\mathbb{R}^2$
A solution to the system of differential equations, or an integral curve for the
vector field, is a map $f : \mathbb{R} \to \mathbb{R}^2$, usually written
$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix}$$
with the property that $x$ and $y$ satisfy the given system of equations. What
this means is that we think of a point moving in $\mathbb{R}^2$ so that its velocity at
any point is just the vector attached to that point. So the solution curve has
to have the vector field tangent to it always.
It is possible to learn to solve autonomous systems of differential equations
without ever understanding that they are all about vector fields which give
the velocity of a moving point, and that a solution is simply a function which
says where the moving point is at any time, and which agrees with the given
vector field in what the velocity vector is. This is a pity.
In the above case, you can see by looking at the system what the solution
is: obviously the solution orbits are circles, and given the initial condition
where at time $t = 0$ we start at the point $(1, 0)^T$, the solution can be written
down as
$$x = \cos(t), \qquad y = \sin(t)$$
and it is easy to verify that this works.
Exercise 2.7.1. Do it.
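Alongside the pencil-and-paper verification, one can follow the arrows numerically and see where the moving point ends up. The sketch below uses a classical fourth-order Runge-Kutta step, which is my choice of method and not something the text prescribes; the function names and the step count are arbitrary.

```python
import math

def field(x, y):
    """The vector attached to the point (x, y) by the system: (-y, x)."""
    return (-y, x)

def rk4_step(x, y, dt):
    """Advance the moving point by one classical Runge-Kutta step of size dt."""
    k1 = field(x, y)
    k2 = field(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
    k3 = field(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
    k4 = field(x + dt * k3[0], y + dt * k3[1])
    x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y

def integrate(t_end, steps=1000):
    """Start at (1, 0) at time 0 and follow the field until time t_end."""
    x, y = 1.0, 0.0
    dt = t_end / steps
    for _ in range(steps):
        x, y = rk4_step(x, y, dt)
    return x, y

x_end, y_end = integrate(2.0)
print(x_end - math.cos(2.0), y_end - math.sin(2.0))   # both differences are tiny
```

The numerical point stays very close to $(\cos t, \sin t)$, which is reassuring but is of course no substitute for doing the exercise.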
Obviously, solving initial value ODE problems for more complicated vector
fields isn't going to be so easy, and doing it in dimensions greater than three
by the "look at it and think" method also looks doomed. So it is desirable
to have a general rule for getting out the solution. Fortunately this is easy
enough for linear vector fields in principle, although the calculations can be
messy in practice. But again, that's what computers are for.
2.7.2 Exponentiation of Things
I did this in second year M213 but some of you may have missed out on it in
which case here it is. Those of you who did it can read this rather quickly.
If you write down the usual series for the exponential function you get:
$$\exp(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots$$
Now think about this and ask yourself what $x$ has to be for this to make
sense. You are used to $x$ being a real number, but it should be obvious that
it could equally well be a complex number. After all, what do you do with
$x$? Answer: you have to be able to multiply it by itself lots of times, and you
have to be able to scale it by a real number, and you have to be able to add
the results of this. You also have to have an identity to represent $x^0$. Oh, and
you need to be able to take limits of these things. So it will certainly work
for $x$ a real or a complex number. But it also makes sense if $x$ is a square
matrix. Or, indeed, in any system where the objects can be added and scaled and
multiplied by themselves, and which has limits of sequences of these things.
The name of a system of objects which can be added and scaled by real
numbers is a vector space, and a vector space where the vectors can also
be multiplied is called an algebra. We can do exponentiation in any algebra
which has a norm and a multiplicative identity. (And it would be a help if it
was complete in that norm, i.e. limits of Cauchy sequences exist.) The square
$n \times n$ matrices form such an algebra. We can also hope to take sequences of
them and maybe have them converge to some matrix. So we can exponentiate
square matrices.
Exercise 2.7.2. Exponentiate the matrix $A$ in equation 2.7.1. Now exponentiate
the matrix $tA$. Do you recognise the result?
It should be obvious that we could, in principle, calculate the exponential of
a matrix to some number of terms, and if the infinite sum makes sense and
the sequence of partial sums converges, then we could always get some sort of
estimate of $\exp(A)$ for any matrix $A$ by computing enough terms. We would
hope that multiplying $A$ by itself $n$ times would give some reasonable sort of
matrix, and when we divided all the entries by $n!$ we would get something
pretty close to the zero matrix. If this happened for all the $n$ past some
point, then we could optimistically suppose that $\exp(A)$ was some matrix
which we could at least get better and better approximations to, which after
all is exactly what we have with $\exp(x)$ for $x$ a real number.
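The partial-sum idea is easy to try by machine. The sketch below is a naive implementation written for this illustration (the names `mat_mul` and `mat_exp` and the cut-off of 30 terms are arbitrary), and the test matrix `B` is a made-up example whose exponential is known in closed form, chosen so as not to give away the exercise above.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=30):
    """Partial sum I + A + A^2/2! + ... of the exponential series, with `terms` terms."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]   # identity, A^0
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, A)
        term = [[entry / k for entry in row] for row in term]        # now A^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# B = I + N with N nilpotent, so exp(B) = e (I + N).
B = [[1.0, 1.0],
     [0.0, 1.0]]
E = mat_exp(B)
print(E)          # entries close to e, e, 0, e
print(math.e)
```

Because each new term is the previous one multiplied by the matrix and divided by `k`, the factorials never have to be computed separately, and the terms shrink rapidly once `k` exceeds the size of the entries.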
Exercise 2.7.3. Define the norm of an $n \times n$ matrix $A$ to be
$$\|A\| = \sup_{\|x\| = 1} \|A(x)\|$$
as in an earlier problem, and show that $\|A^2\| \le (\|A\|)^2$. Hence prove that
the function $\exp$ is always defined for any $n \times n$ matrix.
Exercise 2.7.4. If $e^{tA} = \exp(tA)$ denotes a map from $\mathbb{R}$ to the space of $n \times n$
matrices, show that its derivative is $Ae^{tA}$.
There are other algebras where a bit of exponentiation makes sense, so be
prepared for them.
2.7.3 Solving Linear Autonomous Systems
In principle this is now rather trivial:
Proposition 2.7.1. If $\dot{x} = Ax$ is an autonomous linear system of ODEs
with $x(0) = a$, then
$$x = e^{tA}a$$
is the solution.
Proof:
Differentiating $e^{tA}$ gives $Ae^{tA}$ by the last exercise and since $e^{0} = I$, the
identity matrix, the initial value $x(0) = a$ is satisfied. So it is certainly a
solution.
If this looks a bit like a miracle and in need of explanation, you are thinking
sensibly and merely need to do more of it. It may help to note that the
exponential function is the unique function with slope at a point the same
as the value at the point, and that this leads to the general solution for the
linear ODE in dimension one, and that this goes over to higher dimensions
with no essential changes. In effect, the exponential function was invented to
solve all these cases. It actually goes deeper than this; see Vladimir Arnold's
book Ordinary Differential Equations.
2.7.4 Existence and Uniqueness
Could you have two different solutions (or more)? No, not for linear systems,
but this requires thought. Certainly the 1-dimensional ODE given by
$$\dot{x}(t) = 3x^{2/3}, \qquad x(0) = 0$$
has the solution $x(t) = t^3$ but also the solution $x(t) = 0$. It also has infinitely
many other solutions. (Can you find some?) Of course this is not a linear
ODE, but it is clear that some sort of conditions will need to be imposed
before we can look at vector fields which are not linear and expect them to
have solutions. Happily, there is a simple one which guarantees at least local
existence and uniqueness:
Theorem 2.7.1. If $f : U \subseteq \mathbb{R}^n \to \mathbb{R}^n$ is a continuously differentiable
vector field, then for any point $a$ in $U$ there is a neighbourhood $W \subseteq U$
of $a$ containing a solution to the system of equations $\dot{x} = f(x)$ with $a$ as
initial value, and the solution is unique. Moreover, there is a continuously
differentiable map $F : W \times J \to \mathbb{R}^n$ for some interval $J = (-a, a)$ on $0 \in \mathbb{R}$
such that for all $b$ in $W$, the map $F_b : J \to \mathbb{R}^n$ is the solution for initial
value $b$ at $t = 0$.
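It is worth seeing why the earlier example with right-hand side $3x^{2/3}$ escapes this theorem: that vector field is continuous but not differentiable at 0. The sketch below (with made-up names, and restricted to $x \ge 0$ to avoid fractional powers of negative numbers) checks the two solutions already mentioned and shows the difference quotients at 0 growing without bound.

```python
def rhs(x):
    """Right-hand side of the ODE x' = 3 x^(2/3), for x >= 0."""
    return 3.0 * x ** (2.0 / 3.0)

# x(t) = t^3 has derivative 3 t^2, which should equal rhs(t^3).
for t in [0.5, 1.0, 2.0]:
    assert abs(3 * t**2 - rhs(t**3)) < 1e-9

# x(t) = 0 has derivative 0, which should equal rhs(0).
assert rhs(0.0) == 0.0

# Difference quotients of rhs at 0: they blow up as h shrinks, so rhs has no
# derivative there and the hypothesis of the theorem fails at that point.
quotients = [(rhs(h) - rhs(0.0)) / h for h in (1e-2, 1e-4, 1e-6)]
print(quotients)
```

So there is no contradiction: the theorem simply says nothing about an initial value sitting at a point where the field fails to be continuously differentiable.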
There is a proof in Hirsch and Smale's Differential Equations, Dynamical
Systems and Linear Algebra, pages 163 to 169.
There is a better proof in Arnold's book on page 213. It is actually the
same proof but much better explained. It is given for the general (non-autonomous)
case. Both arguments use the contraction mapping theorem.
You should read through it if you have not already done a proof in your
ODEs course. Assuming you did one.
The results follow easily from a more basic result sometimes called The
Straightening Out Theorem (in Arnold, "the basic theorem of the theory of
ordinary differential equations" or "the rectification theorem"; see chapter 2).
The theorem says that in a neighbourhood $U$ of a point of $\mathbb{R}^n$ where the
(continuously differentiable) vector field is non-zero, we can find a one-one
differentiable map from $U$ to $W \subseteq \mathbb{R}^n$ with a differentiable inverse, such that
the transformed vector field on $W$ is uniform and constant.
Given that we can do that, we could also make the vectors all have length
one and lie along the $x_1$ axis in $\mathbb{R}^n$ with a rotation and scaling. The system
of ODEs then would be, in this transformed region $W$, the rather boring
system:
$$\dot{x}_1 = 1, \quad \dot{x}_2 = 0, \quad \ldots, \quad \dot{x}_n = 0$$
with the solution
$$x_1(t) = t + a_1; \quad x_2(t) = a_2; \quad \cdots \quad x_n(t) = a_n$$
If you believe in the Straightening Out Theorem, then it is obvious that any
continuously differentiable vector field has at any point where the vector field
is non-zero a solution which is unique in some neighbourhood of the point
and which depends smoothly on the point. All we have to do is to map the
straight line boring solution(s) back by the differentiable inverse.
When the vector field is zero at a point, the constant function taking all of
$\mathbb{R}$ to the point is a solution, and by Theorem 2.7.1 it is the only one. So
there is a unique solution here too.
Remark 2.7.1. You will find a proof of the straightening out theorem in
Arnold. I shan't prove it in this course on the grounds that this isn't a course
on ODEs. At least, I don't think it is.
Remark 2.7.2. It should be obvious that although we have looked at systems
of ordinary differential equations on $\mathbb{R}^n$, the fact that everything is
defined locally means that they ship over to any smooth manifold. If the
manifold is compact then the completeness is guaranteed, and the solution
can be found by doing everything in charts and piecing the bits together.
2.8 Flows
I rather slithered over one important point, which is the question of whether
we always get a solution for all time, past and future. It is not hard to see
that the vector field $X(x) = x^2$ on $\mathbb{R}$, with initial value $x(0) = 1$, has the solution
$$x(t) = \frac{1}{1 - t}$$
which goes off to infinity in finite time. From which we deduce that it is
not in general possible to ensure that there is a solution for all time, and
this explains the cautious statement of the last theorem. The best we can
hope to do, the theorem tells us, for a smooth vector field at a point is to
find a neighbourhood of the point in which there is a parametrised curve,
$x(t) : t \in (-a, a)$ where if we are lucky $a$ will be $\infty$ and if we aren't it will
be some possibly rather small positive number.
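The blow-up can also be watched numerically. The sketch below integrates $\dot{x} = x^2$, $x(0) = 1$ with the classical fourth-order Runge-Kutta scheme and compares the result with $1/(1 - t)$ as $t$ approaches 1. The integrator, the step count and the function names are illustrative choices, not anything prescribed by the theory.

```python
def rk4(f, x0, t_end, steps):
    """Integrate the autonomous ODE x' = f(x) from t = 0 to t = t_end."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x


def square(x):
    """The vector field X(x) = x^2 on the real line."""
    return x * x


def exact(t):
    """The solution through x(0) = 1, valid only for t < 1."""
    return 1.0 / (1.0 - t)


# The numerical and exact values agree and both grow without bound as t -> 1.
for t in (0.5, 0.9, 0.99):
    print(t, rk4(square, 1.0, t, 20000), exact(t))
```

No choice of step size rescues the computation beyond $t = 1$: the solution simply is not there.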
Definition 2.8.1. A vector field on $U \subseteq \mathbb{R}^n$ is said to be complete if any
solution can be extended to the whole real line.
Exercise 2.8.1. Show that if a vector field has compact support then it is
complete.
Exercise 2.8.2. Show that if $U$ is the unit open ball in $\mathbb{R}^n$ centred on the
origin and $X$ is a smooth vector field on $U$, then if $X$ is complete, and if
$\mathrm{Proj}(X(x), x)$ is the projection of $X(x)$ on $x$, then
$$\lim_{\|x\| \to 1} \mathrm{Proj}(X(x), x) = 0$$
Remark 2.8.1. It should be obvious that there are not many physical situations
where things go belting off to infinity in finite time, and for that reason
I shall restrict myself from now on to complete vector fields. If I forget to
put the word in, put it in yourself. Also put the word "smooth" in front of
the term "vector field" whenever it occurs since I shall not consider any other
sort.
The business of getting a solution is going to work not just for the point we
selected as our starting point but also for neighbouring points provided we
don't go too far away. In the happy case where the vector field has solutions
for all time, the space $U$ on which the vector field is defined is decomposable
as a set of integral curves, since solutions can't intersect each other, or
themselves, although they can, of course, be closed loops. This statement
follows from the uniqueness of a solution. Hence we deduce that a vector
field gives rise to what is called a foliation of the space into integral curves.
You can, perhaps, guess that partial differential operators more complicated
than vector fields will give rise to higher dimensional foliations, decomposing
the space into surfaces and other manifolds.
Exercise 2.8.3. Describe the foliation of $\mathbb{R}^2$ by the vector field
$$-y \frac{\partial}{\partial x} + x \frac{\partial}{\partial y}$$
Recall that in second year (M213) we discussed the idea of groups acting on
sets and came to the conclusion that they were conveniently seen as
homomorphisms from a group $G$ into the group $\mathrm{Aut}(V)$ of invertible maps from the set $V$
into itself. Then a complete smooth vector field $X$ on $U \subseteq \mathbb{R}^n$ gives rise to
an action of the group $\mathbb{R}$ on $U$ as follows:
$$x : \mathbb{R} \times U \to U, \quad (t, x_0) \mapsto x(t)$$
where $x(t)$ is the integral curve of $X$ with $x(0) = x_0$.
To prove this is indeed a group action, we need to show that $x(0, x_0) = x_0$
for every $x_0$, which follows immediately from my definition of $x$. (Since the
additive identity of $\mathbb{R}$ is 0.) We also need to show that
$$\forall s, t \in \mathbb{R},\ \forall x_0 \in U, \quad x(s, x(t, x_0)) = x(s + t, x_0)$$
which merely means that if you travel for time $t$ from $x_0$ along the solution
curve, and then go on for time $s$, this gives the same result as travelling for
time $s + t$ from the starting point $x_0$, which is, after all, what we expect a
solution curve to do.
If we fix $t$ and look to see what the group action does, it is a map from $U$
to itself. Well, we knew that. It is a truth that this map is always a smooth
diffeomorphism. The old fashioned way of saying this is that the solutions
depend smoothly upon the initial conditions, but I much prefer the modern
way of saying it. You should be able to see that all we are doing is taking
each point as input, and outputting the point it will get to after time $t$.
Proposition 2.8.1. For a complete smooth vector field $X$ on $U$ open in
$\mathbb{R}^n$, for any $t \in \mathbb{R}$, the map $x_t : U \to U$, which sends $x_0$ to $x(t, x_0)$, is a
diffeomorphism of $U$.
Proof:
The map $x_t$ certainly has an inverse, $x_{-t}$. And the theorem on existence of
solutions to an ODE establishes that the map is continuously differentiable
when $X$ is. So if $X$ is smooth, so is $x_t$.
Remark 2.8.2. The set of diffeomorphisms $\{x_t : t \in \mathbb{R}\}$, or in other words
the map $x : \mathbb{R} \times U \to U$, is called in old fashioned books a one-parameter
group of diffeomorphisms. I shall simply say that the map $x$ obtained from
the vector field $X$ is the flow of $X$.
Remark 2.8.3. Given a flow $x$ on $U \subseteq \mathbb{R}^n$ we can always recover the vector
field by simply taking any point $a$ and differentiating the map $x_a : \mathbb{R} \to U$
which sends $t$ to $x(t, a)$ at $t = 0$. This must give us the required vector field
from which the flow can be derived. So there is a correspondence between
flows and vector fields.
You now have four ways of thinking about vector fields. They are bunches
of arrows tacked onto a space; they are autonomous systems of ordinary
differential equations. And they are also flows, obtained by solving the
autonomous system. And last but not least they are operators on the algebra of
smooth functions from the space to $\mathbb{R}$. This demonstrates that vector fields
are more interesting and complicated than you might have supposed.
I shall give one important feature of vector fields which arises from this
multiple perspective and which is much less obvious if you stick only to
systems of ordinary differential equations.
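The properties of a flow used in this section (the identity at time 0, the group law, the inverse of the time-$t$ map, and recovery of the vector field by differentiating at $t = 0$) can be checked numerically on a concrete case. The sketch below assumes the linear field $X(x, y) = (y, x)$ on $\mathbb{R}^2$, chosen only because its flow has a closed form in terms of cosh and sinh; the function names are illustrative.

```python
import math


def X(p):
    """The linear vector field X(x, y) = (y, x) on R^2 (an illustrative choice)."""
    x, y = p
    return (y, x)


def flow(t, p):
    """Closed-form flow of X: the solution of x' = y, y' = x starting at p."""
    x, y = p
    c, s = math.cosh(t), math.sinh(t)
    return (c * x + s * y, s * x + c * y)


def close(p, q, tol=1e-9):
    """Componentwise comparison up to a tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(p, q))


p0 = (1.0, 2.0)
s, t = 0.3, -1.1

# Identity and group law: x(0, p) = p and x(s, x(t, p)) = x(s + t, p).
assert close(flow(0.0, p0), p0)
assert close(flow(s, flow(t, p0)), flow(s + t, p0))

# The inverse of the time-t map is the time-(-t) map.
assert close(flow(-t, flow(t, p0)), p0)

# Differentiating t -> x(t, p) at t = 0 (central difference) recovers X(p).
h = 1e-6
q_plus, q_minus = flow(h, p0), flow(-h, p0)
velocity = tuple((a - b) / (2 * h) for a, b in zip(q_plus, q_minus))
assert close(velocity, X(p0), tol=1e-6)
```

The assertions pass silently; changing `flow` to something that is not a solution of the system makes at least one of them fail.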
2.9 Lie Brackets
Write, as is conventional in some areas, $X$ and $W$ for two vector fields in
$\mathfrak{X}(\mathbb{R}^n)$, bearing in mind that we can compose any such operators to get
$X \circ W$ and $W \circ X$ (which we write $XW$ and $WX$ for short). In general the
result is a perfectly good operator but some calculations will rapidly convince
you that $XW$ is not, in general, a vector field operator but something much
nastier.
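Example 2.9.1. Let $V = -y\,\partial/\partial x + x\,\partial/\partial y$ and $W = x\,\partial/\partial x + y\,\partial/\partial y$.
Then $VWh$ is
$$-xy\frac{\partial^2 h}{\partial x^2} - y\frac{\partial h}{\partial x} - y^2\frac{\partial^2 h}{\partial x\,\partial y} - 0 + x^2\frac{\partial^2 h}{\partial y\,\partial x} + x\frac{\partial h}{\partial y} + xy\frac{\partial^2 h}{\partial y^2} + 0$$
and $WVh =$
$$-xy\frac{\partial^2 h}{\partial x^2} + 0 + x^2\frac{\partial^2 h}{\partial y\,\partial x} + x\frac{\partial h}{\partial y} - y^2\frac{\partial^2 h}{\partial x\,\partial y} - y\frac{\partial h}{\partial x} + xy\frac{\partial^2 h}{\partial y^2} + 0$$
Neither of these looks like a vector field operating on $h$. If however we take
the difference, $VW - WV$, we get some happy cancellation and wind up with
$$VW - WV = \left(-y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}\right) - \left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) = 0$$
which is a vector field although not a very interesting one.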
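The cancellation in Example 2.9.1 can be checked mechanically on polynomial test functions. The sketch below is a toy stand-in for a symbolic algebra package, written under the assumption that $h$ is a polynomial in $x$ and $y$ stored as a dictionary from exponent pairs to integer coefficients; all the helper names are illustrative.

```python
from collections import defaultdict


def clean(p):
    """Drop zero coefficients so that the zero polynomial is the empty dict."""
    return {m: c for m, c in p.items() if c != 0}


def add(p, q, sign=1):
    """Return p + sign * q."""
    out = defaultdict(int, p)
    for m, c in q.items():
        out[m] += sign * c
    return clean(out)


def mul(p, q):
    """Product of two polynomials in x and y."""
    out = defaultdict(int)
    for (i, j), c in p.items():
        for (k, l), d in q.items():
            out[(i + k, j + l)] += c * d
    return clean(out)


def diff(p, var):
    """Partial derivative with respect to x (var = 0) or y (var = 1)."""
    out = defaultdict(int)
    for m, c in p.items():
        if m[var] > 0:
            lowered = list(m)
            lowered[var] -= 1
            out[tuple(lowered)] += c * m[var]
    return clean(out)


def apply_field(field, h):
    """Apply a d/dx + b d/dy to the polynomial h, where field = (a, b)."""
    a, b = field
    return add(mul(a, diff(h, 0)), mul(b, diff(h, 1)))


x = {(1, 0): 1}
y = {(0, 1): 1}
V = ({(0, 1): -1}, x)   # V = -y d/dx + x d/dy
W = (x, y)              # W =  x d/dx + y d/dy

# An arbitrary test polynomial h = x^3 y + 2 x y^2 - 5 y.
h = {(3, 1): 1, (1, 2): 2, (0, 1): -5}

VWh = apply_field(V, apply_field(W, h))
WVh = apply_field(W, apply_field(V, h))
commutator = add(VWh, WVh, sign=-1)
print(commutator)  # the empty dict stands for the zero polynomial
```

Neither `VWh` nor `WVh` is zero on its own; only their difference vanishes. A check on one polynomial is of course evidence, not a proof.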
Exercise 2.9.1. Write down another pair of vector fields $V, W$ on $\mathbb{R}^2$ and
compute $VW - WV$. Check to see if you always get the zero vector field.
What is it telling you about the vector fields when $VW - WV = 0$? (Some
intelligent conjectures would be of interest but only if supported by evidence
not used in framing the conjecture.)
Exercise 2.9.2. If $X = P(x, y)\,\partial/\partial x + Q(x, y)\,\partial/\partial y$ and $W = R(x, y)\,\partial/\partial x +
S(x, y)\,\partial/\partial y$, calculate $XW - WX$ and verify that it is a vector field.
Exercise 2.9.3. Compute $XW - WX$ for $X, W \in \mathfrak{X}(\mathbb{R}^n)$ and show it is a
vector field in $\mathfrak{X}(\mathbb{R}^n)$. Show that this also holds for $\mathfrak{X}(U)$ for any open set
$U \subseteq \mathbb{R}^n$.
All this gives the following definition:
Definition 2.9.1. The Lie Bracket or Poisson Bracket of two vector fields
$X, W$ in $\mathfrak{X}(U)$ for $U \subseteq \mathbb{R}^n$ is written $[X, W]$ and defined by
$$[X, W] \triangleq XW - WX$$
It is a multiplication on the vector space of vector fields on $U$.
Exercise 2.9.4. Do some simple calculations, preferably for $U \subseteq \mathbb{R}^1$, and
convince yourself that the Lie bracket multiplication is not in general associative
but does satisfy the Jacobi Identity:
$$\forall X, Y, Z \in \mathfrak{X}(U), \quad [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0$$
Exercise 2.9.5. Prove that the Jacobi Identity is always satisfied for Vector
Fields.
The Lie bracket almost makes the vector space of vector fields on $U$, an
open subset of $\mathbb{R}^n$, into an algebra, which you will recall is merely a vector
space where the vectors can be multiplied, to make a ring. Here the Lie
Bracket operation fails to be associative in general, but a vector space with a
non-associative multiplication which satisfies the Jacobi Identity is,
notwithstanding, called a Lie Algebra. There are others besides these and again
algebraists have gone to town on investigating abstract Lie Algebras. Well,
we wouldn't like them to be at a loose end and hang around street corners
(footnote 3).
Exercise 2.9.6. Prove $[X, (Y + Z)] = [X, Y] + [X, Z]$ and $[(X + Y), Z] =
[X, Z] + [Y, Z]$. Prove also that $\forall a \in \mathbb{R}$, $[aX, Y] = a[X, Y]$ and $[X, aY] =
a[X, Y]$.
Footnote 3: Although they'd probably have an interesting line in graffiti.
Remark 2.9.1. The above properties you will recognise as bilinearity.
Exercise 2.9.7. Investigate the relation between [hX, Y ], [X, hY ] and h[X, Y ].
It should be apparent that although the calculations tend to be messy and
provide great scope for making errors, they are not essentially difficult. A
natural candidate for a good symbolic algebra package, you might say.
Exercise 2.9.8. Is there a multiplicative identity for the Lie Bracket
operation on vector fields? That is, is there a vector field $J$ such that for every
other vector field, $X$, $[J, X] = X$? (Hint: what is $[J, J]$?)
You might be interested in an area of applications of these ideas. If so read
on.
It is easy to find the solution, $h(x, y) = x^2 + y^2$, to the PDE
$$-y\frac{\partial h}{\partial x} + x\frac{\partial h}{\partial y} = 0$$
Now this is one solution, and finding a single solution is very nice, but we
usually want the general solution. In this particular case you can probably
guess it. But in general, if we have some linear partial differential operator
$L$ acting on $F$, a suitable space of smooth functions, and if we want the set
of all solutions of $Lh = 0$, then it will usually be a lot harder to find them.
This process is aided by the following idea: The set of solutions of $Lh = 0$ is going
to be a linear subspace of $F$, by definition of the term linear operator. Call
it $F_0$. Now a symmetry of the solution space of the operator $L$, often called
a symmetry of the operator $L$, is some vector field operator $X$ such that $X$
takes $F_0$ into itself, i.e. whenever $h$ is a solution to $Lh = 0$, so is $Xh$. If we
know the collection of all symmetry operators for $L$ and we have a solution,
then we can find all the other solutions. In trivial cases this will amount
to no more than adding in arbitrary constant functions, but in non-trivial
cases it will do a whole lot more than this. So it would be a good idea to
be able to find, for a given $L$, the set of all symmetries $X$ for $L$. It is clear
that the Poisson-Lie bracket can be used for any pair of linear operators, not
just vector fields. The following observation goes some way to explaining our
interest in them:
Proposition 2.9.1. If $[L, X] = gL$ for some function $g \in F$, then $X$ is a
symmetry of $L$.
Proof:
We need to show that $\forall h \in F_0$, $L(Xh) = 0$. Now
$$LX - XL = gL \Rightarrow LX = gL + XL$$
and
$$\forall h \in F_0, \quad (gL + XL)h = gLh + XLh = 0 + 0 = 0$$
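A small worked illustration of the proposition, with a pair of operators chosen purely because the computation is short (they are an assumption for the sake of the example, not operators discussed above): take $L = \partial/\partial x$ and $X = \partial/\partial y$ acting on smooth functions of $(x, y)$. Partial derivatives of smooth functions commute, so the hypothesis holds with $g = 0$:

$$[L, X]h = \frac{\partial^2 h}{\partial x\,\partial y} - \frac{\partial^2 h}{\partial y\,\partial x} = 0 = 0 \cdot Lh$$

The solutions of $Lh = 0$ are the functions $h(x, y) = \psi(y)$ that do not depend on $x$, and applying $X$ to one of them gives another:

$$L(Xh) = \frac{\partial}{\partial x}\left(\psi'(y)\right) = 0$$

So here $X$ carries the solution space into itself, as the proposition predicts.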
Exercise 2.9.9. Prove the converse, that if $X$ is a (vector space) symmetry
of $L$, then $[L, X] = gL$ for some $g \in F$.
Exercise 2.9.10. What symmetry is involved in finding the general solution
to the equation
$$-y\frac{\partial h}{\partial x} + x\frac{\partial h}{\partial y} = 0$$
and how does it give the general solution?
Now it is possible to prove that the set of all vector space symmetries of an
operator $L$ is itself a Lie Algebra. Which is one reason for wanting to know
more about them.
Some students of PDEs want to know why it is that the standard partial
differential equations all had their variables separable: does this happen for
all possible PDEs and why does it work for these cases? The answer to this
question is rather long and may be found in Volume 4 of the Encyclopedia
of Mathematics and Its Applications, Symmetry and Separation of Variables
by Willard Miller. It has a lot to do with Lie Algebras.
It is now possible to state properly a significant problem.
Going back to the idea of flows, it makes sense to discover whether flows
commute. For a suitable pair of flows, $x, y : \mathbb{R} \times U \to \mathbb{R}^n$, we can start off
from $a \in U$ and go by flow $x$ for a time $s$ and then by flow $y$ for time $t$. This
will get us to some point in $U$, written naturally enough as $y_t \circ x_s(a)$. Or we
could go the other way around, first by $y$ and then by $x$ to get $x_s \circ y_t(a)$. If
we always wind up at the same point for any starting point and any pair of
times $s, t$ then we may say that the flows commute.
Then when the flows $x, y$ correspond to the vector fields $X, Y$, we have the
following result: $x$ and $y$ commute iff $[X, Y] = 0$. You can see that this works
for the case of the two vector fields $V, W$ in Example 2.9.1.
At present we lack the machinery to prove this result economically, so I shall
skip it until it is needed.
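The statement about commuting flows can be tried out numerically. The sketch below assumes the closed-form flows of the two fields of Example 2.9.1 (rotation through angle $s$ for $V$, scaling by $e^t$ for $W$; each can be verified by differentiating at time 0) and contrasts them with a second, hypothetical pair, $X = \partial/\partial x$ and $Y = x\,\partial/\partial y$, whose bracket is $\partial/\partial y$ and so is not zero.

```python
import math


def rotate(s, p):
    """Flow of V = -y d/dx + x d/dy: rotation through angle s."""
    x, y = p
    return (x * math.cos(s) - y * math.sin(s), x * math.sin(s) + y * math.cos(s))


def dilate(t, p):
    """Flow of W = x d/dx + y d/dy: scaling by e^t."""
    x, y = p
    return (math.exp(t) * x, math.exp(t) * y)


def translate(s, p):
    """Flow of X = d/dx: translation along the x axis."""
    x, y = p
    return (x + s, y)


def shear(t, p):
    """Flow of Y = x d/dy: a shear parallel to the y axis."""
    x, y = p
    return (x, y + t * x)


def gap(flow_a, flow_b, s, t, p):
    """Distance between b_t(a_s(p)) and a_s(b_t(p))."""
    q1 = flow_b(t, flow_a(s, p))
    q2 = flow_a(s, flow_b(t, p))
    return math.dist(q1, q2)


p = (1.0, 2.0)
print(gap(rotate, dilate, 0.7, 0.4, p))    # zero up to rounding, since [V, W] = 0
print(gap(translate, shear, 0.7, 0.4, p))  # s * t up to rounding, since [X, Y] is not 0
```

For the second pair the two routes end a distance $st$ apart, so the failure to commute is visible already for small times.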
Remark 2.9.2. Again all this makes perfectly good sense on manifolds for
the usual reasons. The idea of thinking of a vector field on a manifold $M$
as a special kind of operator on $C^\infty(M)$ ensures that we can compose them
and add them and subtract them, so the Lie Bracket makes sense there too,
and we can write down vector fields on manifolds via charts and find integral
curves for them and so foliate the manifold.
Exercise 2.9.11. Demonstrate the truth of the last remark by doing some
of these things on $S^1$ and, if you are feeling very brave, $S^2$.
2.10 Conclusion
This has been a quick introduction to the ideas of smooth manifolds and
vector fields on them. There are whole books dedicated to these ideas and
you will find some in the library. You will find some of these ideas covered
very quickly in chapters two and three of the text book, which you should
read and satisfy yourself that it is intelligible. You should be able to see why
definitions are as they are.
Chapter 3
Tensors and Tensor Fields
This chapter deals with the machinery needed to talk about differential
geometry, although it only starts on actually doing so. It contains the information
in Chapter four of the text book and goes into the algebra in more detail.
This is because we are doing it right, on account of being mathematicians
and therefore feeling uneasy about relying on our intuitions without being
able to check the logic. We also cover part of Chapter one of Part three,
where the text book is decidedly scrappy, and part of Chapter five of Part
one. I do things in a slightly different order. You should however read the
text book in conjunction with the notes and do the exercises.
I also throw in a few remarks about the exterior calculus and Stokes' Theorem,
not because this is part of the course but because it is a part of every
educated person's background in the twenty-first century. It has to be
admitted that there aren't many educated people around, but then there never
have been.
3.1 Tensors
3.1.1 Natural and Unnatural Isomorphisms
Let $V$ denote a real vector space of finite dimension, so $V$ is isomorphic to
$\mathbb{R}^n$ for some $n \in \mathbb{Z}^+$. Then I can define the space of shifts of $V$, which is a
collection of maps from $V$ to $V$, by taking any $v \in V$ and writing
$$\hat{v} : V \to V, \quad \forall w \in V,\ w \mapsto w + v$$
The set of all such maps I shall call $\hat{V}$. The map $\hat{v}$ is the map that adds $v$ to
everything. I can compose such maps, and it is immediate that $\hat{u} \circ \hat{v} = \widehat{u + v}$.
Similarly, for every $t \in \mathbb{R}$, $\widehat{tu} = t\,\hat{u}$ where we scale maps in the usual way,
$(tf)(x) = t(f(x))$.
This makes $\hat{V}$ a vector space and gives an isomorphism
$$\hat{\ } : V \to \hat{V}, \quad \forall v \in V,\ v \mapsto \hat{v}$$
Exercise 3.1.1. Confirm that this is an isomorphism of vector spaces.
This isomorphism is natural, which means (in part) that given $f : U \to V$, a
linear map between real vector spaces, we get a map from $\hat{U}$ to $\hat{V}$:
$$f : U \to V \ \Rightarrow\ f_* : \hat{U} \to \hat{V}, \quad f_*(\hat{u}) = \widehat{f(u)}$$
Note that we can specify the isomorphism and the map $f_*$ without making
any reference to a basis for $U$ or $V$. This is the other part of what we mean
by natural. I cannot define naturalness properly without an excursion into
category theory which I am hoping to avoid, but the idea is sufficiently clear
for present purposes. I hope.
I shall write $U \cong V$ when vector spaces $U$ and $V$ are isomorphic and $U \overset{N}{\cong} V$
when they are naturally isomorphic.
Exercise 3.1.2. Take the space L(R, V ), of linear maps from R to V , and
show this is also a vector space, naturally isomorphic to V .
The space of shifts being naturally isomorphic to V leads to two pictures
of a vector space, one has got points in it and the other has got arrows in
it. We can certainly think of a shift map taking u to u + v as an arrow
from u to u + v in the original space, and the map itself as a whole lot of
arrows, all basically showing where each point starts and finishes under the
map. And since the spaces are isomorphic we can cheerfully think in either
one. Physicists do this all the time as do applied mathematicians, and so
they confuse the two distinct things, points and arrows, and usually this does
no harm; in fact the more ways you have of thinking about something the
easier it is to solve problems, so it actually does some good. It is, however,
probably better to confuse things when you know you are doing it, rather
than just being confused.
Now I define the space $V^*$, the dual space to $V$.
Definition 3.1.1. $V^*$ is the set $L(V, \mathbb{R})$ of linear maps from $V$ to $\mathbb{R}$ with
the usual rules for addition and scaling of maps, viz, $(tf)(v) = t(f(v))$ and
$(f + g)(v) = f(v) + g(v)$, for every $t \in \mathbb{R}$, every $f, g \in V^*$ and every $v \in V$.
Exercise 3.1.3. Confirm that $V^*$ is indeed a vector space of the same
dimension as $V$.
Exercise 3.1.4. Show that a basis for $V$ determines a basis for $V^*$ in an
obvious way. It is called the dual basis. In $\mathbb{R}^2$, $[1, 0]$ is the dual to
$\begin{bmatrix} 1 \\ 0 \end{bmatrix} \in \mathbb{R}^2$, and so on. Note that my usage of representing vectors as columns (and
elements of $\mathbb{R}^{n*}$ as rows) is consistent with standard matrix notation and
makes it easier to distinguish $\mathbb{R}^n$ from its dual space.
Remark 3.1.1. The standard (ordered) basis in $\mathbb{R}^n$ is often written as the
ordered set $(e_1, e_2, \cdots, e_n)$ which saves writing lots of columns. $e_j$ is the
column of $n$ numbers which has a 1 in the $j$th place and a zero everywhere
else. I shall often write $(e^1, e^2, \cdots, e^n)$ for the dual basis. You can think of $e^j$
either as a row matrix with $n$ entries, with the $j$th entry 1 and all the others
zero, or you can think of it as the projection onto the $j$th axis, according to
taste. People who cannot tell subscripts from superscripts are going to have
a hard time with tensors.
It follows from the last exercise that $V$ and $V^*$ are isomorphic, but the
isomorphism is not natural. Given $U$ and $V$ real vector spaces, and a linear
$f : U \to V$, we get a map $f^* : V^* \to U^*$ defined by
$$f^* : V^* \to U^*, \quad \forall g \in V^*,\ g \mapsto g \circ f$$
This map is the wrong way around. The term contravariant is used for
things like this. Again I am being a trifle vague here in order to avoid a long
digression.
If $V = \mathbb{R}^n$ then I find it helpful to write the elements of $\mathbb{R}^n$ as column arrays.
Then it is natural to write the elements of $\mathbb{R}^{n*}$ as row arrays. Then this
makes it clear that the latter act on the former (by matrix multiplication) so
there is a map
$$\mathbb{R}^{n*} \times \mathbb{R}^n \to \mathbb{R}, \quad \left( [a_1, a_2, \cdots, a_n],\ \begin{bmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{bmatrix} \right) \mapsto a_1 x^1 + a_2 x^2 + \cdots + a_n x^n$$
Physicists write the thing on the right as $a_i x^i$ by what is called the Einstein
summation convention, which means that a repeated lower index and upper
index is short for a sum over all possible values of the index. This explains
why we use lower indices or subscripts for covectors, elements of the dual
space, and superscripts or upper indices for the components of a vector. It
makes writing squares and higher powers a real bugger, but fortunately we
don't have to do that very often.
Note that this generalises; there is a map
$$V^* \times V \to \mathbb{R}, \quad (g, v) \mapsto g(v)$$
All I have done with my rows and columns is specify the maps and the vectors
by arrays of numbers. This is so we can do sums. Usually rather horrid sums,
but that is what Mathematica and MATLAB are for.
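In the rows-and-columns picture the pairing and the "wrong way around" dual map are only a few lines of arithmetic. The sketch below assumes a linear map $f$ given by a matrix `A` and uses made-up numbers; `pair`, `push` and `pull` are illustrative names, with `pull` playing the role of the dual map $g \mapsto g \circ f$.

```python
def pair(a, x):
    """Row (covector) a acting on column (vector) x: the sum a_i x^i."""
    return sum(a_i * x_i for a_i, x_i in zip(a, x))


def push(A, x):
    """The linear map f : R^n -> R^m given by the m x n matrix A, applied to x."""
    return [pair(row, x) for row in A]


def pull(a, A):
    """The dual map applied to a covector a on R^m: the row a times the matrix A."""
    columns = zip(*A)
    return [pair(a, col) for col in columns]


A = [[1, 2, 0],
     [0, 1, 3]]          # f : R^3 -> R^2
u = [4, 5, 6]            # a vector in R^3
g = [7, -1]              # a covector on R^2

print(pair(g, push(A, u)))   # g(f(u))
print(pair(pull(g, A), u))   # (g o f)(u), the same number
```

Vectors are pushed forward from $\mathbb{R}^3$ to $\mathbb{R}^2$ while covectors are pulled back from $\mathbb{R}^2$ to $\mathbb{R}^3$, which is the reversal of direction that the term contravariant refers to.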
Note that confusing a space and its dual is not a good idea: physicists did this
and got themselves in a bit of a mess in consequence. They are isomorphic,
at least when finite dimensional, but not naturally isomorphic, so it is a good
idea to keep them separate.
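Exercise 3.1.5. Define the unit vector at 0 on $\mathbb{R}$ to be the tangency
equivalence class of the map $i : \mathbb{R} \to \mathbb{R}$ given by $i(t) = t$. Then $i$ is a basis
element of the tangent space to $\mathbb{R}$ at 0. Define $dx$, a linear map from this
tangent space to $\mathbb{R}$, by $dx(i) = 1$. The identity map $x : \mathbb{R} \to \mathbb{R}$
goes over to the identity map on the tangent space, and it takes the tangent vector $i$ to
itself. What does it do to $dx$?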
Note that all the above spaces are isomorphic and all maps are pretty much
the identity map if you are prepared to be sloppy. Examine which of the
various maps are covariant and which contravariant.
Exercise 3.1.6. Let $V$ be the space of all real valued functions defined on $\mathbb{R}$.
Is it the case that $V$ and $V^*$ are isomorphic? If so provide an isomorphism.
Exercise 3.1.7. Show that an isomorphism between a vector space $V$ and
its dual provides a quadratic form on $V$ which, if positive definite, defines
an inner product on $V$. Show that an inner product on $V$ determines an
isomorphism between $V$ and its dual whenever $V$ is finite dimensional. Is it
true when $V$ is not finite dimensional?
3.1.2 Multilinearity
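Definition 3.1.2. A bilinear map $f : U \times V \to W$ for real vector spaces
$U, V, W$ is one such that
$$\forall u, u' \in U,\ \forall v \in V,\ \forall s, t \in \mathbb{R}, \quad f(su + tu', v) = s\,f(u, v) + t\,f(u', v)$$
and
$$\forall u \in U,\ \forall v, v' \in V,\ \forall s, t \in \mathbb{R}, \quad f(u, sv + tv') = s\,f(u, v) + t\,f(u, v')$$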
We can describe this by saying that $f$ is linear in each variable separately.
The field can in fact be any field you like as long as it is the same field for
$U$, $V$ and $W$.
Exercise 3.1.8. Find a bilinear map from $\mathbb{R} \times \mathbb{R}$ to $\mathbb{R}$.
Definition 3.1.3. For any $u \in U$ and a bilinear map $f : U \times V \to \mathbb{R}$ I can
write $f_{(u, \bullet)} : V \to \mathbb{R}$ as the map
$$f_{(u, \bullet)} : V \to \mathbb{R}, \quad v \mapsto f(u, v)$$
Similarly for any $v \in V$, $f_{(\bullet, v)} : U \to \mathbb{R}$ sends $u$ to $f(u, v)$.
We can describe bilinearity of $f$ by saying that $f$ is linear in each variable
separately, meaning that for any $u \in U$, $f_{(u, \bullet)}$ is linear and for any $v \in V$,
$f_{(\bullet, v)}$ is linear.
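Fixing one argument of a bilinear map is exactly what programmers call partial application. The sketch below assumes a bilinear map on $\mathbb{R}^2 \times \mathbb{R}^2$ built from a hypothetical table of coefficients `M`, freezes each slot in turn, and tests linearity of the resulting one-variable maps on sample integer data (so equality is exact).

```python
from functools import partial

# Hypothetical coefficients: f(u, v) = sum over i, j of M[i][j] * u[i] * v[j].
M = [[1, 2],
     [3, 4]]


def f(u, v):
    """A bilinear map from R^2 x R^2 to R."""
    return sum(M[i][j] * u[i] * v[j] for i in range(2) for j in range(2))


def is_linear_on(g, v, w, s, t):
    """Check g(s v + t w) == s g(v) + t g(w) for one choice of data."""
    combo = [s * a + t * b for a, b in zip(v, w)]
    return g(combo) == s * g(v) + t * g(w)


u_fixed = [5, -1]
v_fixed = [2, 7]

f_u = partial(f, u_fixed)       # the map v -> f(u_fixed, v)


def f_v(u):
    """The map u -> f(u, v_fixed)."""
    return f(u, v_fixed)


print(is_linear_on(f_u, [1, 2], [3, -4], 2, -3))
print(is_linear_on(f_v, [1, 2], [3, -4], 2, -3))
```

A spot check of this kind can only refute linearity, never establish it; the proof is the observation that each term of the sum is linear in each slot.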
Two exercises which may help later in understanding some technicalities:
Exercise 3.1.9.
1. Show that when $V$ is finite dimensional, $V^{**}$, the dual of $V^*$, is
naturally isomorphic to $V$. That is, show there is an isomorphism which
does not require a basis of either space to specify it, and that $f : U \to V$
induces a map $f^{**} : U^{**} \to V^{**}$.
2. Show that if $\mathrm{Bil}(U \times V, W)$ is the vector space of bilinear maps from
$U \times V$ to $W$ and $L(A, B)$ is the vector space of linear maps from $A$ to
$B$ for any real vector spaces $A$ and $B$ then
$$\mathrm{Bil}(U \times V, W) \overset{N}{\cong} L(U, L(V, W))$$
where $\overset{N}{\cong}$ denotes a natural isomorphism of vector spaces.
Now we generalise the idea of bilinearity, which deals with maps from $U \times V$
to a vector space, to multilinearity, which has more than just two terms in
the product.
Definition 3.1.4. If we have a $k$-fold cartesian product of real vector spaces
$U_1 \times U_2 \times \cdots \times U_k$ and if $j \in [1 : k]$ we can take $(u_1, u_2, \cdots, u_k)$ in this
product, and for any
$$f : U_1 \times U_2 \times \cdots \times U_k \to \mathbb{R}$$
we define $f_{(u_1, u_2, \cdots \hat{u}_j, \cdots u_k)} : U_j \to \mathbb{R}$, where the $\hat{u}_j$ means that the $j$th term has
been replaced by $\bullet$, to be the map which sends $u_j$ to $f(u_1, u_2, \cdots, u_j, \cdots, u_k)$.
Note that $\hat{u}_j$ has absolutely nothing to do with shift maps!
Definition 3.1.5. A $k$-multilinear map $f : U_1 \times U_2 \times \cdots \times U_k \to \mathbb{R}$ for real
vector spaces $U_j$, $j \in [1 : k]$, is a map such that if we keep all but one term of
$(u_1, u_2, \cdots, u_k)$ fixed, representing this as $(u_1, u_2, \cdots \hat{u}_j, \cdots u_k)$, then
$f_{(u_1, u_2, \cdots \hat{u}_j, \cdots u_k)}$
is a linear map from $U_j$ to $\mathbb{R}$, for any $j \in [1 : k]$ and any $(u_1, u_2, \cdots \hat{u}_j, \cdots u_k)$.
This is actually quite simple but a swine to write down. If you find it
confusing write down a trilinear map from $\mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R}^2$ to $\mathbb{R}$.
Definition 3.1.6. A covariant $k$-tensor on a vector space $V$ is a multilinear
map
$$f : \underbrace{V \times V \times \cdots \times V}_{k \text{ copies}} \to \mathbb{R}$$
We write $T^k(V)$ for the vector space (under the usual addition and scaling of
maps) of all covariant $k$-tensors on $V$. By convention, a 0-tensor on a (real)
vector space $V$ is a real number.
Definition 3.1.7. A contravariant $\ell$-tensor on a vector space $V$ is a
multilinear map
$$g : \underbrace{V^* \times V^* \times \cdots \times V^*}_{\ell \text{ copies}} \to \mathbb{R}$$
We write $T_\ell(V)$ for the vector space of contravariant $\ell$-tensors on $V$.
Since a covariant 1-tensor on $V$ is actually an element of $V^*$ and a
contravariant 1-tensor on $V$ is actually an element of $V^{**} = V$, there is a case for saying
that the names co and contra should be swapped around. But I haven't got
the nerve.
We can have mixed tensors which are covariant in some arguments and
contravariant in others. When a physicist writes down something like $g_{\alpha,\beta}$ the
fact that they are subscripts tells you that this is a covariant tensor, the fact
that there are two of them tells you it is a bilinear map from $V \times V$ to $\mathbb{R}$, and
almost certainly $V$ is either $\mathbb{R}^3$ or possibly the tangent space or the cotangent
space to the manifold we live in. When a physicist writes
$$g^{\gamma}_{\alpha,\beta}$$
he has a tensor $g : V \times V \times V^* \to \mathbb{R}$ for some $V$ which he frequently forgets to
specify on the grounds that he knows, as do all right thinking people, what it
is. He uses subscripts for the coefficients $g_{\alpha,\beta}$ so that he can use superscripts
for the things they operate on and use the Einstein convention.
Definition 3.1.8. We talk of a type $(k, \ell)^T$ tensor on $V$ when it is covariant
of order $k$ and contravariant of order $\ell$, that is when it is a multilinear map
$$\phi : \underbrace{V \times V \times \cdots \times V}_{k \text{ copies}} \times \underbrace{V^* \times \cdots \times V^*}_{\ell \text{ copies}} \to \mathbb{R}$$
We write $T^k_\ell(V)$ for the space of type $(k, \ell)^T$ tensors on $V$. $T^k_0(V)$ is written
$T^k(V)$ and $T^0_\ell(V)$ is written $T_\ell(V)$.
I shall expand on this when I explain tensor fields, which comes up next.
Definition 3.1.9. A covariant $k$-tensor $\phi$ is symmetric iff
$$\phi(u_1, u_2, \cdots, u_k) = \phi(u_2, u_1, u_3, \cdots, u_k)$$
and whenever we swap any two arguments the result is the same.
Definition 3.1.10. A covariant $k$-tensor $\phi$ is alternating (or antisymmetric)
iff
$$\phi(u_1, u_2, \cdots, u_k) = -\phi(u_2, u_1, u_3, \cdots, u_k)$$
and whenever we swap any two arguments the sign only is changed.
Note that we can say this more easily: a covariant $k$-tensor is symmetric
iff it is invariant under the symmetry group $S_k$ acting on the arguments; in
algebra, $\phi \in T^k(V)$ is symmetric iff $\phi \circ \sigma = \phi$ for every $\sigma$ in the permutation
group $S_k$ on the arguments of $\phi$. And if it is antisymmetric then it is invariant
under the alternating group $A_k$. If $\sigma$ is a permutation of the set of arguments, we write
$\mathrm{sgn}(\sigma)$ to be $+1$ if $\sigma$ is an even permutation and $-1$ if it is odd. Then we
can say $\phi$ is alternating iff $\phi \circ \sigma = \mathrm{sgn}(\sigma)\,\phi$.
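The condition that a tensor changes only by $\mathrm{sgn}(\sigma)$ under a permutation $\sigma$ of its arguments is easy to test by brute force for small $k$. The sketch below assumes one particular alternating 3-tensor on $\mathbb{R}^3$, the $3 \times 3$ determinant written out in full, computes the sign of a permutation by counting inversions, and runs through all six permutations. The sample vectors are arbitrary.

```python
from itertools import permutations


def sgn(perm):
    """Sign of a permutation of (0, ..., k-1), found by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def phi(u, v, w):
    """An alternating 3-tensor on R^3: the determinant with columns u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - v[0] * (u[1] * w[2] - u[2] * w[1])
            + w[0] * (u[1] * v[2] - u[2] * v[1]))


args = ([1, 2, 3], [0, 1, 4], [5, 6, 0])

# Permuting the arguments multiplies the value by the sign of the permutation.
for sigma in permutations(range(3)):
    permuted = [args[i] for i in sigma]
    assert phi(*permuted) == sgn(sigma) * phi(*args)

print(phi(*args))
```

A symmetric tensor would pass the same loop with `sgn(sigma)` replaced by 1.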
Alternating k-tensors are also known as k-forms and are important for later
work. They have everything to do with orientation.
Definition 3.1.11. We write $\Lambda^k(V^n)$ for the space of alternating covariant
$k$-tensors on the vector space $V$ having dimension $n$.
3.1.3 Dimension of Tensor spaces
You can either read this carefully or simply do the exercises at the end of
the subsection. Or you can do both. As long as you find out how to do the
exercises!
The space of $k$-tensors on $V^n$ is obviously a vector space because we can
add and scale the maps; the sum of two tensors of type $(k, \ell)^T$ is obviously
another tensor of the same type, and likewise scaling such a tensor by a real
number gives another tensor of the same type. Since the set of all maps from
any $X$ to $\mathbb{R}$ is a vector space, the type $(k, \ell)^T$ tensors form a linear subspace.
Example 3.1.1. Suppose $\phi$ is any $(2, 0)^T$ tensor on $\mathbb{R}$. Then we put $\phi(1, 1) =
a$. Then by multilinearity, keeping the second component fixed we deduce
that $\phi(x, 1) = xa$ for any $x \in \mathbb{R}$, and now keeping the first component fixed
we see that $\phi(x, y) = xya$. Thus the tensor is specified by just one number,
$a$, and so the space of covariant 2-tensors on $\mathbb{R}$ is a one dimensional vector
space, having the tensor with $\phi(1, 1) = 1$ as a basis element.
We note that $\phi$ is always a symmetric tensor. There is precisely one
alternating $(2, 0)^T$ tensor on $\mathbb{R}$ and it is the zero map. So the space of alternating
$(2, 0)^T$ tensors on $\mathbb{R}$ is zero dimensional. The zero tensor is both symmetric
and antisymmetric.
To get a basis for the covariant $k$-tensors on $\mathbb{R}^n$, we need to specify the maps
on every choice of basis elements. For example, for 2-tensors on $\mathbb{R}^2$ we know
the multilinear map completely if we know it on $(e_1, e_1)$, $(e_1, e_2)$, $(e_2, e_1)$
and $(e_2, e_2)$. Then multilinearity will guarantee us the value on any pair of
vectors, each in $\mathbb{R}^2$. The extension to higher order tensors and different $n$
is obvious, and by taking any basis for $V$ we get the same conclusion. This
gives the obvious result, the dimension of the space of covariant $k$-tensors
on $V^n$ is $n^k$. For example, we can take as a basis for $T^2(\mathbb{R}^2)$ the four maps
defined by the four columns:
$$\begin{array}{llll}
(e_1, e_1) \mapsto 1 & (e_1, e_1) \mapsto 0 & (e_1, e_1) \mapsto 0 & (e_1, e_1) \mapsto 0 \\
(e_1, e_2) \mapsto 0 & (e_1, e_2) \mapsto 1 & (e_1, e_2) \mapsto 0 & (e_1, e_2) \mapsto 0 \\
(e_2, e_1) \mapsto 0 & (e_2, e_1) \mapsto 0 & (e_2, e_1) \mapsto 1 & (e_2, e_1) \mapsto 0 \\
(e_2, e_2) \mapsto 0 & (e_2, e_2) \mapsto 0 & (e_2, e_2) \mapsto 0 & (e_2, e_2) \mapsto 1
\end{array}$$
Then it is obvious that these four maps are linearly independent and that
any bilinear map from $\mathbb{R}^2 \times \mathbb{R}^2$ to $\mathbb{R}$ is a linear combination of these.
Remark 3.1.2. We can write the map given by the first column as $dx \otimes dx$,
the second column map as $dx \otimes dy$, the third as $dy \otimes dx$ and the last as $dy \otimes dy$.
I shall explain this neat notation later.
If the tensors are of mixed type $(k, \ell)^T$, then by taking the dual basis for
the contravariant tensors we get the dimension is $n^{k+\ell}$. The Riemannian
Curvature tensor, which you may meet later, is a $(3, 1)^T$ tensor and in $\mathbb{R}^4$,
spacetime, it therefore has dimension $4^4 = 256$. This means it takes 256
numbers to specify it. Fortunately it has a lot of symmetries which reduces
the dimension to 20, otherwise nobody would have the patience to do any
calculations with it.
The space of alternating covariant 2-tensors is obviously a subspace of the
space of all covariant 2-tensors: on $\mathbb{R}^2$, we do not need to look at what any
such tensor does to $(e_1, e_1)$ because it has to be zero. Similarly if we know
it on $(e_1, e_2)$ we know its value on $(e_2, e_1)$, it is just the negative. So if we
know its value on $(e_1, e_2)$ we know it completely, and since a basis for the
space $\Lambda^2(\mathbb{R}^2)$ is the single alternating tensor which sends $(e_1, e_2)$ to 1,
the dimension of $\Lambda^2(\mathbb{R}^2)$ is one. Since it is easily verified that the alternating
map which sends $(e_1, e_2)$ to 1 is the determinant of the matrix formed by
putting the two vectors as adjacent columns, we see that the determinant is
a basis for the space $\Lambda^2(\mathbb{R}^2)$ of alternating 2-tensors on $\mathbb{R}^2$.
Exercise 3.1.11. Easily verify the above claim.
In $\mathbb{R}^3$ we have three basis elements, $(e_1, e_2, e_3)$, and if we look to see what
we have as a basis for the alternating two tensors we observe that we know
any such alternating $\phi$ if we know it on $(e_1, e_2)$, $(e_2, e_3)$ and $(e_1, e_3)$. For
every other pair of basis elements, the result is forced by knowing $\phi$ on these
three together with the fact that $\phi$ is alternating. Since we have only three
choices of real numbers to make in order to nail down a particular alternating
2-tensor on $\mathbb{R}^3$, the dimension of $\Lambda^2(\mathbb{R}^3)$ is 3. And for $\mathbb{R}^n$, all we have to
do is to take pairs $e_i, e_j$ with $i < j$, and again knowing $\phi$ on these tells us
everything about $\phi$. There are $n(n - 1)$ ways of choosing two different basis
vectors from $\mathbb{R}^n$, and we need half of them, so the dimension of $\Lambda^2(\mathbb{R}^n)$ is
$n(n - 1)/2$.
And finally, we can choose $k$ distinct basis elements from the set of $n$ in
$n(n-1)(n-2)\cdots(n-k+1)$ ways, and each such choice can be permuted in $k!$
ways, of which we need only one. We can choose a suitable basis on which to
define an alternating $k$-tensor to be the set $e_{i_1}, e_{i_2}, \dots, e_{i_k}$
with $i_1 < i_2 < \cdots < i_k$, and this can be done in
$${}^nC_k = \frac{n!}{k!\,(n-k)!}$$
ways, the number of ways of choosing $k$ things from $n$. So the dimension of
$\Lambda^k(V^n)$ is ${}^nC_k$.
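
The counting argument above can be checked by brute force. The following is a minimal sketch, not part of the original notes: the function name `alternating_basis_indices` is made up for illustration, and the index tuples are simply the strictly increasing choices $i_1 < i_2 < \cdots < i_k$ from $1, \dots, n$.

```python
from itertools import combinations
from math import comb


def alternating_basis_indices(n, k):
    """List the index tuples (i_1, ..., i_k) with i_1 < i_2 < ... < i_k from 1..n.

    Each tuple labels one argument (e_{i_1}, ..., e_{i_k}) on which an
    alternating k-tensor may be given a freely chosen value.
    """
    return list(combinations(range(1, n + 1), k))


if __name__ == "__main__":
    for n, k in [(2, 2), (3, 2), (4, 2), (4, 3)]:
        indices = alternating_basis_indices(n, k)
        # The number of increasing tuples should agree with n! / (k! (n-k)!).
        print(n, k, len(indices), comb(n, k))
```

For $n = 3$, $k = 2$ the tuples produced are exactly the three pairs used in the text: $(1,2)$, $(1,3)$ and $(2,3)$.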
The space of symmetric tensors is similar, except that we do not know the
value of the tensor when the same basis element is chosen twice. In
$\mathbb{R}^2$ a 2-tensor is determined if we know its values on $(e_1, e_1)$,
$(e_1, e_2)$, $(e_2, e_1)$, and $(e_2, e_2)$. If we know it is symmetric we
don't need both $(e_1, e_2)$ and $(e_2, e_1)$. So the dimension of the
symmetric 2-tensors on $\mathbb{R}^2$ is three. The symmetric $k$-tensors on
$\mathbb{R}^n$ have a basis determined by the values of maps defined on
$e_{i_1}, e_{i_2}, \dots, e_{i_k}$ with $i_1 \le i_2 \le \cdots \le i_k$. For the
2-tensors on $\mathbb{R}^n$ we can choose two elements in $n^2$ ways, and we
can notice that $n$ of these have both elements the same. The remaining
$n(n-1)$ ways have the subscripts different and we can select half of them.
So the dimension is $n(n-1)/2 + n = n(n+1)/2$. I leave you to work out the
dimension of the space of symmetric $k$-tensors on $\mathbb{R}^n$.
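
The count for symmetric 2-tensors can be confirmed the same way. This is a small sketch under the assumption that we simply enumerate index pairs $(i, j)$ with $i \le j$; the helper name `symmetric_pair_indices` is invented for the illustration, and the case of general $k$ is deliberately left alone.

```python
from itertools import combinations_with_replacement


def symmetric_pair_indices(n):
    """List the index pairs (i, j) with 1 <= i <= j <= n.

    Each pair labels one argument (e_i, e_j) on which a symmetric
    2-tensor may be given a freely chosen value.
    """
    return list(combinations_with_replacement(range(1, n + 1), 2))


if __name__ == "__main__":
    for n in range(1, 6):
        # Compare the enumeration with the formula n(n+1)/2 from the text.
        print(n, len(symmetric_pair_indices(n)), n * (n + 1) // 2)
```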
Note that the space $\Lambda^n(\mathbb{R}^n)$ always has dimension 1. Taking
the tensor defined by taking the value one on $(e_1, e_2, e_3, \dots, e_n)$ in
that order, we observe that we have a particularly simple alternating $n$-form
on $\mathbb{R}^n$. It is called the volume element, and its value on any set of
$n$ vectors in some order can be calculated using multilinearity. If we write
each vector out as a column, the result is the determinant of the resulting
$n \times n$ matrix. This is a good way to define the determinant.
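
To see the multilinearity calculation concretely, here is a minimal sketch (the names `permutation_sign` and `volume_element` are illustrative, not from the notes). Each vector is expanded in the standard basis; alternation kills every term with a repeated basis vector, and each surviving term is a permutation of $(e_1, \dots, e_n)$ whose value is the sign of the permutation.

```python
from itertools import permutations


def permutation_sign(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def volume_element(*vectors):
    """Evaluate the alternating n-form with value 1 on (e_1, ..., e_n).

    Expanding every argument by multilinearity leaves one term per
    permutation: sign(p) times the product of the chosen components.
    """
    n = len(vectors)
    if any(len(v) != n for v in vectors):
        raise ValueError("need n vectors, each with n components")
    total = 0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        for slot, component in enumerate(perm):
            term *= vectors[slot][component]
        total += term
    return total


if __name__ == "__main__":
    # Columns (1, 2) and (3, 4): the 2 x 2 determinant 1*4 - 3*2.
    print(volume_element((1, 2), (3, 4)))
```

The sum has $n!$ terms, so this is a definition to think with rather than an efficient way to compute determinants.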
Note that we could write out a basis for $\Lambda^k(\mathbb{R}^n)$ for
$k \le n$ in terms of the possible choices of $k$ rows from the $n \times k$
matrix whose columns are the $k$ vectors, by taking the determinant of the
result. Thus alternating tensors are all about determinants or, alternatively,
determinants are all about alternating tensors.
Exercise 3.1.12.
1. By evaluating a tensor $\tau$ in the space of $(2, 0)^T$ tensors on
$\mathbb{R}^2$ on the elements $(e_1, e_1)$, $(e_2, e_1)$, $(e_1, e_2)$,
$(e_2, e_2)$ to get $(a, b, c, d)$ respectively, show that $\tau$ is defined by
a $2 \times 2$ matrix using suitable matrix operations.
2. Show that this is equivalent to acting on the pair of vectors
$(\mathbf{a}, \mathbf{b})$ by a 2-tensor by writing the matrix as $A$ and
calculating $\mathbf{a}^T A \mathbf{b}$.
3. Complete the scruffy arguments used to obtain the dimension of
$\Lambda^k(\mathbb{R}^n)$ which took suitable basis elements of
$$\underbrace{\mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n}_{k \text{ terms}}$$
to define a set of multilinear maps, with the implied belief that we can
extract a basis for $\Lambda^k(\mathbb{R}^n)$ by fixing suitable values. In
particular show that a set of such maps is linearly independent and spans
$\Lambda^k(\mathbb{R}^n)$.
4. Show that the symmetric $(2, 0)^T$ tensors form a subspace. What is the
dimension? Give a basis for it.
5. Do the same for the alternating $(2, 0)^T$ tensors.
6. Show that the determinant acting on
$\begin{pmatrix} x \\ y \end{pmatrix}, \begin{pmatrix} u \\ v \end{pmatrix}$
by taking the two vectors to $vx - uy$ is an alternating 2-tensor on
$\mathbb{R}^2$.
7. Show that any other alternating 2-tensor on $\mathbb{R}^2$ is a multiple of
this by a real number.
8. Repeat for $(0, 2)^T$ tensors and again for $(1, 1)^T$ tensors.
9. Find the dimension of the linear space of all covariant $k$-tensors on
$\mathbb{R}^n$.
10. Find the dimension of the space of all alternating covariant $k$-tensors on
$\mathbb{R}^n$. Hint: start off in a small way by looking for a non-zero
alternating 2-tensor on $\mathbb{R}$ and showing there aren't any. You have
done alternating 2-tensors on $\mathbb{R}^2$. The determinant is a basis for
the alternating 3-tensors on $\mathbb{R}^3$, and this generalises. The
alternating 2-tensors on $\mathbb{R}^3$ have a basis obtained by choosing two
rows of the three rows made up of the two vectors side by side, and producing
the $2 \times 2$ determinant on the entries. This leads to three basis
elements. Show this by looking at the four choices of a pair of bases. Now
generalise to higher dimensions. Finally generalise to higher order
alternating tensors. It's a fair bit of work but will burn the elements of the
exterior algebra of alternating tensors into your brain for ever.
Once you have done the work of finding out how you manipulate them,
finding out the actual use and hence the point of the things is painless.
Remark 3.1.3. If we take the space of $(2, 0)^T$ tensors on $\mathbb{R}^2$ or
$\mathbb{R}^3$ and represent them as spaces of $2 \times 2$ and $3 \times 3$
matrices, you might think that we have captured all the properties needed for
$(2, 0)^T$ tensors and they are merely matrices dressed up. You might conclude
that the same holds for $(0, 2)^T$ tensors and for $(1, 1)^T$ tensors. We have
a bit of a problem, however, if we decide to change the basis. Obviously this
will change the matrix representing a particular tensor, even if we agree to
use the same basis for both occurrences of $V$ or $V^*$, or if we use some new
basis for $V$ and the dual basis for $V^*$. There has to be a matrix
representing the transition from one basis to another, and you might think
that the usual rule for transition matrices applies as in linear algebra. But
the matrix here represents a bilinear map, not a linear one, and you would be
wrong in general. While I shan't be concerned with change of basis in what
remains, a proper course in tensors would certainly go into this, and you
might want to play around with finding out what happens.
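
For anyone who does want to play around, here is a minimal numerical sketch. It assumes the standard facts that, if the columns of $P$ are the new basis vectors written in the old basis, the matrix of a bilinear map becomes $P^T A P$ while the matrix of a linear map becomes $P^{-1} A P$. The particular matrices and helper names are made up for the illustration.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def transpose(X):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*X)]


def inverse2(P):
    """Invert a 2 x 2 matrix with non-zero determinant."""
    (p, q), (r, s) = P
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def bilinear(A, u, v):
    """Evaluate u^T A v for vectors u, v given as lists of components."""
    return sum(u[i] * A[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))


A = [[1.0, 2.0], [3.0, 4.0]]  # matrix of a 2-tensor in the old basis
P = [[2.0, 1.0], [0.0, 1.0]]  # columns: new basis vectors in old coordinates

as_bilinear_map = matmul(matmul(transpose(P), A), P)  # rule for a 2-tensor
as_linear_map = matmul(matmul(inverse2(P), A), P)     # rule for a linear map

# The same two vectors, in new coordinates and converted to old coordinates.
u_new, v_new = [1.0, -1.0], [3.0, 2.0]
u_old = [sum(P[i][j] * u_new[j] for j in range(2)) for i in range(2)]
v_old = [sum(P[i][j] * v_new[j] for j in range(2)) for i in range(2)]

if __name__ == "__main__":
    print(as_bilinear_map, as_linear_map)
    # The tensor's value on the pair must not depend on the basis used.
    print(bilinear(A, u_old, v_old), bilinear(as_bilinear_map, u_new, v_new))
```

The two rules give different matrices unless $P$ happens to be orthogonal, which is why the familiar transition rule misleads here.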
Representing higher order tensors with matrices doesn't work. We would
need at least cubes and tesseracts of numbers instead of squares or rectangles
of them. Fortunately, there are neater ways of writing them down which
we shall meet later. In the next chapter we shall be concerned with Maxwell's
Equations for the Electro-magnetic field, and this will take us as far as
alternating 3-tensors on $\mathbb{R}^4$. Physicists and old style
mathematicians have an obsession with matrix representations which causes them
serious problems for higher order tensors. We shall breeze through them
without effort merely by using more powerful notations.
The 0-tensors have dimension one by definition.
If you ask an old-fashioned applied mathematician what a tensor is, he might
well tell you that it is a matrix, but it transforms differently. This tells you
more about old-fashioned applied mathematicians than it tells you about
tensors.
3.1.4 The Tensor Algebra
A definition of something we have met before:
Definition 3.1.12. An algebra is a vector space with a left and right
distributive multiplication on it. The multiplication is usually associative so
the elements of the space define a ring. The exception is Lie Algebras, which
are not associative but instead satisfy the Jacobi Identity (see Exercise 2.9.4).
Definition 3.1.13. A graded algebra is a set of vector spaces indexed by the
group $\mathbb{Z}_n$ for some $n \in \mathbb{Z}^+$ (or the group $\mathbb{Z}$),
with a distributive multiplication on the set.
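Given two covariant tensors we can multiply them. More specifically, given
a $k$-tensor and an $\ell$-tensor we can construct a $(k + \ell)$-tensor as
follows: If
$$\sigma : \underbrace{V \times V \times \cdots \times V}_{k \text{ copies}} \to \mathbb{R}$$
is a covariant $k$-tensor and
$$\tau : \underbrace{V \times V \times \cdots \times V}_{\ell \text{ copies}} \to \mathbb{R}$$
is a covariant $\ell$-tensor we define
$$\sigma \otimes \tau : \underbrace{V \times V \times \cdots \times V}_{\ell + k \text{ copies}} \to \mathbb{R}$$
by taking $\sigma$ on the first $k$ elements, $\tau$ on the last $\ell$, and
multiplying the results.
Exercise 3.1.13. Show that this gives a covariant $(k + \ell)$-tensor.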
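
The product just defined is easy to mimic with ordinary functions. The following is a minimal sketch, assuming tensors are represented as Python functions of several vector arguments; the name `tensor_product` and the explicit order arguments `k` and `l` are choices made for this illustration. The 1-tensors `dx` and `dy` are the projections of $\mathbb{R}^2$ onto its first and second components.

```python
def tensor_product(sigma, k, tau, l):
    """Return the (k + l)-tensor that applies sigma to the first k
    arguments, tau to the last l, and multiplies the two numbers."""

    def product(*vectors):
        if len(vectors) != k + l:
            raise ValueError("expected %d vector arguments" % (k + l))
        return sigma(*vectors[:k]) * tau(*vectors[k:])

    return product


def dx(v):
    """Covariant 1-tensor on R^2: project onto the first component."""
    return v[0]


def dy(v):
    """Covariant 1-tensor on R^2: project onto the second component."""
    return v[1]


dx_dy = tensor_product(dx, 1, dy, 1)
dy_dx = tensor_product(dy, 1, dx, 1)

if __name__ == "__main__":
    u, v = (2.0, 3.0), (5.0, 7.0)
    # The two orders of multiplication give different 2-tensors.
    print(dx_dy(u, v), dy_dx(u, v))
```

On $u = (2, 3)$ and $v = (5, 7)$ the first product gives $2 \cdot 7$ and the second $3 \cdot 5$, so the multiplication is not commutative.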
This is called the tensor product in the tensor algebra. This makes the set of
all covariant tensors a graded algebra. Graded algebras are quite common and
you will meet them later if you do algebraic topology or theoretical physics.
Physicists stick the word "super" in front of a theory when it goes to a graded
version, hence superstring theory. Usually they have only two levels so they
talk about $\mathbb{Z}_2$ gradings.
Note that the tensor product of alternating tensors is not in general
alternating unless one of the tensors is a constant (a 0-tensor).
Note also that the tensor product, although not commutative, is associative
and distributes over addition.
Exercise 3.1.14.
1. Show that if $\sigma_1, \sigma_2$ are $k$-tensors and $\tau$ is an
$\ell$-tensor then
$$(\sigma_1 + \sigma_2) \otimes \tau = \sigma_1 \otimes \tau + \sigma_2 \otimes \tau$$
2. Show that if $s, s'$ and $t, t'$ are real numbers,
$$(s\sigma_1 + t\sigma_2) \otimes (s'\tau_1 + t'\tau_2)$$
is what you'd expect it to be on the optimistic assumption that $\otimes$ is a
nice well behaved multiplication.
3. Give a basis for the space of $(2, 0)^T$ tensors on $\mathbb{R}^2$ in terms
of $dx$ and $dy$. Hint: Note that $dx^i : \mathbb{R}^n \to \mathbb{R}$ is a
covariant 1-tensor on $\mathbb{R}^n$ for any $n$. If $n = 2$ we call them $dx$
and $dy$. Certainly the tensor product of any two 1-tensors is a 2-tensor.
Show that every 2-tensor is a linear combination of such tensor products. (A
count of basis elements might save you some trouble here.) Look back to
Remark 3.1.2 to find the answer written down, with an explanation promised
later. This is the explanation.
4. Represent the tensor $dx \otimes dy$ as a matrix over $\mathbb{R}^2$.
5. Represent the tensor $dx \otimes dy$ as a matrix over $\mathbb{R}^3$.
6. Repeat the last two for the tensor $dx \otimes dy - dy \otimes dx$. (Later
we shall call this tensor $dx \wedge dy$.)
Exercise 3.1.15. Show by an example that not every 2-tensor on $\mathbb{R}^2$
can be written as a tensor product of 1-tensors. This is obvious once you see
it but some people are tempted to suppose all higher order tensors are tensor
products of 1-tensors. The moral: one 2-tensor is not the same thing as
two 1-tensors!
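Example 3.1.2. We can write down a bit of the tensor algebra (not all of
it, it is infinite dimensional) on $\mathbb{R}^2$ without too much trouble.
Note that I use $dx$ to specify the linear map from $\mathbb{R}^2$ to
$\mathbb{R}$ which projects on the first component, and $dy$ for the
projection on the second component.
$$\begin{array}{lll}
\text{Order} & \text{basis} & \text{isomorphic to} \\
T^k(\mathbb{R}^2) & dx^{i_1} \otimes \cdots \otimes dx^{i_k} & \mathbb{R}^{2^k} \\
\vdots & \vdots & \vdots \\
T^3(\mathbb{R}^2) & dx \otimes dx \otimes dx, \ \dots, \ dy \otimes dy \otimes dy & \mathbb{R}^8 \\
T^2(\mathbb{R}^2) & dx \otimes dx, \ dx \otimes dy, \ dy \otimes dx, \ dy \otimes dy & \mathbb{R}^4 \\
T^1(\mathbb{R}^2) & dx, \ dy & \mathbb{R}^2 \\
T^0(\mathbb{R}^2) & 1 & \mathbb{R}
\end{array}$$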
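Using the isomorphisms we can also write out the tensor multiplication in
an admittedly strange form:
$$\begin{array}{c|cccc}
\otimes & \mathbb{R} & \mathbb{R}^2 & \mathbb{R}^4 & \cdots \\
\hline
\mathbb{R} & (\mathbb{R}, \times) & (\mathbb{R}^2, \cdot) & (\mathbb{R}^4, \cdot) & \cdots \\
\mathbb{R}^2 & (\mathbb{R}^2, \cdot) & (\mathbb{R}^4, ?) & (\mathbb{R}^8, ?) & \cdots \\
\mathbb{R}^4 & (\mathbb{R}^4, \cdot) & (\mathbb{R}^8, ?) & (\mathbb{R}^{16}, ?) & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{array}$$
Table 3.1.2
Here I have started with the 0-tensors, then the 1-tensors, and so on, and
used the isomorphisms to indicate where the tensor product takes us. The
symbol $\times$ means ordinary multiplication, and $\cdot$ means scalar
multiplication.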
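The question marks remain to be filled in, but I will do the multiplication
from $T^1(\mathbb{R}^2) \times T^1(\mathbb{R}^2)$ to $T^2(\mathbb{R}^2)$. In
the bases given this is
$$\mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}^4, \qquad
\left( \begin{pmatrix} a \\ b \end{pmatrix}, \begin{pmatrix} c \\ d \end{pmatrix} \right)
\mapsto \begin{pmatrix} ac \\ ad \\ bc \\ bd \end{pmatrix}$$
If you represent elements of $T^2(\mathbb{R}^2)$ by $2 \times 2$ matrices, you
can get this result by a matrix multiplication
$$\begin{pmatrix} c \\ d \end{pmatrix} [a, b]$$
For what that's worth.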
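
The coordinate form of this multiplication can be checked against the definition of the product. This is a minimal sketch in which 1-tensors on $\mathbb{R}^2$ are stored as coefficient lists relative to $dx, dy$, and the output is ordered as in the basis $dx \otimes dx$, $dx \otimes dy$, $dy \otimes dx$, $dy \otimes dy$; the helper names are invented for the illustration.

```python
def tensor_coefficients(first, second):
    """Coefficients of the product of two 1-tensors, given as coefficient
    lists, in the order dx(x)dx, dx(x)dy, dy(x)dx, dy(x)dy."""
    return [p * q for p in first for q in second]


def one_form(coefficients):
    """Turn a coefficient list into the linear map it represents."""
    return lambda v: sum(c * x for c, x in zip(coefficients, v))


if __name__ == "__main__":
    a, b, c, d = 2, 3, 5, 7
    print(tensor_coefficients([a, b], [c, d]))  # ac, ad, bc, bd

    # Cross-check: evaluate the product on every pair of basis vectors.
    e = [(1, 0), (0, 1)]
    f, g = one_form([a, b]), one_form([c, d])
    print([f(ei) * g(ej) for ei in e for ej in e])
```

Both lines print the same four numbers, because the coefficient of $dx^i \otimes dx^j$ is the value of the product on $(e_i, e_j)$.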
Exercise 3.1.16. Fill in the other question marks.
It is easy to generalise the tensor algebra so that we can take the tensor
product of contravariant tensors or of a covariant tensor with a contravariant
tensor or of two mixed tensors. These things are best understood by
constructing simple examples when they are ridiculously easy, rather than
by looking at the formal definitions which on first encounter are terrifying.
Algebra is learnt by making up lots of examples. When you have done this
you can easily see what is being said and after a small amount of practice
you can use the language to terrorise people unfamiliar with it. This is
childish and you should be ashamed of yourself for actually frightening
engineers, applied mathematicians and physicists this way.
Exercise 3.1.17. A covariant 1-tensor on $\mathbb{R}^n$ is a linear map from
$\mathbb{R}^n$ to $\mathbb{R}$ and consequently an element of
$\mathbb{R}^{n*}$. We write $dx^i : \mathbb{R}^n \to \mathbb{R}$ to be the
projection which picks out the $i^{th}$ component of each vector.
1. Show the set $\{dx^i : i \in [1 : n]\}$ is a basis for $T^1(\mathbb{R}^n)$.
2. Show that the set $\{dx^i \otimes dx^j : i, j \in [1 : n]\}$ is a basis for
the space $T^2(\mathbb{R}^n)$, so any $\tau \in T^2(\mathbb{R}^n)$ can be
specified by the entries in an $n \times n$ matrix relative to this basis.
3. Take some $\tau \in T^2(\mathbb{R}^3)$ and specify it by such a matrix, take
two elements of $\mathbb{R}^3$ and show how to evaluate $\tau$ on them by
matrix multiplications of representations of the vectors with respect to the
standard basis for $\mathbb{R}^3$.
4. Choose a different basis for $\mathbb{R}^3$; discuss what has to be done to
the matrix in order to get it to still represent the same
$\tau \in T^2(\mathbb{R}^3)$.
5. The space $T_1(\mathbb{R}^n)$ is the space of linear maps from
$\mathbb{R}^{n*}$ to $\mathbb{R}$ and is hence $\mathbb{R}^{n**}$. It is
naturally isomorphic to $\mathbb{R}^n$, and we can use the natural isomorphism
to take $e_i, i \in [1 : n]$ to be a basis for $T_1(\mathbb{R}^n)$. What is a
basis for $T_2(\mathbb{R}^n)$? For $T_k(\mathbb{R}^n)$?
6. Since the dimension of $T_2(\mathbb{R}^n)$ is clearly $n^2$ we can represent
any element $\tau \in T_2(\mathbb{R}^n)$ by an $n \times n$ matrix as before.
How does this transform under change of basis?
Remark 3.1.4. Note that this confuses an earlier definition which had $dx^i$
as a linear map from $\vec{\mathbb{R}}^n$ to $\mathbb{R}$, but the reason for
the confusion will become clear later. If you are rather more fussy than I am,
you might want to do it right: If $e_i : i \in [1 : n]$ are the standard basis
elements for $\mathbb{R}^n$, we can write the corresponding dual basis for
$\mathbb{R}^{n*}$ as $e^i : i \in [1 : n]$. Then go through replacing $dx^i$
with $e^i$ throughout and you will have it in impeccable form.
Remark 3.1.5. We define a 0-tensor on any real vector space to be another
and more exotic name for a real number. This means we can take tensor
products of 0-tensors with $k$-tensors to get the scaling operation. You should
have already worked this out from doing the exercises.
Exercise 3.1.18. If you read Darling's book you will discover that he goes
about defining tensor algebras quite differently. He defines $U \otimes V$ for
any real vector spaces $U$ and $V$, by proving a universality theorem which is
somewhat obscure.
You can recover Darling's treatment as follows.
First note that if we have $L(U, \mathbb{R})$ and $L(V, \mathbb{R})$ we can
define $L(U, \mathbb{R}) \otimes L(V, \mathbb{R})$ to be the space spanned by
elements $f \otimes g$, which means for each $f \in L(U, \mathbb{R})$ and
$g \in L(V, \mathbb{R})$ we take $f \otimes g : U \times V \to \mathbb{R}$ by
$$f \otimes g(u, v) = f(u) \cdot g(v)$$
where $\cdot$ is just multiplication in $\mathbb{R}$. From now on I shall just
write $f(u)g(v)$ for this. It is clear that this is a bilinear map from
$U \times V$ to $\mathbb{R}$. It is also clear that
$L(U, \mathbb{R}) \otimes L(V, \mathbb{R})$ is a vector space under the usual
operations of scalar multiplication and addition.
But $L(U, \mathbb{R})$ is just $U^*$ and $L(V, \mathbb{R}) = V^*$. So we have
defined $U^* \otimes V^*$ as a new vector space. It is therefore perfectly
straightforward to define $U^{**} \otimes V^{**}$ as a new vector space. And if
we identify $U^{**}$ with $U$ and $V^{**}$ with $V$, we have $U \otimes V$.
Show that this gives us Darling's treatment. Find the dimension of
$U \otimes V$ in terms of the dimension of $U$ and the dimension of $V$. Find
an explicit representation for $\mathbb{R}^2 \otimes \mathbb{R}^3$ and
calculate
$$\begin{pmatrix} 1 \\ 2 \end{pmatrix} \otimes \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix}$$
3.2 Tensor Fields on a Manifold
It makes sense to attach various things to manifolds. For example, to each
point of $S^2$ we can attach a number. We can think of it as glued on to the
sphere. Or we can imagine it as giving a function from the sphere to the
real numbers. It might measure the temperature at the surface of a solid
ball, perhaps. It makes sense to attach numbers smoothly so the function is
smooth. As you wander about the surface of the sphere the numbers will not
change too sharply.
For a different example of attaching things to manifolds, we can attach
vectors. Again we can think of it in various ways, and it can be used for
various purposes: for example it might make sense to have the wind blowing at
the surface of the sphere and to want to say by how much and in what direction
at each instant. Or we might want to measure the tangential component in
the surface of a magnetic or electric field.
We might want to attach tensors. Such things are important and useful,
particularly to physicists. You can see that we might want to assign, for
some purpose, to each point of a sphere a two by two matrix, and it would
make sense to require the matrices to change smoothly as we moved over
the surface of the sphere. To give an example of something quite practical
that we might want to attach to a manifold, suppose we had the job of
describing distances on $S^2$. There is, of course, a standard metric on $S^2$
but it is extrinsic and arises from the embedding in $\mathbb{R}^3$. If you
think about the intrinsic definition of $S^2$ you can see that there is
absolutely nothing in it which allows us to talk of distances. It is rather
reasonable to want to have an intrinsic notion of distance on $S^2$ and on
other manifolds. In fact we can't do General Relativity without it. In the
same way, there is no way to talk about the area of a region, or the angle
between vectors on a sphere, except with reference to an embedding of the
sphere in $\mathbb{R}^n$, and it makes sense to do these things intrinsically,
as is shown by the fact that we habitually do these things in the space in
which we live, which might be $S^3$ for all we know. Or something more
complicated. All this and much more is done by means of attaching tensors to a
manifold, giving what are called tensor fields. We now investigate these
things.
The idea of a vector field on a manifold is not too hard to grasp: technically
a section of the tangent bundle takes each point of the manifold and assigns
to it a vector attached to that point from the space of possible choices. If
the manifold is $\mathbb{R}^2$, we assign to a point an arrow, an element of
what I shall call $\vec{\mathbb{R}}^2$ for the tangent space at each point.
Confusing $\vec{\mathbb{R}}^2$ with $\mathbb{R}^2$, we get such things as
$$V : \mathbb{R}^2 \to \mathbb{R}^2, \qquad
\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} y \\ x \end{pmatrix}$$
This of course is the same as the system of Ordinary Differential Equations
$$\dot{x} = y, \qquad \dot{y} = x$$
in the notation of M213. It also shows you why I want to write
$\vec{\mathbb{R}}^2$, with elements
$\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix}$, for the codomain of a
vector field.
The difference between the modern notation and the older one is that we are
being careful to make it clear that the two spaces $\mathbb{R}^2$ (in the
expression for a vector field $V : \mathbb{R}^2 \to \mathbb{R}^2$) are
actually different. One is a space of locations and the other is a space of
arrows, actually tangent vectors. You probably found the old fashioned
notation confusing when you first met it, and indeed it is. The new notation
is not only clearer, it generalises to manifolds, which the old notation does
not. Even $V : \mathbb{R}^2 \to \vec{\mathbb{R}}^2$ is an improvement.
As mentioned earlier, as well as attaching vectors to points in a manifold
we can attach other things. If we attach real numbers we merely get a map
from the manifold to $\mathbb{R}$, and we have now got a new way to think of
such a map. Could we attach a matrix? To see the useful way to do such a
thing, observe that the tangent bundle is merely obtained by taking as fibre
the tangent space at each point. But we could start with the tangent space
and replace it with its dual space. The thing that we get when we take the
fibre bundle with the dual to the tangent space as fibre and glue all these
fibres together using the same method as for the tangent bundle is called
the cotangent bundle. It looks rather similar and in $\mathbb{R}^2$ the only
difference would be that instead of attaching the space of columns of two
numbers (representing the possible arrows at each point), we would be
attaching the space of rows of two numbers (representing linear maps from
$\mathbb{R}^2$ to $\mathbb{R}$). The spaces are isomorphic, but clearly the
elements are not the same. Of course I have added my own bit of confusion here
by confusing linear maps with their matrix representation, another
isomorphism. On $\mathbb{R}^n$ this is harmless. On a manifold it generally is
not and again the isomorphism needs to be thought about.
The cotangent bundle is important in classical mechanics where it corresponds
to the momentum space whereas the tangent bundle corresponds to the velocity
space. The reason is that we have an energy function. If we look at
$\mathbb{R}^2$ it has tangent bundle what I have called
$\mathbb{R}^2 \times \vec{\mathbb{R}}^2$. Now the function $\frac{1}{2}mv^2$
is a function from $\vec{\mathbb{R}}^2$ to $\mathbb{R}$,
$$\begin{pmatrix} v^1 \\ v^2 \end{pmatrix} \mapsto \frac{m}{2}\left((v^1)^2 + (v^2)^2\right)$$
and the derivative of this function is the row matrix
$$[mv^1, mv^2]$$
This is an element of the cotangent bundle because it is a covector, not a
vector. Cheerfully confusing the two leads to ghastly muddle further down
the track.
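
The derivative quoted above can be checked numerically. This is a minimal sketch using central differences; the function names and the sample values of the mass and velocity are assumptions made purely for illustration.

```python
def kinetic_energy(v, m):
    """The function (1/2) m |v|^2 on the space of velocity arrows."""
    return 0.5 * m * sum(component * component for component in v)


def derivative_row(f, v, h=1e-6):
    """Approximate the derivative of f at v as a row of partial
    derivatives, using central differences with step h."""
    row = []
    for i in range(len(v)):
        forward = list(v)
        backward = list(v)
        forward[i] += h
        backward[i] -= h
        row.append((f(forward) - f(backward)) / (2 * h))
    return row


if __name__ == "__main__":
    m = 2.0
    v = [3.0, -1.0]
    row = derivative_row(lambda w: kinetic_energy(w, m), v)
    # Compare with the momentum row [m v^1, m v^2].
    print(row, [m * component for component in v])
```

The row that comes out is the momentum, which is the sense in which momentum lives in the dual space: it eats a small change in velocity and returns the corresponding change in energy.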
As well as the dual of the tangent space attached to each point of a smooth
manifold, we can attach tensors. The vector space of $k$-tensors such as
$$\tau : \underbrace{V \times V \times \cdots \times V}_{k \text{ copies}} \to \mathbb{R}$$
(any space of maps into $\mathbb{R}$ is a vector space) has a basis consisting
of the multilinear maps which take the value 1 on one of the $n^k$
combinations of basis elements of $V$ and 0 on the others. If $V = T_a(M)$ for
some manifold $M$ and some $a \in M$, then $\tau$ will be specified relative
to this basis by $n^k$ numbers where $n$ is the dimension of $V$ and hence
$M$, and $k$ is the order of the tensor. In principle this could be an
awful lot of numbers (but in practice it usually isn't).
This $k$-tensor vector space can also be thought of as stuck on the manifold
at $a$. The tangent bundle is just the locally trivial vector bundle over the
manifold as base space with fibre the tangent space at each point. In exactly
the same way we can take a locally trivial vector bundle with fibre the vector
space of $k$-tensors on the tangent space at each point. At each $a \in M$
this is a vector space of dimension $n^k$. Given that a finite dimensional
real vector space has a topology which is invariant under isomorphisms, and
given that the tensor bundles must be locally trivial (since the tangent
bundle of $\mathbb{R}^n$ is a trivial bundle), the topology of each tensor
bundle has as a base the cartesian products of those open sets in the manifold
which are subsets of those open sets over which the bundle is trivial, with
open sets in the fibre.
If we do this for every $a \in M$ we get the $k$-tensor bundle of $M$. If we do
it for all the possible $k$ we get the full tensor bundle of $M$.
Then:
Definition 3.2.1. A smooth $k$-tensor field on a manifold $M$ is a smooth
section of the $k$-tensor bundle.
Awful Warning
There is scope for some confusion here. If we take the manifold to be
$\mathbb{R}^n$, the tangent bundle is, in my idiosyncratic notation,
$\mathbb{R}^n \times \vec{\mathbb{R}}^n$ and this makes a vector field a
section of this bundle. If we look to see what sort of tensor field this is,
we see that the tensor field must assign to each point of $\mathbb{R}^n$ a
multilinear map from some space $V$ to $\mathbb{R}$. There is only one
possibility and that is to make $V$ the dual space to $\vec{\mathbb{R}}^n$ and
take the linear maps. This means that by identifying the double dual with the
original space $\vec{\mathbb{R}}^n$ we get the right answer. Thus a vector
field is a $T_1$ tensor on the tangent space $T_a(\mathbb{R}^n)$ for every
$a \in \mathbb{R}^n$. So a covariant vector field in the ordinary sense,
mentioned in the last chapter, is a contravariant tensor field. There is, if
you like, an element of dualising in defining a tensor in the first place, so
we have to dualise again to get rid of it.
The terminology is unfortunate since the tangent functor takes tangent
vectors to tangent vectors and is covariant, but not many people have the
nerve to change the traditional terminology. I certainly don't.
Some of the books by physicists make a pig's breakfast of all of this duality.
Confusion is the natural state of man. And woman. Try to be clear about
which space you are working in and avoid the muddle.
End of Awful Warning
If we take the alternating covariant $k$-tensors on the tangent space at every
point of the manifold, the smooth tensor field of sections is called a
differential $k$-form on the manifold. Similarly we can limit the section to
taking values in the symmetric $k$-tensors. Both of these are important.
This sounds horrible but if you look at 2-tensors, alternating or not, you can
see that on the tangent space $T_a(M)$ when $M = \mathbb{R}^n$, any one of
them can be represented nicely by an $n \times n$ matrix of numbers. And if we
select one such matrix for each point $a$ of $M$, then we get an $n \times n$
matrix of functions from the manifold to $\mathbb{R}$. So as long as $k$ is
one or two we do not have anything very complicated. If $k = 1$ then we are
talking about vector fields or covector fields, and if $k = 2$ we are sticking
matrices onto the manifold at each point. If we never go beyond dimension 3
then the worst thing we have to imagine is a space with a $3 \times 3$ matrix
associated with each point of the space. This is not really very bad.
Admittedly this only gets us the applied mathematician's view of the world,
but at least we know how to generalise it to higher dimensions and higher
orders if it turns out to be necessary.
A serious issue with this simplified view of things is that the specification
of the matrix representing a tensor on any $T_a(M)$ requires us to choose a
basis for $T_a(M)$. And if we now do the same for some $T_b(M)$ for
$a \ne b$ then we need to choose a basis for representing tensors on $T_b(M)$.
But the spaces $T_b(M)$ and $T_a(M)$ don't have much to do with each other in
general. So in what sense can it be made the same basis? And if it is
different, how do we ensure that the matrix of functions is going to behave
nicely in representing the tensor fields? The tensor fields are perfectly
respectable things, but if we insist on representing them by matrices of
functions we have some serious problems. Note that if $M = \mathbb{R}^n$, the
tangent spaces can be shifted into each other in a natural way and the idea
that we are using the same basis for each of them makes sense. It all goes
wrong if $M = S^n$. We need in this case something like an explicit
isomorphism between the tangent spaces at different points.
Figure 3.2.1: Shifting a vector between tangent spaces.
To see what can go wrong here, imagine a sphere and take a point on the
equator. Attach a vector to this point, say one pointing along the equator.
I have shown this in figure 3.2.1. Now look to see what happens if you move
it parallel to itself along a line of longitude so that it moves up towards the
north pole. It seems reasonable to say that we are shifting the vector so it
is still pointing in the same direction, and still has the same length, despite
the fact that the vectors are all in different tangent spaces. In other words I
am claiming that I can tell when two vectors in two different tangent spaces
are the same. That this is insanely foolhardy becomes apparent if I go to
the same place by a different route. Suppose I first go around the equator.
My pink vector also goes around the equator, becoming rather purpler as it
goes. When it is opposite the starting point, I now move it up the curve of
longitude until it gets to the north pole. All the way, by both paths, I have
moved the vector so it is pointing in the same direction, but the result is
a pair of vectors pointing in opposite directions. So cheerfully doing on a
sphere what makes perfect sense on $\mathbb{R}^2$ is fraught with problems.
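
The two routes can be followed numerically. This is a minimal sketch which assumes the unit sphere sits in $\mathbb{R}^3$ with the north pole at $(0, 0, 1)$, and that moving a vector "parallel to itself" along an arc of a great circle means rotating it about the axis perpendicular to the plane of that circle. The starting point, the starting vector and the function names are choices made for the illustration.

```python
import math


def rotate_about_y(v, angle):
    """Rotate a vector of R^3 about the y-axis by the given angle."""
    x, y, z = v
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def rotate_about_z(v, angle):
    """Rotate a vector of R^3 about the z-axis by the given angle."""
    x, y, z = v
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


start_point = (1.0, 0.0, 0.0)   # a point on the equator
start_vector = (0.0, 1.0, 0.0)  # tangent there, pointing along the equator

# Route 1: straight up the line of longitude to the north pole.
pole_1 = rotate_about_y(start_point, -math.pi / 2)
vector_1 = rotate_about_y(start_vector, -math.pi / 2)

# Route 2: half way round the equator, then up the opposite longitude.
opposite_point = rotate_about_z(start_point, math.pi)
opposite_vector = rotate_about_z(start_vector, math.pi)
pole_2 = rotate_about_y(opposite_point, math.pi / 2)
vector_2 = rotate_about_y(opposite_vector, math.pi / 2)

if __name__ == "__main__":
    print(pole_1, vector_1)
    print(pole_2, vector_2)
```

Up to rounding, both routes end at the north pole, but the first delivers the vector $(0, 1, 0)$ and the second delivers $(0, -1, 0)$.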
One of the hardest things to do is to unlearn things you soaked up through
the skin when young and gullible. If you were encouraged to think that
isomorphic vector spaces were never worth distinguishing and you went all
sloppy in your thinking as a consequence, you now have the formidable job
of working it all out again. Don't blame me, blame the scruffy bunch who
taught you manifest nonsense and blame yourself for buying it. Think of this
in the future and currently and regard it as a second Awful Warning.
Exercise 3.2.1.
1. Take $M = S^1$ and $k = 2$. Define a $k$-tensor field on $S^1$.
2. Take $M = S^2$ and $k = 2$. Define an alternating 2-tensor field (2-form)
on $S^2$. Explain what this might have to do with area of regions on a
sphere and indicate how you might calculate the area of a region in $S^2$
with respect to your choice of 2-form.
3. Take $M = \mathbb{R}^3$ and $k = 2$. Define a differential 2-form on
$\mathbb{R}^3$.
Note that we can talk about $k$-covariant and $\ell$-contravariant mixed
tensors and mixed tensor fields.
3.3 The Riemannian Metric Tensor
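Recall from M213 that an inner product on a vector space $V$ is a positive
definite symmetric quadratic form, which is to say a map
$$\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}$$
such that
1. $\langle \cdot, \cdot \rangle$ is bilinear; that is $\forall u \in V$,
$\langle u, \cdot \rangle : V \to \mathbb{R}$ is linear and $\forall v \in V$,
$\langle \cdot, v \rangle : V \to \mathbb{R}$ is linear
2. $\langle \cdot, \cdot \rangle$ is symmetric; that is $\forall u, v \in V$,
$\langle u, v \rangle = \langle v, u \rangle$
3. $\langle \cdot, \cdot \rangle$ is positive definite; that is
$\forall u \in V$, $\langle u, u \rangle \ge 0$ and
$\langle u, u \rangle = 0 \Rightarrow u = 0$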
We can now summarise the above conditions by saying that an inner product
for $V$ is a symmetric covariant 2-tensor on $V$, with the additional property
of being positive definite. Recall also from M213 that positive definiteness
can be specified by observing that non-degenerate quadratic forms can be
classified as to their general shape by diagonalising them and then rescaling
the axes so that they are all diagonal with entries $+1$ along the diagonal
down to some point after which they are $-1$. This gives the signature of
the quadratic form $(1, 1, 1, \dots, 1, -1, -1, \dots, -1)$ for some number of
positive and some number of negative ones. A positive definite form has all
$n$ entries $+1$. There are also degenerate forms where some of the entries
after diagonalisation may be zero.
Figure 3.3.1: (Bits of) some perfectly respectable Hilbert Spaces stuck on a
manifold.
I shall only consider the positive definite forms here, although physicists
want to look at the general case of non-degenerate forms because in relativity
we have to put in time as an extra dimension, which gives a signature
$(1, 1, 1, -1)$ or $(3, 1)$. (Or $(-1, 1, 1, 1)$ if you are a physicist.
Physicists put time first. Some of them even use $(1, 3)$, multiplying our
form by $-1$. I shall outline a reason for this in the next chapter.)
We now define a Riemannian metric tensor field on a manifold as a positive
definite symmetric 2-tensor field. That is, at every point of the manifold
we attach, smoothly, some positive definite symmetric 2-tensor. This means
we have some bilinear function of a pair of tangent vectors at each point.
It is a daft name, and it would have been much more sensible to call it a
Riemannian inner product tensor field, because it gives an inner product on
each tangent space. But it is too late to be sensible now. Figure 3.3.1 shows
some vectors in some of the tangent spaces to a sphere, and each pair has a
sort of local dot product in a perfectly respectable tangent space which is now
a perfectly respectable inner product space, in fact a perfectly respectable
Hilbert Space.
Each such inner product may be specified, via charts, as a symmetric
$2 \times 2$ matrix in the case of the figure, each matrix $A(a)$ at the point
$a$ on the sphere acting on a pair of tangent vectors $u_a$ and $v_a$ to give
$$u_a^T A(a) v_a$$
but it would be better to regard it as a bilinear symmetric map which takes
pairs of tangent vectors with their tails at some point of the manifold, and
returns a real number. Thinking of it as a matrix makes it clear that there
are three distinct numbers which depend on where we are on the sphere. For
an $n$-manifold it will be ${}^{n+1}C_2 = n(n+1)/2$ distinct numbers for each
point of the manifold. Or if you insist you can think of the metric tensor as
$n(n+1)/2$ distinct functions from the manifold to $\mathbb{R}$. So it makes
sense to physicists to write such a thing as $g_{\mu,\nu}$ where $\mu$ (mu)
and $\nu$ (nu) range through the two possible values on a sphere or the three
possible values on a three-manifold, or the four on space-time, with
$g_{\mu,\nu} = g_{\nu,\mu}$. Of course this involves a choice of some
charts to cover the manifold. It might be better to write $g_{\mu,\nu}(a)$ for
$a$ a point in the manifold to remind ourselves that we have what is in effect
a matrix valued function on the manifold, but we don't.
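
As a concrete case of a matrix valued function on a manifold, here is a minimal sketch using the usual round metric on the unit sphere in a chart of colatitude $\theta$ and longitude $\phi$, where the matrix is $\mathrm{diag}(1, \sin^2\theta)$. That choice of chart and metric is an assumption brought in for illustration, as are the function names.

```python
import math


def round_metric(theta, phi):
    """Matrix A(a) of the round metric at the point a with colatitude
    theta and longitude phi (phi does not affect the entries)."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inner_product(A, u, v):
    """Evaluate u^T A v for two tangent vectors at the same point,
    given by their components in the chart."""
    return sum(u[i] * A[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))


if __name__ == "__main__":
    u, v = [1.0, 2.0], [3.0, 4.0]
    for theta in (math.pi / 2, math.pi / 6):
        A = round_metric(theta, 0.0)
        # Same components, different points: the inner product changes.
        print(theta, inner_product(A, u, v))
```

The same pair of component lists gives different numbers at different points, which is the whole content of saying that the matrix depends on where we are.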
Note that there is absolutely no machinery for calculating the dot product
of a tangent vector $u_a$ at a point $a$, with a tangent vector $v_b$ at a
different point $b$. This can be done in $\mathbb{R}^n$ but the Inner Product
tensor field doesn't allow it.
If the symmetric tensor is always positive definite we call it a Riemannian
metric, and the manifold with this tensor field is called a Riemannian Manifold.
If the symmetric tensor field has signature $(1, 1, 1, -1)$ it is called a
Lorentzian metric and the manifold is called a Lorentzian manifold. Physicists
treat the universe we live in, including time, as a Lorentzian manifold. More
generally, for any signature of form we say we have a semi-Riemannian metric.
Bear in mind at all times that when a physicist talks about a metric on a
manifold he means, almost always, an inner product on all of its tangent
spaces, not necessarily positive definite but always non-degenerate. Usually it
is either Riemannian or Lorentzian.
If two quadratic forms are positive definite, so is their sum. It makes sense to
add them because they are just functions, and if $\langle u, u \rangle \ge 0$
and $\langle\langle u, u \rangle\rangle \ge 0$ then the sum is also
non-negative; if the sum is equal to zero both the terms $\langle u, u \rangle$
and $\langle\langle u, u \rangle\rangle$ must be equal to zero so $u = 0$.
Moreover if we scale by a positive constant the result is another positive
definite form, while if we scale by a negative constant the result is a
negative definite form. Hence the symmetric covariant 2-tensors which are
positive definite are not a vector subspace of the space of covariant 2-tensors
on $V$, but they are an open subset of the vector space of symmetric covariant
2-tensors and therefore a manifold with a dimension.
Exercise 3.3.1. What is the dimension of the space of positive definite
symmetric 2-tensors on $\mathbb{R}^2$? Hint: it is the same as the dimension of the vector
space of symmetric 2-tensors and if you represent a tensor by a matrix, you
need to count the number of independent numbers in the matrix.
All the above makes sense if we use contravariant 2-tensors. In fact since I
haven't said anything about $V$, it might just as well be the dual space to
some other space.
Now we say it again formally:
Definition 3.3.1. A (positive definite) Riemannian metric for a manifold
$M$ is a positive definite symmetric covariant 2-tensor field on $M$.
What does this mean in computational terms? It is easiest to begin by
looking at a very simple case, a metric tensor field on $\mathbb{R}^2$. The idea of such a
tensor field on $\mathbb{R}^2$ has to do with inner products on $\vec{\mathbb{R}}^2$, in fact one such inner
product for each point of $\mathbb{R}^2$. This can be grasped by thinking of the matrix
of numbers operating on pairs of vectors in $\vec{\mathbb{R}}^2$ being fixed for each point in
$\mathbb{R}^2$, and as we move about in $\mathbb{R}^2$, we change the numbers in the matrix. So
the numbers depend on where you are, and are given by smooth functions of
your location in $\mathbb{R}^2$.
More generally, we take a manifold, we take a point on it, $a$, and look at
the tangent space at $a$. Now we take the symmetric bilinear maps from this
space to $\mathbb{R}$ which are positive definite. On $\mathbb{R}^n$, this inner product could be
specified by taking $n$ independent vectors as a basis, then taking the dual
space and the basis elements for that, and calling them $(dx^1, dx^2, \cdots, dx^n)$,
and then writing the tensor as
$$\sum_{i,j \in [1:n]} g_{ij}\, dx^i \otimes dx^j$$
where $g_{ij}$ is an $n \times n$ symmetric positive definite matrix. This follows from
an exercise which I hope you did. Alternatively you can use the Einstein
convention and just write $g_{ij}\, dx^i \otimes dx^j$. If you were a classical mathematician
or happen to be scruffy, you might leave out the $\otimes$, as if it is obvious to the
meanest intellect what $dx^i\, dx^j$ means. You might perhaps imagine in a dim
sort of way it means that you are multiplying a very, very little bit of the $i$th
component of a vector with another very, very little bit of the $j$th component
of a possibly different vector. In which case you are so confused there is no
hope for you.
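The standard inner product on $\mathbb{R}^2$ can be written in this form as the identity
matrix. To calculate
$$\left\langle \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} u \\ v \end{bmatrix} \right\rangle$$
we simply compute
$$[x \;\; y] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}$$
to get $xu + yv$. Doing the same with any other symmetric positive definite
matrix instead of the identity will give us a new inner product.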
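The computation above is easy to try by hand or by machine. The following is a minimal sketch only: the function name `inner` and the matrix `G_EXAMPLE` are illustrative assumptions, not anything defined in these notes. `G_EXAMPLE` is just one symmetric positive definite matrix picked to show that a different matrix gives a different inner product.

```python
def inner(G, p, q):
    """Return p^T G q for a 2x2 matrix G (nested lists) and 2-vectors p, q."""
    return sum(p[i] * G[i][j] * q[j] for i in range(2) for j in range(2))

# The identity matrix gives the standard inner product x*u + y*v.
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical example: symmetric, with positive trace and determinant,
# hence positive definite.
G_EXAMPLE = [[2.0, 1.0], [1.0, 3.0]]

x, y, u, v = 1.0, 2.0, 3.0, 4.0
standard = inner(IDENTITY, (x, y), (u, v))
skewed = inner(G_EXAMPLE, (x, y), (u, v))
print(standard, skewed)
```

Swapping the two vectors leaves both numbers unchanged, which is the symmetry of the matrix showing up in the inner product.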
For $M = \mathbb{R}^2$ it makes sense to take the same basis $(dx, dy)$ for elements of the
cotangent space over every point, so we get that it is possible to represent a
Riemannian metric on $\mathbb{R}^2$ in the form
$$[a, \; b] \begin{bmatrix} g_{11}(x, y) & g_{12}(x, y) \\ g_{12}(x, y) & g_{22}(x, y) \end{bmatrix} \begin{bmatrix} c \\ d \end{bmatrix}$$
Again, this was an exercise which I hope you did.

This when multiplied out gives the required bilinear map from $\vec{\mathbb{R}}^2 \times \vec{\mathbb{R}}^2$ to
$\mathbb{R}$. For any choice of two tangent vectors we get a real number. The $g_{ij}$ are
smooth functions, for $i, j \in [1:2]$ (and $g_{12} = g_{21}$).
On $\mathbb{R}^3$ they would be smooth functions for $i, j \in [1:3]$. The matrix would be
symmetric still and at each point it would be positive definite (or in general
have the required signature).
3.3.1 What this means: Ancient History
If you reflect a little on what a covariant 2-tensor on the tangent space is,
you will see that we have bilinear maps from pairs of vectors in the tangent
space at $a$ to $\mathbb{R}$, for every $a$ in the manifold.
Now tangent vectors in the old days of classical geometry were not thought
of as elements of a perfectly respectable vector space, but were imagined to
be infinitesimal elements in the base space. You can see that if you take a
velocity vector at a point $a \in \mathbb{R}^2$ and travel along it for a very, very short
time, you trace out, more or less, a line segment in $\mathbb{R}^2$. If you put a little
arrow on its head (its tail being at $a$) you get the beginnings of a picture of
a vector field, which we learnt how to draw in second year. If you have a
uniform velocity parallel to the X-axis and of unit length and in the direction
of increasing $x$ we can represent this by a tiny little arrow attached to $a$ and
pointing in the direction of increasing $x$. Such a tangent vector should be
rather small because it really represents a velocity through $a$, and hence an
element of what I have called $\vec{\mathbb{R}}^2$, not a set of points in $\mathbb{R}^2$. The practicalities
are that velocities change and can change continuously so a big long vector
would be misleading. In fact any finite length vector is misleading, but we
can be sloppy and imagine that velocities have been turned into distances by
travelling for very short times.
The idea of an infinitesimal time, one so small it was not effectively
distinguishable from zero, but where ratios of infinitesimals made sense and need
not be zero, is one which seems natural to many people. My Mathematics
teacher at school talked of $dy/dx$ being a ratio of numbers each of which was
infinitesimal, that is, not individually distinguishable from zero. I thought he
was off his head. I still do. This isn't mathematics, it's nonsense[1]. It does
however suggest mathematics. So although my Maths master was talking
incoherent garbage, there is something there which makes sense. And the
idea of infinitesimal distances and times leading to a definite velocity, a sort
of garbled version of the definition of a limit, has been used a great deal in
times past.
One way to think of this which you may find useful is contained in the
following example.
Let $c$ be the curve $x(t) = t$, $y(t) = 2\sin(t)$. We look at the origin,
through which the curve passes. First we take the line segment from
$\begin{bmatrix} 0 \\ 0 \end{bmatrix}$ to $\begin{bmatrix} u \\ 2\sin(u) \end{bmatrix}$ for some $u \neq 0$.
This line segment has two important numbers associated with it, the
projection along the x-axis and the projection along the y-axis. I shall call such a
line segment $\ell_u$ and the two numbers $\Delta x(\ell_u)$ and $\Delta y(\ell_u)$. The slope of the
line segment is
$$\frac{\Delta y(\ell_u)}{\Delta x(\ell_u)}$$
So I think of $\Delta y$ as assigning one number to each such $\ell_u$ and $\Delta x$ as
assigning another, with the ratio being the slope of the line segment.
As we take shorter and shorter line segments, that is if we let $u \to 0$ in the
example, the numbers get smaller but the ratio in general does not. I can
easily stipulate that the line segment has one end fixed (at $0$ in our case)
and the other end lies along the curve given.
[1] It is possible to go through model theory and make these ideas respectable, but this
requires a lot of logic. It is also possible to junk the lot and replace it with the idea of a
limit. And finally it is possible to choose terminology which looks a lot like the incoherent
rubbish but actually makes sense. This last is what we do and it explains some of the
more baroque aspects of our language.
Figure 3.3.2: $\Delta x$ and $\Delta y$ and $dx$ and $dy$.
Now look at the tangent vector at $0$ defined by the curve above. It is a
perfectly respectable vector in the tangent space $\vec{\mathbb{R}}^2_0$ at $0$. In fact I can take
a basis for $\vec{\mathbb{R}}^2_0$ consisting of the vector of unit positive speed along the
x-axis, which I have called $i$ or $e_1$ or $\partial/\partial x$ earlier, and the second vector being
defined by a curve of positive unit speed along the y-axis which I have called
$e_2$ and $\partial/\partial y$ earlier but might have called $j$. In this basis it is easy to see
that the tangent vector at $0$ defined by the curve is just
$$\begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
I have already defined $dx$ in the cotangent space as the linear map which
sends this tangent vector to $1 \in \mathbb{R}$ and $dy$ as the linear map which sends it
to $2$.
So $dx$ and $dy$ do to tangent vectors what $\Delta x$ and $\Delta y$ do to line segments
in the original space. I have shown the idea in figure 3.3.2. Note that we
can say that $dy/dx$ for this tangent vector is just 2 by straight division. And
of course this is precisely what we get when we differentiate $2\sin(x)$ at the
origin, which is not exactly a surprise.
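To see the ratio of the two projections settle down as the line segment shrinks, here is a small numerical sketch for the curve $x(t) = t$, $y(t) = 2\sin(t)$. The function name `secant_slope` is an assumption made for this illustration; the computation is just the slope of the segment from the origin to the point with parameter $u$.

```python
import math

def secant_slope(u):
    """Slope of the segment from (0, 0) to (u, 2*sin(u)), for u != 0."""
    delta_x = u - 0.0
    delta_y = 2.0 * math.sin(u) - 0.0
    return delta_y / delta_x

# Shrink the segment: u = 0.1, 0.01, ..., 0.00001.
slopes = [secant_slope(10.0 ** -k) for k in range(1, 6)]
for k, s in enumerate(slopes, start=1):
    print(f"u = 1e-{k}: slope = {s:.12f}")
```

Both projections go to zero, but the printed slopes creep up towards 2, the value obtained by dividing $dy$ by $dx$ on the tangent vector.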
Classically, the idea of $\Delta x$ was what you were probably taught at school:
it was a little bit of $x$, and $\Delta y$ was a little bit of $y$, but you were really
looking at line segments along curves, and $\Delta x$ and $\Delta y$ are probably better
thought of as maps from line segments to $\mathbb{R}$. It is easy to see that with this
way of looking at things, the claim
$$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$$
makes sense provided we specify what we really mean by the terms. This
would involve saying that we are calculating $\Delta x(\ell)$ and $\Delta y(\ell)$ for line
segments $\ell$ joining some fixed point on a curve to other points, and the limit
means that the other points are taken to be getting closer and closer to the
fixed point. All this explanation was unfortunately regarded as not really
part of the mathematics and consequently got left out of the notation. If we
intend to study the subject on manifolds we have to put it back in.
Figure 3.3.3: A new rule for measuring distances of points from the origin.
The idea, then, that $\Delta x$ means a little bit of $x$ and $dx$ means a very, very
little bit of $x$ (so little that it is infinitesimal) still survives in the literature.
And the classical mathematicians wrote
$$d\sigma^2 = dx^2 + dy^2$$
to be an infinitesimal version of Pythagoras' Theorem and then used it to
find the length of curves. These days we define everything through limits,
which you spend a lot of time doing more or less rigorously in first year. At
least, that was the idea.
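So instead of writing
$$[x, \; y] \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
as the square of a new norm on $\mathbb{R}^2$, people wrote
$$[dx, \; dy] \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix}$$
as the same thing with infinitesimals to give infinitesimal sizes of infinitesimal
vectors.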
If you take a symmetric positive definite matrix and use it to define a new
norm (squared) on $\mathbb{R}^2$ you can look at the set
$$\left\{ \begin{bmatrix} x \\ y \end{bmatrix} \in \mathbb{R}^2 : ax^2 + 2bxy + cy^2 = 1 \right\}$$
as the set of points at distance 1 from the origin. This is an ellipse as in
figure 3.3.3. To calculate the distance of the indicated point from the origin
we need to measure the length of the orange line by taking the length of the
blue part of it as one unit. Alternatively we scale the ellipse until it passes
through the point and then look to see what the scaling factor was. This
makes the distance of the point from the origin about 2 units.
If you do the same thing using a Riemannian metric, the ellipse changes as
you move around the space. One can follow the idea of the old fashioned
geometers by drawing little (infinitesimal?) ellipses around every point of
the space. They thought of this in terms of $a\,dx^2 + 2b\,dx\,dy + c\,dy^2$, where
$dx^2$ means take an infinitesimal amount of $x$ and square it. So to calculate
the distance along a curve in $\mathbb{R}^2$ equipped with a Riemannian metric, you
took some finite set of points along the curve, one at the start and one at the
end, took the ellipse on each point, and measured the distance to the next
point and then added them all up. Then you repeated with more and more
points on your curve. In the limit we get the right answer. I have shown
a stage in this process in figure 3.3.4. Nobody, of course, actually did it by
taking limits; they used Calculus, which is quicker and less effort.
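The add-up-the-little-steps procedure can be imitated directly on a computer. The sketch below is only an illustration under stated assumptions: the position-dependent matrix in `metric` and the parabola in `curve` are choices made for the demonstration, and the names `metric`, `curve` and `polygonal_length` are hypothetical. Each chord is measured with the matrix sitting at its starting point, exactly as in the description above.

```python
import math

def metric(x, y):
    """Illustrative position-dependent symmetric matrix (an assumption)."""
    return [[1.0 + x * y, 0.0], [0.0, x * x + y * y]]

def curve(t):
    """The parabola x = t, y = t^2 for t in [0, 1]."""
    return (t, t * t)

def polygonal_length(n):
    """Add up the local-metric lengths of n chords along the curve."""
    total = 0.0
    for k in range(n):
        x0, y0 = curve(k / n)
        x1, y1 = curve((k + 1) / n)
        dx, dy = x1 - x0, y1 - y0
        g = metric(x0, y0)  # the "ellipse" at the start of this chord
        total += math.sqrt(
            g[0][0] * dx * dx + 2.0 * g[0][1] * dx * dy + g[1][1] * dy * dy
        )
    return total

coarse = polygonal_length(500)
fine = polygonal_length(4000)
print(coarse, fine)
```

Taking more points changes the answer less and less, which is the limit the classical geometers had in mind.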
Definition 3.3.2. A geodesic on a manifold with a Riemannian metric tensor
is a curve joining two points such that its length is less than or equal to that
of any other curve joining the points.
Exercise 3.3.2. Show that in $\mathbb{R}^n$ with the euclidean metric, geodesics are
straight line segments. Hint: This is a standard calculus of variations
problem. Google this if stuck.
Exercise 3.3.3. Describe the geodesics on the (flat) torus.
Of course the idea of infinitesimal ellipses is daft: the question is how to
rescue the idea so that it gives us a way of computing the length of a curve
in a space where distances keep changing. If the ellipses, or more properly the
positive definite symmetric quadratic forms, are perfectly respectable things
defined on the tangent space at each point, we get what we need.
Figure 3.3.4: Length of a curve via a Riemannian metric.
Example 3.3.1. On a suitable open set in $\mathbb{R}^2$ I define a new metric by saying
that locally it is given by the matrix
$$\begin{bmatrix} 1 + xy & 0 \\ 0 & x^2 + y^2 \end{bmatrix}$$
Find the length of the curve along the parabola $y = x^2$ from the origin to
$x = y = 1$ in this metric.
Solution:
The ordinary formula for the length of the curve is that it is $\int_c d\sigma$ where $c$ is
the curve and
$$d\sigma^2 = dx^2 + dy^2$$
is the infinitesimal path length. We can write this as
$$d\sigma^2 = [dx, \; dy] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix}$$
Our new and improved inner product changes from place to place but it gives
rise to a norm just as the old one does, and it is a norm on the tangent space.
We therefore have
$$d\sigma_1^2 = [dx, \; dy] \begin{bmatrix} 1 + xy & 0 \\ 0 & x^2 + y^2 \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix}$$
for the new way of measuring the differential path length and so the length
of the path along the parabola, with $x = t$, $y = t^2$, is
$$\int_0^1 \sqrt{(1 + t^3) \cdot 1 + (t^2 + t^4)(4t^2)} \; dt \approx 1.49958$$
where the approximation is done using Mathematica. This compares with
about 1.47894 using the standard metric.
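If Mathematica is not to hand, the same integral can be approximated with a few lines of Python. This is a sketch under stated assumptions: the composite Simpson rule with 1000 subintervals stands in for whatever Mathematica does internally, and the names `simpson`, `speed`, `new_metric` and `euclidean_metric` are introduced here purely for illustration. Each metric is returned as the triple $(g_{11}, g_{12}, g_{22})$.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def speed(t, metric):
    """Norm of the velocity of the curve x = t, y = t^2 in the given metric."""
    x, y = t, t * t
    vx, vy = 1.0, 2.0 * t
    g11, g12, g22 = metric(x, y)
    return math.sqrt(g11 * vx * vx + 2.0 * g12 * vx * vy + g22 * vy * vy)

def new_metric(x, y):
    """The metric of the example: diag(1 + xy, x^2 + y^2)."""
    return (1.0 + x * y, 0.0, x * x + y * y)

def euclidean_metric(x, y):
    """The standard metric: the identity matrix at every point."""
    return (1.0, 0.0, 1.0)

length_new = simpson(lambda t: speed(t, new_metric), 0.0, 1.0)
length_euclidean = simpson(lambda t: speed(t, euclidean_metric), 0.0, 1.0)
print(length_new, length_euclidean)
```

The only thing that changes between the two calls is the function returning the matrix entries, which is the whole point: the curve is the same, the ruler is different.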
Example 3.3.2. Find the path length of the spiral $r = \theta$ for $0 \leq \theta \leq 2\pi$ in
the metric on $\mathbb{R}^2$ given by
$$d\sigma_2^2 = [d\theta, \; dr] \begin{bmatrix} r^2 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} d\theta \\ dr \end{bmatrix}$$
Solution: This is just the usual metric on $\mathbb{R}^2$ disguised by using polar
coordinates since $d\sigma_2^2 = (r\,d\theta)^2 + (dr)^2$ is the usual way of calculating the
infinitesimal path length and the answer is
$$\int_0^{2\pi} \sqrt{t^2 + 1} \; dt \approx 21.2563$$
This compares with $2\sqrt{2}\,\pi \approx 8.885765876$ in the euclidean metric on the $\theta, r$
space. Well, in that space the curve is a straight line.
Exercise 3.3.4. Draw the curve and obtain a crude estimate of the length, if
possible with upper and lower bounds, to see if you think this is the length in
the usual metric.
Exercise 3.3.5. Find the path length of the above spiral using the metric
given by
$$d\sigma_2^2 = [d\theta, \; dr] \begin{bmatrix} r^2 & 0 \\ 0 & r^4 \end{bmatrix} \begin{bmatrix} d\theta \\ dr \end{bmatrix}$$
Remark 3.3.1. I should feel ashamed of myself for writing out expressions
such as the above for specifying a metric (or more accurately the square of a
norm), and should undoubtedly have written
$$g = r^2 \, d\theta \otimes d\theta + r^4 \, dr \otimes dr$$
or something similar. I have tried to give you something which will relate
the correct formulation to the things that the classical mathematicians did
(and which you may find at least as badly expressed in works on tensors and
tensor fields written by the congenitally confused). The bad notation can be
used to do sums quite quickly so is not wholly bad. Much depends on whether
you want to do an awful lot of sums without thinking what you are doing.
And face it, who would want to think while doing monster sums if they didn't
have to?
Exercise 3.3.6.
1. Find the length of the path A, $r = 1$ for $0 \leq \theta \leq \pi/2$, with respect to
the metric given by $r^2 \, d\theta \otimes d\theta + dr \otimes dr$. (Note that in the $\theta, r$ space
this gives the same answer as the usual metric, that is, treating $\theta, r$ as
if it were a piece of $\mathbb{R}^2$ with the euclidean metric.)
2. What is the length of the parallel line B, $r = 2$ for $0 \leq \theta \leq \pi/2$, in the
new metric?
3. What is the length of the line C, $r = 0$ for $0 \leq \theta \leq \pi/2$?
I show the three lines in figure 3.3.5.
4. Explain what has gone wrong. The length of a line segment with the
end points different cannot be zero in a metric.
5. On figure 3.3.5, draw the curve $r = 1/(\sin(\theta) + \cos(\theta))$, for
$\theta \in [0, \pi/2]$. Calculate its length with respect to the new metric. Hint: you
might try using NIntegrate in Mathematica.
6. Show the curve is a geodesic in the space, in particular it is shorter
than the straight line A. Hint: transform back to $\mathbb{R}^2$ with the euclidean
metric. Find out how to do this by reading on a bit.
Figure 3.3.5: Three lines of different lengths.
Note that the Riemannian metric tensor enables us to make sense of the angle
at which two curves cross. Without this it makes no sense at all to say that
curves intersect at right angles on a manifold, because in different charts we
could get totally different answers. We feed in two tangent vectors, one along
each curve, at the point of intersection so they are both in the same tangent
space. The Riemannian metric tensor gives us a number out and this leads
us to the angle just as in $\mathbb{R}^n$.
Suppose now that I want to compute the path length of a curve on a manifold.
Let us say I have a curve $c : [0, 1] \to S^2$ on $S^2$. I want to compute its length.
I take a chart containing some of the curve, say $u : U \to \mathbb{R}^2$, and this takes
the bit of the curve in $S^2$ to a bit of curve in $\mathbb{R}^2$. I have a Riemannian metric
tensor on the manifold. The picture of figure 3.3.6 shows a local
parametrisation by $u^{-1}$ of a patch containing some of the curve. The composite $u \circ c$
shifts the curve to the codomain of $u$, the open set $u(U)$ in $\mathbb{R}^2$.
Figure 3.3.6: Length of a curve on a manifold via a Riemannian metric.
Now I want to know what happens to the covariant 2-tensor field on $S^2$ which
tells me how to measure distances there. I claim that $u^{-1}$ induces a covariant
2-tensor field on $u(U)$. This requires a certain amount of thought.
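We have the picture from the last chapter:
$$\begin{array}{ccc}
T_a X & \xrightarrow{\;f_*\;} & T_{f(a)} Y \\
\downarrow {\scriptstyle \pi_X} & & \downarrow {\scriptstyle \pi_Y} \\
X & \xrightarrow{\;\;f\;\;} & Y
\end{array}$$
Now a linear map from $T_a(X)$ to $\mathbb{R}$ is taken by differentiable $f$ to a linear
map from $T_{f(a)}(Y)$ to $\mathbb{R}$, at least when $f$ has a differentiable inverse, as a
chart does. We can see this by using the natural equivalence
of $V^{**}$ with $V$ or we can simply send $\omega : T_a(X) \to \mathbb{R}$ to $f_*\omega : T_{f(a)}(Y) \to \mathbb{R}$.
These are the same thing.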
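Now we know what $f_*$ is on tangent vectors: it is just the derivative of $f$ at
each point. So on $\mathbb{R}^2$ if we have $f : \mathbb{R}^2 \to \mathbb{R}^2$ given by
$$\begin{bmatrix} x \\ y \end{bmatrix} \overset{f}{\longmapsto} \begin{bmatrix} u \\ v \end{bmatrix}$$
we can write
$$[du \;\; dv] = [dx \;\; dy] \begin{bmatrix} \dfrac{\partial f_1}{\partial x} & \dfrac{\partial f_2}{\partial x} \\[2ex] \dfrac{\partial f_1}{\partial y} & \dfrac{\partial f_2}{\partial y} \end{bmatrix}$$
so that $du = \frac{\partial f_1}{\partial x}\,dx + \frac{\partial f_1}{\partial y}\,dy$, and likewise for $dv$.
Similarly we can, given $f : X \to Y$ for $X = \mathbb{R}^n$ and $Y = \mathbb{R}^m$, transform the
covectors $dx, dy$ in a way strictly dual to the way we can carry a tangent
vector on $X$ to one on $Y$. In fact it works better for covectors because
a covector field on $Y$ is pulled back to one on $X$ (and it is certainly not
generally true that a vector field on $X$ is taken to one on $Y$).
Exercise 3.3.7. Why not?
Exercise 3.3.8. Define the pullback of a covector field on $Y$ to one on $X$.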
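Example 3.3.3. Let
$$P : \mathbb{R}^2 \to [0, 2\pi) \times [0, \infty), \qquad \begin{bmatrix} x \\ y \end{bmatrix} \mapsto \begin{bmatrix} \theta \\ r \end{bmatrix}$$
be the polar coordinate map. Then we can write $P^{-1}$ as
$$x = r\cos(\theta), \qquad y = r\sin(\theta)$$
Now we have
$$dx = \frac{\partial x}{\partial \theta}\, d\theta + \frac{\partial x}{\partial r}\, dr, \qquad dy = \frac{\partial y}{\partial \theta}\, d\theta + \frac{\partial y}{\partial r}\, dr$$
hence
$$dx = -r\sin(\theta)\, d\theta + \cos(\theta)\, dr, \qquad dy = r\cos(\theta)\, d\theta + \sin(\theta)\, dr$$
whereupon we can calculate the various tensor products in the inner product:
$$\begin{aligned}
dx \otimes dx &= (-r\sin(\theta)\, d\theta + \cos(\theta)\, dr) \otimes (-r\sin(\theta)\, d\theta + \cos(\theta)\, dr) \\
&= r^2\sin^2(\theta)\, d\theta \otimes d\theta - r\sin(\theta)\cos(\theta)\, dr \otimes d\theta - r\sin(\theta)\cos(\theta)\, d\theta \otimes dr + \cos^2(\theta)\, dr \otimes dr
\end{aligned}$$
and similarly for $dx \otimes dy$, $dy \otimes dx$ and
$$dy \otimes dy = r^2\cos^2(\theta)\, d\theta \otimes d\theta + r\sin(\theta)\cos(\theta)\, dr \otimes d\theta + r\sin(\theta)\cos(\theta)\, d\theta \otimes dr + \sin^2(\theta)\, dr \otimes dr$$
Hence we have
$$d\sigma \otimes d\sigma = dx \otimes dx + dy \otimes dy = r^2\, d\theta \otimes d\theta + dr \otimes dr$$
This, when translated into matrix terms and old fashioned $dx^2 + dy^2$ language,
gives us that the standard identity matrix on $\mathbb{R}^2$ for the euclidean metric
tensor goes over to the matrix
$$\begin{bmatrix} r^2 & 0 \\ 0 & 1 \end{bmatrix}$$
of example 3.3.2.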
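In matrix terms the calculation of this example says that the new matrix is $J^T G J$, where $G$ is the old matrix and $J$ is the matrix of partial derivatives of $(x, y)$ with respect to $(\theta, r)$. The following sketch checks this numerically at one sample point. The function names `jacobian` and `pullback` and the sample values of $\theta$ and $r$ are assumptions made for illustration.

```python
import math

def jacobian(theta, r):
    """Partial derivatives of x = r cos(theta), y = r sin(theta).

    Rows correspond to (x, y), columns to (theta, r).
    """
    return [
        [-r * math.sin(theta), math.cos(theta)],
        [r * math.cos(theta), math.sin(theta)],
    ]

def pullback(G, J):
    """Return the 2x2 matrix J^T G J as nested lists."""
    return [
        [
            sum(J[k][i] * G[k][l] * J[l][j] for k in range(2) for l in range(2))
            for j in range(2)
        ]
        for i in range(2)
    ]

EUCLIDEAN = [[1.0, 0.0], [0.0, 1.0]]
theta, r = 0.7, 2.5  # arbitrary sample point
polar_metric = pullback(EUCLIDEAN, jacobian(theta, r))
print(polar_metric)
```

Up to rounding, the result has $r^2$ in the top left corner, 1 in the bottom right, and zeros elsewhere, whatever sample point is used.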
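Example 3.3.4. Suppose $f : [0, 1] \to \mathbb{R}^2$ is a curve in $\mathbb{R}^2$ and we wish to
compute its length. Writing $f(t) = (x(t), y(t))^T$ we have the length of $f$ is
$$\int_{[0,1]} du$$
where $du$ is the pull-back from $\mathbb{R}^2$ by $f$ of the length measure $d\sigma$ on $\mathbb{R}^2$. The
usual (Lebesgue) measure on $[0, 1]$ is written $dt$. This gives us:
$$\begin{aligned}
dx \otimes dx &= \left(\tfrac{dx}{dt}\, dt\right) \otimes \left(\tfrac{dx}{dt}\, dt\right) = \left(\tfrac{dx}{dt}\right)^2 dt \otimes dt \\
dy \otimes dy &= \left(\tfrac{dy}{dt}\, dt\right) \otimes \left(\tfrac{dy}{dt}\, dt\right) = \left(\tfrac{dy}{dt}\right)^2 dt \otimes dt \\
d\sigma \otimes d\sigma &= dx \otimes dx + dy \otimes dy \\
du &= f^* d\sigma \\
du \otimes du &= \left(\left(\tfrac{dx}{dt}\right)^2 + \left(\tfrac{dy}{dt}\right)^2\right) dt \otimes dt \\
du &= \sqrt{\left(\tfrac{dx}{dt}\right)^2 + \left(\tfrac{dy}{dt}\right)^2}\; dt \\
\int_{[0,1]} du &= \int_{[0,1]} \sqrt{\left(\tfrac{dx}{dt}\right)^2 + \left(\tfrac{dy}{dt}\right)^2}\; dt
\end{aligned}$$
A familiar formula, usually derived somewhat less formally but using
essentially the same ideas. It is worth going through this argument while thinking
of $dx/dt$ as the amount of stretching $f$ does to the unit interval in the
x-direction when it takes $[0, 1]$ into $\mathbb{R}^2$ (and likewise $dy/dt$). Working through
the new jargon for a simple, friendly example makes you appreciate how the
new jargon actually does a good job of articulating geometric ideas of what
is going on.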
Exercise 3.3.9. Write out the matrix and tensor product forms of the
spherical and cylindrical polar coordinate transforms of $\mathbb{R}^3$ and confirm that the
euclidean metric goes to what it ought to.
Returning to the tensor field exported by $u^{-1}$ to $\mathbb{R}^2$, the map $u$ distorts
distances, but it also distorts the metric tensor in exactly the right way so
that if we use the $u^{-1}$-induced metric tensor to measure path length in $\mathbb{R}^2$
we get the right answer for the metric tensor on $S^2$.
You might be surprised at first that we transport a metric tensor field on a
space $X$ to one on a space $Y$ by a homeomorphism $u^{-1}$ which is the inverse
of the map $u : X \to Y$. Actually this makes good sense. Suppose we take
the simplest case of the usual metric tensor field on $\mathbb{R}$ which assigns to the
interval $[a, b]$ the length $b - a$. Map $\mathbb{R} \to \mathbb{R}$ by $u(x) = 2x$. Now we want a
new, shiny metric tensor field on the codomain which gives the image of $[a, b]$
the same length, $b - a$, so we can feel we have shifted not just the interval
$[a, b]$ but also the metric with which to measure its length.
Writing the length in the domain as $\int_a^b dx$ we see that we can get the (usual,
boring, old fashioned) length of the image in the codomain by writing it as
$$\int_{x=a}^{x=b} du = \int_a^b 2\, dx$$
where $du = 2\,dx$ follows from $u = 2x$.
You might have felt a bit happier had I written this as
$$\int_a^b \frac{du}{dx}\, dx = \int_a^b 2\, dx$$
Much depends on your previous experience of Calculus.
If we want to have length $b - a$ for the new, shiny length in the codomain,
which I shall call the u-space, we need to use $du^{-1}$. Now the classical
mathematicians, Gauss and his mob, would cheerfully write things not very different
from
$$u^{-1}(x) = x/2, \qquad du^{-1} = \tfrac{1}{2}\, dx$$
then using the metric given by $du^{-1}$ on the u-space we get that the length of the
interval $[2a, 2b]$ in the u-space with the right metric is
$$\int_{x=2a}^{x=2b} \tfrac{1}{2}\, dx = \tfrac{1}{2}(2b - 2a) = b - a$$
Obviously this works with all invertible linear maps from $\mathbb{R}$ to $\mathbb{R}$, not just $2x$.
Exercise 3.3.10. Show it works for u(x) = 2(x).
If $u$ is a diffeomorphism from $\mathbb{R}$ to $\mathbb{R}$ then we have something like
$$\frac{dx}{du} = D\, u^{-1} = \frac{1}{du/dx}$$
by the inverse function theorem. The interval $[a, b]$ in the x-space is taken to
$[u(a), u(b)]$ if $u$ is increasing, which I can assume it is without loss of generality
since if it isn't I just compose with the map that multiplies everything by
$-1$ and rename the composite to be $u$. Now the length of this in the usual
metric is $\int_a^b \frac{du}{dx}\, dx$. If I choose the metric given by $du^{-1}$ then I replace the
old, boring metric $dx$ with the new, shiny, transported metric $\frac{1}{du/dx}\, dx$,
and then the length is
$$\int_a^b \frac{du}{dx} \cdot \frac{1}{du/dx}\, dx = b - a$$
This tells us that if we use the metric transported by $u^{-1}$ to measure the
length of a curve in $\mathbb{R}$ transported by $u$, we get the same length.
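A numerical sanity check of this invariance, done honestly in the u-space rather than by cancelling symbols, might look like the sketch below. Everything specific in it is an assumption for illustration: the increasing diffeomorphism $u(x) = x^3 + x$, the interval $[0.5, 2]$, the bisection used to invert $u$, and the midpoint rule used for the integral.

```python
def u(x):
    """A hypothetical increasing diffeomorphism of R, chosen for illustration."""
    return x ** 3 + x

def du_dx(x):
    """Derivative of u; it is always at least 1, so u is strictly increasing."""
    return 3.0 * x * x + 1.0

def u_inverse(w, lo=-10.0, hi=10.0):
    """Invert u by bisection, assuming the answer lies in [lo, hi]."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if u(mid) < w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def transported_length(a, b, n=2000):
    """Length of [u(a), u(b)] using the density dx/du = 1 / (du/dx)."""
    wa, wb = u(a), u(b)
    h = (wb - wa) / n
    total = 0.0
    for k in range(n):
        w = wa + (k + 0.5) * h  # midpoint of the k-th subinterval in u-space
        total += h / du_dx(u_inverse(w))
    return total

a, b = 0.5, 2.0
usual_length = u(b) - u(a)            # boring length of the image interval
shiny_length = transported_length(a, b)  # length in the transported metric
print(usual_length, shiny_length, b - a)
```

The usual length of the image interval is much bigger than $b - a$, while the transported metric recovers $b - a$ to within the accuracy of the quadrature.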
Exercise 3.3.11. Show this works just as well if the curve is in $\mathbb{R}^2$ and
$u : \mathbb{R}^2 \to \mathbb{R}^2$ is a diffeomorphism. The length of the curve in the u-space
measured by the metric transported to the u-space by $u^{-1}$ is the same in both
spaces. Hint: Try it for linear maps $u$ first.
Exercise 3.3.12. By taking two distinct charts both covering a curve on $S^2$
and hence related by a diffeomorphism, show that whichever chart you use, if
you induce the right metric tensors on $\mathbb{R}^2$ from the charts and calculate the
lengths by both of them, they agree on the length of the curve on $S^2$. Note that
you don't really need $S^2$ at all for this exercise, it is about the way covariant
tensor fields on $\mathbb{R}^2$ transform under diffeomorphisms.
Exercise 3.3.13.
1. Show that contravariant tensors of any order on a vector space $U$ are
carried by linear maps $f : U \to V$ to contravariant tensors of the same
order on the vector space $V$.
2. Show that covariant tensor fields of any order on a smooth manifold
$V$ are carried to covariant tensor fields on a manifold $U$ of the same
order by the inverse of a diffeomorphism $h : V \to U$.
3. Let $\langle\,,\,\rangle$ be an inner product on $V$. Show that it induces an isomorphism
between $V$ and $V^*$.
4. Does an isomorphism from $V$ to $V^*$ always give an inner product on
$V$?
5. Deduce that if we have an inner product on $V$ we can induce a dual
inner product on $V^*$ and vice-versa, and hence that it would be possible
to define a Riemannian metric tensor as being a contravariant tensor
field.
6. Explain why this is not usually done.
7. Find a metric tensor on $S^2$ which gives the usual notions of distance
and angles between intersecting curves.
8. Using this tensor, confirm that the angle at the north pole between the
curves obtained by travelling around the great circle in the $x$-$z$ plane
and the great circle in the $y$-$z$ plane is what common sense says it
should be.
We now have enough machinery to say quite a lot about what we mean by the
geometry of a space and in particular we can say something about curvature.
If I give a curve on a manifold, and a Riemannian metric structure on it, we
can cover the curve with open sets homeomorphic to open sets in $\mathbb{R}^n$, shift
the curve (in pieces if necessary) back to $\mathbb{R}^n$, shift the metric structure, and
compute the length. But how do we specify a curve on the manifold in the
first place? And how do we specify a Riemannian structure on it? We can do
that with charts too, if all else fails.
Note that this is all intrinsic, it does not require an embedding of the manifold
in $\mathbb{R}^n$. If we do have an embedding, we can derive a Riemannian structure
for the manifold from the usual euclidean metric on the enclosing space.
Exercise 3.3.14. How?
But if we are to say anything about the geometry of the space of the universe
in which we live, it has to be done intrinsically. If there is an embedding of
the universe in some higher dimensional euclidean space, we cannot ever
know anything about it, and so it is idle to talk about it.
A useful reference for much of the material covered so far is Volume One of
A Comprehensive Introduction to Differential Geometry by Michael Spivak.
A quick glance should persuade you that there is a rather considerable depth
in the material. Also bear in mind there are several volumes.
The idea of a vector bundle with bundle maps which are linear on each fibre
is of great importance in theoretical physics. Essentially, fields are sections of
suitable locally trivial vector bundles. We can specify a general locally trivial
vector bundle over a manifold by taking local trivialisations and specifying
a way of gluing the local products together. This often involves a group:
in the case of the Möbius bundle, for example, the group $\mathbb{Z}_2$ has everything
to do with the bundle structure. Quantum Chromodynamics and Gauge
Invariance are describable in terms of the structure of locally trivial vector
bundles. The physicists Yang and Mills were obliged to reinvent some of the
ideas well known to differential geometers, a good argument for doing serious
mathematics before tackling theoretical physics.
3.4 Geometry
Given a Riemannian Manifold $M^n$ which we assume is compact and path
connected, for any $a, b \in M^n$ we can take a smooth path between $a$ and $b$
and compute its length. This gives a map from the space of smooth paths
joining $a$ and $b$ to $\mathbb{R}$. Of all possible such paths, we may hope that there is
one with the length a minimum, a geodesic on the manifold. It is not entirely
trivial to show that such a path exists.
Exercise 3.4.1. Show that if $M^n$ is not compact there may be points $a, b$
such that there is no path of minimum length joining them.
When this can be done we assign the distance between $a$ and $b$ to be this
minimum length. Note that the minimum length (more properly the infimum
of the lengths) may exist even though there is no path having it as length.
This makes the Riemannian manifold a metric space.
Exercise 3.4.2.
1. Prove that last claim.
2. Show that $T^2$, $S^2$, $\mathbb{RP}^2$, $K^2$ all have the structure of Riemannian
manifolds and give the metric arising.
Definition 3.4.1. A map $f : (X, d) \to (Y, e)$ between metric spaces is an
isometry iff it preserves distances, i.e. iff
$$\forall a, b \in X, \quad e(f(a), f(b)) = d(a, b)$$
If we take a small ball centred on the north pole of $S^2$ as in the diagram,
figure 3.4.1, we can make it a ball in the metric of some radius $r$.
There is certainly a diffeomorphism between this ball and the ball of radius
$r$ centred on the origin in $\mathbb{R}^2$. However it is easy to see that the length of the
perimeter is $2\pi r$ for the ball in $\mathbb{R}^2$ but is less than this for the ball on $S^2$. So
there cannot be an isometry between the two balls.
Exercise 3.4.3. Provide a convincing argument for these claims.
On the other hand, there is an obvious isometry between any two balls of the
same radius in $\mathbb{R}^2$, and also one between any two balls of the same sufficiently
small radius in $S^2$. A shift does it in $\mathbb{R}^2$ and a rotation in $S^2$.
Definition 3.4.2. If $X^n$ is a Riemannian manifold and for any two points
$a, b \in X$, there is a smooth isometry $f : X \to X$ with $f(a) = b$ then we say
the geometry is homogeneous.
Figure 3.4.1: A ball (disc) on $S^2$.
Definition 3.4.3. If $X^n$ and $Y^m$ are homogeneous Riemannian manifolds
and for any sufficiently small ball $B$ in $X$ there is a smooth map $f$ taking $B$
isometrically to a ball on $Y$, then we say that $X$ and $Y$ have the same local
geometry.
Exercise 3.4.4.
1. Show that if two homogeneous manifolds have the same local geometry
they have the same dimension.
2. Show that having the same local geometry is an equivalence relation
on homogeneous Riemannian manifolds.
3. Show that the flat torus defined by gluing is a homogeneous Riemannian
manifold with the same local geometry as $\mathbb{R}^2$.
4. Show that the cylinder $S^1 \times \mathbb{R}$ as the subspace $(\cos(t), \sin(t), z)^T$ of $\mathbb{R}^3$
has the same local geometry as $\mathbb{R}^2$.
5. Show that $\mathbb{RP}^2$ is a homogeneous Riemannian manifold.
6. Construct a definition of what it means for two manifolds to have the
same local topology. Give an example of distinct manifolds having
the same local topology.
7. Construct a definition of what it means for two manifolds to have the
same global geometry.
3.5 The Exterior Algebra
I have explained that the tensor algebra $T^k(\mathbb{R}^n)$ has basis the set
$$\left\{ dx^{i_1} \otimes dx^{i_2} \otimes \cdots \otimes dx^{i_k} : dx^{i_j} \in T^1(\mathbb{R}^n) \right\}$$
Well, to be more exact, I asked you to prove it. It is all a matter of getting
used to the jargon and is conceptually rather simple once you are happy
with dual bases and elements of the cotangent bundle as linear maps taking
tangent vectors to numbers.
The space of alternating covariant k-tensors, $\Lambda^k(\mathbb{R}^n)$, we know has dimension
${}^nC_k$ and we can get a basis by noting that we have the maps specified when
we know what they do to some standard basis elements of
$$\underbrace{\mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n}_{k \text{ terms}}$$
Since the maps are clearly linearly independent for different choices and since,
by an exercise, any alternating k-tensor on $\mathbb{R}^n$ can be expressed as a linear
combination of maps which take each of the needed basis elements of $\mathbb{R}^{nk}$ to
1, we can say, at some length, what the basis elements of $\Lambda^k(\mathbb{R}^n)$ are. They
are the maps which take each choice of $e_{i_1}, e_{i_2}, \cdots, e_{i_k}$ having $i_1 < i_2 < \cdots < i_k$
to one, and the values on every other basis element of $\mathbb{R}^{nk}$ are specified by the
fact that they alternate. Then multilinearity forces the value of any linear
combination of these things everywhere.
It has to be said that this is messy. It would be nice if we could specify the
basis elements more neatly. Something similar to the description given for
the general tensor space $T^k(\mathbb{R}^n)$ would be neater. We can take it that this is
possible because we know that we can express a basis for the space in terms
of making choices of $k$ distinct rows of $k$ vectors from $\mathbb{R}^n$ placed side by side
and evaluating the determinant on our choice.
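If we have two vectors from $\mathbb{R}^3$,
$$(\mathbf{x}, \mathbf{a}) = \left( \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \begin{bmatrix} a \\ b \\ c \end{bmatrix} \right)$$
then we can take the top pair to get $\Delta_{12}(\mathbf{x}, \mathbf{a}) = xb - ya$, or the bottom pair
to get $\Delta_{23}(\mathbf{x}, \mathbf{a}) = yc - zb$, or the top and bottom to get $\Delta_{13}(\mathbf{x}, \mathbf{a}) = xc - za$.
These three are all alternating and every alternating 2-tensor on $\mathbb{R}^3$ is a linear
combination of these three.
Exercise 3.5.1. Prove this last remark.
We actually write these as $dx \wedge dy$, $dy \wedge dz$ and $dx \wedge dz$ respectively, and the
only thing left to do is to explain where the terminology comes from.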
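The three maps just described are small enough to experiment with. In the sketch below the function name `minor` and the sample vectors are hypothetical; rows are numbered from 0 as is usual in Python, so rows (0, 1) are the top pair, (1, 2) the bottom pair and (0, 2) the top and bottom.

```python
def minor(i, j, p, q):
    """2x2 determinant using rows i and j of the vectors p and q set side by side."""
    return p[i] * q[j] - p[j] * q[i]

x_vec = (1.0, 2.0, 3.0)  # plays the role of (x, y, z)
a_vec = (4.0, 5.0, 6.0)  # plays the role of (a, b, c)

top_pair = minor(0, 1, x_vec, a_vec)        # x*b - y*a
bottom_pair = minor(1, 2, x_vec, a_vec)     # y*c - z*b
top_and_bottom = minor(0, 2, x_vec, a_vec)  # x*c - z*a
print(top_pair, bottom_pair, top_and_bottom)
```

Swapping the two vectors changes the sign of each number, and feeding in the same vector twice gives zero, which is what alternating means.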
First I want to explain the determinant for $n \times n$ matrices. You will need to
recall the material on odd and even permutations from 3P0.
Suppose I have 3 columns, each column a vector in $\mathbb{R}^3$. I shall write them
$$\begin{bmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{bmatrix}$$
Now I choose one element from each column, taking care never to choose
two things from the same row. So I can pick $x_1, y_2, z_3$ or $x_2, y_1, z_3$ but not
$x_1, y_3, z_1$.
It follows that if we just look at the indices in $xyz$ order, we get a
permutation of $(1, 2, 3)$ specifying a choice. If I pick $x_1, y_2, z_3$ I get the identity
permutation. If I pick $x_2, y_1, z_3$ I get the permutation which I wrote out as
$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$$
Now I make every possible choice of $x_i, y_j, z_k$ with no two indices the
same, and I get 6 (count them) possibilities, which is $3!$, the size of the
permutation group $S_3$. Now I multiply every number in each choice together,
obtaining $x_1 y_2 z_3$ for the first permutation and $x_2 y_1 z_3$ for the
second, and so on. Note that the terms in the product are never the same
(although of course the values of the terms or the result of multiplying them
together may be). This gives me $3!$ products. Had I done this with $n$
vectors in $\mathbb{R}^n$ I should have got $n!$ distinct products.
Now I take these products, multiply each by the sign of the permutation ($+1$
if an even permutation, $-1$ if odd), and add them up. This sum of $n!$ terms,
with parity taken into account, is the determinant of the matrix.
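The recipe just described can be turned into a few lines of Python. This is only a sketch of the sum-over-permutations definition, not an efficient way to compute determinants; the function names `sign` and `det` and the sample vectors are illustrative assumptions, not part of the notes.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation: +1 if even, -1 if odd, found by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def det(columns):
    """Determinant of the square matrix whose columns are the given vectors."""
    n = len(columns)
    total = 0
    for perm in permutations(range(n)):
        product = 1
        for col, row in enumerate(perm):
            # one entry from each column, never two from the same row
            product *= columns[col][row]
        total += sign(perm) * product
    return total

# Hypothetical sample columns x, y, z in R^3
x, y, z = (2, 0, 1), (1, 3, 0), (0, 1, 4)
value = det([x, y, z])          # 25
swapped = det([y, x, z])        # swapping two columns flips the sign: -25
```

The loop visits all $n!$ permutations, so this is only sensible for small $n$.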
Exercise 3.5.2. Confirm this for $n = 2$ and $n = 3$.
Exercise 3.5.3. Show that for any $n \times n$ matrix $A$, $\det(A) = \det(A^T)$.
I could write out the choice $x_1, y_2, z_3$ as
$dx^1 \otimes dx^2 \otimes dx^3$ applied to the three vectors. This is taking
$dx^i$ to mean the projection map from $\mathbb{R}^3$ to $\mathbb{R}$ which
selects the $i$th component. I am confusing this projection map (which I might
more reasonably have called $e^i$, the dual basis element to $e_i$, or
$e_i^*$) with the map $dx^i : T\mathbb{R}^n \to \mathbb{R}$ and the reason is
that pretty soon we shall be doing all this on the tangent space, and if I
confuse the notation a bit now there is less novelty later.
In the case of $\mathbb{R}^2$ I get that the determinant can be written easily
as $dx^1 \otimes dx^2 - dx^2 \otimes dx^1$. There are, after all, only two
permutations of two things.
I shall write this as $dx \wedge dy$. In fact if I have a covariant 2-tensor
on $\mathbb{R}^n$, I shall also write:
\[ dx^1 \wedge dx^2 = dx^1 \otimes dx^2 - dx^2 \otimes dx^1 \]
This is equivalent to choosing the first two rows of the $n \times 2$ matrix
made up by choosing any two vectors in $\mathbb{R}^n$, and computing the
determinant. It is easy to see that it is an alternating covariant 2-tensor on
$\mathbb{R}^n$. I have immediately that $dx^2 \wedge dx^1 = -dx^1 \wedge dx^2$.
Similarly I can take $dx^i \wedge dx^j$ defined by
\[ dx^i \wedge dx^j = dx^i \otimes dx^j - dx^j \otimes dx^i \]
and this is $-dx^j \wedge dx^i$ and $dx^i \wedge dx^i = 0$, for
$i, j \in [1 : n]$. What this means is that I select the $2 \times 2$ matrix
comprising the $i$th and $j$th rows of the two column vectors, and calculate
the determinant of them.
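A small Python sketch may help make the definition concrete. It builds $dx^i \wedge dx^j$ literally as the difference of two tensor products of projections; the names `dx` and `wedge2` and the sample vectors are assumptions made for illustration only.

```python
def dx(i):
    """Projection onto the i-th component (1-based, as in the text)."""
    return lambda v: v[i - 1]

def wedge2(i, j):
    """dx^i ^ dx^j = dx^i (x) dx^j - dx^j (x) dx^i, as a function of two vectors."""
    return lambda u, v: dx(i)(u) * dx(j)(v) - dx(j)(u) * dx(i)(v)

# Hypothetical sample vectors in R^4
u = (1, 2, 3, 4)
v = (0, 1, 0, 2)

a = wedge2(1, 2)(u, v)   # rows 1 and 2: 1*1 - 2*0 = 1
b = wedge2(2, 1)(u, v)   # swapping the indices flips the sign: -1
c = wedge2(3, 3)(u, v)   # a repeated index gives 0
d = wedge2(1, 2)(v, u)   # swapping the arguments also flips the sign: -1
```

Each value is the determinant of the $2 \times 2$ matrix cut out of the two columns by the chosen rows.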
This can be generalised to covariant 3-tensors on $\mathbb{R}^n$ without too
much trouble. In this case I have to define $dx^i \wedge dx^j \wedge dx^k$ and
I do this by writing out every permutation of $i, j, k$ so that if $\sigma$ is
a permutation I take the $3!$ terms
\[ dx^{\sigma(i)} \otimes dx^{\sigma(j)} \otimes dx^{\sigma(k)} \]
for the $3!$ permutations, $\sigma$. For each term I apply the three
projections to the three vectors, multiply the resulting numbers together,
multiply by the sign of the permutation, and sum the $3!$ numbers. This gives
$dx^i \wedge dx^j \wedge dx^k$. It is easy to see that it is an alternating
3-tensor on $\mathbb{R}^n$.
Exercise 3.5.4. Prove the last claim.
The generalisation to alternating $k$-tensors on $\mathbb{R}^n$ is obvious.
Exercise 3.5.5. Write it down.
It follows that we can give a basis for the space $\Lambda^k(\mathbb{R}^n)$
rather easily: it consists of the alternating tensors
\[ \{ dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_k} : i_1 < i_2 < \cdots < i_k \in [1 : n] \} \]
Now putting $x^1 = x$, $x^2 = y$ and $x^3 = z$ in traditional fashion, we
recover the mysterious expression at the beginning of this section on the
Exterior Algebra.
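To see the basis concretely, the following Python sketch lists the increasing index choices and evaluates the corresponding wedge by the signed sum over permutations. The helper names (`basis_indices`, `wedge`) are hypothetical, and this is an illustration under those assumptions rather than a definitive implementation.

```python
from itertools import combinations, permutations
from math import comb

def sign(perm):
    """Sign of a permutation: +1 if even, -1 if odd, found by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def basis_indices(n, k):
    """All increasing index tuples i_1 < ... < i_k drawn from 1..n."""
    return list(combinations(range(1, n + 1), k))

def wedge(indices):
    """dx^{i_1} ^ ... ^ dx^{i_k} as a function of k vectors in R^n."""
    k = len(indices)

    def form(*vectors):
        total = 0
        for perm in permutations(range(k)):
            term = sign(perm)
            for slot, p in enumerate(perm):
                # apply the permuted projection to the vector in this slot
                term *= vectors[slot][indices[p] - 1]
            total += term
        return total

    return form

pairs = basis_indices(3, 2)                      # [(1, 2), (1, 3), (2, 3)]
top_pair = wedge((1, 2))((1, 2, 3), (4, 5, 6))   # 1*5 - 2*4 = -3
volume = wedge((1, 2, 3))((1, 0, 0), (0, 1, 0), (0, 0, 1))   # 1
```

The number of basis elements is the number of ways of choosing $k$ indices from $n$, which `comb(n, k)` confirms for small cases.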
Exercise 3.5.6.
1. Show how to construct a linear map
$\mathrm{Alt} : T^k(\mathbb{R}^n) \to \Lambda^k(\mathbb{R}^n)$ which
alternates any tensor and sends any alternating tensor to itself. Hint: the
essential idea occurs in turning $dx \otimes dy$ into $dx \wedge dy$ and
$dx^1 \otimes dx^2 \otimes dx^3$ into $dx^1 \wedge dx^2 \wedge dx^3$ by adding
up the signed permutations. It might be better to average them in this case.
2. Show how to generalise the $\wedge$ of $dx^i$, $dx^j$ so that if $\omega$
is an alternating $k$-tensor on $\mathbb{R}^n$ and $\eta$ is an alternating
$\ell$-tensor, then $\omega \wedge \eta$ is an alternating $(k + \ell)$-tensor.
Hint: Hit the tensor product with the Alt of the last exercise.
3. Show that $dx^1 \wedge dx^2$ applied to a pair of points in $\mathbb{R}^2$,
represented in the standard way, gives twice the oriented area of the triangle
formed by the pair of points together with the origin.
4. What do you need to get the area of the triangle formed by two points
in $\mathbb{R}^n$ and the origin? Does it make sense to talk of an oriented
area in this case?
The exercises should now make it clear that just as $\otimes$ between
$k$-tensors and $\ell$-tensors gives us $(k + \ell)$-tensors and hence a
graded algebra, so $\wedge$ between alternating $k$-tensors and alternating
$\ell$-tensors gives us an alternating $(k + \ell)$-tensor and hence another
graded algebra. This is called the Exterior Algebra. Since the only
alternating $n$-tensor on $\mathbb{R}^n$ is the determinant (up to a scalar
multiple), and since $\Lambda^k(\mathbb{R}^n)$ is just the zero tensor
whenever $k > n$, we are really only concerned with the graded algebra
$\Lambda^k(\mathbb{R}^n) : 1 \le k \le n$. This makes the exterior algebra
rather simpler (and a lot smaller) than the tensor algebra.
Exercise 3.5.7. Show that $\Lambda^k(\mathbb{R}^n)$ is just the zero tensor
whenever $k > n$.
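I can write out the full exterior algebra in the form:

Order                        basis             isomorphic to
$\Lambda^0(\mathbb{R}^2)$    $1$               $\mathbb{R}^1$
$\Lambda^1(\mathbb{R}^2)$    $dx,\ dy$         $\mathbb{R}^2$
$\Lambda^2(\mathbb{R}^2)$    $dx \wedge dy$    $\mathbb{R}^1$

This is a nice finite table. Just as I wrote out table 3.1.2 I can write out
the exterior algebra for $\mathbb{R}^2$:

                  $\mathbb{R}$               $\mathbb{R}^2$             $\mathbb{R}$
$\mathbb{R}$      $(\mathbb{R}, \times)$     $(\mathbb{R}^2, \cdot)$    $(\mathbb{R}, \times)$
$\mathbb{R}^2$    $(\mathbb{R}^2, \cdot)$    $(\mathbb{R}, \det)$       $0$
$\mathbb{R}$      $(\mathbb{R}, \times)$     $0$                        $0$

Table 3.5

Again, $\times$ denotes ordinary multiplication in $\mathbb{R}$ and $\cdot$
denotes scalar multiplication. And $\det$ denotes the determinant. The table
starts with 0-tensors at the top and 2-tensors at the bottom.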
Exercise 3.5.8. Write out the full exterior algebra on $\mathbb{R}^3$. You
should replicate the above two tables with rather more columns and rows. In
the second table, work out what the multiplications are, as for table 3.1.2.
Do you recognise anything?
3.6 The Exterior Calculus
The step from the tensor algebra to tensor fields consisted of having a
section of the tensor bundle, which meant attaching a type $(k, \ell)^T$
tensor to each point on a manifold. We do exactly the same thing again: we
take a section of the $\Lambda^k(V)$ bundle where $V$ is a tangent space. This
means that we attach to each point $a$ of the $n$-manifold $M^n$ an
alternating $k$-tensor on the space $T_a(M)$. A 0-tensor is just a number, and
attaching a number to each point of a manifold is merely defining a map from
$M^n$ to $\mathbb{R}$. Similarly, attaching an $n$-tensor is attaching a
number, the volume element, at each point of $M^n$. In between we have
$k$-forms attached at each point of the manifold. Naturally we want the
sections to be smooth.
Such sections are called differential forms on the manifold.
To make this concrete we look at $\mathbb{R}^2$ and $\mathbb{R}^3$.
A differential 0-form on $\mathbb{R}^2$ is just a smooth map from
$\mathbb{R}^2$ to $\mathbb{R}$. We know a fair bit about these.
A differential 1-form on $\mathbb{R}^2$ assigns to each point of
$\mathbb{R}^2$ a pair of numbers $a\,dx + b\,dy$ and consequently is a pair of
functions
\[ P(x, y)\,dx + Q(x, y)\,dy \]
It is a covector field and looks very like a vector field (but watch out for
what happens when you change bases!)
A differential 2-form on $\mathbb{R}^2$ assigns to each point $a$ of
$\mathbb{R}^2$ an operator $\psi(a)\,dx \wedge dy$. This is short for
$\psi(a)\,(dx \otimes dy - dy \otimes dx)$ for some number $\psi(a)$ which
depends on $a$. This acts on any pair of vectors in the tangent space at $a$.
Let's choose some, written with respect to the standard basis for
$T\mathbb{R}^2$ (since, for any $a \in \mathbb{R}^2$, $T\mathbb{R}^2_a$ is
isomorphic to $T\mathbb{R}^2_0$ in a natural way). Then
\[ \psi(a)\,dx \wedge dy \left( \begin{pmatrix} x \\ y \end{pmatrix}, \begin{pmatrix} u \\ v \end{pmatrix} \right) = \psi(a)(xv - yu) \]
So $\psi(a)\,dx \wedge dy$ assigns to any pair of tangent vectors the
(oriented) area of the parallelogram in the tangent space which they
determine, multiplied by a function of $a$. Or if you prefer, twice $\psi(a)$
times the area of the triangle consisting of the two points and the origin of
the tangent space $T\mathbb{R}^2$.
A quite useful way of looking at this is that $\psi(a)\,dx \wedge dy$ is doing
something a bit like the riemannian metric, but instead of returning the inner
product of two tangent vectors it is returning an infinitesimal area element.
So we imagine that we want the area definition to vary over the space so that
calculating an area of a region is now more complicated. On the other hand
you have seen this before, more or less.
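Example 3.6.1. First I am going to transform the usual area measure
$dx \wedge dy$ on $\mathbb{R}^2$ and use it to calculate the area of the unit
disc in polar coordinates.
We have the polar coordinate transform
\[ P : \mathbb{R}^2 \setminus \{0\} \to S^1 \times \mathbb{R}^+, \qquad \begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} \theta \\ r \end{pmatrix} \]
We have the inverse given by
\[ x = r \cos(\theta), \qquad y = r \sin(\theta) \]
and exactly as before
\[ dx = -r \sin(\theta)\,d\theta + \cos(\theta)\,dr \]
\[ dy = r \cos(\theta)\,d\theta + \sin(\theta)\,dr \]
Last time we calculated $dx \otimes dx$ and the three others. This time there
is only one thing to calculate, $dx \wedge dy$. We get
\[ dx \wedge dy = (-r \sin(\theta)\,d\theta + \cos(\theta)\,dr) \wedge (r \cos(\theta)\,d\theta + \sin(\theta)\,dr) \]
which we can easily see is just:
\[ -r \sin^2(\theta)\,d\theta \wedge dr + r \cos^2(\theta)\,dr \wedge d\theta = r\,dr \wedge d\theta \]
Exercise 3.6.1. Show this carefully.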
Using this new area element we get that
\[ \int_{B_1(0)} dx \wedge dy = \int_{B_1(0)} r\,dr \wedge d\theta \]
which we already knew although not in this language. Note that the domain
of integration, $B_1(0)$, is a disc in the $x$-$y$ space and a rectangle
wrapped around a cylinder in the $r$-$\theta$ space. This is what happens to
the punctured disc under the diffeomorphism $P$.
The new integral has $\theta \in S^1$ and $r \in [0, 1]$ which makes for an
easy integral, $(1/2)(2\pi) = \pi$.
I knew that.
Note that this works because we
• transformed the disc in $\mathbb{R}^2$ into a rectangle in
  $S^1 \times \mathbb{R}^+$, except that the centre of the disc really got
  thrown away (zero area so does not affect the result) so the rectangle
  (a) doesn't have a base (zero area in any sane density on $\mathbb{R}^2$)
  and (b) gets wrapped once around the circle.
• back transformed the measure density $dx \wedge dy$ to get the right
  density to use to compute the area.
All this does is to make clear something which you were trained to do using
much sloppier arguments to justify the right rule for the change to polars.
It was all perfectly OK but the rationale was scruffy. Note how the exterior
algebra rules for computing the new form automatically take care of signs and
orientations. Doing it for any other transformation than the polar one is now
a doddle.
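As a numerical sanity check on the polar area element, here is a minimal Python sketch that integrates a density against $r\,dr \wedge d\theta$ over the rectangle $r \in [0, 1]$, $\theta \in [0, 2\pi]$ with the midpoint rule. The function name `integrate_polar` and its parameters are assumptions made for illustration.

```python
import math

def integrate_polar(density, r_max=1.0, n_r=200, n_theta=200):
    """Midpoint-rule approximation of the integral of density(r, theta) * r dr dtheta."""
    dr = r_max / n_r
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            # the factor r is exactly the coefficient in r dr ^ dtheta
            total += density(r, theta) * r * dr * dtheta
    return total

# Constant density 1 gives the area of the unit disc, which should be pi.
area = integrate_polar(lambda r, theta: 1.0)
```

Passing a different `density` integrates that function over the disc in the same way.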
Exercise 3.6.2. Calculate the area of the unit disc in $\mathbb{R}^2$ with
respect to the density $xy\,dx \wedge dy$.
Exercise 3.6.3. Work through the argument for the spherical and cylindrical
polar coordinate transformations in $\mathbb{R}^3$.
Exercise 3.6.4. Think of some bizarre diffeomorphism of $\mathbb{R}^2$ to some
two dimensional space that does something frightful but has an explicit
inverse (make sure the inverse can be written down even if the original is a
swine). Use it to evaluate the area of some region in both spaces, before and
after being transformed. This should give two moderately foul double integrals
with weird limits. Use Mathematica to get numerical solutions and confirm
they are pretty much the same.
Remark 3.6.1. This should give you a conviction that differential forms
have their uses, and will suggest the most important thing about them:

    Differential Forms are things you integrate over manifolds. A
    differential $k$-form can be integrated over a $k$-manifold or
    $k$-manifold with boundary.

I have made sure you would see this as it tells you what they are for.
Transforming differential forms by diffeomorphisms follows the same pattern
as for transforming the riemannian metric tensor, except that we may have
to transform $k$-forms for $k > 2$. The rules are simple however.
Exercise 3.6.5. Write down explicit rules in terms of partial derivatives for
transforming a differential 3-form on $\mathbb{R}^3$ under a diffeomorphism,
rules which you must have used in doing the preceding exercise but one.
It follows from the big announcement that 2-forms on $\mathbb{R}^2$ are
integrated over things like discs, and a 1-form on $\mathbb{R}^2$ has to be
integrated over curves.
Example 3.6.2. Let the curve $c$ be the graph of $y = x^2$ between $x = 0$ and
$x = 1$. Let the differential 1-form be $dx + dy$. What would we expect the
answer to $\int_c dx + dy$ to be on the basis of what this means, and what
would the calculation be?
Solution: Drawing the graph and taking a typical line segment on the curve,
$\Delta x$ is the projection of the segment along the $x$-axis and $\Delta y$
is its projection along the $y$-axis. If we add these up over the whole curve
we get $1 + 1 = 2$, and this is not going to change as the segments get
shorter. So the answer is 2. All done by a little thought about what these
things mean.
If we write $y = x^2$ we get $dy = 2x\,dx$ so
\[ \int_c dx + dy = \int_{[0,1]} (1 + 2x)\,dx = \left[ x + x^2 \right]_0^1 = 2 \]
As an alternative we could write $x = t$, $y = t^2$, $t \in [0, 1]$ to express
the curve parametrically and this would give the same answer.
Exercise 3.6.6. Now try it for the curve $c$ being the first quadrant of the
unit circle. Do you get the same answer? If not why not?
Note that when we express the curve parametrically we do so by a function $c$
and this allows us to pull back the differential 1-form on $\mathbb{R}^2$ to a
differential 1-form on $\mathbb{R}$ which is just some function multiplied by
$dt$ and we can integrate this in the usual way, numerically if necessary.
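The pull-back recipe is easy to sketch in Python: evaluate the coefficients on the curve, multiply by the derivatives of the parametrisation, and integrate the resulting function of $t$. The function `integrate_one_form` and its argument names are hypothetical; the curve used is the parabola of Example 3.6.2.

```python
def integrate_one_form(P, Q, curve, velocity, t0, t1, steps=1000):
    """Integrate P dx + Q dy along t -> curve(t) via the pulled-back form on [t0, t1].

    The pulled-back 1-form is (P * dx/dt + Q * dy/dt) dt; the midpoint rule is used.
    """
    h = (t1 - t0) / steps
    total = 0.0
    for n in range(steps):
        t = t0 + (n + 0.5) * h
        x, y = curve(t)
        dx_dt, dy_dt = velocity(t)
        total += (P(x, y) * dx_dt + Q(x, y) * dy_dt) * h
    return total

# The 1-form dx + dy on the curve x = t, y = t^2, t in [0, 1]
result = integrate_one_form(
    P=lambda x, y: 1.0,
    Q=lambda x, y: 1.0,
    curve=lambda t: (t, t * t),
    velocity=lambda t: (1.0, 2.0 * t),
    t0=0.0,
    t1=1.0,
)
```

Here the pulled-back form is $(1 + 2t)\,dt$, so the numerical value agrees with the answer 2 found above.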
Exercise 3.6.7.
1. What would you expect the result of $\int_c dx + dy$ to be when $c$ is any
smooth closed curve?
2. Suppose we take $f(x, y) = x + y$. Then we have
\[ df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy \]
which in this case is the 1-form $dx + dy$. So differentiating a smooth
0-form gives a smooth 1-form. Show that this is always the case.
3. It follows that $\int_c dx + dy = \int_c df$. Use the fundamental theorem
of calculus to prove your solution to the first question is correct, and
verify that it gives the right answers to all the other integrations of this
1-form along curves.
4. Find a 1-form on $\mathbb{R}^2$, $P(x, y)\,dx + Q(x, y)\,dy$, that is not
$df$ for any $f$. Hint: What can we say about $\partial P/\partial y$ and
$\partial Q/\partial x$ if the 1-form is the derivative of a 0-form?
The usual way to represent the derivative of a 0-form $f$ is as the row matrix
$[\partial f/\partial x, \partial f/\partial y]$ and since this represents,
when evaluated at any point, a linear map from $\mathbb{R}^2$ to $\mathbb{R}$
and $P\,dx + Q\,dy$ represents a linear map from $T\mathbb{R}^2$ to
$\mathbb{R}$ when evaluated at any point, the difference is rather small, but
significant. When we treat the differentiation in the second sense, we call
$d$ the exterior derivative. It goes much further than this. I shall define an
exterior derivative of 1-forms to give a 2-form:
For $\omega = P\,dx + Q\,dy$, I define
\[ d\omega = \frac{\partial P}{\partial y}\,dy \wedge dx + \frac{\partial Q}{\partial x}\,dx \wedge dy = \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx \wedge dy \]
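A quick numerical illustration of this definition, as a sketch only: the coefficient of $dx \wedge dy$ in $d\omega$ is approximated by central differences. The function name `d_one_form`, the step size, and the sample form $xy\,dx + x^2\,dy$ are all assumptions chosen for the example.

```python
def d_one_form(P, Q, h=1e-5):
    """Coefficient of dx ^ dy in d(P dx + Q dy), i.e. dQ/dx - dP/dy.

    The partial derivatives are approximated by central differences.
    """
    def coefficient(x, y):
        dQ_dx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
        dP_dy = (P(x, y + h) - P(x, y - h)) / (2 * h)
        return dQ_dx - dP_dy

    return coefficient

# Hypothetical example: omega = x*y dx + x^2 dy, so d(omega) = (2x - x) dx ^ dy = x dx ^ dy
coefficient = d_one_form(lambda x, y: x * y, lambda x, y: x * x)
value = coefficient(0.3, -1.2)   # approximately 0.3
```

For polynomial coefficients of low degree the central difference is exact up to rounding, which makes this a convenient check on hand calculations.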
Exercise 3.6.8.
1. Show that $d^2 f = 0$ for any 0-form $f$.
2. Calculate the exterior derivative of a 1-form on $\mathbb{R}^3$ by making
up a suitable example.
3. Pretend, briefly, that there is no such thing as duality and that the last
1-form is a vector field. Identify the 2-form.
4. Make the rule: To obtain the exterior derivative of a $k$-form on
$\mathbb{R}^n$, take each component function
$P(x^1, x^2, \ldots, x^n)\,dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_k}$
of the $k$-form, differentiate each such $P$ with respect to each of the
variables separately to get, for example, some $\partial P/\partial x^j$, and
put $dx^j \wedge$ in front of the existing term, to get
\[ \frac{\partial P}{\partial x^j}\,dx^j \wedge dx^{i_1} \wedge dx^{i_2} \wedge \cdots \wedge dx^{i_k} \]
Sum the results for the $n$ different variables $x^j$ and also for the
different functions $P$. The result is a $(k + 1)$-form on $\mathbb{R}^n$.
Show that this rule gives the same answer as in the particular cases you have
worked with.
5. Use the above rule to calculate the exterior derivative of a 2-form on
$\mathbb{R}^3$. Choose your own 2-form, preferably so as to have three
non-trivial but differentiable component functions.
6. Pretending, briefly, that the 2-form $\omega$ on $\mathbb{R}^3$ is a vector
field, identify $d\omega$.
Remark 3.6.2. You should be able to see that the clunky way you did
Stokes' Theorem using vector fields arose from confusing vector fields with
both 1-forms and 2-forms, which you can do only on $\mathbb{R}^3$. In fact it
is really about differential forms. Stokes' Theorem in general says
\[ \int_{\partial M} \omega = \int_M d\omega \]
where $M$ is an $n$-manifold with boundary $\partial M$ and $\omega$ is any
differential $(n - 1)$-form. For a proof, dig up my old 2C2 notes off the web.
This is the modern form of Stokes' Theorem. It differs from the old obsolete
form in two ways: first it is about differential forms not vector fields, so a
grasp of duality is important, and second it works for all positive integers
$n$, all $n$-manifolds with boundary (or without, but they are less
interesting).
Exercise 3.6.9.
1. Show that Stokes' theorem in dimension 1, with a 0-form, is just a
restatement of the Fundamental Theorem of Calculus.
2. Show that the almost certainly scrofulous proof you met in second year
of Green's Theorem is also a scrofulous proof that Stokes' Theorem
holds when $\omega$ is a 1-form on $\mathbb{R}^2$.
3. Show that the almost certainly scrofulous proof you met in second year
of Stokes' Theorem is also a scrofulous proof that
\[ \int_{\partial M} \omega = \int_M d\omega \]
holds when $\omega$ is a 1-form on $\mathbb{R}^3$.
4. Show that the almost certainly scrofulous proof you met in second
year of the Divergence Theorem is also a scrofulous proof that Stokes'
Theorem holds when $\omega$ is a 2-form on $\mathbb{R}^3$.
5. Construct a plausible explanation of why you had to do a bungled
version of Stokes' Theorem in second year, given that the correct version
has been known since about 1925.
Remark 3.6.3. If you want to see a proper proof of Stokes' Theorem (all the
above, and more, in one hit) read Michael Spivak's Calculus on Manifolds.
It consists of proper definitions of all the terms in awful generality and
some calculations. I should point out that all the physical intuitions which
led to the theorem are contained in the exercises and are not, as some shallow
people imagine, completely absent.
3.7 Hodge Duality: The Hodge Operator
3.7.1 The Riemannian Case
In $\mathbb{R}^3$ we know from the table referred to in Exercise 3.5.8 that
the Exterior Algebra has a striking symmetry. If we look at the 0-forms we
observe they have dimension 1, just as do the 3-forms, while the 1-forms have
dimension 3, just like the 2-forms.
It follows from the equality of the dimension that there is an isomorphism
between the space of 1-forms on $\mathbb{R}^3$ and the space of 2-forms on
$\mathbb{R}^3$; also one between the space of 0-forms (numbers) on
$\mathbb{R}^3$ and the space of 3-forms on $\mathbb{R}^3$.
Only a small amount of thought shows that there must be, in general, an
isomorphism between the $k$-forms on $\mathbb{R}^n$ and the $(n - k)$-forms on
$\mathbb{R}^n$. In fact not just one isomorphism of course, but scads of them.
The question is, can we find a more or less natural isomorphism by some
process which works in all cases? The answer is yes, and the isomorphism is
called $\ast$, or Hodge $\ast$ if you want to give credit where it belongs.
Let us start by going from 2-forms on $\mathbb{R}^3$ to 1-forms on
$\mathbb{R}^3$ and see if we can work out the general pattern by doing
concrete cases.
Recall from Exercise 3.5.1 that if $\omega$ is a 2-form on $\mathbb{R}^3$ then
it operates on any pair of vectors
\[ \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \quad \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \]
by expressing the resulting number as
\[ c_1 \begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix} + c_2 \begin{vmatrix} a_1 & b_1 \\ a_3 & b_3 \end{vmatrix} + c_3 \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} \]
for some particular numbers $c_1, c_2, c_3$.
But this is just the value of the three by three determinant:
\[ \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & -c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} \]
as expanding along the third column, with its signs $+, -, +$, shows.
It seems reasonable therefore to define $\ast(\omega)$ to be the 1-form
$[c_1, -c_2, c_3]$ which may be written $c_1 e^1 - c_2 e^2 + c_3 e^3$ using
the standard dual basis in $\mathbb{R}^{3*}$.
If you are troubled by the minus sign, note that it arises because I have
always described the submatrices obtained by omitting a row in numerical
order rather than cyclic order. So I get $(1, 3)$ where it might have been
more natural to put $(3, 1)$. But it is much easier to specify submatrices
this way so I shall carry on doing so.
There is also an isomorphism between 0-forms and 3-forms on $\mathbb{R}^3$. It
takes 1 to $\det$. Det of course takes the three ordered vectors $(a, b, c)$
to the determinant of the matrix obtained by writing each vector out as a
column (or row) and listing them to get the three by three matrix.
In $\mathbb{R}^3$ we used $\ast$ to send the 2-form
\[ c_1\,e^2 \wedge e^3 + c_2\,e^1 \wedge e^3 + c_3\,e^1 \wedge e^2 \]
to the 1-form
\[ c_1 e^1 - c_2 e^2 + c_3 e^3 \]
Figure 3.7.1: A choice of k numbers from n.
We can simplify this by saying that $\ast(e^2 \wedge e^3) = e^1$,
$\ast(e^1 \wedge e^3) = -e^2$ and $\ast(e^1 \wedge e^2) = e^3$. Then
everything is multilinear and so we don't need any more. In other words, it
suffices to specify $\ast$ on all the choices of $i, j$ in $e^i \wedge e^j$.
Now suppose we have a $k$-form on $\mathbb{R}^n$. We do not have just three
basis elements, we have ${}^nC_k$ of them and we can take one of them and
write it as
\[ e^{i_1} \wedge e^{i_2} \wedge \cdots \wedge e^{i_k} \]
where we have chosen from the numbers $[1 : n]$ some increasing subsequence
consisting of the $k$ numbers $i_1, i_2, \ldots, i_k$. We specify $\ast$ if we
give an $(n - k)$-form that every such basis element gets sent to. There is an
obvious choice: we have taken a row of $n$ numbers and picked out $k$ of them.
Figure 3.7.1 shows that I have selected some numbers in order and painted them
red. This leaves $n - k$ black ones. I wedge the black projections. Thus the
indices we take out go to the indices we leave behind.
The only problem is that of sign. I have gone from the red and black mixed
up to the red ones followed by the black ones. This is a permutation of the
$n$ numbers $[1 : n]$ and it may be an odd permutation or an even one (see
the 3P0 notes if you don't understand these things). We call the sign of an
even permutation $1$ and the sign of an odd permutation $-1$. So I finish up
with the definition of $\ast$:
\[ \ast(e^{i_1} \wedge e^{i_2} \wedge \cdots \wedge e^{i_k}) = \mathrm{sign}(\sigma)\, e^{i_{k+1}} \wedge e^{i_{k+2}} \wedge \cdots \wedge e^{i_n} \]
where $\sigma$ is the permutation:
\[ \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ i_1 & i_2 & i_3 & \cdots & i_n \end{pmatrix} \]
and the $i_j$ for $1 \le j \le k$ on the left hand side are the red numbers
and the remainder are the black numbers in the same order as before.
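The red-and-black recipe can be sketched in Python as follows. This is an illustration of the definition on basis elements only, under the assumption that a basis $k$-form is represented by its increasing tuple of 1-based indices; the names `hodge_basis`, `red` and `black` are hypothetical.

```python
def sign(perm):
    """Sign of a permutation given as a sequence of distinct numbers, by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def hodge_basis(red, n):
    """Hodge star of e^{i_1} ^ ... ^ e^{i_k}, for increasing 1-based indices `red` from 1..n.

    Returns (s, black), meaning s * e^{black_1} ^ ... ^ e^{black_{n-k}}.
    """
    black = tuple(i for i in range(1, n + 1) if i not in red)
    sigma = tuple(red) + black     # 1, 2, ..., n  ->  red indices followed by black indices
    return sign(sigma), black

star_23 = hodge_basis((2, 3), 3)   # (1, (1,))   i.e.  e^1
star_13 = hodge_basis((1, 3), 3)   # (-1, (2,))  i.e. -e^2
star_12 = hodge_basis((1, 2), 3)   # (1, (3,))   i.e.  e^3
```

These three values reproduce the assignments made above for 2-forms on $\mathbb{R}^3$; extending $\ast$ to a general form is then a matter of linearity.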
Exercise 3.7.1. Confirm that the general case copies exactly what we did
with 2-forms on $\mathbb{R}^3$.
Exercise 3.7.2. Confirm by explicit calculation for some particular $k$-forms
on $\mathbb{R}^n$ for various $n$ that this definition makes sense and gives
reasonable answers. In particular send some 2-forms on $\mathbb{R}^4$ to some
2-forms.
Exercise 3.7.3. Is it true that $\ast^2$ is the identity? Prove it or give a
counterexample.
Exercise 3.7.4. Verify that the result of taking $\ast(e^i \wedge e^j)$ on
$\mathbb{R}^3$ looks a lot like the cross product. (!) Explain this.
Exercise 3.7.5. Verify that if we choose the ordered basis $(e_1, e_3, e_2)$
and make $\omega = dx$ we get a different value for $\ast(\omega)$ than if we
used the ordered basis $(e_1, e_2, e_3)$. Show that if we use the ordered basis
\[ e_1, \quad \begin{pmatrix} 0 \\ \cos(\theta) \\ \sin(\theta) \end{pmatrix}, \quad \begin{pmatrix} 0 \\ -\sin(\theta) \\ \cos(\theta) \end{pmatrix} \]
we get the same result as in the standard basis.
If $\omega$ is a $k$-form on $V$ and $f : U \to V$ is a linear map, then we
can pull back $\omega$ to a $k$-form on $U$ defined by
\[ f^*(\omega)(a_1, a_2, \ldots, a_k) = \omega(f(a_1), f(a_2), \ldots, f(a_k)) \]
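In code the pull-back is just composition: feed each argument through $f$ before handing it to the form. The sketch below assumes a form is represented as a Python function of $k$ vectors; the linear map `f` is a made-up example, not one from the notes.

```python
def pullback(f, omega):
    """Pull the k-form omega on V back along the linear map f : U -> V."""
    return lambda *vectors: omega(*(f(a) for a in vectors))

def f(a):
    """Hypothetical linear map R^2 -> R^2 sending (x, y) to (2x, x + y)."""
    x, y = a
    return (2 * x, x + y)

def dx_wedge_dy(u, v):
    """The 2-form dx ^ dy on R^2."""
    return u[0] * v[1] - u[1] * v[0]

pulled = pullback(f, dx_wedge_dy)
value = pulled((1, 0), (0, 1))   # dx ^ dy applied to (2, 1) and (0, 1): 2*1 - 1*0 = 2
```

The value 2 is the determinant of the matrix of `f`, as one would expect for a top-degree form.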
Exercise 3.7.6. Define $f : \mathbb{R}^3 \to \mathbb{R}^3$ by $f(e_1) = e_2$,
$f(e_2) = e_3$, $f(e_3) = e_1$. Let $\omega$ be the 1-form
$dx + 2\,dy + 3\,dz$. Show that $f^*(\ast(\omega)) = \ast(f^*(\omega))$.
The last exercise makes it clear that at least in this case, $f^*$ and $\ast$
commute.
Exercise 3.7.7. Do they always? Is $\ast$ functorial? Its definition is locked
into the standard basis, does it need to be? If not, what does this do for
defining it on manifolds?
The answer to the last question is that if $f$ preserves both the inner
product and the orientation, that is if it is an element of
$SO(n, \mathbb{R})$, then $f^*(\ast(\omega)) = \ast(f^*(\omega))$. This tells
us that the $\ast$ operator involves the parity (or orientation or chirality)
of an orthonormal basis in an essential way. Note that in a Hilbert space we
have a natural definition of an orthonormal basis and that since a Riemannian
structure on an oriented manifold makes each tangent space a Hilbert space,
the $\ast$ operator makes sense on oriented smooth manifolds with a Riemannian
structure.
I do wish the physicists would learn not to call it a metric, but they are
beyond saving.
3.7.2 The SemiRiemannian Case
We are going to generalise the idea of an inner product so as to be able to
deal with the Minkowski metric on space-time. This is Lorentzian. And we
might as well deal with the general case because it isn't any more work.
Definition 3.7.1. A symmetric bilinear form
$\beta : V \times V \to \mathbb{R}$ on a real vector space is said to be
non-degenerate iff $(\forall u \in V,\ \beta(u, v) = 0) \Rightarrow v = 0$.
Definition 3.7.2. A non-degenerate symmetric bilinear form
$\beta : V \times V \to \mathbb{R}$ on a real vector space is said to be an
inner product. We write such forms as $\langle\ ,\ \rangle$ with
$\beta(u, v) = \langle u, v \rangle$.
Exercise 3.7.8. Define the inner product on $\mathbb{R}^2$ by
\[ \left\langle \begin{pmatrix} x \\ y \end{pmatrix}, \begin{pmatrix} a \\ b \end{pmatrix} \right\rangle = xb + ya \]
Show that this is an inner product in the new sense but that it is not
positive definite.
Exercise 3.7.9. On $\mathbb{R}^4$ define
\[ \left\langle \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix}, \begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{pmatrix} \right\rangle = -x_0 a_0 + x_1 a_1 + x_2 a_2 + x_3 a_3 \]
Show that this is an inner product (the Lorentzian inner product on
space-time), and find a non-zero vector which is orthogonal to itself. Find
two distinct points which are at distance zero from each other. Explain what
this means physically.
It should be clear that our generalisation of the idea of an inner product is
constructed so as to allow us to do for space-time what we usually do for
space, and that the metric derived from the inner product isn't a metric in
the standard sense at all. This is (a) strictly necessary in order to describe
relativity in a sensible fashion and (b) an awful shock to the system. It means
that I have set the velocity of light equal to 1 and that any two points on
the path of a ray of light have zero separation in space-time. I shall discuss
some aspects of Physics in the next chapter which may shed some light on
this extraordinary behaviour.
From now on I shall use the generalised definition of the inner product.
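The claim about light rays can be checked with a few lines of Python. This sketch assumes the sign convention in which the time axis has inner product $-1$ with itself and the velocity of light is 1; the function names and the two sample events are illustrative assumptions.

```python
def lorentz(x, a):
    """Lorentzian inner product with signature (-, +, +, +); x[0] and a[0] are the time components."""
    return -x[0] * a[0] + x[1] * a[1] + x[2] * a[2] + x[3] * a[3]

def interval(p, q):
    """Inner product of the displacement q - p with itself."""
    diff = tuple(qi - pi for pi, qi in zip(p, q))
    return lorentz(diff, diff)

# Hypothetical events (t, x, y, z): light emitted at the origin and
# received after time 5 at spatial distance sqrt(3^2 + 4^2) = 5.
emitted = (0.0, 0.0, 0.0, 0.0)
received = (5.0, 3.0, 4.0, 0.0)
separation = interval(emitted, received)   # -25 + 9 + 16 + 0 = 0
```

The two events are distinct, yet the displacement between them has zero inner product with itself.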
Proposition 3.7.1. An inner product on a finite dimensional real vector
space $V$ determines an isomorphism from $V$ to $V^*$.
Proof: Define $\phi : V \to V^*$ by $\phi(u) = \langle u, \cdot \rangle$.
That is,
\[ \forall v \in V, \quad \phi(u)(v) = \langle u, v \rangle \]
Then that $\phi$ is 1-1 follows from the non-degeneracy of
$\langle\ ,\ \rangle$, and that $\phi$ is linear follows from the bilinearity
of $\langle\ ,\ \rangle$. Since the spaces have the same dimension, this is
sufficient to make $\phi$ an isomorphism.
Exercise 3.7.10. Let $\langle\ ,\ \rangle$ be the inner product of exercise
3.7.8. Write down the isomorphism explicitly, representing elements of
$\mathbb{R}^{2*}$ by row matrices.
The converse is also true:
Proposition 3.7.2. Suppose $\phi : V \to V^*$ is an isomorphism of finite
dimensional real vector spaces. Then the bilinear form
\[ \langle u, v \rangle = \frac{1}{2} \big( (\phi(u))(v) + (\phi(v))(u) \big) \]
is an inner product on $V$.
Proof: That it is bilinear follows from the linearity of $\phi$, and it is
constructed to be symmetric. It remains only to show it is non-degenerate. If
it were degenerate then there would be some non-zero $u$ such that
$\langle u, \cdot \rangle$ is the zero map in $V^*$ which would put
$u \in \ker(\phi)$, contradicting $\phi$ being 1-1.
Exercise 3.7.11. Construct an example of an isomorphism from $\mathbb{R}^2$
to $\mathbb{R}^{2*}$ which gives an inner product in the old sense, that is a
bilinear positive definite map, and also one which gives an inner product
which is not one of the old fashioned sort. Show that by composing any given
isomorphism $\phi : \mathbb{R}^n \to \mathbb{R}^{n*}$ with a suitably chosen
isomorphism from $\mathbb{R}^n$ to itself, we can always get an old fashioned
style inner product. Hint: Use the material on the classification of
quadratic forms from Second year.
Now we have to mess around with Hodge $\ast$ a bit to make it work properly
on a semi-Riemannian manifold. Note that we can take the determinant of
an orthonormal basis and if this is riemannian we must get 1, since the inner
product of different basis elements will be zero, and the inner product of a
basis element with itself is 1. And the determinant of the identity matrix is
1.
This stops being true for a generalised inner product: the inner product
of the time axis with itself is $-1$, alternatively the matrix representing
the Lorentz inner product is
\[ \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]
which has determinant $-1$. Since determinants figure largely in defining
$\ast$ we need to take account of this. If the inner product has signature
$(s, n - s)$ then we change the definition of $\ast$ to:
\[ \ast(e^{i_1} \wedge e^{i_2} \wedge \cdots \wedge e^{i_k}) = \mathrm{sign}(\sigma)\,\varepsilon(i_1)\,\varepsilon(i_2) \cdots \varepsilon(i_k)\, e^{i_{k+1}} \wedge e^{i_{k+2}} \wedge \cdots \wedge e^{i_n} \]
where $\varepsilon(i_j)$ is defined to be the inner product of the $i_j$th
basis element with itself.
This gives the convenient form of the $\ast$ operator given in the text book
and ensures invariance under the generalisation of the special orthogonal
group which preserves the orientation and generalised inner product.
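Here is a minimal Python sketch of the modified definition on basis elements. It assumes an orthonormal basis described by a tuple `eps` holding the inner product of each basis element with itself, and it assumes, purely for illustration, that index 1 is the time axis of Minkowski space; the function names are hypothetical.

```python
def sign(perm):
    """Sign of a permutation given as a sequence of distinct numbers, by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def hodge_basis(red, eps):
    """Star of e^{i_1} ^ ... ^ e^{i_k} for increasing 1-based indices `red`.

    eps[i - 1] is the inner product of the i-th basis element with itself (+1 or -1).
    Returns (factor, black), meaning factor * e^{black_1} ^ ... ^ e^{black_{n-k}}.
    """
    n = len(eps)
    black = tuple(i for i in range(1, n + 1) if i not in red)
    factor = sign(tuple(red) + black)
    for i in red:
        factor *= eps[i - 1]       # one factor for each index taken out
    return factor, black

minkowski = (-1, 1, 1, 1)          # assumption: index 1 is the time axis
time_space = hodge_basis((1, 2), minkowski)   # (-1, (3, 4))
space_space = hodge_basis((3, 4), minkowski)  # (1, (1, 2))
```

With every entry of `eps` equal to $+1$ the extra factors disappear and the Riemannian definition is recovered.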
Exercise 3.7.12. Prove this.
As usual, the best way to accommodate a new idea is to do lots of sums until
you are used to using the idea, whereupon it stops leading to panic attacks
and becomes just another part of your machinery for thinking.
Chapter 4
Some Elementary Physics
Most of you will know most, perhaps almost all, of this, but some may not
and it is convenient to summarise it briefly.
4.1 Three weird forces
The first time I met electrostatic forces up close and personal, in other than
a laboratory situation, was the first time I undressed a girl. It was dark
and she was wearing nylon underwear, and when I took it off it crackled and
glowed as miniature lightning flashed. My first thought was that this was
the wages of sin, and the devil had come to claim his own, but then my brain
started working and I realised I was merely seeing and hearing electricity.
An education can be useful.
Later, in an American hotel, I once walked across a nylon carpet while
wearing Ug boots (among other things) and reached out to the door handle, and
got a large blue spark between my fingers and the handle. This was the
second time I encountered electrostatic forces up close and personal, in other
than a laboratory situation.
The management accepted no responsibility for guests electrocuting
themselves with Ug boots.
It is possible to replicate these effects in a small way by stroking some fur
against a nylon or plastic ball and one can use it to pick up little bits of
paper. They just jump up off the table top to stick, briefly, to the plastic
ball. Try it if you don't believe me.
The force that affects the paper is called an electrostatic force, and on a
human scale it is usually quite small, working only for rather little bits of
paper, although google "Van de Graaff generator" to find out how to scale
up rather a lot. Or stand on a hilltop holding up a spiky metal stick in a
thunderstorm if you really need convincing that the process of building up
charge (to use the jargon) can be rather spectacular [1].
So there is a rather weird force which attracts all sorts of objects and which
needs some investigation. And knowledge of which can prevent heart attacks
and interference with your love life.
You can get at any time a fridge magnet from the government, or from lots
of even more useless people at New Year, and it magically glues itself to the
fridge. It does not work on paper or wood. But it can demonstrably attract
small bits of iron. So it is different from the electrostatic force. But
similar in some ways. A second weird force.
And finally if you step off the top of a tall building you will come down
towards the rest of us with an acceleration of about 9.81 metres per second
per second less air resistance [2]. The planet appears to also attract things.
Like electrostatics but unlike magnetism, it appears to attract different
substances. Unlike electrostatics it does not stop working when you connect
the thing attracting and the thing attracted by a metal wire.
So there are at least three weird forces, electrostatics, magnetism and gravity.
It is natural to wonder whether there is any connection between them, what
the similarities and differences are. Michael Faraday investigated these issues
in the early nineteenth century [3].
[1] Children, do not do this unless you are very, very bored. As a cure for
boredom, this compares with the guillotine as a cure for dandruff.
[2] Children, do not attempt this unless you are superman. Note that believing
you are superman doesn't cut it. The Universe does not have the slightest
respect for your opinions unless they are right. And neither do I.
Incidentally, it doesn't matter how deeply or passionately you believe
something. If you are wrong in your beliefs the universe may, with supreme
indifference, kill you stone dead. If you have been brought up with the
view that you are entitled to your beliefs whatever they are, I encourage you
to see if the universe agrees with you.
[3] You can find his Notebooks in the library and they make interesting
reading. They are also very well written. I should rate him as a better writer
than Jane Austen, and the subject is a lot more interesting. Unless you think
the matter of manners and which male ends up mating with which female is more
interesting than understanding how the universe works. If so you have
definitely come to the wrong shop and should enrol in English, where you may
learn the vital skill of saying the right things about storybooks. It never
fails to amaze me that there really are people who feel this is quite a
reasonable way to spend their time.
4.2 Fields
If you do some delicate quantitative measurements on electrostatic forces
you find some interesting and important things. One is that the source of an
electric force comes in lumps, although rather small lumps, so this was not
known at the beginning of the study of the subject.
If you find the smallest possible lumps you can find that they are very small
and they all repel each other. They are called electrons, derived from the
Greek name for amber, and you might amuse yourself by figuring out the
connection. The amount of repulsion varies inversely as the square of the
distance between them, and we say the electron has a charge and by a curious
convention it is called a negative charge. Some things appear to have a
positive charge, and indeed something must have or the whole universe would
have negative charge instead of being mainly neutral. Opposite charges attract
again by an amount that depends inversely on the square of the distance
between them. When we say they attract or repel, we use Newton's terminology
of forces: If you want to feel a force, get a friend to poke you with a
stick. Failing that, jump off a wall. You will not feel any force while
falling, but you will when you stop. It will be very like being poked all over
by a large stick.
In Newton's day the word force was strongly associated with sticks if you
poked, or possibly ropes if you pulled. The idea that you could have a
force acting without some material object connecting the thing on which the
force acted and the thing doing the acting (usually a person or a horse) was
regarded as a contradiction in terms; it was called action at a distance and
was thought to be quite contrary to standard usage of the term force, and
hence a violation of common sense and the natural order. Hence Newton's
definition of a force in terms of what it did, dissociating the effect from the
mechanism, was a wild and novel idea. Now that we all think like this, it
is hard to appreciate the intellectual jump made in simply defining force in
terms of observable changes in the motion of a mass. It is true that forces still
tend to be associated with physical objects, the sun for example in the case of
forces acting on planets, but it is possible to conceive of a force field in space
with no such association. Well, it is now; it was a radical if not actually loony
idea in the middle of the seventeenth century. See the discussion by Feynman
in his Lectures on Physics on the question of whether Newton's Law
is merely a definition or something more. I think he misses the point here,
which is to assert that forces are not to be thought of in terms of sticks or
ropes joining the source of the force to the thing acted upon, but in terms of
motion of the thing acted upon. The source of the force is a separate matter
and will depend on what type of force it is.
We therefore measure forces by using Newton's Law, which says force is mass
times acceleration. Newton used Latin but we prefer algebra: $F = ma$ or
$F = m\ddot{x}$. Even better, since the mass may change in time (as when a rocket
moves by consuming fuel and throwing it out rather fast at the back), it
would be more useful to write $F = \dot{p}$ or, in Leibniz notation, $F = dp/dt$,
where $p$ is the momentum.
For a fixed mass, we measure the acceleration. We compare masses using a
spring balance or something equivalent.
Since electrons don't seem to affect each other if we pile lots of them onto
some small object, other than trying to repel each other, and opposite charges
cancel each other out to a good approximation when they are brought
together, the force between a charge $Q_1$ and one $Q_2$ is

$$\frac{1}{4\pi\epsilon_0}\,\frac{Q_1 Q_2}{r^2} \qquad (4.2.1)$$

where $Q_j$ is the amount of charge, ultimately a count of electrons or their
compensating positive equivalents which also come in lumps, $r$ is the distance
between them, and $\epsilon_0$ is a constant. This is the size of the force; the
direction on each is towards the other if they have different sign and away from
each other if the charges are of the same type. It is worth remarking that
the attractions or repulsions do not depend on there being air or any other
medium in the space between the objects, although the medium changes the
constant $\epsilon$ which is therefore a property of the medium. We usually take
it that the medium is a hard vacuum and $\epsilon_0$ is the number associated with
empty space. You would find the reason for the $4\pi$ too incredible, so I shall
simply observe that it is a rather bizarre choice of unit.
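If you like to see numbers, equation 4.2.1 can be sketched in a few lines of Python. This is only an illustration: the function name `coulomb_force` and the sign convention (positive for repulsion) are my own assumptions, and the constants are the usual SI values rather than anything fixed by these notes.

```python
import math

# Assumed SI values; the notes do not commit to any particular units.
EPSILON_0 = 8.8541878128e-12        # vacuum permittivity, farads per metre
ELECTRON_CHARGE = -1.602176634e-19  # charge of one electron, coulombs


def coulomb_force(q1, q2, r, epsilon=EPSILON_0):
    """Signed size of the force in equation 4.2.1.

    Hypothetical convention: positive means the charges repel,
    negative means they attract.
    """
    if r <= 0:
        raise ValueError("the distance r must be positive")
    return q1 * q2 / (4.0 * math.pi * epsilon * r ** 2)


# Two electrons one metre apart, then two metres apart.
f_near = coulomb_force(ELECTRON_CHARGE, ELECTRON_CHARGE, 1.0)
f_far = coulomb_force(ELECTRON_CHARGE, ELECTRON_CHARGE, 2.0)

print("repulsive:", f_near > 0)
print("ratio of forces when the distance doubles:", f_near / f_far)
```

Passing a different `epsilon` corresponds to changing the medium between the charges.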
We can set up some rather special circumstances which merit a little thought:
if we take two metal parallel plates we can put a negative charge on one of
them, using Ug boots or nylon underwear if all else fails, and investigate to
see what happens in between. See figure 4.2.1 for a picture of the situation.
We take the little green ball to have some standard unit charge, supposed
to be small and positive, and we put a (large) positive charge on one plate
and an equal but opposite negative charge on the other. It is found that
there is a force which tends to accelerate the test charge in the direction
shown. This happens throughout the intervening space and we know of its
existence by looking to see how much the test charge accelerates. And only
by looking at some such test charge, because neutral objects are unaffected
to a first approximation, and negatively charged test objects have the force
vector reversed. It really is a force field because we can check back from the
acceleration experienced by the same charge with different masses. It exists
throughout the space. We assume that it is there when we don't actually
have a charge there to measure it, and we assume that trees that fall in
a forest make a noise even when there is nobody there to hear it. This is
because life seems simpler that way, and we tend to think there really is a
world outside us.

Figure 4.2.1: An Electric Field.
Consequently it is natural for us to believe that there is a vector field
defined on the space in which we live, possibly changing in time, and where
the physical meaning is that this is an electric field detected by measuring
the acceleration on a known charge and mass. Acceleration measurements
require, in principle, rulers and clocks, and we can get those. In practice
we also cheat by using geometry and trigonometry, but checking up suggests
these work pretty well.
You have to understand that this idea comes at the end of a loooooong
sequence of delicate measurements and careful experiments and seems to
explain things in a satisfactory way. In particular we can often calculate
in simple cases what we expect the field to be like and we get very good
agreement between the numbers we calculate and the ones we measure. It
is this that we mean when we say we understand something: we get good
agreement between measurements and calculations.[4]
So we are inclined to have a certain amount of faith in the existence of electric
fields, for essentially the same reason that we believe in the existence of the
Pope. Most of us haven't actually seen the Pope; the closest is usually a
picture, perhaps on television. But the hypothesis that he exists accounts
for a lot of phenomena which would otherwise be rather hard to explain, such
as television pictures and photographs.

[4] This is not what philosophers or social scientists mean when they say they understand
something; what they mean is that they get a pleasant sensation of insight. Pleasant
sensations of insight are nice to have, and physicists and mathematicians get them too,
but we like to convince ourselves they are not bogus. Read Karl Popper's Logic of Scientific
Discovery to get an inkling about the difference.

Figure 4.2.2: Two bar magnets attracting each other.
Physicists also believe in Magnetic fields for similar reasons: in fact
sprinkling some iron filings on a sheet of paper under which sits a bar magnet
(obtainable from most good toy-shops) makes it hard to doubt that you
can test the strength of a magnetic field with a small piece of iron. There
are however some significant differences between magnetism and electricity.
Electric charge comes in lumps, negative lumps and, sort of, positive lumps.
Magnetism doesn't. People have actually looked hard for so-called magnetic
monopoles and totally failed to find them. What turns up is invariably two
of them, one called North and the other called South. The name of course
is derived from the discovery that the planet has a magnetic field which can
be used for finding out what direction on the Earth you are pointing in, by
means of the magnetic compass. This made sailing a boat a much safer bet,
and has been known for a long time by the Chinese.[5]
Magnetism also defines a field. The picture of figure 4.2.2 shows two bar
magnets close together. In the configuration shown, the north pole of the magnet
on the left attracts the south pole of the magnet on the right. The south pole
of the magnet on the left repels the south pole of the magnet on the right,
and vice versa, but less because they are further away, and we can confirm
that we have an inverse square law by using longer magnets. The force is
easily detected and we can arrange various configurations of magnets just as
we could arrange the parallel plates for testing charge. So magnetic fields
exist too.
It is natural to regard the two fields as specified by vector fields, and we can
expect to be able to describe the fields for reasonably simple configurations:
we can measure the constants in an equation similar to equation 4.2.1, and
calculate the field at other points, and this works very well. Of course we
don't get the same constant $\epsilon_0$ occurring, we get a different constant, $\mu_0$.

[5] We in the West stole the idea over five hundred years ago, along with Printing and the
recipe for Gunpowder. Unfortunately we also stole the idea of Bureaucracy off them. But
we got our own back by letting them copy Marxism off us. That slowed them down a bit,
but they've figured out it was a trick quite recently. Which is more than our educational
theorists have done.

Figure 4.2.3: An electrical circuit.
Although an electric field will move a charge, the magnetic field also has an
effect on charges, but only when they move. If a charge q moves at some
velocity vector v in a magnetic field B, there is a force which is orthogonal to
both v and B and which is proportional to the product of the speed, the charge,
and the strength of the magnetic field at the point (and to the sine of the angle
between v and B). We can write this out in vector form as the Lorentz force law:

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}) \qquad (4.2.2)$$

This can be verified by using a Cathode Ray Tube such as occurs in old-fashioned
television sets and computer monitors: just bring a magnet close
to the side of the tube and watch the screen. Try not to electrocute yourself.
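For those without a Cathode Ray Tube to hand, here is a small sketch of equation 4.2.2 in Python. The helper names `cross`, `dot` and `lorentz_force` are mine, the numbers are arbitrary, and units are ignored; the only point is to see that the magnetic part of the force comes out orthogonal to both v and B.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def lorentz_force(q, E, v, B):
    """Force on a charge q moving with velocity v, as in equation 4.2.2."""
    v_cross_b = cross(v, B)
    return tuple(q * (E[i] + v_cross_b[i]) for i in range(3))


# A negative charge moving along the x-axis through a field along the z-axis,
# with no electric field present (illustrative values only).
q = -1.0
E = (0.0, 0.0, 0.0)
v = (2.0, 0.0, 0.0)
B = (0.0, 0.0, 3.0)

F = lorentz_force(q, E, v, B)
print("force:", F)
print("F . v =", dot(F, v), "  F . B =", dot(F, B))
```

Setting v to zero leaves only the qE term, which is the statement that a magnetic field ignores charges that sit still.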
All this strongly suggests that electricity and magnetism are closely related,
as is indeed the case.
One of the ways of seeing this is to take a coil of wire and join the ends to
opposite sides of large metal plates as in figure 4.2.3.
The circle labelled A is an ammeter which measures current and you can
ignore it for a first approximation and assume the wire goes right through
it. Disconnect the wire somewhere, and charge the metal plates just as in
figure 4.2.1. Now complete the circuit. Since the wire is metal, the charge
leaves the plates and tries to neutralise itself by flowing along the wire. When
flowing through the coil however it creates a magnetic field, and if you look
ahead to figure 4.3.1 you can work out that you get something very like a
bar magnet created by the flowing charge. This magnetic field is caused
by the changing current flow and it contains energy. The field acts so as
to impede the charge and in fact to send it back the way it came. Once
it is back on the plate, the magnetic field vanishes and the process starts
again. The current, that is the moving charge, thus oscillates and this can
be measured (at least in principle, although it can be rather fast) by the
ammeter, which registers a sine wave. This oscillation will eventually die
down under normal circumstances because of resistance in the wire, so we
get a decaying sinusoidal wave. You probably saw the second order ODE
which describes this process in first year. This circuit is the basis of radio
and television transmissions.

Figure 4.2.4: An explanation of an inverse square law.
4.2.1 Gradient Fields
One of the possibilities for vector fields is that the direction and length of
each vector corresponds to the acceleration with which a small mass placed
at that point would fall down a hill. The question is, does such a hill exist?
If it does we say that the vector field is the gradient of a potential field.
For an inverse square law of attraction, as with gravity, we can imagine a
field as indicated in figure 4.2.4, where the sun, say, is at the bottom of the
well, and a planet would be a little dimple in the surface (to denote its own
gravitational field) and would move in an ellipse, much like taking a large
sheet of rubber, putting a heavy object in the middle to deform it, and then
knocking a light ball along the rubber sheet. You can, I hope, imagine the
trajectory it would follow.
As a way of thinking about force fields this is quite productive. We can write
V for the vector field at any point and then there is a height function f also
defined at each point and we have that

$$\mathbf{V} = \nabla f = \begin{pmatrix} \partial f/\partial x \\ \partial f/\partial y \\ \partial f/\partial z \end{pmatrix}$$

Since we like to think of things running down hills rather than up them, it
is quite usual for physicists and engineers to put a minus sign in the above
equation. Do so if it makes you feel better.

Figure 4.2.5: Another explanation of an inverse square law.
All three of the forces we are considering are of this type, gradient fields,
except for singularities at the centre of attraction.
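To see the hill picture produce an inverse square law, here is a rough numerical sketch. I am assuming a height function f = -1/r (a well centred at the origin, in units where all the constants are one) and I use the physicists' minus sign, so the field is minus the gradient; the gradient is estimated by central differences, and the function names are mine.

```python
import math


def potential(p):
    """Assumed height function f = -1/r: a well with its bottom at the origin."""
    x, y, z = p
    return -1.0 / math.sqrt(x * x + y * y + z * z)


def gradient(f, p, h=1e-6):
    """Central-difference estimate of (df/dx, df/dy, df/dz) at the point p."""
    components = []
    for axis in range(3):
        forward, backward = list(p), list(p)
        forward[axis] += h
        backward[axis] -= h
        components.append((f(forward) - f(backward)) / (2.0 * h))
    return tuple(components)


def field(p):
    """V = -grad f, so things run down the hill rather than up it."""
    return tuple(-g for g in gradient(potential, p))


for point in [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 3.0, 4.0)]:
    V = field(point)
    r = math.sqrt(sum(c * c for c in point))
    size = math.sqrt(sum(c * c for c in V))
    print(point, "field size:", size, " 1/r^2:", 1.0 / r ** 2)
```

The field points towards the origin and its size matches 1/r^2 to within the error of the finite differences; at the origin itself the sketch divides by zero, which is the singularity mentioned above.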
It is clear that $\nabla$ is pretty much differentiating f to give the three components
of the derivative of f along orthogonal axes, and this raises the possibility that
it might be more natural to regard the electric or magnetic or gravitational
fields as 1-forms rather than vector fields on the space we live in.
4.2.2 What are Flux?
The fact that the three fields all fall off according to an inverse square law
suggests that this is a property of the space we are living in. One possible
explanation of an inverse square law of repulsion between two objects is that
each is emitting some particles, small point-like objects, which hit the other
object and force it away. This would mean that the number passing through
a given area would decrease as the area is moved further away from the
source, as in figure 4.2.5.
The area subtended by a disc of fixed size would be proportional to the
inverse square of the distance from the centre, so counting hits would give
an inverse square law simply as a property of the dimension of the space we
live in reduced by one.
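The remark about dimension can be checked with a little arithmetic. The sketch below assumes the particles are emitted equally in all directions and that the disc is small enough to be treated as a flat patch of the sphere of radius r around the source; it uses the standard formula for the area of a sphere in n dimensions, and the function names are mine.

```python
import math


def sphere_area(n, r):
    """Area of the sphere of radius r in n-dimensional space.

    For n = 3 this is the familiar 4 pi r^2; for n = 2 it is the
    circumference 2 pi r.
    """
    return 2.0 * math.pi ** (n / 2.0) * r ** (n - 1) / math.gamma(n / 2.0)


def hit_fraction(disc_area, r, n=3):
    """Fraction of emitted particles hitting a small disc at distance r.

    Assumes isotropic emission and a disc small compared with r.
    """
    return disc_area / sphere_area(n, r)


disc = 1e-4
for n in (2, 3, 4):
    ratio = hit_fraction(disc, 1.0, n) / hit_fraction(disc, 2.0, n)
    print("dimension", n, ": doubling the distance divides the hits by", ratio)
```

In dimension n the hits fall off as the (n-1)th power of the distance, so an inverse square law is what you would expect in a space of dimension three.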
Even if we don't believe in anything as fanciful as microscopic particles spat
out by charges, we can certainly think of a flow of imaginary stuff put out
at a constant rate proportional to the amount of charge, and so people would
talk and write of the electric flux or the magnetic flux where the word flux
means something which flows, and was used in medicine to mean stuff which
dribbled out of sores and noses.[6] Like the potential function representing a
hill down which objects roll, this is simply a possible way of thinking about
things and we do not feel obliged to specify the imaginary flowing stuff in
any detail. After all, we are doing nothing more than observing that a vector
field has an associated flow which is equivalent to it in that we can get to the
flow from the vector field by solving a system of ODEs and given the flow
we can get back to the vector field by differentiating it at every point.
Regarding the Electric field in terms of a flow invites us to consider how much
flows out of a region compared with how much flows into the region. This
has much to do with Stokes' Theorem and the extent to which the stuff is
created. Obviously the stuff flows out of any charge and ought to either
be conserved or get compressed elsewhere. Imagine, to picture this, water
flowing along in a stream. Now place an imaginary football in the stream.
Water flows through the imaginary football as if it isn't there, which is fair
enough because it isn't. The point is that water is hard to compress so
the density is pretty much uniform throughout the stream, and moreover
the water doesn't come out of nowhere or suddenly vanish. This severely
constrains the kind of vector field that we get in a stream by attaching to
each point a little arrow saying how fast, and in which direction, the water
is moving at any point. It satisfies the condition that the divergence of the
vector field is zero at every point, where we measure the divergence at a point
by putting a small box at the point, and taking the amount of water coming
out of the box less the amount of water going into the box and dividing by
the volume of the box. Now take the limit as the box gets smaller to get
a number at each point of the stream. This is the divergence of the vector
field, and for streams it has to be zero. There is precisely as much imaginary
water flowing into the imaginary football as there is flowing out. You might
reasonably conjecture that the divergence of an Electric field is zero at a
point except when there is some charge at that point, when it depends on
the sign and amount of charge.
And you would be right.
In algebra we can write the divergence of a vector field V as a function g
with

$$g = \nabla \cdot \mathbf{V} = \partial V_x/\partial x + \partial V_y/\partial y + \partial V_z/\partial z$$

Then if our charge comes in lumps, which we often assume to be the case,
any little box containing a positive charge will have some net amount of flux
coming out proportional to the charge, and if the little box is empty of net
charge there will be just as much going in as there is coming out. Electric
field flows are incompressible, and so are magnetic fields.

[6] Many things have improved since the early Nineteenth century.

Figure 4.2.6: The Electric field around a positive charge.
Figure 4.2.7: The magnetic dipole field.
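The little box recipe for the divergence translates almost directly into code. The sketch below is an illustration under assumptions of my own: a point charge at the origin with all constants set to one, so the field is the position vector divided by r cubed, and a box of half-width h whose faces are sampled at their centres (which is just a central difference).

```python
import math


def inverse_square_field(p):
    """Field of an assumed unit point charge at the origin: p / |p|^3."""
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    return (x / r ** 3, y / r ** 3, z / r ** 3)


def spreading_field(p):
    """A comparison field V(p) = p, which creates stuff everywhere."""
    return tuple(p)


def divergence(field, p, h=1e-4):
    """Net outflow per unit volume from a small box of half-width h at p.

    Each pair of opposite faces contributes the difference of the normal
    component across the box divided by the width of the box.
    """
    total = 0.0
    for axis in range(3):
        forward, backward = list(p), list(p)
        forward[axis] += h
        backward[axis] -= h
        total += (field(forward)[axis] - field(backward)[axis]) / (2.0 * h)
    return total


point = (1.0, 2.0, 3.0)  # a point with no charge sitting on it
print("inverse square field:", divergence(inverse_square_field, point))
print("spreading field:     ", divergence(spreading_field, point))
```

Away from the charge the inverse square field gives zero to within rounding, while the comparison field gives three; at the origin the first field blows up, which is where the charge lives.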
If the flow is not incompressible, it may still satisfy the Continuity Equation
which holds for a larger class of flows of physical systems. It says:

$$\frac{\partial \rho}{\partial t} = -\nabla \cdot \mathbf{j}$$

where $\rho$ is the density of stuff at a point and time and j is the vector field
of the flow of the stuff. You can translate this as: What goes in must
come out or wind up as a sticky mess in the middle. It applies to every
little box you put in the flow, so it makes sense in the limit as the boxes
get smaller. The right hand side can be imagined to represent the build up
of concentration of the flow, which accounts for the minus sign, and the left
hand side then represents the consequence of an increase in the density.
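Here is a toy version of the Continuity Equation with actual little boxes. It is a sketch under several assumptions that are mine and not the notes': the flow is one dimensional, the boxes are joined in a ring so nothing escapes at the ends, the stuff moves to the right at a constant positive velocity so that j is velocity times density, and the time step is small enough that stuff moves less than one box per step.

```python
def step(rho, velocity, dx, dt):
    """One time step of d(rho)/dt = -d(j)/dx on a ring of boxes.

    flux[i] is the amount per unit time leaving box i through its right
    hand face; box i gains what leaves box i - 1 (velocity is assumed
    positive, and index -1 wraps round to the last box).
    """
    n = len(rho)
    flux = [velocity * rho[i] for i in range(n)]
    return [rho[i] - (dt / dx) * (flux[i] - flux[i - 1]) for i in range(n)]


# A lump of stuff in the middle of eight boxes (illustrative numbers).
rho = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
total_before = sum(rho)

for _ in range(50):
    rho = step(rho, velocity=1.0, dx=1.0, dt=0.5)

print("total before:", total_before)
print("total after: ", sum(rho))
print("densities:   ", rho)
```

The lump moves and smears out, but the total never changes: what leaves one box arrives in the next.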
The flow of an electric field for a point charge looks like figure 4.2.6 and the
flow for a magnetic field looks like figure 4.2.7. If we take a field like this at
every point along a line segment and add them up we get the field of a long,
thin bar magnet.
4.3 Maxwell and Faraday
Faraday spent a lot of time investigating the relationship between the three
forces. He didn't find any link between the other two and gravity, although
he spent a long time looking as you will see from the Notebooks. But he did
find some important relationships between electricity and magnetism. Some
of this had also been done by Ampère in France.[7]
The key things that turn up are that a moving charge produces a magnetic
field, and that a changing magnetic field moves charge.
Electrons move rather easily through metals. The electrons in metals that
are attached to the positively charged nuclei in the atoms may be bound
more or less tightly in the atoms, and the outer electrons are more or less
communal to a set of atoms in the crystal lattice which metals form. A
bar of iron is basically a mess of little crystals all jammed together; if you
heat it and let it cool very slowly, you get fewer and bigger crystals; in an
extreme case, practicable only for small bits of iron, you can get a single
crystal. Trying to make it one big crystal is important in some applications
because the strength of a crystal is much greater than the strength of the
metal mixture. Or to put it another way, when you pull at two ends of a
wire, it comes apart at the places between the crystals, not in the middle of
a crystal. And electrons are very small and light. So a metal looks to an
electron rather like a sequence of mostly empty rooms (the crystals) and the
electrons are rather like a swarm of flies, buzzing about aimlessly. Except
that the flies repel each other. When an electric field is put across the wire,
the electrons drift in the direction forced by the field. In effect, if you pump a
bunch of electrons in at one end of a piece of wire, they repel nearby electrons
and so on, so a compression wave passes rather quickly down the wire.
It makes sense therefore to talk of the current which is basically a count of
the number of electrons passing a point in a second.[8] By measuring charge
in some more practical way we can write $i = dQ/dt$ where charge Q moves
along a wire, or even in a stream through empty space. Since what goes in
must come out and since the electrons are not going to bunch up if they can
help it, the current at one point of a piece of wire must be the same at any
other point except for brief transients when we switch on the process. It is
clear that current is a vector since moving charge has a direction associated
with it.

[7] During the Napoleonic Wars in Europe, Faraday and Sir Humphry Davy travelled
around to talk to the physicists in various European countries. They regarded the war as
rather a nuisance, and had to avoid the fighting which they saw as a form of insanity to
which some people are addicted. They didn't need passports which hadn't been invented:
it was any free-born Englishman's right to go wherever he wanted. Passports were
introduced later under the usual excuse that the government wants to help you. Few people
of any intelligence believed this in early nineteenth century Britain, it being too obvious
that politicians mainly want to help themselves. Not everything has improved since the
early Nineteenth century.

[8] But with the direction reversed because current is positive charge and electrons are
negative.

Figure 4.3.1: The magnetic field of a current (moving charge).
Michael Faraday, one of the finest experimentalists the world has produced
and an all-round smart cookie, established that charge moving along a wire
produces a magnetic field which circles the wire. Drawing curves for the flow
of the field we get something like figure 4.3.1. I have drawn only a section at
one point of the wire; there is such a set of circles centred on every point.
He also found that when a magnetic field changes it induces a current. This is
how we get our mains electricity from power stations. We spin a magnet and
surround it by a coil of wire, in effect. Some serious googling or any elementary
text book on electricity should show you exactly how this is done.[9]
James Clerk Maxwell took the findings of Faraday and wrote them out in
Algebra.
[9] There are people who are convinced they understand electricity: when you click the
switch the light comes on, or maybe the television set, although this generally requires a
different switch. There is more to it than that, and it is a good idea to understand it a
little better, or you are not really a member of our civilisation, just a free-loading parasite
on it, hardly better than a politician or an arts graduate.

If you reflect that changing magnetic fields produce an electric field and
changing electric fields produce a magnetic field, it might occur to you that
this swapping of energy between the two forms might happen in a cyclic way,
and might indeed happen in empty space. It might even occur to you that
such a cycling arrangement might travel through space. Your opinions would
not however count for much and would be considered of very minor interest,
mostly by your friends and relatives, and some of them might consider your
views as evidence of insanity. If however you were to take your wamblings
and turn them into algebra, you might be able to prove that this could
indeed happen and show how to calculate the speed of propagation of such
an electromagnetic wave in terms of constants which were properties of the
vacuum, such as $\epsilon_0$ and the corresponding magnetic one $\mu_0$. And if this turned
out to be the same as the measured velocity of light, you would eventually
be taken very seriously. This is what Maxwell did. The velocity of light
just happens to be $1/\sqrt{\epsilon_0 \mu_0}$, both of which were known from entirely static
experiments. And it led shortly afterwards to people trying deliberately to
produce such electromagnetic waves, and this led on to radio, radar, television
and most recently mobile phones.[10]
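You can repeat the arithmetic behind the claim about the velocity of light in a couple of lines. The numerical values below are the usual SI figures for the two vacuum constants, which I am supplying as an assumption since these notes give no numbers; the function name `wave_speed` is mine.

```python
import math

# Assumed SI values for the constants of empty space.
EPSILON_0 = 8.8541878128e-12  # electric constant, farads per metre
MU_0 = 1.25663706212e-6       # magnetic constant, henries per metre


def wave_speed(epsilon, mu):
    """Speed of an electromagnetic wave in a medium: 1 / sqrt(epsilon * mu)."""
    return 1.0 / math.sqrt(epsilon * mu)


c = wave_speed(EPSILON_0, MU_0)
print("predicted speed in metres per second:", c)
```

Compare the result with the measured velocity of light, which is defined to be 299792458 metres per second in SI units.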
Maxwell's Equations are four in number and state things that are known
about the electric and magnetic fields. Two deal with the nature of the fields
separately and two deal with the interaction between them. I give them here
for definiteness in more or less the same form as the text book. We suppose
that E is the electric field, B is the magnetic field and $\rho$ is the density of
charge.

$$\nabla \cdot \mathbf{E} = \rho \qquad (4.3.1)$$
$$\nabla \cdot \mathbf{B} = 0 \qquad (4.3.2)$$
$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0 \qquad (4.3.3)$$
$$\nabla \times \mathbf{B} - \frac{\partial \mathbf{E}}{\partial t} = \mathbf{j} \qquad (4.3.4)$$

The vector j is in the direction of the moving charge and has norm the rate
of it.
These are very different from the form that Maxwell wrote them in, which
were much longer and not so compressed, and we shall get an even terser
form later. I note that there are some constants for the medium which have
been fixed up to make the velocity of light one. This is just a choice of units
in which to measure things and so is quite harmless and makes equations
simpler.
[10] Whether this is altogether desirable may be doubted, but there are at least some
benefits. Certainly the reason we are much better off than the inhabitants of Bangladesh
or Congo is that we are more closely related to Isaac Newton, Michael Faraday and James
Clerk Maxwell than they are, biologically or socially. And we live with traditions which
are still in many ways similar to the traditions in which these men lived and produced the
amazing changes that they did. There are also some differences. Nowadays, instead of
being funded by the Royal Society at the discretion of Sir Humphry Davy (its president),
Faraday would have had to submit a grant application to a committee to study electricity.
It is very doubtful if he'd get it. First he had no appropriate qualifications, and second
the practical applications would certainly have been beyond the imagination of the kind
of people who enjoy being on committees. He'd probably have been told to give up all
this foolery with wires and magnets and work on more powerful steam engines.
Figure 4.3.2: The curl of a vector field.
The first two equations simply say that the electric and magnetic fields are
incompressible, that the flux into a region is equal to the flux out in the
case of an electric field, except in a region containing charge, and that the
magnetic field is always incompressible (there are no magnetic monopoles).
The second two contain the information about the interchange between
magnetic fields and electric fields. It is essential to understand what they are
saying; do not merely memorise them.
The curl of a vector field is the extent to which it tends to twist around
some axis. If you visualise a stream of water flowing and you put a very
small paddle in it, then in general the paddle gets turned by the flow being
greater on one side than on the other. Figure 4.3.2 gives the idea.
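The paddle can be imitated numerically. In the sketch below the stream is an invented example, not one taken from these notes: water flowing in the x direction with speed y squared, so that the flow is faster the further you are from the line y = 0. The curl is estimated by central differences using the usual component formula, and the function names are mine.

```python
def stream(p):
    """An assumed stream: flow along x with speed y^2, nothing in y or z."""
    x, y, z = p
    return (y * y, 0.0, 0.0)


def curl(field, p, h=1e-5):
    """Central-difference estimate of the curl of a vector field at p."""
    def partial(component, axis):
        forward, backward = list(p), list(p)
        forward[axis] += h
        backward[axis] -= h
        return (field(forward)[component] - field(backward)[component]) / (2.0 * h)

    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))


for point in [(0.0, 1.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, 0.0)]:
    print(point, "curl:", curl(stream, point))
```

The water never goes round in circles, yet a paddle at y = 1 turns one way, a paddle at y = -1 turns the other way, and a paddle at y = 0 does not turn at all: the curl measures the imbalance of the flow across the paddle, not whether the stream itself rotates.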
The curl can be thought of as a vector by taking the amount of twist about
the positive x-axis, the positive y-axis and the positive z-axis to give three
components; alternatively we can take the direction of the vector to be that
in which the rotation is a maximum and the length equal to the maximum
torque. Only a little thought suggests that it would be much more natural
to think of it as a 2-form, when it is simply the exterior derivative of the
1-form which replaces the vector field. This is undoubtedly a better way to
think of it, as it generalises to higher dimensions quite naturally. And of
course the divergence can be thought of as applying the exterior derivative
to a 2-form to get a 3-form, which on $\mathbb{R}^3$ is, at each point, a number. So we
may anticipate the next stage of writing these equations out will be to turn
them into differential forms instead of vector fields.
For the present however, equation 4.3.3 says that the curl of the electric field
is the rate of change of the magnetic field with the direction reversed. We
have to think of B as a vector field which depends upon the time: if we
take each of the three components it is a function of x, y, z and t, and if we
differentiate it (partially) with respect to t we get a new vector field, also
usually depending on t. So equation 4.3.3 says that for every time t, the
vector field $\nabla \times \mathbf{E}$ is the negative of the time derivative of B. The amount of
twist of the electric field depends upon the rate at which the magnetic field
is changing. This is, like equation 4.5.1, and of course equation 4.3.4, part of
the interconnection between magnetic and electric fields.
Finally, equation 4.3.4 is almost dual to equation 4.3.3 except for
a minus sign and the j term, and tells us something about the curl of the
magnetic field at every time in terms of the current vector and the rate of
change of the electric field. The latter term was introduced by Maxwell
not on the basis of experimental results, but because it led to the wave
solution to the four equations. One suspects strongly he had done the vague
English language argument about the exchange of energy between electric
and magnetic fields in free space and wanted to make it come good.
In order to collect your ideas on these equations, and to recall some earlier
work, some simple exercises will establish what is going on. If you did second
year physics you have probably already done these, although not perhaps in
this form.
Exercise 4.3.1. Find a vector field V on $\mathbb{R}^3$ with constant curl the vector
$(0, 0, 1)^T$. Find some more vector fields with the same curl. Show that there
is an infinite dimensional space of vector valued functions on $\mathbb{R}^3$ which can
be added to your solution to give another solution.
Exercise 4.3.2. Find a vector field V on $\mathbb{R}^3$ which has a constant curl vector
zero, but which has the property that the integral around the unit circle (in
the z = 0 plane) of V is non-zero.
Exercise 4.3.3. A current vector j is defined to be uniformly $(0, 0, 1)^T$ for
points of distance less than one from the z-axis. You may imagine a rod of
radius 1 along the z-axis carrying a current. The current is zero for points
at a distance from the z-axis greater than one (that is, outside the rod). Find
a continuous magnetic field the curl of which is the given j field.
Explain why continuity is worth having, and given that there are rather a lot
of other solutions, explain what grounds you have for preferring yours. You
might find figure 4.3.1 inspirational.
Exercise 4.3.4. Show that the wave equation is a solution to the Maxwell
Equations in empty space. First write down the equation of an electric field
which has all the vectors in a plane the same length and direction: take, say,
planes x = s and arrange to have, for fixed s, every electric vector the same
length and direction at any time t, but change the vector in time and also with
s so that it has unit speed along the x axis. Now look to see if this satisfies the
Maxwell Equations, for $\rho$ and j zero, by doing some partial differentiating.
When you have done so, throw your hat in the air and shout huzzah! in
celebration. You have seen the light.
Figure 4.3.3: What is the effect on a television set of a magnet?
Exercise 4.3.5. A beam of electrons is emitted by a cathode at the back of
a television set and paints a spot on the centre of the screen. Traditionally,
deflector plates are charged so as to sweep the spot in a raster scan giving your
television picture. I show a horizontal section through the tube in figure 4.3.3.
Discuss what happens when a bar magnet is placed in the location shown.
What numbers would you need to know in order to calculate the deflection of
the beam and its direction?
Exercise 4.3.6. Read the first twenty chapters of volume Two of the
Feynman Lectures on Physics (in the library). Chapter nineteen is of no direct
relevance but is good fun and worth reading to see how a great physicist thinks.
If you have been doing physics you should find this easy, but there are some
penetrating observations which you may want to think about. If you haven't
done much physics, again this will show you something of what you have been
missing.
Exercise 4.3.7. Read Chapter four of the text book and do all the exercises.
Remark 4.3.1. The work which has been described so far has changed the
world, mainly for the better, and changed it enormously. It is the product
of the Western Intellectual Tradition, and it is worth reflecting on the kind
of society which can produce such things, and also on the kinds of society
which cannot, which is most of them.
Maxwell's Equations represent one of the glories of Western civilisation,
something which is likely to remain as long as humanity endures and possibly
for much longer. Maxwell, Faraday and others stole lightning from the gods:
these men are heroes far beyond such as Alexander, Caesar or Napoleon [11].
[11] Or miscellaneous footy players, or people who hit balls with sticks. Or people who
play guitars or sing. The list goes on.
Figure 4.4.1: A ball about to bounce off a wall.
Your life at University is being spent in part at least in coming to understand
the thinking of the great men who produced these marvels, and also
to understand something of how they did it. There are worse ways to spend
your time [12].
4.4 Invariance
4.4.1 The Idea of Invariance
Imagine a ball rolling in the plane and bouncing off a fixed wall, as in figure
4.4.1.
If the ball has initial momentum p in the direction of the arrow, then it is
simple to compute the new momentum after the ball has bounced: the component
parallel to the wall is unchanged and the component perpendicular
to the wall is reversed in sign.
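The rule just stated can be written as p' = p - 2(p.n)n, where n is a unit normal to the wall. The following is only a minimal sketch under the idealised assumptions of a perfectly elastic bounce off a flat wall; the function name `reflect` and the sample numbers are illustrative and do not come from the text.

```python
def reflect(p, n):
    """Bounce momentum p off a wall with unit normal n.

    The component of p along n is reversed; the component parallel
    to the wall is left unchanged: p' = p - 2 (p.n) n.
    """
    d = p[0] * n[0] + p[1] * n[1]
    return (p[0] - 2 * d * n[0], p[1] - 2 * d * n[1])


# Illustrative case: the wall lies along the y-axis, so its normal is the x-axis.
p_in = (3.0, 1.0)
p_out = reflect(p_in, (1.0, 0.0))
print(p_out)  # (-3.0, 1.0)
```

The parallel component (here 1.0) survives and the perpendicular one changes sign, so the length of the momentum vector, and hence the kinetic energy, is the same before and after.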
This makes a number of assumptions which are less than satisfactory; one
is that the ball is elastic, since if it were made of putty it might stick to the
wall after deforming. So we also assume that energy is conserved by the
impact, in general not a realistic assumption, but approximately true for
bodies which are elastic enough. We also assume that the wall is rather
flat and very smooth, since the ball will actually impact over a region, not
at a point, and if different bits of the wall made different angles with the
trajectory then the behaviour is potentially more complicated and harder to
compute. When you did this sort of thing at school I rather suspect they just
trained you to make the assumptions that the school teacher made without
asking many questions, so you probably never questioned the assumptions
and indeed didn't even think about what they were. There can be rather a
lot.
[12] Conquering Europe, or anything involving balls or guitars, for instance.
Exercise 4.4.1. Think of some more assumptions that are necessary to get
a solution to the problem as posed.
In order to do the calculation, subject to all the usual assumptions, we need
to take a coordinate system, and some are better than others. I have shown
some axes in the lower left of the picture. I haven't, however, marked on
any units and you don't know what units the momentum is given in either. It
should be fairly obvious that the units don't much matter, in that whatever
we choose, as long as we take the same ones after as before we will get the
same answer. What is crucial is the angle the momentum vector makes with
the wall.
Now suppose we change the position and orientation of the axes. I do the
calculation in the original system and you do it in the new system. We can
translate everything from one to the other; your initial momentum vector p
will consist of an ordered pair of numbers, and an initial point for the ball
will also consist of another ordered pair of numbers. So will mine, and mine
will be different. It is easy to write down a euclidean group element which
will reliably translate your numbers to mine and the inverse will translate
my numbers to yours. And the resulting momentum vector, specified by a
direction and a point through which it passes, will translate by the same rule.
This is just like the situation of chapter one where I talked about penguins:
we have language which consists of finite lists of numbers, and we have the
physical entities, and the behaviour had better be described in the same way
whatever the language, because what happens in the world does not depend
on the language we use to talk about it. This is a key assumption about the
physical world which we use to put conditions upon the kinds of language
we shall use to talk about it [13].
In the above case we can also change the units: you can use metres per second
and I can use parasangs per lunar month; the translation system still works
in that the system that translates the initial momentum from yours to mine
will also translate the final momentum from yours to mine. In fact it is
difficult to think of different systems of specifying initial and final momenta
and positions where a consistent translation system will not work.
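The claim that one translation rule serves for both the initial and the final momentum can be checked numerically. The sketch below is only an illustration under stated assumptions: the bounce is the idealised elastic one, the change of language is a rotation of the axes followed by a change of units, and all names and numbers (`reflect`, `rotate`, `scale`, the angle 0.7, the factor 2.54) are made up for the example.

```python
import math


def reflect(p, n):
    """Elastic bounce: reverse the component of p along the unit normal n."""
    d = p[0] * n[0] + p[1] * n[1]
    return (p[0] - 2 * d * n[0], p[1] - 2 * d * n[1])


def rotate(v, angle):
    """Rotate the vector v by the given angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def scale(v, k):
    """Change of units: multiply every component by k."""
    return (k * v[0], k * v[1])


def close(u, v, tol=1e-12):
    """True when two vectors agree to within rounding error."""
    return all(abs(a - b) < tol for a, b in zip(u, v))


p, n = (3.0, 1.0), (1.0, 0.0)   # illustrative momentum and wall normal
angle, k = 0.7, 2.54            # illustrative rotation and unit factor

# Predict in one language, then translate the answer ...
predict_then_translate = scale(rotate(reflect(p, n), angle), k)
# ... or translate the data first, then predict in the other language.
translate_then_predict = reflect(scale(rotate(p, angle), k), rotate(n, angle))

print(close(predict_then_translate, translate_then_predict))  # True
```

Only the momentum is translated here; a shift of the origin would change the position coordinates but leave the momentum components alone.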
[13] This condition does not seem to apply to one's love life. There are tactful ways a bloke
can tell his girlfriend he doesn't like her outfit, and there are others. He may tell the truth
in both cases, but the language makes a difference. In particular, it is always a mistake
to laugh. I speak from bitter experience.
Exercise 4.4.2. But not impossible. Hint: consider the map from the polar
coordinate space $(r, \theta)$ to itself that doubles the angle. This is not a bijection:
we can translate events one way, but not unambiguously the other.
What this means is that if we have two languages for talking about events,
then as long as the translation scheme between the two languages is a bijection,
and as long as an event can be specified in one language, then it can
also be specified in the other, and the translation scheme will work for all
such events if those events are correctly described.
But as well as specifying observable events, we also want to predict what will
happen in advance by means of some kind of theory. And it is going to be a
poor sort of theory where the prediction is different in languages with such a
translation scheme to hand. This imposes a constraint on the theory: it must
translate the same predictions into each other. This is known as Einstein's
Principle of Covariance and you should be able to see how he (and Poincaré)
came to it: by seeing different coordinate frameworks as providing different
languages and there being a translation system between them.
We normally have not just two languages and one translation system between
them but a whole space of languages and a group of translation schemes,
since given any three languages, if I can translate from A to B and from B
to C by maps, then I can translate from A to C by the composite; moreover
the identity will translate from any language to itself, and we really want
every translation system to work in both directions, so there is an inverse
map. The associativity of composites of maps is immediate, so we have a
group of such translation systems. In the case of the shifting and rotation of
a coordinate frame used to specify only the positions of points, this is clearly
the Special Euclidean Group, SE(2, R). See the M2213 lecture notes if you
have forgotten this.
If $f : \mathbb{R}^2 \to \mathbb{R}$ is any map, it makes sense to ask if it is invariant with
respect to a group action. For example, $f(x, y) = x^2 + y^2$ is invariant under
the rotation group $SO(2, \mathbb{R})$: putting
\[
X = x\cos\theta - y\sin\theta, \qquad Y = x\sin\theta + y\cos\theta
\]
we easily confirm that $f(X, Y) = f(x, y)$ for any $\theta$. So if some sort of
prediction is specified by a function we can look to see if it is invariant under
the appropriate group of transformations of coordinates which we regard as
specifying the possible languages we have available, and if it is not then it
cannot possibly define a satisfactory theory, because different observers will
expect to have different and incompatible outcomes. If a theory is specified
by requiring two functions to be equal, then they must be equal both before
and after we perform the appropriate group actions on them.
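The invariance test described here is easy to try numerically. This is a rough sketch, not a proof: it samples a few points and angles (chosen arbitrarily for illustration) and compares a function before and after rotation. The function g(x, y) = x is an assumed counterexample added for contrast; it is not in the text.

```python
import math


def rotate(x, y, theta):
    """Apply the rotation by theta to the point (x, y)."""
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, x * s + y * c)


def f(x, y):
    """The example from the text: squared distance from the origin."""
    return x * x + y * y


def g(x, y):
    """An assumed counterexample: the first coordinate alone."""
    return x


def is_invariant(func, samples, angles, tol=1e-9):
    """Check func(X, Y) == func(x, y) on the sampled points and angles."""
    return all(abs(func(*rotate(x, y, a)) - func(x, y)) < tol
               for (x, y) in samples for a in angles)


samples = [(1.5, -2.0), (0.0, 3.0), (-4.0, 0.25)]
angles = [0.1, 1.0, 2.5]
print(is_invariant(f, samples, angles))  # True
print(is_invariant(g, samples, angles))  # False
```

A prediction expressed through g would differ between observers whose axes are rotated relative to each other, which is exactly the failure the paragraph warns about.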
What is the group action in the case of the ball bouncing off the wall? We
have that the space in question is the space of positions and momenta of
balls. This we have seen is the cotangent space to $\mathbb{R}^2$, which is isomorphic to
$\mathbb{R}^4$. It is not uncommon to write elements of this in the form $(q_1, q_2, p_1, p_2)$
where the $q_i$ are the positions, x and y more conventionally, and the $p_i$ are
the momenta. If we do a shift of a coordinate system, this will affect the $q_i$
but not the $p_i$. If we do a rotation, both will be affected in the same way.
We can also consider a coordinate frame which is moving at a constant velocity
with respect to another one, requiring us to specify also the time. So
we have a five dimensional space in which to specify the position and momentum
of the ball at each time, two coordinate frameworks for turning the
motion into a map from $\mathbb{R}$ denoting the time into $\mathbb{R}^5$, and a map between
them which has the property of taking one description to another description
of the same event.
Exercise 4.4.3. Write down a specification for a ball moving in a straight
line in $\mathbb{R}^2$ with constant momentum. Use the five numbers $(t, q_1, q_2, p_1, p_2)$.
Choose actual numbers for the motion!
1. Take a coordinate frame which is shifted by some constant amount and
translate the function giving the position and momentum of the ball into
the new framework.
2. Do the same with a rotated coordinate frame.
3. Do the same for a frame which is both rotated and shifted.
4. Do the same for a frame which is both rotated and is moving at constant
velocity.
5. Do the same for a frame which is rotating with constant angular velocity.
6. Write down the groups for the first three transformations. What physically
intelligible function is invariant under this group?
7. Write down the group for all the first four transformations. (This is
called the Galilean Group.) What is its dimension?
8. Write down the group for all five transformations.
9. Is your function invariant under either of the last two groups?
10. What happens when the ball bounces?
11. Explain the physics here.
Exercise 4.4.4. If V is a vector field on $\mathbb{R}^3$ and $f : \mathbb{R}^3 \to \mathbb{R}^3$ is a euclidean
transformation, is it true that when V satisfies the equation $\nabla \cdot V = 0$ then
so does $Tf(V)$? If so prove it, if not give a counterexample.
Exercise 4.4.5. If V is a vector field on $\mathbb{R}^3$ and $f : \mathbb{R}^3 \to \mathbb{R}^3$ is a
diffeomorphism, is it true that when V satisfies the equation $\nabla \cdot V = 0$ then so
does $Tf(V)$? If so prove it, if not give a counterexample.
4.4.2 The Lorentz Group
The definition of the Orthogonal group $O(n, \mathbb{R})$ was either that it consisted of
the orthogonal $n \times n$ real matrices, or, better, that it consisted of the linear
maps from $\mathbb{R}^n$ to $\mathbb{R}^n$ which preserved the inner product. Formally,
\[
\forall A \in L(\mathbb{R}^n, \mathbb{R}^n), \quad A \in O(n, \mathbb{R}) \iff \forall u, v \in \mathbb{R}^n, \ \langle u, v \rangle = \langle Au, Av \rangle
\]
So as to simplify things I shall now take a generalised inner product on $\mathbb{R}^2$,
which I shall write as having elements
\[
\begin{pmatrix} t \\ x \end{pmatrix}
\]
and the (lorentzian) inner product is defined to be
\[
\left\langle \begin{pmatrix} t \\ x \end{pmatrix}, \begin{pmatrix} t' \\ x' \end{pmatrix} \right\rangle = tt' - xx'
\]
Note that I have reversed the sign from what you probably expected and the
one that the text book favours. If you feel uneasy about this go through
multiplying everything by $-1$.
It follows that the norm of the vector $(t, x)^T$ is $t^2 - x^2$.
I shall argue by analogy with the usual inner product on $\mathbb{R}^2$.
In order to find out what the orthogonal maps looked like on $\mathbb{R}^2$, we took
the unit circle, and argued that any point on it had to remain on it under
an orthogonal map. Doing the same here, take the set
\[
H_1 = \left\{ \begin{pmatrix} t \\ x \end{pmatrix} \in \mathbb{R}^2 : t^2 - x^2 = 1 \right\}
\]
Then if A preserves the new inner product, or is lorentzian, we have that
\[
\begin{pmatrix} t \\ x \end{pmatrix} \in H_1 \implies \begin{pmatrix} s \\ u \end{pmatrix} = A \begin{pmatrix} t \\ x \end{pmatrix} \in H_1 \implies s^2 - u^2 = 1
\]
It is easy to draw the set $t^2 - x^2 = 1$ and it consists of a hyperbola as in
figure 4.4.2.
Figure 4.4.2: An analogue of the unit circle in a Lorentzian space.
My reason for drawing it this way around and taking $t^2 - x^2$ and not $x^2 - t^2$
is that all the action has $x < t$ which, given that the velocity of light is one
in these units and that things don't travel faster than light, is the way things
ought to be.
Now we can parametrise the unit circle by $\cos\theta, \sin\theta$ and it is easy to
parametrise the curve $H_1$ by
\[
t = \cosh\theta, \qquad x = \sinh\theta
\]
since
\[
\cosh^2\theta - \sinh^2\theta = \frac{e^{2\theta} + e^{-2\theta} + 2}{4} - \frac{e^{2\theta} + e^{-2\theta} - 2}{4} = 1
\]
Now we note that the standard basis elements in $\mathbb{R}^2$ are $t = 1, x = 0$ and
$t = 0, x = 1$, and that the norm of the first is 1, so it is in $H_1$, and the norm
of the second is $-1$ and it is not in $H_1$. So there is a slight problem with
defining a Lorentzian matrix in terms of $\cosh\theta$ and $\sinh\theta$. The solution is
to observe that we need to extend $H_1$ to contain the other hyperbola which
intersects the x axis, as in figure 4.4.3.
Figure 4.4.3: A better analogue of the unit circle in a Lorentzian space.
We can now see that we should have defined
\[
H_1 = \left\{ \begin{pmatrix} t \\ x \end{pmatrix} \in \mathbb{R}^2 : t^2 - x^2 = \pm 1 \right\}
\]
which would have made no difference in the case of $S^1$ and the usual inner
product had we done it, but does make a difference here. We now have that
the lorentzian matrices are those of the following form, for any real $\theta$:
\[
\begin{pmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{pmatrix}
\]
Those vectors for which we are in the original hyperbola are called timelike,
since they represent velocities which are less than 1 and correspond to things
we may see in our universe. Light, which moves at the velocity 1 in our units,
must lie along $x = \pm t$, and consists of vectors not in $H_1$, and the norm of
any such vector is zero. So the analogue of distance in our Lorentzian space,
which we call the interval, is zero for any light ray. Seen from the point
of view of a ray of light, there is no difference between starting from the
Andromeda Galaxy and arriving in your eye. This is definitely weird; well,
that's reality for you.
Supposing we start with a timelike vector in the two dimensional Lorentzian
space, for example $t = 2, x = 1$. It goes to
\[
t = 2\cosh\theta + \sinh\theta, \qquad x = 2\sinh\theta + \cosh\theta
\]
It is easy to verify that the norm of the original vector is 3 and so is the
norm of the final vector.
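The verification for the vector t = 2, x = 1 can also be done numerically. The sketch below assumes the two dimensional lorentzian matrix with rows (cosh, sinh) and (sinh, cosh) and the inner product tt' - xx'; the helper names `lorentz_inner` and `boost` and the sample values of the parameter are illustrative only.

```python
import math


def lorentz_inner(u, v):
    """Lorentzian inner product on R^2 with sign convention t t' - x x'."""
    return u[0] * v[0] - u[1] * v[1]


def boost(theta, u):
    """Apply the matrix [[cosh, sinh], [sinh, cosh]] to the column (t, x)."""
    c, s = math.cosh(theta), math.sinh(theta)
    return (c * u[0] + s * u[1], s * u[0] + c * u[1])


u = (2.0, 1.0)
print(lorentz_inner(u, u))           # 3.0 for the original vector
for theta in (-1.0, 0.3, 2.0):       # arbitrary sample parameters
    w = boost(theta, u)
    # Each value agrees with 3 up to floating point rounding.
    print(theta, lorentz_inner(w, w))
```

Algebraically this is (2c + s)^2 - (2s + c)^2 = 3(c^2 - s^2) = 3, with c and s the hyperbolic cosine and sine.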
Exercise 4.4.6. Confirm that all such matrices as those advertised do indeed
preserve the lorentzian form. What is their determinant? What other
matrices preserve the Lorentzian form? What is their determinant?
Now let's get back to higher dimensional spaces with a (1,n) signature. I
have the usual spacetime situation with $(x_0, x_1, x_2, \cdots, x_n)^T$ and the lorentzian
generalised inner product $x_0 x'_0 - \sum_{j \in [1:n]} x_j x'_j$. I am particularly concerned
with $n = 3$ because that is the number of spatial dimensions of the universe
we live in.
There are six basic lorentzian matrices in $\mathbb{R}^4$ with the lorentzian inner
product:
\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & c & -s \\ 0 & 0 & s & c \end{pmatrix}
\quad
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & c & 0 & -s \\ 0 & 0 & 1 & 0 \\ 0 & s & 0 & c \end{pmatrix}
\quad
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & c & -s & 0 \\ 0 & s & c & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\tag{4.4.1}
\]
gives three of them, where the c and s entries represent cosines and sines
of angles and give the three dimensional space of real orthogonal matrices.
The time axis is left fixed in this case, and each of these leaves one other
orthogonal axis fixed.
The other three are
\[
\begin{pmatrix} ch & sh & 0 & 0 \\ sh & ch & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\quad
\begin{pmatrix} ch & 0 & sh & 0 \\ 0 & 1 & 0 & 0 \\ sh & 0 & ch & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\quad
\begin{pmatrix} ch & 0 & 0 & sh \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ sh & 0 & 0 & ch \end{pmatrix}
\tag{4.4.2}
\]
where ch is short for $\cosh\theta$ and sh is short for $\sinh\theta$ for various $\theta$. Each of
these swaps the time into one of the three space axes and vice versa. Again,
two axes are left fixed. They are known to physicists as Lorentz Boosts.
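A matrix A preserves the lorentzian inner product exactly when A^T G A = G, where G is the diagonal matrix with entries (1, -1, -1, -1). The sketch below checks this numerically for the three rotations and three boosts and for one composite. It is a spot check at arbitrarily chosen parameter values, not a proof; the helper names and the placement of the minus sign in the rotation blocks are assumptions made for the example.

```python
import math

# Matrix of the lorentzian inner product x0 x0' - x1 x1' - x2 x2' - x3 x3'.
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]


def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def rotation(i, j, angle):
    """Rotation in the plane of the spatial axes i and j (indices 1 to 3)."""
    m = identity()
    c, s = math.cos(angle), math.sin(angle)
    m[i][i], m[i][j], m[j][i], m[j][j] = c, -s, s, c
    return m


def boost(j, theta):
    """Lorentz boost mixing the time axis 0 with the spatial axis j."""
    m = identity()
    ch, sh = math.cosh(theta), math.sinh(theta)
    m[0][0], m[0][j], m[j][0], m[j][j] = ch, sh, sh, ch
    return m


def preserves_form(a, tol=1e-9):
    """True when A^T G A equals G up to rounding."""
    lhs = matmul(transpose(a), matmul(G, a))
    return all(abs(lhs[i][j] - G[i][j]) < tol
               for i in range(4) for j in range(4))


six = [rotation(2, 3, 0.4), rotation(1, 3, 1.1), rotation(1, 2, -0.7),
       boost(1, 0.5), boost(2, -1.2), boost(3, 0.9)]
print(all(preserves_form(m) for m in six))      # True

composite = identity()
for m in six:
    composite = matmul(composite, m)
print(preserves_form(composite))                # True
```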
Exercise 4.4.7. Show that each of the above six matrices preserves the
lorentzian inner product, and hence that any composite of them (for any
consistent values of the argument of cos, sin, cosh or sinh) will also.
Exercise 4.4.8. Show that every matrix which preserves the lorentzian inner
product must be some finite product of such matrices.
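Exercise 4.4.9. Show that the Galilean group can be represented by matrices
of the form
\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ v_1 & a_{11} & a_{12} & a_{13} \\ v_2 & a_{21} & a_{22} & a_{23} \\ v_3 & a_{31} & a_{32} & a_{33} \end{pmatrix}
\]
where
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\]
is in $SO(3, \mathbb{R})$.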
Remark 4.4.1. Note that both the Lorentz group and the Galilean group
can deal with a change of coordinates from a fixed system to one moving at
uniform velocity with respect to it. And they are different! The lorentz group
is the right one for relativity. You should observe that for the lorentz group,
movement with velocity $v$ means setting $\tanh(\theta) = v$, and that we recover the
usual (relativistic) rules for the addition of velocities.
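Composing two boosts along the same axis adds their parameters, so with velocity equal to the hyperbolic tangent of the parameter, combining velocities u and v amounts to computing tanh(atanh(u) + atanh(v)). The sketch below compares that with the closed form (u + v)/(1 + uv) at a few sample values; the function names are illustrative and the units are those in which the velocity of light is 1.

```python
import math


def compose_by_boosts(u, v):
    """Velocity of the composite boost: add the boost parameters."""
    return math.tanh(math.atanh(u) + math.atanh(v))


def add_velocities(u, v):
    """Closed form for the relativistic addition of velocities."""
    return (u + v) / (1 + u * v)


for u, v in [(0.5, 0.5), (0.9, 0.9), (-0.3, 0.8)]:
    # The two columns agree up to floating point rounding.
    print(u, v, compose_by_boosts(u, v), add_velocities(u, v))
```

In every sampled case the result has absolute value less than 1, however close to 1 the inputs are taken.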
Exercise 4.4.10. Find a good source on Special Relativity: you could do
worse than the Feynman Lectures on Physics, Volume 1, chapter 15. Note
the equations
\[
x' = \frac{x - ut}{\sqrt{1 - u^2}}, \qquad y' = y, \qquad z' = z, \qquad t' = \frac{t - ux}{\sqrt{1 - u^2}}
\]
Show that these are essentially the inverse of the first matrix in 4.4.2.
Explain why physicists use the inverse.
Exercise 4.4.11. Show that if a space ship zooms past you at velocity half
that of light, and another spaceship zooms past that at half the speed of light
(relative to the first spaceship) in the same direction, then you will decide the
second spaceship has a velocity of 4/5 the speed of light. Show that if the first
had speed u and the second had speed v relative to the first, your opinion of
the speed of the second is given by
\[
w = \frac{u + v}{1 + uv}
\]
Show that if $|u| < 1$ and $|v| < 1$ then $|w| < 1$.
Exercise 4.4.12. Read the Feynman Lectures on Physics, Volume 1,
chapter 15. If you are a physicist you will have already covered this material;
if not you will find it comforting to discover you have now done Special
Relativity. Easy, isn't it? Note that apart from a few technical difficulties (!)
you have discovered how atom bombs and nuclear power stations work [14].
[14] The details of atom bombs are very simple, and the recipe is as follows: take about
a kilogram of Uranium 235 or Plutonium and shape it into a hemisphere. Do the same
with a second kilogram. Now clap them together hard to make a solid sphere. Children,
do not do this at home unless you really dislike your parents.
4.4.3 The Maxwell Equations
The first of Maxwell's equations for the vacuum, with charge density zero, is
\[
\nabla \cdot E = 0
\]
To see that this is invariant under $SO(3, \mathbb{R})$ is in one sense trivial. The equation
says that the net outflow of the flux determined by the vector field E for
any little box is zero, in fact for any box whatever, where a box is a region
bounded by something diffeomorphic to a 2-sphere. Rotating a box gives another
box, so the net outflow from a rotated box is also zero. This obviously
extends to the non-vacuum case with a non-zero charge density. It obviously
holds for a much larger group than $SO(3, \mathbb{R})$ too: it must hold for any
diffeomorphism, although the equation stating the fact that the divergence is zero
would look rather different.
Although this argument is persuasive, it lacks a certain rigour, so a slightly
more careful argument is required. We can observe that when we take the
divergence of the original vector field at any point it has to be the same as
the divergence of the transformed field at the transformed point. And since
the zero map is also preserved by the orthogonal map, the result follows.
The equation $\nabla \cdot B = 0$ is also invariant for the same reason.
Exercise 4.4.13. Show that the statement "the divergence of the original
vector field at any point has to be the same as the divergence of the transformed
field at the transformed point" can be translated into algebra by doing
it, and that it is true for any vector field.
Now this argument uses only the linearity of the matrix and the fact that
$t = t'$, and indeed is far more general than that.
It remains to prove invariance for the boost maps, the first of which is
\[
\begin{pmatrix} ch & sh & 0 & 0 \\ sh & ch & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\]
The problem here is that the electric and magnetic fields are defined as vector
fields on $\mathbb{R}^3$ and the lorentz boost maps are represented by $4 \times 4$ matrices,
so we can expect some serious complications.
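The actual transformation of the E and B fields is rather a shock at first and
is given by
\[
\begin{pmatrix} \cosh\theta & \sinh\theta & 0 & 0 \\ \sinh\theta & \cosh\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} :
\begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix} \mapsto
\begin{pmatrix} E_x \\ E_y \cosh\theta - B_z \sinh\theta \\ E_z \cosh\theta + B_y \sinh\theta \end{pmatrix}
\tag{4.4.3}
\]
and
\[
\begin{pmatrix} \cosh\theta & \sinh\theta & 0 & 0 \\ \sinh\theta & \cosh\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} :
\begin{pmatrix} B_x \\ B_y \\ B_z \end{pmatrix} \mapsto
\begin{pmatrix} B_x \\ B_y \cosh\theta + E_z \sinh\theta \\ B_z \cosh\theta - E_y \sinh\theta \end{pmatrix}
\tag{4.4.4}
\]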
What is surprising about this is that the Electric and Magnetic components
get mixed up. This means that if I am travelling in the x direction with
velocity $\tanh\theta$ (which you will note has absolute value always less than 1,
the speed of light) and we both measure an electric field and a magnetic
field, we shall differ on which bits are which. This is a strong hint that the
two phenomena of electric fields and magnetic fields are all part of the same
underlying entity, called the electromagnetic field.
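The mixing can be seen with numbers. The sketch below codes the rules (4.4.3) and (4.4.4) with the signs read as E_y cosh - B_z sinh, E_z cosh + B_y sinh, B_y cosh + E_z sinh and B_z cosh - E_y sinh, which is an assumption about the typeset original. It then shows that a purely electric field acquires a magnetic part for the moving observer. The two combinations E.B and E.E - B.B, which are standard Lorentz invariants not mentioned in the text, are used as a sanity check; all names and sample values are illustrative.

```python
import math


def boost_fields(E, B, theta):
    """Fields seen after the first lorentz boost with parameter theta."""
    c, s = math.cosh(theta), math.sinh(theta)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_new = (Ex, Ey * c - Bz * s, Ez * c + By * s)   # rule (4.4.3)
    B_new = (Bx, By * c + Ez * s, Bz * c - Ey * s)   # rule (4.4.4)
    return E_new, B_new


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# A purely electric field for one observer ...
E, B = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0)
E2, B2 = boost_fields(E, B, 0.5)
# ... has a nonzero magnetic z-component, -sinh(0.5), for the other.
print(B2)

# Sanity check on a generic field: both combinations are unchanged.
E, B = (1.0, 2.0, 3.0), (-0.5, 0.25, 4.0)
E2, B2 = boost_fields(E, B, 0.6)
print(abs(dot(E2, B2) - dot(E, B)) < 1e-9)                             # True
print(abs((dot(E2, E2) - dot(B2, B2)) - (dot(E, E) - dot(B, B))) < 1e-9)  # True
```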
The derivation of the above transforms will be easier once we go over to using
differential forms instead of vector fields to represent the two parts, E and
B, of the electromagnetic field, so I shall defer it.
The invariance of the Maxwell Equations under the transforms is also easier
to see in this setting. So we now turn to the right way to talk of the
electromagnetic field.
4.5 Saying it with Differential Forms
Given a physical phenomenon, in this case the force exerted on a charged
particle, and given that two bits of language can be used to describe it, in
this case as a vector field on $\mathbb{R}^3$ or as a 1-form or a 2-form, the question of
which piece of language to use comes up immediately. We may, of course,
choose the first one that occurs to us and having made a choice stick to
it in defiance of later developments. This is rather stupid and regrettably
common. An alternative is to ask if there are any physically obvious grounds
for making a choice, and a second is to keep them both in use until such time
as one demonstrates advantages.
Let me first argue that it is reasonable to use 2-forms for describing magnetic
fields. The reason is that 2-forms take account of orientations, and magnetic
fields certainly exhibit all the usual signs of being orientation aware. If you
look at the Lorentz Force law, which I give again for your greater comfort,
\[
F = q(E + v \times B) \tag{4.5.1}
\]
you will see that the $v \times B$ part clearly has an orientation aspect in it by
virtue of the cross product, whereas the electric field has only the sense or
direction. It therefore makes sense to represent the magnetic field as a 2-form.
Some sort of right hand rule is involved in computing a cross product:
this may be seen as containing the information that magnetic fields also use
some sort of orientation information. They care about which direction a
charge is moving. Of course, we can force a vector view on the field if we
insist, which means we have to keep in mind the right hand rules of physics,
whereas if we use 2-forms, this should be taken care of by the formalism. A
good language is one which does most of the work and doesn't require us to
keep a close watch on it.
If B is a 2-form then so is its time derivative, and the equation
\[
\nabla \times E = -\frac{\partial B}{\partial t}
\]
is telling us that E is a 1-form. So we can write the above equation in the
form
\[
dE = -\frac{\partial B}{\partial t}
\]
where d is the exterior derivative which we know takes 1-forms to 2-forms.
We may also write $dB = 0$ with some confidence to represent the classical
equation $\nabla \cdot B = 0$ since the divergence of a vector field is a scalar field and
the exterior derivative of a 2-form is a 3-form which on $\mathbb{R}^3$ is pretty much
also a scalar field, multiplied by det if you want to be careful.
Unfortunately, after that everything goes pear-shaped:
the equation
\[
\nabla \times B - \frac{\partial E}{\partial t} = j
\]
makes no sense when we try to translate it: B can't have a curl, it is one.
$\partial_t E$ has to be another 1-form, and so is j. So we somehow have to arrange
that the translation of $\nabla \times B$ is a 1-form.
On the other hand, we have also avoided facing the fact that we should be
doing all of this on R
4
with the lorentz metric. Maybe we can save things by
a small amount of rearrangement.
Let us look at the simplest case first. The equations
\[
\nabla \cdot B = 0, \qquad \nabla \times E + \frac{\partial B}{\partial t} = 0
\]
become
\[
dB = 0, \qquad dE + \frac{\partial B}{\partial t} = 0
\]
This corresponds closely to the physics: dB is indeed a divergence, that is a
3-form on $\mathbb{R}^3$, and the exterior derivative of the electric 1-form is indeed a
2-form.
We can shift all this to our lorentz space, $\mathbb{R}^4$ with the lorentz inner product,
by defining a 2-form on $\mathbb{R}^4$ as follows:
\[
F = B + E \wedge dt \tag{4.5.2}
\]
Writing this out in coordinate form with respect to the standard basis we get
\[
B = B_x \, dy \wedge dz + B_y \, dz \wedge dx + B_z \, dx \wedge dy
\]
and
\[
E \wedge dt = E_x \, dx \wedge dt + E_y \, dy \wedge dt + E_z \, dz \wedge dt
\]
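I hope you recall representing the 2-form
$3\,dx \otimes dx + 4\,dx \otimes dy - 2\,dy \otimes dx + 5\,dy \otimes dy$ on $\mathbb{R}^2$ as a matrix
\[
\begin{pmatrix} 3 & 4 \\ -2 & 5 \end{pmatrix}
\]
You will have verified that this operates on the two input vectors
\[
\begin{pmatrix} x \\ y \end{pmatrix}, \qquad \begin{pmatrix} u \\ v \end{pmatrix}
\]
by sending them to the number
\[
[x, y] \begin{pmatrix} 3 & 4 \\ -2 & 5 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}
\]
If you don't recall this or didn't do the exercise, do it NOW.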
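For anyone doing the exercise now, here is a small numerical companion. It assumes the expression is read as the bilinear map sending the pair (x, y), (u, v) to 3xu + 4xv - 2yu + 5yv, so that the dy dx entry is -2 (the sign is an assumption about the typeset original), and compares this with the row-matrix-column product. The function names and the sample vectors are illustrative.

```python
M = [[3, 4],
     [-2, 5]]


def apply_matrix(m, a, b):
    """The number a^T m b for vectors a, b in R^2."""
    return sum(a[i] * m[i][j] * b[j] for i in range(2) for j in range(2))


def apply_terms(a, b):
    """The same number computed term by term from the four basis products."""
    x, y = a
    u, v = b
    return 3 * x * u + 4 * x * v - 2 * y * u + 5 * y * v


a, b = (1, 2), (3, 4)
print(apply_matrix(M, a, b), apply_terms(a, b))  # 53 53
```

The entry in row i and column j of the matrix is the coefficient of the term pairing the i-th coordinate of the first vector with the j-th coordinate of the second.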

It is straightforward to verify that the 2-form F can be represented in the
same way on $\mathbb{R}^4$ by the matrix
\[
\begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{pmatrix}
\]
Exercise 4.5.1. Verify this on pain of death.

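It is now easy to compute dF. This will be a 3-form on $\mathbb{R}^4$.
We have:
\[
F = B + E \wedge dt, \qquad dF = dB + d(E \wedge dt)
\]
Doing the dB part first:
\[
\begin{aligned}
dB &= d(B_x \, dy \wedge dz + B_y \, dz \wedge dx + B_z \, dx \wedge dy) \\
&= \frac{\partial B_x}{\partial t} \, dt \wedge dy \wedge dz + \frac{\partial B_x}{\partial x} \, dx \wedge dy \wedge dz \\
&\quad + \frac{\partial B_y}{\partial t} \, dt \wedge dz \wedge dx + \frac{\partial B_y}{\partial y} \, dy \wedge dz \wedge dx \\
&\quad + \frac{\partial B_z}{\partial t} \, dt \wedge dx \wedge dy + \frac{\partial B_z}{\partial z} \, dz \wedge dx \wedge dy \\
&= \left( \frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z} \right) dx \wedge dy \wedge dz \\
&\quad + \frac{\partial B_x}{\partial t} \, dt \wedge dy \wedge dz + \frac{\partial B_y}{\partial t} \, dt \wedge dz \wedge dx + \frac{\partial B_z}{\partial t} \, dt \wedge dx \wedge dy
\end{aligned}
\]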
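Now doing the $d(E \wedge dt)$ part:
\[
\begin{aligned}
d(E \wedge dt) &= d(E_x \, dx \wedge dt + E_y \, dy \wedge dt + E_z \, dz \wedge dt) \\
&= \frac{\partial E_x}{\partial y} \, dy \wedge dx \wedge dt + \frac{\partial E_x}{\partial z} \, dz \wedge dx \wedge dt \\
&\quad + \frac{\partial E_y}{\partial x} \, dx \wedge dy \wedge dt + \frac{\partial E_y}{\partial z} \, dz \wedge dy \wedge dt \\
&\quad + \frac{\partial E_z}{\partial x} \, dx \wedge dz \wedge dt + \frac{\partial E_z}{\partial y} \, dy \wedge dz \wedge dt \\
&= \left( \frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} \right) dy \wedge dz \wedge dt \\
&\quad + \left( \frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x} \right) dz \wedge dx \wedge dt \\
&\quad + \left( \frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} \right) dx \wedge dy \wedge dt
\end{aligned}
\]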
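Collecting up both parts we get:
\[
\begin{aligned}
dF &= \left( \frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z} \right) dx \wedge dy \wedge dz \\
&\quad + \left( \frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} + \frac{\partial B_x}{\partial t} \right) dy \wedge dz \wedge dt \\
&\quad + \left( \frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x} + \frac{\partial B_y}{\partial t} \right) dz \wedge dx \wedge dt \\
&\quad + \left( \frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} + \frac{\partial B_z}{\partial t} \right) dx \wedge dy \wedge dt
\end{aligned}
\tag{4.5.3}
\]
If $dF = 0$ then each of the above four lines must be zero. The first line says
that div B = 0 in old fashioned language, and the last three say that curl
$E + \partial B / \partial t = 0$ in the same old fashioned language.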
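The four coefficients of dF can be evaluated numerically for a concrete field. The sketch below assumes a plane wave travelling along the x axis at unit speed, with E pointing along y and B along z, both equal to sin(t - x); this choice of field, the finite difference step and all names are assumptions made for illustration. Central differences stand in for the partial derivatives, so the coefficients come out as zero only up to rounding and truncation error.

```python
import math

T, X, Y, Z = 0, 1, 2, 3   # positions of t, x, y, z in a point (t, x, y, z)


def E(t, x, y, z):
    """Assumed electric field: a plane wave polarised along y."""
    return (0.0, math.sin(t - x), 0.0)


def B(t, x, y, z):
    """Assumed magnetic field: the matching wave polarised along z."""
    return (0.0, 0.0, math.sin(t - x))


def partial(field, comp, var, point, h=1e-5):
    """Central difference of one component of a field in one variable."""
    lo, hi = list(point), list(point)
    lo[var] -= h
    hi[var] += h
    return (field(*hi)[comp] - field(*lo)[comp]) / (2 * h)


def dF_coefficients(p):
    """The four coefficients of dF, in the order of equation (4.5.3)."""
    div_B = partial(B, 0, X, p) + partial(B, 1, Y, p) + partial(B, 2, Z, p)
    cx = partial(E, 2, Y, p) - partial(E, 1, Z, p) + partial(B, 0, T, p)
    cy = partial(E, 0, Z, p) - partial(E, 2, X, p) + partial(B, 1, T, p)
    cz = partial(E, 1, X, p) - partial(E, 0, Y, p) + partial(B, 2, T, p)
    return (div_B, cx, cy, cz)


point = (0.3, -1.2, 0.5, 2.0)   # an arbitrary event
print(all(abs(c) < 1e-6 for c in dF_coefficients(point)))  # True
```

Replacing the argument t - x by t + x in B alone breaks the last coefficient, which is a quick way to see that the test is not vacuous.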
In other words, we get two of the Maxwell equations out. This is encouraging
and leads us to feel that gluing E and B together into a single entity, the
2-form F, is a good idea. This is the physically significant thing of which the
magnetic field and the electric field are merely different aspects.
The next step is to express the other pair of Maxwell equations in the same
language.
This is where the $\star$ operator comes in. It is clear that if we apply $\star$ to F we get
another 2-form. When we calculate the matrix for it we get
\[
\begin{pmatrix} 0 & B_x & B_y & B_z \\ -B_x & 0 & E_z & -E_y \\ -B_y & -E_z & 0 & E_x \\ -B_z & E_y & -E_x & 0 \end{pmatrix}
\]
Exercise 4.5.2. Confirm this. The calculation is utterly trivial, all you need
to do is to organise your thoughts sensibly. Observe that the Hodge dual can
be memorised by mapping $E_j$ to $-B_j$ and $B_j$ to $E_j$. This looks very like
what we want for the other pair of Maxwell Equations in the classical form.
If we take the exterior derivative we get a 3-form on $\mathbb{R}^4$, and if we $\star$ it we
get a 1-form. We can represent the current as a 1-form on $\mathbb{R}^4$ by putting the
charge density in the zeroth place and using the other three places to give
the values for the current flow. This means we need to define J as the 1-form
\[
\rho \, dt + j_1 \, dx + j_2 \, dy + j_3 \, dz
\]
Whereupon we may write the other two Maxwell Equations out as
\[
\star(d(\star(F))) = J
\]
Exercise 4.5.3. Show that this does indeed amount to precisely the other
pair of Maxwell's Equations.
It is common to leave out all the parentheses and summarise the Maxwell
Equations, all together, in the form
\[
dF = 0 \tag{4.5.4}
\]
\[
\star d \star F = J \tag{4.5.5}
\]
Remark 4.5.1. You have to allow that this is rather cool. Compacting the
equations in this way gives us a much more elegant way to express the basic
facts of electromagnetism and should leave you feeling that it is more true to
the underlying reality than the classical form. If you had never met Maxwell's
Equations in the classical formalism and you had just met these for the first
time, you would, I think, find a good deal of charm in the conciseness, and
feel that the evidence for the electromagnetic field being a 2-form on a four
dimensional spacetime is overwhelming. The fact that it requires a lorentz
inner product to work properly is at least highly suggestive.
Exercise 4.5.4. How far could you get in rewriting the Maxwell Equations
(in terms of forms) with the standard inner product on $\mathbb{R}^4$? What changes
would you need?
4.6 Lorentz Invariance
The first issue to be addressed is to determine how 2-forms transform under
a transformation of coordinates.
Suppose B is a 2-form on $\mathbb{R}^n$ with a generalised inner product, and $A : \mathbb{R}^n \to \mathbb{R}^n$
is a diffeomorphism. Then take one coordinate system at the origin and
let another be obtained by performing A on it. Call S the coordinate system
at the origin, and think "ordered basis" for a concrete example. Then AS is the
second coordinate system. Think of A as a linear map, possibly a lorentz map,
to make this relatively concrete. Then a vector x in the coordinate system S
is read as $A^{-1}x$ in AS. Call it $x'$ to save space. That is, we have two ways of
talking about the same point in the space, two languages. Similarly, u and
$u' = A^{-1}u$ for a second vector.
Then if $B'$ is the correct transform of B in AS we shall have that $B'$ acts
on the ordered pair $x', u'$ to give the same number as B does on the ordered
pair x, u. This must be the case since the number we get out must not
depend on the language we are using to describe the points which exist and
are independent of the language.
If A is linear and S and hence $S' = AS$ are given by ordered bases, then we
can represent B by a matrix [B] and write B(x, u) as $x^T [B] u$. Similarly we
have $x'^T [B'] u'$ for the same number obtained by the transformed 2-form, and
these are equal for any choice of x and u. Since $x' = A^{-1}x$ and $u' = A^{-1}u$,
saying this in algebra:
\[
\forall x, u \in \mathbb{R}^n, \quad x^T (A^{-1})^T [B'] A^{-1} u = x^T [B] u
\]
This can only happen when $(A^{-1})^T [B'] A^{-1} = [B]$, which tells us that
\[
[B'] = A^T [B] A
\]
which gives us the transform of the matrix [B] representing the 2-form B.
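The transformation law can be tested against the field rules (4.4.3) and (4.4.4). With the dashed coordinates given by applying the inverse of A, the requirement that the dashed matrix give the same number on dashed vectors leads to the dashed matrix being A^T [B] A. The sketch below checks this for the first lorentz boost and for the matrix of F. The sign conventions in `field_matrix` (first row 0, -Ex, -Ey, -Ez, as obtained from F = B + E ^ dt with coordinates ordered t, x, y, z) and every name and number here are assumptions made for the example, so treat it as a consistency check rather than a derivation.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def matvec(m, x):
    return [sum(m[i][j] * x[j] for j in range(4)) for i in range(4)]


def pair(m, x, u):
    """The number x^T m u."""
    return sum(x[i] * m[i][j] * u[j] for i in range(4) for j in range(4))


def field_matrix(E, B):
    """Assumed matrix of F = B + E ^ dt in the coordinates (t, x, y, z)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, Bz, -By],
            [Ey, -Bz, 0.0, Bx],
            [Ez, By, -Bx, 0.0]]


def boost_x(theta):
    """The first lorentz boost."""
    c, s = math.cosh(theta), math.sinh(theta)
    return [[c, s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


E, B, theta = (1.0, 2.0, 3.0), (-0.5, 0.25, 4.0), 0.6
A, A_inv = boost_x(theta), boost_x(-theta)
F = field_matrix(E, B)
F_dashed = matmul(transpose(A), matmul(F, A))

# Same number in both languages: x' = A^{-1} x, u' = A^{-1} u.
x, u = [1.0, 2.0, -1.0, 0.5], [0.3, -0.7, 2.0, 1.0]
same_number = abs(pair(F_dashed, matvec(A_inv, x), matvec(A_inv, u))
                  - pair(F, x, u)) < 1e-9

# Read the dashed fields off the dashed matrix and compare with the rules.
c, s = math.cosh(theta), math.sinh(theta)
E_dashed = (F_dashed[1][0], F_dashed[2][0], F_dashed[3][0])
B_dashed = (F_dashed[2][3], F_dashed[3][1], F_dashed[1][2])
expected_E = (E[0], E[1] * c - B[2] * s, E[2] * c + B[1] * s)
expected_B = (B[0], B[1] * c + E[2] * s, B[2] * c - E[1] * s)
matches_rules = all(abs(a - b) < 1e-9
                    for a, b in zip(E_dashed + B_dashed, expected_E + expected_B))

print(same_number, matches_rules)  # True True
```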
Exercise 4.6.1. It is well known that for an orthogonal matrix A, $A^T = A^{-1}$.
What can you say about the transpose of a lorentzian matrix?
Exercise 4.6.2. The question arises, how much of this depends on linearity?
Obviously we have chosen to represent things by matrices, but the equation
makes sense in a much more general setting except possibly for the business
of the transpose, which arose from our determination to represent B by a
matrix. Suppose we express the 2-form relative to a basis in the usual way
as a sum of $dx^i \wedge dx^j$. What can be said about the expression of $B'$ relative
to the $dx'^i \wedge dx'^j$? How much if anything can be saved if we permit A to be
a diffeomorphism? Hint: investigate this in $\mathbb{R}^2 \setminus \{0\}$ with reference to the
polar coordinate transform.
The obsession which physicists and old style mathematicians have with matrix
representations of differential forms can obscure the basic simplicities. We
have already computed dF in standard terms as a 3-form in equation 4.5.3,
and it is simpler to investigate the lorentz transformations of both 2-forms
and 3-forms directly.
Let's do it for the 2-form F on $\mathbb{R}^4$ and the first lorentz boost.
We have that any 2-form on $\mathbb{R}^4$ is given by suitable linear combinations,
weighted sums, of the six terms $dt \wedge dx$, $dt \wedge dy$, $dt \wedge dz$, $dx \wedge dy$, $dx \wedge dz$,
$dy \wedge dz$. From the matrix representation for F we can read these off:
\[
\begin{aligned}
F = {} & -E_x \, dt \wedge dx - E_y \, dt \wedge dy - E_z \, dt \wedge dz \\
& + B_z \, dx \wedge dy - B_y \, dx \wedge dz + B_x \, dy \wedge dz
\end{aligned}
\]
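If we suppose that the first lorentz boost is used to transform the standard
basis in $\mathbb{R}^4$ to a new basis, what I shall call the dashed basis, then we need
the inverse map to transform the coordinates of a point (event) in $\mathbb{R}^4$ to the
new coordinates in the dashed frame. Thus we have
\[
\begin{pmatrix} c & -s & 0 & 0 \\ -s & c & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix} =
\begin{pmatrix} ct - sx \\ -st + cx \\ y \\ z \end{pmatrix}
\]
where c is short for $\cosh(\theta)$ and s for $\sinh(\theta)$, for any $\theta \in \mathbb{R}$.
Taking the exterior derivative we get
\[
\begin{pmatrix} dt' \\ dx' \\ dy' \\ dz' \end{pmatrix} =
\begin{pmatrix} c \, dt - s \, dx \\ -s \, dt + c \, dx \\ dy \\ dz \end{pmatrix}
\]
We have therefore that
\[
\begin{aligned}
dt' \wedge dx' &= dt \wedge dx; & dt' \wedge dy' &= c \, dt \wedge dy - s \, dx \wedge dy; \\
dt' \wedge dz' &= c \, dt \wedge dz - s \, dx \wedge dz; & dx' \wedge dy' &= -s \, dt \wedge dy + c \, dx \wedge dy
\end{aligned}
\]
and
\[
dx' \wedge dz' = -s \, dt \wedge dz + c \, dx \wedge dz; \qquad dy' \wedge dz' = dy \wedge dz
\]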
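These identities can be checked mechanically by storing a 1-form as its four coefficients on dt, dx, dy, dz and a wedge of two 1-forms as an antisymmetric array of coefficients. The sketch below assumes dt' = c dt - s dx and dx' = -s dt + c dx, with c and s the hyperbolic cosine and sine; the helper names and the sample parameter 0.8 are illustrative.

```python
import math


def wedge(a, b):
    """Coefficients of a ^ b: entry [i][j] is a_i b_j - a_j b_i."""
    return [[a[i] * b[j] - a[j] * b[i] for j in range(4)] for i in range(4)]


def combo(k1, w1, k2, w2):
    """The linear combination k1 w1 + k2 w2 of two 2-forms."""
    return [[k1 * w1[i][j] + k2 * w2[i][j] for j in range(4)] for i in range(4)]


def close(w1, w2, tol=1e-12):
    return all(abs(w1[i][j] - w2[i][j]) < tol
               for i in range(4) for j in range(4))


theta = 0.8
c, s = math.cosh(theta), math.sinh(theta)

# Undashed basis 1-forms, as coefficients on (dt, dx, dy, dz).
dt, dx, dy, dz = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
# Dashed 1-forms; dy' and dz' are the same as dy and dz.
dt_d = (c, -s, 0, 0)
dx_d = (-s, c, 0, 0)

print(close(wedge(dt_d, dx_d), wedge(dt, dx)))                                  # True
print(close(wedge(dt_d, dy), combo(c, wedge(dt, dy), -s, wedge(dx, dy))))       # True
print(close(wedge(dx_d, dz), combo(-s, wedge(dt, dz), c, wedge(dx, dz))))       # True
```

The first identity is just the statement that c squared minus s squared is 1.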
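Exercise 4.6.3. Find the expressions for the four basis elements of a three
form,
\[
dt' \wedge dx' \wedge dy', \quad dt' \wedge dx' \wedge dz', \quad dt' \wedge dy' \wedge dz' \quad \text{and} \quad dx' \wedge dy' \wedge dz'
\]
in terms of the undashed forms, $dt \wedge dx \wedge dy$ and so on.
Exercise 4.6.4. Calculate the exterior derivative dF in terms of the functions
$E_x, E_y, E_z, B_x, B_y, B_z$ and the basis elements
\[
dt \wedge dx \wedge dy, \quad dt \wedge dx \wedge dz, \quad dt \wedge dy \wedge dz \quad \text{and} \quad dx \wedge dy \wedge dz
\]
Your first term should be, using the shorter notation of the text book for
partial derivatives,
\[
(-\partial_y E_x + \partial_x E_y + \partial_t B_z) \, dt \wedge dx \wedge dy
\]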
In the dashed frame we have the same form as in the last exercise for $dF'$
except that we have $E'_x$, $E'_y$, $E'_z$, $B'_x$, $B'_y$, $B'_z$, partial derivatives of these
with respect to the dashed variables, and the usual suspects:
$$dt' \wedge dx' \wedge dy', \quad dt' \wedge dx' \wedge dz', \quad dt' \wedge dy' \wedge dz' \quad \text{and} \quad dx' \wedge dy' \wedge dz'$$
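Now we can replace the last four terms by the undashed terms.
We also have that the chain rule allows us to replace the $\partial_{x'} E'_y$ and similar
terms by their undashed translations:
$$[\partial_{t'} f,\ \partial_{x'} f,\ \partial_{y'} f,\ \partial_{z'} f] = [\partial_t f,\ \partial_x f,\ \partial_y f,\ \partial_z f]
\begin{pmatrix} c & s & 0 & 0 \\ s & c & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$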
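If you distrust the signs in the chain rule matrix, a finite-difference check settles it quickly. The sketch below assumes the dashed coordinates are $t' = ct - sx$, $x' = -st + cx$, so that the undashed ones are $t = ct' + sx'$, $x = st' + cx'$. The test function `f`, the value of `ALPHA` and the sample point are arbitrary choices made for the illustration, not anything from the text.

```python
import math

ALPHA = 0.7  # arbitrary sample boost parameter
C, S = math.cosh(ALPHA), math.sinh(ALPHA)
H = 1e-5     # step for central differences

def to_dashed(p):
    """Undashed coordinates (t, x, y, z) -> dashed coordinates."""
    t, x, y, z = p
    return (C * t - S * x, -S * t + C * x, y, z)

def to_undashed(q):
    """Dashed coordinates -> undashed coordinates (the inverse map)."""
    tq, xq, y, z = q
    return (C * tq + S * xq, S * tq + C * xq, y, z)

def f(t, x, y, z):
    """An arbitrary smooth test function of the undashed coordinates."""
    return math.sin(t + 2.0 * x) * math.exp(0.3 * y) + z * t * x

def f_dashed(tq, xq, yq, zq):
    """The same function, read off in the dashed coordinates."""
    return f(*to_undashed((tq, xq, yq, zq)))

def gradient(func, p):
    """Central-difference partial derivatives of func at the point p."""
    grads = []
    for i in range(4):
        hi, lo = list(p), list(p)
        hi[i] += H
        lo[i] -= H
        grads.append((func(*hi) - func(*lo)) / (2 * H))
    return grads

p = (0.4, -0.9, 1.3, 0.6)
ft, fx, fy, fz = gradient(f, p)

# Row vector [f_t, f_x, f_y, f_z] times the matrix with rows (c, s), (s, c).
predicted = [C * ft + S * fx, S * ft + C * fx, fy, fz]
measured = gradient(f_dashed, to_dashed(p))

error = max(abs(a - b) for a, b in zip(predicted, measured))
print(error)
```

The printed discrepancy should be at the level of the finite-difference error, which is what one expects if the matrix in the chain rule has all its entries positive.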
Exercise 4.6.5. Do this to obtain an expression for $dF'$ in terms of the
undashed symbols.
We finally have to confirm that
$$dF = 0 \Rightarrow dF' = 0$$
At first sight, the expression for $dF'$ is a mess, but a little thought shows it
is just a new $dF$ for a slightly different pair of electric and magnetic fields
which also satisfy Maxwell's equations.
Exercise 4.6.6. Checking with the forms of 4.4.3 and 4.4.4, show that
$$dF' = 0$$
This sequence of exercises has established that the vacuum Maxwell equations
are invariant under the first Lorentz boost.
Exercise 4.6.7. Confirm that this also works for the other two Lorentz boosts.
This is best done using a small amount of thought rather than a large amount
of algebra.
Since we have already seen that the vacuum Maxwell equations are invariant
under the special orthogonal subgroup, it follows that the equation
$$dF = 0$$
is invariant under the Lorentz group.
Now if ${\star}d{\star}F = 0$, which it does in the vacuum, it also follows that $d{\star}F = 0$,
since $\star$ takes zero forms to zero forms and ${\star}^2$ is a number, in fact $\pm 1$.
Exercise 4.6.8. Which?
So the same calculation will establish the invariance of the second equation.
After doing this we conclude that both the vacuum equations are invariant
under the Lorentz group.
Exercise 4.6.9. Show that $d{\star}F$ is also invariant under the three Lorentz
boosts and the special orthogonal group. Hint: this does not require doing it
all over again!
This result is absolutely astonishing. I shall now explain why.
4.6.1 Special Relativity
Newton's Laws of motion are about forces, which is to say accelerations
and masses if we look at the things we can actually measure directly. And
these appear at first sight to be invariant under the Galilean group. Certainly
Newton thought they were, although group theory not having been formalised
in his day, he wouldn't have put it in that way. Had the Lorentz group been
in existence, the possibility that the laws of motion were invariant under the
Lorentz group would have been regarded as a bizarre possibility too absurd
to waste time on, although one couldn't rule it out on experimental grounds
since speeds with which we are familiar are small compared with the velocity
of light.
Invariance of the laws of nature under the Galilean group explained why
we haven't found anywhere in the universe labelled "origin", a special point,
possibly with three orthogonal axes sticking out of it. It must have looked
unlikely that we will, and the fact that we have used coordinate systems and
bases of orthogonal vectors to talk about the universe led immediately to the
observation that this was merely a convenient language, and no particular
coordinate frame was better than any other; indeed, one could be moving
at constant velocity with respect to another and they were equally good.
Accelerating frames changed things, though, as one finds when looking at
tops and roundabouts and planets.
The fact that Maxwell's equations are not invariant under the Galilean group
but are under the Lorentz group produces some very strange results. One
is that the velocity of light is constant and does not depend on your own
velocity.
This is very unnerving indeed. If light is a wave motion like waves in water
then the wave velocity is a property of the water. If you are at rest with
respect to the water you get one answer, and if you are moving with respect
to the water you get another. This is not observed. If light goes like little
bullets from a source, then the velocity of the light is affected by the velocity
of the source with respect to any other observer. This is also not observed.
What happens when you travel fast away from the light is that the frequency
is shifted down, the colour changes, and if you travel towards it the frequency
is shifted up, the Doppler effect. But the velocity is not affected.
Figure 4.6.1: An experiment with two charges.
It gets worse. Suppose I am sitting in a frame of rulers with a clock and
describing what happens when I take two charged balls of opposite signs and
place them a small distance from each other, as in figure 4.6.1.
I have tied each charge to some fixed object and measured the tendency
of the charges to attract each other by measuring the stretch of a spring.
Since everything is pretty much the same except for the sign of the charge,
including the mass of the balls and the elasticity of the springs, there is a
high degree of symmetry in the arrangement and I expect the stretches to be
the same. I can measure these by reading off the number on the ruler where
the edge of the ball is.
Now you come zooming past me going into the picture at half the speed of
light.
You see the two balls, you see the extension of the spring, and you can
measure the electric and magnetic fields with some little test charge you
carry with you, and some little bar magnet. Your little bar magnet can be
thought of as just two charges orbiting about each other, close enough so
they have no net electric field to speak of, and fast enough so they produce a
magnetic field. You can also observe, just as I can, the numbers on the ruler
where the edges of the balls are, and we had better agree on these numbers.
The collision of penguins is a fact in any language that is not totally bizarre,
and the coincidence of edges of balls and numbers on an external ruler must
also be a property of the universe, not the language we choose to talk about
it.
In my framework there is no magnetic field at all, unless we count your
measuring apparatus. But in yours there are two charges coming towards
you at half the velocity of light so there has to be a magnetic field, because
moving charges have to produce one; Faraday said they did and so they
do. So the numbers you get for the electric field and the magnetic field will
be different from the numbers I get: yours will have a non-zero magnetic
component.
Exercise 4.6.10. Use the first Lorentz boost to work out what the rate of
exchange is.
We will both agree that the springs extend and the objects are attracting
each other. But our explanations of why will be different. You will have an
explanation which involves magnetic fields and mine won't.
In different configurations we may disagree about the extension of the springs
or the masses of the balls or the time duration between events, although we
must of course agree about whether an event occurs or not. If two balls
(or penguins) collide, we must agree that the event happens, this being a
property of the balls (or penguins), but the translation between our languages
may make our measurements of various forces disagree when our translation
is done using the Lorentz group.
The problem which was tackled in the early days of the twentieth century
was, can you have the mechanical part of nature invariant under the Galilean
group and the electromagnetic part invariant under the Lorentz group? As you
can see from our discussion on penguins and Einstein's covariance principle,
this amounts to having a different and incompatible language for different
aspects of the universe, and we measure the effects of fields by mechanical
means. So it does not make much sense to have two incompatible languages
for talking about the same thing. The Michelson-Morley experiment tried
to measure the velocity of the earth with respect to the luminiferous aether,
the whatever it was that waggled when light passed through it ("luminiferous"
just means "light bearing" and "aether" meant some weird stuff which spread
throughout the universe and had no other function than to bear light. In
particular it didn't obstruct or slow down mechanical things like planets or
penguins). This makes some sort of sense if there is one kind of invariance
for matter and another for electromagnetic fields. The answer seemed to
be that it was zero: the velocity of light does not depend on the velocity of the
observer measuring it. This is consistent with the Lorentz invariance of the
Maxwell equations, but not with the idea that you can get away with two
incompatible languages for talking about the world.
Given this there are only two possibilities: either Maxwell's laws are wrong
or Newton's are. Mostly people assumed that Maxwell was going to have
to lose out in the fight between the intellectual giants, mainly because they
were used to Newton, and Maxwell was the new kid on the block, although
it was hard to see how he was wrong.
Poincaré pointed out that the alternative was to suppose that everything was
invariant under the Lorentz group. Einstein worked out the consequences and
we had the special theory of relativity, and $E = mc^2$ is a trivial corollary.
Hence atom bombs and nuclear power stations. So today we use Galilean
invariance as a simple approximation when velocities are low, and Lorentz
invariance is taken to be right. For everything.
It is fashionable for philosophers to pontificate on science, and since scientists
are usually much too busy doing interesting things to bother much about
them, the philosophers have much more influence on the great unwashed
than they should. One line of argument goes: Einstein showed Newton
was wrong, no doubt someone will eventually show Einstein is wrong too,
so nothing is known for sure and all knowledge and belief is temporary. So
we might as well stay totally ignorant of science. In fact since all knowledge
is liable to error we can't be said to really know anything. And there is no
sense in pursuing truth if there is no possibility of catching it. I shall call
this the postmodern fallacy.
This certainly saves philosophers and others the trouble of learning about
tensors, or anything else complicated. The argument is very popular with
people who like to be thought of as intellectual but don't have much intellect.
Exercise 4.6.11. Explain, as to a philosopher, why the postmodern argument
sucks.
Exercise 4.6.12. Google "Michelson-Morley Experiment".
This has been a quick introduction to special relativity. I have essentially
followed the historical development of the ideas, whereas it is more usual to
give you the facts which have become known as the result of experiments
since. Facts there are in plenty and they support completely the invariance
under the Lorentz group of everything in the universe. Physicists tend to see
life as a huge collection of facts, mathematicians as a much smaller collection
of ideas. To mathematicians, reality is there to give us interesting things to
think about, and so we rely on physicists and engineers to find out how things
behave so we can make languages which describe them concisely. It has been
a very successful partnership and physics (and more recently engineering) has
forced us to produce some beautiful mathematics, some small amount of
which we have used thus far.
There is still lots left, as the next chapter will show.
Chapter 5
DeRham Cohomology:
Counting holes
First some observations on a few cultural issues. There are differences
between mathematicians and physicists which cause problems. I don't want to
overstate these, nor to leave anyone thinking I disapprove of either culture;
my first degree was in Physics and my Ph.D. in Pure Mathematics, and I
find both subjects wonderfully worth studying, but failure to confront issues
tends to make them harder to deal with, not easier. So some thoughts on the
differences may be worth putting up for your consideration. It should also
be noted that, seen from outside, the two cultures are so similar it is hard to
see any difference at all.
5.1 Cultural Anthropology
In the beginning of the twentieth century, the Mato Grosso was the great
unexplored jungle of the Amazon basin in South America. It was full of,
well, jungle, which we now call a rain forest (possibly to avoid giving offence
to jungles), and contained exotic animals and exotic tribes of people with
strange customs, such as shooting curare tipped darts at strangers.
Cultural anthropologists, anxious to study humanity in all its bizarre aspects,
visited these tribes in order to learn their ways; those who avoided the curare
tipped darts were able to return to civilisation and tell it about the customs
and manners of these fascinating people. One of the chief difficulties they
faced was the strange human ability to follow complicated rules without being
able to say what the rules actually are. This is obvious in language: a ten
year old has a good grasp of his native language and can follow incredibly
complex rules of grammar with no apparent difficulty. The conclusion that
French must be an easy language because lots of French children speak it is
not in fact the case. It is rather that they have internalised a huge number
of rules, but they don't know what they are. In order to learn French as an
adult, we tend to want to know the rules. It is no use asking a French child
to tell you. They don't know.
In the same way, it was no use asking a denizen of the Amazon basin what
their basic assumptions about the universe were. They had them, they
followed these assumptions, but they had been brought up in the culture and
couldn't articulate them. Part of the interest in exotic tribes is trying to
work out what those assumptions are, but there is no use asking the exotic
tribesmen. They learnt them at too early an age to realise they were making
them.
I once talked to an Australian engineer on a Japanese train, and (in front of
some of the Japanese staff) he expressed his amusement about the
introduction of flush toilets to Japan many years ago. They needed to have pictures
explaining to Japanese how to sit on the toilet seat, otherwise the more
hygienically conscious Japanese would stand on it and squat. He thought this
was very funny, because he presumably believed that the customary manner
of using a flush toilet is something people are born knowing. He thought
this because he had been potty-trained at an early age and had forgotten
the process. I doubt if his mother had. Modern Japanese toilets are so
complicated, they have to be explained to foreigners, so the Japanese have had
their revenge on ignorant Aussies.
These days the Mato Grosso is in the process of being turned into farms and
housing estates and the exotic tribesmen drive cars and drink coca-cola, so
there is not much point in a cultural anthropologist visiting the place. It
is much like back home. By way of compensation, there are other exotic
tribes being created. One of these is the tribe of theoretical physicists¹. Just
like the Amazonian Indians, they have their special language, their cultural
assumptions about the world. And just like the Amazonian Indians, they
don't actually know what they are.
¹ One cultural anthropologist actually spent some months with a group of physicists
but his report on their weird ways aroused little interest, perhaps because he couldn't
make as much sense of them as they could of him. This is a true story and not a joke.
Well, maybe it is a joke but it is also true.
This is where mathematicians come in. They are also a weird tribe, as you
may have noticed, but being professionally interested in rules they have a
much clearer idea of what those they follow actually are. And when they
study theoretical physics they find it necessary to articulate the assumptions
which the physicists make. It is no use asking the physicists: they do it
by training and reflex and don't even notice they are doing it. Learning
theoretical Physics as an adult is harder than learning French, and asking
French children is no help, as noted above.
So I am going to make some points which theoretical physicists would regard
as too obvious to talk about, and so don't.
5.2 Solutions
The Maxwell Equations are basically about a set of six functions from $\mathbb{R}^4$ to
$\mathbb{R}$, $E_x$, $E_y$, $E_z$, $B_x$, $B_y$, $B_z$, which correspond to things that can be measured
using particular instruments. In practice we can only sample these functions
discretely if there is something in nature producing them, or we can more or
less ignore reality and just write down the six functions. They are, in our
notation, the components of a 2-form on $\mathbb{R}^4$ and we take the Lorentz inner
product should it be necessary. It is possible to see this as a map from $\mathbb{R}^4$ to
$\mathbb{R}^6$. Any such map will define a suitable 2-form, and it is not unreasonable to
demand that we restrict ourselves to smooth functions and maybe analytic
functions.
Exercise 5.2.1. Why is it not unreasonable?
Now the Maxwell Equations impose conditions on these six functions. Not
every choice of six functions from $\mathbb{R}^4$ to $\mathbb{R}$ will satisfy them. In fact there must
be some space of solutions. We have from $dF = 0$ four conditions on these
functions and from $d{\star}F = 0$ another four. I am staying with the vacuum
equations for the time being. So we have a total of eight constraints on an
infinite dimensional space of functions, so we get some infinite dimensional
manifold of solutions. This is not much help.
The sad fact is we have only one solution to the vacuum Maxwell Equations,
which we got by guessing that a plane wave in space would do it. If you
write down
$$E_x(t, x, y, z) = 0, \quad E_y(t, x, y, z) = 0, \quad E_z(t, x, y, z) = C \sin(y - t)$$
for any real number $C$, then you are describing a (sine) wave with an electric
field which exists only in the $z$-direction and which travels at unit speed in
the $y$-direction. If we take the curl of this (with $C = 1$, to keep the formulas
short) we get, so Maxwell tells us,
$$-\frac{\partial B}{\partial t} = \begin{pmatrix} \cos(t - y) \\ 0 \\ 0 \end{pmatrix}$$
and integrating gives
$$B_x = \sin(y - t), \quad B_y = 0, \quad B_z = 0$$
which says the magnetic field is a plane wave also travelling along the $y$ axis,
being non-zero only in the $x$ direction.
Exercise 5.2.2. Draw a picture.
It is easy to verify that $\nabla \cdot E = 0$ and $\nabla \cdot B = 0$ and it remains only to show
that $\operatorname{curl} B = \partial_t E$, which is rather easy.
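For those who prefer to let a machine do the checking, here is a rough numerical sketch of the verification. It takes $C = 1$, uses units in which the speed of light is 1, and assumes the vacuum equations in the form $\nabla \cdot E = 0$, $\nabla \cdot B = 0$, $\operatorname{curl} E = -\partial_t B$, $\operatorname{curl} B = \partial_t E$. Derivatives are approximated by central differences, so the residuals come out small rather than exactly zero; the function names and sample points are made up for the illustration.

```python
import math

H = 1e-5  # step for central differences

def E(t, x, y, z):
    return (0.0, 0.0, math.sin(y - t))

def B(t, x, y, z):
    return (math.sin(y - t), 0.0, 0.0)

def partial(field, comp, var, p):
    """d(field[comp]) / d(coordinate var) at p, with var: 0=t, 1=x, 2=y, 3=z."""
    hi, lo = list(p), list(p)
    hi[var] += H
    lo[var] -= H
    return (field(*hi)[comp] - field(*lo)[comp]) / (2 * H)

def curl(field, p):
    # Components are indexed 0=x, 1=y, 2=z; coordinates 1=x, 2=y, 3=z.
    return (
        partial(field, 2, 2, p) - partial(field, 1, 3, p),
        partial(field, 0, 3, p) - partial(field, 2, 1, p),
        partial(field, 1, 1, p) - partial(field, 0, 2, p),
    )

def div(field, p):
    return sum(partial(field, i, i + 1, p) for i in range(3))

def time_derivative(field, p):
    return tuple(partial(field, i, 0, p) for i in range(3))

def residual(p):
    """Largest violation of the four vacuum equations at the event p."""
    faraday = max(abs(ce + db)
                  for ce, db in zip(curl(E, p), time_derivative(B, p)))
    ampere = max(abs(cb - de)
                 for cb, de in zip(curl(B, p), time_derivative(E, p)))
    return max(abs(div(E, p)), abs(div(B, p)), faraday, ampere)

events = [(0.0, 0.0, 0.0, 0.0), (0.3, -1.2, 2.5, 0.7), (1.9, 0.4, -0.8, 3.1)]
worst = max(residual(p) for p in events)
print(worst)
```

Swapping in a different pair of functions for `E` and `B` is a quick way of finding out that most guesses are not solutions.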
Of course we can get some more solutions out of this, in fact an infinite
number of them. For a start we can rotate things so that instead of travelling
along the $y$ axis it goes along any other line. Just apply any element of
$SO(3, \mathbb{R})$ to the above system and we get a new one which we already know
also satisfies the Maxwell equations. From a physical perspective it would
be astonishing if it didn't.
Better, apply a Lorentz transformation to the 2-form and get a larger class of
equivalent solutions.
For a second thing, we can change the frequency of the wave so it has more
oscillations in time and space.
And finally if we have two or more solutions we can add them and get another
solution.
Exercise 5.2.6. Verify this.
This gives us a lot more solutions, infinitely many more, but one has to feel
that they are not really all that different. Of course Fourier Theory tells us
that any reasonably well-behaved function can be approximated as a finite sum
of such things. On the other hand it is easy to construct functions which are
not solutions.
So we ask the question, how many other solutions are there? It is conceivable
that this is in fact all, and it is conceivable that there are squillions of other
families of solutions, where another family is obtained from any one solution
by doing some rotations, or Lorentz transformations, and scalings and sums.
One thing we note is that in going from the wave equation for the electric
field to obtain a magnetic field, we simplified the integral by making some
functions equal to zero.
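We had
$$\operatorname{curl} \begin{pmatrix} 0 \\ 0 \\ \sin(y - t) \end{pmatrix} =
\begin{pmatrix} \cos(y - t) \\ 0 \\ 0 \end{pmatrix} = -\frac{\partial B}{\partial t}$$
Integrating this gives
$$B = \begin{pmatrix} \sin(y - t) \\ A \\ B \end{pmatrix}$$
where $A$ and $B$ can be any functions which do not depend on $t$. It is natural
to cheat and make them zero, as I just did. Can we have any other possibility?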
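We would need to ensure that $\nabla \cdot B = 0$, which would force
$$\partial_y A + \partial_z B = 0$$
for a start. We also have that
$$\partial_t E = \begin{pmatrix} 0 \\ 0 \\ -\cos(y - t) \end{pmatrix}$$
would have to be $\operatorname{curl} B$, that is
$$\nabla \times \begin{pmatrix} B_x \\ B_y \\ B_z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -\cos(y - t) \end{pmatrix}$$
or
$$\begin{pmatrix} \partial_y B_z - \partial_z B_y \\ \partial_z B_x - \partial_x B_z \\ \partial_x B_y - \partial_y B_x \end{pmatrix} =
\begin{pmatrix} 0 \\ 0 \\ -\cos(y - t) \end{pmatrix}$$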
This would seem to give rather a lot of possibilities for $B$ other than the
simplest one we have considered.
Exercise 5.2.8. Does it? Find one or prove there aren't any.
Note how we got this present family. Basically, we guessed from knowing that
light travels through space as a wave and the velocity is 1 in our units and the
conjecture that light is an electromagnetic thing, that a wave would work,
and wow, it did. Checking to see if a function from $\mathbb{R}^4$ to $\mathbb{R}^6$ will satisfy
Maxwell's Equations is very simple; actually finding one by some process
other than guessing is a different story. And that makes the assumption
that we should look in a space of analytic or at least smooth functions; in
practice we are going to be using the elementary functions because these are
the ones we can easily write down and differentiate. Why should the universe
be so kind as to use the functions we find easy to write down? What if some
important physical process depended on functions we can't write down as a
small sum of elementary functions? What, if anything, could be said about
it?
Exercise 5.2.9. Think about this. Have we just been dead lucky with light?
The question as to whether there are any other solutions to the vacuum
equation outside finite sums of Lorentz transforms of the wave solution merits
a little thought.
The physicist will surely observe that there are bound to be solutions to any
non-vacuum problem. Take any configuration of moving charges. Specify
them by elementary functions $E_x$, $E_y$, $E_z$ where possible. Then we can
hope to derive, in any coordinate frame in which the data is specified, the
corresponding magnetic fields. This requires merely some differentiation and
integration, leaving some unknown functions provided by the integration
stage. Now the physicist "knows" that there is a solution: his reasoning
is that the universe will surely provide one, so it must be there to be found.
Indeed he will believe it is unique up to the transformations of coordinates,
since the universe doesn't toss up between options. This, of course, assumes
that the Maxwell equations are true, which physicists do indeed believe. In
the main.
The question of why physicists feel happy to restrict themselves largely
to the analytic elementary functions, which I invited you to ponder a while
back, and the question of why physicists are so confident about being able to
prove uniqueness and existence of solutions are explained by two essentially
philosophical positions which go back to Newton.
The first can be summarised by the old adage "If something is ineffable,
there's no point trying to eff it", and perhaps also "If something can't be
detected it isn't there". If a function that was zero around the planet earth
was non-zero somewhere else, first it could not be represented by an analytic
function and second, we would have no way of knowing by local measurements
of any precision that it existed, so there is no point in wasting thinking
time about it, and if a function that can't be written down is essential to
understanding something then we are never going to understand it, so again
forget about the possibility.
The second can be summarised by the principle that if you have a theory
which accounts for the phenomena, commit yourself to it until either someone
comes up with a simpler or more encompassing theory or you run into facts
which are in conflict with it, in which case bend the theory minimally to
accommodate the new facts. The more committed you are to the theory, the
more likely you are to discover such facts. Pondering "what ifs" is a waste of
time.
The belief that the universe does not toss up but is consistent and hence
provides us with unique solutions is again a philosophical position. One can
argue that it is justified in various ways: it can be productive because if we
get lots of solutions we can look for extra conditions to force uniqueness and
usually find them. Recall the exercise you did on the magnetic field for a
wire carrying a current.
Most physicists regard these metaphysical convictions as so obvious that they
never bother to mention them. Much like the French children.
5.3 Infinite Variety
Giving the curl of a field and asking for a solution is, as you will have
discovered, difficult because there are so many solutions. In differential form
notation we have $dX = Y$ where $X$ is a 1-form and $Y$ is a 2-form. Now $d$ is
linear, so it follows that if $d\omega = 0$ then whenever $X$ is a solution, so is $X + \omega$.
And since $d^2 = 0$, if $f$ is any differentiable function whatever, $X + df$ is a
solution.
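The claim that $X$ and $X + df$ have the same exterior derivative is easy to test numerically in the plane. The sketch below is only an illustration: the 1-form $X = xy\, dx + x^2\, dy$ and the function $f(x, y) = y^2 \sin x$ are arbitrary choices of mine, and $dX$ is estimated by central differences, so agreement is up to a small numerical error.

```python
import math

H = 1e-5  # step for central differences

def X(x, y):
    """Coefficients (P, Q) of the 1-form X = P dx + Q dy; here dX = x dx^dy."""
    return (x * y, x * x)

def df(x, y):
    """Exact differential of f(x, y) = y**2 * sin(x), as coefficients (f_x, f_y)."""
    return (math.cos(x) * y * y, 2.0 * y * math.sin(x))

def X_plus_df(x, y):
    (p, q), (fx, fy) = X(x, y), df(x, y)
    return (p + fx, q + fy)

def d_coefficient(form, x, y):
    """Coefficient of dx^dy in d(P dx + Q dy), namely dQ/dx - dP/dy."""
    dq_dx = (form(x + H, y)[1] - form(x - H, y)[1]) / (2 * H)
    dp_dy = (form(x, y + H)[0] - form(x, y - H)[0]) / (2 * H)
    return dq_dx - dp_dy

points = [(0.3, -1.1), (1.7, 0.4), (-2.0, 2.5)]
gap = max(abs(d_coefficient(X, x, y) - d_coefficient(X_plus_df, x, y))
          for x, y in points)
print(gap)
```

Any other smooth `f` with its differential written out by hand should give the same tiny gap, which is the whole point.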
What does this do to the physicist's conviction that solutions have to be
unique on the physical grounds that the universe does not toss up? There
are two things one might do, and physicists do both of them. One is to
impose extra conditions which force uniqueness. Another is to declare the
difference between different solutions as an artefact of the language and deny
that it is physically significant. In the former case they explain that the
universe has some rather unexpected preferences, often for continuous functions,
and in the second they glory in the freedom that they get to choose arbitrary
functions to suit their convenience, seeing no objection to making different
choices at different times. If a mathematician has noticed that they often
do the first and then unexpectedly do the second, and points out the
inconsistency, they express surprise and a certain contempt that mathematicians
lack the courage to follow them. You will find this attitude in the text book
section on Gauge Freedom. I have not been able to get physicists to agree
that consistency in how they resolve multiple solutions to physical problems
is particularly desirable, although they insist that the universe shows
consistency. This appears to be a religious conviction, possibly derived from
Newton who believed (a) that God had created the universe and (b) that
God was not small-minded enough to be inconsistent or to try to fool us.
Quantum mechanics might have given him spiritual indigestion, as would
some of the practices of his intellectual heirs. But then, Newton
considered himself a philosopher first and a mathematician second, and Physics or
indeed Science hadn't been invented as a separate subject in his time.
Again, physicists remain happily oblivious to the underlying assumptions
in their practice, or the great majority of them do. Extracting them for
inspection is time consuming, but I haven't found a quicker way of making
sense of their work. And it is sufficiently interesting and important work to
justify the effort.
5.4 Gauge Freedom
We have seen that guessing a solution and then verifying it is the quick and
easy way, but it presumes that we are good at guessing, or equivalently that
the solution is simple. It doesn't seem safe to rely on this. So it is reasonable
to introduce other assumptions, some on physical grounds, some in a spirit
of optimism.
We know that $d^2 = 0$ and so when given $dB = 0$ it is tempting to consider
the possibility that the reason $dB$ is equal to zero is that $B = dX$ for some
unknown 1-form $X$. Such a thing is known in the literature as a vector
potential. We also know that it is far from unique: adding $df$ for any function
(0-form) $f$ will give an equally good $X$. This is precisely analogous to having
a constant of integration crop up: again we might feel inclined to fix it in
a physical situation by imposing an extra condition, as when we solve an
ODE, or we may feel that it gives us a glorious freedom to choose one that is
convenient, or we may decline to make a choice at all. In the case of vector
potentials, the practice of physicists is to glory in the freedom and call it gauge
freedom. A similar situation exists when we obtain a potential function for
a physical situation, when adding in an arbitrary constant will not change
the force field which is its gradient. Physicists sometimes insist that physical
constraints such as ensuring the potential goes to zero at infinity suffice to
get rid of the ambiguity, but they do not usually feel any such compulsion
in the case of the vector potential. Just what exactly is physically real and
what is an artefact of language is never precisely specified². This allows
physicists to spout manifest drivel. I was once assured that there were $4\pi$
lines of force coming out of a unit charge, and on pointing out that this was
roughly $12\tfrac{4}{7}$ lines, and asking what $4/7$th of a line looked like, I was
reproved for being too literal. Clearly one wasn't supposed to ask what things
meant, one was simply being instructed in the right things to say, whether it
made sense or was total bullshit. Thus do subcultures maintain a wall against
outsiders: there's a lot of it about.
² This creates real problems. One of my lecturers at Imperial College told the class,
rather sadly, that when he was starting on a PhD, he had come up with what he saw as
a very interesting problem. His kindly supervisor had assured him that it wasn't a real
problem, but an artefact of language. Someone else, perhaps with a less assured or less
kindly supervisor, had assumed it was real, done the research, and become famous as a
result. I suppose one moral to be derived is that you shouldn't trust your supervisor. The
conclusion I derived was that physicists are not at all clear as to what is real and what
isn't. This surprised me at the time, but I was very young.
Exercise 5.4.1. Listen to some conversation between your friends and
decide how much of what is said is carrying information about the world which
could be translated into a foreign language and remain intelligible, as in "Your
dress is transparent in the sunlight", and how much is comprehensible only
after a large number of extra propositions have also been translated, and possibly
not then, as in "All cultures are equally valid in their own terms".
You will note that the mathematical subculture has a quite different set of
underlying assumptions from those of most of the rest of the human race.
One is that assertions have to make sense and should, if possible, be true
or derivable from other assertions which are either true or clearly stated to
be assumptions. Many students come to university with a quite different
assumption: that what is to be said is anything that has been approved by
authority. Whether it is true, false or totally meaningless is of no importance.
Answering an examination question is done by taking a few half recalled
fragments from lectures and gluing them together with bullshit. No doubt
this works well in the schools, and perhaps in other university departments,
but mathematicians really don't like it. As I am sure you have noticed by
now.
Exercise 5.4.2. What other fundamental but usually unstated assumptions
characterise mathematical culture?
For the time being I shall simply go along with the physicists, but point out
any oddities while doing so.
5.5 Exact and Closed forms
I suggested earlier that given that $dF = 0$ for a 2-form $F$, we could get this
result if $F = dX$, using the well known result that $d^2 = 0$.
Definition 5.5.1. A form $Y$ which satisfies the condition $dY = 0$ is said to
be closed.
Definition 5.5.2. A $k$-form $Y$ such that there exists a $(k-1)$-form $\omega$ such
that $Y = d\omega$ is said to be exact.
Then we have the result that:
Proposition 5.5.1. Every exact form is closed.
Proof: $d^2 = 0$.
What about the other way around? Is every closed form exact? The answer
is interesting: it depends completely on topological properties of the space
on which the form is defined. You might think that this is interesting only if
you are a topologist; it has however some important implications for Physics.
The idea is explained clearly enough in Chapter Six of the text book, which
does it first for the case when $X$ is a 1-form on $\mathbb{R}^2$. To say that $dX = 0$ is
to say that the field, corresponding to $X$ when we use the inner product to
change to an equivalent vector field, has zero curl. The question then is, is
it a potential field? Is it the gradient of a scalar field $f : \mathbb{R}^2 \to \mathbb{R}$?
We can try to construct one by the simple process of taking some point
$a \in \mathbb{R}^2$ and declaring $f(a) = 0$. To get, for any other point $b$, a credible value
of $f(b)$, we take a path from $a$ to $b$ and integrate the vector field along the
path. This tells us the amount of work the vector field does along the path.
We can put a minus sign in if we feel like, but hey, who cares? Now this will
certainly give a value of $f(b)$ but the obvious problem is that if we took a
different path, we might get a different answer. In general, we would. Your
second year exercises in the course of doing Stokes' Theorem should have
convinced you of this.
If however the curl of the field is zero, then the integral around any closed loop
is zero. This follows immediately from Stokes' Theorem in the plane, otherwise
known as Green's Theorem. And this means that the value of $f(b)$
cannot depend on the path, because two paths between fixed points when
joined together give a closed loop. So we can take this as a sensible value
for $f(b)$ because it
depends only on the vector field and the point $a$.
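The path-independence claim is easy to test numerically. The following is a minimal sketch, not part of the text book's treatment: the two paths, the two example fields and the helper name `line_integral` are all assumptions chosen for illustration. It integrates a zero-curl field and a non-zero-curl field along two different paths from $a = (0, 0)$ to $b = (1, 2)$.

```python
import math

def line_integral(P, Q, path, n=2000):
    """Integrate P dx + Q dy along path(t), t in [0, 1], using chord midpoints."""
    total = 0.0
    x0, y0 = path(0.0)
    for i in range(1, n + 1):
        x1, y1 = path(i / n)
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)   # midpoint of the chord
        total += P(xm, ym) * (x1 - x0) + Q(xm, ym) * (y1 - y0)
        x0, y0 = x1, y1
    return total

# Two different (hypothetical) paths from a = (0, 0) to b = (1, 2).
def straight(t):
    return (t, 2.0 * t)

def curved(t):
    return (t * t, 2.0 * math.sin(0.5 * math.pi * t))

# Zero curl: 2xy dx + x^2 dy is d(x^2 y), so both paths should give f(b) = 2.
def closed_P(x, y):
    return 2.0 * x * y

def closed_Q(x, y):
    return x * x

# Non-zero curl: -y dx + x dy.
def curl_P(x, y):
    return -y

def curl_Q(x, y):
    return x

closed_values = [line_integral(closed_P, closed_Q, p) for p in (straight, curved)]
curl_values = [line_integral(curl_P, curl_Q, p) for p in (straight, curved)]
print(closed_values, curl_values)
```

For the zero-curl field the two values agree up to discretisation error; for the other field they differ, which is the problem described above.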
Exercise 5.5.1. Show it depends on the point $a$ only up to an additive constant: in other words if I choose $a$ and you choose $a'$, your function $f'$ and
my function $f$ will differ by a constant.
Exercise 5.5.2. Translate this into the language of 1-forms, 2-forms and
0-forms on $\mathbb{R}^2$.
This seems to give us the following:
Proposition 5.5.2. Every closed 1-form on $\mathbb{R}^2$ is exact.
Proof: Just construct the 0-form as indicated. Then it is trivial to verify
that $d$ of the 0-form is the given 1-form.
This seems perfectly reasonable and hasn't seemed to involve us in any topology, so I shall now give what looks like a counterexample to the last proposition:
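Proposition 5.5.3. The 1-form
$$X = \frac{y}{x^2+y^2}\,dx - \frac{x}{x^2+y^2}\,dy$$
is closed but not exact.
Proof:
First the closed part.
$$\begin{aligned}
dX &= \frac{\partial}{\partial y}\left(\frac{y}{x^2+y^2}\right) dy \wedge dx - \frac{\partial}{\partial x}\left(\frac{x}{x^2+y^2}\right) dx \wedge dy \\
&= \frac{2(x^2+y^2) - 2(x^2+y^2)}{(x^2+y^2)^2}\, dy \wedge dx \\
&= 0
\end{aligned}$$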
Now suppose $X = df$ for some function (0-form) $f$. Then it would follow that
the integral of $X$ around the unit circle is zero, since starting at $a = (1, 0)^T$
and proceeding in the positive direction would give us $f(a) - f(a) = 0$ for
the integral, by definition of the construction of $f$. But a glance at the vector
field shows this is wrong. We have unit length vectors against us each step of
the way, so the integral is $-2\pi$. Check it by doing the algebra if the geometric
argument fails to carry conviction.
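For readers who would rather let a machine do the algebra, here is a minimal numerical sketch of both halves of the proof. The function names and the sample point are assumptions made for illustration: `curl` estimates the $dx \wedge dy$ coefficient of $dX$ by central differences, and `circle_integral` integrates $X$ once around the unit circle in the positive direction.

```python
import math

def P(x, y):
    """Coefficient of dx in X."""
    return y / (x * x + y * y)

def Q(x, y):
    """Coefficient of dy in X."""
    return -x / (x * x + y * y)

def curl(x, y, h=1e-5):
    """Central-difference estimate of dQ/dx - dP/dy, the dx ^ dy coefficient of dX."""
    dQ_dx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
    dP_dy = (P(x, y + h) - P(x, y - h)) / (2 * h)
    return dQ_dx - dP_dy

def circle_integral(n=10000):
    """Integrate X once around the unit circle, anticlockwise, starting at (1, 0)."""
    total = 0.0
    dt = 2 * math.pi / n
    for i in range(n):
        t = (i + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx_dt, dy_dt = -math.sin(t), math.cos(t)
        total += (P(x, y) * dx_dt + Q(x, y) * dy_dt) * dt
    return total

print(curl(0.3, -1.2))        # close to zero away from the origin
print(circle_integral())      # close to -2*pi, not zero
```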
So there ain't no such $f$, and $X$ is not exact.
And something has gone horribly wrong.
Exercise 5.5.3. Can you see what? Stop now and try to work out why this
result is not, as at first appears, in conflict with the penultimate proposition
that said that every closed 1-form on $\mathbb{R}^2$ is exact. Warning: I am about to
give the game away on the next page, so stop now and work it out.
The answer is of course obvious once you have seen it. The 1-form
$$X = \frac{y}{x^2+y^2}\,dx - \frac{x}{x^2+y^2}\,dy$$
is not defined on $\mathbb{R}^2$. It is defined and smooth on $\mathbb{R}^2 \setminus \{0\}$. This is $\mathbb{R}^2$ with
a hole in it. The hole completely destroys the argument, because Green's
Theorem, Stokes' in the plane, doesn't work if there is a hole in the region.
The boundary of a disc with a hole in it consists of both the bounding circle
and the point at the hole. Ignoring missing points screws up everything.
You should be warned that evil people, I suspect physicists, have the bad
habit of writing the negative of this form as $d\theta$. You can see why they do it,
but you have to deplore their moral and mathematical muddle.
Exercise 5.5.4. Why do they do it? You might like to consider the function
which takes a point in the plane, writes it in polar coordinates and sends it
to $\theta$. What happens if you take the exterior derivative of this 0-form?
The removal of a point of $\mathbb{R}^2$ makes a mess of the result that all closed forms
are exact. The argument works however for subsets of $\mathbb{R}^2$ which don't have
any holes in. One hole is enough to bugger things up.
Exercise 5.5.5. Show this.
Exercise 5.5.6. What about the corresponding case of closed 1-forms on
$\mathbb{R}^3 \setminus \{0\}$? Are they always exact? After all, if we have a loop in $\mathbb{R}^3 \setminus \{0\}$ we
need to find a surface with the loop as boundary which does not contain 0,
in order to use the classical Stokes' Theorem. This will allow the argument
to go through even in $\mathbb{R}^3 \setminus \{0\}$. And such surfaces are always there: we have
lots of extra room and can deform smoothly any bad surface that contains the
origin until it doesn't.
It might occur to you to wonder if it goes on in the same way: does closed
imply exact on $\mathbb{R}^n$ in general? Investigating in the simplest case, $\mathbb{R}^2$, we
know that the only 3-form is zero so every 2-form on $\mathbb{R}^2$ is closed. This
would suggest that if it is true, every smooth function on $\mathbb{R}^2$ has a smooth
vector field of which it is the divergence.
Exercise 5.5.7. Is this indeed the case? If so prove it, if not give a coun-
terexample.
Exercise 5.5.8. Show that every 3-form on $\mathbb{R}^3$ is the exterior derivative of
a 2-form.
Exercise 5.5.9. What about closed 2-forms on $\mathbb{R}^3$? The required theorem
we would need is obviously more complicated since we have to construct a
1-form not just a function. Do it for the 2-form $dx \wedge dy + dx \wedge dz + dy \wedge dz$.
The last exercise will show there is a certain amount of slack and that we
can make some choices. It would be nice however to have a more systematic
approach.
To do this, let's look at 1-forms on $\mathbb{R}^2$ which are exact and see if we can
be systematic about getting the potential function $f$. Suppose we have a
1-form
$$\omega = P\,dx + Q\,dy$$
We can take the origin as a starting point and look to see what we get if we
integrate along a path from $0$ to the point $\mathbf{x}$. Rather than talk about any
old path, let's do it with a straight line. Then the line is the set of points $t\mathbf{x}$
for $t \in [0, 1]$ and we get that the path integral of $\omega$ along this path is
$$\int_0^1 \left( P\,\frac{dx}{dt} + Q\,\frac{dy}{dt} \right) dt \quad \text{where } \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}$$
or
$$\int_0^1 \bigl( P(tx, ty)\,x + Q(tx, ty)\,y \bigr)\, dt$$
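The straight-line integral above is simple to evaluate by machine. The following is a minimal sketch under stated assumptions: the helper names `simpson` and `potential` are hypothetical, and the example 1-form $2xy\,dx + x^2\,dy$ (which is $d(x^2 y)$) is chosen for illustration rather than taken from the exercises.

```python
def simpson(g, n=200):
    """Composite Simpson's rule for the integral of g over [0, 1]; n must be even."""
    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

def potential(P, Q, x, y):
    """Integrate P dx + Q dy along the straight line t -> (t x, t y), 0 <= t <= 1."""
    return simpson(lambda t: P(t * x, t * y) * x + Q(t * x, t * y) * y)

# Assumed example: the closed 1-form 2xy dx + x^2 dy, which is d(x^2 y).
def P(x, y):
    return 2 * x * y

def Q(x, y):
    return x * x

value = potential(P, Q, 1.0, 2.0)   # should agree with x^2 y = 2 at the point (1, 2)
print(value)
```

Because the form in this example is exact, the number returned is the value at $(x, y)$ of a potential that vanishes at the origin.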
Exercise 5.5.10. Evaluate this for $\mathbf{x} = [1, 2]^T$ and the 1-form $x\,dx + y\,dy$.
Exercise 5.5.11. Evaluate this for $\mathbf{x} = [1, 2]^T$ and the 1-form $-y\,dx + x\,dy$.
Note that this is not closed.
The result, for any point $\mathbf{x}$, is a number which we can call $f(\mathbf{x})$. I shall call
it $I(\omega)(\mathbf{x})$ and use $I(\omega)$ instead of $f$. The reason is that $I$ goes in the
opposite direction to the exterior derivative so $I$ (for exterior Integral?) seems
a reasonable symbol to use.
So we have an operator $I$ from 1-forms to 0-forms which makes sense on $\mathbb{R}^n$
and always gives an answer whether the 1-form is closed or not. And we
observe that when $\omega$ is closed, then $dI(\omega) = \omega$ so $\omega$ must be exact.
Can we get from 2-forms to 1-forms by a similar process? We investigate
the simplest case of $\mathbb{R}^2$ and a nice simple 2-form. Let us start by taking
the constant 2-form $2\,dx \wedge dy$. We want to do some integrating to obtain a
suitable 1-form $I(2\,dx \wedge dy) = P\,dx + Q\,dy$. Since all 2-forms on $\mathbb{R}^2$ are closed
we would rather like to have $d(P\,dx + Q\,dy) = 2\,dx \wedge dy$.
There is a fair bit of slack here. We would have
$$\partial_x Q - \partial_y P = 2$$
and we would need to make up our minds about how to split up the 2 between
the two contributions. Let's make them equal. Then we would have
$$\partial_x Q = 1; \qquad \partial_y P = -1$$
We can integrate both these equations to get
$$Q(x, y) = x; \qquad P(x, y) = -y$$
or
$$I(2\,dx \wedge dy) = -y\,dx + x\,dy$$
and checking confirms that this works: $d(-y\,dx + x\,dy) = 2\,dx \wedge dy$.
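The check is a two-line computation, written out here for completeness using only $d(g\,dx) = dg \wedge dx$ and the antisymmetry $dy \wedge dx = -dx \wedge dy$:

```latex
\begin{align*}
d(-y\,dx + x\,dy) &= -\,dy \wedge dx + dx \wedge dy \\
                  &= dx \wedge dy + dx \wedge dy \\
                  &= 2\,dx \wedge dy .
\end{align*}
```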
Had we chosen some other way to split the number 2 up between the two
terms, we would have got another equally good 1-form: there is no shortage
of them.
Exercise 5.5.12. Try it. Make one term zero. Or 1. Now look at the
various 1-forms which have constant exterior derivative $2\,dx \wedge dy$. What can
you say about their difference?
Now we try to make the process look more like an operator $I$ taking 2-forms
to 1-forms. First we split the elements up in equal amounts to be definite.
Then we integrate along a path as for the case of 1-forms. I write
$$I(2\,dx \wedge dy) = \left( \int_0^1 t \cdot 2x\, dt \right) dy - \left( \int_0^1 t \cdot 2y\, dt \right) dx \tag{5.5.1}$$
The term $t$ is in there to make sure we divide by 2, which we can regard as
sharing the contributions out equally.
Exercise 5.5.13. Suppose we do the same with some more complicated 2-form which is not constant, such as $\omega = (x^2 + y^2)\,dx \wedge dy$. Can you see how to
fix things up to obtain the 1-form $I(\omega)$ by modifying equation 5.5.1 appropriately?
Exercise 5.5.14. Can you make it work for 2-forms on $\mathbb{R}^3$? Try it on closed
2-forms first. Then try it on a 2-form $\omega$ that is not closed, and also try to
make it work for the 3-form $d\omega$. Notice anything?
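If you have been good and virtuous and done the sequence of exercises above
you will be prepared to believe that we can construct for any $k$-form $\omega$ on
$\mathbb{R}^n$, $k > 0$, a $(k-1)$-form $I\omega$, also on $\mathbb{R}^n$, given by:
$$I\omega(\mathbf{x}) = \sum_{i_1 < \cdots < i_k} \sum_{\alpha=1}^{k} (-1)^{\alpha-1} \left( \int_0^1 t^{k-1}\, \omega_{i_1 \cdots i_k}(t\mathbf{x})\, dt \right) x^{i_\alpha}\, dx^{i_1} \wedge \cdots \wedge \widehat{dx^{i_\alpha}} \wedge \cdots \wedge dx^{i_k} \tag{5.5.2}$$
where the $\widehat{dx^{i_\alpha}}$ means this term is omitted.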

This is undoubtedly a bit messy, which is why I gave the sequence of exercises.
If you prefer memorising things to understanding them, the very best of luck.
Note that $I$ takes the zero $k$-form to the zero $(k-1)$-form.
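To see the general formula in action in the smallest interesting case, here is a minimal sketch for 2-forms $g\,dx \wedge dy$ on $\mathbb{R}^2$, where the double sum has just two terms and gives $I\omega = c\,(x\,dy - y\,dx)$ with $c(x, y) = \int_0^1 t\,g(tx, ty)\,dt$. The helper names and the test 2-form $(xy + 1)\,dx \wedge dy$ are assumptions for illustration; the check that $d(I\omega) = \omega$ is done by central differences.

```python
def simpson(f, n=200):
    """Composite Simpson's rule for the integral of f over [0, 1]; n must be even."""
    h = 1.0 / n
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

def I_two_form(g):
    """Return (P, Q) with I(g dx^dy) = P dx + Q dy, the k = 2, n = 2 case of the formula."""
    def c(x, y):
        return simpson(lambda t: t * g(t * x, t * y))
    def P(x, y):
        return -c(x, y) * y
    def Q(x, y):
        return c(x, y) * x
    return P, Q

def d_one_form(P, Q, x, y, h=1e-4):
    """Central-difference estimate of the dx^dy coefficient of d(P dx + Q dy)."""
    return ((Q(x + h, y) - Q(x - h, y)) - (P(x, y + h) - P(x, y - h))) / (2 * h)

def g(x, y):
    """Assumed test 2-form: (xy + 1) dx ^ dy."""
    return x * y + 1.0

P, Q = I_two_form(g)
print(d_one_form(P, Q, 0.7, -0.4), g(0.7, -0.4))   # the two numbers should agree
```

Feeding in the constant function 2 recovers the 1-form $-y\,dx + x\,dy$ of equation 5.5.1.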
It is now possible to prove the Poincaré lemma:
Theorem 5.5.1 (Poincaré Lemma). If a region $U \subseteq \mathbb{R}^n$ is star-shaped
with respect to the origin and if $\omega$ is a smooth $k$-form defined on $U$, then
there is a smooth $(k-1)$-form $I\omega$ defined on $U$ and
$$\omega = d(I\omega) + I(d\omega)$$
It follows that if $\omega$ is closed then it is exact, since then $I(d\omega) = I(0) = 0$
and so $\omega = d(I\omega)$.
Proof: This is a thoroughly horrid calculation which is done on page 95
of Michael Spivak's Calculus on Manifolds. You have probably worked out
what the term "star-shaped with respect to the origin" means: if a point is
in the set $U$ so is every point on the line segment joining that point to the
origin.
Exercise 5.5.15. Show that we can get the same result for any star-shaped
subset $U \subseteq \mathbb{R}^n$ where $U$ is star-shaped with respect to any point.
Exercise 5.5.16. Show that if a region $U \subseteq \mathbb{R}^n$ is diffeomorphic to any
star-shaped subset, then the result still holds for all smooth $k$-forms on $U$.
Exercise 5.5.17. Give some examples of subsets $U$ in $\mathbb{R}^n$ which are not
diffeomorphic to star-shaped regions.
5.6 Homotopies
Recall from 3P0 the idea of a homotopy:
Definition 5.6.1. We say that two continuous maps $f, g : X \to Y$ where
$X, Y$ are topological spaces are homotopic iff there is a continuous map
$F : X \times I \to Y$, where $I = [0, 1]$, such that $\forall x \in X$, $F(x, 0) = f(x)$ and $\forall x \in X$, $F(x, 1) = g(x)$.
In such a case we write $f \simeq g$.
Exercise 5.6.1. Show that homotopy is an equivalence relation on the con-
tinuous maps from X to Y .
In other words, we can change $t$ continuously from 0 to 1 and interpolate
between $f$ and $g$. If $X$ is the space consisting of a single point, $\ast$, then to say
that two maps, $f, g$ from $\ast$ to $Y$ are homotopic is to say that $f(\ast)$ and $g(\ast)$
can be connected by a continuous path joining them. So we can say that:
Definition 5.6.2. A space $Y$ is path connected (or 0-connected) iff every
two maps from $\ast$ to $Y$ are homotopic.
Or equivalently, we say $Y$ is 0-connected iff every constant map to $Y$ is
homotopic to every other constant map.
This can be extended considerably:
Definition 5.6.3. A space $Y$ is simply connected or 1-connected iff every
map $f : S^1 \to Y$ is homotopic to a constant map.
Definition 5.6.4. A space $Y$ is $k$-connected iff every map $S^k \to Y$ is homotopic to a constant map.
You should be warned that some writers use the term $k$-connected to mean
what I call $n$-connected for every $n \in [0 : k]$. In my sense,
Proposition 5.6.1. The circle, $S^1$, is 0-connected but not 1-connected.
Proof:
To see this we make use of the exponential map $\exp : \mathbb{R} \to S^1$, $t \mapsto e^{2\pi i t}$.
If we take a map $f : [0, 1] \to S^1$ with $f(0) = f(1)$ we can regard $f$ as a map
from $S^1$ to $S^1$. Using the fact that exp is locally a diffeomorphism, we can
lift $f$ to $\tilde{f} : [0, 1] \to \mathbb{R}$ with $\exp \circ \tilde{f} = f$. If, without loss of generality, we
assume $f(0) = f(1) = [1, 0]^T$ then we can fix $\tilde{f}(0) = 0$ and observe that $\tilde{f}(1)$
must be an integer. This integer is called the winding number of $f$.
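The lifting construction can be imitated numerically. The sketch below is an illustration under assumptions, not the proof: it represents points of $S^1$ as complex numbers of modulus 1, samples the loop finely enough that consecutive points are less than half a turn apart, and adds up the small changes of angle to build the lift. The function name `winding_number` and the sample loops are hypothetical.

```python
import cmath
import math

def winding_number(f, n=1000):
    """Lift a loop f : [0, 1] -> S^1 through t -> e^{2 pi i t} and return the lift at t = 1.

    The lift starts at 0 and is built by summing small changes of angle, each
    measured in turns; this assumes n is large enough that consecutive sample
    points are less than half a turn apart.
    """
    lift = 0.0
    previous = f(0.0)
    for i in range(1, n + 1):
        current = f(i / n)
        lift += cmath.phase(current / previous) / (2 * math.pi)
        previous = current
    return round(lift)

def identity_loop(t):
    return cmath.exp(2j * math.pi * t)

def triple_loop(t):
    return cmath.exp(2j * math.pi * 3 * t)

def backwards_loop(t):
    return cmath.exp(-2j * math.pi * t)

def constant_loop(t):
    return 1 + 0j

print([winding_number(f) for f in (identity_loop, triple_loop, backwards_loop, constant_loop)])
```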
Exercise 5.6.2. Draw a picture. Confirm that it is always possible to chop
$[0, 1]$ into small enough bits so that exp has a smooth inverse on the image
by $f$ of each bit. Explain precisely how $\tilde{f}$ is constructed.
It is not hard to show that the winding number is a homotopy invariant,
which is to say that if two maps are homotopic then they have the same
winding number, and also that if they have the same winding number they
are homotopic.
It follows that there is no homotopy between the identity map and a constant
map from $S^1$ to itself, and hence that $S^1$ is not 1-connected.
Exercise 5.6.4. Finish the argument.
Exercise 5.6.5. Show that $S^2$ is path connected and 1-connected but not
2-connected.
Exercise 5.6.6. Show that $S^n$ is $k$-connected for $0 \le k < n$ but not $n$-connected.
Definition 5.6.5. If $f : X \to Y$ and $g : Y \to X$ are continuous maps and if
$g \circ f \simeq I_X$ and $f \circ g \simeq I_Y$, where $I_X$ and $I_Y$ are the identity maps, then we say that $X$ and $Y$ have the same homotopy
type, and $f$ is a homotopy equivalence.
Exercise 5.6.7. Show that having the same homotopy type is an equivalence
relation on topological spaces. Show that $\mathbb{R}^n$ has the homotopy type of a one
point space, and that $S^k$ has the homotopy type of $S^n$ iff $k = n$.
Exercise 5.6.8. Show that $\mathbb{R}^n \setminus \{0\}$ has the homotopy type of $S^{n-1}$ for $n \ge 1$.
The last exercise has as an almost immediate corollary that if $\mathbb{R}^2$ has any
holes in it, the resulting space is not simply connected.
Exercise 5.6.9. Show this.
Let $A$ denote any compact subset of $\mathbb{R}^2$. Now it is immediate that if two
loops in $\mathbb{R}^2 \setminus A$ are homotopic, and if $\omega$ is any closed 1-form on $\mathbb{R}^2 \setminus A$, then
the integral of $\omega$ over the first loop is equal to the integral over the second.
It follows that we can say that:
Theorem 5.6.1. If a manifold is connected and simply connected then every
closed 1-form on it is exact.
Proof:
If $\omega$ is a closed 1-form on a connected and simply connected manifold $M^n$,
then the integral around any loop is zero since the loop is homotopic to a
constant map. Hence the integral along any path between any pair of end
points does not depend on the path. To compute the integral along any path
we take a chart containing one end point and take a point along the path
which is in the domain of the same chart, shift the 1-form and the path to
$\mathbb{R}^n$ by means of the chart and compute the integral in $\mathbb{R}^n$. Do this for a
set of points along the path until we have the whole path, and add up the
part integrals to get the value of the integral for the whole path. Putting the
function $I(\omega)$ equal to zero at the starting point and the value of the integral
at the finishing point defines $I(\omega)$ at the end point. We can do this for every
end point on $M^n$. The argument that $dI(\omega) = \omega$ takes place in $\mathbb{R}^n$ and is
trivial.
5.7 Counting Holes
The text book does an excellent job of explaining how we have a vector space
$Z^p(M)$ of closed $p$-forms on $M$ and another $B^p(M)$ of exact $p$-forms and we
know $B^p(M) \subseteq Z^p(M)$. So we can form the quotient vector space
$$H^p(M) := Z^p(M)/B^p(M)$$
This measures the number of $p$-holes in $M$. If you have troubles with quotient
spaces, take $\mathbb{R}^n$ and $\mathbb{R}^m$ with $m < n$, take an embedding of $\mathbb{R}^m$ in $\mathbb{R}^n$ by
a linear map, and look at the quotient object, which should look a lot like
$\mathbb{R}^{n-m}$.
There are a number of ways of computing the cohomology of spaces, not
necessarily manifolds, and hence a number of different cohomology theories.
In a sense, and up to a choice of a coefficient group, they all give the same
answers. This takes us further into algebraic topology than I am game to go
in this course, but you should know that much. If you want to know more
algebraic topology, do the unit in second semester. You can find my notes
on the web, which may or may not help.
Exercise 5.7.1. Do exercise 98 on page 125 of the text book. Read the section
carefully.
5.8 More Cultural Anthropology
There is a section in the text book on the Bohm-Aharonov effect which should
be of keen interest to cultural anthropologists. The effect is a quantum
mechanical phenomenon of some interest.
The text book explains that physicists get some insight into the effect by
visualising an infinitely long core on which is wound a helical wire which the
authors wrongly call a spiral. Then it appears that the fact that the region
outside the wire is not simply connected accounts for the effect happening.
They then go on to admit that in fact the wire is not infinite so the space
outside is in fact simply connected. Since the wires are normally joined via
a generator or battery so as to produce a current in the wire, one might feel
that they were right the first time. But one can certainly visualise a very
long coil, say a light-year length of wire, and a humungous charge at one
end which attracts the electrons towards it. Then if we neutralise the charge
with an equal and opposite one (producing a humungous flash, perhaps) the
electrons will be released to head down the coil. After about nine months
one could conduct the experiment to detect the effect somewhere about the
middle of the coil. Presumably it would be observed to happen despite the
fact that the coil has ends and the complementary space is in fact simply
connected.
Think about this. It is claimed that physicists get insight into why something
happens based on an assumption which is in fact false. It is rather like
claiming that you get some insight into why human beings have two legs by
observing that horses have four legs so the rear half of a horse has two. If
you were told this, you might point out that the claim is made by a person
who might in fact be a horse's rear-end, but since you are not, it does not in
fact contribute noticeably to your understanding.
Us coarse, crude mathematicians have a technical term for this sort of thing:
we call it bullshit.
It might be that some kind of sense can be made of this, and it would be
interesting to see it done, even more interesting to try to actually do it.
One is left with the impression that to a physicist, mathematics is there in
two roles, one is to supply a means of doing the computations and the other is
as a sort of mnemonic for remembering the rules for doing them. Mnemonics
do not have to make sense, and generally don't.
To a mathematician, the rules are there because they reflect the way the
universe works, and they have to make sense. Either the universe does in
fact work this way in which case the rules are right and we may trust our
calculations, or it doesn't and they are unreliable. One may, of course, have
only uncertain knowledge of which of these states of affairs actually obtains.
Taking a punt on it being right and then examining reality closely and
discovering if our sums agree with our measurements is usually felt to be the
way to go. Talking pure bullshit, even if it is the same bullshit as that uttered
by the rest of the tribe, doesn't cut it. In the creative phase of development
of an idea, some haziness is allowable, indeed necessary. But bullshit is always
a bad idea. And removing the haziness is crucial to progress. Incorporating
it into your subject is popular among people like publicists and politicians,
where a career built on a foundation of bullshit is quite common, but it is
disappointing to find it in Physics.
It wouldn't be quite so bad if physicists understood that what they are doing
here is rather silly. Like the arts students who feel quite proud of their
inability to use logic and announce that they are not to be constrained by
mere rules and consistency, the price paid is that nobody else will trust their
arguments. Long, long ago, physicists understood that bullshit is baaaaaad.
Some of them these days do not. So do civilisations crumble.
5.9 Summary
There are serious problems for a mathematician trying to understand physics,
many of them put in place by physicists, who have a very different notion
of what constitutes an explanation. Nevertheless it is a fascinating and re-
warding subject.
I have worked through most of part I of the text book and would like to
have got much further. It is possible for the interested reader to tackle the
next two sections, and I would encourage you to do this. You will certainly
come to the conclusion that understanding Physics entails getting a grasp of
an awful lot of contemporary mathematics. You might find it easier to do it
the physicist's way, which involves knowing a lot of facts and stringing them
together with algebra in a rather muddled manner, or you might find it better
to understand the mathematics first. This is probably to be determined by
how much brainpower you have versus the extent of your memory.
The remaining chapter headings represent a pious hope of how far I would
have liked to get but probably won't. As time permits I shall continue finishing the material but I doubt if we will get any further this semester. Maybe
we want a post-graduate unit on it.
Chapter 6
Lie Groups
6.1 Introduction and Motivation
6.1.1 The rest of the course
The next few chapters will treat the machinery needed to deal with part Two
of the text book. There are a number of elements of this. The first is a study
of some Lie Groups which will require a small amount of group theory and
a brief return to the tensor algebra; the second is a study of vector bundles,
in particular G-bundles, where the bundle structure is specified by a group.
This is known to physicists as the gauge group of the bundle. It tells us
how to glue things together in order to build a bundle from trivial bundles.
This will lead to the Yang-Mills equation as a generalisation of the Maxwell
Equations for force fields other than the electromagnetic.
6.2 Introduction to Lie Groups
I discussed Lie Groups briefly in the second year algebra unit. They were
all matrix groups, and hence mostly subspaces of the general linear group
$GL(n, \mathbb{R})$, which we can think of either as the space of all invertible linear
maps from $\mathbb{R}^n$ to itself, or as the space of $n \times n$ invertible matrices with
real entries. The exceptions were barely mentioned subgroups of $GL(n, \mathbb{C})$
which is either the space of invertible linear maps from $\mathbb{C}^n$ to itself or the
space of $n \times n$ invertible matrices with complex entries. Which definition you
prefer is a matter of taste; I prefer to think of the linear maps as being more
fundamental and regard the matrices as handy devices for representing the
linear maps in a convenient form for computation.¹ In this course however
I shall usually write $GL(n, \mathbb{R})$ for the matrices and $\mathrm{Aut}(\mathbb{R}^n)$ for the linear
automorphisms (isomorphisms with itself) of $\mathbb{R}^n$.
Notable among these groups were the Orthogonal groups, O(n, R) and the
Special Orthogonal groups SO(n, R), the Unitary groups, U(n, C) and the
Special Unitary groups SU(n, C). GL(n, R) is of course a vector space, in
fact an algebra because we have a multiplication, not usually commutative,
obtained by composing the maps or, equivalently, multiplying the matri-
ces. It is obvious that the group property of the Lie Groups is that of the
multiplication, but that if we add two orthogonal matrices the result is not
an orthogonal matrix, so the Lie Groups are not vector spaces. They are
however smooth manifolds, and hence have a dimension.
To see that they are manifolds, the easy way is to note that for all the above
examples, an element of the matrix group is defined by putting a bunch of
smooth conditions on the elements of the matrix. For example, to get $O(2, \mathbb{R})$
we take the space of all $2 \times 2$ matrices with real entries,
$$\begin{bmatrix} x & u \\ y & v \end{bmatrix}$$
and require the conditions:
$$x^2 + y^2 = 1, \quad u^2 + v^2 = 1, \quad xu + yv = 0$$
This gives us three independent conditions on four numbers so we expect,
or at least hope, to have one degree of freedom left and a one dimensional
manifold. This is a rather sloppy discussion of an application of the implicit
function theorem, which you need to remind yourself of. And the implicit
function theorem is a generalisation done locally of the rank nullity theorem.
Which you know from second year. I hope.
Let's do two cases in agonising detail. First the unit circle, because it is so
easy.
¹ This is probably related to the fact that I don't particularly enjoy doing sums, but I
do like understanding the ideas which tell me how to do them. This often requires me to
do sums, but I prefer to do the bare minimum. Of course, if the ideas didn't tell me how
to do the sums, I should suspect them of being metaphysical tosh, so I do believe that
sums are, or at least the fact that they can be done is, important.
The Implicit Function Theorem deals with the zero set of a function $f : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$
which is differentiable at a point $(a, b) \in \mathbb{R} \times \mathbb{R}$. Think $f(x, y) = x^2 + y^2 - 1$,
$Df(a, b) = [\partial_x f(a, b), \partial_y f(a, b)] = [2a, 2b]$. It tells us that when the derivative
with respect to $y$ is invertible, we can represent the zero set of $f$ locally as the
graph of a curve $y = g(x)$ for a differentiable $g$. The derivative with respect
to $y$ is invertible when it is non-zero which happens everywhere except at
$y = 0$, $x = \pm 1$ in the case of $f(x, y) = x^2 + y^2 - 1$. And in this case, if we
swap $x$ and $y$ we can express the curve as the graph of a map from $y$ to
$x$. Since in either case the rank of the derivative is one and also we have
the curve is locally (that is, in a neighbourhood of the point) a graph of a
differentiable function, then we have the conclusion that at every point of
the zero set of $f$ where the rank of the derivative is one, the zero set of $f$ is locally
diffeomorphic to an interval.
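For the circle the local graph can be written down explicitly, which makes the statement easy to check by machine. The following is a minimal sketch, assuming we work on the upper half of the circle where $\partial_y f \neq 0$; the function names are hypothetical. It compares the slope given by implicit differentiation, $g'(x) = -\partial_x f / \partial_y f = -x/y$, with a numerical derivative of $g(x) = \sqrt{1 - x^2}$.

```python
import math

def f(x, y):
    """The function whose zero set is the unit circle."""
    return x * x + y * y - 1.0

def g(x):
    """Local solution y = g(x) of f(x, y) = 0 on the upper half of the circle."""
    return math.sqrt(1.0 - x * x)

def g_prime_implicit(x):
    """Slope from implicit differentiation: -(df/dx)/(df/dy) = -x/y, valid when y != 0."""
    return -x / g(x)

def g_prime_numeric(x, h=1e-6):
    """Central-difference estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

print(f(0.6, g(0.6)), g_prime_implicit(0.6), g_prime_numeric(0.6))
```

At $x = \pm 1$ the formula $-x/y$ breaks down because $y = 0$ there, which is exactly where the text swaps the roles of $x$ and $y$.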
The generalisation of this which we need is the Implicit Function Theorem
which I give in what may be a new (manifold) form:
Theorem 6.2.1. If $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$ is differentiable and
$$M = \{(x, y) \in \mathbb{R}^n \times \mathbb{R}^m : f(x, y) = 0\}$$
then if $(a, b) \in M$ is such that $\operatorname{rank} Df(a, b) = m$, there is a neighbourhood $U$ of $(a, b)$ in $M$ which is diffeomorphic to an open ball in $\mathbb{R}^n$.
Exercise 6.2.1. Find the version of the implicit function theorem you are
used to and verify that it is equivalent to the form given.
An even more useful form is:
Theorem 6.2.2. If $f : \mathbb{R}^n \to \mathbb{R}^m$ with $n \ge m$ is a smooth map and
$$M = \{x \in \mathbb{R}^n : f(x) = 0\}$$
then if $\operatorname{rank} Df = m$ on $M$, $M$ is a smooth manifold of dimension $n - m$.
This is somewhat stronger than the classical Implicit Function Theorem and
the idea is intuitively appealing: locally we have that $f$ may be approximated
by an affine map the linear part of which is $Df$, and if $Df : \mathbb{R}^n \to \mathbb{R}^m$ is onto
then the kernel has dimension $n - m$. And in a neighbourhood, the graph
of the derivative is diffeomorphic to the graph of $f$. In other words it is the
rank-nullity theorem and the fact that the derivative is a good approximation
to the function in a sufficiently small neighbourhood.
In the case of $f(x, y) = x^2 + y^2 - 1$ it is easy to verify that the rank of $Df$
is never zero on the solution set so must always be one.
Now to do the same with the orthogonal group $O(2, \mathbb{R})$: we have
$$f : \mathbb{R}^4 \to \mathbb{R}^3, \quad (x, y, u, v)^T \mapsto (x^2 + y^2 - 1,\; u^2 + v^2 - 1,\; xu + yv)^T$$
and
$$Df = \begin{bmatrix} 2x & 2y & 0 & 0 \\ 0 & 0 & 2u & 2v \\ u & v & x & y \end{bmatrix}$$
and the rank of $Df$ is 3 on $M$ so $M$ is a smooth manifold (since $Df$ is
smooth) and has dimension 1.
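The rank claim can be spot-checked numerically. The sketch below is an illustration, not a proof: it evaluates $Df$ at a handful of sample points of $M$, parametrised as rotations and reflections (the parametrisation and the helper names `point_on_M` and `rank` are assumptions), and computes the rank by Gaussian elimination.

```python
import math

def Df(x, y, u, v):
    """The 3 x 4 derivative matrix of f at (x, y, u, v)."""
    return [[2 * x, 2 * y, 0.0, 0.0],
            [0.0, 0.0, 2 * u, 2 * v],
            [u, v, x, y]]

def rank(rows, tol=1e-9):
    """Rank of a matrix by Gaussian elimination with partial pivoting."""
    m = [list(r) for r in rows]
    r = 0
    for col in range(len(m[0])):
        if r == len(m):
            break
        pivot = max(range(r, len(m)), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < tol:
            continue                      # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def point_on_M(theta, reflect=False):
    """A point of O(2, R): first column (x, y), second column (u, v)."""
    x, y = math.cos(theta), math.sin(theta)
    u, v = (y, -x) if reflect else (-y, x)
    return x, y, u, v

ranks = [rank(Df(*point_on_M(t, r))) for t in (0.0, 0.7, 2.0, 4.5) for r in (False, True)]
print(ranks)
```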
Exercise 6.2.2. Do it for $O(n, \mathbb{R})$. Show that the condition that a matrix
be in $O(n, \mathbb{R})$ forces the determinant to be $\pm 1$, and deduce the dimension of
$SO(n, \mathbb{R})$.
To get $SO(2, \mathbb{R})$ we need another condition, namely $xv - uy = 1$. This might
lead you to suspect that $SO(2, \mathbb{R})$ is a zero dimensional manifold, but the
fact is that the constraints are not independent, and we may deduce from
the first three that $xv - uy = \pm 1$. This means that $O(2, \mathbb{R})$ is disconnected
and $SO(2, \mathbb{R})$ is one component of it. And the argument from second year
shows that $SO(2, \mathbb{R})$ is diffeomorphic to the unit circle as a manifold. So
$O(2, \mathbb{R})$ is diffeomorphic to two circles. Aren't you glad we did not stipulate
that our manifolds had to be connected?
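One way to see that the three conditions force $xv - uy = \pm 1$ is the following identity, which holds for all real $x, y, u, v$ (expand both sides to check it):

```latex
\begin{align*}
(x^2 + y^2)(u^2 + v^2) &= (xu + yv)^2 + (xv - uy)^2 .
\end{align*}
```

On $M$ the left hand side is $1 \cdot 1$ and the first term on the right is $0$, so $(xv - uy)^2 = 1$.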
Exercise 6.2.3. First show that for a manifold (not necessarily smooth or
even differentiable) connected implies path-connected. Then show that if we
have a Lie group, the connected component containing the identity is a Lie
subgroup.
Other Lie Groups can be obtained in essentially the same way as $O(2, \mathbb{R})$
by imposing conditions on linear maps or matrices: at one end we have all
$GL(n, F)$ which is the space of all linear maps from $F^n$ to itself which are
invertible, where $F$ is any field. Then we can restrict ourselves to the case
where $F = \mathbb{C}$ or $F = \mathbb{R}$ which is less than adventurous but still more than
enough to require some thought. We can stipulate that the determinant be
1, which will ensure that the measure is unchanged, to get $SL(n, \mathbb{R})$; we can
insist that some generalised inner product with signature $(k, n-k)$ on $\mathbb{R}^n$ be
preserved to get what we call $O((k, n-k), \mathbb{R})$. We can restrict to determinant
1 in addition, which requires us to put S for Special in front of the name.
And we can more or less repeat using $\mathbb{C}$ instead of $\mathbb{R}$, except we use the term
unitary instead of orthogonal. And we can do much of it all over again using
finite fields.
You will note that $SO((3, 1), \mathbb{R})$ is what we have called the Lorentz group.
This suggests an extension: we could take any of the groups regarded as operations on $\mathbb{R}^n$ which preserve the origin and affinise them by also allowing
shifts. This will increase the dimension of the group by $n$ since there are $n$
independent directions in which we can do the shifting. If we do this to the
Lorentz group we get the Poincaré group.²
² This allows the writer of the Wikipedia article on the Lorentz group to start off by
defining the Lorentz group as a subgroup of the Poincaré group, probably the least helpful
definition one could imagine. Perhaps he wrote the article on the Poincaré group and
defined it as an extension of the Lorentz group.
Exercise 6.2.4. Show that the cartesian product of two Lie Groups is a Lie
Group.
This gives us enough Lie Groups to be going on with.
Exercise 6.2.5. Do the exercises 1 to 10 in chapter one of part two of the
text book.
6.3 Group Representations
6.3.1 Introduction
Recall, from second year, that an abstract group is merely a collection of
things which can be multiplied and divided to give other things in the collection. This statement is usually made more precise by giving three axioms:
Definition 6.3.1. A group is a set $G$ and a binary operation $\bullet : G \times G \to G$
(with the operation $\bullet(g, h)$ usually written in infix notation as $g \bullet h$) such
that
1. $\forall a, b, c \in G$, $(a \bullet b) \bullet c = a \bullet (b \bullet c)$
2. $\exists e \in G$, $\forall a \in G$, $a \bullet e = e \bullet a = a$
3. $\forall a \in G$, $\exists a^{-1} \in G$, $a \bullet a^{-1} = a^{-1} \bullet a = e$
We can now give a formal definition of a Lie group:
Definition 6.3.2. A Lie group is a group $G$ which is also a smooth manifold
such that the maps $\mathrm{inv} : G \to G$, $g \mapsto g^{-1}$ and
$$\bullet : G \times G \to G$$
are smooth, where $\bullet$ is the multiplication in the group.
You will have already worked this out from contemplating the examples. I
hope.
Definition 6.3.3. A Lie group homomorphism is a homomorphism between
Lie Groups which is a smooth map between the manifolds.
Exercise 6.3.1. Verify that all the Lie groups discussed are indeed Lie
groups.
We are also interested in abelian groups which also satisfy the condition:
4. $\forall a, b \in G$, $a \bullet b = b \bullet a$
Abstract groups are sometimes difficult to work with and so we often use a
representation of the group which means that the elements of the group become represented by matrices and the group action by matrix multiplication.
Thus we may take the rather forlorn group $\mathbb{Z}_2$ which has only two elements,
usually written 0 and 1, we replace $\bullet$ by $+$ since the group is abelian, and we
seek to represent 0 by the identity matrix, 1 by some other matrix, and $+$
by matrix multiplication. There are a lot of possible choices. For example
we can choose the $2 \times 2$ matrix pair
$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$$
which clearly works. Such a thing is called a representation of dimension 2.
There is a rather simpler representation of dimension 1 which you should be
able to see almost instantly.
Exercise 6.3.2. Write it down!
Formally,
Definition 6.3.4. A real representation of a group $G$ of dimension (or degree)
$n$ is a homomorphism from $G$ into $GL(n, \mathbb{R})$.
and
Definition 6.3.5. A Lie group representation of a Lie group $G$ is a Lie group
homomorphism from $G$ into $GL(n, \mathbb{R})$.
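The homomorphism condition in Definition 6.3.4 can be checked mechanically for a two dimensional representation of $\mathbb{Z}_2$. This is a minimal sketch under an assumption: the non-identity element is taken to be represented by the diagonal matrix with entries $-1$ and $1$, which swaps the positive and negative x-axis; the names `rho`, `FLIP` and `is_homomorphism` are hypothetical.

```python
def matmul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

IDENTITY = [[1, 0], [0, 1]]
FLIP = [[-1, 0], [0, 1]]       # assumed matrix for the non-identity element

rho = {0: IDENTITY, 1: FLIP}   # candidate representation of Z_2 = {0, 1}

def is_homomorphism(rep):
    """Check rep(a + b mod 2) == rep(a) rep(b) for all a, b in Z_2."""
    return all(rep[(a + b) % 2] == matmul(rep[a], rep[b])
               for a in (0, 1) for b in (0, 1))

print(is_homomorphism(rho))
```

The only non-trivial requirement is that the matrix for 1 squares to the identity, so any matrix with that property would serve equally well.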
The theory of complex representations, where we go into $GL(n, \mathbb{C})$, is much
simpler, and we shall find that there is a strong preference for complex representations in the books. There is also some interest to physicists in Quaternionic representations where the quaternions, $\mathbb{H}$, are the step beyond $\mathbb{C}$.
Just as $\mathbb{C}$ is a two dimensional field, $\mathbb{H}$ is a four dimensional field, actually
not a field but a skew-field since the multiplication does not commute.
Exercise 6.3.3. Define $\mathbb{H}$ as the set of quartets $a + bi + cj + dk$, with
$a, b, c, d$ real, where $i, j, k$ are meaningless symbols satisfying the rules
$i^2 = j^2 = k^2 = -1$ and $ij = k$, $jk = i$, $ki = j$. Assuming everything
distributes in a sensible way, show the result is a skew field. Go back to the
M213 notes to see this done for $\mathbb{C}$ if hopelessly lost.
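The multiplication rules in Exercise 6.3.3 can be checked mechanically before you prove anything. What follows is only a sketch and not part of the original notes: it assumes a quaternion is stored as a 4-tuple (a, b, c, d) standing for a + bi + cj + dk, and the helper names qmul, qconj, qnorm2 and qinv are made up for the illustration.

```python
from fractions import Fraction

# Quaternions as 4-tuples (a, b, c, d) meaning a + b*i + c*j + d*k.
# All names below are illustrative assumptions, not notation from the notes.

def qmul(p, q):
    """Multiply using i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qconj(q):
    """Conjugate: flip the sign of the i, j, k parts."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def qnorm2(q):
    """Squared norm a^2 + b^2 + c^2 + d^2."""
    return sum(x * x for x in q)

def qinv(q):
    """Inverse of a non-zero quaternion with integer or rational entries."""
    n = qnorm2(q)
    return tuple(Fraction(x, n) for x in qconj(q))

ONE = (1, 0, 0, 0)
MINUS_ONE = (-1, 0, 0, 0)
I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
```

Multiplying I by J and then J by I shows directly that the multiplication does not commute, and qinv shows that every non-zero element has an inverse, which are the two features that make the result a skew field and not a field.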
Exercise 6.3.4. Just as there are analogues of the orthogonal groups over $\mathbb{C}$
called the unitary groups, there are analogues over $\mathbb{H}$ called the symplectic
groups. Construct one as a group of quaternionic matrices. Construct the one
dimensional complex group U(1, C) as a group of real $2 \times 2$ matrices and the
corresponding symplectic group as a group of real $4 \times 4$ matrices.
Remark 6.3.1. The use of $\mathbb{H}$ comes from Hamilton who invented them. It
is said that the relations defining $\mathbb{H}$ are carved in a bridge in Ireland. If
you invent something great, you may be allowed to deface bridges too, but
these days it would probably be an offence and you would face a severe fine if
caught.
Remark 6.3.2. You might have felt that it makes more sense to insist that
the homomorphisms are 1-1, but this complicates the theory enough to make
it a bad idea. If it is 1-1, we say the representation is faithful.
Remark 6.3.3. We could call any homomorphism from G into Aut(V) for any
vector space V a representation over V, and this has its advantages. Most
representations are matrix representations in practice.
The above example suggests that there could be rather a lot of representa-
tions of a group, and that we can build some of them up from other, simpler,
representations. Such is indeed the case, and the theory of representations
deals with precisely this issue. It is a quite satisfying kind of theory for
algebraists and they often give courses on it, usually for finite groups, occasionally
for compact Lie groups, rather rarely in complete generality.
The representation of $\mathbb{Z}_2$ of dimension 2 given above sends the positive x-axis
to the negative x-axis and vice-versa for the non-identity element, and leaves
the x-axis fixed for the identity: we say the x-axis is invariant under the
group action. It is clearly a subspace of the vector space $\mathbb{R}^2$ and gives rise to
the sub-representation which you will surely have discovered when looking
for a one dimensional representation of $\mathbb{Z}_2$. The y-axis is also an invariant
subspace, and $\mathbb{R}^2$ is the direct sum of these two subspaces. This is revision
of second year material and I hope you recall it.
Had one of the minus signs in the second matrix been removed, note that
again there are two invariant subspaces of which $\mathbb{R}^2$ is a direct sum, and that
this gives a new representation of $\mathbb{Z}_2$. Both parts give sub-representations of
$\mathbb{Z}_2$, but only one is faithful. In fact one is distinctly trivial.
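The claims about these two dimensional representations are small enough to check by brute force. The following is a rough sketch under my own conventions, not anything from the notes: the group is stored as the integers 0 and 1 with addition mod 2, matrices are tuples of rows, and the names rep_a, rep_b, is_homomorphism and restrict_to_axis are invented for the purpose.

```python
# Z_2 = {0, 1} under addition mod 2, with two candidate 2x2 real
# representations. Helper names are illustrative assumptions only.

def matmul(a, b):
    """Product of two square matrices given as tuples of row tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

IDENTITY = ((1, 0), (0, 1))

# The representation from the text: 1 goes to minus the identity.
rep_a = {0: IDENTITY, 1: ((-1, 0), (0, -1))}
# The variant with one of the minus signs removed.
rep_b = {0: IDENTITY, 1: ((1, 0), (0, -1))}

def is_homomorphism(rep):
    """Check rep(g + h mod 2) == rep(g) rep(h) for all g, h in Z_2."""
    return all(
        rep[(g + h) % 2] == matmul(rep[g], rep[h])
        for g in (0, 1) for h in (0, 1)
    )

def restrict_to_axis(rep, axis):
    """For diagonal matrices each coordinate axis is invariant, and the
    sub-representation on axis 0 (x) or 1 (y) is just the diagonal entry."""
    return {g: rep[g][axis][axis] for g in rep}

def is_faithful(one_dim_rep):
    """A representation is faithful when distinct elements get distinct images."""
    return len(set(one_dim_rep.values())) == len(one_dim_rep)
```

Restricting rep_b to the x-axis gives the trivial sub-representation and restricting it to the y-axis gives a faithful one, which is the distinction drawn in the paragraph above.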
Exercise 6.3.5. Find some real representations of $\mathbb{Z}_2 \times \mathbb{Z}_2$. Is there a
faithful one dimensional real representation? Is there a faithful one dimensional
complex representation? A faithful two dimensional real or complex
representation?
The fact that the given two dimensional representation of $\mathbb{Z}_2$ can be split
into two sub-representations of lower dimension means that it is really not
worth a deal of thought, because we can obviously recover it from the lower
dimensional representations by direct summing them. In fact all the real
representations of $\mathbb{Z}_2$ can be obtained by minor variations of this process.
Exercise 6.3.7. Find some complex representations of $\mathbb{Z}_3$. Of $\mathbb{Z}_n$. Can you
find any two-dimensional representations which do not have (complex) one
dimensional subrepresentations?
6.3.2 Irreducible Representations
Definition 6.3.6. A representation $m : G \to GL(n, \mathbb{R})$ is irreducible iff there
are no proper non-zero subspaces of the space on which the matrices act which
are invariant under the group action.
Remark 6.3.4. It might be more natural to define reducible representations
but they are too boring.
The three things that make this interesting to physicists are
1. The representations of compact groups are all direct sums of irreducible
representations
2. Most gauge groups are compact
3. The irreducible representations of the gauge groups correspond to the
fundamental particles, for example, electrons
This allows us to compute properties of the fundamental particles by looking
at group representations. This is surely quite astonishing and wonderful. I
have said some rude things about physicists, but if they can do this then
they have more than redeemed themselves. They are good blokes. Or good
sheilas, as the case may be. Or at least, some of them certainly are.
I have not given a formal definition of the direct sum of two representations.
Exercise 6.3.8. Construct a suitable definition.
Definition 6.3.7. Two representations $f, g : G \to GL(n, \mathbb{R})$ are equivalent
iff there is an isomorphism $\phi : \mathbb{R}^n \to \mathbb{R}^n$ such that
$$\forall a \in G, \quad \phi \circ f(a) = g(a) \circ \phi$$
Exercise 6.3.9. Draw the obvious commutative diagram.
Observe that we could have generalised this by defining Aut(V) as the set
of invertible linear maps from V to itself where V is any real vector space,
and then defining a representation of a group G over V as a homomorphism
$m : G \to \mathrm{Aut}(V)$. Then if $\phi : U \to V$ is an isomorphism of vector spaces, we
can talk of representations of a group G over U and V as being equivalent
provided the appropriate diagram commutes.
Exercise 6.3.10. Draw the new diagram.
6.3.3 Tensor Representations
I shall present this as a sequence of easy exercises.
Exercise 6.3.11. Write out a non-trivial representation, $\rho$, of $\mathbb{Z}_2$ as $2 \times 2$
real matrices.
Exercise 6.3.12. Write out a non-trivial representation, $\sigma$, of $\mathbb{Z}_2$ as $3 \times 3$
real matrices.
Exercise 6.3.13. Using the discussion on Darling's expressions for the tensor
product, find an isomorphism between $\mathbb{R}^2 \otimes \mathbb{R}^3$ and $\mathbb{R}^6$.
Exercise 6.3.14. Find the obvious tensor representation $\rho \otimes \sigma$ in terms of
$6 \times 6$ real matrices and the above isomorphism.
Exercise 6.3.15. Show it really is a representation.
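If you want something to compare your answers against, here is one way the construction might look on a computer. It is a sketch with assumptions stated up front: the two matrices chosen for the non-identity element are just one possible choice, the identification of the tensor product space with a six dimensional space is the one given by ordering the basis pairs lexicographically (the Kronecker product), and all function names are made up.

```python
# A sketch of a tensor product representation of Z_2 via the Kronecker
# product. The matrices rho[1] and sigma[1] are one possible choice,
# picked for illustration; they are not taken from the notes.

def matmul(a, b):
    """Product of matrices given as tuples of row tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(len(b)))
              for j in range(len(b[0])))
        for i in range(len(a))
    )

def identity(n):
    return tuple(tuple(1 if i == j else 0 for j in range(n)) for i in range(n))

def kron(a, b):
    """Kronecker product: the matrix of a (x) b in the basis e_i (x) f_k,
    with the pairs (i, k) ordered (0,0), (0,1), (0,2), (1,0), ..."""
    return tuple(
        tuple(a[i][j] * b[k][l]
              for j in range(len(a[0])) for l in range(len(b[0])))
        for i in range(len(a)) for k in range(len(b))
    )

rho = {0: identity(2), 1: ((0, 1), (1, 0))}
sigma = {0: identity(3), 1: ((1, 0, 0), (0, -1, 0), (0, 0, 1))}
tensor = {g: kron(rho[g], sigma[g]) for g in (0, 1)}

def is_representation(rep, n):
    """Check that 0 goes to the identity and products go to products."""
    return rep[0] == identity(n) and all(
        rep[(g + h) % 2] == matmul(rep[g], rep[h])
        for g in (0, 1) for h in (0, 1)
    )
```

The check works because the Kronecker product of two matrix products is the product of the Kronecker products, which is the fact you are being asked to establish in Exercise 6.3.15; the code only confirms it for this one example.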
Exercise 6.3.16. Repeat for the group SO(2, R).
Exercise 6.3.17. Define the tensor product of two representations, one over
a vector space U and the other over a vector space V.
Exercise 6.3.18. Write down a really, really obvious map from $\mathbb{R}^2 \times \mathbb{R}^3$ to
$\mathbb{R}^2 \otimes \mathbb{R}^3$.
Exercise 6.3.19. Show that any bilinear map $f : \mathbb{R}^2 \times \mathbb{R}^3 \to \mathbb{R}$ factors into
your really, really obvious map and a linear map from $\mathbb{R}^2 \otimes \mathbb{R}^3$ to $\mathbb{R}$.
Exercise 6.3.20. Show this generalises to bilinear maps from $U \times V$ to $\mathbb{R}$.
Exercise 6.3.21. List all the one dimensional complex representations of
$\mathbb{Z}_4$.
Exercise 6.3.22. Hence or otherwise, list all the one dimensional complex
representations of SO(2, R). (Which it may be convenient to identify with
U(1, C).)
Exercise 6.3.23. Explain why the above representations are irreducible when
they are.
6.3.4 Schur's Lemma
I got this from Frank Adams' Lectures on Lie Groups, which I recommend
only to the bravest. It is a beautiful book but very, very dense.
Definition 6.3.8. A CG-space V is a complex finite dimensional vector space
and a homomorphism from the Lie group G into Aut(V); that is, a
representation of G over V.
Definition 6.3.9. A map between CG spaces U and V is a $\mathbb{C}$-linear map
$f : U \to V$ which commutes with the homomorphisms, that is, if $\rho : G \to \mathrm{Aut}(U)$
and $\sigma : G \to \mathrm{Aut}(V)$ are the representations, for all $g \in G$, for all $u \in U$,
$$f(\rho(g)(u)) = \sigma(g)(f(u))$$
Proposition 6.3.1. If $\rho$ and $\sigma$ are irreducible, any CG map is either zero
or an isomorphism.
Proof: Ker(f) and Im(f) are clearly invariant subspaces of the representa-
tions and are hence either zero or the whole space.
Remark 6.3.5. It is clear that this works over arbitrary fields, not just $\mathbb{C}$.
The actual Lemma needs the complex numbers:
Proposition 6.3.2 (Schur's Lemma). If $f : V \to V$ is a CG map from an
irreducible representation $\rho$ of a Lie group G to itself, then $f = \lambda I_V$ for some
$\lambda \in \mathbb{C}$.
Proof:
V is isomorphic to $\mathbb{C}^n$ for some $n \in \mathbb{Z}^+$ so we work there. Then there are
n complex eigenvalues for f by the Fundamental Theorem of Algebra, not
necessarily different. So there is at least one $\lambda \in \mathbb{C}$ such that $\det(f - \lambda I_V)$ is
zero. Now $f - \lambda I_V$ is again a CG map, so by the previous proposition we must
have $f = \lambda I_V$, since $f - \lambda I_V$ cannot be an isomorphism for this value of $\lambda$
and is hence zero.
Corollary 6.3.2.1. All the irreducible complex representations of an abelian
group have dimension one.
Proof: If G is abelian and $\rho : G \to \mathrm{Aut}(V)$ is an irreducible representation
then for every $g \in G$, $\rho(g)$ is an automorphism of V which commutes with
every $\rho(h)$ and so is a CG map from $\rho$ to $\rho$. It follows from Schur's Lemma
that $\rho(g)$ is $\lambda I_V$ for some complex number $\lambda$ (depending on g) and hence that
every subspace of V is invariant under $\rho(g)$. Since $\rho$ is irreducible it follows that
V has dimension one, where the only subspaces are the space itself and the
zero element.
Remark 6.3.6. If you have a trace of mathematical taste you will allow that
the last three results are very cool.
It follows that the complex irreducible representations of U(1, C) are all
equivalent to one of the form
$$\rho_n(1, \theta) : (r, \phi) \mapsto (r, \phi + n\theta), \quad n \in \mathbb{Z}$$
Exercise 6.3.24. Show this carefully.
Exercise 6.3.25. Show that tensor multiplication in $\mathbb{R}$ is just multiplication,
likewise in $\mathbb{C}$, and hence that the tensor product of the above irreducible
representations just makes
$$\rho_n \otimes \rho_m = \rho_{n+m}$$
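A numerical spot check of these one dimensional representations is easy to set up. The sketch below rests on an assumption of mine about notation: an element of the circle group is recorded by its angle, and the n-th representation sends that angle to the complex number with n times the angle, acting on the complex line by multiplication. The names rho and close are hypothetical.

```python
import cmath
import math

def rho(n, theta):
    """The n-th one dimensional representation of the circle group:
    the element with angle theta acts on C as multiplication by e^{i n theta}."""
    return cmath.exp(1j * n * theta)

def close(z, w, tol=1e-9):
    """Compare two complex numbers up to floating point error."""
    return abs(z - w) < tol

def is_well_defined(n, theta=0.7):
    """Angles theta and theta + 2*pi name the same group element, so a
    representation must give them the same image."""
    return close(rho(n, theta), rho(n, theta + 2 * math.pi))

def respects_products(n, s=0.3, t=1.9):
    """Homomorphism property: rho_n(s + t) == rho_n(s) * rho_n(t)."""
    return close(rho(n, s + t), rho(n, s) * rho(n, t))

def tensor_rule(n, m, theta=1.1):
    """In dimension one the tensor product is ordinary multiplication,
    so rho_n (x) rho_m should agree with rho_{n+m}."""
    return close(rho(n, theta) * rho(m, theta), rho(n + m, theta))
```

Trying a non-integer n in is_well_defined shows why n has to be an integer. None of this is a proof, of course, it only tells you what the careful argument in Exercise 6.3.24 ought to deliver.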
6.3.5 Representations of SU(2, C)
The text book indicates, without any very compelling arguments, that the
irreducible representations of U(1, C) have important physical significance.
Since the definition of U(1, C) means that it has to preserve lengths, it must
be the subset of $\mathbb{C}$ which contains only complex numbers of modulus 1, that
is, it is the unit circle. And a very fine group it is too, being isomorphic as
a Lie group to SO(2, R).
The representations of U(1, C) being so simple, it is natural to investigate
the representations of U(2, C) and SU(2, C). I shall refer to these as U(2)
and SU(2) from now on since the C may reasonably be taken for granted.
Again we need look only at the irreducible representations and again we are
motivated by the hope of some important physical applications of these ideas.
The first observation worth noting is that U(2) and SU(2) are not abelian
groups, so we expect complications.
First it is essential to get some sort of feeling for the groups. SU(2) is the
subgroup of U(2) having determinant one, and U(2) will consist of the $2 \times 2$
matrices with complex entries which preserve the complex inner product on
$\mathbb{C}^2$, that is the rule
$$\left\langle \begin{pmatrix} a \\ b \end{pmatrix}, \begin{pmatrix} u \\ v \end{pmatrix} \right\rangle = a\bar{u} + b\bar{v}$$
where $\bar{v}$ is the complex conjugate of $v$. The maps will have to take the
standard basis for $\mathbb{C}^2$ to vectors which have length 1 and which are orthogonal
with respect to the complex inner product, and so the columns of the matrices
representing these linear maps must also be orthogonal and have length 1,
which implies that the inverse of such a matrix is its conjugate transpose.
We use $A^*$ to denote the conjugate transpose of A, although many physicists
use $A^\dagger$.
We note that
$$\begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{i\phi} \end{pmatrix}$$
is such a matrix for any $\theta, \phi$ and so is
$$\begin{pmatrix} \cos\psi & -\sin\psi \\ \sin\psi & \cos\psi \end{pmatrix}$$
for any $\psi$, since real orthogonal matrices are necessarily unitary. Also the
product of two matrices which have inverses equal to their conjugate
transpose has its inverse equal to its conjugate transpose.
Exercise 6.3.27. Prove this.
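Before proving it, you can at least watch it happen numerically. This is a minimal sketch, assuming the two families in question are the diagonal matrices of unit-modulus phases and the real rotation matrices; matrices are tuples of rows and every function name here is my own invention.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices given as tuples of row tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return tuple(tuple(a[j][i].conjugate() for j in range(n)) for i in range(n))

def is_unitary(a, tol=1e-12):
    """Unitary means the conjugate transpose is the inverse."""
    p = matmul(dagger(a), a)
    n = len(a)
    return all(
        abs(p[i][j] - (1 if i == j else 0)) < tol
        for i in range(n) for j in range(n)
    )

def phases(theta, phi):
    """Diagonal matrix with entries e^{i theta} and e^{i phi}."""
    return ((cmath.exp(1j * theta), 0j), (0j, cmath.exp(1j * phi)))

def rotation(psi):
    """Real rotation of the plane through the angle psi."""
    return ((math.cos(psi), -math.sin(psi)), (math.sin(psi), math.cos(psi)))
```

Multiplying a phases matrix by a rotation matrix and testing the result with is_unitary gives a three parameter family of unitary matrices, which is what suggests the dimension count in the next paragraph.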
This would lead one to conjecture that the manifold U(2) has (real) dimension
at least three. That it is a (real) manifold follows from the usual arguments
involving the Implicit Function theorem. Note that it makes sense to have
complex manifolds with smooth maps between charts in $\mathbb{C}^n$, but we shall not
be dealing with such things.
Exercise 6.3.28. Find the dimension of U(n) from the Implicit Function
theorem. Show that an element of U(n) must have determinant a complex
number of modulus 1, and hence deduce the dimension of SU(n). (The answer
to the last part is $n^2 - 1$; make sure you get it right!)
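An insight into the geometry of SU(2) is obtained from the Pauli matrices.
Recall that a matrix is hermitean if it is equal to its conjugate transpose.
The Pauli matrices are
$$\sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
It is easy to see that these are linearly independent over $\mathbb{C}$ and hence form
a basis for the four (complex) dimensional space of all $2 \times 2$ complex matrices,
in which GL(2, C) sits as an open subset. If we take only
real coefficients then we get the hermitean $2 \times 2$ matrices.
Exercise 6.3.29. Confirm this claim. Confirm that all hermitean matrices
are obtained in this way.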
You will observe that the Pauli matrices are certainly hermitean themselves
but are also unitary.
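Multiply each of $\sigma_j$ for $j \in [1 : 3]$ by $-i$ and call these, following Baez and
Muniain, I, J, K to get:
$$\sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
I = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \quad
J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad
K = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}$$
Note that
1. These matrices also span the space of $2 \times 2$ complex matrices with complex coefficients
2. Each has determinant one
3. Each is unitary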
Now it is easy to verify that taking all possible real linear combinations of
these matrices gives us a representation of the Quaternions, H.
It is also easy to verify that whenever $a^2 + b^2 + c^2 + d^2 = 1$, for reals a, b, c, d,
$$a\sigma_0 + bI + cJ + dK$$
is unitary,
Exercise 6.3.31. Do it
has determinant one
Exercise 6.3.32. Do it
and only slightly harder to confirm that every unitary $2 \times 2$ matrix with
determinant one is of this form.
This has shown that SU(2) is the three sphere $S^3$ equipped with a
multiplication which does not commute.
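The passage from points of the three sphere to matrices can be tried out directly. The sketch below assumes the convention that I, J, K are the Pauli matrices multiplied by minus i, so that the combination with real coefficients a, b, c, d has the explicit entries written in the function su2; that function and the other helper names are illustrative and not taken from the notes or the text book.

```python
import math

def su2(a, b, c, d):
    """a*sigma_0 + b*I + c*J + d*K, assuming I = -i*sigma_1,
    J = -i*sigma_2, K = -i*sigma_3, written out entry by entry."""
    return ((complex(a, -d), complex(-c, -b)),
            (complex(c, -b), complex(a, d)))

def matmul(m, n):
    """Product of two 2x2 complex matrices."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def dagger(m):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(tuple(m[j][i].conjugate() for j in range(2)) for i in range(2))

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def close(m, n, tol=1e-12):
    """Entrywise comparison up to floating point error."""
    return all(abs(m[i][j] - n[i][j]) < tol for i in range(2) for j in range(2))

SIGMA_0 = su2(1, 0, 0, 0)
I_ = su2(0, 1, 0, 0)
J_ = su2(0, 0, 1, 0)
K_ = su2(0, 0, 0, 1)

def on_sphere(a, b, c, d):
    """Rescale a non-zero real 4-vector to unit length."""
    r = math.sqrt(a * a + b * b + c * c + d * d)
    return (a / r, b / r, c / r, d / r)
```

For any point produced by on_sphere, the matrix su2 of that point should pass both the unitarity and determinant checks, and the products of I_, J_, K_ should reproduce the quaternion relations, including the failure to commute.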
Remark 6.3.7. There is quite a lot of useful structure lying about here which
has been used by engineers and physicists for many a long year. Mathematicians
tend to see themselves as discovering structure and pointing it out to
physicists and engineers who eventually come to find it useful in talking about
something in reality, and then imagine that they discovered the structure
experimentally. Physicists and engineers have a different story.
6.3.6 Representations of SU(2)
This is reasonably well described in the text book: the representations are
over vector spaces of homogeneous polynomials. The zero degree polynomials
are simply the complex numbers; the space $H_j$, for j half an integer, is the
space of homogeneous polynomials of degree $2j$ in two variables. Thus we have
for $j = 0$ the constant functions from $\mathbb{C}^2$ to $\mathbb{C}$ and for $j = 1/2$ the functions
$$f_{a,b} : \mathbb{C}^2 \to \mathbb{C}, \quad \begin{pmatrix} x \\ y \end{pmatrix} \mapsto ax + by$$
for $a, b, x, y \in \mathbb{C}$. Then $H_j$ is a vector space over $\mathbb{C}$ of (complex) dimension
$2j + 1$. $U_j : SU(2) \to \mathrm{Aut}(H_j)$ is the representation which takes any
$g \in SU(2)$ to the automorphism carrying the polynomial p to the polynomial q
defined by
$$q\begin{pmatrix} x \\ y \end{pmatrix} = p\left(g^{-1}\begin{pmatrix} x \\ y \end{pmatrix}\right)$$
Exercise 6.3.34. Confirm this gives a representation of SU(2).
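The rule for acting on polynomials is short enough to write down as code, which makes the role of the inverse easier to see. This is a sketch under assumptions of my own: a polynomial is represented simply as a Python function of a pair of complex numbers, group elements are built from a point of the three sphere using the I, J, K convention (Pauli matrices times minus i), and the names su2, inverse_su2, apply and act are all hypothetical.

```python
import math

def su2(a, b, c, d):
    """Element of SU(2) from reals with a^2 + b^2 + c^2 + d^2 = 1,
    assuming the convention I, J, K = -i times the Pauli matrices."""
    return ((complex(a, -d), complex(-c, -b)),
            (complex(c, -b), complex(a, d)))

def matmul(m, n):
    """Product of two 2x2 complex matrices."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse_su2(g):
    """For a unitary matrix the inverse is the conjugate transpose."""
    return ((g[0][0].conjugate(), g[1][0].conjugate()),
            (g[0][1].conjugate(), g[1][1].conjugate()))

def apply(g, v):
    """Apply the matrix g to the column vector v = (x, y)."""
    x, y = v
    return (g[0][0] * x + g[0][1] * y, g[1][0] * x + g[1][1] * y)

def act(g, p):
    """The polynomial q with q(v) = p(g^{-1} v)."""
    g_inv = inverse_su2(g)
    return lambda v: p(apply(g_inv, v))

def sample_polynomial(v):
    """A homogeneous polynomial of degree 2 in x and y, chosen arbitrarily."""
    x, y = v
    return x * x + 3 * x * y - 2 * y * y
```

Acting first by h and then by g gives the same function as acting once by the product gh, and it is exactly the inverse in the definition that makes the order come out right; without it you would get an anti-homomorphism.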

These are in fact all the irreducible representations of SU(2), something which
is not proved in the text book and I shan't prove it either. You may if you
wish.
Remark 6.3.8. This concludes everything we have to say about representations,
where "we" means Baez, Muniain and me, but it is far from completing
the business. There is a lot of important and relevant material still uncovered.
Well, that's life.
Exercise 6.3.35. Read the discussion in the text book and fill in any gaps.
Remark 6.3.9. I am skipping the material which claims that SU(2) is a
double covering of SO(3); quite a lot can be said about this and it explains
the interest physicists have in SU(2).
Exercise 6.3.36. A lot of deep issues arise which physicists tend to gloss
over. This is an invitation to think about them.
First we have the mystery that the irreducible representations of the group
U(1) or SO(2) have something to do with the fact that charge is conserved.
Then we have that the irreducible representations of SU(2) tell us something
about spin, and about fundamental particles. This invites two separate
questions, the first is what exactly do the groups have to do with it? Groups such
as the Lorentz and Poincaré groups arise naturally enough from our desire
to have the physics independent of the detailed choice of language, Einstein's
principle of general covariance. What is the explanation for U(1) having
everything to do with the Maxwell Equations and Electromagnetism?
Second, why irreducible representations? Why representations at all? One
can see that they might be convenient in doing calculations, but it looks as
though the use made of them goes beyond simple convenience. What exactly
is the relation between the physics and our description, and why are repre-
sentations central to it?
There is a sketch of an answer to these questions in the part I have skipped:
it involves Quantum Mechanics and the standard Hilbert space representation
of quantum states.
You are invited to write a short essay addressing these questions.
You are also invited to consider the extent to which the Hilbert Space repre-
sentation is essential to Quantum Mechanics, and to ponder whether a wholly
abstract description of what is needed for a mathematical model of QM would
necessitate Unitary representations.
6.4 Lie Algebras
Definition 6.4.1. The Lie Algebra of a Lie group G is the tangent space
at the identity. It is called $\mathfrak{g}$. This makes it a vector space of the same
dimension as G.
Remark 6.4.1. The multiplication comes later.
Remark 6.4.2. Elements of $\mathfrak{g}$ used to be (and still are by some people)
called the infinitesimal elements of G. You can see why. In particular the
infinitesimal rotations are obtainable from the rotations in SO(3) by taking
curves through the identity in SO(3) and differentiating them. The text book
gives some natural examples: we take the matrix function
$$\begin{pmatrix} \cos t & -\sin t & 0 \\ \sin t & \cos t & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
which represents a curve of rotations about the z-axis and differentiate it at
$t = 0$ to get
$$J_z = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
and $J_x$, $J_y$ can be obtained in the same way.
These three matrices are linearly independent and span the algebra so(3).
The multiplication is the Lie Bracket in this case,
$$[X, Y] = XY - YX$$
Exercise 6.4.2. Verify that the Lie bracket is in the vector space so(3) when
X and Y are.
We can recover the original matrix functions by exponentiation:
Exercise 6.4.3. Show that $\exp(tJ_z)$ is what it ought to be.
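Both the bracket and the exponential can be experimented with using nothing more than matrix multiplication. The following is a sketch, not a solution: it assumes the three generators are the derivatives at zero of the standard rotations about the x, y and z axes with the sign conventions shown in the code, and it approximates the exponential by a truncated power series. All names are made up.

```python
import math

# Generators of rotations about the x, y, z axes (derivatives at t = 0 of
# the standard rotation matrices). Sign conventions are an assumption.
JX = ((0, 0, 0), (0, 0, -1), (0, 1, 0))
JY = ((0, 0, 1), (0, 0, 0), (-1, 0, 0))
JZ = ((0, -1, 0), (1, 0, 0), (0, 0, 0))

def matmul(a, b):
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

def add(a, b):
    return tuple(tuple(x + y for x, y in zip(r, s)) for r, s in zip(a, b))

def scale(a, s):
    return tuple(tuple(s * x for x in row) for row in a)

def identity(n):
    return tuple(tuple(1 if i == j else 0 for j in range(n)) for i in range(n))

def bracket(x, y):
    """The Lie bracket [X, Y] = XY - YX."""
    return add(matmul(x, y), scale(matmul(y, x), -1))

def expm(a, terms=30):
    """Truncated power series for exp(a): sum of a^k / k! for k < terms."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scale(matmul(term, a), 1.0 / k)
        result = add(result, term)
    return result

def rotation_z(t):
    c, s = math.cos(t), math.sin(t)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

def max_abs_diff(a, b):
    return max(abs(x - y) for r, s in zip(a, b) for x, y in zip(r, s))
```

Comparing expm of a multiple of JZ with rotation_z at the same parameter shows the agreement numerically, and bracket applied to pairs of generators shows that the bracket stays inside the span of the three matrices. The power series is fine for small matrices and modest parameters but is not how one would compute a matrix exponential in earnest.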
Exercise 6.4.4. Show that the Lie algebra of SO(2) is just R.
Exercise 6.4.5. Do exercises 33 to 54 in the text book.
Remark 6.4.3. Lie algebras are, as the book tells us, nicer in many ways to
work with than Lie groups because they are vector spaces. They give a lot of
information about the groups and their representations.
Chapter 7
Fibre Bundles
7.1 Introduction
A standard source on Fibre Bundles is Dale Husemoller's Fibre Bundles.
There are probably more modern books, and there are certainly better
written books, but I own a copy so will stick to following it. I shall do very little
on this subject (there is quite a lot to be done) because I want to focus on
differential geometry, the subject of these notes, but there are close
connections, as is shown by the physics. Anyway, to get very far in Fibre Bundles
you would need more homotopy theory than you have. So this will be a short
chapter.
First some examples:
1. The product $S^1 \times \mathbb{R}$ with projection to $S^1$ has base space $S^1$, fibre
(space) $\mathbb{R}$ and total space $S^1 \times \mathbb{R}$. It is easy to see why we call the fibre
a fibre (it is long and thin) and the fibres are glued together by the
topology of the base space.
2. The Möbius bundle which has the same base space and fibre as the last
example, but has a twist in it so as to make a Möbius strip (without
a boundary). Again there is a projection from the total space to the
base space, and the inverse image of any point is a copy of $\mathbb{R}$.
3. Any product of two spaces. For example a 2-torus has base space $S^1$
and also fibre $S^1$.
4. Any tangent bundle. This attaches to every point of a smooth manifold
a vector space, the tangent space at the point, and the resulting object
is a vector bundle, which is defined as a fibre bundle which has a vector
space for the fibre, an important subclass of fibre bundles.
5. Tensor bundles. Again, these are all vector bundles.
6. SO(n,R), $n \geq 2$ is a fibre bundle with base space $S^{n-1}$. The map takes
an element of SO(n,R) and sends it to wherever the north pole of the
sphere $S^{n-1}$ is taken by applying the element to the sphere. The inverse
image of this point in SO(n,R) is a subset which is an embedded copy
of SO(n-1,R), the fibre. When n = 2, SO(n-1,R) is a single point,
the identity map from $\mathbb{R}$ to itself, so SO(2,R) is just a copy of $S^1$
topologically.
7. The sphere $S^n$ is a fibre bundle over $\mathbb{RP}^n$ which sends antipodal points
to the same point and hence has fibre $\mathbb{Z}_2$. It might be better to describe
the fibre as $S^0$, the pair $\pm 1$ under multiplication, or O(1,R). More
interesting bundles can be obtained by replacing $\mathbb{R}$ with $\mathbb{C}$.
8. Take the sphere $S^2$ and at each point take the space of ordered pairs of
orthonormal tangent vectors. This gives an orthogonal 2-frame bundle
over $S^2$. In general, if M is a smooth manifold, for k an integer less than
or equal to the dimension of the manifold, take the space of (ordered)
k orthonormal tangent vectors at each point. An orthogonal 1-frame
bundle on $S^2$ would consist of attaching a unit circle to each point
of the space, the circle being in the tangent space at the point. An
orthogonal 2-frame bundle on $S^2$ would attach 2 circles at each point
(Explain why). Clearly this supposes a Riemannian Inner Product.
More generally, it makes sense to attach at each point of a smooth
n-manifold, an ordered set of k linearly independent vectors of the
tangent space at that point, for $k \leq n$. These bundles are called frame
bundles. A section of the two-frame bundle on $S^2$ would give a rather
special pair of vector fields being everywhere linearly independent, and
we know there is not even one nowhere-vanishing vector field on $S^2$. So it
is not at all obvious whether a given manifold admits a field of k-frames in general.
Note that there is a rather natural group action on frame bundles,
O(n,R) on the orthogonal frame bundles, and GL(n,R) on the bundles
where we do not suppose a Riemannian structure. The group acts on
the total space but sends fibres to fibres by what is a multiplication
of the (Lie) group and hence a diffeomorphism. Bundles with a group
action of this sort are called principal bundles. I shall elaborate on
these later.
Exercise 7.1.1. Show that attaching an ordered set of n orthonormal vectors
to each point of a space is equivalent to attaching an element of the orthogonal
group, and that the n-frame bundle effectively attaches GL(n,R) to each point
of the n-manifold, this being the fibre. Thus a useful way of thinking of a
bundle with base space a manifold is to regard the manifold as having a copy
of the fibre attached at each point of the manifold.
Exercise 7.1.2. Show that the 2-torus admits a field of 2-frames. Does $S^3$?
The above should convince you that (a) some spaces have a structure which
makes them something like a generalised cartesian product and (b) it is worth
knowing more about them because there are interesting examples.
Definition 7.1.1. A fibre bundle is a triple of spaces, E, B, F and a map
$\pi : E \to B$ such that for every $b \in B$, $\pi^{-1}(b)$ is homeomorphic to F.
Definition 7.1.2. A fibre bundle is locally trivial iff there is a cover of B by
open sets $U_j$ and for each of them $\pi^{-1}(U_j)$ is homeomorphic to $U_j \times F$.
Remark 7.1.1. All our fibre bundles will be locally trivial.
Exercise 7.1.3. Give an example of a fibre bundle which is not locally
trivial.
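Definition 7.1.3. A bundle map between fibre bundles $(E, B, F, \pi)$ and
$(E', B', F', \pi')$ is a pair of maps $f_B : B \to B'$ and $f_E : E \to E'$ such that
$$\pi' \circ f_E = f_B \circ \pi.$$
Remark 7.1.2. It follows that fibres wind up inside fibres under $f_E$. It is
helpful to draw a picture of a square:
$$\begin{array}{ccc}
E & \xrightarrow{\ f_E\ } & E' \\
{\scriptstyle \pi}\downarrow & & \downarrow{\scriptstyle \pi'} \\
B & \xrightarrow{\ f_B\ } & B'
\end{array}$$
We say that the square commutes with the condition $\pi' \circ f_E = f_B \circ \pi$. If
$\pi$ is onto then $f_E$ determines $f_B$. From the definition, it has to be.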
Exercise 7.1.4. Define the terms product of fibre bundles, subbundle,
quotient bundle. Give examples of each.
Exercise 7.1.5. Find out what a fibre product is and give an example.
7.2 Principal Bundles
In many of the above examples, the fibre had some extra structure besides
being a topological space: often it was a vector space, giving a vector bundle,
and sometimes it was a group. A group acts on itself by multiplication, and
so we can more generally consider the case when the fibre has a group action
on it. We care most about the case where the group action on the fibre is that
of a Lie group, and the action is regular, which means it is both transitive,
that is for any two points of the space there is a group element which acts to
take one to the other, and also free, that is only the identity leaves any point
fixed; this is equivalent to saying that for any two x, y in the fibre F there
exists precisely one g in G such that $g \cdot x = y$. In this case, F is known as
a principal homogeneous space for G or as a G-torsor. This definition holds
whether F is actually the fibre of a bundle or not.
Exercise 7.2.1. Show that the action of $S^1$ on itself (regarded as U(1,C),
i.e. the set of complex numbers of modulus 1 with the usual complex
multiplication) makes it an $S^1$-torsor. Is there a regular action of $S^1$ on $T^2$?
Is there a regular action of $\mathbb{R}$ on $T^2$? Take the quotient space $I/\partial I$ which
joins the ends of the unit interval together. This is homeomorphic to $S^1$ but
lacks the group structure and the smoothness structure. Show that it can be
given the structure of a smooth manifold via any homeomorphism with $S^1$
and also that it is an $S^1$-torsor. Is any Lie group G a G-torsor? Are there
any G-torsors that are not homeomorphic to G?
Definition 7.2.1. A fibre bundle where each fibre is a G-torsor (for the same
G) is called a Principal bundle.
Exercise 7.2.2. Show that the n-frame bundle for any smooth manifold
(usually written F(M)) is a principal bundle.
Exercise 7.2.3. By taking the Möbius strip with fibre a closed interval and
gluing the ends of each fibre, show that the resulting space is a principal
bundle and work out what the space is.
Remark 7.2.1. This has a lot to do with gauge theory.
Exercise 7.2.4. Do some googling to understand the last remark.
Remark 7.2.2. The condition that the fibre be a G-torsor means that we
can use group actions to say something about the bundle structure. We
have, in effect, a sort of Construction Kit for the bundle which tells us how
to put it together: the group elements can be used to specify how to glue
local trivialisations together.
Figure 7.2.1: A locally trivial cover of $S^1$.
In the simplest case, take a bundle over $S^1$ with fibre the interval $\mathbb{R}$ and
the action of O(1,R) on it. We might stipulate that the action be always
the identity, so if we have a pair of trivialisations of the bundle, on the
intersection the relation between the fibres is that they are the same way
up. This inevitably forces the bundle to be trivial, $S^1 \times \mathbb{R}$. Or we might
insist that the group action be $-1$ on one intersection and $+1$ on the other,
when we would get the Möbius strip. Since we would like to be consistent
on intersections, it is reasonable to want the intersection to be connected, so
for $S^1$ we shall do it with three open sets which cover $S^1$.
Exercise 7.2.5. Is the fibre a G-torsor for the orthogonal group?
Remark 7.2.3. In the above case, if we impose the condition that the group
action has to be constant on the intersection of the trivialising cover of the
base space, then we need at least three such open sets in the cover. Labelling
them $\alpha$, $\beta$ and $\gamma$, we can characterise each intersection by specifying an
ordered pair, see figure 7.2.1. $\alpha$ is the red open set, $\beta$ the blue and
$\gamma$ the green. Then the intersection $\alpha\beta$ is the region between the black bars
at the top right. If I now assign the element $1 \in$ O(1,R) to $\alpha\beta$, the element
$1$ to $\beta\gamma$ at the left and the element $-1$ to $\gamma\alpha$ then you can read this as an
instruction to start with three strips, $\alpha \times \mathbb{R}$, $\beta \times \mathbb{R}$ and $\gamma \times \mathbb{R}$, and glue the
first two strips together keeping both orientations of $\mathbb{R}$ to have the positive
numbers pointing up, the strip $\beta \times \mathbb{R}$ is glued to $\gamma \times \mathbb{R}$ also with the real line
having the same orientation, but $\gamma \times \mathbb{R}$ is glued to $\alpha \times \mathbb{R}$ with a reversal, so
that the $\alpha$ part is upside down. It is clear that these instructions produce a
Möbius strip. Moreover, in general, we can specify a locally trivialising cover
of the base space, with the condition that the intersection is path connected,
take any fibre having a group action on it and, by assigning group elements
to intersections, give instructions to build a new object. We need to ensure
that the instructions are unambiguous and that the resulting object is a fibre
bundle.
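The gluing instructions for a line bundle over the circle can be reduced to a little bookkeeping. The sketch below makes several assumptions that are mine and not the notes': the three open sets are called alpha, beta and gamma, each overlap carries an element of O(1), written as +1 or -1, and the only thing recorded about the finished bundle is the product of those elements as you go once round the circle. The function names holonomy and classify are hypothetical.

```python
# Transition data for a bundle over the circle with fibre R and group
# O(1) = {+1, -1}. The labels alpha, beta, gamma for the three open sets
# and the function names are illustrative assumptions.

def holonomy(transitions):
    """Product of the O(1) elements met in going once round the circle.
    A vector carried round the loop comes back multiplied by this."""
    result = 1
    for g in transitions.values():
        result *= g
    return result

def classify(transitions):
    """+1 means the strips glue up to a cylinder, -1 to a Mobius strip."""
    return "cylinder" if holonomy(transitions) == 1 else "Mobius strip"

all_identity = {
    ("alpha", "beta"): 1,
    ("beta", "gamma"): 1,
    ("gamma", "alpha"): 1,
}
one_reversal = {
    ("alpha", "beta"): 1,
    ("beta", "gamma"): 1,
    ("gamma", "alpha"): -1,
}
two_reversals = {
    ("alpha", "beta"): -1,
    ("beta", "gamma"): -1,
    ("gamma", "alpha"): 1,
}
```

The third example is worth a moment: two reversals cancel, so the result is again the cylinder even though the instructions were not the identity everywhere. With only three sets and no triple overlaps there is no consistency condition to check here, which is why the example is so tame.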
Definition 7.2.2. In general, if $\{U_\alpha\}$ is a trivialising cover of a manifold,
with the condition that $U_\alpha \cap U_\beta$ is connected or empty, then when there is
a group action on the fibre with a group G, the map from the (non-empty)
intersections $(\alpha\beta)$ to G gives the transition functions for the bundle.
Exercise 7.2.6. Suppose $\alpha\beta$ means that you hold $\alpha$ the right way up in
some sense and apply a group element $g_{\alpha\beta}$ to $\beta$ before doing the gluing of
each x in the fibre F over $\beta$ to $g_{\alpha\beta}(x)$ over $\alpha$. Verify that this means that
we must have $g_{\beta\alpha}$ the inverse of $g_{\alpha\beta}$. What can you say about $g_{\alpha\alpha}$?
Exercise 7.2.7. Verify also that for an unambiguous instruction we need to
have the cocycle condition:
$$g_{\alpha\beta}\, g_{\beta\gamma}\, g_{\gamma\alpha} = 1$$
on any non-empty region $U_\alpha \cap U_\beta \cap U_\gamma$.
Exercise 7.2.8. Verify that the construction described always gives a fibre
bundle.
Note that we do not need the fibre to be a G-torsor for this group, it suffices
that the action be that of a subgroup. In fact we can get a trivial bundle by
consistently choosing the identity. (We can get it other ways, too!)
Exercise 7.2.9. Explain the last, parenthetic, remark.
Exercise 7.2.10. Show that by choosing fibre $S^0$ instead of $\mathbb{R}$, we can deal
with the case where the fibre is a G-torsor. Instead of taking a subgroup,
we can throw out most of the fibre. So for the trivial bundle and the Möbius
bundle over $S^1$ with fibre $S^0$, we still have all the essential properties, and
now we need only count connected components to see the difference.
Definition 7.2.3. In the case described above of a bundle the structure of
which is determined by a group action on the fibres and a set of transition
functions, the bundle is called a G-bundle, and the group is called the gauge
group of the bundle.
Remark 7.2.4. In practice, the fibre is a vector space, usually a tangent
space or tensor space.
Definition 7.2.4. For any linear transformation T of a G-vector bundle fibre
$F_p$ attached to a point p in the manifold, we can ask whether it arises from
the action of G. In general some will and some won't. If it does, we say T
lives in G.
Exercise 7.2.11. Show this is well defined. That is, show that if p is in two
charts with domains $U_\alpha$ and $U_\beta$, then if T lives in G over $U_\alpha$ it also lives in G
over $U_\beta$, even though the particular element $g \in G$ is in general different.
Exercise 7.2.12. Give examples of G-vector bundles and linear
transformations of $F_p$ that live in G and others which do not.
Exercise 7.2.13. Extend this idea to the Lie algebra $\mathfrak{g}$ when G is a Lie
group.
Definition 7.2.5. A Gauge Transformation is a smooth G-bundle map from
a vector bundle into itself which is the identity on the base space and such
that every linear map from a fibre $F_p$ to itself lives in the (Lie) group G.
Remark 7.2.5. Physicists care about these a lot. See the section on page
215 of the text book to find out why. Or at least get some vague idea.
7.3 The Endomorphism Bundle
There is a natural isomorphism between
$$V \otimes V^* \quad \text{and} \quad \mathrm{End}(V)$$
where End(V) is the vector space of endomorphisms of V, that is the linear
maps from V to itself. Observe that End(V) is a ring under composition, it
has a unit, but is not in general commutative. Recall that we defined a vector
space with an associative multiplication to be an algebra. From the
isomorphism it is clear that for any smooth manifold we have an endomorphism
bundle where V is the tangent space at each point of the manifold.
More generally, if E is any vector bundle over a smooth manifold M with
fibre V, we can define the endomorphism bundle $E \otimes E^*$ by attaching $\mathrm{End}(V_p)$
to each point p in M. There is nothing new here, we have the tensor bundle
construction in the very special case of (1, 1) tensors.
A section T of E E
acts on a section s of E. If at each point p in M, the

section s(p) is in the bre then T(p) is a linear map from the bre into itself
which takes s(p) to T(p)(s(p)), everything done pointwise. And if (E) is
the space of sections of E, any section T of E E
gives a map
T : (E) (E)
which is linear regarding (E) as a C
(M, R) module.
Exercise 7.3.1. Verify the above claim.
Exercise 7.3.2. Show that any $C^\infty(M, \mathbb{R})$-linear map from $\Gamma(E)$ to itself defines a section of
$E \otimes E^*$. This will involve partitions of unity so needs M paracompact. Do
it first for the case where $E = M \times V$.
Exercise 7.3.3. Show that the set of all gauge transformations for a G-vector
bundle E is itself a group, $\mathcal{G}$, the gauge group.
Exercise 7.3.4. Read page 222 of the text book.
Chapter 8
Connections
I have followed and amplified R.W.R. Darling's book Differential Forms and
Connections, of which I can only say thank God for Wikipedia. You should
google the term Covariant Derivative on Wikipedia and anywhere else you
can find it.
8.1 Fundamental Ideas
You will have noticed that I have used the notation $T\mathbb{R}^n$ to denote the tangent
space of $\mathbb{R}^n$ at any point. I can get away with this because if I take any
$a, b \in \mathbb{R}^n$, then $T\mathbb{R}^n_a$ and $T\mathbb{R}^n_b$ are isomorphic and moreover the isomorphism
comes from the shift map that takes a to b by adding $b - a$ to everything
in $\mathbb{R}^n$. Clearly this takes curves and their tangency equivalence classes at a
to corresponding curves and their equivalence classes at b in a thoroughly
uninteresting way. An important consequence of this is that if I am standing
at the origin in $\mathbb{R}^n$ and you are standing somewhere else, it makes sense to
ask if we are looking in the same direction. We can ask if the unit tangent
vector representing my direction of look is carried by the shift from me to
you into the unit vector representing your direction of look.
On a 2-sphere, even the standard one embedded in $\mathbb{R}^3$, this is not the situation
at all. It is true that we could use $\mathbb{R}^3$ as our notion of what constitutes the
same direction, but if you and I are both on the equator of the Earth,
supposed to be an embodiment of $S^2$, and if you are a quarter of a planet
away from me, if I am looking at my horizon due West, watching the sun
set, and you are also looking due West, you would not be looking towards
the sun. If you are somewhere to the West of me, then the Sun is overhead
from your point of view, and if I am to the West of you, it is dark for you
and the direction of the Sun is under your feet. Yet if we are both looking
due West, it clearly makes some sort of sense to say we are looking in the
same direction. We wouldn't feel quite so tempted to say this if I watched
the sunset and you were looking due North.
Exercise 8.1.1. Show that there is a curve joining us so that at each step
neighbours are looking in the same direction but I am looking due West and
you are looking due North.
So the question is, how do we define this notion of the same direction for
different places on a manifold?
One approach is to go through a group action, since the shift maps from
$\mathbb{R}^n$ to itself constitute precisely such an action of the additive group $\mathbb{R}^n$ on
the vector space $\mathbb{R}^n$, and the group action takes tangent vectors to unique
tangent vectors. In the case of $S^2$ embedded in the standard way in $\mathbb{R}^3$, one
would be tempted to use the special orthogonal group. If there is a rotation
taking me to you (and there surely is) then we could use the rotation to take
my tangent space to yours. Then if the image of my direction of look is your
direction of look, we are looking in the same direction. The fact that we
can decide which directions are North, South, East and West, for all points
on the Earth except the poles, suggests that some sort of sense can be made
of this. On the other hand the fact that all directions at the North Pole are
due South also suggests that there is something a bit wrong. Nevertheless, if
I am at the North Pole and you are a kilometre away and we are looking at
each other, then it makes sense to say we are looking in opposite directions,
and if I turn around to see what you are looking at (a polar bear behind
me perhaps) then we would be looking in the same direction. This may be
because a distance of a kilometre is small enough to make the world pretty
much flat. But suppose we have a chain of five thousand people, all looking
towards the next one in the chain, all able to see over each other's shoulders
at the next one beyond, we could each decide that we were looking in the
same direction. If we had forty thousand people, and I am number one (as
seems reasonable to me) and standing at the North Pole looking towards you,
and you are looking due South at somebody a kilometre away also looking
in the same direction as you, and so on, then we are all looking due South
until we get to the South Pole, when everybody later in the chain is looking
due North. So North and South have got screwed up, but we are all still
looking in the same direction. Any three consecutive people will agree on
this. And if we all turn through a right angle anticlockwise, are we still
all looking in the same direction? Any consecutive triple of people, observing
their neighbours out of the corners of their eyes would surely agree that they
were. And if they all held their arms out sideways, their right arms would be
pointing towards the neighbour they had previously been looking at. Half of
them would say they were looking West and the other half would say they
were looking East. Were it not for the axial inclination of the Earth, we
might have them all looking at the sun on the horizon, although some would
find it setting and others rising.
You will note that a suitable rotation of the Earth, although not the usual
one, would take each person to his successor and would carry the direction
of look along with it.
We certainly have that parallel translation of a tangent vector around this
closed loop would have to bring us back to the original vector, but this need
not happen for all closed loops; see figure 3.2.1 in chapter three.
Exercise 8.1.2. Without looking at the picture, draw a closed loop of people
on $S^2$ with a distinguished starting point, so that everyone is looking in the
same direction as the person in front, but such that the last person is at the
same point as the first person but is looking in a different direction.
Three observations: first that we should feel that when the people are very
close together on any smooth manifold, it should be possible to say if they
are looking in different directions. Since very close is a scale dependent
kind of thing, it must be intelligible to take limits, so differentiation must
come into it somewhere. On a symmetric space with a Lie group action,
the group action ought to also give us some sense of what looking the same
way means, but the notion ought to be intelligible on any smooth manifold,
although not with the structure we have on it at present. Another reason for
thinking differentiation comes into it somewhere is that directions are given
by tangent vectors.
Second, we seem to have that paths come into it too. Whether two people
at different points a and b can be said to be looking in the same direction
would seem to require us to have a path of people, everyone looking in the
same direction as the person next to him, the path joining the person at a
to the person at b. There isn't a next person on a continuous path, and
the last exercise makes it clear that whether the person at a is looking in the
same direction as the person at b would depend on the path.
Exercise 8.1.3. Find two paths on a sphere between points a and b such
that each person along each path is looking in the same direction, the person
at a is looking in a definite direction, but the two coincident people at b are
looking in different directions.
And Third, we might want to transfer along paths on a manifold other things
besides directions of looking: for example we might naturally want to transfer
a frame, that is a coordinate system, or a linear map. For this reason we
need to think in terms of doing parallel transportations for sections of vector
bundles in general, not just the tangent bundle.
So the problem is, how to articulate precisely this notion of shifting some
object parallel to itself along a curve on a smooth manifold. The above
discussion shows it should make some sort of sense, but consideration of $S^2$
shows that it may have some surprises.
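One of those surprises can be seen numerically. The sketch below is only an illustration under stated assumptions: it takes the rule suggested above (carry a direction of look along a great-circle arc with the rotation of the sphere that moves the walker along that arc) and applies it to three quarter-circle arcs, pole to equator, a quarter of the way along the equator, and back to the pole. The function names are made up for this example and use nothing beyond the Python standard library.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def rotate(v, axis, angle):
    """Rotate v about the unit vector `axis` by `angle` (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v = cross(axis, v)
    k_dot_v = dot(axis, v)
    return tuple(v[i] * c + k_cross_v[i] * s + axis[i] * k_dot_v * (1 - c)
                 for i in range(3))

def transport_along_arc(p, q, v):
    """Carry the tangent vector v at p to q with the rotation taking p to q
    along the shorter great-circle arc.
    Assumes p and q are unit vectors, neither equal nor antipodal."""
    n = cross(p, q)
    length = math.sqrt(dot(n, n))
    axis = tuple(x / length for x in n)
    angle = math.atan2(length, dot(p, q))
    return rotate(v, axis, angle)

north_pole = (0.0, 0.0, 1.0)
equator_a = (1.0, 0.0, 0.0)
equator_b = (0.0, 1.0, 0.0)   # a quarter of the way round the equator

start = (1.0, 0.0, 0.0)       # direction of look at the pole: towards equator_a
v = start
for p, q in [(north_pole, equator_a), (equator_a, equator_b), (equator_b, north_pole)]:
    v = transport_along_arc(p, q, v)

# Angle between the first and the last direction of look, in degrees.
turn = math.degrees(math.acos(max(-1.0, min(1.0, dot(start, v)))))
print(v)
print(turn)
```

Working the three rotations by hand gives the final direction (0, 1, 0): the walker is back at the pole but has turned through a right angle, even though no single step involved any turning.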
There seem to be two possibilities: one is to restrict ourselves to symmetric
spaces with a good group action under which the manifold is invariant, and
the other is to find some more general differential structure. The first choice
leads to Cartan connections, which generalise the idea of using rotations of
$S^2$ to carry tangent vectors along with the points, or shifts to take tangent
spaces to tangent spaces in $\mathbb{R}^n$. The second choice is more general and leads
to Koszul connections and in particular to the Levi-Civita connection. There
are more connections than you can shake a stick at, but it is better to get
one type sorted out properly before going on to others.
Exercise 8.1.4. Can you say what kinds of paths on $S^2$ look right for the
action of $SO(3, \mathbb{R})$ to be used for determining how to transport a unit tangent
vector (direction of look!)?
Exercise 8.1.5. Can you transport a frame on $S^2$ using the same idea? If
you had an eye where your left ear is so you could look in two orthogonal
directions at once, could you have a chain of people all looking in the same
direction with eyes and left ears? Or could you have a chain where this is
impossible?
Exercise 8.1.6. You could certainly have a Riemannian inner product on
$\mathbb{R}^2$ that started off with the standard basis being orthonormal and changed
gradually along a path until the basis
$$\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$
was an orthonormal
basis. Find such a path. Transport an orthogonal frame along the path so it
stays an orthogonal frame in the Riemannian inner product.
Exercise 8.1.7. Could you do the same thing with the standard basis,
$(e_1, e_2)$, and the basis $(e_2, e_1)$? Prove your claim.
Exercise 8.1.8. Can you do the same kind of thing on $S^2$? $\mathbb{RP}^2$?
We conclude that the idea is to shift things along curves. At the very least
the curves should be smooth. In fact any smooth curve, at least locally, is the
solution to a system of ODEs (the Straightening Out Theorem from ODE
theory: see Arnold). So an alternative is to move them infinitesimally
along a vector field, and the things will be sections of some vector bundle.
If the shifts are infinitesimal then we can hope to get a shift along a curve
by some sort of integration process. This leads to asking if we can have some
sort of differential operation of a vector field on various other sections of a
vector bundle.
8.2 Back in $\mathbb{R}^n$
8.2.1 Covariant differentiation
I shall deal with the case n = 2 in order to save typing, but the extension is
trivial. Take a vector field X on $\mathbb{R}^2$ and a point $a \in \mathbb{R}^2$ with X(a) denoting
the vector at a. We write a as $\begin{pmatrix} a^1 \\ a^2 \end{pmatrix}$ and X(a) as $u = \begin{pmatrix} u^1 \\ u^2 \end{pmatrix}$. Let Y be
another vector field on $\mathbb{R}^2$. Can we differentiate Y in the direction u at a?
If we take $Y(a) = v = \begin{pmatrix} v^1 \\ v^2 \end{pmatrix}$, then we can only talk about differentiating Y
at a if we have Y making sense in a neighbourhood of a, so we want to see
v as a pair of functions,
$$v(a) = \begin{pmatrix} v^1(a) \\ v^2(a) \end{pmatrix}$$
Of course, we are going to be looking at the derivative of Y at a in the
direction X(a) for different points a.
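We already have a way of talking about the directional derivative of a function
with respect to a vector field. Take a function $f : \mathbb{R}^2 \to \mathbb{R}$ and a vector field
X on $\mathbb{R}^2$. Then I can take the Lie derivative Xf or $L_X(f)$ and differentiate
f along the vector field with
$$(Xf)\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \end{pmatrix} \begin{pmatrix} X^1(x, y) \\ X^2(x, y) \end{pmatrix}$$
This gives me another function, a 0-tensor field.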
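For readers who like to check such formulas by machine, here is a minimal sketch of Xf using central finite differences in place of exact partial derivatives. The helper names `partial` and `apply_field`, the step size, and the sample field and functions are all choices made for this illustration.

```python
def partial(f, point, index, h=1e-6):
    """Central finite-difference estimate of the index-th partial derivative of f at point."""
    plus = list(point)
    minus = list(point)
    plus[index] += h
    minus[index] -= h
    return (f(*plus) - f(*minus)) / (2 * h)

def apply_field(X, f, point):
    """(Xf)(point): the row of partial derivatives of f times the column X(point)."""
    components = X(*point)
    return sum(components[i] * partial(f, point, i) for i in range(len(point)))

def X(x, y):
    """The field y d/dx - x d/dy, which circulates around the origin."""
    return (y, -x)

def radius_squared(x, y):
    return x * x + y * y

def first_coordinate(x, y):
    return x

a = (1.0, 2.0)
print(apply_field(X, radius_squared, a))    # X is tangent to circles, so this should be close to 0
print(apply_field(X, first_coordinate, a))  # should be close to X^1(a) = 2
```

The first result is (up to rounding) zero because the chosen field runs along the level curves of x² + y², and the second simply recovers the first component of X at the point.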

I can certainly do this to both components $v^1(a)$ and $v^2(a)$ and this will
give me a new vector field which I shall write as $\nabla_X(Y)$. Baez and Muniain
write $D_X(Y)$, at least some of the time, which reminds you that this is
something to do with differentiation, but then, $\nabla$ is also something to do
with differentiation. Notice $\nabla_X(Y)$ is very different from $\nabla_Y(X)$ in general.
The former requires differentiation of Y in a direction at a point, but has
nothing to do with differentiating X, and the latter is the other way around.
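With the notation given I have
$$X = u^1(x, y) \frac{\partial}{\partial x} + u^2(x, y) \frac{\partial}{\partial y}$$
and
$$Y = v^1(x, y) \frac{\partial}{\partial x} + v^2(x, y) \frac{\partial}{\partial y}$$
Then going back to writing vectors as columns we have
$$\nabla_X(Y) = \begin{pmatrix} \dfrac{\partial v^1}{\partial x} & \dfrac{\partial v^1}{\partial y} \\ \dfrac{\partial v^2}{\partial x} & \dfrac{\partial v^2}{\partial y} \end{pmatrix} \begin{pmatrix} u^1(x, y) \\ u^2(x, y) \end{pmatrix}$$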
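The matrix formula is easy to try out numerically. The following sketch, with made-up helper names and finite differences standing in for exact derivatives, multiplies the Jacobian matrix of Y by the column X at a point, and does the same with the roles reversed so the two results can be compared.

```python
def partial(f, point, index, h=1e-6):
    """Central finite-difference estimate of the index-th partial derivative of f at point."""
    plus = list(point)
    minus = list(point)
    plus[index] += h
    minus[index] -= h
    return (f(*plus) - f(*minus)) / (2 * h)

def euclidean_nabla(X, Y, point):
    """Components of nabla_X(Y) at point: (Jacobian matrix of Y) times the column X(point)."""
    u = X(*point)
    n = len(point)
    result = []
    for j in range(n):
        # The j-th component function v^j of Y.
        component = lambda *p, j=j: Y(*p)[j]
        result.append(sum(u[i] * partial(component, point, i) for i in range(n)))
    return tuple(result)

def X(x, y):
    """y d/dx - x d/dy"""
    return (y, -x)

def Y(x, y):
    """xy d/dx + y d/dy, an arbitrary field chosen for the example."""
    return (x * y, y)

a = (1.0, 2.0)
print(euclidean_nabla(X, Y, a))   # derivative of Y in the direction X(a)
print(euclidean_nabla(Y, X, a))   # derivative of X in the direction Y(a)
```

By hand, the Jacobian of Y at (1, 2) has rows (2, 1) and (0, 1), and X(1, 2) = (2, -1), so the first result should be close to (3, -1); the second should be close to (2, -2). The two are not the same.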
Remark 8.2.1. We could write this using the Einstein summation convention as
$$\nabla_X(Y) = u^i \, \partial_i(v^j) \, \partial_j, \quad i, j \in [1 : 2]$$
which has the advantage that if I leave out the last part, by not specifying
which n we are working in, it makes sense for $\mathbb{R}^n$ for any n.
Exercise 8.2.1. Take a nice vector field X on $\mathbb{R}^2$ such as $y \frac{\partial}{\partial x} - x \frac{\partial}{\partial y}$. Choose
a nice simple Y and calculate $\nabla_X(Y)$, also $\nabla_Y(X)$, $\nabla_X(X)$ and $\nabla_Y(Y)$.
Sketch all the vector fields and satisfy yourself everything makes sense, and
that we can legitimately regard $\nabla_X(Y)$ as a derivative of Y in the direction
X at each point.
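Remark 8.2.2. From the matrix notation, certain things are obvious:
1. $\nabla_X(Y)$ is certainly additive in X:
$$\nabla_{X_1 + X_2}(Y) = \nabla_{X_1}(Y) + \nabla_{X_2}(Y)$$
2. $\nabla_X(Y)$ is $\mathbb{R}$-linear in X:
$$\forall t \in \mathbb{R}, \quad \nabla_{tX}(Y) = t \nabla_X(Y)$$
3. Since this is done pointwise as far as X is concerned, it is $C^\infty(\mathbb{R}^2, \mathbb{R})$
linear in X:
$$\forall f \in C^\infty(\mathbb{R}^2, \mathbb{R}), \quad \nabla_{fX}(Y) = f \nabla_X(Y)$$
4. $\nabla_X(Y)$ is $\mathbb{R}$-linear in Y:
$$\nabla_X(Y_1 + Y_2) = \nabla_X(Y_1) + \nabla_X(Y_2)$$
$$\forall t \in \mathbb{R}, \quad \nabla_X(tY) = t \nabla_X(Y)$$
5. It satisfies the Leibnitz rule so far as $C^\infty(\mathbb{R}^2, \mathbb{R})$ scaling of Y is concerned:
$$\nabla_X(fY) = f \nabla_X(Y) + (Xf) Y$$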
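Properties 3 and 5 are the ones that distinguish the two slots of the operator, and they can be spot-checked at a point. The sketch below is a numerical sanity check, not a proof; the fields X, Y, the function f and all helper names are arbitrary choices for the example, and derivatives are estimated by finite differences.

```python
def partial(f, point, index, h=1e-6):
    """Central finite-difference estimate of the index-th partial derivative of f at point."""
    plus = list(point)
    minus = list(point)
    plus[index] += h
    minus[index] -= h
    return (f(*plus) - f(*minus)) / (2 * h)

def euclidean_nabla(X, Y, point):
    """Components of nabla_X(Y) at point: Jacobian of Y applied to X(point)."""
    u = X(*point)
    n = len(point)
    return tuple(
        sum(u[i] * partial(lambda *p, j=j: Y(*p)[j], point, i) for i in range(n))
        for j in range(n)
    )

def X(x, y):
    return (y, -x)

def Y(x, y):
    return (x * y, y)

def f(x, y):
    return x + y * y

def scale(g, V):
    """The field gV, scaled pointwise by the function g."""
    return lambda *p: tuple(g(*p) * c for c in V(*p))

a = (1.0, 2.0)
nabla_XY = euclidean_nabla(X, Y, a)

# Property 3: nabla_{fX}(Y) = f nabla_X(Y)
lhs3 = euclidean_nabla(scale(f, X), Y, a)
rhs3 = tuple(f(*a) * c for c in nabla_XY)

# Property 5: nabla_X(fY) = f nabla_X(Y) + (Xf) Y
Xf = sum(X(*a)[i] * partial(f, a, i) for i in range(2))
lhs5 = euclidean_nabla(X, scale(f, Y), a)
rhs5 = tuple(f(*a) * nabla_XY[j] + Xf * Y(*a)[j] for j in range(2))

print(lhs3, rhs3)
print(lhs5, rhs5)
```

At the chosen point a hand calculation gives (15, -5) for both sides of property 3 and (11, -9) for both sides of property 5; the extra term (Xf)Y is what separates the two.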
Exercise 8.2.2. Verify all these.
Exercise 8.2.3. Confirm that the Lie derivative $L_X(Y)$ does not satisfy all
these conditions.
Exercise 8.2.4. Instead of operating with $\nabla_X$ on a vector field, you could
operate in a very similar way on a covector field or 1-form, $\omega = P\,dx + Q\,dy$,
to get another 1-form $\nabla_X(\omega) = XP\,dx + XQ\,dy$. Show this also satisfies
the above conditions.
Remark 8.2.3. We have now defined $\nabla_X$ for sections of two different bundles
over $\mathbb{R}^2$, the tangent bundle and the cotangent bundle.
Definition 8.2.1. Any operator $\nabla_X$ for a vector field X on sections of any
vector bundle over $\mathbb{R}^2$ which satisfies the properties of Remark 8.2.2 is called a
connection, and the particular connections described are called the Euclidean
connections on the tangent and cotangent bundles.
Obviously extending these to $\mathbb{R}^n$ merely means more terms. Extending them
to manifolds gives some complications.
8.2.2 Curves and transporting vectors
We can now talk about moving a vector along a curve in $\mathbb{R}^2$ so it stays
pointing in the same direction. If $\gamma : I \to \mathbb{R}^2$ is a smooth curve, and u(t) is a
vector at $\gamma(t) \in \mathbb{R}^2$, I want to use the fact that for each $t \in I$, if I differentiate
u(t) in the direction $\dot\gamma(t)$ there is no change, so I want
$$\forall t \in I, \quad \nabla_{\dot\gamma(t)}(u(t)) = 0$$
This does not make sense, because neither $\dot\gamma(t)$ nor u are vector fields on
$\mathbb{R}^2$, although $\dot\gamma$ is a vector field on the curve. This doesn't matter as far as
$\dot\gamma(t)$ is concerned, because we only need a vector at each point of the curve
to give a direction in which to differentiate. It does matter as far as u(t)
is concerned; at the very least we want to know what u is doing in some
neighbourhood of the curve, whereupon $\nabla_{\dot\gamma(t)}(u(t))$ becomes intelligible and
I can insist that it be zero. This will put some conditions on u. We already
know, of course, exactly what we want to get out, because we know that
shifting vectors parallel to themselves in $\mathbb{R}^2$ is rather trivial. What we want
to conclude is that u is constant along $\gamma$, and that this does not depend on
the curve. But we have an eye on doing the same sort of thing on $S^2$, where
life is more complicated.
You can see that the proposition that the directional derivative of a function
is zero in every direction certainly tells us that the function is constant. You
can also see that if the directional derivative of a function along a curve is
zero, it must be constant along the curve. And we don't give a damn what
the function is doing elsewhere. This holds for each component function of
the vector field that extends u in a neighbourhood of the curve.
Exercise 8.2.5. Prove that a smooth function which has directional derivative
along a curve equal to zero is constant along the curve.
Exercise 8.2.6. Is it true for any continuous curve in $\mathbb{R}^2$, that a function
defined on it can always be extended to a neighbourhood of the curve?
We therefore deduce that for the euclidean connection on $\mathbb{R}^2$, the properties
of the connection ensure that the condition
$$\nabla_{\dot\gamma(t)}(u(t)) = 0$$
is quite intelligible since it makes sense for any extension of u(t) into a
neighbourhood of the curve, and it tells us how to parallel transport a vector
along a curve, and that for any two points on the curve, the condition ensures
the vector at each point is the same. Big deal. Of course, the notion of
vectors in two different tangent spaces being the same is certainly trivial in
$\mathbb{R}^2$, indeed in $\mathbb{R}^n$ generally, so long as we have the standard structure; there
is one obvious sense in which it makes sense. The trick is to say what it means
along curves in manifolds that are not so simple.
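The reason the choice of extension does not matter can be written out in one line with the chain rule. In the sketch below, U is a name introduced only for this illustration: it stands for any smooth vector field defined near the curve with $U(\gamma(t)) = u(t)$, and the components are taken in the standard basis.

```latex
% U is any smooth extension of u to a neighbourhood of the curve,
% so that U(\gamma(t)) = u(t) for every t in I.
\begin{align*}
\bigl(\nabla_{\dot\gamma(t)} U\bigr)^j
  &= \dot\gamma^i(t)\,\partial_i U^j\bigl(\gamma(t)\bigr) \\
  &= \frac{d}{dt}\,U^j\bigl(\gamma(t)\bigr) \\
  &= \frac{du^j}{dt}(t)
\end{align*}
```

The right hand side involves only the values of u on the curve, so two different extensions give the same answer, and the parallel transport condition is just $du^j/dt = 0$ for each j.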
From this, after some reflection, we conclude that the euclidean connection
on $\mathbb{R}^n$ solves all the problems of parallel transportation on $\mathbb{R}^n$. This, face
it, wasn't much of a problem. On the other hand, it does give us a hope of
solving the same problem on $S^2$ and other manifolds, including the universe
in which we live. And if we can parallel translate vectors we should be able
to parallel translate other things using the same ideas.
So we study connections.
8.3 Covariance
The question is, can we make this work on manifolds in general? Certain
things are prerequisites: in particular, this all has to be independent of a
choice of basis. If you are a physicist and you use a vector field or differential
1-form to represent some thing like an electric field, you would insist that
the vector at a point is a real thing that does not depend on the choice of a
coordinate system. We now know that the right way to express this belief is
in terms of invariance under certain group actions. You and I may differ in
the actual numbers to be assigned, but we'd better agree on what happens
in the world after a suitable translation scheme is established. Or it ain't
Science.
If you change the basis of $\mathbb{R}^n$ for some reason known only to yourself, then
both X and Y will have different representations. Still, a vector field exists
independent of your description, and something would be horribly wrong if
$\nabla_X(Y)$ depended on the basis.
Exercise 8.3.1. Take the vector fields on $\mathbb{R}^2$ you used for an earlier problem
and express them in the basis
$$\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
in such a way that it really is the same vector field. Do the calculations all
over again.
Exercise 8.3.2. Show that in general if we change the basis on $\mathbb{R}^n$ so that
X and Y are written in the new basis as $X'$ and $Y'$, then $\nabla_{X'}(Y')$ is what it
jolly well ought to be.
Exercise 8.3.3. Using the same vector fields, do the calculations using polar
coordinates. What conclusions do you draw?
Exercise 8.3.4. Suppose $\varphi$ is a diffeomorphism of $\mathbb{R}^n$ and X, Y are vector
fields on $\mathbb{R}^n$. Explain how to describe X, Y in terms of the coordinate system
given by $\varphi$. What happens to $\nabla_X(Y)$ under this diffeomorphism?
8.4 Extensions to Tensor Fields on $\mathbb{R}^2$
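Returning to $\mathbb{R}^2$, we could express the Euclidean connection in the form:
$$\nabla_X(Y) = \begin{pmatrix} \dfrac{\partial v^1}{\partial x} & \dfrac{\partial v^1}{\partial y} \\ \dfrac{\partial v^2}{\partial x} & \dfrac{\partial v^2}{\partial y} \end{pmatrix} \begin{pmatrix} u^1(x, y) \\ u^2(x, y) \end{pmatrix}$$
or the somewhat more compact:
$$\nabla_X(Y) = u^i \, \partial_i(v^j) \, \partial_j$$
This rather obscures the fact that I am using u's to represent the vector field
X and v's to represent the vector field Y, so it might be better to write
$$\nabla_X(Y) = X^i \, \partial_i(Y^j) \, \partial_j \qquad (8.4.1)$$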
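The definition of $\nabla_X(\omega)$ where
$$\omega = P\,dx + Q\,dy = \omega_1\,dx + \omega_2\,dy$$
was just
$$\nabla_X(\omega) = X\omega_1\,dx + X\omega_2\,dy = X\omega_j\,dx^j$$
and unpacking the expression for $X\omega_j$ we get
$$\nabla_X(\omega) = X^i \, \partial_i\omega_j \, dx^j \qquad (8.4.2)$$
which looks a lot like equation 8.4.1.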
Exercise 8.4.1. Write out equation 8.4.2 as a matrix.
The fact that we have the same basic shape for vector fields as for 1-forms
tells us that all we are doing is choosing a suitable basis for each of them:
$(e_1, e_2)$ is the standard basis for the vectors in $\mathbb{R}^2$ and I have $(\partial_1, \partial_2)$ for
the standard basis for the tangent vectors, and $(dx^1, dx^2)$ for the cotangent
vectors. Suppose I have a $(k, \ell)$ tensor bundle, then I can write out a basis
for any section as a collection of terms in the form
$$dx^{i_1} \otimes dx^{i_2} \otimes \cdots \otimes dx^{i_k} \otimes \partial_{i_{k+1}} \otimes \cdots \otimes \partial_{i_{k+\ell}}$$
We can extend the definition of $\nabla_X(Y)$ to $\nabla_X(s)$ where s is any section of
the tensor bundle by writing
$$\nabla_X(\sigma \otimes \tau) = \nabla_X(\sigma) \otimes \tau + \sigma \otimes \nabla_X(\tau)$$
and extending to as many tensor products as you feel a need for, and using
linearity.
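As a worked illustration of how the product rule unwinds, take a section with two covariant slots, written here with hypothetical component functions $s_{ij}$. The sketch assumes the Leibnitz rule for multiplication by functions and uses the fact that the 1-form rule gives $\nabla_X(dx^i) = X(1)\,dx^i = 0$ for the Euclidean connection.

```latex
\begin{align*}
\nabla_X\bigl(s_{ij}\,dx^i \otimes dx^j\bigr)
 &= (X s_{ij})\,dx^i \otimes dx^j
  + s_{ij}\,\nabla_X(dx^i) \otimes dx^j
  + s_{ij}\,dx^i \otimes \nabla_X(dx^j) \\
 &= (X^k\,\partial_k s_{ij})\,dx^i \otimes dx^j
\end{align*}
```

So for the Euclidean connection the covariant derivative of such a section is obtained by applying X to each component function separately, exactly as for vector fields and 1-forms.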
Exercise 8.4.2. Show that for all sections of a tensor bundle s, the properties
of Remark 8.2.2 hold.
Exercise 8.4.3. Letting X be the same old vector field on $\mathbb{R}^2$ as in earlier
exercises, let a Riemannian inner product be defined on the positive
quadrant by the matrix
$$s = \begin{pmatrix} 1 + xy & 0 \\ 0 & 1 + x^2 + y^2 \end{pmatrix}$$
Find $\nabla_X(s)$. Is it positive definite? If we take the covariant derivative of a
symmetric 2-tensor s, is the resulting 2-tensor necessarily symmetric?
8.5 The Koszul Connection
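The crucial properties of the covariant derivative of tensor bundles on $\mathbb{R}^2$
were: For any section s of a tensor bundle,
1. $\nabla_X(s)$ is certainly additive in X:
$$\nabla_{X_1 + X_2}(s) = \nabla_{X_1}(s) + \nabla_{X_2}(s)$$
2. $\nabla_X(s)$ is $\mathbb{R}$-linear in X:
$$\forall t \in \mathbb{R}, \quad \nabla_{tX}(s) = t \nabla_X(s)$$
3. Since this is done pointwise as far as X is concerned, it is $C^\infty(\mathbb{R}^2, \mathbb{R})$
linear in X:
$$\forall f \in C^\infty(\mathbb{R}^2, \mathbb{R}), \quad \nabla_{fX}(s) = f \nabla_X(s)$$
4. $\nabla_X(s)$ is $\mathbb{R}$-linear in s:
$$\nabla_X(s_1 + s_2) = \nabla_X(s_1) + \nabla_X(s_2)$$
$$\forall t \in \mathbb{R}, \quad \nabla_X(ts) = t \nabla_X(s)$$
5. It satisfies the Leibnitz rule so far as $C^\infty(\mathbb{R}^2, \mathbb{R})$ scaling of s is concerned:
$$\nabla_X(fs) = f \nabla_X(s) + (Xf) s$$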
Extending these to $\mathbb{R}^n$ is rather trivial; consider it done. The next step is
to define a Koszul connection on any vector bundle E over a manifold M as
a map operating on a vector field, X, on M and any section s of E which
satisfies the above rules. This is rather abstract, but I have built up the
simple concrete cases first in order to cheer you up.
8.6 Vector Potentials
We gave a very simple covariant derivative on $\mathbb{R}^2$ which quite obviously
satisfied the rules for a connection, indeed that's where we got the rules from.
Now we take the Physicist's perspective. Weaning them off coordinates is an
ongoing process, so let's try doing it their way; then we get to be able to do
lots of sums, which is good, and confuse things horribly, which is bad.
Suppose we have a section s of some bundle E over $\mathbb{R}^2$ with fibre a vector
space F so that $E = \mathbb{R}^2 \times F$. s is therefore a map from $\mathbb{R}^2$ to F, and if
we let $(e_1, e_2, \cdots e_n)$ be a basis for F, then for every point $v \in \mathbb{R}^2$ we have
$s(v) = s^1(v)e_1 + s^2(v)e_2 + \cdots + s^n(v)e_n$, or $s(v) = s^i e_i$ in Physicist's notation.
If X is a vector field on $\mathbb{R}^2$ we can write $X = X^1 \partial_1 + X^2 \partial_2 = X^j \partial_j$. And
if $s' = \nabla_X(s)$ we have also $s'(v) = s'^1(v)e_1 + \cdots + s'^n(v)e_n = s'^i e_i$. And
the n functions $s'^i(v)$ depend on the n functions $s^i(v)$ and on the functions
$X^1, X^2$ only. Moreover, the rules for getting the $s'^i$ are specified by the rules
of section 8.5 and nothing else. Let's see how it works out. I shall have to
take $\nabla_Y$ for Y the unit vector field in the direction of the x-axis and also the
y-axis, and rather than write this as $\nabla_{\partial_1}$ or $\nabla_{\partial_2}$ I shall shorten this to $\nabla_1$ and
$\nabla_2$ respectively.
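$$\begin{aligned}
s' = \nabla_X(s) &= \nabla_{(X^1 \partial_1 + X^2 \partial_2)}(s) \\
&= \nabla_{X^1 \partial_1}(s) + \nabla_{X^2 \partial_2}(s) \\
&= X^1 \nabla_1(s) + X^2 \nabla_2(s) \\
&= X^1 \nabla_1(s^1 e_1 + \cdots + s^n e_n) + X^2 \nabla_2(s^1 e_1 + \cdots + s^n e_n) \\
&= X^j \nabla_j(s^i e_i) \quad \text{using the Einstein convention} \\
&= X^j \bigl(s^i \nabla_j(e_i) + (\partial_j s^i) e_i\bigr) \quad \text{(Leibnitz)}
\end{aligned}$$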
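It might be better to expand this last to
$$\begin{aligned}
\nabla_X(s) = {} & X^1 \bigl(s^1 \nabla_1(e_1) + s^2 \nabla_1(e_2) + \cdots + s^n \nabla_1(e_n)\bigr) \\
& + X^2 \bigl(s^1 \nabla_2(e_1) + s^2 \nabla_2(e_2) + \cdots + s^n \nabla_2(e_n)\bigr) \\
& + X^1 \bigl((\partial_1 s^1) e_1 + (\partial_1 s^2) e_2 + \cdots + (\partial_1 s^n) e_n\bigr) \\
& + X^2 \bigl((\partial_2 s^1) e_1 + (\partial_2 s^2) e_2 + \cdots + (\partial_2 s^n) e_n\bigr)
\end{aligned}$$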
Now the terms $\nabla_1(e_i)$ and $\nabla_2(e_i)$ are, for each $i \in [1 : n]$, going to be
values of the section, and can therefore be expressed in terms of the basis
$(e_1, e_2, \cdots e_n)$. In other words, for each $j \in [1 : 2]$ and for each
$i \in [1 : n]$, at each point $v \in \mathbb{R}^2$, there is a collection of numbers $A^k_{i,j}(v)$
expressing $\nabla_j(e_i)$ as
$$\sum_{k \in [1:n]} A^k_{i,j} e_k$$
Which tells us that we can express $\nabla_X(s)$ as
$$X^1 \sum_{i,k \in [1:n]} s^i A^k_{i,1} e_k + X^2 \sum_{i,k \in [1:n]} s^i A^k_{i,2} e_k + X^1 \sum_{i \in [1:n]} (\partial_1 s^i) e_i + X^2 \sum_{i \in [1:n]} (\partial_2 s^i) e_i$$
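Collecting these up and changing the name of a summation index (swapping i and k
in the first two sums) gives us:
$$\nabla_X(s) = \sum_{j \in [1:2]} \sum_{i \in [1:n]} X^j \Bigl(\partial_j s^i + \sum_{k \in [1:n]} A^i_{k,j} s^k\Bigr) e_i \qquad (8.6.1)$$
or
$$\nabla_X(s) = X^j (\partial_j s^i + A^i_{k,j} s^k) e_i$$
in physics speak.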
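To see how the pieces of equation 8.6.1 fit together, here is a small sketch that evaluates the formula at a point, reading the potential term as contracted against the components $s^k$ as the derivation requires. Everything named in the code is an assumption made for the illustration: the helper names, the finite-difference derivative, and in particular the potential `A`, which is simply invented (two non-zero entries) and does not come from any geometry.

```python
def partial(f, point, index, h=1e-6):
    """Central finite-difference estimate of the index-th partial derivative of f at point."""
    plus = list(point)
    minus = list(point)
    plus[index] += h
    minus[index] -= h
    return (f(*plus) - f(*minus)) / (2 * h)

def nabla(X, s, A, point):
    """Components of nabla_X(s) at point:
    sum over j of X^j (d_j s^i + sum over k of A^i_{k,j} s^k).
    A(point) must return a nested list with A[i][k][j] = A^i_{k,j} (indices from 0)."""
    m = len(point)              # dimension of the base
    Xp, sp, Ap = X(*point), s(*point), A(*point)
    n = len(sp)                 # dimension of the fibre
    out = []
    for i in range(n):
        total = 0.0
        for j in range(m):
            derivative = partial(lambda *p, i=i: s(*p)[i], point, j)
            potential = sum(Ap[i][k][j] * sp[k] for k in range(n))
            total += Xp[j] * (derivative + potential)
        out.append(total)
    return tuple(out)

def X(x, y):
    return (1.0, x)

def s(x, y):
    return (x, y * y)

def A_made_up(x, y):
    """An invented vector potential: A^1_{2,1} = 1 and A^2_{1,1} = -1, all else zero."""
    a = [[[0.0, 0.0] for _k in range(2)] for _i in range(2)]
    a[0][1][0] = 1.0
    a[1][0][0] = -1.0
    return a

def A_zero(x, y):
    """All terms zero: this should reproduce the Euclidean connection."""
    return [[[0.0, 0.0] for _k in range(2)] for _i in range(2)]

point = (1.0, 2.0)
print(nabla(X, s, A_made_up, point))
print(nabla(X, s, A_zero, point))
```

A hand calculation at (1, 2) gives (5, 3) with the invented potential and (1, 4) with the zero potential, so the numbers printed should be close to those; the difference between the two is exactly the contribution of the potential term.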
Exercise 8.6.1. Find the expression for $\nabla_j(\partial_i)$ in terms of the standard
basis for vectors in $\mathbb{R}^2$. Now do it for polar coordinates.
The $2n^2$ functions $A^i_{k,j}$ from $\mathbb{R}^2$ to $\mathbb{R}$ (on $\mathbb{R}^2$; on $\mathbb{R}^m$ it would be $mn^2$) pretty
much tell us everything about the connection, given that the $X^j$ tell us about
the vector field X and the $\partial_j s^i$ tell us about differentiating the section. In
general they are $mn^2$ functions from the manifold of dimension m to $\mathbb{R}$,
and they tell us how the connection works on the bundle with fibre F of
dimension n. The collection of functions is called the Vector Potential for
the connection, or sometimes the Christoffel symbols. When the manifold
has the same dimension, n, as the fibre, there are $n^3$ such functions. The
text book prefers to use the term Christoffel symbols for the case when the
connection respects a Riemannian inner product.
Exercise 8.6.2. When we discussed the Euclidean connection for sections
of the tangent bundle, what were the $A^i_{k,j}$?
The significance of the vector potential term in equation 8.6.1 is not hard to
see. If we left it out, or equivalently insisted that all terms are zero, then in
the case of a vector field we would simply have the situation of the Euclidean
connection,
$$\nabla_X(Y) = \begin{pmatrix} \dfrac{\partial v^1}{\partial x} & \dfrac{\partial v^1}{\partial y} \\ \dfrac{\partial v^2}{\partial x} & \dfrac{\partial v^2}{\partial y} \end{pmatrix} \begin{pmatrix} u^1(x, y) \\ u^2(x, y) \end{pmatrix}$$
In order to work on $S^2$, this would have to survive a diffeomorphism, and by
an earlier exercise, it doesn't.
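A quick numerical experiment makes the failure concrete. The sketch below takes the constant field $\partial_x$, whose Euclidean derivative in any direction is zero, rewrites it in polar coordinates as $\cos\theta\,\partial_r - (\sin\theta / r)\,\partial_\theta$, and then applies the same matrix rule blindly to the polar components. The helper names and the sample point are choices made for this illustration only.

```python
import math

def partial(f, point, index, h=1e-6):
    """Central finite-difference estimate of the index-th partial derivative of f at point."""
    plus = list(point)
    minus = list(point)
    plus[index] += h
    minus[index] -= h
    return (f(*plus) - f(*minus)) / (2 * h)

def euclidean_rule(X, Y, point):
    """Apply the matrix rule: (matrix of partial derivatives of the components of Y)
    times (components of X), in whatever coordinates the components are given."""
    u = X(*point)
    n = len(point)
    return tuple(
        sum(u[i] * partial(lambda *p, j=j: Y(*p)[j], point, i) for i in range(n))
        for j in range(n)
    )

def ddx_cartesian(x, y):
    """Components of d/dx with respect to (d/dx, d/dy)."""
    return (1.0, 0.0)

def ddx_polar(r, theta):
    """Components of the same field d/dx with respect to (d/dr, d/dtheta)."""
    return (math.cos(theta), -math.sin(theta) / r)

# The same point of the plane in the two coordinate systems.
r, theta = 2.0, math.pi / 4
cartesian_point = (r * math.cos(theta), r * math.sin(theta))
polar_point = (r, theta)

in_cartesian = euclidean_rule(ddx_cartesian, ddx_cartesian, cartesian_point)
naive_polar = euclidean_rule(ddx_polar, ddx_polar, polar_point)

print(in_cartesian)   # the constant field does not change: zero
print(naive_polar)    # the naive polar calculation does not give zero
```

Differentiating the polar components by hand gives $(\sin^2\theta / r,\; 2\sin\theta\cos\theta / r^2)$, which is (0.25, 0.25) at the chosen point. The zero vector has zero components in every basis, so the naive polar answer cannot be the polar form of the cartesian one; the missing piece is precisely a vector potential term.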
Exercise 8.6.3. Take the usual vector fields on $\mathbb{R}^2 \setminus \{0\}$, $X = -y\,\partial_x + x\,\partial_y$
and $Y = x\,\partial_x + y\,\partial_y$. Compute $\nabla_X(Y)$ in cartesian form. Now find
expressions for the same vector fields in polar form. (You'd better get $X_P = \partial_\theta$
and $Y_P = r\,\partial_r$ and make sure you can prove these are correct, not just look
at the pictures!) Now use the rule for the Euclidean connection to calculate
$\nabla_{X_P}(Y_P)$. This had better not be the polar form of $\nabla_X(Y)$ or you have got
the wrong answer.
Exercise 8.6.4. Find a vector field Z on $\mathbb{R}^2 \setminus \{0\}$ which corresponds to the
polar field $\partial_r$. (That is, it consists of a unit vector radially outwards at each
point.) Calculate $\nabla_{\partial_r}(\partial_\theta)$ by translating it into cartesian coordinates to do
the sums and then translate back. What if you had chosen a different basis
for the tangent vectors and picked $r\,\partial_r$ instead? What is $\nabla_{r\partial_r}(\partial_\theta)$? Calculate
$\nabla_{\partial_\theta}(\partial_\theta)$, $\nabla_{\partial_\theta}(r\,\partial_r)$ and $\nabla_{r\partial_r}(r\,\partial_r)$ in the same way. Translate them all back
into polar coordinates.
Exercise 8.6.5. Explain why $r\,\partial_r$ is a better choice than $\partial_r$. (Hint, look at
the polar diffeomorphism.)
Exercise 8.6.6. Hence compute the vector potential for $\nabla_{X_P}(Y_P)$.
Exercise 8.6.7. Confirm that if you take the vector potential into account,
you get the right answer for $\nabla_{X_P}(Y_P)$.
Exercise 8.6.8. $\nabla_{\partial_x}(\partial_y)$ and the similar terms in the cartesian framework
are what you'd expect them to be, but the $\nabla_{\partial_i}(\partial_j)$ in polar coordinates
contain significant information. What are the numbers telling you?
Exercise 8.6.9. If you look at what you have been doing with the above
calculations, you can see that we have defined $\nabla_X(Y)$ in cartesian coordinates
on $\mathbb{R}^2$ (and hence by trivial modification on $\mathbb{R}^n$) and then proceeded to
take it on the subspace $\mathbb{R}^2 \setminus \{0\}$ by ignoring the deleted point. Then we
transferred it all to $S^1 \times \mathbb{R}^+$ by the diffeomorphism P for polar coordinates.
In order to compute $\nabla_{X_P}(Y_P)$, I rather took it for granted that it is to be
done by translating $X_P$ and $Y_P$ into cartesian form, doing it there, and then
translating the answer back into polar coordinates, which surely is the only
sane thing to do. If $\varphi$ were any diffeomorphism from a subset $U \subseteq \mathbb{R}^2$ to
some other space, V, then what we are doing is taking a vector field X on
U to the vector field $T\varphi \circ X \circ \varphi^{-1}$ on V, a vector field Y on U to the vector
field $T\varphi \circ Y \circ \varphi^{-1}$ on V, and defining
$$\nabla_{T\varphi \circ X \circ \varphi^{-1}}\bigl(T\varphi \circ Y \circ \varphi^{-1}\bigr) = T\varphi \circ \nabla_X(Y) \circ \varphi^{-1}$$
Show that this is a consistent way to export $\nabla$ from one manifold to another
which is diffeomorphic to it, and hence explain why we can define a
connection on a manifold, and why $\nabla$ is called covariant differentiation.
8.6.1 Tensor formulation
The term $A^i_{k,j}$ looks very like a (2,1) tensor in coordinate terms. For the
Riemannian Inner product, what goes in at each point in the same tangent
space is a pair of vectors and what comes out is a number, and this is bilinear
and varies smoothly as we move around in the manifold. For the vector
potential, we have that the things going in are two vector fields, or at least a
vector in the tangent space at a point and a field (or possibly more general
section) defined in a neighbourhood of the point (so we can differentiate), and
what comes out is another vector field (or possibly more general section).
8.7 Concluding Remarks
This just starts on the subject of connections, which are crucial to much
differential geometry. For example, it is connections that have curvature. We
could show how a Riemannian Inner product (metric) leads to a connection,
the Levi-Civita connection, which is compatible with the metric. But there
is too much to fit into an introductory course unless I follow the tradition of
training you to say the right things with only minimal grasp of what they
mean, something I much prefer not to do.
