
de Gruyter

Studies in Mathematics
26

Heinz Bauer

Measure and
Integration Theory
de Gruyter Studies in Mathematics 26
Editors: Carlos Kenig, Andrew Ranicki, Michael Röckner
de Gruyter Studies in Mathematics

1 Riemannian Geometry, 2nd rev. ed., Wilhelm P. A. Klingenberg


2 Semimartingales, Michel Métivier
3 Holomorphic Functions of Several Variables, Ludger Kaup and Burchard Kaup
4 Spaces of Measures, Corneliu Constantinescu
5 Knots, Gerhard Burde and Heiner Zieschang
6 Ergodic Theorems, Ulrich Krengel
7 Mathematical Theory of Statistics, Helmut Strasser
8 Transformation Groups, Tammo tom Dieck
9 Gibbs Measures and Phase Transitions, Hans-Otto Georgii
10 Analyticity in Infinite Dimensional Spaces, Michel Hervé
11 Elementary Geometry in Hyperbolic Space, Werner Fenchel
12 Transcendental Numbers, Andrei B. Shidlovskii
13 Ordinary Differential Equations, Herbert Amann
14 Dirichlet Forms and Analysis on Wiener Space, Nicolas Bouleau and
Francis Hirsch
15 Nevanlinna Theory and Complex Differential Equations, Ilpo Laine
16 Rational Iteration, Norbert Steinmetz
17 Korovkin-type Approximation Theory and its Applications, Francesco Altomare
and Michele Campiti
18 Quantum Invariants of Knots and 3-Manifolds, Vladimir G. Turaev
19 Dirichlet Forms and Symmetric Markov Processes, Masatoshi Fukushima,
Yoichi Oshima, Masayoshi Takeda
20 Harmonic Analysis of Probability Measures on Hypergroups, Walter R. Bloom
and Herbert Heyer
21 Potential Theory on Infinite-Dimensional Abelian Groups, Alexander Bendikov
22 Methods of Noncommutative Analysis, Vladimir E. Nazaikinskii,
Victor E. Shatalov, Boris Yu. Sternin
23 Probability Theory, Heinz Bauer
24 Variational Methods for Potential Operator Equations, Jan Chabrowski
25 The Structure of Compact Groups, Karl H. Hofmann and Sidney A. Morris
Heinz Bauer

Measure and Integration Theory


Translated from the German by Robert B. Burckel

Walter de Gruyter
Berlin · New York 2001

Author
Heinz Bauer
Mathematisches Institut
der Universität Erlangen-Nürnberg
Bismarckstraße 1 1/2
91054 Erlangen
Germany

Translator
Robert B. Burckel
Department of Mathematics
Kansas State University
137 Cardwell Hall
Manhattan, Kansas 66506-2602
USA

Series Editors

Carlos E. Kenig
Department of Mathematics
University of Chicago
5734 University Ave
Chicago, IL 60637
USA

Andrew Ranicki
Department of Mathematics
University of Edinburgh
Mayfield Road
Edinburgh EH9 3JZ
Scotland

Michael Röckner
Fakultät für Mathematik
Universität Bielefeld
Universitätsstraße 25
33615 Bielefeld
Germany

Mathematics Subject Classification 2000: 28-01; 28-02


Keywords: Product measures, measures on topological spaces, topological measure theory, introduction
to measures and integration theory

Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.

Library of Congress - Cataloging-in-Publication Data

Bauer, Heinz, 1928-


[Maß- und Integrationstheorie. English]
Measure and integration theory / Heinz Bauer ; translated from the
German by Robert B. Burckel.
p. cm. - (De Gruyter studies in mathematics ; 26)
Includes bibliographical references and indexes.
ISBN 3110167190 (acid-free paper)
1. Measure theory. 2. Integrals, Generalized. I. Title. II. Series.
QC20.7.M43 B4813 2001
530.8'01 - dc21 2001028235

Die Deutsche Bibliothek - Cataloging-in-Publication Data

Bauer, Heinz:
Measure and integration theory / Heinz Bauer. Transl. from the German by
Robert B. Burckel. - Berlin ; New York : de Gruyter, 2001
(De Gruyter studies in mathematics ; 26)
Einheitssacht.: Maß- und Integrationstheorie (engl.)
ISBN 3-11-016719-0

© Copyright 2001 by Walter de Gruyter GmbH & Co. KG, 10785 Berlin, Germany.
All rights reserved including those of translation into foreign languages. No part of this book may be
reproduced in any form or by any means, electronic or mechanical, including photocopy, recording, or
any information storage and retrieval system, without permission in writing from the publisher.
Printed in Germany.
Typesetting: Oldřich Ulrych, Prague, Czech Republic.
Printing and binding: Hubert & Co. GmbH & Co. KG, Göttingen.
Cover design: Rudolf Hübler, Berlin.
In memoriam
OTTO HAUPT
(5.3.1887 -10.11.1988)
former Professor of Mathematics
at the University of Erlangen
Preface

More than thirty years ago my textbook Wahrscheinlichkeitstheorie und Grundzüge
der Maßtheorie was published for the first time. It contained three introductory
chapters on measure and integration as well as a chapter on measure in topologi-
cal spaces, which was embedded in the probabilistic developments. Over the years
these parts of the book were made the basis for lectures on measure and integra-
tion at various universities. Generations of students used the measure theory part
for self-study and for examination preparations, even if their interests often did
not extend as far as the probability theory.
When the decision was made to rewrite and extend the parts devoted to prob-
ability theory, it was also decided to publish the part on measure and integration
theory as a separate volume. This volume had to serve two purposes. As before
it had to provide the measure-theoretic background for my book on probability
theory. Secondly, it should be a self-contained introduction into the field. The Ger-
man edition of this book was published in 1990 (with a second edition in 1992),
followed in 1992 by the rewritten book on probability theory. The latter was trans-
lated into English and the translation was published in 1995 as Probability Theory
(Volume 23) in this series.
When offering now a translation of the book Maß- und Integrationstheorie
we have two aims: To provide the reader of my book on probability theory with
the necessary auxiliary results and, secondly, to serve as a secure entry into a
theory which to an ever-increasing extent is significant not only for many areas
within mathematics, but also for applications in physics, economics and computer
science.
However, once again this book is much more than a pure translation of the
German original and the following quotation of the preface of my book Probability
Theory, applies a further time: "It is in fact a revised and improved version of
that book. A translator, in the sense of the word, could never do this job. This
explains why I have to express my deep gratitude to my very special translator, to
my American colleague Professor Robert B. Burckel from Kansas State University.
He had gotten to know my book by reading its very first German edition. I owe
our friendship to his early interest in it. He expended great energy, especially on
this new book, using his extensive acquaintance with the literature to make many
knowledgeable suggestions, pressing for greater clarity and giving intensive support
in bringing this enterprise to a good conclusion."
In addition I want to thank Dr. Oldřich Ulrych from Prague for his skill and patience
in preparing the book manuscript in TeX for final processing. Many thanks
are due to my family and Professor Niels Jacob, University of Swansea, for reasons
they will know. Finally, I thank my publisher Walter de Gruyter & Co., and, above
all, Dr. Manfred Karbe for publishing the translation of my book.

Erlangen, March 2001 Heinz Bauer


Introduction

Measure theory and integration are closely interwoven theories, both content-wise
and in their historical developments. They form a unit. The development of analy-
sis in the 19th century - here one is thinking especially about the theory of Fourier
series and classical function theory - compelled the creation of a sufficiently gen-
eral concept of the integral that discontinuous functions could also be integrated.
The jump function of P. G. LEJEUNE DIRICHLET should be seen in this light. At
that time only an integration theory due to CAUCHY, a precursor of Riemann's,
was known. And it was not until B. RIEMANN's Habilitation in 1854 (text pub-
lished posthumously in 1867) that Cauchy's ideas were made sufficiently precise
to integrate (certain) discontinuous functions. For the first time the need was felt
for integrability criteria. Parallel to this a "theory of content" was evolving - pri-
marily at the hands of G. PEANO and C. JORDAN - to measure the areas of plane
and the volumes of spatial "figures".
But the decisive breakthrough occurred at the turn of the century, thanks to
the French mathematicians EMILE BOREL and HENRI LEBESGUE. In 1898 Borel -
coming from the direction of function theory - described the "$\sigma$-algebra" of sets
that today bear his name, the Borel sets, and showed how to construct a "measure"
on this $\sigma$-algebra that satisfactorily resolved the problems of measuring content.
In particular, he recognized the significance of the "$\sigma$-additivity" of the measure.
In his thesis (1902) LEBESGUE presented the integral concept, subsequently named
after him, that proved decisive for the development of a general theory. At the same
time he furnished the tools needed to make Borel's ideas more precise. From then
on Lebesgue-Borel measure on the $\sigma$-algebra of Borel sets and Lebesgue measure
on a somewhat larger $\sigma$-algebra - consisting of the sets which are "measurable" in
Lebesgue's sense - became standard methods of analysis.
What was new about Lebesgue's integral concept was not just the way it was
defined, but also - and this was the real reason for its fame - its great versatility as
manifested in the way it behaved with respect to limit operations. Consequently
the convergence theorems are at the center of the integration theory developed by
Lebesgue and his intellectual progeny.
Subsequent developments are characterized by increasing recognition of the
versatility of Lebesgue's concepts in dealing with new demands from mathematics
and its applications. In the course of time (up to 1930) the general (abstract) mea-
sure concept crystallized, and a theory of integration built on it - after Lebesgue's
model.
It is this theory that will be developed here in an introductory fashion, but
far enough that from the platform so erected the reader can easily press ahead to
deeper questions and the manifold applications. Areas in which measure and inte-
gration play a key role are, for example, ergodic theory, spectral theory, harmonic
analysis on locally compact groups, and mathematical economics. But the fore-
most example is probability theory, which uses measure and integration as an
indispensable tool and whose own specific kinds of questions and methods have
in turn helped to shape the former. Even today the development of measure and
integration theory is far from finished.
The book is comprised of four chapters. The first is devoted to the measure
concept and in particular to the Lebesgue-Borel measure and its interplay with
geometry. In the second chapter the integral determined by a measure, and in
particular the Lebesgue integral, the one determined by Lebesgue-Borel measure,
will be introduced and investigated. The short third chapter deals with the product
of measures and the associated integration. An application of this which is very
important in Fourier analysis is the convolution of measures. In the fourth and
last chapter the abstract concept of measure is made more concrete in the form
of Radon measures. As in the original example of Lebesgue-Borel measure, here
the relation of the measure to a topology on the underlying set moves into the
foreground. Essentially two kinds of spaces are allowed: Polish spaces and locally
compact spaces. The topological tools needed for this will mostly be developed in
the text, with the reader occasionally being given only a reference (very specific)
to the standard textbook literature.
The examples accompanying the exposition of a theme have an important func-
tion. They are supposed to illuminate the concepts and illustrate the limitations
of the theory. The reader should therefore work through them with care. Exer-
cises also accompany the exposition. They are not essential to understanding later
developments and, in particular, proofs are not superficially shortened by con-
signing parts to the exercises. But the exercises do serve to deepen the reader's
understanding of the material treated in the text, and working them is strongly
recommended.
Notations

Here we assemble some of the notation and phraseology which will be used in the
text without further comment and which - with but a few exceptions - are in
general use.

By $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ we designate the sets of natural numbers $1, 2, \ldots$ (excluding 0), of
whole numbers, of rational numbers and of real numbers, respectively. We always
think of the field $\mathbb{R}$ as equipped with its usual (euclidean) metric and the topology
that it determines. Thus $|x - y|$ is the euclidean distance between two numbers
$x, y \in \mathbb{R}$. We also speak of the number line $\mathbb{R}$.
Via the adjunction of $+\infty$ and $-\infty$ to $\mathbb{R}$, the extended or compactified number
line $\overline{\mathbb{R}}$ is produced. Addition with the improper numbers $+\infty$ and $-\infty$ is performed
in the usual way: $a + (\pm\infty) = (\pm\infty) + a = \pm\infty$ for $a \in \mathbb{R}$, and as well
$(+\infty) + (+\infty) = +\infty$ and $(-\infty) + (-\infty) = -\infty$. On the other hand $+\infty + (-\infty)$ and
$-\infty + (+\infty)$ are not defined.
As usual too we set $a \cdot (\pm\infty) = \pm\infty$ for all real $a > 0$, including $a = +\infty$, and
$a \cdot (\pm\infty) = \mp\infty$ for all real $a < 0$, including $a = -\infty$. Not so general but typical
in measure theory are the additional conventions

$$0 \cdot (\pm\infty) = (\pm\infty) \cdot 0 := 0,$$

which mean that the product $a \cdot b$ is defined for all $a, b \in \overline{\mathbb{R}}$.
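
The arithmetic conventions for the extended number line can be sketched in code. The following is only an illustration: the names `ext_add` and `ext_mul` are hypothetical, floating-point infinities stand in for the improper numbers, and the convention that a product with a zero factor equals 0 has to be imposed by hand because IEEE arithmetic would return NaN for it.

```python
import math

INF = math.inf  # stands in for the improper number +infinity


def ext_add(a: float, b: float) -> float:
    """Sum on the extended number line; (+inf) + (-inf) is left undefined."""
    if math.isinf(a) and math.isinf(b) and a != b:
        raise ValueError("(+inf) + (-inf) is not defined")
    return a + b


def ext_mul(a: float, b: float) -> float:
    """Product on the extended number line with the convention 0 * (+-inf) = 0."""
    if a == 0 or b == 0:
        # Measure-theoretic convention; plain float arithmetic would give NaN here.
        return 0.0
    return a * b
```

With these helpers every product of two extended numbers is defined, while the sum of $+\infty$ and $-\infty$ still raises an error.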


The notation A := B or B =: A means that this equation is the definition of A
in terms of B.
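The $\le$ (resp., $<$) relation in $\mathbb{R}$ is extended to $\overline{\mathbb{R}}$ via the decree $-\infty < a < +\infty$
for all $a \in \mathbb{R}$. A plus sign affixed to $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ or $\overline{\mathbb{R}}$ as a subscript means the sets
$\mathbb{Z}_+$, $\mathbb{Q}_+$, $\mathbb{R}_+$, $\overline{\mathbb{R}}_+$ of all non-negative whole, rational, real numbers, or - in the last
case - all $a \in \overline{\mathbb{R}}$ with $0 \le a \le +\infty$.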
Intervals in R are designated as usual by [a, b], ]a, b[, ]a, b] and [a, b[. However,
(a, b) will never be used for an open interval, but only for the ordered pair with
first element a, second element b.

For every pair of elements $a, b \in \overline{\mathbb{R}}$

$$a \vee b := \max\{a, b\}, \qquad a \wedge b := \min\{a, b\}$$

designate their respective maximum and minimum. Obviously the equations

$$|a| = a \vee (-a) = a^+ + a^- \quad\text{and}\quad a = a^+ - a^-$$

hold without any restrictions on $a$ if we set, as usual,

$$a^+ := a \vee 0 \quad\text{and}\quad a^- := (-a)^+ = -(a \wedge 0).$$
Of course, $a^+ \ge 0$ and $a^- \ge 0$ for all $a$. For finitely many $a_1, \ldots, a_n \in \overline{\mathbb{R}}$ the
corresponding expressions $a_1 \vee \ldots \vee a_n$ and $a_1 \wedge \ldots \wedge a_n$ stand for $\max\{a_1, \ldots, a_n\}$
and $\min\{a_1, \ldots, a_n\}$, respectively.
For the set-theoretic operations we use the usual symbols: $\cup$ or $\bigcup$ for union,
$\cap$ or $\bigcap$ for intersection, and the prefix $\complement$ to signify complementation. The
set-theoretic relation of inclusion is written $A \subset B$, and equality of the sets is not
thereby excluded. For the difference set $A \cap \complement B$, the set of all $x \in A$ such that
$x \notin B$, we also write $A \setminus B$. Sets $A$ and $B$ which have an empty intersection, that
is, for which $A \cap B = \emptyset$, are said to be disjoint.
The power set $\mathcal{P}(\Omega)$ of a set $\Omega$ is the set of all subsets of $\Omega$, including the
empty set $\emptyset$. A set $A$ will be called countable if it is either finite or denumerably
infinite. In other words, we will be using "countable" in lieu of the equally popular
expression "at most countable". Obviously the empty set is to be understood as
a finite set. A set will be called non-denumerable or uncountable if it is neither
finite nor denumerable.
Mappings of a set $A$ into a set $B$ will be denoted by $f : A \to B$ or by the
mapping prescription $x \mapsto f(x)$ (with $x \in A$). In case $B = \mathbb{R}$ we speak of a real
function or a real-valued function on $A$. Not universal, but useful for our purposes,
is the designation numerical function on $A$ for mappings $f : A \to \overline{\mathbb{R}}$ into the
extended number line. The restriction of a mapping $f : A \to B$ to a subset $A'$
of $A$ will be denoted by $f \mid A'$. The composition of $f$ with a mapping $g : B \to C$
will be denoted $g \circ f$ and the pre-image or inverse-image of a set $B' \subset B$ under
the mapping $f$ will be denoted $f^{-1}(B')$.
A sequence in a set $A$ is a mapping $f : \mathbb{N} \to A$ of the set $\mathbb{N}$ of natural numbers
into $A$. Designating the image element $f(n)$ by $a_n$, we also write $(a_n)_{n \in \mathbb{N}}$
or simply $(a_n)$ for the mapping $f$. If other index sets, e.g., $\mathbb{Z}_+ =
\{0, 1, \ldots\}$, come up, this notation is appropriately modified to, e.g., $(a_n)_{n \in \mathbb{Z}_+}$ or
$(a_n)_{n = 0, 1, \ldots}$. In the same way finite sets are often exhibited as $(a_i)_{i = 1, \ldots, n}$ with
$n \in \mathbb{N}$. Even more generally, we write mappings $f : I \to A$ of a set $I$ into the
set $A$ as $(a_i)_{i \in I}$, understanding by $a_i$ the element $f(i)$ of $A$. And we then speak of
a family in $A$ (with index set $I$).
If the terms of a sequence $(a_n)_{n \in \mathbb{N}}$ in a set $A$ from some index $n_0 \in \mathbb{N}$ onwards
possess a certain property, that is, if there are but finitely many exceptional indices,
we say that ultimately all terms of the sequence have the property. The popular
phrasing "almost all terms of the sequence possess the property" has to be avoided
in measure theory because there the concept "almost all" is employed in another
sense.

If $f$ and $g$ are real functions on a set $X$, then $f + g$, $f \cdot g$, etc., designate the
real functions $x \mapsto f(x) + g(x)$, $x \mapsto f(x) g(x)$, etc., on $X$. Numerical functions are
combined analogously, as long as $f(x) + g(x)$ is defined for every $x \in X$, there being
no problem with $f(x) g(x)$ in this regard, thanks to the preceding conventions. If
$(f_n)$ is a sequence of real or numerical functions on $X$ such that the series $\sum_{n=1}^{\infty} f_n(x)$
converges in $\overline{\mathbb{R}}$ for every $x \in X$, then $\sum_{n=1}^{\infty} f_n$, or simply $\sum f_n$, designates the
function $x \mapsto \sum_{n=1}^{\infty} f_n(x)$. Also, functions like $\sup_{n \in \mathbb{N}} f_n$, $\inf_{n \in \mathbb{N}} f_n$, $\limsup_{n \to \infty} f_n$,
$\liminf_{n \to \infty} f_n$, $\lim_{n \to \infty} f_n$ are defined "pointwise" via $x \mapsto \sup_{n \in \mathbb{N}} f_n(x)$,
$x \mapsto \inf_{n \in \mathbb{N}} f_n(x)$, etc.; whereby,
of course, use of $\lim f_n$ presupposes the convergence in $\overline{\mathbb{R}}$ of the sequence $(f_n(x))$
for each $x \in X$.
For numerical functions $f_1, \ldots, f_n$ on a set $X$

$$f_1 \vee \ldots \vee f_n \quad\text{and}\quad f_1 \wedge \ldots \wedge f_n$$

designate the functions

$$x \mapsto f_1(x) \vee \ldots \vee f_n(x) \quad\text{and}\quad x \mapsto f_1(x) \wedge \ldots \wedge f_n(x).$$

At each point $x \in X$ they assume, respectively, the largest and smallest of the
function values $f_1(x), \ldots, f_n(x)$. These two functions are called, respectively, the
upper and the lower envelopes of $f_1, \ldots, f_n$. Correspondingly, $\sup_{n \in \mathbb{N}} f_n$ and $\inf_{n \in \mathbb{N}} f_n$ are
called the upper and lower envelopes of the sequence $(f_n)$ of numerical functions
on $X$.
A numerical function defined on a subset of $\overline{\mathbb{R}}$ is called isotone, resp., antitone, if
it is weakly increasing, resp., decreasing. We use this terminology also for numerical
functions $f : A \to \overline{\mathbb{R}}$ when $A$ is a (partially) ordered set. That is, if from $x, y \in A$
and $x \le y$ always follows $f(x) \le f(y)$, resp., $f(x) \ge f(y)$, then $f$ is called isotone,
resp., antitone. If from $x < y$ always follows $f(x) < f(y)$, resp., $f(x) > f(y)$, then
$f$ is called strictly isotone, resp., strictly antitone.
For sequences $(a_n)$ in $\overline{\mathbb{R}}$ the symbolisms

$$a_n \uparrow a, \qquad a_n \downarrow a$$

express that the sequence is isotone, resp., antitone, and that $a \in \overline{\mathbb{R}}$ is its supremum,
resp., its infimum.
The end of a proof is signaled by the symbol $\square$.
References of the form "RADON [1913]" are to the bibliography at the end of
the book.
Section 18, labelled with *, can be skipped over in a first reading.
Table of Contents

Preface
Introduction
Notations

Chapter I  Measure Theory

§ 1. $\sigma$-algebras and their generators
§ 2. Dynkin systems
§ 3. Contents, premeasures, measures
§ 4. Lebesgue premeasure
§ 5. Extension of a premeasure to a measure
§ 6. Lebesgue-Borel measure and measures on the number line
§ 7. Measurable mappings and image measures
§ 8. Mapping properties of the Lebesgue-Borel measure

Chapter II  Integration Theory

§ 9. Measurable numerical functions
§ 10. Elementary functions and their integral
§ 11. The integral of non-negative measurable functions
§ 12. Integrability
§ 13. Almost everywhere prevailing properties
§ 14. The spaces $\mathcal{L}^p(\mu)$
§ 15. Convergence theorems
§ 16. Applications of the convergence theorems
§ 17. Measures with densities: the Radon-Nikodym theorem
§ 18.* Signed measures
§ 19. Integration with respect to an image measure
§ 20. Stochastic convergence
§ 21. Equi-integrability

Chapter III  Product Measures

§ 22. Products of $\sigma$-algebras and measures
§ 23. Product measures and Fubini's theorem
§ 24. Convolution of finite Borel measures

Chapter IV  Measures on Topological Spaces

§ 25. Borel sets, Borel and Radon measures
§ 26. Radon measures on Polish spaces
§ 27. Properties of locally compact spaces
§ 28. Construction of Radon measures on locally compact spaces
§ 29. Riesz representation theorem
§ 30. Convergence of Radon measures
§ 31. Vague compactness and metrizability questions

Bibliography
Symbol Index
Name Index
Subject Index
Chapter I
Measure Theory

To geometrically simple subsets of the line, the plane, and 3-dimensional space,
elementary geometry assigns "numerical measures" called length, area and volume.
At first all that is intuitively clear is how the length of a segment, the area of
a rectangle and the volume of a box should be defined. Proceeding from these we
can determine by elementary geometric methods the lengths, areas, and volumes
of more complicated sets if we accept certain calculational rules for dealing with
such numerical measures.
If one thinks for example about the elementary determination of the area of
a (topologically) open triangle, one begins by decomposing it via one of its altitudes
into two open right triangles and the altitude itself. One further recalls that every
right triangle arises from insertion of a diagonal into an appropriate rectangle.
Every line segment is assigned numerical measure 0 when considered as a surface.
The following two rules of calculation therefore lead to the determination of the
areas of triangles:
(A) If the set $A$ has numerical measure $\alpha$, and $B$ is congruent to $A$, then $B$ also
has numerical measure $\alpha$.
(B) If $A$ and $B$ are disjoint sets with numerical measures $\alpha$ and $\beta$, resp., then
$A \cup B$ has numerical measure $\alpha + \beta$.
The limits of such elementary geometric considerations are already reached in
defining the area of an open disk K, to which end one proceeds thus: A sequence of
open $3 \cdot 2^{n-1}$-gons $E_n$ ($n \in \mathbb{N}$) is inscribed in $K$, with $E_1$ being an open equilateral
triangle, and the vertices of $E_{n+1}$ being those of $E_n$ together with the intersections
of the circle with the radii perpendicular to the sides of $E_n$. Thus $E_{n+1}$ consists
of $E_n$ together with its $3 \cdot 2^{n-1}$ edges and the open isosceles triangles which have
these edges as hypotenuses and vertices on the circle. Since $K$ is the union of all
the $E_n$, it looks like a "mosaic of triangles", that is, like a union of disjoint open
triangles and segments (namely, common sides of various triangles). The following
broader formulation of (B) therefore leads to a definition of the area of the disk K:
(C) If $(A_n)$ is a sequence of pairwise disjoint sets, and $A_n$ has numerical measure
$\alpha_n$ ($n \in \mathbb{N}$), then $\bigcup_{n=1}^{\infty} A_n$ has numerical measure $\sum_{n=1}^{\infty} \alpha_n$.
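
The exhaustion of the disk can be followed numerically. The sketch below assumes a disk of radius 1 and uses the hypothetical name `disk_area_by_exhaustion`; it adds the area of $E_1$ and, for each step, the areas of the $3 \cdot 2^{n-1}$ congruent isosceles triangles by which $E_{n+1}$ exceeds $E_n$, the edges contributing 0. The triangle dimensions come from elementary trigonometry and are not taken from the text.

```python
import math


def disk_area_by_exhaustion(steps: int) -> float:
    """Area of E_1 plus the triangles added in the first `steps` refinements."""
    # E_1: open equilateral triangle inscribed in the unit circle.
    area = 3 * math.sqrt(3) / 4
    for n in range(1, steps + 1):
        m = 3 * 2 ** (n - 1)                 # number of edges of E_n
        half_angle = math.pi / m             # half the central angle over one edge
        base = 2 * math.sin(half_angle)      # length of one edge of E_n
        height = 2 * math.sin(half_angle / 2) ** 2   # equals 1 - cos(half_angle)
        # Rules (A) and (B): m congruent, pairwise disjoint triangles.
        area += m * (base * height / 2)
    return area
```

The partial sums increase towards $\pi$, which is the value that rule (C) assigns to the unit disk.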
If we replace $K$ and every $E_n$ by its topological closure, this method would not
lead to a plausible definition of the area of a closed disk $\overline{K}$, because $\overline{K}$ is not the
union of the closures $\overline{E}_n$ of the above constructed polygons $E_n$. A peculiarity and
disadvantage of the elementary geometric procedure is precisely the necessity of
choosing a special mode of decomposition tailored to the set $K$ being considered
in order to arrive at a numerical measure.
The question of a general method by means of which as many subsets of $\mathbb{R}^d$ (for
arbitrary $d \in \mathbb{N}$) "as possible" could in a natural way be assigned a $d$-dimensional
volume as numerical measure is what finally led to the mathematical discipline
called measure theory. The primary content of this chapter is an exposition of the
answer which measure theory furnished to this question. It will be seen that the key
to the answer lies in rule (C), and that this rule is obeyed by much more general
"numerical measures" which arise in situations quite remote from the original
intuitive geometric one. It is just the latter reason that explains the variety of
opportunities for applying measure theory in analysis, geometry and stochastics.

§1. $\sigma$-algebras and their generators


Let $\Omega$ be an arbitrary set, $\mathcal{P}(\Omega)$ its power set, that is, the set of all subsets of $\Omega$.
Then along with every family $(A_i)_{i \in I}$ of sets from $\mathcal{P}(\Omega)$, its union $\bigcup_{i \in I} A_i$ and its
intersection $\bigcap_{i \in I} A_i$ are also in $\mathcal{P}(\Omega)$. Furthermore, $\mathcal{P}(\Omega)$ contains the complement
$\complement A$ of every set $A$ which it contains. In what follows we will be interested in
subsystems $\mathcal{A} \subset \mathcal{P}(\Omega)$ which have the corresponding properties, at least for
countable index sets $I$. According to the conventions set out in the Notations,
such index sets are those that are either finite or denumerably infinite.

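1.1 Definition. A system $\mathcal{A}$ of subsets of a set $\Omega$ is called a $\sigma$-algebra (in $\Omega$) if
it has the following properties:
(1.1) $\Omega \in \mathcal{A}$;
(1.2) $A \in \mathcal{A} \implies \complement A \in \mathcal{A}$;
(1.3) $(A_n)_{n \in \mathbb{N}} \subset \mathcal{A} \implies \bigcup_{n \in \mathbb{N}} A_n \in \mathcal{A}$.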
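
For a finite set the three defining properties can be checked mechanically. The following sketch is an illustration only, with the hypothetical name `is_sigma_algebra`; it relies on the fact that in a finite system every countable union is already a finite union, so that (1.3) reduces to unions of two sets.

```python
from itertools import combinations


def is_sigma_algebra(omega, family) -> bool:
    """Check (1.1)-(1.3) for a system of subsets of a finite set omega."""
    omega = frozenset(omega)
    family = {frozenset(a) for a in family}
    if omega not in family:                               # (1.1)
        return False
    if any(omega - a not in family for a in family):      # (1.2)
        return False
    # (1.3): on a finite system, closure under unions of two sets suffices.
    return all(a | b in family for a, b in combinations(family, 2))
```

For example, on the set {1, 2, 3} the system consisting of the empty set, {1}, {2, 3} and the whole set passes the check, whereas dropping {2, 3} violates (1.2).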

Examples. 1. $\mathcal{P}(\Omega)$ is always a $\sigma$-algebra.

2. For any set $\Omega$ the system of all its subsets which are either countable or
co-countable, that is, the $A \subset \Omega$ such that either $A$ or $\complement A$ is countable, constitutes
a $\sigma$-algebra. Property (1.3) is confirmed as follows: If each $A_n$ is countable, then so
is the union $\bigcup_{n \in \mathbb{N}} A_n$. If some $A_{n_0}$ is not countable, then its complement is, and
$\complement \bigcup_{n \in \mathbb{N}} A_n = \bigcap_{n \in \mathbb{N}} \complement A_n \subset \complement A_{n_0}$ is likewise countable.

3. If $\mathcal{A}$ is a $\sigma$-algebra in a set $\Omega$ and $\Omega'$ is a subset of $\Omega$, then

(1.4) $\Omega' \cap \mathcal{A} := \{\Omega' \cap A : A \in \mathcal{A}\}$

is a $\sigma$-algebra in $\Omega'$, called the trace of $\mathcal{A}$ in $\Omega'$. In case $\Omega' \in \mathcal{A}$, $\Omega' \cap \mathcal{A}$ consists
simply of all the subsets of $\Omega'$ which are elements of $\mathcal{A}$.
4. Let $\Omega$, $\Omega'$ be sets, $\mathcal{A}'$ a $\sigma$-algebra in $\Omega'$, and $T : \Omega \to \Omega'$ a mapping. Then the
system of sets

(1.5) $T^{-1}(\mathcal{A}') := \{T^{-1}(A') : A' \in \mathcal{A}'\}$

is a $\sigma$-algebra in $\Omega$, as follows from the known behavior of the set-theoretic operations
under inverse mappings (like $T^{-1}$ here).
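
Examples 3 and 4 can be tried out on finite sets. In this sketch the names `trace` and `preimage_family` are hypothetical, and the mapping $T$ is assumed to be given as a dictionary from the points of the domain to points of the target set.

```python
def trace(family, omega_prime):
    """System (1.4): intersect every set of the family with omega_prime."""
    omega_prime = frozenset(omega_prime)
    return {omega_prime & frozenset(a) for a in family}


def preimage_family(T, omega, family_prime):
    """System (1.5): pre-images under T of the sets of family_prime.

    T is a dict sending each point of omega to a point of the target set.
    """
    return {frozenset(x for x in omega if T[x] in a) for a in family_prime}
```

Taking pre-images commutes with complements and unions, which is why the second system inherits properties (1.1)-(1.3) from the system it is pulled back from.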

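Every $\sigma$-algebra $\mathcal{A}$ has properties "dual" to (1.1) and (1.3), namely:

(1.6) $\emptyset \in \mathcal{A}$,

(1.7) $(A_n)_{n \in \mathbb{N}} \subset \mathcal{A} \implies \bigcap_{n \in \mathbb{N}} A_n \in \mathcal{A}$.

These follow from (1.1)-(1.3) and the identities $\emptyset = \complement \Omega$ and $\bigcap A_n = \complement(\bigcup \complement A_n)$.
Moreover,

$$A_1 \cup \ldots \cup A_n = A_1 \cup \ldots \cup A_n \cup \emptyset \cup \emptyset \cup \ldots$$

and

$$A_1 \cap \ldots \cap A_n = A_1 \cap \ldots \cap A_n \cap \Omega \cap \Omega \cap \ldots$$

Therefore, along with any finite number of sets which $\mathcal{A}$ contains, it also contains
their union and their intersection. From this observation and (1.2) follows as well:
(1.8) $A, B \in \mathcal{A} \implies A \setminus B = A \cap \complement B \in \mathcal{A}$.
For constructing $\sigma$-algebras the following theorem is important: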

1.2 Theorem. The intersection $\bigcap_{i \in I} \mathcal{A}_i$ of any family $(\mathcal{A}_i)_{i \in I}$ of $\sigma$-algebras in
a common set $\Omega$ is itself a $\sigma$-algebra in $\Omega$.

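Its proof is just a routine check of properties (1.1)-(1.3). It follows that for every
system $\mathcal{E}$ of subsets of $\Omega$ there is a smallest $\sigma$-algebra $\sigma(\mathcal{E})$ which contains $\mathcal{E}$; that
is, $\sigma(\mathcal{E})$ is a $\sigma$-algebra in $\Omega$ with the defining properties
(i) $\mathcal{E} \subset \sigma(\mathcal{E})$,
(ii) for every $\sigma$-algebra $\mathcal{A}$ in $\Omega$ with $\mathcal{E} \subset \mathcal{A}$, $\sigma(\mathcal{E}) \subset \mathcal{A}$.
For a proof, consider the system $\Sigma$ of all $\sigma$-algebras $\mathcal{A}$ in $\Omega$ with $\mathcal{E} \subset \mathcal{A}$;
for example, $\mathcal{P}(\Omega)$ is an element of $\Sigma$. Then $\sigma(\mathcal{E})$ is the intersection of all the
$\mathcal{A} \in \Sigma$, which according to 1.2 possesses all the desired properties.
$\sigma(\mathcal{E})$ is called the $\sigma$-algebra generated by $\mathcal{E}$ (in $\Omega$) and $\mathcal{E}$ is called a generator
of $\sigma(\mathcal{E})$.
Examples. 5. If $\mathcal{E}$ itself is a $\sigma$-algebra in $\Omega$, then $\mathcal{E} = \sigma(\mathcal{E})$.
6. If $\mathcal{E}$ consists of a single set $A \subset \Omega$, then $\sigma(\mathcal{E}) = \{\emptyset, A, \complement A, \Omega\}$.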
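
On a finite set the generated $\sigma$-algebra can be computed explicitly. The sketch below uses the hypothetical name `generated_sigma_algebra` and takes a different route from the proof above: instead of intersecting all $\sigma$-algebras that contain the generator, it closes the generator from below under complements and unions of two sets until nothing new appears. On a finite set both routes give the same system.

```python
def generated_sigma_algebra(omega, generator):
    """Smallest system containing the generator and satisfying (1.1)-(1.3).

    Assumes omega is finite, so that finite unions are all that is needed.
    """
    omega = frozenset(omega)
    family = {omega} | {frozenset(e) for e in generator}
    while True:
        new = {omega - a for a in family}                   # complements
        new |= {a | b for a in family for b in family}      # unions of two sets
        if new <= family:       # nothing new: the system is closed
            return family
        family |= new
```

For a generator consisting of a single set the result is the four-element system of Example 6.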
7. The $\sigma$-algebra in Example 2 is generated by the system of all finite subsets
of $\Omega$.

Several systems of sets possessing some of the properties of $\sigma$-algebras frequently
occur as generators. Of special interest are rings of sets.

1.3 Definition. A system $\mathcal{R}$ of subsets of a set $\Omega$ is called a ring (in $\Omega$) if it has
the following properties:
(1.9) $\emptyset \in \mathcal{R}$;
(1.10) $A, B \in \mathcal{R} \implies A \setminus B \in \mathcal{R}$;
(1.11) $A, B \in \mathcal{R} \implies A \cup B \in \mathcal{R}$.
If in addition
(1.12) $\Omega \in \mathcal{R}$
then $\mathcal{R}$ is called an algebra (in $\Omega$).
A ring contains with each two of its sets (and so, with each finite collection of
its sets) not only their union, but also their intersection. This is because $A \cap B =
A \setminus (A \setminus B)$.

1.4 Theorem. A system $\mathcal{R}$ of subsets of a set $\Omega$ is an algebra if and only if it has
properties (1.1), (1.2) and (1.11).

Proof. By definition an algebra has properties (1.1) and (1.11) and (1.10), and from
the latter follows (1.2). The converse follows from the fact that $\emptyset = \complement \Omega$, together
with the set-theoretic identity

$$A \setminus B = A \cap \complement B = \complement(B \cup \complement A). \qquad \square$$
Examples. 8. Every $\sigma$-algebra is an algebra.
9. For any set $\Omega$ the system of all sets $A \subset \Omega$ which are either finite or co-finite
(i.e., have finite complement in $\Omega$) is an algebra, but is a $\sigma$-algebra only if $\Omega$ is
finite.
10. The system of all finite subsets of a set $\Omega$ is a ring, but is an algebra only
if $\Omega$ itself is finite.
11. The smallest ring of subsets of a set $\Omega$ is $\{\emptyset\}$, the system consisting of the empty set alone.
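
Properties (1.9)-(1.12) can likewise be tested on finite systems. The names `is_ring` and `is_algebra` in this sketch are hypothetical; the checks follow Definition 1.3 line by line.

```python
def is_ring(family) -> bool:
    """Check (1.9)-(1.11) for a finite system of finite sets."""
    family = {frozenset(a) for a in family}
    if frozenset() not in family:                            # (1.9)
        return False
    return all(a - b in family and a | b in family           # (1.10), (1.11)
               for a in family for b in family)


def is_algebra(omega, family) -> bool:
    """A ring which in addition contains omega, i.e. satisfies (1.12)."""
    return is_ring(family) and frozenset(omega) in {frozenset(a) for a in family}
```

The system consisting of the empty set alone is a ring in every set, but it is an algebra only in the empty set.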

Exercises.
1. For every system $\mathcal{E}$ of subsets of a set $\Omega$ there exists a smallest ring $\rho(\mathcal{E})$
in $\Omega$ which contains $\mathcal{E}$. It is called the ring generated by $\mathcal{E}$. Prove this existence
assertion. Determine $\rho(\mathcal{E})$ and $\sigma(\mathcal{E})$ in the case where $\mathcal{E}$ consists of two subsets
$A$, $B$ of $\Omega$. When does $\rho(\mathcal{E}) = \sigma(\mathcal{E})$ hold in this latter case; when does it hold for
general $\mathcal{E}$?

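2. For sets $A$ and $B$

$$A \triangle B := (A \setminus B) \cup (B \setminus A)$$

is called their symmetric difference. Prove that it obeys the following rules of
calculation (in which $A$, $B$, $C$ are arbitrary sets):
(a) $A \triangle B = B \triangle A$;
(b) $(A \triangle B) \triangle C = A \triangle (B \triangle C)$;
(c) $A \triangle A = \emptyset$; $A \triangle \emptyset = A$;
(d) $\complement A \triangle \complement B = A \triangle B$;
(e) $(A \triangle B) \cap C = (A \cap C) \triangle (B \cap C)$;
(f) $\bigl(\bigcup_{n \in \mathbb{N}} A_n\bigr) \triangle \bigl(\bigcup_{n \in \mathbb{N}} B_n\bigr) \subset \bigcup_{n \in \mathbb{N}} (A_n \triangle B_n)$

(for arbitrary sequences $(A_n)$ and $(B_n)$ of sets).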
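
Before attempting the proofs, the rules can be spot-checked on random subsets of a small finite set. This is a sanity check, not a proof; the names `random_subset` and `check_rules` are hypothetical, rule (f) is only tested for finite families, and the operator `^` is the built-in symmetric difference of Python sets.

```python
import random


def random_subset(universe, rng):
    """Pick each point of the universe independently with probability 1/2."""
    return frozenset(x for x in universe if rng.random() < 0.5)


def check_rules(trials: int = 200, seed: int = 0) -> bool:
    rng = random.Random(seed)
    omega = frozenset(range(8))
    empty = frozenset()
    for _ in range(trials):
        A, B, C = (random_subset(omega, rng) for _ in range(3))
        ok = (
            A ^ B == B ^ A                                   # (a)
            and (A ^ B) ^ C == A ^ (B ^ C)                   # (b)
            and A ^ A == empty and A ^ empty == A            # (c)
            and (omega - A) ^ (omega - B) == A ^ B           # (d)
            and (A ^ B) & C == (A & C) ^ (B & C)             # (e)
        )
        # (f), restricted to families of four sets each
        As = [random_subset(omega, rng) for _ in range(4)]
        Bs = [random_subset(omega, rng) for _ in range(4)]
        union_a = empty.union(*As)
        union_b = empty.union(*Bs)
        union_sd = empty.union(*(a ^ b for a, b in zip(As, Bs)))
        if not (ok and (union_a ^ union_b) <= union_sd):
            return False
    return True
```

A failed check would point to a misread rule; a passed check says nothing about arbitrary sets, for which the exercise still has to be worked.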


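3. Deduce from exercise 2 that $\mathcal{R} \subset \mathcal{P}(\Omega)$ is a ring in a set $\Omega$ if and only if with
respect to the operation $\triangle$ (as addition) and $\cap$ (as multiplication) $\mathcal{R}$ constitutes
a commutative ring in the sense that the algebraists use that term.
4. A subset $\mathcal{N}$ of a ring $\mathcal{R}$ in a set $\Omega$ is called an ideal if it satisfies
(a) $\emptyset \in \mathcal{N}$;
(b) $N \in \mathcal{N},\ M \in \mathcal{R},\ M \subset N \implies M \in \mathcal{N}$;
(c) $M, N \in \mathcal{N} \implies M \cup N \in \mathcal{N}$.

Continuing with exercise 3, show that $\mathcal{N} \subset \mathcal{R}$ is an ideal in $\mathcal{R}$ if and only if it is
an ideal in the algebraists' sense in the commutative ring $\mathcal{R}$. Every ideal in $\mathcal{R}$ is
itself a ring in $\Omega$.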
5. Let $\Omega := \mathbb{N}$ and for each $n \in \mathbb{N}$, let $\mathcal{A}_n$ denote the $\sigma$-algebra in $\Omega$ generated by
the system $\mathcal{E}_n$ comprised of the singletons $\{1\}, \{2\}, \ldots, \{n\}$. Show that $\mathcal{A}_n$ consists
of all subsets of $\Omega$ which are either contained in $\{1, 2, \ldots, n\}$ or contain the
complement of this set. Obviously $\mathcal{A}_n \subset \mathcal{A}_{n+1}$ for every $n \in \mathbb{N}$. Why is $\bigcup_{n \in \mathbb{N}} \mathcal{A}_n$
nevertheless not a $\sigma$-algebra in $\Omega = \mathbb{N}$?
[Hint: It is generally true of any isotone sequence $(\mathcal{R}_n)_{n \in \mathbb{N}}$ of rings in a set $\Omega$
that the union of all of them constitutes a $\sigma$-algebra if and only if they are equal
from some index onward. Cf. OVERDIJK, SIMONS and THIEMANN [1979] and, for
the special case of $\sigma$-algebras, BROUGHTON and HUFF [1977].]

§2. Dynkin systems

It is often difficult to directly determine whether a given system of sets is a $\sigma$-algebra.
The following concept, which goes back to DYNKIN [1961] but in inchoate
form even to SIERPINSKI [1928], helps to get around some of these difficulties.

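2.1 Definition. A system $\mathcal{D}$ of subsets of a set $\Omega$ is called a Dynkin system (in $\Omega$)
if it has the following properties:
(2.1) $\Omega \in \mathcal{D}$;
(2.2) $D \in \mathcal{D} \implies \complement D \in \mathcal{D}$;
(2.3) $D_n \in \mathcal{D}$ pairwise disjoint ($n \in \mathbb{N}$) $\implies \bigcup_{n \in \mathbb{N}} D_n \in \mathcal{D}$.

Every Dynkin system $\mathcal{D}$ thus contains the empty set $\emptyset = \complement \Omega$, and then (2.3)
also insures that $\mathcal{D}$ contains the union of every finite, pairwise disjoint collection
of its sets.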

Examples. 1. Every $\sigma$-algebra is obviously a Dynkin system.

2. Let $\Omega$ be a finite set with an even number $2n$ of elements ($n \in \mathbb{N}$). Then the
system $\mathcal{D}$ of all $D \subset \Omega$ which contain an even number of elements is a Dynkin
system. In case $n > 1$, $\mathcal{D}$ is not an algebra, hence certainly not a $\sigma$-algebra.
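
Example 2 can be verified directly for a set with four elements. The names `subsets_of_even_size` and `is_dynkin_system` in this sketch are hypothetical; for a finite system, property (2.3) reduces to unions of two disjoint members, since a longer pairwise disjoint union can be built up one set at a time.

```python
from itertools import combinations


def subsets_of_even_size(omega):
    """All subsets of omega with an even number of elements."""
    omega = list(omega)
    return {frozenset(c)
            for k in range(0, len(omega) + 1, 2)
            for c in combinations(omega, k)}


def is_dynkin_system(omega, family) -> bool:
    """Check (2.1)-(2.3) for a system of subsets of a finite set omega."""
    omega = frozenset(omega)
    family = {frozenset(d) for d in family}
    if omega not in family:                                # (2.1)
        return False
    if any(omega - d not in family for d in family):       # (2.2)
        return False
    # (2.3): unions of two disjoint members suffice in a finite system.
    return all(a | b in family
               for a, b in combinations(family, 2) if not (a & b))
```

With four points the system has eight members; it passes the check, yet the intersection of {1, 2} and {2, 3} is the one-point set {2}, which does not belong to it.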

The precise connection between the concepts of $\sigma$-algebra and Dynkin system
is elucidated in the following considerations:

2.2 Lemma. Every Dynkin system $\mathcal{D}$ is closed with respect to the formation of
proper complements, meaning that
(2.2') $D, E \in \mathcal{D},\ D \subset E \implies E \setminus D \in \mathcal{D}$.
Proof. According to what was noted right after definition 2.1, the set $D \cup \complement E$,
being the union of the disjoint sets $D$ and $\complement E$ from $\mathcal{D}$, lies in $\mathcal{D}$. But then the
complement of this set with respect to $\Omega$, that is, $E \cap \complement D = E \setminus D$, lies in $\mathcal{D}$. $\square$

Consequently, Dynkin systems can also be defined via properties (2.1), (2.2')
and (2.3).

2.3 Theorem. A Dynkin system is a $\sigma$-algebra just if it contains the intersection of any two of its sets.

Proof. What needs to be shown is that every Dynkin system $\mathscr{D}$ which is closed under finite intersections is a $\sigma$-algebra. Of the defining properties of a $\sigma$-algebra, only (1.3) needs to be confirmed and we do that thus: According to (2.2') and the closure hypothesis, $A \setminus B = A \setminus (A \cap B)$ lies in $\mathscr{D}$ whenever $A, B \in \mathscr{D}$. Since $(A \setminus B) \cap B = \emptyset$ and $A \cup B = (A \setminus B) \cup B$, $\mathscr{D}$ contains the union of any two, hence the union of any finitely many, of its elements. For any sequence $(D_n)_{n \in \mathbb{N}} \subset \mathscr{D}$, we have
$$\bigcup_{n=1}^{\infty} D_n = \bigcup_{n=0}^{\infty} (D'_{n+1} \setminus D'_n)$$
in which $D'_0 := \emptyset$ and $D'_n := D_1 \cup \ldots \cup D_n$ for each $n \in \mathbb{N}$. The sets $D'_{n+1} \setminus D'_n$ are pairwise disjoint and, thanks to (2.2') and what has already been proved, they lie in $\mathscr{D}$. According to (2.3) then the union of the sets $D_n$ lies in $\mathscr{D}$. □

Just as for $\sigma$-algebras, algebras and rings, every system $\mathscr{E} \subset \mathscr{P}(\Omega)$ lies in
a smallest Dynkin system. It is, of course, called the Dynkin system generated
by $\mathscr{E}$, and is denoted $\delta(\mathscr{E})$.
The significance of Dynkin systems lies primarily in the following fact:

2.4 Theorem. Every $\mathscr{E} \subset \mathscr{P}(\Omega)$ which is closed with respect to finite intersection satisfies
(2.4) $\delta(\mathscr{E}) = \sigma(\mathscr{E})$.

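Proof. Since every $\sigma$-algebra is a Dynkin system, $\sigma(\mathscr{E})$ is a Dynkin system containing $\mathscr{E}$ and consequently $\delta(\mathscr{E}) \subset \sigma(\mathscr{E})$. If conversely, $\delta(\mathscr{E})$ were known to be a $\sigma$-algebra, the dual relation $\sigma(\mathscr{E}) \subset \delta(\mathscr{E})$ would also follow. In view of 2.3 therefore it suffices to show that $\delta(\mathscr{E})$ is closed under intersection. To prove this, we introduce for every $D \in \delta(\mathscr{E})$ the system
$$\mathscr{D}_D := \{Q \in \mathscr{P}(\Omega) : Q \cap D \in \delta(\mathscr{E})\}.$$
A routine check confirms that $\mathscr{D}_D$ is a Dynkin system. For every $E \in \mathscr{E}$ the hypothesis on $\mathscr{E}$ ensures that $\mathscr{E} \subset \mathscr{D}_E$ and therewith that $\delta(\mathscr{E}) \subset \mathscr{D}_E$. Thus for every $D \in \delta(\mathscr{E})$ and every $E \in \mathscr{E}$ we have $E \cap D \in \delta(\mathscr{E})$; that is, $\mathscr{E} \subset \mathscr{D}_D$, and consequently $\delta(\mathscr{E}) \subset \mathscr{D}_D$, holding for every $D \in \delta(\mathscr{E})$. But this is just the property of $\delta(\mathscr{E})$ that had to be confirmed. □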
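
On a finite set both generated systems can be computed by closing the generator under the defining operations, which makes the role of the intersection hypothesis in Theorem 2.4 visible. The sketch below is an illustration under assumptions: the function `generate` and the sets `A`, `B` are hypothetical choices, and the closure uses only complements and unions of two members, which is enough on a finite universe.

```python
def generate(gen, universe, disjoint_only):
    """Smallest system containing `gen` and `universe` that is closed under
    complements and under unions of two members (only disjoint ones when
    `disjoint_only` is True). On a finite universe this is the generated
    Dynkin system (disjoint_only=True) or sigma-algebra (disjoint_only=False)."""
    system = set(gen) | {universe}
    while True:
        new = {universe - a for a in system}
        new |= {a | b for a in system for b in system
                if not disjoint_only or not (a & b)}
        if new <= system:
            return system
        system |= new

omega = frozenset({1, 2, 3, 4})
A, B = frozenset({1, 2}), frozenset({2, 3})

unstable = {A, B}            # not closed under intersection
stable = {A, B, A & B}       # adding A & B makes the generator intersection-stable

delta_unstable = generate(unstable, omega, disjoint_only=True)
sigma_unstable = generate(unstable, omega, disjoint_only=False)
delta_stable = generate(stable, omega, disjoint_only=True)
sigma_stable = generate(stable, omega, disjoint_only=False)

print(len(delta_unstable), len(sigma_unstable))   # the two systems differ in size
print(delta_stable == sigma_stable)               # they coincide for the stable generator
```

For the generator `{A, B}` the Dynkin system stays strictly smaller than the $\sigma$-algebra, while adding the single set `A & B` closes the gap.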

Systems of subsets which are closed under intersections (respectively, unions) of two, hence of any finite number, of their sets will from now on be described as $\cap$-stable (respectively, $\cup$-stable).

Exercise.
Determine the Dynkin system generated by the system $\mathscr{E}$ consisting of just two subsets $A$, $B$ of $\Omega$. Show that $\delta(\mathscr{E})$ and $\sigma(\mathscr{E})$ coincide just in case one of the sets $A \cap B$, $A \cap \complement B$, $B \cap \complement A$ or $\complement A \cap \complement B$ is empty.

§3. Contents, premeasures, measures


Combining the concepts of ring and $\sigma$-algebra with the properties (B) and (C) of
lengths, areas and volumes that we encountered in the introduction leads to the
basic concepts of measure theory.

3.1 Definition. Let $\mathscr{R}$ be a ring in $\Omega$ and $\mu$ a function on $\mathscr{R}$ with values in $[0, +\infty]$. $\mu$ is called a premeasure on $\mathscr{R}$ if
(3.1) $\mu(\emptyset) = 0$
and for every sequence $(A_n)$ of pairwise disjoint sets from $\mathscr{R}$ whose union lies in $\mathscr{R}$
(3.2) $\mu\bigl(\bigcup_{n=1}^{\infty} A_n\bigr) = \sum_{n=1}^{\infty} \mu(A_n)$ ($\sigma$-additivity)
holds. $\mu$ is called a content if instead of (3.2) it only satisfies
(3.3) $\mu\bigl(\bigcup_{i=1}^{n} A_i\bigr) = \sum_{i=1}^{n} \mu(A_i)$ (finite additivity)
(for every two and therewith) for every finitely many pairwise disjoint sets $A_1, \ldots, A_n \in \mathscr{R}$.
Due to (3.1) every premeasure is evidently a content. To see this, you have only to take $A_{n+1} = A_{n+2} = \ldots = \emptyset$ in (3.2).

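Examples. 1. For every ring $\mathscr{R}$ in $\Omega$ and every point $\omega \in \Omega$ the function $\varepsilon_\omega$, defined on $\mathscr{R}$ by
$$\varepsilon_\omega(A) := \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{if } \omega \notin A \end{cases}$$
is a premeasure. It is called the premeasure defined by unit mass at $\omega$.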
2. Let $\mathscr{A}$ be the $\sigma$-algebra defined in Example 2 of §1, for an uncountable set $\Omega$, say for $\Omega = \mathbb{R}$. Set $\mu(A) := 0$ or $1$ according as $A$ or $\complement A$ is countable. Since of two disjoint subsets of $\Omega$ at most one can have a countable complement, property (3.2) is easily confirmed; thus $\mu$ is a premeasure on $\mathscr{A}$.
3. Let $\mathscr{A}$ be the algebra defined in Example 9 of §1, for a countably infinite set $\Omega$. Set $\mu(A) := 0$ or $1$ according as $A$ or $\complement A$ is finite. Then $\mu$ is a content but not a premeasure. The first assertion has a proof analogous to that in the preceding example, the second follows from the fact that $\Omega$ is the disjoint union of countably many 1-element sets.
4. Let $\mu_1, \mu_2, \ldots$ be a sequence of contents (premeasures) on a ring $\mathscr{R}$, and let $\alpha_1, \alpha_2, \ldots$ be a sequence of non-negative real numbers. Then
(3.4) $\mu := \sum_{n=1}^{\infty} \alpha_n \mu_n$
is also a content (premeasure) on $\mathscr{R}$.

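Every content $\mu$ on a ring $\mathscr{R}$ enjoys the following further properties (in which $A, B, A_1, B_1, \ldots \in \mathscr{R}$):
(3.5) $\mu(A \cup B) + \mu(A \cap B) = \mu(A) + \mu(B)$;
(3.6) $A \subset B \Rightarrow \mu(A) \le \mu(B)$ (isotoneity);
(3.7) $A \subset B,\ \mu(A) < +\infty \Rightarrow \mu(B \setminus A) = \mu(B) - \mu(A)$ (subtractivity);
(3.8) $\mu\bigl(\bigcup_{i=1}^{n} A_i\bigr) \le \sum_{i=1}^{n} \mu(A_i)$ (subadditivity);
(3.9) for every sequence $(A_n)$ of pairwise disjoint sets from $\mathscr{R}$ whose union lies in $\mathscr{R}$
$$\sum_{n=1}^{\infty} \mu(A_n) \le \mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr).$$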

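Proof. For arbitrary $A, B \in \mathscr{R}$
$$A \cup B = A \cup (B \setminus A) \quad\text{and}\quad B = (A \cap B) \cup (B \setminus A).$$
Because of finite additivity, it follows from these that
$$\mu(A \cup B) = \mu(A) + \mu(B \setminus A) \quad\text{and}\quad \mu(B) = \mu(A \cap B) + \mu(B \setminus A),$$
and from addition of the last two equations
$$\mu(A \cup B) + \mu(A \cap B) + \mu(B \setminus A) = \mu(A) + \mu(B) + \mu(B \setminus A).$$
In case $\mu(B \setminus A)$ is finite, (3.5) follows from this. In case $\mu(B \setminus A) = +\infty$, the formulas for $\mu(A \cup B)$, $\mu(B)$ show that each of them must also equal $+\infty$, and (3.5) consequently holds in this case too. If $A \subset B$, the preceding formula for $\mu(B)$ reads
$$\mu(B) = \mu(A) + \mu(B \setminus A),$$
which, thanks to $\mu \ge 0$, delivers both (3.6) and (3.7). If we set $B_1 := A_1$, $B_2 := A_2 \setminus A_1, \ldots, B_n := A_n \setminus (A_1 \cup \ldots \cup A_{n-1})$, then $B_1, \ldots, B_n$ are pairwise disjoint sets from $\mathscr{R}$, which entails that
$$\mu\Bigl(\bigcup_{i=1}^{n} B_i\Bigr) = \sum_{i=1}^{n} \mu(B_i).$$
From the facts that $B_i \subset A_i$ $(i = 1, \ldots, n)$, $\mu$ is isotone, and $\bigcup_{i=1}^{n} B_i = \bigcup_{i=1}^{n} A_i$ now follows (3.8). To prove (3.9) we only have to observe that for every sequence $(A_n)_{n \in \mathbb{N}}$ of pairwise disjoint sets from $\mathscr{R}$ with $A := \bigcup_{n \in \mathbb{N}} A_n \in \mathscr{R}$
$$\mu(A_1) + \ldots + \mu(A_m) = \mu(A_1 \cup \ldots \cup A_m) \le \mu(A) \quad (m \in \mathbb{N})$$
and let $m \to \infty$.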
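
Properties (3.5) to (3.8) are easy to test numerically for a concrete finite content. The sketch below assumes a content given by hypothetical point weights on a four-point set (exact rational arithmetic via `fractions`) and checks each property over all pairs of subsets; it is a sanity check, not a proof.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical point weights; mu(A) is the total weight of the points of A.
weights = {"a": Fraction(1, 2), "b": Fraction(2), "c": Fraction(0), "d": Fraction(3, 4)}

def mu(A):
    """A finitely additive set function (content) on all subsets of the key set."""
    return sum((weights[x] for x in A), Fraction(0))

points = sorted(weights)
sets = [frozenset(c) for r in range(len(points) + 1)
        for c in combinations(points, r)]
pairs = [(A, B) for A in sets for B in sets]

modular = all(mu(A | B) + mu(A & B) == mu(A) + mu(B) for A, B in pairs)     # (3.5)
isotone = all(mu(A) <= mu(B) for A, B in pairs if A <= B)                   # (3.6)
subtractive = all(mu(B - A) == mu(B) - mu(A) for A, B in pairs if A <= B)   # (3.7)
subadditive = all(mu(A | B) <= mu(A) + mu(B) for A, B in pairs)             # (3.8), n = 2

print(modular, isotone, subtractive, subadditive)
```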

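Finally, if $\mu$ is a premeasure on $\mathscr{R}$, then for any sets $A_0, A_1, \ldots \in \mathscr{R}$
(3.10) $A_0 \subset \bigcup_{n=1}^{\infty} A_n \Rightarrow \mu(A_0) \le \sum_{n=1}^{\infty} \mu(A_n)$.
Because of $A_0 = \bigcup (A_0 \cap A_n)$ and (3.6), we can assume, in verifying (3.10), that $A_0 = \bigcup A_n$. Then set $B_1 := A_1$, $B_2 := A_2 \setminus A_1, \ldots, B_n := A_n \setminus (A_1 \cup \ldots \cup A_{n-1})$ and proceed as in the proof of (3.8).
In particular, we now have
(3.10') $\mu\bigl(\bigcup_{n=1}^{\infty} A_n\bigr) \le \sum_{n=1}^{\infty} \mu(A_n)$
whenever all the sets $A_n$ as well as their union lie in $\mathscr{R}$.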

The following theorem characterizes premeasures via other properties related to the $\sigma$-additivity. Its formulation is facilitated by the notations:
(3.11) $E_n \uparrow E$ and $E_n \downarrow E$
which mean that the sets $E_1 \subset E_2 \subset \ldots$ satisfy $E = \bigcup E_n$, or that the sets $E_1 \supset E_2 \supset \ldots$ satisfy $E = \bigcap E_n$. In other words, the sequence $(E_n)$ either increases isotonically to $E$ or decreases antitonically to $E$.

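3.2 Theorem. For a content $\mu$ on a ring $\mathscr{R}$ consider the following statements:
(a) $\mu$ is a premeasure.
(b) $A_n, A \in \mathscr{R}$ with $A_n \uparrow A \Rightarrow \lim_{n \to \infty} \mu(A_n) = \mu(A)$ (continuity from below).
(c) $A_n, A \in \mathscr{R}$ with $A_n \downarrow A$ and $\mu(A_n) < +\infty$ for all $n \Rightarrow \lim_{n \to \infty} \mu(A_n) = \mu(A)$ (continuity from above).
(d) $A_n \in \mathscr{R}$ with $A_n \downarrow \emptyset$ and $\mu(A_n) < +\infty$ for all $n \Rightarrow \lim_{n \to \infty} \mu(A_n) = 0$ (continuity at $\emptyset$).

Then the following implications hold:
$$(a) \Leftrightarrow (b) \Rightarrow (c) \Leftrightarrow (d).$$
If $\mu$ is finite on $\mathscr{R}$, that is, $\mu(A) < +\infty$ for all $A \in \mathscr{R}$, then all four statements (a)-(d) are equivalent.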

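Proof. (a)$\Rightarrow$(b): Defining $A_0 := \emptyset$, the sets $B_n := A_n \setminus A_{n-1}$ $(n \in \mathbb{N})$ are pairwise disjoint, lie in $\mathscr{R}$ and satisfy
$$A = \bigcup_{n=1}^{\infty} B_n, \qquad A_n = B_1 \cup \ldots \cup B_n.$$
Therefore on account of the $\sigma$-additivity of $\mu$
$$\mu(A) = \sum_{n=1}^{\infty} \mu(B_n) = \lim_{n \to \infty} \sum_{i=1}^{n} \mu(B_i) = \lim_{n \to \infty} \mu(A_n).$$
(b)$\Rightarrow$(a): Let $(A_n)$ be a sequence of pairwise disjoint sets from $\mathscr{R}$ whose union $A := \bigcup A_n$ also is in $\mathscr{R}$. If we set $B_n := A_1 \cup \ldots \cup A_n$, then $B_n \in \mathscr{R}$ and $B_n \uparrow A$; therefore $\mu(A) = \lim \mu(B_n)$. As a result of the finite additivity of $\mu$
$$\mu(B_n) = \mu(A_1) + \ldots + \mu(A_n)$$
and therefore $\mu(A) = \sum_{n=1}^{\infty} \mu(A_n)$. Thus $\mu$ is $\sigma$-additive, and consequently is a premeasure.
(b)$\Rightarrow$(c): According to (3.7), $\mu(A_1 \setminus A_n) = \mu(A_1) - \mu(A_n)$ for every $n \in \mathbb{N}$. From $A_n \downarrow A$ follows $A_1 \setminus A_n \uparrow A_1 \setminus A$, and all the sets appearing here are in $\mathscr{R}$. From (b) therefore
$$\mu(A_1 \setminus A) = \lim_{n \to \infty} \mu(A_1 \setminus A_n) = \mu(A_1) - \lim_{n \to \infty} \mu(A_n).$$
From this follows (c), because $A \subset A_n$ means that also $\mu(A) < +\infty$ and so $\mu(A_1 \setminus A) = \mu(A_1) - \mu(A)$.
(c)$\Rightarrow$(d): Here there is nothing to prove!
(d)$\Rightarrow$(c): From $A_n \downarrow A$ follows $A_n \setminus A \downarrow \emptyset$. Since $A_n \setminus A \subset A_n$, the isotoneity of $\mu$ means that along with $\mu(A_n)$, $\mu(A_n \setminus A)$ is finite too. Hence by (d), $\lim \mu(A_n \setminus A) = 0$. But then (c) follows because $\mu(A) \le \mu(A_n) < +\infty$, causing $\mu(A_n \setminus A)$ to equal $\mu(A_n) - \mu(A)$.
To finish off, let us consider the case that $\mu$ is finite, and show that then (d)$\Rightarrow$(b): If $(A_n)$ is a sequence of sets from $\mathscr{R}$ and $A_n \uparrow A \in \mathscr{R}$, then $A \setminus A_n \downarrow \emptyset$. Taking account of the finiteness of $\mu$, it therefore follows that $0 = \lim \mu(A \setminus A_n) = \lim [\mu(A) - \mu(A_n)]$ and therewith (b). □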
Remark. If one modifies Example 3 of this section by making $\mu(A) := 0$ for all finite sets and $\mu(A) := +\infty$ for all cofinite sets, then one gets a content that is continuous at $\emptyset$ but is not a premeasure. Thus without the finiteness hypothesis in the preceding theorem, statements (a)-(d) are not generally equivalent. On the other hand, in (c) and (d) it is enough to explicitly hypothesize $\mu(A_n) < +\infty$ for some $n \in \mathbb{N}$, as then $\mu(A_m) < +\infty$ for all $m \ge n$ (isotoneity).
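
The content of Example 3 and its modification in the Remark can be explored with a small model of the finite and cofinite subsets of $\mathbb{N}$. The sketch below is an illustration under assumptions: the class `FinCofin` and the helpers `tail`, `mu` and `mu_modified` are hypothetical names, and a cofinite set is stored through its finite complement.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FinCofin:
    """A finite or cofinite subset of N = {1, 2, ...}.

    `points` is the set itself when `cofinite` is False,
    and the (finite) complement of the set when `cofinite` is True."""
    points: frozenset
    cofinite: bool = False

    def __contains__(self, k):
        return (k in self.points) != self.cofinite

    def complement(self):
        return FinCofin(self.points, not self.cofinite)

    def intersect(self, other):
        if self.cofinite and other.cofinite:
            return FinCofin(self.points | other.points, True)
        if not self.cofinite and not other.cofinite:
            return FinCofin(self.points & other.points)
        fin, cof = (self, other) if other.cofinite else (other, self)
        return FinCofin(fin.points - cof.points)

    def union(self, other):
        return self.complement().intersect(other.complement()).complement()

def mu(A):
    """The content of Example 3: 0 on finite sets, 1 on cofinite sets."""
    return 1 if A.cofinite else 0

def mu_modified(A):
    """The modification from the Remark: +infinity on cofinite sets."""
    return float("inf") if A.cofinite else 0

def tail(n):
    """The cofinite set {n, n+1, ...}."""
    return FinCofin(frozenset(range(1, n)), True)

# Finite additivity on one disjoint pair: {1, 2} and N minus {1, 2, 3}.
A = FinCofin(frozenset({1, 2}))
B = FinCofin(frozenset({1, 2, 3}), True)
assert A.intersect(B) == FinCofin(frozenset())
assert mu(A.union(B)) == mu(A) + mu(B)

# The tails decrease to the empty set (k leaves every tail(n) with n > k) ...
assert all(k not in tail(k + 1) for k in range(1, 100))
# ... yet their content stays at 1, so continuity at the empty set fails for mu.
print([mu(tail(n)) for n in range(1, 6)])
# For the modified content the tails have infinite content, so (d) does not apply to them.
print(mu_modified(tail(3)))
```

Since the content `mu` is finite, Theorem 3.2 says that the failure of continuity at $\emptyset$ along the tails is equivalent to its failing to be a premeasure.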
The concepts of content and premeasure are preliminary to the central concept
of this book, that of a measure.

3.3 Definition. A premeasure defined on a $\sigma$-algebra $\mathscr{A}$ of subsets of a set $\Omega$ is called a measure (on $\mathscr{A}$). The function value $\mu(A)$ of $\mu$ at an $A \in \mathscr{A}$ is called the ($\mu$-)measure or the ($\mu$-)mass of $A$. If $\mu(\Omega) < +\infty$ (and consequently $\mu(A) < +\infty$ for every $A \in \mathscr{A}$), the measure $\mu$ is called finite.
Thus a measure is a non-negative, numerical function $\mu$ defined on a $\sigma$-algebra $\mathscr{A}$ and enjoying properties (3.1) and (3.2). The constant function $\mu = 0$ is a measure on every $\sigma$-algebra, the so-called zero-measure. The examples that follow are still of a rather formal nature. But as early as §6 and then quite a bit later we will become acquainted with an abundance of important examples.

Examples. 5. If for the ring $\mathscr{R}$ in Example 1 one takes a $\sigma$-algebra $\mathscr{A}$ in $\Omega$, then $\varepsilon_\omega$ is a measure on $\mathscr{A}$, called the measure defined by a unit point mass at $\omega$, or more briefly the unit mass at $\omega$, and also the Dirac measure at $\omega$. These designations derive from interpreting a measure $\mu$ on a $\sigma$-algebra in $\Omega$ as a mass distribution over $\Omega$. Accordingly for $A \in \mathscr{A}$, $\mu(A)$ is viewed as the mass that has been "smeared" over $A$. The Dirac measure at $\omega$ has, in so far as the one-element set $\{\omega\}$ lies in $\mathscr{A}$, all of its (unit) mass concentrated at the point $\omega$: $\varepsilon_\omega(\{\omega\}) = 1$, $\varepsilon_\omega(\complement\{\omega\}) = 0$.
6. Let $\Omega$ be an arbitrary set. For every $A \in \mathscr{P}(\Omega)$ let $|A|$ denote the number of elements in $A$ in case $A$ is finite, and otherwise $+\infty$. Then $\zeta(A) := |A|$ defines a measure on $\mathscr{P}(\Omega)$, called the counting measure on $\Omega$ (or on $\mathscr{P}(\Omega)$). Its restriction to a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ is called counting measure on $\mathscr{A}$.
7. The premeasure defined in Example 2 is a measure.
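
For a finite set, the Dirac measures and the counting measure of Examples 5 and 6 can be written down directly. The following sketch uses hypothetical names (`dirac`, `counting`, the three-point set `Omega`) and only handles finite sets, for which the value $+\infty$ never occurs.

```python
def dirac(omega):
    """Unit mass at the point `omega`, as a function on sets."""
    return lambda A: 1 if omega in A else 0

def counting(A):
    """Counting measure of a finite set (an infinite set would get +infinity)."""
    return len(A)

Omega = {"x", "y", "z"}
A = {"x", "z"}

eps_x = dirac("x")
print(eps_x({"x"}), eps_x(Omega - {"x"}))              # mass of {x} and of its complement
print(counting(A), sum(dirac(w)(A) for w in Omega))    # counting measure as a sum of unit masses
```

The last line illustrates, for one set, that on a finite set the counting measure is the sum of the unit masses at its points.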

Next we derive a not-so-obvious consequence of the $\sigma$-additivity of measures.

3.4 Lemma. Let $\mu$ be a measure on a $\sigma$-algebra $\mathscr{A}$ and $(A_n)_{n \in \mathbb{N}}$ a sequence of sets from $\mathscr{A}$. Suppose there is a $k \in \mathbb{N}$ such that the sets $A_m$ and $A_n$ are disjoint whenever their indices satisfy $|m - n| \ge k$. Then
(3.12) $\sum_{n=1}^{\infty} \mu(A_n) \le k\,\mu\bigl(\bigcup_{n=1}^{\infty} A_n\bigr)$.

When $k = 1$ this is, in view of (3.10'), just the $\sigma$-additivity requirement of a measure.

Proof. Designate the union of all the $A_n$ as $C$. For each $r = 1, \ldots, k$ the sets $(A_{r+mk})_{m \in \mathbb{N}_0}$ are pairwise disjoint. So if we set
$$F_r := \bigcup_{m=0}^{\infty} A_{r+mk},$$
then
$$\sum_{m=0}^{\infty} \mu(A_{r+mk}) = \mu(F_r) \le \mu(C)$$
because $F_r \subset C$. Since the sum of a series of non-negative terms is independent of the ordering of the terms, it follows that
$$\sum_{n=1}^{\infty} \mu(A_n) = \sum_{r=1}^{k} \mu(F_r).$$
From this equality and the preceding inequality the asserted inequality can be read off.
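
The regrouping used in this proof can be followed on a concrete finite family. The sketch below assumes the counting measure and a hypothetical family of sliding windows $A_n = \{n, \ldots, n + k - 1\}$, $n = 1, \ldots, N$ (all further $A_n$ empty), which are disjoint as soon as their indices differ by at least $k$.

```python
k, N = 3, 10
# Hypothetical family: A_n = {n, ..., n + k - 1}; A_m and A_n are disjoint when |m - n| >= k.
sets = {n: set(range(n, n + k)) for n in range(1, N + 1)}
mu = len                                  # counting measure on finite sets

C = set().union(*sets.values())           # the union of all the A_n
lhs = sum(mu(A) for A in sets.values())   # left-hand side of (3.12)
rhs = k * mu(C)                           # right-hand side of (3.12)

# The families (A_{r + m k})_m from the proof, one for each r = 1, ..., k.
families = {r: [sets[n] for n in range(r, N + 1, k)] for r in range(1, k + 1)}
F = {r: set().union(*fam) for r, fam in families.items()}

print(lhs, rhs)
print([mu(F[r]) for r in sorted(F)])
```

Each `F[r]` is a disjoint union, so its measure is the sum over its family, and the $k$ values `mu(F[r])`, each at most `mu(C)`, add up to the left-hand side.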

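Exercises.
1. Let $\Omega$ be a finite, non-empty set. Show that the counting measure $\zeta$ on $\mathscr{P}(\Omega)$ coincides with $\sum_{\omega \in \Omega} \varepsilon_\omega$. Show further that every measure $\mu$ on $\mathscr{P}(\Omega)$ has the form $\mu = \sum_{\omega \in \Omega} \alpha_\omega \varepsilon_\omega$, with each $\alpha_\omega := \mu(\{\omega\})$.
2. For a finite content $\mu$ on a ring $\mathscr{R}$ establish the following input-output formula generalizing equality (3.5): For all $n \in \mathbb{N}$, $A_1, \ldots, A_n \in \mathscr{R}$
$$\mu\Bigl(\bigcup_{i=1}^{n} A_i\Bigr) = \sum_{i=1}^{n} \mu(A_i) - \sum_{1 \le i < j \le n} \mu(A_i \cap A_j) + \sum_{1 \le i < j < k \le n} \mu(A_i \cap A_j \cap A_k) - + \ldots + (-1)^{n-1} \mu(A_1 \cap \ldots \cap A_n).$$
3. For a premeasure $\mu$ on a ring $\mathscr{R}$ in $\Omega$ define
$$\mathscr{A} := \{A \in \mathscr{P}(\Omega) : A \cap R \in \mathscr{R} \text{ for every } R \in \mathscr{R}\},$$
$$\bar\mu(A) := \sup\{\mu(R) : R \subset A,\ R \in \mathscr{R}\}, \quad\text{for } A \in \mathscr{A}.$$
Show that $\mathscr{A}$ is an algebra in $\Omega$ which contains $\mathscr{R}$, and that $\bar\mu$ is a premeasure on $\mathscr{A}$ which extends $\mu$.
4. Suppose that $(\mu_n)_{n \in \mathbb{N}}$ is a sequence of premeasures on a common ring $\mathscr{R}$ which is isotone, that is, satisfies $\mu_n(A) \le \mu_{n+1}(A)$ for all $A \in \mathscr{R}$, $n \in \mathbb{N}$. Show that via $\mu(A) := \sup_{n \in \mathbb{N}} \mu_n(A)$ a premeasure $\mu$ is defined on $\mathscr{R}$.
5. Let $\mu$ be a measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$, and denote by $\mathscr{N}_\mu$ the set of all $\mu$-null (or $\mu$-negligible) sets, that is, the $N \in \mathscr{A}$ for which $\mu(N) = 0$. Check that $\mathscr{N}_\mu$ has the following properties:
(a) $\emptyset \in \mathscr{N}_\mu$;
(b) $N \in \mathscr{N}_\mu$, $M \in \mathscr{A}$, $M \subset N \Rightarrow M \in \mathscr{N}_\mu$;
(c) $(N_n)_{n \in \mathbb{N}} \subset \mathscr{N}_\mu \Rightarrow \bigcup_{n \in \mathbb{N}} N_n \in \mathscr{N}_\mu$.
Subsets of $\mathscr{A}$ with these properties are called $\sigma$-ideals in $\mathscr{A}$. Thus $\mathscr{N}_\mu$ is always a $\sigma$-ideal. (Cf. Exercise 4 of §1.)
6. Every $\sigma$-ideal $\mathscr{N}$ in a $\sigma$-algebra $\mathscr{A}$ is the $\sigma$-ideal $\mathscr{N}_\mu$ of $\mu$-null sets of an appropriate measure $\mu$ on $\mathscr{A}$. To get such a $\mu$, define
$$\mu(A) := \begin{cases} 0 & \text{if } A \in \mathscr{N} \\ +\infty & \text{if } A \in \mathscr{A} \setminus \mathscr{N}. \end{cases}$$
As a special case, on the power set $\mathscr{P}(\Omega)$ of any set $\Omega$ there is a measure $\mu$ such that $\mu(A) = 0$ precisely if $A$ is a countable subset of $\Omega$.
7. Let $\mu$ be a finite content on a ring $\mathscr{R}$. Show that
$$d_\mu(A, B) := \mu(A \,\triangle\, B) \quad (A, B \in \mathscr{R})$$
defines a pseudometric on $\mathscr{R}$, that is, $d_\mu$ has all the properties of a metric on $\mathscr{R}$ with one possible exception: $d_\mu(A, B) = 0$ can happen without $A = B$. (Cf. Exercise 3 of §15.)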

§4. Lebesgue premeasure

Now we specialize $\Omega$ to be the d-dimensional number space $\mathbb{R}^d$ $(d \in \mathbb{N})$. For every two points $a = (\alpha_1, \ldots, \alpha_d)$ and $b = (\beta_1, \ldots, \beta_d) \in \mathbb{R}^d$ we write $a \le b$ (resp., $a \lhd b$) if $\alpha_i \le \beta_i$ for all $i = 1, \ldots, d$ (resp., $\alpha_i < \beta_i$ for all $i = 1, \ldots, d$). Every set of the form
(4.1) $[a, b[ \; := \{x \in \mathbb{R}^d : a \le x \lhd b\}$,
where $a, b \in \mathbb{R}^d$ and $a \le b$, is called a right half-open interval in $\mathbb{R}^d$. Geometrically described, these are parallelepipeds "open on the right" and having sides parallel to the coordinate axes. Clearly $[a, b[$ is nonempty if and only if $a \lhd b$, and in this case the interval $[a, b[$ uniquely determines the points $a$, $b$.
For every such interval $[a, b[$ the real number
(4.2) $(\beta_1 - \alpha_1) \cdot \ldots \cdot (\beta_d - \alpha_d)$
is called its d-dimensional elementary content. It equals 0 just when $[a, b[ \; = \emptyset$, that is, when $a \lhd b$ fails (although $a \le b$ holds, a prerequisite to employing interval notation).
From now on, $\mathscr{I}^d$ shall designate the set of all right half-open intervals in $\mathbb{R}^d$, and $\mathscr{F}^d$ the system of all finite unions of such intervals, so $\mathscr{I}^d \subset \mathscr{F}^d$. The elements of $\mathscr{F}^d$ are called d-dimensional figures.

4.1 Lemma. For all $I, J \in \mathscr{I}^d$
$$I \cap J \in \mathscr{I}^d \quad\text{and}\quad J \setminus I \in \mathscr{F}^d.$$
Every figure is a union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$.

Proof. Let $I = [a, b[$, $J = [a', b'[$ with $a \le b$, $a' \le b'$, and let the corresponding coordinates of these points be $\alpha_i$, $\beta_i$, $\alpha'_i$, $\beta'_i$. If we let $e$ and $f$ denote those points in $\mathbb{R}^d$ whose coordinates are $\max\{\alpha_i, \alpha'_i\}$ and $\min\{\beta_i, \beta'_i\}$ $(i = 1, \ldots, d)$, respectively, then $I \cap J = [e, f[$ in case $e \le f$ and otherwise $I \cap J = \emptyset$. Consequently, $I \cap J$ is already in $\mathscr{I}^d$. Because $J \setminus I = J \setminus (I \cap J)$ and we now have $I \cap J \in \mathscr{I}^d$, in proving the second claim we may assume that $I \ne \emptyset$ and $I \subset J$. Then $I$ and $J$ determine the points $a$, $b$, $a'$, $b'$ uniquely and they satisfy $a' \le a \lhd b \le b'$. Create new points from $a = (\alpha_1, \ldots, \alpha_d)$ and $b = (\beta_1, \ldots, \beta_d)$ by replacing $\alpha_i$ by $\alpha'_i$ and $\beta_i$ by $\alpha_i$, or by replacing $\alpha_i$ by $\beta_i$ and $\beta_i$ by $\beta'_i$, and do this in all possible ways. More precisely, make such replacements for the $i$ coming from each non-empty subset of $\{1, \ldots, d\}$. The points so created give rise to at most $3^d - 1$ pairwise disjoint intervals from $\mathscr{I}^d$ whose union is $J \setminus I$. Thus $J \setminus I$ is a figure and is representable as a finite union of pairwise disjoint sets from $\mathscr{I}^d$. That this obtains as well for every figure $F = I_1 \cup \ldots \cup I_n \in \mathscr{F}^d$ with $I_1, \ldots, I_n \in \mathscr{I}^d$ can now be seen as follows:
$$F = I_1 \cup (I_2 \setminus I_1) \cup (I_3 \setminus (I_1 \cup I_2)) \cup \ldots \cup (I_n \setminus (I_1 \cup \ldots \cup I_{n-1}))$$
exhibits $F$ as a union of $n$ pairwise disjoint sets, each of the form $I \setminus (J_1 \cup \ldots \cup J_m)$ with $I, J_1, \ldots, J_m$ intervals from $\mathscr{I}^d$. Thus it suffices to show that every set of this form is the union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$. But this follows from
$$I \setminus (J_1 \cup \ldots \cup J_m) = \bigcap_{i=1}^{m} (I \setminus J_i)$$
when, using what has already been proved, we write each $I \setminus J_i$ as a union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$ and distribute the intersection through these unions. □
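
The construction in this proof translates almost literally into code. The sketch below is an illustration with hypothetical function names: an interval $[a, b[$ is represented as a pair of coordinate tuples, the empty intersection as `None`, and `difference` cuts $J$ into the at most $3^d$ boxes determined by $I \cap J$ and discards the middle one.

```python
from itertools import product

def intersect(I, J):
    """Intersection of [a, b[ and [a', b'[, given as pairs of coordinate tuples.
    Returns None when the intersection is empty."""
    e = tuple(max(x, y) for x, y in zip(I[0], J[0]))
    f = tuple(min(x, y) for x, y in zip(I[1], J[1]))
    return (e, f) if all(x < y for x, y in zip(e, f)) else None

def volume(I):
    """Elementary content: the product of the side lengths."""
    v = 1
    for lo, hi in zip(I[0], I[1]):
        v *= hi - lo
    return v

def difference(J, I):
    """Pairwise disjoint intervals whose union is J minus I (J assumed non-empty)."""
    K = intersect(I, J)
    if K is None:
        return [J]
    d = len(J[0])
    # In coordinate i, J splits into three slabs: left of K, K itself, right of K.
    slabs = [[(J[0][i], K[0][i]), (K[0][i], K[1][i]), (K[1][i], J[1][i])]
             for i in range(d)]
    pieces = []
    for choice in product(range(3), repeat=d):
        if all(c == 1 for c in choice):
            continue                      # this combination is K = I & J itself
        a = tuple(slabs[i][c][0] for i, c in enumerate(choice))
        b = tuple(slabs[i][c][1] for i, c in enumerate(choice))
        if all(x < y for x, y in zip(a, b)):
            pieces.append((a, b))         # keep only non-empty boxes
    return pieces

J = ((0, 0), (3, 3))
I = ((1, 1), (2, 2))
pieces = difference(J, I)
print(len(pieces), sum(volume(p) for p in pieces))
```

With $I$ strictly inside $J$ in dimension 2, all $3^2 - 1$ boxes occur, and their elementary contents add up to that of $J$ minus that of $I$.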

4.2 Theorem. $\mathscr{F}^d$ is a ring in $\mathbb{R}^d$.

Proof. The only thing that is not obvious is property (1.10) of a ring, according to which along with any sets $F, G \in \mathscr{F}^d$ their difference $F \setminus G$ must also be in $\mathscr{F}^d$. By definition there exist intervals $I'_1, \ldots, I'_m, I''_1, \ldots, I''_n \in \mathscr{I}^d$ such that
$$F = \bigcup_{i=1}^{m} I'_i \quad\text{and}\quad G = \bigcup_{j=1}^{n} I''_j.$$
But then
$$F \setminus G = \bigcup_{i=1}^{m} \bigcap_{j=1}^{n} (I'_i \setminus I''_j)$$
and so it only has to be shown that each set $\bigcap_{j=1}^{n} (I'_i \setminus I''_j)$ is a figure. According to 4.1, $I'_i \setminus I''_j$ is always a figure. So it further suffices to demonstrate that the intersection of two (whence, of any finite number of) figures is itself a figure. If however $F$ and $G$ are two figures represented as above, then thanks to distributivity $F \cap G$ is just the union of the sets $I'_i \cap I''_j$ $(i = 1, \ldots, m;\ j = 1, \ldots, n)$, which by another appeal to 4.1 is a figure. □

By definition every figure is a union of finitely many intervals from $\mathscr{I}^d$. Consequently, $\mathscr{F}^d \subset \mathscr{R}$ for every ring $\mathscr{R}$ in $\mathbb{R}^d$ such that $\mathscr{I}^d \subset \mathscr{R}$. So theorem 4.2 really says that $\mathscr{F}^d$ is the ring generated by $\mathscr{I}^d$.
Our geometric intuition now suggests the validity of the following theorem:

4.3 Theorem. There exists exactly one content $\lambda$ on $\mathscr{F}^d$ with the property that $\lambda(I)$ coincides with the d-dimensional elementary content of $I$, for each $I \in \mathscr{I}^d$. This content is real-valued.

Proof. According to 4.1, every figure $F \in \mathscr{F}^d$ has a representation $F = I_1 \cup \ldots \cup I_n$ as a union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$. Every content $\lambda$ on $\mathscr{F}^d$ therefore satisfies
$$\lambda(F) = \lambda(I_1) + \ldots + \lambda(I_n),$$
which shows that $\lambda$ is determined throughout $\mathscr{F}^d$ just by its values on $\mathscr{I}^d$ and is necessarily real-valued. Thus all we have to do is settle the existence question. To this end we first define $\lambda$ only on $\mathscr{I}^d$ as it must be defined, namely $\lambda(I)$ shall be the d-dimensional elementary content of $I$ for each $I \in \mathscr{I}^d$. Then we have:
(a) Let $I = [a, b[ \; \in \mathscr{I}^d$, $a = (\alpha_1, \ldots, \alpha_d)$ and $b = (\beta_1, \ldots, \beta_d)$ and $\gamma$ a real number satisfying $\alpha_i \le \gamma \le \beta_i$ for a fixed $i \in \{1, \ldots, d\}$. The hyperplane with equation $\xi_i = \gamma$ divides $I$ into two disjoint intervals $I_1 := [a', b[$ and $I_2 := [a, b'[$, $a'$ being $a$ with its $i$th coordinate replaced by $\gamma$, and $b'$ being $b$ with its $i$th coordinate replaced by $\gamma$. From (4.2) then follows that $\lambda(I) = \lambda(I_1) + \lambda(I_2)$. Induction therefore yields
(b) If $I \in \mathscr{I}^d$ is decomposed by finitely many hyperplanes in the manner described in (a) into pairwise disjoint intervals $I_1, \ldots, I_n \in \mathscr{I}^d$, then $\lambda(I) = \lambda(I_1) + \ldots + \lambda(I_n)$. More generally:
(c) For any finitely many, pairwise disjoint $I_1, \ldots, I_n \in \mathscr{I}^d$ with $I_0 := I_1 \cup \ldots \cup I_n \in \mathscr{I}^d$, $\lambda(I_0) = \lambda(I_1) + \ldots + \lambda(I_n)$. In proving this we can obviously assume that each $I_j$ is not empty. Then there are points $a_j = (\alpha_{j1}, \ldots, \alpha_{jd})$ and $b_j = (\beta_{j1}, \ldots, \beta_{jd})$ from $\mathbb{R}^d$ with $a_j \lhd b_j$ and $I_j = [a_j, b_j[$, $j = 0, 1, \ldots, n$. The hyperplanes whose respective equations are $\xi_i = \alpha_{ji}$ or $\xi_i = \beta_{ji}$, for $i \in \{1, \ldots, d\}$, $j \in \{1, \ldots, n\}$ decompose $I_0$ into pairwise disjoint intervals $I'_1, \ldots, I'_m \in \mathscr{I}^d$. Each of $I_1, \ldots, I_n$ also decomposes into certain of these $I'_1, \ldots, I'_m$. The claimed equality therefore follows from $(n + 1)$ citations of (b).
(d) If now
$$F = I_1 \cup \ldots \cup I_n = J_1 \cup \ldots \cup J_m$$
are two representations of the figure $F \in \mathscr{F}^d$, each a union of pairwise disjoint intervals, then
$$\lambda(I_1) + \ldots + \lambda(I_n) = \lambda(J_1) + \ldots + \lambda(J_m).$$
Indeed, $I_j = I_j \cap F = \bigcup_{i=1}^{m} (I_j \cap J_i)$ is a representation of $I_j$ as a union of the pairwise disjoint intervals $I_j \cap J_1, \ldots, I_j \cap J_m$ and thanks to (c)
$$\lambda(I_j) = \sum_{i=1}^{m} \lambda(I_j \cap J_i) \quad (j = 1, \ldots, n).$$
Upon interchanging the roles of $i$ and $j$, one gets analogously
$$\lambda(J_i) = \sum_{j=1}^{n} \lambda(I_j \cap J_i) \quad (i = 1, \ldots, m).$$
Together these last two equations entail the equality $\sum_j \lambda(I_j) = \sum_i \lambda(J_i)$.
(e) Thus for every $F \in \mathscr{F}^d$ the number $\sum_j \lambda(I_j)$ is independent of the special representation
$$F = I_1 \cup \ldots \cup I_n$$
of $F$ as a union of finitely many pairwise disjoint $I_1, \ldots, I_n \in \mathscr{I}^d$. Therefore the decree
$$\lambda(F) := \lambda(I_1) + \ldots + \lambda(I_n)$$
well defines an extension, to be denoted still by $\lambda$, of the original function on $\mathscr{I}^d$ to one on $\mathscr{F}^d$. This function is real-valued, non-negative, and according to (d) finitely-additive. Since $\emptyset \in \mathscr{I}^d$ and $\lambda(\emptyset) = 0$, a content with the sought-for properties is at hand. □
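
Steps (a) and (d) of this proof can be tried out on a rectangle. In the sketch below the names `volume` and `split` are hypothetical, an interval $[a, b[$ is a pair of coordinate tuples, and the coordinate index `i` is 0-based; the same rectangle is cut into four pieces in two different ways and the elementary contents are summed.

```python
def volume(I):
    """Elementary content of [a, b[: the product of the side lengths."""
    v = 1
    for lo, hi in zip(I[0], I[1]):
        v *= hi - lo
    return v

def split(I, i, gamma):
    """Cut [a, b[ by the hyperplane xi_i = gamma, assuming a_i <= gamma <= b_i.
    Returns the two disjoint intervals [a, b'[ and [a', b[ of step (a)."""
    a, b = I
    a_prime = a[:i] + (gamma,) + a[i + 1:]
    b_prime = b[:i] + (gamma,) + b[i + 1:]
    return (a, b_prime), (a_prime, b)

I = ((0, 0), (2, 3))
lower, upper = split(I, 1, 1)     # cut the second coordinate at 1
left, right = split(I, 0, 1)      # cut the first coordinate at 1

# Two different decompositions of the same interval into four pieces.
first = [p for half in (lower, upper) for p in split(half, 0, 1)]
second = [p for half in (left, right) for p in split(half, 1, 2)]

print(volume(I), sum(volume(p) for p in first), sum(volume(p) for p in second))
```

Both decompositions give the same total, as step (d) asserts for arbitrary representations of a figure by pairwise disjoint intervals.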

4.4 Theorem. The content $\lambda$ on $\mathscr{F}^d$ is a premeasure.

Proof. Because $\lambda$ is finite, 3.2 says that we only need to prove the continuity of $\lambda$ at $\emptyset$. To this end, let $(F_n)$ be an antitone sequence of figures from $\mathscr{F}^d$. We will show that from the assumption that
$$\delta := \lim_{n \to \infty} \lambda(F_n) = \inf_{n \in \mathbb{N}} \lambda(F_n) > 0$$
follows
$$\bigcap_{n \in \mathbb{N}} F_n \ne \emptyset.$$
Each $F_n$ being a union of finitely many pairwise disjoint intervals from $\mathscr{I}^d$, it should be clear that by a slight leftward shift of the right endpoints of each of these intervals a new figure $G_n \in \mathscr{F}^d$ is created, whose topological closure $\overline{G}_n$ is still a subset of $F_n$, and
$$\lambda(F_n) - \lambda(G_n) \le 2^{-n}\delta.$$
If we set $H_n := G_1 \cap \ldots \cap G_n$, then $(H_n)$ is a sequence of sets from $\mathscr{F}^d$ satisfying $H_n \supset H_{n+1}$, $\overline{H}_n \subset \overline{G}_n \subset F_n$ for all $n$. Because $F_n$ is bounded its closed subset $\overline{H}_n$ is compact. As soon as we succeed in showing that each $H_n$ is not empty, it will follow from the finite-intersection property of compacts (WILLARD [1970], p. 118, KELLEY [1955], p. 136) that $\bigcap_{n \in \mathbb{N}} \overline{H}_n \ne \emptyset$ and so a fortiori $\bigcap_{n \in \mathbb{N}} F_n \ne \emptyset$. So let us prove that no $H_n$ is empty. For every $n \in \mathbb{N}$
(*) $\lambda(H_n) \ge \lambda(F_n) - (1 - 2^{-n})\delta$,
as we will confirm by induction. The inequality holds for $n = 1$ because $H_1 = G_1$, and by choice of $G_1$, $\lambda(F_1) - \lambda(G_1) \le 2^{-1}\delta$. Suppose the inequality valid for some $n$. Since $H_{n+1} = G_{n+1} \cap H_n$ and everything is finite, (3.5) gives
$$\lambda(H_{n+1}) = \lambda(G_{n+1}) + \lambda(H_n) - \lambda(G_{n+1} \cup H_n).$$
From the induction hypothesis $\lambda(H_n) \ge \lambda(F_n) - (1 - 2^{-n})\delta$; from the choice of $G_{n+1}$, $\lambda(G_{n+1}) \ge \lambda(F_{n+1}) - 2^{-n-1}\delta$ and $G_{n+1} \cup H_n \subset F_{n+1} \cup F_n = F_n$, so that $\lambda(G_{n+1} \cup H_n) \le \lambda(F_n)$. Combining these observations completes the inductive step in the confirmation of (*):
$$\lambda(H_{n+1}) \ge \lambda(F_{n+1}) - 2^{-n-1}\delta - (1 - 2^{-n})\delta = \lambda(F_{n+1}) - (1 - 2^{-n-1})\delta.$$
Recalling that $\lambda(F_n) \ge \delta$ by definition of $\delta$, we infer from (*) the inequality $\lambda(H_n) \ge 2^{-n}\delta > 0$ and therewith the fact that $H_n \ne \emptyset$, the last link that had to be accounted for in the logical chain. □

4.5 Definition. The premeasure $\lambda$ on the ring $\mathscr{F}^d$ of d-dimensional figures in $\mathbb{R}^d$ is called Lebesgue premeasure in $\mathbb{R}^d$ or d-dimensional Lebesgue premeasure. From now on it will be denoted by $\lambda^d$.
Here we encounter for the first time the name of the French mathematician
H. LEBESGUE (1875-1941), the inventor of the measure and integration con-
cepts that today are named after him. The development of the theory of mea-
sure and integration was spurred above all by his investigations and those of his
countryman E. BOREL (1871-1956). For the history of Lebesgue integration see
DIEUDONNE [1978] and HAWKINS [1970].

Exercises.
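1. Show that on $\mathscr{F}^1$ there is exactly one content $\mu$ that assigns to the right half-open interval $[\alpha, \beta[$, $\alpha, \beta \in \mathbb{R}$, the following values
$$\mu([\alpha, \beta[) = \begin{cases} 1 & \text{if } \alpha < 0 \le \beta \\ 0 & \text{in all other cases.} \end{cases}$$
Is $\mu$ $\sigma$-additive?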
2. Two intervals $I_0, J \in \mathscr{I}^d$ with $I_0 \subset J$ are given. Prove the existence of $k \le 2d$ intervals $I_1, \ldots, I_k \in \mathscr{I}^d$ with the following two properties: (i) $I_0 \cup \ldots \cup I_j \in \mathscr{I}^d$ for each $j \in \{0, \ldots, k\}$; (ii) $J = I_0 \cup \ldots \cup I_k$. [Hint: Proceed by induction on the dimension $d$.]

§5. Extension of a premeasure to a measure

Lebesgue premeasure is not a measure because its domain of definition, the ring of d-dimensional figures, is not a $\sigma$-algebra. For example, the whole space $\mathbb{R}^d$ is not in $\mathscr{F}^d$, every d-dimensional figure being a bounded subset of $\mathbb{R}^d$.
The elementary geometric considerations sketched at the beginning of this chapter however suggest that the domain of the premeasure $\lambda^d$ be so enlarged that a "numerical measure" gets assigned also to more complicated subsets of $\mathbb{R}^d$. The most satisfactory such result would say that $\lambda^d$ can be extended in exactly one way to a measure on an appropriate $\sigma$-algebra $\mathscr{A}$ in $\mathbb{R}^d$ with $\mathscr{F}^d \subset \mathscr{A}$.
Here we encounter the following general problem: A ring $\mathscr{R}$ in a set $\Omega$ and a content $\mu$ on $\mathscr{R}$ are given. Under what conditions does there exist a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ and a measure $\tilde\mu$ on $\mathscr{A}$ such that $\mu$ is the restriction of $\tilde\mu$ to $\mathscr{R}$? An obvious necessary condition for this is that $\mu$ be a premeasure on $\mathscr{R}$. The designation "premeasure" will turn out to be justified if we can show the converse: For every premeasure $\mu$ on a ring $\mathscr{R}$ there exists a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ with $\mathscr{R} \subset \mathscr{A}$, and a measure $\tilde\mu$ on $\mathscr{A}$ satisfying $\tilde\mu \mid \mathscr{R} = \mu$. It suffices to take for $\mathscr{A}$ the $\sigma$-algebra $\sigma(\mathscr{R})$ generated in $\Omega$ by $\mathscr{R}$.

5.1 Theorem (Extension theorem). Every premeasure $\mu$ on a ring $\mathscr{R}$ in $\Omega$ can be extended in at least one way to a measure $\tilde\mu$ on the $\sigma$-algebra $\sigma(\mathscr{R})$ generated by $\mathscr{R}$ in $\Omega$.

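Proof. For each subset $Q \subset \Omega$ designate by $\mathscr{U}(Q)$ the set of all sequences $(A_n)_{n \in \mathbb{N}}$ of sets from $\mathscr{R}$ which cover $Q$, that is, which satisfy
$$Q \subset \bigcup_{n \in \mathbb{N}} A_n.$$
Then the numerical function $\mu^*$ may be defined on $\mathscr{P}(\Omega)$ via
(5.1) $\mu^*(Q) := \begin{cases} \inf\bigl\{\sum_{n=1}^{\infty} \mu(A_n) : (A_n) \in \mathscr{U}(Q)\bigr\}, & \text{in case } \mathscr{U}(Q) \ne \emptyset \\ +\infty, & \text{in case } \mathscr{U}(Q) = \emptyset. \end{cases}$

It has the following properties:

(5.2) $\mu^*(\emptyset) = 0$;
(5.3) $Q_1 \subset Q_2 \Rightarrow \mu^*(Q_1) \le \mu^*(Q_2)$;
(5.4) $(Q_n)_{n \in \mathbb{N}} \subset \mathscr{P}(\Omega) \Rightarrow \mu^*\bigl(\bigcup_{n=1}^{\infty} Q_n\bigr) \le \sum_{n=1}^{\infty} \mu^*(Q_n)$.

Equality (5.2) follows from the observation that the constant sequence $\emptyset, \emptyset, \ldots$ is in $\mathscr{U}(\emptyset)$. The observation that $\mathscr{U}(Q_2) \subset \mathscr{U}(Q_1)$ follows from $Q_1 \subset Q_2$ serves to confirm (5.3). For the proof of (5.4) it can evidently be assumed that $\mu^*(Q_n)$ is finite and so in particular $\mathscr{U}(Q_n) \ne \emptyset$, for every $n \in \mathbb{N}$. For an arbitrary $\varepsilon > 0$ then, each $\mathscr{U}(Q_n)$ contains a sequence $(A_{nm})_{m \in \mathbb{N}}$ such that
$$\sum_{m=1}^{\infty} \mu(A_{nm}) \le \mu^*(Q_n) + 2^{-n}\varepsilon.$$
The double sequence $(A_{nm})_{n,m \in \mathbb{N}}$ lies in $\mathscr{U}\bigl(\bigcup_{n=1}^{\infty} Q_n\bigr)$ and as a consequence the definition of $\mu^*$ gives
$$\mu^*\Bigl(\bigcup_{n=1}^{\infty} Q_n\Bigr) \le \sum_{n,m \in \mathbb{N}} \mu(A_{nm}) \le \sum_{n=1}^{\infty} \mu^*(Q_n) + \varepsilon$$
and (5.4) follows from this and the arbitrariness of $\varepsilon > 0$. It is immediate from the definition that
(5.5) $\mu^* \ge 0$.
Decisive for what follows is the fact that every $A \in \mathscr{R}$ satisfies
(5.6) $\mu^*(Q) \ge \mu^*(Q \cap A) + \mu^*(Q \cap \complement A)$ for every $Q \in \mathscr{P}(\Omega)$,
as well as
(5.7) $\mu^*(A) = \mu(A)$.
In proving (5.6) we can again assume $\mu^*(Q) < +\infty$, so that $\mathscr{U}(Q) \ne \emptyset$. First of all we have
$$\sum_{n=1}^{\infty} \mu(A_n) = \sum_{n=1}^{\infty} \mu(A_n \cap A) + \sum_{n=1}^{\infty} \mu(A_n \setminus A)$$
for every sequence $(A_n)$ from $\mathscr{U}(Q)$, due to the finite additivity of $\mu$. Moreover, the sequence $(A_n \cap A)$ lies in $\mathscr{U}(Q \cap A)$ and the sequence $(A_n \setminus A)$ lies in $\mathscr{U}(Q \setminus A)$. Consequently,
$$\sum_{n=1}^{\infty} \mu(A_n) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$$
for every such sequence $(A_n)$, and from this fact (5.6) is immediate. Equality (5.7) follows on the one hand from (3.10), according to which $\mu(A) \le \mu^*(A)$, and on the other hand from consideration of the sequence $A, \emptyset, \emptyset, \ldots$ which lies in $\mathscr{U}(A)$.
The significance of what has been proven lies in the fact, which we will establish, that the system $\mathscr{A}^*$ of all sets $A \in \mathscr{P}(\Omega)$ satisfying (5.6) is a $\sigma$-algebra in $\Omega$ and the restriction of $\mu^*$ to $\mathscr{A}^*$ is a measure. Now (5.6) as just proved says that $\mathscr{R} \subset \mathscr{A}^*$, and so we shall have $\sigma(\mathscr{R}) \subset \mathscr{A}^*$. Then according to (5.7) $\tilde\mu := \mu^* \mid \sigma(\mathscr{R})$ is an extension of $\mu$ to a measure on $\sigma(\mathscr{R})$. The definition and theorem which follow will therefore complete the present proof. □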

5.2 Definition. A numerical function $\mu^*$ on the power set $\mathscr{P}(\Omega)$ having properties (5.2)-(5.4) is called an outer measure on the set $\Omega$. A subset $A$ of $\Omega$ is called $\mu^*$-measurable if it satisfies (5.6).
Notice that $\mu^* \ge 0$ always prevails, an immediate consequence of (5.2) and (5.3) together.
The idea in the proof of the measure-extension theorem, which goes back to C. CARATHÉODORY (1873-1950), consists in associating via definition (5.1) an outer measure to the premeasure $\mu$ on $\mathscr{R}$ and then invoking the following theorem.

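5.3 Theorem (Carathéodory). Let $\mu^*$ be an outer measure on a set $\Omega$. Then the system $\mathscr{A}^*$ of all $\mu^*$-measurable sets $A \subset \Omega$ is a $\sigma$-algebra in $\Omega$. Moreover, the restriction of $\mu^*$ to $\mathscr{A}^*$ is a measure.

Proof. First let us note that the requirement (5.6) for a subset $A$ of $\Omega$ to lie in $\mathscr{A}^*$ is equivalent to
(5.6') $\mu^*(Q) = \mu^*(Q \cap A) + \mu^*(Q \setminus A)$ for all $Q \in \mathscr{P}(\Omega)$,
because from (5.4) applied to the sequence $Q \cap A, Q \setminus A, \emptyset, \emptyset, \ldots$ follows the reverse of inequality (5.6), for every $Q \in \mathscr{P}(\Omega)$. From either (5.6) or (5.6') it is immediate that $\Omega \in \mathscr{A}^*$, and because of their symmetry in $A$ and $\complement A$, whenever $A$ lies in $\mathscr{A}^*$, so does $\complement A$. The following considerations will show that with each two of its sets $A$ and $B$, $\mathscr{A}^*$ also contains their union $A \cup B$, and so $\mathscr{A}^*$ is an algebra. $B \in \mathscr{A}^*$ entails that
$$\mu^*(Q) = \mu^*(Q \cap B) + \mu^*(Q \setminus B)$$
for every $Q \in \mathscr{P}(\Omega)$. Replacing $Q$ here first by $Q \cap A$, then by $Q \setminus A = Q \cap \complement A$, we get two new equalities (valid for all $Q \in \mathscr{P}(\Omega)$) which, when inserted into (5.6'), lead to
$$\mu^*(Q) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap \complement B) + \mu^*(Q \cap \complement A \cap B) + \mu^*(Q \cap \complement A \cap \complement B).$$
Replacing $Q$ here by $Q \cap (A \cup B)$ gives
(5.8) $\mu^*(Q \cap (A \cup B)) = \mu^*(Q \cap A \cap B) + \mu^*(Q \cap A \cap \complement B) + \mu^*(Q \cap \complement A \cap B)$,
which in conjunction with the preceding equality yields
$$\mu^*(Q) = \mu^*(Q \cap (A \cup B)) + \mu^*(Q \cap \complement A \cap \complement B) = \mu^*(Q \cap (A \cup B)) + \mu^*(Q \setminus (A \cup B)).$$
This being valid for all $Q \in \mathscr{P}(\Omega)$ affirms that $A \cup B \in \mathscr{A}^*$.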
Now let $(A_n)$ be a sequence of pairwise disjoint sets from $\mathscr{A}^*$ and $A$ be their union. The choice of $A := A_1$, $B := A_2$ in (5.8) produces
$$\mu^*(Q \cap (A_1 \cup A_2)) = \mu^*(Q \cap A_1) + \mu^*(Q \cap A_2).$$
An induction argument generalizes this to
$$\mu^*\Bigl(Q \cap \bigcup_{i=1}^{n} A_i\Bigr) = \sum_{i=1}^{n} \mu^*(Q \cap A_i)$$
for all $Q \in \mathscr{P}(\Omega)$, all $n \in \mathbb{N}$. Recalling that $B_n := \bigcup_{i=1}^{n} A_i$ has already been proven to be in $\mathscr{A}^*$, and that $Q \setminus B_n \supset Q \setminus A$, so that $\mu^*(Q \setminus B_n) \ge \mu^*(Q \setminus A)$, we obtain
$$\mu^*(Q) = \mu^*(Q \cap B_n) + \mu^*(Q \setminus B_n) \ge \sum_{i=1}^{n} \mu^*(Q \cap A_i) + \mu^*(Q \setminus A)$$
for all $n \in \mathbb{N}$. From this and an application of (5.4) follows
$$\mu^*(Q) \ge \sum_{n=1}^{\infty} \mu^*(Q \cap A_n) + \mu^*(Q \setminus A) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$$
and consequently, as noted at the beginning of the proof, we actually have equality throughout:
$$\mu^*(Q) = \sum_{n=1}^{\infty} \mu^*(Q \cap A_n) + \mu^*(Q \setminus A) = \mu^*(Q \cap A) + \mu^*(Q \setminus A),$$
holding for all $Q \in \mathscr{P}(\Omega)$. Thus $A$ lies in $\mathscr{A}^*$. After all this we recognize that the algebra $\mathscr{A}^*$ is an $\cap$-stable Dynkin system and therefore by Theorem 2.3 a $\sigma$-algebra. If in the last pair of equalities we take $Q := A$, we get
$$\mu^*(A) = \sum_{n=1}^{\infty} \mu^*(A_n),$$
proving that the restriction of $\mu^*$ to $\mathscr{A}^*$ is a measure. □
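
On a finite set the whole construction, from the premeasure through the outer measure (5.1) to the system of measurable sets, can be carried out by enumeration. The sketch below rests on assumptions chosen for illustration: a hypothetical premeasure on the ring $\{\emptyset, \{1, 2\}, \{3\}, \Omega\}$ in $\Omega = \{1, 2, 3\}$, and the observation that for a finite ring the infimum in (5.1) is attained by a finite cover consisting of distinct sets.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

omega = frozenset({1, 2, 3})
# Hypothetical premeasure on the ring {empty set, {1, 2}, {3}, omega}.
premeasure = {frozenset(): 0, frozenset({1, 2}): 2, frozenset({3}): 1, omega: 3}

def outer(Q):
    """Outer measure as in (5.1); for a finite ring, finite covers by distinct sets suffice."""
    best = float("inf")
    ring = list(premeasure)
    for r in range(len(ring) + 1):
        for cover in combinations(ring, r):
            if Q <= frozenset().union(*cover):
                best = min(best, sum(premeasure[A] for A in cover))
    return best

def is_measurable(A):
    """Condition (5.6'), tested against every subset Q of omega."""
    return all(outer(Q) == outer(Q & A) + outer(Q - A) for Q in powerset(omega))

measurable = [A for A in powerset(omega) if is_measurable(A)]

print(sorted(sorted(A) for A in measurable))
print(outer(frozenset({1})), outer(frozenset({1, 3})))
```

In this small case the measurable sets turn out to be exactly the sets of the ring: the outer measure cannot tell the points 1 and 2 apart, so a set such as $\{1\}$ fails (5.6') for $Q = \{1, 2\}$.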

It can be further shown that in many important cases the measure $\tilde\mu$ from
Theorem 5.1 is uniquely determined. As a preliminary we give a proof that is
a typical application of the technique of Dynkin systems. (Cf. also Exercise 9.)

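5.4 Theorem (Uniqueness theorem). Let $\mathscr{E}$ be an $\cap$-stable generator of a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ and suppose that $(E_n)$ is a sequence in $\mathscr{E}$ with $\bigcup_{n \in \mathbb{N}} E_n = \Omega$. Then measures $\mu_1$ and $\mu_2$ on $\mathscr{A}$ which satisfy
(i) $\mu_1(E) = \mu_2(E)$ for all $E \in \mathscr{E}$
and
(ii) $\mu_1(E_n) = \mu_2(E_n) < +\infty$ for all $n \in \mathbb{N}$
must in fact be identical.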

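Proof. Denote by $\mathcal{E}_f$ the system of all sets $E \in \mathcal{E}$ satisfying $\mu_1(E) = \mu_2(E) < +\infty$.
For a given $E \in \mathcal{E}_f$ consider the system
$$\mathcal{D}_E := \{D \in \mathcal{A} : \mu_1(E \cap D) = \mu_2(E \cap D)\}.$$
We will show that it is a Dynkin system. Obviously $\Omega \in \mathcal{D}_E$. If $D \in \mathcal{D}_E$, then
$\mu_1(E \cap D) = \mu_2(E \cap D) < +\infty$ (since $E \in \mathcal{E}_f$), and so (3.7) shows that
$$\mu_1(E \cap \complement D) = \mu_1(E \setminus E \cap D) = \mu_1(E) - \mu_1(E \cap D) = \mu_2(E) - \mu_2(E \cap D) = \mu_2(E \cap \complement D),$$
which says that $\complement D \in \mathcal{D}_E$. The remaining property of Dynkin systems (2.3) follows
at once from the $\sigma$-additivity of the measures $\mu_1$, $\mu_2$. Because $\mathcal{E}$ is $\cap$-stable,
$\mathcal{E} \subset \mathcal{D}_E$ follows from (i) and the definition of $\mathcal{D}_E$. But then $\delta(\mathcal{E}) \subset \mathcal{D}_E$ because
$\delta(\mathcal{E})$ is the smallest Dynkin system which contains $\mathcal{E}$. From Theorem 2.4 however,
$\delta(\mathcal{E}) = \sigma(\mathcal{E}) = \mathcal{A}$. Therefore $\delta(\mathcal{E}) \subset \mathcal{D}_E \subset \mathcal{A}$ entails $\mathcal{D}_E = \mathcal{A}$. Thus

(5.9) $\mu_1(E \cap A) = \mu_2(E \cap A)$

holds for all $E \in \mathcal{E}_f$ and $A \in \mathcal{A}$. On account of (ii) then in particular

(5.9') $\mu_1(E_n \cap A) = \mu_2(E_n \cap A)$ $(n \in \mathbb{N},\ A \in \mathcal{A})$.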
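In analogy with the proofs of (3.8) and (3.10) we set
$$F_1 := E_1,\quad F_2 := E_2 \setminus E_1,\ \ldots,\ F_n := E_n \setminus (E_1 \cup \ldots \cup E_{n-1}),\ \ldots,$$
and get a sequence $(F_n)$ of pairwise disjoint sets from $\mathcal{A}$ satisfying $F_n \subset E_n$ for
all $n \in \mathbb{N}$ and $\bigcup_{n \in \mathbb{N}} F_n = \bigcup_{n \in \mathbb{N}} E_n = \Omega$. Since $F_n \cap A \in \mathcal{A}$ it follows from (5.9') that
$$\mu_1(F_n \cap A) = \mu_1(E_n \cap F_n \cap A) = \mu_2(E_n \cap F_n \cap A) = \mu_2(F_n \cap A)$$
for all $A \in \mathcal{A}$ and all $n \in \mathbb{N}$. But then the fact that
$$A = \bigcup_{n \in \mathbb{N}} (F_n \cap A)$$
combines with the $\sigma$-additivity of $\mu_1$ and $\mu_2$ to deliver
$$\mu_1(A) = \sum_{n=1}^{\infty} \mu_1(F_n \cap A) = \sum_{n=1}^{\infty} \mu_2(F_n \cap A) = \mu_2(A)$$
for every $A \in \mathcal{A}$, which says that the measures $\mu_1$, $\mu_2$ are identical. □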
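
The role of $\cap$-stability in Theorem 5.4 can be made concrete on a four-point set. The following is a minimal sketch, not part of the original text: the set, the generator and the two weight tables are hypothetical choices made only for illustration. The generator contains $\Omega$, so the covering hypothesis holds, but it is not $\cap$-stable, and two probability measures that agree on it differ on the generated $\sigma$-algebra.

```python
from itertools import combinations

def generated_sigma_algebra(omega, generator):
    """Close `generator` under complement and union; on a finite set this yields sigma(generator)."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generator}
    while True:
        new_sets = {omega - a for a in family}
        new_sets |= {a | b for a, b in combinations(family, 2)}
        if new_sets <= family:          # nothing new: the family is closed
            return family
        family |= new_sets

def measure(weights, subset):
    """Measure of `subset` for the discrete measure with point masses `weights`."""
    return sum(weights[w] for w in subset)

omega = {1, 2, 3, 4}
generator = [{1, 2}, {2, 3}, omega]     # hypothetical generator; {1,2} ∩ {2,3} = {2} is missing
mu1 = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
mu2 = {1: 0.5, 2: 0.0, 3: 0.5, 4: 0.0}

sigma = generated_sigma_algebra(omega, generator)
agree_on_generator = all(measure(mu1, g) == measure(mu2, g) for g in generator)
differ_on_sigma = any(measure(mu1, a) != measure(mu2, a) for a in sigma)
print(len(sigma), agree_on_generator, differ_on_sigma)
```

Adding the missing intersection $\{2\}$ to the generator makes it $\cap$-stable, and then the two measures no longer agree on it, in accordance with the theorem.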

For finite measures some other natural stability properties of the generator $\mathcal{E}$
(e.g., its closure under set-differences) also ensure uniqueness. See, for example,
ROBERTSON [1967].

In order to be able to formulate a useful sufficient condition for the uniqueness
of the measure $\tilde{\mu}$ from Theorem 5.1, we make the

5.5 Definition. A content $\mu$ on a ring $\mathcal{R}$ in $\Omega$ is called $\sigma$-finite when a sequence
$(A_n)_{n \in \mathbb{N}}$ of sets from $\mathcal{R}$ exists such that $\bigcup_{n \in \mathbb{N}} A_n = \Omega$ and $\mu(A_n) < +\infty$ for
every $n \in \mathbb{N}$.

Examples. 1. Suppose that the content $\mu$ on the ring $\mathcal{R}$ in $\Omega$ is finite, that is,
$\mu(A) < +\infty$ for every $A \in \mathcal{R}$. The $\sigma$-finiteness of $\mu$ is then equivalent to the existence
of a sequential covering $(A_n)$ of $\Omega$ by sets $A_n \in \mathcal{R}$. But the latter condition does
not automatically hold, as the trivial example $\Omega \ne \emptyset$, $\mathcal{R} := \{\emptyset\}$ illustrates.
In general, the $\sigma$-finiteness of a content $\mu$ on a ring $\mathcal{R}$ is equivalent to the
existence of a sequence $(A'_n)$ of sets in $\mathcal{R}$ with $\mu(A'_n) < +\infty$ for all $n$ and $A'_n \uparrow \Omega$.
In fact, if $(A_n)$ is merely a covering of $\Omega$ by sets in $\mathcal{R}$ having finite $\mu$-measure,
then the sets $A'_n := A_1 \cup \ldots \cup A_n$, $n \in \mathbb{N}$, furnish a sequence of the desired kind.
2. Lebesgue premeasure $\lambda^d$ is $\sigma$-finite (as well as finite). For if we denote by $n$
the point in $\mathbb{R}^d$ whose coordinates are all equal to $n$, then $I_n := [-n, n[$ is an
interval from $\mathcal{I}^d$, $\lambda^d(I_n) < +\infty$ $(n \in \mathbb{N})$, and $I_n \uparrow \mathbb{R}^d$.
3. The counting measure on a set $\Omega$, defined in Example 6 of §3, is $\sigma$-finite (resp.,
finite) just when $\Omega$ is countable (resp., finite).
In summary we have

5.6 Theorem. Every $\sigma$-finite premeasure $\mu$ on a ring $\mathcal{R}$ in a set $\Omega$ can be extended
in exactly one way to a measure $\tilde{\mu}$ on $\sigma(\mathcal{R})$.

Proof. Only the uniqueness of $\tilde{\mu}$ has to be proved. But this follows immediately
from 5.4: thanks to the $\sigma$-finiteness of $\mu$, the ring $\mathcal{R}$ has all the properties required
of the generator $\mathcal{E}$ in the hypothesis of 5.4. □

Remark. The hypothesis of $\sigma$-finiteness of $\mu$ in 5.6 cannot be dispensed with. It
suffices to look, as in Example 1, at a non-empty set $\Omega$ and to take for $\mathcal{R}$ the ring
consisting just of the empty set. On $\sigma(\mathcal{R}) = \{\emptyset, \Omega\}$ two different measures having
the same restriction to $\mathcal{R}$ are defined by $\mu(\emptyset) = \nu(\emptyset) := 0$ and $\mu(\Omega) := 0 =: 1 - \nu(\Omega)$.
The uniqueness of the measure $\tilde{\mu}$ which extends the $\sigma$-finite premeasure $\mu$ in
5.6 is expressed more dramatically by the following approximation property. For
simplicity we formulate it only for finite measures on an algebra.

5.7 Theorem (Approximation property). Let $\mu$ be a finite measure on a $\sigma$-algebra
$\mathcal{A}$ in $\Omega$ which is generated by an algebra $\mathcal{A}_0$ in $\Omega$. Then for each $A \in \mathcal{A}$ there
is a sequence $(C_n)_{n \in \mathbb{N}}$ in $\mathcal{A}_0$ satisfying

(5.10) $\lim_{n \to \infty} \mu(A \,\triangle\, C_n) = 0$.

Here $\triangle$ designates the symmetric difference defined in Exercise 2 of §1. Exercise 7
of §3 is the real justification for the terminology "approximation property".

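Proof. Let $A \in \mathcal{A}$, $\varepsilon > 0$ be given. At issue is the existence of a $C \in \mathcal{A}_0$ with
$\mu(A \,\triangle\, C) < \varepsilon$. According to 5.1 and 5.6, especially the equation (5.1) which
extends $\mu \mid \mathcal{A}_0$ to $\mathcal{A}$, there exists a sequence $(A_n)_{n \in \mathbb{N}}$ in $\mathcal{A}_0$ which covers $A$ and
satisfies

(5.11) $0 \le \sum_{n=1}^{\infty} \mu(A_n) - \mu(A) < \varepsilon/2$.

If we set $C_n := \bigcup_{i=1}^{n} A_i$, $n \in \mathbb{N}$, then $A' := \bigcup_{n \in \mathbb{N}} A_n$ satisfies

$C_n \uparrow A'$ and $A' \setminus C_n \downarrow \emptyset$.

Since $\mu$ is finite, and consequently continuous at $\emptyset$, an $n_0 \in \mathbb{N}$ exists for which

(5.12) $\mu(A' \setminus C_{n_0}) < \varepsilon/2$.

Let us show that the set $C := C_{n_0} \in \mathcal{A}_0$ does what is wanted:
$$A \,\triangle\, C = (A \setminus C) \cup (C \setminus A) \subset (A' \setminus C) \cup (A' \setminus A),$$
and so the subadditivity of $\mu$ yields
$$\mu(A \,\triangle\, C) \le \mu(A' \setminus C) + \mu(A' \setminus A) = \mu(A' \setminus C) + \mu(A') - \mu(A) \le \mu(A' \setminus C) + \sum_{n=1}^{\infty} \mu(A_n) - \mu(A) < \varepsilon/2 + \varepsilon/2,$$
by (5.11) and (5.12), which establishes the claim $\mu(A \,\triangle\, C) < \varepsilon$. □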
It should also be noted that

(5.13) $\lim_{n \to \infty} \mu(C_n) = \mu(A)$

follows immediately from (5.10). The inequalities

(5.14) $|\mu(C_n) - \mu(A)| \le \mu(A \,\triangle\, C_n)$ $(n \in \mathbb{N})$

make this obvious: For $C, D \in \mathcal{A}$, $C \subset D \cup (C \setminus D)$, so that $\mu(C) - \mu(D) \le \mu(C \setminus D) \le \mu(C \,\triangle\, D)$.
As $C$ and $D$ may be interchanged here, (5.14) is confirmed. □
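
Inequality (5.14) can be checked by brute force for a finite measure on a small finite set. The sketch below is only an illustration under assumed data: the four point masses are hypothetical and the power set stands in for the $\sigma$-algebra.

```python
from itertools import combinations

# Hypothetical point masses defining a finite measure on a four-point set.
weights = {"a": 0.5, "b": 1.25, "c": 2.0, "d": 0.25}

def mu(subset):
    """Measure of a subset: the sum of its point masses."""
    return sum(weights[w] for w in subset)

def all_subsets(points):
    """All subsets of a finite set, as frozensets."""
    points = list(points)
    return [frozenset(c) for r in range(len(points) + 1)
            for c in combinations(points, r)]

subsets = all_subsets(weights)
# Collect every pair (C, D) that would violate |mu(C) - mu(D)| <= mu(C symmetric-difference D).
violations = [(c, d) for c in subsets for d in subsets
              if abs(mu(c) - mu(d)) > mu(c ^ d) + 1e-12]
print(len(subsets), len(violations))
```

The operator `^` on frozensets is the symmetric difference, so the list of violations is empty exactly when the inequality holds for all pairs.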

Exercises.
1. Let $\mu = \varepsilon_\omega$ be the premeasure on a ring $\mathcal{R}$ in $\Omega$ defined by putting unit mass at
the point $\omega \in \Omega$. Under the hypothesis that $\{\omega\}$ can be realized as the intersection
of a sequence from $\mathcal{R}$ and $\Omega$ as the union of such a sequence, prove that: (a) The
outer measure $\mu^*$ defined from $\mu$ via (5.1) assigns to every set $A \in \mathcal{P}(\Omega)$ the
value 1 or 0, according as $\omega \in A$ or $\omega \in \complement A$. (b) Every subset of $\Omega$ is $\mu^*$-measurable.
(c) $\mu^*$ is the measure $\varepsilon_\omega$ on $\mathcal{P}(\Omega)$.
2. Consider the measure $\mu$ in Examples 2 and 7 of §3, say for $\Omega := \mathbb{R}$, and prove
that: (a) The outer measure $\mu^*$ defined from $\mu$ via (5.1) assigns to every set
$A \in \mathcal{P}(\Omega)$ the value 0 or 1, according as $A$ is countable or not. (b) $\mu^*$ is not
a measure on $\mathcal{P}(\Omega)$, not even a content. (c) The only $\mu^*$-measurable sets are
those in the $\sigma$-algebra $\mathcal{A}$ on which $\mu$ is defined.
3. Let $\mathcal{A}$ be the $\sigma$-algebra generated by an algebra $\mathcal{A}_0$ on the set $\Omega$, $\mu$ and $\nu$
measures on $\mathcal{A}$. Show that the validity of $\mu(A) \le \nu(A)$ for all $A \in \mathcal{A}_0$ need not
imply its validity for all $A \in \mathcal{A}$. [Hint: $\mathcal{A}_0 := \mathcal{F}^1$, $\nu$ counting measure, $\mu := 2\nu$.]
Find supplemental hypotheses that will render such an implication true.
4. Show that the sequence required in Definition 5.5 of the $\sigma$-finiteness of the
content $\mu$ on the ring $\mathcal{R}$ in $\Omega$ can always be chosen to be a sequence of pairwise
disjoint sets from $\mathcal{R}$ which cover $\Omega$ and each have finite measure.
5. Let $\mu$ be a $\sigma$-finite measure on a $\sigma$-algebra $\mathcal{A}$ in $\Omega$, and $\mu^*$ the outer measure
defined by (5.1). Then to every set $Q \in \mathcal{P}(\Omega)$ corresponds an $A \in \mathcal{A}$, called
a measurable hull of $Q$, with the properties that $Q \subset A$, $\mu^*(Q) = \mu(A)$, and
$\mu(B) = 0$ for all $B \in \mathcal{A}$ such that $B \subset A \setminus Q$. [Hint: In case $\mu^*(Q) < +\infty$, show
that there exists a sequence $(A_n)$ in $\mathcal{A}$ with $Q \subset A_n$ and $\mu(A_n) \le \mu^*(Q) + n^{-1}$
for every $n \in \mathbb{N}$. Then $A := \bigcap_{n \in \mathbb{N}} A_n$ has the desired properties.]
6. A measure $\mu$ on a $\sigma$-algebra $\mathcal{A}$ in $\Omega$ is called complete if every subset of a $\mu$-null
set (cf. Exercise 5, §3) belongs to $\mathcal{A}$, and consequently is itself a $\mu$-null set. Show
that:
(a) The measure $\mu^* \mid \mathcal{A}^*$ from Theorem 5.3 is complete.
(b) The measure in Examples 2 and 7 of §3 is complete.
(c) If $\mathcal{A}$ is a $\sigma$-algebra in a set $\Omega$, $\omega \in \Omega$ and $\{\omega\} \in \mathcal{A}$, then the Dirac measure $\varepsilon_\omega$
on $\mathcal{A}$ is complete just when $\mathcal{A} = \mathcal{P}(\Omega)$.
7. (a) Show that every measure $\mu$ on a $\sigma$-algebra $\mathcal{A}$ in a set $\Omega$ can be completed.
That is, $\mu$ can be extended to a complete measure $\mu_0$ on a $\sigma$-algebra $\mathcal{A}_0$ in $\Omega$,
$\mathcal{A} \subset \mathcal{A}_0$, in such a way that every complete measure $\mu'$ on a $\sigma$-algebra $\mathcal{A}'$ in $\Omega$,
$\mathcal{A} \subset \mathcal{A}'$, which extends $\mu$ is also an extension of $\mu_0$. The (obviously unique)
$\sigma$-algebra $\mathcal{A}_0$ is called the $\mu$-completion of $\mathcal{A}$; the triple $(\Omega, \mathcal{A}_0, \mu_0)$ is called the
completion of $(\Omega, \mathcal{A}, \mu)$. [For such triples the term measure space will be introduced
in §7.]
(b) Determine the completion of $(\Omega, \mathcal{A}, \varepsilon_\omega)$ from Exercise 6(c).
(c) Show that the $\mu$-completion $\mathcal{A}_0$ of a $\sigma$-algebra $\mathcal{A}$ in $\Omega$ consists of all sets $A \cup N$
with $A \in \mathcal{A}$ and $N$ a subset of a $\mu$-null set. For every such set, $\mu_0(A \cup N) = \mu(A)$.
(d) Characterize the sets in $\mathcal{A}_0$ as follows: A set $A_0 \subset \Omega$ lies in $\mathcal{A}_0$ just if sets
$A_1, A_2 \in \mathcal{A}$ exist such that $A_1 \subset A_0 \subset A_2$ and $\mu(A_2 \setminus A_1) = 0$.
8. Let $\mu$ be a $\sigma$-finite measure on a $\sigma$-algebra $\mathcal{A}$ in $\Omega$, $\mu^*$ the outer measure it
determines via (5.1), and $\mathcal{A}^*$ the $\sigma$-algebra of all $\mu^*$-measurable subsets of $\Omega$.
With the help of Exercises 5 and 7, show that $(\Omega, \mathcal{A}^*, \mu^* \mid \mathcal{A}^*)$ is the completion
of $(\Omega, \mathcal{A}, \mu)$.
9. The proof of Theorem 5.4 only uses condition (i) for sets $E \in \mathcal{E}$ which satisfy
$\mu_1(E) = \mu_2(E) < +\infty$. Clarify this observation by showing that under the
hypotheses of Theorem 5.4 the system $\mathcal{E}_f$ of all sets $E \in \mathcal{E}$ satisfying
$\mu_1(E) = \mu_2(E) < +\infty$ is likewise an $\cap$-stable generator of $\mathcal{A}$.

§6. Lebesgue-Borel measure and measures on the number line

We are going to pursue further the investigations in §4. So as before $\mathcal{I}^d$ will be
the set of all right half-open intervals in $\mathbb{R}^d$, $\mathcal{F}^d$ the ring of all $d$-dimensional
figures, and $\lambda^d$ the Lebesgue premeasure on $\mathcal{F}^d$. We have already noted that $\lambda^d$
is $\sigma$-finite. According to 5.6, $\lambda^d$ can be extended in exactly one way to a measure
on $\sigma(\mathcal{F}^d)$, which measure will also be denoted by $\lambda^d$ from now on. Since every
figure is a union of finitely many intervals $I \in \mathcal{I}^d$, we have
$$\sigma(\mathcal{F}^d) = \sigma(\mathcal{I}^d).$$
6.1 Definition. The elements of the $\sigma$-algebra generated in $\mathbb{R}^d$ by the system $\mathcal{I}^d$
of half-open intervals are called the Borel subsets of the space $\mathbb{R}^d$. Correspondingly
$\sigma(\mathcal{I}^d)$ is called the $\sigma$-algebra of Borel subsets of $\mathbb{R}^d$; it will henceforth be
denoted $\mathcal{B}^d$.

The results reviewed in the introduction can, following 4.3, be expressed thus:

6.2 Theorem. There is exactly one measure $\lambda^d$ on $\mathcal{B}^d$ which assigns to every
right half-open interval in $\mathbb{R}^d$ its $d$-dimensional elementary content.

6.3 Definition. The measure $\lambda^d$ in Theorem 6.2 is called the Lebesgue-Borel
measure (L-B measure, for short) on $\mathbb{R}^d$. For every Borel set $B \in \mathcal{B}^d$, $\lambda^d(B)$ will
also be called the $d$-dimensional Lebesgue measure of $B$.
It is expedient to expand this definition: For every set $C \in \mathcal{B}^d$ the trace
$\sigma$-algebra $C \cap \mathcal{B}^d$ consists of all Borel subsets of $C$ (cf. (1.4)). The restriction $\lambda^d_C$
of $\lambda^d$ to $C \cap \mathcal{B}^d$ is a measure. It will also be called the L-B measure on $C$.
Like the Lebesgue premeasure of which it is an extension, the L-B measure $\lambda^d$
is $\sigma$-finite (cf. Example 2 of §5). More generally

(6.2) $\lambda^d(B) < +\infty$

for every bounded set $B \in \mathcal{B}^d$, since such a $B$ lies in an interval in $\mathcal{I}^d$; e.g.,
excepting finitely many $n$, $B$ lies in each interval $I_n$ from Example 2, §5, with the
result that $\lambda^d(B) \le \lambda^d(I_n) < +\infty$.
Let us recall the question formulated in the introduction to Chapter I of finding
a unified method for assigning a numerical measure of $d$-dimensional volume to
as many subsets of $\mathbb{R}^d$ as possible. Step by step we will come to recognize that
Theorem 6.2 answers this question in a most satisfactory way: for every Borel
set $B$ in $\mathbb{R}^d$ its $d$-dimensional measure is the number we were seeking.
First of all it seems desirable to get a deeper insight into the $\sigma$-algebra $\mathcal{B}^d$ of
Borel sets. In particular, the question naturally comes up whether topologically
interesting sets, like the open, closed, or compact ones, are Borel. The characterization
of $\mathcal{B}^d$ via such sets in the next theorem is often taken as the definition of
the $\sigma$-algebra $\mathcal{B}^d$.

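6.4 Theorem. Let $\mathcal{O}^d$, $\mathcal{C}^d$, $\mathcal{K}^d$ denote the system of all open, closed, compact
subsets of $\mathbb{R}^d$, respectively. Then

(6.3) $\mathcal{B}^d = \sigma(\mathcal{O}^d) = \sigma(\mathcal{C}^d) = \sigma(\mathcal{K}^d)$.

Proof. $\mathcal{K}^d \subset \mathcal{C}^d \subset \sigma(\mathcal{C}^d)$, so $\sigma(\mathcal{K}^d) \subset \sigma(\mathcal{C}^d)$. Every set $C \in \mathcal{C}^d$ is the union of
a sequence of sets $C_n \in \mathcal{K}^d$; for example, if $K_n$ are the compact balls with a fixed
center and radii $n \in \mathbb{N}$, then the sets $C_n := C \cap K_n$ furnish such a sequence. Thus
by (1.3), $\mathcal{C}^d \subset \sigma(\mathcal{K}^d)$, whence $\sigma(\mathcal{C}^d) \subset \sigma(\mathcal{K}^d)$ and so finally the equality of
these two $\sigma$-algebras. Since the open sets are the complements of the closed ones,
the equality $\sigma(\mathcal{O}^d) = \sigma(\mathcal{C}^d)$ is obvious; therewith the last two equalities in (6.3)
are confirmed.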
We finish up by showing that $\sigma(\mathcal{O}^d) = \mathcal{B}^d$. We will, as usual, use the term
bounded open interval in $\mathbb{R}^d$ for every set of the form

(6.4) $]a, b[ := \{x \in \mathbb{R}^d : a \lhd x \lhd b\}$,

where $a, b \in \mathbb{R}^d$ satisfy $a \le b$. Every right half-open interval $[a, b[ \in \mathcal{I}^d$ is the
intersection of a sequence of bounded open intervals, namely, for
$a := (\alpha_1, \ldots, \alpha_d)$ and $a_n := (\alpha_1 - n^{-1}, \ldots, \alpha_d - n^{-1})$ $(n \in \mathbb{N})$
we have
$$]a_n, b[ \;\downarrow\; [a, b[ .$$
Therefore $\mathcal{I}^d \subset \sigma(\mathcal{O}^d)$ by (1.7) and consequently $\sigma(\mathcal{I}^d) \subset \sigma(\mathcal{O}^d)$. Every
open set in $\mathbb{R}^d$ can be exhibited as the union of countably many bounded open
intervals (e.g., all those which it contains whose endpoints have only rational
coordinates). Moreover, every bounded open interval $]a, b[$ is the union of a sequence
of intervals from $\mathcal{I}^d$, namely

(6.5) $[a_n, b[ \;\uparrow\; ]a, b[$

if we set

(6.5') $a_n := (\min\{\alpha_1 + n^{-1}, \beta_1\}, \ldots, \min\{\alpha_d + n^{-1}, \beta_d\})$ $(n \in \mathbb{N})$,

$\alpha_1, \ldots, \alpha_d$ and $\beta_1, \ldots, \beta_d$ being the coordinates of $a$ and $b$, respectively. Every
open set is therefore the union of a sequence of intervals from $\mathcal{I}^d$, and so
$\mathcal{O}^d \subset \sigma(\mathcal{I}^d) = \mathcal{B}^d$. It thus follows that $\sigma(\mathcal{O}^d) \subset \mathcal{B}^d$ and, as the reverse inclusion has
already been established, the equality $\mathcal{B}^d = \sigma(\mathcal{O}^d)$ is confirmed. □

We will become acquainted with some deeper properties of L-B measure in §8.
In particular, there the existence of non-Borel sets, that is, the assertion
$$\mathcal{B}^d \ne \mathcal{P}(\mathbb{R}^d)$$
will be proved. For the moment we content ourselves with computing the Lebesgue
measure $\lambda^d(B)$ of some geometrically simple Borel sets $B$.

Examples. 1. Every hyperplane $H$ orthogonal to one of the coordinate axes in $\mathbb{R}^d$
is an L-B-null set, i.e., a Borel set with $\lambda^d(H) = 0$. Let, say, $H$ be orthogonal to
the $i$th coordinate axis, $i \in \{1, \ldots, d\}$, that is, be of the form

(6.6) $H := \{x = (\xi_1, \ldots, \xi_d) \in \mathbb{R}^d : \xi_i = \alpha\}$

for an appropriate $\alpha \in \mathbb{R}$. $H$ is a closed set, and so is Borel. For each $n \in \mathbb{N}$,
let $x_n$, $y_n$ be those points in $\mathbb{R}^d$ whose coordinates are $-n$ or $n$, respectively,
at every index except $i$ and whose $i$th coordinates are $\alpha$ or $\alpha + 2^{-n}(2n)^{1-d}\varepsilon$,
respectively, where $\varepsilon > 0$. Evidently
$$H \subset \bigcup_{n \in \mathbb{N}} [x_n, y_n[$$
and
$$\lambda^d([x_n, y_n[) = 2^{-n}\varepsilon, \quad n \in \mathbb{N}.$$
From (3.10) we therefore get
$$\lambda^d(H) \le \sum_{n=1}^{\infty} \lambda^d([x_n, y_n[) = \varepsilon.$$
Since this is true for every $\varepsilon > 0$, $\lambda^d(H) = 0$ follows.
Due to the isotoneity of measures, we consequently also have $\lambda^d(B) = 0$ for
every Borel subset $B$ of such a hyperplane $H$.
2. Every countable subset of $\mathbb{R}^d$ is an L-B-null set. Because of the $\sigma$-additivity of
measures, it suffices to treat the case of one-point sets $\{x\} \subset \mathbb{R}^d$. Being a closed
set, it is Borel; moreover for an appropriate hyperplane $H$ of the form (6.6) we
have $\{x\} \subset H$.
3. For points $a, b \in \mathbb{R}^d$ with $a \le b$ consider besides the intervals $[a, b[$ and $]a, b[$
already defined, the compact interval
$$[a, b] := \{x \in \mathbb{R}^d : a \le x \le b\}$$
and, in contrast to $[a, b[$, the left half-open interval
$$]a, b] := \{x \in \mathbb{R}^d : a \lhd x \le b\}.$$
Then

(6.7) $\lambda^d([a, b[) = \lambda^d(]a, b[) = \lambda^d([a, b]) = \lambda^d(]a, b])$.

First of all the intervals $[a, b[$, $]a, b[$ and $[a, b]$ are Borel sets by Theorem 6.4. As in
its proof, we can show that

(6.8) $]a, b_n[ \;\downarrow\; ]a, b]$ and $[a, b_n[ \;\downarrow\; [a, b]$

for appropriate sequences $(b_n)$ in $\mathbb{R}^d$ converging to $b$. Again from 6.4 we then get
that $]a, b]$ is Borel. From (6.5) follows
$$\lambda^d(]a, b[) = \lim_{n \to \infty} \lambda^d([a_n, b[) = \lambda^d([a, b[),$$
the first equality using the continuity from below of a measure, and the second
using $\lim_{n \to \infty} a_n = a$ (from (6.5')) and the continuous dependence on $c$ and $d$ of the
elementary content of the interval $[c, d[$. Analogously, with the help of (6.8), we
conclude that
$$\lambda^d(]a, b]) = \lambda^d([a, b]),$$
this time citing the continuity of measures from above. Thus finally from the
inclusions $]a, b[ \subset\, ]a, b] \subset [a, b]$ the remaining equality in (6.7) follows.
The choice of right half-open intervals for the construction of $\lambda^d$ is now seen
to have been due solely to the fact that the ring $\mathcal{F}^d$ they generate is so simple to
describe.
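
The covering used in Example 1 can be checked with exact rational arithmetic. This is a small sketch with hypothetical helper names: it computes the elementary content of the $n$th covering interval (side length $2n$ in the $d-1$ directions other than $i$, and the thin side $2^{-n}(2n)^{1-d}\varepsilon$ in direction $i$) and the partial sums of these contents.

```python
from fractions import Fraction

def box_volume(n, d, eps):
    """Elementary content of the n-th covering interval [x_n, y_n[ from Example 1."""
    thin_side = Fraction(1, 2**n) * Fraction(1, (2 * n) ** (d - 1)) * eps
    return (2 * n) ** (d - 1) * thin_side      # the factors (2n)^(d-1) cancel

def total_volume(d, eps, terms):
    """Sum of the contents of the first `terms` covering intervals."""
    return sum(box_volume(n, d, eps) for n in range(1, terms + 1))

eps = Fraction(1, 100)
partial = total_volume(d=3, eps=eps, terms=30)
print(box_volume(5, 3, eps), partial < eps)
```

Each content equals $2^{-n}\varepsilon$ independently of $d$, so every partial sum stays below $\varepsilon$, which is the estimate used to conclude that the hyperplane is a null set.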
In a second step a large class of measures on the $\sigma$-algebra $\mathcal{B}^1$ of Borel subsets
of the line will now be presented. These are the Borel measures. In general for
$d \in \mathbb{N}$, a measure $\mu$ defined on $\mathcal{B}^d$ is called a Borel measure on $\mathbb{R}^d$ if
$$\mu(K) < +\infty \text{ for every compact } K \subset \mathbb{R}^d$$
or, equivalently, if $\mu(B) < +\infty$ for every bounded set $B \in \mathcal{B}^d$. L-B measure $\lambda^d$ is
such a measure, according to (6.2).
The point of departure for defining $\lambda^1$ is the determination of $\lambda^1([a, b[)$ for
intervals $[a, b[ \in \mathcal{I}^1$, namely as $b - a$. It suggests itself that this opening move
might be generalized as follows: One has a function $F : \mathbb{R} \to \mathbb{R}$ and asks for
conditions on it which guarantee the existence of a measure $\mu$ on $\mathcal{B}^1$ with the
property

(6.9) $\mu([a, b[) = F(b) - F(a)$ for all $a, b \in \mathbb{R}$ with $a \le b$.

Thanks to the uniqueness theorem 5.4 such a measure is already thereby, i.e.,
by its values on $\mathcal{I}^1$, uniquely specified. Since $\mu([a, b[) \ge 0$, (6.9) entails that the
function $F$ must be isotone. Moreover, $F$ has to be left-continuous. This is because
for every $x \in \mathbb{R}$ and every sequence $(x_n)$ in $\mathbb{R}$ with $x_n \uparrow x$, the corresponding
interval behavior is $[x_1, x_n[ \;\uparrow\; [x_1, x[$, and since $\mu$ must be continuous from below,
it follows that
$$\lim_{n \to \infty} F(x_n) - F(x_1) = \lim_{n \to \infty} \mu([x_1, x_n[) = \mu([x_1, x[) = F(x) - F(x_1),$$
that is, $\lim_{n \to \infty} F(x_n) = F(x)$, so $F$ is left-continuous at $x$.
Functions $F : \mathbb{R} \to \mathbb{R}$ which are isotone and left-continuous will be called
measure-generating (or measure-defining) functions (on $\mathbb{R}$). Of course, whenever $F$
is such a function, so is $aF + b$ for any $a \in \mathbb{R}_+$, $b \in \mathbb{R}$. The designation "measure-generating"
is justified by the next theorem, which answers completely the earlier
question of what are the appropriate conditions on $F$.

6.5 Theorem. To every measure-generating function $F$ on $\mathbb{R}$ there corresponds
exactly one measure $\mu_F$ on $\mathcal{B}^1$ having property (6.9), that is, satisfying
$$\mu_F([a, b[) = F(b) - F(a) \text{ for all } [a, b[ \in \mathcal{I}^1.$$
The measure $\mu_G$ determined by the measure-generating function $G$ satisfies $\mu_G = \mu_F$
if and only if $G = F + c$ for some constant $c \in \mathbb{R}$. Every $\mu_F$ is a Borel measure
on $\mathbb{R}$, and every Borel measure on $\mathbb{R}$ is a $\mu_F$ for an appropriate $F$.

Proof. The techniques employed in the proof of Theorem 4.3 can be repeated
to show that corresponding to $F$ there is a unique content $\mu$ on the ring $\mathcal{F}^1$
of 1-dimensional figures which has property (6.9). That part of the proof used
only the isotoneity of $F$. From the left-continuity of $F$ it follows that for every
$I = [a, b[ \in \mathcal{I}^1$ and every $\varepsilon > 0$ there is a $J = [a, c[ \in \mathcal{I}^1$ with $\bar{J} \subset I$ and
$$\mu(I) - \mu(J) = \mu([c, b[) = F(b) - F(c) < \varepsilon.$$
But then the technique employed in the proof of Theorem 4.4 shows that $\mu$ is
a $\sigma$-finite (as well as finite) premeasure on $\mathcal{F}^1$.
According to 5.6 it can be extended in exactly one way to a measure on $\mathcal{B}^1$.
This measure does what is wanted, that is, it is a $\mu_F$. Its uniqueness with respect to its
prescription on $\mathcal{I}^1$ via $F$ was settled in the deliberations preceding the present
theorem. From $\mu_F = \mu_G$ we get $G(b) - G(a) = F(b) - F(a)$ whenever $a \le b$. Upon
applying this with $a = 0 \le b$ as well as with $a \le 0 = b$, we learn that $G = F + c$,
with $c := G(0) - F(0)$. Every $\mu_F$ is a Borel measure, because every bounded
$B \in \mathcal{B}^1$ is contained in $[-n, n[$ for some $n \in \mathbb{N}$ and so
$\mu_F(B) \le \mu_F([-n, n[) = F(n) - F(-n) < +\infty$.
If conversely, $\mu$ is an arbitrary Borel measure on $\mathbb{R}$, we can define
$$F(x) := \begin{cases} \mu([0, x[) & \text{if } x \ge 0 \\ -\mu([x, 0[) & \text{if } x < 0 \end{cases}$$
and get a function on $\mathbb{R}$ having property (6.9) and therewith, in light of the
discussion preceding this proof, measure-generating. In fact, for real numbers
$0 \le a \le b$ the subtractivity (3.7) of measures entails that
$$\mu([a, b[) = \mu([0, b[ \,\setminus\, [0, a[) = F(b) - F(a),$$
and (6.9) is confirmed analogously when $a \le b \le 0$. In the remaining case $a < 0 < b$
we get (6.9) from $[a, b[ = [a, 0[ \,\cup\, [0, b[$ and the additivity of $\mu$. The uniqueness
already proved leads finally to the equality of $\mu$ with the measure $\mu_F$ derived
from $F$. □

Notice that L-B measure $\lambda^1$ has the form $\mu_F$, with $F$ the identity map $x \mapsto x$
on $\mathbb{R}$.
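
Property (6.9) and its extension to figures by finite additivity can be sketched numerically. The function names below are hypothetical; the two sample functions are the identity, which corresponds to L-B measure on the line, and a left-continuous unit step at 0, which is assumed here as an example of a measure-generating function concentrating all mass at the point 0.

```python
def mu_F(F, a, b):
    """Value F(b) - F(a) assigned to the right half-open interval [a, b[ (requires a <= b)."""
    if a > b:
        raise ValueError("need a <= b")
    return F(b) - F(a)

def mu_F_figure(F, intervals):
    """Content of a finite union of pairwise disjoint intervals [a, b[, by finite additivity."""
    return sum(mu_F(F, a, b) for a, b in intervals)

def identity(x):
    """Identity map: the interval [a, b[ receives its length b - a."""
    return x

def left_step(x):
    """Left-continuous unit step at 0: isotone, equal to 0 on ]-inf, 0] and 1 on ]0, +inf[."""
    return 1.0 if x > 0 else 0.0

print(mu_F(identity, 2, 5))                        # length of [2, 5[
print(mu_F(left_step, -1, 0), mu_F(left_step, 0, 1))
print(mu_F(lambda x: identity(x) + 7, 2, 5))       # F + c assigns the same values as F
```

The step function gives mass 0 to $[-1, 0[$ and mass 1 to $[0, 1[$, as expected for a unit mass sitting at 0, and adding a constant to $F$ does not change any interval value.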
Of special importance are the finite measures on $\mathcal{B}^1$. Every one is a Borel
measure on $\mathbb{R}$. Because $0 \le \mu(B) \le \mu(\mathbb{R}) < +\infty$ for all $B \in \mathcal{B}^1$, a finite Borel
measure $\mu$ on $\mathbb{R}$ is either the zero measure $\mu = 0$, or $0 < \mu(\mathbb{R}) < +\infty$ and
$\nu := \mu / \mu(\mathbb{R})$ is a measure on $\mathcal{B}^1$ with $\nu(\mathbb{R}) = 1$. Measures normalized this way play
a fundamental role in probability theory. This explains the following vocabulary:
A measure $\mu$ on a $\sigma$-algebra $\mathcal{A}$ in a set $\Omega$ is called a probability measure
(abbreviated to p-measure) if $\mu(\Omega) = 1$. Because of the isotoneity property every
p-measure satisfies

(6.10) $0 \le \mu(A) \le 1 = \mu(\Omega)$ for all $A \in \mathcal{A}$.

Consider now a p-measure $\mu$ on $\mathcal{B}^1$. The open interval $]-\infty, x[$ lies in $\mathcal{B}^1$ for each
$x \in \mathbb{R}$, so a real function $F_\mu$ with values in $[0, 1]$ is defined by

(6.11) $F_\mu(x) := \mu(]-\infty, x[)$ $(x \in \mathbb{R})$.

It is called the distribution function of $\mu$. For example, the distribution function of the
Dirac measure $\varepsilon_0$ equals 0 throughout $]-\infty, 0]$ and 1 throughout $]0, +\infty[$.
Since $]-\infty, b[ \,\setminus\, ]-\infty, a[ = [a, b[$ whenever $a \le b$,
$$\mu([a, b[) = F_\mu(b) - F_\mu(a) \text{ for all } [a, b[ \in \mathcal{I}^1.$$
Therefore (6.11) uniquely defines a measure-generating function, which obviously
satisfies

(6.12) $\mu_{F_\mu} = \mu$

in the notation introduced in Theorem 6.5. Among the infinitely many
measure-generating functions $F$ that satisfy $\mu_F = \mu$ for a given p-measure $\mu$ the distribution
function $F_\mu$ is characterized as follows:

6.6 Theorem. A real function $F$ on $\mathbb{R}$ is the distribution function of a (necessarily
uniquely determined) p-measure $\mu$ on $\mathcal{B}^1$ if and only if it is measure-generating
(that is, isotone and left-continuous) and satisfies

(6.13) $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to +\infty} F(x) = 1$.

Proof. The distribution function $F_\mu$ of a p-measure $\mu$ on $\mathcal{B}^1$ is always
measure-generating, as (6.12) shows. Properties (6.13) follow from the continuity at $\emptyset$
and the continuity from below of every finite measure, respectively, since for
sequences $(x_n)$ in $\mathbb{R}$ with $x_n \downarrow -\infty$, resp., $x_n \uparrow +\infty$ we have $]-\infty, x_n[ \;\downarrow\; \emptyset$, resp.,
$]-\infty, x_n[ \;\uparrow\; \mathbb{R}$.
If conversely $F$ is a measure-generating function satisfying (6.13), then according
to 6.5 $\mu_F$ is the only Borel measure on $\mathbb{R}$ with property (6.9), in particular, with
$\mu_F([-n, n[) = F(n) - F(-n)$ for all $n \in \mathbb{N}$. When $n \to +\infty$ here, the normalization
condition $\mu_F(\mathbb{R}) = 1$ follows from (6.13). Thus $\mu_F$ is a probability measure. $F$ is
then the distribution function of $\mu_F$, because for $x \in \mathbb{R}$ and all $n \in \mathbb{N} \cap [-x, +\infty[$
$$\mu_F([-n, x[) = F(x) - F(-n) \text{ and } [-n, x[ \;\uparrow\; ]-\infty, x[$$
so that
$$F(x) = \lim_{n \to \infty} \mu_F([-n, x[) + \lim_{n \to +\infty} F(-n) = \mu_F(]-\infty, x[) = F_{\mu_F}(x).$$
□

Via $\mu \mapsto F_\mu$ the set of p-measures on $\mathcal{B}^1$ is thus bijectively mapped onto the
set of measure-generating functions $F$ on $\mathbb{R}$ having property (6.13). This is the
significance of the preceding theorem.
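
For a p-measure concentrated on finitely many points, the distribution function (6.11) can be written down directly. The following is a sketch under that assumption; the atoms and masses are hypothetical. Because the interval $]-\infty, x[$ is open at $x$, a point mass at $x$ is not counted in $F_\mu(x)$, which is what makes the function left-continuous.

```python
def distribution_function(atoms):
    """Return F(x) = mu(]-inf, x[) for the p-measure with point masses `atoms` = {point: mass}."""
    total = sum(atoms.values())
    if abs(total - 1.0) > 1e-12:
        raise ValueError("masses must sum to 1")

    def F(x):
        # Only atoms strictly to the left of x lie in the open interval ]-inf, x[.
        return sum(mass for point, mass in atoms.items() if point < x)

    return F

# Hypothetical p-measure with atoms at 0, 1 and 3.
F = distribution_function({0.0: 0.5, 1.0: 0.25, 3.0: 0.25})
dirac = distribution_function({0.0: 1.0})      # Dirac measure at 0

print(F(0.0), F(0.5), F(1.0), F(3.5))
print(dirac(0.0), dirac(1e-9))
```

The values illustrate both conditions in (6.13) for this example: far to the left the function is 0, beyond the last atom it is 1, and at each atom the value is the mass strictly to the left of it.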

Remarks. 1. Measure-generating functions are also called "Stieltjes measure
functions". This is because, even before the invention of the measure concept,
T.J. STIELTJES (1856-1894) had used such functions to extend the ideas behind
the Riemann integral (cf. Remark 2 in §12).
2. Measure-generating functions (and distribution functions) also make sense
in $\mathbb{R}^d$. But they are difficult to deal with and that is not the least reason why they
are of less significance. A function $F : \mathbb{R}^d \to \mathbb{R}$ is called measure-generating if in
each of its $d$ variables $\xi_1, \ldots, \xi_d$, when the others are held fixed, it is left-continuous
and satisfies the additional condition
$$\Delta^{\beta_d}_{\alpha_d} \cdots \Delta^{\beta_1}_{\alpha_1} F \ge 0 \text{ for all } a, b \in \mathbb{R}^d \text{ with } a \le b.$$
Here $\alpha_k$, $\beta_k$ $(k = 1, \ldots, d)$ are the coordinates of $a$, $b$, resp., and $\Delta^{\beta_1}_{\alpha_1} F$ is the
function defined on $\mathbb{R}^{d-1}$ via
$$(\xi_2, \ldots, \xi_d) \mapsto F_2(\xi_2, \ldots, \xi_d) := F(\beta_1, \xi_2, \ldots, \xi_d) - F(\alpha_1, \xi_2, \ldots, \xi_d).$$
Then $\Delta^{\beta_2}_{\alpha_2} F_2 = \Delta^{\beta_2}_{\alpha_2} \Delta^{\beta_1}_{\alpha_1} F$ is defined and the further "difference
operators" $\Delta^{\beta_k}_{\alpha_k}$ are inductively brought into play. There is a theorem analogous
to 6.5: To every measure-generating function $F$ on $\mathbb{R}^d$ corresponds a unique Borel
measure $\mu_F$ on $\mathbb{R}^d$ which satisfies the iterated difference condition

(6.14) $\mu_F([a, b[) = \Delta^{\beta_d}_{\alpha_d} \cdots \Delta^{\beta_1}_{\alpha_1} F$ for all $[a, b[ \in \mathcal{I}^d$.

For $d = 1$ this reduces simply to (6.9'). As an example, for the function
$F_0(\xi_1, \ldots, \xi_d) := \xi_1 \cdots \xi_d$ we have
$$\Delta^{\beta_d}_{\alpha_d} \cdots \Delta^{\beta_1}_{\alpha_1} F_0 = (\beta_1 - \alpha_1) \cdots (\beta_d - \alpha_d) \text{ for } a, b \in \mathbb{R}^d \text{ with } a \le b.$$
This function is consequently measure-generating, and generates the L-B
measure $\lambda^d$ in the sense that $\mu_{F_0} = \lambda^d$. Details can be found in RICHTER [1966],
TUCKER [1967] and GNEDENKO [1988].
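
The iterated difference in (6.14) expands into an alternating sum over the $2^d$ corners of the interval, each corner choosing either $\alpha_k$ or $\beta_k$ in coordinate $k$. The sketch below uses hypothetical names and takes, as an assumption for illustration, the product of the coordinates as the sample function; for it the iterated difference equals the elementary content of $[a, b[$.

```python
from itertools import product
from math import prod

def iterated_difference(F, a, b):
    """Apply the difference operator in every coordinate: an alternating sum of F over the corners."""
    d = len(a)
    total = 0
    for choice in product((0, 1), repeat=d):           # 1 picks beta_k, 0 picks alpha_k
        corner = tuple(b[k] if choice[k] else a[k] for k in range(d))
        sign = (-1) ** (d - sum(choice))               # one factor -1 per alpha_k chosen
        total += sign * F(corner)
    return total

def coordinate_product(x):
    """Sample function: the product of the coordinates of x."""
    return prod(x)

a, b = (0, 1, 2), (2, 4, 3)
print(iterated_difference(coordinate_product, a, b))            # (2-0)*(4-1)*(3-2)
print(iterated_difference(lambda x: x[0] ** 3, (1,), (2,)))      # d = 1: F(b) - F(a)
```

For $d = 1$ the sum has just two terms and reduces to $F(b) - F(a)$, matching the one-dimensional case.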

Exercises.
1. Prove that a Borel set $B \in \mathcal{B}^d$ is an L-B-null set if and only if one of the
two following conditions (which are hence equivalent) is satisfied: (a) For every
$\varepsilon > 0$ there is a covering of $B$ by countably many open intervals $I_n \subset \mathbb{R}^d$ such
that $\sum_{n=1}^{\infty} \lambda^d(I_n) < \varepsilon$. (b) There is a covering of $B$ by countably many open
intervals $I_n$ such that $\sum_{n=1}^{\infty} \lambda^d(I_n) < +\infty$ and every point of $B$ lies in $I_n$ for infinitely
many $n$. Both characterizations remain valid if the $I_n$ are allowed to be half-open
or compact, instead of open. [Hint for (a): Utilize (5.1).]
2. Write $\mathbb{R}^d$ in the form $\mathbb{R}^d = \mathbb{R}^p \times \mathbb{R}^q$ with $p, q \in \mathbb{N}$, $p + q = d$, by grouping the
first $p$ coordinates of a point $x \in \mathbb{R}^d$ into a point in $\mathbb{R}^p$ and the last $q$ coordinates
into a point in $\mathbb{R}^q$. Denoting by 0 the zero of the vector space $\mathbb{R}^q$, show that for
a set $A \subset \mathbb{R}^p$, $A \times \{0\} \in \mathcal{B}^d$ precisely when $A \in \mathcal{B}^p$.
3. Let $\mu$ be a p-measure on $\mathcal{B}^1$ and $F_\mu$ its distribution function. Show that $F_\mu$ is
continuous at the point $x \in \mathbb{R}$ just if $\mu(\{x\}) = 0$.
4. Determine the p-measure on $\mathcal{B}^1$ which has $x \mapsto 0 \vee (x \wedge 1)$ as distribution
function, and answer anew the question in Exercise 1 of §4.
5. Show that every $\sigma$-finite measure $\mu$ on $\mathcal{B}^d$ can be represented in the form
$\mu = \sum_{n=1}^{\infty} \alpha_n \mu_n$, where for each $n \in \mathbb{N}$, $\alpha_n \in \mathbb{R}_+$ and $\mu_n$ is a p-measure on $\mathcal{B}^d$. The
supplemental condition that for every bounded set $B \in \mathcal{B}^d$, $\mu_n(B) \ne 0$ for only
finitely many $n \in \mathbb{N}$ can be imposed if and only if $\mu$ is a Borel measure.

§7. Measurable mappings and image measures

The following considerations can be more simply formulated if we introduce some
shorthand terminology. If $\Omega$ is a set and $\mathcal{A}$ a $\sigma$-algebra in $\Omega$, the pair $(\Omega, \mathcal{A})$ will
be called a measurable space and the sets in $\mathcal{A}$ measurable sets. If in addition
a measure $\mu$ is defined on the $\sigma$-algebra $\mathcal{A}$, then the triple $(\Omega, \mathcal{A}, \mu)$ arising from
the measurable space $(\Omega, \mathcal{A})$ is called a measure space (cf. Exercise 7 of §5). If $\mu$ is
a p-measure, the measure space $(\Omega, \mathcal{A}, \mu)$ is called a probability space (p-space for
short). Correspondingly, one speaks of a $\sigma$-finite measure space $(\Omega, \mathcal{A}, \mu)$ if the
measure $\mu$ is $\sigma$-finite.
The measurable space $(\mathbb{R}^d, \mathcal{B}^d)$ will henceforth be called the $d$-dimensional
Borel measurable space. The measure space $(\mathbb{R}^d, \mathcal{B}^d, \lambda^d)$ will correspondingly be
called the $d$-dimensional Lebesgue-Borel measure space (abbreviated to L-B measure
space).
The concept measurable space exhibits a formal analogy to that of topological
space. For a topological space is also a pair, consisting of a set and a system of its
subsets, namely, the open ones. In the sense of this analogy the next concept, that
of a measurable mapping, corresponds to the concept of continuity in topology.

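7.1 Definition. Let $(\Omega, \mathcal{A})$ and $(\Omega', \mathcal{A}')$ be measurable spaces, and $T : \Omega \to \Omega'$
a mapping of $\Omega$ into $\Omega'$. $T$ is called $\mathcal{A}$-$\mathcal{A}'$-measurable if

(7.1) $T^{-1}(A') \in \mathcal{A}$ for every $A' \in \mathcal{A}'$.

We express the $\mathcal{A}$-$\mathcal{A}'$-measurability of $T$ symbolically by
$$T : (\Omega, \mathcal{A}) \to (\Omega', \mathcal{A}')$$
and speak of a measurable mapping of the first measurable space into the second.
Using the notation introduced in (1.5), (7.1) can be written as

(7.1') $T^{-1}(\mathcal{A}') \subset \mathcal{A}$.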

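Examples. 1. Every constant mapping $T : \Omega \to \Omega'$ is $\mathcal{A}$-$\mathcal{A}'$-measurable.
2. Every continuous mapping $T : \mathbb{R}^d \to \mathbb{R}^{d'}$ $(d, d' \in \mathbb{N})$ is $\mathcal{B}^d$-$\mathcal{B}^{d'}$-measurable,
briefly put, Borel measurable. According to 6.4 the system $\mathcal{O}^{d'}$ of all open subsets
of $\mathbb{R}^{d'}$ is a generator of $\mathcal{B}^{d'}$. Because of the continuity of $T$, $T^{-1}(O') \in \mathcal{O}^d \subset \mathcal{B}^d$
for every $O' \in \mathcal{O}^{d'}$. The asserted measurability of $T$ therefore follows from the next
theorem.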

7.2 Theorem. Let $(\Omega, \mathcal{A})$ and $(\Omega', \mathcal{A}')$ be measurable spaces; further, let $\mathcal{E}'$ be
a generator of $\mathcal{A}'$. A mapping $T : \Omega \to \Omega'$ is measurable just if

(7.2) $T^{-1}(E') \in \mathcal{A}$ for every $E' \in \mathcal{E}'$.

Proof. The system $\mathcal{Q}'$ of all sets $Q' \in \mathcal{P}(\Omega')$ for which $T^{-1}(Q') \in \mathcal{A}$ is a $\sigma$-algebra
in $\Omega'$. Consequently, $\mathcal{A}' \subset \mathcal{Q}'$ holds just if $\mathcal{E}' \subset \mathcal{Q}'$ does. $\mathcal{A}' \subset \mathcal{Q}'$ is equivalent
to the measurability of $T$, while $\mathcal{E}' \subset \mathcal{Q}'$ is equivalent to (7.2). □
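
On finite sets the content of Theorem 7.2 can be verified by enumeration: testing pre-images of the generator gives the same verdict as testing pre-images of every set in the generated $\sigma$-algebra. The following sketch uses hypothetical spaces, generators and mappings chosen only for illustration.

```python
from itertools import combinations

def generated_sigma_algebra(omega, generator):
    """Close `generator` under complement and union; on a finite set this yields sigma(generator)."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generator}
    while True:
        new_sets = {omega - a for a in family}
        new_sets |= {a | b for a, b in combinations(family, 2)}
        if new_sets <= family:
            return family
        family |= new_sets

def preimage(T, domain, target_set):
    """Pre-image of `target_set` under the mapping T, given as a dict on `domain`."""
    return frozenset(w for w in domain if T[w] in target_set)

def is_measurable(T, domain, source_family, target_sets):
    """Check that the pre-image of every set in `target_sets` lies in `source_family`."""
    return all(preimage(T, domain, s) in source_family for s in target_sets)

omega = {1, 2, 3, 4, 5, 6}
A = generated_sigma_algebra(omega, [{1, 2}, {3, 4}])       # atoms {1,2}, {3,4}, {5,6}
omega_prime = {"a", "b", "c"}
generator_prime = [{"a"}, {"b"}]
A_prime = generated_sigma_algebra(omega_prime, generator_prime)

T = {1: "a", 2: "a", 3: "b", 4: "b", 5: "c", 6: "c"}       # constant on the atoms of A
S = {1: "a", 2: "b", 3: "b", 4: "b", 5: "b", 6: "b"}       # splits the atom {1,2}

for name, mapping in (("T", T), ("S", S)):
    via_generator = is_measurable(mapping, omega, A, generator_prime)
    via_full = is_measurable(mapping, omega, A, A_prime)
    print(name, via_generator, via_full)
```

In both cases the check against the two generating sets agrees with the check against all eight sets of the target $\sigma$-algebra.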

Concerning the composition of measurable mappings, what the earlier analogy
with topology suggests does indeed hold:

7.3 Theorem. If $T_1 : (\Omega_1, \mathcal{A}_1) \to (\Omega_2, \mathcal{A}_2)$ and $T_2 : (\Omega_2, \mathcal{A}_2) \to (\Omega_3, \mathcal{A}_3)$ are
measurable mappings, then the composite mapping $T_2 \circ T_1$ is $\mathcal{A}_1$-$\mathcal{A}_3$-measurable.

Proof. The claim follows from the validity of the equation
$(T_2 \circ T_1)^{-1}(A) = T_1^{-1}(T_2^{-1}(A))$ for all $A \in \mathcal{P}(\Omega_3)$, in particular, from its validity for all $A \in \mathcal{A}_3$. □

Next consider a family of measurable spaces $((\Omega_i, \mathcal{A}_i))_{i \in I}$ and a family $(T_i)_{i \in I}$ of
mappings $T_i : \Omega \to \Omega_i$ of some fixed set $\Omega$ into the individual sets $\Omega_i$. Obviously the
$\sigma$-algebra in $\Omega$ generated by $\bigcup_{i \in I} T_i^{-1}(\mathcal{A}_i)$ is the smallest $\sigma$-algebra $\mathcal{A}$ with respect
to which every $T_i$ is $\mathcal{A}$-$\mathcal{A}_i$-measurable. We designate this $\sigma$-algebra $\sigma(T_i : i \in I)$,
that is, we define

(7.3) $\sigma(T_i : i \in I) := \sigma\bigl(\bigcup_{i \in I} T_i^{-1}(\mathcal{A}_i)\bigr)$

and call it the $\sigma$-algebra generated by the mappings $T_i$ (and the measurable spaces
$(\Omega_i, \mathcal{A}_i)$). In the case of the finite index set $I = \{1, \ldots, n\}$, we also use the notation
$\sigma(T_1, \ldots, T_n)$.
For $n = 1$ we clearly have $\sigma(T_1) = T_1^{-1}(\mathcal{A}_1)$. If therefore a $\sigma$-algebra $\mathcal{A}$ in
a set $\Omega$ is given, then a mapping $T_1 : \Omega \to \Omega_1$ being $\mathcal{A}$-$\mathcal{A}_1$-measurable is equivalent
to

(7.4) $\sigma(T_1) \subset \mathcal{A}$.

Cf. (7.1').
As a further application of 7.2 we will demonstrate:

7.4 Theorem. Let $(T_i)_{i \in I}$ be a family of mappings $T_i : \Omega \to \Omega_i$ of a set $\Omega$ into
measurable spaces $(\Omega_i, \mathcal{A}_i)$. Further, let $S : \Omega_0 \to \Omega$ be a mapping of a measurable
space $(\Omega_0, \mathcal{A}_0)$ into $\Omega$. The mapping $S$ is then $\mathcal{A}_0$-$\sigma(T_i : i \in I)$-measurable if and
only if each mapping $T_i \circ S$ $(i \in I)$ is $\mathcal{A}_0$-$\mathcal{A}_i$-measurable.

Proof. According to Theorem 7.3 the condition is necessary. The following
considerations show that it is also sufficient. By (7.3) the system
$$\mathcal{E} := \bigcup_{i \in I} T_i^{-1}(\mathcal{A}_i)$$
is a generator of $\sigma(T_i : i \in I)$. Each set $E \in \mathcal{E}$ has the form $E = T_i^{-1}(A_i)$ for some
$i \in I$, $A_i \in \mathcal{A}_i$. Thus $S^{-1}(E) = (T_i \circ S)^{-1}(A_i) \in \mathcal{A}_0$ because of the hypothesized
measurability of $T_i \circ S$. From 7.2 therefore, $S$ is $\mathcal{A}_0$-$\sigma(T_i : i \in I)$-measurable. □
Finally, with the aid of measurable mappings, measures can be mapped:

7.5 Theorem. Let $T : (\Omega, \mathcal{A}) \to (\Omega', \mathcal{A}')$ be a measurable mapping. Then for
every measure $\mu$ on $\mathcal{A}$,

(7.5) $A' \mapsto \mu(T^{-1}(A'))$

defines a measure $\mu'$ on $\mathcal{A}'$.

Proof. We only have to observe that for every sequence $(A'_n)_{n \in \mathbb{N}}$ of pairwise disjoint
sets from $\mathcal{A}'$, $(T^{-1}(A'_n))_{n \in \mathbb{N}}$ is a sequence of pairwise disjoint sets from $\mathcal{A}$, and
that
$$T^{-1}\Bigl(\bigcup_{n \in \mathbb{N}} A'_n\Bigr) = \bigcup_{n \in \mathbb{N}} T^{-1}(A'_n).$$
□

7.6 Definition. In the situation described in 7.5, the measure $\mu'$ is called the
image of $\mu$ under the mapping $T$ and is denoted by $T(\mu)$.
Thus according to this definition

(7.5') $T(\mu)(A') := \mu(T^{-1}(A'))$ for all $A' \in \mathcal{A}'$.

The formation of image measures is transitive, that is,

(7.6) $(T_2 \circ T_1)(\mu) = T_2(T_1(\mu))$,

whenever we are in the situation of 7.3 and $\mu$ is a measure on $\mathcal{A}_1$: For every $A \in \mathcal{A}_3$,
$T := T_2 \circ T_1$ satisfies $T^{-1}(A) = T_1^{-1}(T_2^{-1}(A))$, and $T_2^{-1}(A) \in \mathcal{A}_2$. Therefore, setting
$\mu' := T_1(\mu)$, $\mu'' := T_2(\mu')$ for short, it follows that
$$T(\mu)(A) = \mu(T_1^{-1}(T_2^{-1}(A))) = \mu'(T_2^{-1}(A)) = \mu''(A)$$
for all $A \in \mathcal{A}_3$, showing that $T(\mu) = \mu''$ and confirming (7.6).
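
For a measure carried by finitely many points, the image measure (7.5') amounts to collecting the mass of each point at its image, and the transitivity (7.6) can be checked directly. The sketch below is only an illustration: the uniform measure on six points and the two mappings are hypothetical choices.

```python
from collections import defaultdict
from fractions import Fraction

def image_measure(T, mu):
    """Image of the discrete measure `mu` = {point: mass} under T: mass of y is mu(T^{-1}({y}))."""
    image = defaultdict(Fraction)
    for point, mass in mu.items():
        image[T(point)] += mass
    return dict(image)

# Hypothetical example: uniform p-measure on {1, ..., 6}.
mu = {w: Fraction(1, 6) for w in range(1, 7)}

def T1(w):
    """Remainder modulo 3."""
    return w % 3

def T2(r):
    """Indicator of the remainder being zero."""
    return r == 0

step_by_step = image_measure(T2, image_measure(T1, mu))
composed = image_measure(lambda w: T2(T1(w)), mu)
print(step_by_step == composed, composed)
```

Both routes assign mass $1/3$ to `True` and $2/3$ to `False`, and the total mass 1 is preserved.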

Examples. 3. Let $(\Omega, \mathcal{A}) = (\Omega', \mathcal{A}') := (\mathbb{R}^d, \mathcal{B}^d)$ be the $d$-dimensional Borel
measurable space and $\mu := \lambda^d$ the associated L-B measure. For every point $a \in \mathbb{R}^d$,
the translation mapping $T_a : \mathbb{R}^d \to \mathbb{R}^d$ is defined by
$$T_a(x) := a + x \quad (x \in \mathbb{R}^d).$$
It is continuous and so (Example 2) measurable. We inquire into the image measure
$\lambda' := T_a(\lambda^d)$.
The mapping $T_a$ is bijective, and $T_a^{-1} = T_{-a}$. So for every interval $[b, c[ \in \mathcal{I}^d$,
$T_a^{-1}([b, c[) = [b - a, c - a[$, whence $\lambda'([b, c[) = \lambda^d([b - a, c - a[) = \lambda^d([b, c[)$.
Both measures $\lambda^d$ and $\lambda'$ thus assign to every interval from $\mathcal{I}^d$ its $d$-dimensional
elementary content. According to 6.2 therefore $\lambda^d = \lambda'$, that is,

(7.7) $T_a(\lambda^d) = \lambda^d$ for every $a \in \mathbb{R}^d$.

This property of $\lambda^d$ is called its translation-invariance. If we set, as is customary,

(7.8) $a + A = A + a := T_a(A) = \{a + x : x \in A\}$

for sets $A \in \mathcal{P}(\mathbb{R}^d)$ and points $a \in \mathbb{R}^d$, then $T_a(\lambda^d)(A) = \lambda^d(-a + A)$ for arbitrary
$A \in \mathcal{B}^d$. Property (7.7) can therefore also be expressed as

(7.7') $\lambda^d(a + A) = \lambda^d(A)$ for all $A \in \mathcal{B}^d$, $a \in \mathbb{R}^d$.

4. In the context of Example 3, each non-zero real number $\alpha$ and each
$i \in \{1, \ldots, d\}$ determine a continuous, hence Borel measurable, linear mapping $D^{(i)}_\alpha$
which assigns to the point $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$ the image point $x' \in \mathbb{R}^d$ having
coordinates $x'_i := \alpha x_i$ and $x'_j = x_j$ for all $j \ne i$, a dilation of $x$. It satisfies

(7.9) $D^{(i)}_\alpha(\lambda^d) = |\alpha|^{-1} \lambda^d$.

For, every open interval $]a, b[ \subset \mathbb{R}^d$ has $D^{(i)}_\alpha$-pre-image equal to $]a', b'[$, where the
coordinates of $a'$, $b'$ except the $i$th are those of $a$, $b$, the $i$th being $\alpha^{-1}$ times those
of $a$, $b$ if $\alpha > 0$, and $\alpha^{-1}$ times those of $b$, $a$ if $\alpha < 0$. Hence
$$\lambda^d\bigl((D^{(i)}_\alpha)^{-1}(]a, b[)\bigr) = |\alpha|^{-1} \lambda^d(]a, b[).$$
$D^{(i)}_\alpha(\lambda^d)$ and $|\alpha|^{-1} \lambda^d$ are therefore measures on $\mathcal{B}^d$ which coincide on all bounded
open intervals. Thanks to 6.4 such intervals constitute a generator of $\mathcal{B}^d$, which
obviously has with respect to each of these measures all the properties of the
generator $\mathcal{E}$ in the uniqueness Theorem 5.4. From that theorem (7.9) therefore
follows.
5. If we set $H_r := D_r^{(1)} \circ \dots \circ D_r^{(d)}$ for real $r \ne 0$, we obtain the linear mapping
$H_r(x) = rx$ ($x \in \mathbb{R}^d$), called a homothety. Because of the transitivity of image
measures, it follows from (7.9) that
(7.10) $H_r(\lambda^d) = |r|^{-d} \lambda^d$.

For $r = -1$ we get $H_{-1}(\lambda^d) = \lambda^d$. Because $H_{-1}$ is reflection through the origin,
this property is called the reflection-invariance of $\lambda^d$.
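
Formula (7.10) can be checked by hand on intervals, where everything reduces to elementary content. The sketch below is an illustration only; the helper names `volume` and `homothety_preimage` are assumptions made here. It works with closed corner data, which is harmless because the half-open or open nature of the interval does not affect its content.

```python
from fractions import Fraction

def volume(a, b):
    """Elementary content of the interval with lower corner a and upper corner b."""
    v = Fraction(1)
    for lo, hi in zip(a, b):
        v *= max(Fraction(0), Fraction(hi) - Fraction(lo))
    return v

def homothety_preimage(r, a, b):
    """Corners of the pre-image under H_r (r != 0) of the interval with corners a, b.

    For r < 0 the roles of the two corners are exchanged in every coordinate,
    so the coordinates are sorted again.
    """
    r = Fraction(r)
    lo = [min(x / r, y / r) for x, y in zip(a, b)]
    hi = [max(x / r, y / r) for x, y in zip(a, b)]
    return lo, hi

a, b = (0, 1, -2), (2, 4, 1)           # an interval in R^3 of content 18
pre = homothety_preimage(-3, a, b)     # its pre-image under H_{-3}
ratio = volume(*pre) / volume(a, b)    # should equal |r|^{-d} = 3^{-3}
```

Here `ratio` comes out as 1/27, and with `r = -1` the content is unchanged, which is the reflection-invariance on intervals.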

Exercises.
1. For $\Omega := \mathbb{R}$, let $(\Omega, \mathscr{A}, \mu)$ be the measure space of Example 2, §3. For $\Omega' := \{0, 1\}$
and $\mathscr{A}' := \mathscr{P}(\Omega')$ define the mapping $T : \Omega \to \Omega'$ by $T(\omega) := 0$ if $\omega$ is rational,
$T(\omega) := 1$ if $\omega$ is irrational. Show that $T$ is $\mathscr{A}$-$\mathscr{A}'$-measurable and determine the
image measure $T(\mu)$.
2. Show that for any sets $\Omega$, $\Omega'$, any mapping $T : \Omega \to \Omega'$, and any system of sets
$\mathscr{E}' \subset \mathscr{P}(\Omega')$, $T^{-1}(\sigma(\mathscr{E}')) = \sigma(T^{-1}(\mathscr{E}'))$.
3. Let $K$ be a compact subset of $\mathbb{R}^d$ with the property that the intersection
$H_r(K) \cap H_{r'}(K)$ of every two homothetic images of $K$ with $0 < r < r' < 1$ is an L-B-null
set. (This property is enjoyed by every sphere $S_\alpha(0)$ of radius $\alpha > 0$ and center
$0 := (0, \dots, 0)$, that is, the set of $x \in \mathbb{R}^d$ having euclidean distance $\alpha$ from $0$.) Show
that $\lambda^d(K) = 0$. [Hint: For all $r \in \,]0, 1]$, $H_r(K) \subset \tilde{K} := \{tx : 0 \le t \le 1, x \in K\}$,
which is a compact set. Hence $\lambda^d(\tilde{K}) < +\infty$.]
4. Let $\mathbb{T} := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$ denote the unit circle, that is, the sphere
$S_1(0)$ in $\mathbb{R}^2$. Prove the existence of a finite non-zero measure $\nu$ on the $\sigma$-algebra
$\mathscr{B}(\mathbb{T}) := \mathbb{T} \cap \mathscr{B}^2$ which is invariant under all rotations of $\mathbb{T}$. [Hint: Take for $\nu$ an
image of $\lambda^1_C$ for an appropriate interval $C \subset \mathbb{R}$.]

§8. Mapping properties of the Lebesgue-Borel measure

L-B measure $\lambda^d$ on $\mathscr{B}^d$ is, as was shown in Example 3 of the preceding section,
translation-invariant. Of the greatest significance is the fact that $\lambda^d$ is uniquely
determined by this invariance property, together with a simple normalization. For
the $d$-dimensional unit cube, defined by
(8.1) $W := [\mathbf{0}, \mathbf{1}[$,
where $\mathbf{0} = (0, \dots, 0) \in \mathbb{R}^d$ and $\mathbf{1} := (1, \dots, 1) \in \mathbb{R}^d$, 6.2 insures that
(8.2) $\lambda^d(W) = 1$.
Along with $\lambda^d$ each non-negative multiple $\alpha\lambda^d$ ($\alpha \in \mathbb{R}_+$) of it is a translation-invariant
measure $\mu$ on $\mathscr{B}^d$, which satisfies $\mu(W) = \alpha < +\infty$. The following
converse of this also holds, and contains the aforementioned characterization of $\lambda^d$
as a special case.

8.1 Theorem. Every measure $\mu$ on $\mathscr{B}^d$ which is translation-invariant, i.e., satisfies
$T_a(\mu) = \mu$ for every translation $x \mapsto T_a(x) := a + x$ of $\mathbb{R}^d$, and which assigns
finite measure
(8.3) $\alpha := \mu(W) < +\infty$
to the unit cube $W$, has the form
(8.4) $\mu = \alpha\lambda^d$.

Proof. Let $a_n := n^{-1}\mathbf{1}$, the point in $\mathbb{R}^d$ all of whose coordinates are $1/n$. Then
$W_n := [\mathbf{0}, a_n[$ is a cube with
$\mu(W_n) = \alpha/n^d$.
In fact: The interval $[0, 1[ \in \mathscr{I}^1$ is the union of the pairwise disjoint intervals
$[\frac{\nu}{n}, \frac{\nu + 1}{n}[$ with $\nu = 0, 1, \dots, n - 1$. If therefore $G_n$ denotes the set of points
$(\rho_1, \dots, \rho_d) \in \mathbb{R}^d$ whose coordinates all come from the set $\{\nu/n : \nu = 0, \dots, n - 1\}$,
then
$$W = \bigcup_{r \in G_n} [r, r + a_n[,$$
a union of $n^d$ pairwise disjoint intervals. Because $[r, r + a_n[ = T_r([\mathbf{0}, a_n[) = T_r(W_n)$
and because of the translation-invariance of $\mu$, it follows from this representation
of $W$ that $\alpha = n^d \mu(W_n)$.
A repetition of these considerations will show that
$\mu([a, b[) = \alpha\lambda^d([a, b[)$
holds for every interval $[a, b[ \in \mathscr{I}^d$ in which the points $a, b$ have only rational
coordinates. Obviously in proving this we can assume that $a \lhd b$, and due to
the translation-invariance of both measures we can further assume that $a = \mathbf{0}$.
Then $b = (m_1/n, \dots, m_d/n)$ for appropriate $m_1, \dots, m_d, n \in \mathbb{N}$, and therefore
$[\mathbf{0}, b[$ is the union of the $m_1 \cdots m_d$ pairwise disjoint intervals $[r, r + a_n[$ with
$r = (\rho_1/n, \dots, \rho_d/n)$ and $\rho_i \in \{0, \dots, m_i - 1\}$ for each $i$. As before, this yields
$m_1 \cdots m_d \, \mu(W_n) = \mu([\mathbf{0}, b[)$, hence
$$\mu([\mathbf{0}, b[) = \alpha \, \frac{m_1 \cdots m_d}{n^d} = \alpha\lambda^d([\mathbf{0}, b[).$$
Now the set $\mathscr{I}^d_{\mathrm{rat}}$ of all intervals $[a, b[ \in \mathscr{I}^d$ for which $a, b$ have only rational
coordinates is an $\cap$-stable system. The technique used in the proof of Theorem 6.4
shows that $\mathscr{I}^d_{\mathrm{rat}}$ is, just like $\mathscr{I}^d$, a generator of $\mathscr{B}^d$. Because the measures $\mu$ and $\alpha\lambda^d$
coincide on $\mathscr{I}^d_{\mathrm{rat}}$ and for $\mathbf{n} := (n, \dots, n)$ with $n \in \mathbb{N}$ the intervals $[-\mathbf{n}, \mathbf{n}[$ lie in $\mathscr{I}^d_{\mathrm{rat}}$
and increase to $\mathbb{R}^d$, our claim (8.4) follows from the uniqueness theorem 5.4. □
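
The counting step of this proof lends itself to a small computation. The following sketch enumerates the lower corners of the cubes $[r, r + a_n[$ that tile a rational box and adds up their measures; the function names and the sample values of `alpha`, `m`, `n` are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import product

def grid_cubes(m, n):
    """Lower corners r of the cubes [r, r + a_n[ tiling [0, b[ for b = (m_1/n, ..., m_d/n)."""
    return [tuple(Fraction(p, n) for p in rho)
            for rho in product(*(range(mi) for mi in m))]

def mu_of_rational_box(alpha, m, n):
    """mu([0, b[) obtained by summing mu(W_n) = alpha / n^d over the tiling cubes.

    Translation-invariance is what allows every cube to be given the same measure.
    """
    d = len(m)
    mu_Wn = Fraction(alpha) / n**d
    return len(grid_cubes(m, n)) * mu_Wn

alpha, m, n = 5, (3, 2), 4                      # b = (3/4, 2/4) in R^2
lhs = mu_of_rational_box(alpha, m, n)           # 6 cubes, each of measure 5/16
rhs = alpha * Fraction(3, 4) * Fraction(2, 4)   # alpha times the content of [0, b[
```

Taking `m = (n, ..., n)` recovers the decomposition of the unit cube $W$ into $n^d$ cubes.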

For $\alpha = 1$ we immediately get from 8.1:

8.2 Corollary. Lebesgue-Borel measure $\lambda^d$ is the only translation-invariant measure
$\mu$ on $\mathscr{B}^d$ which satisfies
(8.2') $\mu(W) = 1$.

This corollary says that $\lambda^d$ is, in the theory of locally-compact groups, a Haar
measure on the additive group Rd. That theory provides an analogous non-zero
invariant measure on every locally compact abelian group G; it is unique to within
a positive scalar factor and is called Haar measure on G. The reader interested in
its theory should consult NACHBIN [1965]. (Cf. also Exercise 4 of §7 and Exercise 8
of §17.)
The conclusion of the theorem and its corollary remain valid if in the normalization
(8.2) and (8.2') the unit cube $W$ is replaced by its open interior $]\mathbf{0}, \mathbf{1}[$ or
its compact closure $[\mathbf{0}, \mathbf{1}]$. This is immediate from (6.7). However, if $\mu(W) = +\infty$
is allowed, $\mu$ need not be a multiple of $\lambda^d$. See HENLE and WAGON [1983].

Example. 1. Besides the $D_\alpha^{(i)}$ of Example 4, §7 there is another basic class of linear
mappings in $\mathbb{R}^d$, those that skew one coordinate by means of another. Specifically,
for each $i, k \in \{1, \dots, d\}$ with $i \ne k$ we define
$S^{(i,k)}(x_1, \dots, x_d) := (x_1, \dots, x_{i-1}, x_i + x_k, x_{i+1}, \dots, x_d)$.
Evidently this mapping is continuous. It is also invertible, with inverse
$D_{-1}^{(k)} \circ S^{(i,k)} \circ D_{-1}^{(k)}$, as is easily seen. In view of Example 2 in §7, the image measure $S^{(i,k)}(\lambda^d)$
may therefore be formed. We want to show that $\lambda^d$ is also invariant under these
mappings, that is, that
(8.5) $S^{(i,k)}(\lambda^d) = \lambda^d$ for all $i, k \in \{1, \dots, d\}$ with $i \ne k$.

Fix such a pair $(i, k)$ and write simply $S$ for $S^{(i,k)}$. Since this is a linear mapping,
$S(\lambda^d)$ is a translation-invariant measure on $\mathscr{B}^d$, so (8.5) will follow from 8.2 if we
succeed in showing that $S(\lambda^d)(W) = 1$, that is, $\lambda^d(S^{-1}(W)) = 1$. In view of (7.9)
and the equality $S^{-1} = D_{-1}^{(k)} \circ S \circ D_{-1}^{(k)}$, it suffices to show instead that

(8.5') $\lambda^d(S(W)) = 1$.

Let $a$ denote the vector in $\mathbb{R}^d$ whose only non-zero coordinate is the $i$th one, it
being $-1$. Introduce
$W' := \{(x_1, \dots, x_d) : 0 \le x_j < 1 \text{ for } j \ne i,\ 0 \le x_i < x_k\}$ and
$W'' := \{(x_1, \dots, x_d) : 0 \le x_j < 1 \text{ for } j \ne i,\ 1 + x_k \le x_i < 2\}$.
Notice that
$T_a(W'') = \{(x_1, \dots, x_d) : 0 \le x_j < 1 \text{ for } j \ne i,\ x_k \le x_i < 1\}$.
Clearly

(8.6) $W = W' \cup T_a(W'')$ disjointly.

$W'$ is the intersection of $W$ with the open set $\{(x_1, \dots, x_d) \in \mathbb{R}^d : x_i < x_k\}$, so
$W'$ is a Borel set. Similarly $T_a(W'')$ is the intersection of $W$ with the closed set
$\{(x_1, \dots, x_d) \in \mathbb{R}^d : x_k \le x_i\}$, so it is a Borel set. Thus $W''$, its preimage under $T_a$,
is also a Borel set. Since $S$ is a homeomorphism, $S(W)$ is a Borel set. Next notice
that
(8.7) $W'$, $W''$ and $S(W)$ are pairwise disjoint.

For the conditions on the $i$th coordinate that define each set in (8.7) are obviously
incompatible with those that define the other two sets. Moreover,

(8.8) $W' \cup W'' \cup S(W) = D_2^{(i)}(W)$.

Here the inclusion "$\subset$" is obvious from the coordinate inequalities defining the
sets. A typical point $x$ of $D_2^{(i)}(W)$ has $j$th coordinate $x_j \in [0, 1[$ if $j \ne i$ and $i$th
coordinate $t \in [0, 2[ = [0, x_k[ \cup [x_k, 1 + x_k[ \cup [1 + x_k, 2[$. If $t$ lies in the first (third)
interval, then $x \in W'$ ($x \in W''$). Otherwise, $x_i := t - x_k \in [0, 1[$, and

$x = (x_1, \dots, x_{i-1}, t, x_{i+1}, \dots, x_d) = (x_1, \dots, x_{i-1}, x_i + x_k, x_{i+1}, \dots, x_d) \in S(W)$.

This confirms (8.8). Combining all that we have learned gives the desired (8.5') as
follows:
$$\begin{aligned}
2 = 2\lambda^d(W) &= \lambda^d(D_2^{(i)}(W)) && \text{by (7.9)} \\
&= \lambda^d(W') + \lambda^d(W'') + \lambda^d(S(W)) && \text{by (8.7) and (8.8)} \\
&= \lambda^d(W') + \lambda^d(T_a(W'')) + \lambda^d(S(W)) && \text{by (7.7)} \\
&= \lambda^d(W) + \lambda^d(S(W)) && \text{by (8.6)} \\
&= 1 + \lambda^d(S(W)) && \text{by (8.2).}
\end{aligned}$$
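
In dimension $d = 2$ with $i = 1$, $k = 2$ the sets $W'$, $W''$ and $S(W)$ are two triangles and a parallelogram, so the bookkeeping above can be checked with elementary polygon areas. This is a sketch for that special case only; the shoelace helper and the vertex lists are assumptions introduced for the illustration, and boundaries are ignored because they do not affect area.

```python
from fractions import Fraction

def shoelace_area(vertices):
    """Area of a simple polygon whose vertices are listed in order."""
    twice = 0
    n = len(vertices)
    for j in range(n):
        x0, y0 = vertices[j]
        x1, y1 = vertices[(j + 1) % n]
        twice += x0 * y1 - x1 * y0
    return abs(Fraction(twice, 2))

def S(p):
    """The shear S^(1,2): (x1, x2) -> (x1 + x2, x2)."""
    return (p[0] + p[1], p[1])

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]       # closure of W
S_W = [S(p) for p in unit_square]                    # parallelogram S(W)
W_prime = [(0, 0), (1, 1), (0, 1)]                   # 0 <= x1 < x2 < 1
W_double_prime = [(1, 0), (2, 0), (2, 1)]            # 1 + x2 <= x1 < 2

total = (shoelace_area(W_prime) + shoelace_area(W_double_prime)
         + shoelace_area(S_W))                       # area of D_2^(1)(W) = [0,2[ x [0,1[
```

The three areas are 1/2, 1/2 and 1, so `total` equals 2 and the area of $S(W)$ is 1, as (8.5') asserts.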

One usually thinks of the space $\mathbb{R}^d$ as equipped with the euclidean scalar-product
$$\langle x, y \rangle := \sum_{i=1}^{d} \xi_i \eta_i$$
and the euclidean metric derived from it by
$$\rho(x, y) := \langle x - y, x - y \rangle^{1/2},$$
where $x := (\xi_1, \dots, \xi_d)$, $y := (\eta_1, \dots, \eta_d)$. Every mapping $T : \mathbb{R}^d \to \mathbb{R}^d$ which
leaves this metric invariant, that is, satisfies
(8.9) $\rho(T(x), T(y)) = \rho(x, y)$ for all $x, y \in \mathbb{R}^d$,
is called a motion (or an isometry) in $\mathbb{R}^d$. It is obviously continuous, hence Borel
measurable.
Suppose in addition that $T$ fixes $0$, that is,
$T(0) = 0$.
Using the linearity of $\langle \cdot, \cdot \rangle$ in each of its positions, we get
$$\begin{aligned}
\rho^2(T(x), T(y)) &= \langle T(x) - T(y), T(x) - T(y) \rangle \\
&= \langle T(x), T(x) \rangle - 2 \langle T(x), T(y) \rangle + \langle T(y), T(y) \rangle \\
&= \rho^2(T(x), T(0)) - 2 \langle T(x), T(y) \rangle + \rho^2(T(y), T(0)),
\end{aligned}$$
so that in view of (8.9)
$\rho^2(x, y) = \rho^2(x, 0) - 2 \langle T(x), T(y) \rangle + \rho^2(y, 0)$.
Replacing $T$ with the identity mapping here shows that (8.9) may be supplemented
with
(8.9') $\langle T(x), T(y) \rangle = \langle x, y \rangle$ for all $x, y \in \mathbb{R}^d$ if $T(0) = 0$.
Consider $\lambda \in \mathbb{R}$, $x, y \in \mathbb{R}^d$ and again suppose that $T(0) = 0$. Using the linearity
properties of $\langle \cdot, \cdot \rangle$ once more we expand
(*) $\langle \lambda T(x) + T(y) - T(\lambda x + y),\ \lambda T(x) + T(y) - T(\lambda x + y) \rangle$
into a linear combination of expressions $\langle T(a), T(b) \rangle$ with $a, b \in \{x, y, \lambda x + y\}$.
Equation (8.9') allows $T$ to be replaced by the identity mapping in every such
expression. Upon doing so and re-assembling the terms, we get back a single expression
like (*) but with the identity mapping in place of $T$. That is, we get $0$. In
other words,
$\lambda T(x) + T(y) - T(\lambda x + y) = 0$,
$T(\lambda x + y) = \lambda T(x) + T(y)$,
holding for all $\lambda \in \mathbb{R}$, $x, y \in \mathbb{R}^d$. This says that $T$ is a linear mapping. It is
immediate from (8.9) that $T$ is then injective. The dimension of $T(\mathbb{R}^d) \subset \mathbb{R}^d$ is
therefore $d$, so $T(\mathbb{R}^d) = \mathbb{R}^d$, and $T$ is surjective. A motion $T$ that is also a linear
mapping, and the preceding deliberations show that this is equivalent to $T(0) = 0$,
is called an orthogonal transformation.
If $T$ is any motion and we set $a := T(0)$, then the mapping $U := T - a =
T_{-a} \circ T$ is a motion that fixes $0$. Therefore by the above, every motion $T$ is
a composite $T_a \circ U$ of a translation and an orthogonal transformation, and is
consequently a bijection of $\mathbb{R}^d$. From this and (8.9) it is clear that the mapping
inverse to a motion is itself a motion, and that the set of motions is a group under
composition, the motion group $\mathrm{Mot}(\mathbb{R}^d)$ of $\mathbb{R}^d$.
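
The decomposition of a motion into a translation and an orthogonal transformation can be followed on a concrete example. The sketch below uses an assumed sample motion of $\mathbb{R}^2$ (a rotation with rational entries followed by a shift), so that all checks are exact; the names `C`, `shift`, `T`, `U` are illustrative.

```python
from fractions import Fraction

# A rotation matrix with rational entries (columns are orthonormal) and a shift.
C = ((Fraction(3, 5), Fraction(-4, 5)),
     (Fraction(4, 5), Fraction(3, 5)))
shift = (Fraction(2), Fraction(-1))

def T(x):
    """Sample motion of R^2: apply C, then translate by `shift`."""
    return tuple(sum(C[i][j] * x[j] for j in range(2)) + shift[i] for i in range(2))

def dist2(x, y):
    """Squared euclidean distance rho^2(x, y)."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

a = T((0, 0))                      # a := T(0)

def U(x):
    """U := T_{-a} o T, a motion that fixes 0."""
    return tuple(ti - ai for ti, ai in zip(T(x), a))

x, y, lam = (1, 2), (-3, 5), Fraction(7, 2)
combo = tuple(lam * xi + yi for xi, yi in zip(x, y))               # lam * x + y
linear_side = tuple(lam * ui + vi for ui, vi in zip(U(x), U(y)))   # lam * U(x) + U(y)
```

For this sample, `dist2(T(x), T(y))` equals `dist2(x, y)`, `U` fixes the origin, and `U(combo)` agrees with `linear_side`, matching the linearity derived in the text.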

The translation-invariance of $\lambda^d$ derived in 8.1 not only characterizes L-B measure
but renders excellent service in the derivation of further invariance properties.
We begin with the motion-invariance of $\lambda^d$, that is, with the proof that
(8.10) $T(\lambda^d) = \lambda^d$ for all $T \in \mathrm{Mot}(\mathbb{R}^d)$.
The reflection-invariance treated in Example 5 of §7 is contained in this as a special
case.

8.3 Theorem. Lebesgue-Borel measure $\lambda^d$ is motion-invariant.

Proof. Let a motion $T$ of $\mathbb{R}^d$, about which we initially assume that $T(0) = 0$, be
given. Thus $T$ is an orthogonal, linear transformation. Via the following considerations,
we will quickly convince ourselves that $T(\lambda^d)$ is a translation-invariant
measure on $\mathscr{B}^d$: Denoting as before by $T_c$ the translation $x \mapsto x + c$, for each
$c \in \mathbb{R}^d$, we consider any $a \in \mathbb{R}^d$, set $b := T^{-1}(a)$, and observe that
(8.11) $T_a \circ T = T \circ T_b$.
For every $x \in \mathbb{R}^d$, $T_a \circ T(x) = T(x) + a = T(x) + T(b) = T(x + b) = T \circ T_b(x)$,
confirming (8.11). From this and the translation-invariance of $\lambda^d$ we get $T_a(T(\lambda^d)) =
T(T_b(\lambda^d)) = T(\lambda^d)$. As $a \in \mathbb{R}^d$ is arbitrary, this says that $\mu := T(\lambda^d)$ is a
translation-invariant measure on $\mathscr{B}^d$. For the unit cube $W = [\mathbf{0}, \mathbf{1}[$ we have $\alpha := \mu(W) =
\lambda^d(T^{-1}(W)) < +\infty$ by (6.2), since $T^{-1}$ is an isometry and therefore along with $W$
the set $T^{-1}(W)$ is also bounded. Now Theorem 8.1 comes into action and guarantees
that $T(\lambda^d) = \mu = \alpha\lambda^d$ holds. So what remains is to see that $\alpha = 1$. To
this end we look at the compact ball $K := \{x \in \mathbb{R}^d : \rho(0, x) \le 1\}$ of radius 1 and
center $0$. Since $T$ and $T^{-1}$ are orthogonal transformations, they fix $0$ and leave
distances invariant (8.9). Hence $T^{-1}(K) = K$, and from $T(\lambda^d) = \alpha\lambda^d$ follows
$\lambda^d(K) = \lambda^d(T^{-1}(K)) = T(\lambda^d)(K) = \alpha\lambda^d(K)$.
From this follows the desired $\alpha = 1$, because on the one hand $\lambda^d(K) < +\infty$
by (6.2) and on the other hand $\lambda^d(K) > 0$ because $K$ contains a non-empty
interval $I \in \mathscr{I}^d$, namely $I := [-t, t[$ with $t := (d^{-1/2}, \dots, d^{-1/2})$. [In Exercise 6
of §23 we will compute $\lambda^d(K)$ explicitly.]
To handle the case of an arbitrary motion $T$, set $c := T(0)$ and $S := T_{-c} \circ T$,
getting a motion that fixes $0$, for which $\lambda^d = S(\lambda^d)$ by what was first proved. It
follows finally from transitivity and $T = T_c \circ S$ that $T(\lambda^d) = T_c(S(\lambda^d)) = T_c(\lambda^d) =
\lambda^d$. Thus the theorem is proved. □

Since with every motion $T$ of $\mathbb{R}^d$ its inverse $T^{-1}$ is also one, the motion-invariance
can also be recorded in the following form: For every motion $T$ of $\mathbb{R}^d$
and every Borel set $A \in \mathscr{B}^d$
(8.12) $\lambda^d(T(A)) = \lambda^d(A)$.
In this form Theorem 8.3 just says that any two congruent Borel sets in $\mathbb{R}^d$ have
the same $d$-dimensional Lebesgue measure. This however is the measure-theoretic
formulation and refinement of the elementary geometric principle (A) enunciated
in the introduction to the chapter. Via it L-B measure is seen in the final analysis
to be a concept from euclidean geometry.

Examples. 2. Every hyperplane $H \subset \mathbb{R}^d$ is an L-B-nullset. This follows from
Example 1 of §6 and the fact that there is a motion $T$ which transforms a hyperplane
of the kind considered in that example, say the hyperplane with equation $\xi_d = 0$,
into $H$.
3. Every closed or open box (meaning a parallelepiped with pairwise orthogonal
edges) $Q \subset \mathbb{R}^d$ whose edge-lengths are $l_1, \dots, l_d$ has Lebesgue measure $\lambda^d(Q) =
l_1 \cdots l_d$. This follows analogously from Example 3 of §6.

The behavior of $\lambda^d$ with respect to linear transformations of the vector space $\mathbb{R}^d$
into itself, and then too with respect to arbitrary affine mappings, can also be
clarified using a slight modification of the preceding method of proof.
Linear mappings $T : \mathbb{R}^d \to \mathbb{R}^d$ are just those that with respect to the canonical
basis in $\mathbb{R}^d$ (or indeed any basis) can be represented in the form $T(x) = Cx$, with
$C$ a $d \times d$ matrix and $x \in \mathbb{R}^d$ interpreted as a column vector. The determinant
of $T$, in symbols $\det T$, is by definition that of $C$ (and is independent of the choice
of basis).
We will restrict ourselves to the case where $T$ is non-singular, that is, $\det T \ne 0$,
and consequently bijective. These are elements of the group $GL(d, \mathbb{R})$ known as the
general linear group. The mappings $T \in GL(d, \mathbb{R})$ with $\det T = 1$ form a subgroup
of $GL(d, \mathbb{R})$, the special linear group $SL(d, \mathbb{R})$. It is in fact the commutator subgroup
of $GL(d, \mathbb{R})$ and this fact is used by DIEROLF and SCHMIDT [1998] to give an
alternative proof of our next theorem. (The behavior of $\lambda^d$ with respect to linear
mappings $T$ with $\det T = 0$ is elucidated in Exercise 2 below.)

8.4 Theorem. Every $T \in GL(d, \mathbb{R})$ satisfies
(8.13) $T(\lambda^d) = \dfrac{1}{|\det T|} \lambda^d$
or, equivalently
(8.14) $\lambda^d(T(A)) = |\det T| \, \lambda^d(A)$
for all $A \in \mathscr{B}^d$.

Proof. Consider first the elementary mappings $D_\alpha^{(i)}$ defined in Example 4 of §7
and $S^{(i,k)}$ defined in Example 1 of this section. It is obvious that $\det D_\alpha^{(i)} = \alpha$
and $\det S^{(i,k)} = 1$. Therefore (7.9) and (8.5) confirm (8.13) in the special case that
$T$ lies in the set
(8.15) $\{D_\alpha^{(i)}, S^{(i,k)} : \alpha \in \mathbb{R} \setminus \{0\},\ i, k \in \{1, \dots, d\},\ i \ne k\}$.
Since $\det$ is a homomorphism of $GL(d, \mathbb{R})$ into the multiplicative group $\mathbb{R}^\times := \mathbb{R} \setminus
\{0\}$, it follows from this and (7.6) that in fact (8.13) holds for all $T$ in the subgroup
of $GL(d, \mathbb{R})$ generated by the set (8.15). The proof of Theorem 8.4 is therefore
completed by recalling the key fact, proved in every linear algebra text, that this
subgroup is the whole group $GL(d, \mathbb{R})$. Actually what is usually proved (see, e.g.,
BIRKHOFF and MACLANE [1965], p. 217) is that $GL(d, \mathbb{R})$ is generated by (8.15)
together with the transformations $J^{(i,k)}$ which send every vector $x = (x_1, \dots, x_d)$
to the vector whose coordinates are those of $x$ but with $x_i$ and $x_k$ interchanged.
However, every such transformation is already in the group generated by (8.15), for
it is routine to confirm, by watching the behavior of the $i$th and the $k$th coordinates
at each step, that for $i \ne k$
$$J^{(i,k)} = D_{-1}^{(i)} \circ S^{(k,i)} \circ D_{-1}^{(k)} \circ S^{(i,k)} \circ D_{-1}^{(k)} \circ S^{(k,i)}. \qquad \Box$$
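
The routine confirmation mentioned at the end of the proof can be delegated to a few lines of matrix arithmetic. The sketch below checks one way of writing the coordinate swap as a product of three shears and three reflections $D_{-1}$, namely $D_{-1}^{(i)} \circ S^{(k,i)} \circ D_{-1}^{(k)} \circ S^{(i,k)} \circ D_{-1}^{(k)} \circ S^{(k,i)}$; the helper names are hypothetical, and indices are 0-based in the code.

```python
def identity(d):
    return [[int(r == c) for c in range(d)] for r in range(d)]

def dilation(d, i, alpha):
    """Matrix of D_alpha^(i): multiplies coordinate i by alpha."""
    M = identity(d)
    M[i][i] = alpha
    return M

def shear(d, i, k):
    """Matrix of S^(i,k): adds coordinate k to coordinate i."""
    M = identity(d)
    M[i][k] = 1
    return M

def swap(d, i, k):
    """Matrix of J^(i,k): interchanges coordinates i and k."""
    M = identity(d)
    M[i][i] = M[k][k] = 0
    M[i][k] = M[k][i] = 1
    return M

def matmul(A, B):
    d = len(A)
    return [[sum(A[r][t] * B[t][c] for t in range(d)) for c in range(d)]
            for r in range(d)]

def compose(*maps):
    """Matrix of maps[0] o maps[1] o ... (the rightmost map is applied first)."""
    out = maps[0]
    for M in maps[1:]:
        out = matmul(out, M)
    return out

d, i, k = 3, 0, 2
product_of_generators = compose(
    dilation(d, i, -1), shear(d, k, i), dilation(d, k, -1),
    shear(d, i, k), dilation(d, k, -1), shear(d, k, i),
)
```

`product_of_generators` coincides with `swap(d, i, k)`. Since the three reflections contribute a factor $-1$ each and the shears a factor $1$, the determinant of the swap is $-1$, consistent with $|\det J^{(i,k)}| = 1$.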

Theorems 8.3 and 8.4 taken together confirm an elementary fact from linear
algebra, namely that det T = ±1 for every orthogonal transformation T. And this
means that 8.3 is contained in the following immediate consequence of 8.4:

8.5 Corollary. The L-B measure $\lambda^d$ is invariant under all transformations $T \in
GL(d, \mathbb{R})$ with $|\det T| = 1$, in particular, under mappings $T \in SL(d, \mathbb{R})$.
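
For a triangle in the plane, (8.14) and Corollary 8.5 reduce to an identity between determinants that is easy to test. The following is a minimal sketch with assumed sample matrices; `det2`, `apply` and `triangle_area` are names chosen for this illustration.

```python
from fractions import Fraction

def det2(C):
    return C[0][0] * C[1][1] - C[0][1] * C[1][0]

def apply(C, x):
    """Image of the point x under the linear map with matrix C."""
    return (C[0][0] * x[0] + C[0][1] * x[1], C[1][0] * x[0] + C[1][1] * x[1])

def triangle_area(p, q, r):
    """Area of the triangle with corners p, q, r (half the absolute determinant of its edges)."""
    edges = ((q[0] - p[0], r[0] - p[0]), (q[1] - p[1], r[1] - p[1]))
    return abs(Fraction(det2(edges), 2))

triangle = [(0, 0), (4, 0), (1, 3)]          # area 6
stretch = ((3, 1), (0, -2))                  # det = -6
unimodular = ((2, 1), (1, 1))                # det = 1, an element of SL(2, R)

area = triangle_area(*triangle)
area_stretched = triangle_area(*[apply(stretch, p) for p in triangle])
area_unimodular = triangle_area(*[apply(unimodular, p) for p in triangle])
```

The stretched triangle has area $|{-6}| \cdot 6 = 36$, while the unimodular map leaves the area at 6.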

Remark. 1. As is known from the differential calculus in $\mathbb{R}^d$, a $C^1$-diffeomorphism
$\varphi : G \to G'$ between two open subsets $G$ and $G'$ of $\mathbb{R}^d$ is approximable near each
point $x \in G$ by a mapping $T_x \in GL(d, \mathbb{R})$, namely by the derivative $T_x := D\varphi(x)$. It
should now not come as a surprise that the following transformation, involving the
density concept from 17.2, relates the image of L-B measure on $G$ to L-B measure
on $G'$:
(8.16) $\varphi(\lambda^d_G) = \dfrac{1}{|\det D\varphi| \circ \varphi^{-1}} \, \lambda^d_{G'}$
or equivalently
(8.16') $\varphi^{-1}(\lambda^d_{G'}) = |\det D\varphi| \, \lambda^d_G$.
We will not go into this any further, but refer the reader to the textbook literature,
e.g., STROMBERG [1981], or to VARBERG [1971].
We will conclude the chapter by proving the existence of non-Borel subsets
of Rd. A different approach is indicated in the prologue to Theorem 26.6.

8.6 Theorem. For every dimension $d \in \mathbb{N}$, $\mathscr{B}^d \ne \mathscr{P}(\mathbb{R}^d)$.

Proof. Let $\mathbb{Q}^d$ denote the set of points in $\mathbb{R}^d$ each of whose $d$ coordinates is rational.
This is a subgroup of the additive group $\mathbb{R}^d$, so congruence $x \sim y$ of points $x, y \in \mathbb{R}^d$
modulo $\mathbb{Q}^d$ is an equivalence relation; it is defined by $x \sim y$ if and only if $x - y \in \mathbb{Q}^d$.
The space $\mathbb{R}^d$ decomposes into disjoint equivalence classes, each a set $x + \mathbb{Q}^d$ with
$x \in \mathbb{R}^d$, the statement $x \sim y$ being equivalent to the equality $x + \mathbb{Q}^d = y + \mathbb{Q}^d$.
Since to every real number $\eta$ corresponds an integer $n$ such that $n \le \eta < n + 1$,
that is, such that $\eta - n \in [0, 1[$, every equivalence class contains a point $x \in [\mathbf{0}, \mathbf{1}[$.
Consequently, there is a set $K \subset [\mathbf{0}, \mathbf{1}[$ which contains exactly one element from
each equivalence class. (On the role of the Axiom of Choice from set theory in this
existence claim see SOLOVAY [1970] and HALMOS [1974].) We have then

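(8.17) $\mathbb{R}^d = \bigcup_{k \in K} (k + \mathbb{Q}^d) = \bigcup_{y \in \mathbb{Q}^d} (y + K)$

and
(8.18) $y_1, y_2 \in \mathbb{Q}^d,\ y_1 \ne y_2 \implies (y_1 + K) \cap (y_2 + K) = \emptyset$.
(Otherwise there are $k, k' \in K$ with $y_1 + k = y_2 + k'$, that is, with $k \sim k'$, which
by definition of $K$ means that $k = k'$ and consequently also $y_1 = y_2$.) Let us now
suppose that $K \in \mathscr{B}^d$. Since $\mathbb{Q}$ and therewith $\mathbb{Q}^d$ is countable, it follows from
(8.17), (8.18) and the $\sigma$-additivity of $\lambda^d$ that
(8.19) $\sum_{y \in \mathbb{Q}^d} \lambda^d(y + K) = \lambda^d(\mathbb{R}^d) = +\infty$.

Translation-invariance of $\lambda^d$ says that
(8.20) $\lambda^d(y + K) = \lambda^d(K)$ for all $y \in \mathbb{Q}^d$
and so in view of (8.19), $\lambda^d(K) > 0$. Now $K \subset [\mathbf{0}, \mathbf{1}[$, and so
$$\bigcup_{y \in [\mathbf{0}, \mathbf{1}[ \, \cap \, \mathbb{Q}^d} (y + K) \subset [\mathbf{0}, \mathbf{2}[,$$
$\mathbf{2}$ being the point in $\mathbb{R}^d$ each of whose coordinates equals 2. From this fact and
(8.18) follows, again via $\sigma$-additivity of $\lambda^d$, that
$$\sum_{y \in [\mathbf{0}, \mathbf{1}[ \, \cap \, \mathbb{Q}^d} \lambda^d(y + K) \le \lambda^d([\mathbf{0}, \mathbf{2}[) = 2^d < +\infty.$$
But then (8.20) means that we must have $\lambda^d(K) = 0$, contradicting (8.19). The
assumption $K \in \mathscr{B}^d$ is what led to this contradiction, so we conclude that $K$ is,
after all, not a Borel set. □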

The following remarks serve to round out the foregoing and to provide a glimpse
of some closely related issues.

Remarks. 2. The "content-problem" in $\mathbb{R}^d$ described in the introduction to this
chapter, namely the problem of determining a $d$-dimensional volume for as large
a class of subsets of $\mathbb{R}^d$ as possible, is quite satisfactorily solved by the Lebesgue-Borel
measure $\lambda^d$, especially in view of its motion-invariance. More satisfactory
still for a reader without preconceived notions would be a proof of the existence
of a measure $\mu$ on the whole power set $\mathscr{P}(\mathbb{R}^d)$ which assigns mass 1 to the unit
cube $[\mathbf{0}, \mathbf{1}[$ and is invariant under all motions of $\mathbb{R}^d$. According to Corollary 8.2 such
a $\mu$ would have to be an extension of the Lebesgue-Borel measure $\lambda^d$. But it was
shown by F. Hausdorff (1868-1942), cf. HAUSDORFF [1914], pp. 401-402, that no
such measure $\mu$ on $\mathscr{P}(\mathbb{R}^d)$ exists, for any dimension $d \ge 1$; in fact, there is not even
a $\sigma$-additive, motion-invariant content $\eta \ne 0$ on the ring of all bounded subsets
of $\mathbb{R}^d$. For this result the reader is referred to the exposition in AUMANN [1969],
pp. 275-276, which further exploits the ideas in the preceding proof of Theorem 8.6
and will consequently be mathematically accessible to him.
HAUSDORFF [1914], p. 469 further showed that the content-problem for bounded
subsets of $\mathbb{R}^d$ does not even have a solution if the motion-invariant content $\eta \ne 0$
is only required to be finitely additive, and $d \ge 3$. The reader is referred for this to
the presentation by STROMBERG [1979] or WAGON [1985] (in each of which a central
role is played by the so-called Banach-Tarski paradox, discovered in 1924)
and to the subsequent investigations of VON NEUMANN [1929] that introduced the
idea of amenable groups.
S. Banach (1892-1945), cf. BANACH [1923], discovered that the finitely-additive
content-problem mentioned above has a solution in dimensions $d = 1$ and $d = 2$.
But such an $\eta$ is not uniquely determined by the normalization $\eta([\mathbf{0}, \mathbf{1}[) = 1$ and
for this very reason its further study has not seemed worthwhile. A remarkable
generalization of Banach's result will be found on pp. 242-245 of HEWITT and
ROSS [1979].

3. If $(\Omega, \mathscr{A}, \mu)$ is a measure space and $A$ a $\mu$-null set, then indeed because of
isotoneity every subset of $A$ that belongs to $\mathscr{A}$ is itself $\mu$-null; nevertheless, not
every subset of $A$ need belong to $\mathscr{A}$. This phenomenon even occurs with the L-B measure,
as the second part of Remark 4 will show. If $\mathscr{A}$ contains all subsets of
each $\mu$-null set, then $\mu$ is called a complete measure.
Exercise 7 of §5 describes how an arbitrary measure $\mu$ can be extended in
a natural way to a complete measure by passage to its so-called completion $\mu_0$.
The completion of the Lebesgue-Borel measure in $\mathbb{R}^d$ is called Lebesgue measure
in $\mathbb{R}^d$; the sets in the $\sigma$-algebra on which it is defined are called Lebesgue measurable
and those of them having measure zero are called Lebesgue-null sets in $\mathbb{R}^d$.
In passing from Borel sets to Lebesgue measurable sets the important property
of the former that they are determined only by the topology of $\mathbb{R}^d$ is lost. Because
$\mathscr{B}^d$ is the defining $\sigma$-algebra for so many other important measures (for $d = 1$
Theorem 6.5 already attests to this), we will not dwell in detail on the transition
from Lebesgue-Borel to Lebesgue measure; only the former will be employed in
the sequel.
4. There exists a Borel set $B \in \mathscr{B}^2$ whose image $\pi_1(B)$ under the first projection
map $\pi_1 : \mathbb{R}^2 \to \mathbb{R}$ (which sends every point $(x_1, x_2) \in \mathbb{R}^2$ to its first coordinate $x_1$)
is not a Borel subset of $\mathbb{R}$. A proof of this will be found in SRIVASTAVA [1998],
p. 130. Such a $B$ can even be found which is a $G_\delta$-set, that is, the intersection of
countably many open subsets of $\mathbb{R}^2$; see p. 36 of CHRISTENSEN [1974]. In particular,
the continuous image of a Borel set need not be a Borel set. The system of all sets
$\pi_1(B)$ with $B \in \mathscr{B}^2$ comprises rather the so-called Souslin or analytic subsets of $\mathbb{R}$.
See SRIVASTAVA [1998] and CHOQUET [1969].
For any non-Borel set $A \subset \mathbb{R}$, Exercise 2 of §6 shows that $A \times \{0\}$ is a non-Borel
subset of the $\lambda^2$-null set $\mathbb{R} \times \{0\}$.
5. Examples 4 and 5 of §7 as well as Theorem 8.4 illustrate that the L-B measure
$\lambda^d$ is not invariant with respect to all homeomorphisms $T : \mathbb{R}^d \to \mathbb{R}^d$ of $\mathbb{R}^d$
with itself. For such a homeomorphism $T$ however, $\mu := T(\lambda^d)$ is always a measure
on $\mathscr{B}^d$ with the following properties: (i) $\mu(K) < +\infty$ for every compact
$K \subset \mathbb{R}^d$; (ii) $\mu(\{x\}) = 0$ for every $x \in \mathbb{R}^d$; (iii) $\mu(U) > 0$ for every non-empty open
$U \subset \mathbb{R}^d$; (iv) $\mu(\mathbb{R}^d) = +\infty$. OXTOBY and ULAM [1941] showed that, conversely,
every measure $\mu$ on $\mathscr{B}^d$ enjoying properties (i)-(iv) has the form $\mu = T(\lambda^d)$ for
some homeomorphism $T : \mathbb{R}^d \to \mathbb{R}^d$. A simpler treatment of their result was later
provided by GOFFMAN and PEDRICK [1975].

Exercises.
1. Let $T : (\Omega, \mathscr{A}) \to (\Omega', \mathscr{A}')$ be a measurable mapping, $\mu$ a measure on the $\sigma$-algebra
$\mathscr{A}$, and $\mu' := T(\mu)$ its image under this mapping. $(\Omega, \mathscr{A}_0, \mu_0)$ and $(\Omega', \mathscr{A}'_0, \mu'_0)$
will denote the completions of these measure spaces (Exercise 7, §5). Show that the
mapping $T$ is also $\mathscr{A}_0$-$\mathscr{A}'_0$-measurable and that $T(\mu_0) = \mu'_0$. From this it follows
that Lebesgue measure in $\mathbb{R}^d$ is also motion-invariant.
2. Let $T$ be a linear mapping of $\mathbb{R}^d$ into itself with $\det T = 0$. Show that for every
$A \in \mathscr{B}^d$, $T(A)$, although it may fail to be a Borel set (as noted in Remark 4), is
at least a Lebesgue-null set, thus a subset of an L-B-null set, namely the linear
subspace $T(\mathbb{R}^d)$ of $\mathbb{R}^d$. In this sense equality (8.14) retains its validity for linear
transformations $T : \mathbb{R}^d \to \mathbb{R}^d$ with $\det T = 0$, i.e., (8.14) is valid for every linear
transformation $T$ of $\mathbb{R}^d$ into itself.
3. Show that the set $K$ constructed in the proof of Theorem 8.6 is not even
Lebesgue measurable.
4. In the section entitled "Fallacies, Flaws and Flimflam", p. 39, vol. 22, no. 1
(1991) of the College Mathematics Journal the following short "proof" of Theorem 8.6
is offered: Suppose that $\lambda^1(X)$ is defined for every subset $X$ of $[0, 1]$. By
isotoneity it is a number in $[0, 1]$. Consider the set $B$ defined as $\{\lambda^1(X) : X \in
\mathscr{P}([0, 1]),\ \lambda^1(X) \notin X\}$. It is a subset of $[0, 1]$ and upon testing the number $\lambda^1(B)$
for membership in $B$ we find that the statements $\lambda^1(B) \in B$ and $\lambda^1(B) \notin B$ are
equivalent, a contradiction. What is the error in this reasoning, or is it perhaps
a legitimate proof of Theorem 8.6?
Chapter II
Integration Theory

A measure space $(\Omega, \mathscr{A}, \mu)$ is given. We pose the problem of assigning to each
function on $\Omega$ from as large a class as possible an integral, that is, a "mean value"
constructed with respect to $\mu$. After the introduction in §9 of the property of
measurability, which is fundamental, this problem will be resolved step by step
in §10-§12. The later sections of this chapter are devoted to erecting the theory
and exploring the applications of the integration procedure thus defined.

§9. Measurable numerical functions

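On the number line $\mathbb{R}$ we have defined the $\sigma$-algebra $\mathscr{B}^1$ of Borel sets. If we
compactify $\mathbb{R}$ to $\bar{\mathbb{R}}$ in the customary way by adjoining the "ideal" points $-\infty$
and $+\infty$, the sets $A \subset \bar{\mathbb{R}}$ for which $A \cap \mathbb{R} \in \mathscr{B}^1$ are called Borel in $\bar{\mathbb{R}}$. Constructively,
the Borel sets in $\bar{\mathbb{R}}$ are precisely all sets $B$, $B \cup \{-\infty\}$, $B \cup \{+\infty\}$, $B \cup \{-\infty, +\infty\}$
with $B \in \mathscr{B}^1$. The system $\bar{\mathscr{B}}^1$ of these sets is obviously a $\sigma$-algebra in $\bar{\mathbb{R}}$ whose
trace in $\mathbb{R}$ is $\mathscr{B}^1$:
(9.1) $\mathbb{R} \cap \bar{\mathscr{B}}^1 = \mathscr{B}^1$.

If now $(\Omega, \mathscr{A})$ is a measurable space, the $\mathscr{A}$-$\bar{\mathscr{B}}^1$-measurability of functions
$f : \Omega \to \bar{\mathbb{R}}$ is defined. Such functions will henceforth be called ($\mathscr{A}$-)measurable numerical
functions on $\Omega$. Real functions $f : \Omega \to \mathbb{R}$ are special numerical functions;
in view of (9.1) the $\mathscr{A}$-$\bar{\mathscr{B}}^1$-measurability of such a function is just the same as its
$\mathscr{A}$-$\mathscr{B}^1$-measurability.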

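Examples. 1. Let $(\Omega, \mathscr{A})$ be a measurable space, $A$ a subset of $\Omega$. The function
(9.2) $1_A(\omega) := \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{if } \omega \in \Omega \setminus A \end{cases}$
is called the indicator function (sometimes also the characteristic function) of $A$.
This real function on $\Omega$ is $\mathscr{A}$-measurable just if $A \in \mathscr{A}$, because for every $B \subset \mathbb{R}$
the set $(1_A)^{-1}(B)$ must be one of the four $\Omega$, $A$, $\Omega \setminus A$, $\emptyset$.
Thus sets and their indicator functions correspond biuniquely. The following
calculation rules, in which $A, B \subset \Omega$ and $A_i \subset \Omega$ for $i \in I$, are often used, and
their validity is immediately perceived:
$A \subset B \iff 1_A \le 1_B$;
$1_{\complement A} = 1 - 1_A$; $1_{A \triangle B} = |1_A - 1_B|$; $1_{A \cap B} = 1_A \cdot 1_B$;
$1_{\bigcup_{i \in I} A_i} = \sup_{i \in I} 1_{A_i}$, $1_{\bigcap_{i \in I} A_i} = \inf_{i \in I} 1_{A_i}$.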
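
On a finite ground set these calculation rules can be verified pointwise. The sketch below does this for two sets only, with the union and intersection rules specialised to a family of two members; the ground set `Omega` and the sets `A`, `B` are arbitrary sample data.

```python
def indicator(A):
    """Indicator function 1_A as a Python function."""
    return lambda w: 1 if w in A else 0

def rules_hold(Omega, A, B):
    """Check the complement, symmetric difference, product, sup and inf rules on Omega."""
    one_A, one_B = indicator(A), indicator(B)
    return all(
        indicator(Omega - A)(w) == 1 - one_A(w)
        and indicator(A ^ B)(w) == abs(one_A(w) - one_B(w))
        and indicator(A & B)(w) == one_A(w) * one_B(w)
        and indicator(A | B)(w) == max(one_A(w), one_B(w))
        and indicator(A & B)(w) == min(one_A(w), one_B(w))
        for w in Omega
    )

def monotone_rule_holds(Omega, A, B):
    """A is a subset of B exactly when 1_A <= 1_B holds pointwise."""
    pointwise = all(indicator(A)(w) <= indicator(B)(w) for w in Omega)
    return (A <= B) == pointwise

Omega = set(range(10))
A, B = {1, 2, 3, 4}, {3, 4, 5}
```

Both `rules_hold(Omega, A, B)` and `monotone_rule_holds(Omega, A, B)` evaluate to `True` for this data.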

2. For an arbitrary subset $Q$ of $\mathbb{R}^d$ consider the measurable space $(Q, Q \cap \mathscr{B}^d)$. The
corresponding measurable numerical functions on $Q$ will be called Borel measurable
functions or Borel functions on $Q$. Every continuous numerical function $f$ on $Q$
is such a Borel measurable function. Indeed, for every $\alpha \in \mathbb{R}$ the set $Q_\alpha$ of all
$x \in Q$ with $f(x) \ge \alpha$ is a relatively closed subset of $Q$, that is, of the form $Q \cap F$
for a set $F$ which is closed in $\mathbb{R}^d$. (Such an $F$ would be, e.g., the closure of $Q_\alpha$
in $\mathbb{R}^d$.) Since $F \in \mathscr{B}^d$, this intersection lies in the trace $\sigma$-algebra $Q \cap \mathscr{B}^d$. The
claim therefore follows from the next theorem. (Cf. also Example 2 of §7.)

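9.1 Theorem. A numerical function $f$ on $\Omega$ is $\mathscr{A}$-measurable if and only if
(9.3) $\{\omega \in \Omega : f(\omega) \ge \alpha\} \in \mathscr{A}$ for all $\alpha \in \mathbb{R}$.
Proof. According to 7.2 we have only to show that the system $\bar{\mathscr{E}}$ of all intervals
$[\alpha, +\infty]$ with $\alpha \in \mathbb{R}$ generates the $\sigma$-algebra $\bar{\mathscr{B}}^1$ in $\bar{\mathbb{R}}$. Since $[\alpha, +\infty] \in \bar{\mathscr{B}}^1$
for every $\alpha \in \mathbb{R}$, we have at any rate that $\bar{\mathscr{S}} \subset \bar{\mathscr{B}}^1$ for the $\sigma$-algebra $\bar{\mathscr{S}}$ generated
by $\bar{\mathscr{E}}$. Because $[\alpha, \beta[ = [\alpha, +\infty] \setminus [\beta, +\infty]$, the intervals $[\alpha, \beta[$ with $\alpha, \beta \in \mathbb{R}$ and
$\alpha < \beta$ all lie in $\mathbb{R} \cap \bar{\mathscr{S}}$. From 6.1 therefore follows that $\mathscr{B}^1 \subset \mathbb{R} \cap \bar{\mathscr{S}}$. Now the
single-element sets
$$\{-\infty\} = \bigcap_{n \in \mathbb{N}} \complement[-n, +\infty] \quad \text{and} \quad \{+\infty\} = \bigcap_{n \in \mathbb{N}} [n, +\infty]$$
both lie in $\bar{\mathscr{S}}$. Consequently, along with each $Q \in \bar{\mathscr{S}}$, the set $\mathbb{R} \cap Q$ is also in $\bar{\mathscr{S}}$.
In other words, $\mathbb{R} \cap \bar{\mathscr{S}} \subset \bar{\mathscr{S}}$ and therewith $\mathscr{B}^1 \subset \bar{\mathscr{S}}$. This fact together with
$\{-\infty\}, \{+\infty\} \in \bar{\mathscr{S}}$ and the remarks preceding (9.1) make it clear that $\bar{\mathscr{B}}^1 \subset \bar{\mathscr{S}}$,
so that finally we have $\bar{\mathscr{B}}^1 = \bar{\mathscr{S}}$. □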

We now introduce some popular short-hand notation: For numerical functions
$f$ and $g$ on $\Omega$
(9.4) $\{f \le g\} := \{\omega \in \Omega : f(\omega) \le g(\omega)\}$
and the sets $\{f < g\}$, $\{f = g\}$, $\{f \ne g\}$, etc., are defined analogously. Condition
(9.3) in this language reads: $\{f \ge \alpha\} \in \mathscr{A}$ for all $\alpha \in \mathbb{R}$.
That we can just as well employ the sets $\{f > \alpha\}$, $\{f \le \alpha\}$, etc., in the
preceding characterization is the content of

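9.2 Theorem. Each of the following conditions is equivalent to the $\mathscr{A}$-measurability
of the numerical function $f$ on $\Omega$:
(a) $\{f \ge \alpha\} \in \mathscr{A}$ for all $\alpha \in \mathbb{R}$;
(b) $\{f > \alpha\} \in \mathscr{A}$ for all $\alpha \in \mathbb{R}$;
(c) $\{f \le \alpha\} \in \mathscr{A}$ for all $\alpha \in \mathbb{R}$;
(d) $\{f < \alpha\} \in \mathscr{A}$ for all $\alpha \in \mathbb{R}$.
Proof. All that has to be shown is the equivalence of these four assertions, and
that results from the validity, for all $\alpha \in \mathbb{R}$, of the equations
$$\{f > \alpha\} = \bigcup_{n \in \mathbb{N}} \{f \ge \alpha + n^{-1}\}; \qquad \{f \le \alpha\} = \complement\{f > \alpha\};$$
$$\{f < \alpha\} = \bigcup_{n \in \mathbb{N}} \{f \le \alpha - n^{-1}\}; \qquad \{f \ge \alpha\} = \complement\{f < \alpha\}. \qquad \Box$$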

It may be noted that the four related assertions in which quantification is over
all $\alpha \in \bar{\mathbb{R}}$ are also equivalent.
A plethora of assertions about calculating with measurable numerical functions
now presents itself.

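9.3 Theorem. For any $\mathscr{A}$-measurable functions $f, g : \Omega \to \bar{\mathbb{R}}$ the sets $\{f < g\}$,
$\{f \le g\}$, $\{f = g\}$ and $\{f \ne g\}$ lie in $\mathscr{A}$.
Proof. Because the set $\mathbb{Q}$ of rational numbers is countable, the claims follow (with
the help of 9.2) from the equalities
$$\{f < g\} = \bigcup_{\rho \in \mathbb{Q}} \{f < \rho\} \cap \{\rho < g\};$$
$\{f \le g\} = \complement\{f > g\}$; $\{f = g\} = \{f \le g\} \cap \{g \le f\}$;
$\{f \ne g\} = \complement\{f = g\}$. □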
9.4 Theorem. Along with $f, g : \Omega \to \bar{\mathbb{R}}$, the function $f \cdot g$ and, if everywhere
defined, the functions $f + g$ and $f - g$ are also $\mathscr{A}$-measurable.

Proof. First of all, along with $g$, $\sigma + \tau g$ is measurable for all $\sigma, \tau \in \mathbb{R}$. This follows
from 9.2 because $\{\sigma + \tau g \ge \alpha\}$ is $\{g \ge (\alpha - \sigma)/\tau\}$ if $\tau > 0$ and is $\{g \le (\alpha - \sigma)/\tau\}$
if $\tau < 0$, the case $\tau = 0$ being trivial. This preliminary remark takes care of the
passage from $g$ to $-g$ and reduces the case $f - g$ to the case $f + g$. Furthermore,
together with the remark following 9.2 and the equalities
$\{f + g \ge \alpha\} = \{f \ge \alpha - g\}$ ($\alpha \in \mathbb{R}$)
it yields the measurability of $f + g$.
In investigating $f \cdot g$ we will first suppose both functions are real-valued. Then
the identity
$$fg = \tfrac{1}{4}(f + g)^2 - \tfrac{1}{4}(f - g)^2$$
reduces the product question to the case $g = f$. But $\{f^2 \ge \alpha\}$ is $\Omega$ if $\alpha \le 0$ and
is $\{f \ge \sqrt{\alpha}\} \cup \{f \le -\sqrt{\alpha}\}$ if $\alpha > 0$, which shows that the measurability of $f^2$
follows from that of $f$.
If finally $f$ and $g$ are numerical functions, introduce $\Omega_1 := \{fg = +\infty\}$, $\Omega_2 :=
\{fg = -\infty\}$, $\Omega_3 := \{fg = 0\}$ and $\Omega_4 := \complement(\Omega_1 \cup \Omega_2 \cup \Omega_3)$. Using 9.3 and the
measurability of constant functions we check that these four pairwise disjoint sets
lie in $\mathscr{A}$. The restrictions $f', g'$ of $f, g$ to $\Omega_4$ are $\Omega_4 \cap \mathscr{A}$-measurable and real-valued.
The product $f'g'$ is therefore $\Omega_4 \cap \mathscr{A}$-measurable. From this follows immediately
the $\mathscr{A}$-measurability of $f \cdot g$. □
A useful special case of 9.4, isolated already in the course of its proof, is that $\alpha f$
is $\mathscr{A}$-measurable whenever $f$ is and $\alpha \in \mathbb{R}$.

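9.5 Theorem. Let $(f_n)_{n\in\mathbb{N}}$ be a sequence of $\mathscr{A}$-measurable numerical functions
on $\Omega$. Then each of the following numerical functions is $\mathscr{A}$-measurable:
$$\inf_{n\in\mathbb{N}} f_n, \qquad \sup_{n\in\mathbb{N}} f_n, \qquad \liminf_{n\to\infty} f_n, \qquad \limsup_{n\to\infty} f_n.$$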

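Proof. The function $s := \sup_n f_n$ is measurable because
$$\{s \le \alpha\} = \bigcap_{n\in\mathbb{N}} \{f_n \le \alpha\} \qquad \text{for all } \alpha \in \mathbb{R}.$$
Due to 9.4, $\inf_n f_n = -\sup_n(-f_n)$ is then also measurable. By definition we have
$$\liminf_{n\to\infty} f_n = \sup_{n\in\mathbb{N}} \inf_{m\ge n} f_m, \qquad \limsup_{n\to\infty} f_n = \inf_{n\in\mathbb{N}} \sup_{m\ge n} f_m.$$
By what has already been proved, each of these functions is measurable. $\square$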

9.6 Corollary 1. For every finitely many $\mathscr{A}$-measurable numerical functions
$f_1, \dots, f_n$ on $\Omega$, their lower and upper envelopes
$$f_1 \wedge \dots \wedge f_n \qquad \text{and} \qquad f_1 \vee \dots \vee f_n$$
are $\mathscr{A}$-measurable.

Proof. Apply 9.5 to the ultimately constant sequence $f_1, \dots, f_n, f_n, f_n, \dots$ $\square$

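9.7 Corollary 2. If a sequence $(f_n)_{n\in\mathbb{N}}$ of $\mathscr{A}$-measurable functions converges
pointwise throughout $\Omega$, that is, if $\lim_{n\to\infty} f_n(\omega)$ exists in $\overline{\mathbb{R}}$ for every $\omega \in \Omega$, then
the limit function $\lim_{n\to\infty} f_n$ is $\mathscr{A}$-measurable.

This is immediate from
$$\lim_{n\to\infty} f_n = \liminf_{n\to\infty} f_n = \limsup_{n\to\infty} f_n.$$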

To every numerical function $f : \Omega \to \overline{\mathbb{R}}$ three other functions on $\Omega$ are associated
(cf. the section "Notations"): the absolute value
$$(9.5) \qquad |f| := f \vee (-f),$$
the positive part
$$(9.6) \qquad f^+ := f \vee 0,$$
and the negative part of $f$
$$(9.7) \qquad f^- := (-f)^+ = -(f \wedge 0).$$
Thus $f^+(\omega) = f(\omega)$ in case $f(\omega) \ge 0$ and $f^+(\omega) = 0$ in case $f(\omega) < 0$. Observe
that not only $f^+ \ge 0$, but also $f^- \ge 0$. The important equalities
$$(9.8) \qquad f = f^+ - f^- \qquad \text{and} \qquad |f| = f^+ + f^-$$
are immediate.
From 9.4 and 9.6 we effortlessly infer our concluding result:

9.8 Theorem. A numerical function $f$ on $\Omega$ is $\mathscr{A}$-measurable if and only if both
its positive part $f^+$ and its negative part $f^-$ are $\mathscr{A}$-measurable. Furthermore,
along with $f$, its absolute value $|f|$ is always $\mathscr{A}$-measurable.

Exercises.
1. Let $(\Omega, \mathscr{A})$ be a measurable space, $D$ a dense subset of $\mathbb{R}$ (e.g., $\mathbb{Q}$). Show that
a numerical function $f$ on $\Omega$ is $\mathscr{A}$-measurable if the analog for all $\alpha \in D$ of one of
(a)-(d) in Theorem 9.2 holds.
2. Let $(f_n)_{n\in\mathbb{N}}$ be a sequence of $\mathscr{A}$-measurable numerical functions on a measurable
space $(\Omega, \mathscr{A})$. Why is the set of all $\omega \in \Omega$ for which the sequence $(f_n(\omega))_{n\in\mathbb{N}}$
converges in $\mathbb{R}$, and that for which it converges in $\overline{\mathbb{R}}$, $\mathscr{A}$-measurable?
3. The real function $f : \Omega \to \mathbb{R}$ is measurable on the measurable space $(\Omega, \mathscr{A})$. Are
$\exp f$ and $\sin f$, that is, the functions $\omega \mapsto e^{f(\omega)}$ and $\omega \mapsto \sin f(\omega)$, $\mathscr{A}$-measurable?
4. With the aid of Theorem 9.1 show that the real function defined on $\mathbb{R}^2$ by
$(x, y) \mapsto \max\{x, y\}$ is $\mathscr{B}^2$-measurable. Deduce from this another proof of Corollary 9.6.
5. Show via an example that the measurability of a numerical function $f$ is not
always a consequence of the measurability of $|f|$.

§10. Elementary functions and their integral


Our path to the integral proceeds via the set
$$E = E(\Omega, \mathscr{A})$$
of $\mathscr{A}$-elementary functions on $\Omega$, which we define as follows:

10.1 Definition. A real function on $\Omega$ is called an ($\mathscr{A}$-)elementary function (or
a non-negative step function) if it is non-negative, $\mathscr{A}$-measurable, and assumes
only finitely many different values.

If $\{\alpha_1, \dots, \alpha_n\}$ is the set of distinct values of a function $u \in E$, then the sets
$A_i := u^{-1}(\alpha_i)$, $i = 1, \dots, n$, are pairwise disjoint, and as pre-images of the Borel
sets $\{\alpha_i\}$ they each lie in $\mathscr{A}$. Using the notation for indicator functions introduced
in (9.2), we have then
$$(10.1) \qquad u = \sum_{i=1}^{n} \alpha_i 1_{A_i}.$$

If conversely, numbers $\alpha_1, \dots, \alpha_n \in \mathbb{R}_+$ and sets $A_1, \dots, A_n \in \mathscr{A}$ are given ($n \in \mathbb{N}$)
and we define $u$ via (10.1), then $u$ is an elementary function, because by 9.4 it is
measurable. Thus $E$ is the set of all functions having a representation of the form
(10.1), with $n \in \mathbb{N}$, coefficients $\alpha_i$ in $\mathbb{R}_+$ and sets $A_i$ from $\mathscr{A}$.
From Definition 10.1 and the results of §9 the following further properties of $E$
are immediate:
$$(10.2) \qquad u, v \in E,\ \alpha \in \mathbb{R}_+ \implies \alpha u,\ u + v,\ u \cdot v,\ u \vee v,\ u \wedge v \in E.$$
The derivation of (10.1) shows moreover that every function $u \in E$ has a representation
of the form (10.1) in which the sets $A_i \in \mathscr{A}$ are pairwise disjoint
and cover $\Omega$, that is, constitute a decomposition of $\Omega$. Such representations will
henceforth be called normal representations of $u$.
It is easy to see that generally functions $u \in E$ can have several different normal
representations. However, for $u \ne 0$ there is only one representation in which the
coefficients are the distinct non-zero values taken by $u$. Anyway, for purposes of
integration non-uniqueness of normal representations is not an issue, as the next
lemma shows.

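10.2 Lemma. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space. For any normal representations
$$u = \sum_{i=1}^{m} \alpha_i 1_{A_i} = \sum_{j=1}^{n} \beta_j 1_{B_j}$$
of an elementary function $u \in E$ we have
$$\sum_{i=1}^{m} \alpha_i \mu(A_i) = \sum_{j=1}^{n} \beta_j \mu(B_j)$$
(bearing in mind the conventions for calculating with $+\infty$).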

Proof. From
$$\Omega = A_1 \cup \dots \cup A_m = B_1 \cup \dots \cup B_n$$
follows
$$A_i = \bigcup_{j=1}^{n} (A_i \cap B_j) \qquad \text{and} \qquad B_j = \bigcup_{i=1}^{m} (A_i \cap B_j)$$
in which the sets $A_i \cap B_j$ are pairwise disjoint. The finite additivity of $\mu$ therefore
supplies the equalities
$$\mu(A_i) = \sum_{j=1}^{n} \mu(A_i \cap B_j) \qquad \text{and} \qquad \mu(B_j) = \sum_{i=1}^{m} \mu(A_i \cap B_j),$$
the first for all $i \in \{1, \dots, m\}$, the second for all $j \in \{1, \dots, n\}$. After further
summation
$$\sum_{i=1}^{m} \alpha_i \mu(A_i) = \sum_{i,j} \alpha_i \mu(A_i \cap B_j) \qquad \text{and} \qquad \sum_{j=1}^{n} \beta_j \mu(B_j) = \sum_{i,j} \beta_j \mu(A_i \cap B_j).$$
From these two equalities the claim follows when we observe the following fact:
Because we started with normal representations of $u$, $\alpha_i = \beta_j$ for every index
pair $(i, j)$ such that $A_i \cap B_j \ne \emptyset$, in particular, for every pair $(i, j)$ such that
$\mu(A_i \cap B_j) \ne 0$. $\square$
Thanks to the preceding our next definition is sound:

10.3 Definition. Let $u$ be an elementary function. The number
$$(10.3) \qquad \int u \, d\mu := \sum_{i=1}^{n} \alpha_i \mu(A_i),$$
which is independent of the special choice of normal representation
$$u = \sum_{i=1}^{n} \alpha_i 1_{A_i}$$
of $u$, is called the ($\mu$-)integral of $u$ (over $\Omega$).

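Thus $u \mapsto \int u \, d\mu$ defines a mapping from $E$ into $\overline{\mathbb{R}}_+$. Clearly it is a mapping
into $\mathbb{R}_+$ just if $\mu$ is finite. The most important properties of this mapping are
summarized in:
$$(10.4) \qquad \int 1_A \, d\mu = \mu(A) \qquad \text{for all } A \in \mathscr{A};$$
$$(10.5) \qquad \int (\alpha u) \, d\mu = \alpha \int u \, d\mu \qquad \text{for all } u \in E,\ \alpha \in \mathbb{R}_+;$$
$$(10.6) \qquad \int (u + v) \, d\mu = \int u \, d\mu + \int v \, d\mu \qquad \text{for all } u, v \in E;$$
$$(10.7) \qquad u \le v \implies \int u \, d\mu \le \int v \, d\mu \qquad \text{for all } u, v \in E.$$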

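Properties (10.4) and (10.5) are immediate from 10.3. The next property in the
list is confirmed thus: Start with normal representations
$$u = \sum_{i=1}^{m} \alpha_i 1_{A_i} \qquad \text{and} \qquad v = \sum_{j=1}^{n} \beta_j 1_{B_j}$$
of the functions $u$ and $v$ in $E$. As before
$$A_i = \bigcup_{j=1}^{n} (A_i \cap B_j) \qquad \text{and} \qquad B_j = \bigcup_{i=1}^{m} (A_i \cap B_j);$$
and because the sets $A_i \cap B_j$ are pairwise disjoint, these equations entail
$$1_{A_i} = \sum_{j=1}^{n} 1_{A_i \cap B_j} \qquad \text{and} \qquad 1_{B_j} = \sum_{i=1}^{m} 1_{A_i \cap B_j},$$
the first for all $i \in \{1, \dots, m\}$, the second for all $j \in \{1, \dots, n\}$, from which in turn
new normal representations
$$u = \sum_{i,j} \alpha_i 1_{A_i \cap B_j}, \qquad v = \sum_{i,j} \beta_j 1_{A_i \cap B_j} \qquad \text{and} \qquad u + v = \sum_{i,j} (\alpha_i + \beta_j) 1_{A_i \cap B_j}$$
emerge. Using them to compute all the integrals,
$$\int u \, d\mu = \sum_{i,j} \alpha_i \mu(A_i \cap B_j), \qquad \int v \, d\mu = \sum_{i,j} \beta_j \mu(A_i \cap B_j),$$
$$\int (u + v) \, d\mu = \sum_{i,j} (\alpha_i + \beta_j) \mu(A_i \cap B_j)$$
makes clear the validity of (10.6).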
These deliberations have shown that every $u, v \in E$ admit normal representations
$$u = \sum_{i=1}^{k} \gamma_i 1_{C_i} \qquad \text{and} \qquad v = \sum_{i=1}^{k} \delta_i 1_{C_i}$$
involving the same sets $C_1, \dots, C_k \in \mathscr{A}$. In case $u \le v$, it then follows that $\gamma_i \le \delta_i$
for each $i \in \{1, \dots, k\}$ such that $C_i \ne \emptyset$, and from this we have (10.7).
Now let $u = \sum_{i=1}^{n} \alpha_i 1_{A_i}$ be an arbitrary representation of an elementary function
$u \in E$ with coefficients $\alpha_i \in \mathbb{R}_+$ and sets $A_i \in \mathscr{A}$, but not necessarily a normal
representation. From (10.4)-(10.6) it follows that
$$\int u \, d\mu = \sum_{i=1}^{n} \alpha_i \mu(A_i).$$
For normal representations this equation served as the definition of $\int u \, d\mu$. Its
validity without this restriction, which we now perceive, indicates that the introduction
of normal representations was simply a technique of proof.
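
The independence of $\int u \, d\mu$ from the chosen representation can be checked by hand on a small example. The following is only a sketch under assumptions that are not part of the text: a hypothetical six-point base space in which every subset is measurable, a measure given by point masses (the names `OMEGA`, `WEIGHTS`, `mu`, `integral` are illustrative), and exact rational arithmetic.

```python
from fractions import Fraction

# Hypothetical finite measure space: Omega = {0,...,5}, every subset measurable,
# mu given by point masses (an assumption made only for this illustration).
OMEGA = range(6)
WEIGHTS = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(2),
           3: Fraction(0), 4: Fraction(3), 5: Fraction(1, 4)}

def mu(A):
    """Measure of a subset A of OMEGA (sum of its point masses)."""
    return sum((WEIGHTS[w] for w in A), Fraction(0))

def integral(rep):
    """Sum of alpha_i * mu(A_i) for a representation [(alpha_i, A_i), ...]."""
    return sum((alpha * mu(A) for alpha, A in rep), Fraction(0))

def evaluate(rep, w):
    """Value at the point w of the function sum_i alpha_i * 1_{A_i}."""
    return sum((alpha for alpha, A in rep if w in A), Fraction(0))

def normal_representation(rep):
    """Group the points of OMEGA by the value of u, i.e. form the sets u^{-1}(value)."""
    levels = {}
    for w in OMEGA:
        levels.setdefault(evaluate(rep, w), set()).add(w)
    return list(levels.items())

# An arbitrary representation of u with overlapping sets (not a normal one).
arbitrary = [(Fraction(2), {0, 1, 2}), (Fraction(3), {2, 3, 4})]
normal = normal_representation(arbitrary)

# Both representations yield the same number, as the closing remark of §10 asserts.
print(integral(arbitrary), integral(normal))
```

The grouping by values mirrors the passage from (10.1) to a normal representation; the overlapping sets of `arbitrary` are exactly the situation covered by the final remark above.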

Exercises.
1. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space and $(\Omega, \mathscr{A}_0, \mu_0)$ its completion. Prove that
for every $\mathscr{A}_0$-elementary function $u$ there are $\mathscr{A}$-elementary functions $u_1, u_2$ such
that $u_1 \le u \le u_2$ and $\mu(\{u_1 \ne u_2\}) = 0$. For every such pair, $\int u_1 \, d\mu = \int u_2 \, d\mu =
\int u \, d\mu_0$. (Cf. Exercise 7(d) in §5.)
2. The function $1_{\mathbb{Q}}$ on $\mathbb{R}$ has long been known as Dirichlet's jump function. Is it
a $\mathscr{B}^1$-elementary function?

§11. The integral of non-negative measurable functions

Further progress hinges on the following result:

11.1 Theorem. For every isotone sequence $(u_n)_{n\in\mathbb{N}}$ of functions from $E$ and
every $u \in E$
$$(11.1) \qquad u \le \sup_{n\in\mathbb{N}} u_n \implies \int u \, d\mu \le \sup_{n\in\mathbb{N}} \int u_n \, d\mu.$$

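Proof. Choose a representation
$$u = \sum_{j=1}^{m} \alpha_j 1_{A_j}$$
of $u$ with sets $A_j \in \mathscr{A}$ and coefficients $\alpha_j \in \mathbb{R}_+$, and let $\alpha$ be any number in $]0,1[$.
Then because of measurability the set
$$B_n := \{u_n \ge \alpha u\}$$
lies in $\mathscr{A}$ for each $n \in \mathbb{N}$. From this definition follows on the one hand that $u_n \ge
\alpha u 1_{B_n}$ and consequently by (10.5) and (10.7)
$$\int u_n \, d\mu \ge \alpha \int u 1_{B_n} \, d\mu$$
for every $n \in \mathbb{N}$. Since the sequence $(u_n)$ is isotone and $u \le \sup u_n$, it follows on
the other hand that $B_n \uparrow \Omega$, and so $A_j \cap B_n \uparrow A_j$ for each $j \in \{1, \dots, m\}$ and
consequently, because $\mu$ is continuous from below,
$$\int u \, d\mu = \sum_{j=1}^{m} \alpha_j \mu(A_j) = \lim_{n\to\infty} \sum_{j=1}^{m} \alpha_j \mu(A_j \cap B_n) = \lim_{n\to\infty} \int u 1_{B_n} \, d\mu.$$
Hence
$$\sup_{n\in\mathbb{N}} \int u_n \, d\mu \ge \sup_{n\in\mathbb{N}} \alpha \int u 1_{B_n} \, d\mu = \alpha \lim_{n\to\infty} \int u 1_{B_n} \, d\mu = \alpha \int u \, d\mu,$$
where the first step follows from $\int u_n \, d\mu \ge \alpha \int u 1_{B_n} \, d\mu$. Since $\alpha \in \,]0,1[$ is arbitrary
here, the claim follows. $\square$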

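11.2 Corollary. For any isotone sequences $(u_n)_{n\in\mathbb{N}}$ and $(v_n)_{n\in\mathbb{N}}$ of functions from $E$
$$(11.2) \qquad \sup_{n\in\mathbb{N}} u_n = \sup_{n\in\mathbb{N}} v_n \implies \sup_{n\in\mathbb{N}} \int u_n \, d\mu = \sup_{n\in\mathbb{N}} \int v_n \, d\mu.$$

Proof. For every $m \in \mathbb{N}$, $v_m \le \sup_n u_n$ and $u_m \le \sup_n v_n$, from which inequalities
and 11.1 follow
$$\int v_m \, d\mu \le \sup_{n\in\mathbb{N}} \int u_n \, d\mu \qquad \text{and} \qquad \int u_m \, d\mu \le \sup_{n\in\mathbb{N}} \int v_n \, d\mu.$$
Claim (11.2) is immediate from the validity of these inequalities for all $m \in \mathbb{N}$. $\square$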

Now let
$$(11.3) \qquad E^* = E^*(\Omega, \mathscr{A})$$
designate the set of all non-negative numerical functions $f$ on $\Omega$ for which an
isotone sequence $(u_n)$ of functions from $E$ can be found satisfying
$$\sup_{n\in\mathbb{N}} u_n = f.$$
Then according to (11.2) the number
$$\sup_{n\in\mathbb{N}} \int u_n \, d\mu \in \overline{\mathbb{R}}_+$$
depends only on $f$ and not on the special representing sequence $(u_n)$ of $f$ used
to compute it. We're in a position similar to that of 10.3. Therefore we make the

11.3 Definition. Let $f$ be a function in $E^*$, represented as the upper envelope
$f = \sup u_n$ of an isotone sequence $(u_n)_{n\in\mathbb{N}}$ of elementary functions. Then the
number
$$(11.4) \qquad \int f \, d\mu := \sup_{n\in\mathbb{N}} \int u_n \, d\mu \in \overline{\mathbb{R}}_+,$$
shown above to be independent of the special representing sequence $(u_n)$, is called the
($\mu$-)integral of $f$ (over $\Omega$).
Evidently $E \subset E^*$, because every $u \in E$ satisfies $u = \sup u_n$ for the constant
sequence $u_n := u$. Moreover, using this sequence (as we may) in (11.4), we see
that in case $f = u \in E$, that definition of the integral coincides with the earlier
one. The mapping $f \mapsto \int f \, d\mu$ initially defined only on $E$ is thereby extended to
a mapping of $E^*$ into $\overline{\mathbb{R}}_+$. That in this extension process the known properties of
the integral persist, will now be confirmed.
The analogs of (10.2) and of (10.5)-(10.7) are
$$(11.5) \qquad f, g \in E^*,\ \alpha \in \mathbb{R}_+ \implies \alpha f,\ f + g,\ f \cdot g,\ f \vee g,\ f \wedge g \in E^*;$$
$$(11.6) \qquad \int (\alpha f) \, d\mu = \alpha \int f \, d\mu \qquad \text{for all } f \in E^*,\ \alpha \in \mathbb{R}_+;$$
$$(11.7) \qquad \int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu \qquad \text{for all } f, g \in E^*;$$
$$(11.8) \qquad f \le g \implies \int f \, d\mu \le \int g \, d\mu \qquad \text{for all } f, g \in E^*.$$

Proof. From the definition of $E^*$ and from (10.2) follows (11.5). One only has
to note that $\sup_n u_n = \lim_n u_n$ for isotone sequences $(u_n)$. The earlier proofs carry
over almost verbatim to (11.6) and (11.7). We'll do (11.7) and leave (11.6) for the
reader: Let $f = \sup_n u_n$, $g = \sup_n v_n$ be representations of $f, g \in E^*$ by means of
isotone sequences of elementary functions. Then by definition
$$\int f \, d\mu = \sup_n \int u_n \, d\mu \qquad \text{and} \qquad \int g \, d\mu = \sup_n \int v_n \, d\mu,$$
$$\int (f + g) \, d\mu = \sup_n \int (u_n + v_n) \, d\mu.$$
From this and (10.6) we get (11.7), since due to isotoneity
$$\sup_n \int (u_n + v_n) \, d\mu = \lim_{n\to\infty} \Bigl( \int u_n \, d\mu + \int v_n \, d\mu \Bigr) = \int f \, d\mu + \int g \, d\mu.$$
If in addition we assume that $f \le g$, then $u_m \le \sup_n v_n$ for every $m \in \mathbb{N}$.
(11.8) therefore follows from 11.1. $\square$

Properties (11.6)-(11.8) say that the integral is a positively-homogeneous, additive
and isotone function on $E^*$.
Finally, it turns out that Theorem 11.1, which is so critical for our program, is
valid also in $E^*$. This is the content of a theorem which goes back to B. LEVI (1875-1961):

11.4 Theorem (on monotone convergence). For every isotone sequence $(f_n)_{n\in\mathbb{N}}$
of functions from $E^*$
$$\sup_{n\in\mathbb{N}} f_n \in E^* \qquad \text{and} \qquad \int \sup_{n\in\mathbb{N}} f_n \, d\mu = \sup_{n\in\mathbb{N}} \int f_n \, d\mu.$$

Proof. Set $f := \sup_n f_n$. It suffices to find an isotone sequence $(v_n)$ of functions
from $E$ which satisfy
$$\sup_{n\in\mathbb{N}} v_n = f \qquad \text{and} \qquad v_n \le f_n \ \text{for every } n \in \mathbb{N}.$$
For then $f \in E^*$ and $\int f \, d\mu = \sup_n \int v_n \, d\mu$ by definition of the integral in $E^*$, while
$\int v_n \, d\mu \le \int f_n \, d\mu$ by (11.8). Consequently, $\int f \, d\mu \le \sup_n \int f_n \, d\mu$ and therewith the
equality claimed by the theorem follows, since the other inequality $\sup_n \int f_n \, d\mu \le
\int f \, d\mu$ is immediate from (11.8) and the fact that $f_n \le f$ for all $n$.

The sequence $(v_n)$ is gotten thus: For each $f_n$ there is by definition an isotone
sequence $(u_{mn})_{m\in\mathbb{N}}$ of functions from $E$ with $\sup_{m\in\mathbb{N}} u_{mn} = f_n$. According to (10.2)
the functions
$$v_m := u_{m1} \vee \dots \vee u_{mm}$$
lie in $E$ (for each $m \in \mathbb{N}$). The isotoneity of each sequence $(u_{mn})_{m\in\mathbb{N}}$ clearly entails
that of the sequence $(v_m)_{m\in\mathbb{N}}$. From the isotoneity of $(f_m)_{m\in\mathbb{N}}$ follows $v_m \le f_m$
for all $m$, and thus $\sup_m v_m \le f$. For all $m \ge n$ we have $u_{mn} \le v_m$ and so
$$\sup_{m\in\mathbb{N}} u_{mn} = f_n \le \sup_{m\in\mathbb{N}} v_m \qquad \text{for every } n \in \mathbb{N}.$$
Together with the preceding this gives finally $\sup_m v_m = f$. Therefore $(v_n)$ is a sequence
with the needed properties. $\square$

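11.5 Corollary. For every sequence $(f_n)_{n\in\mathbb{N}}$ of functions from $E^*$
$$\sum_{n=1}^{\infty} f_n \in E^* \qquad \text{and} \qquad \int \Bigl( \sum_{n=1}^{\infty} f_n \Bigr) d\mu = \sum_{n=1}^{\infty} \int f_n \, d\mu.$$

Proof. Apply 11.4 to the sequence $(f_1 + \dots + f_n)_{n\in\mathbb{N}}$ and recall (11.7). $\square$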

In analogy with the device of writing $A_n \uparrow A$, $A_n \downarrow A$ for sets, introduced in §3,
we will from now on write
$$f_n \uparrow f, \qquad f_n \downarrow f$$
for numerical functions $f, f_1, f_2, \dots$ on the set $\Omega$ to signal that $f_n(\omega) \uparrow f(\omega)$ for
every $\omega \in \Omega$, or $f_n(\omega) \downarrow f(\omega)$ for every $\omega \in \Omega$; that is, the notations mean $(f_n)$ is
an isotone sequence and $f$ is its upper envelope, or $(f_n)$ is an antitone sequence
and $f$ is its lower envelope. Obviously for a sequence $(A_n)$ of subsets of $\Omega$
$$A_n \uparrow A \iff 1_{A_n} \uparrow 1_A \qquad \text{and} \qquad A_n \downarrow A \iff 1_{A_n} \downarrow 1_A.$$

Examples. 1. Let $(\Omega, \mathscr{A})$ be an arbitrary measurable space and $\varepsilon_\omega$ the measure
defined on $\mathscr{A}$ by unit mass at the point $\omega \in \Omega$ (cf. Example 5 in §3). Then
$$\int f \, d\varepsilon_\omega = f(\omega)$$
for every $f \in E^*$. Due to 11.3 we can at once assume that $f \in E$.
If, however, $f = \sum_i \alpha_i 1_{A_i}$ is a normal representation of $f$, then $\omega$ lies in exactly
one of the sets $A_i$, say in $A_{i_0}$. Then $\int f \, d\varepsilon_\omega = \sum_i \alpha_i \varepsilon_\omega(A_i) = \alpha_{i_0} = f(\omega)$.
2. Consider $\Omega := \mathbb{N}$ and $\mathscr{A} := \mathscr{P}(\mathbb{N})$. The $\sigma$-additivity requirement means that
a measure $\mu$ on $\mathscr{A}$ is uniquely defined whenever numbers $\alpha_n = \mu(\{n\}) \in \overline{\mathbb{R}}_+$ are
specified for each $n \in \mathbb{N}$. $E^*$ consists of all numerical functions $f \ge 0$ on $\Omega$. Indeed,
one sets $f_n := f(n) 1_{\{n\}}$ for each $n \in \mathbb{N}$ and then $f_n \in E^*$, and in case $f(n) < +\infty$,
$f_n \in E$. Since
$$f = \sum_{n=1}^{\infty} f_n,$$
it follows from 11.5 that $f \in E^*$ and
$$\int f \, d\mu = \sum_{n=1}^{\infty} f(n) \alpha_n.$$
3. Let $(\Omega, \mathscr{A})$ be a measurable space, $(\mu_n)_{n\in\mathbb{N}}$ a sequence of measures on $\mathscr{A}$ and
$\mu := \sum_{n=1}^{\infty} \mu_n$ (cf. Example 4 in §3). Then for every $f \in E^*$
$$\int f \, d\mu = \sum_{n=1}^{\infty} \int f \, d\mu_n.$$
This is evidently true of indicator functions $f$, so the claimed equality holds for
all elementary functions. Transition to an arbitrary $f \in E^*$ is accomplished thus:
Let $(u_n)$ be a sequence in $E$ with $u_n \uparrow f$. Then the double sequence
$$a_{mn} := \sum_{i=1}^{n} \int u_m \, d\mu_i \qquad (m, n \in \mathbb{N})$$
satisfies
$$\sup_{m\in\mathbb{N}} \bigl( \sup_{n\in\mathbb{N}} a_{mn} \bigr) = \sup_{n\in\mathbb{N}} \bigl( \sup_{m\in\mathbb{N}} a_{mn} \bigr) \quad \bigl( = \sup_{m,n\in\mathbb{N}} a_{mn} \bigr),$$
which confirms the assertion.


Now that $E^*$ is seen as a natural generalization of $E$, we might ask for a more
workable characterization of it. A surprisingly simple one exists which brings us
back to the measurability concept in §9.

11.6 Theorem. $E^*$ is the set of non-negative, $\mathscr{A}$-measurable, numerical functions
on $\Omega$.

Proof. Every elementary function is measurable and so therefore is every function
in $E^*$, by 9.5. Suppose conversely that $f$ is a non-negative, measurable, numerical
function on $\Omega$. The sets
$$A_{in} := \begin{cases} \{f \ge i 2^{-n}\} \cap \{f < (i+1) 2^{-n}\}, & i = 0, 1, \dots, n2^n - 1, \\ \{f \ge n\}, & i = n2^n \end{cases}$$
all lie in $\mathscr{A}$, and for each fixed $n \in \mathbb{N}$ the $n2^n + 1$ sets are a decomposition of $\Omega$.
Consequently, for each $n$
$$u_n := \sum_{i=0}^{n2^n} i 2^{-n} 1_{A_{in}}$$
is a normal representation of a function in $E$. On the set $A_{in}$ the function $u_{n+1}$ can
take only the values $(2i) 2^{-n-1}$ and $(2i+1) 2^{-n-1}$ if $i \in \{0, \dots, n2^n - 1\}$, and only
values $\ge n$ when $i = n2^n$. Therefore the sequence $(u_n)$ is isotone. It satisfies
$\sup_n u_n = f$, because for any $\omega \in \Omega$ either $f(\omega) = +\infty$, in which case $u_n(\omega) = n$
for every $n$, or $f(\omega) < +\infty$, in which case $u_n(\omega) \le f(\omega) < u_n(\omega) + 2^{-n}$ for all
$n > f(\omega)$. Thus $f$ lies in $E^*$. $\square$
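
The dyadic construction in the proof can be followed pointwise. The sketch below is an illustration only: it computes the value $u_n(\omega)$ from the value $f(\omega)$, using exact fractions and `math.inf` for $+\infty$; the function name `dyadic_step` and the sample values are assumptions of this example, not notation from the text.

```python
import math
from fractions import Fraction

def dyadic_step(value, n):
    """Value of u_n at a point where f equals `value` (a Fraction >= 0 or math.inf)."""
    if value == math.inf or value >= n:
        return Fraction(n)               # the point lies in the set {f >= n}
    i = math.floor(value * 2**n)         # index i with i/2^n <= value < (i+1)/2^n
    return Fraction(i, 2**n)

# Sample values of f at four hypothetical points, plus the value +infinity.
samples = [Fraction(0), Fraction(1, 3), Fraction(5, 2), Fraction(22, 7)]

for value in samples:
    steps = [dyadic_step(value, n) for n in range(1, 9)]
    print(value, [str(s) for s in steps])

print([str(dyadic_step(math.inf, n)) for n in range(1, 5)])
```

For each finite sample the printed values increase with $n$ and stay within $2^{-n}$ below $f(\omega)$ once $n > f(\omega)$, while at a point with $f(\omega) = +\infty$ the value is $n$, matching the two cases distinguished at the end of the proof.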

Example. 4. Let $\Omega$ be an uncountable set, $\mathscr{A}$ the $\sigma$-algebra in $\Omega$ comprised of
all sets which are either countable or have countable complement (introduced
in Example 2 of §1). We claim that a numerical function $f$ on $\Omega$ is $\mathscr{A}$-measurable
just if there is a countable set $A$ in the complement of which $f$ is constant. This
constant $\alpha(f)$ does not depend on the particular set $A$, because if $B$ is another
such, $\complement A \cap \complement B$, being the complement in uncountable $\Omega$ of the countable set $A \cup B$,
is not empty. That this condition really implies the $\mathscr{A}$-measurability of $f$ follows
from Theorem 9.1, because for every $\alpha \in \mathbb{R}$ either $\{f \ge \alpha\} \subset A$ or $\complement A \subset \{f \ge \alpha\}$.
In proving the converse we can, thanks to (9.8), assume that $f \ge 0$. The claim
is then true for elementary functions $f \in E(\Omega, \mathscr{A})$, because among finitely many
pairwise disjoint sets whose union is $\Omega$, exactly one has a countable complement.
For arbitrary $f \in E^*(\Omega, \mathscr{A})$ let $(u_n)$ be a sequence of elementary functions with
$u_n \uparrow f$. Each function $u_n$ is constantly $\alpha(u_n)$ in the complement of some countable
set $A_n$. But then $f(\omega)$ has the constant value $\sup_n \alpha(u_n)$ for all $\omega \in \bigcap_{n\in\mathbb{N}} \complement A_n =
\complement(\bigcup_{n\in\mathbb{N}} A_n)$. As the set $\bigcup_{n\in\mathbb{N}} A_n$ is countable, this proves that $f$ has the asserted
property and that moreover $\alpha(f) = \sup_n \alpha(u_n)$.

If now $\mu$ is the measure defined in Examples 2 and 7 of §3 which takes only the
values 0 and 1, then it follows from the preceding deliberations that
$$\int f \, d\mu = \alpha(f) \qquad \text{for all } f \in E^*(\Omega, \mathscr{A}).$$

In closing we will use Theorem 11.6 to derive a factorization lemma, due to


J.L. Doob, which is interesting in its own right and quite important for its appli-
cations in probability theory.

11.7 Factorization lemma. Let $T : \Omega \to \Omega'$ be a mapping of a set $\Omega$ into a measurable
space $(\Omega', \mathscr{A}')$ and $f : \Omega \to \overline{\mathbb{R}}$ a numerical function on $\Omega$. The function $f$
is measurable with respect to the $\sigma$-algebra $\sigma(T) = T^{-1}(\mathscr{A}')$ in $\Omega$ generated by $T$
if and only if there exists a measurable numerical function $g$ on $(\Omega', \mathscr{A}')$ such that
$$(11.9) \qquad f = g \circ T.$$
In case $f$ is $\sigma(T)$-measurable and real (resp., non-negative)-valued, then there is
such a $g$ which is real (resp., non-negative)-valued.

Proof. If $f$ has the form $f = g \circ T$ as specified, then it is the composite of
a $\sigma(T)$-$\mathscr{A}'$-measurable with an $\mathscr{A}'$-$\overline{\mathscr{B}}^1$-measurable mapping, making it $\sigma(T)$-$\overline{\mathscr{B}}^1$-measurable.
For the proof of the converse we distinguish three cases:
1. Let $f = \sum_{i=1}^{n} \alpha_i 1_{A_i}$ be a $\sigma(T)$-elementary function; so $A_i \in \sigma(T)$ and $\alpha_i \in \mathbb{R}_+$ for
$i = 1, \dots, n$. For each $A_i$ there is a set $A_i' \in \mathscr{A}'$ with $A_i = T^{-1}(A_i')$, by definition
of $\sigma(T)$. Therefore the function $g := \sum_{i=1}^{n} \alpha_i 1_{A_i'}$ does what is wanted.
2. Let $f \ge 0$. According to Theorem 11.6 there is an isotone sequence $(u_n)_{n\in\mathbb{N}}$
of $\sigma(T)$-elementary functions with $f = \sup_n u_n$, and by the proof just given, there
are $\mathscr{A}'$-elementary functions $g_n$ such that $u_n = g_n \circ T$. The function $g := \sup_n g_n$
then does what is wanted in this case.
3. An arbitrary $\sigma(T)$-measurable $f : \Omega \to \overline{\mathbb{R}}$ decomposes into its positive part $f^+$
and its negative part $f^-$. From 2. we get $\mathscr{A}'$-measurable $g_0' \ge 0$ and $g_0'' \ge 0$ on $\Omega'$
for which $f^+ = g_0' \circ T$ and $f^- = g_0'' \circ T$. For $\omega'$ in the set $U' := \{g_0' = +\infty\} \cap \{g_0'' =
+\infty\}$ the difference $g_0'(\omega') - g_0''(\omega')$ is not defined. But the set $T(\Omega)$ is disjoint
from $U'$, because $g_0'(T(\omega)) = +\infty$ always entails that $g_0''(T(\omega)) = f^-(\omega) = 0$.
Therefore if we set
$$g' := 1_{\complement U'} g_0' \qquad \text{and} \qquad g'' := 1_{\complement U'} g_0'',$$
then $g := g' - g''$ will do the desired job.
4. If $f$ is real, 3. supplies a numerical $\mathscr{A}'$-measurable function $g_0$ on $\Omega'$ such that
$f = g_0 \circ T$. If we set $U := \{|g_0| = +\infty\}$, then $U \cap T(\Omega) = \emptyset$ since $f$ takes only real
values, and so the real function $g := 1_{\complement U} g_0$ does what is wanted. $\square$
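
The set-theoretic core of the lemma, namely that $f = g \circ T$ forces $g(T(\omega)) = f(\omega)$ and is possible exactly when $f$ is constant on every fibre of $T$, can be tried out on a finite set. The sketch below leaves all measurability questions aside (on a finite set with all subsets measurable they do not arise); the helper `factor_through` and the sample maps are hypothetical names chosen for this illustration.

```python
def factor_through(T, f, omega):
    """Return a dict g on the image T(omega) with f = g o T, or None if no such g exists.

    T and f are callables on the finite collection `omega`. A factorization
    exists exactly when f takes a single value on every fibre {T = t}.
    """
    g = {}
    for w in omega:
        t = T(w)
        if t in g and g[t] != f(w):
            return None          # f separates two points that T identifies
        g[t] = f(w)
    return g

omega = range(-3, 4)

def square(w):
    return w * w

# f = |w| + 1 depends on w only through w^2, so it factors through T = square.
g = factor_through(square, lambda w: abs(w) + 1, omega)
print(g)

# The identity separates w and -w, which square identifies: no factorization.
print(factor_through(square, lambda w: w, omega))
```

The dictionary `g` is defined only on the image $T(\Omega)$; extending it to the rest of $\Omega'$ in a measurable way is exactly the part of the argument that the proof above handles.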

Remark. The restriction of $g$ to $T(\Omega)$ is uniquely determined by $f$ and (11.9).
Specifically, for each $\omega' \in T(\Omega)$, $g(\omega') = f(\omega)$ for every $\omega \in T^{-1}(\omega')$. On $T(\Omega)$
one therefore has no other choice than to set $g(T(\omega)) := f(\omega)$. In case $T(\Omega) \in \mathscr{A}'$,
in particular when $T(\Omega) = \Omega'$, the existence of $g$ can thus be secured without
recourse to 11.7 - cf. Exercise 3 below.
The factorization lemma is therefore noteworthy only in so far as it allows the
measurability of $T(\Omega)$ to be dispensed with. And in doing that the special structure
of $(\overline{\mathbb{R}}, \overline{\mathscr{B}}^1)$ is critical. Remark 4 in §8 shows how we are sometimes forced to do
without the measurability of $T(\Omega)$.

Exercises.
1. Show that every bounded, $\mathscr{A}$-measurable, non-negative real-valued function
on a measurable space $(\Omega, \mathscr{A})$ is the uniform limit of an isotone sequence of $\mathscr{A}$-measurable
elementary functions.
2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with a finite measure $\mu$. Further, let
$f, f_1, f_2, \dots$ be measurable numerical functions on $\Omega$. Prove the equivalence of
the two assertions:
(i) $\lim_{n\to\infty} \mu\bigl( \bigcup_{m\ge n} \{f_m > f + \varepsilon\} \bigr) = 0$ for every $\varepsilon > 0$;
(ii) for every $\delta > 0$ there exists an $A_\delta \in \mathscr{A}$ with $\mu(A_\delta) < \delta$ such that for every
$\varepsilon > 0$, $f_n(\omega) < f(\omega) + \varepsilon$ holds for all $\omega \in \complement A_\delta$ and all sufficiently large $n \in \mathbb{N}$.
[Hints: Note that (i) is also equivalent to the statement that for every $\varepsilon > 0$ and
$\delta > 0$ there exists an $A_{\delta,\varepsilon} \in \mathscr{A}$ with $\mu(A_{\delta,\varepsilon}) < \delta$ and an $N_{\delta,\varepsilon} \in \mathbb{N}$ such that
$f_n(\omega) < f(\omega) + \varepsilon$ for all $\omega \in \complement A_{\delta,\varepsilon}$ and $n \ge N_{\delta,\varepsilon}$.] Why does (i) hold, given the
sequence $(f_n)_{n\in\mathbb{N}}$, for every measurable function $f$ which satisfies $f \ge \limsup_{n\to\infty} f_n$?
3. With the hypotheses and notation of the factorization lemma, show that for
any $\omega_1, \omega_2 \in \Omega$ with $T(\omega_1) = T(\omega_2)$, and every $C \in \sigma(T)$, either $\omega_1, \omega_2 \in C$ or
$\omega_1, \omega_2 \in \complement C$. (That is, $\omega_1$ and $\omega_2$ cannot be "separated" by any set in $\sigma(T)$.)
From this fact infer that a $\sigma(T)$-measurable $f$ satisfies $f(\omega_1) = f(\omega_2)$ whenever
$T(\omega_1) = T(\omega_2)$. In case $T(\Omega) \in \mathscr{A}'$, deduce the existence of an $\mathscr{A}'$-measurable
mapping $g : \Omega' \to \overline{\mathbb{R}}$ with $f = g \circ T$. [Hint: Consider the system $\mathscr{C}$ of all $C \subset \Omega$
which have this two-point property and conclude that $\sigma(T) \subset \mathscr{C}$. Further, take
note of the equality $T(T^{-1}(A')) = A' \cap T(\Omega)$ for $A' \subset \Omega'$.]

§12. Integrability

By now the integral $\int f \, d\mu$ is defined for all non-negative $\mathscr{A}$-measurable numerical
functions on $\Omega$, as a result of 11.4 and 11.6 together. In a third and final step $\int f \, d\mu$
will now be defined for certain numerical functions $f$ which are not of constant
sign.
According to Theorem 9.8, $f$ is measurable just if both its positive part $f^+$ and
its negative part $f^-$ are measurable. This remark prompts the following definition:

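12.1 Definition. A numerical function $f$ on the measure space $(\Omega, \mathscr{A}, \mu)$ is called
($\mu$-)integrable if it is $\mathscr{A}$-measurable and the integrals $\int f^+ \, d\mu$, $\int f^- \, d\mu$ are real
numbers. Then
$$(12.1) \qquad \int f \, d\mu := \int f^+ \, d\mu - \int f^- \, d\mu$$
is called the ($\mu$-)integral of $f$ (over $\Omega$).
If for some reason one wants to put the variable $\omega \in \Omega$ into evidence, one also
writes
$$\int f(\omega) \, \mu(d\omega) \qquad \text{or} \qquad \int f(\omega) \, d\mu(\omega).$$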

Remarks. 1. The right side of (12.1) is meaningful for measurable $f$ if at least
one of $f^+$, $f^-$ has a real integral. One says that then $f$ is quasi-integrable or that
the integral of $f$ exists and one uses (12.1) to define $\int f \, d\mu \in \overline{\mathbb{R}}$. Only occasionally
will we be concerned with this obvious generalization.
2. In the special case $\mu = \lambda^d$ we speak of Lebesgue integrable functions (on $\mathbb{R}^d$)
and of their Lebesgue integrals. If a Borel measure $\mu_F$ on $\mathbb{R}^d$ is described with the
help of a measure-generating function $F$ on $\mathbb{R}^d$ (cf. §6), the $\mu_F$-integrable functions
$f$ on $\mathbb{R}^d$ are called Lebesgue-Stieltjes integrable (or Stieltjes integrable) with
respect to $F$. One speaks of its (Lebesgue-)Stieltjes integral and writes $\int f \, dF$
instead of $\int f \, d\mu_F$. The general theory of measure and integration has however
displaced this terminology and the notation $\int f \, dF$, despite their historical significance.
Let us now summarize the most important properties of the conceptual edifice
just built:

12.2 Theorem. Each of the following four statements is equivalent to the integrability
of the measurable numerical function $f$ on $\Omega$:
(a) $f^+$ and $f^-$ are integrable.
(b) There are integrable functions $u \ge 0$, $v \ge 0$ such that $f = u - v$. (Note that the
last equality entails that $u(\omega) - v(\omega)$ is defined (in $\overline{\mathbb{R}}$) for every $\omega \in \Omega$.)
(c) There is an integrable function $g$ with $|f| \le g$.
(d) $|f|$ is integrable.

From (b) follows: $\int f \, d\mu = \int u \, d\mu - \int v \, d\mu$.

Proof. What has to be shown is the equivalence of (a) through (d), since (a) constitutes
the definition of $f$ being integrable.
(a)$\Rightarrow$(b): According to (9.8), $u := f^+$ and $v := f^-$ do the job required in (b).
(b)$\Rightarrow$(c): Because the integral is additive on $E^*$, along with $u$ and $v$, $u + v$ is
also integrable. Since $f = u - v \le u \le u + v$ and $-f = v - u \le v \le u + v$, the
function $g := u + v$ is as required.
(c)$\Rightarrow$(d): This follows from the isotoneity of the integral on $E^*$ and the fact
that $|f| \in E^*$ (Theorems 11.6 and 9.8): $\int |f| \, d\mu \le \int g \, d\mu < +\infty$.
(d)$\Rightarrow$(a): Upon recalling that $f^+ \le |f|$ and $f^- \le |f|$, this too follows from the
isotoneity of the integral on $E^*$.
In (b), $f = u - v = f^+ - f^-$ and so $u + f^- = v + f^+$, which via (11.7)
yields $\int u \, d\mu + \int f^- \, d\mu = \int v \, d\mu + \int f^+ \, d\mu$ and therewith the last assertion of the
theorem, since all the integrals here are finite. $\square$

12.3 Theorem. Let $f$ and $g$ be integrable numerical functions on $\Omega$, $\alpha \in \mathbb{R}$. Then
the functions $\alpha f$ and, if it is everywhere defined on $\Omega$, $f + g$ are integrable, and
satisfy
$$(12.2) \qquad \int (\alpha f) \, d\mu = \alpha \int f \, d\mu \qquad \text{and} \qquad \int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu.$$
Furthermore, the functions
$$f \vee g \qquad \text{and} \qquad f \wedge g$$
are integrable.

Proof. The claims regarding $\alpha f$ follow from (11.6), since
$$(\alpha f)^+ = \alpha f^+, \quad (\alpha f)^- = \alpha f^- \quad \text{if } \alpha \ge 0, \text{ and}$$
$$(\alpha f)^+ = |\alpha| f^-, \quad (\alpha f)^- = |\alpha| f^+ \quad \text{if } \alpha \le 0.$$
Regarding $f + g$, we argue as follows: from $f = f^+ - f^-$ and $g = g^+ - g^-$ follows
$f + g = f^+ + g^+ - (f^- + g^-)$. (11.7) insures that $u := f^+ + g^+$ and $v := f^- + g^-$
are integrable. Then the claims about $f + g$ follow from the equality $f + g = u - v$
via 12.2. Finally, $|f \vee g| \le |f| + |g|$ and $|f \wedge g| \le |f| + |g|$, and we know that
$|f| + |g|$ is integrable. The integrability of the measurable functions $f \vee g$ and $f \wedge g$
follows then from these inequalities and part (c) of 12.2. $\square$

12.4 Theorem. For any integrable numerical functions $f$ and $g$ on $\Omega$
$$(12.3) \qquad f \le g \implies \int f \, d\mu \le \int g \, d\mu;$$
$$(12.4) \qquad \Bigl| \int f \, d\mu \Bigr| \le \int |f| \, d\mu.$$
Proof. From $f \le g$ follows $f^+ \le g^+$ and $f^- \ge g^-$, and from these inequalities and
the isotoneity of the integral on $E^*$ follows (12.3). Because $f \le |f|$ and $-f \le |f|$,
(12.4) follows from the first equality in (12.2) and from (12.3), with $|f|$ in the role
of $g$ there. $\square$

The relevant properties are particularly clearly perceptible when we consider
only real-valued integrable functions. To aid in that we define
$$(12.5) \qquad \mathscr{L}^1(\mu) := \text{the set of all } \mu\text{-integrable real functions on } \Omega.$$
Using this widespread notation it follows immediately from Theorem 12.3 and
from (12.3) that: With respect to the operations
$$(f + g)(\omega) := f(\omega) + g(\omega) \qquad \text{and} \qquad (\alpha f)(\omega) := \alpha f(\omega) \qquad (\omega \in \Omega)$$
of pointwise addition and multiplication by scalars $\alpha \in \mathbb{R}$, $\mathscr{L}^1(\mu)$ is a vector space
over $\mathbb{R}$, and on it $f \mapsto \int f \, d\mu$ is an (isotone) linear form.

Examples. 1. Let $(\Omega, \mathscr{A})$ be any measurable space, $\varepsilon_\omega$ the measure on $\mathscr{A}$ defined
by unit mass at $\omega \in \Omega$. According to Example 1 of §11, the $\varepsilon_\omega$-integrable functions
are just the $\mathscr{A}$-measurable numerical functions $f$ on $\Omega$ with $|f(\omega)| < +\infty$. For them
$$\int f \, d\varepsilon_\omega = f(\omega).$$
2. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space defined in Example 2 of §11, $\mu(\{n\}) = \alpha_n$
for $n \in \mathbb{N}$. From what was shown there it follows that the $\mu$-integrable functions
$f : \Omega \to \overline{\mathbb{R}}$ are precisely those for which
$$\sum_{n=1}^{\infty} |f(n)| \alpha_n < +\infty$$
and for such an $f$
$$\int f \, d\mu = \sum_{n=1}^{\infty} f(n) \alpha_n.$$
3. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space defined in Examples 2 and 1 of §3. A function
$f : \Omega \to \overline{\mathbb{R}}$ is then $\mu$-integrable if and only if it is equal to a real constant $\alpha$
throughout the complement of some countable subset of $\Omega$. From Example 4 of §11
we have $\int f \, d\mu = \alpha$ for such an $f$.
4. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with $\mu(\Omega) < +\infty$. Then every constant real
function, and consequently after 12.2, every bounded, measurable real function
on $\Omega$ is $\mu$-integrable.
5. Let $\mu$ and $\nu$ be measures on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$. A numerical function $f$ on $\Omega$
is $(\mu + \nu)$-integrable if and only if it is both $\mu$- and $\nu$-integrable, and in this case
$$\int f \, d(\mu + \nu) = \int f \, d\mu + \int f \, d\nu.$$
In fact: For every non-negative $\mathscr{A}$-measurable function $g$ on $\Omega$, $\int g \, d(\mu + \nu) =
\int g \, d\mu + \int g \, d\nu$ holds by Example 3 in §11. Applied to $g := |f|$ this and 12.2 prove
the integrability claim, and applied to $g := f^+$ and $g := f^-$ it implies the claimed
equality. In particular
$$\mathscr{L}^1(\mu + \nu) = \mathscr{L}^1(\mu) \cap \mathscr{L}^1(\nu)$$
is valid.
We can now free ourselves of the restriction that functions always be integrated
over the whole $\Omega$. (11.5) insures that along with any pair of functions from $E^* =
E^*(\Omega, \mathscr{A})$ their product is also in $E^*$. So from $f \in E^*$ and $A \in \mathscr{A}$ follows $1_A f \in E^*$.
If $f$ is an integrable numerical function on $\Omega$, then so is $1_A f$, for every $A \in \mathscr{A}$:
Because of the trivial inequality $|1_A f| \le |f|$, this is immediate from 12.2 (and 9.4).
In the light of this the following seems natural:

12.4 Definition. If $f$ is a numerical function which is defined on $\Omega$ and either
belongs to $E^*$ or is $\mu$-integrable, we set
$$(12.6) \qquad \int_A f \, d\mu := \int 1_A f \, d\mu$$
for every $A \in \mathscr{A}$ and call it the $\mu$-integral of $f$ over $A$.

As a special case of this notation
$$(12.7) \qquad \int_\Omega f \, d\mu = \int f \, d\mu.$$

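The following rules of calculation are evident, for all $f, g$ which either lie in $E^*$
or are integrable:
$$(12.8) \qquad \int_{A\cup B} f \, d\mu + \int_{A\cap B} f \, d\mu = \int_A f \, d\mu + \int_B f \, d\mu \qquad \text{for all } A, B \in \mathscr{A}$$
and, as a special case,
$$(12.8') \qquad \int_{A\cup B} f \, d\mu = \int_A f \, d\mu + \int_B f \, d\mu \qquad \text{for all disjoint } A, B \in \mathscr{A};$$
$$(12.9) \qquad f \le g \implies \int_A f \, d\mu \le \int_A g \, d\mu.$$
One merely has to reflect on the definitions involved. Moreover, pursuant to the
discussion after (12.5),
$$(12.10) \qquad f \mapsto \int_A f \, d\mu$$
is a linear form on $\mathscr{L}^1(\mu)$, for each $A \in \mathscr{A}$.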
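
Rule (12.8) can be verified numerically in the simplest setting. The sketch below assumes a hypothetical four-point space with point masses, in the spirit of Example 2 of §11, where the integral over a set reduces to a finite weighted sum; the names `WEIGHTS` and `integral_over` and the chosen integrand are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical point masses mu({w}) on Omega = {0, 1, 2, 3}.
WEIGHTS = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(2), 3: Fraction(3)}

def integral_over(f, A):
    """Integral of 1_A * f with respect to mu, here a finite weighted sum over A."""
    return sum((f(w) * WEIGHTS[w] for w in A), Fraction(0))

def f(w):
    """A real integrand that changes sign, so both f+ and f- contribute."""
    return Fraction(w - 1)

A, B = {0, 1, 2}, {2, 3}

# Left and right sides of (12.8).
lhs = integral_over(f, A | B) + integral_over(f, A & B)
rhs = integral_over(f, A) + integral_over(f, B)
print(lhs, rhs)
```

The identity rests on the pointwise equality $1_{A\cup B} + 1_{A\cap B} = 1_A + 1_B$, which is what "reflecting on the definitions" amounts to here.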

But we can get at integrals over sets in $\mathscr{A}$ in a different way, namely by considering
the restriction $\mu_A$ of the given measure $\mu$ to the trace $\sigma$-algebra $A \cap \mathscr{A}$.
That one is thereby led to the same result is the content of

12.5 Lemma. Let $A \in \mathscr{A}$ and for every function $f$ on $\Omega$ which either lies in $E^*$
or is $\mu$-integrable let $f'$ denote the restriction of $f$ to $A$, and $\mu_A$ the restriction
of $\mu$ to $A \cap \mathscr{A}$. Then
$$(12.11) \qquad \int f' \, d\mu_A = \int_A f \, d\mu.$$

Proof. First consider $f \in E^*(\Omega, \mathscr{A})$. Then $f' \in E^*(A, A \cap \mathscr{A})$ since
$$(f')^{-1}(B) = A \cap f^{-1}(B)$$
holds for all Borel sets $B$ in $\overline{\mathbb{R}}$ (cf. 11.6). For the function $1_A f \in E^*$ there is
a sequence $(u_n)$ of $\mathscr{A}$-elementary functions satisfying $u_n \uparrow 1_A f$. The sequence $(u_n')$
of restrictions to $A$ obviously consists of $A \cap \mathscr{A}$-elementary functions that satisfy
$u_n' \uparrow f'$, from all of which follows that
$$(12.12) \qquad \int_A f \, d\mu = \sup_{n\in\mathbb{N}} \int u_n \, d\mu \qquad \text{and} \qquad \int f' \, d\mu_A = \sup_{n\in\mathbb{N}} \int u_n' \, d\mu_A.$$
Since $0 \le u_n \le 1_A f$, $u_n = 0$ in $\complement A$, so $u_n = 1_A u_n$ and consequently
$$u_n = \sum_{i=1}^{k_n} \alpha_i 1_{A_i}$$
for appropriate (depending on $n$) sets $A_i \in \mathscr{A}$ which are all subsets of $A$, and
appropriate (also $n$-dependent) real coefficients $\alpha_i \ge 0$, and $k_n \in \mathbb{N}$. It follows
that
$$u_n' = \sum_{i=1}^{k_n} \alpha_i 1_{A_i}'.$$
(Notice that for $Q \subset A$, the restriction $1_Q'$ coincides with the indicator function
with respect to $A$ of $Q$.) From the last two equalities we see that
$$\int u_n \, d\mu = \int u_n' \, d\mu_A \qquad \text{for all } n \in \mathbb{N},$$
because each integral equals $\sum_{i=1}^{k_n} \alpha_i \mu(A_i)$, and from these equalities and (12.12)
follows (12.11) for $f \in E^*(\Omega, \mathscr{A})$.
If $f$ is $\mu$-integrable the preceding can be applied to both $f^+$ and $f^-$. All integrals
are finite and it is obvious that $(f')^+ = (f^+)'$, $(f')^- = (f^-)'$, so (12.11) follows
from linearity of the integral. $\square$

In a final step of generalization let us note that we can conversely proceed
from an $A \in \mathscr{A}$ and a function $f$ on $A$ which either lies in $E^*(A, A \cap \mathscr{A})$ or is
$\mu_A$-integrable to define the $\mu$-integral of $f$ over $A$ via
$$(12.13) \qquad \int_A f \, d\mu := \int f \, d\mu_A$$
and in the second case to say that $f$ is also $\mu$-integrable over $A$. With the aid of
Lemma 12.5 we thereby get:

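12.6 Corollary. A numerical function $f$ defined on a set $A \in \mathscr{A}$ is $\mu$-integrable over $A$ if the function defined on the whole of $\Omega$ by

$f_A(\omega) := f(\omega)$ if $\omega \in A$,   $f_A(\omega) := 0$ if $\omega \in \Omega \setminus A$

is $\mu$-integrable. In this case

$\int_A f\,d\mu = \int f_A\,d\mu = \int f\,d\mu_A.$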


From this discussion we see that a $\mu$-integral over a set $A \in \mathscr{A}$ is nothing other than a $\mu_A$-integral over the new base space $A$. It can also be thought of as a $\mu$-integral over $\Omega$ employing the integrands $f_A$.
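
The three descriptions of the integral over $A$ can be compared on a toy example. The following is only a minimal sketch, assuming a hypothetical four-point space in which every subset is measurable and the measure is given by point weights; the names `weights`, `f`, `A` and `integral` are illustrative and not part of the text.

```python
from fractions import Fraction

# Hypothetical finite measure space: Omega = {0, 1, 2, 3}, all subsets measurable,
# mu determined by the point weights below (assumptions made for illustration).
weights = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(2), 3: Fraction(0)}
f = {0: Fraction(3), 1: Fraction(-1), 2: Fraction(5), 3: Fraction(7)}
A = {1, 2}

def integral(func, mu):
    """Integral on a finite space: sum of func(w) * mu({w}) over the points of mu."""
    return sum(func[w] * mu[w] for w in mu)

# (i) integral over Omega of 1_A * f with respect to mu
int_indicator = integral({w: (f[w] if w in A else Fraction(0)) for w in weights}, weights)

# (ii) integral of the restriction f' with respect to the restricted measure mu_A
mu_A = {w: weights[w] for w in A}
f_restricted = {w: f[w] for w in A}
int_restricted = integral(f_restricted, mu_A)

# (iii) integral over Omega of the zero-extension f_A of the function defined on A
f_A = {w: (f_restricted[w] if w in A else Fraction(0)) for w in weights}
int_extension = integral(f_A, weights)
```

In this finite setting all three numbers coincide, in line with Lemma 12.5 and Corollary 12.6.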

Exercises.
1. Characterize the functions $u \in E(\Omega, \mathscr{A})$ which are $\mu$-integrable.
2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space. The indicator function $1_A$ of a set $A \in \mathscr{A}$ is $\mu$-integrable just when $\mu(A) < +\infty$. Such sets are called $\mu$-integrable, and $\mathscr{R}$ will denote their totality. Show that $\mathscr{R}$ is an ideal in the ring $\mathscr{A}$ (cf. Exercise 4 in §1); in particular, $R \in \mathscr{R}$ and $A \in \mathscr{A}$ imply $A \cap R \in \mathscr{R}$. For a $\sigma$-finite measure $\mu$ a converse also holds: $A \subset \Omega$ and $A \cap R \in \mathscr{R}$ for all $R \in \mathscr{R}$ implies $A \in \mathscr{A}$.
3. Is the Dirichlet jump function from Exercise 2, §10 $\lambda^1$-integrable?
4. Consider the measurable space $(\Omega, \mathscr{A})$ from Example 4 of §11. On it is defined the measure $\mu$ which assigns countable sets the value 0, uncountable sets (from $\mathscr{A}$) the value $+\infty$. Determine all the $\mu$-integrable functions and their integrals.
5. Let $(\Omega, \mathscr{A}, \mu)$ be any measure space, $(A_n)$ a sequence of pairwise disjoint sets from $\mathscr{A}$, $A$ their union, $f$ a numerical function on $A$. Show that $f$ is $\mu$-integrable over $A$ if and only if it is $\mu$-integrable over each $A_n$ and $\sum_{n=1}^{\infty} \int_{A_n} |f|\,d\mu < +\infty$.
6. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space with $\mu$ finite. Show that every real function $f$ on $\Omega$ which is the uniform limit of a sequence $(f_n)$ in $\mathscr{L}^1(\mu)$ itself belongs to $\mathscr{L}^1(\mu)$. Why does this conclusion fail for every non-finite $\mu$ which is $\sigma$-finite? (Hint: Construct a sequence $(g_n)$ in $\mathscr{L}^1(\mu)$ with $0 \le g_n \le 1$ and $\int g_n\,d\mu \ge n^2$ for each $n \in \mathbb{N}$ and then consider $f_n := \sum_{j=1}^{n} j^{-2} g_j$.)

§13. Almost everywhere prevailing properties

For the further construction of the theory the concept of a negligible set, already
frequently mentioned in Chapter I, will now play an important role. We recall:
$N \subset \Omega$ is called a ($\mu$-)nullset if $N \in \mathscr{A}$ and $\mu(N) = 0$. The union of every sequence of $\mu$-nullsets is again one (3.10), as is every set in $\mathscr{A}$ which is contained in a $\mu$-nullset, thanks to isotoneity (cf. Exercise 5 in §3).

13.1 Definition. Let $\eta$ be a property of points in $\Omega$: every $\omega \in \Omega$ either enjoys property $\eta$ or does not. We say that "($\mu$-)almost all points of $\Omega$ have property $\eta$" or "$\eta$ prevails ($\mu$-)almost everywhere in $\Omega$" if there is a $\mu$-nullset $N$ such that all points of $\complement N$ enjoy property $\eta$.

Be careful: It is not required that the set $N_\eta$ of all $\omega \in \Omega$ which do not enjoy property $\eta$ be a $\mu$-nullset. Indeed, generally $N_\eta$ may not belong to $\mathscr{A}$. For example, if $A$ is a subset of $\Omega$ which does not belong to $\mathscr{A}$ and $\eta$ is the property "$\omega$ is a point of $A$", then $N_\eta = \complement A$ is not in $\mathscr{A}$.
Examples of properties $\eta$ which will come up in the sequel are: Equality of the values at a point $\omega \in \Omega$ of two functions $f$ and $g$ which are defined on $\Omega$, finiteness of the value at $\omega \in \Omega$ of a function $f$, etc. Corresponding to these we have the following modes of speaking: $f$ and $g$ are ($\mu$-)almost everywhere equal on $\Omega$, in symbols
$f = g$   ($\mu$-)almost everywhere;
$f$ is ($\mu$-)almost everywhere finite, in symbols
$|f| < +\infty$   ($\mu$-)almost everywhere;
$f$ is ($\mu$-)almost everywhere bounded, meaning that for some $\alpha \in \mathbb{R}$
$|f| \le \alpha$   ($\mu$-)almost everywhere,
etc.
The theorems that follow explicate the significance that this new concept has
for integration theory:

13.2 Theorem. For every $f \in E^*(\Omega, \mathscr{A})$, that is, (cf. 11.6) for every $\mathscr{A}$-measurable, non-negative numerical function $f$

$\int f\,d\mu = 0 \iff f = 0$   $\mu$-almost everywhere.

Proof. Since $f$ is measurable, the set
$N := \{f \ne 0\} = \{f > 0\}$
lies in $\mathscr{A}$. What has to be shown is that
$\int f\,d\mu = 0 \iff \mu(N) = 0.$
Suppose $\int f\,d\mu = 0$. For each $n \in \mathbb{N}$ the set $A_n := \{f \ge n^{-1}\}$ also lies in $\mathscr{A}$ and $A_n \uparrow N$, so that $\mu(N) = \lim_{n\to\infty} \mu(A_n)$ and it is enough to show that $\mu(A_n) = 0$ for every $n$. But obviously $f \ge n^{-1} 1_{A_n}$, entailing that $0 = \int f\,d\mu \ge n^{-1}\mu(A_n) \ge 0$, that is, $\mu(A_n) = 0$, as wanted.
Suppose conversely that $\mu(N) = 0$. Each of the functions $u_n := n 1_N$ ($n \in \mathbb{N}$) lies in $E(\Omega, \mathscr{A})$ and satisfies $\int u_n\,d\mu = 0$. Setting $g := \sup_n u_n$ gives a function $g \in E^*(\Omega, \mathscr{A})$ such that $u_n \uparrow g$, so $\int g\,d\mu = \sup_n \int u_n\,d\mu = 0$. Finally, since evidently $f \le g$, $0 \le \int f\,d\mu \le \int g\,d\mu = 0$ gives the desired equality $\int f\,d\mu = 0$. □

13.3 Corollary. Every $\mathscr{A}$-measurable numerical function $f$ on $\Omega$ is integrable over every $\mu$-nullset $N$, and
$\int_N f\,d\mu = 0.$

Proof. If $f \ge 0$, this claim follows from the theorem, because each function $1_N f$ lies in $E^*(\Omega, \mathscr{A})$ and is almost everywhere 0. In turn, application of this to $f^+$ and $f^-$ delivers the full claim. □

13.4 Theorem. Let $f$, $g$ be $\mathscr{A}$-measurable numerical functions on $\Omega$ which are $\mu$-almost everywhere equal on $\Omega$. Then

(a) $f \ge 0$, $g \ge 0 \implies \int f\,d\mu = \int g\,d\mu$;
(b) $f$ integrable $\implies$ $g$ integrable and $\int f\,d\mu = \int g\,d\mu$.

Proof. (a): By hypothesis (and 9.3) $N := \{f \ne g\}$ is a $\mu$-nullset. From 13.3 then

$\int_N f\,d\mu = \int_N g\,d\mu = 0.$

On the other hand, for $M := \complement N$ we have $1_M f = 1_M g$ due to the definition of $N$, and so by (12.6)

$\int_M f\,d\mu = \int_M g\,d\mu.$

Adding integrals and using (12.8') leads to the conclusion in (a).
(b): The almost everywhere equality hypothesis entails that
$f^+ = g^+$ almost everywhere and $f^- = g^-$ almost everywhere.
From (a) then

$\int f^+\,d\mu = \int g^+\,d\mu$   and   $\int f^-\,d\mu = \int g^-\,d\mu.$

Because $f$ is integrable, what we have here are non-negative real numbers, showing that $g$ is integrable (part (a) of 12.2) and, upon subtracting the second equality from the first, we get the equality claimed in (b). □
Since, roughly speaking, all this shows that integrability and the integral of
a function are insensitive to (measurable) changes of the function on nullsets,
results proved earlier can easily be reformulated somewhat more sharply. For ex-
ample:

13.5 Corollary. Let the $\mathscr{A}$-measurable numerical functions $f$ and $g$ on $\Omega$ satisfy $|f| \le g$ $\mu$-almost everywhere. Then along with $g$, the function $f$ will also be $\mu$-integrable.

Proof. If we set $g' := g \vee |f|$, then $g'$ is measurable, $g' = g$ almost everywhere and $|f| \le g'$. From 13.4 part (b) we see first of all that $g'$ is integrable, and then from 12.2 $f$ is as well. □

Of special importance is the realization that integrability imposes limitations on how often a function can assume the values $\pm\infty$, or indeed any non-zero value. This is made precise in

13.6 Theorem. Every $\mu$-integrable numerical function $f$ on $\Omega$ is $\mu$-almost everywhere real-valued. Moreover, the set $\{f \ne 0\}$ is of $\sigma$-finite measure.

Here a set $A \in \mathscr{A}$ is said to possess $\sigma$-finite measure if it is the union of a sequence of sets in $\mathscr{A}$ each of which has finite measure. This means nothing other than that the restriction of $\mu$ to $A \cap \mathscr{A}$ is a $\sigma$-finite measure.

Proof. The set $N := \{|f| = +\infty\}$ lies in $\mathscr{A}$ and for every real $\alpha > 0$ satisfies $\alpha 1_N \le |f|$. Consequently, $\alpha\mu(N) \le \int |f|\,d\mu < +\infty$, from which follows the first claim, $\mu(N) = 0$. To prove the second claim we pass over to $|f|$ and thereby assume that $f \ge 0$. Then
$\{f \ne 0\} = \{f > 0\} = \bigcup_{n \in \mathbb{N}} \{f \ge n^{-1}\}.$

Every set $A_n := \{f \ge n^{-1}\} = \{nf \ge 1\}$ satisfies $1_{A_n} \le nf$ and therewith

$\mu(A_n) \le n \int f\,d\mu < +\infty.$

This holds for all $n \in \mathbb{N}$, confirming the $\sigma$-finiteness claim. □
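
The estimate $\mu(A_n) \le n \int f\,d\mu$ used in the proof can be checked by hand on a small example. The sketch below assumes a hypothetical four-point measure space described by point weights; the names `weights`, `f` and `measure_of_level_set` are introduced only for illustration.

```python
from fractions import Fraction

# Illustrative finite measure space given by point weights (an assumption of this sketch).
weights = {"a": Fraction(4), "b": Fraction(1, 2), "c": Fraction(10), "d": Fraction(3)}
# A non-negative function; the point "c" carries large mass but f vanishes there.
f = {"a": Fraction(1, 4), "b": Fraction(2), "c": Fraction(0), "d": Fraction(1, 10)}

integral_f = sum(f[w] * weights[w] for w in weights)

def measure_of_level_set(n):
    """mu(A_n) for the level set A_n = {f >= 1/n}."""
    return sum(weights[w] for w in weights if f[w] >= Fraction(1, n))

# Check mu(A_n) <= n * (integral of f) for the first few n.
bounds_hold = all(measure_of_level_set(n) <= n * integral_f for n in range(1, 21))
```

Note that the point where $f$ vanishes never enters any $A_n$, however large its mass; only $\{f \ne 0\}$ is exhausted by the sets $A_n$.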

Theorem 13.6 has yet another consequence: Let $N$ be a $\mu$-nullset and $f$ a numerical function which is defined on $M := \complement N$ and is $M \cap \mathscr{A}$-measurable. Such a function is described as being a ($\mu$-)almost everywhere defined ($\mathscr{A}$-)measurable function. The function $f_M$ introduced in 12.6 extends it to an $\mathscr{A}$-measurable function on $\Omega$. Any other extension of $f$ to $\Omega$ must agree with $f_M$ almost everywhere. According to 13.4 therefore either every such extension is integrable or none is. In the first case moreover all extensions have the same $\mu$-integral. These observations justify the following definition:

13.7 Definition. Let $f$ be a $\mu$-almost everywhere defined, $\mathscr{A}$-measurable numerical function on $\Omega$. It will be called ($\mu$-)integrable if it can be extended to a ($\mu$-)integrable function $f'$ defined on the whole of $\Omega$. $\int f'\,d\mu$ will then be called the ($\mu$-)integral of $f$ and denoted $\int f\,d\mu$.
We will only occasionally be concerned with this extension of the integral con-
cept, but its utility is already shown by the following

Remark. Suppose $f$ and $g$ are integrable numerical functions on $\Omega$. According to 13.6 each is almost everywhere finite. Because the union of two nullsets is itself a nullset, there is a nullset $N$ such that both $|f(\omega)| < +\infty$ and $|g(\omega)| < +\infty$ for all $\omega \in \complement N$. But then
$\omega \mapsto f(\omega) + g(\omega)$   ($\omega \in \complement N$)
is an almost everywhere defined measurable function. This fact, in conjunction with what was shown above, shows that the explicit hypothesis made in 12.3 that $f + g$ be everywhere defined is of little significance. For two integrable numerical functions $f$ and $g$ on $\Omega$ the sum $f + g$ is almost everywhere defined, and in the sense of 13.7 integrable. The equality

$\int (f + g)\,d\mu = \int f\,d\mu + \int g\,d\mu$

prevails unrestrictedly.

Exercises.
1. The numerical functions $f$ and $g$ on the measure space $(\Omega, \mathscr{A}, \mu)$ satisfy $f = g$ $\mu$-almost everywhere. Show via an example that in general the $\mathscr{A}$-measurability of $g$ does not follow from that of $f$. Show however that in case $(\Omega, \mathscr{A}, \mu)$ is complete, the $\mathscr{A}$-measurability of $g$ is equivalent to that of $f$.
2. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $(\Omega, \mathscr{A}_0, \mu_0)$ its completion. Prove that $f : \Omega \to \overline{\mathbb{R}}$ is $\mathscr{A}_0$-measurable just if $\mathscr{A}$-measurable numerical functions $f_1$, $f_2$ on $\Omega$ exist with the properties $f_1 \le f \le f_2$ everywhere in $\Omega$ and $f_1 = f_2$ $\mu$-almost everywhere. If $f$ is $\mu_0$-integrable, then any functions $f_1$, $f_2$ with these properties are $\mu$-integrable, and $\int f_1\,d\mu = \int f_2\,d\mu = \int f\,d\mu_0$. (This supplements Exercise 7 in §5 and generalizes Exercise 1 in §10.)
3. Even if the $f$ in the preceding exercise is real-valued, the functions $f_1$, $f_2$ which were proved to exist there cannot always be chosen to be real-valued. Prove this for the case where $\Omega$ is any infinite set, $\mathscr{A} := \{\emptyset, \Omega\}$ and $\mu := 0$.

§14. The spaces $\mathscr{L}^p(\mu)$

According to 9.4 the product of two measurable functions is again measurable. By


contrast however the product of two integrable functions is not generally integrable,
as the next example shows:

Example. $(\Omega, \mathscr{A}, \mu)$ is the measure space described in Example 2 of §12 and Example 2 of §11, with $\alpha_n := n^{-p-1}$ for each $n \in \mathbb{N}$, where $1 < p < +\infty$. The identity function, $f(n) := n$ for all $n \in \mathbb{N}$, is integrable, but its $p$-th power is not. Thus for $p = 2$, $f^2 = f \cdot f$ is not integrable.
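
A numerical look at truncated sums makes the example plausible. The sketch below assumes, as the example suggests, that the measure on $\mathbb{N}$ assigns the mass $\alpha_n = n^{-p-1}$ to the point $n$, so that $\int f^k\,d\mu = \sum_n n^k \alpha_n$; the function name `truncated_integral` is hypothetical.

```python
def truncated_integral(power, p, terms):
    """Sum over n = 1..terms of n**power * alpha_n, where alpha_n = n**(-p - 1)."""
    return sum(float(n) ** (power - p - 1) for n in range(1, terms + 1))

p = 2
# Integral of f itself: partial sums of sum 1/n^2, which stay bounded.
integral_f_small = truncated_integral(1, p, 10**3)
integral_f_large = truncated_integral(1, p, 10**5)
# Integral of f^p: partial sums of the harmonic series, which keep growing.
integral_fp_small = truncated_integral(p, p, 10**3)
integral_fp_large = truncated_integral(p, p, 10**5)
```

Truncated sums prove nothing by themselves, of course; they merely display the convergence of $\sum n^{-p}$ and the divergence of $\sum n^{-1}$ on which the example rests.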

This observation suggests the investigation of those measurable functions $f$ on $\Omega$ for which $|f|^p$ is integrable.
In what follows $p$ will designate a real number, $p \ge 1$. For every $\mathscr{A}$-measurable numerical function $f$ on $\Omega$, $|f|$ and then also $|f|^p$ is measurable, because (adopting the usual convention that $(+\infty)^p := +\infty$) for every real $\alpha$
$\{|f|^p \ge \alpha\} = \Omega$ if $\alpha \le 0$,   $\{|f|^p \ge \alpha\} = \{|f| \ge \alpha^{1/p}\}$ if $\alpha > 0$.
For such an $f$
(14.1)   $N_p(f) := \left(\int |f|^p\,d\mu\right)^{1/p}$
is therefore defined. It satisfies $0 \le N_p(f) \le +\infty$ and, clearly,
(14.2)   $N_p(\alpha f) = |\alpha|\,N_p(f)$   for all $\alpha \in \mathbb{R}$.
Two deeper properties will now be established:

14.1 Theorem. Suppose $p > 1$ is a real number and $q > 1$ is defined by the equation

$\frac{1}{p} + \frac{1}{q} = 1.$

Then for any measurable numerical functions $f$, $g$ on $\Omega$

(14.3)   $N_1(fg) \le N_p(f)\,N_q(g)$   (HÖLDER's inequality).

Proof. It is clear from definition (14.1) that we may assume $f \ge 0$ and $g \ge 0$. Setting
$\sigma := N_p(f)$ and $\tau := N_q(g)$,
we can also assume that both these numbers are positive. For if, say $\sigma = 0$, then by 13.2 $f^p$, whence also $f$, is almost everywhere equal to 0. The same is then true of $fg$ (remember that $0 \cdot (+\infty) = 0$), so that again by 13.2 we have $N_1(fg) = 0$, and (14.3) holds. Once $\sigma$, $\tau$ are each positive, no loss of generality is incurred by assuming that each is also finite, which we now do.
Applying the mean-value theorem of the differential calculus to the function $\eta \mapsto (1 + \eta)^{1/p}$, there follows at once the well-known Bernoulli inequality

$(1 + \eta)^{1/p} \le \frac{\eta}{p} + 1$   for all $\eta \in \mathbb{R}_+$

or, writing $\xi := 1 + \eta$ and using $1 - \frac{1}{p} = \frac{1}{q}$,

$\xi^{1/p} \le \frac{\xi}{p} + \frac{1}{q}$   for all $\xi \ge 1$.

If now $x$ and $y$ are positive real numbers, then one of $xy^{-1}$ and $x^{-1}y$ is such a $\xi$. Inserting this $\xi$ into the last inequality (and reversing the roles of $p$ and $q$ if necessary), gives
$x^{1/p} y^{1/q} \le \frac{1}{p}x + \frac{1}{q}y.$
This inequality - really equivalent to the concavity of the (natural) logarithm function - holds as well for all $x, y \in \mathbb{R}_+$. If finally we take $x := (\sigma^{-1} f(\omega))^p$ and $y := (\tau^{-1} g(\omega))^q$ for an $\omega \in \{f < +\infty\} \cap \{g < +\infty\}$, we get

$\frac{1}{\sigma\tau} fg \le \frac{1}{p\sigma^p} f^p + \frac{1}{q\tau^q} g^q,$

valid throughout $\Omega$, since it trivially prevails as well in the complementary set $\{f = +\infty\} \cup \{g = +\infty\}$. Integration of this inequality leads at once to (14.3). □
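
Hölder's inequality can be tested numerically on finite measure spaces. The following is a minimal sketch under stated assumptions: the space has finitely many points, the measure is given by non-negative point weights, and functions are lists of real values; the helper names `n_p` and `holder_gap` are hypothetical.

```python
import random

def n_p(values, weights, p):
    """N_p(f) = (sum of |f(w)|^p * mu({w}))^(1/p) on a finite weighted space."""
    return sum(abs(v) ** p * m for v, m in zip(values, weights)) ** (1.0 / p)

def holder_gap(f, g, weights, p):
    """N_p(f) * N_q(g) - N_1(f g) with 1/p + 1/q = 1; (14.3) says this is >= 0."""
    q = p / (p - 1.0)
    fg = [a * b for a, b in zip(f, g)]
    return n_p(f, weights, p) * n_p(g, weights, q) - n_p(fg, weights, 1.0)

rng = random.Random(0)  # fixed seed so that the experiment is reproducible
smallest_gap = float("inf")
for _ in range(1000):
    size = rng.randint(1, 6)
    weights = [rng.uniform(0.0, 3.0) for _ in range(size)]
    f = [rng.uniform(-2.0, 2.0) for _ in range(size)]
    g = [rng.uniform(-2.0, 2.0) for _ in range(size)]
    p = rng.uniform(1.1, 5.0)
    smallest_gap = min(smallest_gap, holder_gap(f, g, weights, p))

# Equality is attained when g = |f|^(p-1), since then |g|^q = |f|^p.
f_eq = [1.0, 2.0, 0.5]
w_eq = [1.0, 0.5, 2.0]
p_eq = 3.0
g_eq = [abs(v) ** (p_eq - 1.0) for v in f_eq]
equality_gap = holder_gap(f_eq, g_eq, w_eq, p_eq)
```

Up to rounding error the gap is never negative in these trials, and it vanishes in the equality case; this is a plausibility check, not a substitute for the proof.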

14.2 Theorem. For all measurable numerical functions $f$ and $g$ on $\Omega$ whose sum $f + g$ is defined throughout $\Omega$, and for every $p \in [1, +\infty[$
(14.4)   $N_p(f + g) \le N_p(f) + N_p(g)$   (MINKOWSKI's inequality).

Proof. Since $|f + g| \le |f| + |g|$,

$N_p(f + g) \le N_p(|f| + |g|),$

which shows that we may assume $f \ge 0$ and $g \ge 0$. In case $p = 1$ there is then even equality in (14.4), by (11.7). Therefore, for the rest we can assume that $1 < p < +\infty$, and then again define $q$ by $p^{-1} + q^{-1} = 1$. We may further assume that both $N_p(f)$ and $N_p(g)$ are finite, that is, that $f^p$ and $g^p$ are integrable. 12.2(c) and the estimates
$(f + g)^p \le [2(f \vee g)]^p = 2^p [f^p \vee g^p] \le 2^p (f^p + g^p)$
then insure the integrability of $(f + g)^p$, that is, $N_p(f + g) < +\infty$. Now write

$\int (f + g)^p\,d\mu = \int (f + g)^{p-1} f\,d\mu + \int (f + g)^{p-1} g\,d\mu$

and apply Hölder's inequality to each integral on the right to get

$\int (f + g)^p\,d\mu \le N_q((f + g)^{p-1})\,N_p(f) + N_q((f + g)^{p-1})\,N_p(g),$

which thanks to the fact $(p - 1)q = p$ reads
$(N_p(f + g))^p \le (N_p(f + g))^{p-1}\,[N_p(f) + N_p(g)].$
The desired inequality (14.4) follows from this and the finiteness of $N_p(f + g)$. □
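
For counting measure on a two-point set, (14.4) is the familiar triangle inequality for vectors in the plane. The sketch below assumes a finite space with point weights and uses hypothetical helper names `n_p` and `minkowski_gap`.

```python
def n_p(values, weights, p):
    """N_p(f) = (sum of |f(w)|^p * mu({w}))^(1/p) on a finite weighted space."""
    return sum(abs(v) ** p * m for v, m in zip(values, weights)) ** (1.0 / p)

def minkowski_gap(f, g, weights, p):
    """N_p(f) + N_p(g) - N_p(f + g); (14.4) says this is >= 0 for p >= 1."""
    total = [a + b for a, b in zip(f, g)]
    return n_p(f, weights, p) + n_p(g, weights, p) - n_p(total, weights, p)

counting = [1.0, 1.0]  # counting measure on a two-point set

# Euclidean case p = 2 with the 3-4-5 triangle: 3 + 4 - 5.
gap_orthogonal = minkowski_gap([3.0, 0.0], [0.0, 4.0], counting, 2.0)

# Proportional functions (g = 2 f) give equality, here for p = 3.
gap_parallel = minkowski_gap([1.0, 2.0], [2.0, 4.0], counting, 3.0)
```

The first gap is strictly positive, while the second vanishes up to rounding, which reflects the equality case for proportional non-negative functions.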

14.3 Definition. A numerical function $f$ on $\Omega$ is called $p$-fold ($\mu$-)integrable or integrable of order $p$ or $p$th-power integrable, for some $p \in [1, +\infty[$, if $f$ is measurable and $|f|^p$ is $\mu$-integrable; that is, $f$ is measurable and $N_p(f) < +\infty$.
According to 12.2, 1-fold integrable functions are indeed just the integrable functions. In the case $p = 2$ we also speak of square-integrability.
It is immediate from the definition that a measurable function $f$ is $p$-fold integrable if and only if $|f|$ is $p$-fold integrable; equivalently, if and only if there is a $p$-fold integrable function $g \ge 0$ with $|f| \le g$. Further properties, already known to hold when $p = 1$, are codified in:

14.4 Theorem. Consider $p \in [1, +\infty[$ and $p$-fold integrable functions $f$ and $g$. Then for every $\alpha \in \mathbb{R}$
$\alpha f$, $f \vee g$ and $f \wedge g$
are $p$-fold integrable, and in case it is defined throughout $\Omega$, the function $f + g$ is $p$-fold integrable.

Proof. Because a function $f$ is $p$-fold integrable just if it is measurable and $N_p(f)$ is finite, the claims about $\alpha f$ and $f + g$ follow from (14.2) and (14.4). The $p$-fold integrability of $f \vee g$ and $f \wedge g$ then follows as in the case $p = 1$ from the estimates
$|f \vee g| \le |f| + |g|$ and $|f \wedge g| \le |f| + |g|$. □
14.5 Corollary. For $1 \le p < +\infty$ a numerical function $f$ on $\Omega$ is $p$-fold integrable just if its positive part $f^+$ and its negative part $f^-$ are both $p$-fold integrable.

Proof. Since $f = f^+ - f^-$, from the $p$-fold integrability of $f^+$ and $f^-$ follows that of $f$, by 14.4. The converse also follows from 14.4, and the equalities
$f^+ = f \vee 0$ and $f^- = (-f) \vee 0$. □
Now for each $p \in [1, +\infty[$ we define
(14.5)   $\mathscr{L}^p(\mu)$ := the set of all real $p$-fold $\mu$-integrable functions on $\Omega$.
Then from 14.4 we get the property, already known for $p = 1$:
(14.6)   $\mathscr{L}^p(\mu)$ is a vector space over $\mathbb{R}$.
In view of (14.5) real-valued $p$-fold integrable functions are also known as $\mathscr{L}^p$-functions.
From (14.3) we immediately get:

14.6 Theorem. The product of a $p$-fold and a $q$-fold integrable numerical function is integrable (where $1 < p < +\infty$ and $\frac{1}{p} + \frac{1}{q} = 1$).

In particular, the product of two square-integrable functions is always integrable.

14.7 Corollary. If $1 \le p < +\infty$ and the measure $\mu$ is finite, then every $p$-fold integrable function is integrable.

Proof. Because $\mu(\Omega) < +\infty$, the constant function 1 is $q$-fold integrable on $\Omega$, for each $q \in [1, +\infty[$. So the present claim follows from 14.6 upon writing any $p$-fold integrable $f$ as $f \cdot 1$. □

Remark. 1. Without the hypothesis $\mu(\Omega) < +\infty$ the conclusion of 14.7 may fail. For example, in Example 2 of §12 choose the measure $\mu$ by requiring $\alpha_n = n^{-1/2}$ for all $n$. Then the function $f$ defined on $\Omega = \mathbb{N}$ by $f(n) := \alpha_n$ for all $n \in \mathbb{N}$ lies in $\mathscr{L}^2(\mu)$ but not in $\mathscr{L}^1(\mu)$.
More generally when $\mu$ is finite, from $p$-fold integrability follows $p'$-fold integrability for every $p' \in [1, p]$ - cf. Exercise 3 below.
Related to 14.6 we have:

14.8 Theorem. Let $1 \le p < +\infty$, $f : \Omega \to \overline{\mathbb{R}}$ $p$-fold integrable and $g : \Omega \to \overline{\mathbb{R}}$ a measurable almost-everywhere bounded function. Then the product $fg$ is $p$-fold integrable.

Proof. The boundedness hypothesis on $g$ means that there is an $\alpha \in \mathbb{R}_+$ such that $|g| \le \alpha$ almost everywhere. Then of course $|fg| \le \alpha|f|$ almost everywhere. Because of the $p$-fold integrability of $\alpha|f|$, the claim follows from this inequality via 13.5. □

In particular, along with every $f \in \mathscr{L}^p(\mu)$ and $A \in \mathscr{A}$ the function $1_A f$ lies in $\mathscr{L}^p(\mu)$.

It seems natural to formulate the analog of Theorem 14.6 in case $p = 1$. To this end we define
(14.7)   $\mathscr{L}^\infty(\mu)$ := the set of all real, $\mathscr{A}$-measurable, $\mu$-almost everywhere bounded functions on $\Omega$.
One immediately perceives that $\mathscr{L}^\infty(\mu)$ is also a vector space over $\mathbb{R}$. The union of Theorems 14.6 and 14.8 results in the assertion
(14.8)   $f \in \mathscr{L}^p(\mu)$, $g \in \mathscr{L}^q(\mu)$, $1 \le p \le +\infty$, $p^{-1} + q^{-1} = 1 \implies fg \in \mathscr{L}^1(\mu)$,
where of course the convention $(+\infty)^{-1} = 0$ has to be recalled. Functions which are ($\mu$-)almost everywhere bounded are also called ($\mu$-)essentially bounded.

In closing it may be noted that for counting measure on $\mathscr{P}(\Omega)$, with $\Omega := \{1, \ldots, n\}$, (14.3) and (14.4) go over into discrete versions of the Hölder and Minkowski inequalities. When $p = 2$ we get the Cauchy-Schwarz inequality and the triangle inequality for the euclidean norm, familiar from linear algebra and analytic geometry.

Remark. 2. Definition (14.1) of $N_p$ obviously makes sense for every real $p > 0$, thus also for those $0 < p < 1$ heretofore excluded from consideration. For these $p$, however, the fundamental properties (14.3) and (14.4) are lost and the $q$ determined by $p^{-1} + q^{-1} = 1$ is negative. (On this point, compare Exercise 5 below.) Remark 3 at the end of §15 will show that pathologies occur when $0 < p < 1$. All subsequent work will therefore be restricted to the case $p \ge 1$.
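
That (14.4) is lost for $0 < p < 1$ can be seen already with two points. The sketch below assumes counting measure on a two-point set and the exponent $p = 1/2$; the helper name `n_p` is hypothetical.

```python
def n_p(values, weights, p):
    """N_p(f) = (sum of |f(w)|^p * mu({w}))^(1/p); here also used for 0 < p < 1."""
    return sum(abs(v) ** p * m for v, m in zip(values, weights)) ** (1.0 / p)

counting = [1.0, 1.0]          # counting measure on a two-point set
f = [1.0, 0.0]                 # indicator of the first point
g = [0.0, 1.0]                 # indicator of the second point
p = 0.5

lhs = n_p([a + b for a, b in zip(f, g)], counting, p)   # N_p(f + g) = (1 + 1)^2
rhs = n_p(f, counting, p) + n_p(g, counting, p)         # N_p(f) + N_p(g) = 1 + 1
```

Here $N_p(f + g)$ exceeds $N_p(f) + N_p(g)$, so the Minkowski inequality fails for this exponent.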

Exercises.
1. Let $(\Omega, \mathscr{A}, \mu)$ be a finite measure space, $1 \le p < +\infty$. Show that every function $f$ on $\Omega$ which is the uniform limit of a sequence $(f_n)$ from $\mathscr{L}^p(\mu)$ itself lies in $\mathscr{L}^p(\mu)$.
2. For an arbitrary measure space $(\Omega, \mathscr{A}, \mu)$ and $1 \le p < +\infty$, show that a real function $f$ on $\Omega$ is $p$-fold integrable if and only if $f|f|^{p-1}$ is integrable. (In the "if" direction, measurability of $f$ itself is not part of the hypothesis.)
3. Let $(\Omega, \mathscr{A}, \mu)$ be a finite measure space, $1 \le p' \le p < +\infty$, and $f$ a measurable numerical function on $\Omega$. Then
$N_{p'}(f) \le N_p(f)\,\mu(\Omega)^{1/p' - 1/p}$   and   $\mathscr{L}^p(\mu) \subset \mathscr{L}^{p'}(\mu)$.
4. For any finite number of measurable numerical functions $f_1, \ldots, f_n$ on a measure space and real numbers $p_1, \ldots, p_n \in\ ]1, +\infty[$ satisfying $\sum_{j=1}^{n} p_j^{-1} = 1$, prove the generalized Hölder inequality
$N_1(f_1 \cdots f_n) \le N_{p_1}(f_1) \cdots N_{p_n}(f_n).$
5. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $p \in\ ]0, 1[$ and $q < 0$ be defined by $p^{-1} + q^{-1} = 1$. Consider non-negative $f \in \mathscr{L}^p(\mu)$ and a measurable $g : \Omega \to\ ]0, +\infty[$ satisfying $0 < N_q(g) := \left(\int g^q\,d\mu\right)^{1/q} < +\infty$. By an appropriate application of Hölder's inequality show that

$\int fg\,d\mu \ge N_p(f)\,N_q(g).$

Infer that
$N_p(f + g) \ge N_p(f) + N_p(g)$
and find an example to show that generally equality does not prevail here.

§15. Convergence theorems

Again consider $1 \le p < +\infty$ and a measure space $(\Omega, \mathscr{A}, \mu)$. The function $N_p$ is real-valued on the vector space $\mathscr{L}^p(\mu)$, and in fact a semi-norm, that is, a mapping
$N_p : \mathscr{L}^p(\mu) \to \mathbb{R}_+$
having properties (14.2) and (14.4). From the second of those properties, the Minkowski inequality, it follows that the function
$d_p(f, g) := N_p(f - g)$,   $f, g \in \mathscr{L}^p(\mu)$,
satisfies the triangle inequality, that is,
$d_p(f, g) \le d_p(f, h) + d_p(h, g)$   for all $f, g, h \in \mathscr{L}^p(\mu)$.
Evidently $d_p$ thus has all the properties of a metric on $\mathscr{L}^p(\mu)$, with one exception: According to 13.2 and 13.3
$d_p(f, g) = 0$
is not equivalent to $f = g$, but only to
$f = g$   $\mu$-almost everywhere.
Distance-like functions without the property that "distance between two elements equal to zero entails equality of the elements" are usually called pseudometrics. $N_p$ and $d_p$ are called the $\mathscr{L}^p$-semi-norm and the $\mathscr{L}^p$-pseudometric, also the semi-norm or the pseudometric of convergence in the $p$th mean or of $\mathscr{L}^p$-convergence. To elaborate: If $(f_n)$ is a sequence in $\mathscr{L}^p(\mu)$, then it is said to converge in $p$th mean to $f \in \mathscr{L}^p(\mu)$, or to be $\mathscr{L}^p$-convergent if
(15.1)   $\lim_{n\to\infty} N_p(f_n - f) = \lim_{n\to\infty} d_p(f_n, f) = 0.$
By virtue of what was noted above, the limit function $f$ is only almost everywhere uniquely determined. (14.2) and (14.4) insure that linear computations with convergent sequences are like those we are accustomed to involving real numbers. In immediately apprehensible symbolic form these say:
$f_n \to f$, $g_n \to g \implies \alpha f_n + \beta g_n \to \alpha f + \beta g$
for any $\alpha, \beta \in \mathbb{R}$.
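
The failure of $d_p$ to separate points is visible as soon as the measure has a non-empty nullset. The sketch below assumes a hypothetical three-point space in which the third point has mass zero; the function name `d_p` mirrors the notation of the text, while the data are illustrative.

```python
def d_p(f, g, weights, p):
    """d_p(f, g) = N_p(f - g) on a finite space whose measure has the given point weights."""
    return sum(abs(a - b) ** p * m for a, b, m in zip(f, g, weights)) ** (1.0 / p)

weights = [1.0, 2.0, 0.0]      # the third point forms a nullset
f = [1.0, 2.0, 3.0]
g = [1.0, 2.0, -50.0]          # differs from f only on the nullset
h = [0.0, 5.0, 7.0]

distance_fg = d_p(f, g, weights, 2.0)
triangle_ok = d_p(f, g, weights, 2.0) <= d_p(f, h, weights, 2.0) + d_p(h, g, weights, 2.0)
```

The functions `f` and `g` are different, yet their distance is zero because they agree almost everywhere.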

From (14.4) also follows a "triangle inequality from below"

(15.2)   $|N_p(f) - N_p(g)| \le N_p(f \pm g)$   for all $f, g \in \mathscr{L}^p(\mu)$,

simply because

$N_p(f) = N_p(f - g + g) \le N_p(f - g) + N_p(g),$

$N_p(-g) = N_p(g)$, and the roles of $f$ and $g$ can be interchanged throughout.
In case $p = 1$ we speak simply of convergence in mean, and when $p = 2$ of mean-square convergence.
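Taking note of the inequalities

(15.3)   $\left|\int_A f\,d\mu - \int_A g\,d\mu\right| \le \int_A |f - g|\,d\mu \le N_1(f - g),$

valid for all $A \in \mathscr{A}$, $f, g \in \mathscr{L}^1(\mu)$, and
(15.4)   $|N_p(1_A f) - N_p(1_A g)| \le N_p((f - g)1_A) \le N_p(f - g)$
valid for all $A \in \mathscr{A}$, $f, g \in \mathscr{L}^p(\mu)$, we immediately get: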

15.1 Theorem. Every sequence $(f_n)$ in $\mathscr{L}^1(\mu)$ (resp., in $\mathscr{L}^p(\mu)$) which converges in mean (resp., in $p$th mean) to a function $f$ from $\mathscr{L}^1(\mu)$ (resp., from $\mathscr{L}^p(\mu)$) also satisfies

(15.5)   $\lim_{n\to\infty} \int_A f_n\,d\mu = \int_A f\,d\mu$   for every $A \in \mathscr{A}$

(resp.,
(15.6)   $\lim_{n\to\infty} \int_A |f_n|^p\,d\mu = \int_A |f|^p\,d\mu$   for every $A \in \mathscr{A}$.)

Proof. (15.5) follows from (15.3). Correspondingly (15.6) follows from (15.4), which gives $\lim_{n\to\infty} N_p(1_A f_n) = N_p(1_A f)$, upon taking $p$th powers in this last limit and using the continuity of the mapping $x \mapsto x^p$ on $\mathbb{R}_+$. □

(15.5) and (15.6) say nothing other than that for each $A \in \mathscr{A}$ the mappings

$f \mapsto \int_A f\,d\mu$   and   $f \mapsto \int_A |f|^p\,d\mu$

on $\mathscr{L}^1(\mu)$ and $\mathscr{L}^p(\mu)$, respectively, are continuous with respect to $\mathscr{L}^1$-convergence and $\mathscr{L}^p$-convergence, respectively.
Further developments require a lemma which is fundamental for the whole of integration theory as well as its applications in probability theory and which goes back to P. FATOU (1878-1929):

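15.2 Lemma (of Fatou). Every sequence $(f_n)_{n \in \mathbb{N}}$ in $E^*(\Omega, \mathscr{A})$, that is, consisting of $\mathscr{A}$-measurable numerical functions $f_n \ge 0$, satisfies

$\int \liminf_{n\to\infty} f_n\,d\mu \le \liminf_{n\to\infty} \int f_n\,d\mu.$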


Proof. According to 9.5 and 11.6 the functions
$f := \liminf_{n\to\infty} f_n$   and   $g_n := \inf_{m \ge n} f_m$   for all $n \in \mathbb{N}$
lie in $E^*(\Omega, \mathscr{A})$. By definition of limit inferior, $g_n \uparrow f$ and thus by 11.4

$\int f\,d\mu = \sup_{n \in \mathbb{N}} \int g_n\,d\mu = \lim_{n\to\infty} \int g_n\,d\mu.$

From this the claimed inequality follows, because by isotoneity

$\int g_n\,d\mu \le \inf_{m \ge n} \int f_m\,d\mu$

holds for all $n \in \mathbb{N}$. □
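
Strict inequality can occur in Fatou's lemma. The sketch below assumes a hypothetical two-point probability space and a sequence $(f_n)$ of period 2 that alternates between the indicator functions of the two points; for a periodic sequence the limit inferior is the minimum over one period, which is what the code computes.

```python
from fractions import Fraction

# Two-point space with mass 1/2 on each point (an assumption made for illustration).
weights = [Fraction(1, 2), Fraction(1, 2)]
# One period of the sequence: f_n = period[n % 2].
period = [[1, 0], [0, 1]]

def integral(func):
    """Integral of a function on the two-point space."""
    return sum(v * m for v, m in zip(func, weights))

# Pointwise liminf of a periodic sequence = pointwise minimum over one period.
liminf_f = [min(fn[w] for fn in period) for w in range(len(weights))]
integral_of_liminf = integral(liminf_f)

# liminf of the (periodic) sequence of integrals = minimum over one period.
liminf_of_integrals = min(integral(fn) for fn in period)
```

Here the left side of Fatou's inequality is 0 while the right side is 1/2.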

If we choose for $(f_n)$ a sequence of indicator functions of measurable sets $A_n \subset \Omega$, then $\liminf_{n\to\infty} 1_{A_n}$ is the indicator function of the set

(15.7)   $\liminf_{n\to\infty} A_n := \bigcup_{n \in \mathbb{N}} \bigcap_{m \ge n} A_m.$

This is the set of $\omega \in \Omega$ which lie in ultimately all of the sets $A_n$. Dual to it one defines
(15.8)   $\limsup_{n\to\infty} A_n := \bigcap_{n \in \mathbb{N}} \bigcup_{m \ge n} A_m,$
the set of $\omega \in \Omega$ which lie in infinitely many of the sets $A_n$, more correctly, the $\omega$ which lie in $A_n$ for infinitely many $n$. Evidently
(15.9)   $\complement \limsup_{n\to\infty} A_n = \liminf_{n\to\infty} \complement A_n.$
Hence we get the following corollary:

15.3 Corollary. For every sequence $(A_n)_{n \in \mathbb{N}}$ of sets in the $\sigma$-algebra $\mathscr{A}$

(15.10)   $\mu(\liminf_{n\to\infty} A_n) \le \liminf_{n\to\infty} \mu(A_n),$

and if the measure $\mu$ is finite, the inequality
(15.11)   $\limsup_{n\to\infty} \mu(A_n) \le \mu(\limsup_{n\to\infty} A_n)$
holds as well.

Proof. (15.10) is an immediate consequence of 15.2. In turn, if we apply (15.10) to the sequence $(\complement A_n)$ and use (15.9), we get
$\mu(\Omega) - \mu(\limsup_{n\to\infty} A_n) = \mu(\complement \limsup_{n\to\infty} A_n) = \mu(\liminf_{n\to\infty} \complement A_n) \le \liminf_{n\to\infty} \mu(\complement A_n) = \mu(\Omega) - \limsup_{n\to\infty} \mu(A_n),$
confirming (15.11). □
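
The set-theoretic limits and the two inequalities can be tried out on a periodic sequence of finite sets. The sketch below assumes counting measure on a hypothetical six-point set; for a sequence of period 3 every tail contains all three sets, so the limit inferior is their intersection and the limit superior their union. The names `lim_inf` and `lim_sup` are illustrative.

```python
omega = frozenset(range(6))
# One period of the sequence: A_n = period[n % 3].
period = [frozenset({0, 1, 2}), frozenset({1, 2, 3}), frozenset({2, 5})]

def lim_inf(sets):
    """liminf of a periodic sequence of sets: the points lying in all sets of a period."""
    return frozenset.intersection(*sets)

def lim_sup(sets):
    """limsup of a periodic sequence of sets: the points lying in some set of a period."""
    return frozenset.union(*sets)

# (15.9): the complement of the limsup is the liminf of the complements.
complements = [omega - a for a in period]
identity_holds = (omega - lim_sup(period)) == lim_inf(complements)

# (15.10) and (15.11) for counting measure, mu(A) = len(A).
measures = [len(a) for a in period]
lower_ok = len(lim_inf(period)) <= min(measures)
upper_ok = max(measures) <= len(lim_sup(period))
```

In this example both inequalities are strict: the limit inferior has one point while the measures never drop below two, and the limit superior has five points while the measures never exceed three.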

Fatou's lemma leads - in the hands of NOVINGER [1972] - surprisingly simply to the first convergence theorem, by which is meant a mechanism for inferring convergence in $p$th mean from almost everywhere convergence. The result itself goes back to F. RIESZ (1880-1956); cf. RIESZ [1911].

15.4 Theorem (of F. Riesz). Suppose $1 \le p < +\infty$ and the sequence $(f_n)_{n \in \mathbb{N}}$ in $\mathscr{L}^p(\mu)$ converges almost everywhere in $\Omega$ to a function $f \in \mathscr{L}^p(\mu)$. Then the condition
(15.12)   $\lim_{n\to\infty} \int |f_n|^p\,d\mu = \int |f|^p\,d\mu$
is (necessary and) sufficient for the convergence of $(f_n)$ to $f$ in $p$th mean.

Proof. The necessity of (15.12) follows (even without the hypothesis of almost-everywhere convergence) immediately from 15.1. The proof of sufficiency proceeds from the inequality
$(\alpha + \beta)^p \le 2^p(\alpha^p + \beta^p)$   ($\alpha, \beta \in \mathbb{R}_+$)
which has already been used in the proof of (14.4). Since $|\alpha - \beta| \le |\alpha| + |\beta|$ this inequality yields
$|\alpha - \beta|^p \le 2^p(|\alpha|^p + |\beta|^p)$   ($\alpha, \beta \in \mathbb{R}$).
This inequality insures that
$g_n := 2^p(|f_n|^p + |f|^p) - |f_n - f|^p$,   $n \in \mathbb{N}$,
are non-negative functions. They lie in $\mathscr{L}^1(\mu)$ and by hypothesis they converge almost everywhere to $2^{p+1}|f|^p$. In particular, $2^{p+1}|f|^p = \liminf_{n\to\infty} g_n$ almost everywhere. Therefore Fatou's lemma in conjunction with (15.12) delivers the relations

$2^{p+1} \int |f|^p\,d\mu = \int \liminf_{n\to\infty} g_n\,d\mu \le \liminf_{n\to\infty} \int g_n\,d\mu = 2^{p+1} \int |f|^p\,d\mu - \limsup_{n\to\infty} \int |f_n - f|^p\,d\mu.$

Since $2^{p+1} \int |f|^p\,d\mu < +\infty$, we infer by subtracting it that

$\limsup_{n\to\infty} \int |f_n - f|^p\,d\mu \le 0,$

which asserts the claimed $\mathscr{L}^p$-convergence. □

In preparation for the proof of the next convergence theorem we extend Minkowski's inequality to series of non-negative functions.

15.5 Lemma. Every sequence $(f_n)_{n \in \mathbb{N}}$ of functions from $E^*(\Omega, \mathscr{A})$ satisfies

(15.13)   $N_p\left(\sum_{n=1}^{\infty} f_n\right) \le \sum_{n=1}^{\infty} N_p(f_n)$   for every $p \in [1, +\infty[$.

Proof. If for each $n \in \mathbb{N}$ we set $s_n := f_1 + \ldots + f_n$, then by (14.4)

$N_p(s_n) \le \sum_{i=1}^{n} N_p(f_i) \le \sum_{i=1}^{\infty} N_p(f_i).$

The sequence $(s_n)$ is isotone and $\sum_{n=1}^{\infty} f_n$ is its upper envelope; the same holds for the $p$th powers. Therefore from the monotone convergence theorem 11.4 follows

$N_p\left(\sum_{n=1}^{\infty} f_n\right) = \sup_{n \in \mathbb{N}} N_p(s_n)$

and together with the preceding inequalities this gives (15.13). □

We come now to a second convergence theorem. It goes back to H. Lebesgue and is therefore frequently called Lebesgue's convergence theorem.

15.6 Theorem (on dominated convergence). Let $1 \le p < +\infty$ and $(f_n)_{n \in \mathbb{N}}$ be a sequence from $\mathscr{L}^p(\mu)$ which converges almost everywhere on $\Omega$. Suppose there exists a $p$-fold integrable numerical function $g \ge 0$ on $\Omega$ such that
(15.14)   $|f_n| \le g$   for all $n \in \mathbb{N}$.
Then there is a real-valued measurable function $f$ on $\Omega$ to which $(f_n)$ converges almost everywhere. Every such $f$ lies in $\mathscr{L}^p(\mu)$ and the sequence $(f_n)$ converges to $f$ in $p$th mean.

Proof. By assumption there is a nullset $M_1$ such that $\lim_{n\to\infty} f_n(\omega)$ exists (in $\overline{\mathbb{R}}$) for every $\omega \in \complement M_1$. Because of the integrability of $g^p$ there is, according to 13.6, another nullset $M_2$ with $g(\omega) < +\infty$ for every $\omega \in \complement M_2$. If we set
$f(\omega) := \lim_{n\to\infty} f_n(\omega)$ for $\omega \in \complement(M_1 \cup M_2)$,   $f(\omega) := 0$ for $\omega \in M_1 \cup M_2$,
then $f$ is real-valued and $\mathscr{A}$-measurable, and the sequence $(f_n)$ converges almost everywhere to $f$. Consider now any function $f$ with these properties. Then $|f| \le g$ almost everywhere, so along with $g^p$ the function $|f|^p$ is also integrable, that is, $f \in \mathscr{L}^p(\mu)$, by 13.5. We set, for each $n \in \mathbb{N}$
$g_n := |f_n - f|^p$
and then what has to be shown is that $\lim \int g_n\,d\mu = 0$. From the definition of $g_n$,
$0 \le g_n \le (|f_n| + |f|)^p \le (g + |f|)^p.$
Since the function $h := (g + |f|)^p$ is integrable, so is each $g_n$ (by 14.4 and 12.2). Fatou's lemma applies to the sequence $(h - g_n)$ and says that

$\int \liminf_{n\to\infty} (h - g_n)\,d\mu \le \liminf_{n\to\infty} \int (h - g_n)\,d\mu = \int h\,d\mu - \limsup_{n\to\infty} \int g_n\,d\mu.$

Since $(f_n)$ converges almost everywhere to $f$, $(h - g_n)$ converges almost everywhere to $h$. In particular,
$\liminf_{n\to\infty} (h - g_n) = h$   almost everywhere
and so
$\int \liminf_{n\to\infty} (h - g_n)\,d\mu = \int h\,d\mu.$
The preceding inequality therefore yields $\limsup \int g_n\,d\mu \le 0$. Since all $g_n$ are non-negative, this is equivalent to the desired $\lim \int g_n\,d\mu = 0$. □

The concept of a Cauchy sequence makes sense in any pseudometric space, in particular therefore in $\mathscr{L}^p(\mu)$. A sequence $(f_n)$ of functions from $\mathscr{L}^p(\mu)$ is said to be a Cauchy sequence in $\mathscr{L}^p(\mu)$ if for every $\varepsilon > 0$
$d_p(f_m, f_n) = N_p(f_m - f_n) < \varepsilon$
holds for ultimately all $m$, $n$. Every $\mathscr{L}^p(\mu)$-convergent sequence is a Cauchy sequence, as Minkowski's inequality shows. That the converse of this is also true, that, in other words, the space $\mathscr{L}^p(\mu)$ is (metrically) complete, is the content of the third convergence theorem. Its special case $p = 2$ goes back to F. RIESZ and E. FISCHER (1875-1956).

15.7 Theorem. For each $1 \le p < +\infty$, every Cauchy sequence $(f_n)_{n \in \mathbb{N}}$ in $\mathscr{L}^p(\mu)$ converges in $p$th mean to an $f \in \mathscr{L}^p(\mu)$. Some subsequence of $(f_n)$ converges almost everywhere to $f$.

Proof. Straight from the definition of Cauchy sequence we can construct $1 \le n_1 < n_2 < \ldots$ such that $N_p(f_{n_{k+1}} - f_{n_k}) \le 2^{-k}$ for all $k \in \mathbb{N}$. We define
$g_k := f_{n_{k+1}} - f_{n_k}$ for each $k \in \mathbb{N}$,   and   $g := \sum_{k=1}^{\infty} |g_k|.$

Then from 15.5

$N_p(g) \le \sum_{k=1}^{\infty} N_p(g_k) \le \sum_{k=1}^{\infty} 2^{-k} = 1.$

Consequently, the $\mathscr{A}$-measurable, non-negative numerical function $g$ is $p$th-power integrable and therefore (by 13.6) it is almost everywhere real-valued, that is, the series $\sum g_k$ is absolutely convergent almost everywhere. The $k$th partial sum of this series is $f_{n_{k+1}} - f_{n_1}$, so we see that the sequence $(f_{n_k})_{k \in \mathbb{N}}$ converges almost everywhere in $\Omega$. Moreover,
$|f_{n_{k+1}}| = |g_1 + \ldots + g_k + f_{n_1}| \le g + |f_{n_1}|$
and by 14.4 the sum $g + |f_{n_1}|$ is $p$th-power integrable. Thus the sequence $(f_{n_k})_{k \in \mathbb{N}}$ satisfies all the hypotheses of the dominated convergence theorem, according to which it therefore converges in $p$th mean to an $f \in \mathscr{L}^p(\mu)$ and
$\lim_{k\to\infty} f_{n_k} = f$   almost everywhere.

Since $(f_n)$ is a Cauchy sequence, this subsequence behavior entails the convergence in $p$th mean of the whole sequence: Given $\varepsilon > 0$ there is an $m_\varepsilon \in \mathbb{N}$ such that
$N_p(f_m - f_n) < \varepsilon$   for all $m, n \ge m_\varepsilon$.
Then there is a $k \in \mathbb{N}$ with $n_k \ge m_\varepsilon$ such that
$N_p(f_{n_k} - f) < \varepsilon.$
The triangle inequality then insures that
$N_p(f_n - f) \le N_p(f_n - f_{n_k}) + N_p(f_{n_k} - f) < 2\varepsilon$
holds for any $n \ge m_\varepsilon$. □

Passage to a subsequence cannot generally be circumvented if one wants almost everywhere convergence, as the next example illustrates.

Example. Consider $\Omega := [0, 1[$, $\mathscr{A} := \Omega \cap \mathscr{B}^1$ and $\mu := \lambda^1_\Omega$. Every natural number $n$ is representable as $n = 2^h + k$ for a unique pair of integers $h \ge 0$, $k \ge 0$ with $k < 2^h$. Set $A_n := [k2^{-h}, (k + 1)2^{-h}[$ and let $f_n$ denote its indicator function. Then $\int f_n^p\,d\mu = \int f_n\,d\mu = \mu(A_n) = 2^{-h} < 2/n$, so $(f_n)$ converges to 0 in $p$th mean (for any $1 \le p < +\infty$) and is therefore certainly a Cauchy sequence in $\mathscr{L}^p(\mu)$. But the sequence $(f_n(\omega))_{n \in \mathbb{N}}$ in $\{0, 1\}$ is not convergent for any $\omega \in \Omega$. Indeed, given $\omega \in \Omega$ and $h = 0, 1, \ldots$, there is exactly one $k = 0, \ldots, 2^h - 1$, such that $\omega \in [k2^{-h}, (k + 1)2^{-h}[$, that is, $\omega \in A_{2^h + k}$. In case $k < 2^h - 1$, $\omega \notin A_{2^h + k + 1}$. In case $k = 2^h - 1$ and $h \ge 1$, $\omega \notin A_{2^{h+1}}$.
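
The intervals of this example are easy to enumerate by machine. The following sketch uses exact rational arithmetic; the function names `interval` and `f` and the sample point are choices made here for illustration only.

```python
from fractions import Fraction

def interval(n):
    """Endpoints (left, right) of A_n = [k 2^-h, (k+1) 2^-h[ where n = 2^h + k, 0 <= k < 2^h."""
    h = n.bit_length() - 1          # largest h with 2^h <= n
    k = n - (1 << h)
    scale = Fraction(1, 1 << h)
    return k * scale, (k + 1) * scale

def f(n, omega):
    """Value at omega of the indicator function f_n of A_n."""
    left, right = interval(n)
    return 1 if left <= omega < right else 0

omega = Fraction(1, 3)              # an arbitrary sample point of [0, 1[

# Within each dyadic level h >= 1, exactly one of the 2^h indices n hits omega,
# so the sequence (f_n(omega)) takes both values 0 and 1 infinitely often.
hits_per_level = [sum(f(n, omega) for n in range(2 ** h, 2 ** (h + 1))) for h in range(1, 8)]

# The lengths mu(A_n) nevertheless tend to 0, with mu(A_n) < 2/n.
lengths_small = all(interval(n)[1] - interval(n)[0] < Fraction(2, n) for n in range(1, 257))
```

Each entry of `hits_per_level` equals 1, which is the computational counterpart of the statement that $(f_n(\omega))$ does not converge although $\mu(A_n) \to 0$.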
We record the following simple corollary to 15.7:

15.8 Corollary. If the Cauchy sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges almost everywhere to an $\mathscr{A}$-measurable real function $f$ on $\Omega$, then $f$ lies in $\mathscr{L}^p(\mu)$ and the sequence converges to it in $p$th mean.

Proof. According to 15.7 there is an $f^* \in \mathscr{L}^p(\mu)$ to which $(f_n)$ converges in $p$th mean and to which a subsequence of $(f_n)$ converges almost everywhere. Outside the union of this exceptional nullset and that in the hypothesis the two limits $f$ and $f^*$ must agree. Hence $f = f^*$ almost everywhere. □

Corresponding to Theorem 14.6 and its corollary we have finally the following
two convergence assertions:

15.9 Theorem. Suppose the sequence $(f_n)$ in $\mathcal{L}^p(\mu)$ converges in pth mean to a function
$f \in \mathcal{L}^p(\mu)$ and the sequence $(g_n)$ in $\mathcal{L}^q(\mu)$ converges in qth mean to $g \in \mathcal{L}^q(\mu)$.
If $1 < p < +\infty$ and $p^{-1} + q^{-1} = 1$, then the sequence $(f_n g_n)$ of products converges
in mean to $fg$.

Proof. The triangle inequality in $\mathbb{R}$ yields
$|f_n g_n - fg| \le |f_n - f|\,|g_n| + |f|\,|g_n - g|$ $(n \in \mathbb{N})$
which the Hölder inequality (14.3) transforms into
$N_1(f_n g_n - fg) \le N_p(f_n - f) N_q(g_n) + N_p(f) N_q(g_n - g)$ $(n \in \mathbb{N})$.
Our claim follows from this when we recall from (15.2) or (15.6) that the sequence
$(N_q(g_n))_{n \in \mathbb{N}}$ is convergent, hence bounded. □
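
The estimate used in this proof can be checked on a small example. The sketch below assumes a finite measure space given by four point masses; the weights, the functions `f`, `g` and the perturbation $\pm 1/n$ are hypothetical choices made only for illustration.

```python
def norm(func, weights, p):
    """N_p(func) = (sum of |func|^p * mu)^(1/p) on a finite space with point masses `weights`."""
    return sum(abs(v) ** p * m for v, m in zip(func, weights)) ** (1.0 / p)

weights = [0.5, 1.0, 2.0, 0.25]     # hypothetical point masses mu({w})
f = [1.0, -2.0, 0.5, 3.0]
g = [2.0, 1.0, -1.0, 0.5]
p, q = 3.0, 1.5                     # conjugate exponents: 1/p + 1/q = 1

def perturbed(n):
    """f_n -> f and g_n -> g as n grows (in pth, respectively qth, mean)."""
    fn = [v + 1.0 / n for v in f]
    gn = [v - 1.0 / n for v in g]
    return fn, gn

def both_sides(n):
    """Return (N_1(f_n g_n - f g), N_p(f_n - f) N_q(g_n) + N_p(f) N_q(g_n - g))."""
    fn, gn = perturbed(n)
    lhs = norm([a * b - c * d for a, b, c, d in zip(fn, gn, f, g)], weights, 1)
    rhs = (norm([a - c for a, c in zip(fn, f)], weights, p) * norm(gn, weights, q)
           + norm(f, weights, p) * norm([b - d for b, d in zip(gn, g)], weights, q))
    return lhs, rhs
```

For growing n both returned numbers shrink, the first never exceeding the second, which is exactly the mechanism behind the convergence in mean of the products.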

15.10 Corollary. If the measure $\mu$ is finite, then every sequence $(f_n)_{n \in \mathbb{N}}$ in $\mathcal{L}^p(\mu)$
which converges in pth mean to an $f \in \mathcal{L}^p(\mu)$ for some $1 \le p < +\infty$, also
converges to f in mean.

Proof. For p = 1 there is nothing to prove. For $1 < p < +\infty$ the claim follows from
the theorem upon taking every function $g_n$ there to be the constant function 1;
because of the finiteness of $\mu$ the constant functions lie in $\mathcal{L}^q(\mu)$ for every
$q \in \,]1, +\infty[$. □

The reader should convince himself via an example like that in the remark after
14.7 that the converse of the assertion in this corollary is not true. However, the
conclusion of the corollary can be refined somewhat; namely, under its hypotheses
there is $\mathcal{L}^{p'}$-convergence of $(f_n)$ to f for every $p' \in [1, p[$. Cf. Exercise 2 below.

Remarks. 1. Because
$N_p : \mathcal{L}^p(\mu) \to \mathbb{R}_+$
is a semi-norm, the set
$\mathcal{N} := N_p^{-1}(0)$
is a linear subspace of $\mathcal{L}^p(\mu)$. It is independent of p because it consists of all
measurable real functions on $\Omega$ which are almost everywhere equal to 0. The
quotient vector space
$L^p(\mu) := \mathcal{L}^p(\mu)/\mathcal{N}$
becomes a normed space in a natural way: Letting $f \mapsto \tilde{f}$ denote the canonical
mapping of $\mathcal{L}^p(\mu)$ onto $L^p(\mu)$, we define
$\|\tilde{f}\|_p := N_p(f)$ for all $\tilde{f} \in L^p(\mu)$.

One checks effortlessly that $\tilde{f} \mapsto \|\tilde{f}\|_p$ is thereby well defined and provides a norm
on $L^p(\mu)$. Theorem 15.7 says that $L^p(\mu)$ is complete with respect to this norm,
that is, it is a Banach space (for $1 \le p < +\infty$).
$L^2(\mu)$ is even a Hilbert space. For the product fg of two functions $f, g \in \mathcal{L}^2(\mu)$
is integrable, by 14.6, and it is clear that the integral $\int fg \, d\mu$ depends only on the
canonical images $\tilde{f}, \tilde{g}$ of these functions, which means that
$(\tilde{f}, \tilde{g}) \mapsto \int fg \, d\mu$
is a well-defined mapping. A short calculation suffices to confirm that it provides
a scalar product in $L^2(\mu)$.
2. $f \in \mathcal{L}^\infty(\mu)$ means that the set $W_f$ of all $\alpha \in \mathbb{R}_+$ such that $|f| \le \alpha$ almost
everywhere is not empty. We can set
$N_\infty(f) := \inf W_f$
and show easily that $N_\infty : \mathcal{L}^\infty(\mu) \to \mathbb{R}_+$ is a semi-norm on $\mathcal{L}^\infty(\mu)$. Also in this
case $N_\infty^{-1}(0)$ coincides with the space $\mathcal{N}$ described in 1. In the quotient space
$L^\infty(\mu) := \mathcal{L}^\infty(\mu)/\mathcal{N}$
a norm $\tilde{f} \mapsto \|\tilde{f}\|_\infty$ can be defined via $N_\infty$ just as before. One checks that $L^\infty(\mu)$
thus also becomes a Banach space.
3. For every measure space $(\Omega, \mathcal{A}, \mu)$ and every $p \in \,]0,1[$ the set $\mathcal{L}^p(\mu)$ (cf.
Remark 2 in §14) turns out to be a vector space. $N_p$ is generally not a semi-norm
(cf. Exercise 5, §14), but $d_p(f, g) := N_p^p(f - g)$ is a complete pseudometric, with,
however, strange properties: The unit "ball" centered at 0 is generally not convex.
For L-B measure on [0, 1], every $f \in \mathcal{L}^p$ is actually a convex combination of
functions in this ball. See BOURBAKI [1965], chap. 4, §6, exer. 13.

Exercises.

1. Let $(f_n)$ be a sequence of numerical measurable functions on a measure space
$(\Omega, \mathcal{A}, \mu)$. Under the hypothesis that a $\mu$-integrable function g satisfying $|f_n| \le g$
for every $n \in \mathbb{N}$ exists, show that $\liminf f_n$ and $\limsup f_n$ are $\mu$-integrable
functions and satisfy
$\int \liminf_{n\to\infty} f_n \, d\mu \le \liminf_{n\to\infty} \int f_n \, d\mu \le \limsup_{n\to\infty} \int f_n \, d\mu \le \int \limsup_{n\to\infty} f_n \, d\mu$.
Show by an explicit example that this chain of inequalities can fail if there is no
such majorizing function g. (To this end, cf. Exercise 6 in §21.)
2. Let $\mu$ be a finite measure, $1 \le p' < p < +\infty$. Show that if a sequence in $\mathcal{L}^p(\mu)$
converges in pth mean to a function $f \in \mathcal{L}^p(\mu)$, then it also converges in p'th mean
to f. (Cf. Exercise 3 in §14.)
3. Let $(\Omega, \mathcal{A}, \mu)$ be a finite measure space and on $\mathcal{A}$ consider the pseudometric
$d_\mu(A, B) := \mu(A \,\triangle\, B) = \int |1_A - 1_B| \, d\mu$ introduced in Exercise 7 of §3. Show that
the pseudometric space $(\mathcal{A}, d_\mu)$ is complete.

4. Show that if $1 \le p < +\infty$ and $f, f_n \in \mathcal{L}^p(\mu)$ satisfy
$\sum_{n=1}^{\infty} \cdots$
then the sequence $(f_n)$ converges almost everywhere to f.
5. Show that the conclusion of Theorem 15.9 remains valid for p = 1 and q = +oo.

§16. Applications of the convergence theorems

We will now demonstrate the applicability of the convergence theorems by means
of three examples which will be important in the sequel. The first concerns the
behavior of parameter-dependent integrals, the second the connection between the
Riemann and the Lebesgue integral, and the third the calculation of the (Gaussian)
integral
(16.1) $G := \int e^{-x^2} \, \lambda^1(dx)$.

I. Parameter-dependent Integrals. The question of the continuity and
differentiability of functions which are defined by integrals will be answered in the
following lemmas and corollary. Throughout, $(\Omega, \mathcal{A}, \mu)$ is an arbitrary measure
space.

16.1 Lemma (Continuity lemma). Let E be a metric space and $f : E \times \Omega \to \mathbb{R}$
a function with the properties
(a) $\omega \mapsto f(x, \omega)$ is $\mu$-integrable for every $x \in E$;
(b) $x \mapsto f(x, \omega)$ is continuous at $x_0 \in E$ for every $\omega \in \Omega$;
(c) there is a $\mu$-integrable function $h \ge 0$ on $\Omega$ such that
$|f(x, \omega)| \le h(\omega)$ for all $(x, \omega) \in E \times \Omega$.
Then the function defined on E by
$\varphi(x) := \int f(x, \omega) \, \mu(d\omega)$
is continuous at $x_0$.

Proof. The continuity of $\varphi$ at $x_0$ is proved if we show that for every sequence $(x_n)$
in E with $\lim x_n = x_0$,
$\lim_{n\to\infty} \varphi(x_n) = \varphi(x_0)$
holds. To accomplish this, we introduce the sequence $(f_n)$ by
$f_n(\omega) := f(x_n, \omega)$ $(n \in \mathbb{Z}_+, \omega \in \Omega)$.
By hypothesis these are integrable functions, each satisfies $|f_n| \le h$, and for every
fixed $\omega \in \Omega$, $\lim_{n\to\infty} f_n(\omega) = f_0(\omega)$. From the theorem on dominated convergence
therefore follows that
$\lim_{n\to\infty} \int f_n \, d\mu = \int f_0 \, d\mu = \int f(x_0, \omega) \, \mu(d\omega)$,
that is, indeed $\lim \varphi(x_n) = \varphi(x_0)$. □

In the following applications of this lemma the space E will frequently be an
interval in $\mathbb{R}$ or, more generally, a subset of $\mathbb{R}^d$.

16.2 Lemma (Differentiation lemma). Let I be a non-degenerate (meaning,
containing more than one point) interval in $\mathbb{R}$, and $f : I \times \Omega \to \mathbb{R}$ be a function with
the properties
(a) $\omega \mapsto f(x, \omega)$ is $\mu$-integrable for each $x \in I$;
(b) $x \mapsto f(x, \omega)$ is differentiable on I for each $\omega \in \Omega$, the derivative at x being
denoted by $f'(x, \omega)$;
(c) there is a $\mu$-integrable function $h \ge 0$ on $\Omega$ such that
$|f'(x, \omega)| \le h(\omega)$ for all $(x, \omega) \in I \times \Omega$.
Then the function defined on I by

(16.2) $\varphi(x) := \int f(x, \omega) \, \mu(d\omega)$

is differentiable, for each $x \in I$ the function $\omega \mapsto f'(x, \omega)$ is $\mu$-integrable, and

(16.3) $\varphi'(x) = \int f'(x, \omega) \, \mu(d\omega)$ for every $x \in I$.

In short, under the stated conditions (16.2) can be differentiated under the
integral sign.

Proof. Fix $x_0 \in I$ and consider any sequence $(x_n)_{n \in \mathbb{N}} \subset I \setminus \{x_0\}$ which converges
to $x_0$. Then the function defined on $\Omega$ by
$g_n(\omega) := \dfrac{f(x_n, \omega) - f(x_0, \omega)}{x_n - x_0}$
is $\mu$-integrable, for each $n \in \mathbb{N}$, and
$\lim_{n\to\infty} g_n(\omega) = f'(x_0, \omega)$ for all $\omega \in \Omega$.
It is a consequence of hypothesis (c) that $|g_n| \le h$ for $n \in \mathbb{N}$, as we now confirm. It
suffices to apply the mean-value theorem of differential calculus. According to it,
for each $x \in I \setminus \{x_0\}$ and each fixed $\omega \in \Omega$ there is a point t in the open interval
whose endpoints are x and $x_0$, such that
$\dfrac{f(x, \omega) - f(x_0, \omega)}{x - x_0} = f'(t, \omega)$
and therefore by (c) this quotient is majorized by $h(\omega)$. In particular,
$|g_n(\omega)| \le h(\omega)$ for all $\omega \in \Omega$, and every $n \in \mathbb{N}$.
Now the dominated convergence theorem comes into play to insure that the
function $\omega \mapsto f'(x_0, \omega)$ to which the $g_n$ converge is $\mu$-integrable and
$\lim_{n\to\infty} \int g_n \, d\mu = \int f'(x_0, \omega) \, \mu(d\omega)$.
Claim (16.3) follows from this because
$\int g_n \, d\mu = \dfrac{\varphi(x_n) - \varphi(x_0)}{x_n - x_0}$ for all $n \in \mathbb{N}$.
□
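
A numerical illustration of (16.3), offered only as a sketch: the integrand $f(x, \omega) = e^{-x\omega}$ on $\Omega = [0, 1]$ with Lebesgue measure, the midpoint rule `integrate`, and all function names are assumptions made here. For $x \ge 0$ the derivative $-\omega e^{-x\omega}$ is majorized by the integrable function $h = 1$, so the lemma applies on $]0, +\infty[$.

```python
import math

def integrate(func, a, b, n=2000):
    """Composite midpoint rule on [a, b] with n cells."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))

def f(x, w):
    return math.exp(-x * w)

def df_dx(x, w):
    """Partial derivative of f in x; |df_dx| <= 1 for x >= 0 and 0 <= w <= 1."""
    return -w * math.exp(-x * w)

def phi(x):
    """phi(x) = integral of f(x, w) over w in [0, 1]."""
    return integrate(lambda w: f(x, w), 0.0, 1.0)

def phi_prime_by_lemma(x):
    """Right-hand side of (16.3): integrate the derivative."""
    return integrate(lambda w: df_dx(x, w), 0.0, 1.0)

def phi_prime_by_difference(x, eps=1e-5):
    """Symmetric difference quotient of phi, for comparison."""
    return (phi(x + eps) - phi(x - eps)) / (2 * eps)
```

Here $\varphi(x) = (1 - e^{-x})/x$ is known in closed form, so both approximations of $\varphi'$ can be compared with the exact derivative.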

Passage to the multi-dimensional analog is painless:

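16.3 Corollary. Let U be an open subset of $\mathbb{R}^d$, $i \in \{1, \ldots, d\}$ and $f : U \times \Omega \to \mathbb{R}$
a function with the properties
(a) $\omega \mapsto f(x, \omega)$ is $\mu$-integrable for each $x \in U$;
(b) $x \mapsto f(x, \omega)$ has an ith partial derivative at each point of U, for every $\omega \in \Omega$;
(c) there is a $\mu$-integrable function $h \ge 0$ on $\Omega$ such that
$\left| \dfrac{\partial f}{\partial x_i}(x, \omega) \right| \le h(\omega)$ for all $(x, \omega) \in U \times \Omega$.
Then the function defined on U by
$\varphi(x) := \int f(x, \omega) \, \mu(d\omega)$
has an ith partial derivative at every $x \in U$, the function $\omega \mapsto \dfrac{\partial f}{\partial x_i}(x, \omega)$ is
$\mu$-integrable, and
$\dfrac{\partial \varphi}{\partial x_i}(x) = \int \dfrac{\partial f}{\partial x_i}(x, \omega) \, \mu(d\omega)$ for every $x \in U$.

This follows at once from the differentiation lemma: Given $x = (x_1, \ldots, x_d) \in U$,
there is an open interval $I \subset \mathbb{R}$ containing $x_i$ such that for each $t \in I$ the
point $(x_1, \ldots, x_{i-1}, t, x_{i+1}, \ldots, x_d)$ lies in U, and we can apply 16.2 to the function
$(t, \omega) \mapsto f(x_1, \ldots, x_{i-1}, t, x_{i+1}, \ldots, x_d, \omega)$.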

II. Comparison of the Riemann and Lebesgue Integrals. For every
d-dimensional Borel set $B \in \mathcal{B}^d$ and suitable Borel measurable numerical
functions f on B the integral $\int_B f \, d\lambda^d$ was defined in §12 and identified with $\int f \, d\lambda^d_B$.
This integral is called for short the Lebesgue integral of f over B. A frequently
encountered alternative way of writing it is

(16.4) $\int_B f(x) \, dx := \int_B f \, d\lambda^d$.

In case d = 1 and $B = [\alpha, \beta]$, or $]-\infty, \alpha]$, or $\mathbb{R}$, etc. the notations $\int_\alpha^\beta f(x) \, dx$, or
$\int_{-\infty}^{\alpha} f(x) \, dx$, or $\int_{-\infty}^{+\infty} f(x) \, dx$, etc., are also common.
Since in basic analysis courses it is frequently only the Riemann integral that
is dealt with, the following remarks relating it to what has been done here may be
useful.

16.4 Theorem. Consider a Borel measurable real function f defined on a compact
interval $I := [\alpha, \beta]$ in $\mathbb{R}$. If f is Riemann integrable (which in particular means it
is bounded), then it is also Lebesgue integrable, and the values of the two integrals
of f coincide.

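Proof. To every finite subdivision
$\mathcal{Z} := \{\alpha = \alpha_0 < \alpha_1 < \ldots < \alpha_n = \beta\}$
of I the Riemann theory associates the lower and upper sums
$L_{\mathcal{Z}} := \sum_{i=1}^{n} \gamma_i (\alpha_i - \alpha_{i-1})$ and $U_{\mathcal{Z}} := \sum_{i=1}^{n} \Gamma_i (\alpha_i - \alpha_{i-1})$
in which
$\gamma_i := \inf f([\alpha_{i-1}, \alpha_i])$ and $\Gamma_i := \sup f([\alpha_{i-1}, \alpha_i])$, $i = 1, \ldots, n$.
If we set $\mu := \lambda^1_I$ and $A_i := [\alpha_{i-1}, \alpha_i]$, $i = 1, \ldots, n$, then
$l_{\mathcal{Z}} := \sum_{i=1}^{n} \gamma_i 1_{A_i}$ and $u_{\mathcal{Z}} := \sum_{i=1}^{n} \Gamma_i 1_{A_i}$
are $\mu$-integrable functions on I for which
$L_{\mathcal{Z}} = \int l_{\mathcal{Z}} \, d\mu$ and $U_{\mathcal{Z}} = \int u_{\mathcal{Z}} \, d\mu$.

Riemann integrability of f means, by definition, that there is a sequence $(\mathcal{Z}_n)$ of
subdivisions of I such that each $\mathcal{Z}_{n+1}$ refines its predecessor $\mathcal{Z}_n$ and the sequences
$(L_{\mathcal{Z}_n})$ and $(U_{\mathcal{Z}_n})$ tend to the same real limit value, the Riemann integral $\rho$ of f
over I.
Because of the refinement feature of the sequence $(\mathcal{Z}_n)$, $(l_{\mathcal{Z}_n})$ is an isotone and
$(u_{\mathcal{Z}_n})$ is an antitone sequence. Hence
$q := \lim_{n\to\infty} (u_{\mathcal{Z}_n} - l_{\mathcal{Z}_n})$
exists (in $\mathbb{R}$) on I. If therefore we apply Fatou's lemma 15.2 to the sequence of
functions $u_{\mathcal{Z}_n} - l_{\mathcal{Z}_n} \ge 0$, there follows
$0 \le \int q \, d\mu \le \lim_{n\to\infty} (U_{\mathcal{Z}_n} - L_{\mathcal{Z}_n}) = 0$
and so by 13.2, q = 0 $\mu$-almost everywhere. Since in addition for every n,
$l_{\mathcal{Z}_n} \le f \le u_{\mathcal{Z}_n}$ holds $\mu$-almost everywhere (everywhere except possibly at the points of $\mathcal{Z}_n$),
q = 0 almost everywhere entails that
$\lim_{n\to\infty} l_{\mathcal{Z}_n} = f$ $\mu$-almost everywhere on I.
As has been noted, f is bounded, say $|f| \le M \in \mathbb{R}$. The sequence $(|l_{\mathcal{Z}_n}|)$ is therefore
majorized by the constant M, a $\mu$-integrable function, and so Theorem 15.6 on
dominated convergence delivers the $\mu$-integrability of f as well as the convergence
of $(l_{\mathcal{Z}_n})$ to f in mean. From 15.1 finally follows
$\int f \, d\mu = \lim_{n\to\infty} \int l_{\mathcal{Z}_n} \, d\mu = \lim_{n\to\infty} L_{\mathcal{Z}_n} = \rho$,
which finishes the proof. □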
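
The lower and upper sums of the proof are easy to compute for a concrete function. The sketch below assumes $f(x) = x^2$ on $[0, 1]$ and the dyadic subdivisions into $2^n$ equal cells, each refining its predecessor; these choices and the helper names are made here for illustration and are not taken from the text.

```python
def lower_upper_sums(f_min, f_max, points):
    """Lower sum L and upper sum U for the subdivision `points` of an interval.

    f_min(a, b) and f_max(a, b) must return inf f([a, b]) and sup f([a, b]).
    """
    cells = list(zip(points, points[1:]))
    lower = sum(f_min(a, b) * (b - a) for a, b in cells)
    upper = sum(f_max(a, b) * (b - a) for a, b in cells)
    return lower, upper

# f(x) = x^2 is increasing on [0, 1], so inf and sup sit at the cell endpoints.
def square_min(a, b):
    return a * a

def square_max(a, b):
    return b * b

def dyadic(n):
    """Subdivision of [0, 1] into 2^n equal cells; dyadic(n + 1) refines dyadic(n)."""
    return [i / 2 ** n for i in range(2 ** n + 1)]

sums = [lower_upper_sums(square_min, square_max, dyadic(n)) for n in range(1, 11)]
lowers = [s[0] for s in sums]     # isotone in n
uppers = [s[1] for s in sums]     # antitone in n
```

The lower sums increase, the upper sums decrease, and both enclose the common value 1/3 of the Riemann and Lebesgue integrals; for this f the gap $U - L$ is exactly $2^{-n}$.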

Remarks. 1. Consider once again Dirichlet's jump function f on the unit interval
(cf. Exercise 2 of §10). Being the indicator function of $\mathbb{Q} \cap [0, 1]$, it is Borel
measurable and almost everywhere 0 with respect to L-B measure $\lambda^1_{[0,1]}$. Consequently
it is Lebesgue integrable and $\int_0^1 f(x) \, dx = 0$. But f is not Riemann integrable. So
the roles of Riemann and Lebesgue integration cannot be reversed in 16.4.
2. Borel measurability of f need not be hypothesized: the above proof shows,
even without it, that $\lim l_{\mathcal{Z}_n} = f$ $\mu$-almost everywhere and so f is $\mu$-almost
everywhere equal to the Borel function $\lim l_{\mathcal{Z}_n}$. However, in this case it can well happen
that f itself is not Borel measurable.
3. The ideas in the proof of Theorem 16.4 can be amplified into a non-trivial
criterion for Riemann integrability. Namely, $f : [\alpha, \beta] \to \mathbb{R}$ is Riemann integrable
if and only if it is bounded and is continuous at $\lambda^1$-almost every point of $[\alpha, \beta]$.
See Theorem 2.5.1 of COHN [1980] or the multi-part Exercise 12.51 of HEWITT
and STROMBERG [1965].

16.5 Corollary. Suppose the non-negative, real-valued, Borel measurable function f
on $\mathbb{R}$ is Riemann integrable over every compact interval. Then f is Lebesgue integrable
over $\mathbb{R}$ if and only if the improper Riemann integral
$\rho := \lim_{n\to\infty} \int_{-n}^{+n} f(x) \, dx$
exists. In this case $\rho = \int f \, d\lambda^1$.

Proof. Denote by $\rho_n$ the Riemann integral of f over $A_n := [-n, +n]$ for each $n \in \mathbb{N}$.
According to the theorem just proved
$\rho_n = \int 1_{A_n} f \, d\lambda^1$.
From 11.4 and the fact that $1_{A_n} f \uparrow f$ we get
$\sup_n \rho_n = \int f \, d\lambda^1$.
The improper Riemann integral exists, by definition, just if this supremum is finite
and in that case its value $\rho$ is that supremum. From these observations and the
monotone convergence theorem our present result follows. □

Utilizing the decomposition $f = f^+ - f^-$ into positive and negative parts, it
follows from 12.2 and 16.5 that every Borel measurable real function f on $\mathbb{R}$ with
absolutely convergent improper Riemann integral is also Lebesgue integrable and
$\int f \, d\lambda^1$ coincides with the improper Riemann integral of f. Obviously too, any
open or half-open interval $I \subset \mathbb{R}$ can take over the role of $\mathbb{R}$ in 16.5.
By contrast, from the existence of the improper Riemann integral of f does not
follow the Lebesgue integrability of f, even for continuous functions. Consider, for
example, the function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) := (\sin x)/x$ when $x \ne 0$ and
$f(0) := \lim_{x \to 0} (\sin x)/x = \sin'(0) = 1$. Of course, it is continuous. If for each $k \in \mathbb{N}$
we set
$a_k := \int_{k\pi}^{(k+1)\pi} \dfrac{\sin x}{x} \, dx = (-1)^k \int_0^{\pi} \dfrac{\sin t}{t + k\pi} \, dt$,
we see that the signs of the $a_k$ alternate, their moduli decrease as k increases, and
$|a_k| \le \int_{k\pi}^{(k+1)\pi} \dfrac{1}{x} \, dx = \log \dfrac{(k+1)\pi}{k\pi} = \log\left(1 + \dfrac{1}{k}\right) \to 0$ as $k \to \infty$.
Therefore the series $\sum_{k=1}^{\infty} a_k$ converges. Using this it is very easy to confirm that the
improper Riemann integral
$\lim_{R \to +\infty} \int_0^R \dfrac{\sin x}{x} \, dx$
exists. On the other hand,
$\int_{k\pi}^{(k+1)\pi} \dfrac{|\sin x|}{x} \, dx = \int_0^{\pi} \dfrac{\sin t}{t + k\pi} \, dt \ge \dfrac{1}{(k+1)\pi} \int_0^{\pi} \sin t \, dt = \dfrac{2}{(k+1)\pi}$
and so for every $n \in \mathbb{N}$
$\int_{\mathbb{R}_+} |f| \, d\lambda^1 \ge \int_{[\pi, (n+1)\pi]} |f| \, d\lambda^1 = \sum_{k=1}^{n} \int_{k\pi}^{(k+1)\pi} \dfrac{|\sin x|}{x} \, dx \ge \dfrac{2}{\pi} \sum_{k=1}^{n} \dfrac{1}{k+1}$.

Since the harmonic series diverges, these inequalities show that $\int_{\mathbb{R}_+} |f| \, d\lambda^1 = +\infty$,
and so by 12.2 f is not Lebesgue integrable over $\mathbb{R}_+$.
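
The two estimates for the numbers $a_k$ can be observed numerically. The following sketch approximates each $a_k$ by a composite Simpson rule; the quadrature routine, the number of cells and the range $k = 1, \ldots, 200$ are choices made here and carry the usual discretization error.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of cells."""
    step = (b - a) / n
    total = func(a) + func(b)
    total += 4 * sum(func(a + i * step) for i in range(1, n, 2))
    total += 2 * sum(func(a + i * step) for i in range(2, n, 2))
    return total * step / 3

def sinc(x):
    """(sin x)/x, continuously extended by 1 at x = 0."""
    return math.sin(x) / x if x != 0 else 1.0

def a(k):
    """a_k = integral of (sin x)/x over [k*pi, (k+1)*pi]."""
    return simpson(sinc, k * math.pi, (k + 1) * math.pi)

terms = [a(k) for k in range(1, 201)]
signed_partial = sum(terms)                      # partial sums stay bounded
absolute_partial = sum(abs(t) for t in terms)    # bounded below by a harmonic-type sum
```

The signed partial sums settle down while the sums of the moduli keep growing with the number of terms, in line with the lower bound $\frac{2}{\pi} \sum 1/(k+1)$.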

III. Calculation of the integral G. The preceding considerations show that
integrals which the reader may already have encountered as Riemann integrals
can, in the stated circumstances, be immediately interpreted as Lebesgue integrals.
Known formulas and computational rules for the Riemann integral thereby become
available to the Lebesgue theory as well.

As an illustration, consider the non-negative function

(16.5) $f(x, \omega) := \dfrac{e^{-x(1+\omega^2)}}{1 + \omega^2}$, $(x, \omega) \in \mathbb{R} \times \mathbb{R}$.

Both f and the function $(x, \omega) \mapsto f'(x, \omega) := -e^{-x(1+\omega^2)}$ are continuous. For fixed
$x_0 > 0$ form the auxiliary functions

$h_0(\omega) := e^{-2x_0|\omega|}$ and $h(\omega) := (1 + \omega^2)^{-1}$, $\omega \in \mathbb{R}$.

Their $\lambda^1$-integrability (over $\mathbb{R}$) follows from Corollary 16.5 and the fundamental
theorem of calculus. For example,

$\int_{-\infty}^{+\infty} (1 + \omega^2)^{-1} \, d\omega = \lim_{n\to\infty} [\arctan(\omega)]_{-n}^{+n} = \pi$.

Obviously $f(x, \omega) \le h(\omega)$ for all $(x, \omega) \in \mathbb{R}_+ \times \mathbb{R}$. It follows from 12.2 that for each
$x \in \mathbb{R}_+$ the function $\omega \mapsto f(x, \omega)$ is $\lambda^1$-integrable. And the real function defined
by

(16.6) $\varphi(x) := \int f(x, \omega) \, d\omega$, $x \in \mathbb{R}_+$

is continuous by the continuity lemma 16.1. Note that $\varphi(0) = \pi$. Since $2|\omega| \le 1 + \omega^2$
for all $\omega \in \mathbb{R}$, we have $|f'(x, \omega)| \le h_0(\omega)$ for all $(x, \omega) \in [x_0, +\infty[ \times \mathbb{R}$. Consequently
the differentiation lemma 16.2 insures that $\varphi$ is differentiable in $]x_0, +\infty[$, for every
$x_0 > 0$, that is, differentiable in $]0, +\infty[$, and

(16.7) $\varphi'(x) = -\int e^{-x(1+\omega^2)} \, \lambda^1(d\omega)$ for $x > 0$

and via the substitution $t = \omega\sqrt{x}$ this reads

(16.8) $\varphi'(x) = -G x^{-1/2} e^{-x}$ for $x > 0$

where G designates the integral (16.1) that we are trying to explicitly compute. Its
existence is already part of the preceding analysis, but can also be inferred from
the majorization $e^{-t^2} \le e^{-t}$, which holds for $t \ge 1$. From (16.6), (16.8) and the
fundamental theorem of calculus

$\varphi(x) - \varphi(a) = G \int_x^a t^{-1/2} e^{-t} \, dt = 2G \int_{\sqrt{x}}^{\sqrt{a}} e^{-\omega^2} \, d\omega$,

for $x > 0$ and $a > 0$. Upon letting a run to $+\infty$, we will get

(16.9) $\varphi(x) = 2G \int_{\sqrt{x}}^{+\infty} e^{-\omega^2} \, d\omega$

if we notice that $\varphi(a) \to 0$ as $a \to +\infty$, which in turn is a consequence of the
inequalities
$\varphi(a) \le e^{-a} \int (1 + \omega^2)^{-1} \, \lambda^1(d\omega) = \varphi(0) e^{-a}$ for all $a > 0$.

Because $\varphi$ is continuous on $\mathbb{R}_+$ we can pass to the limit $x \to 0+$ in (16.9) and get

$\pi = \varphi(0) = 2G \int_0^{+\infty} e^{-\omega^2} \, d\omega = G^2$,

using the obvious (on grounds of symmetry) fact that $\int_{-\infty}^{0} e^{-\omega^2} \, d\omega = \int_0^{+\infty} e^{-\omega^2} \, d\omega$.
Since $G > 0$, it follows finally that $G = \sqrt{\pi}$. That is,

(16.10) $\int_{-\infty}^{+\infty} e^{-x^2} \, dx = \sqrt{\pi}$

or equivalently, in the form seen in probability theory,

(16.10') $\int_{-\infty}^{+\infty} e^{-x^2/2} \, dx = \sqrt{2\pi}$.

This derivation goes back to ANONYME [1889] and VAN YZEREN [1979]. A
particularly short alternative one is made possible by Tonelli's theorem (cf. Exercise 4
in §23).
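
The chain (16.6), (16.9), (16.10) lends itself to a numerical cross-check. The sketch below rests on assumptions made here: the substitution $\omega = \tan\theta$, which turns $\varphi(x)$ into $\int_{-\pi/2}^{\pi/2} e^{-x/\cos^2\theta} \, d\theta$, a midpoint rule for both integrals, and the standard identity $\int_s^{+\infty} e^{-\omega^2} d\omega = \frac{\sqrt{\pi}}{2} \operatorname{erfc}(s)$ as provided by `math.erfc`.

```python
import math

def phi(x, n=20000):
    """phi(x) = integral over R of exp(-x(1+w^2)) / (1+w^2) dw, via w = tan(theta)."""
    step = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2 + (i + 0.5) * step     # midpoints avoid cos(theta) = 0
        total += math.exp(-x / math.cos(theta) ** 2)
    return total * step

def gauss(n=16000, cutoff=8.0):
    """Midpoint approximation of G = integral of exp(-t^2) over [-cutoff, cutoff]."""
    step = 2 * cutoff / n
    return step * sum(math.exp(-(-cutoff + (i + 0.5) * step) ** 2) for i in range(n))

G = gauss()

def tail(x):
    """Right-hand side of (16.9): 2 G times the integral of exp(-w^2) over [sqrt(x), +oo[."""
    return 2 * G * (math.sqrt(math.pi) / 2) * math.erfc(math.sqrt(x))
```

Within quadrature accuracy `phi(0.0)` agrees with both $\pi$ and `G * G`, and `phi(x)` agrees with `tail(x)` for $x > 0$.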

Exercises.
1. Which of the two functions below are integrable, which are square-integrable
with respect to Lebesgue-Borel measure on the indicated intervals?
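(a) $f(x) := x^{-1}$, $x \in I := [1, +\infty[$;
(b) $f(x) := x^{-1/2}$, $x \in I := \,]0, 1]$.

2. Show that for every real number $\alpha > 0$ the function $x \mapsto e^{-\alpha x}$ is $\lambda^1$-integrable
over $\mathbb{R}_+$.
3. Show that for every real number $\alpha > 0$ the function

$x \mapsto e^{-\alpha x} \left[ \dfrac{\sin x}{x} \right]^3$

is $\lambda^1$-integrable over $]0, +\infty[$ and that

$\alpha \mapsto \int_0^{+\infty} e^{-\alpha x} \left[ \dfrac{\sin x}{x} \right]^3 \lambda^1(dx)$

is continuous on $]0, +\infty[$.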

§17. Measures with densities: the Radon-Nikodym theorem

Again let $(\Omega, \mathcal{A}, \mu)$ be an arbitrary measure space and $E^* = E^*(\Omega, \mathcal{A})$ the set of
all $\mathcal{A}$-measurable, non-negative numerical functions on $\Omega$. In 12.4 we defined the
integral of every function $f \in E^*$ over every set $A \in \mathcal{A}$. We are interested here in
how this integral behaves with respect to A.

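17.1 Theorem. For each function $f \in E^*$ the equation

(17.1) $\nu(A) := \int_A f \, d\mu$

defines a measure $\nu$ on $\mathcal{A}$.

Proof. $\nu(\emptyset) = 0$ and $\nu \ge 0$. For every sequence $(A_n)_{n \in \mathbb{N}}$ of pairwise disjoint sets
from $\mathcal{A}$ with $A := \bigcup_{n \in \mathbb{N}} A_n$

$1_A f = \sum_{n=1}^{\infty} 1_{A_n} f$

and so by 11.5
$\nu(A) = \sum_{n=1}^{\infty} \nu(A_n)$,
the final property needing to be checked in confirming that $\nu$ is a measure on $\mathcal{A}$. □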

17.2 Definition. If f is a non-negative $\mathcal{A}$-measurable, numerical function on $\Omega$,
then the measure $\nu$ defined on $\mathcal{A}$ by (17.1) is called the measure having density f
with respect to $\mu$. It will be denoted by
(17.2) $\nu = f\mu$.
Concerning the relationship between $\nu$- and $\mu$-integrals we will show

17.3 Theorem. Let $f, \varphi \in E^*$, $\nu := f\mu$. Then

(17.3) $\int \varphi \, d\nu = \int \varphi f \, d\mu$

or, written out,

(17.3') $\int \varphi \, d(f\mu) = \int \varphi f \, d\mu$.

An $\mathcal{A}$-measurable function $\varphi : \Omega \to \overline{\mathbb{R}}$ is $\nu$-integrable if and only if $\varphi f$ is
$\mu$-integrable. In this case (17.3) is again valid.

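Proof. First suppose $\varphi = \sum_{i=1}^{n} \alpha_i 1_{A_i}$ is an $\mathcal{A}$-elementary function. In this case (17.3)
holds because
$\int \varphi \, d\nu = \sum_{i=1}^{n} \alpha_i \nu(A_i) = \sum_{i=1}^{n} \alpha_i \int 1_{A_i} f \, d\mu = \int \varphi f \, d\mu$.
For an arbitrary $\varphi \in E^*$ there is a sequence $(u_n)$ in E such that $u_n \uparrow \varphi$. Since then
$u_n f \uparrow \varphi f$ as well, (17.3) follows from 11.4. Finally, consider any $\mathcal{A}$-measurable
numerical function $\varphi$ on $\Omega$. By now we know that

$\int \varphi^+ \, d\nu = \int \varphi^+ f \, d\mu = \int (\varphi f)^+ \, d\mu$ and $\int \varphi^- \, d\nu = \int \varphi^- f \, d\mu = \int (\varphi f)^- \, d\mu$.

From these equations and the definition of integrability follows the second part of
the theorem. □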
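
On a finite set, where every measure is given by its point masses, formula (17.3) can be verified by direct computation. The sketch below uses a hypothetical four-point space; the names `omega`, `mu`, `f`, `phi`, `g` and their values are illustrative assumptions, and exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction as F

omega = ["a", "b", "c", "d"]
mu = {"a": F(1, 2), "b": F(1), "c": F(2), "d": F(0)}    # point masses of mu
f = {"a": F(3), "b": F(0), "c": F(1, 4), "d": F(7)}     # density, f >= 0

def integral(func, measure, subset=omega):
    """Integral of func over `subset` with respect to a measure given by point masses."""
    return sum(func[w] * measure[w] for w in subset)

# nu = f mu: nu(A) is the integral of f over A, so nu({w}) = f(w) * mu({w}).
nu = {w: f[w] * mu[w] for w in omega}

phi = {"a": F(2), "b": F(5), "c": F(1), "d": F(9)}
phi_times_f = {w: phi[w] * f[w] for w in omega}

lhs = integral(phi, nu)              # integral of phi with respect to nu
rhs = integral(phi_times_f, mu)      # integral of phi * f with respect to mu

# Forming densities twice: rho = g nu is compared with (g f) mu.
g = {"a": F(1), "b": F(2), "c": F(4), "d": F(1)}
rho_direct = {w: g[w] * nu[w] for w in omega}
rho_via_mu = {w: (g[w] * f[w]) * mu[w] for w in omega}
```

Both sides of (17.3) come out as the same rational number, and the two ways of forming the twice-weighted measure agree point by point.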

It now follows that the formation of measures with densities is transitive:

17.4 Corollary. Let $f, g \in E^*$, $\nu := f\mu$ and $\varrho := g\nu$. Then $\varrho = (gf)\mu$, that is,
(17.4) $g(f\mu) = (gf)\mu$.
Proof. For every $A \in \mathcal{A}$

$\varrho(A) = \int_A g \, d\nu = \int 1_A g \, d\nu$

and furthermore, according to 17.3

$\int 1_A g \, d\nu = \int 1_A g f \, d\mu = \int_A (gf) \, d\mu$.

We thus obtain $\varrho(A) = \int_A gf \, d\mu$, for all $A \in \mathcal{A}$; which is what had to be proved. □

On the question of uniqueness of density functions we have

17.5 Theorem. For functions $f, g \in E^*$

(17.5) $f = g$ $\mu$-almost everywhere $\implies$ $f\mu = g\mu$.
If either f or g is $\mu$-integrable, the converse implication holds as well.

Proof. If f and g coincide $\mu$-almost everywhere, then so do $1_A f$ and $1_A g$ for each
$A \in \mathcal{A}$, whence
$\int_A f \, d\mu = \int_A g \, d\mu$ for all $A \in \mathcal{A}$,
which just says that $f\mu = g\mu$.
Now suppose that f is $\mu$-integrable and that $f\mu = g\mu$. Since $g \ge 0$ and $\int g \, d\mu =
\int f \, d\mu < +\infty$, g is also $\mu$-integrable. Let us show that the set
$N := \{f > g\}$,
which lies in $\mathcal{A}$ by 9.3, is a $\mu$-nullset. For every $\omega \in N$, $f(\omega) - g(\omega)$ is defined and
is positive, which means that the definition
$h := 1_N f - 1_N g$
makes sense. The functions $1_N f$, $1_N g$, being majorized by the $\mu$-integrable
functions f, g, are themselves integrable. Because $f\mu = g\mu$, they have the same
$\mu$-integral. From this we get that
$\int h \, d\mu = \int_N f \, d\mu - \int_N g \, d\mu = 0$.
Since $N = \{h > 0\}$, this equality and 13.2 tell us that $\mu(N) = 0$. With the roles
of f and g reversed, this conclusion reads $\mu(N') = 0$, where $N' := \{g > f\}$. Since
$\{f \ne g\} = N \cup N'$, the desired conclusion, namely that $\{f \ne g\}$ is a $\mu$-nullset, is
obtained. □

The converse of implication (17.5) is not valid without some additional hypoth-
esis on the densities f and g. The next example illustrates this.

Example. 1. As in Example 2 of §3 let $\Omega$ be an uncountable set, $\mathcal{A}$ the
$\sigma$-algebra of countable and co-countable subsets of $\Omega$ (see Example 2 in §1). Here,
however, the measure $\mu$ will be defined on $\mathcal{A}$ by $\mu(A) := 0$ or $+\infty$, according as A or
$\complement A$ is countable. If f and g are the constant functions on $\Omega$ with the respective
values 1 and 2, then indeed $f\mu = g\mu$, yet $f(\omega) = g(\omega)$ holds for no $\omega \in \Omega$. Of course, it
then follows from 17.5 that neither f nor g is $\mu$-integrable.
Before turning to the principal problem of this section, we will examine another
characterization of $\sigma$-finite measures which is important for what follows and is of
interest in its own right.

17.6 Lemma. Let $(\Omega, \mathcal{A}, \mu)$ be a measure space. The measure $\mu$ is $\sigma$-finite if and
only if there exists a $\mu$-integrable function h on $\Omega$ which satisfies
(17.6) $0 < h(\omega) < +\infty$ for every $\omega \in \Omega$.

Proof. If $\mu$ is $\sigma$-finite, there is a sequence $(A_n)$ in $\mathcal{A}$ such that $\mu(A_n) < +\infty$ for
each $n \in \mathbb{N}$ and $A_n \uparrow \Omega$. Choose positive real numbers $\eta_n$ satisfying both $\eta_n \le 2^{-n}$
and $\eta_n \mu(A_n) \le 2^{-n}$, for each $n \in \mathbb{N}$. Then the function
$h := \sum_{n=1}^{\infty} \eta_n 1_{A_n}$
does what is wanted. It is measurable, $0 < h(\omega) \le 1$ for each $\omega \in \Omega$, and $\int h \, d\mu \le 1$.
The converse implication is already known: it is contained in the second part
of 13.6. □

In the light of 13.2 this lemma has another formulation: For each $\sigma$-finite
measure $\mu$ there exists a real, measurable function $h > 0$ such that the measure $h\mu$ is
finite and has the same nullsets as $\mu$.
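
The construction in the proof can be carried out explicitly for a simple $\sigma$-finite measure that is not finite. The sketch below assumes counting measure on $\{1, 2, \ldots\}$ with $A_n := \{1, \ldots, n\}$ and the choice $\eta_n := 2^{-n}/n$; the truncation level `N` is an artifact of the computation, not of the lemma.

```python
from fractions import Fraction

N = 40   # truncation level for the infinite sums (an assumption of this sketch)

def eta(n):
    """eta_n = 2^-n / n, so that eta_n <= 2^-n and eta_n * mu(A_n) = 2^-n."""
    return Fraction(1, n * 2 ** n)

def h(w):
    """h(w) = sum of eta_n over all n with w in A_n = {1, ..., n}, truncated at N."""
    return sum(eta(n) for n in range(w, N + 1))

# Integral of the truncated h with respect to counting measure on {1, ..., N}.
integral_h = sum(h(w) for w in range(1, N + 1))
```

Interchanging the two finite sums shows that `integral_h` equals $\sum_{n \le N} 2^{-n} = 1 - 2^{-N}$, so the strictly positive function h has integral at most 1 although the underlying measure is infinite.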
We come now to the main problem, already alluded to: On the $\sigma$-algebra $\mathcal{A}$ of
the measurable space $(\Omega, \mathcal{A})$ two measures $\nu$ and $\mu$ are given. We pose the question
of how to decide whether $\nu$ has a density with respect to $\mu$, that is, whether there
is an $\mathcal{A}$-measurable, non-negative, numerical function f on $\Omega$ satisfying $\nu = f\mu$,
satisfying in other words
$\nu(A) = \int_A f \, d\mu$ for all $A \in \mathcal{A}$.
For an affirmative answer it is necessary, as 13.3 shows, that every $\mu$-nullset in $\mathcal{A}$
be a $\nu$-nullset as well.

17.7 Definition. A measure $\nu$ on $\mathcal{A}$ is called continuous with respect to a
measure $\mu$ on $\mathcal{A}$, for short, $\mu$-continuous, if every $\mu$-nullset from $\mathcal{A}$ is also a $\nu$-nullset.
In the case of a finite measure $\nu$ there is a condition equivalent to $\mu$-continuity
which clarifies and justifies the terminology:

17.8 Theorem. A finite measure $\nu$ on $\mathcal{A}$ is $\mu$-continuous if and only if for every
$\varepsilon > 0$ there exists $\delta > 0$ such that
(17.7) $A \in \mathcal{A}$ and $\mu(A) \le \delta$ $\implies$ $\nu(A) \le \varepsilon$.

Proof. From (17.7) it follows that $\nu(A) \le \varepsilon$ holds for every $\varepsilon > 0$ if A is a $\mu$-nullset.
Hence $\nu(A) = 0$ and $\nu$ is thus a $\mu$-continuous measure, even without the finiteness
hypothesis. For the converse we will show that if (17.7) fails, then $\nu$ is not
$\mu$-continuous. Thus, for some $\varepsilon > 0$ there is no $\delta$, which means there is a sequence
$(A_n)_{n \in \mathbb{N}}$ in $\mathcal{A}$ with the properties
$\mu(A_n) \le 2^{-n}$ and $\nu(A_n) > \varepsilon$ for each $n \in \mathbb{N}$.
We set
$A := \limsup_{n\to\infty} A_n := \bigcap_{n \in \mathbb{N}} \bigcup_{m \ge n} A_m$
and have a set in $\mathcal{A}$ which on the one hand satisfies
$\mu(A) \le \mu\Big(\bigcup_{m \ge n} A_m\Big) \le \sum_{m=n}^{\infty} \mu(A_m) \le \sum_{m=n}^{\infty} 2^{-m} = 2^{-n+1}$ for every $n \in \mathbb{N}$,
whence $\mu(A) = 0$, and on the other hand, due to the finiteness of $\nu$ and 15.3,
satisfies
$\nu(A) \ge \limsup_{n\to\infty} \nu(A_n) \ge \varepsilon > 0$,
which proves that $\nu$ is not $\mu$-continuous. □

Examples. 2. Let $\Omega$ be an uncountable set, $\mathcal{A}$ the $\sigma$-algebra of countable and
co-countable subsets of $\Omega$ (Example 2 in §1). As in the preceding Example, consider
the measure $\nu$ on $\mathcal{A}$ which assigns to a set the value 0 or $+\infty$ according as the set
or its complement is countable. Let $\mu$ denote the counting measure $\zeta$ on $\mathcal{A}$ (from
Example 6, §3). Since $\emptyset$ is the only $\mu$-nullset, $\nu$ is trivially $\mu$-continuous. However,
$\nu$ cannot have a density with respect to $\mu$. For from $\nu = f\mu$ with $f \in E^*$ it would
follow that
$0 = \nu(\{\omega\}) = \int_{\{\omega\}} f \, d\mu = f(\omega)\mu(\{\omega\}) = f(\omega)$
for every $\omega \in \Omega$, making f = 0 and therefore $\nu = f\mu = 0$, which is not the case
because $\Omega$ is uncountable.
3. Let $(\mathbb{R}, \mathcal{B}^1, \mu)$ be the 1-dimensional Lebesgue-Borel measure space (so $\mu = \lambda^1$)
and denote by $\mathcal{N}$ the system of all $\mu$-nullsets. Then $\mathcal{N}$ is an example of a $\sigma$-ideal
in $\mathcal{B}^1$: The union of any sequence of its sets is another, as are the intersections of
its sets with those of $\mathcal{B}^1$ (cf. Exercise 5, §3). These properties insure that

$\nu(A) := 0$ if $A \in \mathcal{N}$, and $\nu(A) := +\infty$ if $A \in \mathcal{B}^1 \setminus \mathcal{N}$,

defines a measure on $\mathcal{B}^1$ (cf. Exercise 6, §3). From its definition it is clear that $\nu$
is $\mu$-continuous. Here however (17.7) fails, since for every $\delta > 0$
$\mu([0, \delta]) = \delta$ and $\nu([0, \delta]) = +\infty$.
Thus the finiteness hypothesis on $\nu$ in 17.8 is not superfluous. Example 2 shows
that for the existence of a density $f \in E^*$ with $\nu = f\mu$, the $\mu$-continuity of $\nu$, while
necessary, is not sufficient. All the more noteworthy is the theorem of Radon and
Nikodym which we will prove, after a preparatory lemma.

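17.9 Lemma. Let $\sigma$ and $\tau$ be finite measures on a $\sigma$-algebra $\mathcal{A}$ of subsets of $\Omega$
and let $\varrho := \tau - \sigma$ denote their difference. Then there is a set $\Omega_0 \in \mathcal{A}$ with the
properties
(17.8) $\varrho(\Omega_0) \ge \varrho(\Omega)$;
(17.9) $\varrho(A) \ge 0$ for all $A \in \Omega_0 \cap \mathcal{A}$.

Proof. Let us first prove the weaker claim:
(*) For every $\varepsilon > 0$ there exists $\Omega_\varepsilon \in \mathcal{A}$ with the properties
(17.8') $\varrho(\Omega_\varepsilon) \ge \varrho(\Omega)$;
(17.9') $\varrho(A) > -\varepsilon$ for all $A \in \Omega_\varepsilon \cap \mathcal{A}$.
We may obviously suppose that $\varrho(\Omega) \ge 0$, since otherwise $\Omega_\varepsilon := \emptyset$ does what is
wanted. If then $\varrho(A) > -\varepsilon$ for all $A \in \mathcal{A}$, it suffices to choose $\Omega_\varepsilon := \Omega$. So we
consider the case that some $A_1 \in \mathcal{A}$ satisfies $\varrho(A_1) \le -\varepsilon$. From the definition of $\varrho$
and the subtractivity of the finite measures $\sigma$ and $\tau$,
$\varrho(\complement A_1) = \varrho(\Omega) - \varrho(A_1) \ge \varrho(\Omega) + \varepsilon > \varrho(\Omega)$.
Therefore, if $\varrho(A) > -\varepsilon$ for all $A \in (\complement A_1) \cap \mathcal{A}$, we can set $\Omega_\varepsilon := \complement A_1$ and be done.
In the contrary case there is a set $A_2 \in (\complement A_1) \cap \mathcal{A}$ with $\varrho(A_2) \le -\varepsilon$. Then because
$A_1$, $A_2$ are disjoint
$\varrho(\complement(A_1 \cup A_2)) = \varrho(\Omega) - \varrho(A_1) - \varrho(A_2) \ge \varrho(\Omega) + 2\varepsilon > \varrho(\Omega)$
and the preceding dichotomy presents itself anew. If after finitely many
repetitions of this procedure we have not reached our goal, then we will have generated
a sequence $(A_n)_{n \in \mathbb{N}}$ of pairwise disjoint sets in $\mathcal{A}$ with
$\varrho(\Omega \setminus (A_1 \cup \ldots \cup A_n)) > \varrho(\Omega)$ and $\varrho(A_n) \le -\varepsilon$ for every $n \in \mathbb{N}$.
Because of the finite additivity of $\sigma$ and $\tau$, this would have the consequence that
$\varrho(A_1 \cup \ldots \cup A_n) = \sum_{i=1}^{n} \varrho(A_i) \le -n\varepsilon$ for every $n \in \mathbb{N}$
and entail the divergence of the series $\sum_{n=1}^{\infty} \varrho(A_n)$. But the latter is untenable,
because when the $\sigma$-additivity of $\sigma$ and $\tau$ is applied to the disjoint union
$A := \bigcup_{n \in \mathbb{N}} A_n$ it shows this series to be convergent:
$\sum_{n=1}^{\infty} \varrho(A_n) = \sum_{n=1}^{\infty} (\tau(A_n) - \sigma(A_n)) = \tau(A) - \sigma(A) \in \mathbb{R}$.
This contradiction proves that the construction procedure must terminate after
some finite number n of steps, with the set $\Omega_\varepsilon := \complement(A_1 \cup \ldots \cup A_n)$ then
satisfying (17.8') and (17.9').
We now take $\varepsilon = 1/n$ in (*) for successive $n \in \mathbb{N}$. The sets $\Omega_{1/n}$ can be chosen with
the additional property of forming a decreasing sequence. For if $\Omega_1 \supset \Omega_{1/2} \supset \ldots \supset \Omega_{1/n}$ has
already been realized, we simply apply (*) to $\Omega_{1/n}$ as a new base space in the role of $\Omega$,
that is, we consider the restriction of the measures $\sigma$ and $\tau$ to $\Omega_{1/n} \cap \mathcal{A}$. Finally,
the set $\Omega_0 := \bigcap_{n \in \mathbb{N}} \Omega_{1/n}$ will be seen to do the desired job. For since $\Omega_{1/n} \downarrow \Omega_0$,
(17.8) follows from (17.8'), and (17.9) follows from (17.9'), which insures that
$\varrho(A) > -1/n$ for all $n \in \mathbb{N}$ and every $A \in \Omega_0 \cap \mathcal{A}$. □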
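
The removal procedure behind claim (*) can be imitated on a finite set, where it terminates trivially. The following sketch is an illustration only: the point masses of `tau` and `sigma` are hypothetical, and the search through all subsets is feasible only because the base set has four points.

```python
from itertools import combinations

tau = {"a": 3.0, "b": 1.0, "c": 0.5, "d": 2.0}      # hypothetical point masses of tau
sigma = {"a": 1.0, "b": 2.5, "c": 0.5, "d": 4.0}    # hypothetical point masses of sigma
omega = sorted(tau)

def rho(subset):
    """rho = tau - sigma evaluated on a subset of the finite base set."""
    return sum(tau[w] - sigma[w] for w in subset)

def subsets(points):
    """All subsets of a finite list of points, ordered by size."""
    return [c for r in range(len(points) + 1) for c in combinations(points, r)]

def omega_eps(eps):
    """Discard sets A with rho(A) <= -eps until none is left, as in the proof of (*)."""
    current = list(omega)
    while True:
        bad = next((A for A in subsets(current) if rho(A) <= -eps), None)
        if bad is None:
            return current
        current = [w for w in current if w not in bad]
```

Each discarded set has negative $\varrho$-value, so $\varrho$ of the remaining set can only grow, which is (17.8'); the loop stops exactly when (17.9') holds.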
As indicated, this puts us in a position to answer the important question we
posed earlier.

17.10 Theorem (Radon-Nikodym). Let $\mu$ and $\nu$ be measures on a $\sigma$-algebra $\mathcal{A}$
in a set $\Omega$. If $\mu$ is $\sigma$-finite, the following two assertions are equivalent:
(i) $\nu$ has a density with respect to $\mu$.
(ii) $\nu$ is $\mu$-continuous.

Proof. Only the implication (ii) $\implies$ (i) is still in need of proof. To that end we
distinguish three cases.
First Case: The measures $\mu$ and $\nu$ are each finite. Form the set $\mathcal{G}$ of all
$\mathcal{A}$-measurable numerical functions $g \ge 0$ on $\Omega$ which satisfy $g\mu \le \nu$, that is, which
satisfy
$\int_A g \, d\mu \le \nu(A)$ for all $A \in \mathcal{A}$.
The constant function g = 0 lies in $\mathcal{G}$, so $\mathcal{G}$ is not empty. $\mathcal{G}$ is moreover sup-stable,
that is, $g \vee h \in \mathcal{G}$ whenever $g, h \in \mathcal{G}$. Indeed, setting $A_1 := \{g \ge h\}$, $A_2 := \complement A_1$,
every $A \in \mathcal{A}$ satisfies
$\int_A g \vee h \, d\mu = \int_{A \cap A_1} g \, d\mu + \int_{A \cap A_2} h \, d\mu \le \nu(A \cap A_1) + \nu(A \cap A_2) = \nu(A)$.
Since $\int g \, d\mu \le \nu(\Omega) < +\infty$ for every $g \in \mathcal{G}$, the number
$\gamma := \sup\{\int g \, d\mu : g \in \mathcal{G}\}$
is finite and there is a sequence $(g'_n)$ in $\mathcal{G}$ such that $\lim \int g'_n \, d\mu = \gamma$. Due to
sup-stability the functions $g_n := g'_1 \vee \ldots \vee g'_n$ lie in $\mathcal{G}$, and consequently
$\gamma \ge \int g_n \, d\mu \ge \int g'_n \, d\mu$ (since $g_n \ge g'_n$) for all $n \in \mathbb{N}$. This shows that $\lim \int g_n \, d\mu = \gamma$. As
the sequence $(g_n)$ is isotone, the monotone convergence theorem can be applied,
assuring that $f := \sup g_n$ is a function in $\mathcal{G}$ and that $\int f \, d\mu = \gamma$. All this proves
that the function $g \mapsto \int g \, d\mu$ on $\mathcal{G}$ assumes its maximum value at f.
Now we prove that $\nu = f\mu$. In any case we have $f\mu \le \nu$, since $f \in \mathcal{G}$, and so
$\tau := \nu - f\mu$
is a finite measure on $\mathcal{A}$, evidently $\mu$-continuous since $\nu$ is by hypothesis. We have
to show that $\tau = 0$. So let us assume contrariwise that $\tau(\Omega) > 0$. Due to the
$\mu$-continuity of $\tau$, this entails that $\mu(\Omega) > 0$ as well, and we may form the real
number
$\beta := \dfrac{1}{2} \dfrac{\tau(\Omega)}{\mu(\Omega)} > 0$,
which satisfies $\tau(\Omega) = 2\beta\mu(\Omega) > \beta\mu(\Omega)$. The preceding lemma applied to $\tau$ and
$\sigma := \beta\mu$ supplies a set $\Omega_0 \in \mathcal{A}$ which satisfies
$\tau(\Omega_0) - \beta\mu(\Omega_0) \ge \tau(\Omega) - \beta\mu(\Omega) > 0$ and $\tau(A) \ge \beta\mu(A)$ for all $A \in \Omega_0 \cap \mathcal{A}$.
The $\mathcal{A}$-measurable, non-negative function $f_0 := f + \beta 1_{\Omega_0}$ therefore has the
property
$\int_A f_0 \, d\mu = \int_A f \, d\mu + \beta\mu(\Omega_0 \cap A) \le \int_A f \, d\mu + \tau(A) = \nu(A)$
for every $A \in \mathcal{A}$. These inequalities put $f_0$ in $\mathcal{G}$. Since $\tau$ is $\mu$-continuous and
$\tau(\Omega_0) > \beta\mu(\Omega_0)$, we must have $\mu(\Omega_0) > 0$, leading to
$\int f_0 \, d\mu = \int f \, d\mu + \beta\mu(\Omega_0) = \gamma + \beta\mu(\Omega_0) > \gamma$,
an inequality which is incompatible with the definition of $\gamma$ and the fact that
$f_0 \in \mathcal{G}$. The assumption $\tau(\Omega) > 0$ is therefore untenable, and $\tau = 0$, as desired.
Second Case: The measure $\mu$ is finite and the measure $\nu$ is infinite. We will produce
a decomposition $\Omega = \bigcup_{n=0}^{\infty} \Omega_n$ of $\Omega$ into pairwise disjoint sets from $\mathcal{A}$ with the
following properties
(a) $A \in \Omega_0 \cap \mathcal{A}$ $\implies$ either $\mu(A) = \nu(A) = 0$ or $0 < \mu(A) < \nu(A) = +\infty$.
(b) $\nu(\Omega_n) < +\infty$ for all $n \in \mathbb{N}$.
To this end let $\mathcal{Q}$ denote the system of all $Q \in \mathcal{A}$ with $\nu(Q) < +\infty$ and define
$\alpha := \sup\{\mu(Q) : Q \in \mathcal{Q}\}$.
This is a real number because the measure $\mu$ is finite. There is a sequence $(Q_m)_{m \in \mathbb{N}}$
in $\mathcal{Q}$ with $\lim \mu(Q_m) = \alpha$. Since $\mathcal{Q}$ is evidently closed under finite unions, $(Q_m)$
may be assumed to be isotone. $Q_0 := \bigcup_{m \in \mathbb{N}} Q_m$ is then a set from $\mathcal{A}$ satisfying
$\mu(Q_0) = \alpha$. We will show that $\Omega_0 := \complement Q_0$ satisfies (a). So consider $A \in \Omega_0 \cap \mathcal{A}$ with
$\nu(A) < +\infty$. We need to see that $\mu(A) = \nu(A) = 0$, and since $\nu$ is $\mu$-continuous
we really only need to confirm that $\mu(A) = 0$. Since $\nu(A) < +\infty$ and, as noted
already, $\mathcal{Q}$ is closed under union, each $Q_m \cup A$ lies in $\mathcal{Q}$, so that $\mu(Q_m \cup A) \le \alpha$,
and consequently
$\mu(Q_0 \cup A) = \lim_{m\to\infty} \mu(Q_m \cup A) \le \alpha$.
Since A is disjoint from $Q_0$, $\mu(Q_0 \cup A) = \alpha + \mu(A)$. Conjoined with the preceding
inequality and the finiteness of $\alpha$ this says that indeed $\mu(A)$ must be 0. Finally, to
take care of (b) we merely define $\Omega_1 := Q_1$, and $\Omega_m := Q_m \setminus Q_{m-1}$ for all integers
$m \ge 2$ in order to get a decomposition of $\Omega$ with the desired properties.
Now let $\mu_n$, $\nu_n$ denote the restrictions of $\mu$, $\nu$ to the trace $\sigma$-algebra $\Omega_n \cap \mathcal{A}$,
for $n = 0, 1, \ldots$ and note that each $\nu_n$ is a $\mu_n$-continuous measure. Moreover, for
all $n \ge 1$ both $\mu_n$ and $\nu_n$ are finite. Case 1 therefore supplies $\Omega_n \cap \mathcal{A}$-measurable
functions $f_n \ge 0$ on $\Omega_n$ with $\nu_n = f_n \mu_n$. Taking $f_0$ to be the constant function $+\infty$
on $\Omega_0$, $\nu_0 = f_0 \mu_0$ also holds, thanks to (a). Finally, "putting all the pieces together"
gives our result in this second case. Namely, the function f on $\Omega$ defined to coincide
on each $\Omega_n$ with $f_n$ $(n = 0, 1, \ldots)$ is non-negative, $\mathcal{A}$-measurable and satisfies
$\nu = f\mu$.
Third Case: This is the general case: only the $\sigma$-finiteness of $\mu$ is demanded. There is according to 17.6 a strictly positive function $h \in \mathscr{L}^1(\mu)$. The measure $h\mu$ is therefore finite and possesses exactly the same nullsets as does $\mu$. Consequently $\nu$ is also $(h\mu)$-continuous. By what has already been proved there is then an $\mathscr{A}$-measurable function $f \ge 0$ on $\Omega$ with $\nu = f(h\mu)$. According to 17.4 $\nu$ then has the density $fh$ with respect to $\mu$. □

The question arises whether, in the situation of Theorem 17.10, the density $f$ of $\nu$ is $\mu$-almost everywhere uniquely determined. From 17.5 we at least get a positive answer when $f$ is $\mu$-integrable, that is, when $\nu$ is a finite measure. But more is true:

17.11 Theorem. Let $\nu = f\mu$ be a measure having a density $f$ with respect to a $\sigma$-finite measure $\mu$ on $\mathscr{A}$. Then $f$ is $\mu$-almost everywhere uniquely determined. The measure $\nu$ is $\sigma$-finite exactly when $f$ is $\mu$-almost everywhere real-valued.

Proof. First we show that $f$ is $\mu$-almost everywhere uniquely determined if the measure $\mu$ is finite. In proving this we may assume that $\nu(\Omega) = +\infty$, since its truth is otherwise a consequence of the second part of 17.5. Furthermore, as we now find ourselves in case 2 of the preceding proof, the decomposition of $\Omega$ into $\Omega_0, \Omega_1, \ldots$ employed there lets us confine our attention to $\Omega_0$, as 17.5 takes care of the remaining $\Omega_n$ ($n \in \mathbb{N}$). So it suffices to treat the case $\Omega = \Omega_0$, that is, to assume that $\mu$ and $\nu$ are linked by the alternative:
$A \in \mathscr{A} \Rightarrow$ either $\mu(A) = \nu(A) = 0$ or $0 < \mu(A) < \nu(A) = +\infty$.
The constant function $+\infty$ is then a density for $\nu$ with respect to $\mu$ and what has to be shown for uniqueness is that $f = +\infty$ holds $\mu$-almost everywhere. And for that it suffices to show that
$\mu(\{f < n\}) = 0$ for each $n \in \mathbb{N}$,
which in turn is a consequence of the above alternative and the inequalities
$\nu(\{f < n\}) = \int_{\{f < n\}} f\,d\mu \le n\,\mu(\{f < n\}) < +\infty$
coming from the finiteness of $\mu$.
We will use 17.6 to reduce the general case of $\sigma$-finite $\mu$ to the case just treated. That lemma supplies a strictly positive function $h \in \mathscr{L}^1(\mu)$. The measure $h\nu = h(f\mu) = f(h\mu)$ has the density $f$ with respect to the finite measure $h\mu$, so $f$ is $(h\mu)$-almost everywhere uniquely determined. Since the measures $\mu$ and $h\mu$ have the same nullsets, $f$ is therefore also uniquely determined $\mu$-almost everywhere.
Next, suppose $\nu$ is $\sigma$-finite. From 17.6 once again we get a strictly positive function $k \in \mathscr{L}^1(\nu)$. Then $k\nu = (fk)\mu$ is a finite measure, that is, $fk$ is $\mu$-integrable, consequently also $\mu$-almost everywhere real-valued. Because $k$ takes only non-zero real values, this means that $f$ itself is real $\mu$-almost everywhere.
Conversely, suppose that $f$ is $\mu$-almost everywhere real-valued. We want to see that $\nu$ is $\sigma$-finite. First of all, there is a decomposition $\Omega = \bigcup_{n=0}^{\infty} \Omega_n$ of $\Omega$ into a sequence of pairwise disjoint sets from $\mathscr{A}$ each of finite $\mu$-measure. Set $A_n := \{n - 1 \le f < n\}$ for each $n \in \mathbb{N}$ and $A_0 := \{f = +\infty\}$, the present hypothesis being just that $\mu(A_0) = 0$. $\Omega = \bigcup_{i,j=0}^{\infty} (\Omega_i \cap A_j)$ is a decomposition of $\Omega$ into a (doubly-indexed) sequence of pairwise disjoint sets from $\mathscr{A}$. If each has finite $\nu$-measure, this proves that $\nu$ is $\sigma$-finite. Consider any $i \in \mathbb{Z}_+$. Because $\mu(A_0) = 0$ and $\nu = f\mu$, we have $\nu(\Omega_i \cap A_0) \le \nu(A_0) = 0$. Because $\nu = f\mu$ and $f < j$ in $A_j$, we have $\nu(\Omega_i \cap A_j) \le j\mu(\Omega_i) < +\infty$ for all $j \in \mathbb{N}$ as well. Thus all is proven. □

In the generality presented here Theorem 17.10 was proved in 1930 by O.M. NIKODYM (1888-1974). H. LEBESGUE proved the theorem in 1910 for the case where $\mu$ is the L-B measure $\lambda^1$. J. RADON (1887-1956) pushed things further in a fundamental work which appeared in 1913. So 17.10 is often also called the theorem of Lebesgue-Radon-Nikodym. The uniquely determined density $f$ in 17.11 is called the Radon-Nikodym density or the Radon-Nikodym integrand (of $\nu$ with respect to $\mu$). A beautiful proof of 17.10 by elementary Hilbert-space methods was discovered in 1940 by J. VON NEUMANN (1903-1957) and appears in many textbooks, e.g., in RUDIN [1987], p. 130-131.
The history of the result to be presented next, the Lebesgue decomposition theorem, runs somewhat parallel, Radon and Nikodym having also made significant contributions. We need a concept complementary to $\mu$-continuity, namely $\mu$-singularity:

17.12 Definition. Let $(\Omega, \mathscr{A})$ be a measurable space, $\mu$ and $\nu$ measures defined on $\mathscr{A}$. Let us write $\nu \ll \mu$ if $\nu$ is $\mu$-continuous. $\nu$ is said to be singular with respect to $\mu$ (or $\mu$-singular), written $\nu \perp \mu$, if a set $N \in \mathscr{A}$ exists with $\mu(N) = 0 = \nu(\complement N)$.
It is obvious that the relation $\nu \perp \mu$ is symmetric in $\mu$ and $\nu$, so it is also expressed as $\mu$ and $\nu$ are singular to each other (or mutually singular). The definition of $\nu \perp \mu$ expresses the fact that for a suitable $\mu$-nullset $N \in \mathscr{A}$
(17.10) $\nu(A) = \nu(A \cap N)$ for all $A \in \mathscr{A}$,
as follows from $\nu(A) = \nu(A \cap N) + \nu(A \cap \complement N)$ and $\nu(\complement N) = 0$. The condition that $\nu \perp \mu$ thus says that the measure $\nu$ is "carried by a $\mu$-nullset". From $\nu \ll \mu$ and $\nu \perp \mu$ together follows that $\nu(N) = 0$, and so $\nu = 0$. In this sense the concepts $\mu$-continuity and $\mu$-singularity are diametral or antipodal. Relative to L-B measure $\lambda^d$ every Dirac measure $\varepsilon_x$ on $\mathscr{B}^d$ obviously satisfies $\lambda^d \perp \varepsilon_x$.

17.13 Theorem (Lebesgue's decomposition theorem). If $\mu$ and $\nu$ are $\sigma$-finite measures on a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$, then $\nu$ can be decomposed in just one way as $\nu = \nu_c + \nu_s$ with measures $\nu_c$, $\nu_s$ on $\mathscr{A}$ that satisfy $\nu_c \ll \mu$ and $\nu_s \perp \mu$.

$\nu_c$ is called the continuous part of $\nu$ with respect to $\mu$, $\nu_s$ the singular part. The Radon-Nikodym theorem is applicable to the part $\nu_c$.

Proof. We will carry out the proof in detail only for finite $\mu$ and $\nu$ and indicate in Exercise 4 how the reader can then handle the general case himself.
Existence of a decomposition: Let $\mathscr{N}_\mu$ designate the system of all $\mu$-nullsets in $\mathscr{A}$. Since $\nu(A) \le \nu(\Omega) < +\infty$ for every $A \in \mathscr{A}$,
(17.11) $\alpha := \sup\{\nu(A) : A \in \mathscr{N}_\mu\}$
is a real number. Since $\mathscr{N}_\mu$ is closed under countable unions, there exists an isotone sequence $(A_n)$ in $\mathscr{N}_\mu$ with $\nu(A_n) \uparrow \alpha$. Since $\nu$ is continuous from below, it follows that
$\nu(N) = \alpha$
for the set $N := \bigcup_{n \in \mathbb{N}} A_n \in \mathscr{N}_\mu$. We will show that via
$\nu_c(A) := \nu(A \cap \complement N)$ and $\nu_s(A) := \nu(A \cap N)$
two measures are defined on $\mathscr{A}$ that do what is wanted. Evidently $\nu = \nu_c + \nu_s$, and $\nu_s \perp \mu$ since $N \in \mathscr{N}_\mu$. To prove that $\nu_c \ll \mu$, it must be shown that $\nu(A') = 0$ whenever $A \in \mathscr{N}_\mu$ and $A' := A \cap \complement N$. As a subset of $A \in \mathscr{N}_\mu$, the set $A'$, and then also the set $A' \cup N$, is $\mu$-null. Therefore $\nu(A' \cup N) \le \alpha$ by definition of $\alpha$. But $A' \cap N = \emptyset$ and $\nu(N) = \alpha$. Hence
$\alpha + \nu(A') = \nu(N) + \nu(A') = \nu(N \cup A') \le \alpha$,
from which follows $\nu(A') = 0$ as desired, since $\alpha$ is finite.
Uniqueness of the decomposition: Suppose
(17.12) $\nu = \nu_c + \nu_s = \nu_c' + \nu_s'$
are two decompositions of the kind described in the theorem. The measures $\nu_s$, $\nu_s'$ are carried by $\mu$-nullsets $N$, $N'$ in the sense of (17.10), which means that
(17.13) $\nu_s(A) = \nu_s(A \cap N)$ and $\nu_s'(A) = \nu_s'(A \cap N')$ for all $A \in \mathscr{A}$.
Setting $N_0 := N \cup N'$ gives a set in $\mathscr{N}_\mu$, so that from $\nu_c, \nu_c' \ll \mu$ follows
$\nu_c(A \cap N_0) = \nu_c'(A \cap N_0) = 0$ for every $A \in \mathscr{A}$.
Therefore (17.12) and (17.13) give
$\nu(A \cap N_0) = \nu_c(A \cap N_0) + \nu_s(A \cap N_0) = \nu_s(A \cap N_0) = \nu_s(A \cap N_0 \cap N) = \nu_s(A \cap N) = \nu_s(A)$ for every $A \in \mathscr{A}$.
Analogously of course, $\nu(A \cap N_0) = \nu_s'(A)$ for every $A \in \mathscr{A}$. Thus we have $\nu_s = \nu_s'$. A return to (17.12), recalling that all measures are finite, gives $\nu_c = \nu_c'$ as well. □
There is a short, elementary proof of 17.13 that does not make use of the Radon-Nikodym theorem; see Woo [1971].
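
On a finite set the existence part of the proof becomes completely explicit, and a small sketch may help fix ideas. It assumes (our simplification, not the text's setting) that both measures are given by point weights, so that the maximal $\mu$-nullset $N$ is just the set of points of $\mu$-weight 0. The names `lebesgue_decomposition`, `nu_c`, `nu_s` are illustrative only.

```python
from fractions import Fraction

def lebesgue_decomposition(mu, nu):
    """Decompose nu relative to mu on a finite set (weights given as dicts)."""
    # On a finite set the largest mu-nullset N is simply {w : mu({w}) = 0}.
    N = {w for w in mu if mu[w] == 0}
    nu_s = {w: (nu[w] if w in N else 0) for w in mu}  # nu_s(A) = nu(A ∩ N)
    nu_c = {w: (0 if w in N else nu[w]) for w in mu}  # nu_c(A) = nu(A ∩ CN)
    # A density of nu_c with respect to mu; its values on the nullset N are irrelevant.
    f = {w: (Fraction(0) if w in N else Fraction(nu[w], mu[w])) for w in mu}
    return nu_c, nu_s, f, N

# Hypothetical example data: integer point weights on a four-point set.
mu = {"a": 2, "b": 0, "c": 1, "d": 0}
nu = {"a": 3, "b": 5, "c": 0, "d": 0}
nu_c, nu_s, f, N = lebesgue_decomposition(mu, nu)
```

Here `nu_s` is carried by the $\mu$-nullset `N`, while `nu_c` has the density `f` with respect to `mu`, in line with the remark that the Radon-Nikodym theorem applies to the continuous part.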

Exercises.
1. Show that the Dirac measure $\varepsilon_x$ on $\mathbb{R}^d$ has no density with respect to $\lambda^d$, for any $x \in \mathbb{R}^d$. (Physicists occasionally work with such a "symbolic" density $\delta_x$, calling it the Dirac function at the point $x$. The correct mathematical object is nevertheless the Dirac measure $\varepsilon_x$.)
2. Show that the relation $\ll$ on the set of measures on a $\sigma$-algebra $\mathscr{A}$ is reflexive and transitive. The relation $\mu \sim \nu$ defined as $\mu \ll \nu$ and $\nu \ll \mu$ is then an equivalence relation. Two measures $\mu$ and $\nu$ stand in this relation just when they have the same nullsets. For $\sigma$-finite measures $\mu$ and $\nu$ on $\mathscr{A}$ show that $\mu \sim \nu$ is equivalent to $\nu = f\mu$ for a density $f$ which satisfies $0 < f(\omega) < +\infty$ for $\mu$-almost all (or even for all) $\omega \in \Omega$.
3. On a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$ two measures $\mu$ and $\nu$ are related by $\nu \le \mu$. Show that if further $\mu$ is $\sigma$-finite, then there is an $\mathscr{A}$-measurable function $f$ satisfying $0 \le f \le 1$ such that $\nu = f\mu$.
4. Lebesgue's decomposition theorem was proved for finite measures $\mu$ and $\nu$. Show how to infer its validity for $\sigma$-finite measures from this. [Hint: For the existence proof use 17.6. For the uniqueness proof choose a sequence $(A_n)$ in $\mathscr{A}$ with $A_n \uparrow \Omega$ and $\mu(A_n)$, $\nu(A_n)$ finite for each $n$, and consider the measures $\nu_n(A) := \nu(A \cap A_n)$, $A \in \mathscr{A}$, $n \in \mathbb{N}$.]
5. Let $\nu = \nu_c + \nu_s$ be the Lebesgue decomposition of a $\sigma$-finite measure $\nu$ on $\mathscr{A}$ with respect to a $\sigma$-finite measure $\mu$. The singular part $\nu_s$ has the form $\nu_s(A) = \nu(A \cap N)$ for all $A \in \mathscr{A}$ and a suitable $\mu$-nullset $N \in \mathscr{A}$. Show that if $N'$ is any other $\mu$-nullset with this property, then $\mu(N \triangle N') = \nu(N \triangle N') = 0$.
6. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $\nu = f\mu$ a $\sigma$-finite measure on $\mathscr{A}$ having density $f$ with respect to $\mu$. Show that this density function is $\mu$-almost everywhere uniquely determined and is $\mu$-almost everywhere real-valued. Show that if $f$ is strictly positive, then $\mu$ itself is $\sigma$-finite.
7. Let $(\Omega, \mathscr{A})$ be a measurable space. For every measure $\mu$ on $\mathscr{A}$ let $\mathscr{N}_\mu$ denote the $\sigma$-ideal of its nullsets. Show that for any sequence $(\mu_n)_{n \in \mathbb{N}}$ of $\sigma$-finite measures on $\mathscr{A}$ there is a finite measure $\mu$ on $\mathscr{A}$ for which $\mathscr{N}_\mu = \bigcap_{n \in \mathbb{N}} \mathscr{N}_{\mu_n}$.
8. The set $\Omega := \,]0, +\infty[$ is a group with respect to multiplication. Show that the measure on $\Omega \cap \mathscr{B}^1$ defined by $\mu := h\lambda^1_\Omega$ with density function $h(x) := 1/x$ is invariant under each self-mapping $x \mapsto ax$ of $\Omega$ ($a \in \Omega$). $\mu$ is thus the Haar measure of the group $\Omega$ in the sense of the remark immediately following 8.2.

§18*. Signed measures

It is worthwhile turning our attention back to Lemma 17.9. The measure concept in this book is that formulated in Definition 3.3: Measures are premeasures $\mu$ on a $\sigma$-algebra $\mathscr{A}$, and so are non-negative $\sigma$-additive functions on $\mathscr{A}$ satisfying the additional condition $\mu(\emptyset) = 0$. In Lemma 17.9 we encountered a real-valued, $\sigma$-additive function $\varrho$ which is the difference of two finite measures. Similarly for any $f \in \mathscr{L}^1(\mu)$ the function $A \mapsto \int_A f\,d\mu$ on $\mathscr{A}$ is the difference of two finite measures, for example $f^+\mu$ and $f^-\mu$.
We will call a real function $\varrho : \mathscr{A} \to \mathbb{R}$ on a $\sigma$-algebra a finite signed measure if it is $\sigma$-additive in the sense of (3.2), non-negativity not being required. From $\sigma$-additivity applied to the constant sequence $\emptyset, \emptyset, \ldots$ follows immediately that (3.1) is also satisfied, that is, $\varrho(\emptyset) = 0$, because $\varrho$ is only allowed to take real values. A second pass through the proof of Lemma 17.9 will convince the reader that this lemma is in fact valid for every finite signed measure. As a corollary we immediately get the following theorem on the existence of a Hahn decomposition of $\varrho$, a theorem that goes back to H. HAHN (1879-1934).

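18.1 Theorem. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in a set $\Omega$. Then there are sets $\Omega^+, \Omega^- \in \mathscr{A}$ with $\Omega = \Omega^+ \cup \Omega^-$, $\Omega^+ \cap \Omega^- = \emptyset$, and $\varrho(A) \ge 0$ for all $A$ in the trace $\sigma$-algebra $\Omega^+ \cap \mathscr{A}$, and $\varrho(A) \le 0$ for all $A \in \Omega^- \cap \mathscr{A}$.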

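Proof. Set
$\gamma := \sup\{\varrho(A) : A \in \mathscr{A}\}$
and choose a sequence $(A_n)$ in $\mathscr{A}$ with $\lim \varrho(A_n) = \gamma$. By applying 17.9 to the restriction of $\varrho$ to $A_n \cap \mathscr{A}$, we may replace $A_n$ by a set $P_n \in \mathscr{A}$ satisfying $\varrho(P_n) \ge \varrho(A_n)$ and $\varrho(A) \ge 0$ for all $A \in P_n \cap \mathscr{A}$. We will then have
(18.1) $\gamma = \sup\{\varrho(P_n) : n \in \mathbb{N}\}$.
The decomposition of $\Omega$ that is sought can be realized by
$\Omega^+ := \bigcup_{n \in \mathbb{N}} P_n$, $\Omega^- := \Omega \setminus \Omega^+$.
Indeed, all $A \in \Omega^+ \cap \mathscr{A}$ satisfy $\varrho(A) \ge 0$ because such an $A$ has the form
$A = \bigcup_{n \in \mathbb{N}} B_n$
with pairwise disjoint sets $B_n \in P_n \cap \mathscr{A}$ (by the disjointification procedure used in the verification of (3.10)). From this representation of $A$ and the $\sigma$-additivity follows $\varrho(A) = \sum_{n=1}^{\infty} \varrho(B_n) \ge 0$. Thus $\varrho$ assumes only non-negative real values on $\Omega^+ \cap \mathscr{A}$, that is, the restriction of $\varrho$ to $\Omega^+ \cap \mathscr{A}$ is a finite measure. Moreover, because $\varrho(P_n) \le \varrho(\Omega^+) \le \gamma$ and (18.1) this measure satisfies
$\gamma = \varrho(\Omega^+)$.
In particular, $\gamma < +\infty$ since $\varrho$ assumes only real values. $\varrho(A) > 0$ cannot hold for any $A \in \Omega^- \cap \mathscr{A}$, for otherwise $\varrho(\Omega^+ \cup A) = \varrho(\Omega^+) + \varrho(A) > \gamma$. Thus, $\varrho(A) \le 0$ for all $A \in \Omega^- \cap \mathscr{A}$. □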
Measures (in the sense of Definition 3.3) have occasionally been interpreted
as mass distributions on the underlying set Cl. A finite signed measure can be
analogously interpreted as an (electric) charge distribution smeared over Cl. The
foregoing theorem justifies this metaphor by showing that as with charge in elec-
trostatics, there are two disjoint sets, one carrying all the positive charge, the other
all the negative charge.
From this theorem another important feature of signed measures becomes evident: The difference $\varrho$ in Lemma 17.9 is more than an illustrative example of a signed measure - it is the typical signed measure:

18.2 Corollary. Every finite signed measure $\varrho$ on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$ is the difference of two finite measures on $\mathscr{A}$.

Proof. Let $\Omega = \Omega^+ \cup \Omega^-$ be a Hahn decomposition in the sense of 18.1. Then evidently
$\varrho^+(A) := \varrho(A \cap \Omega^+)$ and $\varrho^-(A) := -\varrho(A \cap \Omega^-)$, $A \in \mathscr{A}$,
define measures on $\mathscr{A}$, which satisfy $\varrho = \varrho^+ - \varrho^-$, since each $A \in \mathscr{A}$ is the disjoint union $(A \cap \Omega^+) \cup (A \cap \Omega^-)$. □
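
For a signed measure on a finite set, a Hahn decomposition and the two measures of the corollary can be written down directly. The following is only a sketch under that finiteness assumption; the function name `hahn_jordan` and the sample charges are ours. Points of charge 0 are placed in the positive set, which is one of several admissible choices.

```python
def hahn_jordan(rho):
    """rho maps each point of a finite set to the (real) charge it carries."""
    # Hahn decomposition: points of non-negative charge versus negative charge.
    omega_plus = {w for w, x in rho.items() if x >= 0}
    omega_minus = set(rho) - omega_plus
    # rho_plus(A) = rho(A ∩ Omega+),  rho_minus(A) = -rho(A ∩ Omega-).
    rho_plus = {w: (x if w in omega_plus else 0) for w, x in rho.items()}
    rho_minus = {w: (-x if w in omega_minus else 0) for w, x in rho.items()}
    return omega_plus, omega_minus, rho_plus, rho_minus

# Hypothetical charge distribution on the four-point set {1, 2, 3, 4}.
rho = {1: 2, 2: -3, 3: 0, 4: 5}
omega_plus, omega_minus, rho_plus, rho_minus = hahn_jordan(rho)
```

Both `rho_plus` and `rho_minus` have non-negative weights, and their difference reproduces `rho` point by point; the total mass of `rho_plus` is the supremum $\gamma$ from the proof of 18.1.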
With this result the circle closes: finite signed measures are nothing more than
the differences of finite measures. It is however possible to dispense with the finite-
ness hypothesis if a-additivity is handled with sufficient care, but we will not go
into this further.
In the final analysis it is because of the preceding corollary that we only consider
measures with non-negative values in this book. Often to emphasize the distinction
with signed measures, what we call simply measures are called positive measures.

Exercises.
1. Show that every finite signed measure on a a-algebra is bounded and assumes
a largest and a smallest value.
2. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$, and $\Omega = \Omega_1^+ \cup \Omega_1^-$, $\Omega = \Omega_2^+ \cup \Omega_2^-$ be two Hahn decompositions for it. Show that $\Omega_1^+ \triangle \Omega_2^+$ and $\Omega_1^- \triangle \Omega_2^-$ are totally $\varrho$-nullsets, meaning that $\varrho(N) = 0$ for every $N \in \mathscr{A}$ which is a subset of either of them. Conclude that to within such totally $\varrho$-nullsets there is only one Hahn decomposition for $\varrho$.
3. Let $\varrho$ be a finite signed measure on a $\sigma$-algebra $\mathscr{A}$ in $\Omega$. Show that the specific representation $\varrho = \varrho^+ - \varrho^-$ of $\varrho$ as the difference of the two measures on $\mathscr{A}$ which was produced in the proof of 18.2 is characterized by the following minimality property: In every representation $\varrho = \varrho_1 - \varrho_2$ as the difference of measures $\varrho_1, \varrho_2$ on $\mathscr{A}$, $\varrho_1 = \varrho^+ + \delta$ and $\varrho_2 = \varrho^- + \delta$ for an appropriate finite measure $\delta$ on $\mathscr{A}$, and indeed if $\Omega = \Omega^+ \cup \Omega^-$ is any Hahn decomposition of $\Omega$ corresponding to $\varrho$, $\delta = (1_{\Omega^+})\varrho_2 + (1_{\Omega^-})\varrho_1$. (Conversely, of course, every finite non-zero measure $\delta$ on $\mathscr{A}$ generates in this way a different representation of $\varrho$.) Infer that the only measure $\nu$ on $\mathscr{A}$ which satisfies $\nu(A) \le \min\{\varrho^+(A), \varrho^-(A)\}$ for every $A \in \mathscr{A}$ is the identically 0 measure. [Remark: The representation $\varrho = \varrho^+ - \varrho^-$ uniquely determined by this minimality condition is called the Jordan decomposition of the finite signed measure $\varrho$. As with functions, $\varrho^+$ and $\varrho^-$ are called the positive part and the negative part of $\varrho$.]

§19. Integration with respect to an image measure


Along with the measure space $(\Omega, \mathscr{A}, \mu)$ a measurable space $(\Omega', \mathscr{A}')$ and an $\mathscr{A}$-$\mathscr{A}'$-measurable mapping
$T : (\Omega, \mathscr{A}) \to (\Omega', \mathscr{A}')$
are given. Then the image measure
$\mu' := T(\mu)$
is defined in (7.5). The connection between $\mu$-integrals and $\mu'$-integrals is elucidated by:

19.1 Theorem. For every $\mathscr{A}'$-measurable numerical function $f' \ge 0$ on $\Omega'$
(19.1) $\int f'\,dT(\mu) = \int f' \circ T\,d\mu$.

Proof. The non-negative function $f' \circ T$ is $\mathscr{A}$-measurable, by 7.3. The integral on the right-hand side of (19.1) is therefore defined. To prove the equality there we first consider only $\mathscr{A}'$-elementary $f'$:
$f' = \sum_{i=1}^{n} \alpha_i 1_{A_i'}$
(with coefficients $\alpha_i \in \mathbb{R}_+$ and sets $A_i' \in \mathscr{A}'$). For such $f'$
$f' \circ T = \sum_{i=1}^{n} \alpha_i 1_{A_i}$
with $A_i := T^{-1}(A_i')$, so this composite is an $\mathscr{A}$-elementary function. Since
$T(\mu)(A_i') = \mu(A_i)$ $(i = 1, \ldots, n)$
holds by definition of image measures, (19.1) follows in this case. For an arbitrary $\mathscr{A}'$-measurable $f' \ge 0$ there is an isotone sequence $(u_n')$ of $\mathscr{A}'$-elementary functions for which $u_n' \uparrow f'$. Then $(u_n' \circ T)$ is a sequence of $\mathscr{A}$-elementary functions for which $u_n' \circ T \uparrow f' \circ T$. From the validity of (19.1) for the $u_n'$ and Definition 11.3 of the integral in general, we get (19.1) for $f'$. □
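
The elementary-function step of this proof can be checked by hand on a finite set. The sketch below assumes point weights for $\mu$ and a hypothetical map `T` (residue mod 2); `image_measure` and `integral` are illustrative helper names, not notation from the text.

```python
from collections import defaultdict

def image_measure(mu, T):
    """Push the point weights of mu forward along T: T(mu)({y}) = mu(T^{-1}({y}))."""
    out = defaultdict(float)
    for w, mass in mu.items():
        out[T(w)] += mass
    return dict(out)

def integral(f, mu):
    """Integral of f with respect to a measure given by point weights."""
    return sum(f(w) * mass for w, mass in mu.items())

# Hypothetical data: four points, mapped to their parity.
mu = {0: 0.5, 1: 1.5, 2: 2.0, 3: 1.0}
T = lambda w: w % 2
f_prime = lambda y: 3.0 if y == 0 else 7.0

lhs = integral(f_prime, image_measure(mu, T))   # integral of f' with respect to T(mu)
rhs = integral(lambda w: f_prime(T(w)), mu)     # integral of f' o T with respect to mu
```

Both sides evaluate to 25.0 for these weights, as (19.1) predicts.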

19.2 Corollary 1. Let $f'$ be an $\mathscr{A}'$-measurable numerical function on $\Omega'$. Then the $T(\mu)$-integrability of $f'$ entails the $\mu$-integrability of $f' \circ T$, and conversely. In case of integrability
(19.2) $\int f'\,dT(\mu) = \int f' \circ T\,d\mu$.

Proof. From 19.1
$\int (f')^+\,dT(\mu) = \int (f')^+ \circ T\,d\mu$ and $\int (f')^-\,dT(\mu) = \int (f')^- \circ T\,d\mu$,
and of course
$(f' \circ T)^+ = (f')^+ \circ T$ and $(f' \circ T)^- = (f')^- \circ T$.
Both claims therefore follow from the definition of the integral 12.1. □

19.3 Corollary 2. The mapping $T : \Omega \to \Omega'$ is bijective and $\mathscr{A}$-$\mathscr{A}'$-measurable, with $\mathscr{A}'$-$\mathscr{A}$-measurable inverse $T^{-1}$. Further $f'$ is a numerical function on $\Omega'$. Then the $T(\mu)$-integrability of $f'$ is equivalent to the $\mu$-integrability of $f' \circ T$, and in its presence equality (19.2) prevails.

One has only to note that the integrability of $f' \circ T$ entails the measurability of $f' \circ T$ and therewith that of $f' = f' \circ T \circ T^{-1}$.
The content of 19.1-19.3 constitutes what is called the "general transformation theorem for integrals".
As the behavior of the L-B measure with respect to $C^1$-diffeomorphisms is known from (8.16'), the transformation theorem for Lebesgue integrals follows at once:

19.4 Theorem. Let $G$, $G'$ be open subsets of $\mathbb{R}^d$, $\varphi : G \to G'$ a $C^1$-diffeomorphism of $G$ onto $G'$. A numerical function $f'$ on $G'$ is $\lambda^d$-integrable if and only if the function $f' \circ \varphi\,|\det D\varphi|$ is $\lambda^d$-integrable over $G$, and in this case
(19.3) $\int_{G'} f'\,d\lambda^d = \int_G f' \circ \varphi\,|\det D\varphi|\,d\lambda^d$.

Proof. The $\lambda^d$-integrability of $f'$ over $G'$ and that of $f' \circ \varphi\,|\det D\varphi|$ over $G$ means the $\lambda^d_{G'}$-integrability and the $\lambda^d_G$-integrability of those functions, respectively. According to (8.16')
$\varphi^{-1}(\lambda^d_{G'}) = |\det D\varphi|\,\lambda^d_G$;
furthermore, the Borel measurability of $f'$ is equivalent to that of $f' \circ \varphi$. According to 17.3 therefore $f' \circ \varphi$ is integrable with respect to the measure $\nu := |\det D\varphi|\,\lambda^d_G$ if and only if $f' \circ \varphi\,|\det D\varphi|$ is integrable with respect to $\lambda^d_G$. Consequently the present claim follows from Corollary 19.3 applied to $T := \varphi^{-1}$, because $f' = f' \circ \varphi \circ \varphi^{-1}$ and
$\int f'\,d\lambda^d_{G'} = \int f' \circ \varphi\,d\nu = \int f' \circ \varphi\,|\det D\varphi|\,d\lambda^d_G$. □
Because of Theorem 19.1, equality (19.3) holds as well for all non-negative, Borel measurable, numerical functions on $G'$.
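
A one-dimensional numerical sanity check of (19.3) may be useful. The sketch assumes $d = 1$, $G = G' = \,]0, 1[$, the diffeomorphism $\varphi(x) = x^2$ and the integrand $f'(y) = y^2$, all chosen by us for illustration; the integrals are approximated by a plain midpoint rule, so agreement is only up to discretization error.

```python
def midpoint_integral(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over ]a, b[."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

phi = lambda x: x * x        # C^1-diffeomorphism of ]0,1[ onto ]0,1[
dphi = lambda x: 2 * x       # its derivative; |det D phi| = |phi'| when d = 1
f_prime = lambda y: y * y    # integrand on G'

lhs = midpoint_integral(f_prime, 0.0, 1.0)
rhs = midpoint_integral(lambda x: f_prime(phi(x)) * abs(dphi(x)), 0.0, 1.0)
```

Both approximations are close to the exact common value 1/3.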
Exercises.
1. Let $(\Omega, \mathscr{A}, \mu)$ be a measure space, $T : \Omega \to \Omega$ a mapping which together with its inverse is an $\mathscr{A}$-$\mathscr{A}$-measurable bijection. Show that for every $f \in E^*(\Omega, \mathscr{A})$ the image measure $T(f\mu)$ has a density with respect to $T(\mu)$, namely $f \circ T^{-1}$.
2. Let $(\Omega, \mathscr{A}, \mu)$ be a $\sigma$-finite measure space, $T : \Omega \to \Omega$ an $\mathscr{A}$-$\mathscr{A}$-measurable mapping such that $T^{-1}(A)$ is a $\mu$-nullset whenever $A$ is. Prove the existence of a measurable function $q \ge 0$ such that
$\int_{T^{-1}(A)} f \circ T\,d\mu = \int_A f q\,d\mu$
for all $\mathscr{A}$-measurable numerical functions $f \ge 0$ on $\Omega$, and all $A \in \mathscr{A}$.

§20. Stochastic convergence

Let us return to the study of p-fold integrable functions begun in §14. Our goal will
be to replace the almost-everywhere convergence concept that underlies the theo-
rems proved there with a weaker convergence concept. It is suggested by a simple
but very useful inequality.
The setting is once again an arbitrary measure space $(\Omega, \mathscr{A}, \mu)$.

20.1 Lemma. For every measurable numerical function $f$ on $\Omega$ and every pair of real numbers $p > 0$ and $\alpha > 0$ the Chebyshev-Markov inequality
(20.1) $\mu(\{|f| \ge \alpha\}) \le \frac{1}{\alpha^p} \int |f|^p\,d\mu$
holds.

For $p = 2$ this is also known simply as Chebyshev's inequality.

Proof. The set $A_\alpha := \{|f| \ge \alpha\}$ lies in $\mathscr{A}$ and
$\int |f|^p\,d\mu \ge \int_{A_\alpha} |f|^p\,d\mu \ge \int_{A_\alpha} \alpha^p\,d\mu = \alpha^p \mu(A_\alpha)$,
which is what (20.1) claims. □
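
The inequality is easy to test numerically for a measure with finitely many atoms. In the sketch below the measure and the function are given by dictionaries of point weights and values; these data and the helper name `chebyshev_markov_sides` are assumptions made for illustration.

```python
def chebyshev_markov_sides(f, mu, alpha, p):
    """Return (mu({|f| >= alpha}), alpha^{-p} * integral of |f|^p d mu)."""
    lhs = sum(mass for w, mass in mu.items() if abs(f[w]) >= alpha)
    rhs = sum(abs(f[w]) ** p * mass for w, mass in mu.items()) / alpha ** p
    return lhs, rhs

# Hypothetical measure and function on a three-point set.
mu = {"a": 0.5, "b": 2.0, "c": 1.0}
f = {"a": -3.0, "b": 0.5, "c": 1.5}
lhs, rhs = chebyshev_markov_sides(f, mu, alpha=1.0, p=2)
```

For these data the left side is 1.5 and the right side is 7.25, so (20.1) holds with room to spare; other choices of `alpha` and `p` can be tried in the same way.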

Therefore if $\int |f|^p\,d\mu$ is finite, which when $p \ge 1$ means just that $f$ is $p$-fold integrable, it follows from (20.1) that
(20.2) $\lim_{\alpha \to +\infty} \mu(\{|f| \ge \alpha\}) = 0$.
One can also study the dependence on $n \in \mathbb{N}$ of the measures of the sets $\{|f_n - f| \ge \alpha\}$ when $f, f_1, f_2, \ldots$ are measurable real functions. That leads to the aforementioned new convergence concept.

20.2 Definition. A sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real functions on $\Omega$ is said to be ($\mu$-)stochastically convergent (or to be convergent in $\mu$-measure) to a measurable real function $f$ on $\Omega$, if for each real number $\alpha > 0$ and each $A \in \mathscr{A}$ of finite measure
(20.3) $\lim_{n \to \infty} \mu(\{|f_n - f| \ge \alpha\} \cap A) = 0$.
In this case we also write
(20.4) $\mu\text{-}\lim_{n \to \infty} f_n = f$
and call $f$ a ($\mu$-)stochastic limit of the sequence $(f_n)$.

Remarks. 1. For a finite measure $\mu$ we may take $A = \Omega$ in (20.3) and in this case stochastic convergence of $(f_n)$ to $f$ is equivalent to the requirement
(20.5) $\lim_{n \to \infty} \mu(\{|f_n - f| \ge \alpha\}) = 0$ for every $\alpha > 0$.
The more complicated condition (20.3) is dictated by the desire to treat infinite, and especially $\sigma$-finite, measures as well as finite ones.
2. For $\sigma$-finite measures $\mu$ the stochastic convergence of a sequence $(f_n)$ to $f$ is generally not equivalent to (20.5), as the next example illustrates.

Example. 1. Let $\Omega := \mathbb{N}$, $\mathscr{A} := \mathscr{P}(\mathbb{N})$, $\mu$ the measure (obviously $\sigma$-finite) defined on $\mathscr{A}$ by the equations
$\mu(\{n\}) = n$ for every $n \in \mathbb{N}$
and the requirement of $\sigma$-additivity. With $A_n := \{n, n+1, \ldots\}$ and $f_n := 1_{A_n}$, for each $n \in \mathbb{N}$, the sequence $(f_n)$ converges stochastically to 0: For every $\alpha \in \,]0, 1[$, $\{f_n \ge \alpha\} = A_n$, and since $A_n \downarrow \emptyset$, it follows from 3.2 that $\lim \mu(A_n \cap A) = 0$ for every $A \in \mathscr{A}$ having finite measure. On the other hand, $\mu(A_n) = +\infty$ for every $n \in \mathbb{N}$.
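
The contrast between (20.3) and (20.5) in this example can be made concrete. The sketch below takes the point weights as printed above, $\mu(\{n\}) = n$, passed in through a `weight` parameter so that other weights with divergent tail sums can be tried; it handles finite sets $A$ only, and all function names are ours.

```python
def mu(A, weight=lambda n: n):
    """Measure of a finite set A of positive integers; weight(n) plays the role of mu({n})."""
    return sum(weight(n) for n in A)

def tail_measure_in(A, n, weight=lambda n: n):
    """mu(A_n ∩ A) for A_n = {n, n+1, ...} and a finite set A."""
    return mu({k for k in A if k >= n}, weight)

# A hypothetical set of finite measure and the values mu(A_n ∩ A) for n = 1, ..., 24.
A = {1, 4, 9, 20}
values = [tail_measure_in(A, n) for n in range(1, 25)]

# Truncations of A_5 = {5, 6, ...}: their measures grow without bound.
truncated_tail = mu(range(5, 5 + 1000))
```

The list `values` decreases to 0 (it vanishes as soon as $n$ exceeds the largest element of `A`), while the truncated tails of $A_n$ already have arbitrarily large measure.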

Remark. 3. Let $f$ be a stochastic limit of a sequence $(f_n)$ and consider any measurable real function $f^*$ on $\Omega$. If $f^* = f$ $\mu$-almost everywhere in every $A \in \mathscr{A}$ which has finite measure, then $f^*$ is also a stochastic limit of the sequence $(f_n)$. This is because the sets
$\{|f_n - f^*| \ge \alpha\} \cap A$ and $\{|f_n - f| \ge \alpha\} \cap A$
differ from each other only in an ($n$-independent) nullset.

The converse of this is important:

20.3 Theorem. For every $\sigma$-finite measure $\mu$, any two stochastic limits of a sequence of measurable real functions are $\mu$-almost everywhere equal to each other.

Proof. If $f$ and $f^*$ are stochastic limits of the sequence $(f_n)$, then from the triangle inequality in $\mathbb{R}$
$\{|f - f^*| \ge \alpha\} \subset \{|f_n - f| \ge \alpha/2\} \cup \{|f_n - f^*| \ge \alpha/2\}$,
whence
$\mu(\{|f - f^*| \ge \alpha\} \cap A) \le \mu(\{|f_n - f| \ge \alpha/2\} \cap A) + \mu(\{|f_n - f^*| \ge \alpha/2\} \cap A)$
for every $n \in \mathbb{N}$ and every $A \in \mathscr{A}$. Letting $n \to \infty$ shows that
$\mu(\{|f - f^*| \ge \alpha\} \cap A) = 0$
for every $\alpha > 0$ and every $A \in \mathscr{A}$ of finite measure. Then however, $f = f^*$ $\mu$-almost everywhere in every such set $A$, since
$\{f \ne f^*\} \cap A = \bigcup_{k \in \mathbb{N}} \{|f - f^*| \ge 1/k\} \cap A$
is a $\mu$-nullset. Upon taking for $A$ the sets in a sequence $(A_n)$ in $\mathscr{A}$ which satisfies $\mu(A_n) < +\infty$ for all $n$ and $A_n \uparrow \Omega$, the $\mu$-almost everywhere equality of $f$ and $f^*$ follows. □

To supplement this fact we mention:

Remark. 4. Stochastic limits $f$ and $f^*$ of the same sequence $(f_n)$ are almost everywhere equal without any hypotheses on the measure itself if both functions are $p$-fold integrable for some $p \in [1, +\infty[$. This is because for every real $\alpha > 0$ the set $\{|f - f^*| \ge \alpha\}$ has finite measure, by (20.1), and so $f = f^*$ $\mu$-almost everywhere in this set, whence $\{|f - f^*| > 0\} = \bigcup_{n \in \mathbb{N}} \{|f - f^*| \ge 1/n\}$ is a countable union of $\mu$-nullsets. This just says that $f = f^*$ $\mu$-almost everywhere in $\Omega$. But the next example shows that it may fail if one of the functions is not in any $\mathscr{L}^p$-space.

Example. 2. Consider the measure space $(\Omega, \mathscr{P}(\Omega), \mu)$, where $\Omega$ consists of exactly two elements $\omega_0, \omega_1$ and $\mu(\{\omega_0\}) = 0$, $\mu(\{\omega_1\}) = +\infty$, $f_n = f = 0$ for every $n \in \mathbb{N}$. These functions lie in every $\mathscr{L}^p(\mu)$ and the sequence $(f_n)$ converges stochastically to $f$, as well as to every real-valued function $f^*$ on $\Omega$. Every such $f^*$ which is non-zero at $\omega_1$, however, lies in no $\mathscr{L}^p(\mu)$ with $1 \le p < +\infty$ and fails to coincide $\mu$-almost everywhere in $\Omega$ with $f$.
The considerations with which we began this section lead to an important class
of stochastically convergent sequences:

20.4 Theorem. If the sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges in $p$th mean to a function $f \in \mathscr{L}^p(\mu)$ for some $1 \le p < +\infty$, then it also converges to $f$ $\mu$-stochastically.

Proof. The Chebyshev-Markov inequality tells us that
$\mu(\{|f_n - f| \ge \alpha\} \cap A) \le \mu(\{|f_n - f| \ge \alpha\}) \le \alpha^{-p} \int |f_n - f|^p\,d\mu$
holds for every $n \in \mathbb{N}$, every $\alpha > 0$ and every $A \in \mathscr{A}$. The claimed stochastic convergence, that is, the convergence to 0 of the left end of this chain as $n \to \infty$, follows because $\int |f_n - f|^p\,d\mu \to 0$ as $n \to \infty$ is the definition of convergence in $p$th mean. □

The proof shows that convergence in $p$th mean actually entails the stronger form of stochastic convergence in (20.5). The situation is different when the given sequence is almost everywhere convergent. (On this point cf. also Remark 5.)

20.5 Theorem. If a sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real functions on $\Omega$ converges $\mu$-almost everywhere in $\Omega$ - or even just $\mu$-almost everywhere in each set $A \in \mathscr{A}$ of finite measure - to a measurable real function $f$ on $\Omega$, then this sequence also converges $\mu$-stochastically to $f$.

Proof. For every $\alpha > 0$,
$\{|f_n - f| \ge \alpha\} \subset \{\sup_{m \ge n} |f_m - f| \ge \alpha\}$
and so
$\mu(\{|f_n - f| \ge \alpha\} \cap A) \le \mu(\{\sup_{m \ge n} |f_m - f| \ge \alpha\} \cap A)$
for every $A \in \mathscr{A}$. The present claim therefore follows from our next lemma, applied to the restriction of $\mu$ to $A \cap \mathscr{A}$ for each $A$ of finite measure. □

20.6 Lemma. If the measure $\mu$ is finite, then each of the following three conditions on a sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real functions is equivalent to $(f_n)$ converging $\mu$-almost everywhere to 0:
(20.6) $\lim_{n \to \infty} \mu(\{\sup_{m \ge n} |f_m| \ge \alpha\}) = 0$ for every $\alpha > 0$,
(20.6') $\lim_{n \to \infty} \mu(\{\sup_{m \ge n} |f_m| > \alpha\}) = 0$ for every $\alpha > 0$,
(20.7) $\mu(\limsup_{n \to \infty} \{|f_n| > \alpha\}) = 0$ for every $\alpha > 0$.

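Proof. To prove the equivalence of (20.6) with the almost everywhere convergence of $(f_n)$ to 0, we set, for each $\alpha > 0$ and each $n \in \mathbb{N}$
$A_n^\alpha := \{\sup_{m \ge n} |f_m| \ge \alpha\}$.
Obviously both $n \mapsto A_n^\alpha$ and $\alpha \mapsto A_n^\alpha$ are antitone mappings; then $k \mapsto A_n^{1/k}$ is isotone on $\mathbb{N}$. If we also set
$A := \{\omega \in \Omega : \lim_{n \to \infty} f_n(\omega) = 0\} = \{\omega \in \Omega : \limsup_{n \to \infty} |f_n|(\omega) = 0\}$,
then this set lies in $\mathscr{A}$, either by appeal to 9.5 or by noticing that each $A_n^\alpha \in \mathscr{A}$ and
$A = \bigcap_{k \in \mathbb{N}} \bigcup_{n \in \mathbb{N}} \complement A_n^{1/k}$.
Passing to complements,
$\complement A = \bigcup_{k \in \mathbb{N}} \bigcap_{n \in \mathbb{N}} A_n^{1/k}$
and so
$\bigcap_{n \in \mathbb{N}} A_n^{1/k} \uparrow \complement A$ as $k \to \infty$, and $A_n^{1/k} \downarrow \bigcap_{m \in \mathbb{N}} A_m^{1/k}$ as $n \to \infty$.
Consequently,
(20.8) $\mu(\complement A) = \sup_{k \in \mathbb{N}} \mu(\bigcap_{n \in \mathbb{N}} A_n^{1/k}) = \sup_{k \in \mathbb{N}} \inf_{n \in \mathbb{N}} \mu(A_n^{1/k})$
because the finite measure $\mu$ is both continuous from above and continuous from below, by 3.2. Thus $(f_n)$ converges almost everywhere to 0 just when the number defined by (20.8) is 0. In turn, the latter occurs exactly in case
$\inf_{n \in \mathbb{N}} \mu(A_n^{1/k}) = \lim_{n \to \infty} \mu(A_n^{1/k}) = 0$
for every $k \in \mathbb{N}$. The first equivalence follows from this. The equivalence of (20.6) with (20.6') follows from the observation that for any numerical function $g$ on $\Omega$
$\{g > \alpha\} \subset \{g \ge \alpha\} \subset \{g > \alpha'\}$
whenever $0 < \alpha' < \alpha$.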
Finally, the equivalence of (20.6') with (20.7) follows from the validity, for every $\alpha > 0$, of the equality
(20.9) $\lim_{n \to \infty} \mu(\{\sup_{m \ge n} |f_m| > \alpha\}) = \mu(\limsup_{n \to \infty} \{|f_n| > \alpha\})$,
for the proof of which we introduce
$B_n := \bigcup_{m \ge n} \{|f_m| > \alpha\}$ and $B := \limsup_{n \to \infty} \{|f_n| > \alpha\}$.
On the one hand, $B_n \downarrow B$ and consequently $\lim \mu(B_n) = \mu(B)$. On the other hand, however,
$B_n = \bigcup_{m \ge n} \{|f_m| > \alpha\} = \{\sup_{m \ge n} |f_m| > \alpha\}$.
From this finally we get the needed (20.9). □

The conditions involved in Theorems 20.4 and 20.5 are indeed sufficient to
insure stochastic convergence, but they are not necessary for it, as the following
examples show.
Examples. 3. Let $\Omega := [0, 1[$, $\mathscr{A} := \Omega \cap \mathscr{B}^1$ and $\mu := \lambda^1_\Omega$, a finite measure. With $A_n := \,]0, 1/n[ \in \mathscr{A}$, the sequence $(n 1_{A_n})_{n \in \mathbb{N}}$ converges to 0 at every point of $\Omega$ and so, either by appeal to 20.5 or by virtue of
$\mu(\{n 1_{A_n} \ge \alpha\}) = \mu(A_n) = \frac{1}{n}$ whenever $0 < \alpha \le n \in \mathbb{N}$,
this sequence also converges stochastically to 0. By contrast
$\int (n 1_{A_n})^p\,d\mu = n^p \mu(A_n) = n^{p-1}$
shows that the sequence does not converge to 0 in $p$th mean for any $p \ge 1$.
4. Let $(\Omega, \mathscr{A}, \mu)$ be the measure space of the preceding example. Write each $n \in \mathbb{N}$ as $n = 2^h + k$ with non-negative integers $h$ and $k$ satisfying $0 \le k < 2^h$ (which uniquely determines them) and set
$A_n := [k 2^{-h}, (k+1) 2^{-h}[$, $f_n := 1_{A_n}$, $n \in \mathbb{N}$.
It was shown in the example in §15 that the sequence $(f_n(\omega))_{n \in \mathbb{N}}$ converges for no $\omega \in \Omega$. Nevertheless the sequence $(f_n)$ does converge stochastically to 0, since for every $\alpha > 0$ and $n \in \mathbb{N}$
$\mu(\{|f_n| \ge \alpha\}) \le 2^{-h} < \frac{2}{n}$.
In this example stochastic convergence can also be inferred from 20.4, since the example in §15 showed that $(f_n)$ converges to 0 in $p$th mean for every $p \in [1, +\infty[$.
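
The behaviour described in Example 4 can be observed directly. The sketch below computes the intervals $A_n$ from the decomposition $n = 2^h + k$ and checks, at the sample point $\omega = 0.3$ (our choice), that each dyadic block of indices contains exactly one $n$ with $f_n(\omega) = 1$, while the interval lengths stay below $2/n$. The names `typewriter_interval` and `f` are illustrative.

```python
def typewriter_interval(n):
    """Return (lo, hi) with A_n = [lo, hi[ for n = 2**h + k, 0 <= k < 2**h."""
    h = n.bit_length() - 1      # largest h with 2**h <= n
    k = n - (1 << h)
    return k / 2 ** h, (k + 1) / 2 ** h

def f(n, w):
    """Indicator function of A_n evaluated at the point w."""
    lo, hi = typewriter_interval(n)
    return 1 if lo <= w < hi else 0

w = 0.3
# Number of indices n in the block 2**h <= n < 2**(h+1) with f_n(w) = 1.
hits_per_block = [sum(f(n, w) for n in range(2 ** h, 2 ** (h + 1))) for h in range(10)]
# Lengths of A_n, that is, the measures of {f_n >= alpha} for 0 < alpha <= 1.
lengths = [typewriter_interval(n)[1] - typewriter_interval(n)[0] for n in range(1, 200)]
```

Since every block contributes exactly one hit, the values $f_n(\omega)$ return to 1 infinitely often and are 0 in between, so they cannot converge; the lengths nevertheless tend to 0.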

The connection between stochastic convergence and almost-everywhere convergence is nevertheless closer than one would be led to suspect on the basis of the last example.

20.7 Theorem. If a sequence $(f_n)_{n \in \mathbb{N}}$ of measurable real functions converges $\mu$-stochastically to a measurable real function $f$, then for every $A \in \mathscr{A}$ of finite $\mu$-measure some subsequence of $(f_n)$ converges to $f$ $\mu$-almost everywhere in $A$.

Proof. For $A \in \mathscr{A}$ with $\mu(A) < +\infty$, the measure $\mu_A$, which is the restriction of $\mu$ to $A \cap \mathscr{A}$, is finite. It therefore suffices to deal with the case of a finite measure $\mu$; moreover, in that case we can simply take $A$ to be $\Omega$ itself.
For $\alpha > 0$ and $m, n \in \mathbb{N}$ the triangle inequality shows that
$\{|f_m - f_n| \ge \alpha\} \subset \{|f_m - f| \ge \alpha/2\} \cup \{|f_n - f| \ge \alpha/2\}$;
thus by hypothesis $\mu(\{|f_m - f_n| \ge \alpha\})$ can be made arbitrarily small by taking $m$ and $n$ sufficiently large. If therefore $(\eta_k)_{k \in \mathbb{N}}$ is a sequence of positive real numbers with
$\sum_{k=1}^{\infty} \eta_k < +\infty$,
then for each $k \in \mathbb{N}$ there is an $n_k \in \mathbb{N}$ such that
$\mu(\{|f_m - f_{n_k}| \ge \eta_k\}) \le \eta_k$ for all $m \ge n_k$.
Clearly the sequence $(n_k)_{k \in \mathbb{N}}$ can be chosen strictly isotone: $n_k < n_{k+1}$ for every $k \in \mathbb{N}$. If now we set
$A_k := \{|f_{n_{k+1}} - f_{n_k}| \ge \eta_k\}$, $k \in \mathbb{N}$,
then
$\sum_{k=1}^{\infty} \mu(A_k) \le \sum_{k=1}^{\infty} \eta_k < +\infty$,
and consequently,
$\lim_{n \to \infty} \sum_{k=n}^{\infty} \mu(A_k) = 0$.
From this it follows that the set $A := \limsup_{n \to \infty} A_n$ satisfies
$\mu(A) = 0$,
because $A \subset \bigcup_{k \ge n} A_k$ for every $n \in \mathbb{N}$, entailing that $\mu(A) \le \sum_{k=n}^{\infty} \mu(A_k)$ for every $n$. The definition of $A$ shows that if $\omega \in \complement A$, then the inequality
$|f_{n_{k+1}}(\omega) - f_{n_k}(\omega)| \ge \eta_k$
prevails for at most finitely many $k \in \mathbb{N}$. Therefore, along with the series $\sum \eta_k$, the series
$\sum_{k=1}^{\infty} (f_{n_{k+1}}(\omega) - f_{n_k}(\omega))$
converges (absolutely); that is, the sequence $(f_{n_k}(\omega))_{k \in \mathbb{N}}$ converges in $\mathbb{R}$. In summary, the sequence $(f_{n_k})$ converges almost everywhere to a measurable real function $f^*$ on $\Omega$. By 20.5 $f^*$ is also a stochastic limit of $(f_{n_k})_{k \in \mathbb{N}}$. But, as a subsequence of $(f_n)$, that sequence converges stochastically to $f$ as well. Hence by 20.3, $f = f^*$ almost everywhere. We have shown therefore that $(f_{n_k})_{k \in \mathbb{N}}$ converges almost everywhere to $f$. □
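
For the dyadic indicator sequence of Example 4 a subsequence of the kind promised by the theorem can be named outright. The sketch below uses $n_k = 2^k$, for which $f_{n_k}$ is the indicator of $[0, 2^{-k}[$; this is one convenient choice made by us, not necessarily the subsequence the proof would construct, and the helper names are hypothetical.

```python
def indicator_subsequence(k, w):
    """f_{n_k}(w) for n_k = 2**k: the indicator of [0, 2**-k[ evaluated at w."""
    return 1 if 0 <= w < 2.0 ** -k else 0

def last_nonzero_index(w, k_max=60):
    """Largest k < k_max with f_{n_k}(w) = 1, or None if there is none."""
    hits = [k for k in range(k_max) if indicator_subsequence(k, w) == 1]
    return hits[-1] if hits else None

# For w > 0 the subsequence is eventually 0; only w = 0 is exceptional.
examples = {w: last_nonzero_index(w) for w in (0.3, 0.01, 0.0)}
```

At every $\omega > 0$ the values $f_{n_k}(\omega)$ vanish from some index on, so the subsequence converges to 0 outside the nullset $\{0\}$, in agreement with the theorem.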

In terms of almost-everywhere convergence we can now even characterize sto-


chastic convergence by a subsequence principle.

20.8 Corollary. A sequence (fn) of measurable real functions on 11 converges p-


stochastically to a measurable real function f on ) if and only if for each A E of of
finite measure, each subsequence (fnk )kEN of (fn) contains a further subsequence
which converges to f p-almost everywhere in A.

Proof. The preceding theorem establishes that the subsequence condition is necessary for the stochastic convergence of $(f_n)$ to $f$, since every subsequence of $(f_n)$ likewise converges stochastically to $f$. Let us now assume that the subsequence condition is fulfilled, and fix an $A\in\mathscr{A}$ of finite measure. Since every subsequence $(f_{n_k})$ contains another which converges almost everywhere in $A$ to $f$, and by 20.5 this latter subsequence must also converge (in $A$) stochastically to $f$, we see that in the sequence of numbers
$$\mu(\{|f_{n_k}-f|\ge\alpha\}\cap A)\qquad(k\in\mathbb{N}),$$
in which $\alpha>0$ is fixed, a subsequence exists which converges to 0. But, as an easy argument confirms, a sequence of real numbers all of whose subsequences have this property must itself converge to 0. That is, the sequence of real numbers
$$\mu(\{|f_n-f|\ge\alpha\}\cap A)\qquad(n\in\mathbb{N})$$
converges to 0. As this is true of every $A\in\mathscr{A}$ having finite measure and every $\alpha>0$, the stochastic convergence of $(f_n)$ to $f$ is thereby confirmed. □
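The "easy argument" about real sequences that the proof relies on can be spelled out as follows. This is only a sketch of one standard way to see it; the names $a_n$ and $\varepsilon$ are introduced here for illustration and are not part of the original text.

```latex
Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of real numbers such that every
subsequence of $(a_n)$ contains a further subsequence converging to $0$.
Suppose $(a_n)$ did not converge to $0$. Then there would be an
$\varepsilon > 0$ and indices $n_1 < n_2 < \dots$ with
\[
  |a_{n_k}| \ge \varepsilon \qquad (k \in \mathbb{N}).
\]
Every subsequence of $(a_{n_k})_{k\in\mathbb{N}}$ inherits this lower bound
and hence cannot converge to $0$, contradicting the hypothesis.
Consequently $\lim_{n\to\infty} a_n = 0$.
```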

Remarks. 5. It is not to be expected that in 20.7 and 20.8 the reference to the
finite-measure set A E W can be stricken. This is already illustrated by Example 2
if one replaces the sequence (fn) there with the sequence (f) defined by f,, :_
nl(,,,, ), n E N. This new sequence also converges stochastically to f := 0. See
however Exercise 5.
6. The second part of the proof of 20.7 shows that for finite measures $\mu$ there is a Cauchy criterion for the stochastic convergence of a sequence $(f_n)$: Necessary and sufficient for the stochastic convergence of a sequence $(f_n)$ to a measurable real function on $\Omega$ is the condition
$$\lim_{m,n\to\infty}\mu(\{|f_m-f_n|\ge\alpha\})=0\qquad\text{for every }\alpha>0.$$

7. The sequence formed by alternately taking terms from each of two stochastically convergent sequences whose limit functions do not coincide almost everywhere shows that in Corollary 20.8 it does not suffice to demand that in each $A$ some subsequence of the full sequence $(f_n)$ converge almost everywhere.
A particularly useful consequence of 20.8 is:

20.9 Theorem. If the sequence $(f_n)_{n\in\mathbb{N}}$ of measurable real functions on $\Omega$ converges stochastically to a measurable real function $f$ on $\Omega$, and $\varphi:\mathbb{R}\to\mathbb{R}$ is continuous, then the sequence $(\varphi\circ f_n)_{n\in\mathbb{N}}$ converges stochastically to $\varphi\circ f$.

Proof. One exploits both directions of 20.8, noting that from the almost everywhere convergence of a subsequence $(f_{n_k})$ to $f$ on an $A\in\mathscr{A}$ follows the almost everywhere convergence of $(\varphi\circ f_{n_k})$ to $\varphi\circ f$ on $A$. □

The general question of functions $\varphi:\mathbb{R}\to\mathbb{R}$ which preserve convergence, in the sense that $(\varphi\circ f_n)_{n\in\mathbb{N}}$ inherits the kind of convergence $(f_n)_{n\in\mathbb{N}}$ has, is investigated by BARTLE and JOICHI [1961]. They show how Theorem 20.9 can fail if the more restrictive definition (20.5) is adopted for stochastic convergence.

Exercises.
1. $(f_n)$ and $(g_n)$ are stochastically convergent sequences of measurable real functions, having limit functions $f$ and $g$, respectively. Show that for all $\alpha,\beta\in\mathbb{R}$ the sequence $(\alpha f_n+\beta g_n)$ converges stochastically to $\alpha f+\beta g$, and the sequences $(f_n\wedge g_n)$, $(f_n\vee g_n)$ converge stochastically to $f\wedge g$, $f\vee g$, respectively.
2. For a measure space $(\Omega,\mathscr{A},\mu)$ with finite measure $\mu$ let $d_\mu$ be the pseudometric on $\mathscr{A}$ constructed in Exercise 7 of §3. Show that a sequence $(A_n)$ in $\mathscr{A}$ is $d_\mu$-convergent to $A\in\mathscr{A}$ if and only if the sequence $(1_{A_n})$ of indicator functions converges stochastically to the indicator function $1_A$.
3. For every pair of measurable real functions $f$ and $g$ on a measure space $(\Omega,\mathscr{A},\mu)$ with finite measure $\mu$ define
$$D_\mu(f,g) := \inf\{\varepsilon>0 : \mu(\{|f-g|>\varepsilon\})<\varepsilon\}$$
and then prove that
(a) $D_\mu$ is a pseudometric on the set $M(\mathscr{A})$ of all measurable real functions.
(b) A sequence $(f_n)$ in $M(\mathscr{A})$ converges stochastically to $f\in M(\mathscr{A})$ if and only if
$$\lim_{n\to\infty} D_\mu(f_n,f)=0.$$
(c) $M(\mathscr{A})$ is $D_\mu$-complete, that is, every $D_\mu$-Cauchy sequence in $M(\mathscr{A})$ converges with respect to $D_\mu$ to some function in $M(\mathscr{A})$.
What is the relation of $D_\mu$ to the $d_\mu$ of Exercise 2?
4. In the context of Exercise 3 define
If - gi dp,

for every pair of functions f, g E M(ss). Show that Dµ also enjoys the properties
(a)-(c) proved for D$, in the preceding exercise.
5. Let $(\Omega,\mathscr{A},\mu)$ be a σ-finite measure space. Show that a sequence $(f_n)$ of measurable real functions on $\Omega$ converges stochastically to a measurable real function $f$ on $\Omega$ if and only if from every subsequence $(f_{n_k})$ of $(f_n)$ a further subsequence can be extracted which converges almost everywhere in $\Omega$ to $f$. [Hints: Suppose $(f_n)$ is stochastically convergent. Choose a sequence $(A_k)$ from $\mathscr{A}$ with $\mu(A_k)<+\infty$ for each $k$ and $A_k\uparrow\Omega$, and consider the finite measures $\mu_k(A) := \mu(A\cap A_k)$ on $\mathscr{A}$. The claim is true of each measure $\mu_k$. Given a subsequence $\Phi$ of $(f_n)$, there is for each $k\in\mathbb{N}$ a subsequence $(g_n^{(k)})_{n\in\mathbb{N}}$ of $\Phi$ which converges $\mu_k$-almost everywhere to $f$. It can be arranged that $(g_n^{(k+1)})$ is a subsequence of $(g_n^{(k)})$ for each $k$. Then the diagonal subsequence $(g_n^{(n)})_{n\in\mathbb{N}}$ does what is wanted.]
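6. Give an "elementary" proof of 20.9 based directly on the relevant definition 20.2. To this end, show that for each $\varepsilon\in\,]0,1[$ there exists $\delta>0$ such that
$$\{|f|\le 1/\varepsilon\}\cap\{|f_n-f|\le\delta\}\subset\{|\varphi\circ f_n-\varphi\circ f|\le\varepsilon\}\qquad\text{for all }n\in\mathbb{N}.$$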
7. (Theorem of D.F. EGOROV (1869-1931)) Let $(\Omega,\mathscr{A},\mu)$ be a measure space with finite measure $\mu$. Show that: For every sequence $(f_n)_{n\in\mathbb{N}}$ of measurable real functions on $\Omega$ its convergence almost everywhere to a measurable real function $f$ is equivalent to its so-called almost-uniform convergence to $f$. The latter means that for every $\delta>0$ there exists an $A_\delta\in\mathscr{A}$ such that $\mu(A_\delta)<\delta$ and $(f_n)$ converges to $f$ uniformly on $\complement A_\delta$. [Hint: Exercise 2 of §11.]

§21. Equi-integrability

The sufficient condition for convergence in $p$th mean which is set out in Lebesgue's dominated convergence theorem can be transformed into a necessary as well as sufficient condition with the help of stochastic convergence. But we need the concept of equi-integrability, which is of fundamental significance.
In the following $(\Omega,\mathscr{A},\mu)$ will again be an arbitrary measure space, and $p$ is always a real number satisfying $1\le p<+\infty$.
The point of departure is a simple observation. A measurable numerical function $f$ on $\Omega$ is integrable if and only if for every $\varepsilon>0$ there is a non-negative integrable function $g=g_\varepsilon$ such that
$$\int_{\{|f|\ge g\}}|f|\,d\mu\le\varepsilon. \tag{21.1}$$
For if $f$ is integrable and we take, as we then may, $g$ to be $2|f|$, then $\{|f|\ge g\}=\{f=0\}\cup\{|f|=+\infty\}$ and thanks to 13.6 the integral in (21.1) is actually equal to 0. Conversely, if we have (21.1) even for just one real $\varepsilon>0$, then
$$\int|f|\,d\mu=\int_{\{|f|\ge g\}}|f|\,d\mu+\int_{\{|f|<g\}}|f|\,d\mu\le\varepsilon+\int g\,d\mu<+\infty$$
and hence $f$ is integrable.
This observation induces us to make

21.1 Definition. A set $M$ of $\mathscr{A}$-measurable numerical functions on $\Omega$ is called ($\mu$-)equi-integrable if for every $\varepsilon>0$ there exists a $\mu$-integrable function $g=g_\varepsilon\ge 0$ on $\Omega$ such that every $f\in M$ satisfies
$$\int_{\{|f|\ge g\}}|f|\,d\mu\le\varepsilon. \tag{21.2}$$

Correspondingly a family $(f_i)_{i\in I}$ of measurable numerical functions on $\Omega$ is called equi-integrable if the set $\{f_i : i\in I\}$ is equi-integrable. Equi-integrable sets and families are sometimes also called "uniformly integrable".
From now on, any function $g_\varepsilon$ as described in Definition 21.1 will be called an $\varepsilon$-bound for the given set of functions. Obviously, along with an $\varepsilon$-bound $g$ for a set of functions, any integrable $g'\ge g$ is also an $\varepsilon$-bound.

Examples. 1. If $M_1,\dots,M_n$ are finitely many $\mu$-equi-integrable sets of measurable functions on $\Omega$, then their union is also $\mu$-equi-integrable, because whenever $g_j$ is an $\varepsilon$-bound for $M_j$ ($j=1,\dots,n$), then $g_1\vee\dots\vee g_n$ is an $\varepsilon$-bound for $M_1\cup\dots\cup M_n$.

2. Every finite set of $\mu$-integrable functions is $\mu$-equi-integrable. This follows from Example 1 and the fact, demonstrated in the course of proving (21.1), that any set consisting of just one integrable function $f$ is equi-integrable, the function $2|f|$ being an $\varepsilon$-bound for every $\varepsilon>0$.
3. Suppose $M$ is a set of measurable numerical functions on $\Omega$, $1\le p<+\infty$, and there is a $p$-fold $\mu$-integrable majorant $g$ for $M$, that is, every $f\in M$ satisfies
$$|f|\le g\quad\mu\text{-almost everywhere.}$$
Then the set
$$M^p := \{|f|^p : f\in M\}$$
is equi-integrable. Indeed, as in Example 2, the single integrable function $h := 2g^p$ is an $\varepsilon$-bound for every $\varepsilon>0$, since by 13.6
$$\int_{\{|f|^p\ge h\}}|f|^p\,d\mu\le\int_{\{g^p\ge h\}}g^p\,d\mu=\int_{\{g=+\infty\}}g^p\,d\mu=0.$$

This example shows that Theorem 15.6 on dominated convergence is really about an equi-integrable set of functions. Of course, one cannot expect that conversely from the equi-integrability of a subset of $\mathscr{L}^1(\mu)$ there should follow the existence of a single integrable majorant for the set. The following example confirms this.
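4. Consider the probability space $(\mathbb{N},\mathscr{P}(\mathbb{N}),\mu)$, the finite measure $\mu$ being specified by $\mu(\{n\})=2^{-n}$ for each $n\in\mathbb{N}$. The sequence of functions $f_n := 2^n n^{-1}1_{\{n\}}$ ($n\in\mathbb{N}$) is equi-integrable: For the constant function $1\in\mathscr{L}^1(\mu)$ the inequality
$$\int_{\{f_n\ge 1\}}f_n\,d\mu\le\frac{1}{n}$$
holds for all $n\in\mathbb{N}$. Hence, given $\varepsilon>0$, the function 1 is an $\varepsilon$-bound for all $f_n$ with $n\ge 1/\varepsilon$, and the finitely many remaining $f_n$ are taken care of by Examples 1 and 2.
However, the smallest function $g$ which majorizes every $f_n$ is the non-$\mu$-integrable function $n\mapsto 2^n n^{-1}$ on $\mathbb{N}$.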
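The behaviour described in Example 4 can be checked with exact arithmetic. The following is a minimal sketch, assuming constant functions $\alpha\cdot 1$ as candidate bounds and truncating the supremum at a hypothetical cut-off `n_max`; the function names are illustrative and not taken from the text.

```python
from fractions import Fraction

def mass(n):
    """mu({n}) = 2^(-n)."""
    return Fraction(1, 2**n)

def f_value(n):
    """Value of f_n at the single point n where it does not vanish."""
    return Fraction(2**n, n)

def tail_integral(n, alpha):
    """Integral of f_n over the set {f_n >= alpha} with respect to mu."""
    v = f_value(n)
    return v * mass(n) if v >= alpha else Fraction(0)

def sup_tail(alpha, n_max=200):
    """Largest tail integral among f_1, ..., f_{n_max} (illustrative cut-off)."""
    return max(tail_integral(n, alpha) for n in range(1, n_max + 1))

def majorant_partial_integral(N):
    """Integral of the majorant n -> 2^n / n over {1, ..., N}: a harmonic sum."""
    return sum(f_value(n) * mass(n) for n in range(1, N + 1))

for alpha in (1, 100, 1000):
    print(alpha, sup_tail(alpha))
print(majorant_partial_integral(4))
```

The tail integrals shrink uniformly in $n$ as $\alpha$ grows, while the partial integrals of the smallest majorant are harmonic sums and therefore grow without bound.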
5. Let $(\Omega,\mathscr{A},\mu)$ be the measure space of Example 3, §20, and $(f_n)_{n\in\mathbb{N}}$ the sequence of functions considered there: $A_n := [0,\tfrac{1}{n}[$ and $f_n := n1_{A_n}$ for each $n\in\mathbb{N}$. This sequence is not equi-integrable, which we see as follows: for every integrable $g\ge 0$ and every $n\in\mathbb{N}$
$$\int_{\{|f_n|\ge g\}}|f_n|\,d\mu=\int_{A_n\cap\{g\le n\}}n\,d\mu=\int_{A_n}n\,d\mu-\int_{A_n\cap\{g>n\}}n\,d\mu\ge 1-\int_{A_n}g\,d\mu.$$
From the finiteness of the measure $g\mu$ and the fact that $A_n\downarrow\{0\}$, it follows that
$$\liminf_{n\to\infty}\int_{\{|f_n|\ge g\}}|f_n|\,d\mu\ge 1,$$
showing that $g$ cannot be an $\varepsilon$-bound for any $\varepsilon\in\,]0,1[$.
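The lower bound in Example 5 can be made concrete for two particular choices of $g$. This sketch assumes Lebesgue measure on the unit interval and uses closed-form integrals; the unbounded candidate $g(x)=c/\sqrt{x}$ and the function names are hypothetical choices made for illustration.

```python
from fractions import Fraction

def tail_constant(n, alpha):
    """Integral of f_n = n * 1_[0,1/n[ over {f_n >= alpha}, alpha a constant bound."""
    # f_n equals n on an interval of length 1/n, so the integral is 1 or 0.
    return Fraction(1) if n >= alpha else Fraction(0)

def tail_power(n, c):
    """Integral of f_n over {f_n >= g} for the integrable g(x) = c / sqrt(x), c > 0."""
    # On [0, 1/n[ the inequality n >= c / sqrt(x) holds exactly for x >= c^2 / n^2.
    lower = Fraction(c) ** 2 / n**2
    upper = Fraction(1, n)
    return n * max(upper - lower, Fraction(0))

for n in (4, 10, 100, 10000):
    print(n, tail_constant(n, 50), tail_power(n, 3))
```

For every fixed constant the tail integral equals 1 as soon as $n$ exceeds it, and for $g(x)=c/\sqrt{x}$ it equals $1-c^2/n$ for large $n$, in line with the limit inferior being at least 1.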
Here is a useful characterization of equi-integrability, which, for σ-finite measures, will be improved upon in 21.8.

21.2 Theorem. A set $M$ of measurable numerical functions on $\Omega$ is equi-integrable if and only if the following two conditions are satisfied:

$$\sup_{f\in M}\int|f|\,d\mu<+\infty. \tag{21.3}$$

(21.4) For every $\varepsilon>0$ there exists a $\mu$-integrable function $h\ge 0$ and a number $\delta>0$ such that
$$\int_A h\,d\mu\le\delta\ \Longrightarrow\ \int_A|f|\,d\mu\le\varepsilon\qquad\text{for all }f\in M\text{ and }A\in\mathscr{A}.$$
Proof. For every $A\in\mathscr{A}$, every measurable numerical function $f$ on $\Omega$, and every integrable function $g\ge 0$
$$\int_A|f|\,d\mu=\int_{A\cap\{|f|\ge g\}}|f|\,d\mu+\int_{A\cap\{|f|<g\}}|f|\,d\mu\le\int_{\{|f|\ge g\}}|f|\,d\mu+\int_A g\,d\mu$$
and in particular for $A := \Omega$
$$\int|f|\,d\mu\le\int_{\{|f|\ge g\}}|f|\,d\mu+\int g\,d\mu.$$
Assuming that the set $M$ is equi-integrable, let us choose for $g$ an $\tfrac{\varepsilon}{2}$-bound for it and then set $h := g$, $\delta := \tfrac{\varepsilon}{2}$. Then conditions (21.3) and (21.4) follow from the preceding inequalities.
Conversely, assume the two conditions are fulfilled and let $\varepsilon>0$ be given. Let $h$ and $\delta>0$ be as furnished by (21.4). For each $f\in M$ and real $\alpha>0$, consider the obviously valid inequality
$$\int|f|\,d\mu\ge\int_{\{|f|\ge\alpha h\}}|f|\,d\mu\ge\alpha\int_{\{|f|\ge\alpha h\}}h\,d\mu$$
or its equivalent
$$\int_{\{|f|\ge\alpha h\}}h\,d\mu\le\frac{1}{\alpha}\int|f|\,d\mu.$$
The integrals $\int|f|\,d\mu$ here are bounded as $f$ ranges over $M$, by (21.3). Therefore $\alpha>0$ can be chosen so large that
$$\int_{\{|f|\ge\alpha h\}}h\,d\mu\le\delta\qquad\text{for all }f\in M.$$
(21.4) then insures that $g := \alpha h$ is an $\varepsilon$-bound for $M$, which proves that this set is equi-integrable. □

21.3 Corollary. Let $M\subset\mathscr{L}^p(\mu)$ and the set $M^p := \{|f|^p : f\in M\}$ be equi-integrable, where $1\le p<+\infty$. Then the set
$$M^p_+ := \{|\alpha f+\beta g|^p : f,g\in M,\ \alpha,\beta\in\mathbb{R},\ |\alpha|\le 1,\ |\beta|\le 1\}$$
is equi-integrable.

Proof. For every $f\in\mathscr{L}^p(\mu)$ and every $A\in\mathscr{A}$, $|1_A f|\le|f|$ shows that $1_A f\in\mathscr{L}^p(\mu)$ too, and so for all $f_1,f_2\in\mathscr{L}^p(\mu)$ Minkowski's inequality (14.4) gives
$$N_p(1_A f_1+1_A f_2)\le N_p(1_A f_1)+N_p(1_A f_2),$$
whence
$$\Bigl(\int_A|f_1+f_2|^p\,d\mu\Bigr)^{1/p}\le\Bigl(\int_A|f_1|^p\,d\mu\Bigr)^{1/p}+\Bigl(\int_A|f_2|^p\,d\mu\Bigr)^{1/p}.$$
Applying this inequality to $f_1 := \alpha f$, $f_2 := \beta g$ with $f,g\in M$, $\alpha,\beta\in\mathbb{R}$ and $|\alpha|\le 1$, $|\beta|\le 1$, and bearing in mind that 21.2 is (by hypothesis) valid for the set $M^p$, one realizes that conditions (21.3) and (21.4) are fulfilled by $M^p_+$ as well as by $M^p$, with the same function $h$ in both cases. □
We are now in a position to deliver the sharpened version of the dominated
convergence theorem mentioned in the introduction to this section. That we really
have to do with a sharpening here is attested to on the one hand by Example 3
and Theorem 20.4, according to which stochastic convergence follows from almost-
everywhere convergence, and on the other by Example 4 of §20, which shows that
there are situations in which the dominated convergence theorem is not applicable
but the following theorem is.

21.4 Theorem. For every sequence $(f_n)_{n\in\mathbb{N}}$ of $p$-fold $\mu$-integrable real functions on a measure space $(\Omega,\mathscr{A},\mu)$ the following two assertions are equivalent:
(i) The sequence $(f_n)$ converges in $p$th mean.
(ii) The sequence $(f_n)$ converges $\mu$-stochastically, and the sequence $(|f_n|^p)$ is $\mu$-equi-integrable.

Proof. (i)⇒(ii): Suppose $(f_n)$ converges in $p$th mean, to $f\in\mathscr{L}^p(\mu)$; thus
$$\lim_{n\to\infty}N_p(f_n-f)=0.$$
In the light of 20.4 only the equi-integrability of the sequence $(|f_n|^p)$ has to be proved. By (15.2) the sequence $(N_p(f_n))_{n\in\mathbb{N}}$ converges to $N_p(f)$ and is therefore bounded, so the set $M := \{|f_n|^p : n\in\mathbb{N}\}$ satisfies (21.3).
For every $A\in\mathscr{A}$ and every $n\in\mathbb{N}$ we have by (15.4)
$$\Bigl(\int_A|f_n|^p\,d\mu\Bigr)^{1/p}\le N_p(f_n-f)+\Bigl(\int_A|f|^p\,d\mu\Bigr)^{1/p}.$$
To every $\varepsilon>0$ corresponds an $n_\varepsilon\in\mathbb{N}$ such that $N_p(f_n-f)\le 2^{-1}\varepsilon^{1/p}$ for all $n\ge n_\varepsilon$. Therefore, if we set $\delta := 2^{-p}\varepsilon$ and
$$h := |f_1|^p\vee\dots\vee|f_{n_\varepsilon}|^p\vee|f|^p,$$
condition (21.4) is also satisfied by $M$.
(ii)⇒(i): From the stochastic convergence of the sequence $(f_n)$ and Remark 6 in §20 it follows that
$$\lim_{m,n\to\infty}\mu(\{|f_m-f_n|\ge\alpha\}\cap A)=0 \tag{21.5}$$
for every $A\in\mathscr{A}$ of finite measure and every real $\alpha>0$. We have to show that $(f_n)$ is a Cauchy sequence in $\mathscr{L}^p(\mu)$, that is, that the doubly-indexed sequence of functions $f_{mn} := f_m-f_n$ satisfies
$$\lim_{m,n\to\infty}\int|f_{mn}|^p\,d\mu=0.$$
According to 21.3, along with the set $\{|f_n|^p : n\in\mathbb{N}\}$ the set $M_0 := \{|f_{mn}|^p : m,n\in\mathbb{N}\}$ is also equi-integrable. Hence to every $\varepsilon>0$ corresponds an integrable function $g_\varepsilon\ge 0$ such that $\int_{\{f\ge g_\varepsilon\}}f\,d\mu\le\varepsilon$ holds for all $f\in M_0$. If we set $g := g_\varepsilon^{1/p}$ then $g$ is $p$-fold integrable and the preceding inequality can be written
$$\int_{\{|f_{mn}|\ge g\}}|f_{mn}|^p\,d\mu\le\varepsilon\qquad\text{for all }m,n\in\mathbb{N}.$$
Because
$$\int|f_{mn}|^p\,d\mu=\int_{\{|f_{mn}|\ge g\}}|f_{mn}|^p\,d\mu+\int_{\{|f_{mn}|<g\}}|f_{mn}|^p\,d\mu$$
it suffices to show that
$$\int_{\{|f_{mn}|<g\}}|f_{mn}|^p\,d\mu\le 3\varepsilon \tag{21.6}$$
holds for all sufficiently large $m,n\in\mathbb{N}$. Now $g^p\mu$, being a finite measure on $\mathscr{A}$, is continuous from above. Since $\bigcap_{k\in\mathbb{N}}\{g\le k^{-1}\}=\{g=0\}$, $\eta>0$ can therefore be chosen small enough that
$$\int_{\{g\le\eta\}}g^p\,d\mu\le\varepsilon.$$
Consequently we also have
$$\int_{\{|f_{mn}|<g\}\cap\{g\le\eta\}}|f_{mn}|^p\,d\mu\le\int_{\{g\le\eta\}}g^p\,d\mu\le\varepsilon\qquad\text{for all }m,n\in\mathbb{N}. \tag{21.7}$$
The Chebyshev-Markov inequality insures that the set $\{g>\eta\}$ has finite $\mu$-measure. According to (21.5) therefore the doubly-indexed sequence of sets
$$A_{mn} := \{|f_{mn}|\ge\alpha\}\cap\{g>\eta\},\qquad m,n\in\mathbb{N},$$
satisfies, whatever $\alpha>0$ is involved,
$$\lim_{m,n\to\infty}\mu(A_{mn})=0.$$
We choose the positive number $\alpha$ so as to have
$$\Bigl(\frac{\alpha}{\eta}\Bigr)^p\int g^p\,d\mu\le\varepsilon.$$
The $\mu$-continuity of the finite measure $g^p\mu$ and 17.8 provide for an $n_0\in\mathbb{N}$ such that
$$\int_{A_{mn}}g^p\,d\mu\le\varepsilon\qquad\text{for all }m,n\ge n_0.$$
Hence
$$\int_{\{|f_{mn}|<g\}\cap A_{mn}}|f_{mn}|^p\,d\mu\le\int_{A_{mn}}g^p\,d\mu\le\varepsilon\qquad\text{for all }m,n\ge n_0. \tag{21.8}$$
A second application of the Chebyshev-Markov inequality furnishes the estimate
$$\int_{\{|f_{mn}|<g\}\cap A'_{mn}}|f_{mn}|^p\,d\mu\le\alpha^p\mu(\{g>\eta\})\le\Bigl(\frac{\alpha}{\eta}\Bigr)^p\int g^p\,d\mu\le\varepsilon\qquad\text{for all }m,n\in\mathbb{N}, \tag{21.9}$$
where
$$A'_{mn} := \{|f_{mn}|<\alpha\}\cap\{g>\eta\}.$$
By adding the inequalities (21.7)-(21.9) we get finally inequality (21.6), whose confirmation was the last outstanding claim in the proof that (ii) implies (i). □

Remark. 1. Theorem 21.4 does not claim that from the stochastic convergence of a sequence $(f_n)$ to a measurable real function $f$, the $p$-fold integrability of $f$ and the convergence of $(f_n)$ to $f$ in $p$th mean follow as soon as the sequence $(|f_n|^p)$ is equi-integrable. Rather the theorem guarantees the existence of a $p$-fold integrable function among the possible stochastic limits of the sequence $(f_n)$. The sequence $(f_n)$ does converge in $p$th mean to every such stochastic limit, as follows from the proof of the theorem in the light of Remark 4 of §20, according to which any two $p$-fold integrable stochastic limits must in fact coincide almost everywhere.
But stochastic limits that are not $p$-fold integrable do exist, a fact that can be demonstrated with the aid of the Example in §20: For the sequence $(f_n)$ there, $(|f_n|^p)$ is equi-integrable. But among the stochastic limits $f'$ that occur there, $f'\in\mathscr{L}^p(\mu)$ for some $p\in[1,+\infty[$ if and only if $f'(\omega_1)=0$.
However, the phenomenon discussed above does not occur for σ-finite measures. By 20.3 in that case any two stochastic limits are almost everywhere equal. Therefore we have

21.5 Corollary. Suppose the measure $\mu$ is σ-finite. If a sequence $(f_n)$ from $\mathscr{L}^p(\mu)$ converges stochastically to a (measurable, real) function $f$, and if the sequence $(|f_n|^p)$ is equi-integrable, then $f\in\mathscr{L}^p(\mu)$ and $(f_n)$ converges in $p$th mean to $f$.

Theorem 21.4 can be sharpened by bringing in a further condition equivalent to (i) and (ii) which is suggested by F. Riesz' Theorem 15.3. En route to this sharpening the following lemma plays a key role. On the other hand, from the sharpening that we are aiming for, the lemma can in turn be deduced, as can the theorem of F. Riesz, even with its almost-everywhere convergence hypothesis weakened to stochastic convergence.

21.6 Lemma. Suppose the sequence $(f_n)$ of functions $f_n\ge 0$ from $\mathscr{L}^1(\mu)$ converges stochastically to a function $f\ge 0$ from $\mathscr{L}^1(\mu)$. If in addition
$$\lim_{n\to\infty}\int f_n\,d\mu=\int f\,d\mu,$$
then the sequence $(f_n)$ converges to $f$ in mean.

Proof. We consider the sequence $(f\wedge f_n)_{n\in\mathbb{N}}$. The inequalities
$$0\le f\wedge f_n\le f$$
and Example 3 show that it is equi-integrable. Since
$$0\le f-f\wedge f_n\le|f_n-f|\qquad(\text{for all }n\in\mathbb{N}),$$
stochastic convergence of $(f_n)$ to $f$ entails that of $(f\wedge f_n)$ to $f$. From Theorem 21.4 this new sequence then converges to $f$ in mean. We therefore also have
$$\lim_{n\to\infty}\int f\wedge f_n\,d\mu=\int f\,d\mu. \tag{21.10}$$
From this, the decomposition $f+f_n=f\vee f_n+f\wedge f_n$, and the convergence hypothesis follows the companion result
$$\lim_{n\to\infty}\int f\vee f_n\,d\mu=\int f\,d\mu. \tag{21.10'}$$
But then the decomposition
$$|f_n-f|=f\vee f_n-f\wedge f_n$$
shows that the claimed mean convergence ensues upon subtracting (21.10) from (21.10'). □

Now we can get the sharpening of Theorems 21.4 and 15.4 mentioned earlier:

21.7 Theorem. For every sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ which converges $\mu$-stochastically to a function $f\in\mathscr{L}^p(\mu)$ the following three assertions are equivalent:
(i) The sequence $(f_n)$ converges in $p$th mean to $f$.
(ii) The sequence $(|f_n|^p)$ is equi-integrable.
(iii) $\lim_{n\to\infty}\int|f_n|^p\,d\mu=\int|f|^p\,d\mu$.

Proof. The equivalence of (i) and (ii) is contained in Theorem 21.4. We need therefore establish only two implications:
(i)⇒(iii): Assertion (15.6) in Theorem 15.1 affirms this.
(iii)⇒(ii): From the hypothesized stochastic convergence of the sequence $(f_n)$ to $f$ follows that of $(|f_n|^p)$ to $|f|^p$, via 20.9. And then from the preceding lemma it further follows that the sequence $(|f_n|^p)$ converges to $|f|^p$ in mean. Finally, Theorem 21.4 - with the $p$ there chosen to be 1 - shows that the convergence in mean of this sequence entails its equi-integrability. □

For σ-finite measures $\mu$, equi-integrability can be characterized in a way that is particularly convenient for applications. The σ-finiteness will be exploited in the form expressed by 17.6, that there is a strictly positive function $h$ in $\mathscr{L}^1(\mu)$.

21.8 Theorem. Let $(\Omega,\mathscr{A},\mu)$ be a σ-finite measure space and $h$ a strictly positive function from $\mathscr{L}^1(\mu)$. Then for any set $M$ of $\mathscr{A}$-measurable numerical functions on $\Omega$ the following three assertions are equivalent:
(i) $M$ is equi-integrable.
(ii) For every $\varepsilon>0$ some scalar multiple of $h$ is an $\varepsilon$-bound for $M$.
(iii) $M$ satisfies
$$\sup_{f\in M}\int|f|\,d\mu<+\infty \tag{21.11}$$
as well as the following: Given $\varepsilon>0$ there exists $\delta>0$ such that
$$\int_A h\,d\mu\le\delta\ \Longrightarrow\ \int_A|f|\,d\mu\le\varepsilon\qquad\text{for all }A\in\mathscr{A},\ f\in M. \tag{21.12}$$

Statement (ii) simply says that
$$\lim_{\alpha\to+\infty}\int_{\{|f|\ge\alpha h\}}|f|\,d\mu=0 \tag{21.13}$$
holds uniformly for $f\in M$. Condition (21.12) is for obvious reasons (cf. 17.8) called the equi-$(h\mu)$-continuity of the measures $|f|\mu$, $f\in M$.

Proof. (i)⇒(ii): Let $g$ be an $\tfrac{\varepsilon}{2}$-bound for $M$. Then for all $f\in M$ and all $\alpha>0$
$$\int_{\{|f|\ge\alpha h\}}|f|\,d\mu=\int_{\{|f|\ge\alpha h\}\cap\{|f|\ge g\}}|f|\,d\mu+\int_{\{|f|\ge\alpha h\}\cap\{|f|<g\}}|f|\,d\mu\le\int_{\{|f|\ge g\}}|f|\,d\mu+\int_{\{g\ge\alpha h\}}g\,d\mu\le\frac{\varepsilon}{2}+\int_{\{g\ge\alpha h\}}g\,d\mu.$$
According to 13.6, $\mu(\{g=+\infty\})=0$. Since $g\mu$ is a finite measure on $\mathscr{A}$, it is continuous from above. Hence the fact that
$$\bigcap_{\alpha>0}\{g\ge\alpha h\}=\bigcap_{n\in\mathbb{N}}\{g\ge nh\}=\{g=+\infty\}$$
is a set of $(g\mu)$-measure 0 means that
$$\int_{\{g\ge\alpha h\}}g\,d\mu\le\frac{\varepsilon}{2}$$
for all sufficiently large $\alpha$. Coupled with the preceding inequality this shows that indeed $\alpha h$ is an $\varepsilon$-bound for all sufficiently large $\alpha$, that is, (ii) holds.
(ii)⇒(iii): This can be gleaned from the inequality derived at the beginning of the proof of 21.2, $\alpha h$ being now eligible for the function $g$ there:
$$\int_A|f|\,d\mu\le\int_{\{|f|\ge\alpha h\}}|f|\,d\mu+\alpha\int_A h\,d\mu\qquad\text{for all }f\in M.$$
(iii)⇒(i): 21.2 affirms this. □

Theorem 21.8 is of special significance for finite measures $\mu$. Then it is often expedient to choose for $h$ the constant function 1. When one does, (21.13) assumes the equivalent form
$$\lim_{\alpha\to+\infty}\int_{\{|f|\ge\alpha\}}|f|\,d\mu=0\qquad\text{uniformly for }f\in M. \tag{21.13'}$$
This condition is thus - just as (21.13) for σ-finite measures - necessary and sufficient for equi-integrability of $M$.

Remark. 2. In part (iii) of Theorem 21.8 the $\mathscr{L}^1$-boundedness of $M$ expressed by (21.11) cannot in general be dropped from the hypotheses. It suffices to consider the measure space $(\{a\},\mathscr{P}(\{a\}),\varepsilon_a)$ consisting of a single point and the sequence of functions $f_n := n\cdot 1_{\{a\}}$. This sequence is not equi-integrable, although for every $\varepsilon>0$ and every strictly positive $h$, (21.12) holds whenever $0<\delta<h(a)$.
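The one-point counterexample of Remark 2 is small enough to verify by brute force. The sketch below assumes the Dirac measure at a point labelled `"a"` and the hypothetical choice $h(a)=1$; the helper names are illustrative only.

```python
from fractions import Fraction
from itertools import chain, combinations

points = ["a"]
dirac = {"a": Fraction(1)}   # the Dirac measure at a
h = {"a": Fraction(1)}       # hypothetical strictly positive h with h(a) = 1

def integral(func, A):
    """Integral of func over the subset A with respect to the Dirac measure."""
    return sum((func[x] * dirac[x] for x in A), Fraction(0))

def subsets(pts):
    """All subsets of the finite list pts."""
    return [set(c) for c in chain.from_iterable(
        combinations(pts, r) for r in range(len(pts) + 1))]

def condition_21_12(eps, delta, n_max):
    """Check (21.12) for f_1, ..., f_{n_max}, where f_n is the constant n."""
    for n in range(1, n_max + 1):
        f_n = {"a": Fraction(n)}
        for A in subsets(points):
            if integral(h, A) <= delta and integral(f_n, A) > eps:
                return False
    return True

def l1_norms(n_max):
    """Integrals of f_1, ..., f_{n_max} over the whole space."""
    return [integral({"a": Fraction(n)}, points) for n in range(1, n_max + 1)]

print(condition_21_12(Fraction(1, 100), Fraction(1, 2), 50))
print(l1_norms(5))
```

With $\delta<h(a)$ the only admissible set is the empty one, so (21.12) holds vacuously, while the integrals of the $f_n$ are unbounded and (21.11) fails.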

Let us close by deriving a sufficient condition for equi-integrability in the finite-measure case which generalizes the introductory Example 3.

21.9 Lemma. Let $\mu$ be a finite measure and $M\subset\mathscr{L}^1(\mu)$. Suppose that there is a $\mu$-integrable function $g\ge 0$ such that
$$\int_{\{|f|\ge\alpha\}}|f|\,d\mu\le\int_{\{|f|\ge\alpha\}}g\,d\mu \tag{21.14}$$
for all $f\in M$ and all $\alpha\in\mathbb{R}_+$. Then $M$ is equi-integrable.

Proof. The case $\alpha := 0$ of (21.14) says that $\int|f|\,d\mu\le\int g\,d\mu<+\infty$ for all $f\in M$. Then Chebyshev's inequality tells us that
$$\mu(\{|f|\ge\alpha\})\le\frac{1}{\alpha}\int|f|\,d\mu\le\frac{1}{\alpha}\int g\,d\mu\qquad\text{for all }\alpha>0,\ f\in M.$$
It follows from this that
$$\lim_{\alpha\to+\infty}\mu(\{|f|\ge\alpha\})=0\qquad\text{uniformly in }f\in M. \tag{21.15}$$
For each $\varepsilon>0$, 17.8 supplies a $\delta>0$ such that
$$A\in\mathscr{A}\text{ and }\mu(A)\le\delta\ \Longrightarrow\ \int_A g\,d\mu\le\varepsilon.$$
Putting this together with (21.14) and (21.15) gives us
$$\lim_{\alpha\to+\infty}\int_{\{|f|\ge\alpha\}}|f|\,d\mu=0\qquad\text{uniformly for }f\in M,$$
that is, (21.13'), which we have seen entails equi-integrability of $M$. □
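The final step of the proof of 21.9, in which the three ingredients are put together, can be written out as follows. This is a sketch of one way to arrange the estimate; the threshold $\alpha_0$ is a name introduced here for illustration and does not occur in the original.

```latex
Fix $\varepsilon > 0$ and let $\delta > 0$ be as supplied by 17.8, so that
$\mu(A) \le \delta$ implies $\int_A g \, d\mu \le \varepsilon$.
By (21.15) there is an $\alpha_0 > 0$ with
\[
  \mu(\{|f| \ge \alpha\}) \le \delta
  \qquad \text{for all } \alpha \ge \alpha_0,\ f \in M .
\]
Applying first (21.14) and then the choice of $\delta$ to the set
$A := \{|f| \ge \alpha\}$ gives
\[
  \int_{\{|f| \ge \alpha\}} |f| \, d\mu
  \;\le\; \int_{\{|f| \ge \alpha\}} g \, d\mu
  \;\le\; \varepsilon
  \qquad \text{for all } \alpha \ge \alpha_0,\ f \in M ,
\]
which is the uniform convergence asserted in (21.13').
```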

Exercises.
1. Show that for any measure space $(\Omega,\mathscr{A},\mu)$ a set $M$ of measurable numerical functions is equi-integrable if and only if for every $\varepsilon>0$ there is an integrable function $h=h_\varepsilon\ge 0$ such that $\int(|f|-h)^+\,d\mu\le\varepsilon$ for all $f\in M$. [Hint: For sufficiently large $\eta>0$, $g := \eta h$ will be a $2\varepsilon$-bound for $M$.]
2. Let $(\Omega,\mathscr{A},\mu)$ be an arbitrary measure space, $1\le p<+\infty$. Suppose the sequence $(f_n)$ in $\mathscr{L}^p(\mu)$ converges almost everywhere on $\Omega$ to a measurable real function $f$. Show that $f$ lies in $\mathscr{L}^p(\mu)$ and $(f_n)$ converges to $f$ in $p$th mean if the sequence $(|f_n|^p)$ is equi-integrable.
3. Show that from the $\mathscr{L}^p$-convergence of a sequence $(f_n)$ to a function $f\in\mathscr{L}^p(\mu)$ follows the $\mathscr{L}^1$-convergence of the sequence $(|f_n|^p)$ to $|f|^p$, for any $1\le p<+\infty$.
4. Consider a finite measure $\mu$ and an $M\subset\mathscr{L}^1(\mu)$. For each $n\in\mathbb{N}$, $f\in M$ set
$$a_n(f) := n\,\mu(\{n\le|f|<n+1\}).$$
Show that $M$ is equi-integrable if and only if the series $\sum_{n=1}^{\infty}a_n(f)$ converges uniformly in $f\in M$. [Cf. Theorem 3.4 and its proof in BAUER [1996].]
5. Consider a finite measure $\mu$ and an $M\subset\mathscr{L}^1(\mu)$. Show that $M$ is equi-integrable if there is a function $q:\mathbb{R}_+\to\mathbb{R}_+$ with the properties
$$\lim_{t\to+\infty}\frac{q(t)}{t}=+\infty\qquad\text{and}\qquad\sup_{f\in M}\int q\circ|f|\,d\mu<+\infty.$$
(In fact we have to do here with a necessary as well as a sufficient condition, which goes back to CH. DE LA VALLÉE POUSSIN (1866-1962). Moreover, $q$ can always be chosen to be convex and isotone. Cf. MEYER [1976], p. 19 or DELLACHERIE and MEYER [1975], p. 38.)
6. Let $(\Omega,\mathscr{A},\mu)$ be a measure space with $\mu(\Omega)<+\infty$, $(f_n)_{n\in\mathbb{N}}$ a sequence of measurable numerical functions $f_n\ge 0$, and set $f^* := \limsup_{n\to\infty}f_n$. Show that:
(a) If the sequence $(f_n)$ is equi-integrable (or at least satisfies condition (21.12)), then the following "dual version" of Fatou's lemma is valid:
$$\limsup_{n\to\infty}\int_A f_n\,d\mu\le\int_A f^*\,d\mu\qquad\text{for all }A\in\mathscr{A}. \tag{$*$}$$
How does the corresponding result in Exercise 1 of §15 fit in? [Hint: Exercise 2 of §11.]
(b) Under the hypothesis $\int f^*\,d\mu<+\infty$, the sequence $(f_n)$ is equi-integrable if and only if ($*$) holds. [In proving the "if" direction, argue indirectly.]
(c) Result (b) can fail in case $\int f^*\,d\mu=+\infty$. Try to corroborate this with a sequence $(\alpha_n f_n)$ derived by appropriate choice of (sufficiently large) numbers $\alpha_n>0$ from the sequence $(f_n)$ in the Example from §15.
7. Let $(\Omega,\mathscr{A},\mu)$ be a measure space with $\mu(\Omega)<+\infty$, and let $(\nu_i)_{i\in I}$ be a family of finite and $\mu$-continuous measures on $\mathscr{A}$. Suppose this family is equi-continuous at $\emptyset$, meaning that to every sequence $(A_n)_{n\in\mathbb{N}}$ in $\mathscr{A}$ with $A_n\downarrow\emptyset$ and to every $\varepsilon>0$ there is an $n_\varepsilon\in\mathbb{N}$ such that $\nu_i(A_n)\le\varepsilon$ for all $n\ge n_\varepsilon$ and all $i\in I$. Show that then this family is equi-$\mu$-continuous in the following sense (cf. (21.12)): To every $\varepsilon>0$ there corresponds a $\delta=\delta_\varepsilon>0$ such that
$$A\in\mathscr{A}\text{ and }\mu(A)\le\delta\ \Longrightarrow\ \nu_i(A)\le\varepsilon\text{ for all }i\in I.$$
What does this result say in view of Theorem 21.8? [Hint: Review the proof of Theorem 17.8.]
Chapter III
Product Measures

In this short chapter we will investigate whether and how one can associate a prod-
uct with finitely many measure spaces. And for the product measures thus gotten
we will want to see about how to integrate with respect to them in terms of their
factors. We will recognize the L-B measure $\lambda^d$ as being a special product measure
when $d\ge 2$. One important application of product measures is the introduction
of the concept of convolution for measures and functions.

§22. Products of σ-algebras and measures

Finitely many measurable spaces $(\Omega_j,\mathscr{A}_j)$, $j=1,\dots,n\in\mathbb{N}$, are given. We consider the product set
$$\Omega := \mathop{\times}_{j=1}^{n}\Omega_j=\Omega_1\times\dots\times\Omega_n$$
and for each $j$ the projection mapping
$$p_j:\Omega\to\Omega_j$$
which assigns to each point $(\omega_1,\dots,\omega_n)\in\Omega$ its $j$th coordinate $\omega_j$. The σ-algebra in $\Omega$ generated by the mappings $p_1,\dots,p_n$ is designated
$$\bigotimes_{j=1}^{n}\mathscr{A}_j=\mathscr{A}_1\otimes\dots\otimes\mathscr{A}_n$$
and called the product of the σ-algebras $\mathscr{A}_1,\dots,\mathscr{A}_n$. According to (7.3) we have to do here with the smallest σ-algebra $\mathscr{A}$ in $\Omega$ such that each $p_j$ is $\mathscr{A}$-$\mathscr{A}_j$-measurable.
The reader may recall that the product of finitely many topological spaces is defined in a very similar way.
An important principle of generation for such products is immediately at hand:

22.1 Theorem. For each $j=1,\dots,n$ let $\mathscr{E}_j$ be a generator of the σ-algebra $\mathscr{A}_j$ in $\Omega_j$ which contains a sequence $(E_{jk})_{k\in\mathbb{N}}$ of sets with $E_{jk}\uparrow\Omega_j$. Then the σ-algebra $\mathscr{A}_1\otimes\dots\otimes\mathscr{A}_n$ is generated by the system of all sets
$$E_1\times\dots\times E_n$$
with $E_j\in\mathscr{E}_j$ for each $j=1,\dots,n$.

Proof. Let $\mathscr{A}$ be any σ-algebra in $\Omega$. What we have to show is that the mappings $p_j$ are all $\mathscr{A}$-$\mathscr{A}_j$-measurable ($j=1,\dots,n$) if and only if $\mathscr{A}$ contains each of the sets $E_1\times\dots\times E_n$ described above. According to 7.2 $p_j$ is $\mathscr{A}$-$\mathscr{A}_j$-measurable just exactly if $p_j^{-1}(E_j)\in\mathscr{A}$ for every $E_j\in\mathscr{E}_j$. If this condition is fulfilled for each $j\in\{1,\dots,n\}$, then the sets
$$E_1\times\dots\times E_n=p_1^{-1}(E_1)\cap\dots\cap p_n^{-1}(E_n)$$
all lie in $\mathscr{A}$. If conversely, $E_1\times\dots\times E_n\in\mathscr{A}$ for every possible choice of $E_j\in\mathscr{E}_j$ and $j\in\{1,\dots,n\}$, then upon fixing $E_j\in\mathscr{E}_j$, the sets
$$F_k := E_{1k}\times\dots\times E_{j-1,k}\times E_j\times E_{j+1,k}\times\dots\times E_{nk},\qquad k\in\mathbb{N},$$
all lie in $\mathscr{A}$. Since the sequence $(F_k)_{k\in\mathbb{N}}$ increases to
$$\Omega_1\times\dots\times\Omega_{j-1}\times E_j\times\Omega_{j+1}\times\dots\times\Omega_n=p_j^{-1}(E_j),$$
this set too lies in $\mathscr{A}$, for each $j$. The claim is therewith proven. □

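Remark. 1. The restriction imposed on the generators $\mathscr{E}_j$ cannot generally be dispensed with. Take, for example, $n := 2$, $\mathscr{A}_1 := \{\emptyset,\Omega_1\}$, $\mathscr{E}_1 := \{\emptyset\}$ and $\mathscr{E}_2 := \mathscr{A}_2$, in which $\mathscr{A}_2$ contains at least four sets.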
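On finite sets the generated σ-algebra can be computed by brute force, which makes both Theorem 22.1 and the counterexample of Remark 1 easy to inspect. The following is a minimal sketch under the assumption of two-point spaces; the function names and the concrete sets are illustrative choices, not part of the original.

```python
from itertools import product

def generate_sigma_algebra(omega, generators):
    """Smallest family of subsets of the finite set omega that contains the
    generators and is closed under complements and (finite) unions."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        for a in list(family):
            for b in list(family):
                for candidate in (omega - a, a | b):
                    if candidate not in family:
                        family.add(candidate)
                        changed = True
    return family

def rectangles(gen1, gen2):
    """All product sets E1 x E2 with E1 from gen1 and E2 from gen2."""
    return [frozenset(product(e1, e2)) for e1 in gen1 for e2 in gen2]

omega1, omega2 = {0, 1}, {"x", "y"}
omega = set(product(omega1, omega2))
A1 = [set(), omega1]                       # trivial sigma-algebra on omega1
A2 = [set(), {"x"}, {"y"}, omega2]         # power set of omega2 (four sets)

product_algebra = generate_sigma_algebra(omega, rectangles(A1, A2))
# Generators as in Remark 1: E_1 = {empty set} contains no sequence rising to omega1.
bad = generate_sigma_algebra(omega, rectangles([set()], A2))
# Generators satisfying the hypothesis of Theorem 22.1.
good = generate_sigma_algebra(omega, rectangles([omega1], [{"x"}, omega2]))

print(len(product_algebra), len(bad), len(good))
```

With the generators of Remark 1 every rectangle is empty, so only the trivial σ-algebra is generated, whereas generators that contain the whole space in each factor recover the full product σ-algebra.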

A particular case of this theorem is the fact that the product $\mathscr{A}_1\otimes\dots\otimes\mathscr{A}_n$ is generated by all the sets $A_1\times\dots\times A_n$ with each $A_j\in\mathscr{A}_j$. Our further course will be guided by the following example:

Example. For each $j\in\{1,\dots,n\}$ let $\Omega_j := \mathbb{R}$, $\mathscr{A}_j := \mathscr{B}^1$ and $\mathscr{E}_j := \mathscr{I}^1$. The system of all sets $E_1\times\dots\times E_n$ with each $E_j\in\mathscr{I}^1$ is evidently just the system $\mathscr{I}^n$ of all right half-open intervals in $\mathbb{R}^n$. According to 6.1, $\mathscr{I}^n$ generates the σ-algebra $\mathscr{B}^n$ of $n$-dimensional Borel sets. Taken together with 22.1 - whose hypotheses are clearly satisfied here - this reveals that
$$\mathscr{B}^n=\mathscr{B}^1\otimes\dots\otimes\mathscr{B}^1\qquad(n\text{ factors on the right}). \tag{22.2}$$
By 6.2, $\lambda^n$ is the only measure on $\mathscr{B}^n$ which satisfies
$$\lambda^n(I_1\times\dots\times I_n)=\lambda^1(I_1)\cdot\ldots\cdot\lambda^1(I_n)$$
for all $I_1,\dots,I_n\in\mathscr{I}^1$. This remark and the example preceding it lead to the following question.

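Measure spaces $(\Omega_j,\mathscr{A}_j,\mu_j)$ are given, $1\le j\le n$ with $n\ge 2$, and for each $\mathscr{A}_j$ a generator $\mathscr{E}_j$. Under what hypotheses can the existence of a measure $\pi$ on $\mathscr{A}_1\otimes\dots\otimes\mathscr{A}_n$ satisfying
$$\pi(E_1\times\dots\times E_n)=\mu_1(E_1)\cdot\ldots\cdot\mu_n(E_n)\qquad\text{for all }E_j\in\mathscr{E}_j,\ 1\le j\le n \tag{22.3}$$
be proven?
The accompanying uniqueness question can be settled at once: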

22.2 Theorem. Suppose that for each $j=1,\dots,n$ $\mathscr{E}_j$ is an $\cap$-stable generator of $\mathscr{A}_j$ which contains a sequence $(E_{jk})_{k\in\mathbb{N}}$ of sets of finite $\mu_j$-measure satisfying $E_{jk}\uparrow\Omega_j$. Then there is at most one measure $\pi$ on $\mathscr{A}_1\otimes\dots\otimes\mathscr{A}_n$ enjoying property (22.3).

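Proof. Let $\mathscr{E}$ denote the system of all sets $E_1\times\dots\times E_n$, where $E_j\in\mathscr{E}_j$ for each $j$. According to 22.1, $\mathscr{E}$ generates the σ-algebra $\mathscr{A}_1\otimes\dots\otimes\mathscr{A}_n$. Since each $\mathscr{E}_j$ is $\cap$-stable, so is $\mathscr{E}$, as the identity
$$\Bigl(\mathop{\times}_{j=1}^{n}E_j\Bigr)\cap\Bigl(\mathop{\times}_{j=1}^{n}F_j\Bigr)=\mathop{\times}_{j=1}^{n}(E_j\cap F_j)$$
makes clear. Moreover $E_k := E_{1k}\times\dots\times E_{nk}$ ($k\in\mathbb{N}$) defines a sequence in $\mathscr{E}$ that evidently satisfies
$$E_k\uparrow\Omega_1\times\dots\times\Omega_n.$$
Recalling that $\mu_j(E_{jk})<+\infty$ for all (relevant) $j$ and $k$, we see that the uniqueness claim therefore follows from 5.4. (Obviously it would suffice if $\bigcup_{k\in\mathbb{N}}E_{jk}=\Omega_j$ instead of $E_{jk}\uparrow\Omega_j$ were satisfied for each $j$.) □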

Under the hypotheses of 22.2, which obviously entail the σ-finiteness of each measure $\mu_j$, the existence of the desired measure $\pi$ can also be proven. This proof will be carried out in the next section, first for $n=2$, then for arbitrary $n\ge 2$.

Remark. 2. In closing it should again be mentioned that a mapping
$$f:\Omega_0\to\Omega_1\times\dots\times\Omega_n$$
of a measurable space $(\Omega_0,\mathscr{A}_0)$ into a product of measurable spaces $(\Omega_j,\mathscr{A}_j)$ is measurable with respect to the σ-algebra $\mathscr{A}_1\otimes\dots\otimes\mathscr{A}_n$ if and only if each component mapping $f_j := p_j\circ f$ of $f$ is $\mathscr{A}_0$-$\mathscr{A}_j$-measurable - a fact which is immediate from Theorem 7.4.

Exercise.
Finitely many measurable spaces $(\Omega_j,\mathscr{A}_j)$ are given, $j=1,\dots,n$. Show that the algebra in $\Omega_1\times\dots\times\Omega_n$ generated by all sets $A_1\times\dots\times A_n$ with each $A_j\in\mathscr{A}_j$ consists of all finite unions of such product sets.

§23. Product measures and Fubini's theorem

Initially measure spaces $(\Omega_1,\mathscr{A}_1,\mu_1)$, $(\Omega_2,\mathscr{A}_2,\mu_2)$ are given. For every $Q\subset\Omega_1\times\Omega_2$ the sets
$$\begin{aligned}Q_{\omega_1}&:=\{\omega_2\in\Omega_2 : (\omega_1,\omega_2)\in Q\}\\ Q_{\omega_2}&:=\{\omega_1\in\Omega_1 : (\omega_1,\omega_2)\in Q\}\end{aligned} \tag{23.1}$$
are called, respectively, the $\omega_1$-section of $Q$ ($\omega_1\in\Omega_1$) and the $\omega_2$-section of $Q$ ($\omega_2\in\Omega_2$).
This notation is chosen for typographic simplicity and will see us through §23, after which it is not needed. In case $\Omega_1=\Omega_2$, however, it presents obvious problems, to circumvent which, alternative notations such as ${}_{\omega_1}Q$ for $Q_{\omega_1}$ are also popular in the literature.
About these sets we claim:

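23.1 Lemma. If $Q\in\mathscr{A}_1\otimes\mathscr{A}_2$, then its $\omega_1$-section lies in $\mathscr{A}_2$ for every $\omega_1\in\Omega_1$, and its $\omega_2$-section lies in $\mathscr{A}_1$ for every $\omega_2\in\Omega_2$.

Proof. For arbitrary subsets $Q,Q_1,Q_2,\dots$ of $\Omega := \Omega_1\times\Omega_2$, and points $\omega_1\in\Omega_1$
$$(\Omega\setminus Q)_{\omega_1}=\Omega_2\setminus Q_{\omega_1}$$
and
$$\Bigl(\bigcup_{n\in\mathbb{N}}Q_n\Bigr)_{\omega_1}=\bigcup_{n\in\mathbb{N}}(Q_n)_{\omega_1}.$$
Furthermore $\Omega_{\omega_1}=\Omega_2$, and more generally for $A_1\subset\Omega_1$, $A_2\subset\Omega_2$ we have
$$(A_1\times A_2)_{\omega_1}=\begin{cases}A_2&\text{if }\omega_1\in A_1\\ \emptyset&\text{if }\omega_1\in\Omega_1\setminus A_1.\end{cases}$$
For each $\omega_1\in\Omega_1$, therefore, the system of all sets $Q\subset\Omega$ having section $Q_{\omega_1}\in\mathscr{A}_2$ is a σ-algebra in $\Omega$ which contains every product set $A_1\times A_2$ with $A_1\in\mathscr{A}_1$, $A_2\in\mathscr{A}_2$. But according to 22.1 $\mathscr{A}_1\otimes\mathscr{A}_2$ is the smallest σ-algebra which contains all such product sets. This proves the part of the lemma dealing with $\omega_1$-sections. Of course, $\omega_2$-sections are treated the same way. □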

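Since now $\mu_2(Q_{\omega_1})$ and $\mu_1(Q_{\omega_2})$ make sense for all $Q\in\mathscr{A}_1\otimes\mathscr{A}_2$, $\omega_1\in\Omega_1$ and $\omega_2\in\Omega_2$, we are in a position to take the next step:

23.2 Lemma. Suppose the measures $\mu_1$ and $\mu_2$ are σ-finite. Then for every $Q\in\mathscr{A}_1\otimes\mathscr{A}_2$ the functions
$$\omega_1\mapsto\mu_2(Q_{\omega_1})\qquad\text{and}\qquad\omega_2\mapsto\mu_1(Q_{\omega_2})$$
on $\Omega_1$ and $\Omega_2$, respectively, are $\mathscr{A}_1$-measurable and $\mathscr{A}_2$-measurable, respectively.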

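Proof. The function $\omega_1\mapsto\mu_2(Q_{\omega_1})$ will be denoted by $s_Q$. We will establish the $\mathscr{A}_1$-measurability of $s_Q$ for each $Q\in\mathscr{A}_1\otimes\mathscr{A}_2$. The other function can be treated analogously.
First suppose that $\mu_2(\Omega_2)<+\infty$. In this case the set $\mathscr{D}$ of all $D\in\mathscr{A}_1\otimes\mathscr{A}_2$ whose $s_D$ function is $\mathscr{A}_1$-measurable constitutes a Dynkin system in $\Omega := \Omega_1\times\Omega_2$. This involves the following easily checked assertions:
$s_\Omega=\mu_2(\Omega_2)$;
$s_{\Omega\setminus D}=s_\Omega-s_D$ for every $D\in\mathscr{D}$;
$s_{\bigcup D_n}=\sum s_{D_n}$ for every sequence $(D_n)$ of disjoint sets in $\mathscr{D}$.
Furthermore $\mathscr{D}$ contains $A_1\times A_2$ for every $A_1\in\mathscr{A}_1$, $A_2\in\mathscr{A}_2$, since
$$s_{A_1\times A_2}=\mu_2(A_2)\cdot 1_{A_1}.$$
The system $\mathscr{E}$ of all such $A_1\times A_2$ is $\cap$-stable and generates $\mathscr{A}_1\otimes\mathscr{A}_2$, by 22.1. Therefore 2.4 insures that $\mathscr{A}_1\otimes\mathscr{A}_2$ is the Dynkin system generated by $\mathscr{E}$. From $\mathscr{E}\subset\mathscr{D}\subset\mathscr{A}_1\otimes\mathscr{A}_2$ therefore follows that $\mathscr{D}=\mathscr{A}_1\otimes\mathscr{A}_2$, which is what is being claimed.
If $\mu_2$ is only σ-finite, then there is a sequence $(B_n)$ of sets from $\mathscr{A}_2$, each of finite $\mu_2$-measure, with $B_n\uparrow\Omega_2$. For each $n$, $A_2\mapsto\mu_2(A_2\cap B_n)$ is therefore a finite measure $\mu_{2n}$ on $\mathscr{A}_2$, to which the already proven result can be applied, showing that $\omega_1\mapsto\mu_{2n}(Q_{\omega_1})$ is $\mathscr{A}_1$-measurable for each $Q\in\mathscr{A}_1\otimes\mathscr{A}_2$. Now
$$\mu_2(Q_{\omega_1})=\sup_{n\in\mathbb{N}}\mu_{2n}(Q_{\omega_1})$$
because of the continuity from below of the measure $\mu_2$. From Theorem 9.5 then the mapping $\omega_1\mapsto\mu_2(Q_{\omega_1})$ is indeed $\mathscr{A}_1$-measurable. □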

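It is now rather simple to construct the measure $\pi$ that we seek:

23.3 Theorem. Let $(\Omega_j, \mathcal{A}_j, \mu_j)$ be $\sigma$-finite measure spaces, $j = 1, 2$. Then there is exactly one measure $\pi$ on $\mathcal{A}_1 \otimes \mathcal{A}_2$ which satisfies
(23.2) $\pi(A_1 \times A_2) = \mu_1(A_1)\,\mu_2(A_2)$ for all $A_1 \in \mathcal{A}_1$, $A_2 \in \mathcal{A}_2$.
In addition this measure satisfies

(23.3) $\pi(Q) = \int \mu_2(Q_{\omega_1})\,\mu_1(d\omega_1) = \int \mu_1(Q_{\omega_2})\,\mu_2(d\omega_2)$ for all $Q \in \mathcal{A}_1 \otimes \mathcal{A}_2$

and is $\sigma$-finite.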

Proof. As before, for each $Q \in \mathcal{A}_1 \otimes \mathcal{A}_2$ let $s_Q$ denote the $\mathcal{A}_1$-measurable function $\omega_1 \mapsto \mu_2(Q_{\omega_1})$ on $\Omega_1$; it is of course non-negative. Consequently via

$\pi(Q) := \int s_Q \, d\mu_1$

a non-negative function $\pi$ is well defined on $\mathcal{A}_1 \otimes \mathcal{A}_2$. For every sequence $(Q_n)_{n \in \mathbb{N}}$ of pairwise disjoint sets from $\mathcal{A}_1 \otimes \mathcal{A}_2$ the equality $s_{\bigcup Q_n} = \sum s_{Q_n}$ and 11.5 insure that

$\pi\Big(\bigcup_{n \in \mathbb{N}} Q_n\Big) = \sum_{n=1}^{\infty} \pi(Q_n)$.

Since $s_\emptyset = 0$ we have $\pi(\emptyset) = 0$. This proves that $\pi$ is indeed a measure on $\mathcal{A}_1 \otimes \mathcal{A}_2$. It has property (23.2) because $s_{A_1 \times A_2} = \mu_2(A_2)\,1_{A_1}$, whence integration yields
$\pi(A_1 \times A_2) = \mu_1(A_1)\,\mu_2(A_2)$.
Proceeding analogously, we confirm that

$\pi'(Q) := \int \mu_1(Q_{\omega_2})\,\mu_2(d\omega_2)$

also defines a measure on $\mathcal{A}_1 \otimes \mathcal{A}_2$ having this property. But when Theorem 22.2 is applied to $\mathcal{E}_1 := \mathcal{A}_1$ and $\mathcal{E}_2 := \mathcal{A}_2$ it affirms that there is at most one such measure. Thus $\pi = \pi'$ and (23.3) is confirmed. There is a sequence $(A_{jn})_{n \in \mathbb{N}}$ of sets from $\mathcal{A}_j$, each of finite $\mu_j$-measure, with $A_{jn} \uparrow \Omega_j$, for $j = 1$ and $j = 2$. Using these as the $A_1$, $A_2$, respectively, in (23.2) proves the $\sigma$-finiteness of $\pi$ because
$\pi(A_{1n} \times A_{2n}) < +\infty$ and $A_{1n} \times A_{2n} \uparrow \Omega_1 \times \Omega_2$. □
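
On finite spaces, where every measure is given by point weights, both expressions in (23.3) reduce to finite sums and can be compared exactly. The sketch below is illustrative only: the weights `mu1`, `mu2`, the set `Q` and all function names are assumptions chosen for the example, not taken from the text.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical finite measure spaces: point weights define mu1 and mu2.
mu1 = {0: F(1, 2), 1: F(3, 2), 2: F(1)}
mu2 = {"a": F(2), "b": F(1, 3)}

def measure(weights, A):
    """Measure of a finite set A under the given point weights."""
    return sum((weights[w] for w in A), F(0))

def pi_via_sections_1(Q):
    """First form of (23.3): integrate w1 -> mu2(Q_{w1}) against mu1."""
    return sum((mu1[w1] * measure(mu2, {b for (a, b) in Q if a == w1})
                for w1 in mu1), F(0))

def pi_via_sections_2(Q):
    """Second form of (23.3): integrate w2 -> mu1(Q_{w2}) against mu2."""
    return sum((mu2[w2] * measure(mu1, {a for (a, b) in Q if b == w2})
                for w2 in mu2), F(0))

# A non-rectangular set and a product set A1 x A2.
Q = {(0, "a"), (1, "a"), (1, "b"), (2, "b")}
A1, A2 = {0, 2}, {"a", "b"}
rect = set(product(A1, A2))
```

Both functions agree on `Q`, and on `rect` they return `measure(mu1, A1) * measure(mu2, A2)`, which is property (23.2).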
23.4 Definition. The measure $\pi$ on $\mathcal{A}_1 \otimes \mathcal{A}_2$ which is uniquely specified by (23.2) whenever $(\Omega_1, \mathcal{A}_1, \mu_1)$ and $(\Omega_2, \mathcal{A}_2, \mu_2)$ are $\sigma$-finite measure spaces is called the product of the measures $\mu_1$ and $\mu_2$ and is denoted by
$\mu_1 \otimes \mu_2$.

Thus also the question posed in §22 is answered for $\sigma$-finite measures $\mu_1$, $\mu_2$. If namely $\mathcal{E}_j$ is a generator of $\mathcal{A}_j$ ($j = 1, 2$) with the properties formulated in Theorem 22.2, then according to 22.2 and 23.3, $\mu_1 \otimes \mu_2$ is the only measure $\pi$ on $\mathcal{A}_1 \otimes \mathcal{A}_2$ which satisfies (22.3).
The Example in §22 therefore entails that $\lambda^2 = \lambda^1 \otimes \lambda^1$. Similar considerations lead to the validity of
$\lambda^{m+n} = \lambda^m \otimes \lambda^n$

for any $m, n \in \mathbb{N}$, once the appropriate identification of $\mathbb{R}^{m+n}$ with $\mathbb{R}^m \times \mathbb{R}^n$ has been made.
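We turn now to integrating with respect to the product measure $\mu_1 \otimes \mu_2$. Our notation for sections can be usefully extended to functions for this purpose. If $f : \Omega_1 \times \Omega_2 \to \Omega'$ is any mapping, we define its sections $f_{\omega_1}$ for each $\omega_1 \in \Omega_1$ and $f_{\omega_2}$ for each $\omega_2 \in \Omega_2$ as mappings of $\Omega_2$ and $\Omega_1$, respectively, into $\Omega'$ by
(23.4) $f_{\omega_1}(\omega_2) := f(\omega_1, \omega_2)$ for all $\omega_2 \in \Omega_2$, and $f_{\omega_2}(\omega_1) := f(\omega_1, \omega_2)$ for all $\omega_1 \in \Omega_1$.
Notice that if $Q \subset \Omega_1 \times \Omega_2$ and $f := 1_Q$, then these functions satisfy
(23.5) $(1_Q)_{\omega_1} = 1_{Q_{\omega_1}}$ and $(1_Q)_{\omega_2} = 1_{Q_{\omega_2}}$.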

Note, of course, that these indicator functions have different domains, and, just as with (23.1), further caution is called for with (23.4) in case $\Omega_1 = \Omega_2$. Equations (23.4) and (23.5) lead us to call the mapping $f_{\omega_j}$ the $\omega_j$-section of $f$. It enjoys the expected properties:

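23.5 Lemma. For every measurable space $(\Omega', \mathcal{A}')$ and every measurable mapping
$f : (\Omega_1 \times \Omega_2, \mathcal{A}_1 \otimes \mathcal{A}_2) \to (\Omega', \mathcal{A}')$,
$f_{\omega_1}$ is $\mathcal{A}_2$-$\mathcal{A}'$-measurable and $f_{\omega_2}$ is $\mathcal{A}_1$-$\mathcal{A}'$-measurable for every $\omega_1 \in \Omega_1$, $\omega_2 \in \Omega_2$.

Proof. For every $A' \in \mathcal{A}'$, $\omega_1 \in \Omega_1$

$f_{\omega_1}^{-1}(A') = \{\omega_2 \in \Omega_2 : (\omega_1, \omega_2) \in f^{-1}(A')\} = (f^{-1}(A'))_{\omega_1}$
and similarly for every $\omega_2 \in \Omega_2$

$f_{\omega_2}^{-1}(A') = (f^{-1}(A'))_{\omega_2}$,
so the measurability claims follow from Lemma 23.1. □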

Decisive is the following theorem which extends formula (23.3) from indicator
functions to non-negative measurable functions. It goes back to L. TONELLI (1885-
1946), its corollary to G. FUBINI (1879-1943). Both statements are often combined
under the single designation the theorem of Fubini.

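23.6 Theorem (of Tonelli). Let $(\Omega_j, \mathcal{A}_j, \mu_j)$ be $\sigma$-finite measure spaces ($j = 1, 2$), and let
$f : \Omega_1 \times \Omega_2 \to \overline{\mathbb{R}}_+$
be $\mathcal{A}_1 \otimes \mathcal{A}_2$-measurable. Then the functions
$\omega_2 \mapsto \int f_{\omega_2} \, d\mu_1$ and $\omega_1 \mapsto \int f_{\omega_1} \, d\mu_2$

are $\mathcal{A}_2$-measurable and $\mathcal{A}_1$-measurable, respectively. Moreover,

(23.6) $\int f \, d(\mu_1 \otimes \mu_2) = \int \Big( \int f_{\omega_2} \, d\mu_1 \Big) \mu_2(d\omega_2) = \int \Big( \int f_{\omega_1} \, d\mu_2 \Big) \mu_1(d\omega_1)$.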

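Proof. Set $\Omega := \Omega_1 \times \Omega_2$, $\mathcal{A} := \mathcal{A}_1 \otimes \mathcal{A}_2$ and $\pi := \mu_1 \otimes \mu_2$. Consider first an $\mathcal{A}$-elementary function $f$:
$f := \sum_{j=1}^{n} \alpha_j 1_{Q_j}$ ($\alpha_j \ge 0$, $Q_j \in \mathcal{A}$, $n \in \mathbb{N}$).
Then a glance at (23.5) reveals that for each $\omega_2 \in \Omega_2$, $f_{\omega_2} = \sum_{j=1}^{n} \alpha_j 1_{(Q_j)_{\omega_2}}$ and so

$\int f_{\omega_2} \, d\mu_1 = \sum_{j=1}^{n} \alpha_j \, \mu_1((Q_j)_{\omega_2})$,

an $\mathcal{A}_2$-measurable function on $\Omega_2$ thanks to 23.2. Its integration is therefore accomplished by (23.3) thus:

$\int \Big( \int f_{\omega_2} \, d\mu_1 \Big) \mu_2(d\omega_2) = \sum_{j=1}^{n} \alpha_j \, \pi(Q_j) = \int f \, d\pi$,

which confirms the first equation in (23.6), for elementary $f$.
For an arbitrary $\mathcal{A}$-measurable numerical function $f \ge 0$ let $(u^{(n)})$ be a sequence of $\mathcal{A}$-elementary functions such that $u^{(n)} \uparrow f$. Then, as was noted in the first part of the proof, $(u^{(n)}_{\omega_2})$ is a sequence of $\mathcal{A}_1$-elementary functions, which obviously satisfy $u^{(n)}_{\omega_2} \uparrow f_{\omega_2}$ (for each $\omega_2 \in \Omega_2$). Consequently, the functions

$\varphi^{(n)}(\omega_2) := \int u^{(n)}_{\omega_2} \, d\mu_1$, $\omega_2 \in \Omega_2$,

which are $\mathcal{A}_2$-measurable by what has already been proven, increase to the function

$\omega_2 \mapsto \int f_{\omega_2} \, d\mu_1$,

by 11.3. This function is therefore also $\mathcal{A}_2$-measurable and the monotone convergence theorem 11.4 says that

$\int \Big( \int f_{\omega_2} \, d\mu_1 \Big) \mu_2(d\omega_2) = \sup_{n \in \mathbb{N}} \int \varphi^{(n)} \, d\mu_2$.

Again, by what has already been proved,

$\int \varphi^{(n)} \, d\mu_2 = \int u^{(n)} \, d\pi$ for each $n \in \mathbb{N}$.

By the choice of the sequence $(u^{(n)})$ and definition 11.3

$\int f \, d\pi = \sup_{n \in \mathbb{N}} \int u^{(n)} \, d\pi$.

Combining the last three equations gives the desired

$\int \Big( \int f_{\omega_2} \, d\mu_1 \Big) \mu_2(d\omega_2) = \int f \, d\pi$,

and wholly analogous arguments establish the claims about the functions $f_{\omega_1}$. □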
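
In the finite case the three integrals in (23.6) are finite weighted sums, so the statement of Tonelli's theorem can be checked exactly. The following sketch rests on assumed data: the weights `mu1`, `mu2` and the non-negative function `f` are hypothetical choices for illustration.

```python
from fractions import Fraction as F

# Hypothetical point weights for two finite measure spaces.
mu1 = {0: F(1, 2), 1: F(3, 2), 2: F(1)}
mu2 = {"a": F(2), "b": F(1, 3)}

def f(w1, w2):
    """An arbitrary non-negative function on Omega1 x Omega2."""
    return F(w1 * w1 + 1) if w2 == "a" else F(w1, 4)

# Inner integrals: w2 -> int f_{w2} d mu1  and  w1 -> int f_{w1} d mu2.
inner_over_1 = {w2: sum(f(w1, w2) * mu1[w1] for w1 in mu1) for w2 in mu2}
inner_over_2 = {w1: sum(f(w1, w2) * mu2[w2] for w2 in mu2) for w1 in mu1}

# The two iterated integrals on the right of (23.6).
iterated_12 = sum(inner_over_1[w2] * mu2[w2] for w2 in mu2)
iterated_21 = sum(inner_over_2[w1] * mu1[w1] for w1 in mu1)

# Integral against the product measure, whose point weights are mu1 * mu2.
direct = sum(f(w1, w2) * mu1[w1] * mu2[w2] for w1 in mu1 for w2 in mu2)
```

All three values coincide; exact rational arithmetic is used so that the comparison is not blurred by rounding.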

Having disposed of non-negative functions, the next step in integration theory is to pass over to integrable functions. For them we get

23.7 Corollary (Theorem of Fubini). For $j = 1, 2$ let $(\Omega_j, \mathcal{A}_j, \mu_j)$ be $\sigma$-finite measure spaces, $f$ a $\mu_1 \otimes \mu_2$-integrable numerical function on $\Omega_1 \times \Omega_2$. Then for $\mu_1$-almost every $\omega_1$ the function $f_{\omega_1}$ is $\mu_2$-integrable and for $\mu_2$-almost every $\omega_2$ the function $f_{\omega_2}$ is $\mu_1$-integrable. The functions

$\omega_1 \mapsto \int f_{\omega_1} \, d\mu_2$ and $\omega_2 \mapsto \int f_{\omega_2} \, d\mu_1$

thus defined $\mu_1$-almost everywhere on $\Omega_1$ and $\mu_2$-almost everywhere on $\Omega_2$, respectively, are $\mu_1$-integrable and $\mu_2$-integrable, respectively, and equations (23.6) are valid.

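Proof. Evidently for all $\omega_j \in \Omega_j$ ($j = 1, 2$),

$|f|_{\omega_j} = |f_{\omega_j}|$, $(f^+)_{\omega_j} = (f_{\omega_j})^+$ and $(f^-)_{\omega_j} = (f_{\omega_j})^-$,

so we will employ parenthesis-free notation. According to (23.6) the product measure $\pi := \mu_1 \otimes \mu_2$ satisfies

$\int \Big( \int |f_{\omega_1}| \, d\mu_2 \Big) \mu_1(d\omega_1) = \int \Big( \int |f_{\omega_2}| \, d\mu_1 \Big) \mu_2(d\omega_2) = \int |f| \, d\pi < +\infty$.

In particular, the $\mathcal{A}_1$-measurable numerical function $\omega_1 \mapsto \int |f_{\omega_1}| \, d\mu_2$ is $\mu_1$-integrable and so by 13.6 it is $\mu_1$-almost everywhere finite. That is (by 12.1), for $\mu_1$-almost every $\omega_1$ the section $f_{\omega_1}$ is $\mu_2$-integrable. Consequently,

$\omega_1 \mapsto \int f_{\omega_1} \, d\mu_2 = \int f^+_{\omega_1} \, d\mu_2 - \int f^-_{\omega_1} \, d\mu_2$

is a $\mu_1$-almost everywhere defined function, which is $\mathcal{A}_1$-measurable because that is assured of each integral on the right by Theorem 23.6. In turn each of these integrals is $\mu_1$-integrable by 23.6. So our $\mu_1$-almost everywhere defined function $\omega_1 \mapsto \int f_{\omega_1} \, d\mu_2$ is $\mu_1$-integrable and

$\int \Big( \int f_{\omega_1} \, d\mu_2 \Big) \mu_1(d\omega_1) = \int \Big( \int f^+_{\omega_1} \, d\mu_2 \Big) \mu_1(d\omega_1) - \int \Big( \int f^-_{\omega_1} \, d\mu_2 \Big) \mu_1(d\omega_1) = \int f^+ \, d\pi - \int f^- \, d\pi = \int f \, d\pi$.

Of course, the roles of $\omega_1$ and $\omega_2$ can be interchanged in this argument and we thereby secure the rest of what is being claimed. □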
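
The integrability hypothesis in 23.7 cannot simply be dropped. A standard illustration, which is not taken from this text, uses counting measure on $\{0, 1, 2, \ldots\}$ in both factors (a $\sigma$-finite measure) and the function `a` below, which is not integrable because the sum of its absolute values is infinite. The sketch assumes nothing beyond the Python standard library; the name `a` and the cut-off `N` are illustrative choices.

```python
def a(m, n):
    """+1 on the diagonal, -1 directly below it, 0 elsewhere."""
    if m == n:
        return 1
    if m == n + 1:
        return -1
    return 0

N = 50  # number of outer indices inspected

# For fixed m only n in {m-1, m} contributes, and for fixed n only
# m in {n, n+1}. Summing the inner index up to N + 1 therefore gives
# the exact (infinite) inner sums for every outer index below N.
row_sums = [sum(a(m, n) for n in range(N + 2)) for m in range(N)]
col_sums = [sum(a(m, n) for m in range(N + 2)) for n in range(N)]

sum_rows_first = sum(row_sums)  # sum over m of (sum over n)
sum_cols_first = sum(col_sums)  # sum over n of (sum over m)
```

Only the row with index 0 has a non-zero sum, while every column sums to zero, so the two iterated sums differ. This does not contradict Fubini's theorem, since `a` fails the integrability assumption.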

The theorems of Tonelli and Fubini insure, in particular, that under the stated
hypotheses the order of repeated integrations is immaterial. We can emphasize this
by writing the equation (23.6) in the form

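(23.6') $\int f \, d(\mu_1 \otimes \mu_2) = \iint f(\omega_1, \omega_2) \, \mu_1(d\omega_1) \, \mu_2(d\omega_2) = \iint f(\omega_1, \omega_2) \, \mu_2(d\omega_2) \, \mu_1(d\omega_1)$.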
That exceptional sets of measure zero cannot generally be ignored in the con-
clusions of Fubini's theorem is illustrated by the following example.

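Example. 1. Consider L-B measure $\lambda^2 = \lambda^1 \otimes \lambda^1$ on $\mathbb{R}^2$, the set $A := \mathbb{Q} \times \mathbb{R} \in \mathcal{B}^2$, and its indicator function $f := 1_A$. According to 23.3 or 23.6 we have $\lambda^2(A) = \int f \, d\lambda^2 = 0$, so $f$ is $\lambda^2$-integrable. Nevertheless, for every $\omega_1 \in \mathbb{Q}$, the section $f_{\omega_1} = 1_{\mathbb{R}}$ is not $\lambda^1$-integrable.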

Remark. 1. For certain measures $\mu_1$, $\mu_2$ which are not $\sigma$-finite the existence but usually not the uniqueness of a product measure can be proved by other methods. See, e.g., BERBERIAN [1962]. Even if just one of $\mu_1$ or $\mu_2$ fails to be $\sigma$-finite, the second equality in (23.3) can fail. Cf. Exercise 1, p. 145 of HALMOS [1974], as well as chapter IV, §16 of HAHN and ROSENTHAL [1948]. Moreover, there exist $f : \Omega_1 \times \Omega_2 \to \mathbb{R}_+$ which are not $\mathcal{A}_1 \otimes \mathcal{A}_2$-measurable yet the "iterated integrals" on the right side of (23.6) make sense (and are finite). For an abundance of illuminating but elementary counterexamples related to this famous theorem, see CHATTERJI [1985-86] and MATTNER [1999].

A useful and at the same time surprising consequence of Tonelli's theorem is that it permits $\mu$-integrals to be expressed by means of $\lambda^1$-integrals.

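23.8 Theorem. Let $(\Omega, \mathcal{A}, \mu)$ be a $\sigma$-finite measure space and $f : \Omega \to \mathbb{R}_+$ a measurable, non-negative, real function. Further, let $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous isotone function which is continuously differentiable at least on $\mathbb{R}_+^* := \,]0, +\infty[$ and satisfies $\varphi(0) = 0$. Then

(23.7) $\int \varphi \circ f \, d\mu = \int_{\mathbb{R}_+^*} \varphi'(t) \, \mu(\{f \ge t\}) \, \lambda^1(dt) = \int_0^{+\infty} \varphi'(t) \, \mu(\{f \ge t\}) \, dt$.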

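Proof. Consider the L-B measure $\lambda^* := \lambda^1_{\mathbb{R}_+^*}$ on the $\sigma$-algebra $\mathcal{B}^* := \mathbb{R}_+^* \cap \mathcal{B}^1$. The function $F : \Omega \times \mathbb{R}_+^* \to \mathbb{R}^2$ defined by
$F(\omega, t) := (f(\omega), t)$

is, according to Remark 2 in §22, $\mathcal{A} \otimes \mathcal{B}^*$-measurable, because each of its component functions is. Therefore the $F$-preimage of the closed half-plane $\{(x, y) \in \mathbb{R}^2 : x \ge y\}$, namely
$E := \{(\omega, t) \in \Omega \times \mathbb{R}_+^* : f(\omega) \ge t\}$,
lies in $\mathcal{A} \otimes \mathcal{B}^*$. Theorem 23.6 for the product measure $\mu \otimes \lambda^*$ consequently supplies the equalities

(23.8) $\iint \varphi'(t) \, 1_E(\omega, t) \, \lambda^*(dt) \, \mu(d\omega) = \iint \varphi'(t) \, 1_E(\omega, t) \, \mu(d\omega) \, \lambda^*(dt) = \int \varphi'(t) \, \mu(E_t) \, \lambda^*(dt) = \int \varphi'(t) \, \mu(\{f \ge t\}) \, \lambda^*(dt)$,

since the $t$-section of $E$ is just the set of all $\omega \in \Omega$ which satisfy $f(\omega) \ge t$. As $\varphi$ is isotone, $\varphi'(t) \ge 0$ for all $t > 0$. The continuous function $\varphi'$ is integrable over $[1/n, a]$ whenever $1/n < a < +\infty$, and since $[1/n, a] \uparrow \,]0, a]$, and

$\int_{]0,a]} \varphi'(t) \, \lambda^*(dt) = \lim_{n \to \infty} \int_{1/n}^{a} \varphi'(t) \, dt = \varphi(a) - \lim_{n \to \infty} \varphi(1/n) = \varphi(a)$

($\varphi(0) = 0$ and $\varphi$ is continuous on $\mathbb{R}_+$), we see that $\varphi'$ is also integrable over $]0, a]$ for every $a > 0$. It follows from $f \ge 0$ and the preceding calculation that

$\int_{]0, f(\omega)]} \varphi'(t) \, \lambda^*(dt) = \varphi(f(\omega))$ for every $\omega \in \Omega$,

both expressions being 0 whenever $f(\omega) = 0$. We thus get

$\int \varphi \circ f \, d\mu = \int \Big( \int_{]0, f(\omega)]} \varphi'(t) \, \lambda^*(dt) \Big) \mu(d\omega) = \iint \varphi'(t) \, 1_{]0, f(\omega)]}(t) \, \lambda^*(dt) \, \mu(d\omega) = \iint \varphi'(t) \, 1_E(\omega, t) \, \lambda^*(dt) \, \mu(d\omega)$,

which combined with (23.8) concludes the proof. □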

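Example. 2. The relevant hypotheses are certainly fulfilled by the functions $\varphi(t) := t^p$ with $p > 0$. Thus for every $\mathcal{A}$-measurable real function $f \ge 0$ on $\Omega$

(23.9) $\int f^p \, d\mu = p \int_0^{+\infty} t^{p-1} \, \mu(\{f \ge t\}) \, dt$.

When $p = 1$ we get the especially important formula

(23.10) $\int f \, d\mu = \int_{\mathbb{R}_+^*} \mu(\{f \ge t\}) \, \lambda^1(dt) = \int_0^{+\infty} \mu(\{f \ge t\}) \, dt$.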

The reader should not overlook the geometric significance of this, which is that
the integral f f dµ is formed "vertically", while the integral on the right-hand side
of (23.10) is formed "horizontally".
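
For a function with finitely many values on a finite measure space, both sides of the formula for $\varphi(t) = t^p$ can be evaluated exactly, which makes the "vertical" versus "horizontal" picture concrete. The sketch below is an illustration under assumptions: the weights `mu`, the function `f` and the helper names are hypothetical, and $p$ is restricted to positive integers so that rational arithmetic stays exact.

```python
from fractions import Fraction as F

# Hypothetical finite measure space: point weights mu and a function f >= 0.
mu = {"u": F(1, 2), "v": F(2), "w": F(1, 3), "x": F(5)}
f = {"u": F(0), "v": F(3, 2), "w": F(4), "x": F(3, 2)}

def upper_level_measure(t):
    """mu({f >= t})."""
    return sum((mu[w] for w in mu if f[w] >= t), F(0))

def vertical(p):
    """Integral of f^p with respect to mu, summed point by point."""
    return sum((f[w] ** p * mu[w] for w in mu), F(0))

def horizontal(p):
    """p * integral of t^(p-1) * mu({f >= t}) dt, integrated exactly.

    t -> mu({f >= t}) is constant on each interval ]v_prev, v] between
    consecutive positive values of f, and the integral of p * t^(p-1)
    over that interval equals v^p - v_prev^p.
    """
    total, v_prev = F(0), F(0)
    for v in sorted(set(f.values()) - {F(0)}):
        total += (v ** p - v_prev ** p) * upper_level_measure(v)
        v_prev = v
    return total
```

For each tested exponent `vertical(p)` and `horizontal(p)` agree; with `p = 1` this is the layer-by-layer formula for the integral of `f`.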

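Now at last we turn back to the general case of §22 and consider finitely many $\sigma$-finite measure spaces $(\Omega_j, \mathcal{A}_j, \mu_j)$, $j = 1, \ldots, n$ and $n \ge 2$.
The two product sets $(\Omega_1 \times \ldots \times \Omega_{n-1}) \times \Omega_n$ and $\Omega_1 \times \ldots \times \Omega_{n-1} \times \Omega_n$ will be identified via the bijection
$((\omega_1, \ldots, \omega_{n-1}), \omega_n) \mapsto (\omega_1, \ldots, \omega_{n-1}, \omega_n)$.
The agreed-upon equality of these sets leads at once to the equality of the corresponding products of $\sigma$-algebras:
(23.11) $(\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1}) \otimes \mathcal{A}_n = \mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1} \otimes \mathcal{A}_n$.

In fact, by 22.1 the sets $A_1 \times \ldots \times A_{n-1}$ with each $A_j \in \mathcal{A}_j$ generate $\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1}$, and by the same theorem the sets
$(A_1 \times \ldots \times A_{n-1}) \times A_n = A_1 \times \ldots \times A_{n-1} \times A_n$
then generate $(\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1}) \otimes \mathcal{A}_n$ as well as $\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1} \otimes \mathcal{A}_n$.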

In a completely analogous fashion one confirms a general associativity in the formation of products of $\sigma$-algebras:
(23.12) $\Big( \bigotimes_{j=1}^{m} \mathcal{A}_j \Big) \otimes \Big( \bigotimes_{j=m+1}^{n} \mathcal{A}_j \Big) = \bigotimes_{j=1}^{n} \mathcal{A}_j$ ($1 \le m < n \in \mathbb{N}$).
The convention (23.11) opens up the possibility of proving the existence of product measures on any finite number $n \ge 2$ of factors via induction on $n$.

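23.9 Theorem. $\sigma$-finite measures $\mu_1, \ldots, \mu_n$ on $\sigma$-algebras $\mathcal{A}_1, \ldots, \mathcal{A}_n$ uniquely determine a measure $\pi$ on $\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_n$ such that
(23.13) $\pi(A_1 \times \ldots \times A_n) = \mu_1(A_1) \cdot \ldots \cdot \mu_n(A_n)$ for all $A_j \in \mathcal{A}_j$, $1 \le j \le n$.
This measure $\pi$ is $\sigma$-finite.

Corresponding to Definition 23.4, $\pi$ is called the product of the measures $\mu_1, \ldots, \mu_n$ and is denoted by
$\bigotimes_{j=1}^{n} \mu_j$ or $\mu_1 \otimes \ldots \otimes \mu_n$.
The question posed in §22 is finally answered in full by this theorem.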

Proof. In 22.2 take for the various generators $\mathcal{E}_j$ the $\sigma$-algebra $\mathcal{A}_j$ itself, and learn that there is at most one measure $\pi$ which satisfies (23.13). The existence question has already been settled for $n = 2$, in 23.3. We make the inductive assumption that $\pi' := \mu_1 \otimes \ldots \otimes \mu_{n-1}$ exists for some $n > 2$ and show how that leads to the existence of $\mu_1 \otimes \ldots \otimes \mu_n$. Evidently the $\sigma$-finiteness of $\mu_1, \ldots, \mu_{n-1}$ entails that of $\pi'$, as in the proof of Theorem 23.3. That theorem therefore supplies us with a measure $\pi := \pi' \otimes \mu_n$ on $(\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1}) \otimes \mathcal{A}_n$ which satisfies
$\pi(Q' \times A_n) = \pi'(Q') \, \mu_n(A_n)$
for all $Q' \in \mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_{n-1}$ and all $A_n \in \mathcal{A}_n$. Because of (23.11) this measure does what is wanted at level $n$, completing the induction. Again, $\sigma$-finiteness of $\pi$ is confirmed exactly as in the proof of 23.3. □

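This inductive construction of the $n$-fold product measure builds in the equality
(23.14) $(\mu_1 \otimes \ldots \otimes \mu_{n-1}) \otimes \mu_n = \mu_1 \otimes \ldots \otimes \mu_{n-1} \otimes \mu_n$.
By now familiar considerations show that in fact a general associativity prevails in the formation of product measures:
(23.15) $\Big( \bigotimes_{j=1}^{m} \mu_j \Big) \otimes \Big( \bigotimes_{j=m+1}^{n} \mu_j \Big) = \bigotimes_{j=1}^{n} \mu_j$ ($1 \le m < n \in \mathbb{N}$).

In particular
$\lambda^d = \lambda^1 \otimes \ldots \otimes \lambda^1$, with $d$ factors.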

In view of (23.15) induction can also be used to extend the theorems of Tonelli and Fubini to multiple factors. We will formulate only the analog of 23.6:
Let $f \ge 0$ be an $\mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_n$-measurable numerical function on $\Omega_1 \times \ldots \times \Omega_n$. Then for every permutation $j_1, \ldots, j_n$ of $1, \ldots, n$

(23.16) $\int f \, d(\mu_1 \otimes \ldots \otimes \mu_n) = \int \Big( \ldots \Big( \int \Big( \int f(\omega_1, \ldots, \omega_n) \, \mu_{j_1}(d\omega_{j_1}) \Big) \mu_{j_2}(d\omega_{j_2}) \Big) \ldots \Big) \mu_{j_n}(d\omega_{j_n})$.
Every integral that occurs on the right-hand side is measurable with respect to the product of the appropriate $\mathcal{A}_j$, namely those corresponding to the coordinates in which integration has not yet occurred. This right-hand side is often written in the shorter fashion
$\int \ldots \int f(\omega_1, \ldots, \omega_n) \, \mu_{j_1}(d\omega_{j_1}) \ldots \mu_{j_n}(d\omega_{j_n})$.
The simple proof of this theorem (involving induction), as well as the formulation and proof of the analog of 23.7, will be left to the reader.
One more piece of notation is convenient:

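23.10 Definition. For finitely many $\sigma$-finite measure spaces $(\Omega_j, \mathcal{A}_j, \mu_j)$, $1 \le j \le n$, the triple $\Big( \prod_{j=1}^{n} \Omega_j, \bigotimes_{j=1}^{n} \mathcal{A}_j, \bigotimes_{j=1}^{n} \mu_j \Big)$ is called the product of these measure spaces and is denoted by
$\bigotimes_{j=1}^{n} (\Omega_j, \mathcal{A}_j, \mu_j)$.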

Remark. 2. Throughout the preceding the index set was finite. But there is also a theory of products of (finite) measures indexed by arbitrary sets, which is particularly important in probability theory; it is treated in detail by BAUER [1996], and somewhat more extensively in HEWITT and STROMBERG [1965]. For p-measures SAEKI [1996] gives a short, elementary proof that uses only 5.1.

In closing we will consider the case where each measure $\mu_j$ comes with a real density $f_j \ge 0$. According to Theorem 17.11, $\nu_j := f_j \mu_j$ is then a $\sigma$-finite measure too.

23.11 Theorem. Let $(\Omega_j, \mathcal{A}_j, \mu_j)$ be $\sigma$-finite measure spaces and $f_j \ge 0$ real-valued, $\mathcal{A}_j$-measurable functions on $\Omega_j$. Set
$\nu_j := f_j \mu_j$, $j = 1, \ldots, n$.

Then the product of these measures is defined and satisfies

(23.17) $\bigotimes_{j=1}^{n} \nu_j = F \cdot \Big( \bigotimes_{j=1}^{n} \mu_j \Big)$

with the density function

(23.18) $F(\omega_1, \ldots, \omega_n) := \prod_{j=1}^{n} f_j(\omega_j)$.

The function $F$ is the so-called tensor product of the densities $f_1, \ldots, f_n$.

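Proof. As already noted, 17.11 insures that each measure $\nu_j$ is $\sigma$-finite, guaranteeing that their product is defined. It suffices to treat the case $n = 2$ and refer the general case to induction. For sets $A_1 \in \mathcal{A}_1$ and $A_2 \in \mathcal{A}_2$

$\nu_1(A_1) \, \nu_2(A_2) = \Big( \int_{A_1} f_1 \, d\mu_1 \Big) \Big( \int_{A_2} f_2 \, d\mu_2 \Big)$

$= \iint 1_{A_1}(\omega_1) f_1(\omega_1) \, 1_{A_2}(\omega_2) f_2(\omega_2) \, \mu_1(d\omega_1) \, \mu_2(d\omega_2)$

$= \iint 1_{A_1 \times A_2}(\omega_1, \omega_2) \, F(\omega_1, \omega_2) \, \mu_1(d\omega_1) \, \mu_2(d\omega_2)$.
From 23.6 therefore

$\nu_1(A_1) \, \nu_2(A_2) = \int_{A_1 \times A_2} F \, d(\mu_1 \otimes \mu_2)$ for all $A_1 \in \mathcal{A}_1$, $A_2 \in \mathcal{A}_2$.

But then according to 23.3, $\nu_1 \otimes \nu_2$ coincides with the measure $F(\mu_1 \otimes \mu_2)$. □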
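
In the finite case the statement of 23.11 says that the point weights of $\nu_1 \otimes \nu_2$ are the point weights of $\mu_1 \otimes \mu_2$ multiplied by the tensor product density. A small sketch, with all weights, densities and names being assumptions made up for illustration:

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical point weights and densities on two finite spaces.
mu1 = {0: F(1, 2), 1: F(3, 2)}
mu2 = {"a": F(2), "b": F(1, 3), "c": F(1)}
f1 = {0: F(4), 1: F(0)}
f2 = {"a": F(1, 2), "b": F(3), "c": F(7)}

# nu_j := f_j mu_j, again given by point weights.
nu1 = {w: f1[w] * mu1[w] for w in mu1}
nu2 = {w: f2[w] * mu2[w] for w in mu2}

def tensor_density(w1, w2):
    """F(w1, w2) = f1(w1) * f2(w2), the tensor product of the densities."""
    return f1[w1] * f2[w2]

def product_measure(m1, m2, Q):
    """(m1 (x) m2)(Q) for a finite set Q of pairs."""
    return sum((m1[a] * m2[b] for (a, b) in Q), F(0))

def density_times_product(Q):
    """Integral of F over Q with respect to mu1 (x) mu2."""
    return sum((tensor_density(a, b) * mu1[a] * mu2[b] for (a, b) in Q), F(0))

all_points = list(product(mu1, mu2))
```

For any set `Q` of pairs, `product_measure(nu1, nu2, Q)` equals `density_times_product(Q)`.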

Exercises.
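1. Consider $\Omega_1 = \Omega_2 := \mathbb{R}$, $\mathcal{A}_1 = \mathcal{A}_2 := \mathcal{B}^1$, $\mu_1 := \lambda^1$ and $\mu_2$ the non-$\sigma$-finite counting measure on $\mathcal{B}^1$ (cf. Example 3, §5). Show that equality (23.3) fails to hold for $Q := D$, the diagonal $\{(\omega, \omega) : \omega \in \mathbb{R}\}$ in $\Omega_1 \times \Omega_2$. Why does $D$ lie in $\mathcal{A}_1 \otimes \mathcal{A}_2 = \mathcal{B}^2$?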
2. Show that the function
$(x, y) \mapsto 2e^{-2xy} - e^{-xy}$
is not $\lambda^2$-integrable over the set $[1, +\infty[ \, \times \, [0, 1]$.
3. With the aid of Tonelli's theorem find a new proof of Theorem 8.1 along the following lines: If $\mu$ is a translation-invariant measure on $\mathcal{B}^d$, $\mu([0, 1[) = 1$, and $f \ge 0$, $g \ge 0$ are Borel measurable numerical functions on $\mathbb{R}^d$, compare the integrals

$\iint g(y) f(x + y) \, \mu(dx) \, \lambda^d(dy)$ and $\iint g(y - x) f(y) \, \mu(dx) \, \lambda^d(dy)$

and, finally, take $f$ to be any indicator function, $g$ the indicator function of $[0, 1[$.
4. Compute
$I := \int_0^{\infty} e^{-x^2} \, dx$,

and thereby evaluate anew the important integral $G = 2I$ in (16.1), in the following simple way: $\int_0^{\infty} e^{-t^2} \, dt = \int_0^{\infty} y e^{-y^2 x^2} \, dx$ for every $y > 0$ and therefore $I^2 = \int_0^{\infty} \Big( \int_0^{\infty} f(x, y) \, dx \Big) dy$ for the function $f$ on $\mathbb{R}_+ \times \mathbb{R}_+$ defined by $f(x, y) := y e^{-y^2 (1 + x^2)}$.
Applying Tonelli's theorem leads to $I = \frac{1}{2} \sqrt{\pi}$.
5. Let $|x| := (x_1^2 + \ldots + x_d^2)^{1/2}$ denote the usual euclidean norm of the vector $x := (x_1, \ldots, x_d) \in \mathbb{R}^d$. Show that the function $x \mapsto e^{-|x|^\alpha}$ is $\lambda^d$-integrable for every $\alpha > 0$. (Recall Exercise 2 of §16.) In case $\alpha = 2$, show that the $\lambda^d$-integral of this function is $G^d$.
6. $\overline{K}_r(x_0)$ will denote the closed ball in $\mathbb{R}^d$ with center $x_0$ and radius $r > 0$. Set $\alpha_d := \lambda^d(\overline{K}_1(0))$ and prove that
$\lambda^d(\overline{K}_r(x_0)) = \alpha_d \, r^d$.

Show also that the numbers $\alpha_d$ can be calculated by

$\alpha_{2q} = \frac{1}{q!} \pi^q$, and $\alpha_{2q-1} = \frac{2^q}{1 \cdot 3 \cdot \ldots \cdot (2q - 1)} \pi^{q-1}$ ($q \in \mathbb{N}$).

[Hint: Use (7.10) and note that every $x_d$-section of $\overline{K}_r(0)$ is either empty or is a $(d-1)$-dimensional closed ball. Tonelli's theorem then leads to a recursion formula for the $\alpha_d$. Here, of course, $\pi$ has its customary geometric meaning.]
How do these relations change if we replace $\overline{K}_r(x_0)$ by the open ball $K_r(x_0)$ in $\mathbb{R}^d$ of radius $r$ and center $x_0$? [Cf. Exercise 3 in §7.]
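7. For every compact interval $[\alpha, \beta] \subset \mathbb{R}_+$ designate by $R(\alpha, \beta)$ the spherical shell
$\overline{K}_\beta(0) \setminus K_\alpha(0) = \{x \in \mathbb{R}^d : \alpha \le |x| \le \beta\}$.
Show that for every continuous real function $h$ on such an interval $[\alpha, \beta] \subset \mathbb{R}_+$

$\int_{R(\alpha, \beta)} h(|x|) \, \lambda^d(dx) = d \, \alpha_d \int_\alpha^\beta h(t) \, t^{d-1} \, dt$,

$\alpha_d$ being the number $\lambda^d(\overline{K}_1(0))$ from the preceding exercise. [Hint: The function $H$ defined on $[\alpha, \beta]$ by
$H(t) := \int_{R(\alpha, t)} h(|x|) \, \lambda^d(dx)$, $t \in [\alpha, \beta]$,

is differentiable with $H'(t) = d \, \alpha_d \, h(t) \, t^{d-1}$ for all such $t$.]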


8. Apply the result of Exercise 7 to the case $d = 2$ and $h(t) := e^{-t^2}$ in order to show, using Exercise 5, once again that $G = \sqrt{\pi}$.
9. Let $(\Omega, \mathcal{A}, \mu)$ be a $\sigma$-finite measure space, $f : \Omega \to \mathbb{R}_+$ measurable. Show that the set of all $t > 0$ such that $\mu(\{f = t\}) \ne 0$, as well as the set of all $t > 0$ such that $\mu(\{f > t\}) \ne \mu(\{f \ge t\})$ is countable. Therefore in the equalities (23.8), (23.9) and (23.10), $\mu(\{f \ge t\})$ can always be replaced by $\mu(\{f > t\})$.

§24. Convolution of finite Borel measures

Consider the $d$-dimensional Borel measurable space $(\mathbb{R}^d, \mathcal{B}^d)$. Every finite measure $\mu$ on $\mathcal{B}^d$ will be called a finite or also a bounded Borel measure, and the set of all of them will be designated by $\mathcal{M}_+^b(\mathbb{R}^d)$. For every such $\mu$ the number
(24.1) $\|\mu\| := \mu(\mathbb{R}^d)$
is called the total mass of $\mu$.
Making critical use of the group structure of $(\mathbb{R}^d, +)$ a so-called convolution product can be assigned to any finitely many measures $\mu_1, \ldots, \mu_n \in \mathcal{M}_+^b(\mathbb{R}^d)$; in contrast to the previously studied product measure, it is again a measure on the original $\sigma$-algebra $\mathcal{B}^d$, even an element of $\mathcal{M}_+^b(\mathbb{R}^d)$. What we do below can be carried out in every (abelian) locally compact group. We cannot, however, go into this generalization, but must instead refer interested readers to the excellent monographs of HEWITT and ROSS [1979] and RUDIN [1962]. Initially we consider the product measure $\mu_1 \otimes \ldots \otimes \mu_n$ defined in §23. Since $\mathcal{B}^{nd} = \mathcal{B}^d \otimes \ldots \otimes \mathcal{B}^d$, this measure is an element of $\mathcal{M}_+^b(\mathbb{R}^{nd})$. The mapping $A_n : \mathbb{R}^{nd} \to \mathbb{R}^d$ defined by
$A_n(x_1, \ldots, x_n) := x_1 + \ldots + x_n$
is continuous, and so $\mathcal{B}^{nd}$-$\mathcal{B}^d$-measurable. The following definition accordingly makes sense:

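24.1 Definition. The image under the mapping $A_n$ of the product measure $\mu_1 \otimes \ldots \otimes \mu_n$ is called the convolution product of the measures $\mu_1, \ldots, \mu_n \in \mathcal{M}_+^b(\mathbb{R}^d)$, in symbols
(24.2) $\mu_1 * \ldots * \mu_n := A_n(\mu_1 \otimes \ldots \otimes \mu_n)$.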

The theorems on product and image measures combine to yield the most important properties of the convolution operation $*$. First of all, $\mu_1 * \ldots * \mu_n$ is again an element of $\mathcal{M}_+^b(\mathbb{R}^d)$ and

$\mu_1 * \ldots * \mu_n(\mathbb{R}^d) = \mu_1 \otimes \ldots \otimes \mu_n(\mathbb{R}^{nd}) = \|\mu_1\| \cdot \ldots \cdot \|\mu_n\|$

so that in fact
(24.3) $\|\mu_1 * \ldots * \mu_n\| = \|\mu_1\| \cdot \ldots \cdot \|\mu_n\|$.
In studying the convolution product it suffices to deal with $n = 2$, because
(24.4) $\mu_1 * \ldots * \mu_n * \mu_{n+1} = (\mu_1 * \ldots * \mu_n) * \mu_{n+1}$
for every $n + 1$ measures from $\mathcal{M}_+^b(\mathbb{R}^d)$. To see this, introduce the continuous mapping $B_{n+1} : \mathbb{R}^{(n+1)d} \to \mathbb{R}^{2d}$ by

$B_{n+1}(x_1, \ldots, x_n, x_{n+1}) := (x_1 + \ldots + x_n, x_{n+1})$

and have $A_{n+1} = A_2 \circ B_{n+1}$. Checking that

$B_{n+1}(\mu_1 \otimes \ldots \otimes \mu_n \otimes \mu_{n+1}) = A_n(\mu_1 \otimes \ldots \otimes \mu_n) \otimes \mu_{n+1}$,
and remembering that the formation of image measures is transitive, we get
$\mu_1 * \ldots * \mu_n * \mu_{n+1} = A_2(B_{n+1}(\mu_1 \otimes \ldots \otimes \mu_n \otimes \mu_{n+1})) = A_2((\mu_1 * \ldots * \mu_n) \otimes \mu_{n+1})$,
which confirms (24.4). Henceforth therefore $n = 2$.
For any measures $\mu, \nu \in \mathcal{M}_+^b(\mathbb{R}^d)$ and any $\mathcal{B}^d$-measurable numerical function $f \ge 0$ it follows from 19.1 and 23.6 that

(24.5) $\int f \, d(\mu * \nu) = \int f \circ A_2 \, d(\mu \otimes \nu) = \iint f(x + y) \, \mu(dx) \, \nu(dy) = \iint f(x + y) \, \nu(dy) \, \mu(dx)$.

As this holds for $f := 1_B$, the indicator function of any set $B \in \mathcal{B}^d$, we have

(24.6) $\mu * \nu(B) = \int \mu(B - y) \, \nu(dy) = \int \nu(B - x) \, \mu(dx)$.

(Recall (7.8) that $B - x = -x + B$.) Consequently $*$ is a commutative, and by (24.4) also an associative operation in $\mathcal{M}_+^b(\mathbb{R}^d)$.
Due to 19.2 and 23.7, (24.5) are valid as well for every $\mu * \nu$-integrable numerical function $f$ on $\mathbb{R}^d$. Equality (24.6) is frequently taken as the definition of $\mu * \nu$.
Evidently $\mathcal{M}_+^b(\mathbb{R}^d)$ is closed with respect to addition and under multiplication by numbers in $\mathbb{R}_+$. From (24.6) we immediately see the relation of convolution to these two operations: For all $\mu, \nu, \nu_1, \nu_2 \in \mathcal{M}_+^b(\mathbb{R}^d)$, $\alpha \in \mathbb{R}_+$
(24.7) $\mu * (\nu_1 + \nu_2) = \mu * \nu_1 + \mu * \nu_2$,
(24.8) $\mu * (\alpha \nu) = (\alpha \mu) * \nu = \alpha (\mu * \nu)$.
The distributive law (24.7) even holds in the following generality: For every sequence $(\nu_n)$ of measures from $\mathcal{M}_+^b(\mathbb{R}^d)$ satisfying $\sum_{n=1}^{\infty} \|\nu_n\| < +\infty$, the sum $\sum_{n=1}^{\infty} \nu_n$ is also a measure in $\mathcal{M}_+^b(\mathbb{R}^d)$ (cf. Example 4 of §3). Taking account of 11.5, it therefore follows from (24.6) that
(24.9) $\mu * \Big( \sum_{n=1}^{\infty} \nu_n \Big) = \sum_{n=1}^{\infty} \mu * \nu_n$

for every $\mu \in \mathcal{M}_+^b(\mathbb{R}^d)$.
Let us now compute p * v in some special cases.



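1. We again denote by $T_a$ the translation mapping $x \mapsto x + a$ of $\mathbb{R}^d$ onto itself via $a \in \mathbb{R}^d$, and by $\varepsilon_a$ the (Dirac-)measure on $\mathcal{B}^d$ defined by unit mass at the point $a$. Of course, $\varepsilon_a \in \mathcal{M}_+^b(\mathbb{R}^d)$ and $\|\varepsilon_a\| = 1$. From (24.6) follows that $\varepsilon_a * \mu(B) = \mu(B - a) = \mu(T_a^{-1}(B))$ for all $B \in \mathcal{B}^d$, and so
(24.10) $\varepsilon_a * \mu = T_a(\mu)$ for all $\mu \in \mathcal{M}_+^b(\mathbb{R}^d)$, $a \in \mathbb{R}^d$.

Now $T_0$ is the identity mapping, so $\varepsilon_0$ is a - and obviously the only - unit with respect to convolution. If, namely, $\varepsilon$ were also a unit, meaning that $\mu = \varepsilon * \mu$ for every $\mu \in \mathcal{M}_+^b(\mathbb{R}^d)$, then it would follow that $\varepsilon_0 = \varepsilon * \varepsilon_0 = \varepsilon$.
For the special choice $\mu := \varepsilon_b$, (24.10) says that
(24.10') $\varepsilon_a * \varepsilon_b = \varepsilon_{a+b}$ for all $a, b \in \mathbb{R}^d$.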

2. Let $f \ge 0$ be a $\lambda^d$-integrable numerical function on $\mathbb{R}^d$ and $\mu := f \lambda^d$. Since $\|\mu\| = \int f \, d\lambda^d < +\infty$, $\mu$ also lies in $\mathcal{M}_+^b(\mathbb{R}^d)$. Let us compute $\mu * \nu$ for an arbitrary $\nu \in \mathcal{M}_+^b(\mathbb{R}^d)$. From 17.3 using the translation-invariance of $\lambda^d$ and the general transformation theorem 19.1, we get

$\mu * \nu(B) = \iint 1_B(x + y) f(x) \, \lambda^d(dx) \, \nu(dy)$

$= \iint 1_B(x + y) f(x) \, T_{-y}(\lambda^d)(dx) \, \nu(dy)$

$= \iint 1_B(x) f(x - y) \, \lambda^d(dx) \, \nu(dy)$

for every $B \in \mathcal{B}^d$. With the help of Tonelli's theorem it further follows that

$\mu * \nu(B) = \int 1_B(x) \, q(x) \, \lambda^d(dx) = \int_B q \, d\lambda^d$,

where $q$ is the non-negative $\mathcal{B}^d$-measurable function $x \mapsto \int f(x - y) \, \nu(dy)$. This function is also $\lambda^d$-integrable, since $\int q \, d\lambda^d = \|\mu * \nu\| < +\infty$. Thus whenever $\mu$ has a density with respect to $\lambda^d$, so does $\mu * \nu$. We set $f * \nu := q$, that is, we make the definition

(24.11) $f * \nu(x) := \int f(x - y) \, \nu(dy)$ for $x \in \mathbb{R}^d$.

The preceding result now assumes the more suggestive form

(24.12) $(f \lambda^d) * \nu = (f * \nu) \lambda^d$.
Naturally $f * \nu$ is called the convolution of $f$ and $\nu$.
3. Besides $\mu = f \lambda^d$, let now $\nu = g \lambda^d$ also have a $\lambda^d$-integrable density $g \ge 0$. According to 17.3 and the preceding

$f * (g \lambda^d)(x) = \int f(x - y) g(y) \, \lambda^d(dy)$ ($x \in \mathbb{R}^d$)

is a density for $\mu * \nu$ with respect to $\lambda^d$. We denote this function by $f * g$, that is, we set

(24.13) $f * g(x) := \int f(x - y) g(y) \, \lambda^d(dy)$ ($x \in \mathbb{R}^d$)

and get
(24.14) $(f \lambda^d) * (g \lambda^d) = (f * g) \lambda^d$.
Here too $f * g$ is called the convolution of $f$ and $g$. It is defined for every pair of non-negative $\lambda^d$-integrable functions and is itself such a function. Nevertheless, it might not be real-valued, even if $f$ and $g$ each are (cf. Remark 1 below). From (24.13) and the translation- and reflection-invariance of $\lambda^d$ it follows that for every $x \in \mathbb{R}^d$

$f * g(x) = \int f(x - y) g(y) \, \lambda^d(dy) = \int f(x + y) g(-y) \, \lambda^d(dy) = \int f(y) g(x - y) \, \lambda^d(dy) = g * f(x)$.

That is, the $*$ operation between functions is also commutative:
(24.15) $f * g = g * f$.
Similar calculations confirm its associativity; that is,
(24.16) $(f * g) * h = f * (g * h)$
for all $\lambda^d$-integrable, non-negative functions $f, g, h$.
The distributive law
(24.17) $f * (g + h) = f * g + f * h$
and the homogeneity property
(24.18) $f * (\alpha g) = (\alpha f) * g = \alpha (f * g)$ ($\alpha \in \mathbb{R}_+$)
for such functions hold as well and follow immediately from (24.13).

4. For arbitrary functions $f, g \in \mathcal{L}^1(\lambda^d)$ decomposition into their positive and negative parts and appeal to the results secured in 3. show that

$x \mapsto \int f(x - y) g(y) \, \lambda^d(dy)$,
while possibly defined only $\lambda^d$-almost everywhere (see Remark 1 below), is always $\lambda^d$-integrable. One can therefore define $f * g$ by

$f * g(x) := \int f(x - y) g(y) \, \lambda^d(dy)$

but generally only for $\lambda^d$-almost all $x \in \mathbb{R}^d$. Once again the expression convolution is used for this $f * g$.
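
The convolution of two densities can be approximated numerically in dimension $d = 1$. The sketch below uses a plain midpoint rule; the functions `indicator_unit`, `ramp`, `triangle` and the integration window are illustrative assumptions. It relies on the elementary fact that the indicator function of $[0, 1]$ convolved with itself is the "triangle" $x \mapsto \max(0, 1 - |x - 1|)$.

```python
def indicator_unit(x):
    """Indicator function of [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def ramp(x):
    """The density 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def convolve_at(f, g, x, lo=-1.0, hi=2.0, steps=30000):
    """Midpoint-rule approximation of (f*g)(x) = int f(x - y) g(y) dy.

    The window [lo, hi] must contain the support of g; this is an
    approximation, accurate here to roughly the step width.
    """
    h = (hi - lo) / steps
    return h * sum(f(x - (lo + (k + 0.5) * h)) * g(lo + (k + 0.5) * h)
                   for k in range(steps))

def triangle(x):
    """Closed form of indicator_unit * indicator_unit."""
    return max(0.0, 1.0 - abs(x - 1.0))
```

Up to discretisation error, `convolve_at(indicator_unit, indicator_unit, x)` reproduces `triangle(x)`, and swapping the two arguments `indicator_unit` and `ramp` leaves the value unchanged, as the commutativity of $*$ predicts.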

Remarks. 1. For real-valued, non-negative functions $f, g \in \mathcal{L}^1(\lambda^d)$ the function $f * g$ need not be finite everywhere. It suffices to consider any real-valued, non-negative, even function $f$ which lies in $\mathcal{L}^1(\lambda^d)$ but not in $\mathcal{L}^2(\lambda^d)$ and to take $g = f$. Then $f * g(0) = +\infty$. In case $d = 1$, such a function is
$f(x) := \begin{cases} 0 & \text{for } |x| \ge 1 \text{ or } x = 0, \\ |x|^{-1/2} & \text{for } 0 < |x| < 1. \end{cases}$
2. In passing to $L^1(\lambda^d)$ - cf. Remark 1 in §15 - the difficulties highlighted above with the definition of $f * g$ disappear. Indeed, let $f \mapsto \tilde{f}$ be the canonical mapping of $\mathcal{L}^1(\lambda^d)$ onto $L^1(\lambda^d)$. One defines $\tilde{f} * \tilde{g}$ for arbitrary $\tilde{f}, \tilde{g} \in L^1(\lambda^d)$ as the image $\tilde{h}$ of a function $h \in \mathcal{L}^1(\lambda^d)$ which coincides $\lambda^d$-almost everywhere with $f * g$. This definition is independent of the special choice of representing functions $f$, $g$ and $h$ from $\mathcal{L}^1(\lambda^d)$. The new operation $*$ renders the vector space $L^1(\lambda^d)$ an algebra over $\mathbb{R}$.
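
The claim in Remark 1 can be verified by a short computation for the one-dimensional function $f$ given there (equal to $|x|^{-1/2}$ for $0 < |x| < 1$ and to $0$ otherwise). The following worked lines are a sketch added for illustration; they use only that $f$ is even and the elementary integrals of $t^{-1/2}$ and $t^{-1}$ over $]0, 1]$.

```latex
\[
  \int f \, d\lambda^1 = 2 \int_0^1 t^{-1/2} \, dt = 4 < +\infty ,
  \qquad
  \int f^2 \, d\lambda^1 = 2 \int_0^1 t^{-1} \, dt = +\infty ,
\]
\[
  f * f(0) = \int f(-y) \, f(y) \, \lambda^1(dy)
           = \int f(y)^2 \, \lambda^1(dy) = +\infty .
\]
```

So $f$ lies in $\mathcal{L}^1(\lambda^1)$ but not in $\mathcal{L}^2(\lambda^1)$, and the convolution $f * f$ takes the value $+\infty$ at the origin.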

Exercises.
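1. Show that for any $\mu, \nu \in \mathcal{M}_+^b(\mathbb{R}^d)$ and any linear mapping $T : \mathbb{R}^d \to \mathbb{R}^d$, $T(\mu * \nu) = T(\mu) * T(\nu)$. To this end, first observe that $T \circ A_2 = A_2 \circ (T \otimes T)$, where $T \otimes T$ denotes the mapping $(x, y) \mapsto (T(x), T(y))$ of $\mathbb{R}^d \times \mathbb{R}^d$ into itself.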
2. Compute the $n$th convolution power of the function $f$ defined on $\mathbb{R}$ by $f(x) := e^{\ldots}$, that is, the convolution $f * \ldots * f$ with $n$ ($\in \mathbb{N}$) factors. Is it true that for every $n \in \mathbb{N}$, $f$ has an "$n$th convolution root"? That is, is $f$ the $n$th convolution power of some $\lambda^1$-integrable function $g \ge 0$?
3. If we set $N_1(f) := \int |f| \, d\lambda^d$ (this is (14.1) for $\mu := \lambda^d$), then
$N_1(f * g) \le N_1(f) \, N_1(g)$
holds for all $f, g \in \mathcal{L}^1(\lambda^d)$, and for non-negative functions equality prevails.
4. Write out the details of Remark 2 and show that
$\|\tilde{f} * \tilde{g}\|_1 \le \|\tilde{f}\|_1 \, \|\tilde{g}\|_1$

holds for all elements $\tilde{f}$ and $\tilde{g}$ of the Banach space $L^1(\lambda^d)$. The latter is therefore a Banach algebra.
Chapter IV
Measures on Topological Spaces

In view of many applications in analysis, geometry and probability theory it turns out to be unavoidable to subject the Borel measures on $\mathbb{R}^d$ to more precise analysis. These measures possess a host of remarkable properties involving the topology of $\mathbb{R}^d$. Up to this point topology has only entered the picture in the generation of the Borel sets. We will see that the completeness of the euclidean metric, respectively, the local compactness of $\mathbb{R}^d$ were responsible for the aforementioned properties. But there are more general topological spaces, important in their own right, which share these properties with $\mathbb{R}^d$.
Therefore from the start we will set our exposition in an essentially more general framework: Instead of Borel measures on $\mathbb{R}^d$ we will study Radon measures on Polish and locally compact spaces. In the process new facts, even in the $\mathbb{R}^d$ environment, about the nature of the integral and the measurability concept will emerge. A natural and useful convergence concept will play a role.
In what follows some simple things from general (point-set) topology will be
pre-supposed. The textbooks of KELLEY [1955] and WILLARD [1970] are good
sources for these, and explicit references to them will be given at the appropriate
points in the text.

§25. Borel sets, Borel and Radon measures

Initially $E$ will be an arbitrary topological space. The system of its open subsets which defines the topology will be denoted $\mathcal{G}$. In the case of $\mathbb{R}^d$ we had determined (cf. 6.4) that the $\sigma$-algebra of Borel sets is generated by the open sets. Consonant with this we now make the general

25.1 Definition. The $\sigma$-algebra in $E$ generated by $\mathcal{G}$ is denoted by $\mathcal{B}(E)$ and called the Borel $\sigma$-algebra in $E$:
(25.1) $\mathcal{B}(E) := \sigma(\mathcal{G})$.

The closed sets being the complements of the open ones, $\mathcal{B}(E)$ is also generated by the system of all closed subsets of $E$. In this respect the analogy with 6.4 extends a bit farther. The intersection of a sequence of open sets is called a $G_\delta$-set, and the dual, the union of a sequence of closed sets is called an $F_\sigma$-set. All such sets are clearly Borel.

From now on $E$ will be a Hausdorff space. Then every compact subset of $E$ is closed, hence Borel. The second example below will show, however, that generally the $\sigma$-algebra generated by the class $\mathcal{K}$ of all compact subsets of $E$ is strictly smaller than $\mathcal{B}(E)$. So at this point the analogy with 6.4 falters.

Examples. 1. From 6.4, as has already been mentioned,
(25.2) $\mathcal{B}(\mathbb{R}^d) = \mathcal{B}^d$,
$E := \mathbb{R}^d$ here carrying its euclidean topology.
2. Let $E$ be a discrete space, meaning that $\mathcal{G} = \mathcal{P}(E)$. Then the system $\mathcal{K}$ of compact sets consists just of the finite subsets of $E$. Consequently (cf. Examples 2 and 7 in §1) $\sigma(\mathcal{K})$ is the countable and co-countable $\sigma$-algebra, comprised of all countable subsets of $E$ and their complements, and so $\sigma(\mathcal{K}) = \mathcal{B}(E)$ if and only if $E$ is countable.
3. Let $Q$ be a subspace of the Hausdorff space $E$. Then $\mathcal{B}(Q)$ is the trace of $\mathcal{B}(E)$ on $Q$:
$\mathcal{B}(Q) = Q \cap \mathcal{B}(E)$.
In fact, by definition the subspace topology of $Q$ consists of the sets $\{Q \cap G : G \in \mathcal{G}\}$, so this system generates $\mathcal{B}(Q)$. Since $Q \cap \mathcal{B}(E)$ is a $\sigma$-algebra in $Q$ which contains this system, it follows that $\mathcal{B}(Q) \subset Q \cap \mathcal{B}(E)$. On the other hand, the system $\{A \subset E : Q \cap A \in \mathcal{B}(Q)\}$ is obviously a $\sigma$-algebra in $E$ which contains all the open subsets of $E$, a generating system for $\mathcal{B}(E)$. Hence $Q \cap \mathcal{B}(E) \subset \mathcal{B}(Q)$.
If $Q$ itself is a Borel set in $E$, then $\mathcal{B}(Q)$ just consists of all the Borel sets in $E$ which are subsets of $Q$.
4. The compactified number line $\overline{\mathbb{R}}$ is a topological space which is homeomorphic to the compact interval $[-1, +1]$. For it
(25.3) $\mathcal{B}(\overline{\mathbb{R}}) = \overline{\mathcal{B}}^1$.

In fact, $\mathbb{R} \cap \mathcal{B}(\overline{\mathbb{R}}) = \mathcal{B}(\mathbb{R}) = \mathcal{B}^1$ by Examples 1 and 3 above. The subsets $\{-\infty\}$ and $\{+\infty\}$ are closed in $\overline{\mathbb{R}}$ and the subset $\mathbb{R}$ is open in $\overline{\mathbb{R}}$, hence all three are Borel sets in $\overline{\mathbb{R}}$. Equality (25.3) therefore follows from the definition of $\overline{\mathcal{B}}^1$, given in §9.

In the sequel we will be studying measures on $\mathscr{B}(E)$ for two important classes
of spaces E. In preparation we make the following definition.

25.2 Definition. Let E be a Hausdorff space. A measure $\mu$ on the $\sigma$-algebra $\mathscr{B}(E)$
is called:
(i) a Borel measure on E if
$\mu(K) < +\infty$ for every compact $K \subset E$;
(ii) locally finite if every point of E has an open neighborhood of finite $\mu$-measure;
(iii) inner regular if for every $B \in \mathscr{B}(E)$
(25.4) $\mu(B) = \sup\{\mu(K) : K \text{ compact} \subset B\}$;

(iv) outer regular if for every $B \in \mathscr{B}(E)$
(25.5) $\mu(B) = \inf\{\mu(U) : B \subset U \text{ open}\}$;
(v) regular if it is both inner regular and outer regular.
Note that a Borel measure is more than just a measure defined on $\mathscr{B}(E)$: in
addition finiteness on the system of compact sets is demanded. The inner and
outer regularity conditions say that the measure is determined on every Borel set
by its values on the compact, resp., the open sets. The Borel measures on $E = \mathbb{R}^d$
are already familiar to us from §6.
Every finite measure on $\mathscr{B}(E)$ is obviously a Borel measure; as in §24 where
$E = \mathbb{R}^d$, we naturally call it a finite Borel measure on E. The notation introduced
there for the total mass of a finite Borel measure will be carried over to this more
general setting: For every finite Borel measure $\mu$ on a Hausdorff space E
(25.6) $\|\mu\| := \mu(E)$
is called the total mass of $\mu$.
Already at this point we can observe that every locally finite measure $\mu$ on $\mathscr{B}(E)$
is a Borel measure, that is, that
(25.7) (ii) $\Longrightarrow$ (i).
Indeed, each point x in the compact set K has an open neighborhood $V_x$ with
$\mu(V_x) < +\infty$, and compactness means that finitely many of these, say those
corresponding to $x_1, \dots, x_n$, cover K. Then
$\mu(K) \le \mu(V_{x_1}) + \dots + \mu(V_{x_n}) < +\infty$.

The converse of (25.7) is, however, not generally valid. Exercise 2 below furnishes
an example.
Because of the implication (25.7), instead of locally finite measures defined
on $\mathscr{B}(E)$, we will henceforth say simply locally finite Borel measures.
For the moment we will be content to illustrate the regularity concept with
some examples.

Examples. 5. Let E be an arbitrary Hausdorff space, a a point in E. The measure
$\varepsilon_a$ on $\mathscr{B}(E)$ defined by unit mass at a:
(25.8) $\varepsilon_a(A) = 1_A(a)$ for $A \in \mathscr{B}(E)$
is both inner and outer regular on E. Henceforth it will be called the Dirac measure
on E at a.
6. As in Example 2, let E be a discrete space, so that $\mathscr{O} = \mathscr{P}(E)$. The compact
sets are just the finite ones. The measure defined on $\mathscr{B}(E)$ by
$\mu(A) := 0$ if A is countable, $\mu(A) := +\infty$ otherwise

is a locally finite Borel measure which is obviously outer regular. It is, however,
inner regular if and only if the set E is countable.
7. On $\mathscr{B}^1 = \mathscr{B}(\mathbb{R})$ consider the counting measure. It is not a Borel measure; it is,
however, inner regular, but not outer regular. In fact, equality (25.5) fails even for
one-point sets B.
8. L-B measure $\lambda^d$ on $\mathscr{B}^d = \mathscr{B}(\mathbb{R}^d)$ is a (locally finite) Borel measure. In §26 we
will see that it - and indeed every Borel measure on $\mathbb{R}^d$ - is regular.
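
The claims of Example 7 can be checked directly. The following is only a sketch in notation of our own: the symbol $\zeta$ for the counting measure on $\mathscr{B}^1$ is an assumption made for illustration and is not taken from the text.

```latex
% \zeta denotes the counting measure on \mathscr{B}(\mathbb{R}) (hypothetical symbol).
% Every non-empty open subset of \mathbb{R} contains an interval, hence infinitely many points.
\begin{align*}
\zeta([0,1]) &= +\infty
  && \text{($[0,1]$ is compact, so $\zeta$ is not a Borel measure)}, \\
\zeta(\{x\}) &= 1
  && \text{for every } x \in \mathbb{R}, \\
\inf\{\zeta(U) : \{x\} \subset U \text{ open}\} &= +\infty
  && \text{(so (25.5) fails for } B = \{x\}\text{)}, \\
\sup\{\zeta(K) : K \text{ finite} \subset B\} &= \zeta(B)
  && \text{for every } B \in \mathscr{B}^1 \text{ (finite sets are compact)}.
\end{align*}
```

The last line gives inner regularity, since finite sets are among the compact subsets of B and the supremum in (25.4) cannot exceed $\zeta(B)$.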

Developments stretching over decades attest to the fact that on a Hausdorff
space those Borel measures which are locally finite and inner regular play a
distinguished role. Such measures are nowadays named after J. RADON (1887-1956).
A work of his from the year 1913 (cf. the bibliography), which has since become
classical, set this development in motion.

25.3 Definition. A measure defined on the Borel $\sigma$-algebra $\mathscr{B}(E)$ of a Hausdorff
space E is called a Radon measure on E if it is both locally finite and inner regular.

More precisely the term used is "positive" Radon measure, but in this book
we dispense with that adjective because non-negativity is built into our definition
of measure, that is, we consider only measures with values in $[0, +\infty]$. Example 5
says that the Dirac measure at any point $a \in E$ is always a Radon measure on E.

We have already noted that Borel measures are not automatically locally finite.
Nevertheless for many spaces Radon measures can be defined simply as the inner
regular Borel measures. That is the import of

25.4 Lemma. On a Hausdorff space E in which every point has a countable
neighborhood basis, every inner regular Borel measure $\mu$ on E is also locally finite
and hence a Radon measure.

Proof. We argue by contradiction: Suppose that $\mu$ is not locally finite, which means
there is a point $x \in E$ such that $\mu(V) = +\infty$ for every open neighborhood V of x.
By hypothesis x has a neighborhood basis consisting of a sequence $(V_n)$ of open
sets, and by replacing each $V_n$ with $V_1 \cap \dots \cap V_n$, we may suppose that $V_n \downarrow \{x\}$.
Since $\mu(V_n) = +\infty$ and $\mu$ is inner regular, there exists a compact subset $K_n \subset V_n$
such that $\mu(K_n) > n$, and this is true of each $n \in \mathbb{N}$. Now the set
$K := \{x\} \cup \bigcup_{n \in \mathbb{N}} K_n$
is compact. For if $\mathscr{U}$ is an open cover of K, then some $U \in \mathscr{U}$ contains x and
since $(V_n)$ is a neighborhood basis at x, $V_{n_0} \subset U$ for some $n_0 \in \mathbb{N}$. It follows that
$K_n \subset V_n \subset V_{n_0} \subset U$ for all $n \ge n_0$. Since $K_1 \cup \dots \cup K_{n_0}$ is a compact subset of K,
it is covered by finitely many sets in $\mathscr{U}$. These together with U then furnish the
desired finite covering of K. On the one hand then $\mu(K) < +\infty$, since $\mu$ is a Borel

measure, and on the other hand, since $K_n \subset K$,
$\mu(K) \ge \mu(K_n) > n$ for all $n \in \mathbb{N}$.
This is the contradiction sought. □

Exercises.
1. Let $(\Omega, \mathscr{A})$ be a measurable space, $\mathscr{E}$ a generator of $\mathscr{A}$ and $\Omega'$ a subset of $\Omega$.
Consider the traces $\mathscr{A}'$ and $\mathscr{E}'$ of $\mathscr{A}$ and $\mathscr{E}$, resp., on $\Omega'$ and show that $\mathscr{E}'$ is
a generator of the $\sigma$-algebra $\mathscr{A}'$ in $\Omega'$. Example 3 above is a special case.
2. Equip the set $\mathbb{R}$ with the so-called right-sided topology (which is also sometimes
named after SORGENFREY [1947]) whose system $\mathscr{O}_r$ of open sets is defined as
follows: A subset $U \subset \mathbb{R}$ lies in $\mathscr{O}_r$ if and only if for each $x \in U$ there is an $\varepsilon > 0$
such that $[x, x + \varepsilon[ \subset U$. The topological space thus created will be denoted $\mathbb{R}_r$.
Establish, one after another, the following claims:
(a) Every right half-open interval $[a, b[$ is both open and closed in $\mathbb{R}_r$. The
right-sided topology on $\mathbb{R}$ is strictly finer than the usual topology. In particular,
$\mathbb{R}_r$ is a Hausdorff space.
(b) $\mathscr{B}(\mathbb{R}_r) = \mathscr{B}^1$.
(c) Suppose $(x_n)$ is a strictly isotone sequence of real numbers possessing the
supremum $b \in \mathbb{R}$. Then the set $\{x_n : n \in \mathbb{N}\} \cup \{b\}$ is closed but not compact
in $\mathbb{R}_r$. By contrast, if $(y_n)$ is a strictly antitone sequence of real numbers
possessing the infimum $a \in \mathbb{R}$, then $\{a\} \cup \{y_n : n \in \mathbb{N}\}$ is compact in $\mathbb{R}_r$.
(d) Let K be compact in $\mathbb{R}_r$. Then there exists (from the first part of (c)) for every
$x \in K$ a $y \in \mathbb{Q}$ with $y < x$ and $[y, x[ \cap K = \emptyset$. If for each $x \in K$, $\rho(x)$ designates
such a rational number y, then a mapping $\rho : K \to \mathbb{Q}$ materializes which is
strictly isotone, and hence injective.
(e) Every compact subset of $\mathbb{R}_r$ is countable. (But (c) shows that the converse is
not true.)
(f) Consider on $\mathscr{B}(\mathbb{R}_r) = \mathscr{B}^1$ the measure $\mu$ which assigns to every countable set
the value 0 and to every uncountable set the value $+\infty$ (cf. Example 6). Then
$\mu$ is a Borel measure on $\mathbb{R}_r$ for which no point of $\mathbb{R}_r$ has a neighborhood of
finite measure. In particular, the measure $\mu$ is not locally finite and is neither
inner regular nor outer regular.
(g) Consider the measure $\nu := f\lambda^1$ with density
$f(x) := x^{-1}\, 1_{]0,+\infty[}(x)$ $(x \in \mathbb{R})$
and show that it too is a non-locally-finite Borel measure on $\mathbb{R}_r$.
(h) Investigate the L-B measure $\lambda^1$, thought of as a Borel measure on $\mathbb{R}_r$, in
respect to its inner and outer regularity.

§26. Radon measures on Polish spaces

For two extensive classes of Hausdorff spaces Borel measures come up very natu-
rally. The first of these classes will be discussed in this section, beginning of course
with its

26.1 Definition. A topological space E is called Polish when its topology has
a countable base and can be defined by a complete metric.
The terminology is due to N. BOURBAKI and commemorates the achievements
of Polish topologists in the development of general topology.
A metric is called complete when the associated metric space is complete: every
Cauchy sequence in it converges. A countable base or basis for the topology is
a countable system of open sets such that every open set is the union of those from
the system which are subsets of it. For a metrizable space E the existence of such
a basis is equivalent to the existence of a countable dense subset.

Examples. 1. The euclidean spaces $\mathbb{R}^d$ of every dimension $d \ge 1$ are Polish, the
ordinary euclidean metric being complete.
2. The product $E' \times E''$ of two Polish spaces is another, when given the product
topology. For if $d'$, $d''$ are complete metrics generating the topologies of $E'$ and $E''$,
resp., then the product topology of $E' \times E''$ is generated by the metric
$d(x, y) := d'(x', y') + d''(x'', y'')$, $x := (x', x'')$, $y := (y', y'')$,
which moreover is complete. If $\mathscr{G}'$, $\mathscr{G}''$ are countable bases for $E'$, $E''$, resp., then
$\{G' \times G'' : G' \in \mathscr{G}', G'' \in \mathscr{G}''\}$ is a countable basis for $E' \times E''$.
3. Every closed subspace F of a Polish space E is Polish. Just restrict to F any
complete metric that generates the topology of E.
4. Every open subspace G of a Polish space E is Polish.

Proof. We may suppose $G \ne E$. By 1. and 2. $\mathbb{R} \times E$ is Polish. Let d be a complete
metric giving the topology of E, and consider the set F of all $(\lambda, x) \in \mathbb{R} \times E$
satisfying $\lambda \cdot d(x, E \setminus G) = 1$. Here, as usual, for $\emptyset \ne A \subset E$, $d(x, A) := \inf\{d(x, a) : a \in A\}$
is the distance from the point $x \in E$ to A. The mapping $x \mapsto d(x, A)$ is
continuous on E; in fact, as the reader can easily check, $|d(x, A) - d(y, A)| \le d(x, y)$
for all $x, y \in E$. Consequently, $(\lambda, x) \mapsto \lambda\, d(x, E \setminus G)$ is a continuous real
function on $\mathbb{R} \times E$, and F is a closed subset of $\mathbb{R} \times E$, hence itself a Polish space,
by 3. Finally, $(\lambda, x) \mapsto x$ maps F homeomorphically onto G. To see surjectivity,
we only have to notice that, because $E \setminus G$ is closed, G coincides with the set
$\{x \in E : d(x, E \setminus G) > 0\}$.
5. More generally it is true (cf. COHN [1980], Theorem 8.1.4 or WILLARD [1970],
Theorem 24.12) that a subspace A of a Polish space E is Polish if A is a $G_\delta$-set
in E, that is, A is the intersection of a sequence of open subsets of E. Thus, for

example, the set J of all irrational numbers with its topology as a subspace of $\mathbb{R}$
is Polish, since
$J = \bigcap_{x \in \mathbb{Q}} (\mathbb{R} \setminus \{x\})$.

6. Every compact space E with a countable basis is Polish. For a famous theorem
of P.S. URYSOHN (1898-1924) (cf. KELLEY [1955], p. 125 or WILLARD [1970],
Theorem 23.1) guarantees that E is metrizable, and in Remark 3 of §31 we shall
even give a proof of this. The compactness of E easily entails that every metric
defining its topology is complete.
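
The Lipschitz estimate for the distance function that Example 4 leaves to the reader can be verified as follows. This is a sketch using only the triangle inequality; A is any non-empty subset of E, as in the example.

```latex
% x, y \in E and a \in A are arbitrary; d is the metric of E.
\begin{align*}
d(x, A) &\le d(x, a) \le d(x, y) + d(y, a)
  && \text{for every } a \in A, \\
d(x, A) &\le d(x, y) + d(y, A)
  && \text{(infimum over } a \in A\text{)}, \\
d(y, A) &\le d(x, y) + d(x, A)
  && \text{(roles of } x \text{ and } y \text{ exchanged)}, \\
|d(x, A) - d(y, A)| &\le d(x, y).
\end{align*}
```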
The key to the further discussion is the following lemma, which is here just
a preliminary to the big theorem that follows it, but nevertheless is significant in
its own right. In it we encounter our first extensive class of Radon measures.

26.2 Lemma. Every finite Borel measure $\mu$ on a Polish space E is regular.

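Proof. We consider the system $\mathscr{D}$ of all $B \in \mathscr{B}(E)$ which satisfy both
(26.1) $\mu(B) = \sup\{\mu(K) : K \text{ compact} \subset B\}$
and
(26.2) $\mu(B) = \inf\{\mu(U) : B \subset U \text{ open}\}$.
The goal of course is to show that $\mathscr{D} = \mathscr{B}(E)$. We block off the work into five
sections. Let d be a complete metric defining the topology of E.
1. $E \in \mathscr{D}$: Only (26.1) needs proof when B = E. Let $(x_n)_{n \in \mathbb{N}}$ be a sequence which
is dense in E, and for $x \in E$, real $r > 0$ let $K_r(x)$ denote the open ball of center x
and d-radius r. For every r then $E = \bigcup_{n \in \mathbb{N}} K_r(x_n)$, because in every ball $K_r(x)$ lies
some $x_n$, so that $x \in K_r(x_n)$. Since $\mu$ is continuous from below
$\mu(E) = \lim_{k \to \infty} \mu\bigl(\bigcup_{j=1}^{k} K_r(x_j)\bigr)$.
Therefore, for each $\varepsilon > 0$ and $n \in \mathbb{N}$ there exists $k_n \in \mathbb{N}$ such that
$\mu\bigl(\bigcup_{j=1}^{k_n} K_{1/n}(x_j)\bigr) \ge \mu(E) - \varepsilon 2^{-n}$.
Each set $B_n := \bigcup_{j=1}^{k_n} \overline{K_{1/n}(x_j)}$, hence also their intersection $K := \bigcap_{n \in \mathbb{N}} B_n$, is closed,
and we have
$\mu(E) - \mu(K) = \mu(E \setminus K) = \mu\bigl(\bigcup_{n \in \mathbb{N}} (E \setminus B_n)\bigr) \le \sum_{n=1}^{\infty} \mu(E \setminus B_n) \le \sum_{n=1}^{\infty} \varepsilon 2^{-n} = \varepsilon$.
This will prove (26.1) if we can confirm that the closed set K is actually compact.
For every $n \in \mathbb{N}$
$K \subset B_n = \bigcup_{j=1}^{k_n} \overline{K_{1/n}(x_j)}$,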

and each set in this union has diameter no greater than 2/n. This shows that K is
pre-compact (=totally bounded) and in a complete metric space that is equivalent
to compactness, by very easy arguments (cf. WILLARD [1970], Theorem 39.9 or
KELLEY [1955], p. 198).
2. Every closed set C lies in $\mathscr{D}$: Let $\varepsilon > 0$ be given. We already know that there is
a compact set K with
$\mu(E) - \mu(K) < \varepsilon$.
According to 3.5 however
$\mu(C) - \mu(C \cap K) = \mu(C \cup K) - \mu(K) \le \mu(E) - \mu(K) < \varepsilon$
and this proves (26.1) for B := C, because $C \cap K$ is compact. As a closed subset
of a metric space, C is a $G_\delta$-set, that is, there are open sets $G_n \downarrow C$. To see
this we may assume $C \ne \emptyset$, so that $G := E \setminus C$ is an open proper subset of E.
Consequently, $x \mapsto d(x, C)$ is a continuous mapping whose zero-set is C, as was
shown in treating Example 4. The sets $G_n := \{x \in E : d(x, C) < 1/n\}$ are
therefore open and decrease to C. From the finiteness of $\mu$ and 3.2(c) we then
have that $\mu(G_n) \downarrow \mu(C)$, showing that (26.2) is also satisfied by B := C.
3. Whenever B lies in $\mathscr{D}$ so does $\complement B$: First note that for every compact $K \subset B$
$\mu(\complement K) - \mu(\complement B) = \mu(B) - \mu(K)$,
and so $\complement B$ satisfies (26.2) whenever B satisfies (26.1). Moreover, if G is an open
superset of B, then $\complement G$ is a closed subset of $\complement B$ with
$\mu(\complement B) - \mu(\complement G) = \mu(G) - \mu(B)$,
showing, at least, that $\complement B$ satisfies (26.1) weakened by replacing "compact" there
by "closed". But then application of step 2 to these closed sets gives us the
full (26.1) for $\complement B$.
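4. Whenever pairwise disjoint sets $D_n$ lie in $\mathscr{D}$ ($n \in \mathbb{N}$), their union D also lies
in $\mathscr{D}$: First of all
$\mu(D) = \sum_{n=1}^{\infty} \mu(D_n)$.
Letting $\varepsilon > 0$ be given, we therefore have an $n_\varepsilon \in \mathbb{N}$ such that
(26.3) $\mu(D) - \sum_{n=1}^{n_\varepsilon} \mu(D_n) < \varepsilon/2$.
Every $D_j$ contains a compact $K_j$ such that
$\mu(D_j) - \mu(K_j) < \dfrac{\varepsilon}{2 n_\varepsilon}$ $(j = 1, \dots, n_\varepsilon)$
since each $D_j \in \mathscr{D}$. Then $K := K_1 \cup \dots \cup K_{n_\varepsilon}$ is a compact subset of
$D_1 \cup \dots \cup D_{n_\varepsilon} \subset D$ which satisfies
$\mu(D_1 \cup \dots \cup D_{n_\varepsilon}) - \mu(K) \le \mu\bigl(\bigcup_{j=1}^{n_\varepsilon} (D_j \setminus K_j)\bigr) \le \sum_{j=1}^{n_\varepsilon} \mu(D_j \setminus K_j) < \varepsilon/2$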

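from which, in view of (26.3),
$\mu(D) - \mu(K) < \varepsilon$.
Again, $D_n \in \mathscr{D}$ means there exists open $U_n \supset D_n$ such that
$\mu(U_n) - \mu(D_n) < \varepsilon/2^n$ for each $n \in \mathbb{N}$.
Then the open set $U := \bigcup_{n \in \mathbb{N}} U_n$ contains D and satisfies
$\mu(U) - \mu(D) \le \mu\bigl(\bigcup_{n \in \mathbb{N}} (U_n \setminus D_n)\bigr) \le \sum_{n=1}^{\infty} \mu(U_n \setminus D_n) < \varepsilon$.
In summary, we have shown that (26.1) and (26.2) hold for $B := D = \bigcup D_n$.
5. The result of the first four steps is that $\mathscr{D}$ is a Dynkin system which contains
the system $\mathscr{F}$ of all closed sets. The claim, namely that $\mathscr{D} = \mathscr{B}(E)$, now follows
in the familiar way: Because $\mathscr{F}$ is $\cap$-stable, $\delta(\mathscr{F}) = \sigma(\mathscr{F}) = \mathscr{B}(E)$. From
$\mathscr{F} \subset \mathscr{D} \subset \mathscr{B}(E)$ follows $\mathscr{B}(E) = \delta(\mathscr{F}) \subset \mathscr{D} \subset \mathscr{B}(E)$, and thus the equality sought. □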
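
The chain of (in)equalities in step 2 of the preceding proof can be unpacked as follows. This is a sketch which assumes that the result cited there as 3.5 is the usual relation $\mu(A \cup B) + \mu(A \cap B) = \mu(A) + \mu(B)$; subtraction is legitimate because $\mu$ is finite.

```latex
% C closed, K compact with \mu(E) - \mu(K) < \varepsilon; \mu is a finite measure.
\begin{align*}
\mu(C) + \mu(K) &= \mu(C \cup K) + \mu(C \cap K), \\
\mu(C) - \mu(C \cap K) &= \mu(C \cup K) - \mu(K)
  \le \mu(E) - \mu(K) < \varepsilon .
\end{align*}
```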

We come now to the principal result of this section. It generalizes the foregoing
lemma.

26.3 Theorem. On a Polish space E every locally finite Borel measure $\mu$ is a
$\sigma$-finite Radon measure.

Proof. The hypothesis is that every point $x \in E$ has an open neighborhood $U_x$ of
finite $\mu$-measure. The family $(U_x)_{x \in E}$ is an open cover of E. Because the topology
of E has a countable basis, a theorem of E. LINDELÖF (1870-1946) insures that
this cover contains a countable subcover. That is, there is a sequence $(x_n)_{n \in \mathbb{N}}$ in E
such that the sequence $(U_{x_n})_{n \in \mathbb{N}}$ already covers E. [It is easy enough to prove
Lindelöf's result right here: Let $\mathscr{U}$ be any open cover of E, $\mathscr{A}$ a countable basis
for the topology of E, and define $\mathscr{A}'$ to be the system of all $A \in \mathscr{A}$ such that $A \subset U$
for some $U \in \mathscr{U}$ and let U(A) be one such member of $\mathscr{U}$. The subset $\mathscr{A}'$ of $\mathscr{A}$,
and therewith the system of all these U(A), is countable. This system covers E.
For if $x \in E$, then there is some $U \in \mathscr{U}$ that contains x, and since $\mathscr{A}$ is a basis
there is some $A \in \mathscr{A}$ such that $x \in A \subset U$. Thus $A \in \mathscr{A}'$ and $x \in A \subset U(A)$.]
The system of sets $G_n := U_{x_1} \cup \dots \cup U_{x_n}$, $n \in \mathbb{N}$, satisfies
(26.4) $\mu(G_n) < +\infty$ for every $n \in \mathbb{N}$, and $G_n \uparrow E$.
Via
$\mu_n(A) := \mu(A \cap G_n)$, $A \in \mathscr{B}(E)$,
a finite Borel measure $\mu_n$ is defined on E for every $n \in \mathbb{N}$. Each such measure is
inner regular by the preceding lemma. It follows that for each $A \in \mathscr{B}(E)$
$\mu(A) = \sup_{n \in \mathbb{N}} \mu(A \cap G_n) = \sup_{n \in \mathbb{N}} \mu_n(A) = \sup_{n \in \mathbb{N}} \sup_{K \in \mathscr{K},\, K \subset A} \mu_n(K)$.

After commuting the two suprema this reads
$\mu(A) = \sup_{K \in \mathscr{K},\, K \subset A} \sup_{n \in \mathbb{N}} \mu_n(K) = \sup_{K \in \mathscr{K},\, K \subset A} \mu(K)$,
proving the inner regularity of $\mu$. The $\sigma$-finiteness of $\mu$ is affirmed by (26.4), so the
proof is complete. □
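
The two facts used tacitly at the end of this proof, namely that iterated suprema may be interchanged and that $\sup_n \mu_n(K) = \mu(K)$, can be justified as follows. This is a sketch in which $\mathscr{K}$ denotes the system of compact subsets of E and $a_{n,K}$ is shorthand introduced here for illustration.

```latex
% a_{n,K} := \mu_n(K) = \mu(K \cap G_n) for n \in \mathbb{N} and compact K \subset A.
\begin{align*}
\sup_{n} \sup_{K} a_{n,K}
  &= \sup\{a_{n,K} : n \in \mathbb{N},\ K \in \mathscr{K},\ K \subset A\}
   = \sup_{K} \sup_{n} a_{n,K}, \\
\sup_{n} \mu_n(K)
  &= \lim_{n \to \infty} \mu(K \cap G_n) = \mu(K)
  && \text{since } K \cap G_n \uparrow K \text{ and } \mu \text{ is continuous from below.}
\end{align*}
```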

The question now suggests itself whether - in analogy with 26.2 - the outer
regularity of $\mu$ can be proved. This is in fact the case.

26.4 Corollary. Every Radon measure on a Polish space is outer regular.

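Proof. We have to show that every $B \in \mathscr{B}(E)$ satisfies (25.5). So let $B \in \mathscr{B}(E)$
and $\varepsilon > 0$ be given. Consider the open sets $G_n$ and the finite measures $\mu_n$ created
in the preceding proof. Lemma 26.2 furnishes open sets $U_n \supset B$ such that
(26.5) $\mu((U_n \setminus B) \cap G_n) = \mu_n(U_n \setminus B) < \varepsilon/2^n$ for each $n \in \mathbb{N}$.
Let $U := \bigcup_{n \in \mathbb{N}} (U_n \cap G_n)$, an open set. Since
$B = B \cap E = B \cap \bigcup_{n \in \mathbb{N}} G_n = \bigcup_{n \in \mathbb{N}} (B \cap G_n)$,
it follows from $B \subset U_n$ for every n, that $B \subset U$. Moreover, this representation
of B shows that
$U \setminus B = \bigcup_{n \in \mathbb{N}} (U_n \cap G_n) \setminus \bigcup_{n \in \mathbb{N}} (B \cap G_n) \subset \bigcup_{n \in \mathbb{N}} \bigl[(U_n \cap G_n) \setminus (B \cap G_n)\bigr] = \bigcup_{n \in \mathbb{N}} (U_n \setminus B) \cap G_n$
and consequently
$\mu(U \setminus B) \le \sum_{n=1}^{\infty} \mu((U_n \setminus B) \cap G_n) < \sum_{n=1}^{\infty} \varepsilon/2^n = \varepsilon$
by (26.5). It follows finally that
$\mu(U) = \mu(B) + \mu(U \setminus B) \le \mu(B) + \varepsilon$,
which confirms (25.5). □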

The regularity conditions (25.4), (25.5) make sense for outer measures $\mu^*$ and
together with one other minimal demand on $\mu^*$ they assure that all Borel sets are
$\mu^*$-measurable. In fact, these conditions on an outer measure come up naturally in
the course of proving the famous Riesz representation theorem in §29; cf. also 28.3.

26.5 Lemma. Let E be a Hausdorff space and $\mu^*$ an outer measure on E with
the following three properties:
(i) for every set $A \subset E$
$\mu^*(A) = \inf\{\mu^*(U) : A \subset U \text{ open}\}$;

(ii) for every open set $U \subset E$
$\mu^*(U) = \sup\{\mu^*(K) : K \text{ compact} \subset U\}$;
(iii) for any two disjoint compact sets $K_1, K_2 \subset E$
$\mu^*(K_1 \cup K_2) = \mu^*(K_1) + \mu^*(K_2)$.
Then the restriction of $\mu^*$ to $\mathscr{B}(E)$ is a measure.

Proof. We consider the $\sigma$-algebra $\mathscr{A}^*$ of all $\mu^*$-measurable sets, that is, according
to (5.6) the set of all $A \in \mathscr{P}(E)$ which satisfy
(26.6) $\mu^*(Q) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$ for all $Q \in \mathscr{P}(E)$.
First note that it suffices that this hold for all open sets Q in order that it hold
for all Q whatsoever. In other words, what we need to check for an A to be in $\mathscr{A}^*$
is that
(26.6') $\mu^*(U) \ge \mu^*(U \cap A) + \mu^*(U \setminus A)$ for all $U \in \mathscr{O}$.
Indeed from (26.6') it follows for any $Q \subset E$ that
$\mu^*(U) \ge \mu^*(Q \cap A) + \mu^*(Q \setminus A)$
whenever U is an open set containing Q; then (26.6) itself follows by taking the
infimum over such U and invoking (i). So now let A = G be an open set; we will
use criterion (26.6') to show that G lies in $\mathscr{A}^*$. To this end consider any open
$U \subset E$; further, consider any compact $K_1 \subset U \cap G$ and any compact $K_2 \subset U \setminus K_1$.
Since then $K_1 \cap K_2 = \emptyset$ and $K_1 \cup K_2 \subset U$, it follows from (iii) that
$\mu^*(U) \ge \mu^*(K_1 \cup K_2) = \mu^*(K_1) + \mu^*(K_2)$.
The set $U \setminus K_1$ is open, so if we take the supremum over all such $K_2$ in the preceding
inequality and appeal to (ii), we get
$\mu^*(U) \ge \mu^*(K_1) + \mu^*(U \setminus K_1) \ge \mu^*(K_1) + \mu^*(U \setminus G)$,
the last inequality because $U \setminus K_1 \supset U \setminus G$. This holds for all compact $K_1 \subset U \cap G$,
and so after a second appeal to (ii) it yields
$\mu^*(U) \ge \mu^*(U \cap G) + \mu^*(U \setminus G)$,
holding for all $U \in \mathscr{O}$. That is, (26.6') holds for A = G, and consequently $G \in \mathscr{A}^*$.
This all proves that $\mathscr{O} \subset \mathscr{A}^*$. But then $\mathscr{B}(E) = \sigma(\mathscr{O}) \subset \mathscr{A}^*$, since the latter
is a $\sigma$-algebra, by Theorem 5.3. That theorem further affirms that the restriction
of $\mu^*$ to $\mathscr{A}^*$ is a measure. □

The foregoing Theorem 26.3 and its corollary show in particular that the
L-B measure $\lambda^d$ is a regular Borel measure on $\mathbb{R}^d$ in each dimension d = 1, 2, ... .
In fact every Borel measure on $\mathbb{R}^d$ is regular (cf. also Theorem 29.12). Following
STROMBERG [1972] we derive from the regularity of $\lambda^d$ a purely topological result
of H. STEINHAUS (1887-1972). It shows, incidentally, that every set of positive
L-B measure has the cardinality of $\mathbb{R}$.

26.6 Theorem (of Steinhaus). Let $A \in \mathscr{B}^d$ be a Borel set in $\mathbb{R}^d$ of positive
d-dimensional Lebesgue measure. Then 0 is an interior point of the set A - A of
differences of elements of A.

Proof. The inner regularity of $\lambda^d$ means that A contains a compact subset K
with $\lambda^d(K)$ positive. It suffices to prove the claim with K in place of A. Outer
regularity furnishes an open set $U \supset K$ with $\lambda^d(U) < 2\lambda^d(K)$. There is an open
ball V centered at 0 of positive radius such that the sum set satisfies $K + V \subset U$.
One only has to choose the radius less than the (positive) distance between the
compact set K and the closed set $\complement U$ from which it is disjoint. We will show that
$V \subset K - K$, which makes 0 an interior point of this difference set. Consider any
$v \in V$. The translated set v + K cannot be disjoint from K, for otherwise from
$K \cup (v + K) \subset K + V \subset U$ and translation-invariance of $\lambda^d$ would follow that
$2\lambda^d(K) = \lambda^d(K) + \lambda^d(v + K) = \lambda^d(K \cup (v + K)) \le \lambda^d(U)$,
contrary to the choice of U. But $K \cap (v + K) \ne \emptyset$ means that for some $x, y \in K$,
x = v + y; which says that the given point v = x - y lies in K - K. □
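
The choice of the ball V in the preceding proof can be made explicit. The following is a sketch for the case $\complement U \ne \emptyset$ (if $U = \mathbb{R}^d$ any radius works); the symbols $\delta$ and r are introduced here for illustration and $|\cdot|$ is the euclidean norm.

```latex
% K compact, \complement U closed and non-empty, K \cap \complement U = \emptyset.
\begin{align*}
\delta &:= \inf\{|x - z| : x \in K,\ z \in \complement U\} > 0, \qquad
V := \{v \in \mathbb{R}^d : |v| < r\} \text{ with } 0 < r \le \delta, \\
x \in K,\ v \in V &\implies |(x + v) - x| = |v| < \delta
  \implies x + v \notin \complement U, \\
&\text{hence } K + V \subset U .
\end{align*}
```

Positivity of $\delta$ is exactly the statement of Exercise 4 below.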
In closing we turn to a remarkable consequence of Theorem 26.3 and its
Corollary 26.4. It concerns the analogy, pointed out in §7 as measurable mappings were
being introduced, between the notions of measurability and continuity. Initially
this analogy is merely an analogy. Namely, if $f : E \to E'$ is a mapping of one
topological space into another, then f is Borel measurable (i.e., $\mathscr{B}(E)$-$\mathscr{B}(E')$-measurable)
just if the pre-image $f^{-1}(G')$ of every open set $G' \subset E'$ is a Borel set in E.
This follows from Theorem 7.2 and the fact that the Borel $\sigma$-algebra $\mathscr{B}(E')$ is
generated by the open subsets of $E'$. By contrast, f is continuous just if $f^{-1}(G')$
is open in E for every open set $G' \subset E'$. What is quite remarkable is that for
Polish spaces E a much closer connection between those two concepts exists.
This is brought out by the following theorem, discovered in its definitive form
by N. LUSIN (1883-1950).

26.7 Theorem (of Lusin). Let $\mu$ be a locally finite Borel measure, thus a Radon
measure, on a Polish space E, and $E'$ be a topological space with a countable basis.
Then for every mapping $f : E \to E'$ the following are equivalent:
(a) f coincides $\mu$-almost everywhere with a Borel measurable mapping of E into $E'$.
(b) There is a decomposition of E into a $\mu$-nullset $N \in \mathscr{B}(E)$ and a sequence
$(K_n)_{n \in \mathbb{N}}$ of compact sets, such that the restriction of f to each $K_n$ is
continuous.
If the measure $\mu$ is finite, (a) and (b) are further equivalent to:
(c) For every $\varepsilon > 0$ there is a compact subset $K_\varepsilon \subset E$ such that $\mu(\complement K_\varepsilon) < \varepsilon$ and
the restriction of f to $K_\varepsilon$ is continuous.
Proof. Let us first suppose that $\mu$ is finite. Let $\mathscr{G}'$ be a countable base for the
topology of $E'$ and $(G'_n)_{n \in \mathbb{N}}$ a sequential arrangement of its elements. Notice that $\mathscr{G}'$ is
a generator of the Borel $\sigma$-algebra $\mathscr{B}(E')$ because every open subset of $E'$ is a (countable)
union of sets from $\mathscr{G}'$.

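(a)$\Rightarrow$(c): By hypothesis there is a Borel measurable mapping $g : E \to E'$ and
$\mu$-nullset $N \in \mathscr{B}(E)$ with
(26.7) f(x) = g(x) for all $x \in \complement N$.
For every set $G'_n$, $g^{-1}(G'_n) \in \mathscr{B}(E)$. Because every Radon measure on E is regular,
given $\varepsilon > 0$, there exist compact sets $K_n$ and open sets $U_n$ such that
(26.8) $K_n \subset g^{-1}(G'_n) \subset U_n$ and $\mu(U_n \setminus K_n) < 2^{-n}\varepsilon$ for each $n \in \mathbb{N}$.
The set $A := \bigcup_{n \in \mathbb{N}} (U_n \setminus K_n)$ is open, being a union of open sets. For its measure
we have the obvious inequality
$\mu(A) \le \sum_{n=1}^{\infty} \mu(U_n \setminus K_n) < \varepsilon$.
Using once more the (inner) regularity of $\mu$, we find a compact
$K \subset \complement(A \cup N) = \complement A \cap \complement N$ such that
$\mu(\complement A \cap \complement N \cap \complement K) < \varepsilon - \mu(A)$,
thus (since $A \cup N \subset \complement K$ and $A \cup N \cup (\complement A \cap \complement N) = E$) such that
$\mu(\complement K) = \mu(A \cup N \cup [\complement A \cap \complement N \cap \complement K]) < \mu(A) + \mu(N) + \varepsilon - \mu(A) = \varepsilon$.
This set K does what is wanted in (c), because by (26.7) f and g coincide in K
and because the restriction $g_0$ of g to $\complement A$ is continuous, as we now confirm. For
each set $G'_n$,
$g_0^{-1}(G'_n) = g^{-1}(G'_n) \cap \complement A$;
from (26.8) and the fact $U_n \setminus K_n \subset A$ follows therefore
$U_n \cap \complement A = K_n \cap \complement A \subset g_0^{-1}(G'_n) \subset U_n \cap \complement A$,
which means that
$g_0^{-1}(G'_n) = U_n \cap \complement A = K_n \cap \complement A$,
showing that the $g_0$-pre-image of $G'_n$ is open (as well as closed) in $\complement A$. Since
$(G'_n)_{n \in \mathbb{N}}$ is a base for the topology of $E'$, this is enough to guarantee the continuity
of $g_0 = g \mid \complement A$.
(c)$\Rightarrow$(b): It suffices to find pairwise disjoint compact subsets $K_n$ of E such that
$f \mid K_n$ is continuous and
$\mu\bigl(\complement \bigcup_{j=1}^{n} K_j\bigr) < \dfrac{1}{n}$
for each $n \in \mathbb{N}$. For then
$N := \complement \bigcup_{n \in \mathbb{N}} K_n = \bigcap_{n \in \mathbb{N}} \complement K_n$
is a Borel set disjoint from each $K_n$ and satisfying $\mu(N) < 1/n$ for every $n \in \mathbb{N}$, i.e.,
$\mu(N) = 0$. The sequence $(K_n)$ is gotten inductively from (c) as follows: To start
off, there is a compact $K_1 \subset E$ such that $\mu(\complement K_1) < 1$ and $f \mid K_1$ is continuous.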

If $K_1, \dots, K_n$ have been defined having the desired properties, we will get $K_{n+1}$
from (c) and the inner regularity of $\mu$. By (c) there is a compact $K' \subset E$ such that
$\mu(\complement K') < (2n + 2)^{-1}$
and $f \mid K'$ is continuous. With $L := K_1 \cup \dots \cup K_n$ the inner regularity of $\mu$ supplies
a compact $K_{n+1} \subset K' \setminus L$ such that
$\mu(K' \setminus L) - \mu(K_{n+1}) = \mu(K' \cap \complement L \cap \complement K_{n+1}) < (2n + 2)^{-1}$.

Because
$\mu(\complement(L \cup K_{n+1})) = \mu(\complement K' \cap \complement L \cap \complement K_{n+1}) + \mu(K' \cap \complement L \cap \complement K_{n+1})$
$\le \mu(\complement K') + \mu(K' \cap \complement L \cap \complement K_{n+1}) < (n + 1)^{-1}$,
with this set $K_{n+1}$ the inductive construction is complete.
(b)$\Rightarrow$(a): If $E = N \cup K_1 \cup K_2 \cup \dots$ is the given decomposition, one defines
a mapping $g : E \to E'$ as follows. In case $N = \emptyset$, let g := f. In case $N \ne \emptyset$, choose
$y_0 \in f(N)$ arbitrarily and set
g(x) := f(x) for $x \in E \setminus N$, g(x) := $y_0$ for $x \in N$.
What has to be shown is that g is Borel measurable, which is done as follows: For
every open $G' \subset E'$
$g^{-1}(G') = (g^{-1}(G') \cap N) \cup \bigcup_{n \in \mathbb{N}} (g^{-1}(G') \cap K_n) = N_0 \cup \bigcup_{n \in \mathbb{N}} g_n^{-1}(G')$
where $N_0 := g^{-1}(G') \cap N$ and $g_n := g \mid K_n$. Now $N_0$ is either N or $\emptyset$, according
as $y_0 \in G'$ or $y_0 \notin G'$. Moreover, $g_n$ coincides with the restriction of f to $K_n$, so
that by hypothesis $g_n^{-1}(G')$ is open in $K_n$, that is, of the form $K_n \cap U_n$ for some
open subset $U_n$ of E. Therefore only Borel sets occur in the above decomposition
of $g^{-1}(G')$ and we conclude that $g^{-1}(G')$ is a Borel set. This being true of every
open $G' \subset E'$, the Borel measurability of g follows from 7.2.
Now consider an arbitrary locally finite measure $\mu$ on $\mathscr{B}(E)$. According to 26.3,
$\mu$ is $\sigma$-finite. Lemma 17.6 therefore furnishes a strictly positive $\mu$-integrable real
function h on E. The measure $\nu := h\mu$ is then a finite Borel measure on E which
has exactly the same nullsets as $\mu$. The proven equivalence of (a) and (b) for the
measure $\nu$ therefore entails the validity of this equivalence for the measure $\mu$. Thus
the whole theorem is proved. □
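
The arithmetic behind the inductive step in (c)$\Rightarrow$(b) can be spelled out as follows. This is only a sketch of the bookkeeping, with $L = K_1 \cup \dots \cup K_n$ as in the proof.

```latex
% K' and K_{n+1} as chosen in the proof; note K_{n+1} \subset K' \setminus L is disjoint from L.
\begin{align*}
\complement(L \cup K_{n+1})
  &= \bigl(\complement K' \cap \complement L \cap \complement K_{n+1}\bigr)
     \cup \bigl(K' \cap \complement L \cap \complement K_{n+1}\bigr)
  && \text{(disjoint union)}, \\
\mu\bigl(\complement(K_1 \cup \dots \cup K_{n+1})\bigr)
  &\le \mu(\complement K') + \mu(K' \cap \complement L \cap \complement K_{n+1}) \\
  &< \frac{1}{2n+2} + \frac{1}{2n+2} = \frac{1}{n+1} .
\end{align*}
```

Since $f \mid K_{n+1}$ is continuous as a restriction of $f \mid K'$, the set $K_{n+1}$ has all the properties required at stage n + 1.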

Remarks. 1. The equivalence of (a) and (b) in Lusin's theorem may be lost if (a) is
strengthened to the $\mathscr{B}(E)$-$\mathscr{B}(E')$-measurability of f. It suffices to take for E the
compact set [0,1] x [0,1] and for $\mu$ the L-B measure $\lambda^2_E$. As was noted in the
second part of Remark 4, §8, E contains a $\mu$-nullset N which contains a non-Borel
subset. If M is such a set, its indicator function $f = 1_M$ is not Borel measurable,
although f is $\mu$-almost everywhere equal to the Borel measurable function $1_N$. On
the other hand, if f is $\mathscr{B}(E)$-$\mathscr{B}(E')$-measurable, there is a Polish topology $\tau$ on E,
stronger than the original but generating exactly the same Borel sets, such that f
is $\tau$-continuous. See 3.2.6 of SRIVASTAVA [1998] for the proof, which is not difficult.

2. The Dirichlet jump function (cf. Remark 1 of §16) is continuous at no point
of its domain of definition [0, 1], yet it is Borel measurable. This shows that in
assertion (c) of Lusin's theorem one cannot hope to be able to replace the continuity
of the function $f \mid K$ by the continuity of f at each point of K.
Exercises.
1. Show that every inner regular finite Borel measure on a Hausdorff space is outer
regular.
2. Show that in a Polish space E the Dirac measures are the only non-zero Borel
measures $\mu$ which take only the values 0 and 1. [Hint: Show that the system of all
compact $K \subset E$ such that $\mu(K) = 1$ is $\cap$-stable and investigate the intersection
of all its sets.]
3. Show that $\mathscr{B}(E \times E') = \mathscr{B}(E) \otimes \mathscr{B}(E')$ for any Polish spaces E, $E'$.
4. Consider K compact $\subset$ U open $\subset \mathbb{R}^d$, and for each $n \in \mathbb{N}$ let $V_n$ denote the
open ball of radius 1/n and center 0. Show that $K + V_n \subset U$ for some n. [Hint:
If $(K + V_n) \cap \complement U \ne \emptyset$ for every $n \in \mathbb{N}$, find $x_n \in K$, $v_n \in V_n$, $z_n \in \complement U$ such that
$x_n + v_n = z_n$, for every $n \in \mathbb{N}$. Some subsequence of $(x_n)$ converges to a point
$x_0 \in K$ and because $\complement U$ is closed we even have $x_0 \in K \cap \complement U$, which contradicts
the fact that $K \subset U$.]
5. Let $\mu$ be a locally finite Borel measure on a Polish space E and $f : E \to E'$
a mapping into a topological space $E'$ with a countable base. Show that
assertions (a) and (b) in Lusin's theorem are equivalent to (c'): For every $\varepsilon > 0$ and
every compact $K \subset E$ there is a further compact $K_\varepsilon \subset K$ such that $\mu(K \setminus K_\varepsilon) < \varepsilon$
and $f \mid K_\varepsilon$ is continuous.

§27. Properties of locally compact spaces

A topological space is called locally compact if it is Hausdorff and if each of its
points has at least one compact neighborhood. Examples of such spaces are the
euclidean space $\mathbb{R}^d$, every manifold (i.e., every locally euclidean Hausdorff space),
every discrete space, and every compact space.
When an arbitrary point is removed from a compact space the remainder is
a locally compact space. Actually every locally compact space is of this form. For
if $\mathscr{O}$ is the system of all open subsets of the locally compact space E and $\omega_0$ is any
(so-called ideal) point not in E, then a topology can be defined on $E' := E \cup \{\omega_0\}$
as follows: The system $\mathscr{O}'$ of open sets in $E'$ shall consist of $\mathscr{O}$ together with the sets
$E' \setminus K$ for all the compact subsets K of E. This defines a compact topology on $E'$,
E is an open subset of $E'$ and the topology that E inherits from $\mathscr{O}'$ is its original
topology. E was compact to start with if and only if $\omega_0$ is an isolated point in $E'$. If
E is not compact, then it is dense in $E'$. These claims are easily confirmed, or the
reader can consult KELLEY [1955], p. 150, or WILLARD [1970], 19.2. The space $E'$

is called, after its creator P.S. ALEXANDROFF (1896-1982), the (Alexandroff)
one-point compactification of E and $\omega_0$ its infinitely remote point.
We will pursue the further theory of locally compact spaces via this
compactification. First we study some distinguished continuous functions in this environment.
For an arbitrary topological space E we denote by
$C(E)$ and $C_b(E)$
the vector space of all, respectively all bounded, continuous real functions on E.

27.1 Definition. Let $f : E \to \mathbb{R}$ be a real function on a topological space E. The
set
(27.1) $\operatorname{supp}(f) := \overline{\{f \ne 0\}}$
is called the support of f.
The complement of supp(f) is thus the largest open set at every point of
which f takes the value zero. If E is locally compact, we will designate by
$C_c(E)$
the set of all $f \in C(E)$ with compact support supp(f). A function $f \in C(E)$ lies
in $C_c(E)$ just if there is some compact subset of E in the complement of which f is
identically zero.
Clearly
(27.2) $C_c(E) \subset C_b(E) \subset C(E)$,
since an $f \in C_c(E)$ is bounded on its compact support, hence throughout E.
$C_c(E)$ is a vector subspace of $C_b(E)$. More generally for any $n \in \mathbb{N}$, $\varphi \in C(\mathbb{R}^n)$
with $\varphi(0) = 0$ and $f_1, \dots, f_n \in C_c(E)$, the composition $\varphi(f_1, \dots, f_n)$ lies in $C_c(E)$,
and indeed its support is a subset of $\bigcup_{j=1}^{n} \operatorname{supp}(f_j)$. In particular, whenever
$u, v \in C_c(E)$ the functions $|u|$, $u \vee v$, $u \wedge v$, and therewith $u^+$ and $u^-$, all lie in $C_c(E)$.
The needed continuity of $\varphi(x, y) := x \vee y$ on $\mathbb{R}^2$ follows from the identity
$x \vee y = \tfrac{1}{2}(x + y + |x - y|)$.
In the special case of a compact space E, all three function spaces in (27.2)
coincide.
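
The lattice identity invoked above, together with its companion for the minimum, can be checked by a case distinction. The following sketch uses the conventions $u^+ = u \vee 0$ and $u^- = (-u) \vee 0$, which we assume to be those of the text.

```latex
% x, y are arbitrary real numbers.
\begin{align*}
x \ge y:&\quad \tfrac12\bigl(x + y + |x - y|\bigr) = \tfrac12\bigl(x + y + x - y\bigr) = x = x \vee y, \\
x < y:&\quad \tfrac12\bigl(x + y + |x - y|\bigr) = \tfrac12\bigl(x + y + y - x\bigr) = y = x \vee y, \\
&\quad x \wedge y = x + y - (x \vee y) = \tfrac12\bigl(x + y - |x - y|\bigr), \\
&\quad u^+ = u \vee 0 = \tfrac12\bigl(u + |u|\bigr), \qquad
       u^- = (-u) \vee 0 = \tfrac12\bigl(|u| - u\bigr).
\end{align*}
```

Each right-hand side is a continuous function of its arguments that vanishes at the origin, which is what the composition argument above requires.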

A fundamental property of the space C,.(E) is the following:

27.2 Theorem (on partitions of unity). Suppose that the compact subset K of
the locally compact space E is covered by the n open sets $U_1, \dots, U_n$, $n \in \mathbb{N}$. Then
there are functions $f_1, \dots, f_n \in C_c(E)$ with the following properties:
(27.3) $f_j \ge 0$ for $j = 1, \dots, n$;
(27.4) $\operatorname{supp}(f_j) \subset U_j$ for $j = 1, \dots, n$;
(27.5) $\sum_{j=1}^{n} f_j(x) \le 1$ for all $x \in E$;
n
(27.6) rfj(x) forallXEK.
j=1

Proof. We work in the one-point compactification $E' := E \cup \{\omega_0\}$ of E. The
given open sets together with $U_0 := E' \setminus K$ constitute an open cover of $E'$. Be-
cause compact spaces are normal topological spaces (cf. KELLEY [1955], p. 141
or WILLARD [1970], Theorem 17.10), this covering can be "shrunk" to an open
covering $U'_0, \ldots, U'_n$ of $E'$ satisfying
$\overline{U'_j} \subset U_j$ for each $j = 0, \ldots, n$,
where of course the bar denotes closure in $E'$. The theorem on partitions of unity in
normal spaces (KELLEY [1955], p. 171 or WILLARD [1970], 20 C) provides functions
$f'_0, \ldots, f'_n \in C(E')$ such that
(i) $f'_j \ge 0$, $\operatorname{supp}(f'_j) \subset U'_j$, for $j = 0, \ldots, n$;
(ii) $\sum_{j=0}^{n} f'_j(x) = 1$ for all $x \in E'$.
The restrictions $f_1, \ldots, f_n$ to E of $f'_1, \ldots, f'_n$ lie in $C(E)$ and it will be easy to show
that they have all the properties wanted. From (i) and (ii) properties (27.3)-(27.5)
follow almost immediately. One only has to notice that for each $j = 1, \ldots, n$
$\operatorname{supp}(f_j) = \operatorname{supp}(f'_j) \cap E \subset \overline{U'_j} \cap E = \overline{U'_j} \subset U_j$
since $\overline{U'_j} \subset U_j \subset E$. In particular, $\overline{U'_j}$, being a closed subset of the compact space $E'$,
is a compact subset of E. From $\operatorname{supp}(f_j) \subset \overline{U'_j}$ therefore follows the compactness
of this support. Thus $f_1, \ldots, f_n$ all lie in $C_c(E)$. The remaining property (27.6)
likewise follows from (ii) because $\operatorname{supp}(f'_0) \subset U_0 = E' \setminus K$ entails that $f'_0(x) = 0$
for all $x \in K$. □

Two consequences of the foregoing will turn out to be especially useful. The
first - known as Urysohn's lemma - often serves as the starting point for inductive
constructions of partitions of unity (see, e.g., RUDIN [1987], p. 39). The second can
also be proven directly, as indicated in Exercise 1 below.

27.3 Corollary 1. In the locally compact space E, let U be an open neighborhood of
the compact subset K. Then $C_c(E)$ contains a function f which satisfies
(27.7)    $0 \le f \le 1$, $f(K) = \{1\}$, and $\operatorname{supp}(f) \subset U$.
In particular, supp(f) is a compact neighborhood of K.

Proof. We have only to apply 27.2 for n = 1. Since $K \subset \{f_1 > 0\} \subset \operatorname{supp}(f_1)$, the
fact that $\{f_1 > 0\}$ is open means that $\operatorname{supp}(f_1)$ is indeed a neighborhood of K. □

27.4 Corollary 2. In the locally compact space E the compact subset K is covered
by the n open sets $U_1, \ldots, U_n$, $n \in \mathbb{N}$. Then K can be decomposed as $K = K_1 \cup
\ldots \cup K_n$ with $K_j$ a compact subset of $U_j$ for each $j = 1, \ldots, n$.

Proof. Let $f_1, \ldots, f_n \in C_c(E)$ be as provided by 27.2. The compact sets
$K_j := K \cap \operatorname{supp}(f_j)$, $j = 1, \ldots, n$
do what is wanted; for if $x \in K$, then $1 = f_1(x) + \ldots + f_n(x)$ means that $f_j(x) \neq 0$
for some j, and therefore $x \in K_j$.

For a locally compact space E there is another function space besides $C_c(E)$
that is of importance. To define it we assign to every bounded real function f on
an arbitrary space E its supremum norm, also called its uniform norm, via
$\|f\| := \sup_{x \in E} |f(x)|$.
The mapping $(f, g) \mapsto \|f - g\|$ makes $C_b(E)$ - more generally even the vector
space of all bounded real functions on E - into a metric space. One speaks of the
metric of uniform convergence (on E). That a sequence $(f_n)$ of bounded real functions
on E converges uniformly on E to a bounded function f just means that
$\lim_{n \to \infty} \|f_n - f\| = 0$.
27.5 Definition. A continuous real function f on a locally compact space E is
said to vanish at infinity if it lies in the closure $C_0(E)$ of $C_c(E)$ in $C_b(E)$ with
respect to the metric of uniform convergence. Denoting closure in this metric by
a bar, we thus have
$C_0(E) := \overline{C_c(E)} \subset C_b(E)$.

The terminology "vanishing at infinity" is both clarified and justified by

27.6 Theorem. For a real function f on a locally compact space E the following
statements are equivalent:
(a) $f \in C_0(E)$;
(b) $f \in C(E)$ and $\{|f| \ge \varepsilon\}$ is compact for each $\varepsilon > 0$;
(c) the function
$f'(x) := f(x)$ for all $x \in E$, $f'(\omega_0) := 0$,
is continuous on the one-point compactification $E'$ of E.

Proof. (a)$\Rightarrow$(b): Given $\varepsilon > 0$, there is by definition of $f \in C_0(E)$ a $g \in C_c(E)$ with
$\|f - g\| \le \varepsilon/2$. Every $x \in E$ satisfies $|f(x)| - |g(x)| \le |f(x) - g(x)| \le \|f - g\|$, so
we see that
$\{|f| \ge \varepsilon\} \subset \{|g| \ge \varepsilon/2\} \subset \operatorname{supp}(g)$.
This shows that $\{|f| \ge \varepsilon\}$ is a relatively compact set. But, due to the continuity
of f, it is also closed. Hence it is compact.
(b)$\Rightarrow$(c): Since the subspace topology of E in $E'$ is its original topology and E is
an open subset of $E'$, continuity of $f'$ at each point of E is assured by $f \in C(E)$. As
to continuity at the ideal point $\omega_0$, given $\varepsilon > 0$, we have $|f'(x) - f'(\omega_0)| = |f'(x)| < \varepsilon$
for all x in the set $E' \setminus \{|f| \ge \varepsilon\}$, which by definition of $E'$ is a neighborhood
of $\omega_0$, since $\{|f| \ge \varepsilon\}$ is a compact subset of E.
(c)$\Rightarrow$(a): Continuity of $f'$ at $\omega_0$ and the definition of the topology in $E'$ mean
that for each $\varepsilon > 0$ there is a compact $K \subset E$ such that $|f(x)| = |f'(x) - f'(\omega_0)| < \varepsilon$
for all $x \in E \setminus K$. 27.3 supplies a $g \in C_c(E)$ with $0 \le g \le 1$ and $g(K) = \{1\}$.
Then $fg \in C_c(E)$ and satisfies
$|f(x)g(x) - f(x)| = |f(x)|\,(1 - g(x)) < \varepsilon$
for all $x \in E$, so $\|fg - f\| \le \varepsilon$. As $\varepsilon > 0$ is arbitrary, this proves that $f \in \overline{C_c(E)} = C_0(E)$.
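To make the equivalence concrete, here is a small worked example that is not taken from the text: on $E = \mathbb{R}$ the function $f(x) = 1/(1+x^2)$ vanishes at infinity without having compact support. The cut-off functions $g_n$ are hypothetical helpers of the kind supplied by Corollary 27.3.

```latex
% Illustration (assumed example): E = \mathbb{R}, f(x) = 1/(1 + x^2).
\[
  \{|f| \ge \varepsilon\} =
  \begin{cases}
    \bigl[-\sqrt{1/\varepsilon - 1},\ \sqrt{1/\varepsilon - 1}\,\bigr], & 0 < \varepsilon \le 1,\\[2pt]
    \emptyset, & \varepsilon > 1,
  \end{cases}
\]
% which is compact in every case, so condition (b) of 27.6 holds.
% Since f > 0 everywhere, supp(f) = \mathbb{R} is not compact: f \in C_0(\mathbb{R}) \setminus C_c(\mathbb{R}).
%
% Approximation as in (c) => (a): choose g_n \in C_c(\mathbb{R}) with
% 0 \le g_n \le 1 and g_n = 1 on [-n, n]. Then
\[
  \|f g_n - f\| \;=\; \sup_{|x| > n} |f(x)|\,\bigl(1 - g_n(x)\bigr)
  \;\le\; \frac{1}{1 + n^2} \;\longrightarrow\; 0 \qquad (n \to \infty).
\]
```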

Exercises.
1. Without resort to partitions of unity, prove Corollary 27.4 directly. [Hint for
the case n = 2: Separate the disjoint compacta $K \setminus U_1$, $K \setminus U_2$ with disjoint open
neighborhoods $V_1$, $V_2$ and set $K_1 := K \setminus V_1$, $K_2 := K \setminus V_2$.]
2. Let $E' = E \cup \{\omega_0\}$ be the one-point compactification of a locally compact
space E. Describe the Borel sets in $E'$ by means of the Borel sets in E. In particular,
see how your description fits into the following general picture: For a measurable
space $(E, \mathscr{A})$, a point $\omega_0 \notin E$ and the set $E^{\omega_0} := E \cup \{\omega_0\}$, the $\sigma$-algebra $\mathscr{A}^{\omega_0}$
in $E^{\omega_0}$ generated by $\mathscr{A}$ and $\{\omega_0\}$ consists of all $A' \subset E^{\omega_0}$ such that $A' \cap E \in \mathscr{A}$.

§28. Construction of Radon measures on locally compact spaces

In what follows E will be a locally compact space. We consider a Borel measure $\mu$
(defined on $\mathscr{B}(E)$). Here the requirement $\mu(K) < +\infty$ for every compact set K
is the same as the local finiteness requirement, because every point of E has
a compact neighborhood and the implication (25.7) holds in general. So in the
present context the concepts of Borel measure and locally finite measure on $\mathscr{B}(E)$
coincide. The Radon measures on E are thus (cf. 25.3) those Borel measures which
are inner regular.
For a Borel measure $\mu$ every $u \in C_c(E)$ turns out to be $\mu$-integrable. For, being
continuous, u is Borel measurable. Denoting by K the compact support of u, we
have $|u| \le \|u\|\, 1_K$. Since $\mu$ is a Borel measure, $1_K$ is $\mu$-integrable, and the $\mu$-
integrability of u follows. Therefore corresponding to the Borel measure $\mu$ is a linear
form $I_\mu$ on $C_c(E)$ defined by

(28.1)    $I_\mu(u) := \int u \, d\mu$.

This is an isotone linear form in the sense of (12.3): From $u \le v$ follows $I_\mu(u) \le
I_\mu(v)$. Because of the linearity of $I_\mu$ this is equivalent to
$0 \le u \in C_c(E) \ \Rightarrow\ I_\mu(u) \ge 0$,
which is why $I_\mu$ is usually called a positive linear form.


This brings us to a key question for our further work: Is every positive linear
form on $C_c(E)$ an $I_\mu$ for some Borel measure $\mu$ on E, or are there possibly positive
linear forms of a completely different kind? Even for compact intervals J := [a, b]
on the number line, answering this question is by no means a trivial task. In this
case however, as early as 1909 F. Riesz showed (cf. RIESZ [1911]) that besides the
linear forms $I_\mu$ arising from Borel measures $\mu$ on J, there are no other positive
linear forms on $C_c(J) = C(J)$. One of our goals is to show that every locally
compact space E shares this property with J. The result in question will, in view
of this pioneering work, be called the Riesz representation theorem. En route to it
we will naturally be led to the construction of Radon measures on E.
Besides the locally compact space E, let now a positive linear form
$I : C_c(E) \to \mathbb{R}$
be given. What follows will prepare the way for the proof of the Riesz representa-
tion theorem.
For every compact $K \subset E$ we set
(28.2)    $\mu_*(K) := \inf\{I(u) : 1_K \le u \in C_c(E)\}$.
Such functions u exist thanks to Corollary 27.3. Consequently,
(28.3)    $0 \le \mu_*(K) < +\infty$.
Moreover, the mapping $K \mapsto \mu_*(K)$ is obviously isotone on the system $\mathscr{K}$ of all
compact sets. For an arbitrary $A \in \mathscr{P}(E)$ we set
(28.4)    $\mu_*(A) := \sup\{\mu_*(K) : K \text{ compact} \subset A\}$.
Because of the above noted isotoneity of $\mu_*$ on $\mathscr{K}$, this new definition is consistent
with (28.2). Finally, for $A \in \mathscr{P}(E)$ we define
(28.5)    $\mu^*(A) := \inf\{\mu_*(U) : A \subset U \text{ open}\}$.
Then $\mu_*$ and $\mu^*$ are isotone functions on $\mathscr{P}(E)$. Moreover
(28.6)    $\mu_*(A) \le \mu^*(A)$ for all $A \in \mathscr{P}(E)$,
as follows from the obvious fact that $\mu_*(A) \le \mu_*(U)$ for every open $U \supset A$; and
(28.7)    $\mu_*(U) = \mu^*(U)$ for all open $U \subset E$,
which follows from (28.5) and the isotoneity of $\mu_*$. Somewhat more effort is required
to check that
(28.8)    $\mu_*(K) = \mu^*(K)$ for all $K \in \mathscr{K}$.
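For every $\varepsilon > 0$ definition (28.2) supplies a $u \in C_c(E)$ with $u \ge 1_K$ and
$I(u) - \mu_*(K) \le \varepsilon$.
For $0 < \alpha < 1$, $U_\alpha := \{u > \alpha\}$ is an open superset of K and
$\alpha\, 1_{U_\alpha} \le u$.
If therefore L is a compact subset of $U_\alpha$, then $1_L \le \frac{1}{\alpha} u$ and so from (28.2)
$\mu_*(L) \le \frac{1}{\alpha} I(u)$. From definition (28.4) therefore
$\mu_*(U_\alpha) \le \frac{1}{\alpha} I(u)$
and so, since $K \subset U_\alpha$,
$0 \le \mu_*(U_\alpha) - \mu_*(K) \le \frac{1}{\alpha} I(u) - \mu_*(K) \le \frac{1}{\alpha}\bigl(\mu_*(K) + \varepsilon\bigr) - \mu_*(K)
= \bigl(\frac{1}{\alpha} - 1\bigr)\mu_*(K) + \frac{\varepsilon}{\alpha}$.
As $\alpha \uparrow 1$ this majorant converges to $\varepsilon$, which shows that
$\inf\{\mu_*(U) : K \subset U \text{ open}\} \le \mu_*(K) + \varepsilon$
holds for every $\varepsilon > 0$; that is,
$\mu^*(K) = \inf\{\mu_*(U) : K \subset U \text{ open}\} \le \mu_*(K)$.
This confirms (28.8), the reverse inequality being part of (28.6).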
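A concrete case may help in reading definition (28.2). The sketch below assumes $E = \mathbb{R}$ and takes for I the Riemann integral on $C_c(\mathbb{R})$; the trapezoid functions $u_\delta$ are hypothetical test functions introduced only for this illustration.

```latex
% Illustration (assumed setting): E = \mathbb{R}, I(u) = \int_{\mathbb{R}} u(x)\,dx, K = [a, b].
% Trapezoid test function: equal to 1 on K, falling linearly to 0 at distance \delta from K.
\[
  u_\delta(x) := \max\Bigl\{0,\ 1 - \tfrac{1}{\delta}\operatorname{dist}(x, K)\Bigr\},
  \qquad
  I(u_\delta) = (b - a) + 2\cdot\tfrac{\delta}{2} = (b - a) + \delta .
\]
% Upper bound: letting \delta \downarrow 0 in (28.2) gives \mu_*(K) \le b - a.
% Lower bound: every u \in C_c(\mathbb{R}) with u \ge 1_K is non-negative, hence
\[
  I(u) \;\ge\; \int_a^b 1 \, dx \;=\; b - a ,
  \qquad\text{so that}\qquad \mu_*([a, b]) = b - a .
\]
% The open sets used in the proof of (28.8) are here the intervals
% U_\alpha = \{u_\delta > \alpha\} = \bigl(a - (1-\alpha)\delta,\ b + (1-\alpha)\delta\bigr).
```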
Of critical importance is the following result:

28.1 Lemma. $\mu^*$ is an outer measure on E.

Proof. Obviously $\mu^*(\emptyset) = 0$, so what we have to prove is that
(28.9)    $\mu^*\bigl(\bigcup_{n \in \mathbb{N}} Q_n\bigr) \le \sum_{n=1}^{\infty} \mu^*(Q_n)$
holds for every sequence $(Q_n)$ in $\mathscr{P}(E)$. We proceed in three steps.
First step: For any two compact sets $K_1$, $K_2$
$\mu^*(K_1 \cup K_2) \le \mu^*(K_1) + \mu^*(K_2)$.
Consider any $u_j \in C_c(E)$ with $u_j \ge 1_{K_j}$ for j = 1, 2. Then $1_{K_1 \cup K_2} \le u_1 + u_2$,
so (28.2) says that
$\mu_*(K_1 \cup K_2) \le I(u_1 + u_2) = I(u_1) + I(u_2)$.
The claimed inequality now follows from (28.2) and (28.8).
Second step: For any finitely many open sets $U_1, \ldots, U_n$
$\mu^*(U_1 \cup \ldots \cup U_n) \le \mu^*(U_1) + \ldots + \mu^*(U_n)$.
It suffices to settle the case n = 2, as induction then takes care of the rest. If K is
a compact subset of $U_1 \cup U_2$, then 27.4 provides compact $K_j \subset U_j$, j = 1, 2, such
that $K = K_1 \cup K_2$. Then by the result of our first step
$\mu^*(K) \le \mu^*(K_1) + \mu^*(K_2) \le \mu^*(U_1) + \mu^*(U_2)$.
The claimed inequality (with n = 2) then follows from (28.8), (28.4) and (28.7).
Third step: Now we will prove (28.9). In doing so we may obviously assume that
$\mu^*(Q_n) < +\infty$ for every $n \in \mathbb{N}$. Given $\varepsilon > 0$, there then exist open $U_n \supset Q_n$ such
that
$\mu^*(U_n) \le \mu^*(Q_n) + 2^{-n}\varepsilon$ for every $n \in \mathbb{N}$.
The open set $U := \bigcup_{n \in \mathbb{N}} U_n$ contains $Q := \bigcup_{n \in \mathbb{N}} Q_n$. If now K is a compact subset
of U, then $K \subset U_1 \cup \ldots \cup U_n$ for sufficiently large n. From this it follows that
$\mu_*(K) = \mu^*(K) \le \mu^*(U_1 \cup \ldots \cup U_n) \le \sum_{j=1}^{n} \mu^*(U_j) \le \sum_{j=1}^{\infty} \mu^*(Q_j) + \varepsilon$,
where we used the second step. As this last inequality is satisfied by every compact
subset K of U, definition (28.4) and equation (28.7) give
$\mu_*(U) = \mu^*(U) \le \sum_{j=1}^{\infty} \mu^*(Q_j) + \varepsilon$,
and since $Q \subset U$ we will then have as well
$\mu^*(Q) \le \sum_{j=1}^{\infty} \mu^*(Q_j) + \varepsilon$.
Finally, $\varepsilon > 0$ being arbitrary here, (28.9) is proven. □

The next corollary sharpens the inequality proved in the first step above.

28.2 Corollary. For any two disjoint compact subsets $K_1$, $K_2$ of E
$\mu^*(K_1 \cup K_2) = \mu^*(K_1) + \mu^*(K_2)$.

Proof. Consider any $u \in C_c(E)$ satisfying
$u \ge 1_{K_1 \cup K_2} = 1_{K_1} + 1_{K_2}$.
According to 27.3 there is a $v \in C_c(E)$ with $0 \le v \le 1$, $v(K_1) = \{1\}$, and
$\operatorname{supp}(v) \subset \complement K_2$, hence with $v(K_2) = \{0\}$. The functions $vu$ and $(1 - v)u$ lie
in $C_c(E)$ and satisfy
$vu \ge 1_{K_1}$ and $(1 - v)u \ge 1_{K_2}$.
Therefore
$\mu_*(K_1) + \mu_*(K_2) \le I(vu) + I((1 - v)u) = I(u)$,
which, because of (28.2), has the consequence that
$\mu_*(K_1) + \mu_*(K_2) \le \mu_*(K_1 \cup K_2)$.
In view of (28.8) this inequality is half of the equality being claimed. The other
half is simply the subadditivity of the outer measure $\mu^*$.

The first important consequence of all this is:

28.3 Theorem. The restriction of $\mu^*$ to $\mathscr{B}(E)$ is a Borel measure.

The proof is immediate from Lemma 26.5 and the facts accumulated to this
point. Notice that (28.7) and (28.5) say that hypothesis (i) of 26.5 is fulfilled, while
(28.7), (28.8) and (28.4) insure that hypothesis (ii) of 26.5 is fulfilled.

The Borel measure $\mu^* \mid \mathscr{B}(E)$ has a series of further remarkable properties:

28.4 Theorem. Every Borel subset $A \subset E$ with $\mu^*(A) < +\infty$ satisfies
$\mu_*(A) = \mu^*(A)$.
Proof. Given $\varepsilon > 0$, there is an open $U \supset A$ such that
$\mu^*(U) - \mu^*(A) < \varepsilon/2$,
which, due to $\mu^*(A) < +\infty$ and $\mu^*$ being a measure on $\mathscr{B}(E)$, can be written as
$\mu^*(U \setminus A) = \mu^*(U) - \mu^*(A) < \varepsilon/2$.
From (28.4) we get compact $L \subset U$ such that
$\mu^*(U \setminus L) = \mu^*(U) - \mu^*(L) < \varepsilon/2$.
The set
$Q := (U \setminus A) \cup (U \setminus L)$
then satisfies $\mu^*(Q) < \varepsilon$. Hence there is an open $G \supset Q$ such that
$\mu^*(G) < \varepsilon$.
Now $K := L \setminus G$ is a (closed, hence) compact subset of L with the properties
(28.10)    $K \subset A$ and $A \setminus K \subset G$.
In fact, on the one hand
$K = L \setminus G \subset L \setminus Q \subset L \setminus (U \setminus A) = L \cap A$,
since $L \subset U$, and on the other hand
$A \setminus K = A \setminus (L \setminus G) = (A \cap G) \cup (A \setminus L) \subset G \cup (U \setminus L) = G$,
since $U \setminus L \subset Q \subset G$. From (28.10) we get
$\mu^*(A) - \mu^*(K) = \mu^*(A \setminus K) \le \mu^*(G) < \varepsilon$,
and so $\mu^*(A) < \mu^*(K) + \varepsilon \le \mu_*(A) + \varepsilon$. As $\varepsilon > 0$ was arbitrary, this says that
$\mu^*(A) \le \mu_*(A)$, which with (28.6) finishes the proof.

The finiteness hypothesis in the preceding theorem can be weakened. In doing so
we make use of the terminology introduced just before the proof of Theorem 13.6.

28.5 Corollary. The equality $\mu_*(A) = \mu^*(A)$ also holds for every $A \in \mathscr{B}(E)$
which has $\sigma$-finite $\mu^*$-measure.

Proof. The terminology means that there exist $A_n \in \mathscr{B}(E)$ ($n \in \mathbb{N}$), each of finite
$\mu^*$-measure, such that $A_n \uparrow A$. The preceding theorem and the isotoneity yield
$\mu^*(A_n) = \mu_*(A_n) \le \mu_*(A)$,
from which and the continuity of $\mu^*$ from below on $\mathscr{B}(E)$ follows
$\mu^*(A) = \sup_n \mu^*(A_n) \le \mu_*(A)$.
Together with (28.6) this proves the claimed equality.

Another central result, analogous to 28.3, emerges:

28.6 Theorem. The restriction of $\mu_*$ to $\mathscr{B}(E)$ is also a Borel measure.

Proof. Since all compact K satisfy $\mu_*(K) = \mu^*(K) < +\infty$, all that has to be
proved is that $\mu_* \mid \mathscr{B}(E)$ is a measure, i.e., that $\mu_*$ is countably additive on $\mathscr{B}(E)$.
To that end, let $(A_n)$ be a sequence of pairwise disjoint sets from $\mathscr{B}(E)$, whose
union is A. For every compact $K \subset A$, $K = \bigcup_{n \in \mathbb{N}} (K \cap A_n)$, so from 28.3 and 28.4
we get
$\mu_*(K) = \mu^*(K) = \sum_{n=1}^{\infty} \mu^*(K \cap A_n) = \sum_{n=1}^{\infty} \mu_*(K \cap A_n) \le \sum_{n=1}^{\infty} \mu_*(A_n)$.
Taking the supremum over such K on the left, (28.4) gives
$\mu_*(A) \le \sum_{n=1}^{\infty} \mu_*(A_n)$.
In proving the reverse inequality we may assume that $\mu_*(A) < +\infty$, and therefore
$\mu_*(A_n) < +\infty$ for every $n \in \mathbb{N}$. There is then, given $\varepsilon > 0$, a compact $K_n \subset A_n$
satisfying
$\mu_*(A_n) - \mu_*(K_n) < 2^{-n}\varepsilon$ for each $n \in \mathbb{N}$.
Since the sets $K_j$ are pairwise disjoint,
$\mu_*(A) \ge \mu_*\bigl(\bigcup_{j=1}^{n} K_j\bigr) = \mu^*\bigl(\bigcup_{j=1}^{n} K_j\bigr) = \sum_{j=1}^{n} \mu^*(K_j) = \sum_{j=1}^{n} \mu_*(K_j)
\ge \sum_{j=1}^{n} \mu_*(A_j) - \varepsilon \sum_{j=1}^{n} 2^{-j}$ for every $n \in \mathbb{N}$.
Letting $n \to \infty$ we infer that
$\mu_*(A) \ge \sum_{j=1}^{\infty} \mu_*(A_j) - \varepsilon$,
holding for every $\varepsilon > 0$. That is, $\mu_*(A) \ge \sum_{n=1}^{\infty} \mu_*(A_n)$, the complementary inequality
we needed to finish the proof.

We now set
(28.11)    $\mu_0 := \mu_* \mid \mathscr{B}(E)$ and $\mu^0 := \mu^* \mid \mathscr{B}(E)$
and, inspired by COURRÈGE [1962], call these the essential measure determined
by I and the principal measure determined by I, respectively. Each is a Borel
measure (28.3 and 28.6).

Obviously the essential measure $\mu_0$ is inner regular, hence is a Radon measure


on E. By contrast the principal measure µ° is outer regular. It turns out that µ°
is the more important of the two.
Thus to the given positive linear form I on CA(E) we have associated two Borel
measures. The further relation of these measures to I and the questions of whether
and when they coincide will be clarified in the next section. The closing lemma
of this section recasts definition (28.4), when A is open, into an equivalent form. It
has a preparatory character.

28.7 Lemma. Every open set $U \subset E$ satisfies
(28.12)    $\mu_0(U) = \mu^0(U) = \sup\{I(u) : u \in C_c(E),\ \operatorname{supp}(u) \subset U,\ 0 \le u \le 1\}$.

Proof. The first equality is just (28.7). Denote the right side of (28.12) by $\gamma$, and
consider any compact $K \subset U$. Corollary 27.3 provides a function $u \in C_c(E)$ with
$0 \le u \le 1$, $u(K) = \{1\}$ and $\operatorname{supp}(u) \subset U$. In particular, $1_K \le u$ and so by (28.2)
$\mu_*(K) \le I(u) \le \gamma$, that is, $\mu_*(K) \le \gamma$ for every such K. It follows that $\mu^0(U) =
\mu^*(U) = \mu_*(U) \le \gamma$, by (28.4). The reverse inequality $\gamma \le \mu^0(U)$ is derived as
follows: Let $u \in C_c(E)$ be a typical function involved in the definition of $\gamma$. Set
$L := \operatorname{supp}(u)$ and consider a typical $v \in C_c(E)$ involved in the definition (28.2)
of $\mu_*(L)$. Evidently then $u \le v$, so $I(u) \le I(v)$; that is, $I(u) \le \mu_*(L) = \mu_0(L) =
\mu^0(L) \le \mu^0(U)$. Taking the supremum over eligible u gives finally the desired
complementary inequality $\gamma \le \mu^0(U)$.

A sharpening of equality (28.12) will be presented in Exercise 2 of §29. The
special case U = E of lemma 28.7 furnishes the following useful description of the
total masses of $\mu_0$ and $\mu^0$:
(28.13)    $\|\mu_0\| = \|\mu^0\| = \sup\{I(u) : u \in C_c(E),\ 0 \le u \le 1\}$.
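For orientation, (28.13) and (28.2) can be evaluated by hand for the simplest positive linear form. The example below is an illustration under the assumption that I is evaluation at a fixed point $a \in E$ (the form belonging to a Dirac measure); it is not part of the original text.

```latex
% Illustration (assumed example): I(u) := u(a) for a fixed point a \in E.
% Total mass via (28.13): 0 \le u \le 1 forces u(a) \le 1, and Corollary 27.3
% (with K = \{a\}, U = E) supplies a u \in C_c(E) with 0 \le u \le 1 and u(a) = 1, so
\[
  \|\mu_0\| = \|\mu^0\| = \sup\{u(a) : u \in C_c(E),\ 0 \le u \le 1\} = 1 .
\]
% Compact sets via (28.2):
\[
  \mu_*(K) = \inf\{u(a) : 1_K \le u \in C_c(E)\} =
  \begin{cases}
    1, & a \in K,\\
    0, & a \notin K .
  \end{cases}
\]
% If a \in K, every admissible u has u(a) \ge 1, and 27.3 gives one with u(a) = 1.
% If a \notin K, apply 27.3 to the open neighborhood E \setminus \{a\} of K
% to get an admissible u with u(a) = 0.
```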

Exercises.
1. For a locally compact space E and a measure $\mu$ defined on $\mathscr{B}(E)$, show that $\mu$
is a Borel measure if and only if $C_c(E) \subset \mathscr{L}^1(\mu)$.
2. Let $\mu$ be a Radon measure on a locally compact space E and $(G_i)_{i \in I}$ a family
of open sets which is upward filtering, that is, for any $i, j \in I$ there is a $k \in I$ such
that $G_i \cup G_j \subset G_k$. Show that $G := \bigcup_{i \in I} G_i$ satisfies
$\mu(G) = \sup\{\mu(G_i) : i \in I\}$.
3. Using the preceding exercise, show that for any Radon measure $\mu$ on a locally
compact space E:
(a) There exists a largest open set G with $\mu(G) = 0$. The set $\complement G$ is called the
support of the measure $\mu$ and is denoted $\operatorname{supp}(\mu)$.
(b) A point $x \in E$ lies in $\operatorname{supp}(\mu)$ if and only if every open neighborhood of x has
positive $\mu$-measure.
(c) For a non-negative $f \in C(E)$, $\int f \, d\mu = 0$ if and only if f = 0 throughout
$\operatorname{supp}(\mu)$.
Determine $\operatorname{supp}(\lambda^d)$ for L-B measure $\lambda^d$ on $\mathbb{R}^d$, and $\operatorname{supp}(\varepsilon_a)$ for every Dirac
measure $\varepsilon_a$ on E.
4. Let $\mu$ be a Borel measure on a locally compact space E. Show that every set A
from the $\sigma$-ring $\rho_\sigma(\mathscr{K})$ generated by the system $\mathscr{K}$ of compact subsets of E is
a Borel set which satisfies $\mu_0(A) = \mu^0(A)$. Here a ring $\mathscr{R}$ in a set $\Omega$ is called a $\sigma$-
ring if the union of every sequence of sets in $\mathscr{R}$ is itself a set in $\mathscr{R}$. In complete
analogy with $\sigma$-algebras, every subset of $\mathscr{P}(\Omega)$ is contained in a smallest $\sigma$-ring.
Sometimes it is only the sets in $\rho_\sigma(\mathscr{K})$ which get called "Borel sets"; this is the
case, e.g., in the classic exposition of HALMOS [1974]. Why is it generally the case
that $\rho_\sigma(\mathscr{K}) \neq \mathscr{B}(E)$?

§29. Riesz representation theorem


Again let E be a locally compact space. Every Borel measure $\mu$ on E defines
a positive linear form
$I_\mu(u) := \int u \, d\mu$
on $C_c(E)$. The question posed in §28 was: Is it true that for every positive linear
form I on $C_c(E)$ there is a Borel measure $\mu$ on E such that $I_\mu = I$, that is, such
that
$I(u) = \int u \, d\mu$ for all $u \in C_c(E)$?
Any such Borel measure $\mu$ will be called a representing measure for I. The answer,
leaked earlier, to this question reads:

29.1 Riesz representation theorem. If E is a locally compact space, every
positive linear form I on $C_c(E)$ has at least one representing measure. In fact, both
the essential measure $\mu_0$ determined by I and the principal measure $\mu^0$ determined
by I are representing measures for I.

Proof. $\mu_0$ and $\mu^0$ are Borel measures. It must be shown that

(29.1)    $I(u) = \int u \, d\mu^0 = \int u \, d\mu_0$ for all $u \in C_c(E)$,

and because of linearity and the fact that the positive and negative parts of each
$u \in C_c(E)$ also lie in $C_c(E)$, it suffices to show this for non-negative u. So let such
a u be given and let the real number b > 0 be an upper bound for u. For
a given $\varepsilon > 0$ choose real numbers $y_0, \ldots, y_n$ with
$0 = y_0 < y_1 < \ldots < y_n = b$
and
$y_j - y_{j-1} < \varepsilon$ for each $j = 1, \ldots, n$.
We set
$u_j := (u - y_{j-1})^+ \wedge (y_j - y_{j-1})$    ($j = 1, \ldots, n$)
and get non-negative continuous functions, each having its support in supp(u),
which satisfy
(29.2)    $u = \sum_{j=1}^{n} u_j$,
as the following deliberations will confirm. If $x \in E$ and u(x) = 0, then $u_j(x) = 0$
for each $j = 1, \ldots, n$. If $x \in E$ and u(x) > 0, then there is a unique $j \in \{1, \ldots, n\}$
such that $y_{j-1} < u(x) \le y_j$. In that case $u_j(x) = u(x) - y_{j-1}$ and $u_k(x) = y_k - y_{k-1}$
for k < j and $u_k(x) = 0$ for k > j. Equality (29.2) follows. Next we set
$K_0 := \operatorname{supp}(u)$ and $K_j := \{u \ge y_j\}$ for $j = 1, \ldots, n$
and have

(29.3)    $(y_j - y_{j-1})\, 1_{K_j} \le u_j \le (y_j - y_{j-1})\, 1_{K_{j-1}}$ for $j = 1, \ldots, n$,

which becomes clear from considering the three properties
(29.4)    $0 \le u_j \le y_j - y_{j-1}$,
(29.5)    $\complement K_{j-1} \subset \{u_j = 0\}$,
(29.6)    $K_j \subset \{u_j = y_j - y_{j-1}\}$,
valid for $j = 1, \ldots, n$. Integrating in (29.3) with respect to $\mu^0$ gives

(29.7')    $(y_j - y_{j-1})\, \mu^0(K_j) \le \int u_j \, d\mu^0 \le (y_j - y_{j-1})\, \mu^0(K_{j-1})$,

and from (29.3) we will - momentarily - infer the analogous inequalities

(29.7'')    $(y_j - y_{j-1})\, \mu^0(K_j) \le I(u_j) \le (y_j - y_{j-1})\, \mu^0(K_{j-1})$,

valid for all $j \in \{1, \ldots, n\}$. The left half of (29.7'') follows from the left half of (29.3)
when account is taken of (28.2) and the fact that $\mu_*(K_j) = \mu^*(K_j) = \mu^0(K_j)$.
From (29.5) we have $\operatorname{supp}(u_j) \subset K_{j-1}$. For every open $U \supset K_{j-1}$, the function
$v := (y_j - y_{j-1})^{-1} u_j$ is therefore an element of $C_c(E)$ with $\operatorname{supp}(v) \subset U$ and
satisfying, by (29.4), $0 \le v \le 1$. From Lemma 28.7 then $I(v) \le \mu^0(U)$ and hence
$I(u_j) \le (y_j - y_{j-1})\, \mu^0(U)$.
According to (28.7) $\mu^0(U) = \mu_*(U)$ and therefore from (28.5) and the arbitrariness
of U we have confirmation of the right-hand side of (29.7''). Upon adding up the
inequalities in (29.7') and those in (29.7'') and recalling (29.2), we find that both
of the numbers $\int u \, d\mu^0$ and I(u) lie between
$\sum_{j=1}^{n} (y_j - y_{j-1})\, \mu^0(K_j)$ and $\sum_{j=1}^{n} (y_j - y_{j-1})\, \mu^0(K_{j-1})$
and consequently
$\bigl|\int u \, d\mu^0 - I(u)\bigr| \le \sum_{j=1}^{n} (y_j - y_{j-1})\, \mu^0(K_{j-1} \setminus K_j)$,
since $K_n \subset K_{n-1} \subset \ldots \subset K_0$. Due to the choice of the $y_j$ it follows that
$\bigl|\int u \, d\mu^0 - I(u)\bigr| \le \varepsilon \sum_{j=1}^{n} \mu^0(K_{j-1} \setminus K_j) = \varepsilon\, \mu^0(K_0 \setminus K_n) \le \varepsilon\, \mu^0(K_0)$.
The extreme inequality being valid for every $\varepsilon > 0$ and $\mu^0(K_0)$ being finite, the
desired equality
(29.8)    $I(u) = \int u \, d\mu^0$
emerges.
The measures of the compact sets $K_j$, $j = 0, \ldots, n$ do not change, thanks
to (28.8), when $\mu^0$ is replaced by $\mu_0$. Another pass through the preceding derivation
therefore leads to the conclusion that $\mu_0$ is also a representing measure for I. □
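The slicing of u into the layers $u_j$ is easiest to see in a small instance. The following sketch assumes $0 \le u \le 1$ (so b = 1) and the coarse subdivision $y_0 = 0$, $y_1 = \tfrac12$, $y_2 = 1$; the sample values 0.3 and 0.8 are chosen only for illustration.

```latex
% Illustration (assumed data): b = 1, n = 2, (y_0, y_1, y_2) = (0, 1/2, 1).
\[
  u_1 = u^{+} \wedge \tfrac12 = u \wedge \tfrac12, \qquad
  u_2 = \bigl(u - \tfrac12\bigr)^{+} \wedge \tfrac12 = \bigl(u - \tfrac12\bigr)^{+}
  \quad (\text{because } u \le 1).
\]
% Pointwise check of (29.2):
%   u(x) = 0.3:  u_1(x) = 0.3,  u_2(x) = 0,    sum = 0.3;
%   u(x) = 0.8:  u_1(x) = 0.5,  u_2(x) = 0.3,  sum = 0.8.
% Level sets: K_0 = supp(u), K_1 = \{u \ge 1/2\}, K_2 = \{u \ge 1\}.
%
% In general, the gap between the upper and lower bounds in (29.7') and (29.7'') telescopes:
\[
  \sum_{j=1}^{n} (y_j - y_{j-1})\bigl(\mu^0(K_{j-1}) - \mu^0(K_j)\bigr)
  \;\le\; \varepsilon \sum_{j=1}^{n} \bigl(\mu^0(K_{j-1}) - \mu^0(K_j)\bigr)
  \;=\; \varepsilon\,\bigl(\mu^0(K_0) - \mu^0(K_n)\bigr) .
\]
```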

These two representing measures can be characterized by extremality proper-
ties:

29.2 Lemma. Every representing measure $\mu$ for I satisfies
$\mu(K) \le \mu_0(K)$ and $\mu^0(U) \le \mu(U)$
for all compact subsets K and all open subsets U of E.

Proof. Given K and U, consider functions $u, v \in C_c(E)$ with $1_K \le v$, $0 \le u \le 1$,
and $\operatorname{supp}(u) \subset U$. Integrating these inequalities,
$\mu(K) \le \int v \, d\mu = I(v)$ and $I(u) = \int u \, d\mu \le \mu(U)$.
From (28.2) and Lemma 28.7 therefore the claimed inequalities follow. □

After this preparation we can enhance the statement of the Riesz representation
theorem by characterizing the measures $\mu_0$ and $\mu^0$, thereby putting into relief the
role of Radon measures.

29.3 Theorem. For every positive linear form I on $C_c(E)$ the associated essential
measure $\mu_0$ is the unique Radon measure among the representing measures of I.

Proof. Let $\mu$ be a representing measure for I which is inner regular, thus a Radon
measure. Since $\mu_0$ is also inner regular, it follows from the first part of the preceding
lemma that
$\mu(A) \le \mu_0(A)$ for every $A \in \mathscr{B}(E)$.
In particular then all open $U \subset E$ satisfy $\mu(U) \le \mu_0(U) \le \mu^0(U)$ and when this
is combined with the second part of 29.2 we have
(29.9)    $\mu(U) = \mu_0(U)$ for every open $U \subset E$.
If compact $K \subset E$ is given and U is an open, relatively compact neighborhood
of K, then $U \setminus K$ is open, so that (29.9) is applicable and
$\mu(U) - \mu(K) = \mu(U \setminus K) = \mu_0(U \setminus K) = \mu_0(U) - \mu_0(K)$.
Another appeal to (29.9), remembering that $\mu_0(U) < +\infty$, gives the equality
$\mu(K) = \mu_0(K)$,
valid for every compact $K \subset E$. This fact and the inner regularity of both measures
results in their equality. □

29.4 Theorem. Among all representing measures for a positive linear form I
on $C_c(E)$ the principal representing measure $\mu^0$ is characterized by each of the
following two properties:
(i) $\mu^0$ is the smallest among all outer regular representing measures.
(ii) $\mu^0$ is the unique outer regular representing measure $\mu$ which is inner regular
on open sets, that is, satisfies
(29.10)    $\mu(U) = \sup\{\mu(K) : K \text{ compact} \subset U\}$ for every open U.

Proof. Let $\mu$ be an outer regular representing measure. By Lemma 29.2, $\mu^0(U) \le
\mu(U)$ holds for all open sets U. Since, however, $\mu^0$ is also outer regular, that
inequality passes over to Borel sets generally:
$\mu^0(B) \le \mu(B)$ for all $B \in \mathscr{B}(E)$,
which confirms (i). If K is a compact set
$\mu(K) \le \mu_0(K) = \mu^0(K)$
by Lemma 29.2 and (28.8), so by what has already been proven equality prevails
here. That is, $\mu$ and $\mu^0$ coincide on the system $\mathscr{K}$ of all compact sets. Now $\mu^0$ satis-
fies the inner regularity condition for open sets in (29.10), as we know from (28.4),
(28.7) and (28.8). If $\mu$ also satisfies this condition, then for every open set U
$\mu(U) = \sup\{\mu(K) : U \supset K \in \mathscr{K}\} = \sup\{\mu^0(K) : U \supset K \in \mathscr{K}\} = \mu^0(U)$,
an equality which passes over to all Borel sets via the outer regularity of both
measures; i.e., $\mu = \mu^0$ on $\mathscr{B}(E)$.
Remark. 1. Some authors (cf. HEWITT and STROMBERG [1965] and COHN [1980])
employ the adjective "regular" for just those outer regular Borel measures $\mu$ that
have property (29.10), in contrast to our usage.

The following example shows that in general $\mu^0$ is not the only outer regular
representing measure.

Example. 1. Let E be an uncountable set and equip it with the discrete topology.
For I take the identically 0 form. Then from the last two theorems it follows that
$\mu_0 = \mu^0 = 0$. However the measure $\mu$ from Example 6 of §25 is an outer regular
representing measure which is not identically 0.

Example 1 - there $\mu_0$ and $\mu^0$ are identical - leads to the important question
whether the essential and the principal measures coincide in general, or under
appropriate supplemental conditions. Although according to 28.5 $\mu_0(A) = \mu^0(A)$
for all $A \in \mathscr{B}(E)$ having $\sigma$-finite $\mu^0$-measure, generally $\mu_0 \neq \mu^0$. An example due
to C.H. DOWKER (cf. the reference in EDWARDS [1953], p. 160) will be presented
in Exercise 7 below. Nevertheless in many important situations these measures do
coincide and we are going to look into this now.
We will encounter two types of supplemental hypotheses which will entail the
equality $\mu_0 = \mu^0$ on $\mathscr{B}(E)$. The first imposes conditions on the space E, but none
on the linear form I.
We already know, for example, that for a compact space E the representing
measures $\mu_0$ and $\mu^0$ determined by a given positive linear form I on $C_c(E)$ coincide.
This follows immediately from Theorem 28.4. The reasons that underlie this need
to be examined more closely.

29.5 Definition. A locally compact space is called countable at infinity (also
sometimes $\sigma$-compact) when it can be covered by a sequence of compact subsets.

Examples. 2. The following spaces are countable at infinity:
(i) every compact space;
(ii) the euclidean spaces $\mathbb{R}^d$, $d \in \mathbb{N}$: The closed balls with any fixed center and
integer radii provide a countable covering by compact sets.
(iii) every locally compact space with a countable basis $\mathscr{G}$. For $\mathscr{G}_0 := \{\overline{G} : G \in
\mathscr{G},\ G \text{ relatively compact}\}$ is a countable system of compact sets which covers E. In-
deed, each $x \in E$ possesses by definition a compact neighborhood V, and since $\mathscr{G}$ is
a basis, $x \in G \subset V$ for some $G \in \mathscr{G}$. Of course then $\overline{G} \in \mathscr{G}_0$.
3. A discrete space is countable at infinity just if it is a countable set.

Every subset A of a space E which is countable at infinity is of course covered
by a sequence of compact subsets of E, so from 28.5 we immediately get:

29.6 Theorem. If the locally compact space E is countable at infinity, then the
representing measures $\mu_0$ and $\mu^0$ determined by any positive linear form I on $C_c(E)$
coincide.

A simple consequence is:

29.7 Corollary. On a locally compact space E which is countable at infinity every
Radon measure (inner regular by definition) is also outer regular.

Proof. Every Radon measure $\mu$ on E defines a positive linear form $I_\mu$ on $C_c(E)$
of which it is a representing measure. According to 29.3 $\mu$ must coincide with
the essential measure $\mu_0$ determined by $I_\mu$. Since $\mu_0 = \mu^0$ and the latter is outer
regular, so must be $\mu$. □
To justify the terminology "countable at infinity" we sharpen the covering
condition featuring in Definition 29.5.

29.8 Lemma. Let E be a locally compact space which is countable at infinity.
Then E can be covered by a sequence $(L_n)_{n \in \mathbb{N}}$ of compact subsets each contained
in the interior of its successor. Every compact subset of E is therefore a subset of
some (hence of all but finitely many) $L_n$.

Proof. First of all there is a sequence $(K_n)$ of compact sets such that $K_n \uparrow E$.
Using Corollary 27.3 we find $0 \le u_n \in C_c(E)$ with $u_n \uparrow 1_E$. But then the sets
$L_n := \{u_n \ge 1/n\}$, $n \in \mathbb{N}$,
do what is wanted: Each is closed and, since $L_n \subset \operatorname{supp}(u_n)$, it is compact. Because
$(u_n)$ is isotone
$L_n \subset \{u_{n+1} \ge 1/n\} \subset \{u_{n+1} > 1/(n+1)\} \text{ open} \subset L_{n+1}$,
whence $L_n \subset \mathring{L}_{n+1}$, where $\mathring{A}$ denotes the interior of a set A. As a result, $(\mathring{L}_n)_{n \in \mathbb{N}}$
is an open covering of E, so finitely many of its sets suffice to cover any given
compact subset of E. □
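On the real line the construction in the proof can be carried out explicitly. The sketch below assumes $E = \mathbb{R}$ and uses hypothetical tent-shaped functions $u_n$, chosen directly rather than via Corollary 27.3, to exhibit the exhausting sequence $(L_n)$.

```latex
% Illustration (assumed example): E = \mathbb{R}.
% u_n equals 1 on [-n, n], falls linearly to 0 on n \le |x| \le n + 1, and u_n \uparrow 1_E.
\[
  u_n(x) := \min\bigl\{1,\ (n + 1 - |x|)^{+}\bigr\},
  \qquad
  L_n := \{u_n \ge 1/n\} = \bigl[-\bigl(n + 1 - \tfrac1n\bigr),\ n + 1 - \tfrac1n\bigr].
\]
% For n = 1 this gives L_1 = [-1, 1]. Each L_n lies in the interior of its successor, since
\[
  n + 1 - \tfrac1n \;<\; n + 2 - \tfrac{1}{n+1}
  \qquad\text{for every } n \in \mathbb{N},
\]
% and every compact subset of \mathbb{R}, being bounded, is contained in some L_n.
```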
A simple interpretation of countability at infinity now emerges: A locally com-
pact space E is countable at infinity if and only if the infinitely remote point $\omega_0$
in the one-point compactification $E'$ has a countable base of neighborhoods. Such
a countable neighborhood basis is furnished by the complements $E' \setminus L_n$ of any
sequence $(L_n)$ with the properties described in 29.8.

We come now to the second type of supplemental hypotheses. Here E is an
arbitrary locally compact space and conditions will be imposed on the positive
linear form I on $C_c(E)$.

29.9 Definition. A positive linear form I on $C_c(E)$ is called bounded if there is
a real number M such that
(29.11)    $|I(u)| \le M \|u\|$ for all $u \in C_c(E)$.

Here $\|f\|$ denotes the supremum norm of any bounded real function f on E.
The requirement (29.11) means that I is continuous with respect to the metric (of
uniform convergence) in $C_c(E)$ derived from this norm.

Remark. 2. If the space E is compact, then every positive linear form I on $C_c(E)$
is bounded, because $C_c(E) = C(E)$ so the constant function 1 lies in $C_c(E)$.
Therefore from $-\|u\| \cdot 1 \le u \le \|u\| \cdot 1$ and the positivity of I we infer that
$-\|u\|\, I(1) \le I(u) \le \|u\|\, I(1)$,
so that (29.11) holds with M := I(1).

The next theorem - like its predecessor - covers compact spaces as a special
case.

29.10 Theorem. If I is a bounded positive linear form on a locally compact
space E, then its principal representing measure $\mu^0$ is finite and coincides with the
essential measure $\mu_0$.

Proof. According to (28.13)
$\|\mu^0\| = \sup\{I(u) : 0 \le u \le 1,\ u \in C_c(E)\}$.
Since $0 \le u \le 1$ entails $\|u\| \le 1$, (29.11) says that $0 \le I(u) \le M \|u\| \le M$, and so
$\|\mu^0\| \le M < +\infty$.
Thus $\mu^0$ is a finite measure and the rest follows from 28.4. □
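That boundedness is a genuine restriction can be seen on the real line. The sketch below assumes $E = \mathbb{R}$ and takes for I the Riemann integral on $C_c(\mathbb{R})$; the functions $u_n$ are hypothetical test functions introduced for this illustration.

```latex
% Illustration (assumed example): E = \mathbb{R}, I(u) = \int_{\mathbb{R}} u(x)\,dx.
% Tent-shaped test functions: u_n = 1 on [-n, n], linear down to 0 on n \le |x| \le n + 1.
\[
  u_n(x) := \min\bigl\{1,\ (n + 1 - |x|)^{+}\bigr\},
  \qquad \|u_n\| = 1,
  \qquad I(u_n) = 2n + 2\cdot\tfrac12 = 2n + 1 .
\]
% No real M can satisfy |I(u_n)| \le M \|u_n\| = M for all n, so this I is not bounded
% in the sense of Definition 29.9. In line with (28.13), the total mass is infinite:
\[
  \|\mu^0\| = \sup\{I(u) : u \in C_c(\mathbb{R}),\ 0 \le u \le 1\} \;\ge\; \sup_{n} (2n + 1) = +\infty .
\]
% By contrast, the point evaluation u \mapsto u(a) satisfies |u(a)| \le \|u\| and is bounded with M = 1.
```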

Proceeding via $I_\mu$ as before (cf. 29.7) yields

29.11 Corollary. Every finite Radon measure $\mu$ on a locally compact space E is
also outer regular.

Indeed, the positive linear form $I_\mu$ on $C_c(E)$ defined by $\mu$ is bounded, by
$M := \|\mu\| < +\infty$:
$|I_\mu(u)| = \bigl|\int u \, d\mu\bigr| \le \int |u| \, d\mu \le \|u\|\, M$ for every $u \in C_c(E)$,
and we can conclude as in the proof of 29.7. □

Remarks. 3. From the proof of Theorem 29.10 it also follows that the total
mass $\|\mu^0\|$ of $\mu^0$ is the smallest real number $M \ge 0$ that can serve in Definition 29.9.
4. It is not to be expected that in every locally compact space E which is
countable at infinity every positive linear form on $C_c(E)$ will have exactly one
representing measure with no further qualification. Still less is unqualified unique-
ness of representing measures for bounded positive linear forms on $C_c(E)$, when E
is only a locally compact space, to be expected. There is a counterexample to
both in HALMOS [1974], p. 231 - DIEUDONNÉ [1939] is also cited there - in which
the space E is even compact: It is the interval $[1, \Omega]$ of all ordinal numbers not
greater than the first uncountable ordinal $\Omega$, equipped with the order topology.
The positive linear form $I_{\varepsilon_\Omega}$ on $C([1, \Omega])$ defined by the Dirac measure $\varepsilon_\Omega$ has
a representing measure $\mu$ which is neither inner regular nor outer regular. Thus
$\int f \, d\varepsilon_\Omega = \int f \, d\mu$ for all $f \in C([1, \Omega])$ although $\mu \neq \varepsilon_\Omega$. Details can be found in
PFEFFER [1977], p. 116.

In view of the last remark the following theorem is especially noteworthy, as well as useful:

29.12 Theorem. If the locally compact space E has a countable base for its topol-
ogy, then every Borel measure on E is regular, hence in particular a Radon mea-
sure.

Proof. Let $\mu$ be a Borel measure, $I_\mu$ the associated positive linear form on $C_c(E)$ and $\mu^0$ the principal representing measure for $I_\mu$. Along with E each of its open subspaces U also has a countable base. From Example 2 therefore U is countable at infinity; there exists a sequence $(K_n)$ of compact sets such that $K_n \uparrow U$. Since the measures $\mu$, $\mu^0$ are continuous from below, it follows that
$$\mu(U) = \lim_{n\to\infty} \mu(K_n) \quad\text{and}\quad \mu^0(U) = \lim_{n\to\infty} \mu^0(K_n).$$
But $\mu(K_n) \le \mu^0(K_n)$ for every $n \in \mathbb{N}$, by Lemma 29.2. So we get $\mu(U) \le \mu^0(U)$, from which and a second appeal to 29.2
$$\mu(U) = \mu^0(U) \quad\text{for every open } U \subset E. \tag{29.12}$$
For an arbitrary Borel set A and open $U \supset A$ we then have $\mu(A) \le \mu(U) = \mu^0(U)$ and so, on account of the outer regularity of $\mu^0$,
$$\mu(A) \le \mu^0(A) \quad\text{for every } A \in \mathscr{B}(E). \tag{29.13}$$

If $A \in \mathscr{B}(E)$ is relatively compact, we can choose an open relatively compact neighborhood U of A and apply the last inequality to $U \setminus A$, getting
$$\mu(U) - \mu(A) = \mu(U \setminus A) \le \mu^0(U \setminus A) = \mu^0(U) - \mu^0(A).$$
Subtracting (29.12) from this gives us the reverse inequality to (29.13). In summary,
$$\mu(A) = \mu^0(A) \quad\text{for every relatively compact } A \in \mathscr{B}(E). \tag{29.14}$$
Now, E is, as already noted, countable at infinity. So we have a sequence $(L_n)$ of compact sets which increase to E. (29.14) is applicable to $B \cap L_n$ for any Borel set B and any $n \in \mathbb{N}$. We therefore get
$$\mu(B) = \lim_{n\to\infty} \mu(B \cap L_n) = \lim_{n\to\infty} \mu^0(B \cap L_n) = \mu^0(B).$$
That is, $\mu$ and $\mu^0$ coincide throughout $\mathscr{B}(E)$. Since the essential measure $\mu_0$ is a representing measure for $I_\mu$, this fact insures (as does Theorem 29.6, for that matter) that $\mu_0 = \mu^0$. From the double equality $\mu = \mu^0 = \mu_0$ follows finally the regularity of $\mu$. □
In this situation the Riesz representation theorem can therefore be expressed
thus:

29.13 Corollary. For a locally compact space E whose topology has a countable base, every positive linear form I on $C_c(E)$ can be represented as

$$I(u) = \int u\,d\mu \qquad (u \in C_c(E))$$
by exactly one Borel measure $\mu$ on E.

Example. 4. For each $u \in C_c(\mathbb{R})$ choose real numbers $\alpha < \beta$ such that $\operatorname{supp}(u) \subset [\alpha, \beta]$ and define
$$L(u) := \int_\alpha^\beta u(x)\,dx,$$
the integral being the usual Riemann integral: it is independent of the specific numbers $\alpha$ and $\beta$ used. Evidently L is a positive linear form on $C_c(\mathbb{R})$. According to 16.4 L-B measure $\lambda^1$ represents L, and by 29.13 it is the only representing measure.
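
The independence of L(u) from the chosen interval can be checked numerically. The following is only a sketch: the tent function and the midpoint rule are illustrative choices that do not come from the text.

```python
def tent(x):
    """A function in C_c(R) with supp(tent) = [-1, 1] (illustrative choice)."""
    return max(0.0, 1.0 - abs(x))

def riemann_form(u, alpha, beta, steps=20_000):
    """Midpoint Riemann sum approximating the integral of u over [alpha, beta]."""
    h = (beta - alpha) / steps
    return h * sum(u(alpha + (i + 0.5) * h) for i in range(steps))

# Two different intervals, both containing supp(tent)
L_tight = riemann_form(tent, -1.0, 1.0)
L_wide = riemann_form(tent, -3.0, 5.0)
```

Both values approximate the exact integral 1, since the function vanishes outside its support and the extra part of the wider interval contributes nothing.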

Remark. 5. It is also possible to deduce Theorem 29.12 from Theorem 26.3 and its Corollary 26.4 because every locally compact space E whose topology has a countable base is Polish. In fact along with E, its one-point compactification E' also has a countable base, as follows from Lemma 29.8 and the commentary after it. It will be shown in Remark 3 of §31 that E' is consequently metrizable, and completeness of the metric follows easily from compactness (cf. Example 6, §26). Thus E' is Polish and E is an open subset of it. Therefore according to Example 4, §26 E itself is Polish.

Summarizing, we can say that for every locally compact space E, the mapping that associates to each Radon measure $\mu$ on E the positive linear form $I_\mu$ on $C_c(E)$ is a bijection between the set of Radon measures on E and the set of positive linear forms on $C_c(E)$. That is the reason why in BOURBAKI [1965] the positive linear forms on $C_c(E)$ are themselves designated as (positive) Radon measures.
If the space E is countable at infinity as well, the Radon measures on E are all outer regular. If moreover the topology of E has a countable base, the Radon measures and the Borel measures on E coincide.
We give now an application to integration that is of fundamental importance.

29.14 Theorem. For any regular Borel measure $\mu$ on a locally compact space E and any $p \in [1, +\infty[$, the vector space $C_c(E)$ is dense in $\mathscr{L}^p(\mu)$ with respect to convergence in $p$th mean.

Proof. First of all, $C_c(E) \subset \mathscr{L}^p(\mu)$, because $C_c(E) \subset \mathscr{L}^1(\mu)$ by (28.1) and $|u|^p \in C_c(E)$ whenever $u \in C_c(E)$. The denseness claim requires that for each $f \in \mathscr{L}^p(\mu)$ and each number $\varepsilon > 0$, a function $u \in C_c(E)$ be produced with

$$N_p(f - u) := \Bigl(\int |f - u|^p\,d\mu\Bigr)^{1/p} < \varepsilon.$$

We accomplish this by a stepwise simplification of the function f to be approximated. Since along with f, both $f^+$ and $f^-$ are in $\mathscr{L}^p(\mu)$, and $N_p$ is a semi-norm, we can assume that $f \ge 0$. By 11.3 and 11.6 there is an isotone sequence $(f_n)$ of $\mathscr{B}(E)$-elementary functions such that $f_n \uparrow f$. All these functions also lie in $\mathscr{L}^p(\mu)$, due to $0 \le f_n \le f$. Therefore from the dominated convergence theorem
$$\lim_{n\to\infty} N_p(f - f_n) = 0.$$
This makes it clear that only $\mathscr{B}(E)$-elementary functions need be approximated by $C_c(E)$, and because of the semi-norm properties of $N_p$ the matter even comes down to approximating the indicator functions $1_A$ of Borel sets A having $\mu(A) = [N_p(1_A)]^p < +\infty$. For such an A the outer regularity of $\mu$ supplies an open $U \supset A$ such that
$$[\mu(U) - \mu(A)]^{1/p} = N_p(1_{U \setminus A}) = N_p(1_U - 1_A) < \varepsilon/2.$$
In particular, $\mu(U) < +\infty$. Therefore the inner regularity of $\mu$ insures that for some compact $K \subset U$
$$[\mu(U) - \mu(K)]^{1/p} < \varepsilon/2,$$
that is,
$$N_p(1_U - 1_K) < \varepsilon/2.$$
Finally, we use 27.3 to select $u \in C_c(E)$ satisfying $1_K \le u \le 1_U$, whence
$$0 \le 1_U - u \le 1_U - 1_K$$

and so
$$N_p(1_U - u) < \varepsilon/2.$$
For the function $f = 1_A$ to be approximated we now have
$$N_p(f - u) \le N_p(1_A - 1_U) + N_p(1_U - u) < \varepsilon,$$
completing the proof. □
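
The last step of the proof can be made concrete on the real line with Lebesgue measure. In the sketch below, which is an illustration and not part of the text, we assume A = K = [0, 1] and U = ]-1/n, 1 + 1/n[, and take for u the trapezoid function with $1_K \le u \le 1_U$. A direct computation gives $N_p(1_A - u) = (2/(n(p+1)))^{1/p}$, which the midpoint rule reproduces.

```python
def trapezoid(x, n):
    """Continuous u with 1_K <= u <= 1_U for K = [0, 1], U = ]-1/n, 1 + 1/n[."""
    if 0.0 <= x <= 1.0:
        return 1.0
    if -1.0 / n < x < 0.0:
        return 1.0 + n * x
    if 1.0 < x < 1.0 + 1.0 / n:
        return 1.0 - n * (x - 1.0)
    return 0.0

def indicator(x):
    """Indicator function of A = [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def np_distance(n, p, steps=60_000):
    """Midpoint-rule estimate of N_p(1_A - u) on [-1, 2] for Lebesgue measure."""
    a, b = -1.0, 2.0
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += abs(indicator(x) - trapezoid(x, n)) ** p
    return (h * total) ** (1.0 / p)

def exact_distance(n, p):
    """Closed form (2 / (n (p + 1)))^(1/p) of the same semi-norm."""
    return (2.0 / (n * (p + 1))) ** (1.0 / p)

p = 2
distances = {n: np_distance(n, p) for n in (1, 10, 100)}
```

As n grows, U shrinks toward A and the distance tends to 0, which is the density statement for this particular indicator function.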

The proof actually uses the inner regularity of $\mu$ only on open sets. So what is involved here are conditions which according to 29.4(ii) characterize the principal representing measure. We will not pursue this any further, but interested readers can consult BOURBAKI [1965] and BAUER [1984], where this remark is placed in a more general framework.

Exercises.
1. Let E be an uncountable discrete space. Using the Borel measure from Example 6 in §25, show that every positive linear form on $C_c(E)$ has at least two different representing measures. This sharpens Example 1 of this section.
2. Let E be a locally compact space and I a positive linear form on $C_c(E)$. With the help of the Riesz representation theorem prove the following refinement of equality (28.12): For every open $U \subset E$
µ°(U) = $\sup\{I(u) : 0 \le u \le 1_U,\ u \in C_c(E)\}$.

3. A $K_\sigma$-set is a union of countably many compacta. Prove that in a locally compact space in which every open set is a $K_\sigma$-set, every Borel measure is regular. (Hint: Re-examine the proof of Theorem 29.12.)
4. Show that a locally compact space E is countable at infinity if and only if there exists a strictly positive function in $C_0(E)$.
5. Prove that for an arbitrary Borel measure $\mu$ on a locally compact space E the following two assertions are equivalent: (a) $\mu$ is finite. (b) $C_b(E) \subset \mathscr{L}^1(\mu)$. Show that if $\mu$ is a Radon measure, the assertion $C_0(E) \subset \mathscr{L}^1(\mu)$ is equivalent to each of (a) and (b).
6. Let E be a locally compact space, I a positive linear form on $C_0(E)$. Show that there is exactly one finite Radon measure $\mu$ on E such that $I(f) = \int f\,d\mu$ for every $f \in C_0(E)$. (Hints: Indirect proof. Or: For every $\varepsilon > 0$ and non-negative $f \in C_0(E)$ there is a $u \in C_c(E)$ with $|f - u| \le \varepsilon$.)
7. Let $E_1$, $E_2$ be the interval [0, 1] equipped with the discrete topology, respectively, the usual euclidean topology, and consider the product space $E = E_1 \times E_2$. Show that
(a) E is locally compact.
(b) Every product
xE := $\{x\} \times [0,1]$, $0 \le x \le 1$,
is a compact subspace of E, which is also open in E.

(c) A set $U \subset E$ is open if and only if $U \cap$ xE is open for each $x \in [0,1]$.
(d) Every compact subset of E is covered by finitely many of the sets xE.
Now consider $u \in C_c(E)$. By (d) u vanishes in the complement of the union of finitely many xE sets, and for each fixed x, $y \mapsto u(x, y)$ is a continuous function on the compact interval $E_2 = [0, 1]$. Therefore

$$I(u) := \sum_{0 \le x \le 1} \int_0^1 u(x, y)\,dy$$

is a well defined finite sum, evidently a positive linear form on $C_c(E)$. Show that
(e) The essential and the principal representing measures for I do not coincide. [Hint: Show that the set $A := E_1 \times \{0\}$ is closed and that $\mu_0(A) = 0$, while $\mu^0(A) = +\infty$.]
(f) In passing from $\mu^0$ to the Borel measure $1_B \mu^0$ for $B \in \mathscr{B}(E)$ outer regularity may be lost. [It suffices to consider $B := E \setminus A$, for the set A in the preceding hint.]

§30. Convergence of Radon measures

For locally compact spaces E we will henceforth use the notation $\mathscr{M}_+(E)$ for the set of all (positive) Radon measures on E. The Riesz representation theorem furnishes a canonical bijection of $\mathscr{M}_+(E)$ onto the set of all positive linear forms on $C_c(E)$. With $\mu, \nu \in \mathscr{M}_+(E)$ and real numbers $\alpha \ge 0$, $\beta \ge 0$ the measure $\alpha\mu + \beta\nu$ also lies in $\mathscr{M}_+(E)$, as is easily checked. That is, $\mathscr{M}_+(E)$ is what is called a convex cone. Besides $\mathscr{M}_+(E)$ we often consider the following subsets

$$\mathscr{M}_+^b(E) := \{\mu \in \mathscr{M}_+(E) : \mu(E) < +\infty\}$$

$$\mathscr{M}_+^1(E) := \{\mu \in \mathscr{M}_+(E) : \mu(E) = 1\},$$
the set of all finite (or bounded) Radon measures and the set of all Radon p-measures on E, respectively. Evidently

$$\mathscr{M}_+^1(E) \subset \mathscr{M}_+^b(E) \subset \mathscr{M}_+(E).$$

In $\mathscr{M}_+^1(E)$ are to be found all the Dirac measures on E. And $\mathscr{M}_+^b(E)$ is a convex subcone of $\mathscr{M}_+(E)$.
In the special case $E = \mathbb{R}^d$ the set $\mathscr{M}_+^b(\mathbb{R}^d)$ is the set of all finite Borel measures on $\mathbb{R}^d$, already familiar to us from §24. That the definition there is equivalent to the present one is due to Theorem 29.12, according to which every Borel measure on $\mathbb{R}^d$ is a Radon measure.
Depending on whether one thinks of the elements of $\mathscr{M}_+(E)$ as measures on $\mathscr{B}(E)$ or as positive linear forms on $C_c(E)$, two notions of convergence suggest themselves: One can define the convergence of a sequence $(\mu_n)$ in $\mathscr{M}_+(E)$ to

$\mu \in \mathscr{M}_+(E)$ by requiring either that

$$\lim_{n\to\infty} \mu_n(A) = \mu(A) \quad\text{for all } A \in \mathscr{B}(E)$$
or

$$\lim_{n\to\infty} \int f\,d\mu_n = \int f\,d\mu \quad\text{for all } f \in C_c(E).$$

We will forthwith show that the first of these is of limited interest, while the second is of considerable significance.

30.1 Definition. A sequence $(\mu_n)_{n\in\mathbb{N}}$ of Radon measures on E is said to be vaguely convergent to a Radon measure $\mu$ if

$$\lim_{n\to\infty} \int f\,d\mu_n = \int f\,d\mu \quad\text{for all } f \in C_c(E). \tag{30.1}$$

A sequence $(\mu_n)$ in $\mathscr{M}_+(E)$ is vaguely convergent just when the sequence of real numbers $(\int f\,d\mu_n)$ converges in $\mathbb{R}$ for every $f \in C_c(E)$. For in this case $f \mapsto \lim_n \int f\,d\mu_n$ evidently defines a positive linear form on $C_c(E)$, so by the Riesz representation theorem together with Theorem 29.3 there is a unique Radon measure $\mu$ to which $(\mu_n)$ vaguely converges. At the same time we see that a sequence in $\mathscr{M}_+(E)$ can have at most one vague limit.

Examples. 1. Let $(x_n)$ be a sequence in E, $x \in E$. If $(x_n)$ converges to x, then $(\varepsilon_{x_n})$ converges vaguely to $\varepsilon_x$, for the latter just amounts to $\lim f(x_n) = f(x)$ for every $f \in C_c(E)$. In general however $\lim \varepsilon_{x_n}(A) = \varepsilon_x(A)$ does not hold for all $A \in \mathscr{B}(E)$; in fact, if all $x_n$ are distinct from x, $A := \{x\}$ is such a set. Conversely, if $(\varepsilon_{x_n})$ vaguely converges to $\varepsilon_x$, then $(x_n)$ converges to x. For if this were not so, there would be a subsequence of $(x_n)$ which remains outside of some neighborhood U of x. 27.3 furnishes an $f \in C_c(E)$ with $f(x) = 1$ and $\operatorname{supp}(f) \subset U$. Evidently the sequence $(\int f\,d\varepsilon_{x_n}) = (f(x_n))$ does not converge to $\int f\,d\varepsilon_x$.
2. Let $(\alpha_n)$ be an arbitrary sequence of non-negative real numbers and $(x_n)$ a sequence in E with the property that $\{n \in \mathbb{N} : x_n \in K\}$ is finite for every compact $K \subset E$. (In other words, E is not compact and $\lim x_n = \omega_0 \in E'$.) Then the sequence of measures $\mu_n := \alpha_n \varepsilon_{x_n}$ ($n \in \mathbb{N}$) is vaguely convergent to the zero measure $\mu := 0$. For $\int f\,d\mu_n = \alpha_n f(x_n) = 0$ for all n except the finitely many for which $x_n \in \operatorname{supp}(f)$, whenever $f \in C_c(E)$.
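
Both examples can be traced on E = R. The sketch below is an illustration under assumptions of our own: the tent function f, the points $x_n = 1/n$ for Example 1, and the choice $x_n = n$, $\alpha_n = n^2$ for Example 2 do not come from the text.

```python
def f(x):
    """Tent function in C_c(R): f(0) = 1 and supp(f) = [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def integrate_dirac(func, point, weight=1.0):
    """Integral of func with respect to weight * (Dirac measure at point)."""
    return weight * func(point)

def dirac_mass(point, member):
    """Mass that the Dirac measure at `point` gives to a set (membership predicate)."""
    return 1.0 if member(point) else 0.0

# Example 1 with x_n = 1/n -> 0: integrals converge, masses of {0} do not.
integrals_ex1 = [integrate_dirac(f, 1.0 / n) for n in (1, 10, 100, 1000)]
masses_at_zero = [dirac_mass(1.0 / n, lambda y: y == 0.0) for n in (1, 10, 100, 1000)]

# Example 2 with x_n = n and alpha_n = n^2: the points leave every compact set.
integrals_ex2 = [integrate_dirac(f, float(n), weight=float(n * n)) for n in (1, 2, 10, 100)]
```

The integrals in Example 1 approach f(0) = 1 while the masses of {0} stay at 0, and in Example 2 every integral vanishes even though the total masses $n^2$ are unbounded.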

The fact, illustrated by Example 1, that the vague convergence of $(\mu_n)$ to $\mu$ does not generally entail the convergence of $(\mu_n(A))$ to $\mu(A)$ for each $A \in \mathscr{B}(E)$, while, as 30.2 will show, the converse is true, seems to indicate that the first mode of convergence mentioned above is too restrictive to be of much use. Actually, vague convergence of $(\mu_n)$ to $\mu$ follows just from knowing that $(\mu_n(A))$ converges to $\mu(A)$ for certain special sets $A \in \mathscr{B}(E)$. Even more:

30.2 Theorem. A sequence $(\mu_n)$ of Radon measures on a locally compact space E converges vaguely to a Radon measure $\mu$ if and only if the following condition is fulfilled:

$$\limsup_{n\to\infty} \mu_n(K) \le \mu(K) \quad\text{and}\quad \liminf_{n\to\infty} \mu_n(G) \ge \mu(G) \tag{30.2}$$

for every compact $K \subset E$ and every relatively compact, open $G \subset E$.

Proof. Suppose $(\mu_n)$ converges vaguely to $\mu$ and that K and G are any compact and open sets, respectively. Consider functions $u, v \in C_c(E)$ with $u \ge 1_K$, $0 \le v \le 1$ and $\operatorname{supp}(v) \subset G$. Then for all $n \in \mathbb{N}$

$$\mu_n(K) \le \int u\,d\mu_n \quad\text{and}\quad \int v\,d\mu_n \le \mu_n(G),$$

whence
$$\limsup_{n\to\infty} \mu_n(K) \le \int u\,d\mu \quad\text{and}\quad \int v\,d\mu \le \liminf_{n\to\infty} \mu_n(G).$$
From these inequalities (30.2) follows via (28.2) and (28.12). One only has to recall that the Radon measure $\mu$ coincides, thanks to Theorem 29.3, with the essential measure $\mu_0$ determined by the linear form $I_\mu$.
Now suppose conversely that condition (30.2) is fulfilled and that an $f \in C_c(E)$ has been given. Since our goal is to confirm (30.1), we lose no generality by assuming that $f \ge 0$. For a pre-assigned $\varepsilon > 0$ we choose finitely many numbers
$$0 = y_0 < y_1 < \ldots < y_k$$
with $y_k > \|f\|$ and $y_j - y_{j-1} = \varepsilon$ for each $j = 1, \ldots, k$. Set
$$K := \operatorname{supp}(f) \quad\text{and}\quad A_j := \{y_{j-1} \le f < y_j\} \cap K, \quad j = 1, \ldots, k.$$
Denoting the compact set $\{f \ge y_j\} \cap K$ by $K_j$ for $j = 0, \ldots, k$ (so $K_k = \emptyset$ and $K_0 = K$), we have $K_{j-1} \supset K_j$ and

$$A_j = K_{j-1} \setminus K_j \quad (j = 1, \ldots, k).$$
Because of the obvious inequalities
$$\sum_{j=1}^k y_{j-1} 1_{A_j} \le f \le \sum_{j=1}^k y_j 1_{A_j},$$
every Radon measure $\nu$ on E satisfies
$$\sum_{j=1}^k y_{j-1}\,\nu(A_j) \le \int f\,d\nu \le \sum_{j=1}^k y_j\,\nu(A_j),$$

from which and a simple calculation using the facts $\nu(A_j) = \nu(K_{j-1}) - \nu(K_j)$ and $y_j - y_{j-1} = \varepsilon$, we get
$$\varepsilon \sum_{j=0}^k \nu(K_j) - \varepsilon\,\nu(K) = \varepsilon \sum_{j=1}^k \nu(K_j) \le \int f\,d\nu \le \varepsilon \sum_{j=0}^k \nu(K_j).$$
For $\nu := \mu_n$ the right-hand inequality gives us
$$\int f\,d\mu_n \le \varepsilon \sum_{j=0}^k \mu_n(K_j) \quad\text{for all } n \in \mathbb{N},$$
and therefore from the first half of hypothesis (30.2)
$$\limsup_{n\to\infty} \int f\,d\mu_n \le \varepsilon \sum_{j=0}^k \mu(K_j).$$

But this right-hand side can be estimated by using the left end of the earlier chain of inequalities, with $\nu := \mu$. We thereby get

$$\limsup_{n\to\infty} \int f\,d\mu_n \le \int f\,d\mu + \varepsilon\,\mu(K),$$

valid for every $\varepsilon > 0$. Consequently,

$$\limsup_{n\to\infty} \int f\,d\mu_n \le \int f\,d\mu.$$

The complementary inequality that we need is

$$\int f\,d\mu \le \liminf_{n\to\infty} \int f\,d\mu_n$$
and we get it by an analogous procedure, using the second half of hypothesis (30.2). One sets $G_j := \{f > y_j\}$, $j = 0, \ldots, k$, which are open, relatively compact subsets of K with
$$G_{j-1} \setminus G_j = \{y_{j-1} < f \le y_j\} = \{y_{j-1} < f \le y_j\} \cap K.$$
These sets take over the role of the $K_j$. □
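
The level-set estimate at the heart of this proof can be verified exactly for a discrete measure. The sketch below is an illustration under our own assumptions: a measure $\nu$ with finitely many atoms, all lying in K, and exact rational arithmetic; the concrete numbers are made up.

```python
from fractions import Fraction

def sandwich(values, weights, eps):
    """Return (lower, integral, upper) for the level-set estimate.

    values[i] = f(x_i) >= 0 and weights[i] = nu({x_i}) for a discrete measure nu
    whose atoms x_i all lie in K; y_j = j * eps and K_j = {f >= y_j} ∩ K.
    """
    k = max(values) // eps + 1              # smallest k with y_k > max f

    def nu_K(j):
        return sum(w for v, w in zip(values, weights) if v >= j * eps)

    integral = sum(v * w for v, w in zip(values, weights))
    lower = eps * sum(nu_K(j) for j in range(1, k + 1))
    upper = eps * sum(nu_K(j) for j in range(0, k + 1))
    return lower, integral, upper

# Hypothetical data: four atoms with function values and masses
values = [Fraction(0), Fraction(1, 3), Fraction(5, 4), Fraction(2)]
weights = [Fraction(1), Fraction(2), Fraction(1, 2), Fraction(3)]
eps = Fraction(1, 10)
lower, integral, upper = sandwich(values, weights, eps)
```

The two bounds differ by exactly $\varepsilon\,\nu(K)$, which is why letting $\varepsilon \to 0$ closes the gap in the proof.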

The second example above (for the case in which, say, all the $\alpha_n$ equal 1) shows that a vaguely convergent sequence of measures from $\mathscr{M}_+^1(E)$ need not converge to a measure in $\mathscr{M}_+^1(E)$: mass can be lost. This illustrates the following general phenomenon:

30.3 Lemma. If the sequence $(\mu_n)_{n\in\mathbb{N}}$ of Radon measures on the locally compact space E converges vaguely to the measure $\mu \in \mathscr{M}_+(E)$, then the associated total masses satisfy
$$\|\mu\| \le \liminf_{n\to\infty} \|\mu_n\|. \tag{30.3}$$

Proof. For every $u \in C_c(E)$ with $0 \le u \le 1$

$$\int u\,d\mu_n \le \|\mu_n\|$$

holds for $n \in \mathbb{N}$, so from (30.1) follows that

$$\int u\,d\mu \le \liminf_{n\to\infty} \|\mu_n\|.$$

Take the supremum of these integrals over all such u and you get, according to (28.13), the total mass $\mu(E) = \|\mu\|$ of $\mu$. The inequality persists after this operation. □

Vague convergence of sequences in $\mathscr{M}_+(E)$ is convergence in a certain topology on $\mathscr{M}_+(E)$, called, naturally, the vague topology. It is defined as the coarsest topology on $\mathscr{M}_+(E)$ with respect to which all the mappings

$$\mu \mapsto \int f\,d\mu \qquad (f \in C_c(E)) \tag{30.4}$$

are continuous. A fundamental system of neighborhoods of a typical $\mu_0 \in \mathscr{M}_+(E)$ consists of all sets of the form

$$V_{f_1,\ldots,f_n;\varepsilon}(\mu_0) := \Bigl\{\mu \in \mathscr{M}_+(E) : \Bigl|\int f_j\,d\mu - \int f_j\,d\mu_0\Bigr| < \varepsilon,\ j = 1, \ldots, n\Bigr\} \tag{30.5}$$

in which $n \in \mathbb{N}$, $0 < \varepsilon \in \mathbb{R}$ and $f_1, \ldots, f_n \in C_c(E)$ are all arbitrary. The vague topology is Hausdorff because the uniqueness aspect of Riesz's theorem says that if $\mu$, $\nu$ are different Radon measures, then $I_\mu \ne I_\nu$, which just means that $\int f\,d\mu \ne \int f\,d\nu$ for some $f \in C_c(E)$.
In this context it is now clear too what should be understood by the vague convergence of a mapping $t \mapsto \mu_t$ of a subset A of a topological space T into $\mathscr{M}_+(E)$ when t converges to a point $t_0 \in \overline{A}$. With respect to the vague topology the convergence
$$\lim_{t \to t_0,\ t \in A} \mu_t = \mu$$

for some $\mu \in \mathscr{M}_+(E)$ just means that

$$\lim_{t \to t_0,\ t \in A} \int f\,d\mu_t = \int f\,d\mu \quad\text{for every } f \in C_c(E). \tag{30.6}$$

Example. 3. Let K be a non-negative $\lambda^d$-integrable, real function on $E := \mathbb{R}^d$ with $\int K\,d\lambda^d = 1$ (for example, the indicator function of the unit cube $[0,1]^d$). For every real $r > 0$ set
$$K_r(x) := r^d K(rx) \qquad (x \in \mathbb{R}^d).$$
Then $K_r$ is also non-negative and $\lambda^d$-integrable, and $\int K_r\,d\lambda^d = 1$ as well. To see this we only have to recall (7.10), according to which the homothety $H_r(x) := rx$ on $\mathbb{R}^d$ transforms L-B measure thus: $H_r(\lambda^d) = r^{-d}\lambda^d$. For from that it follows

that

$$\int K_r\,d\lambda^d = r^d \int K \circ H_r\,d\lambda^d = r^d \int K\,d(H_r(\lambda^d)) = \int K\,d\lambda^d = 1.$$

Now $r \mapsto K_r\lambda^d$ is a mapping of $]0, +\infty[$ into $\mathscr{M}_+^1(\mathbb{R}^d)$, and in the sense of the vague topology it satisfies
$$\lim_{r \to +\infty} K_r\lambda^d = \varepsilon_0. \tag{30.7}$$

To confirm this, first notice that for every $f \in C_c(\mathbb{R}^d)$

$$\int f K_r\,d\lambda^d = r^d \int f\,(K \circ H_r)\,d\lambda^d = r^d \int (f \circ H_r^{-1})\,K\,dH_r(\lambda^d) = \int (f \circ H_r^{-1})\,K\,d\lambda^d = \int f(r^{-1}x)\,K(x)\,\lambda^d(dx).$$

From this and the Lebesgue dominated convergence theorem the claim (30.7) follows upon checking that, on the one hand
$$\lim_{r \to +\infty} f(r^{-1}x)K(x) = f(0)K(x) \quad\text{for every } x \in \mathbb{R}^d,$$

and on the other hand for all real $r > 0$ and all $x \in \mathbb{R}^d$
$$|f(r^{-1}x)K(x)| \le \|f\|\,K(x),$$
so that $\|f\|\,K$ is an integrable majorant for all functions. The "approximation of the identity" $\varepsilon_0$ expressed by (30.7) plays an important role in Fourier analysis (cf. the exercises in §23 of BAUER [1996]). For the algebra $L^1(\lambda^d)$ (cf. Remark 2, §24) has no identity element with respect to convolution, but it is not hard to show that $\|K_r * f - f\|_1 \to 0$ as $r \to +\infty$ for each $f \in L^1(\lambda^d)$, and in many situations this is almost as useful as having an identity.
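
The limit (30.7) can be observed numerically in dimension d = 1. The sketch below is an illustration only: K is the indicator function of [0, 1] as suggested in the example, while the test function f and the midpoint rule are our own choices. It evaluates the transformed integral $\int f(r^{-1}x)K(x)\,dx$ for growing r.

```python
import math

def f(x):
    """A test function in C_c(R) with f(0) = 1 and support [-4, 4] (illustrative)."""
    return math.cos(x) * max(0.0, 1.0 - abs(x) / 4.0)

def K(x):
    """Indicator function of the unit interval [0, 1]; its integral is 1."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def integral_f_Kr(r, steps=2000):
    """Midpoint-rule value of the integral of f(x / r) * K(x) over [0, 1]."""
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h / r) * K((i + 0.5) * h) for i in range(steps))

values = {r: integral_f_Kr(r) for r in (1, 10, 100, 1000)}
errors = {r: abs(v - f(0.0)) for r, v in values.items()}
```

The errors shrink as r grows, in line with the dominated convergence argument: the integrand tends pointwise to f(0)K(x).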
To $\mathscr{M}_+^b(E)$ belong in particular all discrete Radon measures on E. These are the measures $\delta$ which can be represented in the form
$$\delta = \sum_{i=1}^k \alpha_i \varepsilon_{x_i},$$

for some finite number of points $x_1, \ldots, x_k \in E$ and non-negative real numbers $\alpha_1, \ldots, \alpha_k$. Every $\delta$ admits many such representations. Every Radon measure can be approximated, in the sense of the vague topology, by such $\delta$, as we next show.

30.4 Theorem. For every locally compact space E the set of discrete Radon measures on E is dense in $\mathscr{M}_+(E)$ in the vague topology.

Proof. Let a measure $\mu_0 \in \mathscr{M}_+(E)$ and a vague neighborhood V of $\mu_0$ be given. As noted after (30.5), we can suppose V is $V_{f_1,\ldots,f_n;1}(\mu_0)$ for some non-zero $f_1, \ldots, f_n \in C_c(E)$. We have to find a discrete measure $\delta$ in V. To that end, consider the compact set
$$K := \bigcup_{j=1}^n \operatorname{supp}(f_j)$$
and $\eta > 0$ such that $\eta\,\mu_0(K) < 1$. Every $y \in K$ has an open neighborhood $U_y$ in E such that $|f_j(y') - f_j(y'')| \le \eta$ for all $y', y'' \in U_y$ and all $j \in \{1, \ldots, n\}$. Finitely many $U_y$, say $U_{y_1}, \ldots, U_{y_k}$, suffice to cover K. Set
$$A_1 := K \cap U_{y_1},\quad A_2 := (K \cap U_{y_2}) \setminus A_1,\ \ldots,\ A_k := (K \cap U_{y_k}) \setminus (A_1 \cup \ldots \cup A_{k-1}).$$
These are pairwise disjoint, relatively compact Borel sets whose union is K, and for all $j \in \{1, \ldots, n\}$, $i \in \{1, \ldots, k\}$ and $y', y'' \in A_i$ the inequality $|f_j(y') - f_j(y'')| \le \eta$ holds. Since only these properties of the $A_i$ are used in the sequel, we can discard those that are empty (not all are because $\emptyset \ne K = A_1 \cup \ldots \cup A_k$), and re-index the others. That is, we can suppose all the $A_i$ are non-empty and then select a point $x_i \in A_i$ for each i. The discrete measure
$$\delta := \sum_{i=1}^k \mu_0(A_i)\,\varepsilon_{x_i}$$

(notice that $\mu_0(A_i)$ is finite because $A_i$ is relatively compact) will be shown to lie in V and that will complete the proof:

$$\Bigl|\int f_j\,d\mu_0 - \int f_j\,d\delta\Bigr| = \Bigl|\sum_{i=1}^k \int_{A_i} f_j\,d\mu_0 - \sum_{i=1}^k \mu_0(A_i) f_j(x_i)\Bigr| = \Bigl|\sum_{i=1}^k \int_{A_i} (f_j - f_j(x_i))\,d\mu_0\Bigr|$$

$$\le \sum_{i=1}^k \int_{A_i} |f_j - f_j(x_i)|\,d\mu_0 \le \eta \sum_{i=1}^k \mu_0(A_i) = \eta\,\mu_0(K),$$

using the fact that $|f_j(x) - f_j(x_i)| \le \eta$ for all $x \in A_i$, all $i \in \{1, \ldots, k\}$. This holds for each $j \in \{1, \ldots, n\}$, and $\eta\,\mu_0(K) < 1$ by choice of $\eta$. Therefore $\delta \in V_{f_1,\ldots,f_n;1}(\mu_0) = V$, as was to be shown. □
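
The construction of $\delta$ can be imitated for Lebesgue measure on the compact space E = [0, 1]. The sketch below is an illustration under our own assumptions: the sets $A_i$ are k intervals of length 1/k, the chosen points $x_i$ are their midpoints, and f(x) = x² is a made-up test function whose oscillation on each $A_i$ is at most $\eta = 2/k$.

```python
def discrete_approximation(k):
    """Atoms x_i and weights mu_0(A_i) for the k intervals A_i of length 1/k in [0, 1]."""
    atoms = [(i + 0.5) / k for i in range(k)]   # one chosen point in each A_i
    weights = [1.0 / k] * k                     # Lebesgue measure of each A_i
    return atoms, weights

def integrate_discrete(func, atoms, weights):
    """Integral of func with respect to the discrete measure sum_i weights[i] * eps_{atoms[i]}."""
    return sum(w * func(x) for x, w in zip(atoms, weights))

def f(x):
    return x * x      # continuous on [0, 1]; its Lebesgue integral is 1/3

errors = {}
for k in (1, 10, 100):
    atoms, weights = discrete_approximation(k)
    errors[k] = abs(integrate_discrete(f, atoms, weights) - 1.0 / 3.0)
```

Each error stays below the bound $\eta\,\mu_0(K) = 2/k$ from the proof; here the bound is far from sharp because midpoints were chosen.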

30.5 Corollary. The discrete p-measures on E are dense in $\mathscr{M}_+^1(E)$ in the vague topology.

Proof. We take over the notation of the preceding proof. Now $\mu_0$ is a measure in $\mathscr{M}_+^1(E)$, but the discrete measure $\delta = \sum \mu_0(A_i)\,\varepsilon_{x_i}$ may not be a p-measure, so more work is required. Set $\alpha_i := \mu_0(A_i)$, $i = 1, \ldots, k$. If $K = E$ (in which case E had to be a compact space), then $\alpha_1 + \ldots + \alpha_k = \mu_0(K) = 1$ and $\delta$ actually is a p-measure. In general what we have is
$$\alpha_1 + \ldots + \alpha_k = \mu_0(K) \le \mu_0(E) = 1$$

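and if $K \ne E$ we can choose another point, $x_{k+1} \in E \setminus K$, and set

$$\alpha_{k+1} := 1 - (\alpha_1 + \ldots + \alpha_k),$$
which is non-negative. Then
$$\delta' := \sum_{i=1}^{k+1} \alpha_i\,\varepsilon_{x_i}$$
is a discrete p-measure with $\int f_j\,d\delta = \int f_j\,d\delta'$ for each $j = 1, \ldots, n$, since $x_{k+1}$ lies outside the supports of all these functions. Consequently, $\delta \in V = V_{f_1,\ldots,f_n;1}(\mu_0)$ yields that also $\delta' \in V$. □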
Next we will investigate whether the equality (30.1) and the continuity assertion (30.4) remain valid for classes of continuous functions more general than $C_c(E)$. Recall in this connection that for a measure $\mu \in \mathscr{M}_+^b(E)$, every $f \in C_b(E)$ is $\mu$-integrable: it is $\mathscr{B}(E)$-measurable and its modulus is majorized by a real constant, hence $\mu$-integrable, function. We will formulate the relevant results for sequences only; their extensions to mappings $t \mapsto \mu_t$ are routine.

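30.6 Theorem. If a sequence $(\mu_n)_{n\in\mathbb{N}}$ in $\mathscr{M}_+^b(E)$ is vaguely convergent to $\mu \in \mathscr{M}_+(E)$ and if the sequence $(\|\mu_n\|)_{n\in\mathbb{N}}$ of total masses is bounded, then along with all the $\mu_n$ the measure $\mu$ is also finite, and for every $f \in C_0(E)$

$$\lim_{n\to\infty} \int f\,d\mu_n = \int f\,d\mu.$$

Proof. If we set $\alpha := \sup\{\|\mu_n\| : n \in \mathbb{N}\}$, which is finite, then $\|\mu\| \le \alpha$, by (30.3), so $\mu$ is a finite measure. Definition 27.5 says that for each $\varepsilon > 0$ there is a $g = g_\varepsilon \in C_c(E)$ such that $\|f - g\| \le \varepsilon$. Therefore

$$\Bigl|\int f\,d\mu_n - \int g\,d\mu_n\Bigr| \le \alpha\varepsilon \quad\text{for each } n \in \mathbb{N}$$
and
$$\Bigl|\int f\,d\mu - \int g\,d\mu\Bigr| \le \alpha\varepsilon,$$
so that via the triangle inequality

$$\Bigl|\int f\,d\mu_n - \int f\,d\mu\Bigr| \le 2\alpha\varepsilon + \Bigl|\int g\,d\mu_n - \int g\,d\mu\Bigr| \quad\text{for all } n \in \mathbb{N}.$$

Since the hypothesis of vague convergence means that $\int g\,d\mu_n \to \int g\,d\mu$, we get

$$\limsup_{n\to\infty} \Bigl|\int f\,d\mu_n - \int f\,d\mu\Bigr| \le 2\alpha\varepsilon,$$

valid for every $\varepsilon > 0$. That is, the limit exists and is 0. □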

Remarks. 1. If one considers measures $\mu_n$ and $\mu \in \mathscr{M}_+^b(E)$ without the hypothesis $\sup \|\mu_n\| < +\infty$, the above conclusion can fail. The special case of Example 2 in

which $E := \mathbb{R}$, $x_n := n$ and $\alpha_n := n$ for all $n \in \mathbb{N}$ illustrates this. For the function f defined by
$$f(x) := \min\{1, |x|^{-1}\} \text{ for } x \ne 0, \qquad f(0) := 1$$
lies in $C_0(\mathbb{R})$. But $\int f\,d\mu_n = 1$ for every $n \in \mathbb{N}$, while $\int f\,d\mu = 0$, because here the vague limit $\mu$ is the 0-measure.
2. Example 2, again with $E := \mathbb{R}$ and $x_n := n$ for all n, considered earlier, but this time with the constant sequence $\alpha_n := 1$, shows that indeed $\lim \int f\,d\varepsilon_{x_n} = \int f\,d\mu$ for the measure $\mu := 0$ and all $f \in C_0(\mathbb{R})$, but this equality is already false for the constant function $f := 1_E$ in $C_b(\mathbb{R})$.
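
Both remarks reduce to evaluating f at the points n, so they can be checked directly. The following sketch only restates the two examples in code; the list names are our own.

```python
def f(x):
    """f(x) = min(1, 1/|x|) for x != 0 and f(0) = 1; it lies in C_0(R) but not in C_c(R)."""
    return 1.0 if x == 0 else min(1.0, 1.0 / abs(x))

ns = (1, 10, 100, 1000)

# Remark 1: mu_n = n * (Dirac measure at n), so the total masses are unbounded.
unbounded_case = [n * f(n) for n in ns]

# Remark 2: mu_n = Dirac measure at n, so every total mass equals 1.
bounded_case_c0 = [f(n) for n in ns]          # integrals of f in C_0(R)
bounded_case_const = [1.0 for _ in ns]        # integrals of the constant function 1
```

In the first case the integrals stay at 1 although the vague limit is 0. In the second case the integrals of f tend to 0 as Theorem 30.6 predicts, while those of the constant function 1 do not.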

The passage from Co(E) to Cb(E) therefore calls for a special investigation,
which we stress by introducing a new definition:

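30.7 Definition. Let $\mu, \mu_1, \mu_2, \ldots$ be measures in $\mathscr{M}_+^b(E)$. The sequence $(\mu_n)_{n\in\mathbb{N}}$ is said to be weakly convergent to $\mu$ if

$$\lim_{n\to\infty} \int f\,d\mu_n = \int f\,d\mu \quad\text{for all } f \in C_b(E). \tag{30.8}$$

30.8 Theorem. Suppose the sequence $(\mu_n)_{n\in\mathbb{N}}$ in $\mathscr{M}_+^b(E)$ converges vaguely to the measure $\mu \in \mathscr{M}_+^b(E)$. Then the following statements are equivalent:
(i) The sequence $(\mu_n)$ converges weakly to $\mu$.
(ii) $\lim_{n\to\infty} \|\mu_n\| = \|\mu\|$.
(iii) For every $\varepsilon > 0$ there exists a compact subset $K = K_\varepsilon$ of E such that
$$\mu_n(E \setminus K) \le \varepsilon \quad\text{for all } n \in \mathbb{N}.$$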
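Proof. (i)⇒(ii) is obvious because $1 \in C_b(E)$.
(ii)⇒(iii): Let $\varepsilon > 0$ be given. The inner regularity and finiteness of $\mu$ yield that there is a compact subset L of E such that $\mu(E \setminus L) < \varepsilon$. According to 27.3, L has a compact neighborhood $K_0$, so there is an open set G with $L \subset G \subset K_0$. By (30.2)
$$\liminf_{n\to\infty} \mu_n(G) \ge \mu(G) \ge \mu(L) > \|\mu\| - \varepsilon,$$
so if we choose $\alpha \in \,]\|\mu\| - \varepsilon, \mu(L)[$ there will be an $n_0 \in \mathbb{N}$ such that $\mu_n(G) > \alpha$ for all $n \ge n_0$. Moreover, in view of (ii) this $n_0$ may be supposed large enough that $\|\mu_n\| < \alpha + \varepsilon$ for all $n \ge n_0$. Consequently, $\mu_n(K_0) \ge \mu_n(G) > \alpha > \|\mu_n\| - \varepsilon$, so that $\mu_n(E \setminus K_0) < \varepsilon$, for all $n \ge n_0$. For each $n \in \{1, \ldots, n_0\}$ inner regularity and finiteness of $\mu_n$ give us a compact $K_n \subset E$ such that $\mu_n(E \setminus K_n) < \varepsilon$. The compact set $K := K_0 \cup K_1 \cup \ldots \cup K_{n_0}$ then satisfies (iii).
(iii)⇒(i): Given $\varepsilon > 0$, let $K = K_\varepsilon$ be as described. Again from (30.2) we have $\mu(E \setminus K) \le \liminf \mu_n(E \setminus K) \le \varepsilon$. There is a function $u \in C_c(E)$ with $0 \le u \le 1$ and $u(K) = \{1\}$. It satisfies $0 \le 1 - u \le 1_{\complement K}$ and so for each $f \in C_b(E)$

$$\Bigl|\int (1 - u) f\,d\mu_n\Bigr| \le \|f\| \int (1 - u)\,d\mu_n \le \|f\|\,\mu_n(\complement K) \le \|f\|\,\varepsilon \quad\text{for all } n \in \mathbb{N}$$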

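and by the same argument

$$\Bigl|\int (1 - u) f\,d\mu\Bigr| \le \|f\|\,\varepsilon.$$

As in the preceding proof, the triangle inequality then gives

$$\Bigl|\int f\,d\mu_n - \int f\,d\mu\Bigr| \le 2\|f\|\,\varepsilon + \Bigl|\int uf\,d\mu_n - \int uf\,d\mu\Bigr| \quad\text{for all } n \in \mathbb{N}.$$

Since $uf \in C_c(E)$, the hypothesis of vague convergence insures that $(\int uf\,d\mu_n)$ converges to $\int uf\,d\mu$, so the preceding inequality yields

$$\limsup_{n\to\infty} \Bigl|\int f\,d\mu_n - \int f\,d\mu\Bigr| \le 2\|f\|\,\varepsilon,$$

valid for every $\varepsilon > 0$. That is, this limit exists and equals 0, for every $f \in C_b(E)$. Which proves (i). □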

30.9 Corollary. A sequence $(\mu_n)_{n\in\mathbb{N}}$ in $\mathscr{M}_+^1(E)$ is vaguely convergent to $\mu \in \mathscr{M}_+^1(E)$ if and only if it is weakly convergent to $\mu$.

Remark. 3. A sequence $(\mu_n)$ in $\mathscr{M}_+^b(E)$ which satisfies condition (iii) is called tight, whether or not any convergence is going on. If a tight sequence from $\mathscr{M}_+^1(E)$ vaguely converges to a measure $\mu \in \mathscr{M}_+(E)$, then first of all, $\|\mu\| \le 1$ by (30.3), so that $\mu \in \mathscr{M}_+^b(E)$. The preceding theorem then guarantees the weak convergence of $(\mu_n)$ to $\mu$ and therewith $\mu \in \mathscr{M}_+^1(E)$. In particular, with vaguely convergent tight sequences in $\mathscr{M}_+^b(E)$ no mass is lost (cf. the remark preliminary to Lemma 30.3). Consequences like these constitute the real significance of the tightness concept.

At this point it is worth returning once more to Theorem 30.2. If the measures $\mu$, $\mu_n$ there are all finite and of the same total mass, e.g., if they are all p-measures, then the two components of the compound condition (30.2) become equivalent. The result is the following portmanteau-theorem:

30.10 Theorem. Let $\mu, \mu_1, \mu_2, \ldots$ be measures in $\mathscr{M}_+^1(E)$. Then the following three assertions are equivalent:
(i) The sequence $(\mu_n)_{n\in\mathbb{N}}$ converges vaguely (and therefore also weakly) to $\mu$.
(ii) For every closed $F \subset E$
$$\limsup_{n\to\infty} \mu_n(F) \le \mu(F). \tag{30.9}$$
(iii) For every open $G \subset E$
$$\liminf_{n\to\infty} \mu_n(G) \ge \mu(G). \tag{30.9'}$$

Proof. The first paragraph of the proof of 30.2 actually established that (i)⇒(iii), under the less restrictive hypotheses prevailing there. Since that theorem further shows that the conjunction of (ii) and (iii) implies (i), it only remains to establish

the equivalence of (ii) and (iii). That follows from the trivial observation that
$$\nu(\complement A) = \nu(E) - \nu(A) = 1 - \nu(A)$$
holds for all $A \in \mathscr{B}(E)$ and all $\nu \in \mathscr{M}_+^1(E)$. □
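
That the inequalities (30.9) and (30.9') can be strict is visible already for Dirac measures on R. The sketch below is an illustration with sets of our own choosing; because each sequence of values is constant along the tail, the maximum and minimum over a finite tail stand in for lim sup and lim inf.

```python
def dirac(point):
    """Dirac measure at `point`, acting on sets given as membership predicates."""
    return lambda member: 1.0 if member(point) else 0.0

closed_F = lambda y: y == 0.0               # F = {0}, closed
open_G = lambda y: 0.0 < y < 2.0            # G = ]0, 2[, open
interval_Q = lambda y: -1.0 < y < 1.0       # Q = ]-1, 1[, its boundary {-1, 1} has mass 0

mu = dirac(0.0)                                   # weak limit of the sequence
tail = [dirac(1.0 / n) for n in range(2, 200)]    # mu_n = Dirac measure at 1/n

limsup_F = max(m(closed_F) for m in tail)
liminf_G = min(m(open_G) for m in tail)
values_Q = {m(interval_Q) for m in tail}
```

Here the closed set loses nothing in the limit ($0 \le 1$), the open set gains nothing ($1 \ge 0$), and on the interval Q, whose boundary carries no mass of the limit, the values agree with $\mu(Q)$.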

Example 1 in this section shows that the weak convergence of a sequence $(\mu_n)$ in $\mathscr{M}_+^b(E)$ to a $\mu \in \mathscr{M}_+^b(E)$ does not imply the convergence of $(\int f\,d\mu_n)$ to $\int f\,d\mu$ for every bounded Borel measurable function f. Nevertheless the continuity of the functions f which define weak convergence can be relaxed somewhat. To this end, we consider bounded, real-valued, Borel measurable functions f on E which are $\mu$-almost everywhere continuous for a $\mu \in \mathscr{M}_+^b(E)$: After excision of a $\mu$-nullset $N \in \mathscr{B}(E)$, f is continuous at each point of $E \setminus N$. Important examples of such are the indicator functions of boundaryless Borel sets. The latter are defined as follows:

30.11 Definition. A Borel subset Q of a locally compact space E is called boundaryless with respect to a measure $\mu \in \mathscr{M}_+^b(E)$, $\mu$-boundaryless (or $\mu$-quadrable) for short, if the boundary $Q^* := \overline{Q} \setminus \mathring{Q}$ of Q is a $\mu$-nullset:
$$\mu(Q^*) = 0. \tag{30.10}$$

Examples. 4. Every interval of the number line $\mathbb{R}$ is $\lambda^1$-boundaryless.
5. A set $Q \in \mathscr{B}(E)$ is boundaryless with respect to a Dirac measure $\varepsilon_a$ if and only if $a \in E \setminus Q^*$. Look back at Example 1 with this observation and the following theorem in mind.

30.12 Theorem. Suppose the sequence $(\mu_n)_{n\in\mathbb{N}}$ in $\mathscr{M}_+^b(E)$ converges weakly to $\mu \in \mathscr{M}_+^b(E)$. Then
$$\lim_{n\to\infty} \int f\,d\mu_n = \int f\,d\mu \tag{30.11}$$
holds for every bounded Borel measurable function f that is $\mu$-almost everywhere continuous on E. In particular,
$$\lim_{n\to\infty} \mu_n(Q) = \mu(Q) \tag{30.12}$$

holds for every $\mu$-boundaryless set $Q \in \mathscr{B}(E)$.

Proof. By hypothesis there is a Borel set $E_0 \subset E$ with $\mu(E \setminus E_0) = 0$ such that f is continuous at each point of $E_0$. Let $\varepsilon > 0$ be given. Since $\mu$ is a Radon measure, there is a compact $K \subset E_0$ with
$$\mu(E_0 \setminus K) < \varepsilon.$$
Every $x \in K$ has an open neighborhood $U_x$ on which the oscillation of f is at most $\varepsilon$, meaning that
$$|f(y_1) - f(y_2)| \le \varepsilon \quad\text{for all } y_1, y_2 \in U_x.$$

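Choose a compact neighborhood $V_x$ of x with $V_x \subset U_x$ and then use the compactness of K to find finitely many points $x_1, \ldots, x_n \in K$ such that $V_{x_1}, \ldots, V_{x_n}$ cover K. If we now set
$$\alpha := \inf f(E), \quad \beta := \sup f(E), \quad \alpha_j := \inf f(U_{x_j}), \quad \beta_j := \sup f(U_{x_j})$$
for $j = 1, \ldots, n$, then for each such j there exist functions $g_j, h_j \in C_b(E)$ satisfying
$$g_j(x) = \begin{cases} \alpha_j & \text{if } x \in V_{x_j} \\ \alpha & \text{if } x \in \complement U_{x_j} \end{cases} \quad\text{and}\quad h_j(x) = \begin{cases} \beta_j & \text{if } x \in V_{x_j} \\ \beta & \text{if } x \in \complement U_{x_j} \end{cases}$$

as well as
$$\alpha \le g_j \le \alpha_j \le \beta_j \le h_j \le \beta.$$
This follows at once from 27.3 and the application of an appropriate affine transformation in the range space $\mathbb{R}$. From these properties and definitions it follows in particular that $g_j \le f \le h_j$ for all j. Therefore if we set
$$g := g_1 \vee \ldots \vee g_n \quad\text{and}\quad h := h_1 \wedge \ldots \wedge h_n,$$
then both these functions lie in $C_b(E)$ and they satisfy $\alpha \le g \le f \le h \le \beta$. Moreover,
$$0 \le h(x) - g(x) \le \varepsilon \quad\text{for all } x \in K.$$
For each $x \in K$ lies in some $V_{x_j} \subset U_{x_j}$, and because of the way $U_{x_j}$ was chosen with respect to the oscillation of f, it follows that $h(x) - g(x) \le h_j(x) - g_j(x) = \beta_j - \alpha_j \le \varepsilon$. We are now in a position to finish the proof, as follows:

$$\int (h - g)\,d\mu = \int_K (h - g)\,d\mu + \int_{E \setminus K} (h - g)\,d\mu \le \varepsilon\,\mu(K) + (\beta - \alpha)\,\mu(E \setminus K) \le \varepsilon(\mu(E) + \beta - \alpha);$$
and, because $g \le f \le h$ and $g, h \in C_b(E)$, the weak convergence hypothesis gives

$$\int g\,d\mu = \lim_{n\to\infty} \int g\,d\mu_n \le \liminf_{n\to\infty} \int f\,d\mu_n \le \limsup_{n\to\infty} \int f\,d\mu_n \le \lim_{n\to\infty} \int h\,d\mu_n = \int h\,d\mu.$$

Of course we also have $\int g\,d\mu \le \int f\,d\mu \le \int h\,d\mu$. Putting all this together shows that any pair of the numbers $\int f\,d\mu$, $\liminf \int f\,d\mu_n$ and $\limsup \int f\,d\mu_n$ differ by at most $\varepsilon(\mu(E) + \beta - \alpha)$. Since $\varepsilon > 0$ is arbitrary, (30.11) holds. □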

Let us now look at an application of this theorem which relates the vague
convergence of p-measures on the number line to their Theorem 6.6 description
in terms of distribution functions. This is the way that weak (and hence vague)
convergence made its original historical appearance.

30.13 Theorem. Let $\mu, \mu_1, \mu_2, \ldots$ be measures in $\mathscr{M}_+^1(\mathbb{R})$, that is, probability measures on $\mathscr{B}^1$, and $F, F_1, F_2, \ldots$ their distribution functions. If the sequence $(\mu_n)_{n\in\mathbb{N}}$

converges weakly to $\mu$, then
$$\lim_{n\to\infty} F_n(x) = F(x) \tag{30.13}$$
holds for every $x \in \mathbb{R}$ at which F is continuous. If F is continuous throughout $\mathbb{R}$, then this convergence is uniform on $\mathbb{R}$.

Proof. According to Theorem 30.12, 1im p.,, (Q) = p(Q) for every p-boundaryless
set Q E .£1 and thus, after (6.11), lim F,,(x) = F(x) for every x E R such that the
interval Qx oo, x( is p-bounda.ryless. We have

] - oo, x] = Qx = n Q=+1 /k
kEN

and therefore
t (Qx) = klim u(Q.+1/k) =kin F(x + Ilk) .
Consequently, Q, is L-boundaryless just if the (isotone) function F is right-conti-
nuous at x, that is (since distribution functions are everywhere left-continuous),
just if x is a point of continuity of F. This proves the first assertion.
Let us now hypothesize that $F$ is continuous on the whole line, and let $\varepsilon > 0$ be given. First of all, (6.13) supplies numbers $a < b$ such that $F(a) < \varepsilon$ and $1 - F(b) < \varepsilon$. The uniform continuity of $F$ on the compact interval $[a,b]$ ensures that points $a = x_0 < x_1 < \ldots < x_k = b$ exist such that

$$F(x_j) - F(x_{j-1}) < \varepsilon \quad \text{for } j = 1, \ldots, k.$$

From what has already been proven we know that there exists $n_\varepsilon \in \mathbb{N}$ such that

$$|F_n(x_j) - F(x_j)| < \varepsilon \quad \text{for each } j \in \{0, \ldots, k\} \text{ and all } n \ge n_\varepsilon.$$

But then, as we will show, the inequality $|F_n(x) - F(x)| < 2\varepsilon$ prevails for every $x \in \mathbb{R}$ and all $n \ge n_\varepsilon$, which proves the uniform convergence of $(F_n)$ to $F$. For if $x < x_0$, then

$$0 \le F(x) \le F(x_0) < \varepsilon \quad\text{and}\quad 0 \le F_n(x) \le F_n(x_0) < F(x_0) + \varepsilon < 2\varepsilon,$$

that is, $|F_n(x) - F(x)| < 2\varepsilon$. A similar argument works if $x \ge x_k$. The remaining $x$ fall into $[x_{j-1}, x_j[$ for an appropriate $j \in \{1, \ldots, k\}$, so

$$F(x_{j-1}) \le F(x) \le F(x_j) < F(x_{j-1}) + \varepsilon$$

and

$$F(x_{j-1}) - \varepsilon < F_n(x_{j-1}) \le F_n(x) \le F_n(x_j) < F(x_j) + \varepsilon < F(x_{j-1}) + 2\varepsilon,$$

confirming that in this case too $|F_n(x) - F(x)| < 2\varepsilon$.
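
The uniform convergence asserted in Theorem 30.13 can be observed numerically. The following is only a sketch under assumptions that are not in the text: it takes $\mu := N(0,1)$ and $\mu_n := N(0,(1+1/n)^2)$, a standard weakly convergent family, and it replaces the supremum over $\mathbb{R}$ by a maximum over a finite grid. The helper names `normal_cdf` and `sup_distance` are illustrative.

```python
import math

def normal_cdf(x, sigma=1.0):
    """Distribution function of the centered normal law with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def sup_distance(sigma, grid):
    """Approximate sup_x |F_sigma(x) - F_1(x)| by a maximum over a finite grid."""
    return max(abs(normal_cdf(x, sigma) - normal_cdf(x)) for x in grid)

# Assumption: a grid on [-10, 10] with step 0.01 stands in for the whole line.
grid = [k / 100.0 for k in range(-1000, 1001)]

# Assumed example: mu_n := N(0, (1 + 1/n)^2), mu := N(0, 1).
indices = (1, 10, 100, 1000)
sup_errors = [sup_distance(1.0 + 1.0 / n, grid) for n in indices]

for n, err in zip(indices, sup_errors):
    print(n, err)
```

Since the limit distribution function is continuous, the printed grid maxima should shrink as $n$ grows, in line with the second assertion of the theorem.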

Remarks. 4. At a point x E R of discontinuity of F limit relation (30.13) generally


fails, as the example Ee := n E N, confirms.
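
A minimal sketch of one such failure, assuming the Dirac measures $\mu := \varepsilon_0$ and $\mu_n := \varepsilon_{-1/n}$ (these specific measures are an assumption chosen here for illustration). With the left-continuous convention $F(x) = \mu(]-\infty, x[)$ used in this book, $(\mu_n)$ converges weakly to $\mu$ because $f(-1/n) \to f(0)$ for every continuous $f$, yet $F_n(0) = 1$ for all $n$ while $F(0) = 0$.

```python
def dirac_distribution_function(a):
    """x -> eps_a(]-inf, x[): left-continuous distribution function of the Dirac measure at a."""
    return lambda x: 1.0 if x > a else 0.0

# Assumed limit measure mu = eps_0 and approximants mu_n = eps_{-1/n}.
F = dirac_distribution_function(0.0)

def F_n(n):
    return dirac_distribution_function(-1.0 / n)

indices = (1, 10, 100, 1000)

# At the discontinuity point x = 0 the values F_n(0) stay at 1, but F(0) = 0.
values_at_jump = [F_n(n)(0.0) for n in indices]

# At continuity points of F, such as x = -0.25 and x = 0.5, the limit relation holds.
values_left = [F_n(n)(-0.25) for n in indices]
values_right = [F_n(n)(0.5) for n in indices]
```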
5. Condition (30.12) for every $\mu$-boundaryless set $Q \in \mathcal{B}(E)$ is also sufficient for the weak convergence of the sequence $(\mu_n)$ to $\mu$ (cf. Exercise 6 below). The same is true of condition (30.13) (cf. Exercise 7).

The concept of weak convergence (with the same definition) is also meaningful
if E is a Polish space (or even just a metric space), provided the measures involved in
Definition 30.7 are all finite Borel measures on E. Only the uniqueness of limits
calls for discussion:

30.14 Lemma. Finite Borel measures $\mu$ and $\nu$ on a metric space $E$ are equal if
$\int f\,d\mu = \int f\,d\nu$ for all $f \in C_b(E)$.

Proof. Let $d$ be a metric giving the topology of $E$ and consider closed subsets $F \subset E$. Suppose we can always find a sequence $(f_n)$ in $C_b(E)$ with $f_n \downarrow 1_F$. Then it would follow from the hypothesis and from Lebesgue's dominated convergence theorem that $\mu(F) = \nu(F)$. The system of closed subsets $F$ of $E$ is an $\cap$-stable generator of the Borel $\sigma$-algebra $\mathcal{B}(E)$ and it contains the whole space $E$. The equality $\mu = \nu$ would thus follow from the uniqueness theorem 5.4.
It remains therefore to prove the existence of such sequences $(f_n)$, and we can suppose $F \ne \emptyset$. For this purpose we use the (uniformly) continuous antitone function $h : \mathbb{R} \to \mathbb{R}$ which is constantly 1 on $]-\infty, 0]$, constantly 0 on $[1, +\infty[$ and defined by $h(t) := 1 - t$ on $[0, 1]$, together with the function $x \mapsto d(x, F) := \inf\{d(x, y) : y \in F\}$. The latter is a (uniformly) continuous function on $E$, as we showed in the proof of Example 4, §26. Moreover, its zero-set is exactly $F$, because $F$ is closed. Evidently then the sequence of (uniformly) continuous functions

$$f_n(x) := h(n\,d(x, F)), \qquad x \in E,\ n \in \mathbb{N},$$

does what is wanted. □
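
The construction $f_n(x) = h(n\,d(x,F))$ can be tried out in a concrete case. The sketch below assumes $E = \mathbb{R}$ with the usual metric and the closed set $F = [0,1]$; both choices, and the helper names `h`, `dist_to_interval` and `f`, are illustrative assumptions rather than part of the proof.

```python
def h(t):
    """Antitone ramp: 1 on ]-inf, 0], 1 - t on [0, 1], 0 on [1, +inf[."""
    return min(1.0, max(0.0, 1.0 - t))

def dist_to_interval(x, a=0.0, b=1.0):
    """d(x, F) for the assumed closed set F = [a, b] in the real line."""
    return max(a - x, 0.0, x - b)

def f(n, x):
    """f_n(x) := h(n * d(x, F))."""
    return h(n * dist_to_interval(x))

# Inside F the functions are constantly 1; outside F they decrease to 0.
inside = [f(n, 0.5) for n in range(1, 6)]
outside = [f(n, 1.5) for n in range(1, 6)]
```

For a point outside $F$, such as $x = 1.5$ with $d(x,F) = 0.5$, the values are $0.5, 0, 0, \ldots$, so $f_n(x)$ reaches $0 = 1_F(x)$ as soon as $n \ge 1/d(x,F)$.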

Remarks. 6. The concept of $\mu$-boundaryless sets is also meaningful for finite Borel measures $\mu$ on Polish spaces. One easily convinces oneself that Theorem 30.12 remains valid in this new situation. In the proof one merely has to secure the existence of the needed functions $g_j$ and $h_j$ somewhat differently: To this end one engages Urysohn's lemma (WILLARD [1970], p. 102 or KELLEY [1955], p. 115).
7. Weak convergence in the set of finite Radon measures on a Polish or a locally compact space $E$ derives from a topology in the same way that vague convergence does. It is called, naturally, the weak topology and it is defined by letting $C_b(E)$ take over the role of $C_c(E)$ in (30.4).

Weak convergence in (non-locally compact) Polish spaces plays only a marginal role in this book, but is thoroughly investigated in BILLINGSLEY [1968] and PARTHASARATHY [1967].

Exercises.
1. Let E be a locally compact space, (µn)fEN a sequence in ..Wb(E) which is
vaguely convergent to µ E . +(E). If 11µI.11 !5 1 for every n E N, then R
o.D
exists and equals 1.
2. Let $(\alpha_n)$ be a convergent sequence of real numbers, with $\lim_{n\to\infty} \alpha_n = \alpha \in \mathbb{R}$. Further, let $(a_n)$ be a sequence of non-negative real numbers such that $a_1 > 0$ and the series $\sum a_n$ is divergent. Then

$$\lim_{n\to\infty} \frac{a_1\alpha_1 + \ldots + a_n\alpha_n}{a_1 + \ldots + a_n} = \alpha,$$

the case in which all $a_n = 1$ being the best known instance. Here is an outline for a measure-theoretic proof: The equations

$$\mu_n := \frac{a_1\varepsilon_1 + \ldots + a_n\varepsilon_n}{a_1 + \ldots + a_n}, \qquad n \in \mathbb{N},$$

define a sequence of measures in $\mathcal{M}^1_+(\mathbb{N})$ which vaguely converges to 0. Therefore according to 30.6, $\lim \int f\,d\mu_n = 0$ holds for every $f \in C_0(\mathbb{N})$. The relevant $f$ is the one defined by $f(n) := \alpha_n - \alpha$.
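
The limit relation of Exercise 2 is easy to probe numerically. The following sketch uses assumed data that are not in the text: $\alpha_k := 1 + 1/k$, so $\alpha = 1$, with the weights $a_k := 1$ (Cesàro means) and $a_k := 1/k$ (a divergent series with slowly growing partial sums). The function name `weighted_mean` is illustrative.

```python
def weighted_mean(weights, values):
    """(a_1*alpha_1 + ... + a_n*alpha_n) / (a_1 + ... + a_n)."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

n = 10_000
alpha = [1.0 + 1.0 / k for k in range(1, n + 1)]      # assumed sequence, alpha_k -> 1

ones = [1.0] * n                                      # a_k = 1: classical Cesaro means
harmonic = [1.0 / k for k in range(1, n + 1)]         # a_k = 1/k: sum diverges, but slowly

cesaro = weighted_mean(ones, alpha)
harmonic_weighted = weighted_mean(harmonic, alpha)
print(cesaro, harmonic_weighted)
```

Both means approach 1, but the second does so much more slowly, because the partial sums of $\sum 1/k$ grow only logarithmically.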
3. Let $E$ be a locally compact space and $T$ a subset of $C_c(E)$ with the following properties: Each compact $K \subset E$ has a relatively compact neighborhood $U$ such that every $f \in C_c(E)$ with $\operatorname{supp}(f) \subset K$ is uniformly approximable on $E$ by functions $t \in T$ whose supports lie in $U$; and further, there exists a $t \in T$ with $0 \le t \le 1$ and $t(K) = \{1\}$. Show that:
(a) A sequence $(\mu_n)$ in $\mathcal{M}_+(E)$ is vaguely convergent if and only if the sequence $(\int t\,d\mu_n)$ is convergent in $\mathbb{R}$ for every $t \in T$.
(b) For $E := \mathbb{R}$, the set of all continuously differentiable real-valued functions with compact support is a $T$ with the above properties.
4. With the help of Exercise 3 show that for the functions $f_n(x) := 1 - \sin(nx)$ on $\mathbb{R}$, the sequence $(f_n\lambda^1)_{n\in\mathbb{N}}$ converges vaguely to $\lambda^1$, and deduce from this the Riemann-Lebesgue lemma:

$$\lim_{n\to\infty} \int f(x)\sin(nx)\,dx = 0 \quad \text{for every } f \in \mathcal{L}^1(\lambda^1).$$

5. Let $\mu$ be a finite Radon measure on a locally compact space $E$. Prove that:
(a) The system of all $\mu$-boundaryless sets is an algebra in $E$.
(b) For every $f \in C_b(E)$ there is a countable set $A_f \subset \mathbb{R}$ such that $\{f > \alpha\}$ is $\mu$-boundaryless for every $\alpha \in \mathbb{R} \setminus A_f$. [Hint: For every finite set $\{\alpha_1, \ldots, \alpha_n\}$ of real numbers, $\sum_{j=1}^{n} \mu(\{f = \alpha_j\}) \le \mu(E) < +\infty$.]
6. $\mu, \mu_1, \mu_2, \ldots$ are finite Radon measures on the locally compact space $E$. Show that condition (30.12) is also sufficient for weak convergence; that is, from $\lim \mu_n(Q) = \mu(Q)$ for every $\mu$-boundaryless set $Q \subset E$ follows the weak convergence of $(\mu_n)$ to $\mu$. This is also true if $E$ is a Polish space. [Hints: Imitate the proof of Theorem 11.6 and show with the help of Exercise 5 that every $0 \le f \in C_b(E)$ is the uniform limit on $E$ of an isotone sequence $(u_n)$ in the vector space spanned by the indicator functions of the $\mu$-boundaryless sets.]
7. As an application of Exercise 6 show that in the context of Theorem 30.13
condition (30.13) there is also sufficient for the weak convergence of $(\mu_n)$ to $\mu$.
8. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of real numbers in $]0, 1[$. From $[0,1]$ delete the open interval $I_{11}$ centered at 1/2 having length $a_1$. There remain two disjoint closed intervals $J_{11}, J_{12}$. From $J_{1j}$ delete the open interval $I_{2j}$ of length $a_2\lambda^1(J_{1j})$ whose midpoint is that of $J_{1j}$ ($j = 1,2$). Then there remain four pairwise disjoint closed intervals $J_{21}, J_{22}, J_{23}, J_{24}$. From $J_{2j}$ delete the open interval $I_{3j}$ of length $a_3\lambda^1(J_{2j})$ whose midpoint is that of $J_{2j}$ ($j = 1,2,3,4$). Then there remain $8 = 2^3$ pairwise disjoint closed intervals $J_{3j}$, $j = 1, \ldots, 8$. Continuing in this way one gets for each $n \in \mathbb{N}$ pairwise disjoint closed intervals $J_{nj}$, $j = 1, \ldots, 2^n$. The set

$$C := \bigcap_{n\in\mathbb{N}} (J_{n1} \cup \ldots \cup J_{n2^n})$$

is called a generalized Cantor discontinuum, and if all $a_n = 1/3$ it is simply called the Cantor discontinuum. Prove that:
(a) $C$ is compact and non-void, but $C$ has void interior.
(b) $\lambda^1(C) = \lim_{n\to\infty} \prod_{j=1}^{n} (1 - a_j)$.
(c) $\lambda^1(C) = 0 \iff \sum_{n=1}^{\infty} a_n = +\infty$.
[Hint: Recall the inequalities $1 + a \le (1 - a)^{-1}$ and $1 - a \le e^{-a}$ for $0 \le a < 1$.]
(d) In case $\sum_{n=1}^{\infty} a_n < +\infty$, $U := \,]0,1[\, \setminus C$ is an open subset of $\mathbb{R}$ whose boundary $U^* := \overline{U} \setminus U$ is not a $\lambda^1$-nullset.
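
The first stages of the construction in Exercise 8 can be carried out exactly with rational arithmetic, which makes the product formula of part (b) visible at every finite stage. This is only a sketch; the function names `cantor_stage`, `total_length` and `product_formula` are illustrative.

```python
from fractions import Fraction

def cantor_stage(ratios):
    """Closed intervals J_{n1}, ..., J_{n 2^n} left after removing, at step i,
    the centered open interval of relative length ratios[i-1] from each interval."""
    intervals = [(Fraction(0), Fraction(1))]
    for a in ratios:
        next_intervals = []
        for lo, hi in intervals:
            gap = a * (hi - lo)          # length of the deleted open middle interval
            mid = (lo + hi) / 2
            next_intervals.append((lo, mid - gap / 2))
            next_intervals.append((mid + gap / 2, hi))
        intervals = next_intervals
    return intervals

def total_length(intervals):
    """Lebesgue measure of a finite union of pairwise disjoint intervals."""
    return sum(hi - lo for lo, hi in intervals)

def product_formula(ratios):
    """(1 - a_1) * ... * (1 - a_n)."""
    result = Fraction(1)
    for a in ratios:
        result *= 1 - a
    return result

thirds = [Fraction(1, 3)] * 3
stage = cantor_stage(thirds)
print(len(stage), total_length(stage), product_formula(thirds))
```

For the classical choice $a_n = 1/3$ the third stage consists of $2^3 = 8$ intervals of total length $(2/3)^3 = 8/27$.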
9. Construct an open subset of $]0,1[\, \times \,]0,1[$ whose boundary has positive $\lambda^2$-measure.

10. Let $E$ be a metric space, with metric $d$, and let $\mu, \mu_1, \mu_2, \ldots$ be p-measures on $\mathcal{B}(E)$. Show that each of the following is necessary and sufficient for the weak convergence of the sequence $(\mu_n)$ to $\mu$:
(a) $\lim \int f\,d\mu_n = \int f\,d\mu$ for all bounded functions $f$ which are uniformly continuous on $E$.
(b) $\limsup \mu_n(F) \le \mu(F)$ for all closed $F \subset E$.
(c) $\liminf \mu_n(G) \ge \mu(G)$ for all open $G \subset E$.
[Hints for (a)$\Rightarrow$(b): Re-examine the proof of 30.14. There it was shown how, for a closed non-empty $F \subset E$, to construct uniformly continuous functions $f_n$ satisfying $f_n \downarrow 1_F$.]

§31. Vague compactness and metrizability questions

We again consider a locally compact space $E$ along with its space $\mathcal{M}_+ = \mathcal{M}_+(E)$ of Radon measures, equipped with the vague topology. Our interest here is in the subsets of $\mathcal{M}_+$ which are compact or relatively compact in this topology. They are naturally called vaguely compact, resp., vaguely relatively compact.
A necessary condition for the vague relative compactness of a set $H \subset \mathcal{M}_+$ can be inferred at once from the very definition of the vague topology. According to it, for each $f \in C_c(E)$ the real function $\mu \mapsto \int f\,d\mu$ is continuous on $\mathcal{M}_+$. Therefore the image of any relatively compact $H$ under each such mapping must be a relatively compact subset of $\mathbb{R}$, that is, a bounded set. This observation leads to the following definition:

31.1 Definition. A set $H \subset \mathcal{M}_+(E)$ is called vaguely bounded (sometimes simply bounded) if

$$\sup_{\mu\in H} \Big|\int f\,d\mu\Big| < +\infty \quad \text{for every } f \in C_c(E). \tag{31.1}$$

Thus vague boundedness of a set $H \subset \mathcal{M}_+$ is a necessary condition for its vague relative compactness. We want to show that it is also sufficient:

31.2 Theorem. A set $H \subset \mathcal{M}_+(E)$ is vaguely relatively compact if and only if it is vaguely bounded.

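Proof. In view of the preceding, all that has to be shown is that vague relative compactness follows from the vague boundedness of $H$. To this end, let $\alpha_f$ denote the real number in (31.1), for each $f \in C_c(E)$, and $J_f$ the compact interval $[-\alpha_f, \alpha_f]$ in $\mathbb{R}$. Also denote the (vague) closure of $H$ in $\mathcal{M}_+$ by $\overline{H}$. First observe that

$$\int f\,d\mu \in J_f$$

for all $f \in C_c(E)$ and all $\mu \in \overline{H}$. In fact, if $f \in C_c(E)$ and $\varepsilon > 0$ are given,

$$V_{f;\varepsilon}(\mu) := \Big\{\nu \in \mathcal{M}_+ : \Big|\int f\,d\nu - \int f\,d\mu\Big| < \varepsilon\Big\}$$

is a vague neighborhood of $\mu$, so if $\mu \in \overline{H}$ then $H \cap V_{f;\varepsilon}(\mu) \ne \emptyset$. For any $\nu$ in this intersection, $\int f\,d\nu \in J_f$ and therefore

$$\Big|\int f\,d\mu\Big| \le \Big|\int f\,d\nu\Big| + \Big|\int f\,d\mu - \int f\,d\nu\Big| < \alpha_f + \varepsilon.$$

As the extreme inequality holds for every $\varepsilon > 0$, we see that $|\int f\,d\mu| \le \alpha_f$, that is, $\int f\,d\mu \in J_f$.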

Now consider the product space

$$P := \mathbb{R}^{C_c} = \prod_{f\in C_c} \mathbb{R}_f,$$

in which for each $f \in C_c = C_c(E)$ a copy $\mathbb{R}_f := \mathbb{R}$ of the number line appears as a factor. The product

$$J := \prod_{f\in C_c} J_f$$

is a subspace of $P$ which, as a product of compact spaces, is compact, by the famous Tychonoff theorem (KELLEY [1955], p. 139 or WRIGHT [1994]). To each Radon measure $\mu \in \mathcal{M}_+$ we assign the mapping $f \mapsto \int f\,d\mu$ of $C_c(E)$ into $\mathbb{R}$. This is a point in $P$. In this way a mapping

$$\Phi : \mathcal{M}_+ \to P$$

is defined which is injective by the Riesz representation theorem. On the basis of what was shown in the opening step of the proof,

$$\Phi(\overline{H}) \subset J.$$

Our goal will be realized if we can show that
(a) $\Phi$ maps $\mathcal{M}_+$ homeomorphically onto $\Phi(\mathcal{M}_+)$, and
(b) $\Phi(\mathcal{M}_+)$ is closed in $P$.
For then $\Phi(\overline{H})$, as a closed subset of $\Phi(\mathcal{M}_+)$, is also closed in $P$. From $\Phi(\overline{H})$ lying in the compact set $J$ it therefore follows that $\Phi(\overline{H})$ is compact, hence too its homeomorphic image $\overline{H}$.
As to (a): Continuity of a mapping $\Phi$ into a product means continuity of every "component" of $\Phi$, that is, of each mapping $\mu \mapsto \int f\,d\mu$ ($f \in C_c(E)$). But this is true right from the definition of the vague topology. Continuity of the mapping $\Psi$ inverse to $\Phi$ means continuity of each mapping

$$\Phi(\mu) \mapsto \int f\,d(\Psi(\Phi(\mu))) = \int f\,d\mu$$

of $\Phi(\mathcal{M}_+)$ into $\mathbb{R}$ ($f \in C_c(E)$). But this mapping is just the restriction to $\Phi(\mathcal{M}_+)$ of the projection of $P = \mathbb{R}^{C_c}$ onto its coordinate specified by $f$.
As to (b): Let $I \in P$ be a point in the closure of $\Phi(\mathcal{M}_+)$ in $P$. Then $I$ is a positive linear form on $C_c(E)$. To see its additivity, for example, let $f, g \in C_c(E)$ and $\varepsilon > 0$ be given. The set of all $I' \in P$ which satisfy

$$|I'(u) - I(u)| < \varepsilon \quad \text{for } u \in \{f, g, f + g\}$$

is a neighborhood of $I$ in $P$, and therefore contains a point $I' = \Phi(\mu)$ from $\Phi(\mathcal{M}_+)$. $I'$ is thus the positive linear form

$$u \mapsto I'(u) = \int u\,d\mu$$

on $C_c(E)$. That means that we have

$$\begin{aligned}
|I(f+g) - I(f) - I(g)| &\le |I(f+g) - I'(f+g)| + |I'(f+g) - I(f) - I(g)| \\
&= |I(f+g) - I'(f+g)| + |I'(f) - I(f) + I'(g) - I(g)| \\
&< \varepsilon + |I'(f) - I(f)| + |I'(g) - I(g)| < 3\varepsilon,
\end{aligned}$$

and because $\varepsilon > 0$ is arbitrary, the extreme inequality means that its left-hand side must be 0. In a completely analogous way one proves that $I(\alpha f) = \alpha I(f)$ for every $\alpha \in \mathbb{R}$, $f \in C_c(E)$, and $I(g) \ge 0$ for every non-negative $g \in C_c(E)$. With the linearity and positivity of $I$ confirmed, the Riesz representation theorem supplies a Radon measure $\nu \in \mathcal{M}_+$ such that $\Phi(\nu) = I$. That is, $I$ lies in $\Phi(\mathcal{M}_+)$, confirming that the latter is closed in $P$. □

31.3 Corollary. For every real number $\alpha \ge 0$ the set

$$\{\mu \in \mathcal{M}_+(E) : \|\mu\| \le \alpha\}$$

is vaguely compact.

Proof. For every $f \in C_c(E)$ and every $\mu$ in this set, $|\int f\,d\mu| \le \int |f|\,d\mu \le \alpha\|f\|$. Consequently, the set is vaguely bounded, hence vaguely relatively compact. What therefore remains to be confirmed is its closedness in $\mathcal{M}_+$. According to (28.13) it is just the set of all $\mu \in \mathcal{M}_+$ such that $\int u\,d\mu \le \alpha$ holds for all $[0,1]$-valued $u \in C_c(E)$. Because the mapping $\mu \mapsto \int u\,d\mu$ of $\mathcal{M}_+$ into $\mathbb{R}$ is continuous, the set $\{\mu \in \mathcal{M}_+ : \int u\,d\mu \le \alpha\}$ is closed, for each $u \in C_c(E)$, and by the preceding observation the set in question is an intersection of such sets, those for which $u(E) \subset [0, 1]$. Thus it is indeed (vaguely) closed. □

Remark. 1. The set of all measures $\mu \in \mathcal{M}_+(E)$ with $\|\mu\|$ equal to a fixed positive number $\alpha$ is vaguely closed if $E$ is compact (because in that case $1_E \in C_c(E)$). Example 2 of §30, with all the $\alpha_n$ there equal to $\alpha$, illustrates this.
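
Compactness of $E$ cannot simply be dropped here. As a sketch of what can go wrong, assume $E = \mathbb{R}$ and the Dirac measures $\mu_n := \varepsilon_n$ (an illustration chosen here, not a quotation of Example 2 of §30). Each $\mu_n$ has total mass 1, yet $\int f\,d\mu_n = f(n)$ vanishes as soon as $n$ lies outside the support of $f \in C_c(\mathbb{R})$, so the vague limit is the zero measure. The test function `tent` is a hypothetical choice.

```python
def tent(x, radius=2.0):
    """A continuous function with compact support [-radius, radius] and tent(0) = 1."""
    return max(0.0, 1.0 - abs(x) / radius)

def integrate_dirac(f, point):
    """Integral of f with respect to the Dirac measure eps_point."""
    return f(point)

# Assumed sequence mu_n = eps_n on E = R: the unit mass escapes to infinity.
integrals = [integrate_dirac(tent, float(n)) for n in range(1, 8)]
print(integrals)
```

The integrals are eventually 0 although every $\mu_n$ has norm 1, so on this non-compact $E$ the set $\{\mu : \|\mu\| = 1\}$ is not vaguely closed.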

For a variety of applications it is important to know when, in terms of $E$, the vague topology of $\mathcal{M}_+(E)$ is metrizable. One reason is that sequences suffice for dealing with metric topologies, but generally not for non-metric ones. The following remark will prove useful in answering this question.

Remark. 2. For every locally compact space $E$ the, obviously injective, mapping

$$\varphi : E \to \mathcal{M}_+(E) \tag{31.2}$$

defined by $\varphi(x) := \varepsilon_x$ is a homeomorphism of $E$ with $\varphi(E) = \{\varepsilon_x : x \in E\}$. For every point $x \in E$ the (open) sets

$$M_{f_1,\ldots,f_n;\eta}(x) = \{y \in E : |f_j(x) - f_j(y)| < \eta,\ j = 1, \ldots, n\}$$

form a neighborhood basis at $x$ as $\{f_1, \ldots, f_n\}$ runs through all finite subsets of $C_c(E)$ and $\eta$ through all positive real numbers. In fact, if $U$ is a neighborhood of some $x \in E$, 27.3 furnishes a $u \in C_c(E)$ with $0 \le u \le 1$, $u(x) = 1$ and $\operatorname{supp}(u) \subset U$, which implies that $M_{u;1}(x) \subset U$. Using the notation (30.5) it is obvious that

$$\varphi(M_{f_1,\ldots,f_n;\eta}(x)) = \varphi(E) \cap V_{f_1,\ldots,f_n;\eta}(\varepsilon_x)$$

for all relevant functions, $\eta > 0$ and $x \in E$. Together with the injectivity this clearly shows that $\varphi$ is a homeomorphism.
As a result of the foregoing, the metrizability of the locally compact space $E$ is clearly a necessary condition for the metrizability of the vague topology on $\mathcal{M}_+(E)$. For the former the existence of a countable basis in $E$ is sufficient, as was noted in Remark 5 of §29. It is useful to formulate this in terms of $C_c(E)$:

31.4 Lemma. For any locally compact space $E$ the following assertions are equivalent:
(a) $E$ has a countable basis.
(b) There is a countable subset of $C_c(E)$ which is dense with respect to uniform convergence.

Proof. (a)$\Rightarrow$(b): Let $\mathcal{G}$ be a countable base for (the topology of) $E$, $\mathcal{I}$ the set of all open intervals in $\mathbb{R}$ with rational endpoints. For every natural number $n$ let us say that an $n$-tuple $(G_1, \ldots, G_n) \in \mathcal{G}^n$ and an $n$-tuple $(I_1, \ldots, I_n) \in \mathcal{I}^n$ are compatible with each other if a function $f \in C_c(E)$ exists such that $f(G_j) \subset I_j$ for each $j = 1, \ldots, n$ and $\operatorname{supp}(f) \subset G_1 \cup \ldots \cup G_n$. Any such $f$ will be called a compatibility function for the pair of $n$-tuples. Obviously, the set

$$\bigcup_{n\in\mathbb{N}} (\mathcal{G}^n \times \mathcal{I}^n)$$

is countable; there are therefore only countably many such pairs of $n$-tuples ($n \in \mathbb{N}$) that are compatible with each other. We choose a compatibility function for each such pair and designate by $F$ the set of functions chosen. It suffices to prove that $F$ is a countable dense subset of $C_c(E)$. To prove its denseness, let $u \in C_c(E)$ and $\varepsilon > 0$ be given. Denote the support of $u$ by $K$. Every $x \in K$ lies in an open neighborhood from $\mathcal{G}$ each point $y$ of which satisfies $|u(x) - u(y)| < \varepsilon$. The compact set $K$ is covered by finitely many such neighborhoods, say by $G_1, \ldots, G_n$. The diameter of each image set $u(G_j)$ is at most $2\varepsilon$. Consequently there are intervals $I_j \in \mathcal{I}$ of length less than $3\varepsilon$ such that $u(G_j) \subset I_j$ for $j = 1, \ldots, n$. Thus $u$ is a compatibility function for the pair of $n$-tuples $(G_1, \ldots, G_n)$, $(I_1, \ldots, I_n)$. Hence there must also be such a compatibility function $f$ in the representative set $F$. Every $x \in G_j$ therefore satisfies $|u(x) - f(x)| \le \lambda^1(I_j) < 3\varepsilon$; that is, $|u(x) - f(x)| < 3\varepsilon$ for all $x \in G_1 \cup \ldots \cup G_n$. But this latter inequality prevails as well for all $x \in E \setminus (G_1 \cup \ldots \cup G_n)$ for the simple reason that both $f$ and $u$ vanish identically in this complement. In summary, $\|u - f\| \le 3\varepsilon$. This proves that $F$ is dense in $C_c(E)$.
(b)$\Rightarrow$(a): Let $D$ be a dense subset of $C_c(E)$. We will show that the system $\mathcal{G}$ of all sets $\{u > 1/2\}$ with $u \in D$ is a base for the topology of $E$. For every open $U \subset E$ and every point $x \in U$ Corollary 27.3 furnishes an $f \in C_c(E)$ with $f(x) = 1$ and $\operatorname{supp}(f) \subset U$. Since $D$ is dense, there is a $u \in D$ with $\|u - f\| < 1/2$. Then

$$x \in \{u > 1/2\} \subset \{f > 0\} \subset \operatorname{supp}(f) \subset U.$$

If $D$ is countable, so is $\mathcal{G}$. □

Remark. 3. It is easy to show directly that (b) implies the metrizability of $E$. To this end, let $D$ be a countable dense subset of $C_c(E)$. Now (cf. Corollary 27.3) $C_c(E)$ separates the points of $E$, so $D$ must also; that is, for any two distinct points $x, y \in E$ there is a $u \in D$ with $u(x) \ne u(y)$. The functions in $D \setminus \{0\}$ may be organized into a sequence $u_1, u_2, \ldots$ and we may then define

$$d(x, y) := \sum_{n=1}^{\infty} \frac{|u_n(x) - u_n(y)|}{2^n \|u_n\|}, \qquad x, y \in E. \tag{31.3}$$

Point-separation by $D$ means that $d(x, y) > 0$ whenever $x \ne y$. All the other properties of a metric on $E$ are obvious for $d$. This function $d$ on $E \times E$ is a uniform limit of continuous functions and is consequently continuous. Therefore the topology generated by $d$, which we will call the $d$-topology, is coarser than the original topology of $E$. For any given point $x \in E$ and neighborhood $U$ of $x$ in the original topology of $E$ there is, as was shown in the "(b)$\Rightarrow$(a)" part of the preceding proof, a $u \in D$ with

$$x \in V := \{u > 1/2\} \subset U.$$

This function $u$ is however a $u_n$, so that by (31.3) $u$ is $d$-continuous and $V$ is $d$-open. Therefore the $d$-topology is finer than the original topology of $E$. Consequently the two topologies in fact coincide.

Now we can provide the final answer to the question posed after Remark 1.

31.5 Theorem. The following assertions about a locally compact space $E$ are equivalent:
(a) $\mathcal{M}_+(E)$ is a Polish space in its vague topology.
(b) The vague topology of $\mathcal{M}_+(E)$ is metrizable and has a countable base.
(c) The topology of $E$ has a countable base.
(d) $E$ is a Polish space.

Proof. (a)$\Rightarrow$(b): This follows from Definition 26.1 of a Polish space.
(b)$\Rightarrow$(c): In Remark 2 we learned that $x \mapsto \varepsilon_x$ is a homeomorphic mapping of $E$ onto the subspace $\{\varepsilon_x : x \in E\}$ of all Dirac measures in $\mathcal{M}_+(E)$. Since the property of having a countable basis clearly passes to subspaces, (c) follows.
(c)$\Rightarrow$(d): This was shown in Remark 5 of §29.
(d)$\Rightarrow$(a): Lemma 31.4 provides a countable $D_0 \subset C_c(E)$ which is dense in $C_c(E)$ with respect to uniform convergence. Furthermore, according to Example 2 of §29, $E$ is countable at infinity, so that by 29.8 there is a sequence $(L_n)$ of compact sets such that $L_n \uparrow E$ and every compact subset $K$ of $E$ satisfies $K \subset L_n$ for all but finitely many $n$. For each $n \in \mathbb{N}$ choose an $e_n \in C_c(E)$ satisfying $0 \le e_n \le 1$, $e_n(L_n) = \{1\}$. The subset

$$D := D_0 \cup \{u e_n : u \in D_0,\ n \in \mathbb{N}\} \cup \{e_n : n \in \mathbb{N}\}$$

of $C_c(E)$ is still only countable and, of course, is dense in $C_c(E)$. Let $d_1, d_2, \ldots$ be an enumeration of its elements:

$$D = \{d_n : n \in \mathbb{N}\}.$$

Using this enumeration we define a mapping

$$\varrho : \mathcal{M}_+ \times \mathcal{M}_+ \to \mathbb{R}_+$$

by

$$\varrho(\mu, \nu) := \sum_{n=1}^{\infty} 2^{-n} \min\Big\{1, \Big|\int d_n\,d\mu - \int d_n\,d\nu\Big|\Big\}, \qquad \mu, \nu \in \mathcal{M}_+. \tag{31.4}$$

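All the properties of a metric save perhaps one are obvious for $\varrho$. What needs checking is that $\mu = \nu$ follows from $\varrho(\mu, \nu) = 0$. In view of the uniqueness part of the Riesz representation theorem this amounts to showing that from

$$\int d_n\,d\mu = \int d_n\,d\nu \quad \text{for all } n \in \mathbb{N}$$

follows the equality

$$\int f\,d\mu = \int f\,d\nu \quad \text{for every } f \in C_c(E).$$

So let us show this. Given $f \in C_c(E)$ there is $k \in \mathbb{N}$ such that

$$\operatorname{supp}(f) \subset L_k \subset \{e_k = 1\}.$$

Further, given $\varepsilon > 0$ there is $u \in D_0$ with $\|f - u\| \le \varepsilon$, whence, since $f = f e_k$,

$$|f - u e_k| \le \varepsilon e_k. \tag{31.5}$$

Integration yields

$$\Big|\int f\,d\mu - \int u e_k\,d\mu\Big| \le \varepsilon \int e_k\,d\mu, \tag{31.6}$$

$$\Big|\int f\,d\nu - \int u e_k\,d\nu\Big| \le \varepsilon \int e_k\,d\nu. \tag{31.6'}$$

As the functions $e_k$ and $u e_k$ are in $D$, the assumption that $\varrho(\mu, \nu) = 0$ entails that their $\mu$- and $\nu$-integrals coincide, and it follows that

$$\Big|\int f\,d\mu - \int f\,d\nu\Big| \le 2\varepsilon \int e_k\,d\mu,$$

holding for every $\varepsilon > 0$. That is, the desired equality $\int f\,d\mu = \int f\,d\nu$ must hold.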
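
To get a feeling for the metric (31.4), one can evaluate a truncated version of it on discrete measures. The sketch below is only an illustration under stated assumptions: $E = \mathbb{R}$, a short hypothetical list of tent functions stands in for the enumeration $d_1, d_2, \ldots$ of $D$, and measures are finite sums $\sum_i \alpha_i \varepsilon_{x_i}$ stored as lists of `(weight, point)` pairs. A finite truncation is in general only a pseudometric.

```python
def tent(center, radius):
    """Continuous function with compact support [center - radius, center + radius]."""
    return lambda x: max(0.0, 1.0 - abs(x - center) / radius)

# Hypothetical finite stand-in for the enumeration d_1, d_2, ... of D.
test_functions = [tent(c, r) for r in (1.0, 2.0) for c in (-1.0, 0.0, 1.0)]

def integral(f, measure):
    """Integral of f with respect to the discrete measure sum_i weight_i * eps_{point_i}."""
    return sum(w * f(x) for w, x in measure)

def rho_truncated(mu, nu, functions=test_functions):
    """Finite partial sum of (31.4): sum_n 2^{-n} min{1, |int d_n dmu - int d_n dnu|}."""
    return sum(
        2.0 ** -(n + 1) * min(1.0, abs(integral(d, mu) - integral(d, nu)))
        for n, d in enumerate(functions)
    )

def dirac(x):
    return [(1.0, x)]

near = rho_truncated(dirac(0.0), dirac(0.1))
far = rho_truncated(dirac(0.0), dirac(0.5))
print(near, far)
```

Dirac measures at nearby points come out close in this truncated distance, which is consistent with the homeomorphism $x \mapsto \varepsilon_x$ of Remark 2.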
The next step is to show that the topology determined by $\varrho$ is none other than the vague topology. We will, to that end, make use of the fact that the sets $V_{f_1,\ldots,f_n;\varepsilon}(\nu)$ defined in (30.5) are a neighborhood base at $\nu \in \mathcal{M}_+$ in the vague topology, when all possible finite subsets $\{f_1, \ldots, f_n\}$ of $C_c(E)$ and all numbers $\varepsilon > 0$ are considered. We will denote by $U_\varepsilon(\nu)$ the open ball of center $\nu$ and radius $\varepsilon$ with respect to the metric $\varrho$.
1. Given $\varepsilon > 0$ there exists $m \in \mathbb{N}$ such that

$$V_{d_1,\ldots,d_m;\varepsilon/2}(\nu) \subset U_\varepsilon(\nu) \quad \text{for every } \nu \in \mathcal{M}_+. \tag{31.7}$$

Indeed, one may take any $m \in \mathbb{N}$ such that

$$\sum_{n=m+1}^{\infty} 2^{-n} < \varepsilon/2$$

and every $\mu \in V_{d_1,\ldots,d_m;\varepsilon/2}(\nu)$ will then satisfy

$$\varrho(\mu, \nu) < \sum_{n=1}^{m} 2^{-n}\,\frac{\varepsilon}{2} + \frac{\varepsilon}{2} < \varepsilon$$

and consequently lie in $U_\varepsilon(\nu)$.

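2. For finitely many $f_1, \ldots, f_n \in C_c(E)$, for every number $\varepsilon > 0$ and every $\nu \in \mathcal{M}_+$, there is a number $\eta > 0$ such that

$$U_\eta(\nu) \subset V_{f_1,\ldots,f_n;\varepsilon}(\nu). \tag{31.8}$$

First of all, choose $k \in \mathbb{N}$ so that

$$\bigcup_{j=1}^{n} \operatorname{supp}(f_j) \subset L_k \subset \{e_k = 1\}.$$

We can find a number $\delta$, dependent on $\nu$, so that

$$0 < \delta < 1 \quad\text{and}\quad \delta^2 + \Big(1 + 2\int e_k\,d\nu\Big)\delta < \varepsilon.$$

For each $j$ there is a function $u_j \in D_0$ with $\|f_j - u_j\| \le \delta$, hence with

$$|f_j - u_j e_k| \le \delta e_k \qquad (j = 1, \ldots, n).$$

Integration with respect to $\nu$ and any $\mu \in \mathcal{M}_+$ gives

$$\Big|\int f_j\,d\mu - \int u_j e_k\,d\mu\Big| \le \delta \int e_k\,d\mu, \tag{31.9}$$

$$\Big|\int f_j\,d\nu - \int u_j e_k\,d\nu\Big| \le \delta \int e_k\,d\nu \tag{31.9'}$$

for $j = 1, \ldots, n$. Choose $m$ so large that all the functions $e_k, u_1 e_k, \ldots, u_n e_k$ show up among the first $m$ functions $d_1, \ldots, d_m$ in the enumeration of $D$, to which they all belong. Finally, set

$$\eta := \delta 2^{-m}$$

and consider any $\mu \in U_\eta(\nu)$. It satisfies

$$2^{-i} \min\Big\{1, \Big|\int d_i\,d\mu - \int d_i\,d\nu\Big|\Big\} \le \varrho(\mu, \nu) < \eta \le \delta 2^{-i},$$

whence, since $\delta < 1$,

$$\Big|\int d_i\,d\mu - \int d_i\,d\nu\Big| < \delta \quad \text{for } i = 1, \ldots, m.$$

Because of the way $m$ was chosen,

$$\Big|\int u_j e_k\,d\mu - \int u_j e_k\,d\nu\Big| < \delta \quad \text{for } j = 1, \ldots, n \tag{31.10}$$

and

$$\Big|\int e_k\,d\mu - \int e_k\,d\nu\Big| < \delta. \tag{31.10'}$$

From (31.9) and (31.9'), as well as from (31.10), it follows via the triangle inequality that

$$\Big|\int f_j\,d\mu - \int f_j\,d\nu\Big| < \Big(1 + \int e_k\,d\mu + \int e_k\,d\nu\Big)\delta;$$

while from (31.10')

$$\int e_k\,d\mu < \delta + \int e_k\,d\nu,$$

so the preceding implies that

$$\Big|\int f_j\,d\mu - \int f_j\,d\nu\Big| < \delta^2 + \Big(1 + 2\int e_k\,d\nu\Big)\delta < \varepsilon.$$

As this holds for every $j \in \{1, \ldots, n\}$, it asserts that $\mu \in V_{f_1,\ldots,f_n;\varepsilon}(\nu)$ and confirms (31.8). Together (31.7) and (31.8) assert the equality of the vague and the $\varrho$-topologies.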
The next step will be to prove the completeness of the metric $\varrho$, and we can do that via slight modifications in the foregoing arguments. Let $(\mu_n)_{n\in\mathbb{N}}$ be a $\varrho$-Cauchy sequence in $\mathcal{M}_+$. Instead of the functions $f_1, \ldots, f_n$ and the number $\varepsilon > 0$ in 2. above, let an $f \in C_c(E)$ and a number $\delta \in \,]0, 1[$ be given. We aim first to prove that the numerical sequence $(\int f\,d\mu_n)_{n\in\mathbb{N}}$ converges in $\mathbb{R}$. Choose $k \in \mathbb{N}$ with $\operatorname{supp}(f) \subset \{e_k = 1\}$ and $u \in D_0$ with $\|f - u\| \le \delta$. Then choose $m \in \mathbb{N}$ large enough that the two functions $e_k$ and $u e_k$ are among $d_1, \ldots, d_m$ and set $\eta := \delta 2^{-m}$. Since $(\mu_n)$ is a $\varrho$-Cauchy sequence, there is a natural number $N$, dependent on $\eta$, thus on $f$ and $\delta$, such that

$$\varrho(\mu_r, \mu_s) < \eta \quad \text{for all } r, s \ge N.$$

Just as in the earlier deduction scheme, we get that for such $r, s$

$$\Big|\int d_i\,d\mu_r - \int d_i\,d\mu_s\Big| < \delta \quad \text{for all } i \in \{1, \ldots, m\},$$

which contains in particular the inequalities

$$\Big|\int u e_k\,d\mu_r - \int u e_k\,d\mu_s\Big| < \delta \quad\text{and}\quad \Big|\int e_k\,d\mu_r - \int e_k\,d\mu_s\Big| < \delta. \tag{31.11}$$

Of course we also have the $f$-analogs of (31.9) and (31.9'), so that reasoning similar to that used earlier delivers the inequality

$$\Big|\int f\,d\mu_r - \int f\,d\mu_s\Big| < \Big(1 + \int e_k\,d\mu_r + \int e_k\,d\mu_s\Big)\delta \quad \text{for all } r, s \ge N.$$

The second of the (valid for all $r, s \ge N$) inequalities in (31.11) shows that the numerical sequence $(\int e_k\,d\mu_n)_{n\in\mathbb{N}}$ is bounded, say by $M \in \mathbb{R}_+$:

$$\int e_k\,d\mu_n \le M \quad \text{for all } n \in \mathbb{N}.$$

The earlier inequality therefore yields

$$\Big|\int f\,d\mu_r - \int f\,d\mu_s\Big| < \delta^2 + (1 + 2M)\delta \quad \text{for all } r, s \ge N.$$

Notice that $M$ depends only on $k$, hence only on $f$. Furthermore $N$ depends only on $\delta$ and $f$. Therefore this last inequality affirms that $(\int f\,d\mu_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}$. According to the remark following Definition 30.1 the sequence $(\mu_n)$ is therefore vaguely convergent to some $\mu_0 \in \mathcal{M}_+$. Since the vague topology coincides with the $\varrho$-topology, as we have already confirmed, this means that the sequence $(\mu_n)$ converges to $\mu_0$ in the $\varrho$-metric.
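We finally need to prove that, like the topology of $E$, the vague topology of $\mathcal{M}_+$ has a countable base. Since the vague topology is generated by the metric $\varrho$, it is enough to find a countable set $\mathcal{G}_0$ which is dense in $\mathcal{M}_+$; because it is obvious that the set of all open balls with respect to the metric $\varrho$ centered at points of $\mathcal{G}_0$ and having rational radii is then a countable base for the $\varrho$-topology of $\mathcal{M}_+$. Our candidate for $\mathcal{G}_0$ is the set of all discrete measures

$$\delta := \sum_{i=1}^{k} \alpha_i \varepsilon_{x_i}$$

with positive rational $\alpha_i$ and points $x_i$ drawn from a countable set $E_0$ which is dense in $E$. We get such a set $E_0$ simply by taking a point from each set in a countable base for the topology of $E$. Evidently, this $\mathcal{G}_0$ is countable. We have to show that for every $\mu \in \mathcal{M}_+$, every real $\varepsilon > 0$, and every finite set $F := \{f_1, \ldots, f_n\} \subset C_c(E)$, the basic vague neighborhood $V_{f_1,\ldots,f_n;\varepsilon}(\mu)$ contains a measure from $\mathcal{G}_0$. At least, according to 30.4, this neighborhood contains a

$$\bar\delta := \sum_{i=1}^{k} \bar\alpha_i \varepsilon_{\bar x_i}$$

with positive real $\bar\alpha_i$ and $\bar x_i \in E$. Thus

$$\Big|\int f\,d\mu - \int f\,d\bar\delta\Big| = \Big|\int f\,d\mu - \sum_{i=1}^{k} \bar\alpha_i f(\bar x_i)\Big| < \varepsilon \quad \text{for all } f \in F. \tag{31.12}$$

Now for such $f$ and $\delta$ as above

$$\begin{aligned}
\Big|\int f\,d\mu - \int f\,d\delta\Big| &\le \Big|\int f\,d\mu - \int f\,d\bar\delta\Big| + \Big|\int f\,d\bar\delta - \int f\,d\delta\Big| \\
&\le \Big|\int f\,d\mu - \int f\,d\bar\delta\Big| + \sum_{i=1}^{k} \alpha_i |f(\bar x_i) - f(x_i)| + \sum_{i=1}^{k} |\bar\alpha_i - \alpha_i|\,\|f\|.
\end{aligned} \tag{31.13}$$

Inequality (31.12) says that the number

$$\varepsilon - \Big|\int f\,d\mu - \int f\,d\bar\delta\Big|$$

is positive for each $f \in F$. If we choose $\alpha_i$ from $\mathbb{Q}_+$ sufficiently close to $\bar\alpha_i$ and $x_i$ from the dense set $E_0$ sufficiently close to $\bar x_i$ ($i = 1, \ldots, k$), then because of the continuity of the (finitely many) functions $f$, we can obviously see to it that the two sums in (31.13) together are less than this, so that the right side of inequality (31.13) is less than $\varepsilon$, for each $f \in F$. But that means that $\delta \in \mathcal{G}_0 \cap V_{f_1,\ldots,f_n;\varepsilon}(\mu)$. □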

Remarks. 4. The reader should recall the rather elementary fact that for a metric space compactness and sequential compactness are equivalent (see (6.37) in HEWITT and STROMBERG [1965]). In view of this, a very useful consequence of Theorems 31.2 and 31.5 for a locally compact space $E$ with a countable base is that every vaguely bounded sequence in $\mathcal{M}_+(E)$ contains a vaguely convergent subsequence.

In particular, for such $E$ every sequence $(\mu_n)$ in $\mathcal{M}^1_+(E)$, that is, every sequence of p-measures, contains a vaguely convergent subsequence. Moreover, in case all convergent subsequences have the same limit $\mu$, the original sequence $(\mu_n)$ itself converges vaguely to $\mu$: Otherwise there would be an $f \in C_c(E)$ for which $(\int f\,d\mu_n)$ does not converge to $\int f\,d\mu$, and so an $\varepsilon > 0$ and integers $1 \le n_1 < n_2 < \ldots$ such that $|\int f\,d\mu_{n_j} - \int f\,d\mu| \ge \varepsilon$ for all $j \in \mathbb{N}$. The sequence $(\mu_{n_j})_{j\in\mathbb{N}}$ would have a vaguely convergent subsequence and its vague limit could not be $\mu$. If we further hypothesize of $(\mu_n)$ that it is tight, then with the aid of Remark 3 in §30 we can conclude that $\mu \in \mathcal{M}^1_+(E)$ as well, and that $(\mu_n)$ even converges weakly to $\mu$.
5. The foregoing deliberations show (for locally compact $E$ with a countable base) that tight sequences in $\mathcal{M}^1_+(E)$ always contain weakly convergent subsequences. Explicitly formulated this says: A set $H \subset \mathcal{M}^1_+(E)$ is relatively compact (= relatively sequentially compact) in the weak topology if it is tight, meaning that for every $\varepsilon > 0$ a compact $K_\varepsilon \subset E$ exists such that $\mu(E \setminus K_\varepsilon) \le \varepsilon$ for every $\mu \in H$. A theorem of Yu.V. PROHOROV asserts that the tightness of $H$ is even equivalent to its weak relative compactness. More is true: This equivalence prevails as well whenever $E$ is any Polish space. For details the reader can consult BILLINGSLEY [1968].

The ideas employed in the proofs of Theorems 31.4 and 31.5, slightly modified, lead to a further interesting result. It concerns the space

$$C := C(\mathbb{R}_+, E)$$

of all continuous mappings $f$ of $\mathbb{R}_+ := [0, +\infty[$ into a Polish space $E$, for example, $\mathbb{R}^d$. We endow $C$ with the topology of uniform convergence on compact subsets of $\mathbb{R}_+$.

31.6 Theorem. Along with $E$, the space $C(\mathbb{R}_+, E)$ is also Polish.

Proof. Consider any complete metric $\varrho$ which generates the topology of $E$. Another such metric is given by $(x,y) \mapsto \min\{1, \varrho(x,y)\}$, and using it if need be, we can simply assume that $\varrho \le 1$. This lets us define $d_n$ in $C$ for each $n \in \mathbb{N}$ by

$$d_n(f,g) := \sup\{\varrho(f(x), g(x)) : x \in [0, n]\}, \qquad f, g \in C;$$

and

$$d(f,g) := \sum_{n=1}^{\infty} 2^{-n} d_n(f,g), \qquad f, g \in C. \tag{31.14}$$

Just as earlier (cf. (31.3) and (31.4)), one easily confirms that $d$ is a metric on $C$ (with all its values in $[0,1]$) which satisfies

$$2^{-n} d_n(f,g) \le d(f,g) \le d_n(f,g) + 2^{-n} \quad \text{for all } n \in \mathbb{N}, \tag{31.15}$$

the right-most inequality following from the fact that $d_i \le d_{i+1}$ for all $i \in \mathbb{N}$, resulting in

$$d(f,g) \le \sum_{i=1}^{n} 2^{-i} d_n(f,g) + \sum_{i=n+1}^{\infty} 2^{-i}.$$

It follows from (31.15) via by-now-familiar reasoning that the $d$-topology coincides with the original topology of $C$, and moreover that $d$ is a complete metric.
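
The two-sided estimate (31.15) can be checked on a concrete pair of paths. The following sketch rests on assumptions that are not in the proof: $E = \mathbb{R}$ with $\varrho(x,y) = \min\{1, |x-y|\}$, suprema over $[0,n]$ approximated by maxima over a grid of step 0.01, and the series (31.14) cut off after 30 terms. The paths `f` and `g` are arbitrary illustrative choices.

```python
import math

def rho(x, y):
    """Assumed bounded metric on E = R: min{1, |x - y|}."""
    return min(1.0, abs(x - y))

def d_n(f, g, n, step=0.01):
    """Grid approximation of sup{rho(f(t), g(t)) : t in [0, n]}."""
    points = (k * step for k in range(int(round(n / step)) + 1))
    return max(rho(f(t), g(t)) for t in points)

def d_truncated(f, g, terms=30):
    """Partial sum of (31.14); the neglected tail is at most 2**(-terms)."""
    return sum(2.0 ** -n * d_n(f, g, n) for n in range(1, terms + 1))

f = math.sin
g = lambda t: math.sin(t) + t / 10.0     # drifts away from f linearly in t

d = d_truncated(f, g)
for n in range(1, 11):
    # Both sides of (31.15), up to rounding.
    print(n, 2.0 ** -n * d_n(f, g, n), d, d_n(f, g, n) + 2.0 ** -n)
```

Here $d_n(f,g)$ is approximately $n/10$ for $n \le 10$, and the printed rows bracket $d(f,g)$ from below and above as (31.15) predicts.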
So it only remains to prove that the topology of $C$ has a countable base. As we showed in the very last phase of the proof of Theorem 31.5, the Polish space $E$ contains a countable dense subset $E_0$. The system $\mathcal{G}$ of all open balls with respect to the metric $\varrho$ with centers in $E_0$ and with positive rational radii is then a countable base for $E$. Together with it we consider a countable base $\mathcal{O}$ for $\mathbb{R}_+$. Thus $n$-tuples $(O_1, \ldots, O_n) \in \mathcal{O}^n$ and $(G_1, \ldots, G_n) \in \mathcal{G}^n$ are called compatible if there is a function $f \in C$ such that $f(O_j) \subset G_j$ for each $j = 1, \ldots, n$. And, as before, any such $f$ will be called a compatibility function. Because

$$\bigcup_{n\in\mathbb{N}} (\mathcal{O}^n \times \mathcal{G}^n)$$

is countable, there is a countable set $F \subset C$ which contains a compatibility function for each pair of compatible $n$-tuples, for each $n \in \mathbb{N}$. The open $d$-balls having centers in $F$ and rational radii are a countable set, and it is easy to see that they constitute a base for the $d$-topology of $C$ once we confirm that $F$ is dense in $C$.
So that is now our goal. Consider then an arbitrary $f_0 \in C$ and $N \in \mathbb{N}$. Set $\varepsilon := 2^{-N-2}$. Since $f_0$ is continuous, every $x \in [0, N]$ lies in a set $O \in \mathcal{O}$ such that

$$\varrho(f_0(y), f_0(x)) < \varepsilon/2 \quad \text{for all } y \in O.$$

Finitely many such sets $O$ suffice to cover $[0, N]$, say $O_1, \ldots, O_n$. By the triangle inequality

$$\varrho(f_0(y), f_0(x)) < \varepsilon \quad \text{for all } x, y \in O_j,\ j \in \{1, \ldots, n\}.$$

Choose a point $x_j$ from each $O_j$. Then

$$\varrho(f_0(x), f_0(x_j)) < \varepsilon \quad \text{for all } x \in O_j,\ j \in \{1, \ldots, n\}.$$

The open $\varrho$-ball of radius $\varepsilon$ centered at $f_0(x_j)$ meets the dense set $E_0$, say in the point $z_j$. As $\varepsilon$ is rational, the open $\varrho$-ball of center $z_j$ and radius $2\varepsilon$ is a set $G_j \in \mathcal{G}$. Then every $x \in O_j$ satisfies

$$\varrho(f_0(x), z_j) \le \varrho(f_0(x), f_0(x_j)) + \varrho(f_0(x_j), z_j) < 2\varepsilon,$$

which means that $f_0(O_j) \subset G_j$, all this for each $j \in \{1, \ldots, n\}$. This shows that $f_0$ is a compatibility function for $(O_1, \ldots, O_n)$ and $(G_1, \ldots, G_n)$. Consequently, this pair of $n$-tuples has a compatibility function $f \in F$, that is, $f \in F$ satisfies

$$f(O_j) \subset G_j \quad \text{for } j = 1, \ldots, n.$$

It follows that $f(x), f_0(x)$ both lie in $G_j$ whenever $x \in O_j$ and so

$$\varrho(f(x), f_0(x)) < 4\varepsilon.$$

As the $O_j$ cover $[0, N]$, this inequality holds for every $x \in [0, N]$. It affirms that $d_N(f, f_0) \le 4\varepsilon$, and so thanks to (31.15) and the definition of $\varepsilon$, $d(f, f_0) \le 4\varepsilon + 2^{-N} = 2^{-N+1}$. As $N \in \mathbb{N}$ is arbitrary, this shows that $F$ is $d$-dense in $C$, which, as noted earlier, completes the proof. □

The significance of Theorem 31.5 lies partly in the fact that for a locally compact space $E$ whose topology has a countable base the space $\mathcal{M}_+(E)$ of all (positive) Radon measures, which according to 29.12 is the set of all Borel measures on $E$, being also a Polish space, is itself an environment in which measure theory can be pursued. And this happens in convex analysis, in integral geometry, and in stochastic geometry, a meeting point between geometry and probability theory. The path-space $C(\mathbb{R}_+, E)$ of all continuous paths or curves $t \mapsto f(t)$, $t \in \mathbb{R}_+$, in a Polish space $E$ (Theorem 31.6) plays a fundamental role in the theory of stochastic processes. For example, the Polish space $C(\mathbb{R}_+, \mathbb{R}^d)$ carries the famous Wiener measure; it is the steering mechanism of the Brownian motion in $\mathbb{R}^d$ (cf. BAUER [1996]).

Exercises.
1. Let $E$ be a locally compact space, $\nu \in \mathscr{M}_+(E)$. Show that the set of all $\mu \in
\mathscr{M}_+(E)$ which satisfy $0 \le \int u\,d\mu \le \int u\,d\nu$ for every non-negative $u \in C_c(E)$ is
vaguely compact.
2. Let E be a locally compact space with a countable base. Prove that there is
a countable subset of C0(E) that has the properties of the set T in Exercise 3, §30.
[Hint: Try the set D that featured in the proof of Theorem 31.5.]
3. (Selection theorem of E. HELLY (1884-1943)). Prove the original form of Corol-
lary 31.3: To every sequence $(F_n)_{n \in \mathbb{N}}$ of distribution functions on $\mathbb{R}$ corresponds
a measure-generating function $F : \mathbb{R} \to \mathbb{R}$ and a subsequence $(F_{n_k})_{k \in \mathbb{N}}$ of the
original sequence such that $\lim_{k \to \infty} F_{n_k}(x) = F(x)$ for every continuity point $x$ of $F$.
Why is $F$ generally not a distribution function? How does one recover 31.3 (for
the case $E := \mathbb{R}$) from Helly's theorem?
4. For a Polish space E consider the topology (introduced in Remark 7 of §30) of
weak convergence on the set of finite Borel measures (the finite Radon measures
- cf. 26.2) on E. By adapting the ideas in the proof of Theorem 31.5, show that
this topology is metrizable.
5. For what more general spaces taking over the role of R+ in the definition
of C(R+, E) does Theorem 31.6 remain valid?
Bibliography

U. ANONYME [1889]: "Sur l'intégrale $\int_0^\infty e^{-x^2}\,dx$", Bull. Sci. Math. (2) 13, 84.
G. AUMANN [1969]: Reelle Funktionen. Grundlehren Math. Wiss. 68 (2nd edition),
Springer-Verlag, Berlin-Heidelberg-New York.
S. BANACH [1923]: "Sur le problème de la mesure", Fund. Math. 4, 7-33.
R.G. BARTLE and J.T. JOICHI [1961]: "The preservation of convergence of
measurable functions", Proc. Amer. Math. Soc. 12, 122-126.
H. BAUER [1984]: Maße auf topologischen Räumen, Kurs der Fernuniversität-
Gesamthochschule-Hagen.
- [1996]: Probability Theory, de Gruyter Stud. Math. 23. Walter de Gruyter,
Berlin-New York.
S.K. BERBERIAN [1962]: "The product of two measures", Amer. Math. Monthly
69, 961-968.
P. BILLINGSLEY [1968]: Convergence of Probability Measures. John Wiley & Sons,
Inc., New York-London-Sydney-Toronto.
G. BIRKHOFF and S. MACLANE [1965]: A Survey of Modern Algebra (3rd edition).
The Macmillan Co., New York.
N. BOURBAKI [1965]: Intégration, Chap. 1-4. Hermann, Paris.
A. BROUGHTON and B.W. HUFF [1977]: "A comment on unions of σ-fields",
Amer. Math. Monthly 84, 553-554.
S.D. CHATTERJI [1985-86]: "Elementary counter-examples in the theory of double
integrals", Atti Sem. Mat. Fis. Univ. Modena 34, 363-384.
G. CHOQUET [1969]: Lectures on Analysis. Vol. 1. W.A. Benjamin, New York-
Amsterdam.
J.P.R. CHRISTENSEN [1974]: Topology and Borel Structure. Mathematical Studies
10. North-Holland Publ. Co., Amsterdam-London.
D.L. COHN [1980]: Measure Theory. Birkhäuser Verlag, Basel-Boston-Stuttgart.
P. COURRÈGE [1962]: Théorie de la mesure. Les cours de Sorbonne. Centre de
Documentation Universitaire, Paris 5e.
C. DELLACHERIE et P.-A. MEYER [1975]: Probabilités et potentiel, Chap. I à IV.
Hermann, Paris.
P. DIEROLF and V. SCHMIDT [1998]: "A proof of the change of variable formula
for d-dimensional integrals", Amer. Math. Monthly 105, 654-656.
J. DIEUDONNÉ [1939]: "Un exemple d'espace normal non susceptible d'une
structure uniforme d'espace complet", C. R. Acad. Sci. Paris Ser. I Math. 209,
145-147.
- [1978]: Abrégé d'Histoire des Mathématiques, 1700-1900, tome II. Hermann,
Paris.
E.B. DYNKIN [1965]: Markov Processes, I, II. Grundlehren Math. Wiss. 121, 122.
Springer-Verlag, Berlin-Heidelberg-New York.
R.E. EDWARDS [1953]: "A theory of Radon measures on locally compact spaces",
Acta Math. 89, 133-164.
B.W. GNEDENKO [1988]: The Theory of Probability (translated from Russian by
G. Yankovsky) 6th printing. Mir Publishers, Moscow.
C. GOFFMAN and G. PEDRICK [1975]: "A proof of the homeomorphism of Lebes-
gue-Stieltjes measure with Lebesgue measure", Proc. Amer. Math. Soc. 52,
196-198.
H. HAHN and A. ROSENTHAL [1948]: Set Functions. The University of New
Mexico Press, Albuquerque.
P.R. HALMOS [1974]: Naive Set Theory. Undergrad. Texts Math., Springer-
Verlag, New York-Heidelberg.
- [1974]: Measure Theory. Grad. Texts in Math. 18, Springer-Verlag, New York-
Heidelberg-Berlin.
F. HAUSDORFF [1914]: Grundzüge der Mengenlehre. Verlag von Veit & Comp.,
Leipzig; reprinted (1949), Chelsea Publishing Comp., New York.
T. HAWKINS [1970]: Lebesgue's Theory of Integration. University of Wisconsin
Press, Madison-Milwaukee-London.
J. HENLE and S. WAGON [1983]: "A translation-invariant measure", Amer. Math.
Monthly 90, 62-63.
E. HEWITT and K.A. ROSS [1979]: Abstract Harmonic Analysis I. Grundlehren
Math. Wiss. 115 (2nd edition). Springer-Verlag, Berlin-Heidelberg-New York.
E. HEWITT and K. STROMBERG [1965]: Real and Abstract Analysis. Grad. Texts
in Math. 25. Springer-Verlag, New York-Heidelberg-Berlin.
J.L. KELLEY [1955]: General Topology, Grad. Texts in Math. 27. D. Van Nostrand
Co., Inc. Princeton; reprinted (1975), Springer-Verlag, New York-Heidelberg-
Berlin.
L. MATTNER [1999]: "Product measurability, parameter integrals, and a Fubini
counterexample", Enseign. Math. (2) 45, 271-279.
P.-A. MEYER [1966]: Probability and Potentials. Blaisdell Publ. Comp., Waltham,
Massachusetts-Toronto-London.
L. NACHBIN [1965]: The Haar Integral. The University Series in Higher Mathe-
matics. (Translated from Portuguese by L. Bechtolsheim.) D. Van Nostrand Co.,
Inc. Princeton; reprinted (1976), R.E. Krieger Publ. Comp., Huntington, New
York.
J. VON NEUMANN [1929]: "Zur allgemeinen Theorie des Maßes", Fund. Math. 13,
73-116+333.
W.P. NOVINGER [1972]: "Mean convergence in Lp-space", Proc. Amer. Math.
Soc. 34, 627-628.
D.A. OVERDIJK, F.H. SIMONS and J.G.F. THIEMANN [1979]: "A comment on
unions of rings", Indag. Math. 41, 439-441.
J.C. OXTOBY and S. ULAM [1941]: "Measure-preserving homeomorphisms and
metrical transitivity", Ann. of Math. (2) 42, 874-920.
K.R. PARTHASARATHY [1967]: Probability Measures on Metric Spaces, Academic
Press, New York-London.
W.F. PFEFFER [1977]: Integrals and Measures. Marcel Dekker, New York-Basel.
J. RADON [1913]: "Theorie und Anwendungen der absolut additiven Mengenfunk-
tionen", Sitzungsber. Kaiserl. Akad. Wiss. Wien, Math.-Naturwiss. Kl. 122,
1295-1438.
H. RICHTER [1966]: Wahrscheinlichkeitstheorie. Grundlehren Math. Wiss. 86
(2nd edition). Springer-Verlag, Berlin-Heidelberg-New York.
F. RIESZ [1911]: "Sur certains systèmes singuliers d'équations intégrales", Ann.
Sci. Ecole Norm. Sup. (3) 28, 33-62.
J.B. ROBERTSON [1967]: "Uniqueness of measures", Amer. Math. Monthly 74,
50-53.
W. RUDIN [1962]: Fourier Analysis on Groups. Interscience Tracts in Pure Appl.
Math. 12. John Wiley & Sons, New York-London.
- [1987]: Real and Complex Analysis (3rd edition). McGraw-Hill Book Comp.,
New York-Hamburg-Tokyo-Toronto.
S. SAEKI [1996]: "A proof of the existence of infinite product probability mea-
sures", Amer. Math. Monthly 103, 682-683.
W. SIERPINSKI [1928]: "Un théorème général sur les familles d'ensembles", Fund.
Math. 12, 206-210.
R.M. SOLOVAY [1970]: "A model of set-theory in which every set of reals is
Lebesgue measurable", Ann. of Math. (2) 92, 1-56.
R.H. SORGENFREY [1947]: "On the topological product of paracompact spaces",
Bull. Amer. Math. Soc. 53, 631-632.
S.M. SRIVASTAVA [1998]: A Course on Borel Sets. Grad. Texts in Math. 180.
Springer-Verlag, New York-Berlin.
K. STROMBERG [1972]: "An elementary proof of Steinhaus's theorem", Proc.
Amer. Math. Soc. 36, 308.
- [1979]: "The Banach-Tarski paradox", Amer. Math. Monthly 86, 151-161.
- [1981]: An Introduction to Classical Real Analysis. Wadsworth International,
Belmont, California.
H.G. TUCKER [1967]: A Graduate Course in Probability. Academic Press, New
York-San Francisco-London.
J. VAN YZEREN [1979]: "Moivre's and Fresnel's integrals by simple integration",
Amer. Math. Monthly 86, 691-693.
D.E. VARBERG [1971]: "Change of variables in multiple integrals", Amer. Math.
Monthly 78, 42-45.
S. WAGON [1985]: The Banach-Tarski Paradox. Encyclopedia Math. Appl. 24.
Cambridge University Press, Cambridge.
S. WILLARD [1970]: General Topology. Addison-Wesley Publishing Co., Reading,
Massachusetts.
J. YAM TING Woo [1971]: "An elementary proof of the Lebesgue decomposition
theorem", Amer. Math. Monthly 78, 783.
D.G. WRIGHT [1994]: "Tychonoff's theorem", Proc. Amer. Math. Soc. 120,
985-987.
This book gives a straightforward introduction to the field as it is
nowadays required in many branches of analysis and especially in
probability theory. The first three chapters (Measure Theory,
Integration Theory, Product Measures) basically follow the clear
and approved exposition given in the author's earlier book on
'Probability Theory and Measure Theory'. Special emphasis is
laid on a complete discussion of the transformation of measures
and integration with respect to the product measure, convergence
theorems, parameter depending integrals, as well as the Radon-
Nikodym theorem.
The final chapter, essentially new and written in a clear and concise
style, deals with the theory of Radon measures on Polish or locally
compact spaces. With the main results being Luzin's theorem, the
Riesz representation theorem, the Portmanteau theorem, and a
characterization of locally compact spaces which are Polish, this
chapter is a true invitation to study topological measure theory.
The text addresses graduate students who wish to learn the
fundamentals in measure and integration theory as needed in
modern analysis and probability theory. It will also be an important
source for anyone teaching such a course.
