Lüdde
Mathematical Supplement
Theoretical Physics 1
Springer
Table of Contents

Preface
2 Differential Equations I  33
2.1 Orientation  33
2.2 Methods of solution  39
2.2.1 Separation of variables  39
2.2.2 The linear differential equation of second order  44
3 Linear Algebra  51
3.1 Vectors  51
3.1.1 Qualitative vector calculus  51
3.1.2 Quantitative formulation of vector calculus  58
3.1.3 Addendum I: n-dimensional vector spaces  66
3.1.4 Addendum II: nonorthogonal coordinate systems and extensions  68
3.2 Linear coordinate transformations, matrices and determinants  72
3.2.1 Linear coordinate transformations I  72
3.2.2 Matrices  76
3.2.3 Linear coordinate transformations II  85
3.2.4 Determinants  97
Index  253
1 Analysis I: Functions of one real variable
The first chapter provides a survey of topics from the analysis of functions
of one real variable which are of interest in theoretical mechanics. The topics
include differentiation, integration and series expansions of functions. The
chapter begins with a brief discussion of the definition of the concept of a
function.
(Fig. 1.1: domain of definition and co-domain of a function.)
2. Any rule that assigns a unique real number x(t) to each point of the domain of definition is called a function of one real variable.
• The domain of definition is the interval (−∞, ∞); the specification is x(t) = e^t. In this example the function is defined by an explicit formula. It can be represented by a 'smooth' curve in a diagram. The co-domain (see Fig. 1.2a) is (0, ∞).
• Consider the function x(t) = sin(1/t) in the domain (0, ∞). The specification also uses a formula. The co-domain is [−1, 1]. This function can, however, not be represented by a 'smooth' curve: the value of the function oscillates ever more rapidly the closer one approaches the (excluded) point t = 0 (Fig. 1.2b).
Fig. 1.2. (a) The function e^t; (b) the function sin(1/t)
x(t) = 1/(3 − t) for t < 1
x(t) = 1/(1 + t) for t ≥ 1 .
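The two branches of this piecewise specification join continuously at t = 1, since both give the value 1/2 there. A minimal numerical check (the Python function below is our illustration, not part of the text):

```python
# Sketch of the piecewise specification above (illustration only).
def x(t):
    """1/(3 - t) for t < 1, 1/(1 + t) for t >= 1."""
    return 1 / (3 - t) if t < 1 else 1 / (1 + t)

# Both branches approach the same value 1/2 at the joining point t = 1.
print(x(1 - 1e-9), x(1))
```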
On the other hand, the functions of examples 3 and 4 quoted above are not continuous. The value of the function in example 3 jumps from one point to the next; the function in case 4 is completely disconnected. Example 2 cannot easily be accommodated in terms of the naive characterisation: the function can be traced up to the very vicinity of the point t = 0, but the effort increases the closer one approaches this point.
The naive characterisation cannot be justified on mathematical grounds. The question whether it is possible to draw a curve in one go is, at least in part, a question of skill. It is necessary to convert the tentative, naive characterisation into a rigorous mathematical definition. It turns out that the implementation of this task calls for the introduction of a considerable chain of additional concepts.
1.2.2 Sequences
The first concept needed is that of a numerical sequence: a definite number is associated with each natural number 1, 2, 3, ..., n, ..., giving the terms a1, a2, a3, ..., an, ... . Some explicit examples are:
The sequence 1, 1/2, 1/3, ... with the general term an = 1/n.
The sequence 1, 4, 9, ... with the formula an = n².
The sequence 1, 2, 1, ... with a_{2n−1} = 1, a_{2n} = 2 for n = 1, 2, ... .
These examples indicate the possible properties of sequences.
• The terms of the sequence approach a finite limiting value A (A = 0 in
the first example).
• The terms increase indefinitely with increasing n beyond all limits in the
second example.
• The terms oscillate between two (or more) values in the third.
Sequences with a finite limiting value (for short: limit) are called convergent; all sequences which are not convergent are called divergent.
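The three behaviours can be tabulated numerically; a small sketch (the function names are ours):

```python
# The three example sequences: convergent, divergent, oscillating.
def a_conv(n):
    return 1 / n                 # limit A = 0

def a_div(n):
    return n ** 2                # grows beyond all limits

def a_osc(n):
    return 1 if n % 2 else 2     # a_{2n-1} = 1, a_{2n} = 2

for n in (1, 2, 3, 10, 100):
    print(n, a_conv(n), a_div(n), a_osc(n))
```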
The next step is a precise definition of the concepts ’convergence’ and
’limiting value’. The formal mathematical language needed for this purpose
sounds more complicated than it actually is.
(Figure: beyond the index N all terms a_N, a_{N+1}, ... lie within the interval (A − ε, A + ε).)
The definition of the limiting value of a function x(t) is: consider a function x(t) with a domain of definition and a point t_a which is the limit of a sequence of points that all lie within the domain of definition. The function x(t) possesses the limiting value x_a at the point t_a, that is

lim_{t→t_a} x(t) = x_a ,

if the following condition is satisfied: for each sequence of points t_1, t_2, ..., t_n, ... in the domain of definition with the limiting value

lim_{n→∞} t_n = t_a one finds lim_{n→∞} x(t_n) = x_a .
This definition sounds again more cumbersome than it is. The function

x(t) = (sin t)/t for 0 < t ≤ 1

can be used as an explicit example. This function is not defined for t = 0.
The diagram of the function in the vicinity of the point t = 0 is shown in
Fig. 1.6.
Fig. 1.6. Limiting value of the function (sin t)/t for t → 0
(independent of whether the point is approached from the right or the left). According to the definition the value of the function at this point is 10. The value of the function and the limiting value do not agree: the function is not continuous for t = 0.
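The sequence-based definition of the limit can be tried out directly on (sin t)/t; a short sketch:

```python
import math

# Approach t_a = 0 through the sequence t_n = 1/10^n; the function
# values x(t_n) = sin(t_n)/t_n approach the limiting value 1.
for n in range(5):
    t = 10.0 ** (-n)
    print(t, math.sin(t) / t)
```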
has to hold if the limit of the difference quotient is to be evaluated. This means that the value of the function and the limiting value at this point must coincide. The differential quotient would otherwise not be defined.
A possible, though rough classification of functions is therefore: the set
of continuous functions is a subset of all possible functions. Differentiable
functions are a subset of the set of continuous functions.
The relevance of the concepts discussed for the functions used in the discussion of kinematics in Chap. 2 is apparent. Functions which are supposed to describe the position x(t) and the speed v(t) of a point particle should be differentiable. This is necessary for the existence of an acceleration. Functions which characterise the acceleration have to be at least continuous.
Discontinuous functions can nonetheless be used to describe the acceleration,
but this is always an idealisation. For example, a point particle may be sub-
jected to a constant acceleration for some time which is turned off suddenly.
This process will, independent of the details, always be described by a continuous function (as indicated in Fig. 1.12a). A fast turn-off can in many cases be idealised by a discontinuous jump (Fig. 1.12b).
1.3 Series expansions
Fig. 1.12. Turning-off processes: (a) real, (b) ideal
The simplest approximation uses a straight line: the curve x(t) is replaced by the tangent to the curve at the point t = 0.¹

¹ Derivatives will be denoted by dx/dt = x'(t) and d^n x/dt^n = x^(n)(t), respectively.
x(t) ≈ x(0) + (dx/dt)|_{t=0} t = x(0) + x'(0) t .

The intercept of the straight line with the axis is x(0); x'(0) is its gradient.
This approximation is not sufficient for points further away from the point
t = 0 . One possibility to improve matters is an approximation by curves of
higher and higher order
x(t) ≈ a0 + a1 t + a2 t² + a3 t³ + ... + aN t^N ,

where the first two terms give a straight line, the first three a parabola, the first four a cubic parabola, and so on. The general shorthand is

x(t) ≈ Σ_{n=0}^{N} an t^n .
A series of this form is called the Taylor expansion of the function x(t)
about the position t = 0 .
The argument presented does, however, pose a problem. The ansatz has been differentiated term by term without further thought. This is permitted in the case of a polynomial. Whether this is also permitted for an infinite series is the content of question (ii): to what extent can the equal sign between the actual function and the Taylor expansion really be justified? Alternatively the question could be rephrased as: for which range of t-values can the equal sign be guaranteed? The answer to these questions will be postponed for some time. It is preferable to look first at some examples of Taylor series without consideration of finer points.
• The derivatives of the exponential function x(t) = e^t,

x^(n)(t) = e^t and x^(n)(0) = 1 ,

give for the Taylor series about the origin

e^t = 1 + t + t²/2! + t³/3! + ... = Σ_{n=0}^{∞} t^n/n! .
This series has e.g. been used for the discussion of free fall with friction
(Chap. 2) in the form (replace t → −kt)
e^{−kt} = Σ_{n=0}^{∞} (−1)^n k^n t^n / n! .
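The partial sums of the exponential series converge rapidly; a minimal sketch comparing them with the library value:

```python
import math

# Partial sums of e^t = sum t^n/n!, accumulating each term from the
# previous one (term *= t/n).
def exp_series(t, N):
    term, total = 1.0, 1.0
    for n in range(1, N + 1):
        term *= t / n
        total += term
    return total

print(exp_series(1.0, 20), math.exp(1.0))
```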
• The derivatives of the sine function x(t) = sin t at the origin are, respectively,

x(0) = 0, x'(0) = 1, x''(0) = 0, x'''(0) = −1, x^(4)(0) = 0, ... .
All even powers of the series expansion vanish. This expresses the fact that
sin t is an odd function. The series is therefore
sin t = t − t³/3! + t⁵/5! − t⁷/7! + ...

or in general

sin t = Σ_{n=0}^{∞} (−1)^n t^{2n+1}/(2n + 1)! .
• The following example shows that the evaluation of the necessary deriva-
tives can be quite wearisome. The derivatives of the function x(t) = tan t are
x(t) = tan t                        x(0) = 0
x'(t) = 1/cos² t                    x'(0) = 1
x''(t) = 2 sin t / cos³ t           x''(0) = 0
x'''(t) = 2(1 + 2 sin² t) / cos⁴ t  x'''(0) = 2 .
The calculation becomes more and more involved from this point on (try it!).
The Taylor series of the tangent function
tan t = t + (1/3) t³ + (2/15) t⁵ + (17/315) t⁷ + (62/2835) t⁹ + ...
is, for this reason, obtained by a combination of the power series of sin t and cos t. This is an example of the fact that many Taylor series are not obtained by direct evaluation of higher-order derivatives. It is more useful to assemble a collection of rules which allow the construction of the series of more complicated functions from the series of simpler functions.
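The combination hinted at here — insert tan t = Σ c_n t^n into sin t = tan t · cos t and compare coefficients — can be sketched with exact rational arithmetic (the code and its variable names are ours):

```python
from fractions import Fraction
from math import factorial

N = 10
sin_c = [Fraction(0)] * N     # coefficients of sin t
cos_c = [Fraction(0)] * N     # coefficients of cos t
for n in range(0, N, 2):
    cos_c[n] = Fraction((-1) ** (n // 2), factorial(n))
for n in range(1, N, 2):
    sin_c[n] = Fraction((-1) ** ((n - 1) // 2), factorial(n))

# Comparing factors of t^m in sin = tan * cos gives
# sin_c[m] = sum_j tan_c[j] * cos_c[m-j]; solve for tan_c[m] (cos_c[0] = 1).
tan_c = [Fraction(0)] * N
for m in range(N):
    tan_c[m] = sin_c[m] - sum(tan_c[j] * cos_c[m - j] for j in range(m))

print(tan_c[1], tan_c[3], tan_c[5], tan_c[7])   # 1 1/3 2/15 17/315
```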
• A power series that is used often is the binomial series. This series
corresponds to the Taylor series of the function
x(t) = (1 + t)^α    (α arbitrary, real) .
The calculation of the coefficients is elementary but involves some paperwork
x(t) = (1 + t)^α                  x(0) = 1
x'(t) = α (1 + t)^{α−1}           x'(0) = α
x''(t) = α(α − 1)(1 + t)^{α−2}    x''(0) = α(α − 1)
...
The n-th derivative is

x^(n)(t) = α(α − 1) · · · (α − n + 1) (1 + t)^{α−n} .
x(t) = Σ_n b_n (t − t0)^n .
Σ_{n=0}^{∞} u_n = u_0 + u_1 + u_2 + ... .

S_k = u_0 + u_1 + ... + u_k = Σ_{n=0}^{k} u_n .
These quantities are called subtotals or partial sums. The partial sums form
a sequence
S0 , S1 , S2 , . . . , Sk , . . . .
The numerical series possesses a unique and finite sum value if the sequence
of the partial sums converges towards a finite (and unique) limit
S = lim_{k→∞} S_k .
This series converges very slowly. The value of the sum is known to be
S = ln 2 = 0.693147 . . . .
• The harmonic series Σ_{n=1}^{∞} 1/n .
k:    2     10     20     30     40
S_k:  1.5   2.929  3.598  3.995  4.279    (rounded values)
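The table can be reproduced directly; a quick sketch:

```python
# Partial sums S_k of the harmonic series (compare the table above).
def S(k):
    return sum(1 / n for n in range(1, k + 1))

for k in (2, 10, 20, 30, 40):
    print(k, round(S(k), 3))
```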
It looks as if the sequence of partial sums converges, if ever so slowly. This conjecture is, however, wrong: the cumulative value of the harmonic series is ∞. The series is divergent.
In order to prove this statement the series
Σ_{n=1}^{∞} v_n = 1 + 1/2 + (1/4 + 1/4) + (1/8 + 1/8 + 1/8 + 1/8)
               + (1/16 + ... + 1/16) + (1/32 + ... + 1/32) + ...
                     8 terms              16 terms
can be compared with the harmonic series
Σ_{n=1}^{∞} u_n = 1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + 1/7 + 1/8
               + 1/9 + ... + 1/16 + 1/17 + ... + 1/32 + ... .
Each term of the harmonic series is larger than or equal to the correspond-
ing term of the comparative series vn ≤ un . The comparative series can
be rewritten in a different fashion
Σ_{n=1}^{∞} v_n = 1 + 1/2 + 1/2 + 1/2 + 1/2 + ... −→ ∞ .
The harmonic series diverges in view of the relation Σ_{n=1}^{N} v_n < Σ_{n=1}^{N} u_n for each N ≥ 3.
The estimate of convergence on the basis of a direct evaluation of the partial
sums has to be regarded with caution if no closed expression for the partial
sums is available. It is necessary to establish more general criteria for the
investigation of convergence.
The formal proof, which will not be given here, follows if the general criterion
for the convergence of sequences is transcribed to the case of a sequence of
partial sums.
The complementary statement, which has been used above, is

The sum Σ u_n is divergent if 0 ≤ v_n ≤ u_n for all n > N and if Σ v_n diverges.
It should be noted that the condition 'for all n > N' allows the possibility that a finite number of terms of the series Σ u_n may be larger (in the first case) or smaller (in the second case) than the corresponding terms of the comparative series.
It is not useful to try to find, for every given series Σ u_n, a special comparative series for which convergence or divergence has been established. It is more economical to use some standard series, which lead to simpler criteria, for the comparison envisaged. A useful comparative series is the geometric series
Σ_{n=0}^{∞} t^n = 1 + t + t² + ... ,
as it is possible to obtain an explicit expression for the partial sums. This
expression can be calculated with the following argument. Begin with
S_k  = 1 + t + ... + t^k
t S_k =    t + ... + t^k + t^{k+1}

and subtract to find

S_k (1 − t) = 1 − t^{k+1}

or

S_k = (1 − t^{k+1}) / (1 − t) if (1 − t) ≠ 0 .
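The closed expression for the partial sums is easily checked against the direct sum; a sketch:

```python
# Geometric series: direct partial sum vs the closed form
# S_k = (1 - t^(k+1)) / (1 - t).
def S_direct(t, k):
    return sum(t ** n for n in range(k + 1))

def S_closed(t, k):
    return (1 - t ** (k + 1)) / (1 - t)

print(S_direct(0.5, 10), S_closed(0.5, 10))
```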
The series is divergent for |t| ≥ 1 . On the other hand the limiting value is
lim_{k→∞} S_k = lim_{k→∞} (1 − t^{k+1})/(1 − t) = 1/(1 − t)
for |t| < 1 . By comparison with the geometric series it is possible to establish
the very useful root criterion (or root test), which states
The series Σ u_n is convergent if there exists a number q with 0 < q < 1, so that for all n > N the relation

|u_n|^{1/n} ≤ q < 1

is satisfied.
A proof of the criterion can be given in the following fashion. Write the condition stated in the criterion in the form

|u_n| ≤ q^n for n > N .

The partial sum

S_{N+p} = S_N + u_{N+1} + ... + u_{N+p}

can be majorised by the absolute values

|S_{N+p}| ≤ |S_N| + |u_{N+1}| + ... + |u_{N+p}| .

Use then the conditions of the criterion to find

|S_{N+p}| ≤ |S_N| + q^{N+1} (1 + q + ... + q^p)

and sum the additional terms

|S_{N+p}| ≤ |S_N| + q^{N+1} (1 − q^{p+1})/(1 − q) .

The finite result

|S| ≤ |S_N| + q^{N+1}/(1 − q)

can be obtained in the limit p → ∞ provided the condition q < 1 is satisfied.
Please note again that it is only necessary that all terms with n > N (N
a finite integer) satisfy the condition. Naturally, all terms with n ≤ N have
to be finite.
The related quotient criterion or quotient test can be demonstrated
in a comparable fashion. The formulation is similar, only the actual condition
is replaced by
|u_{n+1}/u_n| ≤ q < 1 for all n > N .
Application of the quotient test for the series defining the number e yields
u_{n+1}/u_n = n!/(n + 1)! = 1/(n + 1) < 1 (for n > 0) .
The application of the criterion for the case of the harmonic series (whether alternating or not) gives

|u_{n+1}/u_n| = n/(n + 1) .

This is smaller than 1 if n is finite, but the limiting value is

lim_{n→∞} n/(n + 1) = 1 .
This is exactly the case which is excluded in the criteria: the limit of the quotient is neither smaller nor larger than 1. It is not possible to conclude on the basis of the quotient test whether the series is convergent (as the alternating series is) or divergent (as the direct harmonic series is).
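The two cases can be made concrete; a sketch comparing the quotient u_{n+1}/u_n for the series defining e (where a q < 1 exists) with the harmonic series (where the quotient creeps up to 1):

```python
# Quotients u_{n+1}/u_n for u_n = 1/n! and for u_n = 1/n.
def ratio_e(n):
    return 1 / (n + 1)          # (1/(n+1)!) / (1/n!)

def ratio_harmonic(n):
    return n / (n + 1)          # (1/(n+1)) / (1/n)

for n in (1, 10, 100, 1000):
    print(n, ratio_e(n), ratio_harmonic(n))
```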
The criteria represent sufficient but not necessary conditions for convergence or divergence. Variants and refined criteria, which can be found in the mathematical literature, might be more adequate for special cases. The criteria do not help in any way with the calculation of the values of the infinite sums. This can involve a good deal of work. The criteria will, however, tell you whether the attempt is worth your while.
|t| < 1 / |a_n|^{1/n} and in particular |t| < lim_{n→∞} 1 / |a_n|^{1/n} = R

because of

|a_{n+1} t^{n+1} / (a_n t^n)| < 1 .
If the complementary criteria for divergence are included in the discussion,
the role of the radius of convergence can be summarised in the form:
A power series converges for |t| < R . It diverges for |t| > R. No
statement can be made for |t| = R.
The value of the infinite sum is finite for every (even a very large) t-value with
t < ∞ . This might sound astonishing, for example in view of the exponential
series with t = 100 . The series starts with
e^100 = 1 + 10² + 10⁴/2 + 10⁶/6 + 10⁸/24 + 10¹⁰/120 + ... .
The individual terms decrease, however, after the largest term 10^200/100!, with the approximate value 10^42, has been reached. The value of the infinite sum for e^100 is

e^100 ≈ 3 · 10^43 .
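This behaviour of the terms can be checked by summing the series in floating point; a sketch (400 terms are more than enough):

```python
import math

# Sum the exponential series for t = 100 and track the largest term.
total, term, largest = 1.0, 1.0, 1.0
for n in range(1, 400):
    term *= 100 / n            # 100^n / n!
    largest = max(largest, term)
    total += term

print(largest)    # roughly 1e42, reached around n = 100
print(total)      # roughly 2.7e43, i.e. e^100
```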
The radius of convergence of the series for the sine-function
Σ_{n=0}^{∞} (−1)^n t^{2n+1}/(2n + 1)!
is, as a consequence of the corresponding structure of the individual terms,
also R = ∞ . It is
R = lim_{n→∞} |(n + 1)/(α − n)| = 1
for the binomial series (see the discussion of the convergence of the geometric
series).
The longish proof of this theorem requires the steps (consult the mathemat-
ical literature):
(i) Show that the series ϕ(t) = Σ_n n a_n t^{n−1} has the same radius of convergence as the series x(t) = Σ_n a_n t^n.
(ii) Show that the function ϕ(t), defined by the power series Σ_n n a_n t^{n−1}, is the derivative of the function x(t), that is ϕ(t) = dx(t)/dt.
The Taylor formula
a_n = (1/n!) x^(n)(0)
has been obtained via term by term differentiation. The theorem therefore
guarantees, that the function x(t) , which has been used to generate the coef-
ficients, is really represented by the series within the interval of convergence.
Three remarks conclude this relatively condensed discussion of series ex-
pansions.
(1) The statements concerning the series expansion about the origin of a
coordinate system t = 0 can be transferred to the case of an expansion about
a point t0. For example, the radius of convergence of the series

x(t) = Σ_n b_n (t − t0)^n = Σ_n (x^(n)(t0)/n!) (t − t0)^n

is determined by

|t − t0| < R = lim_{n→∞} 1 / |b_n|^{1/n} .
(2) Convergent power series can (nearly) be manipulated in the same way as numbers. As an example, the rule for the multiplication of two power series is: the product of two power series x(t) = Σ_n a_n t^n and y(t) = Σ_n b_n t^n can be represented as a power series

x(t) y(t) = (Σ_n a_n t^n)(Σ_n b_n t^n) = Σ_m c_m t^m
tan t = sin t / cos t = Σ_n c_n t^n ,

or

sin t = (Σ_n c_n t^n) cos t .
After insertion of the series expansion for cos t the two series on the right-hand side are multiplied term by term. Comparison of the factors of t^n with the expansion for sin t yields recursion relations for the coefficients c_n. The radius of convergence of the tangent series is R(tan) = π/2, as the function cos t has the value zero for |t| = π/2.
(3) Series constructed from more general functions are also of interest in physics.
Fourier series constitute a tool which allows the representation and the anal-
ysis of periodic processes both in space and/or in time. They are, for this
reason, of particular interest in physics for the description of oscillations or
wave propagation.
S_N(x) = A_0/2 + Σ_{n=1}^{N} A_n cos(nπx/L) + Σ_{n=1}^{N} B_n sin(nπx/L) .
The following argument shows that a partial sum with the coefficients

A_n = (1/L) ∫_{−L}^{L} dx f(x) cos(nπx/L)

B_n = (1/L) ∫_{−L}^{L} dx f(x) sin(nπx/L)
This expression is minimal if the last three (positive) terms vanish. This
requires
an = An and bn = Bn for n ≤ N .
A partial sum S_N with coefficients A_n and B_n, which are calculated as indicated above, does provide the best mean approximation of a given periodic function f(x) for each value of N. The mean square deviation is positive definite. This allows one to state the inequality
³ The Kronecker symbol δ_{n,m} takes the values 1 for n = m and 0 for n ≠ m.
a_0²/2 + Σ_{n=1}^{N} (a_n² + b_n²) ≤ (1/L) ∫_{−L}^{L} dx f(x)²   for all N
The transition from the partial sums S_N(x) to the Fourier series

f(x) = a_0/2 + Σ_{n=1}^{∞} a_n cos(nπx/L) + Σ_{n=1}^{∞} b_n sin(nπx/L)

with the coefficients

a_n = (1/L) ∫_{−L}^{L} dx f(x) cos(nπx/L) and b_n = (1/L) ∫_{−L}^{L} dx f(x) sin(nπx/L)
in the limit N −→ ∞ demands a more extensive discussion. It is necessary to demonstrate that this series converges absolutely and uniformly in the basic interval⁴. Only in this case is it possible to calculate the coefficients of the series with a term by term integration. Uniform convergence implies in mathematical language that there exists for each ε > 0 a partial sum S_N(x) so that |f(x) − S_N(x)| < ε for all x of the interval. The Fourier series will represent the function in the basic interval, or in a finite number of intervals in which it is continuous, if these conditions are met. At points with a step of the function the series yields the number

lim_{ε→0} (1/2) (f(x + ε) + f(x − ε)) .
1.3.4.2 An explicit example. The calculation of the Fourier representa-
tion of a periodic function (if convergence is assured) implies the evaluation
of the integrals for the coefficients an and bn for all n . The coefficients bn
(or an ) will vanish for all even (or odd) functions f (x) as the sine-function is
odd and the cosine-function is even. This implies that only the coefficients
b_n = (1/L) ∫_{−L}^{L} dx x sin(nπx/L)

have to be evaluated for the saw tooth function (see Chap. 4.2.4)

f(x) = x for −L ≤ x ≤ L .
The result for these coefficients is

b_n = (1/L) [ (L/(nπ))² sin(nπx/L) − (L x/(nπ)) cos(nπx/L) ]_{−L}^{L} = (−1)^{n+1} 2L/(nπ) .
⁴ Alternatively: in the case of singular points in the basic interval, as e.g. steps of the function, in each closed part interval.
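With the coefficients derived above the partial sums can be evaluated numerically; a sketch for L = 1 (the function name is ours):

```python
import math

# Partial Fourier sums of the saw tooth f(x) = x on [-1, 1] with
# b_n = (-1)^(n+1) * 2 / (pi n).
def S_N(x, N):
    return sum((-1) ** (n + 1) * 2 / (math.pi * n) * math.sin(n * math.pi * x)
               for n in range(1, N + 1))

print(S_N(0.5, 2001))   # approaches f(0.5) = 0.5 slowly (coefficients ~ 1/n)
```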
1.4 Integration
A definite integral

I(a, b) = ∫_a^b x(t) dt
is normally discussed under the conditions
a) The interval of integration is finite.
b) The integrand is bounded |x(t)| < M < ∞ for a ≤ t ≤ b .
It is, however, possible in certain cases to go beyond these restrictions. These
cases will be introduced in terms of explicit examples rather than by a rig-
orous approach.
Fig. 1.15. Indication of the integral ∫_0^∞ e^{−λt} dt
Fig. 1.16. Improper integral with cos t
I = lim_{b→∞} ∫_0^b cos t dt = lim_{b→∞} [sin b − 0]
and finds that the limiting value is not defined. This improper integral does
not exist.
The arguments can be summed up in the form: improper integrals with an
infinite interval of integration can be defined rigorously in terms of a limiting
process. On the basis of this process one distinguishes the cases
I = lim_{b→∞} ∫_0^b x(t) dt is
  finite → convergence ,
  does not exist → divergence ,
  ∞ → divergence .
Fig. 1.18. Comparison of the integrands x(t) = t^{−1/2} (a) and x(t) = t^{−2} (b)
The rise of the integrand for t −→ 0 is slow enough in the first case
(Fig. 1.18a) so that the limiting value is finite. The rise is too strong in the
second case (Fig. 1.18b).
This limit is called the Cauchy principal value of the improper integral
if it exists.
3. The improper integral is divergent if even the Cauchy principal value
does not exist.
An example for an integral with a Cauchy principal value is

∫_{−a}^{b} dt/t    (a, b > 0) .
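Because the antiderivative of 1/t is ln|t|, the symmetric limiting process can be written down directly; a sketch (the helper name is ours):

```python
import math

# Cauchy principal value of the integral of 1/t from -a to b (a, b > 0):
# cut out (-eps, eps); the ln(eps) contributions cancel, leaving ln(b/a).
def principal_value(a, b, eps):
    left = math.log(eps) - math.log(a)    # integral from -a to -eps
    right = math.log(b) - math.log(eps)   # integral from eps to b
    return left + right

print(principal_value(1.0, 2.0, 1e-8), math.log(2.0))
```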
2.1 Orientation
The differential equations indicated are therefore of first order for the function v(t) (left column), respectively of second order for the function x(t). It should be noted that v' = a(x) is not really a differential equation for v(t). This expresses the point of view that only the function to be determined, its derivatives and the independent variable should occur in the equation.
The three cases are special cases of a general explicit differential equation of second order for the function x(t)

x''(t) = a(t, x, x') .
In terms of the language of mechanics: the acceleration can be an arbitrary
function of the time, the position and the velocity. Explicit examples for
the solution of more general differential equations of second order will be
addressed in Math.Chap. 6.3.
The statement
The general solution of a differential equation of n -th order
contains n constants of integration.
can be illustrated in a direct fashion. The term ’general solution’ implies that
no conditions of any kind are stipulated concerning the solution. The mean-
ing of the word ’constant of integration’ will be clarified immediately. The
illustration begins by considering functions which contain one, two, three,
. . . constants. Assume then that the functions are possible solutions of a
differential equation and demonstrate that they are solutions of a differen-
tial equation of first, second, third, . . . order. The assertion follows then by
inversion of this argument.
The first case deals with the specification
v = v(t, c) .
The function contains the independent variable t and the parameter c . It is
assumed that each value of c produces a unique curve in the v - t-diagram, so
that variation of the parameter c leads to a family of curves. Examples are:
• A family of cubic parabolae, which are parallel shifted, is characterised by
the function
v = (t + c)³ .
The individual parabolae intersect the t -axis in the point t = −c (see
Fig. 2.1 a).
• The first term of the function
v = t + c e^t
corresponds to the bisecting line in the first and third quadrant of the v - t
diagram. Added to this is an exponential function. For c > 0 the function
approaches the bisecting line in the limit t → −∞ from above, for c < 0
from below (see Fig. 2.1 b).
x = x(t, c1, c2), x' = dx/dt = v(t, c1, c2), x'' = dv/dt = a(t, c1, c2) ,

respectively for the example

x = c1 (t + c2)³ , x' = 3 c1 (t + c2)² , x'' = 6 c1 (t + c2) .
The ratios

x/x' = (1/3)(t + c2) and x'/x'' = (1/2)(t + c2)

lead after simple sorting to the (not necessarily simple) differential equation of second order

x x'' = (2/3) x'² .
A general discussion shows that an implicit differential equation of second order of the form F(t, x, x', x'') = 0 is obtained by elimination of the two parameters from the equation x = x(t, c1, c2). Additional examples are
family equation differential equation
c1 c2³ = 0 , c1 (1 + c2)³ = 1 .

The solution is c1 = 1 and c2 = 0, so that the parabola x = t³ is selected by the boundary values given.
• A second possibility for the selection of a definite curve requires the specification of a value of the function x(t0) = x0 and the first derivative v0 = x'(t0) at a point t0. The parameters are determined from the equations

x0 = x(t0, c1, c2) , v0 = x'(t0, c1, c2)
in this case. This option is referred to as an initial value problem
(Fig. 2.3b).
Consider once more the example
x = c1 (t + c2)³ with the specification t0 = 0, x0 = 1, v0 = 1 .

This leads to the equations c1 c2³ = 1 and 3 c1 c2² = 1 with the solutions c1 = 1/27 and c2 = 3. From the family of cubic parabolae the particular curve x = (1/27)(t + 3)³ is selected by the specification of initial values.
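The selected curve can be verified against both initial values; a minimal check:

```python
# The particular solution x(t) = (1/27)(t + 3)^3 and its derivative.
def x(t):
    return (t + 3) ** 3 / 27

def v(t):
    return (t + 3) ** 2 / 9    # x'(t)

print(x(0), v(0))   # both 1, matching x0 = 1 and v0 = 1
```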
It should be kept in mind that not every specification leads to the selection of a particular solution. Consider, for the sake of simplicity, differential equations of first order, for which the specification of the value of the solution at one point should be sufficient. A particular solution of the differential equation t x' = 2x can be selected by specifying (t0, x0) as x = x0 t²/t0². An exception is the point (0, 0): all solutions pass through this point. None of the circles (t − c)² + x² = c², which are solutions of the differential equation 2 t x x' − x² + t² = 0, can satisfy the condition x(0) = x0 ≠ 0. For t0 = 0 all the circles pass through the origin.
(Fig. 2.3: (a) boundary values x1 = x(t1), x2 = x(t2); (b) initial values x0, v0 at t0.)
The pattern for this method of solution can be gleaned from a differential equation of first order of the form

g(t) + f(x) x' = 0 with x(t0) = x0 ,

or written more explicitly for the sake of the present argument

g(t) + f(x) dx/dt = 0 .
A simple but incorrect path to the solution of this differential equation of
first order is the following: the ’numerator’ and the denominator of the dif-
ferential quotient are interpreted as independent, small (that is infinitesimal)
quantities so that the differential equation can be written as
g(t) dt = −f(x) dx .
The two variables are separated here. Integrate both sides of this relation in a corresponding fashion, using the proper initial conditions, and obtain²

∫_{t0}^{t} dt̃ g(t̃) = − ∫_{x0}^{x} dx̃ f(x̃) .
The problem is solved if the two integrals can be worked out. The solution may be written in terms of the corresponding primitives
G(t) − G(t0 ) = −(F (x) − F (x0 )) ,
where the initial condition is incorporated in an explicit fashion. The two
constant terms can be subsumed so that the general form of the solution
F (x) + G(t) = c ,
which may be resolved with respect to either x or also t , follows.
This argument is not correct. The differential quotient represents a lim-
iting value and not a fraction. Fortunately the same result can be obtained
in a more rigorous manner. The following steps are involved:
• Begin with the differential equation g(t) + f(x) x' = 0 and consider the indefinite integral

∫^t dt̃ g(t̃) + ∫^t dt̃ x'(t̃) f(x(t̃)) = c
and the initial conditions x(t0) = x0, v(t0) = v0 lead to a result that represents essentially the law of energy conservation of mechanics for the motion of one mass point in one space dimension³
(1/2) v² − (1/2) v0² = φ(x) − φ(x0) .
Write this result in the form v² = 2φ(x) + c and solve the differential equation

dx/dt = ± [2φ(x) + c]^{1/2}

in a second step using separation of variables

t − t0 = ± ∫_{x0}^{x} dx̃ / [2φ(x̃) + c]^{1/2} .
This result can possibly be inverted in the form x = x(t) after evaluation
of the integral. The sign has to be chosen on the basis of suitable (physical)
arguments in order to have a unique solution.
An example for this case is the harmonic oscillator problem (which can
also be solved more simply, see Math.Chap. 2.2.2): the differential equation
x'' = −ω² x or v dv/dx = −ω² x
is to be solved with e.g. the initial conditions
t0 = 0, x(0) = 0, v(0) = v0 .
The mass of the oscillator moves initially in the positive x - direction and
passes through the origin. Separation of variables gives in the first step
(1/2) v² − (1/2) v0² = −(1/2) ω² x² .
Resolve in the form v = ± (v0² − ω² x²)^{1/2}, choose the sign in agreement with the initial condition v(0) = +v0 and apply separation of variables a second time to find

³ Multiply by m and sort, compare Chap. 3.2.3.
ωt = ∫_0^x dx̃ / [(v0/ω)² − x̃²]^{1/2} .
The primitive of this integral is the arc sine

∫ dx̃ / [a² − x̃²]^{1/2} = arcsin(x̃/a) + c ,

so that the result is

arcsin(ωx/v0) = ωt
or by inversion

x(t) = (v0/ω) sin ωt .

The amplitude A = v0/ω is determined by the initial conditions; the maximal displacement is proportional to the initial velocity in this example. The velocity function v(t) = v0 cos ωt can either be obtained by differentiation of x(t) or by insertion of the result for x(t) into the equation v = ± (v0² − ω² x²)^{1/2}.
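The separated integral can also be evaluated numerically and compared with the arc sine; a sketch with the midpoint rule (the numerical values are chosen by us):

```python
import math

v0, omega = 2.0, 3.0           # example initial speed and frequency
a = v0 / omega                 # amplitude v0/omega

def omega_t(x, steps=20000):
    """omega*t = integral from 0 to x of dx / sqrt((v0/omega)^2 - x^2)."""
    h = x / steps
    return sum(h / math.sqrt(a * a - ((i + 0.5) * h) ** 2)
               for i in range(steps))

x = 0.4
print(omega_t(x), math.asin(omega * x / v0))   # both about 0.6435
```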
The process of solution is slightly different for the initial conditions
t0 = 0, x(0) = A, v(0) = 0 .
The mass is initially displaced in the positive x -direction. As the velocity is
zero at this time the sense of the oscillation is reversed. The solution with
the present initial conditions is
x(t) = A cos ωt v(t) = −Aω sin ωt .
The sign and the integration constant can only be selected after the second
step.
The simpler way to the solution, which has been mentioned, relies on
the fact that the function a(x) is linear in x. The method of separation of
variables has the advantage that a solution can be obtained for a general
form of the function a(x). A prerequisite for gaining an analytical solution is that all integrals can be evaluated and that the result can be inverted to x = x(t) in an analytic manner.
The method of separation of variables can also be applied directly in the case $\ddot{x} = a(v)$. Write the differential equation in the form $\dot{v}(t) = a(v)$ and obtain the solution
$$t - t_0 = \int_{v_0}^{v} \frac{d\tilde{v}}{a(\tilde{v})}\,.$$
The free fall problem with friction, characterised this time by $a(v) = g - kv$, and the initial conditions
$$t_0 = 0\,, \quad x(0) = x_0\,, \quad v(0) = v_0$$
can be used as an example in this case. Calculate⁴
$$t = \int_{v_0}^{v} \frac{d\tilde{v}}{(g - k\tilde{v})} = -\frac{1}{k}\,\ln\frac{g - kv}{g - kv_0}$$
in the first step. The choice of the sign in
$$\frac{g - kv}{g - kv_0} = e^{-kt}\,,$$
which is necessary after the inversion, can be dealt with according to the
special situation. The result of the inversion
$$v(t) = \frac{g}{k}\left(1 - e^{-kt}\right) + v_0\, e^{-kt}$$
has to be integrated once more. The final result has been given above.
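A short numerical check of the inverted result is possible; the values of $g$, $k$ and $v_0$ below are arbitrary assumptions:

```python
import math

# Arbitrary illustrative values for g, the friction constant k, and v0.
g, k, v0 = 9.81, 0.5, 2.0

def v(t):
    """v(t) = (g/k)(1 - e^{-kt}) + v0 e^{-kt}, the inverted result."""
    return (g / k) * (1.0 - math.exp(-k * t)) + v0 * math.exp(-k * t)

# v(0) = v0, and v(t) approaches the limiting velocity g/k for large t.
assert abs(v(0.0) - v0) < 1e-12
assert abs(v(50.0) - g / k) < 1e-8

# The defining equation dv/dt = g - k v, checked by a central difference.
h = 1e-6
for t in (0.0, 1.0, 3.0):
    dv = (v(t + h) - v(t - h)) / (2.0 * h)
    assert abs(dv - (g - k * v(t))) < 1e-6
print("free fall with friction verified")
```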
with the general solution $R(t) = c_1 + c_2 t$. The two solutions found, that is $x_1 = e^{\alpha_1 t}$, $x_2 = t\, e^{\alpha_1 t}$, are linearly independent because of
$$W(e^{\alpha_1 t},\, t\, e^{\alpha_1 t}) = e^{2\alpha_1 t} \neq 0\,.$$
The general solution of the differential equation in the case of a double root of the characteristic equation is therefore
$$x_h(t) = (c_1 + c_2 t)\, e^{\alpha_1 t}\,.$$
The different situations, which can be encountered in practical applications, are demonstrated by the following four examples.
1. The characteristic equation of the differential equation $\ddot{x} + 4\dot{x} - 5x = 0$ has the real roots $\alpha_1 = 1$ and $\alpha_2 = -5$. The general solution is therefore
$$x_h(t) = c_1\, e^{t} + c_2\, e^{-5t}\,.$$
2. The characteristic equation of the differential equation $\ddot{x} + 4\dot{x} + 5x = 0$ has the complex roots (see Math.Chap. 7 for some details concerning complex numbers and functions)
$$\alpha_1 = -2 + \mathrm{i}\,, \qquad \alpha_2 = -2 - \mathrm{i} \qquad (\mathrm{i}^2 = -1)\,.$$
The roots are complex conjugates of each other, $\alpha_1 = \alpha_2^{*}$. This guarantees that the combinations $\alpha_1 + \alpha_2$ and $\alpha_1 \cdot \alpha_2$ are real. The general solution is
$$x_h(t) = \left(c_1\, e^{\mathrm{i}t} + c_2\, e^{-\mathrm{i}t}\right) e^{-2t}\,.$$
It might astonish that the solution of a differential equation with real coefficients should be complex. However, with the relation
$$e^{\pm \mathrm{i}t} = \cos t \pm \mathrm{i}\sin t$$
a real form
$$x_h(t) = (A \cos t + B \sin t)\, e^{-2t}$$
may be obtained. The relation between the coefficients is
$$A = c_1 + c_2\,, \qquad B = \mathrm{i}(c_1 - c_2)\,.$$
The two trigonometric functions are linearly independent, as one finds $W(\sin t, \cos t) = -1 \neq 0$. A third form of the solution is
$$x_h(t) = C \sin(t + \varphi)\, e^{-2t}$$
with
$$A = C \sin\varphi\,, \qquad B = C \cos\varphi$$
and the inverse
$$C = \left[A^2 + B^2\right]^{1/2}\,, \qquad \tan\varphi = \frac{A}{B}\,.$$
In the end it does not matter which of the three forms is used (apart from the fact that the exponential functions are handled more easily, e.g. with respect to the addition theorems). The specification of real initial values for an initial value problem of physics will lead to a real solution. The initial conditions $x(0) = 1$, $\dot{x}(0) = 0$ lead in all three cases to the real solution
$$x_h(t) = (\cos t + 2 \sin t)\, e^{-2t}$$
of the present differential equation.
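The quoted initial value solution can be verified numerically, together with the equivalence of the amplitude/phase form; a minimal sketch (the finite differences are used here purely for illustration):

```python
import math

def x(t):
    """The real solution (cos t + 2 sin t) e^{-2t} of x'' + 4x' + 5x = 0."""
    return (math.cos(t) + 2.0 * math.sin(t)) * math.exp(-2.0 * t)

h = 1e-5
# Initial conditions x(0) = 1 and x'(0) = 0 (central differences).
assert abs(x(0.0) - 1.0) < 1e-12
assert abs((x(h) - x(-h)) / (2.0 * h)) < 1e-6

# The differential equation itself at a few sample points.
for t in (0.2, 0.7, 1.5):
    xd = (x(t + h) - x(t - h)) / (2.0 * h)
    xdd = (x(t + h) - 2.0 * x(t) + x(t - h)) / h**2
    assert abs(xdd + 4.0 * xd + 5.0 * x(t)) < 1e-4

# Amplitude/phase form: A = 1, B = 2 give C = sqrt(5), tan(phi) = A/B = 1/2.
C, phi = math.sqrt(5.0), math.atan2(1.0, 2.0)
for t in (0.0, 0.3, 1.1):
    assert abs(C * math.sin(t + phi) * math.exp(-2.0 * t) - x(t)) < 1e-12
print("all three checks passed")
```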
3. The characteristic equation of the differential equation $\ddot{x} - 4\dot{x} + 4x = 0$ has a double root $\alpha_1 = \alpha_2 = 2$. Hence the general solution is
$$x_h(t) = (c_1 + c_2 t)\, e^{2t}\,.$$
4. The differential equation of the harmonic oscillator $\ddot{x} + \omega^2 x = 0$ can be solved with the same method. The general solution can be given in three (actually not so) different forms
$$x_h(t) = c_1\, e^{\mathrm{i}\omega t} + c_2\, e^{-\mathrm{i}\omega t} = A \cos \omega t + B \sin \omega t = C \sin(\omega t + \varphi)\,.$$
The last example of this section is an inhomogeneous, linear differential equation with constant coefficients. The differential equation
$$\ddot{x} + \omega^2 x = b_0 \sin \omega_0 t$$
characterises a driven harmonic oscillator (see Chap. 4.2.3). Besides the general solution of the homogeneous oscillator equation a particular solution of the inhomogeneous differential equation is needed. The ansatz $x_p(t) = D \sin \omega_0 t$ for the determination of this function is sufficient in this case, as the second derivative of the sine function is again a sine function. Insertion into the differential equation allows, via
$$(-\omega_0^2 + \omega^2)\, D \sin \omega_0 t = b_0 \sin \omega_0 t\,,$$
the determination of the constant $D$ as
$$D = b_0/(\omega^2 - \omega_0^2)$$
(as long as $\omega^2 \neq \omega_0^2$). The general solution of the inhomogeneous differential equation is therefore
$$x_i(t) = A \cos \omega t + B \sin \omega t + \frac{b_0}{(\omega^2 - \omega_0^2)}\,\sin \omega_0 t\,.$$
The initial conditions $x(0) = x_0$, $\dot{x}(0) = 0$ yield (by solution of a system of linear equations for the integration constants) the special solution
$$x(t) = x_0 \cos \omega t + \frac{b_0}{(\omega^2 - \omega_0^2)}\left(\sin \omega_0 t - \frac{\omega_0}{\omega}\,\sin \omega t\right)\,.$$
The inhomogeneous term vanishes for b0 = 0 and the special solution of the
homogeneous problem (with corresponding initial conditions) is recovered.
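The special solution can again be tested against the differential equation; the parameter values below are arbitrary, non-resonant assumptions ($\omega \neq \omega_0$):

```python
import math

# Arbitrary non-resonant parameter values (omega != omega0) -- assumptions.
omega, omega0, b0, x0 = 3.0, 1.0, 2.0, 0.5
D = b0 / (omega**2 - omega0**2)

def x(t):
    """Special solution of the driven oscillator for x(0) = x0, x'(0) = 0."""
    return (x0 * math.cos(omega * t)
            + D * (math.sin(omega0 * t) - (omega0 / omega) * math.sin(omega * t)))

h = 1e-5
assert abs(x(0.0) - x0) < 1e-12                     # x(0) = x0
assert abs((x(h) - x(-h)) / (2.0 * h)) < 1e-6       # x'(0) = 0

# Inhomogeneous equation x'' + omega^2 x = b0 sin(omega0 t).
for t in (0.3, 1.0, 2.2):
    xdd = (x(t + h) - 2.0 * x(t) + x(t - h)) / h**2
    assert abs(xdd + omega**2 * x(t) - b0 * math.sin(omega0 * t)) < 1e-4
print("driven oscillator verified")
```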
3 Linear Algebra
3.1 Vectors
Fig. 3.1. A vector as a directed line segment from a point A to a point B

$$\boldsymbol{r} = \vec{r}\ \left(= \overrightarrow{AB}\right)\,,$$
or more calligraphical variants. The notation for the length (the magnitude)
of a directed line segment (of a vector) is
|r| = r .
Examples of vectorial quantities in physics are forces (the direction in which the push or pull is applied plays a role), velocities, angular momenta, electric or magnetic fields, etc.
Some finer points can be noted notwithstanding the simple definition. One
differentiates between
• Fixed vectors. The starting point is strictly fixed. Examples are displacement vectors, which start at a definite position.
• Sliding Vectors. The starting point can be shifted along the straight line,
which is given by the direction of the vector. Force vectors are examples
of such vectors. The point at which the force is applied can be shifted, for
instance by a ’rope’.
• Free vectors. They can be shifted in an arbitrary way but have to retain
their direction.
¹ Mainly the left-hand form is used.
Free vectors will be used in the discussion that follows, first to assemble a vector calculus (to begin with in a qualitative form, which is useful for applications in physics and geometry).
The addition of vectors corresponds to consecutive displacements. An
object is first displaced by the vector r 1 , then by the vector r 2 . The vector,
which connects the starting point and the end point of this chain of vectors,
is the sum vector. It is defined by
S = r1 + r2 .
The sum vector marks the shortest connection of these points (Fig. 3.2a).
The sum vector can also be constructed by using the same starting points
for the two vectors and complementing the figure to form a parallelogram
(Fig. 3.2a). The long diagonal of the parallelogram is the sum vector. The
fact, that vector addition is commutative, is used in this construction
r1 + r2 = r2 + r1 .
Fig. 3.2. Illustration of vector addition: (a) two vectors, (b) three vectors

The addition of more than two vectors (Fig. 3.2b) is indicated by the statement
r 1 + r 2 + r 3 = (r 1 + r 2 ) + r 3 = r 1 + (r 2 + r 3 ) .
The associative law of vector addition is indicated in this equation.
The multiplication of a vector with a scalar describes the following
manipulation: do not shift the object by the vector r but by a times the
vector. The resulting vector
R = ar
has the length |ar|. It points in the same direction as the vector r if the number a is positive. The vector R is called a zero vector (null vector) if a = 0. The question could be posed in this case whether a quantity without a length and a direction should be called a vector. The zero vector is, however, an indispensable quantity – similar to the number 0 for the operations with numbers – with the property
r+0 = r.
The vector R points in the direction opposite to r if a is negative; the length is still given by |ar|. For multiplication with a scalar the distributive laws
$$(a + b)\,r = ar + br\,, \qquad a(r_1 + r_2) = ar_1 + ar_2$$
and the associative law
$$a(br) = (ab)\,r$$
hold.
The subtraction of vectors can be represented with the aid of addition
and multiplication with (−1) . The difference of two vectors is
D = r 1 − r 2 = r 1 + (−1) r 2 .
The vector r2 is turned around and then added. The difference vector can
also be obtained as the short diagonal in the vector parallelogram. The end
point of the difference vector is identical with the end point of the vector r 1
(Fig. 3.3).
Fig. 3.3. Subtraction of vectors
There exist two different products of a vector with another. The scalar product, also called the inner product, corresponds, so to speak, to the projection of one vector onto the other. The definition (and notation) of the scalar product is
$$(r_1 \cdot r_2) = r_1 \cdot r_2 = r_1 r_2 \cos \varphi_{12}\,.$$
The angle ϕ12 is the angle enclosed by the two vectors. The definition includes
the following operations:
• Project the vector r 2 onto the direction of r 1 . This leads to the factor
r2 cos ϕ12 .
• Multiply with the magnitude of the vector r 1 .
• Alternatively, the vector r 1 may first be projected onto the direction of
r 2 . This is to be followed by multiplication with r2 (Fig. 3.4a).
The definition therefore associates
two vectors =⇒ one scalar (number).
Fig. 3.4. The scalar product: (a) definition, (b) distributive law

Fig. 3.5. Illustrating the law of cosines
There also exist rules for handling the vector product:
• The vector product is anticommutative (Fig. 3.7)
Fig. 3.7. Anticommutativity of the vector product: (a) a × b, (b) b × a
$$(r_1 \times r_2) = -(r_2 \times r_1)\,.$$
This emphasises the importance of the detailed specification of the direction of $r_3$ by the right-hand rule.
• The definition implies the associative law with respect to multiplication
with a scalar
c(r 1 × r 2 ) = ((cr 1 ) × r 2 ) = (r 1 × (cr 2 )) .
• There exists a distributive law
(r 1 × (r 2 + r 3 )) = (r 1 × r 2 ) + (r 1 × r 3 ) .
A proof on the basis of elementary geometry is not difficult but somewhat
cumbersome (try it!).
The qualitative form of the vector calculus is not suitable for the production of quantitative results. Unnecessary errors occur, for instance, if one attempts to determine a vector sum with the aid of a ruler and a goniometer (angle gauge). It is necessary to go over to a quantitative version of vector calculus.
Fig. 3.8. Cartesian coordinate system: basis vectors e₁ = eₓ, e₂ = e_y, e₃ = e_z and a vector r
The factor which features in the vector sum is the Levi-Civita symbol with the properties
$$\epsilon_{ijk} = \begin{cases} 0 & \text{if two indices are equal,} \\ 1 & \text{if } (ijk) \text{ is a cyclic permutation of } (123)\,, \\ -1 & \text{for every other permutation}\,. \end{cases}$$
$$r = x_1 e_1 + x_2 e_2 + x_3 e_3 = \sum_{i=1}^{3} x_i e_i\,.$$
Fig. 3.9. The decomposition into components
r ⇒ (x1 , x2 , x3 )
can be used for this reason. Vectors in R3 can be characterised by a triple of
numbers³.
The decomposition into components is the key for a quantitative formulation of vector calculus. The individual arithmetic operations can be summarised (use x instead of r₁, etc.) with
² An introductory discussion of additional spaces can be found in Math.Chap. 3.1.3 and 3.1.4.
³ The equivalence expressed by ⇒ can be read as an equal sign on the basis of matrix calculus (Math.Chap. 3.2).
$$x = x_1 e_1 + x_2 e_2 + x_3 e_3\,, \qquad y = y_1 e_1 + y_2 e_2 + y_3 e_3$$
in the following fashion (Fig. 3.10):
• Addition. Using the rules indicated above one finds for the sum vector
$$S = x + y = (x_1 + y_1)e_1 + (x_2 + y_2)e_2 + (x_3 + y_3)e_3 = \sum_{i=1}^{3} (x_i + y_i)\,e_i \;\Rightarrow\; (x_1 + y_1,\; x_2 + y_2,\; x_3 + y_3)\,.$$
The sum vector is obtained by addition of the individual components.
• Multiplication with a scalar. The decomposition is in this case
$$R = ax = (ax_1)e_1 + (ax_2)e_2 + (ax_3)e_3 = \sum_{i=1}^{3} (ax_i)\,e_i \;\Rightarrow\; (ax_1,\; ax_2,\; ax_3)\,.$$
• Subtraction. The difference of two vectors is
$$D = x - y = (x_1 - y_1)e_1 + (x_2 - y_2)e_2 + (x_3 - y_3)e_3 = \sum_{i=1}^{3} (x_i - y_i)\,e_i \;\Rightarrow\; (x_1 - y_1,\; x_2 - y_2,\; x_3 - y_3)\,.$$
Fig. 3.10. Illustration of the vector operations

• Scalar product. With the orthogonality relations of the basis vectors the result is
$$(x \cdot y) = x_1 y_1 + x_2 y_2 + x_3 y_3\,.$$
The scalar product can be calculated as the sum of the products of the components if the decomposition of the two vectors is known. The result is a number.
The individual steps have used the rules given and the representation of
the vector product in terms of unit vectors. The result can be summarised
in the form: the k-th component of the product vector is
$$(x \times y)_k = \sum_{i,j} \epsilon_{ijk}\, x_i y_j\,.$$
The formal double sum can be written more explicitly if the properties of the Levi-Civita symbol are used. Only two of the nine contributions are different from zero for each component. The result is (reproduce it)
$$(x \times y) = (x_2 y_3 - x_3 y_2)\,e_1 + (x_3 y_1 - x_1 y_3)\,e_2 + (x_1 y_2 - x_2 y_1)\,e_3 \;\Rightarrow\; (x_2 y_3 - x_3 y_2,\; x_3 y_1 - x_1 y_3,\; x_1 y_2 - x_2 y_1)\,.$$
The rule to recall the sequence of indices is: the first term for every component is indexed by the cyclic complement of the index marking the component. The second term (with a minus sign) contains the anticyclic complement. An additional rule uses the concept of determinants. This rule will be quoted in the appropriate section (see Math.Chap. 3.2.4).
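The Levi-Civita representation and the explicit component formula can be compared directly; a minimal sketch with arbitrary example vectors:

```python
def levi_civita(i, j, k):
    """epsilon_{ijk} with indices 0, 1, 2: 0 for a repeated index,
    +1 for a cyclic permutation of (0, 1, 2), -1 otherwise."""
    if len({i, j, k}) < 3:
        return 0
    return 1 if (i, j, k) in ((0, 1, 2), (1, 2, 0), (2, 0, 1)) else -1

def cross(x, y):
    """k-th component of the vector product: sum_{i,j} eps_{ijk} x_i y_j."""
    return [sum(levi_civita(i, j, k) * x[i] * y[j]
                for i in range(3) for j in range(3))
            for k in range(3)]

x, y = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
# Agreement with the explicit component formula ...
explicit = [x[1]*y[2] - x[2]*y[1], x[2]*y[0] - x[0]*y[2], x[0]*y[1] - x[1]*y[0]]
assert cross(x, y) == explicit
# ... and anticommutativity of the vector product.
assert cross(x, y) == [-c for c in cross(y, x)]
print(cross(x, y))  # -> [-3.0, 6.0, -3.0]
```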
It is opportune to consider a set of ’exercises’ in order to demonstrate
the use of the two products for the discussion of geometric and trigonometric
problems.
• Exercise 1: Calculate the distance between the end points of the vectors
a = (1, 1, 1) and b = (3, 0, 4) and determine the angle between the two
vectors (Fig. 3.11a). The units of the length may be cm, m, . . . .
Answer 1: The first question can be answered by calculation of the magnitude of the difference vector, $D = |a - b| = \sqrt{14} = 3.7417\ldots$ (Fig. 3.11a). The answer to the second question requires the knowledge of the lengths of the two vectors ($a = \sqrt{3}$, $b = 5$). The scalar product ($a \cdot b = 7$) and its qualitative definition yield $\cos \varphi_{ab} = 7/(5\sqrt{3}) \approx 0.8083$ and hence $\varphi_{ab} \approx 0.6296$ rad.
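The numbers quoted in Answer 1 can be reproduced with a few lines of code:

```python
import math

a, b = (1.0, 1.0, 1.0), (3.0, 0.0, 4.0)

dist = math.sqrt(sum((ai - bi)**2 for ai, bi in zip(a, b)))  # |a - b|
dot = sum(ai * bi for ai, bi in zip(a, b))                   # (a . b)
na = math.sqrt(sum(ai * ai for ai in a))                     # |a| = sqrt(3)
nb = math.sqrt(sum(bi * bi for bi in b))                     # |b| = 5
phi = math.acos(dot / (na * nb))

assert abs(dist - math.sqrt(14.0)) < 1e-12   # distance = 3.7417...
assert abs(dot - 7.0) < 1e-12                # scalar product
assert abs(phi - 0.6296) < 1e-4              # angle in rad
print(dist, phi)
```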
• Exercise 2: Determine the equation of the straight line through the end
points of the vectors a and b (Fig. 3.11b).
Answer 2: Each point of the straight line is described by the equation
x = a + s(b − a). The parameter s takes all values −∞ < s < ∞
(Fig. 3.11b). The components of this vectorial form of an equation for a
straight line in space correspond to
$$x_i = a_i + s(b_i - a_i)\,, \qquad i = 1, 2, 3\,.$$
Fig. 3.11. Applications of vector calculus: (a) distance between two points, (b) straight line through two points
Answer 3: The difference vector r−r 0 lies in the plane if the end point of the
vector r is contained in the plane. The orthogonality of the vectors r − r 0
and n is expressed by the scalar product (r − r 0 ) · n = 0 . All points of the
plane (as end points of the vector r) satisfy this equation which is known as
the Hesse canonical form or Hesse normal form. An equation characterising the plane explicitly is obtained by transition to components (use (x, y, z))
$$(x - x_0)\,n_x + (y - y_0)\,n_y + (z - z_0)\,n_z = 0\,.$$
This corresponds to the standard version of an equation of a plane in space
which is used in analytic geometry (see also Math.Chap. 4.1).
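The Hesse normal form lends itself to a quick numerical test. The point $r_0$ and the normal vector $n$ below are arbitrary illustrative assumptions:

```python
# Arbitrary illustrative plane data (assumptions): point r0 and normal n.
r0 = (1.0, 0.0, 0.0)
n = (1.0, 1.0, 1.0)

def in_plane(r, tol=1e-12):
    """Hesse normal form: r lies in the plane iff (r - r0) . n = 0."""
    return abs(sum((ri - r0i) * ni for ri, r0i, ni in zip(r, r0, n))) < tol

# r0 + (1, -1, 0) stays in the plane, since (1, -1, 0) . (1, 1, 1) = 0 ...
assert in_plane((2.0, -1.0, 0.0))
# ... while a displacement along n leads out of the plane.
assert not in_plane((2.0, 1.0, 1.0))
print("Hesse normal form verified")
```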
• Exercise 4: Calculate the area of a triangle in space which is spanned by
the end points of the vectors r 1 , r 2 , r 3 (Fig. 3.13a).
Answer 4: Calculate e.g. the difference vectors a = r 3 − r 1 and b = r 2 − r 1
(other combinations are possible) and evaluate the vector product of a and
Fig. 3.13. Applications of the vector product: (a) area of a triangle, (b) determination of distances
(a b c) = a · (b × c) .
This product, a scalar quantity, represents the volume of a parallelepiped
which is spanned by the three vectors (Fig. 3.14).

Fig. 3.14. The parallelepipedal product

The vector (b × c) is perpendicular to the plane spanned by the two vectors and represents a measure
of the area marked by them. The projection of the vector a onto the vector
(b × c) describes the height of the parallelepiped. The volume of the parallelepiped, according to the formula (base area) times (height), is therefore⁵
$$V(\text{pepi}) = (a\, b\, c) = abc\, \cos\varphi_{a,\,b\times c}\, \sin\varphi_{bc}\,,$$
or in terms of the components of the vectors involved
$$V(\text{pepi}) = a_1(b_2 c_3 - b_3 c_2) + a_2(b_3 c_1 - b_1 c_3) + a_3(b_1 c_2 - b_2 c_1)\,,$$
or alternatively in a compact form
$$V(\text{pepi}) = \sum_{i,j,k=1}^{3} \epsilon_{ijk}\, a_i b_j c_k\,.$$
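The three expressions for the volume — the triple product, the component formula and the Levi-Civita sum — can be compared numerically; the example vectors are arbitrary assumptions:

```python
def cross(b, c):
    return (b[1]*c[2] - b[2]*c[1], b[2]*c[0] - b[0]*c[2], b[0]*c[1] - b[1]*c[0])

def dot(u, w):
    return sum(ui * wi for ui, wi in zip(u, w))

def eps(i, j, k):
    if len({i, j, k}) < 3:
        return 0
    return 1 if (i, j, k) in ((0, 1, 2), (1, 2, 0), (2, 0, 1)) else -1

# Arbitrary example vectors (assumptions).
a, b, c = (1.0, 2.0, 3.0), (0.0, 1.0, 4.0), (2.0, 0.0, 1.0)

v1 = dot(a, cross(b, c))                                  # (a b c) = a . (b x c)
v2 = (a[0]*(b[1]*c[2] - b[2]*c[1]) + a[1]*(b[2]*c[0] - b[0]*c[2])
      + a[2]*(b[0]*c[1] - b[1]*c[0]))                     # component formula
v3 = sum(eps(i, j, k) * a[i] * b[j] * c[k]
         for i in range(3) for j in range(3) for k in range(3))
assert abs(v1 - v2) < 1e-12 and abs(v1 - v3) < 1e-12
print(v1)  # -> 11.0
```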
$$\begin{aligned} v_1 &= a_2 (b \times c)_3 - a_3 (b \times c)_2 && \text{(evaluation of the outer vector product)} \\ &= a_2 (b_1 c_2 - b_2 c_1) - a_3 (b_3 c_1 - b_1 c_3) && \text{(evaluation of the inner vector product)} \\ &= b_1 (a_1 c_1 + a_2 c_2 + a_3 c_3) - c_1 (a_1 b_1 + a_2 b_2 + a_3 b_3) && \text{(sort, add and subtract a suitable term)} \\ &= (a \cdot c)\, b_1 - (a \cdot b)\, c_1\,. \end{aligned}$$
A similar argument can be given for the other components of the vector v .
Additional products, as e.g. the products with four vectors
$$(a \times b) \cdot (c \times d) \Longrightarrow \text{a scalar}\,, \qquad (a \times b) \times (c \times d) \Longrightarrow \text{a vector}\,,$$
are encountered occasionally in physics (mechanics). They will not be discussed here.
In the next section a short addendum to the discussion of vectors is offered.
Two topics, which can be included under the heading 'Linear Algebra', are introduced here: 'n-dimensional (Euclidian) vector spaces' and 'nonorthogonal (oblique) coordinate systems'. A full discussion of these topics will be
taken up at a later stage.
The following request might tax the ability of abstraction: envisage a space
which is spanned by n (with n being larger than 3) mutually perpendicular
unit vectors. Such spaces can be discussed in mathematical terms without
any difficulties even if there are problems with the imagination.
A set of vectors, which are supposed to span the space in question, may
be denoted by
e1 , e2 , e3 , . . . , en .
The expected properties of these vectors can (in analogy to the situation in
the three-dimensional space) be expressed by the orthogonality relations
$$(e_i \cdot e_k) = \delta_{ik}\,, \qquad i, k = 1, 2, \ldots, n\,.$$
The postulate, that n vectors have the length 1 and are mutually perpendicular, only makes sense if it is possible to define and to implement the basic concepts of geometry – lengths, distances and angles – in this space in an unambiguous fashion.
The first step towards this aim is an extension of the decomposition into
components. An arbitrary vector a in this space can be expressed in terms
of the basis envisaged as
$$a = a_1 e_1 + \cdots + a_n e_n = \sum_{i=1}^{n} a_i e_i \qquad \text{with} \quad a_i = (a \cdot e_i)\,.$$
In particular, each basis vector can be written as
$$e_k = \sum_{i=1}^{n} \delta_{ki}\, e_i\,.$$
This shows that e.g. the basis vector ek can be represented by an n-tuple
with the number 1 at the k-th position
ek = (0, . . . , 1, . . . , 0) .
It is possible to transcribe the vector calculus of three-dimensional space,
including all calculation rules, to the n-dimensional space on the basis of
these definitions.
• Addition
$$S = x + y \;\Rightarrow\; (x_1 + y_1, \ldots, x_n + y_n)\,.$$
• Multiplication with a scalar
$$R = ax \;\Rightarrow\; (ax_1, \ldots, ax_n)\,.$$
• Subtraction
$$D = x - y \;\Rightarrow\; (x_1 - y_1, \ldots, x_n - y_n)\,.$$
• Scalar product
$$(x \cdot y) = \left(\sum_{i=1}^{n} x_i e_i\right) \cdot \left(\sum_{k=1}^{n} y_k e_k\right) = \sum_{i,k=1}^{n} x_i y_k\, (e_i \cdot e_k) = \sum_{i=1}^{n} x_i y_i\,.$$
The basic concepts of geometry can be formulated with the aid of these
operations. The length of a vector is given by the square root of the scalar
product
$$[a \cdot a]^{1/2} = |a| = \left[a_1^2 + \cdots + a_n^2\right]^{1/2}\,.$$
The distance between two points in n-dimensional space, the endpoints of
two vectors b and c, is determined by the magnitude of the difference vector
d = b − c, that is |d| . The scalar product is also used to define the angle
between two vectors
(a · b)
cos ϕab = .
|a||b|
⁶ Use the equal sign instead of the more cautious equivalence in the sense of a definition.
• The vector product has been used in three-dimensional space for fixing
the orientation of the trihedron. A generalisation of the vector product to
multidimensional spaces is possible but rather cumbersome. It is preferable
to avoid this discussion as long as the question of orientation is not of
interest.
The multidimensional Euclidian space indicated above can be defined over
the domain of real or of complex numbers. It is denoted by R(n) or Rn for real
n-tuples, for complex n-tuples by C(n) or Cn . The mathematical foundation of
quantum mechanics calls for an additional extension which involves the limit
n −→ ∞ . The corresponding space over the domain of complex numbers is
the Hilbert space C∞ . A four-dimensional space, the Minkowski space, plays
a central role in the (special) theory of relativity. The difference between
Euclidian and Minkowski spaces is indicated in the next section.
Fig. 3.15. Oblique coordinate system
vectors do not need to be orthogonal nor do they have to have the length 1 .
The characterisation of the space is, also in this case, based on the scalar
products
$$(e_i \cdot e_k) = |e_i|\,|e_k| \cos\varphi_{ik} = g_{ik}\,, \qquad i, k = 1, 2, 3\,.$$
The scalar product is commutative
gik = gki
so that there are 6 independent quantities. The quantities $g_{ii}$ are associated with the lengths of the basis vectors. The quantities $g_{ik} = g_{ki}$ with $i \neq k$ characterise the relative position described by $\cos\varphi_{ik}$. The set of numbers $\{g_{ik}\}$ is called the metric tensor for this reason⁷. It has the form
$$g_{ik} = \delta_{ik}$$
for the special case of a Cartesian system.

⁷ The term 'tensor' is discussed briefly in Chap. 6.3.3. More information is given in Vol. 2.
Two different decompositions of an arbitrary vector into components are
possible for a nonorthogonal coordinate system. The figures illustrate, for
the sake of clarity, the situation in a two-dimensional world. All equations
correspond to three space dimensions.
1. A vector a can be projected orthogonally onto the coordinate directions. The components are in this case
$$a_i = (a \cdot e_i)\,, \qquad i = 1, 2, 3\,.$$
The vector can not be reconstructed from these components:
$$a \neq \sum_{i=1}^{3} a_i e_i\,.$$
2. The vector can alternatively be decomposed directly with respect to the basis vectors (Fig. 3.16b), with components $a^i$ denoted by upper indices:
$$a = \sum_{i=1}^{3} a^i e_i\,.$$
Fig. 3.16. Decomposition of a vector with respect to a nonorthogonal coordinate system: (a) covariant, (b) contravariant
The two sets of components must be related. This relation is best characterised by the introduction of a reciprocal coordinate system. The basis of this system (Fig. 3.17), which is denoted by upper indices, is defined by
$$e^i = \sum_{k} g^{ik}\, e_k\,.$$

Fig. 3.17. Reciprocal coordinate system: e₁ × e₂ → e³
$$(e^i \cdot e_m) = \sum_k g^{ik}\, (e_k \cdot e_m)$$
gives
$$\delta_{im} = \sum_k g^{ik}\, g_{km}\,.$$
The scalar product of the basis vectors of the reciprocal system themselves yields
$$(e^i \cdot e^m) = \sum_{k,k'} g^{ik} g^{mk'}\, (e_k \cdot e_{k'}) = \sum_{k,k'} g^{ik} g^{mk'}\, g_{kk'}$$
or
$$(e^i \cdot e^m) = g^{im}\,.$$
The argument also shows that the elements $\{g^{ik}\}$ are unambiguously determined by the elements $\{g_{ik}\}$. There exist 6 independent equations for the determination of 6 quantities. The name 'reciprocal system' implies that the inverse relations
$$e^1 = \frac{(e_2 \times e_3)}{(e_1\, e_2\, e_3)} \quad \text{etc.}$$
are valid (a proof will not be given).
The contravariant components are the components of a vector with respect to the original basis (see above)
$$a = \sum_{i=1}^{3} a^i e_i\,.$$
Scalar multiplication with $e_i$,
$$(a \cdot e_i) = \sum_k a^k\, (e_k \cdot e_i)\,,$$
shows that the relations
$$a_i = \sum_k g_{ik}\, a^k$$
are satisfied. The two decompositions are related via the metric tensor. The
covariant components are the components of a vector with respect to the
reciprocal basis.
$$a = \sum_{i} a_i\, e^i$$
because of
$$(a \cdot e_k) = \sum_i a_i\, (e^i \cdot e_k) = \sum_i a_i\, \delta_{ik} = a_k\,.$$
The scalar product of two vectors can be expressed in the forms
$$(a \cdot b) = \sum_{i,k} a_i b_k\, (e^i \cdot e^k) = \sum_{i,k} a_i b_k\, g^{ik} = \sum_{i,k} a^i b^k\, (e_i \cdot e_k) = \sum_{i,k} a^i b^k\, g_{ik} = \sum_{i,k} a^i b_k\, (e_i \cdot e^k) = \sum_{i} a^i b_i\,.$$
The relation inverse to the one given above is
$$a^i = \sum_k g^{ik}\, a_k\,.$$
The decompositions can be fully converted into each other with the metric
(or the reciprocal metric) tensor.
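The relations between the metric tensor, the reciprocal basis and the two kinds of components can be illustrated numerically; the sketch below assumes NumPy is available and uses an arbitrary oblique basis:

```python
import numpy as np

# An arbitrary oblique basis in R^3 (rows are the basis vectors e_i) -- assumption.
E = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.3, 1.0]])

g = E @ E.T                    # metric tensor g_ik = (e_i . e_k)
g_inv = np.linalg.inv(g)       # reciprocal metric g^ik
E_rec = g_inv @ E              # reciprocal basis e^i = sum_k g^ik e_k

# (e^i . e_m) = delta_im and (e^i . e^m) = g^im
assert np.allclose(E_rec @ E.T, np.eye(3))
assert np.allclose(E_rec @ E_rec.T, g_inv)

# Components of an arbitrary vector a:
a = np.array([1.0, 2.0, 3.0])
a_cov = E @ a                  # covariant components a_i = (a . e_i)
a_con = E_rec @ a              # contravariant components a^i = (a . e^i)
assert np.allclose(a_cov, g @ a_con)            # a_i = sum_k g_ik a^k
assert np.allclose(a_con, g_inv @ a_cov)        # a^i = sum_k g^ik a_k
assert np.allclose(E.T @ a_con, a)              # a = sum_i a^i e_i
assert np.allclose(a_cov @ a_con, a @ a)        # (a . a) = sum_i a_i a^i
print("metric tensor relations verified")
```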
Cartesian coordinate systems are sufficient for the representation of the
content of classical mechanics. There exist, however, two areas of physics
which demand the use of oblique coordinate systems:
∗ The coordinate systems have to be adapted to the crystal structure in
crystal physics.
∗ Space and time (times the velocity of light in order to have matching units)
coordinates are combined to form the basis of a four-dimensional space in
the (special) theory of relativity. The basis vectors of this space are still
orthogonal
$$(e_i \cdot e_k) = 0 \quad \text{for } i \neq k\,.$$
The metric is, however, not Euclidian:
$$(e_i \cdot e_i) = \pm 1\,.$$
A distinction between co- and contravariant components is required in both
cases.
Fig. 3.18. Two Cartesian coordinate systems rotated with respect to each other by the angle α
Consider a vector r in R2
r = x1 e1 + x2 e2 ⇒ (x1 , x2 )
which is referred to a Cartesian basis. A second Cartesian coordinate system, which is rotated by the angle α with respect to the first system, is spanned by the (primed) unit vectors $e'_1$ and $e'_2$. The decomposition of the vector r with respect to the second system is
$$r = x'_1 e'_1 + x'_2 e'_2\,.$$
The following question has to be answered: how can the components $x'_1$ and $x'_2$ be calculated if the components $x_1$, $x_2$ and the angle α are known? A
first step towards an answer is the representation of the basis vectors of the
primed system in terms of the basis vectors of the unprimed system. Simple
trigonometric considerations give (see Fig. 3.19a)
$$e'_1 = +\cos\alpha\; e_1 + \sin\alpha\; e_2\,, \qquad e'_2 = -\sin\alpha\; e_1 + \cos\alpha\; e_2\,.$$
It would also have been possible to proceed in the reverse order and express
e1 and e2 in terms of the basis vectors of the primed system (Fig. 3.19b).
The relevant equations can be obtained from the first set of vector equations
Fig. 3.19. Relation between the coordinate systems: (a) projection on the unprimed system, (b) projection on the primed system
Fig. 3.20. Interpretation of the transformation
moving on the rotating (about its north-south axis) earth viewed from a
system tied to the earth and a system fixed in space.
The situation becomes more involved if a second rotation (about the angle β) of the unprimed coordinate system is considered (Fig. 3.21). For a description of this situation three coordinate systems are needed:
Fig. 3.21. Three coordinate systems, two rotations
1) the unprimed,
2) the once primed as before, rotated by the angle α with respect to the
unprimed,
3) a double primed. This is rotated by the angle β with respect to system 2
and by the angle (α + β) with respect to system 1.
A vector r (in the plane) can be decomposed with respect to each of the three systems, and transformation equations between the three sets of components can be given, for example
$$x''_1 = x'_1 \cos\beta + x'_2 \sin\beta\,, \qquad x''_2 = -x'_1 \sin\beta + x'_2 \cos\beta \qquad \left(2 \stackrel{\beta}{\longrightarrow} 3\right)\,.$$
The transformation between the components of system 1 and system 3 is obtained by insertion of the first set of equations into this set. This yields
$$x''_1 = x_1 \left\{\cos\alpha\cos\beta - \sin\alpha\sin\beta\right\} + x_2 \left\{\sin\alpha\cos\beta + \cos\alpha\sin\beta\right\}\,,$$
which can, as might have been expected, be written as
$$x''_1 = x_1 \cos(\alpha + \beta) + x_2 \sin(\alpha + \beta)\,.$$
The corresponding equation for the 2-coordinate is
$$x''_2 = -x_1 \sin(\alpha + \beta) + x_2 \cos(\alpha + \beta)\,.$$
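The insertion argument can be mirrored with rotation matrices (anticipating the matrix notation of the next section): the product of the two rotation matrices equals the matrix of the rotation by α + β. The angles below are arbitrary illustrative choices:

```python
import math

def rot(angle):
    """2x2 matrix of the component transformation x' = D(angle) x."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s], [-s, c]]

def matmul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(2)) for k in range(2)]
            for i in range(2)]

alpha, beta = 0.4, 0.9   # arbitrary illustrative angles

# Applying the beta-rotation after the alpha-rotation ...
composed = matmul(rot(beta), rot(alpha))
# ... equals a single rotation by alpha + beta.
direct = rot(alpha + beta)
assert all(abs(composed[i][k] - direct[i][k]) < 1e-12
           for i in range(2) for k in range(2))
print("D(beta) D(alpha) = D(alpha + beta)")
```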
A sequence of transformations (here rotations in a plane) can be described correctly by insertion of the relevant equations into each other. These explicit insertions are not easily handled if a larger number of transformations has to be considered or if more than two space dimensions are involved. Some effort is already required in the case of three space dimensions. An elegant tool for the formulation of transformations in multidimensional spaces is matrix calculus, which is introduced in the next section.
3.2.2 Matrices
The simple basic definition is⁸:
These are examples of matrices with one row or one column. The representation of vectors as one row or one column is used alternately, depending on the situation and typographical convenience.
In the spirit of the last example: the set of numbers of a general M × N
matrix
(ai1 . . . aiN )
is called the i-th row vector, the set of numbers
$$\begin{pmatrix} a_{1k} \\ \vdots \\ a_{Mk} \end{pmatrix}$$
is called the k-th column vector.
The basic statement concerning matrix calculus is: it is possible to handle
matrices (nearly!) in the same way as numbers. The discussion of the various
mathematical operations with matrices has to be preceded by a completion
of the list of terms and concepts which are used in this context.
(i) The chain of elements $a_{11}, a_{22}, \ldots$ of a matrix A is called the main or principal diagonal
$$A = \begin{pmatrix} a_{11} & \dots & \dots \\ \vdots & a_{22} & \\ \vdots & & \ddots \\ \dots & \dots & a_{NN} \end{pmatrix}\,.$$
(ii) A matrix, which is reflected with respect to the main diagonal, is the transposed matrix $A^T$ (variants of the notation exist)
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \dots \\ a_{21} & a_{22} & a_{23} & \dots \\ a_{31} & a_{32} & a_{33} & \dots \\ \vdots & \vdots & \vdots & \end{pmatrix} \;\Longrightarrow\; A^T = \begin{pmatrix} a_{11} & a_{21} & a_{31} & \dots \\ a_{12} & a_{22} & a_{32} & \dots \\ a_{13} & a_{23} & a_{33} & \dots \\ \vdots & \vdots & \vdots & \end{pmatrix}\,.$$
An example is
$$\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 1 \end{pmatrix}^{T} = \begin{pmatrix} 1 & 3 \\ 2 & 1 \\ 3 & 1 \end{pmatrix}\,.$$
The relation
$$(A^T)^T = A$$
can be obtained on the basis of the definition: the transpose of a transposed matrix is the original matrix.
(iii) Two matrices A and B are called similar if the number of rows and the number of columns are the same:
$$M_A = M_B \ \text{(rows)}\,, \qquad N_A = N_B \ \text{(columns)}\,.$$
(iv) Two matrices A and B are called equal if they are similar and if the elements at each position agree:
$$M_A = M_B\,, \quad N_A = N_B \quad \text{and} \quad a_{ik} = b_{ik}\,.$$
This is expressed by A = B .
Mathematical operations with matrices are addition, multiplication with
a number, subtraction, matrix multiplication and matrix inversion.
• The definition of the addition of matrices follows the definition of the
addition of vectors which may, as indicated above, be interpreted as a
particular matrix.
A direct example says more in this case than any further explanation:
$$\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 1 \end{pmatrix} + \begin{pmatrix} 2 & 3 & 4 \\ 4 & 2 & 2 \end{pmatrix} = \begin{pmatrix} 3 & 5 & 7 \\ 7 & 3 & 3 \end{pmatrix}\,.$$
Vector addition (here with columns) is a special case
$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \end{pmatrix}\,.$$
Note again: the addition of matrices is only defined for similar matrices.
• The second operation, the multiplication of a matrix with a (real) number, can also be regarded as the extension of the corresponding operation with vectors. The multiplication of the matrix A with the (real) number α leads to the matrix C = αA with the elements $(c_{ik}) = (\alpha\, a_{ik})$.
A set of rules applies to these operations with matrices. They are collected
below without additional comment.
Commutative law of addition: A + B = B + A
Associative law of addition : A + (B + C) = (A + B) + C
Distributive laws : (α + β)A = αA + βA
: α(A + B) = αA + αB
Rules for transposition : (αA)T = αAT
: (A + B)T = AT + BT
Multiplication with α = 0 yields a zero matrix or null matrix
$$0 \cdot \begin{pmatrix} a_{11} & a_{12} & \dots \\ a_{21} & a_{22} & \dots \\ \vdots & \vdots & \ddots \end{pmatrix} = \begin{pmatrix} 0 & 0 & \dots \\ 0 & 0 & \dots \\ \vdots & \vdots & \ddots \end{pmatrix}\,.$$
• The difference of two similar matrices can be defined by combination of
the two operations above
D = A + (−1)B = A − B with dik = aik − bik .
• The definition of the multiplication of two matrices is fashioned, as indicated above, after the manipulation of a sequence of transformations. Many problems and questions of mathematics and physics can be formulated in a concise manner using matrix multiplication. The definition of this operation is more involved:
The matrix $C = (c_{ik})_{M,R}$ with M rows and R columns and the elements
$$c_{ik} = \sum_{j=1}^{N} a_{ij}\, b_{jk}$$
represents the product of the matrix $A = (a_{ik})_{M,N}$ with M rows and N columns with the matrix $B = (b_{ik})_{N,R}$ with N rows and R columns:
$$C = AB\,.$$
This definition calls for an explanation. Concentrate on the i-th row of the
matrix A with M rows and N columns and the k-th column of the matrix
B with N rows and R columns
$$\begin{pmatrix} \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \dots & a_{iN} \\ \vdots & \vdots & & \vdots \end{pmatrix} \begin{pmatrix} \dots & b_{1k} & \dots \\ \dots & b_{2k} & \dots \\ & \vdots & \\ \dots & b_{Nk} & \dots \end{pmatrix} = \begin{pmatrix} & \vdots & \\ \dots & c_{ik} & \dots \\ & \vdots & \end{pmatrix}$$
(the first factor has M rows and N columns, the second N rows and R columns, the result M rows and R columns).
The element $c_{ik}$ of the product matrix has the form
$$c_{ik} = a_{i1} b_{1k} + a_{i2} b_{2k} + \ldots + a_{iN} b_{Nk}\,.$$
This implies that the first element of the i-th row of A is multiplied by the first element of the k-th column of B, the products of the corresponding second elements are added, and so on. The rule to remember is: each row of the matrix A is combined in this fashion with each column of the matrix B. As the matrix A has M rows and the matrix B has R columns, a product matrix with M rows and R columns is obtained. The operation is only defined if the shapes of the two factors are matched: the number of columns of A has to agree with the number of rows of B.
The definition is also illustrated by a number of examples. The first example illustrates the formal execution of the matrix multiplication
$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31} & a_{11}b_{12} + a_{12}b_{22} + a_{13}b_{32} \\ a_{21}b_{11} + a_{22}b_{21} + a_{23}b_{31} & a_{21}b_{12} + a_{22}b_{22} + a_{23}b_{32} \end{pmatrix}\,.$$
The product of a 2 × 3 matrix with a 3 × 2 matrix yields a 2 × 2 matrix. The outer indices in each term of the sums mark the position (row on the left, column on the right) of the elements of the product matrix.
The second example is just numerical (please check!)
$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 2 & 2 & 2 & 2 \\ 3 & 3 & 3 & 3 \\ 4 & 4 & 4 & 4 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix} = \begin{pmatrix} 4 & 8 & 12 & 16 \\ 8 & 16 & 24 & 32 \\ 12 & 24 & 36 & 48 \\ 16 & 32 & 48 & 64 \end{pmatrix}\,.$$
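The 'please check!' invitation can be delegated to a few lines of code which implement the element formula $c_{ik} = \sum_j a_{ij} b_{jk}$ directly:

```python
def matmul(A, B):
    """c_ik = sum_j a_ij b_jk; requires columns(A) == rows(B)."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]]
B = [[1, 2, 3, 4] for _ in range(4)]
C = matmul(A, B)
assert C == [[4, 8, 12, 16], [8, 16, 24, 32], [12, 24, 36, 48], [16, 32, 48, 64]]
print(C)
```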
The third example illustrates, in matrix notation, the transformation law between the components of a vector in two coordinate systems in R2 which are rotated with respect to each other. Write

$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ for the components of the vector in the unprimed system,

$x' = \begin{pmatrix} x_1' \\ x_2' \end{pmatrix}$ for the components of the vector in the primed system and

$D = \begin{pmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \end{pmatrix}$ for the rotation matrix which mediates the transition between the two systems.

The matrix relation is then
$$x' = Dx \;\Longrightarrow\; \begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = \begin{pmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} d_{11}x_1 + d_{12}x_2 \\ d_{21}x_1 + d_{22}x_2 \end{pmatrix} \; .$$
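A numerical sketch of this transformation; the particular sign convention chosen for D is an assumption, only the matrix form x′ = Dx matters here:

```python
import numpy as np

alpha = np.pi / 6                                # rotation angle (arbitrary choice)
D = np.array([[np.cos(alpha),  np.sin(alpha)],   # assumed sign convention
              [-np.sin(alpha), np.cos(alpha)]])
x = np.array([1.0, 2.0])
x_prime = D @ x    # components: d11*x1 + d12*x2 and d21*x1 + d22*x2
```

Since D is a rotation, the length of the vector is unchanged by the transformation.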
The multiplication of a row vector with a column vector,
$$x^T y = (x \cdot y) \; ,$$
results in a 1 × 1 matrix, that is, a scalar.
(b) Explicit consideration of the element with the index ik on both sides of the equation and comparison of these elements is then sufficient.
Rule 6: The discussion of multiplication raises the question of the unit matrix. This matrix is defined as
$$E = (\delta_{ik})_{N,N} = \begin{pmatrix} 1 & 0 & 0 & \dots & 0 & 0 \\ 0 & 1 & 0 & \dots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \dots & 1 & 0 \\ 0 & 0 & 0 & \dots & 0 & 1 \end{pmatrix} \; .$$
The unit element of matrix calculus is a square matrix with the number 1
at each position of the main diagonal and 0 in the off-diagonal elements.
Its properties are
$$(A)_{M,N} (E)_{N,N} = (E)_{M,M} (A)_{M,N} = (A)_{M,N}$$
or written in short hand
AE = EA = A .
The last mathematical operation to be discussed is ’matrix division’ or
more correctly the question of the existence of an inverse matrix. A matrix
B with the property
AB = E
is called the inverse of the matrix A . The notation is
B = A−1 .
A distinction between the left inverse and the right inverse is necessary as
matrix multiplication is in general not commutative.
Square matrices which have an inverse are called invertible or regular, square
matrices without an inverse are called singular. A direct criterion, which al-
lows an answer to the question of the existence of the inverse of a square ma-
trix, will be found during the discussion of determinants (Math.Chap. 3.2.4).
There exist some useful rules for operations with regular matrices:
1. $(AB)^{-1} = B^{-1}A^{-1}$ (the order of the factors is reversed).
2. $(A^{-1})^{-1} = A$ .
Proof:
$$A^{-1}A = E \;\Longrightarrow\; (A^{-1}A)^{-1} = E^{-1} = E$$
With 1.: $(A^{-1}A)^{-1} = A^{-1}(A^{-1})^{-1} = E \;\Longrightarrow\; (A^{-1})^{-1} = A$ .
3. $(\alpha A)^{-1} = \dfrac{1}{\alpha}\, A^{-1}$ .
Proof:
$$(\alpha A)^{-1}(\alpha A) = E \;\Longrightarrow\; \alpha (\alpha A)^{-1} A = E \;\Longrightarrow\; (\alpha A)^{-1} = \frac{1}{\alpha}\, E A^{-1} = \frac{1}{\alpha}\, A^{-1} \; .$$
4. $(A^T)^{-1} = (A^{-1})^T$ .
Proof:
$$(A^{-1}A)^T = A^T (A^{-1})^T = E^T = E \;\Longrightarrow\; (A^T)^{-1} = (A^{-1})^T \; .$$
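Rules 2 to 4 can be verified numerically for any regular matrix; a sketch with an arbitrarily chosen 2 × 2 matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])       # regular: det = 5, not zero
Ainv = np.linalg.inv(A)

rule2 = np.linalg.inv(Ainv)      # (A^-1)^-1 should equal A
rule3 = np.linalg.inv(5.0 * A)   # (alpha A)^-1 should equal (1/alpha) A^-1
rule4 = np.linalg.inv(A.T)       # (A^T)^-1 should equal (A^-1)^T
```

Each rule reduces to a simple matrix comparison once the inverse is available.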
Fig. 3.22. Passive and active interpretation of rotations: (a) passive, (b) active
Fig. 3.23. Transformations in R2: translation; (a) passive, (b) active
Special cases of this rotation plus stretching are rotation only (as discussed above) or stretching only, e.g.
$$x' = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha \end{pmatrix} x = \alpha x \; .$$
(2) The second example
$$x' = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ 0 \end{pmatrix}$$
is a projection onto the 1 - or x -axis (Fig. 3.25a). Every vector x with the same $x_1$-component is transformed into the same vector x′. A more general transformation is
$$x' = \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a x_1 \\ 0 \end{pmatrix} \; .$$
Fig. 3.25. Transformations in R2: (a) projection, (b) projection and stretching
This is a projection onto the 1 -axis with additional stretching (Fig. 3.25b).
Of particular interest are orthogonal transformations, which leave the scalar product invariant. This implies that the length of vectors and the angles between vectors are not changed. Orthogonal transformations are isometric and isogonal. The following argument is used for a characterisation of orthogonal transformations:
Begin with the vectors
$$x' = Ax \qquad\text{and}\qquad y' = Ay$$
and postulate the invariance of the scalar product
$$y'^T \cdot x' = y^T \cdot x \; .$$
Insert the transformation on the right hand side and obtain
$$y'^T \cdot x' = y^T A^T A\, x \; .$$
The two sides agree with each other if the transformation matrix A satisfies
AT A = E
or
AT = A−1 .
The inverse of the transformation matrix of an orthogonal transformation
equals its transposed.
The matrix relation corresponds in R2 to three conditions which restrict the form of the matrix
$$\begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix}\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \; ,$$
or explicitly
(1) $a_{11}^2 + a_{21}^2 = 1$
(2) $a_{11}a_{12} + a_{21}a_{22} = 0$
(3) $a_{12}^2 + a_{22}^2 = 1$ .
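For the sign choice a11 = a22 = cos α, a21 = −a12 = sin α worked out below, the three conditions and the matrix relation AᵀA = E can be checked directly (a numerical sketch; the angle is arbitrary):

```python
import numpy as np

alpha = 0.7                                       # arbitrary angle
A = np.array([[np.cos(alpha), -np.sin(alpha)],    # one sign choice compatible with (2)
              [np.sin(alpha),  np.cos(alpha)]])

cond1 = A[0, 0]**2 + A[1, 0]**2              # a11^2 + a21^2
cond2 = A[0, 0] * A[0, 1] + A[1, 0] * A[1, 1]  # a11 a12 + a21 a22
cond3 = A[0, 1]**2 + A[1, 1]**2              # a12^2 + a22^2
```

All three conditions hold simultaneously, which is exactly the statement AᵀA = E.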
The following properties of the matrix can be extracted from these relations:
Equations (1) and (3) state that none of the four matrix elements can be
larger than 1 (|aik | ≤ 1 ). It is possible to choose one of the matrix elements
freely (observing the restriction) as there exist three conditions for four ma-
trix elements. The matrix elements are then determined up to a sign by the
conditions stated above.
Choose without loss of generality
$$a_{11} = \cos\alpha$$
and obtain from Eq. (1)
$$a_{21} = \pm\sin\alpha \; .$$
Equation (2) then gives
$$a_{11}^2\, a_{12}^2 = a_{21}^2\, a_{22}^2 \; .$$
Substitute $a_{12}^2$ from Eq. (3), resolve with respect to $a_{22}^2$
$$a_{22}^2 = a_{11}^2 / (a_{11}^2 + a_{21}^2) = a_{11}^2$$
and find
$$a_{22} = \pm\cos\alpha \; .$$
From Eq. (3) follows finally
$$a_{12} = \pm\sin\alpha \; .$$
Only two of the four combinations of signs are compatible with Eq. (2).
Fig. 3.26. Rotation (a) and reflection (b) in R2
Vectors in the reflecting straight line are not changed (Fig. 3.26b). Every vector is changed by a rotation about an angle α ≠ 0 . Rotations and reflections differ in addition in the following point: the relative orientation of two vectors is preserved for rotations, it is interchanged for reflections (Fig. 3.27). This is compatible with the postulate of the invariance of the
Fig. 3.27. Orthogonal transformations in R2: The invariance of the scalar product (a) for reflections, (b) for rotations
scalar product. The cosine function, which features in the definition of the scalar product, is an even function.
The corresponding discussion of linear transformations in R3 is more
involved although similar in spirit. Not all details will be presented for this
reason. A transformation
$$x' = Ax \;\longrightarrow\; \begin{pmatrix} x_1' \\ x_2' \\ x_3' \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$
can be classified again into
$$\sum_{l=1}^{3} a_{li}\, a_{lk} = \delta_{ik} \qquad (i, k = 1, 2, 3) \; .$$
Fig. 3.29. Transformations in R3 : projection of rotation about 2 -axis onto 1 - 3
plane
The matrix
$$D_2(\beta) = \begin{pmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{pmatrix}$$
describes a clockwise rotation by the angle β about the 2 -axis for a right-handed system (the standard choice).
The complication, that one encounters in the discussion of rotations in
R3 , can be pointed out by the following consideration. Compare the matrices
D23 (βα) = D2 (β)D3 (α)
and
D32 (αβ) = D3 (α)D2 (β) .
A given vector is first rotated by the angle α about the 3 -axis and then by the
angle β about the 2 -axis for D23 (Fig. 3.30). In the second case the rotations
are executed in reverse order (Fig. 3.31). A calculation of the matrices for
the two combinations of rotations yields
$$D_{23} = \begin{pmatrix} \cos\alpha\cos\beta & -\sin\alpha\cos\beta & -\sin\beta \\ \sin\alpha & \cos\alpha & 0 \\ \cos\alpha\sin\beta & -\sin\alpha\sin\beta & \cos\beta \end{pmatrix}$$
Fig. 3.30. Consecutive rotations in R3: sequence 3, 2 (starting position; rotation about 3 -axis; rotation about 2 -axis)
Fig. 3.31. Consecutive rotations in R3: sequence 2, 3 (starting position; rotation about 2 -axis; rotation about 3 -axis)
$$D_{32} = \begin{pmatrix} \cos\alpha\cos\beta & -\sin\alpha & -\cos\alpha\sin\beta \\ \sin\alpha\cos\beta & \cos\alpha & -\sin\alpha\sin\beta \\ \sin\beta & 0 & \cos\beta \end{pmatrix} \; .$$
One finds D23 ≠ D32 . Consecutive rotations about different axes can not be interchanged.
The special case with α = β = π/2 can be considered for a direct numer-
ical illustration. The result is in this case
$$x' = D_{23}\, x = \begin{pmatrix} 0 & 0 & -1 \\ 1 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -x_3 \\ x_1 \\ -x_2 \end{pmatrix}$$
$$x' = D_{32}\, x = \begin{pmatrix} 0 & -1 & 0 \\ 0 & 0 & -1 \\ 1 & 0 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -x_2 \\ -x_3 \\ x_1 \end{pmatrix} \; .$$
It is apparent that the transformed vectors are not the same. The fact that
rotations about different axes in R3 can not be interchanged, leads to a
number of complications in applications.
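The non-commutativity can be reproduced numerically; D2 below matches the matrix given earlier, while the sign convention of D3 is an assumption chosen to be consistent with the products D23 and D32 printed above:

```python
import numpy as np

def D2(b):   # rotation about the 2-axis, as given above
    c, s = np.cos(b), np.sin(b)
    return np.array([[c, 0.0, -s],
                     [0.0, 1.0, 0.0],
                     [s, 0.0, c]])

def D3(a):   # rotation about the 3-axis (assumed convention)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

x = np.array([1.0, 2.0, 3.0])
D23 = D2(np.pi / 2) @ D3(np.pi / 2)   # first about the 3-axis, then the 2-axis
D32 = D3(np.pi / 2) @ D2(np.pi / 2)   # reverse order
```

Applying the two products to the same vector reproduces the two different results of the special case α = β = π/2.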
A general rotation in R3 has to be represented by three parameters. One
possibility is the use of two angles to fix the direction of the axis of rotation
and a third angle for the actual rotation. An alternative (and a standard)
choice is to use the three Euler angles. A general rotation is composed of
individual rotations by the Euler angles. The rotation axes used in this case are: the z -axis, the intermediate x′ -axis (to complicate matters there are variants which use the y′ -axis instead) and the z′ -axis (for details see Chap. 6.3.4). This choice is used for the discussion of the theory of tops (the
rotation of rigid bodies).
Reflections at coordinate planes in R3 are represented by simple matrices.
The example
$$x' = S_{12}\, x = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 \\ x_2 \\ -x_3 \end{pmatrix}$$
describes a reflection of a vector at the 1 - 2 plane. The reflection of a vector
at planes, which are not coordinate planes, can be composed of two rotations
and a simple reflection, for instance with the steps
(1) Determine the rotation, which is necessary to rotate the plane into a
position, so that it coincides with a coordinate plane.
(2) Rotate the original vector in the same fashion.
(3) Reflect the rotated vector at the coordinate plane chosen.
(4) Rotate the reflected vector with the inverse transformation corresponding
to step (1).
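These steps can be sketched for the example that follows in the text, a plane containing the 3-axis at an angle α with the 1-axis; the rotation D3 about the 3-axis and the reflection S13 at the 1-3 coordinate plane are the assumed building blocks:

```python
import numpy as np

def D3(a):                              # rotation about the 3-axis (assumed sign convention)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

S13 = np.diag([1.0, -1.0, 1.0])         # reflection at the 1-3 coordinate plane

alpha = 0.4                             # angle of the reflecting plane with the 1-axis
# steps (2)-(4): rotate the vector so the plane coincides with the 1-3 plane,
# reflect there, then rotate back with the inverse rotation
S = D3(alpha) @ S13 @ D3(-alpha)

in_plane = np.array([np.cos(alpha), np.sin(alpha), 0.0])   # lies in the reflecting plane
```

The composed matrix leaves vectors in the reflecting plane unchanged, has determinant −1 as every reflection, and reflecting twice restores the original vector.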
These statements describe the active interpretation from a point of view of
the vector. A simple example is the reflection at a plane, which contains the
3 -axis and includes an angle α with the 1 -axis (see Fig. 3.32).
Fig. 3.32. Transformations in R3: example for a reflection at a plane
3.2.4 Determinants
The usefulness of this concept can already be illustrated for the simplest
case, the 2 × 2 determinant:
• The formula for the solution of the system of equations can be given com-
pletely in terms of determinants
$$x_1 = \frac{\begin{vmatrix} b_1 & a_{12} \\ b_2 & a_{22} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}} \qquad\qquad x_2 = \frac{\begin{vmatrix} a_{11} & b_1 \\ a_{21} & b_2 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}} \; .$$
These expressions are known as Cramer’s rule10 . The rule states: the
determinant in the denominator is the determinant of the matrix A . The
determinant in the numerator for the unknown xi is obtained by replacing
the i-th column by the vector b .
• The determination of the inverse matrix involves also the solution of a
system of linear equations, for example for a 2 × 2 matrix
$$A^{-1} A = E \;\rightarrow\; \begin{pmatrix} c_1 & c_2 \\ c_3 & c_4 \end{pmatrix}\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \; .$$
The resolution of the system for the matrix elements $c_1$ to $c_4$
$$a_{11} c_1 + a_{21} c_2 = 1 \qquad\qquad a_{11} c_3 + a_{21} c_4 = 0$$
$$a_{12} c_1 + a_{22} c_2 = 0 \qquad\qquad a_{12} c_3 + a_{22} c_4 = 1$$
yields with Cramer’s rule
$$A^{-1} = \frac{1}{|A|}\begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \; .$$
The inverse matrix contains the factor |A|−1 . It exists only if the deter-
minant |A| is different from zero .
The tasks to solve a linear system of equations (of n equations in n un-
knowns) and to find the inverse of a (square) matrix are identical. The
resolution of Ax = b is x = A−1 b .
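Both statements can be checked on a small numerical example; the particular matrix is an arbitrary choice with det A = 1:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])
b = np.array([4.0, 11.0])

detA = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]          # = 1
# Cramer's rule: replace the i-th column of A by b in the numerator
x1 = (b[0] * A[1, 1] - A[0, 1] * b[1]) / detA
x2 = (A[0, 0] * b[1] - b[0] * A[1, 0]) / detA
# closed form of the 2x2 inverse
Ainv = np.array([[ A[1, 1], -A[0, 1]],
                 [-A[1, 0],  A[0, 0]]]) / detA
```

The solution from Cramer's rule and the solution x = A⁻¹b obtained from the inverse coincide.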
A geometrical interpretation of these statements is the following: given
are the transformation represented by A and a vector b which results from
transforming the original vector x . The determination of the original vector
corresponds to the determination of the inverse matrix A−1 .
• Determinants can be used to classify transformations. Rotations and stretching operations are characterised in R2 by $\det(A_{DS}) \neq 0$, projections by $\det(A_P) = 0$ . Concerning orthogonal transformations one finds $\det(A_D) = 1$ for rotations and $\det(A_S) = -1$ for reflections. Corresponding statements are valid in R3 (and in Rn).
It is useful to look at 2 × 2 determinants from a different angle before dis-
cussing determinants of square n × n matrices. The columns of the determi-
nant (or the rows) are interpreted as vectors for this purpose
$$\det(A) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = \det(a_1, a_2) \; .$$
The vectors a1 , a2 are then complemented to form a vector in R3 (they still
lie in the 1 - 2 plane)
$$a_i = \begin{pmatrix} a_{1i} \\ a_{2i} \\ 0 \end{pmatrix} \qquad i = 1, 2 \; .$$
¹⁰ A corresponding rule exists also in the case of n equations with n unknowns, see below.
The general formula
$$\det(A) = \sum_{P} \operatorname{sign}(P)\, a_{1 i_1} a_{2 i_2} \cdots a_{n i_n}$$
can be demonstrated by complete induction. The sum runs over all n! permutations
$$(i_1, i_2, \dots, i_n) \quad\text{of the numbers}\quad (1, \dots, n) \; .$$
The sign (expressed by sign(P)) is positive for even permutations, negative
for odd permutations. This formula does not necessarily represent a practical way for the calculation of the value of a determinant. A task (not unusual) such as the calculation of the value of a 10 × 10 determinant would involve the evaluation of 10! = 3628800 products with 10 factors each, followed by addition, respectively subtraction.
A much more practical method for the solution of larger systems of linear
equations is the elimination technique. A system of equations of the form
$$\begin{aligned} a_{11} x_1 + \cdots\cdots + a_{1n} x_n &= b_1 \\ &\;\;\vdots \\ a_{n1} x_1 + \cdots\cdots + a_{nn} x_n &= b_n \end{aligned}$$
can be converted into triangular shape by constructing suitable linear com-
binations of pairs of equations
$$\begin{aligned} \tilde a_{11} x_1 + \tilde a_{12} x_2 + \cdots + \tilde a_{1n} x_n &= \tilde b_1 \\ \tilde a_{22} x_2 + \cdots + \tilde a_{2n} x_n &= \tilde b_2 \\ &\;\;\vdots \\ \tilde a_{nn} x_n &= \tilde b_n \; . \end{aligned}$$
The result for the determinant of coefficients can now be read off directly
$$\det(A) = \det(\tilde A) = \tilde a_{11}\, \tilde a_{22} \cdots \tilde a_{nn} \; .$$
The unknowns can be determined consecutively starting with the last line.
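A minimal sketch of this elimination for the determinant; it uses no pivoting, so it assumes that no zero pivot is encountered:

```python
import numpy as np

def det_by_elimination(A):
    """Triangularise A by row combinations; det = product of the diagonal."""
    A = np.array(A, dtype=float)
    n = len(A)
    for j in range(n):                        # eliminate below the pivot a_jj
        for i in range(j + 1, n):
            A[i] -= A[i, j] / A[j, j] * A[j]  # this row operation leaves det unchanged
    return float(np.prod(np.diag(A)))

M = [[2, 1, 1],
     [4, 3, 3],
     [8, 7, 9]]
```

For the sample matrix the triangular diagonal is (2, 1, 2), so the determinant is 4.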
A justification of the elimination technique and the assembly of further
rules for handling determinants can be obtained by a generalisation of the
alternative definition of a determinant indicated above for the case of a 2 × 2
The sum runs over all the elements of the row. The factor is a determinant
from which the k -th row and the i -th column have been removed
$$\sum_{k=1}^{3} x_{i,k}^2 = 1 \; .$$
Corresponding statements are valid for Rn . The eigenvalues are then de-
termined by an equation of degree n, the eigenvector has n components and
in the normalisation condition the upper limit 3 is replaced by n . Further in-
formation on the algebraic eigenvalue problem is found in Vol. 3 in connection
with the matrix formulation of quantum mechanics.
The section will be concluded by stating once more the cases for which
the concept of a determinant has been used in anticipation of this chapter.
• The 2 × 2 Wronski determinant (Math.Chap. 2.2.2)
$$W(x_1(t), x_2(t)) = \begin{vmatrix} x_1(t) & x_2(t) \\ \dot x_1(t) & \dot x_2(t) \end{vmatrix} \; .$$
4.1 Functions
Fig. 4.1. Representation of an explicit function of two independent variables
Fig. 4.2. Examples for functions of two variables: hemisphere, paraboloid of revolution, plane in space
The domain of definition is the complete x - y plane without the origin. This function, a rational fraction, is not defined for x = y = 0 . It is more convenient to use polar coordinates in the x - y plane instead of the Cartesian coordinates for the discussion of this function
$$z = f(r\cos\varphi,\, r\sin\varphi) = \frac{r^2(\cos^2\varphi - \sin^2\varphi)}{r^2(\cos^2\varphi + \sin^2\varphi)} = \cos 2\varphi \; .$$
• Imagine that a value of the function is attached to each point of the domain
of definition. Such a construct could, for instance, represent the distribution
of temperature in space
T = T (x, y, z) .
This is also an example for a scalar field (see Math.Chap. 5.1).
• A projection from 4 to 3 dimensions is also feasible. The implicit function
f (x, y, z) = const.
represents a surface in the three-dimensional space. The contour lines, dis-
cussed above, have to be replaced by families of surfaces. This possibility
is rarely employed.
No useful possibility of visualisation exists for the case of more than three
independent variables
z = f (x1 , x2 , . . . , xn ) (in the explicit form) .
The domain of definition is a region of n-dimensional space. The function itself can be characterised as an n-dimensional hypersurface in an (n + 1)-dimensional space. Apart from the lack of visualisation, the difference with respect to the simpler cases is not that large. All necessary geometric properties (as points, distances between points, etc.) can be defined in higher dimensional Euclidian spaces (see Math.Chap 3.1.3).
The topics ’limiting values’ and ’differentiation’ are, as for the case of
functions of one variable, of particular interest for functions of several vari-
ables as well.
A function f(x) possesses the limiting value A at the point $x_0$ if
$$\lim_{\nu\to\infty} f(x_\nu) = A$$
for every sequence of points $x_\nu$ which converges towards $x_0$ .
This definition is (as has been remarked before) not very practical as the
postulate is for every sequence. It can, however, be readily carried over to the
case of several variables. The term ’sequence (of points on the number ray)’
can be read strictly as ’sequence (of points in space)’.
The domain of definition of a function of two variables is a region of the
x - y plane. A point P0 in a plane (or in space) can be approached from an
arbitrary number of directions. Sequences of points
$$P_1 = (x_{11}, \dots, x_{1n}),\; P_2 = (x_{21}, \dots, x_{2n}),\; \dots,\; P_\infty = P_0 \; ,$$
which approach the limiting point P0 from a chosen direction can therefore
be defined for functions of n (≥ 2) variables. The transcription of the criterion
for limiting values can then be noted as:
• The limiting value −1 is obtained for each sequence along the y -axis
$$\lim_{y\to 0} f(0, y) = \lim_{y\to 0}\left(-\frac{y^2}{y^2}\right) = -1 \; .$$
Different sequences lead to different values. A limiting value at the point
(0, 0) does not exist.
The domain of definition of the function
$$z = f(x, y) = \frac{x^2 y^2}{x^2 + y^2}$$
is also the x - y plane without the point (0, 0) . It is again useful to go over to
polar coordinates in order to discuss the limiting value at this position
$$z = r^2 \sin^2\varphi\, \cos^2\varphi \; .$$
Every sequence of points with the limiting value (0, 0) can be characterised
by r → 0 . One finds therefore
$$\lim_{r\to 0} r^2 \sin^2\varphi\, \cos^2\varphi = 0 \; .$$
4.2.2 Differentiation
This section addresses the concept of a partial derivative, to begin with for
the case of two variables. The definition of the partial derivatives is in this
case:
A function z = f (x, y) , which is defined in a region D, possesses
a partial derivative with respect to x or with respect to y in the
point (x, y) ∈ D if the limiting value
$$\frac{\partial f(x,y)}{\partial x} = f_x(x,y) = \lim_{h\to 0} \frac{f(x+h,\, y) - f(x,\, y)}{h}$$
respectively the limiting value
$$\frac{\partial f(x,y)}{\partial y} = f_y(x,y) = \lim_{k\to 0} \frac{f(x,\, y+k) - f(x,\, y)}{k}$$
exists.
The definition indicates that partial differentiation does not require any
new technical skills. The function is differentiated with respect to one of the
variables while the other variable is kept constant.
A brief list of examples is sufficient to explain the technique
$$f(x,y) = xy \qquad f_x = y \qquad f_y = x$$
$$f(x,y) = \frac{1}{(x^2+y^2)} \qquad f_x = -\frac{2x}{(x^2+y^2)^2} \qquad f_y = -\frac{2y}{(x^2+y^2)^2}$$
$$f(x,y) = e^x(x^2 - y^5) \qquad f_x = e^x(x^2 - y^5 + 2x) \qquad f_y = e^x(-5y^4) \; .$$
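The listed derivatives can be cross-checked numerically with central differences (a sketch; the step size h is an arbitrary small value):

```python
import numpy as np

def fx_numeric(f, x, y, h=1e-6):
    # central-difference approximation of the partial derivative w.r.t. x,
    # holding y fixed
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)

f = lambda x, y: np.exp(x) * (x**2 - y**5)
x0, y0 = 0.5, 1.5
approx = fx_numeric(f, x0, y0)
exact = np.exp(x0) * (x0**2 - y0**5 + 2.0 * x0)   # f_x from the list above
```

The numeric and analytic values agree to within the discretisation error.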
The geometrical interpretation of the partial derivatives is not difficult. The
intersection of the surface f (x, y) and a plane y = const., parallel to the x - z
plane, is the curve used for the definition of the partial derivative with respect
to x (Fig. 4.6a). This derivative characterises the slope of the intersecting
curve in a point P , or expressed differently, the rise of the surface in this
point in the x -direction. The slope of the tangent lines in different points of
the intersecting curve is itself a function of x and y . The partial derivative
with respect to y describes the slope of the surface in the y -direction in a
corresponding fashion (Fig. 4.6b).
Fig. 4.6. The partial derivatives of f(x, y): (a) partial derivative with respect to x, (b) partial derivative with respect to y
Partial derivatives of higher order can also be discussed. There exist four
partial derivatives of second order for the case of functions of two variables
$$f_{xx} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2}$$
$$f_{xy} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\, \partial y} \qquad\text{(differentiate first with respect to } y\text{)}$$
$$f_{yx} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\, \partial x} \qquad\text{(differentiate first with respect to } x\text{)}$$
$$f_{yy} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} \; .$$
The notation for the sequence of the indices is not standardised in the liter-
ature. The notation used here is: the derivative with respect to the variable
indicated on the right hand side is executed first (shorthand – left column,
standard notation – right hand column). The second derivatives for the examples given are
$$f(x,y) = xy \qquad f_{xx} = 0 \qquad f_{xy} = f_{yx} = 1 \qquad f_{yy} = 0$$
$$f(x,y) = e^x(x^2 - y^5) \qquad f_{xx} = e^x(x^2 - y^5 + 4x + 2) \qquad f_{xy} = f_{yx} = -5y^4 e^x \qquad f_{yy} = -20 y^3 e^x \; .$$
It can be noticed that the mixed partial derivatives of second order agree.
This raises the question, whether this is always the case or which conditions
have to be satisfied for this to happen. An answer will be given shortly.
There exist eight derivatives of third order for the case of functions of two
variables
fxxx , fxxy , fxyx , fyxx , fxyy , fyxy , fyyx , fyyy .
The number of possible derivatives grows with the order k as 2k .
The definition of the partial derivative can be extended to the case of
functions of n variables. There exist n partial derivatives of first order with
the definition of the corresponding limiting values (i = 1, 2, . . . , n )
$$f_{x_i} = \frac{\partial f(x_1, x_2, \dots, x_i, \dots, x_n)}{\partial x_i} = \lim_{h_i \to 0} \frac{f(x_1, \dots, x_i + h_i, \dots, x_n) - f(x_1, \dots, x_i, \dots, x_n)}{h_i} \; .$$
The definition implies once again the actual technique: evaluate the derivative
with respect to one of the variables while treating the other variables as
constant. There exist n2 partial derivatives of second order
$$\frac{\partial}{\partial x_i}\left(\frac{\partial f}{\partial x_k}\right) = f_{x_i, x_k} \qquad i, k = 1, 2, \dots, n \; ,$$
n3 derivatives of third order, etc.
The following example of a function of three variables is often used in
theoretical physics. The function
$$f(x,y,z) = \frac{1}{[x^2 + y^2 + z^2]^{1/2}} = \frac{1}{r}$$
describes the inverse distance of a point from the origin (with a simple form
in spherical coordinates). The three partial derivatives of first order with
respect to the Cartesian coordinates are
$$f_x = -\frac{1}{2}\, \frac{2x}{[x^2+y^2+z^2]^{3/2}} = -\frac{x}{r^3}$$
$$f_y = -\frac{1}{2}\, \frac{2y}{[x^2+y^2+z^2]^{3/2}} = -\frac{y}{r^3}$$
$$f_z = -\frac{1}{2}\, \frac{2z}{[x^2+y^2+z^2]^{3/2}} = -\frac{z}{r^3} \; .$$
There exist nine derivatives of second order
$$f_{xx} = -\frac{1}{r^3} + \frac{3x^2}{r^5} \qquad f_{yy} = -\frac{1}{r^3} + \frac{3y^2}{r^5} \qquad f_{zz} = -\frac{1}{r^3} + \frac{3z^2}{r^5}$$
$$f_{xy} = f_{yx} = \frac{3xy}{r^5} \qquad f_{zy} = f_{yz} = \frac{3yz}{r^5} \qquad f_{xz} = f_{zx} = \frac{3xz}{r^5} \; .$$
The mixed partial derivatives of second order are again independent of the
sequence of the differentiation. The sum of the double derivatives with respect
to the three coordinates is
$$f_{xx} + f_{yy} + f_{zz} = -\frac{3}{r^3} + \frac{3(x^2 + y^2 + z^2)}{r^5} = 0 \; .$$
r r5
This statement can be interpreted differently: the function f(x, y, z) = 1/r satisfies the partial differential equation
$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 0 \; .$$
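That f = 1/r satisfies this equation away from the origin can also be checked with second central differences (a numerical sketch; the evaluation point and step size are arbitrary choices):

```python
import numpy as np

def laplacian(f, p, h=1e-3):
    # sum of second central differences along the three Cartesian directions
    total = 0.0
    for i in range(3):
        e = np.zeros(3)
        e[i] = h
        total += (f(p + e) - 2.0 * f(p) + f(p - e)) / h**2
    return total

f = lambda p: 1.0 / np.linalg.norm(p)
val = laplacian(f, np.array([1.0, 2.0, 2.0]))   # r = 3, away from the origin
```

The numerical Laplacian vanishes up to the discretisation error.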
$$\frac{x^2}{R^2} + \frac{y^2}{b^2} \ge 1 \qquad x, y \ge 0 \qquad b < R$$
illustrated in Fig. 4.7. The partial derivatives do not exist in the tip of the domain on the x -axis, even if the function defined over the domain is reasonable. The surrounding area does not belong to the domain of definition, so that a differential quotient can not be defined. Peculiar situations can occur if the domain of definition is an area.
Fig. 4.7. Partial derivatives: a problematic domain
² Partial differential equations are discussed in detail in Vol. 2.
(b) It can be demonstrated that the existence of the derivative f′(x) at the point x implies continuity of the function of one variable f(x) at this
point. A corresponding statement is not possible for functions of several
variables. Any point is approached from two directions in the construc-
tion of the partial derivatives. This does not imply anything about the ap-
proach from an arbitrary direction. Such a statement would be necessary
for the transcription of the statement at the beginning of this paragraph.
The relation between differentiability and continuity is, as a consequence,
more involved in the case of functions of several variables. This point is
addressed in the section on directional derivatives (Math.Chap. 4.2.3).
The fact, that mixed partial derivatives are independent of the sequence in
which the derivatives are executed, has been found in all examples discussed
so far. This feature is explained by the theorem of Schwarz. This theorem,
formulated as a sufficient condition, states:
The mixed derivatives of k-th order are independent of the sequence in
which the derivatives are executed if all the derivatives in k-th order are
continuous.
There exist more rigorous variants of this theorem. Neither the variants
nor the proof of the soft version will be given here. One consequence of the
theorem is, however, worth a remark. The number of independent derivatives
of second order is reduced from n² to n(n + 1)/2 if the theorem holds.
The fact that derivatives of functions of several variables are defined over
a region opens the possibility to consider further types of derivatives. The
most important for theoretical physics are the directional derivative and the
gradient.
Fig. 4.8. (a) The partial derivatives of f(x, y) in a point P; (b) defining the directional derivative
$$e_\alpha = \sum_{i=1}^{3} \cos\alpha_i \, e_i \qquad\text{and}\qquad e_\alpha \cdot e_\alpha = \sum_{i=1}^{3} (\cos\alpha_i)^2 = 1 \; .$$
This shows that only two angles are required for the specification of a direc-
tion in R3 . The third angle is then determined (uniquely).
The derivative of the function f (x, y, z) in the direction given by eα is
defined as
$$D_{(\alpha_x \alpha_y \alpha_z)} f(x,y,z) = \lim_{\rho\to 0} \frac{1}{\rho} \bigl( f(x + \rho\cos\alpha_x,\; y + \rho\cos\alpha_y,\; z + \rho\cos\alpha_z) - f(x,y,z) \bigr) \; .$$
This yields under the same assumptions as in the case of two variables the
limiting value
$$D_{(\alpha_x \alpha_y \alpha_z)} f(x,y,z) = \cos\alpha_x\, f_x(x,y,z) + \cos\alpha_y\, f_y(x,y,z) + \cos\alpha_z\, f_z(x,y,z) \; .$$
This limiting value describes the slope of the function f (x, y, z) in the point
P = (x, y, z) in the direction characterised by eα .
A direction in an n-dimensional Euclidian space is characterised by a unit
vector
$$e_\alpha = \sum_{i=1}^{n} (\cos\alpha_i)\, e_i \; .$$
The values of the n directional cosines describe (for an orthonormal basis) the
projections of the vector eα onto the coordinate axes (see Math.Chap. 3.1.3)
cos αi = (eα · ei )
with the restriction
$$\sum_i (\cos\alpha_i)^2 = 1 \; .$$
The directional derivative in the direction specified by $e_\alpha$ is then
$$D_{(\alpha_1 \dots \alpha_n)} f(x_1, \dots, x_n) = \sum_{i=1}^{n} (\cos\alpha_i)\, f_{x_i}(x_1, \dots, x_n) \; .$$
The result for the case n = 2 can be recovered with α1 + α2 = π/2 and
cos α2 = cos(π/2 − α1 ) = sin α1 .
It is very useful to express the directional derivative in terms of the gra-
dient operator ∇ which is defined as
$$\nabla = \sum_{i=1}^{n} e_i\, \frac{\partial}{\partial x_i} \; .$$
The short hand version in terms of components is
$$\nabla = \left( \frac{\partial}{\partial x_1},\; \frac{\partial}{\partial x_2},\; \dots,\; \frac{\partial}{\partial x_n} \right) \; .$$
The gradient operator is a differential operator with a vector character. An
alternative notation is
∇ = grad .
The application of this operator to a function f (x1 , . . . , xn ) , that is a
scalar function, yields a vector function (a function with n components, see
Math.Chap. 5.1)
$$\nabla f(x_1, \dots, x_n) = \sum_{i=1}^{n} e_i\, f_{x_i}(x_1, \dots, x_n) \; ,$$
in shorthand
$$\nabla f(x_1, \dots, x_n) = (f_{x_1}(x_1, \dots, x_n),\; \dots,\; f_{x_n}(x_1, \dots, x_n)) \; .$$
The components of the vector function are the n partial derivatives of the
scalar function f . The directional derivative can be expressed with the aid
of the gradient in the form
$$D_{(\alpha_1 \dots \alpha_n)} f(x_1, \dots, x_n) = e_\alpha \cdot \nabla f(x_1, \dots, x_n) = \sum_{i=1}^{n} (e_\alpha \cdot e_i)\, f_{x_i}(x_1, \dots, x_n) \; .$$
Fig. 4.9. The gradient vector for a paraboloid of revolution: (a) illustration in R3, (b) illustration of the projection
The directional derivative in the direction of the tangent line can also be
calculated by
$$e_\beta \cdot \nabla f = (\cos\beta\, e_1 + \sin\beta\, e_2) \cdot (f_{x_1} e_1 + f_{x_2} e_2) = 0 \; .$$
The vector ∇f (indicated by eγ in Fig. 4.11) is perpendicular to the tan-
gent line on a contour.
The square of the magnitude of the gradient vector is $|\nabla f|^2 = f_{x_1}^2 + f_{x_2}^2$ . This quantity can be compared with the square of the magnitude of the directional derivative in the direction α
$$\bigl( D_{(\alpha)} f \bigr)^2 = (\cos\alpha\, f_{x_1} + \sin\alpha\, f_{x_2})^2 = \cos^2\alpha\, f_{x_1}^2 + 2 \sin\alpha \cos\alpha\, f_{x_1} f_{x_2} + \sin^2\alpha\, f_{x_2}^2$$
$$= f_{x_1}^2 + f_{x_2}^2 - (\sin^2\alpha\, f_{x_1}^2 - 2 \sin\alpha \cos\alpha\, f_{x_1} f_{x_2} + \cos^2\alpha\, f_{x_2}^2) \le f_{x_1}^2 + f_{x_2}^2 \; .$$
The increase in the direction eα is smaller than the increase in the direction
of the gradient vector.
These arguments demonstrate that the gradient vector ∇f is perpendicu-
lar to the tangent line on a contour and that it marks the direction of the
strongest increase of a function.
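This can be illustrated numerically for an arbitrary smooth test function: scanning all directions shows that no directional derivative exceeds |∇f|, and the largest one points along the gradient (a sketch, with f = x² + 3y² as an illustrative choice):

```python
import numpy as np

f_grad = lambda x, y: np.array([2.0 * x, 6.0 * y])   # gradient of f = x^2 + 3 y^2

g = f_grad(1.0, 0.5)                                  # gradient at an arbitrary point
angles = np.linspace(0.0, 2.0 * np.pi, 720, endpoint=False)
dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # unit vectors e_alpha
slopes = dirs @ g                    # directional derivatives D_alpha f = e_alpha . grad f
best_dir = dirs[np.argmax(slopes)]   # direction of the steepest increase
```

The maximal slope found over the grid of directions is |∇f| (up to the angular grid spacing), attained in the direction of the gradient vector.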
A corresponding statement can be made (and demonstrated) for a func-
tion of three variables z = f (x1 , x2 , x3 ): the vector
$$\operatorname{grad} f = f_{x_1} e_1 + f_{x_2} e_2 + f_{x_3} e_3$$
is perpendicular to the surfaces of equal value (the correct name is equipoten-
tial surfaces, see Chap. 3.2.3) through the point P and marks the strongest
rise of the function (Fig. 4.12). This assertion can be supported by the following argument.
The equipotential surfaces are concentric spherical shells about the origin.
The gradient of this function is
$$\operatorname{grad} f = \frac{x_1}{r}\, e_1 + \frac{x_2}{r}\, e_2 + \frac{x_3}{r}\, e_3 \; ,$$
in spherical coordinates
$$\operatorname{grad} f = (\sin\theta\cos\varphi)\, e_1 + (\sin\theta\sin\varphi)\, e_2 + (\cos\theta)\, e_3 = e_r \; .$$
The gradient vector marks also the steepest slope of this function.
The gradient can be linked with an additional differential concept, the
total differential.
• The right hand side of the definition can be written in the form
dz = ∇f · dr with dr = dx ex + dy ey .
The linear increase of the function in an arbitrary infinitesimal direction dr
is given by the scalar product of gradf with dr . An expression of the form
vector function times (infinitesimal) displacement (with the appropriate
vector function) plays a central role in physics for the discussion of the
concept of work (see Chap. 3.2.3).
• Error analysis (e.g. for lab sessions) is based on the total differential.
The following statement applies to a situation characterised by a function
f (x, y): Two quantities x, y are measured and determine via the relation
z = f (x, y) the quantity z . The magnitude of the total differential is
|dz| = |fx dx + fy dy| .
Use of the triangle inequality |a + b| ≤ |a| + |b| gives the standard estimate
|dz| ≤ |fx ||dx| + |fy ||dy|
if the differentials dx, dy are interpreted as the errors of the measurement.
This estimate is, in view of the linear approximation, only correct if the
errors of measurement |dx| and |dy| are not too large.
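A minimal worked example of this estimate; the function and the measured values are illustrative choices:

```python
# z = f(x, y) = x * y, e.g. an area computed from two measured lengths
fx = lambda x, y: y          # partial derivative f_x
fy = lambda x, y: x          # partial derivative f_y

x, dx = 2.00, 0.01           # measured value and estimated error of measurement
y, dy = 3.00, 0.02
dz_max = abs(fx(x, y)) * dx + abs(fy(x, y)) * dy   # |dz| <= |f_x||dx| + |f_y||dy|
```

For these numbers dz_max = 3·0.01 + 2·0.02 = 0.07, which agrees with the actual worst-case deviation of z up to the neglected second-order term dx·dy.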
• The terminology ’first’ total differential indicates that total differentials of
higher order can be considered. The second total differential of a function
of two variables is4
$$d^2 z = \frac{\partial}{\partial x}(dz)\, dx + \frac{\partial}{\partial y}(dz)\, dy \; .$$
The derivatives involved are
$$\frac{\partial}{\partial x}(dz) = f_{xx}\, (dx) + f_{yx}\, (dy)$$
$$\frac{\partial}{\partial y}(dz) = f_{xy}\, (dx) + f_{yy}\, (dy)$$
so that the second total differential can be written as
$$d^2 z = f_{xx}\,(dx)^2 + 2 f_{xy}\,(dx\, dy) + f_{yy}\,(dy)^2$$
if the order of the differentiation in the mixed derivatives can be inter-
changed. It approximates a surface f by an infinitesimal, tangential surface
of second order.
The definition can be extended to the case of n variables. The first total
differential of a function $z = f(x_1, \dots, x_n)$ is
$$dz = \sum_{i=1}^{n} f_{x_i}(x_1, \dots, x_n)\, dx_i \; .$$
The rules for partial differentiation do not differ greatly from the rules for
ordinary differentiation. An example in support of this statement is the rule
for the differentiation of a product
$$\frac{\partial}{\partial x}\bigl( u(x,y)\, v(x,y) \bigr) = u_x v + u\, v_x \; .$$
∂x
The proof can be given on the basis of the definition of the partial derivative
and a repeat of the arguments leading to the corresponding rule of ordinary
differentiation.
An exception to the statement is the chain rule. The increased number
of variables in the case of functions of several variables allows a larger variety
of formulae but also a larger spectrum of applications. Consider, for instance,
the following situation: the functions x = x(t), y = y(t) can be inserted into
a function z = f (x, y) . The result is a function of t
z = f (x(t), y(t)) = F (t) .
The set of functions (x(t), y(t)) could represent the parametric representation
of a curve K in the x - y plane. The function F (t) describes in this case the
values of the surface f (x, y) over this curve (Fig. 4.14), that is a curve in
space.
Fig. 4.14. A variant of the chain rule: the function F(t) over the curve K in the x - y plane
The ordinary derivative dF/dt (the tangent line on the curve in space)
can be expressed in terms of the partial derivatives of f and the ordinary
derivatives of x(t) and y(t) . The corresponding chain rule can be derived
with the argument: the infinitesimal difference
dz = f (x + dx, y + dy) − f (x, y)
is in linear approximation
dz = fx (x, y) dx + fy (x, y) dy .
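Dividing this linear approximation by dt gives the chain rule dF/dt = f_x dx/dt + f_y dy/dt, which can be checked on a simple curve (an illustrative choice):

```python
import numpy as np

# curve x(t) = cos t, y(t) = sin t inserted into z = f(x, y) = x y,
# so F(t) = cos t sin t = sin(2t)/2 and dF/dt = cos(2t)
t = 0.8
x, y = np.cos(t), np.sin(t)
xdot, ydot = -np.sin(t), np.cos(t)

fx, fy = y, x                          # partial derivatives of f = x y
dFdt_chain = fx * xdot + fy * ydot     # f_x x'(t) + f_y y'(t)
dFdt_exact = np.cos(2.0 * t)           # direct differentiation of F(t)
```

The chain-rule value and the direct derivative of F(t) coincide.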
All symbols for partial differentiations have to be replaced by the symbol for
ordinary differentiation if only one x - or only one t -variable occurs.
An example of particular interest in physics is the application of the chain rule to the transformation of the gradient and the Laplace operators into polar coordinates.⁵ The gradient operator in plane polar coordinates is
\[ \nabla = e_r \frac{\partial}{\partial r} + e_\varphi \frac{1}{r}\frac{\partial}{\partial \varphi} . \]
⁵ See list of literature for the proof of these formulae.
The calculation for the Laplace operator needs more time. The starting point is again the chain rule;⁶ the result in plane polar coordinates is
\[ \Delta = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial \varphi^2} . \]
⁶ Note the notation U(x, y) and u(r, ϕ).
The corresponding operators in spherical coordinates are
\[ \nabla = e_r \frac{\partial}{\partial r} + e_\varphi \frac{1}{r \sin\theta}\frac{\partial}{\partial \varphi} + e_\theta \frac{1}{r}\frac{\partial}{\partial \theta} , \]
\[ \Delta = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2 \sin^2\theta}\frac{\partial^2}{\partial \varphi^2} + \frac{1}{r^2}\frac{\partial^2}{\partial \theta^2} + \frac{\cot\theta}{r^2}\frac{\partial}{\partial \theta} . \]
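The polar form of the Laplace operator can be spot-checked numerically; the sketch below applies the polar formula, via finite differences, to an arbitrary test function whose Cartesian Laplacian is known exactly:

```python
import math

def phi_xy(x, y):            # arbitrary test function in Cartesian coordinates
    return x**2 * y          # its Laplacian is known exactly: 2*y

def phi_polar(r, p):         # the same function expressed in polar coordinates
    x, y = r * math.cos(p), r * math.sin(p)
    return phi_xy(x, y)

def second(f, t, h=1e-4):    # central second difference
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

def first(f, t, h=1e-4):     # central first difference
    return (f(t + h) - f(t - h)) / (2 * h)

r0, p0 = 1.3, 0.6
lap_polar = (second(lambda r: phi_polar(r, p0), r0)
             + first(lambda r: phi_polar(r, p0), r0) / r0
             + second(lambda p: phi_polar(r0, p), p0) / r0**2)
lap_exact = 2 * r0 * math.sin(p0)   # Δ(x²y) = 2y with y = r sin(ϕ)
assert abs(lap_polar - lap_exact) < 1e-4
print("ok")
```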
4.3 Integration
There exists a number of options for the integration of functions of several
variables. A choice of the domain of integration in the form of curves, areas,
volumes and higher dimensional domains and the consideration of fixed and
open (dependence on the variables) limits of integration is possible. The re-
sult of the integration comprises areas, volumes (also in higher dimensional
spaces) and also functions, if the integration involves e.g. only one of the
variables. The integration of functions of two variables allows, for instance,
the definition and the representation of ’higher functions’. Elliptic integrals,
which are introduced in a separate section (Math.Chap. 4.3.4), constitute an
example of this class of functions. The discussion of integration begins with
integrals of functions of two variables.
Fig. 4.15. The integral F_ab(x): (a) the domain of integration; (b) the dependence on x

The simplest possibility is the integral over one of the variables
\[ F_{ab}(x) = \int_a^b f(x,y)\,dy . \]
The interpretation of this integral is: the function f(x_fixed, y) describes the intersection of the surface f with the plane x = x_fixed. The integral corresponds
to the contents of the planar area which is confined by the line segment ab
in the x - y plane, the curve of intersection and appropriate parallels to the
z-axis. It is actually a standard integral which is embedded in a three dimen-
sional world. Different values of x yield different curves of intersection and
hence different areas. The evaluation of the integral determines not only one
area but a whole family of areas (Fig. 4.15b).
The result for the explicit example (Fig. 4.16)
\[ \int_0^1 (x^2 + y^2)\,dy = \Bigl[\,x^2 y + \frac{1}{3}y^3\,\Bigr]_0^1 = x^2 + \frac{1}{3} \]
can be interpreted as follows: the intersection for x = 0 is the parabola z = y². The area under the parabola between the limits 0 and 1 has the value 1/3. The area under the curve z = x² + y² for x ≠ 0 is composed of the rectangle x² · 1 and the area under the arc of the parabola (Fig. 4.16b).
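The family of areas F(x) = x² + 1/3 can be reproduced with a simple Riemann sum (a numerical sketch, not part of the original text):

```python
def F(x, n=20000):
    """Approximate ∫₀¹ (x² + y²) dy with a midpoint Riemann sum."""
    h = 1.0 / n
    return sum((x**2 + ((j + 0.5) * h)**2) * h for j in range(n))

for x in (0.0, 0.5, 1.0):
    exact = x**2 + 1.0 / 3.0
    assert abs(F(x) - exact) < 1e-6
print("ok")
```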
There exists no reason to favour the y-coordinate. The integral
\[ G_{\alpha\beta}(y) = \int_\alpha^\beta f(x,y)\,dx \]
can be interpreted in a similar fashion: it represents the area under the intersection of the surface f and the plane y = y_fixed. Additional examples are the calculation of two integrals with the function y^x
\[ \int_0^1 y^x\,dy = \Bigl[\,\frac{y^{x+1}}{x+1}\,\Bigr]_0^1 = \frac{1}{x+1} \qquad (x \neq -1) \]
\[ \int_0^1 y^x\,dx = \Bigl[\,\frac{y^x}{\ln y}\,\Bigr]_0^1 = \frac{y-1}{\ln y} \qquad (y > 0,\; y \neq 1) \]
Fig. 4.16. Integration under a paraboloid: (a) cut for x = 0; (b) cut for x = fixed ≠ 0
The complete elliptic integrals
\[ F(k) = \int_0^{\pi/2} \bigl(1 - k^2 \sin^2 y\bigr)^{-1/2} dy \qquad \text{(first kind)} \]
\[ E(k) = \int_0^{\pi/2} \bigl(1 - k^2 \sin^2 y\bigr)^{1/2} dy \qquad \text{(second kind)} \]
are defined in this fashion. These functions play a role in the discussion of different physical problems, such as the motion of a mathematical pendulum for large displacements from the equilibrium position (Chap. 4.2.1) or the motion of spinning tops (Chap. 6.3.7). Elliptic integrals are introduced in detail in Math.Chap. 4.3.4. Two simpler functions, which are
represented in terms of such integrals are:
• The function defined by the integral (Fig. 4.17)
\[ F(x) = \int_0^\infty \frac{\sin xy}{y}\,dy = \lim_{b \to \infty} \int_0^b \frac{\sin xy}{y}\,dy . \]
The integrand is continuous for y = 0
\[ \lim_{y \to 0} \frac{\sin xy}{y} = x \qquad \lim_{x \to 0} \frac{\sin xy}{y} = 0 . \]
The area, which is represented by the integral, is reasonably complicated.
The intersection of a plane x = const. with the surface
\[ f(x,y) = \frac{\sin(\mathrm{const.}\; y)}{y} \]
is illustrated in Fig. 4.17. The integral can be simplified slightly by a substitution.
Fig. 4.18. The integral with (sin xy)/y defines a step function
• The next example is the often used Γ -function (Gamma function) which
is defined by the improper integral
\[ \Gamma(x) = \int_0^\infty y^{x-1} e^{-y}\,dy \]
(see also Vol. 2, Math.Chap. 4.1). On the basis of the definition the follow-
ing properties can be established:
∗ The integral is simple for x = 1
\[ \Gamma(1) = \int_0^\infty e^{-y}\,dy = \bigl[-e^{-y}\bigr]_0^\infty = 1 . \]
∗ Partial integration of
\[ x\,\Gamma(x) = \int_0^\infty x\,y^{x-1} e^{-y}\,dy \qquad (x > 0) \]
yields
\[ x\,\Gamma(x) = \bigl[\,y^x e^{-y}\,\bigr]_0^\infty + \int_0^\infty y^x e^{-y}\,dy . \]
The first term vanishes for x > 0 . The second term represents the function
Γ (x + 1) .
This result can be used as an alternative definition of the Gamma function
(for x > 0) in the form of a functional equation
Γ (x + 1) = xΓ (x) .
For integer values of x follows
\[ \Gamma(n) = (n-1)\,\Gamma(n-1) . \]
The exploitation of this recursion relation starting with Γ(1) = 1 leads to
\[ \Gamma(2) = 1, \quad \Gamma(3) = 2 \cdot 1, \quad \Gamma(4) = 3 \cdot 2 \cdot 1, \ \text{etc.} \;\longrightarrow\; \Gamma(n) = (n-1)! \,. \]
The Γ -function is a generalisation of the factorial.
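Both the functional equation and the factorial property can be checked directly against the Gamma function in Python's standard library:

```python
import math

x = 2.7
# functional equation Γ(x+1) = x Γ(x) at a non-integer argument
assert abs(math.gamma(x + 1) - x * math.gamma(x)) < 1e-12

# Γ(n) = (n-1)! for positive integers
for n in range(1, 10):
    rel = abs(math.gamma(n) - math.factorial(n - 1)) / math.factorial(n - 1)
    assert rel < 1e-12
print("ok")
```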
It is necessary to answer the question whether and how a function, defined
by such integrals, can be differentiated. The answer (rule) is:
The derivative of a function
\[ F_{ab}(x) = \int_a^b f(x,y)\,dy \]
is
\[ \frac{dF_{ab}(x)}{dx} = \int_a^b f_x(x,y)\,dy . \]
The limits of integration are functions of x in this case (Fig. 4.19). The
function y = a(x) as well as the function y = b(x) represent curves in the
x - y plane. The interval of integration for the integration over y is confined
by these curves (for each value of x).
Fig. 4.19. Variable limits of integration: the curves y = a(x), y = b(x) and the intervals of integration for x = 0, x = 1/2, x = 1
An explicit example is
\[ F(x) = \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} (3 - x^2 - y^2)\,dy = \Bigl[\,(3-x^2)\,y - \frac{1}{3}y^3\,\Bigr]_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} = \frac{4}{3}\,(4 - x^2)\sqrt{1-x^2} \,. \]
The rules for the differentiation of integrals, which define a function, have
to be extended for the case of integrals with variable limits. This extension
is based on the following two statements:
(i) The chain rule requires
\[ \frac{d}{dx}\,g\bigl(x, a(x), b(x)\bigr) = \frac{\partial g}{\partial x} + \frac{\partial g}{\partial a}\frac{da}{dx} + \frac{\partial g}{\partial b}\frac{db}{dx} . \]
(ii) The derivative with respect to a variable limit is
\[ \frac{d}{dx} \int_{x_0}^{x} f(\tilde{x})\,d\tilde{x} = f(x) . \]
Combination of these statements yields the extended rule
\[ \frac{d}{dx} \int_{a(x)}^{b(x)} f(x,y)\,dy = \int_{a(x)}^{b(x)} f_x(x,y)\,dy + f\bigl(x, b(x)\bigr)\frac{db(x)}{dx} - f\bigl(x, a(x)\bigr)\frac{da(x)}{dx} . \]
The last term arises from the derivative with respect to the lower limit.
(Check the validity of this rule for the example given above).
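The suggested check can be carried out numerically; the sketch compares the extended rule with a direct numerical derivative of the closed-form result F(x) = (4/3)(4 − x²)√(1 − x²) obtained above:

```python
import math

def f(x, y):                    # integrand of the example
    return 3 - x**2 - y**2

def F(x):                       # closed-form result obtained above
    return (4.0 / 3.0) * (4 - x**2) * math.sqrt(1 - x**2)

x = 0.5
b = math.sqrt(1 - x**2)         # upper limit b(x); the lower limit is a(x) = -b(x)
db = -x / b                     # db/dx; da/dx = -db

# extended rule: ∫ f_x dy + f(x, b) db/dx - f(x, a) da/dx, with f_x = -2x
rule = -2 * x * (2 * b) + f(x, b) * db - f(x, -b) * (-db)

# direct numerical derivative of the closed form
eps = 1e-6
direct = (F(x + eps) - F(x - eps)) / (2 * eps)
assert abs(rule - direct) < 1e-5
print("ok")
```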
A step beyond the standard integration, although still embedded in R3 ,
is taken by the discussion of double integrals of a function f (x, y) in the
following section.
The first possibility is the double integral over the rectangle α ≤ x ≤ β, a ≤ y ≤ b
\[ I_1 = \int_\alpha^\beta dx \int_a^b dy\; f(x,y) . \]

Fig. 4.21. Integration over a rectangular domain: illustration of the integral I_1
The explicit form of the integral indicates the manner in which it is evalu-
ated: the areas Fab (x) (inner integration) are obtained first. The second step
involves the addition of infinitesimal slabs Fab (x) dx . The result of this outer
integration is a volume (Fig. 4.22a). It is assumed that appropriate limiting
processes are used for each of the integrations.
There exist additional options to subdivide the total volume. A second
possibility is expressed by the integral
\[ I_2 = \int_a^b dy \int_\alpha^\beta dx\; f(x,y) . \]
Areas parallel to the x - z plane are obtained first in this case. This is followed
by adding infinitesimal slabs in the y -direction (Fig. 4.22b). It might be
expected that the two volumes calculated are equal, I_1 = I_2. This is indeed the case if the following condition is met: f(x,y) has to be bounded over the domain of integration,
\[ |f(x,y)| < M \,, \]
except possibly at a finite number of points.
Fig. 4.22. Integration over a rectangular domain: (a) segmentation with dx; (b) segmentation with dy
Fig. 4.23. Integration over a rectangular domain: division into infinitesimal columns

A third option is the direct subdivision of the domain into infinitesimal columns with base dx dy (Fig. 4.23). The three limiting processes yield the
same value for the volume I1 = I2 = I3 provided f is bounded over the do-
main R . This statement is in so far very useful as the evaluation of a domain
integral is in general only possible if it can be reduced to a double integral
(with two ordinary integrations which are executed one after the other).
The following example uses the function
\[ f(x, y) = x^2 + y^2 . \]

Fig. 4.24. The surface f(x,y) = x² + y² over the square −4 ≤ x, y ≤ 4
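That the two orders of integration agree for this bounded function can be confirmed with iterated midpoint sums over the unit square (the domain is an arbitrary illustrative choice):

```python
def inner_then_outer(x_outer, n=400):
    """Iterated midpoint sums for f(x,y) = x² + y²: inner integral first."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h          # outer variable
        row = 0.0
        for j in range(n):
            t = (j + 0.5) * h      # inner variable
            x, y = (s, t) if x_outer else (t, s)
            row += (x**2 + y**2) * h
        total += row * h
    return total

I1 = inner_then_outer(True)    # x outer, y inner
I2 = inner_then_outer(False)   # y outer, x inner
exact = 2.0 / 3.0
assert abs(I1 - exact) < 1e-4 and abs(I2 - exact) < 1e-4
print("ok")
```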
The first step of the next task, the calculation of volumina over an arbi-
trary domain, involves the integrals
\[ I_1 = \int_\alpha^\beta dx \int_{a(x)}^{b(x)} dy\; f(x,y) = \iint_{B_1} f(x,y)\,dx\,dy \]
respectively
\[ I_2 = \int_a^b dy \int_{\alpha(y)}^{\beta(y)} dx\; f(x,y) = \iint_{B_2} f(x,y)\,dx\,dy . \]
The domain of integration is bounded by the lines x = α, x = β and the curves y = a(x), y = b(x) in the first example (Fig. 4.25a), and by the curves x = α(y), x = β(y) and the lines y = a, y = b in the second example (Fig. 4.25b). The values of the integrals will obviously be different (in general) as the domains of integration are different.
It is preferable, in view of the shapes of the domain of integration, to divide
the domain of integration into slabs parallel to the y -axis in the first and into
slabs parallel to the x -axis in the second case. The two integrals can also be
listed as
\[ \iint_{B_i} f(x,y)\,dx\,dy \qquad i = 1, 2 \,, \]
Fig. 4.25. Domains of integration with variable limits: (a) variable limits in the y-direction; (b) variable limits in the x-direction
Integrals of the type indicated can also be used to deal with a domain of
integration which is limited only by curvilinear boundaries.
• This is, for example, the case if the curves y = a(x) and y = b(x) intersect
in the points with x = α and x = β (Fig. 4.26a).
• A similar situation is encountered if the curves x = a(y) and x = b(y)
intersect in the points with y = a and y = b (Fig. 4.26b).
• More complicated shapes of the domain can be treated by a subdivision
into suitable sub-domains. An example is the domain of Fig. 4.27 with
a kidney shape. The subdivision can be oriented with respect to the x -
direction or the y -direction. The following subdivision offers itself for the
choice of an inner integration in the y -direction
\[ \iint_B f(x,y)\,dx\,dy = \iint_{B_1} f(x,y)\,dx\,dy + \iint_{B_2} f(x,y)\,dx\,dy + \iint_{B_3} f(x,y)\,dx\,dy . \]
Fig. 4.26. Examples for domains of integration with curvilinear boundaries: (a) variable boundaries in the y-direction; (b) variable boundaries in the x-direction
Fig. 4.27. A kidney-shaped domain: subdivision into B₁, B₂, B₃ by the curves a_1(x), ..., a_4(x) and the abscissae α_1, ..., α_4
\[ \iint_B f(x,y)\,dx\,dy = \int_{\alpha_1}^{\alpha_2} dx \int_{a_4(x)}^{a_1(x)} dy\; f(x,y) + \int_{\alpha_2}^{\alpha_3} dx \int_{a_2(x)}^{a_1(x)} dy\; f(x,y) + \int_{\alpha_2}^{\alpha_4} dx \int_{a_4(x)}^{a_3(x)} dy\; f(x,y) \]
uses stripes parallel to the y-axis. The limiting curves are semicircles in the upper and lower half-planes (Fig. 4.28a)
\[ \iint_K f(x,y)\,dx\,dy = \int_{-R}^{R} dx \int_{-\sqrt{R^2-x^2}}^{\sqrt{R^2-x^2}} dy\; f(x,y) . \]
The subdivision with stripes parallel to the x-axis reads instead
\[ \iint_K f(x,y)\,dx\,dy = \int_{-R}^{R} dy \int_{-\sqrt{R^2-y^2}}^{\sqrt{R^2-y^2}} dx\; f(x,y) . \]
Fig. 4.28. Integration over a circular domain: (a) stripes parallel to the y-axis; (b) stripes parallel to the x-axis
Only terms with arcsin contribute for the limits specified. The final result is
\[ V = \frac{8}{3}\pi - \frac{1}{6}\pi = \frac{5}{2}\pi \,. \]
• The third example deals with a triangular domain marked by the corner
points (0,0), (0,2) and (1,2). The double integral is
\[ \iint_B f(x,y)\,dx\,dy = \int_0^1 dx \int_{2x}^{2} dy\; f(x,y) \]
if the y -integration is chosen as the inner integration.
Fig. 4.29. The triangular domain: boundaries y = 2x, y = 2 and x = 0
• The following options are possible for integration over a circular ring (ex-
ample 4) with the radii R1 and R2 (R1 < R2 ). The decomposition is simple
if the function f is defined and continuous over the interior of the annulus
(Fig. 4.30a)
\[ \iint_B f(x,y)\,dx\,dy = \iint_{K_2} f(x,y)\,dx\,dy - \iint_{K_1} f(x,y)\,dx\,dy . \]
Fig. 4.30. Integration over a circular ring: (a) as the difference of the circles K₁ (radius R₁) and K₂ (radius R₂); (b) subdivision into the four sub-domains 1–4 with limiting curves a_1, ..., a_4
\[ \iint_B f\,dx\,dy = \int_{-R_2}^{-R_1} dx \int_{-\sqrt{R_2^2-x^2}}^{\sqrt{R_2^2-x^2}} dy\, f(x,y) + \int_{-R_1}^{R_1} dx \int_{\sqrt{R_1^2-x^2}}^{\sqrt{R_2^2-x^2}} dy\, f(x,y) \]
\[ \qquad + \int_{-R_1}^{R_1} dx \int_{-\sqrt{R_2^2-x^2}}^{-\sqrt{R_1^2-x^2}} dy\, f(x,y) + \int_{R_1}^{R_2} dx \int_{-\sqrt{R_2^2-x^2}}^{\sqrt{R_2^2-x^2}} dy\, f(x,y) . \]
Fig. 4.31. Polar coordinates: the infinitesimal area element dS = (dr)(r dϕ); the image of the circular domain B (radius R) in the (r, ϕ) plane is a rectangle with sides R and 2π
Fig. 4.34. Infinitesimal subdivision of a two-dimensional domain in arbitrary curvilinear coordinates: (a) the infinitesimal tetragon with corners 1–4 spanned by the coordinate lines u, u + du and v, v + dv; (b) vectorial representation of the area of the tetragon
magnitude of the infinitely small area is of interest and not its orientation. It
is therefore sufficient to use
\[ dS = |dS_z| = \frac{1}{2}\,\bigl| (x_1 - x_3)(y_4 - y_2) - (x_4 - x_2)(y_1 - y_3) \bigr| . \]
Insertion of the coordinates of the corner points yields
\[ dS = \left| \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u} \right| du\,dv . \]
The absolute value can be written in terms of a determinant
\[ |D| = \left| \det \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} \right| = \left| \frac{\partial(x,y)}{\partial(u,v)} \right| . \]
The determinant D is known as the Jacobian determinant or simply the
Jacobian of the transformation x = x(u, v), y = y(u, v) . The corresponding
matrix is the Jacobian matrix. The substitution rule can be formulated with
this definition in the form
\[ \iint_B f(x,y)\,dx\,dy = \iint_{B'} f\bigl(x(u,v), y(u,v)\bigr)\, \left| \frac{\partial(x,y)}{\partial(u,v)} \right| du\,dv . \]
Fig. 4.35. The domain B in the x-y plane and its image B′ in the u-v plane
The image of the domain B is a rectangle with the sides 1 and 2π for the
substitution
\[ x = a u \cos v \qquad y = b u \sin v . \]
The Jacobian is
\[ |D| = \left| \det \begin{pmatrix} a \cos v & -a u \sin v \\ b \sin v & b u \cos v \end{pmatrix} \right| = a b u . \]
The calculation yields therefore
\[ I_1 = F(\text{ellipse}) = \iint_B dx\,dy = a b \int_0^{2\pi} dv \int_0^1 u\,du = (ab)(2\pi)\Bigl[\frac{u^2}{2}\Bigr]_0^1 = a b \pi \]
and
\[ I_2 = V(\text{ellipsoid}/2) = c \iint_B \Bigl( 1 - \frac{x^2}{a^2} - \frac{y^2}{b^2} \Bigr)^{1/2} dx\,dy = a b c \int_0^{2\pi} dv \int_0^1 u\,\bigl(1 - u^2\bigr)^{1/2} du \]
\[ \qquad\quad = a b c\,(2\pi) \Bigl[ -\frac{1}{3}\bigl(1-u^2\bigr)^{3/2} \Bigr]_0^1 = \frac{2}{3}\,a b c \pi . \]
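Both results can be cross-checked by evaluating the u-v integrals numerically (the semi-axes are arbitrary choices):

```python
import math

a, b, c = 2.0, 1.0, 0.5
n = 2000
du = 1.0 / n

area = vol = 0.0
for i in range(n):
    u = (i + 0.5) * du
    # the v-integration is trivial (integrands independent of v): factor 2π
    area += a * b * u * du * 2 * math.pi                           # Jacobian abu, f = 1
    vol += a * b * c * u * math.sqrt(1 - u**2) * du * 2 * math.pi  # half-ellipsoid

assert abs(area - math.pi * a * b) < 1e-3
assert abs(vol - 2.0 / 3.0 * math.pi * a * b * c) < 1e-3
print("ok")
```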
Fig. 4.37. Integration over a line segment in R³ (from x = α to x = β)
Fig. 4.38. Integration over two- and three-dimensional domains: (a) integration over a planar rectangular area in R³; (b) integration over a cuboid in R³
No additional calculational aspects arise in any of the three cases. The integral
I with f (x, y, z) = x + y + z is for instance
\[ I = \int_0^1 dz \int_0^1 dy \int_0^1 dx\; (x + y + z) = \int_0^1 dz \int_0^1 dy\, \Bigl( \frac{1}{2} + y + z \Bigr) = \int_0^1 dz\, (1 + z) = \frac{3}{2} . \]
The question of an interchange of the order of the integration in the cases 2
and 3 can be answered as before. The condition is: interchange is possible if
the function f (x, y, z) is bounded over the appropriate domain. The following
shorthand formulae can be used in this case
\[ h(z) = \iint_R f(x,y,z)\,dx\,dy = \iint_R f(x,y,z)\,dS_{xy} \]
respectively
\[ I = \iiint_Q f(x,y,z)\,dx\,dy\,dz = \iiint_Q f(x,y,z)\,dV . \]
Fig. 4.39. Integration over a straight line segment in R³ with variable limits α(y,z) and β(y,z)
\[ 2.\quad h(z) = \int_{a(z)}^{b(z)} dy \int_{\alpha(y,z)}^{\beta(y,z)} f(x,y,z)\,dx . \]
Fig. 4.40. Variable limits of integration in R³: (a) the limits α(z), β(z) for fixed z; (b) the limits α(y,z), β(y,z) for fixed z
This standard domain BS (Fig. 4.41), defined by the limits indicated, can be
described as follows: the integration in the x -direction is limited by the planes
x = α, β . The limits of the y -integration are the surfaces y = a(x), b(x) .
The basic area, which is defined in this fashion, is projected into the space
and provided with a surface at the bottom z = A(x, y) and on the top
z = B(x, y) . The three-dimensional domain can be subdivided into arbitrary
infinitesimal volume elements (infinitesimal cuboids is only one possibility) if
the integrand is bounded. The limiting process leading to
\[ I = \iiint_{B_S} f(x,y,z)\,dV \]
Fig. 4.41. The standard domain B_S: the x-integration runs from α to β
can be carried out after construction and addition of the infinitesimal four-
dimensional volume elements f (x, y, z) dV .
The integral with integrand f = 1 can be used to calculate the contents
of arbitrary three-dimensional volumina (compare the situation for integrals
with f (x, y))
\[ V_B = \iiint_B dx\,dy\,dz = \iiint_B dV . \]
This statement is illustrated with some examples using a subdivision of the
domain of integration in the form
\[ \int_\alpha^\beta dx \int_{a(x)}^{b(x)} dy \int_{A(x,y)}^{B(x,y)} f(x,y,z)\,dz . \]
• The limits of the integration for a spherical volume (Fig. 4.42) about the
origin (radius R) are for this order of integration:
two hemispheres (bottom and lid) in the z -direction
\[ A(x,y) = -\bigl(R^2 - x^2 - y^2\bigr)^{1/2} \qquad B(x,y) = \bigl(R^2 - x^2 - y^2\bigr)^{1/2} \,, \]
two circular arcs in the y -direction
\[ a(x) = -\bigl(R^2 - x^2\bigr)^{1/2} \qquad b(x) = \bigl(R^2 - x^2\bigr)^{1/2} \]
and constant limits
\[ \alpha = -R \qquad \beta = R \]
in the x -direction.
The symmetric form of the domain allows an interchange of the order of the
integration with a corresponding adjustment of the limits. The volume of
a sphere could now be calculated. It is, however, apparent that a transition
to spherical coordinates offers advantages.
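The Cartesian limits above can be fed directly into a numerical triple integral; the sketch below recovers the volume 4πR³/3 for f = 1 (the accuracy is limited by the curved boundary):

```python
import math

R = 1.0
n = 400
h = 2 * R / n
vol = 0.0
for i in range(n):
    x = -R + (i + 0.5) * h
    for j in range(n):
        y = -R + (j + 0.5) * h
        s = R**2 - x**2 - y**2
        if s > 0:                             # inside the circle a(x) < y < b(x)
            vol += 2 * math.sqrt(s) * h * h   # column height B - A = 2*sqrt(s)
assert abs(vol - 4.0 / 3.0 * math.pi * R**3) < 1e-2
print("ok")
```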
• An elliptic cylinder over the x - y plane with a parabolic lid (Fig. 4.43) is
characterised by the following limits:
The bottom of the domain is the x-y plane, A(x,y) = 0; the lid is the paraboloid z = B(x,y).

Fig. 4.43. An elliptic cylinder with a parabolic lid
Fig. 4.44. Spherical coordinates r, θ, ϕ
The present rule is a direct extension of the rule for double integrals.
The domain B is subdivided with the aid of the surfaces u(x, y, z) = const.,
v(x, y, z) = const., w(x, y, z) = const. in order to prove the rule. The irregular
infinitesimal volume elements, which are formed by the infinitesimal surfaces,
are then approximated linearly.
Two important infinitesimal volume elements, which are used constantly, should be kept in mind:
\[ \text{cylinder coordinates:}\quad dV = \rho\,d\rho\,d\varphi\,dz \]
\[ \text{spherical coordinates:}\quad dV = r^2 \sin\theta\,dr\,d\theta\,d\varphi \,. \]
Integrals of this type can be found in all branches of physics. Their discussion and methods for their evaluation are fashioned after the simpler cases. The actual evaluation might, however, prove to be quite taxing.
All integrals of this kind can be classified with respect to three standard
integrals. They are known as elliptic integrals of the first, the second
or the third kind. The definitions of the simplest forms (a form more or less adapted to applications in physics) of these integrals are
\[ \text{first kind:}\quad F(\varphi, k) = \int_0^{\varphi} \frac{d\tilde\varphi}{\bigl(1 - k^2 \sin^2\tilde\varphi\bigr)^{1/2}} \,. \]
\[ \text{second kind:}\quad E(\varphi, k) = \int_0^{\varphi} d\tilde\varphi\, \bigl(1 - k^2 \sin^2\tilde\varphi\bigr)^{1/2} \,. \]
\[ \text{third kind:}\quad \Pi_h(\varphi, k) = \int_0^{\varphi} \frac{d\tilde\varphi}{\bigl(1 - h \sin^2\tilde\varphi\bigr)\bigl(1 - k^2 \sin^2\tilde\varphi\bigr)^{1/2}} \]
(h is a number from the interval −∞ < h < ∞ ).
The parameter k is restricted to the interval 0 ≤ k² ≤ 1. It is usually written in the form k² = sin²α. A suitable substitution is required if values of k larger than 1 occur. The substitution
\[ \varphi = \arcsin\Bigl( \frac{1}{k} \sin\theta \Bigr) \qquad\text{with}\qquad d\varphi = \frac{\cos\theta\,d\theta}{\bigl(k^2 - \sin^2\theta\bigr)^{1/2}} \]
transforms e.g. an elliptic integral of the first kind (write κ = 1/k) into
\[ F(\varphi, k) = \int_0^{\varphi} \frac{d\tilde\varphi}{\bigl(1 - k^2 \sin^2\tilde\varphi\bigr)^{1/2}} = \kappa \int_0^{\theta} \frac{\cos\tilde\theta\,d\tilde\theta}{\cos\tilde\theta\,\bigl(1 - \kappa^2 \sin^2\tilde\theta\bigr)^{1/2}} = \kappa \int_0^{\theta} \frac{d\tilde\theta}{\bigl(1 - \kappa^2 \sin^2\tilde\theta\bigr)^{1/2}} \,. \]
Values of k 2 , which are larger than 1 can be handled in this fashion.
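The reciprocal-modulus transformation can be verified numerically; both sides are evaluated with a simple quadrature (the values k = 2 and ϕ = 0.4 are arbitrary, chosen so that k sin ϕ < 1):

```python
import math

def quad(g, lo, hi, n=200000):
    """Midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

k, phi = 2.0, 0.4
kappa = 1.0 / k
theta = math.asin(k * math.sin(phi))   # substitution sin(θ) = k sin(ϕ)

lhs = quad(lambda p: 1.0 / math.sqrt(1 - k**2 * math.sin(p)**2), 0.0, phi)
rhs = kappa * quad(lambda t: 1.0 / math.sqrt(1 - kappa**2 * math.sin(t)**2),
                   0.0, theta)
assert abs(lhs - rhs) < 1e-6
print("ok")
```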
A form of the elliptic integrals, which is used often in mathematics, follows
from the substitution
\[ t = \sin\varphi \qquad d\varphi = \frac{dt}{\sqrt{1 - t^2}} \,. \]
The integrals then take the form
\[ F(t, k) = \int_0^t d\tilde t\, \bigl[ (1 - \tilde t^{\,2})(1 - k^2 \tilde t^{\,2}) \bigr]^{-1/2} \]
\[ E(t, k) = \int_0^t d\tilde t\, \Bigl[ \frac{1 - k^2 \tilde t^{\,2}}{1 - \tilde t^{\,2}} \Bigr]^{1/2} \]
\[ \Pi_h(t, k) = \int_0^t d\tilde t\, \bigl(1 - h \tilde t^{\,2}\bigr)^{-1} \bigl[ (1 - \tilde t^{\,2})(1 - k^2 \tilde t^{\,2}) \bigr]^{-1/2} \,. \]
The original polynomials under the square root are apparent here.
The elliptic integral of the second kind is often rewritten as
\[ E(t, k) = \int_0^t \frac{(1 - k^2 \tilde t^{\,2})\,d\tilde t}{\bigl[(1 - \tilde t^{\,2})(1 - k^2 \tilde t^{\,2})\bigr]^{1/2}} = F(t, k) - k^2 \int_0^t \frac{\tilde t^{\,2}\,d\tilde t}{\bigl[(1 - \tilde t^{\,2})(1 - k^2 \tilde t^{\,2})\bigr]^{1/2}} \,. \]
\[ \text{second kind:}\quad \int dt\, \bigl[\, t^2 \bigl(A_3 + B_3 t^2\bigr)\bigl(A_4 + B_4 t^2\bigr) \bigr]^{-1/2} \]
\[ \text{third kind:}\quad \int dt\, \bigl(1 - t^2\bigr)^{-1} \bigl[ \bigl(A_5 + B_5 t^2\bigr)\bigl(A_6 + B_6 t^2\bigr) \bigr]^{-1/2} \,. \]
⁹ See list of literature.
5 Basic concepts of vector analysis
The analysis of functions of several variables deals with the situation that
exactly one number is assigned to each point of a domain in an n-dimensional
space
\[ (x_1, \dots, x_n) \;\xrightarrow{\;f\;}\; f(x_1, \dots, x_n) \,. \]
It is, however, also possible to assign an m-tuple of numbers to each point of
such a domain
\[ (x_1, \dots, x_n) \;\xrightarrow{\;\{f_1, \dots, f_m\}\;}\; \bigl\{ f_1(x_1, \dots, x_n), \dots, f_m(x_1, \dots, x_n) \bigr\} \,. \]
These m functions can be interpreted as the components of a vector in an
m-dimensional representation space (with an orthogonal basis) so that the
set can be summarised as a vector function
\[ f(x_1, \dots, x_n) = \sum_{k=1}^{m} f_k(x_1, \dots, x_n)\, e_k \,. \]
The discussion of the differentiation of vector fields has to start pro forma
with the definition of the derivative with respect to one variable. The defini-
tion (and the nomenclature) of this limiting value is
\[ \frac{\partial}{\partial x_i} f(x_1, \dots, x_n) = f_{x_i}(x_1, \dots, x_n) = \lim_{h_i \to 0} \frac{1}{h_i} \bigl[ f(x_1, \dots, x_i + h_i, \dots, x_n) - f(x_1, \dots, x_i, \dots, x_n) \bigr] \]
\[ \qquad = \lim_{h_i \to 0} \frac{1}{h_i} \sum_{k=1}^{m} \bigl( f_k(x_1, \dots, x_i + h_i, \dots, x_n) - f_k(x_1, \dots, x_i, \dots, x_n) \bigr)\, e_k \,. \]
Summation and the limiting procedure can be interchanged (and this is the
case of interest here) for a finite sum so that one obtains
\[ \frac{\partial}{\partial x_i} f(x_1, \dots, x_n) = \sum_{k=1}^{m} \frac{\partial f_k(x_1, \dots, x_n)}{\partial x_i}\, e_k \,. \]
\[ \varphi(x_1, x_2, x_3) \;\longrightarrow\; f(x_1, x_2, x_3) = \nabla \varphi(x_1, x_2, x_3) = \sum_{k=1}^{3} \frac{\partial \varphi}{\partial x_k}\, e_k \,. \]
The components of the vector function are the partial derivatives of the scalar
function. The visualisation of the vector f = gradφ is: it is perpendicular to
the tangential plane in each point of a surface φ = const. (Fig. 5.2).
Fig. 5.2. The vector grad φ is perpendicular to the surface φ = const.
The divergence of a vector field,
\[ \operatorname{div} f(x_1, x_2, x_3) = \frac{\partial f_1}{\partial x_1} + \frac{\partial f_2}{\partial x_2} + \frac{\partial f_3}{\partial x_3} \,, \]
is a differential operation which associates a scalar function with the vector function f
\[ f \;\xrightarrow{\;\operatorname{div}\;}\; \Phi = \operatorname{div} f \,. \]
The rotation of a vector field f is defined by the following equation
\[ \operatorname{rot} f(x_1, x_2, x_3) = \Bigl( \frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3} \Bigr) e_1 + \Bigl( \frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1} \Bigr) e_2 + \Bigl( \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2} \Bigr) e_3 \,. \]
\[ \nabla = \sum_{k=1}^{3} e_k\, \frac{\partial}{\partial x_k} \,. \]
\[ \nabla \cdot f(x_1, x_2, x_3) = \sum_{i=1}^{3} \sum_{k=1}^{3} e_i \cdot e_k\, \frac{\partial f_k(x_1, x_2, x_3)}{\partial x_i} = \sum_{ik} (e_i \cdot e_k)\, \frac{\partial f_k}{\partial x_i} = \sum_{k} \frac{\partial f_k(x_1, x_2, x_3)}{\partial x_k} \,. \]
¹ The name curl is often used in the Anglo-Saxon literature instead of rot.
The order of the ’vectors’ can obviously not be interchanged in this ’scalar
product’.
• The rotation of a vector function
\[ \operatorname{rot} f(x_1, x_2, x_3) = \nabla \times f(x_1, x_2, x_3) \]
(b) The combination ∇ × (∇φ(r)) yields a zero vector for every scalar
vector function which can be differentiated twice
∇ × (∇φ(r)) = rot (grad φ) = 0 .
(c) The combination ∇ · (∇ × f (r)) generates the number zero for every
vector function which can be differentiated twice
∇ · (∇ × f (r)) = div (rotf ) = 0 .
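Both identities can be spot-checked numerically with central differences (the test fields φ and f are arbitrary illustrative choices):

```python
EPS = 1e-5

def phi(x, y, z):                 # arbitrary twice-differentiable scalar field
    return x**2 * y + y * z**3

def f(x, y, z):                   # arbitrary twice-differentiable vector field
    return (y**2 * z, x * z**2, x**2 * y)

def grad(g, p):
    return tuple((g(*[p[j] + EPS * (j == i) for j in range(3)])
                  - g(*[p[j] - EPS * (j == i) for j in range(3)])) / (2 * EPS)
                 for i in range(3))

def rot(F, p):
    def d(i, k):                  # ∂F_k/∂x_i by central difference
        return (F(*[p[j] + EPS * (j == i) for j in range(3)])[k]
                - F(*[p[j] - EPS * (j == i) for j in range(3)])[k]) / (2 * EPS)
    return (d(1, 2) - d(2, 1), d(2, 0) - d(0, 2), d(0, 1) - d(1, 0))

def div(F, p):
    return sum((F(*[p[j] + EPS * (j == i) for j in range(3)])[i]
                - F(*[p[j] - EPS * (j == i) for j in range(3)])[i]) / (2 * EPS)
               for i in range(3))

p = (0.4, -0.7, 1.1)
assert all(abs(c) < 1e-4 for c in rot(lambda *q: grad(phi, q), p))  # rot grad φ = 0
assert abs(div(lambda *q: rot(f, q), p)) < 1e-4                      # div rot f = 0
print("ok")
```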
A physical interpretation of the application of the operators divergence
and rotation on a vector function can be obtained with the aid of the inverse
operation, the integration of vector functions. This topic is the theme of the
next section.
\[ r(t) = \sum_{i=1}^{3} e_i\, x_i(t) \qquad t_a \le t \le t_b \,. \]
Fig. 5.3. A space curve r(t) between the parameter values t_a and t_b
\[ f(x_1, x_2, x_3) = \sum_i f_i(x_1, x_2, x_3)\, e_i \]
is defined in a spatial domain G, which contains the curve.
One of the consequences of this rule for physics is the following: the value
of the work for the motion of a point particle in a force field along the
curve from A to B is independent of the nature of the movement. The
same number is obtained for the actual motion of the point particle (t →
time) or any other mode of travelling (τ ) on the same curve.
2. A number of standard rules are
\[ \int_K (f + g) \cdot dr = \int_K f \cdot dr + \int_K g \cdot dr \]
\[ \int_{-K} f \cdot dr = - \int_K f \cdot dr \,. \]
The second equation states that the sign of the line integral is changed if
the line segment is traversed in the opposite direction. This rule follows
from
\[ \int_{t_a}^{t_b} F(t)\,dt = - \int_{t_b}^{t_a} F(t)\,dt \,. \]
3. The following relations hold for two curves which are joined together
\[ \int_{K_1 + K_2} f \cdot dr = \int_{K_1} f \cdot dr + \int_{K_2} f \cdot dr \,. \]
This rule implies that line integrals can not only be defined for smooth
curves but also for arbitrary curves which are joined together.
4. An extension of the third rule are the decomposition theorems. The fol-
lowing theorem is particularly important in the discussion to follow: A
vector function f is defined in a domain G of R3 . A surface F , which
is bordered by a closed curve K, exists in G . The curve K is traversed
in a definite sense. The surface is now decomposed into a set of bordered subdomains (Fig. 5.4). The following theorem holds in this case
Fig. 5.4. Decomposition of a bordered surface into bordered subdomains

\[ \oint_K f \cdot dr = \oint_{K_1} f \cdot dr + \ldots + \oint_{K_n} f \cdot dr \,. \]
The question arises: under which circumstances is the value of a line integral independent of the path of integration (provided the paths are in the domain of definition of the vector function)?
The following example (Fig. 5.5) demonstrates that the value of a line
integral need not necessarily be independent of the path. The example is the
evaluation of the line integral with the vector function
\[ f = \bigl( 2xy,\; y^2,\; 0 \bigr) \]
along the following paths in the x - y plane
\[ K_1:\quad x = t,\; y = 0 \quad (0 \le t \le 1); \qquad x = 1,\; y = t - 1 \quad (1 \le t \le 2) \]
\[ K_2:\quad x = t,\; y = t \quad (0 \le t \le 1) \,. \]
The value of the integral along the path K₁ from the position (0,0) to the position (1,1) parallel to the coordinate axes is
Fig. 5.5. The paths K₁ and K₂ from (0,0) to (1,1)
\[ I_1 = \int_{K_1} f \cdot dr = \int_0^1 (0)\,dt + \int_1^2 \bigl( 0 + (t-1)^2 \bigr)\,dt = \int_1^2 (t-1)^2\,dt = \frac{1}{3} \,. \]
By contrast, the integral along the diagonal in the first quadrant (path K2 )
yields the result
\[ I_2 = \int_{K_2} f \cdot dr = \int_0^1 \bigl( 2t^2 + t^2 \bigr)\,dt = 1 \,. \]
The question concerning the independence of a line integral from the path
can be answered by the following two arguments:
1. It is quite simple to prove that the integral ∫_K f · dr is independent of the path if the vector function is the gradient of a scalar function, f = ∇φ.
The assumption allows the statement
\[ \int_K f \cdot dr = \int_{t_a}^{t_b} \Bigl( \frac{\partial \varphi}{\partial x_1} \frac{dx_1}{dt} + \frac{\partial \varphi}{\partial x_2} \frac{dx_2}{dt} + \frac{\partial \varphi}{\partial x_3} \frac{dx_3}{dt} \Bigr)\,dt \,. \]
Rewrite the expression in the bracket with the chain rule and integrate
to find
\[ = \int_{t_a}^{t_b} \frac{d\varphi}{dt}\,dt = \varphi(t_b) - \varphi(t_a) \]
or in detail
= φ (x1 (tb ), x2 (tb ), x3 (tb )) − φ (x1 (ta ), x2 (ta ), x3 (ta )) .
The line integral depends only on the values of the function φ at the
starting point and the endpoint of the path. It is therefore path indepen-
dent.
2. The proof of the inverse statement
\[ \text{From}\quad \int_{K(A,B)} f \cdot dr = \varphi(B) - \varphi(A) \quad\text{follows}\quad f = \nabla \varphi \]
is more involved. The assumption can be used to write (see Fig. 5.6)
\[ \int_K f \cdot dr = \varphi(x_1, x_2, x_3) - \varphi(A) \]
\[ \int_{K + K_h} f \cdot dr = \varphi(x_1, x_2, x_3 + h) - \varphi(A) \,. \]
Fig. 5.6. Illustration for the proof of the relation between line integration and
gradient formation
Subtraction of the two equations, division by h and the limit h → 0 yield
\[ \frac{\partial \varphi}{\partial x_3} = f_3(x_1, x_2, x_3) \,. \]
A corresponding argument can be given for an infinitesimal path parallel to the x₁- or the x₂-axis.
The statements
\[ \int_K f \cdot dr \ \text{ is path independent} \qquad\text{and}\qquad f = \nabla \varphi \]
are completely equivalent. The first statement claims that a special class
of vector functions (those which can be represented as the gradient of
a scalar function) exists for which the line integral is path independent.
The second statement confirms that this is the only class of functions
with path independence.
These two basic statements can be cast into a different form.
3. The statement
\[ \int_{K_1} f \cdot dr = \int_{K_2} f \cdot dr \]
implies
\[ \oint f \cdot dr = \int_{K_2} f \cdot dr + \int_{-K_1} f \cdot dr = 0 \]
for the closed curve K₂ − K₁: the line integral along any closed curve vanishes.
4. The four statements
\[ \int_{K(A,B)} f \cdot dr = \varphi(B) - \varphi(A) \qquad\qquad f = \nabla \varphi \]
\[ \oint f \cdot dr = 0 \qquad\qquad \operatorname{rot} f = 0 \]
The validity of one of the statements implies the validity of the other three.
If it has e.g. been verified that the rotation of a vector field f vanishes in a
domain G , then the statements that the line integral along a closed curve
vanishes or the line integral is path independent or f can be represented
as the gradient of a scalar function follow directly. This property of path
independence is of interest for the discussion of the law of energy conservation
in mechanics or for the foundation of electrostatics.
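A minimal numerical illustration of this equivalence: for a gradient field the line integral depends only on the endpoints, whatever path connects them (the potential φ is an arbitrary test choice):

```python
def phi(x, y, z):
    return x * y + z**2          # arbitrary potential

def grad_phi(x, y, z):
    return (y, x, 2 * z)         # f = ∇φ computed by hand

def line_integral(path, n=20000):
    """∫ f · dr along r(t) = path(t), 0 ≤ t ≤ 1, as a polyline sum."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        r0, r1 = path(t - 0.5 * h), path(t + 0.5 * h)
        mid = [(a + b) / 2 for a, b in zip(r0, r1)]
        dr = [b - a for a, b in zip(r0, r1)]
        total += sum(fc * dc for fc, dc in zip(grad_phi(*mid), dr))
    return total

A, B = (0, 0, 0), (1, 1, 1)
straight = lambda t: (t, t, t)        # straight line from A to B
curved = lambda t: (t, t**2, t**3)    # a different path from A to B
expected = phi(*B) - phi(*A)          # path independence: φ(B) - φ(A)
assert abs(line_integral(straight) - expected) < 1e-5
assert abs(line_integral(curved) - expected) < 1e-5
print("ok")
```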
The next class of integrals with vector functions that has to be considered
are the slightly more intricate surface integrals.
The integrals, which are discussed in this section, can be described in the
following manner: specified are an arbitrary surface S in space and a vec-
tor function f which is defined in a domain G. The domain contains the
Fig. 5.7. A surface S in space with a surface element dS and a vector function f
Fig. 5.8. The surface element of a spherical surface: dS = (R dθ)(R sin θ dϕ)
surface normal. This direction is obtained (see Fig. 5.9) from the circulation
according to the right hand rule (or the rule of the screw). A simpler rule is
the right hand grip rule: the four fingers of the right hand indicate the sense
of circulation, the thumb points then in the direction of the (normal) vector.
It should be said, however, that not all surfaces in space can be oriented. A
popular counterexample is the strip of Moebius.
The surface integral can be calculated directly with this descriptive definition for simple situations. An example is the surface integral of a central field f(r) = f(r) e_r over a spherical surface of radius R about the origin
\[ I = \oint_{S_p} f(r) \cdot dS = f(R) \oint_{S_p} dS = 4\pi R^2 f(R) \,. \]
One of the applications of surface integrals can be recognised here: the surface is smoothed out and charted to scale if it is traced with a suitable vector field (unit vector in the direction of the normal).
The evaluation of surface integrals is more complicated if the vector field
is not a central field even if the surface is spherical. A decomposition of the
surface element dS in terms of Cartesian components has to be used in this
case
\[ dS = dS\, e_r = R^2 \sin\theta\, \bigl( \cos\varphi \sin\theta,\; \sin\varphi \sin\theta,\; \cos\theta \bigr)\, d\varphi\, d\theta \]
so that the surface integral of a vector function f = (f₁, f₂, f₃) takes the form
\[ I = \oint_{S_p} f \cdot dS = R^2 \int d\varphi\, d\theta\, \bigl\{ f_1 \cos\varphi \sin^2\theta + f_2 \sin\varphi \sin^2\theta + f_3 \cos\theta \sin\theta \bigr\} \,, \]
where the components f_i are evaluated at the point (x(R,ϕ,θ), y(R,ϕ,θ), z(R,ϕ,θ)).
Fig.: the projections B₁, B₂, B₃ of the surface onto the coordinate planes
\[ \oint_S f \cdot dS = \iint_{B_1} f_1\, dS_1 + \iint_{B_2} f_2\, dS_2 + \iint_{B_3} f_3\, dS_3 \,. \]
A similar statement is possible for the other two domain integrals. The
complete form of the surface integral in Cartesian decomposition is there-
fore
\[ \oint_S f \cdot dS = \iint_{B_1} f_1\bigl( x_1(x_2, x_3), x_2, x_3 \bigr)\, dx_2\, dx_3 + \iint_{B_2} f_2\bigl( x_1, x_2(x_1, x_3), x_3 \bigr)\, dx_1\, dx_3 \]
\[ \qquad + \iint_{B_3} f_3\bigl( x_1, x_2, x_3(x_1, x_2) \bigr)\, dx_1\, dx_2 \,. \]
The representation of vector fields with the aid of field patterns constitutes
a good introduction to this topic. The field lines are the tangential curves,
which are endowed with a direction, on neighbouring field vectors. The field
lines of the simple example of the gravitational field of a mass M at the origin
\[ G = -\gamma M\, \frac{r}{r^3} \]
are rays which are directed radially towards the origin from all directions
(Fig. 5.14). In the gravitational field of two equal masses, which are placed at the positions ±a (Fig. 5.15), the fields of the individual masses dominate the pattern in the vicinity of their position. The field lines are
strongly modified in the region between the two masses. The lines adapt
themselves to the parting plane. The complete field pattern is rotationally
symmetric with respect to the axis on which the masses are placed. The
pattern is an example of a dipole field.
The electric field of a point charge q (replacing a point mass) at the origin
can be repulsive or attractive with field lines emanating from or entering into
the origin
\[ E = -\gamma\, q\, \frac{r}{r^3} \qquad (q = \pm) \,. \]
Another example is the vector field
\[ B = \Bigl( -\frac{y}{x^2 + y^2},\; \frac{x}{x^2 + y^2},\; 0 \Bigr) \,. \]
This field possesses translational symmetry with respect to the z-axis. The
same pattern is obtained for each plane z = const. The field lines are concen-
tric circles about the z -axis (Fig. 5.16). The field represents (up to a constant
factor) the magnetic field of a thin current carrying wire along the z -axis.
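That the field lines of B are concentric circles can be checked pointwise: B is perpendicular to the radial direction in every plane z = const. and its magnitude 1/ρ is constant on each circle (a numerical sketch):

```python
import math

def B(x, y):
    r2 = x**2 + y**2
    return (-y / r2, x / r2)

for (x, y) in [(1.0, 0.0), (0.3, -0.8), (-1.2, 2.5)]:
    bx, by = B(x, y)
    # tangential to circles about the z-axis: B · r = 0
    assert abs(bx * x + by * y) < 1e-12
    # magnitude 1/ρ with ρ the distance from the z-axis
    rho = math.hypot(x, y)
    assert abs(math.hypot(bx, by) - 1.0 / rho) < 1e-12
print("ok")
```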
The surface integrals for the different fields
\[ \int_S G \cdot dS \qquad \int_S E \cdot dS \qquad \int_S B \cdot dS \]
are named the flux of the fields through the surface S, in particular the
gravitational flux, the electric flux, the magnetic flux, etc. This terminology
originates from hydrodynamics, for instance in the form of the velocity field
of a stationary fluid flow (Fig. 5.17). A velocity field v(r) is associated with
every infinitesimal volume element. The flux S v would be a measure of the amount of fluid which flows through the surface per unit time if the velocity field is uniform and the surface planar. An equivalent quantity describing the 'strength of the flow' is needed if the flow is not uniform and/or the surface is not planar (Fig. 5.18); it is provided by the surface integral over the velocity field.
For the gravitational field of a point mass the flux through a spherical surface of radius R about the origin is
\[ \oint_{S_p} G \cdot dS = -\gamma M\, \frac{1}{R^2} \oint_{S_p} e_r \cdot dS = -4\pi \gamma M \,. \]
The result is independent of the radius of the sphere. It is therefore valid for
any spherical surface about the origin. This corresponds to the interpretation
suggested above. The same number of field lines (however normalised) passes
through all spherical surfaces about the origin (Fig. 5.19a).
The same result can be expected for any closed surface S(or) of arbitrary
shape about the origin
\[ \oint_{S(\text{or})} G \cdot dS = -4\pi \gamma M \,. \]
This assertion would follow from the interpretation of the concept of flux.
The same number of field lines, that pass through each of the spherical sur-
faces, passes through any arbitrary closed surface S(or) (Fig. 5.19b). The
expectation for an arbitrary closed surface S(nor) , which does not contain
the origin with a point mass, would be
\[ \oint_{S(\text{nor})} G \cdot dS = 0 \,. \]
The number of field lines entering into the volume enclosed by this surface
equals the number that leave this volume (Fig. 5.20). The proof of these
statements is provided by the theorem of Gauss which is discussed below.
The language used in this connection is: The mass point is referred to as
a source of the field. More generally sources and sinks3 have to be distin-
guished (Figs. 5.21a,b). Field lines emanate from a source, they enter into a
Fig. 5.21. Illustration of sources and sinks
sink. The flux through a closed surface about a source or a sink is not equal
to zero. The flux through a closed surface, which does not contain a source
or sink, vanishes.
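These statements lend themselves to a numerical check. The following Python sketch is an addition to the text; the field strength, radii and grid sizes are arbitrary choices. It integrates G · dS with the midpoint rule over spheres that do and do not enclose the point mass:

```python
import math

def flux(gamma_M, R, d, n_theta=200, n_phi=400):
    """Flux of G = -gamma*M * r/|r|^3 through a sphere of radius R
    centred at (0, 0, d), computed with the midpoint rule."""
    dth, dph = math.pi / n_theta, 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        th = (i + 0.5) * dth
        for j in range(n_phi):
            ph = (j + 0.5) * dph
            # outward unit normal of the sphere
            nx = math.sin(th) * math.cos(ph)
            ny = math.sin(th) * math.sin(ph)
            nz = math.cos(th)
            x, y, z = R * nx, R * ny, d + R * nz      # point on the surface
            r3 = (x * x + y * y + z * z) ** 1.5
            G_dot_n = -gamma_M * (x * nx + y * ny + z * nz) / r3
            total += G_dot_n * R * R * math.sin(th) * dth * dph
    return total

# origin (the source) inside: flux = -4*pi*gamma*M, independent of R and d
print(flux(1.0, 1.0, d=0.3))   # ~ -4*pi
# origin outside: flux vanishes
print(flux(1.0, 1.0, d=2.0))   # ~ 0
```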
A quantitative description of this situation is the content of the theorem
of Gauss, which is also known as the divergence theorem. The theorem reads
as follows
3
In some instances (e.g. for semiconductors) the word ’drain’ is used instead of
’sink’.
5.3 Integration of vector functions 189
∫_V div f dV = ∮_{S(V)} f · dS .
The restriction for the volume indicates that the domain should be convex
(Fig. 5.23).
The proof of the theorem proceeds in the following fashion: write the left
side of the central relation as
∫_V div f (x1 , x2 , x3 ) dV = ∫∫∫_V ( ∂f1/∂x1 + ∂f2/∂x2 + ∂f3/∂x3 ) dx1 dx2 dx3 .
The domain of integration of the first term
T1 = ∫∫∫_V (∂f1/∂x1) dx1 dx2 dx3
can be stated directly if the boundary of the volume is divided into a lower
and upper surface with respect to the x1 -direction (Fig. 5.24)
lower : x1 = B(x2 , x3 ) upper : x1 = D(x2 , x3 ) .
This statement uses the restriction to a convex volume. The triple integral
can then be written as
T1 = ∫∫_{B1} dx2 dx3 ∫_{B(x2 ,x3 )}^{D(x2 ,x3 )} (∂f1/∂x1) dx1 .
The remaining double integral over x2 and x3 has to be evaluated by projec-
tion of the volume V into the 2 - 3 plane. This domain (B1 ) does not have
to be specified further. The integration over the coordinate x1 is trivial. The
primitive is f1 so that the result
T1 = ∫∫_{B1} dx2 dx3 {f1 (D(x2 , x3 ), x2 , x3 ) − f1 (B(x2 , x3 ), x2 , x3 )}
= ∮_{S(V)} f1 dS1
is obtained. This is exactly the first term of the surface integral (for a Carte-
sian subdivision and double covering of the domain B1 ) on the right hand side
of the equation. A corresponding argument for the terms T2 and T3 would
complete the proof.
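The proof can be accompanied by a numerical verification of the theorem for a concrete case. The sketch below (the test field f = (x², yz, z) and the unit cube are our choices, not from the text) evaluates both sides; the surface integral is assembled face by face, mirroring the terms T1, T2, T3:

```python
def f(x, y, z):
    # an arbitrary smooth test field (our choice): f = (x^2, y*z, z)
    return (x * x, y * z, z)

def div_f(x, y, z):
    # divergence computed by hand: 2x + z + 1
    return 2 * x + z + 1

n = 40
h = 1.0 / n
mid = [(k + 0.5) * h for k in range(n)]

# volume integral of div f over the unit cube (midpoint rule)
vol = sum(div_f(x, y, z) for x in mid for y in mid for z in mid) * h**3

# surface integral of f . dS, face by face (terms T1, T2, T3)
surf = 0.0
for u in mid:
    for v in mid:
        surf += (f(1, u, v)[0] - f(0, u, v)[0]) * h * h   # faces x = 1, x = 0
        surf += (f(u, 1, v)[1] - f(u, 0, v)[1]) * h * h   # faces y = 1, y = 0
        surf += (f(u, v, 1)[2] - f(u, v, 0)[2]) * h * h   # faces z = 1, z = 0

print(vol, surf)   # both ~ 2.5
```

For this field both integrals equal 5/2 exactly, and the midpoint rule is exact here because the integrands are at most linear in each direction.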
A volume of more general shape can be decomposed into convex parts, for
instance V1 and V2 with a common interface (Fig. 5.25a), for each of which
∫_{V1} div f dV = ∮_{S(V1)} f · dS
∫_{V2} div f dV = ∮_{S(V2)} f · dS .
Addition of these relations yields the volume integral over the complete vol-
ume on the left hand side. The contributions of the two sides of the interface
cancel on the right hand side as they are oriented in opposite directions in
each point (Fig. 5.25b). There remains the surface integral over the boundary
of the complete volume. The theorem can, for instance, be used to express a
volume by a surface integral, V = ∮_{S(V)} f · dS with the choice f = (x1 , 0, 0),
as div f = 1 .
The divergence of the central field f ce = c r/r³ is also needed for a
discussion of the theorem
div f ce = c [ ∂/∂x1 (x1/r³) + ∂/∂x2 (x2/r³) + ∂/∂x3 (x3/r³) ] .
It can be evaluated with the relation
∂/∂xi (xi/r³) = 1/r³ − 3 xi²/r⁵    for r ≠ 0  (i = 1, 2, 3)
which is valid for all points of space with the exception of the origin. The
divergence of the central field for these points turns out to be
div f ce = c [ 3/r³ − 3(x1² + x2² + x3²)/r⁵ ] = 0 .
It vanishes for all points except the origin. No statement can be made for
this point for the time being but the fact that the surface integral does not
vanish, calls for further discussion.
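Both the differentiation formula and the vanishing of the divergence away from the origin can be confirmed with central differences; a small sketch (the test point is an arbitrary choice):

```python
import math

def comp(i, p):
    # i-th component of the central field f_ce = c r / r^3, with c = 1
    r = math.sqrt(sum(q * q for q in p))
    return p[i] / r**3

def partial(i, p, h=1e-5):
    # central-difference approximation of d comp_i / d x_i
    pp, pm = list(p), list(p)
    pp[i] += h
    pm[i] -= h
    return (comp(i, pp) - comp(i, pm)) / (2 * h)

p = [1.0, 2.0, -0.5]                  # arbitrary point away from the origin
r = math.sqrt(sum(q * q for q in p))

# the identity d/dx_i (x_i / r^3) = 1/r^3 - 3 x_i^2 / r^5 ...
for i in range(3):
    assert abs(partial(i, p) - (1 / r**3 - 3 * p[i]**2 / r**5)) < 1e-8

# ... and therefore div f_ce = 0 away from the origin
print(sum(partial(i, p) for i in range(3)))   # ~ 0
```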
First of all it can be stated that this result allows the conclusion that an
integral over a closed surface S(nor), which does not contain the origin (with
the point source), satisfies the relation
∮_{S(nor)} f ce · dS = ∫_{V(S)} div f ce dV = 0 .
Surface integrals, which do not contain the source, vanish because the diver-
gence of the field vanishes in the relevant regions of space. This result can be
used to demonstrate that any surface integral, which contains the source, has
the same value. An arbitrary volume about the origin can be decomposed into
a spherical volume about the origin and a number of partial volumes which do
not contain the origin (Fig. 5.26). The contributions of the interfaces cancel
due to the orientation. Therefore follows
∮_{S(or)} f ce · dS = ∮_{Sp(or)} f ce · dS + Σ_n ∮_{S(nor)n} f ce · dS = 4πc ,
even though div f ce vanishes in all points except the origin. This implies that
div f ce can not vanish for r = 0
lim_{r→0} (div f ce) ≠ 0 .
The volume integral of this singular quantity over any region containing the
origin has the value 1 (Fig. 5.27). Such objects can obviously only be defined
rigorously by an extension of the concept of functions.
4
Distributions are discussed in Vol. 2, Math.Chap. 1.
The relation
div f ce = 4πc δ(r)    (div f ce = 0 for r ≠ 0 , div f ce ≠ 0 for r = 0)
with δ(r) = δ(x1 )δ(x2 )δ(x3 ) is valid, independent of this sideline, for a point
source (a central field). The point with the source (or sink) – that is a point
mass or a point charge – is characterised by div f ce ≠ 0, the remaining
(empty) space by div f ce = 0 . The divergence of a vector field describes in
this sense the distribution of the sources of the field in differential form. The
integral form can be obtained with the Gauss theorem in terms of surface
integrals over closed surfaces. All surface integrals, which enclose a point like
source, have the value 4π c, that is solid angle times 'strength of the source'.
This statement can be expressed in a different form: Enclose a point of the
space carrying a (vector) field in an infinitesimal volume ΔV . The divergence
theorem states
∫_{ΔV} div f ce dV = ∮_{S(ΔV)} f ce · dS .
The divergence of a vector field has the dimension flux of the field per volume.
This flux density can be termed a source density as it is a net flux. A positive
source density corresponds to a true source, negative source density to a sink.
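The statement 'flux per volume' can be made concrete numerically: the net flux through a small cube around a point, divided by the cube's volume, approximates the divergence at that point. A sketch (the field and the test point are arbitrary choices, not from the text):

```python
def f(x, y, z):
    # arbitrary smooth field (our choice); div f = y + 2*y + x = x + 3*y
    return (x * y, y * y, z * x)

def flux_density(p, h):
    # net flux through a cube of side 2h centred at p, divided by its volume;
    # each face integral is approximated by the value at the face centre
    x, y, z = p
    face = (2 * h) ** 2
    net = ((f(x + h, y, z)[0] - f(x - h, y, z)[0]) +
           (f(x, y + h, z)[1] - f(x, y - h, z)[1]) +
           (f(x, y, z + h)[2] - f(x, y, z - h)[2])) * face
    return net / (2 * h) ** 3

p = (0.7, -0.4, 1.2)
print(flux_density(p, 1e-4), p[0] + 3 * p[1])   # both ~ -0.5
```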
Similar statements can be made for extended sources, for instance for
the gravitational field of a homogeneous (from a macroscopic point of view),
spherical mass distribution
ρ(r) = { ρ0   for r ≤ R ;   0   for r > R } .
The mass is (see Math.Chap. 4.3)
M = ∫_{Ku} ρ(r) dV = (4π/3) ρ0 R³ .
In order to calculate the gravitational field the volume is divided into
infinitesimal elements dV′ at the positions r′ (Fig. 5.28a). The contribution
of such a volume element to the field at the position r is
dG(r) = −γ (ρ0 dV′ / |r − r′|³) (r − r′) .
The contributions of all mass elements have to be added in the sense of a
limiting value
G(r) = −γρ0 ∫ (dV′ / |r − r′|³) (r − r′) .
Actually three (!) volume integrals have to be evaluated. The following consid-
eration reduces the labour of the explicit evaluation: there exists a diametrical
element for each element at the position r′ which gives a contribution of the
same magnitude.
Fig. 5.28. Calculation of the gravitational field of a sphere with a homogeneous
mass distribution
The vector sum of the contributions of the two volume elements is a vector
in the radial direction (Fig. 5.28b). The gravitational field
of a sphere with a homogeneous mass distribution is a radial field. Explicit
integration yields
G(r) = { −γ (M/r²) er     for r ≥ R
         −γ (M/R³) r er   for r ≤ R } .
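The piecewise result can be checked for consistency, for instance that the two branches join continuously at r = R and that the exterior branch coincides with a point-mass field; a sketch with arbitrary test values:

```python
def G_radial(r, M=2.0, R=1.5, gamma=1.0):
    # radial component of the field of a homogeneous sphere
    # (M, R, gamma are arbitrary test values, not from the text)
    if r >= R:
        return -gamma * M / r**2          # exterior: point-mass form
    return -gamma * M * r / R**3          # interior: linear in r

# continuous at the surface r = R
print(G_radial(1.5 - 1e-9), G_radial(1.5 + 1e-9))
# outside, indistinguishable from a point mass M at the centre
print(G_radial(3.0), -2.0 / 3.0**2)
```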
******************************************************************
The following paragraph contains the details for the evaluation of the vol-
ume integral. It can be skipped. It seems to be useful, on the other hand, to
present the steps of the calculation as they may serve as a model for similar
calculations.
The symmetry allows the field point to be placed on the z -axis for the
evaluation of the integral
G(r) = −γρ0 ∫ (dV′ / |r − r′|³) (r − r′) .
This choice facilitates the calculation, the general result can be regained at
the end. The coordinates of the field point r and the variable of integration
r′ are then
r = (0, 0, z)
r′ = (r′ cos ϕ′ sin θ′ , r′ sin ϕ′ sin θ′ , r′ cos θ′) ,
the magnitude of the difference vector
|r − r′| = (r′² + z² − 2r′z cos θ′)^{1/2} .
The relation
∫_0^{2π} cos ϕ′ dϕ′ = ∫_0^{2π} sin ϕ′ dϕ′ = 0
implies
G(r) = −2πγρ0 I ez
with
I = ∫_0^R r′² dr′ ∫_{−1}^{1} (z − r′x) dx / (z² + r′² − 2r′zx)^{3/2}
using the substitution cos θ′ = x . The inner integration can be performed
with the steps
I(r′, z) = ∫_{−1}^{1} (z − r′x) dx / (z² + r′² − 2r′zx)^{3/2}
= 1/(2r′z²) { (z² − r′²) [ (z² + r′² − 2r′z)^{−1/2} − (z² + r′² + 2r′z)^{−1/2} ]
− [ (z² + r′² − 2r′z)^{1/2} − (z² + r′² + 2r′z)^{1/2} ] } .
With (z² + r′² − 2r′z)^{1/2} = |z − r′| this expression reduces to 2/z² for
z > r′ and to 0 for z < r′ . For a field point outside the sphere (z > R) the
r′ integration therefore gives I = 2R³/(3z²) and G = −γ(M/z²) ez ; for a field
point inside only the shells with r′ < z contribute, which leads to the result
quoted above for r ≤ R .
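The value of the inner integral can be checked numerically: for a field point outside the shell (z > r′) it equals 2/z², for a field point inside (z < r′) it vanishes, which is the classical shell result. A midpoint-rule sketch (the test values are arbitrary):

```python
def inner(rp, z, n=100000):
    # midpoint rule for the x-integration at fixed r'
    h = 2.0 / n
    s = 0.0
    for k in range(n):
        x = -1.0 + (k + 0.5) * h
        s += (z - rp * x) / (z * z + rp * rp - 2 * rp * z * x) ** 1.5
    return s * h

print(inner(0.5, 2.0), 2 / 2.0 ** 2)   # field point outside: 2/z^2
print(inner(1.5, 1.0))                 # field point inside: ~ 0
```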
Fig. 5.29. Radial variation of the gravitational field of a sphere with a homogeneous
mass distribution
The field outside the sphere can not be distinguished from the field of a point
mass of the same magnitude in the centre. The divergence of the gravitational
field of the homogeneous sphere is
div G = Σ_i ∂Gi/∂xi = { 0   for r > R ;   −3γM/R³   for r ≤ R } ,
or, after insertion of the mass for r ≤ R ,
div G = Σ_i ∂Gi/∂xi = { 0   for r > R ;   −4πγρ0   for r ≤ R } .
The form is in this case also ’solid angle times density of the source’. The di-
vergence is not equal to zero (div G = 0) for all points which carry mass (the
sources of the field), it vanishes (div G = 0) for all other points. The diver-
gence of the field describes the distribution of its sources and their strength
(−4πγρ0 ). The step of div G (Fig. 5.30) at the surface of the sphere is a
consequence of the sharp edge of the distribution.
Fig. 5.30. Illustration of the divergence of the gravitational field of the homoge-
neous sphere
The theorem of Stokes
∫_S rot f · dS = ∮_{K(S)} f · dr
states that the surface integral over the rotation of a vector field is equal to
the line integral over the oriented boundary K(S) of the surface S (Fig. 5.31a).
The orientation of the boundary and the orientation of the infinitesimal
surface elements dS are related by the right hand rule (or the right hand
grip rule) (Fig. 5.31b).
The proof of Stokes’ theorem uses the same pattern as the proof of the
divergence theorem. It is more involved though as the two integrals contain
scalar products of vectors in the present case.
The first step is the decomposition of the oriented surface S into subdo-
mains (Fig. 5.32). Each subdomain is characterised by
Ii = ∮_{Ri} f · dr .
The sum over all subdomains gives
Σ_i Ii = ∮_{K(S)} f · dr ,
as the contributions of the dividing lines cancel. The same relation holds for
an arbitrarily fine subdivision
lim_{n→∞} Σ_{i=1}^{n} Ii = ∮_{K(S)} f · dr .
Summation of the contributions of the sides 1 and 3 of all subdomains gives
Σ_i Ii (1, 3) = ∫_{B2} (rot f)2 dS2 .
The relation
∮_K f · dr = ∫_{S1} rot f · dS = ∫_{S2} rot f · dS
is valid for every surface Si , which is fully embedded in G and which has the
same oriented boundary curve K (Fig. 5.34).
This form of the theorem illustrates again the connection between the
statement rot f = 0 and the path independence of the line integral which
has been discussed in Math.Chap. 5.3.1.
• The line integral over a closed curve vanishes
∮_K f · dr = 0
if the relation3 rot f = 0 holds in the domain containing the curve.
• The relation ∮_K f · dr = 0 for a closed curve in G implies rot f = 0 in the
domain as all surface integrals for surfaces with the same boundary vanish
according to the theorem.
An idea for the interpretation of the concept rot f can be gleaned from
the comparison of two examples: The pattern of the field lines for the vector
field with cylindrical symmetry
f = (x g(ρ), y g(ρ), 0)    ρ = (x² + y²)^{1/2}
is radial (Fig. 5.35a). The second example is the field
f = (−y g(ρ), x g(ρ), 0) ,
whose field lines circulate about the z -axis (Fig. 5.35b).
The line integral for a circle about the z -axis (traversed counter clockwise)
is
∮_circle f · dr = R² g(R) ∫_0^{2π} (sin² t + cos² t) dt = 2πR² g(R) .
A line integral with this vector field is in addition path dependent. The
integral for a curve along the sides of a square with the side length 2R about
the z-axis (Fig. 5.36) can be evaluated with the parametric representation
Fig. 5.36. A contour for the line integral of the second vector field with cylindrical
symmetry
x = R      y = t
x = −t     y = R
x = −R     y = −t       (−R ≤ t ≤ R) .
x = t      y = −R
The result
∮_square f · dr = 4R ∫_{−R}^{R} g((R² + t²)^{1/2}) dt
can not be evaluated without further specification of g(ρ) . The value for the
magnetic field with g = 1/ρ² is
∮_square f · dr = 8 arctan 1 = 2π = ∮_circle f · dr .
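The equality of the two line integrals for g = 1/ρ² can be reproduced numerically; a sketch (R = 1 and the grid sizes are arbitrary; the square integral exploits the equality of the four side contributions):

```python
import math

R = 1.0

def f(x, y):
    rho2 = x * x + y * y
    return (-y / rho2, x / rho2)     # f = (-y g, x g, 0) with g = 1/rho^2

def circle_integral(n=20000):
    # midpoint rule around the circle of radius R
    h = 2 * math.pi / n
    s = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        fx, fy = f(R * math.cos(t), R * math.sin(t))
        # dr = (-R sin t, R cos t) dt
        s += (fx * (-R * math.sin(t)) + fy * R * math.cos(t)) * h
    return s

def square_integral(n=20000):
    # side x = R, y from -R to R; the other three sides contribute equally
    h = 2 * R / n
    s = 0.0
    for k in range(n):
        y = -R + (k + 0.5) * h
        s += f(R, y)[1] * h          # only dy varies on this side
    return 4 * s

print(circle_integral(), square_integral(), 2 * math.pi)
```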
The line integral over a closed curve of a vector field, as considered
above, is termed the circulation of the vector field. The rotation is then the
circulation per unit area or specific circulation. The limiting value defines a
quantity which can be called the vortex density.
The connection between a surface integral of (rot f) with a line integral
of f can also be put to use in practical applications. It is, however, not quite
as useful as the Gauss theorem due to the more complicated structure of
the surface integrals. One application of interest is the explanation of the
connection between different methods for the calculation of the contents of
planar surfaces.
• A method, which can be extracted from the discussion of the law of areas
in mechanics (Chap. 2.3.3), determines the contents of a planar area by
tracing the boundary of the area with the aid of a parametric representation
(Fig. 5.38a)
F1 = (1/2) ∮_K (x(t)ẏ(t) − y(t)ẋ(t)) dt .
• The contents of planar surfaces can also be calculated by domain integra-
tion with functions of two variables. A subdivision of the area e.g. by means
of rectangles calls for the evaluation of (Fig. 5.38b)
F2 = ∫∫_B dxdy .
A connection between the two methods can be established with the following
argument: rewrite the integrand of the first method as
F1 = ∮_K f · dr
with
f = ( −(1/2) y, (1/2) x, 0 )    and    dx = ẋ(t)dt,  dy = ẏ(t)dt .
Calculate rot f for this field function
rot f = ez
and use the theorem of Stokes to obtain
∮_K f · dr = ∫_{S(K)} rot f · dS = ∫_{S(K)} 1 · dSz = ∫∫_{S(K)} dxdy .
The theorem provides an elegant connection of the two methods for the cal-
culation of the contents of planar surfaces.
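The agreement of the two methods can be tried out on a simple polygon, for which the boundary formula reduces to the exact shoelace expression; a sketch (the triangle and the grid resolution are arbitrary choices):

```python
def area_boundary(vertices):
    # F1 = (1/2) * closed line integral of (x dy - y dx); for a polygon this
    # reduces to the shoelace formula and is exact
    s = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        s += x0 * y1 - y0 * x1
    return s / 2

def area_domain(inside, xlo, xhi, ylo, yhi, n=400):
    # F2 = double integral of dx dy, approximated by counting grid cells
    hx, hy = (xhi - xlo) / n, (yhi - ylo) / n
    return sum(hx * hy for i in range(n) for j in range(n)
               if inside(xlo + (i + 0.5) * hx, ylo + (j + 0.5) * hy))

tri = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]            # area 6
in_tri = lambda x, y: x >= 0 and y >= 0 and x / 4 + y / 3 <= 1

print(area_boundary(tri), area_domain(in_tri, 0, 4, 0, 3))
```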
The discussion of the basic concepts of vector analysis (in R3 ) can be
summarised in the following fashion: An idea of the structure of a vector
field f (x, y, z) can be obtained by examination of the quantities div f (a
scalar quantity) and rot f (a vector quantity). The divergence describes the
distribution of sources and sinks, the rotation the occurrence of vortices.
Conversely, these quantities can be used for an overall classification of vector
fields:
(1) A field f with div f = 0 is called source free or solenoidal. An example
is the magnetic field.
(2) A field f with rot f = 0 is called vortex free. Examples are the fields of
electrostatics or the conservative force fields of mechanics.
(3) A field f with rot f ≠ 0 is called a vortex field.
Finally, two possibilities to extend the discussion are mentioned without
giving any details:
• The integral theorems of Green constitute a variation of the two inte-
gral theorems discussed here. They are considered in Vol. 2 in connection
with the theory of electrostatic fields.
• The statements of the last sections have dealt exclusively with the situation
f (x1 , x2 , x3 ) = (f1 (x1 , x2 , x3 ), f2 (x1 , x2 , x3 ), f3 (x1 , x2 , x3 )) ,
that is a vector field with m = n = 3 . They can be extended to the case
of arbitrary dimensions (with m = n)
f (x1 . . . xn ) = (f1 (x1 . . . xn ), f2 (x1 . . . xn ) . . . fn (x1 . . . xn )) .
The gradient operator is
∇ = Σ_{i=1}^{n} ei ∂/∂xi .
The concept ’divergence’ and the divergence theorem can be generalised
in a simple manner. The generalisation of the concept ’rotation’ and the
theorem of circulation is more complicated. Such extensions are e.g. of
interest in the discussion of the theory of relativity. The dimension of the
appropriate space is then n = 4; it is, however, not Euclidean.
6 Differential equations II
The function f (t) and the ’kernel’ of the integral K(t, t̃) are specified.
The task is the determination of the function x(t) which occurs under the
integral sign. There exists a relation between differential equations and
integral equations which is similar to the relation between differentiation
and integration. This relation is relevant for both formal as well as practical
aspects.
210 6 Differential equations II
f (t, x) dx + g(t, x) dt = 0 .
The replacement of the differential quotient by differentials can be justified
rigorously (see Math.Chap. 2.2.1). But not even all differential equations of
first order and first degree can be solved analytically. This is only possible if
the functions f and g possess a special form. The simplest case can be treated
with the method of separation of variables, which will be developed further
below.
The differential equation (t + x) dt + dx = 0, for example, is not
separable. It can, however, be brought into a separable form with the
transformation
v = t + x    dv = dt + dx .
The transformed differential equation
v dt + dv − dt = 0 or (v − 1) dt + dv = 0
has the implicit solution t + ln(v − 1) = c which can be rewritten with the
steps
et (v − 1) = c and et (t + x − 1) = c
in the explicit form x = c e−t − t + 1 . There exist no definite rules for finding
a suitable transformation. Everything rests on the so called ’mathematical
intuition’ which is not easily defined.
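Assuming the equation treated here was (t + x) dt + dx = 0 (an assumption consistent with the substitution v = t + x and with the solution obtained), the explicit solution can be verified numerically:

```python
import math

def x(t, c=2.0):
    # claimed general solution x = c e^{-t} - t + 1 (c is an arbitrary choice)
    return c * math.exp(-t) - t + 1

def dxdt(t, c=2.0, h=1e-6):
    # central-difference derivative
    return (x(t + h, c) - x(t - h, c)) / (2 * h)

# (t + x) dt + dx = 0  is equivalent to  dx/dt = -(t + x)
for t in (-1.0, 0.0, 0.5, 2.0):
    assert abs(dxdt(t) + (t + x(t))) < 1e-6
print("solution verified")
```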
The simplest possible path can be chosen, as the line integral is (under the
conditions stated) path-independent. For a decomposition with a lower
path parallel to the axes according to Fig. 6.1 follows
∫_{t0}^{t} g(t̃, x0) dt̃ + ∫_{x0}^{x} f(t, x̃) dx̃ = c .
Fig. 6.1. Line integration
The first section runs parallel to the t -axis with x fixed at x0 , the second
parallel to the x -axis with t fixed. The upper path with
∫_{t0}^{t} g(t̃, x) dt̃ + ∫_{x0}^{x} f(t0, x̃) dx̃ = c ,
that is a t -integration for fixed x and an x -integration for fixed t0 , could have
been chosen as well. The starting position can be chosen freely. A change of
the starting position corresponds to a renaming of the constant of integration.
The actual process of solution is in general much simpler than the de-
scription above. This will be demonstrated in terms of two examples. The
differential equation of the first example is
(3t2 x2 + t2 )dt + (2t3 x + x2 )dx = 0 .
The first step is a check whether the differential equation is indeed exact
with the aid of the condition of integrability. This condition is satisfied as one
finds gx = ft = 6t2 x . The second step is the execution of the line integration.
Several options will be presented for the present example as an exercise.
The path (Fig. 6.2) begins at the origin (often a good choice of the starting
point), runs along the t -axis to the point (t, 0) and then parallel to the x -axis.
The corresponding integral
∫_0^t t̃² dt̃ + ∫_0^x (2t³x̃ + x̃²) dx̃ = c
leads to the implicit solution
(1/3) t³ + t³x² + (1/3) x³ = c .
3 3
Fig. 6.2. Variation of the path of integration: (0, 0) → (t, 0) → (t, x)
The second path (Fig. 6.3a) runs along the x -axis to the point (0, x) and
then parallel to the t -axis to the point (t, x). This gives
∫_0^t (3t̃²x² + t̃²) dt̃ + ∫_0^x x̃² dx̃ = c
with the same solution as before.
A third variant uses a similar path as in the first case, but the starting
position is the point (1, 1) . The path shown in Fig. 6.3b connects the points
6.2 Differential equation of first order 215
(1, 1), (t, 1) and (t, x) with straight lines parallel to the axes. The integral
is here
∫_1^t 4t̃² dt̃ + ∫_1^x (2t³x̃ + x̃²) dx̃ = c1 .
The result
(4/3) t³ − (4/3) + t³x² + (1/3) x³ − t³ − (1/3) = c1
agrees with the previous ones after renaming c1 + 5/3 = c .
Fig. 6.3. Variation of the path of integration: (a) (0, 0) → (0, x) → (t, x);
(b) (1, 1) → (t, 1) → (t, x)
Not every differential equation of the form f dx + g dt = 0
is exact. It can, however, be stated that every differential equation of this class
can be converted into an exact differential equation (at least in principle).
The keyword is ’integrating factor’.
This topic is best introduced with a simple example. The differential equation
t dx − x dt = 0
is not exact, as ft = 1 and gx = −1 so that ft = gx . The solution is
nonetheless simple. Write
dx/x − dt/t = 0
and obtain ln x − ln t = c1 , which can also be written as (after renaming
the constant of integration) x = ct . The total differential of the solution in
implicit form
d(x/t) = (1/t) dx − (x/t²) dt = (1/t²) (t dx − x dt) = 0
gives a hint, how the differential equation could have been solved in a different
fashion. The differential equation, which results from multiplication of the
original form with 1/t2
(1/t) dx − (x/t²) dt = 0
is exact (ft = gx = −1/t2 ) so that it could be solved by line integration.
The function u(t) = 1/t2 is called an integrating factor of the differen-
tial equation at hand. There exists, however, not only one but an arbitrary
number of such factors. Write e.g. the solution in the form t/x = c2 and find
that u(x) = −1/x2 is an integrating factor as
d(t/x) = −(t/x²) dx + (1/x) dt = −(1/x²) (t dx − x dt) = 0 .
A third possibility can be obtained with the modified implicit solution
arctan(x/t) = c3    with    d(arctan(x/t)) = (1/(x² + t²)) (t dx − x dt) = 0 .
The function u(t, x) = 1/(x2 +t2 ) is also an integrating factor. A large number
of additional options are possible.
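That each of these functions renders t dx − x dt = 0 exact can be confirmed by testing the integrability condition f_t = g_x numerically after multiplication; a sketch (the test point is an arbitrary choice):

```python
def check_exact(u, t, x, h=1e-6):
    # after multiplication: f = u*t (coefficient of dx), g = -u*x (of dt)
    f = lambda t_, x_: u(t_, x_) * t_
    g = lambda t_, x_: -u(t_, x_) * x_
    f_t = (f(t + h, x) - f(t - h, x)) / (2 * h)   # partial derivative wrt t
    g_x = (g(t, x + h) - g(t, x - h)) / (2 * h)   # partial derivative wrt x
    return abs(f_t - g_x)

factors = [lambda t, x: 1 / t**2,
           lambda t, x: -1 / x**2,
           lambda t, x: 1 / (x**2 + t**2)]

for u in factors:
    assert check_exact(u, 1.3, 0.7) < 1e-5
print("all three integrating factors give an exact equation")
```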
The general argument proceeds as follows: Try to choose an integrating
factor u(t, x) of the differential equation
u(t, x) f (t, x) dx + u(t, x) g(t, x) dt = 0
in such a way that the relation
The solutions for a few simple cases are collected in the following list
(homogeneous solution xh , ansatz for a particular solution, result xp ):
x′ − 3x = 15 :    xh = c e^{3t} ,                          xp = −5
x′ − x/t = t³ :   xh = ct ,     ansatz at⁴ ,               xp = t⁴/3
x′ + x = cos t :  xh = c e^{−t} , ansatz a cos t + b sin t , xp = (cos t + sin t)/2 .
This naive approach might not be possible. A general method to find the
special solution of the inhomogeneous differential equation is the method of
the variation of the constant. This method is based on the solution of the
homogeneous differential equation, which can be written as xh (t, c) = c g(t)
with
g(t) = exp ( − ∫^t a(t̃) dt̃ ) .
The ansatz
xp (t) = c(t) g(t)
for the desired special solution accounts for the term ’variation of the con-
stant’. Insertion of the ansatz into the differential equation gives
c′(t) g(t) + c(t)[g′(t) + a(t) g(t)] = b(t) .
The term in the square brackets vanishes, as g is a solution of the homogeneous
differential equation. The remaining differential equation c′(t) = b(t)/g(t) for
the function c(t) has the special solution
c(t) = ∫^t (b(t̃)/g(t̃)) dt̃ .
The same results for the solutions xp (t) of the simple examples above are
obtained by application of this method.
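The method can be retraced for the equation x′ + x = cos t from the list above (a(t) = 1, g(t) = e^{−t}); evaluating c(t) by numerical quadrature reproduces a particular solution. The lower limit 0 of the quadrature is an arbitrary choice which only adds a homogeneous piece −e^{−t}/2:

```python
import math

def g(t):
    return math.exp(-t)            # solution of the homogeneous equation

def c(t, n=2000):
    # c(t) = integral of b/g from 0 to t with b(s) = cos s (midpoint rule)
    h = t / n
    return sum(math.cos((k + 0.5) * h) / g((k + 0.5) * h) * h for k in range(n))

def xp(t):
    return c(t) * g(t)             # ansatz x_p = c(t) g(t)

# closed form: (cos t + sin t)/2 minus the homogeneous piece e^{-t}/2
for t in (0.5, 1.0, 2.0):
    exact = (math.cos(t) + math.sin(t)) / 2 - math.exp(-t) / 2
    assert abs(xp(t) - exact) < 1e-4
print("variation of the constant reproduces x_p")
```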
The differential equations of first order and first degree, which have been
discussed here, are part of every day tools of theoretical physics. The list of
differential equations of this class, which can be solved analytically, can be
enlarged to some extent5 . The next section contains, however, some remarks
on the related class of differential equations of first order and higher degree.
6.3 Differential equations of second order

The importance of this class of differential equations can not be realised fully
by looking at the field of classical mechanics. Differential equations of this
kind are one of the main tools of electrodynamics and quantum mechanics.
The starting equations are actually partial differential equations which are,
as a rule, reduced to a set of ordinary differential equations (of second order)
by separation of variables. The basic items of electrodynamics, the electro-
magnetic fields, and of quantum mechanics, the wave functions, follow the
principle of superposition. This accounts for the fact that the partial and
finally the ordinary differential equations have to be linear.
General statements concerning the solution of the differential equation
a0 (t)x′′(t) + a1 (t)x′(t) + a2 (t)x(t) = b(t)
have been discussed in Math.Chap. 2.2.2. Here is a summary:
• The general solution of the inhomogeneous differential equation is the sum
of the general solution of the homogeneous differential equation and a par-
ticular solution of the inhomogeneous differential equation
xi (t, c1 , c2 ) = xh (t, c1 , c2 ) + xp (t) .
• Two particular solutions {x1 (t), x2 (t)} of the homogeneous differential
equation are linearly independent if the Wronski determinant does not vanish
W (x1 (t), x2 (t)) = x1 (t) x′2 (t) − x′1 (t) x2 (t) ≠ 0 .
They represent a fundamental system of solutions in this case.
• The linear combination xh (t) = c1 x1 (t) + c2 x2 (t) of the fundamental solu-
tions is the general solution of the homogeneous differential equation.
The solution of the linear differential equation with constant coefficients has
also been outlined in Math.Chap. 2.2.2. The class of linear differential equa-
tions, for which the coefficients are (more or less simple) polynomials in t, will
be discussed here. This discussion includes the determination of the general
solution of the homogeneous differential equation and the preparation of methods
for the calculation of particular solutions of the inhomogeneous differential
equations.
The determination of the particular solution represents an extension
of the corresponding problem for differential equations of first order (see
Math.Chap. 6.2.3). The ansatz for the case of an inhomogeneous differen-
tial equation of second order is
xp (t) = c1 (t)x1 (t) + c2 (t)x2 (t) .
The functions x1 and x2 form a fundamental system and there are two ’con-
stants’ which can be varied. As only one particular solution is required there
224 6 Differential equations II
exists a certain leeway which can be used to advantage. The first step is the
calculation of the derivative of the ansatz
x′p = c′1 x1 + c′2 x2 + c1 x′1 + c2 x′2
followed by the demand (this is the leeway) that the coefficient functions ci (t)
satisfy the equation
c′1 (t)x1 (t) + c′2 (t)x2 (t) = 0 .
The ansatz for xp , the first derivative x′p = c1 x′1 + c2 x′2 and the second
derivative are inserted into the inhomogeneous differential equation in the
second step. The result (properly sorted) is
c1 (a0 x′′1 + a1 x′1 + a2 x1 ) + c2 (a0 x′′2 + a1 x′2 + a2 x2 ) + a0 (c′1 x′1 + c′2 x′2 ) = b .
The expressions in the first two brackets vanish as x1 and x2 are solutions of
the homogeneous differential equation. The two equations
x1 (t)c′1 (t) + x2 (t)c′2 (t) = 0
x′1 (t)c′1 (t) + x′2 (t)c′2 (t) = b(t)/a0 (t)    (a0 ≠ 0)
represent a system of linear equations for the derivatives c′1 (t) and c′2 (t) . The
solution of this system of equations can e.g. be expressed with Cramer’s rule
c′1 (t) = − b(t)x2 (t) / (a0 (t)W (x1 (t), x2 (t)))      c′2 (t) = b(t)x1 (t) / (a0 (t)W (x1 (t), x2 (t))) .
A non-trivial solution is assured as the Wronski determinant W (t) of the
fundamental solutions (and a0 (t)) are not equal to zero by assumption. The
functions themselves can be obtained by integration
ci (t) = ∫^t c′i (t̃) dt̃    (i = 1, 2) .
The example below illustrates the method in some detail. The general
solution of the homogeneous part of the differential equation
x′′ + x = (cos t)^{−1}
is xh (t) = c1 cos t + c2 sin t . The Wronski determinant is W (cos t, sin t) = 1 .
The system of equations for the derivatives of the constants to be varied is
cos t c′1 (t) + sin t c′2 (t) = 0
− sin t c′1 (t) + cos t c′2 (t) = (cos t)^{−1}
so that the result
c′1 = − tan t    and    c1 (t) = − ∫^t tan t̃ dt̃ = ln | cos t|
c′2 = 1          and    c2 (t) = ∫^t dt̃ = t
is obtained, so that a particular solution is xp (t) = cos t ln | cos t| + t sin t .
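Assembling xp = c1 (t) cos t + c2 (t) sin t with the functions c1 and c2 obtained above gives xp (t) = cos t ln |cos t| + t sin t. A quick numerical check that this solves x′′ + x = (cos t)^{−1}:

```python
import math

def xp(t):
    # particular solution assembled from c1, c2 (valid for |cos t| > 0)
    return math.cos(t) * math.log(math.cos(t)) + t * math.sin(t)

def xpp(t, h=1e-5):
    # second derivative by central differences
    return (xp(t + h) - 2 * xp(t) + xp(t - h)) / (h * h)

for t in (0.3, 0.9, 1.2):
    assert abs(xpp(t) + xp(t) - 1 / math.cos(t)) < 1e-4
print("x_p verified")
```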
The method of solution via a power series ansatz can be illustrated with
the differential equation
2t x′′ + (t + 1) x′ + 3x = 0 .
The ansatz for the solution is a power series in t multiplied by an arbitrary
power of t
x(t) = t^ρ Σ_{n=0}^{∞} bn t^n = b0 t^ρ + b1 t^{ρ+1} + b2 t^{ρ+2} + . . . .
The additional factor t^ρ serves to capture a possible dependence of the
solution on non-integer powers such as t^{−1} , t^{1/3} , etc. Insertion of the ansatz
into the differential equation (after calculation of x′ and x′′ ) and sorting by
powers of t yields the expansion
t^{ρ−1} [2ρ(ρ − 1)b0 + ρb0 ]
+ t^ρ [(ρ + 1)(2ρ + 1)b1 + (ρ + 3)b0 ]
+ ...
+ t^{ρ+k} [(ρ + k + 1)(2ρ + 2k + 1)bk+1 + (ρ + k + 3)bk ]
+ ... = 0.
A power series can only have the value zero for all values of the variable if
the coefficients of all powers vanish
Σ_{n=0}^{∞} dn t^n = 0    −→    dn = 0 for all n .
This condition leads to the following statements for the present case
• The factor of the power t^{ρ−1} is
2ρ(ρ − 1) + ρ = 0 .
This is the indicial or characteristic equation which determines the power
of the prefactor. The roots of the indicial equation in the present example
are ρ1 = 0 and ρ2 = 1/2 .
• The coefficient b0 can not be determined, as any solution of the homogeneous
differential equation can be multiplied by an arbitrary constant factor. The
factors of the remaining powers (t^{ρ+k} with k = 0, 1, . . .) yield the binomial
recursion relation
bk+1 = − (ρ + k + 3) / ((ρ + k + 1)(2ρ + 2k + 1)) · bk .
The coefficients b1 , b2 , . . . can now be calculated after the choice of one of
the roots of the indicial equation and the value of the coefficient b0 (for
simplicity mostly 1). It can be expected that the different roots correspond
to linearly independent solutions. This has to be checked, however.
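The recursion can be run for both roots of the indicial equation, and the truncated series can be tested against the differential equation; a sketch (truncation order, test point and b0 = 1 are arbitrary choices):

```python
def coeffs(rho, nmax=30):
    # run the recursion b_{k+1} = -(rho+k+3)/((rho+k+1)(2rho+2k+1)) b_k
    b = [1.0]                                     # b0 = 1
    for k in range(nmax):
        b.append(-(rho + k + 3) / ((rho + k + 1) * (2 * rho + 2 * k + 1)) * b[-1])
    return b

def x(t, rho, b):
    # truncated Frobenius series x = t^rho * sum b_n t^n
    return sum(bn * t ** (rho + n) for n, bn in enumerate(b))

def residual(rho, t=0.3, h=1e-5):
    # residual of 2t x'' + (t+1) x' + 3x at t, derivatives by differences
    b = coeffs(rho)
    x0 = x(t, rho, b)
    x1 = (x(t + h, rho, b) - x(t - h, rho, b)) / (2 * h)
    x2 = (x(t + h, rho, b) - 2 * x0 + x(t - h, rho, b)) / h**2
    return 2 * t * x2 + (t + 1) * x1 + 3 * x0

print(residual(0.0), residual(0.5))   # both ~ 0 for the roots rho = 0, 1/2
```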
The recursion relation of the present example is binomial. It can therefore
be handled easily. Recursion relations with polynomial structure do occur
• the values of the function x(t) at the positions tk−l , tk−l+1 , . . . , tk (explicit
method)
or
• the values of the function x(t) at the positions tk−l , tk−l+1 , . . . , tk , tk+1
(implicit method).
The simplest approximation, which can be used in this case, is the rectangle
rule (see Fig. 6.4a). It uses only the lower sampling point t0 of the interval
[t0 , t0 + h], so that
Ir = f (t0 )h .
Fig. 6.4. Evaluation of integrals: Simple approximation rules 1 (a: rectangle
rule, b: tangent trapezoidal rule)
Variants with a linear approximation of the integrand are the tangent
trapezoidal rule (Fig. 6.4b)
Itt = f (t0 + h/2)h ,
in which the central point of the interval is used as a sampling point, or the
direct trapezoidal rule (Fig. 6.5a) with
Itr = (1/2) (f (t0 ) + f (t0 + h)) h .
Fig. 6.5. Evaluation of integrals: Simple approximation rules 2 (a: direct
trapezoidal rule, b: Simpson rule)
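The four rules can be compared on a single interval, here for ∫_0^1 e^t dt (the integrand is an arbitrary test case; the 1:4:1 weights are the standard Simpson weights corresponding to Fig. 6.5b):

```python
import math

def rect(f, t0, h):           # rectangle rule
    return f(t0) * h

def tangent_trap(f, t0, h):   # tangent trapezoidal rule (midpoint)
    return f(t0 + h / 2) * h

def trap(f, t0, h):           # direct trapezoidal rule
    return (f(t0) + f(t0 + h)) / 2 * h

def simpson(f, t0, h):        # Simpson rule, weights 1:4:1
    return (f(t0) + 4 * f(t0 + h / 2) + f(t0 + h)) / 6 * h

exact = math.e - 1            # exact value of the integral
for rule in (rect, tangent_trap, trap, simpson):
    print(rule.__name__, abs(rule(math.exp, 0.0, 1.0) - exact))
```

The printed errors decrease from the rectangle rule to the Simpson rule; the tangent trapezoidal error is roughly half that of the direct trapezoidal rule, with opposite sign.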
6.4 Addendum: Numerical methods of solution 231
The expansions for (Δx)tt and (Δx)tr agree with the exact result up to
second order in h . Construction of the combination (first suggested by Runge:
C. Runge, Mathematische Annalen, 46 (1895), p. 167)
(Δx)comb = (2/3) (Δx)tt + (1/3) (Δx)tr ,
gives a result that agrees at least with the prefactor in third order. However
two of the terms are missing.
The agreement can be improved if the form
(Δx)ans = (1/2) [ k0 (t, x) + k2 (t, x) ]    (6.3)
with
k0 (t, x) = f (t, x) h
k1 (t, x) = f (t + h, x + k0 (t, x)) h
k2 (t, x) = f (t + h, x + k1 (t, x)) h
with
k0 (t, x) = f (t, x) h
...
kn (t, x) = f (t + αn h, x + Σ_{i=0}^{n−1} βni ki (t, x)) h
and
• determine the parameters of the ansatz (as far as possible) by expansion
in powers of h and comparison with the exact expansion
has been employed by W. Kutta (W. Kutta, Z. für Mathematik und Physik,
46 (1901), p. 435). It is the basis for a large number of classical and of
modern variants of the Runge-Kutta method (see list of literature e.g. in
M. Abramovitz and I. Stegun).
The ansatz suggested has the virtue that optimal intermediate sampling
points in the interval [t, t + h] can be obtained by fixing αn (0 ≤ αn ≤ 1)
and that a direction kn (t, x(t))/h is determined by the polygonal line in
the already calculated directions starting at the initial point. In addition
to the adaption to the exact expansion, the postulate is used, that each of
the intermediate points should lie on the tangent line to the integral curve
f (t, x(t)) through the initial point with an error of better than second order.
This postulate is implemented by
αn = Σ_{i=0}^{n−1} βni
234 6 Differential equations II
as well as
Σ_n an = 1 .
1. The first question is whether the numerical solution converges towards
the exact solution of the initial value problem with vanishing step size.
This happens only if the function f (t, x(t)) satisfies an additional condition
concerning continuity (the Lipschitz condition). Convergence can be
demonstrated for all the methods discussed above.
2. An additional question of interest is: of which order in h is the error
for the different methods, and do explicit estimates of the error exist?
• The question of roundoff errors has to be raised from a more practical
point of view. A too small choice of the step size favours the accumulation
of roundoff errors. A step size, which is too large, leads on the other hand
to an inaccurate representation of the integral. The optimal choice of the
step size is therefore not an easy matter. An estimate of the stability of
the solution by variation of the number of sampling points is usually the
measure used in practical calculations.
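The gain of the Runge-Kutta idea over the simple Euler step can be seen on a test problem; the sketch uses a second-order step which samples the slope at the interval centre (the test problem x′ = x and the step number are arbitrary choices):

```python
import math

def euler_step(f, t, x, h):
    # first-order step: slope at the left end of the interval
    return x + f(t, x) * h

def midpoint_step(f, t, x, h):
    # second-order step: slope taken at the centre of the interval
    return x + f(t + h / 2, x + f(t, x) * h / 2) * h

def solve(step, f, x0, t_end, n):
    t, x, h = 0.0, x0, t_end / n
    for _ in range(n):
        x = step(f, t, x, h)
        t += h
    return x

f = lambda t, x: x                     # test problem x' = x, x(0) = 1
err_euler = abs(solve(euler_step, f, 1.0, 1.0, 100) - math.e)
err_mid = abs(solve(midpoint_step, f, 1.0, 1.0, 100) - math.e)
print(err_euler, err_mid)              # the second-order error is far smaller
```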
7 Complex numbers and functions
7.1 Definitions
The solutions of quadratic equations are not necessarily real numbers. For
instance, application of the standard method for the solution of the equation
x2 − 10x + 40 = 0
yields
x = 5 ± √−15 .
It is necessary to extend the system of real numbers and the corresponding
rules of arithmetic in a suitable and consistent manner. The first step is the
introduction of the imaginary unit
i = √(−1)  with  i² = −1
so that the solution of the example can be written as
x = 5 ± i √15 ,
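As a quick numerical check of this example, Python's `cmath` module handles the square root of a negative discriminant directly (the variable names are, of course, only illustrative):

```python
import cmath

# Roots of x^2 - 10x + 40 = 0 via the standard formula
a, b, c = 1, -10, 40
root = cmath.sqrt(b * b - 4 * a * c)   # sqrt(-60), a purely imaginary number
x1 = (-b + root) / (2 * a)             # 5 + i*sqrt(15)
x2 = (-b - root) / (2 * a)             # 5 - i*sqrt(15)
```

Both roots satisfy the original equation to machine precision.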
that is a complex number with a real and an imaginary part. Complex num-
bers have to be represented in a plane, in contrast to real numbers, which can
be represented on a number line. The complex number
z = x + iy ,   x, y real ,
is represented in Fig. 7.1.
[Fig. 7.1. The complex number z = x + iy]
The basic statement concerning the four fundamental rules of arithmetic is:
it is possible to use complex numbers formally in the same manner as real
numbers. Addition and subtraction are executed by adding or subtracting
the real parts and the imaginary parts separately
z = z1 ± z2
  = (x1 + iy1) ± (x2 + iy2)
  = (x1 ± x2) + i(y1 ± y2) .
The graphical representation of addition is indicated in Fig. 7.2.
[Fig. 7.2. Addition of two complex numbers]
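The componentwise rule for addition and subtraction can be verified with Python's built-in complex type (the sample values are arbitrary):

```python
z1 = 3 + 4j
z2 = 1 - 2j
s = z1 + z2
d = z1 - z2
# real and imaginary parts are added (or subtracted) separately
check_s = complex(z1.real + z2.real, z1.imag + z2.imag)
check_d = complex(z1.real - z2.real, z1.imag - z2.imag)
```

Here `s` equals 4 + 2i, in agreement with the componentwise rule.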
[Fig. 7.3. Trigonometric decomposition of a complex number]
[Fig. 7.4. Distance of two complex numbers]
[Fig. 7.5. The product of two complex numbers]
[Figure residue: geometric construction of the quotient z1/z2 and of the inverse 1/z2]
The complex conjugate of the complex number z = x + iy is
z* = x − iy ,
in trigonometric form
z* = ρ(cos φ − i sin φ) .
[Fig. 7.8. Complex conjugation]
This corresponds to a reflection at the real axis (Fig. 7.8). The product
z z* = x² + y² = ρ² is clearly a real number.
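A short numerical illustration of conjugation and the real product z z* (the value of z is arbitrary):

```python
z = 3 + 4j
zc = z.conjugate()        # reflection at the real axis: 3 - 4j
p = z * zc                # (x + iy)(x - iy) = x^2 + y^2 = rho^2
```

The product `p` is 25, the squared modulus of z, with vanishing imaginary part.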
Repeated application of the rules for multiplication gives the de Moivre for-
mula
z^n = [ρ(cos φ + i sin φ)]^n
    = ρ^n (cos nφ + i sin nφ) .
Application of the binomial theorem for (cos φ + i sin φ)^n followed by a sep-
aration of real and imaginary parts results in a representation of cos nφ and
sin nφ in terms of powers of cos φ and sin φ, as for instance in the simplest
case
(cos φ + i sin φ)² = (cos² φ − sin² φ) + 2i sin φ cos φ
                  = cos 2φ + i sin 2φ .
The standard relations for the double angle can be read off here. Correspond-
ing formulae for the triple, quadruple, . . . angle can be generated relatively
easily.
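Both the de Moivre formula and the double-angle relations read off from it can be checked numerically; the values of ρ, φ and n below are arbitrary:

```python
import cmath
import math

rho, phi, n = 1.5, 0.7, 5
z = rho * (math.cos(phi) + 1j * math.sin(phi))
lhs = z ** n                                            # z^n directly
rhs = rho ** n * (math.cos(n * phi) + 1j * math.sin(n * phi))  # de Moivre

# double-angle relations read off from (cos(phi) + i sin(phi))^2
cos2 = math.cos(phi) ** 2 - math.sin(phi) ** 2
sin2 = 2 * math.sin(phi) * math.cos(phi)
```

Both sides of the de Moivre formula agree to machine precision, as do the double-angle identities.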
[Fig. 7.9. Domain of definition and codomain of a function of one complex variable]
[Figure residue: the mappings w = z + b and w = az]
[Fig. 7.12. w = f(z) = 1/z]
As one ought to return to the first sheet of the w plane after a full rev-
olution, one has to imagine that the two sheets of the w plane are cut
along the positive real axis and are then connected crosswise. Figure 7.14
illustrates the situation as viewed in the direction of the axis.
[Fig. 7.14. The two sheets (1 and 2) of the w plane, joined along the positive real axis]
The doubly covered w plane, which is connected along the real axis, is referred to
as the Riemann surface of the function w = z². Each point of this surface
is endowed with two values (on the lower and the upper sheet) except the
origin (w = z = 0), which only occurs once. This point is called a branch
point of the surface. The advantage of the Riemann construction for the
function w = z² is the unambiguous mapping of the simply covered z plane
onto the two sheets of the w plane.
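The two-valuedness discussed above can be seen numerically: z and −z are mapped onto the same point w, and the principal branch of the complex square root recovers exactly one of the two preimages (the sample value of z is arbitrary):

```python
import cmath

z = 1 + 2j
w = z ** 2                 # both z and -z land on this point of the w plane
w_minus = (-z) ** 2
# the principal square root picks one sheet, i.e. one of the two preimages
r = cmath.sqrt(w)
```

That `cmath.sqrt` returns only one of the two possible values corresponds to choosing one sheet of the Riemann surface.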
The complete analysis of functions of one real variable can be carried
over to the complex case, in spite of the slightly more involved form of the
representation. Topics that would have to be covered (see the Mathematical
Supplement of Vol. 2) are, e.g.,
• sequences of numbers with complex members and their convergence,
• infinite series including power series,
• limiting values of functions,
• continuity and differentiability of functions.
(Compare Math.Chap. 1 for a corresponding discussion of these points for
functions of one real variable).
e^(iy) = cos y + i sin y .
One extracts
|e^z| = e^(Re(z)) ,    arc(e^z) = Im(z) .
3. One finds with the power series for e−iy
e−iy = cos y − i sin y .
Solving the statements for e^(±iy) with respect to cos y and sin y gives
the relations
cos y = (1/2) (e^(iy) + e^(−iy))
sin y = (1/(2i)) (e^(iy) − e^(−iy)) .
These are the relations which are often used for the representation of oscil-
lations or of wave phenomena.
4. Other frequently used relations are
e^(2πi) = 1 ,   e^(πi) = e^(−πi) = −1 ,
e^(πi/2) = i ,   e^(−πi/2) = −i .
The multiplication formula given above therefore states in particular
e^(z+2πi) = e^z .
The complex exponential function is a periodic function with the period¹
2πi. This implies that this function maps a fundamental strip of the
z plane (the standard choice is −π < Im z ≤ π) onto the entire w plane
(Fig. 7.15).
[Fig. 7.15. Domain of definition (the fundamental strip) and corresponding codomain of the function e^z]
¹ Note: in the direction of the imaginary axis.
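The relations of this section — the exponential representation of cos and sin, the modulus rule |e^z| = e^(Re(z)), and the periodicity e^(z+2πi) = e^z — can be verified in a few lines (the values of y and z are arbitrary):

```python
import cmath
import math

y = 0.8
# cos and sin from the exponential representation
cos_y = (cmath.exp(1j * y) + cmath.exp(-1j * y)) / 2
sin_y = (cmath.exp(1j * y) - cmath.exp(-1j * y)) / (2j)

z = 0.3 + 1.2j
mod = abs(cmath.exp(z))                  # should equal e^(Re(z))
shifted = cmath.exp(z + 2j * math.pi)    # periodicity: equals e^z
```

All four identities hold to machine precision.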
8 List of literature
The following listing contains textbooks on mathematical topics which are
available in book shops and in libraries. The books are listed separately for
the different fields. The alphabetical order does not reflect the level or the
quality of the presentation.
Reference books
Handbooks, general
Special functions
Tables of integrals
Special areas
Linear Algebra
Analysis
Vector Analysis