237 Course Notes

Calculus 3
Course Notes for MATH 237

Edition 4.21
J. Wainwright and D. Wolczuk
Department of Applied Mathematics
Copyright: J. Wainwright, August 1991
2nd Edition, July 1995
D. Wolczuk, 3rd Edition, April 2008
D. Wolczuk, 4th Edition, September 2009 (updated October 2010)
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
To the Student Reader . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
1 Graphs of Scalar Functions 2
1.1 Scalar Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Geometric Interpretation of z = f(x, y) . . . . . . . . . . . . . . . . . . 4
2 Limits 9
2.1 Denition of a Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Limit Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Proving a Limit Does Not Exist . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Proving a Limit Exists . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3 Continuous Functions 20
3.1 Denition of a Continuous Function . . . . . . . . . . . . . . . . . . . . 20
3.2 The Continuity Theorems . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.3 Limits revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4 The Linear Approximation 29
4.1 Partial Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.2 Second Partial Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.3 The Tangent Plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.4 Linear Approximation for z = f(x, y) . . . . . . . . . . . . . . . . . . . 36
4.5 Linear Approximation in Higher Dimensions . . . . . . . . . . . . . . . 39
5 Dierentiable Functions 42
5.1 Denition of Dierentiability . . . . . . . . . . . . . . . . . . . . . . . . 42
5.2 Dierentiability and Continuity . . . . . . . . . . . . . . . . . . . . . . 47
CONTENTS CONTENTS
5.3 Continuous Partial Derivatives and Dierentiability . . . . . . . . . . . 49
5.4 The Linear Approximation Revisited . . . . . . . . . . . . . . . . . . . 52
6 The Chain Rule 55
6.1 Basic Chain Rule in Two Dimensions . . . . . . . . . . . . . . . . . . . 55
6.2 Extensions of the Basic Chain Rule . . . . . . . . . . . . . . . . . . . . 62
6.3 The Chain Rule for Second Partial Derivatives . . . . . . . . . . . . . . 67
7 Directional Derivatives and the Gradient Vector 72
7.1 Directional Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
7.2 The Gradient Vector in Two Dimensions . . . . . . . . . . . . . . . . . 76
7.3 The Gradient Vector in Three Dimensions . . . . . . . . . . . . . . . . 80
8 Taylor Polynomials and Taylors Theorem 82
8.1 The Taylor Polynomial of Degree 2 . . . . . . . . . . . . . . . . . . . . 82
8.2 Taylors Formula with Second Degree Remainder . . . . . . . . . . . . 85
8.3 Generalizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
9 Critical Points 91
9.1 Local Extrema and Critical Points . . . . . . . . . . . . . . . . . . . . . 91
9.2 The Second Derivative Test . . . . . . . . . . . . . . . . . . . . . . . . 95
9.3 Proof of the Second Partial Derivative Test . . . . . . . . . . . . . . . . 106
10 Optimization Problems 109
10.1 Extreme Value Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 109
10.2 Algorithm for Extreme Values . . . . . . . . . . . . . . . . . . . . . . . 112
10.3 Optimization with Constraints . . . . . . . . . . . . . . . . . . . . . . . 115
11 Coordinate Systems 124
11.1 Polar Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
11.2 Cylindrical Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . 131
11.3 Spherical Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
12 Mappings of R
2
into R
2
137
12.1 The Geometry of Mappings . . . . . . . . . . . . . . . . . . . . . . . . 138
12.2 The Linear Approximation of a Mapping . . . . . . . . . . . . . . . . . 142
12.3 Composite Mappings and the Chain Rule . . . . . . . . . . . . . . . . . 145
CONTENTS CONTENTS
13 Jacobians and Inverse Mappings 148
13.1 The Inverse Mapping Theorem . . . . . . . . . . . . . . . . . . . . . . . 148
13.2 Geometrical Interpretation of the Jacobian . . . . . . . . . . . . . . . . 154
13.3 Constructing Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
14 Double Integrals 163
14.1 Denition of Double Integrals . . . . . . . . . . . . . . . . . . . . . . . 163
14.2 Iterated Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
14.3 The Change of Variable Theorem . . . . . . . . . . . . . . . . . . . . . 174
15 Triple Integrals 179
15.1 Denition of Triple Integrals . . . . . . . . . . . . . . . . . . . . . . . . 179
15.2 Iterated Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
15.3 The Change of Variable Theorem . . . . . . . . . . . . . . . . . . . . . 185
A Implicitly Dened Functions 192
A.1 Implicit Dierentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
A.2 The Implicit Function Theorem . . . . . . . . . . . . . . . . . . . . . . 196
B Solutions to the Exercises 205
Problem Sets 219
Problem Set 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
Problem Set 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
Problem Set 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Problem Set 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
Problem Set 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
Problem Set 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
Problem Set 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
Problem Set 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Problem Set 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Preface
Content:
These notes cover the material of a traditional rst course in multivariable calculus,
apart from vector integral calculus, which is contained in the course Calculus 4 (AM
231).
Prerequisites:
A good knowledge of the fundamentals of one-variable calculus (limits, dier-
entiation, the Chain Rule, the linear approximation, Taylor polynomials, curve
sketching, the Riemann integral . . .).
A good knowledge of the fundamental of linear algebra (vector algebra, matrix
algebra, linear mappings, and determinants)
When studying multivariable calculus one begins to see how the concepts of linear
algebra begin to interact with those of Calculus.
Why is Calculus 3 a core course?
Multivariable calculus is one of the basic tools in the mathematical sciences. The
material in this course is used in a variety of 3rd and 4th year courses in all departments
of the Faculty of Mathematics. Examples of subject areas and related courses for which
Calculus 3 is a prerequisite, are:
ordinary and partial dierential equations (AM 351, 353)
mathematical optimization (C & 0 350)
non-linear programming (C & 0 367)
scientic computation (CS 371)
statistical theory and methods (STAT 330)
real and complex analysis (PM 331, 332)
Preface iii
Viewpoint:
In writing these notes, we have emphasized three aspects of multivariable calculus:
the geometrical interpretation
computational skills
the formal theoretical aspects (denitions, theorems and proofs)
Applications are mentioned as motivation, but are not discussed in depth.
We have given formal denitions of all concepts and have given precise statements
of all theorems. In part I we have given detailed proofs of most of the important
theorems (the Dierentiability Theorem, the Chain Rule, Taylors theorem, and the
Second Derivative Test). There are fewer formal proofs in part II, and in part III there
are no formal proofs, although the theorems are justied heuristically. We have taken
care to make a clear distinction between a formal proof and a heuristic argument.
Most of the concepts are discussed primarily for functions of two variables, but the
results are written in vector notation, so that the extension to functions of n variables
occurs naturally. The case n > 2 is usually discussed at the end of a section, under
the heading generalization.
To the Student Reader
These notes are written for students who are willing to work hard in order to obtain
a good understanding of multivariable calculus, so as to be able to apply the concepts
and methods elsewhere. In order to be successful in this course it is essential to know
single variable calculus well. So keep your notes from Math 137/138 or your rst year
calculus text handy for reference.
These notes are intended to be studied with a pencil and paper ready for use. The
examples show a suitable format for writing solutions, although in many cases the
details of a calculation are omitted and should be worked through by the reader.
RED MEANS STOP! The exercises are of a routine nature, and are designed to
be done quickly, when rst learning the material. You should always do these exercises
and check your answers in the back of the text before proceeding. They are designed to
make sure you have some understanding of the material before you proceed. The ten
iv Preface
problem sets at the back of the text are there for additional practice. The questions
in each problem set are divided into three sections, labeled A, B, and C. The
A questions are regarded as being routine, since they require standard calculations
as described in the notes, and all students should be able to complete these without
diculty. The B questions are not necessarily more dicult, but they require more
understanding of the course material; most, if not all, students should, after careful
consideration of all that is involved, be able to complete these questions. The C
questions are intended to be challenging. Answers to the problem sets problems are
provided on the uwace website.
It is essential to know the denitions before you try to solve problems. Memorize the
statements, but at the same time have a geometrical picture in mind, and then with
time you will develop a clear understanding of the concepts.
Some parts of the notes (for example, Chapters 2, 3 and 5) are more theoretical, and
hence more dicult, than other parts. It takes time and mathematical maturity to
fully understand the fundamental concepts of limit, continuity and dierentiability.
However, there is no need to feel discouraged if you nd these topics dicult, because
you can still press on and obtain a working knowledge of multivariable calculus, as
required for applications.
Understanding and writing proofs is usually the most dicult aspect of a course in
mathematics. Of course it is possible to apply a theorem without knowing the proof.
You just have to believe that it is true (trust me . . .). However, there are long term
benets in studying the proofs of theorems even if you do not plan to become a mathe-
matician. Firstly, you will repeatedly apply the denitions, and this will reinforce your
understanding of the basic concepts, making it possible for you to apply these con-
cepts elsewhere. Secondly, studying proofs is excellent training for the mind in logical
thinking. This experience will benet you in later years, most immediately in taking
courses of a more theoretical nature (e.g. a theoretical course in Computer Science).
Students grades in assignments, tests and the nal examination will be inuenced
by how clearly the ideas are expressed, and by how well the solutions are organized.
It is therefore important that students ensure that they understand not simply how
to obtain an answer which is technically correct, but also how to present a cogent
mathematical argument.
Preface v
Acknowledgements
Thanks are expressed to:
David Siegel: for many discussions and suggestions which have substantially af-
fected the structure and detailed contents of these notes.
Mike La Croix: for the amazing gures which he created for the text, and for his
assistance on editing, formatting and LaTexing.
Kathleen Wilkie: for proof-reading.
Joe Rocca: for proof-reading and checking solutions.
Stephen New, Nico Spronk, Lilia Krivodonova, Ronny Wan, and the many others
who have contributed suggestions for the material and formatting.
vi Preface
Part I
Multivariable Dierential Calculus
Chapter 1
Graphs of Scalar Functions
1.1 Scalar Functions
The most important concept in mathematics is that of a function. A function f : A
B associates with an element a A a unique element f(a) B. The subset of A for
which f(a) is dened is called the domain of f and is denoted by D(f). The subset of
B consisting of all f(a) is called the range of f and is denoted R(f).
In part I of these notes, we extend what we did in single variables calculus to
functions of several variables. In particular, we consider functions
f : R
2
R,
which maps a point (x, y) R
2
to a point f(x, y) R. Thus, the domain D(f) is a
subset of R
2
and the range R(f) is a subset of R. We will also consider more general
functions f : R
n
R.
REMARK
Although strictly speaking, f(x, y) denotes the value of the function f at the point
(x, y), it is common practice to use the phrase the function f(x,y).
A scalar function f : R
n
R is a function whose domain is a subset of R
n
and Denition
scalar function whose range is a subset of R.
We will use x to represent a point in R
n
. For example, a R
2
means a = (a, b) and Notation
x R
3
means x = (x, y, z).
Section 1.1 Scalar Functions 3
Consider a function f : R
2
R . Then f takes a point x R
2
and returns a scalar
value z, so we write z = f(x, y). Similarly, a function f : R
3
R could be written as
w = f(x, y, z).
Let f : R
2
R be dened by f(x, y) = 2x + 3y + 1. Find f(1, 4) and f(1, 1). EXAMPLE 1
Solution: We have f(1, 4) = 2(1)+3(4)+1 = 9 and f(1, 1) = 2(1)+3(1)+1 = 6.
Find the domain and range of the following functions: EXAMPLE 2
f(x, y) =
xy, g(x, y) =
x
2
y
2
|x| +|y|
Solution: For f, we know that we can not take a square root of a negative number,
so to get a valid answer we need xy 0. Thus, the domain is the set x 0, y 0 and
x 0, y 0. Since this is a subset of R
2
it is easy to represent with a picture.
Clearly the range of f is z 0 since the square root function returns a non-negative
number.
For g, we see that it is dened as long as (x, y) = (0, 0) so the domain is R
2
{(0, 0)}.
The range is a little more dicult to see. We need to determine what values we can
get from g by taking points in our domain. We rst consider points (c, 0). These give
g(c, 0) =
c
2
0
2
|c| +|0|
= |c|, hence g can take any positive value since c = 0. Similarly,
points (0, d) give f(0, d) =
0
2
d
2
|0| +|d|
= |d|, hence g can also take any negative value.
Finally, we observe that g(1, 1) = 0 and hence the range of g is R.
Sketch the domain and nd the range of the following functions. EXERCISE 1
a) f(x, y) = ln(1 x
2
y
2
).
b) g(x, y) =
_
16 x
2
+ y
2
.
4 Chapter 1 Graphs of Scalar Functions
For more complicated functions f : R
2
R it could be extremely dicult to determine
their range. When we had such situations with single variable functions we often found
it helpful to sketch the graph of the function. So, we now determine how to sketch a
graph of a function f : R
2
R .
1.2 Geometric Interpretation of z = f(x, y)
When we graphed a function f : R R we plotted points
(a, f(a)) in the xy-plane. Observe that we can think of
f(a) as representing the height of the graph y = f(x)
above (or below if negative) the x-axis at x = a. So, we
dene the graph of a function f : R
2
R as the set
of all points (a, b, f(a, b)) in R
3
such that (a, b) D(f).
In particular, we will think of f(a, b) as representing the
height of the surface z = f(x, y) above (x, y) = (a, b).
x y
z
z = f(x, y)
(a, b)
a, b, f(a, b)
Let f : R
2
R be dened by f(x, y) = c
1
x + c
2
y + c
3
. We recognize this as the EXAMPLE 3
equation of a plane in R
3
. i.e. the graph of z = f(x, y) is a plane.
In general, surfaces z = f(x, y) can be quite complicated. So, to help us visualize
and/or sketch these surfaces, we look at 2-dimensional slices of the surface.
For a function f : R
2
R , the level curves of f are the curves Denition
level curves
k = f(x, y),
where k is a constant in the range of f.
What are the level curves of f : R
2
R dened by f(x, y) = 2x 3y + 1. EXAMPLE 4
Solution: We observe that R(f) R. So, the equation k = f(x, y) = 2x 3y + 1,
denes a family of curves. Sketching this family of curves gives parallel straight lines.
Section 1.2 Geometric Interpretation of z = f(x, y) 5
REMARK
Observe that the level curve k = f(x, y) is the intersection of z = f(x, y) and the
horizontal plane z = k. Thus, in our family of level curves, each value of k represents
the height of that level curve above the xy-plane as shown in the diagram.
Consider the functions dened by EXAMPLE 5
f(x, y) = x
2
+ y
2
, g(x, y) = x
2
y
2
, h(x, y) = x
2
.
Sketch the level curves of f and use them to sketch the surface z = f(x, y).
Solution: For f, we rst observe that D(f) = R
2
and R(f) = {z R | z 0} since
x
2
+ y
2
0. Hence, k can take on values k 0. Hence, the level curves k = x
2
+ y
2
are circles with center (0,0). The level curve 0 = x
2
+ y
2
is thus just the single point
(0, 0), and is therefore called an exceptional level curve.
Remembering that k represents the height of the level curve k = f(x, y) above the
xy-plane, we sketch the surface by drawing the circles in the appropriate planes z = k
in R
3
. Thus, we get the surface in Figure 1 which is called a paraboloid.
x
y
x
y
z
z = C
z = x
2
+ y
2
For g, we rst observe that D(g) = R
2
and R(g) = R. Hence, for any k R we sketch
the level curves
k = x
2
y
2
,
which we recognize as a family of hyperbola with asymptotes y = x corresponding
to 0 = x
2
y
2
. Using these to sketch the surface, we get a saddle surface.
x
y
C = 0
C < 0
C > 0
x
y
z
C = 0
C < 0
C > 0
For h, we have that D(h) = R
2
and R(h) = {z R | z 0}. Thus, for and k 0 we
have level curves
k = x
2
x =
k.
Hence, the level curves are pairs of vertical straight lines. Using these to sketch the
surface, we get a parabolic cylinder.
x
y
C = 1
C =
9
16
C =
1
4
C =
1
16
x
y
z C = 1
C =
9
16
C =
1
4
C =
1
16
REMARK
Level curves occur in everyday life, e.g.:
1. The elevation of the earths surface above sea-level is described by an equation
z = h(x, y),
where h : R
2
R. A convenient way to represent h is by means of a contour
map, showing the curves of constant elevation
h(x, y) = k,
which are precisely the level curves of h.
Section 1.2 Geometric Interpretation of z = f(x, y) 7
2. The temperature over North America at a xed time is described by an equation
T = f(x, y),
where f : R
2
R. Weather maps often show curves of constant temperature
called isotherms, which are the level curves of f.
Sketch the level curves of each function f : R
2
R and use them to sketch/visualize EXERCISE 2
the surface z = f(x, y).
a) f(x, y) = x
2
+ 100y
2
b) f(x, y) = x + y
c) f(x, y) = 4x
2
+ x y
2
d) f(x, y) = (x + y)
2
REMARK
In general, it is not always possible to sketch the level curves of a given function
f(x, y) by inspection. Later in the course, we will develop some results which can be
used to obtain information about the level curves of a function.
One can also obtain insight into the shape of a surface z = f(x, y) by sketching the
curves of intersection of the surface with vertical planes x = c and y = d.
The cross-sections of a surface z = f(x, y) are the curves given by Denition
cross-sections
z = f(c, y) or z = f(x, d).
In Example 5, the cross-sections x = c are given by EXAMPLE 6
z = c
2
+ y
2
z = c
2
y
2
z = c
2
,
respectively. We again see that we get families of curves. Sketching z = c
2
+ y
2
gives
Sketch the the cross-sections x = c and y = d for z = x
2
y
2
and z = x
2
. EXERCISE 3
Sketch the level-curves and cross-sections of f(x, y) =
_
x
2
+ y
2
and use them to sketch EXERCISE 4
Generalization
For a function of three variables f : R
3
R, the equation f(x, y, z) = k, where
k R(f) is a family of surfaces in R
3
and so are often called the level surfaces of f.
The level surfaces of f(x, y, z) = x
2
+y
2
+z
2
are the family of spheres k = x
2
+y
2
+z
2
, EXAMPLE 7
for k > 0. In the exceptional case k = 0, the level surface is the single point (0, 0, 0).
For f : R
n
R we call the equations f(x) = k, k R(f) the level sets of f.
Let f : R
n
R be dened by EXAMPLE 8
f(x
1
, . . . , x
n
) = x
2
1
+ x
2
2
+ + x
2
n
.
The level sets f(x) = k, k > 0 in R
n
are called (n 1)-spheres, denoted S
n1
. If
n = 3 we obtain 2-spheres S
2
, as in Example 4.
Chapter 2
Limits
2.1 Denition of a Limit
Recall for a function f : R R we dened lim
xa
f(x) = L to mean that the values
of f(x) can be made arbitrarily close to L by taking x suciently close to a. More
precisely, for every > 0 there exists a > 0 such that
|f(x) L| < whenever 0 < |x a| < .
Moreover, we showed that lim
xa
f(x) = L if and only if lim
xa
f(x) = L = lim
xa
+
f(x).
Thus, for scalar functions f : R
2
R, we want lim
xa
f(x) = L to mean the values of
f(x) can be made arbitrarily close to L by taking x suciently close to a. For the one
variable case we could only approach the limit from two directions, left and right. For
multivariable scalar functions our domain is now multidimensional, so we can approach
the limits from innitely many directions. Moreover, we are not restricted to straight
lines either; we can approach a along any smooth curve as well! Hence, to generalize
the precise denition of a limit we need to generalize the concepts of an interval.
A neighborhood of a point a R
2
is a set N
r
(a) = {x R
2
| x a < r}. Denition
neighborhood
Recall that x a is the Euclidean distance in R
2
. That is, if
x = (x, y) and a = (a, b) then
x a = (x, y) (a, b) =
_
(x a)
2
+ (y b)
2
.
x
a
r
Nr(a)
Thus we get:
10 Chapter 2 Limits
Let f : R
2
R. If f is dened in a neighborhood of a R
2
, except possibly at a, then Denition
limit we dene lim
xa
f(x) = L to mean that for every > 0, there exists a > 0 such that for
all x in the domain of f
|f(x) L| < whenever 0 < x a < .
Domain in R
2
Range in R
..
L L +
L
f
x
a
_ _
x a < |f(x) L| <
Using the precise denition can be quite complex even for relatively simple limits.
Thus, we will instead use the denition to prove theorems to make this easier.
2.2 Limit Theorems
In extending our denition of a limit to functions f : R
2
R we would hope that
we have preserved all of our properties of limits we had for single variable functions
(otherwise it would not be a very good generalization!). In particular we have
Let f, g : R
2
R. If lim
xa
f(x) and lim
xa
g(x) both exist then Theorem 1
a) lim
xa
[f(x) + g(x)] = lim
xa
f(x) + lim
xa
g(x).
b) lim
xa
[f(x)g(x)] =
_
lim
xa
f(x)
_ _
lim
xa
g(x)
_
.
c) lim
xa
f(x)
g(x)
=
lim
xa
f(x)
lim
xa
g(x)
, provided lim
xa
g(x) = 0.
Proof: We will just prove a).
By denition of the limit, since lim
xa
f(x) = L
1
and lim
xa
g(x) = L
2
both exist we have
for every > 0 there exists a > 0 such that for any x for which 0 < x a < we
have |f(x) L
1
| <
1
2
and |g(x) L
2
| <
1
2
.
Section 2.3 Proving a Limit Does Not Exist 11
Thus, for any x for which 0 < x a < we have
(f(x) + g(x)) (L
1
+ L
2
)
(f(x) L
1
) + (g(x) L
2
)
|f(x) L
1
| +|g(x) L
2
| by the Triangle Inequality
<

2
+

2
= ,
as required.
Prove b) and c) in Theorem 1. EXERCISE 1
If lim
xa
f(x) exists, then the limit is unique. Corollary 1
Proof: Assume that lim
xa
f(x) = L
1
and lim
xa
f(x) = L
2
. Then we have
|L
1
L
2
| = | lim
xa
f(x) lim
xa
f(x)| = | lim
xa
(f(x) f(x))| = 0,
hence L
1
= L
2
.
2.3 Proving a Limit Does Not Exist
Recall for a function of one variable, we often showed a limit did not exist by showing
the left-hand limit did not equal the right-hand limit and using the fact that the limit
is unique. For multivariable functions, we will essentially do the same thing, only now
we have to remember that we are able to approach a along any smooth curve.
Let f : R
2
R be dened by f(x, y) =
xy
x
2
+y
2
, for (x, y) = (0, 0). Prove that EXAMPLE 1
lim
(x,y)(0,0)
f(x, y) does not exist.
Solution: To prove this does not exist we just need to approach the limit along two
paths that give dierent values.
We rst approach the limit along the line y = 0. Along this line
we have f(x, 0) =
0
x
2
+ 0
= 0, so that
lim
(x,y)(0,0)
f(x, 0) = lim
x0
0
x
2
= 0.
x
y
y = x
(0, 0)
(x, x)
(x, 0)
y = 0
12 Chapter 2 Limits
Now, approaching the limit along the line y = x we get
lim
(x,y)(0,0)
f(x, x) = lim
x0
x
2
x
2
+ x
2
=
1
2
.
Since f(x, y) approaches dierent values as (x, y) tends to (0, 0) along dierent paths,
the limit does not exist.
We often can approach the limit along innitely many lines or smooth curves at the
same time by introducing an arbitrary coecient m. If our limit depends on the value
of m, then it cannot be unique and hence the limit will not exist.
Prove that lim
(x,y)(0,0)
sin(xy)
x
2
+ y
2
does not exist. EXAMPLE 2
Solution: Approaching the limit along lines y = mx we get
lim
(x,y)(0,0)
sin(x(mx))
x
2
+ (mx)
2
= lim
x0
sin(mx
2
)
x
2
(1 + m
2
)
= lim
x0
2mxcos(mx
2
)
2x(1 + m
2
)
by LHR
= lim
x0
mcos(mx
2
)
1 + m
2
=
m
1 + m
2
Since the limit depends on m we get a dierent limit along each line y = mx and hence
lim
(x,y)(0,0)
sin(xy)
x
2
+ y
2
does not exist.
Let f(x, y) =
|x|
|x| + y
2
, for (x, y) = (0, 0). Show that EXERCISE 2
lim
(x,y)(0,0)
f(x, mx) = 1, for all m,
but lim
(x,y)(0,0)
Hint: y = mx does not describe all lines through the origin.
As described above, we can also approach limits along smooth curves.
Section 2.3 Proving a Limit Does Not Exist 13
Let f(x, y) =
x
2
y
x
4
+ y
2
, for (x, y) = (0, 0). Show that lim
(x,y)(0,0)
f(x, y) does not exist. EXAMPLE 3
Solution: As before we rst test the limit along lines y = mx. We get
lim
(x,y)(0,0)
f(x, mx) = lim
x0
x
2
(mx)
x
4
+ (mx)
2
= lim
x0
mx
x
2
+ m
2
= 0
and
lim
(x,y)(0,0)
f(0, y) = lim
y0
0
y
2
= lim
y0
0 = 0.
These all give the same value so we start testing curves. Of course, we dont want to
start randomly guessing curves. Observe that to get a value other than 0, we really
need the power of x everywhere in the denominator to match the power of x in the
numerator (so they cancel out). Thus, we see that approaching the limit along y = x
2
works. We get
lim
(x,y)(0,0)
f(x, x
2
) = lim
x0
x
2
(x
2
)
x
4
+ (x
2
)
2
= lim
x0
1
2
=
1
2
.
Since we have two dierent values along two dierent paths the limit does not exist.
REMARKS
1. We could have done the last example more eciently by just testing y = mx
2
to
begin with and showing the limit depends on m.
2. Make sure that all lines or curves you use actually approach the limit. A common
error is to approach a limit like in Example 3 along a line x = 1... which of course
is meaningless as it does not pass through (0, 0).
3. Example 3 shows that no matter how many lines and/or curves you test, you
cannot use this method to prove a limit exists. Just because you havent found
two paths that give dierent values does not mean there is not one!
a) Prove that lim
(x,y)(0,0)
x
3
y
x
6
+ y
2
does not exist. EXERCISE 3
b) Prove that lim
(x,y)(1,0)
(x 1)(y + 1)
|x 1| + y
does not exist.
14 Chapter 2 Limits
2.4 Proving a Limit Exists
Since we cannot use the method above to prove a limit exists, we prove another theorem
to help us.
Squeeze Theorem Theorem 2
For f : R
2
R, if there exists a function B(x) such that |f(x) L| B(x) for all
x = a in some neighborhood of a and lim
xa
B(x) = 0 then lim
xa
f(x) = L.
Proof: Since lim
xa
B(x) = 0 we have that for all > 0 there exists a > 0 such that
for all x for which 0 < x a < we have |B(x) 0| < . Hence, for all x for which
0 < x a < we have
|f(x) L| B(x) = |B(x)| < ,
since our hypothesis requires that B(x) 0 for all x = a in the neighborhood of a.
Thus, by denition of a limit we have
lim
xa
f(x) = L.
Our statement of the Squeeze Theorem above is not a direct generalization of the EXERCISE 4
Squeeze Theorem we used in single variable calculus. What would the direct general-
ization of the Squeeze Theorem be? Show how your generalization and the theorem
above are related.
Prove that lim
(x,y)(0,0)
x
2
y
x
2
+ 2y
2
= 0 EXAMPLE 4
Solution: We have f(x, y) =
x
2
y
x
2
+ 2y
2
, L = 0. For (x, y) = (0, 0) we obtain
0 |f(x, y) L| =
x
2
y
x
2
+ 2y
2
0
=
x
2
|y|
x
2
+ 2y
2
.
Since y
2
0, it follows that x
2
x
2
+ 2y
2
, and hence
x
2
|y|
x
2
+ 2y
2
(x
2
+ 2y
2
)
|y|
x
2
+ 2y
2
= |y|.
Section 2.4 Proving a Limit Exists 15
Thus
0 |f(x, y) L| |y|, for all (x, y) = (0, 0).
By inspection
lim
(x,y)(0,0)
|y| = 0
Thus, the Squeeze Theorem implies that lim
(x,y)(0,0)
f(x, y) exists and equals L, giving
the desired result.
The next example illustrates some manipulations with inequalities.
Prove that EXAMPLE 5
|2x
2
y
2
|
|x| +|y|
2|x| +|y|, for all (x, y) = (0, 0)
Solution: The idea is to manipulate the numerator so as to create a factor of |x| +|y|,
which will cancel the denominator. For arbitrary (x, y), consider
|2x
2
y
2
| = |2x
2
+ (y
2
)|
|2x
2
| +| y
2
|, by the Triangle Inequality
= 2|x|
2
+|y|
2
Since |x| |x| +|y|, and |y| |x| +|y|, we obtain
2|x|
2
+|y|
2
2|x|
_
|x| +|y|
_
+|y|
_
|x| +|y|
_
=
_
2|x| +|y|
__
|x| +|y|
_
Hence,
|2x
2
y
2
|
|x| +|y|

(2|x| +|y|)(|x| +|y|)
|x| +|y|
= 2|x| +|y|,
as required.
16 Chapter 2 Limits
REMARK
It is necessary to be careful when working with inequalities. For example, the
statement
x < x
2
is false if |x| < 1. See the appendix at the end of the chapter for a review of inequalities.
Prove that EXERCISE 5
|x
3
y
3
|
x
2
+ y
2
|x| +|y| for all (x, y) = (0, 0).
Does equality ever hold?
Summary
Before one can apply the Squeeze Theorem, you must have a possible limiting value L
in mind. Of course, if you are asked to
Prove that lim
(x,y)(a,b)
f(x, y) = L,
you are given the limiting value L, and can apply the Squeeze Theorem directly as in
Example 4. On the other hand, if you are asked to
Determine whether lim
(x,y)(a,b)
f(x, y) exists, and if so nd its value,
you should begin by letting (x, y) approach (a, b) along straight lines of dierent slope.
If the limiting value of f(x, y) depends on the slope, then lim
(x,y)(a,b)
f(x, y) does not
exist.
If the limiting value of f(x, y) does not depend on the slope and equals L, say, then
lim
(x,y)(a,b)
f(x, y) may exist and if it does exist, it equals L.
You should then try to apply the Squeeze Theorem to prove that the limit does exist
and equals L.
If you fail to derive a suitable inequality, you cannot draw a conclusion, and you are
faced with a dilemma . . .
perhaps a suitable inequality does exist, but you were not skillful enough
to derive it,
Section 2.4 Proving a Limit Exists 17
OR
perhaps if you let (x, y) approach (a, b) along curves, then the you may
get a limiting value other than L along one of those curves, in which case
lim
(x,y)(a,b)
This can be a process of trial and error, but experience will help to shorten the process.
Here is an example.
(x,y)(0,0)
x
2
|x| |y|
|x| +|y|
exists, and if so nd its value. EXAMPLE 6
Solution: Trying lines y = mx we get
lim
x0
x
2
|x| m|x|
|x| + m|x|
= lim
x0
|x| (1 + m)
1 + m
= 1.
Since the value along each line is L = 1, we try to prove the limit is 1 with the
Squeeze Theorem. Thus, we consider
x
2
|x| |y|
|x| +|y|
(1)
x
2
|x| |y|
|x| +|y|
+
|x| +|y|
|x| +|y|
=
x
2
|x| +|y|
=
|x| |x|
|x| +|y|
|x|(|x| +|y|)
|x| +|y|
= |x|, since |x| (|x| +|y|)
Since lim
(x,y)(0,0)
|x| = 0 we get lim
(x,y)(0,0)
x
2
|x||y|
|x|+|y|
= 1 by the Squeeze Theorem.
Consider f : R
2
R dened by EXERCISE 6
f(x, y) =
x
2
(x 1) y
2
x
2
+ y
2
, for (x, y) = (0, 0)
(x,y)(0,0)
f(x, y) exists, and if so nd its value.
18 Chapter 2 Limits
Generalization
The concept of a neighborhood, the denition of a limit, the Squeeze Theorem and
the limit theorems are all valid for functions f : R
n
R. In fact, to generalize these
concepts, one only needs to change all of the R
2
s to R
n
s in our denitions in Section
2.1 and recall to that if x, a R
n
, then
x a =
_
(x
1
a
1
)
2
+ + (x
n
a
n
)
2
.
Appendix: Inequalities
The following statements can be taken as axioms (i.e. assumed properties) which dene
the notion of less than (denoted <) for real numbers.
1
Trichotomy Property: For any real numbers a and b, one and only one of the
following holds:
a = b, a < b, b < a
Transitivity Property: If a < b and b < c, then a < c.
Addition Property: If a < b, then for all c, a + c < b + c.
Multiplication Property: If a < b and c < 0, then ac > bc.
Using these properties one can deduce other results.
The absolute value of a real number a is dened by
|a| =
_
_
_
a if a 0
a if a < 0.
Three frequently used results, which follow from the axioms, are listed below.
1. |a| =
a
2
.
2. |a| < b if and only if b < a < b.
3. the Triangle Inequality: |a + b| |a| +|b| for all a, b R.
1
One can equivalently use the notion of greater than (denoted >). The statement a > b
means b < a.
Section 2.4 Appendix: Inequalities 19
REMARK
When using the Squeeze Theorem, the most commonly used inequalities are:
(1) the Triangle Inequality.
(2) if c > 0, then a < a + c.
(3) the cosine inequality 2|x||y| x
2
+ y
2
.
One particularly common use of (2) is for things like
|x| =
x
2
_
x
2
+ y
2
.
We again stress that it is very important that one be careful when working with
inequalities. Another common mistake is
if c > 0, then |a + c| a + c.
Give an example to show where this statement is false.
Chapter 3
Continuous Functions
3.1 Denition of a Continuous Function
In many situations, we shall need to require that a function f : R
2
R is continuous.
Intuitively, this means that the graph of f (the surface z = f(x, y)) has no breaks or
holes in it. As with functions of one variable, continuity is dened by using limits.
x
y
z
break in the surface
z = f(x, y)
set of points (x, y) at
which f is not continuous
Review the denition of a continuous function of one variable in your rst year calculus EXERCISE 1
text. Give an example (formula and graph) of a function f : R R which is dened
for all x R, but is not continuous at x = 1. Use one-sided limits to prove that the
appropriate limit does not exist.
Section 3.1 Denition of a Continuous Function 21
Here is the formal denition.
A function f : R
2
R is continuous at a if and only if Denition
continuous
lim
xa
f(x) = f(a)
Additionally, if f is continuous at every point in a set D R
2
, then we say that f is
continuous on D.
REMARK
There are really three requirements in this denition:
1. lim
xa
f(x) exists,
2. f is dened at a,
3. the stated equality.
Let f : R
2
f(x, y) =
_
_
_
x
2
y
x
2
+2y
2
if (x, y) = (0, 0)
0 if (x, y) = (0, 0).
Determine whether f is continuous at (0, 0).
Solution: According to the denition we have to determine whether
lim
(x,y)(0,0)
x
2
y
x
2
+ 2y
2
= 0
This limit was established in Example 4 of Section 2.4. It follows that f is continuous
at (0, 0).
22 Chapter 3 Continuous Functions
Consider f : R
2
R dened by EXAMPLE 2
f(x, y) =
sin(xy)
x
2
+ y
2
if (x, y) = (0, 0).
Can f be dened at (0, 0) so that the resulting function, whose domain is R
2
, is
continuous at (0, 0)?
Solution: By denition of continuity, we must determine whether
lim
(x,y)(0,0)
sin(xy)
x
2
+ y
2
exists. It was shown in Example 2 of Section 2.3 that this limit does not exist. Thus,
no matter what value we assign to f(0, 0) the resulting function will not be continuous
at (0, 0).
Let f : R
2
R be dened by f(x, y) =
_
_
_
xy
|x|+|y|
if (x, y) = (0, 0)
0 if (x, y) = (0, 0).
EXERCISE 2
Determine if f is continuous at (0, 0).
3.2 The Continuity Theorems
One can often quickly prove that a function is continuous by applying certain theorems.
The idea is to view a given function as being formed from simple functions by certain
basic operations, which we now dene.
Let f : R
2
R and g : R
2
R and x D(f) D(g) then: Denition
operations on
functions
1. the sum f + g : R
2
R is dened by
(f + g)(x) = f(x) + g(x)
2. the product fg : R
2
R is dened by
(fg)(x) = f(x)g(x)
3. the quotient
f
g
: R
2
R is dened by
_
f
g
_
(x) =
f(x)
g(x)
, if g(x) = 0.
Section 3.2 The Continuity Theorems 23
Let g : R R and f : R
2
R. The composite function g f : R
2
R is dened Denition
composite function by
(g f)(x) = g(f(x)),
for all x for which f(x) D(g).
When composing multivariable functions, it is very important to make sure the range
of the inner function matches the domain of the outer function. For example, if f, h :
R
2
R we cannot compose hf since f returns a scalar value which is not acceptable
input into h.
Here are the required theorems, which we shall refer to collectively as the continuity
theorems.
Sum and Product Theorem 1
If f : R
2
R and g : R
2
R are continuous at a then f + g and fg are continuous
at a.
Proof: We prove the result for f + g and leave the proof for fg as an exercise. By
the hypothesis and the denition of continuous function we have that
lim
xa
f(x) = f(a), lim
xa
g(x) = g(a).
Hence, by denition of the sum and limit properties, we get
lim
xa
(f + g)(x) = lim
xa
f(x) + lim
xa
g(x) = f(a) + g(a) = (f + g)(a).
Complete the proof of the theorem, by proving that fg is continuous at a. EXERCISE 3

Quotient Theorem 2
If f : R
2
R and g : R
2
R are both continuous at a and g(a) = 0 then the quotient
f
g
is continuous at a.
Use the Limit Theorems to prove Theorem 2. Where is the hypothesis g(a) = 0 used EXERCISE 4
explicitly?
Composition Theorem 3
If f : R
2
R is continuous at a and g : R R is continuous at f(a), then the
composition g f is continuous at a.
Proof: By denition of continuity we have, lim
yf(a)
g(y) = g(f(a)), where we have
chosen to use y as the independent variable. By denition of a limit, for all > 0 there
exists a
1
> 0 such that for all y,
|y f(a)| <
1
implies |g(y) g(f(a))| < . (3.1)
Similarly we have by denition of continuity, lim
xa
f(x) = f(a). By denition of a
limit, given the above
1
, there exists a > 0 such that for all x,
x a < implies |f(x) f(a)| <
1
. (3.2)
The key step is to choose y = f(x) in (3.1), which is possible since (3.1) is preceded
by the quantier for all y. Then (3.1) reads
|f(x) f(a)| <
1
implies |g(f(x)) g(f(a))| < .
Combining this with (3.2), we have that for all > 0 there exists a > 0 such that for
all x,
x a < implies |g(f(x)) g(f(a))| <
or equivalently,
x a < implies |(g f)(x) (g f)(a)| < ,
in terms of the denition of a composite function. By denition of a limit,
lim
xa
(g f)(x) = (g f)(a),
which proves that the composite function g f is continuous at a.
Before we can apply these theorems, we need a list of basic functions which are
known to be continuous on their domains:
the constant function f(x, y) = k
the coordinate functions f(x, y) = x, f(x, y) = y
the logarithm function ln()
the exponential function e
()
the trigonometric functions, sin(), cos(), etc.
the inverse trigonometric functions, arcsin(), etc.
the absolute value function | |
Prove that the constant function f(x, y) = k and the coordinate functions f(x, y) = x, EXERCISE 5
f(x, y) = y are continuous on their domains.
Prove that the function h : R
2
h(x, y) = sin(6x
2
y + 3xy
2
)
is continuous for all (x, y) R
2
.
Solution: By applying Theorem 1 to the constant function and the coordinate func-
tions, it follows that
f(x, y) = 6x
2
y + 3xy
2
(3.3)
is continuous for all (x, y) R
2
. Theorem 3, with g() = sin() and f as in equation
(3.3), now implies that h is continuous for all (x, y) R
2
.
The function h : R
2
h(x, y) =
sin
2
| x + 2y |
x
2
+ y
2
is continuous for all (x, y) = (0, 0). Which of the basic functions and theorems do you
have to use in order to prove this?
You will notice that the power function x
a
is not included in the list of basic functions.
This omission was deliberate, since x
a
can be expressed in terms of e
()
and ln() :
x
a
= e
a ln x
,
for all x > 0. It thus follows from Theorem 3 that x
a
is continuous for all x > 0.
Prove that the function h : R
2
h(x, y) = (xy)
is continuous for all (x, y) which satisfy xy > 0. Which of the theorems and basic
functions do you have to use?
These examples show that by using the Continuity Theorems, one can often prove
continuity of a given function essentially by inspection. However, for certain points,
where the Continuity Theorems can not be applied, one still has to use the denition
of continuity in order to determine whether or not the function is continuous. Here is
an example.
Discuss the continuity of the function f : R
2
f(x, y) =
_
_
_
e
xy
1
x
2
+y
2
if (x, y) = (0, 0),
0 if (x, y) = (0, 0).
Solution: For (x, y) = (0, 0) the Continuity Theorems immediately imply that f is
continuous at these points.
Observe the point (0, 0) is singled out in the denition of the function. Thus, the
Continuity Theorems cannot be applied at (0, 0) and so we have to use the denition,
i.e. we have to determine whether
lim
(x,y)(0,0)
f(x, y) = f(0, 0) = 0.
On the line y = x we get
lim
(x,y)(0,0)
f(x, x) = lim
x0
e
x
2
1
2x
2
=
1
2
,
by LH opitals rule. It follows that lim
(x,y)(0,0)
f(x, y) does not equal f(0, 0), and hence
by denition, f is not continuous at (0, 0).
Referring to Example 4, can you make f continuous at (0, 0) by redening f(0, 0) =
1
2
? EXERCISE 8
Discuss the continuity of the function f : R
2
f(x, y) =
_
_
_
|yx|
yx
if x = y
0 if x = y.
Solution: For points (x, y) with x = y the Continuity Theorems immediately imply
that f is continuous at these points.
We can not apply the continuity theorems at the points (x, y) with x = y. Consider any
one of these points, and denote it by (a, a). If (x, y) approaches (a, a) with y x > 0,
then |y x| = y x, and f(x, y) approaches (and in fact equals) 1. On the other hand
if (x, y) approaches (a, a) with y x < 0, then f(x, y) approaches 1. Thus
lim
(x,y)(a,a)
By denition of continuity, f is not continuous at (a, a). The
geometric interpretation is simple. The graph of f consists
of two parallel half-planes which form a step along the line
y = x.
x
y
z
z = 1
z = 1
y = x
3.3 Limits revisited
So far in this chapter, we have shown how to prove that a function is continuous at a
point essentially by inspection, using the Continuity Theorems. This makes it easy
to evaluate lim
xa
f(x) if f is continuous at a. In particular, if f is continuous at a, then
lim
xa
f(x) can be evaluated simply by evaluating f(a).
Evaluate lim
(x,y)(,0)
cos
_
x
2
+ y
2
x
2
+ y
2
. EXAMPLE 6
Solution: Let f(x, y) =
cos
_
x
2
+ y
2
x
2
+ y
2
, for (x, y) = (0, 0). By the Continuity Theo-
rems, f is continuous for all (x, y) = (0, 0). Thus, by denition of continuity,
lim
(x,y)(,0)
f(x, y) = f(, 0) =
cos
2
=
1
2
.
Evaluate lim
(x,y)(1,)
ln(1 + e
sinxy
) justifying your method. EXERCISE 9
REMARK
In applying the Squeeze Theorem one has to prove that lim
xa
B(x) = 0. One hopes
to be able to evaluate this limit by inspection, and so one tries to set up the inequality
in the Squeeze Theorem so that B(x) is continuous at a.
Chapter 4
The Linear Approximation
4.1 Partial Derivatives
A function f : R
2
R which maps (x, y) f(x, y) can be dierentiated in two
natural ways:
1. Treat y as a constant, and dierentiate with respect to x, to obtain
f
x
.
2. Treat x as a constant, and dierentiate with respect to y, to obtain
f
y
.
The derivatives
f
x
and
f
y
are called the (rst) partial derivatives of f.
Here is the formal denition.
Let f : R
2
R. The partial derivatives of f at (a, b) are dened by Denition
partial derivatives
f
x
(a, b) = lim
h0
f(a + h, b) f(a, b)
h
,
f
y
(a, b) = lim
h0
f(a, b + h) f(a, b)
h
,
provided that these limits exist.
Typically one tries to calculate the partial derivatives by using the standard rules for
dierentiation. However, if these can not be applied, then the denition of the partial
derivatives must be used.
30 Chapter 4 The Linear Approximation
A function f : R
2
R is dened by EXAMPLE 1
f(x, y) = xe
kxy
,
where k is a constant. Calculate
f
x
and
f
y
at an arbitrary point.
Solution: By using the Product Rule and Chain Rule for dierentiation,
f
x
= (1)e
kxy
+ xe
kxy
(ky) = (1 + kxy)e
kxy
f
y
= xe
kxy
(kx) = kx
2
e
kxy
A function f : R
2
R is dened by f(x, y) = sin(xy
2
). Calculate
f
x
and
f
y
. EXERCISE 1
A function f : R
2
f(x, y) = (x
3
+ y
3
)
1
3
.
Determine whether
f
x
(0, 0) exists.
Solution: By dierentiation,
f
x
(x, y) =
x
2
(x
3
+ y
3
)
2/3
, (4.1)
for all (x, y) such that x
3
+ y
3
= 0. One cannot substitute (x, y) = (0, 0) in equation
(4.1) since the denominator would be zero. Thus, we must use the denition of the
partial derivatives at (0, 0). We get
f
x
(0, 0) = lim
h0
f(0 + h, 0) f(0, 0)
h
= lim
h0
(h
3
)
1/3
0
h
= lim
h0
1 = 1.
Refer to the function in Example 2. f is not continuous at all points on the line y = x EXERCISE 2
since x
3
+y
3
= 0 if and only if y = x. Show that
f
x
(a, a) does not exist for a = 0.
Section 4.2 Partial Derivatives 31
A function f : R
2
R is dened by EXERCISE 3
f(x, y) = |x(y 1)|
Determine whether
f
x
(0, 0) and
f
x
(0, 1) exist.
Hint: You must use the denition of the partial derivative at all points (0, a) and (a, 1),
for any a R, since one cannot dierentiate |x| and |y 1| at 0 and 1 respectively.
The partial derivatives of f(x, y) are also denoted by f
x
and f
y
, i.e. Notation
f
x
= f
x
,
f
y
= f
y
.
This is called the subscript notation. It is sometimes convenient to use the operator
notation D
1
f and D
2
f for the partial derivatives of f : R
2
R. The notation D
1
f
means: dierentiate f with respect to the variable in the rst position, holding the
other xed. If the independent variables are x and y, then
D
1
f =
f
x
, D
2
f =
f
y
.
Generalization
We can extend what we have done for f : R
2
R to scalar functions f : R
n
R.
That is, we take the partial derivative of f with respect to its i-th variable by holding
all the other variables constant and dierentiating with respect to the i-th variable.
Let f(x, y, z) = xy
2
z
3
. Find f
x
, f
y
and f
z
. EXAMPLE 3
Solution: We have
f
x
= y
2
z
3
f
y
= 2xyz
3
f
z
= 3xy
2
z
2
For f : R
3
R, write the precise denition of
f
x
,
f
y
, and
f
z
. EXERCISE 4
4.2 Second Partial Derivatives
In how many ways can one calculate a second partial derivative of f : R
2
R? Observe
that since the rst partial derivatives of f are also functions of two variables we can
take partial derivatives of them. Hence, there are 4 possible second partial derivatives
of f. They are:
2
f
x
2
=

x
_
f
x
_
, i.e. dierentiate
f
x
with respect to x, with y xed
2
f
yx
=

y
_
f
x
_
, i.e. dierentiate
f
x
with respect to y, with x xed.
Similarly
2
f
xy
=

x
_
f
y
_
,

2
f
y
2
=

y
_
f
y
_
.
It is often convenient to use the subscript notation or the operator notation:
2
f
x
2
= f
xx
= D
2
1
f,

2
f
yx
= f
xy
= D
2
D
1
f
2
f
xy
= f
yx
= D
1
D
2
f,

2
f
y
2
= f
yy
= D
2
2
f.
The subscript notation suggests that one should write the second partial derivatives
as a 2 2 matrix.
Let f : R
2
R. The Hessian matrix of f, denoted by Hf(x, y), is dened as Denition
Hessian matrix
Hf(x, y) =
_
f
xx
f
xy
f
yx
f
yy
_
.
REMARKS
1. Observe the dierence in the order of the mixed second partial derivatives for
the dierent notations.
2. We will see later that the Hessian matrix is very useful.
Section 4.2 Second Partial Derivatives 33
A function f : R
2
f(x, y) = xe
kxy
,
where k is a constant. Find all the second partial derivatives of f.
Solution: We rst calculate the rst partial derivatives. We have
f
x
= e
kxy
+ kxye
kxy
f
y
= kx
2
e
kxy
.
Thus we get
2
f
x
2
=

x
_
e
kxy
+ kxye
kxy
_
= 2kye
kxy
+ k
2
xy
2
e
kxy
,
2
f
yx
=

y
_
e
kxy
+ kxye
kxy
_
= 2kxe
kxy
+ k
2
x
2
ye
kxy
,
2
f
xy
=

x
_
kx
2
e
kxy
_
= 2kxe
kxy
+ k
2
x
2
ye
kxy
,
2
f
y
2
=

y
_
kx
2
e
kxy
_
= k
2
x
3
e
kxy
.
REMARK
In the previous example, observe that
2
f
xy
=

2
f
yx
.
This is in fact a general property of partial derivatives, subject to a continuity require-
ment, as follows.
Let f : R
2
R. If f
xy
and f
yx
are dened in some neighborhood of a and f
xy
and f
yx
Theorem 1
are continuous at a then
f
xy
(a) = f
yx
(a).
Proof: The proof is rather technical, and is thus omitted.
Verify that f(x, y) = ln(x
2
+ y
2
) satises EXERCISE 5
f
xx
+ f
yy
= 0, for (x, y) = (0, 0).
Verify that f(x, y) = x
y
satises EXERCISE 6
f
xy
= f
yx
, for x > 0.
Higher-order partial derivatives
Of course, we can take higher-order partial derivatives in the expected way. In partic-
ular, observe that f : R
2
R has 8 third partial derivatives. They are
f
xxx
, f
xxy
, f
xyx
, f
xyy
, f
yxx
, f
yxy
, f
yyx
, f
yyy
.
Not surprisingly, given Theorem 1, we get that if they are continuous, then the higher-
order partial derivatives are equal regardless of the order the partial derivatives are
taken. For example,
f
xxy
= f
xyx
= f
yxx
.
For many situations, we will want to require that a function have continuous partial
derivatives of some order. Thus, we introduce some notation for this.
If the k-th partial derivatives of f : R
n
R are continuous, then we write Notation
f C
k
and say f is in class C
k
.
So, f : R
2
R, f C
2
means that f has continuous second partial derivatives, and
therefore, by Theorem 1, we have that f
xy
= f
yx
.
Section 4.3 The Tangent Plane 35
4.3 The Tangent Plane
The surface of a sphere has a tangent plane at each point P,
namely the plane through P that is orthogonal to the line
joining P and the centre O. The tangent plane at P can be
thought of as the plane which best approximates the surface
of the sphere near P.
P
O
This concept can be generalized to a surface dened by an equation of the form
z = f(x, y). (4.2)
Let C
1
be the cross-section y = b of the surface, that is, C
1
is given by
z = f(x, b).
It follows that
f
x
(a, b) equals the slope of the tangent
line L
1
of C
1
at the point P(a, b, f(a, b)). A similar inter-
pretation holds for
f
y
(a, b) in terms of the cross-section
z = f(a, y).
We provisionally dene the tangent plane to the surface
(4.2) at the point P(a, b, f(a, b)) to be the plane which
contains the tangent lines L
1
and L
2
(refer to the gure).
x
y
z
P
L
1
L
2
C
1
C
2
z = f(x, y)
(a, b)
In order to derive the equation of the tangent plane, we note that any (non-vertical)
plane through the point P(a, b, f(a, b)) has an equation of the form
z = f(a, b) + m(x a) + n(y b),
where m and n are constants. The intercept of this plane with the vertical plane y = b
is the line
z = f(a, b) + m(x a) (4.3)
We require this line to coincide with L
1
. Thus the slope m of the line (4.3) must equal
the slope
f
x
(a, b) of the line L
1
:
m =
f
x
(a, b).
A similar argument yields
n =
f
y
(a, b).
Thus, we make the following denition which we will formalize in Chapter 5.
The tangent plane to z = f(x, y) at the point (a, b, f(a, b)) is
z = f(a, b) +
f
x
(a, b)(x a) +
f
y
(a, b)(y b).
The graph of the function EXERCISE 7
f(x, y) =
_
x
2
+ y
2
is the cone z =
_
x
2
+ y
2
. Find the equation of the tangent plane at the point (3, 4, 5).
Show that the tangent plane at any point on the cone in Exercise 1 passes through the EXERCISE 8
origin.
REMARK
In Exercise 8, you should note that a tangent plane does not exist at the vertex
(0, 0, 0) of the cone, since the cone is not smooth there. We shall discuss the question
of the existence of a tangent plane in Chapter 5.
4.4 Linear Approximation for z = f(x, y)
Review of the 1-D case
For a function f : R R the tangent line can be used to approximate the graph of
the function near the point of tangency. Recall that the equation of the tangent line
to y = f(x) at the point (a, f(a)) is
y = f(a) + f
(a)(x a).
The function L
a
: R R dened by
L
a
(x) = f(a) + f
(a)(x a)
is called the linear approximation of f at x = a since L
a
(x) approximates f(x) for
x suciently close to a.
We express this linear approximation formula as
f(x) L
a
(x), for x suciently close to a.
Section 4.4 Linear Approximation for z = f(x, y) 37
The quantier on x is essential in order for the approximation to be reasonable.
Verify each approximation: EXERCISE 9
i) sin x x, for x suciently close to 0,
ii)
1 + x 1 +
1
2
x, for x suciently close to 0,
iii) ln x (x 1), for x suciently close to 1.
The 2-D case
For a function f : R
2
R, the tangent plane can be used to approximate the surface
z = f(x, y) near the point of tangency.
Let f : R
2
R. We dene the linear approximation L
(a,b)
(x, y) of f at (a, b) by Denition
linear
approximation
L
(a,b)
(x, y) = f(a, b) +
f
x
(a, b)(x a) +
f
y
(a, b)(y b).
As in the 1-D case, we have that L
(a,b)
(x, y)
approximates f(x, y) for (x, y) suciently close
to (a, b). We express this linear approximation
formula as
f(x, y) L
(a,b)
(x, y),
for (x, y) suciently close to (a, b).
x y
z
P
Q
(x, y)
z = f(x, y)
z = L
(a,b)
(x, y)
(a, b)
a, b, f(a, b)
Use the linear approximation to approximate

_
(0.95)
3
+ (1.98)
3
. EXAMPLE 5
Solution: A choice of function and point of tangency must be made. Let
f(x, y) =
_
x
3
+ y
3
, and (a, b) = (1, 2).
The partial derivatives of f are
f
x
=
3x
2
2
_
x
3
+ y
3
,
f
y
=
3y
2
2
_
x
3
+ y
3
.
Thus,
L
(1,2)
(x, y) = f(1, 2) + f
x
(1, 2)(x 1) + f
y
(1, 2)(y 2)
= 3 +
1
2
(x 1) + 2(y 2). (4.4)
The linear approximation formula becomes
_
x
3
+ y
3
3 +
1
2
(x 1) + 2(y 2),
for (x, y) suciently close to (1, 2). Evaluate for (x, y) = (0.95, 1.98):
_
(0.95)
3
+ (1.98)
3
3 +
1
2
(0.05) + 2(0.02) = 2.935.
The calculator value is 2.935943.
REMARK
Resist the temptation to expand the brackets and simplify in equation (4.4). The
bracketed terms represent small increments, and it is helpful to keep them separate.
Calculate
_
sin
_
1
10
_
+ tan
_
3
4
_
approximately. Compare your answer with the value EXERCISE 10
from a calculator.
Hint: Choose the point of tangency so that the increments in x and y do not exceed
1
10
. Use the approximate value 3.14 for .
Verify each approximation: EXERCISE 11
i)
xy
x + y

6
5
+
9
25
(x 2) +
4
25
(y 3), for (x, y) suciently close to (2, 3)
ii) ln(x
2
+ y) 2(x 1) + y, for (x, y) suciently close to (1, 0)
iii) e
3x2y
1 + 3x 2y, for (x, y) suciently close to (0, 0).
Increment form of the linear approximation
Let f : R
2
R. Suppose that we know f(a, b) and want to calculate f(x, y) at a
nearby point. Let
x = x a, y = y b,
Section 4.5 Linear Approximation in Higher Dimensions 39
and
f = f(x, y) f(a, b).
The linear approximation formula is
f(x, y) f(a, b) +
f
x
(a, b)(x a) +
f
y
(a, b)(y b)
for (x, y) suciently close to (a, b). This can be rearranged to yield
f
f
x
(a, b)x +
f
y
(a, b)y, (4.5)
for x, y suciently close to zero. This gives an approximation for the change f in
f(x, y) due to a change (x, y) away from the point (a, b). We shall refer to equation
(4.5) as the increment form of the linear approximation formula.
An isosceles triangle has base 4 m, and equal angles of

4
. If the base is increased by EXERCISE 12
16 cm, and the equal angles are decreased by 0.1 radians, estimate the change in area.
4.5 Linear Approximation in Higher Dimensions
3
R. By analogy with the case of a function of two
variables, we dene the linear approximation of f at a by
L
a
(x) = f(a) + f
x
(a)(x a) + f
y
(a)(y b) + f
z
(a)(z c).
The notation is becoming cumbersome, but one can improve matters by noting that
the nal three terms can be represented by the dot product of the vectors
x a = (x a, y b, z c), and (f
x
(a), f
y
(a), f
z
(a)) .
The second vector is called the gradient of f at a, denoted f(a). Here are the formal
denitions.
Suppose that f : R
3
R has partial derivatives at a. The gradient of f at a is Denition
gradient dened by
f(a) = (f
x
(a), f
y
(a), f
z
(a)) .
Suppose that f : R
3
R has partial derivatives at a. The linear approximation of Denition
linear
approximation
f at a is dened by
L
a
(x) = f(a) +f(a) (x a). (4.6)
The linear approximation formula for f : R
3
R is expressed as
f(x) f(a) +f(a) (x a), (4.7)
for all x suciently close to a.
The function f : R
3
f(x, y, z) =
_
x
2
+ y
2
+ z
2
.
Find the gradient of f and the linear approximation for f at a = (1, 2, 2).
Solution: Dierentiate to obtain
f(x) =
_
x
_
x
2
+ y
2
+ z
2
,
y
_
x
2
+ y
2
+ z
2
,
z
_
x
2
+ y
2
+ z
2
_
.
Evaluate at a = (1, 2, 2) to get
f(a) =
_
1
3
,
2
3
,
2
3
_
.
Thus
L
a
(x) = f(a) +f(a) (x a)
= 3 +
1
3
(x 1) +
2
3
(y 2)
2
3
(z + 2).
The linear approximation formula for f at (1, 2, 2) is
_
x
2
+ y
2
+ z
2
3 +
1
3
(x 1) +
2
3
(y 2)
2
3
(z + 2),
for (x, y, z) suciently close to (1, 2, 2).
Section 4.5 Linear Approximation in Higher Dimensions 41
Use the linear approximation to estimate 4.99 7.01 9.99. Compare your answer to EXERCISE 13
the calculator value.
Generalization
The advantage of using vector notation is that equations (4.6) and (4.7) hold without
change for a function of n variables, f : R
n
R. For arbitrary n,
x a = (x
1
a
1
, x
2
a
2
, . . . , x
n
a
n
),
and
f(a) = (D
1
f(a), D
2
f(a), , D
n
f(a)) .
The increment form of the linear approximation formula has the form
f f(a) x,
where x = (x
1
, x
2
, . . . , x
n
) must be suciently close to zero.
Observe that this gives us that if g : R R then g(a) = g
(a) and the increment

form of the linear approximation is
g g(a) x = g
(a)(x a),
which is our familiar formula from Calculus 1.
For f : R
2
R we have f(a) = (f
x
(a), f
y
(a)) and the increment form of the linear
approximation is
f f(a) x = f
x
(a)(x a) + f
y
(a)(y b),
which matches our work above.
Chapter 5
Dierentiable Functions
5.1 Denition of Dierentiability
For f : R
2
R, the linear approximation formula is
f(x) L
a
(x), (5.1)
for x suciently close to a, where
L
a
(x) = f(a) +f(a) (x a).
When making an approximation, it is important to ask: how large is the error?
The error in the linear approximation formula (5.1) is dened by
R
1,a
(x) = f(x) L
a
(x).
We are interested in how large R
1,a
(x) is compared to the magnitude of the displace-
ment, x a.
1-D Case
In order to gain insight, we consider a function of one variable, f : R R. In this case
L
a
(x) = f(a) + f
(a)(x a), (5.2)

the error is
R
1,a
(x) = f(x) L
a
(x), (5.3)
Section 5.1 Denition of Dierentiability 43
and the magnitude of the displacement is |x a|. The following theorem shows that
the error R
1,a
(x) tends to zero faster than the displacement.
If f : R R and f
(a) exists, then lim

xa
|R
1,a
(x)|
|xa|
= 0. Theorem
Proof: By equations (5.2) and (5.3),
|R
1,a
(x)|
|x a|
=
f(x) f(a) f
(a)(x a)
x a
f(x) f(a)
x a
f
(a)
.
The result follows, since the hypothesis and the denition of derivative imply that
lim
xa
f(x) f(a)
x a
= f
(a).
It can be shown that if one replaces the tangent line y = L
a
(x) by any other straight
line y = f(a) + m(x a) through the point (a, f(a)), the error will not satisfy the
conclusion of the theorem. Thus the property
lim
xa
|R
1,a
(x)|
|x a|
= 0
characterizes the tangent line at (a, f(a)) as the best straight line approximation to
the graph y = f(x) near (a, f(a)).
2-D Case
The situation is dierent for a function of two variables whose partial derivatives
exist at a. In general, it does not follow that the error tends to zero faster than
the displacement. For example,
lim
xa
|R
1,a
(x)|
x a
= 0
is not valid (see Example 1 to follow). Since this is a desirable property, we incorporate
it into a denition.
A function f : R
2
R is dierentiable at a = (a, b) if and only if there is a linear Denition
dierentiable function L(x) = f(a, b) + c(x a) + d(y b) such that
lim
xa
|R
1,a
(x)|
x a
= 0,
where
R
1,a
(x) = f(x) L(x).
44 Chapter 5 Dierentiable Functions
If f : R
2
R is dierentiable at a = (a, b) with linear function Theorem 1
L(x) = f(a, b) + c(x a) + d(y b),
then L(x) is the linear approximation of f at a. That is, c = f
x
(a, b) and d = f
y
(a, b).
Proof: Since f is dierentiable at a we have
lim
xa
|R
1,a
(x)|
x a
= 0
Hence, the limit is 0 along any path. Consider the path along y = b. Then we get
0 = lim
xa
|f(x, b) f(a, b) c(x a) d(b b)|
(x, b) (a, b)
= lim
xa
|f(x, b) f(a, b) c(x a)|
|x a|
= lim
xa
f(x, b) f(a, b)
x a
c
= f
x
(a, b) c
c = f
x
(a, b)
Similarly, approaching along x = a we get that d = f
y
(a, b).
REMARKS
1. Theorem 1, tells us that L(x) in the denition of dierentiability is always the
linear approximation. Hence, to prove a function is dierentiable at a point we
need to prove that
lim
xa
|R
1,a
(x)|
x a
= 0,
where
R
1,a
(x) = f(x) L
a
(x).
2. Observe that for the linear approximation to exist at a both partial derivatives of
f must exist at a. However, both partial derivatives existing does not guarantee
that f will be dierentiable. We say that the partial derivatives of f existing at
a is necessary but not sucient.
3. Like the 1-D case, Theorem 1 proves that the tangent plane is the best linear
approximation to the graph z = f(x, y) near a. Moreover, it tells us that the
linear approximation is a good approximation if and only if f is dierentiable at
a.
Section 5.1 Denition of Dierentiability 45
Let f : R
2
f(x, y) =
_
|xy|.
Determine whether f is dierentiable at (0, 0).
Solution: We rst need to nd L
(0,0)
(x, y) hence we need to nd the partial derivatives
at (0, 0). We have
lim
h0
f(h, 0) f(0, 0)
h
= lim
h0
0 0
h
= 0.
Hence by denition
f
x
(0, 0) = 0.
By symmetry,
f
y
(0, 0) = 0.
So, both partial derivatives exist at (0, 0) and, since f(0, 0) = 0, the linear approxima-
tion is
L
(0,0)
(x) = 0.
The error is
R
1,(0,0)
(x) = f(x) L
(0,0)
(x) =
_
|xy|,
and the magnitude of the displacement is x (0, 0) =
_
x
2
+ y
2
. Hence
|R
1,(0,0)
(x)|
x (0, 0)
=
_
|xy|
_
x
2
+ y
2
, for (x, y) = (0, 0).
We must determine whether
lim
(x,y)(0,0)
_
|xy|
_
x
2
+ y
2
= 0. (5.4)
Along the line y = x,
lim
(x,y)(0,0)
_
|xy|
_
x
2
+ y
2
= lim
x0
_
|x|
2
2x
2
= lim
x0
|x|
2|x|
=
1
2
,
so that
lim
x0
|R
1,0
(x)|
x 0
=
1
2
= 0.
It follows that (5.4) is false. Thus, by denition, the given function f is not dieren-
tiable at (0, 0).
Observe that in this example we have that the partial derivatives at (0, 0) both exist,
but
lim
x0
|R
1,0
(x)|
x 0
= 0,
so the plane z = L
0
(x) = 0 does not give a good approximation to the surface z =
_
|xy| near the origin. This can be explained geometrically. The vertical plane y = x
intersects the surface z =
_
|xy| in the curve z = |x|, which has a corner at x = 0, and
hence no tangent line. This means that the surface is not smooth at (0, 0, 0), and
hence the plane z = L
0
(x) = 0 cannot be interpreted as a tangent plane.
The function f : R
2
f(x, y) =
_
_
_
x
3
x
2
+y
2
if (x, y) = (0, 0)
0 if (x, y) = (0, 0).
Prove that f is not dierentiable at (0, 0).
The function f : R
2
R is dened by f(x, y) = |xy|. EXERCISE 2
Prove that f is dierentiable at (0, 0).
Refer to the function f in Exercise 2. Prove that f is not dierentiable at (0, 1). EXERCISE 3
We can now give a formal denition of the tangent plane of z = f(x, y).
2
R which is dierentiable at (a, b). The tangent plane Denition
tangent plane of the surface z = f(x, y) at (a, b, f(a, b)) is the graph of the linear approximation, i.e.
the plane given by
z = f(a, b) +
f
x
(a, b)(x a) +
f
y
(a, b)(y b).
Section 5.2 Dierentiability and Continuity 47
REMARK
Since f is assumed to be dierentiable at (a, b), the error in approximating the
surface by the tangent plane tends to zero faster than the displacement x a. We
have shown that no other plane through the point (a, b, f(a, b)) has this property.
Thus, the tangent plane is the plane that best approximates the surface near the point
(a, b, f(a, b)). In this case, we say that at the point (a, b, f(a, b)) the surface z = f(x, y)
is smooth.
Invent a function f : R
2
R whose graph z = f(x, y) is not smooth at (1, 2, f(1, 2)). EXERCISE 4
That is, invent a function which is not dierentiable at (1, 2).
5.2 Dierentiability and Continuity
Recall that for f : R R, if f
(a) exists then f is continuous at a. We now show that a

function f : R
2
R can fail to be continuous at a point a even if the partial derivatives
exist at a. We then prove that the stronger condition that f being dierentiable at a
implies that f is continuous at a.
Consider f : R
2
f(x, y) =
_
_
_
xy
x
2
+y
2
if (x, y) = (0, 0)
0 if (x, y) = (0, 0).
Prove that
f
x
(0, 0) = 0 =
f
y
(0, 0), but that f is not continuous at (0, 0).
Solution: We have
f
x
(0, 0) = lim
h0
f(h, 0) f(0, 0)
h
= 0,
and so
f
y
(0, 0) = 0 by symmetry. Thus, both partial derivatives exist.
However, on the line y = x,
f(x, x) =
x
2
x
2
+ x
2
=
1
2
, for all x = 0,
hence lim
(x,y)(0,0)
f(x, y) = f(0, 0) and so f is not continuous at (0, 0).
REMARK
When investigating lim
(x,y)(0,0)
f(x, y) in Example 2, it is not necessary to consider
the limit along all straight lines which approach (0, 0). It is sucient to show, as we
did, that f(x, y) does not tend to f(0, 0) along one straight line which approaches
(0, 0).
Let f : R
2
R. If f is dierentiable at a, then f is continuous at a. Theorem 2
Proof: The error R
1,a
(x) is dened by
R
1,a
(x) = f(x) L
a
(x).
On using the denition of L
a
(x), this equation can be rearranged to read
f(x) = f(a) +f(a) (x a) + R
1,a
(x). (5.5)
We can write
R
1,a
(x) =
R
1,a
(x)
x a
x a, for x = a.
Since f is dierentiable and by the limit theorems, we get
lim
xa
R
1,a
(x) = 0.
It now follows from equation (5.5) that
lim
xa
f(x) = f(a) + 0 + 0 = f(a),
and so by denition, f is continuous at a.
Suppose that f : R
2
R is not continuous at a. Can you draw a conclusion about EXERCISE 5
whether f is dierentiable at a?
Give an example of a function f : R
2
R which is continuous but not dierentiable EXERCISE 6
at a. This shows that the converse of Theorem 1 is not true.
Hint: Look at the example and exercises in Section 5.1, or invent your own example.
Section 5.3 Continuous Partial Derivatives and Dierentiability 49
5.3 Continuous Partial Derivatives and
Dierentiability
We need an ecient way of proving that a given function f is dierentiable at a typical
point. In this section, we present a theorem for this purpose, which states that if the
partial derivatives of f : R
2
R are continuous at a, then f is dierentiable at a.
First, let us explain the idea of continuous partial derivatives. It is important to make
a distinction in your mind between:
f
x
(a, b), which denotes the value of the partial derivative of f with respect to
x, evaluated at the point (a, b)
and
f
x
, which denotes the function which maps (a, b)
f
x
(a, b)
Thus, if f is a function which maps R
2
R, then both partial derivatives
f
x
and
f
y
are functions which map R
2
R. For example, if f(x, y) = e
xy
, then f : R
2
R is
the function which maps
(x, y) e
xy
.
On dierentiating,
f
x
(x, y) = ye
xy
.
Thus,
f
x
: R
2
R is the function which maps
(x, y) ye
xy
.
On the other hand,
f
x
(1, 2) = 2e
2
is a number. This distinction is important in
connection with the requirement that the partial derivatives of f be continuous. In
terms of the denition of continuous function, we can state that the partial derivative
f
x
is continuous at (a, b) if and only if
lim
(x,y)(a,b)
f
x
(x, y) =
f
x
(a, b).
In writing this statement, we are assuming that
f
x
is dened in some neighbourhood
of (a, b).
Let f : R
2
R. If
f
x
and
f
y
are continuous at a then f is dierentiable at a. Theorem 3
The proof of the theorem is based on the Mean Value Theorem of single variable
calculus which we now review.
Mean Value Theorem: If f : R R and f
(t) is dened on the closed interval

[t
1
, t
2
] then there exists t (t
1
, t
2
) such that
f(t
2
) f(t
1
) = f
(t)(t
2
t
1
)
Proof of Theorem 3: We derive an expression for the error R
1,a
(x), given by
R
1,a
(x) = f(x, y) f(a, b) f
x
(a, b)(x a) f
y
(a, b)(y b). (5.6)
Since f
x
and f
y
are continuous then f
x
and f
y
exist in some neighbourhood B(a). For
(x, y) B(a), we write
f(x, y) f(a, b) =
_
f(x, y) f(a, y)
_
+
_
f(a, y) f(a, b)
_
, (5.7)
by adding and subtracting f(a, y). The Mean Value Theorem can be applied to each
bracket, since one variable is held xed, and the partial derivatives are assumed to
exist. For the rst bracket:
f(x, y) f(a, y) = f
x
(x, y)(x a),
where x lies between a and x. By adding and subtracting f
x
(a, b)(x a), we obtain
f(x, y) f(a, y) = f
x
(a, b)(x a) + A(x a), (5.8)
where
A = f
x
(x, y) f
x
(a, b). (5.9)
Similarly for the second bracket:
f(a, y) f(a, b) = f
y
(a, y)(y b)
= f
y
(a, b)(y b) + B(y b), (5.10)
where
B = f
y
(a, y) f
y
(a, b) (5.11)
and y lies between b and y.
Section 5.3 Continuous Partial Derivatives and Dierentiability 51
Substitute equations (5.8) and (5.10) into (5.7) and then substitute equation (5.7) into
(5.6), to obtain
R
1,a
(x) = A(x a) + B(y b),
where A and B are given by equations (5.9) and (5.11). It follows by the triangle
inequality that
0
|R
1,a
(x)|
x a

|A||x a|
_
(x a)
2
+ (y b)
2
+
|B||y b|
_
(x a)
2
+ (y b)
2
|A| +|B|. (5.12)
We can now apply the Squeeze Theorem.
As (x, y) (a, b), it follows that
(x, y) (a, b) and (a, y) (a, b).
Since f
x
and f
y
are continuous at (a, b), it follows from equations (5.9) and (5.11) that
lim
(x,y)(a,b)
A = 0 and lim
(x,y)(a,b)
B = 0.
Equation (5.12) and the Squeeze Theorem now imply
lim
(x,y)(a,b)
|R
1,a
(x)|
x a
= 0,
so that f is dierentiable at a, by denition.
We now apply Theorem 3 to investigate the dierentiability of a given function.
The function f : R
2
f(x, y) = (x
2
+ y
2
)
2/3
.
Determine at what points f is dierentiable.
Solution: By dierentiation
f
x
=
4x
3(x
2
+ y
2
)
1/3
, for (x, y) = (0, 0).
By inspection, using the continuity theorems,
f
x
is continuous for all (x, y) = (0, 0).
By symmetry, the same conclusion holds for
f
y
. It follows from Theorem 3 that f is
dierentiable for all (x, y) = (0, 0).
At the point (0, 0), it is not clear whether the partial derivatives exist and one has to use
the denition of partial derivative. Then one has to use the denition of dierentiable
function, as in Example 1 in Section 5.1. The conclusion is that f is dierentiable at
(0, 0).
Referring to Example 3, prove that f is dierentiable at (0, 0). EXERCISE 7
Let f : R
2
R. Prove that if f C
2
at (a, b), then f is continuous at (a, b). EXERCISE 8
Summary
Theorem 3 makes it easy to prove that a function f is dierentiable at a typical point.
One simply dierentiates f to obtain the partial derivatives f
x
, f
y
, and then you check
that the partials are continuous functions by inspection, referring to the continuity
theorems, as in Section 3.2. It is only necessary to use the denition of a dierentiable
function at an exceptional point.
Generalization
The denition of a dierentiable function and theorems 1 and 2 are valid for functions
of n variables f : R
n
R. The only change is that there are n partial derivatives,
f
x
1
,
f
x
2
, ,
f
x
n
.
5.4 The Linear Approximation Revisited
The error in the linear approximation is dened by
R
1,a
(x) = f(x) L
a
(x),
where
L
a
(x) = f(a) +f(a) (x a).
It is convenient to rearrange the denition of R
1,a
(x) to read
f(x) = f(a) +f(a) (x a) + R
1,a
(x). (5.13)
Section 5.4 The Linear Approximation Revisited 53
The approximation formula
f(x) f(a) +f(a) (x a) (5.14)
for x suciently close to a, arises if one neglects the error term. In general, one has
no information about R
1,a
(x), and so it is not clear whether the approximation is
reasonable. However, Theorem 2 provides an important piece of information about
R
1,a
(x), namely that if the partial derivatives of f are continuous at a, then
lim
xa
|R
1,a
(x)|
x a
= 0,
i.e. the error tends to zero faster than the displacement. In this case, the approximation
(5.14) is reasonable for x suciently close to a, and we say that L
a
(x) is a good
approximation of f(x) near a.
Discuss the validity of the approximation EXAMPLE 4
(xy)
1/3
2 +
1
3
(x 2) +
1
6
(y 4).
Solution: Let f(x, y) = (xy)
1/3
. By dierentiation,
f(x, y) =
_
1
3
x
2
3
y
1
3
,
1
3
x
1
3
y
2
3
_
,
so f(2, 4) =
_
1
3
,
1
6
_
. With a = (2, 4), equation (5.13) becomes
(xy)
1
3
= 2 +
1
3
(x 2) +
1
6
(y 4) + R
1,(2,4)
(x).
Using the continuity theorems we see that f has continuous partials at the point (2, 4).
Theorem 2 implies that
lim
(x,y)(2,4)
R
1,(2,4)
(x)
_
(x 2)
2
+ (y 4)
2
= 0.
It follows that for (x, y) suciently close to (2, 4), we may neglect R
1,(2,4)
(x). Thus,
(xy)
1
3
2 +
1
3
(x 2) +
1
6
(y 4)
gives a good approximation for (x, y) suciently close to (2, 4).
Discuss the validity of the approximation EXERCISE 9
_
1 + 3 tanx + sin y 2 +
3
2
_
x

4
_
+
1
4
y.
Note that approximation is a recurring theme in calculus, and the equation
f(x) = f(a) +f(a) (x a) + R
1,a
(x)
is of fundamental importance. In Chapter 8 we shall nd out more about the error
term R
1,a
(x) in terms of the second partial derivatives.
Chapter 6
The Chain Rule
6.1 Basic Chain Rule in Two Dimensions
Review of the Chain Rule for f(x(t))
Let T = f(x) be the temperature of a heated metal rod as a function of the position
x. An ant runs on the rod, with its position given by x = x(t) as a function of time t.
Find an expression for the time rate of change of temperature as experienced by the
ant.
We have
T = f(x), x = x(t).
The Chain Rule states that
dT
dt
=
dT
dx
dx
dt
(6.1)

T as a composite of t T as a function of x
This commonly used Leibniz form of the Chain Rule involves an abuse of notation,
since T is used in two dierent contexts. A precise statement is
d
dt
f(x(t)) = f
(x(t))x
(t). (6.2)
Alternatively one can dene the composite function T : R R by
T(t) = f(x(t))
56 Chapter 6 The Chain Rule
and write
T
(t) = f
(x(t))x
(t).
Note that f
(x(t)) is the derivative of the function f : R R evaluated at x(t). It is

essential in what follows to understand these dierent ways of writing the 1-D chain
rule.
The Chain Rule for f(x(t), y(t))
In order to provide a physical context, suppose that the sur-
face temperature of a pond is T = f(x, y), as a function of
position (x, y). A duck swims on the pond, with its position
given by
x = x(t), y = y(t),
as a function of time t. Find an expression for the time rate
of change of temperature as experienced by the duck.
x
y
path of duck
_
x(t), y(t)
_
We have
T = f(x, y), x = x(t), y = y(t),
so that the temperature experienced by the duck depends on time t. In a time change
t, x and y change by
x = x(t + t) x(t), y = y(t + t) y(t).
By the increment form of the linear approximation formula, the change in T corre-
sponding to changes x and y is approximated by
T
T
x
x +
T
y
y
for x and y suciently small. Divide by t, let t 0, and use the denition of
derivative to get
dT
dt
on the left side of the equation. Assuming that T is dierentiable
at (x, y), then as x and y 0, the error in the linear approximation tends to zero,
and so the approximation becomes increasingly accurate, leading to
dT
dt
=
T
x
dx
dt
+
T
y
dy
dt
(6.3)

T as a composite T as a function of x and y
function of t
Section 6.1 Basic Chain Rule in Two Dimensions 57
This is the simplest example of the Chain Rule in two dimensions, and should be
compared with equation (6.1). A precise form of equation (6.3), which avoids abuse of
notation, is
d
dt
f
_
x(t), y(t)
_
= f
x
_
x(t), y(t)
_
x
(t) + f
y
_
x(t), y(t)
_
y
(t) (6.4)
which should be compared with equation (6.2). Alternatively, dene the composite
function T : R R by
T(t) = f(x(t), y(t))
and write
T
(t) = f
x
(x(t), y(t))x
(t) + f
y
(x(t), y(t))y
(t) (6.5)
Note that f
x
(x(t), y(t)) is the partial derivative of the function f : R
2
R with
respect to x, evaluated at (x(t), y(t)). In order to be able to apply the Chain Rule, it
is important to study and understand both forms (6.3) and (6.4)/(6.5).
REMARK
The preceding derivation is intended to make the Chain Rule plausible, but is
NOT a proof. The diculty lies in the approximation sign . This can be remedied
by keeping track of the error in the linear approximation, and leads to a proof. Note
that a hypothesis on the function f, stronger than existence of the partial derivatives,
is required.
Chain Rule Theorem 1
Given f : R
2
R, x : R R, and y : R R, let G(t) = f(x(t), y(t)), and let
a = x(t
0
) and b = y(t
0
). If f is dierentiable at (a, b) and x
(t
0
) and y
(t
0
) exist, then
G
(t
0
) exists and is given by
G
(t
0
) = f
x
(a, b)x
(t
0
) + f
y
(a, b)y
(t
0
).
Proof: By denition of the derivative,
G
(t
0
) = lim
tt
0
G(t) G(t
0
)
t t
0
, (6.6)
provided that this limit exists. By denition of G(t),
G(t) G(t
0
) = f(x(t), y(t)) f(x(t
0
), y(t
0
)). (6.7)
Since f is dierentiable we can write
f(x, y) = f(a, b) + f
x
(a, b)(x a) + f
y
(a, b)(y b) + R
1,a
(x, y), (6.8)
where
lim
(x,y)(a,b)
|R
1,a
(x, y)|
_
(x a)
2
+ (y b)
2
= 0. (6.9)
Since a = x(t
0
), b = y(t
0
), it follows from equations (6.7) and (6.8) that
G(t) G(t
0
)
t t
0
= f
x
(a, b)
_
x(t) x(t
0
)
t t
0
_
+ f
y
(a, b)
_
y(t) y(t
0
)
t t
0
_
+
R
1,a
(x(t), y(t))
t t
0
(6.10)
You can now see the Chain Rule taking shape. We have to prove that
lim
tt
0
|R
1,a
(x(t), y(t))|
|t t
0
|
= 0
Dene E : R
2
R by
E(x, y) =
_
_
R
1,a
(x,y)
(xa)
2
+(yb)
2
, if (x, y) = (a, b)
0 if (x, y) = (a, b).
By equation (6.9) and the denition of continuity, E is continuous at (a, b).
From the denition of E,
R
1,a
(x, y) = E(x, y)
_
(x a)
2
+ (y b)
2
, for all (x, y)
Since a = x(t
0
), and b = y(t
0
),
R
1,a
_
x(t), y(t)
_
|t t
0
|
=
E
_
x(t), y(t)
_
_
x(t) x(t
0
)
t t
0
_
2
+
_
y(t) y(t
0
)
t t
0
_
2
Since x
(t
0
) and y
(t
0
) exist and the fact that E is continuous at (a, b) we get
lim
tt
0
R
1,a
_
x(t), y(t)
_
|t t
0
|
= E
_
x(t
0
), y(t
0
)
_
_
[x
(t
0
)]
2
+ [y
(t
0
)]
2
= 0,
since E(a, b) = 0.
It now follows from equation (6.6) and (6.10) that G
(t
0
) exists, and is given by the
desired chain rule formula.
REMARK
When rst studying the Chain Rule you might think that hypothesis that f is
dierentiable could be replaced by the weaker hypothesis that f
x
(a, b) and f
y
(a, b)
exist. Exercise 1 shows that this is not the case.
With reference to the theorem, let EXERCISE 1
f(x, y) = (xy)
1
3
, x(t) = t, y(t) = t
2
.
Find G(t) = f(x(t), y(t)) and hence show that G
(0) = 1. Further show that f

x
(0, 0) =
0 and f
y
(0, 0) = 0, so that the Chain Rule fails. Draw a conclusion about f at (0, 0).
REMARK
In practice it is convenient to use stronger hypotheses in the Chain Rule. In par-
ticular, that f has continuous partial derivatives at (a, b) and x
(t) and y
(t) are both

continuous at t
0
. This also allows one to obtain the stronger conclusion that G
(t) is
continuous at t
0
. These hypotheses can usually be checked quickly, either by using the
Continuity Theorems, or in more theoretical situations, by using given information.
Suppose that the temperature at position (x, y) in a pond is EXAMPLE 1
T = 10e
1
10
(x
2
+y
2
)
.
The path of a duck swimming on the pond is
x = 2 cos t, y = 4 sin t
Find the rate of change of the ponds temperature as experienced by the duck at time
t =
3
4
.
Solution: Observe that T, x and y are dierentiable, hence the Chain Rule gives
dT
dt
=
T
x
dx
dt
+
T
y
dy
dt
.
Calculate
dx
dt
and
dy
dt
at t =
3
4
, obtaining
dx
dt
=
2,
dy
dt
= 2
2.
At t =
3
4
, the position of the duck is (x, y) = (
2, 2
2). Calculate
T
x
and
T
y
at
(
2, 2
2), obtaining
T
x
=
2
2
e
,
T
y
=
4
2
e
.
So, the Chain Rule gives
dT
dt
=
_
2
2
e
_
(
2) +
_
4
2
e
_
(2
2) =
12
e
degrees/unit time.
One can interpret the result geometrically in terms of the path of the duck and the
level curves of the temperature function (the isothermal curves).
The level curves T =
10
e
, T = T
1
>
10
e
and T = T
2
<
10
e
are shown. The path of the
duck is an ellipse. At time t =
3
4
, the duck is moving from the region with T <
10
e
to
the region with T =
10
e
. Hence we expect that
dT
dt
> 0.
Let EXERCISE 2
f(t) = g(1 + t
2
, 1 t
2
)
If g(2, 0) = (3, 4), nd f
(1). What condition on g will guarantee the validity of your

work?
Let EXERCISE 3
T = ln(1 + x
2
+ y
2
), x = e
t
sin t, y = 2e
t
cos t.
Calculate
dT
dt
when t = 0 in two ways, rstly by substituting x and y in T, and secondly
by evaluating
dx
dt
(0),
dy
dt
(0),
T
x
(0, 2) and
T
y
(0, 2), and applying the Chain Rule.
A dierentiable function f : R
2
R is given, and g : R R is dened by EXERCISE 4
g(t) = f(x, y),
where x = cos t and y = sin t. Write out the Chain Rule for g
(t). Calculate g
3
_
, if
f
_
1
2
,
3
2
_
= (
3, 4).
The Vector Form of the Basic Chain Rule
So far, we have seen that for a composite function formed from
T = f(x, y), and x = x(t), y = y(t),
the Chain Rule reads
dT
dt
=
T
x
dx
dt
+
T
y
dy
dt
.
Observe that the right side is the scalar product of the gradient vector
T =
_
T
x
,
T
y
_
and the vector
dx
dt
=
_
dx
dt
,
dy
dt
_
.
Thus we obtain
dT
dt
= T
dx
dt
.
Corresponding to equation (5), the precise form of the Chain Rule, we have
d
dt
f(x(t)) = f(x(t))
dx
dt
,
with x(t) = (x(t), y(t)).
In this vector form, the Chain Rule holds for any dierentiable function f : R
n
R,
e.g. T = f(x, y, z), representing temperature or some other quantity in 3-space.
The temperature at position x = (x, y, z) in the vicinity of the planet Mercury is EXERCISE 5
T = h(x, y, z) where h : R
3
R is dierentiable. If the path of a spaceship is
x = (x(t), y(t), z(t)), then write the Chain Rule for
dT
dt
.
3
R is given, and g : R R is dened by EXERCISE 6
g(t) = f(x, y, z)
where x = t, y = t
2
and z = t
3
. Write out the Chain Rule for g
(t). Find g
(1) if
f(1, 1, 1) =
_
2,
1
2
, 1
_
.
6.2 Extensions of the Basic Chain Rule
So far, we have considered composite functions formed from dierentiable functions
u = f(x, y), with x = x(t), y = y(t).
In this situation, the dierent variables are referred to as follows:
u : dependent variable
x, y : intermediate variables
t : independent variable
u
x
@
@
y
t t
The tree diagram illustrates the chain of dependence. Observe, that our chain rule
above makes sense from the point of view of rate of change. From the dependence
diagram, we clearly see that the values of u are dependent on x and y which are each
dependent on t. Thus, the rate of change of u should be the sum of the rate of change
with respect to its x-component and with respect to its y-component. The term
u
x
dx
dt
calculates the rate of change of u with respect to those ts that aect u through x.
Similarly
u
y
dy
dt
calculates the rate of change of u with respect to those ts that aect
u through y.
We now discuss the case where there is more than one independent variable.
Let
u = f(x, y), with x = x(s, t), y = y(s, t)
all be dierentiable. Then u is a composite function of two independent variables s
and t. Since u is now a function of two variables, we want to write a chain rule for
u
s
and
u
t
. We observe this is very similar to the case above. For
u
s
, the rate of
change of u with respect to those ss that aect u through x is now
u
x
x
s
, since x is
a function of two variables. Continuing this we get
u
s
=
u
x
x
s
+
u
y
y
s
(6.11)
u
t
=
u
x
x
t
+
u
y
y
t
u
x
@
@
y
A
A
A
A
s t s t
Section 6.2 Extensions of the Basic Chain Rule 63
Show that this form of the Chain Rule could also be motivated using the linear approx- EXERCISE 7
imation. Where is the condition that f(x, y), x(s, t) and y(s, t) are all dierentiable
used?
REMARK
1. It is important to understand the dierence between the various partial deriva-
tives in equations (6.11), and to know which variable is held constant. For
example
u
x
means : regard u as the given function of x and y, and
dierentiate with respect to x, holding y xed.
u
s
means : regard u as the composite function of s and t,
and dierentiate with respect to s, holding t xed.
2. Equations of the form
x = x(s, t), y = y(s, t)
can be thought of as dening a change of coordinates in 2-space.
Let z = f(x, y), where x = r cos , and y = r sin . Assuming that f is dierentiable, EXAMPLE 2
verify that
_
z
r
_
2
+
1
r
2
_
z
_
2
=
_
z
x
_
2
+
_
z
y
_
2
Solution: From the Chain Rule we obtain
z
r
=
z
x
x
r
+
z
y
y
r
=
z
x
cos +
z
y
sin
z
=
z
x
x
+
z
y
y
=
z
x
(r sin ) +
z
y
(r cos )
z
x
@
@
y
A
A
A
A
r r
Thus, we get
_
z
r
_
2
+
1
r
2
_
z
_
2
=
_
z
x
_
cos
2
+
_
z
y
_
2
sin
2
+
_
z
x
_
2
sin
2
+
_
z
y
_
2
cos
2
=
_
z
x
_
2
+
_
z
y
_
2
as required.
REMARK
In some situations (see the example to follow) it is necessary to write a more precise
form of the Chain Rule (6.11), one which displays the functional dependence.
Let g : R
2
R denote the composite function of f(x, y) and x(s, t), y(s, t):
g(s, t) = f(x(s, t), y(s, t)).
Then the rst equation in (6.11) can be written as
g
s
(s, t) =
f
x
(x(s, t), y(s, t))
x
s
(s, t) +
f
y
(x(s, t), y(s, t))
y
s
(s, t),
with a similar equation for
g
t
(s, t).
2
R is given, with f(2, 0) = (2, 3). Let g(x, y) = EXAMPLE 3
f(2xy, x
2
y
2
). Calculate
g
x
(1, 1).
Solution: We solve the problem by labeling the intermediate variables 2xy and x
2
y
2
by u and v. We have
w = g(x, y) = f(u, v), with u = 2xy and v = x
2
y
2
.
The Chain Rule reads:
g
x
(x, y) =
f
u
(u, v)
u
x
(x, y) +
f
v
(u, v)
v
x
(x, y)
= 2y
f
u
(u, v) + 2x
f
v
(u, v)
w
u
@
@
v
A
A
A
A
x y x y
Section 6.2 Extensions of the Basic Chain Rule 65
When (x, y) = (1, 1), it follows that (u, v) = (2, 0), and we obtain
g
x
(1, 1) = 10.
Referring to Example 2, calculate
g
y
(1, 1). EXERCISE 8
A function g : R R is dened by EXERCISE 9
g(t) = f(h(t) + t, h(t) t),
where f : R
2
R and h : R R are both dierentiable. Write the Chain Rule for
g
(t).
Referring to Exercise 9, if h(1) = 2, h
(1) = 3 and f(3, 1) = (2, 3), nd g
(1). EXERCISE 10
In the dependence diagrams in Examples 2 and 3 we see there are two paths leading
from the dependent variable to the independent variable and this gives rise to a sum of
two terms on the right side of the equation. Each path has two links (), which results
in each term being a product of two derivatives. Thus, we can use our dependence
diagrams to nd the chain rule for more complicated situations. In particular, to obtain
a chain rule from a dependence diagram we have the following algorithm.
To write the chain rule from a dependence diagram we: Algorithm
chain rule
1. Take all possible paths from the dierentiated variable to the dierentiating
variable.
2. For each link () in a given path, dierentiate the upper variable with respect
to the lower variable being careful to consider if this is a derivative or a partial
derivative. Multiply all such derivatives in that path.
3. Add the products from step 2 together to complete the chain rule.
REMARK
As we have seen above, for the chain rule to be valid, each function that we take
the (partial) derivative of must be dierentiable.
The temperature T of the water in a pond depends on position and time. Thus we EXAMPLE 4
have temperature function T = T(x, y, t). Assuming that T is dierentiable, nd the
rate of change of temperature experienced by a duck whose path is x = x(t), y = y(t).
Solution: We have
T(t) = T(x, y, t), where x = x(t), y = y(t).
We draw the dependence diagram and apply the algorithm above.
The rst path gives
T
x
dx
dt
,
the second path gives
T
y
dy
dt
,
and the third path gives
T
t
.
T
x
@
@
t
t t
y
Thus, the Chain Rule is the sum of these terms. So, we have
dT
dt
=
T
x
dx
dt
+
T
y
dy
dt
. .
+
T
t

Rate of change of tem-
perature with time, as
experienced by duck
contribution due
to movement of
duck
due to change of tem-
perature with time at a
xed position
It is essential to distinguish between:
dT
dt
: the ordinary derivative of T as a composite function of t
T
t
: the partial derivative of T as the given function of x, y, t with x, y held xed.
In order to emphasize which variables are held xed, one can write:
_
T
t
_
x,y
In order to avoid abuse of notation, i.e. using T to denote two dierent functions, one
can write
T(t) = f(x(t), y(t), t),
Section 6.3 The Chain Rule for Second Partial Derivatives 67
so that T : R R is the function which measures the temperature at that ducks
position at time t and f : R
3
R is the temperature of the water at position (x, y) at
time t. Then the Chain Rule reads
dT
dt
= f
x
_
x(t), y(t), t
_
x
(t) + f
y
_
x(t), y(t), t
_
y
(t) + f
t
_
x(t), y(t), t
_
(6.12)
or more concisely
T
(t) = f
x
x
(t) + f
y
y
(t) + f
t
.
Show that the Chain Rule (6.12) can also be derived by means of the increment form EXERCISE 11
of the linear approximation for f : R
3
R.
Let f : R
2
R be a function such that f(2, 0) = 1 and f(2, 0) = (2, 3). Let EXERCISE 12
g(x, y) = xf(2xy, x
2
y
2
). Calculate
g
x
(1, 1). What assumption do you need to make
about f?
Let u(s, t) = f(x(s, t), y(s, t), s, t). Write the Chain Rule for
u
s
, showing the functional EXERCISE 13
dependence explicitly.
6.3 The Chain Rule for Second Partial Derivatives
In some situations, it is necessary to be able to calculate second derivatives of composite
functions using the Chain Rule. One encounters this problem when working with
partial dierential equations which involve second derivatives e.g. Laplaces equation
u
xx
+ u
yy
= 0.
It also arises when working with Taylor Polynomials and in the proof of Taylors
formula (see Chapter 8).
Lets start with an example using functions of one variable.
If z = f(x) where f is dierentiable and x = e
u
, verify that EXAMPLE 5
d
2
z
du
2
= x
2
d
2
f
dx
2
+ x
df
dx
.
Solution: Observe that by composition we have z = z(u). Since f and x are dier-
entiable the Chain Rule gives
z
(u) = f
(x)x
(u) = f
(x)e
u
.
To calculate z
(u) we will need to apply the chain rule again on the function on the
right. Drawing the dependence diagram and using our algorithm for calculating the
Chain Rule we get
z
(u) =
z
(u)
x
dx
du
+
_
z
(u)
u
_
x
.
Then we have
z
(u)
x
=
f
(x)e
u
x
= f
(x)e
u
z
x
@
@
u
u
since we are holding u constant and taking the derivative with respect to x, and
_
z
(u)
u
_
x
=
_
f
(x)e
u
u
_
x
= f
(x)e
u
since we are holding x constant and taking the derivative with respect to u. Finally,
since
dx
du
= e
u
we get
z
(u) = (f
(x)e
u
)(e
u
) + f
(x)e
u
= x
2
f
(x) + xf
(x), (6.13)
as required.
REMARK
Observe, if we had substituted in x = e
u
at the beginning, we would get
z
(u) = f
(e
u
)e
u
.
Hence, taking the derivative with respect to u we would get
z
(u) =
d
du
_
f
(e
u
)
_
e
u
+ f
(e
u
)
d
du
_
e
u
_
by the product rule
=
_
f
(e
u
)
d
du
_
e
u
_
_
e
u
+ f
(e
u
)e
u
by the chain rule
=
_
f
(e
u
)e
u
_
e
u
+ f
(e
u
)e
u
,
which matches (6.13). Thus, we see that our dependence diagram algorithm not only
calculates the necessary chain rules, but also includes the necessary product rules.
If z = f(x, y) with f dierentiable where x = r cos , y = r sin , verify that EXAMPLE 6
2
z
r
2
+
1
r
z
r
+
1
r
2
2
z
2
=

2
f
x
2
+

2
f
y
2
.
What assumptions do you need to make about f?
Solution: Assuming that f is dierentiable the Chain Rule gives
z
r
= f
x
x
r
+ f
y
y
r
= f
x
cos + f
y
sin . (6.14)
In order to calculate

2
z
r
2
, we have to use the Chain Rule to dierentiate this equation
with respect to r, keeping constant.
To draw the dependence diagram, we rst write (6.14) more precisely showing the
functional dependence. It is
z
r
(r, ) = f
x
(x, y) cos + f
y
(x, y) sin.
So, we see that z
r
is dependent on x, y and where, by
composition, x and y are both dependent on r and .
Thus, we get the dependence diagram to the right.
Using this, we nd that the Chain Rule is
z
r
H
H
H
x
A
A
r
y
A
A
r
z
rr
=
z
r
x
x
r
+
z
r
y
y
r
. (6.15)
Then,
z
r
x
=
_
f
x
cos + f
y
sin
_
x
=
f
x
x
cos +
f
y
x
sin since we are holding constant
= f
xx
cos + f
yx
sin .
To perform this chain rule we are taking partial derivatives of f
x
and f
y
, thus we require
that f
x
and f
y
are dierentiable. Similarly, assuming that f
x
and f
y
are dierentiable
we nd that
z
r
y
= f
xy
cos + f
yy
sin .
Putting these into (6.15) and computing
x
r
and
y
r
we nd that
z
rr
=
_
f
xx
cos + f
yx
sin
_
cos +
_
f
xy
cos + f
yy
sin
_
sin
= f
xx
cos
2
+ f
yx
sin cos + f
xy
cos sin + f
yy
sin
2
. (6.16)
We now repeat this process to nd z
. We have
z
(r, ) = f
x
(x, y)x
(r, ) + f
y
(x, y)y
(r, )
= f
x
(x, y)(r sin ) + f
y
(x, y)(r cos ).
Thus, we get the dependence diagram to the right
z
H
H
H
X
X
X
X
X
X
r x
A
A
r
y
A
A
r
and assuming that f
x
and f
y
are dierentiable the Chain Rule for z
becomes
2
z
2
=
z
x
x
+
z
y
y
+
_
z
_
x,y,r
. (6.17)
We nd that
z
x
=
_
f
x
(r sin ) + f
y
(r cos )
_
x
=
f
x
x
(r sin ) +
f
y
x
(r cos ) since r, are held constant
= f
xx
r sin + f
yx
r cos
z
y
=
f
x
y
(r sin ) +
f
y
y
(r cos ) since r, are held constant
= f
xy
r sin + f
yy
r cos
z
_
f
x
(r sin ) + f
y
(r cos )
_
= f
x
r
sin
+ f
y
r
cos
since x, y, r are held constant

= f
x
r cos f
y
r sin
Putting these into (6.17) we get
f
=
_
f
xx
r sin + f
yx
r cos
_
(r sin ) +
_
f
xy
r sin + f
yy
r cos
_
(r cos )
+
_
f
x
r cos f
y
r sin
_
= f
xx
r
2
sin
2
f
yx
r
2
cos sin f
xy
r
2
sin cos + f
yy
r
2
cos
2
+
_
f
x
r cos f
y
r sin
_
(6.18)
Using (6.14), (6.16) and (6.18) we get
2
z
r
2
+
1
r
z
r
+
1
r
2
2
z
2
=
_
f
xx
cos
2
+ f
yx
sin cos + f
xy
cos sin + f
yy
sin
2
_
+
1
r
_
f
x
cos + f
y
sin
_
+
1
r
2
_
f
xx
r
2
sin
2
f
yx
r
2
cos sin
f
xy
r
2
sin cos + f
yy
r
2
cos
2
+f
x
r cos f
y
r sin
_
= f
xx
(cos
2
+ sin
2
) + f
yy
(sin
2
+ cos
2
)
= f
xx
+ f
yy
as required. Our necessary assumption was that f
x
and f
y
are both dierentiable.
Let g(x, y) be a function from R
2
to R, and let f : R R be dened by EXERCISE 14
f(x) = g(x, 2x).
Verify that
f
(x) = g
xx
+ 4g
xy
+ 4g
yy
.
What assumption on g will ensure that your calculation is valid?
A function g : R R is given, and f : R
2
f(x, y) = g(xy).
Verify that
x
2
f
xx
= y
2
f
yy
.
What assumption on g will ensure that your calculation is valid?
Let f : R
2
R, f C
2
. Dene g : R R by EXERCISE 16
g(s) = f(a + hs, b + ks)
where (a, b) and (h, k) are regarded as xed. Verify that
g
(s) = f
x
(a + hs, b + ks)h + f
y
(a + hs, b + ks)k
g
(s) = f
xx
(a + hs, b + ks)h
2
+ 2f
xy
(a + hs, b + ks)hk + f
yy
(a + hs, b + ks)k
2
.
Chapter 7
Directional Derivatives and
the Gradient Vector
In this chapter we introduce the concept of the directional derivative of a function.
This leads to a geometrical interpretation of the gradient vector.
7.1 Directional Derivatives
Motivation
Let z = f(x, y) represent the height of a moun-
tain. The level curves f(x, y) = C represent the
contour lines. Suppose that a skier is at the point
P(a, b). In what direction should he move in order
to lose height as rapidly as possible?
In order to answer such a question, we have to generalize the idea of the partial
derivative. One can think of
f
x
as the rate of change of f(x, y) in the x-direction.
Our aim is to dene a derivative which gives the rate of change of f(x, y) in any
specied direction.
Section 7.1 Directional Derivatives 73
We are given a function f : R
2
R, a point a R
2
, and a unit vector u (i.e. u = 1).
Let L be the line through a in the direction u. Then L has vector equation
x = a + s u, for s R.
At points on the line L, f(x) has value f(a + s u), and this denes a function of one
variable s. Thus, the rate of change of f at a in the direction of u is just the derivative
of this function with respect to s evaluated at s = 0. Hence, we make the following
denition.
The directional derivative of f : R
2
R at a point a in the direction of a unit Denition
directional
derivative
vector u is dened by
D
u
f(a) =
d
ds
f(a + s u)
s=0
.
REMARK
Observe that this denition is assuming that f(a +s u) is dierentiable at s = 0. A
more precise denition would be
D
u
f(a) = lim
s0
f(a + s u) f(a)
s
provided the limit exists.
Find the directional derivative of f(x, y) = x
2
y
2
at the point (1, 2) in the direction EXAMPLE 1
of the vector (3, 4).
Solution: We rst observe that the vector is not a unit vector, so we must normalize
it. So,
u =
(3, 4)
(3, 4)
=
_
3
5
,
4
5
_
.
Hence,
74 Chapter 7 Directional Derivatives and the Gradient Vector
D
u
f(1, 2) =
d
ds
f
_
(1, 2) + s
_
3
5
,
4
5
__
s=0
=
d
ds
f
_
1 +
3
5
s, 2 +
4
5
s
_
s=0
=
d
ds
_
_
1 +
3
5
s
_
2
_
2 +
4
5
s
_
2
_
s=0
_
=
6
5
_
1 +
3
5
s
_
8
5
_
2 +
4
5
s
__
s=0
=
6
5

16
5
= 2
We now derive a simple formula for calculating the directional derivative, in terms of
the partial derivatives.
If f : R
2
R is dierentiable at a, then Theorem 1
D
u
f(a) = f(a) u,
where u is a unit vector.
Proof: Let a = (a, b) and u = (u
1
, u
2
). Then, since f is dierentiable at a we can
apply the chain rule to get
D
u
f(a) =
d
ds
f
_
(a, b) + s(u
1
, u
2
)
_
s=0
=
d
ds
f(a + su
1
, b + su
2
)
s=0
=
_
D
1
f(a + su
1
, b + su
2
)
d
ds
(a + su
1
) + D
2
f(a + su
1
, b + su
2
)
d
ds
(b + su
2
)
_
s=0
= [D
1
f(a + su
1
, b + su
2
)u
1
+ D
2
f(a + su
1
, b + su
2
)u
2
]
s=0
= D
1
f(a, b)u
1
+ D
2
f(a, b)u
2
= f(a, b) (u
1
, u
2
).
Section 7.1 Directional Derivatives 75
Find the directional derivative of the function f : R
2
R, dened by EXAMPLE 2
f(x, y) = 2x
3
+ 4xy
2
+ y
at the point (-1, 1) in the direction of the vector (1, 1).
Solution: We normalize the vector to get
u =
(1, 1)
(1, 1)
=
_
1
2
,
1
2
_
.
We have
f(x, y) = (6x
2
+ 4y
2
, 8xy + 1), so f(1, 1) = (10, 7).
Since f has continuous partial derivatives at (1, 1) we can apply the theorem to get
D
u
f(1, 1) = (10, 7)
_
1
2
,
1
2
_
=
3
2
.
REMARKS
1. Be careful to check the condition of the theorem before applying it. If f is not
dierentiable at a, then we must apply the denition.
2. If we choose u =

i = (1, 0) or u =

j = (0, 1), then the directional derivative is
equal to the partial derivative f
x
or f
y
respectively.
3. The denition of the directional derivative and the theorem can be extended to
higher dimensions in the expected way.
Find the directional derivative of f : R
3
f(x, y, z) = e
xyz
at the point (1, 1, 2) in the direction of the vector (1, 2, 2).
REMARK
When the directional derivative is applied, x usually represents position, and f(x)
represents some physical quantity, e.g. temperature, or height above sea level. Because
the parameter s in the denition represents distance along the line L, the directional
derivative represents a rate of change with respect to distance.
For example, if f(x) represents the temperature at position x, then D
u
f(a) equals the
rate of change of temperature, with respect to distance, at position a in the direction
u, and has dimensions of temperature per unit length.
If z = f(x, y) represents height above sea level,
then D
u
f(a, b) equals the rate of change of
height z with respect to horizontal distance,
at position (a, b) in the direction u, and is di-
mensionless. Geometrically, it equals the slope
of the tangent to the cross-section C at the
point A. (The vertical plane P cuts the sur-
face z = f(x, y) along the curve C.)
x
y
z
(a, b)
A
C
u
P
z = f(x, y)
7.2 The Gradient Vector in Two Dimensions
The Greatest Rate of Change
In general, for a function f : R
2
R, the directional derivative D
u
f(a) has innitely
many values corresponding to all possible directions u at a. It is natural to ask:
In which direction u does D
u
f(a) assume its largest value?
This is easily answered using Theorem 1 and the following property of the dot product
that
u v = uv cos ,
where is the angle between u and v.
Section 7.2 The Gradient Vector in Two Dimensions 77
Suppose that f : R
2
R is dierentiable at a, and that f(a) = (0, 0). Then the Theorem 2
largest value of D
u
f(a) is f(a), and occurs when u is in the direction of f(a).
Proof: Since f is dierentiable at a and u = 1 we have
D
u
f(a) = f(a) u = f(a) u cos = f(a) cos ,
where is the angle between u and f(a). Thus D
u
f(a) assumes its largest value which
is f(a), when cos = 1, i.e. = 0, so that u is in the direction of f(a).
A function f : R
2
R is dened by f(x, y) = ln(x + y
2
). Find the largest rate of EXERCISE 2
change of f at the point (0, 1), and the direction in which it occurs.
Give a non-constant function f : R
2
R and a point a R
2
such that the directional EXERCISE 3
derivative at a is independent of the direction. What can you say about the tangent
plane of the surface z = f(x, y) at the point a?
REMARK
Theorem 2 also applies in any dimension. That is, if f : R
n
R is dierentiable
at a and u R
n
is a unit vector, then the largest value of D
u
f(a) is f(a), and it
occurs when u is in the direction of f(a).
The Gradient and the Level Curves of f
People who have experience reading contour maps know that the direction of steepest
ascent is orthogonal to the contour lines. In mathematical terms, this means that the
direction of greatest rate of change of f : R
2
R, which we have shown is the direction
of the gradient of f, is orthogonal to the level curves of f.
We now derive this result analytically.
Suppose that f : R
2
R is dierentiable at a and that f(a) = (0, 0). Then f(a) Theorem 3
is orthogonal to the level curve f(x, y) = k through a.
Proof: Since f(a) = (0, 0), by the Implicit Function Theorem (see appendix 1) the
level curve f(x, y) = k can be described by parametric equations x = x(t), y = y(t)
for t I where x(t) and y(t) dierentiable. Hence, the level curve may be written
as f(x(t), y(t)) = k, t I. Suppose that
a = x(t
0
), b = y(t
0
) for some t
0
I.
Since f is dierentiable, we can take the derivative of this
equation with respect to t using the chain rule to get
(a, b)
f(a, b)
(t
0
), y
(t
0
)
f(x, y) = k
f
x
(x(t), y(t))x
(t) + f
y
(x(t), y(t))y
(t) = 0
On setting t = t
0
we get
0 = f(a, b) (x
(t
0
), y
(t
0
)).
Thus f(a, b) is orthogonal to (x
(t
0
), y
(t
0
)) which is tangent to the level curve.
Let z = 3 x
2
+y
2
represent the height above sea level. A hiker is at position (1, 2, 6). EXAMPLE 3
In what direction should he start to move, in order to follow a path of steepest ascent?
What would be the slope of his path (i.e. rate of change of height with respect to
horizontal distance)?
Solution: The gradient of z is
z = (2x, 2y)
and at the given point
z(1, 2) = (2, 4).
x
y
(1, 2)
z(1, 2)
C = 3
By Theorem 2, the hiker should move in the direction (2, 4) in order to follow a path
of steepest ascent (i.e. largest rate of change of z). The slope of his path would be
z(1, 2) =
_
(2)
2
+ (4)
2
= 2
5.
Section 7.2 The Gradient Vector in Two Dimensions 79
Prove that the level curves of the functions f and g dened by EXERCISE 4
f(x, y) =
y
x
2
, x = 0, g(x, y) = x
2
+ 2y
2
intersect orthogonally. Illustrate graphically.
The Gradient Vector Field
Given a function f : R
2
R that is dierentiable at x, the gradient of f at x is dened
by
f(x) = (f
x
(x), f
y
(x)).
The gradient of f associates a vector with each point of the domain of f, and is
referred to as a vector eld. It is represented graphically by drawing f(a) as a
vector emanating from the corresponding point a.
Theorems 2 and 3 show that the gradient vector eld has important geometric prop-
erties:
1) It gives the direction in which the function has its largest rate of change.
2) It gives the direction that is orthogonal to the level curves of the function.
C
a
1
a
2
a
3
f(a
1
)
f(a
2
)
f(a
3
)
f(x) = C
1
f(x) = C
2
f(x) = C
3
If the level curves are contour lines, then a curve such as C, which intersects the level
curves orthogonally, would dene a curve of steepest ascent on the surface.
7.3 The Gradient Vector in Three Dimensions
One cannot visualize the graph w = f(x, y, z) of a function f : R
3
R, because
four dimensions are required. One can gain insight into such a function, however, by
considering the level surfaces in R
3
dened by
f(x, y, z) = k, where k R(f).
The level surfaces of the function f : R
3
f(x, y, z) = x + 2y + 3z,
are the parallel planes
x + 2y + 3z = k.
The level surfaces of the function EXAMPLE 5
f(x, y, z) = x
2
+ y
2
z
2
,
given by
x
2
+ y
2
z
2
= k,
are hyperboloids with two sheets if k < 0, hyperboloids with one sheet if k > 0, and a
cone if k = 0.
x y
z
x
2
+y
2
z
2
= 2
2
x
2
+ y
2
z
2
= 1
x
2
+ y
2
z
2
= 0
x
2
+ y
2
z
2
= 1
x
2
+y
2
z
2
= 2
2
x
2
+y
2
z
2
= 3
2
x y
z
x y
z
x y
z
Section 7.3 The Gradient Vector in Three Dimensions 81
We now discuss the interpretation of the gradient f(a), for f : R
3
R. As noted in
Section 7.2, Theorem 2 applies in this case i.e. f(a) gives the direction of the largest
rate of change of f. We now generalize Theorem 3 to the case f : R
3
R. As one
might guess, we have:
Suppose that f : R
3
R is dierentiable, and that f(a) = 0. Then f(a) is Theorem 4
orthogonal to the level surface f(x) = k through a.
Proof: The details are similar to the proof of Theorem 3.
Observe that Theorem 4 gives a quick way to nd the equation of the tangent plane
of a surface in R
3
given by
f(x, y, z) = k.
Let x be an arbitrary point in the tangent plane to the
surface at the point a. Then the vector x a lies in
the tangent plane, and by Theorem 4, is orthogonal to
f(a), leading to
f(a) (x a) = 0.
x
y
z
a
x
f(a)
f(x, y, z) = k
Since this equation is satised for all x in the tangent plane, it is the equation of the
tangent plane. In component form, we have
f
x
(a)(x a) + f
y
(a)(y b) + f
z
(a)(z c) = 0,
where a = (a, b, c).
Find the equation of the tangent plane to the ellipsoid x
2
+2y
2
+3z
2
= 12 at the point EXERCISE 5
a = (1, 1,
3).
Find the equation of the tangent plane to the surface EXERCISE 6
z =
xy
3x 2y
at (1, 2, 2).
Hint: Rewrite the equation as z(3x 2y) xy = 0 and use the above approach.
Chapter 8
Taylor Polynomials and Taylors
Theorem
For a function of one variable f : R R, the second derivative f
plays an important
role in approximating f(x). Geometrically, f
determines whether the graph of f is

concave up or concave down. Thus, if the graph of f is concave up near x = 0 (f
(x) >
0), then the linear approximation formula gives a value for f(x) which is too small. The
second derivative can in fact be used to estimate the error through Taylors formula.
In addition, f
can be used to increase the accuracy of the linear approximation by

dening a quadratic approximation, the second degree Taylor polynomial.
In this chapter, we extend these ideas to functions of two variables.
8.1 The Taylor Polynomial of Degree 2
For a function of one variable, f : R R, the Taylor polynomial of degree 2 at a is
denoted by P
2,a
(x), and is dened by
P
2,a
(x) = f(a) + f
(a)(x a) +
1
2
f
(a)(x a)
2
.
Observe that P
2,a
(x) is the sum of the linear approximation L
a
(x) and a term which
is of second degree in (x a). The coecient of this term is determined by requiring
Section 8.1 The Taylor Polynomial of Degree 2 83
that the second derivative of P
2,a
(x) equals the second derivative of f at a:
P
2,a
(a) = f
(a)
Your should verify this by dierentiating P
2,a
(x).
The 2-D case
Suppose that f : R
2
R has continuous second partials at a = (a, b). The Taylor
polynomial of f of degree 2 at a is denoted P
2,a
(x, y) and is obtained by adding ap-
propriate 2nd degree terms in (xa) and (y b) to the linear approximation L
a
(x, y).
Consider
P
2,a
(x, y) = L
a
(x, y) + A(x a)
2
+ B(x a)(y b) + C(y b)
2
. (8.1)
where A, B, C are constants. On dierentiating we obtain
2
P
2,(a,b)
x
2
= 2A,

2
P
2,(a,b)
xy
= B,

2
P
2,(a,b)
y
2
= 2C.
Note that L
a
(x, y) does not contribute to the second derivatives since it is of rst
degree in x and y.
We require that the second derivatives of P
2,a
equal the second derivatives of f at
a. This leads to
2A =

2
f
x
2
(a, b), B =

2
f
xy
(a, b), 2C =

2
f
y
2
(a, b).
Substitute for A, B, C in equation (8.1) and write out the expression for L
a,b
(x, y), to
obtain the required formula.
The second degree Taylor polynomial P
2,(a)
of f : R
2
R at a = (a, b) is given by Denition
2nd degree Taylor
polynomial
P
2,a
(x, y) = f(a) + f
x
(a)(x a) + f
y
(a)(y b)
+
1
2
_
f
xx
(a)(x a)
2
+ 2f
xy
(a)(x a)(y b) + f
yy
(a)(y b)
2
_
.
In general, it approximates f(x, y) for (x, y) suciently close to (a, b):
f(x, y) P
2,a
(x, y),
with better accuracy than the linear approximation.
84 Chapter 8 Taylor Polynomials and Taylors Theorem
Use the Taylor polynomial of degree 2 to calculate
_
(0.95)
3
+ (1.98)
3
approximately. EXAMPLE 1
[ This is a continuation of Example 5 on page 41. ]
Solution: Let f(x, y) =
_
x
3
+ y
3
, and (a, b) = (1, 2). By dierentiating, one obtains
f(1, 2) =
_
1
2
, 2
_
, Hf(1, 2) =
_
11
12

1
3
1
3
2
3
_
.
Thus,
P
2,(1,2)
(x, y) = 3+
1
2
(x1) +2(y 2) +
1
2
_
11
12
(x 1)
2
2
3
(x 1)(y 2) +
2
3
(y 2)
2
_
.
This polynomial approximates
_
x
3
+ y
3
near the point (1, 2):
_
(0.95)
3
+ (1.98)
3
P
2
(0.95, 1.98)
= 3 + (0.065) +
_
0.0227
12
_
= 2.935946.
The calculator value is 2.935944. Hence, the error is 0.000002 compared with 0.000943
for the linear approximation.
a) Find the Taylor polynomial P
2,a
(x, y), for EXERCISE 1
f(x, y) =
1
2
y
2
+ x
1
3
x
3
,
at the point a = (1, 0), by calculating the appropriate partial derivatives.
b) Verify your results by letting u = x 1, v = y and writing
f(x, y) =
1
2
v
2
+ u + 1
1
3
(u + 1)
3
.
Expand and neglect powers higher than 2 and then convert back to x and y. This type
of algebraic derivation can only be done for a polynomial function.
We now ask: How large is the error if we use the approximation
f(x, y) P
2,a
(x, y)?
To answer this question, we need to extend Taylors Theorem to functions f : R
2
R.
Section 8.2 Taylors Formula with Second Degree Remainder 85
8.2 Taylors Formula with Second Degree
Remainder
Taylors Formula Theorem
Consider f : R R. If f
exists on [a, x], then there exists a number c between a and

x such that
f(x) = f(a) + f
(a)(x a) + R
1,a
(x), (8.2)
where
R
1,a
(x) =
1
2
f
(c)(x a)
2
. (8.3)
On recalling that
L
a
(x) = f(a) + f
(a)(x a), (8.4)

we see that the term R
1,a
(x) represents the error in using the linear approximation.
Keep in mind that you cant evaluate this expression, because you dont know the
value of c. We only know that c lies between a and x. However this formula is useful,
because it gives a way of nding an upper bound for the error.
If f has a continuous second derivative on an interval [a , a +] centered on a, then
f
is bounded on this interval. That is, there exists a number B such that
|f
(x)| B, for all x [a , a + ].

By equations (8.2)-(8.4),
|f(x) L
a
(x)| = |
1
2
f
(c)(x a)
2
|
1
2
B(x a)
2
.
That is
|f(x) L
a
(x)|
1
2
B(x a)
2
,
for all x [a , a + ]. Knowing f
(x), you can nd a value for B.

The 2-D case
In order to generalize Taylors formula to the case of f : R
2
R, observe that
R
1,a
(x) in equation (8.4) has the same form as the second derivative term in P
2,a
(x),
except that f
is evaluated at c instead of at a. Knowing the form of P

2,a
(x) leads us
to Taylors Theorem.
Taylors Formula Theorem 1
Consider f : R
2
R. If f C
2
in some neighborhood N(a) of a then for all x N(a),
there exists a point c on the line segment joining a and x such that
f(x) = f(a) + f
x
(a)(x a) + f
y
(a)(y b) + R
1,a
(x),
where
R
1,a
(x) =
1
2!
_
f
xx
(c)(x a)
2
+ 2f
xy
(c)(x a)(y b) + f
yy
(c)(y b)
2
_
.
Proof: The idea is to reduce the given function f of two variables to a function g of
one variable, by considering only points on the line segment joining a and x.
We parameterize the line segment L from a = (a, b) to x = (x, y) by
L(t) = a + t(x a), 0 t 1.
For simplicity write h = x a or in component form
(h, k) = (x, y) (a, b).
Then x a = h, y b = k, and Taylors formula assumes the form
f(x) = f(a) + f
x
(a)h + f
y
(a)k + R
1,a
(x),
where
R
1,a
(x) =
1
2!
_
f
xx
(c)h
2
+ 2f
xy
(c)hk + f
yy
(c)k
2
_
.
Dene g : R R by
g(t) = f(L(t)), 0 t 1. (8.5)
Since f has continuous second partials by hypothesis, we can apply the Chain Rule to
conclude that g
and g
are continuous and are given by

g
(t) = f
x
(L(t))h + f
y
(L(t))k (8.6)
g
(t) = f
xx
(L(t))h
2
+ 2f
xy
(L(t))hk + f
yy
(L(t))k
2
(8.7)
for 0 t 1.
Since g
is continuous on the interval [0, 1], Taylors formula may be applied to g

on this interval. That is, we can set x = 1 and a = 0 in equations (8.3) and (8.4). It
follows that there exists a number c, with 0 < c < 1, such that
g(1) = g(0) + g
(0) +
1
2
g
(c). (8.8)
Section 8.2 Taylors Formula with Second Degree Remainder 87
Each term in this equation can be calculated using equations (8.6)-(8.8), giving
g(1) = f(a + (x a)) = f(x),
g(0) = f(a), and
g
(0) = f
x
(a)h + f
y
(a)k.
In addition, if we let c = L(c), then
1
2
g
(c) = R
1,a
(x),
and equation (8.9) becomes precisely the modied version of Taylors formula.
REMARK
Like the one variable case, Taylors Theorem for f : R
2
R is an existence theorem.
That is, it only tells us that the point c exists, but not how to nd it.
Here is an example to show how Taylors formula can be used to estimate the error in
using the linear approximation formula.
If x 0 and y 0 show that EXAMPLE 2
_
1 + x + 2y 1 +
1
2
x + y,
with
|R
1,(0,0)
(x, y)|
3
4
(x
2
+ y
2
).
Solution: By dierentiating f(x, y) =
1 + x + 2y, we obtain
L
(0,0)
(x, y) = 1 +
1
2
x + y
and
f
xx
=
1
4(1 + x + 2y)
3/2
, f
xy
=
1
2(1 + x + 2y)
3/2
, f
yy
=
1
(1 + x + 2y)
3/2
.
For x 0 and y 0, f has continuous second partial derivatives so we can apply
Taylors Theorem to get that there exists a point c on the line segment from x to (0, 0)
such that
R
1,(0,0)
(x, y)
1
2
_
f
xx
(c)(x 0)
2
+ 2f
xy
(c)(x 0)(y 0) + f
yy
(c)(y 0)
2
_
.
Since we can not nd c, we want to nd an upper bound for this function. Applying
the triangle inequality gives
R
1,(0,0)
(x, y)
1
2
_
|f
xx
(c)|x
2
+ 2|f
xy
(c)||x||y| +|f
yy
(c)|y
2
_
. (8.9)
Thus, to nd our upper bound for the error, we just need to nd upper bounds for
|f
xx
(c)|, |f
xy
(c)|, and |f
yy
(c)|. Since x 0 and y 0 we get that 1 + x + 2y 0, and
so we get
|f
xx
(c)|
1
4
, |f
xy
(c)|
1
2
, and |f
yy
(c)| 1.
Putting this into (8.10) we get
R
1,(0,0)
1
2
_
1
4
x
2
+ 2
1
2
|x||y| + 1y
2
_
1
2
_
1
4
x
2
+
1
2
(x
2
+ y
2
) + y
2
_
, since 2|x||y| x
2
+ y
2
=
3
8
x
2
+
3
4
y
2
.
Using the fact that
3
8
x
2
<
3
4
x
2
gives
_
1 + x + 2y
_
1 +
1
2
x + y
_
3
4
(x
2
+ y
2
),
as required.
Let f(x, y) = e
2x+y
. Use Taylors Theorem to show that the error in the linear EXERCISE 2
approximation L
(1,1)
(x, y) is at most 6e[(x1)
2
+(y 1)
2
] if 0 x 1 and 0 y 1.
REMARK
The most important thing about the error term R
1,a
(x) is not its explicit form, but
rather its dependence on the magnitude of the displacement x a. We state the
result as a Corollary.
If f : R
2
R, f C
2
in some closed neighborhood N
(a) = {x R
2
| x a }, Corollary 1
then there exists a positive constant M such that
|R
1,a
(x)| Mx a
2
, for all x N
(a).
Section 8.3 Generalizations 89
8.3 Generalizations
In order to dene the Taylor polynomial P
k,a
(x) of degree k, in a concise manner, we
introduce the dierential operator
(x a)D
1
+ (y b)D
2
,
where D
1
=

x
and D
2
=

y
are the partial dierential operators. Then we formally
write
[(x a)D
1
+ (y b)D
2
]
2
= (x a)
2
D
2
1
+ 2(x a)(y b)D
1
D
2
+ (y b)
2
D
2
2
.
Note that D
2
1
= D
1
D
1
. This means apply D
1
twice, i.e. take the second partial
derivative with respect to the rst variable.
In terms of this notation, the rst degree Taylor polynomial P
1,a
(x) (which is the
linear approximation L
a
(x)) is written as
P
1,a
(x) = f(a) + [(x a)D
1
+ (y b)D
2
]f(a),
and the second degree Taylor polynomial is written as
P
2,a
(x) = P
1,a
(x) +
1
2!
[(x a)D
1
+ (y b)D
2
]
2
f(a).
We can now recursively dene the kth degree Taylor polynomial for k = 2, 3, . . . by
P
k,a
(x) = P
k1,a
(x) +
1
k!
[(x a)D
1
+ (y b)D
2
]
k
f(a).
The expression [(x a)D
1
+(y b)D
2
]
k
can be formally expanded using the Binomial
Theorem.
Write out P
3,a
(x) explicitly using subscript notation. EXERCISE 3
We now see that all of the results we had generalize in the expected way for all
values of k.
Taylors Theorem of order k Theorem 2
Let f : R
2
R, f C
k+1
at each point on the line segment joining a and x. Then
there exists a point c on the line segment between a and x such that
f(x) = P
k,a
(x) + R
k,a
(x),
where
R
k,a
(x) =
1
(k + 1)!
[(x a)D
1
+ (y b)D
2
]
k+1
f(c).
If f C
k
in some neighborhood of a, then Corollary 1
lim
xa
|f(x) P
k,a
(x)|
x a
k
= 0.
If f C
k+1
in some closed neighborhood N(a) of a, then there exists a constant M > 0 Corollary 2
such that
|f(x) P
k,a
(x)| Mx a
k+1
,
for all x N(a).
The nal stage in the process of generalization is to consider functions of n variables
f : R
n
R. One has simply to replace the dierential operator
[(x a)D
1
+ (y b)D
2
]
by
[(x
1
a
1
)D
1
+ + (x
n
a
n
)D
n
] ,
which can be written concisely in vector notation as
_
(x a)
_
,
where = (D
1
, . . . , D
n
).
Chapter 9
Critical Points
Recall from single variable calculus that if x = a is a local extremum of f : R R
then either f
(a) = 0 or f
(a) does not exist. Such points are of interest and are called
critical points of f. But recall that a critical point is not necessarily a local extremum.
For example f(x) = x
3
at x = 0.
In this chapter, we extend these ideas to functions f : R
2
R. The second degree
Taylor polynomial will be used to generalize the second derivative test for local extrema.
These ideas will be applied to optimization problems in Chapter 10.
9.1 Local Extrema and Critical Points
We begin with the denitions of local extrema.
1) Given a function f : R
2
R, a point (a, b) is a local maximum point of f if Denition
local maximum
local minimum f(x, y) f(a, b)
for all (x, y) in some neighborhood of (a, b).
2) Given a function f : R
2
R, a point (a, b) is a local minimum point of f if
f(x, y) f(a, b)
for all (x, y) in some neighborhood of (a, b).
92 Chapter 9 Critical Points
Thinking geometrically, if (a, b) is a local maxi-
mum/minimum point of f and f has continuous par-
tial derivatives, then (a, b) is a local maximum/minimum
point of the cross-sections f(x, b) and f(a, y). Thus, (a, b)
is a critical point of both of these cross-sections and so
both partial derivatives of f will be zero and the tangent
plane will be horizontal.
x
y
z z
z = f(a, b)
(a, b)
z = f(x, y)
Let f : R
2
R. If (a, b) is a local maximum or minimum point of f, then Theorem 1
f
x
(a, b) = 0 = f
y
(a, b),
or at least one of f
x
or f
y
does not exist at (a, b).
Proof: Consider the function g : R R dened by g(x) = f(x, b). If (a, b) is a
local maximum/minimum point of f, then x = a is a local maximum/minimum point
of g, and hence either g
(a) = 0 or g
(a) does not exist. Thus it follows that either

f
x
(a, b) = 0 or f
x
(a, b) does not exist. A similar argument gives f
y
(a, b) = 0 or f
y
(a, b)
does not exist.
Let f : R
2
R. A point (a, b) in the domain of f is called a critical point of f if Denition
critical point
f
x
(a, b) = 0 =
f
y
(a, b),
or if at least one of the partial derivatives does not exist at (a, b).
Find the critical points of the following functions and determine if they are local max- EXAMPLE 1
imum points or local minimum points.
f(x, y) = x
2
+ y
2
, g(x, y) = x
2
y
2
, h(x, y) = x
2
y
2
.
Solution: We see that f
x
= 2x and f
y
= 2y so (0, 0) is the only critical point of f.
Observe that
f(x, y) = x
2
+ y
2
> 0 = f(0, 0), for all (x, y) = (0, 0)
so (0, 0) is a local minimum point of f.
Section 9.1 Local Extrema and Critical Points 93
We have g
x
= 2x and g
y
= 2y so (0, 0) is the only critical point of g and
g(x, y) = x
2
y
2
< 0 = g(0, 0), for all (x, y) = (0, 0)
so (0, 0) is a local maximum point of g.
For h(x, y) we have h
x
= 2x and h
y
= 2y so (0, 0) is the only critical point of h, but
we have h(x, 0) > h(0, 0) for any value of x and h(0, y) < h(0, 0) for any value of y, so
(0, 0) is not a local maximum point or a local minimum point.
Our solutions for f and g make a lot of sense when we realize that z = f(x, y) is a
paraboloid facing up and z = g(x, y) is a paraboloid facing down. Also, we see that
(0, 0) is the point at the center of the saddle for the saddle surface z = h(x, y) hence
it should not be a local minimum or a local maximum. This motivates the following
denition.
A critical point (a, b) of f : R
2
R is called a saddle point of f if in every neighbor- Denition
saddle points hood of (a, b) there exists points (x
1
, y
1
) and (x
2
, y
2
) such that
f(x
1
, y
1
) > f(a, b) and f(x
2
, y
2
) < f(a, b).
The problem that we are faced with has two parts.
1) Given f : R
2
R, nd all critical points of f.
2) Determine whether the critical points are local maxima, minima or saddle points.
We now illustrate 1) with an Example. 2) is discussed in Section 9.2.
Find all critical points of f : R
2
R, dened by EXAMPLE 2
f(x, y) = x
2
y + 3xy
2
+ xy.
Solution: Dierentiate and simplify, to obtain
f
x
= y(2x + 3y + 1),
f
y
= x(x + 6y + 1).
In this type of problem it is helpful to take out common factors in the expressions. In
order to nd the critical points of f we must solve the system of two equations
y(2x + 3y + 1) = 0 (9.1)
x(x + 6y + 1) = 0 (9.2)
Observe that (9.1) implies that either y = 0 or 2x+3y +1 = 0. We consider these two
cases:
Case 1: y = 0.
Putting y = 0 into (9.2) we get x(x+1) = 0, giving two values x = 0 or x = 1. Thus
we have critical points (0, 0) and (1, 0).
Case 2: 2x + 3y + 1 = 0.
We have 3y = 2x 1 so (9.2) gives
0 = x(x + 2(3y) + 1) = x(x + 2(2x 1) + 1) = 3x
2
x = x(3x + 1),
giving two values x = 0 and x =
1
3
. To nd the corresponding y values we put these
into 3y = 2x 1 and get two more critical points (0,
1
3
) and (
1
3
,
1
9
).
Collecting the results, the critical points are (0, 0),
_
0,
1
3
_
, (1, 0), and
_
1
3
,
1
9
_
.
REMARKS
1) It is essential to solve equations (9.1) and (9.2) systematically, by considering all
possible cases, in order to nd all critical points.
2) You should be aware that we can only explicitly nd the critical points for simple
functions f. The equations
f
x
= 0,
f
y
= 0
are a system of equations, which, in general, are non-linear, and there are no
general algorithms for solving such systems exactly. There are, however, numer-
ical methods for nding approximate solutions, one of which is a generalization
of Newtons method to two variables. If you review the one variable case, you
might see how to generalize it, using the tangent plane. Its a challenge!
Section 9.2 The Second Derivative Test 95
Find all critical points of f(x, y) = xye
xy
. EXERCISE 1
Find all critical points of f(x, y) = xcos(x + y). EXERCISE 2
Give a function f : R
2
R with no critical points. EXERCISE 3
9.2 The Second Derivative Test
The second derivative test can be motivated in a simple way by using the second degree
Taylor polynomial of f.
For f : R R the second degree Taylor polynomial approximation is
f(x) f(a) + f
(a)(x a) +
1
2
f
(a)(x a)
2
,
for x suciently close to a. If x = a is a critical point of f, then f
(a) = 0, and the

approximation can be rearranged to give
f(x) f(a)
1
2
f
(a)(x a)
2
.
Thus, for x suciently close to a, f(x) f(a) has the same sign as f
(a). If f
(a) > 0,
then f(x) f(a) > 0 for x suciently close to a, and x = a is a local minimum point.
If f
(a) < 0 then f(x) f(a) < 0 for x suciently close to a, and x = a is a local
maximum point. There is no conclusion if f
(a) = 0.
The 2-D case
For f : R
2
R, f C
2
, the second degree Taylor polynomial approximation is
f(x, y) P
2,(a,b)
(x, y),
for (x, y) suciently close to (a, b). If (a, b) is a critical point of f such that
f
x
(a, b) = 0 = f
y
(a, b),
then the approximation can be rearranged to yield
f(x, y)f(a, b)
1
2
_
f
xx
(a, b)(xa)
2
+2f
xy
(a, b)(xa)(yb)+f
yy
(a, b)(yb)
2
_
(9.3)
for (x, y) suciently close to (a, b). The sign of the expression on the right will deter-
mine the sign of f(x, y) f(a, b), and hence whether (a, b) is a local maximum, local
minimum or saddle point.
The expression on the right is called a quadratic form, and at this stage it is
necessary to discuss some properties of these objects.
Quadratic Forms
A function Q : R
2
R of the form Denition
quadratic form
Q(u, v) = a
11
u
2
+ 2a
12
uv + a
22
v
2
,
where a
11
, a
12
and a
22
are constants, is called a quadratic form on R
2
.
It is important to observe that one can use matrix notation and write
Q(u, v) = [u v]
_
a
11
a
12
a
12
a
22
__
u
v
_
,
so that a quadratic form on R
2
is determined by a 2 2 symmetric matrix. Perform
the matrix multiplications to convince yourself that the two expressions for Q(u, v) are
equal.
We classify quadratic forms on R
2
in the following way:
1) If Q(u, v) > 0 for all (u, v) = (0, 0), then Q(u, v) is said to be positive denite.
2) If Q(u, v) < 0, for all (u, v) = (0, 0), then Q(u, v) is said to be negative denite.
3) If Q(u, v) < 0 for some (u, v) and Q(u, v) > 0 for some other (u, v), then Q(u, v)
is said to be indenite.
4) If Q(u, v) does not satisfy any of 1) 3), then Q(u, v) is said to be semidenite.
These terms are also used to describe the corresponding symmetric matrices.
A =
_
2 0
0 3
_
is positive denite, since Q(u, v) = 2u
2
+ 3v
2
> 0 for all (u, v) = (0, 0). EXAMPLE 3
B =
_
2 0
0 3
_
is indenite, since Q(u, v) = 2u
2
3v
2
, and Q(u, 0) = 2u
2
> 0 for u = 0,
and Q(0, v) = 3v
2
< 0 for v = 0.
C =
_
2 0
0 0
_
is semidenite, since Q(u, v) = 2u
2
0 for all (u, v), and Q(0, v) = 0 for
all v.
REMARK
Semidenite quadratic forms may be split into two classes, positive semidenite
and negative semidenite. Then, the matrix C above would be classied as positive
semidenite.
If A is not a diagonal matrix, the nature of A (or of Q(u, v)) is not immediately obvious.
For example, even if all entries of A are positive, it does not follow that A is a positive
denite matrix.
Classify the symmetric matrix A =
_
1 3
3 2
_
. EXAMPLE 4
Solution: The associated quadratic form is
Q(u, v) = u
2
+ 6uv + 2v
2
Complete the square, obtaining
Q(u, v) = (u + 3v)
2
7v
2
It is now clear by inspection that A is indenite, since
Q(u, 0) = u
2
> 0, for u = 0,
and
Q(3v, v) = 7v
2
< 0, for v = 0.
Having introduced quadratic forms, we return to equation (9.3). Let
u = x a, v = y b,
so that
f(x, y) f(a, b)
1
2
_
f
xx
(a, b)u
2
+ 2f
xy
(a, b)uv + f
yy
(a, b)v
2
_
.
The matrix of the quadratic form on the right is the Hessian matrix of f at (a, b):
Hf(a, b) =
_
f
xx
(a, b) f
xy
(a, b)
f
xy
(a, b) f
yy
(a, b)
_
.
It is thus plausible that if Hf(a, b) is positive denite, then
f(x, y) f(a, b) > 0
for all (u, v) = (0, 0) i.e. for all (x, y) = (a, b) (assuming, of course, that (x, y) is
suciently close to (a, b) so that the approximation is suciently accurate). In other
words, if Hf(a, b) is positive denite, it is plausible that (a, b) is a local minimum point
of f. One can give similar arguments in the cases where Hf(a, b) is negative denite
or indenite, leading to the following theorem.
Second Partial Derivative Test Theorem 2
Let f : R
2
R and suppose that f C
2
in some neighborhood of a, and that
f
x
(a) = 0 = f
y
(a).
1) If Hf(a) is positive denite, then a is a local minimum point of f.
2) If Hf(a) is negative denite, then a is a local maximum point of f.
3) If Hf(a) is indenite, then a is a saddle point of f.
REMARKS
1) The argument preceding the theorem is not a proof, since it involves an approxi-
mation. One can use Taylors formula and a continuity argument to give a proof.
See the Appendix to this chapter.
2) Note the analogy with the second derivative test for functions of one variable.
The requirement g
(a) > 0, which implies a local minimum, is replaced by the

requirement that the matrix of second partial derivatives Hf(a) be positive def-
inite.
To help us classify the Hessian matrix we can use the following theorem from the theory
of quadratic forms.
Let Q(u, v) = a
11
u
2
+ 2a
12
uv + a
22
v
2
and let D = a
11
a
22
a
2
12
, then Theorem 3
1) Q is positive denite if and only if D > 0 and a
11
> 0.
2) Q is negative denite if and only if D > 0 and a
11
< 0.
3) Q is indenite if and only if D < 0.
4) Q is semidenite if and only if D = 0.
Proof: If a
11
= 0, then we can complete the square to get
Q(u, v) = a
11
_
(u +
a
12
a
11
v)
2
+
D
a
2
11
v
2
_
.
Thus, if D > 0 and a
11
> 0, then Q(u, v) > 0 and hence is positive denite. If
D > 0 and a
11
< 0, then Q(u, v) < 0 and hence is negative denite. If D < 0, then
clearly Q(u, 0) and Q((a12/a11)v, v) are of dierent signs, hence Q(u, v) is indenite.
Otherwise, we must have D = 0 and Q(u, v) is semidenite.
Similarly, if a
22
= 0, then we can complete the square to get
Q(u, v) = a
22
_
D
a
2
22
u
2
+ (v +
a
12
a
22
u)
2
_
.
Finally, if a
11
= a
22
= 0, then Q(u, v) = 2a
12
uv and the result follows.
REMARK
Observe that D is the determinant of the associated symmetric matrix.
Find and classify all critical points of the function f : R
2
f(x, y) = x
3
4x
2
+ 4x 4xy
2
.
Solution: To nd the critical points we solve the system
0 = f
x
= 3x
2
8x + 4 4y
2
(9.4)
0 = f
y
= 8xy. (9.5)
From (9.5) we get that x = 0 or y = 0. If x = 0, then (9.4) gives 0 = 4 4y
2
so
y = 1. If y = 0, then (9.4) gives 0 = 3x
2
8x +4 = (3x 2)(x 2). Hence, we have
critical points (0, 1), (0, 1), (2, 0), and
_
2
3
, 0
_
.
The second partial derivatives are
f
xx
= 6x 8, f
xy
= 8y, f
yy
= 8x.
At
_
2
3
, 0
_
, the Hessian matrix is Hf
_
2
3
, 0
_
=
_
4 0
0
16
3
_
, which is clearly negative
denite, since the corresponding quadratic form is Q(u, v) = 4u
2
16
3
v
2
. Thus, by
the second partial derivative test,
_
2
3
, 0
_
is a local maximum point.
At (0, 1), the Hessian matrix is Hf(0, 1) =
_
8 8
8 0
_
. So, det Hf(0, 1) = 64 < 0.
Thus Hf(0, 1) is indenite, and by the second partial derivative test, (0, 1) is a saddle
point.
Similarly, it follows that (0, 1) and (2, 0) are saddle points.
Fill in the details of Example 5 above. EXERCISE 4
Find and classify all critical points of the function EXERCISE 5
f(x, y) = x
2
+ 6xy + 2y
2
.
Find and classify all critical points of the function EXERCISE 6
f(x, y) = (x
2
+ y
2
1)y.
Degenerate critical points
We have seen that quadratic forms (i.e. symmetric matrices) can be classied into four
types: positive denite, negative denite, indenite and semidenite. Note that the
second partial derivative test gives a conclusion in the rst three cases but makes no
reference to the semidenite case. In fact, if Hf(a) is semidenite, the critical point a
may be a local maximum point, a local minimum point or a saddle point. We justify
this statement by considering the functions
f(x, y) = x
4
+ y
4
, g(x, y) = x
4
y
4
, h(x, y) = x
4
y
4
.
For each function (0, 0) is the only critical point, and the Hessian matrix at (0, 0) is
the zero matrix, which is semidenite. However since
f(x, y) f(0, 0) 0 for all (x, y),
g(x, 0) g(0, 0) 0 for all x,
g(0, y) g(0, 0) 0 for all y,
h(x, y) h(0, 0) 0 for all (x, y)
it follows that (0, 0) is a local minimum point for f, a saddle point for g and a local
maximum point for h.
If Hf(a) is semidenite, so that the second partial derivative test gives no conclu-
sion, we say that the critical point a is degenerate. In order to classify the critical
point, one has to investigate the sign of f(x) f(a) in a small neighborhood of a.
A function f : R
2
f(x, y) = 2(x y)
2
x
4
y
4
+ 3.
Show that (0, 0) is a degenerate critical point of f and classify it.
Solution: It is a routine matter to show that
f(0, 0) = (0, 0), Hf(0, 0) =
_
4 4
4 4
_
.
The quadratic form associated with the Hessian is
Q(u, v) = 4u
2
8uv + 4v
2
= 4(u v)
2
0,
with Q(u, u) = 0 for all u, hence Hf(0, 0) is semidenite. Thus, (0, 0) is a degenerate
critical point. In order to classify it, consider
f(x, y) f(0, 0) = 2(x y)
2
x
4
y
4
.
Observe that
f(x, x) f(0, 0) = 2x
4
< 0 for all x = 0
and
f(x, 0) f(0, 0) = 2x
2
x
4
= x
2
(2 x
2
) > 0,
for all x which satisfy 0 < x
2
< 2. Thus in any suciently small neighborhood of
(0, 0), f(x, y) f(0, 0) assumes positive and negative values. Hence, (0, 0) is a saddle
point.
Generalization
The denitions of local maximum point, local minimum point and critical point can be
generalized in the obvious way to functions of n variables, f : R
n
R. The Hessian
matrix of f at a is the n n symmetric matrix given by
Hf(a) =
_

2
f
x
i
x
j
(a)
_
,
where i, j = 1, 2, . . . , n. The Hessian matrix can be classied as positive denite,
negative denite, indenite or semidenite by considering the associated quadratic
form in R
n
:
Q(u) =
n
i,j=1
2
f
x
i
x
j
(a)u
i
u
j
,
as in R
2
. The second derivative test as stated in R
2
now holds in R
n
. It can be justied
heuristically by using the second degree Taylor polynomial approximation,
f(x) P
2,a
(x),
which leads to
f(x) f(a)
1
2!
n
i,j=1
2
f
x
i
x
j
(a)(x
i
a
i
)(x
j
a
j
),
generalizing equation (9.3).
For n = 2, we have seen that one can classify Hf(a) by using the determinant.
This can be extended to the general case. See Math 235.
Level Curves Near a Critical Point
2
R with continuous partial
derivatives. In Section 7.2 we discussed the fact that if
f(a) = 0, then the level curve of f through a is a smooth
curve (at least suciently close to a). In addition, by conti-
nuity, f(x) = 0 for all x in some neighborhood of a. Thus,
if f(a) = 0, there will be some neighborhood of a in which
the level curves of f are smooth non-intersecting curves. A
point at which f(a) = 0 is called a regular point of f.
a
f(a)
Level curves f(x, y) = k near a
when f(a) = 0
(actual shape is not signicant)
Assume that f has continuous second partial derivatives, and approximate f by its
Taylor polynomial P
2,a
(x, y), calculated at the critical point:
f(x, y) f(a, b) +
1
2
_
f
xx
(a)(x a)
2
+ 2f
xy
(a)(x a)(y b) + f
yy
(a)(y b)
2
_
. (9.6)
The constant term f(a, b), and the factor
1
2
in equation (9.6) do not make a signicant
dierence. By setting u = x a and v = y b we have simply made a translation.
Thus, it is plausible (and can be proved) that for (x, y) suciently close to a = (a, b),
the level curves of f will be approximated by the level curves of P
2,a
(x, y).
The problem is thus: what are the possible level curves of an arbitrary quadratic
form:
Q(u, v) = a
11
u
2
+ 2a
12
uv + a
22
v
2
where a
11
, a
12
and a
22
are constants.
To properly answer this question requires even more linear algebra. The possible level
curves and how to sketch them is covered in Math 235.
Convex Functions
Suppose f : R R. We say that a twice dierentiable function f is strictly convex
if f
(x) > 0 for all x and f is convex is f
(x) 0 for all x. Thus the term convex

means concave up. Convex functions have two interesting properties.
Suppose f : R R is twice continuously dierentiable and strictly convex. Then Theorem 4
(1) f(x) > L
a
(x) = f(a) + f
(x)(x a) for all x = a, for any a R.

(2) For a < b, f(x) < f(a) +
f(b)f(a)
ba
(x a) for x (a, b).
Proof: (1) follows from Taylors Theorem: f(x) = L
a
(x) +
f
(c)
2
(x a)
2
where c is
between a and x. Thus R
1,a
(x) > 0 for x = a, giving f(x) > L
a
(x) for all x = a.
(2) Let g(x) = f(x)
_
f(a) +
f(b)f(a)
ba
(x a)
_
. Then g(a) = g(b) = 0 and g
(x) =
f
(x) > 0. We must show that g(x) > 0 for x (a, b). By the Mean Value Theorem
f(b)f(a)
ba
= f
(c) for some c (a, b). Note that g
(x) = f
(x)
f(b)f(a)
ba
= f
(x) f
(c).
Thus g
(c) = 0. Since g
(x) > 0 then g
(x) is strictly increasing. Since g
(c) = 0 then
g
(x) < 0 on [a, c) and g
(x) > 0 on (c, b]. This implies that g(x) is strictly decreasing
on [a, c] and strictly increasing on [c, b]. Since g(a) = 0 then g(x) < 0 on (a, c] and on
[c, b). Therefore g(x) < 0 on (a, b), as required.
REMARK
(1) says that the graph of f lies above any tangent line, and (2) says that any
secant line lies above the graph of f.
Suppose f : R
2
R has continuous second partial derivatives. We say that f is
strictly convex if Hf(x, y) is positive denite for all (x, y). By Theorem 3, f is
strictly convex means f
xx
> 0 and f
xx
f
yy
f
2
xy
> 0 for all (x, y). We get a result which
is analogous to Theorem 4.
Suppose f : R
2
R has continuous second partial derivatives and is strictly convex. Theorem 5
Then
(1) f(x) > L
a
(x) for all x = a, and
(2) f(a + t(b a)) < f(a) + t[f(b) f(a)] for 0 < t < 1, a = b.
Proof: (1) follows from Taylors Theorem:
f(x) = L
a
(x) +
1
2
_
f
xx
(c)(x a)
2
+ 2f
xy
(c)(x a)(y b) + f
yy
(y b)
2
where c is on the line segment from a to x. Since f

xx
(c) > 0, f
xx
f
yy
(c) f
xy
(c)
2
> 0,
R
1,a
(x) > 0 for x = a by Theorem 3. Therefore, f(x) > L
a
(x) for x = a.
(2) We parameterize the line segment L from a = (a
1
, a
2
) to b = (b
1
, b
2
) by
L(t) = a + t(b a), 0 t 1.
For simplicity write h = b a or in component form
(h, k) = (b
1
, b
2
) (a
1
, a
2
) = (b
1
a
1
, b
2
a
2
).
Dene g : R R by
g(t) = f(L(t)), 0 t 1. (9.7)
Since f has continuous second partials by hypothesis, we can apply the Chain Rule to
conclude that g
and g
are continuous and are given by

g
(t) = f
x
(L(t))h + f
y
(L(t))k (9.8)
g
(t) = f
xx
(L(t))h
2
+ 2f
xy
(L(t))hk + f
yy
(L(t))k
2
(9.9)
for 0 t 1. Since f
xx
(L(t)) > 0 and f
xx
(L(t))f
yy
(L(t)) f
xy
(L(t))
2
> 0 for all t,
g
(t) > 0 by Theorem 3. Thus, by Theorem 4, part (2):

g(t) < g(0) +
g(1) g(0)
1 0
(t 0), for 0 < t < 1.
Therefore f(a + t(b a)) < f(a) + t[f(b) f(a)] for 0 < t < 1 as required.
REMARK
(1) says that the graph of f lies above the tangent plane and (2) says that the
cross-section of the graph of f above the line segment from a to b lies below the secant
line.
If f(x) = x
2
, then f
(x) = 2 > 0 for all x, so f(x) is strictly convex. EXAMPLE 7

If f(x, y) = x
2
+ y
2
, then f
xx
= 2, f
xy
= 0, and f
yx
= 2, so f
xx
f
yy
f
2
xy
= 4 > 0, so f
is strictly convex.
Suppose f : R
2
R has continuous partial derivatives, is strictly convex, and has a Theorem 6
critical point c. Then f(x) > f(c) for all x = c and f has no other critical point.
Proof: Note that L
c
(x) = f(c). Thus f(x) > f(c) for all x = c by Theorem 5, part
(1). If f has a second critical point d, then by the reasoning just given f(d) > f(c)
and f(c) > f(d) which is a contradiction.
9.3 Proof of the Second Partial Derivative Test
We now want to prove part (1) of the second partial derivative test. The proof depends
signicantly on the hypothesis that the second partials of f are continuous, and on
a plausible property of positive denite matrices: if you make a small change to the
entries of a positive denite matrix then the new matrix is positive denite. This is
proved separately as a lemma
1
.
Let
_
a b
b c
_
be a positive denite matrix. If | a a|, |
b b| and | c c| are suciently Lemma 1

small, then
_
a

b
b c
_
is positive denite.
Proof: Let Q and

Q be the quadratic forms determined by the given matrices i.e.
Q(u, v) = au
2
+ 2buv + cu
2
(9.10)
and similarly for

Q(u, v). We perform the change of variables
u = r cos , v = r sin ,
to obtain
Q(u, v) = r
2
p() (9.11)
where
p() = a cos
2
+ 2b cos sin + c sin
2
.
Since for r = 1, Q(u, v) = p(), and Q is positive denite, we must have p() > 0 for
all , 0 2.
Let
k = min
02
p().
Then k > 0 and by equation (9.11)
Q(u, v) kr
2
for all (u, v) = (0, 0). (9.12)
We are given that | a a|, |
b b| and | c c| are suciently small. Let

= min{| a a|, |
b b|, | c c|}.
1
This proof was provided by D. Siegel
Section 9.3 Proof of the Second Partial Derivative Test 107
By equation (9.10) and the triangle inequality,
|Q(u, v)

Q(u, v)| | a a|u
2
+ 2|
b b||u||v| +| c c|v
2
(u
2
+ 2|u||v| + v
2
)
= (|u| +|v|)
2
= r
2
(| cos | +| sin|)
2
< 4r
2
We now choose =
1
8
k. Then
|Q(u, v)

Q(u, v)| <
1
2
kr
2
,
which implies
Q(u, v) Q(u, v)
1
2
kr
2
kr
2
1
2
kr
2
, by (9.12)
=
1
2
kr
2
This shows that

Q(u, v) > 0 for all (u, v) = (0, 0). Therefore

Q(u, v) is positive
denite.
REMARK
The lemma is also true if positive denite is replaced by negative denite or
indenite.
We now prove the second partial derivative test. For convenience we restate the theo-
rem.
Second Partial Derivative Test Theorem 2
Let f : R
2
R and suppose that f C
2
in some neighborhood of a, and that
f
x
(a) = 0 = f
y
(a).
1) If Hf(a) is positive denite, then a is a local minimum point of f.
2) If Hf(a) is negative denite, then a is a local maximum point of f.
3) If Hf(a) is indenite, then a is a saddle point of f.
Proof: We apply Taylors formula with second order remainder. Since f
x
(a) = 0 =
f
y
(a), Taylors formula can be written as
f(x) f(a) =
1
2
_
f
xx
(c)(x a)
2
+ 2f
xy
(c)(x a)(y b) + f
yy
(c)(y b)
2
_
, (9.13)
where c lies on the line segment joining a = (a, b) and x = (x, y). The coecient
matrix in the quadratic expression on the right side of (9.13) is the Hessian matrix
Hf(c).
We are given that Hf(a) is positive denite. By the lemma, there exists > 0 such
that if
|f
xx
(x) f
xx
(a)| < , |f
xy
(x) f
xy
(a)| < , |f
yy
(x) f
yy
(a)| < (9.14)
then Hf(x) is positive denite. Since the second partials of f are continuous at a,
the denition of continuity implies that there exists a > 0 such that x a <
implies (9.14) and hence that Hf(x) is positive denite. Since c a < x a, it
follows that Hf(c) is also positive denite. It now follows from equation (9.13) and
the denition of positive denite matrix, that if 0 < xa < then f(x) f(a) > 0.
Thus by denition a is a local minimum point of f.
REMARK
Parts 2) and 3) of the second derivative test can be proved in a similar way, using
the modied lemma.
Chapter 10
Optimization Problems
10.1 Extreme Value Theorem
As we saw with functions of one variable, one is often interested in nding the largest
or smallest possible value of a function f : R
n
R on some specied set S R
n
,
rather than nding the local maximum/minimum points. We start with some standard
denitions.
Given a function f : R
2
R and a set S R
2
. Denition
absolute maximum
absolute minimum
1) A point a S is an absolute maximum point of f on S if
f(x) f(a) for all x S
The value f(a) is called the absolute maximum value of f on S.
2) A point a S is an absolute minimum point of f on S if
f(x) f(a) for all x S
The value f(a) is called the absolute minimum value of f on S.
110 Chapter 10 Optimization Problems
The Extreme Value Theorem
Whether or not f has a maximum/minimum value on S depends on f and on the set
S. Recall from Calculus 1 that the Extreme Value Theorem gives conditions which
imply the existence of a maximum value and minimum value of f : R R on an
interval I. Here is the theorem.
Extreme Value Theorem Theorem
Consider f : R R, and a nite closed interval I R. If f is continuous on I then
there exists c
1
, c
2
I such that
f(c
1
) f(x) f(c
2
) for all x I.
For our purposes, the important thing is to be able to give counterexamples to show
that the conclusion may not be valid if the hypotheses are not satised.
Give a function f : R R and an interval I such that EXERCISE 1
1. I is closed, but f does not have an absolute maximum on I.
2. I is nite and f is continuous on I, but f does not have an absolute maximum
on I.
3. f is continuous on I, but does not have an absolute minimum.
In order to generalize this theorem to functions of two variables, we need to generalize
the concept of a nite closed interval to sets in R
2
.
A set S R
2
is said to be bounded if and only if it is contained in some neighbourhood Denition
bounded set of the origin.
REMARKS
1. Observe that the denition implies that every point in S must have nite distance
from the origin. Hence, bounded set means nite set.
2. In fact, the neighbourhood in the denition need not be centered at the origin.
Section 10.1 Extreme Value Theorem 111
Intuitively, a boundary point of a set S R
2
is a point which lies on the edge of
S. Here is the denition.
Given a set S R
2
, a point b R
2
is said to be a boundary point of S if and only Denition
boundary point if every neighbourhood of b contains at least one point in S and one point not in S.
The set of all boundary points of S is denoted B(S), and called the boundary of S. Denition
boundary of S
Observe that the concept of a closed set in R
2
gener-
alizes the idea of a closed interval in R.
x
y
S
B(S)
a S
b B(S)
c S
A set S R
2
is said to be closed if and only if S contains all its boundary points. Denition
closed set
Consider S = {x R
2
| 1 < x 2}. The boundary of S
is
B(S) = {x R
2
| x = 1 or x = 2}.
Since B(S) is not in S, S is not a closed set.
x
y
S
B(S)
x = 2
x = 1
We can now state the generalization of the Extreme Value Theorem to R
2
.
If f : R
2
R is continuous on a closed and bounded set S R
2
, then there exists Theorem 1
c
1
, c
2
S such that
f(c
1
) f(x) f(c
2
) for all x S.
Proof: The proof is beyond the scope of this course.
Here are some counterexamples to show that the conclusion may not be valid if either
hypothesis is not satised.
a) Let f(x, y) =
_
_
_
1 x
2
y
2
if (x, y) = (0, 0)
0 if (x, y) = (0, 0)
and S = {(x, y) | x
2
+ y
2
1}. EXAMPLE 1
Observe that S is the unit disc and hence is clearly bounded and it is closed since it
contains its boundary, the circle x
2
+ y
2
= 1. But, since f is not continuous at (0, 0),
we see that f does not have a maximum value on S, since f(x, y) has values arbitrarily
close to 1, but never equals 1, for (x, y) S
b) Let f(x, y) = x
2
+ y
2
and S = R
2
.
Clearly f is continuous on S, but since S is not bounded, f does not have a maximum
value on S. The values of f(x, y) can be made arbitrarily large by increasing the values
of x and y.
REMARK
A function f : R
2
R may have an absolute maximum and/or an absolute mini-
mum on a set S R
2
even if the conditions are not satised. We just cannot guarantee
the existence.
Let S = {(x, y) R
2
| x > 1, y R} and let f(x, y) =
_
_
_
1 if (x, y) = (0, 0)
0 if (x, y) = (0, 0).
EXAMPLE 2
Clearly f is not continuous on S and S is neither closed nor bounded. However, clearly
1 is the maximum of f on S and 0 is the minimum.
10.2 Algorithm for Extreme Values
Recall that if f : R R is dierentiable, then the maximum value and minimum value
of f on an interval [a, b] occur either at a critical point of f (i.e. f
(c) = 0) or at an
endpoint of the interval.
This approach can be generalized to f : R
2
R. Let S R
2
be a closed and bounded
set, with boundary B(S) and suppose that f : R
2
R is dierentiable on S. Then the
maximum value and minimum value of f on S occur either at a critical point of f that
is in S, or at a point on the boundary of S. Thus we have the following procedure:
To nd the maximum/minimum value of f : R
2
R on a closed and bounded set Algorithm
extreme values S R
2
,
(1) nd all critical points of f in S
(2) nd the maximum and minimum points of f on the boundary B(S)
Section 10.2 Algorithm for Extreme Values 113
(3) evaluate f at the points found in (1) and (2).
The maximum (minimum) value of f on S is the largest (smallest) of the function
values found in (3).
REMARK
It is not necessary to determine whether the critical points are local maximum or
minimum points.
Find the maximum value of f(x, y) = xy on the set EXAMPLE 3
S = {(x, y) | x
2
+ y
2
1}
Solution: First, we observe that f(x, y) = (y, x) hence the only critical point of f
is (0, 0) which is in S. Next, we nd the maximum value of f on the boundary B(S)
of S. To do this, we describe the boundary (the unit circle x
2
+y
2
= 1) in parametric
form:
x = cos t, y = sin t, 0 t 2.
On B(S), f has the values
g(t) = f(cos t, sin t) = cos t sin t =
1
2
sin 2t
The problem now is to nd the maximum value of
g(t) =
1
2
sin 2t on the interval 0 t 2.
By inspection the maximum value is
1
2
and occurs when t =

4
and
5
4
. Thus the maximum value of f on the boundary B(S)
is
1
2
and occurs at
_
1
2
,
1
2
_
and
_
2
,
1
2
_
.
Finally, since f(0, 0) = 0, the maximum value of f on S is
1
2
, and occurs on the boundary at
_
2
,
1
2
_
.
x
y
S
B(S)
1
2
,
1
2
,
1
max. point
max.
point
Find the maximum of f(x, y) = x
2
y y on the set S = {(x, y) | 9x
2
+ 4y
2
36} . EXERCISE 2
Find the maximum and minimum value of f(x, y) = xy 2x y +2 on the triangular EXAMPLE 4
region S with vertices (0, 0), (2, 0) and (0, 3).
Solution: First, we observe that f = (y 2, x 1) so the only critical point of f is
(1, 2). Since (1, 2) / S, this critical point plays no part in the solution.
The second step is to evaluate f on the boundary B(S) of S. This has to be done on
the three straight line segments separately. The values of f on B(S) dene a function
of one variable, which we denote by g.
1) x = 0, 0 y 3.
Let g(y) = f(0, y) = y + 2. By inspection the maximum
of g on the interval [0, 3] occurs at the end point y = 0, and
the minimum occurs at the end point y = 3. Thus (0, 0) and
(0, 3) are possible maximum and minimum points for f. x
y
S
(2, 0) (0, 0)
(0, 3)
(1, 2)
max.
min.
x
2
+
y
3
= 1
2) y = 0, 0 x 2.
Let g(x) = f(x, 0) = 2x + 2. As in 1) this leads to (0, 0) and (2, 0) as possible
maximum and minimum points for f.
3) y = 3
3
2
x, 0 x 2.
Let g(x) = f(x, 3
3
2
x) =
3
2
x
2
+
5
2
x 1, after simplifying. Find the critical points
of g by solving g
(x) = 0. This gives x =

5
6
. Thus
_
5
6
,
7
4
_
and the end points (0, 3) and
(2, 0) are possible maximum and minimum points of f.
Now evaluate f at all points found above:
f(0, 0) = 2, f(0, 3) = 1, f(2, 0) = 2, f
_
5
6
,
7
4
_
=
1
24
.
Thus the maximum value of f on S is f(0, 0) = 2, and minimum value of f on S is
f(2, 0) = 2.
Find the maximum value of the function f : R
2
R dened by f(x, y) = x
2
y + xy
2
EXERCISE 3
on the triangular region with vertices (0, 0), (0, 1) and (1, 0).
Section 10.3 Optimization with Constraints 115
10.3 Optimization with Constraints
To do step 2 of our algorithm for nding extreme values, we saw that we nd the
maximum and/or minimum of f on the boundary by nding a parametric representa-
tion for the boundary. Of course, in many cases this may be extremely dicult to do.
So, we now derive another algorithm which will allow us to nd the maximum and/or
minimum of f on a curve g(x, y) = k without having to parameterize the curve.
The Method of Lagrange Multipliers
We want to nd the maximum and/or minimum value of f(x, y) subject to the con-
straint g(x, y) = k, or more geometrically, nd the maximum value of f(x, y) on the
curve g(x, y) = k.
Suppose that f(x, y) has a local maximum (or minimum) at (a, b) relative to nearby
points on the curve g(x, y) = k, and suppose that g(a, b) = 0. Then by the Im-
plicit Function Theorem (see appendix 1), g(x, y) = k can be described by parametric
equations
x = p(t), y = q(t) (10.1)
with p and q dierentiable and (a, b) = (p(t
0
), q(t
0
)) for some t
0
. Dene u : R R by
u(t) = f(p(t), q(t)).
The function u gives the values of f on the constraint curve, and hence has a local
maximum (or minimum) at t
0
. It follows that
u
(t
0
) = 0. (10.2)
Assuming f is dierentiable we can apply the Chain Rule to get
u
(t) = f
x
(p(t), q(t))p
(t) + f
y
(p(t), q(t))q
(t).
Evaluating this at t
0
and using (10.2) gives
0 = f
x
(a, b)p
(t
0
) + f
y
(a, b)q
(t
0
).
This can be written as
f(a, b)
_
p
(t
0
), q
(t
0
)
_
= 0. (10.3)
Recall the geometric interpretation of the gradient vector
g(a, b) proven in Theorem 3 of Chapter 7 that g(a, b),
if non-zero, is orthogonal to the level curve g(x, y) = k at
(a, b). Thus, since (p
(t
0
), q
(t
0
)) is the tangent vector to the
constraint curve (10.1) we also have
x
y
g(a, b)
f(a, b)
(a, b)
T
g(x, y) = k
g(a, b)
_
p
(t
0
), q
(t
0
)
_
= 0. (10.4)
Since we are working in two dimensions, equations (10.3) and (10.4) imply that f(a, b)
and g(a, b) have the same direction. In other words, there exists a constant such
that
f(a, b) = g(a, b).
To summarize, we have shown that:
If f(x, y) has a local maximum or minimum at (a, b) relative to points of the curve
g(x, y) = k, then one of the following conditions holds:
1) f(a, b) = g(a, b), for some constant ,
2) g(a, b) = 0.
Case 2 is exceptional, but has to be included since we assumed that g(a, b) = 0 in
the preceding derivation.
This leads to the following procedure, called the method of Lagrange multipliers.
To nd the maximum value and minimum value of a dierentiable function f(x, y) Algorithm
Lagrange
multipliers
subject to the constraint g(x, y) = k, evaluate f(x, y) at all points (a, b) which satisfy
one of the following conditions.
(1) f(a, b) = g(a, b) and g(a, b) = k,
(2) g(a, b) = 0 and g(a, b) = k,
(3) (a, b) is an end point of the curve g(x, y) = k.
The maximum/minimum value of f(x, y) is the largest/smallest value of f obtained at
the points found in (1)-(3).
REMARKS
1. To nd the points (a, b) in (1) we have to solve the system of 3 equations in 3
unknowns
f
x
(x, y) = g
x
(x, y)
f
y
(x, y) = g
y
(x, y)
g(x, y) = k
for x and y. The variable , called the Lagrange multiplier, is not required
and should be eliminated.
2. If the curve g(x, y) = k is unbounded, then one must consider lim
(x,y)
f(x, y)
for (x, y) satisfying g(x, y) = k.
Find the maximum value of the expression 6x + 4y 7 on the ellipse 3x
2
+ y
2
= 28. EXAMPLE 5
Solution: Let f(x, y) = 6x +4y 7, g(x, y) = 3x
2
+y
2
. Then f is the function to be
maximized, and g is the constraint function.
1) f = g, g(x, y) = 28.
Dierentiating gives f = (6, 4), g = (6x, 2y), hence we have to solve the system of
equations
6 = 6x (10.5)
4 = 2y (10.6)
3x
2
+ y
2
= 28 (10.7)
By equation (10.5), x = 0 and so =
1
x
, which when substituted in equation (10.6)
gives y = 2x . We substitute this into (10.7) and solve for x, obtaining x = 2. For
x = 2 we get y = 2(2) = 4, for x = 2 we get y = 2(2) = 4 and thus we obtain
two points, (2, 4) and (2, 4).
2) g(x, y) = (0, 0), g(x, y) = 28.
We have (0, 0) = g(x, y) = (6x, 2y) implies x = y = 0, which does not satisfy the
constraint (10.7). Hence, there are no points in this step.
3) Check end points.
There are no endpoints since the constraint is a closed curve (an ellipse).
We evaluate f at all the points found in the above 3 steps. We get
f(2, 4) = 21, f(2, 4) = 35.
Thus, the maximum value of f(x, y) on 3x
2
+ y
2
= 28 is 21
and occurs at (2, 4).
We can view the result geometrically. The straight lines are
the level curves
f(x, y) = 6x + 4y 7 = k.
Notice that f and g are parallel at the maximum point.
x
y
g
f
(2, 4)
i
n
c
r
e
a
s
i
n
g
f
(
x
,
y
)
f(x, y) = 21
(max. value)
3x
2
+ y
2
= 28
Find the maximum and minimum values of f(x, y) = y on the piriform curve dened EXAMPLE 6
by y
2
+ x
4
x
3
= 0.
Solution: We have f(x, y) = y and constraint g(x, y) = y
2
+ x
4
x
3
= 0.
1) f(x, y) = g(x, y), g(x, y) = 0.
We get f = (0, 1) and g = (4x
3
3x
2
, 2y), so we need to solve
0 = (4x
3
3x
2
) = x
2
(4x 3) (10.8)
1 = (2y) (10.9)
0 = y
2
+ x
4
x
3
(10.10)
Clearly = 0 because of (10.9), so hence (10.8) gives x = 0 or x =
3
4
.
If x = 0, then (10.10) gives y = 0 which does not satisfy (10.9).
If x =
3
4
, then (10.10) gives 0 = y
2
27
256
hence y =
3
3
16
. Hence, we get 2 points
_
3
4
,
3
3
16
_
and
_
3
4
,
3
3
16
_
.
2) g(x, y) = (0, 0), g(x, y) = 0.
g(x, y) = (4x
3
3x
2
, 2y) = (0, 0) when 0 = 4x
3
3x
2
= x
2
(4x3) and when 2y = 0.
So, we get points (0, 0), and (3/4, 0). However, (3/4, 0) is not on the constraint curve,
so we just have one point (0, 0).
The curve is closed so there are no end points.
Evaluating f at all the points found above gives
f
_
3
4
,
3
3
16
_
=
3
3
16
, f
_
3
4
,
3
3
16
_
=
3
3
16
, f(0, 0) = 0
Thus, the maximum value is
3
3
16
at
_
3
4
,
3
3
16
_
and the minimum value is
3
3
16
at
_
3
4
,
3
3
16
_
.
Let R be the region bounded by the curve x =
_
1 y
2
and the y-axis. Find the EXAMPLE 7
maximum and minimum value of f(x, y) = x
2
1
2
x + y
2
on the region R.
Solution: Observe that this is an extreme values on a region problem as in Section
10.2. Thus, we apply our algorithm from Section 10.2.
We rst nd critical points of f inside the region R. We have
f = (2x
1
2
, 2y) = (0, 0) x =
1
4
, y = 0.
Hence, we have one critical point (
1
4
, 0), which is inside the region and f(
1
4
, 0) =
1
16
.
Next, we nd the maximum and minimum of f on the boundary of R. The boundary
has two parts, the y-axis and the semi-circle x =
_
1 y
2
.
For the y-axis, we have x = 0, 1 y 1, so on this line we have
f(0, y) = 0 + y
2
,
which we know has minimum 0 at (0, 0) and maximum 1 and (0, 1).
For the semi-circle, instead of parameterizing it as we did in Section 10.2, we will use
the method of Lagrange multipliers. To make the calculations easier, we simplify the
constraint to x
2
+ y
2
= 1, x 0. Hence, we take g(x, y) = x
2
+ y
2
= 1, x 0.
1) f = g, g(x, y) = 1.
2x
1
2
= (2x) (10.11)
2y = (2y) (10.12)
x
2
+ y
2
= 1, x 0 (10.13)
From (10.12) we see that y = 0 or = 1.
If y = 0, then (10.13) gives x = 1 (since x 0). We can pick a value of so that
(10.11) is also satised so (1, 0) is a point.
If = 1 then (10.11) is 2x
1
2
= 2x, which has no solutions, so we get no points.
2) g = (0, 0), g(x, y) = 1.
We have g = (2x, 2y) = (0, 0) only if x = 0 and y = 0, but this does not satisfy the
constraint so no points.
The constraint curve is a semi-circle, so we have end points when x = 0, so at (0, 1)
and (0, 1).
Putting all the points into f gives
f(1, 0) =
1
2
, f(0, 1) = 1, f(0, 1) = 1
Thus, on the semi-circle the maximum of f is 1 at (0, 1) and the minimum of f is is
1
2
at (1, 0).
Comparing the values of f found at the critical points inside the region and on the
boundary of R we nd that the maximum of f on R is 1 at (0, 1) and the minimum
of f is
1
16
at (
1
4
, 0).
Find the maximum value of xy on the circle x
2
+ y
2
= 1. Sketch the constraint curve EXERCISE 4
and some level curves of xy showing the gradient vectors at the maximum point.
Find the maximum and minimum value of F(x, y) = x
2
+ 2x + y
2
subject to the EXERCISE 5
constraint x
2
+ 4y
2
24.
Functions of Three Variables
What works for f(x, y) usually works for f(x, y, z). Thus, we generalize the previous
algorithm.
To nd the maximum (minimum) value of f(x, y, z) subject to the constraint g(x, y, z) =
k, evaluate f(x, y, z) at all points (a, b, c) which satisfy one of the following:
(1) f(a, b, c) = g(a, b, c) and g(a, b, c) = k,
(2) g(a, b, c) = (0, 0, 0) and g(a, b, c) = k,
(3) (a, b, c) is a boundary point of the surface g(x, y, z) = k.
The maximum/minimum value of f(x, y, z) is the largest/smallest value of f obtained
from all points found in (1)-(3).
REMARK
If condition (1) in the algorithm holds, it follows that the level surface f(x, y, z) =
f(a, b, c) and the constraint surface g(x, y, z) = k are tangent at the point (a, b, c),
since their normals coincide (see Theorem 4 in Section 7.3).
Find the point on the sphere x
2
+ y
2
+ z
2
= 1 which is closest to the point (1, 2, 2). EXAMPLE 8
Solution: We want to minimize the distance between the point (1, 2, 2) and a point
(x, y, z) on the given sphere. To simplify things, we consider the square of this distance,
which is given by the function
f(x, y, z) = (x 1)
2
+ (y 2)
2
+ (z 2)
2
.
The constraint is g(x, y, z) = x
2
+ y
2
+ z
2
= 1.
1) f = g, g(x, y, z) = 1.
2(x 1) = 2x (10.14)
2(y 2) = 2y (10.15)
2(z 2) = 2z (10.16)
x
2
+ y
2
+ z
2
= 1. (10.17)
Observe that (10.14), (10.15), and (10.16) give that x = 0, y = 0, and z = 0. Hence,
solving these equations for and setting them equal to each other gives
x 1
x
=
y 2
y
=
z 2
z
.
Looking at each pair, we nd that y = 2x, z = 2x, and thus y = z. Putting these into
the constraint (10.17) gives two points,
_
1
3
,
2
3
,
2
3
_
and
_
1
3
,
2
3
,
2
3
_
.
2) g = (0, 0, 0), g(x, y, z) = 1.
We have g = (0, 0, 0) implies x = y = z = 0, which does not satisfy the constraint.
3) Endpoints
There are no boundary points since the constraint is a closed surface.
Evaluating f at all the points found above gives
f
_
1
3
,
2
3
,
2
3
_
= 4, f
_
1
3
,
2
3
,
2
3
_
= 16.
Thus, the point
_
1
3
,
2
3
,
2
3
_
is the point on the sphere x
2
+ y
2
+ z
2
= 1 that is closest to
the point (1, 2, 2).
REMARK
Keep in mind the geometric interpretation. The level sets f(x, y, z) = k are spheres
centred on the point (1, 2, 2). The minimum distance occurs when one of the spheres
touches (i.e. is tangent to) the constraint surface which is the sphere g(x, y, z) = 1. At
the point of tangency the normals are parallel, i.e. f = g.
Find the points on the surface z
2
= xy + 1 that are closest to the origin. EXERCISE 6
Generalization
The method of Lagrange multipliers can be generalized to functions of n variables
f : R
n
R and with r constraints of the form
g
1
(x) = 0, g
2
(x) = 0, . . . , g
r
(x) = 0 (10.18)
where x = (x
1
, . . . , x
n
).
In order to nd the possible maximum and minimum points of f subject to the con-
straints (10.18), one has to nd all points a = (a
1
, . . . , a
n
) such that
f(a) =
1
g
1
(a) + +
r
g
r
(a), and g
i
(a) = 0, 1 i r.
The scalars
1
, . . . ,
r
are the Lagrange multipliers. When r = 1, and n = 2 or 3, this
reduces to the previous algorithms.
Part II
Mappings
Chapter 11
Coordinate Systems
A coordinate system is a system for representing the location of a point in a space
by an ordered n-tuple. We call the elements of the n-tuple the coordinates of the
point.
We are used to using the Cartesian coordinate system in which the location of the
point is represented by the directed distance from a set of perpendicular axes which
all intersect at a point O. However, you may also be used to other coordinate systems.
For example, the geographic coordinate system represents location on the earth by
longitude, latitude and altitude.
We will now look at three other important coordinate systems.
11.1 Polar Coordinates
As in all coordinate systems, we must have a frame of reference
for our coordinate system. So, in a plane we choose a point O
called the pole (or origin). From O we draw a ray called the
polar axis. Generally, the polar axis is drawn horizontally to
the right to match the positive x-axis in Cartesian coordinates.
O
P(r, )
r
polar axis
Let P be any other point in the plane. We will represent the position of P by the
ordered pair (r, ) where r 0 is the length of the line OP and is the angle between
the polar axis and OP. We call r and the polar coordinates of P.
Section 11.1 Polar Coordinates 125
REMARKS
1. We assume, as usual, that an angle is considered positive if measured in the
counterclockwise direction from the polar axis and negative if measured in the
clockwise direction.
2. We represent the point O by the polar coordinates (0, ) for any value of .
3. We are restricting r to be non-negative to coincide with the interpretation of r
as distance. Many textbooks do not put this restriction on r.
4. Since we use the distance r from the pole in our representation, polar coordinates
are suited for solving problems in which there is symmetry about the pole.
Plot the points (1,

4
) and (2,
5
6
) in polar coordinates. EXAMPLE 1
Solution:
O
4
_
1,

4
_
1
polar axis
O
5
6
_
2,
5
6
_
2
polar axis
There is one important dierence between polar coordinates and Cartesian coordinates.
In Cartesian coordinates each point has a unique representation (x, y). However, ob-
serve that a point P can have innitely many representations in polar coordinates. In
particular
(r, ) = (r, + 2k), k Z.
Relationship to Cartesian Coordinates
If we now place the pole O at the origin of the Cartesian plane and lie the polar axis
along the positive x-axis, we can nd a relationship between the coordinates of a point
P in the two coordinate systems. In particular, we see from the diagram that
x = r cos , r =
_
x
2
+ y
2
,
y = r sin , tan =
y
x
. (11.1)
x
y
x
y
O
(x, y) = (r, )
r
126 Chapter 11 Coordinate Systems
Convert the point (1, 1) in Cartesian coordinates into polar coordinates. EXAMPLE 2
Solution: We have x = 1 and y = 1, so r =
1
2
+ 1
2
=
2 and tan = 1. Since

x and y are both positive the point is in quadrant 1, and hence =

4
+ 2k, k Z.
Therefore, we get the polar coordinate representations (
2,

4
+ 2k), k Z.
Often we do not need to nd all possible polar representations for a point. Thus, we
further restrict ourselves to a range of (such as 0 < 2 or < ) which
gives unique representation.
Convert the point (1,
3) in Cartesian coordinates into polar coordinates with 0 EXAMPLE 3

< 2.
Solution: We have x = 1 and y =
3, so r =
_
(1)
2
+ (
3)
2
= 2 and tan =
3. Since is in the second quadrant we get =

2
3
. Hence the point has polar
representation (2,
2
3
).
Convert the points (2,
3
) and (1,
3
4
) from polar coordinates to Cartesian coordinates. EXAMPLE 4
Solution: We have x = 2 cos(
3
) = 1 and y = 2 sin(
3
) =
3. Hence the point is

(1,
3) in Cartesian coordinates.
We have x = cos
3
4
=
1
2
and y = sin
3
4
=
1
2
. So, the point has Cartesian
coordinates (
1
2
,
1
2
).
Graphs in Polar Coordinates
The graph of an explicitly dened polar equation r = f(), = f(r) or an implicitly
dened polar equation f(r, ) = 0, is a curve that consists of all points that have at
least one polar representation (r, ) that satises the equation of the curve.
Sketch the polar equation r = 1.
Solution: This is the curve which consists of all
points (r, ) = (1, ), R. Observe that this is all
points that have distance 1 from the origin. Hence,
we get a circle of radius 1.
x
y
1
EXAMPLE 5
Sketch the polar equation =

4
. EXERCISE 1
Sketch the polar equation r = 1 + sin . EXAMPLE 6
Solution: To sketch this equation we rst sketch the
curve in Cartesian coordinates in the r-plane and
use this graph to plot points in the xy-plane. Observe
from the diagram to the right that as increases from
0 to

2
the radius increases from 1 to 2. Then when
increases from

2
to the radius decreases from 2
r
1
2
2

3
2
2
to 1. As increases from to
3
2
we get the radius decreases from 1 to 0, and as
increases from
3
2
to 2 the radius increases from 0 to 1. Each of these steps are shown
below. The nal curve is called a cardioid.
x
y
1
2
1
x
y
1
2
1
x
y
1
2
1
x
y
1
2
1
Sketch the polar equation r = cos . EXAMPLE 7
Solution: Sketching the curve in Cartesian coordinates
in the r-plane gives us the diagram to the right. Then
we see that as increases from 0 to

2
the radius decreases
from 1 to 0. For values of from

2
to
3
2
the radius is
negative, thus we do not draw any points since we have
made the restriction that r 0. As moves from
3
2
to
x
y
1
_
1
2
, 0
_
1
2
2 the radius increases from 0 to 1.
Sketch the polar equations r = sin and r = 1 2 cos . EXERCISE 2
We have seen above that we can use equations (11.1) to convert points between the
coordinate systems. Thus, we can also use these equations to convert equations of
curves between the two systems.
Convert the equation r = cos to Cartesian coordinates. EXAMPLE 8
Solution: Since r
2
= x
2
+ y
2
and x = r cos , we get
r = cos
r
2
= r cos
x
2
+ y
2
= x
(x
1
2
)
2
+ y
2
=
1
4
.
Which is a circle of radius
1
2
centred at
_
1
2
, 0
_
as we drew in Example 7.
Convert the equation of the curve (x
2
+ y
2
)
3/2
= 2xy to polar coordinates. EXAMPLE 9
Solution: Since x = r cos and y = r sin we get
(x
2
+ y
2
)
3/2
= 2xy
r
3
= 2(r cos )(r sin )
r
3
= r
2
sin 2
We can simplify this equation to r = sin 2 since the pole is still included in the graph
by taking = . Moreover, observe that since we have the restriction r 0, that we
must also have sin 2 0. Hence, we nd that a domain of the function is 0

2
,

3
2
.
Convert the equation of the curve x
2
y
2
= 1 to polar coordinates. EXERCISE 3
Area in Polar Coordinates
We now wish to derive the formula for computing area between curves in Polar
coordinates. Clearly this will be a little dierent than before as it does not make sense
to use rectangles to nd our area. In Polar coordinates, it is natural to use sectors of
a circle.
Recall that if
1
and
2
,
2
>
1
, are two angles in a circle of radius r, then the area
between them is

2
1
2
r
2
=
1
2
r
2
(
2
1
).
We now derive the area as before. We divide the region bounded by = a, = b and
r = f() into subregions
0
, . . . ,
n
of equal dierence , then for each subregion
i
,
0 i < n we pick some point
i
with
i

i

i+1
. We then form the sector between
i
and
i+1
with radius f(
i
). The area of this sector is
1
2
[f(
i
)]
2
.
Hence the area is approximately
n1
i=0
1
2
[f(
i
)]
2
Thus as we let the number of subdivisions go to innity

and hence letting each of the
i
tend to 0 we get
A = lim
i
0
n1
i=0
1
2
[f(
i
)]
2
=
_
b
a
1
2
[f()]
2
d
O
r = f()
i+1
f(
i
)
= a
= b
polar axis
Find the area inside the circle r = a. EXAMPLE 10
We need to range from 0 to 2 to make the whole circle so we have
A =
_
2
0
1
2
a
2
d =
1
2
a
2
[2 0] = a
2
.
Find the area inside the lemniscate r = 2
sin 2. EXERCISE 4
To nd the area between two curves in Polar coordinates, we use the same method
we used for doing this in Cartesian coordinates.
1. Find the points of intersections by setting the equations equal to each other.
2. Graph the curves and split the desired region into easily integrable regions.
3. Integrate.
Find the area inside r = 2 sin(2), but outside r = 1. EXAMPLE 11
Setting the curves equal to each other we get 1 = 2 sin(2) hence 2 =

6
or 2 =
5
6
.
From the picture we see that we want to integrate over the region

12
to
5
12
, and we
will nd the area inside the lemniscate and subtract o
the area inside the circle and multiply by 2 for the sym-
metric regions. Hence we get
A = 2
_
_
5/12
/12
1
2
(2 sin(2))
2
d
_
5/12
/12
1
2
(1)
2
d
_
= =

3
+
3
2
x
y
r = 2 sin(2)
r = 1
Find the area between the curves r = cos and r = sin . EXERCISE 5
Section 11.2 Cylindrical Coordinates 131
11.2 Cylindrical Coordinates
Observe that we can extend polar coordinates to 3-dimensional
space by introducing another axis, called the axis of symme-
try, through the pole perpendicular to the polar plane. We
then represent any point in the space by the cylindrical coor-
dinates (r, , z) where r and are as in polar coordinates and
z is the height above (or below) the polar plane. Thus, as in
Polar coordinates, we have the restrictions r 0, 0 < 2
(or < ).
z
O
r
z
(r, , z)
REMARK
Notation for cylindrical coordinates may vary from author to author. In particular,
in the sciences they generally use the Standard ISO 31-11 notation which gives the
cylindrical coordinates as (, , z).
If we place the pole at the origin and the polar axis along the positive x-axis as in
polar coordinates and place the axis of symmetry along the z-axis we then can relate
points in cylindrical and Cartesian coordinates by
x = r cos , r =
_
x
2
+ y
2
,
y = r sin , tan =
y
x
, (11.2)
z = z, z = z
x
y
z
r
z
(x, y, z) = (r, , z)
REMARK
Cylindrical coordinates are useful when there is symmetry about an axis. Thus, it
is some times desirable to lie the polar axis and axis of symmetry along dierent axes.
Convert (1, 1, 3) and (1,
3, 1) from Cartesian coordinates to cylindrical coordinates. EXAMPLE 12

Solution: We have r =
1
2
+ 1
2
=
2, tan = 1 which gives =

4
and z = 3. Thus,
in cylindrical coordinates the point is (
2,

4
, 3).
We have z = 1, r =
_
1
2
+ (
3)
2
= 2, tan =

3
1
which gives =
5
3
since is in
the fourth quadrant. Hence, in cylindrical coordinates the point is (2,
5
3
, 1).
Convert (2, 0, 0) and (0, , 2) from cylindrical coordinates to Cartesian coordinates. EXAMPLE 13
Solution: We are given r = 0, = 0 and z = 0. Hence, x = 2 cos 0 = 2, y = 2 sin0 = 0
and z = 0, so we get the Cartesian point (2, 0, 0).
The points has coordinates r = 0, = and z = 2. Since, r = 0 we get x = y = 0 and
so the point in Cartesian coordinates is (0, 0, 2).
Graphs in Cylindrical Coordinates
As with functions z = f(x, y), the graphs of functions z = f(r, ), or more generally,
f(r, , z) = 0 are surfaces in R
3
.
Sketch the graph of r = 1 in cylindrical coordinates.
Solution: We know that r = 1 gives a circle of radius
1 in polar coordinates. Thus, in cylindrical coordinates
we have a circle of radius 1 at any value of z. Hence, we
have an innite cylinder of radius 1.
x
y
z
r = 1
EXAMPLE 14
Sketch the graph of z = r
2
in cylindrical coordinates. EXERCISE 6
As we did in polar coordinates, we can also transform the equations of curves between
the coordinates systems.
Section 11.3 Spherical Coordinates 133
Convert the equation z = r
2
cos to Cartesian coordinates. EXAMPLE 15
Solution: Using (11.2) we get z = x
_
x
2
+ y
2
.
Find the equation of z =
y
_
x
2
+ y
2
in cylindrical coordinates. EXERCISE 7
11.3 Spherical Coordinates
In 2-dimensional space, we saw that polar coordinates were useful for problems which
where symmetric about the origin. We now extend this idea to another 3-dimensional
coordinate system called spherical coordinates.
As we did in cylindrical coordinates, we will use the pole O
and polar axis from polar coordinates and draw another axis z
perpendicular to the polar plane.
Let P be any point in 3-dimensional space. We will represent P
by the coordinates (, , ) where 0 is the length of the line
OP, is the same angle as in cylindrical coordinates, and is
the angle between the positive z-axis and the line OP.
z
O
(, , )
Since we are keeping the same interpretation of from cylindrical coordinates, it tells
us the orientation of P around the z-axis. Therefore, we only want to indicate the
tilt of the point with the z-axis. So, we restrict 0 .
Thus, our restrictions in spherical coordinates are 0, 0 < 2 (or < )
and 0 .
REMARK
The symbols used for spherical coordinates also vary from author to author as does
the order in which they are written. In mathematics, it is not uncommon to nd
replaced by r. The standard ISO 31-11 convention uses as the polar angle and as
the angle with the positive z-axis. Therefore, it is very important to understand which
notation is being used when reading an article.
From the diagram, we see that we can convert between Carte-
sian coordinates and spherical coordinates by the equations
x = sin cos , =
_
x
2
+ y
2
+ z
2
,
y = sin sin , tan =
y
x
, (11.3)
z = cos , cos =
z
_
x
2
+ y
2
+ z
2
x
y
z
(x, y, z) = (, , )
Convert (
1
2
,
1
2
,
3) and (1, 1, 1) from Cartesian coordinates to spherical coor- EXAMPLE 16

dinates.
Solution: We have =
_
(
1
2
)
2
+ (
1
2
)
2
+ (
3)
2
= 2, tan = 1 =

4
since is in
the rst quadrant and cos =
3
2
=

6
. Hence, in spherical coordinates the point
is (2,

6
,

4
).
We get =
_
(1)
2
+ (1)
2
+ (1)
2
=
3, tan = 1 =
5
4
since is the third
quadrant, and cos =
1
3
. Thus, the point in spherical coordinates is (
3, arccos(
1
3
),
5
4
)
Convert (1,

4
,

4
), (1,

4
,
5
4
), and (1,
3
4
,

4
) from spherical coordinates to Cartesian co- EXAMPLE 17
ordinates.
Solution: We get x = sin

4
cos

4
=
1
2
, y = sin

4
sin

4
=
1
2
, and z = cos

4
=
1
2
.
Therefore, the point has Cartesian coordinates (
1
2
,
1
2
,
1
2
).
We get x = sin

4
cos
5
4
=
1
2
, y = sin

4
sin
5
4
=
1
2
, and z = cos

4
=
1
2
. Therefore,
the point has Cartesian coordinates (
1
2
,
1
2
,
1
2
).
We get x = sin
3
4
cos

4
=
1
2
, y = sin
3
4
sin

4
=
1
2
, and z = cos
3
4
=
1
2
. Therefore,
the point has Cartesian coordinates (
1
2
,
1
2
,
1
2
).
Observe from the above examples, how controls which quadrant the point is in (its
rotation around the z-axis) and only controls whether the point will be above or
below the xy-plane.
Section 11.3 Spherical Coordinates 135
Graphs in Spherical Coordinates
As with cylindrical coordinates, the graph of a function f(, , ) = 0 in spherical
coordinates gives a surface in R
3
.
Sketch = 2.
Solution: Observe that this is the graph with all points
2 units from the origin. Hence, it is a sphere of radius 2.
EXAMPLE 18
Sketch =

4
.
Solution: First imagine a line which makes a

4
angle
with the positive z-axis. Since there is no restriction on
, the graph of the surface will be this line rotated around
the positive z-axis. Hence, we get a cone.
EXAMPLE 19
As with the other coordinate systems, we also want to convert equations between
Cartesian and spherical coordinates.
Convert = sin cos to Cartesian coordinates. EXAMPLE 20
Solution: We rst multiply both sides of the equation by to get
2
= sin cos .
Hence, we can apply (11.3) to get
x
2
+ y
2
+ z
2
= x
(x
1
2
)
2
+ y
2
+ z
2
=
1
4
Convert z
2
= x
2
+ y
2
to spherical coordinates. EXAMPLE 21
Solution: We have
2
cos
2
=
2
sin
2
cos
2
+
2
sin
2
sin
2
cos
2
= sin
2
(cos
2
+ sin
2
)
tan
2
= 1
Thus tan = 1, so we get =

4
or =
3
4
. Observe that =

4
is the top-half of
the cone (as in Example 19) and =
3
4
is the bottom-half of the cone.
Convert x
2
+ y
2
+ z
2
= 2x to spherical coordinates. EXERCISE 8
Chapter 12
Mappings of R
2
into R
2
In part I, we studied scalar-valued functions, that is, functions which map R
2
, or more
generally, R
n
into R. We now extend the ideas of dierential calculus to vector-valued
functions.
You have already worked with the simplest type of vector-valued function. Consider
parametric equations x = f(t), y = g(t) for a curve in R
2
: These two scalar equations
can be written as a vector equation
x = F(t),
where
x = (x, y), F(t) = (f(t), g(t)).
x
y
t
t a b
F
F(t)
F(a)
F(b)
The function F maps t to F(t). The domain of F is in R and its range is in R
2
, and
so F is a vector-valued function.
We now consider vector-valued functions whose domain is in R
2
and whose range is in
R
2
. We shall nd that matrix algebra plays an important role.
138 Chapter 12 Mappings of R
2
into R
2
12.1 The Geometry of Mappings
A pair of equations
u = f(x, y)
v = g(x, y)
associates with each point (x, y) R
2
a single point (u, v) R
2
, and thus denes a
vector-valued function
F : R
2
R
2
.
If we write x = (x, y) and u = (u, v), then the equations can be written as a vector
equation
u = F(x)
where
F(x) = (f(x), g(x)) R
2
.
We shall refer to a vector-valued function F : R
2
R
2
as a mapping
1
of R
2
into R
2
.
The scalar functions f and g are called the component functions of the mapping.
Mappings of R
2
into R
2
(more generally R
n
into R
n
) have many applications, such
as dening curvilinear coordinate systems (e.g. polar coordinates), and performing
a change of variables in multiple integrals (see part III). They are used in applied
mathematics, in statistics, and in computer graphics for simplifying problems in two
or more variables.
In general, if a mapping F : R
2
R
2
acts
on a curve C in its domain, it will deter-
mine a curve in its range, denoted by F(C)
and called the image of C under F. x
y
u
v
F
curve C
image F(C)
More generally, if a mapping F : R
2
R
2
acts on any subset S in its domain it will
determine a set F(S) in its range, called
the image of S under F.
x
y
u
v
F
set S
image F(S)
1
Mappings are also referred to as transformations.
Section 12.1 The Geometry of Mappings 139
In order to develop an intuitive geometric understanding of a mapping F : R
2
R
2
,
it is helpful to determine the images of dierent curves and sets under the mapping.
In general, a mapping will deform a given curve or set.
Find the images of the lines x = k, and y = , under the mapping F : R
2
R
2
dened EXAMPLE 1
by
(u, v) = F(x, y) =
_
1
2
(x + y),
1
2
(x + y)
_
.
Find the image of the square S = {(x, y) | |x| 1, |y| 1} under F.
Solution: Substituting x = k into the equation of the mapping gives
u =
1
2
(k + y)
v =
1
2
(k + y).
We want equations of curves in the uv-plane, so we eliminate y to obtain
u v = k,
which is the equation of the image of the line x = k. In a similar manner (exercise), it
follows that
u + v =
is the image of the line y = .
To determine the image of S under F, we nd the image of each of the boundary lines.
In particular, by choosing k = 1 and = 1, we obtain the images of the sides of
the square S.
x
y
u
v
x = 1 x = 1
y = 1
y = 1
u + v = 1
u + v = 1
u v = 1
u v = 1
2
into R
2
Observe that the mapping in Example 1 is linear. For any linear mapping, the image
of a straight line in the xy-plane is a straight line in the uv-plane. However, we see
from the image of S under F that the lines are contracted and rotated by F.
Find the image of the circle (x 1)
2
+ y
2
= 1 under the mapping F dened in EXERCISE 1
Example 1.
Find the image of the rectangle EXAMPLE 2
R =
_
(r, ) | 1 r 2,

4

3
4
_
under the mapping F : R
2
R
2
from polar coordinates to Cartesian coordinates
dened by
(x, y) = F(r, ) = (r cos , r sin ).
Solution: To nd the image of the rectangle, we will nd the image of each of the
boundary lines under F.
For the line r = 1,

4

3
4
we get
x = cos
y = sin
for

4

3
4
. In this case, we dont need to eliminate since we recognize these are
parametric equations of a circle of radius 1, since they imply
x
2
+ y
2
= 1.
Thus, the image is the part of the unit circle with

4

3
4
.
Similarly, we see that the line r = 2,

4

3
4
gives the part of the circle of radius
2 for which

4

3
4
.
The image of a line =

4
, 1 r 2 is
x = r cos

4
=
1
2
r
y = r sin

4
=
1
2
r
Section 12.2 The Geometry of Mappings 141
for 1 r 2. Eliminating r gives y = x. Moreover, we have that r =
2x and hence
1 r 2 gives that x has values from
1
2x 2
1
2
x
2.
Similarly, for the line =
3
4
, 1 r 2 we get
x = r cos
3
4
=
1
2
r
y = r sin
3
4
=
1
2
r
for 1 r 2. Thus, the image is the line y = x with x values
2 x
1
2
.
x
y
r
F
1 2
4
3
4
x
2
+ y
2
= 1 x
2
+ y
2
= 4
y = x y = x
REMARKS
1. Observe that each of the images are exactly what we would get if we sketched
the equations as in chapter 11.
2. The mapping from polar coordinates to Cartesian coordinates is non-linear. The
image of a straight line is not necessarily a straight line.
Find the image of the square S = {(x, y) | 1 x 2, 2 y 3} under the mapping EXERCISE 2
F : R
2
R
2
dened by
(u, v) = F(x, y) = (xy, y).
2
into R
2
12.2 The Linear Approximation of a Mapping
Consider a mapping F : R
2
R
2
dened by u = f(x, y), v = g(x, y). We assume that
F has continuous partial derivatives. By this we mean that the component functions
f and g have continuous partial derivatives.
The image of a point (a, b) in the xy-plane is the point (c, d) in the uv-plane, where
c = f(a, b), d = g(a, b).
As usual, we want to approximate the image (c + u, d + v) of a nearby point
(a + x, b + y).
x
y
u
v
F
F
(a, b)
(a + x, b + y)
(c, d)
(c + u, d + v)
We do this by using the linear approximation formula for f(x, y) and g(x, y) separately.
We get
u
f
x
(a, b)x +
f
y
(a, b)y,
v
g
x
(a, b)x +
g
y
(a, b)y,
for x and y suciently small. This can be written in matrix form as:
_
u
v
_
_
f
x
(a, b)
f
y
(a, b)
g
x
(a, b)
g
y
(a, b)
__
x
y
_
,
where the product on the right side of the equation is matrix multiplication.
Observe that this resembles our usual form of the linear approximation formula where
the 2 2 matrix is taking the place of the derivative. Thus, we make the following
denition.
Section 12.2 The Linear Approximation of a Mapping 143
The derivative matrix of a mapping F : R
2
R
2
dened by Denition
derivative matrix
F(x, y) = (f(x, y), g(x, y)),
is denoted DF and dened by
DF =
_
f
x
f
y
g
x
g
y
_
.
If we introduce the column vectors
u =
_
u
v
_
, x =
_
x
y
_
then the linear approximation formula for mappings becomes
u DF(a)x,
for x suciently small.
The geometrical interpretation of the linear approximation for mappings is this: the
derivative matrix DF(a) acts as a linear mapping on the displacement vector x to
give an approximation of the image u of the displacement under F.
x
y
u
v
a
F
F(a)
DF(a)x
x
u
Consider the mapping F : R
2
R
2
dened by EXAMPLE 3
(u, v) = F(x, y) =
_
x +
_
x
2
+ y
2
, x +
_
x
2
+ y
2
_
.
Use the linear approximation to estimate the image of the point (3.02, 3.99) under F.
Solution: The derivative matrix of F is
DF(x, y) =
_
_
1 +
x
x
2
+y
2
y
x
2
+y
2
1 +
x
x
2
+y
2
y
x
2
+y
2
_
_
.
2
into R
2
As a reference point choose a = (3, 4). The image point is
F(3, 4) = (2, 8).
The displacement in the uv-plane is approximated by
_
u
v
_
DF(3, 4)
_
x
y
_
=
_
2
5
4
5
8
5
4
5
__
0.02
0.01
_
=
_
0.016
0.024
_
.
Thus
F(3.02, 3.99) (2, 8) + (0.016, 0.024) = (1.984, 8.024)
The calculator value is (1.98405, 8.02405).
2
R
2
dened by EXERCISE 3
(u, v) = F(x, y) =
_
ln(x + y), ln(x y)
_
.
Approximate the image of the point (0.95, 0.1) under F.
Generalization
A mapping F : R
n
R
m
is dened by a set of m component functions:
u
1
= f
1
(x
1
, . . . , x
n
)
.
.
.
u
m
= f
m
(x
1
, . . . , x
n
).
Or, in vector notation
u = F(x) = (f
1
(x), f
m
(x)), x R
n
.
We assume that F has continuous partial derivatives. Then, the derivative matrix of
F is the mn matrix dened by
DF(x) =
_
_
f
1
x
1

f
1
xn
.
.
.
.
.
.
fm
x
1

fm
xn
_
_
.
Section 12.3 Composite Mappings and the Chain Rule 145
As expected, the linear approximation formula for F at a is
u DF(a)x,
where
u =
_
_
u
1
.
.
.
u
m
_
_
R
m
, x =
_
_
x
1
.
.
.
x
n
_
_
R
n
.
12.3 Composite Mappings and the Chain Rule
The next step in developing the theory of mappings is to study the composition of two
mappings.
Consider successive mappings F and G of R
2
into R
2
, dened by
F :
_
p = p(u, v)
q = q(u, v)
, G :
_
u = u(x, y)
v = v(x, y)
(12.1)
x
y
u
v
p
q
F
G
F G
The composite mapping F G, dened by
_
_
_
p = p
_
u(x, y), v(x, y)
_
q = q
_
u(x, y), v(x, y)
_
, (12.2)
maps the xy-plane directly into the pq-plane.
The question is this: how is the derivative matrix D(F G) of the composite mapping
related to the derivative matrices DF and DG of the individual mappings?
The answer is: D(F G)(x) is the matrix product of DF(u) and DG(x), where
u = G(x).
We state this formally in the following theorem.
2
into R
2
Chain Rule in Matrix Form Theorem 1
Consider G : R
2
R
2
and F : R
2
R
2
. If G has continuous partial derivatives at
x and F has continuous partial derivatives at u = G(x), then the composite mapping
F G has continuous partial derivatives at x and
D(F G)(x) = DF(u)DG(x).
Proof: The matrix equation that we wish to verify can be written in terms of partial
derivatives, using equations (12.1) and (12.2):
_
p
x
p
y
q
x
q
y
_
=
_
p
u
p
v
q
u
q
v
__
u
x
u
y
v
x
v
y
_
. (12.3)
The (1, 1) entry of this matrix equation is:
p
x
=
p
u
u
x
+
p
v
v
x
,
which follows from equation (12.2) using the Chain Rule for real-valued functions. The
equality of the other entries follows similarly. The partial derivatives of the composite
mapping are continuous at x by the Continuity Theorems.
Consider the mappings G and F dened by EXAMPLE 4
(u, v) = G(x, y) = (xy, x + y)
(p, q) = F(u, v) = (u v, u
2
)
Form the composite mapping F G and nd the derivative matrices DG, DF, and
D(F G). Verify the Chain Rule formula.
Solution: The composite mapping is
(p, q) = F(G(x, y)) = F(xy, x + y) = (xy x y, x
2
y
2
).
The derivative matrices are:
DG(x) =
_
y x
1 1
_
, DF(u) =
_
1 1
2u 0
_
, D(F G)(x) =
_
y 1 x 1
2xy
2
2x
2
y
_
.
Section 12.3 Composite Mappings and the Chain Rule 147
Form the matrix product,
DF(u)DG(x) =
_
1 1
2u 0
__
y x
1 1
_
=
_
y 1 x 1
2uy 2ux
_
=
_
y 1 x 1
2xy
2
2x
2
y
_
, on substituting u = xy
= D(F G)(x), as required.
Consider the maps F : R
2
R
2
and G : R
2
R
2
dened by EXERCISE 4
F(u, v) = (u
2
v, e
uv1
), G(x, y) = (
_
2x
2
+ 2y
2
, 2x + y
2
).
a) Use the chain rule in matrix form to nd the derivative matrix D(F G).
b) Calculate D(G F)(1, 1).
c) Use the linear approximation of mappings to approximate the image of
(u, v) = (1.01, 0.98) under G F.
Chapter 13
Jacobians and Inverse Mappings
13.1 The Inverse Mapping Theorem
2
R
2
, dened by (u, v) = F(x, y). Our goal now is to nd
a condition which will guarantee that F has an inverse. We start by dening inverse
mappings in the expected way.
A mapping F : R
2
R
2
is said to be one-to-one on a subset D
xy
R
2
if and only if Denition
one-to-one
F(a) = F(b) implies a = b, for all a, b D
xy
.
(x
1
, y
1
)
(x
2
, y
2
)
(u
1
, v
1
)
(u
2
, v
2
)
(x
1
, y
1
)
(x
2
, y
2
)
(u
1
, v
1
) = (u
2
, v
2
)
F
F
F
F
D
xy
D
xy
D
uv
D
uv
F is one-to-one F is not one-to-one
Suppose that F is one-to-one on D
xy
, and that the image of D
xy
under F is D
uv
R
2
. Denition
inverse mapping Then F has an inverse mapping F
1
which maps D
uv
onto D
xy
such that
(x, y) = F
1
(u, v) if and only if (u, v) = F(x, y).
Section 13.1 The Inverse Mapping Theorem 149
As usual, we have
(F
1
F)(x) = x for all x D
xy
. (13.1)
Recall, for f : R R that if f
(x) > 0 for all x [a, b], then f has an inverse on [a, b].
Thus, for a mapping F : R
2
R
2
, it makes sense to investigate the relation between
the derivative matrix DF of F and F being invertible. We start with the following
theorem.
2
R
2
which maps D
xy
onto D
uv
. If F has continuous Theorem 1
partial derivatives at x D
xy
and there exists an inverse mapping F
1
of F which has
continuous partial derivatives at u = F(x) D
uv
then
DF
1
(u)DF(x) = I.
Proof: It follows from equation (13.1) that
D(F
1
F)(x) = Dx.
By the Chain Rule in matrix form, the left hand side is
D(F
1
F)(x) = DF
1
(u)DF(x).
The right hand side is
Dx =
_
x
x
x
y
y
x
y
y
_
=
_
1 0
0 1
_
,
as required.
2
R
2
dened by EXAMPLE 1
(u, v) = F(x, y) = (y + x
2
, x)
Solve for the inverse mapping F
1
. Find the derivative matrices DF and DF
1
and
verify that DF
1
(u) is the matrix inverse of DF(x).
Solution: The inverse mapping is obtained by solving
u = y + x
2
, v = x
for x and y. We obtain
x = v, y = u v
2
.
150 Chapter 13 Jacobians and Inverse Mappings
Thus the inverse mapping is
(x, y) = F
1
(u, v) = (v, u v
2
).
The derivative matrices are:
DF(x) =
_
2x 1
1 0
_
, DF
1
(u) =
_
0 1
1 2v
_
.
Form the matrix product,
DF
1
(u)DF(x) =
_
0 1
1 2v
__
2x 1
1 0
_
=
_
1 0
2x 2v 1
_
=
_
1 0
0 1
_
, on substituting v = x.
REMARK
The fact that we could solve and obtain a unique solution for x and y in the
preceding example proves that F has an inverse mapping on R
2
. It is only in simple
examples that one can carry out this step. Hence it is useful to develop a test to
determine if a mapping F has an inverse mapping.
The determinant of the derivative matrix plays an important role in the study of
mappings and in their application to multiple integrals. It is thus given a special
name, the Jacobian of the mapping.
The Jacobian of a mapping F : R
2
R
2
, Denition
Jacobian
(u, v) = F(x, y) = (u(x, y), v(x, y))
is denoted
(u, v)
(x, y)
, and is dened by
(u, v)
(x, y)
= det[DF(x)] = det
_
u
x
u
y
v
x
v
y
_
.
Calculate the Jacobian
(x, y)
(r, )
of the mapping F given by EXERCISE 1
x = r cos , y = r sin .
REMARK
One can interpret Theorem 1 as asserting that if a mapping F is one-to-one then its
derivative matrix DF(x) is invertible, and its inverse matrix is the derivative matrix
DF
1
(u) of the inverse map. Recall from linear algebra that a square matrix has
an inverse matrix if and only if its determinant is non-zero. Thus, it follows from
Theorem 1 that if a mapping F has an inverse mapping F
1
(and both mappings have
continuous partials), then the Jacobian of F is non-zero. This is stated as a corollary
to Theorem 1.
2
R
2
, dened by Corollary 1
u = f(x, y), v = g(x, y)
which maps a subset D
xy
onto a subset D
uv
. Suppose that f and g have continuous
partials on D
xy
. If F has an inverse mapping F
1
, with continuous partials on D
uv
,
then the Jacobian of F is non-zero:
(u, v)
(x, y)
= 0, on D
xy
.
REMARK
The notation
(u, v)
(x, y)
for the Jacobian reminds one which partial derivatives
have to be calculated. Thus if F maps (x, y) (u, v) and is one-to-one, then the
inverse mapping F
1
maps (u, v) (x, y), and the Jacobian of the inverse mapping is
denoted by
(x, y)
(u, v)
= det[F
1
(u)] = det
_
x
u
x
v
y
u
y
v
_
.
Recall from linear algebra that det(AB) = det Adet B for all n n matrices A, B.
We can thus deduce from Theorem 1 a simple relationship between the Jacobian of
a mapping and the Jacobian of the inverse mapping. We state this as a corollary to
Theorem 1.
Inverse Property of the Jacobian Corollary 2
If the hypotheses of Theorem 1 hold, then
(x, y)
(u, v)
=
1
(u,v)
(x,y)
.
Proof: From Theorem 1
DF
1
(u)DF(x) = I.
Taking the determinant of this equation gives
1 = det(DF
1
(u)DF(x)) = det(DF
1
(u)) det(DF(x)).
Thus, by denition of the Jacobian,
1 =
(x, y)
(u, v)
(u, v)
(x, y)
,
and the result follows.
REMARK
Since we are interested in being able to test whether F
1
exists, we ask: does
Corollary 1 admit a converse? i.e. does
(u,v)
(x,y)
= 0 on D
xy
imply that F
1
exists?
Unfortunately NO, unless we formulate the question more carefully. The following
example shows what can go wrong.
Consider F : R
2
R
2
dened by EXAMPLE 2
(u, v) = F(x, y) = (e
x
cos y, e
x
sin y).
Show that
(u, v)
(x, y)
= 0 on R
2
, but that F
1
does not exist on R
2
.
Solution: Observe that
(u, v)
(x, y)
= e
2x
> 0 for all (x, y) R
2
.
However, F is not one-to-one on R
2
, since, for example
F(0, 0) = F(0, 2) = (1, 0).
Thus F
1
does not exist on R
2
.
The reason the mapping in Example 2 is not invertible is because of the periodic
behavior of sin y and cos y. However, we know we can create inverse functions for these
by restricting their domain to a neighborhood where they are one-to-one. Similarly, in
Example 2, if we restrict the domain to a neighborhood N(0, 0) of radius less than 2,
it will be possible to solve uniquely for x, y in terms of u, v; in particular, an inverse
mapping does exist.
We can generalize this into the following theorem.
Inverse Mapping Theorem Theorem 2
2
R
2
dened by
u = f(x, y), v = g(x, y).
If F has continuous partial derivatives in some neighborhood of (a, b) and
(u, v)
(x, y)
= 0
at (a, b), then there is a neighborhood of (a, b) in which F has an inverse mapping F
1
,
given by
x = p(u, v), y = q(u, v),
with p, q having continuous partials.
Proof: The proof of this theorem is beyond the scope of this course.
2
R
2
dened by EXAMPLE 3
(u, v) = F(x, y) = (xy x
2
, x + y).
Show that F has an inverse mapping in a neighborhood of (1, 2).
Solution: The Jacobian of F is
(u, v)
(x, y)
= det
_
y 2x x
1 1
_
= y 3x
Hence at (x, y) = (1, 2), the Jacobian is non-zero. Clearly the partial derivatives of F
are continuous by the Continuity Theorems. Thus by the Inverse Mapping Theorem,
there is a neighborhood of (1, 2) in which F has an inverse mapping.
Referring to Example 3, show that the inverse mapping is given by EXERCISE 2
(x, y) = F
1
(u, v) =
_
1
4
(v +
v
2
8u),
1
4
(3v
v
2
8u)
_
.
13.2 Geometrical Interpretation of the Jacobian
In this section we explain the geometrical interpretation of the Jacobian of a mapping.
This interpretation is based on the following result from linear algebra. The area of a
parallelogram which is dened by two vectors a = (a
1
, a
2
) and
b = (b
1
, b
2
) is given by
Area =
det
_
a
1
b
1
a
2
b
2
_
. (13.2)
We calculate the area of the image in the uv-plane, of a small rectangle in the xy-plane
under a mapping F : R
2
R
2
dened by
u = f(x, y), v = g(x, y).
x
y
x
y
A
xy
A
uv
u
v
P
Q
R
P
F
We approximate the image of the rectangle dened by the vectors

PQ and

PR as a
parallelogram dened by the vectors
and
, and use the linear approximation

to approximate
and
.
Since

PQ =
_
x
0
_
and

PR =
_
0
y
_
, we obtain

_
u
x
u
y
v
x
v
y
__
x
0
_
=
_
u
x
x
v
x
x
_

_
u
x
u
y
v
x
v
y
__
0
y
_
=
_
u
y
y
v
y
y
_
Section 13.2 Geometrical Interpretation of the Jacobian 155
for x and y suciently small. Note that the partial derivatives are evaluated at P.
We have
A
xy
= xy
and so, by (13.2),
A
uv
=
det
_
u
x
x u
y
y
v
x
x v
y
y
_
det
_
u
x
u
y
v
x
v
y
_
xy.
since x and y are positive. Thus, by denition of the Jacobian
A
uv

(u, v)
(x, y)
A
xy
(13.3)
where the Jacobian is evaluated at P.
In words, the Jacobian of a mapping F describes the extent to which F increases or
decreases areas. We can think of the Jacobian of F as a magnication factor for (very
small) areas that are mapped by F. Keep in mind that the basic relation (13.3) is
an approximation, which is valid only for small areas, and which becomes increasingly
accurate as x and y tend to zero.
Calculate the approximate area of the image of a small rectangle of area xy, located EXAMPLE 4
at the point (3, 4), under the mapping F dened by
(u, v) = F(x, y) =
_
x +
_
x
2
+ y
2
, x +
_
x
2
+ y
2
_
.
Solution: Dierentiation and evaluation at (3, 4) gives the derivative matrix at (3, 4):
DF(3, 4) =
_
2
5
4
5
8
5
4
5
_
.
At (3, 4) the Jacobian is
(u, v)
(x, y)
= det
_
2
5
4
5
8
5
4
5
_
=
8
5
.
Thus
A
uv

8
5
A
xy
.
The diagram shows what is happening geometrically.
x
y
y
x
u
v
(3, 4)
(2, 8)
uv = const.
v = u + const.
Consider the mapping F dened by EXAMPLE 5
(x, y) = F(r, ) = (r cos , r sin ).
Find the image in the xy-plane, of a rectangle in the r-plane, and verify directly that
the Jacobian gives the magnication factor for area.
Solution: Using what we did in Example 2 of Section 12.1, we nd the images of the
lines r = k and = are the circles x
2
+ y
2
= k
2
and the lines y = xtan .
x
y
r
r
r
r
(r, )
circle of
radius r
y = xtan
The area of the rectangle in the r-plane is
A
r
= r.
The image of this rectangle in the xy-plane can be approximated by a rectangle with
sides of length r and r, for r and suciently small. Thus
A
xy
rr = rA
r
.
However, the Jacobian of the mapping is
(x, y)
(r, )
= r > 0
Section 13.2 Geometrical Interpretation of the Jacobian 157
(Exercise 1 in Section 13.1). Thus
A
xy

(x, y)
(r, )
A
r
,
which veries the area transformation formula (13.3).
Let F(x, y) = (x
2
y, xy). Consider the following square
S. Will the image of S under F have more or less area?
Explain your answer.
EXERCISE 3
REMARK
For a linear mapping F : R
2
R
2
, (u, v) = F(x, y) = (ax + by, cx + dy), where
a, b, c, d are constants, the derivative matrix is
DF(x, y) =
_
a b
c d
_
,
and thus the linear approximation is exact by Taylors Theorem since all second partials
are zero. Thus for a linear mapping, the approximation formula (13.3) becomes an
exact relation.
Show that the linear mapping EXERCISE 4
(u, v) = F(x, y) = (x + 2y, x + y)
preserves areas. Illustrate the action of the mapping by nding the image of the square
with vertices (0, 0), (0, 1), (1, 0) and (1, 1).
Use the Jacobian to verify the well-known result that any linear mapping F which is EXERCISE 5
a rotation,
(u, v) = F(x, y) = (xcos + y sin , xsin + y cos ),
where is a constant, preserves areas.
Generalization
At the end of Section 12.2, we generalized the concept of a mapping F : R
2
R
2
to
a mapping F : R
n
R
m
, and dened the mn derivative matrix DF(x). If m = n,
we can dene the Jacobian of the mapping, as follows.
For F : R
n
R
n
, given by Denition
Jacobian
u = F(x) =
_
f
1
(x), . . . , f
n
(x)
_
where u = (u
1
, . . . , u
n
) and x = (x
1
, . . . , x
n
). The Jacobian of F is
(u
1
, . . . , u
n
)
(x
1
, . . . , x
n
)
= det[DF(x)] = det
_
_
f
1
x
1

f
1
xn
.
.
.
.
.
.
fn
x
1

fn
xn
_
_
.
We note that the inverse property of the Jacobian also generalizes:
(x
1
, . . . , x
n
)
(u
1
, . . . , u
n
)
=
1
(u
1
,...,un)
(x
1
,...,xn)
,
where
(x
1
, . . . , x
n
)
(u
1
, . . . , u
n
)
is the Jacobian of the inverse mapping of F.
Geometrical interpretation of the Jacobian in 3-D
The interpretation is based on the following result.
The volume of a parallelepiped which is dened by
three vectors a = (a
1
, a
2
, a
3
), b = (b
1
, b
2
, b
3
) and
c = (c
1
, c
2
, c
3
) is given by
Volume =
det
_
_
a
1
b
1
c
1
a
2
b
2
c
2
a
3
b
3
c
3
_
. x
y
z
a
b
c
3
R
3
dened by
(u, v, w) = F(x, y, z) =
_
f(x, y, z), g(x, y, z), h(x, y, z)
_
.
The image of a small rectangular block of volume V
xyz
= xyz in xyz-space under
this mapping can be approximated by a small parallelepiped in uvw-space. As in the
Section 13.3 Constructing Mappings 159
2-D case we can use the linear approximation and the formula above to approximate
the volume V
uvw
of the image. The result is
V
uvw

(u, v, w)
(x, y, z)
V
xyz
where
(u, v, w)
(x, y, z)
is the Jacobian of the mapping F evaluated at P.
13.3 Constructing Mappings
When performing change of variables in double and triple integrals in chapters 14
and 15, it will be very important to be able to invent an invertible mapping which
transforms one region to another, simpler region. We demonstrate this with some
examples.
Find a linear mapping F which will transform the parallelogram with vertices (0, 0), EXAMPLE 6
(2, 1), (3, 4) and (1, 3) in the xy-plane into the unit square 0 u 1, 0 v 1 in the
uv-plane. Calculate the Jacobian of F and hence nd the area of the parallelogram.
Solution:
x
y
u
v
0
1
1
F
(0, 0)
(2, 1)
(1, 3)
(3, 4)
Dxy
Duv
The lines bounding D
xy
are 2y x = 0, 2y x = 5, 3x y = 0, and 3x y = 5. We
recall from chapter 12, that when performing a mapping, we substituted the equations
of each line into the component functions. Thus, we want to pick component functions
u = f(x, y), v = g(x, y), so that the image of the lines are u = 0, u = 1, v = 0, and
v = 1 respectively. Observe, that the bounding lines come in pairs. To get the rst
pair to have images u = 0 and u = 1, we see that we can take u =
2yx
5
. For the second
pair to have images v = 0 and v = 1 we take v =
3xy
5
. Thus, the desired mapping is
(u, v) = F(x, y) =
_
2y x
5
,
3x y
5
_
.
The Jacobian is
(u, v)
(x, y)
= det
_
1
5
2
5
3
5

1
5
_
=
1
5
.
Since the mapping is linear, we have the exact relation
A
uv
=
(u, v)
(x, y)
A
xy
=
1
5
A
xy
.
Hence the area of the parallelogram is 5 square units.
Find a linear mapping which will transform the ellipse
x
2
a
2
+
y
2
b
2
= 1 into the unit circle EXAMPLE 7
u
2
+ v
2
= 1.
Solution: We want to pick u = f(x, y) and v = g(x, y), such that we turn
x
2
a
2
+
y
2
b
2
= 1
into u
2
+ v
2
= 1. If we write the ellipse as
_
x
a
_
2
+
_
y
b
_
2
= 1, then it is clear that we
want to take u =
x
a
and v =
y
b
. Hence, the desired mapping is
(u, v) = F(x, y) =
_
x
a
,
y
b
_
.
Find a linear mapping F which will transform the ellipse 3x
2
+ 2xy + y
2
= 4 into the EXERCISE 6
circle u
2
+ v
2
= 4.
Find an invertible mapping which will transform the region D
xy
in the rst quadrant EXAMPLE 8
bounded by the hyperbola xy = 1, xy = 3, x
2
y
2
= 2, x
2
y
2
= 4 into a square in
the uv-plane.
Solution: We again see that we have pairs of equations. Thus, if we take u = xy and
v = x
2
y
2
we see that the images of the hyperbola xy = 1, xy = 3, x
2
y
2
= 2,
x
2
y
2
= 4 are u = 1, u = 3, v = 2, v = 4. Hence, the mapping
(u, v) = F(x, y) = (xy, x
2
y
2
),
gives the desired transformation. Observe that it would be dicult to solve for the
inverse explicitly, however, we can at least show that the mapping is locally invertible
by applying the Inverse Mapping Theorem. The Jacobian of F is
det DF(x, y) = det
_
y x
2x 2y
_
= 2x
2
2y
2
,
Section 13.3 Constructing Mappings 161
which is non-zero on the region D
xy
and F has continuous partial derivatives, so F is
invertible in a neighborhood of every point in D
xy
by the Inverse Mapping Theorem.
Find an invertible mapping which will transform the region D
xyz
in the rst octant EXERCISE 7
bound by xy = 1, xy = 3, xz = 1, xz = 3, yz = 2, and yz = 4 into a cube in the
uvw-space.
Part III
Multiple Integrals
Chapter 14
Double Integrals
14.1 Denition of Double Integrals
Recall, to nd the area under a continuous curve y = f(x) over a closed interval [a, b]
we used a single integral which we dened as a limit of Riemann sums:
_
b
a
f(x) dx = lim
n
n
i=1
f(x
i
)x
i
,
where x
i
is the length of the i-th subinterval in some decomposition (i.e. partition)
of the interval [a, b] and x
i
is some point in the i-th subinterval.
We found that the single integral had many applications beside calculating areas under
curves. We can use single integrals for nding mass of thin rods, calculating work, and
for nding volumes of revolution. However, what if we want to calculate the mass of
a thin plate, or to nd the volume of more complicated regions? For these, we use
double integrals.
Let D be a closed and bounded set in R
2
whose boundary is a piecewise smooth closed
curve. Let f : R
2
R be a function which is bounded on D, that is, there exists a
number M such that |f(x)| M for all x D.
164 Chapter 14 Double Integrals
Subdivide D by means of straight lines parallel
to the axes, forming a partition P of D. Label
the n rectangles that lie completely in D, in
some specic order, and denote their areas by
A
i
, i = 1, . . . , n. Choose a point (x
i
, y
i
) in
the i-th rectangle and form the Riemann sum
n
i=1
f(x
i
, y
i
)A
i
. (14.1)
Let |P| denote the length of the longest side of all rectangles in the partition P.
A function f : R
2
R which is bounded on a closed bounded set D R
2
is integrable Denition
integrable on D means that all Riemann sums approach the same value as |P| 0.
If f : R
2
R is integrable on a closed bounded set D, then we dene the double Denition
double integral integral of f on D as
__
D
f(x, y) dA = lim
P0
n
i=1
f(x
i
, y
i
)A
i
.
Is there any guarantee that the limiting process in the denition of the double integral
actually leads to a unique value, i.e. that the limit exists? It is possible to dene
weird functions for which the limit does not exist, i.e. which are not integrable on D.
However, if f is continuous on D, it can be proved that f is integrable on D, that is
the double integral of f does exist. Functions which are discontinuous on D may be
integrable on D. For example, if f is continuous in D except at points which lie on a
curve C (f is piece-wise continuous), then f is integrable. The proofs of these results
are beyond the scope of this course.
Interpretation of the Double Integral
When you encounter the double integral symbol
__
D
f(x, y) dA,
think of limit of a sum. In itself, the double integral is a mathematically dened
object. It has many interpretations depending on the meaning that you assign to the
Section 14.1 Denition of Double Integrals 165
integrand f(x, y). The dA in the double integral symbol should remind you of the
area of a rectangle in a partition of D.
Double Integral as Area: The simplest interpretation is when you specialize f to
be the constant function with value unity:
f(x, y) = 1, for all (x, y) D.
Then the Riemann sum (14.1) simply sums the areas of all rectangles in D, and the
double integral serves to dene the area A(D) of the set D:
A(D) =
__
D
1 dA.
Double Integral as Volume: If f(x, y) 0 for all (x, y) D, then the double
integral
__
D
f(x, y) dA
can be interpreted as the volume V (S) of the 3-D set dened by
S =
_
(x, y, z) | 0 z f(x, y), (x, y) D
_
,
which represents the solid below the surface z = f(x, y) and above the set D in the
xy-plane. The justication is as follows.
The partition P of D decomposes the solid S
into vertical columns. The height of the col-
umn above the i-th rectangle is approximately
f(x
i
, y
i
), and so its volume is approximately
f(x
i
, y
i
)A
i
The Riemann sum (14.1) thus approximates
the volume V (S):
V (S)
n
i=1
f(x
i
, y
i
)A
i
.
As |P| 0 the partition becomes increasingly ne, so the error in the approximation
will tend to zero. Thus the volume V (S) is
V (S) =
__
D
f(x, y) dA.
Double Integral as Mass: Think of a thin at plate of metal whose density varies
with position. Since the plate is thin, it is reasonable to describe the varying density
by an area density, that is a function f(x, y) that gives the mass per unit area at
position (x, y). In other words, the mass of a small rectangle of area A
i
located at
position (x
i
, y
i
) will be approximately
M
i
f(x
i
, y
i
)A
i
.
The Riemann sum (14.1) corresponding to a partition P of D will approximate the
total mass M of the plate D, and the double integral of f over D, being the limit of
the sum, will represent the total mass:
M =
__
D
f(x, y) dA.
Double Integral as Probability: Let f(x, y) be the probability density of a
continuous 2-D random variable (X, Y ). The probability that (X, Y ) D, a given
subset of R
2
, is
P((X, Y ) D) =
__
D
f(x, y) dA.
Average Value of a Function: The double integral is also used to dene the
average value of a function f : R
2
R over a set D R
2
.
Recall for a function of one variable, f : R R, the average value of f over an interval
[a, b], denoted f
av
, is dened by
f
av
=
1
b a
_
b
a
f(x) dx.
Similarly, for a function of two variables f : R
2
R, we can dene the average value
of f over a closed and bounded subset D of R
2
by
f
av
=
1
A(D)
__
D
f(x, y) dA.
A city occupies a region D of the xy-plane. The population density in the city (mea- EXERCISE 1
sured as people/unit area) depends on position (x, y), and is given by a function p(x, y).
Interpret the double integral
__
D
p(x, y) dA.
Section 14.2 Denition of Double Integrals 167
Properties of the Double Integral
The basic properties of single integrals can be generalized to double integrals. We do
not give the proofs but point out that the results are plausible if one thinks in terms
of Riemann sums. In each theorem, D denotes a closed and bounded set, and f and g
are two integrable functions on D.
Linearity Theorem 1
__
D
(f + g) dA =
__
D
f dA+
__
D
g dA
__
D
cf dA = c
__
D
f dA
where c is a constant.
Basic Inequality Theorem 2
If f(x, y) g(x, y) for all (x, y) D, then
__
D
f dA
__
D
g dA.
Absolute Value Inequality Theorem 3
__
D
f dA
__
D
|f| dA.
Decomposition
If D is decomposed into two closed and bounded subsets
D
1
and D
2
by a piecewise smooth curve C, then
__
D
f dA =
__
D
1
f dA+
__
D
2
f dA.
Theorem 4
REMARKS
1. The Basic Inequality can be used to obtain an estimate for a double integral that
cannot be evaluated exactly.
2. The decomposition property is essential for dealing with complicated regions of
integration and with discontinuous integrands.
14.2 Iterated Integrals
It is clear that double integrals can be evaluated approximately by using a computer to
evaluate a suitable Riemann sum. The accuracy would depend on how ne a partition
you choose. But it is natural to ask: is it possible to calculate double integrals exactly,
using methods that work for single integrals? For suciently simple functions and
regions of integration, the answer is YES. The idea is to write the double integral as
a succession of two single integrals, called an iterated integral. We will derive a
method for doing this by using the interpretation of the double integral as volume.
Let D be a region in the xy-plane and let f : R
2
R such that f(x, y) 0 for
all (x, y) D. If V denotes the volume of the solid above D and below the surface
z = f(x, y), then we have
V =
__
D
f(x, y) dA.
Assume that the region D lies between vertical
lines x = x
and x = x
u
with x
< x
u
and has
top curve y = y
u
(x) and bottom curve y = y
(x).
That is, D is described by the inequalities
y
(x) y y
u
(x), and x
x x
u
.
Now, recall from Calculus 2 that we can nd a volume of a region by integrating over
all possible cross-sectional areas. That is,
V =
_
xu
x
A(x) dx,
where A(x) is the cross-sectional area of the solid for any xed value of x. But, we
know that the cross-sectional area A(x) is the area under the cross-section z = f(x, y),
and thus is given by a single integral
A(x) =
_
yu(x)
y
(x)
f(x, y) dy.
Section 14.2 Iterated Integrals 169
Hence, the volume of the region is
V =
_
xu
x
_
_
yu(x)
y
(x)
f(x, y) dy
_
dx.
Thus, we have
__
D
f(x, y) dA =
_
xu
x
_
yu(x)
y
(x)
f(x, y) dy dx,
as desired.
Let D R
2
be dened by Theorem 5
y
(x) y y
u
, and x
x x
u
,
where y
(x) and y
u
(x) are continuous for x
x x
u
. If f(x, y) is continuous on D,
then
__
D
f(x, y) dA =
_
xu
x
_
yu(x)
y
(x)
f(x, y) dy dx.
Proof: The proof is beyond the scope of this course.
REMARK
Although the parenthesis around the inner integral are usually omitted, we must
evaluate it rst. Moreover, as in our interpretation of volume above, when evaluating
the inner integral, we are integrating with respect to y while holding x constant. That
is, we are using partial integration.
Evaluate EXAMPLE 1
__
D
xy dA,
where D is the triangular region with vertices (0, 0), (2, 0) and (0, 1).
Solution: The set D is dened by
0 y 1
1
2
x, and 0 x 2.
By Theorem 5,
__
D
xy dA =
2
_
x=0
1
1
2
x
_
y=0
xy dy dx
=
_
2
x=0
x(
1
2
y
2
)
1
1
2
x
0
dx
=
1
2
_
2
0
x(1
1
2
x)
2
dx
= =
1
6
.
Suppose now that the set D can be described
by inequalities of the form
x
(y) x x
u
(y), and y
y y
u
,
where y
, y
u
are constants and x
(y), x
u
(y)
are continuous functions of y on the interval
y
y y
u
.
Then by reversing the roles of x and y in Theorem 5, the double integral
__
D
f(x, y) dA
can be written as an iterated integral in the order x rst, then y:
__
D
f(x, y) dA =
_
yu
y
_
xu(y)
x
(y)
f(x, y) dx dy. (14.2)
Evaluate the integral in Example 1 by reversing the order of integration. EXAMPLE 2
Solution:
In order to integrate with respect to x rst, we describe the set D by the inequalities:
0 x 2(1 y), and 0 y 1.
By (14.2) we get
__
D
xy dA =
_
1
y=0
_
2(1y)
x=0
xy dx dy
=
_
1
y=0
y
_
1
2
x
2
_
2(1y)
0
dy
= 2
_
1
0
y(1 y)
2
dy = =
1
6
.
Find the volume of the solid S in the rst octant (x 0, y 0, z 0) bounded by EXAMPLE 3
the cylinder y
2
+ z
2
= 16, and the planes 3y 2x = 0, x = 0, z = 0.
Solution:
The cylinder y
2
+z
2
= 16 runs parallel to the x-axis (since there is no x-dependence).
The plane 3y2x = 0 is vertical (since there is no z-dependence). The solid is described
by
0 z
_
16 y
2
and (x, y) D,
where D is the region in the xy-plane bounded by 3y 2x = 0, x = 0, and y = 4.
Hence, the volume of the solid is
__
D
_
16 y
2
dA.
Observe that we can represent the set D as
0 x
3y
2
, and 0 y 4.
Thus, the volume is
__
D
_
16 y
2
dA =
_
4
0
_
3y/2
0
_
16 y
2
dx dy
=
_
4
0
_
16 y
2
(x)
3y/2
0
dy
=
_
4
0
y
_
16 y
2
dy
=
1
2
(16 y
2
)
3/2
4
0
= 32 cubic units.
Observe that the region in Example 2 could have also been represented by
2x
3
y 4,
0 x 6. Hence, we could have applied Theorem 5, instead of using equation (14.2).
However, notice that if we had applied Theorem 5 instead, our inner integral would
have been
_
4
2x/3
_
16 y
2
dy,
which would have been more dicult. Thus, when evaluating a double integral
__
D
f(x, y) dA,
one must take into account two factors:
the shape of the region D.
the form of the integrand f(x, y).
Either of these factors may make it desirable or even essential to use one order of
integration instead of the other.
Describe the set D by inequalities in two ways.
Evaluate the double integral
__
D
(x + y) dA
in two ways.
EXERCISE 2
Evaluate
__
D
y dA, where D is the triangular region with vertices (0, 0), (1, 1) and EXERCISE 3
(0, 2).
Evaluate
__
D
e
y
2
dA, where D is the triangular region with vertices (0, 0), (0, 1) and EXERCISE 4
(1, 1).
Find the volume of the solid bounded above by the paraboloid z = 4 x
2
y
2
, and EXERCISE 5
below by the rectangle D = {(x, y) | 0 x 1, 0 y 1}.
For more complicated regions we may not be able to ap-
ply our method above so easily. For example an annulus
cannot be described by the usual inequalities since a ver-
tical or a horizontal line may intersect the boundary of
D in more than two points. A simple approach to evalu-
ating the double integral
__
D
f(x, y) dA, where D is the annulus is to let D
1
, D
2
denote the discs of radius R
1
and R
2
respectively. Then by the Decomposition Theorem,
__
D
2
f(x, y) dA =
__
D
1
f(x, y) dA+
__
D
f(x, y) dA,
and so the required integral is
__
D
f(x, y) dA =
__
D
2
f(x, y) dA
__
D
1
f(x, y) dA.
Both integrals on the right can be written as iterated integrals in the usual way.
However, for this or even more complicated regions, we can often make it simpler by
applying a change of variables.
14.3 The Change of Variable Theorem
A mapping F : R
2
R
2
can be used to simplify a double integral
__
Dxy
H(x, y) dA,
either by changing the integrand H(x, y), or by deforming the set D
xy
in the xy-plane
into a simpler shape D
uv
in the uv-plane. The process is called a change of variables
in the double integral. In this type of calculation it is convenient to replace the symbol
dA in the double integral by dx dy if one is working in the xy-plane, and by
du dv if one is working in the uv-plane.
In order to derive the change of variable formula for double integrals, we need the
formula which describes how areas are related under a mapping F : R
2
R
2
, given
by
(x, y) = F(u, v) = (f(u, v), g(u, v)). (14.3)
The geometric interpretation of the Jacobian gives us
A
xy

(x, y)
(u, v)
A
uv
, (14.4)
for u, v suciently small where the Jacobian
(x, y)
(u, v)
is evaluated at a point in the
region. Notice that we have interchanged the roles of (x, y) and (u, v) in equations
(14.3) and (14.4), as compared to Section 13.2.
Change of Variable Theorem 6
Let each of D
uv
and D
xy
be a closed bounded set whose boundary is a piecewise-smooth
closed curve. Let
(x, y) = F(u, v) = (f(u, v), g(u, v))
be a one-to-one mapping of D
uv
onto D
xy
, with f, g C
1
, and
(x,y)
(u,v)
= 0 except
for possibly on a nite collection of piecewise-smooth curves in D
uv
. If H(x, y) is
continuous on D
xy
, then
__
Dxy
H(x, y) dx dy =
__
Duv
H
_
f(u, v), g(u, v)
_
(x, y)
(u, v)
du dv.
Proof: A proof is beyond the scope of the course but we can make the result plausible,
as follows.
Section 14.3 The Change of Variable Theorem 175
Consider a partition P of D
uv
into rectangles, by means of straight lines parallel to
the coordinate axes. The images of these lines under the given transformation will in
general be two families of curves which will dene a partition P
of D
xy
into elements
of area which are approximately parallelograms. We can use this partition, instead of
a rectangular partition, to dene
__
Dxy
F(x, y) dx dy.
Thus
__
Dxy
H(x, y) dx dy = lim
P
0
n
i=1
H(x
i
, y
i
)A
i
= lim
P
0
n
i=1
H
_
f(u
i
, v
i
), g(u
i
, v
i
)
_
(x, y)
(u, v)
(u
i
,v
i
)
A
i
,
=
__
Duv
H
_
f(u, v), g(u, v)
_
(x, y)
(u, v)
dA,
by using the denition of double integral relative to the rectangular partition of D
uv
.
The lack of rigor occurs when we use the approximation (14.4).
Evaluate
__
Dxy
(x + y) dA, where D
xy
is the set bounded by the parallelogram with EXAMPLE 4
vertices (0, 0), (2, 1), (1, 3) and (3, 4).
Solution: In Example 6 in Section 13.2,
we found that the mapping
(u, v) = F(x, y) =
_
1
5
(2y x),
1
5
(3x y)
_
maps D
xy
onto D
uv
, the unit square in the
uv-plane.
x
y
u
v
0
1
1
F
(0, 0)
(2, 1)
(1, 3)
(3, 4)
Dxy
Duv
The Jacobian of F is
(u, v)
(x, y)
=
1
5
.
Observe that our mapping F maps D
xy
to D
uv
, but the Change of Variable Theorem
requires a mapping which maps D
uv
to D
xy
. In particular, we actually require the
inverse of our mapping. Solving for x and y we nd that
(x, y) = F
1
(u, v) = (u + 2v, 3u + v).
Hence
(x,y)
(u,v)
= 5, and the integrand becomes x + y = 4u + 3v. Then, the Change of
Variable Theorem gives
__
Dxy
(x + y) dx dy =
__
Duv
(4u + 3v)| 5| du dv.
It is straightforward to write this double integral as an iterated integral and evaluate
it. The nal result is
__
Dxy
(x + y) dA =
35
2
.
Fill in the details in Example 4. EXERCISE 6
Double Integrals in Polar Coordinates
If the boundary of the region is a circle centered on the origin, or a circle that passes
through the origin, it will often help to transform from polar to Cartesian coordinates.
Recall that the mapping from polar to Cartesian coordinates is
(x, y) = F(r, ) = (r cos , r sin ),
which has Jacobian,
(x, y)
(r, )
= r.
Hence, we must restrict r > 0 so that the mapping is one-to-one and the Jacobian is
non-zero so that we can apply the Change of Variable Theorem. Note that we can
make this restriction even if the origin is in the region as the integral over a single
point is 0.
Evaluate EXAMPLE 5
__
Dxy
x
x
2
+ y
2
dA
where D
xy
is the half disc (x 1)
2
+ y
2
1, x 1.
Solution: We rst convert the equations from Cartesian coordinates to polar coordi-
nates. Since x = r cos we get that x = 1 becomes
r cos = 1
r = sec .
Similarly, x
2
+ y
2
= 2x becomes
r
2
= 2r cos
r = 2 cos ,
since we are assuming r = 0. The image D
r
is shown in the gure below. The values
of at the points of intersection are obtained by solving sec = 2 cos , giving =
4
.
The Change of Variable Theorem thus implies
__
Dxy
x
x
2
+ y
2
dx dy =
__
D
r
r cos
r
2
|r| dr d =
__
D
r
cos dr d.
The set D
r
is described by the inequalities
sec r 2 cos , and

4

4
.
We can thus write the integral over D
r
as an iterated integral,
__
D
r
cos dr d =
4
_
4
2 cos
_
sec
cos dr d.
It is a routine matter to evaluate this, leading to the nal answer
__
Dxy
x
x
2
+ y
2
dx dy = 1.
Fill in the details in Example 5. EXERCISE 7
REMARK
Because polar coordinates have a simple
geometric interpretation one can obtain
the r and limits of integration directly
from the diagram in the xy-plane, without
drawing the region D
r
in the same way
as we did for nding areas in polar coordi-
nates in Chapter 11. The method is illus-
trated in the diagram.
Evaluate EXERCISE 8
__
Dxy
1
_
x
2
+ y
2
dA,
where D is the region in the rst quadrant bounded by the circles x
2
+ y
2
= 1 and
x
2
+ y
2
= 4. Use polar coordinates, as in Example 2.
Evaluate EXERCISE 9
I =
__
Dxy
xy dA,
where D
xy
is the set in the rst quadrant bounded by y = x, y = ex, xy = 2 and
xy = 3.
Hint: Find a mapping which maps D
xy
into a rectangle D
uv
in the uv-plane.
Chapter 15
Triple Integrals
15.1 Denition of Triple Integrals
Let D be a closed bounded set in R
3
whose bound-
ary consists of a nite number of surface elements
which are smooth except possibly at isolated points.
Let f : R
3
R be a function which is bounded
on D. Subdivide D by means of three families
of planes which are parallel to the xy, yz, and
xzplanes respectively, forming a partition P of D.
Label the N rectangular blocks that lie completely in D in some specic order, and
denote their volumes by V
i
, i = 1, . . . , n. Choose an arbitrary point (x
i
, y
i
, z
i
) in the
i-th block, i = 1, . . . , n, and form the Riemann sum
n
i=1
f(x
i
, y
i
, z
i
)V
i
. (15.1)
Let P denote the maximum of the dimensions of all rectangular blocks in the partition
P.
180 Chapter 15 Triple Integrals
A function f : R
3
R which is bounded on a closed bounded set D R
3
is said to be Denition
integrable integrable on D if and only if all Riemann sums approach the same value as P 0.
If f : R
3
R is integrable on a closed bounded set D, then we dene the triple Denition
triple integral integral of f over D, as
___
D
f(x, y, z) dV = lim
P0
n
i=1
f(x
i
, y
i
, z
i
)V
i
.
Is there any guarantee that the limiting process in the denition of the triple integral
actually leads to a unique value, i.e. that the limit exists? It is possible to dene
weird functions for which the limit does not exist, i.e. which are not integrable on D.
However, if f is continuous on D, it can be proved that f is integrable on D. Functions
which are discontinuous in D may be integrable on D. For example, if f is continuous
on D except at points which lie on a surface or curve in D, then f is integrable on D.
The proofs of these results are beyond the scope of this course, however.
Interpretation of the Triple Integral
When you encounter the triple integral symbol
___
D
f(x, y, z) dV,
think of limit of a sum. In itself, the triple integral is a mathematically dened
object. It has many interpretations, depending on the interpretation that you assign
to the integrand f(x, y, z). The dV in the triple integral symbol should remind you
of the volume of a rectangular block in a partition of D.
Triple Integral as Volume:
The simplest interpretation is when you specialize f to be the constant function with
value unity:
f(x, y, z) = 1, for all (x, y, z) D.
Then the Riemann sum (15.1) simply sums the volumes of all rectangular blocks in D,
and the triple integral over D serves to dene the volume V (D) of the set D:
V (D) =
___
D
1 dV.
Section 15.2 Denition of Triple Integrals 181
Triple Integral as Mass:
Think of a planet or star whose density varies with position. Let D denote the subset
of R
3
occupied by the star. Let f(x, y, z) denote the density (mass per unit volume)
at position (x, y, z). The mass of a small rectangular block located within the star at
position (x
i
, y
i
, z
i
) will be approximately
M
i
f(x
i
, y
i
, z
i
)V
i
.
Thus, the Riemann sum corresponding to a partition P of D
n
i=1
f(x
i
, y
i
, z
i
)V
i
,
will approximate the total mass M of the star, and the triple integral of f over D,
being the limit of the Riemann sum, will represent the total mass:
M =
___
D
f(x, y, z) dV.
Average Value of a Function:
By analogy with functions of one and two variables we can use the triple integral to
dene the average value of a function f : R
3
R over a closed and bounded set
D R
3
.
Let D R
3
be closed and bounded with volume V (D) = 0, and let f : R
3
R be a Denition
average value bounded and integrable function on D. The average value of f over D is dened by
f
av
=
1
V (D)
___
D
f(x, y, z) dV.
REMARK
If you have the impression that you have read this section someplace else, youre
right. Compare it with Section 14.1. The only essential change is to replace area by
volume.
Properties of the Triple Integral
The triple integral satises the same basic properties as the double integral, and The-
orems 2 to 5 of Section 14.3 generalize in the obvious way to triple integrals.
15.2 Iterated Integrals
We generalize the method used in Section 14.2, and show how to express a triple integral
as a 3-fold iterated integral. This enables you to evaluate triple integrals exactly for
suciently simple functions and integration sets.
Consider a set D R
3
which is described by inequalities of the form
z
(x, y) z z
u
(x, y),
and
(x, y) D
xy
.
Here D
xy
is a closed bounded subset of R
3
whose boundary is a piecewise smooth closed
curve, and z
, z
u
are continuous functions on
D
xy
. Think of the set D as being the 3-D re-
gion with bottom surface z = z
(x, y) and top

surface z = z
u
(x, y), where the extent is de-
ned by the 2-D set D
xy
.
In order to write a triple integral as an iterated integral, take an arbitrary point
(x, y) D
xy
. Then you integrate f(x, y, z) with respect to z from z
(x, y) to z
u
(x, y),
and integrate the result over D
xy
, as a double integral. This procedure essentially sums
over all rectangular blocks in a partition of D, and hence gives the triple integral of
f(x, y, z) over D.
Let D be the subset of R
3
dened by Theorem 1
z
(x, y) z z
u
(x, y) and (x, y) D
xy
,
where z
and z
u
are continuous functions on D
xy
, and D
xy
is a closed bounded subset
in R
2
, whose boundary is a piecewise smooth closed curve. Let f(x, y, z) be constant
on D. Then
___
D
f(x, y, z) dV =
__
Dxy
zu(x,y)
_
z
(x,y)
f(x, y, z) dz dA.
Proof: A proof of Theorem 1 is beyond the scope of this course.
REMARKS
1. Keep in mind that when evaluating a triple integral, it is not essential to inte-
grate rst with respect to z. One chooses the order of integration that is most
convenient. For example, if you can describe D by inequalities of the form
x
(y, z) x x
u
(y, z)
with (y, z) D
yz
, then you could integrate with respect to x rst (see example
2 to follow).
2. As with double iterated integrals, we are doing partial integration.
Evaluate
___
D
z dV , where D is the solid tetrahedron, with vertices (a, 0, 0), (0, b, 0), EXAMPLE 1
(0, 0, c) and (0, 0, 0).
Solution:
The equation of the inclined face of the tetrahedron is
x
a
+
y
b
+
z
c
= 1
The subset D R
3
is described by
0 z c
_
1
x
a

y
b
_
, and (x, y) D
xy
.
Thus by Theorem 1,
___
D
z dV =
__
Dxy
c(1
x
a
y
b
)
_
0
z dz dA =
a
_
0
b(1
x
a
)
_
0
c(1
x
a
y
b
)
_
0
z dz dy dx,
on writing the outer double integral over D
xy
as a double iterated integral. After
evaluating the integrals, one obtains as a nal answer,
___
D
z dV =
1
24
abc
2
.
Verify the answer in example 1 by evaluating the iterated integral. EXERCISE 1
Write the triple integral in example 1 as an iterated integral taking the variables in EXERCISE 2
the order y, x, z. Evaluate the iterated integral and verify you get the same answer as
in exercise 1.
In how many ways can a triple integral be written as an iterated integral? EXERCISE 3
Evaluate
___
D
z
4 y
dV , where D is the region bounded by the cylinder y
2
+ z
2
= 4, EXAMPLE 2
and the planes x + y = 2, x + 2y = 6, z = 0, y = 0, and lying in the rst octant.
Solution:
It is convenient to integrate rst with respect to x, and describe D by the inequalities
2 y x 6 2y and (y, z) D
yz
.
Thus
___
D
z
4 y
dV =
__
Dyz
62y
_
2y
z
4 y
dx dA =
__
Dyz
z dA,
=
2
_
0
4y
2
_
0
z dz dy =
1
2
2
_
0
(4 y
2
) dy =
8
3
.
Evaluate the triple integral in example 2 by writing it as an iterated integral with EXERCISE 4
the variables in the order z, x, y. Why would it not make sense to integrate rst with
respect to y?
Let D be the subset of R
3
( a prism) bounded by the planes x = 0, x = 2, y = 0, z = 0 EXERCISE 5
and y + z = 1. Evaluate
___
D
ydV .
15.3 The Change of Variable Theorem
A mapping F : R
3
R
3
can be used to simplify a triple integral
___
Dxyz
H(x, y, z) dV,
either by changing the integrand H(x, y, z) or by deforming the set D
xyz
in xyz-space
into a simpler shape D
uvw
in uvw-space, thereby simplifying the limits of integration.
In this type of calculation it is convenient to replace the symbol dV in the triple
integral by dx dy dz if one is working in xyz-space, and by du dv dw if one is
working in uvw-space.
Change of Variable Theorem 2
Let
x = f(u, v, w), y = g(u, v, w), z = h(u, v, w),
be a one-to-one mapping of D
uvw
onto D
xyz
, with f, g, h having continuous partials,
and
(x, y, z)
(u, v, w)
= 0 on D
uvw
.
If H(x, y, z) is continuous on D
xyz
, then
___
Dxyz
H(x, y, z) dx dy dz =
___
Duvw
H
_
f(u, v, w), g(u, v, w), h(u, v, w)
_
(x, y, z)
(u, v, w)
du dv dw.
Proof: A proof is beyond the scope of this course, but the volume transformation
formula using the Jacobian in Section 13.2 makes the theorem plausible, as in the case
of the double integral.
Evaluate I =
___
Dxyz
x
2
dV , where D
xyz
is the subset of R
3
bounded by the surfaces EXAMPLE 3
xy = 1, xy = 3, and the planes y +z = 1, y +z = 0, x+y +z = 1 and x+y +z = 2.
Solution: This solid is dicult to draw, but one can visualize it, since it is bounded
by level surfaces of three functions, namely
xy, y + z, and x + y + z.
Thus the solid D
xyz
is described by the inequalities
1 xy 3, 1 y + z 0, 1 x + y + z 2. (15.2)
This suggests that we dene a mapping
u = xy, v = y + z, w = x + y + z. (15.3)
The Jacobian is
(u, v, w)
(x, y, z)
= det
_
_
y x 0
0 1 1
1 1 1
_
_
= x.
By the Change of Variable Theorem,
I =
___
Dxyz
x
2
dx dy dz =
___
Duvw
x
2
(x, y, z)
(u, v, w)
du dv dw.
By the inverse property of the Jacobian,
(x, y, z)
(u, v, w)
=
_
(u, v, w, )
(x, y, z)
_
1
=
1
x
.
It follows from the inequalities (15.2) that x > 0 on D
xyz
. Thus equation (15.3) gives
I =
___
Duvw
x du dv dw.
The next step is to express the integrand x in terms of u, v, w. It follows from equations
(15.3) that x = w v. Thus
I =
_
Duvw
(w v) du dv dw. (15.4)
The inequalities (15.2) imply that the image of the set D
xyz
under the mapping (15.3)
is the rectangular block D
uvw
dened by
1 u 3, 1 v 0, 1 w 2.
We can thus write the triple integral (15.4) as an iterated integral, and since D
uvw
is
rectangular, the order is immaterial:
I =
2
_
1
0
_
1
3
_
1
(w v) du dv dw = = 4.
Verify the result
(u, v, w)
(x, y, z)
= x in Example 3. EXERCISE 6
Find the volume of the solid bounded by the six planes x+y = 1, x+y = 2, xy = 1, EXERCISE 7
x y = 1, x + y + z = 0, x + y + z = 3.
In double integrals we saw that if there is symmetry about the origin it may be helpful
to evaluate the double integral using polar coordinates. Similarly, if we have symmetry
about the z-axis or the origin in R
3
it may be helpful to use our mappings to cylindrical
coordinates or spherical coordinates.
Triple Integrals in Cylindrical Coordinates
Recall that the mapping from Cartesian coordinates to cylindrical coordinates is
x = r cos
y = r sin
z = z,
with r 0, 0 < 2, and the Jacobian is
(x, y, z)
(r, , z)
= r,
(verify). Since we need
(x,y,z)
(r,,z)
= 0, we must again restrict r > 0. So for cylindrical
coordinates, the formula in the Change of Variable Theorem reads
___
Dxyz
H(x, y, z) dx dy dz =
___
D
rz
H(r cos , r sin , z)r dr d dz.
A wedge is cut from the cylinder x
2
+ y
2
= b
2
, by the planes z = 0 and z = ky, where EXAMPLE 4
b and k are positive constants, and y is assumed to be non-negative. Find the volume
of the wedge.
Solution: The volume V is given by
V =
___
Dxyz
1 dV.
In cylindrical coordinates we have the cylinder r = b,
the plane z = 0 and the plane z = kr sin . Hence the
solid is described by
0 z kr sin , 0 r b, 0 .
Using the Change of Variable Theorem gives
V =
_
0
b
_
0
kr sin
_
0
r dz dr d = =
2
3
kb
3
.
The density of the contents of a cylindrical drum dened by EXERCISE 8
x
2
+ y
2
1 and 0 z 2,
is given by
=
k(2 z)
1 + x
2
+ y
2
,
where k is constant. Find the total mass.
Calculate the volume of the solid enclosed by the paraboloid z = x
2
+y
2
and the lower EXERCISE 9
part of the cone (z 2)
2
= x
2
+ y
2
.
Triple Integrals in Spherical Coordinates
Recall that the mapping from spherical coordinates to Cartesian coordinates are
x = sin cos
y = sin sin
z = cos ,
with
0, 0 , 0 < 2,
and Jacobian
(x, y, z)
(, , )
=
2
sin .
Verify that
(x,y,z)
(,,)
=
2
sin . EXERCISE 10
Thus for spherical coordinates, we must restrict > 0 and 0 < < so that the
Jacobian is non-zero and the mapping is one-to-one. Observe that this means we are
not just removing one point, but the entire z-axis. However, this still will not eect
our result as the triple integral over the z-axis is 0. Hence, the Change of Variable
Theorem in spherical coordinates reads:
___
Dxyz
H(x, y, z) dx dy dz
=
___
D
H( sin cos , sin sin , cos )

2
sin ddd
Evaluate
___
D
1
x
2
+ y
2
+ z
2
dV where D is the spherical shell between the spheres of EXAMPLE 5
radius a and b centered on the origin (a < b).
Solution: You would not succeed in evaluating this triple integral as an iterated
integral in terms of x, y and z. However, if you use spherical coordinates, the calculation
is simple.
In terms of spherical coordinates , , , the set D is dened by
a b, 0 , 0 2.
Using the Change of Variable Theorem gives
___
D
1
x
2
+ y
2
+ z
2
dV =
2
_
0
_
0
b
_
a
1
2
(
2
sin ) d d d
= = 4(b a)
Calculate the volume of the solid ellipsoid EXERCISE 11
x
2
a
2
+
y
2
b
2
+
z
2
c
2
1,
where a, b, c are positive constants.
Hint: Make the change of variables (x, y, z) = (au, bv, cw), and transform the ellipsoid
into a solid sphere.
A conical drill bit, angle , drills into a solid sphere
of radius b until the tip reaches the center. Show
that the volume of the solid removed is
V () =
4
3
b
3
sin
2

2
.
cross section y = 0
EXERCISE 12
APPENDICES
Appendix A
Implicitly Dened Functions
A.1 Implicit Dierentiation
An equation of the form
f(x, y) = 0 (A.1)
denes a relationship between the two variables x and y. If
y = g(x)
is a solution of equation (A.1), i.e.
f(x, g(x)) = 0 (A.2)
for all x in some interval I, we say that the function g is dened implicitly by
equation (A.1).
e.g. the functions y =
1 x
2
and y =
1 x
2
are dened implicitly by the
equation x
2
+ y
2
1 = 0.
In general, given an equation of the form (A.1), it is not possible to solve for y in
terms of x to obtain the function g(x) explicitly. However it is easy to calculate the
derivatives of g by dierentiating equation (A.2) with respect to x, a process referred
to as implicit dierentiation. In this way one can nd the linear approximation and
second degree Taylor polynomial of g at a suitable reference point. Here is an example.
Section A.1 Implicit Dierentiation 193
The equation EXAMPLE 1
y
3
y + x = 0 (3)
denes y implicitly as a function of x, y = g(x), with g(0) = 1. Find the linear
approximation and second degree Taylor polynomial of g at the point x = 0.
Solution: Dierentiate equation (3) with respect to x, treating y as a function of x:
3y
2
dy
dx

dy
dx
+ 1 = 0. (4)
Evaluate this at the point (x, y) = (0, 1), obtaining
dy
dx
=
1
2
, and hence g
(0) =
1
2
.
Since g(0) = 1, the linear approximation of g at 0 is
L
0
(x) = 1
1
2
x.
Dierentiate equation (4) with respect to x:
3y
2
d
2
y
dx
2
+ 6y
_
dy
dx
_
2
d
2
y
dx
2
= 0.
Evaluate this at the point (x, y) = (0, 1) obtaining
d
2
y
dx
2
=
3
4
, and hence g
(0) =
3
4
.
The second degree Taylor polynomial of g at 0 is
P
2,0
(x) = 1
1
2
x
3
8
x
2
.
In this way, we can obtain information about the implicitly dened function g:
g(x) 1
1
2
x
3
8
x
2
,
for x suciently close to 0.
The equation EXERCISE 1
xy sin y = 0
denes y implicitly as a function of x, y = g(x), with g(0) = . Find the linear
approximation of g at x = 0.
194 Appendix A Implicitly Dened Functions
If the function y = g(x) is dened implicitly by the equation f(x, y) = 0, where f has
continuous partials, one can derive a formula for g
(x) in terms of the partial derivatives

of f. We have
f(x, g(x)) = 0
for all x in some interval. This equation states that the composite function f(x, g(x))
is the zero function. Thus
d
dx
f(x, g(x)) = 0.
Use the Chain Rule to expand this derivative, obtaining
f
x
(x, g(x)) + f
y
(x, g(x))g
(x) = 0.
If f
y
(x, g(x)) = 0, we can solve for g
(x),
g
(x) =
f
x
(x, g(x))
f
y
(x, g(x))
.
It is not necessary to memorize this formula. What is of interest is its structure. In
Leibniz notation, this equation reads
dy
dx
=
f
x
f
y
.
The minus sign if puzzling one cannot think of
canceling the fs, as is sometimes possible in
single variable calculus.
There is a simple geometrical explanation, how-
ever. In the diagram,
f
x
> 0 and
f
y
> 0,
based on the direction of the gradient vector, but
dy
dx
< 0, based on the slope of the tangent line.
The equation EXERCISE 2
f(x, y) = 0
denes y implicitly as a function of x, y = g(x). If f(1, 3) = 0 and f(1, 3) = (3, 5),
nd g
(1). Assume that f has continuous partial derivatives.

Section A.2 Implicit Dierentiation 195
Generalization
A function g : R
2
R can be dened implicitly by an equation of the form f(x, y, z) =
0, where f : R
3
R. One can use implicit dierentiation to calculate the partial
derivatives of g. We assume that f has continuous partial derivatives.
The equation f(x, y, z) = 0 determines z implicitly as a function of x and y, z = g(x, y). EXAMPLE 2
If f(2, 1, 1) = 0, and f(2, 1, 1) = (4, 6, 2), nd the linear approximation of g at
(1, 1).
Solution: We have
f(x, y, g(x, y)) = 0 (5)
for all (x, y) in some subset of R
2
. Dierentiate equation (5) with respect to x, treating
y as a constant:
x
f(x, y, g(x, y)) = 0.
Expand the left side using the Chain Rule
f
x
(x, y, g(x, y))(1) + f
y
(x, y, g(x, y))(0) + f
z
(x, y, g(x, y))g
x
(x, y) = 0.
Evaluate at (x, y) = (2, 1), with g(2, 1) = 1:
f
x
(2, 1, 1) + f
z
(2, 1, 1)g
x
(2, 1) = 0.
Since f(2, 1, 1) = (4, 6, 2), we obtain
4 + 2g
x
(2, 1) = 0
and so
g
x
(2, 1) = 2.
Similarly one can show that
g
y
(2, 1) = 3.
The linear approximation of g at (2, 1) is:
L
(2,1)
(x, y) = 1 2(x 2) + 3(y + 1).
Referring to example 2, show that g
y
(2, 1) = 3. EXERCISE 3
A.2 The Implicit Function Theorem
In section A.1, we showed that the derivatives of a function y = g(x) that is dened
implicitly by an equation f(x, y) = 0, can be calculated in a routine manner, even
though the function g cannot be solved for explicitly.
In this section we show how to obtain more information about the set of points (x, y)
which satisfy an equation f(x, y) = 0, called the null set of f, and denoted by N(f):
N(f) = {(x, y) | f(x, y) = 0}.
This set is simply the level curve of f which corresponds to the constant value
1
0.
We begin by considering a number of simple examples which illustrate that it is dicult
to make any general statements about the null set of f, even if f is a well-behaved
function. In all the examples, f is a polynomial function, and hence has continuous
partial derivatives of all orders.
(i) f(x, y) = x
2
y
N(f) is the graph of a dierentiable function
y = g(x) = x
2
.
(ii) f(x, y) = y
3
y x
N(f) is a smooth curve, which is not the graph of a
function y = g(x).
1
It is not important that we have used 0 as the constant value, since the level set f(x, y) = k, is
the null set of the function g dened by g(x, y) = f(x, y) k.
Section A.2 The Implicit Function Theorem 197
(iii) f(x, y) = x
2
y
3
N(f) is the graph of a non-dierentiable function
y = g(x) = x
2/3
Note that g
(0) does not exist.

(iv) f(x, y) = x
2
+ x
3
+ y
2
N(f) is a self-intersecting curve (it could be the path
of an electron in a magnetic eld).
(v) f(x, y) = x
2
y
2
N(f) consists of two intersecting curves.
(vi) f(x, y) = (x y)
2
1
N(f) consists of two disjoint curves.
(vii) f(x, y) = x
2
+ y
2
N(f) is a single point (0, 0).
(viii) f(x, y) = x
2
+ y
2
+ 1.
N(f) is the empty set, i.e. 0 does not belong to the range of f.
REMARK
In general, for a given x-value, the equation f(x, y) = 0 does not have a unique
solution for y, and may not have any solution. However, by studying the sketches,
we see that apart from a few exceptional points, for each point (a, b) N(f) there
is a neighborhood of (a, b) such that when restricted to this neighborhood, N(f) is
the graph of a dierentiable function y = g(x). The function y = g(x) represents the
unique solution of the equation f(x, y) = 0 in this neighborhood.
The question is: how can we locate the exceptional points if we dont have a picture
of the null set N(f)?
The answer is: by studying the gradient vector f. Heres how:
If a level set f(x, y) = 0 is a smooth curve, and (a, b) lies on the curve (i.e. f(a, b) =
0), then f(a, b) is normal to the tangent line to the curve at (a, b). Thus, at the
exceptional points A and B in example (ii), and A in example (iv), at which the
tangent line is vertical, f = (f
x
, 0) i.e. f
y
= 0. At the exceptional point (0, 0) in
examples (iii), (iv), (v) and (vii), where the level set f(x, y) = 0 is not a smooth curve,
we have f = (0, 0), as can be veried explicitly (exercise).
The examples thus suggest that if f
y
(a, b) = 0, then the level set f(x, y) = 0 is the
graph of a function y = g(x) in some neighborhood of (a, b), or equivalently that the
equation f(x, y) = 0 has a unique solution y = g(x). This result is called the Implicit
Function Theorem.
Implicit Function Theorem 1
Let f : R
2
R, f C
1
in a neighborhood of (a, b). If f(a, b) = 0 and f
y
(a, b) = 0,
then there exists a neighborhood of (a, b) in which the equation f(x, y) = 0 has a
unique solution for y in terms of x, y = g(x), where g : R R has a continuous
derivative.
Proof: The proof is left to the end of the chapter.
REMARK
The roles of the variables x and y can be interchanged. If the hypothesis f
y
(a, b) = 0
is replaced by f
x
(a, b) = 0, the conclusion is that the equation f(x, y) = 0 has a unique
solution for x,
x = h(y).
The theorem and comment lead to the following:
If f : R
2
R, f C
1
, and Corollary 1
f(a, b) = 0, f(a, b) = 0
(i.e. at least one partial derivative non-zero at (a, b)), then near the point (a, b), the
equation f(x, y) = 0 describes a smooth curve, whose tangent line at (a, b) is orthogonal
to f(a, b). If f
y
(a, b) = 0 then the curve can be written uniquely in the form
y = g(x),
and if f
x
(a, b) = 0, it can be written uniquely in the form
x = h(y).
In the sketch below, we illustrate the Implicit Function Theorem using the function
f(x, y) = x
2
+ x
3
+ y
2
in example (iv).
A: f(0, 0) = (0, 0).
Not a smooth curve in this neighborhood.
There is not a unique solution.
B: f(
2
3
,
2
3
3
= (0,
4
3
3
.
Smooth curve in this neighborhood. Unique
solution y = g(x).
C: f(1, 0) = (1, 0).
Smooth curve in this neighborhood. Unique solution x = h(y).
D: f(
3
4
,
3
8
) = (
3
16
,
3
4
).
Smooth curve in this neighborhood. Unique solution y = g(x) and x = h(y).
a. Prove that the equation f(x, y) = 2x
2
2y
2
+y
4
= 0 has a unique solution y = g(x) EXERCISE 4
near the point
_

7
4
2
,
1
2
_
.
b. At what points is the tangent line to the curve f(x, y) = 0 horizontal/vertical?
c. Use b. to sketch the set dened by f(x, y) = 0.
It is a self-intersecting curve with a familiar shape.
Generalization
The considerations of this section can be applied to an equation of the form
f(x, y, z) = 0.
The geometric interpretation is in terms of surfaces in R
3
.
We rst state the Implicit Function Theorem and its corollary for f : R
3
R.
Let f : R
3
R, f C
1
in a neighborhood of a. If f(a) = 0 and f
z
(a) = 0, then Theorem 2
there exists a neighborhood of a in which the equation f(x, y, z) = 0 has a unique
solution for z in terms of x and y, z = g(x, y), where g : R
2
R has continuous
partial derivatives.
If f : R
3
R has continuous partial derivatives, and Corollary 1
f(a) = 0, f(a) = 0
(i.e. at least one partial derivative is non-zero at a) then near the point a, the equation
f(x, y, z) = 0 describes a smooth surface in R
3
whose tangent plane at a is orthogonal
to f(a).
If f
z
(a) = 0, then the surface can be described uniquely in the form
z = g(x, y),
near the point a. In general, however, the equation f(x, y, z) = 0 will not be the graph
z = g(x, y) of one function g : R
2
R
e.g. f(x, y, z) = x
2
+ y
2
+ z
2
1 = 0 represents a sphere, and can thus be described
by the graphs of two functions,
z =
_
1 x
2
y
2
and z =
_
1 x
2
y
2
REMARK
When applying the Implicit Function Theorem it is easy to remember which partial
derivative of f must be non-zero: it is the partial derivative with respect to the variable
for which one wishes to solve.
Prove that the equation EXAMPLE 3
F(x, y, z) = ye
z
+ xz x
2
y
2
= 0
has a unique solution for x in terms of y and z in a neighborhood of (0, 2, ln2).
Solution: F has continuous partials for all (x, y, z) R
3
by inspection. In addition
F(0, 2, ln2) = 0. The essential condition is that
F
x
(0, 2, ln2) = 0.
This is easily veried, since
F
x
= z 2x.
The equation
f(x, y, z) = z
3
xz + y = 0 (A.3)
describes a smooth surface with a fold, like
a wave on the point of breaking. Show that
the curve x = (3t
2
, 2t
3
, t), t R lies on the
surface and that the tangent plane is vertical
at each point of this curve.
EXAMPLE 4
Solution: Firstly, show that f(x, y, z) is zero along the given curve:
f(3t
2
, 2t
3
, t) = t
3
(3t
2
)(t) + 2t
3
= 0,
for all t R. Thus the curve lies in the surface.
The gradient vector is
f(x, y, z) = (z, 1, 3z
2
x).
Evaluate this vector on the curve:
f(3t
2
, 2t
3
, t) = (t, 1, 3t
2
3t
2
) = (t, 1, 0).
This shows that f is parallel to the xy-plane at points on the curve. Since f is
orthogonal to the tangent plane of the surface f(x, y, z) = 0, it follows that the tangent
plane of the surface f(x, y, z) = 0 is vertical at points on the given curve.
REMARK
The fact that equation (A.3) does describe a smooth surface follows from the fact
that (A.3) can be solved for y by inspection:
y = g(x, z) = xz z
3
,
and that g has continuous partials.
In order to verify the shape of the surface given by equation (A.3), sketch some typical EXERCISE 5
cross-sections x = a, given by
z
3
az + y = 0
in the cases a > 0, a = 0, a < 0.
Proof of the Implicit Function Theorem
We give a proof
2
of the Implicit Function Theorem. The proof depends on the Inter-
mediate Value Theorem and the Mean Value Theorem from single variable calculus.
For convenience we state the theorem again.
Implicit Function Theorem 1
Let f : R
2
R, f C
1
in a neighborhood of (a, b). If f(a, b) = 0 and f
y
(a, b) = 0,
then there exists a neighborhood of (a, b) in which the equation f(x, y) = 0 has a
unique solution for y in terms of x, y = g(x), where g : R R has a continuous
derivative.
2
This proof was provided by David Siegel.
Proof: Existence of a Unique Solution:
Suppose, without loss of generality, that f
y
(a, b) > 0. Since f
y
is continuous at (a, b),
there exists a neighborhood B of (a, b) such that
f
y
(x, y) > 0, for all (x, y) B (A.4)
This implies that in B, f(x, y) is increasing as
a function of y. Hence, for all > 0 suciently
small,
f(a, b + ) > f(a, b) = 0
f(a, b ) < f(a, b) = 0
Since f is continuous at (a, b), there exists a
= () > 0 such that |x a| < implies
f(x, b + ) > 0 and f(x, b ) < 0.
For xed x, with |x a| < , dene a function F : R R by
F(y) = f(x, y).
Then
F(b + ) > 0 and F(b ) < 0. (A.5)
Also by (A.4),
F
(y) = f
y
(x, y) > 0, for |y b| < . (A.6)
By (A.5) and the Intermediate Value Theorem, the equation F(y) = 0 has a solution
between b and b + , and by (A.6) this solution (for y) is unique (since F is
increasing).
Since the function F depends on x, with
|x a| < , this solution depends on x, and
we denote it by g(x). Then F(g(x)) = 0 and
hence f(x, g(x)) = 0, for |x a| < , i.e. g(x)
is a solution of f(x, y) = 0.
g has a Continuous Derivative:
Since f(x, g(x)) = 0 , for all x in some neighborhood of a, we obtain, for h suciently
close to zero,
0 = f(x + h, g(x + h)) f(x, g(x))
=
_
f(x + h, g(x + h)) f(x + h, g(x))
_
+
_
f(x + h, g(x)) f(x, g(x))
_
= f
y
(x + h, c
1
)
_
g(x + h) g(x)
_
+ f
x
(c
2
, g(x))h,
where c
1
lies between g(x) and g(x+h), and c
2
lies between x and x+h (by the Mean
Value Theorem). It follows that
g(x + h) g(x)
h
=
f
x
(c
2
, g(x))
f
y
(x + h, c
1
)
, for h = 0.
As h tends to 0, c
1
approaches g(x) and c
2
approaches x. It follows by continuity of
f
x
and f
y
(H
2
) that
lim
h0
g(x + h) g(x)
h
exists. Thus g
(x) exists, and

g
(x) =
f
x
(x, g(x))
f
y
(x, g(x))
.
So, by the Continuity Theorems g
is continuous.
Solutions to the Exercises 205
Appendix B
Solutions to the Exercises
Answers to Chapter 1
Exercise 1: a) The domain of f is 1 x
2
y
2
> 0
x
2
+ y
2
< 1. The range is z 0.
b) The domain of f is 16 x
2
+ y
2
0 x
2
y
2
16.
The range is z 0.
Exercise 2:
206 Appendix B Solutions to the Exercises
Exercise 3:
Exercise 4:
Exercise 2: Show that lim
y0
f(0, y) = 0 = 1. Thus f(x, y) does not approach a unique
value as (x, y) (0, 0).
Exercise 3: a) Show that lim
x0
f(x, mx
3
) =
m
1 + m
2
. Thus f(x, y) does not approach
a unique value as (x, y) (0, 0).
b) Show that lim
x1
f(x, 0) does not exist. Thus lim
(x,y)(1,0)
Exercise 4: If m(x) f(x) M(x) and lim
xa
m(x) = L = lim
xa
M(x), then lim
xa
f(x) =
L. To change this into our version take m(x) = B(x) + L and M(x) = B(x) + L.
Exercise 5: |x
3
y
3
| |x
3
| +|y
3
|
_
|x| +|y|
_
(x
2
+y
2
). Equality holds if and only
if x = 0 or y = 0.
x0
f(x, mx) = 1. Hence the limit may exist and equal 1.
A suitable inequality is
0 |f(x, y) L| =
x
2
(x 1) y
2
x
2
+ y
2
(1)
=
|x
3
|
x
2
+ y
2
|x|.
The Squeeze Theorem implies that lim
(x,y)(0,0)
f(x, y) = L = 1.
Exercise 1: One example is f(x) =
_
_
_
1, if x 0
0, if x < 0.
Show lim
x1
f(x) = 0 = 1 = lim
x1
+
f(x).
Exercise 2: Use
|xy|
|x|+|y|

|x|(|x|+|y|)
|x|+|y|
= |x| to prove that lim
(x,y)(0,0)
f(x, y) = 0 = f(0, 0).
Hence f is continuous at (0, 0).
Exercise 3: By the limit theorem and the denition of product:
lim
xa
(fg)(a) = lim
xa
f(a) lim
xa
g(x)
= f(a)g(a), by the hypothesis
= (fg)(a), by denition of product.
Therefore, by denition of continuity, fg is continuous at a.
Exercise 4: Apply the limit theorems, the denition of quotient, and the denition
of continuity as in exercise 1. g(a) = 0 is used explicitly when you use the denition
of quotient and the limit theorem.
Note: Since g(a) = 0 and g is continuous at a, g(x) = 0 for all x in some neighborhood
of a.
Exercise 5: For f(x, y) = k we have lim
xa
k = k = f(a). For f(x, y) = x we have
lim
(x,y)(a,b)
x = a = f(a, b). For f(x, y) = y we have lim
(x,y)(a,b)
y = b = f(a, b). So, they
are all continuous on their domain.
Exercise 6: Use the coordinate functions, the constant function, | | and sin(). Use
the sum, product, quotient and composition theorems.
Exercise 7: h(x, y) = (xy)
= e
ln(xy)
. Use the coordinate and constant functions,
e
()
and ln(). Use the product and composition theorems.
x0
f(x, mx) =
m
1 + m
2
. Therefore lim
(x,y)(0,0)
f(x, y) does not
exist, and you cannot make f continuous at (0, 0).
Exercise 9: By the Continuity Theorems f(x, y) = ln(1 + e
sin xy
) is continuous for
all (x, y). Consequently lim
(x,y)(1,)
f(x, y) = f(1, ) = ln2.
Exercise 1: f
x
= y
2
cos(xy
2
), f
y
= 2xy cos(xy
2
).
Exercise 2: Show that for a = 0, h = 0,
f(a + h, a) f(a, a)
h
=
(3a
2
+ 3ah + h
2
)
1/3
h
2/3
.
Since lim
h0
(3a
2
+ 3ah + h
2
)
1/3
h
2/3
= + for a = 0,
f
x
(a, a) does not exist.
Exercise 3: Show that
f(h, a) f(0, a)
h
=
|h||a 1|
h
. Hence, with a = 0, f
x
(0, 0)
does not exist, but with a = 1, f
x
(0, 1) = 0.
Exercise 4:
f
x
(a, b, c) = lim
h0
f(a+h,b,c)f(a,b,c)
h
provided the limit exists. Similarly
for
f
y
and
f
z
.
Exercise 5: f
xx
=
2(y
2
x
2
)
(x
2
+ y
2
)
2
; by symmetry f
yy
=
2(x
2
y
2
)
(x
2
+ y
2
)
2
.
Exercise 6: f
xy
= x
y1
(1 + y ln x) = f
yx
.
Exercise 7: z = 5 +
3
5
(x 3)
4
5
(y + 4).
Exercise 8: The equation of the tangent plane at (a, b,
a
2
+ b
2
) is
z =
a
2
+ b
2
+
a
a
2
+ b
2
(x a) +
b
a
2
+ b
2
(y b).
Substituting (x, y) = (0, 0) gives z = 0.
Exercise 10: Let f(x, y) =
sin x + tany, (a, b) =

_
0,

4
_
.
Show that
sin x + tan y 1 +
1
2
x +
_
y

4
_
, for (x, y) suciently close to
_
0,

4
_
.
Hence
_
sin
_
1
10
_
+ tan
_
3
4
_
1.015. [Calculator value 1.0156]
Exercise 12: The area A is A = f(x, ) =
1
4
x
2
tan. Show
that A 0.48m
2
. [Calculator 0.4626]
Exercise 13: Let f(x, y, z) = xyz, a = (5, 7, 10). Show that
xyz 350 + 70(x 5) + 50(y 7) + 35(z 10).
Hence 4.99 7.01 9.99 349.45. [Calculator 349.4492]
Exercise 1: Use the denition of partial derivative to show that f
x
(0, 0) = 1 and
f
y
(0, 0) = 0. It follows that
|R
1,0
(x)|
x0
= g(x, y), where g(x, y) =
|xy
2
|
(x
2
+y
2
)
3/2
. Show that
lim
x0
g(x, x) = 2
3
2
= 0.
|R
1,0
(x)|
x 0
=
|xy|
_
x
2
+ y
2
, and that 0
|xy|
_
x
2
+ y
2
|y|.
f(h, 1) f(0, 1)
h
=
|h|
h
, so that f
x
(0, 1) does not exist. Hence,
f can not be dierentiable.
Exercise 4: One possibility is f(x, y) =
_
(x 1)
2
+ (y 2)
2
.
Exercise 5: f is not dierentiable at a, since if it were, Theorem 1 would imply that
f is continuous at a, a contradiction.
Exercise 6: See example 1 in section 5.1.
Exercise 7: Show that f
x
(0, 0) = 0 = f
y
(0, 0), and
|R
1,0
(x)|
x 0
= (x
2
+ y
2
)
1/6
.
Exercise 8: Since the partial derivatives of f
x
, f
xx
and f
xy
are continuous, f
x
is
dierentiable and hence continuous. Similarly, the partial derivatives of f
y
are contin-
uous, hence f
y
is dierentiable and thus continuous. Therefore f is dierentiable and
hence also continuous.
Exercise 9: Use the approximation formula to obtain
g(x, y) =
_
1 + 3 tanx + sin y 2 +
3
2
_
x

4
_
+
1
4
y.
Use the continuity theorems to prove that g has continuous partials. Then theorem 2
implies that the approximation is valid for (x, y) suciently close to
_
4
, 0
_
.
Exercise 1: f is not dierentiable at (0, 0).
Exercise 2: f
(1) = 2. Assume that g is dierentiable at (2, 0).

Exercise 3:
dT
dt
(0) =
8
5
.
Exercise 4: g
(t) = f
x
(cos t, sin t)(sin t) + f
y
(cos t, sin t)(cos t); g
3
_
=
1
2
.
Exercise 5:
dT
dt
= h
x
(x(t))x
(t)+h
y
(x(t))y
(t)+h
z
(x(t))z
(t), where x(t) = (x(t), y(t), z(t)).

Exercise 6: g
(t) = F(t, t
2
, t
3
) (1, 2t, 3t
2
); g
(1) = 6.
Exercise 7: Repeat what was done on page 62 and use the linear approximation
again to evaluate
x
t
and
y
t
. We need the functions to be dierentiable to ensure the
linear approximation is a good approximation.
Exercise 8:
g
y
(x, y) = 2xD
1
f(2xy, x
2
y
2
) 2yD
2
f(2xy, x
2
y
2
), so
g
x
(1, 1) = 2.
Exercise 9: g
(t) = D
1
f(h(t)+t, h(t)t)(h
(t)+1)+D
2
f(h(t)+t, h(t)t)(h
(t)1).
Exercise 10: g
(1) = 2.
Exercise 11: Repeat what was done on page 62.
Exercise 12:
g
x
(x, y) = (1)f(2xy, x
2
y
2
) + x
_
(2y)D
1
f(2xy, x
2
y
2
) + 2xD
2
f(2xy, x
2
y
2
)
_
.
g
x
(1, 1) = 9.
Exercise 13:
u
s
= D
1
f( )
x
s
+ D
2
f( )
y
s
+ D
3
f( )(1),
where ( ) =
_
x(s, t), y(s, t), s, t
_
.
Exercise 14: Assume that g has continuous second partials.
Exercise 15: f
x
= yg
(xy), f
y
= xg
(xy), f
xy
= f
yx
= g
(xy) + xyg
(xy).
Assume that g has a continuous second derivative.
Exercise 1: D
u
f(1, 1, 2) =
4
3e
2
, with u =
_
1
3
,
2
3
,
2
3
_
.
Exercise 2: The largest rate of change is f(0, 1) =
5, and occurs in the

direction (1, 2).
Exercise 3: Give f and a such that f(a) = 0, e.g. f(x, y) = x
2
+ y
2
, a = (0, 0).
The tangent plane is horizontal at a.
Exercise 4: Show that f g = 0, and apply
theorem 3.
Exercise 5: (x1)+2(y1)+3
3(z
3) = 0.
Exercise 6: 8(x 1) 3(y 2) + (z + 2) = 0.
Exercise 1: P
2,a
(x, y) =
2
3
(x 1)
2
+
1
2
y
2
.
Exercise 2: We have f
xx
= 4e
2x+y
, f
xy
= 2e
2x+y
, and f
yy
= e
2x+y
. Since
f C
2
, by Taylors Theorem there is a point c such that
R
1,(1,1)
(x, y)
=
1
2
f
xx
(c)(x 1)
2
+ 2f
xy
(c)(x 1)(y 1) + f
yy
(c)(y 1)
2
1
2
_
|f
xx
(c)| (x 1)
2
+ 2|f
xy
(c)||(x 1)||(y 1)| +|f
yy
(c)|(y 1)
2
,
by the triangle inequality. Thus, on 0 x 1 and 0 y 1 we have
|f
xx
| 4e, |f
xy
| 2e, |f
yy
| e.
Hence,
R
1,(1,1)
(x, y)
2e(x 1)
2
+ 2e|x 1||y 1| +
1
2
e(y 1)
2
2e(x 1)
2
+ e(x 1)
2
+ e(y 1)
2
+ 2e(y 1)
2
= 3e[(x 1)
2
+ (y 1)
2
]
Exercise 3:
P
3,(a,b)
= P
2,(a,b)
(x, y) +
1
6
f
xxx
(a, b)(x a)
3
+
1
2
f
xxy
(x a)
2
(y b)
+
1
2
f
xyy
(x a)(y b)
2
+
1
6
f
yyy
(a, b)(y b)
3
Exercise 1: f
x
= y(1 + x)e
xy
, f
y
= x(1 y)e
xy
; (0, 0) and (1, 1).
Exercise 2: Critical points are
_
0,

2
+ k
_
, k Z.
Exercise 3: One possibility is a linear function, e.g. f(x, y) = 2x + 3y.
Exercise 5: One critical point (0, 0), a saddle point.
Exercise 6: The critical points are (1, 0) and
_
0,
1
3
_
. Hf(1, 0) =
_
0 2
2 0
_
,
indenite; (1, 0) are saddle points. Hf
_
0,
1
3
_
=
_
2
3
3
0
0 2
3
_
, positive denite;
_
0,
1
3
_
is a local minimum point. Similarly,
_
0,
1
3
_
is a local maximum point.
Exercise 1: 1. I = [0, 2], f(x) =
_
_
_
x, if 0 x < 1
x 2 if 1 x 2.
2. I = [0,

2
], f(x) = tanx. 3. I = [1, ), f(x) =
1
x
.
Exercise 2: Critical points of f are (1, 0) and (1, 0).
On the boundary, g(t) = f(2 cos t, 3 sin t) = 3 sin t(3
4 sin
2
t). Critical points of g are t =

6
,
5
6
,
7
6
,
11
6
,

2
,
3
2
.
Maximum value of f is 3, and occurs at
_
3,
3
2
_
and
(0, 3).
Exercise 3: Maximum value is
1
4
, and occurs at
_
1
2
,
1
2
_
.
Exercise 4: Maximum value is
1
2
and occurs at
_
2
,
1
2
_
.
Exercise 5: Maximum value is 24 + 4
6 at (2
6, 0) and the minimum value is -1

at (1, 0).
Exercise 6: The closest points are (0, 0, 1).
Exercise 1:
Exercise 2:
Exercise 3: r
2
cos 2 = 1.
Exercise 4: A = 2
/2
_
0
1
2
[2
sin 2]
2
d = 4.
Exercise 5: A =
/4
_
0
1
2
sin
2
d+
/2
_
/4
1
2
cos
2
d =

8
1
4
.
Exercise 6:
Exercise 7: z = sin , r = 0.
Exercise 8: = 2 cos sin ,
2

2
.
Exercise 1: The image is
_
u
1
2
_
2
+
_
v +
1
2
_
2
=
1
2
.
Exercise 2: F(S) = {(u, v) | v u 2v, 2 v 3}.
Exercise 3:
_
u
v
_
DF(1, 0)
_
x
y
_
=
_
1 1
1 1
__
x
y
_
, for x, y suciently
small. F(0.95, 0.1) (0.05, 0.15). [Calculator (0.0488, 0.1625)]
Exercise 4: a) D(F G) =
_
_
4uvx
2x
2
+2y
2
+ 2u
2 4uvy
2x
2
+2y
2
+ 2yu
2
2xve
uv1
2x
2
+2y
2
+ 2ue
uv1 2yve
uv1
2x
2
+2y
2
+ 2yue
uv1
_
_
b) D(G F)(1, 1) = DG(1, 1)DF(1, 1) =
_
3 2
6 4
_
.
c) (G F)(1.01, 0.98) (G F)(1, 1) + D(G F)(1, 1)
_
.01
0.02
_
=
_
1.99
2.98
_
.
Exercise 1:
(x, y)
(r, )
= det
_
cos r sin
sin r cos
_
= r.
Exercise 2: This involves solving a quadratic equation. In order to choose the
appropriate sign, ensure that the image of (x, y) = (1, 2), i.e. the point (u, v) =
(3, 1), is mapped by F
1
onto (1, 2) again.
Exercise 3:
(u,v)
(x,y)
= x
2
y, so if
1
2
x 1 and
1
2
y 1, then x
2
y 1. Thus the
image of S under F will have less area.
(u, v)
(x, y)
= 1.
(u, v)
(x, y)
= 1.
Exercise 6: Observe that 3x
2
+2xy +y
2
= 2x
2
+(x+y)
2
, so take u = x and v =
x+y
2
.
Exercise 7: Let u = xy, v = xz, and w = yz. The cube is 1 u 3, 1 v 3,
2 w 4.
Exercise 1: The integral equals the number of people in the region D.
Exercise 2:
a) D : 0 x 4 y
2
, 2 y 2.
I =
__
D
(x+y) dA =
2
_
2
4y
2
_
x=0
(x+y) dx dy =
256
15
.
b) D :
4 x y
4 x, 0 x 4.
I =
4
_
0
4x
_
4x
(x + y) dy dx.
Exercise 3: D : x y 2 x, 0 x 1.
__
D
y dA =
1
_
0
2x
_
x
y dy dx = 1.
Exercise 4: D : 0 x y, 0 y 1.
__
D
e
y
2
dA =
1
_
0
y
_
0
e
y
2
dx dy =
e 1
2e
.
Exercise 5: V (4 x
2
y
2
)A; D is a rectangle.
V =
__
D
(4 x
2
y
2
) dA =
1
_
0
1
_
0
(4 x
2
y
2
) dy dx =
10
3
.
Exercise 8: Use polar coordinates.
__
Dxy
1
_
x
2
+ y
2
dx dy =
2
_
0
2
_
1
1
r
(r) dr d =

2
.
Exercise 9: Let (u, v) =
_
xy,
y
x
_
. Use the inverse property of the Jacobian to show
that
(x, y)
(u, v)
=
1
2v
. Then
I =
3
_
2
e
_
1
u
_
1
2v
_
dv du =
5
4
.
Exercise 2:
___
D
z dV =
c
_
0
a(1
z
c
)
_
0
b(1
x
a
z
c
)
_
0
z dy dx dz.
Exercise 3: A triple integral can be written as an iterated integral in 3! = 6 ways.
Exercise 4: Refer to the diagram in example 2. The iterated integral is
2
_
0
62y
_
2y
4y
2
_
0
z
4 y
dz dx dy.
In order to integrate rst with respect to y, you would have to decompose D into
several pieces.
Exercise 5: D is described by the inequali-
ties 0 z 1 y, 0 y 1, 0 x 2.
___
D
y dV =
2
_
0
1
_
0
1y
_
0
y dz dy dx =
1
3
.
Exercise 7: The solid is described by the inequalities 1 x + y 2, 1 x y
1, 0 x + y + z 3. Let (u, v, w) = (x + y, x y, x + y + z). Show that
(x, y, z)
(u, v, w)
=
1
2
.
Then
V =
2
_
1
1
_
1
3
_
0
1
_
1
2
_
dw dv du = 3.
Exercise 8: Use cylindrical coordinates.
M =
2
_
0
1
_
0
2
_
0
k(2 z)
1 + r
2
(r) dz dr d = 2k ln 2.
Exercise 9: Use cylindrical coordinates.
The paraboloid is z = r
2
, and the lower part
of the cone is z = 2 r, and they intersect in
the circle r = 1.
V =
2
_
=0
1
_
0
2r
_
r
2
1(r) dz dr d =
5
6
.
Exercise 11: Use the hint, and show that
(x, y, z)
(u, v, w)
= abc.
V =
___
Duvw
abc du dv dw,
where D
uvw
is given by u
2
+v
2
+v
2
= 1. Replace u, v, w by spherical coordinates and
show that
V =
4
3
abc.
Exercise 12: Use spherical coordinates, and refer to the diagram in the text.
V =
2
_
0
_
0
b
_
0
r
2
sin dr d d =
2
3
b
3
(1 cos ),
and use a trig identity.
Problem Sets
Problem Set 1
Level Curves, Limits, Continuity
Section A
A1. For the following functions f : R
2
R, sketch typical level curves and any
exceptional level curves. Level curves are dened by f(x, y) = k, where k is a constant.
a) f(x, y) = 4x
2
y
2
b) f(x, y) = x
2
+ 4y
2
9 c) f(x, y) = x
2
+ y
2
4(x + y)
d) f(x, y) = e
4x
2
y
2
1 e) f(x, y) = 2xy y
2
Discuss the following:
What values can k assume? i.e. determine the range of f.
How do the level curves change as k increases?
Shade in any region of the xy-plane for which k > 0.
Sketch some typical cross-sections x = c, and some typical cross-sections
y = d.
Describe/draw/visualize the surface z = f(x, y) in 3-space.
A2. Let f(x, y) =
x
4
y
5
x
4
+ y
4
. Prove that lim
(x,y)(0,0)
A3. Let f(x, y) =
x
2
(y 3) 6y
2
x
2
+ 2y
2
. Prove that lim
(x,y)(0,0)
f(x, y) = 3.
220 Problem Set 1
A4. Consider f : R
2
R dened by f(x, y) =
_
_
_
sin(xy)
ln(x
2
+y
2
+1)
, for (x, y) = (0, 0)
0, for (x, y) = (0, 0).
Prove that f
x
(0, 0) and f
y
(0, 0) exist, but that f is not continuous at (0, 0).
A5. Let f(x, y) =
y
2
x
4
y
2
+ x
4
, for (x, y) = (0, 0).
(a) Give the range of f, and sketch typical level curves f(x, y) = k. In your diagram,
describe the set of points (x, y) for which |f(x, y)|
3
5
.
(b) On the basis of part (a), draw a conclusion about lim
(x,y)(0,0)
f(x, y). Explain.
Section B
B1. Repeat question A1 for the following functions.
a) f(x, y) = 1 x
4
y
4
b) f(x, y) = 1 (x
2
+ y
2
4)
2
B2. For each function f, determine (with proof) whether or not lim
(x,y)(0,0)
f(x, y) exists.
Dene f(0, 0) so as to make the function continuous at (0, 0), when possible.
(i) f(x, y) =
x
3
2y
3
x
2
+ 2y
2
(ii) f(x, y) =
xy
4
x
2
+y
6
(iii) f(x, y) =
xy
3
x
2
+y
6
(iv) f(x, y) =
2|x| |y|
|x| + 2|y|
(v )f(x, y) =
x
2
6y
2
|x|+3|y|
(vi )f(x, y) =
sin(x
2
+ 2y
2
)
x
2
+ y
2
(vii) f(x, y) =
y
2
4|y|2|x|
|x|+2|y|.
B3. Let f(x, y) =
sin 2(x
2
+ y
2
)
(x
2
+ y
2
)
, for = (0, 0).
(a) By using the inequality
1
6
3
sin , for 0
or otherwise, evaluate lim
(x,y)(0,0)
f(x, y).
(b) Dene f(0, 0) so as to make the function f continuous at (0, 0).
(c) Use theorems on continuity to prove that the function f as dened in (b) is
continuous for all (x, y) = (0, 0).
Problem Sets 221
B4. A function f : R
2
R is dened by f(x, y) =
_
_
1e
|xy|
x
2
+y
2
if (x, y) = (0, 0)
0 if (x, y) = (0, 0).
Prove that f is continuous for all (x, y) R
2
.
Hint: Dont do much work for (x, y) = (0, 0). For (x, y) = (0, 0), you may use the
inequality 0 < 1 e
u
< u, for all u > 0.
B5.(a) Consider f : R
2
R dened by f(x, y) =
_
_
_
xln(x
2
+ 2y
2
), if (x, y) = (0, 0)
0 if (x, y) = (0, 0).
Prove that f
y
is dened for all (x, y) R
2
, but that f
y
is not continuous at (0, 0).
(b) Invent another function with this property.
B6.(a) Prove that
_
x
4
+ y
4
x
2
+ y
2
for all (x, y) R
2
.
(b) Determine whether lim
(x,y)(0,0)
_
x
4
+ y
4
_
x
2
+ y
2
exists.
B7. The temperature of a metal rod at position x, 0 x 1, and at time t, t 0 is
given by u(t, x) = 100e
t
sin x. Sketch the level curves u = 0, 25, 75, 100. Shade the
region of the tx-plane for which u > 75.
B8. Imagine a hill whose elevation z above sea-level (in meters) at position (x, y) is
given by z = f(x, y), where f(x, y) = 1000 9x
2
4y
2
. A hiker, starting at position
(10,5,0), walks up the hill in a south-westerly direction (the positive y-axis points
northwards). Find the maximum elevation reached by the hiker.
Section C
C1. Let f(x, y) =
|x|
a
|y|
b
|x|
c
+|y|
d
where a, b, c and d are positive numbers.
(a) Prove that if
a
c
+
b
d
> 1 then lim
(x,y)(0,0)
f(x, y) exists and equals zero.
(b) Prove that if
a
c
+
b
d
1, then lim
(x,y)(0,0)
C2. A function g : R
2
R is dened by g(x, y) =
_
y
x
e
t
2
dt.
Sketch the level curves of g.
222 Problem Set 2
Problem Set 2
Partial Derivatives, Linear Approximation, Dierentiability
Section A
A1. Let f(x, y) = x(|y| 1). Prove that f is dierentiable at (0, 0).
A2. Let f(x, y) = x|y 1|. Prove that f is not dierentiable at (1, 1).
A3 a) Write the general form of the linear approximation L
a
(x) at the point a, of a
given function f : R
3
R.
b) Find the linear approximation L
a
(x) of the function f at the point a.
i) f(x, y) = ln(x + 2y), a = (3, 1)
ii) f(x, y) =
sin 3x + 4 tany, a = (0,

4
)
iii) f(x, y, z) = e
x+2y+3z
, a = (1, 1, 1)
iv) f(x, y, z) = ln(x
2
yz), a = (2, 1, 3)
A4. Use the linear approximation to calculate an approximate value for i) (0.99e
0.02
)
8
ii)
_
(4.02)
2
+ (3.95)
2
+ (2.01)
2
iii)
_
e
0.1
+ 3 sin(0.05)
Compare your answers with the value from a calculator.
A5. A function g : R
2
R is dened by g(x, t) = f(x 3t) where f : R R. If
f
(2) = 3, calculate g
x
(5, 1), g
t
(5, 1). Show that g
t
(x, t) = 3g
x
(x, t) in general.
A6. A function f : R
2
R is dened by f(x, y) = ye
x
y
, y = 0. Verify that the second
mixed partial derivatives are equal:
2
f
xy
=

2
f
yx
.
A7. Determine the values of the constants and for which the function u(x, t) =
e
t
sin x satises the 1-d heat equation
u
t
=

2
u
x
2
.
Problem Sets 223
Section B
B1. Let f(x, y) =
_
|xy|
(a) Calculate
f
x
(1, 4),
f
x
(0, 0),
f
x
(0, 1) if they exist. At which of these points is it
necessary to use the denition of derivative?
(b) At what points do the partial derivatives of f not exist? Make a conjecture based
on part (a), and give a proof.
B2. (a) Sketch the surface z = |x y| in R
3
. At what points does the surface not have
a tangent plane?
(b) Verify that the partial derivatives of the function f dened by f(x, y) = |x y| do
not exist at the points found in (a).
Note: This implies that f is not dierentiable at these points, as expected from (a).
B3. Consider the functions f : R
2
R dened by
(i) f(x, y) = (xy)
2/3
(ii) f(x, y) = (xy)
1/3
(iii) f(x, y) = |x|
1
2
|y|
3
2
(iv) f(x, y) =
_
_
_
x
3
+y
4
x
2
+y
2
, for (x, y) = (0, 0)
0 for (x, y) = (0, 0).
(a) Use the denition to determine whether f is dierentiable at (0, 0).
(b) On the basis of your answer in (a), can you use one of the theorems to draw a
conclusion concerning the continuity of f at (0, 0)?
(c) On the basis of your answer in (a), can you use one of the theorems to draw a
conclusion concerning the continuity of f
x
and f
y
at (0, 0)?
B4. Determine whether the functions in B7. are dierentiable at (0, a), a = 0.
Hint: Does f
x
(0, a) exist? Consider the cross-section y = a to get a geometric inter-
pretation.
224 Problem Set 2
B5. (a) Invent a function f : R
2
R which is continuous on R
2
but not dierentiable
at (1, 2). Sketch the surface z = f(x, y).
(b) Invent a function f : R
2
R which is continuous on R
2
but not dierentiable at
all points of the circle x
2
+ y
2
= 1. Sketch the surface z = f(x, y).
B6. Consider the theorem: If f : R
2
R is dierentiable at a then f is continuous at
a. Give a counterexample to show that the converse of the theorem is false.
B7. The temperature of a metal rod at position x, 0 x 1, and at time t, t 0 is
given by u(t, x) = 100e
t
sin x. Find the rate of change of temperature with respect
to position when x =
3
4
and t = 1. Find the rate of change of temperature with
respect to time when x =
3
4
and t = 1. Illustrate these rates of change by sketching
the cross-sections x =
3
4
and t = 1.
B8. Let u(x, t) denote the displacement (in mm.) of a vibrating string at a point x
on the string at time t. How would you physically interpret the functions u
t
(x, t) and
u
x
(x, t)?
B9. A silo consists of a circular cylinder of radius 5 meters, and height 25 meters,
capped by a hemisphere. Suppose that the radius is decreased by 5 centimeters and
the height of the cylinder is increased by 10 centimeters. Use the linear approximation
to estimate the change in volume.
B10. If three resistors R
1
, R
2
, R
3
are connected in parallel, the total electrical resistance
R is determined by
1
R
=
1
R
1
+
1
R
2
+
1
R
3
. If R
1
, R
2
and R
3
initially equal 100, 200 and
300 ohms, and are increased by 1,2,4 ohms respectively, use the linear approximation
to calculate the change in R. Compare with a direct calculation on a calculator.
B11. Find all planes which are tangent to the surface z = 1 x
2
y
2
, and contain the
line passing through the points (1, 0, 2) and (0, 2, 2).
B12. (a) Verify that u = Ae
(xct)
2
, where A and c are constants, satises the 1-d wave
equation
u
tt
= c
2
u
xx
()
(b) Graph u versus x for t = 0,
1
c
,
2
c
,
3
c
, on the same axes. With what speed does the
wave move along the x-axis?
(c) Find a solution of () which describes a wave moving to the left along the x-axis.
Problem Sets 225
(d) Let f : R R have a continuous second derivative. Verify that u = f(x ct) is a
solution of the wave equation ().
B13. Show that u(x, t) =
_ x
2
t
0
e
s
2
ds satises the 1-d heat equation
u
t
=

2
u
x
2
. Sketch
the level curves of u(x, t).
B14. Prove that if f
xx
, f
xy
, f
yx
and f
yy
are continuous at a, then f
x
, f
y
and f are
continuous at a.
Hint: Apply the theorems relating to dierentiability.
B15. Let f(x, y) =
xy(x
2
y
2
)
x
2
+ y
2
if (x, y) = (0, 0), f(0, 0) = 0. Evaluate f
x
(0, y) for all
y, and f
y
(x, 0) for all x, using the denition of the partial derivative where necessary.
Hence show that f
xy
(0, 0) and f
yx
(0, 0) exist and are not equal.
Section C
C1. Let f(x, y) = |x|
r
|y|
s
, where r and s are positive numbers.
i) For what values of r and s is f dierentiable at (0,0)?
ii) For what values of r and s is f dierentiable on R
2
?
C2. Prove that if f satises |f(x, y)| x
2
+y
2
for all (x, y) R
2
, then f is dierentiable
at (0,0).
C3. (a) Give a function f : R
2
R such that

2
f
xy
= 0.
(b) Find all functions f : R
2
R which have continuous second partial derivatives,
and satisfy

2
f
xy
= 0.
(c) Suppose that u(x, t) is a function which has continuous second partial derivativeson
R
2
and which satises the one dimensional wave equation
u
tt
= c
2
u
xx
, ()
where c is a constant. Determine how equation () is transformed under the change
of independent variables expressed by p = x + ct, q = x ct. Using your answer to
part b), obtain the general solution of the wave equation (), and compare your answer
with the special solutions discussed in B2.
Problem Set 3
Chain Rule, Directional Derivatives, Gradient Vector
Section A
A1. Let w = x
2
y + xy
3
, x = 3t + 5, y = 2t
2
10. Use the chain rule to calculate
dw
dt
when t = 2.
A2. (a) State the Chain Rule for a composite function g(t) = f(x(t), y(t)), clearly
indicating the hypotheses and the conclusion.
(b) Given a function f : R
2
R, let g : R R be dened by
g(t) = f(e
t
cos t, e
t
sin t).
If f(1, 0) = (8, 4), nd g
(0). What hypotheses must f satisfy?

A3. Suppose that f : R
3
R is given, and that g : R R is dened by
g(t) = f(t, t
2
, t
3
).
If f(1, 1, 1) = (5, 3, 4), nd g
(1). What hypothesis must f satisfy?

A4. Suppose that f : R
2
R is given, and that g : R
2
R is dened by
g(s, t) = f(st, s
2
t
2
)
If f(2, 3) = (4, 3), nd g(1, 2). What hypothesis must f satisfy?
A5. Write the chain rule for the indicated derivatives of the composite functions,
assuming that the various functions have continuous partial derivatives as required:
i) If w = f(x, y, z), and x = x(s, t), y = y(s, t), z = z(s, t), nd
w
t
.
ii) If z = f(x, y), and y = g(x), nd
dz
dx
.
iii) If z = f(x, y), and y = g(x), x = h(u, v), nd
z
u
.
iv) If w = f(x, y, z), and y = g(x, z), z = h(x), nd
dw
dx
.
v) If w = F(p, q, r, s), and r = f(p, q), s = g(p, q), nd
_
w
p
_
q=const
.
Problem Sets 227
A6. In the following questions, state the assumption that you make about f.
(i) If F(x, y) = yf(x
2
y
2
), show that y
F(x, y)
x
+ x
F(x, y)
y
=
x
y
F(x, y).
(ii) If u = x
3
f
_
y
x
,
z
x
_
, show that x
u
x
+ y
u
y
+ z
u
z
= 3u.
(iii) If F(x, y, z) = f
_
y z
x
,
z x
y
,
x y
z
_
, show that x
F
x
+ y
F
y
+ z
F
z
= 0.
A7. The path of a space-craft is given by (x, y, z) = (e
2t
cos t, e
2t
sin t, 2t + 1)
where t denotes time. The temperature at position (x, y, z) is given by a function
u : R
3
R, and the temperature gradient at (1,0,1) is u(1, 0, 1) =
_
1
5
,
1
3
,
1
4
_
.
(a) Find the velocity of the spacecraft at time t.
(b) Find the rate of change of temperature experienced by the spacecraft at time
t = 0.
A8. (a) Calculate the directional derivative of f at the point a in the direction dened
by v:
(i) f(x) = e
x
cos y; a =
_
0,

4
_
; v = (1, 3).
(ii) f(x) = sin(xyz); a =
_
1, 1,

4
_
; v = (1,
2, 1).
(b) In each case nd the direction at a in which the rate of change of f is greatest,
and nd this maximum rate of change.
A9. The temperature of a metal sheet as a function of position (x, y) is given by
T(x, y) = 100 +10e
x
sin y. Find the rate of change of temperature at the point (0,

4
)
in the direction of the vector (1, 1). Find the direction at (0,

4
) in which the rate of
change is greatest, and nd this rate of change.
A10. Calculate the directional derivative of g(x) = ln(x+e
yz
) at (0, 1, 0) in the direction
from the point (0, 1, 0) to the point (2, 3, 1).
A11. Let f(x, y) = ln(x + 2y). Find the directional derivative of f at (1,0) in the
direction of the line y = 2x 2.
228 Problem Set 3
A12. Let f(x, y) = 2xyy
2
. Use the gradient vector to nd the equation of the tangent
line of the curve f(x, y) = 3 at the point (2, 1). Sketch the curve and the tangent line.
A13. Let f : R
3
R be a dierentiable function such that f(a) = 0. Consider the
surface f(x) = k and assume that f(a) = k. Write down the equation of the tangent
plane to the surface at a, in terms of the gradient vector.
A14. Let f(x, y, z) = x
2
+ 2y
2
3z
2
. Use the gradient vector to nd the equation of
the tangent plane to the surface f(x, y, z) = 3 at the point (2, 1, 1).
A15. Use the gradient vector to verify that the two families of curves intersect each
other orthogonally. Illustrate graphically.
(i) xy = c and y
2
x
2
= k
(ii) (x c)
2
+ y
2
= c
2
and x
2
+ (y k)
2
= k
2
.
A16. A sphere centered at (2, 1, 1) passes through the point P = (1, 1, 1). Find the
equation of the tangent plane to the sphere at P. Sketch the sphere and plane.
A17. Let g(u, v) = f(u
2
v
2
, 2uv). Express (g
u
)
2
+ (g
v
)
2
and g
uu
+g
vv
in terms of the
partial derivatives of f. What hypothesis must f satisfy?
A18. If u = f(x+g(y)), where f and g have a continuous second derivative, show that
u
x
u
xy
= u
y
u
xx
.
A19. A function g : R R with continuous second derivative is given, and f is dened
by f(x, y) = g
_
x
y
_
, for y = 0. Calculate

2
f
xy
and

2
f
yx
and verify that they are
equal.
Section B
B1. (a) Find the directional derivative of w = x
2
+ y
2
in the direction of the tangent
vector to the spiral x = (e
t
cos t, 2e
t
sin t), at the point dened by t = 0.
(b) Find
dw
dt
along the spiral, at the same point.
(c) How are these rates of change related?
B2. At a point a R
2
, the directional derivative of a dierentiable function f(x, y) in
the directions (1, 1) and (1, 1) equals 3 and 2 respectively. Find the largest rate of
change of f(x, y) at a, and the direction in which it occurs.
Problem Sets 229
B3. In what directions at the point (2,1) does the directional derivative of the function
f(x, y) = xy equal 0? Equal
_
5
2
? Express your answer by giving the angle between
the required directions and the gradient of f at (2,1). Give a diagram, showing some
typical level curves of f near (2,1), and the required directions.
B4. A space-ship cruising on the sunny side of the planet Mercury starts to overheat.
The space-ship is at location (1,1,1) and the temperature of the ships hull when at
location (x, y, z) will be T = 200 + e
x
2
2y
2
3z
2
, where x, y, z are in metres.
a) In what direction should the ship proceed in order to decrease temperature most
rapidly?
b) If the ship travels at e
8
m/sec, how fast will the temperature decrease (in de-
grees/sec) if it proceeds in that direction?
c) The metal of the hull will crack if cooled at a rate greater than
14e
2
degrees/sec.
Describe the set of possible directions in which the ship may proceed to bring
the temperature down at that rate. Give a sketch.
B5. A cone, with vertex (0, 0, 2) and axis the z-axis, intersects the plane z = 3 in a
circle of radius
5.
a) Show that the tangent plane to the cone at the point (1, 2, 3) cuts the x-axis
at the point (2,0,0). Give a sketch.
b) Write down a vector equation for the normal line to the cone at (1,-2,3). Hence
show that this line intersects the xy-plane at the point (4,-8,0).
B6. Find all points on the paraboloid z = x
2
+ y
2
1 at which the normal line to the
surface coincides with the line joining the origin to the point. Illustrate your results
with a sketch.
B7. (a) Consider the sphere of radius 4 centered at the origin, and the sphere of radius
3 centered at the point (0,5,0). Prove that the normal directions to these spheres at
their points of intersection are orthogonal. Give a sketch.
(b) Generalize this result.
230 Problem Set 3
B8. An engineer wishes to build a railroad up a mountain that has the shape of an
elliptic paraboloid z = c ax
2
by
2
, where a, b, c are positive constants. At the point
(1,1), in what directions may the track be laid so that it will be climbing with a slope
of 0.03 (i.e. a vertical rise of 0.03m for each horizontal metre)? Make a sketch showing
a few level curves, the gradient z at (1,1), and the two possible directions for the
track. Work out the details using a =
3b, b = 0.015.
B9. Let f(x, y) = (xy)
1/3
, p(t) = t, q(t) = t
2
, and consider the composite function
H : R R dened by H(t) = f(p(t), q(t)). Show that the chain rule for H(t) is not
satised at t = 0. What conclusion can you draw about f at (0, 0)?
B10. Let f : R
3
R and g : R
3
R have continuous partial derivatives. Prove that
(fg) = fg + gf.
B11. (a) Let F(t) = f(a +th, b +tk), where f : R
2
R has continuous second partial
derivatives, and a, b, h, k are constants. Show that
F
(t) = h
2
f
11
+ 2hkf
12
+ k
2
f
22
,
where f
11
, f
12
and f
22
are evaluated at (a + th, b + tk).
(b) Can you generalize (a) to give a formula for F
(t)?
B12. Functions f : R
3
R which satisfy Laplaces equation f
xx
+ f
yy
+f
zz
= 0 are of
interest in theoretical physics.
a) Suppose that g : R R has a continuous second derivative, and f(x, y, z) =
g
_
1
r
_
, where r =
_
x
2
+ y
2
+ z
2
> 0. Show that
f
xx
+ f
yy
+ f
zz
=
1
r
4
g
_
1
r
_
, forr > 0.
b) Give a function f, other than a linear function, which satises Laplaces equation.
Section C
C1. Let f : R
3
R be dierentiable and satisfy f(tx) = t
p
f(x), for all x R
3
and
t R, where p is constant. Prove that
x f(x) = pf(x) for all x R
3
.
Problem Sets 231
Problem Set 4
Taylor Polynomials, Taylors Theorem
Section A
A1. Let f(x, y) = e
3x2y
.
a) Calculate the gradient vector and the Hessian matrix of f at a = (2, 3).
b) Hence write down the linear approximation L
a
(x, y) and the Taylor polynomial
P
2,a
(x, y) of f at a = (2, 3).
c) Show that the gradient vector of f has the same direction at each point. What
conclusion can you draw about the level curves of f?
A2. Find the Taylor polynomial P
2,a
(x) for each function.
(i) f(x) = ln(x + e
y
), a = (1, 0) (ii) f(x) = xe
xy
, a = (1, 1).
A3. Let f(x, y) = (x y) sin(x + y). Find the Taylor polynomial P
2,a
(x, y) of f at
(, ).
A4. (a) Use the second degree Taylor polynomial to derive the approximation (1+x)
y
1 + xy, for (x, y) suciently close to (0,0).

(b) Test the accuracy of the approximation in (a) with your calculator by making a
table of values (3 cases). Give the percentage error in the approximations.
A5. Use the second degree Taylor polynomial to derive the approximation ln(sin
2
x +
cos
2
y) x
2
y
2
, for (x, y) suciently close to (0,0).
A6. Let u = f(xcos +y sin , xsin +y cos ), constant. Express u
xx
+u
yy
, u
xx
u
yy
and u
xy
in terms of the second partial derivatives of f. Use double angle trigonometric
identities to simplify.
232 Problem Set 4
Section B
B1. Find a function f : R
2
R such that Hf(x) =
_
1 2
2 3
_
for all x R
2
, f(1, 0) =
(2, 5) and f(1, 0) = 7. Is there more than one such f?
B2. Consider the approximation ln(x + 2y) (x 3) + 2(y + 1), for (x, y) suciently
close to (3, 1). Prove that if x 3 and y 1, the error satises
|error|
7
2
_
(x 3)
2
+ (y + 1)
2
.
B3. Suppose that f : R
2
R has continuous second partial derivatives which satisfy
|f
xx
| M, |f
xy
| M, |f
yy
| M
for all (x, y) N = {(x, y) | (x a)
2
+ (y b)
2
r
2
}, where M is a constant. Let
L
a
(x, y) be the linear approximation of f at a = (a, b). Prove that
|f(x, y) L
a
(x, y)| M[(x a)
2
+ (y b)
2
],
for all (x, y) N. This gives an upper bound for the error in the linear approximation.
B4. Consider f : R
2
R dened by f(x, y) = 2x
2
+ 3y
2
, and let a R
2
be arbitrary.
Prove that f(x, y) L
a
(x, y), for all (x, y) R
2
.
Comment: Since z = L
a
(x, y) is the equation of the tangent plane to the surface
z = f(x, y) at a, this shows that the surface lies above each of its tangent planes.
Section C
C1. Suppose that f : R
2
R has continuous second partial derivatives on the rectangle
a x b, c y d. Use Taylors formula to prove that
d
dx
_
d
c
f(x, y)dy =
_
d
c
f(x, y)
x
dy,
for all x which satisfy a < x < b.
Hint: Let g(x) =
_
d
c
f(x, y) dy, and use the denition of the derivative to calculate
g
(x).
Problem Sets 233
Problem Set 5
Critical Points, Extreme Value Problems
Section A
A1. Find and classify the critical points of the function f, where
(i) f(x, y) = xy
2
x
2
y xy + x
2
(ii) f(x, y) = xye
x+2y
(iii) f(x, y) = (x
2
+ y
2
1)y (iv) f(x, y) = xsin(x + y)
A2. Find the maximum and minimum values of the function f on the square 0 x 1,
0 y 1, where
f(x, y) = xy x
3
y
2
.
A3. Find the maximum and minimum values of f(x, y) = x+2y on the disc x
2
+y
2
4.
A4. Find the maximum and minimum values of the function f : R
2
R dened by
f(x, y) = xye
1
2
x
1
3
y
on the triangular set with vertices (0, 0), (2, 0) and (0, 3).
A5. The steady-state temperature at position (x, y) of a metal disc, x
2
+y
2
b
2
, where
b is a positive constant, is given by f(x, y) = 100 + x
3
3xy
2
. Find the hottest and
coldest points on the disc.
A6. a) Use Lagrange multipliers to nd the greatest and least distance of the curve
6x
2
+ 4xy + 3y
2
= 14
from the origin.
b) Illustrate the result graphically by drawing the constraint curve g(x, y) = 0, the
level curves f(x, y) = C, and the gradient vectors f and g. Clearly indicate the
relation between the level curves of f and the constraint curve at the maximum and
minimum.
234 Problem Set 5
A7. In a Cartesian coordinate system in which the earth is located at (x, y, z) = (0, 0, 0),
the path of a comet is given by
3x
2
+ 8xy 3y
2
= 5
3
, z = 0, x > 0.
Find the distance of closest approach to the centre of the earth. Units are in km10
5
.
Illustrate your answer with a sketch.
Suggestion: In order to avoid messy square roots, use the method of Lagrange mul-
tipliers.
A8. Use Lagrange multipliers to nd the maximum value of x + y + z on the ellipsoid
x
2
+
1
4
y
2
+
1
9
z
2
= 1. Discuss briey a geometrical interpretation.
A9. Use Lagrange multipliers to solve A3 and A5.
A10. Find the greatest and least distance of the surface
6x
2
+ 4xy + 3y
2
+ 14z
2
= 14
from the origin.
Section B
B1. An open irrigation channel is to be made in symmetric
form with 3 straight sides, as drawn. If the sum of
the lengths of the sides of the cross-section equals L
(given), nd the channel design which will permit the
maximum possible ow.
Comment: You should formulate the problem mathematically in the form: nd the
maximum value of a function on a closed and bounded subset of R
2
. Keep in mind
that the maximum could occur on the boundary of the subset.
B2. Consider all pentagons which have a line of symmetry, two adjacent interior angles
of 90
0
, and a perimeter of xed length L. Find the shape that enclosesthe largest area.
B3. Find the maximum and minimum value of f(x, y) = (x + 1)
2
+ y
2
on the part of
the graph of y
2
x
3
= 0 from (1, 1) to (1, 1).
Problem Sets 235
B4. Prove that x
4
+ y
4
4b
2
xy 2b
4
for all x, y R.
B5. Consider the function f : R
2
R dened by f(x, y) = (x
2
+y
2
+k)e
x
2
y
2
where
k is a constant. The properties of f depend in a signicant way on k. Analyse the
function as regards local and global maxima and minima. Sketch/describe the surface
z = f(x, y). How many qualitatively dierent cases are there?
B6. In each case invent a non-constant dierentiable function f : R
2
R with the
stated property. Classify the critical points of f, sketch the level curves, and describe
(i) All points on the line y = 2x are critical points of f.
(ii) All points on the circle x
2
+ y
2
= 1 are critical points of f.
B7. Consider a set of points (x
i
, y
i
), i = 1, 2, . . . , n, which are close to lying on a
straight line y = mx+b. In order to nd the straight line which best ts the points,
we minimize the sum of the squares of the errors:
E(m, b) =
n
i=1
_
y
i
(mx
i
+ b)
_
2
In other words, we nd the minimum value of E(m, b),
for all values of the slope m and intercept b, i.e. for all
(m, b) R
2
.
Apply this method to nd the straight line which best ts the points (0, 1), (2, 3), (3, 6)
and (4, 8). Illustrate the result with a sketch.
Suggestion: Do not expand and simplify E(m, b) before calculating the partial deriva-
tives.
Section C
C1. Suppose that a function f : R
2
R has exactly one critical point which is a local
minimum. Does f have a minimum on R
2
? Discuss with reference to the function
f
1
(x, y) = x
2
+ y
2
(1 x)
3
and f
2
(x, y) = x
2
+ y
2
.
236 Problem Set 5
C2. (a) Use the method of Lagrange multipliers to prove that if
x
2
1
+ x
2
2
+ x
2
3
= 1
then
x
2
1
x
2
2
x
2
3

1
3
3
(b) Hence prove that for all positive real numbers a
1
, a
2
and a
3
,
(a
1
a
2
a
3
)
1
3
a
1
+ a
2
+ a
3
3
(c) Generalize a. and b. to deduce the arithmetic-geometric mean inequality:
(a
1
a
2
a
n
)
1
n
a
1
+ a
2
+ + a
n
n
for all positive real numbers a
1
, a
2
, . . . , a
n
and any positive integer n.
Problem Sets 237
Problem Set 6
Polar, Cylindrical, and Spherical Coordinates
Section A
A1. Convert the following points from Cartesian coordinates to polar coordinates with
0 < 2.
a) (2, 2) b) (
3, 1) c) (1,
3) d) (2, 1)
A2. Convert the following points from polar coordinates to Cartesian coordinates.
a) (2, /3) b) (3, 5/6) c) (3, 2/3) d) (2, /6)
A3. For each of the indicated regions in polar coordinates, sketch the region and nd
the area.
(a) The region enclosed by r = sin . (b) The region enclosed by r = cos 2.
Section B
B1. For each of the indicated regions in polar coordinates, sketch the region and nd
the area.
(a) Inside both r = 1 + 1 sin and r = 1 1 sin .
(b) Inside r = sin and outside r = sin 2.
B2. Convert the following equations in Cartesian coordinates to cylindrical coordinates.
(a) z =
_
2x
2
+ 2y
2
. (b) x = y. (c) z
2
= x
2
y
2
.
B3. Convert the following equations in Cartesian coordinates to spherical coordinates.
(a) x
2
+ y
2
= 4.
(b) x
2
+ y
2
+ z
2
= 2x.
(c) z =
_
x
2
+ y
2
.
(d) z
2
= x
2
y
2
.
238 Problem Set 7
Problem Set 7
Maps, Jacobians, Chain Rule in Matrix Form
Section A
A1. Consider the following maps T : R
2
R
2
. Find the image under T of the square
{D = (x, y) | 1 x 2, 2 y 3}.
(i) T(x, y) = (2x + 3y, x y) (ii) T(x, y) = (xy, x
2
y
2
)
(iii) T(x, y) = (xcos
1
3
y, xsin
1
3
y) (iv) T(x, y) = (e
x+y
, e
xy
)
A2. Find the image of the ring dened by 4 x
2
+y
2
16 under the map F : R
2
R
2
dened by F(x, y) =
_
x
x
2
+ y
2
,
y
x
2
+ y
2
_
.
A3. Consider the map F : R
2
R
2
dened by F(x, y) =
_
_
x
2
+ y
2
,
x
_
x
2
+ y
2
_
. Use
the linear approximation in matrix form to nd the approximate image of the point
(3.1, 3.9) under F.
A4. Consider the maps F : R
2
R
2
and G : R
2
R
2
dened by
F(u, v) = (e
u+v
, e
uv
), G(x, y) = (xy, x
2
y
2
)
a) Calculate the composite map F G and the derivative matrix D(F G)(1, 1).
b) Verify your answer for D(F G)(1, 1) by using the Chain Rule in matrix form.
c) Calculate D(G F)(1, 1).
A5. Consider the maps F : R
2
R
2
and G : R
2
R
2
dened by
F(u, v) = (v + u
2
, u), G(x, y) = (e
x
y, 2e
x
y).
State the Chain Rule in matrix form, and use it to calculate the derivative D(F
G)(0, 1) of the composite map.
Problem Sets 239
A6. Consider the map F : R
2
R
2
dened by
(u, v) = F(x, y) = (y + e
x
, y e
x
).
a) Show that F has an inverse map by nding F
1
explicitly.
b) Find the derivative matrices DF(x, y) and DF
1
(u, v) and verify that DF(x, y)DF
1
(u, v) =
I, where I is the identity matrix.
c) Verify that the Jacobians satisfy
(x, y)
(u, v)
=
_
(u, v)
(x, y)
_
1
.
A7. Calculate the Jacobian
(u, v)
(x, y)
for the following maps T. Find all points at which
the Jacobian is zero. Use the Inverse Map Theorem to prove that T
1
exists in a
neighbourhood of the indicated point:
(i) (u, v) = T(x, y) = (cos(x + y), sin(x y));
_
4
,

4
_
(ii) (u, v) = T(x, y) = (x + y, 2xy
2
) ; (0, 1).
A8. Calculate the approximate area of the image of a small rectangle of area xy
located at the point (a, b) under the map T : R
2
R
2
dened by
(i) T(x, y) = (xy, x
2
y
2
), (a, b) =
_
1,
1
2
_
(ii) T(x, y) =
_
_
x
2
+ y
2
,
x
_
x
2
+ y
2
_
, (a, b) = (1, 1)
Section B
B1. Invent a function F : R
2
R
2
that maps the parallelogram bounded by the lines
y = 3x4, y = 3x, y =
1
2
x and y =
1
2
(x+4) onto the unit square in the rst quadrant.
B2. Invent a transformation F : R
2
R
2
that maps the ellipse x
2
+ 4xy + 5y
2
= 4
onto the unit circle.
B3. Invent a transformation F : R
3
R
3
that maps the ellipsoid x
2
+ 8y
2
+ 6z
2
+
4xy 2xz + 4yz = 9 onto the unit sphere.
240 Problem Set 7
B4. (a) Let u = F(x) be a map of the xy-plane into the uv-plane. Consider a smooth
curve x = x(t) in the xy-plane. Suppose that F maps this curve into the curve u = u(t)
in the uv-plane. Show that the tangent vectors are related by the derivative matrix
according to
u
(t) = DF(x(t))x
(t).
(b) Consider the map u = (xy, x
2
y
2
). Find the image of the curve x = (t, t
2
), t 0
under this map, and sketch both curves. Calculate the tangent vectors to the curves,
and verify the formula that you derived in part (a).
B5. Consider the map F : R
2
R
2
dened by
(u, v) = F(x, y) = (x + ky
2
, y),
where k is a non-negative constant.
a) Find the image of the family of lines x = constant under the map. Illustrate
with a sketch. What happens when k is close to zero, and when k is very large?
Estimate the area of the image of a small rectangle of area xy.
b) Find and sketch the image of the disc x
2
+y
2
1 under the map. How does the
value of k aect the image? Make a conjecture about the area of the image.
Problem Sets 241
Problem Set 8
Double Integrals
Section A
A1. Show that
__
D
(ax + by) dx dy =
1
3
(a + b), where D is the region in the rst
quadrant bounded by the circle x
2
+ y
2
= 1 and the lines x = 0, y = 0; a, b are
constants.
A2. Evaluate
__
D
sin(x + y) dx dy, where D is the triangular region with vertices
(0, 0), (, 0) and
_
2
,

2
_
.
A3. Evaluate
__
D
e
y
2
dx dy, where D is the triangular region with vertices (0, 0),
(0, 1) and (1, 1).
A4. For the following iterated integrals sketch the region of integration, and evaluate
the integrals by reversing the order of integration:
(i)
1
_
0
_
_
1
_
x=y
sin(x
2
) dx
_
_
dy (ii)
_
1
0
_
_

x
y=x
sin y
y
dy
_
dx.
A5. Prove that
__
D
sin
2
(x + y) dA
__
D
sin(x + y) dA where D = {(x, y) | 0
x + y and 0 y }.
A6. Let V denote the volume of the tetrahedron with vertices (a, 0, 0), (0, b, 0), (0, 0, c)
and (0, 0, 0), with a, b, c > 0. Show that V =
1
6
abc.
242 Problem Set 8
A7. Let D be the quarter disc in the rst quadrant dened by x
2
+ y
2
1. Use the
inequality
x
1
6
x
3
sin x x, forx 0
to show that
14
45

__
D
sin x dA
15
45
Note: You will not succeed in evaluating this integral exactly.
A8. Let D be the unit square 0 x 1, b y b + 1. Show that
__
D
x
y
dA = ln
_
b + 2
b + 1
_
.
A9. The temperature at points of the disc x
2
+ y
2
b
2
is given by
f(x, y) = 100 + x
3
3xy
2
Find the average temperature. At what points of the disc does the temperature
equal the average temperature? Give a sketch.
A10. Use the map T(x, y) = (x + y, x + y) to evaluate
_
0
y
_
0
(x + y) cos(x y) dx dy.
A11. Let D be the unit disc x
2
+ y
2
1. Use polar coordinates to show that
__
D
e
x
2
+y
2
dA = (e 1).
A12. Evaluate
__
D
x
_
x
2
+ y
2
dA, where D is the region inside the circle x
2
+y
2
= 2x,
but outside the circle x
2
+y
2
= 1. Use polar coordinates and describe the image
of D.
Problem Sets 243
A13. Let D be the region in the xy-plane enclosed by the lines y = 2x, y = 4x, y =
x and y = 0. Evaluate the Jacobian of the map (x, y) = F(u, v) = (u+uv, uuv),
and show that it is never zero on D. Sketch the image of D in the uv-plane. Use
this map to evaluate
__
D
e
xy
x+y
x + y
dx dy.
Section B
B1. Evaluate the iterated integral
e
_
1
_
_
ln x
_
y=0
y
x
dy
_
_
dx.
B2. Evaluate
__
D
e
|x+y|
dA, where D = {(x, y) | |x| 1, |y| 1}.
B3. A function f : R
2
R is dened by
f(x, y) =
_
_
_
1, if x + y 1
1, if x + y < 1.
Evaluate
__
D
f(x, y) dA, where D is the subset of R
2
dened by |x| +|y| 2.
B4. Evaluate (i)
__
D
xy dA, (ii)
__
D
sin x dA, where D is the unit disc centered at
the origin. Hint: Dont do much work.
B5. A metal plate, bounded by x
2
y
2
= 1, x
2
+ 3y
2
= 1, x = 0 and y = 0, and
lying in the rst quadrant, is coated with silver. The density of silver at position
(x, y) on the plate is given by (x, y) = xy grams per unit area. Calculate the
total amount of silver on the plate.
B6. Find a linear transformation that maps the ellipse x
2
+4xy +5y
2
= 4 onto a unit
circle. Hence show that the area enclosed by the ellipse equals 4 square units
(without explicitly integrating).
244 Problem Set 8
B7. Let D be the subset of R
2
dened by |x| + |y| 1, and let f : R R be
continuous on the interval [1, 1]. Prove that
__
D
f(x + y) dx dy =
1
_
1
f(u) du.
B8. Let D be the disc of radius b centred at the origin, and let f : R R be
continuous. Prove that
__
D
f(x
2
+ y
2
) dA =
_
b
2
0
f(u) du.
Section C
C1. Let f : R R be a continuous function. Prove that
2
_
b
a
_
x
b
f(x)f(y) dy dx =
__
b
a
f(x) dx
_
2
.
C2. a) Show that
__
D(R)
e
(x
2
+y
2
)
dx dy = (1 e
R
2
) , where D(R) is the disc of
radius R, centre (0, 0).
b) Let D be the square {(x, y) | |x| b, |y| b}. Show that
__
D
e
(x
2
+y
2
)
dx dy = 4
_
_
b
_
0
e
x
2
dx
_
_
2
.
c) Hence, prove that
_
0
e
x
2
dx =
2
,
a result which is important in probability theory and in other applications.
This integral cannot be evaluated directly, by nding an antiderivative of
e
x
2
.
Problem Sets 245
Problem Set 9
Triple Integrals
Section A
A1. Write
___
D
f(x, y, z) dV as an iterated integral, for each 3-d region D.
(i) D is the rectangular box dened by |x 1| 2, |y| 3, |z + 1| 1.
(ii) D is the cylindrical solid dened by
x
2
a
2
+
y
2
b
2
1, |z 2| 1.
(iii) D is the tetrahedron with vertices (a, 0, 0), (0, b, 0), (0, 0, c) and (0, 0, c).
(iv) D is the ice-cream cone bounded by x
2
+ y
2
=
1
4
z
2
, z 0 and the
hemisphere dened by x
2
+ y
2
+ z
2
= 25, z > 0.
(v) D is the solid bounded by the paraboloid y = 1x
2
z
2
, and the hemisphere
dened by x
2
+ y
2
+ z
2
= 3, y < 0 in D, y < 1 x
2
z
2
).
A2. Consider the triple integral
___
D
e
x
dV , where D is the 3-d region bounded by
the planes x = 0, y = 0, z = 0 and x+y +z = 1. Write it as an iterated integral
in the order z, y, x. Notice that the order x, z, y will give a simpler integration.
Evaluate the integral using this order.
A3. Describe the 3-d region of integration for the iterated integral:
_
1
y=0
_
1y
x=y1
_
(1y)
2
x
2
z=
(1y)
2
x
2
f(x, y, z) dz dx dy,
and nd the limits when the order of integration is y, x, z.
A4. The temperature at points in the cube
C = {(x, y, z) | |x| 1, |y| 1, |z| 1}
is 100r
2
, where r is the distance to the origin. Find the average temperature. At
what points of the cube does the temperature equal the average temperature?
246 Problem Set 9
A5. Determine the volume bounded by the cone z = 2
_
x
2
+ y
2
and the paraboloid
z = 1 8(x
2
+ y
2
).
A6. Use spherical coordinates to evaluate
___
D
(x
2
+y
2
+z
2
)
3/2
dV, where D is the
solid bounded by the spheres x
2
+ y
2
+ z
2
= a
2
and x
2
+ y
2
+ z
2
= b
2
, with
0 < b < a.
A7. Evaluate the following triple integrals by transforming to spherical coordinates:
(i)
_
1
0
_

1x
2
0
_
2x
2
y
2
x
2
+y
2
dz dy dx (ii)
_
1
0
_
1y
2
0
_

3
3x
2
+3y
2
dz dx dy
A8. Calculate the volume enclosed by the cone z
2
= x
2
+y
2
and the plane z = h > 0,
rst using cylindrical coordinates, and then using spherical coordinates.
Section B
B1. Find the mass inside the sphere x
2
+y
2
+z
2
= 1, if the density is proportional to
(i) the distance from the z-axis (ii) the distance from the xy-plane. Think about
both spherical and cylindrical coordinates, and use whichever is simpler.
B2. Show that the volume V which lies inside the sphere x
2
+y
2
+(z a)
2
= a
2
and
outside the sphere x
2
+ y
2
+ z
2
= 4k
2
a
2
, where k is a constant, 0 < k < 1, is
given by
V =
4
3
a
3
(1 4k
3
+ 3k
4
).
B3. Let V denote the volume of the rst octant region bounded by the coordinate
planes and the parabolic cylinders
a
2
y = b(a
2
x
2
), a
2
z = c(a
2
x
2
), a, b, c > 0.
Show that V = 8abc/15.
Problem Sets 247
B4. Let V denote the volume of the ellipsoid
x
2
a
2
+
y
2
b
2
+
z
2
c
2
= 1. Find V by perform-
ing a transformation, but without integrating.
B5. A glacier which occupies the region
10
2
x
2
< z < 0, moves parallel to the
y-axis with velocity in km/year given by
v(x, z) = 10
3
[1 10
2
(x
2
+ z
2
)].
Find the volume of ice V moved through the xz-plane in a year (distances are in
kilometers).
B6. Evaluate
___
Dxyz
e
xy+z
dV , where D
xyz
is the parallelopiped bounded by the planes
x y + z = 2, x y + z = 3, x + 2y = 1, x + 2y = 1, x z = 0, x z = 2.
B7. Consider the region D in the rst quadrant enclosed by the four curves
ay = x
3
, by = x
3
, cx = y
3
, dx = y
3
,
where a, b, c, d are constants which satisfy 0 < a < b, 0 < c < d. Show that the
area of D equals
1
2
_
a
__
c
_
.
B8. a) The density of a spherical star of radius b depends on the distance r =
_
x
2
+ y
2
+ z
2
from the centre according to = f(r), where f : R R is
a positive continuous function. Write the mass M of the star as a triple
integral. Then show that
M = 4
_
b
0
r
2
f(r) dr.
b) The density of a spherical star of radius b is proportional to
b
3
b
3
+ r
3
, where
r is the distance to the centre. At what points does the density equal the
average density?
B9. A spherical star of radius b has a core of radius
1
2
b with constant density
0
kg/m
3
.
The density of the outer shell is proportional to
1
r
, where r is the distance to the
centre. If the density is a continuous function of r, for 0 r b, nd the total
mass of the star.
248 Problem Set 9
Section C
C1. The tetrahedron with vertices (0, 0, 0), (a, 0, 0), (0, b, 0) and (0, 0, c) is to be sliced
into n pieces of equal volume by planes parallel to the inclined face. Where
should the slices be made?
C2. Calculate the average distance of the point (0,0,c), where c 1, from the set of
all points in the solid sphere x
2
+ y
2
+ z
2
1.
C3. A 3-sphere of radius b in R
4
is dened by the equation
x
2
1
+ x
2
2
+ x
2
3
+ x
2
4
= b
2
Find the volume enclosed by a 3-sphere of radius b.

237 Course Notes

Enviado por

Dados do documento

Direitos autorais

Formatos disponíveis

Compartilhar este documento

Compartilhar ou incorporar documento

Opções de compartilhamento

Você considera este documento útil?

Este conteúdo é inapropriado?

Direitos autorais:

Formatos disponíveis

237 Course Notes

Enviado por

Direitos autorais:

Formatos disponíveis

Calculus 3

Course Notes for MATH 237

Complete the proof of the theorem, by proving that fg is continuous at a. EXERCISE 3

Use the linear approximation to approximate

(a) and the increment

(a)(x a), (5.2)

(a) exists, then lim

(a) exists then f is continuous at a. We now show that a

(t) is dened on the closed interval

(x(t)) is the derivative of the function f : R R evaluated at x(t). It is

(0) = 1. Further show that f

(t) are both

(1). What condition on g will guarantee the validity of your

(1) = 3 and f(3, 1) = (2, 3), nd g

since x, y, r are held constant

determines whether the graph of f is

can be used to increase the accuracy of the linear approximation by

exists on [a, x], then there exists a number c between a and

(a)(x a), (8.4)

(x)| B, for all x [a , a + ].

(x), you can nd a value for B.

is evaluated at c instead of at a. Knowing the form of P

are continuous and are given by

is continuous on the interval [0, 1], Taylors formula may be applied to g

(a) does not exist. Thus it follows that either

(a) = 0, and the

(a) > 0, which implies a local minimum, is replaced by the

(x) > 0 for all x and f is convex is f

(x) 0 for all x. Thus the term convex

(x)(x a) for all x = a, for any a R.

(c) for some c (a, b). Note that g

(x) > 0 then g

(x) is strictly increasing. Since g

(x) < 0 on [a, c) and g

where c is on the line segment from a to x. Since f

are continuous and are given by

(t) > 0 by Theorem 3. Thus, by Theorem 4, part (2):

(x) = 2 > 0 for all x, so f(x) is strictly convex. EXAMPLE 7

b b| and | c c| are suciently Lemma 1

b b| and | c c| are suciently small. Let

(x) = 0. This gives x =

2 and tan = 1. Since

3) in Cartesian coordinates into polar coordinates with 0 EXAMPLE 3

3. Since is in the second quadrant we get =

3. Hence the point is

Thus as we let the number of subdivisions go to innity

3, 1) from Cartesian coordinates to cylindrical coordinates. EXAMPLE 12

2, tan = 1 which gives =

3) and (1, 1, 1) from Cartesian coordinates to spherical coor- EXAMPLE 16

, and use the linear approximation

(x, y) and top

H( sin cos , sin sin , cos )

(x) in terms of the partial derivatives

(1). Assume that f has continuous partial derivatives.

(0) does not exist.

(x) exists, and

sin x + tany, (a, b) =

(1) = 2. Assume that g is dierentiable at (2, 0).

(t), where x(t) = (x(t), y(t), z(t)).

5, and occurs in the

6, 0) and the minimum value is -1

sin 3x + 4 tany, a = (0,

(0). What hypotheses must f satisfy?

(1). What hypothesis must f satisfy?