
VECTORS in 2D and 3D

Calculus for functions of two (or more) variables relies heavily on what you already know about the
calculus of functions of one variable. But a few preliminary ideas about vectors and various coordinate
systems in two and three dimensions need to be developed before those single variable ideas can be exploited. Once
that's done, then we can get back to calculus!
Let's start with vectors in the plane - you may have met them already, and you'll certainly make good use of them
in a number of your other courses!
What is a vector?: a quantity, whether it's geometric, scientific or whatever, is a vector so long as it has both a magnitude (or length) and a direction. For instance, velocity can be described by a vector because it has a magnitude, namely speed, as well as a direction: the wind blows at a speed of 5 mph from the north-west, Joe heads due north at 75 mph in his car, and so on.

Displacements provide a different type of example: let's look at where Bob lives in relation to Alice. If Alice's house is at point A, then Bob's house at point B is 223 ft. from A in a direction ENE. If we represent this as an arrow from A to B, it determines a vector AB→ called a displacement vector with magnitude the distance from A to B, and direction the direction from A to B. It's natural to represent this vector by an arrow with A the tail and B the head.

In general, we'll usually label vectors by single bold-faced letters like a, v, ..., and so on. Beware: physicists and engineers sometimes use different notation. The length of a vector v is denoted by v or by |v|; this length is a positive number except for the zero vector 0, which has length 0. Of course, not all quantities can be represented as vectors: for instance, mass, temperature and distance have magnitude, but no direction. Such directionless quantities are real numbers and will be referred to as scalars.
Just as for numbers, there are many concepts, algebraic operations and associated properties for vectors.
1. Vectors u, v shown to the right are said to be equivalent (or equal) because they have the same length and direction even though they are not in the same position.

2. The scalar multiple kv of a scalar k and vector v is the vector whose length is |k||v|; if k > 0, then kv and v have the same direction, while if k < 0, then kv and v point in opposite directions. The examples of 2u and −2u to the right show that scalar multiplication really does act as a scaling of a vector.

Vectors are said to be parallel when they are scalar multiples of each other. So all the vectors u, v, 2u and −2u above are parallel. They surely look parallel!

Although AB→ is the displacement vector from Alice's house to Bob's, you can't actually drive across blocks to get there! You might try first going 200 ft. east and then 100 ft. north. In vector terms,

AB→ = AC→ + CB→,

which is just a mathematical way of saying that by adding vectors as indicated, the lengths and directions will get you from A to B.

Mathematics doesn't always come in neat rectangular blocks, however! To describe addition of vectors in general, triangles and parallelograms are needed. So let u, v be the red and dark blue vectors shown below. The sum u + v is then the green vector as shown, computed by first noting that the dark blue vector is equal to the light blue vector and then applying the respective Triangle and Parallelogram Laws.
Subtracting v is defined by adding −v; thus the difference u − v is the sum u + (−v), as shown by: create −v and move it, then apply the Triangle Law.

To see addition and subtraction of vectors in 'action', check out the interactive version of the figure to the right. Select the a + b vector. Use the sliders to create several different examples.

(i) Was the triangle or parallelogram law used to create a + b? Why was the vector b moved?

(ii) Next create b + a. Does b + a = a + b? Why?

(iii) Then do the same for a − b, b − a. Does b − a = a − b? Why?
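The answers to (ii) and (iii) are easy to confirm numerically; here is a minimal Python sketch with NumPy, where the sample vectors a and b are arbitrary choices of our own:

```python
import numpy as np

# Two sample vectors in the plane (arbitrary choices for illustration).
a = np.array([3.0, 1.0])
b = np.array([-1.0, 2.0])

# Addition is commutative: b + a gives the same vector as a + b.
sum_ab = a + b
sum_ba = b + a

# Subtraction is not: b - a is the negative of a - b.
diff_ab = a - b
diff_ba = b - a
```

Component-wise this is clear: addition of real numbers commutes in each slot, while subtraction reverses sign.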
Here's a worked example to illustrate subtraction and scalar multiplication.

Example 1: when displacement vectors u = AB→, v = AP→ are specified by the parallelogram to the right, express CR→ in terms of u and v.

Solution: By the Parallelogram Law,

CR→ = CB→ + CS→.

But

CB→ = −BC→ = −AB→ = −u,

while

CS→ = AQ→ = 2 AP→ = 2v.

So then

CR→ = 2v − u.

The following interactive provides a set of examples for you to work on (some involve a bit of geometry and trig!).
Coordinates, Unit vectors: a very standard way of describing vectors is via the basis vectors i, j. Instead of saying

    go East 200 feet and then North 100 feet

to get from Alice's house to Bob's, you could say

    it's 2 blocks East and 1 block North.

Here we're using 'blocks' as a convenient unit. More mathematically, we could introduce the 'unit' vectors i, j and write

AB→ = 2i + j.

This still makes mathematical sense for the vector v = ai + bj for any pair of numbers a and b, positive or negative, because it means that we go a blocks in the East direction and b in the North direction (negative simply means we go backwards). The vector v = ai + bj is often denoted by ⟨a, b⟩, and the values a, b are called the components of v.

The notation v = ai + bj = ⟨a, b⟩ for vectors is very convenient. For example, addition, subtraction and scalar multiplication are very simple: they are done component-wise:

u + v = ⟨a₁, b₁⟩ + ⟨a₂, b₂⟩ = ⟨a₁ + a₂, b₁ + b₂⟩,    λw = λ⟨c, d⟩ = ⟨λc, λd⟩,

while by Pythagoras' theorem,

|v| = |ai + bj| = |⟨a, b⟩| = √(a² + b²).

Thus |i| = |⟨1, 0⟩| = 1 = |j| = |⟨0, 1⟩|, so i, j are 'unit vectors' because they have unit length.

Example 2: determine a, b so that u = ⟨1, 3⟩ is the sum u = av + bw of the vectors v = ⟨1, 2⟩, w = ⟨1, 1⟩.

Solution: as addition and scalar multiplication of vectors are done component-wise,

u = ⟨1, 3⟩ = a⟨1, 2⟩ + b⟨1, 1⟩ = ⟨a + b, 2a + b⟩.

Thus

a + b = 1,    2a + b = 3.

After solving for a and b we get a = 2, b = −1.
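A problem like Example 2 is a small linear system, so it can be checked numerically. Below is a sketch with NumPy: stacking v and w as the columns of a matrix M makes M applied to (a, b) exactly the combination av + bw:

```python
import numpy as np

u = np.array([1.0, 3.0])
v = np.array([1.0, 2.0])
w = np.array([1.0, 1.0])

# Columns of M are v and w, so M @ [a, b] equals a*v + b*w.
M = np.column_stack([v, w])
a_coef, b_coef = np.linalg.solve(M, u)   # solve a*v + b*w = u
```

This recovers a = 2, b = −1 as in the hand computation.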
Now that we've linked vectors with coordinate axes in the plane, it turns out to be convenient to introduce displacement vectors whose tail is always at the origin. This identifies points in the plane with vectors called Position Vectors.

The four vectors to the right are all equivalent because they have the same length and direction; in fact, they are all equivalent to 2i + j since the vector always moves 2 units in the x-direction and 1 unit in the y-direction. The red vector with tail at the origin, however, will be called a Position Vector. It identifies the point P = (2, 1) in the plane with the vector having tail at the origin and head at (2, 1). We shall often identify a point P = (a, b) with ai + bj or ⟨a, b⟩ as a position vector.

To the three coordinate axes are associated basic unit vectors i, j, and k of length 1 in the direction of the x-axis, y-axis, and z-axis respectively. Then the earlier figure relating the coordinates of a point P(a, b, c) to the coordinate axes shows that the vector ai + bj + ck is equal to the displacement vector OP→. So a vector v in 3-space can be represented by

v = ⟨a, b, c⟩ = ai + bj + ck = (a, b, c).

The values a, b, c are called the components of v. Addition, subtraction and scalar multiplication of vectors in 3-space then proceed component-wise:

u + v = ⟨a₁, b₁, c₁⟩ + ⟨a₂, b₂, c₂⟩ = ⟨a₁ + a₂, b₁ + b₂, c₁ + c₂⟩,    λw = λ⟨c, d, e⟩ = ⟨λc, λd, λe⟩.

Notice also that

i = ⟨1, 0, 0⟩,  j = ⟨0, 1, 0⟩,  k = ⟨0, 0, 1⟩,    |ai + bj + ck| = √(a² + b² + c²).

By introducing coordinates into vectors, many algebraic, geometric and function-theoretic possibilities become available - that's the whole point!!
Lines and Planes in 3-space

There are many ways of expressing the equations of lines in 2-space (you probably learned the slope-intercept and point-slope formulas, among others). Now we do the same for lines and planes in 3-space. Here vectors will be particularly convenient.

Lines: Two points determine a line in 3-space. So imagine a laser pointer (or a light saber, or something) at one of the two points, say b, and shine it towards the other point, say a. If we extend the laser pointer or light saber in both directions, we get a line.

To write this as an equation, represent the light saber as a displacement vector v shown in dark blue, with tail at b and head at a; so v = a − b. To extend v in both directions we scale the vector by writing tv, shown in lighter blue, where t is a real number. Doing this for all such t gives us the complete line. So in vector-form each point on the line is given by

r(t) = tv + b.

Isn't this like the slope-intercept form for a line in the plane?

Sometimes it's useful to express the equation for a line in coordinate-form: if we write

r(t) = ⟨x(t), y(t), z(t)⟩,    v = ⟨k, m, n⟩,    b = ⟨x₁, y₁, z₁⟩,

then the vector equation becomes

r(t) = tv + b = ⟨tk + x₁, tm + y₁, tn + z₁⟩,

giving a second equation for a line in parametric form:

x(t) = tk + x₁,    y(t) = tm + y₁,    z(t) = tn + z₁.

Solving for t in these equations (and writing x instead of x(t) and so on), we then get

t = (x − x₁)/k,    t = (y − y₁)/m,    t = (z − z₁)/n,

giving a third, symmetric-form equation

(x − x₁)/k = (y − y₁)/m = (z − z₁)/n.

Which of these three equation forms for a line in 3-space is best to use in a given situation usually depends on how things are set up, but it's often simpler to start with the vector form.
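The vector form is just arithmetic, so it translates directly into code. A minimal Python sketch (the helper names line_through and r are our own) that builds the direction vector from two points and recovers both points from the parametrization:

```python
import numpy as np

def line_through(a, b):
    """Direction v = a - b and base point b, so the line is r(t) = t*v + b."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return a - b, b

def r(t, v, b):
    """Point on the line at parameter t."""
    return t * v + b

v, b = line_through([3.0, 5.0, 2.0], [1.0, 1.0, 0.0])
p0 = r(0.0, v, b)   # t = 0 gives the point b
p1 = r(1.0, v, b)   # t = 1 gives the point a
```

The parametric and symmetric forms are then read off from the components of v and b.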
Example 1: Find parametric equations for the line passing through the point P(4, −1, 3) and parallel to the vector ⟨1, 4, −3⟩.

Solution: In vector form a line passing through a point b = ⟨4, −1, 3⟩ and having direction vector v = ⟨1, 4, −3⟩ is given by

r(t) = tv + b = ⟨4 + t, −1 + 4t, 3 − 3t⟩.

This becomes

x(t) = 4 + t,    y(t) = −1 + 4t,    z(t) = 3 − 3t

in parametric form.

Example 2: Find the point of intersection, P, of the lines

(x − 2)/4 = (y − 6)/3 = (z − 5)/1,    (x − 4)/2 = (y − 4)/5 = (z − 3)/3.

Solution: to determine where the lines intersect it's convenient to convert them to parametric form:

x = 2 + 4t,  y = 6 + 3t,  z = 5 + t;    x = 4 + 2s,  y = 4 + 5s,  z = 3 + 3s.

The lines then intersect when the equations

2 + 4t = 4 + 2s,    6 + 3t = 4 + 5s,    5 + t = 3 + 3s

are satisfied simultaneously. Solving the first two equations gives t = 1, s = 1, and a check shows that these values then satisfy the third equation. The lines thus intersect when s = t = 1, which if we substitute in the equations for x, y and z shows that P = (6, 9, 6).

Planes: Since a plane is two-dimensional, two parameters s, t are needed to describe it as we will see later, but there are other very useful ways of specifying a plane that allow us to exploit cross products in a natural way as a normal n to the plane: the point-normal form and the three points form.
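The solve-two-equations-then-check-the-third step of Example 2 can be delegated to NumPy; the sketch below mirrors the hand computation exactly:

```python
import numpy as np

# Lines: (x, y, z) = (2 + 4t, 6 + 3t, 5 + t) and (4 + 2s, 4 + 5s, 3 + 3s).
# Equating the first two coordinates: 4t - 2s = 2 and 3t - 5s = -2.
A = np.array([[4.0, -2.0],
              [3.0, -5.0]])
rhs = np.array([2.0, -2.0])
t, s = np.linalg.solve(A, rhs)

# Verify the third equation, then compute the intersection point.
third_ok = abs((5 + t) - (3 + 3 * s)) < 1e-9
P = np.array([2 + 4 * t, 6 + 3 * t, 5 + t])
```

If the third equation failed the check, the lines would be skew and there would be no intersection at all, which is the typical situation for two random lines in 3-space.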
To see how these apply, think of the plane as an airplane that can move freely in space. For the left-hand airplane, pick a point Q on the fuselage, and specify the wing-tips R, S. These three points completely determine the position of the airplane. A 3-legged stool works on the same principle: it is perfectly stable on any 'plane' floor. But for the right-hand airplane, pick Q at the tail end of the airplane, and let P be a point on its fuselage. The body of the airplane is then the displacement vector QP→. This doesn't specify precisely where the airplane is, however, because a fighter plane can roll around the axis QP→ of its fuselage. To specify position completely we need to specify the orientation of the airplane by, say, its vertical tail. In mathematical terms this is the normal n to the plane, i.e., the vector perpendicular to the plane as shown above. If you know what a Rolodex is, the principle is the same!

Time to bring in vectors as well as dot and cross products to convert these ideas into equations for a plane in 3-space.

Let Q(a, b, c) be a fixed point in the plane, P(x, y, z) an arbitrary point in the plane, and n = ⟨A, B, C⟩ the normal to the plane. If b = ⟨a, b, c⟩ and r = ⟨x, y, z⟩, the vector

QP→ = r − b = ⟨x − a, y − b, z − c⟩

lies in the plane, and is perpendicular to n. Thus n · (r − b) = 0. In terms of coordinates, this becomes

⟨A, B, C⟩ · ⟨x − a, y − b, z − c⟩ = 0,

where n = ⟨A, B, C⟩. In other words, we get the point-normal equation

A(x − a) + B(y − b) + C(z − c) = 0

for a plane.
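In code, the point-normal equation is a single dot product. A sketch (the helper name plane_coefficients is our own, and the sample point and normal are arbitrary choices):

```python
import numpy as np

def plane_coefficients(Q, n):
    """Given a point Q on the plane and a normal n = <A, B, C>,
    return (A, B, C, D) with the plane written as Ax + By + Cz = D."""
    Q, n = np.asarray(Q, float), np.asarray(n, float)
    return (*n, np.dot(n, Q))   # D = n . Q, since n . (r - Q) = 0

# Plane through (1, 2, 3) with normal <1, 0, 2>: x + 2z = 7.
A, B, C, D = plane_coefficients([1.0, 2.0, 3.0], [1.0, 0.0, 2.0])
```

Expanding A(x − a) + B(y − b) + C(z − c) = 0 and moving the constants to the right-hand side is exactly the computation D = n · Q above.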
To emphasize the normal in describing planes, we often ignore the special fixed point Q(a, b, c) and simply write

Ax + By + Cz = D

for the equation of a plane having normal n = ⟨A, B, C⟩. The next three examples show how useful this way of writing planes can be.

Example 3: Find an equation for the plane ax + by + cz = d whose graph in the first octant is shown to the right.

Solution: The basic idea is to look at the points of intersection of the plane and the coordinate axes: now ax + by + cz = d intersects the x-axis when y = z = 0, i.e., when x = d/a. Similarly, it intersects the y-axis when y = d/b, and the z-axis when z = d/c. Thus from the given graph

d/a = 4,    d/b = 5,    d/c = 3.

Consequently,

x/4 + y/5 + z/3 = 1

is an equation for the plane.

Example 4: Find the vector equation of the line passing through the point P(2, 4, 3) and perpendicular to the plane x + 4y − 2z = 5.

Solution: when the line is perpendicular to the plane, the direction vector of the line is parallel to the normal to the plane. Since the plane is x + 4y − 2z = 5, it has normal n = ⟨1, 4, −2⟩. Thus the line has v = ⟨1, 4, −2⟩ as direction vector. But P(2, 4, 3) lies on the line, so the vector b = ⟨2, 4, 3⟩ determines a point on the line. Consequently, in vector form the equation of the line is

r(t) = tv + b = t⟨1, 4, −2⟩ + ⟨2, 4, 3⟩.
Example 5: Find an equation for the plane passing through the point Q(1, 1, 1) and parallel to the plane 2x + 3y + z = 5.

Solution: parallel planes have the same normal. So any plane parallel to 2x + 3y + z = 5 has normal n = ⟨2, 3, 1⟩. On the other hand, since Q(1, 1, 1) lies on the parallel plane, the vector b = ⟨1, 1, 1⟩ determines a point on the parallel plane.

Now let r = ⟨x, y, z⟩ be an arbitrary point on the parallel plane. Then the vector

QP→ = r − b = ⟨x − 1, y − 1, z − 1⟩

lies in the plane and so will be perpendicular to n. In this case,

n · (r − b) = ⟨2, 3, 1⟩ · ⟨x − 1, y − 1, z − 1⟩ = 2(x − 1) + 3(y − 1) + (z − 1) = 0.

Consequently, an equation for the plane is

2x + 3y + z = 6.

The fact that the cross-product a × b is perpendicular to both a and b makes it very useful when dealing with normals to planes. For example, let

Q(x₁, y₁, z₁),    R(x₂, y₂, z₂),    S(x₃, y₃, z₃)

be three points determining the plane to the right below. Then, if

b = ⟨x₁, y₁, z₁⟩,    r = ⟨x₂, y₂, z₂⟩,    s = ⟨x₃, y₃, z₃⟩,

the vectors

QR→ = r − b,    QS→ = s − b

lie in the plane. So the normal to the plane is given by the cross product

n = (r − b) × (s − b).

Once this normal has been calculated, we can then use the point-normal form to get the equation of the plane passing through Q, R and S.
In practice, it's usually easier to work out n in a given example than to try to set up some general equation for the plane.

Example 6: Find an equation for the plane passing through the points Q(1, 1, 2), R(4, 2, 2), S(2, 1, 5).

Solution: when the plane passes through Q, R and S, the vectors

QR→ = ⟨3, 1, 0⟩,    QS→ = ⟨1, 0, 3⟩

lie in the plane. Thus the cross-product

n = QR→ × QS→ = | i  j  k |
                | 3  1  0 |  = 3i − 9j − k
                | 1  0  3 |

is normal to the plane. If r = ⟨x, y, z⟩ determines an arbitrary point P in the plane, the vector

v = QP→ = ⟨x − 1, y − 1, z − 2⟩

lies in the plane and so is perpendicular to n. In this case,

n · v = 3(x − 1) − 9(y − 1) − (z − 2) = 0,

which after simplification becomes

3x − 9y − z + 8 = 0.

Consequently, the plane 3x − 9y − z = −8 passes through Q, R and S.
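Cross-product sign errors are the usual trap in computations like Example 6, so it's worth verifying with np.cross; a sketch that re-checks all three points against the resulting plane:

```python
import numpy as np

Q = np.array([1.0, 1.0, 2.0])
R = np.array([4.0, 2.0, 2.0])
S = np.array([2.0, 1.0, 5.0])

n = np.cross(R - Q, S - Q)   # normal = QR x QS
D = np.dot(n, Q)             # plane: n . r = D

# All three points must satisfy the plane equation.
on_plane = [np.dot(n, p) == D for p in (Q, R, S)]
```

This reproduces n = ⟨3, −9, −1⟩ and D = −8 from the worked solution.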
3D-COORDINATE SYSTEMS

When there's symmetry present, it's often helpful to use coordinate systems that rely on this symmetry, say when studying a torus or motion on a spiral staircase. In three dimensions two particularly useful ones are Cylindrical Polars and Spherical Polars. Because both build on polar coordinates in the plane that you've probably used already, let's begin by recalling this system briefly.

Polar Coordinates: the polar coordinates of the point P shown to the right are written (r, θ), where r is the distance of P from the origin and θ is the angle the line from the origin to P makes with the x-axis, the angle being measured as P rotates counter-clockwise starting from the positive x-axis. At the origin r = 0, but θ is not well-defined, so we usually assign to the origin the polar coordinates (0, θ) for any choice of θ.

By right-triangle trig and Pythagoras' theorem:

sin θ = y/r,    cos θ = x/r,    r² = x² + y²,    tan θ = y/x.

Thus for a point P the Cartesian and polar coordinate systems are related by

(x, y) = (r cos θ, r sin θ),    (r, θ) = (√(x² + y²), tan⁻¹(y/x)).

Notice that the choices of coordinates r and θ are not unique for a given point P, since (r, θ), (r, θ + 2π) and (−r, θ + π) all describe the same point P. But this need not cause us any trouble!

The simplest method for drawing a polar graph r = f(θ) is to write it as a relation g(x, y) = 0 between x and y, and then draw the graph of this relation, if possible. For example, when r = 2 cos θ, then

x² + y² = r² = 2r cos θ = 2x,    i.e.,    x² + y² − 2x = 0,    (x − 1)² + y² = 1,

in which case the graph is the circle of radius 1 centered at (1, 0). Plotting points, however, often works better: start with the graph of r = f(θ) with r, θ as rectangular coordinates; then plot the corresponding points on a polar grid as shown below.
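One practical caution: tan⁻¹(y/x) only determines θ up to the quadrant, which is why code uses atan2 instead; it examines the signs of both x and y. A sketch of the conversions:

```python
import math

def to_polar(x, y):
    """Cartesian -> polar, with r >= 0 and theta in (-pi, pi]."""
    return math.hypot(x, y), math.atan2(y, x)

def to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

# The point (-1, 1) is at distance sqrt(2) and angle 3*pi/4 (second
# quadrant); naive atan(y/x) = atan(-1) would wrongly give -pi/4.
r, theta = to_polar(-1.0, 1.0)
x, y = to_cartesian(r, theta)
```

Round-tripping through both conversions recovers the original point, as the assertions below confirm.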
Polar Coordinate Systems in 3-space: as we shall see when we get to integration, polar coordinates in the plane are particularly useful when some circular symmetry is present: a circle is symmetric about a point - its center - because rotation about the center doesn't change the circle. In 3-space, however, there are more possibilities for symmetry.

Cylindrical Polar Coordinates: when there's symmetry about an axis, it's convenient to take the z-axis as the axis of symmetry and use polar coordinates (r, θ) in the xy-plane to measure rotation around the z-axis. Check the interactive figure to the right. A point P is specified by coordinates (r, θ, z), where z is the height of P above the xy-plane.

(i) what happens to P as z changes?

(ii) what's the relation between r, P and the axis of symmetry?

(iii) what are the natural restrictions on r, θ?

(iv) prove that the relation between Cartesian coordinates (x, y, z) and Cylindrical Polar coordinates (r, θ, z) for each point P in 3-space is

x = r cos θ,    y = r sin θ,    z = z.
Spherical Polar Coordinates: a sphere is symmetric in all directions about its center, so it's convenient to take the center of the sphere as the origin. Then we let ρ be the distance from the origin to P and φ the angle this line from the origin to P makes with the z-axis. Finally, as before, we use θ from polar coordinates in the xy-plane to measure rotation around the z-axis. Investigate the interactive figure to the right. A point P is specified by coordinates (ρ, θ, φ).

(i) prove that the relation between Cartesian coordinates (x, y, z) and Spherical Polar coordinates (ρ, θ, φ) for each point P in 3-space is

x = ρ cos θ sin φ,    y = ρ sin θ sin φ,    z = ρ cos φ.

(ii) Show that the natural restrictions on ρ, θ and φ are

0 ≤ ρ < ∞,    0 ≤ θ < 2π,    0 ≤ φ ≤ π.

(iii) Points on the earth are frequently specified by Latitude and Longitude. How do these relate to θ and φ?
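Part (i) is easy to experiment with numerically. A sketch of the conversion, using the ρ, θ, φ convention above (φ measured down from the z-axis):

```python
import math

def spherical_to_cartesian(rho, theta, phi):
    """(rho, theta, phi) -> (x, y, z), with phi measured from the z-axis."""
    return (rho * math.cos(theta) * math.sin(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(phi))

# phi = 0 points straight up the z-axis; phi = pi/2 lies in the xy-plane.
top = spherical_to_cartesian(2.0, 0.0, 0.0)
equator = spherical_to_cartesian(2.0, math.pi / 2, math.pi / 2)
```

Note that some texts (and libraries) swap the roles of θ and φ, so it's worth checking conventions before mixing formulas.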

Problem: the surface S shown in the figure is the portion of the sphere

x² + y² + z² = 4

lying outside the cylinder

x² + y² = 1.

Express S in spherical polar coordinates.

Solution: In spherical polar coordinates (ρ, θ, φ),

x = ρ sin φ cos θ,    y = ρ sin φ sin θ,    z = ρ cos φ,

with 0 ≤ θ ≤ 2π and 0 ≤ φ ≤ π. We need further restrictions on ρ, θ and φ so that

x² + y² + z² = 4,    x² + y² ≥ 1.

Now

ρ² = x² + y² + z² = 4,

i.e., ρ = 2. But then,

z² = 4 cos²φ = 4 − (x² + y²) ≤ 4 − 1 = 3,

so cos²φ ≤ 3/4, which holds exactly when π/6 ≤ φ ≤ 5π/6. Consequently, S consists of all points P(ρ, θ, φ) with

ρ = 2,    0 ≤ θ ≤ 2π,    π/6 ≤ φ ≤ 5π/6.
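A quick numerical sanity check of the φ-range: on the sphere ρ = 2 we have x² + y² = 4 sin²φ, and this is at least 1 exactly when π/6 ≤ φ ≤ 5π/6. A sketch:

```python
import math

def xy_radius_sq(phi, rho=2.0):
    """x^2 + y^2 for a point on the sphere rho = 2 at polar angle phi."""
    return (rho * math.sin(phi)) ** 2

inside_range = xy_radius_sq(math.pi / 2)    # equator: x^2 + y^2 = 4
boundary = xy_radius_sq(math.pi / 6)        # phi = pi/6: exactly 1
outside_range = xy_radius_sq(math.pi / 12)  # too close to the north pole
```

Points with φ closer to either pole than π/6 fall inside the cylinder and so are cut away from S.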
n-DIMENSIONAL SPACE

On to n and beyond: one of the fundamental guiding principles in this course (and all of mathematics really!) is that we use a natural progression of concepts to tell us how to proceed to the next concept. Then, crucially, the new concept can often be understood and used by relating it to the concepts preceding it. In one dimension the number line was represented by a single variable (often x), while a two-dimensional plane was represented by ordered pairs, say (x, y). The usual operations of addition of numbers and multiplication had a natural geometric meaning on the number line, and by using these same algebraic operations of addition and multiplication of numbers component-wise, we've just given algebraic and geometric interpretations of addition and multiplication of points in the plane by introducing 2D-vectors. With very little more effort we then extended this to three-dimensional space by representing it as triples (x, y, z) of numbers and doing everything component-wise once again.

Why stop at 3? In physics one looks at four-dimensional space-time, and in recent years physicists have even represented everything in terms of 10 (or is it 26?) dimensions! From an engineering point of view, how many variables do we need to describe, say, the motion of an airplane in space? We need 3 variables to specify its center, but we also need 3 more to describe its rotation in 3-space - its roll, pitch and yaw. So in total we need 6 variables. On the other hand, in computer programming one often looks at, say, 64-bit strings. They can certainly be added. What else does one do with them? In mathematics one encounters many similar situations - if

P(x) = a₀ + a₁x + a₂x² + ⋯ + a₇x⁷,    Q(x) = b₀ + b₁x + b₂x² + ⋯ + b₇x⁷

are polynomials of degree at most 7, say, having real coefficients, then the sum

P(x) + Q(x) = (a₀ + b₀) + (a₁ + b₁)x + (a₂ + b₂)x² + ⋯ + (a₇ + b₇)x⁷,

as well as the difference P(x) − Q(x) and constant multiple λP(x), are again polynomials of degree at most 7. Notice that all these manipulations are done coefficient-wise, so does this mean the set P₇(ℝ) of all polynomials of degree at most 7 with real coefficients is a variant of 8-dimensional space? What would then be the replacements for i, j and k in 3-space?

What we want is an algebraic way of realizing n-dimensional space for any n, just as we realized the plane by 2-tuples and 3-space by 3-tuples.

Euclidean n-space, ℝⁿ, for any integer n ≥ 1 is the set of all n-tuples

x = (x₁, x₂, …, xₙ),    −∞ < xⱼ < ∞.

Addition and scalar multiplication of elements x = (x₁, x₂, …, xₙ) of ℝⁿ are defined component-wise. It's common to write the length of elements x in ℝⁿ as

‖x‖ = (x₁² + x₂² + ⋯ + xₙ²)^(1/2),

and then the dot product for ℝⁿ is completely analogous to the one we defined for ℝ² and ℝ³:

x · y = ‖x‖‖y‖ cos θ = x₁y₁ + x₂y₂ + ⋯ + xₙyₙ.

The elements

i₁ = (1, 0, 0, …, 0, 0),    i₂ = (0, 1, 0, …, 0, 0),    …,    iₙ = (0, 0, 0, …, 0, 1)

are unit vectors such that each x in ℝⁿ can be written

x = (x₁, x₂, …, xₙ) = x₁i₁ + x₂i₂ + ⋯ + xₙiₙ,

exactly as before. Notice the set P₇(ℝ) of all polynomials of degree at most 7 can be identified with ℝ⁸:

P(x) = a₀ + a₁x + a₂x² + ⋯ + a₇x⁷  ⟷  (a₀, a₁, a₂, …, a₇).

If all the coefficients are 0 or 1, might this have some connection with bytes? You can expect to meet all these ideas in future courses!!
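The identification of P₇(ℝ) with ℝ⁸ is directly usable in code: store each polynomial as its coefficient 8-tuple, and polynomial arithmetic becomes vector arithmetic. A sketch with sample polynomials of our own choosing:

```python
import numpy as np

# P(x) = 1 + 2x + x^3 and Q(x) = 3 - x, as coefficient vectors in R^8.
P = np.array([1.0, 2.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0])
Q = np.array([3.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])

S = P + Q                 # coefficient-wise sum represents (P + Q)(x)
length = np.sqrt(P @ P)   # the R^8 length of P

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficient vector at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))

# Vector-space addition matches polynomial addition pointwise.
check = evaluate(S, 2.0) == evaluate(P, 2.0) + evaluate(Q, 2.0)
```

The replacements for i, j, k are the coefficient vectors of 1, x, x², …, x⁷, i.e., the unit vectors i₁, …, i₈ of ℝ⁸.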
Question for the Day: if we define 2 × 2 matrices I₁, …, I₄ by

I₁ = [1 0; 0 0],    I₂ = [0 1; 0 0],    I₃ = [0 0; 1 0],    I₄ = [0 0; 0 1],

so that each 2 × 2 matrix A can be written

A = [a b; c d] = aI₁ + bI₂ + cI₃ + dI₄,
is it possible that the set ℝ^(2×2) of all 2 × 2 matrices is basically the same as ℝ⁴ if we add matrices and take scalar multiples in the usual matrix way? Familiar operations on matrices can be used to develop other properties on ℝ⁴:

Property I. Dot Product on ℝ⁴: recall that if A = [a b; c d] is a 2 × 2 matrix then the Trace and Transpose of A are defined by

trace(A) = a + d,    Aᵗ = [a c; b d];

in other words, trace(A) is the sum of the diagonal entries, while Aᵗ 'flips' the entries over the diagonal. So if we identify elements a = (a₁, a₂, a₃, a₄) and b = (b₁, b₂, b₃, b₄) of ℝ⁴ with matrices

A = [a₁ a₂; a₃ a₄],    B = [b₁ b₂; b₃ b₄],

then

trace(AᵗB) = trace( [a₁ a₃; a₂ a₄][b₁ b₂; b₃ b₄] ) = trace( [a₁b₁ + a₃b₃, a₁b₂ + a₃b₄; a₂b₁ + a₄b₃, a₂b₂ + a₄b₄] )
= a₁b₁ + a₂b₂ + a₃b₃ + a₄b₄ = a · b.

Thus we can think of ℝ⁴ not only as all 4-tuples (x₁, x₂, x₃, x₄), but also as the set of all 2 × 2-matrices; and when we do, then the dot product is simply trace(AᵗB). Why bother?

Property II. Multiplication on ℝ⁴: since the product, AB, of 2 × 2-matrices A, B is again a 2 × 2-matrix, a product

(a₁, a₂, a₃, a₄)(b₁, b₂, b₃, b₄) = (c₁, c₂, c₃, c₄)

can be introduced on ℝ⁴ by setting

[c₁ c₂; c₃ c₄] = C = AB = [a₁ a₂; a₃ a₄][b₁ b₂; b₃ b₄].

This extends the usual product of numbers in ℝ¹. However, AB ≠ BA for most matrices A, B, so this product on ℝ⁴ does not have ALL the usual properties of multiplication of numbers.
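Both properties are quick to verify with NumPy; a sketch with arbitrary sample 4-tuples:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([5.0, 6.0, 7.0, 8.0])

A = a.reshape(2, 2)   # identify a with the matrix [[a1, a2], [a3, a4]]
B = b.reshape(2, 2)

# Property I: trace(A^t B) equals the dot product a . b.
dot_via_trace = np.trace(A.T @ B)

# Property II: matrix multiplication gives a product on R^4, but it
# is not commutative.
commutes = np.array_equal(A @ B, B @ A)
```

For these sample matrices AB and BA differ, illustrating the failure of commutativity mentioned above.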
Nonetheless, as you'll discover in many later courses, these multiplication ideas are extremely
important in physics in connection with all of Dirac's work, as well as in cryptography and
quantum computing in CS. One great idea usually leads to many other great ideas!!
Now the preliminaries are over - time to get back to calculus. Fix m, n and let U be a subset of ℝᵐ. This course develops calculus for functions

f : U ⊂ ℝᵐ → ℝⁿ.

If n = 1, f is said to be a scalar-valued or real-valued function, while f is said to be vector-valued if n > 1. Sometimes we'll write f as y = f(x), thinking of x as an element of ℝᵐ and y as an element of ℝⁿ, but often we write f as f(x₁, x₂, …, xₘ), depending on whether it's useful to think in coordinate-free or coordinate-specific terms. When m = n = 1, then f is simply a real-valued function of one variable, and we are back to single-variable calculus. As the 'natural progression' idea just discussed suggests, when m > 1 or n > 1, or both, calculus is developed by referring back to the single variable case all the time.

Example I Lines: the earlier vector form of a line can be interpreted as a linear function

r : ℝ → ℝ³,    r(t) = a + tv,

of one variable. The tip of the vector a lies on the line, while the line is parallel to the vector v.

Example II Planes: a plane can be interpreted in vector form as a linear function

f : ℝ² → ℝ³,    f(s, t) = a + sb + tc,

of two variables for fixed vectors

a = (a₁, a₂, a₃),    b = (b₁, b₂, b₃),    c = (c₁, c₂, c₃).

Notice that

f(0, 0) = a,    f(1, 0) = a + b,    f(0, 1) = a + c,

so the three points

P = (a₁, a₂, a₃),    Q = (a₁ + b₁, a₂ + b₂, a₃ + b₃),    R = (a₁ + c₁, a₂ + c₂, a₃ + c₃)

all lie in the plane, and hence determine the plane. Can you use these three points in the plane to show that the cross-product b × c is normal to the plane, and so get the equation of the plane in the Cartesian coordinate form Ax + By + Cz = D?
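Example II translates directly into code, including the b × c normal check suggested at the end; a sketch with sample vectors of our own choosing:

```python
import numpy as np

a = np.array([1.0, 0.0, 2.0])
b = np.array([1.0, 1.0, 0.0])
c = np.array([0.0, 1.0, 1.0])

def f(s, t):
    """Linear (degree-one) vector function whose image is a plane."""
    return a + s * b + t * c

n = np.cross(b, c)   # candidate normal to the plane

# Every point f(s, t) should satisfy n . (f(s, t) - a) = 0.
residuals = [np.dot(n, f(s, t) - a) for s in (-1.0, 0.0, 2.0)
                                    for t in (-3.0, 1.0)]
```

The residuals vanish because n is perpendicular to both b and c, hence to every combination sb + tc.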
Example III Beyond Planes: the example

f(s, t) = su + tv + 4st w = s(1, 1, 0) + t(1, −1, 0) + 4st(0, 0, 1) = (s + t, s − t, 4st)

is a quadratic vector function of s and t because of the product term st. But if lines and planes are the graphs of degree one functions, it's natural to ask what can happen in this degree-two case. Well, in coordinates

x = s + t,    y = s − t,    z = 4st.

So after eliminating s and t from these equations (note that (s + t)² − (s − t)² = 4st) we see that z = x² − y², whose graph is a familiar hyperbolic paraboloid. Thus the graph of a degree one vector form is a plane - a flat surface - whereas the graph of a degree two or higher vector form is a curved surface. This is the basis of vector calculus in any dimension.
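The elimination step can be checked numerically: every point (x, y, z) = (s + t, s − t, 4st) should satisfy z = x² − y². A sketch over a small grid of parameter values:

```python
import numpy as np

def f(s, t):
    """The quadratic vector function from Example III."""
    return np.array([s + t, s - t, 4 * s * t])

# Sample a grid of (s, t) values and test z = x^2 - y^2 at each point.
checks = []
for s in np.linspace(-2, 2, 5):
    for t in np.linspace(-2, 2, 5):
        x, y, z = f(s, t)
        checks.append(np.isclose(z, x**2 - y**2))
```

Sampling does not replace the algebraic identity, but it is a fast way to catch a slip in the elimination.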
SURFACES and SLICING

Just as having a good understanding of curves in the plane is essential to interpreting the concepts of single variable calculus, so a good understanding of surfaces in 3-space is needed when developing the fundamental concepts of multi-variable calculus.

Surfaces like planes, circular cylinders and spheres we've met already. The next step is to look at a surface arising as the graph of a real-valued function z = f(x, y) : U ⊂ ℝ² → ℝ of two variables - in practice, such functions occur in the theory of heat flow in a bar, or vibrating strings, or the pressure at a point on the earth. The graph of z = f(x, y) is the surface

S = {(x, y, f(x, y)) : (x, y) in U}

in 3-space. Two examples that will occur repeatedly are shown below:

z = f(x, y) = x² − y²,    z = f(x, y) = x² + y².

How do we know the surfaces look like that? The basic idea is to take cross-sections of the surface by plane slices. Because a plane intersects the surface in a curve that also lies in the plane, this curve is often referred to as the trace of the surface on the plane. Identifying traces gives us one way of 'picturing' the surface; re-assembling the cross-sections then provides a full picture of the surface. It thus reduces the problem of describing a surface to identifying curves in the plane - the 'natural progression' idea at work! When z = f(x, y),

the trace on a vertical plane y = mx + b is the curve consisting of all points

{(x, mx + b, f(x, mx + b)) : (x, mx + b) in D}

in the plane y = mx + b,

the trace on a horizontal plane z = c is the curve

{(x, y, c) : (x, y) in D, f(x, y) = c}

in the plane z = c.

In the case z = x² − y², slicing vertically by y = b means fixing y = b and graphing

z = f(x, b) = x² − b²,

while slicing vertically by the plane x = a gives

z = f(a, y) = a² − y²,

i.e., parabolas opening up and down respectively. On the other hand, slicing horizontally by z = c gives

f(x, y) = x² − y² = c,

i.e., hyperbolas opening in the x-direction if c > 0 and in the y-direction if c < 0. So the cross-sections are parabolas or hyperbolas, and the surface is called a hyperbolic paraboloid. Think of it as a saddle or a rectangular Pringle!

But when z = x² + y², the trace on y = b is the graph of z = x² + b², while that on x = a is the graph of z = a² + y²; these are parabolas which always open upwards. On the other hand, the horizontal trace on z = c is the circle x² + y² = c. So this surface is called a Paraboloid. Think of it as a wine-glass!

All this slicing is shown graphically to the left below with the corresponding traces shown to the right:
For a general function z = f(x, y), slicing horizontally is a particularly important idea:

Level curves: for a function z = f(x, y) : D ⊂ ℝ² → ℝ the level curve of value c is the curve C in D ⊂ ℝ² on which f = c.

Notice the critical difference between a level curve C of value c and the trace on the plane z = c: a level curve C always lies in the xy-plane, and is the set of points in the xy-plane on which f(x, y) = c, whereas the trace lies in the plane z = c, and is the set of points (x, y, c) with (x, y) in C.

By combining the level curves f(x, y) = c for equally spaced values of c, say c = −1, 0, 1, 2, …, into one figure in the xy-plane, we obtain a contour map of the graph of z = f(x, y). Thus the graph of z = f(x, y) can be visualized in two ways,

one as a surface in 3-space, the graph of z = f(x, y),

the other as a contour map in the xy-plane, the level curves of value c for equally spaced values of c.

As we shall see, both capture the properties of z = f(x, y) from different but illuminating points of view. The particular cases of a hyperbolic paraboloid and a paraboloid are shown interactively in the figures below.
Problem: describe the contour map of a plane in $3$-space.

Solution: the equation of a plane in $3$-space is

$Ax + By + Cz = D$,

so the horizontal plane $z = c$ intersects the plane when

$Ax + By + Cc = D$.

For each $c$, this is a line with slope $-A/B$ and $y$-intercept $y = (D - Cc)/B$. Since the slope does not depend on $c$, the level curves are parallel lines, and as $c$ runs over equally spaced values these lines will be a constant distance apart. Consequently, the contour map of a plane consists of equally spaced parallel lines. (Does this make good geometric sense?)

Now let's step up a dimension and consider functions $w = f(x, y, z) : U \subseteq \mathbb{R}^3 \to \mathbb{R}$ of $3$ variables; one such function is

$w = f(x, y, z) = x^2 + y^2 - z^2$.

The graph of every function $w = f(x, y, z)$ will be a surface in $\mathbb{R}^4$, though it can't be drawn directly; however, slicing horizontally by $w = c$ produces relations $c = f(x, y, z)$ in $x, y, z$ whose graphs will be surfaces in $3$-space which can be drawn. Formally,
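The equal spacing is easy to confirm numerically. A minimal sketch (the coefficients $A, B, C, D$ below are arbitrary sample values, not taken from the text):

```python
# Level curves of the plane Ax + By + Cz = D: setting z = c gives the line
# Ax + By = D - C*c, with slope -A/B and y-intercept (D - C*c)/B.
A, B, C, D = 2.0, 1.0, 3.0, 6.0  # arbitrary sample plane

def y_intercept(c):
    """y-intercept of the level curve of value c in the xy-plane."""
    return (D - C * c) / B

cs = [-1, 0, 1, 2, 3]                      # equally spaced values of c
intercepts = [y_intercept(c) for c in cs]
gaps = [b - a for a, b in zip(intercepts, intercepts[1:])]
```

Every gap equals $-C/B$ times the spacing of the $c$ values, which is why the contour lines sit a constant distance apart.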
Level surfaces: for a function $w = f(x, y, z) : U \subseteq \mathbb{R}^3 \to \mathbb{R}$, the level surface of value $c$ is the surface $S$ in $U \subseteq \mathbb{R}^3$ on which $f = c$.

Example 1: the graph of $z = f(x, y)$ as a surface in $3$-space can be regarded as the level surface of value $w = 0$ of the function $w(x, y, z) = z - f(x, y)$.

Example 2: spheres $x^2 + y^2 + z^2 = r^2$ can be interpreted as level surfaces $w = r^2$ of the function $w = x^2 + y^2 + z^2$. Can you see how to interpret ellipsoids in the same way?

From the earlier example of $w = f(x, y, z) = x^2 + y^2 - z^2$ we obtain three particularly important surfaces as level surfaces:

Two-sheeted Hyperboloid: $x^2 + y^2 - z^2 = -1$,
Double Cone: $x^2 + y^2 - z^2 = 0$,
Single-sheeted Hyperboloid: $x^2 + y^2 - z^2 = 1$,

by taking $c = -1, 0$, and $1$. The two-sheeted hyperboloid and double cone are very important in physics, while the single-sheeted hyperboloid is a favorite architectural device (cooling towers etc.), as is the hyperbolic paraboloid. Again we can investigate what happens as these surfaces are sliced by planes parallel to the coordinate planes:
Recognize the curves of intersection? There must be some underlying mathematical theory! To do no more than hint at what that theory might be, notice

all the surfaces have been the graph of some quadratic relation in $x, y, z$, like $z - x^2 + y^2 = 0$ in the case of a hyperbolic paraboloid or $x^2 + y^2 + z^2 = r^2$ for a sphere,

all the cross-sections of these surfaces have been conic sections like parabolas, hyperbolas etc.

In view of the first of these comments we make the following

Definition: a surface $S$ in $3$-space is said to be a Quadric Surface when it is the graph of a quadratic relation in $x, y, z$. In particular, all the surfaces described so far are Quadric Surfaces.
Cylindrical Surfaces: sometimes the intersection of a surface in $3$-space with horizontal planes $z = c$ is the same for all $c$, as in the surface below to the left, or is the same for all vertical planes, say $x = a$, as in the surface to the right.

Do you see that the circular cylinder to the left is the graph in $3$-space of $x^2 + y^2 = r^2$ for fixed $r$ because every horizontal slice is the same circle of radius $r$? Similarly, the cylinder to the right is parabolic; it's the graph of, say, $z = y^2$, since the intersection with every vertical plane $x = a$ is the same parabola $z = y^2$. Not surprisingly, it's called a Parabolic cylinder.
PATHS, CURVES, and DIFFERENTIATION
Now we turn from surfaces to paths and curves in $3$-space, but from a vector point of view. Let

$r(t) : U \subseteq \mathbb{R} \to \mathbb{R}^3, \qquad r(t) = x(t)\,i + y(t)\,j + z(t)\,k$,

be a vector-valued function with real valued components $x(t), y(t), z(t)$ which we assume have continuous derivatives. It is convenient to think of $r(t)$ as a position vector from the origin to the point $(x(t), y(t), z(t))$ that changes as $t$ varies over the domain $U$ of $r$. The variable $t$ is called a parameter, frequently time, and the terminal point of $r(t)$ traces a curve in $\mathbb{R}^3$ as $t$ varies; think of it as the trajectory of a moving particle! Formally,

A Path in $\mathbb{R}^3$ is a map $r : U \subseteq \mathbb{R} \to \mathbb{R}^3$. The set $C$ of terminal points $r(t)$ as $t$ varies over $U$ is called a space curve, and the path $r$ is said to parametrize or trace out $C$. When $r : U \subseteq \mathbb{R} \to \mathbb{R}^2$, the set of terminal points $r(t)$ will be called a plane curve.

The vector form $r(t) = a + t\,v$ for a line is an example we've met already, but the 2D-case in general is long familiar: if $y = f(x)$ is a real-valued function of one variable, then the set of terminal points of the path $r(x) = x\,i + f(x)\,j$ is just the graph of $y = f(x)$, while if a curve $C$ in the plane, such as a circle, is given parametrically by $(x(t), y(t))$, then $r(t) = x(t)\,i + y(t)\,j$ parametrizes $C$. So we are combining both of these familiar concepts and generalizing them to $\mathbb{R}^3$ (or to any dimensional space):
As these two examples show, however, it is often easier to understand a space curve by identifying it with a curve on a surface, especially when it's an 'interesting' quadric surface! Usually this means eliminating the parameter $t$ from the components $x(t), y(t)$, and $z(t)$ (the trig identity $\cos^2(\cdot) + \sin^2(\cdot) = 1$ and double angle formulas are often useful here).

Example 1: the curve parametrized by

$c(t) = \langle \cos t, \sin t, t\rangle$

lies on the cylinder $x^2 + y^2 = 1$ because

$x(t)^2 + y(t)^2 = 1$, for all $t$.

Example 2: the curve parametrized by

$c(t) = \langle t\cos t, t\sin t, t\rangle$

lies on the cone $z^2 = x^2 + y^2$ because

$x(t)^2 + y(t)^2 = z(t)^2$, for all $t$.

Notice that both of these curves spiral counter-clockwise when viewed from overhead. How would the vector functions have to change for the curves to spiral clockwise?

Question: Example 1 is a spiral staircase. What's its relation to a Double Helix? Which vector function $r(t)$ will parametrize a Double Helix?

Many interesting space curves such as

$r(t) = \langle \cos t, \sin t, \cos 2t\rangle$

arise as the intersection of two surfaces. Since

$\cos^2 t + \sin^2 t = 1, \qquad \cos 2t = \cos^2 t - \sin^2 t = 2\cos^2 t - 1$,

the orange curve, call it a 'Pringle curve', lies on each of the quadric surfaces

$x^2 + y^2 = 1, \qquad z = x^2 - y^2, \qquad z = 2x^2 - 1$,

as well as in the intersection of these surfaces. Does the Pringle curve lie on any other quadric surfaces? Did we use up all the double angle formulas in the list above?
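Both membership claims are easy to spot-check numerically; a minimal sketch:

```python
import math

def helix(t):
    # spiral staircase c(t) = (cos t, sin t, t): lies on x^2 + y^2 = 1
    return (math.cos(t), math.sin(t), t)

def pringle(t):
    # r(t) = (cos t, sin t, cos 2t): lies on the saddle z = x^2 - y^2
    return (math.cos(t), math.sin(t), math.cos(2 * t))

ts = [0.0, 0.7, 2.5, 6.0]
on_cylinder = all(
    math.isclose(x * x + y * y, 1.0) for (x, y, _) in map(helix, ts)
)
on_saddle = all(
    math.isclose(z, x * x - y * y, abs_tol=1e-12)
    for (x, y, z) in map(pringle, ts)
)
```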
Differentiating vector functions: but what's the derivative $r'(t)$ of a vector-valued function $r(t)$? What does $r'(t)$ tell us? Although the definition of $r'(t)$ is just the same as for a scalar-valued function, now it's important to interpret the limit of the Newtonian Quotient

$r'(t) = \lim_{h \to 0} \frac{r(t + h) - r(t)}{h}$

in terms of vectors!

Investigate the interactive figure to the right. Use the Triangle Law to interpret $r(t + h) - r(t)$ as the green vector.

(i) How does this green vector correspond to the 'rise' in the scalar-valued case?

(ii) What properties of vectors are needed to identify the Newtonian Quotient with the brown vector? How does the brown vector correspond to slope?

(iii) The Newtonian Quotient approaches the tangent vector to the graph of $r(t)$ at $P$ as $h \to 0$.

Thus the vector $r'(t)$ can be identified with the tangent vector to the graph of $r$ at $P$ shown in red; it gives the rate of change of $r(t)$ at $P$. To compute $r'(t)$ in coordinates:

$r'(t) = \lim_{h \to 0} \frac{r(t + h) - r(t)}{h} = \lim_{h \to 0} \frac{(x(t + h)\,i + y(t + h)\,j) - (x(t)\,i + y(t)\,j)}{h} = \left(\lim_{h \to 0} \frac{x(t + h) - x(t)}{h}\right) i + \left(\lim_{h \to 0} \frac{y(t + h) - y(t)}{h}\right) j$,

and analogously for any $r : U \subseteq \mathbb{R} \to \mathbb{R}^3$. So differentiation is done component-wise:

$r'(t) = x'(t)\,i + y'(t)\,j + z'(t)\,k = \langle x'(t), y'(t), z'(t)\rangle$.

Partial Differentiation: the partial derivatives of a function $z = f(x, y)$ of two variables or a function $w = f(x, y, z)$ of three variables can be introduced in a completely algebraic way by simply differentiating with respect to one variable, while holding all the other variables fixed.

The First Order Partial Derivatives of $w = f(x, y, z)$ are defined by

$f_x(a, b, c) = \frac{\partial f}{\partial x}(a, b, c) = \lim_{h \to 0} \frac{f(a + h, b, c) - f(a, b, c)}{h}$,

and similarly for $f_y$ and $f_z$. After freeing the fixed variables the partial derivatives then become functions of $(x, y, z)$.
In the two variable case, for instance, these partial derivatives become

$f_x(a, b) = \frac{\partial f}{\partial x}\Big|_{(a, b)} = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h}, \qquad f_y(a, b) = \frac{\partial f}{\partial y}\Big|_{(a, b)} = \lim_{k \to 0} \frac{f(a, b + k) - f(a, b)}{k}$.

Example 3: determine $f_x + f_y$ when

$f(x, y) = e^{xy - x^2}$.

Solution: differentiating with respect to $x$ keeping $y$ fixed, we see that

$f_x = y\,e^{xy - x^2} - 2x\,e^{xy - x^2} = (y - 2x)\,e^{xy - x^2}$.

On the other hand, differentiating with respect to $y$ keeping $x$ fixed, we see that

$f_y = x\,e^{xy - x^2}$.

Thus

$f_x + f_y = (y - x)\,e^{xy - x^2}$.

To understand partial derivatives geometrically, we need to interpret the algebraic idea of fixing all but one variable in terms of the by now familiar idea of slicing a surface by a plane to produce a curve in space.

Start with

$z = f(x, y) = 3x^2 + 2y^2 - x^3$,

whose graph is shown to the right. Then

$f_x = \frac{\partial f}{\partial x} = \frac{\partial}{\partial x}\left(3x^2 + 2y^2 - x^3\right) = 6x - 3x^2$.

To exploit interactivity, fix $y = 1$ and use the 'Fix $y$'-slider to intersect the graph of $f$ by the plane $y = 1$. The cubic curve of intersection shown in orange is the graph of the vector function

$r(x) = \langle x, 1, f(x, 1)\rangle = \langle x, 1, 3x^2 + 2 - x^3\rangle$

(use 'Curve 1'-slider). The tangent vector to this orange curve is

$r'(x) = \langle 1, 0, 6x - 3x^2\rangle = \langle 1, 0, f_x\rangle$

(use 'Tangent Vector 1' button). Thus $f_x$ gives the slope of the graph of $z = f(x, y)$ in the $x$-direction. If we fix $x = 1$, say, and use the other sliders and button, we see that $f_y$ gives the slope of the graph of $z = f(x, y)$ in the $y$-direction.
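The partial derivatives computed in Example 3 can be sanity-checked with difference quotients; a minimal sketch:

```python
import math

def f(x, y):
    return math.exp(x * y - x * x)

def fx(x, y):
    # claimed in Example 3: f_x = (y - 2x) e^{xy - x^2}
    return (y - 2 * x) * f(x, y)

def fy(x, y):
    # claimed in Example 3: f_y = x e^{xy - x^2}
    return x * f(x, y)

x0, y0, h = 0.4, 0.9, 1e-6
fx_num = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
fy_num = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)
err = max(abs(fx_num - fx(x0, y0)), abs(fy_num - fy(x0, y0)))
```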
Thus, at a point $P = (a, b, f(a, b))$ on the graph of $z = f(x, y)$,

the value $f_x\big|_{(a, b)}$ is the Slope in the $x$-direction, and $\langle 1, 0, f_x\big|_{(a, b)}\rangle$ is the Tangent Vector in the $x$-direction, while

$f_y\big|_{(a, b)}$ and $\langle 0, 1, f_y\big|_{(a, b)}\rangle$ give the respective slope and tangent vector at $P$ in the $y$-direction.

Example 4: determine whether $f_x, f_y$ are positive, negative, or zero at the points $P, Q, R, S$ on the surface to the right.

Solution: at $Q$, for instance, the surface slopes up for fixed $x$ as $y$ increases, so $f_y\big|_Q > 0$, while the surface remains at a constant level at $Q$ in the $x$ direction for fixed $y$, so $f_x\big|_Q = 0$. On the other hand,

$f_x\big|_R = 0, \quad f_y\big|_R > 0, \quad f_x\big|_P < 0, \quad f_y\big|_P = 0$.

But what happens at $S$?

All the same ideas carry over in exactly the same way to functions $w = f(x, y, z)$ of three or more variables, just don't expect lots of pictures!! The partial derivative $f_z$, for instance, is simply the derivative of $f(x, y, z)$ with respect to $z$, keeping the variables $x$ AND $y$ fixed now.

Information about the partial derivatives of a function $z = f(x, y)$ can be detected also from the contour map of $f$. Indeed, just as one knows from using contour maps to learn whether a path on a mountain is going up or down, or how steep it is, so the sign of the partial derivatives of $z = f(x, y)$ and their relative size can be read off from the contour map of $f$.
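Sign information can also be read straight from the formulas; a tiny check for the function $f = 3x^2 + 2y^2 - x^3$ used in the interactive figures:

```python
def fx(x, y):
    # f_x of f(x, y) = 3x^2 + 2y^2 - x^3
    return 6 * x - 3 * x * x

def fy(x, y):
    # f_y of the same function
    return 4 * y

uphill_x = fx(1, 0)     # positive: surface climbs in the x-direction at (1, 0)
flat_x = fx(2, 0)       # 12 - 12 = 0: a level spot in the x-direction
downhill_y = fy(0, -1)  # negative: surface drops in the y-direction at (0, -1)
```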
Example 5: To the right is the contour map of the earlier function

$z = f(x, y) = 3x^2 + 2y^2 - x^3$,

with 'higher ground' being shown in lighter colors and 'lower ground' in darker colors. Determine whether $f_x, f_y$ are positive, negative, or zero at $P, Q, R, S$, and $T$.

At $R$, for instance, are the contours increasing or decreasing as $y$ increases for fixed $x$? That will indicate the sign of $f_y$. But what happens at $P$ or at $S$? Don't forget that the graph of $f$ appears in the earlier interactive animation!

Tangent Plane: to determine the equation of the tangent plane to the graph of $z = f(x, y)$, let $P = (a, b, f(a, b))$ be a point on the surface above $(a, b)$ in the $xy$-plane as shown to the right below. Slicing the surface with vertical planes $y = b$ and $x = a$ creates two curves on this graph, both passing through $P$.

These two space curves are the graphs of vector functions

$r_1(x) = \langle x, b, f(x, b)\rangle, \qquad r_2(y) = \langle a, y, f(a, y)\rangle$,

shown in orange on the surface. The vector derivatives

$T_1 = r_1'(a) = i + f_x(a, b)\,k, \qquad T_2 = r_2'(b) = j + f_y(a, b)\,k$,

are Tangent Vectors at $P$ to the graph of $f$, while the plane containing $P$ as well as $T_1$ and $T_2$ is the Tangent Plane at $P$.

To calculate the equation of the tangent plane at $P(a, b, c)$ we need its normal:

$n = T_1 \times T_2 = \begin{vmatrix} i & j & k \\ 1 & 0 & f_x(a, b) \\ 0 & 1 & f_y(a, b) \end{vmatrix} = \langle -f_x(a, b), -f_y(a, b), 1\rangle$.
On the other hand, if $Q(x, y, z)$ is an arbitrary point in the tangent plane at $P(a, b, c)$, then

$\vec{PQ} = \langle x - a,\ y - b,\ z - f(a, b)\rangle$

lies in the plane and so is perpendicular to $n$. Thus in point-normal form the equation of the tangent plane is

$\langle x - a,\ y - b,\ z - f(a, b)\rangle \cdot n = \langle x - a,\ y - b,\ z - f(a, b)\rangle \cdot \langle -f_x(a, b), -f_y(a, b), 1\rangle = -(x - a)f_x(a, b) - (y - b)f_y(a, b) + z - f(a, b) = 0$.

After rearranging we get:

The equation of the Tangent Plane at $(a, b, f(a, b))$ is

$z = f(a, b) + (x - a)f_x(a, b) + (y - b)f_y(a, b)$,

while the Linearization of $f$ at $(a, b)$ is

$L(x, y) = f(a, b) + (x - a)f_x(a, b) + (y - b)f_y(a, b)$.

The pattern is just the same as for one variable after making allowances for the appearance of the second variable. The corresponding formulas for functions of more than two variables follow exactly this pattern.

Example 6: determine the tangent plane to the graph of

$f(x, y) = x^3 + y^2 + 2x$

at $(-1, 2, f(-1, 2))$.

Solution: since

$f_x = 3x^2 + 2, \qquad f_y = 2y$,

we see that

$f_x\big|_{(-1, 2)} = 5, \qquad f_y\big|_{(-1, 2)} = 4$.

Thus the tangent plane at $(-1, 2, f(-1, 2))$ is

$z = f(-1, 2) + 5(x + 1) + 4(y - 2)$,

which after simplification becomes

$z = 5x + 4y - 2$.

Now let's use the Linearization of $f$ to estimate the change, $\Delta f$, in $f$ near $(a, b)$:

$\Delta f = f(x, y) - f(a, b) \approx L(x, y) - f(a, b) = (x - a)f_x + (y - b)f_y$.

The function $L(x, y)$ is often called the Linear Approximation to $f$ at $(a, b)$.
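The tangent plane found in Example 6 should match $f$ exactly at $(-1, 2)$ and closely nearby, with the error shrinking quadratically as we approach the point; a quick check:

```python
def f(x, y):
    return x ** 3 + y ** 2 + 2 * x

def plane(x, y):
    # tangent plane from Example 6
    return 5 * x + 4 * y - 2

gap_at_point = abs(f(-1, 2) - plane(-1, 2))
e1 = abs(f(-0.9, 2.1) - plane(-0.9, 2.1))      # step 0.1 away
e2 = abs(f(-0.99, 2.01) - plane(-0.99, 2.01))  # step 0.01 away
```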
Example 7: find the Linearization, $L(x, y)$, of

$z = f(x, y) = y\sqrt{x}$

at the point $(9, -2)$.

Solution: the Linearization of $f$ at $(a, b)$ is

$L(x, y) = f(a, b) + f_x\big|_{(a, b)}(x - a) + f_y\big|_{(a, b)}(y - b)$.

But when $f(x, y) = y\sqrt{x}$,

$\frac{\partial f}{\partial x} = \frac{y}{2\sqrt{x}}, \qquad \frac{\partial f}{\partial y} = \sqrt{x}$.

At $(9, -2)$, therefore, $f(9, -2) = -6$, while

$f_x\big|_{(9, -2)} = -\frac{1}{3}, \qquad f_y\big|_{(9, -2)} = 3$.

Thus

$L(x, y) = -6 - \frac{1}{3}(x - 9) + 3(y + 2)$,

which after rearrangement becomes

$L(x, y) = 3 - \frac{1}{3}x + 3y$.
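Likewise for Example 7: $L$ reproduces $f = y\sqrt{x}$ at $(9, -2)$ and tracks it closely nearby, but not far away; a quick check:

```python
import math

def f(x, y):
    return y * math.sqrt(x)

def L(x, y):
    # linearization from Example 7
    return 3 - x / 3 + 3 * y

at_point = L(9, -2)                          # should equal f(9, -2) = -6
err_near = abs(f(9.1, -1.9) - L(9.1, -1.9))  # small near (9, -2)
err_far = abs(f(16, 1) - L(16, 1))           # large far from (9, -2)
```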
Gradient, Chain Rule, and Directional Derivatives
What's going to replace the derivative $f'(x)$ of a function $y = f(x)$ of one variable when $f$ is a real-valued function of two or more variables? Since slope now depends on direction, not just sign, we can expect vectors will be needed!

In two variables

Definition: the Gradient of a real-valued function $z = f(x, y)$ is the vector function

$(\nabla f)(x, y) = \frac{\partial f}{\partial x}\,i + \frac{\partial f}{\partial y}\,j$

whose components are the partial derivatives of $f$ in the $x$- and $y$-directions.

This definition generalizes immediately to a function $w = f(x, y, z)$:

$(\nabla f)(x, y, z) = \frac{\partial f}{\partial x}\,i + \frac{\partial f}{\partial y}\,j + \frac{\partial f}{\partial z}\,k$,

and to any function $y = f(x_1, x_2, \ldots, x_n)$ of $n$ variables:

$(\nabla f)(x) = \frac{\partial f}{\partial x_1}\,i_1 + \frac{\partial f}{\partial x_2}\,i_2 + \cdots + \frac{\partial f}{\partial x_n}\,i_n$.

So if $f(x_1, x_2, \ldots, x_n)$ is a function of $n$ variables, $n = 2, 3, \ldots$, then the value of $(\nabla f)(x_1, x_2, \ldots, x_n)$ is a vector in $\mathbb{R}^n$. Later we shall use the name vector fields to describe functions having domain and range in $\mathbb{R}^n$.

Example: for $f(x, y) = x^2 - y^2$,

$(\nabla f)(x, y) = 2x\,i - 2y\,j$.

To draw the graph of $\nabla f$ we select a set of points $P(x, y)$ and represent $(\nabla f)(P)$ by a vector with initial point $P$ and length scaled so that it's not too long but remains a fixed proportion of the true length of $(\nabla f)(P)$. Drawing $(\nabla f)(P)$ for every $P$ is not helpful, of course, because it would cover the plane with arrows, so we choose a representative set. But for functions of $3$ or more variables, the graph of $\nabla f$ is usually too cluttered to be of much use even if a computer graphics program is used.
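The gradient in the example can be confirmed with difference quotients; a minimal sketch:

```python
def f(x, y):
    # the example function f(x, y) = x^2 - y^2
    return x * x - y * y

def grad_f(x, y):
    # claimed gradient: 2x i - 2y j
    return (2 * x, -2 * y)

def grad_numeric(x, y, h=1e-6):
    # central difference quotients in each variable
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

max_err = max(
    abs(a - b)
    for (x, y) in [(1.0, 2.0), (-0.5, 0.3)]
    for a, b in zip(grad_numeric(x, y), grad_f(x, y))
)
```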
Many important properties of the gradient of $z = f(x, y)$ are suggested by superimposing the graph of $(\nabla f)(x, y)$ on the contour map of $z = f(x, y)$:

These graphics suggest key properties of the vector $(\nabla f)(a, b)$ at a point $(a, b)$ in the $xy$-plane:

$(\nabla f)(a, b)$ points in the direction of maximum rate of increase of $f(x, y)$ at $(a, b)$,

$(\nabla f)(a, b)$ is perpendicular to the level curve through $(a, b)$,

the length of $(\nabla f)(a, b)$ is the maximum slope at $(a, b, f(a, b))$.

Let's first see how these fundamental properties of gradients work in practice before introducing the multi-variable Chain Rule and Directional Derivatives that are needed to establish these properties.
Atmospheric Pressure, Wind, and Gradients: let $z = P(x, y)$ be the pressure at the point $(x, y)$ in some region $D$ in the $xy$-plane (flat earth!). Then $\nabla P$ is often referred to as the Pressure Gradient, not surprisingly!

The level curves $P(x, y) = c$ are curves of constant pressure, called isobars, and a standard weather map for $D$ consists of a set of such isobars. As a visual aid, colors are often used on this contour map to indicate the pressure at a given isobar: the darker the color, the lower the pressure. High pressure areas on weather maps are generally associated with clear skies, while low pressure areas are generally associated with cloudy or overcast skies. Wind is generated by moving from high pressure areas to low pressure areas, so the wind speed and direction determine a velocity vector at each point $(x, y)$ which will be the negative gradient $-(\nabla P)(x, y)$, negative because at $(x, y)$ the gradient $(\nabla P)(x, y)$ points from lower to higher isobars, whereas wind blows from higher to lower isobars. So where will the wind be blowing and how strong will it be blowing in the weather map to the right?

Chain Rule: in one variable the Chain Rule gives the derivative of the composition $y = f(x(t))$ of $y = f(x)$ and $x = x(t)$:

$\frac{dy}{dt} = \frac{d}{dt} f(x(t)) = \frac{df}{dx}\Big|_{x(t)} \frac{dx}{dt} = f'(x(t))\,x'(t)$.

For a function $z = f(x, y)$ of two variables there is a similar formula, but there are more possibilities. The General Version of the Chain Rule starts with $z = f(x, y)$ and $x$ and $y$ which are themselves functions $x = x(s, t)$ and $y = y(s, t)$ of two variables $s$ and $t$, so that the composition

$z = f(x(s, t), y(s, t))$

is now a function of $s$ and $t$. The partial derivatives of $z$ become:

Chain Rule, General Version: when $z = f(x(s, t), y(s, t))$ is the composition of $z = f(x, y)$ and $x = x(s, t)$, $y = y(s, t)$, then its partial derivatives are given by

$\frac{\partial z}{\partial s} = \frac{\partial}{\partial s} f(x(s, t), y(s, t)) = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s}$,

$\frac{\partial z}{\partial t} = \frac{\partial}{\partial t} f(x(s, t), y(s, t)) = \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}$.

The one and two variable cases set the pattern for more variables. If $w = f(x, y, z)$ and

$x = x(r, s, t), \qquad y = y(r, s, t), \qquad z = z(r, s, t)$,

then

$w = f(x(r, s, t), y(r, s, t), z(r, s, t))$

is a function of $r, s, t$ such that

$\frac{\partial w}{\partial r} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial r}$,

and so on for functions $f(x_1, x_2, \ldots, x_n)$ of $n$ variables for any $n$.

Example 1: determine $\frac{\partial z}{\partial s}$ when

$z = f(x, y) = \frac{x}{x + y}$ and $x = 3s\,e^t$, $y = 1 + s\,e^{-t}$.

Solution: by the Chain Rule

$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s}$.

But when $z = \frac{x}{x + y}$,

$\frac{\partial z}{\partial x} = \frac{(x + y) - x}{(x + y)^2} = \frac{y}{(x + y)^2}, \qquad \frac{\partial z}{\partial y} = -\frac{x}{(x + y)^2}$.

On the other hand,

$\frac{\partial x}{\partial s} = 3e^t, \qquad \frac{\partial y}{\partial s} = e^{-t}$.

Consequently,

$\frac{\partial z}{\partial s} = \frac{1}{(x + y)^2}\left(3y\,e^t - x\,e^{-t}\right)$.

We can now substitute for $x, y$ in terms of $s, t$:

$\frac{\partial z}{\partial s} = \frac{3e^t}{\left(3s\,e^t + 1 + s\,e^{-t}\right)^2}$.

Why might we be interested in such compositions? Well, in the plane for example, both Cartesian and Polar coordinates are important, so often there's a need to change coordinates: in polar coordinates

$x = x(r, \theta) = r\cos\theta, \qquad y = y(r, \theta) = r\sin\theta$.

Thus the Chain Rule then tells us that

$\frac{\partial}{\partial r} f(r\cos\theta, r\sin\theta) = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta$,

and so on.

Another particularly useful composition is the following: suppose $z = f(x, y)$ is a function of two variables and $r(t) = x(t)\,i + y(t)\,j$ is a path in $2$-space parametrizing a curve $C$. Then the composition

$g(t) = f(r(t)) = f(x(t), y(t))$

is now a function of $t$; it is the restriction, $f\big|_C$, of $f$ to $C$. There is a version of the Chain Rule for such compositions.
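Before moving on, Example 1's final formula can be verified against a direct difference quotient in $s$; a minimal check:

```python
import math

def z(s, t):
    x = 3 * s * math.exp(t)
    y = 1 + s * math.exp(-t)
    return x / (x + y)

def dz_ds(s, t):
    # Chain-Rule result: 3 e^t / (3 s e^t + 1 + s e^{-t})^2
    denom = 3 * s * math.exp(t) + 1 + s * math.exp(-t)
    return 3 * math.exp(t) / denom ** 2

s0, t0, h = 0.5, 0.3, 1e-6
numeric = (z(s0 + h, t0) - z(s0 - h, t0)) / (2 * h)
err = abs(numeric - dz_ds(s0, t0))
```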
Chain Rule for Paths: if $g(t) = f(r(t))$ is the composition of $z = f(x, y)$ and $r(t)$, then

$\frac{dg}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = x'(t)\,\frac{\partial f}{\partial x} + y'(t)\,\frac{\partial f}{\partial y} = (\nabla f)\big|_{r(t)} \cdot r'(t)$,

analogous to the single variable Chain Rule with dot product replacing the usual product. There is a corresponding version

$\frac{dg}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} = (\nabla f)\big|_{r(t)} \cdot r'(t)$

for the composition $g(t) = f(r(t))$ of $w = f(x, y, z)$ and a path $r(t)$ in $3$-space.

The next worked example shows how this form of the Chain Rule might be used.

Example 2: the temperature at a point $(x, y)$ in the plane is $T(x, y)\,$°C. If a bug crawls on the plane so that its position at time $t$ (in minutes) is given by

$x(t) = \sqrt{1 + t}, \qquad y(t) = 5 + \tfrac{1}{3}t$,

determine how fast the temperature is rising on the bug's path after $3$ minutes when

$T_x(2, 6) = 20, \qquad T_y(2, 6) = 3$.

Solution: the temperature on the bug's path is $S(t) = T(x(t), y(t))$. Thus by the Chain Rule for Paths,

$\frac{dS}{dt} = \frac{\partial T}{\partial x}\frac{dx}{dt} + \frac{\partial T}{\partial y}\frac{dy}{dt}$.

But

$\frac{dx}{dt} = \frac{1}{2\sqrt{1 + t}}, \qquad \frac{dy}{dt} = \frac{1}{3}$.

Thus

$\frac{dS}{dt} = \frac{1}{2\sqrt{1 + t}}\,\frac{\partial T}{\partial x} + \frac{1}{3}\,\frac{\partial T}{\partial y}$.

Now when $t = 3$, the bug is at the point $(2, 6)$, in which case

$T_x(2, 6) = 20, \qquad T_y(2, 6) = 3$.

Consequently,

$\frac{dS}{dt}\Big|_{t = 3} = \frac{20}{4} + 3 \cdot \frac{1}{3} = 6\ $°C/min.

An important application of the Chain Rule for Paths occurs when the path is the straight line through a point $P(a, b)$ in the direction of a unit vector $v = h\,i + k\,j$.
Definition: the Directional Derivative of $z = f(x, y)$ at $(a, b)$ in the direction $v$ is defined by

$(D_v f)(a, b) = \lim_{t \to 0} \frac{f(a + th,\ b + tk) - f(a, b)}{t}$.

Of course,

$D_i f(a, b) = \frac{\partial f}{\partial x}\Big|_{(a, b)}, \qquad D_j f(a, b) = \frac{\partial f}{\partial y}\Big|_{(a, b)}$.

By the Chain Rule for Paths, if $r(t) = a + t\,v$ is the equation in vector form of the line through $P(a, b)$ in the direction $v$, then

$\frac{d}{dt} f(r(t)) = (\nabla f)(r(t)) \cdot r'(t) = (\nabla f)(r(t)) \cdot v$,

since $r'(t) = v$. Consequently,

The Directional Derivative of $z = f(x, y)$ at $(a, b)$ in the direction of the unit vector $v$ is the component

$D_v f(a, b) = (\nabla f)(a, b) \cdot v$

of the Gradient $(\nabla f)(a, b)$.

This tells how to calculate $D_v f(a, b)$: compute $(\nabla f)(a, b) \cdot v$. But by the definition of the dot product,

$D_v f(a, b) = \|(\nabla f)(a, b)\|\,\|v\|\cos\theta = \|(\nabla f)(a, b)\|\cos\theta$,

where $\theta$ is the angle between $(\nabla f)(a, b)$ and $v$. Thus $D_v f(a, b)$ will be maximized when $\theta = 0$, i.e., when $(\nabla f)(a, b)$ and $v$ are parallel and point in the same direction. This shows that at $(a, b)$

the length $\|(\nabla f)(a, b)\|$ is the maximum value of $D_v f(a, b)$,

$(\nabla f)(a, b)$ points in the direction of the maximum value of $D_v f(a, b)$.

Another important application occurs when the path parametrized by $r(t)$ is a level curve of $z = f(x, y)$. For then

$f(r(t)) = f(x(t), y(t)) = k$

for some constant $k$, so by the Chain Rule for Paths,

$\frac{d}{dt} f(r(t)) = (\nabla f)\big|_{r(t)} \cdot r'(t) = \frac{d}{dt}(k) = 0$.

But at a point $r(t) = (a, b)$ on the level curve, $r'(t)$ is the tangent vector to the level curve at $(a, b)$, so

$(\nabla f)\big|_{(a, b)} \cdot r'(t) = 0$

says that $(\nabla f)(a, b)$ is perpendicular to this tangent vector, establishing the third key property of the gradient: the vector $(\nabla f)(a, b)$ is perpendicular to the level curve through $(a, b)$.
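Both facts, that $D_v f = (\nabla f) \cdot v$ and that the maximum over unit directions is $\|\nabla f\|$, can be checked numerically; a sketch using the earlier example $f(x, y) = x^2 - y^2$:

```python
import math

def f(x, y):
    return x * x - y * y

def directional(x, y, vx, vy, t=1e-6):
    # limit definition of D_v f, with a small t
    return (f(x + t * vx, y + t * vy) - f(x, y)) / t

a, b = 1.0, 2.0
gx, gy = 2 * a, -2 * b          # gradient of x^2 - y^2 at (a, b)
gnorm = math.hypot(gx, gy)

# dot-product formula versus the limit definition, for one direction
u = 0.8
vx, vy = math.cos(u), math.sin(u)
gap = abs(directional(a, b, vx, vy) - (gx * vx + gy * vy))

# maximum of D_v f over many unit directions is about ||grad f||
best = max(directional(a, b, math.cos(w), math.sin(w))
           for w in [k * 0.01 for k in range(629)])
```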
Higher Order Derivatives, Taylor Approximations
Second order partial derivatives can be defined just as in the one variable case. We simply differentiate partially twice to obtain

$f_{jj} = (f_{x_j})_{x_j} = \frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial x_j}\right) = \frac{\partial^2 f}{\partial x_j^2}, \qquad f_{kk} = (f_{x_k})_{x_k} = \frac{\partial}{\partial x_k}\left(\frac{\partial f}{\partial x_k}\right) = \frac{\partial^2 f}{\partial x_k^2}$,

along with mixed second order partial derivatives

$f_{jk} = (f_{x_j})_{x_k} = \frac{\partial}{\partial x_k}\left(\frac{\partial f}{\partial x_j}\right) = \frac{\partial^2 f}{\partial x_k \partial x_j}, \qquad f_{kj} = (f_{x_k})_{x_j} = \frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial x_k}\right) = \frac{\partial^2 f}{\partial x_j \partial x_k}$.

Still higher order partial derivatives $f_{jjj}, f_{jkj}, f_{kkk}, \ldots$ are defined in the same way. But does it matter whether we first differentiate partially with respect to $x_j$ and then with respect to $x_k$, or first with respect to $x_k$ and then with respect to $x_j$; in other words, does order of differentiation matter?

Well, if $f(x, y) = e^{xy - x^2}$, for example,

$f_{xy} = \frac{\partial}{\partial y}\left((y - 2x)\,e^{xy - x^2}\right) = e^{xy - x^2} + x(y - 2x)\,e^{xy - x^2}$,

while

$f_{yx} = \frac{\partial}{\partial x}\left(x\,e^{xy - x^2}\right) = e^{xy - x^2} + x(y - 2x)\,e^{xy - x^2}$,

i.e., $f_{xy} = f_{yx}$ for this particular function. But is $f_{jk} = f_{kj}$ true in general?

Equality of Mixed Partials: if $f_{xy}$ and $f_{yx}$ are differentiable (or even just continuous) on a disk, then

$f_{xy} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = f_{yx}$

on this disk. In practice, this means $f_{xy} = f_{yx}$ for virtually all functions.

Since differentiability implies continuity, the mixed partials $f_{xy}, f_{yx}$ of $f$ will certainly be equal when $f$ has partial derivatives up to order $3$; in practice, this means that mixed partials are equal for all functions encountered. Sometimes it's algebraically simpler to use the fact that $f_{xy} = f_{yx}$.
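Equality of the mixed partials of $f = e^{xy - x^2}$ can also be seen numerically with a symmetric difference stencil; a minimal sketch:

```python
import math

def f(x, y):
    return math.exp(x * y - x * x)

def mixed_numeric(x, y, h=1e-4):
    # central-difference estimate of the mixed second partial;
    # the stencil is symmetric in x and y, reflecting f_xy = f_yx
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4 * h * h)

def fxy_closed(x, y):
    # differentiate f_x = (y - 2x) e^{xy - x^2} with respect to y
    return (1 + x * (y - 2 * x)) * f(x, y)

x0, y0 = 0.4, 0.9
err = abs(mixed_numeric(x0, y0) - fxy_closed(x0, y0))
```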
Example 1: determine $f_{xy}$ when

$f(x, y) = x\,\tan^{-1}\!\left(\frac{x}{y}\right)$.

Solution: since $x\,\tan^{-1}(x/y)$ has partial derivatives of all orders, equality of Second Partials holds. Now, by the Chain Rule,

$f_y = x\,\frac{\partial}{\partial y}\left(\tan^{-1}\frac{x}{y}\right) = x\left(\frac{1}{1 + (x/y)^2}\right)\left(-\frac{x}{y^2}\right) = -\frac{x^2}{x^2 + y^2}$.

Thus by the Quotient Rule,

$f_{yx} = \frac{\partial}{\partial x}\left(-\frac{x^2}{x^2 + y^2}\right) = -\frac{2x(x^2 + y^2) - 2x \cdot x^2}{(x^2 + y^2)^2}$,

and so

$f_{xy} = f_{yx} = -\frac{2xy^2}{(x^2 + y^2)^2}$.

It was easier to calculate $f_{yx}$ in Example 1 because

$f_x = \frac{\partial}{\partial x}\left(x\,\tan^{-1}\frac{x}{y}\right) = \tan^{-1}\left(\frac{x}{y}\right) + \frac{x}{y}\left(\frac{1}{1 + (x/y)^2}\right) = \tan^{-1}\left(\frac{x}{y}\right) + \frac{xy}{x^2 + y^2}$,

while

$f_{xy} = \frac{\partial}{\partial y}\left(\tan^{-1}\left(\frac{x}{y}\right) + \frac{xy}{x^2 + y^2}\right) = \text{a mess}$.

Second order partial derivatives are connected with 'concavity', but the relationship is more subtle than in the one variable case. If the tangent plane at a point $P(a, b, f(a, b))$ on the graph of $z = f(x, y)$ gives the best Linear Approximation

$L(x, y) = f(a, b) + (x - a)f_x\big|_{(a, b)} + (y - b)f_y\big|_{(a, b)}$

to $f$ near $P$, then we can expect that some Quadric surface will give the best Quadratic Approximation to $f$ near $P$. Just as the best Linear Approximation is the degree 1 Taylor polynomial centered at $(a, b)$ for $f$, so this best Quadratic Approximation is the degree 2 Taylor polynomial. For brevity we'll speak of these degree one and degree two Taylor polynomials as simply the respective Linear and Quadratic Approximations to $f$ at $(a, b)$.
The Quadratic Approximation to a function $z = f(x, y)$ at $(a, b)$ is given by

$Q(x, y) = f(a, b) + (x - a)f_x\big|_{(a, b)} + (y - b)f_y\big|_{(a, b)} + \frac{1}{2}f_{xx}\big|_{(a, b)}(x - a)^2 + f_{xy}\big|_{(a, b)}(x - a)(y - b) + \frac{1}{2}f_{yy}\big|_{(a, b)}(y - b)^2$.

For example, when $f(x, y) = \frac{2}{xy}$, then at $(1, 2)$,

$f(1, 2) = 1, \quad f_x(1, 2) = -1, \quad f_y(1, 2) = -\frac{1}{2}, \quad f_{xx}(1, 2) = 2, \quad f_{xy}(1, 2) = \frac{1}{2}, \quad f_{yy}(1, 2) = \frac{1}{2}$,

so the Linear Approximation to $f(x, y) = \frac{2}{xy}$ near $(1, 2)$ is given by

$L(x, y) = 1 - (x - 1) - \frac{1}{2}(y - 2) = 3 - x - \frac{1}{2}y$,

while the Quadratic Approximation is

$Q(x, y) = 1 - (x - 1) - \frac{1}{2}(y - 2) + (x - 1)^2 + \frac{1}{2}(x - 1)(y - 2) + \frac{1}{4}(y - 2)^2$,

which after some algebra becomes

$Q(x, y) = 6 - 4x - 2y + x^2 + \frac{1}{2}xy + \frac{1}{4}y^2$.
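Near $(1, 2)$ the Quadratic Approximation should beat the Linear one; a quick check for $f = 2/(xy)$:

```python
def f(x, y):
    return 2 / (x * y)

def L(x, y):
    # linear approximation at (1, 2)
    return 3 - x - y / 2

def Q(x, y):
    # quadratic approximation at (1, 2), expanded form
    return 6 - 4 * x - 2 * y + x * x + 0.5 * x * y + 0.25 * y * y

x0, y0 = 1.1, 2.1
err_L = abs(f(x0, y0) - L(x0, y0))
err_Q = abs(f(x0, y0) - Q(x0, y0))
match_at_center = (L(1, 2), Q(1, 2), f(1, 2))  # all three agree at (1, 2)
```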
But what does all this mean graphically for a function $z = f(x, y)$? Well, the graph of a linear equation $Ax + By + Cz = D$ is a plane, while the graph of a quadratic equation is a quadric surface. So near a point $(a, b)$ the Linear Approximation $L(x, y)$ at $(a, b)$ approximates the graph of $z = f(x, y)$ by a plane, the Tangent Plane, while the Quadratic Approximation $Q(x, y)$ at $(a, b)$ approximates the graph of $z = f(x, y)$ by a quadric surface such as a paraboloid, hyperbolic paraboloid, or a hyperboloid.

To illustrate these ideas, let's compute some more Quadratic Approximations, ones that will be useful shortly. Set

$z = f(x, y) = \sin x \sin y, \qquad -\pi \le x, y \le \pi$;

the graph of $f$ is shown to the right in interactive form - try grabbing and moving it! Then

$\frac{\partial f}{\partial x} = \cos x \sin y, \qquad \frac{\partial f}{\partial y} = \sin x \cos y$,

while

$\frac{\partial^2 f}{\partial x^2} = -\sin x \sin y, \qquad \frac{\partial^2 f}{\partial y^2} = -\sin x \sin y$,

and

$\frac{\partial^2 f}{\partial x\,\partial y} = \cos x \cos y$.

Thus: when $f(x, y) = \sin x \sin y$,

at $(0, 0)$: $f(x, y) \approx xy$, near a hyperbolic paraboloid;

at $\left(\frac{\pi}{2}, \frac{\pi}{2}\right)$: $f(x, y) \approx 1 - \frac{1}{2}\left(x - \frac{\pi}{2}\right)^2 - \frac{1}{2}\left(y - \frac{\pi}{2}\right)^2$, near a paraboloid opening downwards;

at $\left(\frac{\pi}{2}, -\frac{\pi}{2}\right)$: $f(x, y) \approx -1 + \frac{1}{2}\left(x - \frac{\pi}{2}\right)^2 + \frac{1}{2}\left(y + \frac{\pi}{2}\right)^2$, near a paraboloid opening upwards.

Do these seem reasonable given the graph of $f$ shown above? For a function $y = f(x)$ of one variable, the degree two approximating polynomial in $x$ would be a parabola, opening up or down, but in two variables things are more subtle because there are several possible approximating quadric surfaces.

Question: all the definitions and calculations so far have been for functions $z = f(x, y)$ of two variables. Is it clear how to extend the definitions to a function $w = f(x, y, z)$ of three variables?
EXTREMA of REAL-VALUED FUNCTIONS
Optimization of functions is just as important for functions of several variables as it was in one variable. Let's first look at things graphically. The interactive surface to the right below is one we've met before. It's the graph of

$z = f(x, y) = \sin x \sin y, \qquad -\pi \le x, y \le \pi$.

In topographical terms, it has

mountains: Local Maxima as at $P$,

basins: Local Minima as at $Q$,

both of which occurred for graphs in the plane. But it also has a pass through the mountains at $R$ at which the terrain slopes up in one direction and down in another direction just like a saddle. So we also have to bring in the notion of Saddle Point as at $R$.

Just as in the one variable case, however, calculus will provide both an algebraic and graphical understanding of these three types of local extrema. So for a general function $z = f(x, y)$ we introduce the following

Definition: At $(a, b)$ a function $z = f(x, y)$ is said to have a

Local Maximum: $f(x, y) \le f(a, b)$ for all $(x, y)$ near $(a, b)$,

Local Minimum: $f(x, y) \ge f(a, b)$ for all $(x, y)$ near $(a, b)$.

The point $(a, b)$ is then said to be a Local Extremum of $z = f(x, y)$ when one of these is satisfied.

In one variable locating local extrema usually meant finding where $f'(x) = 0$. In $2$ variables we replace $f'(x)$ by $\nabla f(x, y)$.
Definition: A point $(a, b)$ is said to be a Critical Point of $f(x, y)$ when

$\nabla f(a, b) = f_x(a, b)\,i + f_y(a, b)\,j = 0$, i.e. $f_x(a, b) = f_y(a, b) = 0$,

or at least one of $f_x(a, b), f_y(a, b)$ does not exist.

Three important observations:

If $(a, b)$ is a local extremum, then $(a, b)$ is a critical point,

If $\nabla f(a, b)$ exists, then the tangent plane is horizontal at the point $(a, b, f(a, b))$ on the graph of $z = f(x, y)$ when $(a, b)$ is a critical point of $f(x, y)$.

Saddle Point: a critical point $(a, b, f(a, b))$ is said to be a SADDLE-POINT when walking from some directions takes one uphill, others downhill.

The previous graph of $z = \sin x \sin y$ shows that $\nabla f(a, b) = 0$ and the tangent plane is horizontal at $P$, $Q$, and $R$. Let's see in detail how this works algebraically to find all critical points:

Start with the function

$z = f(x, y) = \sin x \sin y, \qquad -\pi \le x, y \le \pi$.

By the Product Rule,

$f_x = \cos x \sin y, \qquad f_y = \sin x \cos y$.

As $f_x, f_y$ are always defined for $-\pi < x, y < \pi$, the only critical points occur when $(x, y)$ satisfy the equations

$\cos x \sin y = 0 = \sin x \cos y$.

But

$\cos\left(\pm\frac{\pi}{2}\right) = \sin 0 = 0$, while $\cos 0$ and $\sin\left(\pm\frac{\pi}{2}\right)$ are non-zero,

so the critical points occur at $(a, b) = (0, 0)$ and at

$\left(\frac{\pi}{2}, \frac{\pi}{2}\right), \left(\frac{\pi}{2}, -\frac{\pi}{2}\right), \left(-\frac{\pi}{2}, \frac{\pi}{2}\right), \left(-\frac{\pi}{2}, -\frac{\pi}{2}\right)$.

Now we can use various methods for classifying these critical points. This is straightforward if we have the graph. For $z = \sin x \sin y$ the earlier graph shows that $f$ has

1. a local maximum at $\left(\frac{\pi}{2}, \frac{\pi}{2}\right), \left(-\frac{\pi}{2}, -\frac{\pi}{2}\right)$, 2. a local minimum at $\left(\frac{\pi}{2}, -\frac{\pi}{2}\right), \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, 3. a saddle point at $(0, 0)$.

But there are several methods for classifying the local extrema even when we don't have the graph. Check out the various choices in the interactive graphic to the right. Critical points are indicated by the red dots. We classify them:
contour map: since height is indicated by
color shading with dark being low and light being
high, local maxima occur when a red dot is
surrounded by shading getting lighter as one
approaches the dot, while local minima occur when
a red dot is surrounded by successively darker
shading. Thus the red dots in Quadrants I and III
will be local maxima, while the red dots in
Quadrants II and IV will be local minima. A saddle
point, however, occurs at a red dot when the color
darkens as one moves in one direction from the
point, but lightens as one moves in a different
direction from that same point. This occurs at the
red dot at the origin.
gradient: gradient vectors point in the
direction of maximum slope and have length equal
to the value of that maximum slope. Now the arrows
all point inward and get successively smaller at the
red dots in Quadrants I and III,
so these points are local maxima, while the arrows all point outward and get successively smaller at the red dots in
Quadrants II and IV, so these are local minima. But at the origin the arrows point both inward and outward, as well
as having smaller length the closer they are to the origin. So there's a saddlepoint at the origin. Does this use of the
gradient vectors remind you of how you used the First Derivative test to classify critical points for functions of one
variable? It should!
While the previous methods for classifying critical points make good visuals, using second order partial derivatives is usually more convenient, just as the Second Derivative Test was in one variable.

Suppose $f(x, y)$ is a function with continuous second order partial derivatives and let
$$A = f_{xx}(a, b), \qquad B = f_{xy}(a, b), \qquad C = f_{yy}(a, b)$$
be the values of these second order derivatives at a point $(a, b)$. To determine whether $(a, b)$ is a local maximum, local minimum or a saddle point, we need to check the sign of the Discriminant
$$D = D(a, b) = AC - B^2$$
as prescribed in the important

Second Derivative Test: at a critical point $(a, b)$ a function $f$ has a
Local Maximum: if $D > 0$ and $f_{xx}(a, b) < 0$,
Local Minimum: if $D > 0$ and $f_{xx}(a, b) > 0$,
Saddle Point: if $D < 0$.
So what about $D = 0$? Well, here the test fails. It tells us nothing!!

To see how this works for $f(x, y) = \sin x \sin y$ on $-\pi \le x, y \le \pi$, note that
$$f_{xx}(x, y) = -\sin x \sin y, \qquad f_{xy}(x, y) = \cos x \cos y, \qquad f_{yy}(x, y) = -\sin x \sin y .$$
At three of the critical points of $f$ the Second Derivative Test tells us:
1. $\big(\tfrac{\pi}{2}, \tfrac{\pi}{2}\big)$: $A = C = -1$, $B = 0$, $D = 1 > 0$, so here $f$ has a local maximum;
2. $\big(-\tfrac{\pi}{2}, \tfrac{\pi}{2}\big)$: $A = C = 1$, $B = 0$, $D = 1 > 0$, so here $f$ has a local minimum;
3. $(0, 0)$: $A = C = 0$, $B = 1$, $D = -1 < 0$, so here $f$ has a saddle point.
Can you see what would happen at the remaining two critical points?
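If you'd like to see the test in action on a computer, here is a short Python sketch (just an illustration, not a required method) that applies the classification rule above to $f(x, y) = \sin x \sin y$ using the second partials computed a moment ago:

```python
# Classify critical points of f(x, y) = sin x sin y with the Second
# Derivative Test: D = AC - B^2 at the point (a, b).
import math

def classify(a, b):
    """Return 'max', 'min', 'saddle', or 'test fails' at (a, b)."""
    A = -math.sin(a) * math.sin(b)   # f_xx(a, b)
    B = math.cos(a) * math.cos(b)    # f_xy(a, b)
    C = -math.sin(a) * math.sin(b)   # f_yy(a, b)
    D = A * C - B * B
    if D > 0:
        return "max" if A < 0 else "min"
    if D < 0:
        return "saddle"
    return "test fails"

half_pi = math.pi / 2
print(classify(half_pi, half_pi))    # local maximum
print(classify(-half_pi, half_pi))   # local minimum
print(classify(0.0, 0.0))            # saddle point
```

The three calls reproduce cases 1-3 above.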
Now let's try a word problem. Here's a typical one.
Problem: the rectangular box shown to the right has two parallel partitions but no top. It has a volume of $8$ cubic inches.
1. Express the amount, $A$, of material needed to construct the box as a function of the length $x$ and width $y$ of the box (assume no material is wasted).
2. Find the dimensions $x, y$ of the box that will require the least amount of material to be used in its construction.
Solution: 1. Let the height of the box be $z$ inches. Then the box has surface area
$$A(x, y, z) = xy + 2xz + 4yz$$
because it has no top and two sides in the $x$-direction, while in the $y$-direction it has two parallel partitions as well as two sides. On the other hand, the box has volume $8$ cubic inches, which imposes a constraint condition:
$$xyz = 8, \qquad i.e., \qquad z = \frac{8}{xy} .$$
Eliminating $z$ from these equations gives
$$A(x, y) = xy + \frac{16}{y} + \frac{32}{x} .$$
This is the surface area as a function of $x, y$. We need to find its critical points.
2. By partial differentiation:
$$A_x = y - \frac{32}{x^2}\,, \qquad A_y = x - \frac{16}{y^2}\,.$$
The critical points of $A$ thus occur when
$$y = \frac{32}{x^2}\,, \qquad x = \frac{16}{y^2}\,;$$
substituting the first equation into the second gives $x = x^4/64$, i.e., $x^3 = 64$, so the critical point is $(4, 2)$. At this point
$$A = A_{xx}\Big|_{(4,2)} = \frac{64}{x^3}\Big|_{x=4} = 1 > 0\,, \qquad C = A_{yy}\Big|_{(4,2)} = \frac{32}{y^3}\Big|_{y=2} = 4 > 0\,,$$
while
$$B = A_{xy}\Big|_{(4,2)} = 1 > 0\,.$$
Thus $AC - B^2 = 3 > 0$ at the critical point. Consequently, the surface area is a minimum when $x = 4$ and $y = 2$.
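A quick numerical sanity check is easy here: sampling $A(x, y) = xy + 16/y + 32/x$ on a grid should put the smallest value near $(4, 2)$, where $A(4, 2) = 8 + 8 + 8 = 24$. The sketch below is only an illustration of that check:

```python
# Sample A(x, y) = xy + 16/y + 32/x on a grid over (0, 10] x (0, 10]
# and locate the smallest sampled value.
def A(x, y):
    return x * y + 16.0 / y + 32.0 / x

best = min(
    (A(0.1 * i, 0.1 * j), 0.1 * i, 0.1 * j)
    for i in range(1, 101)
    for j in range(1, 101)
)
print(best)  # smallest value is near (24, 4, 2)
```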
Regression Line: there are many other applications of optimization. For example, 'fitting' a curve to data is often important for modelling and prediction. To the left below, a linear fit seems appropriate for the given data, while a quadratic fit seems more appropriate for the data to the right.
But what does 'best' fit mean and how might we determine that best fit? This is an optimization problem which for the linear case can be formulated as minimizing a function of two variables.
Suppose an experiment is conducted at times $x_1, x_2, \ldots, x_n$ yielding observed values $y_1, y_2, \ldots, y_n$ at these respective times. If the points $(x_j, y_j)$, $1 \le j \le n$, are then plotted and they look like the ones to the left above, one might conclude that the experiment can be described mathematically by a Linear Model indicated by the line drawn through the data. Using $y = mx + b$ for the equation of the model line, we get predicted values $p_1, p_2, \ldots, p_n$ at $x_1, x_2, \ldots, x_n$ by setting $p_j = m x_j + b$. The difference $p_j - y_j$ between the predicted and observed values is a measure of the error at the $j^{\text{th}}$-observation: it measures how far above or below the model line the observed value $y_j$ lies. We want to minimize the total error over all observations.
The minimum value of the sum
$$E(m, b) = (p_1 - y_1)^2 + (p_2 - y_2)^2 + \cdots + (p_n - y_n)^2 = (m x_1 + b - y_1)^2 + (m x_2 + b - y_2)^2 + \cdots + (m x_n + b - y_n)^2$$
as $m, b$ vary is called the Least Squares Error. For the minimizing values of $m$ and $b$, the corresponding line $y = mx + b$ is called the Least Squares Line or the Regression Line.
Taking squares avoids positive and negative errors canceling each other out. Other choices like $|p_j - y_j|$ could be used, but the fact that we'll want to differentiate to determine $m$ and $b$ means the calculus will be a lot simpler if we don't use absolute values!!
To determine the critical point of $E(m, b)$ differentiate partially:
$$E_m = 2\big(x_1(m x_1 + b - y_1) + x_2(m x_2 + b - y_2) + \cdots + x_n(m x_n + b - y_n)\big) = 0\,,$$
$$E_b = 2\big((m x_1 + b - y_1) + (m x_2 + b - y_2) + \cdots + (m x_n + b - y_n)\big) = 0\,,$$
giving a pair of simultaneous linear equations in $m$ and $b$. Now solve for $m$ and $b$. To do it by hand, as in homework assignments, one hopes that $n$ is small - say $n = 3$ - but solving the simultaneous equations is a problem in linear algebra, and most spread-sheet programs or computer algebra systems have a built-in algorithm for calculating $m$ and $b$ to determine the regression line for a given data set. The mathematics explains where that algorithm originated! Indeed,
$$E_m = 2\big((x_1^2 + x_2^2 + \cdots + x_n^2)m + (x_1 + x_2 + \cdots + x_n)b - (x_1 y_1 + x_2 y_2 + \cdots + x_n y_n)\big) = 0\,,$$
while
$$E_b = 2\big((x_1 + x_2 + \cdots + x_n)m + (1 + 1 + \cdots + 1)b - (y_1 + y_2 + \cdots + y_n)\big) = 0\,,$$
which from a linear algebra point of view suggests introducing vectors
$$\mathbf{x} = (x_1, x_2, \ldots, x_n)\,, \qquad \mathbf{y} = (y_1, y_2, \ldots, y_n)\,, \qquad \mathbf{1} = (1, 1, \ldots, 1)\,,$$
in $\mathbb{R}^n$. For then $(m, b)$ is a critical point when $(m, b)$ satisfies the pair of simultaneous equations
$$\|\mathbf{x}\|^2 m + (\mathbf{x} \cdot \mathbf{1})\,b = \mathbf{x} \cdot \mathbf{y}\,, \qquad (\mathbf{x} \cdot \mathbf{1})\,m + \|\mathbf{1}\|^2 b = \mathbf{y} \cdot \mathbf{1}\,,$$
or as the corresponding matrix equation:
$$\begin{bmatrix} \|\mathbf{x}\|^2 & \mathbf{x} \cdot \mathbf{1} \\ \mathbf{x} \cdot \mathbf{1} & \|\mathbf{1}\|^2 \end{bmatrix}\begin{bmatrix} m \\ b \end{bmatrix} = \begin{bmatrix} \mathbf{x} \cdot \mathbf{y} \\ \mathbf{y} \cdot \mathbf{1} \end{bmatrix}.$$
Using the inverse of this matrix makes determining $m$ and $b$ particularly easy:
$$\begin{bmatrix} m \\ b \end{bmatrix} = \frac{1}{\|\mathbf{x}\|^2\|\mathbf{1}\|^2 - (\mathbf{x} \cdot \mathbf{1})^2}\begin{bmatrix} \|\mathbf{1}\|^2 & -\,\mathbf{x} \cdot \mathbf{1} \\ -\,\mathbf{x} \cdot \mathbf{1} & \|\mathbf{x}\|^2 \end{bmatrix}\begin{bmatrix} \mathbf{x} \cdot \mathbf{y} \\ \mathbf{y} \cdot \mathbf{1} \end{bmatrix}.$$
Problem: can you see how to modify this argument for a linear model to determine the Least Squares Parabola providing a best quadratic fit for data as shown to the right above? As discussed in class, this idea can be generalized beyond lines, parabolas and curves in the plane to regression surfaces.
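The closed-form inverse above translates directly into a few lines of code. Here is a Python sketch of that formula (the sample data lying on $y = 2x + 1$ is an assumption chosen so the answer is obvious):

```python
# Solve the 2x2 normal equations for the regression line y = m x + b,
# using the matrix inverse derived above.
def regression_line(xs, ys):
    n = len(xs)
    sxx = sum(x * x for x in xs)              # x . x
    sx1 = sum(xs)                             # x . 1
    sxy = sum(x * y for x, y in zip(xs, ys))  # x . y
    sy1 = sum(ys)                             # y . 1
    det = sxx * n - sx1 * sx1                 # ||x||^2 ||1||^2 - (x . 1)^2
    m = (n * sxy - sx1 * sy1) / det
    b = (-sx1 * sxy + sxx * sy1) / det
    return m, b

# Points lying exactly on y = 2x + 1 should be recovered exactly.
m, b = regression_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(m, b)  # 2.0 1.0
```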
CONSTRAINED OPTIMIZATION

By now you're used to finding maxima and minima of both single- and multi-variable functions. But what if a restriction is
placed on the variables? Such so-called Constrained Optimization Problems come up frequently in applications.
Suppose, for instance, you want
to maximize production by allocating a fixed amount of capital between labor and manufacturing costs,
or to know where a weather balloon at the end of a fixed length of cable ends up,
or need to pack food for a trip, but you've got a limited amount of space in your RAV4.
The first method for dealing with constrained optimization, called the Method of Lagrange Multipliers, is a
particularly powerful one in engineering, the sciences and economics because it works in a wide variety of problems and in
any number of variables. The method is based on contour maps, showing yet again just how useful contour maps really
are. Let's set up an idealized mathematical model by taking a 'hike in the mountains':
Problem: Find the maximum and minimum values of
$$z = f(x, y) = y^2 - x^2$$
subject to the constraint
$$g(x, y) = x^2 + y^2 - 4 = 0\,.$$
We could use single variable methods: simply eliminate $y$ from $z = f(x, y)$ and $g(x, y) = 0$. But it will be instructive to investigate the problem using a lot of what has been learned of late! The graph of $z = y^2 - x^2$ is a hyperbolic paraboloid, while the graph of $x^2 + y^2 - 4 = 0$ is a circle of radius $2$ centered at the origin in the $xy$-plane. Without restrictions on $x, y$ there would be no maximum or minimum value, just the saddle point at the origin!
The constraint places restrictions on the values of $x, y$ in $f(x, y)$. Let's explore the effect of this constraint condition both in $3$-space and via contour maps in the $xy$-plane. Don't see any hiking yet? The trouble is that the graph of $f$ is a surface in $3$-space, while the graph of $g(x, y) = 0$ is a circle in the $xy$-plane. But in $3$-space the graph of $x^2 + y^2 - 4 = 0$ is a circular cylinder. So the constraint $g(x, y) = 0$ says we look only for the maximum and minimum values of $f(x, y)$ on the path where the cylinder graph intersects the graph of $z = y^2 - x^2$ as shown in orange to the left below. At these maximum and minimum points you are walking horizontally along the contour through that point - you'd still be going uphill or downhill otherwise!
How can this be seen in the contour map of $z = y^2 - x^2$ as shown to the right below?
Do you see which point on the orange curve corresponds to the point $P$ on the contour map? What about $Q$ and $R$? (Don't forget to rotate the surface for different viewpoints.) The crucial idea is that when the contour line and the circle have a common tangent at a point $(a, b)$ in the right hand graphic, then the corresponding point on the orange curve will be at a local maximum or local minimum on the curve because here you will be walking horizontally along the contour. But then both gradient vectors $\nabla f(a, b)$ and $\nabla g(a, b)$ will be perpendicular to this common tangent at $(a, b)$, hence parallel. Since two vectors are parallel when one is a scalar multiple of the other, we thus get:
Method of Lagrange Multipliers: the maximum and minimum values of $z = f(x, y)$ subject to the constraint $g(x, y) = 0$ occur at a point $(a, b)$ for which there exists $\lambda$ such that
$$(\nabla f)(a, b) = \lambda\,(\nabla g)(a, b)\,, \qquad g(a, b) = 0\,,$$
and $(\nabla g)(a, b) \ne \mathbf{0}$. Such points $(a, b)$ will be called critical points.
As $\nabla f, \nabla g$ are well-defined when $f$ and $g$ are functions $w = f(x, y, z)$, $g(x, y, z)$ of $3$ variables (or any greater number of variables for that matter), the Method of Lagrange Multipliers works very generally: that's why it's important!
Instead of finishing off the original problem, let's solve a slightly more complicated one:
Example 1: use Lagrange multipliers to find the minimum value of
$$f(x, y) = 2x^2 + y^2 + 3$$
subject to the constraint
$$g(x, y) = x^2 + 4y^2 - 4 = 0\,.$$
Solution: the minimum value of $f(x, y) = 2x^2 + y^2 + 3$ subject to the constraint $g(x, y) = x^2 + 4y^2 - 4 = 0$ occurs at solutions of
$$\nabla f(x, y) = \lambda\,(\nabla g)(x, y)\,, \qquad g(x, y) = 0\,.$$
Now
$$\nabla f(x, y) = 4x\,\mathbf{i} + 2y\,\mathbf{j}\,, \qquad \nabla g(x, y) = 2x\,\mathbf{i} + 8y\,\mathbf{j}\,;$$
thus the critical points occur at solutions of
$$4x = 2\lambda x\,, \qquad 2y = 8\lambda y\,, \qquad x^2 + 4y^2 - 4 = 0\,,$$
i.e., $x^2 + 4y^2 - 4 = 0$ and
$$2x(2 - \lambda) = 0\,, \qquad 2y(1 - 4\lambda) = 0\,.$$
So $x = 0$ or $\lambda = 2$. Now
(i) if $x = 0$, then $x^2 + 4y^2 - 4 = 0 \implies y = \pm 1$;
while
(ii) if $\lambda = 2$, then $2y(1 - 4\lambda) = 0 \implies y = 0$, and then $x^2 + 4y^2 - 4 = 0 \implies x = \pm 2$.
So the critical points are
$$(0, 1)\,, \quad (0, -1)\,, \quad (2, 0)\,, \quad (-2, 0)\,.$$
But
$$f(0, 1) = f(0, -1) = 4\,, \qquad f(2, 0) = f(-2, 0) = 11\,.$$
Consequently, on $g(x, y) = 0$ the minimum value of $f(x, y)$ is $4$, and it occurs at $(0, \pm 1)$.
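Example 1 is easy to double-check numerically: parametrize the constraint ellipse $x^2 + 4y^2 = 4$ by $x = 2\cos t$, $y = \sin t$ and sample $f$ along it. A short illustrative sketch:

```python
# Sample f(x, y) = 2x^2 + y^2 + 3 along the constraint ellipse
# x = 2 cos t, y = sin t, and read off the extreme values.
import math

def f(x, y):
    return 2 * x * x + y * y + 3

values = [f(2 * math.cos(t), math.sin(t))
          for t in (2 * math.pi * k / 1000 for k in range(1000))]
print(min(values), max(values))  # minimum 4 at (0, +-1), maximum 11 at (+-2, 0)
```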
A standard application of Lagrange Multipliers occurs in maximizing production of 'widgets' by allocating a fixed amount of capital between labor and manufacturing costs. It is based on so-called Cobb-Douglas functions $z = C x^a y^b$ relating labor, capital and output in an industrial economy as first derived by economist Paul Douglas and mathematician Charles Cobb long before financial engineering became a 'hot topic'.
Example 2: by investing $x$ units of labor and $y$ units of capital, John's Tees can produce
$$P(x, y) = 40\,x^{3/5}y^{2/5}$$
T-shirts. Determine the maximum number of T-shirts that can be produced on a budget of $\$10,000$ if labor costs $\$100$ per unit and capital costs $\$200$ per unit.
Solution: we have to maximize the function $P(x, y) = 40\,x^{3/5}y^{2/5}$ subject to the budget constraint
$$g(x, y) = 100x + 200y - 10{,}000 = 0\,.$$
This maximum occurs at solutions of
$$\nabla P(x, y) = \lambda\,(\nabla g)(x, y)\,, \qquad g(x, y) = 0\,.$$
Now
$$P_x = 40\Big(\frac{3}{5}\Big)x^{-2/5}y^{2/5}\,, \qquad P_y = 40\Big(\frac{2}{5}\Big)x^{3/5}y^{-3/5}\,.$$
Thus
$$\nabla P(x, y) = 8\,x^{-2/5}y^{-3/5}\big(3y\,\mathbf{i} + 2x\,\mathbf{j}\big)\,,$$
while
$$\nabla g(x, y) = 100\,\mathbf{i} + 200\,\mathbf{j}\,.$$
So the critical points occur at solutions of
$$\big(8x^{-2/5}y^{-3/5}\big)3y = 100\lambda\,, \qquad \big(8x^{-2/5}y^{-3/5}\big)2x = 200\lambda\,,$$
which simplifies to
$$\lambda = \frac{8x^{-2/5}y^{-3/5}\cdot 3y}{100} = \frac{8x^{-2/5}y^{-3/5}\cdot 2x}{200}\,,$$
i.e., $y = \frac{1}{3}x$. Substituting $\frac{1}{3}x$ for $y$ in the budget constraint condition, we find that
$$x = 60\,, \qquad y = 20\,.$$
Since
$$P(60, 20) = 40\,(60)^{3/5}(20)^{2/5} \approx 1546.55\,,$$
the maximum number of T-shirts that can be produced is $1546$.
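Again the budget constraint is one-dimensional, so Example 2 can be checked by brute force along the budget line $y = (10{,}000 - 100x)/200$. A sketch of that check (the grid spacing is an arbitrary choice):

```python
# Maximize P = 40 x^{3/5} y^{2/5} over points on the budget line
# 100x + 200y = 10000 by sampling x on a grid.
def P(x, y):
    return 40 * x ** 0.6 * y ** 0.4

best = max((P(x, (10000 - 100 * x) / 200.0), x)
           for x in (0.5 * k for k in range(1, 199)))
print(best)  # best allocation is near x = 60 (so y = 20), P close to 1546.55
```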
MOTION in SPACE
One important application of vector calculus is to motion in space - in fact, much of the application of mathematics to science
can be traced back to Kepler's Laws describing the motion of the planets in terms of conic sections. Shortly afterwards Newton
took these geometric descriptions and then created calculus to express the results analytically! Newton's Laws of Motion and the
Theory of Gravitation followed. On the other hand, much of the vector calculus that we'll come to shortly was initially developed in
the 19th-century to explain newly emerging fields of electricity and magnetism, and Einstein later used calculus over surfaces to
develop his theory of Relativity.
Recall that when
$$\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$$
is a vector function whose values are vectors in $\mathbb{R}^3$, then a space curve is simply the path traced out by the tip of $\mathbf{r}(t)$ as $t$ varies. If this path is thought of as the trajectory of an object moving in space, the parameter $t$ is naturally time, so that the tip of $\mathbf{r}(t)$ gives the position of the object at time $t$. In terms of motion in space, the examples
$$\mathbf{r}(t) = \langle \cos t,\ \sin t,\ t\rangle\,, \qquad \mathbf{r}(t) = \langle t\cos t,\ t\sin t,\ t\rangle$$
we met earlier might be motion on a staircase spiraling (counter-clockwise from above) around a circular cylinder or the path of
an object caught up in a windstorm or tornado so that it spirals around a cone.
Again as we saw earlier, differentiation of a vector function $\mathbf{r}(t)$ is done component-wise: the first order and second order derivatives of $\mathbf{r}(t)$ are
$$\mathbf{r}'(t) = x'(t)\,\mathbf{i} + y'(t)\,\mathbf{j} + z'(t)\,\mathbf{k}\,, \qquad \mathbf{r}''(t) = x''(t)\,\mathbf{i} + y''(t)\,\mathbf{j} + z''(t)\,\mathbf{k}\,.$$
Both are vector functions, and the first order derivative gives the tangent vector at $\mathbf{r}(t)$ so that $\mathbf{r}'(t)$ is the velocity, $\mathbf{v}(t)$, of the moving object, while the second derivative $\mathbf{r}''(t)$ is its acceleration $\mathbf{a}(t)$. Newton, of course, was really interested in starting with the acceleration!
Example 1: find the position $\mathbf{r}(t)$ of a particle with acceleration
$$\mathbf{a}(t) = 2\,\mathbf{i} + 12t\,\mathbf{j}\,,$$
when its initial velocity and position satisfy
$$\mathbf{v}(0) = 7\,\mathbf{i}\,, \qquad \mathbf{r}(0) = 2\,\mathbf{i} + 9\,\mathbf{k}\,.$$
Solution: since $\mathbf{a}(t) = \mathbf{v}'(t)$, integration gives
$$\mathbf{v}(t) = 2t\,\mathbf{i} + 6t^2\,\mathbf{j} + \mathbf{v}(0) = 2t\,\mathbf{i} + 6t^2\,\mathbf{j} + 7\,\mathbf{i}\,.$$
Thus
$$\mathbf{r}(t) = \int \mathbf{v}(t)\,dt = t^2\,\mathbf{i} + 2t^3\,\mathbf{j} + 7t\,\mathbf{i} + \mathbf{r}(0)\,.$$
Consequently,
$$\mathbf{r}(t) = t^2\,\mathbf{i} + 2t^3\,\mathbf{j} + 7t\,\mathbf{i} + (2\,\mathbf{i} + 9\,\mathbf{k}) = (t^2 + 7t + 2)\,\mathbf{i} + 2t^3\,\mathbf{j} + 9\,\mathbf{k}\,.$$
Differentiation of the dot and cross product of vector functions behaves just like the Product Rule:
$$\big(\mathbf{c}(t)\cdot\mathbf{d}(t)\big)' = \mathbf{c}'(t)\cdot\mathbf{d}(t) + \mathbf{c}(t)\cdot\mathbf{d}'(t)\,, \qquad \big(\mathbf{c}(t)\times\mathbf{d}(t)\big)' = \mathbf{c}'(t)\times\mathbf{d}(t) + \mathbf{c}(t)\times\mathbf{d}'(t)\,,$$
as a term-by-term calculation shows. We shall need these shortly.
Now to distance travelled. Intuitively, since the speed $\|\mathbf{v}(t)\|$ of a moving object is the length of its velocity vector, the distance the object travels from time $t = a$ to time $t = b$ should be the integral of $\|\mathbf{r}'(t)\|$ over the time interval $[a, b]$. Equivalently, this will be the arc length of the curve parametrized by $\mathbf{r}(t)$, $a \le t \le b$. It's instructive to outline a proof of this because as always the basic idea involves approximating Riemann sum arguments. Let $C$ be the curve shown in blue to the right below and parametrized by $\mathbf{r} : [a, b] \to \mathbb{R}^3$.
Partition $[a, b]$ by
$$a = t_0 < t_1 < t_2 < \cdots < t_{k-1} < t_k < \cdots < t_n = b$$
and use this to decompose $C$ into consecutive arcs $C_1, C_2, \ldots, C_k, \ldots, C_n$ parametrized by
$$C_k = \{\mathbf{r}(t) : t_{k-1} \le t \le t_k\}\,.$$
Then each $C_k$ can be approximated by the red vector $\mathbf{r}(t_k) - \mathbf{r}(t_{k-1})$ so that
$$\text{length}(C_k) \approx \|\mathbf{r}(t_k) - \mathbf{r}(t_{k-1})\|\,.$$
Now choose an (orange) sample point $\mathbf{r}(t_k^*)$ in $C_k$. All this prompts introducing the Riemann Sum
$$\sum_{k=1}^{n}\|\mathbf{r}(t_k) - \mathbf{r}(t_{k-1})\|$$
as an approximation to the length of $C$ between $\mathbf{r}(a)$ and $\mathbf{r}(b)$. But, if $\Delta t_k = t_k - t_{k-1}$ is the increment in $t$ from $t_{k-1}$ to $t_k$, then by the vector Mean Value Theorem,
$$\mathbf{r}(t_k) - \mathbf{r}(t_{k-1}) = \mathbf{r}(t_{k-1} + \Delta t_k) - \mathbf{r}(t_{k-1}) \approx \mathbf{r}'(t_k^*)\,\Delta t_k\,.$$
Thus by taking the limit as $n \to \infty$ and $\max_{1\le k \le n}\Delta t_k \to 0$, we obtain the integral
$$\lim_{n\to\infty}\sum_{k=1}^{n}\|\mathbf{r}(t_k) - \mathbf{r}(t_{k-1})\| = \lim_{n\to\infty}\sum_{k=1}^{n}\|\mathbf{r}'(t_k^*)\|\,\Delta t_k = \int_a^b \|\mathbf{r}'(t)\|\,dt\,,$$
which we'll denote by $\int_C ds$. Formally,
For a space curve $C$ parametrized by $\mathbf{r}(t)$, $a \le t \le b$, its
$$\text{arc length} = \int_C ds = \int_a^b \|\mathbf{r}'(t)\|\,dt\,.$$
When $\mathbf{r}(t)$ is the position of an object moving in 3-space, then this integral expression also gives the distance travelled by the object from time $t = a$ to time $t = b$.
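The Riemann-sum argument above is easy to watch converge on a computer. Here is an illustrative sketch for the helix $\mathbf{r}(t) = (\cos t, \sin t, t)$, whose speed is the constant $\sqrt{2}$, so the exact length over $[0, 2\pi]$ is $2\sqrt{2}\,\pi$:

```python
# Polygonal (chord) approximations to arc length for the helix
# r(t) = (cos t, sin t, t); the sums approach the integral of ||r'|| = sqrt(2).
import math

def r(t):
    return (math.cos(t), math.sin(t), t)

def polygonal_length(n, a=0.0, b=2 * math.pi):
    total = 0.0
    for k in range(1, n + 1):
        p = r(a + (b - a) * (k - 1) / n)
        q = r(a + (b - a) * k / n)
        total += math.dist(p, q)   # ||r(t_k) - r(t_{k-1})||
    return total

exact = math.sqrt(2) * 2 * math.pi
print(polygonal_length(10), polygonal_length(1000), exact)
```

Chords always underestimate arcs, so the sums increase toward the exact value as the partition is refined.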
Example 2: find the integral that represents the length of the graph of the vector function
$$\mathbf{r}(t) = \langle 4\cos^3 t,\ 4\sin^3 t\rangle\,.$$
Solution: when $\mathbf{r}(t) = \langle 4\cos^3 t, 4\sin^3 t\rangle$, then
$$\mathbf{r}'(t) = \langle -12\cos^2 t\,\sin t,\ 12\sin^2 t\,\cos t\rangle\,,$$
and so
$$\|\mathbf{r}'(t)\| = 12\sqrt{\cos^2 t\,\sin^2 t\,(\cos^2 t + \sin^2 t)} = 12\,|\cos t\,\sin t|\,.$$
The graph of $\mathbf{r}(t)$ starts at $(4, 0)$ and returns to $(4, 0)$ when $t = 2\pi$. Thus
$$\text{length} = 12\int_0^{2\pi} |\cos t\,\sin t|\,dt\,.$$
Example 3: find the distance travelled over the time interval $[0, \ln 2]$ for a particle whose position function is
$$\mathbf{r}(t) = \langle t,\ \sqrt{2}\,e^t,\ e^{-t}\rangle\,.$$
Solution: the distance travelled by the particle between $t = a$ and $t = b$ is given by
$$I = \int_a^b \|\mathbf{r}'(t)\|\,dt\,.$$
Now $\mathbf{r}'(t) = \langle 1,\ \sqrt{2}\,e^t,\ -e^{-t}\rangle$, so
$$\|\mathbf{r}'(t)\| = \sqrt{1 + 2e^{2t} + e^{-2t}} = \sqrt{(e^t + e^{-t})^2} = e^t + e^{-t}\,.$$
Thus
$$I = \int_0^{\ln 2}(e^t + e^{-t})\,dt = \Big[\,e^t - e^{-t}\,\Big]_0^{\ln 2} = \frac{3}{2}\,.$$
Acceleration and the forces on an object moving in space, be it a particle, an orbiting space station, or a planet, are more interesting, however. (Remember: force is proportional to acceleration.) Unlike the case of a particle moving in a straight line, the velocity of an object moving in 3-space can change both in magnitude and direction. Think of the forces you experience when driving on a circular off-ramp from a freeway! There's one component in the tangential direction coming from the change in speed, but there's another in the normal direction that depends on curvature. The normal force depends on the speed on the off-ramp and the force increases as the radius of the off-ramp decreases; even at constant speed the force in the normal direction is still present.
For a space curve given parametrically by $\mathbf{r}(t)$, the tangent and normal vectors at the point $\mathbf{r}(t)$ are the unit vectors defined respectively by
$$\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}\,, \qquad \mathbf{N}(t) = \frac{\mathbf{T}'(t)}{\|\mathbf{T}'(t)\|}\,.$$
Frequently, $\mathbf{N}(t)$ is called the Principal Normal and the cross product $\mathbf{B}(t) = \mathbf{T}(t) \times \mathbf{N}(t)$ the Bi-Normal.
Notice that the tangent and normal vectors are perpendicular in the sense that $\mathbf{T}(t) \cdot \mathbf{N}(t) = 0$ since
$$\|\mathbf{T}(t)\|^2 = \mathbf{T}(t)\cdot\mathbf{T}(t) = 1 \implies \frac{d}{dt}\big(\mathbf{T}(t)\cdot\mathbf{T}(t)\big) = 2\,\mathbf{T}'(t)\cdot\mathbf{T}(t) = 0\,.$$
Thus $\{\mathbf{T}(t), \mathbf{N}(t), \mathbf{B}(t)\}$ forms a right-handed system of unit vectors at each point $\mathbf{r}(t)$ on the curve. For the case of the helix these are easy to calculate.
Example 4: for the position function
$$\mathbf{r}(t) = \cos t\,\mathbf{i} + \sin t\,\mathbf{j} + t\,\mathbf{k}\,,$$
whose path is a helix we see that the velocity is
$$\mathbf{v}(t) = \mathbf{r}'(t) = -\sin t\,\mathbf{i} + \cos t\,\mathbf{j} + \mathbf{k}\,,$$
so $\|\mathbf{r}'(t)\| = \sqrt{2}$, while
$$\mathbf{T}(t) = \frac{1}{\sqrt{2}}\big(-\sin t\,\mathbf{i} + \cos t\,\mathbf{j} + \mathbf{k}\big)\,.$$
But then
$$\mathbf{T}'(t) = \frac{1}{\sqrt{2}}\big(-\cos t\,\mathbf{i} - \sin t\,\mathbf{j}\big)\,,$$
so
$$\mathbf{N}(t) = \frac{\mathbf{T}'(t)}{\|\mathbf{T}'(t)\|} = -\big(\cos t\,\mathbf{i} + \sin t\,\mathbf{j}\big)\,,$$
while
$$\mathbf{B}(t) = \mathbf{T}(t)\times\mathbf{N}(t) = \frac{1}{\sqrt{2}}\big(\sin t\,\mathbf{i} - \cos t\,\mathbf{j} + \mathbf{k}\big)\,.$$
Now to curvature. Our intuition tells us that the curvature of a circle of radius $R$ should be $1/R$, i.e., the smaller the radius, the more the circle curves, while the curvature of a straight line should be 0. But coming up with the appropriate definition takes some care because the same curve can have many different parametrizations. For example, the graphs of
$$R\cos t\,\mathbf{i} + R\sin t\,\mathbf{j}\,, \qquad R\cos 2t\,\mathbf{i} + R\sin 2t\,\mathbf{j}\,, \qquad R\cos t^3\,\mathbf{i} + R\sin t^3\,\mathbf{j}\,,$$
are all the same circle of radius $R$ centered at the origin, obviously all having the same curvature, but particles having these as position function will not have the same velocity or the same acceleration because their derivatives have different lengths. So the definition of curvature has to be based on some 'standard', unambiguous choice of parametrization. The basic idea is to use arc length as parameter.
Fix a curve $C$ parametrized by $\mathbf{r}(t)$, and choose some initial fixed point $\mathbf{r}(a)$ corresponding to $t = a$. Now define the arc length parameter $s$ by
$$s = s(t) = \int_a^t \|\mathbf{r}'(u)\|\,du\,.$$
This is the arc length of the portion of the graph of $\mathbf{r}$ between $\mathbf{r}(a)$ and $\mathbf{r}(t)$, and by the Fundamental Theorem of calculus,
$$\frac{ds}{dt} = \|\mathbf{r}'(t)\|\,;$$
in particular, we can parametrize $C$ by defining a new vector-valued function $\mathbf{c}(s)$, setting
$$\mathbf{c}(s) = \mathbf{c}(s(t)) = \mathbf{r}(t)\,.$$
In this case, many texts speak of $C$ as being parametrized by arc length. Of course, the unit tangent vectors to $C$ at $\mathbf{c}(s)$ and $\mathbf{r}(t)$ coincide (they are the same point on $C$), i.e.,
$$\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|} = \frac{\mathbf{c}'(s)}{\|\mathbf{c}'(s)\|}\,.$$
The Curvature $\kappa$ of the space curve $C$ at $\mathbf{c}(s)$ is the scalar
$$\kappa = \bigg\|\frac{d}{ds}\Big(\frac{\mathbf{c}'(s)}{\|\mathbf{c}'(s)\|}\Big)\bigg\| = \bigg\|\frac{d\mathbf{T}}{ds}\bigg\|\,;$$
in other words, the curvature of $C$ is the rate at which the direction of the unit tangent vector is changing with respect to arc length.
For a circle $\mathbf{r}(t) = R\cos t\,\mathbf{i} + R\sin t\,\mathbf{j}$, we see that
$$\mathbf{r}'(t) = R\big(-\sin t\,\mathbf{i} + \cos t\,\mathbf{j}\big)\,, \qquad \|\mathbf{r}'(t)\| = R\big(\cos^2 t + \sin^2 t\big)^{1/2} = R\,,$$
while
$$s = \int_0^t R\,dt = Rt\,, \qquad \frac{ds}{dt} = R\,.$$
Thus in terms of arc length,
$$\frac{d\mathbf{T}}{ds} = \frac{d\mathbf{T}}{dt}\,\frac{dt}{ds} = \frac{1}{R}\,\frac{d}{dt}\Big(\frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}\Big) = \frac{1}{R^2}\Big(R\big(-\cos t\,\mathbf{i} - \sin t\,\mathbf{j}\big)\Big)\,.$$
This means that a circle of radius $R$ has
$$\text{curvature} = \bigg\|\frac{1}{R^2}\Big(R\big(-\cos t\,\mathbf{i} - \sin t\,\mathbf{j}\big)\Big)\bigg\| = \frac{1}{R}\,,$$
exactly as we suspected all along!
In practice, parametrizing a curve by arc length often isn't convenient - what we really want are expressions for $\kappa(t)$ and $\mathbf{a}(t)$ in terms of $t$, avoiding mention of arc length. There are several such results whose proof uses lots of the ideas we've developed so far: since
$$\frac{ds}{dt} = \|\mathbf{r}'(t)\|\,, \qquad \frac{d\mathbf{T}}{ds} = \kappa(t)\,\mathbf{N}(t)\,, \qquad \kappa(t) = \bigg\|\frac{d\mathbf{T}}{ds}\bigg\|\,, \qquad (*)$$
the Chain Rule gives
$$\mathbf{T}'(t) = \frac{d\mathbf{T}}{dt} = \frac{ds}{dt}\,\frac{d\mathbf{T}}{ds} = \|\mathbf{r}'(t)\|\,\frac{d\mathbf{T}}{ds}\,,$$
so that by the second of the results in $(*)$
$$\mathbf{T}'(t) = \|\mathbf{r}'(t)\|\,\kappa(t)\,\mathbf{N}(t)\,.$$
But $\mathbf{r}'(t) = \|\mathbf{r}'(t)\|\,\mathbf{T}(t)$, so by the Product Rule and this last result,
$$\mathbf{r}''(t) = \frac{d}{dt}\Big(\|\mathbf{r}'(t)\|\,\mathbf{T}(t)\Big) = \frac{d\|\mathbf{r}'(t)\|}{dt}\,\mathbf{T}(t) + \|\mathbf{r}'(t)\|\,\mathbf{T}'(t) = \frac{d\|\mathbf{r}'(t)\|}{dt}\,\mathbf{T}(t) + \kappa(t)\,\|\mathbf{r}'(t)\|^2\,\mathbf{N}(t)\,. \qquad (**)$$
Thus
Fundamental Result I: for an object with position function $\mathbf{r}(t)$, the acceleration at the point $\mathbf{r}(t)$ decomposes into tangential and normal components
$$\mathbf{a}(t) = v'(t)\,\mathbf{T}(t) + \kappa(t)\,v(t)^2\,\mathbf{N}(t)$$
where the speed $v(t) = \|\mathbf{r}'(t)\|$ and acceleration $\mathbf{a}(t) = \mathbf{r}''(t)$.
Consequently, for an object moving in space the component of acceleration tangential to the trajectory of motion is the change in speed, while the normal component depends on curvature, exactly as we experience when leaving a freeway on a circular off-ramp because the curvature of a circle increases as the radius decreases as we established just a few lines ago! Notice
1. For a straight line $\kappa(t) = 0$, so $\mathbf{a}(t) = v'(t)\,\mathbf{T}(t)$. If the object is moving in a straight line the only acceleration comes from the rate of change of speed. The acceleration vector then lies in the tangential direction.
2. If the object is moving with constant speed along a curved path, then $dv/dt = 0$, so there is no tangential component of acceleration. The acceleration vector $\mathbf{a}(t) = \kappa(t)\,v(t)^2\,\mathbf{N}(t)$ lies in the normal direction.
Motion in general will combine the tangential and normal acceleration.
On the other hand, if we take the cross product of $\mathbf{r}'(t)$ with $\mathbf{r}''(t)$ and use $(**)$, we get
$$\mathbf{r}'(t)\times\mathbf{r}''(t) = \|\mathbf{r}'(t)\|\,\mathbf{T}(t)\times\Big(\frac{d\|\mathbf{r}'(t)\|}{dt}\,\mathbf{T}(t) + \kappa(t)\,\|\mathbf{r}'(t)\|^2\,\mathbf{N}(t)\Big) = \kappa(t)\,\|\mathbf{r}'(t)\|^3\,\mathbf{B}(t)\,,$$
because $\mathbf{T}(t)\times\mathbf{T}(t) = \mathbf{0}$. Since $\mathbf{B}(t)$ has unit length, this gives
Fundamental Result II: for a space curve $C$ parametrized by $\mathbf{r}(t)$, the curvature at the point $\mathbf{r}(t)$ is
$$\kappa(t) = \frac{\|\mathbf{r}'(t)\times\mathbf{r}''(t)\|}{\|\mathbf{r}'(t)\|^3}\,;$$
in other words, the curvature can be expressed in terms of velocity and acceleration.
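Fundamental Result II is easy to try out on the helix from Example 4: with $\mathbf{r}'(t) = (-\sin t, \cos t, 1)$ and $\mathbf{r}''(t) = (-\cos t, -\sin t, 0)$, the formula gives the constant $\kappa = \sqrt{2}/(\sqrt{2})^3 = 1/2$. A short illustrative sketch:

```python
# Curvature via kappa = ||r' x r''|| / ||r'||^3 for the helix
# r(t) = (cos t, sin t, t); the result should be 1/2 for every t.
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(sum(c * c for c in u))

def curvature(t):
    r1 = (-math.sin(t), math.cos(t), 1.0)   # r'(t)
    r2 = (-math.cos(t), -math.sin(t), 0.0)  # r''(t)
    return norm(cross(r1, r2)) / norm(r1) ** 3

print(curvature(0.0), curvature(1.3))  # both close to 0.5
```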
Kepler's Laws: given the enormous impact of Kepler's Laws of planetary motion (circa 1609) and Newton's mathematical derivation of them in 1687, it's worth seeing what they say.
Kepler arrived at his three laws by the first example of 'data-mining'. He took the detailed astronomical observations made by Tycho Brahe over a period of many years and extracted the Laws from this 'data-set'. By contrast, Newton invented calculus, then formulated his three Laws of Motion, and finally arrived at what we would now call a differential equation from which he derived Kepler's Laws.
Recall that an ellipse is the set of all points $P$ such that the sum
$$\text{dist}\{P, F_1\} + \text{dist}\{P, F_2\}$$
of the distances from foci $F_1, F_2$ is constant. As shown to the right, take one of these foci as origin and place the sun there; then parametrize the orbit of a planet by $\mathbf{r}(t)$.
Kepler's Laws:
The orbit of a planet is an ellipse with the sun at one focus.
The position vector $\mathbf{r}(t)$ from the sun to the planet sweeps out equal areas in equal times.
The square of the period of revolution of the planet about the sun is proportional to the cube of the length of the semi-major axis of its orbit.
The blue and pink regions above have equal area to illustrate the Second Law, for instance. On the other hand, in vector form Newton's Second Law of Motion says that $\mathbf{F} = m\mathbf{a}$ where $\mathbf{F}$ is the net force vector acting on an object and $\mathbf{a}$ is its acceleration. In the case of planetary motion, Newton's Law of Gravitation says that the sun attracts the planet with a gravitational force of magnitude $GMm/\|\mathbf{r}(t)\|^2$ in the direction of the vector $-\mathbf{r}(t)$ where $\mathbf{r}(t)$ is shown above and $G, m, M$ are various constants like mass. Thus
$$\mathbf{F}(\mathbf{r}(t)) = \frac{GMm}{\|\mathbf{r}(t)\|^2}\Big(\!-\frac{\mathbf{r}(t)}{\|\mathbf{r}(t)\|}\Big) = -\frac{GMm}{\|\mathbf{r}(t)\|^3}\,\mathbf{r}(t)\,.$$
But by Newton's Second Law of Motion, $\mathbf{F}(\mathbf{r}(t)) = m\,\mathbf{r}''(t)$. So he arrived at the differential equation
$$\mathbf{r}''(t) = -\frac{\text{const}}{\|\mathbf{r}(t)\|^3}\,\mathbf{r}(t)\,.$$
By solving this equation for $\mathbf{r}(t)$, we, like Newton, can derive each of Kepler's Laws. In fact we have already developed most of the math needed. Amazing!
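The Second Law can even be seen numerically without solving the differential equation. The sketch below (in made-up units, with all constants set to 1 - an assumption for illustration only) integrates $\mathbf{r}'' = -\mathbf{r}/\|\mathbf{r}\|^3$ with a simple leapfrog scheme and watches the area swept per unit time, $\tfrac{1}{2}\|\mathbf{r}\times\mathbf{v}\|$, stay constant along the orbit:

```python
# Integrate r'' = -r/||r||^3 in the plane (leapfrog / velocity Verlet) and
# record the sweep rate (1/2)|x v_y - y v_x| after each step.
def sweep_rates(steps=5000, dt=0.002):
    x, y = 1.0, 0.0          # initial position
    vx, vy = 0.0, 1.2        # initial velocity (gives a bound, elliptical orbit)
    rates = []
    for _ in range(steps):
        d3 = (x * x + y * y) ** 1.5
        vx += 0.5 * dt * (-x / d3)   # half kick
        vy += 0.5 * dt * (-y / d3)
        x += dt * vx                 # drift
        y += dt * vy
        d3 = (x * x + y * y) ** 1.5
        vx += 0.5 * dt * (-x / d3)   # half kick
        vy += 0.5 * dt * (-y / d3)
        rates.append(0.5 * abs(x * vy - y * vx))
    return rates

rates = sweep_rates()
print(min(rates), max(rates))  # nearly equal: equal areas in equal times
```

The leapfrog scheme is chosen because it respects the rotational symmetry of a central force, so the sweep rate is conserved essentially to rounding error.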
VECTOR FIELDS
In many situations, both theoretical and applied, it is important to be able to deal with vector functions defined on a region in $\mathbb{R}^2$ whose values are vectors in $\mathbb{R}^2$: in the same way, we will often deal with vector functions defined on a region in $\mathbb{R}^3$ whose values are vectors in $\mathbb{R}^3$ as well as in higher dimensions too. The velocity of a fluid flowing in $3$-space is a natural example of a vector field in $3$-space, for instance. Formally,
Definition: a Vector Field is a function $\mathbf{F} : U \subseteq \mathbb{R}^n \to \mathbb{R}^n$ such that $\mathbf{F}(x_1, x_2, \ldots, x_n)$ is a vector in $\mathbb{R}^n$ for each $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ in $U$.
When $n = 2$, a vector field has component scalar fields $F_1$ and $F_2$ so that
$$\mathbf{F}(x, y) = F_1(x, y)\,\mathbf{i} + F_2(x, y)\,\mathbf{j}\,;$$
for $n = 3$ there are three such components:
$$\mathbf{F}(x, y, z) = F_1(x, y, z)\,\mathbf{i} + F_2(x, y, z)\,\mathbf{j} + F_3(x, y, z)\,\mathbf{k}\,;$$
and there will be $n$ components for $\mathbb{R}^n$. If each component is a $C^k$-function, then $\mathbf{F}$ is said to be of class $C^k$. We'll always assume $k \ge 1$ without mentioning it specifically.
We've met such functions already:
if $z = f(x, y)$ is temperature at $(x, y)$, then $\mathbf{F}(x, y) = (\nabla f)(x, y)$ is the temperature gradient,
if $z = f(x, y)$ is pressure at $(x, y)$, then $\mathbf{F}(x, y) = (\nabla f)(x, y)$ is the wind velocity.
With an eye on physics, a vector field $\mathbf{F} = \nabla f$ is called a Gradient vector field, and $f$ is said to be a Potential function for $\mathbf{F}$; the level curves for $f$ are then often called equi-potential lines or curves. As we shall see shortly, such gradient vector fields are particularly important examples of vector fields.
Important: not every vector field is a Gradient vector field, however. For if
$$\mathbf{F}(x, y, z) = F_1(x, y, z)\,\mathbf{i} + F_2(x, y, z)\,\mathbf{j} + F_3(x, y, z)\,\mathbf{k} = \nabla f(x, y, z)$$
has $C^2$ components, then by the equality for mixed partials,
$$\frac{\partial F_1}{\partial y} = \frac{\partial}{\partial y}\Big(\frac{\partial f}{\partial x}\Big) = \frac{\partial}{\partial x}\Big(\frac{\partial f}{\partial y}\Big) = \frac{\partial F_2}{\partial x}\,,$$
and so on. In other words,
$$\mathbf{F} = \nabla f \implies \frac{\partial F_1}{\partial y} = \frac{\partial F_2}{\partial x}\,, \quad \frac{\partial F_2}{\partial z} = \frac{\partial F_3}{\partial y}\,, \quad \frac{\partial F_3}{\partial x} = \frac{\partial F_1}{\partial z}\,.$$
There is a corresponding criterion for a vector field $\mathbf{F} : U \subseteq \mathbb{R}^2 \to \mathbb{R}^2$; what would it be?
Drawing the graph of a vector field is hardly possible because we can't draw an arrow at each point $\mathbf{x}$, even in the case of points $(x, y)$ in the plane. So we draw the vectors at a representative sample of points and indicate their length on a comparative basis
only, i.e., the same scalar multiple of each vector is drawn. In the case $n = 2$ three basic examples are
$$\text{Radial: } \mathbf{F} = x\,\mathbf{i} + y\,\mathbf{j}\,, \qquad \text{Rotary: } \mathbf{F} = y\,\mathbf{i} - x\,\mathbf{j}\,, \qquad \text{Shear: } \mathbf{F} = 0\,\mathbf{i} + x\,\mathbf{j}\,.$$
If you know about matrices and matrix multiplication, then each $2\times 2$ matrix $A$ determines a linear transformation $T_A : \mathbf{x} \to A\mathbf{x}$ from $\mathbb{R}^2$ to $\mathbb{R}^2$; thus to each $2\times 2$ matrix $A$ corresponds a vector field
$$\mathbf{F}_A(\mathbf{x}) = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = (ax + by)\,\mathbf{i} + (cx + dy)\,\mathbf{j} : \mathbb{R}^2 \to \mathbb{R}^2\,.$$
In other words, each linear transformation of the plane (or of $3$-space or even of $\mathbb{R}^n$, more generally,) is a vector field. Can you see which matrices would determine the radial, rotary, and shear vector fields given earlier?
Problem 1. for what $\theta$ does a rotation matrix
$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
determine a rotary vector field $\mathbf{x} \to A\mathbf{x}$?
Problem 2. under what condition on a matrix
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$
is the vector field $\mathbf{x} \to A\mathbf{x}$ a gradient vector field?
More generally, any vector field of the form
$$\mathbf{F}(\mathbf{x}) = \text{const.}\,\frac{\mathbf{x}}{\|\mathbf{x}\|^p} = \text{const}\Big(\frac{x}{\|\mathbf{x}\|^p}\,\mathbf{i} + \frac{y}{\|\mathbf{x}\|^p}\,\mathbf{j} + \frac{z}{\|\mathbf{x}\|^p}\,\mathbf{k}\Big)\,, \qquad \mathbf{x} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}\,,$$
with $p$ a real number, is said to be a radial vector field because the vectors point away from or towards the origin. The Gravitational force exerted by an object placed at the origin in $\mathbb{R}^3$ is an example with $p = 3$, for instance. Since
$$\mathbf{F}(\mathbf{x}) = \text{const.}\,\frac{\mathbf{x}}{\|\mathbf{x}\|^p} = \nabla\Big(-\frac{\text{const}}{(p-2)\,\|\mathbf{x}\|^{p-2}}\Big)\,,$$
each such field is a Gradient vector field. Can you see how to write this vector field in terms of a matrix mapping?
On the other hand, the angular velocity
$$\mathbf{F}(\mathbf{x}) = \mathbf{x}\times\mathbf{a}\,, \qquad \mathbf{x} = (x, y, z) \text{ in } \mathbb{R}^3\,,$$
of a rotating object defines a rotary vector field in $\mathbb{R}^3$ for each fixed choice of vector $\mathbf{a}$ in $\mathbb{R}^3$. Here $\mathbf{F}(x, y, z)$ is a vector which lies in a plane having $\mathbf{a}$ as normal, and the vectors rotate around this normal. In fact, the earlier example of $\mathbf{F} = y\,\mathbf{i} - x\,\mathbf{j}$ is of this type with $\mathbf{a} = \mathbf{k}$ because
$$\mathbf{x}\times\mathbf{k} = (x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k})\times\mathbf{k} = y\,\mathbf{i} - x\,\mathbf{j}\,.$$
For this rotary field
$$\mathbf{x}\times\mathbf{k} = y\,\mathbf{i} - x\,\mathbf{j} = F_1(x, y)\,\mathbf{i} + F_2(x, y)\,\mathbf{j} \implies \frac{\partial F_1}{\partial y} - \frac{\partial F_2}{\partial x} = 2 \ne 0\,,$$
so it is NOT a Gradient vector field. Another example of a rotary vector field in $\mathbb{R}^2$ is
$$\mathbf{F}(x, y) = \frac{y}{x^2 + y^2}\,\mathbf{i} - \frac{x}{x^2 + y^2}\,\mathbf{j}\,,$$
which approximates the planar part of the velocity of water draining out of a hole in the bottom of a tub of water.
To appreciate why the notion of Gradient vector field will become important, let's anticipate some integration we'll do shortly. In one variable we integrated a function over an interval in the number line: $I = \int_a^b g(t)\,dt$, so that if $g(t) \equiv 1$ the integral was simply the length, $b - a$, of the interval of integration. The crucial result linking integration and differentiation was the Fundamental Theorem of Calculus which can be formulated as
$$\int_a^b g(t)\,dt = f(b) - f(a)\,, \qquad g(x) = f'(x)\,.$$
But in the plane or $3$-space we can think of integrating over a curve, not an interval: if $C$ is a curve in $\mathbb{R}^3$ parametrized by $\mathbf{r}(t)$, $a \le t \le b$, then, as part of the theory of multi-variable integration we'll develop soon, the so-called path integral
$$I = \int_a^b g(\mathbf{r}(t))\,\|\mathbf{r}'(t)\|\,dt$$
of a scalar-valued function $g$ over $C$ will be introduced. Notice
if $C$ is a straight line given by $\mathbf{r}(t) = \mathbf{u} + t\,\mathbf{v}$, then $\mathbf{r}'(t) = \mathbf{v}$ and $\|\mathbf{r}'(t)\| = \text{const}$, so a path integral over a line reduces to the usual integral over an interval,
if $g(\mathbf{r}(t)) \equiv 1$, then this path integral becomes the integral obtained earlier expressing the arc length of $C$ between $\mathbf{r}(a)$ and $\mathbf{r}(b)$.
Now suppose $\mathbf{F}$ is a vector field, say a force field, and let $\mathbf{T}(t) = \mathbf{r}'(t)/\|\mathbf{r}'(t)\|$ be the unit tangent vector to $C$ at $\mathbf{r}(t)$. Then the component of $\mathbf{F}$ tangential to $C$ at $\mathbf{r}(t)$ is $\mathbf{F}(\mathbf{r}(t))\cdot\mathbf{T}(t)$, and we'll see that the work done in moving a particle from $\mathbf{r}(a)$ to $\mathbf{r}(b)$ along $C$ against $\mathbf{F}$ is given by the integral
$$I = \int_a^b \mathbf{F}(\mathbf{r}(t))\cdot\mathbf{T}(t)\,\|\mathbf{r}'(t)\|\,dt = \int_a^b \mathbf{F}(\mathbf{r}(t))\cdot\mathbf{r}'(t)\,dt\,,$$
an integral that's called a line integral. The crucial point is that if $\mathbf{F}$ is a gradient vector field, $\mathbf{F} = \nabla f$, then by the Chain Rule,
$$\mathbf{F}(\mathbf{r}(t))\cdot\mathbf{r}'(t) = (\nabla f)(\mathbf{r}(t))\cdot\mathbf{r}'(t) = \frac{d}{dt}f(\mathbf{r}(t))\,,$$
so that
$$\int_a^b \mathbf{F}(\mathbf{r}(t))\cdot\mathbf{r}'(t)\,dt = \int_a^b \frac{d}{dt}f(\mathbf{r}(t))\,dt = f(\mathbf{r}(b)) - f(\mathbf{r}(a))\,,$$
by the Fundamental Theorem of calculus for one variable. With more care in integration we thus get the first of the important Fundamental theorems of Calculus in multi-variable calculus:
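The key computation just made - that a line integral of a gradient field depends only on the endpoints - can be seen numerically. In the sketch below the potential $f(x, y) = x^2 y$ and the quarter-circle path are arbitrary choices for illustration:

```python
# Approximate the line integral of F = grad f, f(x, y) = x^2 y, along an
# arc of the unit circle, and compare with f(endpoint) - f(start).
import math

def f(x, y):
    return x * x * y

def F(x, y):
    return (2 * x * y, x * x)   # grad f

def line_integral(r, rprime, a, b, n=20000):
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h        # midpoint rule
        Fx, Fy = F(*r(t))
        dx, dy = rprime(t)
        total += (Fx * dx + Fy * dy) * h   # F(r(t)) . r'(t) dt
    return total

r = lambda t: (math.cos(t), math.sin(t))
rp = lambda t: (-math.sin(t), math.cos(t))
I = line_integral(r, rp, 0.0, math.pi / 3)
print(I, f(*r(math.pi / 3)) - f(*r(0.0)))  # the two numbers agree
```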
Fundamental Theorem of Line Integrals: if $C$ is a curve parametrized by $\mathbf{r}(t)$ and $\mathbf{F} = \nabla f$ is a gradient vector field, then the line integral of $\mathbf{F}$ over $C$ between points $\mathbf{r}(a)$ and $\mathbf{r}(b)$ on $C$ is given by
$$\int_a^b \mathbf{F}(\mathbf{r}(t))\cdot\mathbf{r}'(t)\,dt = f(\mathbf{r}(b)) - f(\mathbf{r}(a))\,.$$
So clearly we need to be able to compute functions $f(x, y)$ from a given vector field $\mathbf{F}$. This is a simple matter of integration. If we know already that $\mathbf{F} = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ is a gradient vector field, and need to find a potential function $z = f(x, y)$ so that $\mathbf{F} = \nabla f$, set
$$f(x, y) = \int P(x, y)\,dx + g(y)\,,$$
with $g(y)$ an arbitrary function of $y$. By the Fundamental Theorem of calculus, $\frac{\partial f}{\partial x} = P(x, y)$. But if $\mathbf{F}$ is conservative,
$$Q(x, y) = \frac{\partial f}{\partial y} = \int \frac{\partial P}{\partial y}\,dx + g'(y) = \int \frac{\partial Q}{\partial x}\,dx + g'(y) = Q(x, y) + g'(y)\,.$$
Thus $g'(y) = 0$, so $g(y) = \text{const.} = K$, say. Consequently, $f(x, y)$ is determined up to an arbitrary constant $K$ by
$$f(x, y) = \int P(x, y)\,dx + K\,.$$
Stream-Lines: a typical example of a vector field is one whose value $\mathbf{F}(x, y)$ at $(x, y)$ is the velocity of, say, water flowing around objects in the plane, laminar flow as it is often called. But how does knowing velocity give us position? For instance, the captain of the Titanic might have thought about determining the future location of an iceberg once it is spotted in the North Atlantic at a point $\mathbf{r}_0$, assuming he knew the velocity vector field $\mathbf{F}(x, y)$ of the ocean currents there. Again it's really a problem in integration.
Since the vector $\mathbf{F}(\mathbf{r}(t))$ is the velocity of the iceberg at position $\mathbf{r}(t)$, then the position function of the iceberg is the unique solution to the initial value differential equation
$$\mathbf{r}'(t) = \mathbf{v}(t) = \mathbf{F}(\mathbf{r}(t))\,, \qquad \mathbf{r}(0) = \mathbf{r}_0\,.$$
This suggests the following definition.
Definition: a StreamLine for a vector field $\mathbf{F} : U \subseteq \mathbb{R}^n \to \mathbb{R}^n$ is a curve parametrized by $\mathbf{r}(t)$ such that
$$\mathbf{r}'(t) = \mathbf{F}(\mathbf{r}(t))\,,$$
i.e., $\mathbf{F}(\mathbf{r}(t))$ is the tangent vector to the curve at $\mathbf{r}(t)$. The Flow of a vector field is the family of all its streamlines. Frequently, streamlines are referred to as Flow Lines, or Integral Curves.
The Flow for the vector field
$$\mathbf{F}(x, y) = \cos(x^2 + y)\,\mathbf{i} + (1 + x - y^2)\,\mathbf{j}$$
is shown to the right. Notice that:

- the red and green streamlines all end at one point, a sink, a Bermuda triangle in the North Atlantic perhaps;
- the green streamline starts at a source because all streamlines flow out, not into or through, that point.

The orange streamline just looks safer! Now if only that Titanic captain had taken calculus!
Finally, let's study the streamlines for the gradient vector field $\mathbf{F}(x,y) = (\nabla f)(x,y)$ starting with a potential function $z = f(x,y)$. Suppose a curve $C$ in $\mathbb{R}^2$ parametrized by $\mathbf{r}(t)$ is a streamline for $\mathbf{F}$. Then $\mathbf{r}'(t_0) = \mathbf{F}(\mathbf{r}(t_0))$ at the point $\mathbf{r}(t_0)$ on $C$; in other words, $\mathbf{F}(\mathbf{r}(t_0))$ is tangent to $C$ at $\mathbf{r}(t_0)$. Since $(\nabla f)(x,y)$ is perpendicular to the level curve $f(x,y) = k$ through any point $(x,y)$, the tangent vector to the streamline for $\mathbf{F}$ through $\mathbf{r}(t_0)$ will be perpendicular to the level curve through $\mathbf{r}(t_0)$. Thus the streamlines for a gradient vector field $\mathbf{F} = \nabla f$ are perpendicular to the level curves of $f$.

The radial vector field $\mathbf{F}(x,y) = x\,\mathbf{i} + y\,\mathbf{j}$ defined earlier exhibits this perpendicularity: for $f(x,y) = \frac{1}{2}(x^2 + y^2)$, $\mathbf{F}(x,y) = (\nabla f)(x,y)$, while the level curves $f(x,y) = \frac{1}{2}(x^2 + y^2) = k$ are circles centered at the origin. On the other hand, $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ will be a streamline for $\mathbf{F}$ when

$$\mathbf{r}'(t) \;=\; x'(t)\,\mathbf{i} + y'(t)\,\mathbf{j} \;=\; x(t)\,\mathbf{i} + y(t)\,\mathbf{j} \;=\; \mathbf{r}(t)\,,$$

i.e., when $\mathbf{r}(t) = (a\,\mathbf{i} + b\,\mathbf{j})\,e^t$. Thus the streamlines are radial lines (as is clear geometrically), and these will be perpendicular to circles centered at the origin. There is a corresponding result for hyperbolas.

Example 1: find the streamlines for the vector field

$$\mathbf{F}(x,y) \;=\; x\,\mathbf{i} - y\,\mathbf{j}\,.$$

Solution: we have to find $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ so that

$$x'(t)\,\mathbf{i} + y'(t)\,\mathbf{j} \;=\; \mathbf{F}(\mathbf{r}(t)) \;=\; x(t)\,\mathbf{i} - y(t)\,\mathbf{j}\,,$$

i.e., $x'(t) = x(t)$ and $y'(t) = -y(t)$. Thus the streamlines are

$$\mathbf{r}(t) \;=\; a\,e^{t}\,\mathbf{i} + b\,e^{-t}\,\mathbf{j}\,.$$

These parametrize the hyperbolas $xy = ab$. On the other hand,

$$\mathbf{F}(x,y) \;=\; x\,\mathbf{i} - y\,\mathbf{j} \;=\; (\nabla f)(x,y)\,, \qquad \text{where} \quad f(x,y) \;=\; \frac{x^2 - y^2}{2}\,.$$

Thus the level curves for $f$ are the hyperbolas $x^2 - y^2 = k$, which are perpendicular to the hyperbolas $xy = ab$.
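Example 1's streamlines can also be checked numerically: integrating the streamline equation $\mathbf{r}'(t) = \mathbf{F}(\mathbf{r}(t))$ with a classical fourth-order Runge-Kutta step should keep $x(t)\,y(t)$ equal to $ab$ along the computed curve. A minimal Python sketch (the step size and starting point are arbitrary choices):

```python
# Integrate the streamline ODE r'(t) = F(r(t)) for F(x, y) = x i - y j
# and check that the computed trajectory stays on the hyperbola xy = ab.

def F(x, y):
    return (x, -y)

def rk4_step(x, y, h):
    # one classical 4th-order Runge-Kutta step
    k1x, k1y = F(x, y)
    k2x, k2y = F(x + h * k1x / 2, y + h * k1y / 2)
    k3x, k3y = F(x + h * k2x / 2, y + h * k2y / 2)
    k4x, k4y = F(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)

def streamline_end(a, b, h=0.01, steps=500):
    x, y = a, b                  # r(0) = a i + b j
    for _ in range(steps):
        x, y = rk4_step(x, y, h)
    return x, y

x, y = streamline_end(1.0, 2.0)
# the exact streamline is (e^t, 2 e^{-t}), so x * y should stay equal to 2
print(x * y)
```

Here the exact solution happens to be known, but the same check works for any field whose streamlines carry a conserved quantity.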
DIVERGENCE and CURL

The 'cleanest' way of introducing the Divergence and Curl of a vector field is to give a coordinate definition. From now on we'll always assume that we are in $\mathbb{R}^3$ and continue to assume that all vector fields are at least $C^1$, i.e., the components $F_1,\,F_2,\,F_3$ of $\mathbf{F}$ have continuous derivatives of order at least 1.

The Divergence of a vector field $\mathbf{F} = F_1\,\mathbf{i} + F_2\,\mathbf{j} + F_3\,\mathbf{k}$ is defined by

$$\operatorname{div}\mathbf{F} \;=\; \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}\,.$$

In more sophisticated language, if $\nabla = \mathbf{i}\,\dfrac{\partial}{\partial x} + \mathbf{j}\,\dfrac{\partial}{\partial y} + \mathbf{k}\,\dfrac{\partial}{\partial z}$ is regarded as a vector of differential 'operators', then $\operatorname{div}\mathbf{F}$ is just the dot product

$$\operatorname{div}\mathbf{F} \;=\; \nabla\cdot\mathbf{F} \;=\; \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z}\,,$$

emphasizing that divergence creates a scalar function out of a vector field. This still gives little, if any, meaning to the notion of divergence, however. Patience!

The Curl of a vector field $\mathbf{F} = F_1\,\mathbf{i} + F_2\,\mathbf{j} + F_3\,\mathbf{k}$ is defined by

$$\operatorname{curl}\mathbf{F} \;=\; \Bigl(\frac{\partial F_3}{\partial y} - \frac{\partial F_2}{\partial z}\Bigr)\mathbf{i} + \Bigl(\frac{\partial F_1}{\partial z} - \frac{\partial F_3}{\partial x}\Bigr)\mathbf{j} + \Bigl(\frac{\partial F_2}{\partial x} - \frac{\partial F_1}{\partial y}\Bigr)\mathbf{k}\,.$$

A calculation shows that in operator language

$$\operatorname{curl}\mathbf{F} \;=\; \nabla\times\mathbf{F} \;=\; \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_1 & F_2 & F_3 \end{vmatrix}\,,$$

emphasizing that curl creates a vector function out of a vector field. Once more the coordinate definition provides no motivation for the notion. Nonetheless, as a check on your understanding of these algebraic definitions, try answering the next problem, which relates $\operatorname{div}\mathbf{F}$ and $\operatorname{curl}\mathbf{F}$ to properties of $3 \times 3$ matrices.
Problem 1(a): determine $\operatorname{div}\mathbf{F}$ for the vector field

$$\mathbf{F}(x,y,z) \;=\; A\mathbf{x} \;=\; \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

associated with a $3 \times 3$ matrix $A$. Then express $\operatorname{div}\mathbf{F}$ in terms of the TRACE of $A$.

Problem 1(b): determine $\operatorname{curl}\mathbf{F}$ for the vector field $\mathbf{F}(x,y,z) = A\mathbf{x}$ associated with a $3 \times 3$ matrix $A$. For which matrices $A$ is $\operatorname{curl}\mathbf{F} = \mathbf{0}$?
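Before solving Problem 1 by hand, it can be explored numerically: central differences applied to the linear field $\mathbf{F}(\mathbf{x}) = A\mathbf{x}$ recover its divergence and curl essentially exactly. The sample matrix and evaluation point below are arbitrary choices; the experiment suggests $\operatorname{div}\mathbf{F}$ is $a_{11} + a_{22} + a_{33}$ and that $\operatorname{curl}\mathbf{F}$ is built from the antisymmetric part of $A$:

```python
# Central-difference div and curl for the linear field F(x) = A x.

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

def F(p):
    return [sum(A[i][j] * p[j] for j in range(3)) for i in range(3)]

def partial(i, j, p, h=1e-6):
    # dF_i/dx_j at p by central differences (exact for a linear field,
    # up to floating-point rounding)
    q_plus, q_minus = list(p), list(p)
    q_plus[j] += h
    q_minus[j] -= h
    return (F(q_plus)[i] - F(q_minus)[i]) / (2 * h)

p = [0.3, -1.2, 2.5]                     # any sample point
div_F = sum(partial(i, i, p) for i in range(3))
curl_F = [partial(2, 1, p) - partial(1, 2, p),
          partial(0, 2, p) - partial(2, 0, p),
          partial(1, 0, p) - partial(0, 1, p)]

print(div_F)    # trace of A: 1 + 5 + 9 = 15
print(curl_F)   # (a32 - a23, a13 - a31, a21 - a12) = (2, -4, 2)
```

Because the components are linear, the answers do not depend on the point $p$ chosen, which is itself a hint toward the closed-form answers.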
The full meaning of $\operatorname{div}\mathbf{F}$ and $\operatorname{curl}\mathbf{F}$, however, must await an understanding of integrals of scalar-valued and vector-valued functions of several variables. This is what vector calculus is all about! Granted some imprecision, let's see how the Fundamental Theorem of Calculus might generalize from the single variable case for scalar-valued functions $y = f(x)$ defined on an interval $[a,b]$ to vector fields and double and triple integrals over surfaces and solids.

Just as a curve $C$, a one-dimensional geometric object, was parametrized by a function of one variable, so a surface $S$, a two-dimensional geometric object, will be parametrized by a vector function such as $\Phi : D \subseteq \mathbb{R}^2 \to \mathbb{R}^3$ where $D$ is a subset of the plane $\mathbb{R}^2$. For simplicity, let's suppose $S$ is the graph of a function $z = f(x,y)$. Then in set-builder notation,

$$S \;=\; \{(x,\,y,\,f(x,y)) : (x,y) \in D\}\,,$$

while in vector function terms, $S = \Phi(D)$ where

$$\Phi(x,y) \;=\; x\,\mathbf{i} + y\,\mathbf{j} + f(x,y)\,\mathbf{k}\,.$$

In addition, as we saw earlier, a normal to $S$ at a point on $S$ is given by the cross product

$$\mathbf{n}(x,y) \;=\; \Phi_x \times \Phi_y \;=\; -\frac{\partial f}{\partial x}\,\mathbf{i} - \frac{\partial f}{\partial y}\,\mathbf{j} + \mathbf{k}\,.$$

An integral of a vector field $\mathbf{F}$ over the graph of $f$ will then be defined by a double integral

$$I \;=\; \iint_D \mathbf{F}(\Phi(x,y))\cdot\mathbf{n}(x,y)\,dx\,dy$$

over the region $D$ in the $xy$-plane. In other words, the surface integral of a vector field $\mathbf{F}$ is the integral of the normal component of $\mathbf{F}$ over $S$, whereas it was the tangential component that was integrated over a curve in the case of a line integral.
We'll want to think of a surface $S$ in $\mathbb{R}^3$ in two ways:

- where $S$ is a surface having boundary the curve $C = \partial S$,
- where $S$ itself is the boundary of a solid $W$ in $\mathbb{R}^3$.

In the first case, $S$ could be a hemi-sphere with boundary a circle, while in the second, $W$ could be a ball with boundary $\partial W = S$, a sphere. Now suppose $\partial S$ is parametrized by a function $\mathbf{r} : [a,b] \to \mathbb{R}^3$. Then, remarkably enough, after lots of preliminary work to make sense of all these ideas we'll see that for a vector field $\mathbf{F}$:

I: $\displaystyle \iint_D \operatorname{curl}\mathbf{F}(\Phi(x,y))\cdot\mathbf{n}(x,y)\,dx\,dy \;=\; \int_a^b \mathbf{F}(\mathbf{r}(t))\cdot\mathbf{r}'(t)\,dt\,,$

where the integral on the left in I is a double integral over the plane region $D$ in $\mathbb{R}^2$ ($\operatorname{curl}\mathbf{F}$ is vector-valued, remember!),

II: $\displaystyle \iiint_W (\operatorname{div}\mathbf{F})(x,y,z)\,dx\,dy\,dz \;=\; \iint_D \mathbf{F}(\Phi(x,y))\cdot\mathbf{n}(x,y)\,dx\,dy\,,$

where the integral on the left in II is the usual triple integral over a solid $W$ in $\mathbb{R}^3$ ($\operatorname{div}\mathbf{F}$ is scalar-valued, remember!). Consequently, if we include also the Line integral result

III: $\displaystyle \int_a^b (\nabla f)(\mathbf{r}(t))\cdot\mathbf{r}'(t)\,dt \;=\; f(\mathbf{r}(b)) - f(\mathbf{r}(a))\,,$

from last time, then we have a beautiful calculus interpretation of all three operations $\nabla$, $\operatorname{curl}$ and $\operatorname{div}$ of differentiation. These are indeed the Fundamental Theorems of Calculus associated with curves, surfaces, and solids in $\mathbb{R}^3$; they, and the physical theories where they were created, are among the most important of the basic developments in the mathematical sciences in the 19th-century. You see now why we've spent so much time on vectors, curves and surfaces!
The upshot of these versions of the Fundamental Theorem of Calculus will be that the divergence of a vector field $\mathbf{F}$ is related to compressive ($\operatorname{div}\mathbf{F} < 0$) or expansive ($\operatorname{div}\mathbf{F} > 0$) motion, while curl is related to rotational motion.

Example 1: the vector field

$$\mathbf{F}(\mathbf{x}) \;=\; \frac{\mathbf{x}}{\|\mathbf{x}\|^p}\,, \qquad \|\mathbf{x}\| \;=\; (x^2 + y^2 + z^2)^{1/2}\,,$$

is radial because $\mathbf{F}(\mathbf{x})$ depends only on $\mathbf{x}$. So near $\mathbf{x} = \mathbf{0}$, $\mathbf{F}(\mathbf{x})$ is either pushing away from the origin or pulling towards the origin. Now

$$\mathbf{F}(\mathbf{x}) \;=\; \frac{x}{\|\mathbf{x}\|^p}\,\mathbf{i} + \frac{y}{\|\mathbf{x}\|^p}\,\mathbf{j} + \frac{z}{\|\mathbf{x}\|^p}\,\mathbf{k}\,.$$

As a result

$$\operatorname{div}\mathbf{F} \;=\; \Bigl(\frac{1}{\|\mathbf{x}\|^p} - \frac{p\,x^2}{\|\mathbf{x}\|^{p+2}}\Bigr) + \Bigl(\frac{1}{\|\mathbf{x}\|^p} - \frac{p\,y^2}{\|\mathbf{x}\|^{p+2}}\Bigr) + \Bigl(\frac{1}{\|\mathbf{x}\|^p} - \frac{p\,z^2}{\|\mathbf{x}\|^{p+2}}\Bigr)\,.$$

Thus

$$\operatorname{div}\mathbf{F}(\mathbf{x}) \;=\; \frac{3 - p}{\|\mathbf{x}\|^p}\,,$$

i.e., $\operatorname{div}\mathbf{F} < 0$ when $p > 3$, and $\operatorname{div}\mathbf{F} > 0$ when $p < 3$.
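The formula $\operatorname{div}\mathbf{F} = (3 - p)/\|\mathbf{x}\|^p$ in Example 1 can be spot-checked with central differences; the exponent $p$ and the sample point below are arbitrary choices:

```python
import math

# Spot-check div F = (3 - p) / ||x||^p for the radial field F = x / ||x||^p.

p = 5.0

def F(x, y, z):
    r = math.sqrt(x * x + y * y + z * z)
    return (x / r**p, y / r**p, z / r**p)

def div_F(x, y, z, h=1e-5):
    # dF1/dx + dF2/dy + dF3/dz by central differences
    return ((F(x + h, y, z)[0] - F(x - h, y, z)[0]) +
            (F(x, y + h, z)[1] - F(x, y - h, z)[1]) +
            (F(x, y, z + h)[2] - F(x, y, z - h)[2])) / (2 * h)

exact = (3 - p) / 3**p          # at a point with ||x|| = 3
print(div_F(1.0, 2.0, 2.0), exact)
```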
Example 2: the vector field

$$\mathbf{F}(\mathbf{x}) \;=\; \mathbf{a} \times \mathbf{x}$$

describes the angular velocity of a body rotating about an axis in the direction of $\mathbf{a}$ with angular speed $\omega = \|\mathbf{a}\|$. If $\mathbf{a} = \omega\,\mathbf{k}$, say, so that the body is rotating about the $z$-axis, then $\mathbf{F}(x,y,z) = \omega(-y\,\mathbf{i} + x\,\mathbf{j})$, and a calculation shows that

$$\operatorname{curl}\mathbf{F} \;=\; \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ -\omega y & \omega x & 0 \end{vmatrix} \;=\; 2\omega\,\mathbf{k} \;=\; 2\mathbf{a}\,.$$

This says that the curl of the velocity vector of a rotating body is a vector whose magnitude is twice the angular speed and whose direction is the axis of rotation.
For vector fields $\mathbf{F}$ associated with fluid flow, $\operatorname{curl}\mathbf{F}$ is connected with vorticity, i.e., with the tendency for elements of the fluid to ''spin''. In mathematical terms, if $\mathbf{F}$ is the velocity of the fluid, then $\operatorname{curl}\mathbf{F}$ is the vorticity of $\mathbf{F}$. At each point $\operatorname{curl}\mathbf{F}$ is a vector whose direction is along the axis of the fluid's rotation; so in two-dimensional flow, the curl of the velocity vector is perpendicular to the plane.

Finally, recall that by the equality of mixed partials, a $C^1$-Gradient vector field

$$\mathbf{F}(\mathbf{x}) \;=\; (\nabla f)(\mathbf{x}) \;=\; F_1(\mathbf{x})\,\mathbf{i} + F_2(\mathbf{x})\,\mathbf{j} + F_3(\mathbf{x})\,\mathbf{k}$$

automatically had the property

$$\frac{\partial F_1}{\partial y} \;=\; \frac{\partial F_2}{\partial x}\,, \qquad \frac{\partial F_2}{\partial z} \;=\; \frac{\partial F_3}{\partial y}\,, \qquad \frac{\partial F_3}{\partial x} \;=\; \frac{\partial F_1}{\partial z}\,.$$

But now this can be interpreted as saying that $\operatorname{curl}\mathbf{F} = \mathbf{0}$. This suggests the following

Definition: a vector field $\mathbf{F}$ is said to be Conservative on a set $U \subseteq \mathbb{R}^3$ when $\operatorname{curl}\mathbf{F} = \mathbf{0}$ on $U$. Every Gradient vector field is Conservative.

But is every Conservative vector field necessarily a Gradient vector field, and even if we knew the answer, what good would this do us? A connection with the Fundamental Theorem of Calculus for Line integrals was mentioned in the previous lecture! So? Well, many of the results you'll learn in any course on Complex analysis rely on this notion, and in the second half of the 20th-century there was extensive research done on vector fields in $\mathbb{R}^n$ satisfying the conditions $\operatorname{div}\mathbf{F} = 0$, $\operatorname{curl}\mathbf{F} = \mathbf{0}$, generalizing complex function theory. Towards the end of the 20th-century it was even realized that this research had strong connections with ideas in signal processing.
General Vector Field: to interpret line integrals for a general $\mathbf{F}$ requires Stokes' theorem to be proved later, but specific cases show the basic ideas.

Example 3: first, let $\mathbf{F}(x,y) = -y\,\mathbf{i} + x\,\mathbf{j}$ be a rotation vector field. Then $\operatorname{curl}\mathbf{F} = 2\,\mathbf{k}$. On the other hand, when $C$ is the circle of radius $a$ centered at the origin and parametrized by $\mathbf{r}(t) = a\cos t\,\mathbf{i} + a\sin t\,\mathbf{j}$, $0 \le t \le 2\pi$,

$$\oint_C \mathbf{F}\cdot d\mathbf{r} \;=\; \int_0^{2\pi} \mathbf{F}(\mathbf{r}(t))\cdot\mathbf{r}'(t)\,dt \;=\; \int_0^{2\pi} a^2\,dt \;=\; 2\pi a^2\,.$$

But $C$ has radius $a$, so the area of the region $D$ inside $C$ is $\pi a^2$. Thus

$$\oint_C \mathbf{F}\cdot d\mathbf{r} \;=\; (\operatorname{curl}\mathbf{F}\cdot\mathbf{k})\,\operatorname{area}(D)\,.$$

This could be pure coincidence - let's try some different cases.

Example 4: let $\mathbf{F}(x,y) = y\,\mathbf{i}$ be a shear vector field. Then for the same circle $C$ as before,

$$\oint_C \mathbf{F}\cdot d\mathbf{r} \;=\; \int_0^{2\pi} (a\sin t)(-a\sin t)\,dt \;=\; -\pi a^2\,.$$

But now $\operatorname{curl}\mathbf{F} = -\mathbf{k}$, and again

$$\oint_C \mathbf{F}\cdot d\mathbf{r} \;=\; (\operatorname{curl}\mathbf{F}\cdot\mathbf{k})\,\operatorname{area}(D)\,.$$

Example 5: let $\mathbf{F}(x,y) = x\,\mathbf{i} + y\,\mathbf{j}$ be a radial vector field. Then for the same circle $C$ as before,

$$\oint_C \mathbf{F}\cdot d\mathbf{r} \;=\; \int_0^{2\pi} \bigl((a\cos t)(-a\sin t) + (a\sin t)(a\cos t)\bigr)dt \;=\; 0\,.$$

But now $\operatorname{curl}\mathbf{F} = \mathbf{0}$, and so once again,

$$\oint_C \mathbf{F}\cdot d\mathbf{r} \;=\; (\operatorname{curl}\mathbf{F}\cdot\mathbf{k})\,\operatorname{area}(D)\,.$$

Example 6: let $\mathbf{F} = \nabla f$ be a gradient vector field. Then by the Fundamental Theorem for Line Integrals,

$$\oint_C \mathbf{F}\cdot d\mathbf{r} \;=\; 0$$

as well as for any closed curve containing the origin. But as we saw earlier, the equality of mixed partials tells us that $\operatorname{curl}\mathbf{F} = \mathbf{0}$. Thus

$$\oint_C \mathbf{F}\cdot d\mathbf{r} \;=\; (\operatorname{curl}\mathbf{F}\cdot\mathbf{k})\,\operatorname{area}(D)\,.$$

These four examples all illustrate the following general result to be proved soon:

Theorem: fix a vector field $\mathbf{F}$ and point $P$ in a plane in $\mathbb{R}^3$. Then, if $C_\rho$ is the circle in this plane of radius $\rho$ and center $P$,

$$\lim_{\rho\to 0}\,\frac{1}{\pi\rho^2}\oint_{C_\rho} \mathbf{F}\cdot d\mathbf{r} \;=\; (\operatorname{curl}\mathbf{F})(P)\cdot\mathbf{n}\,,$$

where $\mathbf{n}$ is the normal to the plane at $P$.

Since a line integral around a closed curve $C$ in a plane is really measuring the circulation of $\mathbf{F}$ around $C$, this result shows that the vector $\operatorname{curl}\mathbf{F}$ is a measure of the 'curling effect', i.e., 'rotating effect' of $\mathbf{F}$ around the normal to the plane. Look at the three basic examples in the $xy$-plane again:

- Rotation: $\mathbf{F} = -y\,\mathbf{i} + x\,\mathbf{j}$, $\operatorname{curl}\mathbf{F} = 2\,\mathbf{k}$;
- Shear: $\mathbf{F} = y\,\mathbf{i}$, $\operatorname{curl}\mathbf{F} = -\mathbf{k}$;
- Radial: $\mathbf{F} = x\,\mathbf{i} + y\,\mathbf{j}$, $\operatorname{curl}\mathbf{F} = \mathbf{0}$.
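The circulation computations above are easy to verify numerically by summing $\mathbf{F}(\mathbf{r}(t))\cdot\mathbf{r}'(t)$ over a uniform grid of parameter values. The radius and sample count below are arbitrary, and the three fields are assumed to be the rotation, shear, and radial fields just discussed:

```python
import math

# Approximate the circulation of F around the circle r(t) = (a cos t, a sin t)
# and compare it with (curl F . k) * area of the enclosed disk.

def circulation(F, a, n=4000):
    total = 0.0
    for k in range(n):
        t = 2 * math.pi * (k + 0.5) / n
        Fx, Fy = F(a * math.cos(t), a * math.sin(t))
        # tangent vector r'(t) = (-a sin t, a cos t)
        total += Fx * (-a * math.sin(t)) + Fy * (a * math.cos(t))
    return total * (2 * math.pi / n)

a = 1.5
area = math.pi * a * a
rotation = lambda x, y: (-y, x)       # curl . k = 2
shear    = lambda x, y: (y, 0.0)      # curl . k = -1
radial   = lambda x, y: (x, y)        # curl . k = 0

print(circulation(rotation, a), 2 * area)
print(circulation(shear, a), -area)
print(circulation(radial, a), 0.0)
```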
DOUBLE INTEGRALS

The fundamental ideas involved in defining, interpreting and evaluating the integral

$$\iint_D f(x,y)\,dx\,dy$$

of a function $z = f(x,y)$ of two variables over a region $D$ in the $xy$-plane are a lot like the ones for the integral of a function $y = f(x)$ of one variable over an interval $[a,b]$ in the $x$-axis.

In one variable the integral was the area under the graph of $y = f(x)$ on $[a,b]$ when $f(x) \ge 0$. This was made precise by defining the value of the integral as the limit

$$\int_a^b f(x)\,dx \;=\; \lim_{n\to\infty}\,\Bigl(\sum_{k=1}^{n} f(x_k^*)\,\Delta x_k\Bigr)$$

of a sum of approximating rectangular areas as shown to the right. Computing the value was done via the Fundamental Theorem of Calculus

$$\int_a^b f(x)\,dx \;=\; F(b) - F(a)\,, \qquad F'(x) = f(x)\,.$$

Since the approximating sum made sense whether or not $f(x) \ge 0$, the limit was then used to define the integral for all $f$.

When $f$ is a function $z = f(x,y)$ of two variables we follow a similar route though the details are a little different. For $z = f(x,y) \ge 0$ and a region $D$ in the $xy$-plane, the Double Integral

$$\iint_D f(x,y)\,dx\,dy$$

of $f$ over $D$ is the volume of the solid under the graph of $f$ and above $D$.
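As in one variable, the double integral over a rectangle can be previewed numerically as a limit of sums of box volumes: a midpoint Riemann sum over an $n \times n$ grid. The integrand below, $f(x,y) = 1 + x^2 - y^2$ over $[-1,1] \times [-1,1]$, is the one whose volume is computed by slicing in Example 2 below, where the exact value turns out to be $4$; the grid size is an arbitrary choice:

```python
# Midpoint Riemann-sum approximation of the double integral of
# f(x, y) = 1 + x^2 - y^2 over the square D = [-1, 1] x [-1, 1].

def f(x, y):
    return 1 + x * x - y * y

def double_integral(f, a, b, c, d, n=200):
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx          # midpoint of the i-th x-strip
        for j in range(n):
            y = c + (j + 0.5) * hy      # midpoint of the j-th y-strip
            total += f(x, y)
    return total * hx * hy

print(double_integral(f, -1, 1, -1, 1))   # exact value is 4
```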
Sometimes the volume of a solid can be computed by geometry without using any calculus. The details suggest how we'll proceed in general, however. As always, slicing will be the key!!

Example 1: find the volume of the solid $W$ below the plane

$$z \;=\; \frac{4}{3}(5 - x)$$

and above the rectangle $D = [2,5] \times [1,3]$ in the $xy$-plane.

Solution: the solid is shown to the right. To find its volume, take a vertical slice for fixed $y$, $1 \le y \le 3$. The trace of the solid on the vertical plane is the same triangle for each $y$. But the triangle has height $4$ and base $3$, so it has area

$$A(y) \;=\; \frac{1}{2}\,\text{base} \times \text{height} \;=\; 6\,.$$

Thus $W$ has

$$\text{volume} \;=\; \text{area of triangle} \times \text{side-length} \;=\; 12 \;=\; \int_1^3 A(y)\,dy\,.$$

This idea of slicing and integrating the area of cross-section to determine the volume of a general solid - the so-called Slice Method - was formulated by Cavalieri and is expressed mathematically in

Cavalieri's Principle: let $W$ be a solid and $P_x$, $a \le x \le b$, be a family of parallel planes such that

- $W$ lies between $P_a$ and $P_b$,
- the area of the cross-sectional slice of $W$ cut by $P_x$ is $A(x)$.

Then

$$\text{volume of } W \;=\; \int_a^b A(x)\,dx\,.$$

In addition to Example 1, you've used this idea before when computing volumes of revolution. Suppose $W$ is created by rotating the graph of $y = f(x)$, $a \le x \le b$, about the $x$-axis. When $P_x$ is a plane perpendicular to the $x$-axis, then the slice of $W$ cut by $P_x$ is a disk of radius $f(x)$. Here $A(x) = \pi f(x)^2$, so we recover the familiar result

$$\text{volume of } W \;=\; \int_a^b \pi f(x)^2\,dx$$

for a volume of revolution. But Cavalieri's Principle does not require the cross-sections to be triangles or disks!
Example 2: find the volume of the solid $W$ under the hyperbolic paraboloid

$$z \;=\; f(x,y) \;=\; 1 + x^2 - y^2$$

and over the square $D = [-1,1] \times [-1,1]$.

Solution: the solid is shown to the right. When $P_x$ is the vertical slice perpendicular to the $x$-axis for fixed $x$ shown in purple, then

$$A(x) \;=\; \int_{-1}^{1} (1 + x^2 - y^2)\,dy \;=\; \Bigl[\,y + x^2 y - \frac{y^3}{3}\,\Bigr]_{-1}^{1} \;=\; \frac{4}{3} + 2x^2\,.$$

But then by using the slider to fill out the solid, Cavalieri's Principle shows that $W$ has

$$\text{volume} \;=\; \int_{-1}^{1} A(x)\,dx \;=\; \int_{-1}^{1}\Bigl(\frac{4}{3} + 2x^2\Bigr)dx \;=\; \Bigl[\,\frac{4x}{3} + \frac{2x^3}{3}\,\Bigr]_{-1}^{1} \;=\; 4\,.$$

What changes if the region $D$ is not a rectangle, but a disk, say? To the right below is the hyperbolic paraboloid

$$z \;=\; 2 + x^2 - y^2$$

lying over a disk $D = \{(x,y) : x^2 + y^2 \le 1\}$ centered at the origin.

In cylindrical coordinates $f$ becomes

$$z \;=\; 2 + r^2\cos 2\theta\,,$$

defined over the rectangular region $0 \le r \le 1$, $0 \le \theta \le 2\pi$ in the $r\theta$-plane. But the surface mesh looks very different from the earlier rectangular mesh defined over the rectangular region in the $xy$-plane. Does it suggest how the plane cross-sections used in Cavalieri's Principle might be modified to compute the volume of the solid below the graph?
We could, of course, still apply Cavalieri's Principle to $f$ written in Cartesian form $z = 2 + x^2 - y^2$ over $D$. We'd find that $W$ has

$$\text{volume} \;=\; \int_{-1}^{1} A(x)\,dx \;=\; \int_{-1}^{1}\Bigl(\int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} (2 + x^2 - y^2)\,dy\Bigr)dx \;=\; 2\pi\,,$$

though the details are quite complicated!

These two cases already show some basic ideas underlying the evaluation of double integrals. If $f(x,y) \ge 0$ and the double integral over a region

$$D \;=\; \{(x,y) : c(x) \le y \le d(x),\ a \le x \le b\,\}$$

in the $xy$-plane is interpreted as the solid under the graph of $f$ and over $D$, then by Cavalieri's Principle,

$$\iint_D f(x,y)\,dx\,dy \;=\; \int_a^b\Bigl(\int_{c(x)}^{d(x)} f(x,y)\,dy\Bigr)dx\,,$$

because by single variable integration the inner integral gives the area of cross-section by slices perpendicular to the $x$-axis. In this case we say a Double Integral has been represented as an Iterated or Repeated Integral, each integral being a single variable integral. On the other hand, the second case is surely suggesting a change of coordinates from rectangular Cartesian coordinates to polar coordinates of some kind - that's a topic we'll come to shortly. But for the moment let's put all our slicing ideas to work on a very famous 2,000 year-old problem whose solution by Archimedes foreshadowed integral calculus 1,600 years later!

Example 3: find the volume of the solid $W$ enclosed by the intersecting cylinders

$$x^2 + y^2 \;=\; 1\,, \qquad y^2 + z^2 \;=\; 1\,,$$

as shown to the right.

Modern computer graphics allow us to visualize the cylinders and their intersection much more clearly than Archimedes ever could, and the problem can be reduced to a standard one appropriate for any calculus course via Cavalieri's Principle. But that by no means diminishes Archimedes' achievement.
Then slice this solid of intersection as shown in the interactive animation to the right.

- Show that the cross-sections perpendicular to the $y$-axis are squares.
- What is the side length of the square of cross-section by the plane $y = a$?
- Use your results to show that the solid enclosed by the cylinders has

$$\text{volume} \;=\; \int_{-1}^{1} 4(1 - y^2)\,dy \;=\; \frac{16}{3}\,.$$

- How would your results change if the cylinders were

$$x^2 + y^2 \;=\; r^2\,, \qquad y^2 + z^2 \;=\; r^2$$

for some fixed value of $r$?

So far all the repeated integrations have started by integrating with respect to $y$, and only then integrating with respect to $x$. But intuitively, the volume of a solid is surely the same whatever the order of slicing. A famous theorem formalizes this idea.
Fubini's Theorem: if $z = f(x,y)$ is a continuous function of $x,\,y$, the double integral

$$\iint_D f(x,y)\,dx\,dy$$

of $f$ over a rectangle $D = [a,b] \times [c,d]$ is equal to the iterated integral, integrating in either order.

Surprisingly enough, examples of discontinuous functions can be constructed where the iterated integral in one order is different from the iterated integral in the opposite order!

Finally, just as in the one variable case, all these double integral ideas can be extended to arbitrary functions, not just positive ones, by allowing signed volumes. In this way the Double Integral

$$\iint_D f(x,y)\,dx\,dy$$

makes sense for a general $f(x,y)$ over a rectangular region $D = [a,b] \times [c,d]$ in the $(x,y)$-plane; to evaluate it, we express it as a repeated integral

$$\iint_D f(x,y)\,dx\,dy \;=\; \int_a^b\Bigl(\int_c^d f(x,y)\,dy\Bigr)dx \;=\; \int_c^d\Bigl(\int_a^b f(x,y)\,dx\Bigr)dy\,,$$

either order of integration being allowed so long as $f(x,y)$ is continuous. The next step will be to replace the rectangular region $D$ of integration by more general regions. Reduction to repeated integration still applies, however, but now the limits of integration are likely to vary.
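Fubini's theorem can be illustrated numerically: summing over a midpoint grid with the inner sum in $x$ or the inner sum in $y$ gives the same answer for a continuous integrand. The integrand and rectangle below are arbitrary choices; the exact value of the integral of $x^2 y$ over $[0,2] \times [0,3]$ is $(8/3)(9/2) = 12$:

```python
# Iterated midpoint sums in both orders for f(x, y) = x^2 y over [0,2] x [0,3].

def f(x, y):
    return x * x * y

def iterated(f, a, b, c, d, n=300, x_inner=True):
    hx, hy = (b - a) / n, (d - c) / n
    xs = [a + (i + 0.5) * hx for i in range(n)]
    ys = [c + (j + 0.5) * hy for j in range(n)]
    if x_inner:      # inner sum over x, outer over y
        return sum(sum(f(x, y) for x in xs) * hx for y in ys) * hy
    else:            # inner sum over y, outer over x
        return sum(sum(f(x, y) for y in ys) * hy for x in xs) * hx

print(iterated(f, 0, 2, 0, 3, x_inner=True))
print(iterated(f, 0, 2, 0, 3, x_inner=False))
```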
MAPPINGS in the PLANE

Changing variables is a very useful technique for simplifying many types of math problems: in the one-dimensional case you met translations $f(x) \to f(x+1)$ and scalings $f(x) \to 2f(3x)$ in high school algebra, for instance. You used them to map one graph, say a parabola $y = f(x) = x^2$ with vertex at the origin, into another parabola $y = f(x+1) = (x+1)^2$ with vertex at $x = -1$, or to change the shape of a parabola. On the other hand, changing variables $x = g(u)$ was a very powerful technique in single variable integration because the one variable Chain Rule could be re-interpreted as

$$\int_a^b f(x)\,dx \;=\; \int_\alpha^\beta f(g(u))\,g'(u)\,du\,, \qquad g : [\alpha,\,\beta] \to [a,\,b]\,.$$

Transformations in higher dimensions, called maps or mappings, play an even more important role in multi-variable calculus. Two such mappings you've (probably) met already for the plane are:

- Polar coordinates: changing coordinates in the plane, $\Phi : \mathbb{R}^2 \to \mathbb{R}^2$,

$$\Phi : (r,\,\theta) \;\to\; (x,\,y)\,, \qquad x = r\cos\theta\,, \quad y = r\sin\theta\,;$$

- Matrix multiplication: each $2 \times 2$ matrix $A$ defines a mapping $\Phi_A$ of the plane, $\mathbf{u} \to \Phi_A(\mathbf{u})$,

$$\Phi_A(\mathbf{u}) \;=\; A\mathbf{u} \;=\; \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}\begin{bmatrix} u \\ v \end{bmatrix} \;=\; \begin{bmatrix} x \\ y \end{bmatrix} \;=\; \mathbf{x}\,, \qquad x = a_{11}u + a_{12}v\,, \quad y = a_{21}u + a_{22}v\,;$$

it is a linear transformation of the plane, meaning it maps lines to lines.

The reason mappings like these are so useful in double integrals comes from their action on particular sets in the plane.
Let's start with a general double integral

$$I \;=\; \iint_D f(x,y)\,dx\,dy$$

over the green domain of integration $D$ in the $xy$-plane to the right. For such a $D$ finding the limits of integration might well be algebraically complicated, or the integration would be algebraically difficult, or both would be.

Experience has shown that the integration would probably be much easier if $D$ were replaced by a rectangle with sides parallel to the coordinate axes. So to replace $D$ with a rectangular region of integration we'll need:

- a mapping $\Phi : \mathbb{R}^2 \to \mathbb{R}^2$ and a rectangle $D^*$ with sides parallel to the axes in the $uv$-plane such that

$$\Phi(u,v) \;=\; (x(u,v),\,y(u,v))\,, \qquad \Phi(D^*) \;=\; D\,;$$

- a 'distortion' function $\dfrac{\partial(x,y)}{\partial(u,v)}$ to replace $g'(u)$ so that

$$\iint_D f(x,y)\,dx\,dy \;=\; \iint_{D^*} f(\Phi(u,v))\,\Bigl|\frac{\partial(x,y)}{\partial(u,v)}\Bigr|\,du\,dv\,.$$

In this case, if $D^* = [a,b] \times [c,d]$, then

$$\iint_D f(x,y)\,dx\,dy \;=\; \int_a^b\Bigl(\int_c^d f(\Phi(u,v))\,\Bigl|\frac{\partial(x,y)}{\partial(u,v)}\Bigr|\,dv\Bigr)du\,.$$

Polar coordinates help when regions of integration $D$ in the $xy$-plane have some radial symmetry because $x = r\cos\theta$, $y = r\sin\theta$ is simply the usual change of coordinates from the $r\theta$-plane to the $xy$-plane.

Example 1: when $D$ is a disk of radius $a$ centered at the origin, as shown to the right, then in $(x,y)$-coordinates

$$D \;=\; \{(x,y) : x^2 + y^2 \le a^2\}\,.$$

On the other hand, in the $r\theta$-plane

$$D^* \;=\; \{(r,\theta) : 0 \le r \le a,\ 0 \le \theta \le 2\pi\}$$

is a rectangle.
Example 2: when $D$ is an annulus centered at the origin between circles of radius $a,\,b$, $a < b$, as shown to the right, then in $(x,y)$-coordinates

$$D \;=\; \{(x,y) : a^2 \le x^2 + y^2 \le b^2\}\,.$$

On the other hand, in the $r\theta$-plane

$$D^* \;=\; \{(r,\theta) : a \le r \le b,\ 0 \le \theta \le 2\pi\}$$

is a rectangle.

Thus in both examples, regions that are radial in $xy$-coordinates are expressed in $(r,\theta)$-coordinates as rectangles $[a,b] \times [c,d]$. By contrast, matrix transformations $\mathbf{u} \to A\mathbf{u}$ enter when $D$ is a parallelogram.

Example 3: if $D$ is the parallelogram in the $xy$-plane to the right having vertices

$$O(0,0)\,, \quad P(2,1)\,, \quad Q(1,4)\,, \quad R(-1,3)\,,$$

find a matrix $A$ and rectangle $D^*$ in the $uv$-plane so that the mapping $\Phi_A : (u,v) \to (x,y)$ defined by

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}\begin{bmatrix} u \\ v \end{bmatrix} \;=\; \begin{bmatrix} x \\ y \end{bmatrix}\,, \qquad x \;=\; a_{11}u + a_{12}v\,, \quad y \;=\; a_{21}u + a_{22}v\,,$$

maps $D^*$ onto $D$; in other words, $\Phi_A(D^*) = D$.
The crucial properties of a matrix transformation $\Phi_A : \mathbb{R}^2 \to \mathbb{R}^2$ are:

- $\Phi_A$ maps lines to lines,
- $\Phi_A$ maps parallel lines to parallel lines,
- $\Phi_A(0,0) = (0,0)$.

(Can you see how to prove these?) So to solve Example 3 we need to find a $2 \times 2$ matrix $A$ such that the mapping $\Phi_A$ takes the usual Cartesian grid in the $uv$-plane to the slanted grid in the $xy$-plane as shown, and maps a rectangle $D^* = [0,b] \times [0,d]$ onto $D$ for some choice of $b,\,d$. But by the point-slope formula, the parallelogram $D = OPQR$ in Example 3 is enclosed by the pairs of parallel lines

$$y \;=\; -3x\,, \quad y \;=\; -3x + 7\,, \qquad 2y \;=\; x\,, \quad 2y \;=\; x + 7\,,$$

so that

$$D \;=\; \{(x,y) : 0 \le 3x + y \le 7\,,\ 0 \le 2y - x \le 7\,\}\,.$$

This suggests setting

$$u \;=\; 3x + y\,, \qquad v \;=\; 2y - x\,, \qquad D^* \;=\; [0,7] \times [0,7]\,.$$

Unfortunately, the mapping

$$(x,y) \;\longrightarrow\; (u,v) \;=\; (3x + y,\ 2y - x)$$

from the $xy$-plane to the $uv$-plane is going the wrong way; in fact, it sends $D$ to $D^*$. We need to reverse the mapping; in other words, its inverse is needed. This means solving for $x,\,y$ as a pair of simultaneous equations in $u,\,v$ to determine $A$:

$$x \;=\; \frac{1}{7}(2u - v)\,, \qquad y \;=\; \frac{1}{7}(u + 3v)\,, \qquad A \;=\; \begin{bmatrix} \tfrac{2}{7} & -\tfrac{1}{7} \\[2pt] \tfrac{1}{7} & \tfrac{3}{7} \end{bmatrix}.$$

A few calculations now confirm that with this choice of matrix $A$, the mapping $\Phi_A : (u,v) \to (x,y)$ defined by

$$\begin{bmatrix} \tfrac{2}{7} & -\tfrac{1}{7} \\[2pt] \tfrac{1}{7} & \tfrac{3}{7} \end{bmatrix}\begin{bmatrix} u \\ v \end{bmatrix} \;=\; \begin{bmatrix} x \\ y \end{bmatrix}$$
maps $D^* = [0,7] \times [0,7]$ onto $D = OPQR$. Notice too that the mapping

$$(x,y) \;\longrightarrow\; (u,v) \;=\; (3x + y,\ 2y - x)$$

is defined by the inverse matrix

$$A^{-1} \;=\; \begin{bmatrix} 3 & 1 \\ -1 & 2 \end{bmatrix}$$

to $A$, which is why the mapping went in the opposite direction to $\Phi_A$.

The important 'distortion' factor, called the Jacobian, is given by

Definition: the Jacobian of the transformation $\Phi : (u,v) \to (x(u,v),\,y(u,v))$ is the determinant

$$\frac{\partial(x,y)}{\partial(u,v)} \;=\; \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[6pt] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix}\,.$$

Example 4: in polar coordinates

$$x \;=\; r\cos\theta\,, \qquad y \;=\; r\sin\theta\,,$$

so

$$\frac{\partial(x,y)}{\partial(r,\theta)} \;=\; \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} \;=\; r\,.$$

This explains why there's an $r$ factor in polar integrals!

Example 5: in the matrix case when

$$x \;=\; a_{11}u + a_{12}v\,, \qquad y \;=\; a_{21}u + a_{22}v\,,$$

then

$$\frac{\partial(x,y)}{\partial(u,v)} \;=\; \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} \;=\; a_{11}a_{22} - a_{21}a_{12}\,.$$

Thus in the matrix case the Jacobian is the determinant, $\det[A]$, of $A$.

To see why the Jacobian is the distortion factor of the mapping

$$\Phi : (u,v) \;\to\; (x(u,v),\,y(u,v)) \;=\; x(u,v)\,\mathbf{i} + y(u,v)\,\mathbf{j}$$
makes good use of all the vector calculus we've developed so far. Let $Q = [a,\,a+h] \times [c,\,c+k]$ be a rectangle in the $uv$-plane and $\Phi(Q)$ its image in the $xy$-plane as shown. Then

$$\Delta\mathbf{u} \;=\; \Phi(a+h,\,c) - \Phi(a,\,c)\,, \qquad \Delta\mathbf{v} \;=\; \Phi(a,\,c+k) - \Phi(a,\,c)\,, \qquad \operatorname{area}(\Phi(Q)) \;\approx\; \|\Delta\mathbf{u} \times \Delta\mathbf{v}\|\,.$$

On the other hand, by the vector Mean Value Theorem,

$$\frac{\Phi(a+h,\,c) - \Phi(a,\,c)}{h} \;\approx\; \frac{\partial x}{\partial u}\Bigr|_{(a,c)}\mathbf{i} + \frac{\partial y}{\partial u}\Bigr|_{(a,c)}\mathbf{j}\,,$$

while

$$\frac{\Phi(a,\,c+k) - \Phi(a,\,c)}{k} \;\approx\; \frac{\partial x}{\partial v}\Bigr|_{(a,c)}\mathbf{i} + \frac{\partial y}{\partial v}\Bigr|_{(a,c)}\mathbf{j}\,.$$

But $\operatorname{area}(Q) = hk$, so

$$\|\Delta\mathbf{u} \times \Delta\mathbf{v}\| \;\approx\; hk\,\Biggl|\det\begin{bmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[6pt] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{bmatrix}\Biggr| \;=\; \operatorname{area}(Q)\,\Bigl|\frac{\partial(x,y)}{\partial(u,v)}\Bigr|\,,$$

showing that the Jacobian measures how much the mapping $\Phi : (u,v) \to (x(u,v),\,y(u,v))$ distorts area in the sense that for each rectangle $Q$ in the $uv$-plane

$$\operatorname{area}(\Phi(Q)) \;\approx\; \|\Delta\mathbf{u} \times \Delta\mathbf{v}\| \;\approx\; \Bigl|\frac{\partial(x,y)}{\partial(u,v)}\Bigr|\,\operatorname{area}(Q)\,.$$
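Example 3 and the distortion statement can be checked together in a few lines: the matrix $A = \frac{1}{7}\begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix}$ found above should carry the corners of the square $[0,7] \times [0,7]$ to the vertices $O, P, Q, R$, and for a linear map the Jacobian $\det A = \frac{1}{7}$ rescales the square's area $49$ exactly to the parallelogram's area $7$:

```python
# Check the matrix mapping of Example 3 and its area distortion factor.

A = [[2 / 7, -1 / 7],
     [1 / 7,  3 / 7]]

def phi(u, v):
    # the linear mapping (u, v) -> (x, y) = A (u, v)
    return (A[0][0] * u + A[0][1] * v, A[1][0] * u + A[1][1] * v)

corners = [(0, 0), (7, 0), (7, 7), (0, 7)]
images = [phi(u, v) for u, v in corners]
print(images)       # should be O(0,0), P(2,1), Q(1,4), R(-1,3) up to rounding

det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(det_A * 49)   # |det A| * area of the square = 49 / 7 = 7, the area of D
```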
Let's illustrate this change of variable idea in the case of polar coordinates. The Astrodome in Houston as shown to the right below might be modelled mathematically as the region below the cap of a sphere

$$x^2 + y^2 + z^2 \;=\; R^2$$

above a circular disk

$$D \;=\; \{(x,y) : x^2 + y^2 \le a^2\}\,.$$

In terms of double integrals its

$$\text{Volume} \;=\; \iint_D \sqrt{R^2 - x^2 - y^2}\,dx\,dy\,.$$

Rotational symmetry suggests changing to polar coordinates!

Solution: in polar coordinates,

$$D^* \;=\; \{(r,\theta) : 0 \le r \le a,\ 0 \le \theta \le 2\pi\}$$

is a rectangle, while

$$\sqrt{R^2 - x^2 - y^2} \;=\; \sqrt{R^2 - r^2(\cos^2\theta + \sin^2\theta)} \;=\; \sqrt{R^2 - r^2}\,.$$

So after changing to polar coordinates,

$$I \;=\; \int_0^a\Bigl(\int_0^{2\pi} \sqrt{R^2 - r^2}\,d\theta\Bigr)\,r\,dr\,.$$

The presence of the Jacobian (here the $r$-factor) makes this an easy repeated integral using the substitution $u = r^2$. For then

$$I \;=\; \pi\int_0^{a^2} \sqrt{R^2 - u}\,du \;=\; \pi\Bigl[-\frac{2}{3}\bigl(R^2 - u\bigr)^{3/2}\Bigr]_0^{a^2}\,.$$

Consequently, the mathematical Astrodome has

$$\text{Volume} \;=\; \frac{2\pi}{3}\Bigl(R^3 - \bigl(R^2 - a^2\bigr)^{3/2}\Bigr)\,.$$
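The Astrodome volume is a good test case for the polar Jacobian: a midpoint Riemann sum in $(r,\theta)$ that includes the factor $r$ should reproduce the closed form $\frac{2\pi}{3}(R^3 - (R^2 - a^2)^{3/2})$. The values of $R$ and $a$ below are arbitrary samples:

```python
import math

# Polar Riemann sum for Volume = the double integral of sqrt(R^2 - x^2 - y^2)
# over the disk of radius a, using the polar area element r dr dtheta.

R, a = 2.0, 1.0

def volume_polar(n=400):
    dr = a / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        # the integrand sqrt(R^2 - r^2) is independent of theta, so the
        # theta-sum just contributes a factor of 2*pi
        total += math.sqrt(R * R - r * r) * r * dr * 2 * math.pi
    return total

exact = (2 * math.pi / 3) * (R**3 - (R * R - a * a)**1.5)
print(volume_polar(), exact)
```

Dropping the factor `r` from the sum gives a visibly wrong answer, which is a good way to convince yourself the Jacobian is not optional.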
Double Integrals: GENERAL REGION

The main difficulty in evaluating a double integral

$$\iint_D f(x, y)\; dx\,dy, \qquad D = [a, b] \times [c, d] ,$$

was being able to compute the single variable integrals that arose, because the double integral could be written as repeated single variable integrals

$$\iint_D f(x, y)\; dx\,dy = \int_c^d \left( \int_a^b f(x, y)\, dx \right) dy = \int_a^b \left( \int_c^d f(x, y)\, dy \right) dx ,$$

and either choice of order of integration used. So we could always choose the more convenient one. The situation gets more complicated when $D$ is not of the form $[a, b] \times [c, d]$, however. It's best to treat each region $D$ on its own merits. Usually a good first step is to draw the region of integration $D$ as a graph in the $xy$-plane.

Example 1: evaluate the integral

$$I = \iint_D (x + y)\; dx\,dy$$

when $D$ consists of all points $(x, y)$ such that

$$0 \le y \le \sqrt{9 - x^2}, \qquad 0 \le x \le 3 .$$

Since $x^2 + y^2 = 9$ is a circle of radius $3$ centered at the origin, $D$ consists of all points in the first quadrant inside this circle as shown to the right. This suggests:

fix $x$ and integrate with respect to $y$ along the black vertical line as shown, then integrate with respect to $x$.

So as a repeated integral $I$ becomes

$$I = \int_0^3 \left( \int_0^{\sqrt{9 - x^2}} (x + y)\, dy \right) dx = \int_0^3 \left[ xy + \tfrac{1}{2} y^2 \right]_0^{\sqrt{9 - x^2}} dx .$$

Thus

$$I = \int_0^3 \left( x\sqrt{9 - x^2} + \tfrac{1}{2}(9 - x^2) \right) dx = 18 .$$

How would you evaluate this last integral?

In example 1 algebraic conditions specifying $D$ suggested how to write the integral as a repeated integral. Other times algebraic conditions are best interpreted graphically before deciding on limits of integration.
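One answer to the closing question is to check the value by machine. A Python sketch with a simple midpoint rule (the grid size is an arbitrary choice of ours):

```python
import math

# Midpoint-rule evaluation of Int_0^3 ( x*sqrt(9 - x^2) + (9 - x^2)/2 ) dx,
# the single variable integral left at the end of Example 1; it should give 18.
n = 200_000
h = 3.0 / n
total = 0.0
for k in range(n):
    x = h * (k + 0.5)
    total += x * math.sqrt(9 - x * x) + (9 - x * x) / 2
I = total * h
print(I)  # close to 18
```

(By hand, the first term yields to the substitution $u = 9 - x^2$ and the second is a polynomial, each contributing $9$.)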
Example 2: evaluate the integral

$$I = \iint_D (3x + 4y)\; dx\,dy$$

when $D$ is the bounded region enclosed by $y = x$ and $y = x^2$.

Here $D$ is enclosed by the straight line $y = x$ and the parabola $y = x^2$ as shown to the right. To determine the limits of integration we first need to find the points of intersection of $y = x$ and $y = x^2$. These occur when $x^2 = x$, i.e., when $x = 0,\, 1$. Now fix $x$ and integrate with respect to $y$ along the black vertical line, so as a repeated integral

$$I = \int_0^1 \left( \int_{x^2}^{x} (3x + 4y)\, dy \right) dx .$$

Can you evaluate $I$?

Notice that in Example 2 it was simpler to fix $x$ and integrate first with respect to $y$ because the bounding curves were given as

$$y = f_1(x) = x^2, \qquad y = f_2(x) = x .$$

Had they been given as $x = g_1(y)$ and $x = g_2(y)$ it would have been easier to fix $y$ and first integrate with respect to $x$, because then it would have been simpler to determine the first set of limits of integration in terms of $y$, not $x$. Knowing the graph of $D$ can help!
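To answer "Can you evaluate $I$?": the inner integral is $\left[3xy + 2y^2\right]_{x^2}^{x} = 5x^2 - 3x^3 - 2x^4$, so $I = \int_0^1 (5x^2 - 3x^3 - 2x^4)\,dx = \tfrac{31}{60}$. A Python sketch confirming this (grid size is an arbitrary choice):

```python
# The inner integral of Example 2 is exact:
#   Int_{x^2}^{x} (3x + 4y) dy = [3xy + 2y^2]_{x^2}^{x} = 5x^2 - 3x^3 - 2x^4.
# A midpoint rule on the remaining outer integral should then give 31/60.
n = 100_000
h = 1.0 / n
total = 0.0
for k in range(n):
    x = h * (k + 0.5)
    total += 5 * x**2 - 3 * x**3 - 2 * x**4
I = total * h
print(I, 31 / 60)  # the two values should agree
```
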
Example 3: when $D$ is shown to the right, then fixing $x$ and integrating first with respect to $y$ along the black line makes good sense because then

$$D = \{(x, y) : \phi_1(x) \le y \le \phi_2(x),\; a \le x \le b \}$$

for suitable choices of $a,\, b$ and functions $\phi_1(x),\, \phi_2(x)$:

$$\iint_D f(x, y)\; dx\,dy = \int_a^b \left( \int_{\phi_1(x)}^{\phi_2(x)} f(x, y)\, dy \right) dx .$$

But if we had chosen to fix $y$, then the integral with respect to $x$ would sometimes split into two parts shown in red. Not a good idea!

Example 4: but when $D$ is shown to the right, then fixing $y$ and integrating first with respect to $x$ along the black line makes good sense because then

$$D = \{(x, y) : \psi_1(y) \le x \le \psi_2(y),\; c \le y \le d \}$$

for suitable choices of $c,\, d$ and functions $\psi_1(y),\, \psi_2(y)$. In this case

$$\iint_D f(x, y)\; dx\,dy = \int_c^d \left( \int_{\psi_1(y)}^{\psi_2(y)} f(x, y)\, dx \right) dy .$$

But if we had chosen to fix $x$, then the integral with respect to $y$ would sometimes split into two parts as shown in red. Again not a good idea!
Drawing the graph of the region distinguishes between examples 3 and 4. But in some cases the choice of order of
integration is less clear graphically. In fact, one order of integration may lead to integrals that require more
sophisticated techniques of integration or can't be done (never a good idea!!).
Example 5: evaluate the integral

$$I = \iint_D x\sqrt{1 + y^3}\; dx\,dy$$

when $D$ is the triangular region shown to the right enclosed by the $y$-axis and the lines

$$y = \tfrac{1}{3} x, \qquad y = 2 .$$

Bad Choice: fix $x$ and integrate with respect to $y$ along the red line. Then

$$I = \int_0^6 \left( \int_{x/3}^{2} x\sqrt{1 + y^3}\, dy \right) dx .$$

The trouble is that the inner integral requires evaluating

$$\int_{x/3}^{2} \sqrt{1 + y^3}\; dy .$$

Nothing you've learned so far in calculus will work here!! The other order of integration is needed.

Good Choice: fix $y$ and integrate with respect to $x$ along the black line. Then

$$I = \int_0^2 \left( \int_0^{3y} x\sqrt{1 + y^3}\, dx \right) dy .$$

Now the inner integral requires evaluating

$$\int_0^{3y} x\; dx = \left[ \tfrac{1}{2} x^2 \right]_0^{3y} = \tfrac{9}{2}\, y^2 .$$

In this case,

$$I = \frac{9}{2} \int_0^2 y^2 \sqrt{1 + y^3}\; dy = \frac{3}{2} \int_1^9 \sqrt{u}\; du = 26 ,$$

using the substitution $u = 1 + y^3$.
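The value $26$ from the 'Good Choice' order is easy to confirm numerically; a Python sketch (grid size arbitrary):

```python
import math

# Midpoint-rule check of the 'Good Choice' computation in Example 5:
# I = (9/2) * Int_0^2 y^2 * sqrt(1 + y^3) dy should equal 26.
n = 100_000
h = 2.0 / n
total = 0.0
for k in range(n):
    y = h * (k + 0.5)
    total += 4.5 * y * y * math.sqrt(1 + y**3)
I = total * h
print(I)  # close to 26
```

Note that no such direct check rescues the 'Bad Choice' symbolically: the inner antiderivative of $\sqrt{1 + y^3}$ simply isn't elementary.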
So we've got to get used to reversing the order of integration in a double integral. This always requires first looking carefully at a graph of the region of integration. Then it's a matter of algebra and inverse functions.

Example 6: reverse the order of integration in the double integral

$$I = \int_0^2 \left( \int_{x^2}^{4} f(x, y)\, dy \right) dx ,$$

but make no attempt to evaluate either integral.

Solution: the region of integration is the set

$$D = \{(x, y) : x^2 \le y \le 4,\; 0 \le x \le 2 \}$$

whose graph is shown to the right. The given repeated integral fixes $x$ and integrates with respect to $y$ along the vertical black line. To reverse the order of integration we need to fix $y$ and integrate with respect to $x$ along the red line. To set up the repeated integral we have to express $D$ in the form

$$D = \{(x, y) : \psi_1(y) \le x \le \psi_2(y),\; c \le y \le d \}$$

for suitably chosen $c,\, d$ and functions $\psi_1(y),\, \psi_2(y)$.

Now by inverse functions, the parabola $y = x^2$ can be written as $x = \sqrt{y}$; this tells us how to find the right hand limit of integration $x = \psi_2(y) = \sqrt{y}$. On the other hand, the graph above shows the left hand limit is $x = 0$. Thus $D$ can also be written as

$$D = \{(x, y) : 0 \le x \le \sqrt{y},\; 0 \le y \le 4 \} .$$

Consequently, reversing the order of integration shows that

$$I = \int_0^4 \left( \int_0^{\sqrt{y}} f(x, y)\, dx \right) dy ,$$

integrating now first with respect to $x$.
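Although the notes evaluate nothing here, it is reassuring to verify the reversal with an assumed test integrand, say $f(x, y) = xy$ (our choice, not from the notes). Both orders should give $\tfrac{32}{3}$; a Python sketch with the inner integrals done exactly:

```python
# Test integrand f(x, y) = x*y (our choice, not from the notes).
n = 100_000

# Original order (dy first): inner = Int_{x^2}^{4} x*y dy = x*(16 - x^4)/2.
h = 2.0 / n
I1 = sum(x * (16 - x**4) / 2 for x in (h * (k + 0.5) for k in range(n))) * h

# Reversed order (dx first): inner = Int_0^{sqrt(y)} x*y dx = y^2 / 2.
h = 4.0 / n
I2 = sum(y * y / 2 for y in (h * (k + 0.5) for k in range(n))) * h

print(I1, I2)  # both close to 32/3
```
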
TRIPLE INTEGRALS

Studying triple integrals

$$I = \iiint_W f(x, y, z)\; dx\,dy\,dz$$

of functions $f(x, y, z)$ of three variables is a natural step up from the two variable case. It's a very important one for applications. Now the domain of integration $W$ in $I$ is a solid in 3-space, and when $f > 0$, the value of $I$ is the volume of the solid in 4-space below the graph of $f$ and above $W$. Here solid in 4-space means a 4-dimensional region, just as a solid in 3-space is a 3-dimensional region; a ball

$$\{(x, y, z, w) : x^2 + y^2 + z^2 + w^2 \le r^2 \}$$

of radius $r$ is a typical example of a solid in 4-space.

Instead of worrying about careful definitions, however, let's again adopt the natural Cavalieri approach: parallel planes $P_x$ in 4-space will now slice a 4-dimensional solid into solids in 3-space, and if $V(x)$ is the volume of the 3D-slice of this solid by $P_x$, then surely

$$\iiint_W f(x, y, z)\; dx\,dy\,dz = \int_a^b V(x)\; dx ,$$

assuming $W$ is sandwiched between $P_a$ and $P_b$. But we've just learned how to express the volumes of solids in 3-space as double integrals, which in turn were expressed as repeated integrals. As a result, a triple integral also can be written as a repeated integral, except that now there will be three, not two, integrals to be evaluated. In practice, these single variable integrals come from a careful description of the domain of integration $W$. Examples show how to do this:
Example 1: express the triple integral

$$I = \iiint_W f(x, y, z)\; dx\,dy\,dz$$

as a repeated integral when $W$ is the solid above $D = [a, b] \times [c, d]$ in the $xy$-plane and below the graph of $z = 6 - x$.

Think of $D$ as the base of $W$ and the graph of $z = 6 - x$ above $D$ as the top of $W$. Now fix a point $P = (x, y)$ in $D$, and let $z$ go vertically along the red line from $P$ up to the black dot at the top. Next free $y$, so that $y,\, z$ vary over the pink rectangle. Finally, free $x$. Then as $x$ varies, the pink rectangle sweeps out $W$. Thus $W$ is the set of points

$$\{(x, y, z) : 0 \le z \le 6 - x,\; c \le y \le d,\; a \le x \le b \},$$

and so

$$I = \int_a^b \left( \int_c^d \left( \int_0^{6 - x} f(x, y, z)\, dz \right) dy \right) dx .$$
Example 2: express the triple integral

$$I = \iiint_W f(x, y, z)\; dx\,dy\,dz$$

as a repeated integral when $W$ is the solid in the first octant bounded by the graphs of

$$z = 9 - x^2 - y^2, \qquad y = x, \qquad y = 2 .$$

Since $y = x$ and $y = 2$ intersect at $(2, 2)$ as lines in the $xy$-plane, let's take as base of $W$ the triangular region

$$D = \{(x, y) : x \le y \le 2,\; 0 \le x \le 2 \}.$$

Fix a point $P = (x, y)$ in $D$ and let $z$ go along the red line from $P$ up to the graph of $z = 9 - x^2 - y^2$. Next free $y$ but still keep $x$ fixed, so that $y$ varies along the green line from $y = x$ to $y = 2$. Finally, free $x$ to vary from $0$ to $2$. Thus $W$ is the set of points $(x, y, z)$ such that

$$0 \le z \le 9 - x^2 - y^2, \qquad x \le y \le 2, \qquad 0 \le x \le 2 ,$$

and so

$$I = \int_0^2 \left( \int_x^2 \left( \int_0^{9 - x^2 - y^2} f(x, y, z)\, dz \right) dy \right) dx .$$

In example 2 we could also have chosen to vary $x$ along the yellow line before freeing $y$. How would the limits of integration have then changed?

In both examples 1 and 2 the base $D$ of the solid of integration $W$ was a region in the $xy$-plane and in the first integral $z$ varied from $0$ to the top of $W$. But that's not always the case. Here's a typical example.

Example 3: express the triple integral

$$I = \iiint_W f(x, y, z)\; dx\,dy\,dz$$

as a repeated integral when $W$ is the solid bounded by the graphs of

$$z = 1 - y^2, \qquad z = x, \qquad x = 0 .$$

Try describing $W$ in two different ways corresponding to two different choices of base. Then express $I$ as a repeated integral using these two descriptions of $W$.
Just as with double integrals, there's no reason other than algebraic or geometric convenience to express the repeated integrals in the same order used in the earlier examples - in fact, there are six possible ways of specifying the order of integration. When $f(x, y, z)$ is a continuous function, Fubini's theorem again states that all six choices of order of integration give the same value for a triple integral

$$I = \iiint_W f(x, y, z)\; dx\,dy\,dz ,$$

so for this reason a more coordinate-free notation

$$I = \iiint_W f(x, y, z)\; dV = \iiint_W f(\mathbf{x})\; dV$$

is often used for a triple integral. We shall refer to $dV$ as the volume element.

Finally, let's actually complete a triple integral.

Example 4. Evaluate the triple integral

$$I = \iiint_W (2x + 4z)\; dV$$

when $W$ is the solid bounded by the parabolic cylinder $y = x^2$ and the planes

$$y = x, \qquad z = x, \qquad z = 0$$

as shown to the right.

Solution: the base $D$ of $W$ enclosed by the blue curves lies in the $xy$-plane and is bounded by the graphs of $y = x^2$ and $y = x$. Since these intersect at $(1, 1)$,

$$D = \{(x, y) : x^2 \le y \le x,\; 0 \le x \le 1 \},$$

while $W$ consists of all points $(x, y, z)$ such that

$$0 \le z \le x, \qquad x^2 \le y \le x, \qquad 0 \le x \le 1 .$$

So as a repeated integral,

$$I = \int_0^1 \left( \int_{x^2}^{x} \left( \int_0^{x} (2x + 4z)\, dz \right) dy \right) dx .$$

We carry out the individual integrations:

$$I = \int_0^1 \left( \int_{x^2}^{x} \left[ 2xz + 2z^2 \right]_0^{x} dy \right) dx = \int_0^1 \left( \int_{x^2}^{x} 4x^2\, dy \right) dx = \int_0^1 \left[ 4x^2\, y \right]_{x^2}^{x} dx .$$

Consequently,

$$I = \int_0^1 4\left( x^3 - x^4 \right) dx = \frac{1}{5} .$$
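A quick machine check of this final step (a Python sketch; grid size arbitrary): after the exact $z$- and $y$-integrations above, only a one-variable integral remains.

```python
# Check of Example 4: after the exact z- and y-integrations carried out above,
# I = Int_0^1 4*x^2*(x - x^2) dx, which should equal 1/5.
n = 100_000
h = 1.0 / n
I = sum(4 * x * x * (x - x * x) for x in (h * (k + 0.5) for k in range(n))) * h
print(I)  # close to 0.2
```
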
PATH INTEGRALS

Having studied integrals over intervals $[a, b]$ in the $x$-axis and regions $D$ in the $xy$-plane, it's time to turn to integrals over curves and surfaces in 3-space, not just of scalar-valued but also of vector-valued functions. Many important quantities in science and engineering can then be expressed by integrals over curves and surfaces as you'll discover in future courses.

First we'll extend the notion of integration to scalar-valued functions over a (smooth) curve $C$ in space. The basic idea is to parametrize $C$ by $\mathbf{c}(t),\; a \le t \le b$, and think of the integral over $C$ as a suitably 'distorted' version of the integral of the composition $f(\mathbf{c}(t))$ as a function on $[a, b]$, the distorting factor now being defined in terms of arc length.

Path Integral: when $C$ is a curve in 3-space parametrized by $\mathbf{c}(t),\; a \le t \le b$, the Path Integral of a scalar-valued function $f$ over $C$ is defined by

$$\int_C f\; ds = \int_a^b f(\mathbf{c}(t))\, \|\mathbf{c}'(t)\|\; dt .$$

The integral exists when $f(\mathbf{c}(t))$ and $\|\mathbf{c}'(t)\|$ are continuous functions of $t$. Some texts call $\int_C f\, ds$ a Scalar Line Integral, and one then refers to $ds = \|\mathbf{c}'(t)\|\, dt$ as the Scalar line element for the parametrization $\mathbf{c}(t)$ of $C$.
Example 1: evaluate the path integral $\int_C f\, ds$ when $C$ is the helix shown.

Solution: first compute $\|\mathbf{c}'(t)\|$ from the parametrization of the helix; then evaluate $f$ along $C$. The path integral thus reduces to a single variable integral over the parameter interval.
When $f = 1$ at all points, then $\int_C f\, ds$ is just the arc length of $C$, raising natural questions for general $f$ and $C$:

why is $\int_C f\, ds$ defined this way? what does $\int_C f\, ds$ tell us for particular $f$ and $C$?

As always, answers involve approximating Riemann sum arguments. Let $C$ be the curve shown in blue to the right below and parametrized by $\mathbf{c}(t),\; a \le t \le b$. Partition $[a, b]$ by

$$a = t_0 < t_1 < \cdots < t_n = b$$

and use this to decompose $C$ into consecutive arcs $C_k$ parametrized by

$$\mathbf{c}(t), \qquad t_{k-1} \le t \le t_k .$$

Then each $C_k$ can be approximated by the red vector $\mathbf{c}(t_k) - \mathbf{c}(t_{k-1})$, so that

$$\text{length}(C_k) \approx \|\mathbf{c}(t_k) - \mathbf{c}(t_{k-1})\| .$$

Now choose an (orange) sample point $\mathbf{c}(t_k^*)$ in $C_k$. All this prompts introducing the Riemann Sum

$$\sum_{k=1}^{n} f(\mathbf{c}(t_k^*))\, \|\mathbf{c}(t_k) - \mathbf{c}(t_{k-1})\| .$$

But, if $\Delta t_k$ is the increment in $t$ from $t_{k-1}$ to $t_k$, then by the vector Mean Value Theorem,

$$\|\mathbf{c}(t_k) - \mathbf{c}(t_{k-1})\| \approx \|\mathbf{c}'(t_k^*)\|\, \Delta t_k .$$

Thus by taking the limits as $n \to \infty$ we obtain the integral

$$\int_a^b f(\mathbf{c}(t))\, \|\mathbf{c}'(t)\|\; dt$$

we've denoted by $\int_C f\, ds$. But why choose this particular Riemann sum? What does it mean?
Example 2: if $C$ is a wire and $f(x, y, z)$ is the density of the wire at $(x, y, z)$ (in grams per unit length, say), then

$$\text{mass}(C) = \int_C f\; ds .$$

To see why, note that if $f$ is constant, then

$$\text{mass}(C) = f \times \text{length}(C) .$$

So in general, the curve $C$ has mass

$$\text{mass}(C) \approx \sum_{k=1}^{n} f(\mathbf{c}(t_k^*))\, \text{length}(C_k) ,$$

with ever-better approximations as $n \to \infty$.
Just as integrals $\int_a^b f(x)\, dx$ were often associated with the area of the region above $[a, b]$ and below the graph of $f$ when $f \ge 0$, so a path integral $\int_C f\, ds$ can be identified with the area of the region above $C$ and below the graph of $f$ whenever $C$ lies in a plane below the graph of $f$. The most common example occurs when $C$ lies in the $xy$-plane and $f = f(x, y) \ge 0$.
Example 3: the graph of $z = f(x, y)$ is shown as the green surface in 3-space, while the curve $C$ in the $xy$-plane is shown in black. The region above $C$ and below the graph of $f$ is shown in purple; it has $C$ as base and will be a curved surface whenever $C$ is curved. Often people speak of this purple surface as a curtain wall.

This is the same situation you met before except the domain of integration is now a curve $C$ that is the image of the domain of integration $[a, b]$ before, and the 'wall' below the graph will be curved, not flat. But in the curved case, when we break $C$ into arcs $C_k$, the expression

$$f(\mathbf{c}(t_k^*))\, \text{length}(C_k)$$

is still the area of a rectangle approximating the area of the 'curtain wall' above $C_k$ and below the graph of $f$. The integral $\int_C f\, ds$ will then be the limit of these approximating sums. Hence, as the graph above shows, the limit is the area under the graph of $f$ above $C$.
Independence of Parametrization: the notation $\int_C f\, ds$ for the path integral of $f$ over $C$ makes no reference to the choice of the function $\mathbf{c}(t)$ used to parametrize $C$. Yet there are many different parametrizations for the same curve: for instance,

$$(\cos t,\, \sin t),\;\; 0 \le t \le 2\pi, \qquad (\cos 2t,\, \sin 2t),\;\; 0 \le t \le \pi, \qquad (\sin t,\, \cos t),\;\; 0 \le t \le 2\pi ,$$

all parametrize the same circle of radius $1$ centered at the origin in the $xy$-plane. But would we get the same value for $\int_C f\, ds$ for all three choices of parametrizations? Fortunately, the answer is YES! It can be shown that the value of $\int_C f\, ds$ is independent of the choice of parametrization $\mathbf{c}(t)$.
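Here is a Python sketch testing this independence numerically with an assumed test function $f(x, y) = x^2$ (our choice) and two parametrizations of the unit circle; both should give $\pi$.

```python
import math

# Path integral of f(x, y) = x^2 over the unit circle, for two different
# parametrizations traversing the circle once; both values should equal pi.
def path_integral(c, cprime, a, b, n=100_000):
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + h * (k + 0.5)
        x, y = c(t)
        dx, dy = cprime(t)
        total += x * x * math.hypot(dx, dy)   # f(c(t)) * ||c'(t)||
    return total * h

I1 = path_integral(lambda t: (math.cos(t), math.sin(t)),
                   lambda t: (-math.sin(t), math.cos(t)), 0, 2 * math.pi)
I2 = path_integral(lambda t: (math.cos(2 * t), math.sin(2 * t)),
                   lambda t: (-2 * math.sin(2 * t), 2 * math.cos(2 * t)), 0, math.pi)
print(I1, I2)  # both close to pi
```

The faster parametrization has twice the speed $\|\mathbf{c}'(t)\|$ over half the parameter interval, so the arc-length factor compensates exactly.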
LINE INTEGRALS

It's frequently important to integrate a vector field, not just a scalar-valued function, along a path. We shall call them Line Integrals; some texts call them Vector Line Integrals by contrast with Scalar Line Integrals. The notion of the Work Done by a Force Field $\mathbf{F}$ on a particle to move it along a path $C$ given parametrically by $\mathbf{c}(t)$ provides a good introduction.

When the path is a straight line from $P$ to $Q$ and the force field $\mathbf{F}$ is constant as shown to the right in black, then consider the component of $\mathbf{F}$ in the direction of the red vector $\mathbf{v} = \overrightarrow{PQ}$. This component is $\mathbf{F} \cdot \mathbf{v}/\|\mathbf{v}\|$ and it is the component of the force doing all the work in moving the particle from $P$ to $Q$. In fact:

$$\text{Work done} = \left( \mathbf{F} \cdot \frac{\mathbf{v}}{\|\mathbf{v}\|} \right) \|\mathbf{v}\| = \mathbf{F} \cdot \overrightarrow{PQ} .$$

But in general, $C$ will not be a straight line and the force field will likely vary along the path. To derive a general expression for work done we'll thus use the same approximating Riemann sum argument as for Path Integrals.

Partition $[a, b]$ by

$$a = t_0 < t_1 < \cdots < t_n = b$$

and use this to approximate $C$ by directed line segments shown to the right composed of red vectors

$$\Delta \mathbf{c}_k = \mathbf{c}(t_k) - \mathbf{c}(t_{k-1}), \qquad 1 \le k \le n .$$

Then in moving the particle from $\mathbf{c}(t_{k-1})$ to $\mathbf{c}(t_k)$, the

$$\text{Work done} \approx \mathbf{F}(\mathbf{c}(t_k^*)) \cdot \Delta \mathbf{c}_k$$

for an (orange) sample point $\mathbf{c}(t_k^*)$ chosen in the $k$th arc. Now if $\Delta t_k$ is the increment in $t$ from $t_{k-1}$ to $t_k$, then

$$\Delta \mathbf{c}_k \approx \mathbf{c}'(t_k^*)\, \Delta t_k .$$

Thus in moving the particle from $\mathbf{c}(a)$ to $\mathbf{c}(b)$ the

$$\text{Total Work done} \approx \sum_{k=1}^{n} \mathbf{F}(\mathbf{c}(t_k^*)) \cdot \mathbf{c}'(t_k^*)\, \Delta t_k .$$

By taking the limits as $n \to \infty$ we obtain an integral

$$\int_a^b \mathbf{F}(\mathbf{c}(t)) \cdot \mathbf{c}'(t)\; dt .$$

It represents the work done by the force in moving a particle along $C$ from $\mathbf{c}(a)$ to $\mathbf{c}(b)$. Replacing a Force Field by an arbitrary vector field $\mathbf{F}$ we arrive at

Line Integral: when $C$ is a curve in 3-space parametrized by $\mathbf{c}(t),\; a \le t \le b$, the Line Integral of a vector field $\mathbf{F}$ along $C$ is defined by

$$\int_C \mathbf{F} \cdot d\mathbf{s} = \int_a^b \mathbf{F}(\mathbf{c}(t)) \cdot \mathbf{c}'(t)\; dt .$$

The integral exists when $\mathbf{F}(\mathbf{c}(t))$ and $\mathbf{c}'(t)$ are continuous functions of $t$. Some texts call $\int_C \mathbf{F} \cdot d\mathbf{s}$ a Vector Line Integral, and one then refers to $d\mathbf{s} = \mathbf{c}'(t)\, dt$ as the Vector line element for the parametrization $\mathbf{c}(t)$ of $C$.

As before it can be shown that the value of $\int_C \mathbf{F} \cdot d\mathbf{s}$ is independent of the choice of parametrization of $C$. The example of a force field moving a particle along $C$ shows, however, that a line integral depends on direction along the curve. A specified direction along a path is called an orientation, and we call this specified direction the positive direction; the opposite direction is called, not surprisingly, the negative direction. The number line, for example, has a positive direction from left to right. A parametrization $\mathbf{c}(t),\; a \le t \le b$, of a path $C$ equips $C$ with an orientation: the positive direction along $C$ is from $\mathbf{c}(a)$ to $\mathbf{c}(b)$, following the usual orientation on the number line as $t$ goes from $a$ to $b$.
Example 1: evaluate the line integral $\int_C \mathbf{F} \cdot d\mathbf{s}$ for the given vector field $\mathbf{F}$ when $C$ is the curve parametrized by the given $\mathbf{c}(t)$.

Solution: when $C$ is parametrized by $\mathbf{c}(t)$, compute $\mathbf{c}'(t)$ and evaluate $\mathbf{F}$ along $C$; the dot product $\mathbf{F}(\mathbf{c}(t)) \cdot \mathbf{c}'(t)$ then reduces the line integral to a single variable integral over the parameter interval.
There is a 'vector-free' way of writing line integrals: if $\mathbf{F} = P\,\mathbf{i} + Q\,\mathbf{j} + R\,\mathbf{k}$, then you'll often find line integrals written and evaluated in the form

$$\int_C \mathbf{F} \cdot d\mathbf{s} = \int_C P\, dx + Q\, dy + R\, dz .$$

This is especially true in the 2D-case when $\mathbf{F} = P\,\mathbf{i} + Q\,\mathbf{j}$, for then

$$\mathbf{c}'(t)\, dt = \left( \frac{dx}{dt}\,\mathbf{i} + \frac{dy}{dt}\,\mathbf{j} \right) dt ,$$

from which it follows that

$$\int_C \mathbf{F} \cdot d\mathbf{s} = \int_C P\, dx + Q\, dy .$$
Example 2: evaluate the line integral $\int_C \mathbf{F} \cdot d\mathbf{s}$ when $C$ is the straight line joining a point $P$ to a point $Q$.

Solution: the straight line from $P$ to $Q$ can always be parametrized by

$$\mathbf{c}(t) = (1 - t)\, P + t\, Q, \qquad 0 \le t \le 1 ;$$

in other words, $\mathbf{c}(t)$ traces the segment with $\mathbf{c}(0) = P$ and $\mathbf{c}(1) = Q$. But then $\mathbf{c}'(t) = Q - P$ is constant, and the line integral reduces to a single variable integral over $[0, 1]$.
To see how Path and Line integrals are related, recall that $\mathbf{c}'(t)$ is a tangent vector to $C$ at $\mathbf{c}(t)$, while

$$\mathbf{T} = \frac{\mathbf{c}'(t)}{\|\mathbf{c}'(t)\|}$$

is the unit tangent vector to $C$ at $\mathbf{c}(t)$. On the other hand,

$$\mathbf{F} \cdot \mathbf{c}'(t) = (\mathbf{F} \cdot \mathbf{T})\, \|\mathbf{c}'(t)\| .$$

Inserting these into the definition of path and line integrals we thus get

Theorem: the line integral of a vector field $\mathbf{F}$ along a curve $C$ is the path integral of the tangential component $\mathbf{F} \cdot \mathbf{T}$ of $\mathbf{F}$ along $C$ in the sense that

$$\int_C \mathbf{F} \cdot d\mathbf{s} = \int_C (\mathbf{F} \cdot \mathbf{T})\; ds .$$

Thinking in terms of the path integral of the tangential component of $\mathbf{F}$ provides a good way of understanding line integrals.
In the animation to the right the vector field as
well as type and location of curve can be
changed, and the value of the line integral of the
vector field along the curve is given at the top.
The colored arrows at a point on the curve
indicate the vector field at that point. Try varying
the field and the curve both separately and
simultaneously.
What happens for the radial field
when the curve is any circle? (Remember, in
numerical calculations numbers like
should be interpreted as zero!)
How can you change the value of the line
integral from positive to negative when is the
line segment or the sine curve?
For what vector field and positions of the
line segment is the line integral zero (or
essentially zero)?
Gradient Vector Field: one of the most important questions we have to resolve in vector calculus is the form the Fundamental Theorem of Calculus takes for the integral of a vector field over curves, surfaces and solids. In the case of a Line integral, if $\mathbf{F}$ is a gradient vector field, the Chain rule reduces a line integral to a straightforward single variable integral. For if $\mathbf{F} = \nabla f$, then by the Chain Rule,

$$\frac{d}{dt}\, f(\mathbf{c}(t)) = \nabla f(\mathbf{c}(t)) \cdot \mathbf{c}'(t) .$$

By the standard single-variable Fundamental Theorem of Calculus, therefore,

$$\int_a^b \nabla f(\mathbf{c}(t)) \cdot \mathbf{c}'(t)\; dt = f(\mathbf{c}(b)) - f(\mathbf{c}(a)) .$$

This proves:

Fundamental Theorem of Line Integrals: for a gradient vector field $\mathbf{F} = \nabla f$ the line integral along a path $C$ parametrized by $\mathbf{c}(t),\; a \le t \le b$, is given by

$$\int_C \nabla f \cdot d\mathbf{s} = f(\mathbf{c}(b)) - f(\mathbf{c}(a)) ;$$

in particular, when $C$ is a closed curve, meaning $\mathbf{c}(a) = \mathbf{c}(b)$, then the integral around $C$ is zero, i.e.,

$$\oint_C \nabla f \cdot d\mathbf{s} = 0 .$$

Notice that the radial field in the animation is a gradient vector field, so this should explain your observations for the line integral over every circle in the earlier animation. But the proof of the Fundamental Theorem has really established much more: the value of $\int_C \nabla f \cdot d\mathbf{s}$ depends only on the end-points $\mathbf{c}(a)$ and $\mathbf{c}(b)$ of $C$, not on the path taken to get from $\mathbf{c}(a)$ to $\mathbf{c}(b)$.
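A Python sketch illustrating the Fundamental Theorem numerically (the potential $f(x, y) = (x^2 + y^2)/2$ and the path $\mathbf{c}(t) = (t, t^2)$ are assumed test choices of ours):

```python
# For the gradient field F = grad f with f(x, y) = (x^2 + y^2)/2, i.e.
# F = (x, y), the line integral along c(t) = (t, t^2), 0 <= t <= 1, should
# equal f(c(1)) - f(c(0)) = 1 by the Fundamental Theorem of Line Integrals.
n = 100_000
h = 1.0 / n
total = 0.0
for k in range(n):
    t = h * (k + 0.5)
    x, y = t, t * t           # c(t)
    dx, dy = 1.0, 2 * t       # c'(t)
    total += x * dx + y * dy  # F(c(t)) . c'(t)
work = total * h
print(work)  # close to 1.0
```

Replacing the path by any other curve between the same two endpoints would leave the value unchanged; that is exactly path-independence.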
General Vector Field: to interpret line integrals for a general $\mathbf{F}$ requires Stokes' theorem, to be proved later, but specific cases show the basic ideas.

Example 3: first, let $\mathbf{F} = -y\,\mathbf{i} + x\,\mathbf{j}$ be a rotation vector field. Then $\mathbf{F}$ is everywhere tangent to circles centered at the origin. On the other hand, when $C$ is the circle of radius $a$ centered at the origin and parametrized by

$$\mathbf{c}(t) = (a\cos t,\, a\sin t), \qquad 0 \le t \le 2\pi ,$$

then $\mathbf{F}(\mathbf{c}(t)) \cdot \mathbf{c}'(t) = a^2$, so

$$\oint_C \mathbf{F} \cdot d\mathbf{s} = \int_0^{2\pi} a^2\; dt = 2\pi a^2 .$$

But $C$ has radius $a$, so the area of the region inside $C$ is $\pi a^2$. Thus the line integral of this rotation field around $C$ is twice the area enclosed.
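A Python sketch of this circulation computation (the rotation field $\mathbf{F} = (-y,\, x)$ is the standard example assumed here, and the radius is an arbitrary test value):

```python
import math

# For the rotation field F = (-y, x), the line integral around a circle of
# radius a should equal 2*pi*a^2, twice the enclosed area.
a = 1.5
n = 100_000
h = 2 * math.pi / n
total = 0.0
for k in range(n):
    t = h * (k + 0.5)
    x, y = a * math.cos(t), a * math.sin(t)      # c(t)
    dx, dy = -a * math.sin(t), a * math.cos(t)   # c'(t)
    total += (-y) * dx + x * dy                  # F(c(t)) . c'(t) = a^2
circulation = total * h
print(circulation, 2 * math.pi * a * a)  # the two values should agree
```
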
PARAMETRIC SURFACES

Vector calculus associated with surfaces was initially developed in the 19th-century to explain the then newly emerging fields of electricity and magnetism; you'll meet this in later courses. This was followed in the early 20th-century when Einstein used calculus over surfaces to develop his theories of Relativity. Our ultimate goal will be to define integrals over surfaces and relate them both to line integrals and to triple integrals. But just as parametrized curves were a key ingredient in defining line integrals, so surface integrals require that the surface be parametrized. These days such parametrizations are, in fact, a basic tool in computer graphics.

A surface $S$ in 3-space is said to be a Parametric Surface when it consists of all points

$$(x(u, v),\, y(u, v),\, z(u, v)), \qquad (u, v) \in D ,$$

for some function $\Phi$ and region $D$ in $\mathbb{R}^2$. It's usual to write $\Phi$ as a vector function

$$\Phi(u, v) = x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k}$$

and think of $S$ as the surface created in 3-space by the tip of the vector $\Phi(u, v)$ as $(u, v)$ varies over $D$.
The graph of $z = x^2 - y^2$, a hyperbolic paraboloid, consists by definition of all points $(x,\, y,\, x^2 - y^2)$ in 3-space. It thus becomes a parametric surface with parameters $u$ and $v$:

$$\Phi(u, v) = (u,\, v,\, u^2 - v^2) .$$

But this same hyperbolic paraboloid can also be parametrized by

$$\Phi(r, \theta) = (r\cos\theta,\, r\sin\theta,\, r^2\cos 2\theta)$$

using cylindrical polar coordinates, so the same surface may have several different parametrizations.

More generally, the graph of every function $z = f(x, y)$ is a parametric surface, setting

$$\Phi(u, v) = (u,\, v,\, f(u, v)) .$$

The sphere $x^2 + y^2 + z^2 = a^2$ of radius $a$ centered at the origin is not the graph of a function, but using spherical polar coordinates

$$x = a\sin\phi\cos\theta, \qquad y = a\sin\phi\sin\theta, \qquad z = a\cos\phi ,$$

this sphere is parametrized by

$$\Phi(\phi, \theta) = (a\sin\phi\cos\theta,\, a\sin\phi\sin\theta,\, a\cos\phi)$$

with $D$ the rectangle $[0, \pi] \times [0, 2\pi]$ in the $\phi\theta$-plane.
Problem 1: use cylindrical polar coordinates to parametrize
Mappings: but what's the point of parametrizing a surface $S$ by a function $\Phi(u, v)$? Well, for each fixed $v$ the function $u \mapsto \Phi(u, v)$ defines a curve on $S$, while $v \mapsto \Phi(u, v)$ also defines a curve on $S$ for each fixed $u$. The interactive example below for the graph of the hyperbolic paraboloid shows how this works:

Here $\Phi(u, v) = (u,\, v,\, u^2 - v^2)$ where $D$ is the green-shaded square in the $uv$-plane. Moving the sliders shows that $\Phi$ takes horizontal lines in the $uv$-plane to parabolas on the hyperbolic paraboloid and vertical lines to parabolas. Thus parametrizing by $\Phi$ provides a surface mesh by space curves on the hyperbolic paraboloid determined by the rectangular mesh in the $uv$-plane. In more fancy language, $\Phi$ is a mapping from 2-space to the hyperbolic paraboloid in 3-space taking the rectangular coordinate mesh in 2-space to a surface mesh by space curves. In addition, use of the 'rectangle' button shows that the hyperbolic paraboloid is then covered by the 'surface rectangles' coming from the actual rectangles in the coordinate mesh in the $uv$-plane. This is what we'll use shortly to set up integrals for surface area. It's exactly the purpose too of the $uv$-mapping in computer graphics! But isn't such a parametrization also just a different way of talking about slicing the hyperbolic paraboloid by planes parallel to the $xz$- and $yz$-coordinate planes?
Another informative thing to do is to look for examples of how surfaces arise in the architectural design and construction of
buildings.
1. Reichstag Dome: as one of the iconic late 20th century buildings in Europe, it's basically a paraboloid, say the graph of $z = c - b(x^2 + y^2)$ in rectangular coordinates. But it was constructed with a metal skeletal structure based on the cylindrical polar coordinate parametrization:

$$\Phi(r, \theta) = (r\cos\theta,\, r\sin\theta,\, c - b r^2)$$

with $(r, \theta)$ in the rectangle $[0, r_0] \times [0, 2\pi]$. Aren't the windows just surface rectangles! (If not, how could you describe them?) Now compare this dome with the next animation:
Another standard architectural structure is a spiral ramp providing access from one level to another.
2. Loretto Chapel Staircase, Santa Fe: interior access
from the floor of the chapel to an upper gallery is achieved
via a spiral staircase shown to the right. Together with the
handrails, there are three helixes in the construction. Since
the graph of the space curve $\mathbf{c}(t) = (a\cos t,\, a\sin t,\, bt)$ is a helix lying in the circular cylinder $x^2 + y^2 = a^2$, how
might the curved base of the staircase be described
mathematically? The interactive animation below provides
the answer. But does it spiral the same way? Can you think
of any other buildings with spiral ramps to provide upward
access? Guggenheim Museum or parking structures? Seen
any on Quest? How might you describe them
parametrically?
Calculus on surfaces: calculus on a parametric surface $S$ can now proceed! Suppose $S$ is parametrized by $\Phi(u, v)$ and let $P = \Phi(u_0, v_0)$ be a point on $S$. Then as $u,\, v$ vary,

$$u \mapsto \Phi(u, v_0), \qquad v \mapsto \Phi(u_0, v)$$

define space curves on $S$ passing through $P$ such that

$$\Phi_u = \frac{\partial \Phi}{\partial u}, \qquad \Phi_v = \frac{\partial \Phi}{\partial v}$$

are vectors at $P$ tangent both to the space curves and to the surface $S$. The cross product $\Phi_u \times \Phi_v$ thus gives the normal to $S$ at $P$; we can then use this normal and the point-normal equation for a plane to compute the tangent plane to $S$ at $P$. In summary:

When a surface $S$ is parametrized by

$$\Phi(u, v) = x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k} ,$$

its normal at a point $P = \Phi(u_0, v_0)$ on $S$ is given by the cross product

$$\mathbf{n} = \Phi_u \times \Phi_v \Big|_{(u_0,\, v_0)} .$$

As usual a paraboloid parametrized by cylindrical coordinates provides a good example algebraically and graphically.
Problem 2: determine the normal vector and the
tangent plane to the paraboloid
at the point .
Solution: in rectangular coordinates,
On the other hand, since
and
the normal vector at an arbitrary point is
given by
Thus
Now the tangent plane at a point with normal
is given by
So after a little algebra we see that
is an equation for the tangent plane at .
The green surface to the right is the portion of
lying in the first octant. Click on the 'Point' button to
pick a point on and then use the slider
to generate the curves and
shown in orange and passing through . Click on the
tangent vectors button.
Can you see from the expression in Problem 2 for
these tangent vectors why one is horizontal?
Can you see from the definition
why this normal points outwards, not inwards, from the
surface?
The length $\|\Phi_u \times \Phi_v\|$ of this normal will be used to define the scalar surface area element $dS = \|\Phi_u \times \Phi_v\|\, du\,dv$ in scalar surface integrals.
Now use the slider to draw the tangent plane at $P$. It provides a linearization to the surface $S$ at $P$ and will become important in setting up integrals over $S$.
SURFACE AREA, SURFACE INTEGRALS

All the ideas have now been assembled to introduce surface integrals, extending double integrals over 'flat' regions in the $xy$-plane to integrals over surfaces in 3-space. Let's begin with the integral for surface area.

Previously, we saw that if a surface $S$ is parametrized by $\Phi(u, v),\; (u, v) \in D$, its surface area is given by the integral

$$\text{area}(S) = \iint_D \|\Phi_u \times \Phi_v\|\; du\,dv .$$

The basic idea was that $\Phi$ mapped the rectangular Cartesian mesh on the $uv$-plane to a curvilinear mesh on the surface, the paraboloid being typical.
As an illustration of the surface area formula, let's prove another remarkable result of Archimedes - obtained, remember, over 1,600 years before calculus was invented!

Recall first some known results:

the surface area of a sphere of radius $a$ is $4\pi a^2$;

the surface area of a right circular cylinder, open top and bottom, of radius $a$ and height $h$ is $2\pi a h$.

But if the cylinder exactly circumscribes the sphere as shown to the right, i.e., the cylinder and the sphere have the same radius and height, then $h = 2a$. In this case, both the sphere and circumscribing cylinder have equal surface area $4\pi a^2$. Even though this result is clear on the basis of computation, does it seem obvious geometrically?

Now suppose the complete sphere is replaced by a sphere 'cropped' at the North and South Poles, i.e., the portion of a sphere between two lines of latitude, as shown in blue to the right below where the 'cylinder' slider also produces the green cylinder exactly circumscribing this cropped sphere. Drag the animation around to convince yourself of this!
Problem: do the blue cropped sphere and the exactly circumscribing green cylinder always have equal surface area?

Solution: the surface area of the cylinder is easy to determine - it's just $2\pi a h$ when the cropped sphere has radius $a$ and height $h$. Determining the surface area of the cropped sphere, however, needs the surface area integral formula: parametrize the sphere by

$$\Phi(\phi, \theta) = (a\sin\phi\cos\theta,\, a\sin\phi\sin\theta,\, a\cos\phi) .$$

The corresponding scalar surface element is then

$$dS = a^2 \sin\phi\; d\phi\,d\theta .$$

Now slice the sphere by the equatorial plane and a horizontal plane of height $h$ above this equatorial plane. To use the surface area integral we need to determine the limits of integration.

The portion of the upper hemisphere centered at the origin of radius $a$ cut off by the plane $z = h$ is shown in cross-section to the right. Since $\cos\phi_0 = h/a$ at the slicing plane, its

$$\text{surface area} = \int_0^{2\pi} \left( \int_{\phi_0}^{\pi/2} a^2 \sin\phi\; d\phi \right) d\theta = 2\pi a^2 \cos\phi_0 = 2\pi a h ,$$

which is exactly the same as the surface area of the circumscribing cylinder of radius $a$ and height $h$!!
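A Python sketch confirming Archimedes numerically (the radius and band height are arbitrary test values of ours):

```python
import math

# Midpoint-rule check that the band of a sphere of radius a between the
# equator and the plane z = h has area 2*pi*a*h, using the surface element
# dS = a^2 sin(phi) dphi dtheta.
a, h_band = 2.0, 0.7
phi0 = math.acos(h_band / a)          # colatitude where the plane z = h cuts
n = 100_000
dphi = (math.pi / 2 - phi0) / n
total = sum(a * a * math.sin(phi0 + dphi * (k + 0.5)) for k in range(n))
area = 2 * math.pi * total * dphi
print(area, 2 * math.pi * a * h_band)  # the two values should agree
```

Notice the striking consequence: the band's area depends only on its height, not on where the slab sits along the axis.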
Scalar Surface Integrals: the surface area formula is just the special case $f = 1$ of the integral of a scalar function $f$ over a parametrized surface $S$.

Definition: the surface integral of a scalar valued continuous function $f(x, y, z)$ over a surface $S$ parametrized by $\Phi(u, v),\; (u, v) \in D$, is given by

$$\iint_S f\; dS = \iint_D f(\Phi(u, v))\, \|\Phi_u \times \Phi_v\|\; du\,dv .$$

Notice that the change of variable formula for double integrals was just the 'flat earth' version of a surface integral with distorting function a mapping $\Phi$ having range in $\mathbb{R}^2$, not in $\mathbb{R}^3$. In practice, the surface area element for special categories of surfaces is often needed.

I. Graph of a function $z = f(x, y)$: when $\Phi(u, v) = (u,\, v,\, f(u, v))$ is the parametrization of the graph of $z = f(x, y)$ by rectangular coordinates $x = u$ and $y = v$, then for a fixed point $P$ on the surface, the coordinate curves on this surface through $P$ have respective tangent vectors

$$\Phi_u = \mathbf{i} + \frac{\partial f}{\partial u}\,\mathbf{k}, \qquad \Phi_v = \mathbf{j} + \frac{\partial f}{\partial v}\,\mathbf{k} .$$

Thus the normal to the surface at $P$ is

$$\mathbf{n} = \Phi_u \times \Phi_v = -\frac{\partial f}{\partial u}\,\mathbf{i} - \frac{\partial f}{\partial v}\,\mathbf{j} + \mathbf{k} .$$

Thus, as surface area element, the graph of $z = f(x, y)$ will have

$$dS = \sqrt{1 + \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2}\; dx\,dy$$

over a region $D$ in the $xy$-plane.
Example 1: evaluate the integral
when is the part of the plane
above the rectangle in the -plane.
Solution: the surface area element for the plane
is given by
On the other hand, on ,
Thus
Consequently,
II. Surface of revolution: if $f$ is a positive function on $[a, b]$, then rotating the graph of $y = f(x)$ around the $x$-axis produces a surface of revolution $S$ whose cross-section by a plane perpendicular to the $x$-axis is a circle in the plane with radius $f(x)$ and center at $(x, 0, 0)$. Since this circle is the space curve

$$\theta \mapsto (x,\, f(x)\cos\theta,\, f(x)\sin\theta), \qquad 0 \le \theta \le 2\pi ,$$

varying $x$ parametrizes $S$ by

$$\Phi(x, \theta) = (x,\, f(x)\cos\theta,\, f(x)\sin\theta), \qquad a \le x \le b, \quad 0 \le \theta \le 2\pi .$$

Thus

$$\Phi_x = (1,\, f'(x)\cos\theta,\, f'(x)\sin\theta), \qquad \Phi_\theta = (0,\, -f(x)\sin\theta,\, f(x)\cos\theta) ,$$

and so the normal to the surface is

$$\mathbf{n} = \Phi_x \times \Phi_\theta = \left( f(x) f'(x),\; -f(x)\cos\theta,\; -f(x)\sin\theta \right) .$$

Thus yet another application of the Pythagorean trig identity gives

$$\|\mathbf{n}\| = f(x)\sqrt{1 + f'(x)^2} .$$

Consequently, $S$ has

$$dS = f(x)\sqrt{1 + f'(x)^2}\; dx\,d\theta .$$

Example 2: use the fact that a sphere of radius $a$ is created by rotating the graph of

$$y = f(x) = \sqrt{a^2 - x^2}, \qquad -a \le x \le a ,$$

about the $x$-axis to determine the surface area of a sphere of radius $a$.

Solution: since

$$f'(x) = \frac{-x}{\sqrt{a^2 - x^2}}$$

when $y = \sqrt{a^2 - x^2}$ is rotated about the $x$-axis,

$$f(x)\sqrt{1 + f'(x)^2} = \sqrt{a^2 - x^2}\, \sqrt{1 + \frac{x^2}{a^2 - x^2}} .$$

After simplification, this becomes $a$. Thus a sphere of radius $a$ has

$$\text{surface area} = \int_0^{2\pi} \left( \int_{-a}^{a} a\; dx \right) d\theta = 4\pi a^2 .$$
Parametric Surfaces: VECTOR INTEGRALS

Now we come to the idea of integrating a vector field $\mathbf{F}$ over a surface $S$. Such integrals represent flux or rates of flow through the surface. This and the various forms that the Fundamental Theorem of calculus takes for such integrals are extremely important in applications - and just as in the one variable case, the Fundamental Theorem will provide you with a convenient way of evaluating such integrals, as well as often providing a physical interpretation!

So let $S$ be a parametric surface parametrized by $\Phi(u, v)$. For the moment imagine $S$ as being a 'filter' through which water is flowing with velocity $\mathbf{F}$. If a surface integral of a velocity field $\mathbf{F}$ is to measure, say, the flow rate of water through $S$, i.e., the volume of water flowing through $S$ in unit time, then we'll need to specify a positive direction of flow. This is done by specifying the direction of the unit normal vector $\mathbf{n}(P)$ at each point $P$ since there are always two possible choices for the direction of the normal.

A surface $S$ is said to be oriented when a choice of unit normal $\mathbf{n}(P)$ can be made at each point $P$ so that $\mathbf{n}(P)$ varies smoothly with $P$.

An orientation specifies one of two 'sides' in a continuous way - the outside corresponds to the positive direction of $\mathbf{n}$, the inside to the negative direction. Surfaces such as cylinders and spheres come with natural orientations because a notion of inside and outside is pretty clear, but surprisingly perhaps, inside and outside don't always make sense for a surface!
The example to the right below is called a Möbius Strip.
Use the 'construct' slider. It shows how the
surface is constructed: take a thin strip of
paper, then join the two short ends together but
make a half-twist so that the edges are joined
together in opposite directions as indicated by
the arrows. Now imagine you are the vertical
arrow, initially pointing upwards. As you walk
around the surface, this arrow which is initially
vertical gradually changes direction until it
points downwards when the two edges are
joined. At each point on your walk you are
perpendicular to the surface, yet at the end of
the walk you are back at the same point on the
surface, but pointing in the opposite direction!
Vector Surface Integral: the vector surface integral of a vector field over an
oriented parametric surface is defined as the scalar surface integral
over of the normal component of at . The value of is said to
be the of across (or through) . Reversing the orientation, changes the sign of the integral.
Since and are normal to at , both of
are unit normals. Then
if is already specified and , many texts say is orientation preserving;
if an orientation is not already specified, we can define one by setting . In this case
is automatically orientation-preserving.
A simple cancellation thus gives
When is an oriented surface and is an orientation-preserving parametrization
of , then the vector surface integral of a vector field over is given by
Technically speaking, for this to make sense we assume has continuous partial derivatives and that
on . Such a parametrization is usually said to be regular. In practice, computing for
particular surfaces is important:
1. Cylinders: for the cylinder $x^2 + y^2 = R^2$ the unit outward normal at $(x, y, z)$ is given by
$$\mathbf{n} \;=\; \cos\theta\, \mathbf{i} + \sin\theta\, \mathbf{j}$$
in cylindrical polar coordinates. (Does this make geometric sense?)
2. Graph of a function: for a function $z = f(x, y)$,
$$-f_x\, \mathbf{i} - f_y\, \mathbf{j} + \mathbf{k}$$
is normal to the graph of $f$ at each point. Thus
$$\mathbf{n} \;=\; \frac{-f_x\, \mathbf{i} - f_y\, \mathbf{j} + \mathbf{k}}{\sqrt{1 + f_x^2 + f_y^2}}$$
is a unit normal to the graph of $f$, and it specifies an orientation on the graph. Because this unit normal
always has a positive component in the $\mathbf{k}$-direction, the positive direction on the graph is the upwards direction in this
orientation. Paraboloids provide a good illustration: $\mathbf{n}$ always points up at $(x, y, f(x, y))$ whether the
paraboloid opens down or up.
3. Spheres: for the sphere $x^2 + y^2 + z^2 = R^2$ the outward normal at $(x, y, z)$ is given by the position vector
$x\, \mathbf{i} + y\, \mathbf{j} + z\, \mathbf{k}$, so the unit outward normal at $(x, y, z)$ is
$$\mathbf{n} \;=\; \sin\phi \cos\theta\, \mathbf{i} + \sin\phi \sin\theta\, \mathbf{j} + \cos\phi\, \mathbf{k}$$
in spherical polar coordinates. (Does this make geometric sense?)
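For the graph case, a quick check (Python sketch; the paraboloid $f(x, y) = x^2 + y^2$ and the function name are illustrative choices of mine) confirms that the formula produces a unit vector whose $\mathbf{k}$-component is positive:

```python
import math

def graph_unit_normal(fx, fy):
    """Unit normal (-f_x, -f_y, 1)/sqrt(1 + f_x^2 + f_y^2) to the graph
    z = f(x, y), given the partial derivatives f_x, f_y at a point."""
    norm = math.sqrt(1.0 + fx * fx + fy * fy)
    return (-fx / norm, -fy / norm, 1.0 / norm)

# Paraboloid f(x, y) = x^2 + y^2: f_x = 2x, f_y = 2y at the point (1, -2).
n = graph_unit_normal(2 * 1.0, 2 * (-2.0))
# n has length 1 and a positive k-component, so it points upwards.
```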
Example 1: determine the flux of the vector field
across the sphere
in the outward direction as shown to the right.
Solution: if we parametrize the sphere by spherical
polar coordinates, then
So by the previous result for the outward normal
,
Thus the flux of through the sphere is
given by the integral
as a calculation and integration shows.
Finally, let's see why the vector surface integral
$\iint_S \mathbf{v} \cdot d\mathbf{S}$ can be interpreted as the volume of water flowing through $S$
in unit time when the vector field $\mathbf{v}$ is the velocity of the
flowing water at each point.
If the velocity is constant everywhere, i.e., $\mathbf{v} = \mathbf{v}_0$, say,
and $S$ is the rectangular region shown in darker blue to the
right, then the volume of water passing through $S$ in unit
time is the volume of the parallelepiped shown in lighter-
shaded blue. If $\mathbf{n}$ is the unit normal to $S$, then the height of
this parallelepiped is $\mathbf{v}_0 \cdot \mathbf{n}$, the speed perpendicular to $S$ of
the water flowing through $S$. So the parallelepiped has
$$\text{volume} \;=\; (\mathbf{v}_0 \cdot \mathbf{n})\, \text{Area}(S)\,.$$
But when $S$ is flat, its unit normal is always $\mathbf{n}$, so $\mathbf{v} \cdot \mathbf{n} = \mathbf{v}_0 \cdot \mathbf{n}$ and
$$\iint_S \mathbf{v} \cdot d\mathbf{S} \;=\; \iint_S \mathbf{v} \cdot \mathbf{n}\; dS \;=\; (\mathbf{v}_0 \cdot \mathbf{n})\, \text{Area}(S)\,,$$
showing that the flux equals the volume of water crossing $S$ in unit time.
When $S$ is not a rectangular region and $\mathbf{v}$ is not constant, we break the surface up into curvilinear rectangles and then
approximate each curvilinear rectangle by a flat rectangle. The previous argument applies to the flat rectangle and the
result for a general surface then follows by the standard limiting argument.
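The parallelepiped picture can be made concrete with a tiny computation (Python sketch; the particular vectors are illustrative choices of mine): for a flat parallelogram $S$ spanned by vectors $\mathbf{a}, \mathbf{b}$, the normal $\mathbf{a} \times \mathbf{b}$ has length $\text{Area}(S)$, so $(\mathbf{v}_0 \cdot \mathbf{n})\,\text{Area}(S)$ is just the triple product $\mathbf{v}_0 \cdot (\mathbf{a} \times \mathbf{b})$, the volume of the parallelepiped with edges $\mathbf{a}, \mathbf{b}, \mathbf{v}_0$.

```python
import math

def cross(u, w):
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])

def dot(u, w):
    return sum(ui * wi for ui, wi in zip(u, w))

a, b = (2.0, 0.0, 0.0), (0.0, 1.0, 1.0)   # edges spanning the flat region S
v0 = (0.5, 0.25, 1.0)                      # constant velocity of the water

axb = cross(a, b)                          # normal vector with |a x b| = Area(S)
area = math.sqrt(dot(axb, axb))
n = tuple(c / area for c in axb)           # unit normal to S

flux = dot(v0, axb)                        # v0 . (a x b)
volume = dot(v0, n) * area                 # (v0 . n) Area(S): the same number
```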
Fundamental Theorems of Calculus
We come now to the three generalizations of the Fundamental Theorem of Calculus, known by the names of their discoverers: Green,
Stokes, and Gauss, though several others lay claim to discovering them!
1. Single Variable Calculus
$$\int_a^b F'(x)\; dx \;=\; F(b) - F(a)$$
2. Line Integrals
$$\int_C \nabla f \cdot d\mathbf{s} \;=\; f(\mathbf{r}(b)) - f(\mathbf{r}(a))$$
3. Green's Theorem
$$\iint_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA \;=\; \oint_{\partial D} P\, dx + Q\, dy$$
4. Stokes' Theorem
$$\iint_S \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S} \;=\; \oint_{\partial S} \mathbf{F} \cdot d\mathbf{s}$$
5. Gauss' Theorem
$$\iiint_W \mathrm{div}\, \mathbf{F}\; dV \;=\; \iint_{\partial W} \mathbf{F} \cdot d\mathbf{S}$$
These theorems represent the culmination of single and multi-variable calculus. But that's only the beginning - they have
applications to physics and engineering as well as to the biological, earth, and environmental sciences, wherever an
understanding of fluid and aerodynamics, and of continuous matter in general, is needed. From a strictly mathematical point
of view, they are the gateway to analysis on manifolds. But that too is critical in applications.
GREEN'S THEOREM
It's important to get the setting correct. A curve $C$ parametrized by $\mathbf{r}(t) : [a, b] \to \mathbb{R}^2$ is said to be
closed when $\mathbf{r}(b) = \mathbf{r}(a)$, and simple when it never intersects itself.
Circles and ellipses, for example, are simple, closed curves in the plane that are smooth in the sense of having
parametrizations $\mathbf{r}(t) = x(t)\, \mathbf{i} + y(t)\, \mathbf{j}$ whose components $x(t),\, y(t)$ are differentiable, while a rectangular curve is simple
and closed but it is only piecewise smooth. All curves will be taken to be smooth or piecewise smooth.
Green's Theorem: if $D$ is a region in the $xy$-plane whose boundary,
$\partial D$, is a simple, closed curve, oriented counter-clockwise, then
$$\oint_{\partial D} P\, dx + Q\, dy \;=\; \iint_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\, dy$$
for all differentiable functions $P,\, Q : D \to \mathbb{R}$.
One way of thinking of orienting $\partial D$ counter-clockwise is that if one walks around the boundary in the positive direction,
then the region $D$ always lies to one's left. This (non-vectorial) way of stating Green's Theorem is the standard one. To
evaluate the line integral one usually parametrizes $\partial D$ by
$$\mathbf{r}(t) : [a, b] \to \mathbb{R}^2\,, \qquad \mathbf{r}(t) \;=\; x(t)\, \mathbf{i} + y(t)\, \mathbf{j}\,;$$
in which case Green's Theorem becomes
$$\oint_{\partial D} P\, dx + Q\, dy \;=\; \int_a^b \big(P(\mathbf{r}(t))\, x'(t) + Q(\mathbf{r}(t))\, y'(t)\big)\; dt\,.$$
Special choices of $P,\, Q$: when $P(x, y) = -y$ and $Q(x, y) = x$,
$$\iint_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\, dy \;=\; \iint_D 2\; dx\, dy\,,$$
so Green's Theorem expresses the Area of $D$ as the line integral
$$\mathrm{Area}(D) \;=\; \frac{1}{2} \oint_{\partial D} x\, dy - y\, dx\,.$$
This already is a remarkable formula because it says that the area of $D$ can be computed knowing only what happens on
the boundary $\partial D$. Before using this to compute some areas, it is instructive to see how Green's Theorem reduces to the formula
$$\mathrm{area} \;=\; \int_a^b f(x)\; dx$$
you learned long ago for the region under the graph of a function $y = f(x)$ over an interval $[a, b]$.
The region under the graph of $y = f(x)$ on $[a, b]$ is
shown in light blue to the right. Let $C$ be its boundary as
shown in dark blue and oriented counter-clockwise as
shown. Then, starting at $(a, 0)$ and proceeding counter-
clockwise around $C$, we see that
$$\oint_C x\, dy \;=\; 0 \,+\, \int_0^{f(b)} b\; dy \,+\, \int_b^a x\, f'(x)\; dx \,+\, \int_{f(a)}^0 a\; dy$$
$$=\; b f(b) + \Big[\, x f(x) \Big]_b^a - \int_b^a f(x)\; dx - a f(a) \;=\; -\int_b^a f(x)\; dx \;=\; \int_a^b f(x)\; dx\,,$$
after integrating by parts, while
$$\oint_C y\, dx \;=\; 0 + 0 + \int_b^a f(x)\; dx + 0 \;=\; -\int_a^b f(x)\; dx\,.$$
Consequently,
$$\frac{1}{2} \oint_C x\, dy - y\, dx \;=\; \frac{1}{2} \left( \int_a^b f(x)\; dx + \int_a^b f(x)\; dx \right) \;=\; \int_a^b f(x)\; dx\,.$$
Note the attention paid to orientation in the calculations.
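Green's Theorem itself is easy to test numerically. The sketch below (Python, standard library only; the choices $P = -y^2$, $Q = x^3$ on the unit square and the function names are illustrative assumptions of mine) compares $\oint_{\partial D} P\,dx + Q\,dy$ with $\iint_D (\partial Q/\partial x - \partial P/\partial y)\,dx\,dy$; both equal $2$ here.

```python
def P(x, y): return -y * y
def Q(x, y): return x ** 3

def double_side(n=400):
    """Midpoint approximation of the double integral of dQ/dx - dP/dy
    = 3x^2 + 2y over the unit square D = [0,1] x [0,1]."""
    h = 1.0 / n
    return sum((3 * ((i + 0.5) * h) ** 2 + 2 * (j + 0.5) * h) * h * h
               for i in range(n) for j in range(n))

def line_side(n=4000):
    """Midpoint approximation of the line integral of P dx + Q dy
    counter-clockwise around the boundary of the unit square."""
    h = 1.0 / n
    s = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        s += P(t, 0.0) * h               # bottom edge: (t, 0), dx = h
        s += Q(1.0, t) * h               # right edge: (1, t), dy = h
        s += P(1.0 - t, 1.0) * (-h)      # top edge: x decreasing, dx = -h
        s += Q(0.0, 1.0 - t) * (-h)      # left edge: y decreasing, dy = -h
    return s
```

Note how the counter-clockwise orientation shows up as the signs of `dx` and `dy` on the top and left edges.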
An example shows how useful such an area result might be:
Example 1: The loop of the cubic shown to the
right is parametrized by
$$\mathbf{r}(t) \;=\; \big(t^2 - 3,\; t^3 - 3t\big)\,, \qquad |t| \le \sqrt{3}\,.$$
Find the area enclosed by the loop.
Does the direction of increasing $t$ correspond to
counter-clockwise orientation?
Well, when $\mathbf{r}(t) = (t^2 - 3,\, t^3 - 3t)$
for $|t| \le \sqrt{3}$, we see that
$$\mathbf{r}(-\sqrt{3}) = (0, 0)\,, \quad \mathbf{r}(-1) = (-2, 2)\,, \quad \mathbf{r}(0) = (-3, 0)\,, \quad \mathbf{r}(1) = (-2, -2)\,,$$
while $\mathbf{r}(\sqrt{3}) = (0, 0)$; so the loop is traced out
counter-clockwise with increasing $t$, starting at the
origin. On the other hand,
$$x'(t) \;=\; 2t\,, \qquad y'(t) \;=\; 3t^2 - 3\,.$$
Thus
$$x(t)\, y'(t) \;=\; (t^2 - 3)(3t^2 - 3) \;=\; 3t^4 - 12t^2 + 9\,,$$
while
$$x'(t)\, y(t) \;=\; 2t\,(t^3 - 3t) \;=\; 2t^4 - 6t^2\,.$$
But then,
$$x(t)\, y'(t) - x'(t)\, y(t) \;=\; t^4 - 6t^2 + 9\,.$$
Consequently, the shaded region has
$$\mathrm{area} \;=\; \frac{1}{2} \int_{-\sqrt{3}}^{\sqrt{3}} \big(t^4 - 6t^2 + 9\big)\; dt \;=\; \frac{24\sqrt{3}}{5}\,.$$
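The area of the loop can be corroborated directly from $\mathrm{Area} = \frac{1}{2}\oint (x\,dy - y\,dx)$ by a midpoint Riemann sum (Python sketch, standard library; function name mine):

```python
import math

def loop_area(n=50000):
    """Midpoint approximation of (1/2) * integral of (x dy - y dx) around
    the loop r(t) = (t^2 - 3, t^3 - 3t), -sqrt(3) <= t <= sqrt(3)."""
    a, b = -math.sqrt(3.0), math.sqrt(3.0)
    h = (b - a) / n
    s = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        x, y = t * t - 3, t ** 3 - 3 * t
        dx, dy = 2 * t, 3 * t * t - 3        # x'(t), y'(t)
        s += (x * dy - y * dx) * h
    return 0.5 * s

# Compare with the exact value 24*sqrt(3)/5 = 8.3138...
```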
Special choices of $D$: to see why Green's Theorem might be true, it's worth proving it in a couple of special cases.
Case 1. $D = [a, b] \times [c, d]$: by the usual
Fundamental Theorem of Calculus,
$$\iint_D \frac{\partial Q}{\partial x}\; dx\, dy \;=\; \int_c^d \left(\int_a^b \frac{\partial Q}{\partial x}\; dx\right) dy \;=\; \int_c^d \big(Q(b, y) - Q(a, y)\big)\; dy\,.$$
Interchanging $Q,\, P$ and $x,\, y$ we see that
$$\iint_D \frac{\partial P}{\partial y}\; dx\, dy \;=\; \int_a^b \big(P(x, d) - P(x, c)\big)\; dx\,.$$
So if we use crucially the fact that integration is taken
counter-clockwise around $\partial D$, we obtain
$$\iint_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\, dy \;=\; \oint_{\partial D} P\, dx + Q\, dy\,.$$
The proof for the unit disk goes in the same way, but the integral around its boundary, the unit circle, is more tricky.
Since the boundary of the unit disk centered at the origin is parametrized by $\mathbf{r}(\theta) = \cos\theta\, \mathbf{i} + \sin\theta\, \mathbf{j}$, the integral around $\partial D$ can be written
$$\oint_{\partial D} P\, dx + Q\, dy \;=\; \int_0^{2\pi} \big(Q(\cos\theta, \sin\theta) \cos\theta - P(\cos\theta, \sin\theta) \sin\theta\big)\; d\theta\,.$$
To handle the double integral over the unit disk we again use the single variable Fundamental Theorem of Calculus.
Case 2. $D = \{(x, y) : x^2 + y^2 \le 1\}$: by the usual
Fundamental Theorem of Calculus,
$$\iint_D \frac{\partial Q}{\partial x}\; dx\, dy \;=\; \int_{-1}^1 \left(\int_{-\sqrt{1 - y^2}}^{\sqrt{1 - y^2}} \frac{\partial Q}{\partial x}\; dx\right) dy \;=\; \int_{-1}^1 Q\big(\sqrt{1 - y^2},\, y\big)\; dy \;-\; \int_{-1}^1 Q\big(-\sqrt{1 - y^2},\, y\big)\; dy\,.$$
Now set $y = \sin\theta$. The first integral becomes
$$\int_{-\pi/2}^{\pi/2} Q(\cos\theta, \sin\theta) \cos\theta\; d\theta\,,$$
while the second integral becomes
$$\int_{-\pi/2}^{\pi/2} Q(-\cos\theta, \sin\theta) \cos\theta\; d\theta \;=\; -\int_{\pi/2}^{3\pi/2} Q(\cos\theta, \sin\theta) \cos\theta\; d\theta$$
after replacing $\theta$ by $\pi - \theta$. Thus
$$\iint_D \frac{\partial Q}{\partial x}\; dx\, dy \;=\; \int_0^{2\pi} Q(\cos\theta, \sin\theta) \cos\theta\; d\theta \;=\; \oint_{\partial D} Q(x, y)\; dy\,.$$
Can you now use a similar argument to compute the
double integral of $\partial P / \partial y$?
Armed with Green's Theorem in the special Case 1, it's not hard to see how the result follows for a general region. For
suppose $D$ is the region shown below enclosed by the dark blue curve:
We've superimposed a rectangular mesh on $D$. Let $R_1,\, R_2$ be the two adjacent, dark green rectangles and let $R$ be the
union of $R_1,\, R_2$. By Case 1, Green's Theorem holds for each of $R_1,\, R_2$, so
$$\iint_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\, dy \;=\; \iint_{R_1} \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\, dy \;+\; \iint_{R_2} \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\, dy$$
$$=\; \oint_{\partial R_1} (P\, dx + Q\, dy) \;+\; \oint_{\partial R_2} (P\, dx + Q\, dy)\,.$$
But as the enlargement of $R_1,\, R_2$ shows, the integrals along the common edge of $R_1,\, R_2$ cancel because the integrals are
taken in opposite directions! This leaves only the integrals around $\partial R$, so
$$\iint_R \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\, dy \;=\; \oint_{\partial R} P\, dx + Q\, dy\,,$$
establishing Green's Theorem for $R$. Repeating this argument sufficiently often with different size rectangles as necessary,
we thus establish Green's Theorem for the light green-shaded region inside $D$. Since the boundary of this region
approaches $\partial D$ as the mesh is made finer, we thus obtain Green's Theorem for $D$ in the limit.
Green's Theorem re-interpreted for Vector Fields
Green's Theorem can be re-interpreted for vector fields in two important ways that will extend to $3$-space and beyond.
Let $\mathbf{F} : D \to \mathbb{R}^2$ be the vector field $\mathbf{F} = P\, \mathbf{i} + Q\, \mathbf{j}$, so that
$$\mathrm{curl}\, \mathbf{F} \;=\; \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[4pt] P & Q & 0 \end{vmatrix} \;=\; \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) \mathbf{k}\,.$$
Since $\mathbf{k}$ is normal to the $xy$-plane, hence to the 'surface' $D$ over which we are integrating, we can think of the double
integral
$$\iint_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\, dy \;=\; \iint_D \mathrm{curl}\, \mathbf{F} \cdot \mathbf{k}\;\; dx\, dy$$
as a 'flat earth' vector surface integral of the vector field $\mathrm{curl}\, \mathbf{F}$. On the other hand, $d\mathbf{s} = dx\, \mathbf{i} + dy\, \mathbf{j}$. Thus we get one
version of Green's theorem in vector form:
Green's Theorem in Vector Form: if $D$ is a region in the $xy$-plane whose boundary, $\partial D$, is a
simple, closed curve that is oriented counter-clockwise, then
$$\oint_{\partial D} \mathbf{F} \cdot d\mathbf{s} \;=\; \iint_D \mathrm{curl}\, \mathbf{F} \cdot \mathbf{k}\;\; dx\, dy$$
for all $C^1$-vector fields $\mathbf{F} : D \to \mathbb{R}^2$.
This is the 2D-version of Stokes' Theorem that we shall meet shortly. Let's now assume that $\partial D$ is parametrized by $\mathbf{r}(t)$.
Then the tangent vector to $\partial D$ at $\mathbf{r}(t)$ is $\mathbf{r}'(t) = x'(t)\, \mathbf{i} + y'(t)\, \mathbf{j}$, and both of
$$y'(t)\, \mathbf{i} - x'(t)\, \mathbf{j}\,, \qquad -y'(t)\, \mathbf{i} + x'(t)\, \mathbf{j}$$
are perpendicular to $\mathbf{r}'(t)$. But which is the outward normal? Can you use the fact that the outward normal always has to
point to one's right as one walks around $\partial D$ in the positive direction to show that the outward normal at $\mathbf{r}(t)$ must be
$$\mathbf{n}(t) \;=\; y'(t)\, \mathbf{i} - x'(t)\, \mathbf{j}\,?$$
Now define the vector field $\mathbf{F}$ by $\mathbf{F} = Q\, \mathbf{i} - P\, \mathbf{j}$; then
$$\mathbf{F} \cdot \mathbf{n} \;=\; Q(\mathbf{r}(t))\, y'(t) + P(\mathbf{r}(t))\, x'(t)\,, \qquad \mathrm{div}\, \mathbf{F} \;=\; \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\,,$$
and so in yet another form Green's theorem becomes:
Green's Theorem in Divergence Form: when $\partial D$ is parametrized by $\mathbf{r}(t) = x(t)\, \mathbf{i} + y(t)\, \mathbf{j}$ and
$\mathbf{n}(t) = y'(t)\, \mathbf{i} - x'(t)\, \mathbf{j}$
denotes the outward normal to $\partial D$ at $\mathbf{r}(t)$, then
$$\int_a^b \big(P(\mathbf{r}(t))\, x'(t) + Q(\mathbf{r}(t))\, y'(t)\big)\; dt \;=\; \oint_{\partial D} \mathbf{F} \cdot \mathbf{n}\;\; ds \;=\; \iint_D \mathrm{div}\, \mathbf{F}\;\; dx\, dy$$
for all smooth vector fields $\mathbf{F} : D \to \mathbb{R}^2$. The value of $\oint_{\partial D} \mathbf{F} \cdot \mathbf{n}\; ds$ is called the Outward Flux of
$\mathbf{F}$ across $\partial D$.
This is the 2D-version of Gauss' theorem (AKA the Divergence Theorem) that we shall meet shortly.
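Here is a numerical sanity check of the divergence form (Python sketch, standard library only; the field $\mathbf{F} = (x + y^2)\,\mathbf{i} + xy\,\mathbf{j}$ on the unit disk and the function names are illustrative choices of mine): the outward flux $\oint_{\partial D} \mathbf{F} \cdot \mathbf{n}\, ds$ computed with $\mathbf{n}(t) = y'(t)\,\mathbf{i} - x'(t)\,\mathbf{j}$ agrees with $\iint_D \mathrm{div}\,\mathbf{F}\; dx\, dy$; both equal $\pi$ here.

```python
import math

def flux_line(n=2000):
    """Outward flux of F = (x + y^2, x*y) across the unit circle
    r(t) = (cos t, sin t): since |r'(t)| = 1, ds = dt, and
    n(t) = (y'(t), -x'(t)) = (cos t, sin t)."""
    h = 2 * math.pi / n
    s = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        s += ((x + y * y) * math.cos(t) + (x * y) * math.sin(t)) * h
    return s

def flux_div(n=1500):
    """Midpoint approximation of the double integral of
    div F = 1 + x over the unit disk."""
    h = 2.0 / n
    s = 0.0
    for i in range(n):
        x = -1 + (i + 0.5) * h
        for j in range(n):
            y = -1 + (j + 0.5) * h
            if x * x + y * y <= 1.0:
                s += (1 + x) * h * h
    return s
```

The disk sum converges more slowly than the boundary sum because of the cells straddling the circle, which is itself a small illustration of why line integrals are often the cheaper side of the identity.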
GAUSS' THEOREM
Finally, we come to what can be thought of as the Fundamental Theorem of Calculus for triple integrals,
often called Gauss' theorem or the Divergence theorem. Since it will relate a scalar triple integral over a
solid $W$ in $3$-space with a vector surface integral over the boundary $\partial W$ of $W$, we need to make sure
that this vector surface integral is well-defined.
Gauss' Theorem: when $W$ is a solid in $3$-space whose boundary $\partial W$ is a
closed, piecewise smooth surface oriented by normal vectors pointing out
from $W$, then
$$\iiint_W \mathrm{div}\, \mathbf{F}\;\; dV \;=\; \iint_{\partial W} \mathbf{F} \cdot d\mathbf{S}$$
for each vector field $\mathbf{F}$ defined on $W$.
In practice, of course, the boundary $\partial W$ of $W$ will be described by some orientation-preserving
parametrization $\Phi : D \to \partial W$, $D \subseteq \mathbb{R}^2$, so that
$$\iiint_W \mathrm{div}\, \mathbf{F}\;\; dV \;=\; \iint_D \mathbf{F}(\Phi(u, v)) \cdot (\Phi_u \times \Phi_v)\;\; du\, dv\,,$$
showing yet again why we did all that preliminary work early on! Before going through the details of
specific examples, however, it's worth seeing how the result might be established and what it tells us
about a vector field $\mathbf{F}$.
So set $\mathbf{F} = P\, \mathbf{i} + Q\, \mathbf{j} + R\, \mathbf{k}$. Then
$$\iiint_W \mathrm{div}\, \mathbf{F}\;\; dV \;=\; \iiint_W \frac{\partial P}{\partial x}\;\; dV \;+\; \iiint_W \frac{\partial Q}{\partial y}\;\; dV \;+\; \iiint_W \frac{\partial R}{\partial z}\;\; dV\,.$$
Now suppose that $W = [a, b] \times [c, d] \times [e, f]$ is a rectangular box. Its boundary $\partial W$ consists of $3$ pairs
of rectangular faces, the faces in each pair being parallel to one coordinate plane. For example, the light
blue and green faces $S_1,\, S_2$ shown below are parallel to the $xz$-plane and have outward normal $\mathbf{j},\, -\mathbf{j}$
respectively. Also
$$\Phi_{S_1} \;=\; x\, \mathbf{i} + d\, \mathbf{j} + z\, \mathbf{k}\,, \qquad \Phi_{S_2} \;=\; x\, \mathbf{i} + c\, \mathbf{j} + z\, \mathbf{k}\,, \qquad a \le x \le b\,, \quad e \le z \le f\,,$$
are parametrizations of $S_1,\, S_2$ such that
$$d\mathbf{S} \;=\; \mathbf{j}\;\; dx\, dz \;\text{ on } S_1\,, \qquad d\mathbf{S} \;=\; -\mathbf{j}\;\; dx\, dz \;\text{ on } S_2\,.$$
Thus
$$\iint_{S_1} \mathbf{F} \cdot d\mathbf{S} \;=\; \int_a^b\!\! \int_e^f Q(x, d, z)\;\; dz\, dx\,,$$
while
$$\iint_{S_2} \mathbf{F} \cdot d\mathbf{S} \;=\; -\int_a^b\!\! \int_e^f Q(x, c, z)\;\; dz\, dx\,.$$
On the other hand, by the single variable
Fundamental Theorem of Calculus,
$$\iiint_W \frac{\partial Q}{\partial y}\;\; dV \;=\; \int_a^b\!\! \int_e^f \left(\int_c^d \frac{\partial Q}{\partial y}\;\; dy\right) dz\, dx \;=\; \int_a^b\!\! \int_e^f \big(Q(x, d, z) - Q(x, c, z)\big)\;\; dz\, dx\,.$$
Consequently,
$$\iiint_W \frac{\partial Q}{\partial y}\;\; dV \;=\; \iint_{S_1} \mathbf{F} \cdot d\mathbf{S} \;+\; \iint_{S_2} \mathbf{F} \cdot d\mathbf{S}\,.$$
But earlier we saw that the integral $\iint_S \mathbf{F} \cdot d\mathbf{S}$ of, say, the velocity vector field $\mathbf{F}$ of a fluid gives the
volume of fluid flowing through a surface $S$ in unit time. For such a vector field the two vector surface
integrals on the right would thus give the volume of water flowing through $S_1$ and $S_2$ in unit time. Since
$Q(x, y, z)$ is only the component of $\mathbf{F}$ in the $\mathbf{j}$-direction, it's not surprising that it contributes nothing to
the flow through the remaining pairs of faces of $W$. For that reason we have to add in the components
of $\mathbf{F}$ in the $\mathbf{i}$-direction and the $\mathbf{k}$-direction. Using exactly the same argument for the faces parallel to
the $yz$-plane and the $xy$-plane, we thus obtain
$$\iiint_W \mathrm{div}\, \mathbf{F}\;\; dV \;=\; \iiint_W \frac{\partial P}{\partial x}\;\; dV \;+\; \iiint_W \frac{\partial Q}{\partial y}\;\; dV \;+\; \iiint_W \frac{\partial R}{\partial z}\;\; dV \;=\; \iint_{\partial W} \mathbf{F} \cdot d\mathbf{S}\,,$$
establishing Gauss' theorem for a rectangular box simply as a consequence of the single variable
Fundamental Theorem of Calculus!
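The box computation can be mirrored numerically (Python sketch, standard library only; the field $\mathbf{F} = (x^2, y^2, z^2)$ on $W = [0,1]^3$ and the function names are illustrative choices of mine): $\iiint_W \mathrm{div}\,\mathbf{F}\, dV$ with $\mathrm{div}\,\mathbf{F} = 2x + 2y + 2z$ and the total outward flux through the six faces both come to $3$.

```python
def divergence_side(n=60):
    """Midpoint approximation of the triple integral of
    div F = 2x + 2y + 2z over the unit cube W = [0,1]^3."""
    h = 1.0 / n
    s = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                x, y, z = (i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h
                s += 2 * (x + y + z) * h ** 3
    return s

def surface_side():
    """Outward flux of F = (x^2, y^2, z^2) through the six faces of W.
    On the face x = 1 (normal i), F . n = 1 over a face of unit area;
    on x = 0 (normal -i) it is 0, and similarly for the y- and z-pairs,
    so each pair of faces contributes exactly 1."""
    return 3 * (1.0 - 0.0)
```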
To extend Gauss' Theorem to a more general region $W$ we approximate $W$ by rectangular boxes all
having faces parallel to the coordinate planes. Gauss' Theorem then applies to each individual box. But
for the union $W_1 \cup W_2$ of two boxes $W_1,\, W_2$ having a common face,
$$\iiint_{W_1 \cup W_2} \mathrm{div}\, \mathbf{F}\;\; dV \;=\; \iiint_{W_1} \mathrm{div}\, \mathbf{F}\;\; dV \;+\; \iiint_{W_2} \mathrm{div}\, \mathbf{F}\;\; dV$$
$$=\; \iint_{\partial W_1} \mathbf{F} \cdot d\mathbf{S} \;+\; \iint_{\partial W_2} \mathbf{F} \cdot d\mathbf{S} \;=\; \iint_{\partial (W_1 \cup W_2)} \mathbf{F} \cdot d\mathbf{S}$$
because the vector surface integrals over the common face of $W_1,\, W_2$ cancel each other out since the
corresponding normal vectors point in opposite directions; Gauss' theorem for general $W$ now follows
by the familiar limiting argument.
Given that we've called Green's Theorem, Stokes' Theorem, and Gauss' Theorem the highpoint of
calculus, it's surely worth noting that our proofs of these theorems amounted to little more than
clever geometric extensions of the single variable Fundamental Theorem of Calculus. Maybe that
single variable result is the highpoint of calculus!
STOKES' THEOREM
Following our by now well-trodden path of extending known results by increasing dimension, we extend Green's Theorem for
domains in the plane to surfaces in $3$-space. This is known as Stokes' Theorem. It relates a surface integral over a surface $S$ to a
line integral over its boundary $\partial S$.
To state the theorem it will be important to be 'fairly' precise about the assumptions on $S$ and $\partial S$. Recall that a parametrization
$\Phi(u, v)$ for $S$ is said to be regular when $\Phi_u \times \Phi_v \neq \mathbf{0}$ on $S$; if $S$ was oriented, such a $\Phi$ was then said to be orientation-
preserving when the orientation was specified by the unit vector
$$\mathbf{e}_n(u, v) \;=\; \frac{\Phi_u \times \Phi_v}{|\Phi_u \times \Phi_v|}\,.$$
Recall also that an orientation on a curve such as $\partial S$ specifies a positive direction on the curve. We need to relate the orientation
on $S$ to an orientation on its boundary $\partial S$. The following familiar quadric surfaces show how it's done:
In all three, orientation by the outward unit normal on the surface has been chosen. But in the first example, a sphere, the
surface is closed because it has no boundary; we write $\partial S = \emptyset$ to indicate that $\partial S$ is the empty set. For the second, a paraboloid,
the boundary has only one component, while for the third one, a hyperboloid, there are two boundary curves. In all cases, however,
the orientation on $\partial S$ is chosen so that as one walks as a normal in the positive direction around a boundary curve the surface
always lies to one's left. Such an orientation on $\partial S$ is called the boundary orientation.
Stokes' Theorem: when $S$ is a surface having an orientation-preserving parametrization
$\Phi(u, v) : D \to S$, $D \subseteq \mathbb{R}^2$, which is $1$-$1$ and regular except possibly on $\partial D$, then
$$\iint_S \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S} \;=\; \oint_{\partial S} \mathbf{F} \cdot d\mathbf{s}$$
for all smooth vector fields $\mathbf{F}$. The line integral is taken relative to the boundary orientation on
$\partial S$, and the surface integral is zero if $S$ has no boundary.
The $1$-$1$ property is a technical condition that simplifies the proof. It is satisfied in all standard examples. Most texts give a proof
by looking at the special case when $S$ is the graph of a function and reducing the result to Green's Theorem in the $xy$-plane. But if
we assume that Green's theorem holds for any plane in $3$-space, then a very natural proof can be given using triangulation
approximations to $S$.
Indeed, suppose first that $S$ is the oriented surface $T$
shown to the right consisting of triangles $T_1,\, T_2$
having a common edge but lying in different planes,
as indicated by the non-parallel normals. Now let $\mathbf{F}$ be
a vector field. Then
$$\iint_T \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S} \;=\; \iint_{T_1} \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S} \;+\; \iint_{T_2} \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S} \;=\; \oint_{\partial T_1} \mathbf{F} \cdot d\mathbf{s} \;+\; \oint_{\partial T_2} \mathbf{F} \cdot d\mathbf{s}\,,$$
assuming Green's theorem is valid for any (triangular)
region in any plane in $\mathbb{R}^3$. But as the graphic shows, the
integrals along the common edge are in opposite
directions, leaving only the line integral along $\partial T$.
Thus
$$\iint_T \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S} \;=\; \oint_{\partial T} \mathbf{F} \cdot d\mathbf{s}\,,$$
establishing Stokes' theorem for this special $S$.
Now let's apply this idea repeatedly to a triangulation $T$ of a surface $S$, say our favorite hyperbolic paraboloid with the ruled
surface parametrization, as animated below.
Then by refining the mesh so that the triangulation $T$ approaches $S$ as the mesh gets finer and finer, we see that a limiting
argument surely shows that
$$\iint_S \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S} \;=\; \lim_{T \to S}\, \iint_T \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S} \;=\; \lim_{T \to S}\, \oint_{\partial T} \mathbf{F} \cdot d\mathbf{s} \;=\; \oint_{\partial S} \mathbf{F} \cdot d\mathbf{s}\,,$$
thereby providing a plausible argument for Stokes' Theorem for a general $S$.
It's time to use it! There are three main sets of examples.
I. Surface to line integral: evaluate the integral
$$I \;=\; \iint_S \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S}$$
when
$$\mathbf{F} \;=\; yz\, \mathbf{i} + 3zx\, \mathbf{j} + xy\, \mathbf{k}$$
and $S$ is the blue part of the sphere
$$x^2 + y^2 + z^2 \;=\; 4$$
as shown to the right lying inside the cylinder
$$x^2 + y^2 \;=\; 1$$
and oriented upwards.
Solution: the sphere and the cylinder intersect when
$$4 \;=\; x^2 + y^2 + z^2 \;=\; 1 + z^2\,,$$
i.e., when $z = \sqrt{3}$. Thus the boundary $\partial S$ of $S$ is a circle of
radius $1$ in the plane $z = \sqrt{3}$ with boundary orientation
counter-clockwise. It can be parametrized by
$$\mathbf{r}(t) \;=\; \cos t\, \mathbf{i} + \sin t\, \mathbf{j} + \sqrt{3}\, \mathbf{k}\,, \qquad 0 \le t \le 2\pi\,.$$
By Stokes' Theorem, therefore,
$$\iint_S \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S} \;=\; \int_0^{2\pi} \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\;\; dt\,.$$
Now
$$\mathbf{F}(\mathbf{r}(t)) \;=\; \sqrt{3}\, \sin t\, \mathbf{i} + 3\sqrt{3}\, \cos t\, \mathbf{j} + \sin t \cos t\, \mathbf{k}\,,$$
while
$$\mathbf{r}'(t) \;=\; -\sin t\, \mathbf{i} + \cos t\, \mathbf{j}\,.$$
In this case,
$$\mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \;=\; \sqrt{3}\, \big(3 \cos^2 t - \sin^2 t\big)\,.$$
Consequently, by double angle formulas,
$$I \;=\; \sqrt{3} \int_0^{2\pi} (1 + 2 \cos 2t)\;\; dt \;=\; 2\sqrt{3}\, \pi\,.$$
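A quick numerical check of Example I (Python sketch, standard library only; the function name is mine): the line-integral side $\int_0^{2\pi} \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\, dt$ for $\mathbf{F} = (yz, 3zx, xy)$ around $\mathbf{r}(t) = (\cos t, \sin t, \sqrt{3})$ reproduces $2\sqrt{3}\,\pi$.

```python
import math

def line_side(n=5000):
    """Midpoint approximation of the line integral of F = (yz, 3zx, xy)
    around r(t) = (cos t, sin t, sqrt(3)), 0 <= t <= 2*pi."""
    h = 2 * math.pi / n
    s = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y, z = math.cos(t), math.sin(t), math.sqrt(3.0)
        dx, dy, dz = -math.sin(t), math.cos(t), 0.0
        s += ((y * z) * dx + (3 * z * x) * dy + (x * y) * dz) * h
    return s

# Compare with the exact value 2*sqrt(3)*pi.
```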
II. Line to surface integral: evaluate the integral
$$I \;=\; \oint_C \mathbf{F} \cdot d\mathbf{s}$$
when
$$\mathbf{F} \;=\; 3y^2\, \mathbf{i} + 2x\, \mathbf{j} + z^2\, \mathbf{k}$$
and $C$ is the curve oriented counter-clockwise as seen
from above obtained by intersecting the plane $z = 2 + y$
and the cylinder $x^2 + y^2 = 1$.
Solution: it's possible to evaluate $I$ directly as a line
integral, but the calculations are easier if Stokes' theorem
is used to replace the line integral with a surface integral.
As shown to the right, a natural choice of surface $S$ is
the purple portion of the plane $z = 2 + y$ lying inside the
cylinder $x^2 + y^2 = 1$. Then $S$ is parametrized by
$$\Phi(x, y) \;=\; x\, \mathbf{i} + y\, \mathbf{j} + (2 + y)\, \mathbf{k}\,, \qquad 0 \le x^2 + y^2 \le 1\,.$$
Since
$$\Phi_x \times \Phi_y \;=\; -\mathbf{j} + \mathbf{k}\,,$$
the normal vector to $S$ associated with this
parametrization is the upward normal, and the boundary
orientation on $\partial S\, (= C)$ is the given counter-clockwise
orientation on $C$. Thus Stokes' theorem applies:
$$I \;=\; \iint_{x^2 + y^2 \le 1} (\mathrm{curl}\, \mathbf{F}) \cdot (\Phi_x \times \Phi_y)\;\; dx\, dy\,.$$
But
$$\mathrm{curl}\, \mathbf{F} \;=\; \mathrm{curl}\, \big(3y^2\, \mathbf{i} + 2x\, \mathbf{j} + z^2\, \mathbf{k}\big) \;=\; (2 - 6y)\, \mathbf{k}\,,$$
so
$$(\mathrm{curl}\, \mathbf{F}) \cdot (\Phi_x \times \Phi_y) \;=\; 2 - 6y\,.$$
Thus
$$I \;=\; \iint_{x^2 + y^2 \le 1} (2 - 6y)\;\; dx\, dy\,.$$
Now change to polar coordinates:
$$I \;=\; \int_0^1\!\! \int_0^{2\pi} (2 - 6r \sin\theta)\; r\;\; d\theta\, dr \;=\; 2\pi\,.$$
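Both sides of Example II can be checked numerically (Python sketch, standard library only; function names mine): the line integral of $\mathbf{F} = (3y^2, 2x, z^2)$ around $C$ and the polar double integral of $(2 - 6r\sin\theta)\, r$ both come to $2\pi$.

```python
import math

def line_side(n=5000):
    """Line integral of F = (3y^2, 2x, z^2) around
    r(t) = (cos t, sin t, 2 + sin t), 0 <= t <= 2*pi."""
    h = 2 * math.pi / n
    s = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y, z = math.cos(t), math.sin(t), 2 + math.sin(t)
        dx, dy, dz = -math.sin(t), math.cos(t), math.cos(t)
        s += (3 * y * y * dx + 2 * x * dy + z * z * dz) * h
    return s

def surface_side(nr=300, nt=300):
    """Midpoint approximation of the polar integral of
    (2 - 6 r sin(theta)) r over the unit disk."""
    hr, ht = 1.0 / nr, 2 * math.pi / nt
    s = 0.0
    for i in range(nr):
        r = (i + 0.5) * hr
        for j in range(nt):
            th = (j + 0.5) * ht
            s += (2 - 6 * r * math.sin(th)) * r * hr * ht
    return s
```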
III. Surface to surface integral: evaluate the
integral
$$I \;=\; \iint_S \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S}$$
when
$$\mathbf{F} \;=\; xyz\, \mathbf{i} + xy^2\, \mathbf{j} + x^2 yz\, \mathbf{k}$$
and $S$ consists of the top and four sides but not the
bottom of the cube having vertices $(\pm 1, \pm 1, \pm 1)$,
oriented outwards, as shown to the right.
Solution: since $S$ is an open cube with missing bottom
face, the boundary orientation on $\partial S$ is the one shown in
red to the right when $S$ is oriented outwards. By Stokes'
Theorem,
$$I \;=\; \iint_S \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S} \;=\; \oint_{\partial S} \mathbf{F} \cdot d\mathbf{s}\,.$$
But the bottom face - call it $T$ - will have the same red
boundary as the open cube and it will have the same
boundary orientation provided $T$ is oriented upwards, i.e.,
the unit normal on $T$ is $\mathbf{k}$. With this orientation on $T$,
$$\oint_{\partial S} \mathbf{F} \cdot d\mathbf{s} \;=\; \oint_{\partial T} \mathbf{F} \cdot d\mathbf{s}\,.$$
Thus by Stokes' theorem,
$$\iint_S \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S} \;=\; \iint_T \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S}\,.$$
The point is that the surface integral over $T$ is much
simpler than the one over $S$.
Now $T$ is parametrized by
$$\Phi(x, y) \;=\; x\, \mathbf{i} + y\, \mathbf{j} - \mathbf{k}\,, \qquad -1 \le x,\, y \le 1$$
(does this specify $\mathbf{k}$ as unit normal on $T$?), while
$$\mathrm{curl}\, \mathbf{F} \;=\; \mathrm{curl}\, \big(xyz\, \mathbf{i} + xy^2\, \mathbf{j} + x^2 yz\, \mathbf{k}\big) \;=\; x^2 z\, \mathbf{i} + (xy - 2xyz)\, \mathbf{j} + (y^2 - xz)\, \mathbf{k}\,.$$
In this case, since $z = -1$ on $T$,
$$\iint_T \mathrm{curl}\, \mathbf{F} \cdot d\mathbf{S} \;=\; \int_{-1}^1\!\! \int_{-1}^1 (y^2 + x)\;\; dx\, dy \;=\; \int_{-1}^1 \Big[\, y^2 x + \tfrac{1}{2} x^2 \Big]_{-1}^1\; dy \;=\; \int_{-1}^1 2y^2\;\; dy \;=\; \frac{4}{3}\,.$$
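As a final check (Python sketch, standard library only; function names mine), the double integral of $y^2 + x$ over the square $[-1, 1]^2$ and the line integral of $\mathbf{F} = (xyz, xy^2, x^2 yz)$ around the red boundary $\partial T$ at $z = -1$ both give $4/3$:

```python
def surface_side(n=600):
    """Midpoint approximation of the integral of y^2 + x over [-1,1]^2,
    i.e. of curl F . k on the bottom face T where z = -1."""
    h = 2.0 / n
    s = 0.0
    for i in range(n):
        x = -1 + (i + 0.5) * h
        for j in range(n):
            y = -1 + (j + 0.5) * h
            s += (y * y + x) * h * h
    return s

def line_side(n=4000):
    """Line integral of F = (xyz, x*y^2, x^2*y*z) counter-clockwise
    around the boundary of T at z = -1; dz = 0 on each edge, so only
    the F1 dx + F2 dy terms survive."""
    corners = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
    z, s = -1.0, 0.0
    for k in range(4):
        (x0, y0), (x1, y1) = corners[k], corners[(k + 1) % 4]
        h = 1.0 / n
        for i in range(n):
            t = (i + 0.5) * h
            x, y = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
            s += (x * y * z) * (x1 - x0) * h + (x * y * y) * (y1 - y0) * h
    return s
```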