
Engineering Mathematics / The Johns Hopkins University

Lectures in Engineering
Mathematics
George Nakos

Engineering Mathematics
The Johns Hopkins University

Fall 2004

1
Engineering Mathematics / The Johns Hopkins University

Part 1: Linear Algebra

1. Matrices: Addition and Scalar Multiplication

2. Matrix Multiplication

3. Linear Systems and Gauss Elimination

4. The Rank of a Matrix; Linear Independence

2
Engineering Mathematics / The Johns Hopkins University

6.1 Matrices: Definitions

Basic Definitions

A matrix is a rectangular arrangement of numbers called entries. A matrix has rows that are numbered top to bottom and columns that are numbered left to right. The (i, j) entry is the entry in the ith row and jth column.

A matrix has size m × n (pronounced ‘m by n’) if it has m rows and n columns. If m = n, then the matrix is called square. In this case, n is the size of the square matrix.

Lecture 1 / @ Copyright: George Nakos 3


Engineering Mathematics / The Johns Hopkins University

6.1 Matrices: Examples

The following are matrices of respective sizes 4 × 2, 2 × 3, 3 × 3, 5 × 1, and 1 × 2.

$$\begin{bmatrix} 1 & -2 \\ -3 & 5 \\ 0 & 6 \\ 2 & -8 \end{bmatrix},\quad \begin{bmatrix} 2 & 1 & -1 \\ 9 & \sqrt{5} & 4 \end{bmatrix},\quad \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix},\quad \begin{bmatrix} 7.1 \\ 3.2 \\ -1.5 \\ 4.9 \\ 6.9 \end{bmatrix},\quad \begin{bmatrix} a & b \end{bmatrix}$$
The (3, 2) entry of the first matrix is 6. The third matrix is square
of size 3.

Lecture 1 / @ Copyright: George Nakos 4


Engineering Mathematics / The Johns Hopkins University

6.1 Matrices: General Form

A general matrix A of size m × n with (i, j) entry $a_{ij}$ is denoted by

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{bmatrix}$$

This is abbreviated by

$$A = \left[ a_{ij} \right]$$

where i and j are indices such that 1 ≤ i ≤ m and 1 ≤ j ≤ n.
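As a quick illustration (not part of the original notes), these conventions map directly onto NumPy arrays; note that NumPy indexes from 0, so the (i, j) entry $a_{ij}$ corresponds to `A[i-1, j-1]`:

```python
import numpy as np

# A 2 x 3 matrix: 2 rows, 3 columns.
A = np.array([[1.0, -2.0, 0.5],
              [4.0,  3.0, 7.0]])

m, n = A.shape      # size m x n -> (2, 3)
a_12 = A[0, 1]      # the (1, 2) entry, i.e. -2.0 (0-based indexing)
print(m, n, a_12)
```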

Lecture 1 / @ Copyright: George Nakos 5


Engineering Mathematics / The Johns Hopkins University

6.1 Vectors
If n = 1, then A is called a column matrix, or an m-vector, or a vector. If m = 1, then A is called a row matrix, or an n-row vector, or a row vector. The entries of vectors are usually called components.

The following are vectors. The first is a 2-vector, the second is a 4-vector, and the third is an n-vector.

$$\begin{bmatrix} 7 \\ -3 \end{bmatrix},\quad \begin{bmatrix} 4 \\ -3 \\ 2 \\ -1 \end{bmatrix},\quad \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$$

Here are some row vectors.

$$[-3],\quad \begin{bmatrix} 1.2 & \sqrt{3} \end{bmatrix},\quad \begin{bmatrix} a & b & c \end{bmatrix},\quad \begin{bmatrix} -2 & 3 & 0 & 4 & 1 \end{bmatrix}$$

Lecture 1 / @ Copyright: George Nakos 6


Engineering Mathematics / The Johns Hopkins University

6.1 Zero Matrices; Equal Matrices

A zero matrix, denoted by 0, is a matrix with zero entries. Here are some examples.

$$0 = [\,0\,],\quad 0 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix},\quad 0 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},\quad 0 = \begin{bmatrix} 0 & 0 \end{bmatrix},\quad 0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

We say that two matrices A and B are equal, and we write A = B, if A and B have the same size and their corresponding entries are equal. So, if

$$A = \begin{bmatrix} 1 & 2 \\ a & b \end{bmatrix},\qquad B = \begin{bmatrix} c & d \\ 3 & 4 \end{bmatrix}$$

then A = B only if a = 3, b = 4, c = 1, and d = 2.
Lecture 1 / @ Copyright: George Nakos 7
Engineering Mathematics / The Johns Hopkins University

6.1 Matrix Addition

We can add two matrices of the same size by adding the cor-
responding entries. The resulting matrix is the sum of the two
matrices.

Example We have

$$\begin{bmatrix} 1 & -3 & 0 \\ 2 & -4 & 7 \end{bmatrix} + \begin{bmatrix} 0 & 4 & 5 \\ -1 & 4 & -2 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 5 \\ 1 & 0 & 5 \end{bmatrix}$$

In general, if $A = \left[ a_{ij} \right]$ and $B = \left[ b_{ij} \right]$, for 1 ≤ i ≤ m and 1 ≤ j ≤ n, then

$$A + B = \left[ a_{ij} + b_{ij} \right]$$
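A minimal NumPy check of entrywise addition (illustrative only, not from the notes); `+` on arrays of the same shape adds corresponding entries exactly as defined above:

```python
import numpy as np

A = np.array([[1, -3, 0],
              [2, -4, 7]])
B = np.array([[0,  4,  5],
              [-1, 4, -2]])

print(A + B)    # [[1 1 5]
                #  [1 0 5]]
```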

Lecture 1 / @ Copyright: George Nakos 8


Engineering Mathematics / The Johns Hopkins University

6.1 Scalar Multiplication


We also multiply any real number c, times a matrix A, by multi-
plying all entries of A by c.

Example We have

$$2 \begin{bmatrix} 1 & 0 \\ -3 & 4 \\ 5 & -1 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ -6 & 8 \\ 10 & -2 \end{bmatrix}$$

In general, if $A = \left[ a_{ij} \right]$, then

$$cA = \left[ c\,a_{ij} \right]$$
This operation is called scalar multiplication. The multiplier c
is often called a scalar, because it scales A.
Lecture 1 / @ Copyright: George Nakos 9
Engineering Mathematics / The Johns Hopkins University

6.1 Matrices: Opposite, Difference


The matrix (−1) A is called the opposite of A and it is denoted
by −A. For example,
   
$$-\begin{bmatrix} 0 & 4 & 5 \\ -1 & 4 & -2 \end{bmatrix} = \begin{bmatrix} 0 & -4 & -5 \\ 1 & -4 & 2 \end{bmatrix}$$

The matrix A + (−1) B is denoted by A − B and it is called the


difference between A and B. This is the subtraction operation.
A − B = A + (−1) B

Example We have

$$\begin{bmatrix} 1 & -2 \\ 7 & 4 \\ 5 & -5 \\ 8 & 0 \end{bmatrix} - \begin{bmatrix} 1 & -1 \\ 6 & 3 \\ 7 & 0 \\ -3 & 7 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 1 \\ -2 & -5 \\ 11 & -7 \end{bmatrix}$$

Lecture 1 / @ Copyright: George Nakos 10


Engineering Mathematics / The Johns Hopkins University

6.1 Properties of Operations


Theorem
1. (A + B) + C = A + (B + C) (Associative Law)

2. A + B = B + A (Commutative Law)

3. A + 0 = 0 + A = A

4. A + (−A) = (−A) + A = 0

5. c(A + B) = cA + cB (Distributive Law)

6. (a + b)C = aC + bC (Distributive Law)

7. (ab)C = a(bC) = b(aC)

8. 1A = A

9. 0A = 0

Lecture 1 / @ Copyright: George Nakos 11


Engineering Mathematics / The Johns Hopkins University

6.1 Matrix Transpose

Let A be any m × n matrix. The transpose of A, denoted by AT ,


is the n × m matrix obtained from A by switching all columns of
A to rows and maintaining the same order.

Example We have

$$\begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}^{T} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix},\qquad \begin{bmatrix} a & b & c & d \end{bmatrix}^{T} = \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix},\qquad \begin{bmatrix} 1 \\ -8 \\ 3 \end{bmatrix}^{T} = \begin{bmatrix} 1 & -8 & 3 \end{bmatrix}$$

Lecture 1 / @ Copyright: George Nakos 12


Engineering Mathematics / The Johns Hopkins University

6.1 Properties of Transposition

Theorem

Let A and B be m × n matrices and let c be any scalar. Then

1. (A + B )T = AT + B T

2. (cA)T = cAT

3. $\left( A^{T} \right)^{T} = A$
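A quick numerical sanity check of properties 1–3 (an illustrative sketch, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(2, 3))
B = rng.integers(-5, 5, size=(2, 3))
c = 3

print(np.array_equal((A + B).T, A.T + B.T))   # True
print(np.array_equal((c * A).T, c * A.T))     # True
print(np.array_equal(A.T.T, A))               # True
```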

Lecture 1 / @ Copyright: George Nakos 13


Engineering Mathematics / The Johns Hopkins University

6.1 Matrices: Symmetric, skew-Symmetric


A matrix A such that $A^{T} = A$ is called symmetric. Examples:

$$\begin{bmatrix} 5 & -7 \\ -7 & 6 \end{bmatrix},\quad \begin{bmatrix} 0 & -1 & 3 \\ -1 & 4 & 9 \\ 3 & 9 & 6 \end{bmatrix},\quad \begin{bmatrix} a & b & c & d \\ b & e & f & g \\ c & f & h & i \\ d & g & i & j \end{bmatrix}$$
Note the mirror symmetry of a symmetric matrix with respect
to the upper-left to lower-right diagonal line.

A matrix A such that $A^{T} = -A$ is called skew-symmetric. Examples:

$$\begin{bmatrix} 0 & 7 \\ -7 & 0 \end{bmatrix},\quad \begin{bmatrix} 0 & -1 & 3 \\ 1 & 0 & -9 \\ -3 & 9 & 0 \end{bmatrix},\quad \begin{bmatrix} 0 & -b & -c & -d \\ b & 0 & -f & -g \\ c & f & 0 & -i \\ d & g & i & 0 \end{bmatrix}$$
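A short NumPy test of these two conditions (an illustrative sketch, not from the notes):

```python
import numpy as np

S = np.array([[ 5, -7],
              [-7,  6]])
K = np.array([[ 0,  7],
              [-7,  0]])

print(np.array_equal(S.T, S))    # True  -> S is symmetric
print(np.array_equal(K.T, -K))   # True  -> K is skew-symmetric
```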
Lecture 1 / @ Copyright: George Nakos 14
Engineering Mathematics / The Johns Hopkins University

6.1 Special Square Matrices

Let A be a square matrix of size n. The entries aii, 1 ≤ i ≤ n


form the main diagonal. A is called upper triangular, if all
entries below the main diagonal are zero, i.e., if aij = 0 for j < i.
Matrix A is called lower triangular, if the entries above the main
diagonal are all zero, so aij = 0 for i < j. If the main diagonal is
also zero, we talk about strictly upper triangular and strictly
lower triangular matrices.

A, D, E are upper triangular. B, C, D, E are lower triangular. C is


strictly lower triangular.
   
$$A = \begin{bmatrix} a & b \\ 0 & c \end{bmatrix},\quad B = \begin{bmatrix} a & 0 & 0 \\ b & c & 0 \\ d & e & f \end{bmatrix},\quad C = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \end{bmatrix},\quad D = \begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix},\quad E = \begin{bmatrix} 7 & 0 \\ 0 & 7 \end{bmatrix}$$

Lecture 1 / @ Copyright: George Nakos 15


Engineering Mathematics / The Johns Hopkins University

6.1 More Special Square Matrices

If the nondiagonal entries of a square matrix are zero, then the


matrix is called diagonal. If all entries of a diagonal matrix are
equal, then we have a scalar matrix. Matrices D and E are
diagonal. Matrix E is a scalar matrix.

A scalar matrix of size n with common diagonal entry 1 is called


an identity matrix and it is denoted by In, or by I.
 
  1 0 ··· 0
" # 1 0 0
1 0  0 1 ··· 0 
I = I2 = , I3 =  0 1 0  , . . . , In = 
   
0 1 ... ... .
. . . .. 
0 0 1
 
0 0 ··· 1

Lecture 1 / @ Copyright: George Nakos 16


Engineering Mathematics / The Johns Hopkins University

6.2 Matrix Multiplication


Let A be an m × k matrix and B be a k × n matrix. The product AB is the m × n matrix C = [c_ij] = AB, whose entries $c_{ij}$ are given below. With

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ik} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mk} \end{bmatrix},\qquad B = \begin{bmatrix} b_{11} & \cdots & b_{1j} & \cdots & b_{1n} \\ b_{21} & \cdots & b_{2j} & \cdots & b_{2n} \\ \vdots & & \vdots & & \vdots \\ b_{k1} & \cdots & b_{kj} & \cdots & b_{kn} \end{bmatrix}$$

we have

$$c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{ik} b_{kj} = \sum_{r=1}^{k} a_{ir} b_{rj}$$
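The entry formula translates directly into a triple loop; the sketch below (the helper name and the NumPy comparison are mine, not from the notes) checks it against the built-in `@` operator on the first example of the next slide:

```python
import numpy as np

def matmul_naive(A, B):
    """Compute C = AB via c_ij = sum_r a_ir * b_rj."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must agree"
    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            C[i, j] = sum(A[i, r] * B[r, j] for r in range(k))
    return C

A = np.array([[2, 0, 1],
              [2, 1, 2]])
B = np.array([[ 3, 2,  4],
              [-2, 4,  5],
              [ 0, 3, -2]])
print(matmul_naive(A, B))   # [[ 6.  7.  6.]
print(A @ B)                #  [ 4. 14.  9.]]  (same result)
```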

Lecture 1 / @ Copyright: George Nakos 17


Engineering Mathematics / The Johns Hopkins University

6.2 Matrix Multiplication

Examples

$$\begin{bmatrix} 2 & 0 & 1 \\ 2 & 1 & 2 \end{bmatrix} \begin{bmatrix} 3 & 2 & 4 \\ -2 & 4 & 5 \\ 0 & 3 & -2 \end{bmatrix} = \begin{bmatrix} 6 & 7 & 6 \\ 4 & 14 & 9 \end{bmatrix}$$

$$\begin{bmatrix} 4 & -1 & -2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \\ 3 \\ 5 \end{bmatrix} = 5$$

$$\begin{bmatrix} 1 \\ -2 \\ 3 \\ 5 \end{bmatrix} \begin{bmatrix} 4 & -1 & -2 & 1 \end{bmatrix} = \begin{bmatrix} 4 & -1 & -2 & 1 \\ -8 & 2 & 4 & -2 \\ 12 & -3 & -6 & 3 \\ 20 & -5 & -10 & 5 \end{bmatrix}$$

Lecture 1 / @ Copyright: George Nakos 18


Engineering Mathematics / The Johns Hopkins University

6.2 Properties of Matrix Multiplication


Theorem

1. (AB)C = A(BC) (Associative law)

2. A(B + C) = AB + AC (Left Distributive law)

3. (B + C)A = BA + CA (Right Distributive law)

4. a(BC) = (aB)C = B(aC)

5. Im A = AIn = A (Multiplicative identity)

6. 0A = 0 and A0 = 0

7. (AB)T = B T AT

Lecture 1 / @ Copyright: George Nakos 19


Engineering Mathematics / The Johns Hopkins University

6.2 Caution with Matrix Multiplication


AB may not equal BA. In fact, if AB is defined, then BA may
not be defined. If BA is defined, then it may not have the same
size as AB. If it does have the same size, it may still not equal
AB.

1. We say that matrix multiplication is noncommutative.

2. If two matrices A and B satisfy AB = BA, then we say that


they commute.

Example $A = \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix}$ commute.
Lecture 1 / @ Copyright: George Nakos 20
Engineering Mathematics / The Johns Hopkins University

6.2 Powers of Square Matrix


Let A be a square matrix. The product AA is also denoted by
A2. Likewise, AAA = A3 and AA · · · A = An for n factors of A. In
addition, we write A1 = A and if A is nonzero, we write A0 = I.
$$A^{n} = \underbrace{A A \cdots A}_{n\ \text{factors}},\qquad A^{1} = A,\qquad A^{0} = I$$

Example

$$A^{1} = \begin{bmatrix} 1 & -1 \\ -2 & 3 \end{bmatrix},\quad A^{2} = \begin{bmatrix} 3 & -4 \\ -8 & 11 \end{bmatrix},\quad A^{3} = \begin{bmatrix} 11 & -15 \\ -30 & 41 \end{bmatrix},\ \cdots$$

$$B^{1} = \begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix},\quad B^{2} = \begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix},\quad B^{3} = \begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix},\ \cdots$$

$$C^{1} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},\quad C^{2} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix},\quad C^{3} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix},\ \cdots$$

Lecture 1 / @ Copyright: George Nakos 21


Engineering Mathematics / The Johns Hopkins University

6.2 Motivation for Matrix Multiplication


Let x = (x1, x2) , y = (y1, y2) , and z = (z1, z2) be coordinate
frames. Suppose we go from frame y to frame z by using the
linear transformation
z1 = a11y1 + a12y2
z2 = a21y1 + a22y2
and from frame x to frame y by the linear transformation
y1 = b11x1 + b12x2
y2 = b21x1 + b22x2
If we want to go from frame x to frame z, we substitute
z1 = a11 (b11x1 + b12x2) + a12 (b21x1 + b22x2)
z2 = a21 (b11x1 + b12x2) + a22 (b21x1 + b22x2)

Lecture 1 / @ Copyright: George Nakos 22


Engineering Mathematics / The Johns Hopkins University

6.2 Motivation for Matrix Multiplication

and rearrange to get

z1 = (a11b11 + a12b21) x1 + (a11b12 + a12b22) x2


z2 = (a21b11 + a22b21) x1 + (a21b12 + a22b22) x2

Now if A and B are coefficient matrices of the first two trans-


formations and C is the coefficient matrix of the last one, then
we see that
C = AB

Lecture 1 / @ Copyright: George Nakos 23


Engineering Mathematics / The Johns Hopkins University

6.2 Applications of Matrix Multiplication


Example

Each of three appliances outlets receive and sell daily from three
factories TVs and VCRs according to the following table.
TV VCR
Factory 1 40 50
Factory 2 70 80
Factory 3 60 65
Each outlet charges the following dollar amounts per appliance.
Outlet 1 Outlet 2 Outlet 3
TV 215 258 319
VCR 305 282 264

Lecture 1 / @ Copyright: George Nakos 24


Engineering Mathematics / The Johns Hopkins University

6.2 Applications of Matrix Multiplication

Example (cont.) If A and B are the matrices of these tables,


compute and interpret the product AB.

   
$$AB = \begin{bmatrix} 40 & 50 \\ 70 & 80 \\ 60 & 65 \end{bmatrix} \begin{bmatrix} 215 & 258 & 319 \\ 305 & 282 & 264 \end{bmatrix} = \begin{bmatrix} 23\,850 & 24\,420 & 25\,960 \\ 39\,450 & 40\,620 & 43\,450 \\ 32\,725 & 33\,810 & 36\,300 \end{bmatrix}$$
The (1, 1) entry 40 · 215 + 50 · 305 = 23, 850 is the first outlet’s
revenue from selling all the appliances coming from the first
factory. The remaining entries are interpreted similarly.
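The same computation in NumPy (an illustrative sketch, not part of the original example):

```python
import numpy as np

# rows: factories; columns: TVs, VCRs shipped per day
A = np.array([[40, 50],
              [70, 80],
              [60, 65]])
# rows: TV price, VCR price; columns: outlets 1-3
B = np.array([[215, 258, 319],
              [305, 282, 264]])

revenue = A @ B          # (factory, outlet) revenue table
print(revenue[0, 0])     # 23850, outlet 1's revenue from factory 1's appliances
```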

Lecture 1 / @ Copyright: George Nakos 25


Engineering Mathematics / The Johns Hopkins University

6.3 Linear Systems and Gauss Elimination


Definition A linear system of m equations in n unknowns
x1, . . . , xn, is a set of m linear equations
a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
... (1)
am1x1 + am2x2 + · · · + amnxn = bm

The unknowns are also called variables, or indeterminates.


The numbers aij are the coefficients and the numbers bi are
the constant terms. If all constant terms are zero, then the
system is called homogeneous. The homogeneous system that
has the same coefficients as system (1) is said to be associated
with (1). If m = n, then the system is called square.
Lecture 1 / @ Copyright: George Nakos 26
Engineering Mathematics / The Johns Hopkins University

6.3 Example of Linear System

Example The system


x1 + 2x2 = −3
2x1 + 3x2 − 2x3 = −10 (2)
−x1 + 6x3 = 9
is linear square with coefficients 1, 2, 0, 2, 3, −2, −1, 0, 6, constant
terms −3, −10, 9, and associated homogeneous system
x1 + 2x2 = 0
2x1 + 3x2 − 2x3 = 0
−x1 + 6x3 = 0

Lecture 1 / @ Copyright: George Nakos 27


Engineering Mathematics / The Johns Hopkins University

6.3 More Examples of Linear Systems

The following three systems are linear. The first is from an


ancient Chinese text.∗

3x + 2y + z = 39 x1 + x2 = 5 y1 + y2 + y3 = −2
2x + 3y + z = 34 x1 − 2x2 = 6 y1 − 2y2 + 7y3 = 6
x + 2y + 3z = 26 −3x1 + x2 = 1

∗A third century BC book titled Nine Chapters of Mathematical Art. See


Carl Boyer’s A History of Mathematics (New York: Wiley).

Lecture 1 / @ Copyright: George Nakos 28


Engineering Mathematics / The Johns Hopkins University

6.3 Augmented and Coefficient Matrix


The matrix that consists of the coefficients and constant terms,
is called the augmented matrix of the system. The augmented
matrix of system (2) is
 
1 2 0 −3
2 3 −2 −10 
 

−1 0 6 9
The matrix with entries the coefficients is the coefficient matrix
of the system. The vector of all constant terms is the vector of
constants. The coefficient matrix and the vector of constants
of system (2) are
   
1 2 0 −3
 2 3 −2  and  −10 
   
−1 0 6 9

Lecture 1 / @ Copyright: George Nakos 29


Engineering Mathematics / The Johns Hopkins University

6.3 Solution of Linear System

A sequence r1, r2, . . . , rn of scalars is a (particular) solution of


system (1), if all the equations are satisfied when we substitute
x1 = r1, . . . , xn = rn. The set of all possible solutions is the
solution set. Any generic element of the solution set is called
the general solution.

If a system has solutions, it is called consistent, otherwise it is


called inconsistent.

Two linear systems with the same solution sets are called equiv-
alent. A solution that consists of zeros only is called a trivial
solution.
Lecture 1 / @ Copyright: George Nakos 30
Engineering Mathematics / The Johns Hopkins University

6.3 Elementary Row Operations


A basic solution method of a linear system is to eliminate unknowns so that
an equivalent “triangular” system is obtained. This is done by performing
elementary equation operations: (a) adding to an equation a multiple of
another, (b) multiplying an equation by a nonzero scalar, (c) switching two
equations.

For economy these operations are performed on the augmented matrix.

The elementary row operations of any matrix are:

Elimination: add a constant multiple of one row to another Ri + cRj → Ri

Scaling: multiply a row by a nonzero constant cRi → Ri

Interchange: interchange two rows Ri ↔ Rj

Lecture 1 / @ Copyright: George Nakos 31


Engineering Mathematics / The Johns Hopkins University

6.3 Example of Gauss Elimination


Example We solve the system by elimination.

x1 + 2x2 = −3
2x1 + 3x2 − 2x3 = −10
−x1 + 6x3 =9

   
$$\begin{bmatrix} 1 & 2 & 0 & -3 \\ 2 & 3 & -2 & -10 \\ -1 & 0 & 6 & 9 \end{bmatrix} \xrightarrow[\,R_3 + R_1 \to R_3\,]{\,R_2 - 2R_1 \to R_2\,} \begin{bmatrix} 1 & 2 & 0 & -3 \\ 0 & -1 & -2 & -4 \\ 0 & 2 & 6 & 6 \end{bmatrix} \xrightarrow{\,R_3 + 2R_2 \to R_3\,} \begin{bmatrix} 1 & 2 & 0 & -3 \\ 0 & -1 & -2 & -4 \\ 0 & 0 & 2 & -2 \end{bmatrix}$$

Lecture 1 / @ Copyright: George Nakos 32


Engineering Mathematics / The Johns Hopkins University

6.3 Example of Gauss Elimination


The system is in triangular form. Start at the bottom and work upwards to
eliminate unknowns above the leading variables (first variables with nonzero
coefficients) of each equation (back-substitution).

   
$$\begin{bmatrix} 1 & 2 & 0 & -3 \\ 0 & -1 & -2 & -4 \\ 0 & 0 & 2 & -2 \end{bmatrix} \xrightarrow{\,R_2 + R_3 \to R_2\,} \begin{bmatrix} 1 & 2 & 0 & -3 \\ 0 & -1 & 0 & -6 \\ 0 & 0 & 2 & -2 \end{bmatrix} \xrightarrow{\,R_1 + 2R_2 \to R_1\,} \begin{bmatrix} 1 & 0 & 0 & -15 \\ 0 & -1 & 0 & -6 \\ 0 & 0 & 2 & -2 \end{bmatrix} \xrightarrow[\,(1/2)R_3 \to R_3\,]{\,(-1)R_2 \to R_2\,} \begin{bmatrix} 1 & 0 & 0 & -15 \\ 0 & 1 & 0 & 6 \\ 0 & 0 & 1 & -1 \end{bmatrix}$$

x1 = −15, x2 = 6, x3 = −1
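A quick check of this solution with NumPy's linear solver (illustrative, not part of the notes):

```python
import numpy as np

A = np.array([[ 1, 2,  0],
              [ 2, 3, -2],
              [-1, 0,  6]], dtype=float)
b = np.array([-3, -10, 9], dtype=float)

x = np.linalg.solve(A, b)
print(x)    # [-15.   6.  -1.]
```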
Lecture 1 / @ Copyright: George Nakos 33
Engineering Mathematics / The Johns Hopkins University

6.3 Infinitely Many Solutions

Example Find the intersection of the three planes.


x + 2y − z = 4
2x + 5y + 2z = 9
x + 4y + 7z = 6

Solution: By elimination the augmented matrix of the system


1 0 −9 2
reduces to  0 1 4 1  . We get x−9z = 2, y +4z = 1. Hence,
 
0 0 0 0

x = 9r + 2
y = −4r + 1 r∈R
z=r

Lecture 1 / @ Copyright: George Nakos 34


Engineering Mathematics / The Johns Hopkins University

6.3 No Solutions

Example [No Solutions] Find the intersection of the three planes


in the (p, q, k)-coordinate system.
q − 2k = −5
2p − q + k = −2
4p − q = −4

Solution: The augmented matrix of the system reduces to

$$\begin{bmatrix} 2 & -1 & 1 & -2 \\ 0 & 1 & -2 & -5 \\ 0 & 0 & 0 & 5 \end{bmatrix}$$

The last row corresponds to the false expression 0 = 5. Hence, the system is inconsistent. Therefore, the planes do not have a common intersection.

Lecture 1 / @ Copyright: George Nakos 35


Engineering Mathematics / The Johns Hopkins University

6.3 Matrix Form of Linear System

System (1) can be written as equality of two vectors. By using


the matrix-vector product we have

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$
This is also abbreviated as

Ax = b (3)
where A is the coefficient matrix, x is the vector of the unknowns,
and b is the vector of constants.

Lecture 1 / @ Copyright: George Nakos 36


Engineering Mathematics / The Johns Hopkins University

6.3 Matrix Form of Linear System

Example Write the linear system in matrix-vector product form.

7x1 + 4x2 + 5x3 = 1


2x1 − 3x2 + 9x3 = −8

Solution: We have

$$\begin{bmatrix} 7 & 4 & 5 \\ 2 & -3 & 9 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ -8 \end{bmatrix}$$

Lecture 1 / @ Copyright: George Nakos 37


Engineering Mathematics / The Johns Hopkins University

6.3 Echelon Forms


A zero row of a matrix is a row that consists entirely of zeros. The first
nonzero entry of a nonzero row is called a leading entry. If a leading entry
happens to be 1, we call it a leading 1. Similarly, we can talk about zero
columns.

Definitions Consider the following conditions on a matrix A.

1. All zero rows are at the bottom of the matrix.

2. The leading entry of each nonzero row after the first occurs to the right
of the leading entry of the previous row.

3. The leading entry in any nonzero row is 1.

4. All entries in the column above and below a leading 1 are zero.

If A satisfies the first two conditions, we call it row echelon form. If it


satisfies all four conditions, we call it reduced row echelon form. We often
omit the word “row” and just say echelon form, or reduced echelon form.
Lecture 1 / @ Copyright: George Nakos 38
Engineering Mathematics / The Johns Hopkins University

6.3 Echelon Forms


Example Consider the matrices.

$$A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},\quad B = \begin{bmatrix} 1 & 0 & 0 & -6 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix},\quad C = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix},$$

$$D = \begin{bmatrix} 1 & 1 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 & 3 \\ 0 & 0 & 0 & 1 & 4 \end{bmatrix},\quad E = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},\quad F = \begin{bmatrix} 1 & 7 & 0 & 9 & 0 \\ 0 & 0 & 1 & -8 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix},$$

$$G = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix},\quad H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -2 \end{bmatrix}$$
Matrices A, B, D, F, G, H are in echelon form. Out of these, A, B, D, F are in
reduced echelon form. Matrices G and H are not in reduced echelon form.
For G condition 4 fails. For H condition 3 fails. Matrices C and E are not in
echelon form. For C condition 2 fails. For E condition 1 fails.
Lecture 1 / @ Copyright: George Nakos 39
Engineering Mathematics / The Johns Hopkins University

6.3 Gauss Elimination


Algorithm [Gauss Elimination] To reduce any matrix to reduced row echelon
form apply the following steps.

1. Find the leftmost nonzero column.

2. If the first row has a zero in the column of step 1, interchange it with
one that has a nonzero entry in the same column.

3. Obtain zeros below the leading entry by adding suitable multiples of the
top row to the rows below that.

4. Cover the top row and repeat the same process starting with step 1
applied to the leftover submatrix. Repeat this process with the rest of
the rows, until the matrix is in echelon form.

5. Starting with the last nonzero row work upward: For each row obtain a
leading 1 and introduce zeros above it, by adding suitable multiples to
the corresponding rows.
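The steps above can be turned into a small routine. The sketch below is mine, not from the notes: it uses partial pivoting in step 2 and merges step 5 into the forward pass (Gauss–Jordan style), which produces the same reduced echelon form since that form is unique.

```python
import numpy as np

def rref(M, tol=1e-12):
    """Reduce a copy of M to reduced row echelon form."""
    A = M.astype(float).copy()
    rows, cols = A.shape
    pivot_row = 0
    for col in range(cols):
        if pivot_row >= rows:
            break
        # Steps 1-2: find a usable pivot in this column and swap it up.
        pivot = pivot_row + np.argmax(np.abs(A[pivot_row:, col]))
        if abs(A[pivot, col]) < tol:
            continue                      # no pivot in this column
        A[[pivot_row, pivot]] = A[[pivot, pivot_row]]
        # Scale to a leading 1, then clear the column above and below (steps 3 and 5).
        A[pivot_row] = A[pivot_row] / A[pivot_row, col]
        for r in range(rows):
            if r != pivot_row:
                A[r] = A[r] - A[r, col] * A[pivot_row]
        pivot_row += 1
    return A

M = np.array([[ 0,  3,  -6, -4, -3],
              [-1,  3, -10, -4, -4],
              [ 4, -9,  34,  0,  1],
              [ 2, -6,  20,  8,  8]])
print(rref(M))
# rows approximately: [1 0 4 0 1], [0 1 -2 0 1/3], [0 0 0 1 1], [0 0 0 0 0]
```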

Lecture 1 / @ Copyright: George Nakos 40


Engineering Mathematics / The Johns Hopkins University

6.3 Gauss Elimination

Example Apply Gauss elimination to find a reduced echelon form


of the matrix.
 
$$\begin{bmatrix} 0 & 3 & -6 & -4 & -3 \\ -1 & 3 & -10 & -4 & -4 \\ 4 & -9 & 34 & 0 & 1 \\ 2 & -6 & 20 & 8 & 8 \end{bmatrix}$$

Solution:

$$R_1 \leftrightarrow R_2:\qquad \begin{bmatrix} -1 & 3 & -10 & -4 & -4 \\ 0 & 3 & -6 & -4 & -3 \\ 4 & -9 & 34 & 0 & 1 \\ 2 & -6 & 20 & 8 & 8 \end{bmatrix}$$

Lecture 1 / @ Copyright: George Nakos 41


Engineering Mathematics / The Johns Hopkins University

6.3 Gauss Elimination

The pivot now is −1, at pivot position (1, 1).

$$\xrightarrow[\,R_4 + 2R_1 \to R_4\,]{\,R_3 + 4R_1 \to R_3\,} \begin{bmatrix} -1 & 3 & -10 & -4 & -4 \\ 0 & 3 & -6 & -4 & -3 \\ 0 & 3 & -6 & -16 & -15 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

Covering the top row (Step 4), Step 1 applied to the remaining submatrix selects the second column.

Lecture 1 / @ Copyright: George Nakos 42


Engineering Mathematics / The Johns Hopkins University

6.3 Gauss Elimination

The next pivot is 3, at position (2, 2) .


$$\begin{bmatrix} -1 & 3 & -10 & -4 & -4 \\ 0 & 3 & -6 & -4 & -3 \\ 0 & 3 & -6 & -16 & -15 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{\,R_3 - R_2 \to R_3\,} \begin{bmatrix} -1 & 3 & -10 & -4 & -4 \\ 0 & 3 & -6 & -4 & -3 \\ 0 & 0 & 0 & -12 & -12 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

STEP 5: Starting with the last nonzero row work upward: for each row obtain a leading 1 and introduce zeros above it, by adding suitable multiples to the corresponding rows.

$$\xrightarrow{\,(-1/12)R_3 \to R_3\,} \begin{bmatrix} -1 & 3 & -10 & -4 & -4 \\ 0 & 3 & -6 & -4 & -3 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \qquad \begin{matrix} R_2 + 4R_3 \to R_2 \\ R_1 + 4R_3 \to R_1 \end{matrix}$$

Lecture 1 / @ Copyright: George Nakos 43


Engineering Mathematics / The Johns Hopkins University

6.3 Gauss Elimination

 
$$\begin{bmatrix} -1 & 3 & -10 & 0 & 0 \\ 0 & 3 & -6 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{\,(1/3)R_2 \to R_2\,} \begin{bmatrix} -1 & 3 & -10 & 0 & 0 \\ 0 & 1 & -2 & 0 & 1/3 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{\,R_1 - 3R_2 \to R_1\,}$$

$$\begin{bmatrix} -1 & 0 & -4 & 0 & -1 \\ 0 & 1 & -2 & 0 & 1/3 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \xrightarrow{\,(-1)R_1 \to R_1\,} \begin{bmatrix} 1 & 0 & 4 & 0 & 1 \\ 0 & 1 & -2 & 0 & 1/3 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

Lecture 1 / @ Copyright: George Nakos 44


Engineering Mathematics / The Johns Hopkins University

6.4 Linear Combinations of Vectors

Definition Let v1, v2, . . . , vk be given n-vectors and let c1, c2, . . . , ck
be any scalars. The n-vector v

v = c1v1 + c2v2 + · · · + ck vk
is called a linear combination of v1, . . . , vk . The scalars c1, . . . , ck
are called the coefficients of the linear combination. If not all ci
are zero, we have a nontrivial linear combination. If all ci are
zero, we have the trivial linear combination. The trivial linear
combination represents the zero vector.

The concept of linear combination is simple: we scale a few


vectors and then we add them.
Lecture 1 / @ Copyright: George Nakos 45
Engineering Mathematics / The Johns Hopkins University

6.4 Linear Combinations of Vectors


Example Check that the following are linear combinations of the
vectors v1, v2, and v3.
−v1 + 3v2 + 4v3, v1 + 1.5v2 − 9v3, v1 − v3

Solution: We have
−v1 + 3v2 + 4v3 = (−1) v1 + 3v2 + 4v3
v1 + 1.5v2 − 9v3 = 1v1 + (1.5) v2 + (−9) v3
v1 − v3 = 1v1 + 0v2 + (−1) v3

Note that the difference v1 − v2 between two vectors v1 and v2


is the linear combination 1v1 + (−1) v2 with coefficients 1 and
−1.
Lecture 1 / @ Copyright: George Nakos 46
Engineering Mathematics / The Johns Hopkins University

6.4 Application of Linear Combinations


A sports company owns two factories, each making aluminum and titanium mountain bikes. The first factory makes 150 aluminum and 15 titanium bikes a day. For the second factory the numbers are 220 and 20, respectively. If

$$v_1 = \begin{bmatrix} 150 \\ 15 \end{bmatrix} \quad\text{and}\quad v_2 = \begin{bmatrix} 220 \\ 20 \end{bmatrix},$$

compute and discuss the meaning of:

(a) v1 + v2

(b) v2 − v1

(c) 10v1

(d) av1 + bv2, for a, b > 0.

Lecture 1 / @ Copyright: George Nakos 47


Engineering Mathematics / The Johns Hopkins University

6.4 Application of Linear Combinations


Solution:

(a) $v_1 + v_2 = \begin{bmatrix} 370 \\ 35 \end{bmatrix}$ represents the total number of aluminum (370) and titanium (35) bikes produced by the two factories in one day.

(b) $v_2 - v_1 = \begin{bmatrix} 70 \\ 5 \end{bmatrix}$ represents how many more bikes the second factory makes over the first one in one day.

(c) $10 v_1 = \begin{bmatrix} 1500 \\ 150 \end{bmatrix}$ represents how many bikes the first factory makes in 10 days.

(d) $a v_1 + b v_2 = \begin{bmatrix} 150a + 220b \\ 15a + 20b \end{bmatrix}$ represents the total number of bikes produced if the first factory operates for a days and the second for b days.
Lecture 1 / @ Copyright: George Nakos 48
Engineering Mathematics / The Johns Hopkins University

6.4 Linear Dependence


Definition The sequence of m-vectors v1 , . . . , vk is linearly dependent (or
the vectors are linearly dependent), if there are scalars c1 , . . . , ck not all zero
such that
c1 v1 + · · · + ck vk = 0 (4)
So, there is a nontrivial linear combination of the vis representing the zero
vector. Equation (4) with not all ci zero is called a linear dependence
relation of the vis.
Example The vectors

$$\begin{bmatrix} 1 \\ -1 \\ 3 \\ 4 \end{bmatrix},\quad \begin{bmatrix} 1 \\ 2 \\ 0 \\ 2 \end{bmatrix},\quad \begin{bmatrix} 4 \\ 14 \\ -6 \\ 4 \end{bmatrix}$$

are linearly dependent, because if we let c1 = 2, c2 = −6, and c3 = 1, then

$$2\begin{bmatrix} 1 \\ -1 \\ 3 \\ 4 \end{bmatrix} + (-6)\begin{bmatrix} 1 \\ 2 \\ 0 \\ 2 \end{bmatrix} + 1\begin{bmatrix} 4 \\ 14 \\ -6 \\ 4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

Lecture 1 / @ Copyright: George Nakos 49


Engineering Mathematics / The Johns Hopkins University

6.4 Linear Dependence


     

 0 1 3 
Example Let S =  −2  ,  2  ,  14  .
     
 
 3 7 9 

(a) Show that S is linearly dependent.

(b) Find a linear dependence relation.

Solution: (a) We seek c1, c2, c3 not all zero such that

$$c_1 \begin{bmatrix} 0 \\ -2 \\ 3 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 2 \\ 7 \end{bmatrix} + c_3 \begin{bmatrix} 3 \\ 14 \\ 9 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

Lecture 1 / @ Copyright: George Nakos 50


Engineering Mathematics / The Johns Hopkins University

6.4 Linear Dependence


Equivalently, we seek nontrivial solutions to the homogeneous linear system

$$\begin{bmatrix} 0 & 1 & 3 \\ -2 & 2 & 14 \\ 3 & 7 & 9 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

We solve this system to get c1 = 4r, c2 = −3r, c3 = r. There are nontrivial solutions, hence the set is linearly dependent.

(b) To get a particular linear dependence relation we assign a nonzero value to the parameter r. For example, if r = 1, then we have

$$4\begin{bmatrix} 0 \\ -2 \\ 3 \end{bmatrix} + (-3)\begin{bmatrix} 1 \\ 2 \\ 7 \end{bmatrix} + 1\begin{bmatrix} 3 \\ 14 \\ 9 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
This is one of infinitely many linear dependence relations.

Lecture 1 / @ Copyright: George Nakos 51


Engineering Mathematics / The Johns Hopkins University

6.4 Linear Independence

Definition The set of m-vectors {v1, . . . , vk } is called linearly


independent, if it is not linearly dependent. This is the same
as saying that there is no linear dependence relation among
v1, . . . , vk . So, all nontrivial linear combinations of the vis yield
nonzero vectors. Equivalently, we have

if c1v1 + · · · + ck vk = 0, then c1 = 0, . . . , ck = 0
In other words, the homogeneous system [v1 · · · vk ] c = 0 has
only the trivial solution. We often say that v1, . . . , vk are linearly
independent.

Lecture 1 / @ Copyright: George Nakos 52


Engineering Mathematics / The Johns Hopkins University

6.4 Linear Independence


Example Show that $\left\{ \begin{bmatrix} 1 \\ -2 \end{bmatrix}, \begin{bmatrix} 5 \\ 3 \end{bmatrix} \right\}$ is linearly independent in $\mathbb{R}^2$.

Solution: Let c1 and c2 be scalars such that

$$c_1 \begin{bmatrix} 1 \\ -2 \end{bmatrix} + c_2 \begin{bmatrix} 5 \\ 3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

We solve the system with augmented matrix $\begin{bmatrix} 1 & 5 & 0 \\ -2 & 3 & 0 \end{bmatrix}$ to get c1 = 0 and c2 = 0. Therefore, the set is linearly independent.

Lecture 1 / @ Copyright: George Nakos 53


Engineering Mathematics / The Johns Hopkins University

6.4 Linear Independence


Example Show that S is linearly independent, where

$$S = \left\{ \begin{bmatrix} 2 \\ 3 \\ 2 \\ 4 \end{bmatrix}, \begin{bmatrix} 8 \\ -6 \\ 5 \\ 0 \end{bmatrix}, \begin{bmatrix} -4 \\ 3 \\ 1 \\ -6 \end{bmatrix} \right\}$$

Solution: We only need to count the number of pivots of the coefficient matrix.

$$\begin{bmatrix} 2 & 8 & -4 \\ 3 & -6 & 3 \\ 2 & 5 & 1 \\ 4 & 0 & -6 \end{bmatrix} \sim \begin{bmatrix} 2 & 8 & -4 \\ 0 & -3 & 5 \\ 0 & 0 & -21 \\ 0 & 0 & 0 \end{bmatrix}$$
This number is 3, the same as the number of columns, so the
set is linearly independent.
Lecture 1 / @ Copyright: George Nakos 54
Engineering Mathematics / The Johns Hopkins University

6.4 Rank of Matrix


The rank of a m × n matrix A is the maximum number of linearly
independent rows of A.

1. The rank equals the maximum number of linearly indepen-


dent columns of A.

2. The rank is the same as the number of the pivots of A.

3. A and AT have the same rank.

To compute the rank of A we reduce A to echelon form and count the number of nonzero rows or the number of pivot columns.
Lecture 1 / @ Copyright: George Nakos 55
Engineering Mathematics / The Johns Hopkins University

6.4 Example of Rank


 
The rank of

$$A = \begin{bmatrix} 1 & 2 & 2 & -1 \\ 1 & 3 & 1 & -2 \\ 1 & 1 & 3 & 0 \\ 0 & 1 & -1 & -1 \\ 1 & 2 & 2 & -1 \end{bmatrix}$$

is 2, because A reduces to the row echelon form matrix

$$B = \begin{bmatrix} 1 & 2 & 2 & -1 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
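The same answer from NumPy (an illustrative check, not part of the notes):

```python
import numpy as np

A = np.array([[1, 2,  2, -1],
              [1, 3,  1, -2],
              [1, 1,  3,  0],
              [0, 1, -1, -1],
              [1, 2,  2, -1]])

print(np.linalg.matrix_rank(A))   # 2
```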

Lecture 1 / @ Copyright: George Nakos 56


Engineering Mathematics / The Johns Hopkins University

Lecture 2 in Engineering
Mathematics
George Nakos

Engineering Mathematics
The Johns Hopkins University

Fall 2004

Lecture 1 / @ Copyright: George Nakos 57


Engineering Mathematics / The Johns Hopkins University

Part 1: Linear Algebra

1. The Rank of a Matrix; Linear Independence

2. Vector Space, Basis, Dimension

3. Determinants; Cramer’s Rule

4. Matrix Inversion

5. Inner Product Spaces

6. Linear Transformations

Lecture 1 / @ Copyright: George Nakos 58


Engineering Mathematics / The Johns Hopkins University

6.4 Vector Space: Definition


Definition Let V be a set equipped with two operations named
addition and scalar multiplication. Addition is a map that
associates any two elements u and v of V with a third one,
called the sum of u and v and denoted by u + v.
V × V → V, (u, v) → u + v
Scalar multiplication is a map that associates any real scalar c
and any element u of V with another element of V, called the
scalar multiple of u by c and denoted by cu.
R × V → V, (c, u) → cu
Such a set V is called a (real) vector space, if the two operations
satisfy the following properties, known as axioms for a vector
space.
Lecture 2 / @ Copyright: George Nakos 59
Engineering Mathematics / The Johns Hopkins University

6.4 Vector Space: Definition


Addition

(A1) u + v belongs to V for all u, v ∈ V.

(A2) u + v = v + u for all u, v ∈ V. (Commutative Law)

(A3) (u + v) + w = u + (v + w) for all u, v, w ∈ V. (Associative Law)

(A4) There exists a unique element 0 ∈ V, called the zero of V, such that for
all u in V
u+0=0+u=u

(A5) For each u ∈ V there exists a unique element −u ∈ V, called the negative
or opposite of u, such that
u + (−u) = (−u) + u = 0

Lecture 2 / @ Copyright: George Nakos 60


Engineering Mathematics / The Johns Hopkins University

6.4 Vector Space: Examples


Scalar Multiplication

(M1) c u belongs to V for all u ∈ V and all c ∈ R.

(M2) c(u + v) = cu + cv for all u, v ∈ V and all c ∈ R. (Distributive Law)

(M3) (c + d)u = cu + du for all u ∈ V and all c, d ∈ R. (Distributive Law)

(M4) c(du) = (cd)u for all u ∈ V and all c, d ∈ R.

(M5) 1u = u for all u ∈ V.

The elements of a vector space are called vectors. Axioms (A1) and (M1)
are also expressed by saying that V is closed under addition and is closed
under scalar multiplication. Note that a vector space is a nonempty set,
because it has a zero by (A4).

Lecture 2 / @ Copyright: George Nakos 61


Engineering Mathematics / The Johns Hopkins University

6.4 Vector Space: Examples

1. The set Rn of all n-vectors with real components.


Operations: The usual vector addition and scalar multiplication. Zero:
The zero n-vector 0. Axioms: For the axioms see the Properties of
Matrix Operations Theorem.

2. The set Mmn of all m × n matrices with real entries.


Operations: The usual matrix addition and scalar multiplication. Zero:
The m × n zero matrix 0. Axioms: For the axioms see the Properties of
Matrix Operations Theorem.

3. The set P of all polynomials with real coefficients.

Lecture 2 / @ Copyright: George Nakos 62


Engineering Mathematics / The Johns Hopkins University

6.4 Vector Space: Examples


Operations:

1. Addition: The sum of two polynomials is formed by adding the coeffi-


cients of the same powers of x of the polynomials. Explicitly, if
p1 = a0 + a1 x + · · · + anxn, p2 = b0 + b1 x + · · · + bm xm , n≥m
we write p2 as p2 = b0 + b1 x + · · · + bnxn, by adding zeros if necessary, and
form the sum
p1 + p2 = (a0 + b0 ) + (a1 + b1 ) x + · · · + (an + bn) xn

2. Scalar multiplication: This is multiplication of a polynomial through by


a constant.
cp1 = (ca0 ) + (ca1 ) x + · · · + (can) xn

3. Zero: The zero polynomial, 0, is the polynomial with zeros as coeffi-


cients.

4. Axioms: The verification of the axioms is left as exercise.


Lecture 2 / @ Copyright: George Nakos 63
Engineering Mathematics / The Johns Hopkins University

6.4 Vector Space: Examples


The set F (R) of all real valued functions defined on R.
Operations: Let f and g be two real valued functions with domain R and let
c be any scalar.

1. Addition: We define the sum f + g of f and g as the function whose


values are given by
(f + g)(x) = f (x) + g(x) for all x ∈ R

2. Scalar multiplication: The scalar product cf is defined by


(c f )(x) = c f (x) for all x ∈ R

3. Zero: The zero function 0 is the function whose values are all zero.
0(x) = 0 for all x ∈ R

4. Axioms: The verification of the axioms is left as exercise.


Lecture 2 / @ Copyright: George Nakos 64
Engineering Mathematics / The Johns Hopkins University

6.4 Vector Space: Examples


Is $\mathbb{R}^2$ with the usual addition and the following scalar multiplication, denoted by $\odot$, a vector space?

$$c \odot \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} c a_1 \\ a_2 \end{bmatrix}$$

Solution: It is not a vector space, because

$$(c + d) \odot \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} (c + d) a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} c a_1 + d a_1 \\ a_2 \end{bmatrix}$$

and

$$c \odot \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} + d \odot \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} c a_1 \\ a_2 \end{bmatrix} + \begin{bmatrix} d a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} c a_1 + d a_1 \\ 2 a_2 \end{bmatrix}$$

So $(c + d) \odot \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} \neq c \odot \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} + d \odot \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$ whenever $a_2 \neq 0$, and axiom (M3) fails.
Lecture 2 / @ Copyright: George Nakos 65
Engineering Mathematics / The Johns Hopkins University

6.4 Subspaces
Definition A subset W of a vector space V is called a subspace of
V, if W itself is a vector space under the same addition and scalar
multiplication as V . In particular, a subspace always contains the
zero element.

Theorem Let W be a nonempty subset W of a vector space


V. Then W is a subspace of V if and only if it is closed under
addition (axiom (A1)) and scalar multiplication (axiom (M1)),
that is, if and only if

1. If u and v are in W, then u + v is in W.

2. If c is any scalar and u is in W, then c u is in W .

Lecture 2 / @ Copyright: George Nakos 66


Engineering Mathematics / The Johns Hopkins University

6.4 Examples of Subspaces

1. The set W = {cv, c ∈ R} of all scalar multiples of the fixed


vector v of a vector space V is a subspace of V.

2. {0} and V are subspaces of V. These are the trivial sub-


spaces of V. {0} is called the zero subspace.

3. The set Pn that consists of all polynomials of degree ≤ n and


the zero polynomial is a subspace of P.

4. The set C(R) of all continuous real valued functions defined


on R is a subspace of F (R).

Lecture 2 / @ Copyright: George Nakos 67


Engineering Mathematics / The Johns Hopkins University

6.4 Linear Combinations and Span


If v1, . . . , vn are vectors from a vector space V and c1, . . . , cn are
scalars, then the expression
c1 v1 + · · · + cn vn
is well defined and is called a linear combination of v1, . . . , vn.
If not all ci are zero, we have a nontrivial linear combination.
If all ci are zero, we have the trivial linear combination. The
trivial linear combination represents the zero vector.

The set of all linear combinations of v1, . . . , vk is called the span


of these vectors and it is denoted by
Span {v1, v2, . . . , vk }
If V = Span{v1, . . . , vk }, we say that v1, . . . , vk span V and that
{v1, . . . , vk } is a spanning set of V.
Lecture 2 / @ Copyright: George Nakos 68
Engineering Mathematics / The Johns Hopkins University

6.4 Linear Combinations; Span


Example Let V be a vector space and let v1 , v2 be in V. The following vectors
are in Span{v1 , v2 }.
0, v1, v2 , v1 + v2 , −2v1 , 3v1 − 2v2
Example Let V be a vector space and v be in V. Span{v} is the set of all
scalar multiples of v.
Span{v} = {cv , c ∈ R}
Example Let p = −1 + x − 2x2 in P3 . Show that p ∈ Span {p1 , p2 , p3 } , where
p1 = x − x2 + x3 , p2 = 1 + x + 2x3 , p3 = 1 + x
Solution: Let c1, c2, c3 be scalars such that

$$-1 + x - 2x^2 = c_1\left(x - x^2 + x^3\right) + c_2\left(1 + x + 2x^3\right) + c_3\,(1 + x)$$

Then

$$-1 + x - 2x^2 = (c_2 + c_3) + (c_1 + c_2 + c_3)\,x - c_1 x^2 + (c_1 + 2c_2)\,x^3$$

Equating coefficients of the same powers of x yields the linear system

$$c_2 + c_3 = -1,\quad c_1 + c_2 + c_3 = 1,\quad -c_1 = -2,\quad c_1 + 2c_2 = 0$$

with solution c1 = 2, c2 = −1, c3 = 0. Therefore, p is in the span of p1, p2, p3.
Lecture 2 / @ Copyright: George Nakos 69
Engineering Mathematics / The Johns Hopkins University

6.4 Linear Independence


Definition Let v1 , . . . , vn be vectors of a vector space V. Then {v1 , . . . , vn} is
linearly dependent, if there are scalars c1 , . . . , cn not all zero such that
c1 v1 + · · · + cn vn = 0 (5)
So, there are nontrivial linear combinations that represent the zero vector.
Equation (5) with not all ci zero is called a linear dependence relation of
the vis.
Example Show that the set {2 − x + x2 , 2x + x2 , 4 − 4x + x2 } is linearly
dependent in P3 .
Solution: This is true, because

$$2\,(2 - x + x^2) + (-1)\,(2x + x^2) + (-1)\,(4 - 4x + x^2) = 0$$


Definition The set of vectors {v1 , . . . , vn} from a vector space V is called lin-
early independent, if it is not linearly dependent. This is the same as saying
that there is no linear dependence relation among v1 , . . . , vk . Equivalently,
c1 v1 + · · · + ck vk = 0 ⇒ c1 = 0, . . . , ck = 0
So, every nontrivial linear combination is nonzero.
Lecture 2 / @ Copyright: George Nakos 70
Engineering Mathematics / The Johns Hopkins University

6.4 Linear Independence

Example Show that the set {E11, E12, E21, E22} is linearly independent in M22.

Solution: Let

$$c_1 \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + c_2 \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + c_3 \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + c_4 \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \;\Rightarrow\; \begin{bmatrix} c_1 & c_2 \\ c_3 & c_4 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

Hence, c1 = c2 = c3 = c4 = 0. So, the set is linearly independent.
Example Are $1 + x$, $-1 + x$, $4 - x^2$, $2 + x^3$ linearly independent in P3?

Solution: If a linear combination of these polynomials is the zero polynomial, then

$$c_1(1 + x) + c_2(-1 + x) + c_3\left(4 - x^2\right) + c_4\left(2 + x^3\right) = 0 \;\Rightarrow$$
$$(c_1 - c_2 + 4c_3 + 2c_4) + (c_1 + c_2)\,x + (-c_3)\,x^2 + c_4\,x^3 = 0$$

Equating coefficients yields

$$c_1 - c_2 + 4c_3 + 2c_4 = 0,\quad c_1 + c_2 = 0,\quad -c_3 = 0,\quad c_4 = 0$$

We solve this linear system to get c1 = c2 = c3 = c4 = 0. So, the vectors are linearly independent in P3.
Lecture 2 / @ Copyright: George Nakos 71
Engineering Mathematics / The Johns Hopkins University

6.4 Basis
Definition A subset {v1, . . . , vn} of a nonzero vector space V is
a basis of V, if

1. it is linearly independent, and

2. it spans V.

The empty set is, by definition, the only basis of the zero vector
space {0}.

Here are some examples of bases.


Lecture 2 / @ Copyright: George Nakos 72
Engineering Mathematics / The Johns Hopkins University

6.4 Examples of Bases

1. The standard basis vectors e1, e2, . . ., en in Rn form a basis of


Rn .

2. {1, x, x2, . . . , xn} is a basis of Pn, called the standard basis


of Pn.

3. {E11, E12, E13, . . . , Emn} is a basis of Mmn, called the stan-


dard basis of Mmn.

Lecture 2 / @ Copyright: George Nakos 73


Engineering Mathematics / The Johns Hopkins University

6.4 Examples of Bases


Example Show that B = {1 + x, −1 + x, x2} is a basis of P2.

(a) To show that B spans P2, we want every polynomial p =


a + bx + cx2 to be a linear combination in B. So, we look for
scalars c1, c2, c3 such that
c1(1 + x) + c2(−1 + x) + c3x2 = a + bx + cx2 ⇒
(c1 − c2) + (c1 + c2)x + c3x2 = a + bx + cx2
which leads to the system c1 − c2 = a, c1 + c2 = b, c3 = c. We have

$$\begin{bmatrix} 1 & -1 & 0 & a \\ 1 & 1 & 0 & b \\ 0 & 0 & 1 & c \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 & \tfrac{1}{2}a + \tfrac{1}{2}b \\ 0 & 1 & 0 & -\tfrac{1}{2}a + \tfrac{1}{2}b \\ 0 & 0 & 1 & c \end{bmatrix}$$
Lecture 2 / @ Copyright: George Nakos 74
so the system is consistent for all choices of a, b, c. Thus, B
spans P2.

(b) To show that B is linearly independent, let c1, c2, c3 be such


that
c1(1 + x) + c2(−1 + x) + c3x2 = 0 ⇒
(c1 − c2) + (c1 + c2)x + c3x2 = 0
Hence, we have the system c1 − c2 = 0, c1 + c2 = 0, c3 = 0.
Now
   
$$\begin{bmatrix} 1 & -1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}$$
So the system has only the trivial solution c1 = c2 = c3 = 0.
Thus, B is linearly independent.
Engineering Mathematics / The Johns Hopkins University

6.4 Characterization of Basis

One of the main characterizations of a basis is described in the


following theorem.

Theorem A subset B = {v1, . . . , vn} of a vector space V is a


basis of V if and only if for each vector v in V there are unique
scalars c1, . . . , cn such that

v = c1v1 + · · · + cnvn

Lecture 2 / @ Copyright: George Nakos 75


Engineering Mathematics / The Johns Hopkins University

6.4 Dimension

Definitions If a vector space V has a basis with n elements,


then V is called finite dimensional and we say that n is the
dimension of V. We write

dim(V ) = n

The dimension is a well-defined number and does not depend on the choice of basis. The dimension of the zero space {0} is defined to be zero. A vector space that has no finite spanning set is called infinite dimensional.

By counting the number of elements of the standard bases we


see that dim(Rn) = n, dim(Pn) = n + 1, dim(Mmn) = m · n.
Lecture 2 / @ Copyright: George Nakos 76
Engineering Mathematics / The Johns Hopkins University

6.6 Determinants

Let $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$. The determinant, det(A), of A is the number

$$\det(A) = a_{11} a_{22} - a_{12} a_{21}$$

Let A be

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

The determinant of A in terms of 2 × 2 determinants is the number

$$\det(A) = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$$
Lecture 2 / @ Copyright: George Nakos 77


Engineering Mathematics / The Johns Hopkins University

6.6 Determinants

In the same manner we can define determinants of 4 × 4 matrices.

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{vmatrix} = a_{11} \begin{vmatrix} a_{22} & a_{23} & a_{24} \\ a_{32} & a_{33} & a_{34} \\ a_{42} & a_{43} & a_{44} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} & a_{24} \\ a_{31} & a_{33} & a_{34} \\ a_{41} & a_{43} & a_{44} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} & a_{24} \\ a_{31} & a_{32} & a_{34} \\ a_{41} & a_{42} & a_{44} \end{vmatrix} - a_{14} \begin{vmatrix} a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{vmatrix}$$

Lecture 2 / @ Copyright: George Nakos 78


Engineering Mathematics / The Johns Hopkins University

6.6 Determinants

Example Find det(C), if

$$C = \begin{bmatrix} 1 & 2 & 0 & 1 \\ -1 & 1 & 2 & 0 \\ -2 & 1 & 0 & -2 \\ 1 & 0 & 2 & -1 \end{bmatrix}$$

Solution: det(C) equals

$$1 \begin{vmatrix} 1 & 2 & 0 \\ 1 & 0 & -2 \\ 0 & 2 & -1 \end{vmatrix} - 2 \begin{vmatrix} -1 & 2 & 0 \\ -2 & 0 & -2 \\ 1 & 2 & -1 \end{vmatrix} + 0 \begin{vmatrix} -1 & 1 & 0 \\ -2 & 1 & -2 \\ 1 & 0 & -1 \end{vmatrix} - 1 \begin{vmatrix} -1 & 1 & 2 \\ -2 & 1 & 0 \\ 1 & 0 & 2 \end{vmatrix}$$

$$= 1 \cdot 6 - 2 \cdot (-12) + 0 \cdot (-3) - 1 \cdot 0 = 30$$
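A one-line NumPy confirmation (illustrative, not part of the notes); the library computes determinants in floating point, so the result is a float close to 30:

```python
import numpy as np

C = np.array([[ 1, 2, 0,  1],
              [-1, 1, 2,  0],
              [-2, 1, 0, -2],
              [ 1, 0, 2, -1]], dtype=float)

print(np.linalg.det(C))   # ~30.0 (up to floating-point rounding)
```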

Lecture 2 / @ Copyright: George Nakos 79


Engineering Mathematics / The Johns Hopkins University

6.6 Cofactor Expansion


We have introduced what is known as the cofactor expansion of a deter-
minant about its first row. Each entry of the first row is multiplied by the
corresponding minor and each such product is multiplied by ±1 depending on
the position of the entry. The signed products were added together. Actually,
instead of the first row we can use any row or column. Here is how: Let
 
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$

First we assign the sign $(-1)^{i+j}$ to the entry $a_{ij}$ of A. This is a checkerboard pattern of ±'s.

$$\begin{bmatrix} + & - & + & \cdots \\ - & + & - & \cdots \\ + & - & + & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}$$

Then we pick a row or column and multiply each entry aij of it by the corre-
sponding signed minor (−1)i+j Mij . Lastly, we add all these products.
Lecture 2 / @ Copyright: George Nakos 80
Engineering Mathematics / The Johns Hopkins University

6.6 Determinants and Row Reduction


The signed minor (−1)i+j Mij is called the (i, j) cofactor, of A and is denoted
by Cij .
Cij = (−1)i+j Mij

1. Cofactor Expansion about the ith row The determinant of A can be


expanded about the ith row in terms of the cofactors as follows.
det A = ai1 Ci1 + ai2 Ci2 + · · · + ainCin

2. Cofactor Expansion about the jth column The determinant of A can


be expanded about the jth column in terms of the cofactors as follows.
det A = a1j C1j + a2j C2j + · · · + anj Cnj

This method of computing determinants by using cofactors is called the co-


factor, or Laplace expansion and it is attributed to Vandermonde and to
Laplace.

Lecture 2 / @ Copyright: George Nakos 81


Engineering Mathematics / The Johns Hopkins University

6.6 Properties of Determinants


1. A and its transpose have the same determinant, det(A) = det(AT ). For example,

a1 a2 a3 a1 b1 c1
b1 b2 b3 = a2 b2 c2
c1 c2 c3 a3 b3 c3

2. Let B be obtained from A by multiplying one of its rows (or columns) by a nonzero constant k. Then det(B) = k det(A). For example,

a1 a2 a3 a1 a2 a3 a1 a2 ka3 a1 a2 a3
kb1 kb2 kb3 = k b1 b2 b3 , b1 b2 kb3 = k b1 b2 b3
c1 c2 c3 c1 c2 c3 c1 c2 kc3 c1 c2 c3

3. Let B be obtained from A by interchanging any two rows (or columns). Then det(B) =
− det(A). For example,

a1 a2 a3 b1 b2 b3 a1 a2 a3 a3 a2 a1
b1 b2 b3 = − a1 a2 a3 , b1 b2 b3 = − b3 b2 b1
c1 c2 c3 c1 c2 c3 c1 c2 c3 c3 c2 c1

4. Let B be obtained from A by adding a multiple of one row (or column) to another.
Then det(B) = det(A). For example,

a1 a2 a3 a1 a2 a3
ka1 + b1 ka2 + b2 ka3 + b3 = b1 b2 b3
c1 c2 c3 c1 c2 c3

Lecture 2 / @ Copyright: George Nakos 82


Engineering Mathematics / The Johns Hopkins University

6.6 Properties of Determinants


Note that

1. Elimination Ri + cRj → Ri, does not change the determinant.

2. Scaling, cRi → Ri, scales the determinant by c.

3. Interchange, Ri ↔ Rj , changes the sign of the determinant.

The properties of determinants can be used to compute a de-


terminant as follows. We convert it to triangular form by Gauss
elimination and then multiply the diagonal entries of the trian-
gular form.
Lecture 2 / @ Copyright: George Nakos 83
Engineering Mathematics / The Johns Hopkins University

6.6 Determinants by Row Reduction



$$\begin{vmatrix} 1 & 2 & 3 & -1 & 8 \\ 0 & 0 & 4 & 2 & -1 \\ 0 & -5 & 5 & 3 & 7 \\ 0 & 0 & 0 & 1 & 6 \\ 1 & 2 & 3 & -2 & -9 \end{vmatrix} = \begin{vmatrix} 1 & 2 & 3 & -1 & 8 \\ 0 & 0 & 4 & 2 & -1 \\ 0 & -5 & 5 & 3 & 7 \\ 0 & 0 & 0 & 1 & 6 \\ 0 & 0 & 0 & -1 & -17 \end{vmatrix} \qquad (-R_1 + R_5 \to R_5)$$

$$= -\begin{vmatrix} 1 & 2 & 3 & -1 & 8 \\ 0 & -5 & 5 & 3 & 7 \\ 0 & 0 & 4 & 2 & -1 \\ 0 & 0 & 0 & 1 & 6 \\ 0 & 0 & 0 & -1 & -17 \end{vmatrix} \qquad (R_2 \leftrightarrow R_3)$$

$$= -\begin{vmatrix} 1 & 2 & 3 & -1 & 8 \\ 0 & -5 & 5 & 3 & 7 \\ 0 & 0 & 4 & 2 & -1 \\ 0 & 0 & 0 & 1 & 6 \\ 0 & 0 & 0 & 0 & -11 \end{vmatrix} \qquad (R_4 + R_5 \to R_5)$$

$$= -\left[\, 1 \cdot (-5) \cdot 4 \cdot 1 \cdot (-11) \,\right] = -220$$

Lecture 2 / @ Copyright: George Nakos 84


Engineering Mathematics / The Johns Hopkins University

6.6 Cramer’s Rule


Let Ax = b be a square system with

$$A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix},\quad x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix},\quad b = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$$

Let $A_i$ denote the matrix obtained from A by replacing the ith column with b.

$$A_i = \begin{bmatrix} a_{11} & \cdots & a_{1,i-1} & b_1 & a_{1,i+1} & \cdots & a_{1n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & \cdots & a_{n,i-1} & b_n & a_{n,i+1} & \cdots & a_{nn} \end{bmatrix}$$

Cramer’s Rule gives an explicit formula for the solution of a consistent square
system.
Cramer’s Rule If det(A) 6= 0, then the system Ax = b has a unique solution
x = (x1, . . . , xn) given by
$$x_1 = \frac{\det(A_1)}{\det(A)},\quad x_2 = \frac{\det(A_2)}{\det(A)},\quad \ldots,\quad x_n = \frac{\det(A_n)}{\det(A)}$$

Lecture 2 / @ Copyright: George Nakos 85


Engineering Mathematics / The Johns Hopkins University

6.6 Cramer’s Rule


Example Use Cramer’s Rule to solve the system.
x 1 + x 2 − x3 = 2
x 1 − x 2 + x3 = 3
−x1 + x2 + x3 = 4

Solution: We compute the determinant of the coefficient matrix A and the determinants of

$$A_1 = \begin{bmatrix} 2 & 1 & -1 \\ 3 & -1 & 1 \\ 4 & 1 & 1 \end{bmatrix},\quad A_2 = \begin{bmatrix} 1 & 2 & -1 \\ 1 & 3 & 1 \\ -1 & 4 & 1 \end{bmatrix},\quad A_3 = \begin{bmatrix} 1 & 1 & 2 \\ 1 & -1 & 3 \\ -1 & 1 & 4 \end{bmatrix}$$

to get det(A) = −4, det(A1) = −10, det(A2) = −12, det(A3) = −14. Hence,

$$x_1 = \frac{\det(A_1)}{\det(A)} = \frac{5}{2},\quad x_2 = \frac{\det(A_2)}{\det(A)} = 3,\quad x_3 = \frac{\det(A_3)}{\det(A)} = \frac{7}{2}$$
Lecture 2 / @ Copyright: George Nakos 86
Engineering Mathematics / The Johns Hopkins University

6.7 Matrix Inverse

Definition An n×n matrix A is invertible, if there exists a matrix


B such that
AB = I and BA = I
In such case B is called an inverse of A. If no such B exists
for A, then we say that A is noninvertible. Another name for
invertible is nonsingular and another name for noninvertible is
singular.

Note that the definition forces B to be square of size n (why?).

Lecture 2 / @ Copyright: George Nakos 87


Engineering Mathematics / The Johns Hopkins University

6.7 Matrix Inverse

Theorem An invertible matrix has only one inverse.

Proof: Suppose that the invertible matrix A has two inverses B


and C. Then

B = BIn = B(AC) = (BA)C = InC = C


Therefore, B = C. 

The unique inverse of an invertible matrix A is denoted by A−1.


So
AA−1 = I and A−1A = I

Lecture 2 / @ Copyright: George Nakos 88


Engineering Mathematics / The Johns Hopkins University

6.7 Matrix Inverse


Next we see how to compute the inverse of an invertible matrix A. The idea
is simple: If A−1 has unknown columns xi, then AA−1 = I takes the form
[Ax1 · · · Axn] = [e1 · · · en]
This matrix equation splits into n linear systems
Ax1 = e1 , . . . , Axn = en
which we solve to find each column xi of A−1 . These systems have the same
coefficient matrix A. Solving each system separately would amount into
n − 1 unnecessary row reductions of A. It is smarter to solve the systems
simultaneously, by simply row reducing the matrix
[A : I]
If we get a matrix of the form [I : B] , then the ith column of B would be xi.
Thus, B = A−1 . So, in order to compute A−1 , we just row reduce [A : I] .

Lecture 2 / @ Copyright: George Nakos 89


Engineering Mathematics / The Johns Hopkins University

6.7 Matrix Inverse


 
Example Compute $A^{-1}$, if $A = \begin{bmatrix} 1 & 0 & -1 \\ 3 & 4 & -2 \\ 3 & 5 & -2 \end{bmatrix}$.

Solution: We row reduce [A : I].

$$\left[\begin{array}{ccc|ccc} 1 & 0 & -1 & 1 & 0 & 0 \\ 3 & 4 & -2 & 0 & 1 & 0 \\ 3 & 5 & -2 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & -1 & 1 & 0 & 0 \\ 0 & 4 & 1 & -3 & 1 & 0 \\ 0 & 5 & 1 & -3 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & -1 & 1 & 0 & 0 \\ 0 & 4 & 1 & -3 & 1 & 0 \\ 0 & 0 & -\tfrac{1}{4} & \tfrac{3}{4} & -\tfrac{5}{4} & 1 \end{array}\right] \sim$$

$$\left[\begin{array}{ccc|ccc} 1 & 0 & -1 & 1 & 0 & 0 \\ 0 & 4 & 1 & -3 & 1 & 0 \\ 0 & 0 & 1 & -3 & 5 & -4 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -2 & 5 & -4 \\ 0 & 4 & 0 & 0 & -4 & 4 \\ 0 & 0 & 1 & -3 & 5 & -4 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -2 & 5 & -4 \\ 0 & 1 & 0 & 0 & -1 & 1 \\ 0 & 0 & 1 & -3 & 5 & -4 \end{array}\right]$$

Therefore,

$$A^{-1} = \begin{bmatrix} -2 & 5 & -4 \\ 0 & -1 & 1 \\ -3 & 5 & -4 \end{bmatrix}$$
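A NumPy cross-check of this inverse (illustrative, not part of the notes):

```python
import numpy as np

A = np.array([[1, 0, -1],
              [3, 4, -2],
              [3, 5, -2]], dtype=float)

A_inv = np.linalg.inv(A)
print(np.round(A_inv))                      # [[-2.  5. -4.]
                                            #  [ 0. -1.  1.]
                                            #  [-3.  5. -4.]]
print(np.allclose(A @ A_inv, np.eye(3)))    # True
```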

Lecture 2 / @ Copyright: George Nakos 90


Engineering Mathematics / The Johns Hopkins University

6.7 Matrix Inverse


Theorem Let A and B be invertible n × n matrices and let c be a nonzero
scalar. Then

1. AB is invertible and
(AB)−1 = B −1 A−1

2. A−1 is invertible and


(A−1 )−1 = A

3. cA is invertible and
1 −1
(cA)−1 = A
c

4. $A^{T}$ is invertible and
$$\left(A^{T}\right)^{-1} = \left(A^{-1}\right)^{T}$$

Lecture 2 / @ Copyright: George Nakos 91


Engineering Mathematics / The Johns Hopkins University

6.7 Cancellation Laws


Recall that AB = AC does not imply that B = C. However, if A is invertible,
then the implication is true.

Theorem Let A, B, and C be n × n matrices and A is invertible. Then the


cancellation laws hold:

AB = AC ⇒ B = C, BA = CA ⇒ B = C

Proof: Let AB = AC. Since A−1 exists, we can multiply on the left by A−1
to get

A−1 (AB) = A−1 (AC) ⇒ (A−1 A)B = (A−1 A)C ⇒ IB = IC ⇒ B = C

The second implication is proved similarly.

Lecture 2 / @ Copyright: George Nakos 92


Engineering Mathematics / The Johns Hopkins University

6.7 Determinants and Inversion


Cauchy’s Theorem The determinant of a product of two n×n
matrices is the product of the determinants of the factors.
det(AB) = det(A) det(B)

Cauchy’s Theorem has the following implication.

Theorem

A square matrix is invertible if and only if its determinant is


nonzero.

Furthermore, if A is invertible, then

$$\det(A^{-1}) = \frac{1}{\det(A)}$$
Lecture 2 / @ Copyright: George Nakos 93
Engineering Mathematics / The Johns Hopkins University

6.7 Invertibility and Linear Systems

Theorem Let A be an invertible matrix, so det(A) 6= 0. Then

1. Ax = b has a unique solution given by


x = A−1b

2. Ax = 0 has only the trivial solution.

Theorem Let A be a n × n matrix. Then the following are equivalent.

1. det(A) = 0

2. Ax = 0 has nontrivial solutions

Lecture 2 / @ Copyright: George Nakos 94


Engineering Mathematics / The Johns Hopkins University

6.7 Adjoint

Definition Let A be an n × n matrix. The matrix whose (i, j)


entry is the cofactor Cij of A is the matrix of cofactors of A.
Its transpose is the adjoint of A and it is denoted by Adj(A).
 
C11 C21 · · · Cn1
 C12 C22 · · · Cn2 
Adj(A) = 
 
 ... ... ... ... 

C1n C2n · · · Cnn

Lecture 2 / @ Copyright: George Nakos 95


Engineering Mathematics / The Johns Hopkins University

6.7 Adjoint
Example Find the adjoint of A, where
 
−1 2 2
A= 4 3 −2 
 
−5 0 3

Solution: The cofactors of A are

$$\begin{aligned} C_{11} &= 9, & C_{12} &= -2, & C_{13} &= 15 \\ C_{21} &= -6, & C_{22} &= 7, & C_{23} &= -10 \\ C_{31} &= -10, & C_{32} &= 6, & C_{33} &= -11 \end{aligned}$$

Hence,

$$\operatorname{Adj}(A) = [C_{ij}]^{T} = \begin{bmatrix} C_{11} & C_{21} & C_{31} \\ C_{12} & C_{22} & C_{32} \\ C_{13} & C_{23} & C_{33} \end{bmatrix} = \begin{bmatrix} 9 & -6 & -10 \\ -2 & 7 & 6 \\ 15 & -10 & -11 \end{bmatrix}$$

Lecture 2 / @ Copyright: George Nakos 96


Engineering Mathematics / The Johns Hopkins University

6.7 Adjoint and Inverse


Theorem

1. Let A be an n × n matrix. Then


A Adj(A) = det(A)In = Adj(A) A

2. Let A be an invertible matrix. Then


$$A^{-1} = \frac{1}{\det(A)} \operatorname{Adj}(A)$$

Example For the above A, we have det(A) = 17. Hence,

$$A^{-1} = \frac{1}{\det(A)} \operatorname{Adj}(A) = \frac{1}{17}\begin{bmatrix} 9 & -6 & -10 \\ -2 & 7 & 6 \\ 15 & -10 & -11 \end{bmatrix} = \begin{bmatrix} \tfrac{9}{17} & -\tfrac{6}{17} & -\tfrac{10}{17} \\ -\tfrac{2}{17} & \tfrac{7}{17} & \tfrac{6}{17} \\ \tfrac{15}{17} & -\tfrac{10}{17} & -\tfrac{11}{17} \end{bmatrix}$$

Lecture 2 / @ Copyright: George Nakos 97


Engineering Mathematics / The Johns Hopkins University

6.8 The Dot Product


The dot product u · v of two n-vectors u = (u1, ..., un) and v = (v1, ..., vn) is the matrix-vector product

$$u \cdot v = u^{T} v$$

The matrix in this case is the row vector obtained by transposing u. In terms of components the dot product is the number

$$u \cdot v = \begin{bmatrix} u_1 & \cdots & u_n \end{bmatrix} \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = u_1 v_1 + \cdots + u_n v_n \qquad (6)$$

If the dot product of two vectors is zero, we call these vectors orthogonal.

Note that in equation (6) for convenience we identified a 1 × 1 matrix [a] with its single entry a.
Lecture 2 / @ Copyright: George Nakos 98
Engineering Mathematics / The Johns Hopkins University

6.8 The Dot Product


Example Let u = (−3, 2, 1) , v = (4, −1, 5) , and w = (−2, 1, −8) .

(a) Find u · v.

(b) Are u and w are orthogonal?

Solution:

(a) We have

$$u \cdot v = \begin{bmatrix} -3 & 2 & 1 \end{bmatrix} \begin{bmatrix} 4 \\ -1 \\ 5 \end{bmatrix} = (-3)(4) + (2)(-1) + (1)(5) = -9$$

(b) Vectors u and w are orthogonal, because

$$u \cdot w = (-3, 2, 1) \cdot (-2, 1, -8) = 0$$
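The same checks with NumPy (illustrative, not from the notes):

```python
import numpy as np

u = np.array([-3, 2, 1])
v = np.array([4, -1, 5])
w = np.array([-2, 1, -8])

print(u @ v)          # -9
print(u @ w == 0)     # True -> u and w are orthogonal
```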

Lecture 2 / @ Copyright: George Nakos 99


Engineering Mathematics / The Johns Hopkins University

6.8 The Length of n-Vectors


Definition The norm, or length, or magnitude of an n-vector u = (u1, . . . , un) is

$$\lVert u \rVert = \sqrt{u \cdot u} = \left( u_1^2 + \cdots + u_n^2 \right)^{1/2}$$

The (Euclidean) distance between two n-vectors u and v is

$$\lVert u - v \rVert$$

An n-vector is a unit vector if its norm is 1.

Example Let v = (1, 2, −3, 1) and u = (1/2, −1/2, 1/2, −1/2). (a) Find the length of v. (b) Find the distance between v and u. (c) Is u a unit vector?

Solution: We have

(a) $\lVert v \rVert = \left( 1^2 + 2^2 + (-3)^2 + 1^2 \right)^{1/2} = \sqrt{15}$

(b) $\lVert v - u \rVert = \left\lVert \left( \tfrac{1}{2}, \tfrac{5}{2}, -\tfrac{7}{2}, \tfrac{3}{2} \right) \right\rVert = \sqrt{21}$

(c) $\lVert u \rVert = \left\lVert \left( \tfrac{1}{2}, -\tfrac{1}{2}, \tfrac{1}{2}, -\tfrac{1}{2} \right) \right\rVert = 1$. So, u is a unit vector.

Lecture 2 / @ Copyright: George Nakos 100


Engineering Mathematics / The Johns Hopkins University

6.8 Dot Product and Angle


The dot product for plane and space vectors is related to the length and angle between the vectors by the following formula

$$u \cdot v = \lVert u \rVert\, \lVert v \rVert \cos\theta \qquad (7)$$

This can be seen by using the law of cosines on the triangle OPQ with OP = u and OQ = v.

$$\lVert u \rVert\, \lVert v \rVert \cos\theta = \frac{1}{2}\left( \lVert u \rVert^2 + \lVert v \rVert^2 - \lVert PQ \rVert^2 \right) = \frac{1}{2}\left( \sum_{i=1}^{3} u_i^2 + \sum_{i=1}^{3} v_i^2 - \sum_{i=1}^{3} (v_i - u_i)^2 \right) = \sum_{i=1}^{3} u_i v_i = u \cdot v$$

Lecture 2 / @ Copyright: George Nakos 101


Engineering Mathematics / The Johns Hopkins University

6.8 Main Properties of Dot Product


Let u, v, w be n-vectors and c be a scalar. Then

1. u · v = v · u (Symmetry)

2. u · (v + w) = u · v + u · w (Additivity)

3. c (u · v) = (cu) · v = u · (cv) (Homogeneity)

4. u · u ≥ 0. Also, u · u = 0 if and only if u = 0. (Positive Definiteness)

5. (Pythagorean Theorem) u and v are orthogonal if and only if


ku + vk2 = kuk2 + kvk2

6. (Cauchy-Bunyakovsky-Schwarz Inequality)
|u · v| ≤ kuk kvk (8)

Lecture 2 / @ Copyright: George Nakos 102


Engineering Mathematics / The Johns Hopkins University

6.8 Inner Product


Definition An inner product on a (real) vector space V is a function that
to each pair of vectors u and v of V associates a real number, denoted by
hu, vi .
h , i : V × V → R, (u, v) → hu, vi
This function satisfies the following properties, or axioms.

For any vectors u, v, w of V and any scalar c, we have

1. hu, vi = hv, ui (Symmetry)

2. hu + w, vi = hu, vi + hw, vi (Additivity)

3. hcu, vi = chu, vi (Homogeneity)

4. hu, ui ≥ 0. Furthermore, hu, ui = 0 if and only if u = 0. (Positivity)

A real vector space with an inner product is called an inner product space.

Lecture 2 / @ Copyright: George Nakos 103


Engineering Mathematics / The Johns Hopkins University

6.8 Properties of Inner Product


Theorem Let u, v, and w be any vectors in an inner product
space and let c be any scalar. Then

1. hu, v + wi = hu, vi + hu, wi

2. hu, cvi = chu, vi

3. hu − w, vi = hu, vi − hw, vi

4. hu, v − wi = hu, vi − hu, wi

5. h0, vi = hv, 0i = 0

Lecture 2 / @ Copyright: George Nakos 104


Engineering Mathematics / The Johns Hopkins University

6.8 Examples of Inner Product Spaces


1. The dot product of Rn is an inner product.

2. (Weighted Dot Product) Let w1 , . . . , wn be any positive numbers and let


u = (u1, . . . , un) and v = (v1, . . . , vn) be any n-vectors. The following
defines an inner product in Rn.
h u, v i = w 1 u 1 v 1 + · · · + w n u n v n (9)

3. Let A and B be 2 × 2 matrices with real entries.


   
A = [a1 a2; a3 a4],   B = [b1 b2; b3 b4]
The following function defines an inner product in M22 .
hA, Bi = a1 b1 + a2 b2 + a3 b3 + a4 b4

4. Let f (x) and g(x) be in C[a, b], the vector space of the continuous real-
valued functions defined on [a, b]. Then the following defines an inner
product on C[a, b].
⟨f, g⟩ = ∫_a^b f(x) g(x) dx
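As a rough illustration of examples 2 and 4, the sketch below builds a weighted dot product and approximates the integral inner product numerically. The particular weights, vectors, and functions are illustrative assumptions, not taken from the slides (NumPy and SciPy assumed).

    import numpy as np
    from scipy.integrate import quad

    # weighted dot product <u, v> = w1*u1*v1 + ... + wn*un*vn (positive weights)
    def weighted_dot(u, v, w):
        return float(np.sum(w * u * v))

    u = np.array([1.0, 2.0, 3.0])
    v = np.array([4.0, 0.0, -1.0])
    w = np.array([2.0, 1.0, 5.0])          # any positive weights
    print(weighted_dot(u, v, w))           # 2*4 + 1*0 + 5*(-3) = -7

    # integral inner product <f, g> = int_a^b f(x) g(x) dx on C[a, b]
    def inner(f, g, a, b):
        value, _ = quad(lambda x: f(x) * g(x), a, b)
        return value

    print(inner(np.sin, np.cos, 0.0, np.pi))   # ~ 0: sin and cos are orthogonal on [0, pi]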

Lecture 2 / @ Copyright: George Nakos 105


Engineering Mathematics / The Johns Hopkins University

6.8 Length and Orthogonality


Let V be an inner product space. Two vectors u and v are called orthogonal
if their inner product is zero.
u and v are orthogonal if hu, vi = 0
The norm (or length, or magnitude) of v is the nonnegative number kvk,
defined by
‖v‖ = √⟨v, v⟩     (10)
We also define the distance, d(u, v), between two vectors u and v by
d(u, v) = ku − vk (11)

Note that
d(0, v) = d(v, 0) = kvk
A vector with norm 1 is called a unit vector. The set S of all unit vectors of
V is called the unit circle or the unit sphere.
S = {v : v ∈ V and ‖v‖ = 1}     (12)

Lecture 2 / @ Copyright: George Nakos 106


Engineering Mathematics / The Johns Hopkins University

6.8 Properties of Norm

The norm in an inner product space V satisfies the following


basic properties.

For all vectors u and v of V and all scalars c, we have

1. kcuk = |c| kuk

2. ku + vk ≤ kuk + kvk (the Triangle Inequality)

3. kuk ≥ 0 and kuk = 0 if and only if u = 0

Lecture 2 / @ Copyright: George Nakos 107


Engineering Mathematics / The Johns Hopkins University

6.8 Examples of Length and Orthogonality


Example (a) Are the functions sin(x) and sin(2x) orthogonal in C[−π, π] under the integral inner product ⟨f, g⟩ = ∫_{−π}^{π} f(x) g(x) dx?
(b) What is the norm of sin(2x) with respect to this inner product?

(a) We have

⟨sin(x), sin(2x)⟩ = ∫_{−π}^{π} sin(x) sin(2x) dx = (1/2) ∫_{−π}^{π} (cos x − cos 3x) dx
                  = (1/2) [sin x − (1/3) sin 3x]_{−π}^{π} = 0

so the functions are orthogonal.
(b) The norm is

‖sin(2x)‖ = ( ∫_{−π}^{π} sin²(2x) dx )^(1/2) = ( (1/2) ∫_{−π}^{π} (1 − cos(4x)) dx )^(1/2) = √π

Note: (a) sin(a) sin(b) = (1/2)(cos(a − b) − cos(a + b))   (b) sin²(a) = (1/2)(1 − cos 2a)
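The orthogonality and the norm in this example can also be verified numerically; a minimal sketch using SciPy's quad routine (an assumption; any quadrature rule would do):

    import numpy as np
    from scipy.integrate import quad

    inner = lambda f, g: quad(lambda x: f(x) * g(x), -np.pi, np.pi)[0]

    f = lambda x: np.sin(x)
    g = lambda x: np.sin(2 * x)

    print(inner(f, g))            # ~ 0: sin(x) and sin(2x) are orthogonal on [-pi, pi]
    print(np.sqrt(inner(g, g)))   # ~ sqrt(pi) ~ 1.7725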
Lecture 2 / @ Copyright: George Nakos 108
Engineering Mathematics / The Johns Hopkins University

6.8 Matrix Transformations


A matrix transformation T : Rn → Rm, is a transformation for which there
is an m × n matrix A such that
T (x) = Ax
for all x in Rn.
Example Consider the matrix transformation T : R³ → R² with A = [3 −7 8; 2 1 −4]:

T([x1, x2, x3]^T) = [3 −7 8; 2 1 −4] [x1, x2, x3]^T

So

T([x1, x2, x3]^T) = [3x1 − 7x2 + 8x3, 2x1 + x2 − 4x3]^T

For example, the image of the vector (−2, 3, 5) under this transformation is

T([−2, 3, 5]^T) = [3 −7 8; 2 1 −4] [−2, 3, 5]^T = [13, −21]^T
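A short sketch of this matrix transformation in NumPy (assumed available):

    import numpy as np

    A = np.array([[3, -7, 8],
                  [2, 1, -4]])

    def T(x):
        # matrix transformation T(x) = A x
        return A @ x

    print(T(np.array([-2, 3, 5])))   # [ 13 -21]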
Lecture 2 / @ Copyright: George Nakos 109
Engineering Mathematics / The Johns Hopkins University

6.8 Linear Transformations


Definition A linear transformation or linear map from a vector
space V to a vector space W is a transformation T : V → W such
that for all vectors u and v of V and any scalar c, we have

1. T (u + v) = T (u) + T (v)

2. T (cu) = cT (u)

The addition in u + v is addition in V, whereas the addition in


T (u) + T (v) is addition in W. Likewise, scalar multiplications cu
and cT (u) occur in V and W, respectively. In the special case
where V = W, the linear transformation T : V → V is called a
linear operator of V.
Lecture 2 / @ Copyright: George Nakos 110
Engineering Mathematics / The Johns Hopkins University

6.8 Examples of Linear Transformations


• Matrix transformations. Because if A is the matrix of the transformation,
then
T (x1 + x2 ) = A (x1 + x2 ) = Ax1 + Ax2 = T (x1 ) + T (x2 )
and
T (c1 x1 ) = A (c1 x1 ) = c1 Ax1 = c1 T (x1 )

• The special matrix transformations with matrices


       
[−1 0; 0 1],   [1 0; 0 −1],   [−1 0; 0 −1],   [cos θ −sin θ; sin θ cos θ]

are linear and represent reflection about the y-axis and the x-axis, reflec-
tion about the origin and rotation by θ radians about the origin.

Lecture 2 / @ Copyright: George Nakos 111


Engineering Mathematics / The Johns Hopkins University

6.8 Examples of Linear Transformations


• T : M22 → P3, T([a b; c d]) = d + cx + (b − a)x³ is linear.

T([a1 b1; c1 d1] + [a2 b2; c2 d2]) = T([a1+a2  b1+b2; c1+c2  d1+d2])
  = (d1 + d2) + (c1 + c2)x + {(b1 + b2) − (a1 + a2)}x³
  = (d1 + c1 x + (b1 − a1)x³) + (d2 + c2 x + (b2 − a2)x³)
  = T([a1 b1; c1 d1]) + T([a2 b2; c2 d2])

and

T(c [a1 b1; c1 d1]) = T([ca1 cb1; cc1 cd1])
  = cd1 + cc1 x + (cb1 − ca1)x³
  = c (d1 + c1 x + (b1 − a1)x³)
  = c T([a1 b1; c1 d1])

Lecture 2 / @ Copyright: George Nakos 112


Engineering Mathematics / The Johns Hopkins University

Lecture 3 in Engineering
Mathematics
George Nakos

Engineering Mathematics
The Johns Hopkins University

Fall 2004

113
Engineering Mathematics / The Johns Hopkins University

Part 1: Linear Algebra

1. Eigenvalues and Eigenvectors

2. Diagonalization and Similarity

3. Symmetric and Orthogonal Matrices

4. Hermitian and Unitary Matrices

114
Engineering Mathematics / The Johns Hopkins University

7.1 Eigenvalues

Definition Let A be an n × n matrix. A nonzero vector v is called


an eigenvector of A, if for some scalar λ

Av = λv (13)
The scalar λ (which may be zero) is called an eigenvalue of A
corresponding to (or associated with) the eigenvector v.

Geometrically, if v is an eigenvector of A, then v and Av are on


the same line through the origin.

Lecture 3 / @ Copyright: George Nakos 115


Engineering Mathematics / The Johns Hopkins University

7.1 Eigenvalues
Example Let

A = [2 2; 2 −1],   v1 = [2; 1],   v2 = [1; −2]

(a) Show that v1 and v2 are eigenvectors of A.

(b) What are the eigenvalues corresponding to v1 and v2 ?

Solution: We have

Av1 = [2 2; 2 −1] [2; 1] = [6; 3] = 3 [2; 1] = 3v1

Av2 = [2 2; 2 −1] [1; −2] = [−2; 4] = −2 [1; −2] = −2v2
Therefore, v1 is an eigenvector with corresponding eigenvalue λ = 3 and v2
is an eigenvector with corresponding eigenvalue λ = −2.
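A numerical check of this example, sketched with NumPy; numpy.linalg.eig recovers the same eigenvalues (possibly in a different order, and with rescaled eigenvectors):

    import numpy as np

    A = np.array([[2, 2],
                  [2, -1]])
    v1 = np.array([2, 1])
    v2 = np.array([1, -2])

    print(A @ v1)   # [6 3]  =  3 * v1
    print(A @ v2)   # [-2 4] = -2 * v2

    vals, vecs = np.linalg.eig(A)
    print(vals)     # 3 and -2 (order may vary)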

Lecture 3 / @ Copyright: George Nakos 116


Engineering Mathematics / The Johns Hopkins University

7.1 Eigenvalues
Example Find all the eigenvalues and eigenvectors of A geometrically, if
 
(a) A = [0 1; 1 0].

(b) A is the standard matrix of the rotation by 30◦ in R3 about the z-axis in
the positive direction.

(a) Ax is the reflection of x about the line y = x. The only vectors that remain on the same line after the reflection are the vectors along the lines y = x and y = −x. These vectors, excluding the zero vector, are the only eigenvectors. For v along y = x we have Av = 1v, so v is an eigenvector with corresponding eigenvalue 1. For v along y = −x, Av = −1v, so v is an eigenvector with corresponding eigenvalue −1.

(b) The only vectors that remain on the same line after rotation are all vectors
along the z-axis. These without the origin are the only eigenvectors. The
corresponding eigenvalue is 1.

Lecture 3 / @ Copyright: George Nakos 117


Engineering Mathematics / The Johns Hopkins University

7.1 Computation of Eigenvalues


Theorem Let A be a square matrix.

1. A vector v is an eigenvector of A corresponding to eigenvalue λ if and


only if v is a nontrivial solution of the system
(A − λI)v = 0 (14)

2. A scalar λ is an eigenvalue of A if and only if


det(A − λI) = 0 (15)

Equation (15) is called the characteristic equation of A. The determinant


det(A − λI) is a polynomial of degree n in λ and is called the characteristic
polynomial of A. The matrix A − λI is called the characteristic matrix of
A. If an eigenvalue λ is a root of the characteristic equation of multiplicity k,
we say that λ has algebraic multiplicity k.

Lecture 3 / @ Copyright: George Nakos 118


Engineering Mathematics / The Johns Hopkins University

7.1 Proof of Theorem

1. We have
Av = λv ⇒ Av = λI v
⇒ Av − λI v = 0
⇒ (A − λI)v = 0
Hence, v is an eigenvector if and only if it is a nontrivial
solution of the homogeneous system (A − λI)v = 0.

2. The homogeneous linear system (14) has a nontrivial solution


if and only if the determinant of the coefficient matrix is zero.
Thus, λ is an eigenvalue of A if and only if det(A − λI) = 0.


Lecture 3 / @ Copyright: George Nakos 119


Engineering Mathematics / The Johns Hopkins University

7.1 Eigenspace

Fact Let A be an n × n matrix and let λ be an eigenvalue of A. Let Eλ


be the set that consists of all eigenvectors of A corresponding
to λ and the zero n-vector. Then Eλ is a subspace of Rn.

The subspace Eλ of Rn mentioned above consisting of the zero


vector and the eigenvectors of A with eigenvalue λ is called an
eigenspace of A. It is the eigenspace with eigenvalue λ.

The dimension of Eλ is called the geometric multiplicity of λ.

Lecture 3 / @ Copyright: George Nakos 120


Engineering Mathematics / The Johns Hopkins University

7.2 Examples of Finding Eigenvalues


In the next examples we compute the eigenvalues, the eigenvectors and find
bases for each eigenspace of the given matrix A.
 
Example A = [1 −1 −1; −2 0 4; −2 6 −2].

Solution: The characteristic equation is


det(A − λI) = |1−λ  −1  −1; −2  −λ  4; −2  6  −2−λ| = −λ³ − λ² + 30λ = −λ(λ − 5)(λ + 6) = 0
Hence, the eigenvalues are
λ1 = 0, λ2 = 5, λ3 = −6
Next, we find the eigenvectors. For λ1 = 0 we have

Lecture 3 / @ Copyright: George Nakos 121


Engineering Mathematics / The Johns Hopkins University

7.1 Examples of Finding Eigenvalues


   
[A − 0I : 0] = [1 −1 −1 | 0; −2 0 4 | 0; −2 6 −2 | 0] ∼ [1 0 −2 | 0; 0 1 −1 | 0; 0 0 0 | 0]

The general solution is (2r, r, r) for r ∈ R. Hence,

E0 = { (2r, r, r), r ∈ R } = Span{ (2, 1, 1) }

and the eigenvector v1 = (2, 1, 1) defines the basis {v1} of E0.

For λ2 = 5 we have
   
[A − 5I : 0] = [−4 −1 −1 | 0; −2 −5 4 | 0; −2 6 −7 | 0] ∼ [1 0 1/2 | 0; 0 1 −1 | 0; 0 0 0 | 0]

Lecture 3 / @ Copyright: George Nakos 122


Engineering Mathematics / The Johns Hopkins University

7.1 Examples of Finding Eigenvalues


The general solution is (−r/2, r, r) for r ∈ R. Hence,

E5 = { (−r/2, r, r), r ∈ R } = Span{ (−1/2, 1, 1) }

Any nonzero vector of E5 is a basis of E5. We may choose a fraction-free one. So, v2 = (−1, 2, 2) defines the basis {v2} of E5.

For λ3 = −6 we have

[A − (−6)I : 0] = [7 −1 −1 | 0; −2 6 4 | 0; −2 6 4 | 0] ∼ [1 0 −1/20 | 0; 0 1 13/20 | 0; 0 0 0 | 0]

The general solution is (r/20, −13r/20, r) for r ∈ R. Hence,

E−6 = { (r/20, −13r/20, r), r ∈ R } = Span{ (1/20, −13/20, 1) }

Any nonzero vector of E−6 is a basis of E−6. For example, v3 = (1, −13, 20) defines the basis {v3} of E−6.

Lecture 3 / @ Copyright: George Nakos 123


Engineering Mathematics / The Johns Hopkins University

7.1 Examples of Finding Eigenvalues


 
Example A = [0 0 1; 0 1 0; 0 0 1].

Solution: The characteristic equation is



det(A − λI) = |−λ  0  1; 0  1−λ  0; 0  0  1−λ| = −λ(1 − λ)² = 0
Hence, the eigenvalues are
λ1 = 0 , λ2 = λ3 = 1
Next, we find the eigenvectors. For λ1 = 0 we have

[A − 0I : 0] = [0 0 1 | 0; 0 1 0 | 0; 0 0 1 | 0] ∼ [0 1 0 | 0; 0 0 1 | 0; 0 0 0 | 0]

Lecture 3 / @ Copyright: George Nakos 124


Engineering Mathematics / The Johns Hopkins University

7.1 Examples of Finding Eigenvalues


The general solution is (r, 0, 0) for r ∈ R. Hence,

E0 = { (r, 0, 0), r ∈ R } = Span{ (1, 0, 0) }

and the eigenvector v1 = (1, 0, 0) defines the basis {v1} of E0.
For λ2 = λ3 = 1 with algebraic multiplicity 2, we have

[A − 1I : 0] = [−1 0 1 | 0; 0 0 0 | 0; 0 0 0 | 0]

The general solution is (r, s, r) for r, s ∈ R. But (r, s, r) = r(1, 0, 1) + s(0, 1, 0), so

E1 = { (r, s, r), r, s ∈ R } = Span{ (1, 0, 1), (0, 1, 0) }

Lecture 3 / @ Copyright: George Nakos 125


Engineering Mathematics / The Johns Hopkins University

7.1 Examples of Finding Eigenvalues


   
The spanning eigenvectors v2 = (1, 0, 1), v3 = (0, 1, 0) are linearly independent. So, {v2, v3} is a basis for E1 and the geometric multiplicity of λ = 1 is 2.

NOTE If A = [aij ] is a triangular matrix, then so is A − λI. Hence, in this case


det(A − λI) = (a11 − λ)(a22 − λ) · · · (ann − λ)
We conclude that the eigenvalues of a triangular matrix are the diagonal entries.

Example A = [1 −1 0; 0 −4 2; 0 0 −2].

A is triangular, so the eigenvalues are the diagonal entries 1, −2, −4. By row reducing [A − 1I : 0], [A − (−2)I : 0], and [A − (−4)I : 0] we get

E1 = Span{ (1, 0, 0) },   E−2 = Span{ (1/3, 1, 1) },   E−4 = Span{ (1/5, 1, 0) }

The spanning eigenvectors define bases for the corresponding eigenspaces.
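All of these eigenvalue computations can be cross-checked numerically. A sketch with NumPy for the last example (eig normalizes eigenvectors to unit length, so they are scalar multiples of the basis vectors above):

    import numpy as np

    A = np.array([[1, -1, 0],
                  [0, -4, 2],
                  [0, 0, -2]])

    vals, vecs = np.linalg.eig(A)
    print(vals)            # 1, -4, -2 in some order
    # each column of vecs is a unit eigenvector for the matching eigenvalue
    for lam, v in zip(vals, vecs.T):
        print(lam, np.allclose(A @ v, lam * v))   # True for every pair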

Lecture 3 / @ Copyright: George Nakos 126


Engineering Mathematics / The Johns Hopkins University

7.5 Diagonalization

Matrix arithmetic with diagonal matrices is easier than with any other matri-
ces. This is most notable in matrix multiplication. For example, a diagonal
matrix D does not mix the components of x in the product Dx.
    
[2 0; 0 3] [a; b] = [2a; 3b]

Also, it does not mix rows of A in a product DA (or columns in AD).

[2 0; 0 3] [a b c; d e f] = [2a 2b 2c; 3d 3e 3f]

Moreover, it is very easy to compute the powers D^k.

[2 0; 0 3]^k = [2^k 0; 0 3^k]

We study matrices that can be transformed to diagonal matrices and take


advantage of the easy arithmetic. We use eigenvalues to develop criteria that
identify these matrices and we explore their basic properties.
Lecture 3 / @ Copyright: George Nakos 127
Engineering Mathematics / The Johns Hopkins University

7.5 Diagonalization

Definition Let A and B be two n × n matrices. We say that B is similar to


A if there exists an invertible matrix P such that

B = P −1 AP

Definition If an n × n matrix A is similar to a diagonal matrix D, then it is


called diagonalizable. We also say that A can be diagonalized. This means
that there exists an invertible n × n matrix P such that P −1 AP is a diagonal
matrix D.
P −1 AP = D
The process of finding matrices P and D is called diagonalization. We say
that P and D diagonalize A.

The question of how to diagonalize a matrix is answered in the next theorem.

Lecture 3 / @ Copyright: George Nakos 128


Engineering Mathematics / The Johns Hopkins University

7.5 Diagonalization
Theorem Let A be an n × n matrix.

1. A is diagonalizable if and only if it has n linearly independent eigenvectors.

2. If A is diagonalizable with P −1 AP = D, then the columns of P are eigen-


vectors of A and the diagonal entries of D are the corresponding eigen-
values.

3. If {v1 , . . . , vn} are linearly independent eigenvectors of A with correspond-


ing eigenvalues λ1 , . . . , λn, then A can be diagonalized by
 
P = [v1 v2 · · · vn]   and   D = diag(λ1, . . . , λn)

Theorem Let A be an n × n matrix. The following are equivalent.

1. A is diagonalizable.

2. Rn has a basis of eigenvectors of A.

Lecture 3 / @ Copyright: George Nakos 129


Engineering Mathematics / The Johns Hopkins University

7.5 Diagonalization
 
Example A = [0 0 1; 0 1 0; 0 0 1].

Solution: We found before that λ1 = 0, λ2 = λ3 = 1 and

E0 = Span{ (1, 0, 0) },   E1 = Span{ (1, 0, 1), (0, 1, 0) }
A has 3 linearly independent eigenvectors so it is diagonalizable. We may
take
   
P = [1 1 0; 0 0 1; 0 1 0],   D = [0 0 0; 0 1 0; 0 0 1]

We may check this by

P⁻¹AP = [1 1 0; 0 0 1; 0 1 0]⁻¹ [0 0 1; 0 1 0; 0 0 1] [1 1 0; 0 0 1; 0 1 0] = [0 0 0; 0 1 0; 0 0 1] = D

Lecture 3 / @ Copyright: George Nakos 130


Engineering Mathematics / The Johns Hopkins University

7.5 Diagonalization

Example
 
1 −1 0
A =  0 −4 2 .
0 0 −2

Solution: We have found that λ1 = 1, λ2 = −2, λ3 = −4 and

E1 = Span{ (1, 0, 0) },   E−2 = Span{ (1/3, 1, 1) },   E−4 = Span{ (1/5, 1, 0) }

A has 3 linearly independent eigenvectors so it is diagonalizable. We may take

P = [1 1/3 1/5; 0 1 1; 0 1 0],   D = [1 0 0; 0 −2 0; 0 0 −4]

Lecture 3 / @ Copyright: George Nakos 131


Engineering Mathematics / The Johns Hopkins University

7.5 Diagonalization
Theorem Let λ1 , . . . , λl be any distinct eigenvalues of an n × n matrix A.

1. Then any corresponding eigenvectors v1 , . . . , vl are linearly independent.

2. If B1 , . . . , Bl are bases for the corresponding eigenspaces, then


B = B1 ∪ · · · ∪ Bl
is linearly independent.

3. Let l be the number of all distinct eigenvalues of A. Then A is diago-


nalizable, if and only if B in part 2 has exactly n elements.
 
Example Is A = [1 0 3; 1 −1 2; −1 1 −2] diagonalizable?

Solution: We have found that λ1 = λ2 = 0, λ3 = −2 and

E0 = Span{ (−3, −1, 1) },   E−2 = Span{ (−1, −1, 1) }
This time A has at most 2 (< 3) linearly independent eigenvectors, so it is not diagonalizable, by part 3 of the theorem.
Lecture 3 / @ Copyright: George Nakos 132
Engineering Mathematics / The Johns Hopkins University

7.5 Powers of Diagonalizable Matrices


Let A be a diagonalizable n × n matrix, diagonalized by P and D, so A = PDP⁻¹. We have A² = (PDP⁻¹)(PDP⁻¹) = PD²P⁻¹. We iterate to get

A^k = P D^k P⁻¹

Example Find a formula for A^k, k = 1, 2, . . . , where

A = [1 0 1; 0 2 0; 3 0 3]

Solution: A has eigenvalues 0, 2, 4 and the corresponding basic eigenvectors (−1, 0, 1), (0, 1, 0), (1, 0, 3) are linearly independent. Hence,

A^k = [−1 0 1; 0 1 0; 1 0 3] [0 0 0; 0 2^k 0; 0 0 4^k] [−1 0 1; 0 1 0; 1 0 3]⁻¹
    = [−1 0 1; 0 1 0; 1 0 3] [0 0 0; 0 2^k 0; 0 0 4^k] [−3/4 0 1/4; 0 1 0; 1/4 0 1/4]
    = [4^(k−1)  0  4^(k−1); 0  2^k  0; 3·4^(k−1)  0  3·4^(k−1)]

(For k = 0 we simply have A⁰ = I.)
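A sketch verifying the formula A^k = P D^k P⁻¹ numerically for one value of k (NumPy assumed; k = 5 is an arbitrary illustrative choice):

    import numpy as np

    A = np.array([[1, 0, 1],
                  [0, 2, 0],
                  [3, 0, 3]])
    P = np.array([[-1, 0, 1],
                  [0, 1, 0],
                  [1, 0, 3]], dtype=float)
    D = np.diag([0.0, 2.0, 4.0])

    k = 5
    Ak = P @ np.diag(np.diag(D) ** k) @ np.linalg.inv(P)
    print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True
    print(Ak)   # matches [[4**4, 0, 4**4], [0, 2**5, 0], [3*4**4, 0, 3*4**4]]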

Lecture 3 / @ Copyright: George Nakos 133


Engineering Mathematics / The Johns Hopkins University

7.5 An Important Change of Variables

Let us now discuss an idea that is in the core of most applications of di-
agonalization. Let A be diagonalizable, diagonalized by P and D. Often a
matrix-vector equation f (A, x) = 0 can be substantially simplified, if we re-
place x by the new vector y such that
x = P y or y = P −1x (16)
and replace A with P DP −1 to get an equation of the form g(D, y) = 0 that
involves the diagonal matrix D and the new vector y.
To illustrate, suppose we have a linear system Ax = b. Then we can convert this system into a diagonal system as follows. Write A = PDP⁻¹ and consider the new variable vector y defined by y = P⁻¹x, as in (16). We have

Ax = b ⇔ PDP⁻¹x = b
      ⇔ DP⁻¹x = P⁻¹b
      ⇔ Dy = P⁻¹b

The last equation defines a diagonal system.
Lecture 3 / @ Copyright: George Nakos 134
Engineering Mathematics / The Johns Hopkins University

7.3 Orthogonal Matrices

Definition A square matrix A is called orthogonal if it has orthonormal


columns. This means that every pair of columns is orthogonal and each
column is a unit vector.
Note that a nonsquare matrix with orthonormal columns is not called orthog-
onal. (Perhaps a better name for orthogonal matrix would be “orthonormal”.)
Orthogonal matrices are invertible, because they are square with linearly
independent columns. We have the following important theorem.
Theorem Let A be a square matrix. The following are equivalent.

1. A is orthogonal.

2. AT A = I

3. A−1 = AT
Lecture 3 / @ Copyright: George Nakos 135
Engineering Mathematics / The Johns Hopkins University

7.3 Examples of Orthogonal Matrices


Example Show that the rotation matrix A is orthogonal,

A = [cos θ  −sin θ; sin θ  cos θ]

and find its inverse.

Solution: A is orthogonal because

AAᵀ = [cos²θ + sin²θ  0; 0  cos²θ + sin²θ] = [1 0; 0 1]

Note that the inverse of an orthogonal matrix is its transpose. Hence,

A⁻¹ = Aᵀ = [cos θ  sin θ; −sin θ  cos θ]

Example It is easy to see that B is also orthogonal.

B = [2/3  1/3  2/3; −2/3  2/3  1/3; 1/3  2/3  −2/3]
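Both checks are immediate numerically; a sketch with NumPy using the criterion AᵀA = I:

    import numpy as np

    theta = 0.7   # any angle
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    B = np.array([[ 2, 1,  2],
                  [-2, 2,  1],
                  [ 1, 2, -2]]) / 3.0

    # a matrix Q is orthogonal exactly when Q^T Q = I
    print(np.allclose(A.T @ A, np.eye(2)))   # True
    print(np.allclose(B.T @ B, np.eye(3)))   # True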
Lecture 3 / @ Copyright: George Nakos 136
Engineering Mathematics / The Johns Hopkins University

7.3 Orthogonal Matrices


Theorem For an n × n matrix A, the following statements are equivalent.

1. A is orthogonal.

2. Au · Av = u · v for any n-vectors u and v (Preservation of dot products).

3. kAvk = kvk for any n-vector v (Preservation of lengths).

REMARK The matrix transformation T (x) = Ax defined by an orthogonal


matrix A is also called orthogonal. By the last theorem we see that orthogonal
matrix transformations preserve dot products. Hence, they preserve lengths
and angles.
Theorem If λ is an eigenvalue of an orthogonal matrix A, then |λ| = 1.
Proof: If v is an eigenvector of A, then by part 3 of the last theorem
‖v‖ = ‖Av‖ = ‖λv‖ = |λ| ‖v‖
Hence, |λ| = 1, since ‖v‖ ≠ 0. □

This theorem also holds for complex eigenvalues of A.


Lecture 3 / @ Copyright: George Nakos 137
Engineering Mathematics / The Johns Hopkins University

7.3 Eigenvalues of Symmetric Matrices


Theorem

1. A real symmetric matrix has only real eigenvalues.

2. A real skew-symmetric matrix has eigenvalues that are either pure imaginary or zero.

Example The symmetric matrix A

A = [1 1 2; 1 2 1; 2 1 1]

has real eigenvalues: 1, 4, −1.
The skew-symmetric matrix B

B = [0 2 1; −2 0 1; −1 −1 0]

has pure imaginary or zero eigenvalues: 0, i√6, −i√6.
Lecture 3 / @ Copyright: George Nakos 138
Engineering Mathematics / The Johns Hopkins University

7.4 Hermitian, Skew-Hermitian,


and Unitary Matrices
Definitions Let A be a square complex matrix. Then

1. A is called Hermitian, if Āᵀ = A.

2. A is called skew-Hermitian, if Āᵀ = −A.

3. A is called unitary, if Āᵀ = A⁻¹.

Example Show that matrix A is Hermitian, matrix B is skew-Hermitian, and matrix C is unitary.

A = [4  2+i; 2−i  0],   B = [0  2−i; −2−i  −4i],   C = [1/2  −(√3/2)i; −(√3/2)i  1/2]

Solution: A is Hermitian because

Āᵀ = [4  2−i; 2+i  0]ᵀ = [4  2+i; 2−i  0] = A

Lecture 3 / @ Copyright: George Nakos 139


Engineering Mathematics / The Johns Hopkins University

7.4 Hermitian, Skew-Hermitian,


and Unitary Matrices
B is skew-Hermitian, because

B̄ᵀ = [0  2+i; −2+i  4i]ᵀ = [0  −2+i; 2+i  4i] = −B

To show that C is unitary, it suffices to check that C̄ᵀC = I. We have

C̄ᵀC = [1/2  (√3/2)i; (√3/2)i  1/2]ᵀ [1/2  −(√3/2)i; −(√3/2)i  1/2]
     = [1/2  (√3/2)i; (√3/2)i  1/2] [1/2  −(√3/2)i; −(√3/2)i  1/2]
     = [1 0; 0 1] = I2
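The three checks can also be done numerically with complex NumPy arrays; a minimal sketch:

    import numpy as np

    A = np.array([[4, 2 + 1j],
                  [2 - 1j, 0]])
    B = np.array([[0, 2 - 1j],
                  [-2 - 1j, -4j]])
    C = np.array([[0.5, -np.sqrt(3) / 2 * 1j],
                  [-np.sqrt(3) / 2 * 1j, 0.5]])

    print(np.allclose(A.conj().T, A))              # Hermitian
    print(np.allclose(B.conj().T, -B))             # skew-Hermitian
    print(np.allclose(C.conj().T @ C, np.eye(2)))  # unitary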

Lecture 3 / @ Copyright: George Nakos 140


Engineering Mathematics / The Johns Hopkins University

7.4 Hermitian, Skew-Hermitian,


and Unitary Matrices
REMARKS

1. For a real skew-Hermitian matrix A, we have Aᵀ = −A. Such a matrix is called skew-symmetric.

2. For a real unitary matrix A, we have Aᵀ = A⁻¹. Hence, a real unitary matrix is orthogonal.

3. The main diagonal of a Hermitian matrix consists of real numbers.

4. The main diagonal of a skew-Hermitian matrix consists of 0s, or pure


imaginary numbers.

5. Equivalent statements for A being a unitary matrix are: ĀᵀA = I and, by taking the transpose, AᵀĀ = I.

Lecture 3 / @ Copyright: George Nakos 141


Engineering Mathematics / The Johns Hopkins University

7.4 Hermitian, Skew-Hermitian,


and Unitary Matrices
Theorem Let A be a complex square matrix. Then

1. If A is Hermitian, then its eigenvalues are real. (Thus, this holds for
symmetric matrices.)

2. If A is skew-Hermitian, then its eigenvalues are pure imaginary, or 0.


(Thus, this holds for skew-symmetric matrices.)

3. If A is unitary, then its eigenvalues have absolute value 1. (Thus, this


holds for real orthogonal matrices.)

Note Let A be an n × n unitary matrix. Then for any complex n-vectors u


and v, we have with respect to the complex dot product:

1. Au · Av = u · v (Preservation of the complex dot product)

2. kAvk = kvk (Preservation of complex norm)

Note The complex dot product is given by u · v = ūᵀv = ū1v1 + · · · + ūnvn


Lecture 3 / @ Copyright: George Nakos 142
Engineering Mathematics / The Johns Hopkins University

Lecture 4 in Engineering
Mathematics
George Nakos

Engineering Mathematics
The Johns Hopkins University

Fall 2004

143
Engineering Mathematics / The Johns Hopkins University

Part 2: Partial Differential Equations

1. Orthogonal Sets of Functions

2. Generalized Fourier Series

3. Euler’s Formula

4. Review of Homogeneous Differential Equations with Constant


Coefficients

5. Sturm-Liouville Theory

144
Engineering Mathematics / The Johns Hopkins University

Some Trigonometric Identities


1. sin (a + b) = sin a cos b + cos a sin b

2. cos (a + b) = cos a cos b − sin a sin b

3. sin (a − b) = sin a cos b − cos a sin b

4. cos (a − b) = cos a cos b + sin a sin b

5. sin (2a) = 2 sin a cos a

6. cos (2a) = 2 cos2 a − 1

7. cos²(a) = (1 + cos 2a)/2

8. sin²(a) = (1 − cos 2a)/2

Lecture 4 / @ Copyright: George Nakos 145


Engineering Mathematics / The Johns Hopkins University

Some Trigonometric Identities (Cont.)

9. sin a cos b = (1/2) sin(a + b) + (1/2) sin(a − b)

10. sin a sin b = (1/2) cos(a − b) − (1/2) cos(a + b)

11. cos a cos b = (1/2) cos(a − b) + (1/2) cos(a + b)

12. cos (kπ) = (−1)k , k integer.

13. sin(kπ) = 0, k integer.

14. cos((2k − 1)π/2) = 0, k integer.

Lecture 4 / @ Copyright: George Nakos 146


Engineering Mathematics / The Johns Hopkins University

4.7 Orthogonal Sets of Functions


We consider continuous real-valued functions defined on an interval [a, b] . So
they are in the set C [a, b] .
Definitions

1. We say that the distinct functions gm(x) and gn(x) are orthogonal on [a, b], if their integral inner product is zero. I.e., if

⟨gm, gn⟩ = ∫_a^b gm(x) gn(x) dx = 0,   for m ≠ n

2. We say that the sequence of distinct functions g1 (x) , g2 (x) , . . . , gn (x) , . . .


is an orthogonal set on [a, b] , if all functions are pairwise orthogonal.
I.e., if
hgm , gni = 0, for all m 6= n

Recall that the norm or length of each gm on [a, b] under this inner product is

‖gm‖ = √⟨gm, gm⟩ = ( ∫_a^b gm(x) gm(x) dx )^(1/2) = ( ∫_a^b gm²(x) dx )^(1/2)

Lecture 4 / @ Copyright: George Nakos 147


Engineering Mathematics / The Johns Hopkins University

4.7 Orthonormal Sets of Functions


Definition We say that the sequence of distinct functions g1 (x) , g2 (x) , . . . , gn (x) , . . .
is an orthonormal set on [a, b] , if

1. The set is orthogonal: hgm , gni = 0 for m 6= n, and

2. All functions are unit: kgm k = 1.

Because ‖gm‖ = 1 is equivalent to ‖gm‖² = 1, the above definition is equivalent to saying

⟨gm, gn⟩ = ∫_a^b gm(x) gn(x) dx = { 0, for all m ≠ n;  1, for all m = n }

Note: From an orthogonal set we may obtain an orthonormal set by dividing each function by its own norm. So we replace gm with (1/‖gm‖) gm.

This is because if ‖gm‖ ≠ 1, 0, then ‖(1/‖gm‖) gm‖ = (1/‖gm‖) ‖gm‖ = 1
Lecture 4 / @ Copyright: George Nakos 148
Engineering Mathematics / The Johns Hopkins University

4.7 Assumptions

Standing assumptions: From now on we assume that

a. all functions we discuss are bounded on [a, b],

b. their integrals over [a, b] are finite, and

c. their norms are nonzero.

Lecture 4 / @ Copyright: George Nakos 149


Engineering Mathematics / The Johns Hopkins University

4.7 Examples of Orthogonal Sets


Example 1 Let gm (x) = sin mx, m = 1, 2, . . .

1. Show that gm (x), m = 1, 2, . . . forms an orthogonal set on [−π, π] .

2. Find each norm and the corresponding orthonormal set.

Solution:

1. If m ≠ n, then

⟨gm, gn⟩ = ∫_{−π}^{π} sin(mx) sin(nx) dx
         = (1/2) ∫_{−π}^{π} [cos((m − n)x) − cos((m + n)x)] dx
         = (1/(2(m − n))) sin((m − n)x)|_{−π}^{π} − (1/(2(m + n))) sin((m + n)x)|_{−π}^{π}
         = 0 + 0
         = 0
Lecture 4 / @ Copyright: George Nakos 150
Engineering Mathematics / The Johns Hopkins University

4.7 Examples of Orthogonal Sets


2. Each norm is computed from

‖gm‖² = ∫_{−π}^{π} sin²(mx) dx = (1/2) ∫_{−π}^{π} (1 − cos(2mx)) dx = (1/2) [x − sin(2mx)/(2m)]_{−π}^{π} = (1/2)(2π) = π

So

‖sin(mx)‖ = √π,   for m = 1, 2, . . .

So, the corresponding orthonormal set is

sin x/√π, sin(2x)/√π, sin(3x)/√π, . . . , sin(mx)/√π, . . .
Lecture 4 / @ Copyright: George Nakos 151
Engineering Mathematics / The Johns Hopkins University

4.7 Examples of Orthogonal Sets


Example 2 Consider the set
1, cos x, sin x, cos (2x) , sin (2x) , . . . , cos (mx) , sin (mx) , . . .

1. Show that this set forms an orthogonal set on [−π, π] .

2. Find each norm and the corresponding orthonormal set.

Solution:

1. We have

a.  ⟨1, cos(mx)⟩ = ∫_{−π}^{π} (1) cos(mx) dx = sin(mx)/m |_{−π}^{π} = 0

b.  ⟨1, sin(mx)⟩ = ∫_{−π}^{π} (1) sin(mx) dx = −cos(mx)/m |_{−π}^{π} = 0

Lecture 4 / @ Copyright: George Nakos 152


Engineering Mathematics / The Johns Hopkins University

4.7 Examples of Orthogonal Sets


c. If m 6= n, then hsin (mx) , sin (nx)i = 0. This was proved before.

d. If m ≠ n, then

⟨cos(mx), sin(nx)⟩ = ∫_{−π}^{π} cos(mx) sin(nx) dx
                   = (1/2) ∫_{−π}^{π} (sin((m + n)x) − sin((m − n)x)) dx
                   = −(1/(2(m + n))) cos((m + n)x)|_{−π}^{π} + (1/(2(m − n))) cos((m − n)x)|_{−π}^{π}
                   = 0 + 0 = 0

e.  ⟨cos(mx), sin(mx)⟩ = ∫_{−π}^{π} cos(mx) sin(mx) dx = (1/2) ∫_{−π}^{π} sin(2mx) dx = −(1/2) cos(2mx)/(2m)|_{−π}^{π} = 0
Lecture 4 / @ Copyright: George Nakos 153
Engineering Mathematics / The Johns Hopkins University

4.7 Examples of Orthogonal Sets


f. If m ≠ n, then

⟨cos(mx), cos(nx)⟩ = ∫_{−π}^{π} cos(mx) cos(nx) dx
                   = (1/2) ∫_{−π}^{π} (cos((m − n)x) + cos((m + n)x)) dx
                   = (1/(2(m − n))) sin((m − n)x)|_{−π}^{π} + (1/(2(m + n))) sin((m + n)x)|_{−π}^{π}
                   = 0 + 0
                   = 0

2. The norms are computed from

a.  ‖1‖² = ∫_{−π}^{π} 1 dx = x|_{−π}^{π} = 2π

Lecture 4 / @ Copyright: George Nakos 154


Engineering Mathematics / The Johns Hopkins University

4.7 Examples of Orthogonal Sets


b.  ‖cos(mx)‖² = ∫_{−π}^{π} cos²(mx) dx = (1/2) ∫_{−π}^{π} (1 + cos(2mx)) dx = (1/2) [x + sin(2mx)/(2m)]_{−π}^{π} = π

c. ‖sin(mx)‖² = π was proved in the last example. So the norms are

‖1‖ = √(2π),   ‖cos(mx)‖ = √π,   ‖sin(mx)‖ = √π,   for m = 1, 2, . . .

So the orthonormal set is

1/√(2π), cos x/√π, sin x/√π, cos(2x)/√π, sin(2x)/√π, . . . , cos(mx)/√π, sin(mx)/√π, . . .

Lecture 4 / @ Copyright: George Nakos 155


Engineering Mathematics / The Johns Hopkins University

4.8 Generalized Fourier Series


Orthogonal sets are very important because if f (x) is a given
function defined on [a, b] and g1 (x) , g2 (x) , . . . , gn (x) , . . . orthog-
onal on [a, b] , then in general f (x) can be represented as a con-
vergent series of the gn (x) .

f(x) = Σ_{n=1}^{∞} an gn(x) = a1 g1(x) + · · · + an gn(x) + · · ·     (GFS)
where the an are constants that depend on the function f (x) .

Equation (GFS) is called the generalized Fourier series of f (x)


with respect to the orthogonal set gn (x) n = 1, 2, . . . . The
coefficients an are the generalized Fourier coefficients of f (x) .

Under some general conditions the series in (GFS) converges to


the function f (x) .
Lecture 4 / @ Copyright: George Nakos 156
Engineering Mathematics / The Johns Hopkins University

4.8 Generalized Fourier Series

Under the convergence conditions the generalized Fourier coef-


ficients can be computed as follows.
⟨f, gn⟩ = ⟨ Σ_{m=1}^{∞} am gm, gn ⟩ = Σ_{m=1}^{∞} am ⟨gm, gn⟩ = an ⟨gn, gn⟩

because ⟨gm, gn⟩ = 0 for m ≠ n, by orthogonality. Therefore,

an = ⟨f, gn⟩ / ⟨gn, gn⟩ = ⟨f, gn⟩ / ‖gn‖²     (GFC1)

Or by the definition of the integral inner product

an = (1/‖gn‖²) ∫_a^b f(x) gn(x) dx = ( ∫_a^b f(x) gn(x) dx ) / ( ∫_a^b gn²(x) dx )     (GFC2)

Lecture 4 / @ Copyright: George Nakos 157


Engineering Mathematics / The Johns Hopkins University

4.8 Example The (Classical) Fourier Series


Consider the orthogonal functions on [−π, π] of Example 2: 1, cos x, sin x, cos(2x), sin(2x), . . .
If a function f(x) is defined on [−π, π], then the special generalized Fourier series

f(x) = a0 + Σ_{n=1}^{∞} (an cos(nx) + bn sin(nx))     (FS)
is called the (classical) Fourier series of f (x) . The coefficients are computed
by using (GFC2) to get
a0 = (1/(2π)) ∫_{−π}^{π} f(x) dx

an = (1/π) ∫_{−π}^{π} f(x) cos(nx) dx     (FC)

bn = (1/π) ∫_{−π}^{π} f(x) sin(nx) dx

because we have already seen in Example 2 that

‖1‖ = √(2π),   ‖cos(mx)‖ = √π,   ‖sin(mx)‖ = √π,   for m = 1, 2, . . .

Relations (FC) are the (classical) Fourier coefficients of f (x) .
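As a sketch of how the coefficient formulas (FC) can be evaluated numerically, the snippet below approximates the Fourier coefficients of the illustrative function f(x) = x on [−π, π]. This choice of f, and the use of SciPy quadrature, are assumptions for the demonstration only.

    import numpy as np
    from scipy.integrate import quad

    f = lambda x: x          # illustrative choice of f on [-pi, pi]

    def fourier_coefficients(f, N):
        a0 = quad(f, -np.pi, np.pi)[0] / (2 * np.pi)
        a = [quad(lambda x: f(x) * np.cos(n * x), -np.pi, np.pi)[0] / np.pi for n in range(1, N + 1)]
        b = [quad(lambda x: f(x) * np.sin(n * x), -np.pi, np.pi)[0] / np.pi for n in range(1, N + 1)]
        return a0, a, b

    a0, a, b = fourier_coefficients(f, 4)
    print(a0, a)   # ~0 and ~[0, 0, 0, 0], since f is odd
    print(b)       # ~[2, -1, 2/3, -1/2], i.e. bn = 2*(-1)**(n+1)/n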


Lecture 4 / @ Copyright: George Nakos 158
Engineering Mathematics / The Johns Hopkins University

4.8 Orthogonality with Respect to a Weight


Function
Let p(x) be a positive function defined on [a, b]. I.e., p(x) > 0 for all x in [a, b]. The assignment

(f, g) → ⟨f, g⟩ = ∫_a^b p(x) f(x) g(x) dx

defines an inner product on the vector space C[a, b] of all real-valued continuous functions defined on [a, b]. The norm defined by this inner product is

‖f‖ = ( ∫_a^b p(x) f²(x) dx )^(1/2)

Definition Let p(x) be a positive function defined on [a, b]. The sequence of functions g1(x), g2(x), . . . is an orthogonal set on [a, b] with respect to the weight function p(x), if each inner product with weight p(x) is zero. I.e., if

⟨gm, gn⟩ = ∫_a^b p(x) gm(x) gn(x) dx = 0,   for m ≠ n

Lecture 4 / @ Copyright: George Nakos 159


Engineering Mathematics / The Johns Hopkins University

4.8 Orthogonality with Respect to a Weight


Function
If each function in an orthogonal set has norm 1 with respect to the weight
function p (x) , then the set is orthonormal with respect to the weight
function p (x) .
Notes

1. Orthogonality is the same as orthogonality with respect to the weight


function p (x) = 1, for all x in [a, b] .

2. If g1(x), g2(x), . . . are orthogonal with respect to the weight p(x) and we set hn(x) = √(p(x)) gn(x), then by the weighted orthogonality we get

∫_a^b hm(x) hn(x) dx = ∫_a^b √(p(x)) gm(x) √(p(x)) gn(x) dx = ∫_a^b p(x) gm(x) gn(x) dx = 0

So the functions h1(x), h2(x), . . . are orthogonal in the usual sense.

Lecture 4 / @ Copyright: George Nakos 160


Engineering Mathematics / The Johns Hopkins University

Euler’s Formula
Euler's Formula relates the complex exponential function to the trigonometric sines and cosines. If t is a real number, then

e^{it} = cos t + i sin t

where i = √−1 is the complex unit, so that i² = −1.

Example We have

e^{iπ} = −1,   e^{iπ/2} = i,   e^{i2π} = 1,   e^{2+3i} = e²(cos 3 + i sin 3)

because

e^{iπ} = cos π + i sin π = −1 + i·0 = −1
e^{iπ/2} = cos(π/2) + i sin(π/2) = 0 + i = i
e^{i2π} = cos(2π) + i sin(2π) = 1 + i·0 = 1
e^{2+3i} = e² e^{3i} = e²(cos 3 + i sin 3) ≈ −7.3151 + 1.0427i
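These values are easy to confirm numerically; a sketch using NumPy's complex exponential:

    import numpy as np

    print(np.exp(1j * np.pi))        # ~ -1
    print(np.exp(1j * np.pi / 2))    # ~  i
    print(np.exp(2 + 3j))            # ~ -7.3151 + 1.0427j
    print(np.exp(2) * (np.cos(3) + 1j * np.sin(3)))   # same value, via Euler's formula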

Lecture 4 / @ Copyright: George Nakos 161


Engineering Mathematics / The Johns Hopkins University

Review
Linear Homogeneous with Constant Coefficients

Let y = y (x) be an unknown function of the independent variable x.

A nth order differential equation with constant coefficients is of the form


an d^n y/dx^n + a_{n−1} d^{n−1}y/dx^{n−1} + · · · + a1 dy/dx + a0 y = f(x)     (N)

where all the ai are constants and f(x) is a given function.

If f (x) is nonzero (N) is called nonhomogeneous.

If the function f (x) is the zero function, then we have a homogeneous


differential equation:

an d^n y/dx^n + a_{n−1} d^{n−1}y/dx^{n−1} + · · · + a1 dy/dx + a0 y = 0     (H)

(H) is called the associated homogeneous equation of (N).

Lecture 4 / @ Copyright: George Nakos 162


Engineering Mathematics / The Johns Hopkins University

Review
Linear Homogeneous with Constant Coefficients

In order to solve (H), we seek solutions of the form y = erx , where r is a


constant. Given that
d^k(e^{rx})/dx^k = r^k e^{rx}

substitution into (H) yields

an(r^n e^{rx}) + a_{n−1}(r^{n−1} e^{rx}) + · · · + a1(r e^{rx}) + a0(e^{rx}) = 0
  ⇒ e^{rx}(an r^n + a_{n−1} r^{n−1} + · · · + a1 r + a0) = 0
  ⇒ an r^n + a_{n−1} r^{n−1} + · · · + a1 r + a0 = 0

So to find solutions of the form y = e^{rx} it suffices to solve the auxiliary polynomial equation

an r^n + a_{n−1} r^{n−1} + · · · + a1 r + a0 = 0     (A)
The roots of (A) can be real, or complex, or repeated.

Lecture 4 / @ Copyright: George Nakos 163


Engineering Mathematics / The Johns Hopkins University

Review
Linear Homogeneous with Constant Coefficients

For the special case of a second order equation the general solution is discussed in the following theorem.

Theorem Let y = y(x) be an unknown function satisfying

a d²y/dx² + b dy/dx + c y = 0     (H2)

with auxiliary equation

a r² + b r + c = 0     (A2)

1. If (A2) has two distinct real roots r1 and r2, then the general real solution of (H2) is given by

y(x) = c1 e^{r1 x} + c2 e^{r2 x}

for any constants c1 and c2 .

Lecture 4 / @ Copyright: George Nakos 164


Engineering Mathematics / The Johns Hopkins University

Review
Linear Homogeneous with Constant Coefficients

2. If (A2) has a double real root r, then the general real solution
of (H2) is given by

y (x) = c1erx + c2xerx


for any constants c1 and c2.

3. If (A2) has a complex conjugate pair of roots a ± ib, then


the general real solution of (H2) is given by

y (x) = c1eax cos (bx) + c2eax sin (bx)


for any constants c1 and c2.
Lecture 4 / @ Copyright: George Nakos 165
Engineering Mathematics / The Johns Hopkins University

Review
Linear Homogeneous with Constant Coefficients

Example Solve 2y″ − 7y′ + 3y = 0.

Solution: The auxiliary equation is 2r² − 7r + 3 = 0, so r = 1/2, 3. Hence,

y(x) = c1 e^{x/2} + c2 e^{3x}

Example Solve y″ − 8y′ + 16y = 0.

Solution: The auxiliary equation is r² − 8r + 16 = 0, so r = 4, 4. Hence,

y(x) = c1 e^{4x} + c2 x e^{4x}

Example Solve y″ − 8y′ + 20y = 0.

Solution: The auxiliary equation is r² − 8r + 20 = 0, so r = 4 − 2i, 4 + 2i. Therefore,

y(x) = c1 e^{4x} cos(2x) + c2 e^{4x} sin(2x)
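The roots of each auxiliary polynomial can be checked with NumPy's roots routine (coefficients listed from the highest degree down); a minimal sketch:

    import numpy as np

    print(np.roots([2, -7, 3]))    # [3.  0.5]         -> y = c1*e^{x/2} + c2*e^{3x}
    print(np.roots([1, -8, 16]))   # [4. 4.]           -> repeated root
    print(np.roots([1, -8, 20]))   # [4.+2.j 4.-2.j]   -> complex pair a +/- ib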

Lecture 4 / @ Copyright: George Nakos 166


Engineering Mathematics / The Johns Hopkins University

Review
Linear Homogeneous with Constant Coefficients

An initial value problem (IVP) is a set of differential equations


and initial conditions (IC’s).

Example Solve the IVP.


y 00 + 16y = 0, y (π ) = 3, y 0 (π ) = −5

Solution: We have r² + 16 = 0. So, r = ±4i. Hence, y(x) = c1 cos(4x) + c2 sin(4x). Differentiate to get y′ = −4c1 sin 4x + 4c2 cos 4x. Now y(π) = 3 yields c1 = 3 and y′(π) = −5 yields c2 = −5/4. So the solution is

y(x) = 3 cos(4x) − (5/4) sin(4x)
Lecture 4 / @ Copyright: George Nakos 167
Engineering Mathematics / The Johns Hopkins University

4.7 Sturm-Liouville Theory


A Sturm-Liouville problem (S-L) defined on [a, b] is a boundary value problem
for a second order homogeneous differential equation in unknown function
y = y (x) that can be written in the form
(r(x) y′)′ + [q(x) + λp(x)] y = 0


with two boundary conditions of the form


k1 y (a) + k2 y 0 (a) = 0
l1 y (b) + l2 y 0 (b) = 0
where the constants k1 , k2 are not both zero and the constants l1 , l2 are also
not both zero. The number λ is called the parameter of the S-L problem.

Note that an S-L problem always has the trivial solution y(x) = 0 for all x in [a, b]. If λ is a scalar such that the S-L problem has a nontrivial solution y(x), then λ is called an eigenvalue of the problem and the nontrivial y(x) is called an eigenfunction corresponding to λ.

Lecture 4 / @ Copyright: George Nakos 168


Engineering Mathematics / The Johns Hopkins University

4.7 Sturm-Liouville Theory


Example Find the eigenvalues and eigenfunctions of the S-L problem.
y 00 + λy = 0, y (0) = 0, y (π) = 0

Solution: We have the following cases:

Case 1 Let λ < 0, say λ = −v 2 for v > 0. Then we have


y 00 − v 2 y = 0 ⇒ r 2 − v 2 = 0 ⇒ r = ±v
This is a case of two real roots. So

y (x) = c1 evx + c2 e−vx


Using the boundary conditions: The first yields y (0) = 0 = c1 + c2 , so
c2 = −c1 . The second condition yields y (π) = c1 (evπ − e−vπ ) = 0. So, c1 = 0.
Hence, c2 = 0. We only get the trivial solution.

Case 2 Let λ = 0. Then y 00 (x) = 0. Hence, y (x) = c1 x + c2 , by integration.


Using the boundary conditions we get y (0) = 0 = c1 . Hence, y (x) = c2 . Now
y (π) = c2 = 0. Thus, we again get the trivial solution.

Lecture 4 / @ Copyright: George Nakos 169


Engineering Mathematics / The Johns Hopkins University

4.7 Sturm-Liouville Theory

Case 3 Let λ > 0, say λ = v 2 for v > 0. Then we have


y 00 + v 2 y = 0 ⇒ r2 + v 2 = 0 ⇒ r = ±vi
This is a case of two complex conjugate roots. So

y (x) = c1 cos (vx) + c2 sin (vx)


Using the boundary conditions we get y (0) = 0 = c1 . So y (x) = c2 sin (vx) .
Now y (π) = c2 sin (vπ) = 0. If c2 = 0, we get the trivial solution. If c2 6= 0,
then sin (vπ) = 0. Hence, vπ = nπ, where n is any integer. Therefore, v = n
is an integer. So there are infinitely many eigenvalues
λn = n2
with corresponding eigenfunctions
yn (x) = sin (nx) , n = 1, 2, 3, . . .

Exercise Find the eigenvalues and eigenfunctions of the S-L problem.


y″ + λy = 0,   y(π) = y(−π),   y′(π) = y′(−π)

Lecture 4 / @ Copyright: George Nakos 170


Engineering Mathematics / The Johns Hopkins University

4.8 Orthogonality of Eigenfunctions


Theorem 1 (Orthogonality of Eigenfunctions) Suppose p(x), q(x), r(x), and r′(x) are real-valued continuous functions defined on [a, b] for the S-L problem

(r(x) y′)′ + [q(x) + λp(x)] y = 0

k1 y(a) + k2 y′(a) = 0
l1 y(b) + l2 y′(b) = 0

Let ym(x) and yn(x) be two eigenfunctions corresponding to different eigenvalues λm and λn. Then ym(x) and yn(x) are orthogonal with respect to the weight function p(x). Furthermore:

1. If r (a) = 0, then the first boundary condition can be dropped.

2. If r (b) = 0, then the second boundary condition can be dropped.

3. If r (a) = r (b) , then the two boundary conditions can be replaced by


y (a) = y (b) , y 0 (a) = y 0 (b)

Lecture 4 / @ Copyright: George Nakos 171


Engineering Mathematics / The Johns Hopkins University

4.8 Reality of Eigenvalues

Theorem 2 (Real Eigenvalues) If p(x), q(x), r(x), and r′(x) are real-valued continuous functions defined on [a, b] for

[r(x) y′]′ + [q(x) + λp(x)] y = 0
k1 y(a) + k2 y′(a) = 0
l1 y(b) + l2 y′(b) = 0
and p (x) is either positive in the entire interval [a, b], or nega-
tive in the entire interval [a, b] , then all the eigenvalues are real
numbers.

Lecture 4 / @ Copyright: George Nakos 172


Engineering Mathematics / The Johns Hopkins University

4.7 Sturm-Liouville Example:


Periodic Boundary Conditions
Example 2 Find the eigenvalues and eigenfunctions of the S-L problem.

y 00 + λy = 0, y (0) = y (2π) , y 0 (0) = y 0 (2π)


Solution: We have

Case 1 Let λ < 0, say λ = −v² for v > 0. Then the auxiliary equation is r² − v² = 0. We get r = ±v. So, y(x) = c1 e^{vx} + c2 e^{−vx}. Using the boundary conditions we have y′ = v c1 e^{vx} − v c2 e^{−vx}. Hence,

y(0) = c1 + c2 = y(2π) = c1 e^{2πv} + c2 e^{−2πv}
y′(0) = v c1 − v c2 = y′(2π) = v c1 e^{2πv} − v c2 e^{−2πv}

Thus, we get the homogeneous linear system in c1 and c2

c1 (1 − e^{2πv}) + c2 (1 − e^{−2πv}) = 0
c1 (1 − e^{2πv}) + c2 (e^{−2πv} − 1) = 0

Lecture 4 / @ Copyright: George Nakos 173


Engineering Mathematics / The Johns Hopkins University

4.7 Sturm-Liouville Example:


Periodic Boundary Conditions
The coefficient matrix has determinant

| 1 − e^{2πv}   1 − e^{−2πv} |
| 1 − e^{2πv}   e^{−2πv} − 1 |  =  2e^{−2πv} + 2e^{2πv} − 4  =  2(e^{πv} − e^{−πv})²  ≠  0

So the system has only the trivial solution.
Case 2 Let λ = 0. Then y 00 (x) = 0. Hence, y (x) = c1 x + c2 , by integration.
Using the boundary conditions we get
y (0) = c2 = y (2π) = 2πc1 + c2
y 0 (0) = c1 = y 0 (2π) = c1
Hence, c1 = 0. However, there is no restriction on c2 . So y (x) can be any
constant. Say, y (x) = a0 .
Case 3 Let λ > 0, say λ = v² for v > 0. Then the auxiliary equation is r² + v² = 0. We get r = ±iv. So, y(x) = c1 cos(vx) + c2 sin(vx). Using the boundary conditions we have y′ = −v c1 sin(vx) + v c2 cos(vx). Hence,

y(0) = c1 = y(2π) = c1 cos(2πv) + c2 sin(2πv)
y′(0) = v c2 = y′(2π) = −v c1 sin(2πv) + v c2 cos(2πv)

Lecture 4 / @ Copyright: George Nakos 174


Engineering Mathematics / The Johns Hopkins University

4.7 Sturm-Liouville Example:


Periodic Boundary Conditions
Thus, we get the homogeneous linear system in c1 and c2

c1 (1 − cos(2πv)) + c2 (−sin(2πv)) = 0
c1 (sin(2πv)) + c2 (1 − cos(2πv)) = 0

For nontrivial solutions the coefficient determinant must be zero.

| 1 − cos(2πv)   −sin(2πv)    |
| sin(2πv)       1 − cos(2πv) |  =  2 − 2cos(2πv)  =  4 sin²(πv)  =  0

Therefore, πv = nπ, where n is an integer. So the system has eigenfunctions


c1 cos (nx) + c2 sin (nx), n is any integer.

So all eigenfunctions are nontrivial linear combinations of the eigenfunctions


1, cos x, cos (2x) , cos (3x) , . . . , sin x, sin (2x) , sin (3x) , . . .

Lecture 4 / @ Copyright: George Nakos 175


Engineering Mathematics / The Johns Hopkins University

Lecture 5 in Engineering
Mathematics
George Nakos

Engineering Mathematics
The Johns Hopkins University

Fall 2004

176
Engineering Mathematics / The Johns Hopkins University

11.2 Modeling the Vibrating String

We consider a string of length L attached to fixed points with x-coordinates 0


and L. Let u (x, t) be the deflection or displacement (signed vertical distance
from the x-axis) of the string at location x at time t.

Goal: Calculate u (x, t) , given (a) the ends of the strings are fixed and (b)
initial displacement u (x, 0) and initial velocity ut (x, 0) .

We need assumptions to simplify the partial differential equation for u (x, t) .


This is because PDEs are very hard or impossible to solve exactly.

Lecture 5 / @ Copyright: George Nakos 177


Engineering Mathematics / The Johns Hopkins University

11.2 Modeling the Vibrating String

Assumptions:

1. The mass of the string per unit length is constant (homogeneous string). The string is elastic and does not resist bending.

2. The tension caused by stretching is much greater than the force of gravity. So, gravity is not a factor here.

3. The string performs a small transverse motion in the vertical plane, so


that both the deflection u (x, t) and its slope ux (x, t) are small.

Lecture 5 / @ Copyright: George Nakos 178


Engineering Mathematics / The Johns Hopkins University

11.2 Modeling the Vibrating String

Forces:
Consider forces acting on small portions of the string.
Since there is no resistance to bending, the tension is tangential to the curve
of the string at each point.
Let T1 and T2 be the tensions at P and Q.
Horizontal direction: There is no motion in the horizontal direction, so the
horizontal component must be constant, say T . So
T1 cos α = T2 cos β = T (1)

Lecture 5 / @ Copyright: George Nakos 179


Engineering Mathematics / The Johns Hopkins University

11.2 Modeling the Vibrating String

Vertical direction: In the vertical direction we have two forces, the vertical components −T1 sin α and T2 sin β.

Let ρ be the linear mass density of the string, i.e., mass per unit length. By Newton's second law the resultant force is the mass ρ∆x times the acceleration ∂²u/∂t², evaluated at some point between x and x + ∆x.

T2 sin β − T1 sin α = ρ ∆x ∂²u/∂t²

Lecture 5 / @ Copyright: George Nakos 180


Engineering Mathematics / The Johns Hopkins University

11.2 Modeling the Vibrating String


Using (1) we get

(T2 sin β)/(T2 cos β) − (T1 sin α)/(T1 cos α) = tan β − tan α = (ρ ∆x / T) ∂²u/∂t²

But tan α = (∂u/∂x)|_x and tan β = (∂u/∂x)|_{x+∆x} are the slopes at x and x + ∆x. So we have

(1/∆x) [ (∂u/∂x)|_{x+∆x} − (∂u/∂x)|_x ] = (ρ/T) ∂²u/∂t²

Now we take the limit as ∆x → 0 to get

∂²u/∂t² = c² ∂²u/∂x²     (W-1)

where c² = T/ρ.
Equation (W-1) is the one-dimensional wave equation.
Lecture 5 / @ Copyright: George Nakos 181
Engineering Mathematics / The Johns Hopkins University

11.3 Solving the 1-D Wave Equation


We solve the one-dimensional wave equation

∂²u/∂t² = c² ∂²u/∂x²     (W-1)

subject to the fixed-ends boundary conditions


u (0, t) = 0, u (L, t) = 0, t≥0 (BC)

and initial conditions specifying an initial deflection f (x) and initial velocity
g (x) , for x such that 0 ≤ x ≤ L.

u(x, 0) = f(x),   (∂u/∂t)|_{t=0} = g(x),   0 ≤ x ≤ L     (IC)

Method of solution

Lecture 5 / @ Copyright: George Nakos 182


Engineering Mathematics / The Johns Hopkins University

11.3 Solving the 1-D Wave Equation


Stage 1: Separation of Variables

First we seek nontrivial solutions of the system (W-1), (BC). Notice that the trivial solution is already a solution. To solve (W-1), (BC) we use the method of separation of variables. I.e., we seek solutions of the form

u(x, t) = X(x) T(t)

where X = X(x) is a function of x only and T = T(t) is a function of t only. Substitution into (W-1) yields

X T″ = c² X″ T

where by X′ we mean dX/dx and by T′ we mean dT/dt. Now we separate the variables by dividing both sides by c² X T to get

T″/(c²T) = X″/X

Now x and t are completely independent variables, one being location and one being time. So the only way the function T″/(c²T) of t can equal the function X″/X of x is if they are both the same constant, say −λ. So,
Lecture 5 / @ Copyright: George Nakos 183
Engineering Mathematics / The Johns Hopkins University

11.3 Solving the 1-D Wave Equation


T″/(c²T) = X″/X = −λ

Therefore, we get a system of two homogeneous ordinary differential equations with constant coefficients: one in X only and one in T only.

T″ + c²λT = 0,   X″ + λX = 0
These can be readily solved, provided we know the constant λ.

We use X 00 + λX = 0 and the boundary conditions (BC) to find λ and X (x) .


The boundary conditions (BC) are written in terms of X and T. For all t ≥ 0
u (0, t) = X (0) T (t) = 0, u (L, t) = X (L) T (t) = 0
T cannot be identically zero (T (t) = 0, for all t), or else u (x, t) would be
zero for all x and t, hence we would get the trivial solution. So we must have
X (0) = 0 and X (L) = 0. We get the Sturm-Liouville problem
X 00 + λX = 0, X (0) = 0, X (L) = 0
which we have essentially solved before. We have the following cases:
Lecture 5 / @ Copyright: George Nakos 184
Engineering Mathematics / The Johns Hopkins University

11.3 Solving the 1-D Wave Equation


Case 1 Let λ < 0, say λ = −v 2 for v > 0. Then we have
X 00 − v 2 X = 0 ⇒ r 2 − v 2 = 0 ⇒ r = ±v
This is a case of two real roots. So

X(x) = c1 e^{vx} + c2 e^{−vx}

Using the boundary conditions we get X(0) = 0 = c1 + c2 and X(L) = c1 e^{vL} + c2 e^{−vL} = 0. So c2 = −c1. Hence, X(L) = c1 (e^{vL} − e^{−vL}) = 0. Thus, c1 = 0. So, c2 = 0 and we get the trivial solution.
Case 2 Let λ = 0. Then X 00 (x) = 0. Hence, X (x) = c1 x + c2 , by integration.
Using the boundary conditions we get X (0) = 0 = c2 . Hence, X (x) = c1 x.
Now X (L) = c1 L = 0. So, c1 = 0. Thus, we again get the trivial solution.
Case 3 Let λ > 0, say λ = v 2 for v > 0. Then we have
X 00 + v 2 X = 0 ⇒ r2 + v 2 = 0 ⇒ r = ±vi
This is a case of two complex conjugate roots. So

X (x) = c1 cos (vx) + c2 sin (vx)

Lecture 5 / @ Copyright: George Nakos 185


Engineering Mathematics / The Johns Hopkins University

11.3 Solving the 1-D Wave Equation


Using the boundary conditions we get X(0) = 0 = c1. So X(x) = c2 sin(vx). Now X(L) = c2 sin(vL) = 0. If c2 = 0, we get the trivial solution. If c2 ≠ 0, then sin(vL) = 0. Hence, vL = nπ, where n is any integer. Therefore, v = nπ/L. So there are infinitely many eigenvalues

λn = (nπ/L)²,   n = 1, 2, 3, . . .

with corresponding eigenfunctions

Xn(x) = sin(nπx/L),   n = 1, 2, 3, . . .

Note that since sin(−x) = −sin(x) and sin(0) = 0, we need not keep zero or any negative integer values for n; the signs can be absorbed by the constant coefficients.

Now that we know X and λ we turn to T. The equation T 00 + c2 λT = 0 takes


the form

Lecture 5 / @ Copyright: George Nakos 186


Engineering Mathematics / The Johns Hopkins University

11.3 Solving the 1-D Wave Equation

Tn″ + (cnπ/L)² Tn = 0

which can be solved right away, because (cnπ/L)² > 0. The auxiliary equation is r² + (cnπ/L)² = 0. So, r = ±(cnπ/L)i. We have, for any constants an and bn,

Tn = an cos(cnπt/L) + bn sin(cnπt/L),   n = 1, 2, 3, . . .

Hence, u = XnTn becomes

un(x, t) = (an cos(cnπt/L) + bn sin(cnπt/L)) sin(nπx/L),   n = 1, 2, 3, . . .

Note that since the system (W-1), (BC) is homogeneous, any finite sum of solutions un is also a solution.

u(x, t) = Σ_{n=1}^{k} un(x, t)

Lecture 5 / @ Copyright: George Nakos 187


Engineering Mathematics / The Johns Hopkins University

11.3 Solving the 1-D Wave Equation


Under certain conditions an infinite sum of solutions is also a solution

u(x, t) = Σ_{n=1}^{∞} un(x, t)

So we may have a general solution of the form

u(x, t) = Σ_{n=1}^{∞} (an cos(cnπt/L) + bn sin(cnπt/L)) sin(nπx/L)     (W-1-Sol)

Stage 2: Fourier Analysis


Solution (W-1-Sol) is the kind of solution this method produces, provided we know the coefficients an and bn. These can be computed by using the initial conditions u(x, 0) = f(x) and (∂u/∂t)|_{t=0} = g(x).

Using the first condition and (W-1-Sol) with t = 0 yields

u(x, 0) = Σ_{n=1}^{∞} an sin(nπx/L) = f(x)

Lecture 5 / @ Copyright: George Nakos 188


Engineering Mathematics / The Johns Hopkins University

11.3 Solving the 1-D Wave Equation




Now recall that the functions sn(x) = sin(nπx/L) were eigenfunctions of the S-L problem on [0, L]. Therefore, by Theorem 1 on S-L problems, these eigenfunctions must be orthogonal. So we can use the generic formula an = ⟨f, sn⟩ / ⟨sn, sn⟩ to find an. We have

an = ⟨f, sn⟩ / ⟨sn, sn⟩ = ( ∫_0^L f(x) sin(nπx/L) dx ) / ( ∫_0^L sin²(nπx/L) dx )
   = ( ∫_0^L f(x) sin(nπx/L) dx ) / ( (1/2) ∫_0^L (1 − cos(2nπx/L)) dx )
   = ( ∫_0^L f(x) sin(nπx/L) dx ) / (L/2)
   = (2/L) ∫_0^L f(x) sin(nπx/L) dx

Using the second condition and (W-1-sol) with t = 0 yields


(∂u/∂t)|_{t=0} = Σ_{n=1}^{∞} bn (cnπ/L) sin(nπx/L) = g(x)

The functions sn(x) = sin(nπx/L) are orthogonal, so we can use the generic formula bn (cnπ/L) = ⟨g, sn⟩ / ⟨sn, sn⟩ to find bn. We have

Lecture 5 / @ Copyright: George Nakos 189


Engineering Mathematics / The Johns Hopkins University

11.3 Solving the 1-D Wave Equation


bn = (L/(cnπ)) ⟨g, sn⟩ / ⟨sn, sn⟩ = (L/(cnπ)) ( ∫_0^L g(x) sin(nπx/L) dx ) / ( ∫_0^L sin²(nπx/L) dx )
   = (L/(cnπ)) ( ∫_0^L g(x) sin(nπx/L) dx ) / (L/2)
   = (2/(cnπ)) ∫_0^L g(x) sin(nπx/L) dx

So the method of separation of variables yields the following solution to the


one-dimensional wave equation.

u(x, t) = Σ_{n=1}^{∞} (an cos(cnπt/L) + bn sin(cnπt/L)) sin(nπx/L)

an = (2/L) ∫_0^L f(x) sin(nπx/L) dx,   n = 1, 2, . . .

bn = (2/(cnπ)) ∫_0^L g(x) sin(nπx/L) dx,   n = 1, 2, . . .
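To make these formulas concrete, here is a sketch that evaluates a truncated version of the series for an illustrative plucked string. The triangular initial deflection f, the zero initial velocity g, the values of L and c, and the truncation at N terms are all assumptions for the demonstration, not part of the derivation (NumPy/SciPy assumed).

    import numpy as np
    from scipy.integrate import quad

    L, c = 1.0, 1.0                      # illustrative string length and wave speed
    f = lambda x: np.minimum(x, L - x)   # triangular initial deflection (plucked at the middle)
    g = lambda x: 0.0                    # zero initial velocity
    N = 30                               # number of series terms kept

    a = [2 / L * quad(lambda x: f(x) * np.sin(n * np.pi * x / L), 0, L)[0] for n in range(1, N + 1)]
    b = [2 / (c * n * np.pi) * quad(lambda x: g(x) * np.sin(n * np.pi * x / L), 0, L)[0] for n in range(1, N + 1)]

    def u(x, t):
        # truncated separation-of-variables solution of the 1-D wave equation
        total = 0.0
        for n in range(1, N + 1):
            w = c * n * np.pi / L
            total += (a[n - 1] * np.cos(w * t) + b[n - 1] * np.sin(w * t)) * np.sin(n * np.pi * x / L)
        return total

    print(u(0.5, 0.0))   # ~ f(0.5) = 0.5, the initial deflection at the midpoint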

Lecture 5 / @ Copyright: George Nakos 190
