
ERROR AND SENSITIVITY ANALYSIS FOR SYSTEMS

OF LINEAR EQUATIONS
Read parts of sections 2.6 and 3.5.3
Conditioning of linear systems.
Estimating errors for solutions of linear systems
Backward error analysis
Relative element-wise error analysis

References: GvL 3.5; Heath 2.3; TB lecture 12.

Perturbation analysis for linear systems (Ax = b)


Question addressed by perturbation analysis: determine
the variation of the solution x when the data, namely
A and b, undergo small variations. The problem is ill-conditioned
if small variations in the data cause very large variations in the solution.


Analysis I: Asymptotic First Order Analysis


Let $E$ be an $n \times n$ matrix and $e_b$ an $n$-vector.
Perturb $A$ into $A(\epsilon) = A + \epsilon E$ and $b$ into $b + \epsilon e_b$.
Note: $A + \epsilon E$ is nonsingular for $\epsilon$ small enough.
- Why?
The solution $x(\epsilon)$ of the perturbed system is such that

$(A + \epsilon E)\, x(\epsilon) = b + \epsilon e_b.$


Let $\delta(\epsilon) = x(\epsilon) - x$. Then,

$(A + \epsilon E)\,\delta(\epsilon) = (b + \epsilon e_b) - (A + \epsilon E)x = \epsilon\,(e_b - Ex)$

$\Rightarrow\; \delta(\epsilon) = \epsilon\, (A + \epsilon E)^{-1}(e_b - Ex).$

$x(\epsilon)$ is differentiable at $\epsilon = 0$ and its derivative is

$x'(0) = \lim_{\epsilon \to 0} \frac{\delta(\epsilon)}{\epsilon} = A^{-1}(e_b - Ex).$

A small variation $\epsilon\,[E, e_b]$ will cause the solution to vary
by roughly $\epsilon\, x'(0) = \epsilon\, A^{-1}(e_b - Ex)$.


The relative variation is such that

$\frac{\|x(\epsilon) - x\|}{\|x\|} \le \epsilon\, \|A^{-1}\| \left( \frac{\|e_b\|}{\|x\|} + \|E\| \right) + O(\epsilon^2).$

Since $\|b\| \le \|A\|\,\|x\|$:

$\frac{\|x(\epsilon) - x\|}{\|x\|} \le \epsilon\, \|A\|\,\|A^{-1}\| \left( \frac{\|e_b\|}{\|b\|} + \frac{\|E\|}{\|A\|} \right) + O(\epsilon^2)$


The quantity $\kappa(A) = \|A\|\,\|A^{-1}\|$ is called the condition
number of the linear system with respect to the norm $\|\cdot\|$.
When using the p-norms we write:

$\kappa_p(A) = \|A\|_p\, \|A^{-1}\|_p$

Note: $\kappa_2(A) = \sigma_{\max}(A)/\sigma_{\min}(A)$ = ratio of largest to
smallest singular values of $A$. This allows $\kappa_2(A)$ to be defined even when
$A$ is not square.
The determinant *is not* a good indication of sensitivity.
Small eigenvalues *do not* always give a good indication
of poor conditioning.
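To make the definition concrete, here is a small numerical check (a sketch using NumPy; the matrix is an arbitrary illustration, not one from the notes) that $\kappa_2(A)$ computed as $\|A\|_2\|A^{-1}\|_2$ equals the ratio of extreme singular values:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # arbitrary example matrix

# Condition number as the product of norms
kappa2 = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)

# Same quantity from the singular values
s = np.linalg.svd(A, compute_uv=False)
print(kappa2, s[0] / s[-1], np.linalg.cond(A, 2))  # all three agree
```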


Example:

Consider, for a large $\alpha$, the $n \times n$ matrix

$A = I + \alpha\, e_1 e_n^T$

The inverse of $A$ is:

$A^{-1} = I - \alpha\, e_1 e_n^T$

For the $\infty$-norm we have

$\|A\|_\infty = \|A^{-1}\|_\infty = 1 + |\alpha|$

so that

$\kappa_\infty(A) = (1 + |\alpha|)^2.$

Can give a very large condition number for a large $\alpha$,
but all the eigenvalues of A are equal to one.
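A quick numerical illustration of this example (a sketch; the values of n and alpha are arbitrary choices):

```python
import numpy as np

n, alpha = 6, 1.0e6
A = np.eye(n)
A[0, n - 1] = alpha                # A = I + alpha * e1 * en^T

print(np.linalg.cond(A, np.inf))   # roughly (1 + alpha)^2, i.e. ~1e12
print(np.linalg.eigvals(A))        # all eigenvalues equal to 1
```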


Rigorous norm-based error bounds

The previous bound is valid only when the perturbation is small
enough, where "small" is not precisely defined.

The new bound is valid within an explicitly given neighborhood.

THEOREM 1: Assume that $(A + E)y = b + e_b$ and
$Ax = b$ and that $\|A^{-1}\|\,\|E\| < 1$. Then $A + E$ is
nonsingular and

$\frac{\|x - y\|}{\|x\|} \le \frac{\|A^{-1}\|\,\|A\|}{1 - \|A^{-1}\|\,\|E\|} \left( \frac{\|E\|}{\|A\|} + \frac{\|e_b\|}{\|b\|} \right)$


To prove this, first need to show that $A + E$ is nonsingular
if $A$ is nonsingular and $E$ is small. Begin with a simple case:

LEMMA: If $\|E\| < 1$ then $I - E$ is nonsingular and

$\|(I - E)^{-1}\| \le \frac{1}{1 - \|E\|}$

The proof is based on the following 5 steps:
a) Show: If $\|E\| < 1$ then $I - E$ is nonsingular.
b) Show: $(I - E)(I + E + E^2 + \cdots + E^k) = I - E^{k+1}$.
c) From which we get:

$(I - E)^{-1} = \sum_{i=0}^{k} E^i + (I - E)^{-1} E^{k+1}$


d) $(I - E)^{-1} = \lim_{k \to \infty} \sum_{i=0}^{k} E^i$. We write this as

$(I - E)^{-1} = \sum_{i=0}^{\infty} E^i$

e) Finally:

$\|(I - E)^{-1}\| = \left\| \lim_{k \to \infty} \sum_{i=0}^{k} E^i \right\| = \lim_{k \to \infty} \left\| \sum_{i=0}^{k} E^i \right\|
\le \lim_{k \to \infty} \sum_{i=0}^{k} \|E^i\| \le \lim_{k \to \infty} \sum_{i=0}^{k} \|E\|^i = \frac{1}{1 - \|E\|}$
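A small numerical sanity check of the lemma (a sketch with an arbitrary random perturbation scaled so that $\|E\|_2 < 1$):

```python
import numpy as np

rng = np.random.default_rng(0)
E = rng.standard_normal((5, 5))
E *= 0.5 / np.linalg.norm(E, 2)          # scale so that ||E||_2 = 0.5 < 1

lhs = np.linalg.norm(np.linalg.inv(np.eye(5) - E), 2)
rhs = 1.0 / (1.0 - np.linalg.norm(E, 2))
print(lhs <= rhs + 1e-12)                # the bound of the lemma holds
```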


Can generalize this result:

LEMMA: If $A$ is nonsingular and $\|A^{-1}\|\,\|E\| < 1$ then
$A + E$ is nonsingular and

$\|(A + E)^{-1}\| \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\|\,\|E\|}$

The proof is based on the relation $A + E = A(I + A^{-1}E)$ and use
of the previous lemma.


Now we can prove the theorem:

THEOREM 1: Assume that $(A + E)y = b + e_b$ and
$Ax = b$ and that $\|A^{-1}\|\,\|E\| < 1$. Then $A + E$ is
nonsingular and

$\frac{\|x - y\|}{\|x\|} \le \frac{\|A^{-1}\|\,\|A\|}{1 - \|A^{-1}\|\,\|E\|} \left( \frac{\|E\|}{\|A\|} + \frac{\|e_b\|}{\|b\|} \right)$


Proof: From $(A + E)y = b + e_b$ and $Ax = b$ we get
$(A + E)(y - x) = e_b - Ex$. Hence:

$y - x = (A + E)^{-1}(e_b - Ex)$

Taking norms: $\|y - x\| \le \|(A + E)^{-1}\| \left[ \|e_b\| + \|E\|\,\|x\| \right]$.
Dividing by $\|x\|$ and using the result of the lemma:

$\frac{\|y - x\|}{\|x\|} \le \|(A + E)^{-1}\| \left[ \frac{\|e_b\|}{\|x\|} + \|E\| \right]
\le \frac{\|A^{-1}\|}{1 - \|A^{-1}\|\,\|E\|} \left[ \frac{\|e_b\|}{\|x\|} + \|E\| \right]
\le \frac{\|A^{-1}\|\,\|A\|}{1 - \|A^{-1}\|\,\|E\|} \left[ \frac{\|e_b\|}{\|A\|\,\|x\|} + \frac{\|E\|}{\|A\|} \right]$

The result follows by using the inequality $\|A\|\,\|x\| \ge \|b\|$. QED


Simplification when $e_b = 0$:

$\frac{\|x - y\|}{\|x\|} \le \frac{\|A^{-1}\|\,\|E\|}{1 - \|A^{-1}\|\,\|E\|}$

Simplification when $E = 0$:

$\frac{\|x - y\|}{\|x\|} \le \|A^{-1}\|\,\|A\|\, \frac{\|e_b\|}{\|b\|}$

Slightly less general form: Assume that $\|E\|/\|A\| \le \delta$
and $\|e_b\|/\|b\| \le \delta$ and $\delta\,\kappa(A) < 1$. Then

$\frac{\|x - y\|}{\|x\|} \le \frac{2\,\delta\,\kappa(A)}{1 - \delta\,\kappa(A)}$


Another common form:

THEOREM 2: Let $(A + \Delta A)y = b + \Delta b$ and $Ax = b$
where $\|\Delta A\| \le \epsilon\,\|E\|$, $\|\Delta b\| \le \epsilon\,\|e_b\|$, and assume that
$\epsilon\,\|A^{-1}\|\,\|E\| < 1$. Then

$\frac{\|x - y\|}{\|x\|} \le \frac{\epsilon\, \|A^{-1}\|\,\|A\|}{1 - \epsilon\,\|A^{-1}\|\,\|E\|} \left( \frac{\|e_b\|}{\|b\|} + \frac{\|E\|}{\|A\|} \right)$


Normwise backward error

We solve $Ax = b$ and find an approximate solution $y$.

Question: Find the smallest perturbation to apply to $A$ and $b$ so
that the *exact* solution of the perturbed system is $y$.


Normwise backward error in just A or b

Suppose we model the entire perturbation in the RHS $b$.
Let $r = b - Ay$ be the residual.
Then $y$ satisfies $Ay = b + \Delta b$ with $\Delta b = -r$ exactly.
The relative perturbation to the RHS is $\|r\| / \|b\|$.

Suppose we model the entire perturbation in the matrix $A$.
Then $y$ satisfies $\left( A + \dfrac{r\, y^T}{y^T y} \right) y = b$.
The relative perturbation to the matrix is

$\left\| \frac{r\, y^T}{y^T y} \right\|_2 / \|A\|_2 = \frac{\|r\|_2}{\|A\|_2\, \|y\|_2}$
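To illustrate, a short check (a sketch; A, b, and the perturbed "computed" solution y are arbitrary) that both constructions reproduce y exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)
y = np.linalg.solve(A, b) + 1e-6 * rng.standard_normal(4)  # perturbed "computed" solution
r = b - A @ y

# All of the perturbation in b:  A y = b + db  with  db = -r
print(np.allclose(A @ y, b - r))

# All of the perturbation in A:  (A + r y^T / (y^T y)) y = b
dA = np.outer(r, y) / (y @ y)
print(np.allclose((A + dA) @ y, b))
print(np.linalg.norm(dA, 2), np.linalg.norm(r) / np.linalg.norm(y))  # equal
```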

Normwise backward error in both A & b

For a given $y$ and given perturbation directions $E$, $e_b$, we
define the normwise backward error:

$\eta_{E,e_b}(y) = \min\{\, \epsilon \;|\; (A + \Delta A)y = b + \Delta b \ \text{for some } \Delta A, \Delta b \ \text{satisfying } \|\Delta A\| \le \epsilon\,\|E\| \ \text{and } \|\Delta b\| \le \epsilon\,\|e_b\| \,\}$

In other words, $\eta_{E,e_b}(y)$ is the smallest $\epsilon$ for which

$(A + \Delta A)y = b + \Delta b; \quad \|\Delta A\| \le \epsilon\,\|E\|; \quad \|\Delta b\| \le \epsilon\,\|e_b\| \qquad (1)$


$y$ is given (a computed solution). $E$ and $e_b$ are to be selected
(most likely directions of perturbation for $A$ and $b$).

Typical choice: $E = A$, $e_b = b$
- Explain why this is not unreasonable.

Let $r = b - Ay$. Then we have:

THEOREM 3: $\eta_{E,e_b}(y) = \dfrac{\|r\|}{\|E\|\,\|y\| + \|e_b\|}$

The normwise backward error for the case $E = A$, $e_b = b$ is:

$\eta_{A,b}(y) = \dfrac{\|r\|}{\|A\|\,\|y\| + \|b\|}$


- Show how this can be used in practice as a means to
stop an iterative method which computes a sequence of
approximate solutions to $Ax = b$ (a sketch is given below).
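One possible approach (a minimal sketch, using a simple Richardson-type iteration purely for illustration) is to stop once the normwise backward error $\eta_{A,b}(y_k) = \|r_k\| / (\|A\|\,\|y_k\| + \|b\|)$ drops below a tolerance close to the unit roundoff:

```python
import numpy as np

def solve_with_backward_error_stop(A, b, tol=1e-12, maxit=500):
    """Toy stationary iteration stopped by the normwise backward error."""
    normA = np.linalg.norm(A, 2)
    y = np.zeros_like(b)
    omega = 1.0 / normA**2          # step size for Richardson on the normal equations
    for k in range(maxit):
        r = b - A @ y
        eta = np.linalg.norm(r) / (normA * np.linalg.norm(y) + np.linalg.norm(b))
        if eta <= tol:              # y is the exact solution of a nearby system
            return y, k, eta
        y = y + omega * (A.T @ r)   # Richardson step on the normal equations
    return y, maxit, eta
```

The appeal of this criterion is that, by Theorem 3, stopping at $\eta_{A,b}(y_k) \approx u$ means the iterate is already the exact solution of a system whose data differ from $(A, b)$ by no more than rounding-level perturbations.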


- Consider the $6 \times 6$ Vandermonde system $Ax = b$ where
$a_{ij} = j^{2(i-1)}$ and $b = A\,[1, 1, \dots, 1]^T$. We perturb $A$ by a matrix $E$
with $|E| \le 10^{-10}|A|$, perturb $b$ similarly, and solve the system.
Evaluate the backward error for this case. Evaluate the
forward bound provided by Theorem 2. Comment on the
results. (A possible setup is sketched below.)
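A possible way to set up this experiment (a sketch; the random sign pattern of the perturbation and the use of the $\infty$-norm are arbitrary choices):

```python
import numpy as np

n = 6
i = np.arange(1, n + 1).reshape(-1, 1)
j = np.arange(1, n + 1).reshape(1, -1)
A = j ** (2.0 * (i - 1))                  # a_ij = j^(2(i-1))
b = A @ np.ones(n)

rng = np.random.default_rng(0)
E  = 1e-10 * np.abs(A) * rng.choice([-1.0, 1.0], A.shape)   # |E| <= 1e-10 |A|
eb = 1e-10 * np.abs(b) * rng.choice([-1.0, 1.0], b.shape)

y = np.linalg.solve(A + E, b + eb)
r = b - A @ y

eta = np.linalg.norm(r, np.inf) / (np.linalg.norm(A, np.inf) * np.linalg.norm(y, np.inf)
                                   + np.linalg.norm(b, np.inf))
kappa = np.linalg.cond(A, np.inf)
forward_bound = 2 * 1e-10 * kappa / (1 - 1e-10 * kappa)     # Theorem 2 with E = A, e_b = b
print(eta, forward_bound, np.linalg.norm(y - np.ones(n), np.inf))
```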


Proof of Theorem 3

Let $D \equiv \|E\|\,\|y\| + \|e_b\|$ and $\eta \equiv \eta_{E,e_b}(y)$. The theorem
states that $\eta = \|r\|/D$. Proof in 2 steps.

First: Any $\Delta A$, $\Delta b$ pair satisfying (1) is such that
$\epsilon \ge \|r\|/D$. Indeed from (1) we have (recall that $r = b - Ay$):

$Ay + \Delta A\, y = b + \Delta b \;\Rightarrow\; r = \Delta A\, y - \Delta b$

$\Rightarrow\; \|r\| \le \|\Delta A\|\,\|y\| + \|\Delta b\| \le \epsilon\,(\|E\|\,\|y\| + \|e_b\|) \;\Rightarrow\; \epsilon \ge \frac{\|r\|}{D}$

Second: We need to show an instance where the minimum
value $\|r\|/D$ is reached. Take the pair $\Delta A$, $\Delta b$:

$\Delta A = \alpha\, r z^T; \qquad \Delta b = -\beta\, r; \qquad \text{with} \quad \alpha = \frac{\|E\|\,\|y\|}{D}, \quad \beta = \frac{\|e_b\|}{D}$

The vector $z$ depends on the norm used; for the 2-norm:
$z = y/\|y\|_2^2$. Here: proof only for the 2-norm.

a) We need to verify that the first part of (1) is satisfied:

$(A + \Delta A)y = Ay + \alpha\, r\, \frac{y^T y}{\|y\|_2^2} = Ay + \alpha r
= b - r + \alpha r = b - (1 - \alpha)\, r
= b - \left( 1 - \frac{\|E\|\,\|y\|}{\|E\|\,\|y\| + \|e_b\|} \right) r
= b - \frac{\|e_b\|}{D}\, r = b + \Delta b$

which is the desired result.

b) Finally: must now verify that $\|\Delta A\| = \eta\,\|E\|$ and
$\|\Delta b\| = \eta\,\|e_b\|$ with $\eta = \|r\|/D$. Exercise: show that $\|u v^T\|_2 = \|u\|_2\, \|v\|_2$.

$\|\Delta A\|_2 = \frac{|\alpha|}{\|y\|_2^2}\, \|r y^T\|_2 = \frac{\|E\|\,\|y\|_2}{D}\, \frac{\|r\|_2\,\|y\|_2}{\|y\|_2^2} = \frac{\|r\|_2}{D}\, \|E\|$

$\|\Delta b\| = |\beta|\,\|r\| = \frac{\|e_b\|}{D}\, \|r\| = \frac{\|r\|}{D}\, \|e_b\|$

QED


Componentwise backward error

A few more definitions on norms...
A norm is absolute if $\|\,|x|\,\| = \|x\|$ for all $x$ (satisfied by
all p-norms).
A norm is monotone if $|x| \le |y| \Rightarrow \|x\| \le \|y\|$.
It can be shown that these two properties are equivalent.


- Show: a function $\phi$ which satisfies the first 2 requirements
of vector norms (1. $\phi(x) \ge 0$, with $\phi(x) = 0$ iff $x = 0$, and 2.
$\phi(\alpha x) = |\alpha|\,\phi(x)$) satisfies the triangle inequality iff its
unit ball is convex.
- (Continued) Use the above to construct a norm in $\mathbb{R}^2$
that is *not* absolute.


- Define absolute *matrix* norms in the same way. Which of
the norms $\|A\|_1$, $\|A\|_\infty$, $\|A\|_2$, and $\|A\|_F$ are absolute?

- Recall that for any matrix $fl(A) = A + E$ with $|E| \le u\,|A|$.
For an absolute matrix norm this gives

$\|E\| \le u\, \|A\|$

What does this imply?

Component-wise analysis requires that we use norms
that are *absolute*.

We will restrict the analysis to $\|\cdot\|_\infty$.

See Sec. 2.6.5 of the text.


Analogue of Theorem 2 for the case $E = |A|$, $e_b = |b|$:

THEOREM 4: Let $Ax = b$ and $(A + \Delta A)y = b + \Delta b$
where $|\Delta A| \le \epsilon\,|A|$ and $|\Delta b| \le \epsilon\,|b|$. Assume that
$\epsilon\,\kappa(A) = \sigma < 1$. Then $A + \Delta A$ is nonsingular and

$\frac{\|x - y\|}{\|x\|} \le \frac{2\,\epsilon}{1 - \sigma}\; \left\| \,|A^{-1}|\; |A|\, \right\|$


Componentwise relative condition number:

$\kappa_C(A) \equiv \left\| \,|A^{-1}|\; |A|\, \right\|$

- Redo the example seen after Theorem 3 (the $6 \times 6$ Vandermonde system) using componentwise analysis.


The componentwise backward error for $y$ is the smallest $\epsilon$
for which

$(A + \Delta A)y = b + \Delta b; \quad |\Delta A| \le \epsilon\, E; \quad |\Delta b| \le \epsilon\, e_b \qquad (2)$

It is denoted by $\omega_{E,e_b}(y)$.

THEOREM 5 [Oettli-Prager]: Let $r = b - Ay$ (residual). Then

$\omega_{E,e_b}(y) = \max_i \frac{|r_i|}{(E\,|y| + e_b)_i}.$

Zero denominator case: $0/0 \to 0$ and nonzero$/0 \to \infty$.
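A direct translation of the Oettli-Prager formula for the usual choice $E = |A|$, $e_b = |b|$ (a sketch; the 0/0 convention is handled explicitly):

```python
import numpy as np

def componentwise_backward_error(A, b, y):
    """Oettli-Prager backward error with E = |A|, e_b = |b|."""
    r = np.abs(b - A @ y)
    d = np.abs(A) @ np.abs(y) + np.abs(b)   # denominators (E|y| + e_b)_i
    w = np.zeros_like(r)
    nz = d > 0
    w[nz] = r[nz] / d[nz]                   # ordinary case
    w[(~nz) & (r > 0)] = np.inf             # nonzero / 0 -> infinity; 0/0 -> 0
    return w.max()
```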


Example of ill-conditioning: The Hilbert Matrix

A notorious example of ill-conditioning:

$H_n = \begin{pmatrix}
1 & \tfrac{1}{2} & \tfrac{1}{3} & \cdots & \tfrac{1}{n} \\
\tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{4} & \cdots & \tfrac{1}{n+1} \\
\vdots & & & & \vdots \\
\tfrac{1}{n} & \tfrac{1}{n+1} & \cdots & & \tfrac{1}{2n-1}
\end{pmatrix}, \qquad \text{i.e.,} \quad h_{ij} = \frac{1}{i + j - 1}$

For $n = 5$: $\kappa_2(H_n) = 4.766\ldots \times 10^5$.

Let $b_n = H_n (1, 1, \dots, 1)^T$.
The solution of $H_n x = b_n$ is $(1, 1, \dots, 1)^T$.
Let $n = 5$ and perturb $h_{5,1} = 0.2$ into $0.20001$.
New solution:

$(0.9937,\; 1.1252,\; 0.4365,\; 1.865,\; 0.5618)^T$
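This experiment is easy to reproduce (a sketch using NumPy; the printed digits may differ slightly from the slide depending on the arithmetic used):

```python
import numpy as np

n = 5
i = np.arange(1, n + 1)
H = 1.0 / (i[:, None] + i[None, :] - 1)    # Hilbert matrix, h_ij = 1/(i+j-1)
b = H @ np.ones(n)

print(np.linalg.cond(H, 2))                # about 4.77e5

Hp = H.copy()
Hp[4, 0] = 0.20001                         # perturb h_{5,1} = 0.2 by 1e-5
print(np.linalg.solve(Hp, b))              # far from the exact solution (1,...,1)
```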



Estimating condition numbers

Goal: avoid the expense of computing $A^{-1}$ explicitly.
Choose a random or carefully chosen vector $v$.
Solve $Au = v$ using the factorization already computed.
Then $\|A^{-1}\| \ge \|u\| / \|v\|$ gives a guess-timate of $\|A^{-1}\|$.
The estimated condition number is $\kappa(A) \approx \|A\|\,\|u\| / \|v\|$.

Typical choice for $v$: take $v = [\pm 1, \dots, \pm 1]^T$ with the signs
chosen on the fly during back-substitution to maximize the
next entry in the solution, based on the upper triangular
factor from Gaussian elimination.
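A minimal sketch of such an estimator (assuming an already computed LU factorization from SciPy; for simplicity this variant tries a few random +/-1 vectors rather than the sign-picking heuristic described above):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def estimate_cond(A, lu_piv, ntrials=5, rng=np.random.default_rng(0)):
    """Cheap lower-bound estimate of kappa_1(A) reusing an LU factorization."""
    normA = np.linalg.norm(A, 1)
    best = 0.0
    for _ in range(ntrials):
        v = rng.choice([-1.0, 1.0], A.shape[0])      # random +/-1 right-hand side
        u = lu_solve(lu_piv, v)                       # u = A^{-1} v, two triangular solves
        best = max(best, np.linalg.norm(u, 1) / np.linalg.norm(v, 1))
    return normA * best                               # kappa_1(A) is at least this large

A = np.array([[0.641, 0.242], [0.321, 0.121]])
print(estimate_cond(A, lu_factor(A)), np.linalg.cond(A, 1))
```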


Condition Number Measures How Close to Singularity

$1/\kappa(A) \le$ relative distance to the nearest singular matrix.

Let $A$, $B$ be two $n \times n$ matrices with $A$ nonsingular and
$B$ singular. Then

$\frac{1}{\kappa(A)} \le \frac{\|A - B\|}{\|A\|}$

Proof: $B$ singular $\Rightarrow \exists\, x \ne 0$ such that $Bx = 0$.

$\|x\| = \|A^{-1}Ax\| \le \|A^{-1}\|\,\|Ax\| = \|A^{-1}\|\,\|(A - B)x\| \le \|A^{-1}\|\,\|A - B\|\,\|x\|$

Divide both sides by $\|x\|\,\|A\|\,\|A^{-1}\| = \|x\|\,\kappa(A)$ $\Rightarrow$ result. QED.

Example:

Let $A = \begin{pmatrix} 1 & 1 \\ 1 & 0.99 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$. Then

$\frac{1}{\kappa_1(A)} \le \frac{0.01}{2} \qquad \Rightarrow \qquad \kappa_1(A) \ge \frac{2}{0.01} = 200.$

It can be shown that (Kahan):

$\frac{1}{\kappa(A)} = \min_B \left\{ \frac{\|A - B\|}{\|A\|} \;:\; \det(B) = 0 \right\}$


Estimating errors from residual norms

Let $\tilde{x}$ be an approximate solution to the system $Ax = b$ (e.g.,
computed from an iterative process). We can compute
the residual norm:

$\|r\| = \|b - A\tilde{x}\|$

Question: How to estimate the error $\|x - \tilde{x}\|$ from $\|r\|$?
One option is to use the inequality

$\frac{\|x - \tilde{x}\|}{\|x\|} \le \kappa(A)\, \frac{\|r\|}{\|b\|}.$

We must have an estimate of $\kappa(A)$.


Proof of the inequality.
First, note that $A(x - \tilde{x}) = b - A\tilde{x} = r$. So:

$\|x - \tilde{x}\| = \|A^{-1} r\| \le \|A^{-1}\|\,\|r\|$

Also note that from the relation $b = Ax$, we get

$\|b\| = \|Ax\| \le \|A\|\,\|x\| \quad \Rightarrow \quad \|x\| \ge \frac{\|b\|}{\|A\|}$

Therefore,

$\frac{\|x - \tilde{x}\|}{\|x\|} \le \frac{\|A^{-1}\|\,\|r\|}{\|b\|/\|A\|} = \kappa(A)\, \frac{\|r\|}{\|b\|}$

- Show that

$\frac{\|x - \tilde{x}\|}{\|x\|} \ge \frac{1}{\kappa(A)}\, \frac{\|r\|}{\|b\|}.$


THEOREM 6: Let $A$ be a nonsingular matrix and $\tilde{x}$ an
approximate solution to $Ax = b$. Then for any norm $\|\cdot\|$,

$\|x - \tilde{x}\| \le \|A^{-1}\|\,\|r\|$

In addition, we have the relation

$\frac{1}{\kappa(A)}\, \frac{\|r\|}{\|b\|} \;\le\; \frac{\|x - \tilde{x}\|}{\|x\|} \;\le\; \kappa(A)\, \frac{\|r\|}{\|b\|}$

in which $\kappa(A)$ is the condition number of $A$ associated
with the norm $\|\cdot\|$.
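A quick numerical check of the two-sided inequality of Theorem 6 (a sketch with an arbitrary random system and an artificially perturbed solution):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6))
x = rng.standard_normal(6)
b = A @ x
xt = x + 1e-4 * rng.standard_normal(6)         # "approximate" solution

r = b - A @ xt
kappa = np.linalg.cond(A, 2)
rel_err = np.linalg.norm(x - xt) / np.linalg.norm(x)
rho = np.linalg.norm(r) / np.linalg.norm(b)

print(rho / kappa <= rel_err <= kappa * rho)   # True
```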


Small Example
Solve an $Ax = b$ problem in 3-digit decimal arithmetic:

$\begin{pmatrix} 0.641 & 0.242 \\ 0.321 & 0.121 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0.883 \\ 0.442 \end{pmatrix}$

The solution by the standard algorithm is $y = \begin{pmatrix} 0.708 \\ 1.775 \end{pmatrix}$,

with residual $r = b - Ay = \begin{pmatrix} 7.12 \\ 1.12 \end{pmatrix} \times 10^{-4}.$

(Actually used 11-bit arithmetic, printed in decimal.)


Estimate forward error in small example

Get an estimated condition number from

$(1, -2)\, A = (-0.001,\; 0.000)$

Conclude: $\|A^{-1}\|_1 \ge \dfrac{\|(1, -2)\|_1}{\|(-0.001,\; 0.000)\|_1} = 3000.$

Combine with $\|A\|_1 = 0.962 \approx 1$.

Get a lower bound on the condition number: $\kappa_1(A) \gtrsim 3000$.
So Theorem 1 simplified (the case $E = 0$, $e_b = r$) gives

$\frac{\|x - y\|}{\|x\|} \le \|A^{-1}\|\,\|A\|\, \frac{\|e_b\|}{\|b\|} \approx 3000 \times \frac{8.24 \times 10^{-4}}{1.325} = 1.866,$

predicting no accuracy!
Keep over 4 decimal digits to get any accuracy at all!


Iterative refinement
Define the residual vector:

$r = b - A\tilde{x}$

We have seen that $x - \tilde{x} = A^{-1} r$, i.e., we have

$x = \tilde{x} + A^{-1} r$

Idea: Compute $r$ accurately (in double precision), then
solve

$A\,\delta = r$

... and correct $\tilde{x}$ by

$\tilde{x} := \tilde{x} + \delta$

... repeat if needed.
Read Section 3.5.3 for details.
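A minimal sketch of the refinement loop (assuming the LU factorization is reused for each correction solve and the residual is accumulated in higher precision via np.longdouble; the function name and precision choices are illustrative, not from the notes):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def iterative_refinement(A, b, nsteps=3):
    lu_piv = lu_factor(A.astype(np.float32))        # "working precision" factorization
    x = lu_solve(lu_piv, b.astype(np.float32)).astype(np.float64)
    for _ in range(nsteps):
        r = b.astype(np.longdouble) - A.astype(np.longdouble) @ x   # residual in extended precision
        delta = lu_solve(lu_piv, np.asarray(r, dtype=np.float32))   # solve A*delta = r cheaply
        x = x + delta.astype(np.float64)                            # correct and repeat
    return x
```

The design point is that the expensive factorization is done once; each correction then costs only a residual evaluation and a pair of triangular solves.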
