
Linear Systems - August 26-28, 2009

Topic - Linear Algebra Concepts

Linear systems theory is about the mathematical models we use to represent dynamical systems. In
this respect, this course is a math course about the mathematical tools we use to model the class of
so-called linear dynamical systems.
Mathematically, we think of a system as a mapping between a space of input functions and output
functions. So let Li denote the set of input functions and Lo denote the set of output functions.
We assume that these sets form a linear space, in that we can algebraically add signals together. The
system, G, is then a map G : Li → Lo .
We say this system is linear if the system map G satisfies the principle of superposition. In particular,
this means that for any w1 ∈ Li and w2 ∈ Li, the outputs generated by the system satisfy
(1) G[w1 + w2 ] = G[w1 ] + G[w2 ]
where we use G[w] to denote the function in Lo that is generated by system G using input w.
It will be seen later that such linear systems can always be represented in matrix-vector form. This
means that for each w ∈ Li and y ∈ Lo we can identify vectors w and y (possibly infinite dimen-
sional) and matrix G such that y = Gw. This means, of course, that we can use our understanding
of matrix-vector analysis (or rather linear algebra) to explore the structure of linear systems. This
application of linear algebra to linear systems is precisely what this course is about.
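To make the connection concrete before we dive in, here is a minimal numerical sketch (in Python with numpy; the particular matrix and signals are our own illustrative choices, not from the notes): any matrix, viewed as a map between finite-dimensional signal spaces, satisfies the superposition property (1).

```python
import numpy as np

# A finite-dimensional stand-in for a linear system map G : Li -> Lo.
# Any matrix acts linearly, so superposition (1) holds exactly
# (up to floating-point rounding).
rng = np.random.default_rng(0)
G = rng.standard_normal((3, 4))   # arbitrary "system" matrix
w1 = rng.standard_normal(4)       # two arbitrary input "signals"
w2 = rng.standard_normal(4)

lhs = G @ (w1 + w2)               # G[w1 + w2]
rhs = G @ w1 + G @ w2             # G[w1] + G[w2]
print(np.allclose(lhs, rhs))      # True: superposition holds
```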
Since matrix-vector computations represent such an important part of our future work, this lecture
will review a topic that should be familiar to most of you from your earlier studies in high
school. In particular, this lecture will talk about Gaussian elimination as a method to solve systems
of linear algebraic equations.
The familiar problem of solving linear algebraic equations considers equations of the form
(2) b = Ax
where x ∈ ℜ^n, b ∈ ℜ^m, and A ∈ ℜ^{m×n}. The problem is: given the vector b and the matrix A, determine the vector x.
With regard to this problem there are three questions we can consider

• Does a solution, x, exist?


• If a solution does exist, is the solution unique?
• If solutions exist, how do we find or characterize all such solutions?

In beginning to answer these questions, we start by considering a method for constructing such
solutions. That construction is called Gaussian elimination.
Homogeneous Systems of Equations
We’ll start from an example. Let’s consider the problem
(3) b = \begin{bmatrix} 1 \\ -2 \\ 7 \end{bmatrix} = \begin{bmatrix} 2 & 1 & 1 \\ 4 & 1 & 0 \\ -2 & 2 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = Ax
This matrix-vector equation, of course, is equivalent to the following set of coupled linear algebraic
equations,
(4) 1 = 2x1 + x2 + x3
(5) −2 = 4x1 + x2
(6) 7 = −2x1 + 2x2 + x3

The problem is to find x1, x2, and x3.


Gaussian elimination uses elementary row operations to reduce the original 3 by 3 set
of equations to a sequence of smaller problems. In essence, this reduction transforms the original
problem into an equivalent problem that is triangular in structure. The elementary row operations
involve multiplying one equation by a real constant, combining the result with another equation,
and then replacing that equation. The equation we transform and the constant we multiply by are
chosen to remove a variable from the target equation. By removing this variable, we effectively
reduce the dimensionality of the problem being solved.
As an example, let’s multiply the first equation by −2, add to the second equation, and use this to
replace the second equation. The resulting transformation takes the form
−2 = −4x1 −2x2 −2x3
(7) −2 = 4x1 +x2
−4 = −x2 −2x3
Note that this removes x1 from the second equation.
We can now repeat this approach to remove x1 from the third equation. This is done by an elementary
row operation that multiplies the first equation by 1, adds it to the third equation, and
replaces that third equation to yield
+1 = +2x1 +x2 +x3
(8) 7 = −2x1 +2x2 +x3
8 = 3x2 +2x3

This transformation results in the following system of equations in which x1 has been removed from
the second and third equations,
1 = 2x1 +x2 +x3
(9) −4 = −x2 −2x3
8 = 3x2 + 2x3
The (1,1) element, the coefficient 2 that multiplies x1 in the first equation, is called the pivot. Notice that the last two equations form
a smaller 2 by 2 system of equations that is completely independent of x1. So we can apply
Gaussian elimination again to this smaller 2 by 2 system.
For this smaller system, we use a row operation to remove x2 from the last equation. This
is accomplished by multiplying the second equation by 3, adding it to the third equation, and then
replacing the third equation with the result. In this case, we see that
−12 = −3x2 −6x3
(10) 8 = 3x2 +2x3
−4 = −4x3
So that we now obtain a completely transformed system of linear equations whose right hand side
is triangular in structure,
1 = 2x1 +x2 +x3
(11) −4 = −x2 −2x3
−4 = −4x3

The last equation above is a 1 by 1 system of equations whose solution can be written down by
inspection to obtain x3 = 1. We can now use this value for x3 to solve the 2 by 2 system of equations.
We back substitute the value for x3 into the second equation to obtain
(12) −4 = −x2 − 2 ⇒ x2 = 2

We repeat this approach by taking the values for x3 and x2 , back substituting into the first equation
to obtain
(13) 1 = 2x1 + 2 + 1 ⇒ x1 = −1
This process is called back substitution.
So when does this approach (Gaussian Elimination with Back substitution) fail? Obviously we can’t
use it if the pivot is zero. Note that if this occurs, we may be able to avoid the zero pivot problem
by simply reordering the equations. If this cannot be done, however, then we say the system of
equations is singular.
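As a concrete illustration, here is a minimal sketch (in Python with numpy; the function name gauss_solve is ours, not from the notes) of Gaussian elimination with back substitution, including the equation reordering (partial pivoting) just mentioned. Applied to the 3 by 3 example above, it reproduces x1 = −1, x2 = 2, x3 = 1.

```python
import numpy as np

def gauss_solve(A, b):
    """Solve Ax = b by forward elimination with row reordering
    (partial pivoting), followed by back substitution."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):
        # Reorder equations so the pivot is the largest entry in the column.
        p = k + np.argmax(np.abs(A[k:, k]))
        if np.isclose(A[p, k], 0.0):
            raise ValueError("zero pivot: the system is singular")
        # Fancy indexing returns copies, so this row swap is safe.
        A[[k, p]], b[[k, p]] = A[[p, k]], b[[p, k]]
        # Eliminate x_k from the equations below the pivot row.
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]        # multiplier for this row
            A[i, k:] -= m * A[k, k:]     # subtract m times the pivot row
            b[i] -= m * b[k]
    # Back substitution on the triangular system.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        if np.isclose(A[i, i], 0.0):
            raise ValueError("zero pivot: the system is singular")
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[2, 1, 1], [4, 1, 0], [-2, 2, 1]])
b = np.array([1, -2, 7])
print(gauss_solve(A, b))   # [-1.  2.  1.]
```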
A singular system has either no solution or an infinite number of solutions. How do we determine
which case applies to a specific singular system? Again let’s examine this question through an
example.
The system under consideration takes the form,
(14) b = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 3 & 3 & 2 \\ 2 & 6 & 9 & 5 \\ -1 & -3 & 3 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = Ax
This system is singular. This will become apparent as we apply Gaussian elimination to reduce the
system. The first pivot is a11 = 1. This is nonzero, so we can use elementary row operations
to transform the system’s original A matrix as follows,
(15) \begin{bmatrix} 1 & 3 & 3 & 2 \\ 2 & 6 & 9 & 5 \\ -1 & -3 & 3 & 0 \end{bmatrix} \Rightarrow \begin{bmatrix} 1 & 3 & 3 & 2 \\ 0 & 0 & 3 & 1 \\ -1 & -3 & 3 & 0 \end{bmatrix}
This was done by multiplying the first row by 2 and subtracting from the second row. Applying
Gaussian elimination to the third row yields,
(16) \begin{bmatrix} 1 & 3 & 3 & 2 \\ 0 & 0 & 3 & 1 \\ -1 & -3 & 3 & 0 \end{bmatrix} \Rightarrow \begin{bmatrix} 1 & 3 & 3 & 2 \\ 0 & 0 & 3 & 1 \\ 0 & 0 & 6 & 2 \end{bmatrix}
This was done by multiplying the first row by −1 and subtracting from the third row. Notice that
the second pivot is zero and no reordering of the equations will fix this problem. So this system of
equations is singular.
But we still have a smaller 2 by 2 square subsystem that involves the variables x3 and x4. So let’s
apply Gaussian elimination to this smaller subsystem. The pivot is now a23 = 3. We remove a
variable from the third equation by multiplying the second row by 2 and subtracting from the
third row. This yields,
(17) \begin{bmatrix} 1 & 3 & 3 & 2 \\ 0 & 0 & 3 & 1 \\ 0 & 0 & 6 & 2 \end{bmatrix} \Rightarrow \begin{bmatrix} 1 & 3 & 3 & 2 \\ 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
The problem now is that the last row is zero, so we can’t use back substitution to solve for x4.
The resulting variables in this system can now be divided into two sets (see the sketch after this list for a programmatic check).

• basic variables correspond to non-zero pivots. In this example the basic variables are x1
and x3 .
• free variables correspond to zero pivots. In this example those free variables are x2 and
x4 .
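The pivot bookkeeping above can be checked with a computer algebra system. A minimal sketch using sympy (the choice of tool is ours; the notes don't prescribe one): rref() returns the reduced row echelon form together with the indices of the pivot columns, which identify the basic variables.

```python
from sympy import Matrix

A = Matrix([[1, 3, 3, 2],
            [2, 6, 9, 5],
            [-1, -3, 3, 0]])

# rref() returns the reduced echelon form and the pivot column indices.
R, pivots = A.rref()
print(pivots)   # (0, 2): pivots sit in columns 1 and 3, so x1 and x3
                # are basic, while x2 and x4 are free.
```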

So to obtain the most general solution, we assign arbitrary values to the free variables and then use
back substitution.
In this case we see that the second equation yields
(18) 3x3 + x4 = 0 ⇒ x3 = −(1/3) x4
so that the basic variable x3 is expressed in terms of the free variable x4.
We now back substitute the basic variable x3’s expression into the first equation to obtain

(19) 0 = x1 + 3x2 + 3x3 + 2x4


(20) = x1 + 3x2 − x4 + 2x4
(21) = x1 + 3x2 + x4

and we solve for the single basic variable x1 in terms of the free variables x2 and x4 to obtain

(22) x1 = −3x2 − x4

Note that all solutions of this singular linear system of equations may now be written as
(23) x = \begin{bmatrix} -3x_2 - x_4 \\ x_2 \\ -\frac{1}{3}x_4 \\ x_4 \end{bmatrix} = x_2 \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} -1 \\ 0 \\ -1/3 \\ 1 \end{bmatrix}

All solutions are expressed as a linear combination of two vectors in ℜ^4. This set of linear
combinations forms a subspace of ℜ^4. The two vectors form a basis that spans this subspace. The
subspace given above has a "special" name: it is called the Null Space of the matrix
A. This is because any vector in the span of these two vectors is "nulled" by the matrix A.
This particular example shows we have an infinite number of solutions and those solutions are the
null space of A. This example assumed that b = 0. In this case, a singular system always has an
infinite number of solutions. Does the same thing happen if b ≠ 0? We’ll address that question in
the next lecture.
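Before moving on, the null space basis found above can be double-checked numerically. A small sketch, again using sympy (our choice of tool): nullspace() returns a basis for N(A), and for this A it returns the same two vectors obtained by back substitution.

```python
from sympy import Matrix

A = Matrix([[1, 3, 3, 2],
            [2, 6, 9, 5],
            [-1, -3, 3, 0]])

for v in A.nullspace():
    print(v.T)          # [-3, 1, 0, 0] and [-1, 0, -1/3, 1]
    print((A * v).T)    # [0, 0, 0]: each basis vector is "nulled" by A
```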
Inhomogeneous Systems of Equations
In the previous lecture we considered a homogeneous system of linear equations of the form
(24) 0 = b = Ax = \begin{bmatrix} 1 & 3 & 3 & 2 \\ 2 & 6 & 9 & 5 \\ -1 & -3 & 3 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}

We found that there were an infinite number of solutions, which we could express as
(25) x ∈ \left\{ x_2 \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} -1 \\ 0 \\ -1/3 \\ 1 \end{bmatrix} \right\} = Null Space of A = N(A)

Homogeneous problems always have a solution. The solution set is either the trivial solution {0} or the nontrivial
null space of A.

When b is non-zero, we have an inhomogeneous problem. To characterize the solution, we need to
transform b by our row operations. Again appealing to our earlier example, the transformation
of an arbitrary b = [b1 b2 b3]^T using the earlier row operations yields,
(26) b′ = \begin{bmatrix} b_1 \\ b_2 - 2b_1 \\ b_3 - 2b_2 + 5b_1 \end{bmatrix} = \begin{bmatrix} 1 & 3 & 3 & 2 \\ 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}

Note that the last equation requires
(27) b3 − 2b2 + 5b1 = 0
Not every b ∈ ℜ^3 will satisfy this relation. This means that the inhomogeneous problem may fail to
have a solution for certain b vectors. The question is how to characterize such a situation.
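One standard characterization, which the column space discussion below makes geometric, is a rank test: Ax = b is solvable exactly when appending b to the columns of A does not increase the rank. A minimal numerical sketch (numpy; the helper name solvable and the test vectors are ours):

```python
import numpy as np

A = np.array([[1, 3, 3, 2],
              [2, 6, 9, 5],
              [-1, -3, 3, 0]])

def solvable(b):
    """Ax = b has a solution iff rank([A | b]) == rank(A)."""
    return (np.linalg.matrix_rank(np.column_stack([A, b]))
            == np.linalg.matrix_rank(A))

b_good = np.array([1, 2, -1])   # satisfies b3 - 2*b2 + 5*b1 = 0
b_bad = np.array([1, 0, 0])     # violates condition (27)
print(solvable(b_good), solvable(b_bad))   # True False
```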
Note that b can be written as lying in the span of four vectors. In particular,
(28) b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} 1 & 3 & 3 & 2 \\ 2 & 6 & 9 & 5 \\ -1 & -3 & 3 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = Ax
(29) = x_1 \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} + x_2 \begin{bmatrix} 3 \\ 6 \\ -3 \end{bmatrix} + x_3 \begin{bmatrix} 3 \\ 9 \\ 3 \end{bmatrix} + x_4 \begin{bmatrix} 2 \\ 5 \\ 0 \end{bmatrix}
So that b lies in a subspace of ℜ^3 that is spanned by the columns of the A matrix. In other words,
for this system to have a solution we require that
(30) b ∈ col(A) = column space of A
In this case we can show that
(31) col(A) = span\left\{ \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} \right\}
which is a two-dimensional plane in ℜ^3. Clearly not all b in ℜ^3 lie on this plane. But if b does
lie in col(A), then we can solve the inhomogeneous problem by back substitution. In particular, we
can easily show that all solutions of this problem are given by
(32) x = x_2 \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} -1 \\ 0 \\ -1/3 \\ 1 \end{bmatrix} + \begin{bmatrix} 3b_1 - b_2 \\ 0 \\ \frac{1}{3}(b_2 - 2b_1) \\ 0 \end{bmatrix}
The first two terms on the right hand side of the above equation are vectors in the null space of A.
The third term on the right hand side of the equation is a particular solution to the problem. In
other words, if b ∈ col(A), then all solutions to the problem can be expressed as
(33) x ∈ xp + N (A)
where xp is a particular solution to the system of equations. Note that if N(A) = {0} (the trivial
null space) then the solution to our system of equations is unique.
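A short sketch tying the pieces together (numpy; the particular b is our own choice, picked to satisfy condition (27)): assemble xp plus an arbitrary null space combination and confirm that every choice of the free variables solves the system.

```python
import numpy as np

A = np.array([[1, 3, 3, 2],
              [2, 6, 9, 5],
              [-1, -3, 3, 0]], dtype=float)
b1, b2, b3 = 1.0, 2.0, -1.0        # chosen so b3 - 2*b2 + 5*b1 = 0
b = np.array([b1, b2, b3])

n1 = np.array([-3, 1, 0, 0], dtype=float)      # null space basis
n2 = np.array([-1, 0, -1 / 3, 1], dtype=float)
xp = np.array([3 * b1 - b2, 0, (b2 - 2 * b1) / 3, 0])  # particular solution

# Every choice of the free variables x2 and x4 yields a solution.
for x2, x4 in [(0, 0), (1, -2), (5, 3)]:
    x = xp + x2 * n1 + x4 * n2
    print(np.allclose(A @ x, b))   # True, True, True
```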
What if b does not lie in col(A)? We can still find a "solution" to this system of linear equations
by "enlarging" the set of what we think of as solutions. This involves finding a vector that satisfies
the equation in some sense other than strict "equality". We sometimes refer to this as a solution
concept.

To motivate an alternate solution concept, let’s consider a specific problem. We consider a known
physical "relation" that satisfies the linear equation Ax = b0, where x ∈ ℜ^2 and b0 ∈ ℜ^3. We assume
that we "measure" or "know" b0 only with some uncertainty. So this means that if we use the "measured"
value of b, then we search for x by trying to solve the following problem
(34) Ax = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = b_0 + v = b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
From the "physical" relationships in this problem we know that b0 must lie in col(A). But the
"noisy" measurement b = b0 + v will only lie in the column space of A if v is also in this column
space. In general, this will not be the case, so we have no guarantee that b is in col(A). Nonetheless,
we know a "solution" to this problem should exist. So can we define a "best" solution for this
problem?
Rather than working directly with the measured b, we consider a "best" b̂ that lies within col(A).
There are many ways of choosing this "best" vector, but one obvious choice is to minimize the
difference between b̂ and the measured b. In other words, we select the b̂ ∈ col(A) that minimizes
(35) \frac{1}{2}\|b - \hat{b}\|_2^2 = \frac{1}{2}\sum_{i=1}^{3}(b_i - \hat{b}_i)^2 = J(\hat{b})

Since b̂ must lie in col(A), we know there exists an x̂ ∈ ℜ^2 such that b̂ = Ax̂. So we can recast our
"cost" function in terms of the vector x̂ to obtain
(36) J(\hat{x}) = \frac{1}{2}\sum_{i=1}^{3}\left(b_i - \begin{bmatrix} a_{i1} & a_{i2} \end{bmatrix} \begin{bmatrix} \hat{x}_1 \\ \hat{x}_2 \end{bmatrix}\right)^2
where a_{ij} is the ij-th component of the A matrix.


The x̂ we’re looking for minimizes J. Because J is convex (quadratic) in x̂, we can
determine the minimizing point by setting the derivative of J to zero. In other words, the optimal
x̂* satisfies the equations
(37) 0 = \frac{\partial J}{\partial \hat{x}_1} = -\sum_{i=1}^{3}\left(b_i - \begin{bmatrix} a_{i1} & a_{i2} \end{bmatrix}\begin{bmatrix} \hat{x}_1 \\ \hat{x}_2 \end{bmatrix}\right) a_{i1}
(38) 0 = \frac{\partial J}{\partial \hat{x}_2} = -\sum_{i=1}^{3}\left(b_i - \begin{bmatrix} a_{i1} & a_{i2} \end{bmatrix}\begin{bmatrix} \hat{x}_1 \\ \hat{x}_2 \end{bmatrix}\right) a_{i2}

which we can rewrite in matrix-vector form as
(39) 0 = -\begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} + \begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \end{bmatrix}\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}\begin{bmatrix} \hat{x}_1 \\ \hat{x}_2 \end{bmatrix}
(40) = -A^T b + A^T A \hat{x}
Note that A^T A is usually invertible (it is invertible exactly when the columns of A are linearly independent), so that the "solution" to this problem becomes
(41) x̂* = (A^T A)^{-1} A^T b
The matrix (A^T A)^{-1} A^T is called the pseudo-inverse of A. This is our "solution" to an
inhomogeneous problem that has no solutions in the "conventional" sense.
This "mean-squared" solution concept is useful in estimation problems. Note that
(42) 0 = A^T b − A^T A x̂ = A^T (b − A x̂) = A^T (b − b̂)

The last equation says that the error b − b̂ always lies in the null space of A^T. This is a common property
of minimum mean square estimates that sometimes goes under the name of the orthogonality
principle.
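To close the loop, here is a minimal sketch of the least squares solution (41) and the orthogonality property (42), using a made-up 3 by 2 matrix A and a deliberately "noisy" b that does not lie in col(A) (all of the numbers here are our own illustrative choices):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([0.9, 2.1, 2.9])   # measurement pushed off col(A)

# Normal equations (40): A^T A x_hat = A^T b.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
# Equivalently: np.linalg.pinv(A) @ b, or np.linalg.lstsq(A, b, rcond=None)[0].
print(x_hat)

# Orthogonality principle (42): the residual lies in N(A^T).
residual = b - A @ x_hat
print(np.allclose(A.T @ residual, 0.0))   # True
```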
Subspace concepts are useful in characterizing when a system of linear algebraic equations has a
solution. But what "exactly" is a subspace? This concept is formalized in linear algebra, which
provides a large set of mathematical tools that can be applied to many engineering problems. We’ll
use linear algebraic concepts throughout this course to study systems that can be modeled by systems
of linear differential or difference equations. Next time, we’ll say a bit more about linear algebra,
primarily to review fundamental concepts and establish some notational conventions.
