Lecture 4 - Computational linear algebra

Björn Andersson (w/ Jianxin Wei)


Department of Statistics, Uppsala University

February 5, 2015

Table of Contents

1 Linear systems
Existence and uniqueness
Solving linear systems
2 Matrix decompositions
LU decomposition
Cholesky decomposition
QR decomposition
Singular value decomposition
Statistical applications
3 Linear least squares

4 Some useful functions in R

Existence and uniqueness

Existence and uniqueness of solutions

An n × n matrix A is said to be non-singular if it satisfies any of the
following equivalent conditions:
det(A) ≠ 0
A⁻¹ exists
rank(A) = n
For a given square matrix A and vector b, the linear system
Ax = b has:
A unique solution: A is non-singular and b is arbitrary
Infinitely many solutions: A is singular and b ∈ span(A)
No solution: A is singular and b ∉ span(A)

Solving linear systems

Forward substitution and backward substitution


Let L be a lower triangular matrix. Then the linear system Lx = b
can be solved by forward substitution:
1 Let $x_1 = b_1 / l_{11}$.
2 For $i \in \{2, \dots, n\}$, let
$$x_i = \frac{b_i - \sum_{j=1}^{i-1} l_{ij} x_j}{l_{ii}}.$$
Let U be an upper triangular matrix. Then the linear system
Ux = b can be solved by backward substitution:
1 Let $x_n = b_n / u_{nn}$.
2 For $i \in \{n-1, \dots, 1\}$, let
$$x_i = \frac{b_i - \sum_{j=i+1}^{n} u_{ij} x_j}{u_{ii}}.$$
Note that if $l_{ii} = 0$ or $u_{ii} = 0$ for some $i$, then the matrix
L or U is singular and the system does not have a unique solution.
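The recursion above translates directly into R. Below is a minimal
sketch of forward substitution (the helper name fwdsolve is ours; in
practice the built-in forwardsolve() should be preferred):

R> fwdsolve <- function(L, b) {
+   n <- length(b)
+   x <- numeric(n)
+   x[1] <- b[1] / L[1, 1]
+   for (i in seq_len(n)[-1]) {
+     # x_i = (b_i - sum_{j < i} l_ij x_j) / l_ii
+     x[i] <- (b[i] - sum(L[i, 1:(i - 1)] * x[1:(i - 1)])) / L[i, i]
+   }
+   x
+ }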
Solving linear systems

Forward and backward substitution in R

Functions forwardsolve() and backsolve().


R> L <- matrix(c(1,2,0,1), 2, 2)
R> forwardsolve(L, c(1,4))
[1] 1 2
R> backsolve(t(L), c(4,1))
[1] 2 1


LU decomposition

LU decomposition

Let A be a square matrix such that A = LU, where L is a
lower triangular matrix and U is an upper triangular matrix.
A linear system Ax = b can be solved using forward and backward
substitution:
Let y = Ux. Hence Ly = b.
Solve for y by forward substitution
Solve for x by backward substitution
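A minimal sketch of the whole procedure in R, using Gaussian
elimination without pivoting (the helper lu_solve() is ours and
assumes no zero pivots are encountered; R's own solve() performs an
LU factorization with pivoting internally):

R> lu_solve <- function(A, b) {
+   n <- nrow(A)
+   L <- diag(n)
+   U <- A
+   for (k in 1:(n - 1)) {
+     for (i in (k + 1):n) {
+       L[i, k] <- U[i, k] / U[k, k]         # elimination multiplier
+       U[i, ] <- U[i, ] - L[i, k] * U[k, ]  # zero out below the pivot
+     }
+   }
+   y <- forwardsolve(L, b)  # solve Ly = b
+   backsolve(U, y)          # solve Ux = y
+ }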


LU decomposition

Gaussian elimination and LU decomposition

Consider Gaussian elimination written in matrix notation:

$$
M_1 \begin{pmatrix} \times & \times & \times \\ \times & \times & \times \\ \times & \times & \times \end{pmatrix}
= \begin{pmatrix} \times & \times & \times \\ 0 & * & * \\ 0 & * & * \end{pmatrix},
\qquad
M_2 \begin{pmatrix} \times & \times & \times \\ 0 & * & * \\ 0 & * & * \end{pmatrix}
= \begin{pmatrix} \times & \times & \times \\ 0 & * & * \\ 0 & 0 & + \end{pmatrix}.
$$

The LU decomposition then corresponds to

$$M_2 M_1 A = U \;\Rightarrow\; A = (M_2 M_1)^{-1} U = LU.$$


LU decomposition

LU decomposition in R

R> A <- matrix(runif(9), 3, 3)
R> b <- c(2,1,3)
Find x in Ax = b by LU decomposition:
R> solve(A, b)
[1] 17.579906 -2.520983 -2.940866
Find A⁻¹ by LU decomposition:
R> solve(A)
[,1] [,2] [,3]
[1,] -1.138959 -6.554862 8.8042286
[2,] 2.662184 1.148671 -2.9980071
[3,] -1.335980 2.352416 -0.8737737

Cholesky decomposition

Cholesky decomposition
The Cholesky decomposition is the decomposition of a symmetric,
positive-definite matrix into the product of an upper triangular
matrix and its transpose:

$$A = U'U.$$

From the Cholesky decomposition it is possible to calculate the
inverse of a matrix in the following way:

$$A^{-1} = U^{-1}(U^{-1})',$$

which is more stable than using Gaussian elimination.


In R, the function chol() gives the Cholesky decomposition and
chol2inv() the inverse of a matrix using the Cholesky
decomposition.
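A brief sketch of these functions (the matrix below is an arbitrary
positive-definite example):

R> A <- matrix(c(4, 2, 2, 3), 2, 2)
R> U <- chol(A)       # upper triangular factor, A = t(U) %*% U
R> chol2inv(U)        # inverse of A computed from the factor
R> solve(A)           # the same inverse, for comparison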

QR decomposition

QR decomposition

Let A = QR, where Q is an orthogonal matrix and R is an
invertible upper triangular matrix.
Then the linear system Ax = b can be written

$$Ax = b \;\Rightarrow\; QRx = b \;\Rightarrow\; Rx = Q'b,$$

and x can be solved by backward substitution.
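A sketch of these steps in R, with A and b as defined earlier
(qr.solve() on the next slide wraps them into a single call):

R> QR <- qr(A)
R> Q <- qr.Q(QR)
R> R <- qr.R(QR)
R> backsolve(R, crossprod(Q, b))   # solves Rx = Q'b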


QR decomposition

QR decomposition in R

We have the system Ax = b. Solve for x:

R> qr.solve(A, b)
[1] 17.579906 -2.520983 -2.940866
Find the inverse A⁻¹ by QR decomposition:
R> qr.solve(A)
[,1] [,2] [,3]
[1,] -1.138959 -6.554862 8.8042286
[2,] 2.662184 1.148671 -2.9980071
[3,] -1.335980 2.352416 -0.8737737
qr.Q(qr(A)) and qr.R(qr(A)) retrieve the matrices Q and R
from the QR decomposition.

QR decomposition

Calculating eigenvalues of a matrix

We want to calculate the eigenvalues of a real matrix A. To do so
we can apply the QR algorithm. Let A_0 = A. For k = 1, 2, . . .
1 Compute the QR decomposition A_{k-1} = Q_k R_k
2 Let A_k = R_k Q_k
3 Continue until A_k converges to a triangular matrix, which
contains the eigenvalues of A on its diagonal


QR decomposition

Calculating eigenvalues of a matrix


R> qr.eigen <- function(A){
+ for(i in 1:100){
+ QR <- qr(A)
+ R <- qr.R(QR)
+ Q <- qr.Q(QR)
+ A <- R %*% Q
+ }
+ return(diag(A))
+ }
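# h3 is not defined on these slides; it is presumably the 3 x 3
# Hilbert matrix used earlier in the course (an assumption; the
# eigenvalues shown below match it):
R> h3 <- 1 / (outer(1:3, 1:3, "+") - 1)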
R> qr.eigen(h3)
[1] 1.40831893 0.12232707 0.00268734
R> eigen(h3)$values
[1] 1.40831893 0.12232707 0.00268734

Singular value decomposition

Singular value decomposition

The singular value decomposition (SVD) of an m × n matrix A has
the form

$$A = U\Sigma V',$$

where U is an m × m orthogonal matrix, V is an n × n orthogonal
matrix and Σ is an m × n diagonal matrix with

$$\sigma_{ii} \geq 0 \quad \text{and} \quad \sigma_{ij} = 0 \ \text{for} \ i \neq j.$$

singular values - the diagonal entries σ_ii of Σ
left singular vectors - the columns u_i of U
right singular vectors - the columns v_i of V


Singular value decomposition

Reduced form of the SVD

For an m × n matrix A, m > n, the reduced form of the SVD is

$$A = U\Sigma V' = \begin{pmatrix} U_1 & U_2 \end{pmatrix} \begin{pmatrix} \Sigma_1 \\ 0 \end{pmatrix} V' = U_1 \Sigma_1 V'.$$

The decomposition can also be expressed as

$$A = \sum_{\sigma_{ii} \neq 0} \sigma_{ii} u_i v_i'.$$
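In R, svd() returns the reduced form directly; a quick sketch
verifying the reconstruction (A is an arbitrary example matrix):

R> A <- matrix(rnorm(12), 4, 3)
R> s <- svd(A)   # s$u is 4 x 3, s$d has length 3, s$v is 3 x 3
R> all.equal(A, s$u %*% diag(s$d) %*% t(s$v))   # TRUE up to rounding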


Singular value decomposition

SVD and the spectral decomposition

The spectral decomposition says that a real, symmetric m × m
matrix A can be decomposed as

$$A = \lambda_1 P_1 P_1' + \cdots + \lambda_m P_m P_m',$$

where P_1, . . . , P_m are pairwise orthogonal unit vectors,

$$I = P_1 P_1' + \cdots + P_m P_m',$$

and λ_1, . . . , λ_m are the eigenvalues of A.
In matrix form we have

$$P'AP = \Lambda \iff A = P\Lambda P',$$

where P is the matrix with columns P_1, . . . , P_m and
Λ = diag(λ_1, . . . , λ_m).


Singular value decomposition

SVD and the spectral decomposition

If A is symmetric and all the eigenvalues are non-negative, then:
the SVD equals the spectral decomposition
the singular values are the eigenvalues
the left and right singular vectors are eigenvectors
For any matrix A,
The squares of the singular values, σ_ii², are the eigenvalues of
AA′ and A′A
The left singular vectors u_i are eigenvectors of AA′
The right singular vectors v_i are eigenvectors of A′A
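A quick numerical check of these relations (arbitrary example
matrix):

R> A <- matrix(rnorm(12), 4, 3)
R> svd(A)$d^2                    # squared singular values
R> eigen(crossprod(A))$values    # eigenvalues of A'A: the same values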


Singular value decomposition

Rank determination

In theory, the rank of A is the number of non-zero singular values.
In practice, the rank might not be well-determined in that
some singular values may be very small but non-zero.
For many purposes it’s better to regard any singular values
falling below a certain threshold as negligible in determining
the numerical rank.
The R package corpcor contains the function rank.condition()
which can determine the numerical rank of a matrix.
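A sketch of the idea (the tolerance here is an illustrative default
and not necessarily the one corpcor uses):

R> num_rank <- function(A, tol = max(dim(A)) * .Machine$double.eps) {
+   d <- svd(A)$d
+   sum(d > tol * max(d))   # count singular values above the threshold
+ }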


Statistical applications

Principal component analysis

(Adapted from Linear Statistical Inference and Its Applications,
Rao 1973)
Let x be a random p-vector. Let X denote the n × p matrix of
observations of x. In some cases, it is desired to reduce the
dimensions of the data matrix for interpretative purposes. One
dimension reduction technique is principal component analysis
(PCA).
In PCA, the object is to reduce the dimensionality by finding the
directions in the data which have the most variation. This is
accomplished by finding the eigenvectors corresponding to the
largest eigenvalues of the covariance (or correlation) matrix of
the data and then projecting the original data onto these vectors.


Statistical applications

Principal component analysis

Let X_c be the centered X matrix, i.e. each column has had its mean
subtracted. The principal components Y are defined as

$$Y = X_c W,$$

where W is an orthogonal matrix of eigenvectors of X_c′X_c, ordered
according to the size of the corresponding eigenvalue (starting
from the largest). The matrix W is called the loading matrix.


Statistical applications

Principal component analysis


Note that the covariance matrix of X_c is

$$\Sigma = \frac{1}{n-1} X_c' X_c.$$

We can write

$$\Sigma = \frac{1}{n-1} \left( \lambda_1 w_1 w_1' + \cdots + \lambda_p w_p w_p' \right),$$

where w_k, k ∈ {1, . . . , p}, are mutually orthogonal eigenvectors
corresponding to the eigenvalues λ_k of X_c′X_c. Hence we may
conclude that the components y_i = X_c w_i are uncorrelated with
each other, since

$$\mathrm{Cov}(y_i, y_j) = w_i' \Sigma w_j = 0,$$

for i, j ∈ {1, . . . , p}, i ≠ j. Also note that

$$\mathrm{Var}(y_k) = w_k' \Sigma w_k = \lambda_k / (n-1).$$

Statistical applications

Principal component analysis


Thus, one way of obtaining the matrix W is to calculate the
eigenvectors of the matrix X_c′X_c. However, this is rather
inefficient computationally.
Instead, the singular value decomposition can be used: for p < n,
we can write the matrix X_c as

$$X_c = UDW',$$

where U is an n × n orthogonal matrix, W is a p × p orthogonal
matrix and D is an n × p diagonal matrix whose diagonal entries are
called the singular values of X_c. Now, remember that the squares
of the singular values of X_c are the eigenvalues of X_c′X_c and
that the vectors in W are the eigenvectors of X_c′X_c. Hence we can
retrieve the loading matrix and the covariance matrix of the
principal components from the SVD.
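A sketch of PCA via the SVD in R (X is assumed to be a numeric
n × p data matrix; this is essentially what the built-in prcomp()
does, so the results should agree up to column signs):

R> Xc <- scale(X, scale = FALSE)   # center each column
R> s <- svd(Xc)
R> W <- s$v                        # loading matrix
R> Y <- Xc %*% W                   # principal components (scores)
R> # compare with prcomp(X)$rotation and prcomp(X)$x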

Statistical applications

Factor analysis

A technique which is similar to PCA is factor analysis, where the
object is to extract a number of factors F_i from the data. The
model is

$$X = AF + G,$$

where F are the common factors, G are the unique factors and A is
a matrix of unknown constants. Let Var(F) = I_m and
Var(G) = diag(δ_1, . . . , δ_p) = ∆. Then Cov(X) = AA′ + ∆.
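In R, the common factor model can be fitted by maximum likelihood
with factanal(); a brief sketch (X is assumed to be a numeric data
matrix with enough variables for two factors):

R> fit <- factanal(X, factors = 2)
R> fit$loadings       # estimated loading matrix A
R> fit$uniquenesses   # estimated diagonal of Delta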


Vector norms
The vector norm is a measure of the size or magnitude of a vector.
For an integer p > 0 and an n-vector x, the vector p-norm is

$$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}.$$

Important special cases are:
1-norm: $\|x\|_1 = \sum_{i=1}^{n} |x_i|$
2-norm: $\|x\|_2 = \left( \sum_{i=1}^{n} |x_i|^2 \right)^{1/2}$, the Euclidean norm
∞-norm: $\|x\|_\infty = \max_{1 \leq i \leq n} |x_i|$
Two properties of the vector 2-norm:
$\left\| \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \right\|_2^2 = \|x_1\|_2^2 + \|x_2\|_2^2$ for a vector partitioned into subvectors x_1 and x_2
$\|Qx\|_2 = \|x\|_2$ for Q orthogonal, since $\|Qx\|_2^2 = x'Q'Qx = \|x\|_2^2$
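These norms are easily computed in R; a small sketch:

R> x <- c(3, -4)
R> sum(abs(x))      # 1-norm: 7
R> sqrt(sum(x^2))   # 2-norm: 5
R> max(abs(x))      # infinity-norm: 4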

Linear least squares

A typical linear least squares problem can be characterized as

$$Ax \approx b,$$

where A is an m × n matrix with m > n, x is an n-vector and b is an
m-vector.
For such over-determined systems, there is usually no exact
solution. The closest match possible in the 2-norm can however be
found. This is the linear least squares problem, formulated as

$$\min_x \|b - Ax\|_2 = \|r\|_2,$$

where r = b − Ax is the residual vector.


Existence and uniqueness

The solution to a least squares problem always exists.
The solution is unique iff A has full column rank, i.e. iff
rank(A) = n.


Normal equations

A least squares problem can be treated using methods from calculus.
The object is to minimize the norm of the residual vector
r = b − Ax. We define the function

$$\varphi(x) = \|r\|_2^2 = (b - Ax)'(b - Ax) = b'b - 2x'A'b + x'A'Ax.$$

A necessary condition for a minimum is that the gradient equals
zero,

$$\nabla \varphi(x) = 2A'Ax - 2A'b = 0,$$

so that the solution satisfies the linear system A′Ax = A′b (the
normal equations).
A sufficient condition is that A′A is positive definite, which is
equivalent to rank(A) = n.
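The normal equations can be solved directly in R; a minimal sketch
(simple, but less numerically stable than the QR approach on the
next slide):

R> x <- solve(crossprod(A), crossprod(A, b))   # solves A'A x = A'b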


Solving least squares by QR decomposition

We have from the QR decomposition that the least squares problem
can be rewritten as

$$\|b - Ax\|_2^2 = \|Q_1'b - R_1 x\|_2^2 + \|Q_2'b\|_2^2,$$

where Q = (Q_1 Q_2) and A = Q_1 R_1 is the reduced QR
decomposition. The minimum is then attained when R_1 x = Q_1′b, and
x can be found by back substitution.
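A sketch in R using the thin QR factors (this is the approach lm()
uses internally):

R> QR <- qr(A)                                   # A is m x n, m > n
R> backsolve(qr.R(QR), crossprod(qr.Q(QR), b))   # solves R1 x = Q1'b
R> # equivalently: qr.coef(QR, b)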


Solving least squares by SVD

The reduced form of the SVD is

$$A = \begin{pmatrix} U_1 & U_2 \end{pmatrix} \begin{pmatrix} \Sigma_1 \\ 0 \end{pmatrix} V' = U_1 \Sigma_1 V'.$$

Hence, the solution of Ax ≈ b is

$$x = V \Sigma_1^{-1} U_1' b = \sum_{\sigma_{ii} \neq 0} \frac{u_i' b}{\sigma_{ii}} v_i.$$

The SVD is especially useful for ill-conditioned or nearly
rank-deficient problems, since the very small singular values can
be dropped from the summation. This makes the solution much less
sensitive to small changes in the data.
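A sketch of a truncated-SVD least squares solver in R (the helper
name and tolerance are illustrative choices):

R> svd_ls <- function(A, b, tol = 1e-10) {
+   s <- svd(A)
+   keep <- s$d > tol * max(s$d)   # drop negligible singular values
+   u <- s$u[, keep, drop = FALSE]
+   s$v[, keep, drop = FALSE] %*% (crossprod(u, b) / s$d[keep])
+ }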

Applying functions to matrices and arrays

It is sometimes convenient and more efficient to avoid loops and
instead use the function apply() to conduct calculations on the
entries of a matrix. For example, suppose you want to calculate
the mean of each of the columns in a matrix.
R> A <- matrix(c(rnorm(10), rnorm(10, 2), rnorm(10, 5)),
+ ncol=3)
R> apply(A, 2, mean)
[1] 0.01823443 2.19343645 4.80973298
The apply() function works for any function you specify. For a
matrix, the second argument denotes whether the function should
operate on the rows (1) or the columns (2).


Additional built-in functions


colSums(), rowSums() - calculate the column sums and row sums
of a given matrix
cbind(), rbind() - combine two matrices by their columns or
rows
crossprod(x, y) - calculates t(x) %*% y but is much faster
than transposing and multiplying the matrices/vectors explicitly
lower.tri(), upper.tri() - return logical matrices indicating
the lower or upper triangular part of a matrix
scale(x, center = TRUE, scale = TRUE)
scale(x, scale=FALSE) - returns the centered matrix of x
(the mean of each column is subtracted from that column)
scale(x, scale=TRUE) - returns the centered and scaled
matrix of x (each column is mean-subtracted and divided by
its standard deviation)
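A small sketch of a few of these functions (arbitrary data):

R> A <- matrix(rnorm(6), 2, 3)
R> all.equal(crossprod(A), t(A) %*% A)   # TRUE
R> colMeans(scale(A, scale = FALSE))     # effectively zero: columns centered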