Instructor: Dr. Gleb V. Tcheslavski
Contact: gleb@ee.lamar.edu
Office Hours: Room 2030
Class web site: http://www.ee.lamar.edu/gleb/adsp/Index.htm
The problem of estimating (or extracting) one signal from another arises quite often. In many applications, the desired signal (speech, radar signal, EEG, image, etc.) is not available or observed directly. Instead, the desired signal may be noisy or distorted. In some simple situations, it may be possible to design a classical filter (LPF, HPF, BPF) to resolve the desired signal from the data. However, these filters are rarely optimum in the sense of producing the best estimate of the signal. Therefore, optimum digital filters, including Wiener and Kalman filters, are of interest.
The discrete Wiener filter is designed to recover the desired signal d_n from noisy observations
x_n = d_n + v_n    (6.2.1)
Assuming that both d_n and x_n are wss random processes, Wiener considered the problem of designing the filter W(z) that produces the minimum mean-square (MMS) error estimate of d_n.
7/20/2008
Therefore, the criterion of interest is the mean-square error
ξ = E{|e_n|²}    (6.3.1)
where e_n = d_n − d̂_n is the estimation error.
Thus, the problem is to find a filter that minimizes ξ. We begin by considering the general problem of Wiener filtering, where an LTI filter W(z) minimizing (6.3.1) needs to be designed. Depending upon the relationship between x_n and d_n, a number of different problems may be solved with Wiener filters. Some of them are:
1. Filtering: given x_n = d_n + v_n, estimate d_n with a causal filter, i.e., from the current and past values of x_n;
2. Smoothing: the same as filtering except that the filter may be noncausal;
3. Prediction: if d_n = x_{n+1} and W(z) is a causal filter, the Wiener filter becomes a linear predictor. The filter produces a prediction (estimate) of the future value of the signal as a linear combination of its previous values;
4. Deconvolution: when x_n = d_n ∗ g_n + v_n, with g_n being the unit-pulse response of an LTI filter, the Wiener filter becomes a deconvolution filter.
We need to design an FIR Wiener filter producing the MMS error estimate of a desired process d_n by filtering a set of observations of a statistically related process x_n.
Assuming that x_n and d_n are jointly wss with known autocorrelations r_x(k) and r_d(k) and known cross-correlation r_dx(k); denoting the unit-pulse response of the Wiener filter by w_n; and assuming a filter of order p − 1, the filter transfer function is
W(z) = Σ_{n=0}^{p−1} w_n z^{-n}    (6.4.1)
Therefore, for the input x_n, the filter output is
d̂_n = Σ_{l=0}^{p−1} w_l x_{n−l}    (6.4.2)
The mean-square error (which does not depend on n) to be minimized is
ξ = E{|e_n|²} = E{|d_n − d̂_n|²}    (6.4.3)
Setting the derivative of ξ with respect to the complex conjugate w_k* to zero for each k:
∂ξ/∂w_k* = (∂/∂w_k*) E{e_n e_n*} = E{e_n ∂e_n*/∂w_k*} = 0,  k = 0, 1, ..., p − 1    (6.5.1)
With
e_n = d_n − Σ_{l=0}^{p−1} w_l x_{n−l}    (6.5.2)
it follows that
∂e_n*/∂w_k* = −x*_{n−k}    (6.5.3)
and, therefore:
E{e_n x*_{n−k}} = 0,  k = 0, 1, ..., p − 1    (6.5.4)
Substituting (6.5.2) into (6.5.4) and using
E{x_{n−l} x*_{n−k}} = r_x(k − l)    (6.6.2)
the orthogonality condition becomes the Wiener-Hopf equations
Σ_{l=0}^{p−1} w_l r_x(k − l) = r_dx(k),  k = 0, 1, ..., p − 1    (6.6.4)
In matrix form, the Wiener-Hopf equations read
R_x w = r_dx    (6.7.3)
The minimum error is found by substituting the optimal coefficients back into the error expression:
ξ = E{|e_n|²} = E{e_n [d_n − Σ_{l=0}^{p−1} w_l x_{n−l}]*} = E{e_n d_n*} − Σ_{l=0}^{p−1} w_l* E{e_n x*_{n−l}}    (6.8.1)
and since w_k is the solution to the Wiener-Hopf equations, E{e_n x*_{n−l}} = 0 and the second term vanishes. Therefore:
ξ_min = E{e_n d_n*} = E{(d_n − Σ_{l=0}^{p−1} w_l x_{n−l}) d_n*}    (6.8.3)
and, evaluating the expected values,
ξ_min = r_d(0) − Σ_{l=0}^{p−1} w_l r*_dx(l)    (6.8.4)
Finally, the optimal filter coefficients follow from (6.7.3):
w = R_x^{-1} r_dx    (6.9.2)
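This design procedure is easy to sketch numerically. A minimal sketch (assuming NumPy; the function name `fir_wiener` and the test values α = 0.8, σ_v² = 1, which reproduce the noise-reduction example later in these notes, are illustrative):

```python
import numpy as np

def fir_wiener(rx, rdx):
    """Solve the Wiener-Hopf equations R_x w = r_dx for a p-tap FIR filter.

    rx  : autocorrelation sequence rx(0)..rx(p-1) of the (real) input
    rdx : cross-correlation rdx(0)..rdx(p-1) between desired signal and input
    """
    p = len(rx)
    # Toeplitz autocorrelation matrix, [R_x]_{k,l} = rx(|k - l|)
    Rx = np.array([[rx[abs(k - l)] for l in range(p)] for k in range(p)])
    w = np.linalg.solve(Rx, rdx)      # w = Rx^{-1} rdx, eq. (6.9.2)
    mmse = rdx[0] - w @ rdx           # eq. (6.8.4) for real signals
    return w, mmse

# First-order (two-tap) filter for rd(k) = 0.8^|k| observed in unit-variance
# white noise: rx(k) = rd(k) + delta_k, rdx(k) = rd(k)
w, mmse = fir_wiener(np.array([2.0, 0.8]), np.array([1.0, 0.8]))
```

With these inputs the solver gives w ≈ [0.4048, 0.2381] and ξ_min ≈ 0.4048, matching the closed-form results derived in the example below.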
Assume that the noise is a zero-mean process uncorrelated with d_n. Then
E{d_n v*_{n−k}} = 0    (6.10.2)
Since
r_x(k) = E{x_{n+k} x_n*} = E{[d_{n+k} + v_{n+k}][d_n + v_n]*} = r_d(k) + r_v(k)    (6.10.4)
the Wiener-Hopf equations simplify.
Therefore, with Rd the autocorrelation matrix for dn, Rv the autocorrelation matrix
for vn, and rdx = rd = [rd(0),…rd(p – 1)]T, the Wiener-Hopf equations become
[ R d + R v ] w = rd (6.11.1)
Further simplifications are possible when more information about the statistics of the signal is available.
Example: let d_n be a first-order autoregressive (AR) process with the autocorrelation sequence
r_d(k) = α^|k|    (6.11.2)
where 0 < α < 1, and let the corrupting noise v_n be uncorrelated white noise with variance σ_v², so that
x_n = d_n + v_n    (6.11.3)
We need to design a first-order FIR Wiener filter to reduce the noise:
W(z) = w_0 + w_1 z^{-1}    (6.11.4)
The autocorrelation of the observations is then
r_x(k) = r_d(k) + r_v(k) = α^|k| + σ_v² δ_k    (6.12.3)
Solving the Wiener-Hopf equations yields
W(z) = [(1 + σ_v² − α²) + ασ_v² z^{-1}] / [(1 + σ_v²)² − α²]    (6.13.1)
For the particular case of α = 0.8 and σ_v² = 1, the Wiener filter becomes
W(z) = (1.36 + 0.8 z^{-1}) / 3.36 ≈ 0.4048 + 0.2381 z^{-1}
The power spectrum of the desired signal is
P_d(e^{jω}) = 0.36 / (1.64 − 1.6 cos ω)    (6.14.2)
This spectrum decreases with frequency, while the spectrum of the noise is constant; therefore, low-pass filtering should increase the SNR.
The minimum MS error of this filter is
ξ_min = E{|e_n|²} = r_d(0) − w_0 r*_dx(0) − w_1 r*_dx(1) = σ_v² (1 + σ_v² − α²) / [(1 + σ_v²)² − α²]    (6.15.1)
which evaluates to 0.4048 for α = 0.8 and σ_v² = 1. The noise power at the filter output is
E{|v'_n|²} = wᵀ R_v w = σ_v² (w_0² + w_1²) = 0.2206    (6.15.3)
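These numbers can also be checked by simulation. A sketch (assuming NumPy; the sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
N, alpha = 200_000, 0.8

# AR(1) desired signal with rd(k) = 0.8^|k| (driving-noise variance 1 - alpha^2)
d = np.zeros(N)
for n in range(1, N):
    d[n] = alpha * d[n - 1] + np.sqrt(1 - alpha**2) * rng.standard_normal()
x = d + rng.standard_normal(N)        # unit-variance white observation noise

# First-order Wiener filter coefficients from eq. (6.13.1)
w0, w1 = 1.36 / 3.36, 0.8 / 3.36
d_hat = w0 * x
d_hat[1:] += w1 * x[:-1]

mse = np.mean((d - d_hat) ** 2)       # empirical MS error, close to 0.4048
out_noise = w0**2 + w1**2             # output noise power, eq. (6.15.3)
```

The empirical MS error approaches the theoretical 0.4048 as N grows, and the output noise power w_0² + w_1² ≈ 0.2205 agrees with (6.15.3) up to rounding.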
With noise-free observations, linear prediction is the problem of finding the MMS
estimate (prediction) of xn+1 using a linear combination of the current and p-1
previous values of xn.
Therefore, an FIR linear predictor of order p − 1 has the form:
x̂_{n+1} = Σ_{k=0}^{p−1} w_k x_{n−k}    (6.16.1)
The linear predictor may be implemented by the Wiener filter by setting d_n = x_{n+1}. Since
r_dx(k) = E{d_n x*_{n−k}} = E{x_{n+1} x*_{n−k}} = r_x(k + 1)    (6.16.2)
ξ_min = r_d(0) − Σ_{k=0}^{p−1} w_k r*_x(k + 1)    (6.17.2)
FIR Wiener filter: Linear prediction (Example)
For the same AR(1) process with autocorrelation sequence
r_d(k) = α^|k|    (6.18.1)
the Wiener-Hopf equations for a first-order predictor are
[1 α; α 1] [w_0; w_1] = [α; α²]    (6.18.3)
The predictor coefficients are
[w_0; w_1] = (1/(1 − α²)) [1 −α; −α 1] [α; α²] = [α; 0]    (6.18.4)
Therefore, the predictor is
x̂_{n+1} = α x_n    (6.19.1)
In the limiting case α = 0 (a white process), nothing can be predicted and
x̂_{n+1} = 0    (6.19.3)
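The predictor design can be verified numerically. A small sketch (assuming NumPy; α = 0.8 is an arbitrary choice):

```python
import numpy as np

# One-step FIR prediction for a process with rd(k) = alpha^|k|,
# reproducing eqs. (6.18.3)-(6.18.4)
alpha = 0.8
Rx = np.array([[1.0, alpha], [alpha, 1.0]])
rdx = np.array([alpha, alpha**2])     # rdx(k) = rx(k + 1), eq. (6.16.2)
w = np.linalg.solve(Rx, rdx)          # -> [alpha, 0]: only x_n is used
mmse = 1.0 - w @ rdx                  # eq. (6.17.2): equals 1 - alpha^2
```

The solution [α, 0] confirms that the optimal first-order predictor ignores x_{n−1}, and the prediction error 1 − α² shrinks as the process becomes more correlated.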
For prediction from noisy observations y_n, the Wiener-Hopf equations become
R_y w = r_dy    (6.20.3)
If the noise v_n is uncorrelated with the signal x_n, then the autocorrelation sequence of y_n is
r_y(k) = E{y_n y*_{n−k}} = r_x(k) + r_v(k)    (6.21.1)
Therefore, the only difference between linear prediction with and without noise is in the autocorrelation matrix for the input signal: when the noise is uncorrelated with the signal, R_x is replaced with R_y = R_x + R_v.
For multistep prediction, the predictor estimates x_{n+α}:
x̂_{n+α} = Σ_{k=0}^{p−1} w_k x_{n−k}    (6.22.1)
where rx,α is the autocorrelation vector beginning with rx(α). The MMS error is
ξ_min = r_x(0) − Σ_{k=0}^{p−1} w_k r*_x(k + α) = r_x(0) − r^H_{x,α} w    (6.23.3)
The problem of noise cancellation is similar to the filtering problem since the goal is to recover a signal degraded by noise (the signal is assumed to be recorded by a primary sensor). However, unlike filtering, where the noise autocorrelation is known, here the noise parameters need to be estimated from a secondary sensor placed within the noise field. Although the noise measured by the secondary sensor is correlated with the noise coming from the primary sensor, the two processes are not equal.
Since v_{1,n} ≠ v_{2,n}, it is not possible to estimate d_n by simply subtracting v_{2,n} from x_n. Instead, the noise canceller contains a Wiener filter that forms an estimate v̂_{1,n} of the noise from the sequence received from the secondary sensor. This estimate is then subtracted from the primary signal to form an estimate
d̂_n = x_n − v̂_{1,n}    (6.25.1)
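A minimal simulation sketch of such a canceller (assuming NumPy; taking the secondary noise as white and using the arbitrary coupling filter [0.9, 0.4, −0.2] between the two sensors purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 20_000, 4
n = np.arange(N)
d = np.sin(0.1 * n + 0.3)                    # desired sinusoid
v2 = rng.standard_normal(N)                  # secondary-sensor noise (white)
v1 = np.convolve(v2, [0.9, 0.4, -0.2])[:N]   # correlated primary-sensor noise
x = d + v1                                   # primary-sensor observation

# Sample auto-/cross-correlations; since d and v2 are uncorrelated,
# x can replace v1 in the cross-correlation
r_v2 = np.array([np.mean(v2[k:] * v2[:N - k]) for k in range(p)])
r_xv2 = np.array([np.mean(x[k:] * v2[:N - k]) for k in range(p)])
R = np.array([[r_v2[abs(i - j)] for j in range(p)] for i in range(p)])
w = np.linalg.solve(R, r_xv2)                # Wiener filter for the noise

v1_hat = np.convolve(v2, w)[:N]              # estimate of primary noise
d_hat = x - v1_hat                           # eq. (6.25.1)
```

Because the correlations are only estimated from the data, the cancellation is imperfect, but the residual error is far smaller than the original noise power.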
ELEN 5301 Adv. DSP and Modeling Summer II 2008
The Wiener-Hopf equations for the canceller are
R_{v2} w = r_{v1v2}    (6.26.1)
where R_{v2} is the autocorrelation matrix of v_{2,n} and r_{v1v2} is the cross-correlation between the needed noise signal v_{1,n} and the Wiener filter input v_{2,n}. The cross-correlations are
r_{v1v2}(k) = E{v_{1,n} v*_{2,n−k}} = E{(x_n − d_n) v*_{2,n−k}} = E{x_n v*_{2,n−k}} − E{d_n v*_{2,n−k}}    (6.26.2)
Assuming that v_{2,n} and d_n are uncorrelated, the second term vanishes and
R_{v2} w = r_{xv2}    (6.26.4)
FIR Wiener filter: Noise cancellation (Example)
Assume that the desired signal is a sinusoid:
d_n = sin(nω_0 + φ)    (6.27.1)
observed by the primary sensor as
x_n = d_n + v_{1,n}    (6.27.3)
The sample (estimated) autocorrelation is
r̂_{v2}(k) = (1/N) Σ_{n=0}^{N−1} v_{2,n} v_{2,n−k}    (6.28.1)
[Figures: output of a 6th-order Wiener filter and output of a 12th-order Wiener filter.]
For a noncausal (unconstrained) IIR Wiener filter, the problem is to find the unit-sample response of the filter
H(z) = Σ_{n=−∞}^{∞} h_n z^{-n}    (6.31.1)
that produces the estimate d̂_n = Σ_{l=−∞}^{∞} h_l x_{n−l} minimizing the mean-square error
ξ = E{|e_n|²}    (6.31.2)
This problem can be solved similarly to the FIR Wiener filter problem: by equating the derivative of the mean-square error with respect to h_k* to zero for each k:
∂ξ/∂h_k* = −E{e_n x*_{n−k}} = 0,  −∞ < k < ∞    (6.32.1)
which is equivalent to
E{e_n x*_{n−k}} = 0,  −∞ < k < ∞    (6.32.2)
The last equation (6.32.2) is called the orthogonality principle; it is identical to the orthogonality principle for an FIR filter except that here the equality must hold for all k. Therefore:
Σ_{l=−∞}^{∞} h_l E{x_{n−l} x*_{n−k}} = E{d_n x*_{n−k}},  −∞ < k < ∞    (6.32.3)
We note that the expectation on the lhs is the autocorrelation and that on the rhs is the cross-correlation.
Σ_{l=−∞}^{∞} h_l r_x(k − l) = r_dx(k),  −∞ < k < ∞    (6.33.1)
which are the Wiener-Hopf equations of the noncausal IIR Wiener filter. We observe that the only difference compared to the FIR case is the summation limits and the range of values for k. We can also notice that
h_k ∗ r_x(k) = r_dx(k)    (6.33.2)
so that, in the frequency domain,
H(e^{jω}) = P_dx(e^{jω}) / P_x(e^{jω})    (6.33.4)
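For the noise-reduction example above (P_d(e^{jω}) = 0.36/(1.64 − 1.6 cos ω), P_v = 1, so that P_dx = P_d and P_x = P_d + P_v), this frequency-domain solution can be evaluated on a grid. A sketch (assuming NumPy; the grid size is arbitrary):

```python
import numpy as np

# Noncausal Wiener smoother H = Pd/(Pd + Pv) on a dense frequency grid
omega = np.linspace(-np.pi, np.pi, 200001)
Pd = 0.36 / (1.64 - 1.6 * np.cos(omega))
Pv = np.ones_like(omega)
H = Pd / (Pd + Pv)                    # eq. (6.33.4) with Pdx = Pd, Px = Pd + Pv

# Mean of Pd*Pv/(Pd+Pv) over the grid approximates (1/2pi) times its
# integral over one period, i.e. the smoothing MS error
xi_min = np.mean(Pd * Pv / (Pd + Pv))
```

At ω = 0 the filter gain is 0.9 (the signal dominates there), and the numerically integrated MS error comes out close to 0.3, smaller than both the FIR (0.4048) and causal IIR (0.375) errors found elsewhere in these notes, as expected for an unconstrained smoother.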
Using Parseval's theorem,
ξ_min = r_d(0) − (1/2π) ∫_{−π}^{π} H(e^{jω}) P*_dx(e^{jω}) dω    (6.34.3)
Since
r_d(0) = (1/2π) ∫_{−π}^{π} P_d(e^{jω}) dω    (6.35.1)
we have
ξ_min = (1/2π) ∫_{−π}^{π} [P_d(e^{jω}) − H(e^{jω}) P*_dx(e^{jω})] dω    (6.35.2)
Consider the observations
x_n = d_n + v_n    (6.36.1)
We need to find the auto- and cross-spectra. Assuming that d_n and v_n are uncorrelated zero-mean random processes, the autocorrelation is
r_x(k) = r_d(k) + r_v(k)    (6.36.2)
so that
P_x(e^{jω}) = P_d(e^{jω}) + P_v(e^{jω})    (6.36.3)
The cross-correlation is
r_dx(k) = E{d_n x*_{n−k}} = E{d_n d*_{n−k}} + E{d_n v*_{n−k}} = r_d(k)    (6.36.4)
so that
P_dx(e^{jω}) = P_d(e^{jω})    (6.37.2)
Finally, combining with the filter frequency response equation, the MS error of the noncausal IIR Wiener smoother is
ξ_min = (1/2π) ∫_{−π}^{π} [P_d(e^{jω}) P_v(e^{jω}) / (P_d(e^{jω}) + P_v(e^{jω}))] dω = (1/2π) ∫_{−π}^{π} P_v(e^{jω}) H(e^{jω}) dω    (6.38.1)
For the causal IIR Wiener filter, the Wiener-Hopf equations become
Σ_{l=0}^{∞} h_l r_x(k − l) = r_dx(k),  0 ≤ k < ∞    (6.39.2)
The important difference between this result and the one for the noncausal IIR filter is the summation limit. The restriction to non-negative k implies that the cross-correlation r_dx(k) can no longer be expressed as the convolution of h_k and r_x(k).
We start the filter design with the special case when the input to the filter is unit-variance white noise ε_n. Denoting the Wiener filter coefficients by g_n, the Wiener-Hopf equations are
Σ_{l=0}^{∞} g_l r_ε(k − l) = r_dε(k),  0 ≤ k < ∞    (6.40.1)
Since r_ε(k) = δ(k), the lhs reduces to g_k. Therefore, the causal Wiener filter for white noise is:
g_n = r_dε(n) u_n    (6.40.2)
G(z) = [P_dε(z)]_+    (6.40.3)
Here "+" indicates the "positive-time part" of the sequence whose z-transform is contained within the brackets.
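The [·]_+ operation simply keeps the n ≥ 0 samples of a two-sided sequence. A toy sketch (assuming NumPy; the sequence 0.5^|n| is illustrative only, not taken from a specific spectrum):

```python
import numpy as np

M = 3
n = np.arange(-M, M + 1)              # time indices -M..M
h = 0.5 ** np.abs(n)                  # a two-sided sequence
h_plus = np.where(n >= 0, h, 0.0)     # positive-time part, as in g_n = r_de(n) u_n
```

All anticausal (n < 0) samples are zeroed, while the causal samples are untouched.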
With the spectral factorization P_x(z) = σ_0² Q(z) Q*(1/z*), a whitening filter for x_n is
F(z) = 1 / (σ_0 Q(z))    (6.41.3)
P_ε(z) = P_x(z) F(z) F*(1/z*) = 1    (6.42.1)
Therefore, the output process ε_n is white noise, and F(z) is called a whitening filter. We notice that since Q(z) is minimum phase, F(z) is stable and causal and has a stable and causal inverse F^{-1}(z). As a result, x_n may be recovered from ε_n by filtering with the inverse filter F^{-1}(z). In other words, there is no loss of information in the linear transformation producing white noise from x_n.
Let H(z) be the causal Wiener filter with an input x_n having a rational spectrum and producing the MMS estimate of d_n. Suppose that the input is filtered with a cascade of three filters F(z), F^{-1}(z), and H(z), where F(z) is the causal whitening filter for x_n and F^{-1}(z) is its causal inverse.
The cascade G(z) = F^{-1}(z) H(z) is the causal IIR Wiener filter producing the MMS estimate of d_n from the white noise ε_n. The causality of G(z) follows from the fact that both F^{-1}(z) and H(z) are causal.
The cross-correlation between d_n and ε_n is
r_dε(k) = E{d_n ε*_{n−k}} = E{d_n [Σ_{l=−∞}^{∞} f_l x_{n−k−l}]*} = Σ_{l=−∞}^{∞} f_l* r_dx(k + l)    (6.43.1)
Therefore, the cross-power spectral density is
P_dε(z) = P_dx(z) F*(1/z*) = P_dx(z) / (σ_0 Q*(1/z*))    (6.43.2)
and
G(z) = (1/σ_0) [P_dx(z) / Q*(1/z*)]_+    (6.43.3)
Then, since H(z) = F(z) G(z),
H(z) = (1/(σ_0² Q(z))) [P_dx(z) / Q*(1/z*)]_+    (6.44.2)
In the case of real processes, h_n is real and the causal IIR Wiener filter takes the form:
H(z) = (1/(σ_0² Q(z))) [P_dx(z) / Q(1/z)]_+    (6.44.3)
Finally, the MS error for the causal IIR Wiener filter is
ξ_min = r_d(0) − Σ_{l=0}^{∞} h_l r*_dx(l)    (6.44.4)
In the frequency domain, the MS error of the causal IIR Wiener filter is
ξ_min = (1/2π) ∫_{−π}^{π} [P_d(e^{jω}) − H(e^{jω}) P*_dx(e^{jω})] dω    (6.45.1)
or, in the z-domain,
ξ_min = (1/2πj) ∮ [P_d(z) − H(z) P*_dx(1/z*)] z^{-1} dz    (6.45.2)
We observe that the expressions for the causal IIR Wiener filter error in the frequency and z-domains are exactly the same as the corresponding expressions for the noncausal IIR Wiener filter. In the time-domain description, the difference arises in the summation limits.
For comparison, the noncausal Wiener filter may be written as
H_nc(z) = (1/(σ_0² Q(z))) [P_dx(z) / Q*(1/z*)]    (6.46.2)
i.e., without taking the positive-time part. The noncausal Wiener filter can be implemented by the structure shown below. We note that the first filter is the causal whitening filter generating the white noise ε_n, while the second is a noncausal filter producing the MMS estimate of d_n.
A causal IIR Wiener filter is formed by taking the causal part of [Pdx(z)/Q*(1/z*)] as
shown below.
For the filtering problem with noise uncorrelated with the signal,
P_dx(z) = P_d(z)    (6.48.2)
and the causal Wiener filter is
H(z) = (1/(σ_0² Q(z))) [P_d(z) / Q*(1/z*)]_+    (6.48.3)
where
P_x(z) = P_d(z) + P_v(z) = σ_0² Q(z) Q*(1/z*)    (6.48.4)
However, to build the actual Wiener filter, expressions for the power spectral densities P_d(z) and P_v(z) are required.
Causal IIR Wiener filter: filtering (Example)
We need to estimate a signal d_n generated by
d_n = 0.8 d_{n−1} + w_n    (6.49.1)
from noisy observations x_n = d_n + v_n, where v_n is unit-variance white noise uncorrelated with d_n and w_n is white noise with variance σ_w² = 0.36. Therefore, r_d(k) = 0.8^|k|. To find the optimum causal IIR Wiener filter, we begin from the observations that
P_dx(z) = P_d(z)    (6.49.3)
P_x(z) = P_d(z) + P_v(z) = P_d(z) + 1    (6.49.4)
Therefore, with
P_d(z) = 0.36 / [(1 − 0.8z^{-1})(1 − 0.8z)]    (6.49.5)
the power spectrum of x_n is
P_x(z) = 1 + 0.36/[(1 − 0.8z^{-1})(1 − 0.8z)] = 1.6 (1 − 0.5z^{-1})(1 − 0.5z) / [(1 − 0.8z^{-1})(1 − 0.8z)]    (6.50.1)
with
σ_0² = 1.6,  Q(z) = (1 − 0.5z^{-1}) / (1 − 0.8z^{-1})    (6.50.3)
Since the causal IIR Wiener filter is
H(z) = (1/(σ_0² Q(z))) [P_dx(z) / Q(z^{-1})]_+    (6.50.4)
we can express
P_dx(z)/Q(z^{-1}) = 0.36 (1 − 0.8z) / [(1 − 0.8z^{-1})(1 − 0.8z)(1 − 0.5z)]
= 0.36 z^{-1} / [(1 − 0.8z^{-1})(z^{-1} − 0.5)] = 0.6/(1 − 0.8z^{-1}) + 0.3/(z^{-1} − 0.5)    (6.51.1)
Therefore, the positive-time part is
[P_dx(z)/Q(z^{-1})]_+ = 0.6/(1 − 0.8z^{-1})    (6.51.2)
so that H(z) = 0.375/(1 − 0.5z^{-1}), or
h_n = 0.375 (1/2)^n u_n    (6.51.4)
Since
D̂(z) = H(z) X(z)    (6.52.1)
the estimate of d_n may be computed recursively as
d̂_n = 0.5 d̂_{n−1} + 0.375 x_n
The minimum MS error is
ξ_min = E{|d_n − d̂_n|²} = r_d(0) − Σ_{l=0}^{∞} h_l r_dx(l) = 1 − (3/8) Σ_{l=0}^{∞} (1/2)^l (0.8)^l = 0.3750    (6.52.3)
For comparison, the MS error of the 2nd-order FIR Wiener filter was 0.4048. We conclude that using all previous observations of x_n only slightly improves the performance of the Wiener filter.
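Since the example filter is a one-pole recursion, it is easy to run and check. A sketch (assuming NumPy; the function name is illustrative):

```python
import numpy as np

def causal_wiener_example(x):
    """d_hat[n] = 0.5*d_hat[n-1] + 0.375*x[n], i.e. H(z) = 0.375/(1 - 0.5 z^-1)."""
    d_hat = np.zeros(len(x))
    for n in range(len(x)):
        prev = d_hat[n - 1] if n > 0 else 0.0
        d_hat[n] = 0.5 * prev + 0.375 * x[n]
    return d_hat

# Impulse response should be h_n = 0.375 * (1/2)^n, eq. (6.51.4)
h = causal_wiener_example(np.array([1.0, 0.0, 0.0, 0.0]))

# Theoretical MMSE, eq. (6.52.3): 1 - (3/8) * sum_l (0.4)^l
xi_min = 1 - 0.375 / (1 - 0.4)
```

The impulse response matches the closed-form h_n, and the geometric sum confirms ξ_min = 0.375.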
For another comparison, we compute the noncausal Wiener filter:
H(z) = P_dx(z)/P_x(z) = P_d(z)/P_x(z) = 0.36 / [1.6 (1 − 0.5z^{-1})(1 − 0.5z)]    (6.53.1)
An interesting observation from the result for this particular noncausal IIR filter is that the recursive estimator can be rewritten in terms of the prediction error
α_n = x_n − x̂_n    (6.54.2)
This error is called the innovations process and represents the "new information", i.e., the part of x_n that cannot be predicted. Therefore, the estimate of d_n is corrected by the new information. This approach is related to Kalman filtering.
Causal IIR Wiener filter: linear prediction
We need to derive an optimum linear predictor of the form
x̂_{n+1} = Σ_{k=0}^{∞} h_k x_{n−k}    (6.55.1)
that produces the best estimate of x_{n+1} based on x_k for all k ≤ n. Since an infinite number of past signal values is used, we expect a better prediction than an FIR predictor produces.
For the linear prediction problem,
d_n = x_{n+1}    (6.55.2)
Therefore:
P_dx(z) = z P_x(z)    (6.55.4)
The Wiener predictor is then
H(z) = (1/(σ_0² Q(z))) [z P_x(z) / Q*(1/z*)]_+    (6.56.1)
However, since
P_x(z) = σ_0² Q(z) Q*(1/z*)    (6.56.2)
the bracketed term reduces to σ_0² [z Q(z)]_+. With
Q(z) = 1 + q_1 z^{-1} + q_2 z^{-2} + ...    (6.56.4)
we observe that the positive-time part of zQ(z) is
[zQ(z)]_+ = [z + q_1 + q_2 z^{-1} + q_3 z^{-2} + ...]_+ = q_1 + q_2 z^{-1} + q_3 z^{-2} + ...
so that
H(z) = (1/Q(z)) z[Q(z) − 1] = z [1 − 1/Q(z)]    (6.57.2)
The MMS error is
ξ_min = (1/2πj) ∮ [P_d(z) − H(z) P*_dx(1/z*)] z^{-1} dz    (6.57.3)
Since
P_d(z) = P_x(z) and P_dx(z) = z P_x(z)    (6.58.1)
the error becomes
ξ_min = (1/2πj) ∮_C [P_x(z) − z^{-1} H(z) P*_x(1/z*)] z^{-1} dz    (6.58.2)
Also, since P_x(z) is a power spectrum,
P_x(z) = P*_x(1/z*)    (6.58.3)
Substituting the transfer function of the causal IIR Wiener predictor leads to
ξ_min = (1/2πj) ∮_C P_x(z) [1 − (1 − 1/Q(z))] z^{-1} dz = (1/2πj) ∮_C [P_x(z)/Q(z)] z^{-1} dz
= (1/2πj) ∮_C σ_0² Q*(1/z*) z^{-1} dz = σ_0² q_0    (6.59.1)
Since q_0 = 1:
ξ_min = σ_0²    (6.59.2)
The spectral factorization suggests that for a wss random process, whose power spectrum is a real-valued, positive, and periodic function of frequency, the following factorization holds:
P_x(z) = σ_0² Q(z) Q*(1/z*)    (6.60.1)
where
σ_0² = exp{ (1/2π) ∫_{−π}^{π} ln P_x(e^{jω}) dω }    (6.60.2)
Therefore:
ξ_min = exp{ (1/2π) ∫_{−π}^{π} ln P_x(e^{jω}) dω }    (6.60.3)
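This formula can be checked numerically for the AR(1) example: with P_x(e^{jω}) = 0.36/|1 − 0.8e^{−jω}|², the one-step prediction error should equal the driving-noise variance 0.36. A sketch (assuming NumPy; the grid size is arbitrary):

```python
import numpy as np

omega = np.linspace(-np.pi, np.pi, 400001)
Px = 0.36 / np.abs(1 - 0.8 * np.exp(-1j * omega)) ** 2

# (1/2pi) * integral of ln Px over one period, approximated by a grid mean
xi_min = np.exp(np.mean(np.log(Px)))  # eq. (6.60.3)
```

The grid mean of ln P_x recovers ln 0.36 because the log of the minimum-phase factor integrates to zero over one period, so ξ_min ≈ 0.36.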
Causal IIR Wiener filter: linear prediction of an AR process
Assuming that x_n is an autoregressive AR(p) process with power spectrum
P_x(z) = σ_0² / [A(z) A*(1/z*)]    (6.61.1)
where
A(z) = 1 + Σ_{k=1}^{p} a_k z^{-k}    (6.61.2)
is a minimum-phase polynomial having all its roots inside the unit circle, we have Q(z) = 1/A(z), and the optimum linear predictor is
H(z) = z [1 − A(z)] = −a_1 − a_2 z^{-1} − ... − a_p z^{-(p−1)}
which happens to be an FIR filter! Therefore, only the last p values out of an infinite number of past signal samples are used to predict x_{n+1}.
A random autoregressive AR(p) process satisfies a difference equation of the form:
x_n = −a_1 x_{n−1} − a_2 x_{n−2} − ... − a_p x_{n−p} + w_n    (6.62.1)
where w_n is white noise. Since w_{n+1} cannot be predicted from x_n or its previous values, the best we can do when predicting x_{n+1} is to use the AR model and ignore the noise term.
Causal IIR Wiener filter: linear prediction of an AR process (Example)
Consider a real-valued AR(2) process
x_n = 0.9 x_{n−1} − 0.2 x_{n−2} + w_n    (6.63.1)
The optimum linear predictor is
x̂_{n+1} = 0.9 x_n − 0.2 x_{n−1}
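The predictor, and the effect of estimating its coefficients from data (discussed next), can be sketched as follows (assuming NumPy; the sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50_000
x = np.zeros(N)
for n in range(2, N):
    x[n] = 0.9 * x[n - 1] - 0.2 * x[n - 2] + rng.standard_normal()

# Least-squares fit of x[n] on (x[n-1], x[n-2]): with enough data the
# fitted coefficients approach the true predictor values 0.9 and -0.2
X = np.column_stack([x[1:-1], x[:-2]])
coef, *_ = np.linalg.lstsq(X, x[2:], rcond=None)
```

The fitted coefficients are close to, but not exactly, the true values; this is precisely the situation described below, where the AR parameters must be estimated from the data.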
A specific realization of x_n (solid line) and its optimal prediction (dotted line) are shown below.
However, in practice, the statistics of x_n are never known. Therefore, a more realistic approach is to first estimate the AR parameters from the given data, starting from the sample autocorrelation
r̂_x(k) = (1/N) Σ_{n=0}^{N−1} x_n x_{n−k}    (6.65.1)
We observe that the estimated AR parameters are not equal to the true ones; therefore, the resulting predictor differs slightly from the optimal one. Next, instead of using the predictor on the data that were used to estimate the AR parameters, we apply the predictor to the next 200 data values. The result is shown below:
The deconvolution problem is concerned with the recovery of a signal d_n that has been convolved with a filter g_n that may not be precisely known:
x_n = d_n ∗ g_n    (6.67.1)
Ideally, d_n could be recovered with an inverse filter g_n^{-1} satisfying
g_n ∗ g_n^{-1} = δ_n    (6.67.2)
Another problem is that the frequency response G(e^{jω}) may be zero (or very small) at some frequencies. Therefore, G(e^{jω}) may be either noninvertible or ill-conditioned. In addition, noise may be introduced in the measurement process; therefore, a more accurate model for the observed process is
x_n = d_n ∗ g_n + w_n    (6.68.1)
where w_n is additive noise that is often assumed to be uncorrelated with d_n. In this situation, even if the inverse filter exists and is well-behaved, applying it to x_n gives the restored signal
D̂(e^{jω}) = D(e^{jω}) + W(e^{jω})/G(e^{jω}) = D(e^{jω}) + V(e^{jω})    (6.68.2)
where V(e^{jω}) = W(e^{jω})/G(e^{jω}) is the filtered noise.
Assuming that d_n and w_n are uncorrelated, the power spectrum of the observed process is
P_x(e^{jω}) = P_d(e^{jω}) |G(e^{jω})|² + P_w(e^{jω})    (6.69.3)
Also, the cross-psd is
P_dx(e^{jω}) = P_d(e^{jω}) G*(e^{jω})    (6.70.1)
Moreover, assuming that G(e^{jω}) is non-zero for all ω and that its inverse exists, the Wiener filter becomes
H(e^{jω}) = [1/G(e^{jω})] · P_d(e^{jω}) / [P_d(e^{jω}) + P_w(e^{jω}) / |G(e^{jω})|²]    (6.70.3)
Since the power spectrum of the filtered noise is
P_v(e^{jω}) = P_w(e^{jω}) / |G(e^{jω})|²    (6.70.4)
we can write H(e^{jω}) = [1/G(e^{jω})] · P_d(e^{jω})/[P_d(e^{jω}) + P_v(e^{jω})], where the second factor is the noncausal IIR Wiener smoothing filter for estimating d_n from
y_n = d_n + v_n    (6.71.2)
The primary limitation of Wiener filters is that both the signal and noise processes must be jointly wss. Most practical signals are nonstationary, which limits the applications of Wiener filters.
Recall that for recovering an AR(1) process of the form
x_n = a_1 x_{n−1} + w_n    (6.72.1)
from noisy observations y_n, where v_n and w_n are uncorrelated white noise processes, the optimum linear estimate of x_n using all of the measurements y_k, k ≤ n, could be computed with a recursion:
x̂_n = a_1 x̂_{n−1} + K [y_n − a_1 x̂_{n−1}]    (6.72.3)
where the gain K is chosen to minimize the mean-square error
E{|x_n − x̂_n|²}    (6.72.4)
However, the approach mentioned above again relies on a wss assumption. For a time-varying situation, the optimum estimate may be found for the observation model
y_n = x_n + v_n    (6.73.3)
x_n = ⎡ a_1  a_2  ⋯  a_{p−1}  a_p ⎤ x_{n−1} + ⎡ 1 ⎤ w_n    (6.74.2)
      ⎢  1    0   ⋯    0       0  ⎥           ⎢ 0 ⎥
      ⎢  0    1   ⋯    0       0  ⎥           ⎢ 0 ⎥
      ⎢  ⋮    ⋮         ⋮       ⋮  ⎥           ⎢ ⋮ ⎥
      ⎣  0    0   ⋯    1       0  ⎦           ⎣ 0 ⎦
and the observation equation is
y_n = [1 0 ⋯ 0] x_n + v_n    (6.75.1)
or
y_n = cᵀ x_n + v_n    (6.75.3)
where the state noise vector is
w_n = [w_n 0 ⋯ 0]ᵀ    (6.75.4)
The equation (6.75.2) is applicable to stationary AR(p) processes only but can easily be generalized to nonstationary processes as follows:
x_n = A_{n−1} x_{n−1} + w_n    (6.76.1)
where the noise vector satisfies
E{w_n w_k^H} = Q_w(n) for k = n, and 0 for k ≠ n    (6.76.2)
In addition, let y_n be a vector of observations of length q formed as
y_n = C_n x_n + v_n    (6.76.3)
where, similarly,
E{v_n v_k^H} = Q_v(n) for k = n, and 0 for k ≠ n    (6.77.1)
The optimum linear estimate for the time-varying case can again be computed recursively; with the appropriate Kalman gain matrix K_n, this recursion corresponds to the discrete Kalman filter.
Assuming that A_n, C_n, Q_v(n), and Q_w(n) are known, denoting by x̂_{n|n} the best linear estimate of x_n at time n given the observations y_i for i = 1, 2, ..., n, and by x̂_{n|n−1} the best linear estimate of x_n at time n given the observations y_i for i = 1, 2, ..., n−1, the corresponding state estimation errors are
e_{n|n} = x_n − x̂_{n|n}
e_{n|n−1} = x_n − x̂_{n|n−1}    (6.77.3)
The MS error of the first estimate is
ξ_1 = E{‖e_{1|1}‖²} = tr{P_{1|1}} = Σ_{i=0}^{p−1} E{|e_{i,1|1}|²}    (6.78.2)
Once such an estimate is found and the error covariance P_{1|1} is evaluated, the estimation is repeated for the next observation y_2.
Since w_n is zero-mean white noise and its values are unknown, we may predict x_n as follows:
x̂_{n|n−1} = A_{n−1} x̂_{n−1|n−1}    (6.79.2)
Since the estimation error e_{n−1|n−1} is uncorrelated with the white noise w_n, the error covariance propagates as
P_{n|n−1} = A_{n−1} P_{n−1|n−1} A_{n−1}^H + Q_w(n)
where Q_w(n) is the covariance matrix of the noise process w_n. This completes the first step of the Kalman filter.
In the second step, we incorporate the new measurement y_n into the estimate. A new linear estimate is formed as
x̂_{n|n} = K'_n x̂_{n|n−1} + K_n y_n
where K_n and K'_n are matrices that need to be specified. The error can be found as
e_{n|n} = x_n − K'_n x̂_{n|n−1} − K_n y_n = x_n − K'_n [x_n − e_{n|n−1}] − K_n [C_n x_n + v_n]
= [I − K'_n − K_n C_n] x_n + K'_n e_{n|n−1} − K_n v_n    (6.81.2)
Since
E{v_n} = 0 and E{e_{n|n−1}} = 0    (6.81.3)
an unbiased estimate requires K'_n = I − K_n C_n,
or
x̂_{n|n} = x̂_{n|n−1} + K_n [y_n − C_n x̂_{n|n−1}]    (6.82.2)
Next, we need to find the value of the Kalman gain K_n that minimizes the MS error
ξ_n = tr{P_{n|n}}    (6.83.1)
where the trace function is
tr(A) = Σ_i a_ii    (6.83.2)
Therefore, we need to differentiate ξ_n with respect to K_n, set the derivative to zero, and solve for K_n. Using the matrix differentiation formulas
(d/dK) tr(KA) = A^H    (6.83.3)
and
(d/dK) tr(KAK^H) = 2KA    (6.83.4)
we obtain
(d/dK) tr(P_{n|n}) = −2 [I − K_n C_n] P_{n|n−1} C_n^H + 2 K_n Q_v(n) = 0    (6.84.1)
Solving for K_n gives the expression for the Kalman gain
K_n = P_{n|n−1} C_n^H [C_n P_{n|n−1} C_n^H + Q_v(n)]^{-1}    (6.84.2)
The recursion is initialized with
P_{0|0} = E{x_0 x_0^H}    (6.84.5)
In summary, the Kalman filter alternates the prediction step (6.79.2) with the measurement update (6.82.2), using the gain (6.84.2).
Example: consider estimating the AR(1) process
x_n = 0.8 x_{n−1} + w_n    (6.86.1)
from the observations
y_n = x_n + v_n    (6.86.2)
Here the state vector is a scalar. Therefore, the Kalman gain can be computed with scalar equations:
P_{n|n−1} = 0.8² P_{n−1|n−1} + 0.36    (6.86.4)
K_n = P_{n|n−1} [P_{n|n−1} + 1]^{-1}    (6.86.5)
With the initial conditions
x̂_0 = E{x_0} = 0 and P_{0|0} = E{|x_0|²} = 1    (6.87.2)
the recursion can be carried out.
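A scalar sketch of this Kalman filter (assuming NumPy; the function name and simulated data are illustrative). Note that the steady-state gain equals 0.375, the coefficient of the causal IIR Wiener filter obtained earlier for the same signal model:

```python
import numpy as np

def kalman_scalar(y, a=0.8, qw=0.36, qv=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x[n] = a x[n-1] + w[n], y[n] = x[n] + v[n]."""
    x_hat, P = x0, p0
    estimates, gains = [], []
    for yn in y:
        x_pred = a * x_hat                  # prediction, eq. (6.79.2)
        P_pred = a * a * P + qw             # covariance propagation, eq. (6.86.4)
        K = P_pred / (P_pred + qv)          # Kalman gain, eq. (6.86.5)
        x_hat = x_pred + K * (yn - x_pred)  # update, eq. (6.82.2)
        P = (1 - K) * P_pred
        estimates.append(x_hat)
        gains.append(K)
    return np.array(estimates), np.array(gains)

rng = np.random.default_rng(2)
N = 500
x = np.zeros(N)
for n in range(1, N):
    x[n] = 0.8 * x[n - 1] + 0.6 * rng.standard_normal()   # qw = 0.36
y = x + rng.standard_normal(N)
est, gains = kalman_scalar(y)
```

The gain converges quickly to its steady-state value, at which point the recursion coincides with the causal Wiener estimator d̂_n = 0.5 d̂_{n−1} + 0.375 y_n derived earlier.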