Nikhil Chandaria
CID:00466532
Supervisor: Dr. Colin J. Cotter
1st of June, 2010
Abstract
This report describes the development of a filter to classify an electrocardiogram. The Ornstein-Uhlenbeck process is used to model the signal between heartbeats, and we investigate the use of the ensemble Kalman filter to estimate the parameters of this stochastic process. We find that the filter is unable to estimate the drift term in the stochastic differential equation. We then propose a modification involving the ensemble square root filter and a Bayesian approach to estimating this parameter, which proves more successful. However, we find that the Ornstein-Uhlenbeck process is an insufficient model for the signal between heartbeats recorded from a patient.
Acknowledgements
I would like to thank Dr. Colin Cotter for his encouragement, advice and insight throughout this project. I would also like to thank Professor Nicholas Peters and Dr. Louisa Lawes for their explanation of atrial fibrillation and the ablation procedure, and for providing the electrocardiogram data.

Contents
Bibliography
A Solution to OU Process
B σ Estimation Issue
C Maximum Likelihood Estimator
D Ornstein-Uhlenbeck Process MATLAB Code
E Final Filter MATLAB Code
F Removing Heartbeats from ECG Data
Atrial fibrillation (AF) is classified as a cardiac arrhythmia, that is, an irregular heartbeat. It is associated with problems within the electrical conduction system of the heart. Within the UK at least 46,000 people are diagnosed every year (Iqbal et al., 2005); as a result, £459 million is spent by the National Health Service (Stewart et al., 2004), which is roughly 1% of the NHS budget. A variety of treatments are available for patients who suffer from AF, including medicinal treatment, electrical and chemical cardioversion, and ablative surgery. Within this project we focus on catheter ablation. We will first consider the detection of AF and the ablative procedure. In section 2 we will consider stochastic processes as a means of modelling the heart in order to develop an analytical tool. In sections 3 and 4 we will consider filtering techniques for estimation purposes. In section 5 we will examine the application of filtering techniques to data from the electrocardiogram (ECG).
1.1 Atrial Fibrillation
In this subsection we will examine atrial fibrillation and methods of detecting it.
We will first consider normal sinus rhythm, that is, a regular heartbeat, and how the electrical signal is conducted through the heart. The heartbeat can be classified into different stages with the use of an ECG. A typical heartbeat is shown in figure 1.1.2. The electrical impulse generated in the sinoatrial node (shown as the sinus node in figure 1.1.1) appears as the P wave; this impulse causes the atria to contract, pushing blood from the atria into the ventricles. The electrical signal then travels to the atrioventricular (AV) node, after which it causes the ventricles to contract, forcing blood from the heart to the rest of the body. The delay between the contractions of the atria and the ventricles is characterized by the PR segment; without this delay the entire heart would beat at the same time. The QRS complex denotes the spread of the electrical activity from the atria to the ventricles. Finally, the repolarization of the ventricles is shown by the T wave.
Fig. 1.1.1: Diagram illustrating the main areas of the heart
For a patient who suffers from AF there are two main indicators of the condition: an irregular heartbeat and, the stronger of the two, the absence of P waves (American Heart Association, 2008). The condition can cause palpitations, fainting and congestive heart failure.
Fig. 1.1.2: A typical heartbeat as shown on an ECG
1.2 Ablative Surgery
In this subsection we will discuss the ablative surgery method and the issues associated with it.
Ablative surgery is a procedure in which surgeons insert a number of electrodes into a patient's heart and measure the electrical activity (Sivakumaren, 2009). A catheter carrying a high-frequency alternating current is then used to burn away any abscesses that may exist. To do this, a surgeon searches for abnormal impulses using a roving electrode. Once the surgeon has located and ablated the abscess, they continue to search for any other sources of impulses. The aim is to help return the heartbeat to a normal sinus rhythm. This method, however, is subjective and can lead to varying success rates between surgeons and treatment centers (Calkins et al., 2007). We believe that this is because there is no method for determining whether a signal is displaying disorganized electrical activity, as is the case under AF. In this project we aim to work on a method that will classify whether a signal is noise or is indeed atrial fibrillation. We hope that this research will help provide an objective decision-making process for surgeons. It may also allow a surgeon to perform a post-operation analysis of the procedure, to understand the impact of ablation on the patient and to be able to distinguish between noisy signals and anomalous electrical activity in the heart.
A stochastic process is a random process; the most common example is Brownian motion. A stochastic process can only be described by a probability density function (pdf). For an ordinary differential equation (ODE), a given initial condition gives a single evolution through time; for a stochastic differential equation (SDE), a given initial condition can lead to any number of possible paths (Risken, 1996). An example of a stochastic process is the Wiener process $W_t$, which has the following properties:
• $W_0 = 0$
• $\Delta W = W_t - W_s \sim \mathcal{N}(0, t - s)$ for $s < t$, where the second argument of $\mathcal{N}$ is the variance
We can then generate a Wiener process over a set time span using the following equation:
$$x_t = x_{t-1} + \mathcal{N}(0, t - s) \tag{2.0.1}$$
with the initial condition $x_0 = 0$, where $t - s$ is the time difference between successive points.
Fig. 2.0.1: Demonstration of 4 sample Wiener process paths. This shows the randomness of generating a Wiener process.
Figure 2.0.1 displays an example of 4 different Wiener processes generated using a time difference of 0.01 and 100 time steps.
Figure 2.0.2 contains the evolution of 10,000 Wiener processes using the same conditions used to generate the paths shown in figure 2.0.1. Based on these plots and Risken (1996) we can infer that the evolution of the standard deviation of the particles is given by:
$$\sigma(t) = \sqrt{t} \tag{2.0.2}$$
Fig. 2.0.2: Demonstration of 10,000 Wiener processes. This demonstrates that the process displays a standard deviation of $\sqrt{t}$.
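The $\sqrt{t}$ growth of the standard deviation can be checked numerically. The following is a minimal Python sketch, not the code used to produce the figures: it uses the same time difference (0.01), number of steps (100) and number of paths (10,000), while the function names `wiener_paths` and `sample_std` and the fixed seed are assumptions made for illustration.

```python
import math
import random


def wiener_paths(n_paths, n_steps, dt, seed=0):
    """Return n_paths sample paths, each a list of n_steps + 1 values starting at 0."""
    rng = random.Random(seed)
    step_std = math.sqrt(dt)  # increments are N(0, dt), so their standard deviation is sqrt(dt)
    paths = []
    for _ in range(n_paths):
        w = [0.0]
        for _ in range(n_steps):
            w.append(w[-1] + rng.gauss(0.0, step_std))
        paths.append(w)
    return paths


def sample_std(values):
    """Unbiased sample standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


paths = wiener_paths(n_paths=10000, n_steps=100, dt=0.01)
final_values = [p[-1] for p in paths]  # values at t = 100 * 0.01 = 1
print("sample std at t = 1:", sample_std(final_values))
print("sqrt(t) at t = 1:   ", math.sqrt(1.0))
```

The two printed values should agree to within sampling error, which shrinks as the number of paths grows.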
2.1 Euler-Maruyama Method
In this subsection we will discuss the Euler-Maruyama method for solving SDEs numerically.
For many SDEs of the form
$$dX_t = a(X_t)\,dt + b(X_t)\,dW_t \tag{2.1.1}$$
no explicit solution can be obtained (Risken, 1996), so a numerical integrator is required to obtain an approximate solution. The simplest numerical method is the Euler-Maruyama (EM) method, which approximates the solution of an SDE using the following equation:
$$X_{k+1} = X_k + a(X_k)\,\Delta t + b(X_k)\,\Delta W_k \tag{2.1.2}$$
where $\Delta W_k = W(t_{k+1}) - W(t_k)$. This method has a strong order of convergence of $n = \frac{1}{2}$ and a weak order of convergence of $n = 1$ (Kloeden et al., 2003). This means that to improve the pathwise precision of the EM method by a factor of 10 we would need to reduce the time step by a factor of 100. With $b = 0$ the above equation reduces to the forward Euler method for ordinary differential equations.
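As an illustration of equation 2.1.2, the scheme can be sketched in a few lines of Python. This is a minimal sketch rather than the report's implementation: the function name `euler_maruyama`, the test equation $dX = -X\,dt + 0.5\,dW$ and the seed are assumptions chosen for the example.

```python
import math
import random


def euler_maruyama(a, b, x0, dt, n_steps, rng):
    """Integrate dX = a(X) dt + b(X) dW with the Euler-Maruyama update (2.1.2)."""
    xs = [x0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment over one step, N(0, dt)
        x = xs[-1]
        xs.append(x + a(x) * dt + b(x) * dw)
    return xs


rng = random.Random(1)

# Hypothetical test problem (not from the report): dX = -X dt + 0.5 dW
path = euler_maruyama(lambda x: -x, lambda x: 0.5, x0=1.0, dt=0.01, n_steps=100, rng=rng)

# With b = 0 the update is the forward Euler method for dx/dt = -x
ode_path = euler_maruyama(lambda x: -x, lambda x: 0.0, x0=1.0, dt=0.01, n_steps=100, rng=rng)

print("stochastic path at t = 1:", path[-1])
print("forward Euler at t = 1:  ", ode_path[-1])
print("exact ODE solution:      ", math.exp(-1.0))
```

Setting the diffusion term to zero makes the connection to forward Euler concrete: the second path is deterministic and approaches $e^{-1}$ as the time step is reduced.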
2.2 Ornstein-Uhlenbeck Process
This subsection deals with the mean-reverting Ornstein-Uhlenbeck process, which will be used as a model for the signal between heartbeats. The application and validity of the model will also be discussed.
The Ornstein-Uhlenbeck (OU) process is a mean-reverting stochastic process (Uhlenbeck and Ornstein, 1940) of the form:
$$dX_t = \theta(\mu - X_t)\,dt + \sigma\,dW_t \tag{2.2.1}$$
where $\theta, \mu, \sigma > 0$ and $W_t$ is a Wiener process. The exact solution of the OU process has the following properties:
$$E[X_t] = X_0 e^{-\theta t} + \mu\left(1 - e^{-\theta t}\right) \tag{2.2.2}$$
and
$$\mathrm{var}(X_t) = \frac{\sigma^2}{2\theta}\left(1 - e^{-2\theta t}\right) \tag{2.2.3}$$
For a proof of these two statements please refer to appendix A. In order to generate a sample path of equation 2.2.1 we apply an AR(1) method:
$$x_t = x_{t-1} e^{-\theta \Delta t} + \mu\left(1 - e^{-\theta \Delta t}\right) + \sigma\sqrt{\frac{1 - e^{-2\theta \Delta t}}{2\theta}}\;\mathcal{N}(0, 1) \tag{2.2.4}$$
An interesting note is that the OU process has the Markov property:
$$P(X_{n+1} \mid X_1, X_2, \ldots, X_n) = P(X_{n+1} \mid X_n) \tag{2.2.5}$$
Applying the parameters laid out in table 2.2.1 results in figure 2.2.1.
Parameter          Value
$\theta$           100
$\mu$              0
$\sigma$           1
$\Delta t$         0.01
Number of points   100
Total time         1 s
Table 2.2.1: Parameters for sample OU process
Fig. 2.2.1: A sample Ornstein-Uhlenbeck process generated using $\mu = 0$, $\theta = 100$ and $\sigma = 1$ with 100 points and a total time for the process of 1 second. This demonstrates the mean-reverting nature of the process.
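The AR(1) update of equation 2.2.4 translates directly into code. The sketch below is written in Python for illustration and is not the MATLAB code of appendix D; the function name `simulate_ou`, the seed and the starting value $x_0 = 1$ are assumptions (the starting value is chosen so that the reversion towards $\mu = 0$ is visible).

```python
import math
import random


def simulate_ou(theta, mu, sigma, dt, n_points, x0, rng):
    """Sample an OU path of n_points values using the exact AR(1) update (2.2.4)."""
    decay = math.exp(-theta * dt)
    noise_std = sigma * math.sqrt((1.0 - math.exp(-2.0 * theta * dt)) / (2.0 * theta))
    xs = [x0]
    for _ in range(n_points - 1):
        xs.append(xs[-1] * decay + mu * (1.0 - decay) + noise_std * rng.gauss(0.0, 1.0))
    return xs


# Parameters from table 2.2.1; x0 = 1 is an assumed starting value
rng = random.Random(0)
path = simulate_ou(theta=100.0, mu=0.0, sigma=1.0, dt=0.01, n_points=100, x0=1.0, rng=rng)

# Long-time standard deviation implied by (2.2.3): sigma / sqrt(2 * theta)
stationary_std = 1.0 / math.sqrt(2.0 * 100.0)
print("first values:  ", path[:3])
print("stationary std:", stationary_std)
```

Because the update uses the exact transition density, the sampled path has the mean and variance of equations 2.2.2 and 2.2.3 for any step size, unlike an Euler-Maruyama discretization.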
2.2.1 Applications
There are various applications for the Ornstein-Uhlenbeck process, such as financial modelling. A popular use of the process is commodity pricing, with $\mu$ as the equilibrium level, $\sigma$ as the risk of the product and $\theta$ as the sensitivity to external factors. It has also been used in modelling the cardiovascular system in animals (Gartner et al., 2010).
2.2.2 Validity
The Ornstein-Uhlenbeck process is examined here to assess whether it is a useful model for devising a tool that can process data from an electrocardiogram (ECG).
The main requirement is that the process can match the data taken from an ECG. The idea is that the signal between heartbeats is mean reverting, so the OU process can generate a signal analogous to it. To test this we need to consider data from a surgery. As discussed in section 1.1, the absence of P waves on an ECG is a strong indicator of atrial fibrillation, so it is important to examine the signal between heartbeats to detect the presence of the P wave. To do this we first filter out the heartbeats, which lets us assess whether the chosen stochastic process is suitable.
Fig. 2.2.2: The bottom graph shows an unmodified ECG. The top graph is the same ECG with the heartbeats removed. This shows that the signal between heartbeats is mean reverting.
Figure 2.2.2 ^{1} gives a comparison of the data with and without heartbeats. We can see that the process is indeed mean reverting, with a mean of roughly 400. Another important check of the suitability of the process is whether the data itself is Gaussian, which we assess by taking a histogram of the data. As we can see in figure 2.2.3, the filtered data does display a Gaussian distribution; therefore, for modelling purposes the Ornstein-Uhlenbeck process should be sufficient.
Fig. 2.2.3: A histogram of the ECG with the heartbeats removed. This shows that the data is normally distributed.
^{1} This was created by using a tool that checks for monotonicity over a set number of points in an ECG and thereby removes a heartbeat. The method is displayed in appendix F.
A Kalman filter (Kalman, 1960) is a discrete recursive numerical method used to filter noisy signals, estimate the true signal and infer other parameters associated with the system. It was originally developed for use in trajectory estimation for the Apollo space program (McGee and Schmidt, 1985). Since its conception it has been used in many everyday applications such as GPS and radar tracking, numerical weather prediction (NWP) (Evensen, 1992), turbulence modelling (Majda et al., 2010) and financial modelling (Krul, 2008).
The filter relies on the fact that the true state can be inferred from the state at the previous time step; that is, the state forms a Markov chain. This allows the filter to be used in modelling stochastic processes.
This section discusses the development of Kalman filters and their suitability for the problem of estimating the Ornstein-Uhlenbeck process parameters.
3.1 Linear Kalman Filter
In this subsection we will deal with the linear Kalman filter.
The linear Kalman filter tries to estimate the state $x \in \mathbb{R}^n$ governed by the following stochastic difference equation (Welch and Bishop, 2006):
$$x_k = A x_{k-1} + B u_{k-1} + w_{k-1} \tag{3.1.1}$$
where $A$ is the state transition matrix of size $n \times n$, $B$ (size $n \times l$) relates an optional control input $u \in \mathbb{R}^l$ to the state $x$, and $w_k$ is white process noise with $p(w) \sim \mathcal{N}(0, Q)$, where $Q$ is the process noise covariance. The filter has an observation equation of the form:
$$z_k = H x_k + v_k \tag{3.1.2}$$
where $z_k \in \mathbb{R}^m$ is the measurement of a signal, $H$ is a measurement operator of size $m \times n$ and $v_k$ is white measurement noise with $p(v) \sim \mathcal{N}(0, R)$, where $R$ is the measurement noise covariance. It is important to note that the symbols used in the above definitions vary depending on the book, paper or notes used to reference the Kalman filter. In order to understand how the Kalman filter works the following terms also need to be defined:
$$x_k^f = \text{prior estimate} \tag{3.1.3}$$
$$x_k^a = \text{posterior estimate} \tag{3.1.4}$$
Along with these two terms, we can define the following covariance matrices:
$$P_k^f = E[(x_k - x_k^f)(x_k - x_k^f)^T] = \text{prior covariance} \tag{3.1.5}$$
$$P_k^a = E[(x_k - x_k^a)(x_k - x_k^a)^T] = \text{posterior covariance} \tag{3.1.6}$$
With all the required terms now defined, we can state the Kalman filter equations. They fall into two categories: prediction and correction steps.
The equations associated with the prediction step are as follows:
$$x_k^f = A x_{k-1}^a + B u_{k-1} \tag{3.1.7}$$
$$P_k^f = A P_{k-1}^a A^T + Q \tag{3.1.8}$$
The equations associated with the correction step are as follows:
$$K_k = P_k^f H^T (H P_k^f H^T + R)^{-1} \tag{3.1.9}$$
$$x_k^a = x_k^f + K_k (z_k - H x_k^f) \tag{3.1.10}$$
$$P_k^a = (I - K_k H) P_k^f \tag{3.1.11}$$
The remaining term, $K_k$, is the Kalman gain matrix, which relates the prior and posterior estimates ^{2}.
3.1.1 Algorithm
This section deals with the pseudocode for the linear Kalman filter.
Algorithm 3.1.1 Linear Kalman Filter Algorithm
define $A$, $B$, $u$, $Q^2$ and $R^2$
for k = 1 to N do
    $x_k^f = A x_{k-1}^a + B u_{k-1}$
    $P_k^f = A P_{k-1}^a A^T + Q$
    $K_k = P_k^f H^T (H P_k^f H^T + R)^{-1}$
    $x_k^a = x_k^f + K_k (z_k - H x_k^f)$
    $P_k^a = (I - K_k H) P_k^f$
end for
3.1.2 Application
To understand how the Kalman filter operates it is useful to apply it to the underlying problem at hand: estimating the Ornstein-Uhlenbeck process. However, the problem is nonlinear when we try to estimate all of the parameters ($x$, $\mu$, $\theta$ and $\sigma$). For the purpose of understanding the filter we therefore assume that the only unknown is $x$ and that all other parameters are known, which reduces the problem to a linear one.
We generate an Ornstein-Uhlenbeck process using the parameters defined in table 2.2.1 and, using the Kalman filter parameters defined in table 3.1.1, call the filter to estimate the $x$ state of the process.
Parameter   Value
$x_0$       2
$P_0$       1
$A$         $e^{-\theta\,dt}$
$B$         $1 - e^{-\theta\,dt}$
$u$         $\mu$
$Q^2$       $\frac{\sigma^2}{2\theta}\left(1 - e^{-2\theta\,dt}\right)$
$R^2$       0.01
Table 3.1.1: Parameters used to estimate the position of the OU process when passed through a linear Kalman filter. It is assumed that $\mu$, $\theta$ and $\sigma$ are known.
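The prediction and correction steps, with the parameters of tables 2.2.1 and 3.1.1, can be sketched for the scalar case as follows. This is an illustrative Python sketch rather than the report's code. It assumes $H = 1$, treats the table entries $Q^2$ and $R^2$ as the process and measurement noise variances, and starts the synthetic true signal at 0; the names `kalman_filter_scalar` and `rmse` and the seed are hypothetical.

```python
import math
import random


def kalman_filter_scalar(zs, A, B, u, Q, R, x0, P0):
    """Scalar linear Kalman filter with H = 1; returns posterior means and variances."""
    x_a, P_a = x0, P0
    means, variances = [], []
    for z in zs:
        # Prediction step, equations (3.1.7) and (3.1.8)
        x_f = A * x_a + B * u
        P_f = A * P_a * A + Q
        # Correction step, equations (3.1.9) to (3.1.11)
        K = P_f / (P_f + R)
        x_a = x_f + K * (z - x_f)
        P_a = (1.0 - K) * P_f
        means.append(x_a)
        variances.append(P_a)
    return means, variances


def rmse(estimates, reference):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimates, reference)) / len(reference))


# Parameters from tables 2.2.1 and 3.1.1; Q and R are used as variances (assumption)
theta, mu, sigma, dt = 100.0, 0.0, 1.0, 0.01
A = math.exp(-theta * dt)
B = 1.0 - A
Q = sigma ** 2 / (2.0 * theta) * (1.0 - math.exp(-2.0 * theta * dt))
R = 0.01

# Synthetic truth from the AR(1) update (2.2.4), then noisy observations z = x + v
rng = random.Random(2)
truth, state = [], 0.0  # starting the true signal at 0 is an assumption
for _ in range(100):
    state = A * state + B * mu + math.sqrt(Q) * rng.gauss(0.0, 1.0)
    truth.append(state)
zs = [xt + math.sqrt(R) * rng.gauss(0.0, 1.0) for xt in truth]

means, variances = kalman_filter_scalar(zs, A, B, mu, Q, R, x0=2.0, P0=1.0)
print("observation RMSE:", rmse(zs, truth))
print("filter RMSE:     ", rmse(means, truth))
```

The large initial variance $P_0 = 1$ means the first observation is trusted almost completely, so the deliberately wrong initial state $x_0 = 2$ is corrected within a step or two.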
Figure 3.1.1 shows that the Kalman filter works well in estimating the true state from a noisy version of the generated signal. While it is not perfect, it gives a good idea of how the filter operates: it forecasts and then corrects the signal, and rather than completely believing the incoming observation it moves the estimate closer to the true state. An important note is that the filter can be further fine-tuned by reducing the value of $R^2$ (the model's measurement noise covariance).
This section has provided a grounding in the Kalman filter. However, as this filter is linear it is of little use to us: we are looking to estimate all of the parameters in the Ornstein-Uhlenbeck process, so we need to consider nonlinear options.
Fig. 3.1.1: Example of the Kalman filter estimating the Ornstein-Uhlenbeck process. This demonstrates the ability of the filter to estimate the true state from the noisy observations.
^{2} For a more comprehensive description of the Kalman filter and a derivation of the equations please refer to Simon (2006).
3.2 Extended Kalman Filter
In this subsection we will go on to discuss the extended Kalman filter, which is used for nonlinear problems.
The linear Kalman filter has an inherent problem in that it can only be applied to a linear system; for the Ornstein-Uhlenbeck process this means that the only element that can be estimated is the position, $x$. The aim is to estimate all of the parameters associated with the OU process ($x$, $\mu$, $\theta$ and $\sigma$), so we need a filter that takes nonlinearity into account. The first step in doing so is the extended Kalman filter (EKF).
The EKF works by linearizing the set of equations that we will be using to estimate the process. We use the generalised equations (Welch and Bishop, 2006):
$$x_k = f(x_{k-1}, u_{k-1}, w_{k-1}) \tag{3.2.1}$$
$$z_k = h(x_k, v_k) \tag{3.2.2}$$
We can then linearize the equations using a Taylor expansion and apply them to the filter equations, provided we have the following Jacobian matrices:
$$A_{ij} = \frac{\partial f_i}{\partial x_j}(x_{k-1}, u_{k-1}, 0) \tag{3.2.3}$$
$$W_{ij} = \frac{\partial f_i}{\partial w_j}(x_{k-1}, u_{k-1}, 0) \tag{3.2.4}$$
$$H_{ij} = \frac{\partial h_i}{\partial x_j}(x_k, 0) \tag{3.2.5}$$
$$V_{ij} = \frac{\partial h_i}{\partial v_j}(x_k, 0) \tag{3.2.6}$$
where $x_k$ is a posterior estimate as defined in section 3.1. With the above definitions we can define the extended Kalman filter equations. Again they can be split into estimation and correction steps.
The estimation equations are:
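A sketch of both steps in the notation of section 3.1, following the standard form given by Welch and Bishop (2006), is shown here. The time index on the Jacobians and the pairing of $Q$ and $R$ with $W$ and $V$ are taken from that reference rather than from this report, so they should be read as assumptions about the intended equations.

```latex
\begin{align*}
% Estimation (prediction) step
x_k^f &= f(x_{k-1}^a, u_{k-1}, 0) \\
P_k^f &= A_k P_{k-1}^a A_k^T + W_k Q_{k-1} W_k^T \\
% Correction step
K_k   &= P_k^f H_k^T \left(H_k P_k^f H_k^T + V_k R_k V_k^T\right)^{-1} \\
x_k^a &= x_k^f + K_k \left(z_k - h(x_k^f, 0)\right) \\
P_k^a &= (I - K_k H_k) P_k^f
\end{align*}
```

Setting $f(x, u, 0) = Ax + Bu$, $h(x, 0) = Hx$ and $W = V = I$ recovers the linear filter equations 3.1.7 to 3.1.11.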
To apply the EKF to the OU process we have to evaluate the Jacobian matrices for the problem at hand. We go back to equation 2.2.1 and add persistence equations, with added artificial noise, for the non-observed states. The artificial noise prevents these parameters from completely settling to one value. For the problem at hand, the patient undergoing surgery may be ablated, changing the shape of their ECG; we therefore need to ensure that the model covariance for the parameters does not settle to 0, at which point the filter would stop believing the data coming in from the electrodes attached to the patient:
$$\mu_k^f = \mu_{k-1}^a + C_\mu\,dW_\mu \tag{3.2.12}$$
$$\theta_k^f = \theta_{k-1}^a + C_\theta\,dW_\theta \tag{3.2.13}$$
$$\sigma_k^f = \sigma_{k-1}^a + C_\sigma\,dW_\sigma \tag{3.2.14}$$
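We obtain the following Jacobian matrices:
$$A = \begin{pmatrix} -\theta & \theta & \mu - X_t & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{3.2.15}$$
$$W = \begin{pmatrix} 0 & 0 & 0 & \sigma \\ 0 & C_\mu & 0 & 0 \\ 0 & 0 & C_\theta & 0 \\ 0 & 0 & 0 & C_\sigma \end{pmatrix} \tag{3.2.16}$$
$$H = \begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix} \tag{3.2.17}$$
$$V = \begin{pmatrix} 0 \end{pmatrix} \tag{3.2.18}$$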
Applying these matrices to the filter results in very poor performance, and as such the figure has not been displayed. The filter diverges, with the $x_t$ estimate dropping to values of $O(10^{172})$ after 17 time steps. The extended Kalman filter suffers from linearization error, which could explain the divergence of the parameters. Other approaches to the extended Kalman filter could be pursued, such as hybrid filters, which consider a continuous-time system with discrete-time measurements, or higher-order approaches to the linearization. Despite these options, the EKF can be very difficult to tune and can give unreliable estimates depending on the severity of the nonlinearity of the system (Simon, 2006). The linearization of the covariance error associated with the EKF can result in unbounded linear instabilities for the error evolution (Evensen, 1992). Therefore, other filter types need to be examined as an alternative to higher-order linearizations, which brings us to the ensemble family of filters discussed in section 4.
Ensemble filters are alternatives to the traditional filters discussed in section 3; the two best known are the ensemble Kalman filter and the particle filter. In ensemble filters the uncertainty in the system is represented by a large ensemble of model realizations rather than by an explicit expression for the error covariance matrix (Evensen, 2009). The model states are then integrated forward in time to predict error statistics. Research has also shown that ensemble filters for nonlinear models cost less computationally than an extended Kalman filter (Evensen, 2006). Consequently, ensemble filters have found widespread use when handling a large state space, such as in NWP.
4.1 Particle Filter
In this subsection we will discuss particle filtering techniques used for estimating nonlinear systems where the probability density function is nonmodal.
The particle filter is a sequential Monte Carlo algorithm that, as mentioned in section 4, uses an ensemble of $N$ members, or particles, to estimate the characteristics of a system. It is a computational method of implementing a Bayesian estimator. To understand how it works we must first look at Bayes' theorem, from which the particle filter computes the statistics of the system. We begin with our system and measurement equations (Simon, 2006):
$$x_{k+1} = f(x_k, w_k) \tag{4.1.1}$$
$$z_k = h(x_k, v_k) \tag{4.1.2}$$
Bayes' theorem then gives the pdf of the state conditioned on the measurements:
$$p(x_k \mid Z_k) = \frac{p(z_k \mid x_k)\,p(x_k \mid Z_{k-1})}{\int p(z_k \mid x_k)\,p(x_k \mid Z_{k-1})\,dx_k} \tag{4.1.3}$$
where $Z_k$ denotes the measurements $z_1, z_2, \ldots, z_k$. Equation 4.1.3 does pose some problems because the denominator can prove to be intractable; hence in many cases it is necessary to use delta functions to evaluate the integral and to estimate the probability density function of the system. Once we can evaluate equation 4.1.3, we can integrate our model in time using the Euler-Maruyama method or the exact solution to the OU process.
Now that we understand Bayes' theorem we can look at the particle filter and how to apply it. Unlike the family of Kalman filters, the particle filter does not assume that the distribution is Gaussian, which makes evaluating the pdf much more difficult. We therefore represent it using a set of weighted particles, where $\pi_t^{(i)}$ represents a normalized weight for the $i$'th particle at time $t$:
$$P(x_t \mid Z_t) = \sum_i \pi_{t-1}^{(i)}\,\delta\!\left(x_t - x_{t-1}^{(i)}\right) \tag{4.1.4}$$
where $Z_t = (z_t, z_{t-1}, \ldots, z_0)$. To initialize the particle filter we distribute a set of $N$ particles based on a known pdf. We shall use the notation $x_{k,i}^a$ and $x_{k,i}^f$, where $k$ is the time step, $i$ is the particle number, $a$ denotes the analysis state and $f$ denotes the forecast state. We begin by propagating each particle to generate a prior state:
$$x_{k,i}^f = f(x_{k-1,i}^a, w_{k-1}) \tag{4.1.5}$$
We then compute the relative likelihood of each particle by evaluating the pdf $p(z_k \mid x_{k,i}^f)$, which we will denote as $q_i$. We then normalize each likelihood to obtain the weight of each particle:
$$\pi_k^i = \frac{q_i}{\sum_{j=1}^{N} q_j} \tag{4.1.6}$$
Once we have this normalized weight we can resample the particles according to their relative likelihoods to generate the posterior states $x_{k,i}^a$, and thus we have our pdf $p(x_k \mid z_k)$. The particle filter does suffer from some problems, namely sample impoverishment, in which all the particles collapse to the same value (Simon, 2006). There are methods of reducing this impoverishment, such as adding random noise or modifying the resampling step with a Monte Carlo Metropolis-Hastings algorithm. While this illustrates the use of particle filters, we can simplify the problem because we have shown that our data follows a Gaussian distribution, which allows us to move to less computationally expensive methods.
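The forecast, weighting and resampling steps of equations 4.1.5 and 4.1.6 can be sketched as a simple bootstrap particle filter. The Python sketch below is illustrative only. It estimates the OU state $x$ with $\mu$, $\theta$ and $\sigma$ assumed known, and it assumes an observation model $z_k = x_k + v_k$ with Gaussian noise of variance 0.01, an initial particle distribution $\mathcal{N}(0, 0.1^2)$ and multinomial resampling; none of these choices, nor the function names, come from the report.

```python
import math
import random


def gaussian_pdf(z, mean, var):
    """Density of N(mean, var) evaluated at z."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def particle_filter_step(particles, z, propagate, obs_var, rng):
    """One forecast, weighting and resampling step of a bootstrap particle filter."""
    forecast = [propagate(x, rng) for x in particles]        # prior states, (4.1.5)
    q = [gaussian_pdf(z, x, obs_var) for x in forecast]      # relative likelihoods q_i
    total = sum(q)
    if total == 0.0:                                         # every likelihood underflowed
        weights = [1.0 / len(q)] * len(q)
    else:
        weights = [qi / total for qi in q]                   # normalized weights, (4.1.6)
    # Multinomial resampling: draw N posterior particles with probability pi_i
    analysis = rng.choices(forecast, weights=weights, k=len(forecast))
    return analysis, weights


# OU parameters from table 2.2.1 (assumed known); obs_var = 0.01 is an assumption
theta, mu, sigma, dt, obs_var = 100.0, 0.0, 1.0, 0.01, 0.01
decay = math.exp(-theta * dt)
step_std = sigma * math.sqrt((1.0 - math.exp(-2.0 * theta * dt)) / (2.0 * theta))


def ou_step(x, rng):
    """Exact OU transition (2.2.4), used as the system equation f."""
    return x * decay + mu * (1.0 - decay) + step_std * rng.gauss(0.0, 1.0)


rng = random.Random(3)
n_particles = 300
particles = [rng.gauss(0.0, 0.1) for _ in range(n_particles)]  # assumed initial pdf
truth = 0.0
errors_filter, errors_obs = [], []
for _ in range(100):
    truth = ou_step(truth, rng)
    z = truth + math.sqrt(obs_var) * rng.gauss(0.0, 1.0)
    particles, weights = particle_filter_step(particles, z, ou_step, obs_var, rng)
    estimate = sum(particles) / n_particles                  # posterior mean
    errors_filter.append((estimate - truth) ** 2)
    errors_obs.append((z - truth) ** 2)

print("observation RMSE:", math.sqrt(sum(errors_obs) / len(errors_obs)))
print("filter RMSE:     ", math.sqrt(sum(errors_filter) / len(errors_filter)))
```

Because the resampled particles are drawn from the forecast set, repeated values appear in the posterior ensemble; this is the mechanism behind the sample impoverishment described above, and the process noise in `ou_step` is what spreads the particles out again at the next forecast.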
4.2 Ensemble Kalman Filter
In this subsection we will examine the ensemble Kalman filter, used for estimating a nonlinear problem under the assumption that the model displays a normal distribution. This subsection will also go on to discuss modifications to the filter required to prevent members collapsing to a single value.
The ensemble Kalman filter (EnKF) works in a similar way to the particle filter; however, it is computationally less expensive because it assumes that the distribution of the system is Gaussian and that every member has an equal weighting. Evensen (2006) shows that the EnKF is a special version of the particle filter in which the update step is approximated by a linear update using just the mean and covariance of the pdf. To avoid confusion, the notation for the EnKF will be slightly different to that used in section 3. From now on $x_{k,i}^f$ is the $i$'th forecast ensemble member at time $k$, $x_{k,i}^a$ is the corrected $i$'th member at time $k$, and in our particular application $x = (x, \mu, \theta, \sigma)$. We can then revert to our system equations 3.2.1 and 3.2.2 and use them for our analysis step:
$$x_{k,i}^f = f(x_{k-1,i}^a, u_{k-1}, 0) \tag{4.2.1}$$
$$z_{k,i} = h(x_{k,i}^f, 0) \tag{4.2.2}$$
In section 3.1 we defined the covariance matrices for the prior and posterior distributions in equations 3.1.5 and 3.1.6; here, however, we need a slightly different method to establish the covariance matrices. We begin by defining the matrix $X_k^f \in \mathbb{R}^{n \times N}$ as the matrix of ensemble members:
$$X_k^f = \left(x_{k,1}^f, x_{k,2}^f, \ldots, x_{k,N}^f\right) \tag{4.2.3}$$
We can then define a matrix $\bar{X}_k \in \mathbb{R}^{n \times N}$ as the matrix of the ensemble mean:
$$\bar{X}_k = X_k^f 1_N \tag{4.2.4}$$
where $1_N$ is an $N \times N$ matrix in which all entries are equal to $1/N$. Once we have these definitions we can assemble a matrix of fluctuations:
$$X_k^{\prime f} = X_k^f - \bar{X}_k \tag{4.2.5}$$
Finally, we can assemble our error covariance matrix:
$$P_k^f = \frac{1}{N - 1} X_k^{\prime f} \left(X_k^{\prime f}\right)^T$$
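The construction of the ensemble covariance from the matrix of fluctuations can be illustrated with a short Python sketch. It is a plain-list implementation for clarity rather than efficiency; the function name `ensemble_covariance` and the small example ensemble are hypothetical. Note that the ensemble is stored here as a list of members (one row per member), which is the transpose of the $n \times N$ layout of $X_k^f$.

```python
def ensemble_covariance(ensemble):
    """Forecast covariance P = X' X'^T / (N - 1) for an ensemble of N members.

    ensemble: list of N members, each a list of n state components.
    Returns an n x n matrix as a list of rows.
    """
    N = len(ensemble)
    n = len(ensemble[0])
    # Ensemble mean of each state component (a column of the ensemble-mean matrix)
    mean = [sum(member[r] for member in ensemble) / N for r in range(n)]
    # Fluctuations X' = X - X_bar, stored member by member
    fluct = [[member[r] - mean[r] for r in range(n)] for member in ensemble]
    return [[sum(f[r] * f[c] for f in fluct) / (N - 1) for c in range(n)]
            for r in range(n)]


# Hypothetical ensemble of N = 3 members with n = 2 state components
example = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
P = ensemble_covariance(example)
print(P)
```

For this example the second component is exactly twice the first, so the covariance matrix is singular, with variances 1 and 4 and covariance 2.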