Markov Random Field Extensions using State Space Models

CLAUS DETHLEFSEN
Aalborg University, Denmark

Bayesian Statistics 7, pp. 000–000
J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid,
D. Heckerman, A. F. M. Smith and M. West (Eds.)
© Oxford University Press, 2003
SUMMARY
We elaborate on the link between state space models and (Gaussian) Markov random fields. We
extend Markov random field models by generalising the corresponding state space models. It
turns out that several non-Gaussian spatial models can be analysed by combining approximate
Kalman filter techniques with importance sampling. We illustrate the ideas by formulating a
model for edge detection in digital images, which then forms the basis of a simulation study.
1. INTRODUCTION
The class of state space models is very broad and comprises structural time series
models, ARIMA models, cubic spline models and, as demonstrated by Lavine (1999),
also Markov random field models. Kalman filter techniques are powerful tools for
inference in such sequential models. Basic references on state space model methodology
are Harvey (1989), West and Harrison (1997) and Durbin and Koopman (2001). In
the past decade, there has been interest in developing Markov chain Monte Carlo (MCMC)
methods for the analysis of complex state space models; see Carlin et al. (1992),
Carter and Kohn (1994), Frühwirth-Schnatter (1994) and de Jong and Shephard (1995).
Our approach is not based on MCMC, but on iterated extended Kalman smoothing,
which may be combined with importance sampling for exact simulation; see Durbin
and Koopman (2001). Using this method, we avoid the MCMC problems of ensuring
that the Markov chain is mixing well and of assessing whether the chain has converged.

Writing Markov random field models as state space models, following Lavine (1999),
makes it possible to use Kalman filter techniques to extend and analyse more complex
Markov random field models. We show how to analyse such extensions and illustrate
this by formulating a model for restoring digital images, with focus on finding edges in
the image. The new class of models also has applications within agricultural
experiments, see e.g. Besag and Higdon (1999), and within disease mapping, see e.g.
Knorr-Held and Rue (2002).
2 Claus Dethlefsen
The Kalman filter recursions are
$$\theta_t \mid D_{t-1} \sim N_p\big(\overbrace{G_t m_{t-1}}^{a_t},\ \overbrace{G_t C_{t-1} G_t^T + W_t}^{R_t}\big)$$
$$y_t \mid D_{t-1} \sim N_d\big(\overbrace{F_t^T a_t}^{f_t},\ \overbrace{F_t^T R_t F_t + V_t}^{Q_t}\big)$$
$$\theta_t \mid D_t \sim N_p\big(\underbrace{a_t + \overbrace{R_t F_t Q_t^{-1}}^{A_t}\,\overbrace{(y_t - f_t)}^{e_t}}_{m_t},\ \underbrace{R_t - A_t Q_t A_t^T}_{C_t}\big).$$
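As a concrete illustration, one filter step can be sketched in a few lines of NumPy. The paper's own computations were done in R; this Python version is only a schematic translation of the recursions above, with all variable names chosen for illustration:

```python
import numpy as np

def kalman_filter_step(m_prev, C_prev, y, G, F, W, V):
    """One step of the Kalman filter recursions:
    prior      a_t = G m_{t-1},  R_t = G C_{t-1} G' + W,
    forecast   f_t = F' a_t,     Q_t = F' R_t F + V,
    posterior  m_t = a_t + A_t (y_t - f_t),  C_t = R_t - A_t Q_t A_t',
    with gain  A_t = R_t F Q_t^{-1}."""
    a = G @ m_prev
    R = G @ C_prev @ G.T + W
    f = F.T @ a
    Q = F.T @ R @ F + V
    A = R @ F @ np.linalg.inv(Q)
    m = a + A @ (y - f)
    C = R - A @ Q @ A.T
    return m, C, a, R, f, Q
```

Each call propagates $(m_{t-1}, C_{t-1})$ to $(m_t, C_t)$ and also returns the intermediate quantities $a_t$, $R_t$, $f_t$, $Q_t$, which are needed later by the smoother and the likelihood.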
Assessment of the state vector $\theta_t$ using all available information $D_n$ is called
Kalman smoothing, and we write $(\theta_t \mid D_n) \sim N_p(\tilde m_t, \tilde C_t)$. Starting with $\tilde m_n = m_n$ and
$\tilde C_n = C_n$, the Kalman smoother is a backwards recursion in time, $t = n-1, \ldots, 1$,
with $\tilde m_t = m_t + B_t(\tilde m_{t+1} - a_{t+1})$ and $\tilde C_t = C_t + B_t(\tilde C_{t+1} - R_{t+1})B_t^T$, where
$B_t = C_t G_{t+1}^T R_{t+1}^{-1}$. When $p$ is large, it is often computationally faster to use the
mathematically equivalent disturbance smoother; see Koopman (1993).
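The backwards recursion can likewise be sketched in Python, given the stored quantities from a forward filter pass (again an illustrative translation, not the original R implementation):

```python
import numpy as np

def kalman_smooth(ms, Cs, as_, Rs, Gs):
    """Backward recursion of the Kalman smoother:
    m~_t = m_t + B_t (m~_{t+1} - a_{t+1}),
    C~_t = C_t + B_t (C~_{t+1} - R_{t+1}) B_t',
    with B_t = C_t G_{t+1}' R_{t+1}^{-1}.
    Inputs are lists (indexed from t = 1 at position 0) of the filtered
    means m_t, variances C_t, prior means a_t, prior variances R_t and
    evolution matrices G_t."""
    n = len(ms)
    m_s = [None] * n
    C_s = [None] * n
    m_s[-1], C_s[-1] = ms[-1], Cs[-1]          # start at m~_n = m_n, C~_n = C_n
    for t in range(n - 2, -1, -1):             # t = n-1, ..., 1
        B = Cs[t] @ Gs[t + 1].T @ np.linalg.inv(Rs[t + 1])
        m_s[t] = ms[t] + B @ (m_s[t + 1] - as_[t + 1])
        C_s[t] = Cs[t] + B @ (C_s[t + 1] - Rs[t + 1]) @ B.T
    return m_s, C_s
```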
The posterior mode of $p(\theta \mid y)$ is $\tilde m^T = (\tilde m_1^T, \ldots, \tilde m_n^T)$. From the definition of
conditional densities, $\tilde m$ maximises $p(\theta, y)$ and thus also
$$\log p(\theta, y) = \sum_{t=1}^n \log p(y_t \mid \theta_t) + \sum_{t=1}^n \log p(\theta_t \mid \theta_{t-1}) + \log p(\theta_0). \quad (1)$$
We may now interpret the Kalman smoother as an algorithm that solves (2) recursively.
The log likelihood function for a vector of hyperparameters $\psi$ is given by
$$l(\psi) = \sum_{t=1}^n \log p(y_t \mid y_1, \ldots, y_{t-1}, \psi) = c - \frac{1}{2}\sum_{t=1}^n \Big( \log |Q_t| + \|y_t - f_t\|^2_{Q_t^{-1}} \Big), \quad (3)$$
Markov Random Field Extensions using State Space Models 3
where $\|x\|^2_{\Sigma} = x^T \Sigma x$ and $c$ is a constant. The log likelihood for a given value of $\psi$ can
thus be obtained directly from the Kalman filter. The expression can then be maximised
numerically yielding the maximum likelihood estimate. Approximate standard errors
can be extracted from numerical second derivatives.
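To illustrate how (3) is used, the following sketch evaluates the log likelihood as a by-product of the filter and maximises it by a crude grid search over a single observation variance. The static-level model, the grid and all parameter values are illustrative choices, not those of the paper (which maximised over $\log\tau/\sigma$ with a numerical optimiser in R):

```python
import numpy as np

def loglik(y, G, F, W, V, m0, C0):
    """l(psi) from (3), dropping the constant c:
    l = -0.5 * sum_t ( log|Q_t| + (y_t - f_t)' Q_t^{-1} (y_t - f_t) ),
    accumulated while running the Kalman filter."""
    m, C, ll = m0, C0, 0.0
    for yt in y:
        a = G @ m
        R = G @ C @ G.T + W
        f = F.T @ a
        Q = F.T @ R @ F + V
        e = yt - f
        ll -= 0.5 * (np.log(np.linalg.det(Q)) + e @ np.linalg.solve(Q, e))
        A = R @ F @ np.linalg.inv(Q)
        m = a + A @ e
        C = R - A @ Q @ A.T
    return ll

# Crude grid search in place of a numerical optimiser: estimate the
# observation variance V of a static-level model y_t = theta + v_t
# (true V = 4 in the simulated data below).
rng = np.random.default_rng(0)
y = [np.array([v]) for v in rng.normal(0.0, 2.0, size=200)]
I = np.eye(1)
grid = np.linspace(0.5, 10.0, 96)
best_V = max(grid, key=lambda v: loglik(y, I, I, 0.0 * I, v * I, np.zeros(1), 10.0 * I))
```

In practice one would hand `loglik` to a numerical maximiser and read off approximate standard errors from numerical second derivatives, as described above.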
the derivatives $\partial \log p(y_t \mid \lambda_t)/\partial \lambda_t$ and $\partial \log p(\omega_t)/\partial \omega_t$ evaluated at $\tilde\lambda_t = F_t^T \tilde\theta_t$ and
$\tilde\omega_t = \tilde\theta_t - G_t \tilde\theta_{t-1}$, respectively. The latter term, at time $t+1$ and evaluated at $\tilde\omega_{t+1}$, is
also needed for insertion in (2).
Since
$$\frac{\partial \log p(y_t \mid \lambda_t)}{\partial \lambda_t} = -2\, \frac{\partial \log p(y_t \mid \lambda_t)}{\partial (y_t - \lambda_t)^2}\,(y_t - \lambda_t)$$
and
$$\frac{\partial \log p(\omega_t)}{\partial \omega_t} = -2\, \frac{\partial \log p(\omega_t)}{\partial \omega_t^2}\,(\theta_t - G_t \theta_{t-1}),$$
we see by comparison with (2) that the approximating model is given by
$$\tilde V_t = -\frac{1}{2}\left[\frac{\partial \log p(y_t \mid \lambda_t)}{\partial (y_t - \lambda_t)^2}\right]^{-1}_{\lambda_t = \tilde\lambda_t}, \qquad
\tilde W_t = -\frac{1}{2}\left[\frac{\partial \log p(\omega_t)}{\partial \omega_t^2}\right]^{-1}_{\omega_t = \tilde\omega_t}.$$
For example, t-distributions or Gaussian mixtures can be approximated by Method 2.
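For instance, for a scaled $t_\nu$ observation density the recipe above yields a closed-form approximating variance. The small sketch below is an illustration only (the function name and the parameters $\nu$, $\sigma^2$ are free choices); it makes the down-weighting of outliers explicit:

```python
def v_tilde_t(y, lam, nu, sigma2):
    """Approximating observation variance V~ = -(1/2) [d log p / d(y-lam)^2]^{-1}
    for a scaled t_nu density with
    log p = const - ((nu+1)/2) * log(1 + (y-lam)^2 / (nu*sigma2)).
    Differentiating with respect to u = (y-lam)^2 gives
    d log p / du = -(nu+1) / (2*(nu*sigma2 + u)),
    so V~ = (nu*sigma2 + (y-lam)^2) / (nu+1) in closed form."""
    u = (y - lam) ** 2
    dlogp_du = -(nu + 1.0) / (2.0 * (nu * sigma2 + u))
    return -0.5 / dlogp_du
```

Near the mode, $\tilde V \approx \nu\sigma^2/(\nu+1)$, while a large residual inflates $\tilde V$, so outlying observations are automatically down-weighted by the approximating Gaussian model.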
We will assume that $\Sigma$ and $F$ are given, so that in the following the posterior is proper.
The posterior distribution of $\theta$ is given by Bayes' Theorem,
where $C = (P + F\Sigma^{-1}F^T)^{-1}$ and $m = C\Sigma^{-1}y$. The aim is to assess the posterior
distribution, $p(\theta \mid y)$, in a computationally attractive way. Note that the matrices to
be inverted in the above expression are of size $IJ \times IJ$.
The Markov random field model is equivalent to the following state space model
in the sense that their posterior distributions are identical. The state space model
evolves along the rows of the lattice rather than along time:
$$\begin{bmatrix} y_i \\ x_i \end{bmatrix} \Big|\, \theta_i \sim N\left( \begin{bmatrix} F_i^T \theta_i \\ H\theta_i \end{bmatrix},\ \begin{bmatrix} \Sigma_i & 0 \\ 0 & \tau_1^2 I_{J-1} \end{bmatrix} \right). \quad (6)$$
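The stacked row observation in (6) is straightforward to assemble numerically. In the sketch below, the difference matrix $H$ used in the test is a hypothetical choice (pairwise differences within a row), since its exact form is defined elsewhere in the paper; the function itself only builds the stacked mean and block-diagonal covariance:

```python
import numpy as np

def stacked_observation(theta_i, F_i, H, Sigma_i, tau1_sq):
    """Mean and covariance of the stacked row observation (y_i, x_i) given
    theta_i in (6): mean (F_i' theta_i, H theta_i) and block-diagonal
    covariance diag(Sigma_i, tau1^2 I)."""
    mean = np.concatenate([F_i.T @ theta_i, H @ theta_i])
    d, k = Sigma_i.shape[0], H.shape[0]
    cov = np.zeros((d + k, d + k))
    cov[:d, :d] = Sigma_i                  # observation block
    cov[d:, d:] = tau1_sq * np.eye(k)      # pseudo-observation block
    return mean, cov
```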
Figure 1. To the left is shown a simulated image. The middle image shows the posterior mode
found by Kalman smoothing using the Gaussian Markov random field model. The right image
shows the residual image.
algorithm. For convenience, we worked with the transformed parameter $\log \tau/\sigma$, allowing
the maximiser to suggest any real number as input. In all runs, we have chosen
$m_{0j} = 128$ and $C_0 = 1000 \cdot I$.
The resulting maximum likelihood estimates were $\hat\tau^2 = 338$ and $\hat\sigma^2 = 117$, and the
posterior mode of $\theta$ is displayed in Figure 1 (middle). The result is very smooth and
edges are blurred, as expected from the model. The residual image in Figure 1 (right)
also suggests that the edges are over-smoothed.
Given a value of $\log \tau/\sigma$, the Kalman smoother took approximately 3 minutes in R
on a SUN Enterprise 220R machine; maximum likelihood estimation of this parameter
took approximately 90 minutes.
Figure 2. The posterior mode (left) after 50 iterations using the iterated extended Kalman
smoother. The residual image is shown in the middle, and to the right is shown the average of
the two variances for each pixel, calculated via (11).
The restored image after 50 iterations of the iterated extended Kalman smoother
applied to the image restoration model is seen in Figure 2 (left), along with the residual
image (middle). The parameters chosen were $\sigma^2 = 60$, $\tau^2 = 50$, $k = 0.95$ and $c = 25$.
These were chosen by tuning, since all attempts to perform maximum likelihood
estimation failed due to numerical instabilities. One run with 50 iterations of the model
took approximately 5 hours in R.
The image to the right in Figure 2 shows the average of the up-down and left-right
variances calculated in each iteration by (11). As seen, the edges are now found,
although the "egg box" function causes some confusion. The edges in the posterior mode
of $\theta$ are clearer than in the result from the Gaussian Markov random field model. The
residual image resembles white noise, indicating a good fit.
6. DISCUSSION
We provide an alternative to MCMC analysis of spatial models. For non-Gaussian
state space models, the iterated extended Kalman smoother is capable of finding an
approximating Gaussian state space model with the same posterior mode. This allows
us to construct Markov random field models with non-Gaussian increments. The
approximating state space model can then be used as an importance density, as described
in Durbin and Koopman (2001), to provide exact sampling of quantities of interest.
In the image restoration example, we experienced a weakness of our method: when
the lattice is high-dimensional, the iterated extended Kalman smoother is slow. For
this reason, we have not employed importance sampling in the example, but the result
from the approximating state space model seems very satisfactory.

We find that the methodology has great potential and a wide range of applications.
This is illustrated in Dethlefsen (2002) with examples from agricultural experiments.
ACKNOWLEDGEMENTS
I am indebted to my Ph.D. supervisor Søren Lundbye-Christensen for inspiring discussions.
REFERENCES
Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion).
J. Roy. Statist. Soc. B 36, 192–236.
Besag, J. and Higdon, D. (1999). Bayesian analysis of agricultural field experiments (with discussion).
J. Roy. Statist. Soc. B 61, 691–746.
Carlin, B. P., Polson, N. G., and Stoffer, D. S. (1992). A Monte Carlo approach to nonnormal and
nonlinear state-space modeling. J. Amer. Statist. Assoc. 87, 493–500.
Carter, C. K. and Kohn, R. (1994). On Gibbs Sampling for State Space Models. Biometrika 81, 541–553.
Christensen, O. F. and Waagepetersen, R. (2002). Bayesian prediction of spatial count data using generalised linear mixed models. Biometrics 58 (to appear).
de Jong, P. and Shephard, N. (1995). The simulation smoother for time series models. Biometrika 82, 339–350.
Dethlefsen, C. (2002). Space Time Problems and Applications. Ph.D. Thesis, Aalborg University.
Doucet, A., Godsill, S. J. and West, M. (2000). Monte Carlo filtering and smoothing with application to
time-varying spectral estimation. Proc. of the IEEE International Conference on Acoustics, Speech
and Signal Processing, volume II, 701–704.
Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State Space Methods. Oxford: Oxford
University Press.
Frühwirth-Schnatter, S. (1994). Data Augmentation and Dynamic Linear Models. J. Time Series Analysis 15, 183–202.
Harvey, A. C. (1989). Forecasting, structural time series models and the Kalman filter. Cambridge: Cambridge University Press.
Ihaka, R. and Gentleman, R. (1996). R: A language for data analysis and graphics. J. Comp. Graph.
Statist. 5, 299–314.
Kitagawa, G. (1987). Non-Gaussian state-space modeling of nonstationary time series (with discussion).
J. Amer. Statist. Assoc. 82, 1032–1063.
Knorr-Held, L. and Rue, H. (2002). On block updating in Markov random field models for disease mapping. Scandinavian J. Statist. (to appear).
Koopman, S. J. (1993). Disturbance smoother for state space models. Biometrika 80, 117–126.
Lavine, M. (1999). Another look at conditionally Gaussian Markov random fields. Bayesian Statistics 6 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.). Oxford: Oxford University Press.
West, M. and Harrison, J. (1997). Bayesian Forecasting and Dynamic Models. New York: Springer.