Arroyo, Mateu - 2004 - Spatio-Temporal Modeling of Benthic Biological Species

Spatio-temporal modeling of benthic biological species
Javier Axis-Arroyo (a), Jorge Mateu (b,*)
(a) Centro de Investigacion y Estudios Avanzados del IPN Unidad Merida, Antigua Carretera a Progreso Km 6, C.P. 97310 A.P. 73 Cordemex, Merida, Yucatan, Mexico
(b) Department of Mathematics, University Jaume I, Campus Riu Sec, E-12071 Castellon, Spain
Received 20 May 2003; revised 15 January 2004; accepted 23 January 2004
Abstract
The spatial and temporal distribution of the number of benthic species located in an important area under ecological stress (Puerto
CALICA, Quintana Roo, Mexico) is analyzed by means of Gaussian Spatial Linear Mixed Models. Following a model-based approach we
derive spatial predictions taking into account temporal variations between May 1996 and June 1999. The proposed models were evaluated in
terms of their ability to detect the underlying spatial structure for further interpolation. Uncertainty in the prediction could be evaluated by
using the Bayesian paradigm. The results can be used as a guide for further environmental management policies in the region.
© 2004 Published by Elsevier Ltd.
Keywords: Bayesian; Ordinary and universal kriging; Benthic biological species; Environmental impact management; Gaussian spatial linear mixed model
1. Introduction
Based on the conclusions of the Summit of Rio de
Janeiro of 1992, the loss of biodiversity has been the main
worry of most of the administrators of natural resources.
They have focused on the aim of maintaining biodiversity
levels, trying to preserve whole habitats to fulfill multi-resource
goals rather than focusing resources on the
conservation of a few species (Fahrig and Merriam, 1994).
The preservation of whole habitats demands the knowl-
edge of the structure of the habitat, the function of each of its
components and their evolution in space and time. Land-
scape ecology is an area of ecology that analyzes the
patterns of the individual components, their interactions and
evolution in time (McGarigal and McComb, 1995). For this
reason, in the 1990s the natural resources administrators
proposed its generalized use in the formulation and
resolution of natural resource problems (Maurer and
Heywood, 1993; McGarigal and McComb, 1995). Geosta-
tistics is a step forward in landscape ecology.
In this context, when the habitat and the biodiversity are
altered due to human activities, analyzing the environmental
impact is of prime interest for further environmental
management policies. This is the case we present in this
paper.
A feature common to the earth sciences is the nature of
their data. Most of the properties of interest vary
continuously in space and time and cannot be measured or
recorded everywhere. Thus, to represent their variation the
values of individual variables or class types at unsampled
locations must be estimated from information recorded at
sample sites. The need to define spatial variation precisely is
clear and geostatistics is largely the application of this
theory. It embraces a set of stochastic techniques that take
into account both the random and structured nature of
spatial variables, the spatial distribution of sampling sites
and the uniqueness of any spatial observation (Journel and
Huijbregts, 1978; Goovaerts, 1997).
This paper considers the analysis of data which can be
considered as a partial realization of a random function
(stochastic process) over a region, i.e. a spatially continuous
process, as characterized by Cressie (1993). Typically,
samples are taken at a finite set of points in the region and
used to estimate quantities of interest such as the values of
the property of interest at other locations. Data of this kind
are often called geostatistical data (Matheron, 1962;
Cressie, 1993). Geostatistical methods find wide applications,
for example in soil science, meteorology, hydrology
and ecology.
0301-4797/$ - see front matter © 2004 Published by Elsevier Ltd.
doi:10.1016/j.jenvman.2004.01.008
Journal of Environmental Management 71 (2004) 67–77
www.elsevier.com/locate/jenvman
* Corresponding author. Tel.: +34-964-728391; fax: +34-964-728429.
E-mail address: mateu@mat.uji.es (J. Mateu).
An important tool in geostatistics is the kriging predictor.
The term kriging refers to a least square linear predictor
which, under certain stationarity assumptions, requires at
least the knowledge of the covariance parameters and the
functional form for the mean of the underlying random
function. In practical situations, the parameters are usually
not known. The kriging predictor does not take their
uncertainty into account, but uses plug-in estimates as if
they were true. Bayesian inference provides a way to
incorporate parameter uncertainty in the prediction by
treating the parameters as random variables and integrating
over the parameter space to obtain the predictive distri-
bution of any quantity of interest (Diggle and Ribeiro,
2002).
The basic format for univariate geostatistical data is $\{(u_i, z(u_i)) : i = 1, \ldots, n\}$, where $u_i$ identifies a spatial location and $z(u_i)$ a scalar measurement taken at the location $u_i$. It follows that the basic form of a geostatistical model is a real-valued stochastic process $\{Z(u) : u \in A\}$, which is typically considered in turn to be a partial realization of a stochastic process $\{Z(u) : u \in \mathbb{R}^2\}$. Often, the measurement process $Z(u_i)$ can be regarded as a noisy version of an underlying random variable $S(u_i)$, the value at location $u_i$ of a process $\{S(u) : u \in \mathbb{R}^2\}$ which is of primary scientific interest. We call $S(u)$ the signal. The basic model is then extended to one with two ingredients: a stochastic process $S(u)$ and a statistical model for the measurements, $Z = (Z_1, \ldots, Z_n)$ conditional on $\{S(u) : u \in \mathbb{R}^2\}$.
Bayesian methodologies can be used to combine prior
information on an arbitrary number of parameters in
complex biological and environmental process models,
with the information content of data, to obtain probabilistic
parameter estimates. The main distinguishing feature of the
Bayesian approach is that it makes use of more information
than the frequentist approach. Whereas the latter is based on
analysis of what we could call hard data, that is data which
are generally well-structured and derived from a well-defined
observation process, Bayesian statistics also accommodates
prior information which is usually less well
specified and may even be subjective. This makes Bayesian
methods potentially more powerful, but also implies a need
for extra care in their use.
In this context, the development of Bayesian kriging
(BK) methodology has enlarged the use of geostatistics in
ecological research, particularly after the publications of
Kitanidis (1986, 1997), Omre (1987) and Omre and
Halvorsen (1989). As a consequence of the increasing use
of BK, several studies have been conducted comparing BK
with other kriging types; the works of Qian (1997) and
Krivoruchko (2001) stand out. Nevertheless, up to now, the
BK has not been evaluated with data based on the spatial
distribution of biological species.
In this paper, we focus on the spatio-temporal distri-
bution of benthic species in an area (next to Puerto
CALICA, Quintana Roo State, Mexico) under ecological
stress because of the building of new parts of the port and
also because of the heavy traffic in the central canal. We
present a methodological approach based on Gaussian
Spatial Linear Mixed Models (GSLMM) as a general and
flexible class of models to handle the spatial variation
shown by individuals in a particular ecological environ-
ment. The use of a Bayesian approach is also compatible
with the modeling by means of GSLMM and as such it is
shown in the paper.
The plan of the paper is as follows. Section 2 develops
the hierarchical structure of the GSLMM models. Section 3
focuses on several ways of spatial prediction. The Bayesian
paradigm is presented in Section 4. The real data analysis
comes in Section 5. Finally the paper ends with some
conclusions.
2. Model formulation: GSLMM
2.1. Data structure
Consider a finite set of spatial sample locations $u_1, u_2, \ldots, u_n$ within a region $D$ and denote $u = (u_1, u_2, \ldots, u_n)$. Geostatistical data consist of measurements taken at the sample locations $u$. The data vector is denoted by $z(u) = (z(u_1), \ldots, z(u_n))$, and the data are regarded as being a realization of a spatial stochastic process $\{Z(u), u \in D\}$. An arbitrary location is denoted by $u_i$ and the region $D$ is a fixed subset of $\mathbb{R}^d$ with positive $d$-dimensional volume. We assume that $u$ varies continuously through the region $D$.
Now, assume observations from the additive model
$$z(u_i) = f_X(u_i) + \varepsilon(u_i), \quad i = 1, \ldots, n \qquad (1)$$
where $f$ is the function of interest and possibly depends on spatial covariates given by the matrix $X$. The random components $\varepsilon(u_i)$ can be associated with measurement errors and assumed to be Gaussian spatial processes. This model is known as the full interactive model.
2.2. Gaussian spatial linear mixed models
2.2.1. Bayesian paradigm
Bayesian statistics is a very large field, since any problem that you can address in a non-Bayesian (frequentist) way can also be tackled in a Bayesian way, so in a sense this research area covers the whole of statistics. Indeed, there are several kinds of problems that can really only be handled using Bayesian statistics.
Bayesian methodologies can be used to combine prior
information on an arbitrary number of parameters in
complex biological and environmental process models,
with the information content of data, to obtain probabilistic
parameter estimates.
From the early part of the twentieth century until the 1970s there
was basically only one theory of statistical inference, the
frequentist approach. The modern Bayesian approach began
as a serious alternative in the 1960s and 1970s, and has
grown into an increasingly substantial and effective
methodology since. In the 1990s, new computational
procedures have even made Bayesian methods the only
viable approach for some important kinds of problems.
There is now a very great interest in Bayesian methods of all
kinds. There are several good sources of Bayesian
information, and amongst them Bernardo and Smith
(1994) is an outstanding reference.
It makes a great deal of practical sense to use all the information available, old and/or new, objective or subjective, when making decisions under uncertainty. This is especially true when the consequences of the decisions can have a significant impact, financial or otherwise. Most of us make everyday personal decisions this way, using an intuitive process based on our experience and subjective judgments. Mainstream statistical analysis, however, seeks objectivity by generally restricting the information used in an analysis to that obtained from a current set of clearly relevant data. Prior knowledge is not used except to suggest the choice of a particular population model to fit to the data, and this choice is later checked against the data for reasonableness. The Bayesian approach, on the other hand, treats these population model parameters as random, not fixed, quantities. Before looking at the current data, we use old information, or even subjective judgments, to construct a prior distribution model for these parameters. This model expresses our starting assessment about how likely various values of the unknown parameters are. We then make use of the current data (via Bayes' formula) to revise this starting assessment, deriving what is called the posterior distribution model for the population model parameters. Parameter estimates, along with confidence intervals (known as credibility intervals), are calculated directly from the posterior distribution. Credibility intervals are legitimate probability statements about the unknown parameters, since these parameters now are considered random, not fixed.
It is unlikely in most applications that data will ever exist
to validate a chosen prior distribution model. Parametric
Bayesian prior models are chosen because of their flexibility
and mathematical convenience. In particular, conjugate
priors are a natural and popular choice of Bayesian prior
distribution models.
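The prior-to-posterior revision described above can be illustrated with the simplest conjugate case: a Gaussian prior on an unknown mean combined with Gaussian observations of known variance. The sketch below is only an illustration; the function names, the prior values and the four observations are made up for the example and do not come from the study.

```python
def normal_posterior(prior_mean, prior_var, data, noise_var):
    """Conjugate update for a Gaussian mean with known observation variance.

    Precisions (inverse variances) add, and the posterior mean is the
    precision-weighted average of the prior mean and the sample mean.
    """
    n = len(data)
    prior_precision = 1.0 / prior_var
    data_precision = n / noise_var
    post_var = 1.0 / (prior_precision + data_precision)
    sample_mean = sum(data) / n
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * sample_mean)
    return post_mean, post_var


def credibility_interval(mean, var, z=1.96):
    """Central interval of a Gaussian posterior (about 95% for z = 1.96)."""
    half_width = z * var ** 0.5
    return mean - half_width, mean + half_width


# Hypothetical numbers, chosen only to show the mechanics.
counts = [18.0, 22.0, 20.0, 24.0]
post_mean, post_var = normal_posterior(prior_mean=15.0, prior_var=25.0,
                                       data=counts, noise_var=16.0)
low, high = credibility_interval(post_mean, post_var)
```

The posterior mean falls between the prior mean and the sample mean, and the posterior variance is smaller than both the prior variance and the sampling variance of the mean, which is the sense in which the data revise the starting assessment.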
2.2.2. GSLMM model through the full interactive model
In this section we further develop the full interactive model giving an explicit expression for function $f$ by means of GSLMM models.
The model assumed here considers that the variable $Z$ is a noisy version of a latent spatial process, the signal $S(u)$. The noises are assumed to be Gaussian and conditionally independent given $S(u)$. The model is specified by:
(1) Covariates. The mean part of the model is described by the term $X(u_i)'\beta$, where $X(u_i)'$ denotes a vector of spatially referenced non-random variables at location $u_i$ and $\beta$ is the mean parameter.
(2) The underlying spatial process $\{S(u) : u \in \mathbb{R}^d\}$ is a stationary Gaussian process with zero mean, variance $\sigma^2$ and correlation function $\rho(h; \phi)$, where $\phi$ is the correlation function parameter and $h$ is the vector distance between two locations.
(3) Conditional independence. The variables $Z(u_i)$ are assumed to be Gaussian and conditionally independent given the signal
$$Z(u_i) \mid S \sim N(X(u_i)'\beta + S(u_i), \tau^2) \qquad (2)$$
In some applications we may want to consider a decomposition of the signal $S(u)$ into a sum of latent processes $T_k(u)$ scaled by $\sigma_k$. Then, the model can be rewritten, in a hierarchical way, as:
Level 1
$$Z(u) = X(u)\beta + S(u) + \varepsilon(u) = X(u)\beta + \sum_{k=1}^{K} \sigma_k T_k(u) + \varepsilon(u) \qquad (3)$$
Level 2
$T_k(u) \sim N(0, R_k(\phi_k))$; $T_1, \ldots, T_K$ mutually independent and $\varepsilon(u) \sim N(0, \tau^2 I)$.
Level 3
$(\beta, \sigma^2, \phi, \tau^2) \sim pr(\cdot)$, where $pr(\cdot)$ defines a prior probability distribution (Bernardo and Smith, 1994).
The model components are described by:
$Z(u)$ is a random vector with components $Z(u_1), \ldots, Z(u_n)$, related to the measurements at sample locations.
$X(u)\beta = \mu(u)$ is the expectation of $Z(u)$. $X(u)$ is a matrix of fixed covariates measured at sample locations $u$. $\beta$ is a vector parameter. If there are no covariates, $X = 1$ and the mean reduces to a single constant value at all locations.
$T_k(u)$ is a random vector at sample locations, of a standardized latent stationary spatial process $T_k$. It has zero mean, variance one and correlation matrix $R_k(\phi_k)$. The elements of $R_k(\phi_k)$ are given by a correlation function $\rho_k(h; \phi_k)$ with parameter $\phi_k$. If the process is isotropic this parameter and the distance $h$ are scalar parameters. The processes $T_1, \ldots, T_K$ are mutually independent. The signal $S$ is defined by the sum of scaled latent processes $S(u) = \sum_{k=1}^{K} \sigma_k T_k(u)$.
$\sigma_k$ is a scale parameter.
$\varepsilon(u)$ denotes the error (noise) vector at the sample locations $u$, i.e. a spatially independent process (spatial white noise) with zero mean and variance $\tau^2$.
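To make the hierarchy concrete, the following sketch simulates one realization of Levels 1 and 2 with a single latent process ($K = 1$) and a constant mean. It is an illustration under stated assumptions rather than the authors' implementation: the exponential correlation function, the parameter values, the random sample locations and the function names are all chosen for the example, and NumPy is assumed to be available.

```python
import numpy as np


def exponential_correlation(coords, phi):
    """Correlation matrix rho(h; phi) = exp(-h / phi) for isotropic distances h."""
    h = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-h / phi)


def simulate_gslmm(coords, X, beta, sigma, phi, tau, rng):
    """Draw Z(u) = X beta + sigma * T(u) + eps(u) at the sample locations."""
    n = coords.shape[0]
    R = exponential_correlation(coords, phi)
    # Cholesky factor of R turns white noise into a draw with correlation R;
    # the small jitter only guards against round-off.
    chol = np.linalg.cholesky(R + 1e-10 * np.eye(n))
    T = chol @ rng.standard_normal(n)     # standardized latent process
    eps = tau * rng.standard_normal(n)    # spatial white noise, variance tau^2
    return X @ beta + sigma * T + eps


rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 100.0, size=(50, 2))   # hypothetical locations
X = np.ones((50, 1))                             # no covariates: X = 1
z = simulate_gslmm(coords, X, beta=np.array([20.0]),
                   sigma=3.0, phi=25.0, tau=0.5, rng=rng)
```

Level 3 is not simulated here: the parameters are held fixed, which corresponds to the known-parameter setting used for prediction in the next section.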
3. Spatial prediction through a model-based approach
In geostatistical problems, often the main interest is not parameter estimation but prediction of the variable at a set of locations. Denote by $Z(u_0)$ ($Z_0$ from now on) the variable to be predicted at the location $u_0$.
The optimal point predictor, defined as the one which minimizes the prediction mean square error (MSE), is given by
$$\hat{Z}_0 = E[Z_0 \mid Z] \qquad (4)$$
This predictor is called the least squares predictor and its prediction variance is given by $\mathrm{Var}[Z_0 \mid Z]$.
The values of the conditional expectation (4) can be
calculated only if the model distributions are fully specified
and the parameters are known. In practice the model
parameters are unknown and an approximation to the
conditional expectation may be then used. Finding the
conditional expectation (4) or an approximation to it, is a
central problem in geostatistics, and several methods have
been proposed.
If complete parametric specification for the model components is assumed the conditional expectation (4) can be assessed (Diggle et al., 1998; Diggle and Ribeiro, 2002). Consider, for example, the Gaussian model specified in Eq. (3) extended to include both $Z$ and $Z_0$. The joint distribution is given by
$$(Z, Z_0 \mid \beta, \sigma^2, \phi, \tau^2) \sim N\left( \begin{bmatrix} X \\ X_0 \end{bmatrix} \beta,\; \tau^2 I + \begin{bmatrix} V_z(\sigma^2, \phi) & v(\sigma^2, \phi) \\ v'(\sigma^2, \phi) & V_0(\sigma^2, \phi) \end{bmatrix} \right) \qquad (5)$$
Under this model the conditional expectation (4) can be
directly obtained if all the parameters are known. Then it
coincides with the simple kriging predictor. For the more
realistic scenario of unknown parameters, both classical
likelihood-based and Bayesian paradigms can be adopted.
Under the model-based perspective, assuming the model in
Eq. (3) with known parameters, the prediction problem is
straightforward. If the subscript $p$ denotes a known parameter, and $V(\sigma^2_p, \phi_p)$ and $v(\sigma^2_p, \phi_p)$ from Eq. (5) are denoted by $V$ and $v$, the predictive distribution (Bernardo and Smith, 1994) is given by
$$(Z_0 \mid Z, \beta_p, \sigma^2_p, \phi_p, \tau^2_p) \sim N\left( X_0\beta_p + v'(\tau^2_p I + V_z)^{-1}(z - X\beta_p),\; \tau^2_p I + V_0 - v'(\tau^2_p I + V_z)^{-1} v \right) \qquad (6)$$
Therefore, point predictors and associated uncertainty
can be easily obtained. The mean of Eq. (6) coincides
with the minimum MSE predictor, the conditional
expectation (4).
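The mean and covariance in Eq. (6) translate almost line by line into code. The sketch below assumes an exponential covariance $\sigma^2 \exp(-h/\phi)$ and uses four made-up observations on a square; the function names, the covariance family and all numbers are assumptions made for illustration, not values from the study.

```python
import numpy as np


def exp_cov(a, b, sigma2, phi):
    """Covariance sigma2 * exp(-h / phi) between two sets of locations."""
    h = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-h / phi)


def simple_kriging(coords, z, X, coords0, X0, beta, sigma2, phi, tau2):
    """Predictive mean and covariance of Eq. (6) with all parameters known."""
    n, m = coords.shape[0], coords0.shape[0]
    Vz = exp_cov(coords, coords, sigma2, phi)     # n x n
    v = exp_cov(coords, coords0, sigma2, phi)     # n x m
    V0 = exp_cov(coords0, coords0, sigma2, phi)   # m x m
    A = tau2 * np.eye(n) + Vz
    mean = X0 @ beta + v.T @ np.linalg.solve(A, z - X @ beta)
    cov = tau2 * np.eye(m) + V0 - v.T @ np.linalg.solve(A, v)
    return mean, cov


# Hypothetical data: four sites on a 10 x 10 square.
coords = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
z = np.array([18.0, 22.0, 20.0, 25.0])
X = np.ones((4, 1))
X0 = np.ones((1, 1))
beta = np.array([21.0])

# Without a nugget the predictor reproduces the datum at a sampled site.
mean_at_datum, cov_at_datum = simple_kriging(
    coords, z, X, coords[[1]], X0, beta, sigma2=9.0, phi=10.0, tau2=0.0)

# Far from every site the prediction falls back to the mean X0 beta.
far = np.array([[1.0e6, 1.0e6]])
mean_far, cov_far = simple_kriging(
    coords, z, X, far, X0, beta, sigma2=9.0, phi=10.0, tau2=0.5)
```

The two usage cases show the limiting behaviour of Eq. (6): exact interpolation with zero variance at a sampled location when $\tau^2 = 0$, and the prior mean with variance $\tau^2 + \sigma^2$ where the data carry no information.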
4. Bayesian inference
Let us focus now on parameter estimation and prediction
results for a Bayesian analysis of geostatistical data. For this
aim consider a simpler model than Eq. (3) defined as: $Z(u) = X(u)\beta + \sigma T(u)$, with $T(u) \sim N(0, R_z(\phi))$ and in a third level defining a prior probability (Bernardo and Smith, 1994) $pr(\beta, \sigma^2, \phi)$. One of the most important issues within this context is the analysis of the uncertainty associated with the mean parameter.
4.1. Uncertainty in the mean parameter
In this case only the mean parameter $\beta$ is unknown. The covariance parameters are known and the covariance matrix is written as $V(\sigma^2_p, \phi_p) = \sigma^2_p R(\phi_p)$, and denoted by $\sigma^2_p R$. The model considered here corresponds to the common situation in geostatistics where the mean is filtered out and the covariance parameters are estimated by some method and plugged in for predictions.
Considering the above particular model, the joint probability distribution for $(Z, Z_0)$ is a simpler version of Eq. (5), without the nugget effect and with only one latent process
$$(Z, Z_0 \mid \beta, \sigma^2_p, \phi_p) \sim N\left( \begin{bmatrix} X \\ X_0 \end{bmatrix} \beta,\; \sigma^2_p \begin{bmatrix} R_z & r \\ r' & R_0 \end{bmatrix} \right) \qquad (7)$$
and the associated marginal and conditional distributions are
$$(Z \mid \beta, \sigma^2_p, \phi_p) \sim N(X\beta, \sigma^2_p R_z) \qquad (8)$$
and
$$(Z_0 \mid Z, \beta, \sigma^2_p, \phi_p) \sim N\left( X_0\beta + r'R_z^{-1}(z - X\beta),\; \sigma^2_p (R_0 - r'R_z^{-1} r) \right) \qquad (9)$$
4.1.1. Posterior for model parameters
Assuming a Normal or Gaussian prior for the mean parameter, i.e. considering a conjugate prior (Bernardo and Smith, 1994)
$$(\beta \mid \sigma^2_p, \phi_p) \sim N(m_\beta, \sigma^2_p V_\beta) \qquad (10)$$
the posterior (Bernardo and Smith, 1994) is given by
$$(\beta \mid Z, \sigma^2_p, \phi_p) \sim N\left( (V_\beta^{-1} + X'R_z^{-1}X)^{-1}(V_\beta^{-1} m_\beta + X'R_z^{-1} z),\; \sigma^2_p (V_\beta^{-1} + X'R_z^{-1}X)^{-1} \right) \sim N(\hat{\beta}_N, \sigma^2_p V_{\hat{\beta}_N}) \qquad (11)$$
Now, the mean and variance of the predictive distribution (Bernardo and Smith, 1994) are
$$E[Z_0 \mid Z] = (X_0 - r'R_z^{-1}X)(V_\beta^{-1} + X'R_z^{-1}X)^{-1} V_\beta^{-1} m_\beta + \left[ r'R_z^{-1} + (X_0 - r'R_z^{-1}X)(V_\beta^{-1} + X'R_z^{-1}X)^{-1} X'R_z^{-1} \right] z \qquad (12)$$
$$\mathrm{Var}[Z_0 \mid Z] = \sigma^2_p \left[ R_0 - r'R_z^{-1} r + (X_0 - r'R_z^{-1}X)(V_\beta^{-1} + X'R_z^{-1}X)^{-1}(X_0 - r'R_z^{-1}X)' \right] \qquad (13)$$
In the case of using a flat prior (Bernardo and Smith, 1994), i.e. $p(\beta) \propto 1$, the mean and variance of the predictive distribution can be calculated from Eqs. (12) and (13) with $V_\beta^{-1} \equiv 0$. Finally, the posterior for known mean parameter $\beta$ can also be obtained from Eqs. (12) and (13) considering $V_\beta^{-1} \gg X'R_z^{-1}X$ or $V_\beta \equiv 0$.
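A compact way to see how Eqs. (11)–(13) behave is to code them once and vary the prior precision $V_\beta^{-1}$. The sketch below is a minimal illustration, assuming an exponential correlation function and four made-up observations; the function names and all numerical values are hypothetical. Passing no prior reproduces the flat-prior case ($V_\beta^{-1} \equiv 0$), and a very large prior precision approximates the known-mean case.

```python
import numpy as np


def exp_corr(a, b, phi):
    """Correlation exp(-h / phi) between two sets of locations."""
    h = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-h / phi)


def bayes_mean_prediction(coords, z, X, coords0, X0, sigma2, phi,
                          m_beta=None, V_beta_inv=None):
    """Posterior of beta (Eq. 11) and predictive mean/variance (Eqs. 12-13).

    If V_beta_inv is None a flat prior is used (V_beta^{-1} = 0); otherwise
    m_beta must be supplied together with V_beta_inv.
    """
    p = X.shape[1]
    if V_beta_inv is None:
        V_beta_inv = np.zeros((p, p))
        m_beta = np.zeros(p)
    Rz = exp_corr(coords, coords, phi)
    r = exp_corr(coords, coords0, phi)
    R0 = exp_corr(coords0, coords0, phi)
    Rinv_X = np.linalg.solve(Rz, X)
    Rinv_z = np.linalg.solve(Rz, z)
    Rinv_r = np.linalg.solve(Rz, r)
    A = V_beta_inv + X.T @ Rinv_X                  # posterior precision / sigma2
    V_post = np.linalg.inv(A)
    beta_hat = V_post @ (V_beta_inv @ m_beta + X.T @ Rinv_z)
    B = X0 - r.T @ Rinv_X                          # X0 - r' Rz^{-1} X
    mean = B @ beta_hat + r.T @ Rinv_z             # Eq. (12)
    var = sigma2 * (R0 - r.T @ Rinv_r + B @ V_post @ B.T)   # Eq. (13)
    return beta_hat, sigma2 * V_post, mean, var


# Hypothetical data: four sites on a 10 x 10 square, prediction at the centre.
coords = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
z = np.array([18.0, 22.0, 20.0, 25.0])
X = np.ones((4, 1))
coords0 = np.array([[5.0, 5.0]])
X0 = np.ones((1, 1))

beta_flat, _, mean_flat, var_flat = bayes_mean_prediction(
    coords, z, X, coords0, X0, sigma2=9.0, phi=10.0)
beta_tight, _, mean_tight, var_tight = bayes_mean_prediction(
    coords, z, X, coords0, X0, sigma2=9.0, phi=10.0,
    m_beta=np.array([21.0]), V_beta_inv=1.0e12 * np.eye(1))
```

With the flat prior the posterior mean of $\beta$ is the generalized least squares estimate, and the predictive variance is never smaller than in the tight-prior case, because the last term of Eq. (13) accounts for the uncertainty in the mean.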
4.2. Relationships with conventional geostatistical methods
Some of these results can be related to conventional
geostatistical methods (Journel and Huijbregts, 1978;
Goovaerts, 1997). Under the Bayesian perspective these
geostatistical methods can be interpreted as prediction
procedures which only take into account the uncertainty in
the mean parameters.
1. If $X \equiv 1$ and $X_0 \equiv 1$ (constant mean), the mean and variance when a flat prior is used coincide with the ordinary kriging (OK) predictor and the ordinary kriging variance ($\sigma^2_{OK}$).
2. If $X$ and $X_0$ are trend matrices with rows given by data coordinates or a function of them, the mean and variance when a flat prior is used coincide with the universal or trend kriging (UK or KT) predictor and the universal or trend kriging variance ($\sigma^2_{KT}$).
3. If $X$ and $X_0$ are trend matrices with covariates measured at data and prediction locations, respectively, the mean and variance when a flat prior is used coincide with the kriging with external trend (KTE) predictor and the kriging with external trend variance ($\sigma^2_{KTE}$).
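The first of these equivalences can be checked numerically. The sketch below computes the flat-prior predictive mean and variance from Eqs. (12) and (13) with $X \equiv 1$, and compares them with ordinary kriging solved in its classical form, where the weights are constrained to sum to one through a Lagrange multiplier. The exponential correlation function, the data and the function names are assumptions made for the example.

```python
import numpy as np


def exp_corr(a, b, phi):
    """Correlation exp(-h / phi) between two sets of locations."""
    h = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-h / phi)


def flat_prior_prediction(coords, z, coords0, sigma2, phi):
    """Eqs. (12)-(13) with V_beta^{-1} = 0, X = 1 and X0 = 1, one location."""
    n = len(z)
    X = np.ones((n, 1))
    X0 = np.ones((1, 1))
    Rz = exp_corr(coords, coords, phi)
    r = exp_corr(coords, coords0, phi)
    Rinv_X = np.linalg.solve(Rz, X)
    Rinv_z = np.linalg.solve(Rz, z)
    Rinv_r = np.linalg.solve(Rz, r)
    A = X.T @ Rinv_X
    B = X0 - r.T @ Rinv_X
    beta_hat = np.linalg.solve(A, X.T @ Rinv_z)
    mean = (B @ beta_hat + r.T @ Rinv_z).item()
    var = (sigma2 * (1.0 - r.T @ Rinv_r + B @ np.linalg.solve(A, B.T))).item()
    return mean, var


def ordinary_kriging(coords, z, coords0, sigma2, phi):
    """Classical OK system: C w + mu 1 = c0 subject to sum(w) = 1."""
    n = len(z)
    C = sigma2 * exp_corr(coords, coords, phi)
    c0 = sigma2 * exp_corr(coords, coords0, phi)[:, 0]
    K = np.zeros((n + 1, n + 1))
    K[:n, :n] = C
    K[:n, n] = 1.0
    K[n, :n] = 1.0
    solution = np.linalg.solve(K, np.append(c0, 1.0))
    w, mu = solution[:n], solution[n]
    return float(w @ z), float(sigma2 - w @ c0 - mu)


# Hypothetical data: four sites on a 10 x 10 square, prediction at the centre.
coords = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
z = np.array([18.0, 22.0, 20.0, 25.0])
coords0 = np.array([[5.0, 5.0]])

mean_bayes, var_bayes = flat_prior_prediction(coords, z, coords0, 9.0, 10.0)
mean_ok, var_ok = ordinary_kriging(coords, z, coords0, 9.0, 10.0)
```

The two routes agree to numerical precision, which is the sense in which ordinary kriging accounts only for the uncertainty in the mean parameter while treating the covariance parameters as known.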
5. Spatio-temporal analysis of benthic species
5.1. Region of study, data and aims
The loss of biodiversity has been the main worry of most
of the administrators of natural resources. They have paid
particular attention to maintaining biodiversity levels of
certain environmental habitats. The preservation of whole
habitats demands the knowledge of the structure of the
habitat, the function of each of its components and their
evolution in space and time.
We pay attention here to the environmental impact that
human activities might have caused in a marine environ-
ment. Port CALICA is located in the north of Quintana Roo
State, Mexico, 8.7 km from Playa del Carmen, next to
Xcaret Tourist Center (Fig. 1). In 1994 a group of
researchers of CINVESTAV-Merida performed a field
sampling before the construction of Port CALICA to set
down the ecological base lines for the project Port for Ferry
(PF). They analyzed the following parameters in both the
dock of the port and the bordering marine zone: hydrology,
primary productivity, hydrocarbons and benthic and fish
communities. The reported results described quite clearly
the environmental conditions previous to the project,
providing comparative parameters of the environmental
conditions previous to the construction of the port. The
study reported a biological cover of the marine bottom
bordering Port CALICA of 45%. After the construction of
the PF project, CINVESTAV-Merida carried out eight field
samplings to monitor the environmental state of the zone,
analyzing the same above mentioned parameters. However,
in this paper we only focus on and analyze the data sampled
in May 1996 and June 1999.
The data were obtained by means of the phototransect technique (Ohlhorst et al., 1988). In each transect 18 photographs were taken, each photograph representing an area of 1904 cm² (56 × 34 cm) of the marine bottom. Therefore, each transect represented an area of 3.427 m². The study area was divided into six sampling zones, two phototransects were taken in each sampling zone and each phototransect was geopositioned (GPS Magellan) to ensure constant spatial location. Each photograph was digitized to determine the number and type of benthic species and was spatially located by defining a spatial system of XY coordinates. The total number of species defined the stochastic random field, Z say, to be interpolated and predicted.
The study area is located in a zone between coral reefs,
therefore it has scleractinian and gorgonian species but
they do not form a coral reef. The species distribution
shows an increase in the number of species with depth.
The dominant group is the algae that presents a uniform
distribution, the gorgonians and sponges show a scattered
distribution throughout the area, and the scleractinians
are distributed in small dispersed patches (Jordan, 1979,
1992; Chavez, 1996; Torruco and Gonzalez, 1996).
Fig. 1. Map of the study region.
The main goal is to analyze the spatial distribution of the
number of benthic species that can be found in the selected
region (Fig. 1) through its variation with time. Other
interesting aims are (a) finding possible relationships between the spatial distribution of the number of species, the heavy traffic in the port and the building works in the zone, and (b) analyzing patch sizes. More generally, the analysis should help define sensible policies concerning environmental management.
For the geostatistical analysis several software packages were used: (a) for variogram analysis, GS+ Geostatistics 3.2 (Gamma Design Software, 1998); (b) for kriging modeling and validation, S-Plus 2000 (MathSoft, 2000a), the library S+SpatialStats (MathSoft, 2000b) and the library GeoS (Ribeiro and Diggle, 1999); (c) for model editing and drawing, SURFER 7.0 (Golden Software, 1999).
The methodology proposed is based on GSLMM as
presented in earlier sections. These models include the well-
known ordinary kriging (OK), universal kriging (UK) and
can also be used for Bayesian kriging (BK). In this section
we focus on these three models and present comparisons
amongst them. To analyze the goodness-of-fit of the three kriging techniques under the formal and joint evaluation by means of the GSLMM models, the cross-validation results were contrasted to the original observations in situ and to the spatial arrangements of the species that were reported by the Mexican Caribbean Office.
The random field data based on the total number of species showed a non-Gaussian distribution in May 1996 and a Gaussian one in June 1999. Therefore the data of May 1996 were transformed by using the Box–Cox transformation to ensure Gaussian distribution in the data and consequently a proper use of the geostatistical methodology.
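For readers unfamiliar with it, the Box–Cox family maps a positive value $z$ to $(z^\lambda - 1)/\lambda$, with the logarithm as the limiting case $\lambda = 0$. The sketch below shows the transformation and its inverse, which is needed to bring predictions back to the original scale. The value of $\lambda$ used for the May 1996 data is not reported in the text, so the one below is purely hypothetical, as are the counts and the function names.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def inverse_box_cox(values, lam):
    """Back-transform from the Box-Cox scale to the original scale."""
    if lam == 0:
        return [math.exp(v) for v in values]
    return [(lam * v + 1.0) ** (1.0 / lam) for v in values]


# Hypothetical species counts and a hypothetical lambda.
counts = [11.0, 14.0, 19.0, 27.0, 36.0]
transformed = box_cox(counts, lam=0.5)
recovered = inverse_box_cox(transformed, lam=0.5)
```

In practice $\lambda$ is usually chosen by maximum likelihood or by inspecting normality diagnostics on the transformed data; that step is omitted here.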
5.2. Geostatistical analysis
The first step in a geostatistical analysis is detection of
the spatial dependence and evaluation of its strength. This is
done by means of the variogram analysis.
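As an illustration of what the variogram analysis computes, the sketch below implements the classical method-of-moments estimator: for each distance class, half the average squared difference between pairs of observations separated by a distance in that class. The binning scheme, the function name and the toy data are assumptions for the example; the study itself used dedicated software for this step.

```python
import math


def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Method-of-moments semivariogram on equal-width distance bins.

    Returns a list of (bin centre, semivariance, number of pairs) for the
    bins that contain at least one pair.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])
            k = int(h // lag_width)
            if k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    return [((k + 0.5) * lag_width, sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_lags) if counts[k] > 0]


# Toy example: three sites on a line, one unit apart.
toy_coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
toy_values = [0.0, 1.0, 3.0]
gamma = empirical_semivariogram(toy_coords, toy_values, lag_width=1.0, n_lags=3)
```

A parametric model (for instance exponential) is then fitted to these points to obtain the nugget, sill and range that feed the kriging equations.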
The fitted parameters of several variogram models for kriging are shown in Table 1. The values of nugget, nugget-sill relationship (NSR) and proportion of spatial structure (PSS) indicated that the data obtained in the temporal samplings (May 1996 and June 1999) are representative of the spatial variations of the benthic species number and it is the spatial component that explains the sampled variation. In June 1999, a bigger NSR was observed together with a smaller PSS, indicating that there are factors that influence the dynamics of the benthic community that are not being reflected in the spatial component.
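The text does not give formulas for NSR and PSS. The values in Table 1 are, however, consistent with NSR being the nugget expressed as a percentage of the sill and PSS being the fraction of the sill not explained by the nugget. The short sketch below works under that assumption; the function names are hypothetical.

```python
def nugget_sill_relationship(nugget, sill):
    """Nugget as a percentage of the sill (assumed definition of NSR)."""
    return 100.0 * nugget / sill


def proportion_spatial_structure(nugget, sill):
    """Share of the sill attributed to spatial structure (assumed PSS)."""
    return (sill - nugget) / sill


# Nugget and sill values as listed in Table 1.
nsr_may = nugget_sill_relationship(0.001, 2.7583)
pss_may = proportion_spatial_structure(0.001, 2.7583)
nsr_june = nugget_sill_relationship(18.400, 131.539)
pss_june = proportion_spatial_structure(18.400, 131.539)
```

Under these definitions a larger NSR necessarily comes with a smaller PSS, which matches the pattern described for June 1999.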
The cross-validation analysis (Table 2) reported reliable
predictions of models and a low standard error in the models
based on the three kriging types. The smallest prediction
standard error for both temporal samplings was observed
when Bayesian kriging was used, while the biggest standard
error was associated with OK in both samplings.
Fig. 2 shows the results of the prediction (and
corresponding standard errors) of the total number of
benthic species in the two temporal samplings obtained with
the three kriging types. In May 1996, a maximum of 30 species/m² and a minimum of 10 species/m² was predicted, and in June 1999 a maximum of 24 species/m² and a minimum of zero.
The pairwise differences between predictions under OK, UK and BK models are represented in Fig. 3. Basically, prediction patterns under OK and UK were quite similar, particularly for May 1996. The OK and UK models presented a high level of similarity (89.39%) in May 1996. They covered the same range of species (from 10 to 30 species/m²), had the same number of patches (11) and the same classes of patches. The associated standard error plots also showed a high similarity, presenting a range within 4 species/m², with a central patch with zero error, an underestimation of the number of species in the area far from the coast and an overestimation in the zone near the coast.
In May 1996 the general pattern shown by the spatial distribution of the number of species had a central patch with perpendicular orientation to the coast line, followed by patches with a clear increase in the number of species on both sides. The highest concentrations of species were found in the Northwest next to the coast line.
Table 1
Semivariogram analysis of the number of benthic species in May 1996 and June 1999
Parameter    May/1996    June/1999
Nugget       0.001       18.400
Sill         2.7583      131.539
NSR          0.036       13.99
Range        798.5       3147
PSS          1           0.860
R²           0.659       0.800
RSS          8.636       10953
The variogram models are anisotropic (90°) for May 1996 and exponential for June 1999. NSR: nugget-sill relationship; PSS: proportion of spatial structure; R²: regression coefficient; RSS: reduced sums of squares.
Table 2
Cross-validation of kriging technique
               Ordinary kriging      Universal kriging     Bayesian kriging
               May/96    June/99     May/96    June/99     May/96    June/99
RC             1.004     1.007       0.999     1.003       1.027     0.999
SE             0.019     0.043       0.018     0.027       0.021     0.001
Y intercept    −0.046    −0.058      0.009     −0.039      −0.411    0.010
SE predict     1.372     2.724       1.044     1.783       0.553     0.091
SN             11–36     0–25        11–36     0–25        11–36     0–25
RC: regression coefficient, SE: standard error, SN: species number.
Fig. 2. Kriging predictions and standard errors of benthic species in May 1996 and June 1999. OK: ordinary kriging, UK: universal kriging and BK: Bayesian
kriging.
The difference in May 1996 between the BK model and
the other models was characterized by a smaller range of
species with respect to the other models (10–18 species/m²). Therefore, the larger number of observations
registered in the two temporal samplings were not reected
by the BK model. The similarity level of BK and OK was
85.11% and with respect to UK was 80.28%. The BK
standard error plots showed a perpendicular orientation to
the coast line, with a central patch and patches to its side
with zero error; in the Southwest a small patch of error 1 was
observed.
For June 1999, the general pattern of the number of
species registered a clear increase gradient, from the
Northwest end to the coast line to the Southeast end to the
coast line. The dissimilarity between models for this date
was markedly larger than that in May 1996, the biggest
similarity registered between OK and UK (74.41%) and the
lowest between BK and UK (51.02%), while the similarity
between OK and BK was 55.32%.
In general, as it happened in May 1996, the OK and UK models showed small variations in patch sizes. The general pattern presented the same species range (0–24), the same number and types of patches. Their associated standard errors ranged from −4 to 4, with the zero-error patch being the main one. Both models overestimated the number of species in the central zone of the study area (deep zone of the sailing channel to Port CALICA) and underestimated the number in the north zone.
Fig. 3. Differences between kriging predictions based on OK, UK and BK corresponding to May 1996 and June 1999.
Fig. 4. Bayesian predictive distributions at 12 selected stations. Dotted lines: mean of prediction, solid lines: real values.
The spatial behavior of the benthic species registered in June 1999 showed, in general, a less predictable behavior than that of May 1996, due to a weaker spatial intercorrelation. This resulted in a quite different spatial distribution pattern. Now, taking into account that both temporal samplings were carried out in the same stage of the species annual cycle, and thus should register a similar spatial behavior under natural conditions, the real differences encountered might be due to other circumstances such as works at the canal which crosses the analyzed region.
The use of the Bayesian paradigm allowed us to model
uncertainty in some of the parameters, such as the mean ($\beta$)
parameter. Fig. 4 shows 12 selected locations with the
corresponding predictive Gaussian distributions for May
1996 and June 1999. The uncertainty in the prediction was
controlled and assessed by a Gaussian distribution. Using
this technique, we can evaluate the predictive distribution at
each individual station and thus analyze those stations, and
those subregions, where our predictions can be more
reliable.
The temporal variation of the number of species was
finally analyzed (see Table 3 and Fig. 5). The temporal
variation between data in May 1996 and June 1999 followed
the same general pattern under the three kriging classes,
with an unchanged patch that crossed the study area from
the Southwest to the Northeast, with an increase in the
number of species at South and a decrease at North. The
maximum temporal variation corresponded to areas with a
decrease in the number of species, and the smallest temporal
variation corresponded to the unchanged area.
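The percentages of the kind reported in Table 3 can be illustrated with a short Python sketch that classifies each prediction location as decrease, unchanged or increase and reports the share of each class. This is only a sketch under stated assumptions: the tolerance `tol` that defines "unchanged", the function name and the example values are hypothetical, and the total variation is taken as the sum of the decrease and increase shares, as the ordinary kriging column of Table 3 suggests.

```python
def variation_percentages(pred_1996, pred_1999, tol=0.5):
    """Share (in %) of locations with a decrease, no change or an increase.

    pred_1996, pred_1999 : predictions at the same locations on both dates
    tol                  : absolute difference treated as "unchanged"
                           (hypothetical choice, not from the paper)
    """
    counts = {"Decrease": 0, "Unchanged": 0, "Increase": 0}
    for before, after in zip(pred_1996, pred_1999):
        diff = after - before
        if diff > tol:
            counts["Increase"] += 1
        elif diff < -tol:
            counts["Decrease"] += 1
        else:
            counts["Unchanged"] += 1
    total = sum(counts.values())
    pct = {label: 100.0 * n / total for label, n in counts.items()}
    # Assumption: total variation = decrease share + increase share.
    pct["Total variation"] = pct["Decrease"] + pct["Increase"]
    return pct


# Hypothetical predictions at four grid nodes.
example = variation_percentages([10, 12, 8, 15], [7, 12.2, 11, 9])
```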
It is important to note that the maps shown in Figs. 3 and
5 are built simply by superposition of the predictions obtained
by a fixed kriging model or through the temporal variation.
Neither a level of uncertainty nor a probability associated with the
changes is defined. To handle change probabilities we used
the probabilistic kriging proposed by Krivoruchko (2001).
Table 3
Percentage of variation of the number of species per m² in May 1996 and
June 1999 for each kriging type

                   Ordinary kriging   Universal kriging   Bayesian kriging
Decrease                56.04               56.10              39.65
Unchanged               18.43               18.10              29.97
Increase                25.53               25.79              30.37
Total variation         81.57               82.90              60.02
Fig. 5. Spatio-temporal variation of the number of benthic species.
We thus considered the number of species registered in May
1996 as the threshold, with a value set to zero. Then, if an
increase in the number of species in June 1999 was detected,
a value of 1 was assigned; if there was a decrease, a value of
-1 was assigned; and finally, if no change was found, a
value of 0 was assigned. Finally, Bayesian
kriging was used to interpolate this probabilistic map, which
in turn was called a contingency map.
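The coding step described above can be sketched in Python as follows. The station labels and counts are hypothetical, and the sketch covers only the assignment of the values 1, -1 and 0; the subsequent interpolation of the coded values by Bayesian kriging is not reproduced here.

```python
def change_indicator(count_1996, count_1999):
    """Code the change in the number of species between the two dates.

    Returns 1 for an increase, -1 for a decrease and 0 for no change,
    taking the May 1996 count as the reference (threshold) value.
    """
    if count_1999 > count_1996:
        return 1
    if count_1999 < count_1996:
        return -1
    return 0


# Hypothetical stations: (species count in May 1996, in June 1999).
counts = {"S1": (14, 9), "S2": (6, 6), "S3": (3, 11)}
indicators = {name: change_indicator(a, b) for name, (a, b) in counts.items()}
```

The resulting values at the stations are the data that would then be interpolated to obtain the contingency map.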
The contingency map (Fig. 6) showed a clearly marked
gradient of variation in the change probability from the
central sailing channel. An increasing probability (positive
values from 0 to 1 in Fig. 6) in the number of species to the
East of the channel was found, together with a decreasing
probability (negative values from 0 to -1 in Fig. 6) in the
number of species to the West of the channel.
Considering the maps of spatial variation between
monitoring dates, and mainly the contingency
map, a notable effect of the sailing channel was observed
on the spatial distribution pattern of the number of species.
The decrease in species density to the West of the channel,
particularly in the shallower zones, could be due to
the combined effect of the sailing channel and the dominant
marine currents in the study area (from Southeast to
Northwest). A possible explanation is that port traffic
increases the suspended sediments in the water and
that the marine currents disperse them in the zone to the
West of the channel.
6. Conclusions
We have presented a new class of geostatistical models
based on Gaussian Spatial Linear Mixed Models (GSLMM) that can be
used in practice to make spatial predictions and that
include as particular cases the well-known ordinary
and universal kriging techniques. Moreover, these models
can also be used under the Bayesian paradigm. The main
advantage of using Bayesian kriging is that uncertainty in
the model parameters can be evaluated.
In our case study, the BK model, followed by the UK
model, best predicted the number of
benthic species, with a rather small standard error. In general,
the OK and UK models provided similar predictions throughout
the region. Moreover, using the GSLMM models, we could
use a Bayesian methodology to evaluate the uncertainty in
the predictions.
Regarding the spatio-temporal variation of the number
of benthic species as a measure of biodiversity, the three
models agreed that in the area surrounding Port
CALICA there was a marked decrease in the number of
species per m², especially in the North zone next to the
coastline, which reflected the effect of the sailing channel and of
the dominant currents in the zone. These results could be
used to design further and better environmental policies ensuring
biological protection.
Acknowledgements
The referees and the Editor are gratefully acknowledged
for their comments that have substantially improved an
earlier version of the paper.
References
Bernardo, J.M., Smith, A.F.M., 1994. Bayesian Theory, Wiley, Chichester.
Chavez, E.A., 1996. Sampling design for the study of Yucatan reefs,
Northwestern Caribbean, Proceedings of the Eighth International Coral
Reef Symposium, Panama, pp. 1465-1470.
Cressie, N.A., 1993. Statistics for Spatial Data, Wiley, New York, Revised
edition.
Diggle, P.J., Ribeiro, P.J. Jr., 2002. Bayesian inference in Gaussian model
based geostatistics. Geographical and Environmental Modelling 6,
129-146.
Diggle, P.J., Moyeed, R.A., Tawn, J.A., 1998. Non-gaussian geostatistics.
Applied Statistics 47, 299-350.
Fahrig, L., Merriam, G., 1994. Conservation of fragmented populations.
Conservation Biology 8, 50-59.
Gamma Design Software, 1998. GS+ Geostatistics for the Environmental
Sciences. Software Version 3.2.
Golden Software, 1999. Surface Mapping System. Software Version 7.0,
Golden Software, Golden, CO.
Goovaerts, P., 1997. Geostatistics for Natural Resources Evaluation,
Oxford University Press, Oxford.
Jordan, E., 1979. Estructura y composicion de arrecifes coralinos en la
region nordeste de la Peninsula de Yucatan, Mexico. Anales del Centro
de Ciencias del Mar y Limnologia, Universidad Nacional Autonoma de
Mexico 6, 69-86.
Jordan, E., 1992. Atlas de corales del caribe mexicano, CIQRO (edic.),
Quintana Roo, 106 pp.
Journel, A., Huijbregts, C., 1978. Mining Geostatistics, Academic Press,
New York.
Kitanidis, P., 1986. Parameter uncertainty in estimation of spatial functions:
Bayesian analysis. Water Resources Research 22, 499-507.
Fig. 6. Contingency map. Values from 0 to 1 define the probability of an
increase in the number of benthic species; values from 0 to -1 define the
probability of a decrease in the number of benthic species.
Kitanidis, P., 1997. Introduction to Geostatistics: Applications in
Hydrogeology, Cambridge University Press, Cambridge, 271 pp.
Krivoruchko, K., 2001. Using linear and non-linear kriging interpolators to
produce probability maps, Proceedings of 2001 Annual International
Association For Mathematical Geology, Cancun, Mexico.
Matheron, G., 1962. Traite de geostatistique appliquee. Tomes 1 and 2,
Memoires du Bureau de Recherches Geologiques et Minieres, No. 14,
Technip Editions, Paris.
MathSoft, 2000a. Software S-Plus 2000 Professional, Release 2, MathSoft,
Seattle.
MathSoft, 2000b. Library S+SpatialStats to S-Plus 2000, MathSoft, Seattle.
Maurer, B.A., Heywood, S.G., 1993. Geographic range fragmentation and
abundance in neotropical migratory birds. Conservation Biology 7,
501-509.
McGarigal, K., McComb, W.C., 1995. Relationships between landscape
structure and breeding birds in the Oregon Coast Range. Landscape
Ecology 16, 327-349.
Ohlhorst, S.L., Liddell, W.D., Taylor, R.J., 1988. Evaluation of reef census
techniques. Proceedings of the Sixth International Coral Reef
Symposium, Australia, vol. 2, 319-324.
Omre, H., 1987. Bayesian kriging - merging observations and qualified
guesses in kriging. Mathematical Geology 19, 25-38.
Omre, H., Halvorsen, K., 1989. The Bayesian bridge between simple and
universal kriging. Mathematical Geology 21, 767-786.
Qian, S.S., 1997. Estimating the area affected by phosphorus runoff in an
Everglades wetland: a comparison of universal kriging and Bayesian
kriging. Environmental and Ecological Statistics 4, 1-29.
Ribeiro Jr., P.J., Diggle, P.J., 1999. geoR/geoS: a geostatistical library for
R/Splus. Technical Report ST-99-09. Department of Mathematics and
Statistics, Lancaster University, Lancaster.
Torruco, D., Gonzalez, A., 1996. Comunidad Bentica. In: CINVESTAV,
Analisis de calidad del agua y monitoreo biologico del sitio CALICA,
Quintana Roo. Primer informe, CINVESTAV-IPN Unidad Merida,
Merida, Yucatan, Mexico, 54 pp.
