An Experimental Comparison

9812 Mathematical Geology PLENUM Dec′ 98 − Jan′ 99 981203ap 6/29/1999 [632] (PR1)
Mathematical Geology, Vol. 31, No. 4, 1999
An Experimental Comparison of Ordinary and

Universal Kriging and Inverse Distance Weighting1
Dale Zimmerman,2 Claire Pavlik,3

Amy Ruggles,4 and Marc P. Armstrong5
A factorial, computational experiment was conducted to compare the spatial interpolation accuracy
of ordinary and universal kriging and two types of inverse squared-distance weighting. The
experiment considered, in addition to these four interpolation methods, the effects of four data and
sampling characteristics: surface type, sampling pattern, noise level, and strength of small-scale
spatial correlation. Interpolation accuracy was measured by the natural logarithm of the mean
squared interpolation error. Main effects of all five factors, all two-factor interactions, and several
three-factor interactions were highly statistically significant. Among numerous findings, the most
striking was that the two kriging methods were substantially superior to the inverse distance
weighting methods over all levels of surface type, sampling pattern, noise, and correlation.
KEY WORDS: geostatistics, spatial interpolation, spatial pattern, surface-fitting algorithms.
INTRODUCTION
An important statistical problem in the geosciences is the estimation of a two-
or three-dimensional manifold across a study region from observations taken at
known locations in the region. This problem, known as spatial interpolation,
has been addressed using methods that include ordinary and universal krig-
ing, inverse distance weighting, interpolating polynomials, splines, and power
and Fourier series fitting (for reviews of these and other methods, see Franke,
1Received August 1995; accepted 21 July 1998.

2Department of Statistics and Actuarial Science, The University of Iowa, Iowa City, Iowa 52242.
e-mail: dale-zimmerman@uiowa.edu
3Department of Geography, The University of Iowa, Iowa City, Iowa 52242. e-mail: claire-
pavlik@uiowa.edu
4Department of Geography, The University of Iowa, Iowa City, Iowa 52242. e-mail: amy-rug-
gles@uiowa.edu
5Department of Geography, and Program in Applied Mathematical and Computational Sciences,
The University of Iowa, Iowa City, Iowa 52242. e-mail: marc-armstrong@uiowa.edu
375
0882-8121/ 99/ 0500-0375$16.00/ 0  1999 International Association for Mathematical Geology

376 Zimmerman, Pavlik, Ruggles, and Armstrong
1982; Lam, 1983; Cressie, 1993). The variety of available interpolation meth-
ods has led to questions about which is most appropriate in different contexts
and has stimulated several comparative studies of relative accuracy. Unfortu-
nately, published studies are not unequivocal as to which interpolation method
is most accurate. In some studies (Creutin and Obled, 1982; Tabios and Salas,
1985; Rouhani, 1986; Grimm and Lynch, 1991; Laslett and McBratney, 1990;
Weber and Englund, 1994; Laslett, 1994; Phillips and others, 1997), a kriging
procedure performed best, while in others (Laslett and others, 1987; Weber and
Englund, 1992; Gallichand and Marcotte, 1993; Brus and others, 1996; Declercq,
1996) splines or inverse distance weighting were as good or better. Most of
these comparisons were based on analyses of actual geostatistical data, such as
groundwater levels (Rouhani, 1986), acid rain deposition (Grimm and Lynch,
1991), soil pH (Laslett and others, 1987; Laslett and McBratney, 1990), and
land elevation and elevation variances (Weber and Englund, 1992, 1994; Wood
and Fisher, 1993). Using real data rather than synthetic data has several advan-
tages; for example, it precludes one method from having an unfair advantage
merely because the data used for the comparison are generated under the same
model on which the method is based. On the other hand, only with synthetic
data can the effect of certain data characteristics (such as the level of noise or
the strength of spatial correlation) on interpolation accuracy be systematically
evaluated.
The purpose of this paper is to evaluate and compare the accuracy of several
interpolation methods based on an analysis of synthetic data from a computa-
tional experiment. The experiment was designed so that the effects of interpola-
tion method and certain data features and sampling procedures, both individually
and in interaction with one another, could be tested for statistical significance.
The interpolation methods we studied, ordinary kriging, universal kriging, and
two types of inverse squared-distance weighting, are widely used methods that
were given prominence in the relatively recent comparative studies of Weber
and Englund (1992, 1994) and Englund, Weber, and Leviant (1992). The present
study can, in fact, be considered a synthetic-data complement and supplement
to these two previous studies.
Our computational experiment focused on four basic factors in addition to
interpolation method: surface type, sampling pattern, noise (magnitude of error
variance), and strength of spatial correlation among error terms. The three sur-
faces we used, a simple tilted plane, a strong peak based on a sine function, and
an undulating, increasing surface which is the sum of sine and cosine functions,
were selected to encompass a simplified subset of the range of surfaces encoun-
tered in environmental analyses. The surfaces share, at an abstract level, many
characteristics with real environmental surfaces. As Weber and Englund (1994,
p. 590) argued, physical phenomena such as hydraulic head, surface water tem-
perature, and geologic structural surfaces often vary smoothly and continuously.
A Comparison of Kriging and Inverse Distance Weighting 377
The three surfaces were sampled with four commonly used patterns so that the
interaction of surface type with sampling pattern could be evaluated.
In general, the surfaces from which environmental samples are taken are
expected to exhibit local variation (noise); in addition, physical processes operate
which often lead to small-scale spatial correlation. To simulate this in our data,
we systematically controlled the level of noise in, and the strength of spatial
correlation among, the error terms, thus permitting an examination of the effects
of noise and spatial correlation on interpolation accuracy.
The systematic design of our experiment, with sampled surfaces interpo-
lated by ordinary and universal kriging and by two types of inverse squared-dis-
tance weighting, permits comparative evaluation of the four methods. By con-
trolling values of the parameters discussed above, we are able to reach conclu-
sions about the conditions under which kriging (either ordinary or universal)
or inverse squared-distance weighting performs better, as well as the conditions
under which particular sampling patterns may be preferred.
EXPERIMENTAL PROCEDURE
Many characteristics of sampling and data could conceivably affect the per-
formance of interpolation methods. As noted in the previous section, we consid-
ered four basic factors in this investigation in addition to interpolation method:
surface type, sampling pattern, magnitude of error variance, and strength of
small-scale spatial correlation. In this section, we describe how the levels of
these factors were selected and how our computational experiments were con-
ducted.
The experimental region for our investigation was taken to be the unit
square {(x, y): 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}. Three mathematical surfaces, hence-
forth called the plane, the sombrero, and Morrison’s surface, were defined on
this region (Fig. 1). Each of these surfaces represents a particular kind of drift,
or large-scale spatial variation. The drift functions that define the plane and the
sombrero are:
Plane:g(x, y) c 1.05 − x
sin[{16(x − 0.5)}2 + {16( y − 0.5)}2 ]

Sombrero:g(x, y) c
{16(x − 0.5)}2 + {16( y − 0.5)}2
if x ? 0.5 or y ? 0.5
c 1, if x c y c 0.5
The drift function that defines Morrison’s surface, given in Morrison (1971) but
not duplicated here, is rather complicated; it is a linear combination of 48 sine
Figure 1. Mathematical surfaces used for the experiment: A, plane; B, sombrero; C, Morrison’s
surface.
and cosine functions of different phase, period, and amplitude. So as to eliminate

possible differences in the accuracy of interpolation methods for these surfaces
caused by differences in total surface variation (maximum minus minimum), all
three surfaces were scaled so that the maximum was approximately equal to 2.0
and the minimum was approximately equal to 0.0.
Each surface (actually a perturbed version of each surface, as will be de-
scribed subsequently) was sampled at locations whose spatial distribution was
determined according to a particular spatial point process. Four types of sampling
patterns were considered: a regular hexagonal grid, an inhibited pattern, a simple
random pattern, and a clustered pattern. In the order just listed, these types lie on
a continuum of decreasing regularity (or increasing aggregation). The hexagonal

grid comprised 99 sampling points with a grid spacing of approximately 0.104
units. The other three types of patterns all consisted of 100 sampling points.
For each type, realizations were obtained using methods described by Diggle
(1983), which we briefly summarize. A realization of the random pattern was
obtained by generating 200 independent uniform (0, 1) random variables and
pairing them as they were generated to yield 100 (x, y)-coordinates. A realiza-
tion of the clustered pattern was obtained by simulating a Poisson cluster process
with 20 clusters. For each cluster, the number of sampling points was an inde-
pendent Poisson variate with expectation 5.0; realizations were conditioned to
yield a total of 100 points. Locations of points in a cluster were independently
distributed according to a uniform distribution on a circle of radius 0.10 around
the cluster center, with toroidal edge correction when necessary. A realization of
the inhibited sampling pattern was obtained by simulating a sequential, hard-core
inhibition process: the first sampling point was uniformly generated in the unit
square as for a simple random pattern, but each subsequent point was chosen
according to a uniform distribution on that portion of the unit square that lay no
closer than 0.04 to any previously obtained point. A realization of the hexago-
nal grid was obtained by locating the lower leftmost grid point at random within
the smallest subregion for which the resulting grid would lie completely within
the unit square. A realization of each type of sampling pattern is depicted in
Figure 2.
Values of the surfaces at sampling points were perturbed by adding auto-
correlated errors in order to introduce some local variability. The resulting sur-
faces more realistically represent environmental surfaces than the smooth math-
ematical surfaces defined above. For the purpose of evaluating the performance
of interpolation methods, errors were also generated at 400 “evaluation points”
arranged in a 20 × 20 square grid with lower leftmost point at (0.025, 0.025) and
grid spacing 0.05 units. Autocorrelated, normally distributed errors with mean
0 and variance j 2 were generated at the 100 (or 99, for the hexagonal patterns)
sampling points and 400 evaluation points using the Cholesky decomposition
method (Cressie, 1993, p. 201–203). The autocorrelated errors were generated
according to the isotropic exponential covariance model—a commonly used,
valid (nonnegative definite), and monotone decreasing covariance function. This
model is given by
C(r; j 2 , a) c j 2 e − ar
where r is the (Euclidean) distance between points, j 2 > 0 is the error variance,
and a is the correlation strength parameter. Two values of j 2 , 0.0036 and 0.0576,
and two values of a, 4 and 10, were considered. These were chosen so as to be
reasonable in relation to the total variation of each surface and to the area of
Figure 2. Examples of the four types of sampling patterns used for the experiment: A, hexagonal
grid; B, inhibited; C, random; D, clustered.
the experimental region. When j 2 c 0.0036, roughly 95% of the errors have
magnitudes less than 2 . (0.0036)1/ 2 c 0.12 (6% of the range of the data) and
fewer than 0.1% of the errors exceed 10% of the range in magnitude; thus, this
choice of j 2 introduces a relatively small amount of noise into the observations.
When j 2 c 0.0576, however, much more noise is introduced; only 38% of the
errors have magnitudes less than 6% of the range, roughly 40% of the errors
exceed 10% of the range, and about 4% of the errors exceed 50% of the range.
When a c 4, errors at points separated by a distance of 0.25 units are correlated
by the amount 0.37; that is, moderate correlation exists at a distance of one-
fourth the length of the experimental region’s side. When a c 10, however, the
correlation is much weaker; the correlation between points 0.25 units apart is
only 0.08.
The experiment was carried out as follows. For each combination of surface
type, sampling pattern, j 2 , and a, three realizations of the sampling pattern were
obtained and then, conditional on those data locations, errors with the autoco-
variance structure specified by j 2 and a were generated at the 100 (or 99) data
locations and 400 evaluation gridpoints simultaneously. In this way, a total of
144 (3 surface types × 4 pattern types × 3 realizations of each pattern type × 2
values of j 2 × 2 values of a) sets of 500 (or 499) errors referenced by spatial
location were generated. Next, each set of errors was added to the correspond-
ing mathematical surface at the 500 (or 499) locations, yielding a synthetic data
set of 100 (or 99) observations and a “true surface” of 400 values at evaluation
points. Note that we define the true surface as the mathematical surface plus auto-
correlated error; since our synthetic data arose from a perturbed surface, it was
most appropriate for the surface to which interpolated values were compared,
to have the same degree of perturbation. Each set of 100 (or 99) observations
was processed by one of four interpolation algorithms: ordinary kriging (OK),
universal kriging (UK) using (as is common practice) a full second-order polyno-
mial drift function, and inverse squared-distance weighting using the nearest six
observations (ISDW-6) or the nearest 12 observations (ISDW-12). The results
from these interpolations then were used to estimate the value of the true surface
at each of the 400 evaluation points. An estimation error at each evaluation point
and for each interpolation method was obtained by subtracting the value of the
true surface at the point from the value estimated by the interpolation method.
The four interpolation methods were carried out using suitable functions in
S+SpatialStats: krige and predict.krige for OK and UK, and find.neighbor and
functions we wrote ourselves for ISDW-6 and ISDW-12 (for details on these and
related functions, see Kaluzny and others, 1997). Implementation of the two krig-
ing procedures was preceded by the selection of a suitable semivariogram model,
and the estimation of that model, specific to each dataset. This was accomplished,
as is common practice, by fitting each of several semivariogram models (lin-
ear, Gaussian, spherical, and exponential) to the empirical semivariogram by
weighted least squares (Cressie, 1993, sect. 2.6.2) and selecting the model with
the smallest weighted residual sum of squares. The S+SpatialStatsfunctions
variogram and vario.fit were useful in this regard.
To measure interpolation accuracy, the natural logarithm of the mean
squared error [log(MSE)], the root-mean-squared error, and the mean absolute
error were computed for each dataset. In addition, the correlation between the
estimated values and true values at evaluation points was computed for each
dataset. In what follows, we present statistical results for log (MSE) only, as
this was the only measure which residual analysis showed to be nearly normally
distributed. Although the other performance measures had distributions that were
not as well approximated by normality, the results for these measures were qual-
itatively similar to results for log (MSE).
The complete experiment can be regarded as a randomized factorial exper-
iment with five factors: interpolation method, surface type, sampling pattern,
noise level (j 2 ), and correlation strength (a). Each cell of the five-way layout
contained three replicates (calculated from the synthetic datasets corresponding
to the three realizations of each sampling pattern). An experimental measure-
ment was obtained by: (1) comparing the interpolated surface to the true surface
for a dataset at the 400 evaluation points, in the manner described previously,
and (2) reducing the 400 interpolation errors to a single numerical performance
measure [log(MSE)]. The design yielded 576 such measurements.
Accordingly, our analysis of the experiment is based on the model
log(MSE )ijklmn c f (m i , t j , pk , n l , am ) + e ijklmn

(i c 1, 2, 3, 4; j c 1, 2, 3; k c 1, 2, 3, 4; l c 1, 2; m c 1, 2; n c 1, 2, 3)
where log(MSE)ijklmn is log (MSE) for the ith interpolation method and for the
nth dataset at levels j, k, l, m of the other four factors; f is a specified function
(discussed further below); m i is the effect of interpolation method i (i c 1, 2, 3, 4
for OK, UK, ISDW-6, or ISDW-12); t j is the effect of surface type j ( j c 1, 2, 3
for plane, sombrero, or Morrison’s surface); pk is the effect of sampling pattern
k (k c 1, 2, 3, 4 for hexagonal, inhibited, random, or clustered); n l is the effect
of the lth level of noise (l c 1 or 2 for j 2 c 0.0036 or 0.0576); am is the effect
of the mth level of correlation (m c 1 or 2 for a c 4 or 10); e ijklmn is the residual
for the nth data set at levels i, j, k, l, m of the five factors; these residuals are
assumed to be independently and identically distributed normal random variables
with mean zero and positive variance.
The main-effects version of this model takes f to be a simple additive func-
tion of the five factor effects; that is, f (m i , t j , pk , n l , am ) c m i + t j + pk + n l
+ am . We found the most useful version, however, to be the model with main
effects and two- and three-factor interactions.
RESULTS
Table 1 gives the analysis-of-variance table that corresponds to the pre-
scribed model for log (MSE) with main effects and two- and three-factor inter-
action effects. The model described a large portion (96.7%) of the total varia-
tion, and a standard Shapiro-Wilks test could not reject the null hypothesis of
normally-distributed residuals (P c 0.18). All main effects and two-factor inter-
actions were highly significant (P < 0.0001). Also highly significant were three
of the ten three-factor interactions. Further analysis, based on a model which
Table 1. ANOVA for the Experiment, Based on a Model for Log (MSE) with Main Effects and
Two- and Three-Factor Interactionsa
Source df MS F Pr > F
Method (M) 3 27.37 449.67 0.0001*

Surface (S) 2 117.04 1923.17 0.0001*
Pattern (P) 3 46.69 767.11 0.0001*
Noise (N) 1 213.70 3511.25 0.0001*
Correlation (C) 1 18.49 303.88 0.0001*
M× S 6 0.46 7.49 0.0001*
M× P 9 0.23 3.78 0.0001*
M× N 3 7.25 119.11 0.0001*
M× C 3 0.62 10.22 0.0001*
S× P 6 1.79 29.42 0.0001*
S× N 2 29.49 484.49 0.0001*
S× C 2 1.58 25.92 0.0001*
P× N 3 3.00 49.35 0.0001*
P× C 3 0.59 9.75 0.0001*
N× C 1 3.17 52.14 0.0001*
M× S× P 18 0.48 7.84 0.0001*
M× S× N 6 0.62 10.24 0.0001*
M× S× C 6 0.05 0.78 0.59
M× P× N 9 0.05 0.81 0.61
M× P× C 9 0.02 0.35 0.96
M× N× C 3 0.00 0.02 0.99
S× P× N 6 0.31 5.11 0.0001*
S× P× C 6 0.02 0.36 0.90
S× N× C 2 0.07 1.23 0.29
P× N× C 3 0.17 2.73 0.04
Residual 459 0.06
Total 575
aHighly significant terms are indicated with an asterisk. R2 c 0.97.
included four-factor interaction effects, indicated that all interactions of order

four were not significant.
Although the presence of three-factor interactions can, in general, obfuscate
the interpretation of two-factor interactions and main effects, an examination of
the three-way means corresponding to the significant three-factor interactions
(not shown) revealed that in this case they do not. That is, the lack of additivity
indicated by the significant three-factor interactions was due to differences in the
relative magnitudes, but not to differences in the relative rankings, of log (MSE)
over the three-factor combinations. Moreover, the three-factor interactions were
much smaller in magnitude than the lower-order effects. Consequently, the sig-
nificance of several three-factor interactions here does not compromise conclu-
sions drawn from examining main effects and two-factor interactions only.
Figure 3. Interaction plots for the significant two-factor interactions: A, Method × Surface; B,
Method × Pattern; C, Method × Noise; D, Method × Correlation; E, Surface × Pattern; F, Sur-
Figure 3. (Continued) face × Noise; G, Surface × Correlation; H, Pattern × Noise; I, Pattern ×

Correlation; J, Noise × Correlation. Plots show the sample means of log (MSE) for one factor at each
level of the other factor. Means that correspond to the same level of the first factor are connected
by line segments. Horizontal bars represent values obtained by adding and subtracting twice the
standard error from the mean.
Figure 3 displays interaction plots corresponding to all two-factor interac-

tions. The plots indicate that, in most cases, the significance of a two-factor inter-
action is caused by a lack of additivity (parallelism) between the kriging profiles
and the ISDW profiles. There are only two instances in which profiles cross: (1)
in the Method × Surface plot, where OK was superior to UK for the sombrero
and Morrison’s surface, but vice versa for the plane; (2) in the Method × Pattern
plot, where OK was superior to UK for the clustered, inhibited, and hexagonal
patterns but vice versa for the random pattern. Apart from these exceptions, the
significance of the two-factor interactions, like that of the three-factor interac-
tions, appears to be due to differences in relative magnitude rather than in relative
ranking of log (MSE) over the two-factor combinations; similarly, the two-factor
Figure 4. Sample one-factor means of log (MSE). Horizontal bars represent means and values
obtained by adding and subtracting twice the standard error from the mean.
interactions are much smaller than the main effects. Consequently, apart from
the two exceptions noted, the significance of the two-factor interactions does not
materially affect conclusions drawn from examining main effects only.
Figure 4 displays the one-factor means of log (MSE), averaged over all
levels of the other factors, and their standard errors. The information in this
figure can be summarized as follows:
1. The kriging methods performed much (20–30%) better than the two
inverse squared distance weighting procedures. Furthermore, OK out-
performed UK slightly and ISDW-6 outperformed ISDW-12 slightly.
2. The performance of each interpolation method was significantly better
for the plane than for the sombrero, which in turn yielded better perfor-
mance than Morrison’s surface.
3. The performance of each interpolation method deteriorated significantly
as the sampling pattern changed from hexagonal, to inhibited, to random,
and finally to clustered.
when j 2 c 0.0036 than when j 2 c 0.0576. That is, interpolation accuracy
improved when the data were not as noisy.
when a c 4 than when a c 10. That is, interpolation accuracy improved
when the data were more highly spatially correlated.
DISCUSSION
Several important points emerge from the results of the previous section.
Because the primary objective of this investigation was to compare interpolation
methods, we first consider the effect of interpolation method on accuracy [as
measured by log (MSE)]. The most striking result along these lines was that the
two kriging methods consistently and substantially outperformed the two inverse
distance weighting methods over all levels of the other factors. This agrees with
the findings of several authors who have made comparisons among interpolation
methods, but contradicts the findings of several other such authors (refer to the
Introduction or to Cressie, 1993, section 5.9.3, for a list of relevant literature). We
make no claim that our study completely lays this issue to rest; however, we do
find the consistent superior performance of kriging over various types of surfaces
and sampling patterns and various levels of noise and spatial correlation in our
study rather more compelling than findings from less comprehensive studies. Of
course, kriging has several other advantages to inverse distance weighting that
are not performance-related. These include the capability of adapting easily to
the estimation of spatial averages (block kriging) and the provision of a measure
of variability of interpolated values (the kriging variance).
A second point pertaining to the comparison of interpolation methods is
that although the kriging methods were consistently superior to inverse squared-
distance weighting over all levels of the other factors, the extent of superiority
was affected by the other factors. More specifically, the disparity in performance
between the two types of methods tended to be greatest at those levels of the
other factors at which all four methods performed best. For example, in com-
paring interpolation methods over the three surfaces, the relative superiority of
kriging was greatest for the plane and smallest for Morrison’s surface: these are
the surfaces for which all four methods do best and worst, respectively. Simi-
larly, the relative superiority of kriging was greatest when the sampling pattern
was most regular, noise was low, and spatial correlation was high.
Thirdly, consider comparisons of the two kriging methods with each other
and the two inverse distance weighting methods with each other. It was expected
that UK would outperform OK for the plane because the noiseless version of this
surface can be fit exactly by the second-order polynomial function employed by
UK (in fact a first-order polynomial would suffice). And indeed this was so,
although the difference in performance was not statistically significant. Nor was
the difference significant for Morrison’s surface. For the sombrero, however,
OK performed significantly better than UK. This last result is interesting, for
it indicates that for some types of surfaces it is better to completely ignore the
modeling of trend (as OK does) than to model trend inappropriately (as, appar-
ently, UK does for the sombrero). This issue requires further investigation; see
also the relevant work of Journel and Rossi (1989).
As regards the significant Method × Pattern interaction, we found that OK

performed better than UK for the clustered pattern, whereas OK and UK were
not significantly different for the other three patterns. A plausible explanation for
the superiority of OK for clustered patterns is elusive; however, it can be argued
that the issue is of little practical importance because overall performance was so
much worse for clustered patterns that they should be avoided if at all possible.
ISDW-6 consistently performed better than ISDW-12 across all levels of
the other factors, and in most cases the difference, though only on the order of
5–10% of the larger log (MSE), was statistically significant. Thus, if an investi-
gator insists on interpolating a surface using inverse squared-distance weighting,
we recommend that the number of neighbors, k, be six rather than 12. This rec-
ommendation is consistent with those of Morrison (1974), MacDougall (1976),
Peucker (1980), and Hodgson (1992) who recommended, respectively, 3 ≤ k ≤ 7,
6 ≤ k ≤ 9, k ≤ 6, and 4 ≤ k ≤ 7. It differs somewhat, however, from that
of Declercq (1996), who recommended 4 ≤ k ≤ 8 for “smooth” surfaces and
16 ≤ k ≤ 24 for abruptly changing surfaces. Note that the larger the value of k,
the smoother the interpolated surface.
Although the primary objective of this investigation was to compare inter-
polation methods, the effects of the other factors on overall interpolation perfor-
mance were also of some interest. The significance of the main effects and the
relatively small magnitude of the interactions allow us to make several points and
to note some aspects that require further investigation. First, our results indicate
that overall interpolation performance deteriorated consistently and significantly
as the sampling pattern varied from the most regular pattern, hexagonal, to the
next most regular, inhibited, to a random pattern, and finally to the most irreg-
ular pattern, clustered. A similar phenomenon has been documented previously
(Morrison, 1971; MacEachern and Davidson, 1987; Englund, Weber, and Levi-
ant, 1992). These results also comport with spatial design theory, in which sys-
tematic sampling is known to be generally superior (see, for example, Haining,
1990, Chapter 5).
Second, it is generally expected that interpolation methods will perform
better, in absolute terms, at lower noise levels and at higher levels of spatial
correlation. While our results were consistent with these expectations, they are
somewhat at odds with the results of Englund, Weber, and Leviant (1992), who
found, to their surprise, that the accuracy of OK was not significantly affected
by noise level. Two possible explanations for this discrepancy are: (1) we used
about three times as many datasets (144 vs. 54) and thus our analysis was perhaps
more sensitive than theirs at detecting a noise effect; (2) the amounts of noise,
as measured by the ratios of error variance to total variance, were considerably
more disparate in our experiment (roughly 2% and 30% in our experiment vs.
0% and 10% in the study of Englund, Weber, and Leviant).
Finally, consider the effect of surface type. We found that interpolation per-
formance was best for the plane, next best for the sombrero, and worst for Mor-
rison’s surface. That the plane yielded the best performance is not surprising in
light of this surface’s extreme smoothness, but it is difficult to characterize the
sombrero or Morrison’s surface as smoother than the other and hence difficult
to explain why performance was better for the former than for the latter. Addi-
tional investigation is required to understand how surface type affects interpo-
lation performance at both the overall level [as measured, for example, by log
(MSE)] and the micro level. In particular, more work is needed to explicitly
identify and parameterize those surface characteristics (e.g., peaks, troughs, or
portions of the surface with large gradients) that are responsible for the great-
est discrepancies in performance. At finer scales, the grid evaluation approach
used in this paper could serve as the basis for visualizing and evaluating local
performance characteristics.
ACKNOWLEDGMENTS
This research was partially supported by a grant from the Center for Global
and Regional Environmental Research at The University of Iowa. Michael Hodg-
son kindly provided assistance in locating the equations used to generate Morri-
son’s surface. The computing assistance of Yanming Jiang is gratefully acknowl-
edged. We thank two anonymous referees for comments which substantially
improved the paper.
REFERENCES
Brus, D. J., de Gruijter, J. J., Marsman, B. A., Visschers, R., Bregt, A. K., and Breeuwsma, A., 1996,
The performance of spatial interpolation methods and choropleth maps to estimate properties
at points: A soil survey case study: Environmetrics, v. 7, no. 1, p. 1–16.
Cressie, N., 1993, Statistics for spatial data: John Wiley & Sons, New York, 900 p.
Creutin, J. D., and Obled, C., 1982, Objective analyses and mapping techniques for rainfall fields:
An objective comparison: Water Resources Res., v. 18, no. 2, p. 413–431.
Declercq, F. A. N., 1996, Interpolation methods for scattered sample data: Accuracy, spatial patterns,
processing time: Cartography and Geographic Information Systems, v. 23, no. 3, p. 128–144.
Diggle, P. J., 1983, Statistical analysis of spatial point patterns: Academic Press, London, 148 p.
Englund, E. J., Weber, D. D., and Leviant, N., 1992, The effects of sampling design parameters on
block selection: Math. Geology, v. 24, no. 3, p. 329–343.
Franke, R., 1982, Scattered data interpolation: tests of some methods: Mathematics of Computation,
v. 38, no. 157, p. 181–200.
Gallichand, J., and Marcotte, D., 1993, Mapping clay content for subsurface drainage in the Nile
delta: Geoderma, v. 58, nos. 3/ 4, p. 165–179.
Grimm, J. W., and Lynch, J. A., 1991, Statistical analysis of errors in estimating wet deposition
using five surface estimation algorithms: Atmospheric Environment, v. 25a, no. 2, p. 317–327.
Haining, R., 1990, Spatial data analysis in the social and environmental sciences: Cambridge Univ.
Press, Cambridge, 409 p.
Hodgson, M. E., 1992, Sensitivity of spatial interpolation models to parameter variation: ACSM
Technical Papers—Albuquerque, American Congress on Surveying and and Mapping, Bethesda,

MD, v. 2, p. 113–122.
Journel, A. G., and Rossi, M. E., 1989, When do we need a trend model in kriging?: Math. Geology,
v. 21, no. 7, p. 715–739.
Kaluzny, S. P., Vega, S. C., Cardoso, T. P., and Shelley, A. A., 1997, S+SPATIALSTATS user’s
manual for windows and UNIX: Springer, New York, 384 p.
Lam, N. S., 1983, Spatial interpolation methods: A review: American Cartographer, v. 10, no. 2, p.
129–149.
Laslett, G. M., 1994, Kriging and splines: An empirical comparison of their predictive performance
in some applications: Jour. Am. Stat. Assoc., v. 89, no. 426, p. 391–409.
Laslett, G. M., and McBratney, A. B., 1990, Further comparison of spatial methods for predicting
soil pH: Soil Science Society of America Journal, v. 54, no. 6, p. 1553–1558.
Laslett, G. M., McBratney, A. B., Pahl, P. J., and Hutchinson, M. F., 1987, Comparison of several
spatial prediction methods for soil pH: Jour. of Soil Science, v. 38, no. 2, p. 325–341.
MacDougall, E. B., 1976, Computer programming for spatial problems: John Wiley & Sons, New
York, 158 p.
MacEachern, A. M., and Davidson, J. V., 1987, Sampling and isometric mapping of continuous
geographic surfaces: The American Cartographer, v. 14, no. 4, p. 299–320.
Morrison, J. L., 1971, Method-produced error in isarithmic mapping: American Congress on Sur-
veying and Mapping, Washington, D.C., 76 p.
Morrison, J. L., 1974, Observed statistical trends in various interpolation algorithms useful for first
stage interpolation: The Canadian Cartographer, v. 11, no. 2, p. 142–159.
Peucker, T. K., 1980, The impact of different mathematical approaches to contouring: Cartographica,
v. 17, no. 2, p. 73–95.
Phillips, D. L., Lee, E. H., Herstrom, A. A., Hogsett, W. E., and Tingey, D. T., 1997, Use of auxiliary
data for spatial interpolation of ozone exposure in southeastern forests: Environmetrics, v. 8,
no. 1, p. 43–61.
Rouhani, S., 1986, Comparative study of ground-water mapping techniques: Ground Water, v. 24,
no. 2, p. 207–216.
Tabios, G. Q., and Salas, J. D., 1985, A comparative analysis of techniques for spatial interpolation
of precipitation: Water Resources Bull., v. 21, no. 3, p. 365–380.
Weber, D. D., and Englund, E. J., 1992, Evaluation and comparison of spatial interpolators: Math.
Geology, v. 24, no. 4, p. 381–391.
Weber, D. D., and Englund, E. J., 1994, Evaluation and comparison of spatial interpolators, II: Math.
Geology, v. 26, no. 5, p. 589–603.
Wood, J. D., and Fisher, P. F., 1993, Assessing interpolation accuracy in elevation models: IEEE
Computer Graphics and Applications, v. 13, no. 2, p. 48–56.

An Experimental Comparison

Enviado por

Dados do documento

Direitos autorais

Formatos disponíveis

Compartilhar este documento

Compartilhar ou incorporar documento

Opções de compartilhamento

Você considera este documento útil?

Este conteúdo é inapropriado?

Direitos autorais:

Formatos disponíveis

An Experimental Comparison

Enviado por

Direitos autorais:

Formatos disponíveis

9812 Mathematical Geology PLENUM Dec′ 98 − Jan′ 99 981203ap 6/29/1999 [632] (PR1)

Mathematical Geology, Vol. 31, No. 4, 1999

An Experimental Comparison of Ordinary and

Dale Zimmerman,2 Claire Pavlik,3

KEY WORDS: geostatistics, spatial interpolation, spatial pattern, surface-fitting algorithms.

1Received August 1995; accepted 21 July 1998.

0882-8121/ 99/ 0500-0375$16.00/ 0  1999 International Association for Mathematical Geology

376 Zimmerman, Pavlik, Ruggles, and Armstrong

A Comparison of Kriging and Inverse Distance Weighting 377

sin[{16(x − 0.5)}2 + {16( y − 0.5)}2 ]

378 Zimmerman, Pavlik, Ruggles, and Armstrong

and cosine functions of different phase, period, and amplitude. So as to eliminate

A Comparison of Kriging and Inverse Distance Weighting 379

a continuum of decreasing regularity (or increasing aggregation). The hexagonal

380 Zimmerman, Pavlik, Ruggles, and Armstrong

A Comparison of Kriging and Inverse Distance Weighting 381

382 Zimmerman, Pavlik, Ruggles, and Armstrong

log(MSE )ijklmn c f (m i , t j , pk , n l , am ) + e ijklmn

A Comparison of Kriging and Inverse Distance Weighting 383

Method (M) 3 27.37 449.67 0.0001*

aHighly significant terms are indicated with an asterisk. R2 c 0.97.

included four-factor interaction effects, indicated that all interactions of order

384 Zimmerman, Pavlik, Ruggles, and Armstrong

A Comparison of Kriging and Inverse Distance Weighting 385

Figure 3. (Continued) face × Noise; G, Surface × Correlation; H, Pattern × Noise; I, Pattern ×

Figure 3 displays interaction plots corresponding to all two-factor interac-

386 Zimmerman, Pavlik, Ruggles, and Armstrong

A Comparison of Kriging and Inverse Distance Weighting 387

388 Zimmerman, Pavlik, Ruggles, and Armstrong

As regards the significant Method × Pattern interaction, we found that OK

A Comparison of Kriging and Inverse Distance Weighting 389

390 Zimmerman, Pavlik, Ruggles, and Armstrong

Technical Papers—Albuquerque, American Congress on Surveying and and Mapping, Bethesda,

Você também pode gostar