
6. Data analysis
Discrete data can take only particular values. There may potentially be an infinite number of those values, but each is
distinct, with no grey area in between. Discrete data can be numeric (such as counts of apples) but can also be
categorical (such as red or blue, male or female, or good or bad).
Continuous data are not restricted to defined separate values, but can occupy any value over a continuous range. Between
any two continuous data values there may be an infinite number of others. Continuous data are always essentially
numeric.
Overlay is a GIS operation that superimposes multiple data sets (representing different themes) together for the purpose
of identifying relationships between them. An overlay creates a composite map by combining the geometry and attributes
of the input data sets. Tools are available in most GIS software for overlaying both vector and raster data.
Overlay with Vector Data
Feature overlays from vector data are created when one vector layer (points, lines, or polygons) is merged with one or
more other vector layers covering the same area with points, lines, and/or polygons. A resultant new layer is created that
combines the geometry and the attributes of the input layers. An example of overlay with vector data would be taking a
watershed layer and laying over it a layer of counties. The result would show which parts of each watershed are in each
county.
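As an illustration, here is a minimal sketch of this watershed-and-county overlay using the open-source geopandas library. The file names and attribute columns (watershed_id, county_name) are hypothetical.

```python
import geopandas as gpd

# Hypothetical input layers; both must be polygon layers in the same CRS
watersheds = gpd.read_file("watersheds.shp")
counties = gpd.read_file("counties.shp")

# Intersection overlay: each output polygon combines the geometry common to
# both inputs and carries the attributes of both layers
pieces = gpd.overlay(watersheds, counties, how="intersection")

# How much of each watershed falls in each county (hypothetical column names)
pieces["piece_area"] = pieces.geometry.area
print(pieces[["watershed_id", "county_name", "piece_area"]])
```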
Overlay with Raster Data
Raster overlay involves two or more different sets of data that derive from a common grid. The separate sets of data are
usually given numerical values. These values then are mathematically merged together to create a new set of values for a
single output layer. Raster overlay is often used to create risk surfaces, sustainability assessments, value assessments, and
other procedures. An example of raster overlay would be to divide the habitat of an endangered species into a grid, get
data for multiple factors that affect the habitat, and then create a risk surface to illustrate which sections of the habitat
most need protecting.
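A minimal numpy sketch of this kind of raster overlay; the three factor grids and their weights are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three hypothetical factor rasters on a common 100 x 100 grid, each already
# rescaled to the range 0-1 (e.g., road proximity, disturbance, slope)
road_proximity = rng.random((100, 100))
disturbance = rng.random((100, 100))
slope = rng.random((100, 100))

# Mathematically merge the layers: a weighted sum expressing the assumed
# relative importance of each factor (weights chosen arbitrarily, sum to 1)
risk = 0.5 * road_proximity + 0.3 * disturbance + 0.2 * slope

# Cells in the top decile of risk mark sections of habitat needing protection
priority = risk > np.percentile(risk, 90)
```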
How Thiessen polygons are constructed
Each Thiessen polygon contains only a single point input feature. Any location within a Thiessen polygon is closer to its
associated point than to any other point input feature. The theoretical background for creating Thiessen polygons is as
follows:
Where S is a set of points in coordinate or Euclidean space (x,y), for any point p in that space, there is one point of S
closest to p, except where point p is equidistant to two or more points of S. A single proximal polygon (Voronoi cell) is
defined by all points p closest to a single point in S, that is, the total area in which all points p are closer to a given point in
S than to any other point in S.
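Written formally (a standard formalization of the definition above, with d denoting Euclidean distance), the Voronoi cell of a point s in S is:

```latex
V(s) = \{\, p \in \mathbb{R}^2 : d(p, s) \le d(p, s') \ \text{for all } s' \in S \,\}
```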
Thiessen proximal polygons are constructed as follows:
All points are triangulated into a triangulated irregular network (TIN) that meets the Delaunay criterion.
The perpendicular bisectors for each triangle edge are generated, forming the edges of the Thiessen polygons. The
locations at which the bisectors intersect determine the locations of the Thiessen polygon vertices.
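In practice, libraries such as SciPy construct Thiessen (Voronoi) polygons this way via the Delaunay triangulation. A minimal sketch with invented points:

```python
import numpy as np
from scipy.spatial import Voronoi

# Hypothetical point input features (x, y)
points = np.array([[1.0, 1.0], [4.0, 2.0], [2.0, 5.0],
                   [6.0, 6.0], [5.0, 0.5]])

vor = Voronoi(points)

# vor.vertices holds the Thiessen polygon vertices (bisector intersections);
# vor.point_region maps each input point to its cell, and vor.regions lists
# the vertex indices of each cell (-1 marks an unbounded edge at infinity)
for i, region_idx in enumerate(vor.point_region):
    vertex_ids = vor.regions[region_idx]
    print(points[i], "->", [tuple(vor.vertices[v]) for v in vertex_ids if v != -1])
```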
The main problem with Thiessen polygons is that they are determined purely by distance, and thus they take no account of
differences in power or influence between different sites in the past. The resulting maps also take no account of
natural breaks in terrain, such as hills, mountains, or rivers. As such, the technique is best thought of as producing only a
very approximate model of the past spatial relationship between a set of geographic objects.
How inverse distance weighted interpolation works
Inverse distance weighted (IDW) interpolation explicitly makes the assumption that things that are close to one another
are more alike than those that are farther apart. To predict a value for any unmeasured location, IDW uses the measured
values surrounding the prediction location. The measured values closest to the prediction location have more influence on
the predicted value than those farther away. IDW assumes that each measured point has a local influence that diminishes
with distance. It gives greater weights to points closest to the prediction location, and the weights diminish as a function
of distance, hence the name inverse distance weighted. Weights assigned to data points are illustrated in the following
example:
Figure: The Weights window contains the list of weights assigned to each data point used to generate a predicted value at the location marked by the crosshair.

The Power function


As mentioned above, weights are proportional to the inverse of the distance (between the data point and the prediction
location) raised to the power value p, that is, λi = d(s0, si)^(-p) / Σj d(s0, sj)^(-p). As a result, as the distance increases,
the weights decrease rapidly. The rate at which the weights decrease depends on the value of p. If p = 0, there is no
decrease with distance, and because each weight λi is the same, the prediction will be the mean of all the data values in
the search neighborhood. As p increases, the weights for distant points decrease rapidly. If the p value is very high, only
the immediately surrounding points will influence the prediction.
Geostatistical Analyst uses power values greater than or equal to 1.
When p = 2, the method is known as the inverse distance
squared weighted interpolation. The default value is p = 2,
although there is no theoretical justification to prefer this value
over others, and the effect of changing p should be investigated
by previewing the output and examining the cross-validation
statistics.
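The behavior described above can be sketched in a few lines of numpy; the weight formula is the standard normalized inverse-distance form, and the sample data are invented:

```python
import numpy as np

def idw_predict(xy, values, x0, p=2.0):
    """IDW prediction at x0 from sample coordinates xy and their values.

    Weights: lambda_i = d_i**-p / sum_j(d_j**-p), so they sum to 1 and
    fall off faster with distance as p grows; p = 0 gives the plain mean.
    """
    d = np.linalg.norm(xy - x0, axis=1)
    if np.any(d == 0):              # exact interpolator: a sample location
        return values[d == 0][0]    # returns its own measured value
    w = d ** -p
    return np.dot(w / w.sum(), values)

# Invented sample points and values
xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
values = np.array([3.0, 5.0, 4.0, 9.0])
print(idw_predict(xy, values, np.array([0.5, 0.5]), p=2.0))
```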

The search neighborhood


Because things that are close to one another are more alike than those that are farther away, as the locations get farther
away, the measured values will have little relationship to the value of the prediction location. To speed calculations, you
can exclude the more distant points that will have little influence on the prediction. As a result, it is common practice to
limit the number of measured values by specifying a search neighborhood. The shape of the neighborhood restricts how
far and where to look for the measured values to be used in the prediction. Other neighborhood parameters restrict the
locations that will be used within that shape. In the following image, five measured points (neighbors) will be used when
predicting a value for the location without a measurement, the yellow point.
The shape of the neighborhood is influenced by the input data and the surface you are
trying to create. If there are no directional influences in your data, you'll want to consider
points equally in all directions. To do so, you will define the search neighborhood as a
circle. However, if there is a directional influence in your data, such as a prevailing wind,
you may want to adjust for it by changing the shape of the search neighborhood to an
ellipse with the major axis parallel with the wind. The adjustment for this directional
influence is justified because you know that locations upwind from a prediction location
are going to be more similar at remote distances than locations that are perpendicular to
the wind but located closer to the prediction location.
Once a neighborhood shape has been specified, you can restrict which data
locations within the shape should be used. You can define the maximum and
minimum number of locations to use, and you can divide the neighborhood into
sectors. If you divide the neighborhood into sectors, the maximum and minimum
constraints will be applied to each sector.
The points highlighted in the data view show the locations and the weights that
will be used for predicting a location at the center of the ellipse (the location of
the crosshair). The search neighborhood is limited to the interior of the ellipse. In
the example shown below, the two red points will be given weights of more than
10 percent. In the eastern sector, one point (brown) will be given a weight
between 5 percent and 10 percent. The rest of the points in the search
neighborhood will receive lower weights.

When to use IDW


A surface calculated using IDW depends on the selection of the power value (p) and the search neighborhood strategy.
IDW is an exact interpolator, where the maximum and minimum values (see diagram below) in the interpolated surface
can only occur at sample points.
The output surface is sensitive to clustering and the presence of outliers. IDW
assumes that the phenomenon being modeled is driven by local variation, which can
be captured (modeled) by defining an adequate search neighborhood. Since IDW does
not provide prediction standard errors, justifying the use of this model may be
problematic.

Inverse distance weighting is a deterministic, nonlinear interpolation technique that uses a weighted average of the
attribute (i.e., phenomenon) values from nearby sample points to estimate the magnitude of that attribute at non-sampled
locations. The weight a particular point is assigned in the averaging calculation depends upon the sampled point's distance
to the non-sampled location (see Figure 6.cg.25, below). The method is called inverse distance weighting because,
according to Tobler's first law of geography (see Interpolation), the similarity of two locations should decrease with
increasing distance.
To use inverse distance weighting interpolation to create a surface, there are several factors that cartographers need to
consider. One important question is the type of relationship the phenomenon has with distance (e.g., does it decrease
dramatically with distance, or do even relatively distant points have some degree of similarity with the non-sampled
location?). Many cartographers choose to specify an inverse distance-squared relationship, where the weight of a point
differs with the inverse square of distance (i.e., 1/distance²) rather than a simple inverse distance (i.e., 1/distance) (see
Figure 6.cg.26, below).
Figure 6.cg.26 We used a distance exponent of one for the top map, an exponent of two for the middle map, and an
exponent of three for the bottom map. You can see that more distant points have a greater effect on the overall pattern of
the top map, in that it is a smoother surface than the middle and bottom maps (i.e., it gives greater weight to points that
are farther away (and that have less similar attribute values), thereby suppressing the influence of individual peaks). In
the bottom map, you can see that individual peaks in rainfall are most clearly visible, as the interpolated values are not
influenced as heavily by more distant points that have lower values.

A second important factor is determining how large the neighborhood of influence should be: should it include all points
within some fixed distance of the non-sampled location, or should it consist of some particular number of points,
regardless of their distance to the non-sampled location? A variation on the second method is to specify some
combination of distance and number of points (e.g., select the nearest n points within 10 km of the non-sampled
location). Each of these decisions can have an impact upon the final appearance of the interpolated surface and
therefore needs to be carefully considered (see Figures 6.cg.27 and 6.cg.28, and the sketch after them, below).

Figure 6.cg.27 In the three maps above, we incorporated all points within a
fixed radius of 25, 50 and 100 kilometers in the top, middle and bottom maps, respectively. You can see from the top map
that a radius of 25 km is inadequate for this set of samples, as there are several locations that do not have any points
within 25 km, and for which the interpolation algorithm cannot make any prediction at all. Generally, as the size of the
search radius increases, the number of points included in the calculation increases, which has the effect of smoothing the
map pattern.
Figure 6.cg.28 In the three maps above, we have used the nearest 5, 12 and 25 points for calculating precipitation at non-
sampled locations in the top, middle and bottom maps, respectively. Again, you can see that increasing the number of
points used in the calculation smooths the map pattern, by decreasing the impact of extreme values on the predicted values
(i.e., there are fewer locations with the highest amounts of rainfall predicted).
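The three neighborhood strategies just described (fixed radius, fixed count, and their combination) can be sketched with SciPy's k-d tree; the coordinates and thresholds here are invented:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
xy = rng.random((200, 2)) * 100.0      # invented sample coordinates (km)
tree = cKDTree(xy)
x0 = np.array([50.0, 50.0])            # the non-sampled location

# Fixed-distance neighborhood: every sample within 25 km
within_radius = tree.query_ball_point(x0, r=25.0)

# Fixed-count neighborhood: the 12 nearest samples, however far away
dist, nearest12 = tree.query(x0, k=12)

# Combination: the nearest 12 samples, but only those within 25 km
combined = nearest12[dist <= 25.0]
```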
Because inverse distance weighting is a deterministic technique, it does not take into account the spatial structure (i.e.,
arrangement) of the sample points. Therefore, the results that you get using this technique can be influenced by the
spacing and density of the samples, and it is good to be cautious about the accuracy of the interpolated values. Also,
because inverse distance weighting computes an average value, the value it calculates for a non-sampled point can never
be higher than the maximum value of the sample points or lower than their minimum value, so if the peaks and valleys of
the data are not represented in your sample, this technique may be wildly inaccurate in some locations.
Kriging
Kriging is an advanced geostatistical procedure that generates an estimated surface from a scattered set of points with z-
values. Unlike other interpolation methods in the Interpolation toolset, using the Kriging tool effectively involves an
interactive investigation of the spatial behavior of the phenomenon represented by the z-values before you select the best
estimation method for generating the output surface.
The IDW (inverse distance weighted) and Spline interpolation tools are referred to as deterministic interpolation methods
because they are directly based on the surrounding measured values or on specified mathematical formulas that determine
the smoothness of the resulting surface. A second family of interpolation methods consists of geostatistical methods, such
as kriging, which are based on statistical models that include autocorrelation, that is, the statistical relationships among
the measured points. Because of this, geostatistical techniques not only have the capability of producing a prediction
surface but also provide some measure of the certainty or accuracy of the predictions.
Kriging assumes that the distance or direction between sample points reflects a spatial correlation that can be used to
explain variation in the surface. The Kriging tool fits a mathematical function to a specified number of points, or all points
within a specified radius, to determine the output value for each location. Kriging is a multistep process; it includes
exploratory statistical analysis of the data, variogram modeling, creating the surface, and (optionally) exploring a variance
surface. Kriging is most appropriate when you know there is a spatially correlated distance or directional bias in the data.
It is often used in soil science and geology.
Kriging is similar to IDW in that it weights the surrounding measured values to derive a prediction for an unmeasured
location. The general formula for both interpolators is formed as a weighted sum of the data:
Ẑ(s0) = Σ(i=1 to N) λi Z(si)
where:
Z(si) = the measured value at the ith location
λi = an unknown weight for the measured value at the ith location
s0 = the prediction location
N = the number of measured values
In IDW, the weight, λi, depends solely on the distance to the prediction location. However, with the kriging method, the
weights are based not only on the distance between the measured points and the prediction location but also on the overall
spatial arrangement of the measured points. To use the spatial arrangement in the weights, the spatial autocorrelation must
be quantified. Thus, in ordinary kriging, the weight, λi, depends on a fitted model to the measured points, the distance to
the prediction location, and the spatial relationships among the measured values around the prediction location. The
following sections discuss how the general kriging formula is used to create a map of the prediction surface and a map of
the accuracy of the predictions.
To make a prediction with the kriging interpolation method, two tasks are necessary:
Uncover the dependency rules.
Make the predictions.
To realize these two tasks, kriging goes through a two-step process:
It creates the variograms and covariance functions to estimate the statistical dependence (called spatial autocorrelation)
values that depend on the model of autocorrelation (fitting a model).
It predicts the unknown values (making a prediction).
It is because of these two distinct tasks that it has been said that kriging uses the data twice: the first time to estimate the
spatial autocorrelation of the data and the second to make the predictions.
Variography
Fitting a model, or spatial modeling, is also known as structural analysis, or variography. In spatial modeling of the
structure of the measured points, you begin with a graph of the empirical semivariogram, computed with the following
equation for all pairs of locations separated by distance h:
Semivariogram(distance h) = 0.5 * average((value at location i − value at location j)²)
The formula involves calculating the difference squared between the values of the paired locations.
The image below shows the pairing of one point (the red point) with all other measured locations. This process continues
for each measured point.

Often, each pair of locations has a unique distance, and there are often many pairs of points. To plot all pairs quickly
becomes unmanageable. Instead of plotting each pair, the pairs are grouped into lag bins. For example, compute the
average semivariance for all pairs of points that are greater than 40 meters apart but less than 50 meters apart. The
empirical semivariogram is a graph of the averaged semivariogram values on the y-axis and the distance (or lag) on the
x-axis (see diagram below).
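A minimal numpy/SciPy sketch of this binning procedure, computing the averaged semivariance per lag from the equation above (the function and parameter names are our own):

```python
import numpy as np
from scipy.spatial.distance import pdist

def empirical_semivariogram(xy, values, lag_width, n_lags):
    """Average semivariance per lag bin for sample coordinates xy."""
    d = pdist(xy)  # distance for every pair of locations
    # 0.5 * (value_i - value_j)^2 for every pair, in the same pair order
    sv = 0.5 * pdist(values[:, None], metric="sqeuclidean")
    edges = np.arange(1, n_lags + 1) * lag_width
    bin_of = np.digitize(d, edges)  # e.g., pairs 40-50 m apart share a bin
    gamma = np.array([sv[bin_of == k].mean() if np.any(bin_of == k) else np.nan
                      for k in range(n_lags)])
    return edges - lag_width / 2, gamma  # lag-bin centers, semivariances

# Invented data: 50 random points with random values
rng = np.random.default_rng(2)
lags, gamma = empirical_semivariogram(rng.random((50, 2)) * 100,
                                      rng.random(50), lag_width=10, n_lags=10)
```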

Spatial autocorrelation quantifies a basic principle of geography: things that are closer are more alike than things farther
apart. Thus, pairs of locations that are closer (far left on the x-axis of the semivariogram cloud) should have more similar
values (low on the y-axis of the semivariogram cloud). As pairs of locations become farther apart (moving to the right on
the x-axis of the semivariogram cloud), they should become more dissimilar and have a higher squared difference
(moving up on the y-axis of the semivariogram cloud).
Fitting a model to the empirical semivariogram
The next step is to fit a model to the points forming the empirical semivariogram. Semivariogram modeling is a key step
between spatial description and spatial prediction. The main application of kriging is the prediction of attribute values at
unsampled locations. The empirical semivariogram provides information on the spatial autocorrelation of datasets.
However, it does not provide information for all possible directions and distances. For this reason, and to ensure that
kriging predictions have positive kriging variances, it is necessary to fit a model (that is, a continuous function or
curve) to the empirical semivariogram. Abstractly, this is similar to regression analysis, in which a continuous line or
curve is fitted to the data points.
To fit a model to the empirical semivariogram, select a function that serves as your model, for example, a spherical type
that rises and levels off for larger distances beyond a certain range (see the spherical model example below). There are
deviations of the points on the empirical semivariogram from the model; some points are above the model curve, and
some points are below. However, if you add the distance each point is above the line and add the distance each point is
below the line, the two values should be similar. There are many semivariogram models from which to choose.
Semivariogram models
The Kriging tool provides the following functions from which to choose for modeling the empirical semivariogram:
Circular, Spherical, Exponential, Gaussian, Linear
The selected model influences the prediction of the unknown values, particularly when the shape of the curve near the
origin differs significantly. The steeper the curve near the origin, the more influence the closest neighbors will have on the
prediction. As a result, the output surface will be less smooth. Each model is designed to fit different types of phenomena
more accurately.
The diagrams below show two common models and identify how the functions differ.
A spherical model example
This model shows a progressive decrease of spatial autocorrelation (equivalently,
an increase of semivariance) until some distance, beyond which autocorrelation is
zero. The spherical model is one of the most commonly used models.

An exponential model example


This model is applied when spatial autocorrelation decreases exponentially with
increasing distance. Here, the autocorrelation disappears completely only at an
infinite distance. The exponential model is also a commonly used model. The choice
of which model to use is based on the spatial autocorrelation of the data and on prior
knowledge of the phenomenon.
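These two models can be written directly from their standard textbook forms; a sketch, with the nugget, partial sill, and range as described in the next section, and the exponential written in terms of its practical range (the distance at which it reaches about 95 percent of the sill):

```python
import numpy as np

def spherical(h, nugget, psill, rng_):
    """Spherical model: rises, then levels off exactly at the range rng_."""
    h = np.asarray(h, dtype=float)
    rising = nugget + psill * (1.5 * (h / rng_) - 0.5 * (h / rng_) ** 3)
    return np.where(h < rng_, rising, nugget + psill)

def exponential(h, nugget, psill, rng_):
    """Exponential model: approaches the sill only asymptotically."""
    h = np.asarray(h, dtype=float)
    return nugget + psill * (1.0 - np.exp(-3.0 * h / rng_))
```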

Understanding a semivariogram: range, sill, and nugget


As previously discussed, the semivariogram depicts the spatial autocorrelation of the measured sample points. Because of
a basic principle of geography (things that are closer are more alike), measured points that are close will generally have a
smaller difference squared than those farther apart. Once each pair of locations is plotted after being binned, a model is fit
through them. Range, sill, and nugget are commonly used to describe these models.
Range and sill

When you look at the model of a semivariogram, you will notice that at a certain
distance the model levels out. The distance where the model first flattens is
known as the range. Sample locations separated by distances closer than the
range are spatially autocorrelated, whereas locations farther apart than the range
are not.
The value at which the semivariogram model attains the range (the value on the
y-axis) is called the sill. A partial sill is the sill minus the nugget. The nugget is
described in the following section.
Nugget
Theoretically, at zero separation distance (for example, lag = 0), the semivariogram value is 0. However, at an infinitely
small separation distance, the semivariogram often exhibits a nugget effect, which is a value greater than 0. If the
semivariogram model intercepts the y-axis at 2, then the nugget is 2.
The nugget effect can be attributed to measurement errors or spatial sources of variation at distances smaller than the
sampling interval (or both). Measurement error occurs because of the error inherent in measuring devices. Natural
phenomena can vary spatially over a range of scales. Variation at microscales smaller than the sampling distances will
appear as part of the nugget effect. Before collecting data, it is important to gain an understanding of the scales of spatial
variation in which you are interested.
Making a prediction
After you have uncovered the dependence or autocorrelation in your data (see the Variography section above) and have
finished with the first use of the data (using the spatial information in the data to compute distances and model the spatial
autocorrelation), you can make a prediction using the fitted model. Thereafter, the empirical semivariogram is set aside.
You can now use the data to make predictions. Like IDW interpolation, kriging forms weights from surrounding measured
values to predict unmeasured locations. As with IDW interpolation, the measured values closest to the unmeasured
locations have the most influence. However, the kriging weights for the surrounding measured points are more
sophisticated than those of IDW. IDW uses a simple algorithm based on distance, but kriging weights come from a
semivariogram that was developed by looking at the spatial nature of the data. To create a continuous surface of the
phenomenon, predictions are made for each location, or cell center, in the study area based on the semivariogram and the
spatial arrangement of measured values that are nearby.
Kriging methods
There are two kriging methods: ordinary and universal.
Ordinary kriging is the most general and widely used of the kriging methods and is the default. It assumes the constant
mean is unknown. This is a reasonable assumption unless there is a scientific reason to reject it.
Universal kriging assumes that there is an overriding trend in the data (for example, a prevailing wind) that can be
modeled by a deterministic function, such as a polynomial. This polynomial is subtracted from the original measured points, and
the autocorrelation is modeled from the random errors. Once the model is fit to the random errors and before making a
prediction, the polynomial is added back to the predictions to give meaningful results. Universal kriging should only be
used when you know there is a trend in your data and you can give a scientific justification to describe it.
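A sketch of the trend-removal step in universal kriging, assuming a first-order (planar) polynomial; the residuals would then go through the ordinary kriging steps above, with the trend added back at each prediction location:

```python
import numpy as np

def remove_planar_trend(xy, values):
    """Fit and subtract a first-order trend z = a + b*x + c*y."""
    A = np.column_stack([np.ones(len(xy)), xy[:, 0], xy[:, 1]])
    coef, *_ = np.linalg.lstsq(A, values, rcond=None)
    residuals = values - A @ coef
    return coef, residuals

# At a prediction location (x0, y0), the final estimate is
#   (a + b*x0 + c*y0) + kriged_residual
```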
Kriging
Kriging is a stochastic, local interpolation technique that uses information about the spatial structure of the attribute of
interest (i.e., the information contained in the sample points) to estimate the value of that attribute at unknown locations.
Kriging is very similar to inverse distance weighting in that it also uses a weighted average of sample points to estimate
values at unknown points; the main difference between the two methods lies in how those weights are specified. In IDW,
the cartographer arbitrarily specifies a neighborhood of points that should influence the estimation, as well as the strength
of the distance effect (i.e., the similarity of points at a given distance). In kriging, however, we use statistics to decide on a
set of weights that will be most likely to correctly predict the unknown values (i.e., we produce a statistically optimal set
of weights).
Kriging assumes that the variation in a surface can be broken down into three main components: a drift or overall trend,
local spatial autocorrelation (i.e., that points that are close together are more likely to have similar values), and random
stochastic variation (i.e., noise or measurement error). The drift can be estimated with a mathematical function that
approximates the trend in the surface. Here, we will focus on understanding how kriging deals with the other two
components of spatial variation: local spatial autocorrelation and random stochastic variation.
The first step in kriging is to use the sample data to describe the spatial variation in the surface. We can do this by
considering the concept of the semivariance, which is a statistical measure of how much variation there is in the attribute
we are interested in when two points are separated by a particular distance (see Figure 6.cg.29, below). Typically, in any
given dataset, we might only have one set of points that is a particular distance apart from each other. Unfortunately,
statistically, this is not the best scenario, as we can't be sure that the semivariance for that set of points is typical of all of
the potential sets of points that are that distance apart (when we consider both samples and estimated points). As with all
statistics, the more samples we have, the more sure we can be about the predictions we make based on those samples. So
we can use the concept of a distance lag to group sets of points that are similar (but not exactly the same) distances apart
and calculate an average semivariance from those points.

Figure 6.cg.29 This graphic (known as a semivariogram) plots the semivariance of sample point pairs against the
distances between each of the pairs. Typically, the semivariance is low at short distances and increases with increasing
distance. However, at some point, there is a leveling off of semivariance with increasing distances (i.e., there is some
distance after which increasing distances do not result in increasing semivariances).
Once we have a description of the spatial structure of the sample data (in the form of a semivariogram), we can look for a
mathematical function (i.e., equation) that best fits those points (i.e., one that minimizes the distance between all of the
sample points and the line described by the function). One common type of equation that is fit to semivariograms is a
spherical function (see Figure 6.cg.30, below). We can extract three important pieces of information from this spherical
equation: the nugget, the sill and the range. The nugget is the value at which the function meets the y-axis; this value is
usually not at the origin (i.e., 0,0 point) of the graph. We can interpret this value as a measure of the amount of random
stochastic variation (i.e., noise) present in the data sample. This makes intuitive sense when you consider that points that
are in the same location (i.e., no distance apart) should have the same value. In practice, if you make repeated
measurements at the same place, you may not get equal values if there is some measurement noise. The second important
value is called the sill, which is the highest level of semivariance in the data set. A final important value is the range,
which is the distance at which the semivariance stops increasing. In other words, at distances that are greater than the
range, points are unlikely to be similar; Tobler's law is no longer working at these distances.
Figure 6.cg.30 In this graphic, red points are used to plot individual sample pairs, while blue points are used to represent a
summary of the semivariance of all points within a particular lag. The black line is a spherical function that is fit to the
blue points. The size of the lags is shown here by the tick marks on the x-axis. We have also indicated three important
quantities that can be derived from the semivariogram: the nugget, the range and the sill.

Once we have fit an equation to the semivariogram, we can use it to calculate weights for estimating the values of our
attribute of interest at unknown locations. We accomplish this by first simply measuring the distance between sample
points and the location of the value we want to predict and reading the semivariance that the graph predicts for that
distance. Then, these semivariances are used to solve a series of linear equations whose weights will produce an
interpolation that minimizes the amount of error in the predicted values. Although the mathematics of these equations are
beyond the scope of this lesson, the end result is a set of weights that is then used along with the values of the sample
points to predict the values of unknown locations. Although kriging is conceptually more difficult to understand than
inverse distance weighting, it allows the structure of the data to influence the weighting of sample points on the
interpolation rather than the arbitrary decisions made by the cartographer in inverse distance weighting.
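For the curious, the linear system mentioned above has a compact form in ordinary kriging; a minimal sketch, assuming a fitted semivariogram function such as the spherical model sketched earlier:

```python
import numpy as np

def ordinary_kriging_weights(xy, x0, model):
    """Solve the ordinary kriging system for the weights at location x0.

    model(h) is the fitted semivariogram. The bordering row and column of
    ones force the weights to sum to 1; the extra unknown is the Lagrange
    multiplier that accounts for the unknown constant mean.
    """
    n = len(xy)
    pairwise = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = model(pairwise)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = model(np.linalg.norm(xy - x0, axis=1))
    solution = np.linalg.solve(A, b)
    return solution[:n]  # the prediction is then weights @ sample_values
```

The predicted value is then the weighted sum of the sample values, exactly as in IDW, but with weights that reflect the fitted spatial structure rather than distance alone.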

Figure 6.cg.31 Notice the difference between the patterns in maps created using inverse distance weighting (top) and
kriging (bottom). Both maps were created using the same number of neighboring points. Although the general pattern of
highs and lows is quite similar, the surface that results from the inverse distance weighting method is much more regular
than the kriged surface (i.e., the contours are more circular).
