homologous perimeter-lines, the average displacement of these and the oscillations provoked by crossing buffers (Fig. 1a).

This paper builds on our earlier work (Ruiz-Lendínez et al., 2013), and its main contribution is to develop a point-based methodology for the automatic assessment of the positional and geometric components of spatial data. To achieve this, and on the basis of the set of homologous polygons obtained with the methodology developed in Ruiz-Lendínez et al. (2013), we first identified and computed homologous points between these polygons using a metric for comparing polygonal shapes defined by Arkin et al. (1991), and then applied a point-based standard developed for assessing the positional accuracy of spatial data. This positional accuracy is described by means of a statistical evaluation of random and systematic errors and is specified by means of the root mean squared error (RMSE) or by the mean value of errors (μ) and their standard deviation (σ) (Ariza-López and Atkinson-Gordo, 2008a; Li et al., 2012) (Fig. 1b). Specifically, we applied the National Standard for Spatial Data Accuracy (NSSDA) developed by the Federal Geographic Data Committee (FGDC) in 1998 (FGDC, 1998).

This new perspective of the problem has two main objectives: (i) to increase significantly the number of entities (points) used in the assessment process and (ii) to achieve greater efficiency than that obtained with traditional point-based methods, reducing the final cost of the positional assessment process.

This paper is organized in five main sections. As a reminder, the following section describes the main characteristics of the official cartographic databases and samples selected for our approach. After this, we describe our positional accuracy assessment methodology in separate subsections. The fourth section shows the experimental results obtained. Finally, conclusions are presented in section five.

Official cartographic databases used
We have used the same two official cartographic databases as in our previous work (see Ruiz-Lendínez et al., 2013). Thus, we have used the BCN25 as the tested source and the MTA10 as the reference source [two official cartographic databases in Andalusia (southern Spain)]. A detailed description of them, a justification for their choice and the conditions that they must meet can be found in the above-mentioned work. On this last point, we must highlight that both GDB have been independently produced and that neither of them has been derived from another cartographic product of a larger scale through any process. This implies that neither has undergone any action involving a degradation of its quality. In addition, both databases must be interoperable, which means that we must ensure that they are comparable in terms of both reference system and cartographic projection (Ruiz-Lendínez et al., 2011).

In the same way, the above-mentioned work provides a justification for the selection of buildings as the polygonal features used to assess the positional accuracy.
Fig. 1 Accuracy assessment methods: (a) perimeter line-based positional accuracy assessment methods; (b) point-based positional accuracy assessment method
As a reminder, we must note that the four urban areas selected for our work are included in two sheets (numbers 0985 and 1009) of the MTN50k (National Topographic Map of Spain at scale 1:50 000). More specifically, we have used the following urban areas: Santa Fe (1009-03 sheet), Granada (1009-04 sheet), Carmona (0985-02 sheet) and Mairena del Alcor (0985-03 sheet). To locate these urban areas, the reader is referred to the web version of Ruiz-Lendínez et al. (2013).

Methodology

Comparing polygonal shapes and extracting homologous points
The workflow for the proposed methodology is shown in Fig. 2. As mentioned above, the matching methodology which determines the set of homologous polygons between both GDB (BCN25 and MTA10) was addressed in Ruiz-Lendínez et al. (2013). For this reason, this issue is not discussed here. Thus, the starting point of our current approach is the set of polygons matched by the GA with a match accuracy value (MAV) high enough to ensure that two previously matched polygons are sufficiently similar to be considered homologous [the calculation of this threshold value of MAV was also addressed in Ruiz-Lendínez et al. (2013)]. In order to extract homologous points between these pairs of polygons, a metric derived from the method defined by Arkin et al. (1991) for comparing two polygonal shapes has been used.

The most intuitive and easiest method of representing any 2D polygon is to describe its boundary by giving a list of the coordinates of its vertices. However, following Arkin et al. (1991), an alternative representation of the boundary of a 2D polygon denoted by A can be obtained by its turning function $\theta_A(s)$. This function measures the angle of the counter-clockwise tangent as a function of the arc-length $s$, measured from some reference point 0 on A's boundary. Thus, $\theta_A(0)$ is the angle $v$ that the tangent at the reference point 0 makes with some reference orientation associated with the polygon (such as the x-axis). $\theta_A(s)$ keeps track of the turning that takes place, increasing with left-hand turns and decreasing with right-hand turns. In addition, if each polygon is rescaled so that the total perimeter length is 1, then $\theta_A$ is a function from $(0, 1)$ to $(-4\pi, 4\pi)$ (Fig. 3).

The domain of $\theta_A(s)$ can be extended to the entire real line in a natural way by allowing angles to continue to accumulate as we continue around the perimeter of the polygon A. Thus, for a simple closed polygon, $\theta_A(s + 1) = \theta_A(s) + 2\pi$ for all $s$. In addition, the function $\theta_A(s)$ has special properties which make it especially suitable for identifying and extracting homologous points between two polygons. It is piecewise-constant for polygons, making computations particularly easy and fast. By definition, the function $\theta_A(s)$ is invariant under translation and scaling of the polygon A. Rotation of A corresponds to a simple shift of $\theta_A(s)$ in the $\theta$ direction.

On the other hand, the degree to which two polygons A and B are similar can be measured by taking the distance function $D$ between their turning functions $\theta_A(s)$ and $\theta_B(s)$. If we assume that A and B are two matched polygons and that the reference point 0 is the same for both, then $D$ can be defined as

$$D = \int_0^1 \left(\theta_A(s) - \theta_B(s)\right)\,ds \qquad (1)$$

The $D$ value between the turning functions $\theta_A(s)$ and $\theta_B(s)$ has allowed us to establish an additional criterion (together with the MAV provided by the GA, see Newby, 1992) for determining whether two previously matched polygons A and B are sufficiently similar to be considered homologous and can therefore be used to compute homologous points. The integral (1) can be computed by adding up the value of the integral within each strip defined by a consecutive pair of discontinuities in $\theta_A(s)$ and $\theta_B(s)$ (see Fig. 4). These discontinuities, in turn, are defined by length shifts (LS) (variations on the $s$ axis owing to length changes of the polygon sides) and angular shifts (AS) (variations on the $\theta$ axis owing to direction changes of the polygon sides) of the turning functions.
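The strip-by-strip evaluation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `turning_function` and `turning_distance` are hypothetical, both polygons are assumed to be simple, listed counter-clockwise and to start at the same (homologous) reference vertex, and the sketch integrates the absolute difference in each strip so that positive and negative differences do not cancel, which is an assumption on top of the printed integral.

```python
import math
from bisect import bisect_right


def turning_function(vertices):
    """Piecewise-constant turning function of a simple polygon.

    vertices: list of (x, y) in counter-clockwise order; the first vertex
    is taken as the reference point (an assumption of this sketch).
    Returns (starts, angles): starts[k] is the normalised arc length at
    which edge k begins and angles[k] is the accumulated tangent angle.
    """
    n = len(vertices)
    edges = [(vertices[(i + 1) % n][0] - vertices[i][0],
              vertices[(i + 1) % n][1] - vertices[i][1]) for i in range(n)]
    lengths = [math.hypot(dx, dy) for dx, dy in edges]
    perimeter = sum(lengths)

    # Angle of the first edge with the x-axis, then accumulate signed turns
    # (left turns are positive, right turns are negative).
    angles = [math.atan2(edges[0][1], edges[0][0])]
    for prev, cur in zip(edges, edges[1:]):
        cross = prev[0] * cur[1] - prev[1] * cur[0]
        dot = prev[0] * cur[0] + prev[1] * cur[1]
        angles.append(angles[-1] + math.atan2(cross, dot))

    # Rescale so that the total perimeter length is 1.
    starts = [0.0]
    for length in lengths[:-1]:
        starts.append(starts[-1] + length / perimeter)
    return starts, angles


def turning_distance(tf_a, tf_b):
    """Sum |theta_A - theta_B| * strip width over all strips in [0, 1]."""
    (starts_a, angles_a), (starts_b, angles_b) = tf_a, tf_b
    cuts = sorted(set(starts_a) | set(starts_b) | {1.0})
    total = 0.0
    for left, right in zip(cuts, cuts[1:]):
        mid = 0.5 * (left + right)  # any point inside the strip will do
        a = angles_a[bisect_right(starts_a, mid) - 1]
        b = angles_b[bisect_right(starts_b, mid) - 1]
        total += abs(a - b) * (right - left)
    return total


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
rectangle = [(0, 0), (2, 0), (2, 1), (0, 1)]
d = turning_distance(turning_function(square), turning_function(rectangle))
```

Because the perimeter is normalised and only turning angles are accumulated, a translated or uniformly scaled copy of a polygon yields the same turning function, which matches the invariance properties stated in the text.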
In addition, if the turning functions of two homologous polygons (A and B) are overlapped (Fig. 5), each discontinuity (defined by a pair of points on their graphic representations) represents a vertex¹ of the first polygon (V1) that can be matched with another vertex of the second one (V′1) (homologous points). The distance between this pair of vertices is denoted by DV1. Finally, in order to avoid matching non-homologous points, two threshold values were fixed for the length and angular shifts (denoted by TLS and TAS, respectively) (the method used to determine these values is presented in the Results section). Thus, only when these shifts are lower than the threshold values
previously defined, the matching between these points will be achieved. In addition, the possible residual anomalous values of the distance DVi(A, B), computed for each pair of points (i and i′) that belong to homologous polygons (A and B), were removed by computing two additional parameters: the average value of DVi(A, B) (μDV(A,B)) and the RMSE of DVi(A, B) (RMSEDV(A,B)). These threshold values also acted as control parameters to prevent the acceptance of unpaired points derived from the difference in the complexity of the shape of polygons owing to a possibly different level of generalization and scale of representation. Figure 6 shows a pair of homologous polygons. Polygon A, belonging to MTA10 (the larger-scale GDB), is more complex and has a more detailed shape than its counterpart B, belonging to BCN25 (the GDB with a smaller scale of representation and a higher level of generalization). In this polygon A,

¹ The terms 'vertex' and 'point' are used interchangeably throughout the text to refer to punctual entities belonging to more complex geometric entities (polygons).

Fig. 2 Accuracy assessment process flow

Fig. 4 Strips formed by the functions θA(s) and θB(s) (Arkin et al., 1991)

Sampling procedure
With many pairs of homologous points available, the next step was to establish the sample design criteria. Following Ariza-López and Atkinson-Gordo (2008a), one of the most controversial aspects of all point-based positional accuracy assessment standards is the number and distribution of control elements. Therefore, to obtain an adequate representation for the assessment, a sample with a statistical basis is necessary. In this sense, two issues were raised: (i) the sampling procedure, and (ii) the size of the sample.

With regard to the sampling procedure used, and according to the Standard's recommendations (FGDC, 1998), a grid sampling pattern was defined (covering all the terrain) in which samples (pairs of homologous points) were randomly collected within each generated cell. The number of cells and their size depend on the geographical area covered and the size of the sample. In addition, in order to avoid biasing the final results, only one point per polygon was used. This matters for a polygon which occupies part of two or more cells of the sampling grid (Fig. 7). Thus, if we select the vertex V1 belonging to the highlighted polygon and included in the cell D4 to calculate the positional standard, we may not use the
vertex V2 (included in the cell E4) because it belongs to the same polygon. In this case, we must use another vertex V3 belonging to a different polygon.

Fig. 6 Unpaired points derived from the difference in the complexity of the shape of polygons

With regard to the size of the sample, the number of points should be enough to ensure, with a given level of confidence, that a GDB with a non-acceptable quality level will not be acquired, and it should always be large enough for the hypothesis of normality to be fulfilled, this being determined by the laws of large numbers in statistics. For these reasons standards always suggest at least 20 points (FGDC, 1998). Nevertheless, this number seems to be very small and some authors, such as Li (1991) and Newby (1992), suggest larger numbers. In this sense, we highlight the work of Ariza-López and Atkinson-Gordo (2008a), who suggest a sample size of 100 points when applying the NSSDA in order to obtain a 95% confidence level on the estimation and variability within a range of ±5%. Following these authors, for the minimum proposed sample size (n = 20 points) the variability of results is on the order of ±10–11%, which means an approximate variability of the confidence level of 90%.

Finally, the last step of our methodology was the application of the NSSDA standard. The NSSDA standard implements a statistical and testing methodology for estimating the positional accuracy of a GDB by means of previously selected points. The NSSDA is a compulsory standard for Federal Agencies of the USA producing analogue and/or digital cartographic data and is ever more widely used throughout the world (Ariza-López and Atkinson-Gordo, 2008a). The main reason for choosing the NSSDA as an accuracy assessment measure was that it gives results in a more open way than the previously developed tests, because it leaves it to the user to judge whether or not the derived accuracy meets expectations, that is, in practical terms, whether the product passes or fails the user's accuracy expectations. So acceptance or rejection is the responsibility of the user. However, the test only tells us: 'the product has been checked/compiled for N meters of horizontal/vertical accuracy at 95% of level of confidence' (FGDC, 1998). Table 1 summarizes the steps for applying the standard.

Results
The results obtained by applying the proposed point-based positional control methodology to the described GDB are shown below. The following subsections explain the setting of the thresholds, report the results of the extraction of homologous points and present the results provided by the NSSDA standard.
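As an aside on the Sampling procedure subsection, the grid-based rule described there (one randomly chosen pair of homologous points per cell, and never more than one point per polygon) can be sketched as follows. This is only an illustration under stated assumptions: the function name `grid_sample`, the `(polygon_id, x, y)` tuple layout and the square cells of side `cell_size` are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict


def grid_sample(pairs, cell_size, seed=None):
    """Pick at most one homologous point pair per grid cell and per polygon.

    pairs: iterable of (polygon_id, x, y), where (x, y) locates the pair
    (hypothetical layout chosen for this sketch).
    cell_size: side length of the square sampling cells, in map units.
    """
    rng = random.Random(seed)

    # Group candidate pairs by the grid cell that contains them.
    cells = defaultdict(list)
    for pair in pairs:
        _, x, y = pair
        cells[(int(x // cell_size), int(y // cell_size))].append(pair)

    used_polygons = set()
    sample = []
    for key in sorted(cells):
        # Discard pairs whose polygon already contributed a point elsewhere.
        candidates = [p for p in cells[key] if p[0] not in used_polygons]
        if not candidates:
            continue
        chosen = rng.choice(candidates)
        used_polygons.add(chosen[0])
        sample.append(chosen)
    return sample


pairs = [("A", 1.0, 1.0), ("A", 11.0, 1.0), ("B", 12.0, 2.0)]
sample = grid_sample(pairs, cell_size=10.0, seed=0)
```

In this toy input, polygon A spans two cells, so once one of its vertices is selected in the first cell, only polygon B remains eligible in the second cell. Visiting cells in a fixed order is a simplification; a production version might shuffle the cell order as well.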
Table 1 Summary of the NSSDA when applied to the horizontal component
1 – Select a sample of a minimum of 20 check points ($n \geq 20$).
2 – Compute individual errors for each point $i$: $e_{x_i} = x_{10k,i} - x_{25k,i}$, $e_{y_i} = y_{10k,i} - y_{25k,i}$
3 – Compute RMSE for each component: $\mathrm{RMSE}_X = \sqrt{\dfrac{\sum e_{x_i}^2}{n}}$, $\mathrm{RMSE}_Y = \sqrt{\dfrac{\sum e_{y_i}^2}{n}}$
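The computation summarized in Table 1 can be sketched in Python as shown below. The per-point errors and per-component RMSE follow the table; the last two quantities (the radial RMSE and the 95% horizontal accuracy obtained as 2.4477 × 0.5 × (RMSE_X + RMSE_Y) when the ratio of the smaller to the larger RMSE lies between 0.6 and 1.0) are taken from the FGDC (1998) standard itself rather than from the table as reproduced here. The function name `nssda_horizontal` and the dictionary keys are hypothetical.

```python
import math


def nssda_horizontal(reference, tested):
    """NSSDA horizontal statistics for paired check points.

    reference, tested: equally long lists of (x, y) coordinates, e.g. the
    MTA10 (reference) and BCN25 (tested) positions of homologous points.
    """
    if len(reference) != len(tested):
        raise ValueError("reference and tested must have the same length")
    n = len(reference)
    if n < 20:
        raise ValueError("the NSSDA asks for at least 20 check points")

    # Individual errors per point (reference minus tested, as in Table 1).
    ex = [r[0] - t[0] for r, t in zip(reference, tested)]
    ey = [r[1] - t[1] for r, t in zip(reference, tested)]

    rmse_x = math.sqrt(sum(e * e for e in ex) / n)
    rmse_y = math.sqrt(sum(e * e for e in ey) / n)
    rmse_r = math.sqrt(rmse_x ** 2 + rmse_y ** 2)

    # 95% horizontal accuracy; the FGDC approximation is only stated for
    # RMSE_min / RMSE_max between 0.6 and 1.0, so return None otherwise.
    largest = max(rmse_x, rmse_y)
    if largest == 0.0:
        accuracy = 0.0
    elif min(rmse_x, rmse_y) / largest >= 0.6:
        accuracy = 2.4477 * 0.5 * (rmse_x + rmse_y)
    else:
        accuracy = None

    return {"rmse_x": rmse_x, "rmse_y": rmse_y,
            "rmse_r": rmse_r, "accuracy_95": accuracy}


tested = [(float(i), float(i)) for i in range(20)]
reference = [(x + 3.0, y + 4.0) for x, y in tested]
stats = nssda_horizontal(reference, tested)
```

When RMSE_X equals RMSE_Y, the expression used here reduces to the familiar 1.7308 × RMSE_r form of the standard, so a single branch covers both cases.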
Table 2 Per cent distribution of matched points according to the TAS and TLS values
Table 3 Per cent distribution of matched points for each urban area
0985 Mairena del Alcor 847/875 4421 6179 452 10.22 7.31
0985 Carmona 851/870 5276 6606 579 10.97 8.76
1009 Santa Fe 649/670 2696 4543 352 13.05 7.74
1009 Granada 2250/2301 8999 16144 3378 37.53 20.92
greater statistical significance of the results. In addition, the normal distribution of positional errors was checked for each sample (40). All cases show homogeneous results with mean accuracy values that range from 9.7 to 16.3 m (confidence level = 90%) and from 9.5 to 15.9 m (confidence level = 95%). In addition, the mean accuracy values present a clear decreasing tendency when the sample size increases. The same happens with the mean deviation values. Therefore, these results agree with one of the main conclusions derived from the work of Ariza-López and Atkinson-Gordo (2008a): the NSSDA has a slight tendency to underestimate accuracy when the minimum proposed sample size (n = 20 points) is used.

Finally, these results are very close to those obtained using line-based methods (specifically when the SBOM is used) for this same set of GDB (see Ruiz-Lendínez et al., 2013). Table 5 shows a comparison between the values of uncertainty for a 95% level of confidence achieved by the SBOM, and the values provided by the NSSDA standard for this same level of confidence. In this sense, the Carmona case again shows better results than the other cases, achieving uncertainty values lower than those achieved by the rest of the cases.

Computation time
An important advantage of applying the metrics and operations described above to assess the positional accuracy of GDBs is the low computational time required compared to traditional methodologies (especially when applied to a large number of GDB, and if we consider field work for GPS data acquisition). This is particularly evident when the efficiency of our approach is measured by means of the ratio (number of points/minutes) achieved: (25/60) when applying this methodology manually; and (452/1.5), (579/1.8), (352/1.2) and (3378/11.2) when applying it automatically for the four cases presented: Mairena del Alcor, Carmona, Santa Fe and Granada, respectively. Logically, the efficiency of our approach depends on the size of the GDBs employed (number of polygons of both datasets, tested GDB and reference GDB, and total number of vertices).

Finally, we must note that our test platform was an Intel Core i5 2.4 GHz processor with 4 GB memory and the development tool was Microsoft Visual Studio 2010.

Conclusions
This study provides an efficient methodology for automating the positional accuracy assessment of spatial data using point features. To this end, the use of the Arkin metric for comparing polygonal shapes has given us the capability to obtain, in an unattended way, a large number of well-matched points for each polygon. Thus, although our procedure for identifying homologous points can also produce unmatched points, the number of homologous points obtained will always be higher than that obtained by using field samples. In addition, the effect of non-random sampling has been removed. That way, the probability of obtaining a better representation of the errors is greater than when using field samples.

On the other hand, the experimental results have proven the feasibility of the proposed method, which provides the following advantages: (i) it overcomes one of the main problems in positional quality assessment processes, namely the high cost of traditional methodologies in obtaining control points, e.g. GPS data acquisition; and (ii) it requires low computational time compared to traditional methodologies (especially when applied to a large number of GDB, and if we consider field work for GPS data acquisition).

However, one of the limitations of this research is the effect of the accuracy of the reference data. While in our case the positional accuracy of the reference data is significantly better than that of the data being evaluated, improved accuracy of reference data may provide more robust results. For this reason, in future studies we plan to extend our work to different map scales.

Acknowledgement
This work has been partially funded by the Ministry of Science and Technology of Spain under Grant No. BIA2011-23217 and by the Regional Government of Andalusia (Spain). The authors also acknowledge the Regional Government of Andalusia (Spain) for the financial support since 1997 for their research group (Ingeniería Cartográfica) with code PAIDI-TEP-164.

References
Ariza-López, F. and Atkinson-Gordo, A. 2008a. Variability of NSSDA estimations. Journal of Surveying Engineering, 134(2), pp.39–44.
Ariza-López, F. and Atkinson-Gordo, A. 2008b. Analysis of some positional accuracy assessment methodologies. Journal of Surveying Engineering, 134(2), pp.404–7.
Arkin, E. M., Chew, L. P., Huttenlocher, D. P., Kedem, K. and Mitchell, J. S. B. 1991. An efficiently computable metric for comparing polygonal shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(3), pp.209–16.
FGDC. 1998. Geospatial positioning accuracy standards. Part 3: national standard for spatial data accuracy. Available at: <http://www.fgdc.gov/standards/projects/FGDC-standards-projects/accuracy/part3/chapter3> [Accessed 1 November 2014].
Forrest, S., Javornik, B., Smith, R. E. and Perelson, A. S. 1993. Using genetic algorithms to explore pattern recognition in the immune system. Evolutionary Computation, 1, pp.191–212.
Goodchild, M. and Hunter, G. 1997. A simple positional accuracy measure for linear features. International Journal of Geographical Information Science, 11(3), pp.299–306.
Greenwalt, C. and Shultz, M. 1962. Principles of error theory and cartographic applications. Technical Report – 96. St Louis, USA: ACIC.
Herrera, F., Lozano, M. and Verdegay, J. 1998. Tackling real-coded genetic algorithms: operators and tools for behavioural analysis. Artificial Intelligence Review, 12, pp.265–319.
Hunter, G. and Goodchild, M. 1996. A new model for handling vector data uncertainty in geographic information systems. Journal of the Urban and Regional Information Systems Association, 8(1), pp.51–7.
ISO. 2002. ISO 19113: geographic information – quality principles. Geneva, Switzerland: International Organization for Standardization.
ISO. 2013. ISO 19157: geographic information – data quality. Geneva, Switzerland: International Organization for Standardization, pp.146.
Li, D., Zhang, J. and Wu, H. 2012. Spatial data quality and beyond. International Journal of Geographical Information Science, 26(12), pp.2277–90.
Li, Z. 1991. Effects of check points on the reliability of DTM accuracy estimates obtained from experimental test. Photogrammetric Engineering and Remote Sensing, 57(10), pp.1333–40.
Myers, E. and Hancock, E. 2001. Least-commitment graph matching with genetic algorithms. Pattern Recognition, 34, pp.375–94.
Newby, P. 1992. Quality management for surveying, photogrammetry and digital mapping at the ordnance survey. Photogrammetric Record, 79(14), pp.45–58.
Ruiz-Lendínez, J., Ariza-López, F. and Ureña-Cámara, M. 2013. Automatic positional accuracy assessment of geospatial databases using line-based methods. Survey Review, 45(332), pp.332–42.
Ruiz-Lendínez, J., Javier Ariza, F., Ureña, M. A. and Blázquez, E. B. 2011. Digital map conflation: a review of the process and a proposal of classification. International Journal of Geographical Information Science, 25(9), pp.1439–66.
Ruiz-Lendínez, J., Ureña-Cámara, M. and Mozas-Calvache, A. 2009. GPS survey of roads networks for the positional quality control of maps. Survey Review, 41(314), pp.374–83.
Tveite, H. and Langaas, S. 1999. An accuracy assessment method for geographical line data sets based on buffering. International Journal of Geographical Information Science, 13, pp.27–47.
Zandbergen, P. 2008. Positional accuracy of spatial data: non-normal distributions and a critique of the national standard for spatial data accuracy. Transactions in GIS, 12(1), pp.103–30.