Helen Freimark
(Specials)
Andreas Wehrle
(Content)
Geodata Management
Content
1. Geodata Management
1.1. Data Import / Export
1.1.1. Data formats
1.1.2. INTERLIS
1.1.3. Geodata Catalogue
1.2. Data Management in a GIS
1.2.1. Spatial Databases
1.2.2. Spatial Queries
1.3. Geodata-Modelling
1.3.1. Three level-structure
1.3.2. Geometric Model
1.3.3. Topological Model
1.3.4. Thematic Model
1.4. Summary
1.5. Recommended Reading
1.6. Glossary
1.7. Bibliography
1. Geodata Management
Learning Objectives
In this lesson you will learn about the possibilities of importing and exporting data. You will see various data formats and standards that facilitate the interchange of data between different GIS software packages. You will learn the differences between various types of database systems. You will also see the three-level structure of the geodata model.
Learn about data formats used in GIS
Learn about special formats for data interchange
Learn about database systems
Learn about the three-level structure of a geodata model
Vector formats
Raster formats
1.1.2. INTERLIS
"INTERLIS - A Data Exchange Mechanism for Land-Information-Systems" was first published in 1991. It is a mechanism that consists of a conceptual description language and a transfer format, and it supports the interchange of geodata between geosystems. Its aim is to define the data model precisely so that applications and transfer interfaces can be derived from it. The motivation is that an optimal digital transfer of structured data is only possible if the data is defined unambiguously and consistently. Since 1993, INTERLIS has been integrated in the directives of the official land register, and since 1998 it has been registered as a Swiss standard. A version adapted to user demands, called INTERLIS 2, was published in 2003; it supports advanced compatibility, graphic definitions and multilingual models. INTERLIS is compatible with other data structures, such as UML, GML and XML, and is also backwards and forwards compatible. INTERLIS is free of royalties and guarantees long-term support and data storage.
The database system is defined by the database model. Depending on the model, the database has to be structured according to certain rules. The most important models are the hierarchical, network, relational and object-oriented models.
Hierarchical Model
The hierarchical data model is the oldest type. It requires a hierarchical structure like a tree, similar to the file structure of common desktop computers. Between two types of data, only 1:n relationships are allowed, and therefore it is inefficient in many cases.
Network Model
This model consists of two fundamental components: records and sets. Records contain the stored data; sets describe the relations between the records in a fixed order. A record can be an owner or a member: one owner can have several members, and one member can have several owners. A set always consists of an owner and a member. The advantage of this model is good performance; the price is that the model is harder to extend, since the relations have to be created explicitly.
Relational Model
In this model, data is stored in multiple tables. The model does not work with explicit relationships but with an identifying value, a so-called key. Every entry has a key by which it is uniquely identified. By means of this key, various tables can be combined. Consequently, data belonging to a certain feature can be associated even though it is not stored in the same table. For example, the following two tables could be part of a database containing a car inventory:
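How a key combines two tables can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names `models` and `cars`, their columns and all rows are hypothetical and are not the tables of the lesson.

```python
import sqlite3

# Hypothetical car-inventory tables; names, columns and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE models (
        model_id INTEGER PRIMARY KEY,   -- key identifying each model uniquely
        name     TEXT NOT NULL,
        doors    INTEGER NOT NULL
    );
    CREATE TABLE cars (
        car_id   INTEGER PRIMARY KEY,   -- key identifying each car uniquely
        colour   TEXT NOT NULL,
        model_id INTEGER NOT NULL REFERENCES models(model_id)
    );
    INSERT INTO models VALUES (1, 'Model A', 3), (2, 'Model B', 5);
    INSERT INTO cars VALUES (10, 'red', 1), (11, 'blue', 2), (12, 'green', 1);
""")


def cars_with_model(connection):
    """Combine both tables through the shared key model_id."""
    query = """
        SELECT cars.car_id, cars.colour, models.name, models.doors
        FROM cars
        JOIN models ON cars.model_id = models.model_id
        ORDER BY cars.car_id
    """
    return connection.execute(query).fetchall()


rows = cars_with_model(conn)
for row in rows:
    print(row)
```

The model data is stored only once in `models`; each car refers to it through `model_id`, so the information about a car is assembled at query time rather than duplicated in every row.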
Common database software (Access, dBase, Oracle) often works with the relational model, and most GIS are also based on it. A system of this kind is commonly abbreviated as RDBMS (relational database management system). SQL (Structured Query Language) is a database language for querying and modifying data in relational databases.
Object-oriented Model
The object-oriented model (ODBMS or OODBMS) addresses a weakness of the relational model, namely scattered data and therefore slow processes, by uniting all the information belonging to a certain feature in one object. Changes in the dataset are thus easier to process, since the data is stored in one place. Furthermore, data belonging to an object can only be modified by using the methods that are defined for this object. At the time of writing, the world's largest database and the highest ingest rate ever recorded were both held by object-oriented databases. Since most database systems are still relational, object-oriented databases lack interoperability.
Object-relational Model
Finally, an object-relational database (ORDBMS) is a relational database with the ability to integrate custom data types and methods. This model grew in the 1990s by extending relational database concepts with object-oriented concepts. In the near future, a large part of geoinformation systems is expected to be based on an object-relational database system. At present, only a few well-developed ORDBMS are available; the best known is perhaps PostgreSQL.
The previous query is a common operation that can be carried out in any database. In a database containing spatial data, a similar query could be the selection of every conifer in a forest that has a trunk thinner than 50 centimeters. This is a normal query of attribute data, which means that the location, or in other words the geometry, of the trees does not affect the query (assuming that the trees are stored as point elements and not as polygons). As said before, this is a typical GIS query: since the data has a spatial reference (the coordinates of the trees), the selected trees can be marked on a map. However, if the result does not need to be visualised, the same query could be carried out in a common database without a spatial reference. Queries of geometric data, in contrast, are a specialty of GIS; data without a spatial reference cannot be queried in this way. A typical query is the selection of every tree that is not further than 200 meters from a street. In this case, a buffer is created by offsetting the street by 200 meters on both sides, and every tree inside this buffer is selected. This example looks as follows:
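The difference between the two kinds of query can be sketched in a few lines of Python. The sketch assumes trees stored as points with planar coordinates in meters and a street stored as a polyline with at least two vertices; the class `Tree`, the function names and all sample values are hypothetical. Instead of constructing the buffer polygon, it tests the equivalent condition that the distance from the tree to the street is at most 200 meters.

```python
import math
from dataclasses import dataclass


@dataclass
class Tree:
    x: float         # easting in meters
    y: float         # northing in meters
    kind: str        # e.g. "conifer" or "broadleaf"
    trunk_cm: float  # trunk diameter in centimeters


def distance_to_segment(px, py, ax, ay, bx, by):
    """Shortest distance from point P to the segment A-B."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project P onto the line through A and B, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def thin_conifers(trees, max_trunk_cm=50.0):
    """Attribute query: the geometry of the trees plays no role."""
    return [t for t in trees if t.kind == "conifer" and t.trunk_cm < max_trunk_cm]


def trees_near_street(trees, street, max_dist=200.0):
    """Geometric query: trees within max_dist of a polyline street."""
    segments = list(zip(street, street[1:]))
    selected = []
    for t in trees:
        nearest = min(distance_to_segment(t.x, t.y, *a, *b) for a, b in segments)
        if nearest <= max_dist:
            selected.append(t)
    return selected


# Invented sample data.
trees = [
    Tree(0.0, 150.0, "conifer", 35.0),
    Tree(500.0, 250.0, "conifer", 60.0),
    Tree(1200.0, 100.0, "broadleaf", 40.0),
]
street = [(0.0, 0.0), (1000.0, 0.0)]

print(thin_conifers(trees))
print(trees_near_street(trees, street))
```

The first function would work just as well on a table without coordinates, whereas the second one cannot be answered without the geometry of both the trees and the street.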
1.3. Geodata-Modelling
A geodata-model is normally divided into three parts. You will learn the parts of this division and see the advantages that arise from this structure.
Primitive instancing
Every object is characterised by a fixed number of parameters, e.g. length l, width w, height h, depth d and radius r. The objects are therefore standardised and well defined.
Cell decomposition
This method builds a bigger and more complex object by assembling small, simple bodies. It may be compared to a construction kit, which allows the construction of any object from basic shapes.
Constructive Solid Geometry
In this case, an object is defined as a combination of primitive objects. By means of permitted operations (Boolean algebra, such as union, intersection and difference), the primitive objects may be combined with each other. The method is frequently used in CAD systems, but so far only rarely in GIS.
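The Boolean combination of primitives can be illustrated with a minimal Python sketch. It assumes that a solid is represented only by a point-membership test ("is this point inside?"); the class `Solid` and the helpers `box` and `sphere` are hypothetical names, and real CAD systems use far richer representations.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class Solid:
    """A solid described only by a point-membership test (an assumption of this sketch)."""
    contains: Callable[[Point], bool]

    def union(self, other: "Solid") -> "Solid":
        return Solid(lambda p: self.contains(p) or other.contains(p))

    def intersection(self, other: "Solid") -> "Solid":
        return Solid(lambda p: self.contains(p) and other.contains(p))

    def difference(self, other: "Solid") -> "Solid":
        return Solid(lambda p: self.contains(p) and not other.contains(p))


def box(xmin, ymin, zmin, xmax, ymax, zmax) -> Solid:
    """Axis-aligned box primitive."""
    return Solid(lambda p: xmin <= p[0] <= xmax
                 and ymin <= p[1] <= ymax
                 and zmin <= p[2] <= zmax)


def sphere(cx, cy, cz, r) -> Solid:
    """Sphere primitive with centre (cx, cy, cz) and radius r."""
    return Solid(lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2 + (p[2] - cz) ** 2 <= r * r)


# A block with a spherical cavity: difference of two primitives.
block = box(0.0, 0.0, 0.0, 10.0, 10.0, 10.0)
cavity = sphere(5.0, 5.0, 5.0, 3.0)
part = block.difference(cavity)

print(part.contains((1.0, 1.0, 1.0)))   # inside the block, outside the cavity
print(part.contains((5.0, 5.0, 5.0)))   # inside the cavity
```

Because every operation returns a new `Solid`, combined objects can themselves be combined again, which is the construction-tree idea behind the method.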
Boundary Representation
The spatial object is described by its boundary elements. These may be surfaces, lines or points. Since free forms are allowed in this model, the range of objects that can be described is rather large.
Definition of the height
Many geodata models contain a Z coordinate, which holds information about the height of a position. However, this kind of model is not truly three-dimensional, since only one altitude value is allowed per position. It is therefore called 2.5-D; Digital Elevation Models (DEM) are normally 2.5-D. The 2+1-D model is similar: every object carries additional information about its height (e.g. the height of a building). A real 3-D model may contain several Z values at the same position (X,Y).
Geometrical queries
Typical geometrical queries could be:
Find every parcel that is larger than 400 square meters.
Find every house that is built higher than 300 meters above sea level.
Which parts of a house extend beyond a parcel boundary?
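The first two queries can be sketched in Python under simple assumptions: parcels are stored as closed rings of planar (x, y) vertices in meters, and houses as single points with one elevation value each. The area is computed with the shoelace formula; all identifiers and values below are invented for illustration.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its (x, y) vertices (shoelace formula)."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical parcels: name -> ring of vertices in meters.
parcels = {
    "parcel_1": [(0.0, 0.0), (30.0, 0.0), (30.0, 20.0), (0.0, 20.0)],
    "parcel_2": [(0.0, 0.0), (15.0, 0.0), (15.0, 20.0), (0.0, 20.0)],
}

# Hypothetical houses: name -> (x, y, elevation in meters above sea level).
houses = {
    "house_1": (12.0, 8.0, 420.0),
    "house_2": (40.0, 5.0, 280.0),
}

large_parcels = sorted(name for name, ring in parcels.items()
                       if polygon_area(ring) > 400.0)
high_houses = sorted(name for name, (_, _, z) in houses.items() if z > 300.0)

print(large_parcels)
print(high_houses)
```

Both queries use only metric properties (coordinates, area, elevation); no information about neighbouring objects is needed.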
Incidence / Adjacency
Two important terms need to be known in relation to topology: incidence and adjacency. Incidence refers to the relation by which two objects of different types (e.g. a line and a point) are connected. For example, a line is incident with its start and end nodes, and conversely the end node is incident with the line. In contrast, adjacency refers to the relationship between two elements of the same type (e.g. two points). For example, two points may be adjacent through a line, or two lines may start at the same node and are therefore adjacent.
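Both relations can be derived from a simple node-line structure, as the following Python sketch shows. It assumes that every line is stored with its start and end node; the small network and the function names are hypothetical.

```python
# Hypothetical network: line name -> (start node, end node).
lines = {
    "a": ("P1", "P2"),
    "b": ("P2", "P3"),
    "c": ("P1", "P3"),
    "d": ("P3", "P4"),
}


def incident_lines(node, network):
    """Incidence (point vs. line): all lines that start or end at the node."""
    return sorted(name for name, ends in network.items() if node in ends)


def adjacent_nodes(node, network):
    """Adjacency (point vs. point): nodes connected to the node by a line."""
    neighbours = set()
    for start, end in network.values():
        if start == node:
            neighbours.add(end)
        elif end == node:
            neighbours.add(start)
    return sorted(neighbours)


def adjacent_lines(line, network):
    """Adjacency (line vs. line): other lines sharing a node with the line."""
    ends = set(network[line])
    return sorted(name for name, other in network.items()
                  if name != line and ends & set(other))


print(incident_lines("P3", lines))
print(adjacent_nodes("P1", lines))
print(adjacent_lines("d", lines))
```

Incidence always pairs a node with a line, while the two adjacency functions pair objects of the same type, which mirrors the distinction made above.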
A spatial element can be interpreted as a collection of points that define the form of the object through their relationship information. The topological model can be reduced to the following basic elements:
p - Number of points
l - Number of lines
f - Number of surfaces
v - Number of volume elements
The previous elements help to check data consistency. Euler's formula states that for any connected planar drawing, i.e. one in which every part is connected with the others, the following equation always holds: p-l+f=2. The outer area always counts as a surface.
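As a small worked example of such a consistency check, consider a square with one diagonal: it has p = 4 points, l = 5 lines and f = 3 surfaces (two triangles plus the outer area), so p-l+f = 4-5+3 = 2. The Python sketch below wraps this check in two hypothetical helper functions; it assumes a connected planar drawing, as the formula requires.

```python
def euler_consistent(p, l, f):
    """True if the counts satisfy Euler's formula p - l + f = 2."""
    return p - l + f == 2


def expected_surfaces(p, l):
    """Number of surfaces (including the outer area) implied by p and l."""
    return 2 - p + l


# Square with one diagonal: 4 points, 5 lines, 2 triangles + outer area.
print(euler_consistent(4, 5, 3))

# A dataset that reports only 2 surfaces for the same points and lines
# is inconsistent, e.g. because one surface was lost during editing.
print(euler_consistent(4, 5, 2))
print(expected_surfaces(4, 5))
```

A failed check does not say which element is wrong, only that the stored points, lines and surfaces cannot all belong to one valid planar structure.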
Quiz
Let's see if you still remember the differences between adjacency and incidence. Which of the following statements are correct in relation to the following graphic?
a - Three lines are incident with point P4
b - Six lines are adjacent with P1
c - P1 is adjacent with 6 points
d - f is adjacent with 5 lines
e - a and h are not incident
f - P3 is incident with b and g
a - Correct. The term incidence refers to objects of different types. c, g and h are incident in P4.
b - False. Adjacency refers to objects of the same type. It should be: Six lines are incident with P1.
c - Correct. P1 is adjacent with P2, P3, P4, P5, P6 and P7.
d - Correct. f is adjacent with a, b, c, d and e.
e - False. Two objects of the same type can never be incident. The correct phrase would be: a and h are not adjacent.
f - Correct.
Topological Queries
Typical topological queries are similar to the previous ones, for example:
Which parcels are next to a certain parcel?
Does street X cross street Y?
Is the railway station Xyz on a certain railway line?
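Queries of this kind need no coordinates at all, only stored relationships. The Python sketch below assumes that each parcel is stored with the identifiers of its boundary lines and each railway line with the ordered list of its stations; all names and data are hypothetical.

```python
# Hypothetical parcels: name -> set of boundary line identifiers.
parcel_boundaries = {
    "A": {"l1", "l2", "l3", "l4"},
    "B": {"l3", "l5", "l6", "l7"},
    "C": {"l8", "l9", "l10", "l11"},
}

# Hypothetical railway lines: name -> ordered list of stations.
railway_lines = {
    "line_1": ["Station U", "Station V", "Station Xyz"],
    "line_2": ["Station V", "Station W"],
}


def neighbouring_parcels(parcel, boundaries):
    """Parcels that share at least one boundary line with the given parcel."""
    own = boundaries[parcel]
    return sorted(name for name, other in boundaries.items()
                  if name != parcel and own & other)


def station_on_line(station, line, network):
    """True if the station is one of the nodes of the railway line."""
    return station in network[line]


print(neighbouring_parcels("A", parcel_boundaries))
print(station_on_line("Station Xyz", "line_1", railway_lines))
print(station_on_line("Station Xyz", "line_2", railway_lines))
```

Parcels A and B count as neighbours because they share the line `l3`, regardless of how long that line is or where it lies.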
Feature Class Principle
The counterpart to the previous principle is the Feature Class Principle. Depending on its inner construction, it may be strongly hierarchical (thematic tree) or less so (thematic network). The thematic content is divided into object classes, where every class covers one topic. To specify the content exactly, each class is divided into various subclasses. In the tree structure, every subclass belongs to just one object class; in the network structure, in contrast, every subclass may belong to several object classes. The feature class model is more flexible than the layer principle, but still less common.
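The difference between the two structures comes down to how many object classes a subclass may belong to. The following Python sketch makes this explicit; the class names are invented and do not come from any real catalogue.

```python
def structure_type(parents):
    """Classify a thematic model given as: subclass -> set of object classes.

    Returns "tree" if every subclass belongs to exactly one object class,
    otherwise "network".
    """
    if all(len(classes) == 1 for classes in parents.values()):
        return "tree"
    return "network"


# Hypothetical thematic tree: each subclass has exactly one object class.
thematic_tree = {
    "motorway": {"traffic"},
    "railway": {"traffic"},
    "river": {"waters"},
}

# Hypothetical thematic network: "bridge" belongs to two object classes.
thematic_network = {
    "motorway": {"traffic"},
    "river": {"waters"},
    "bridge": {"traffic", "waters"},
}

print(structure_type(thematic_tree))
print(structure_type(thematic_network))
```

A single subclass with two object classes is enough to turn the tree into a network, which is where the additional flexibility comes from.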
Network vs. Tree
The Feature Class Principle is well known in Germany, since ATKIS (Amtliches Topographisch-Kartographisches Informationssystem), the official topographic-cartographic information system, is based upon an object class model.
Thematic queries
Thematic queries are mostly determined by the purpose of a geodata model. Possible queries look like this:
Which river flows into the Zürichsee, and how long is it?
In which canton do the most farmers live?
How many cities with more than 20'000 inhabitants exist in Switzerland?
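Thematic queries operate on attribute values only. A minimal Python sketch of the last two queries follows; the records and all figures are placeholders invented for illustration and are not real statistics.

```python
# Placeholder records; names and numbers are invented, not real statistics.
cities = [
    {"name": "City A", "inhabitants": 35000},
    {"name": "City B", "inhabitants": 18000},
    {"name": "City C", "inhabitants": 120000},
]
farmers_per_canton = {"Canton X": 1200, "Canton Y": 3400, "Canton Z": 800}


def count_large_cities(records, threshold=20000):
    """Number of cities with more inhabitants than the threshold."""
    return sum(1 for city in records if city["inhabitants"] > threshold)


def canton_with_most_farmers(farmers):
    """Name of the canton with the highest number of farmers."""
    return max(farmers, key=farmers.get)


print(count_large_cities(cities))
print(canton_with_most_farmers(farmers_per_canton))
```

Neither function touches a coordinate; the geometry only becomes relevant when the result is to be shown on a map.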
1.4. Summary
Remember: Various GIS from various suppliers are available, and it is therefore difficult to transfer data between the different proprietary systems. Certain data formats facilitate the import and export of data, while other data types are not interchangeable. Since INTERLIS does not depend on a product, it is a promising project for the interoperability of geodata.
The large quantity of geodata that a GIS has to handle calls for a proper storage method. Modern GIS mostly use a relational database management system (RDBMS) to store and manage the data. It is closely connected to SQL, a language for data queries. In contrast to a pure database management system, a GIS is additionally able to carry out spatial queries, which means that it is possible to query by spatial location.
Finally, remember that a geodata model contains three main parts: a geometric, a topological and a thematic model. The geometric model provides the metric definition of the data, i.e. it specifies the physical position and dimension of the objects. The topological model determines the relationships between the objects, such as intersections, start points, etc. Lastly, the thematic model defines the thematic content of the geodata model. The Layer Principle is the most widely used, but the Feature Class Principle is more flexible.
1.5. Recommended Reading
BILL, R., 1999. Grundlagen der Geo-Informationssysteme. Vol. 1. 4th ed. Heidelberg: Herbert Wichmann Verlag.
Especially chapters 1 "Einführung in GIS" and 4 "Erfassung raumbezogener Daten" (in German)
1.6. Glossary
data acquisition: As geodata acquisition we define the collection and recording of geodata for further processing. The process of data acquisition includes the recording of geometry (spatial information), date and time (temporal information) and any non-graphical related attributes (thematic information).
data normalization: Data normalization is the process of removing redundancy in data sets by dividing the data sets into relations that are linked through identifiers. Normalization not only leads to more efficient data storage (smaller files), but also facilitates geodata updating. One distinguishes five different normal forms (NF1 to NF5) with various levels of redundancy removal.
datum: A datum defines a spatial reference system. It consists of an ellipsoid that fits the form of a local area as well as possible, and of a reference point on the earth's surface against which position measurements are made. In Switzerland, for example, this reference point is the old observatory in Bern; in Germany, it is a point in Potsdam.
geodata: As geodata we define every dataset that has a spatial aspect or component. Synonyms are "spatial data", "geographic data", "geographic data sets" or "GIS data". The syllable "Geo" implies that the dataset has a spatial component that allows the described phenomena to be georeferenced to a location or region on the earth.
geodata model: A geodata model is an abstract, artificially created mapping of a part of the real world relevant to a geoinformatics project. The goal of geodata modeling is to map the relevant conditions and processes in the real world to geodata structures. A data model not only describes the content, properties and data structures, but also the rules and relations between the entities of a data model.
geodata structure: As geodata structure we define the logical, internal data organization of our geographic information, the means of representing a real-life entity inside a geodata model. Data structures should enable data storage and data management, as well as quick retrieval of the data. Unique identifiers, links, relationships and dependencies help to build consistent and normalized data structures and enable links within the dataset or to external data sources.
geodata update: As geodata update we define the process of appending data or replacing existing data to reflect changes in the world or in the model the data is derived from. Special data models must be taken into account for temporal GIS functions. Unfortunately, these temporal GIS functions are still experimental and not yet part of commercial GIS.
geometry: In GIS, the geometry describes the form and position of an object, but not its relation to other objects, as the topology does.
GIS: As a Geo Information System (GIS) we define a computer-aided system for geographic data management, modeling, analysis, simulation and presentation. A GIS is an organized collection of computer hardware, software, geodata and skilled operators. More powerful GIS software usually utilizes modern database technology or builds on spatial databases.
metadata: Metadata describes other data ("data about data") by defining attributes such as year of creation, author, included area, origin, etc. It helps to identify and select the proper product.
Open Geospatial Consortium: The Open Geospatial Consortium (OGC) is an international organisation with more than 300 governmental, non-profit, research and commercial member organisations. Its goal is the development and implementation of standards for geospatial content and services.
orthophoto: An orthophoto is an aerial photo that has been rectified to an orthogonal coordinate system.
primary data acquisition: Primary data acquisition methods derive geodata directly from the objects to be monitored. Representatives of this method are surveying, photogrammetry and remote sensing. Other primary data acquisition methods include field work, data acquisition through automatic data loggers (e.g. water gauges, weather stations), interviews, censuses and polls.
raster graphics: Raster graphics is the combination of raster data with graphical attributes. Only the color of a raster cell can vary. Raster graphics are usually used for photographs; common formats are .jpg, .png, .gif.
secondary data acquisition: Secondary data acquisition methods derive the data from primary data sources. It is, for example, quite common to derive data from maps or aerial images. Secondary data acquisitions are generally of lower quality and less up to date than primary data acquisitions.
spatial base data: A subset of geodata. Geographic base data is usually provided by national or international surveying and mapping agencies and includes mainly topographic information stored in maps or landscape models. Satellite and aerial images can also be regarded as spatial base data, as long as they only provide topographic information in the human-visible bands.
Swisstopo:
Swisstopo is the competence centre of the Swiss Confederation responsible for geographical reference data and all products derived from them. It offers a variety of spatial data, in raster form as well as vector form.
thematic data: A subset of geodata. Thematic data is acquired by specific domains. Thematic data can, but does not necessarily have to, include a geometry component. It is often linked to spatial base data using coordinates, administrative units, full addresses or zip codes.
topology: The topology defines the position and arrangement of geometrical objects relative to each other. The metric relations are irrelevant; only the relation between the objects is important. A topologic map shows only the logical connections of objects and not their exact position or dimension. For example, a bus map is a typical topologic map; it shows the connections between various points (bus stops), but not their exact position.
vector graphics: Vector graphics is the combination of vector data with graphical attributes. Various attributes can be modified; a polygon may vary in its outline color and thickness, hatching, etc. Common formats are .svg, .dxf, .shp, .pdf.
WMS: A Web Map Service (WMS) produces maps dynamically from geographic information. It is defined by the OGC.
1.7. Bibliography
BILL, R., ZEHNER, M. L., 2001. Lexikon der Geoinformatik. Heidelberg: Herbert Wichmann Verlag.
BILL, R., 1999. Grundlagen der Geo-Informationssysteme. Vol. 1. 4th ed. Heidelberg: Herbert Wichmann Verlag.
BILL, R., 1999. Grundlagen der Geo-Informationssysteme. Vol. 2. 2nd ed. Heidelberg: Herbert Wichmann Verlag.
INTERLIS. INTERLIS - The GeoLanguage [online]. Available from: http://www.interlis.ch [Accessed 2006-08-15].