Escolar Documentos
Profissional Documentos
Cultura Documentos
Learning Objectives
Upon completing this chapter, you should be able to:
• Describe the uses and value of the application of geographic information
systems (GIS) to public health.
• Discuss the history and the theoretical foundations of GIS.
• Understand the functional development of GIS and how it works.
• Analyze the organizational models and the respective hardware/software/
personnel requirements for GIS along the continuum from a single
individual user to community use.
• List and discuss the social/institutional issues that individual and
organizational users of GIS must address.
• Describe the limitations of GIS software and spatial data.
• Discuss the emerging technologies that have implications for GIS use in
public health.
Overview
Geographic information systems are powerful tools that can enable public
health practitioners to analyze and visualize data. A system of computer hard-
ware and software that allows users to input, analyze, and display geographic
data, GIS permits the manipulation and display of both spatial and attribute
data. GIS now exists at various levels, ranging from small-scale systems for
individual users to enterprise-wide systems. The advent of Internet map serv-
ers and client-server applications has made GIS more widely available and
accessible. However, users of GIS need to have the proper training in order to
use such systems properly. They also need to be aware of the social/institu-
tional issues that can influence GIS use. Finally, users need to be aware of the
limitations in GIS software and in data sets, limitations that can, if ignored,
result in reliance on incomplete and inaccurate data.
431
432 Part IV. New Challenges, Emerging Systems
Introduction
During the past few years, the contribution of information technology to the
practice of public health has become increasingly apparent and has led to the
emergence of the discipline of public health informatics. Public health
informatics has been defined as “the application of information science and
technology to public health practice and research.”1(p1) (also see Chapter 1).
Until very recently, there has been a general perception that the use of infor-
mation technology in the health sciences is 10 to 15 years behind its use in
other fields. Historically, the use of information systems in public health has
focused on the storage and retrieval of data. This focus is changing as the
healthcare industry increases its use of electronic medical records, upgrades
hospital information systems, and uses the Internet for distributing health-
related information and providing remote diagnostics.2,3 In addition, with the
shift of the U.S. healthcare system toward a managed care model, the role of
public health agencies is becoming strongly oriented toward the provision
and use of information and efficient access to it.
At a time when computer hardware and software are becoming more afford-
able, powerful, and user-friendly, public health agencies and service provid-
ers are scrambling to develop the technological infrastructure that will allow
them to make use of information technology. Recognizing the importance of
a strong information infrastructure in providing public health professionals
with access to technical information, the Centers for Disease Control and
Prevention (CDC) in 1992 initiated the Information Network for Public Health
Officials (INPHO) program, which has provided funding to public health agen-
cies to acquire and upgrade information resources. This program is adminis-
tered by the Public Health Practice Program Office (PHPPO), which is dedicated
to improving systems that manage public health information and knowledge.
One of the emerging technologies being adopted by public health profes-
sionals is that of geographic information systems. A GIS is a computer map-
ping and analysis technology consisting of hardware, software, and data
allowing large quantities of information to be viewed and analyzed in a geo-
graphic context. It has nearly all of the features of a database management
system, with a major enhancement: Every item of information in a GIS is tied
to a geographic location. Lasker et al. have identified three basic types of
information needs essential to public health services: (1) data collection and
analysis, (2) communication, and (3) support in decision making.4 GIS has
enormous potential to contribute to the analysis of population-based public
health with its ability to support all three types of information needs.
Although medical geographers have been mapping disease and conduct-
ing spatial analysis for decades, the use of GIS among public health profes-
sionals is a relatively recent development. The fact that two 1999 editions of
the Journal for Public Health Management and Practice were devoted en-
tirely to GIS applications attests to its emerging importance in health sci-
21. Geographic Information System 433
ences. With GIS, public health professionals can manage large quantities of
information; map the distribution of diseases and health care resources; ana-
lyze the relationships among environmental factors and socioeconomic envi-
ronments and disease outcomes; determine where to locate a new hospital or
clinic; and even make decisions about the development or implementation of
health policy.
In this chapter, we will define the nature of a GIS. We will trace its theoreti-
cal foundations and its development and discuss the importance of GIS, par-
ticularly with regard to its contribution to public health. We will then discuss
how GIS concepts work—their treatment and representation of data, GIS or-
ganizational models, and issues related to implementation and uses of GIS.
We will conclude the chapter with a discussion of the implications of emerg-
ing technologies such as Web-based applications and data warehousing for
GIS as a public health tool.
What Is GIS?
What is a GIS? Dozens of definitions exist. Essentially, it is a system of com-
puter hardware and software that allows users to input, analyze, and display
geographic data. More specifically, it is “. . . a computer system that stores
and links non-graphic attributes or geographically referenced data with
graphic map features to allow a wide range of information processing and
display operations, as well as map production, analysis and modeling.”5(p281)
Clarke refers to GIS as (1) a toolbox, (2) an information system, and (3) an
approach to science.6 As a toolbox, GIS is a software package that contains a
variety of tools for processing and analyzing spatial data. Public health pro-
fessionals might use these tools to map infant mortality rates across a state,
identify areas with underserved populations, maintain an infectious disease
surveillance system, or model environmental exposures to toxic substances.
As an information system, a GIS consists of a series of databases that con-
tain observations about features or events that can be located in space and,
hence, mapped and analyzed. GIS also functions as a means of spatial data
storage.7 Information that for centuries was stored on paper maps can now be
stored in digital format in a geographic information system.
In some circles, the meaning of GIS is gradually shifting from “geographic
information system” to “geographic information science,” sometimes
referred to as GIScience.8 GIScience refers to the science behind the technol-
ogy and the study and understanding of the disciplines and technologies that
have contributed to the development of today’s GIS software. These disci-
plines include geography, cartography, geodesy, photogrammetry, computer
science, spatial statistics, and a wide range of physical and social sciences.
Goodchild has categorized these disciplines and provided a more extensive
list of them.8
434 Part IV. New Challenges, Emerging Systems
graphics, and land use. In 1967, the Land Management Information Center
was established at the University of Minnesota and began development of a
statewide GIS database. Parallel traditions in automated mapping and facility
management (AM/FM) systems by gas and electric utilities and other impor-
tant contributions, such as the development of computer-aided drafting (CAD)
systems, are described in detail in Antenucci et al.5
Although many of the early computer mapping and GIS programs were
quite powerful, they appear primitive by today’s standards. The maps they
produced were nowhere near as aesthetic as hand-produced maps; the soft-
ware had a much longer learning curve (which included learning mainframe
Job Control Language and commands specific to the software), and digital
data were difficult to come by.
Many U.S. federal government agencies were important to the evolution of
GIS technology and the development of digital cartographic data, perhaps
most notably the U.S. Bureau of the Census. In 1967, the agency piloted the
use of digital geographic files (streets and census blocks) for a study in New
Haven, Connecticut. These files, the Geographic Base File Dual Independent
Map Encoding files (otherwise known as GBF/ DIME files), were used in
urban areas for the 1970 and 1980 censes. Military use of geographic data
technology in the 1960s led to development of digital databases such as
the World Databank, which could be used in some of the early mapping
programs.
In the late 1980s, the move away from mainframe computers and toward
workstation and PC technologies resulted in dramatic changes to GIS soft-
ware and functionality. Most notably, software became increasingly easy to
use with the development of graphical user interfaces and menu-driven sys-
tems, and large collections of digital datasets were developed for use with the
software. Today, computer users with a day’s training or less can easily begin
using GIS. Such a facility of use has obvious advantages, but there are draw-
backs as well. After all, geographic data are complex. Without a sound knowl-
edge of basic geographic principles, data issues, and map design, it is easy for
an uninformed user to make errors, to mislead, and to be misled.
Perhaps more interesting than Snow’s map, however, was his “medical detec-
tive” work preceding the 1854 epidemic and following the epidemic of 1849,
which helped him to recognize the association between contaminated water and
cholera. The cholera epidemic of 1849 killed over 52,000 people in Great Britain
and over 13,000 in London alone.12 While Snow published a brief account of this
epidemic in 1849, he continued to carry out research over the next few years,
leading to a second edition, published in 1854, that was a more substantial work.
In his second account, Snow noted the association between cholera, pov-
erty, elevation, and the water supply of the various London districts. A fasci-
nating reconstruction, mapping, and geographic analysis of these associations
is provided by Cliff and Haggett. 12 As the authors have noted, “these associa-
tions result in some striking geographical distributions” such as the higher
mortality rates in areas adjacent to the River Thames and the relationship
between cholera and the water supply of London districts. At that time, a
number of metropolitan water companies were supplying water to the city
from a myriad of sources—some directly from the Thames, others from reser-
voirs. Cholera mortality was linked to contaminated water supplies provided
by companies drawing their water directly from the Thames.
Snow also investigated the relationship between elevation and cholera
incidence and observed that cholera was more likely to occur in low-lying
areas than in higher ones. There was some dispute over whether this was the
result of water contamination or of soil type, but it was actually a product of
a combination of the two factors: Lower lying areas had poorer drainage (soil
type), resulting in water stagnation and contamination. The alkalinity of wa-
ter also played a role in the transmission of cholera, as the microbe Vibrio
cholerae likes water with a high pH.
Although many of us would prefer to be in the field, rather than at a desk,
today’s technology makes it possible to carry out an analysis such as Snow’s
in a very small amount of time, at the desktop. Imagine Dr. John Snow at his
desk with a powerful computer mapping and information system. On his com-
puter screen, he has maps of London districts, their water supplies, and the
locations of cholera cases. In addition, his water supply map database con-
tains information about characteristics of the water, such as pH factor and
water source. He also has a map of soils, with information about their charac-
teristics and an elevation model to work with. With the tools available in a
geographic information system (provided that he has spatial data in digital
format), Dr. Snow could do point mapping of cholera cases, calculate dis-
tances to water sources, and examine the relationship of cholera incidence to
water source, water type, soils, and elevation.
Snow’s work provides an indication of how a GIS can benefit public health
practice. Medical geographers, epidemiologists, and other health practitioners
have been carrying out mapping and spatial analysis for centuries, but have been
doing it “longhand,” so to speak. Some of the classic geographic research on
probability mapping,13 disease diffusion and modeling,14 the spatial organiza-
21. Geographic Information System 437
of GIS data as having two components. The first component is spatial data,
consisting of geographic coordinates that provide information about the lo-
cation and dimensions of features on earth and the relationships among these
features. These spatial data are stored in a topologic data structure—a data
structure that maintains information about the spatial relationships among
features, such as adjacency, connectivity, and containment.
The second component is attribute or statistical data, such as census vari-
ables or health outcomes, that describe the non-spatial aspects of the data-
base. Attribute and geographic data are linked through a geocode, a geographic
identifier that is contained in both data components. This geocode can be a
county name or a state name, a zip code, a street address, or some other nu-
meric code.
Figure 21.1 displays a map of Missouri that shows the number of persons
age 65 and over, by county. The spatial data on the map are the Missouri
county boundaries. Attribute data are contained in the table below the map
and are represented on the map by a series of shading patterns. Each record
contains information for a single county; in this case it includes county name,
state name, 1997 population, and the population age 65 and older. The table
also contains standard numeric codes (geocodes) for counties and the state of
Missouri. These codes were developed by US government agencies as part of
the Federal Information Processing Standard (FIPS).
The record for Saline County is highlighted, and the corresponding county
is highlighted on the map. The FIPS code for Missouri is 29, and the FIPS
code for Saline County is 195, providing a combined FIPS code (and a unique
identifier for Saline County, Missouri) of 29195. This value is contained in
the table’s FIPS field. The Missouri county boundary file has a FIPS code
associated with each county, and the attribute data are linked to the appropri-
ate boundary through this geocode.
Most federal geographic data, such as census data, use a set of FIPS codes.
However, the federal codes are not always used by state agencies or other
organizations. Geographic files, such as the county boundary file in Figure
21.1, often contain more than one set of geocodes. If health agencies in the
state of Missouri coded health data by county name, these data could be
mapped using county name as a geocode, so long as that information was also
contained in a field in the spatial database.
Attribute data originate from a variety of sources and come in a wide range
of formats. One of the challenges of using health and demographic data in a
GIS is working with different data formats and structures. Attribute data are
typically stored in tables, where columns represent fields or variables and
rows represent cases or observations. These tables or files are often stored in a
database, defined as “a collection of related data items stored in an organized
manner.”20 The original data may be stored in mainframe legacy systems;
SAS, SPSS or Access databases; Excel spreadsheets; or a number of other
formats. Linking these data to spatial data usually requires importing them
21. Geographic Information System 439
FIGURE 21.1. Spatial and attribute data for Missouri counties. (Data source: US Census,
1990. Map Source: ESRI Redlands, CA,)
440 Part IV. New Challenges, Emerging Systems
into GIS. Most data tables can be converted to ASCII or dBase (.dbf) format,
for easy incorporation into GIS. Spreadsheets and databases are not the same,
and importing spreadsheets into GIS software can be problematic, although it
is often done. Many GIS users view dBase as a preferred file transfer format
because it is readable by many GIS software applications and requires little or
no formatting. Recent developments in both GIS and database management
software allow direct, live linkage among some GIS applications and data-
base management systems.
For years, the main database management system utilized by GIS applica-
tions has been the relational model, where two or more tables can be linked
easily via a common identifier, or key. This is how attribute data are linked to
spatial data using a common geocode. The new trend in the larger GIS soft-
ware applications is toward object-oriented databases, which are capable of
modeling complex spatial objects. These spatial objects contain not only
attributes, but the methods and procedures that operate on them. A more
detailed discussion of database management systems is beyond the scope of
this chapter, but readers are referred to Jones7 for more information about
database models in the context of geographic information systems.
FIGURE 21.2. Unprojected and projected coordinates. (Map source: ESRI Redlands,
CA.)
ships among them can be analyzed and displayed when they are tied to a
common coordinate system.
Many geographic databases are stored as unprojected data—that is, as
latitude/longitude coordinates. Indeed, latitude/longitude coordinates are a
sort of lingua franca, a standard data exchange format, and must be projected
by use of the projection capabilities available in most GIS software products.
Projections and/or coordinate systems that are commonly used in the United
States include (1) state plane coordinate systems, (2) Albers Equal Area pro-
jection, (3) Lambert Conformal Conic projection, and (4) Universal Trans-
verse Mercator (UTM) projection. Good descriptions of map projections and
coordinate systems can be found in Clarke6 and Robinson et al.21 Figure 21.2
displays a map of the continental United States in latitude/longitude coordi-
nates (unprojected) and in Albers Equal Area coordinates (projected).
FIGURE 21.3. Vector and raster data. (Map source: ESRI Redlands, CA: USGS National
Land Cover Data.)
Scale
Scale refers to the ratio of a distance on a map to the corresponding distance
on the ground. A scale of 1/100,000 (usually represented as 1:100,000) means
that 1 inch on the map is equal to 100,000 inches on the real earth. The ratio
is true for any unit of measurement (1 centimeter on the map is equal to
100,000 centimeters on the ground). Large-scale maps show more detail than
small-scale maps. The concept of scale can be confusing because the larger
21. Geographic Information System 443
FIGURE 21.4. An area of coastal Carolina, shown at several map scales. (Map source:
ESRI, Redlands, CA.)
the denominator in the fraction is, the smaller the scale is. In other words, a
map at a scale of 1:12,000 is a larger-scale map than one at 1:2,000,000.
Smaller-scale maps are generally used to show a larger area (such as the world
or the United States), whereas larger-scale maps can be used to “zoom in” to a
smaller area (such as a city or a neighborhood). Because many map details are
lost in smaller-scale maps, scale has an important effect on the precision of
location. Figure 21.4 shows an area of coastal North Carolina represented at
different scales. It is important to remember that, although GIS software al-
lows users to zoom in and out to different scales, the amount of detail in a map
depends entirely on the scale of the original map!
Figure 21.1. In health applications, it may be used with counties, zip codes,
health service areas, census tracts, or other geographic units to show the distribu-
tion of health outcomes, socio-demographic characteristics, health services, or
other relevant variables. Because correct interpretation of the message or pattern
displayed on a choropleth map is so critical to analysis and decision-making, a
more detailed discussion of choropleth map production is provided in a later
section in this chapter concerning visual display of spatial data.
Automated address matching can be used to map clinics, patient residences,
and other locations that contain street addresses. Address matching is a term
that is often used synonymously with geocoding, but it is actually only one
of many methods of geocoding. Essentially, an address, such as 525 Fuller
Street, is a geocode—it refers to a specific location along Fuller Street. Ad-
dress matching works by comparing a specific street address in a database to
a map layer of streets. If the map layer contains relevant information about
the street name and the range of addresses along that street, the software can
interpolate the location of the address and place it along the street. An ex-
ample is shown in Figure 21.5. In this case, the street network data contain
fields with information about the beginning and ending address for the 500
block of Fuller Street. Even addresses are on one side; odd addresses on the
other. The address, 525 Fuller Street, falls about 25% of the distance from the
beginning of the block. Issues of privacy and confidentiality that arise from
address matching are discussed later in the chapter.
FIGURE 21.5. Address matching. (Source: ESRI Street Map, Redlands, CA.)
21. Geographic Information System 445
Most GIS software allows the user either to enter addresses interactively,
one at a time, or to process an entire database of addresses in batch mode.
Although the concept of address matching is very straightforward, there are
many limitations and problems that can be encountered. These are described
in a later section.
Distances among geographic features can be determined with nearly all
GIS/mapping software. In health applications, distances are often needed to
analyze access to health care or to model exposure to an environmental con-
taminant, among other things. Most GIS software allows users to determine
distances either interactively or in batch mode through the use of a distance
function. In the case of the latter, the distance calculation is stored in a vari-
able that may be used for later analysis, such as regression or some sort of
exposure modeling.
Spatial query allows a GIS user to query the attribute database and display the
results geographically. For instance, a user could make a query to display the
location of all rabies cases that have occurred in a county during the past year, or
to show all census tracts in which more than 50% of households have a house-
hold income below the poverty rate. Queries can also be based on distance: A GIS
can be used to display all zip codes within a 25-mile radius of a particular health
clinic or to show all patients within 15 miles of a field phlebotomist.
Buffer functions can define and display a region or “ring” of specified
radius around a point, a line, or an area. GIS software allows the user to define
the width of the buffer—that is, the distance of the outside edge of the buffer
from the feature boundary. A 150-meter buffer might be created to determine
the number of residences close to a toxic release event. A 25-meter buffer
zone around major roads could identify areas with potential lead hazards in
soil from past use of leaded gasoline. Figure 21.6 shows buffers of 25, 50 and
75 miles from Saint Charles Medical Center in Bend, Oregon. Another hospi-
tal is located within 25 miles of Saint Charles Medical Center and there are
three hospitals within 50 miles of the center.
Overlay analysis allows GIS users to integrate feature types and data from
different sources. It is not to be confused with visual overlay, which occurs
when several map layers are registered to a common coordinate system and
displayed together, as in Figure 21.3. Overlay analysis involves some spatial
data processing and results in the creation of new data or modification of
existing data. Two commonly used types of overlay analysis are point-in-
polygon overlay and polygon overlay.
Point-in-polygon overlay is used to determine which area, or polygon, a point
or set of points lies in or whether a point lies inside or outside a particular geo-
graphic area. For example, a point map of patient residences might be overlaid on
a map layer of census tracts to determine the census tract of the residence of each
patient. This application is important when a user is examining the association of
census variables, particularly socioeconomic ones, with health outcomes.
Polygon overlay can be used to create a new map layer from two existing
polygon map layers, when their boundaries are not coincident. For example,
446 Part IV. New Challenges, Emerging Systems
a zip code map layer can be overlaid on a layer of primary sampling units to
obtain a map layer showing all ZIP codes and partial ZIP codes within a
sampling area. This application can be used to create a lookup table that can
be linked to addresses. Polygon overlay is sometimes used to estimate popu-
lations within a geographic area whose boundaries differ from census bound-
aries; it operates in a “cookie cutter” fashion to create new polygons.
Population is then prorated by comparison of the area of the new polygon to
that of the original.
While these are only a few examples of GIS functions, they are all commonly
used in health applications and are easy to learn. Many other functions exist,
ranging from relatively simple techniques such as suitability analysis and cre-
ation of Thiessen polygons to complex methods of spatial modeling. A good
source of information on GIS modeling is Bonham-Carter.23 Kulldorff has de-
21. Geographic Information System 447
scribed some statistical issues and methods pertinent to public health data,24 and
Buescher has warned about computing and using rates based on small numbers.25
There are many time-honored spatial analysis techniques used by geogra-
phers for decades that are not yet incorporated into the more widely used GIS
software products. Furthermore, GIS software has always been lacking in sta-
tistical analysis functions. Using statistical or more advanced spatial analy-
sis techniques usually requires additional programming, often incorporating
a GIS software macro language, or reformatting GIS data for use with statisti-
cal software, such as SAS or SPSS. One statistical software package, S-PLUS,
can be used with ArcView software developed by Environmental Systems
Research Institute (ESRI). Other statistical software has been developed for
very specific applications, such as SaTScan (which can be obtained from the
National Cancer Institute at no charge) for analysis of disease clusters.24 Those
unfamiliar with spatial analysis and spatial statistics may want to refer to
Unwin 26 or Cressie.27 Anyone with a strong interest in exploring spatial analy-
sis methods for use in health applications is urged to read Atlas of Disease
Distributions: Analytic Approaches to Epidemiologic Data. 14
FIGURE 21.7. Data grouping methods for choropleth mapping. (Map source: ESRI,
Redlands, CA.)
provides users with a number of options for classifying numeric data. Four com-
monly used methods are (1) equal interval, (2) quantile, (3) natural breaks, and (4)
mean and standard deviation. Figure 21.7 provides examples of these methods,
using the data from Figure 21.1 for illustrative purposes.
Generally, there is no consistent “right” or “wrong” classification method
to use for classifying data, but some methods are more appropriate for certain
data distributions. The mean and standard deviation method is probably used
least, because the general public may not understand the concept of standard
21. Geographic Information System 449
FIGURE 21.8. Use of map symbols for choropleth mapping of numeric data, (Map
source: ESRI, Redlands, CA.)
the Internet, even the community uses GIS (large-scale GIS). For purposes of the
discussion that follows, we will describe several “discrete” models along this
continuum, but the reader should keep in mind that the boundaries between the
“discrete” models are quite fuzzy. In addition, these models can overlap, and
more than one model may exist in any organization.
Smaller-Scale GIS
Since the early 1990s, there has been a trend toward the use of desktop GIS, a
concept that involves making GIS and mapping accessible to people who use
computers in their everyday work environment. The advent of desktop GIS
has been concomitant with the development of powerful personal computers
and user-friendly, Windows-based software environments. A variety of desk-
top GIS software packages have menu-driven graphical user interfaces and
are easily integrated with office computer hardware.
Departmental GIS
A second organizational model is the departmental GIS. An example is a GIS
in a state health department, where project and mapping support are provided
on an as-needed basis to state and local public health agencies and where
multiple GIS analysts work under the supervision of a GIS manager. Larger
GIS operations such as these often use a wider range of GIS software products
and store large amounts of data on Unix or Windows-based servers.
Enterprise GIS
Larger-scale GIS operations use an enterprise GIS model that provides GIS
capabilities to an entire organization or a corporation and involves the use or
development of multiple data types and applications and coordination among
many departments. With an enterprise GIS, an organization’s data are spa-
tially enabled (i.e., geocoded and available for GIS and mapping applica-
tions) and accessible to the entire organization as a resource for analysis and
decision-making. Data are usually stored on powerful servers within the orga-
nization and served to users across a network. For example, the city of Wilson,
North Carolina houses an enterprise GIS that is used by employees in many
departments, including fire, police, public utilities, public works, and plan-
ning and community development.33 Environmental Systems Research Insti-
tute (ESRI), Inc. has developed a white paper on the development of enterprise
GIS in health and social service agencies.34
In order to evaluate hardware and software needs, GIS users in public health
must determine which GIS organizational model meets their needs, the avail-
ability and format of digital geographic data, and how their GIS activities
will be integrated with other research or operational units. In many cases, a
powerful PC with desktop software will be sufficient. With more sophisti-
cated systems, such as those used in a departmental or an enterprise GIS,
larger investments in data servers and software will be necessary. No matter
which GIS system is purchased, spatial data is always space-intensive. Geo-
graphic data files are large. A user should purchase more hard drive space than
anticipated need indicates.
tively small geographic units, so users will need to take even greater caution
when dealing with rates and small numbers or issues of confidentiality. One
of the projects of the Federal Geographic Data Committee has been to imple-
ment a national Geospatial Data Clearinghouse, which functions as a data
catalog and is accessible via the Internet at http://www.fgdc.gov/clearing-
house/clearinghouse.html. The University of Arkansas’ Center for Advanced
Spatial Technologies maintains a guide to online spatial and attribute data
on its Web site available at http://www.cast.uark.edu/local/hunt/index.html.
Environmental Systems Research Institute (Redland, CA) maintains a Geo-
graphy Network site that is a global network of GIS users and data and
service providers. Users can search for spatial data via this link: http://
www.geographynetwork.com/.
GIS data for public health applications are often created by linking health
attribute data from state and local government agencies to geographic bound-
ary files by geocode. For instance, county-level mortality data can be linked
to a state’s county boundary file by county code. Health datasets that contain
zip code fields can be linked to a zip code boundary file by the zip code field
in both databases. Many public health datasets are created through the ad-
dress matching process, described in a previous section. A thorough review of
GIS data sources for community health planning can be found in Lee and
Irving.36
Often, local public health community planning efforts require relatively
detailed data in order to permit development of maps at the sub-county or
neighborhood level of geography.37 Given the requirement to protect the con-
fidentiality and privacy of individual medical record information, many ex-
isting state or federal spatial databases typically do not provide a sub-county
level of detail—the smallest geographic unit of analysis is often at the county
level. Thus, in addition to whatever data may be available at the state and
federal level, local health agencies need capabilities to geocode and import
their own data. Depending on the nature of a specific health problem being
addressed, the local public health agency also may need to develop local
spatial data partnerships with other local government agencies and commu-
nity organizations (e.g., the department of transportation for motor vehicle
accidents, the police department for teenage crime data.)
Spatial data are critical for GIS operations, but so is information about
those data. The Federal Geographic Data Committee (FGDC) has spent sev-
eral years developing a standard for metadata that describes the content and
quality of a spatial database, or, in FGDC’s words, “data about data.” Metadata
provide important information about who developed the database, the scale
of the original data, the time period of the content, and attribute and posi-
tional accuracy. While metadata does not guarantee the quality of the data,
they do provide important information with which a user can determine ap-
propriateness of the data’s use. Metadata are usually in the form of text stored
in a separate file. They are available for many of the datasets developed by
21. Geographic Information System 455
agement and Practice39 indicates that the use of GIS in health applications is
headed in this direction. This evolution in GIS use will necessitate a higher
level of training and education for the part-time GIS user.
For the most part, learning how to use GIS/desktop mapping software is
not difficult or time-consuming, a fact that can be deceptive because it ob-
scures the complexity of GIS. GIS software vendors often offer their own
training courses, some of which are even available as distance learning op-
tions. As GIS users become more advanced in their analyses, it is imperative
that they have an understanding of coordinate systems, map projections,
geocoding, data development and conversion, metadata, and spatial analysis.
The National Center for Geographic Information and Analysis (NCGIA) has
been developing a core curriculum for GIS education. Topics in the core
curriculum are listed on the Web at http://www.ncgia.ucsb.edu/pubs/core.html.
GIS users in the public health fields have additional concepts that they
must master. Many of these concepts can be gleaned from a course in epidemi-
ology or biostatistics. These concepts include the use of rates, statistical
variation involving the use of small numbers in either the numerator or de-
nominator, the concept of rate adjustment (e.g., age, race, sex) and the impact
of different standard populations (e.g., 1940 vs. 2000) on rates. In addition,
state and local public health GIS users need to have a sound understanding of
the ecological fallacy in the analysis of cause-and-effect relationships and of
issues involved in modeling exposure to environmental factors or using prox-
imity as a surrogate.22
Fortunately, universities and community colleges are rising to the chal-
lenge of providing GIS education to a range of users. Many of them now offer
GIS-related courses in the evening or over the World Wide Web, making them
accessible to “mid-career professionals” who wish to enhance their GIS knowl-
edge. Many are also offering certificate or degree programs in GIS.
Confidentiality
Many health datasets contain sensitive information. Consequently, public law
mandates that many agencies maintain the confidentiality of patient records and
health statistics. Databases often contain addresses that serve as individual iden-
tifiers—the location of a point on a map could be used to identify a person. GIS
users should be cautious about which maps are produced for internal use vs. those
that are distributed to the public or shown in presentations. Some methods used
21. Geographic Information System 457
to protect privacy in GIS applications are to (1) aggregate patient data to zip code
or county level (in some cases, small population numbers in these units may pose
confidentiality concerns); (2) use smaller-scale maps that show less detail; (3)
avoid including the street network on the map, as this provides the most familiar
means of locating an address; or (4) displace point features through the use of a
random displacement algorithm, thus offsetting x,y coordinates but maintaining
geographic integrity.40
Organizational Politics
The impact of organizational politics on GIS operations should not be over-
looked. For example, upper-level managers might veto GIS applications that
address politically sensitive or controversial issues. In addition, reorganiza-
tion in government agencies, common and usually political, can have either
a positive or an adverse impact on GIS operations. Moreover, GIS is a technol-
ogy that nearly everyone wants. Consequently, the location of a GIS unit in
the organizational structure in an agency can affect which projects receive
priority and/or funding.
foremost, map data are space-intensive, and serving them over the Web takes
time. Many IMS applications are slow, and they include only basic functions,
such as turning map layers on or off and zooming in or out. As the technology
evolves, more sophisticated geoprocessing functions will become increas-
ingly available.
Data Warehousing
GIS software products are incorporating developments in database manage-
ment technology. One of these developments is data warehousing, a term that
implies more than a large central database. The term is used to describe a
central repository of all types of data used by an organization or enterprise.
These data warehouses provide high-speed access to databases by many us-
ers, and they allow transactions, such as editing, to occur without interrupt-
ing the normal flow of work. In the future, warehouses for spatial data will
become more common. Currently, few exist, but examples include CubeWerx
of Canada, MrSID image data warehousing, and ESRI’s Spatial Data Engine.45
Conclusion
GIS is an information system, an approach to science, and a powerful set of
analysis and visualization tools that can be used by public health profession-
als to enhance their analysis and understanding of public health issues and to
provide a basis for sound decision making. GIS is deceptively easy to use;
however, geographic data, spatial/epidemiologic analysis, and GIS informa-
tion systems are more complex than they appear to the casual user. The effec-
tive use of GIS requires a combination of good training and experience. In the
years ahead, that training and experience will become even more important as
GIS becomes an increasingly powerful and common tool in the practice of
public health.
Acknowledgments
The author wishes to acknowledge the Research Triangle Institute for its
support, in the form of a Personal Development Award, during the develop-
ment of materials for this chapter.
References
1. O’Carroll P. Informatics Training for CDC Public Health Advisors: Introduction
and Background. http://faculty.washington.edu/~ocarroll/infrmatc/home.htm, posted
July 1997.
2. Cesnik B. The Future of Health Informatics. International Journal of Medical
Informatics. 1999; 55:83-85.
464 Part IV. New Challenges, Emerging Systems
42. Heuvelink GBM. Error Propagation in Environmental Modelling with GIS. London:
Taylor and Francis; 1998.
43. Foresman TW. Spatial Analysis and Mapping on the Internet. Journal of Public Health
Management and Practice. 1999;5:57-63.
44. Harder C. Serving Maps on the Internet: Geographic Information on the World Wide
Web. Redlands, CA: Environmental Systems Research Institute, Inc.; 1998.
45. Lowe JW. Data Warehouses and Spatial Web Sites. Geospatial Solutions. September
2000.