
RISK-BASED CHARACTERIZATION OF CONTAMINATED INDUSTRIAL SITES USING A MULTIVARIATE STATISTICAL AND GIS-BASED APPROACH IN ANNISTON, ALABAMA

by

HoeHun Ha

September 1, 2011

A dissertation submitted to the Faculty of the Graduate School of The State University of New York at Buffalo in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Department of Geography
UMI Number: 3475321


UMI 3475321
Copyright 2011 by ProQuest LLC.
All rights reserved. This edition of the work is protected against
unauthorized copying under Title 17, United States Code.

ProQuest LLC.
789 East Eisenhower Parkway
P.O. Box 1346
Ann Arbor, MI 48106 - 1346
Copyright by

HoeHun Ha

2011

To

My Parents and God

Acknowledgements

I am truly thankful to many people for their assistance, support and help in completing this dissertation. First of all, I would like to express my sincere appreciation to my major advisor, Dr. Peter Rogerson, for his unconditional support throughout my graduate study. This study would not have been possible without his constant guidance and his insightful comments, which greatly improved the quality of my dissertation. He was always open to all of my questions, and he was considerate and generous in understanding my circumstances. It was truly an honor to work with him during my graduate study.

I wish to extend my sincere thanks to the committee members, Drs. James Olson and Ling Bian, and to the outside reader, Dr. Daikown Han, for their help and support. I am grateful for their insightful comments and advice. I am particularly indebted to Dr. James Olson for his support over the past three years. He encouraged and supported me in completing my dissertation research and moving on to the next step. Working with him on the grant from the Agency for Toxic Substances and Disease Registry (ATSDR) to Jacksonville State University was of great help to me. The financial support of the Department of Geography and the Department of Pharmacology and Toxicology is gratefully acknowledged.

There are many people I would like to thank for their help. Thanks go to Dr. Jared Aldstadt for his generosity and encouragement during my graduate study at UB. Thanks also go to my fellow graduate students, and to Ms. Betsy Abraham and Mr. Joe Murray in the geography office, for making the department graduate friendly. I particularly thank my fellow student, Peter Kedron, for providing his insightful comments. Everyone in the department who helped me grow in the pursuit of my studies deserves my appreciation. I am sincerely blessed by their companionship.

Finally, I am heavily indebted to my father and mother, who invested in me and provided me with a great opportunity to study abroad. Without their invaluable encouragement, patience and understanding, completion of this dissertation would have been far more difficult. Once again, I deeply appreciate all of their moral support and love. I have dedicated this dissertation to my parents, who have always believed that I can achieve the things I put my heart into as well as the things I have a passion for, such as this dissertation.

August, 2011

Table of Contents
Acknowledgements
Table of Contents
List of Tables
List of Figures
Abstract

Chapter 1 Introduction
1.1 Research problems
1.2 Study area
1.3 Study hypotheses
1.4 Significance of the research
1.5 Structure of the dissertation

Chapter 2 Background and Related Literature
2.1 Background of PCB and lead production in Anniston, Alabama
2.1.1 History of PCB production in Anniston, Alabama
2.1.2 The investigation of PCB pollution in Anniston, Alabama
2.1.3 Lead production in Anniston, Alabama
2.1.4 Other heavy metal sources in soils
2.2 PCBs and lead exposure
2.2.1 What are PCBs and lead?
2.2.2 Health effects associated with exposure to PCBs
2.2.3 Effect of socioeconomic status on exposures to PCBs
2.2.4 Mobility and environmental health
2.3 Geographic and spatial analysis on disease
2.4 Comparing spatial patterns of clustering
2.4.1 Kernel density estimation
2.4.2 Nearest neighbor and K-function approach
2.4.3 Local Moran's I coefficient of spatial autocorrelation
2.5 Spatial methods of chemical exposure and risk assessment
2.5.1 Introduction
2.5.2 Principal Component Analysis (PCA) and Cluster Analysis (CA)
2.5.3 Kohonen Self-Organized Maps (SOM)
2.5.4 Bayesian Disease Mapping

Chapter 3 Exploring Geographical Variations of Soil and Serum PCB in Anniston, Alabama: the Association with Socioeconomic and Spatial Variables
3.1 Materials and methods
3.1.1 Collection of background data
3.1.2 Regression analyses
3.2 Results
3.2.1 Spatial regression result on soil PCB: socioeconomic and spatial variables
3.2.2 Spatial regression result on serum PCB: socioeconomic and spatial variables
3.3 Discussion and interpretation
3.4 Summary

Chapter 4 Geographic variation of soil lead concentrations in Anniston, Alabama
4.1 Materials and methods
4.1.1 Collection of background data
4.1.2 Data transformation
4.1.3 Geographically Weighted Regression
4.2 Results
4.3 Discussion and interpretation
4.4 Summary

Chapter 5 Analysis of Heavy Metal Sources in Soils using Multivariate Statistics and GIS
5.1 Materials and methods
5.1.1 Collection of background data
5.1.2 Data analysis
5.1.3 Self-Organizing Maps (SOM) analysis
5.1.4 Geostatistical analysis
5.2 Results and discussion
5.2.1 Metal analysis
5.2.2 Principal Component Analysis
5.2.3 Self-Organizing Maps (SOM) analysis
5.2.4 Geostatistical analysis

Chapter 6 Analyzing Associations of Soil and Serum PCB in Anniston, Alabama: the Comparison between All Properties and Focus Sites
6.1 Materials and methods
6.1.1 Data categories
6.1.2 Analytical techniques
6.2 Results
6.2.1 Buffer analysis of all properties-focus sites
6.2.2 Kriging analysis of all properties-focus sites
6.3 Discussion and interpretation
6.3.1 Proximity to Sites
6.3.2 Significant associations
6.3.3 Interpretations
6.4 Summary

Chapter 7 Conclusions
References

List of Tables

Table 3-1. Independent variables: socioeconomic and spatial variables
Table 3-2a. Distribution of soil PCB concentration
Table 3-2b. Distribution of serum PCB levels
Table 3-3a. Estimates of ordinary least squares (OLS) regression of soil PCB with socioeconomic and spatial variables
Table 3-3b. Correlation matrix between socioeconomic and spatial variables on soil PCB regression
Table 3-3c. Diagnostics for heteroskedasticity on soil PCB regression
Table 3-3d. Diagnostics for spatial dependence on soil PCB regression
Table 3-3e. Estimates of spatial regression of soil PCB with socioeconomic and spatial variables
Table 3-4a. Estimates of ordinary least squares (OLS) regression of serum PCB with socioeconomic and spatial variables
Table 3-4b. Correlation matrix between socioeconomic and spatial variables on serum PCB regression
Table 3-4c. Diagnostics for heteroskedasticity on serum PCB regression
Table 3-4d. Diagnostics for spatial dependence on serum PCB regression
Table 3-4e. Estimates of spatial regression of serum PCB with socioeconomic and spatial variables
Table 4-1. Predictor variables and data sources used for the analysis
Table 4-2. Global and GWR regression estimates and diagnostics
Table 4-3. List of the coefficients for the excluded variables
Table 4-4. List of former and active foundries on lead emission based on 2007 EPA data
Table 5-1. Metal Concentrations in Soil Samples, Anniston, Alabama
Table 5-2. Total variance explained - three components selected
Table 5-3. Component matrixes for 11 heavy metals
Table 6-1a. 25m buffer analysis for Solutia and EPA, FCP and Combined datasets (ACHS) - All properties
Table 6-1b. 50m buffer analysis for Solutia and EPA, FCP and Combined datasets (ACHS) - All properties
Table 6-2a. 25m buffer analysis for Solutia and EPA, FCP and Combined datasets (Neurocognitive Study) - All properties
Table 6-2b. 50m buffer analysis for Solutia and EPA, FCP and Combined datasets (Neurocognitive Study) - All properties
Table 6-3a. Correlation matrix between soil levels extracted by 25m buffer analysis and ACHS participants' serum levels - All properties
Table 6-3b. Correlation matrix between soil levels extracted by 50m buffer analysis and ACHS participants' serum levels - All properties
Table 6-3c. Correlation matrix between soil levels extracted by 25m buffer analysis and neurocognitive study participants' serum levels - All properties
Table 6-3d. Correlation matrix between soil levels extracted by 50m buffer analysis and neurocognitive study participants' serum levels - All properties
Table 6-4a. 25m buffer analysis for Solutia and EPA, FCP and Combined datasets (ACHS) - Only focus sites
Table 6-4b. 50m buffer analysis for Solutia and EPA, FCP and Combined datasets (ACHS) - Only focus sites
Table 6-5a. 25m buffer analysis for Solutia and EPA, FCP and Combined datasets (Neurocognitive Study) - Only focus sites
Table 6-5b. 50m buffer analysis for Solutia and EPA, FCP and Combined datasets (Neurocognitive Study) - Only focus sites
Table 6-6a. Correlation matrix between soil levels extracted by 25m buffer analysis and ACHS - Only focus sites
Table 6-6b. Correlation matrix between soil levels extracted by 50m buffer analysis and ACHS - Only focus sites
Table 6-6c. Correlation matrix between soil levels extracted by 25m buffer analysis and neurocognitive study participants' serum levels - Only focus sites
Table 6-6d. Correlation matrix between soil levels extracted by 50m buffer analysis and neurocognitive study participants' serum levels - Only focus sites
Table 6-6e. Regression model of serum PCBs after adjusting age
Table 6-7a. Kriging analysis for Solutia and EPA, FCP and Combined datasets (ACHS) - All properties
Table 6-7b. Kriging analysis for Solutia and EPA, FCP and Combined datasets (Neurocognitive study) - All properties
Table 6-8a. Correlation matrix between soil levels extracted by kriging analysis and ACHS participants' serum levels - All properties
Table 6-8b. Correlation matrix between soil levels extracted by kriging analysis and neurocognitive study participants' serum levels - All properties
Table 6-9a. Kriging analysis for Solutia and EPA, FCP and Combined datasets (ACHS) - Only focus sites
Table 6-9b. Kriging analysis for Solutia and EPA, FCP and Combined datasets (Neurocognitive study) - Only focus sites
Table 6-10a. Correlation matrix between soil levels extracted by kriging analysis and ACHS participants' serum levels - Only focus sites
Table 6-10b. Correlation matrix between soil levels extracted by kriging analysis and neurocognitive study participants' serum levels - Only focus sites

List of Figures

Figure 3-1. Distributions of soil PCB levels with percent African American
Figure 3-2. Distributions of soil PCB levels with percent of old housing units
Figure 3-3. Distributions of serum PCB levels with poverty level
Figure 4-1. Map of study area. Lead sampling sites and its concentrations, overlaid with Lee Brass foundry and major railroads
Figure 4-2. The coefficient surfaces generated using the GWR; parameter estimates of proximity to Lee Brass foundry
Figure 4-3. The coefficient surfaces generated using GWR; parameter estimates of proximity to major railroads
Figure 4-4. Significance map for parameter estimates (proximity to Lee Brass foundry)
Figure 4-5. Significance map for parameter estimates (proximity to major railroads)
Figure 4-6. Soil lead levels modeled based on spatially varying regression coefficients generated using GWR
Figure 5-1. Sampling zones in the area of study (Source: EPA)
Figure 5-2. Soil sampling points collected based on 4 different zones, overlaid with foundries, main hydrology and major railroads
Figure 5-3. Geometric picture of principal components (PCs)
Figure 5-4. U-matrix of the SOM
Figure 5-5a. Loading matrix on PC1 and PC2
Figure 5-5b. Loading matrix on PC1 and PC3
Figure 5-6a. The scores of factor 1 versus factor 2 scatter plots for soil samples taken in the different zones of study
Figure 5-6b. The scores of factor 1 versus factor 3 scatter plots for soil samples taken in the different zones of study
Figure 5-7a. Factor 1 scores on soil samples collected in Anniston study area
Figure 5-7b. Factor 2 scores on soil samples collected in Anniston study area
Figure 5-7c. Factor 3 scores on soil samples collected in Anniston study area
Figure 5-8. Soil texture class in Anniston study area
Figure 5-9. SOM application to metal levels in soils. Differences between sampling points according to the metal concentrations in the specific zones
Figure 5-10. SOM application to heavy metals in soils. Environmental behavior of the different elements
Figure 5-11a. 3 dimensional factor 1 scores interpolated by kriging
Figure 5-11b. 3 dimensional factor 2 scores interpolated by kriging
Figure 5-11c. 3 dimensional factor 2 scores interpolated by kriging
Figure 5-12a. Interpolated Factor 1 scores
Figure 5-12b. Interpolated Factor 2 scores
Figure 5-12c. Interpolated Factor 3 scores
Figure 6-1. Sample locations of Solutia and FCP soils
Figure 6-2. Two focus sites and participants living in close proximity to the focus sites
Figure 6-3a. Buffer Maps to Estimate Residential Soil PCB Level of ACHS Participants
Figure 6-3b. Buffer Maps to Estimate Residential Soil PCB Level of neurocognitive Study Participants
Figure 6-4. Kriging Map to Estimate Residential Soil PCB Levels in Anniston using Solutia and EPA, and FCP Soil Dataset
Figure 6-5a. Scatter Plot between soil levels extracted by 50m buffer analysis and ACHS participants' serum levels
Figure 6-5b. Scatter Plot between soil levels extracted by 50m buffer analysis and ACHS participants' serum levels
Figure 6-6a. Scatter Plot between soil levels extracted by kriging analysis and neurocognitive study participants' serum levels (before correcting outliers)
Figure 6-6b. Scatter Plot between soil levels extracted by kriging analysis and neurocognitive study participants' serum levels (after correcting outliers)

Abstract

Polychlorinated biphenyls (PCBs) were produced in Anniston, AL, and used in various commercial applications from 1929 until their ban in the mid-1970s due to concerns about their environmental and biological persistence and toxicity. The principal objective of this dissertation is to identify socioeconomic and spatial patterns, trends and distributions of soil PCBs and heavy metals, and of serum PCBs, in Anniston, AL, using Geographic Information Systems (GIS) and spatial statistics. The study is based on EPA soil data, together with Anniston Community Health Survey (ACHS) data and neurocognitive data, both derived from the Agency for Toxic Substances and Disease Registry (ATSDR). This study hypothesizes that increased levels of both soil and serum PCBs are related to increased exposure of residents to pollution sources, residential location, and low socioeconomic status. An association is suspected between the unusually high levels of soil and serum PCBs and potential contaminating sources (the Monsanto plant and its nearest streams). The spatial distribution of both soil and serum PCBs in the study area is heterogeneous. Two spatial regression models show statistically significant associations between the proximity of residential property to potential contaminating sources and increased levels of both soil and serum PCBs. Residents of poor neighborhoods and low socioeconomic status communities in Anniston face higher risks of soil PCB exposure, which potentially results in increased levels of PCBs in their serum.

Two community health studies based on population-based survey data served as sources of information for this dissertation. In the first health study, serum from 766 ACHS adult participants was analyzed for a total of thirty-five ortho-substituted PCBs by the Centers for Disease Control and Prevention's National Center for Environmental Health laboratory. The second study consists of serum PCB levels for 321 children residing in Anniston and neighboring communities. Comparisons of these studies at varying distances from suspected sources of contamination reveal a significant positive association between soil PCB levels and serum PCB levels of subjects residing near the suspected sources of contamination. In the two health studies, residents living in close proximity to the Monsanto plant and its nearest streams show significant associations between soil and serum PCB levels (r > 0.50, p = 0.009 for the ACHS and r > 0.62, p = 0.001 for the neurocognitive study).

This study also hypothesizes that the spatial distribution of 11 heavy metals in the soil is generally non-homogeneous and that their potential sources are different. Three clusters that share similar distribution patterns and suspected sources of heavy metal pollution were detected using Principal Component Analysis (PCA) and Self-Organizing Map (SOM) methods. Soil Pb (lead), Cd (cadmium), Cu (copper) and Zn (zinc) are primarily attributed to anthropogenic activities, such as operations of chemical foundries and major railroads, whereas the presence of Co (cobalt) and Mn (manganese), and of V (vanadium) alone, is also associated with natural sources such as soil texture, pedogenesis and soil hydrology. A Geographically Weighted Regression (GWR) model also found statistically significant associations between soil lead concentrations and anthropogenic activities (metal casting foundry and railroads).

Keywords: soil and serum PCBs, socioeconomic status, contamination, Monsanto plant,

streams, heavy metals, GIS, spatial statistics, anthropogenic activities, natural sources

Chapter 1 Introduction

The persistence of high levels of polychlorinated biphenyls (PCBs) and lead in soils in heavily industrialized places requires an understanding of geographic factors and patterns (Carlon et al., 2001; Schumacher et al., 2004; Nadal et al., 2004). Identifying geographic patterns and distributions of PCBs and lead in soils may provide better measures for remediating and mitigating soils with elevated PCB and lead concentrations. In addition, assessing serum levels in people living in areas with higher levels of soil PCBs and lead is essential for quantifying human exposure to these contaminants and for healthcare utilization planning and management.

The main research themes and objectives investigated in this study are summarized as follows:

1) Characterize contamination by PCBs and 11 heavy metals, including lead, in soils in Anniston, Alabama:
   - To describe spatial differences in levels of soil PCBs and heavy metals in Anniston
   - To identify geographical patterns and distributions of PCB and heavy metal contamination in soils
   - To interpret spatial variations of PCB and heavy metal contamination in soils

2) Identify spatial contributors to high concentrations of soil PCBs and lead in Anniston:
   - To examine whether there is an association between lead contamination in soils and suspected sources of contamination
   - To identify spatial contributors that elevate levels of soil lead in Anniston

3) Identify geographic trends of soil PCB contamination in relation to socioeconomic characteristics of the community:
   - To identify any trend in the geographic distribution of soil PCB in relation to socioeconomic characteristics

4) Identify environmental factors contributing to the serum PCB levels of residents living in Anniston:
   - To identify any possible environmental risk factors associated with high levels of human serum PCB

5) Investigate the relationship between soil PCB contamination and the serum PCB levels resulting from this exposure:
   - To determine whether there is a relationship between high soil and serum PCB levels

1.1 Research problems

This study analyzed variations in toxic chemical contamination in the context of spatial and socioeconomic factors in Anniston, Alabama. We focused on polychlorinated biphenyls (PCBs) and lead, the most common toxic substances found in the heavily industrialized community of Anniston, Alabama. Although it is a comparatively small community, Anniston has a long history of heavy industry that was essential to the local economy. As a consequence of many years of industrial operations, PCBs and lead remain environmentally persistent pollutants that raise a potential public health concern in this community. Anniston was known as one of only two places in the United States that had a plant which produced PCBs for over forty years (ATSDR, 1996; Olson, 2007).

Human exposure to PCBs can potentially contribute to a wide range of adverse health effects. In addition, lead contamination in soil caused by several foundries located in Anniston presents a public health hazard as well. For the last few decades, the widespread existence of PCBs and lead in the Anniston community has gained nationwide attention because of environmental and public health concerns (ATSDR, 2003; Olson, 2007; EPA, 2008). The US EPA is still in the process of investigating residential and commercial sites and cleaning up soil with high PCB and lead concentrations as a consequence of a consent decree (US EPA and Solutia, 2002; EPA, 2011). Additionally, the Agency for Toxic Substances and Disease Registry (ATSDR), one of the agencies operated by the Centers for Disease Control and Prevention (CDC), supported studies to identify relationships between human exposure to PCBs and lead and potential adverse health effects for the Anniston community during the past three years (ATSDR, 2003; Olson, 2007; EPA, 2008). The Anniston Environmental Health Research Consortium (AEHRC) conducted the Anniston Community Health Survey (ACHS) to assess human exposure to PCBs and the health risks which may be associated with this exposure. In this dissertation, the role of geographic perspectives in studies of PCB and lead contamination in soil and of human serum PCB levels is presented and discussed. The study also evaluated how the distribution of PCB and lead in soil and serum varies in the context of spatial and socioeconomic factors in the Anniston community.

1.2 Study area

The study area is Anniston, Alabama, located in the foothills of the Appalachian Mountains, approximately 60 miles east of Birmingham and 90 miles west of Atlanta. According to the 2008 U.S. Census, Anniston is a community of about 23,000 people and is situated in Calhoun County, encompassing an area of 45.44 square miles, of which 45.4 square miles is land and 0.04 square miles is water. As of 2008, Anniston had a median household income of about $30,698, with 46.2 percent white residents and 51.2 percent African American residents. It is one of two urban centers and principal cities within the Anniston-Oxford Metropolitan Statistical Area. Due to a long history of heavy industry, there is a high potential for the release of contaminants such as PCBs and lead, which may adversely affect the health of both industrialized and non-industrialized neighborhoods in this community.

1.3 Study hypotheses

Two study hypotheses were developed on the basis of the current literature on PCB and lead contamination reviewed in Chapter 2. These two hypotheses were considered representative because current hypotheses argue that elevated levels of PCB and lead in soils, and of PCB in human serum, are associated with increased exposure to pollution sources. In light of this background, this study focused primarily on the following two hypotheses.

1) Hypothesis I: Elevated levels of both PCB and heavy metals, including lead, in soils, and of human serum PCB, are associated with releases at point sources and socioeconomic factors.

   a. Are the spatial variations among the property zones homogeneous?

   b. Are the levels of human serum PCB in the Anniston community associated with distance from the Monsanto plant and main hydrology?

   c. Are race and age of the residents key factors associated with a high level of serum PCB?

   d. Is the distribution of soil PCB related to socioeconomic characteristics in this community?

2) Hypothesis II: The distribution of PCB and heavy metal contamination in soils and elevated serum PCB levels are associated with point sources of contamination.

   a. Are the high levels of soil lead clustered near local foundries or other point sources?

   b. Are the levels of soil lead associated with the distance from the major railroads?

   c. Does living in close proximity to point sources of contamination increase the risk of having a high serum PCB level?
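
As a minimal illustration of how a question such as 2b could be operationalized, the following Python sketch computes the distance from each soil sample to a railroad segment and then the Pearson correlation between that distance and log-transformed soil lead. It is only a sketch under stated assumptions: the railroad coordinates, sample locations and lead values are hypothetical and are not taken from the study data, the function names are introduced here for illustration, and a simple correlation stands in for the regression models that the later chapters actually apply.

```python
import math

def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest planar distance from point P to segment AB.

    Coordinates are assumed to be in a projected system (e.g. metres).
    """
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: A and B coincide.
        return math.hypot(px - ax, py - ay)
    # Position of the projection of P onto AB, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical railroad segment (x1, y1, x2, y2), NOT real Anniston data.
railroad = (0.0, 0.0, 1000.0, 0.0)

# Hypothetical soil samples: (x, y, lead concentration in mg/kg).
samples = [
    (100.0, 50.0, 400.0),
    (300.0, 150.0, 250.0),
    (500.0, 400.0, 120.0),
    (700.0, 800.0, 60.0),
    (900.0, 1200.0, 35.0),
]

distances = [point_segment_distance(x, y, *railroad) for x, y, _ in samples]
log_lead = [math.log(pb) for _, _, pb in samples]

r = pearson_r(distances, log_lead)
print(f"Pearson r between distance to railroad and log(lead): {r:.3f}")
```

With these made-up values the correlation is negative, that is, lead declines with distance from the railroad. With real data, spatial autocorrelation among nearby samples would have to be accounted for before drawing any inference from such a coefficient.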

1.4 Significance of the research

The rationale for performing this research was based on the immediate necessity to formulate sustainable decisions regarding effective mitigation of soil pollution and reduction of potential health risk in the community. Potential health effects caused by human exposure to persistent environmental contaminants like PCBs and lead are becoming a key public health concern in most industrialized countries, including the United States. In particular, this research is carried out at the intersection of environmental health/toxicology and medical geography, by focusing on persistent environmental pollutants like PCBs and lead and their consequent health risks in the community. Together, these disciplines play an important role in interdisciplinary research on environmental contamination and health effects.

Moreover, this research enhances our understanding of exposure science for PCBs and

lead through understanding of the geographical distribution of soil with elevated PCB and

lead concentrations in Anniston, Alabama in particular. Environmental exposure science

includes explanations and descriptions of geographic variations of toxic pollutants, but also

utilizes analytical approaches such as spatial clustering analysis. This research describes and

pinpoints spatial patterns of PCB and lead in soil. In addition, the research examines

association between proximity of the residents to the former PCB manufacturing plant and

their serum PCB levels. In previous studies, many residents with increased levels of

serum PCB and lead have been found to reside in areas close to hazardous waste sites,

major roads and major railroads (Heinze et al., 1998; Reissman et al., 2001). Therefore, it is

important to study local factors that might explain elevated lead and PCB concentrations, and

serum PCB levels. In view of these circumstances, the significance of this research is

associated with two primary contributions as follows:

From a methodological perspective, this research

1. Improves geospatial characterization of both PCB and lead contamination in soils,

and human serum PCB levels in the Anniston community, because this is essential to

enhance understanding of factors responsible for the presence of elevated levels of

both PCB and lead in soils, and human serum PCB. This research will assist

efforts to make a better mitigation plan for environmental contaminants in

Anniston, Alabama.

2. Explores a GIS-based application for effective soil pollution management as well

as analyzes related spatial techniques, which are imperative to the understanding

of the hypotheses proposed above.

From a public health perspective, this research

1. Supports our understanding of spatial patterns, trends and distributions of PCB

and lead concentrations in soil, human serum PCB levels, and suspected sources

of pollution.

2. Provides geographical assessment of environmental pollution risk for making

healthy living choices for residents of this community.

3. Determines the primary geographical factors that might be responsible for the

existing spatial patterns and distributions of elevated levels of PCBs and lead in

soil, and human serum PCB levels in this community.

4. Triggers active debate among the public health policymakers and health

practitioners.

1.5 Structure of dissertation

The aim of the structure of this dissertation is to facilitate the reader's understanding

of how this document is organized. The dissertation is divided into three parts. Part one

covers the introduction to the research and reviews the related literature. Part two

describes the data and methodologies in the study and also presents and discusses results

of data analyses on the geographical exposure science for PCBs and lead in Anniston,

Alabama.

Part One: Introduction and Related Literature

Chapter 1: Introduction to the research

Chapter 1 deals with the background that summarizes the foundation of this research.

It provides a brief summary of research objectives, the definition of the research problems,

the description of the study area, and discussion of the study hypotheses, and research

significance.

Chapter 2: Background and Related Literature

Chapter 2 covers background of the study and the review of relevant literature. It

reviews previous geographical studies of soil pollutants including PCBs and lead, and serum

PCB levels in the study communities. Research hypotheses in Chapter 1 were defined in

combination with the emerging medical geography and environmental health research

problems that have been reviewed in Chapter 2. This chapter includes three key sections.

Section one includes background of PCB and lead production in Anniston. The second
section covers definitions of PCBs and lead, and effects of health, socioeconomic status, and

mobility associated with exposure to PCBs and lead. Spatial methods of chemical exposure

and risk assessment are reviewed in section three. Current methodological studies on

chemical exposure and risk assessment are discussed, including geostatistics and other

statistical techniques in geographical exposure science.

Part Two: Data, Methodologies, Results and Discussions

Chapter 3: Exploring Geographical Variations of Soil and Serum PCB in Anniston,

Alabama: the Association with Socioeconomic and Spatial Variables

Chapter 3 shows that, following their release into the environment, the

concentrations of PCBs in soil exhibit a heterogeneous spatial distribution. It is important to

describe and understand this spatial distribution as a precursor to characterizing human

exposure and health effects. In this chapter, we focus upon the spatial distribution patterns of

soil and serum PCBs measured in Anniston, Alabama, in relation to socioeconomic and

spatial variables. We use spatial regression analysis both to determine the socioeconomic

characteristics of those residing in areas of greatest soil contamination and high serum levels,

and to describe the effects of spatial variables on the concentration of PCBs in soils and

human serum.

Chapter 4: Geographic variation of soil lead concentrations in Anniston, Alabama

Chapter 4 describes the heterogeneous spatial patterns of lead in soil in this

community. Geographically weighted regression (GWR) is used to correct problems of

spatial nonstationarity in ordinary least squares (OLS) regression. GWR generates spatial data

that consider the spatial variation in the relations between independent and dependent

variables. Maps produced from these data in this chapter play an important role in analyzing

and defining spatial nonstationarity.
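
The idea behind GWR can be illustrated with a minimal sketch. It assumes a Gaussian distance kernel with a fixed bandwidth and projected coordinates; the function name `gwr_coefficients` and its parameters are hypothetical and are not the software used in Chapter 4. One weighted least squares regression is fitted per location, so the coefficients are allowed to vary over space.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Fit one weighted least squares regression per observation location.

    coords: (n, 2) projected x/y positions; X: (n, p) predictors; y: (n,) response.
    Returns an (n, p + 1) array of local coefficients, intercept first.
    The Gaussian kernel and the fixed bandwidth are illustrative assumptions.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # add an intercept column
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        weights = np.exp(-0.5 * (dist / bandwidth) ** 2)  # nearer points count more
        root_w = np.sqrt(weights)
        # Scaling rows by sqrt(w) turns weighted least squares into ordinary least squares
        betas[i] = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)[0]
    return betas
```

Mapping a column of the returned array shows where the relation between a predictor and the response strengthens or weakens, which is the kind of pattern a single OLS coefficient cannot express.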

Chapter 5: Analysis of Heavy Metal Sources in Soils using Multivariate Statistics and GIS

Chapter 5 classifies regional concentrations of 11 different heavy metals in soils and

characterizes the distribution of heavy metals in polluted sites using a Principal Component

Analysis (PCA) and a Kohonen Self-Organizing Map (SOM). Kriging is applied to generate

regional distribution maps for the interpolation of un-sampled areas of heavy metal

contamination using GIS techniques.
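
As a rough illustration of the PCA step only (not the SOM or kriging), the sketch below standardizes a samples-by-metals matrix and extracts components with a singular value decomposition. The function name `pca` is hypothetical, and the sketch assumes that no metal has a constant concentration across samples.

```python
import numpy as np

def pca(data, n_components=2):
    """Principal component analysis of standardized columns.

    data: (n_samples, n_metals) concentration matrix.
    Returns (scores, loadings, explained_variance_ratio).
    Assumes every column has nonzero variance.
    """
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)  # z-score each metal
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)          # share of variance per component
    scores = u[:, :n_components] * s[:n_components]  # sample coordinates on components
    return scores, vt[:n_components], explained[:n_components]
```

Metals that load heavily on the same component tend to vary together across sites, which can hint at a shared source.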

Chapter 6: Use of Geographic Information System to Assess Individual Exposure to Soil PCB

Contamination in Anniston, Alabama

Chapter 6 assesses the magnitude of PCB exposure in residents from Anniston,

Alabama by analyzing associations between PCB in soil and serum PCB concentrations using

buffer and kriging methods. In this chapter, we focus upon a group at high risk of PCB

exposure who live in residential areas in close proximity to two significant sources of PCB

exposure, the Monsanto plant and the nearest off-site drainage ditches. Two different serum

datasets and soil samples with PCB levels are used to evaluate individual exposure to soil

PCB contamination in this community.
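
The buffer idea can be sketched as follows. This is a simplified illustration that assumes projected coordinates in metres, treats sources as points, and uses hypothetical field names (`xy`, `serum_pcb_ppb`); the actual analysis in Chapter 6 relies on GIS buffer and kriging tools.

```python
import math

def distance_m(a, b):
    """Straight-line distance between two (x, y) points in projected metres."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def split_by_buffer(residences, sources, radius_m):
    """Split residences into those inside and outside any source buffer."""
    inside, outside = [], []
    for res in residences:
        near = any(distance_m(res["xy"], src) <= radius_m for src in sources)
        (inside if near else outside).append(res)
    return inside, outside

def mean_serum(group):
    """Mean serum PCB for a group, or None if the group is empty."""
    if not group:
        return None
    return sum(r["serum_pcb_ppb"] for r in group) / len(group)
```

Comparing `mean_serum` for the two groups gives a first, unadjusted indication of whether proximity to a source is associated with higher serum PCB.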

Chapter 2 Background and Related Literature

Since the industrial revolution began, enormous amounts of a wide range of chemicals

have been released into the water, air, and soil. Pollution is a major concern in heavily

industrialized regions. Today, a prominent source of pollution pressure on the environment

comes from newly industrializing countries (NICs) as they pursue lifestyles similar to those

in developed countries (Meade and Earickson, 2000). Dubos (1965) emphasized that

mankind is adapting, genetically and culturally, to the built-environments that humans have

made. People spend most of the time in their homes and workplaces. Most people in

industrialized nations are living in urban areas, and even in rural areas most of the

environmental stimuli around people are composed of developed land and settlements

(Meade and Earickson, 2000).

2.1 Background of PCB and Lead production in Anniston

2.1.1 History of PCB production in Anniston, Alabama

In the United States, PCBs were extensively used from 1929 through 1979, when the

U.S. EPA prohibited production of this hazardous substance. During this time period, more

than 1.5 billion pounds of PCBs were produced in the United States at only two plants, one

located in Anniston, Alabama and the other in Sauget, Illinois. Congress legislated the Toxic

Substances Control Act (TSCA) in 1976 due to the existence and toxicity of PCBs in the

environment (ATSDR, 1996; Olson, 2007; EPA, 2008). This contained strict regulations on
the production, processing, and distribution of PCBs. Consequently, the TSCA established true

"cradle to grave" (that is, from production to disposal) management for PCBs that

were manufactured in the United States (ATSDR, 1996; Olson, 2007; EPA, 2008).

From 1935 to 1977, Solutia, Inc., previously named Monsanto, manufactured PCBs in

Anniston. The Solutia facility is located about 1 mile west of downtown Anniston and

occupies 70 acres of land. Based on the 1990 U.S. Census, there were approximately 1,580

families and 5,926 residents residing within 1 mile of the Monsanto plant (ATSDR, 1996;

Olson, 2007; EPA, 2008). The racial makeup of this area was about 56% white and 43%

African American. Furthermore, about 7% and 15% of the residents were children under the

age of five and residents over the age of 65, respectively (ATSDR, 1996; Olson, 2007; EPA,

2008). Many hazardous materials, including PCBs, were buried on-site, and an enormous

amount of PCBs was released into the environment during the more than forty years that

this site was operated.

In 1917, the Southern Manganese Corporation (SMC) opened the original facility, and

started to manufacture ferro-phosphorous compounds, ferro-manganese, ferro-silicon, and

phosphoric acid. In addition, the facility started to manufacture biphenyls in the late 1920s

(ATSDR, 1996; Olson, 2007; EPA, 2008). SMC was renamed the Swann Chemical

Company (SCC) in 1930 and it was purchased by Monsanto Company in 1935. Monsanto

Chemical Co. started the manufacture of PCBs in 1930 and it stopped producing PCBs in the

early 1970s. In 1979, the Environmental Protection Agency (EPA) banned the manufacture

of PCBs in the United States (ATSDR, 1996; Olson, 2007; EPA, 2008; EPA, 2009).

Monsanto changed its name to Solutia in 1997 and now manufactures para-nitrophenol and

polyphenyl compounds at its facility in Anniston.

During the facility's operational history, it disposed of hazardous wastes into two

different landfills, which are named the west end landfill (WEL) and the south end landfill

(SEL); these are located adjacent to the plant (ATSDR, 2003; Olson, 2007; EPA, 2009).

These hazardous waste sites and off-site drainage ditches contained PCB waste products

from production and caused soil contamination. The WEL is located in the southwestern area

of the facility, covering six acres of land; it was used to dispose of toxic wastes produced at

the facility from the mid-1930s to 1961. In 1961, the Alabama Power Company purchased

the WEL, and then Monsanto started to dispose of toxic substances at the SEL (ATSDR,

2003; Olson, 2007; EPA, 2009). The SEL is located southeast of the facility across U.S.

Highway 202, and is located at the lower northeastern slope of Coldwater Mountain

(ATSDR, 2003; Olson, 2007; EPA, 2009). The SEL includes two cells that were used to

dispose of hazardous substances from the facility, and ten cells that are unlined. Monsanto

ended its disposal of toxic wastes in these landfills in approximately 1988 (Olson, 2007; EPA,

2009).

2.1.2 The investigation of PCB pollution in Anniston, Alabama

The discovery of elevated PCB concentrations in fish from Lake Logan Martin,

located approximately 30 miles from Anniston, was the first indication of PCB

contamination in this area.

The sediments in two local streams, Snow Creek and Choccolocco Creek, which flow

into Lake Logan Martin, were the primary sources of the PCBs that bioaccumulated in fish (Olson,

2007). In November 1993, the detection of large quantities of PCBs in fish from Snow

Creek, Choccolocco Creek, and Lake Logan Martin prompted the State of Alabama to issue

a fish advisory for these locations, restricting fish consumption for all persons. According to

the U.S. Environmental Protection Agency (EPA), the Agency for Toxic Substances and

Disease Registry (ATSDR), the Alabama Department of Environmental Management, and

the Alabama Department of Public Health, the PCB production plant formerly operated by

Monsanto was identified as the primary cause of PCB contamination in the region (Olson,

2007; EPA, 2009). Consequently, the Health Consultation on the Evaluation of Soil, Blood and

Air Data for Anniston, Alabama was released by ATSDR in February of 2000 (ATSDR,

2000a; Olson, 2007). Solutia, Inc. (formerly Monsanto) conducted an investigation under a

consent order with the Alabama Department of Environmental Management and examined

levels of PCB contamination from soil and sediment samples collected at private residences

north and east of the plant and off-site drainage ditches (ATSDR, 2001; Olson, 2007; EPA,

2009).

Findings from the investigation of PCB pollution in this region prompted property

buyouts for some residents and remediation of off-site polluted places. Furthermore,

throughout the investigation, elevated serum levels of PCBs were detected for residents

living in close proximity to the Monsanto plant and for other neighborhoods in Anniston

(Olson, 2007; EPA, 2008; EPA, 2009). Residents living in Anniston were more vulnerable

to the unwanted environmental PCB exposure because of high concentrations of PCB

pollution in the air and soil caused by various contamination sources and pathways.

Several circumstances contribute to the high risk of PCB exposure among residents living in

Anniston. In particular, children are often in contact with contaminated dust, soil, or dirt

(ATSDR, 1995; Tsongas et al., 2000). In the present situation, it appears that a primary

environmental pathway of PCB exposure in Anniston is water to soil to vegetable plants to

human food because the Monsanto plant, a major PCB manufacturing facility in Anniston,

released PCB-contaminated wastes into off-site drainage ditches and streams. Deposition of

vapour-phase PCBs on the surface of soils or plants is a second pathway (Tsongas et al.,

2000). Geographic Information System (GIS) mapping of PCBs in soil of the Anniston area

indicates that concentrations are highest near the off-site drainage ditches and the streams

which received liquid disposal discharge from the Monsanto plant and this evidence

corresponds to other case studies on PCB exposure (Kerzhentsev et al., 1997; Tsongas et al.,

2000).

An average air PCB level of 62.8 ng/m³ was also measured from samples collected at

the east side of the Monsanto plant in 1999 (ATSDR, 2000a; ATSDR, 2000b; Olson, 2007).

By comparison, average PCB concentrations of 0.3 to 1.5 ng/m³ have been measured

in urban regions in Alabama and in other rural and urban regions

in the United States (ATSDR, 2000a; ATSDR, 2000b; Olson, 2007). In addition, PCB

concentrations up to 2810 parts per million (ppm; mg/kg) were measured in soil samples

collected in the floodplain connected to the Monsanto plant and PCB concentrations up to

840 ppm were measured in places not in the floodplain, but nearby the Monsanto facility

(ATSDR, 2000a; ATSDR, 2000b; Olson, 2007).

Sites highly contaminated with PCB have been cleaned up by the Monsanto and Solutia

companies through a court-approved agreement with the EPA. The EPA established two

soil remediation criteria and one interior dust criterion in Anniston, with PCB concentrations

measured in ppm: 1) excavation of surface soils in residential

yards where five-point composite soil samples contain total PCB concentrations greater than

1 ppm; 2) excavation of subsurface soils in residential yards where five-point

composite soil samples contain total PCB concentrations greater than 10 ppm; and 3)

cleanup of home interiors with total PCB concentrations in dust greater than 1 ppm (EPA,

2011). In 2003, the US EPA started the large-scale remediation of PCB-contaminated

residential soil in Anniston. Over twenty thousand soil samples from different parties were

collected and analyzed and about 500 residential properties had the surface soil removed and

replaced (EPA, 2009; EPA, 2011).
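
The three criteria above amount to simple threshold rules, which can be written as a short sketch. The function name, argument names, and action labels are hypothetical; only the thresholds (1 ppm for surface soil, 10 ppm for subsurface soil, 1 ppm for interior dust) come from the text.

```python
def pcb_remediation_actions(surface_ppm=None, subsurface_ppm=None, dust_ppm=None):
    """List the remediation actions triggered by total PCB concentrations (ppm).

    A value of None means that the medium was not sampled.
    """
    actions = []
    if surface_ppm is not None and surface_ppm > 1:
        actions.append("excavate surface soil")
    if subsurface_ppm is not None and subsurface_ppm > 10:
        actions.append("excavate subsurface soil")
    if dust_ppm is not None and dust_ppm > 1:
        actions.append("clean home interior")
    return actions
```

For example, a yard with 3 ppm in surface soil and 4 ppm in subsurface soil would trigger surface excavation only, because the subsurface value does not exceed 10 ppm.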

2.1.3 Lead production in Anniston, Alabama

Since the 1870s, Anniston, Alabama has been a site of iron production which

contributed to the uncontrolled environmental release of lead in this community. In the

1920s, Anniston was the nation's largest producer of cast-iron soil pipe, with an annual

production of about 140,000 tons. In fact, during this time period, Anniston was known as

the "Soil Pipe Capital of the World." In more recent years, most foundries have closed, with

the exception of a few. Lead is present as a hazardous environmental contaminant since it is


a component of many alloys widely used for casting. The EPA's investigation of lead

contamination in Anniston began when high concentrations of lead were found in soil when

sampling for PCBs during 1999-2000. Companies which presently or formerly owned the

foundries in Anniston agreed to pay for residential lead cleanup for some local properties

(US EPA, 2005). The primary cause of lead contamination in Anniston was the use of

sand casting molds in local foundries to cast metal pipes. When

these molds were broken off, they released a volume of waste sand equal to that of the pipe.

Many residents in Anniston utilized sand from the molds as fill dirt to level their yards.

The first main source of lead exposure in Anniston is the Lee Brass Foundry; it is

known as a major manufacturer in the study area producing one of the highest mold rates in

the industry. The process begins by melting metals remodeled from sand and this melted

metal is poured into molds to make the castings. This foundry is involved in four casting

markets (commercial, plumbing, industrial, and marine products) for the U.S. Navy (Lee

Brass). Through this process, lead becomes a main soil contaminant in the waste sand.

Based on EPA reports, waste sand from brass casting foundries contains up to about 3,000

ppm of lead, and approximately 600 ppm typically dissolves into soils (Anderson et al.,

1983).

In May 2005, the US EPA entered into an Order for Consent on Removal Action with

former and current owners of foundries to sample soil and test for lead levels and remediate

sites with excess lead levels. EPA has been cleaning up lead from residential properties at

the site based on a three-tiered approach, consistent with EPA's August 2003 Superfund

Lead-Contaminated Residential Sites Handbook, OSWER 9285.7-50. Tier 1 properties are

residential properties with soil lead concentrations greater than 1,200 ppm, and a sensitive

population: either a child less than 7 years old, or a pregnant woman residing at the property.

Tier 2 properties are residential properties with soil lead concentrations between 400 ppm

and 1,200 ppm and a sensitive population, or soil lead concentrations above 1,200 ppm and

no sensitive population. Tier 3 properties are residential properties with soil lead

concentrations between 400 ppm and 1,200 ppm and no sensitive population. As a result of a

consent decree, the US EPA is still in the process of investigating commercial and

residential areas and remediating soil with high lead concentrations.
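
The three-tiered approach described above can be summarized as a small decision rule. The sketch below is illustrative only: the function name is hypothetical, and treating the 400 ppm and 1,200 ppm bounds of the "between" range as inclusive is an assumption, since the text does not state how boundary values are handled.

```python
def lead_cleanup_tier(soil_lead_ppm, sensitive_population):
    """Return the cleanup tier (1, 2, or 3) for a residential property, or None.

    sensitive_population: True if a child under 7 years old or a pregnant
    woman resides at the property.
    Assumption: the 400 to 1,200 ppm range includes both endpoints.
    """
    if soil_lead_ppm > 1200:
        return 1 if sensitive_population else 2
    if soil_lead_ppm >= 400:
        return 2 if sensitive_population else 3
    return None  # below 400 ppm: not covered by the tiers described above
```

Under these rules, a property with 1,500 ppm and no sensitive residents falls in Tier 2, the same tier as a property with 800 ppm and a young child.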

2.1.4 Other heavy metal sources in soils

Soils are important environments where air, water, and rock interface. As a

consequence, they are a site for various contaminants caused by anthropogenic activities

such as industry, transportation, and agriculture (Facchinelli et al., 2001). Soils can be also a

primary source of heavy metal contamination to ecosystems like surface and ground waters,

living organisms, and the ocean. It is widely known that various types of chemicals released

by foundries are toxic (Mehlman, 1992). Pollutants are released into the atmosphere, soils,

and vegetation of residential areas. Therefore, communities located near foundries can face

higher risks of adverse health effects, such as cancers and other chronic diseases (Kaldor et

al., 1984; Bhopal et al., 1998; Lin et al., 2001). Moreover, foundry waste includes inorganic

pollutants that lead to contamination of soils and ecological and human health hazards.

Consequently, the disposal of foundry waste has been associated with the pollution of soils

with lead (Pb), cadmium (Cd), chromium (Cr), copper (Cu), nickel (Ni), vanadium (V) and

zinc (Zn), among other potentially toxic substances (Schroder et al., 2000).

For the last few decades, the widespread existence of heavy metals has gained

nationwide public attention for several reasons: 1) long-term exposure to low concentrations of toxic

substances such as lead (Pb), arsenic (As), mercury (Hg), and cadmium (Cd) can result in a

wide range of adverse health effects, even though current concentrations of heavy metals in

the environment have little impact on morbidity or mortality in the general population; 2)

unlike many organics, they are highly resistant to environmental degradation, causing

bioaccumulation; 3) they are present at background levels of non-anthropogenic origin,

and their background levels in soils are closely associated with pedogenesis and weathering

of parent rocks; and 4) they can be mobilized by changing environmental conditions

such as climatic and land use change (Stigliani, 1993; Christensen, 1995; Chang, 1999).

2.2 PCBs and Lead exposure

2.2.1 What are PCBs and Lead?

Many toxic substances such as lead and PCBs have several desirable physical and

chemical properties that make them attractive for industrial use. Commercially, lead is used

as an important stabilizer and is necessary in many kinds of batteries. Polychlorinated

biphenyls (PCBs) are persistent global environmental contaminants that were used in a wide

range of applications including hydraulic fluids, lubricants, and dielectric fluid in capacitors

and transformers (Schantz, 1996). PCBs are composed of a mixture of 209 different

congeners within a class, which has a biphenyl ring structure that has from one to ten

chlorine atoms in various positions. Among the congeners, the PCBs 118, 138, 153, and 180

are the four most abundant congeners in humans, constituting 46 percent of the total PCBs

in humans and represent a complex mixture of interrelated environmental xenobiotics

(Weisglas-Kuperus and Patandin et al., 2000). Aroclor 1260 is defined as a complex

commercial mixture of chlorinated biphenyl congeners, which have from one to nine

chlorines and an average chlorine content of about 60%. Physical properties of various PCB

mixtures can vary from a waxy solid to an oily liquid (Olson, 2007). PCBs are extremely

resistant to environmental degradation. The lipophilicity and environmental and biological

stability of PCBs cause them to bioaccumulate in fish and other wildlife and ultimately in

humans who consume contaminated fish and other food products (Schantz, 1996; Stewart

et al., 1999). According to one study, about 60% of the total PCBs produced has been either

deposited in landfills for storage or is still in use in older electrical equipment, while 30% of

the total PCBs produced have been released into the environment (Tanabe, 1998). Therefore,

the potential for additional release of PCBs into the environment still exists. Most researchers

have reached the similar conclusion that PCBs will continue to be a major contaminant for at

least several more decades (Hansen, 1987; Lang, 1992; Tanabe, 1998).

Lead serves as a good example of how our cultural capacity to change the

environment has outpaced our biological ability to adjust. Lead is commonly found in rocks

on the earth's surface, as well as in its waters (Meade and Earickson, 2000). It has been

considered one of the major toxic hazards, but it is usually produced at low levels. Lead

additives to gasoline were primary causes of increasing blood lead levels, so the U.S. EPA banned

lead additives in fuel. Lead interacts with a wide variety of body chemicals and it causes

chronic nerve diseases and brain damage, due to its effects on copper metabolism in the

human body (Meade and Earickson, 2000). Several situations contribute to the high risk for

lead poisoning in humans, especially among children. Children are often exposed by contact

with polluted soil or dirt, causing them to have lead poisoning (Heinze et al., 1998).

However, it is not easy to treat chronic lead poisoning because of side-effects in the

chelation process, whereby a substance joins with the lead and causes it to be excreted from

the body (Meade and Earickson, 2000). The process can result in higher lead extraction from

bone storage and accelerates an acute blood lead crisis, which did not exist before treatment.

Therefore, proper preventative measures should be made before potentially harmful

treatment becomes necessary (Heinze et al., 1998; Meade and Earickson, 2000).

2.2.2 Health effects associated with exposure to PCB

According to the report of the ATSDR on the assessment of air, soil and blood data

from Anniston, Alabama,

PCBs in soil in some areas of Anniston present a public health hazard

based on the potential for chronic cancerous and non-cancerous health effects.

Furthermore, residential soils in some areas of Anniston with higher levels of

PCBs may present a public health hazard for thyroid and neurodevelopmental

effects for intermediate exposure durations, which are less than 1 year of

exposure. (ATSDR, 2000; Olson, 2007).

Thus, it is important to discern the risk to public health in the community due to

environmental exposure to PCBs in Anniston even though production of the PCBs was

banned nearly 30 years ago.

For adults, blood levels of PCB over 5 parts per billion are regarded as exceeding the

average level for the general U.S. population (CDC, 2010). Some residents in Anniston have

been excessively exposed to total PCBs, which are comparable to exposure levels for people

who are involved in heavy occupational exposure to PCBs (Olson, 2007). Blood levels of

PCB in Anniston residents are often found to be higher even compared with those recorded

in individuals who had high environmental exposures to PCBs due to heavy consumption of

contaminated fish in the Great Lakes (Anderson et al., 1998; Falk et al., 1999; Hanrahan et

al., 1999; ATSDR, 2000; Stephen et al., 2001; Olson et al., 2002; Olson, 2007).

In toxicology, there is a general principle that the risk of adverse health effects caused

by the exposure to toxic chemicals, like PCBs and lead, becomes higher as the level of

exposure and the resulting dose in the human increases (Olson, 2007). A wide range of

potential health effects associated with the exposure of humans to PCBs has been found and

they include cancer, cardiovascular disease, chloracne, developmental effects, diabetes,

impaired thyroid function, liver injury, immune system dysfunction, neurobehavioral effects,

and reproductive system impairment (ATSDR, 2000; Maruyama et al., 2002; Pavuk et al.,

2004; Tusscher and Koppe, 2004; Olson, 2007). In particular, children are more vulnerable

to adverse health effects due to exposure to PCBs. Therefore, residents living in Anniston

have a relatively higher risk of developing one or more potential health effects listed above

(ATSDR, 2000; Olson, 2007).

In summary, the forty-two years of PCB production in Anniston have resulted in high

levels of PCB contamination in this community. Also, this situation led residents to be

unknowingly exposed to high concentrations of environmental PCBs (ATSDR, 2001;

Olson, 2007). Exposure to PCBs most likely started in the 1930s and reached its

maximum in the 1970s, before the U.S. EPA banned the production of PCBs in the United States.

Exposure to PCBs is still an issue today, but to a lesser extent. This suggests that blood PCB

levels for the residents were much higher in past years, including the period of PCB

production in the Anniston community (ATSDR, 2001; Olson, 2007; EPA, 2008). Based on the

serum PCB analyses conducted more recently in Anniston residents, there is an increased

risk of various adverse health effects due to elevated serum PCB levels in some residents.

2.2.3 Effect of socioeconomic status on exposures to PCBs

A number of studies, including our own, have investigated the relationships between

socioeconomic status and exposures to PCBs and have identified associations (Borrell et al.,

2004; Vrijheid et al., 2010). Previous literature has hypothesized that a lower socioeconomic

status measured by low income and limited education, would be closely related to higher

exposures to PCBs. In the adjusted analyses, income has been shown to contribute

significantly to the increase in serum concentrations of PCBs, while education has not

(Borrell et al., 2004). In addition, a recent study has examined whether inequalities of

socioeconomic status affect the level of human exposure to various common environmental


contaminants in air, water and food. This study has found that associations between

socioeconomic status indicators and PCBs were strong and consistent in direction, whereas

concentrations in other types of contaminants like nitrogen dioxide and DDE generally

showed weak or no relationships to socioeconomic status indicators (Vrijheid et al., 2010).

Both Borrell et al. (2004) and Vrijheid et al. (2010) maintain that socioeconomic disparities

and serum concentrations of multiple environmental contaminants including PCBs are

related in that socioeconomic disparity affects psychosocial factors that exacerbate risk

factors for elevating serum concentrations of the contaminants. Some theories regarding this

relationship contend that inequalities in socioeconomic status create appreciable strain that

ultimately increases risks of adverse health effects (Wilkinson, 1999; Marmot and Wilkinson,

2001).

2.2.4 Mobility and Environmental Health

Population movement, which exists at various scales in space and time and is highly

influenced by economic development, is an important factor in understanding disease

etiology and environmental exposure. Mobility can be classified into two types,

migration and circulation. Migration can be defined as permanent movements and is not often

measured unless it crosses a political border. The spatial scale in migration can vary from

international to moving to another house after marriage (Meade and Earickson, 2000). In

contrast, circulation refers to movements that return to the origin including travel to perform

daily activities such as shopping, work, school, or a vacation. Mobility affects environmental

exposure and is thus an aspect of disease risk; it has been a popular topic studied by

population geographers (Han, 2003; Meade and Earickson, 2000).

Prothero (1961; 1965) first conceptualized the influence of population mobility in

the context of controlling malaria in East Africa. He applied maps of animal-herding

territories, pilgrimage routes, and other culture-related population movement when he tried to

understand disease etiology. Furthermore, Prothero (1977) developed theoretical models for

disease and mobility related with potential health hazards for the study of population mobility.

Zelinsky (1971) mentions a demographic transition of mobility, which reveals a

relationship between development and mobility. According to Zelinsky, in a pre-modern

society, fertility and mortality were high, while residential mobility rarely occurred. Thus,

circulation movement was limited to certain types of activities such as agricultural needs,

religious travel, and warfare. However, as the demographic transition started and mortality

rates decreased, population growth generated enormous population movement. People

migrated from rural areas to cities and to foreign countries, and labor circulation increased. In

contrast, in the future demographic transition, migration occurs mainly between cities and for

retirement, as birth rates and death rates have stabilized at low levels. Circulation

becomes dramatically more active as it includes all movements for social and economic

purposes (Zelinsky, 1971). The impacts of these demographic transitions in mobility, related

to economic development, on patterns of disease diffusion are important factors for future

epidemiological study.

Han (2003), in his review of mobility and health, notes two migrant

studies. Cliff et al. (1986) emphasize the effects of migrant populations on the diffusion and

transmission of communicable disease. In this study, much attention was paid to infectious

diseases such as malaria with regard to disease transmission by human migration. In another

migrant study, McKinlay (1975) compares disease rates between migrants and non-migrants

before and after migration, in relation to the comparison between native people and migrants

and the altered risk factors of disease associated with new habitats. Overall, there are close

associations between migration and disease occurrence as a result of moving into new

habitats and having stress after migration (McKinlay, 1975). In addition, migration from the

countryside to cities affects disease incidence, which demonstrates the etiologic importance

of migration (Hull, 1979).

2.3 Geographic and spatial analysis on disease

Walter (2000) stated that epidemiological analyses in geography date back to the 1800s,

characterizing the spread and possible causes of outbreaks of infectious diseases such as

cholera and yellow fever and using maps of disease rates in different countries. Over the

decades, epidemiological research in geography became more complex, more sophisticated, and

more widely used. Doll and Keys (1980) also noted that spatial epidemiology brings a rich tradition of

ecologic studies that use explanations of the spatial distribution of diseases in different places

for better comprehension of the etiology of disease.

Stocks (1936) highlighted disease mapping and cartography as an early example of

epidemiological analyses in geography. He analyzed variations in cancer mortality in

England and Wales. As more recent examples of disease mapping, Swerdlow and dos Santos

Silva (1993) made an atlas of cancer incidence across counties of England and Wales. Pickle

et al. (1996) also created an all-causes mortality atlas and a separate cancer mortality atlas for

the United States. Disease mapping provides a visual summary, which is one of the first steps

in exploratory spatial data analysis. GIS enables construction of visual maps of spatial

patterns in terms of morbidity and mortality in relation to population density, geographic

features and causal exposures. In the future, spatial analysis in epidemiology promises to

make further contributions to pattern identification and the formulation of spatio-

epidemiologic hypotheses (Monmonier 1996).

Jacquez (2000) emphasized the usefulness of spatial and geographic analyses in

epidemiology. He noted that GIS support helps to optimize geographic locations for health

services and facilities. It helps to identify medical facilities' coverage

areas and to estimate ambulance travel times to medical facilities (Jacquez, 2000). In

public health, GIS in epidemiology makes substantial contributions to vector control, for

example by identifying which places need intervention to reduce vector-borne diseases such as Lyme

disease and schistosomiasis.

Richardson and Montfort (2000) presented the need for studies in geographic correlation

and spatial statistics. Some of the studies have used explicitly spatial methods to characterize

the geographic distribution of contaminants and serum concentrations of environmental

pollutants (Bailey and Gatrell, 1995; Rushton et al., 1996; Haining, 1998; Hwang et al.,

1999; Richardson and Montfort, 2000). The objective of these studies is to test geographic

variation across populations in terms of exposure to environmental components such as water

or soil, socioeconomic and demographic measures such as income and race, and lifestyle

factors such as diet and smoking in relation to health outcomes (Rushton et al., 1996;

Richardson and Montfort, 2000). In other words, geographic variation in

explanatory variables can be quantified by spatial statistics. According to Rushton and

Lolonis (1996) and Haining (1998), spatial statistics also explain how populations, their

characteristics, covariates and risk factors change over geographic space.

Methodological research on disease clusters and disease incidence has developed, and spatial

statistical software is often combined with GIS, creating interactive exploratory spatial data

analysis (Bailey and Gatrell, 1995; Haining, 1998).

Devine and Louis et al. (1994) and Lawson (1999) developed spatial models for

epidemiological research in geography. Bayesian models, advanced Monte Carlo estimation

techniques, and geostatistical models are examples of spatial models that are applied for

smoothing disease rates in maps. These models allow us to stabilize and interpolate disease

rates, and also to estimate explanatory variables at non-measured locations. For example,

kriging is defined as a linear-weighted gridding method, which has been applied to extract

additional value by creating contour surfaces of pollutant distribution using spatial sample

data (Juang and Lee, 1998; Nathanail et al., 1998).

All of these benefits found in epidemiological research in geography strongly support

descriptive epidemiology. They facilitate the identification of clustering and hotspots of

disease and also can suggest possible causes of disease on a local scale.

2.4 Comparing spatial patterns of clustering

2.4.1 Kernel density estimation

In general, density surfaces can be used to identify concentrations of points. For

instance, if high levels of both PCB and lead in soils, and human serum PCB are found in

close proximity to contamination sources, then the density of contaminated soil samples

with PCB and lead, and elevated serum PCB levels will show greater concentration near the

pollution release sources. Density is a measure of the magnitude or frequency of a

phenomenon per unit of area, such as the number of pedestrian crashes per square mile or

people per square mile (Pulugurtha et al., 2003). Density can be computed by using either

simple or kernel estimation. Both estimations use a circular search area to compute density

(Pulugurtha et al., 2003). In particular, a kernel density map is a good method for showing

where point features, such as pedestrian accidents, are concentrated. The main difference

between the two density methods is that simple density estimation computes each individual

cell density value from the number of points that fall within the search area relative to the size of the

search area, while the kernel method weights events differently depending on their location

(Pulugurtha et al., 2003).

The study area is divided into a predetermined number of cells in the kernel

estimation. Instead of simply using a circular search area around each cell in simple

estimation, the kernel estimation applies a circular neighborhood around each point and then

a weighted sum or count of events is computed, where the weight is an inverse function of

distance from the reference position at the center of the neighborhood. The search radius in

the circular neighborhood has a direct effect on the results of the density map. That is, the larger the

search radius, the flatter the kernel surface becomes (Pulugurtha et al., 2003). A

quadratic function is used to carry out the kernel density estimation in GIS applications and

the quadratic function can be expressed as:

$$K \left(1 - (r/R)^2\right)^2 \ \text{if } r < R, \quad \text{and } 0 \ \text{if } r \ge R$$

where R is a search radius, r is the distance from the sample point, and K is $3 / (\pi R^2)$

(Ormsby, 1999; Pulugurtha et al., 2003). The kernel density approach is well suited to our

needs because it captures local neighborhood effects, while recognizing that the geographic

position of spatial phenomena is often not known with great accuracy.
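The quadratic kernel described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `quadratic_kernel` and `kernel_density` and the sample coordinates are hypothetical, and the sketch evaluates the density at a single cell center rather than reproducing any particular GIS tool.

```python
import math

def quadratic_kernel(r, R):
    """Quadratic kernel weight K * (1 - (r/R)^2)^2 for r < R, and 0 otherwise."""
    if r >= R:
        return 0.0
    K = 3.0 / (math.pi * R ** 2)
    return K * (1.0 - (r / R) ** 2) ** 2

def kernel_density(cell, points, R):
    """Sum the kernel weights of all sample points around one grid-cell center."""
    cx, cy = cell
    return sum(quadratic_kernel(math.hypot(px - cx, py - cy), R) for px, py in points)

# Hypothetical sample coordinates in meters (not data from this study).
samples = [(0.0, 0.0), (30.0, 40.0), (500.0, 500.0)]
density_small = kernel_density((0.0, 0.0), samples, R=100.0)
density_large = kernel_density((0.0, 0.0), samples, R=1000.0)
```

With these hypothetical points, the density at the origin is lower for the larger search radius, which is consistent with the flattening effect of a large radius noted in the text.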

Figure 1. Kernel density estimate

Source: Pulugurtha et al., 2003

2.4.2 Nearest neighbor and K-function approach

Nearest neighbor analysis and Ripley's K-statistic are used to detect departures

from spatial randomness in soil contamination and

serum PCB levels in Anniston and to assess their significance. The statistical package CrimeStat is used for the

implementation of these techniques.

The first type of spatial comparison, nearest neighbor analysis, is presented to

determine whether sets of events are clustered more closely than would be expected by

chance. The distribution of events is considered clustered if the mean observed distance

between them is smaller than a minimum distance based on the standard error of a random

distribution (Schneider et al., 2004):

$$\text{Minimum distance} = 0.5\sqrt{\frac{A}{N}} - t\left(\frac{0.26136}{\sqrt{N^2/A}}\right)$$

where A is the total study area measured in square meters, N is the number of highly

contaminated soil samples or high levels of serum PCB, t is the value for a given probability level in

Student's t-distribution, and $0.26136/\sqrt{N^2/A}$ is the standard error distance of a random

distribution (Levine, 2004; Schneider et al., 2004). In other words, for a one-tailed

probability, p, there is less than p percent likelihood that this clustered pattern could be

the result of random chance (Levine, 2004; Schneider et al., 2004).
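As a minimal sketch of this test, the code below compares the mean observed nearest neighbor distance with the minimum distance threshold. The function names, the coordinates, the study area and the value t = 1.645 are all illustrative assumptions; CrimeStat's own implementation may differ in details such as edge handling.

```python
import math

def mean_nearest_neighbor_distance(points):
    """Average distance from each point to its closest other point."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)

def minimum_distance(area, n, t):
    """Threshold: expected random distance minus t standard errors."""
    mean_random = 0.5 * math.sqrt(area / n)
    standard_error = 0.26136 / math.sqrt(n ** 2 / area)
    return mean_random - t * standard_error

# Hypothetical points (meters) packed into one corner of a 100 m x 100 m area.
points = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)]
area = 100.0 * 100.0
observed = mean_nearest_neighbor_distance(points)
threshold = minimum_distance(area, len(points), t=1.645)  # t is an assumed value
is_clustered = observed < threshold
```

In this hypothetical case the observed mean distance is far below the threshold, so the pattern would be flagged as clustered.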

Ripley's K-statistic is also used to compare clustering patterns of different levels of

soil PCB/lead or serum PCB in Anniston, AL. It determines clustering by comparing the

number of observed events within a radius to the expected number for a spatially random

distribution (Bailey and Gatrell, 1995; Schneider et al., 2001; Levine, 2004). The events are

considered clustered when the sum of the count of events within a radius around each event

is greater than the count of events expected under a random pattern. The process is repeated

at increasing radius distances (Bailey and Gatrell, 1995; Schneider et al., 2001; Levine,

2004). The K-function is defined as:


$$K(t_s) = \frac{A}{N^2} \sum_{i} \sum_{j \neq i} I(t_{ij})$$

where A is the area of the study region in square meters and N is the number of highly

contaminated soil samples with PCB and lead or high levels of serum PCB. Each point

(location of a contaminated soil sample or resident location with a high serum PCB level) is

represented as i, and points within a circle of a specified radius ($t_s$) are represented as j, so

that $I(t_{ij})$ is the number of points j within distance $t_s$ of each point i, summed for all

events i. The K-function can be transformed to the L-function to make it more intuitive:

$$L(t_s) = \sqrt{\frac{K(t_s)}{\pi}} - t_s$$

As a result, point patterns are clustered if values of L(ts) are positive, while point patterns are

dispersed when values of L(ts) come out negative for the radius distance of ts. By Monte

Carlo simulation, the L statistic is computed at each interval distance. Values of L less than

the lower limit of the simulation signify dispersion, whereas values of L higher than upper

limit signify clustering (Bailey and Gatrell, 1995; Schneider et al., 2001; Levine, 2004).
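The K- and L-functions above can be sketched directly from their definitions. The following is a simplified illustration: the function names and sample points are hypothetical, no edge correction is applied, and the Monte Carlo envelope used to judge significance is not included.

```python
import math

def ripley_k(points, area, t_s):
    """K(t_s) = A / N^2 times the number of ordered pairs (i, j), i != j, within t_s."""
    n = len(points)
    count = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= t_s:
                count += 1
    return area / n ** 2 * count

def ripley_l(points, area, t_s):
    """L(t_s) = sqrt(K(t_s) / pi) - t_s; positive values suggest clustering."""
    return math.sqrt(ripley_k(points, area, t_s) / math.pi) - t_s

# Hypothetical points (meters) packed into one corner of a 100 m x 100 m area.
points = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)]
k_value = ripley_k(points, area=10000.0, t_s=2.0)
l_value = ripley_l(points, area=10000.0, t_s=2.0)
```

Each of the four hypothetical points has three neighbors within 2 m, so twelve ordered pairs are counted and the resulting L value is strongly positive.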

2.4.3 Local Moran's I coefficient of spatial autocorrelation

Given a set of data points, the Cluster and Outlier Analysis tool, under the spatial

statistics tools in ArcGIS, identifies clusters of points with very similar values and

clusters of points with values that differ greatly in magnitude. Moran's I statistic is a

weighted correlation coefficient of a variable to itself, where observations are geographically

referenced. It serves to detect departures from spatial randomness in an entire sample of

observations. Departures from randomness indicate spatial patterns, such as clustering or

dispersion. The statistic may identify other kinds of patterns such as a geographic trend

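(Moran, 1950). Moran's I (Moran, 1948) is calculated as follows:

$$I = \frac{n \sum_{i}^{n} \sum_{j}^{n} w_{ij} (y_i - \bar{y})(y_j - \bar{y})}{\left(\sum_{i}^{n} \sum_{j}^{n} w_{ij}\right) \sum_{i}^{n} (y_i - \bar{y})^2}$$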

where n is the number of regions, wij is a measure of the spatial proximity between regions i

and j, and y is the variable of interest. The weight matrix (wij) is based on connectivity or

spatial distance (Rogerson, 2006; Rogerson and Yamada, 2009). Moran's I values range from

approximately -1 to 1: values near 1 represent strong positive spatial

autocorrelation, values near -1 represent strong negative spatial autocorrelation, and values near 0 indicate

an absence of spatial pattern (Moran, 1948; Rogerson, 2006; Rogerson and Yamada, 2009).
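To make the behavior of the global statistic concrete, the sketch below computes Moran's I for four regions arranged in a line. The function name `morans_i`, the binary adjacency weights and the data values are assumptions chosen only for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    numerator = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    weight_sum = sum(sum(row) for row in weights)
    denominator = weight_sum * sum(d * d for d in dev)
    return n * numerator / denominator

# Four hypothetical regions on a line; neighbors share a border (binary weights).
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], w)    # similar values sit next to each other
alternating = morans_i([1, 5, 1, 5], w)  # neighbors always differ
```

In this toy layout, the arrangement with similar neighbors gives a positive value and the alternating arrangement gives a value of -1.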

In contrast, a local Moran's I is used to detect local spatial autocorrelation. It can be

used to identify local clusters, such as regions where adjacent areas have similar values, or

spatial outliers, such as areas distinct from their neighbors (Anselin, 1995). In local Moran's

I, a high positive value indicates that the reference feature is surrounded by features with similarly

high or low values, whereas a low (negative) value for I implies that the feature is surrounded by

features with dissimilar values. The local Moran's I for region i is defined as

$$I_i = n (y_i - \bar{y}) \sum_{j} w_{ij} (y_j - \bar{y})$$

As Anselin (1995) states, the sum of the local Moran values obtained for all sub-regions in a

study region is proportional to the global Moran's I: summing $I_i$ over i gives the numerator of the

global statistic, so dividing that sum by $\left(\sum_{i}\sum_{j} w_{ij}\right)\sum_{i}(y_i - \bar{y})^2$ recovers the global Moran's I.
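The relationship between the local and global statistics can be checked numerically. The sketch below uses the unnormalized local form given above; the function name `local_morans`, the weights and the data are hypothetical. Under this definition the local values must be divided by the denominator of the global formula before their sum matches the global Moran's I.

```python
def local_morans(values, weights):
    """Local Moran's I_i = n * (y_i - mean) * sum_j w_ij * (y_j - mean)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    return [
        n * dev[i] * sum(weights[i][j] * dev[j] for j in range(n))
        for i in range(n)
    ]

# Four hypothetical regions on a line with binary adjacency weights.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
y = [1, 1, 5, 5]
local = local_morans(y, w)

# Rescale the sum of local values by the denominator of the global statistic.
mean_y = sum(y) / len(y)
weight_sum = sum(sum(row) for row in w)
sum_sq = sum((v - mean_y) ** 2 for v in y)
global_from_local = sum(local) / (weight_sum * sum_sq)
```

The two end regions, whose only neighbor has the same value, receive positive local values, while the two middle regions, which each border one similar and one dissimilar neighbor, receive zero.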

2.5 Spatial Methods of Chemical Exposure and Risk Assessment

2.5.1 Introduction

Carlon et al. (2001) note that the assessment of health risks focuses on identifying

possible adverse health effects due to exposure to pollutants from a site. In assessing the

health risks, it is important to develop target levels in the contaminated site where remedial

action is needed. For procedures of risk assessment, in general, the source-pathway-receptor

model is applied (US-EPA, 1989; ASTM, 1995; CONCAWE, 1997). Moreover, the

procedures include assessing the environmental behavior and toxicity of the pollutants, the

site characteristics, the possible routes by which humans (receptors) are exposed to the pollutants, and the

dose response of the pollutants (Carlon et al., 2001).

Thus, it is important to identify the primary and secondary pollution sources of the

contaminants. The primary source can be defined as an actual cause of the contamination,

which takes into account the mechanisms of transport and environmental processes and the

place of the discharge. In contrast, the secondary source is the affected environmental

media that individuals are exposed to (Carlon et al., 2001).

Concentration values of contaminants may vary even within short distances because

soil is heterogeneous and often involves processes of accidental contamination. Such

conditions make concentrations of contaminants hard to visualize over space and make it hard to

conceptualize a model of the contaminated site. Furthermore, many studies have to deal

with a large degree of uncertainty and small sample datasets when site

characterization and risk assessment are performed (Dakins et al., 1994; Carlon et al.,

2001).

2.5.2 Principal Component Analysis (PCA) and Cluster Analysis (CA)

Facchinelli et al. (2001), in a study of multivariate statistics and GIS in identifying

heavy metal sources in soils, note that multivariate statistical and geostatistical approaches

have been extensively adopted to reduce the costs for investigation, as well as to eliminate

and quantify uncertainties (Ferguson, 1998; Ferguson et al., 1998). For instance, Principal

Component Analysis (PCA) and Cluster Analysis (CA) are two common multivariate

statistical methods that have been increasingly used in many fields of study, including the risk

assessment of polluted sites. PCA and CA allow the comparison of the composition of the

contaminants in sample datasets and also facilitate identification of the origin of the pollution

(Burns et al., 1997; Carlon et al., 2001).

Moreover, geostatistics has been successfully applied to produce regional

distribution maps for the interpolation of non-point sources of heavy metal contamination

using geographical information system (GIS) techniques (Burns et al., 1997; Carlon et al.,

2001; Facchinelli et al., 2001). Geostatistics has become a useful tool widely adopted in

environmental contamination studies due to its capability to determine spatial uncertainty

(Corwin and Wagnet, 1996; Facchinelli et al., 2001). In addition, GIS has not only been

extensively applied for soil contamination studies at a regional scale, but also applied in

studies of urban air pollution and other urban pollution indicators (Admus and Bergman,

1995; Moragues and Alcaide, 1996; Ebbinghaus et al., 1997).

2.5.3 Kohonen Self-Organized Maps (SOM)

The main purpose of applying new risk assessment methodologies is to assist in the

decision making processes. Thus, all of these methodologies should be easily adopted and

understood by all types of users, including the general public, politicians and scientists.

Recently, significant developments in computational functionality have not only delivered

comprehensive outcomes remarkably faster, but have also improved the capability and

effectiveness of data treatment (Nadal et al., 2006). Kohonen self-organized maps (SOM), for

example, have been widely applied as a tool to visualize and classify sampled data (Nadal et

al., 2004; Park et al., 2004).

Kohonen first proposed the SOM technique as an unsupervised artificial neural

network (ANN) in 1982 (Kohonen, 1982). SOM consists of two different layers, which are

the input layer linked to the dataset, and the output layer corresponding to the map. It is

regarded as an advanced statistical prescreening tool in comparison to classic statistical

techniques. Brosse et al. (2001) emphasize that self-organized maps (SOM) are mainly used

for data mining, which is extracting necessary information from a large amount of data to

find hidden facts in the data. Furthermore, they stress that this technique is able to deal with

fairly large amounts of heterogeneous and unrelated data. In addition, Dan et al. (2002), Tran

et al. (2003) and Shang et al. (2004) state that ANN methodologies including SOM have

been extensively used to rank and elaborate in risk assessment, as well as to characterize the

global pollution of potentially contaminated areas.

2.5.4 Bayesian Disease Mapping

In spatial statistics, Bayesian disease mapping methods have been one of the main

topics for the last two decades. According to Best et al. (2005), Bayesian mapping methods

can provide a robust approach to spatial analysis and disease mapping. They offer easier

ways to incorporate spatial correlation and also can address uncertainty in the modeling

process by creating models for both the observed data and other unknown data as random

variables. The initial development of the Bayesian method concentrated on empirical Bayes

(EB) techniques, which use frequentist processes such as maximum-likelihood estimation

(Breslow and Clayton, 1993) and the method of moments (Dean and MacNab, 2001) to

assess hyper-parameters, and a plugin approximation to the posterior estimation of relative

risks. In particular, the penalized quasi-likelihood (PQL) method has been extensively applied

in empirical Bayes (EB) disease mapping (Dean and MacNab, 2001). The variability of

estimates of relative risks is often underestimated since the EB approach does not take

account of uncertainty coming from assessing hyper-parameters of the random variables,

while the PQL algorithm generally produces almost unbiased point estimates of the relative

risk.
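The shrinkage idea behind empirical Bayes smoothing of relative risks can be illustrated with a simple Poisson-gamma model. This model is an assumption made here for illustration and is not necessarily the one used in the cited studies; the function name and the counts are hypothetical. If the observed count is Poisson with mean equal to the expected count times the relative risk, and the relative risk has a gamma prior with shape a and rate b, the posterior mean is (a + observed) / (b + expected).

```python
def smoothed_relative_risk(observed, expected, a, b):
    """Posterior mean of the relative risk under a Poisson-gamma model.

    Assumes observed ~ Poisson(expected * theta) and theta ~ Gamma(shape=a, rate=b).
    """
    return (a + observed) / (b + expected)

# Hypothetical counts; a and b are assumed hyper-parameters with prior mean a / b = 1.
# Both areas have the same raw ratio observed / expected = 3.
small_area = smoothed_relative_risk(observed=3, expected=1.0, a=2.0, b=2.0)
large_area = smoothed_relative_risk(observed=300, expected=100.0, a=2.0, b=2.0)
```

The small area is pulled strongly toward the prior mean while the large area barely moves. In an empirical Bayes analysis, a and b would be estimated from the data and plugged in as if known, which is why the uncertainty in the hyper-parameters is not carried into the risk estimates.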

In contrast, in recent years, Markov Chain Monte Carlo (MCMC) algorithms have

been used for full Bayesian (FB) estimation of relative risks in Bayesian disease mapping.

An FB approach allows inference on relative risks to be based on estimated

posterior distributions, so that the uncertainty associated with the estimates is reflected through the

hyper-prior specification; however, how hyper-prior specifications affect posterior estimation

remains unclear (Bernardinelli and Montomoli, 1992). Additionally, estimation in FB

disease mapping is widely applied through Gibbs and adaptive rejection sampling (Gilks,

Best, and Tan, 1994). The formula for full Bayesian disease mapping may be denoted, in generic notation, as:

$$p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta)$$

where $p(y \mid \theta)$ is the likelihood of the model, which reflects the relationship between the

data y and the parameters $\theta$, and $p(\theta)$ is the prior distribution of the parameters, which reflects the

initial information on the parameters. Usually, the posterior distribution $p(\theta \mid y)$ is computed by simulation using

Markov Chain Monte Carlo techniques. WinBUGS is Bayesian modeling software that is

used to fit various types of Bayesian spatial models; it uses the Gibbs sampler for this purpose.

Initially, Bayesian methods were used for analyses in small areas for chronic and non-

infectious diseases (Best et al., 2005). Recently, Bayesian disease mapping has extensively

been applied for geographical analysis of tropical diseases such as malaria and Schistosoma

mansoni infection both at a large and a small scale (Diggle et al., 2002; Gemperli et al., 2004;

Raso et al., 2005). However, these methods have rarely been used in studies

of large-scale disease control. In addition, there is a need to incorporate

GIS, remote sensing (RS) and geographic analysis into Bayesian methods so that they can help enhance the

implementation of large-area disease control programs.

Chapter 3 Exploring Geographical Variations of Soil and Serum PCB in
Anniston, Alabama: the Association with Socioeconomic and Spatial
Variables

The initial objectives of this chapter were to 1) characterize the spatial distribution

patterns of soil PCBs in Anniston, Alabama in relation to socioeconomic variables combined

with two spatial factors, and 2) determine the effects of socioeconomic status (such as poverty

level, income, and education level) and spatial variables (such as proximity to the Monsanto

plant) on exposures to PCBs among Anniston Community Health Survey (ACHS)

participants. In addition, we hypothesized that spatial

regression analysis, which associates socioeconomic and spatial variables with PCB

concentrations in soil and serum, would enhance the power to predict PCB levels and

capture significant indicators for each model by accounting for spatial effects and

heteroscedasticity (non-constant error variance).

3.1 Materials and methods

3.1.1 Collection of background data

The area of interest is the vicinity of the Monsanto plant.

Information on two sets of dependent variables (soil and serum PCB levels) and two sets of

independent variables (socioeconomic variables and spatial variables) was collected from

several different data sources. We obtained the database that contained PCB levels of soil

samples in Anniston from the U.S. Environmental Protection Agency (EPA). The database

included PCB levels for 22,452 soil samples, with multiple measurements at each location,

measured in ppm (parts per million or mg/kg), and information on the associated latitude and

longitude coordinates. For regression analyses and mapping purposes, we averaged

the PCB levels of soil samples that were taken from the same place, which resulted in

a total of 6,864 soil sample averages. For the second dataset, the Anniston Environmental

Health Research Consortium (AEHRC) conducted the Anniston Community Health Survey

(ACHS), which was funded by ATSDR. The ACHS database contains the 766 participants'

congener-specific PCB serum levels measured in parts per billion (ppb; ng/g), along with their

health history, occupation, sex, age, and address. At the initial stage, two-stage random

sampling was used for the sample selection: 3,200 households were randomly selected, 1,823 were

successfully contacted, and 713 refused to participate. As a result, 1,100 participants completed

the interviewer-administered questionnaire, and 774 of these participants provided blood

samples.
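The averaging of repeated soil measurements taken at the same place can be sketched as a simple grouping step. The function name, the record layout and the coordinates below are hypothetical and only illustrate the idea; the actual processing of the EPA database is not described in more detail in the text.

```python
from collections import defaultdict

def average_by_location(records):
    """Average PCB values (ppm) over all measurements sharing the same coordinates."""
    grouped = defaultdict(list)
    for latitude, longitude, ppm in records:
        grouped[(latitude, longitude)].append(ppm)
    return {location: sum(vals) / len(vals) for location, vals in grouped.items()}

# Hypothetical records: (latitude, longitude, soil PCB in ppm).
records = [
    (33.65, -85.85, 0.2),
    (33.65, -85.85, 0.4),
    (33.66, -85.84, 1.0),
]
averages = average_by_location(records)
```

Here three hypothetical measurements collapse to two location averages, in the same way that the 22,452 measurements were reduced to 6,864 averages.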

Socioeconomic variables were extracted from the EPA and US Bureau of the Census

2000 data files. In particular, Census data provided the socio-economic data for Anniston at

the census block and block group level, which encompasses a total of 4,358 census blocks

and 87 census block-groups. For mapping purposes, topographic, boundary, railroad and

street network data for the study area were also obtained from the US Bureau of the Census.

As shown in Table 3-1, the Census dataset provides population counts and percentages by

race, gender, age, household family, housing unit, education, employment, and income for

each of the 4,358 census blocks and 87 census block groups that had recorded population and

percentages. Age groups were collapsed into the following seven groupings: 0-9, 10-19,

20-29, 30-39, 40-49, 50-64 and 65+. Education level was divided into four categories: no

education, elementary school education, high school education, and college or graduate

school education. The Census dataset also includes the classes household family (average

household size, average family size and family household), housing unit (occupied housing

unit, renter-occupied housing unit, housing units built before 1970 and others) and

employment (labor force and employed labor force). Furthermore, in the EPA data, property type

was classified into the following categories: industrial, commercial, residential, public, school and others. To

keep track of census blocks and block groups, each was given an arbitrary number. Then,

using GIS, we determined which census block and block group each soil and serum sample belonged to

and assigned the corresponding socioeconomic values to the sample for further

regression analyses. These variables were judged appropriate for the analysis of soil and

serum PCB levels; they were selected to be broad enough to include reasonable indicators

that may influence the distribution patterns of PCB concentrations in soil and serum

samples. Lastly, features of two plant landfills and ditches near the Monsanto plant were

scanned and digitized manually because digital data for these geographic features were unavailable. All of

these data were brought into a Geographic Information Systems (GIS) application and were

used to evaluate the distribution patterns of soil and serum PCBs in relation to

socioeconomic characteristics in the Anniston community.

Table 3-1. Independent variables: socioeconomic and spatial variables

| Group | Category | Variable | Data source |
|---|---|---|---|
| Socioeconomic variables | Race | Percent African American | Census Block |
| | Gender | Percent Female | Census Block |
| | Population | Percent of Population to 9 years | Census Block |
| | | Percent of Population 9 to 19 years | Census Block |
| | | Percent of Population 40 to 49 years | Census Block |
| | | Percent of Population over 65 years | Census Block |
| | Age | Median age | Census Block |
| | Household Family | Average Household size | Census Block |
| | | Average Family size | Census Block |
| | | Family Household size | Census Block |
| | Housing unit | Occupied Housing Unit | Census Block |
| | | Renter occupied Housing Unit | Census Block |
| | | Percent Single Mother | Census Block |
| | | Percent Single Father | Census Block |
| | | Percent Single Parent | Census Block |
| | | Percent of Housing units built before 1970 | Census Block Group |
| | Education | Percent No Education | Census Block Group |
| | | Percent Elementary School Education | Census Block Group |
| | | Percent High School Education | Census Block Group |
| | | Percent College or Graduate School Education | Census Block Group |
| | Employment | Percent Labor Force | Census Block Group |
| | | Percent of Labor Force Employed | Census Block Group |
| | Income | Median Income | Census Block Group |
| | | Percent Poverty Status | Census Block Group |
| | Property Category | Property Own Status (dummy variables: Industrial, Commercial, Residential, Public, School and others) | EPA |
| Spatial variables | | Distance to Monsanto plant (units: meters) | Scanning |
| | | Distance to the nearest ditches (units: meters) | Scanning + Census TIGER |

3.1.2 Regression analyses

We first used a linear regression to model soil and serum PCB levels as a function of

socioeconomic and spatial variables. Ordinary least squares (OLS) stepwise regressions are

estimated to find independent variables that are statistically significant in each model at the

0.05 significance level for entry and 0.10 for removal. Results are further checked for

multicollinearity, and where it is detected, one of the variables concerned is dropped

from further consideration. Because the distributions of both soil and serum PCB

concentrations are markedly skewed, with many small values and a small number of large

values (see Table 3-2), we used a logarithmic transformation of concentration as the

dependent variable. Hence our initial equations were:

$$\ln(\text{PCB level in soil or human serum}) = b_0 + b_1 x_1 + \cdots$$

where soil PCB is the observed concentration in ppm, human serum PCB is the observed

level in ppb, and the x's represent the independent, explanatory variables.
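A minimal sketch of the log-linear specification with a single explanatory variable is given below. The function name and the synthetic data are assumptions for illustration; the stepwise selection and the full set of explanatory variables used in this chapter are not reproduced. The sketch also assumes strictly positive concentrations, since the logarithm of zero is undefined.

```python
import math

def fit_log_linear(x, y):
    """Ordinary least squares fit of ln(y) = b0 + b1 * x for one predictor."""
    ln_y = [math.log(v) for v in y]  # requires every y value to be > 0
    n = len(x)
    mean_x = sum(x) / n
    mean_ln_y = sum(ln_y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (li - mean_ln_y) for xi, li in zip(x, ln_y))
    b1 = s_xy / s_xx
    b0 = mean_ln_y - b1 * mean_x
    return b0, b1

# Synthetic example: concentration (ppm) decays exponentially with distance (m).
distance = [100.0, 200.0, 300.0, 400.0]
pcb = [math.exp(2.0 - 0.005 * d) for d in distance]
b0, b1 = fit_log_linear(distance, pcb)
```

Because the synthetic data follow the log-linear form exactly, the fit recovers the intercept 2.0 and the slope -0.005 used to generate them.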

Given the spatial nature of the data used, spatial autocorrelation and

heteroskedasticity need to be tested since the residuals and the dependent variables may

exhibit not only spatial dependencies but also non-constant error variance. Spatial

dependency is a situation where the error term or the dependent variable at a location is

correlated with observations on the dependent variable at other nearby locations. Diagnostics

of spatial dependencies in the dependent variable and in the residuals are run in the

statistical software GeoDaSpace to identify and account for potential spatial effects that

may bias estimation results of OLS regressions (Anselin et al., 2006; GeoDA, 2010).
Specifically, the Lagrange Multiplier tests pertaining to both the spatial lag (LM lag) and

spatial error (LM error) models are calculated. If both tests are statistically significant, the

robust forms of the tests are used to determine the appropriate model.

Spatial effects evidenced by autocorrelation can be handled econometrically in two

primary ways. The spatial lag model includes a spatially lagged dependent variable, Wy, as

one of the explanatory variables:

$$y = \rho W y + X\beta + \varepsilon$$

where y is the dependent variable, X is the matrix of independent variables, $\beta$ is a vector of

regression coefficients, $\varepsilon$ is a random error term and $\rho$ is a spatial autoregressive

coefficient. On the other hand, the spatial error model expresses each residual as a function

of surrounding residuals. The spatial error model is given by:

$$y = X\beta + \varepsilon, \quad \text{where } \varepsilon = \lambda W \varepsilon + u$$

with the same notation as above and where $\lambda$ is a spatial autoregressive coefficient,

$W\varepsilon$ is the spatial lag of the errors, and u is normally distributed with mean 0 and variance $\sigma^2$.

A spatial error model is estimated by maximum likelihood, while a spatial lag model is best

estimated by a two-stage least-squares (2SLS) procedure, which does not assume normality

of errors and can accommodate a correction for heteroskedasticity, if present.
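The spatially lagged variable Wy that enters the spatial lag model can be sketched as follows. The function names, the coordinates and the threshold are hypothetical, and row-standardizing the weights is an assumption made here for illustration rather than a setting reported in the text.

```python
import math

def threshold_weights(coords, threshold):
    """Row-standardized weights: neighbors are all other points within the threshold."""
    weights = []
    for i, (xi, yi) in enumerate(coords):
        row = [
            1.0 if i != j and math.hypot(xi - xj, yi - yj) <= threshold else 0.0
            for j, (xj, yj) in enumerate(coords)
        ]
        total = sum(row)
        # Points with no neighbors keep a row of zeros.
        weights.append([v / total for v in row] if total > 0 else row)
    return weights

def spatial_lag(weights, y):
    """Wy: the weighted average of neighboring values for each observation."""
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in weights]

# Hypothetical sample locations (meters) and values.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
y = [1.0, 2.0, 3.0, 10.0]
W = threshold_weights(coords, threshold=1.5)
lag = spatial_lag(W, y)
```

With row-standardized weights, each element of Wy is the mean of the neighboring values, and the isolated fourth point, which has no neighbor within the threshold, receives a lag of zero.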

A proper regression model is selected according to GeoDA's (2010)

recommendations. If the OLS regression exhibits heteroskedasticity only, as shown by the

diagnostics for heteroskedasticity, then the White correction (1980) is applied to OLS

results. However, if the results of the LM-lag test are significant (or more significant than

the LM-error test), then the spatial lag model is carried out as the alternative regression.

After the model is run, we apply the Anselin-Kelejian test for residual spatial

autocorrelation. If the results of the latter are significant, the model is re-estimated with the

heteroskedastic and autocorrelation robust (HAC) approach of Kelejian and Prucha

(Kelejian and Prucha, 2010). In the case where the LM-error test is significant (or more

significant than the LM-lag test), a spatial error model is estimated; in case of

heteroskedasticity, the Kelejian-Prucha consistent estimator for heteroskedastic error terms

(KP-HET) is used (Kelejian and Prucha, 2010). In all the spatial models estimated for this

study, the spatial weights matrix is specified according to a threshold distance criterion.
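The selection rule described above can be summarized as a small decision function. This is a simplified sketch: the function name, the returned labels and the 0.05 cut-off are assumptions, "more significant" is read as "smaller p-value", and ties are resolved in favor of the lag model. When both plain LM tests are significant, the p-values passed in are meant to be those of the robust forms. The follow-up Anselin-Kelejian test and HAC re-estimation for the lag model are not included.

```python
def choose_model(p_lm_lag, p_lm_error, heteroskedastic, alpha=0.05):
    """Pick a regression specification from LM test p-values (illustrative labels)."""
    lag_significant = p_lm_lag < alpha
    error_significant = p_lm_error < alpha
    if not lag_significant and not error_significant:
        # No spatial dependence detected: stay with OLS.
        return "OLS with White correction" if heteroskedastic else "OLS"
    if lag_significant and (not error_significant or p_lm_lag <= p_lm_error):
        return "spatial lag"
    return "spatial error with KP-HET" if heteroskedastic else "spatial error"

# Hypothetical test results.
case_ols = choose_model(0.50, 0.60, heteroskedastic=True)
case_lag = choose_model(0.001, 0.20, heteroskedastic=False)
case_error = choose_model(0.03, 0.001, heteroskedastic=True)
```

Writing the rule as a function makes the order of the checks explicit, which is useful because the text applies the heteroskedasticity correction differently depending on which spatial model is chosen.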

Table 3-2. Distribution of (a) soil PCB concentration and (b) serum PCB levels

Table 3-2a

Total soil PCB (ppm)     Count
= 0                          4
>0 and <=0.01              638
>0.01 and <=0.05          5010
>0.05 and <=0.1           8791
>0.1 and <=0.2            2707
>0.2 and <=0.3            1229
>0.3 and <=0.4             862
>0.4 and <=0.5             434
>0.5 and <=0.6             355
>0.6 and <=0.7             261
>0.7 and <=0.8             269
>0.8 and <=0.9             157
>0.9 and <=1               141
>1 and <=10               1393
>10 and <=100              177
>100 and <=300              14
>300 and <=1400             10
Total                    22452

Table 3-2b

Total serum PCB (ppb)    Count
>0 and <=0.1                 5
>0.1 and <=0.5              75
>0.5 and <=1                59
>1 and <=1.5                62
>1.5 and <=2                62
>2 and <=2.5                55
>2.5 and <=3                41
>3 and <=3.5                44
>3.5 and <=4                37
>4 and <=4.5                27
>4.5 and <=5                30
>5 and <=10                141
>10 and <=50               120
>50 and <=100                6
>100 and <=170               2
Total                      766

3.2 Results

3.2.1 Spatial regression results on soil PCBs: socioeconomic and spatial variables

Table 3-3 shows the results of both non-spatial and spatial regression analyses on

soil PCBs with socioeconomic and two spatial variables. In Table 3-3a, estimates of

ordinary least squares (OLS) regression with socioeconomic and spatial variables are

presented. Also, Table 3-3b describes the results of correlations between socioeconomic

variables and the two spatial variables. As shown in the table, seven socioeconomic variables

that are closely associated with the two distance factors are captured as significant indicators

of the distribution of soil PCB levels measured in Anniston, Alabama. The most significant

socioeconomic indicator is percent African American, with a standardized coefficient of

-0.127; areas with a higher percentage of African American population are associated with

lower soil PCB concentrations. In addition, soil PCB concentrations tend to be higher in

areas with fewer family households, a higher percentage of old housing units,

and a higher percentage of residents with no education. Areas of extractive mining activity with

significant surface expression also tend to have higher PCB concentrations. These results are all

consistent with the observation that the highest concentrations occur in white neighborhoods

with old housing and low levels of education; the highest concentrations do not occur in

those neighborhoods that have a disproportionate number of minorities.

Table 3-3. Regression of soil PCBs with socioeconomic and spatial variables

Table 3-3a. Dependent variable: log-transformed soil PCBs; R-squared: 0.216 (OLS)

Independent variable                          Unstandardized coefficient (standardized)   t-statistic
Constant                                      5.268                                       40.550
Distance to Monsanto plant                    -3.156e-004 (-0.304)                        -17.498
Distance to the nearest ditches               -2.717e-004 (-0.171)                        -10.851
% African American                            -0.005 (-0.127)                             -12.331
Family household size                         -0.005 (-0.101)                             -8.993
% college or graduate school education        0.014 (0.101)                               8.526
% of housing units built before 1970          0.007 (0.084)                               5.394
Property- quarries/strip mines/gravel pits    2.861 (0.059)                               5.728
% no education                                0.052 (0.071)                               5.279
% renter occupied housing unit                0.003 (0.041)                               4.864

Table 3-3b. Correlation matrix between socioeconomic and spatial variables
(**: significant at 0.01 level, *: significant at 0.05 level)

Variable                                      Distance to Monsanto   Distance to the nearest ditches
% African American                            -0.309**               -0.145**
Family household size                         0.270**                0.110**
% college or graduate school education        0.508**                0.167**
% of housing units built before 1970          -0.609**               -0.304**
Property- quarries/strip mines/gravel pits    -0.030*                -0.023
% no education                                -0.316**               -0.325**
% renter occupied housing unit                -0.020                 -0.029*

Table 3-3c. Diagnostics for heteroskedasticity

Test                                          Value      Probability
Breusch-Pagan test                            128.719    0.000
Koenker-Bassett test                          58.068     0.000

Table 3-3d. Diagnostics for spatial dependence

Test                                          Value      Probability
Lagrange Multiplier (lag)                     116.160    0.000
Robust LM (lag)                               6.124      0.013
Lagrange Multiplier (error)                   426.908    0.000
Robust LM (error)                             316.872    0.000

Table 3-3e. Dependent variable: log-transformed soil PCBs; R-squared: 0.230 (Spatial error model)

Independent variable                          Unstandardized coefficient   t-statistic
Constant                                      4.701                        17.599
Distance to Monsanto plant                    -1.283e-004                  -2.644
Distance to the nearest ditches               -4.321e-004                  -5.646
% African American                            -0.001                       -1.386
Family household size                         -0.002                       -2.479
% college or graduate school education        0.008                        2.924
% of housing units built before 1970          0.012                        4.282
Property- quarries/strip mines/gravel pits    1.775                        3.667
% no education                                0.012                        0.611
% renter occupied housing unit                0.001                        1.156
Spatial lag for the errors (Lambda)           0.729                        6.286
The value of $R^2$ for this model is 0.216, which is highly significant for a model with

approximately 7,000 observations. Tables 3-3c and 3-3d show that the residuals from the

regression exhibit a high degree of both heteroskedasticity (non-constant error variance;

Breusch-Pagan = 128.719 and Koenker-Bassett = 58.068) and spatial autocorrelation (LM

lag = 116.160 and LM error = 426.908). This indicates that spatial effects and

non-constant error variance could influence the significance of the coefficients.
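
To illustrate what a heteroskedasticity diagnostic of this kind measures, the sketch below computes a Koenker-Bassett (studentized Breusch-Pagan) statistic for the simplest possible case. It is a hedged illustration, not the GeoDa routine: it assumes a single regressor, so the statistic $nR^2$ from regressing squared OLS residuals on that regressor is compared with a chi-square distribution with one degree of freedom, and all data and function names are made up.

```python
import math

def ols_fit(x, y):
    """Intercept and slope of a simple least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def koenker_bassett(x, y):
    """Return (statistic, p-value) for H0: constant error variance."""
    n = len(x)
    b0, b1 = ols_fit(x, y)
    sq_resid = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
    a0, a1 = ols_fit(x, sq_resid)  # auxiliary regression on squared residuals
    mean_sq = sum(sq_resid) / n
    ss_total = sum((u - mean_sq) ** 2 for u in sq_resid)
    if ss_total == 0.0:
        return 0.0, 1.0  # squared residuals are constant: no evidence at all
    ss_resid = sum((u - a0 - a1 * xi) ** 2 for xi, u in zip(x, sq_resid))
    statistic = max(0.0, n * (1.0 - ss_resid / ss_total))
    # Upper tail of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

x = list(range(1, 41))
# Made-up data: error size grows with x in the first series only.
y_growing = [2 + 0.5 * xi + (-1) ** xi * 0.1 * xi for xi in x]
y_constant = [2 + 0.5 * xi + (-1) ** xi * 0.5 for xi in x]
stat_growing, p_growing = koenker_bassett(x, y_growing)
stat_constant, p_constant = koenker_bassett(x, y_constant)
```

The first series should be flagged as heteroskedastic and the second should not. With several regressors, as in the models of this chapter, the auxiliary regression and the degrees of freedom would change accordingly.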

Consequently, a spatial error model was fit using GeoDaSpace (GeoDa, 2010),

expressing each residual as a function of the surrounding residuals; the results

are shown in Table 3-3e. Note that the value of $R^2$ has risen to 0.230. Including the

spatial error term reduces the absolute magnitude of many of the coefficients, as well as their

statistical significance.

In addition to modeling the spatial error term, the maximum likelihood spatial error model

fitted in GeoDaSpace allows us to correct for non-constant error

variance, which could otherwise lead to the inclusion of insignificant variables in the model. As a result,

three socioeconomic variables (percent African American, percent with no education, and

percent renter-occupied housing units) became insignificant after correcting for spatial effects

and non-constant error variance. This suggests that their original inclusion

can be attributed to spatial autocorrelation and non-constant error

variance. The remaining variables stay in the model, however, and

they remain statistically significant. Overall, the results in Table 3-3 demonstrate an important and

well-known issue with regression using spatial data: failure to account for spatial effects and

non-constant error variance can make independent variables seem more

significant than they actually are.

3.2.2 Spatial regression results on serum PCBs: socioeconomic and spatial variables

Table 3-4a describes the results of ordinary least squares regression on serum PCB

levels measured in residents of Anniston, Alabama, using socioeconomic and

spatial variables. In addition, Table 3-4b shows the correlations between the

socioeconomic and spatial variables. The $R^2$ value of the regression is 0.507,

and the standard error of the estimate is 0.908. After taking age, a primary determinant of

serum PCB levels, into account, the socioeconomic variable most highly associated with

serum PCB concentrations is poverty level. The coefficient for poverty

level is 0.179, implying that residents living in areas with higher poverty levels have higher

serum PCB levels. Moreover, serum PCB levels tend to be higher for individuals residing in

areas with a higher percentage of high school education. These results are consistent

with a recent study which hypothesized that lower socioeconomic status, measured by

low income and limited education, would be related to higher exposures to PCBs (Borrell

et al., 2004). Lastly, one spatial variable, distance to the Monsanto plant, is also significant and

negative, indicating that residents living farther from the plant tend to have

lower serum PCB levels.

Table 3-4. Regression of serum PCBs with socioeconomic and spatial variables

Table 3-4a. Dependent variable: log-transformed serum PCBs; R-squared: 0.507 (OLS)

Independent variable                    Unstandardized coefficient (standardized)   t-statistic
Constant                                -2.729                                      -11.658
Age                                     0.056 (0.684)                               26.432
% of poverty level                      0.016 (0.202)                               5.480
% high school education                 0.018 (0.091)                               3.729
Distance to Monsanto plant              -5.605e-005 (-0.066)                        -2.731

Table 3-4b. Correlation matrix between socioeconomic and spatial variables
(**: significant at 0.01 level)

Variable                                Distance to Monsanto
% of poverty level                      -0.442**
% high school education                 0.156**

Table 3-4c. Diagnostics for heteroskedasticity

Test                                    Value     Probability
Breusch-Pagan test                      9.310     0.097
Koenker-Bassett test                    6.890     0.228

Table 3-4d. Diagnostics for spatial dependence

Test                                    Value     Probability
Lagrange Multiplier (lag)               34.430    0.000
Robust LM (lag)                         3.590     0.058
Lagrange Multiplier (error)             33.860    0.000
Robust LM (error)                       3.020     0.081

Table 3-4e. Dependent variable: log-transformed serum PCBs; R-squared: 0.509 (Spatial lag model)

Independent variable                    Unstandardized coefficient   t-statistic
Constant                                -3.817                       -8.906
Spatially weighted dependent variable   0.736                        3.010
Age                                     0.056                        26.984
% of poverty level                      0.012                        3.923
% high school education                 0.021                        4.331
Distance to Monsanto plant              -5.940e-006                  -0.226

Test                                    Value     Probability
Anselin-Kelejian                        0.18      0.674

Tables 3-4c and 3-4d show no problem of non-constant error variance (Breusch-Pagan

= 9.311 and Koenker-Bassett = 6.891), but some spatial dependence (LM lag = 34.43

and LM error = 33.86) is detected in the diagnostics. Following the procedure

outlined in the methods section for selecting a spatial regression model, the spatial lag model

is estimated by two-stage least squares (2SLS), and the Anselin-Kelejian test is then

used to check for residual spatial autocorrelation (Anselin-Kelejian = 0.18). Estimates of the

spatial lag regression on serum PCBs with socioeconomic and spatial variables are given in

Table 3-4e. The $R^2$ value rises slightly to 0.509, and one spatial variable (distance to the

Monsanto plant), which had been selected in the OLS stepwise procedure, drops below the

significance level in the spatial model. As anticipated, the model of log-transformed

serum PCBs identifies high poverty level and limited education level as

strong positive indicators, but no other socioeconomic variables were significantly associated.

3.3 Discussion and interpretation

This study explored the use of both ordinary least squares and spatial regression

estimation methods to identify significant explanatory variables that explain spatial patterns

of soil and serum PCB levels collected in Anniston, Alabama. We used two different

dependent variables: soil PCB levels measured in the study area and serum PCB levels measured in

residents living in the study area. Coefficient values for all explanatory

variables (Table 3-1) were assessed for both soil and serum PCB models. We found that

several socioeconomic variables are significantly associated with both soil

and serum PCB levels in the study area after taking into account two spatial variables,

distances to the Monsanto plant and to the nearest off-site drainage ditches.

For the soil PCB model, we identified percent African American, family

household size, and percent of housing units built before 1970 at the block level as the

three most important socioeconomic indicators associated with soil PCB concentrations in

the study area. One of our initial hypotheses was that high soil PCB levels are closely related

to areas with a high percentage of African American residents, since many African American

residents live around the Monsanto plant and its landfills, as shown in Figure 3-1.

However, we found that the relationship between percent African American and PCB

concentrations was negative rather than positive. This is because there are some blocks

with a high percentage of Caucasian residents right next to the plant, especially on

its southwest and north sides. In general, however, census blocks with a high

percentage of African Americans are clustered closer to the plant operated by

the Monsanto Company than blocks with a high percentage of Caucasians. Therefore, African American

residents would be expected to encounter higher exposures to PCBs than Caucasian residents.

Figure 3-1. Distributions of soil PCB levels with percent African American

In addition, a low number of family households and a high percentage of housing

units built before 1970 at the block/block group level were significantly associated with high

soil PCB concentrations. This is because census blocks and block groups with a low number

of family households and a high percentage of old housing units are mainly located in the

industrial zones where the Monsanto plant and its landfills are sited (Figure 3-2). As

presented in Table 3-3b, all of these socioeconomic variables were found to be significantly

related to the distance factors, distances to the Monsanto plant and to the nearest off-site

drainage ditches. Among the seven selected socioeconomic variables, three (percent African

American, percent of old housing units, and property-quarries/strip mines/gravel pits) are

negatively correlated with the two distance factors. That is,

census blocks/block groups with a higher percentage of African Americans, old housing

units, and strip-mine property are located closer to the Monsanto plant and the nearest

ditches. On the other hand, family household size at the block level had a positive

correlation with the distance factors, indicating that more households live in

census blocks farther from the Monsanto plant and the ditches.

Figure 3-2. Distributions of soil PCB levels with percent of old housing units

Our analyses also demonstrated that three factors (percent African American,

percent with no education, and percent renter-occupied housing units) drop out of

the regression once spatial autocorrelation and non-constant error variance are

adjusted for by the spatial error model. The inclusion in the OLS model of variables that should not be

there could be due to spatial effects and non-constant error variance. In

particular, percent African American was selected as a primary indicator in the OLS model, with

a negative sign, because OLS did not properly correct for the spatial autocorrelation and

non-constant error variance caused by the highest PCB levels being found in only a few census

tracts with a high Caucasian population right next to the plant. Therefore, it is very

important to adjust for the effects of spatial dependence and heteroskedasticity in the model before

drawing final conclusions.

This study also focused on the investigation of serum PCB levels measured from

individuals residing in the study area, associated with socioeconomic and spatial variables.

After adjusting for age, we found that poverty status and percent of high school education

at the block group level were the two most important socioeconomic indicators associated

with serum PCB levels. The significant associations between these two

variables and serum PCB concentrations strongly support a hypothesis suggested by recent

literature, namely that higher exposures to PCBs are associated with lower

socioeconomic status, as measured by low income and limited education (Borrell et al.,

2004). In addition, one distance factor, distance to the Monsanto plant, was found to be significant and

negative in the OLS model. This implies that living close to the plant may increase

exposure to PCBs among individuals residing in Anniston, Alabama.

Furthermore, the correlation analyses shown in Table 3-4b also reveal significant

relationships between the two selected socioeconomic variables and one distance factor,

distance to the Monsanto plant. Poverty level is negatively correlated with distance to the plant,

indicating that census block groups located closer to the plant have a higher percentage

of residents below the poverty level (Figure 3-3). We note that all of these findings are

consistent with previous literature, confirming that lower socioeconomic status is

significantly associated with higher serum levels of PCBs. For example, poor residents tend to

live in older housing, which may increase opportunities for exposure to PCBs from the plant

and from the nearest drainage ditches. The observed factors associated with PCB exposures

could result from the heterogeneity in exposure levels in our sample. Lastly, the distance

factor dropped out of the regression when spatial autocorrelation was corrected

by a two-stage least squares spatial lag model. Thus, the inclusion of the distance factor in

the OLS model could be due to spatial effects.

Figure 3-3. Distributions of serum PCB levels with poverty level

There may be some bias because we used two different geographic levels

of data for the dependent and independent variables, especially in the models with

socioeconomic variables. Our explanatory variables were aggregated at the level of the census

block or block group, whereas individual soil and serum sample locations were used

for the dependent variables. Thus, exact socioeconomic information

corresponding to each soil and serum sample site was not available, which raises

the issue of ecological fallacy: analyses based on aggregated data can lead

to conclusions different from those based on individual data. Caution is also needed

in interpreting these results because of the potential for sampling bias in the soil data inherent in the

study design. As shown in Figure 3-1, most soil samples were collected in areas

where relatively high percentages of African Americans reside. This nonrandom

sample could distort the regression results. Based on

our findings, it will be of great interest to compare these measures

of soil PCB levels with serum PCB levels in individuals living in the study area. These

regression models can be used to predict soil and serum PCB levels at locations where no

measurements were obtained. In turn, these models will ultimately serve as a tool for

assessing whether estimated soil and serum PCB concentrations at residential locations can

serve as surrogates for actual soil and serum PCB levels in the study area.

3.4 Summary

In summary, the main focus of this study was to identify distribution patterns of PCB

levels in both soil and serum samples taken in Anniston, Alabama, characterized by

socioeconomic and spatial variables. It also analyzed significant relationships found between

these two sets of independent variables to determine spatial tendency on the socioeconomic

factors in association with the distance factors selected in the soil and serum PCB models.

For the soil PCB model, percentage of African American residents, family household size, percentage of

old housing units, and strip mine and gravel pit property were significant

socioeconomic indicators, together with two distance factors, in explaining distribution patterns of

PCB concentrations in soils. In addition, our findings suggest that poverty level and limited

education were significantly associated with higher serum concentrations of PCBs.

Furthermore, we found that both models exhibited a high level of spatial autocorrelation,

non-constant error variance, or both. For this reason, we suspected that these effects could bias the

regression results. Consequently, a spatial error model using maximum

likelihood and a spatial lag model using a two-stage least squares (2SLS) procedure were

applied to correct for spatial effects and non-constant error variance. The values of

$R^2$ increased for both models, and the spatial regressions reduced the absolute magnitude

of many of the coefficients and their statistical significance, leading to the exclusion of

insignificant variables from the models. These geographic analyses of multiple variables allow

researchers to determine more precisely where high soil PCB contamination and higher

exposures to PCBs are occurring, along with the socioeconomic and spatial indicators. This

evidence-based information is necessary in order to gain public attention and isolate areas

with high soil and serum PCB levels for further interventions.

Chapter 4 Geographic variation of soil lead concentrations in Anniston,
Alabama

The US EPA provided a database of soil lead levels with associated geographic

coordinates in August 2008. This is a rich database with over 2,000 individual soil lead

levels in Anniston and Calhoun County, Alabama, which provides a unique opportunity to

assess the spatial distribution of lead in the Anniston community and identify predictors of soil

lead levels. Since lead is widely distributed within the Anniston community of about 23,000

residents, this work is also of considerable public health significance. The purpose of this

chapter was to 1) pursue a cross-disciplinary, innovative approach to identifying significant

physical risk factors associated with soil lead concentrations, 2) provide an improvement

over ordinary linear regression by exploring spatial nonstationarity across a study area, and

3) illustrate the spatial distribution of the sign, magnitude, and significance of each

explanatory variable. This study hypothesized that the use of geographically weighted

regression (GWR) analysis, which associated physical variables with lead concentrations in

soils, would enhance the power to predict lead concentrations and capture significant

predictors by accounting for regional variations (nonstationarity) between dependent and

explanatory variables.

4.1 Materials and methods

4.1.1 Collection of background data

Anniston, Alabama is located approximately 60 miles east of Birmingham and 90

miles west of Atlanta. It is a community of about 23,000 people and is situated in Calhoun

County. The specific area of interest is focused on the proximity to Lee Brass foundry and

22 other former and active foundries located in Anniston and to the major railroads, as

shown in Figure 4-1. Information on physical variables and soil lead levels was collected

from several different data sources. A database that contained lead levels of soil samples in

Anniston was obtained from the U.S. Environmental Protection Agency (EPA) in August

2008. The database included 2,046 soil samples, with multiple measurements taken from the

upper 3 inches of soil in each location measured in ppm (parts per million or mg/kg). Soil

lead levels were measured by US EPA method 3050/6010/6020 (ICP/ICP-MS). Information

on the associated latitude and longitude coordinates is also contained in the database. For

regression analyses and mapping purposes, average lead concentrations in the soil samples

were used when samples were taken from the same location, resulting in average soil lead

levels at 595 sites. Three data sources, Digital Elevation Model (DEM) data, Census data

and EPA data on 23 former and active foundries in Anniston were used to extract the spatial

and physical variables as listed in Table 4-1. Our initial choice of explanatory variables

included those related to: proximity to 23 active or former foundries including Lee Brass

foundry located within the study area, proximity to local railroads and major roads,

proximity to hydrological features, elevation and aspect. All of these variables were

measured in a Geographic Information Systems (GIS) application. The final merged dataset

that was prepared for further regression analyses contained values of these spatial and

physical variables for soil sample locations.
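
The averaging of repeated measurements at the same location, described above, can be sketched as follows. This is an illustrative example with made-up records and a hypothetical function name (`average_by_location`); it assumes that samples from the same site carry identical coordinate pairs, which may not hold for real data without rounding or snapping.

```python
from collections import defaultdict

def average_by_location(samples):
    """Average repeated measurements that share the same (lat, lon) pair."""
    grouped = defaultdict(list)
    for lat, lon, lead_ppm in samples:
        grouped[(lat, lon)].append(lead_ppm)
    return {site: sum(values) / len(values) for site, values in grouped.items()}

# Hypothetical records: (latitude, longitude, lead in ppm); not EPA data.
samples = [
    (33.66, -85.83, 120.0),
    (33.66, -85.83, 80.0),
    (33.65, -85.82, 45.0),
]
site_means = average_by_location(samples)
```

Each key of `site_means` is one sampling site, so the number of keys corresponds to the number of sites carried into the regression.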

Figure 4-1. Map of study area. Lead sampling sites and their concentrations, overlaid with Lee
Brass foundry and major railroads

Table 4-1. Predictor variables and data sources used for the analysis

Predictor variables                          Data source               Description
Distance to 23 former and active foundries   EPA                       Unit: meters
Distance to the nearest railroads            Census TIGER              Unit: meters
Distance to major roads                      Census TIGER              Unit: meters
Distance to hydrology                        Scanning + Census TIGER   Unit: meters
Elevation                                    DEM                       Unit: meters
Aspect                                       DEM                       Unit: degrees, from 0 (due north) to 360 (again due north)

4.1.2 Data transformation

We first used conventional global regression to model lead concentration as a function of physical

variables. A stepwise ordinary least squares (OLS) procedure was run to

identify significant explanatory variables, with a 0.01 significance level for entry and a 0.05

level for removal. The distribution of lead concentrations is heavily skewed, with many

small values and a small number of large values, and this may lead to biased conclusions in

statistical analyses. We therefore applied a logarithmic transformation to the

concentrations and used the result as the dependent variable in all further regression analyses. Hence our

initial equation was

$\ln(\text{lead}) = b_0 + b_1 x_1 + \dots$

where lead is the observed concentration in parts per million, and the $x$ terms represent the

explanatory variables. As an additional step, after the log-transformed data set was used for

the GWR analyses, the results were back-transformed by reversing the log

transformation to produce the final map of the spatial distribution of soil lead concentrations.
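
The transformation and its reversal amount to taking natural logarithms before modeling and exponentiating afterwards. The short sketch below uses made-up concentrations and hypothetical function names; it assumes all concentrations are strictly positive, since the logarithm of zero is undefined.

```python
import math

def log_transform(ppm_values):
    """Natural log of each concentration; inputs must be greater than zero."""
    return [math.log(value) for value in ppm_values]

def back_transform(log_values):
    """Reverse the log transformation to return to the ppm scale."""
    return [math.exp(value) for value in log_values]

# Made-up, right-skewed concentrations in ppm.
lead_ppm = [12.0, 35.0, 48.0, 90.0, 2400.0]
log_lead = log_transform(lead_ppm)
recovered = back_transform(log_lead)
```

On the log scale the single large value no longer dominates the spread of the data, which is the motivation given in the text for transforming before regression.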

4.1.3 Geographically Weighted Regression

To test regional variation (nonstationarity) in the relationships between explanatory

and dependent variables, GWR (suggested by Fotheringham et al. (1998)) was used. An

attractive feature of this approach is that it accounts for local views of the regression, as observed from each

data location. Note that the conventional regression equation can be

written as:

$y_i = b_0 + \sum_k b_k x_{ik} + \varepsilon_i$

where $y_i$ is the value of the dependent variable for observation $i$, $b_0$ is the

intercept, $b_k$ denotes the parameter estimate for variable $k$, $x_{ik}$ denotes the observation on

variable $k$ at location $i$, and $\varepsilon_i$ is the error term. In contrast, instead of generating a single

regression equation, GWR calibrates a separate regression equation for each observation.

For each particular location, one can generate a regression equation using weights that are

attached to observations surrounding the location. Each GWR equation is defined as:

$y_i = b_0(u_i, v_i) + \sum_k b_k(u_i, v_i) x_{ik} + \varepsilon_i$

where $(u_i, v_i)$ represents the coordinates of location $i$. Note that in the calibration of the GWR

model, it is assumed that observations near location $i$ have more influence on the

estimation of the regression parameters than observations located farther away

(Fotheringham et al., 2000). The weight given to each observation is determined by a distance decay

function.
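
As a hedged illustration of how a single local GWR equation is calibrated, the sketch below fits a weighted least squares line at a chosen location. Several things are assumptions made for the example rather than details taken from this study: a Gaussian distance decay kernel, a single predictor, a fixed bandwidth, and synthetic data whose slope changes sign from west to east.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Distance decay: weight 1 at the target, falling smoothly with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_fit(coords, x, y, target, bandwidth):
    """Weighted least squares for y = b0 + b1*x calibrated at `target`."""
    w = [gaussian_weight(math.hypot(u - target[0], v - target[1]), bandwidth)
         for u, v in coords]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx
    b1 = (sw * sxy - sx * sy) / det
    b0 = (sy - b1 * sx) / sw
    return b0, b1

# Synthetic sites along a west-east line; the slope is -1 in the west, +1 in the east.
coords = [(float(i), 0.0) for i in range(10)]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [-xi if u < 5 else xi for (u, _), xi in zip(coords, x)]

b0_west, b1_west = local_fit(coords, x, y, target=(0.0, 0.0), bandwidth=1.5)
b0_east, b1_east = local_fit(coords, x, y, target=(9.0, 0.0), bandwidth=1.5)
```

A single global regression on these data would average the two regimes together, whereas the local fits recover a negative slope at the western end and a positive slope at the eastern end. This is the kind of nonstationarity GWR is intended to reveal.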

There are two criteria for selecting the bandwidth that determines the distance decay

function: cross-validation (CV) and the Akaike Information Criterion (AIC). These methods

allow us to determine automatically the bandwidth that gives the best predictions.

Specifically, CV seeks the bandwidth that minimizes a cross-validation score, expressed as:

$CV = \sum_{i=1}^{n} (y_i - \hat{y}_{\neq i})^2$

where $n$ represents the number of observations and $\hat{y}_{\neq i}$ is the fitted value for observation $i$ obtained with that observation excluded from the

calibration. Excluding observation $i$ ensures that, in regions of sparse observations, the model is not calibrated on $i$ alone

(Mennis, 2006; Rogerson, 2006). Alternatively, the AIC method finds the bandwidth that

minimizes the AIC score, expressed as:

$AIC = 2n \log(\hat{\sigma}) + n \log(2\pi) + n \left\{ \frac{n + \mathrm{tr}(S)}{n - 2 - \mathrm{tr}(S)} \right\}$

where $\hat{\sigma}$ is the estimated standard deviation of the error term and $\mathrm{tr}(S)$ represents the trace of the hat matrix. The hat matrix describes the relationship

between the fitted values and the observed values. Its diagonal elements,

called leverages, specify the influence of each observed value on the fitted value for

the same observation (Hoaglin and Welsch, 1978). The AIC method has an advantage over the

CV method because it takes into account the degrees of freedom, which may vary between models

centered on different observations. Additionally, the user may select a fixed bandwidth,

which is applied to all observations, or a variable bandwidth that extends farther in regions

of sparse observations and contracts in regions of dense observations (Charlton et al.,

no date; Mennis, 2006). In this paper, the AIC optimization method was used.
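
The two bandwidth criteria can be compared on a toy problem. The sketch below is only an illustration under stated assumptions: one predictor, a fixed Gaussian kernel, synthetic data, and $\hat{\sigma}$ taken as the square root of the residual sum of squares divided by $n$. The function names are hypothetical, and the search is a simple comparison over a few candidate bandwidths rather than the optimizer used by GWR software.

```python
import math

def local_sums(coords, x, y, i, bandwidth, skip_self):
    """Kernel-weighted sums for the local regression centred on observation i."""
    ui, vi = coords[i]
    sw = sx = sy = sxx = sxy = 0.0
    for j, (u, v) in enumerate(coords):
        if skip_self and j == i:
            continue  # leave observation i out (used by the CV score)
        w = math.exp(-0.5 * (math.hypot(u - ui, v - vi) / bandwidth) ** 2)
        sw += w
        sx += w * x[j]
        sy += w * y[j]
        sxx += w * x[j] ** 2
        sxy += w * x[j] * y[j]
    return sw, sx, sy, sxx, sxy

def predict(sums, xi):
    """Fitted value at xi from the weighted regression y = b0 + b1*x."""
    sw, sx, sy, sxx, sxy = sums
    det = sw * sxx - sx * sx
    b1 = (sw * sxy - sx * sy) / det
    b0 = (sy - b1 * sx) / sw
    return b0 + b1 * xi

def cv_score(coords, x, y, bandwidth):
    """Sum of squared leave-one-out prediction errors."""
    return sum(
        (y[i] - predict(local_sums(coords, x, y, i, bandwidth, True), x[i])) ** 2
        for i in range(len(y))
    )

def aic_score(coords, x, y, bandwidth):
    """AIC as written in the text, with tr(S) summed from local leverages."""
    n = len(y)
    rss = trace = 0.0
    for i in range(n):
        sums = local_sums(coords, x, y, i, bandwidth, False)
        sw, sx, _, sxx, _ = sums
        det = sw * sxx - sx * sx
        # Leverage of observation i; its own kernel weight is 1 at distance 0.
        trace += (sxx - 2.0 * sx * x[i] + sw * x[i] ** 2) / det
        rss += (y[i] - predict(sums, x[i])) ** 2
    if n - 2 - trace <= 0:
        return math.inf  # too few effective degrees of freedom for this bandwidth
    sigma = math.sqrt(rss / n)  # assumed estimator of the error standard deviation
    return 2 * n * math.log(sigma) + n * math.log(2 * math.pi) \
        + n * (n + trace) / (n - 2 - trace)

# Synthetic data: the slope drifts from west to east, plus a small alternating error.
coords = [(float(i), 0.0) for i in range(12)]
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 2.0, 5.0, 3.0, 6.0, 1.0, 4.0]
y = [(1.0 + 0.2 * i) * x[i] + (0.3 if i % 2 else -0.3) for i in range(12)]

candidates = [2.0, 4.0, 8.0]
best_cv = min(candidates, key=lambda b: cv_score(coords, x, y, b))
best_aic = min(candidates, key=lambda b: aic_score(coords, x, y, b))
```

The two criteria need not select the same bandwidth. The AIC score penalizes small bandwidths through $\mathrm{tr}(S)$, which grows as the local fits become more flexible, whereas the CV score relies only on out-of-sample prediction error.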

For each observation, one can calibrate an independent regression equation, so that

we have a separate parameter estimate, goodness-of-fit, and significance assessment for each

observation. Therefore, we can visually explore and interpret regional variation (spatial

nonstationarity) and significance of the associations between dependent and explanatory

variables by mapping these values. For further details, the reader is referred to Fotheringham

et al. (2002).

4.2 Results

4.2.1 Case study: GWR of soil lead (Pb) concentrations in Anniston, AL

The case study concerns soil lead concentrations (ppm) in Anniston, Alabama, using

two significant predictor variables, proximity to Lee Brass foundry and proximity to local

railroads (meters). A map of sampling sites and associated lead levels overlaid with

foundries and railroads is presented in Figure 4-1. The predictor variables and the global

regression parameters of soil lead concentrations estimated by OLS are reported in Table 4-2.

The model indicates that both predictor variables are negatively related to soil lead

concentrations: as distances from Lee Brass foundry and major railroads increase, levels of

lead in soils decrease. Note, however, that while the OLS model had an adjusted R-squared

value of 0.265, accounting for only about 26.5% of the variance in lead concentrations, the

GWR model improved the fit, raising the adjusted R-squared value

to 0.387. In addition, in the GWR model, the parameters of each predictor variable vary

across the study area in terms of magnitude, sign, and significance. The diagnostics section

of Table 4-2 indicates a substantial decrease in the residual sum of squares, standard error of

the estimate, and the AIC statistic. The GWR model used a variable

bandwidth that minimizes the AIC value; a variable bandwidth was chosen to

account for regional variation in the density of soil

samples. A reduction of 3 or more in the AIC value suggests a significant improvement in

model fit (Smith et al., 2006).
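
The comparison of the two models can be checked with simple arithmetic on the diagnostics reported in Table 4-2. The snippet below is a small worked example; the dictionary layout and variable names are illustrative, while the numbers are those given in the table.

```python
# Diagnostics as reported in Table 4-2.
ols = {"rss": 370.807, "std_error": 0.791, "adj_r2": 0.265, "aic": 1413.174}
gwr = {"rss": 292.989, "std_error": 0.701, "adj_r2": 0.387, "aic": 1322.726}

# Drop in AIC when moving from the global model to GWR.
delta_aic = ols["aic"] - gwr["aic"]

# Share of the global residual sum of squares removed by GWR.
rss_reduction = (ols["rss"] - gwr["rss"]) / ols["rss"]

# Rule of thumb cited in the text: a reduction of 3 or more is meaningful.
meaningful_improvement = delta_aic >= 3
```

The AIC falls by roughly 90 units, far beyond the threshold of 3, and the residual sum of squares falls by about one fifth.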

Table 4-2. Global and GWR regression estimates and diagnostics

Predictor variables                      Global parameter estimates (a)   GWR parameter estimates (b)
Distance to Lee Brass foundry (β1)       -0.259                           -3.735 to 1.771
Distance to the nearest railroad (β2)    -1.513                           -4.217 to 3.333
Intercept (β0)                           6.079                            2.214 to 8.505

Diagnostics
Residual SS                              370.807                          292.989
Std. error of the estimate               0.791                            0.701
Adjusted R-squared                       0.265                            0.387
AIC                                      1413.174                         1322.726

(a) Global regression: ordinary least squares. (b) GWR: geographically weighted regression.

Figures 4-2 and 4-3 show maps of parameter estimates of proximity to Lee Brass

foundry and local railroads, respectively. A standard deviation classification method was

used instead of an equal classification approach since data are not uniformly distributed, but

are instead normally distributed. Manual adjustments are applied to distinguish negative

from positive estimates to aid in the direct comparison of estimates. Figure 4-2 clearly

illustrates that a negative lead concentration-distance to Lee Brass foundry relationship is

evident in the remainder of the city with the exception of the eastern areas of the city, within

which stronger negative associations occur. In contrast, areas of positive relationship

between distance to Lee Brass foundry and soil lead concentrations are largely limited to the

southwest regions of central Anniston. Figure 4-3 presents a different coefficient surface

than Figure 4-2. This figure indicates that a negative relationship occurs adjacent to, or

outside of the triangular area enclosed by railroads on three sides, whereas most positive

parameter estimates on distance to the railroads are distributed inside of the area. In

particular, the western areas of the Louisville and Nashville Railroad and the southeast area

of the Southern Railway show larger negative parameter estimates. Figures 4-2 and 4-3 both reveal spatial variation in the sign of the influence of each significant explanatory variable on the soil lead concentrations and add valuable information on the distribution patterns of soil lead concentrations presented in Figure 4-1. That is, the relationships are mostly strongly negative near Lee Brass foundry and the local railroads and become less negative with increasing distance, indicating nonlinearity in the relationship between concentration and distance.

Figure 4-2. The coefficient surfaces generated using the GWR; parameter estimates of
proximity to Lee Brass foundry

Figure 4-3. The coefficient surfaces generated using GWR; parameter estimates of proximity
to major railroads

These two figures, however, do not account for threshold values that differentiate

significant parameter estimates from non-significant estimates. That is, no information on

the distribution of t-values is presented in the figures, meaning that the significance of

associations between independent and dependent variables cannot be identified. Figures 4-4

and 4-5 offer further information about significance and magnitude of the negative and

positive associations revealed in Figures 4-2 and 4-3. Figure 4-4 shows the following

locations of samples of significantly related estimates of proximity to Lee Brass foundry: (a)

locations with t-values that are negatively significant at the 99% confidence level are

presented in dark blue (corresponding to samples in eastern areas of the city), (b) locations

with t-values at the 95% negative significance level are shown in blue, and (c) locations with
t-values that are negatively significant at the 90% confidence level are highlighted in light

blue. Specifically, 91 and 133 sample locations have t-values at the 99% and 90% negative

significance levels, respectively. Figure 4-4 also indicates that the 161 sample locations that are not significant at the 90% level, shown as white circles, are mostly distributed outside of the

triangular area. The following sample locations demonstrate a significant positive

relationship between explanatory and dependent variables: (d) areas with t-values that are

positively significant at the 99% confidence level are shown in dark red, (e) areas with t-

values that are significant at a 95% level are highlighted in red, and (f) areas with t-values

that are positively significant at the 90% confidence level are presented in light red. The

locations with the highest soil lead levels were located in the southwest part of central

Anniston, where 109 sample locations exceeded the 99% confidence level. Figure 4-5 shows

a significance map illustrating parameter estimates of proximity to major railroads. Sample

locations with a 99% negative significance level are found adjacent to, or outside the

triangular area. In contrast, most soil sample locations that were significant at the 99% level and showed a positive relationship were within the triangle formed by the railroads.
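The six map classes described above can be sketched as a simple classification of local t-values. The thresholds below are two-sided standard normal critical values (1.645, 1.960, 2.576), assumed here as a large-sample approximation; the exact critical values used for Figures 4-4 and 4-5 are not stated in the text, and the function name `classify_t` is hypothetical.

```python
def classify_t(t_value):
    """Return the significance class for a local t-value.

    Critical values are two-sided standard normal quantiles, assumed as a
    large-sample approximation; they are not taken from the study.
    """
    levels = [(2.576, "99%"), (1.960, "95%"), (1.645, "90%")]
    for critical, label in levels:
        if abs(t_value) >= critical:
            sign = "negative" if t_value < 0 else "positive"
            return f"{sign}, significant at {label}"
    return "not significant at 90%"

# Hypothetical t-values, one per map class plus a non-significant case.
for t in (-3.1, -2.0, -1.7, 0.4, 1.8, 2.0, 2.9):
    print(t, "->", classify_t(t))
```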

Figure 4-4. Significance map for parameter estimates (proximity to Lee Brass foundry).

Figure 4-5. Significance map for parameter estimates (proximity to major railroads).

A map of soil lead levels modelled based on spatially varying regression coefficients

generated using the GWR was constructed. Figure 4-6 clearly identifies three clusters of

high soil lead concentrations after taking into account regional variations of estimates of the

explanatory variables. The first cluster is located at the middle of the Louisville and Nashville Railroad; the second and third clusters are located near Lee Brass casting foundry and along the Southern Railway. This spatial pattern of soil lead levels

created by GWR is consistent with soil lead concentrations presented in Figure 4-1.

Figure 4-6. Soil lead levels modeled based on spatially varying regression coefficients
generated using GWR

The results indicate that the misspecification of the OLS model is addressed by accounting for the spatial nonstationarity explored by GWR. A global regression estimated by OLS cannot correctly specify relationships between explanatory and dependent variables when the relationships are strongly positive in some regions but negative or insignificant in others.

Based on this map, it can be reasoned that two explanatory variables, distance to Lee Brass

foundry and distance to major railroads, were identified as the most significant factors

affecting soil lead levels. These two distance variables are considered as a secondary source

of soil contamination since foundries are the primary point sources of lead discharge.

4.3 Discussion and interpretation

The coefficient surfaces for distance to Lee Brass parameter estimates generated

using the GWR are highly beneficial for characterizing spatial patterns in the study area,

where high lead concentrations are more likely to occur in soils close to Lee Brass foundry,

which extends in a N-S direction. This is indeed the case in the eastern areas of the railroad

boundary. In contrast, the positive relationship between soil lead concentrations and distance

to Lee Brass foundry was found in the southwestern regions elongated in a NW-SE direction,

farther from the foundry. This pattern probably arises because soil in the vicinity of Lee Brass foundry is more directly affected by airborne emissions of heavy metals from the foundry, whereas locations farther from the foundry are less directly affected. In contrast, the other spatial and physical variables were not captured as significant predictor variables in the equation, as shown in Table 4-3. This result is plausible because Lee Brass casting foundry was recorded as the largest lead-emitting facility in Anniston, Alabama, based on 2007 EPA data, whereas the other foundries released much smaller amounts of lead, or none, into the environment, as listed in Table 4-4.

Table 4-3. List of the coefficients for the excluded variables

Predictor variables                        Coefficients   t-value   Significance
Distance to M&H Valve                      0.219          1.659     0.098
Distance to Standard foundry               0.218          1.215     0.225
Distance to Alabama pipe & foundry         0.186          0.543     0.587
Distance to Talladega Casting              0.314          0.818     0.414
Distance to Anchor Metals                  0.064          0.871     0.384
Distance to Donoho foundry                 0.045          0.943     0.346
Distance to Central foundry                0.017          0.418     0.676
Distance to Solutia Inc. Anniston plant    0.003          0.078     0.938
Distance to Huron Valley Steel Inc.        -0.004         -0.098    0.922
Distance to FMC foundry                    -0.004         -0.104    0.917
Distance to Bae Systems Forge Complex      -0.003         -0.096    0.924
Distance to Southeastern Specialty & SA    0.002          0.051     0.959
Distance to Woodstock Iron & Steel         0.003          0.067     0.947
Distance to Union foundry                  0.030          0.596     0.552
Distance to Interstate Roofing foundry     -0.004         -0.117    0.907
Distance to Emory foundry                  -0.019         -0.502    0.616
Distance to Ornamental foundry             -0.035         -0.842    0.400
Distance to Star foundry                   -0.038         -0.884    0.377
Distance to Rudisill foundry               -0.050         -1.128    0.260
Distance to U.S Castings                   -0.080         -1.653    0.099
Distance to Tyico Fire Protection          -0.055         -1.393    0.164
Distance to Multimetco Inc.                -0.057         -1.438    0.151
Distance to major roads                    -0.026         -0.682    0.496
Distance to Hydrology                      -0.007         -0.159    0.873
Elevation                                  -0.093         -2.405    0.016
Aspect                                     0.002          0.064     0.949

Table 4-4. List of former and active foundries on lead emission based on 2007 EPA data

Foundry names                  Total on-site       Total off-site      Total on- and off-site
                               disposal of lead    disposal of lead    disposal of lead
                               (in pounds)         (in pounds)         (in pounds)
Lee Brass foundry              738                 13,709              14,447
M&H Valve                      58                  1,505               1,563
Standard foundry               0                   0                   0
Alabama pipe & foundry         0                   0                   0
Talladega Casting              0                   0                   0
Anchor Metals                  0                   0                   0
Donoho foundry                 0                   0                   0
Central foundry                0                   0                   0
Solutia Inc. Anniston plant    0                   7                   8
Huron Valley Steel Inc.        0                   0                   0
FMC foundry                    0                   0                   0
Bae Systems Forge Complex      0                   0                   0
Southeastern Specialty & SA    0                   0                   0
Woodstock Iron & Steel         0                   0                   0
Union foundry                  13                  1,048               1,061
Interstate Roofing foundry     0                   0                   0
Emory foundry                  0                   0                   0
Ornamental foundry             0                   0                   0
Star foundry                   0                   0                   0
Rudisill foundry               0                   0                   0
U.S Castings                   0                   0                   0
Tyico Fire Protection          0                   0                   0
Multimetco Inc.                0                   0                   0

These results require cautious interpretation. Besides the two explained surfaces of

negative and positive parameter estimates, two unexplained coefficient surfaces were also found in the study area: (a) a negative area on the west side of the Louisville and Nashville Railroad, and (b) a positive area at the center of Lee

Brass foundry. However, it is important to note that the coefficient surfaces for distance to

Lee Brass foundry presented in Figure 4-2 do not take the significance threshold into

account to distinguish significant parameter estimates from non-significant estimates. Figure

4-4 clearly indicates that in these unexplained places the relationship between the two

variables is not significant at the 90% confidence level, which assists in the interpretation of

results.

This study also pinpoints proximity to local railroads in Anniston as a significant

contributor to increased soil lead concentrations in the study area. Two major railroads pass through central Anniston: the Louisville and Nashville Railroad and the

Southern Railway. The Louisville and Nashville Railroad was classified as a Class I railroad

that operated the largest freight and passenger services in the southeastern United States

since 1850 (Klein, 2002). The Southern Railway has also carried passengers, U.S. troops and

freight on scheduled trains in southern states since the 1830s. It was known as the longest

continuous line of railway in the world. These two railroads were the primary sources of

lead contamination in the study area through coal burning until they totally converted to

diesel-powered locomotives in the 1950s. During this time period, coal burning may have

emitted as much as 4000 kg of lead per year to the atmosphere, contaminating urban and

rural regions (Southern Railway Historical Association; Abernethy and Gibson 1963).

Two strongly negative coefficient surfaces in areas along major railroads were found

and are presented in Figure 4-3. One is in the middle, toward the outer side of the Louisville

and Nashville Railroad extending in a NW-SE direction, and the other is in the southeast corner along the Southern Railway, elongated in a NW-SE direction. In contrast, positive coefficient surfaces for proximity to the railroads are clustered in soils inside the triangle formed by the railroads, suggesting that the railroads contribute less to soil lead concentrations as distance from them increases. This pattern may be indicative of the range

of air pollution from emissions of lead to the air through the operation of the two major

railroads. Figure 4-5 offers strong evidence that areas along or near the railroads are negatively significant at the 90% confidence level or above, whereas areas further

away from the railroads show positive and significant relationships at the same confidence

level.

In Figure 4-6, in addition to high concentrations of lead in soils located along the

Louisville and Nashville Railroad, which extends in the NE-SW direction, soil lead

concentrations are fairly high as well along the Southern Railway in a NW-SE direction. In

particular, the areas in the center of the Louisville and Nashville Railroad area and near

Lee Brass foundry have the highest soil lead concentrations, in the range of 400 ppm or

higher. Therefore, it is reasonable to suspect that areas in close proximity to these sites are

exposed to high levels of particulate emissions that contribute to increased lead

contamination in soils. However, these results require cautious interpretation. As shown in the figure, only a few soil samples were collected in the center of the Louisville and Nashville Railroad area. Thus, for validation, GWR analyses with more soil lead samples from around the Louisville and Nashville Railroad area, if obtainable, should be carried out in future work. This study has significant implications for further studies on lead

exposure and health impacts for residents living near foundries and railroads.

4.4 Summary

Many previous studies have explored the relationship between proximity to

highways and levels of heavy metals in soils. Fewer studies have focused on heavy metal

contamination in soils near foundries and along railroads. There are 3 major findings from

this study. First, two spatial variables, proximity to Lee Brass foundry and local railroads,

are identified as significant predictors that express spatial variation in soil lead

concentrations in Anniston, Alabama. These associations were observed after taking other

spatial and physical variables into account. These findings support the contention that lead

contamination in soils is highly susceptible to the influence of spatial factors, notably

proximity to point sources for emission. Second, GWR, a local spatial statistical method for

exploring spatial nonstationarity, allows for better identification of significant risk factors.

Unlike ordinary regression, GWR provides a different measurement of associations between

dependent and explanatory variables from location to location. Maps generated using GWR

assist interpretation and exploration of spatial nonstationarity apparent in the study area.

Lastly, geographic areas with higher lead levels in soils need additional investigation,

including the potential for excess human exposure and resulting health effects. In addition to

providing valuable data on the spatial distribution of lead in Anniston, this study may be of

use in other communities with heavy industry where environmental lead exposure may

represent a public health concern.

Chapter 5 Analysis of Heavy Metal Sources in Soils using Multivariate
Statistics and GIS

Some heavy metals released by local foundries during production of metal castings

have contaminated the Anniston community. In the 1920s, Anniston was the nation's largest producer of cast-iron soil pipe, with an annual production of about 140,000 tons. During this period, Anniston was known as the "Soil Pipe Capital of the World," and heavy metals became hazardous contaminants in the waste soil because they are components of many alloys widely used for casting. As a result, the US EPA is still in the process of investigating

commercial and residential areas in Anniston and remediating soil with high heavy metal

concentrations. The objectives of this chapter were: 1) to establish background values on

average regional concentrations of 11 heavy metals (lead (Pb), arsenic (As), cadmium (Cd),

chromium (Cr), cobalt (Co), copper (Cu), manganese (Mn), mercury (Hg), nickel (Ni),

vanadium (V), zinc (Zn)) in 4 different sampling zones; 2) to describe spatial variability of

these metal concentrations at a regional scale; 3) to identify natural or anthropogenic origins;

and 4) to interpolate non-point sources of pollution.

5.1 Materials and methods

5.1.1 Collection of background data

Soil samples were collected in four zones designated by the EPA, as shown in Figure 5-1. Zone A (industrial zone) is the area within 500 meters of each former and current Anniston foundry operation; only residential properties within this area are included. Zone B (presumably a less polluted zone) is the area within the solid yellow line; only residential properties within the yellow line are included. Zone C (vicinity of the industrial zone) is the area within the red line; only residential properties within the red line are included. Lastly, Zone D (Monsanto plant) is the area within the dotted blue line; residential properties within this area are included (EPA, 2005).

Figure 5-1. Sampling zones in the area of study (Source: EPA).

A database containing heavy metal levels of soil samples in Anniston was obtained from the U.S. Environmental Protection Agency (EPA). The database included 2,046 soil samples, with multiple measurements taken from the upper 3 inches of soil at each location, reported in ppm (parts per million, or mg/kg). The latitude and longitude coordinates of each sample are also contained in the database. For more reliable analyses of metal concentrations at each property, the metal concentrations of soil samples taken from the same location were averaged, which resulted in a total of 595 soil sample averages. The sampling points were distributed as follows: 209 samples in Zone A, 66 samples in Zone B, 270 samples in Zone C, and 50 samples in Zone D, as presented in Figure 5-2. Finally, the Soil Survey Geographic (SSURGO) database was also used to identify soil textures for the soil samples collected in the study region.
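The averaging of repeated measurements by location can be sketched as follows. This is an illustration under assumptions, not the procedure used to prepare the EPA database: the record layout, the function name `average_by_location`, and the sample values are all hypothetical.

```python
from collections import defaultdict

def average_by_location(records):
    """Average metal concentrations over samples that share a location.

    records: iterable of (latitude, longitude, {metal: ppm}) tuples.
    Returns {(latitude, longitude): {metal: mean ppm}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for lat, lon, metals in records:
        for metal, ppm in metals.items():
            grouped[(lat, lon)][metal].append(ppm)
    return {
        loc: {metal: sum(v) / len(v) for metal, v in by_metal.items()}
        for loc, by_metal in grouped.items()
    }

# Hypothetical records (not taken from the EPA database).
records = [
    (33.65, -85.83, {"Pb": 120.0, "Zn": 300.0}),
    (33.65, -85.83, {"Pb": 180.0, "Zn": 260.0}),
    (33.66, -85.82, {"Pb": 40.0, "Zn": 90.0}),
]
averages = average_by_location(records)
print(averages)
```

Applied to the full database, a grouping of this kind would reduce the 2,046 samples to one averaged record per location.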

Figure 5-2. Soil sampling points collected based on 4 different zones, overlaid with
foundries, main hydrology and major railroads.
5.1.2 Data analysis

Because the heavy metal concentrations were not normally distributed, the nonparametric Kruskal-Wallis and Mann-Whitney tests were used to compare average regional concentrations of the 11 heavy metals between the zones of study. A probability of 0.05 or lower was regarded as significant.
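As an illustration of the rank-based comparison between zones, the sketch below computes the Kruskal-Wallis H statistic from first principles. It is not the SPSS procedure used in this study: it omits the tie correction, the concentrations are hypothetical, and the function name `kruskal_wallis_h` is an assumption. With four zones, H is compared against the chi-square critical value of 7.815 (3 degrees of freedom, probability 0.05).

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of samples (no tie correction)."""
    pooled = sorted((value, g) for g, sample in enumerate(groups) for value in sample)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n_total:
        # Tied values share the average of the ranks they occupy.
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        mean_rank = (i + 1 + j) / 2.0
        for _, g in pooled[i:j]:
            rank_sums[g] += mean_rank
        i = j
    return 12.0 / (n_total * (n_total + 1)) * sum(
        r * r / len(sample) for r, sample in zip(rank_sums, groups)
    ) - 3.0 * (n_total + 1)

# Hypothetical Pb concentrations (mg/kg) for Zones A, B, C and D.
zones = [
    [310.0, 280.0, 450.0, 390.0],
    [40.0, 55.0, 35.0, 60.0],
    [150.0, 120.0, 200.0, 170.0],
    [90.0, 75.0, 110.0, 100.0],
]
h = kruskal_wallis_h(zones)
print(round(h, 3), "significant at 0.05:", h > 7.815)
```

A significant H indicates only that at least one zone differs; pairwise Mann-Whitney tests are then needed to identify which zones differ.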

Subsequently, a principal components analysis (PCA) was performed to produce

fingerprints for identifying the origin of the contamination and also to aid the interpretation

of geochemical data (Burns et al., 1997). The objective of PCA is to reduce a dataset containing a large number of variables to a smaller one by finding a new set of variables that retains most of the information in the samples. When the original variables are closely correlated, it is possible to derive a few underlying components as linear combinations of the original variables. That is, PCA allows for compression and classification of data. Each sample was assigned a score on each principal component, which allows the reduced data to be plotted and analyzed further (Schuhmacher et al., 2004). All statistical

analyses were performed with the statistical package SPSS 11.0.

Figure 5-3. Geometric picture of principal components (PCs)

In Figure 5-3, the 1st PC z1 captures as much of the variability in the dataset as

possible in X space, while the 2nd PC z2 is the second longest axis in X space, which fits to

a line in the plane perpendicular to the 1st PC (Rogerson, 2006).
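The two-dimensional geometry can be made concrete with a small sketch. For two standardized variables with correlation r, the correlation matrix has eigenvalues 1 + r and 1 - r, so the components can be written in closed form. The code below is an illustration under that two-variable assumption, with hypothetical data and a hypothetical function name `pca_two_variables`; the study itself applied PCA to 11 metals in SPSS.

```python
import math

def pca_two_variables(x1, x2):
    """PCA of two standardized variables via the closed-form 2x2 solution.

    The correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + r and 1 - r
    with eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
    Returns (explained variance shares, scores on z1, scores on z2),
    ordered so that z1 carries the larger share.
    """
    n = len(x1)

    def standardize(v):
        mean = sum(v) / n
        sd = math.sqrt(sum((a - mean) ** 2 for a in v) / (n - 1))
        return [(a - mean) / sd for a in v]

    s1, s2 = standardize(x1), standardize(x2)
    r = sum(a * b for a, b in zip(s1, s2)) / (n - 1)
    plus = [(a + b) / math.sqrt(2) for a, b in zip(s1, s2)]    # eigenvalue 1 + r
    minus = [(a - b) / math.sqrt(2) for a, b in zip(s1, s2)]   # eigenvalue 1 - r
    if r >= 0:
        return ((1 + r) / 2, (1 - r) / 2), plus, minus
    return ((1 - r) / 2, (1 + r) / 2), minus, plus

# Hypothetical log-transformed concentrations of two correlated metals.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.2, 1.9, 3.2, 3.8, 5.1]
shares, z1, z2 = pca_two_variables(x1, x2)
print([round(s, 3) for s in shares])
```

Because the two hypothetical variables are strongly correlated, nearly all of the variance falls on z1, and the scores on z1 and z2 are uncorrelated.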

5.1.3 Self-Organizing Maps (SOM) analysis

A crucial task of traditional data reduction methods such as principal component

analysis (PCA) and factor analysis (FA) is to reduce both the number of variables (Data

Projection) and the number of observations (Data Quantization) without losing too much

useful information (Yan and Thill, 2005; Rogerson, 2006). However, these traditional

methods have some limitations. For example, assumptions of stationarity and linearity

between variables are often required. Also, the methods generally look for global

relationships rather than local structure within data (Yan and Thill, 2005).

In contrast, the self-organizing map (SOM) is a method that combines data projection and data quantization. It is a type of artificial neural network. The principle of the SOM method can be described as follows: neurons in the output layer compete with each other, and the winner gets priority to represent the input data vector on the basis of dissimilarity in the input attribute space (Yan and Thill, 2005; Kohonen, 2001). Not only the winning node but also its adjacent nodes are allowed to learn from the new input vector, so that each node is eventually trained to represent similar inputs. The SOM uses a neighborhood function to preserve the natural order (topology) in the input attribute space

of the data. Consequently, similar input vectors are mapped close to each other, while

dissimilar vectors are mapped further apart on the map grid (Yan and Thill, 2005). The level

of similarity between input vectors can be visualized by a U-matrix (Euclidean distance

between weight vectors of neighboring nodes) of the SOM as shown in Figure 5-4 (Kohonen,

2001; Yan and Thill, 2005).

Figure 5-4. U-matrix of the SOM

Specifically, the SOM algorithm is carried out in the following five steps: 1) randomize the weight vectors of the map's nodes; 2) select an input vector; 3) traverse each node in the map, using the Euclidean distance formula to measure the similarity between the input vector and the node's weight vector, and track the node that gives the smallest distance (this node is called the best matching unit, BMU); 4) update the nodes in the neighborhood of the BMU by pulling them closer to the input vector,

$$W_v(t+1) = W_v(t) + \Theta(v,t)\,\alpha(t)\,\bigl(D(t) - W_v(t)\bigr)$$

where $\alpha(t)$ is a monotonically decreasing learning coefficient, $D(t)$ is the input vector, and the neighborhood function $\Theta(v,t)$ depends on the lattice distance between the BMU and neuron $v$; and 5) increment $t$ and repeat from step 2 while $t < \lambda$, where $\lambda$ is the iteration limit.
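The five steps can be sketched in a few lines of Python. This is a minimal illustration, not the SOM implementation used in this study: it assumes a one-dimensional lattice, a Gaussian neighborhood function, and linearly decaying learning rate and neighborhood radius, and the function name `train_som` and all parameter values are hypothetical.

```python
import math
import random

def train_som(data, n_nodes, n_iterations, seed=0):
    """Train a one-dimensional SOM following the five steps described above.

    Assumed for illustration: a 1-D lattice, a Gaussian neighbourhood
    function, and linearly decaying learning rate and neighbourhood radius.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Step 1: randomize the weight vectors of the map's nodes.
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for t in range(n_iterations):
        frac = 1.0 - t / n_iterations
        alpha = 0.5 * frac                        # learning coefficient alpha(t)
        sigma = max(0.5, (n_nodes / 2.0) * frac)  # neighbourhood radius
        # Step 2: select an input vector D(t).
        d = rng.choice(data)
        # Step 3: find the best matching unit (smallest Euclidean distance).
        bmu = min(range(n_nodes), key=lambda v: math.dist(weights[v], d))
        # Step 4: pull the BMU and its lattice neighbours toward D(t).
        for v in range(n_nodes):
            theta = math.exp(-((v - bmu) ** 2) / (2.0 * sigma ** 2))
            for k in range(dim):
                weights[v][k] += theta * alpha * (d[k] - weights[v][k])
        # Step 5: the loop increments t and repeats while t < n_iterations.
    return weights

# Hypothetical two-dimensional inputs scaled to the unit square.
data = [[0.1, 0.2], [0.15, 0.1], [0.9, 0.8], [0.85, 0.95], [0.5, 0.5]]
weights = train_som(data, n_nodes=5, n_iterations=500)
for w in weights:
    print([round(c, 3) for c in w])
```

Because each update moves a weight vector part of the way toward an input vector, the trained weights remain within the range of the input data.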

The SOM has a major advantage over traditional data reduction methods: it utilizes competitive and continuous learning of all input vectors. It also provides more powerful visualization tools for data exploration in large volumes of spatial data by giving a valid platform for user interaction and control (Vesanto, 1999; Yan and Thill, 2005). Furthermore, it is not limited by assumptions of stationarity and linearity of data. That is, it

converts non-linear relationships between multi-dimensional data into simple geometric

relationships. Lastly, the capability of carrying out both data projection and data

quantization at the same time is one other primary benefit of using SOM. Spatial data often

include a database with a large number of variables and observations, so it is important to

run both data projection and data quantization for data reduction (Kohonen, 2001; Yan and

Thill, 2005).

5.1.4 Geostatistical analysis

A geostatistical approach was used to interpolate concentration values of both raw

data and factor scores of the heavy metals in Anniston. This method is based on the fact that

samples distributed close together in space are more likely to be similar, compared to those

that are further apart (Matheron, 1963; McGrath et al., 2004). Geostatistics uses variograms

(or semi-variogram) to describe the spatial variability of the data, and produces input

parameters for spatial interpolation via kriging (Krige, 1951; McGrath et al., 2004; Webster
and Oliver, 2001). The conventional geostatistical model relates the semi-variance, defined

as half the expected squared difference between paired data values Z(x) and Z(x+h), to the

lag distance h, by which locations are separated:

$$\gamma(h) = \frac{1}{2} E\left[ Z(x) - Z(x+h) \right]^2$$

For discrete sampling areas, such as soil samples, the equation can be modelled as:

$$\gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ Z(x_i) - Z(x_i + h) \right]^2$$

where Z(xi) represents the value of the data, Z, at location xi, and N(h) denotes the number of

pairs of samples separated by the lag distance h. However, sampling is rarely regular enough for the distance between sample pairs to equal h exactly. Thus, for irregular sampling, the lag distance h is usually expressed as a distance band. A variogram plot is obtained by calculating the variogram at different lags. A theoretical model, such as an exponential, spherical, or Gaussian model, is then fitted to the variogram plot. The fitted model provides not only the input parameters

for kriging interpolation, but also information on the spatial structure of samples (McGrath

et al., 2004).
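The discrete semivariance formula and the use of distance bands can be sketched as follows. This is an illustration rather than the geostatistical software used in this study: the band definition (bands centred on multiples of the lag width), the function names, and the sample values are assumptions. A spherical model function is included to show the form of a theoretical model that could be fitted to the resulting points.

```python
import math

def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Empirical semivariance per distance band.

    Band b collects pairs whose separation lies in
    [(b + 0.5) * lag_width, (b + 1.5) * lag_width), i.e. bands centred on
    lag_width, 2 * lag_width, ... (a binning choice assumed here).
    Returns a list of (lag centre, semivariance or None, number of pairs).
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            h = math.dist(coords[i], coords[j])
            band = int(math.floor(h / lag_width - 0.5))
            if 0 <= band < n_lags:
                sums[band] += (values[i] - values[j]) ** 2
                counts[band] += 1
    return [
        ((b + 1) * lag_width, sums[b] / (2 * counts[b]) if counts[b] else None, counts[b])
        for b in range(n_lags)
    ]

def spherical_model(h, nugget, partial_sill, range_):
    """Spherical variogram model; reaches nugget + partial_sill at h = range_."""
    if h == 0:
        return 0.0
    if h >= range_:
        return nugget + partial_sill
    ratio = h / range_
    return nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)

# Hypothetical transect with a linear trend in the measured values.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [0.0, 1.0, 2.0, 3.0]
variogram = empirical_semivariogram(coords, values, lag_width=1.0, n_lags=3)
print(variogram)
print(spherical_model(2.5, nugget=0.0, partial_sill=1.0, range_=5.0))
```

In practice the nugget, sill, and range of the chosen model are estimated from the empirical points and then passed to the kriging step.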

5.2 Results and discussion

5.2.1 Metal analysis

Table 5-1 summarizes heavy metal concentrations in soils classified according to the

4 sampling zones under study. No significant differences between collection areas were observed for Ni concentrations in soils. For most of the other heavy metals, however, concentrations differed significantly between sampling zones. In particular, Pb, As, Cd, Cr, Cu, Hg and Zn concentrations were significantly higher in the industrial areas. Moreover, although Co and Mn levels in the industrial zone were higher than those observed in the less polluted zone (15.96 vs. 11.46 mg/kg and 1544.84 vs. 977.08 mg/kg, respectively) and V levels were similar (20.34 vs. 20.77 mg/kg), the differences did not reach the level of statistical significance.

Table 5-1. Metal Concentrations in Soil Samples, Anniston, Alabama

In past years, several studies have evaluated the effects of anthropogenic input on the

concentrations of Pb in soils of urban areas, suggesting that foundry fumes, vehicle exhausts,

sewage sludge, and lead pesticides were the primary sources of atmospheric exposure of this

metal. Coal burning is also well known as a common source of soil lead contamination

(Facchinelli et al., 2001; Figueira et al., 2002; Parekh et al., 2002). Together, these sources would explain why the current Pb concentrations in residential soils collected in Zone A

(industrial zone) were significantly higher than those found in the less polluted zone. In

previous studies, Nadal et al. (2004) reported concentrations in Tarragona County (Spain) of

36.3 and 14.6 mg/kg of Pb for industrial and unpolluted soils, respectively, while they also

found significant differences in the levels of Cr and V between the industrial and unpolluted

areas. In general terms, Pb concentrations in soils collected near the industrial complex in Anniston were higher than those previously reported in other industrial and residential areas

(Schuhmacher et al., 2002; Nadal et al., 2004). The current levels of Pb were also higher in

comparison to the concentrations found in the Piemonte Region (north-western Italy)

affected by anthropogenic activities like industry, agriculture and transportation (Facchinelli

et al., 2001).

A certain amount of As observed in soils has a pedogenic (natural) origin, averaging

2-5 mg/kg, although anthropogenic inputs such as coal burning, pesticide use and waste

incineration can be important contributors to the As content of soils (ATSDR, 2007). The most important sources of Cd in the environment are anthropogenic inputs, especially industrial combustion and phosphate fertilizers in agricultural soils. Furthermore, the presence of Cr is strongly controlled by both pedogenic (continental dust) and anthropogenic inputs

(primarily oil, gas and coal combustion) (ATSDR, 2008a; ATSDR, 2008b). As, Cd and Cr

concentrations in Anniston soils were, in general terms, higher than the levels previously

observed in other industrial and residential areas (Maiz et al., 2001; Nadal et al., 2004).

Co is a well-known micronutrient element essential for the growth of plants. In

particular, the presence of Co is highly influenced by natural sources like soil texture and

pedogenesis. Generally, Co is more likely to be included in clays and organic soils, which

can hold micronutrients and water much better than sandy soils (Plant Nutrients Website).

Moreover, it is widely known that ultramafic rocks contain greater volumes of Co than any

other rock type (Facchinelli et al., 2001). Copper (Cu) enters the environment especially

through metal production, phosphate production and architectural applications such as

roofing and plumbing. Natural processes are also very important contributors for Cu release

into the environment. For example, wind-blown dust, forest fires and decaying vegetation

are well-known natural sources of Cu distribution in the environment (Lenntech Website).

Similar to Co, natural sources (soil texture and pedogenesis) are important sources for the

contents of Mn in soils. Nevertheless, incineration wastes, combustion of fossil fuels and

cement production can also be critical factors for the concentrations of Mn in soils (ATSDR,

2008c).

Concentrations of Hg in the environment are strongly associated with both natural

and anthropogenic activities. Releases of this element from natural sources include exposure

of soils by wind and water and breakdown of minerals in rocks. Furthermore, a great amount

of Hg is also released into the environment through human activities like mining, smelting,

application of fertilizers and combustion of fossil fuels (Lenntech Website; ATSDR, 1999).

In contrast, most of the Ni on Earth is not accessible because it is stored in the molten iron-nickel core, which is 10% nickel. Concentrations of Ni in soils can be highly variable, ranging from 0.2 ppm to 450 ppm in some loamy and clay soils (Lenntech Website).

Combustion of residual coals and oils, vehicle exhausts and domestic heating are primary

sources of anthropogenic releases of V into the environment. Also, natural sources like soil

texture and soil hydrology can be important for V distribution in the environment (Soldi et

al., 1996; ATSDR, 2009). Lastly, the most important sources for Zn exposures in the

environment are anthropogenic activities, especially foundry fumes or dusts (Facchinelli et

al., 2001). In general, the current Hg and Zn concentrations in Anniston soils were

significantly higher than the levels previously observed in other industrial and residential

areas (Facchinelli et al., 2001; Nadal et al., 2004).

5.2.2 Principal Component Analysis

In the current survey, and after normalization of the results by log-transformation,

Principal Component Analysis (PCA) was carried out on the 595 soil samples. As shown in

Table 5-2, PCA retained three components, which account for 76.2% of the total variance in

the data. The eigenvalues of the three extracted factors are greater than one, and the variance is distributed more evenly among the three factors after the matrix rotation, which helps to clarify ambiguities in the component attribution. Seven of the heavy metals are subsequently well represented by these three principal components. The rotated component matrices presented in Table 5-3 indicate that

Pb, Cd, Cu and Zn are positively correlated, showing high values in the first main principal

component (which explains 35.97% of the total variance), while V is isolated in the third

principal component (19.66% of the variance). The second component (20.55% of the

variance) is positively associated with Co and Mn. However, the component loadings for As, Cr, Hg and Ni were rather ambiguous, and these metals were difficult to classify. Figures 5-5a and 5-5b show the component loadings for the three extracted components. Subsequently,

the scatterplots of the factor scores on PC1/PC2 and PC1/PC3 for soil samples are presented

in Figures 5-6a and 5-6b, respectively.

Table 5-2. Total variance explained - three components selected.

Total Variance Explained

             Initial Eigenvalues                     Extraction Sums of Squared Loadings     Rotation Sums of Squared Loadings
Component    Total   % of Variance   Cumulative %    Total   % of Variance   Cumulative %    Total   % of Variance   Cumulative %
1            5.782   52.560          52.560          5.782   52.560          52.560          3.956   35.968          35.968
2            1.546   14.052          66.612          1.546   14.052          66.612          2.261   20.552          56.520
3            1.052   9.567           76.179          1.052   9.567           76.179          2.162   19.658          76.179
4            0.628   5.710           81.889
5            0.512   4.652           86.541
6            0.428   3.893           90.434
7            0.340   3.087           93.521
8            0.248   2.252           95.772
9            0.213   1.941           97.713
10           0.142   1.287           99.000
11           0.110   1.000           100.000
Extraction Method: Principal Component Analysis.
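The percentages in Table 5-2 can be checked directly from the eigenvalues. The short sketch below assumes that the PCA was run on standardized variables, so that the total variance equals the number of metals (11); this is consistent with the eigenvalues summing to about 11, but it is an inference rather than something stated in the table. The variable names are illustrative.

```python
# Initial eigenvalues and rotation sums of squared loadings copied from Table 5-2.
eigenvalues = [5.782, 1.546, 1.052, 0.628, 0.512, 0.428,
               0.340, 0.248, 0.213, 0.142, 0.110]
rotated = [3.956, 2.261, 2.162]

n_variables = len(eigenvalues)  # 11 heavy metals

# Percent of variance is each eigenvalue divided by the total variance.
percent = [100 * ev / n_variables for ev in eigenvalues]

# Running total of the percentages.
cumulative = []
running = 0.0
for value in percent:
    running += value
    cumulative.append(running)

# Components kept under the eigenvalue-greater-than-one rule.
retained = [ev for ev in eigenvalues if ev > 1.0]
```

The rotated sums (3.956 + 2.261 + 2.162) add up to essentially the same total as the three retained eigenvalues, which shows that rotation redistributes the explained variance among the components without changing the 76.2% total.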

Table 5-3. Component matrixes for 11 heavy metals.

[Plot: Rotated PCA for All Zones (log-transformed). x-axis: Factor 1 (35.96%); y-axis: Factor 2 (20.55%); points labeled by metal.]

Figure 5-5a. Loading matrix on PC1 and PC2

[Plot: Rotated PCA for All Zones (log-transformed). x-axis: Factor 1 (35.96%); y-axis: Factor 3 (19.65%); points labeled by metal.]

Figure 5-5b. Loading matrix on PC1 and PC3

Figure 5-6a. Scatter plot of factor 1 versus factor 2 scores for soil samples taken in
the different study zones.

Figure 5-6b. Scatter plot of factor 1 versus factor 3 scores for soil samples taken in
the different study zones.
In the PCA scatter plots, factor 1 scores are elevated in soil samples taken in Zone A
(depicted as red dots) and in some soils collected in Zone C (shown as blue triangles). This
cluster implies heavy contamination by Pb, Cd, Cu and Zn in these zones. For example,
sampling point 34 (Zone A) was obtained in the vicinity of a casting foundry that had
released the highest amounts of Pb, Cu and Zn into the environment among the 23 former and
active foundries in Anniston. Similarly, samples 45 and 98 (Zone A), which displayed high
scores on factor 1, were also collected near the casting foundry. Samples 512, 513 and 531
(vicinity of the industrial zone), which also exhibited high scores on this component, were
located at sampling points characterized by major railroads and another iron casting foundry
that constructed many houses and sewer systems and released metals through architectural
applications and industrial combustion. In contrast, most of the soils sampled in the less
polluted zone (Zone B) showed low scores on factor 1, while soils sampled in and near the
industrial zone displayed relatively higher factor 1 scores, implying that heavy metal
pollution, especially by Pb, Cd, Cu and Zn, is more directly influenced by anthropogenic
activities such as the operation of foundries and major railroads.

Figure 5-7a. Factor 1 scores on soil samples collected in Anniston study area.

Soil samples collected in Zone A and Zone C also exhibited elevated component 2 scores,
suggesting high concentrations of Co and Mn in these areas, but at different sampling points
from the soils with high factor 1 scores. Samples 86 and 481 showed the highest component 2
scores, but they were collected relatively far from anthropogenic sources (railroads and
foundries). In contrast to component 1, many soils sampled in the less polluted zone also
showed positive component 2 scores, indicating that contamination of soils by Co and Mn is
controlled not only by anthropogenic inputs but also by natural factors such as soil texture
and pedogenesis. As previously stated, the presence of micronutrients including Co and Mn in
soils is significantly affected by the nature of the soil, such as its texture. That is,
clays and organic soils hold micronutrients and water much better than sandy soils. Water
drains much faster in sandy soils than in clay soils, and nutrients are often carried along
with the water and leached through the soil (Plant Nutrients Website). In this study, soil
samples with high component 2 scores were found mainly in areas with clay soils rather than
sandy soils (Figure 5-7b and Figure 5-8).

Figure 5-7b. Factor 2 scores on soil samples collected in Anniston study area.

Figure 5-8. Soil texture class in Anniston study area (CB-FSL: cobbly fine sandy loam,
CL: clay loam, CN-L: channery loam, CR-SIL: cherty silt loam, FSL: fine sandy loam,
GR-CL: gravelly clay loam, GR-FSL: gravelly fine sandy loam, GR-L: gravelly loam, GR-
SCL: gravelly sandy clay loam, GR-SICL: gravelly silt clay loam, GR-SIL: gravelly silt
loam, SIL: silt loam, ST-FSL: stony fine sandy loam, ST-L: stony loam, ST-SL: stony
sandy loam).

Most samples collected in the industrial zone have low factor 3 scores, whereas soils
sampled in the less polluted area and in the vicinity of the industrial zone showed
comparatively high scores on this factor, implying that both natural and anthropogenic
factors other than foundries are important contributors to the distribution of vanadium in
the environment. For example, soil samples 425 and 491, which have among the highest scores
on factor 3, were collected close to major railroads and the main hydrological feature,
respectively. However, sampling points 235, 274 and 275 (collected in the less polluted
zone) also have very high values on factor 3 and were collected far away from the potential
anthropogenic sources (railroads and foundries). This indicates that anthropogenic inputs
are not the only sources of V contamination in soils. Furthermore, many samples with
positive scores on factor 3 were found mainly in soils with a relatively higher clay
content, indicating that V contamination was also due to natural factors such as soil
texture and soil hydrology.

Figure 5-7c. Factor 3 scores on soil samples collected in Anniston study area.

5.2.3 Self-Organizing Maps (SOM) analysis

In addition to PCA, a chemometric classification method, the Self-Organizing Map (SOM),
was applied to the soil samples to identify whether the zones in proximity to the industrial
zone (marked as Industry), two residential properties (marked as Res Red and Res Yellow) and
the area near the Monsanto plant (Monsanto) were influenced by different types of metal
contamination. The SOM is a powerful artificial neural network method for producing
low-dimensional views of high-dimensional data and for classifying potentially contaminated
areas. It provides a clear assessment of the more contaminated areas and helps rank the most
important pollutants. Kohonen's map was composed of a rectangular grid of 17 × 7 hexagons.
The learning phase was run for 300 steps and the tuning phase for 300 steps. The learning
process produced a grid of 119 virtual units, as presented in Figure 5-9. The soils sampled
near the former and active foundries (lower part of the grid) appear to be more contaminated
than those sampled in proximity to the residential properties (upper part).
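To make the training procedure concrete, the following is a minimal standard-library sketch of a Kohonen map with 17 × 7 = 119 units and two training phases of 300 steps each. It is an illustration under several assumptions rather than the implementation used in this study: the units sit on a plain rectangular lattice instead of hexagons, the learning rates and neighborhood radii are invented, and the input samples are synthetic values already scaled to the range 0 to 1. The names `train_som` and `best_matching_unit` are hypothetical.

```python
import math
import random

def best_matching_unit(weights, sample):
    """Index of the unit whose weight vector is closest to the sample."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], sample)))

def train_som(samples, n_cols=17, n_rows=7, rough_steps=300, fine_steps=300, seed=0):
    """Train a small SOM with a rough ordering phase and a fine tuning phase."""
    rng = random.Random(seed)
    dim = len(samples[0])
    units = [(col, row) for row in range(n_rows) for col in range(n_cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in units]
    # (steps, initial learning rate, initial neighborhood radius): assumed values.
    phases = [(rough_steps, 0.5, max(n_cols, n_rows) / 2), (fine_steps, 0.05, 1.0)]
    for steps, rate0, radius0 in phases:
        for t in range(steps):
            frac = 1 - t / steps                 # decays linearly within the phase
            rate = rate0 * frac
            radius = max(radius0 * frac, 0.5)
            sample = rng.choice(samples)
            b_col, b_row = units[best_matching_unit(weights, sample)]
            for idx, (col, row) in enumerate(units):
                dist2 = (col - b_col) ** 2 + (row - b_row) ** 2
                influence = math.exp(-dist2 / (2 * radius ** 2))
                w = weights[idx]
                for k in range(dim):
                    # Pull the unit toward the sample, more strongly near the winner.
                    w[k] += rate * influence * (sample[k] - w[k])
    return units, weights

# Synthetic, pre-scaled "metal profiles" (assumption): one high and one low group.
data_rng = random.Random(1)
samples = ([[data_rng.uniform(0.7, 1.0) for _ in range(3)] for _ in range(30)] +
           [[data_rng.uniform(0.0, 0.3) for _ in range(3)] for _ in range(30)])

units, weights = train_som(samples)
high_unit = best_matching_unit(weights, [0.9, 0.9, 0.9])
low_unit = best_matching_unit(weights, [0.1, 0.1, 0.1])
```

After training, each sample is assigned to its best matching unit, and plotting one weight dimension at a time over the grid gives the kind of component plane view that the text describes for individual metals.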

Figure 5-9. SOM application to metal levels in soils. Differences between sampling points
according to the metal concentrations in the specific zones.

Several findings consistent with the PCA results are identified in the SOM analysis.
Firstly, the presence of active foundries in the industrial zone (Zone A), marked as
Industry, is associated with increased lead (Pb), cadmium (Cd), and zinc (Zn) levels in the
soils near the metal casting complex. Secondly, the concentrations of heavy metals in soil
samples from Zone B (less polluted zone) are, in general terms, lower than the levels found
in soils in the vicinity of the industrial zone (Zone C). Figure 5-10 displays the component
planes of the SOM, in which the heavy metal composition of each virtual unit is shown. The
most concentrated samples are those located in the lower center of the grid. This suggests
that lead (Pb), cadmium (Cd), and zinc (Zn), which are more directly influenced by
anthropogenic activities in Anniston, have a similar organizing behavior. Cobalt (Co) and
manganese (Mn), which are affected not only by anthropogenic input but also by natural
factors such as soil texture, represent a second group of elements located closer to the
lower left corner, while the behavior of vanadium (V) is clearly different from that of the
remaining elements.

Given the magnitude of industrialization in the study area, the current results suggest
that metal casting foundries are an important source of heavy metal contamination in the
area under study. In addition, the meteorological conditions and the characteristics of the
stacks and torches in the foundries of the study area may play an important role in the
dispersion of contaminants and the accumulation of heavy metals in soils. This, together
with the relatively large number of soil samples collected in this study, suggests that the
results may be generalized to other areas with similar contamination sources and industrial
characteristics. Lastly, the above results indicate that SOM can be a very useful
chemometric tool for classification.

Figure 5-10. SOM application to heavy metals in soils. Environmental behavior of the
different elements.

5.2.4 Geostatistical analysis

Kriging was applied to confirm the interpretation of the results and the heterogeneous
spatial distribution of heavy metals in Anniston, Alabama. Factor 1 scores of the heavy
metal data were plotted on maps showing major railroads and foundries in Anniston (Figure
5-11a and Figure 5-12a). As shown in these figures, there was good correspondence between
geochemical heterogeneities, shaded in dark brown on the map, and areas with foundries and
major railroads. This indicates that the long-term anthropogenic activities associated with
factor 1 are the main contributors to Pb, Cd, Cu and Zn contents in soils.
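As an illustration of how factor scores can be interpolated between sampling points, the sketch below implements ordinary kriging at a single target location in standard-library Python. It is a simplified example under assumptions, not the geostatistical model fitted in this study: the exponential variogram and its sill, range and nugget are invented, the four sample locations and scores are made up, and coordinates are assumed to be in meters. The function names are hypothetical.

```python
import math

def variogram(h, sill=1.0, practical_range=500.0, nugget=0.0):
    """Exponential semivariogram (assumed model and parameters)."""
    if h == 0:
        return 0.0
    return nugget + (sill - nugget) * (1 - math.exp(-3 * h / practical_range))

def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def ordinary_kriging(points, values, target):
    """Return the kriged value at target and the kriging weights."""
    n = len(points)
    # Semivariances between samples, bordered by the unbiasedness constraint.
    a = [[variogram(distance(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(distance(p, target)) for p in points] + [1.0]
    solution = solve(a, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, values)), weights

# Made-up sample locations (meters) and factor scores.
points = [(0.0, 0.0), (400.0, 0.0), (0.0, 400.0), (400.0, 400.0)]
values = [1.8, 0.4, -0.2, -0.9]

estimate, weights = ordinary_kriging(points, values, (100.0, 100.0))
at_sample, _ = ordinary_kriging(points, values, points[0])
```

The weights always sum to one because of the unbiasedness constraint, and with a zero nugget the predictor reproduces the observed value exactly at a sampled location.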

Figure 5-11a. 3 dimensional factor 1 scores interpolated by kriging.

Figure 5-12a. Interpolated Factor 1 scores.


In addition, factor 2 scores were plotted on maps (Figure 5-11b and Figure 5-12b). The
scores exhibit a different spatial distribution pattern from the factor 1 scores. In this
case, the variability of Co and Mn is better explained by natural sources. The
geostatistical analysis indicated a spatial association between positive heterogeneities of
Co and Mn contents and the occurrence of clay soils. This supports the view that the
presence of Co and Mn in soils is related not only to anthropogenic sources but is also
controlled by natural factors such as soil texture.

Figure 5-11b. 3 dimensional factor 2 scores interpolated by kriging.

Figure 5-12b. Interpolated Factor 2 scores.

Lastly, geostatistical analysis was applied to the V distribution in soils. In this case,
the variability of V is not related to that of the other metals, while Co and Mn represent a
second group of elements. The interpolated scores were plotted on a map containing major
railroads and the main hydrological features (Figure 5-11c and Figure 5-12c). The highest
factor 3 values, shaded in dark brown, correspond to the main hydrological feature and the
major regional railways, the Southern Railways. This suggests that both natural and
anthropogenic sources may contribute to the release of V into the environment.

Figure 5-11c. 3 dimensional factor 3 scores interpolated by kriging.

Figure 5-12c. Interpolated Factor 3 scores.

In summary, the concentrations of heavy metals in soils collected in Anniston were
relatively higher than those reported in previous studies conducted in other industrialized
countries. Furthermore, based on the levels of the chemical contaminants analyzed in this
study, the potential health impact on residents living close to the pollution sources should
be given special consideration.

Chapter 6 Analyzing Associations of Soil and Serum PCB in Anniston,
Alabama: the Comparison between All Properties and Focus Sites
Several studies have reported an association between health risk and exposure to chemical
pollutants including lead, radon, and asbestos (Duhme et al., 1996; Brunekreef et al., 1996;
Heinze et al., 1998; Kohli et al., 2000; Bellander et al., 2001; Reissman et al., 2001;
Oyana et al., 2004; Pan et al., 2005). Some of these studies used GIS-based residential
proximity analyses to assess individual exposure to chemical pollution from natural
occurrence and human activities (Kohli et al., 2000; Reissman et al., 2001; Oyana et al.,
2004; Pan et al., 2005). Modelling and mapping of pollution data in air and soil have been
used as proxies for the exposure fields of residents in an area (Valjus et al., 1995;
Stockwell et al., 1996). High chemical concentrations in air and soil are measured in
residential areas in close proximity to anthropogenic sources (major roads and railways) and
natural sources (ground radon and ultramafic rocks).

However, only a few studies have been carried out to identify a possible association
between PCB exposure in residents and the corresponding health risk. PCB contamination of
residential soils in Anniston is largely attributed to the Monsanto plant, which was one of
only two facilities in the United States that produced PCBs for about forty years.
Therefore, the aim of this chapter was to assess the magnitude of soil PCB exposure in
Anniston residents by analyzing serum PCB concentrations.

6.1 Materials and methods

6.1.1 Data categories

The four data categories analyzed in this study are as follows: soil samples with PCB
levels collected from residential properties; the Anniston Community Health Survey (ACHS)
with participants' serum PCB levels; the neurocognitive study with children's serum PCB
levels; and focus sites consisting of the two primary sources of soil PCB contamination, the
Monsanto plant and streams near the Monsanto plant. In particular, it is important to note
that although the ACHS and neurocognitive study efforts were separate from the US EPA
activities, the investigators were aware of the major undertaking by EPA, and an agreement
was made that the results of the soil PCB analyses would be made available to the ACHS and
neurocognitive investigators to examine this potential pathway of exposure to PCBs in
Anniston residents. The data categories are described in detail in the following sections.

Soil samples with PCB levels collected in residential properties

The US EPA database contained total PCB levels (parts per million [ppm] or mg/kg) for
22,452 residential soil samples, with multiple measurements at each location, from 6,864
properties in Anniston. The samples were taken by three parties: Solutia, EPA, and the
Foothills Community Partnership (FCP). As shown in Figure 6-1, the Solutia and EPA samples
were collected from the area where higher PCB concentrations were expected, close to the
Monsanto plant and off-site drainage ditches and streams, whereas FCP was looking for
foundry sand much farther away from the Monsanto plant and off-site drainage ditches.
Information on the associated address and geographic coordinates is also contained in the
database. Total PCBs in soil were analyzed by EPA Method 8082.

Figure 6-1. Sample locations of Solutia and FCP soils

Anniston Community Health Survey (ACHS) with participants' serum PCB levels

The Anniston Environmental Health Research Consortium (AEHRC) conducted the Anniston
Community Health Survey (ACHS), which was funded by ATSDR. Serum from the 766 ACHS
participants was analyzed for a total of thirty-five ortho-substituted PCBs, reported in
parts per billion (ppb; ng/g), by the Centers for Disease Control and Prevention's National
Center for Environmental Health laboratory using high-resolution gas
chromatography/isotope-dilution high-resolution mass spectrometry (HRGC/ID-HRMS). More
details on sample selection are given in section 3.1.1 of Chapter 3.

Neurocognitive study with children's serum PCB levels

The neurocognitive study was also conducted by the Anniston Environmental Health Research
Consortium (AEHRC), and it contained serum PCB levels, measured in parts per trillion (ppt;
pg/g serum), for 321 children from Anniston and its surrounding communities. Serum was
analyzed using the same laboratory procedures as in the ACHS (a total of thirty-five
ortho-substituted PCBs using high-resolution gas chromatography/isotope-dilution
high-resolution mass spectrometry). The study surveyed Anniston-area schools in the vicinity
of the Monsanto plant and comprised 321 children from 6th through 8th grade. The database
also includes information on their addresses, zip codes, basic demographics, and the results
of extensive psychometric and cognitive testing.

Focus sites

The two focus sites, the Monsanto plant and the hydrology (off-site drainage ditches and
streams near the Monsanto plant), were scanned and digitized manually because these
geographic features were otherwise unavailable. All of these data were used in a Geographic
Information Systems (GIS) application. These two focus sites, identified in Chapter 3, were
of primary interest given their statistically significant association with increased risk of
elevated soil and serum PCB levels. Thus, we are particularly interested in residents living
in the vicinity of the plant and hydrology in order to determine potential relationships
between soil PCB contamination and the risk of high serum PCB levels. It was assumed that
residents in both the ACHS and the neurocognitive study who live within a 300 m radius
buffer of the two focus sites were significantly exposed to PCBs from the suspected sources
of contamination, and that residents living farther away (beyond the 300 m buffer) were less
exposed. Figure 6-2 shows the two focus sites and the participants in the ACHS and
neurocognitive study living within the 300 m radius buffer of the two focus sites.
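The 300 m exposure rule can be expressed as a distance test between each residence and the digitized focus-site features. The sketch below is a simplified illustration, not the GIS workflow used in this study: it assumes planar coordinates in meters, represents each feature as a polyline of vertices, and uses made-up geometry for a stream and a plant boundary. The function names are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_polyline(p, line):
    """Shortest distance from point p to a polyline given as a list of vertices."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))

def within_focus_buffer(p, features, radius_m=300.0):
    """True if p lies within radius_m of any focus-site feature."""
    return any(distance_to_polyline(p, feature) <= radius_m for feature in features)

# Made-up digitized features (meters): a straight stream and a square plant boundary.
stream = [(0.0, 0.0), (1000.0, 0.0)]
plant_boundary = [(0.0, 500.0), (200.0, 500.0), (200.0, 700.0), (0.0, 700.0), (0.0, 500.0)]
focus_sites = [stream, plant_boundary]

near_resident = within_focus_buffer((500.0, 250.0), focus_sites)  # 250 m from the stream
far_resident = within_focus_buffer((500.0, 400.0), focus_sites)   # more than 300 m from both
```

Measuring the distance to a boundary line treats a point inside the plant polygon as being some distance away from it; a production GIS buffer would normally include the polygon interior as well.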

Figure 6-2. Two focus sites and participants living in close proximity to the focus sites

6.1.2 Analytical techniques

Buffer analysis

The selection of participants for the ACHS and neurocognitive study and the selection of
locations/residences for soil sampling were not coordinated; therefore, buffer and kriging
methods were used to assign soil sample measurements to a particular address in the ACHS and
neurocognitive files. Twenty-five meter (25 m) and 50 meter (50 m) radius buffer analyses
were used to identify the soil measurements in the EPA database located in proximity to the
residential addresses in the ACHS and neurocognitive study. Average and maximum soil levels
were calculated for each buffer. Up to 5 soil samples per residence were used to calculate
the average soil PCB level in the 25 m radius buffer (median = 1) and up to 14 soil samples
in the 50 m radius buffer (median = 3). In addition, 300 m radius buffers were created
around the two focus sites, the Monsanto plant and the hydrology (off-site drainage ditches
and streams near the Monsanto plant), and only those participants who live within the buffer
and their corresponding soil sample measurements were extracted for further proximity
analyses between soil and serum PCB levels. GIS techniques combined with statistical
analysis were applied to compare the significance of correlations between soil and serum PCB
levels for the residents of the focus sites and of all properties. Buffer maps used to
estimate residential soil PCB levels of ACHS and neurocognitive study participants are
presented in Figures 6-3a and b.
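The per-residence buffer calculation described above amounts to collecting the soil samples within a given radius of an address and summarizing them. The following standard-library sketch illustrates that step under assumptions: coordinates are planar and in meters, the soil samples and the residence are made up, and the function name `buffer_summary` is hypothetical.

```python
import math

def buffer_summary(residence, soil_samples, radius_m):
    """Average and maximum soil PCB (ppm) among samples within radius_m of a residence.

    soil_samples is a list of (x, y, pcb_ppm) tuples; returns None when the
    buffer contains no sample, in which case the residence cannot be assigned a value.
    """
    rx, ry = residence
    inside = [ppm for x, y, ppm in soil_samples
              if math.hypot(x - rx, y - ry) <= radius_m]
    if not inside:
        return None
    return {"n": len(inside),
            "average": sum(inside) / len(inside),
            "maximum": max(inside)}

# Made-up soil samples around a residence located at the origin.
soil_samples = [(10.0, 0.0, 0.2), (0.0, 20.0, 0.6), (40.0, 0.0, 1.0), (100.0, 100.0, 5.0)]

summary_25m = buffer_summary((0.0, 0.0), soil_samples, 25.0)
summary_50m = buffer_summary((0.0, 0.0), soil_samples, 50.0)
no_match = buffer_summary((1000.0, 1000.0), soil_samples, 25.0)
```

Widening the radius from 25 m to 50 m brings more samples into the calculation, which is why more addresses can be matched at 50 m, at the cost of using soil that is farther from the home.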

Figure 6-3a and b. Buffer Maps to Estimate Residential Soil PCB Levels of ACHS and
Neurocognitive Study Participants
Kriging analysis

We also used kriging to predict soil PCB levels at unobserved locations from data at
observed locations, in order to increase the number of residences we could use in the
statistical analyses. In the same manner as for the buffer analysis, correlation analyses
were used to examine associations between soil and serum PCB levels. Specific procedures and
the conventional geostatistical model for kriging are explained in section 5.1.4 of Chapter
5. Figure 6-4 presents an example of a kriging map used to estimate residential soil PCB
levels in Anniston from the Solutia and EPA, and FCP soil datasets.

Lastly, the differences in residential soil PCB levels estimated by the buffer and kriging
methods between all properties and focus sites were analyzed using a two-sample t-test and
one-way analysis of variance (ANOVA). In addition, correlation analysis was performed to
determine possible associations between the estimated soil PCB levels and the corresponding
serum PCB levels.
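A two-sample comparison of mean soil PCB levels can be sketched as follows. The text does not state which variant of the t-test was used, so this example assumes Welch's unequal-variance form; the two groups of values are invented for illustration and the function name `welch_t` is hypothetical.

```python
import math

def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    na, nb = len(sample_a), len(sample_b)
    mean_a, mean_b = sum(sample_a) / na, sum(sample_b) / nb
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (nb - 1)
    se2_a, se2_b = var_a / na, var_b / nb
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df

# Invented soil PCB levels (ppm) for residences near the focus sites and elsewhere.
focus_site_soils = [0.9, 1.4, 0.7, 2.1, 1.0]
other_soils = [0.3, 0.5, 0.2, 0.6, 0.4, 0.4]

t_stat, df = welch_t(focus_site_soils, other_soils)
t_reversed, _ = welch_t(other_soils, focus_site_soils)
```

The resulting statistic is compared with the critical value of the t distribution for the computed degrees of freedom; soil PCB data are often strongly right-skewed, so a log-transformation before testing may be worth considering.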

Figure 6-4. Kriging Map to Estimate Residential Soil PCB Levels in Anniston using Solutia
and EPA, and FCP Soil Dataset.

6.2 Results

6.2.1 Buffer analysis of all properties-focus sites

Buffer analysis in all properties

Three sets of soil levels (Solutia and EPA, FCP, and combined) were used for the buffer
analysis of all ACHS properties. The current addresses of 106, 150 and 240 ACHS participants
fell within the 25 m radius buffer of the Solutia and EPA, FCP and combined soil samples,
respectively. The means of the average and maximum soil PCB levels within the 25 m radius
were 0.50 and 1.17 ppm for the Solutia and EPA soil samples, 0.20 and 0.33 ppm for the FCP
soil samples, and 0.34 and 0.71 ppm for the combined soil samples. The 50 m radius buffer
included 233, 409 and 491 addresses/properties of ACHS participants for each set of soil
levels, respectively. The means of the average and maximum soil PCB levels within the 50 m
radius were 0.55 and 1.11 ppm for Solutia and EPA soils; 0.26 and 0.50 ppm for FCP soils;
and 0.38 and 0.79 ppm for combined soils. All these results are summarized in Tables 6-1a
and b.

Table 6-1. Descriptive statistics for buffer analysis in ACHS (All properties)

Table 6-1a 25m buffer analysis for Solutia and EPA, FCP and Combined datasets (ACHS)
(**: significantly different to the mean PCB levels of soil samples in focus sites at 0.05 level)
Dataset            Residential soil statistic (25 m radius)   N     Minimum (ppm)   Maximum (ppm)   Mean (ppm)
Solutia and EPA    Average of the averaged soils              106   0.02            7.93            0.50**
Solutia and EPA    Average of the maximum soils               106   0.30            29.79           1.17
FCP                Average of the averaged soils              150   0.01            3.72            0.20
FCP                Average of the maximum soils               150   0.02            7.40            0.33
Combined           Average of the averaged soils              240   0.01            7.93            0.34**
Combined           Average of the maximum soils               240   0.02            29.8            0.71**

Table 6-1b 50m buffer analysis for Solutia and EPA, FCP and Combined datasets (ACHS)
(**: significantly different to the mean PCB levels of soil samples in focus sites at 0.05 level)
Dataset            Residential soil statistic (50 m radius)   N     Minimum (ppm)   Maximum (ppm)   Mean (ppm)
Solutia and EPA    Average of the averaged soils              233   0.03            8.54            0.55**
Solutia and EPA    Average of the maximum soils               233   0.03            22.50           1.11**
FCP                Average of the averaged soils              409   0.01            8.18            0.26
FCP                Average of the maximum soils               409   0.01            28.80           0.50
Combined           Average of the averaged soils              491   0.01            8.18            0.38**
Combined           Average of the maximum soils               491   0.02            28.80           0.79**

These three sets of soil levels were also applied in the buffer analysis of all
neurocognitive study addresses. The properties of 23, 26 and 49 neurocognitive study
participants were covered by the 25 m buffer for the Solutia and EPA, FCP and combined soil
samples, respectively. As shown in Tables 6-2a and b, the means of the average and maximum
soil PCB levels within the 25 m radius were 0.50 and 0.76 ppm for Solutia and EPA soils;
0.47 and 0.87 ppm for FCP soils; and 0.49 and 0.82 ppm for combined soil samples.
Additionally, the current addresses of 47, 87 and 117 neurocognitive study participants fell
within the 50 m buffer for each set of soil samples, respectively. The means of the average
and maximum soil PCB levels within the 50 m radius were 0.55 and 1.27 ppm for Solutia and
EPA soils; 0.27 and 0.46 ppm for FCP soils; and 0.39 and 0.79 ppm for combined soils.

Table 6-2. Descriptive statistics for buffer analysis in neurocognitive study (All properties)

Table 6-2a 25m buffer analysis for Solutia and EPA, FCP and Combined datasets
(Neurocognitive Study)
(**: significantly different to the mean PCB levels of soil samples in focus sites at 0.05 level)
Dataset            Residential soil statistic (25 m radius)   N     Minimum (ppm)   Maximum (ppm)   Mean (ppm)
Solutia and EPA    Average of the averaged soils              23    0.04            2.25            0.50
Solutia and EPA    Average of the maximum soils               23    0.05            4.29            0.76
FCP                Average of the averaged soils              26    0.02            3.72            0.47
FCP                Average of the maximum soils               26    0.03            7.40            0.87
Combined           Average of the averaged soils              49    0.02            3.72            0.49
Combined           Average of the maximum soils               49    0.03            7.40            0.82

Table 6-2b 50m buffer analysis for Solutia and EPA, FCP and Combined datasets
(Neurocognitive Study)
(**: significantly different to the mean PCB levels of soil samples in focus sites at 0.05 level)
Dataset            Residential soil statistic (50 m radius)   N     Minimum (ppm)   Maximum (ppm)   Mean (ppm)
Solutia and EPA    Average of the averaged soils              47    0.03            3.58            0.55
Solutia and EPA    Average of the maximum soils               47    0.04            16.16           1.27
FCP                Average of the averaged soils              87    0.02            3.72            0.27
FCP                Average of the maximum soils               87    0.02            7.40            0.46
Combined           Average of the averaged soils              117   0.02            3.72            0.39**
Combined           Average of the maximum soils               117   0.02            16.16           0.79**

After extracting soil PCB levels for each ACHS and neurocognitive study participant by the
radius buffer method, the soil PCB values were compared with the serum level of each
participant to determine possible associations between residential soil PCB exposure and the
risk of high serum PCB levels. However, no significant correlations were found between the
soil measurements obtained by this method and serum levels across all ACHS and
neurocognitive study participants, as shown in Tables 6-3a-d.
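To see why coefficients of this size are not significant, a correlation coefficient can be converted into a t statistic with n - 2 degrees of freedom. The sketch below assumes that the reported values are Pearson coefficients, which the text does not state explicitly; it uses the largest-magnitude ACHS value in Table 6-3a (r = -0.095, N = 150), and the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t(r, n):
    """t statistic for testing whether a correlation differs from zero (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# FCP dataset, maximum soil level in the 25 m buffer, ACHS participants (Table 6-3a).
t_fcp_25m = correlation_t(-0.095, 150)

# Small made-up check of pearson_r on perfectly linear data.
r_linear = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```

The resulting statistic is about -1.16, well inside the two-sided 0.05 critical value of roughly 1.98 for 148 degrees of freedom, which is consistent with the absence of significance markers in the tables.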

Table 6-3. Correlation matrix between soil levels extracted by buffer analysis and serum
levels in ACHS and neurocognitive study (All properties)

Table 6-3a Correlation matrix between soil levels extracted by 25m buffer analysis and ACHS
participants serum levels (**: significant at 0.01 level, *: significant at 0.05 level)
Dataset            Residential soil statistic (25 m radius)   N     Correlation coefficient
Solutia and EPA    Average of the averaged soils              106   -0.020
Solutia and EPA    Average of the maximum soils               106   0.008
FCP                Average of the averaged soils              150   -0.094
FCP                Average of the maximum soils               150   -0.095
Combined           Average of the averaged soils              240   -0.022
Combined           Average of the maximum soils               240   0.004

Table 6-3b Correlation matrix between soil levels extracted by 50m buffer analysis and ACHS
participants serum levels (**: significant at 0.01 level, *: significant at 0.05 level)
Dataset            Residential soil statistic (50 m radius)   N     Correlation coefficient
Solutia and EPA    Average of the averaged soils              233   0.016
Solutia and EPA    Average of the maximum soils               233   0.018
FCP                Average of the averaged soils              409   -0.035
FCP                Average of the maximum soils               409   -0.034
Combined           Average of the averaged soils              491   0.019
Combined           Average of the maximum soils               491   0.015

Table 6-3c Correlation matrix between soil levels extracted by 25m buffer analysis and neurocognitive
study participants serum levels (**: significant at 0.01 level, *: significant at 0.05 level)
Dataset            Residential soil statistic (25 m radius)   N     Correlation coefficient
Solutia and EPA    Average of the averaged soils              23    -0.086
Solutia and EPA    Average of the maximum soils               23    -0.032
FCP                Average of the averaged soils              26    -0.046
FCP                Average of the maximum soils               26    -0.051
Combined           Average of the averaged soils              49    -0.051
Combined           Average of the maximum soils               49    -0.043

Table 6-3d Correlation matrix between soil levels extracted by 50m buffer analysis and neurocognitive
study participants serum levels (**: significant at 0.01 level, *: significant at 0.05 level)
Dataset            Residential soil statistic (50 m radius)   N     Correlation coefficient
Solutia and EPA    Average of the averaged soils              47    0.028
Solutia and EPA    Average of the maximum soils               47    0.001
FCP                Average of the averaged soils              87    -0.069
FCP                Average of the maximum soils               87    -0.067
Combined           Average of the averaged soils              117   -0.038
Combined           Average of the maximum soils               117   -0.028

Buffer analysis in focus sites

In this analysis, we performed the same buffer analysis as above, but restricted it to ACHS
and neurocognitive study participants living within the 300 m radius buffer of the two focus
sites, the Monsanto plant and the hydrology (off-site drainage ditches and streams near the
Monsanto plant), since these were identified as the two most important factors influencing
soil PCB contamination. For the ACHS participants, the current addresses of 41, 9 and 46
properties fell within the 25 m radius buffer of the Solutia and EPA, FCP and combined soil
samples, respectively. As summarized in Tables 6-4a and b, the means of the average and
maximum soil PCB levels within the 25 m radius in the focus sites were 0.96 and 2.54 ppm for
Solutia and EPA soils; 0.25 and 0.35 ppm for FCP soils; and 0.88 and 2.29 ppm for combined
soils. In addition, the 50 m radius buffer contained 68, 22 and 72 addresses of ACHS
participants for each set of soil levels, respectively. The means of the average and maximum
soil PCB levels within the 50 m radius buffer were 1.21 and 2.62 ppm for Solutia and EPA
soils; 0.35 and 0.56 ppm for FCP soils; and 1.05 and 2.21 ppm for combined soils.

Table 6-4. Descriptive statistics for buffer analysis in ACHS (Focus sites)

Table 6-4a 25m buffer analysis for Solutia and EPA, FCP and Combined datasets (ACHS)
(**: significantly different to the mean PCB levels of soil samples in all properties at 0.05 level)
25m buffer N Minimum Maximum Mean
Average of the averaged residential 41 0.02 7.93 0.96**
soils contained in 25 m radius (ppm) (ppm) (ppm)
(Solutia and EPA dataset only)
Average of the maximum residential 41 0.04 29.79 2.54
soils contained in 25 m-radius (ppm) (ppm) (ppm)
(Solutia and EPA dataset only)
Average of the averaged residential 9 0.04 1.02 0.25
soils contained in 25 m radius (ppm) (ppm) (ppm)
(FCP dataset only)
Average of the maximum residential 9 0.06 1.50 0.35
soils contained in 25 m-radius (ppm) (ppm) (ppm)
(FCP dataset only)
Average of the averaged residential 46 0.02 7.93 0.88**
soils contained in 25 m radius (ppm) (ppm) (ppm)
(Combined dataset)
Average of the maximum residential 46 0.04 29.79 2.29**
soils contained in 25 m-radius (ppm) (ppm) (ppm)
(Combined dataset)
Table 6-4b 50m buffer analysis for Solutia and EPA, FCP and Combined datasets (ACHS)
(**: significantly different to the mean PCB levels of soil samples in all properties at 0.05 level)
50m buffer N Minimum Maximum Mean
Average of the averaged residential 68 0.08 8.54 1.21**
soils contained in 50 m radius (ppm) (ppm) (ppm)
(Solutia and EPA dataset only)
Average of the maximum residential 68 0.10 22.50 2.62**
soils contained in 50 m-radius (ppm) (ppm) (ppm)
(Solutia and EPA dataset only)
Average of the averaged residential 22 0.04 1.02 0.35
soils contained in 50 m radius (ppm) (ppm) (ppm)
(FCP dataset only)
Average of the maximum residential 22 0.05 1.73 0.56
soils contained in 50 m-radius (ppm) (ppm) (ppm)
(FCP dataset only)
Average of the averaged residential 72 0.07 6.39 1.05**
soils contained in 50 m radius (ppm) (ppm) (ppm)
(Combined dataset)
Average of the maximum residential 72 0.08 18.39 2.21**
soils contained in 50 m-radius (ppm) (ppm) (ppm)
(Combined dataset)

For the neurocognitive study participants in the focus sites, 16, 2 and 18 residential properties were covered by the 25 m buffer for the three soil datasets. The means of the average and maximum soil PCB levels within the 25 m radius were 0.62 and 0.94 ppm for the Solutia and EPA soil samples and 0.05 and 0.07 ppm for the FCP soil samples, whereas the averages of the averaged and maximum residential soil levels in the combined samples were 0.56 and 0.84 ppm. Furthermore, the 50 m buffer included 20, 3 and 23 addresses of neurocognitive study participants for the three soil datasets. The averages of the averaged and maximum residential soil levels within this buffer were 0.89 and 2.41 ppm for Solutia and EPA samples; 0.31 and 0.32 ppm for FCP samples; and 0.81 and 2.14 ppm for combined samples. All of these results are presented in Tables 6-5a and b.

Table 6-5. Descriptive statistics for buffer analysis in neurocognitive study (Focus sites)

Table 6-5a 25m buffer analysis for Solutia and EPA, FCP and Combined datasets
(Neurocognitive Study)
(**: significantly different to the mean PCB levels of soil samples in all properties at 0.05 level)
25m buffer N Minimum Maximum Mean
Average of the averaged residential 16 0.11 2.25 0.62
soils contained in 25 m radius (ppm) (ppm) (ppm)
(Solutia and EPA dataset only)
Average of the maximum residential 16 0.11 4.29 0.94
soils contained in 25 m-radius (ppm) (ppm) (ppm)
(Solutia and EPA dataset only)
Average of the averaged residential 2 0.05 0.05 0.05
soils contained in 25 m radius (ppm) (ppm) (ppm)
(FCP dataset only)
Average of the maximum residential 2 0.07 0.07 0.07
soils contained in 25 m-radius (ppm) (ppm) (ppm)
(FCP dataset only)
Average of the averaged residential 18 0.05 2.25 0.56
soils contained in 25 m radius (ppm) (ppm) (ppm)
(Combined dataset)
Average of the maximum residential 18 0.07 4.29 0.84
soils contained in 25 m-radius (ppm) (ppm) (ppm)
(Combined dataset)
Table 6-5b 50m buffer analysis for Solutia and EPA, FCP and Combined datasets
(Neurocognitive Study)
(**: significantly different to the mean PCB levels of soil samples in all properties at 0.05 level)
50m buffer N Minimum Maximum Mean
Average of the averaged residential 20 0.08 3.58 0.89
soils contained in 50 m radius (ppm) (ppm) (ppm)
(Solutia and EPA dataset only)
Average of the maximum residential 20 0.11 16.16 2.41
soils contained in 50 m-radius (ppm) (ppm) (ppm)
(Solutia and EPA dataset only)
Average of the averaged residential 3 0.07 0.79 0.31
soils contained in 50 m radius (ppm) (ppm) (ppm)
(FCP dataset only)
Average of the maximum residential 3 0.08 0.80 0.32
soils contained in 50 m-radius (ppm) (ppm) (ppm)
(FCP dataset only)
Average of the averaged residential 23 0.07 3.58 0.81**
soils contained in 50 m radius (ppm) (ppm) (ppm)
(Combined dataset)
Average of the maximum residential 23 0.08 16.16 2.14**
soils contained in 50 m-radius (ppm) (ppm) (ppm)
(Combined dataset)

Correlation analysis was carried out to determine possible trends between residents' soil PCB exposure and the risk of high serum PCB levels, after extracting soil PCB levels for each ACHS and neurocognitive study participant by the radius buffer methods. In contrast to the results for all properties, significantly positive associations between soil and serum PCB levels were found in some of the ACHS results from the buffer analysis in the focus sites (r > 0.50, p = 0.009). These significant associations persisted even after adjusting for age, the strongest predictor of serum level in the regression analysis (see Table 6-6e). The results of the correlation analysis are summarized in Tables 6-6a-e, and scatter plots are presented in Figures 6-5a and b.
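As a minimal sketch of the correlation step, the following computes a Pearson correlation coefficient and the t-statistic commonly used to test it against zero. The source does not state which test or software produced the reported p-values, so the function names and the choice of this particular test are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t-statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Coefficient and sample size taken from Table 6-6b (FCP dataset, 50 m buffer).
print(round(t_from_r(0.503, 22), 2))
```

For r = 0.503 and N = 22 the statistic is about 2.60 on 20 degrees of freedom; comparing it with a t distribution gives the significance level, and the result depends on whether a one-tailed or two-tailed test is used.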

Table 6-6. Correlation matrix between soil levels extracted by buffer analysis and serum
levels in ACHS and neurocognitive study (Focus sites)

Table 6-6a Correlation matrix between soil levels extracted by 25m buffer analysis and ACHS
participants serum levels (**: significant at 0.01 level, *: significant at 0.05 level)
25m buffer N Correlation coefficient
Average of the averaged residential soils contained in 25 m radius 41 0.196
(Solutia and EPA dataset only)
Average of the maximum residential soils contained in 25 m-radius 41 0.217
(Solutia and EPA dataset only)
Average of the averaged residential soils contained in 25 m radius 9 0.420
(FCP dataset only)
Average of the maximum residential soils contained in 25 m-radius 9 0.444
(FCP dataset only)
Average of the averaged residential soils contained in 25 m radius 46 0.214
(Combined dataset)
Average of the maximum residential soils contained in 25 m-radius 46 0.257*
(Combined dataset)
Table 6-6b Correlation matrix between soil levels extracted by 50m buffer analysis and ACHS
participants serum levels (**: significant at 0.01 level, *: significant at 0.05 level)
50m buffer N Correlation coefficient
Average of the averaged residential soils contained in 50 m radius 68 0.032
(Solutia and EPA dataset only)
Average of the maximum residential soils contained in 50 m-radius 68 0.037
(Solutia and EPA dataset only)
Average of the averaged residential soils contained in 50 m radius 22 0.503**
(FCP dataset only)
Average of the maximum residential soils contained in 50 m-radius 22 0.470*
(FCP dataset only)
Average of the averaged residential soils contained in 50 m radius 72 0.120
(Combined dataset)
Average of the maximum residential soils contained in 50 m-radius 72 0.118
(Combined dataset)
Table 6-6c Correlation matrix between soil levels extracted by 25m buffer analysis and neurocognitive
study participants serum levels (**: significant at 0.01 level, *: significant at 0.05 level)
25m buffer N Correlation coefficient
Average of the averaged residential soils contained in 25 m radius 16 -0.167
(Solutia and EPA dataset only)
Average of the maximum residential soils contained in 25 m-radius 16 -0.069
(Solutia and EPA dataset only)
Average of the averaged residential soils contained in 25 m radius 2 n/a
(FCP dataset only)
Average of the maximum residential soils contained in 25 m-radius 2 n/a
(FCP dataset only)
Average of the averaged residential soils contained in 25 m radius 18 -0.058
(Combined dataset)
Average of the maximum residential soils contained in 25 m-radius 18 0.010
(Combined dataset)
Table 6-6d Correlation matrix between soil levels extracted by 50m buffer analysis and neurocognitive
study participants serum levels (**: significant at 0.01 level, *: significant at 0.05 level)
50m buffer N Correlation coefficient
Average of the averaged residential soils contained in 50 m radius 20 -0.056
(Solutia and EPA dataset only)
Average of the maximum residential soils contained in 50 m-radius 20 0.058
(Solutia and EPA dataset only)
Average of the averaged residential soils contained in 50 m radius 3 0.980
(FCP dataset only)
Average of the maximum residential soils contained in 50 m-radius 3 0.980
(FCP dataset only)
Average of the averaged residential soils contained in 50 m radius 23 0.012
(Combined dataset)
Average of the maximum residential soils contained in 50 m-radius 23 0.095
(Combined dataset)
Table 6-6e Regression model (dependent variable: serum PCBs; R-squared: 0.318)
Independent variable    Unstandardized coefficient (standardized)    t-statistic
Constant    -1.570    -0.449
Soil PCB levels estimated by buffer analysis    6.620 (0.400)    -1.962
Age    0.090 (0.276)    1.350

Figure 6-5a. Scatter Plot between soil levels extracted by 50m buffer analysis and ACHS
participants serum levels

Figure 6-5b. Scatter Plot between soil levels extracted by 50m buffer analysis and ACHS
participants serum levels
6.2.2 Kriging analysis of all properties and focus sites

Kriging analysis in all properties

The kriging procedure used the three sets of soil data to interpolate PCB levels at locations where no measurements were taken, generating geographic maps of Anniston. It covered 616, 681 and 681 ACHS participants' properties for the Solutia and EPA, FCP and combined soil samples, respectively. The averages of the maximum kriged soil PCB levels were 0.72 ppm for Solutia and EPA samples; 0.38 ppm for FCP samples; and 0.57 ppm for combined samples. Similarly, the kriging method was applied to all neurocognitive study addresses, covering 205, 207 and 220 participants' properties for the three sets of soil levels, respectively. The averages of the maximum kriged soil PCB levels across all neurocognitive study addresses were 0.69 ppm for Solutia and EPA soils; 0.37 ppm for FCP soils; and 0.62 ppm for combined soils. All of these descriptive statistics are summarized in Tables 6-7a and b.
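To make the interpolation step concrete, the following is a minimal ordinary kriging sketch in plain Python. It is not the procedure used in this study: the exponential variogram and its parameters (sill, range, nugget) are assumed for illustration, whereas in practice they would be fitted to the empirical variogram of the soil data, and the function names are hypothetical.

```python
import math


def exp_variogram(h, sill=1.0, rng=300.0, nugget=0.0):
    """Exponential semivariogram; all parameter values are assumptions."""
    if h == 0.0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-3.0 * h / rng))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(samples, target):
    """samples: list of (x, y, value); target: (x, y).

    Returns (estimate, weights). The weights are constrained to sum to one
    through a Lagrange multiplier, which is the last unknown in the system.
    """
    n = len(samples)
    a = [[exp_variogram(math.dist(samples[i][:2], samples[j][:2]))
          for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [exp_variogram(math.dist(s[:2], target)) for s in samples] + [1.0]
    weights = solve(a, b)[:n]
    estimate = sum(w * s[2] for w, s in zip(weights, samples))
    return estimate, weights


# Made-up sample locations (metres) and PCB levels (ppm).
soil = [(0, 0, 1.0), (100, 0, 3.0), (0, 100, 5.0)]
print(ordinary_kriging(soil, (40, 40)))
```

With a zero nugget, the estimate at a sampled location reproduces the measured value, so differences between kriged and buffer-based levels arise only where a property has no sample of its own.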

Table 6-7. Descriptive statistics for kriging analysis in ACHS and neurocognitive study (All
properties)

Table 6-7a Kriging analysis for Solutia and EPA, FCP and Combined datasets (ACHS)
(**: significantly different to the mean PCB levels of soil samples in focus sites at 0.05 level)
Kriging analysis N Minimum Maximum Mean
Average of the maximum kriged 616 0.15 16.99 0.72**
residential soils (ppm) (ppm) (ppm)
(Solutia and EPA dataset only)
Average of the maximum kriged 681 0.08 1.86 0.38**
residential soils (ppm) (ppm) (ppm)
(FCP dataset only)
Average of the maximum kriged 681 0.09 10.87 0.57**
residential soils (ppm) (ppm) (ppm)
(Combined dataset)
Table 6-7b Kriging analysis for Solutia and EPA, FCP and Combined datasets
(Neurocognitive study)
(**: significantly different to the mean PCB levels of soil samples in focus sites at 0.05 level)
Kriging analysis N Minimum Maximum Mean
Average of the maximum kriged 205 0.19 8.69 0.69**
residential soils (ppm) (ppm) (ppm)
(Solutia and EPA dataset only)
Average of the maximum kriged 207 0.06 1.86 0.37**
residential soils (ppm) (ppm) (ppm)
(FCP dataset only)
Average of the maximum kriged 220 0.14 14.41 0.62**
residential soils (ppm) (ppm) (ppm)
(Combined dataset)

After interpolating soil PCB levels for each ACHS and neurocognitive study participant by the kriging method, correlation analysis was carried out to compare the interpolated soil PCB levels with the corresponding serum PCB levels. However, no significant associations between soil and serum PCB levels were found in either the full ACHS sample or the full neurocognitive study sample, as shown in Tables 6-8a and b.

Table 6-8. Correlation matrix between soil levels extracted by kriging analysis and serum
levels in ACHS and neurocognitive study (All properties)

Table 6-8a Correlation matrix between soil levels extracted by kriging analysis and ACHS
participants serum levels (**: significant at 0.01 level, *: significant at 0.05 level)
Kriging analysis N Correlation coefficient
Average of the maximum kriged residential soils 616 -0.018
(Solutia and EPA dataset only)
Average of the maximum kriged residential soils 681 0.012
(FCP dataset only)
Average of the maximum kriged residential soils 681 0.009
(Combined dataset)
Table 6-8b Correlation matrix between soil levels extracted by kriging analysis and neurocognitive
study participants serum levels (**: significant at 0.01 level, *: significant at 0.05 level)
Kriging analysis N Correlation coefficient
Average of the maximum kriged residential soils 205 -0.020
(Solutia and EPA dataset only)
Average of the maximum kriged residential soils 207 0.072
(FCP dataset only)
Average of the maximum kriged residential soils 220 0.014
(Combined dataset)

Kriging analysis in focus sites

Significantly higher levels of soil PCB were found at both ACHS and neurocognitive study participants' properties when we considered only the group at high risk of PCB exposure, living in close proximity to the two significant sources of soil PCB contamination: the Monsanto plant and the hydrology near the plant. For the ACHS participants, the geographic coordinates of the current addresses of 75 properties were covered by the kriging method for all three soil datasets. As shown in Table 6-9a, the averages of the maximum kriged soil PCB levels in the focus sites were 2.51, 0.79 and 2.16 ppm for the Solutia and EPA, FCP and combined soil samples, respectively. In the same manner, the kriging method was used for the neurocognitive study participants in the focus sites and covered 29, 28 and 29 addresses for the Solutia and EPA, FCP and combined soil datasets, respectively. The averages of the maximum kriged soil levels for the neurocognitive study participants' properties in the focus sites were 2.08 ppm for Solutia and EPA soils; 0.67 ppm for FCP soils; and 2.26 ppm for combined soils, as shown in Table 6-9b.

Table 6-9. Descriptive statistics for kriging analysis in ACHS and neurocognitive study
(Focus sites)

Table 6-9a Kriging analysis for Solutia and EPA, FCP and Combined datasets (ACHS)
(**: significantly different to the mean PCB levels of soil samples in all properties at 0.05 level)
Kriging analysis N Minimum Maximum Mean
Average of the maximum kriged 75 0.31 16.99 2.51**
residential soils (ppm) (ppm) (ppm)
(Solutia and EPA dataset only)
Average of the maximum kriged 75 0.19 1.86 0.79**
residential soils (ppm) (ppm) (ppm)
(FCP dataset only)
Average of the maximum kriged 75 0.26 10.87 2.16**
residential soils (ppm) (ppm) (ppm)
(Combined dataset)
Table 6-9b Kriging analysis for Solutia and EPA, FCP and Combined datasets
(Neurocognitive study)
(**: significantly different to the mean PCB levels of soil samples in all properties at 0.05 level)
Kriging analysis N Minimum Maximum Mean
Average of the maximum kriged 29 0.31 8.69 2.08**
residential soils (ppm) (ppm) (ppm)
(Solutia and EPA dataset only)
Average of the maximum kriged 28 0.18 1.86 0.67**
residential soils (ppm) (ppm) (ppm)
(FCP dataset only)
Average of the maximum kriged 29 0.23 14.41 2.26**
residential soils (ppm) (ppm) (ppm)
(Combined dataset)

Lastly, correlation analysis was performed after interpolating soil PCB levels for each ACHS and neurocognitive study participant living in the focus sites. Interestingly, some of the correlation results in the focus sites showed significantly positive associations between soil and serum PCB levels in the neurocognitive study, even after correcting for outliers (r > 0.52, p = 0.001; note, though, the presence of four samples that do not fit the general trend). The results of the correlation analysis are presented in Tables 6-10a and b, and scatter plots in Figures 6-6a and b.

Table 6-10. Correlation matrix between soil levels extracted by kriging analysis and serum
levels in ACHS and neurocognitive study (Focus sites)

Table 6-10a Correlation matrix between soil levels extracted by kriging analysis and ACHS
participants serum levels (**: significant at 0.01 level, *: significant at 0.05 level)
Kriging analysis N Correlation coefficient
Average of the maximum kriged residential soils 75 -0.038
(Solutia and EPA dataset only)
Average of the maximum kriged residential soils 75 0.155
(FCP dataset only)
Average of the maximum kriged residential soils 75 0.044
(Combined dataset)
Table 6-10b Correlation matrix between soil levels extracted by kriging analysis and neurocognitive
study participants serum levels (**: significant at 0.01 level, *: significant at 0.05 level)
Kriging analysis N Correlation coefficient
Average of the maximum kriged residential soils 29 0.228
(Solutia and EPA dataset only)
Average of the maximum kriged residential soils 28 0.545**
(FCP dataset only)
Average of the maximum kriged residential soils 29 0.335*
(Combined dataset)

Figure 6-6a. Scatter Plot between soil levels extracted by kriging analysis and
neurocognitive study participants serum levels (before excluding four outliers)

Figure 6-6b. Scatter Plot between soil levels extracted by kriging analysis and
neurocognitive study participants serum levels (after excluding four outliers)
6.3 Discussion and interpretation

There are three major findings from this study. First, the distribution of soil PCB contamination, after accounting for spatial variation in the population at risk, is non-homogeneous. Second, properties whose participants have high levels of soil PCB exposure appear clustered in proximity to the focus sites, namely the Monsanto plant and the nearby hydrology, including off-site drainage ditches and streams. Third, statistically significant associations between increased risk of high serum PCB levels and soil PCB contamination near the focus sites were found using the correlation method. These findings correspond to and support our earlier observations in Chapter 3. This study, together with the previous one, provides a foundation for systematic approaches to assessing environmental exposures in a community and their potential effects on residents.

6.3.1 Proximity to sites

This study identifies proximity to PCB release sites (focus sites) as a significant contributor to increased soil PCB levels in the study area. For individuals residing within a 300 m radius of the focus sites, most of the average soil PCB levels extracted by the buffer methods were nearly three times greater than those for individuals across all properties (0.88 vs. 0.34 and 2.29 vs. 0.71 ppm in the 25 m radius, using the averages of the averaged and maximum combined residential soils in the ACHS, respectively; 1.05 vs. 0.38 and 2.21 vs. 0.79 ppm in the 50 m radius, using the averages of the averaged and maximum combined residential soils in the ACHS, respectively; 0.81 vs. 0.39 and 2.14 vs. 0.79 ppm in the 50 m radius, using the averages of the averaged and maximum combined residential soils in the neurocognitive study, respectively). Similarly, the kriging analyses revealed that the mean soil PCB concentration was significantly higher at the properties of both ACHS and neurocognitive study participants in close proximity to the Monsanto plant and the nearest ditches than at all properties (2.16 vs. 0.57 and 2.26 vs. 0.62 ppm for the averages of the maximum kriged residential soils in the ACHS and the neurocognitive study, respectively).

We also observed that the mean differences in these soil PCB levels between the focus sites and all properties were statistically significant using a two-sample t-test and one-way analysis of variance (ANOVA). It is reasonable to conclude that these sites significantly influence soil quality, which in turn could increase individuals' chance of contact with PCB-contaminated dust, soil, or dirt. It is therefore plausible that the Anniston community living in the vicinity of the focus sites is exposed to high levels of soil PCBs, which possibly contribute to an increased risk of elevated serum PCB levels in residents. Furthermore, this finding supports the positive association between serum PCB levels in ACHS participants and proximity to the Monsanto plant found in Chapter 3. Overall, the marked increase in soil PCB levels associated with PCB release sites highlights the need for mitigating actions, such as excavation of contaminated surface soils in residential yards located near the focus sites, in order to minimize potential health risks to residents.
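The comparison of mean soil PCB levels between two groups of properties can be sketched as a two-sample t-test. The source does not say whether a pooled-variance or unequal-variance test was used; the sketch below assumes the unequal-variance (Welch) form, and the function name and the numbers are illustrative only.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t-statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of the mean of a
    vb = variance(b) / len(b)  # squared standard error of the mean of b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up soil PCB levels (ppm) for two groups of properties.
focus_sites = [2.0, 4.0, 6.0]
other_sites = [1.0, 2.0, 3.0]
print(welch_t(focus_sites, other_sites))
```

The statistic is compared with a t distribution on the returned degrees of freedom; with only two groups, a one-way ANOVA with equal variances leads to the same conclusion as the pooled-variance t-test.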

6.3.2 Significant associations

Two statistically significant associations between soil and serum PCB levels near the focus sites, in contrast to the results for all properties, were identified using the correlation method. Although the buffer analyses in the neurocognitive study did not reach statistical significance, there was modest evidence of an increased risk of high serum PCB levels with high soil PCB levels in the ACHS (r > 0.50, p = 0.009). In addition, the kriging analyses showed a positive linear trend between soil and serum PCB levels for residents living within a 300 m radius of the focus sites in the neurocognitive study (r > 0.62, p = 0.001).

Although not all of the correlation results in the focus sites were significant, the observed associations were generally robust: their statistical significance was not changed to any great extent by the inclusion of age as a primary potential covariate in the regression models or by the exclusion of possible outliers. This indicates that individuals residing in close proximity to the two exposure sources, the Monsanto plant and the ditches, are at greater risk of high serum PCB levels than other individuals.

Our study had two limitations. First, the current analyses did not account for other possible confounders, such as nutritional factors and participants' work history, which would help determine whether the increase in serum PCB levels is related to the consumption of contaminated local food or to PCB-related jobs. Second, we were not able to obtain complete information on the residential history of the participants. This information could describe each participant's cumulative soil PCB exposure more specifically. Although there is some evidence associating an increased risk of high serum PCB levels with soil PCB contamination, further analysis should account for other obtainable risk factors in the models.

6.3.3 Interpretations

These interpretations rest on the following considerations. First, there is a significant release of PCBs from the Monsanto plant, from which a large amount of contaminated waste flows into off-site drainage ditches and streams near the plant. Such circumstances contributed to the deposition of vapour-phase PCBs on the surface of soils and plants near the focus sites and raised the risk of elevated serum PCB levels in focus-site residents. In particular, children living in the focus sites in the neurocognitive study are more likely to come into contact with contaminated dust or soil while playing outside, so contaminated soil or dust may play a quantifiable role in PCB exposure among the children. Furthermore, for children in the neurocognitive study, PCB exposure is much less subject to confounding by demographic and socioeconomic factors such as occupation and age. Thus, the significant results observed in the correlation analysis for the neurocognitive study participants living in the focus sites suggest a plausible association between soil and serum PCB levels among the children.

6.4 Summary

There are two implications of the study findings. First, current soil PCB contamination not only contributes to an increased risk of elevated serum PCB levels but may also contribute to adverse health effects for residents living in the focus sites, in comparison with other areas. Second, the identification of significant associations between soil and serum PCB levels among participants of both the ACHS and the neurocognitive study in the focus sites indicates a reasonable etiological pathway between PCB in soil and serum PCB levels. These implications are important for developing new hypotheses about the spatial distribution of soil PCB contamination and its possible adverse health effects in this community. Although this study does not account for some other potential confounders, including occupation and nutritional factors, it provides evidence on the focus sites at which PCB exposures may influence susceptible individuals.

Chapter 7 Conclusions

The goals of this case study were to explore the spatial patterns of PCB and 11 other chemical pollutants in soil, to identify contributing factors for the different types of pollutants, and to determine a potential association between serum PCB levels and PCB in soil in Anniston, Alabama. More specifically, in Chapter 3, we used regression models to identify indicators that explain the spatial patterns of soil and serum PCB levels in Anniston, Alabama. To examine the absolute magnitude of the effect of the explanatory variables, we calculated the coefficient of each predictor using linear regression, with levels of PCBs in soils and human serum taken as the two dependent variables and socioeconomic factors combined with two spatial factors as the independent variables.

In the socioeconomic and spatial model of soil PCBs, higher concentrations of soil PCBs were found in areas with a low percentage of African Americans, a low number of family households, a high percentage of housing units built before 1970, low education levels, and the presence of strip mines and gravel pits. All of these significant socioeconomic variables were closely associated with the two selected spatial variables, proximity to the Monsanto plant and proximity to the nearest off-site drainage ditches. In addition, in the socioeconomic and spatial model of serum PCBs, three variables (poverty levels, percentage of high school graduates, and vicinity to the Monsanto plant) were particularly important indicators in explaining the distribution patterns of serum PCB levels in individuals living in the study area. Furthermore, spatial lag and spatial error models, estimated using maximum likelihood and two-stage least squares (2SLS), respectively, were used to correct for spatial effects and non-constant error variance in the OLS regression. The spatial regressions increased the R2 values and reduced the absolute magnitude of many of the coefficients in comparison with the OLS coefficients.

Chapter 4 emphasized that the operation of foundries in Anniston, Alabama, for well over 100 years has resulted in extensive environmental lead (Pb) contamination in the Anniston community of about 23,000 residents. Since excess environmental exposure to lead is a potential public health concern, it is important to describe and understand the heterogeneous spatial distribution of lead in the Anniston community to assist remediation efforts aimed at reducing the potential for excess human exposure and the associated health risk. That chapter focused on the spatial distribution of lead in Anniston, Alabama, in the vicinity of foundries and railroads.

This study used regression models to identify predictors that explain soil lead levels, which range up to 7,715 ppm. To examine the magnitude of the effect of the explanatory variables, the coefficient of each predictor was calculated using ordinary linear regression, with levels of lead in soils taken as the dependent variable and physical factors as independent variables. In the physical and environmental model, two variables, distance to railroads and distance to one or more foundries, were particularly important in explaining soil lead concentrations within the study area. Subsequently, geographically weighted regression (GWR) was performed to correct for spatial nonstationarity in the ordinary least squares (OLS) regression. GWR considers the spatial variation in the associations between explanatory and dependent variables, whereas ordinary regression generates only a single regression equation that yields global associations between variables. The GWR increased R2 values and reduced errors of model fit in comparison with the OLS estimates.
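The idea behind GWR can be illustrated with a small sketch: at each location, a separate weighted least squares line is fitted, with nearby observations weighted more heavily than distant ones. The sketch below assumes a Gaussian distance kernel, a fixed bandwidth and a single predictor; these choices, the function name and the data are illustrative assumptions and do not reproduce the models fitted in Chapter 4.

```python
import math


def gwr_local_fit(points, target, bandwidth):
    """Fit a local weighted regression line at one location.

    points: list of (x, y, predictor, response); target: (x, y).
    Observations are weighted by a Gaussian kernel of their distance to the
    target. Returns (intercept, slope) of the local regression line.
    """
    w = [math.exp(-0.5 * (math.dist((px, py), target) / bandwidth) ** 2)
         for px, py, _, _ in points]
    sw = sum(w)
    xbar = sum(wi * p[2] for wi, p in zip(w, points)) / sw
    ybar = sum(wi * p[3] for wi, p in zip(w, points)) / sw
    sxx = sum(wi * (p[2] - xbar) ** 2 for wi, p in zip(w, points))
    sxy = sum(wi * (p[2] - xbar) * (p[3] - ybar) for wi, p in zip(w, points))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Made-up data: the response follows y = 1 * x in the west and y = 3 * x in
# the east, so a single global slope would describe neither area well.
points = [(0, 0, 1.0, 1.0), (10, 0, 2.0, 2.0), (0, 10, 3.0, 3.0),
          (1000, 0, 1.0, 3.0), (1010, 0, 2.0, 6.0), (1000, 10, 3.0, 9.0)]
print(gwr_local_fit(points, (0, 0), 50.0))
print(gwr_local_fit(points, (1000, 0), 50.0))
```

A global OLS fit to the same data returns one compromise slope, which is the contrast drawn above between a single regression equation and locally varying associations.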

In Chapter 5, the average regional concentrations of 11 heavy metals (Pb, As, Cd, Cr, Co, Cu, Mn, Hg, Ni, V, Zn) were determined in soil samples collected from various industrial sites in Anniston, Alabama, which contains a large number of chemical foundries. Soils were also sampled in residential and relatively less polluted zones. Two multivariate statistical methods, Principal Component Analysis (PCA) and a Kohonen Self-Organizing Map (SOM), were applied to classify heavy metals in soils and to characterize the risk at polluted sites. In addition, kriging was used to create regional distribution maps by interpolating non-point sources of heavy metal contamination using geographic information system (GIS) techniques.

Significant differences in heavy metal concentrations were found between sampling zones, with the exception of Ni. Three main factors explaining the heavy metal variability in soils were identified. Pb, Cd, Cu and Zn were primarily controlled by anthropogenic activities, such as the operation of chemical foundries and major railroads, whereas the presence of Co and Mn, and of V alone, was also associated with natural sources such as soil texture, pedogenesis and soil hydrology. Generally, the levels of the chemical contaminants analyzed in this study were higher than those reported in previous studies in other regions.

Lastly, in Chapter 6, we used buffer and kriging analyses to estimate residential soil PCB levels for participants in the ACHS and the neurocognitive study. We created 25 m and 50 m radius buffers around each participant's property in the ACHS and the neurocognitive study and analyzed the PCB levels of the soil samples contained in each buffer. In addition, kriging, a geostatistical interpolation method that predicts the value at an unsampled location from samples at nearby locations, was applied to estimate soil PCB levels in unsampled areas.

Most of the mean soil PCB levels extracted by the buffer methods within the 25 m and 50 m radii were nearly three times higher at properties within a 300 m buffer of the Monsanto plant and the nearest off-site drainage ditches (focus sites) than across all properties (0.88 vs. 0.34 and 2.29 vs. 0.71 ppm in the 25 m radius, using the averages of the averaged and maximum combined residential soils in the ACHS, respectively; 1.05 vs. 0.38 and 2.21 vs. 0.79 ppm in the 50 m radius, using the averages of the averaged and maximum combined residential soils in the ACHS, respectively; 0.81 vs. 0.39 and 2.14 vs. 0.79 ppm in the 50 m radius, using the averages of the averaged and maximum combined residential soils in the neurocognitive study, respectively). Similarly, the kriging analyses revealed that the mean soil PCB concentration was significantly higher at the properties of ACHS and neurocognitive study participants in close proximity to the Monsanto plant and the nearest ditches than at all properties (2.16 vs. 0.57 and 2.26 vs. 0.62 ppm for the averages of the maximum kriged residential soils in the ACHS and the neurocognitive study, respectively).

Significant associations between soil and serum PCB levels in both the ACHS and the
neurocognitive study were found only in the focus sites, not across all properties
(r > 0.50, p = 0.009 for the ACHS and r > 0.62, p = 0.001 for the neurocognitive study).
This indicates that individuals residing in close proximity to the two exposure sources,
the Monsanto plant and the ditches, are at greater risk of high serum PCB levels than
other individuals. Activities to reduce soil PCB contamination (e.g., excavation of
surface soils in residential yards) and continuous monitoring of PCB exposure are
strongly recommended.
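
The soil-serum association reported above is expressed as a correlation coefficient. This section does not state which coefficient was used, so the sketch below assumes Pearson's r and also computes the corresponding t statistic; the function names and the paired values are hypothetical, not study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical paired values: soil PCB (ppm) and serum PCB (arbitrary units).
soil = [1.0, 2.0, 3.0, 4.0, 5.0]
serum = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(soil, serum)
t = t_statistic(r, len(soil))
print(round(r, 4), round(t, 4))
```

The p-value would then come from the t distribution with n - 2 degrees of freedom, which a statistics library can supply.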

However, further evaluation of spatial and socioeconomic factors, of serum PCB
triggers and soil PCB exposure, and of how these findings correlate with serum-soil PCB
contamination in Anniston is warranted before a definitive link can be made between the
increased risk from soil PCB and heavy metal concentrations, serum PCB levels, and their
identified sources of contamination.

References

Abernethy RF, Gibson F. 1963. Bur. Mines Inform. Circ. 8163.

Anderson MA, Balliett RW, Link PE, Satchell DP. 1983. Method of fixing hazardous

substances in waste foundry sand. United States Patent.

Adamus, L.L., Bergman, M.K., 1995. Estimating nonpoint source pollution loads with a
GIS screening model. Water Resources Bulletin 31(4): 647-655.
Anderson HA, Falk C, Hanrahan L, Olson J, Burse VW, Needham L, Paschal D, Patterson D
Jr, Hill RH Jr, Boddy J, Budd M, Burkett M, Fiore B, Humphrey HEB, Johnson R,
Kanarek M, Lee G, Monaghan S, Reed D, Shelley T, Sonzogni W, Steele G, Wright
D., 1998. Profiles of Great Lakes critical pollutants: A sentinel analysis of human
blood and urine. Environmental Health Perspectives, 106: 279-289.

Anselin, L. 1995. Local Indicators of Spatial Association - LISA. Geographical Analysis
27: 93-115.

Anselin, L., I. Syabri, and Y. Kho. 2006. GeoDA: An Introduction to Spatial Data Analysis.
Geographical Analysis, 38: 5-22.
Agency for Toxic Substances and Disease Registry (ATSDR), Draft Toxicological Profile

for Polychlorinated Biphenyls, Research Triangle Park, North Carolina, August,

1995.

Agency for Toxic Substances and Disease Registry (ATSDR), Health Consultation:
Monsanto Company, Anniston Calhoun County, Alabama, January 17, 1996.
Agency for Toxic Substances and Disease Registry (ATSDR), Health Consultation:
Evaluation of Soil, Blood & Air Data from Anniston, Alabama. February 14, 2000a.
Agency for Toxic Substances and Disease Registry (ATSDR), Toxicological Profile for
Polychlorinated Biphenyls (PCBs). November, 2000b.
Agency for Toxic Substances and Disease Registry (ATSDR), Health Consultation: Exposure
Investigation. October 22, 2001.
Agency for Toxic Substances and Disease Registry (ATSDR), ToxFAQs for Polychlorinated
Biphenyls (PCBs), February, 2001.
Agency for Toxic Substances and Disease Registry (ATSDR), Health Consultation:
Polychlorinated Biphenyls, Dioxins, and Pesticides in Soil, Blood and Air from
Anniston, Alabama, July 30, 2003.
Agency for Toxic Substances and Disease Registry (ATSDR), Toxicological Profile for

Mercury. US Department of Health and Human Services, Public Health Service,

Agency for Toxic Substances and Disease Registry, Atlanta, GA, 1999.

Agency for Toxic Substances and Disease Registry (ATSDR), Toxicological Profile for

Arsenic. US Department of Health and Human Services, Public Health Service,

Agency for Toxic Substances and Disease Registry, Atlanta, GA, 2007.

Agency for Toxic Substances and Disease Registry (ATSDR), Toxicological Profile for

Cadmium. US Department of Health and Human Services, Public Health Service,

Agency for Toxic Substances and Disease Registry, Atlanta, GA, 2008a.

Agency for Toxic Substances and Disease Registry (ATSDR), Toxicological Profile for

Chromium. US Department of Health and Human Services, Public Health Service,

Agency for Toxic Substances and Disease Registry, Atlanta, GA, 2008b.

Agency for Toxic Substances and Disease Registry (ATSDR), Toxicological Profile for
Manganese. US Department of Health and Human Services, Public Health Service,
Agency for Toxic Substances and Disease Registry, Atlanta, GA, 2008c.

Agency for Toxic Substances and Disease Registry (ATSDR), Toxicological Profile for
Vanadium. US Department of Health and Human Services, Public Health Service,
Agency for Toxic Substances and Disease Registry, Atlanta, GA, 2009.

ASTM, 1995. Standard Guide for Risk-based Corrective Action Applied at Petroleum
Release Sites-RBCA, E 1739-95. ASTM, West Conshohocken, PA, USA, pp. 133-
145.
Bailey TC, Gatrell AC (1995) Interactive spatial data analysis. Longman Scientific &
Technical, Essex, England.
Bellander, T., Berglind, N., Gustavsson, P., Jonson, T., Nyberg, F., Pershagen G., and Jarup,

L., 2001. Using Geographic Information Systems to assess individual historical

exposure to air pollution from traffic and house heating in Stockholm.

Environmental Health Perspectives, 109: 633-639.

Bernardinelli, L. and Montomoli, C., 1992. Empirical Bayes versus fully Bayesian analysis
of geographical variation in disease risk. Statistics in Medicine, 11: 983-1007.
Best N, Richardson S & Thomson A., 2005. A comparison of Bayesian spatial models for
disease mapping. Statistical Methods in Medical Research, 14: 35-59.
Bhopal, R. S., Moffatt, S., Pless-Mulloli, T., Phillmore P. R., Foy, C., Dunn C. E., Tate J.
A., 1998. Does living near a constellation of petrochemical, steel, and other industries
impair health? Occupational and Environmental Medicine, 55: 812-822.

Borrell L, Factor-Litvak, Wolff M, Susser E, Matte T., 2004. Effect of socioeconomic
status on exposures to polychlorinated biphenyls (PCBs) and
dichlorodiphenyldichloroethylene (DDE) among pregnant African-American
women. Arch Environ Health, 59: 250-255.
Breslow, N. E. and Clayton, D. G., 1993. Approximate inference in generalized linear
mixed models. Journal of the American Statistical Association, 88: 9-25.
Brosse, S., Giraudel, J.L., Lek, S., 2001. Utilisation of non-supervised neural networks and
principal component analysis to study fish assemblages. Ecol. Model. 146: 159-166.
Brunekreef, B., Janssen, N., Hartog, J., Harssema, H., Knape, M., and Vliet, P., 1997. Air

pollution from truck traffic and lung function in children living near motorways.

Epidemiology, 8: 298-303.

Burns, W.A., Mankiewicz, P.J., Bence, Al.E., Page, D.S., Parker, K.R., 1997. A principal
component and least squares method for allocating polycyclic aromatic hydrocarbons
in sediment to multiple sources. Environmental Toxicology and Chemistry, 16: 1119-
1131.

Carlon, C., Critto, A., Marcomini, A., Nathanail, P., 2001. Risk based characterization of
contaminated industrial site using multivariate and geostatistical tools,
Environmental Pollution, 111: 417-427.
CDC, Fourth National Report on Human Exposure to Environmental Chemicals. 2010,
Centers for Disease Control and Prevention, Atlanta, GA:
http://www.cdc.gov/exposurereport.
Chang, L. W., 1999. Toxicology of metals. Boca Raton, FL: CRC Lewis Publishers.

Charlton ME, Fotheringham S, Brunsdon C: Geographically weighted regression
version 2.x, User's manual and installation guide; no date.
Christensen, J. M., 1995. Human exposure to toxic metals: factors influencing

interpretation of biomonitoring results. Science of the Total Environment, 166: 89-

135.

Cliff, A.D., Haggett, P. and Ord, J.K. 1986. Spatial aspects of influenza epidemics. London:
Pion.
CONCAWE, 1997. European Oil Industry Guideline for Risk-based Assessment of
Contaminated Sites (Final Report by the Water Quality Management Group).
CONCAWE, Brussels.

Corwin, D.L., Wagnet, R.J., 1996. Applications of GIS to the modeling of nonpoint
pollutants in the vadose zone: a conference overview. Journal of Environmental
Quality, 25: 403-411.
Dakins, M.E., Toll, J.E., Small, M.J., 1994. Risk-based environmental remediation:
decision framework and role of uncertainty. Environmental Toxicology and
Chemistry 13 (12), 1907-1915.

Dan, A., Oosterbaan, J., Jamet, P., 2002. Contribution des reseaux de neurones artificiels
(RNA) a la caracterisation des pollutions de sol. Exemples des pollutions en
hydrocarbures aromatiques polycycliques (HAP). C.R. Geosci. 334: 957-965.

Dean, C. B. and MacNab, Y. C., 2001. Modelling of rates over a hierarchical health
administrative structure. The Canadian Journal of Statistics, 29: 405-419.
Devine OJ, Louis TA et al., 1994. Empirical Bayes estimators for spatially correlated
incidence rates. Environmetrics, 5: 381-398.
Diggle, P., Moyeed, R., Rowlingson, B., Thompson, M., 2002. Childhood malaria in the
Gambia: a case-study in model-based geostatistics. Applied Statistics, 51: 493-506.
Doll, R., 1980. The epidemiology of cancer. Cancer 45: 2475-2485.
Dubos, R., 1965. Man adapting. New Haven, CT: Yale University Press.
Duhme, H., Weiland, S.K., Keil, U., Kraemer, B., Schmid, M., Stender, M., and Chambless,

L., 1996. The association between self-reported symptoms of asthma and allergic

rhinitis and self-reported traffic density on street of residence in adolescents.

Epidemiology, 7: 578-582.

Ebbinghaus, E., Kreeb, K.J., Weinmannkreeb, R., 1997. GIS supported monitoring
long-termed urban trace element loads with bark of Aesculus hippocastanum. Journal of
Applied Botany-Angewandte Botanik, 715 (6): 205-211.

Facchinelli, A., Sacchi, E., Mallen, L., 2001. Multivariate statistical and GIS-based
approach to identify heavy metal sources in soils. Environmental Pollution, 114:
313-324.
Falk C, Hanrahan L, Anderson HA, Kanarek MS, Draheim L, Needham L, Patterson D Jr,
Boddy J, Budd M, Burkett M, Fiore B, Humphrey HEB, Johnson R, Lee G,
Monaghan S, Reed D, Shelley T, Sonzogni W, Steele G, Wright D., 1999. Body
burden levels of dioxin, furans, and PCBs among frequent consumers of Great Lakes
sport fish. Environmental Research, 80: S19-S25.

Ferguson, C.C., 1998. Techniques for handling uncertainty and variability in risk
assessment models. Umweltbundesamt, Berlin.
Ferguson, C.C., Darmendrail, D., Freier, K., Jensen, B.K., Jensen, J., Kasamas, H., Urzelai,
A., Vegter, J. (Eds.), 1998. Better methods for risk assessment. In: Risk Assessment
for Contaminated Sites in Europe, Vol. 1. Scientific Basis. LQM Press, Nottingham,
pp. 135-146.

Figueira, R., Sergio, C., and Sousa, A. J., 2002. Distribution of trace metals in moss

biomonitors and assessment of contamination sources in Portugal. Environmental


Pollution, 118: 153-163.

Fotheringham AS, Brunsdon C, Charlton ME., 1998. Geographically weighted regression:


a natural evolution of the expansion method for spatial data analysis. Environ and
Plann A, 30:1905-1927.

Fotheringham AS, Brunsdon C, Charlton ME., 2000. Quantitative geography; perspectives


on spatial data analysis. London: Sage Publications.
Fotheringham AS, Brunsdon C, Charlton ME., 2002. Geographically weighted regression:
the analysis of spatially varying relationships. New York: Wiley.
Gemperli A 2003. Development of spatial statistical methods for modeling point-
referenced spatial data in malaria epidemiology. PhD thesis, University of Basel, pp.
111-134.

GeoDa. 2010. GeoDASpace. GeoDa Center, Arizona State University, Tempe, AZ.
[http://www.geodacenter.asu.edu/node/526].
Gilks, W. R., Best, N. G., and Tan, K. K. C. (1994). Adaptive Rejection Metropolis
Sampling within Gibbs Sampling. Cambridge, U.K.: MRC Biostatistics Unit.
Haining, R., 1998. Spatial statistics and the analysis of health data. In: Gatrell AC, Loytonen
M, GIS and health. Taylor and Francis, London, pp 29-48.
Han, D, 2003. Geographical Epidemiology of Breast Cancer in Western New York:
Exploring Spatio-Temporal Clustering in GIS.
Hanrahan LP, Falk C, Anderson HA, Draheim L, Kanarek MS, Olson J, Boddy J, Budd M,
Burkett M, Fiore B, Humphrey H, Johnson B, Lee G, Monaghan S, Reed D, Shelley
T, Sonzogni W, Steele G, Wright D, Steenport D., 1999. Serum PCB and DDE levels
of frequent Great Lakes sport fish consumers: a first look. Environmental Research,
80: S26-S37.

Heinze, I., Gross, R., Stehle, P., and Dillon, D., 1998. Assessment of Lead Exposure in
Schoolchildren from Jakarta. Environmental Health Perspectives, 106: 499-501.

Hoaglin, D.C., Welsch, R.E., 1978. The hat matrix in regression and ANOVA. The
American Statistician, 32: 17-22.
Hull, D., 1979. Migration, adaptation, and illness: a review. Social Science and Medicine 13:
25-36.
Hwang, S.A., Fitzgerald, E.F., Cayo, M., Yang, B.Z., Tarbell, A., Jacobs, A., 1999.
Assessing environmental exposure to PCBs among Mohawks at Akwesasne through

the use of geostatistical methods. Environmental Research, 80(2):S189-S199.
Jacquez, G.M., 2000. Spatial analysis in epidemiology: Nascent science or a failure of GIS?
Journal of Geographical Systems 2: 91-97
Juang, J.W., and Lee, D.Y., 1998. A comparison of three Kriging methods using auxiliary
variables in heavy-metal contaminated soils. Journal of Environmental Quality, 27:
355-363.

Kaldor, J., Harris, J. A., Glazer, E., Glaser, S., Neutra, R., Mayberry, R., Nelson, V.,

Robinson, L., and Reed, D. 1984. Statistical association between cancer incidence

and major-cause mortality, and estimated residential exposure to air emissions from

petroleum and chemical plants. Environmental Health Perspectives, 54:319-332.

Kelejian, H.H., and I.R. Prucha. 2007. HAC Estimation in a Spatial Framework. Journal of
Econometrics, 140: 131-144.
Kelejian, H.H. and I.R. Prucha. 2010. Specification and Estimation of Spatial
Autoregressive Models with Autoregressive and Heteroskedastic Disturbances.
Journal of Econometrics, 157: 53-67.

Kerzhentsev, A.S., Khakimov, F.I., and Deeva, N.F., 1997. Ecological situation in the city
of Serpukhov and Serpukhov region, Moscow Oblast; Report 1996, in Tsongas, T.
and Butcher, W. (eds.), Proceedings of an Environmental Policy and Planning
Workshop Held at Pushchino State University, Pushchino, Russia from March 17-
March 21, 3-1 to 3-7.

Klein, M., 2002. History of the Louisville & Nashville railroad. University Press of
Kentucky.
Kohonen, T., 1982. Self-organized formation of topologically correct feature maps.
Biological Cybernetics, 43: 59-69.
Kohonen, T., 2001. Self-Organizing Maps. 3rd edition, Springer, Berlin, Heidelberg.
Kohli, S., Brage, H.N., and Lofman, O., 2000. Childhood leukaemia in areas with different

radon levels: a spatial and temporal analysis using GIS. Journal of Epidemiology

and Community Health, 54: 822-826.

Krige, D. G., 1951. A statistical approach to some basic mine valuation problems on the
Witwatersrand. Journal of the Chemical, Metallurgical and Mining Society of South
Africa, 52: 119-139.
Lawson, A., (ed) 1999. Disease mapping and risk assessment for public health. Wiley,
London
Lee Brass: Lee Brass Company History. http://www.leebrass.com/history.htm [cited 2010
Feb 24].

Lenntech Website: http://www.lenntech.com/periodic/elements/cu.htm.

Levine, N., 2004. CrimeStat III: A Spatial Statistics Program for the Analysis of Crime

Incident Locations, Houston, TX and the National Institute of Justice.

Lin, M. C., Yu, H. S., Tsai, S. S., Cheng, B. H., Hsu, T. Y., Wu, T. N., and Yang, C. Y.

2001. Adverse pregnancy outcome in a petrochemical polluted area in Taiwan.

Journal of Toxicology and Environmental Health, 63: 565-574.

Maiz, I., Arambarri, I., Garcia, R., and Millan, E., 2001. Evaluation of heavy metal

availability in polluted soils by two sequential extraction procedures using factor

analysis. Environmental Pollution, 110: 3-9.

Marmot M, Wilkinson, R., 2001. Psychosocial and material pathways in the relation
between income and health: a response to Lynch, et al. British Medical Journal, 322:
1233-1236.

Maruyama, W., Yoshida, K., Nakanishi, J., 2002. Determinations of tissue-blood partition
coefficients, for a physiological model for humans, and estimation of dioxin
concentration in tissues. Chemosphere, 46: 975-985.

Matheron, G.,1963. Principles of geostatistics. Economic Geology, 58: 1246-1266.


McGrath, D., Zhang, C., Carton, O. T., 2004. Geostatistical analyses and hazard
assessment on soil lead in Silvermines area, Ireland. Environmental Pollution, 127:
239-248.
McKinlay, J.B. 1975. Some issues associated with migration, health status, and the use of
health services. Journal of Chronic Diseases, 29: 579-92.
Meade, M.S., Earickson, R.J., 2000. Medical Geography. The Guilford Press. New York.
Mehlman, M.A. 1992. Dangerous and cancer-causing properties of products and chemicals

in the oil refining and petrochemical industry. VIII. Health effects of motor fuels:

carcinogenicity of gasoline-scientific update. Environ. Res., 226: 238-249.

Mennis, J., 2006. Mapping the results of geographically weighted regression. The
Cartographic J , 43:171-179.
Mielke, H.W., Dugas, D., Mielke, P.W., Smith, K.S., and Gonzales, C.R., 1997.
Associations between soil lead and childhood blood lead in urban New Orleans and rural
Lafourche Parish of Louisiana. Environmental Health Perspectives, 105: 950-954.

Monmonier, M., 1996. How to lie with maps. The University of Chicago Press, Chicago, IL
Moragues, A., Alcaide, T., 1996. The use of a geographical information system to assess
the effect of traffic pollution. Science of the Total Environment, 190: 267-273.
Moran, P., 1948. The interpretation of statistic maps. Journal of the Royal Statistical Society
Series B, 10: 245-51.
Moran, P., 1950. Notes on Continuous Stochastic Phenomena. Biometrika, 37: 17-23.
Nadal, M., Espinosa, G., Schuhmacher, M., Domingo, J.L., 2004. Patterns of PCDDs and
PCDFs in human milk and food and their characterization by artificial neural
networks. Chemosphere, 54: 1375-1382.
Nadal, M., Kumar, V., Schuhmacher, M., Domingo, J.L., 2006. Definition and GIS-based
characterization of an integral risk index applied to a chemical/ petrochemical area.
Chemosphere, 64: 1526-1535.

Nathanail, C.P., Ferguson, C.C., Brown M.J., Hooker, P.J., 1998. A geostatistical approach
to spatial risk assessment of lead in urban soils to assist planners. In: Moore, D.,
Hungr, O. (Eds.), Proceedings of the Eighth International Congress International
Association for Engineering Geology and the Environment. Bakema, Rotterdam, pp.
2433-2437.

Olson, J.R., Stephen, F.D., Vena, J.E., Greizerstein, H., and Kostyniak, P.J. Exposure
to Halogenated Aromatic Hydrocarbons (HAHs) and Expression of CYP1A1 in
Consumers of Lake Ontario Fish and Wildlife. International Conference on Chemical
Mixtures. Atlanta, Georgia, Sept. 10-12, 2002.

Olson, S.J., 2007. Use of GIS PCB Contamination in Anniston Alabama, Master Thesis.

University at Buffalo.

Ormsby, T., Alvi, J., 1999. Extending ArcView GIS. Environmental System Research
Institute, Inc. Redlands, CA.
Oyana, T.J., Rogerson, P., and Lwebuga-Mukasa, J.S., 2004. Geographic clustering of adult

asthma hospitalization and residential exposure to pollution at a United States-

Canada border crossing. American Journal of Public Health, 94: 1250-1257.

Pan, X., Day, H.W., Wang, W., Beckett, L.A., and Schenker, M.B., 2005. Residential

proximity to naturally occurring asbestos and mesothelioma risk in California. Am J

Respir Crit Care Med, 172: 1019-1025.

Park, Y.S., Chon, T.S., Kwak, I.S., Lek, S., 2004. Hierarchical community classification
and assessment of aquatic ecosystems using artificial neural networks. Science of
Total Environment, 327: 105-122.

Parekh, P. P., Khwaja, H. A., Khan, A. R., Naqvi, R. R., Malik, A., Khan, K., and

Hussain, G. Lead content of petrol and diesel and its assessment in an urban

environment. Environmental Monitoring and Assessment, 74: 255-262.

Pavuk, M., Cerhan, J.R., Lynch, C.F., Schecter, A., Petrik, J., Chovancova, J., Kocan, A.,
2004. Risk based characterization of contaminated industrial site using multivariate
and geostatistical tools, Environmental Pollution 111: 417-427.

Pickle, L.W., Mungiole, M., Jones, G.K., White, A.A., 1996. Atlas of United States
Mortality. Hyattsville, MD: National Center for Health Statistics.

Plant Nutrients Website: http://www.agr.state.nc.us/cyber/kidswrld/plant/nutrient.htm

Prothero, R. M., 1961. Population movements and problems of malaria eradication in


Africa, World Health Organizational Bulletin, 24: 405-425.
Prothero, R. M. 1965. Migrants and malaria. London: Longmans.
Prothero, R. M. 1977, Disease and mobility: a neglected factor in epidemiology.
International Journal of Epidemiology 6: 259-267.

Pulugurtha, S., Vanjeeswaran, K., and Nambisan, S., 2003. Development of criteria to
identify pedestrian high crash locations in Nevada. Nevada Department of
Transportation, Carson City, NV.

Raso G, Matthys B, N'Goran EK, Tanner M, Vounatsou P & Utzinger J., 2005. Spatial risk
prediction and mapping of Schistosoma mansoni infections among schoolchildren
living in western Côte d'Ivoire. Parasitology, 131: 1-12.

Reissman, D.B., Staley, F., Curtis, G.B., Kaufmann, R.B., 2001. Use of Geographic
Information System technology to aid health department decision making about
childhood lead poisoning prevention activities. Environmental Health
Perspectives,109: 89-94.
Richardson, S., Montfort, C., 2000. Ecological correlation studies. In: Spatial Epidemiology:
Methods and Applications (Elliott P, Wakefield J, Best N, Briggs D, eds).
Oxford: Oxford University Press, 205-220.

Rogerson, P., 2006. Statistical Methods for Geography, Sage Publications, London, UK.
Rogerson, P., Yamada, I., 2009. Statistical Detection and Surveillance of Geographic
Clusters, Taylor & Francis Group, Boca Raton, FL.
Rushton G, Krishnamurty R, Krishnamurti D, Lolonis P, Song H. 1996. The spatial
relationship between infant mortality and birth defect rates in a U.S. city. Stat Med
15: 1907-1919.

Schantz, S.L., 1996. Developmental Neurotoxicity of PCBs in Humans: What Do We


Know and Where Do We Go From Here? Neurotoxicology and Teratology, 18:
217-227.
Schneider, R., 2001. Development of a Proactive Approach to Pedestrian Safety Planning,
Master project in City and Regional Planning, University of North Carolina, Chapel
Hill, NC.

Schneider, R., Ryznar, R., Khattak, A., 2004. An Accident Waiting to Happen: A spatial
Approach to Proactive Pedestrian Planning. Accident Analysis and Prevention, 36:
193-211.

Schroder, J. L., Basta, N. T., Lochmiller R. L., Rafferty, D. P., Payton, M., Kim, S., and
Qualls, C. W Jr., 2000. Soil contamination and bioaccumulation of inorganics on
petrochemical sites. Environmental Toxicology and Chemistry, 19: 2066-2072.

Schuhmacher, M., Bocio, A., Agramunt, M. C., Domingo, J. L., 2002. PCDD/F and
metal concentrations in soil and herbage samples collected in the vicinity of a
cement plant. Chemosphere, 48: 209-217.
Schuhmacher, M., Nadal, M., Domingo, J. L., 2004. Levels of PCDD/Fs, PCBs, and
PCNs in soils and vegetation in an area with chemical and petrochemical
industries. Environmental Science and Technology, 38: 1960-1969.
Shang, J.Q., Ding, W., Rowe, R.K., Josic, L., 2004. Detecting heavy metal contamination
in soil using complex permittivity and artificial neural networks. Canadian
Geotechnical Journal, 41: 1054-1067.

Smith, M.J., Goodchild, M.F., and Longley, P.A., 2006. Geospatial analysis- a

comprehensive guide to principles, techniques and software tools. 3rd edition.

Soldi, T., Riolo, C., Alberti, G., Gallorini, M., and Peloso, G. F., 1996. Environmental

vanadium distribution from an industrial settlement. Science of the Total

Environment, 181: 133-142.

Southern Railway Historical Association. Southern railway history.


http://www.srha.net/public/History/history.htm [cited 2010 Feb 24].

Stephen, F.D., Vena, J.E., Greizerstein, H., Kostyniak, P.J., and Olson, J. R., 2001. Serum

PCB, PCDD, PCDF, and Pesticide Levels in Consumers and Non-consumers of

Lake Ontario Wildlife. The Toxicologist, 60: 21.

Stewart, P., Darvill, T., Lonky, E., Reihman, J., Pagano, J., Bush, B., 1999. Assessment of
Prenatal Exposure to PCBs from Maternal Consumption of Great Lakes Fish: An
Analysis of PCB Pattern and Concentration. Environmental Research Section A 80:
87-96.
Stigliani, W. M., 1993. Overview of the chemical time bomb problem in Europe. In

Chemical time bombs. Proceedings of the European State-of-the-art Conference on

Delayed Effects of Chemicals in Soils and Sediments; Meulen, G.R.B., Stigliani, W.

M., Salomons, W., Bridges, E. M., Imeson, A. C., Eds.; Hoofddorp, the Netherlands,

p. 13-29.

Stocks P., 1936. Distribution in England and Wales of cancer of various sites. Annual
Report of British Empire Cancer Campaign, 13: 239-280.
Stockwell, J.R., Sorensen, J.W., Eckert J.W., and Carreras, E.M., 1993. The U.S. EPA

Geographic Information System for mapping environmental releases of Toxic

Chemical Release Inventory (TRI). Chemical Risk Analysis, 13:155-164.

Swerdlow A, dos Santos Silva I. 1993. Atlas of Cancer Incidence in England and Wales
1968-85. Oxford: Oxford University Press.
Tanabe, S., 1998. PCB problems in the future: Foresight from current knowledge.
Environmental Pollution, 50: 5-28.
Tran, L.T., Knight, C.G., O'Neill, R.V., Smith, E.R., O'Connell, M., 2003. Self-organizing
maps for integrated environmental assessment of the Mid-Atlantic region.
Environmental Management. 31: 822-835.

Tsongas, T., Orlinskii, D., Priputina, I., Pleskachevskaya, G., Fetishchev, A., Hinman, G.,
and Butcher, W., 2000. Risk analysis of PCB exposure via the soil-food crop
pathway, and alternatives for remediation at Serpukhov, Russian Federation. Risk
Analysis, 20: 73-79.

ten Tusscher, G.W., Koppe, J.G., 2004. Perinatal dioxin exposure and later effects- a
review. Chemosphere. 54: 1329-1336.
US EPA, 1989. Risk Assessment Guidance for Superfund, Vol. 1. Human Health

Evaluation Manual (540/1-89/002). US-EPA, Washington.


US EPA and Solutia Inc., Final Consent Decree, October, 2002.

US EPA, Anniston PCB Site Operable Units 1 and 2 Baseline Human Health Risk

Assessment Anniston, Alabama, September, 2008.

US EPA, Human Health Risk Assessment Fact Sheet, January, 2008.

US EPA, Final Pathways Analysis Report for the Baseline Risk Assessment for Anniston

PCB Site Operable Unit 4 Anniston, Alabama, December, 2009.

US EPA, EPA Administrative Agreement and Order on Consent.

http://foothillscommunitypartnership.com/ 2005 [cited 2010 Feb 20].

US EPA, Anniston PCB Site. Environmental Protection Agency, Washington, DC:


http://www.epa.gov/region4/waste/npl/nplal/annpcbal.htm [cited 2011 May 24].
Valjus, J., Hongisto, M., Verkasalo, P., Jarvinen, P., Heikkila, K., and Koskenvuo, M.,
1995. Residential exposure to magnetic fields generated by 110-400 kV power lines
in Finland. Bioelectromagnetics, 16: 365-376.

Vesanto, J., 1999. SOM-Based Data Visualization Methods. Intelligent Data Analysis. 3:
111-126.
Vrijheid, M., Martinez, D., Aguilera, I., Ballester, F., Basterrechea, M., Esplugues, A.,
Guxens, M., Larranaga, M., Lertxundi, A., Mendez M., Murcia, M., Santa Marina,
L., Villanueva, C., and Sunyer, J., 2010. Socioeconomic status and exposure to
multiple environmental pollutants during pregnancy: evidence for environmental
inequality? Journal of Epidemiology & Community Health.
Walter, SD., 2000. Disease mapping: a historical perspective. In: Spatial Epidemiology:
Methods and Applications (Elliott P, Wakefield J, Best N, Briggs DJ, eds).
Oxford: Oxford University Press, 223-252.

Webster, R.; Oliver, M. A., 2001. Geostatistics for environmental scientists. John Wiley &
Sons Ltd, Chichester.
Weisglas-Kuperus, N., Patandin, S., et al., 2000. Immunologic Effects of Background
Exposure to Polychlorinated Biphenyls and Dioxins in Dutch Preschool Children.
Environmental Health Perspectives. 108: 1203-1207.
White, H. 1980. A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct
Test for Heteroskedasticity, Econometrica, 48: 817-838.
Wilkinson, R., 1999. Income inequality, social cohesion, and health: clarifying the theory: a
reply to Muntaner and Lynch. International Journal of Health Services, 29: 525-543.
Yan, J., Thill J.C., 2005. Visual Exploration of Spatial Interaction Data with Self-
Organizing Maps. University at Buffalo.
Zelinsky, W., 1971. The hypothesis of the mobility transition. Geographical Review, 61:
219-249.
