
ABSTRACT

HUNGERFORD, ASHLEY ELAINE. Three Essays in Spatial Economics. (Under the direction
of Barry Goodwin and Sujit Ghosh.)

The first essay examines insurance claims filed with the National Flood Insurance Program.

The National Flood Insurance Program (NFIP), a government-run insurance program, has

become the largest source of flood insurance in the United States. However, the sustainability of

the program has been called into question. The combination of flood insurance rate maps riddled

with errors and subsidies to high-risk zones has left the NFIP insolvent. This paper examines

alternative methods of premium rating in an attempt to move the NFIP towards solvency.

We use single hurdle models to estimate the count of flood insurance claims within the state

of Florida. We also model the average indemnity payments for each county. By combining the

estimates from the single hurdle models with the estimates for the average indemnity payments,

we examine the loss-cost ratios for these counties as well as the expected number of claims. From

these results, we can determine the largest problem areas within the state and provide estimates

that better reflect flooding risk in Florida.

The second essay continues to investigate risk management issues. In 2011 and 2012 severe

droughts caused extensive damage to crops throughout the Midwest. These conditions, combined

with concerns about climate change, have led to a growing focus on risk management in agriculture.

The increasing emphasis on risk management is reflected in the 2014 Farm Bill, which replaces

direct payments with shallow loss programs. For this paper we turn our attention to winter

wheat production in Kansas and explore the ratings of the crop insurance policies as well as

predicted payouts from the new Agricultural Risk Coverage program established under the 2014

Farm Bill. Using spatial models we simulate yields of non-irrigated winter wheat and irrigated

winter wheat to estimate crop insurance premium rates as well as payouts from the Agricultural

Risk Coverage program.

The final essay tests the Law of One Price in North Carolina grain markets. The Law of One

Price states that the difference in prices between two markets should equal the difference in the

transaction costs. If prices continuously operate outside of equilibrium in regionally linked markets,

this indicates disintegration of the markets. This disintegration could represent asymmetric

information in the markets and affect market players' decisions. Much of the previous literature

testing the Law of One Price used regime switching regression. This paper utilizes methodology

not previously considered in this literature: Stochastic Copula

Autoregressive (SCAR) models. Because SCAR models are used in place of regime switching regression

to develop impulse responses, changes from equilibrium to disequilibrium and vice versa are

continuous.
Copyright 2014 by Ashley Elaine Hungerford

All Rights Reserved


Three Essays in Spatial Economics

by
Ashley Elaine Hungerford

A dissertation submitted to the Graduate Faculty of


North Carolina State University
in partial fulfillment of the
requirements for the Degree of
Doctor of Philosophy

Economics

Raleigh, North Carolina

2014

APPROVED BY:

Nicholas Piggott

Denis Pelletier

Barry Goodwin, Co-chair of Advisory Committee

Sujit Ghosh, Co-chair of Advisory Committee
UMI Number: 3584313

All rights reserved

Published by ProQuest LLC (2014). Copyright in the Dissertation held by the Author.
Microform Edition ProQuest LLC.
All rights reserved. This work is protected against
unauthorized copying under Title 17, United States Code

ProQuest LLC.
789 East Eisenhower Parkway
P.O. Box 1346
Ann Arbor, MI 48106 - 1346
DEDICATION

To Will

BIOGRAPHY

Ashley Mabee was born in Santa Maria, California and moved with her family to Bakersfield,

California at age 8. She attended California State University, Bakersfield, where she earned

a Bachelor of Science degree in Mathematics. At university Ashley met her future husband

William Hungerford. Although Bakersfield is renowned for its poor air quality and attempted

book bans, Ashley and William left Bakersfield, so Ashley could pursue a PhD in Economics

at North Carolina State University. Upon completion of her PhD she will begin employment at

the Economic Research Service of the United States Department of Agriculture. There she will be

reunited with her long-lost office mate, Stephanie Riche.

ACKNOWLEDGEMENTS

During my dissertation work, I have been incredibly fortunate to have, for lack of a better term,

a Super Star committee. Dr. Nick Piggott, your mentoring and guidance through the job market

have been indispensable. I deeply appreciate all of your help with my job market presentation

as well as your introducing me to my future supervisor, Joe Cooper. Dr. Sujit Ghosh, I am forever

grateful for your patience and insights on methodology. Your input has been vital not only to

the development of these essays, but also to how I approach methodology for economic questions.

Dr. Denis Pelletier, your asset pricing course set the foundation for my dissertation. Also, thank

you for all of the hours of help outside of the classroom. And of course, Dr. Barry Goodwin, thank you for

being kind enough to open your office door for me two years ago. Your mentoring has greatly

helped me blend together non-trivial methodology with economic intuition. Thank you all very,

very much.

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

Chapter 1 Introduction

Chapter 2 Modeling of Flood Insurance Claims using Spatially-Varying Zero-Inflated Distributions
2.1 Introduction
2.2 National Flood Insurance Program
2.2.1 History of the National Flood Insurance Program
2.2.2 How Flood Insurance Works
2.3 Methods
2.3.1 Single Hurdle Models
2.3.2 Estimating the Expected Number of Claims
2.3.3 Modeling Indemnity Payments
2.3.4 Implementation
2.4 Data
2.5 Results
2.5.1 Model Estimation
2.5.2 Simulating Claims and Indemnity Payments
2.5.3 Indemnity Payments
2.5.4 Loss Cost Ratio
2.6 Discussion
2.7 Conclusion
2.8 Tables and Figures

Chapter 3 Risk Management in Wheat Production
3.1 Introduction
3.2 Risk Management in Agriculture
3.3 Data
3.4 Methodology
3.4.1 Model for Censored County Yields
3.4.2 Prices
3.4.3 Risk Management Application
3.5 Results
3.6 Discussion
3.7 Concluding Remarks
3.8 Tables and Figures

Chapter 4 Spatial Integration of North Carolina Grain Markets
4.1 Methodology
4.1.1 Mean Modeling
4.1.2 Variance Modeling
4.1.3 Stochastic Copula Autoregressive Model
4.1.4 Comparison to Single Parameter Copulas
4.1.5 Non-Linear Impulse Responses
4.2 Data
4.3 Results
4.4 Discussion
4.5 Conclusion
4.6 Tables and Figures

Chapter 5 Conclusion

REFERENCES

APPENDICES
Appendix A Conditional Autoregressive Model (Chapters 1 & 2)
Appendix B Prior Distributions for Models (Chapter 1)
B.1 Logit Link
B.2 Log Link
B.3 Modeling Indemnity Payments
Appendix C Basics of Copula Modeling (Chapter 3)
Appendix D Algorithm for Impulse Responses (Chapter 3)
D.1 Obtaining E[Dt+k |t = dt + , Dt1 = dt1 , . . .]
D.1.1 Initializing the Shock for an Impulse Response
D.1.2 Time Path for Impulse Response After the Shock is Implemented
D.2 Obtaining E[Dt+k |t = dt , Dt1 = dt1 , . . .]

LIST OF TABLES

Table 2.1 High risk areas have at least a 1% annual probability of flooding. These areas
are referred to as 100-year floodplains. Zones labeled A are for inland
areas, while zones labeled V are reserved for coastal areas. Moderate risk
areas are referred to as 500-year floodplains.
Table 2.2 Functional Forms of the Considered Probability Mass Functions. The function
Γ(·) is defined as Γ(n) = (n−1)!, and ζ(·) denotes the Riemann zeta function.
Table 2.3 DIC for the Logit Links
Table 2.4 DIC for the Log Links
Table 2.5 The Chi-Squared Statistics for Observations of County-Year Data
Table 2.6 The Chi-Squared Statistics for data averaged over the years for each county.
Table 2.7 This table shows the expected losses. The first column is an alphabetical
list of all the counties in Florida. The second column is the annual average
for historical claims. The third is an approximation of the annual number of
claims the NFIP expects. The last three columns show the results of
the simulations for each type of hurdle model. For the SHP and SHNB, the
simulations for the models with the lowest DIC are shown.
Table 2.8 DIC for the Indemnity Payment Models
Table 2.9 Loss cost ratios
Table 2.10 Levee Rating System implemented by the United States Army Corps of
Engineers (USACE)

Table 3.1 Descriptions of policies offered by RMA
Table 3.2 DIC for the entire model. Here the logit link is varied, while the truncated
normal distribution has the spatial intercept, the spatial covariate with the
optimal threshold, and the September price.
Table 3.3 Chi-Squared Discrepancies. "Best-fitting" shows the Chi-Squared discrepancy
of the model that has spatial intercepts with the CAR distribution prior, the
optimal threshold covariate in the truncated normal regression, and the
September price covariate. "Independent" has different intercepts for each
county with independent priors and the September price covariate. These
two models have the same logit link.

Table 4.1 Kendall's tau coefficient
Table 4.2 Results of the Augmented Dickey-Fuller Test conducted on the price series
from each terminal grain market.
Table 4.3 Cointegration Test: Maximal Eigenvalue Likelihood Statistics. An estimate
with asterisks *, **, or *** indicates statistical significance at the α = .10, α =
.05, and α = .01 levels, respectively.
Table 4.4 VAR(1) and GARCH(1,1) estimates and standard errors for the wheat markets.
p_{WG,t} and p_{WS,t} are the price differences for the wheat markets in
Greenville and Statesville, respectively, on day t. An estimate with asterisks
*, **, or *** indicates statistical significance at the α = .10, α = .05, and
α = .01 levels, respectively.
Table 4.5 VAR(1) and GARCH(1,1) estimates and standard errors for the corn markets.
p_{CB,t} and p_{CL,t} are the price differences for the corn markets in Barber
and Laurinburg, respectively, on day t. An estimate with asterisks *, **,
or *** indicates statistical significance at the α = .10, α = .05, and α = .01
levels, respectively.
Table 4.6 VAR(1) and GARCH(1,1) estimates and standard errors for the soybean
markets. p_{SC,t} and p_{SF,t} are the price differences for the soybean markets
in Cofield and Fayetteville, respectively, on day t. An estimate with asterisks
*, **, or *** indicates statistical significance at the α = .10, α = .05, and
α = .01 levels, respectively.
Table 4.7 CRPS
Table 4.8 Parameter Estimates for the single parameter copulas
Table 4.9 Parameter Estimates for SCAR models of Wheat Markets
Table 4.10 Parameter Estimates for SCAR models of Corn Markets
Table 4.11 Parameter Estimates for SCAR models of Soybean Markets
Table 4.12 Half-Lives of Impulse Responses. The half-life is the time (in days) that it
takes for the deviation to dissipate to half of its distance to the equilibrium.

Table C.1 Popular Archimedean Copulas

LIST OF FIGURES

Figure 2.1 Maps Relating to Policies
Figure 2.2 Maps Relating to Losses
Figure 2.3 The prior and posterior distributions of the covariate of Number of Policies
for the logit link.
Figure 2.4 For the logit link with the lowest DIC, the 2.5th, 50th, and 97.5th percentiles
of spatial intercepts are shown above.
Figure 2.5 The prior and posterior distributions of the covariate Number of Policies
for the log link of the SHP model.
Figure 2.6 For the SHP model with the lowest DIC, the 2.5th, 50th, and 97.5th
percentiles of random effect intercepts are shown above.
Figure 2.7 The prior and posterior distributions of and the covariate Number of
Policies for the log link of the SHNB model.
Figure 2.8 For the SHNB model with the lowest DIC, the 2.5th, 50th, and 97.5th
percentiles of spatial intercepts are shown above.
Figure 2.9 The prior and posterior distributions of the coefficient for log number of
policies and Coast for the SHRZ model.
Figure 2.10 For the SHRZ model, the 2.5th, 50th, and 97.5th percentiles of spatial
intercepts are shown above.
Figure 2.11 Maps for Expected Losses
Figure 2.12 Percentiles for the average indemnity payments (adjusted to 2013 dollars)
for a flood insurance claim.
Figure 2.13 Percentiles for the simulated average indemnity payments (adjusted to 2013
dollars) for a flood insurance claim. These are based on the model for average
indemnity payments that had the lowest DIC.
Figure 2.14 Maps of loss cost ratios

Figure 3.1 Figures for the entire state of Kansas including the average yield and number
of acres planted.
Figure 3.2 Wheat price per bushel (adjusted to 2013 prices)
Figure 3.3 Average yield (bushels per acre). Sample period: 1970-2013
Figure 3.4 Prior and posterior distributions of the dryland and irrigated wheat logit
link functions. Note there is no 0 posterior distribution for irrigated wheat
because the intercepts for the best-fit irrigated wheat logit link function are
spatially-varying.
Figure 3.5 Posterior percentiles for the spatial intercepts of the irrigated wheat logit
link.
Figure 3.6 Prior and posterior distributions for the parameter of the dryland wheat
and irrigated wheat
Figure 3.7 Posterior percentiles for the spatial intercepts of the dryland wheat
truncated normal regression
Figure 3.8 Posterior percentiles for the secondary spatial covariate of the dryland wheat
truncated normal regression
Figure 3.9 Posterior percentiles for the spatial intercepts of the irrigated wheat
truncated normal regression
Figure 3.10 Posterior percentiles for the secondary spatial covariate of the irrigated
wheat truncated normal regression
Figure 3.11 Percentiles for the simulated dryland yields
Figure 3.12 Percentiles for the simulated irrigated yields
Figure 3.13 Probability for the three coverage levels of dryland wheat of the best-fitting
model.
Figure 3.14 Probability for the three coverage levels of dryland wheat of the model with
independent counties.
Figure 3.15 Probability for the three coverage levels of irrigated wheat of the best-fitting
model.
Figure 3.16 Probability for the three coverage levels of irrigated wheat of the model
with independent counties.
Figure 3.17 Premium rates for the three coverage levels of dryland wheat of the
best-fitting model.
Figure 3.18 Premium rates for the three coverage levels of dryland wheat of the model
with independent counties.
Figure 3.19 Premium rates for the three coverage levels of irrigated wheat of the
best-fitting model.
Figure 3.20 Premium rates for the three coverage levels of irrigated wheat of the model
with independent counties.
Figure 3.21 Distribution of Olympic average of prices
Figure 3.22 County probability from the ARC program
Figure 3.23 Percentiles of the Olympic averages for dryland wheat.
Figure 3.24 Percentiles of the Olympic averages for irrigated wheat.
Figure 3.25 Median of the expected payout for the ARC program.

Figure 4.1 Acres planted for the three crops of interest
Figure 4.2 Daily Market Prices
Figure 4.3 Wheat: Gaussian Copula Parameters
Figure 4.4 Wheat: Double Clayton Copula Parameters
Figure 4.5 Wheat: Double Gumbel Copula Parameters
Figure 4.6 Corn: Gaussian Copula Parameters
Figure 4.7 Corn: Double Clayton Copula Parameters
Figure 4.8 Corn: Double Gumbel Copula Parameters
Figure 4.9 Soybean: Gaussian Copula Parameters
Figure 4.10 Soybean: Double Clayton Copula Parameters
Figure 4.11 Soybean: Double Gumbel Copula Parameters
Figure 4.12 Kendall's Tau
Figure 4.13 Wheat: Impulse Response Functions
Figure 4.14 Corn: Impulse Response Functions
Figure 4.15 Soybean: Impulse Response Functions

Chapter 1

Introduction

This dissertation contains three essays that center around spatial relationships. Chapter 2 and

Chapter 3 both examine how insurance is affected by the presence of systemic risk. Systemic

risk is the probability of losses occurring simultaneously and dependently. If systemic risk is

not properly accounted for in insurance ratings, rates may be inefficient and lead to inaccurate

probabilities. The third essay, Chapter 4, examines spatial integration of grain markets in

North Carolina. If markets are spatially integrated, then the Law of One Price holds. The Law

of One Price states that the price of a homogeneous good in different locations is the same

when transaction costs are excluded. If the Law of One Price does not hold, this may reflect

asymmetric information in the markets.

The first essay examines the National Flood Insurance Program (NFIP). By 2013 the NFIP

had accumulated approximately $20 billion in debt, including the losses from Hurricane Katrina

and Hurricane Sandy, but the NFIP collects at most $3.8 billion in premiums per year. Unlike

automobile insurance claims or homeowner insurance claims, flood insurance claims are highly

correlated with one another. In addition, the Flood Insurance Rate Maps used to rate flood insurance

policies are notoriously erroneous. Therefore, we propose a new method for modeling

flood insurance policies that incorporates the spatial autocorrelation among losses as well

as the potential for catastrophic losses as seen during hurricanes.

For our estimation of losses from floods, we use a two-part model. The first part of the model

estimates the annual count of flood insurance claims in a particular county, and the second

part of the model estimates the average indemnity payment in that county. There are three

variations of the model, in which the count part of the model uses different distributions: the

Poisson distribution, the Negative Binomial distribution, and the Riemann-Zeta distribution.

Both parts of the model account for spatial autocorrelation. This analysis is conducted using

data from the 67 counties of Florida during the sample period 1978-2011. The state of Florida

is chosen because approximately 40% of flood insurance policies held are for properties located

in Florida. After estimating our models, we compare the loss cost ratios derived from the three

variations of our model to the observed loss cost ratios of the NFIP.
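The two-part structure described above can be sketched for a single county as follows. This is a minimal illustration that assumes a single-hurdle Poisson count model with no spatial random effects. The function names, the parameter values, and the definition of the loss cost ratio as expected indemnities divided by total liability are assumptions made for the example, not the estimated models of Chapter 2.

```python
import math


def hurdle_poisson_pmf(k, p_claim, lam):
    """P(Y = k) under a single-hurdle Poisson model (illustrative).

    p_claim: probability of crossing the hurdle (at least one claim).
    lam: rate of the zero-truncated Poisson governing positive counts.
    """
    if k == 0:
        return 1.0 - p_claim
    poisson_k = math.exp(-lam) * lam**k / math.factorial(k)
    # Rescale the Poisson mass so that the positive counts sum to one.
    return p_claim * poisson_k / (1.0 - math.exp(-lam))


def expected_claims(p_claim, lam):
    """E[Y] = p_claim * lam / (1 - exp(-lam))."""
    return p_claim * lam / (1.0 - math.exp(-lam))


def loss_cost_ratio(p_claim, lam, avg_indemnity, liability):
    """Expected indemnities over total liability (assumed definition)."""
    return expected_claims(p_claim, lam) * avg_indemnity / liability


# Hypothetical county: 40% chance of any claim in a year, truncated-Poisson
# rate of 3, $20,000 average indemnity, $10 million of coverage in force.
ratio = loss_cost_ratio(0.4, 3.0, 20_000.0, 1.0e7)
```

Swapping the Poisson mass for a Negative Binomial or Riemann-Zeta mass changes only the part of the model for positive counts; the hurdle probability is handled in the same way.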

There has been increasing interest in agricultural risk management, which is reflected in the

2014 Farm Bill; therefore, for the second essay, we turn our attention to crop insurance and

the Agricultural Risk Coverage (ARC) program. In this essay, we model yields and prices to

determine crop insurance premium ratings and expected payouts from the ARC program. Like

flood insurance, crop insurance is susceptible to systemic risk. Natural disasters, such as the

drought that struck the Midwest in 2012, cause lower yields for producers over an expansive area.

Hence spatial autocorrelation is accounted for in our models of yields. We also expect the spatial

autocorrelation to change depending on the quantity of the yield since spatial autocorrelation of

yields is expected to be higher during a natural disaster compared to a good year. Therefore,

we allow for spatial dependency to change based on the quantity yielded.

For the model estimation of the second essay, we examine winter wheat in Kansas. Kansas

is one of the top producers of winter wheat in the United States. The yield data consist of

annual observations from the 105 counties of Kansas over the sample period 1970-2013. The

observations are separated by irrigation practices, i.e. dryland (non-irrigated) winter wheat and

irrigated winter wheat. Although the majority of winter wheat is grown without irrigation,

dryland practices may cause weather to exert a great influence on yields. After estimating the

models for dryland wheat and irrigated wheat, we use Monte Carlo integration to determine

crop insurance premium rates and the expected payouts for the Agricultural Risk Coverage

program.
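As a rough illustration of the Monte Carlo integration step, the sketch below estimates a yield-insurance premium rate from simulated yields. It assumes the rate is the expected shortfall below the yield guarantee divided by the guarantee. The normal draws, the coverage levels of 70%, 80%, and 90%, and the function names are hypothetical stand-ins, not the spatial yield model or the policy terms used in Chapter 3.

```python
import random


def premium_rate(simulated_yields, expected_yield, coverage):
    """Monte Carlo estimate of the expected loss as a share of the guarantee."""
    guarantee = coverage * expected_yield
    shortfalls = [max(guarantee - y, 0.0) for y in simulated_yields]
    return sum(shortfalls) / len(shortfalls) / guarantee


def loss_probability(simulated_yields, expected_yield, coverage):
    """Share of simulated yields that fall below the guarantee."""
    guarantee = coverage * expected_yield
    below = sum(1 for y in simulated_yields if y < guarantee)
    return below / len(simulated_yields)


# Hypothetical yield draws in bushels per acre, censored at zero; in practice
# these would come from the fitted yield model rather than a normal draw.
rng = random.Random(0)
draws = [max(rng.gauss(40.0, 10.0), 0.0) for _ in range(10_000)]
mean_yield = sum(draws) / len(draws)
rates = {c: premium_rate(draws, mean_yield, c) for c in (0.70, 0.80, 0.90)}
```

Because yields are non-negative, the estimated rate cannot fall as the coverage level rises, which gives a quick sanity check on any set of simulated draws.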

Chapter 4, the final essay, changes the focus from risk management to spatial integration.

As mentioned earlier, if markets are not integrated, this could indicate asymmetric information

among markets. Asymmetric information can negatively impact decisions made by market

players. The seminal work on market integration kept transaction costs fixed or assumed transaction

costs so high that markets would be isolated. More recent work used regime switching

regression models along with impulse response functions that better incorporated transaction

costs. Our analysis utilizes non-linear impulse response functions; however, we opt for copula

modeling instead of the conventional threshold modeling.

The model estimation in Chapter 4 utilizes weekly price observations from six grain markets

in North Carolina. The weekly price series cover a sample period from January 7, 2000 to

November 3, 2011 for the two corn markets and two soybean markets, and the sample period

for the two wheat markets is from October 7, 2005 to May 27, 2010. The estimation is pairwise,

such that spatial integration is tested only between markets for the same crop. After fitting

an ARMA-GARCH model to the log returns of the prices for each market, we apply copula models

to the standardized residuals. The residuals of the ARMA-GARCH estimations for the two

series of log returns can therefore be correlated, and that correlation is described by a time-varying copula

model referred to as a Stochastic Copula Autoregressive (SCAR) model. Because SCAR models are used

in place of regime switching regression to develop impulse responses, changes from equilibrium

to disequilibrium and vice versa are continuous.
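The pipeline from returns to a dependence measure can be sketched in a few steps. The example below is a simplified, static version: it assumes a constant-mean GARCH(1,1) filter with known parameters in place of a fitted ARMA-GARCH model, and it summarizes dependence with a single Kendall's tau instead of the time-varying parameter of a SCAR model. All function names and parameter values are illustrative.

```python
import math


def garch_standardized_residuals(returns, mu, omega, alpha, beta):
    """Filter returns through a constant-mean GARCH(1,1) with given parameters."""
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    standardized = []
    for r in returns:
        eps = r - mu
        standardized.append(eps / math.sqrt(var))
        var = omega + alpha * eps**2 + beta * var  # next-period variance
    return standardized


def pseudo_observations(z):
    """Rank-based transform of residuals to (0, 1), the usual copula input."""
    order = sorted(range(len(z)), key=lambda i: z[i])
    u = [0.0] * len(z)
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (len(z) + 1)
    return u


def kendall_tau(x, y):
    """Kendall's tau by direct pair counting (no correction for ties)."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return 2.0 * score / (n * (n - 1))


def gaussian_copula_rho(tau):
    """Gaussian copula correlation implied by Kendall's tau."""
    return math.sin(math.pi * tau / 2.0)
```

In a SCAR specification the copula parameter follows its own latent autoregressive process, so the single value returned by `gaussian_copula_rho` would be replaced by a path of values over time.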

Chapter 2

Modeling of Flood Insurance Claims using Spatially-Varying Zero-Inflated Distributions

2.1 Introduction

Since 1968 the National Flood Insurance Program has provided flood insurance to residential

and non-residential property owners (Michel-Kerjan, 2010). However, the sustainability of the

program has been called into question. This paper examines several issues currently facing

the NFIP and focuses on examining catastrophic risk in the program. Out-of-date flood maps,

high administrative cost, the threat of catastrophes, and a myriad of other problems have

created an insolvent program. In 2013 the Government Accountability Office (GAO) determined

the National Flood Insurance Program's debt to the Treasury was approximately $20 billion

(GAO, 2013). In the last twenty years, the United States has experienced catastrophic flooding

several times, including the Midwest Floods of 1993, Hurricane Katrina in 2005, and most

recently Hurricane Sandy in 2012. The NFIP has been able to pay down the debt to $17 billion

(Kousky and Michel-Kerjan, 2012). This debt is quite substantial considering the NFIP brought

in slightly over $3.8 billion in premiums in 2013. Under the current structure of the NFIP,

paying off this amount of debt is impossible. Most of this debt stems from indemnity payments

after Hurricane Katrina in Louisiana, which amounted to $13.2 billion. The debt accrued from

Hurricane Sandy has not been completely realized; however, the federal government did increase

the debt ceiling of the NFIP from approximately $20.7 billion to $30.4 billion (GAO, 2013). The

current model of the NFIP has grave flaws which allow for these large debts to accumulate. One

of the major flaws is the current policy rating system. Therefore, we examine an alternative

method of rating that would yield actuarially-fair premiums.

Skees and Barnett (1999) outlined several ideal conditions for insurance including 1) information

between the insurer and insuree should be symmetric, 2) claims should be independent

and random, and 3) losses can easily be measured. Flood insurance claims are spatially correlated

due to the systemic nature of flooding. Hence, flood insurance suffers from a lack of

independence among claims and systemic risk. Systemic risk is the probability of a large number

of losses occurring simultaneously and dependently. Therefore, the second condition listed

above is violated. In other insurance markets, such as automobile and fire insurance, loss events

are almost completely independent from each other. On the other end of the spectrum, derivatives,

such as futures contracts and options for a given commodity, are highly correlated. Flood

insurance and crop insurance fall in between these two extremes.

The analysis of flood insurance claims and indemnity payments is surprisingly sparse. However,

after Hurricane Katrina and, more recently, Hurricane Sandy, the literature on the supply side

of flood insurance has grown. A series of papers by Kousky and Michel-Kerjan (2009a and

2009b) explores the tail dependence between flood insurance indemnity payments and homeowners

insurance indemnity payments from wind damage. Using copula modeling, the authors

identify dependence between large volumes of flood insurance indemnity payments and large

volumes of wind damage indemnity payments. Similar results were found by Diers et al. (2012)

and Pfiefer et al. (2012) for flood insurance indemnity payments and wind insurance indemnity

payments in Germany. However, Kousky and Michel-Kerjan (2009a) discarded the observations

equal to zero, while Diers et al. (2012) and Pfiefer et al. (2012) aggregated the data, so the

data contained no observations equal to zero. Discarding zeros makes the model conditional on

the occurrence of positive observations. Although a conditional model does provide insight, the

conditional model cannot be used to rate insurance policies. Also, the aggregation performed

by Pfiefer et al. (2012) is across all of Germany, which would only allow for the calculation of

the premium rate for the entire country. Using a model that omits zero values may overstate

risk.

This paper's analysis utilizes data on flood insurance claims from Florida and includes all

of the observations equal to zero. To handle the large number of zeros in the data, single

hurdle models are utilized. Single hurdle models are two-part models with a binary element,

which models whether the observation is zero, and a positive count element, which models

the positive observations. While the binary element is modeled using logit regression, the positive

counts can be modeled by traditional count data models. In this paper we use the Poisson

distribution, the Negative Binomial distribution, and the Riemann Zeta distribution. Although

the data set obtained from the Federal Emergency Management Agency (FEMA) is monthly from

January 1978 to June 2012, flood insurance premiums are calculated on an annual basis. There-

fore, the sample period for the count observations is aggregated annually over the time period

1978 to 2011. Since the counties of Florida vary in susceptibility to flooding, we allow the

model to vary the risk and severity of flooding from county to county. The three single hurdle

models are used to estimate annual expected number of claims for each county in Florida. The

estimates from these three models are compared to an approximation of the annual expected

number of claims provided by FEMA. The expected number of flood insurance claims can be

used to determine the premiums for each county.

In this paper, we also model the average indemnity payments for each county. The model

for average indemnity payments has spatially-varying coefficients for the covariates as well as

spatially-varying intercepts. This model may be used to help establish appropriate levels of

coverage. When the average indemnity payment model is used in conjunction with the single

hurdle models, loss cost ratios can be established for each county. These loss cost ratios help

determine the practicality of our estimation.

2.2 National Flood Insurance Program

2.2.1 History of the National Flood Insurance Program

Before the National Flood Insurance Program (NFIP), flood victims typically had two sources of

relief: charities and government disaster relief. Most floods do not warrant government disaster

relief, and charities do not offer a stable means of assistance (Federal Emergency Management

Agency, 2002). To move the burden of flood disaster relief from taxpayers to floodplain residents

and provide a steady means of assistance to flood victims, in the 1950s, Congress attempted

to encourage private insurers to provide flood insurance with the Federal Flood Insurance Act

of 1956. However, the private sector was not interested in providing flood insurance. After the

devastation of Hurricane Betsy in 1965 along the coast of Louisiana and southeastern Florida,

Congress stepped forward and passed the National Flood Insurance Act of 1968 (Michel-Kerjan,

2010). From this act, the National Flood Insurance Program (NFIP) was established to provide

a government-run flood insurance program to the inhabitants of floodplains.

The NFIP has grown substantially since its establishment and has undergone major reform. After

Tropical Storm Agnes in 1972, Congress passed the Flood Disaster Protection Act of 1973,

which required the designation of flood prone areas called Special Flood Hazard Areas (SFHA)

and the creation of Flood Insurance Rate Maps (FIRMs). Individuals with federally-backed

mortgages for properties located on floodplains are now required to purchase flood insurance.

In 1983 the Write Your Own (WYO) program was developed. This program allowed private

insurance companies to become intermediaries for the NFIP and sell flood insurance without

bearing any risk. Private insurance companies participating in the WYO program receive ap-

proximately 30% of the flood insurance premiums. The vast majority of NFIP policies are now

purchased through the WYO program. After the devastation of the Midwest Floods of 1993,

the National Flood Insurance Reform Act of 1994 established initiatives to increase market penetration

and created the Community Rating System (CRS). Communities that enlist in CRS

and take preventative measures against floods are able to receive reduced premiums. According

to FEMA, the reduction in premiums can be as much as 45% (FEMA, 2013).

For most of the program's history, the National Flood Insurance Program has offered direct

subsidies to approximately 20% of flood insurance policyholders. If flood insurance rate maps

were updated and resulted in higher premiums, policyholders with continuous coverage were

allowed to grandfather their previous insurance rates. The Biggert-Waters Flood Insurance

Reform Act of 2012 eliminated subsidies that allowed the grandfathering of rates. These subsidies

were expected to fade out completely by 2014. However, in March 2014 the Homeowner

Flood Insurance Affordability Act was passed, which delayed the elimination of subsidies.

The majority of legislation affecting the National Flood Insurance Program has been enacted

to increase market penetration of flood insurance. By increasing market penetration the federal

government lessens its obligation to supply disaster assistance after major floods. However,

with the focus on increasing market penetration through subsidies, the sustainability of the

program has been ignored, except for the short-lived reform of the Biggert-Waters Flood

Insurance Reform Act of 2012. Therefore, in order for the NFIP to continue, FEMA must evaluate

the premiums needed to achieve solvency.

2.2.2 How Flood Insurance Works

Many property owners are unsure what is and is not covered by flood insurance and how pre-

miums are determined. Flood insurance offered by the NFIP is available to both residential and

non-residential property owners. Currently, the maximum coverage for a residential policyholder

is $250,000 for building damage and $100,000 for contents, while the maximum coverage for a

non-residential policyholder is $1,000,000 for building damage and $500,000 for contents. Cur-

rently, the minimum deductible is $1,000 for coverage under $100,000 and $1,250 for coverage

greater than $100,000. There are separate deductibles for building and contents.

FEMA (2002) defines a flood as

A general and temporary condition of partial or complete inundation of two or


more acres of normally dry land area or of two or more properties (at least one of
which is the policyholder's property) from
1. Overflow of inland or tidal waters,

2. Unusual and rapid accumulation or runoff of surface waters from any source,
3. Mudflow, or
4. Collapse or subsidence of land along the shore of a lake or similar body of
water as a result of erosion or undermining caused by waves or currents of
water exceeding anticipated cyclical levels that result in a flood as defined
above.

Damage caused by wind during a flood is not covered by the NFIP. Likewise, flood damage is not

covered by the vast majority of homeowners insurance policies. Property owners who live within a 100-year

flood plain (Special Flood Hazard Area), an area with a 1% or greater annual probability of

being flooded, are required to purchase flood insurance if their mortgages are federally insured.

Also many lenders require the purchase of flood insurance if a home is located in a 100-year flood

plain. Table 1 shows the Flood Insurance Rate Map (FIRM) designations. Areas designated as

a Special Flood Hazard Area (SFHA) also have a base flood elevation. This elevation is the

level that flood waters have a 1% annual probability of reaching or exceeding. The property's FIRM designation,

base flood elevation, and characteristics of the property determine the policyholder's flood

insurance premium.

Interestingly, there is a discrepancy between the definition of a flood and how flood insurance

policies are rated. By definition, flood insurance policies cover damage in the event of a levee

breach. However, the condition of levees is not fully accounted for in policy ratings. This

disparity was a contributing factor in the destruction caused by Hurricane Katrina. This issue

is yet another reason to consider alternative methods of insurance rating. The Methods section

below provides alternative models for rating flood insurance policies. Further information on

levee ratings and flood insurance policy ratings is provided in the Discussion.

2.3 Methods

2.3.1 Single Hurdle Models

This paper utilizes Bayesian hierarchical models. Banerjee et al. (2004) provide an excellent

overview of spatial Bayesian hierarchical models. In particular we use single hurdle modeling,

which accounts for over-dispersion in the data that may appear due to excess zeros in the count

data. If we do not account for the over-dispersion, then the estimated standard errors may

be too small, which understates the Type I error rate in a hypothesis testing

scenario. Most counties in Florida exhibit a significant proportion of years having zero flood

insurance claims, with zeros making up 40% of a county's observations on average. For our modeling,

we utilize the single hurdle modifications of the Poisson, Negative Binomial, and Riemann Zeta

distributions. The Poisson and Negative Binomial distributions are the workhorses of discrete

modeling, while the Riemann Zeta distribution is rarely used in the risk management literature.1

The random variable for the annual count of flood insurance claims per county, denoted as

$C_{it}$, can be represented as

$$C_{it} = B_{it}\,(1 + U_{it}) \tag{2.1}$$

for County $i = 1, \ldots, N$ and Year $t = 1, \ldots, T$. $B_{it}$ is a random variable distributed as

Bernoulli($P_{it}$), and $U_{it}$ is a discrete random variable with the probability mass function $f(u \mid \theta_{it})$

taking values $u = 0, 1, 2, \ldots$. The representation of $C_{it}$ leads to the following probability mass

function:

$$\Pr(C_{it} = c \mid P_{it}, \theta_{it}) = \begin{cases} 1 - P_{it} & \text{if } c = 0 \\ P_{it}\, f(c - 1 \mid \theta_{it}) & \text{if } c = 1, 2, \ldots \end{cases}. \tag{2.2}$$

Any single hurdle model can be represented by suitably choosing the parameters $P_{it}$ and

$\theta_{it}$ and the functional form $f(\cdot \mid \theta_{it})$. Representation (2.2) can be used to recover $B_{it}$ and

$U_{it}$ from the observed $C_{it}$ in the following manner:

$$B_{it} = \begin{cases} 0 & \text{if } C_{it} = 0 \\ 1 & \text{if } C_{it} = 1, 2, \ldots \end{cases} \tag{2.3}$$
1 Guzzetti et al. (2006) modeled flood and landslide risk in Italy using the Riemann Zeta distribution;
however, their model was neither single hurdle nor spatially varying. Porter and White (2012) use a Single Hurdle
Riemann Zeta model for modeling terrorist attacks.

and

$$U_{it} = \begin{cases} \text{undefined} & \text{if } B_{it} = 0 \\ C_{it} - 1 & \text{if } B_{it} = 1 \end{cases}. \tag{2.4}$$
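
The hurdle representation in (2.1) through (2.4) can be illustrated with a minimal Python sketch. It is not the R/OpenBUGS code used for the analysis; the Poisson choice for f, the function names, and the values of `p` and `lam` are assumptions made only for illustration.

```python
import math


def poisson_pmf(u, lam):
    """Poisson probability mass at u = 0, 1, 2, ... with mean lam."""
    return math.exp(-lam) * lam ** u / math.factorial(u)


def hurdle_pmf(c, p, lam):
    """Pr(C = c) for a single hurdle model as in (2.2), with a Poisson f."""
    if c == 0:
        return 1.0 - p                      # hurdle not crossed
    return p * poisson_pmf(c - 1, lam)      # hurdle crossed, U = c - 1


def recover_b_u(c):
    """Recover (B, U) from an observed count C as in (2.3) and (2.4)."""
    if c == 0:
        return 0, None                      # U is undefined when B = 0
    return 1, c - 1


# Hypothetical parameter values for a single county-year.
p, lam = 0.6, 2.5
total = sum(hurdle_pmf(c, p, lam) for c in range(100))
```

Summing `hurdle_pmf` over a long range of counts should return a value very close to one, which is a quick check that shifting the positive counts by one keeps the mass function proper.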

As mentioned earlier, $B_{it}$ follows a Bernoulli distribution. The probability parameter $P_{it}$ of this

Bernoulli distribution is modeled using a logit link:

$$\operatorname{logit}(P_{it}) = \gamma_{i,0} + \sum_{j=1}^{B} \gamma_{ij} z_{ijt}, \tag{2.5}$$

where $(z_{i1t}, \ldots, z_{iBt})$ denotes the vector of covariates for County $i$ and Year $t$.
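
As a small illustration of the logit link in (2.5), the Python sketch below maps a county intercept and covariates to the probability of at least one claim. The function names and the numeric inputs are hypothetical and are not estimates from this paper.

```python
import math


def inv_logit(x):
    """Inverse of the logit function: maps a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def hurdle_probability(intercept, coefs, covariates):
    """P_it implied by logit(P_it) = intercept + sum_j coef_j * z_j."""
    linear_predictor = intercept + sum(g * z for g, z in zip(coefs, covariates))
    return inv_logit(linear_predictor)


# Hypothetical county: intercept -4.0, one covariate (log number of policies).
p_small = hurdle_probability(-4.0, [0.5], [math.log(1_000)])
p_large = hurdle_probability(-4.0, [0.5], [math.log(100_000)])
```

With a positive coefficient, the county with more policies receives the higher probability of crossing the hurdle.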

To help account for spatial variation among the counties, a Conditional Autoregressive

(CAR) model is used for the intercepts of the logit link, such that $(\gamma_{1,0}, \ldots, \gamma_{N,0}) \sim \text{CAR}(\rho, \tau^2)$.

The CAR model is a popular model based on the notion of Markov Random Fields. The mean (the

intercept in our model) of one county is conditional on the means (intercepts)

of the surrounding counties. "Surrounding" can be defined

by distance or contiguity. In this model, we choose contiguity over distance since the counties

vary greatly in size and shape. A detailed description of the CAR model is provided in

Appendix A.

Exchangeable prior distributions are used for the covariate coefficients, which are allowed

to vary with counties as random effects, such that $(\gamma_{i1}, \ldots, \gamma_{iB}) \overset{iid}{\sim} N(\mu_\gamma, \Sigma_\gamma)$ for $i = 1, \ldots, N$.

All of the prior distributions are chosen to be vague (i.e. with very large prior variance) but

not improper, i.e. the densities of the prior distributions integrate to one. For a detailed list of

the prior distributions, refer to the Appendix.

Three different discrete distributions are considered in modeling the unobserved random ef-

fects Uit : the Poisson distribution, the Negative Binomial, and the Riemann Zeta distribution.

Table 2.2 shows the functional forms for each of these distributions along with their means

and variances. Although the Poisson distribution is a commonly used discrete distribution, the

Poisson distribution restricts the mean equal to the variance. Insurance applications pertaining

to natural disasters tend to have extremely large variances relative to the mean value. There-

fore, the Negative Binomial distribution is also proposed for our application. The additional

parameter, r, allows for a greater range of flexibility to model the variance. Finally, when mod-

eling continuous distributions with long tails, the Generalized Pareto distribution is a popular

choice as seen in Cooley et al. (2008). The Riemann Zeta distribution is considered the discrete

analog of the Generalized Pareto distribution. The Riemann Zeta distribution is capable of having

a very long right tail depending on the parameter specification, which makes it a plausible

candidate for modeling catastrophic risk.
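
To make the variance restriction concrete, the Python sketch below compares the Poisson variance with a Negative Binomial variance at the same mean. It assumes the common mean and size parameterization, in which the variance is mu + mu^2 / r; the parameterization used in Table 2.2 may differ, and the numbers are hypothetical.

```python
def poisson_variance(mu):
    """The Poisson distribution forces the variance to equal the mean."""
    return mu


def negbin_variance(mu, r):
    """Negative Binomial variance under the assumed (mu, r) parameterization."""
    return mu + mu ** 2 / r


mu = 50.0  # hypothetical mean count
# Ratio of Negative Binomial to Poisson variance for several values of r.
ratios = {r: negbin_variance(mu, r) / poisson_variance(mu) for r in (0.5, 5.0, 500.0)}
```

Under this assumption, small values of r inflate the variance far above the mean, while large values of r bring the ratio back toward one.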

The covariates are linked through the mean of the unobserved random

component $U_{it}$, as seen in Table 2.2. Specifically, letting $\lambda_{it} = E(U_{it} \mid \mathbf{z}_{it})$, we use the following log link function for

the Poisson and Negative Binomial distributions:

$$\log(\lambda_{it}) = \beta_{0,i} + \sum_{j=1}^{B} \beta_j z_{ijt}, \tag{2.6}$$

where $\mathbf{z}_{it}^{T} = (z_{i1t}, \ldots, z_{iBt})$ denotes the vector of covariates available for County $i$ and Year $t$.

The intercepts have the prior $(\beta_{0,1}, \ldots, \beta_{0,N}) \sim \text{CAR}(\rho, \tau^2)$, and the coefficients for the covariates

have independent normal distributions for their priors. The parameter $r$ in the Negative

Binomial distribution is assigned a gamma prior distribution.

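For the Riemann Zeta distribution we must solve for the parameter $\alpha_{it}$ non-linearly to

obtain the mean of the Riemann Zeta distribution, which is $\lambda_{it} = \frac{\zeta(\alpha_{it})}{\zeta(\alpha_{it} + 1)} - 1$ for $\alpha_{it} > 1$, where

$\zeta(\alpha_{it}) = \sum_{n=1}^{\infty} \frac{1}{n^{\alpha_{it}}}$. We again use the log link function for the positive count model, such that

$$\log(\lambda_{it}) = \beta_{0,i} + \sum_{j=1}^{B} \beta_j z_{ijt}. \tag{2.7}$$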

The log link function for the Riemann Zeta distribution uses the same covariates and prior

distribution as the log links for the Poisson and Negative Binomial distributions.
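
The non-linear solve for the Riemann Zeta parameter can be sketched in Python as follows. This is only an illustration under stated assumptions, not the R code used for the analysis: the zeta function is approximated by a truncated sum with an Euler-Maclaurin tail correction, the root is found by bisection, and all function names and the search interval are hypothetical.

```python
def zeta(s, n_terms=100):
    """Approximate the Riemann zeta function for s > 1.

    Direct sum of the first n_terms - 1 terms plus an Euler-Maclaurin
    correction for the remaining tail.
    """
    head = sum(k ** -s for k in range(1, n_terms))
    n = float(n_terms)
    tail = n ** (1.0 - s) / (s - 1.0) + 0.5 * n ** -s + s * n ** (-s - 1.0) / 12.0
    return head + tail


def zeta_mean(alpha):
    """Mean of U: zeta(alpha) / zeta(alpha + 1) - 1, valid for alpha > 1."""
    return zeta(alpha) / zeta(alpha + 1.0) - 1.0


def solve_alpha(lam, lo=1.0 + 1e-9, hi=50.0, tol=1e-10):
    """Find alpha with zeta_mean(alpha) = lam by bisection.

    Relies on zeta_mean being decreasing in alpha; lam must lie between
    zeta_mean(hi) and zeta_mean(lo).
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if zeta_mean(mid) > lam:
            lo = mid        # mean too large, so alpha must increase
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


alpha = solve_alpha(3.0)  # hypothetical target mean of 3 claims beyond the first
```

Bisection is chosen here for robustness; the mean grows without bound as the parameter approaches one, so derivative-based root finders can behave poorly near that boundary.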

If $U_{it} \sim RZ(\alpha_{it})$ and $E(U_{it}^{w}) < \infty$, then $\alpha_{it} > w$ when $w$ is a positive integer. Therefore, to

model a Riemann Zeta distribution with the variance restricted to be finite, we need to use the

following link function:

$$\log(\alpha_{it}) = \log(2) + \log\Bigl(1 + \exp\Bigl(\beta_{0,i} + \sum_{j=1}^{B} \beta_j z_{ijt}\Bigr)\Bigr). \tag{2.8}$$

This link function will be used in future research. These single hurdle models are referred to as

the Single Hurdle Poisson (SHP) model, the Single Hurdle Negative Binomial (SHNB) model,

and the Single Hurdle Riemann Zeta (SHRZ) model.
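
A brief Python sketch shows why the link in (2.8) restricts the variance to be finite: exponentiating both sides gives a parameter equal to 2(1 + exp(x)), which exceeds 2 for any finite linear predictor x. The function name and test values are hypothetical.

```python
import math


def alpha_finite_variance(linear_predictor):
    """Zeta parameter implied by link (2.8): 2 * (1 + exp(x)).

    Very large predictors (above roughly 709) overflow math.exp,
    and very negative ones round to exactly 2.0 in floating point.
    """
    return 2.0 * (1.0 + math.exp(linear_predictor))


# Hypothetical linear predictors spanning negative to positive values.
alphas = [alpha_finite_variance(x) for x in (-5.0, 0.0, 3.0)]
```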

2.3.2 Estimating the Expected Number of Claims

The purpose of these single hurdle models is to determine the actuarially fair premium rates.

The actuarially fair premium rate is the expected value of the number of claims for each county,

which can be written as

$$E(C_{i,2012}) = P_{i,2012}\, E(C_{i,2012} \mid C_{i,2012} > 0), \tag{2.9}$$

where $C_{i,2012}$ is the annual number of claims for county $i = 1, \ldots, N$ in the year 2012. The

equation above states that the expected number of claims in a county is equal to the probability

of at least one flood claim in that county during 2012 times the expected number of claims for

the year 2012 conditioned on at least one flood insurance claim being filed. The actuarially fair

premiums can be determined by multiplying both sides of the equation above by the coverage. For

actuarially fair premiums, premiums should be equal to expected payouts.
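
The decomposition in (2.9) can be sketched in Python as below. Under the hurdle representation (2.1), the conditional mean given at least one claim is one plus the mean of the positive count component. The function names and all numeric inputs are hypothetical and are not estimates from this paper.

```python
def expected_claims(p_positive, mean_u):
    """E(C) = P * E(C | C > 0), where E(C | C > 0) = 1 + E(U) under (2.1)."""
    return p_positive * (1.0 + mean_u)


def fair_premium_total(p_positive, mean_u, coverage):
    """Multiply the expected number of claims by the coverage, as described above."""
    return expected_claims(p_positive, mean_u) * coverage


# Hypothetical county: 60% chance of at least one claim, E(U) = 11.5.
n_claims = expected_claims(0.6, 11.5)
premium = fair_premium_total(0.6, 11.5, 20_000.0)
```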

For simulating the count of flood insurance claims, we use posterior predictive sampling.

Posterior predictive sampling differs from sampling in classical statistics. Posterior predictive

sampling is a two part process. Since Bayesian methods treat parameters as random variables,

the first part of posterior predictive sampling is drawing parameters from the posterior distri-

bution. The samples of parameters drawn from the posterior distributions are then used in the

sampling distribution to draw random samples of observations. Posterior predictive sampling

allows us to take into account not only the uncertainty in the sampling distribution but also the

uncertainty in the parameters. After data are simulated from each type of single hurdle

model (the Poisson, Negative Binomial, and Riemann Zeta models), the expected number of

flood insurance claims for each county is estimated using Monte Carlo integration. Monte

Carlo integration is a method of numeric integration based on simulations that is popular in

the insurance industry.
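
The two-part sampling scheme and the Monte Carlo estimate of the expected count can be sketched in Python. This is a toy illustration, not the R/OpenBUGS implementation: the "posterior draws" are generated from made-up Beta and Gamma distributions rather than from a fitted model, the positive count component is assumed to be Poisson, and every name is hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson variate by Knuth's multiplication method (small lam only)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def posterior_predictive_mean(posterior_draws, rng):
    """Simulate one count per posterior draw (p, lam) and average the results."""
    total = 0
    for p, lam in posterior_draws:
        if rng.random() < p:                       # hurdle crossed: B = 1
            total += 1 + sample_poisson(lam, rng)  # C = 1 + U
    return total / len(posterior_draws)


rng = random.Random(0)
# Stand-in posterior draws (assumed): p ~ Beta(6, 4), lam ~ Gamma(shape 20, scale 0.1).
draws = [(rng.betavariate(6, 4), rng.gammavariate(20.0, 0.1)) for _ in range(20000)]
estimate = posterior_predictive_mean(draws, rng)
```

Because each simulated count uses a different parameter draw, the spread of the simulated counts reflects both sampling variability and parameter uncertainty. For these stand-in draws the estimate should land near 0.6 × (1 + 2.0) = 1.8.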

2.3.3 Modeling Indemnity Payments

For modeling indemnity payments, we use the entire monthly data set from January 1978

to June 2012. The indemnity payments are normalized to 2012 dollars. Average indemnity

payments are modeled as





$$M_{iq} = \begin{cases} R_{iq} / C_{iq} & \text{if } C_{iq} > 0 \\ \text{Not Available} & \text{if } C_{iq} = 0 \end{cases}, \tag{2.10}$$

where $C_{iq}$ is the number of flood claims and $R_{iq}$ is the total of indemnity payments for county

$i = 1, \ldots, N$ during month $q = 1, \ldots, Q$. $M_{iq}$ conditioned on $C_{iq}$ is modeled with the following

equation:

$$M_{iq} = \delta_i \log(1 + C_{iq}) + \eta_{i,0} + \sum_{j=1}^{B} \eta_{ij} z_{ijq} + \epsilon_{iq}, \tag{2.11}$$

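where $z_{ijq}$ are covariates for county $i = 1, \ldots, N$ and month $q = 1, \ldots, Q$. The county-level

random effects are modeled as

1. $(\delta_1, \ldots, \delta_N) \overset{iid}{\sim} N(\mu_\delta, \sigma_\delta^2)$

2. $(\eta_{1,0}, \ldots, \eta_{N,0}) \sim \text{CAR}(\rho, \tau^2)$

3. $(\eta_{i,1}, \ldots, \eta_{i,B}) \overset{iid}{\sim} N(\mu_\eta, \Sigma_\eta)$ for $i = 1, \ldots, N$.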

Here we have a spatially-varying coefficient model as described by Gelfand et al. (2003). The

coefficients vary with the counties since we expect coefficients to be affected by the property

values in each county, which in turn affects the average indemnity claims. For a detailed list of

the prior distributions used in this model, refer to the Appendix.
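
The construction of the average indemnity payment in (2.10) and the mean of the regression in (2.11) can be sketched in Python. The function and argument names are hypothetical, the error term is omitted so that only the conditional mean is computed, and the numbers are made up for illustration.

```python
import math


def average_indemnity(total_paid, n_claims):
    """M = R / C when C > 0; not available (None) when no claims were filed."""
    if n_claims == 0:
        return None
    return total_paid / n_claims


def mean_indemnity(n_claims, claim_coef, intercept, coefs, covariates):
    """Conditional mean of M: claim_coef * log(1 + C) + intercept + sum_j coef_j * z_j."""
    linear_part = sum(e * z for e, z in zip(coefs, covariates))
    return claim_coef * math.log1p(n_claims) + intercept + linear_part


# Hypothetical county-month: $50,000 paid across 4 claims.
m_observed = average_indemnity(50_000.0, 4)
m_missing = average_indemnity(0.0, 0)
```

Months without claims are kept as missing rather than set to zero, which matches the "Not Available" branch of (2.10).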

After estimating the model for the average indemnity payments for each county, average

indemnity payments are simulated using posterior predictive sampling. Using these simulated

payments in conjunction with the simulations of the count of flood insurance claims, we are able

to establish loss cost ratios for each county. The parameters $(\mu_\delta, \sigma_\delta^2, \rho, \tau^2, \mu_\eta, \Sigma_\eta)$ are estimated

using MCMC methods, and posterior inference is obtained.

2.3.4 Implementation

To perform the analysis, the open-source software packages R and OpenBUGS are employed. All modeling

is written in R. Using the R package R2OpenBUGS, the Single Hurdle Poisson

model and the Single Hurdle Negative Binomial model are exported to OpenBUGS to perform

the Bayesian updating using the Adaptive Rejection Sampling algorithm. This algorithm is

discussed by O'Hagan et al. (2004). The Single Hurdle Riemann Zeta model is estimated using

the Metropolis-Hastings algorithm, which is discussed by Gilks (1996); this model is not

exported to OpenBUGS because the Riemann Zeta function (a component of the Riemann Zeta

distribution) cannot be evaluated in OpenBUGS. Therefore, the modeling of the Riemann Zeta

distribution is performed completely in R. The modeling of the average indemnity payments is

also performed in OpenBUGS.

2.4 Data

Through two Freedom of Information Act (FOIA) requests, data concerning the National Flood

Insurance Program are obtained. Although the first data set obtained from the Federal Emer-

gency Management Agency (FEMA) contained monthly totals of indemnity payments and

counts of flood insurance claims for every county in the United States from January 1978 to

June 2012, the scope of our analysis is limited to Florida. Florida is chosen because approxi-

mately 40% of the flood insurance policies (over 2 million policies in 2012) in the United States

are within Florida and approximately 29% of households in Florida have a flood insurance pol-

icy. The data are aggregated annually for the models of the count of flood insurance claims, but

we analyze the monthly data when modeling the average indemnity payments. Premiums are

determined on an annual basis, hence the use of annual aggregation for the count model of flood

insurance claims. From the second FOIA request, the data for policies in force, coverage, and

premiums for each county in Florida were obtained. The minimum elevation2 of each county,

whether the county is coastal3, and the number of policies are used as covariates in the

link functions as well as in the model of the average indemnity payments. Note that for the Number of

Policies covariate, a log transformation is used for an easier interpretation of the covariate and

numerical stability.

Figure 2.1 and Figure 2.2 show maps relating to flood insurance policies and losses, respectively.

The coastal counties, especially in southern Florida and the panhandle, have experienced

the highest losses and have the highest number of policies in force. From casual observation

there appears to be spatial correlation, which supports our use of spatially-varying intercepts

in the link functions. The highest losses are in counties with a higher cost of living, such as

Destin in Okaloosa County (located in the panhandle) and Miami in Miami-Dade County on

the southeastern coast of Florida.

2.5 Results

2.5.1 Model Estimation

First we focus on the logit link, the estimation of $P_{it}$, which has the same results independent

of the choice of distribution for modeling the positive number of claims. To determine the

best covariates for the logit link, we use the Deviance Information Criterion (DIC). DIC is the

Bayesian analog of the Akaike Information Criterion (AIC). Like AIC, DIC penalizes overfitting,

and a smaller value of the criterion indicates a better fitting model. According to Table 3.2, the
2 The minimum elevations are obtained from the United States Geological Survey.
3 This variable is referred to as Coast in the Results section.

inclusion of the covariates Coast and Minimum Elevation does not improve the estimation

of the probability of at least one flood claim occurring. Therefore, when simulating the count

of claims for each county, the simplest model, containing only the log Number of Policies,

is used. Figure 2.3 shows the prior and posterior distribution of the coefficient on the

log Number of Policies for the best fitting logit link based on the DIC. One can see that

the prior distribution is very flat compared to the posterior distribution. According to the median

of the posterior distribution, a one-unit increase in the log Number of Policies covariate leads to

approximately a 0.69% increase in the odds of a claim being filed. Figure 2.4

shows the 2.5th, 50th, and 97.5th percentiles of the posterior distributions of the spatial intercepts.

The estimates of the spatial intercepts align with previous expectations that the counties most

prone to hurricane exposure have higher intercepts. The maps for the 50th and 97.5th percentiles

show the coastal counties in the southern end of the peninsula, including Miami-Dade, Palm

Beach, and Monroe, have the highest spatial intercepts. Other hurricane-prone counties, such

as Hillsborough and Pinellas, also have high spatial intercepts.

To compare the positive count components of the single hurdle models (the Poisson, Negative

Binomial, and Riemann Zeta distributions), we use DIC as well as Chi-Squared statistics.

DIC is used to compare the distributions to each other and to select which covariates to include. The

Chi-Squared statistics are computed after selecting the best covariates to include as indicated

by DIC. The Chi-Squared statistics also include the logit portion of the single hurdle models,

but the same logit link is used for all three distributions, so differences in the Chi-Squared

statistics reflect only the choice of the positive count distributions. Table 2.4 presents the DIC

for the log links of the Poisson, Negative Binomial, and Riemann Zeta components of the single

hurdle models with different combinations of the covariates. To assess the predictive abilities

of these models, we use the omnibus Chi-Squared statistics described by Gelman et al. (2004).

Table 2.5 shows Chi-Squared statistics calculated using all 2278 county-year observations, such

that

$$\sum_{i=1}^{67} \sum_{t=1}^{34} \frac{\bigl(c_{i,t} - E(C_{i,t} \mid \theta_{i,t})\bigr)^2}{\operatorname{Var}(C_{i,t} \mid \theta_{i,t})}, \tag{2.12}$$

where $c_{i,t}$ is the observed count of flood insurance claims for County $i$ during Year $t$. $E(C_{i,t} \mid \theta_{i,t})$

and $\operatorname{Var}(C_{i,t} \mid \theta_{i,t})$ are calculated from the simulated claims described in the next section. Intuitively,

the model with the lowest Chi-Squared statistic fits best.
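
The omnibus statistic in (2.12) can be sketched in Python by replacing the expectation and variance with the sample mean and sample variance of simulated counts for each county-year cell. The function name, the data layout, and the toy numbers are assumptions made for illustration only.

```python
def chi_squared(observed, simulated):
    """Sum over cells of (observed - mean of sims)^2 / variance of sims.

    observed[k] is the observed count for cell k and simulated[k] is the
    list of simulated counts for the same cell. A cell whose simulations
    are all identical has zero variance and would raise ZeroDivisionError.
    """
    stat = 0.0
    for c, sims in zip(observed, simulated):
        mean = sum(sims) / len(sims)
        var = sum((s - mean) ** 2 for s in sims) / (len(sims) - 1)
        stat += (c - mean) ** 2 / var
    return stat


# Two toy cells with a handful of simulated counts each.
stat = chi_squared([5, 12], [[1, 2, 3, 4, 5], [8, 10, 12]])
```

In the toy example the first cell contributes (5 - 3)^2 / 2.5 and the second contributes (12 - 10)^2 / 4.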

Insurance applications are mainly concerned with the average of many years. Even if a model

does not predict a flood with a 25-year return interval in a given year, the insurance company

can recover provided the flooding event occurs on average every 25 years. Therefore, we also

calculate the Chi-Squared statistic for the average count of flood insurance claims for each

county over the sample period, such that

$$\sum_{i=1}^{67} \frac{\bigl(c_i - E(C_i \mid \theta)\bigr)^2}{\operatorname{Var}(C_i \mid \theta)}. \tag{2.13}$$

According to DIC, for the Poisson and Negative Binomial models, the log links that only have

the log Number of Policies as the covariate are the best fitting, while the Riemann Zeta

model has the lowest DIC when the log link contains the covariates log Number of Policies

and Coast. According to DIC the best fitting positive count model is the Negative Binomial

model and the worst fitting model is the Poisson model. The Chi-Squared statistics using the

county-year observations reflect the DIC for the log links. However, looking at the Chi-Squared

statistics for the average count of the flood insurance claims for each county tells a different

story. Here the Chi-Squared statistic indicates that the Riemann Zeta model is the worst fitting, while

the Negative Binomial and Poisson models show much better fits.

Now we examine the posterior distributions of the coefficients for the best fitting log links,

according to DIC, for each positive count distribution. The median of the posterior distribution

for the log Number of Policies for the Poisson model, as seen in Figure 2.5, tells us that a 1

unit increase in the log number of flood insurance policies will increase the number of claims

by approximately .0072 given that at least one claim has occurred. The coefficient for the log

Number of Policies for the Negative Binomial model is slightly lower than the Poisson model.

According to the posterior median for the Negative Binomial model with the lowest DIC as

seen in Figure 2.7, a 1 unit increase in the log number of policies will increase the expected

number of claims by .0068 given at least one flood claim is filed. Also Figure 2.7 shows that

the posterior distribution of r for the Negative Binomial model is relatively close to zero, which

indicates the Negative Binomial model substantially differs from the Poisson model. Recall

that as $r \to \infty$, the Negative Binomial model collapses into the Poisson model. Finally, for the

Riemann Zeta model with the lowest DIC, the coefficient for the log Number of Policies has

a median of approximately 1.1, which indicates a 1 unit increase in the log number of policies

will increase the number of claims by .0011 given that at least one flood claim has been filed as

seen in Figure 2.9. Figure 2.9 also shows the prior and posterior distribution for the covariate

Coast of the Riemann Zeta model with the lowest DIC. As expected, being a coastal county

has a positive impact on the number of flood claims. According to the median of the posterior

distribution of Coast, being a coastal county increases the expected count of flood insurance

claims by approximately .5 claims being filed given at least one claim has occurred. Figures

2.6, 2.8, 2.10 for the Poisson, Negative Binomial, and Riemann Zeta models, respectively, show

the 2.5%, 50%, and the 97.5% percentile maps for the spatial intercepts. The magnitude of

the spatial intercepts for the Riemann Zeta model seems arbitrary. However, the Poisson and

Negative Binomial models have the highest spatial intercepts in the coastal counties of the

panhandle. This is interesting because the coastal counties of the panhandle have the highest

loss cost ratios for flood insurance in Florida. The alignment of the spatial intercepts and high

historical losses shows promise in these models.

2.5.2 Simulating Claims and Indemnity Payments

To estimate the actuarially fair premium rate, which is the expected number of claims in a

county, we simulate data from the best fitting SHP, SHNB, and SHRZ models, according to DIC.

Table 2.7 shows the expected average number of claims per county over the sample period

according to the NFIP and the three single hurdle models. Note the values for the NFIP are

the values predicted by the NFIP and are not the historical averages. Here we see what is

reflected by the Chi-Squared statistics for the county average. The SHP and SHNB models

are closer to the historical average compared to the SHRZ. Figure 2.11 shows the difference

between the expected average number of claims and the historical average number of claims

for each county. Surprisingly, the Single Hurdle Poisson model is the best-fitting model not

only compared to the other two single hurdle models but also the current ratings used by the

National Flood Insurance Program. The SHNB model has expected county estimates both far

below the historical average (underestimating by 301 claims in Escambia County) and far above

the historical averages for counties (overestimating by over 1,800 claims in Broward County).

The expected average number of claims for the SHRZ is consistently overestimated, indicating

the tail of this distribution may be too long.

2.5.3 Indemnity Payments

We find that for modeling the average indemnity payments, the simplest model proves to be the

best fitting, as seen in Table 2.8. The only covariate used is the log of the Number of Claims.

Figure 2.12 shows the 2.5th, 50th, and 97.5th percentiles of the historical average indemnity

payment (provided a flood occurred) for each county over the sample period. Interestingly, the

median for the average indemnity falls between $2,238 and $18,845, which is substantially lower

than the residential and non-residential coverage limits of $300,000 and $1,500,000, respectively.

This can be compared to the simulated average indemnity payments in Figure 2.13, which shows

the 2.5th, 50th, and 97.5th percentiles of the average simulated indemnity payments for each

county. We see consistent regional trends in both the historical indemnity payments and the simulated

data. The panhandle counties have very high average indemnity payouts. This area

is known for vacation destinations, such as Destin in Okaloosa County. Monroe County, located

at the southern end of the peninsula, is the county containing the Florida Keys, which is also

well-known for vacation homes and resorts. Note that the highest simulated average indemnity

payment at the 97.5th percentile, $1,530,106, could not be paid out entirely by the NFIP due

to limits on coverage. However, individuals would be able to purchase flood insurance from a

private firm for damage above the NFIP limit.

2.5.4 Loss Cost Ratio

Now that we have modeled both the number of claims and the average indemnity payments,

we can use this information to develop loss cost ratios (LCR) to determine the sustainability

of the program under the current ratings and the single hurdle models presented in this paper.

The loss cost ratio is defined as the total amount paid out in indemnity payments divided by the

total collected in premiums. The ideal ratio is one because then the amount paid out is equal to

the amount paid in to the insurance program. Therefore, to obtain the premiums implied by each model, we multiply the average expected number

of claims per county by the median simulated average indemnity payment.

Figure 2.14 and Table 2.9 show the LCRs for each county given the observed data and the three

single hurdle models. These loss cost ratios are based on the aggregation of the 34 years in the

sample period. For the observed LCR, half of the counties have paid in at least three

times as much in premiums as has been distributed in indemnity payments, while the counties in the

panhandle have LCRs greater than one. Not surprisingly, the LCRs using the SHRZ model are

the lowest when compared to the observed LCR and the LCRs of the other single hurdle models,

because the SHRZ model had the highest estimated number of claims.

The SHNB model produces the highest loss cost ratio, 125.5 for Escambia County.

For the SHP model, most counties have LCRs greater than one, but the

extremely low or extremely high LCRs seen with the other single hurdle

models do not appear.
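
The arithmetic behind these ratios can be sketched in a few lines of Python. The function names and the dollar figures below are hypothetical and serve only to illustrate the definition; they are not values from the sample.

```python
def model_premium(expected_claims, median_indemnity):
    """Premium implied by a model: expected claim count times a typical payment."""
    return expected_claims * median_indemnity


def loss_cost_ratio(total_indemnities, total_premiums):
    """Indemnities paid out divided by premiums collected."""
    if total_premiums <= 0:
        raise ValueError("premiums must be positive")
    return total_indemnities / total_premiums


# Hypothetical county: 50 expected claims per year, a $6,000 median payment,
# and $450,000 of observed indemnities per year.
premium = model_premium(50, 6_000.0)
lcr_base = loss_cost_ratio(450_000.0, premium)

# Holding indemnities and the payment size fixed, halving the expected
# number of claims halves the premium and doubles the ratio.
lcr_half = loss_cost_ratio(450_000.0, model_premium(25, 6_000.0))
print(lcr_base, lcr_half)
```

This inverse relationship appears consistent with Tables 2.7 and 2.9: for Escambia County the SHP and SHNB expected claims differ by a factor of about 11.5, as do the corresponding loss cost ratios.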

2.6 Discussion

Flood insurance maps are infrequently updated due to the high cost. New construction in a

community can greatly affect the flood currents and the elevation of flood water. Out-of-date

flood maps cannot reflect the changes in water flow caused by new construction.

Because flood maps cannot be updated frequently, less costly complementary methods, such

as the predictions from the SHP model, should be explored and used in conjunction with the

flood maps or applied on their own at a finer resolution. The use of Bayesian methods allows

for flexible modeling that can easily allow additional information about the parameters to be

included. More importantly, Bayesian methods account for estimation uncertainty when making

predictions.

Eight years after levee breaches annihilated Greater New Orleans, the effort to maintain

levees seems to have fallen by the wayside. There are no Acceptable⁴ levees within the state of

Florida. Several levees in urban areas are considered Unacceptable, and more than a dozen rural

and agricultural levees are Unacceptable (USACE, 2013). In fact, approximately one million

acres in Florida are protected by levees in Unacceptable condition. This is problematic since

current flood insurance rates are developed under the assumption that the levees will hold and

do not take into consideration the probability of a breach. For this reason, we propose adding

a component to the flood insurance premium that incorporates the element of risk involved in

a levee breaking. The benefit of this added component would be two-fold: 1) premiums would

better reflect the risk of flooding and 2) policyholders would have more information on the risk

of flooding.

2.7 Conclusion

In this paper, we explore the issue of systemic risk through spatial modeling along with other

issues faced by the National Flood Insurance Program. Modeling the count of flood insurance

claims, we calculate the actuarially fair premiums. Three single hurdle models are considered

for the estimation of the annual number of flood insurance claims in each county of Florida.

Parameters of the models are allowed to spatially vary with the counties of Florida. Although the

Single Hurdle Poisson model does not predict the best on a yearly basis, the average prediction

of this model over the sample period is closer to the historical average than the other single hurdle

models and the current rating of the NFIP. Although the loss cost ratios from our estimation are

higher than one, with data on home values it will be possible to better estimate indemnity

payments. Home values are expected to be directly proportional to the average indemnity
⁴ Table 2.10 lists descriptions of these ratings.

payments. The inclusion of home values would be beneficial because neighboring counties may

have drastically different median home prices in some regions, but in other regions neighboring

counties may have similar home prices. Therefore, home values may not currently be properly

accounted for in the spatially-varying intercepts.

This analysis has allowed for areas of interest to be determined and further investigated.

Further research will utilize these single hurdle models at a finer spatial resolution within a

given county, preferably at the ZIP code or census tract level. This will allow us to explore socio-

economic issues within the flood insurance program. We would also like to include information

about the Community Rating System (CRS) to determine the effectiveness of CRS and its

impact on market penetration. The use of the single hurdle model can extend beyond flood

insurance claims. Other insurance claims caused by natural disasters, such as wildfires or landslides, can be described by

single hurdle models. Ten of the thirty costliest⁵

hurricanes recorded since 1900 and seven of the thirty most intense⁶ recorded since 1851 have

occurred within the last 25 years, so recent history indicates that the modeling of catastrophic events

will continue to be a relevant topic (Blake and Gibney, 2011).


⁵ This estimate is adjusted for inflation, population, and wealth normalization and only includes hurricanes in
the United States.
⁶ Intensity is defined solely by the lowest central pressure of the storm. These records only include hurricanes
that made landfall in the United States.

2.8 Tables and Figures

Table 2.1: High risk areas have at least a 1% annual probability of flooding. These areas are
referred to as 100-year floodplains. Zones labeled A are for inland areas, while zones labeled
V are reserved for coastal areas. Moderate risk areas are referred to as 500-year floodplains.

Zone            Risk            Annual Probability of a Flood
A               High            1%
V               High            1%
B               Moderate        between 0.2% and 1%
X (shaded)      Moderate        between 0.2% and 1%
X (unshaded)    Minimal         < 0.2%
C               Minimal         < 0.2%
D               Undetermined

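Table 2.2: Functional Forms of the Considered Probability Mass Functions. The function $\Gamma(\cdot)$ is defined as
$\Gamma(n) = (n-1)!$ and the function $\zeta(\cdot)$ is defined as
$\zeta(\mu+1) = \sum_{k=1}^{\infty} \frac{1}{k^{\mu+1}}$.

Distribution         $f(U_{it} = u, \mu_{it})$                                                           Mean                                   Variance $\sigma^2$
Poisson              $\frac{\mu^{u}}{u!} e^{-\mu}$                                                       $\mu$                                  $\mu$
Negative Binomial    $\frac{\mu^{u}\,\Gamma(r+u)}{u!\,\Gamma(r)\,(r+\mu)^{u}} \frac{1}{(1+\mu/r)^{r}}$   $\mu$                                  $\frac{\mu(r+\mu)}{r}$
Riemann Zeta         $\frac{(u+1)^{-(\mu+1)}}{\zeta(\mu+1)}$                                             $\frac{\zeta(\mu)}{\zeta(\mu+1)}$      $\frac{\zeta(\mu+1)\zeta(\mu-1) - \zeta(\mu)^{2}}{\zeta(\mu+1)^{2}}$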

Figure 2.1: Maps Relating to Policies. Panels: (a) Policies in Force; (b) Premiums for 2012.

Figure 2.2: Maps Relating to Losses. Panels: (a) Average Number of Claims; (b) Total of Premiums - Total of Losses (in millions of dollars).

Figure 2.3: The prior and posterior distributions of the covariate Number of Policies for the
logit link. Panel title: Logit Link: Number of Policies.

Figure 2.4: For the logit link with the lowest DIC, the 2.5th, 50th, and 97.5th percentiles of spatial intercepts are shown above. Panel titles: Spatial Intercepts: 2.5th Percentile; Spatial Intercepts: 50th Percentile; Spatial Intercepts: 97.5th Percentile.

Table 2.3: DIC for Logit Links

Covariates                                DIC
Policies                                  2157.2
Coast + Policies                          2157.1
Minimum Elevation + Policies              2157.0
Coast + Minimum Elevation + Policies      2157.3

Table 2.4: DIC for the Log Links

Covariates                                Poisson    Negative Binomial    Riemann Zeta
Policies                                  447672     11135                13485
Coast + Policies                          447680     11142                13483
Minimum Elevation + Policies              447682     11137                13520
Coast + Minimum Elevation + Policies      447682     11143                13500

Table 2.5: The Chi-Squared Statistics for Observations of County-Year Data

Model                Chi-Squared Statistic
Poisson              1539811
Negative Binomial    15298
Riemann Zeta         41945
Table 2.6: The Chi-Squared Statistics for data averaged over the years for each county.

Model                Chi-Squared Statistic
Poisson              0.0258
Negative Binomial    0.0001
Riemann Zeta         115.85

Figure 2.5: The prior and posterior distributions of the covariate Number of Policies for the
log link of the SHP model. Panel title: Poisson Model: Number of Policies.

Table 2.7: This table shows the expected losses. The first column is an alphabetical list of all
the counties in Florida. The second column is the annual average for historical claims. The third
is an approximation of the annual number of claims the NFIP expects. The last three columns
show the results of the simulations for each type of hurdle model. For the SHP and SHNB,
the simulations for the models with the lowest DIC are shown.

County Average NFIP Poisson Neg. Binom. Riem. Zeta


Alachua 2.26 4.51 3.07 19.88 88.55
Baker 1.18 0.28 3.92 4.41 6.02
Bay 120.00 38.51 126.60 68.19 713.63
Bradford 1.15 1.00 2.51 2.86 20.94
Brevard 59.85 56.18 61.15 138.37 1077.98
Broward 323.53 401.59 321.59 2141.05 2220.60
Calhoun 3.44 0.43 8.84 2.41 12.74
Charlotte 38.68 74.24 45.31 425.25 638.25
Citrus 84.53 17.95 110.25 35.67 279.02
Clay 12.71 6.48 16.39 9.12 164.14
Collier 25.94 137.95 33.56 70.24 933.36
Columbia 4.82 21.04 7.43 36.80 678.32
DeSoto 4.76 24.85 7.98 126.60 716.50
Dixie 14.47 25.83 27.12 159.50 657.00
Duval 51.24 35.03 51.69 735.71 651.72
Escambia 306.53 27.28 330.83 28.68 513.08
Flagler 6.29 10.91 8.77 146.02 194.28
Franklin 45.29 10.47 67.81 31.47 154.42
Gadsden 0.53 0.23 1.68 7.74 7.17
Gilchrist 5.24 0.95 15.92 2.68 12.88
Glades 0.68 1.50 2.10 19.74 15.64
Gulf 18.21 2.61 29.16 73.89 101.92
Hamilton 2.68 0.18 9.93 2.13 4.30
Hardee 0.91 0.24 1.92 1.73 8.66
Hendry 0.85 3.46 2.05 32.41 46.22
Hernando 37.56 10.51 45.65 10.75 196.93
Highlands 1.18 1.36 2.52 13.77 34.17
Hillsborough 115.71 115.21 120.35 641.58 949.81
Holmes 4.24 0.27 9.83 22.22 9.04
Indian River 0.00 30.46 0.00 0.00 0.00
Jackson 0.94 0.29 1.91 6.90 12.23
Jefferson 0.09 0.15 1.20 6.07 1.55
Lafayette 5.85 0.47 12.66 1.89 13.70
Lake 2.44 5.40 3.57 14.89 95.54
Lee 158.85 240.24 164.40 540.46 1419.95
Leon 18.21 5.86 21.38 47.21 157.74
Levy 16.50 5.94 24.46 24.70 89.66
Liberty 0.15 0.14 1.40 3.69 0.87
Madison 1.94 0.21 7.13 2.88 4.49
Manatee 89.41 83.12 90.67 74.12 800.20
Marion 2.12 6.63 3.12 10.57 72.09
Martin 39.68 25.43 46.69 167.27 509.68
Miami-Dade 925.32 438.62 921.57 2241.40 2011.35
Monroe 369.88 97.30 387.78 68.87 843.02
Nassau 7.44 11.81 10.33 130.62 266.05
Okaloosa 144.62 26.85 165.69 67.40 531.28
Okeechobee 3.79 3.86 13.54 3.67 36.51
Orange 10.26 18.62 9.84 19.22 405.14
Osceola 3.71 9.53 5.36 19.82 151.06
Palm Beach 115.12 149.72 117.86 690.64 1616.32
Pasco 161.15 68.41 159.21 319.36 728.16
Pinellas 457.68 299.76 455.14 218.25 1635.27
Polk 13.15 12.47 13.25 23.87 249.92
Putnam 3.94 2.68 7.48 69.36 47.65
Santa Rosa 160.88 13.88 206.90 36.92 354.78
Sarasota 87.03 98.95 89.33 64.82 888.81
Seminole 9.18 8.50 9.42 11.29 229.82
St. Johns 18.00 45.11 18.39 125.03 598.11
St. Lucie 68.62 22.16 86.62 55.85 567.66
Sumter 1.21 2.81 2.04 18.81 48.25
Suwannee 4.97 1.26 9.43 13.95 29.06
Taylor 6.26 2.49 13.18 10.53 33.09
Union 0.50 0.11 2.63 1.63 1.31
Volusia 71.53 54.94 75.58 730.57 720.28
Wakulla 30.32 6.01 56.75 23.93 81.32
Walton 47.65 20.59 63.58 134.57 323.04
Washington 3.15 0.23 7.25 31.43 12.32

Figure 2.6: For the SHP model with the lowest DIC, the 2.5th, 50th, and 97.5th percentiles of random effect intercepts are shown above. Panel titles: Intercepts for Poisson Model: 2.5th Percentile; 50th Percentile; 97.5th Percentile.

Figure 2.7: The prior and posterior distributions of r and the covariate Number of Policies for
the log link of the SHNB model. Panel titles: Neg. Bin. Model: r; Neg. Bin. Model: Number of Policies. Vertical axis: Density.

Figure 2.8: For the SHNB model with the lowest DIC, the 2.5th, 50th, and 97.5th percentiles of spatial intercepts are shown above. Panel titles: Intercepts for Neg. Bin. Model: 2.5th Percentile; 50th Percentile; 97.5th Percentile.

Figure 2.9: The prior and posterior distributions of the coefficient for log number of policies
and Coast for the SHRZ model. Panel titles: Riemann Zeta: Number of Policies; Riemann Zeta: Coast.

Figure 2.10: For the SHRZ model, the 2.5th, 50th, and 97.5th percentiles of spatial intercepts are shown above. Panel titles: Intercepts for R. Zeta Model: 2.5th Percentile; 50th Percentile; 97.5th Percentile.

Figure 2.11: Maps for Expected Losses. Panel titles: Difference between the NFIP and the Historical Mean; Difference between the SHP and the Historical Mean; Difference between the SHNB and the Historical Mean; Difference between the SHRZ and the Historical Mean.

Figure 2.12: Percentiles for the average indemnity payments (adjusted to 2013 dollars) for a flood insurance claim. Panel titles: 2.5th Percentile of Average Losses; 50th Percentile of Average Losses; 97.5th Percentile of Average Losses.

Table 2.8: DIC for the Indemnity Payment Models

Covariates                                       DIC
Number of Claims                                 11640.0
Coast + Number of Claims                         83780.0
Minimum Elevation + Number of Claims             83770.0
Coast + Minimum Elevation + Number of Claims     83780.0

Figure 2.13: Percentiles for the simulated average indemnity payments (adjusted to 2013 dollars) for a flood insurance claim. These are
based on the model for average indemnity payments, which had the lowest DIC. Panel titles: 2.5th Percentile of Simulated Average Losses; 50th Percentile of Average Losses; 97.5th Percentile of Simulated Average Losses.

Figure 2.14: Maps of loss cost ratios. Panel titles: Observed Loss Cost Ratio; SHP Loss Cost Ratio; SHNB Loss Cost Ratio; SHRZ Loss Cost Ratio.

Table 2.9: Loss cost ratios

County NFIP Poisson Neg. Bin. R. Zeta


Alachua 0.1850 3.2391 0.4997 0.1122
Baker 1.9692 1.2121 1.0784 0.7897
Bay 1.9616 11.6019 21.5397 2.0582
Bradford 0.1893 1.2120 1.0651 0.1453
Brevard 0.1482 5.0855 2.2475 0.2885
Broward 0.0599 2.3746 0.3567 0.3439
Calhoun 2.5284 0.4798 1.7566 0.3330
Charlotte 0.0563 2.6580 0.2832 0.1887
Citrus 1.5603 4.8113 14.8701 1.9010
Clay 0.3659 3.0090 5.4046 0.3004
Collier 0.0304 3.0859 1.4742 0.1109
Columbia 0.0052 1.6806 0.3395 0.0184
DeSoto 0.0056 0.7665 0.0483 0.0085
Dixie 0.0173 1.6059 0.2731 0.0663
Duval 0.2132 5.6419 0.3964 0.4474
Escambia 7.0812 10.8792 125.5146 7.0148
Flagler 0.0740 2.3378 0.1404 0.1055
Franklin 0.9029 2.9357 6.3260 1.2892
Gadsden 0.4683 0.0548 0.0119 0.0128
Gilchrist 1.8067 1.0301 6.1134 1.2724
Glades 0.0665 0.5673 0.0603 0.0762
Gulf 1.3911 2.5224 0.9953 0.7216
Hamilton 4.8751 0.9018 4.1965 2.0812
Hardee 0.8103 1.4223 1.5812 0.3150
Hendry 0.0733 1.2269 0.0777 0.0545
Hernando 1.2077 5.1408 21.8310 1.1916
Highlands 0.1427 1.2810 0.2346 0.0946
Hillsborough 0.1743 4.4056 0.8264 0.5582
Holmes 3.6767 1.4889 0.6586 1.6179
Indian River 0 NaN NaN NaN
Jackson 0.6341 1.2761 0.3537 0.1997
Jefferson 0.1214 0.3109 0.0616 0.2410
Lafayette 3.0307 0.8920 5.9688 0.8242
Lake 0.0801 0.7551 0.1812 0.0282
Lee 0.0975 4.3207 1.3142 0.5002
Leon 0.6832 2.7911 1.2642 0.3784
Levy 0.8473 2.4784 2.4545 0.6761
Liberty 1.0866 0.1123 0.0427 0.1818

Madison 3.2331 0.4214 1.0450 0.6693
Manatee 0.1312 2.2764 2.7847 0.2579
Marion 0.1714 2.4123 0.7118 0.1043
Martin 0.3574 5.4230 1.5137 0.4968
Miami-Dade 0.3803 5.4900 2.2573 2.5155
Monroe 0.9786 5.2262 29.4271 2.4040
Nassau 0.0787 2.1565 0.1706 0.0837
Okaloosa 3.5213 4.8498 11.9226 1.5125
Okeechobee 0.2043 0.9852 3.6353 0.3655
Orange 0.0604 3.0831 1.5789 0.0749
Osceola 0.0398 1.6716 0.4520 0.0593
Palm Beach 0.0737 3.3779 0.5765 0.2463
Pasco 0.5340 6.6278 3.3042 1.4492
Pinellas 0.2460 4.1228 8.5976 1.1475
Polk 0.3071 2.6789 1.4869 0.1420
Putnam 0.3103 2.0286 0.2188 0.3185
Santa Rosa 9.1296 12.6430 70.8486 7.3732
Sarasota 0.1140 3.6114 4.9771 0.3630
Seminole 0.1854 1.1860 0.9902 0.0486
St. Johns 0.0569 3.8238 0.5625 0.1176
St. Lucie 0.7955 5.0000 7.7550 0.7629
Sumter 0.1896 0.6946 0.0754 0.0294
Suwannee 1.1014 0.2253 0.1523 0.0731
Taylor 0.9442 2.2554 2.8236 0.8984
Union 1.3929 0.4754 0.7638 0.9542
Volusia 0.3013 6.2500 0.6466 0.6558
Wakulla 1.5896 2.8574 6.7761 1.9943
Walton 1.0822 5.9664 2.8189 1.1743
Washington 3.2510 1.2846 0.2962 0.7555

Table 2.10: Levee Rating System implemented by the United States Army Corps of Engineers
(USACE)

Rating                  Description
Acceptable              All inspection criteria are rated as Acceptable.
Minimally Acceptable    One or more inspection criteria are rated as Minimally Acceptable or Unacceptable. However,
                        the levee is deemed likely to withstand the next major flooding event.
Unacceptable            The levee is deemed unlikely to hold during the next major flooding event. The problems must
                        be corrected within two years.

Chapter 3

Risk Management in Wheat Production

3.1 Introduction

In 2011 and 2012 severe droughts caused extensive crop damage throughout the Midwest. During

2011, news networks were flooded with stories of cattle ranchers unable to feed their herds due to

the shortage of feed. The following year proved to be disastrous as well. The loss cost ratio

(LCR)¹ for corn in 2012 was 2.82, which translates to $12.7 billion of indemnity payments paid

to producers (Summary of Business, 2014). These recent events combined with concerns for

climate change have led to a growing focus on risk management in agriculture. The increasing

emphasis on risk management is reflected in the 2014 Farm Bill, which replaces direct payments

with shallow loss programs.

For this paper we turn our attention to winter wheat production in Kansas. Historically,

the majority of winter wheat in Kansas has been produced without irrigation,

also known as dryland production. Although dryland wheat production is typically more

cost-effective than irrigated production, dryland producers cannot use irrigation to mitigate

damages if a drought strikes Kansas. Currently, irrigated wheat and dryland wheat have different
¹ This ratio is indemnity payments divided by premiums.

benchmark yields for crop insurance guarantees, but these benchmarks do not account for differences

in the variances or correlations of yields caused by the different practices. If differences

in variances and correlations are not properly accounted for in insurance ratings, premiums will

be inaccurate due to incorrect probabilities and expected loss estimates.

Using spatial models for winter wheat yields in Kansas, we investigate the ratings of the crop

insurance policies as well as expected payouts from the Agricultural Risk Coverage program established

under the 2014 Farm Bill. We model irrigated winter wheat and dryland wheat separately since

these practices have different benchmarks. The data are censored because some counties during

certain years did not plant winter wheat. For this reason we use a Bayesian version of a tobit

model. This model allows us to estimate the probability of an observation being censored. We also

look for changes in the spatial relationship among county yields since yields tend to be more

spatially correlated during times of drought or other natural disasters.

3.2 Risk Management in Agriculture

The Federal Crop Insurance Corporation (FCIC) underwrites crop insurance policies, which

are then sold by private firms, called Approved Insurance Providers (AIPs), to producers.

These policies insure producers against any form of natural disaster that affects crop production.

Policy guarantees are typically based on revenue or yield. Common coverage levels are 65% and

75%, although some crops and areas may be insured at 85% coverage. The Standard Reinsurance

Agreement developed by the Risk Management Agency (RMA) of the United States Department

of Agriculture (USDA) determines the share of losses paid by the AIPs and the share of losses paid by

the FCIC. For a list of policies underwritten by the FCIC, refer to Table 3.1. The most popular

of these policies is Revenue Protection, which makes up over 80% of all crop insurance policies.

The current methodology for rating COMBO² policies is outlined by Coble et al. (2010).

COMBO insurance ratings begin with the calculation of the unloaded target rate, which is

a function of loss cost ratios (LCRs) for the county of interest and its neighboring counties.
² COMBO is an umbrella term for yield-based and revenue-based policies.

The loss cost ratio for a county is the ratio of the indemnity payments paid to producers over the

premiums collected for the given county. The unloaded target rate is the anchor rate for insurance policies

within the county. The rate is referred to as unloaded because it is calculated without the

highest 10% of losses for the counties. These large losses are accounted for in the catastrophic

loading. The unloaded target rate is a weighted average of the historical LCRs of the county

and its neighbors, with weights calculated using the Buhlmann method, which is defined as

$$R = ZX + (1 - Z)\mu \qquad (3.1)$$

where

$$Z = \frac{P}{P + K}$$

and

1. $R$: county unloaded target rate

2. $Z$: Buhlmann credibility factor

3. $X$: sample mean of the county of interest

4. $\mu$: the mean of the adjusted LCR of the county group

5. $P$: exposure units

6. $K = \sigma_X^2 / \sigma_\mu^2$

(a) $\sigma_X^2$ is the sample variance of the adjusted LCR for the county of interest.
(b) $\sigma_\mu^2$ is the sample variance of the adjusted LCR for the county group.
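
The credibility weighting in Equation 3.1 can be sketched as follows. This is a minimal illustration in Python rather than the RMA's implementation; the function name, argument names, and input values are hypothetical.

```python
def buhlmann_rate(county_mean, group_mean, exposure, var_county, var_group):
    """Return (R, Z): the credibility-weighted rate and the credibility factor.

    K is the county variance divided by the county-group variance,
    Z = P / (P + K), and R = Z * X + (1 - Z) * mu, as in Equation 3.1.
    """
    if var_group <= 0:
        raise ValueError("group variance must be positive")
    k = var_county / var_group
    z = exposure / (exposure + k)
    rate = z * county_mean + (1.0 - z) * group_mean
    return rate, z


# Hypothetical inputs: county LCR mean 0.10, county-group mean 0.06,
# 30 exposure units, county variance 0.02, county-group variance 0.002.
rate, z = buhlmann_rate(0.10, 0.06, 30, 0.02, 0.002)
print(round(z, 4), round(rate, 4))
```

As the exposure grows, Z approaches one and the rate moves toward the county's own experience; with little exposure the rate stays close to the county-group mean.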

Once the unloaded target rate has been established, COMBO policies are rated with the

Iman and Conover (1982) procedure. The Iman and Conover procedure generates correlated random

draws of yields for a given county and price deviates. These correlated random draws of yield

and price deviates are then used to establish the premium rate for 65% coverage. The premium

rate is the expected loss divided by liability, which can be defined as $P(Y < \lambda\hat{Y})\,[\lambda\hat{Y} - E(Y \mid Y < \lambda\hat{Y})]/(\lambda\hat{Y})$, where

$Y$ is the realized yield, $\hat{Y}$ is the predicted yield, and $\lambda$ is the coverage level (Goodwin and Ker,

1998).
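
A yield-based version of this calculation (expected loss divided by liability) can be sketched from a set of simulated yields. The code below is an illustration under simplifying assumptions: it ignores prices, treats every simulated draw as equally likely, and uses hypothetical yields; it is not the Iman and Conover procedure itself.

```python
def premium_rate(simulated_yields, predicted_yield, coverage):
    """Expected loss divided by liability for a yield guarantee."""
    guarantee = coverage * predicted_yield  # liability per acre, in bushels
    losses = [max(guarantee - y, 0.0) for y in simulated_yields]
    expected_loss = sum(losses) / len(losses)
    return expected_loss / guarantee


# Hypothetical simulated yields (bushels per acre) with a predicted yield of 40.
draws = [20.0, 30.0, 40.0, 50.0, 60.0]
rate_65 = premium_rate(draws, predicted_yield=40.0, coverage=0.65)
rate_85 = premium_rate(draws, predicted_yield=40.0, coverage=0.85)
print(rate_65, rate_85)
```

Raising the coverage level increases both the probability of a loss and its size, so the rate at 85% coverage is higher than the rate at 65% coverage for the same draws.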

Disaster assistance for farmers was first established in 1938. For many years crop insurance

was offered for only a few crops and remained rather experimental. However, modern-day crop

insurance was established by the Federal Crop Insurance Act of 1980. The legislation created

the Federal Crop Insurance Corporation under the jurisdiction of the Risk Management Agency.

Also the Federal Crop Insurance Act of 1980 permitted 30% of premiums to be subsidized for

65% coverage policies (History of Crop Insurance Program, 2014). The federal crop insurance

program floundered through the 1980s and was on the brink of extinction in the early 1990s, when

the program was revitalized by the Federal Crop Insurance Reform Act of 1994 (Glauber, 2004).

This new legislation permitted premium subsidies for higher coverage levels, created catastrophic

(CAT) coverage, and made program participation mandatory. However, the mandatory

participation requirement was repealed in 1996. The 2000 Farm Bill allowed private

entities to carry out research and create new insurance products through a partnership with

RMA (History of Crop Insurance Program, 2014). The most recent agriculture legislation, the 2014

Farm Bill, made notable changes involving RMA.

One characteristic that sets crop insurance apart from other non-life insurance is the potential

for systemic risk. Systemic risk is the risk of losses occurring simultaneously and dependently,

such as in the event of a natural disaster. Natural disasters require insurance firms

to possess very large reserves of capital or reinsurance. Jaffe and Russell (1976) conjectured

that large reserves of capital would cause a firm to be susceptible to hostile takeovers. Miranda

and Glauber (1997) as well as Goodwin (2001) present arguments for the importance of incorporating

the potential of systemic risk into crop insurance portfolios. In 2012 crop insurance

indemnity payments over all crops totaled $17.4 billion, which amounted to a loss cost ratio of

1.57 (Summary of Business, 2013). With nearly $116 billion in liabilities for 281 million acres,

not including livestock, the Federal Crop Insurance Corporation claims that private firms would

not be able to fully bear the risk of a catastrophe such as the 2012 drought. Therefore, according

to the FCIC, the Standard Reinsurance Agreement (SRA), which allows private insurers to

share risk with the FCIC, is necessary.

In the last decade there has been recurring concern about crop insurance policies being

inconsistently rated for different regions and crops. Babcock et al. (2004) criticized the assumption

of constant relative risk, in other words the assumption that the loss cost ratio remains constant

over time. Woodard et al. (2011) demonstrated that using the loss cost ratio to determine

crop insurance premiums is only unbiased when the assumption of constant relative risk is not

violated. They found there was an upward bias in estimates when this assumption was violated.

Title I and Title XI of the 2014 Farm Bill focus on risk management strategies and have

eliminated direct payments, counter-cyclical payments, and the Average Crop Revenue Election

(ACRE) program. These programs are replaced by the Price Loss Coverage (PLC) program and

the Agriculture Risk Coverage (ARC) program. Farmers can choose to enroll in one of these

two programs. The PLC program pays out the difference between the market price and the

reference price³ multiplied by 85% of the base acres. The ARC program guarantees can either

be based on individual producer revenues or county revenues. Payouts occur if the producer's

revenue drops below 86% of the benchmark revenue. Producers are then paid the difference

between the actual revenue per acre and the guarantee multiplied by 85% of the base acres.

The benchmark revenue is generated from the 5-year Olympic average of yields and the 5-year

Olympic average⁴ of the national price. Benchmark revenues for irrigated and dryland crops

are calculated separately. The 2014 Farm Bill's shift towards these new programs in place of

direct payments and counter-cyclical payments is a cause to further examine the risk associated

with yields (Agricultural Act, 2014).
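
The ARC payment rule as described in this section can be sketched in Python. The sketch follows only the description given here (an 86% guarantee, payment on 85% of base acres, and Olympic averages for the benchmark) and omits any other program provisions; the function names and the yield, price, and acreage figures are hypothetical.

```python
def olympic_average(values):
    """Drop the single highest and lowest values, then average the rest."""
    if len(values) < 3:
        raise ValueError("need at least three values")
    trimmed = sorted(values)[1:-1]
    return sum(trimmed) / len(trimmed)


def arc_payment(yields, prices, actual_revenue_per_acre, base_acres,
                guarantee_share=0.86, paid_share=0.85):
    """Payment when actual revenue per acre falls below the guarantee."""
    benchmark = olympic_average(yields) * olympic_average(prices)
    guarantee = guarantee_share * benchmark
    shortfall = max(guarantee - actual_revenue_per_acre, 0.0)
    return shortfall * paid_share * base_acres


# Hypothetical five-year histories: yields in bushels per acre, prices in
# dollars per bushel, and 100 base acres.
yields = [30.0, 35.0, 40.0, 45.0, 50.0]
prices = [5.0, 6.0, 7.0, 8.0, 9.0]
payment_low = arc_payment(yields, prices, actual_revenue_per_acre=200.0, base_acres=100)
payment_none = arc_payment(yields, prices, actual_revenue_per_acre=250.0, base_acres=100)
print(payment_low, payment_none)
```

In this example the benchmark revenue is 40 bushels times $7, or $280 per acre, so the guarantee is about $240.80 per acre and a payment is made only when actual revenue falls below that level.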

3.3 Data

Yields for winter wheat measured in bushels per acre were collected from the National

Agricultural Statistics Service over the sample period 1970-2013. These yields are aggregated

at the county level and grouped by irrigation practices: dryland (non-irrigated) and irrigated.

All 105 counties of Kansas produced dryland winter wheat during the sample period, and

67 counties of Kansas produced irrigated wheat. There are years during the sample period

without production for both dryland wheat and irrigated wheat; therefore, our modeling needs

to account for this censoring. Figure 3.1 shows the number of acres planted for both dryland
³ The reference price for wheat is $5.50 per bushel.
⁴ Olympic average eliminates the highest and lowest values then averages the remaining values.

winter wheat and irrigated winter wheat in Kansas. Here we see the majority of winter wheat is

produced without irrigation, which is true for winter wheat production throughout the United

States. Figure 3.3 shows a slight increase in mean yield of winter wheat for the entire state

over the sample period; however, when the yields are disaggregated into counties, the trend is

not significant in parametric or non-parametric regression.

Since 80% of the crop insurance policies are revenue based, we need prices to simulate

premiums. Wheat futures contract prices were collected from the CBOT and cash prices for

Kansas wheat were collected from the National Agricultural Statistics Service. The futures

contracts were priced in September and expired in July of the following year. The cash prices were

the averages for transactions in July. We use the September quotes because the projected price

for winter wheat is announced on September 30. Related to the price announcement, September

is the month when winter wheat is planted in Kansas. We also use the July expiration date and

cash prices from July because most of the winter wheat in Kansas is harvested from late June

to mid-July.

3.4 Methodology

3.4.1 Model for Censored County Yields

Because the dryland and irrigated yield data have years without production, we utilize a Tobit-like

Bayesian model. Tobit models (Tobin, 1958) assume the data have a latent variable $y^*$

driving the observation $y$, such that




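$$y_{it} = \begin{cases} B_{it}\left(\beta_0 + \sum_{j=1}^{P} \beta_j x_{j,it} + \epsilon_{it}\right) & \text{if } y^*_{it} > c \\ B_{it} & \text{if } y^*_{it} \le c \end{cases} \qquad (3.2)$$

for County $i = 1, \dots, N$ during Year $t = 1, \dots, T$. $c$ is a constant threshold, and $\epsilon_{it} \overset{i.i.d.}{\sim} N(0, \sigma^2)$.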

Bit = I(c > 0), where I() is the indicator function. If a response variable has the form described

in Equation 3.2, it is called a censored variable. Censoring may be a result from sampling

methods or the nature of the data. For example, if an individual is below the age of 65, single, and makes less than $10,000, he does not have to file a tax return; therefore, his income could appear to be $0 to somebody investigating tax return data. Equation 3.2 and the example above show left censoring because values below a particular threshold are censored. When values above a certain threshold are censored, this is called right censoring. An example of right censoring could be caused by an instrument that cannot exceed a particular threshold, such as a physician's scale, which typically has a weight limit of approximately 400 pounds.

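The likelihood function for the tobit model is

\[
\prod_{i=1}^{N} \prod_{t=1}^{T}
\left[ \frac{1}{\sigma} \phi\!\left( \frac{y_{it} - x_{it}\beta}{\sigma} \right) \right]^{M}
\left[ 1 - \Phi\!\left( \frac{x_{it}\beta}{\sigma} \right) \right]^{1 - M},
\tag{3.3}
\]

where \(M = 1\) if \(y_{it} = y^*_{it}\) and 0 otherwise, \(\phi(\cdot)\) is the standard normal probability density function, and \(\Phi(\cdot)\) is the standard normal cumulative distribution function.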
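The likelihood in Equation 3.3 is usually evaluated on the log scale. The following is a minimal sketch, assuming a censoring threshold of zero by default and a linear predictor that has already been computed; the function and argument names are illustrative and are not taken from the original analysis, which was carried out in R with R2OpenBUGS.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def tobit_log_likelihood(y, xb, sigma, c=0.0):
    """Log of a Tobit likelihood in the style of Equation 3.3.

    y     : observed responses; values <= c are treated as censored
    xb    : linear predictors x'beta, one per observation (assumed given)
    sigma : standard deviation of the latent error term
    c     : censoring threshold (assumed to be 0 unless stated otherwise)
    """
    total = 0.0
    for y_i, xb_i in zip(y, xb):
        if y_i > c:
            # Uncensored term (M = 1): (1 / sigma) * phi((y - x'beta) / sigma)
            z = (y_i - xb_i) / sigma
            total += math.log(STD_NORMAL.pdf(z)) - math.log(sigma)
        else:
            # Censored term (M = 0): 1 - Phi((x'beta - c) / sigma),
            # written as Phi(-z) for better numerical behaviour.
            z = (xb_i - c) / sigma
            total += math.log(STD_NORMAL.cdf(-z))
    return total
```

Working on the log scale avoids underflow when the product runs over many county-year observations.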

The difference between a typical tobit model and our model is the estimation of a logit

link as well as the estimation of the normal regression truncated at zero. The logit link is used

to predict whether or not the observation yit will be censored, while the truncated normal

regression predicts the values of the yields when the observation yit is not censored.

The variable \(B_{it} \sim \text{Ber}(P_{it})\); therefore, \(B_{it}\) can be modeled using a logit link function. For the logit link function, we surmise that the year, location, and the September futures contract price may affect whether the observation is censored or not. The form of the logit link is

\[
\text{logit}(P_{it}) = \beta_{i,0} + \sum_{j=1}^{B} \beta_j x_{jt} \tag{3.4}
\]

for County \(i = 1, \dots, N\) and Year \(t = 1, \dots, T\). \((\beta_{1,0}, \dots, \beta_{N,0}) \sim \text{CAR}(\cdot, \cdot)\) for its prior distribution. CAR is the abbreviation for the Conditional Autoregressive model, which is a spatial distribution. The mean (the intercept in our model) of one county is conditional on the means (intercepts) of its surrounding counties. Surrounding can be defined by distance or contiguity. In this model, we choose contiguity over distance since the counties greatly vary in size and shape. A detailed description of the CAR model is provided in Appendix A.

The prior distributions of the coefficients on the September futures contract prices and the years are both normal. Using the notation \(N(\mu, \sigma^2)\), the prior distributions of these coefficients can be written as \(\beta_j \sim N(0, 100)\).

The normal regression truncated at zero has the form

\[
y_{it} = \alpha_{i,0} + \alpha_{i,1} z_{it} \tag{3.5}
\]

for County \(i = 1, \dots, N\) and Year \(t = 1, \dots, T\). \((\alpha_{1,0}, \dots, \alpha_{N,0}) \sim \text{CAR}(\cdot, \cdot)\) for its prior distribution. The same prior distribution is used for \((\alpha_{1,1}, \dots, \alpha_{N,1})\). The covariate \(z_{it}\) is a binary covariate defined as

\[
z_{it} =
\begin{cases}
0 & \text{if } y_{it} > \delta m_i \\
1 & \text{if } y_{it} \le \delta m_i
\end{cases}
\tag{3.6}
\]

where \(m_i\) is the median yield for County \(i = 1, \dots, N\). The purpose of the covariate \(z_{it}\) is to capture changes in spatial dependencies that occur at lower yields. As Goodwin (2001) demonstrated, yields across space during droughts or other natural disasters have stronger dependencies compared to yields during normal years. The prior distribution of \(\delta\) is a truncated normal distribution \(N(1, 9^{-1}, 0, 2)\). Note the notation of the truncated normal is \(N(\mu, \sigma^2, a, b)\), where \(\mu\) is the mean, \(\sigma^2\) is the variance, \(a\) is the lower limit, and \(b\) is the upper limit.

This model for censored county yields is used to simulate yields for the risk management application of this paper. Note that we do not employ the simulation technique used in classical statistics; instead, we use posterior predictive sampling. In classical statistics one typically plugs the maximum likelihood estimates into the sampling distribution and makes random draws from it. Posterior predictive sampling is instead a two-part process. Since Bayesian methods treat parameters as random variables, the first part of posterior predictive sampling is drawing parameters from the posterior distribution. The sample of parameters drawn from the posterior distribution is then used in the sampling distribution to draw random samples of observations.
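The two-part process described above can be sketched as follows. This is only an illustration under simplifying assumptions: the posterior sample is a plain list of hypothetical (mean, standard deviation) pairs, such as saved MCMC iterations, and the sampling distribution is an ordinary normal rather than the censored yield model used in this chapter. All names are illustrative.

```python
import random


def posterior_predictive_draws(posterior_draws, n_sims, rng):
    """Two-part posterior predictive sampling.

    posterior_draws : list of (mu, sigma) pairs, e.g. saved MCMC iterations
    n_sims          : number of simulated observations to return
    rng             : random.Random instance used for all draws
    """
    sims = []
    for _ in range(n_sims):
        # Part 1: draw one parameter vector from the posterior sample.
        mu, sigma = rng.choice(posterior_draws)
        # Part 2: draw an observation from the sampling distribution
        # evaluated at that parameter draw.
        sims.append(rng.gauss(mu, sigma))
    return sims


def plug_in_draws(mu_hat, sigma_hat, n_sims, rng):
    """Classical-style simulation: every draw reuses one point estimate."""
    return [rng.gauss(mu_hat, sigma_hat) for _ in range(n_sims)]
```

Because the parameters change from draw to draw, the posterior predictive sample reflects parameter uncertainty as well as sampling variability, which the plug-in version ignores.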

3.4.2 Prices

For the simulated revenues used in the premium ratings, we need to generate prices that are correlated with the yields. After obtaining a posterior predictive sample of county yields from the model described above, these yields are averaged to compute the state yield average. The log difference between the July cash price and the September futures contract price is then regressed on this state yield average. Prices are then generated from \(p_t = w_{t-1} \exp(r_t)\), where \(p_t\) is the cash price for Year \(t\), \(w_{t-1}\) is the futures contract price, and \(r_t \sim N(\gamma_0 + \gamma_1 y_t, \sigma_r^2)\). Note \(y_t\) is the state yield for Year \(t\). The use of log-normal distributions for price differentials is common in crop insurance ratings.
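The price-generation step can be illustrated with a short sketch. The regression coefficients and the residual standard deviation are passed in as arguments because their estimated values are not reported here; the names `gamma0`, `gamma1`, and `sigma_r` are assumptions made for this example.

```python
import math
import random


def simulate_cash_price(futures_price, state_yield, gamma0, gamma1, sigma_r, rng):
    """Draw one July cash price as p = w * exp(r).

    futures_price : September futures contract price (w)
    state_yield   : simulated state average yield for the year
    gamma0/gamma1 : intercept and slope of the log-difference regression
                    (hypothetical inputs, not estimates from the paper)
    sigma_r       : residual standard deviation of the log difference
    rng           : random.Random instance
    """
    # r is the log difference between the cash price and the futures price.
    r = rng.gauss(gamma0 + gamma1 * state_yield, sigma_r)
    return futures_price * math.exp(r)
```

With a negative slope, a higher simulated state yield shifts the price distribution downward, which is how the simulated prices become correlated with the simulated yields.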

Computations are performed using the software R, and the Bayesian models are imple-

mented using the software package R2OpenBUGS.

3.4.3 Risk Management Application

In non-life insurance applications, there are several measures of interest: (1) the probability of a loss, (2) the expected loss, i.e. the actuarially fair premium, and (3) the premium rate. These values can be found through Monte Carlo integration. As shown by Goodwin and Ker (1998), the probability of a loss is defined as \(P(y < C \times Y) = \frac{1}{M} \sum_{i=1}^{M} I(y_i < C \times Y)\), where \(C\) is the coverage level, \(Y\) is the expected yield or revenue, \(y_i\) is the \(i\)th simulated yield or revenue, \(M\) is the number of replications, and \(I\) is an indicator function. The expected loss is defined as \(E(L) = P(y < C \times Y) \times E(C \times Y - y \mid (C \times Y - y) > 0)\), where \(L\) is the difference between the guarantee and the actual yield or revenue. Finally, the premium rate can be determined by dividing the expected loss by the liability, which is \(C \times Y\).
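The three measures can be computed from a set of simulated outcomes in a few lines. The sketch below is illustrative only; the function name and inputs are assumptions, and the expected loss is computed as the average shortfall over all replications, which equals the probability of a loss multiplied by the mean shortfall given a loss.

```python
def loss_measures(simulated, expected, coverage):
    """Monte Carlo estimates of probability of loss, expected loss, premium rate.

    simulated : list of simulated yields or revenues (y_i)
    expected  : expected yield or revenue (Y)
    coverage  : coverage level (C), e.g. 0.75
    """
    guarantee = coverage * expected  # the liability C * Y
    m = len(simulated)
    # Share of replications that fall below the guarantee.
    prob_loss = sum(1 for y in simulated if y < guarantee) / m
    # Average shortfall, counting zero when there is no loss.
    expected_loss = sum(max(0.0, guarantee - y) for y in simulated) / m
    premium_rate = expected_loss / guarantee
    return prob_loss, expected_loss, premium_rate
```

For example, with simulated outcomes 60, 120, 180, and 240, an expected value of 200, and 50% coverage, only the first outcome falls below the guarantee of 100, so the probability of a loss is 0.25, the expected loss is 10, and the premium rate is 0.1.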

We simulate prices and yields to rate Group Risk Income Protection policies with the Harvest Price Option (GRIP-HPO) as well as estimate payouts for the new Agricultural Risk Coverage program based on county yields. The rate for the GRIP policy with the Harvest Price Option is referred to as the HP Rate. We estimate the HP Rate because the majority of revenue plans purchased include the Harvest Price Option. Summaries of the Harvest Price Option and the GRIP policy are included in Table 3.1. The Group Risk Income Protection policy is rated similarly to the methods discussed by Coble et al. (2010), which described in detail the rating of COMBO insurance plans.⁵ The rate for the Group Risk Income Protection with the Harvest Price Option is defined as

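\[
\text{HP Rate} = \frac{\sum_{i=1}^{10000} \max\left(0,\; C \times Y \times \min\left(2 \times P, \max(P, p)\right) - \max\left(0,\; (y_i - y + y) \times \min(2 \times P, p)\right)\right)}{10000 \times Y \times C \times P}, \tag{3.7}
\]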

where \(Y\) is the actual production history (APH) yield, \(P\) is the September futures contract price of wheat in 2013, \(C\) is the coverage level, \(y_i\) is the simulated yield, and \(p\) is the simulated price. Note that for the Harvest Price Option, if the harvest price exceeds twice the September price, 2 × (September price) is used in place of the harvest price. We estimate rates for the 65%, 75%, and 85% coverage levels.
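The structure of the HP Rate calculation can be sketched as below. This is a simplified reading rather than the exact rating procedure: it uses each simulated yield directly in place of the adjusted yield term inside Equation 3.7, and all function and argument names are assumptions made for illustration.

```python
def hp_rate(sim_yields, sim_prices, aph_yield, projected_price, coverage):
    """Simplified premium rate for a revenue policy with a harvest price option.

    sim_yields      : simulated yields (y_i)
    sim_prices      : simulated harvest prices (p), paired with the yields
    aph_yield       : actual production history yield (Y)
    projected_price : September futures contract price (P)
    coverage        : coverage level (C)
    """
    cap = 2.0 * projected_price  # the harvest price is capped at twice P
    total_indemnity = 0.0
    for y, p in zip(sim_yields, sim_prices):
        # Guarantee uses the larger of projected and harvest price, capped.
        guarantee = coverage * aph_yield * min(cap, max(projected_price, p))
        # Realized revenue uses the (capped) harvest price.
        revenue = max(0.0, y * min(cap, p))
        total_indemnity += max(0.0, guarantee - revenue)
    liability = aph_yield * coverage * projected_price
    return total_indemnity / (len(sim_yields) * liability)
```

The cap matters only when the simulated harvest price exceeds twice the projected price; otherwise the guarantee simply moves up with the harvest price.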

For the simulations of the Agricultural Risk Coverage program, we simulate county yields

and prices for 2009 to 2013. Then the Olympic averages for each county and the prices are

calculated. This process is repeated 10000 times to create distributions for the Olympic averages

of county yields and the price. Unlike with the GRIP plan, there is only one coverage level,

which is 86%. Using the simulated Olympic averages of yields and prices, we determine the

probability of a payout from the program and the expected payout for each county in 2014.
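The Olympic average used in these simulations is simple to compute for one replication. The sketch below is illustrative; the function name is an assumption, and the example values are made up rather than taken from the data.

```python
def olympic_average(values):
    """Drop the single highest and single lowest value, then average the rest."""
    if len(values) < 3:
        raise ValueError("an Olympic average needs at least three values")
    trimmed = sorted(values)[1:-1]  # remove the minimum and the maximum
    return sum(trimmed) / len(trimmed)


# Five hypothetical simulated county yields, one per year from 2009 to 2013.
example_yields = [30, 35, 40, 45, 20]
example_average = olympic_average(example_yields)  # averages 30, 35, and 40
```

Repeating this calculation for each of the simulated five-year paths gives the distribution of Olympic averages for each county and for the price.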

3.5 Results

To help determine the best fitting models for dryland and irrigated wheat yields, the Deviance Information Criterion (DIC) is calculated for each model version. DIC is a Bayesian measure similar to AIC. As with AIC, lower values of DIC indicate a better fit, and the measure penalizes additional parameters. Table 3.2 shows the DIC for the dryland and irrigated wheat models, where the covariates of the logit link differ. For the logit links of both dryland wheat and irrigated wheat, the covariates Year and September price affect censoring. The location of the county does not seem to affect the censoring of dryland yields, but the location of the county does affect the censoring of irrigated yields. Figure 3.6 shows the prior and posterior distributions for the parameter of dryland wheat and irrigated wheat. The median of the parameter is 1.167 for dryland wheat and 1.039 for irrigated wheat. For dryland wheat the DIC for the model is 28070 when the covariate \(z_{it}\) is included and 32280 when the covariate \(z_{it}\) is not present. For irrigated wheat the DIC for the model is 16090 when the covariate \(z_{it}\) is included and 17820 when the covariate \(z_{it}\) is not present. Therefore, we find that within our framework county yields are best described using not only spatial intercepts but also a secondary spatial covariate for yields below the threshold in Equation 3.6. Due to the short length of the time series of yields, identification of more spatial covariates is not feasible.

5
COMBO insurance is the umbrella term used to describe yield and revenue based crop insurance policies.

The posterior distributions of the parameters in the logit links of dryland and irrigated wheat differ substantially. Figure 3.4 shows the prior and posterior distributions for the parameters in the logit link functions of dryland wheat and irrigated wheat. For dryland wheat the intercept \(\beta_0\) is constant across counties and has a median of 1.808. The medians of the posterior distributions for the coefficients of Year and September price are -0.003 and -0.001. These coefficients indicate a decrease in the probability of censoring as years go by or if the September price increases. For irrigated wheat, the medians of the posterior distributions of the coefficients for Year and September price are approximately 0.019 and 0.039, respectively. Therefore, the log-odds of irrigated yields being censored increase by about 0.019 for every year that goes by, and by about 0.039 for every dollar the September price increases. Also, according to the spatial intercepts of the logit link for irrigated wheat found in Figure 3.5, counties in northeastern Kansas are the most likely to be censored.
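Logit coefficients act on the log-odds scale, so exponentiating a coefficient gives the multiplicative change in the odds. The short sketch below applies this to the posterior medians of 0.019 (Year) and 0.039 (September price) reported for irrigated wheat; the function name is illustrative.

```python
import math


def odds_multiplier(coefficient, change=1.0):
    """Factor by which the odds change when a covariate rises by `change`."""
    return math.exp(coefficient * change)


# Posterior medians for irrigated wheat quoted in the text.
year_effect = odds_multiplier(0.019)   # one additional year
price_effect = odds_multiplier(0.039)  # one additional dollar in the September price
```

Under this reading, the odds of censoring rise by roughly 1.9% per year and by roughly 4.0% per dollar of the September price.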

The truncated normal regressions for both dryland and irrigated wheat contain spatial intercepts with a CAR prior distribution and the secondary spatial covariate, also with a CAR prior distribution. For the spatial intercepts and the secondary spatial covariates, we show maps of the 2.5%, 50%, and 97.5% percentiles of the posterior distributions. The maps for spatial intercepts for dryland wheat and irrigated wheat, shown in Figure 3.7 and Figure 3.9, do not indicate distinct patterns across the state of Kansas. This is also true for the secondary spatial covariate as seen in Figure 3.8 and Figure 3.10.

To evaluate the fit of our models, we use Chi-Squared discrepancies, a method of posterior predictive checking described by Gelman et al. (2004). The Chi-Squared discrepancy is defined as

\[
\sum_{i=1}^{N} \sum_{t=1}^{T} \frac{\left(y_{i,t} - E(Y_{i,t} \mid \theta_{i,t})\right)^2}{Var(Y_{i,t} \mid \theta_{i,t})}, \tag{3.8}
\]

where \(y_{i,t}\) is the observed yield for County \(i\) during Year \(t\). \(E(Y_{i,t} \mid \theta_{i,t})\) and \(Var(Y_{i,t} \mid \theta_{i,t})\) are calculated from the simulated yields. The Chi-Squared discrepancy with the lowest value gives the best fitting model. For comparison we not only simulate yields from the best fitting models

for dryland wheat and irrigated wheat, but we also simulate yields from the models where the truncated normal regressions have spatial intercepts with independent normally distributed prior distributions and no secondary spatial covariate. Figure 3.11 and Figure 3.12 show the simulated yields for the best fitting models. For the simulated dryland yields, central Kansas has relatively high yields and eastern Kansas has lower yields compared to the rest of the state. However, there are no distinct patterns for irrigated wheat. Visual inspection also indicates the simulated yields are reasonable when compared to the observed yields seen in Figure 3.3. Table 3.3 shows the Chi-Squared discrepancies. The best fitting model for dryland wheat has a Chi-Squared discrepancy of 3886.0, whereas the model with independent intercepts has a Chi-Squared discrepancy of 4054.4. The best fitting model for irrigated wheat has a Chi-Squared discrepancy of 2795.1, whereas the model with independent intercepts has a Chi-Squared discrepancy of 3983.4. These Chi-Squared discrepancies show the improvement in fit caused by including the CAR prior distribution for the spatial intercepts and the secondary spatial covariates.
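The discrepancy in Equation 3.8 can be computed directly from the observed yields and the simulated yields. The sketch below is a minimal illustration under the assumption that the conditional mean and variance for each county-year are estimated by the sample mean and sample variance of that county-year's simulated draws; the names are illustrative.

```python
from statistics import mean, variance


def chi_squared_discrepancy(observed, simulated):
    """Sum of squared standardized differences, in the style of Equation 3.8.

    observed  : list of observed yields, one per county-year
    simulated : list of lists; simulated[k] holds the simulated yields
                for the same county-year as observed[k]
    """
    total = 0.0
    for y, draws in zip(observed, simulated):
        expected = mean(draws)    # estimate of E(Y | theta)
        spread = variance(draws)  # sample estimate of Var(Y | theta)
        total += (y - expected) ** 2 / spread
    return total
```

A lower total indicates that the observed yields sit closer to the centers of their simulated distributions, relative to the simulated spread.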

Next we generate revenues for the year 2014 to determine premium rates of the GRIP-HPO policies. Again we simulate from the best fitting models for dryland and irrigated wheat as well as the models with independent intercepts. The policies have revenue guarantees of 65%, 75%, and 85%. Before estimating the premium ratings, we look at the probability of a loss occurring for these guarantees. Figure 3.13 and Figure 3.14 show the probabilities for the different guarantees for dryland wheat from the best fitting model and the model with independent intercepts. The probabilities of the best fitting

model of dryland wheat have very distinct patterns. This model indicates higher probabilities

of loss in northwestern Kansas and south central Kansas compared to the rest of the state.

The probability of a loss is lower in eastern Kansas. The median probabilities of a loss across

all counties are 0.207, 0.328, and 0.449 for the 65%, 75%, and 85% guarantees, respectively.

Similar patterns emerge for the dryland wheat model with independent intercepts although this

model has consistently higher probabilities. For the model with independent spatial intercepts,

the median probabilities of a loss across all counties are 0.241, 0.357, and 0.480 for the 65%,

75%, and 85% guarantees, respectively. Figure 3.17 and Figure 3.18 show the premium rates

for the dryland wheat for the best fitting model and the model with independent intercepts.

The premium rates are higher for the model with independent intercepts compared to the best

fitting model. The median premium rates across all counties for the best fitting model are 0.038,

0.069, and 0.107 for the 65%, 75%, and 85% guarantees, while the median premium rates across

all counties for the model with independent intercepts are 0.051, 0.084, and 0.123.

The probabilities of a loss and the premium rates for irrigated wheat differ from the probabilities and premium rates of dryland wheat. Figure 3.15 and Figure 3.16 show the probabilities of a loss for the 65%, 75%, and 85% guarantees for the best fitting model and the model with independent intercepts. Again we see the probabilities of a loss generated from the model with independent intercepts are higher than the probabilities of a loss generated by the best fitting model. The median probabilities of a loss for the best fitting model across all counties are 0.158, 0.292, and 0.439 for the 65%, 75%, and 85% guarantees, while the median probabilities of the model with independent intercepts are 0.2109, 0.3483, and 0.4959. The premium rates for irrigated wheat, seen in Figure 3.19 and Figure 3.20, show the model with independent intercepts has slightly lower premium rates than the best fitting model. The median premium rates across all counties for the best fitting model are 0.023, 0.05, and 0.088 for the 65%, 75%, and 85% guarantees, while the median premium rates across all counties for the model with independent intercepts are 0.022, 0.048, and 0.084.

The final component of our analysis is the application of our models to the new Agricultural Risk Coverage program. We simulate yields and prices from 2009 to 2013 and then take the Olympic average of the simulated prices and the Olympic average of the simulated yields of each county. All of this analysis is conducted using the best fitting model with 10,000 simulated replications of prices and yields. The 2.5%, 50%, and 97.5% percentiles for the simulated Olympic averages of the yields are shown in Figure 3.23 and Figure 3.24 for dryland and irrigated wheat, respectively. The distribution of Olympic averages for prices is shown in Figure 3.21. The mean Olympic average price is $6.48 with a 95% confidence interval from $5.25 to $8.11. The expected median payout across all counties for the ARC program is $17.65 per acre for dryland wheat, and the expected median payout across all counties is $21.32 per acre for irrigated wheat, as seen in Figure 3.25. It is worth noting that the probability of a payout from the ARC program is slightly lower than the probability of a payout from a crop insurance policy with an 85% guarantee. The probability of a payout across all counties is 0.412 for dryland wheat and 0.314 for irrigated wheat.

3.6 Discussion

Our analysis shows that the best fit for county yields allows the spatial dependencies among the counties to change with the value of the yields. When compared to a model that assumes no correlation between yields, we see the dryland wheat premium ratings for different coverage levels are more consistent. Therefore, by including spatial dependencies in crop insurance ratings, the premium rates better reflect intuition. Although the target rate used by RMA is a weighted average of a county's yields and the yields of its neighboring counties, this average is only a point estimate and does not fully describe the dependencies among the distributions of the county yields. Therefore, RMA may want to consider a model that better accounts for spatial dependencies.

According to a study conducted by Ifft et al. (2012), the total for direct payments from 2004 to 2008 was equal to 6.8% of crop revenues. One of the major concerns of the 2014 Farm Bill is how the Agricultural Risk Coverage program and Price Loss Coverage program will compare to direct payments and the other programs being eliminated. The best fitting models predict the average revenue for an acre of Kansas winter wheat in 2014 will be $213.78 and $276.56 for dryland and irrigated winter wheat, respectively. The expected median payout per acre of winter wheat from the ARC program is $17.65 and $21.32 for dryland and irrigated winter wheat, respectively. If we multiply these payouts by 0.85 (because ARC payouts are applied to 85% of base acres), the ratios of the payouts of the ARC program to the expected revenue are 7.02% for dryland winter wheat and 6.55% for irrigated winter wheat. Therefore, our analysis concludes the payouts from the ARC program will be very similar to direct payments.
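The payout-to-revenue ratios follow from the per-acre figures quoted in this section, and the arithmetic can be checked with a short sketch. The function name is illustrative; the inputs are the median payouts and average revenues reported above.

```python
def arc_share_of_revenue(payout_per_acre, revenue_per_acre, base_acre_share=0.85):
    """ARC payout as a share of expected revenue, paid on 85% of base acres."""
    return base_acre_share * payout_per_acre / revenue_per_acre


dryland_share = arc_share_of_revenue(17.65, 213.78)    # about 0.0702
irrigated_share = arc_share_of_revenue(21.32, 276.56)  # about 0.0655
```

Both shares sit close to the 6.8% of crop revenues that Ifft et al. (2012) report for direct payments.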

3.7 Concluding Remarks

This paper found that not only do spatial dependencies exist among county yields, but the spatial relationships are dependent on the value of the yields. By including these spatial dependencies, the forecasting ability of the models for both dryland and irrigated wheat is improved. This improved forecast translates into more accurate premium ratings for crop insurance policies. We also determine that, based on the best fitting models presented in this paper, the ARC program's expected payouts will be very similar to the amounts paid out for direct payments.

Title I and Title XI of the 2014 Farm Bill have prioritized risk management in United States agriculture for the next several years. Since the majority of crop insurance policies have guarantees based on the production of individual producers instead of county level production, we plan to apply the models used in this paper to yields of individual producers. We also plan to further compare expected payouts of these new programs to direct payments, counter-cyclical payments, and the ACRE program.

3.8 Tables and Figures

Table 3.1: Descriptions of policies offered by RMA

Actual Production History (APH): Insures a percentage of the predicted yield, typically 50% to 75%. The policy holder also selects a percentage of the predicted price set by RMA, which is typically between 55% and 100%.

Actual Revenue History (ARH): Similar to APH, but it insures historical revenue instead of historical yields. Each crop has unique provisions.

Adjusted Gross Revenue (AGR): Insures a percentage of the revenue of the entire farm instead of each individual crop.

Area Risk Protection Insurance (ARPI): Provides coverage based on the production of an entire county. ARPI replaces GRP and GRIP described below.

Group Risk Plan (GRP): Insures using an index based on county yields. Coverage levels up to 90% are offered.

Group Risk Income Protection (GRIP): Similar to GRP but insures based on an index of county revenue, not yield.

Group Risk Income Protection-Harvest Price Option (GRIP-HPO): Allows the producer to choose between the revenue calculated with the expected price at the time of harvest and the producer-chosen coverage level percentage.

Revenue Protection (RP): Insures individual producers against both yield losses from natural causes and revenue losses from changes in the projected harvest price. Producers choose a percentage of their yield to insure, typically 50% to 75%. Indemnity payments are then based on the greater of the yield multiplied by the harvest price or the projected price.

Revenue Protection With Harvest Price Exclusion: Insures the revenue of the producer using the predicted price.

Yield Protection: Similar to APH policies; however, the projected price is determined by futures contracts, not RMA.

Catastrophic Risk Protection Endorsement (CAT Coverage): Pays 55 percent of the projected price on yield losses exceeding 50 percent. There is a $300 fee for each crop insured with CAT Coverage; however, the Federal Government pays the premium.

Table 3.2: DIC for the entire model. Here the logit link is varied, while the truncated normal distribution has the spatial intercept, the spatial covariate with the optimal threshold, and the September price.

Model   Logit covariates                                  Dryland   Irrigated
One     Year, September price                             27930     18070
Two     Constant intercept, Year, September price         27850     17570
Three   Constant intercept, September price               28080     17680
Four    Constant intercept, Year                          28270     17680
Five    Spatial intercept (CAR), Year, September price    27850     16440

Table 3.3: Chi-Squared discrepancies. Best-fitting shows the Chi-Squared discrepancy of the model that has spatial intercepts with the CAR prior distribution, the optimal threshold covariate in the truncated normal regression, and the September price covariate. Independent has different intercepts for each county with independent priors and the September price covariate. These two models have the same logit link.

Model          Dryland   Irrigated
Best-fitting   3886.0    2795.1
Independent    4054.4    3983.4

Figure 3.1: Figures for the entire state of Kansas including the average yield and number of acres planted. (Panels: State Yields, in bushels per acre by year, and Acres of Winter Wheat Planted, in millions of acres by year; each panel shows dryland and irrigated series.)

Figure 3.2: Wheat price per bushel (adjusted to 2013 prices). (Plot: Wheat Prices (Inflation Adjusted), in dollars per bushel by year.)

Figure 3.3: Average yield (bushels per acre). Sample period: 1970-2013. (Panels: Mean of Irrigated Wheat; Mean of Dryland Wheat.)

Figure 3.4: Prior and posterior distributions of the dryland and irrigated wheat logit link functions. Note there is no \(\beta_0\) posterior distribution for irrigated wheat because the intercepts for the best-fit irrigated wheat logit link function are spatially-varying. (Panels: density plots for \(\beta_0\), \(\beta_1\), and \(\beta_2\), each showing the prior together with the available dryland and irrigated posteriors.)

Figure 3.5: Posterior percentiles for the spatial intercepts of the irrigated wheat logit link. (Panels: Spatial Intercepts at 2.5%, 50%, and 97.5%.)

Figure 3.6: Prior and posterior distributions for the parameter of the dryland wheat and irrigated wheat. (Panels: (a) Dryland Wheat; (b) Irrigated Wheat.)

Figure 3.7: Posterior percentiles for the spatial intercepts of the dryland wheat truncated normal regression. (Panels: Spatial Intercepts at 2.5%, 50%, and 97.5%.)

Figure 3.8: Posterior percentiles for the secondary spatial covariate of the dryland wheat truncated normal regression. (Panels: 2.5%, 50%, and 97.5%.)

Figure 3.9: Posterior percentiles for the spatial intercepts of the irrigated wheat truncated normal regression. (Panels: Spatial Intercepts at 2.5%, 50%, and 97.5%.)

Figure 3.10: Posterior percentiles for the secondary spatial covariate of the irrigated wheat truncated normal regression. (Panels: 2.5%, 50%, and 97.5%.)

Figure 3.11: Percentiles for the simulated dryland yields. (Panels: three percentile maps of Simulated Yields.)

Figure 3.12: Percentiles for the simulated irrigated yields. (Panels: three percentile maps of Simulated Yields.)

Figure 3.13: Probability of a loss for the three coverage levels of dryland wheat from the best fitting model. (Panels: Probability for 65%, 75%, and 85% Coverage.)

Figure 3.14: Probability of a loss for the three coverage levels of dryland wheat from the model with independent counties. (Panels: Probability for 65%, 75%, and 85% Coverage.)

Figure 3.15: Probability of a loss for the three coverage levels of irrigated wheat from the best fitting model. (Panels: Probability for 65%, 75%, and 85% Coverage.)

Figure 3.16: Probability of a loss for the three coverage levels of irrigated wheat from the model with independent counties. (Panels: Probability for 65%, 75%, and 85% Coverage.)

Figure 3.17: Premium rates for the three coverage levels of dryland wheat from the best fitting model. (Panels: Premium Rate for 65%, 75%, and 85% Coverage.)

Figure 3.18: Premium rates for the three coverage levels of dryland wheat from the model with independent counties. (Panels: Premium Rate for 65%, 75%, and 85% Coverage.)

Figure 3.19: Premium rates for the three coverage levels of irrigated wheat from the best fitting model. (Panels: Premium Rate for 65%, 75%, and 85% Coverage.)

Figure 3.20: Premium rates for the three coverage levels of irrigated wheat from the model with independent counties. (Panels: Premium Rate for 65%, 75%, and 85% Coverage.)

Figure 3.21: Distribution of Olympic average of prices. (Histogram: Olympic Averages of Prices, count by dollars.)

Figure 3.22: County probability of a payout from the ARC program. (Panels: Dryland Wheat; Irrigated Wheat.)

Figure 3.23: Percentiles of the Olympic averages for dryland wheat. (Panels: ARC Average Yield at 2.5%, 50%, and 97.5%.)

Figure 3.24: Percentiles of the Olympic averages for irrigated wheat. (Panels: ARC Average Yield at 2.5%, 50%, and 97.5%.)

Figure 3.25: Median of the expected payout for the ARC program. (Panels: Dryland Wheat; Irrigated Wheat.)

Chapter 4

Spatial Integration of North Carolina Grain Markets

The Law of One Price (LOP) states the price of a homogeneous good in different locations is the same when transaction costs are excluded. When markets act in accordance with the Law of One Price, this is referred to as spatial integration. Typically, if regionally-linked markets differ in price (not including transaction costs), market players will arbitrage. The spatial trade between two markets will eventually cause the prices to reach an equilibrium. If prices continuously operate outside of equilibrium in regionally-linked markets, this indicates poor integration of the markets. This lack of integration could represent asymmetric information in the markets and affect market players' decisions.

Early work on spatial integration utilized the theoretical model by Takayama and Judge

(1964); however, the Takayama and Judge formulation assumes fixed transaction costs across

markets. Faminow and Benson (1990) offered an alternative theory in which transaction

costs across distances are significant and firms in close proximity operate as a spatially-linked

oligopoly. The Faminow and Benson formulation bars transactions between markets over a

great distance due to high transaction costs. Early empirical applications, such as Ardeni (1989)

and Goodwin and Schroeder (1991), also did not account for dynamic transaction costs. Fackler

and Goodwin (2001) provided a primer reviewing methods used for testing cointegration. Early

applications used the Ravallion (1986) and Timmer (1987) market integration tests as well as

tests for Granger causality. However, Barrett (2001) criticized the assumption of fixed transaction

costs implied by these tests. Also, despite the popularity of using cointegration tests for

determining spatial integration, McNew and Fackler (1997) demonstrated that cointegration is

not necessary for well-integrated markets.

Several different methodologies have been used to better incorporate transaction costs into

testing for spatial integration, such as the use of regime switching regression models. Regime

switching regression models establish upper and lower bounds around the equilibrium in an

attempt to account for unobserved transaction costs. The difficulty is that the transaction

costs are not observed. Sexton, Kling, and Carman (1991) used a three-regime model, which

allowed for the LOP to hold, relative shortages, or relative surpluses. Barrett and Li (2002)

developed a maximum likelihood estimator, which allowed for regimes in and out of periods of

trade. Goodwin and Piggott (2001) approached the issue of spatial integration using cointegra-

tion tests and threshold error correction models on North Carolina agricultural markets. Their

threshold error correction models were used to form non-linear impulse responses. Unlike the

market integration tests used in many of the earlier applications, impulse responses are dynamic

and allow one to observe the time-path of price differentials between two markets after a shock.

Others followed with extensions of their analysis. Sephton (2003) applied a threshold test

based on a Vector Error Correction Model (VECM) to the North Carolina agricultural markets.

The estimation with the threshold model based on VECM showed similar results to those found

by Goodwin and Piggott (2001). Threshold tests based on error correction models have also

been used to investigate the dairy sector in Spain (Serra and Goodwin, 2003) and commodities

in Brazil (Balcombe et al., 2007).

Like Goodwin and Piggott (2001) and Sephton (2003), this paper also investigates the

spatial integration of agricultural markets in North Carolina. Our analysis utilizes daily price

observations from six grain markets in North Carolina: two corn markets, two soybean markets,

and two wheat markets. The daily price series analyzed in this paper cover a sample period

from January 7, 2000 to November 3, 2011 for the two corn markets and the two soybean

markets, and the sample period for the two wheat markets is from October 7, 2005 to May

27, 2010. These price series include 2008, which was not included in the previous analyses

of North Carolina agricultural markets. Like Goodwin and Piggott, our analysis utilizes non-linear

impulse response functions. These impulse responses impose an asymmetric shock on the

markets and trace the time path of the price differentials after the shock. The impulse responses

are referred to as non-linear because the functional form of the model for which the shock is

imposed is non-linear. However, our analysis steps away from the threshold modeling that has

become conventional in the spatial integration literature. After fitting a VAR(1)-GARCH(1,1)

model to the price differentials from each pair of terminal markets (pairing is

determined by grain type), we use copula models on the standardized residuals. The copulas

have time-varying parameters and are called Stochastic Copula Autoregressive

(SCAR) models. Instead of imposing thresholds for regimes, this time-varying copula allows for

smooth transitions in and out of equilibrium.

4.1 Methodology

The models for the pairs of grain market prices are estimated sequentially in three parts. We first

estimate models for the mean prices of the grain markets then the variances of these markets.

Next the standardized residuals for each pair of grain markets are used in Stochastic Copula

Autoregressive (SCAR) models. Finally, using all three components (the models for the means,

variances, and standardized residuals), the non-linear impulse responses are estimated.

4.1.1 Mean Modeling

Vector Autoregressive Model

Given the nature of the prices in these grain markets, the daily prices are likely autocorrelated

and the prices between the grain markets are probably correlated. Therefore, we use a Vector

Autoregressive (VAR) model for modeling the mean prices in the terminal grain markets. The

VAR(p) model has the form:

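$$Y_t = c + \Phi_1 Y_{t-1} + \Phi_2 Y_{t-2} + \dots + \Phi_p Y_{t-p} + \epsilon_t, \tag{4.1}$$

where $Y_t$ is an $N \times 1$ vector at time $t = 1, \dots, T$, each $\Phi_j$ is an $N \times N$ matrix of coefficients, $c$ is an $N \times 1$ vector of constants, and $\epsilon_t \overset{iid}{\sim} N(0, \Sigma)$.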
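As a minimal sketch of the recursion in equation (4.1) with one lag, the following standard-library Python code advances a VAR(1) system step by step. The function names, the coefficient values, and the simplification of independent Gaussian shocks with a common standard deviation (a diagonal covariance matrix) are illustrative assumptions, not the estimates or the estimation code used in this chapter.

```python
import random


def var1_step(c, phi, y_prev, eps):
    """One step of Y_t = c + Phi * Y_{t-1} + eps_t for an N-dimensional series."""
    n = len(c)
    return [
        c[i] + sum(phi[i][j] * y_prev[j] for j in range(n)) + eps[i]
        for i in range(n)
    ]


def simulate_var1(c, phi, y0, steps, sd=0.0, seed=0):
    """Simulate a VAR(1) path; shocks are independent N(0, sd^2) (an assumption)."""
    rng = random.Random(seed)
    path = [list(y0)]
    for _ in range(steps):
        eps = [rng.gauss(0.0, sd) for _ in c]
        path.append(var1_step(c, phi, path[-1], eps))
    return path


# Hypothetical coefficients, chosen only for illustration.
c = [0.0, 0.0]
phi = [[-0.2, 0.1], [0.1, -0.1]]
path = simulate_var1(c, phi, y0=[1.0, 0.0], steps=2)
```

With `sd=0.0` the path is deterministic, which makes it easy to check the matrix arithmetic by hand before adding noise.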

Nonstationarity

If the mean and the autocovariances of a time series do not depend on time, the time series is defined as stationary.$^1$ Unit-root processes are nonstationary time series such that the time series $y_t$ for $t = 1, \dots, T$ satisfies $(1 - L) y_t = \epsilon_t$, where $\epsilon_t$ for $t = 1, \dots, T$ is a stationary process and $L$ is the backshift (lag) operator. If nonstationarity is not accounted for when modeling, the estimates may be statistically significant but provide very little economic insight. For example, if the nonstationarity in nominal gross domestic product is not accounted for, the coefficients of an autoregressive model will be dominated by the time trend caused by inflation rather than by changes in productivity. Nonstationarity is often seen in price data because of long-term changes in the market. Therefore, testing the grain price series for nonstationarity is a necessity. Tests such as the Augmented Dickey-Fuller (ADF) test can be applied to each individual time series. If the ADF test indicates the time series is a unit-root process, a common solution is to model the change in the time series such that $\Delta y_t = \delta + \rho \Delta y_{t-1} + u_t$, where $\Delta y_t = y_t - y_{t-1}$ for $t = 2, \dots, T$. In this paper, $\Delta y_t$ is called the price differential.

Suppose we are modeling two unit-root nonstationary time series using a VAR(p) model. If there

is only one unit-root for both nonstationary time series, the two time series are referred to as

cointegrated. Cointegration of N time series can be modeled using a Vector Error Correction

model (VECM), which has the form

$$\Delta y_t = \delta + \Pi y_{t-1} + \Gamma_1 \Delta y_{t-1} + \dots + \Gamma_p \Delta y_{t-p} + u_t. \tag{4.2}$$

The rank of the $N \times N$ matrix $\Pi$ indicates the number of unit roots in the system. If $\mathrm{rank}(\Pi) = k$, the number of unit roots is equal to $N - k$. The maximal eigenvalue likelihood ratio (LR) statistic developed by Johansen (1988) is a common measure used for testing cointegration. The hypotheses for this test statistic are

$$H_0: \mathrm{rank}(\Pi) = m \qquad H_1: \mathrm{rank}(\Pi) = m + 1. \tag{4.3}$$

The LR statistic is defined as

$$LK_{max}(m) = -(T - p) \ln(1 - \hat{\lambda}_{m+1}), \tag{4.4}$$

where $\hat{\lambda}_{m+1}$ is the $(m+1)$-th largest squared canonical correlation between $\Delta y_t$ and $y_{t-1}$.

$^1$ In particular, this form of stationarity is called covariance stationarity.
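The arithmetic behind the maximal eigenvalue statistic in equation (4.4) can be illustrated with a short sketch. It assumes the statistic takes the form $-(T - p)\ln(1 - \hat{\lambda}_{m+1})$; the function name `lk_max` and the input values are hypothetical, and computing the squared canonical correlations themselves is not shown.

```python
import math


def lk_max(T, p, eigenvalue):
    """Maximal eigenvalue LR statistic: -(T - p) * ln(1 - eigenvalue)."""
    if not 0.0 <= eigenvalue < 1.0:
        raise ValueError("squared canonical correlation must lie in [0, 1)")
    return -(T - p) * math.log(1.0 - eigenvalue)


# Hypothetical inputs, chosen only to show the arithmetic.
stat = lk_max(T=1001, p=1, eigenvalue=0.05)
```

Larger squared canonical correlations give larger statistics, so a value near zero provides little evidence against the null rank.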

4.1.2 Variance Modeling

Prices of commodities often vary in their volatility over time. In our paper, assuming that the residuals of the VAR model have a constant variance would be inappropriate. The Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model, developed by Bollerslev (1986), allows the variance to change over time. Given the residuals from the mean model, $u_{it}$, where day $t = 1, \dots, T$ and market $i = 1, 2$, the GARCH(1,1) model has the form

$$\sigma_{it}^2 = \omega_i + \alpha_i u_{i,t-1}^2 + \beta_i \sigma_{i,t-1}^2, \tag{4.5}$$

where $\alpha_i + \beta_i < 1$ and $\omega_i, \alpha_i, \beta_i > 0$.
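The GARCH(1,1) variance recursion in equation (4.5) can be sketched as follows for a single market. The sketch assumes the parameters are already known and starts the recursion at the unconditional variance $\omega / (1 - \alpha - \beta)$, which is one common initialization and an assumption here; the function name and numbers are illustrative, and no likelihood estimation is performed.

```python
def garch11_variances(residuals, omega, alpha, beta):
    """Conditional variances: s2_t = omega + alpha * u_{t-1}^2 + beta * s2_{t-1}."""
    if not (omega > 0 and alpha > 0 and beta > 0 and alpha + beta < 1):
        raise ValueError("need omega, alpha, beta > 0 and alpha + beta < 1")
    # Assumed starting value: the unconditional variance of the process.
    sigma2 = [omega / (1.0 - alpha - beta)]
    for u_prev in residuals[:-1]:
        sigma2.append(omega + alpha * u_prev ** 2 + beta * sigma2[-1])
    return sigma2


# Hypothetical residuals and parameters, for illustration only.
variances = garch11_variances([0.1, -0.2, 0.05], omega=0.01, alpha=0.1, beta=0.8)
```

A large residual on one day raises the conditional variance on the next day, and the effect decays at rate `beta`.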

To estimate the GARCH model, we use the two-pass method described by Tsay (2005).

First the mean model, in our case the VAR model or VECM, is estimated. The estimated

residuals $u_{i,t}$ for time series $i = 1, 2$ of the mean model are taken as the true observations.

Using maximum likelihood estimation, the GARCH model is estimated from the residuals.

Using the residuals from the mean model and the conditional variances from the GARCH

model, the standardized residuals $z_{it} = u_{it} / \sigma_{it}$ can be calculated. For the standardized residuals

of each terminal market, we calculate the rank-based empirical cumulative densities of the

standardized residuals, which are defined as

$$F_i(z) = \frac{1}{T} \sum_{t=1}^{T} \mathbf{1}(Z_{it} \le z). \tag{4.6}$$

These cumulative densities are then used as the marginal distributions for copula modeling.

According to Charpentier et al. (2006), the nonparametric marginal distributions provide more

efficient copula parameter estimates compared to parametric estimations of the marginal

distributions. The copula model is the joint distribution of the standardized residuals for the pair of

grain markets.$^2$ By using a time-varying copula for the standardized residuals, we create a fuller

description of the price series compared to using a Dynamic Conditional Correlation GARCH

model. The time-varying copulas can be described by a variety of dependence measures

that are discussed below, such as Kendall's tau and the tail dependence coefficient.
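The rank-based empirical cumulative densities in equation (4.6), and a sample version of Kendall's tau computed from them, can be sketched in plain Python. The function names and the sample values are hypothetical. Note that the largest observation maps to exactly 1 under this definition; a common variant divides by $T + 1$ instead of $T$ to keep the values strictly inside (0, 1), which is mentioned here only as an assumption about practice.

```python
def ecdf_values(z):
    """Evaluate F(z_t) = (1/T) * #{s : z_s <= z_t} at every observation."""
    T = len(z)
    return [sum(1 for zs in z if zs <= zt) / T for zt in z]


def kendall_tau(x, y):
    """Sample Kendall's tau from counts of concordant and discordant pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical standardized residuals for two markets.
z1 = [0.3, -1.2, 0.8, 0.0]
z2 = [0.1, -0.9, 1.1, -0.2]
u1, u2 = ecdf_values(z1), ecdf_values(z2)
tau = kendall_tau(u1, u2)
```

Because the transformation is monotone, Kendall's tau of the transformed data equals Kendall's tau of the standardized residuals themselves.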

4.1.3 Stochastic Copula Autoregressive Model

For the standardized residuals we use the Stochastic Copula Autoregressive (SCAR) model developed by Almeida and Czado (2012). Expressing the copula data$^3$ as $u_t = (u_{1,t}, u_{2,t})'$, we assume $u_t \mid (u_1, \dots, u_{t-1}), (\theta_1, \dots, \theta_t) \sim C_{\theta_t}$ for $t = 1, \dots, T$, where $C_{\theta_t}$ is the time-varying copula distribution with the parameter $\theta_t$. The copulas in this analysis are bivariate copulas with Kendall's tau coefficient ranging from -1 to 1. The copula distributions considered in this paper are the Gaussian, Double Clayton, and Double Gumbel SCAR copulas. The Double Clayton copula and the Double Gumbel copula are defined such that

$$C^{D}(u, v) = \begin{cases} C(u, 1 - v \mid -\theta) & \text{if } \theta < 0 \\ C(u, v \mid \theta) & \text{if } \theta \ge 0 \end{cases}, \tag{4.7}$$

where $C(u, v \mid \cdot)$ is either the Gumbel copula or the Clayton copula. Table 4.1 shows the formulas for Kendall's tau for the Gaussian, Double Clayton, and Double Gumbel copulas.

$^2$ An overview of copula modeling is provided in Appendix D.

$^3$ The copula data for this analysis are the rank-based empirical cumulative densities derived from the standardized residuals.

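The SCAR model parameter $\theta_t$ is modeled by the latent variable $\lambda_t$, which is defined as

$$\lambda_1 \sim N\left(\mu, \frac{\sigma^2}{1 - \phi^2}\right), \tag{4.8}$$

and

$$\lambda_t = \mu + \phi(\lambda_{t-1} - \mu) + \sigma \epsilon_t, \tag{4.9}$$

for $t = 2, \dots, T$. Also, $\epsilon_t$ is an i.i.d. Gaussian innovation. Then we find the time-varying parameter $\theta_t$ by using the inverse of Fisher's Z transformation

$$\theta_t = \tau^{-1}(\tau_t) = \tau^{-1}\left(\frac{\exp(2\lambda_t) - 1}{\exp(2\lambda_t) + 1}\right), \tag{4.10}$$

where $\tau_t$ is Kendall's tau coefficient and $\tau^{-1}$ is the inverse of the formula for Kendall's tau coefficient.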
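The mapping from the latent variable to the copula parameter can be sketched for the Gaussian case. The code simulates a stationary AR(1) latent path, applies the inverse Fisher Z transformation to obtain Kendall's tau, and inverts the Gaussian relation $\tau = \frac{2}{\pi}\sin^{-1}(\theta)$. The function names and parameter values are hypothetical, and the innovation is assumed to be standard normal scaled by `sigma`.

```python
import math
import random


def simulate_latent(mu, phi, sigma, T, seed=0):
    """Stationary AR(1): lam_1 ~ N(mu, sigma^2 / (1 - phi^2)), then the recursion."""
    rng = random.Random(seed)
    lam = [rng.gauss(mu, sigma / math.sqrt(1.0 - phi ** 2))]
    for _ in range(T - 1):
        lam.append(mu + phi * (lam[-1] - mu) + sigma * rng.gauss(0.0, 1.0))
    return lam


def inverse_fisher_z(lam):
    """Map the latent variable to Kendall's tau in (-1, 1); equals tanh(lam)."""
    return (math.exp(2.0 * lam) - 1.0) / (math.exp(2.0 * lam) + 1.0)


def gaussian_theta(tau):
    """Invert tau = (2 / pi) * arcsin(theta) for the Gaussian copula."""
    return math.sin(math.pi * tau / 2.0)


# Hypothetical parameter values, for illustration only.
latent = simulate_latent(mu=0.75, phi=0.9, sigma=0.05, T=50, seed=1)
taus = [inverse_fisher_z(x) for x in latent]
thetas = [gaussian_theta(t) for t in taus]
```

Because the transformation is bounded, any real-valued latent path yields a valid Kendall's tau, which is what lets the dependence move smoothly over time without thresholds.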

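Since the SCAR models are estimated with Bayesian methods, we need to establish prior distributions for $\mu$, $\phi$, and $\sigma^2$. The parameter $\phi$ has a prior distribution of Beta$(e, f)$, where $e = 1$ and $f = 2$. The posterior distribution of $\phi$ is

$$p(\phi \mid \mu, \sigma^2, \lambda_{1:T}, u_{1:T}) \propto (1 + \phi)^{e - \frac{1}{2}} (1 - \phi)^{f - \frac{1}{2}} \, \varphi\left(\phi; \frac{S_1}{S_0}, \frac{\sigma^2}{S_0}\right) \mathbf{1}_{[-1,1]}(\phi). \tag{4.11}$$

Here $\varphi(\cdot)$ is the normal density, and $S_0$ and $S_1$ are defined as $S_0 := \sum_{t=2}^{T-1} (\lambda_t - \mu)^2$ and $S_1 := \sum_{t=2}^{T} (\lambda_t - \mu)(\lambda_{t-1} - \mu)$.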

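The parameters $\mu$ and $\sigma^2$ have conjugate priors. Their prior distributions are $\mu \sim N(c, d^2)$ and $\sigma^2 \sim IG(a, b)$. In our estimation, the hyperparameters in the prior distributions are as follows: $a = 0.1$, $b = 0.1$, $c = 0.75$, and $d = 0.1$. Then the posterior distributions for $\sigma^2$ and $\mu$ are $\sigma^2 \mid \mu, \phi, \lambda_{1:T}, u_{1:T} \sim IG(a^*, b^*)$ and $\mu \mid \sigma^2, \phi, \lambda_{1:T}, u_{1:T} \sim N(c^*, d^{*2})$, where

1. $a^* := a + \frac{T}{2}$

2. $b^* := b + \frac{1}{2} \left[ \sum_{t=2}^{T} \left( \lambda_t - \mu - \phi(\lambda_{t-1} - \mu) \right)^2 + (1 - \phi^2)(\lambda_1 - \mu)^2 \right]$

3. $d^{*2} := \left( \frac{1}{d^2} + \frac{(T - 1)(1 - \phi)^2 + (1 - \phi^2)}{\sigma^2} \right)^{-1}$

4. $c^* := d^{*2} \left( \frac{c}{d^2} + \frac{1}{\sigma^2} \left[ (1 - \phi^2) \lambda_1 + (1 - \phi) \sum_{t=2}^{T} (\lambda_t - \phi \lambda_{t-1}) \right] \right)$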

All of the estimation for the SCAR models is conducted using the Metropolis-Hastings algo-

rithm.
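The conjugate updates for $\sigma^2$ and $\mu$ described above can be checked numerically with a small sketch. It assumes the latent path follows a stationary Gaussian AR(1) process, so that the first observation contributes a term weighted by $1 - \phi^2$; the function names and the toy inputs are hypothetical, and the sketch only computes the posterior hyperparameters rather than running a full sampler.

```python
def sigma2_posterior(lam, mu, phi, a, b):
    """Shape and scale (a*, b*) of the inverse-gamma posterior for sigma^2."""
    T = len(lam)
    ss = sum((lam[t] - mu - phi * (lam[t - 1] - mu)) ** 2 for t in range(1, T))
    ss += (1.0 - phi ** 2) * (lam[0] - mu) ** 2
    return a + T / 2.0, b + 0.5 * ss


def mu_posterior(lam, phi, sigma2, c, d):
    """Mean and variance (c*, d*^2) of the normal posterior for mu."""
    T = len(lam)
    precision = 1.0 / d ** 2 + ((T - 1) * (1.0 - phi) ** 2 + (1.0 - phi ** 2)) / sigma2
    d2_star = 1.0 / precision
    weighted = (1.0 - phi ** 2) * lam[0]
    weighted += (1.0 - phi) * sum(lam[t] - phi * lam[t - 1] for t in range(1, T))
    c_star = d2_star * (c / d ** 2 + weighted / sigma2)
    return c_star, d2_star


# Toy latent path: constant at 1.0, so the posterior mean of mu should stay at 1.0
# when the prior mean is also 1.0.
a_star, b_star = sigma2_posterior([1.0, 1.0, 1.0], mu=1.0, phi=0.5, a=0.1, b=0.1)
c_star, d2_star = mu_posterior([1.0, 1.0, 1.0], phi=0.5, sigma2=1.0, c=1.0, d=1.0)
```

The constant-path example is a sanity check: with no variation in the latent variable, the scale of the inverse-gamma posterior is unchanged and the normal posterior for the mean simply tightens around the common value.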

4.1.4 Comparison to Single Parameter Copulas

For comparison purposes, we estimate not only SCAR models for each pair of markets but also three single parameter copulas: the Gaussian, Clayton, and Gumbel copulas. These three copulas are also estimated using Bayesian methods. The Clayton and Gumbel copula parameters have gamma prior distributions. Note that the prior distribution for the Gumbel copula parameter has its support shifted from $(0, \infty)$ to $(1, \infty)$ in order to match the support of the Gumbel copula parameter. The Gaussian copula parameter has a beta prior distribution. The use of the beta distribution does not allow for a negative parameter in the Gaussian copula. This choice is intentional: a negative parameter in the Gaussian copula would violate economic intuition because prices in linked markets should be positively correlated.

4.1.5 Non-Linear Impulse Responses

The non-linear impulse responses are defined as

$$I_{t+k}(\nu, D_t, D_{t-1}, \dots) = E[D_{t+k} \mid D_t = d_t + \nu, D_{t-1} = d_{t-1}, \dots] - E[D_{t+k} \mid D_t = d_t, D_{t-1} = d_{t-1}, \dots], \tag{4.12}$$

$t = 1, \dots, T$, where $\nu$ is a shock introduced to the model and $d_t$ is the difference in the price

differentials between the two markets. The negative shock is introduced through the latent

variable $\lambda_t$ of the SCAR model. Negative shocks to the latent variable in turn create a negative

parameter in the copula, making an asymmetric movement in the markets. Appendix C provides

a detailed description of how these impulse responses are calculated.
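The definition in equation (4.12), a difference between two conditional expectations, can be approximated by simulation. The sketch below is a generic Monte Carlo version that shocks the starting value directly and uses a toy AR(1) process as the model; it is not the procedure in Appendix C, where the shock enters through the latent variable of the SCAR model. All names and values are hypothetical.

```python
import random


def impulse_response(step, d_t, shock, horizon, n_sims=200, seed=0):
    """Estimate E[D_{t+k} | D_t = d_t + shock] - E[D_{t+k} | D_t = d_t] for k = 0..horizon."""
    totals = [0.0] * (horizon + 1)
    for s in range(n_sims):
        # Common random numbers: both paths see the same future innovations.
        base_rng = random.Random(seed + s)
        shock_rng = random.Random(seed + s)
        base, shocked = d_t, d_t + shock
        totals[0] += shocked - base
        for k in range(1, horizon + 1):
            base = step(base, base_rng)
            shocked = step(shocked, shock_rng)
            totals[k] += shocked - base
    return [total / n_sims for total in totals]


def ar1_step(d_prev, rng, phi=0.5, sd=0.1):
    """Toy model used only for illustration: D_t = phi * D_{t-1} + noise."""
    return phi * d_prev + rng.gauss(0.0, sd)


irf = impulse_response(ar1_step, d_t=0.0, shock=-1.0, horizon=5)
```

For this linear toy model the response is simply the shock multiplied by `phi` raised to the horizon; for a non-linear model the response depends on the history and the size of the shock, which is why simulation is needed.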

4.2 Data

The data used in this analysis are daily prices from six terminal grain markets in North Carolina:

two corn markets, two soybean markets, and two wheat markets. The daily price series of

these markets have a sample period from January 7, 2000 to November 3, 2011 for the two corn

markets and the two soybean markets, and a sample period from October 7, 2005 to May 27,

2010 for the two wheat markets. The corn markets are located in Barber and Laurinburg, which

have a road distance of 113 miles between them. The soybean markets are located in Fayetteville

and Cofield. The road distance between Fayetteville and Cofield is 168 miles. Fayetteville is

currently the only soybean market with a crusher, making it the dominant soybean market

in North Carolina. The crusher is used to process soybeans into meal and oil. Finally, the

wheat markets are situated in Greenville and Statesville, which are 232 miles apart. Given

these distances, all markets (of the same grain) are less than a day's drive apart; therefore, a

price differential between markets caused by a shock should dissipate quickly through spatial

trading.

All markets have missing prices, but no more than 10% of the prices are missing from a

single terminal market. Missing observations are imputed using the posterior predictive distribution

within the Bayesian framework. Posterior predictive sampling differs from the sampling in

classical statistics. Posterior predictive sampling is a two-part process. Since Bayesian methods

treat parameters as random variables, the first part of posterior predictive sampling is drawing

parameter values from the posterior distribution given the observed values. The sample of

parameters drawn from the posterior distribution is then used to generate a sample of future

or missing values using the sampling distribution. The algorithm for sampling from the posterior

predictive distribution is as follows:

1. Draw $\theta^{w} \sim \pi(\theta \mid y^{obs})$

2. Draw $y^{new} \sim f(y \mid \theta^{w})$

which in turn generates a sample $y^{new}$ from $\int f(y^{new} \mid \theta)\, \pi(\theta \mid y^{obs})\, d\theta$.
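The two-step algorithm above can be sketched with a deliberately simple stand-in model: a normal mean with known variance and a conjugate normal prior, for which the posterior is available in closed form. This toy model, the function name, and the default hyperparameters are assumptions chosen for illustration; the models used for imputation in this chapter are more involved.

```python
import math
import random


def posterior_predictive_draws(y_obs, n_draws, prior_mean=0.0, prior_var=100.0,
                               noise_var=1.0, seed=0):
    """Draw from the posterior predictive of a normal mean model with known variance."""
    rng = random.Random(seed)
    n = len(y_obs)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + sum(y_obs) / noise_var)
    draws = []
    for _ in range(n_draws):
        # Step 1: draw a parameter value from the posterior given the observed data.
        theta_w = rng.gauss(post_mean, math.sqrt(post_var))
        # Step 2: draw a new observation from the sampling distribution at that value.
        draws.append(rng.gauss(theta_w, math.sqrt(noise_var)))
    return draws


# Hypothetical observed prices; each draw is a candidate value for a missing price.
imputed = posterior_predictive_draws([5.0] * 20, n_draws=4000)
```

The draws are more dispersed than draws made at a single fixed parameter value because they carry both parameter uncertainty and sampling noise.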

Figure 4.2 shows the price series for each grain market. For the corn markets located in

Barber and Laurinburg, the prices are very similar. The prices from soybean markets located in

Cofield and Fayetteville are also very close together. However, the wheat markets show a very

different story. The Greenville wheat market prices are consistently lower than the prices in the

Statesville wheat market. Note that a consistent price difference does not necessarily indicate

poor spatial integration.

One signal for whether markets are integrated is how many acres were planted. If a crop

is a prominent part of the agricultural economy, its markets are more likely to be integrated.

Figure 4.1 shows the acres planted for corn, soybean, and wheat in North Carolina from 1970

through 2013. Overall, corn production has been decreasing in North Carolina. Since the late

1980s soybean acreage has been higher than the acreage of corn and wheat. Wheat acreage has

historically been low compared to the acreage of corn and soybeans. However, the wheat acreage

planted has increased since 2011.

4.3 Results

Before studying the results from the SCAR models, we discuss the mean and volatility models

for the prices. Table 4.2 shows the statistics and p-values of the Augmented Dickey-Fuller

(ADF) test for each terminal grain market. Note the null hypothesis of the ADF test is the

presence of a unit-root in the time series. According to the ADF test, all six terminal markets

are unit-root nonstationary. Table 4.3 shows statistics for the maximal eigenvalue LR test used

for testing cointegration. Cointegration tests are performed for markets of the same grain.

For example, we test cointegration between the Greenville and Statesville wheat markets. The

maximal eigenvalue LR statistic tests whether there are two unit roots for the two time series (m=0)

or one unit root for the two time series (m=1). The test statistics for the pair of

wheat markets, the pair of corn markets, and the pair of soybean markets all have statistically

significant results indicating two unit roots. This result suggests the pairs of markets are not

cointegrated. Despite not being cointegrated, markets are still able to be well-integrated, as

shown by McNew and Fackler (1997). Because the time series are not cointegrated, we do not use

the Vector Error Correction model. However, because all of the time series are nonstationary, the

Vector Autoregressive models are estimated using the price differentials, $\Delta p_{M,t} = p_{M,t} - p_{M,(t-1)}$,

where $p_{M,t}$ is the price at market $M$ during day $t$.

Estimates for the VAR(1) models for each pair of grain markets as well as the GARCH(1,1)

models for each terminal market are given in Tables 4.4, 4.5, and 4.6. The choice of one lag for the

Vector Autoregressive model is based on AIC selection and likelihood ratio tests. The residuals

of the VAR(1) models indicate heteroskedasticity; hence, GARCH(1,1) models are applied to

each time series of residuals.$^4$ The residuals of the VAR(1) models and the conditional standard

deviations derived from the GARCH(1,1) models are used to determine the standardized

residuals for the copula modeling.

The plots of the prior and posterior distributions of $\mu$, $\phi$, and $\sigma^2$ for the SCAR models are provided in Figures 4.3 - 4.11. Summaries of the means and standard deviations of the posterior distributions are shown in Tables 4.9 - 4.11. The posterior means of $\mu$ for the SCAR models of the wheat markets are 0.7369, 0.7526, and 0.7400 for the Gaussian, Double Clayton, and Double Gumbel SCAR models, respectively. In the corn markets there is more variation among the posterior means of $\mu$: the Gaussian SCAR model has a mean of 0.7962, and the Double Clayton and Double Gumbel SCAR models have means of 1.1335 and 1.2320. The posterior means of $\mu$ for the SCAR models of the soybean markets are 1.1230, 0.7386, and 1.2084 for the Gaussian, Double Clayton, and Double Gumbel SCAR models, respectively. The posterior mean of $\phi$ indicates the dependence between $\lambda_t$ and $\lambda_{t-1}$ in the SCAR models. For the wheat markets, the posterior means of $\phi$ are 0.8787, 0.8708, and 0.8796 for the Gaussian, Double Clayton, and Double Gumbel SCAR models. For the corn markets, the posterior means of $\phi$ are 0.9134, 0.9545, and 0.9583 for the Gaussian, Double Clayton, and Double Gumbel SCAR models (Table 4.10). For the soybean markets, the posterior means of $\phi$ are 0.9566, 0.9219, and 0.9538 for the Gaussian, Double Clayton, and Double Gumbel SCAR models (Table 4.11).

$^4$ The evidence for heteroskedasticity in the Greenville wheat market is present but weak compared to the other markets.

Using the parameters estimated in the SCAR models, we are able to calculate Kendall's tau. The medians of the posterior distributions of Kendall's tau coefficient for each SCAR model variation and each pair of terminal markets can be found in Figure 4.12. Note that because the parameter of the SCAR model is time-varying, Kendall's tau also varies across time. For the corn markets, Kendall's tau across time is approximately 0.4117, 0.7607, and 0.9175 for the Gaussian, Double Clayton, and Double Gumbel SCAR models, respectively. Kendall's tau for the Double Gumbel SCAR model has two sharp declines. Large sudden changes in Kendall's tau for the Double Gumbel SCAR model also occur in the soybean and wheat markets. These changes are associated with large deviations in the data. For example, the decline of Kendall's tau in the wheat markets is associated with a day when there was no price change in the Greenville market but a massive $2.51 spike in the Statesville price. For the soybean markets, Kendall's tau across time is approximately 0.4561, 0.6276, and 0.9112 for the Gaussian, Double Clayton, and Double Gumbel copulas, respectively. Finally, Kendall's tau across time for the wheat markets is approximately 0.4534, 0.7956, and 0.90363 for the Gaussian, Double Clayton, and Double Gumbel copulas, respectively.

In order to determine the preferred model for forecasting, we utilize the same method as Almeida and Czado (2012), which is the continuous ranked probability score (CRPS). The CRPS is a measure of forecasting ability. We examine the forecast of the difference of the price differentials between two markets for each SCAR model, defined as $d_t = \Delta p_{2,t} - \Delta p_{1,t}$. We choose to examine this difference because the non-linear impulse responses examine the difference of the price differentials between two markets. Therefore, we define the CRPS of a single forecasted point at day $l$ as

$$\widehat{CRPS}(d_l^{(o)}) = \frac{1}{R} \sum_{r=1}^{R} |d_l^{(r)} - d_l^{(o)}| - \frac{1}{2R} \sum_{r=1}^{R} |d_l^{(r)} - \tilde{d}_l^{(r)}|, \tag{4.13}$$

where $d_l^{(o)}$ is the difference in price differentials that is observed, $d_l^{(r)}$ is the difference in the simulated price differentials, and $\tilde{d}_l^{(r)}$ is resampled from the simulated price differentials. The CRPS for the entire forecasted period is $\widehat{CRPS}(d^{(o)}) = \frac{1}{w} \sum_{l=T+1}^{T+w} \widehat{CRPS}(d_l^{(o)})$, where $w$ is the length of the forecasted period. A lower continuous ranked probability score indicates a better forecast.
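The sample-based score in equation (4.13) can be sketched directly: the first term measures how far the simulated values are from the observation, and the second term measures the spread of the simulations using a resampled copy. The function names and inputs are hypothetical, and the resampled values are passed in explicitly so that the calculation stays transparent.

```python
def crps_point(d_obs, sims, resims):
    """CRPS estimate for one forecast day from R simulated and R resampled values."""
    R = len(sims)
    accuracy = sum(abs(d - d_obs) for d in sims) / R
    spread = sum(abs(d - d_tilde) for d, d_tilde in zip(sims, resims)) / (2.0 * R)
    return accuracy - spread


def crps_period(obs_path, sims_by_day, resims_by_day):
    """Average the daily CRPS estimates over the forecast window of length w."""
    scores = [
        crps_point(obs, sims, resims)
        for obs, sims, resims in zip(obs_path, sims_by_day, resims_by_day)
    ]
    return sum(scores) / len(scores)


# Hypothetical two-day window with two simulations per day.
score = crps_period(
    [0.0, 0.0],
    [[1.0, -1.0], [2.0, 2.0]],
    [[-1.0, 1.0], [2.0, 2.0]],
)
```

A forecast that is both centered on the observation and appropriately dispersed scores lower than one that is confidently wrong, which is the sense in which lower scores are better.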

For the continuous ranked probability scores of the estimated SCAR models, we simulate the last 100 days of the sample period for each pair of grain markets using posterior predictive sampling. Table 4.7 shows the continuous ranked probability scores for the SCAR models of the three crops. The Double Gumbel SCAR model has the lowest CRPS for the wheat markets, which is 0.3749 compared to 0.6499 for the Gaussian SCAR model and 1.0539 for the Double Clayton SCAR model. The Double Gumbel SCAR model also has the lowest CRPS for the corn markets, which is -0.0756 compared to -0.0419 for the Gaussian SCAR model and 0.01440 for the Double Clayton SCAR model. Finally, the Double Clayton SCAR model has the lowest CRPS for the soybean markets, which is -0.4066 compared to -0.2085 for the Gaussian SCAR model and -0.3538 for the Double Gumbel SCAR model. The CRPS for the single parameter copulas are also estimated. Although the SCAR models do not strictly dominate the single parameter copulas, the best forecasts are produced by the Double Gumbel SCAR model for the wheat and corn markets and the Double Clayton SCAR model for the soybean markets.

The impulse responses are shown in Figures 4.13 - 4.15 and the half-lives for these impulse

responses are given in Table 4.12. An impulse response shows the deviation from equilibrium

after a shock is imposed. The deviations in these impulse responses are the differences in the

price differentials of the pairs of grain markets. The half-life of an impulse response is the

length of time for half of the deviation caused by a shock to be extinguished. In our analysis,

all markets return to equilibrium regardless of the choice of SCAR model within a thirty day

period; however, the amount of time to return to equilibrium varies substantially. For shocks

from the Greenville and Statesville wheat markets, the half-lives of the impulse responses are

between six and seven days for the Gaussian, Double Clayton, and Double Gumbel SCAR

models. The impulse responses for the corn markets are not as consistent across SCAR models.

Depending on the SCAR model, the half-lives for shocks in the Barber and Laurinburg

corn markets are between 1 and 7 days. From the best forecasting model for the corn markets,

the Double Gumbel SCAR model, the half-lives are 6.83 days for a shock from the Barber corn

market and 3.40 days for a shock from the Laurinburg market. Although the impulse responses

from the Gaussian SCAR models indicate a very slow decay back to equilibrium for shocks

from either the Cofield or Fayetteville soybean markets, the Clayton SCAR model and Gumbel

SCAR model are consistent in their results. Note according to the CRPS, the Clayton SCAR

model and Gumbel SCAR model both forecast better than the Gaussian SCAR model for the

soybean markets. The Clayton SCAR model and Gumbel SCAR model show half-lives between

one and two days for deviations in either the Cofield or Fayetteville soybean markets.
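The half-life described above, the time needed for half of the initial deviation to be extinguished, can be read off an impulse response path. The sketch below uses linear interpolation between daily points to return a fractional number of days; the interpolation rule and the function name are assumptions, since the text does not state how fractional half-lives are obtained.

```python
def half_life(irf):
    """First horizon (linearly interpolated) at which |irf| falls to half of |irf[0]|."""
    if irf[0] == 0:
        return 0.0
    target = abs(irf[0]) / 2.0
    for k in range(1, len(irf)):
        prev, cur = abs(irf[k - 1]), abs(irf[k])
        if cur <= target:
            # Interpolate between day k - 1 and day k.
            return (k - 1) + (prev - target) / (prev - cur)
    return None  # the deviation never halves within the horizon


# Hypothetical impulse response paths.
fast = half_life([1.0, 0.5, 0.25])
slow = half_life([-2.0, -1.5, -0.5])
```

Returning `None` when the deviation never halves keeps a slowly decaying response from being reported as having a short half-life.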

4.4 Discussion

Despite some crops having better integrated markets than others, ultimately the Law of One

Price holds in all markets as indicated by the returning to equilibrium seen in all of the impulse

responses. Overall the results agree with economic intuition. Given that the markets investigated

are all close in proximity, we expect these markets to be spatially integrated. The soybean

markets have the shortest half-lives for the impulse responses, according to the Clayton and

Gumbel SCAR models, and the acreage planted with soybeans is higher than the acreage

planted with either corn or wheat. This supports the statement made earlier that the more widely

produced crops have more integrated markets. The wheat markets return to equilibrium much

more slowly, which is not surprising because wheat is not a dominant feature of North Carolina

agriculture.

The need to investigate grain markets in North Carolina will only grow in the upcoming

years because of the desire to produce more feed grain in North Carolina. North Carolina is

one of the top pork and poultry-producing states. In 2010 North Carolina had the second

highest state inventory of pigs. It also had the second highest total for United States cash

receipts of all poultry and eggs totaling $3.62 billion. This notable production of livestock

requires a considerable amount of feed grain. North Carolina currently imports a substantial

amount of feed grain from the Midwest. The North Carolina Grain Initiative's goal is for North

Carolina to become more self-sufficient in producing feed grain for livestock. One aspect of

the North Carolina Grain Initiative encourages double cropping grain sorghum after wheat

instead of double cropping soybeans after wheat to increase grain production (Piggott, 2013).

The current production of grain sorghum in North Carolina is very sparse. If grain sorghum

production grows in North Carolina, the grain sorghum markets may not be well-integrated

due to the initial thinness of the markets. Also if soybean production decreases, the soybean

markets may become less integrated.

4.5 Conclusion

The purpose of this investigation is to provide a method of measuring spatial integration without

the explicit use of threshold regression models. To accomplish this goal, we use Stochastic

Copula Autoregressive models. Our results show that all of the pairs of North Carolina grain

markets examined in this paper are spatially integrated and the Law of One Price does hold.

The time required to recover from a shock strongly depends on whether the markets are thin.

We see the half-lives of the impulse responses, depending on the choice of SCAR model and

grain, range from just over a day to a little under two weeks.

The investigation of these grain markets will continue. This paper looks at only pair-wise

relationships among terminal markets. In future research we hope to move beyond this pair-wise

construction and analyze three or more grain markets at a time. Also as the North Carolina

Grain Initiative grows, we wish to investigate its effect on the spatial integration of markets.

If grain sorghum replaces some of the soybean production, one would expect soybean markets

in North Carolina to become less integrated. Spatial integration is an important indicator of

information. If information in terminal markets is asymmetric, the markets will not be

well-integrated. Therefore, by tracking whether or not markets are integrated, we may help market

players make decisions as grain markets in North Carolina evolve.

4.6 Tables and Figures

[Figure: line plot of acres planted (millions) by year, 1970 to 2010, for corn, soybean, and wheat]

Figure 4.1: Acres planted for the three crops of interest

[Figure: line plots of daily prices (dollars) by year. (a) Wheat: Greenville and Statesville, 2006 to 2010. (b) Soybean: Cofield and Fayetteville, 2001 to 2011. (c) Corn: Barber and Laurinburg, 2001 to 2011.]

Figure 4.2: Daily Market Prices

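Table 4.1: Kendall's tau coefficient

                  $\theta_t \ge 0$                        $\theta_t < 0$
Gaussian          $\frac{2}{\pi}\sin^{-1}(\theta_t)$      $\frac{2}{\pi}\sin^{-1}(\theta_t)$
Double Clayton    $\frac{\theta_t}{2+\theta_t}$           $\frac{\theta_t}{2+|\theta_t|}$
Double Gumbel     $\frac{\theta_t-1}{\theta_t}$           $-\frac{\theta_t+1}{\theta_t}$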

Table 4.2: Results of the Augmented Dickey-Fuller Test conducted on the price series from each terminal grain market.

                 Wheat                      Corn                      Soybean
                 Greenville   Statesville   Barber    Laurinburg      Cofield   Fayetteville
Dickey-Fuller    -1.4826      -1.6885       -2.2335   -2.2809         -2.4634   -2.478
p-value           0.7973       0.7102        0.4795    0.4594          0.3821    0.3759

Table 4.3: Cointegration Test: Maximal Eigenvalue Likelihood Statistics. An estimate with asterisks *, **, or *** indicates statistical significance at the $\alpha = .10$, $\alpha = .05$, and $\alpha = .01$ levels, respectively.

           r=0       r=1
Wheat      53.41     3.09
Corn       36.03     1.61
Soybean    150.15    1.82

Table 4.4: VAR(1) and GARCH(1,1) estimates and standard errors for the wheat markets. $\Delta p_{WG,t}$ and $\Delta p_{WS,t}$ are the price differentials for the wheat markets in Greenville and Statesville, respectively, on day $t$. An estimate with asterisks *, **, or *** indicates statistical significance at the $\alpha = .10$, $\alpha = .05$, and $\alpha = .01$ levels, respectively.

                     Constant    Coef. for                Coef. for                $\omega$     $\alpha$    $\beta$
                                 $\Delta p_{WG,(t-1)}$    $\Delta p_{WS,(t-1)}$
$\Delta p_{WG,t}$    0.0017      -0.3696                  -0.0183                  0.00019      1           .1175
                     (0.0031)    (0.0293)                 (0.0288)                 -            -           (0.0358)
$\Delta p_{WS,t}$    0.0035      0.1110                   -0.1192                  0.00396      0.4953      0.5907
                     (0.0031)    (0.0300)                 (0.0294)                 (0.00008)    (0.0591)    (0.03464)

Table 4.5: VAR(1) and GARCH(1,1) estimates and standard errors for the corn markets. $\Delta p_{CB,t}$ and $\Delta p_{CL,t}$ are the price differentials for the corn markets in Barber and Laurinburg, respectively, on day $t$. An estimate with asterisks *, **, or *** indicates statistical significance at the $\alpha = .10$, $\alpha = .05$, and $\alpha = .01$ levels, respectively.

                     Constant    Coef. for                Coef. for                $\omega$     $\alpha$    $\beta$
                                 $\Delta p_{CB,(t-1)}$    $\Delta p_{CL,(t-1)}$
$\Delta p_{CB,t}$    0.0017      -0.2189                  0.1569                   0.00005      0.0908      0.9078
                     (0.0031)    (0.0303)                 (0.0312)                 (0.00001)    (0.0122)    (0.0107)
$\Delta p_{CL,t}$    0.0017      0.0717                   -0.1551                  0.00019      .2140       0.7882
                     (0.0017)    (0.0294)                 (0.0304)                 (0.00002)    (0.0234)    (0.0181)

Table 4.6: VAR(1) and GARCH(1,1) estimates and standard errors for the soybean markets. $\Delta p_{SC,t}$ and $\Delta p_{SF,t}$ are the price differentials for the soybean markets in Cofield and Fayetteville, respectively, on day $t$. An estimate with asterisks *, **, or *** indicates statistical significance at the $\alpha = .10$, $\alpha = .05$, and $\alpha = .01$ levels, respectively.

                     Constant    Coef. for                Coef. for                $\omega$     $\alpha$    $\beta$
                                 $\Delta p_{SC,(t-1)}$    $\Delta p_{SF,(t-1)}$
$\Delta p_{SC,t}$    0.0033      -0.1236                  0.1463                   0.00006      0.0583      0.9426
                     (0.0031)    (0.0293)                 (0.0288)                 (0.00002)    (0.0068)    (0.0061)
$\Delta p_{SF,t}$    0.0035      0.1110                   -0.1192                  0.00003      .0496       0.9529
                     (0.0031)    (0.0300)                 (0.0294)                 (0.00001)    (0.0059)    (0.0050)

Table 4.7: CRPS

Wheat Corn Soybean


Gaussian SCAR 0.6499 -0.0419 -0.2085
Double Clayton SCAR 1.0539 0.01440 -0.4066
Double Gumbel SCAR 0.3749 -0.07556 -0.3538
Gaussian 0.7649 -0.0243 -0.2490
Clayton 0.5368 -0.02728 -0.2874
Gumbel 0.8168 -0.0053 -0.2950

Table 4.8: Parameter Estimates for the single parameter copulas

Gaussian Clayton Gumbel


mean st. dev. mean st. dev. mean st. dev.
Wheat 0.9803 0.0612 3.5895 1.5941 3.9569 1.4435
Corn 0.9450 0.0887 3.4937 1.5843 3.9721 2.9036
Soybeans 0.9297 0.0900 3.5523 1.6176 4.6241 1.2829

Table 4.9: Parameter Estimates for SCAR models of Wheat Markets

Gaussian SCAR Double Clayton SCAR Double Gumbel SCAR


mean st. dev. mean st. dev. mean st. dev.
0.7369 0.3676 0.7526 0.3342 0.7400 0.3000
0.8787 0.1822 0.8708 0.1833 0.8796 0.1685
2 0.00001 0.00002 0.00001 0.00002 0.00001 0.000001

Table 4.10: Parameter Estimates for SCAR models of Corn Markets

Gaussian SCAR Double Clayton SCAR Double Gumbel SCAR


mean st. dev. mean st. dev. mean st. dev.
0.7962 0.4806 1.1335 0.9146 1.2320 1.0476
0.9134 0.1397 0.9545 0.0866 0.9583 0.0719
2 0.000002 0.000005 0.00001 0.000003 0.000003 0.000070

Table 4.11: Parameter Estimates for SCAR models of Soybean Markets

Gaussian SCAR Double Clayton SCAR Double Gumbel SCAR


mean st. dev. mean st. dev. mean st. dev.
1.1230 0.8825 0.73859 0.6774 1.2084 1.0573
0.9566 0.08517 0.9219 0.14460 0.9538 0.09743
2 0.00001 0.00002 0.00002 0.0001 0.000001 0.000003

[Figure: three panels, one per model parameter, each showing the prior and posterior densities; vertical axis: Density.]

Figure 4.3: Wheat: Gaussian Copula Parameters

[Figure: three panels, one per model parameter, each showing the prior and posterior densities; vertical axis: Density.]

Figure 4.4: Wheat: Double Clayton Copula Parameters

[Figure: three panels, one per model parameter, each showing the prior and posterior densities; vertical axis: Density.]

Figure 4.5: Wheat: Double Gumbel Copula Parameters

[Figure: three panels, one per model parameter, each showing the prior and posterior densities; vertical axis: Density.]

Figure 4.6: Corn: Gaussian Copula Parameters

[Figure: three panels, one per model parameter, each showing the prior and posterior densities; vertical axis: Density.]

Figure 4.7: Corn: Double Clayton Copula Parameters

[Figure: three panels, one per model parameter, each showing the prior and posterior densities; vertical axis: Density.]

Figure 4.8: Corn: Double Gumbel Copula Parameters

[Figure: three panels, one per model parameter, each showing the prior and posterior densities; vertical axis: Density.]

Figure 4.9: Soybean: Gaussian Copula Parameters

[Figure: three panels, one per model parameter, each showing the prior and posterior densities; vertical axis: Density.]

Figure 4.10: Soybean: Double Clayton Copula Parameters

[Figure: three panels, one per model parameter, each showing the prior and posterior densities; vertical axis: Density.]

Figure 4.11: Soybean: Double Gumbel Copula Parameters

[Figure: three panels, (a) Wheat, (b) Corn, and (c) Soybean, plotting Kendall's tau against time in days for the Double Clayton, Double Gumbel, and Gaussian models.]

Figure 4.12: Kendall's Tau

[Figure: two panels, (a) Shock in Greenville Market and (b) Shock in Statesville Market, plotting the response in dollars against days 0 to 30 for the Double Clayton, Double Gumbel, and Gaussian models.]

Figure 4.13: Wheat: Impulse Response Functions

[Figure: two panels, (a) Shock in Barber Market and (b) Shock in Laurinburg Market, plotting the response in dollars against days 0 to 30 for the Double Clayton, Double Gumbel, and Gaussian models.]

Figure 4.14: Corn: Impulse Response Functions

[Figure: two panels, (a) Shock in Fayetteville Market and (b) Shock in Cofield Market, plotting the response in dollars against days 0 to 30 for the Double Clayton, Double Gumbel, and Gaussian models.]

Figure 4.15: Soybean: Impulse Response Functions

Table 4.12: Half-Lives of Impulse Responses. The half-life is the time (in days) that it takes
for the deviation to dissipate to half of its distance to the equilibrium.

Gaussian SCAR Double Clayton SCAR Double Gumbel SCAR


Wheat: Greenville 6.15 6.19 6.30
Wheat: Statesville 6.50 6.27 6.05
Corn: Barber 1.32 3.81 6.83
Corn: Laurinburg 1.31 5.94 3.40
Soybean: Cofield 9.52 1.31 1.38
Soybean: Fayetteville 8.51 1.32 1.39
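
A half-life of the kind reported in Table 4.12 can be read off an impulse response path. The following is a minimal sketch, assuming the response is stored as a list of daily deviations from equilibrium and that fractional days are obtained by linear interpolation. The function name `half_life` and the geometrically decaying example path are illustrative assumptions, not the procedure or data behind the table.

```python
def half_life(path):
    """Return the first time (in days) at which |path| falls to half of |path[0]|.

    path[k] is the deviation from equilibrium k days after the shock.
    Linear interpolation between days gives fractional half-lives.
    Returns None if the response never drops to the half level.
    """
    target = abs(path[0]) / 2.0
    for k in range(1, len(path)):
        prev, curr = abs(path[k - 1]), abs(path[k])
        if curr <= target < prev:
            # Interpolate between day k-1 and day k.
            return (k - 1) + (prev - target) / (prev - curr)
    return None


# Hypothetical response that decays by 10% per day over 30 days.
response = [0.08 * 0.9 ** k for k in range(31)]
hl = half_life(response)
```

For this made-up path the half level is crossed between day 6 and day 7, so `hl` lies between 6 and 7.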

Chapter 5

Conclusion

In this dissertation, we examine economic questions related to spatial dependence. In Chapter
2 and Chapter 3, we investigate the issue of systemic risk in two federal programs, the National
Flood Insurance Program and the federal crop insurance program, through spatial modeling.
Both analyses show the importance of incorporating spatial dependencies when systemic risk
is present.

In Chapter 2, we calculate the actuarially fair premiums for flood insurance policies in
Florida. Three single hurdle models are considered for the estimation of the annual count of
flood insurance claims in each county of Florida. Parameters of the models are allowed to
vary spatially across the counties. Although the Single Hurdle Poisson model does not give the
best predictions on a yearly basis, the average prediction of this model over the sample period
is closer to the historical average than those of the other single hurdle models and the current
rating of the NFIP. Although the loss-cost ratios (LCRs) from our estimation are higher than one,
finer resolution data on historical indemnity payments, together with data on home values, would
make it possible to better estimate indemnity payments. This would lead to estimated LCRs closer
to one. Because flood maps cannot be updated frequently, less costly complementary methods,
such as the predictions from the Single Hurdle Poisson model, should be explored and either used
in conjunction with the flood maps or applied at a finer resolution as a stand-alone method.

Chapter 3 examines spatial dependencies of Kansas winter wheat yields and applies the
findings to the estimation of crop insurance premium rates and payouts for the Agricultural
Risk Coverage program. Our analysis shows that the best fit for county yields allows the spatial
dependencies among the counties to change with the value of the yields. The spatial dependency
changes if the yield is below a certain percentage of the median yield. The optimal percentage of
the median yield is 108% for dryland wheat and 102.5% for irrigated wheat. Therefore, we find
that within our framework county yields are best described using not only spatial intercepts
but also a secondary spatial covariate for yields that are under 108% of the median
for dryland wheat and 102.5% of the median for irrigated wheat. When compared to a model
that assumes no correlation between yields, we see that the dryland wheat premium ratings for
different coverage levels are more consistent. Therefore, by including spatial dependencies in
crop insurance ratings, the premium rates better reflect intuition.

Unlike the previous two chapters, Chapter 4 does not analyze risk management issues. Instead,
this chapter focuses on the spatial integration of six grain markets located in North
Carolina. The purpose of this investigation is to provide a method of measuring spatial
integration without the explicit use of threshold regression models. To accomplish this goal, we
use Stochastic Copula Autoregressive (SCAR) models. Despite some crops having better integrated
markets than others, ultimately the Law of One Price holds in all markets, as indicated by the
return to equilibrium seen in all impulse responses. Overall, the results agree with basic
intuition. Given that the markets investigated are all a relatively short driving distance apart,
a maximum of four hours, we expect these markets to be spatially integrated. The soybean markets
have the shortest half-lives for the impulse responses, according to the Clayton and Gumbel SCAR
models, and the quantity of acres planted with soybeans is higher than the quantities of
acres planted with corn and with wheat. The wheat markets return to equilibrium much more slowly,
which is not surprising because wheat is not a dominant feature of North Carolina agriculture.

There are plans for future research based on each of these essays. For the National Flood
Insurance Program, further research will utilize these single hurdle models at a finer spatial
resolution within a given county, preferably at the ZIP code or census tract level. This will allow
us to explore socio-economic issues within the flood insurance program. We would also like to
include information about the Community Rating System (CRS) to determine the effectiveness
of CRS and its impact on market penetration. For agricultural risk management, because the majority
of crop insurance policies have guarantees based on the production of individual producers
instead of county-level production, we plan to apply the models used in Chapter 3 to yields of
individual producers. We also plan to further compare expected payouts of new programs, such
as the ARC program, to direct payments, counter-cyclical payments, and the ACRE program.
Finally, for the topics addressed in Chapter 4, future research will analyze the effect of
the new North Carolina Grain Initiative. If information in terminal markets is asymmetric, the
markets will not be well integrated. Therefore, by tracking whether or not markets are integrated,
we can help market players have the full information needed to make decisions as grain production
in North Carolina grows.

REFERENCES

[1] Agricultural Act of 2014. https://agriculture.house.gov/farmbill/.

[2] C. Almeida and C. Czado. Efficient Bayesian inference for stochastic time-varying copula
models. Computational Statistics & Data Analysis, 56(6):1511–1527, 2012.

[3] P.G. Ardeni. Does the law of one price really hold for commodity prices? American Journal
of Agricultural Economics, 71(3):661–669, 1989.

[4] B.A. Babcock, C.E. Hart, and D.J. Hayes. Actuarial fairness of crop insurance rates with
constant rate relativities. American Journal of Agricultural Economics, 86(3):563–575,
2004.

[5] K. Balcombe, A. Bailey, and J. Brooks. Threshold effects in price transmission: the case
of Brazilian wheat, maize, and soya prices. American Journal of Agricultural Economics,
89(2):308–323, 2007.

[6] S. Banerjee, A. Gelfand, and B. Carlin. Hierarchical Modeling and Analysis for Spatial
Data. Chapman and Hall/CRC, 1 edition, 2003.

[7] C.B. Barrett. Measuring integration and efficiency in international agricultural markets.
Review of Agricultural Economics, 23(1):19–32, 2001.

[8] C.B. Barrett and J.R. Li. Distinguishing between equilibrium and integration in spatial
price analysis. American Journal of Agricultural Economics, 84(2):292–307, 2002.

[9] E.S. Blake and E.J. Gibney. The deadliest, costliest, and most intense United States tropical
cyclones from 1851 to 2010 (and other frequently requested hurricane facts). August 2011.

[10] T. Bollerslev. Generalized autoregressive conditional heteroskedasticity. Journal of
Econometrics, 31(3):307–327, 1986.

[11] Arthur Charpentier, Jean-David Fermanian, and Olivier Scaillet. The estimation of
copulas: Theory and practice. Copulas: From theory to application in finance, pages 35–60,
2007.

[12] J. Chivers and N.E. Flores. Market failure in information: The National Flood Insurance
Program. Land Economics, 78(4):515–521, 2002.

[13] K.H. Coble, T.O. Knight, B.K. Goodwin, M.F. Miller, and R.M. Rejesus. A comprehensive
review of the RMA APH and COMBO rating methodology. March 15 2010.

[14] Congressional Budget Office. The National Flood Insurance Program: Factors affecting
actuarial soundness. 2009.

[15] D. Cooley, N. Douglas, and N. Philippe. Bayesian spatial modeling of extreme precipitation
return levels. Journal of the American Statistical Association, 136:824–840, 2007.


[16] D. Diers, M. Eling, and S. Marek. Dependence modeling in non-life insurance using the
Bernstein copula. Insurance: Mathematics and Economics, 50:430–436, 2012.

[17] R. Doman and M. Doman. Copula based impulse response analysis of linkages between
stock markets. Available at SSRN 1615108, 2010.

[18] P.L. Fackler and B.K. Goodwin. Spatial price analysis. Handbook of Agricultural Economics,
1:971–1024, 2001.

[19] M.D. Faminow and B.L. Benson. Integration of spatial markets. American Journal of
Agricultural Economics, 72(1):49–62, 1990.

[20] Federal Emergency Management Agency. National Flood Insurance Program: Program
description. August 2002.

[21] Federal Emergency Management Agency. Biggert-Waters Flood Insurance Reform Act of
2012, December 2012. http://www.fema.gov/flood-insurance-reform-act-2012.

[22] M.J. Fischer and I. Klein. Some results on weak and strong tail dependence coefficients for
means of copulas. Technical report, Diskussionspapiere // Friedrich-Alexander-Universität
Erlangen-Nürnberg, Lehrstuhl für Statistik und Ökonometrie, 2007.

[23] A.E. Gelfand, H.J. Kim, C.F. Sirmans, and S. Banerjee. Spatial modeling with spatially
varying coefficients processes. Journal of the American Statistical Association,
98(462):387–396, 2003.

[24] A.E. Gelfand and A. Smith. Sampling-based approaches to calculating marginal densities.
Journal of the American Statistical Association, 85(410):398–409, 1990.

[25] Andrew Gelman, John B Carlin, Hal S Stern, and Donald B Rubin. Bayesian data analysis.
Texts in Statistical Science Series, 2004.

[26] C. Genest, J. Neslehova, and N. Ben Ghorbal. Estimators based on Kendall's tau in
multivariate copula models. Australian & New Zealand Journal of Statistics, 53(2):157–177,
2011.

[27] W.R. Gilks, S. Richardson, and D.J. Spiegelhalter, editors. Markov Chain Monte Carlo in
Practice: Interdisciplinary Statistics. Chapman and Hall/CRC, 1996.

[28] J.W. Glauber. Crop insurance reconsidered. American Journal of Agricultural Economics,
86(5):1179–1195, 2004.

[29] T. Gneiting and A.E. Raftery. Strictly proper scoring rules, prediction, and estimation.
Journal of the American Statistical Association, 102(477):359–378, 2007.

[30] B.K. Goodwin. Problems with market insurance in agriculture. American Journal of
Agricultural Economics, 2001.

[31] B.K. Goodwin and A.P. Ker. Nonparametric estimation of crop yield distributions:
implications for rating group-risk crop insurance contracts. American Journal of Agricultural
Economics, 80(1):139–153, 1998.


[32] B.K. Goodwin and N.E. Piggott. Spatial market integration in the presence of threshold
effects. American Journal of Agricultural Economics, 83(2):302–317, 2001.

[33] B.K. Goodwin and T.C. Schroeder. Cointegration tests and spatial price linkages in regional
cattle markets. American Journal of Agricultural Economics, 73(2):452–464, 1991.

[34] B.K. Goodwin and V. Smith. What harm is done by subsidizing crop insurance? American
Journal of Agricultural Economics, 95(2):489–497, 2013.

[35] Government Accountability Office. FEMA: Action needed to improve administration of the
National Flood Insurance Program. Rep. no. 11-297. 2011.

[36] Government Accountability Office. National Flood Insurance Program: Continued attention
needed to address challenges. (GAO-13-858T), September 2013.

[37] F. Guzzetti, C.P. Stark, and P. Salvati. Evaluation of flood and landslide risk to the
population of Italy. Environmental Management, 36(1):15–36, 2005.

[38] J. Ifft, C. Nickerson, T. Kuethe, and C. You. Potential farm-level effects of eliminating
direct payments, November 2012.

[39] D.M. Jaffee and T. Russell. Imperfect information, uncertainty, and credit rationing. The
Quarterly Journal of Economics, 90(4):651–666, 1976.

[40] H. Joe. Multivariate models and multivariate dependence concepts, volume 73. CRC Press,
1997.

[41] S. Johansen. Statistical analysis of cointegration vectors. Journal of Economic Dynamics
and Control, 12(2):231–254, 1988.

[42] N.L. Johnson, A.W. Kemp, and S. Kotz, editors. Univariate Discrete Distributions. Wiley,
3 edition, 2005.

[43] C. Kousky and R. Cooke. The unholy trinity: Fat tails, tail dependence, and
micro-correlations. 2009a. Issue Brief 12-08.: Resources for the Future.

[44] C. Kousky and R. Cooke. Climate change and risk management: Challenges for insurance,
adaptation, and loss estimation. 2009b. Issue Brief 12-08.: Resources for the Future.

[45] C. Kousky and E. Michel-Kerjan. Hurricane Sandy, storm surge, and the National Flood
Insurance Program: A primer on New York and New Jersey. 2012. Resources for the Future.

[46] D. Lunn, D. Spiegelhalter, A. Thomas, and N. Best. The BUGS project: Evolution, critique,
and future directions. Statistics in Medicine, 28:3049–3067, 2009.

[47] K. McNew and P.L. Fackler. Testing market equilibrium: is cointegration informative?
Journal of Agricultural & Resource Economics, 22(2), 1997.

[48] E. O. Michel-Kerjan. Catastrophe economics: The National Flood Insurance Program.
Journal of Economic Perspectives, 24:165–186, 2010.


[49] M.J. Miranda and J.W. Glauber. Systemic risk, reinsurance, and the failure of crop
insurance markets. American Journal of Agricultural Economics, 79(1):206–215, 1997.

[50] National Agricultural Statistics Service. Quick stats, March 2014.
http://quickstats.nass.usda.gov/.

[51] R.B. Nelsen. An introduction to copulas. Springer, 1999.

[52] E. Neuwirth. RColorBrewer: ColorBrewer palettes, 2011. R package version 1.0-5.

[53] A. O'Hagan, J. Forster, and M.G. Kendall. Bayesian Inference. Arnold London, 2004.

[54] A. Panagiotelis and M. Smith. Bayesian density forecasting of intraday electricity prices
using multivariate skew t distributions. International Journal of Forecasting,
24(4):710–727, 2008.

[55] D. Pfeifer, D. Straßburger, and J. Phillipps. Modelling and simulation of dependence
structures in nonlife insurance with Bernstein copulas. International ASTIN Colloquium, 2012.

[56] N. Piggott. North Carolina Grain Initiative, March 2013.

[57] M.D. Porter and G. White. Self-exciting hurdle models for terrorist activity. The Annals
of Applied Statistics, 6(1):106–124, 2012.

[58] M. Ravallion. Testing market integration. American Journal of Agricultural Economics,
68(1):102–109, 1986.

[59] A.B. Richard and R.W. Allan. maps: Draw Geographical Maps, 2012. R package version
2.3-0.

[60] Risk Management Agency. History of the crop insurance program, 2014.
http://www.rma.usda.gov/aboutrma/what/history.html.

[61] Risk Management Agency. Summary of business, 2014.
http://www.rma.usda.gov/data/sob.html.

[62] P.S. Sephton. Spatial market arbitrage and threshold cointegration. American Journal of
Agricultural Economics, 85(4):1041–1046, 2003.

[63] T. Serra and B.K. Goodwin. Price transmission and asymmetric adjustment in the Spanish
dairy sector. Applied Economics, 35(18):1889–1899, 2003.

[64] R.J. Sexton, C.L. Kling, and H.F. Carman. Market integration, efficiency of arbitrage,
and imperfect competition: methodology and application to US celery. American Journal
of Agricultural Economics, 73(3):568–580, 1991.

[65] J.S. Shonkwiler and D.W. Shaw. Hurdle count-data models in recreation demand analysis.
Journal of Agricultural and Resource Economics, 21(2):210–219, 1996.

[66] J.R. Skees and B.J. Barnett. Conceptual and practical considerations for sharing
catastrophic/systemic risks. Review of Agricultural Economics, 21(2):424–441, 1999.


[67] M. Sklar. Fonctions de répartition à n dimensions et leurs marges. Université Paris 8,
1959.

[68] S. Sturtz, U. Ligges, and A. Gelman. R2WinBUGS: A package for running WinBUGS from R.
Journal of Statistical Software, 12(3), 2005.

[69] C.P. Timmer. The corn economy of Indonesia. Cornell University Press, 1987.

[70] James Tobin. Estimation of relationships for limited dependent variables. Econometrica:
Journal of the Econometric Society, pages 24–36, 1958.

[71] W.G. Tomek. Price behavior on a declining terminal market. American Journal of
Agricultural Economics, 62(3):434–444, 1980.

[72] R.S. Tsay. Analysis of financial time series, volume 543. John Wiley & Sons, 2005.

[73] United States Army Corps of Engineers. Levee inspection, 2013.
http://www.usace.army.mil/Missions/CivilWorks/LeveeSafetyProgram/LeveeInspections.aspx.

[74] United States Army Corps of Engineers. National levee database, 2013.
http://nld.usace.army.mil/egis/f?p=471:1:.

[75] United States Geological Survey. National elevation dataset. http://ned.usgs.gov/.

[76] M. Wall. A close look at the spatial structure implied by the CAR and SAR models. Journal
of Statistical Planning and Inference, 121(2):311–324, 2004.

[77] J.D. Woodard, G.D. Schnitkey, B.J. Sherrick, N. Lozano-Gracia, and L. Anselin. A spatial
econometric analysis of loss experience in the U.S. crop insurance program. The Journal of
Risk and Insurance, 79(1):261–285, 2012.

APPENDICES

Appendix A

Conditional Autoregressive Model (Chapters 1 & 2)

The following definition of a Conditional Autoregressive (CAR) model comes from Wall
(2004). Suppose there is a fixed set of regions $\{R_1, R_2, \ldots, R_N\}$ and an indexing set $D$. If
the set $\{R_1, R_2, \ldots, R_N\}$ is a partition of $D$ such that $R_1 \cup R_2 \cup \ldots \cup R_N = D$, the set
$\{R_1, R_2, \ldots, R_N\}$ is called the lattice of $D$. Suppose $\{Z(R_i) : R_i \in (R_1, \ldots, R_N)\}$ is a
Gaussian random process. Then the data can be modeled as a CAR model, as discussed by Wall
(2004),

$$Z(R_i) \mid Z(R_{(-i)}) \sim N\left(\mu_i + \sum_{j=1}^{N} a_{ij}\,(Z(R_j) - \mu_j),\ \sigma_i^2\right), \tag{A.1}$$

where $Z(R_{(-i)}) = \{Z(R_j) : j \neq i\}$, $E(Z(R_j)) = \mu_j$, $\sigma_i^2$ is the conditional variance, and $a_{ij}$ are
constants for $i = 1, \ldots, N$. The elements $a_{ij}$ form a matrix $A$ such that $A = \rho W$, where $\rho$ is a
scalar and $W$ is the neighborhood matrix with its elements $w_{ij}$ defined as follows:

$$w_{ij} = \begin{cases} 1 & \text{if regions } i \text{ and } j \text{ are neighbors, denoted by } i \sim j \\ 0 & \text{if regions } i \text{ and } j \text{ are not neighbors or } i = j \end{cases} \tag{A.2}$$
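
To make (A.1) and (A.2) concrete, the sketch below builds the neighborhood matrix W for a small lattice and evaluates the conditional mean in (A.1). The four-region adjacency list, the value of the scalar multiplying W (called `rho` here), the data values, and the helper names are all assumptions chosen only for illustration.

```python
def neighborhood_matrix(neighbors, n):
    """Build W from (A.2): w_ij = 1 if regions i and j are neighbors, else 0."""
    w = [[0] * n for _ in range(n)]
    for i, js in neighbors.items():
        for j in js:
            if i != j:          # the diagonal stays 0 by definition
                w[i][j] = 1
                w[j][i] = 1     # the neighbor relation is symmetric
    return w


def car_conditional_mean(i, z, mu, w, rho):
    """Conditional mean of Z(R_i) given all other regions, as in (A.1) with A = rho * W."""
    return mu[i] + sum(rho * w[i][j] * (z[j] - mu[j])
                       for j in range(len(z)) if j != i)


# Hypothetical lattice: four regions in a row, 0-1-2-3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
W = neighborhood_matrix(neighbors, 4)

mu = [0.0, 0.0, 0.0, 0.0]       # unconditional means (assumed)
z = [1.0, 2.0, -1.0, 0.5]       # observed values (made up)
m0 = car_conditional_mean(0, z, mu, W, rho=0.2)
m1 = car_conditional_mean(1, z, mu, W, rho=0.2)
```

Only neighboring regions enter the sum, so region 0 borrows information from region 1 alone, while region 1 borrows from regions 0 and 2.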

Appendix B

Prior Distributions for Models (Chapter 1)

B.1 Logit Link

1. Spatial Intercept (CAR model): In OpenBUGS, we only set the prior for the precision,
tau, the inverse of the variance. For this we use Gamma(.1, .1).

2. log(Number of Policies + 1): $\beta_1 \sim N(0, 2)$

3. Coast: $\beta_2 \sim N(0, 10)$

4. Minimum Elevation: $\beta_3 \sim N(0, 10)$
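
Because the CAR prior is placed on the precision rather than on the variance, it can help to see what Gamma(.1, .1) implies on the standard deviation scale. The sketch below draws from that prior using only the Python standard library. It assumes the shape-rate form of the Gamma distribution used by BUGS; the variable names and the number of draws are illustrative.

```python
import random
import statistics

random.seed(1)

SHAPE, RATE = 0.1, 0.1   # Gamma(.1, .1), assumed to be in shape-rate form as in BUGS

# random.gammavariate(alpha, beta) treats beta as a SCALE, so pass 1 / RATE.
tau_draws = [random.gammavariate(SHAPE, 1.0 / RATE) for _ in range(100_000)]

# The precision tau is the inverse of the variance, so sd = tau ** -0.5.
sd_draws = [t ** -0.5 for t in tau_draws if t > 0.0]

prior_mean_tau = statistics.fmean(tau_draws)   # theoretical mean: SHAPE / RATE = 1
median_sd = statistics.median(sd_draws)
```

The prior mean of the precision is one, but most of the prior mass sits at very small precisions, so the implied standard deviations are typically large.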

B.2 Log Link

1. Spatial Intercepts: $\beta_{0,i} \overset{iid}{\sim} N(0, 10)$ for $i = 1, \ldots, N$

2. log(Number of Policies + 1): $\beta_1 \sim N(0, 2)$

3. Coast: $\beta_2 \sim N(0, 10)$

4. Minimum Elevation: $\beta_3 \sim N(0, 10)$

B.3 Modeling Indemnity Payments

1. Spatial Intercepts (CAR model): In OpenBUGS, we only set the prior for the precision,
tau, the inverse of the variance. For this we use Gamma(.1, .1).

2. log(Number of Claims + 1): coefficient for county $i$ $\overset{iid}{\sim} N(0, 10000)$

3. Coast: $\beta_{1,i} \overset{iid}{\sim} N(0, 10000)$

4. Minimum Elevation: $\beta_{2,i} \overset{iid}{\sim} N(0, 10000)$

Appendix C

Basics of Copula Modeling (Chapter 3)

The literature exploring copula modeling has increased dramatically in the last 15 years. The
backbone of copula modeling stems from Sklar (1959), which states that any joint distribution
can be represented as a function, a copula, of its marginal distributions. The representation is
unique if the marginal distributions are continuous. From this theorem, one is able to build a
variety of joint distributions. The joint density of the d-dimensional distribution can be written
as

$$f(x) = c(F_1(x_1), \ldots, F_d(x_d)) \prod_{i=1}^{d} f_i(x_i), \tag{C.1}$$

where $F_i$ and $f_i$ are the marginal distribution and density of $x_i$. The density of a copula is
written as

$$c(u) = \frac{\partial^d C(u_1, \ldots, u_d)}{\partial u_1 \cdots \partial u_d}, \tag{C.2}$$

where $C(u_1, \ldots, u_d)$ denotes the copula function, i.e. the joint distribution of $(u_1, \ldots, u_d)$, and
$u_i \sim U[0, 1]$ for $i = 1, \ldots, d$.

Nelsen (2006) provides an excellent overview of copula modeling. The most popular copulas
include the elliptical and Archimedean families. Elliptical copulas are derived from elliptical
distributions, such as the Gaussian distribution and the Student's t distribution. The Gaussian
copula of dimension $d$ has the form

$$C_\Sigma(u) = \Phi_\Sigma(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_d)), \tag{C.3}$$

where $\Phi^{-1}$ is the inverse cumulative distribution of the standard normal distribution,
$u_i \in [0, 1]$ for $i = 1, \ldots, d$, and $\Phi_\Sigma$ denotes the joint cumulative distribution with the
correlation matrix $\Sigma$. The Gaussian copula has zero tail dependence.
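
As an illustration of (C.3), the sketch below draws from a bivariate Gaussian copula using only the Python standard library: correlated standard normals are generated, and each is mapped to [0, 1] by the standard normal cumulative distribution. The correlation value 0.9, the sample size, and the function name are assumptions made for the example.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def gaussian_copula_draw(rho, rng):
    """Return one pair (u1, u2) from a bivariate Gaussian copula with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    e = rng.gauss(0.0, 1.0)
    # Cholesky step: z2 has unit variance and correlation rho with z1.
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * e
    # The normal CDF turns each margin into a Uniform[0, 1] variable.
    return STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)


rng = random.Random(42)
draws = [gaussian_copula_draw(0.9, rng) for _ in range(20_000)]

# Map back with the inverse CDF to check that the dependence is preserved.
# Draws at exactly 0 or 1 are skipped because inv_cdf is undefined there.
z = [(STD_NORMAL.inv_cdf(u1), STD_NORMAL.inv_cdf(u2))
     for u1, u2 in draws if 0.0 < u1 < 1.0 and 0.0 < u2 < 1.0]
corr = sum(a * b for a, b in z) / len(z)   # close to rho, since both margins are N(0, 1)
```

Any continuous marginal distributions can then be imposed by applying their inverse cumulative distributions to `u1` and `u2`, which is the practical content of Sklar's theorem.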

Archimedean copulas are characterized by a single generator function and are of the form

$$C(u) = \varphi\left(\varphi^{-1}(u_1, \theta) + \ldots + \varphi^{-1}(u_d, \theta),\ \theta\right), \tag{C.4}$$

where $\varphi(\cdot)$ is the generator function and $\theta$ is the associated parameter. Table C.1 shows the
functional form and characteristics of several popular Archimedean copulas.

There are several measures of comovement often used when evaluating copula models. Aside
from the traditional measure of linear correlation, Pearson's correlation coefficient, Kendall's
$\tau$ and Spearman's $\rho$ are rank correlation coefficients that are commonly used in the copula
literature. As discussed by Genest et al. (2011), the inverse of Kendall's tau is sometimes used
to determine the parameter estimates in Archimedean copulas. Tail dependence measures the
comovement of two variables at the extreme regions of the distribution (Fischer and Klein,
2007). The measure of tail dependence is determined by the copula function, not by the marginal
distributions within the copula. There are separate definitions for the lower and upper tail
dependence coefficients. If we define a copula as $F_{X,Y}(x, y) = C(F_X(x), F_Y(y))$, then the lower
tail dependence coefficient is defined as

$$\lambda_L \equiv \lim_{u \to 0^+} P\left(Y \le F_Y^{-1}(u) \mid X \le F_X^{-1}(u)\right) = \lim_{u \to 0^+} \frac{C(u, u)}{u}, \tag{C.5}$$

while the upper tail dependence coefficient is defined as

$$\lambda_U \equiv \lim_{u \to 1^-} P\left(Y > F_Y^{-1}(u) \mid X > F_X^{-1}(u)\right) = \lim_{u \to 1^-} \frac{1 - 2u + C(u, u)}{1 - u} \in [0, 1]. \tag{C.6}$$
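
The limit in (C.5) can be checked numerically. The sketch below evaluates C(u, u)/u for a bivariate Clayton copula as u approaches zero and compares it with the known closed form 2^(-1/theta), which holds for theta > 0. The value theta = 3.5 is chosen only because it is close to the Clayton estimates in Table 4.8; the function names are illustrative.

```python
def clayton_cdf(u, v, theta):
    """Bivariate Clayton copula C(u, v) for theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def lower_tail_ratio(u, theta):
    """C(u, u) / u, whose limit as u -> 0+ is the lower tail dependence in (C.5)."""
    return clayton_cdf(u, u, theta) / u


theta = 3.5
closed_form = 2.0 ** (-1.0 / theta)   # known lower tail dependence of the Clayton copula

# The ratio approaches the closed form as u shrinks.
for u in (1e-1, 1e-2, 1e-4):
    print(u, lower_tail_ratio(u, theta), closed_form)
```

The same approach applies to (C.6) by evaluating (1 - 2u + C(u, u)) / (1 - u) for u close to one; for the Clayton copula that ratio tends to zero, since its dependence is concentrated in the lower tail.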

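Table C.1: Popular Archimedean Copulas

Name      Generator                                                       Inverse Generator                                                   Parameter                       Tail Dependence

Clayton   $(1 + \theta x)^{-1/\theta}$                                    $\frac{1}{\theta}(x^{-\theta} - 1)$                                 $\theta > -1,\ \theta \neq 0$   lower
Gumbel    $e^{-x^{1/\theta}}$                                             $(-\log(x))^{\theta}$                                               $\theta \geq 1$                 upper
Frank     $-\frac{1}{\theta}\log(1 - (1 - e^{-\theta})e^{-x})$            $-\log\left(\frac{e^{-\theta x} - 1}{e^{-\theta} - 1}\right)$       $\theta \neq 0$                 none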
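
The entries of Table C.1 can be checked directly: applying a generator to its inverse generator should return the original value, and a bivariate copula is obtained by composing the two as in (C.4). The sketch also shows the Kendall's tau inversion mentioned in the text, using the standard relations tau = theta / (theta + 2) for the Clayton copula and tau = 1 - 1/theta for the Gumbel copula. Function names and numerical values are illustrative assumptions.

```python
import math


def clayton_gen(x, theta):
    return (1.0 + theta * x) ** (-1.0 / theta)


def clayton_inv(u, theta):
    return (u ** -theta - 1.0) / theta


def gumbel_gen(x, theta):
    return math.exp(-(x ** (1.0 / theta)))


def gumbel_inv(u, theta):
    return (-math.log(u)) ** theta


def frank_gen(x, theta):
    return -math.log(1.0 - (1.0 - math.exp(-theta)) * math.exp(-x)) / theta


def frank_inv(u, theta):
    return -math.log((math.exp(-theta * u) - 1.0) / (math.exp(-theta) - 1.0))


def archimedean_cdf(u, v, gen, inv, theta):
    """Bivariate version of (C.4): generator applied to the sum of inverse generators."""
    return gen(inv(u, theta) + inv(v, theta), theta)


def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2)."""
    return 2.0 * tau / (1.0 - tau)


def gumbel_theta_from_tau(tau):
    """Invert tau = 1 - 1 / theta."""
    return 1.0 / (1.0 - tau)


c = archimedean_cdf(0.3, 0.7, clayton_gen, clayton_inv, 2.0)
theta_c = clayton_theta_from_tau(0.5)
theta_g = gumbel_theta_from_tau(0.5)
```

With a sample Kendall's tau of 0.5, both inversions give a parameter of 2, which illustrates that the same rank correlation corresponds to different parameter scales in different families.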

Appendix D

Algorithm for Impulse Responses (Chapter 3)

D.1 Obtaining E[Dt+k |t = dt + , Dt1 = dt1, . . .]

The two subsections below, "Initializing the Shock for an Impulse Response" and "Time Path for
Impulse Response After the Shock is Implemented", outline one single iteration. After
many iterations, such as 100,000 iterations, the mean is taken to obtain the conditional
expectation of $D_{t+k}$ given the shock and the history $D_{t-1} = d_{t-1}, \ldots$.

D.1.1 Initializing the Shock for an Impulse Response

1. Random draws are made from the posterior distributions of the three parameters of the
latent variable at time $t$.

2. The negative shock is introduced through the error term of the latent variable at time $t$.

3. Using the Fisher Z transformation, we obtain the copula parameter at time $t$.

4. Randomly draw a bivariate vector $v_t$ from the copula with the parameter obtained in step 3.
Then $z_t = \Phi^{-1}(v_t)$, where $\Phi^{-1}$ is the inverse of the standard normal cumulative distribution.

5. Using the conditional variances from the GARCH(1,1) models, obtain the unstandardized
residuals, $u_{i,t} = z_{i,t}\,\sigma_{i,t}$, where $\sigma_{i,t}$ is the conditional standard deviation for market $i$.

6. The unstandardized residuals are then used in the VAR(1) model to obtain the predicted
price differentials $p_{1,t}$ and $p_{2,t}$. For this first iteration, $p_{i,t-1} = 0$ for market $i = 1, 2$.

7. We obtain the difference between these two price differentials, $d_t$, such that
$d_t = p_{2,t} - p_{1,t}$. If $d_t > 0$, then the price differential is higher for Market 2 than for
Market 1; therefore, the shock has a relatively positive impact on Market 2, and the shock is
categorized as a shock on Market 2. If $d_t < 0$, then the price differential is higher for
Market 1 than for Market 2; therefore, the shock has a relatively positive impact on Market 1,
and the shock is categorized as a shock on Market 1.

D.1.2 Time Path for Impulse Response After the Shock is Implemented

Depending on the outcome of the initialized shock, the following will be the time path
following a shock from either Market 1 or Market 2.

1. Random draws are made from the posterior distributions of the three parameters of the
latent variable at time $t+1$.

2. Calculate the latent variable at time $t+1$ with the error term equal to zero.

3. Using the Fisher Z transformation, we obtain the copula parameter at time $t+1$.

4. Randomly draw a bivariate vector $v_{t+1}$ from the copula with the parameter obtained in
step 3. Then $z_{t+1} = \Phi^{-1}(v_{t+1})$, where $\Phi^{-1}$ is the inverse of the standard normal
cumulative distribution.

5. Using the conditional variances from the GARCH(1,1) models, obtain the unstandardized
residuals, $u_{i,t+1} = z_{i,t+1}\,\sigma_{i,t+1}$. Note that the GARCH(1,1) model uses $u_{i,t}$ and $\sigma_{i,t}$.

6. The unstandardized residuals are then used in the VAR(1) model to obtain the predicted
price differentials $p_{1,t+1}$ and $p_{2,t+1}$. Note that this iteration uses the price differentials
from time $t$.

7. We obtain the difference between these two price differentials, $d_{t+1}$.

8. Repeat these steps until the chosen end of the time path. For this paper, we repeat this
process until $d_{t+30}$ is reached.

D.2 Obtaining E[Dt+k |t = dt , Dt1 = dt1, . . .]

The same process as in Section D.1 is used here. The only difference is that a shock is not
initialized.
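
The steps of this appendix can be sketched in code. The following is a simplified illustration for the Gaussian copula case only, not the implementation used for the results. It assumes a latent AR(1) process with parameters named `a`, `b`, and `s`, an inverse Fisher Z transformation `tanh` to obtain the correlation, and a small list of stand-in posterior draws; every numerical value is a hypothetical placeholder rather than an estimate from this study. The categorization of the shock by the sign of $d_t$ is omitted, and using the same random seed for the shocked and unshocked runs is a design choice of the sketch.

```python
import math
import random

# All numerical values are hypothetical placeholders, not estimates from this study.
VAR_CONST = (0.002, 0.002)
VAR_COEF = ((-0.2, 0.15), (0.1, -0.15))                # row i: coefficients on p1, p2 at t-1
GARCH = ((5e-5, 0.09, 0.90), (5e-5, 0.10, 0.88))       # (constant, ARCH, GARCH) per market
POSTERIOR = [(0.30, 0.85, 0.05), (0.35, 0.80, 0.06)]   # stand-in posterior draws (a, b, s)


def simulate_path(rng, shock, horizon=30):
    """One iteration: the path d_t, ..., d_{t+horizon} of p2 - p1, with or without a shock."""
    a, b, s = rng.choice(POSTERIOR)             # step 1: draw latent-process parameters
    lam = a / (1.0 - b)                         # start the latent variable at its mean
    h = [w / (1.0 - al - be) for w, al, be in GARCH]    # unconditional variances
    p = [0.0, 0.0]                              # price differentials before time t
    path = []
    for k in range(horizon + 1):
        eps = -shock if k == 0 else 0.0         # step 2: the shock enters only at time t
        lam = a + b * lam + s * eps
        rho = math.tanh(lam)                    # step 3: inverse Fisher Z transformation
        # Step 4, Gaussian case: drawing correlated normals is equivalent to drawing
        # from the copula and applying the inverse standard normal CDF.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        u = [z1 * math.sqrt(h[0]), z2 * math.sqrt(h[1])]   # step 5: unstandardize
        # Step 6: VAR(1) prediction of both price differentials.
        p = [VAR_CONST[i] + VAR_COEF[i][0] * p[0] + VAR_COEF[i][1] * p[1] + u[i]
             for i in range(2)]
        path.append(p[1] - p[0])                # step 7: d = p2 - p1
        # GARCH(1,1) update of the conditional variances for the next day.
        h = [GARCH[i][0] + GARCH[i][1] * u[i] ** 2 + GARCH[i][2] * h[i]
             for i in range(2)]
    return path


def mean_path(shock, iterations=2000, seed=0, horizon=30):
    """Average the simulated paths over many iterations."""
    rng = random.Random(seed)
    total = [0.0] * (horizon + 1)
    for _ in range(iterations):
        for k, d in enumerate(simulate_path(rng, shock, horizon)):
            total[k] += d
    return [t / iterations for t in total]


# Same seed for both runs, so the two averages differ only through the shock.
with_shock = mean_path(shock=1.0)
baseline = mean_path(shock=0.0)
impulse_response = [x - y for x, y in zip(with_shock, baseline)]
```

The impulse response is the difference between the two averaged paths, one computed with the shock as in Section D.1 and one without it as in Section D.2.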
