HUNGERFORD, ASHLEY ELAINE. Three Essays in Spatial Economics. (Under the direction
of Barry Goodwin and Sujit Ghosh.)
The first essay examines insurance claims filed with the National Flood Insurance Program.
The National Flood Insurance Program (NFIP), a government-run insurance program, has
become the largest source of flood insurance in the United States. However, the sustainability of
the program has been called into question. The combination of flood insurance rate maps riddled
with errors and subsidies to high-risk zones has left the NFIP insolvent. This paper examines
alternative methods of premium rating in an attempt to move the NFIP towards solvency.
We use single hurdle models to estimate the count of flood insurance claims within the state
of Florida. We also model the average indemnity payments for each county. By combining the
estimates from the single hurdle models with the estimates for the average indemnity payments,
we examine the loss-cost ratios for these counties as well as the expected number of claims. From
these results, we can determine the largest problem areas within the state and provide estimates
The second essay continues to investigate risk management issues. In 2011 and 2012 severe
droughts caused extensive damage to crops throughout the Midwest. These conditions, combined
with concerns about climate change, have led to a growing focus on risk management in agriculture.
The increasing emphasis on risk management is reflected in the 2014 Farm Bill, which replaces
direct payments with shallow loss programs. For this paper we turn our attention to winter
wheat production in Kansas and explore the ratings of the crop insurance policies as well as
predicted payouts from the new Agricultural Risk Coverage program established under the 2014
Farm Bill. Using spatial models we simulate yields of non-irrigated winter wheat and irrigated
winter wheat to estimate crop insurance premium rates as well as payouts from the Agricultural
Risk Coverage program.
The final essay tests the Law of One Price in North Carolina grain markets. The Law of One
Price states that the difference in prices between two markets should equal the difference in the
transaction costs. If prices continuously operate outside of equilibrium in regionally-linked markets,
this indicates disintegration of the markets. This disintegration could represent asymmetric
information in the markets and affect market players' decisions. Much of the previous literature
testing the Law of One Price used regime switching regression. This paper utilizes methodology
previously not considered in this literature, which is the implementation of Stochastic Copula
Autoregressive (SCAR) models. By using SCAR models in place of regime switching regression
to develop impulse responses, changes from equilibrium to disequilibrium and vice versa are
continuous.
Copyright 2014 by Ashley Elaine Hungerford
by
Ashley Elaine Hungerford
Economics
2014
APPROVED BY:
UMI 3584313
Published by ProQuest LLC (2014). Copyright in the Dissertation held by the Author.
DEDICATION
To Will
BIOGRAPHY
Ashley Mabee was born in Santa Maria, California and moved with her family to Bakersfield,
California at age 8. She attended California State University, Bakersfield, where she earned
a Bachelor of Science degree in Mathematics. At university Ashley met her future husband
William Hungerford. Although Bakersfield is renowned for its poor air quality and attempted
book bans, Ashley and William left Bakersfield, so Ashley could pursue a PhD in Economics
at North Carolina State University. Upon completion of her PhD she will begin employment at
the Economic Research Service of the United States Department of Agriculture. There she will be
ACKNOWLEDGEMENTS
During my dissertation work, I have been incredibly fortunate to have, for lack of a better term,
a Super Star committee. Dr. Nick Piggott, your mentoring and guidance through the job market
has been indispensable. I deeply appreciate all of your help with my job market presentation
as well as introducing me to my future supervisor, Joe Cooper. Dr. Sujit Ghosh, I am forever
grateful for your patience and insights on methodology. Your input has been vital not only to
the development of these essays, but also to how I approach methodology for economic questions.
Dr. Denis Pelletier, your asset pricing course set the foundation for my dissertation. Also thank
you for all of the hours of help outside of the classroom. And of course, Dr. Barry Goodwin for
being kind enough to open your office door for me two years ago. Your mentoring has greatly
helped me blend together non-trivial methodology with economic intuition. Thank you all very,
very much.
TABLE OF CONTENTS
LIST OF FIGURES
Chapter 1 Introduction
4.1.2 Variance Modeling
4.1.3 Stochastic Copula Autoregressive Model
4.1.4 Comparison to Single Parameter Copulas
4.1.5 Non-Linear Impulse Responses
4.2 Data
4.3 Results
4.4 Discussion
4.5 Conclusion
4.6 Tables and Figures
REFERENCES
APPENDICES
Appendix A Conditional Autoregressive Model (Chapters 1 & 2)
Appendix B Prior Distributions for Models (Chapter 1)
B.1 Logit Link
B.2 Log Link
B.3 Modeling Indemnity Payments
Appendix C Basics of Copula Modeling (Chapter 3)
Appendix D Algorithm for Impulse Responses (Chapter 3)
D.1 Obtaining E[Dt+k |t = dt + , Dt1 = dt1 , . . .]
D.1.1 Initializing the Shock for an Impulse Response
D.1.2 Time Path for Impulse Response After the Shock is Implemented
D.2 Obtaining E[Dt+k |t = dt , Dt1 = dt1 , . . .]
LIST OF TABLES
Table 2.1 High risk areas have at least a 1% annual probability of flooding. These areas
are referred to as 100-year floodplains. Zones labeled A are for inland
areas, while zones labeled V are reserved for coastal areas. Moderate risk
areas are referred to as 500-year floodplains.
Table 2.2 Functional Forms of the Considered Probability Mass Functions. The function
$\Gamma(\cdot)$ is defined as $\Gamma(n) = (n-1)!$ and the function $\zeta(\cdot)$ is defined as
$\zeta(s+1) = \sum_{k=1}^{\infty} k^{-(s+1)}$
Table 2.3 DIC for Logit Links
Table 2.4 DIC for the Log Links
Table 2.5 The Chi-Square Statistics for Observations of County-Year Data
Table 2.6 The Chi-Square Statistics for data averaged over the years for each county.
Table 2.7 This table shows the expected losses. The first column is an alphabetical
list of all the counties in Florida. The second column is the annual average
for historical claims. The third is an approximation of the annual number of
claims the NFIP expects. The last three columns show the results of
the simulations for each type of hurdle model. For the SHP and SHNB, the
simulations for the models with the lowest DIC are shown.
Table 2.8 DIC for the Indemnity Payment Models
Table 2.9 Loss cost ratios
Table 2.10 Levee Rating System implemented by the United States Army Corps of
Engineers (USACE)
Table 4.4 VAR(1) and GARCH(1,1) estimates and standard errors for the wheat markets.
$p_{WG,t}$ and $p_{WS,t}$ are the price differences for the wheat markets in
Greenville and Statesville, respectively, on day $t$. An estimate with asterisks
*, **, or *** indicates statistical significance at the $\alpha = .10$, $\alpha = .05$, and
$\alpha = .01$ levels, respectively.
Table 4.5 VAR(1) and GARCH(1,1) estimates and standard errors for the corn markets.
$p_{CB,t}$ and $p_{CL,t}$ are the price differences for the corn markets in
Barber and Laurinburg, respectively, on day $t$. An estimate with asterisks
*, **, or *** indicates statistical significance at the $\alpha = .10$, $\alpha = .05$, and
$\alpha = .01$ levels, respectively.
Table 4.6 VAR(1) and GARCH(1,1) estimates and standard errors for the soybean
markets. $p_{SC,t}$ and $p_{SF,t}$ are the price differences for the soybean markets
in Cofield and Fayetteville, respectively, on day $t$. An estimate with asterisks
*, **, or *** indicates statistical significance at the $\alpha = .10$, $\alpha = .05$, and
$\alpha = .01$ levels, respectively.
LIST OF FIGURES
Figure 3.1 Figures for the entire state of Kansas including the average yield and number
of acres planted.
Figure 3.2 Wheat price per bushel (adjusted to 2013 prices)
Figure 3.3 Average yield (bushels per acre). Sample period: 1970-2013
Figure 3.4 Prior and posterior distributions of the dryland and irrigated wheat logit
link functions. Note there is no 0 posterior distribution for irrigated wheat
because the intercepts for the best-fit irrigated wheat logit link function are
spatially-varying.
Figure 3.5 Posterior percentiles for the spatial intercepts of the irrigated wheat logit
link.
Figure 3.6 Prior and posterior distributions for the parameter of the dryland wheat
and irrigated wheat
Figure 3.7 Posterior percentiles for the spatial intercepts of the dryland wheat
truncated normal regression
Figure 3.8 Posterior percentiles for the secondary spatial covariate of the dryland wheat
truncated normal regression
Figure 3.9 Posterior percentiles for the spatial intercepts of the irrigated wheat
truncated normal regression
Figure 3.10 Posterior percentiles for the secondary spatial covariate of the irrigated
wheat truncated normal regression
Figure 3.11 Percentiles for the simulated dryland yields
Figure 3.12 Percentiles for the simulated irrigated yields
Figure 3.13 Probability for the three coverage levels of dryland wheat of the best fitting
model.
Figure 3.14 Probability for the three coverage levels of dryland wheat of the model with
independent counties.
Figure 3.15 Probability for the three coverage levels of irrigated wheat of the best fitting
model.
Figure 3.16 Probability for the three coverage levels of irrigated wheat of the model
with independent counties.
Figure 3.17 Premium rates for the three coverage levels of dryland wheat of the best
fitting model.
Figure 3.18 Premium rates for the three coverage levels of dryland wheat of the model
with independent counties.
Figure 3.19 Premium rates for the three coverage levels of irrigated wheat of the best
fitting model.
Figure 3.20 Premium rates for the three coverage levels of irrigated wheat of the model
with independent counties.
Figure 3.21 Distribution of Olympic average of prices
Figure 3.22 County probability from the ARC program
Figure 3.23 Percentiles of the Olympic averages for dryland wheat.
Figure 3.24 Percentiles of the Olympic averages for irrigated wheat.
Figure 3.25 Median of the expected payout for the ARC program.
Figure 4.1 Acres planted for the three crops of interest
Figure 4.2 Daily Market Prices
Figure 4.3 Wheat: Gaussian Copula Parameters
Figure 4.4 Wheat: Double Clayton Copula Parameters
Figure 4.5 Wheat: Double Gumbel Copula Parameters
Figure 4.6 Corn: Gaussian Copula Parameters
Figure 4.7 Corn: Double Clayton Copula Parameters
Figure 4.8 Corn: Double Gumbel Copula Parameters
Figure 4.9 Soybean: Gaussian Copula Parameters
Figure 4.10 Soybean: Double Clayton Copula Parameters
Figure 4.11 Soybean: Double Gumbel Copula Parameters
Figure 4.12 Kendall's Tau
Figure 4.13 Wheat: Impulse Response Functions
Figure 4.14 Corn: Impulse Response Functions
Figure 4.15 Soybean: Impulse Response Functions
Chapter 1
Introduction
This dissertation contains three essays that center around spatial relationships. Chapter 2 and
Chapter 3 both examine how insurance is affected by the presence of systemic risk. Systemic
risk is the probability of losses occurring simultaneously and dependently. If systemic risk is
not properly accounted for in insurance ratings, rates may be inefficient and lead to inaccurate
probabilities. The third essay, Chapter 4, examines spatial integration of grain markets in
North Carolina. If markets are spatially integrated, then the Law of One Price holds. The Law
of One Price states that the price of a homogeneous good in different locations is the same
when transaction costs are excluded. If the Law of One Price does not hold, this may reflect
The first essay examines the National Flood Insurance Program (NFIP). By 2013 the NFIP
had accumulated approximately $20 billion in debt, including the losses from Hurricane Katrina
and Hurricane Sandy, but the NFIP collects at most $3.8 billion in premiums per year. Unlike
automobile insurance claims or homeowner insurance claims, the correlation among flood in-
surance claims is very high. Also some models used for rating flood insurance policies, Flood
Insurance Rate Maps, are notoriously erroneous. Therefore, we propose a new method for mod-
eling flood insurance policies that incorporates the spatial autocorrelation among losses as well
For our estimation of losses from floods, we use a two-part model. The first part of the model
estimates the annual count of flood insurance claims in a particular county, and the second
part of the model estimates the average indemnity payment in that county. There are three
variations of the model, in which the count part of the model uses different distributions: the
Poisson distribution, the Negative Binomial distribution, and the Riemann-Zeta distribution.
Both parts of the model account for spatial autocorrelation. This analysis is conducted using
data from the 67 counties of Florida during the sample period 1978-2011. The state of Florida
is chosen because approximately 40% of flood insurance policies held are for properties located
in Florida. After estimating our models, we compare the loss cost ratios derived from the three
variations of our model to the observed loss cost ratios of the NFIP.
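As an illustration of how the two parts of the model combine, the sketch below multiplies an expected annual claim count by an average indemnity payment and divides by the coverage in force. It assumes that the loss cost ratio is expected indemnities over total coverage and that claim counts and payment sizes are independent. The function names and the county figures are hypothetical and are not taken from the NFIP data.

```python
def expected_indemnities(expected_claims: float, avg_indemnity: float) -> float:
    """Expected annual indemnities for one county.

    Assumes claim counts and payment sizes are independent, so the
    expected total is the product of the two expectations.
    """
    return expected_claims * avg_indemnity


def loss_cost_ratio(expected_claims: float, avg_indemnity: float,
                    total_coverage: float) -> float:
    """Expected indemnities as a share of total coverage (assumed definition)."""
    if total_coverage <= 0:
        raise ValueError("total_coverage must be positive")
    return expected_indemnities(expected_claims, avg_indemnity) / total_coverage


# Hypothetical county: 120 expected claims per year, $25,000 average payment,
# $1.5 billion of coverage in force.
lcr = loss_cost_ratio(120.0, 25_000.0, 1_500_000_000.0)
```

In this made-up case the ratio is 0.002, meaning expected indemnities of $2 per $1,000 of coverage. A comparison across the three count distributions would change only the `expected_claims` input.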
There has been increasing interest in agricultural risk management, which is reflected in the
2014 Farm Bill; therefore, for the second essay, we turn our attention to crop insurance and
the Agricultural Risk Coverage (ARC) program. In this essay, we model yields and prices to
determine crop insurance premium ratings and expected payouts from the ARC program. Like
flood insurance, crop insurance is susceptible to systemic risk. Natural disasters, such as the
drought that struck the Midwest in 2012, cause lower yields for producers over an expansive area.
Hence spatial autocorrelation is accounted for in our models of yields. We also expect the spatial
autocorrelation to change depending on the quantity of the yield since spatial autocorrelation of
yields is expected to be higher during a natural disaster compared to a good year. Therefore,
For the model estimation of the second essay, we examine winter wheat in Kansas. Kansas
is one of the top producers of winter wheat in the United States. The yield data consist of
annual observations from the 105 counties of Kansas over the sample period 1970-2013. The
observations are separated by irrigation practices, i.e. dryland (non-irrigated) winter wheat and
irrigated winter wheat. Although the majority of winter wheat is grown without irrigation,
dryland practices may cause weather to exert a great influence on yields. After estimating the
models for dryland wheat and irrigated wheat, we use Monte Carlo integration to determine
crop insurance premium rates and the expected payouts for the Agricultural Risk Coverage
program.
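The Monte Carlo step can be sketched as follows. The example assumes that the premium rate is the expected yield shortfall below the guarantee divided by the guarantee, and it draws yields from a normal distribution truncated at zero as a stand-in for simulations from the fitted spatial models. The mean, standard deviation, and coverage levels are hypothetical.

```python
import random


def simulate_yields(n_draws: int, mean: float, sd: float, seed: int = 0) -> list:
    """Draw hypothetical county yields (bushels per acre), truncated at zero."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n_draws:
        y = rng.gauss(mean, sd)
        if y >= 0.0:  # reject negative yields
            draws.append(y)
    return draws


def premium_rate(yields: list, expected_yield: float, coverage_level: float) -> float:
    """Monte Carlo estimate of expected shortfall as a share of the guarantee."""
    guarantee = coverage_level * expected_yield
    shortfalls = [max(guarantee - y, 0.0) for y in yields]
    return (sum(shortfalls) / len(shortfalls)) / guarantee


# Hypothetical inputs: mean yield of 40 bushels per acre, standard deviation of 12.
yields = simulate_yields(20_000, mean=40.0, sd=12.0)
expected_yield = sum(yields) / len(yields)
rates = {level: premium_rate(yields, expected_yield, level)
         for level in (0.70, 0.80, 0.90)}
```

Because yields are non-negative, the estimated rate rises with the coverage level. Replacing `simulate_yields` with draws from a spatial model would leave `premium_rate` unchanged.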
Chapter 4, the final essay, changes the focus from risk management to spatial integration.
As mentioned earlier, if markets are not integrated, this could indicate asymmetric informa-
tion among markets. Asymmetric information can negatively impact decisions made by market
players. The seminal work on market integration kept transaction costs fixed or assumed trans-
action costs so high that markets would be isolated. More recent work used regime switching
regression models along with impulse response functions that better incorporated transaction
costs. Our analysis utilizes non-linear impulse response functions; however, we opt for copula
The model estimation in Chapter 4 utilizes weekly price observations from six grain markets
in North Carolina. The weekly price series cover a sample period from January 7, 2000 to
November 3, 2011 for the two corn markets and two soybean markets, and the sample period
for the two wheat markets is from October 7, 2005 to May 27, 2010. The estimation is pairwise,
such that spatial integration is tested only between markets for the same crop. After implementing
an ARMA-GARCH model on log returns of the prices for each market, we use copula models
on the standardized residuals. Therefore, the residuals of the ARMA-GARCH estimations for
series of log returns can be correlated, and that correlation is described by a time-varying copula
model referred to as a Stochastic Copula Autoregressive (SCAR) model. By using SCAR models
in place of regime switching regression to develop impulse responses, changes from equilibrium
to disequilibrium and vice versa are continuous.
Chapter 2
2.1 Introduction
Since 1968 the National Flood Insurance Program has provided flood insurance to residential
and non-residential property owners (Michel-Kerjan, 2010). However, the sustainability of the
program has been called into question. This paper examines several issues currently facing
the NFIP and focuses on examining catastrophic risk in the program. Out-of-date flood maps,
high administrative cost, the threat of catastrophes, and a myriad of other problems have
created an insolvent program. In 2013 the Government Accountability Office (GAO) determined
the National Flood Insurance Program's debt to the Treasury was approximately $20 billion
(GAO, 2013). In the last twenty years, the United States has experienced catastrophic flooding
several times, including the Midwest Floods of 1993, Hurricane Katrina in 2005, and most
recently Hurricane Sandy in 2012. The NFIP has been able to pay down the debt to $17 billion
(Kousky and Michel-Kerjan, 2012). This debt is quite substantial considering the NFIP brought
in slightly over $3.8 billion in premiums in 2013. Under the current structure of the NFIP,
paying off this amount of debt is impossible. Most of this debt stems from indemnity payments
after Hurricane Katrina in Louisiana, which amounted to $13.2 billion. The debt accrued from
Hurricane Sandy has not been completely realized; however, the federal government did increase
the debt ceiling of the NFIP from approximately $20.7 billion to $30.4 billion (GAO, 2013). The
current model of the NFIP has grave flaws which allow for these large debts to accumulate. One
of the major flaws is the current policy rating system. Therefore, we examine an alternative
Skees and Barnett (1999) outlined several ideal conditions for insurance, including 1) information
between the insurer and insured should be symmetric, 2) claims should be independent
and random, and 3) losses should be easily measurable. Flood insurance claims are spatially
correlated due to the systemic nature of flooding. Hence, flood insurance suffers from a lack of
independence among claims and systemic risk. Systemic risk is the probability of a large num-
ber of losses occurring simultaneously and dependently. Therefore, the second condition listed
above is violated. In other insurance markets, such as automobile and fire insurance, loss events
are almost completely independent from each other. On the other end of the spectrum, deriva-
tives, such as futures contracts and options for a given commodity, are highly correlated. Flood
The analysis of flood insurance claims and indemnity payments is surprisingly sparse. How-
ever, after Hurricane Katrina and more recently Hurricane Sandy, the literature for the supply-
side of flood insurance has grown. A series of papers by Kousky and Michel-Kerjan (2009a and
2009b) explore the tail dependence between flood insurance indemnity payments and
homeowners' insurance indemnity payments from wind damage. Using copula modeling, the authors
identify dependence between large volumes of flood insurance indemnity payments and large
volumes of wind damage indemnity payments. Similar results were found by Diers et al. (2012)
and Pfiefer et al. (2012) for flood insurance indemnity payments and wind insurance indemnity
payments in Germany. However, Kousky and Michel-Kerjan (2009a) discarded the observations
equal to zero, while Diers et al. (2012) and Pfiefer et al. (2012) aggregated the data, so the
data contained no observations equal to zero. Discarding zeros makes the model conditional on
the occurrence of positive observations. Although a conditional model does provide insight, the
conditional model cannot be used to rate insurance policies. Also, the aggregation performed
by Pfiefer et al. (2012) is across all of Germany, which would only allow for the calculation of
the premium rate for the entire country. Using a model that omits zero values may overstate
risk.
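A small numerical sketch illustrates why a model conditional on positive observations can overstate risk. The claim counts below are hypothetical; the point is only that the mean of the positive observations exceeds the mean of all observations whenever some observations are zero.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


# Hypothetical annual claim counts for one county over ten years.
counts = [0, 0, 3, 0, 12, 0, 1, 0, 0, 4]

positive = [c for c in counts if c > 0]
share_positive = len(positive) / len(counts)  # share of years with any claim
conditional_mean = mean(positive)             # mean given at least one claim
unconditional_mean = mean(counts)             # mean over all years

# The unconditional mean is the conditional mean scaled by the share of
# positive years, so dropping the zeros inflates the expected count.
rescaled = share_positive * conditional_mean
```

Here the conditional mean is 5 claims while the unconditional mean is 2, so a rate based only on years with claims would be two and a half times too high for this made-up county.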
This paper's analysis utilizes data on flood insurance claims from Florida and includes all
of the observations equal to zero. To handle the large number of zeroes in the data, single
hurdle models are utilized. Single hurdle models are two-part models, which have a binary
element, modeling whether the observation is zero or not, and a positive count element, modeling
the positive observations. While the binary element is modeled using logit regression, the
positive counts can be modeled by traditional count data models. In this paper we use the Poisson
distribution, the Negative Binomial distribution, and the Riemann Zeta distribution. Although
the data set obtained from the Federal Emergency Management Agency (FEMA) is monthly from
January 1978 to June 2012, flood insurance premiums are calculated on an annual basis. There-
fore, the sample period for the count observations is aggregated annually over the time period
1978 to 2011. Since the counties of Florida vary in susceptibility to flooding, we allow the
model to vary the risk and severity of flooding from county to county. The three single hurdle
models are used to estimate the annual expected number of claims for each county in Florida. The
estimates from these three models are compared to an approximation of the annual expected
number of claims provided by FEMA. The expected number of flood insurance claims can be
In this paper, we also model the average indemnity payments for each county. The model
for average indemnity payments has spatially-varying coefficients for the covariates as well as
spatially-varying intercepts. This model may be used to help establish appropriate levels of
coverage. When the average indemnity payment model is used in conjunction with the single
hurdle models, loss cost ratios can be established for each county. These loss cost ratios help
2.2 National Flood Insurance Program
Before the National Flood Insurance Program (NFIP), flood victims typically had two sources of
relief, charities and government disaster relief. Most floods do not warrant government disaster
relief, and charities do not offer a stable means of assistance (Federal Emergency Management
Agency, 2002). To move the burden of flood disaster relief from taxpayers to floodplain residents
and provide a steady means of assistance to flood victims, in the 1950s, Congress attempted
to encourage private insurers to provide flood insurance with the Federal Flood Insurance Act
of 1956. However, the private sector was not interested in providing flood insurance. After the
devastation of Hurricane Betsy in 1965 along the coast of Louisiana and southeastern Florida,
Congress stepped forward and passed the National Flood Insurance Act of 1968 (Michel-Kerjan,
2010). From this act, the National Flood Insurance Program (NFIP) was established to provide
The NFIP has grown substantially over the last six decades and undergone major reform.
After Tropical Storm Agnes in 1972, Congress passed the Flood Disaster Protection Act of 1973,
which required the designation of flood prone areas called Special Flood Hazard Areas (SFHA)
and the creation of Flood Insurance Rate Maps (FIRMs). Individuals with federally-backed
mortgages for properties located on floodplains are now required to purchase flood insurance.
In 1983 the Write Your Own (WYO) program was developed. This program allowed private
insurance companies to become intermediaries for the NFIP and sell flood insurance without
bearing any risk. Private insurance companies participating in the WYO program receive ap-
proximately 30% of the flood insurance premiums. The vast majority of NFIP policies are now
purchased through the WYO program. After the devastation of the Midwest Floods of 1993,
the National Flood Insurance Reform Act of 1994 established initiatives to increase market penetration
and established the Community Rating System (CRS). Communities that enlist in CRS
and take preventative measures against floods are able to receive reduced premiums. According
For most of the program's history, the National Flood Insurance Program has offered direct
subsidies to approximately 20% of flood insurance policyholders. If flood insurance rate maps
were updated and resulted in higher premiums, policyholders with continuous coverage were
allowed to grandfather their previous insurance rates. The Biggert-Waters Flood Insurance
Reform Act of 2012 eliminated subsidies that allowed the grandfathering of rates. These subsidies
were expected to be completely phased out by 2014. However, in March 2014 the Homeowner
Flood Insurance Affordability Act was passed, which delayed the elimination of subsidies.
The majority of legislation affecting the National Flood Insurance Program has been enacted
to increase market penetration of flood insurance. By increasing market penetration the federal
government lessens its obligation to supply disaster assistance after major floods. However,
with the focus on increasing market penetration through subsidies, the sustainability of the
program has been ignored, except for the short-lived reform of the Biggert-Waters Flood
Insurance Reform Act of 2012. Therefore, in order for the NFIP to continue, FEMA must evaluate
Many property owners are unsure what is and is not covered by flood insurance and how pre-
miums are determined. Flood insurance offered by the NFIP is available to both residential and
non-residential property owners. Currently, the maximum coverage for a residential policyholder
is $250,000 for building damage and $100,000 for contents, while the maximum coverage for a
non-residential policyholder is $1,000,000 for building damage and $500,000 for contents. The
minimum deductible is $1,000 for coverage under $100,000 and $1,250 for coverage
greater than $100,000. There are separate deductibles for building and contents.
2. Unusual and rapid accumulation or runoff of surface waters from any source,
3. Mudflow, or
4. Collapse or subsidence of land along the shore of a lake or similar body of
water as a result of erosion or undermining caused by waves or currents of
water exceeding anticipated cyclical levels that result in a flood as defined
above.
Damage caused by wind during a flood is not covered by the NFIP. Likewise, flood damage is not
covered by the vast majority of homeowners' insurance policies. Property owners who live within a
100-year floodplain (Special Flood Hazard Area), an area with a 1% or greater annual probability of
being flooded, are required to purchase flood insurance if their mortgages are federally insured.
Many lenders also require the purchase of flood insurance if a home is located in a 100-year
floodplain. Table 2.1 shows the Flood Insurance Rate Map (FIRM) designations. Areas designated as
a Special Flood Hazard Area (SFHA) also have a base flood elevation. This elevation is the
expected elevation of flood waters during a flood with a 1% annual probability. The property's FIRM
designation, base flood elevation, and characteristics determine the policyholder's flood
insurance premium.
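To put the 1% annual probability in perspective, the following sketch computes the chance of at least one flood over a multi-year horizon. It assumes that floods are independent across years with a constant annual probability; the 30-year horizon is a hypothetical choice for illustration.

```python
def prob_at_least_one_flood(annual_prob: float, years: int) -> float:
    """Probability of one or more floods in `years` years.

    Assumes independent years with the same annual probability.
    """
    if not 0.0 <= annual_prob <= 1.0:
        raise ValueError("annual_prob must lie in [0, 1]")
    return 1.0 - (1.0 - annual_prob) ** years


# Property at the edge of a 100-year floodplain, hypothetical 30-year horizon.
p30 = prob_at_least_one_flood(0.01, 30)
```

Under these assumptions the probability is roughly 26%, which is far from negligible for a property held over several decades.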
Interestingly, there is a discrepancy between the definition of a flood and how flood insurance
policies are rated. By definition, flood insurance policies cover damage in the event of a levee
breach. However, the condition of levees is not fully accounted for in policy ratings. This
disparity was a contributing factor in the destruction caused by Hurricane Katrina. This issue
is yet another reason to consider alternative methods of insurance rating. The Methods section
below provides alternative models for rating flood insurance policies. Further information on
levee ratings and flood insurance policy ratings is provided in the Discussion.
2.3 Methods
This paper utilizes Bayesian hierarchical models. Banerjee et al. (2004) provide an excellent
overview of spatial Bayesian hierarchical models. In particular, we use single hurdle modeling,
which accounts for over-dispersion in the data that may appear due to excess zeros in the count
data. If we do not account for the over-dispersion, then the estimated standard errors may
be too small, resulting in understated Type I error rates within a hypothesis testing
scenario. Most counties in Florida exhibit a significant proportion of years with zero flood
insurance claims, with the average being 40% of observations for a county. For our modeling,
we utilize the single hurdle modifications of the Poisson, Negative Binomial, and Riemann Zeta
distributions. The Poisson and Negative Binomial distributions are the workhorses of discrete
modeling, while the Riemann Zeta distribution is rarely used in the risk management literature.1
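The random variable of the annual count of flood insurance claims per county, denoted as
$C_{it}$ for County $i$ and Year $t$, can be written as
$$C_{it} = B_{it}(1 + U_{it}), \tag{2.1}$$
where $B_{it} \sim \text{Bernoulli}(P_{it})$, and $U_{it}$ is a discrete random variable with the probability mass function $f(u \mid \theta_{it})$
taking values $u = 0, 1, 2, \ldots$. The representation of $C_{it}$ leads to the following probability mass
function:
$$\Pr(C_{it} = c \mid P_{it}, \theta_{it}) = \begin{cases} 1 - P_{it} & \text{if } c = 0 \\ P_{it}\, f(c - 1 \mid \theta_{it}) & \text{if } c = 1, 2, \ldots \end{cases} \tag{2.2}$$
Any single hurdle model can be represented by suitably choosing the parameters $P_{it}$ and
$\theta_{it}$ and the functional form $f(\cdot \mid \theta_{it})$. The representation (2.2) can be used to recover the $B_{it}$ and $U_{it}$:
$$B_{it} = \begin{cases} 0 & \text{if } C_{it} = 0 \\ 1 & \text{if } C_{it} = 1, 2, \ldots \end{cases} \tag{2.3}$$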
1
Guzzetti et al. (2006) did model flood and landslide risk in Italy using the Riemann Zeta distribution;
however, their model was neither single hurdle nor spatially varying. Porter and White (2012) use a Single Hurdle
Riemann Zeta model for modeling terrorist attacks.
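and
$$U_{it} = \begin{cases} \text{undefined} & \text{if } B_{it} = 0 \\ C_{it} - 1 & \text{if } B_{it} = 1 \end{cases} \tag{2.4}$$
As mentioned earlier, $B_{it}$ follows a Bernoulli distribution. The probability parameter $P_{it}$ of this
distribution is modeled through the logit link
$$\text{logit}(P_{it}) = \gamma_{i,0} + \sum_{j=1}^{B} \gamma_{ij} z_{ijt}, \tag{2.5}$$
where $(z_{i1t}, \ldots, z_{iBt})$ denotes the vector of covariates for County $i$ and Year $t$.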
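As an illustration of the hurdle construction above, the following minimal sketch evaluates the probability mass function of a single hurdle Poisson model and checks two of its properties numerically. The function names and the numerical values of the linear predictor and the mean are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import math

def logistic(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def poisson_pmf(u, mu):
    """Poisson probability mass at u = 0, 1, 2, ... with mean mu."""
    return math.exp(-mu) * mu ** u / math.factorial(u)

def hurdle_pmf(c, p, mu):
    """Pr(C = c) for a single hurdle Poisson model.

    p  : probability of at least one claim (the hurdle), from the logit link
    mu : mean of U = C - 1 given that the hurdle is crossed
    """
    if c == 0:
        return 1.0 - p
    return p * poisson_pmf(c - 1, mu)

# Hypothetical values chosen only for illustration.
p = logistic(0.4)   # hurdle probability from an assumed linear predictor
mu = 3.0            # assumed mean of the shifted positive counts

probs = [hurdle_pmf(c, p, mu) for c in range(60)]
total = sum(probs)                                   # should be close to 1
expected = sum(c * q for c, q in enumerate(probs))   # close to p * (1 + mu)
```

The truncated sum over the first 60 counts is adequate here only because the assumed mean is small; a heavier-tailed choice of f would need a longer range.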
To help account for spatial variation among the counties, a Conditional Autoregressive
(CAR) model is used for the intercepts of the logit link, such that $(\gamma_{1,0}, \ldots, \gamma_{N,0}) \sim \text{CAR}(\rho, \tau^2)$.
The CAR is a popular model based on the notion of Markov Random Fields. The mean (or
intercept in our model) of one county is dependent or conditional on the means (intercepts)
of its neighboring counties, where neighbors are defined by distance or contiguity. In this model, we choose contiguity over distance since the counties
vary greatly in size and shape. A detailed description of the CAR model is provided in
Appendix A.
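To make the idea of conditioning on contiguous neighbors concrete, here is a minimal sketch of one common CAR specification, in which the conditional mean of a county's intercept is a scaled average of its neighbors' intercepts and the conditional variance shrinks with the number of neighbors. This particular form, the parameter names `rho` and `tau2`, and the four-county adjacency list are assumptions made for illustration; the specification actually used is the one given in Appendix A.

```python
def car_conditional(i, values, neighbors, rho, tau2):
    """Conditional mean and variance of values[i] given its contiguous neighbors.

    Assumed form: mean = rho * (average of neighbor values),
                  variance = tau2 / (number of neighbors).
    """
    nbrs = neighbors[i]
    n_i = len(nbrs)
    mean = rho * sum(values[j] for j in nbrs) / n_i
    var = tau2 / n_i
    return mean, var

# Hypothetical contiguity structure for four counties (0 to 3).
neighbors = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
# Hypothetical current values of the spatial intercepts.
values = {0: 0.2, 1: -0.1, 2: 0.5, 3: 0.3}

mean1, var1 = car_conditional(1, values, neighbors, rho=0.9, tau2=1.0)
```

Because contiguity rather than distance defines the neighbor sets, large and small counties are treated alike: only the list of bordering counties enters the calculation.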
Exchangeable prior distributions are used for the covariate coefficients, which are allowed
to vary with counties as random effects, such that $(\gamma_{i1}, \ldots, \gamma_{iB}) \overset{iid}{\sim} N(\mu_\gamma, \Sigma_\gamma)$ for $i = 1, \ldots, N$.
All of the prior distributions are chosen to be vague (i.e., with very large prior variance) but
not improper; that is, the densities of the prior distributions integrate to one. For a detailed list of
Three different discrete distributions are considered in modeling the unobserved random effects
$U_{it}$: the Poisson distribution, the Negative Binomial distribution, and the Riemann Zeta distribution.
Table 2.2 shows the functional forms for each of these distributions along with their means
and variances. Although the Poisson distribution is a commonly used discrete distribution, it
restricts the mean to equal the variance. Insurance applications pertaining
to natural disasters tend to have extremely large variances relative to the mean value. Therefore,
the Negative Binomial distribution is also proposed for our application. The additional
parameter, r, allows for greater flexibility in modeling the variance. Finally, when modeling
continuous distributions with long tails, the Generalized Pareto distribution is a popular
choice, as seen in Cooley et al. (2008). The Riemann Zeta distribution is considered the discrete
analog of the Generalized Pareto distribution. The Riemann Zeta distribution is capable of having
a very long right tail depending on the parameter specification, which makes it a probable
The covariates are linked through the mean, as seen in Table 2.2, of the unobserved random
component $U_{it}$. Specifically, letting $\mu_{it} = E(U_{it} \mid Z_{it})$, we use the following log link function for
the Poisson and Negative Binomial distributions:
$$\log(\mu_{it}) = \beta_{0,i} + \sum_{j=1}^{B} \beta_j z_{ijt}, \tag{2.6}$$
where $z_{it}^{T} = (z_{i1t}, \ldots, z_{iBt})$ denotes the vector of covariates available for County $i$ and Year $t$.
The intercepts have the prior $(\beta_{0,1}, \ldots, \beta_{0,N}) \sim \text{CAR}(\rho, \tau^2)$ and the coefficients for the covariates
have independent normal distributions for their priors. The parameter $r$ in the Negative
Binomial distribution is assigned a gamma prior distribution.
For the Riemann Zeta distribution we must solve for the parameter $\alpha_{it}$ non-linearly to
obtain the mean of the Riemann Zeta distribution, which is $\mu_{it} = \frac{\zeta(\alpha_{it})}{\zeta(\alpha_{it} + 1)} - 1$ for $\alpha_{it} > 1$, where
$\zeta(\alpha_{it}) = \sum_{n=1}^{\infty} \frac{1}{n^{\alpha_{it}}}$. We again use the log link function for the positive count model, such that
$$\log(\mu_{it}) = \beta_{0,i} + \sum_{j=1}^{B} \beta_j z_{ijt}. \tag{2.7}$$
The log link function for the Riemann Zeta distribution uses the same covariates and prior
distribution as the log links for the Poisson and Negative Binomial distributions.
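The non-linear solve described above can be sketched as follows. Given a target mean for the shifted counts, the code finds the Riemann Zeta parameter (called `alpha` here, a name assumed for illustration) whose implied mean, zeta(alpha) / zeta(alpha + 1) - 1, matches the target. The zeta approximation and the bisection routine are illustrative stand-ins and are not the implementation used in the paper.

```python
def zeta(s, n_terms=50):
    """Riemann zeta function for s > 1: direct sum plus an Euler-Maclaurin tail."""
    head = sum(k ** -s for k in range(1, n_terms))
    n = float(n_terms)
    tail = n ** (1.0 - s) / (s - 1.0) + 0.5 * n ** -s + s * n ** (-s - 1.0) / 12.0
    return head + tail

def rz_mean(alpha):
    """Mean of U on {0, 1, 2, ...} with pmf (u + 1)^-(alpha + 1) / zeta(alpha + 1)."""
    return zeta(alpha) / zeta(alpha + 1.0) - 1.0

def solve_alpha(mu, lo=1.0 + 1e-6, hi=60.0, tol=1e-10):
    """Bisection: rz_mean decreases in alpha, so find alpha with rz_mean(alpha) = mu."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if rz_mean(mid) > mu:
            lo = mid      # implied mean too large, so alpha must increase
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical target mean, for example exp(linear predictor) from the log link.
alpha = solve_alpha(0.5)
```

Bisection is used only because it is simple and the mean is monotone in the parameter; any bracketing root finder would serve the same purpose.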
If $U_{it} \sim \text{RZ}(\alpha_{it})$ and $E(U_{it}^{w}) < \infty$, then $\alpha_{it} > w$ when $w$ is a positive integer. Therefore, to
model a Riemann Zeta distribution with the variance restricted to be finite, we need to use the
following link function:
$$\log(\alpha_{it}) = \log(2) + \log\Big(1 + \exp\Big(\beta_{0,i} + \sum_{j=1}^{B} \beta_j z_{ijt}\Big)\Big). \tag{2.8}$$
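To see why this link keeps the variance finite, it may help to exponentiate both sides. In the short derivation below, $\alpha_{it}$ denotes the Riemann Zeta parameter and $\eta_{it}$ the linear predictor inside the exponential; both symbols are assumed here for illustration.

```latex
\begin{align*}
\log(\alpha_{it}) &= \log(2) + \log\bigl(1 + e^{\eta_{it}}\bigr) \\
\alpha_{it} &= 2\bigl(1 + e^{\eta_{it}}\bigr) \\
e^{\eta_{it}} > 0 \;&\Longrightarrow\; \alpha_{it} > 2
\end{align*}
```

Since a finite $w$-th moment requires the parameter to exceed $w$, a parameter strictly greater than 2 guarantees a finite second moment and therefore a finite variance, whatever values the covariates take.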
This link function will be used in future research. These single hurdle models are referred to as
the Single Hurdle Poisson (SHP) model, the Single Hurdle Negative Binomial (SHNB) model,
and the Single Hurdle Riemann Zeta (SHRZ) model.
The purpose of these single hurdle models is to determine the actuarially fair premium rates.
The actuarially fair premium rate is the expected value of the number of claims for each county,
$$E(C_{i,2012}) = \Pr(C_{i,2012} > 0)\, E(C_{i,2012} \mid C_{i,2012} > 0),$$
where $C_{i,2012}$ is the annual number of claims for county $i = 1, \ldots, N$ in the year 2012. The
equation above states that the expected number of claims in a county is equal to the probability
of at least one flood claim in that county during 2012 times the expected number of claims for
the year 2012 conditioned on at least one flood insurance claim being filed. The actuarially fair
premiums can be determined by multiplying both sides of the equation above by the coverage. For
For simulating the count of flood insurance claims, we use posterior predictive sampling.
Posterior predictive sampling differs from sampling in classical statistics. Posterior predictive
sampling is a two-part process. Since Bayesian methods treat parameters as random variables,
the first part of posterior predictive sampling is drawing parameters from the posterior distribution.
The samples of parameters drawn from the posterior distributions are then used in the
sampling distribution to simulate new data. This two-part process
allows us to take into account the uncertainty not only in the sampling distribution but also the
uncertainty in the parameters. After data are simulated from each type of single hurdle
model (the Poisson, Negative Binomial, and Riemann Zeta models), the expected number of
flood insurance claims for each county is estimated using Monte Carlo integration. Monte
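The two-part process can be sketched in a few lines. In the example below, each posterior draw of the hurdle probability and the positive-count mean generates one simulated claim count from a single hurdle Poisson model, and the simulated counts are averaged as a Monte Carlo estimate of the expected number of claims. The "posterior draws" are generated from arbitrary beta and gamma distributions as a hypothetical stand-in for MCMC output; none of the numbers come from the paper.

```python
import math
import random

def draw_poisson(mu, rng):
    """Draw a Poisson variate by multiplying uniforms (Knuth's method)."""
    limit, k, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_claim(p, mu, rng):
    """One draw of C = B * (1 + U) from a single hurdle Poisson model."""
    if rng.random() >= p:
        return 0                      # hurdle not crossed: no claims
    return 1 + draw_poisson(mu, rng)  # at least one claim

def posterior_predictive_mean(posterior_draws, rng):
    """Monte Carlo estimate: average one simulated count per posterior draw."""
    sims = [simulate_claim(p, mu, rng) for p, mu in posterior_draws]
    return sum(sims) / len(sims)

rng = random.Random(2012)
# Hypothetical stand-in for MCMC output: draws of (p, mu) for one county.
posterior_draws = [
    (rng.betavariate(60, 40), rng.gammavariate(50.0, 0.1)) for _ in range(20000)
]
estimate = posterior_predictive_mean(posterior_draws, rng)
# Reference value: average of p * (1 + mu) over the same parameter draws.
analytic = sum(p * (1 + mu) for p, mu in posterior_draws) / len(posterior_draws)
```

The simulated average and the reference value agree up to Monte Carlo error, which shrinks as the number of posterior draws grows.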
For modeling indemnity payments, we use the entire monthly data set from January 1978
to June 2012. The indemnity payments are normalized to 2012 dollars. Average indemnity
where $C_{iq}$ is the number of flood claims and $R_{iq}$ is the total of indemnity payments for county
equation:
$$M_{iq} = \delta_i \log(1 + C_{iq}) + \beta_{i,0} + \sum_{j=1}^{B} \beta_{ij} z_{ijq} + \epsilon_{iq}, \tag{2.11}$$
where $z_{ijq}$ are covariates for county $i = 1, \ldots, N$ and month $q = 1, \ldots, Q$. The county level
1. $(\delta_1, \ldots, \delta_N) \overset{iid}{\sim} N(\mu_\delta, \sigma_\delta^2)$
3. $(\beta_{i,1}, \ldots, \beta_{i,B}) \overset{iid}{\sim} N(\mu_\beta, \Sigma_\beta)$ for $i = 1, \ldots, N$.
coefficients vary with the counties since we expect coefficients to be affected by the property
values in each county, which in turn affects the average indemnity claims. For a detailed list of
After estimating the model for the average indemnity payments for each county, average
indemnity payments are simulated using posterior predictive sampling. Using these simulated
payments in conjunction with the simulations of the count of flood insurance claims, we are able
to establish loss cost ratios for each county. The parameters $(\mu_\delta, \sigma_\delta^2, \rho, \tau^2, \mu_\beta, \Sigma_\beta)$ are estimated
2.3.4 Implementation
To perform the analysis, the open-source software packages R and OpenBUGS are employed. All modeling
is written in R. Using the software package R2OpenBUGS, the Single Hurdle Poisson
model and the Single Hurdle Negative Binomial model are exported to OpenBUGS to perform
the Bayesian updating using the Adaptive Rejection Sampling algorithm. This algorithm is
discussed by O'Hagan et al. (2004). The Single Hurdle Riemann Zeta model is estimated using
the Metropolis-Hastings algorithm, which is discussed by Gilks (1996). This model is not
exported to OpenBUGS because the Riemann Zeta function (a portion of the Riemann Zeta
distribution) cannot be estimated in OpenBUGS. Therefore, the modeling of the Riemann Zeta
2.4 Data
Through two Freedom of Information Act (FOIA) requests, data concerning the National Flood
Insurance Program were obtained. Although the first data set obtained from the Federal Emergency
Management Agency (FEMA) contains monthly totals of indemnity payments and
counts of flood insurance claims for every county in the United States from January 1978 to
June 2012, the scope of our analysis is limited to Florida. Florida is chosen because approximately
40% of the flood insurance policies (over 2 million policies in 2012) in the United States
are within Florida and approximately 29% of households in Florida have a flood insurance policy.
The data are aggregated annually for the models of the count of flood insurance claims, but
we analyze the monthly data when modeling the average indemnity payments. Premiums are
determined on an annual basis, hence the use of annual aggregation for the count model of flood
insurance claims. From the second FOIA request, the data for policies in force, coverage, and
premiums for each county in Florida were obtained. The minimum elevation2 of each county,
whether the county is coastal3 , and the number of policies are used as covariates in the
link functions as well as in modeling the average indemnity payments. Note that for the Number of
Policies covariate, a log transformation is used for easier interpretation of the covariate and
numerical stability.
Figure 2.1 and Figure 2.2 show maps relating to flood insurance policies and losses, respectively.
The coastal counties, especially in southern Florida and the panhandle, have experienced
the highest losses and have the highest number of policies in force. From casual observation
there appears to be spatial correlation, which supports our use of spatially-varying intercepts
in the link functions. The highest losses are in counties with a higher cost of living, such as
Okaloosa County (located in the panhandle), home to Destin, and Miami-Dade County, home to Miami, on
2.5 Results
First we will focus on the logit link, the estimation of $P_{it}$, which has the same results independent
of the choice of distribution for modeling the positive number of claims. To determine the
best covariates for the logit link, we use the Deviance Information Criterion (DIC). DIC is the
Bayesian analog of the Akaike Information Criterion (AIC). Like AIC, DIC penalizes overfitting,
and a smaller value of the criterion indicates a better fitting model. According to Table 2.3, the
2
The minimum elevations are obtained from the United States Geological Survey.
3
This variable is referred to as Coast in the Results section.
inclusion of the covariates Coast and Minimum Elevation does not improve the estimation
of the probability of at least one flood claim occurring. Therefore, when simulating the count
of claims for each county, we use the simplest model, which contains only the log Number of Policies.
Figure 2.3 shows the prior and posterior distribution of the coefficient on the
log Number of Policies for the best fitting logit link based on the DIC. One can see that
the prior distribution is very flat compared to the posterior distribution. According to the median
of the posterior distribution, a one-unit increase in the log Number
of Policies covariate leads to approximately a .69% increase in the
odds of a claim being filed. Figure 2.4
shows the 2.5th , 50th , and 97.5th percentiles of the posterior distributions of the spatial intercepts.
The estimates of the spatial intercepts align with previous expectations that the counties most
prone to hurricane exposure have higher intercepts. The maps for the 50th and 97.5th percentiles
show that the coastal counties in the southern end of the peninsula, including Miami-Dade, Palm
Beach, and Monroe, have the highest spatial intercepts. Other hurricane-prone counties, such
To compare the positive count component of the single hurdle models, that is, the Poisson, Negative
Binomial, and Riemann Zeta distribution estimates, we use DIC as well as Chi-Squared statistics.
DIC is used to compare the distributions to each other as well as to decide which covariates to include. The
Chi-Squared statistics are implemented after selecting the best covariates to include as indicated
by DIC. The Chi-Squared statistics also include the logit portion of the single hurdle models,
but the same logit link is used for all three distributions, so the results of the Chi-Squared
statistics reflect only the choice of the positive count distributions. Table 2.4 presents the DIC
for the log links of the Poisson, Negative Binomial, and Riemann Zeta components of the single
hurdle models with different combinations of the covariates. To assess the predictive abilities
of these models, we use the omnibus Chi-Squared statistics described by Gelman et al. (2004).
Table 2.5 shows Chi-Squared statistics calculated using all 2278 county-year observations, such
that
$$\sum_{i=1}^{67} \sum_{t=1}^{34} \frac{(c_{i,t} - E(C_{i,t} \mid \theta_{i,t}))^2}{Var(C_{i,t} \mid \theta_{i,t})}, \tag{2.12}$$
where $c_{i,t}$ is the observed count of flood insurance claims for County $i$ during Year $t$. $E(C_{i,t} \mid \theta_{i,t})$
and $Var(C_{i,t} \mid \theta_{i,t})$ are calculated from the simulated claims described in the next section. Intuitively,
the Chi-Squared statistic with the lowest value gives the best fitting model.
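The statistic above can be computed directly from simulated claims. The following minimal sketch estimates the mean and variance of each county-year cell from its simulated draws and sums the standardized squared deviations of the observed counts. The three cells and their draws are made-up numbers for illustration, and the use of the sample variance with an n - 1 divisor is an assumption.

```python
def chi_squared(observed, simulated):
    """Sum over cells of (observed - E)^2 / Var, with E and Var from simulations.

    observed  : list of observed counts, one per county-year cell
    simulated : list of lists; simulated[k] holds simulated counts for cell k
    """
    total = 0.0
    for c, draws in zip(observed, simulated):
        n = len(draws)
        mean = sum(draws) / n
        var = sum((d - mean) ** 2 for d in draws) / (n - 1)
        total += (c - mean) ** 2 / var
    return total

# Hypothetical observed counts and simulated draws for three cells.
observed = [0, 3, 10]
simulated = [[0, 1, 0, 2, 1], [2, 4, 3, 5, 1], [8, 12, 9, 11, 10]]
stat = chi_squared(observed, simulated)
```

A cell whose simulated draws are all identical would give a zero variance and a division error, so in practice such cells need separate handling.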
Insurance applications are mainly concerned with the average of many years. Even if a model
does not predict a flood with a 25-year return interval for a given year, the insurance company
can recuperate given that the flooding event occurs on average every 25 years. Therefore, we also
calculate the Chi-Squared statistic for the average count of flood insurance claims for each
county:
$$\sum_{i=1}^{67} \frac{(c_i - E(C_i \mid \theta))^2}{Var(C_i \mid \theta)}. \tag{2.13}$$
According to DIC, for the Poisson and Negative Binomial models, the log links that only have
the log Number of Policies as the covariate are the best fitting, while the Riemann Zeta
model has the lowest DIC when the log link contains the covariates log Number of Policies
and Coast. According to DIC, the best fitting positive count model is the Negative Binomial
model and the worst fitting model is the Poisson model. The Chi-Squared statistics using the
county-year observations reflect the DIC for the log links. However, the Chi-Squared
statistics for the average count of flood insurance claims for each county tell a different
story. Here the Chi-Squared statistics indicate that the Riemann Zeta model is the worst fitting, while
the Negative Binomial and Poisson models show much better fits.
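For readers unfamiliar with how a DIC value is assembled, the sketch below computes it from posterior draws using the standard definition: the posterior mean deviance plus the effective number of parameters p_D, where p_D is the posterior mean deviance minus the deviance at the posterior mean. The toy model (independent Poisson counts with a single mean), the data, and the posterior draws are hypothetical; OpenBUGS reports DIC for the actual models.

```python
import math

def poisson_deviance(data, mu):
    """Deviance D(mu) = -2 * log-likelihood for independent Poisson counts."""
    loglik = sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in data)
    return -2.0 * loglik

def dic(data, posterior_mu):
    """Return (DIC, p_D) from posterior draws of a scalar Poisson mean."""
    d_bar = sum(poisson_deviance(data, m) for m in posterior_mu) / len(posterior_mu)
    mu_bar = sum(posterior_mu) / len(posterior_mu)
    p_d = d_bar - poisson_deviance(data, mu_bar)   # effective number of parameters
    return d_bar + p_d, p_d

data = [2, 0, 3, 1, 4]                      # hypothetical counts
posterior_mu = [1.6, 1.9, 2.0, 2.2, 2.4]    # hypothetical posterior draws
dic_value, p_d = dic(data, posterior_mu)
```

The penalty term p_D grows with model complexity, which is why adding covariates that barely change the fit can leave DIC essentially unchanged or slightly worse.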
Now we examine the posterior distributions of the coefficients for the best fitting log links,
according to DIC, for each positive count distribution. The median of the posterior distribution
for the log Number of Policies for the Poisson model, as seen in Figure 2.5, tells us that a 1
unit increase in the log number of flood insurance policies will increase the number of claims
by approximately .0072 given that at least one claim has occurred. The coefficient for the log
Number of Policies for the Negative Binomial model is slightly lower than that of the Poisson model.
According to the posterior median for the Negative Binomial model with the lowest DIC, as
seen in Figure 2.7, a 1 unit increase in the log number of policies will increase the expected
number of claims by .0068 given at least one flood claim is filed. Figure 2.7 also shows that
the posterior distribution of r for the Negative Binomial model is relatively close to zero, which
indicates the Negative Binomial model substantially differs from the Poisson model. Recall
that as $r \to \infty$, the Negative Binomial model collapses into the Poisson model. Finally, for the
Riemann Zeta model with the lowest DIC, the coefficient for the log Number of Policies has
a median of approximately 1.1, which indicates a 1 unit increase in the log number of policies
will increase the number of claims by .0011 given that at least one flood claim has been filed, as
seen in Figure 2.9. Figure 2.9 also shows the prior and posterior distribution for the covariate
Coast of the Riemann Zeta model with the lowest DIC. As expected, being a coastal county
has a positive impact on the number of flood claims. According to the median of the posterior
distribution of Coast, being a coastal county increases the expected count of flood insurance
claims by approximately .5 claims given at least one claim has occurred. Figures
2.6, 2.8, and 2.10 for the Poisson, Negative Binomial, and Riemann Zeta models, respectively, show
the 2.5th, 50th, and 97.5th percentile maps for the spatial intercepts. The magnitude of
the spatial intercepts for the Riemann Zeta model seems arbitrary. However, the Poisson and
Negative Binomial models have the highest spatial intercepts in the coastal counties of the
panhandle. This is interesting because the coastal counties of the panhandle have the highest
loss cost ratios for flood insurance in Florida. The alignment of the spatial intercepts and high
To estimate the actuarially fair premium rate, which is the expected number of claims in a
county, we simulate data from the best fitting SHP, SHNB, and SHRZ models, according to DIC.
Table 2.7 shows the expected average number of claims per county over the sample period
according to the NFIP and the three single hurdle models. Note the values for the NFIP are
the values predicted by the NFIP and are not the historical averages. Here we see what is
reflected by the Chi-Squared statistics for the county average. The SHP and SHNB models
are closer to the historical average than the SHRZ model. Figure 2.11 shows the difference
between the expected average number of claims and the historical average number of claims
for each county. Surprisingly, the Single Hurdle Poisson model is the best-fitting model not
only compared to the other two single hurdle models but also compared to the current ratings used by the
National Flood Insurance Program. The SHNB model has expected county estimates both far
below the historical average (underestimating by 301 claims in Escambia County) and far above
the historical average (overestimating by over 1,800 claims in Broward County).
The expected average number of claims for the SHRZ is consistently overestimated, indicating
We find that for modeling the average indemnity payments, the simplest model proves to be the
best fitting, as seen in Table 2.8. The only covariate used is the log of the Number of Claims.
Figure 2.12 shows the 2.5th, 50th, and 97.5th percentiles of the historical average indemnity
payment (provided a flood occurred) for each county over the sample period. Interestingly, the
median for the average indemnity falls between $2,238 and $18,845, which is substantially lower
than the combined building and contents coverage limits of $350,000 for residential and $1,500,000 for non-residential policyholders.
This can be compared to the simulated average indemnity payments in Figure 2.13, which shows
the 2.5th, 50th, and 97.5th percentiles of the average simulated indemnity payments for each
county. We see consistent regional trends in both the historical indemnity payments and the simulated data.
The panhandle counties have very high average indemnity payouts. This area
is known for vacation destinations, such as Destin in Okaloosa County. Monroe County, located
at the southern end of the peninsula, is the county containing the Florida Keys, which are also
well known for vacation homes and resorts. Note that the highest simulated average indemnity
payment in the 97.5th percentile, $1,530,106, could not entirely be paid out by the NFIP due
to limits on coverage. However, individuals would be able to purchase flood insurance from a
2.5.4 Loss Cost Ratio
Now that we have modeled both the number of claims and the average indemnity payments,
we can use this information to develop loss cost ratios (LCR) to determine the sustainability
of the program under the current ratings and the single hurdle models presented in this paper.
The loss cost ratio is defined as the total amount paid out in indemnity payments divided by the
total collected in premiums. The ideal ratio is one because then the amount paid out is equal to
the amount paid in to the insurance program. Therefore, we take the average expected number
of claims per county and multiply these by the median simulated average indemnity payments.
Figure 2.14 and Table 2.9 show the LCRs for each county given the observed data and the three
single hurdle models. These loss cost ratios are based on the aggregation of the 34 years in the
sample period. For the observed LCR, half of the counties have paid in three
times more in premiums than has been distributed in indemnity payments, while the counties in the
panhandle have LCRs greater than one. The LCRs using the SHRZ model are
the lowest when compared to the observed LCR and the LCRs of the other single hurdle models.
This is not surprising because the SHRZ model had the highest estimated number of claims.
The highest loss cost ratio, 125.5 for Escambia County, occurs under the SHNB model.
For the LCRs using the SHP model, most counties have LCRs greater than one, but there are
none of the extremely low or extremely high LCRs seen using the other single hurdle
models.
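The arithmetic behind a county's loss cost ratio can be sketched as follows: the expected number of claims is multiplied by the average indemnity payment and divided by the premiums collected. The county names and dollar amounts below are hypothetical and are not values from the NFIP data.

```python
def loss_cost_ratio(expected_claims, avg_indemnity, premiums):
    """Expected indemnities (claims x average payment) divided by premiums collected."""
    return expected_claims * avg_indemnity / premiums

# Hypothetical inputs: (expected claims, average indemnity in dollars, premiums in dollars).
counties = {
    "County A": (120.0, 15000.0, 2400000.0),
    "County B": (40.0, 9000.0, 300000.0),
}
lcr = {name: loss_cost_ratio(*vals) for name, vals in counties.items()}
# County A pays out 0.75 dollars per premium dollar; County B pays out 1.2.
```

A ratio below one means premiums exceed expected payouts in that county, while a ratio above one means the county is expected to draw more from the program than it contributes.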
2.6 Discussion
Flood insurance maps are infrequently updated due to the high cost. New construction in a
community can greatly affect the flood currents and the elevation of flood water. Out-of-date
flood maps cannot reflect the changes in water flow caused by new construction.
Because flood maps cannot be updated frequently, less costly complementary methods, such
as the predictions from the SHP model, should be explored and either used in conjunction with the
flood maps or applied on their own at a finer resolution. The use of Bayesian methods allows
for flexible modeling that can easily incorporate additional information about the parameters.
More importantly, Bayesian methods account for estimation uncertainty when making
predictions.
Eight years after levee breaches devastated Greater New Orleans, the effort to maintain
levees seems to have fallen by the wayside. There are no Acceptable4 levees within the state of
Florida. Several levees in urban areas are considered Unacceptable, and more than a dozen rural
and agricultural levees are Unacceptable (USACE, 2013). In fact, approximately one million
acres in Florida are protected by levees in Unacceptable condition. This is problematic since
current flood insurance rates are developed under the assumption that the levees will hold and
do not take into consideration the probability of a breach. For this reason, we propose adding
a component to the flood insurance premium that incorporates the risk of
a levee breach. The benefit of this added component would be two-fold: 1) premiums would
better reflect the risk of flooding and 2) policyholders would have more information on the risk
of flooding.
2.7 Conclusion
In this paper, we explore the issue of systemic risk through spatial modeling along with other
issues faced by the National Flood Insurance Program. Modeling the count of flood insurance
claims, we calculate the actuarially fair premiums. Three single hurdle models are considered
for the estimation of the annual number of flood insurance claims in each county of Florida.
Parameters of the models are allowed to vary spatially with the counties of Florida. Although the
Single Hurdle Poisson model does not give the best predictions on a yearly basis, the average prediction
of this model over the sample period is closer to the historical average than the other single hurdle
models and the current rating of the NFIP. Although the loss cost ratios from our estimation are
higher than one, with data on home values it will be possible to better estimate indemnity
payments. Home values are expected to be directly proportional to the average indemnity
4
Table 2.10 lists descriptions of these ratings.
payments. The inclusion of home values would be beneficial because neighboring counties may
have drastically different median home prices in some regions, but in other regions neighboring
counties may have similar home prices. Therefore, home values may not currently be properly
This analysis has allowed for areas of interest to be determined and further investigated.
Further research will utilize these single hurdle models at a finer spatial resolution within a
given county, preferably at the ZIP code or census tract level. This will allow us to explore
socio-economic issues within the flood insurance program. We would also like to include information
about the Community Rating System (CRS) to determine the effectiveness of CRS and its
impact on market penetration. The use of single hurdle models can extend beyond flood
insurance claims. Other insurance claims caused by natural disasters, such as wildfires or landslides,
can be described by the single hurdle models. Knowing that ten of the thirty costliest5
hurricanes recorded since 1900 and seven of the thirty most intense6 recorded since 1851 have
occurred within the last 25 years, recent history indicates the modeling of catastrophic events
2.8 Tables and Figures
Table 2.1: High risk areas have at least a 1% annual probability of flooding. These areas are
referred to as 100-year floodplains. Zones labeled A are for inland areas, while zones labeled
V are reserved for coastal areas. Moderate risk areas are referred to as 500-year floodplains.
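Table 2.2: Functional Forms of the Considered Probability Mass Functions. The function $\Gamma(\cdot)$ is defined as
$\Gamma(n) = (n - 1)!$ and the function $\zeta(\cdot)$ is defined as
$\zeta(\alpha + 1) = \sum_{k=1}^{\infty} \frac{1}{k^{\alpha + 1}}$.
Distribution         $f(U_{it} = u, \theta_{it})$                                                          $\mu$      $\sigma^2$
Poisson              $\frac{\mu^u}{u!} e^{-\mu}$                                                           $\mu$      $\mu$
Negative Binomial    $\frac{\mu^u \, \Gamma(r+u)}{u! \, \Gamma(r) (r+\mu)^u} \frac{1}{(1 + \mu/r)^r}$      $\mu$      $\frac{\mu(r + \mu)}{r}$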
[Figure 2.3 plot. Panel title: Logit Link: Number of Policies. Curves: prior, posterior.]
Figure 2.3: The prior and posterior distributions of the covariate of Number of Policies for the
logit link.
[Figure 2.4 map panels. Spatial Intercepts: 2.5th Percentile; 50th Percentile; 97.5th Percentile.]
Figure 2.4: For the logit link with the lowest DIC, the 2.5th , 50th , and 97.5th percentiles of spatial intercepts are shown above.
Table 2.3: DIC for Logit Links
Covariates                                 DIC
Policies                                   2157.2
Coast + Policies                           2157.1
Minimum Elevation + Policies               2157.0
Coast + Minimum Elevation + Policies       2157.3
Table 2.6: The Chi-Squared Statistics for data averaged over the years for each county.
[Figure 2.5 plot. Curves: prior, posterior.]
Figure 2.5: The prior and posterior distributions of the covariate Number of Policies for the
log link of the SHP model.
Table 2.7: This table shows the expected losses. The first column is an alphabetical list of all
the counties in Florida. The second column is the annual average for historical claims. The third
is an approximation of the annual number of claims the NFIP expects. The last three columns
show the results of the simulations for each type of hurdle model. For the SHP and SHNB,
the simulations for the models with the lowest DIC are shown.
[Figure 2.6 map panels. Intercepts for Poisson Model: 2.5th Percentile; 50th Percentile; 97.5th Percentile.]
Figure 2.6: For the SHP model with the lowest DIC, the 2.5th , 50th , and 97.5th percentiles of random effect intercepts are shown above.
[Figure 2.7 plots. Panel titles: Neg. Bin. Model: r; Neg. Bin. Model: Number of Policies. Vertical axis: Density. Curves: prior, posterior.]
Figure 2.7: The prior and posterior distributions of r and the covariate Number of Policies for
the log link of the SHNB model.
[Figure 2.8 map panels. Intercepts for Neg. Bin. Model: 2.5th Percentile; 50th Percentile; 97.5th Percentile.]
Figure 2.8: For the SHNB model with the lowest DIC, the 2.5th , 50th , and 97.5th percentiles of spatial intercepts are shown above.
[Figure 2.9 plots. Panel titles: Riemann Zeta: Number of Policies; Riemann Zeta: Coast. Curves: prior, posterior.]
Figure 2.9: The prior and posterior distributions of the coefficient for log number of policies
and Coast for the SHRZ model.
[Figure 2.10 map panels. Intercepts for R. Zeta Model: 2.5th Percentile; 50th Percentile; 97.5th Percentile.]
Figure 2.10: For the SHRZ model, the 2.5th , 50th , and 97.5th percentiles of spatial intercepts are shown above.
[Figure 2.11 map panels. Difference between the NFIP and the Historical Mean; Difference between the SHP and the Historical Mean; Difference between the SHNB and the Historical Mean; Difference between the SHRZ and the Historical Mean.]
[Figure 2.12 map panels. 2.5th Percentile of Average Losses; 50th Percentile of Average Losses; 97.5th Percentile of Average Losses.]
Figure 2.12: Percentiles for the average indemnity payments (adjusted to 2013 dollars) for a flood insurance claim.
Table 2.8: DIC for the Indemnity Payment Models
Covariates                                         DIC
Number of Claims                                   11640.0
Coast + Number of Claims                           83780.0
Minimum Elevation + Number of Claims               83770.0
Coast + Minimum Elevation + Number of Claims       83780.0
[Figure 2.13 map panels. 2.5th Percentile of Simulated Average Losses; 50th Percentile of Average Losses; 97.5th Percentile of Simulated Average Losses.]
Figure 2.13: Percentiles for the simulated average indemnity payments (adjusted to 2013 dollars) for a flood insurance claim. These are
based on the model for average indemnity payments, which had the lowest DIC.
[Figure 2.14 map panels. Observed Loss Cost Ratio; SHP Loss Cost Ratio.]
Table 2.9: Loss cost ratios
Table 2.10: Levee Rating System implemented by the United States Army Corps of Engineers
(USACE)
Rating Description
Acceptable All inspection criteria are rated as Acceptable.
Chapter 3
3.1 Introduction
In 2011 and 2012 severe droughts caused extensive crop damage throughout the Midwest. During
2011, news networks were flooded with stories of cattle ranchers unable to feed their herds
because of feed shortages. The following year proved disastrous as well. The loss cost ratio
(LCR)1 for corn in 2012 was 2.82, which translates to $12.7 billion of indemnity payments paid
to producers (Summary of Business, 2014). These recent events, combined with concerns about
climate change, have led to a growing focus on risk management in agriculture. The increasing
emphasis on risk management is reflected in the 2014 Farm Bill, which replaces direct payments
with shallow loss programs.
For this paper we turn our attention to winter wheat production in Kansas. Historically,
the majority of winter wheat in Kansas has been produced without irrigation, a practice
known as dryland production. Although dryland wheat production is typically more
cost-effective than irrigated production, dryland producers cannot use irrigation to
mitigate damages if a drought strikes Kansas. Currently, irrigated wheat and dryland wheat have different
1 This ratio is indemnity payments divided by premiums.
benchmark yields for crop insurance guarantees, but these benchmarks do not account for
differences in the variances or correlations of yields caused by the different practices. If differences
in variances and correlations are not properly accounted for in insurance ratings, premiums will
Using spatial models for winter wheat yields in Kansas, we investigate the ratings of the crop
insurance policies as well as expected payouts from the Agricultural Risk Coverage program
established under the 2014 Farm Bill. We model irrigated winter wheat and dryland wheat separately
since these practices have different benchmarks. The data are censored because some counties did
not plant winter wheat in certain years. For this reason we use a Bayesian version of a tobit
model. This model allows us to estimate the probability of an observation being censored. We also
look for changes in the spatial relationship among county yields since yields tend to be more
The Federal Crop Insurance Corporation (FCIC) underwrites crop insurance policies, which
are then sold to producers by private firms called Approved Insurance Providers (AIPs).
These policies insure producers against any form of natural disaster that affects crop production.
Policy guarantees are typically based on revenue or yield. Common coverage levels are 65% and
75%, although some crops/areas may be insured at 85% coverage. The Standard Reinsurance
Agreement developed by the Risk Management Agency (RMA) of the United States Department
of Agriculture (USDA) determines the share of losses paid by the AIPs and the share of losses
paid by the FCIC. For a list of policies underwritten by the FCIC refer to Table 3.1. The most popular
of these policies is Revenue Protection, which makes up over 80% of all crop insurance policies.
The current methodology for rating COMBO2 policies is outlined by Coble et al. (2010).
COMBO insurance ratings begin with the calculation of the unloaded target rate, which is
a function of loss cost ratios (LCRs) for the county of interest and its neighboring counties.
2 COMBO is an umbrella term for yield-based and revenue-based policies.
The loss cost ratio for a county is the ratio of the indemnity payments paid to producers to the
premiums collected for the given county. The unloaded target rate is the anchor rate for
insurance policies within the county. The rate is referred to as unloaded because it is calculated
without the highest 10% of losses for the counties. These large losses are accounted for in the
catastrophic loading. The unloaded target rate is a weighted average of the historical LCRs of the
county and its neighbors. The weights are calculated with the Bühlmann method, which is defined as
$$R = ZX + (1 - Z)\mu \qquad (3.1)$$
where
$$Z = \frac{P}{P + K}$$
and
5. $P$: exposure units
6. $K = \sigma^2_{c} / \sigma^2_{g}$
(a) $\sigma^2_{c}$ is the sample variance of the adjusted LCR for the county of interest.
(b) $\sigma^2_{g}$ is the sample variance of the adjusted LCR for the county group.
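The credibility weighting in Equation 3.1 can be illustrated with a minimal sketch. The function name, argument names, and all numeric inputs below are hypothetical, and the sketch assumes that K is the county variance divided by the county-group variance, following the order in which the two variances are listed above.

```python
def buhlmann_rate(county_lcr, group_lcr, exposure, var_county, var_group):
    """Credibility-weighted rate R = Z*X + (1 - Z)*mu with Z = P / (P + K).

    Assumption: K is the county variance divided by the county-group variance.
    """
    k = var_county / var_group       # K: ratio of the two sample variances
    z = exposure / (exposure + k)    # Z: weight given to the county's own LCR
    rate = z * county_lcr + (1.0 - z) * group_lcr
    return rate, z


# Hypothetical inputs: county LCR 0.12, group LCR 0.08, 30 exposure units.
rate, z = buhlmann_rate(county_lcr=0.12, group_lcr=0.08, exposure=30,
                        var_county=0.02, var_group=0.002)
print(round(z, 4), round(rate, 4))
```

With these made-up inputs K is 10, so the county's own experience receives a weight of 0.75 and the group experience the remaining 0.25. More exposure units move Z towards one.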
Once the unloaded target rate has been established, COMBO policies are rated with the
Iman and Conover (1982) procedure. The Iman and Conover procedure generates correlated random
draws of yields for a given county and price deviates. These correlated random draws of yield
and price deviates are then used to establish the premium rate for 65% coverage. The premium
rate is the expected loss divided by liability, which can be defined as
$E[\max(\lambda \hat{Y} - Y, 0)] / (\lambda \hat{Y})$, where $Y$ is the realized yield, $\hat{Y}$ is the
predicted yield, and $\lambda$ is the coverage level (Goodwin and Ker, 1998).
Disaster assistance for farmers was first established in 1938. For many years crop insurance
was offered for only a few crops and remained rather experimental. However, modern day crop
insurance was established by the Federal Crop Insurance Act of 1980. The legislation created
the Federal Crop Insurance Corporation under the jurisdiction of the Risk Management Agency.
The Federal Crop Insurance Act of 1980 also permitted 30% of premiums to be subsidized for
65% coverage policies (History of Crop Insurance Program, 2014). The federal crop insurance
program floundered through the 1980s and was on the brink of extinction in the early 1990s
until it was revitalized by the Federal Crop Insurance Reform Act of 1994 (Glauber, 2004).
This new legislation permitted premium subsidies for higher coverage levels, created
catastrophic (CAT) coverage, and made program participation mandatory. However, the mandatory
participation requirement was repealed in 1996. The 2000 Farm Bill allowed private
entities to carry out research and create new insurance products through a partnership with
RMA (History of Crop Insurance Program, 2014). The most recent agriculture legislation, 2014
One characteristic that sets crop insurance apart from other non-life insurance is the
potential for systemic risk. Systemic risk is the risk of losses occurring simultaneously and
dependently, such as in the event of a natural disaster. Natural disasters require insurance firms
to possess very large reserves of capital or reinsurance. Jaffe and Russell (1976) conjectured
that large reserves of capital would cause a firm to be susceptible to hostile takeovers. Miranda
and Glauber (1997) as well as Goodwin (2001) present arguments for the importance of
incorporating the potential of systemic risk into crop insurance portfolios. In 2012 crop insurance
indemnity payments over all crops totaled $17.4 billion, which amounted to a loss cost ratio of
1.57 (Summary of Business, 2013). With nearly $116 billion in liabilities for 281 million acres,
not including livestock, the Federal Crop Insurance Corporation claims that private firms would
not be able to fully bear the risk of a catastrophe such as the 2012 drought. Therefore,
according to the FCIC, the Standard Reinsurance Agreement (SRA), which allows private insurers to
In the last decade there has been recurring concern about crop insurance policies being
inconsistently rated for different regions and crops. Babcock et al. (2004) criticized the
assumption of constant relative risk, that is, the assumption that the loss cost ratio remains
constant over time. Woodard et al. (2011) demonstrated that using the loss cost ratio to determine
crop insurance premiums is unbiased only when the assumption of constant relative risk is not
violated. They found an upward bias in estimates when this assumption was violated.
Title I and Title XI of the 2014 Farm Bill focus on risk management strategies and have
eliminated direct payments, counter-cyclical payments, and the Average Crop Revenue Election
(ACRE) program. These programs are replaced by the Price Loss Coverage (PLC) program and
the Agriculture Risk Coverage (ARC) program. Farmers can choose to enroll in one of these
two programs. The PLC program pays out the difference between the market price and the
reference price3 multiplied by 85% of the base acres. The ARC program guarantees can
be based on either individual producer revenues or county revenues. Payouts occur if the
producer's revenue drops below 86% of the benchmark revenue. Producers are then paid the difference
between the actual revenue per acre and the guarantee, multiplied by 85% of the base acres.
The benchmark revenue is generated from the 5-year Olympic average of yields and the 5-year
Olympic average4 of the national price. Benchmark revenues for irrigated and dryland crops
are calculated separately. The 2014 Farm Bill's shift towards these new programs in place of
direct payments and counter-cyclical payments is a cause to further examine the risk associated
3.3 Data
Yields for winter wheat, measured in bushels per acre, were collected from the National
Agricultural Statistics Service over the sample period 1970-2013. These yields are aggregated
at the county level and grouped by irrigation practice: dryland (non-irrigated) and irrigated.
All 105 counties of Kansas produced dryland winter wheat during the sample period, and
67 counties of Kansas produced irrigated wheat. There are years during the sample period
without production for both dryland wheat and irrigated wheat; therefore, our modeling needs
to account for this censoring. Figure 3.1 shows the number of acres planted for both dryland
3 The reference price for wheat is $5.50 per bushel.
4 The Olympic average eliminates the highest and lowest values and then averages the remaining values.
winter wheat and irrigated winter wheat in Kansas. Here we see the majority of winter wheat is
produced without irrigation, which is true for winter wheat production throughout the United
States. Figure 3.3 shows a slight increase in mean yield of winter wheat for the entire state
over the sample period; however, when the yields are disaggregated into counties, the trend is
Since 80% of the crop insurance policies are revenue based, we need prices to simulate
premiums. Wheat futures contract prices were collected from the CBOT, and cash prices for
Kansas wheat were collected from the National Agricultural Statistics Service. The futures
contracts were priced in September and expired in July of the following year. The cash prices were
the averages for transactions in July. We use the September quotes because the projected price
for winter wheat is announced on September 30. September is also the month when winter wheat
is planted in Kansas. We use the July expiration date and
cash prices from July because most of the winter wheat in Kansas is harvested from late June
to mid-July.
3.4 Methodology
Because the dryland and irrigated yield data have years without production, we utilize a
Tobit-like Bayesian model. Tobit models (Tobin, 1958) assume the data have a latent variable $y^*_{it}$
for County $i = 1, \ldots, N$ during Year $t = 1, \ldots, T$. $c$ is a constant threshold, and
$\epsilon_{it} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$.
$B_{it} = I(y^*_{it} - c > 0)$, where $I(\cdot)$ is the indicator function. If a response variable has the form described
in Equation 3.2, it is called a censored variable. Censoring may result from sampling
methods or the nature of the data. For example, if an individual is below the age of 65, single,
and makes less than $10,000, he does not have to file a tax return; therefore, his income could
appear to be $0 to somebody investigating tax return data. Equation 3.2 and the example
above show left censoring because values below a particular threshold are censored.
When values above a certain threshold are censored, this is called right censoring. An
example of right censoring could be caused by an instrument that cannot exceed a particular
threshold, such as a physician's scale, which typically has a weight limit of approximately 400
pounds.
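$$\prod_{i=1}^{N} \prod_{t=1}^{T} \left[ \frac{1}{\sigma} \phi\left( \frac{y_{it} - x_{it}\beta}{\sigma} \right) \right]^{B_{it}} \left[ 1 - \Phi\left( \frac{x_{it}\beta}{\sigma} \right) \right]^{1 - B_{it}}, \qquad (3.3)$$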
The difference between a typical tobit model and our model is the estimation of a logit
link as well as the estimation of the normal regression truncated at zero. The logit link is used
to predict whether or not the observation yit will be censored, while the truncated normal
regression predicts the values of the yields when the observation yit is not censored.
The variable $B_{it} \sim \mathrm{Ber}(P_{it})$; therefore, $B_{it}$ can be modeled using a logit link function. For
the logit link function, we surmise that the year, location, and the September futures contract
price may affect whether the observation is censored or not. The form of the logit link is
$$\mathrm{logit}(P_{it}) = \beta^{B}_{i,0} + \sum_{j} \beta^{B}_{j} x_{jt} \qquad (3.4)$$
for County $i = 1, \ldots, N$ and Year $t = 1, \ldots, T$. The spatial intercepts
$(\beta^{B}_{1,0}, \ldots, \beta^{B}_{N,0})$ have a $\mathrm{CAR}(\cdot, \cdot)$ prior distribution. CAR is the
abbreviation for the Conditional Autoregressive model, which is a spatial distribution. The mean
(or intercept in our model) of one county is dependent on its surrounding
counties. Surrounding can be defined by distance or contiguity. In this model, we choose
contiguity over distance since the counties vary greatly in size and shape. A detailed description
The prior distributions of the coefficients on the September futures contract prices and the
years are both normally distributed. Using the notation of $N(\mu, \sigma^2)$, the prior distributions of
for County $i = 1, \ldots, N$ and Year $t = 1, \ldots, T$. The spatial intercepts
$(\beta_{1,0}, \ldots, \beta_{N,0})$ have a $\mathrm{CAR}(\cdot, \cdot)$ prior distribution. The same prior
distribution is used for $(\beta_{1,1}, \ldots, \beta_{N,1})$. The covariate $z_{it}$ is a binary
covariate defined as
$$z_{it} = \begin{cases} 0 & \text{if } y_{it} > m_i \\ 1 & \text{if } y_{it} \le m_i \end{cases}, \qquad (3.6)$$
where $m_i$ is the median yield for County $i = 1, \ldots, N$. The purpose of the covariate $z_{it}$ is
to capture changes in spatial dependencies that occur at lower yields. As Goodwin (2001)
demonstrated, yields across space during droughts or other natural disasters have stronger
dependencies compared to yields during normal years. The prior distribution of the parameter is a
truncated normal distribution, $N(1, 9^{-1}, 0, 2)$. Note the notation of the truncated normal is
$N(\mu, \sigma^2, a, b)$, where $\mu$ is the mean, $\sigma^2$ is the variance, $a$ is the lower limit, and $b$ is the upper limit.
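The construction of the binary covariate in Equation 3.6 can be sketched as follows. The array layout (counties in rows, years in columns), the use of NaN for years without production, and the choice to leave the covariate undefined in those years are all assumptions made for illustration; the yields shown are hypothetical.

```python
import numpy as np


def low_yield_indicator(yields):
    """Return z (1 where y_it <= m_i, else 0) and the county medians m_i.

    yields: array of shape (N counties, T years); NaN marks a year
    without production (an assumption of this sketch).
    """
    yields = np.asarray(yields, dtype=float)
    medians = np.nanmedian(yields, axis=1, keepdims=True)  # m_i per county
    with np.errstate(invalid="ignore"):                    # NaN comparisons are False
        z = (yields <= medians).astype(float)
    z[np.isnan(yields)] = np.nan                           # undefined for censored years
    return z, medians.ravel()


# Hypothetical yields for two counties over three years.
z, medians = low_yield_indicator([[30.0, 40.0, 50.0],
                                  [20.0, np.nan, 60.0]])
print(medians)
print(z)
```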
This model for censored county yields is used to simulate yields for the risk management
application of this paper. Note that we do not employ the simulation technique used in
classical statistics; instead we use posterior predictive sampling. In classical statistics
one typically plugs the maximum likelihood estimates into the sampling distribution and makes
random draws from it. Posterior predictive sampling is a two-part process. Since Bayesian
methods treat parameters as random variables, the first part of posterior predictive sampling
is drawing parameters from the posterior distribution. The sample of parameters drawn from
the posterior distribution is then used in the sampling distribution to draw random samples
of observations.
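The two-part process can be sketched on a toy example. This is not the censored spatial model of this paper: the sketch assumes a plain normal sampling distribution, and the posterior draws are simulated stand-ins for MCMC output with hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior draws for a toy normal yield model
# (stand-ins for the output of an MCMC sampler).
post_mu = rng.normal(40.0, 4.0, size=5000)
post_sigma = np.abs(rng.normal(8.0, 0.5, size=5000))

# Plug-in simulation: fix the parameters at point estimates,
# then draw observations from the sampling distribution.
plug_in = rng.normal(post_mu.mean(), post_sigma.mean(), size=5000)

# Posterior predictive sampling:
#   part 1: take one posterior draw of the parameters,
#   part 2: draw one observation from the sampling distribution at that draw.
post_pred = rng.normal(post_mu, post_sigma)

print(plug_in.std(), post_pred.std())
```

In this sketch the posterior predictive draws are more dispersed than the plug-in draws because they carry parameter uncertainty in addition to sampling noise.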
3.4.2 Prices
For the simulated revenues used in the premium ratings, we need to generate prices that are
correlated with the yields. After obtaining a posterior predictive sample of county yields from
the model described above, these yields are averaged to compute the state yield average. The
log difference between the July cash price and the September futures contract price is then
regressed on this state yield average. Prices are then generated from $p_t = w_{t-1} \exp(r_t)$, where $p_t$ is the
cash price for Year t, $w_{t-1}$ is the futures contract price, and $r_t \sim N(\alpha_0 + \alpha_1 y_t, \sigma_r^2)$. Note $y_t$ is the
state yield for Year t. The use of log-normal distributions for price differentials is common in
Computations are performed using the software R, and the Bayesian models are imple-
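The price-generation step, a cash price equal to the futures price times the exponential of a normally distributed log differential whose mean depends on the state yield, can be sketched as below. The coefficient names `a0` and `a1`, the standard deviation `sigma_r`, and every numeric value are hypothetical and are not estimates from this paper.

```python
import numpy as np


def simulate_cash_prices(futures_price, state_yields, a0, a1, sigma_r, rng):
    """Draw p = w * exp(r) with r ~ N(a0 + a1 * state_yield, sigma_r**2)."""
    state_yields = np.asarray(state_yields, dtype=float)
    r = rng.normal(a0 + a1 * state_yields, sigma_r)  # one log differential per draw
    return futures_price * np.exp(r)


rng = np.random.default_rng(1)
state_yields = rng.normal(38.0, 5.0, size=10_000)    # hypothetical simulated state averages
prices = simulate_cash_prices(7.00, state_yields, a0=0.30, a1=-0.01,
                              sigma_r=0.15, rng=rng)
print(prices.mean())
```

A negative slope, as assumed here, makes simulated prices fall when simulated state yields are high, which is the kind of yield-price correlation the revenue simulations need.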
In non-life insurance applications, there are several measures of interest: 1.) the probability of
a loss, 2.) the expected loss, i.e. the actuarially-fair premium, and 3.) the premium rate. These
values can be found through Monte Carlo integration. As shown by Goodwin and Ker
(1998), the probability of a loss is defined as
$P(y < C \cdot Y) = \frac{1}{M} \sum_{i=1}^{M} I(y_i < C \cdot Y)$, where $C$
is the coverage level, $Y$ is the expected yield or revenue, $y_i$ is the $i$th simulated yield or revenue,
$M$ is the number of replications, and $I$ is an indicator function. The expected loss is defined as
$E(L) = P(y < C \cdot Y) \, E(C \cdot Y - y \mid (C \cdot Y - y) > 0)$, where $L$ is the difference between
the guarantee and the actual yield or revenue. Finally, the premium rate can be determined by
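The three measures can be computed from a set of simulated yields or revenues with a few lines of code. The sketch below assumes the premium rate is the expected loss divided by the liability, with the liability taken to be the guarantee (coverage level times expected value), following the definition given earlier in the chapter. The function name and the four simulated values are hypothetical.

```python
import numpy as np


def loss_measures(sim, expected, coverage):
    """Monte Carlo estimates of P(loss), expected loss, and premium rate."""
    sim = np.asarray(sim, dtype=float)
    guarantee = coverage * expected
    shortfall = np.maximum(guarantee - sim, 0.0)
    prob_loss = np.mean(sim < guarantee)        # (1/M) * sum of I(y_i < C*Y)
    expected_loss = shortfall.mean()            # equals P(loss) * E(C*Y - y | loss)
    premium_rate = expected_loss / guarantee    # expected loss over liability (assumed)
    return prob_loss, expected_loss, premium_rate


# Four hypothetical simulated yields, expected yield 40, 75% coverage.
prob, exp_loss, rate = loss_measures([20, 30, 40, 50], expected=40, coverage=0.75)
print(prob, exp_loss, rate)
```

Averaging the shortfall over all replications, including those with no loss, gives the same number as multiplying the loss probability by the average shortfall among the losing replications.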
We simulate prices and yields to rate Group Risk Income Protection policies with the Harvest
Price Option (GRIP-HPO) as well as to estimate payouts for the new Agricultural Risk Coverage
program based on county yields. The rate for the GRIP policy with the Harvest Price Option
is referred to as the HP Rate. We estimate the HP rate because the majority of revenue
plans purchased include the Harvest Price Option. Summaries of the Harvest Price Option and
the GRIP policy are included in Table 3.1. The Group Risk Income Protection policy is rated
similarly to the methods discussed by Coble et al. (2010), who describe in detail the rating
of COMBO insurance plans5. The rate for the Group Risk Income Protection with the Harvest
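$$\text{HP Rate} = \frac{\sum_{i=1}^{10000} \max\left(0,\; C \cdot Y \cdot \min(2P, \max(P, p)) - \max\left(0,\; (y_i - \bar{y} + \hat{y}) \cdot \min(2P, p)\right)\right)}{10000 \cdot Y \cdot C \cdot P}, \qquad (3.7)$$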
where $Y$ is the actual production history (APH) yield, $P$ is the September futures contract
price of wheat in 2013, $C$ is the coverage level, $y_i$ is the simulated yield, and $p$ is the simulated
price. Note that for the Harvest Price Option, if the harvest price exceeds twice the September
price, 2 × (September price) is used in place of the harvest price. We estimate the 65%, 75%,
For the simulations of the Agricultural Risk Coverage program, we simulate county yields
and prices for 2009 to 2013. Then the Olympic averages for each county and the prices are
calculated. This process is repeated 10000 times to create distributions for the Olympic averages
of county yields and the price. Unlike with the GRIP plan, there is only one coverage level,
which is 86%. Using the simulated Olympic averages of yields and prices, we determine the
probability of a payout from the program and the expected payout for each county in 2014.
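The Olympic average and the ARC payout rule described in this chapter can be sketched as follows. The sketch implements only what the text states (an 86% guarantee on benchmark revenue, with payments applied to 85% of base acres) and ignores any other provisions of the legislation, such as payment caps, because the text does not describe them. Function names and numeric inputs are hypothetical.

```python
def olympic_average(values):
    """Drop the highest and the lowest value, then average the rest."""
    v = sorted(values)
    if len(v) < 3:
        raise ValueError("need at least three values")
    return sum(v[1:-1]) / (len(v) - 2)


def arc_payment_per_base_acre(yields_5yr, prices_5yr, actual_revenue,
                              guarantee_share=0.86, base_acre_share=0.85):
    """Payment per base acre under the rule described in the text (a sketch)."""
    benchmark = olympic_average(yields_5yr) * olympic_average(prices_5yr)
    guarantee = guarantee_share * benchmark
    return base_acre_share * max(0.0, guarantee - actual_revenue)


# Hypothetical five-year histories and a hypothetical actual revenue per acre.
payment = arc_payment_per_base_acre([30, 35, 40, 45, 50], [5, 6, 7, 8, 9],
                                    actual_revenue=200.0)
print(round(payment, 2))
```

In the simulations described above, this calculation would be repeated for each of the 10000 replications to obtain a distribution of payouts per county.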
3.5 Results
To help determine the best fitting models for dryland and irrigated wheat yields, the Deviance
Information Criterion (DIC) is calculated for each model version. DIC is a Bayesian measure
5 COMBO insurance is the umbrella term used to describe yield and revenue based crop insurance policies.
similar to AIC. Like AIC, lower values of DIC indicate a better fit, and the measure penalizes
additional parameters. Table 3.2 shows the DIC for the dryland and irrigated wheat models,
where the covariates of the logit link differ. For the logit links of both dryland wheat and
irrigated wheat, the covariates Year and September price affect censoring. The location of
the county does not seem to affect the censoring of dryland yields, but the location of the
county does affect the censoring of irrigated yields. Figure 3.6 shows the prior and posterior
distributions for the parameter of dryland wheat and irrigated wheat. The median of the
parameter is 1.167 for dryland wheat and 1.039 for irrigated wheat. For dryland wheat the
DIC for the model is 28070 when the covariate $z_{it}$ is included and 32280 when the covariate
$z_{it}$ is not present. For irrigated wheat the DIC for the model is 16090 when the covariate $z_{it}$
is included and 17820 when the covariate $z_{it}$ is not present. Therefore we find that within our
framework county yields are best described using not only spatial intercepts but also
a secondary spatial covariate for yields under $m_i$. Due to the short length of the time series
The posterior distributions of the parameters in the logit links of dryland and irrigated wheat
differ substantially. Figure 3.4 shows the prior and posterior distributions for the parameters in
the logit link functions of dryland wheat and irrigated wheat. For dryland wheat the intercept
is constant across counties and has a median of 1.808. The medians of the posterior distributions
for the coefficients of Year and September price are -0.003 and -0.001, respectively. These coefficients indicate
For irrigated wheat, the medians of the posterior distributions of the coefficients for Year and
September price are approximately 0.019 and 0.039, respectively. Therefore, the odds of irrigated
yields being censored increase by approximately 1.9% for every year that goes by, and the odds
of censoring increase by approximately 4.0% for every dollar the September price increases. Also
according to the spatial intercepts of the logit link for irrigated wheat found in Figure 3.5, counties in northeastern
The truncated normal regressions for both dryland and irrigated wheat contain spatial
intercepts with a CAR prior distribution and the secondary spatial covariate with a CAR prior
distribution. For the spatial intercepts and the secondary spatial covariates, we show maps of
the 2.5%, 50%, and 97.5% percentiles of the posterior distributions. The maps for spatial
intercepts for dryland wheat and irrigated wheat, shown in Figure 3.7 and Figure 3.9, do not
indicate distinct patterns across the state of Kansas. This is also true for the secondary spatial
To evaluate the fit of our models, we use the Chi-Squared discrepancy, which is a method of
posterior predictive checking described by Gelman et al. (2004). The Chi-Squared discrepancy
is defined as
$$\sum_{i=1}^{N} \sum_{t=1}^{T} \frac{(y_{i,t} - E(Y_{i,t} \mid \theta_{i,t}))^2}{Var(Y_{i,t} \mid \theta_{i,t})}, \qquad (3.8)$$
where $y_{i,t}$ is the observed yield for County i during Year t. $E(Y_{i,t} \mid \theta_{i,t})$ and
$Var(Y_{i,t} \mid \theta_{i,t})$ are calculated from the simulated yields. The Chi-Squared discrepancy with the
lowest value gives the best fitting model. For comparison we not only simulate yields from the
best fitting models for dryland wheat and irrigated wheat, but we also simulate yields from the
models where the truncated normal regressions have spatial intercepts with independent normally
distributed prior distributions and no secondary spatial covariate. Figure 3.11 and Figure 3.12
show the simulated yields for the best fitting models. For the simulated dryland yields, central
Kansas has relatively high yields and eastern Kansas has lower yields compared to the rest of the
state. However, there are no distinct patterns for irrigated wheat. Visual inspection also
indicates the simulated yields are reasonable when compared to observed yields seen in Figure 3.3.
Table 3.3 shows the Chi-Squared discrepancies. The best fitting model for dryland wheat has a
Chi-Squared discrepancy of 3886.0, whereas the model with independent intercepts has a Chi-Squared
discrepancy of 4054.4. The best fitting model for irrigated wheat has a Chi-Squared discrepancy of
2795.1, whereas the model with independent intercepts has a Chi-Squared discrepancy of 3983.4. These
Chi-Squared discrepancies show the improvement in fit caused by including the CAR prior
distribution for the spatial intercepts and the secondary spatial covariates.
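The Chi-Squared discrepancy can be computed directly from an array of simulated yields. The sketch below assumes the simulated yields are stored with the replication index first and estimates the conditional mean and variance by the sample mean and sample variance over replications. The function name and the tiny input arrays are hypothetical.

```python
import numpy as np


def chi_squared_discrepancy(observed, simulated):
    """Sum over counties and years of (y - E[Y])^2 / Var[Y].

    observed:  array of shape (N, T)
    simulated: array of shape (S, N, T), S posterior predictive draws
    """
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    mean = simulated.mean(axis=0)           # estimate of E(Y_it | theta_it)
    var = simulated.var(axis=0, ddof=1)     # estimate of Var(Y_it | theta_it)
    return float(np.sum((observed - mean) ** 2 / var))


# One county, two years, two simulated draws (hypothetical numbers).
sims = [[[0.0, 1.0]], [[2.0, 3.0]]]
print(chi_squared_discrepancy([[3.0, 2.0]], sims))
```

Computing this quantity for each candidate model on the same observed yields gives the kind of comparison reported in Table 3.3, where the smaller value indicates the better fit.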
Next we generate revenues for the year 2014 to determine premium rates of the GRIP-HPO
policies. Again we simulate from the best fitting models for dryland and irrigated wheat
as well as the models with independent intercepts. The policies have revenue guarantees of
65%, 75%, and 85%. Before estimating the premium ratings, we look at the probability of a
loss occurring for these guarantees. Figure 3.13 and Figure 3.14 show the probabilities for the
different guarantees for dryland wheat and irrigated wheat. The probabilities of the best fitting
model of dryland wheat have very distinct patterns. This model indicates higher probabilities
of loss in northwestern Kansas and south central Kansas compared to the rest of the state.
The probability of a loss is lower in eastern Kansas. The median probabilities of a loss across
all counties are 0.207, 0.328, and 0.449 for the 65%, 75%, and 85% guarantees, respectively.
Similar patterns emerge for the dryland wheat model with independent intercepts, although this
model has consistently higher probabilities. For the model with independent spatial intercepts,
the median probabilities of a loss across all counties are 0.241, 0.357, and 0.480 for the 65%,
75%, and 85% guarantees, respectively. Figure 3.17 and Figure 3.18 show the premium rates
for dryland wheat for the best fitting model and the model with independent intercepts.
The premium rates are higher for the model with independent intercepts compared to the best
fitting model. The median premium rates across all counties for the best fitting model are 0.038,
0.069, and 0.107 for the 65%, 75%, and 85% guarantees, while the median premium rates across
all counties for the model with independent intercepts are 0.051, 0.084, and 0.123.
The probabilities of a loss and the premium rates for irrigated wheat differ from the
probabilities and premium rates of dryland wheat. Figure 3.15 and Figure 3.16 show the probabilities
of a loss for the 65%, 75%, and 85% guarantees for the best fitting model and the model with
independent intercepts. Again we see the probabilities of a loss generated from the model with
independent intercepts are higher than the probabilities of loss generated by the best fitting
model. The median probabilities of a loss for the best fitting model across all counties are 0.158,
0.292, and 0.439 for the 65%, 75%, and 85% guarantees, while the median probabilities of the
model with independent intercepts are 0.2109, 0.3483, and 0.4959. Also the premium rates
for irrigated wheat, seen in Figure 3.19 and Figure 3.20, show the model with independent
intercepts has slightly lower premium rates than the best-fitting model. The median premium
rates across all counties for the best fitting model are 0.023, 0.05, and 0.088 for the 65%, 75%,
and 85% guarantees, while the median premium rates across all counties for the model with
The final component of our analysis is the application of our models to the new Agricultural
Risk Coverage program. We simulate yields and prices from 2009 to 2013 and then take the
Olympic average of the simulated prices and the Olympic average of the simulated yields
of each county. All of this analysis is conducted using the best fitting model with
10,000 replications of prices and yields. The 2.5%, 50%, and 97.5% percentiles for the simulated
Olympic averages of the yields are shown in Figure 3.23 and Figure 3.24 for dryland and irrigated
wheat, respectively. The distribution of Olympic averages for prices is shown in Figure 3.21.
The mean Olympic average price is $6.48 with a 95% confidence interval from $5.25 to $8.11.
The expected median payout across all counties for the ARC program is $17.65 per acre
for dryland wheat, and the expected median payout across all counties is $21.32 per acre
for irrigated wheat as seen in Figure 3.25. It is worth noting that the probability of a payout
from the ARC program is slightly lower than the probability of a payout from a crop insurance
policy with an 85% guarantee. The probability of a payout across all counties is 0.412 for
3.6 Discussion
Our analysis shows that the best fit for county yields allows the spatial dependencies among
the counties to change with the value of the yields. When compared to a model that assumes
no correlation between yields, we see the dryland wheat premium ratings for different
coverage levels are more consistent. Therefore, by including spatial dependencies in crop insurance
ratings, the premium rates better reflect intuition. Although the target rate used by RMA is a
weighted average of a county's LCR and the LCRs of its neighboring counties, this average is
only a point estimate and does not fully describe the dependencies among the distributions
of the county yields. Therefore, RMA may want to consider a model that better accounts for
spatial dependencies.
According to a study conducted by Ifft et al. (2012), the total for direct payments from 2004
to 2008 was equal to 6.8% of crop revenues. One of the major concerns of the 2014 Farm Bill
is how the Agricultural Risk Coverage program and Price Loss Coverage program will compare
to direct payments and the other programs being eliminated. The best fitting models predict
the average revenue for an acre of Kansas winter wheat in 2014 will be $213.78 and $276.56
for dryland and irrigated winter wheat, respectively. The expected median payout per acre of
winter wheat from the ARC program is $17.65 and $21.32 for dryland and irrigated winter wheat,
respectively. If we multiply these payouts by 0.85 (because ARC payouts are applied to 85% of
base acres), the ratios of the payouts of the ARC program to the expected revenue are 7.02% for
dryland winter wheat and 6.55% for irrigated winter wheat. Therefore, our analysis concludes the payouts from
This paper found that not only do spatial dependencies exist among county yields, but the
spatial relationships also depend on the value of the yields. Including these spatial
dependencies improves the forecasting ability of the models for both dryland and irrigated wheat. This
improved forecast translates into more accurate premium ratings for crop insurance policies.
We also determine that, based on the best fitting models presented in this paper, the expected
payouts of the ARC program will be very similar to the amounts paid out as direct payments.
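The payout-to-revenue ratios quoted in the discussion can be reproduced with a short calculation. The sketch uses only the per-acre payouts, the predicted revenues, and the 85% base-acre factor reported above; the variable names are illustrative.

```python
BASE_ACRE_SHARE = 0.85  # ARC payouts are applied to 85% of base acres

payout = {"dryland": 17.65, "irrigated": 21.32}      # expected median ARC payout, $ per acre
revenue = {"dryland": 213.78, "irrigated": 276.56}   # predicted 2014 revenue, $ per acre

# Ratio of the scaled ARC payout to the predicted revenue for each practice.
ratios = {k: BASE_ACRE_SHARE * payout[k] / revenue[k] for k in payout}
for practice, ratio in ratios.items():
    print(f"{practice}: {ratio:.2%}")
```

Both ratios fall close to the 6.8% share of crop revenues that Ifft et al. (2012) report for direct payments.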
Title I and Title XI of the 2014 Farm Bill have prioritized risk management in United
States agriculture for the next several years. Since the majority of crop insurance policies have
guarantees based on the production of individual producers instead of county level production,
we plan to apply the models used in this paper to yields of individual producers. We also plan
to further compare expected payouts of these new programs to direct payments, counter-cyclical
3.8 Tables and Figures
Table 3.2: DIC for the entire model. Here the logit link is varied, while the truncated normal
distribution has the spatial intercept, the spatial covariate with the optimal threshold, and the
September price.
Table 3.3: Chi-Squared Discrepancies. Best-Fitting shows the Chi-Squared discrepancy of the
model that has spatial intercepts with the CAR prior distribution, the optimal threshold
covariate in the truncated normal regression, and the September price covariate. Independent has
different intercepts for each county with independent priors and the September price covariate.
These two models have the same logit link.
                Dryland    Irrigated
Best-Fitting    3886.0     2795.1
Independent     4054.4     3983.4
State Yields    Acres of Winter Wheat Planted (axes: Year, Acres (Millions); series: Dryland, Irrigated)
Figure 3.1: Figures for the entire state of Kansas including the average yield and number of acres planted.
Wheat Prices (Inflation Adjusted) (y-axis: Dollars Per Bushel)
Figure 3.2: Wheat price per bushel (adjusted to 2013 prices)
Mean of Irrigated Wheat
Figure 3.3: Average yield (bushels per acre). Sample period: 1970-2013
Three density panels (y-axis: Density; series: prior, dryland, irrigated)
Figure 3.4: Prior and posterior distributions of the dryland and irrigated wheat logit link functions. Note there is no posterior
distribution of the intercept for irrigated wheat because the intercepts for the best-fit irrigated wheat logit link function are spatially varying.
Spatial Intercepts: 2.5%    Spatial Intercepts: 50%    Spatial Intercepts: 97.5%
Figure 3.5: Posterior percentiles for the spatial intercepts of the irrigated wheat logit link.
Two density panels (series: prior, posterior)
Figure 3.6: Prior and posterior distributions for the parameter of the dryland wheat and
irrigated wheat
Spatial Intercepts: 2.5%
41.3903 41.5645
41.5645 41.6116
41.6116 41.6422
41.6422 41.6827
41.968 42.0035
42.0035 42.0256
42.0256 42.0427
42.0427 42.0739
42.3902 42.4321
42.4321 42.4548
42.4548 42.4798
42.4798 42.6627
Figure 3.7: Posterior percentiles for the spatial intercepts of the dryland wheat truncated
normal regression
Figure 3.8: Posterior percentiles for the secondary spatial covariate of the dryland wheat truncated normal regression
Figure 3.9: Posterior percentiles (2.5%, 50%, and 97.5%) for the spatial intercepts of the irrigated wheat truncated normal regression
Figure 3.10: Posterior percentiles (2.5%, 50%, and 97.5%) for the secondary spatial covariate of the irrigated wheat truncated normal regression
[Figures: maps of simulated yields by posterior percentile (panel titles "Simulated Yields: 2.5%" and "Simulated Yields: 50%")]
Figure 3.13: Probability for the three coverage levels of dryland wheat of the best fitting model.
Figure 3.14: Probability for the three coverage levels of dryland wheat of the model with independent counties.
Figure 3.15: Probability for the three coverage levels (65%, 75%, and 85%) of irrigated wheat of the best fitting model.
Figure 3.16: Probability for the three coverage levels (65%, 75%, and 85%) of irrigated wheat of the model with independent counties.
Figure 3.17: Premium rates for the three coverage levels of dryland wheat of the best fitting model.
Figure 3.18: Premium rates for the three coverage levels of dryland wheat of the model with independent counties.
Figure 3.19: Premium rates for the three coverage levels (65%, 75%, and 85%) of irrigated wheat of the best fitting model.
Figure 3.20: Premium rates for the three coverage levels (65%, 75%, and 85%) of irrigated wheat of the model with independent counties.
[Figure: histogram of Olympic Averages of Prices (x-axis: Dollars; y-axis: count)]
[Figure: maps with panels "Dryland Wheat" and "Irrigated Wheat"]
[Figures: maps of ARC Average Yield by posterior percentile (2.5%, 50%, and 97.5%)]
Figure 3.25: Median of the expected payout for the ARC program (panels: Dryland Wheat, Irrigated Wheat).
Chapter 4
The Law of One Price (LOP) states the price of a homogeneous good in different locations is the
same when transaction costs are excluded. When markets act in accordance with the Law of One
Price (not including transaction costs), market players will arbitrage. The spatial trade between
two markets will eventually cause the prices to reach an equilibrium. If prices continuously
the markets. This lack of integration could represent asymmetric information in the markets
Early work on spatial integration utilized the theoretical model by Takayama and Judge
(1964); however, the Takayama and Judge formulation assumes fixed transaction costs across
markets. Faminow and Benson (1990) offered an alternative theory in which transaction
costs across distances are significant and firms in close proximity operate as a spatially-linked
oligopoly. The Faminow and Benson formulation bars transactions between markets over a
great distance due to high transaction costs. Early empirical applications, such as Ardeni (1989)
and Goodwin and Schroeder (1991), also did not account for dynamic transaction costs. Fackler
and Goodwin (2001) provided a primer reviewing methods used for testing cointegration. Early
applications used the Ravallion (1986) and Timmer (1987) market integration tests as well as
tests for Granger causality. However, Barrett (2001) criticized the assumption of fixed
transaction costs implied by these tests. Also, despite the popularity of using cointegration tests for
determining spatial integration, McNew and Fackler (1997) demonstrated that cointegration is
Several different methodologies have been used to better incorporate transaction costs into
testing for spatial integration, such as the use of regime switching regression models. Regime
switching regression models establish upper and lower bounds around the equilibrium in an
attempt to account for transaction costs, which are difficult to handle because they are
unobserved. Sexton, Kling, and Carman (1991) used a three-regime model, which
allowed for the LOP to hold, relative shortages, or relative surpluses. Barrett and Li (2002)
developed a maximum likelihood estimator that allowed for regimes in and out of periods of
trade. Goodwin and Piggott (2001) approached the issue of spatial integration using cointegration
tests and threshold error correction models on North Carolina agricultural markets. Their
threshold error correction models were used to form non-linear impulse responses. Unlike the
market integration tests used in many of the earlier applications, impulse responses are dynamic
and allow one to observe the time-path of price differentials between two markets after a shock.
Others followed with extensions of their analysis. Sephton (2003) implemented a threshold test
based on a Vector Error Correction Model (VECM) to the North Carolina agricultural markets.
The estimation with the threshold model based on VECM showed similar results to those found
by Goodwin and Piggott (2001). Threshold tests based on error correction models have also
been used to investigate the dairy sector in Spain (Serra and Goodwin, 2003) and commodities
Like Goodwin and Piggott (2001) and Sephton (2003), this paper also investigates the
spatial integration of agricultural markets in North Carolina. Our analysis utilizes daily price
observations from six grain markets in North Carolina: two corn markets, two soybean markets,
and two wheat markets. The daily price series analyzed in this paper cover a sample period
from January 7, 2000 to November 3, 2011 for the two corn markets and the two soybean
markets, and the sample period for the two wheat markets is from October 7, 2005 to May
27, 2010. These price series include 2008, which was not included in the previous analysis
of North Carolina agricultural markets. Like Goodwin and Piggott our analysis utilizes non-
linear impulse response functions. These impulse responses impose an asymmetric shock on the
markets and trace the time path of the price differentials after the shock. The impulse responses
are referred to as non-linear because the functional form of the model for which the shock is
imposed is non-linear. However, our analysis steps away from the threshold modeling that has
GARCH(1,1) model on the price differentials from each pair of terminal markets (pairing is
determined by grain type), we use copula models on the standardized residuals. The copulas
have time-varying parameters. These copulas are called Stochastic Copula Autoregressive
(SCAR) models. Instead of imposing thresholds for regimes, this time-varying copula allows for
4.1 Methodology
The models for the pairs of grain market prices are estimated sequentially in three parts. We first
estimate models for the mean prices of the grain markets and then for the variances of these markets.
Next, the standardized residuals for each pair of grain markets are used in Stochastic Copula
Autoregressive (SCAR) models. Finally, using all three components (the models for the means,
the variances, and the standardized residuals), the non-linear impulse responses are estimated.
Given the nature of the prices in these grain markets, the daily prices are likely autocorrelated
and the prices between the grain markets are probably correlated. Therefore, we use a Vector
Autoregressive (VAR) model for modeling the mean prices in the terminal grain markets. The
VAR(p) model has the form:
Nonstationarity
If the mean and the autocovariances of the time series do not depend on time, the time series
is defined as stationary$^1$. Unit-root processes are nonstationary time series, such that the time
and L is the backward (lag) operator. If nonstationarity is not accounted for when modeling, the
estimates will be statistically significant but provide very little economic insight. For example, if
the nonstationarity in nominal gross domestic product is not accounted for, the coefficients of
an autoregressive model will be dominated by the time trend caused by inflation rather than
changes in productivity. Nonstationarity is often seen in price data because of long-term changes
in the market. Therefore, testing the grain price series for nonstationarity is a necessity. Tests
such as the Augmented Dickey-Fuller (ADF) test can be applied to each individual time series.
If the ADF test indicates the time series is a unit-root process, a common solution is to model
the change in the time series such that $\Delta y_t = \alpha + \phi \Delta y_{t-1} + u_t$, where $\Delta y_t = y_t - y_{t-1}$ for
Suppose we are modeling two unit-root nonstationary time series using a VAR(p) model. If there
is only one unit-root for both nonstationary time series, the two time series are referred to as
cointegrated. Cointegration of N time series can be modeled using a Vector Error Correction
. The rank of the $N \times N$ matrix $\Pi$ indicates the number of unit roots in the system. If rank($\Pi$) =
k, the number of unit roots is equal to $N - k$. The maximal eigenvalue likelihood ratio (LR)
statistic developed by Johansen (1988) is a common measure used for testing cointegration.
Prices of commodities often vary in their volatility over time. In our paper, assuming the residuals
of the VAR model have a constant variance would be inappropriate. The Generalized Autoregressive
Conditional Heteroskedasticity (GARCH) model, developed by Bollerslev (1986), allows the variance
to change over time. Given the residuals from the model of the mean, $u_{it}$, where day $t = 1, \ldots, T$
$$\sigma_{it}^2 = \omega + \alpha u_{i,t-1}^2 + \beta \sigma_{i,t-1}^2, \qquad (4.5)$$
To estimate the GARCH model, we use the Two-Pass method described by Tsay (2005).
First the mean model, in our case the VAR model or VECM, is estimated. The estimated
residuals $\hat{u}_{i,t}$ of the mean model for time series i = 1, 2 are taken as the true observations.
Using maximum likelihood estimation, the GARCH model is estimated from the residuals.
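The second pass of the Two-Pass method treats the mean-model residuals as data and runs a GARCH(1,1) variance recursion over them. The following is a minimal sketch of that recursion only, not of the maximum likelihood step; the parameter names `omega`, `alpha`, and `beta`, the choice to start the recursion at the unconditional variance, and the residual values are illustrative assumptions rather than details from this paper.

```python
import math


def garch11_variances(residuals, omega, alpha, beta):
    """Conditional variances from a GARCH(1,1) recursion (illustrative sketch).

    sigma2[t] = omega + alpha * residuals[t-1]**2 + beta * sigma2[t-1]
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    # Assumption: initialise at the unconditional variance omega / (1 - alpha - beta).
    sigma2 = [omega / (1.0 - alpha - beta)]
    for t in range(1, len(residuals)):
        sigma2.append(omega + alpha * residuals[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2


def standardize(residuals, sigma2):
    """Standardized residuals z_t = u_t / sigma_t."""
    return [u / math.sqrt(s2) for u, s2 in zip(residuals, sigma2)]


# Hypothetical residuals from a mean model (made-up numbers).
u = [0.10, -0.25, 0.05, 0.40, -0.10]
sigma2 = garch11_variances(u, omega=0.01, alpha=0.10, beta=0.85)
z = standardize(u, sigma2)
```

In a full estimation the three parameters would be chosen to maximize the Gaussian likelihood of the residuals given these variances; the recursion above is the piece that likelihood evaluates at each candidate parameter value.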
Using the residuals from the mean model and the conditional variances from the GARCH
model, the standardized residuals $z_{it} = u_{it} / \sigma_{it}$ can be calculated. For the standardized residuals
of each terminal market, we calculate the rank-based empirical cumulative densities
$$F_i(z) = \frac{1}{T} \sum_{t=1}^{T} \mathbf{1}(Z_{it} \le z). \qquad (4.6)$$
These cumulative densities are then used as the marginal distributions for copula modeling.
According to Charpentier et al. (2006), the nonparametric marginal distributions provide more
efficient copula parameter estimates compared to parametric estimations of the marginal distri-
butions. The copula model is the joint distribution of the standardized residuals for the pair of
grain markets.2 By using a time-varying copula for the standardized residuals, we create a fuller
description of the price series compared to using a Dynamic Conditional Correlation GARCH
model. The time-varying copulas can be described by a variety of the dependence measures
that are discussed below, such as Kendall's tau and the tail dependence coefficient.
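The rank-based empirical cumulative densities used as copula data can be computed directly from the standardized residuals. The sketch below follows the indicator-sum definition in equation (4.6); the function name and the sample values are hypothetical.

```python
def empirical_cdf_values(z):
    """Evaluate the empirical CDF F(z_t) = (1/T) * #{s : z_s <= z_t} at every observation."""
    T = len(z)
    return [sum(1 for zs in z if zs <= zt) / T for zt in z]


# Hypothetical standardized residuals for one market (made-up numbers).
z_market = [0.3, -1.2, 0.8, 0.0]
u_market = empirical_cdf_values(z_market)
```

One design point worth noting: with a divisor of T the largest observation maps to exactly 1, where many copula densities are not finite. A common workaround, stated here as general practice rather than as this paper's choice, is to divide by T + 1 instead.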
For the standardized residuals we use the method developed by Almeida and Czado (2012),
the Stochastic Copula Autoregressive (SCAR) model. Expressing the copula data$^3$ as $u_t =
(u_{1,t}, u_{2,t})$, we assume $u_t \mid (u_1, \ldots, u_{t-1}), (\theta_1, \ldots, \theta_t) \sim C_{\theta_t}$ for $t = 1, \ldots, T$, where $C_{\theta_t}$ is the
time-varying copula distribution with the parameter $\theta_t$. The copulas in this analysis are bivariate
copulas with Kendall's tau coefficient ranging from -1 to 1. The copula distributions considered
in this paper are the Gaussian, Double Clayton, and Double Gumbel SCAR copulas. The Double
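Clayton copula and the Double Gumbel copula are defined such that
$$C^{D}(u, v \mid \theta) = \begin{cases} C(u, 1 - v \mid -\theta) & \text{if } \theta < 0 \\ C(u, v \mid \theta) & \text{if } \theta \ge 0 \end{cases}, \qquad (4.7)$$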
2
An overview of copula modeling is provided in Appendix D.
3
The copula data for this analysis are the rank-based empirical cumulative densities derived from the stan-
dardized residuals.
where C(u, v) is either the Gumbel copula or the Clayton copula. Table 4.1 shows the
formulas for Kendall's tau for the Gaussian, Double Clayton, and Double Gumbel copulas.
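The SCAR model parameter $\theta_t$ is modeled by the latent variable $\lambda_t$, which is defined as
$$\lambda_1 \sim N\!\left(\mu, \frac{\sigma^2}{1 - \phi^2}\right), \qquad (4.8)$$
and
$$\lambda_t = \mu + \phi(\lambda_{t-1} - \mu) + \sigma \varepsilon_t, \qquad (4.9)$$
for $t = 2, \ldots, T$. Also $\varepsilon_t$ is an i.i.d. Gaussian innovation. Then we find the time-varying parameter
where $\tau_t$ is Kendall's tau coefficient and $\tau^{-1}$ is the inverse of the formula for Kendall's tau
coefficient.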
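To make the latent process concrete, the following sketch simulates a stationary Gaussian AR(1) path and maps it into the interval (-1, 1). The mapping used here, the inverse Fisher transform `tanh`, is an assumption adopted for illustration, since the linking equation is not reproduced in the text above; the function names and parameter values are likewise hypothetical.

```python
import math
import random


def simulate_latent(T, mu, phi, sigma, seed=0):
    """Simulate lambda_1..lambda_T from a stationary Gaussian AR(1) process."""
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between -1 and 1 for stationarity")
    rng = random.Random(seed)
    # Initial draw from the stationary distribution N(mu, sigma^2 / (1 - phi^2)).
    lam = [rng.gauss(mu, sigma / math.sqrt(1.0 - phi ** 2))]
    for _ in range(1, T):
        lam.append(mu + phi * (lam[-1] - mu) + sigma * rng.gauss(0.0, 1.0))
    return lam


def latent_to_tau(lam):
    """Assumed link: inverse Fisher transform, giving values in (-1, 1)."""
    return [math.tanh(x) for x in lam]


# Illustrative values only; they are not estimates from this paper.
lam_path = simulate_latent(T=250, mu=0.75, phi=0.9, sigma=0.1, seed=1)
tau_path = latent_to_tau(lam_path)
```

Because the path is mean-reverting around `mu` with persistence `phi`, the implied dependence drifts smoothly over time instead of switching between discrete regimes.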
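Since the SCAR models are estimated with Bayesian methods, we need to establish prior
distributions for $\mu$, $\phi$, and $\sigma^2$. The parameter $\phi$ has a prior distribution of Beta(e, f), where
$$p(\phi \mid \mu, \sigma^2, \lambda_{1:T}, u_{1:T}) \propto (1 + \phi)^{e - \frac{1}{2}} (1 - \phi)^{f - \frac{1}{2}} \, \varphi\!\left(\phi; \frac{S_1}{S_0}, \frac{\sigma^2}{S_0}\right) \mathbf{1}_{[-1,1]}(\phi). \qquad (4.11)$$
$\varphi(\cdot)$ is the normal density. $S_0$ and $S_1$ are defined as $S_0 := \sum_{t=2}^{T-1} (\lambda_t - \mu)^2$ and $S_1 := \sum_{t=2}^{T} (\lambda_t - \mu)(\lambda_{t-1} - \mu)$.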
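The parameters $\mu$ and $\sigma^2$ have conjugate priors. Their prior distributions are $\mu \sim N(c, d^2)$
and $\sigma^2 \sim IG(a, b)$. In our estimation, the hyperparameters in the prior distributions are as
follows: a = 0.1, b = 0.1, c = 0.75, and d = 0.1. Then the posterior distributions for $\mu$ and $\sigma^2$ are given by the updated hyperparameters:
1. $a^* := a + \frac{T}{2}$
2. $b^* := b + \frac{1}{2}\left[\sum_{t=2}^{T} \left(\lambda_t - \mu - \phi(\lambda_{t-1} - \mu)\right)^2 + (1 - \phi^2)(\lambda_1 - \mu)^2\right]$
3. $d^{*2} := \left(\frac{1}{d^2} + \frac{(T-1)(1-\phi)^2 + (1-\phi^2)}{\sigma^2}\right)^{-1}$
4. $c^* := d^{*2}\left(\frac{c}{d^2} + \frac{1}{\sigma^2}\left[(1 - \phi^2)\lambda_1 + (1 - \phi)\sum_{t=2}^{T} (\lambda_t - \phi\lambda_{t-1})\right]\right)$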
All of the estimation for the SCAR models is conducted using the Metropolis-Hastings algo-
rithm.
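The conjugate updates for the mean and variance of the latent process can be sketched as two small functions. This is a sketch under the assumption that the latent variable follows the stationary Gaussian AR(1) process described above, with a normal prior on the mean and an inverse gamma prior on the variance; the function and argument names are hypothetical, and the code is not the authors' implementation.

```python
def update_sigma2_hyper(lam, mu, phi, a, b):
    """Updated inverse gamma hyperparameters (a*, b*) for sigma^2 given the latent path."""
    T = len(lam)
    # Sum of squared one-step innovations of the AR(1) process.
    sse = sum((lam[t] - mu - phi * (lam[t - 1] - mu)) ** 2 for t in range(1, T))
    a_star = a + T / 2.0
    b_star = b + 0.5 * (sse + (1.0 - phi ** 2) * (lam[0] - mu) ** 2)
    return a_star, b_star


def update_mu_hyper(lam, phi, sigma2, c, d2):
    """Updated normal hyperparameters (c*, d*^2) for mu given the latent path."""
    T = len(lam)
    precision = 1.0 / d2 + ((T - 1) * (1.0 - phi) ** 2 + (1.0 - phi ** 2)) / sigma2
    d2_star = 1.0 / precision
    weighted = (1.0 - phi ** 2) * lam[0] + (1.0 - phi) * sum(
        lam[t] - phi * lam[t - 1] for t in range(1, T)
    )
    c_star = d2_star * (c / d2 + weighted / sigma2)
    return c_star, d2_star


# Hypothetical latent path; prior values a = b = 0.1, c = 0.75, d = 0.1 as stated in the text.
lam_example = [0.70, 0.74, 0.78, 0.76]
a_star, b_star = update_sigma2_hyper(lam_example, mu=0.75, phi=0.9, a=0.1, b=0.1)
c_star, d2_star = update_mu_hyper(lam_example, phi=0.9, sigma2=0.01, c=0.75, d2=0.1 ** 2)
```

Within a sampler, each sweep would draw the variance from IG(a*, b*) and the mean from N(c*, d*^2) before moving on to the non-conjugate updates.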
For comparison purposes, we estimate not only the SCAR models for each pair of markets but
also three single-parameter copulas: the Gaussian, Clayton, and Gumbel copulas. These three
copulas are also estimated using Bayesian
methods. The Clayton and Gumbel copula parameters have gamma distributions for their prior
distributions. Note the prior distribution for the Gumbel copula parameter has a support shifted
from $(0, \infty)$ to $(1, \infty)$ in order to match the support of the Gumbel copula. The Gaussian copula
parameter has a beta distribution for its prior distribution. The use of the beta distribution does
not allow for a negative parameter in the Gaussian copula. This choice is intentional because a
negative parameter in the Gaussian copula would violate economic intuition because markets
$$I_{t+k}(\nu, D_t, D_{t-1}, \ldots) = E[D_{t+k} \mid D_t = d_t + \nu, D_{t-1} = d_{t-1}, \ldots] - E[D_{t+k} \mid D_t = d_t, D_{t-1} = d_{t-1}, \ldots], \qquad (4.12)$$
$t = 1, \ldots, T$, where $\nu$ is a shock introduced to the model and $d_t$ is the difference in the price
differentials between the two markets. The negative shock is introduced through the latent
variable $\lambda_t$ of the SCAR model. Negative shocks to the latent variable in turn create a negative
parameter in the copula, making an asymmetric movement in the markets. Appendix C provides
4.2 Data
The data used in this analysis are daily prices from six terminal grain markets in North Carolina:
two corn markets, two soybean markets, and two wheat markets. The daily price series of
these markets have a sample period from January 7, 2000 to November 3, 2011 for the two corn
markets and the two soybean markets, and a sample period from October 7, 2005 to May 27,
2010 for the two wheat markets. The corn markets are located in Barber and Laurinburg, which
have a road distance of 113 miles between them. The soybean markets are located in Fayetteville
and Cofield. The road distance between Fayetteville and Cofield is 168 miles. Fayetteville is
currently the only soybean market with a crusher, making it the dominant soybean market
in North Carolina. The crusher is used to process soybeans into meal and oil. Finally, the
wheat markets are situated in Greenville and Statesville, which are 232 miles apart. Given
these distances, all markets (of the same grain) are less than a day's drive apart; therefore, a
price differential between markets caused by a shock should dissipate quickly through spatial
trading.
All markets have missing prices, but no more than 10% of the prices are missing from a
single terminal market. Missing observations are imputed using the posterior predictive distribution
within the Bayesian framework. Posterior predictive sampling differs from the sampling in
classical statistics. Posterior predictive sampling is a two-part process. Since Bayesian methods
treat parameters as random variables, the first part of posterior predictive sampling is drawing
parameter values from the posterior distribution given the observed values. The sample of
parameters drawn from the posterior distribution is then used to generate a sample for future
or missing values using the sampling distribution. The algorithm for sampling from the posterior
2. Draw $y^{new} \sim f(y \mid \theta^{w})$
which in turn generates a sample $y^{new}$ from $\int f(y^{new} \mid \theta)\, \pi(\theta \mid y^{obs})\, d\theta$.
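The two-part sampling scheme can be illustrated with a toy model. The sketch below assumes, purely for illustration, normally distributed observations with a known standard deviation and a conjugate normal prior on the mean, so the posterior is available in closed form; the model, function name, and numbers are hypothetical and much simpler than the price models used in this paper.

```python
import random
import statistics


def posterior_predictive_draws(y_obs, sigma, prior_mean, prior_sd, n_draws, seed=0):
    """Two-part posterior predictive sampling for a normal mean with known sigma."""
    rng = random.Random(seed)
    n = len(y_obs)
    # Closed-form normal posterior for the mean parameter.
    post_precision = 1.0 / prior_sd ** 2 + n / sigma ** 2
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_sd ** 2 + sum(y_obs) / sigma ** 2)
    draws = []
    for _ in range(n_draws):
        theta_w = rng.gauss(post_mean, post_var ** 0.5)  # part 1: parameter from the posterior
        draws.append(rng.gauss(theta_w, sigma))          # part 2: new value given that parameter
    return draws


# Hypothetical observed prices (dollars) with one value to impute.
observed = [4.9, 5.1, 5.0]
imputations = posterior_predictive_draws(
    observed, sigma=0.2, prior_mean=5.0, prior_sd=1.0, n_draws=2000, seed=42
)
imputed_mean = statistics.mean(imputations)
```

The spread of `imputations` reflects both parameter uncertainty and sampling noise, which is the practical difference from plugging a single point estimate into the sampling distribution.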
Figure 4.2 shows the price series for each grain market. For the corn markets located in
Barber and Laurinburg, the prices are very similar. The prices from soybean markets located in
Cofield and Fayetteville are also very close together. However, the wheat markets show a very
different story. The Greenville wheat market prices are consistently lower than the prices in the
Statesville wheat market. Note that a consistent price difference does not necessarily indicate
One signal for whether markets are integrated is how many acres were planted. If a crop
is a prominent part of the agricultural economy, its markets are more likely to be integrated.
Figure 4.1 shows the acres planted for corn, soybean, and wheat in North Carolina from 1970
through 2013. Overall, corn production has been decreasing in North Carolina. Since the late
1980s soybean acreage has been higher than the acreage of corn and wheat. Wheat acreage has
historically been low compared to the acreage of corn and soybeans. However, since 2011 the
wheat acreage planted has increased.
4.3 Results
Before studying the results from the SCAR models, we discuss the mean and volatility models
for the prices. Table 4.2 shows the statistics and p-values of the Augmented Dickey-Fuller
(ADF) test for each terminal grain market. Note the null hypothesis of the ADF test is the
presence of a unit-root in the time series. According to the ADF test, all six terminal markets
are unit-root nonstationary. Table 4.3 shows statistics for the maximal eigenvalue LR test used
for testing cointegration. Cointegration tests are performed for markets of the same grain.
For example, we test cointegration between the Greenville and Statesville wheat markets. The
maximal eigenvalue LR test examines whether there are two unit roots for the two time series (m=0)
or one unit root for the two time series (m=1). The test statistics for the pair of
wheat markets, the pair of corn markets, and the pair of soybean markets all have statistically
significant results indicating two unit roots. This result suggests the pairs of markets are not
cointegrated. Despite not being cointegrated, markets are still able to be well-integrated, as
mentioned by McNew and Fackler (1997). Because the time series are not cointegrated, we do not use
the Vector Error-Correction model. However, because all of the time series are nonstationary, the
Vector Autoregressive models are estimated using the price differentials, $\Delta p_{M,t} = p_{M,t} - p_{M,t-1}$,
Estimates for the VAR(1) models for each pair of grain markets as well as the GARCH(1,1)
models for each terminal market are given in Tables 4.4, 4.5, and 4.6. The choice of one lag for the
Vector Autoregressive model is based on AIC selection and likelihood ratio tests. The residuals
of the VAR(1) models indicate heteroskedasticity; hence, GARCH(1,1) models are applied to
each time series of residuals$^4$. The residuals of the VAR(1) models and the conditional standard
deviations derived from the GARCH(1,1) models are used to determine the standardized
The plots of the prior and posterior distributions of $\mu$, $\phi$, and $\sigma^2$ for the SCAR models are
provided in Figures 4.3 - 4.11. Also, summaries of the means and standard deviations of the
posterior distributions are shown in Tables 4.9 - 4.11. The means of $\mu$ for the SCAR models
of the wheat markets are 0.7369, 0.7526, and 0.7400 for the Gaussian, Double Clayton, and
Double Gumbel SCAR models, respectively. In the corn markets there is more variation among
the means of $\mu$ for the SCAR models; the Gaussian SCAR model has a mean of 0.7962, and the
Double Clayton SCAR model and the Double Gumbel SCAR model have means equal to 1.1335
and 1.2320. The means of $\mu$ for the SCAR models of the soybean markets are 1.1230, 0.7386,
and 1.2084 for the Gaussian, Double Clayton, and Double Gumbel SCAR models, respectively.
The posterior mean of $\phi$ indicates the dependence between $\lambda_t$ and $\lambda_{t-1}$ in the SCAR models.
For the wheat market SCAR models, $\phi$ has posterior means of 0.8787, 0.8708, and 0.8796 for the
Gaussian, Double Clayton, and Double Gumbel SCAR models. For the soybean markets the
posterior means of $\phi$ are 0.9134, 0.9545, and 0.9583 for the Gaussian, Double Clayton, and
4
The evidence for heteroskedasticity in the Greenville wheat market is present but weak compared to the
other markets
Double Gumbel SCAR models. Also, the posterior means of $\phi$ for the corn markets are 0.9566,
0.9219, and 0.9538 for the Gaussian, Double Clayton, and Double Gumbel SCAR models.
Using the parameters estimated in the SCAR models, we are able to calculate Kendall's
tau. The medians of the posterior distributions for Kendall's tau coefficient of each SCAR model
variation for the pairs of terminal markets can be found in Figure 4.12. Note that because the
parameter of the SCAR model is time-varying, Kendall's tau also varies across time. For the
corn markets, the values of Kendall's tau across time are approximately 0.4117, 0.7607, and 0.9175 for the
Gaussian, Double Clayton, and Double Gumbel SCAR models, respectively. Kendall's tau for
the Double Gumbel SCAR model has two sharp declines. Large sudden changes in Kendall's
tau for the Double Gumbel SCAR model also occur in the soybean and wheat markets. These
changes are associated with large deviations in the data. For example, the decline of Kendall's
tau in the wheat markets is associated with a day when there was no price change in the
Greenville market but a massive $2.51 spike in the Statesville price. For the soybean markets,
the values of Kendall's tau across time are approximately 0.4561, 0.6276, and 0.9112 for the Gaussian, Double
Clayton, and Double Gumbel copulas, respectively. Finally, the values of Kendall's tau across time for the
wheat markets are 0.4534, 0.7956, and 0.90363 for the Gaussian, Double Clayton, and Double
In order to determine the preferred model for forecasting, we utilize the same method as
Almeida and Czado (2012), which is the continuous ranked probability score (CRPS). The CRPS
is a measure of forecasting ability. We examine the forecast of the difference of the
price differentials between two markets for each SCAR model, defined as $d_t = (\Delta p_{2,t} - \Delta p_{1,t})$.
We choose to examine this difference because the non-linear impulse responses examine the
difference of the price differentials between two markets. Therefore, we define the CRPS of a
$$\widehat{CRPS}(d_l^{(o)}) = \frac{1}{R} \sum_{r=1}^{R} |d_l^{(r)} - d_l^{(o)}| - \frac{1}{2R} \sum_{r=1}^{R} |d_l^{(r)} - \tilde{d}_l^{(r)}|, \qquad (4.13)$$
where $d_l^{(o)}$ is the difference in price differentials that is observed, $d_l^{(r)}$ is the difference in the
simulated price differentials, and $\tilde{d}_l^{(r)}$ is resampled from the simulated price differentials. The
CRPS for the entire forecasted period is $\widehat{CRPS}(d^{(o)}) = \frac{1}{w} \sum_{l=T+1}^{T+w} \widehat{CRPS}(d_l^{(o)})$, where w is the
length of the forecasted period. Lower continuous ranked probability scores indicate better forecasts.
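The sample-based CRPS in equation (4.13) and its average over the forecast window can be sketched as follows. The resampling scheme used here, drawing with replacement from the simulated values, is an assumption about how the resampled term is formed; the function names and inputs are hypothetical.

```python
import random


def crps_sample(observed, simulated, seed=0):
    """Sample-based CRPS for one forecast day from R simulated values."""
    rng = random.Random(seed)
    R = len(simulated)
    # Assumption: the second sample is drawn with replacement from the simulations.
    resampled = [rng.choice(simulated) for _ in range(R)]
    accuracy = sum(abs(s - observed) for s in simulated) / R
    spread = sum(abs(s - s2) for s, s2 in zip(simulated, resampled)) / (2.0 * R)
    return accuracy - spread


def mean_crps(observed_path, simulated_paths, seed=0):
    """Average the daily CRPS over a forecast window of length w."""
    w = len(observed_path)
    total = sum(
        crps_sample(obs, sims, seed=seed + day)
        for day, (obs, sims) in enumerate(zip(observed_path, simulated_paths))
    )
    return total / w


# Hypothetical two-day forecast window with four simulations per day.
score = mean_crps([1.0, 2.0], [[1.5, 1.5, 1.5, 1.5], [2.0, 2.0, 2.0, 2.0]])
```

The first term rewards simulations that land near the observation and the second term credits the forecast for its own spread, so a sharp forecast centered on the wrong value scores worse than a wider one that covers the outcome.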
For the continuous ranked probability scores of the estimated SCAR models, we simulate the
last 100 days of the sample period for each pair of grain markets using posterior predictive
sampling. Table 4.7 shows the continuous ranked probability scores for the SCAR models of the
three crops. The Double Gumbel SCAR model has the lowest CRPS for the wheat markets,
which is 0.3749 compared to 0.6499 for the Gaussian SCAR model and 1.0539 for the Double
Clayton SCAR model. The Double Gumbel SCAR model has the lowest CRPS for the corn
markets, which is -0.0756 compared to -0.0419 for the Gaussian SCAR model and 0.01440 for
the Double Clayton SCAR model. Finally, the Double Clayton SCAR model has the lowest
CRPS for the soybean markets, which is -0.4066 compared to -0.2085 for the Gaussian SCAR
model and -0.3538 for the Double Gumbel SCAR model. The CRPS for the single-parameter
copulas are also estimated. Although the SCAR models do not strictly dominate the single-parameter
copulas, the best forecasts are produced by the Double Gumbel SCAR model for the
wheat and corn markets and the Double Clayton SCAR model for the soybean markets.
The impulse responses are shown in Figures 4.13 - 4.15, and the half-lives for these impulse
responses are given in Table 4.12. An impulse response shows the deviation from equilibrium
after a shock is imposed. The deviations in these impulse responses are the differences in the
price differentials of the pairs of grain markets. The half-life of an impulse response is the
length of time for half of the deviation caused by a shock to be extinguished. In our analysis,
all markets return to equilibrium regardless of the choice of SCAR model within a thirty day
period; however, the amount of time to return to equilibrium varies substantially. For shocks
from the Greenville and Statesville wheat markets, the half-lives of the impulse responses are
between six and seven days for the Gaussian, Double Clayton, and Double Gumbel SCAR
models. The impulse responses for the corn markets are not as consistent across SCAR models.
Depending on the SCAR model, the half-lives of impulse responses for shocks in the Barber and Laurinburg
corn markets are between 1 and 7 days. From the best forecasting model for the corn markets,
the Double Gumbel SCAR model, the half-lives are 6.83 days for a shock from the Barber corn
market and 3.40 days for a shock from the Laurinburg market. Although the impulse responses
from the Gaussian SCAR models indicate a very slow decay back to equilibrium for shocks
from either the Cofield or Fayetteville soybean markets, the Clayton SCAR model and Gumbel
SCAR model are consistent in their results. Note according to the CRPS, the Clayton SCAR
model and Gumbel SCAR model both forecast better than the Gaussian SCAR model for the
soybean markets. The Clayton SCAR model and Gumbel SCAR model show half-lives between
one and two days for deviations in either the Cofield or Fayetteville soybean markets.
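A half-life can be read off an estimated impulse response path by finding when the deviation first falls to half of its initial size. The sketch below uses linear interpolation between days to return a fractional value; the interpolation rule, the function name, and the example path are assumptions made for illustration and may differ from how the reported half-lives were computed.

```python
def half_life(response):
    """Time at which |response| first falls to half of |response[0]|, or None if it never does.

    Linear interpolation between adjacent days gives a fractional half-life.
    """
    if not response or response[0] == 0:
        raise ValueError("response must start with a nonzero deviation")
    target = abs(response[0]) / 2.0
    for k in range(1, len(response)):
        prev, cur = abs(response[k - 1]), abs(response[k])
        if cur <= target:
            # prev is strictly above target here, so prev - cur is positive.
            return (k - 1) + (prev - target) / (prev - cur)
    return None


# Hypothetical impulse response path (deviation by day after the shock).
example_path = [1.0, 0.75, 0.5, 0.25, 0.1]
example_half_life = half_life(example_path)
```

Returning `None` when the path never reaches the halfway point keeps slow-decaying responses, such as those described for the Gaussian SCAR model in the soybean markets, from being reported with a misleading number.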
4.4 Discussion
Despite some crops having better integrated markets than others, ultimately, the Law of One
Price holds in all markets, as indicated by the return to equilibrium seen in all of the impulse
responses. Overall the results agree with economic intuition. Given that the markets investigated
are all close in proximity, we expect these markets to be spatially integrated. The soybean
markets have the shortest half-lives for the impulse responses, according to the Clayton and
Gumbel SCAR models, and the acreage planted with soybeans is higher than the acreage
planted for both corn and wheat. This supports the statement made earlier that the more
widely-produced crops have more integrated markets. The wheat markets return to equilibrium much
more slowly, which is not surprising because wheat is not a dominant feature of North Carolina
agriculture.
The need to investigate grain markets in North Carolina will only grow in the upcoming
years because of the desire to produce more feed grain in North Carolina. North Carolina is
one of the top pork and poultry-producing states. In 2010 North Carolina had the second
highest state inventory of pigs. It also had the second highest total for United States cash
receipts of all poultry and eggs totaling $3.62 billion. This notable production of livestock
requires a considerable amount of feed grain. North Carolina currently imports a substantial
amount of feed grain from the Midwest. The North Carolina Grain Initiative's goal is for North
Carolina to become more self-sufficient in producing feed grain for livestock. One aspect of
the North Carolina Grain Initiative encourages double cropping grain sorghum after wheat
instead of double cropping soybeans after wheat to increase grain production (Piggott, 2013).
The current production of grain sorghum in North Carolina is very sparse. If grain sorghum
production grows in North Carolina, the grain sorghum markets may not be well-integrated
due to the initial thinness of the markets. Also if soybean production decreases, the soybean
4.5 Conclusion
The purpose of this investigation is to provide a method of measuring spatial integration without
the explicit use of threshold regression models. To accomplish this goal, we use Stochastic
Copula Autoregressive models. Our results show that all of the pairs of North Carolina grain
markets examined in this paper are spatially integrated and the Law of One Price does hold.
The time required to recover from a shock strongly depends on whether the markets are thin.
We see that the half-lives of the impulse responses, depending on the choice of SCAR model and
grain, range from just over a day to a little under two weeks.
The investigation of these grain markets will continue. This paper looks only at pair-wise
relationships among terminal markets. In future research, we hope to move beyond this pair-wise
construction and analyze three or more grain markets at a time. Also, as the North Carolina
Grain Initiative grows, we wish to investigate its effect on the spatial integration of markets.
If grain sorghum replaces some of the soybean production, one would expect soybean markets
information. If information in terminal markets is asymmetric, the markets will not be
well-integrated. Therefore, by tracking whether or not markets are integrated, we may help market
4.6 Tables and Figures
[Figure: Acres (Millions) for Corn, Soybean, and Wheat.]
[Figure: three panels of prices (Dollars) by Year for the terminal markets Greenville, Statesville, Cofield, Fayetteville, Barber, and Laurinburg.]
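Table 4.1: Kendall's tau coefficient
[Table: Kendall's tau as a function of the copula parameter $\theta_t$, with one column for $\theta_t \geq 0$ and one for $\theta_t < 0$, and rows for the Gaussian, Double Clayton, and Double Gumbel copulas. The Gaussian entry is $\frac{2}{\pi}\sin^{-1}(\theta_t)$ in both columns; the Double Clayton and Double Gumbel entries are not recoverable.]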
Table 4.2: Results of the Augmented Dickey-Fuller Test conducted on the price series from each terminal grain market.
r=0 r=1
Wheat 53.41 3.09
Corn 36.03 1.61
Soybean 150.15 1.82
Table 4.4: VAR(1) and GARCH(1,1) estimates and standard errors for the wheat markets. $p_{WG,t}$ and
$p_{WS,t}$ are the price differentials for the wheat markets in Greenville and Statesville, respectively, on day $t$.
An estimate with asterisks *, **, or *** indicates statistical significance at the $\alpha = .10$, $\alpha = .05$, and $\alpha = .01$
levels, respectively.
Table 4.5: VAR(1) and GARCH(1,1) estimates and standard errors for the corn markets. $p_{CB,t}$ and $p_{CL,t}$
are the price differentials for the corn markets in Barber and Laurinburg, respectively, on day $t$. An estimate
with asterisks *, **, or *** indicates statistical significance at the $\alpha = .10$, $\alpha = .05$, and $\alpha = .01$ levels, respectively.
Table 4.6: VAR(1) and GARCH(1,1) estimates and standard errors for the soybean markets. $p_{SC,t}$ and
$p_{SF,t}$ are the price differentials for the soybean markets in Cofield and Fayetteville, respectively, on day $t$.
An estimate with asterisks *, **, or *** indicates statistical significance at the $\alpha = .10$, $\alpha = .05$, and $\alpha = .01$
levels, respectively.
Table 4.7: CRPS
Table 4.9: Parameter Estimates for SCAR models of Wheat Markets
[Figures: prior and posterior density plots, three panels per figure (kernel density estimates); captions not recovered.]
[Figure: Kendall's Tau for the Double Clayton, Double Gumbel, and Gaussian models (three panels).]
[Figures: Dollars against Days (0 to 30) for the Double Clayton, Double Gumbel, and Gaussian models.]
Table 4.12: Half-Lives of Impulse Responses. The half-life is the time (in days) that it takes
for the deviation to dissipate to half of its distance from the equilibrium.
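The half-life definition in the caption of Table 4.12 can be illustrated with a short sketch. The function name `half_life`, the day-level rule (no interpolation between days), and the geometric-decay example path are assumptions made only for illustration; they are not the procedure or the estimates behind the table.

```python
def half_life(path):
    """Return the first day k at which |path[k]| <= half of |path[0]|.

    `path` holds the deviations from equilibrium, with path[0] the
    deviation on the day of the shock. Returns None if the deviation
    never falls to half of its initial size within the path.
    """
    target = 0.5 * abs(path[0])
    for day, deviation in enumerate(path):
        if day > 0 and abs(deviation) <= target:
            return day
    return None


# Hypothetical impulse response: a 0.08 dollar deviation that shrinks
# by 20 percent per day (illustrative numbers only).
example_path = [0.08 * 0.8 ** k for k in range(31)]
print(half_life(example_path))  # 4, because 0.8**3 > 0.5 >= 0.8**4
```

A path that never closes half of the initial gap within the simulated horizon returns `None`, which signals that a longer horizon would be needed.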
Chapter 5
Conclusion
For this dissertation, we examine economic questions related to spatial dependence. In Chapter
2 and Chapter 3, we investigate the issue of systemic risk in two federal programs, the National
Flood Insurance Program and the federal crop insurance program, through spatial modeling.
Both analyses show the importance of incorporating spatial dependencies when systemic risk
is present.
In Chapter 2, we calculate the actuarially fair premiums for flood insurance policies in
Florida. Three single hurdle models are considered for the estimation of the annual count of
flood insurance claims in each county of Florida. Parameters of the models are allowed to
spatially vary with the counties of Florida. Although the Single Hurdle Poisson model does
not give the best predictions on a yearly basis, the average prediction of this model over the
sample period is closer to the historical average than those of the other single hurdle models and
the current rating of the NFIP. Although the loss-cost ratios (LCRs) from our estimation are higher
than one, finer resolution data on historical indemnity payments as well as data on home values
will make it possible to better estimate indemnity payments. This will lead to estimated LCRs
closer to one. Because flood maps cannot be updated frequently, less costly complementary methods,
such as the predictions from the Single Hurdle Poisson model, should be explored and used in
conjunction with the flood maps or applied at a finer resolution as a stand-alone method.
Chapter 3 examines spatial dependencies of Kansas winter wheat yields and applies the
findings to the estimation of crop insurance premium rates and payouts for the Agricultural
Risk Coverage program. Our analysis shows that the best fit for county yields allows the spatial
dependencies among the counties to change with the value of the yields. The spatial dependency
changes if the yield is below a certain percentage of the median yield. The optimal percentage of
the median yield is 108% for dryland wheat and 102.5% for irrigated wheat. Therefore we find
that within our framework county yields are best described using not only spatial intercepts,
but also a secondary spatial covariate for yields that are under 108% of the median
for dryland wheat and 102.5% of the median for irrigated wheat. When compared to a model
that assumes no correlation between yields, we see the dryland wheat premium ratings for
different coverage levels are more consistent. Therefore, by including spatial dependencies in
Unlike the previous two chapters, Chapter 4 does not analyze risk management issues. Instead,
this chapter focuses on the spatial integration of six grain markets located in North
Carolina. The purpose of this investigation is to provide a method of measuring spatial
integration without the explicit use of threshold regression models. To accomplish this goal, we use
Stochastic Copula Autoregressive (SCAR) models. Despite some crops having better integrated
markets than others, the Law of One Price ultimately holds in all markets, as indicated by the
return to equilibrium seen in all impulse responses. Overall, the results agree with basic intuition.
Given that the markets investigated are all a relatively short driving distance apart, a maximum of
four hours, we expect these markets to be spatially integrated. The soybean markets have
the shortest half-lives for the impulse responses, according to the Clayton and Gumbel SCAR
models, and the quantity of acres planted with soybeans is higher than the quantities of
acres planted with corn and wheat. The wheat markets return to equilibrium much more slowly,
which is not surprising because wheat is not a dominant feature of North Carolina agriculture.
There are plans for future research based on each of these essays. For the National Flood
Insurance Program, further research will utilize these single hurdle models at a finer spatial
resolution within a given county, preferably at the ZIP code or census tract level. This will allow
us to explore socio-economic issues within the flood insurance program. We would also like to
include information about the Community Rating System (CRS) to determine the effectiveness
of CRS and its impact on market penetration. For agricultural risk management, because the majority
of crop insurance policies have guarantees based on the production of individual producers
instead of county-level production, we plan to apply the models used in Chapter 3 to yields of
individual producers. We also plan to further compare expected payouts of new programs, such
as the ARC program, to direct payments, counter-cyclical payments, and the ACRE program.
Finally, for the topics addressed in Chapter 4, future research will analyze the effect of
the new North Carolina Grain Initiative. If information in terminal markets is asymmetric, the
markets will not be well-integrated. Therefore, by tracking whether or not markets are integrated,
we can help market players have the full information to make decisions as grain production in
REFERENCES
[2] C. Almeida and C. Czado. Efficient Bayesian inference for stochastic time-varying copula
models. Computational Statistics & Data Analysis, 56(6):1511-1527, 2012.
[3] P.G. Ardeni. Does the law of one price really hold for commodity prices? American Journal
of Agricultural Economics, 71(3):661-669, 1989.
[4] B.A. Babcock, C.E. Hart, and D.J. Hayes. Actuarial fairness of crop insurance rates with
constant rate relativities. American Journal of Agricultural Economics, 86(3):563-575,
2004.
[5] K. Balcombe, A. Bailey, and J. Brooks. Threshold effects in price transmission: the case
of Brazilian wheat, maize, and soya prices. American Journal of Agricultural Economics,
89(2):308-323, 2007.
[6] S. Banerjee, A. Gelfand, and B. Carlin. Hierarchical Modeling and Analysis for Spatial
Data. Chapman and Hall/CRC, 1st edition, 2003.
[7] C.B. Barrett. Measuring integration and efficiency in international agricultural markets.
Review of Agricultural Economics, 23(1):19-32, 2001.
[8] C.B. Barrett and J.R. Li. Distinguishing between equilibrium and integration in spatial
price analysis. American Journal of Agricultural Economics, 84(2):292-307, 2002.
[9] E.S. Blake and E.J. Gibney. The deadliest, costliest, and most intense United States tropical
cyclones from 1851 to 2010 (and other frequently requested hurricane facts). August 2011.
[11] Arthur Charpentier, Jean-David Fermanian, and Olivier Scaillet. The estimation of
copulas: Theory and practice. Copulas: From theory to application in finance, pages 35-60,
2007.
[12] J. Chivers and N.E. Flores. Market failure in information: The National Flood Insurance
Program. Land Economics, 78(4):515-521, 2002.
[13] K.H. Coble, T.O. Knight, B.K. Goodwin, M.F. Miller, and R.M. Rejesus. A comprehensive
review of the RMA APH and COMBO rating methodology. March 15, 2010.
[14] Congressional Budget Office. The National Flood Insurance Program: Factors affecting
actuarial soundness. 2009.
[15] D. Cooley, N. Douglas, and N. Philippe. Bayesian spatial modeling of extreme precipitation
return levels. Journal of the American Statistical Association, 136:824-840, 2007.
[16] D. Diers, M. Eling, and S. Marek. Dependence modeling in non-life insurance using the
Bernstein copula. Insurance: Mathematics and Economics, 50:430-436, 2012.
[17] R. Doman and M. Doman. Copula based impulse response analysis of linkages between
stock markets. Available at SSRN 1615108, 2010.
[18] P.L. Fackler and B.K. Goodwin. Spatial price analysis. Handbook of Agricultural Economics,
1:971-1024, 2001.
[19] M.D. Faminow and B.L. Benson. Integration of spatial markets. American Journal of
Agricultural Economics, 72(1):49-62, 1990.
[20] Federal Emergency Management Agency. National Flood Insurance Program: Program
description. August 2002.
[21] Federal Emergency Management Agency. Biggert-Waters Flood Insurance Reform Act of
2012, December 2012. http://www.fema.gov/flood-insurance-reform-act-2012.
[22] M.J. Fischer and I. Klein. Some results on weak and strong tail dependence coefficients for
means of copulas. Technical report, Diskussionspapiere, Friedrich-Alexander-Universität
Erlangen-Nürnberg, Lehrstuhl für Statistik und Ökonometrie, 2007.
[23] A.E. Gelfand, H.J. Kim, C.F. Sirmans, and S. Banerjee. Spatial modeling with spatially
varying coefficient processes. Journal of the American Statistical Association,
98(462):387-396, 2003.
[24] A.E. Gelfand and A. Smith. Sampling-based approaches to calculating marginal densities.
Journal of the American Statistical Association, 85(410):398-409, 1990.
[25] Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin. Bayesian Data Analysis.
Texts in Statistical Science Series, 2004.
[26] C. Genest, J. Neslehova, and N. Ben Ghorbal. Estimators based on Kendall's tau in
multivariate copula models. Australian & New Zealand Journal of Statistics, 53(2):157-177,
2011.
[27] W.R. Gilks, S. Richardson, and D.J. Spiegelhalter, editors. Markov Chain Monte Carlo in
Practice: Interdisciplinary Statistics. Chapman and Hall/CRC, 1996.
[28] J.W. Glauber. Crop insurance reconsidered. American Journal of Agricultural Economics,
86(5):1179-1195, 2004.
[29] T. Gneiting and A.E. Raftery. Strictly proper scoring rules, prediction, and estimation.
Journal of the American Statistical Association, 102(477):359-378, 2007.
[30] B.K. Goodwin. Problems with market insurance in agriculture. American Journal of
Agricultural Economics, 2001.
[31] B.K. Goodwin and A.P. Ker. Nonparametric estimation of crop yield distributions:
implications for rating group-risk crop insurance contracts. American Journal of Agricultural
Economics, 80(1):139-153, 1998.
[32] B.K. Goodwin and N.E. Piggott. Spatial market integration in the presence of threshold
effects. American Journal of Agricultural Economics, 83(2):302-317, 2001.
[33] B.K. Goodwin and T.C. Schroeder. Cointegration tests and spatial price linkages in regional
cattle markets. American Journal of Agricultural Economics, 73(2):452-464, 1991.
[34] B.K. Goodwin and V. Smith. What harm is done by subsidizing crop insurance? American
Journal of Agricultural Economics, 95(2):489-497, 2013.
[35] Government Accountability Office. FEMA: Action needed to improve administration of the
National Flood Insurance Program. Rep. No. 11-297. 2011.
[36] Government Accountability Office. National Flood Insurance Program: Continued attention
needed to address challenges. (GAO-13-858T), September 2013.
[37] F. Guzzetti, C.P. Stark, and P. Salvati. Evaluation of flood and landslide risk to the
population of Italy. Environmental Management, 36(1):15-36, 2005.
[38] J. Ifft, C. Nickerson, T. Kuethe, and C. You. Potential farm-level effects of eliminating
direct payments, November 2012.
[39] D.M. Jaffee and T. Russell. Imperfect information, uncertainty, and credit rationing. The
Quarterly Journal of Economics, 90(4):651-666, 1976.
[40] H. Joe. Multivariate models and multivariate dependence concepts, volume 73. CRC Press,
1997.
[42] N.L. Johnson, A.W. Kemp, and S. Kotz, editors. Univariate Discrete Distributions. Wiley,
3rd edition, 2005.
[43] C. Kousky and R. Cooke. The unholy trinity: Fat tails, tail dependence, and
micro-correlations. 2009a. Issue Brief 12-08. Resources for the Future.
[44] C. Kousky and R. Cooke. Climate change and risk management: Challenges for insurance,
adaptation, and loss estimation. 2009b. Issue Brief 12-08. Resources for the Future.
[45] C. Kousky and E. Michel-Kerjan. Hurricane Sandy, storm surge, and the National Flood
Insurance Program: A primer on New York and New Jersey. 2012. Resources for the Future.
[46] D. Lunn, D. Spiegelhalter, A. Thomas, and N. Best. The BUGS project: Evolution, critique,
and future directions. Statistics in Medicine, 28:3049-3067, 2009.
[47] K. McNew and P.L. Fackler. Testing market equilibrium: is cointegration informative?
Journal of Agricultural & Resource Economics, 22(2), 1997.
[48] E.O. Michel-Kerjan. Catastrophe economics: The National Flood Insurance Program.
Journal of Economic Perspectives, 24:165-186, 2010.
[49] M.J. Miranda and J.W. Glauber. Systemic risk, reinsurance, and the failure of crop
insurance markets. American Journal of Agricultural Economics, 79(1):206-215, 1997.
[50] National Agricultural Statistics Service. Quick Stats, March 2014.
http://quickstats.nass.usda.gov/.
[53] A. O'Hagan, J. Forster, and M.G. Kendall. Bayesian Inference. Arnold London, 2004.
[54] A. Panagiotelis and M. Smith. Bayesian density forecasting of intraday electricity prices
using multivariate skew t distributions. International Journal of Forecasting,
24(4):710-727, 2008.
[55] D. Pfeifer, D. Straßburger, and J. Phillipps. Modelling and simulation of dependence
structures in nonlife insurance with Bernstein copulas. International ASTIN Colloquium, 2012.
[57] M.D. Porter and G. White. Self-exciting hurdle models for terrorist activity. The Annals
of Applied Statistics, 6(1):106-124, 2012.
[59] A.B. Richard and R.W. Allan. maps: Draw Geographical Maps, 2012. R package version
2.3-0.
[60] Risk Management Agency. History of the crop insurance program, 2014.
http://www.rma.usda.gov/aboutrma/what/history.html.
[62] P.S. Sephton. Spatial market arbitrage and threshold cointegration. American Journal of
Agricultural Economics, 85(4):1041-1046, 2003.
[63] T. Serra and B.K. Goodwin. Price transmission and asymmetric adjustment in the Spanish
dairy sector. Applied Economics, 35(18):1889-1899, 2003.
[64] R.J. Sexton, C.L. Kling, and H.F. Carman. Market integration, efficiency of arbitrage,
and imperfect competition: methodology and application to US celery. American Journal
of Agricultural Economics, 73(3):568-580, 1991.
[65] J.S. Shonkwiler and D.W. Shaw. Hurdle count-data models in recreation demand analysis.
Journal of Agricultural and Resource Economics, 21(2):210-219, 1996.
[66] J.R. Skees and B.J. Barnett. Conceptual and practical considerations for sharing
catastrophic/systemic risks. Review of Agricultural Economics, 21(2):424-441, 1999.
[67] M. Sklar. Fonctions de répartition à n dimensions et leurs marges. Université Paris 8,
1959.
[68] S. Sturtz, U. Ligges, and A. Gelman. R2WinBUGS: A package for running WinBUGS from R.
Journal of Statistical Software, 12(3), 2005.
[69] C.P. Timmer. The corn economy of Indonesia. Cornell University Press, 1987.
[70] James Tobin. Estimation of relationships for limited dependent variables. Econometrica:
Journal of the Econometric Society, pages 24-36, 1958.
[71] W.G. Tomek. Price behavior on a declining terminal market. American Journal of
Agricultural Economics, 62(3):434-444, 1980.
[72] R.S. Tsay. Analysis of financial time series, volume 543. John Wiley & Sons, 2005.
[73] United States Army Corps of Engineers. Levee inspection, 2013.
http://www.usace.army.mil/Missions/CivilWorks/LeveeSafetyProgram/LeveeInspections.aspx.
[74] United States Army Corps of Engineers. National levee database, 2013.
http://nld.usace.army.mil/egis/f?p=471:1:.
[76] M. Wall. A close look at the spatial structure implied by the CAR and SAR models. Journal
of Statistical Planning and Inference, 121(2):311-324, 2004.
[77] J.D. Woodard, G.D. Schnitkey, B.J. Sherrick, N. Lozano-Gracia, and L. Anselin. A spatial
econometric analysis of loss experience in the U.S. crop insurance program. The Journal of
Risk and Insurance, 79(1):261-285, 2012.
APPENDICES
Appendix A
The following definition of a Conditional Autoregressive (CAR) model comes from Wall
(2004). Suppose there is a fixed set of regions $\{R_1, R_2, \ldots, R_N\}$ and an indexing set $D$. If
the set $\{R_1, R_2, \ldots, R_N\}$ is a partition of $D$ such that $R_1 \cup R_2 \cup \ldots \cup R_N = D$, the set
[...] Gaussian random process. Then the data can be modeled as a CAR model, as discussed by Wall
(2004),
\[
Z(R_i) \mid Z(R_{(-i)}) \sim N\Big( \mu_i + \sum_{j=1}^{N} a_{ij} \big( Z(R_j) - \mu_j \big),\ \tau_i^2 \Big), \tag{A.1}
\]
where $Z(R_{(-i)}) = \{Z(R_j) : j \neq i\}$, $E(Z(R_j)) = \mu_j$, $\tau_i^2$ is the conditional variance, and $a_{ij}$ are
constants for $i = 1, \ldots, N$. The elements $a_{ij}$ form a matrix $A$ such that $A = \rho W$, where $\rho$ is a
scalar and $W$ is the neighborhood matrix with its elements $w_{ij}$ defined as follows:
\[
w_{ij} =
\begin{cases}
1 & \text{if regions } i \text{ and } j \text{ are neighbors, denoted by } i \sim j \\
0 & \text{if regions } i \text{ and } j \text{ are not neighbors or } i = j
\end{cases}
\tag{A.2}
\]
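As a minimal sketch of the neighborhood matrix and of the conditional mean in the CAR definition, the following builds $W$ from an adjacency list and evaluates the mean of region $i$ given the other regions with $A = \rho W$. The function names, the three-region layout, and all numerical values are hypothetical and chosen only for illustration; regions are indexed from 0 in the code rather than from 1.

```python
def neighborhood_matrix(neighbors, n):
    """Build the symmetric 0/1 matrix W: w[i][j] = 1 if i and j are
    neighbors, and 0 if they are not neighbors or if i == j."""
    w = [[0] * n for _ in range(n)]
    for i, adjacent in neighbors.items():
        for j in adjacent:
            if i != j:
                w[i][j] = 1
                w[j][i] = 1
    return w


def car_conditional_mean(i, z, mu, w, rho):
    """Conditional mean of region i given all other regions:
    mu_i + sum_j rho * w_ij * (z_j - mu_j)."""
    return mu[i] + sum(rho * w[i][j] * (z[j] - mu[j]) for j in range(len(z)))


# Hypothetical example: three regions in a row (0 - 1 - 2).
neighbors = {0: [1], 1: [0, 2], 2: [1]}
w = neighborhood_matrix(neighbors, 3)
z = [1.0, 5.0, 3.0]    # illustrative observed values
mu = [0.0, 0.0, 0.0]   # illustrative means
print(w)                                          # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(car_conditional_mean(1, z, mu, w, 0.25))    # 1.0
```

Because the diagonal of $W$ is zero, a region's own value never enters its conditional mean; only its neighbors' deviations from their means do.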
Appendix B
1. Spatial Intercept (CAR model): In OpenBUGS, we only set the prior for the precision,
tau, the inverse of the variance. For this we use Gamma(0.1, 0.1).
B.3 Modeling Indemnity Payments
1. Spatial Intercepts (CAR Model): In OpenBUGS, we only set the prior for the precision,
tau, the inverse of the variance. For this we use Gamma(0.1, 0.1).
2. log(Number of Claims + 1): the coefficient with subscript $i$ is i.i.d. $N(0, 10000)$
3. Coast: the coefficient with subscript $1,i$ is i.i.d. $N(0, 10000)$
4. Minimum Elevation: the coefficient with subscript $2,i$ is i.i.d. $N(0, 10000)$
Appendix C
The literature exploring copula modeling has grown dramatically in the last 15 years. The
backbone of copula modeling stems from Sklar (1959), which states that any joint distribution
can be expressed in terms of its marginal distributions and a copula, and that the copula is
unique if the marginal distributions are continuous. From this theorem, one is able to build a
variety of joint distributions. The joint density of the $d$-dimensional distribution can be written
as
\[
f(u) = c(F_1(u_1), \ldots, F_d(u_d)) \prod_{i=1}^{d} f_i(u_i) \tag{C.1}
\]
\[
c(u) = \frac{\partial^d C(u_1, \ldots, u_d)}{\partial u_1 \cdots \partial u_d}, \tag{C.2}
\]
where $C(u_1, \ldots, u_d)$ denotes the copula function, i.e., the joint distribution of $(u_1, \ldots, u_d)$, and
$u_i \sim U[0, 1]$ for $i = 1, \ldots, d$.
Nelsen (2006) provides an excellent overview of copula modeling. The most popular copulas
include the elliptical and Archimedean families. Elliptical copulas consist of elliptical
distributions, such as the Gaussian distribution and the Student's t distribution. The Gaussian copula
is
\[
C(u) = \Phi_{\Sigma}\big( \Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_d) \big), \tag{C.3}
\]
where $\Phi^{-1}$ is the inverse cumulative distribution of the standard normal distribution, $u_i \in [0, 1]$
for $i = 1, \ldots, d$, and $\Phi_{\Sigma}$ denotes the joint cumulative distribution with the correlation
matrix $\Sigma$.
Archimedean copulas are characterized by a single generator function and are of the form
\[
C(u) = \varphi\big( \varphi^{-1}(u_1, \theta) + \ldots + \varphi^{-1}(u_d, \theta), \theta \big), \tag{C.4}
\]
where $\varphi(\cdot)$ is the generator function and $\theta$ is the associated parameter. Table C.1 shows the
There are several measures of comovement often used when evaluating copula models. Aside
from the traditional measure of linear correlation, Pearson's correlation coefficient, Kendall's
$\tau$ and Spearman's $\rho$ are rank correlation coefficients that are commonly used in the copula
literature. As discussed by Genest et al. (2011), the inverse of Kendall's tau is sometimes used
to determine the parameter estimates in Archimedean copulas. Tail dependence measures the
comovement of two variables at the extreme regions of the distribution (Fischer and Klein,
2007). The measure of tail dependence is defined by the copula function, not the marginal
distributions within the copula. There are separate definitions for the lower and upper tail
dependence coefficients. If we define a copula as $F_{X,Y}(x, y) = C(F_X(x), F_Y(y))$, then the lower
tail dependence coefficient is defined as
\[
\lambda_L = \lim_{u \to 0^+} P\big( Y \leq F_Y^{-1}(u) \mid X \leq F_X^{-1}(u) \big) = \lim_{u \to 0^+} \frac{C(u, u)}{u}, \tag{C.5}
\]
while the upper tail dependence coefficient is defined as
\[
\lambda_U = \lim_{u \to 1^-} P\big( Y > F_Y^{-1}(u) \mid X > F_X^{-1}(u) \big) = \lim_{u \to 1^-} \frac{1 - 2u + C(u, u)}{1 - u} \in [0, 1]. \tag{C.6}
\]
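The inversion of Kendall's tau and the two tail dependence limits can be checked numerically for one Archimedean family. The sketch below uses the standard bivariate Clayton copula, $C(u, v) = (u^{-\theta} + v^{-\theta} - 1)^{-1/\theta}$ with $\theta > 0$, for which $\tau = \theta / (\theta + 2)$, $\lambda_L = 2^{-1/\theta}$, and $\lambda_U = 0$. It is the plain Clayton copula, not the Double Clayton copula used in Chapter 4, and the function names and the value $\tau = 0.5$ are hypothetical choices for illustration.

```python
def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2) for the Clayton copula (0 < tau < 1)."""
    return 2.0 * tau / (1.0 - tau)


def clayton_cdf(u, v, theta):
    """Bivariate Clayton copula C(u, v) for theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def lower_tail_ratio(theta, u=1e-4):
    """C(u, u) / u evaluated at a small u, approximating the lower limit."""
    return clayton_cdf(u, u, theta) / u


def upper_tail_ratio(theta, u=1.0 - 1e-4):
    """(1 - 2u + C(u, u)) / (1 - u) evaluated near u = 1, approximating the upper limit."""
    return (1.0 - 2.0 * u + clayton_cdf(u, u, theta)) / (1.0 - u)


theta = clayton_theta_from_tau(0.5)   # illustrative value of Kendall's tau
print(theta)                          # 2.0
print(lower_tail_ratio(theta))        # close to 2 ** (-1 / theta)
print(2.0 ** (-1.0 / theta))
print(upper_tail_ratio(theta))        # close to zero
```

The numerical ratios only approximate the limits; moving `u` closer to 0 or 1 tightens the approximation until floating-point cancellation starts to dominate.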
Appendix D
The two subsections below, "Initializing the Shock for an Impulse Response" and "Time Path for
Impulse Response After the Shock is Implemented," outline one single iteration. After
many iterations, such as 100,000 iterations, the mean is taken to obtain E[Dt+k |t = dt +
1. Random draws are made from the posterior distributions of the three parameters of the
latent variable at time $t$.
2. The negative shock is introduced through the error term of the latent variable at time $t$.
4. Randomly draw a bivariate vector, $v_t$, from the copula with the copula parameter at time $t$. Then $z_t =$
5. Using the conditional variances from the GARCH(1,1) models, obtain the unstandardized
6. The standardized residuals are then used in the VAR(1) model to obtain the predicted
price differentials $p_{1,t}$ and $p_{2,t}$. For this first iteration, $p_{i,t-1} = 0$ for market $i = 1, 2$
7. We obtain the difference between these two price differentials, $d_t$, such that $d_t = p_{2,t} - p_{1,t}$.
If $d_t > 0$, then the price differential is higher for Market 2 than for Market 1;
therefore, the shock has a relatively positive impact on Market 2. The shock is categorized
as a shock on Market 2. If $d_t < 0$, then the price differential is higher for Market 1
than for Market 2; therefore, the shock has a relatively positive impact on Market 1.
D.1.2 Time Path for Impulse Response After the Shock is Implemented
Depending on the outcome of the initialized shock the following will either be the time path
1. Random draws are made from the posterior distributions of the three parameters of the
latent variable at time $t+1$.
4. Randomly draw a bivariate vector, $v_{t+1}$, from the copula with the copula parameter at time $t$. Then $z_t =$
5. Using the conditional variances from the GARCH(1,1) models, obtain the unstandardized
residuals, $u_{i,t+1} = z_{i,t+1} \sigma_{i,t+1}$. Note that the GARCH(1,1) model uses $u_{i,t}$ and $\sigma_{i,t}$.
6. The standardized residuals are then used in the VAR(1) model to obtain the predicted
price differentials $p_{1,t+1}$ and $p_{2,t+1}$. Note that this iteration uses the price differentials from
time $t$.
7. We obtain the difference between these two price differentials, $d_{t+1}$
8. Repeat these steps until the chosen end of the time path. For this paper, we repeat this
The same process used as in obtaining E[Dt+k |t = dt + , Dt1 = dt1 , . . .] is used here. The
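The iteration outlined in this appendix can be sketched in code under strong simplifying assumptions. The sketch below is not the procedure used in the dissertation: fixed, hypothetical parameter values stand in for posterior draws; the latent variable is assumed to follow an AR(1) process; a Gaussian copula with a hyperbolic tangent link from the latent variable to the correlation is assumed in place of the step that is missing from the list; and the function names are invented for illustration. The comparison of a shocked mean path with an unshocked one at the end is likewise an assumption about how the two expectations are combined.

```python
import math
import random


def simulate_path(shock, horizon, rng):
    """Simulate one path of d = p2 - p1. Every parameter value is hypothetical."""
    mu, phi, sigma = 0.5, 0.9, 0.1           # assumed AR(1) latent process
    omega, arch, garch = 0.01, 0.05, 0.90    # assumed GARCH(1,1), same for both markets
    var_coef = [[0.3, 0.1], [0.1, 0.3]]      # assumed VAR(1) coefficient matrix
    latent = mu
    cond_var = [omega / (1.0 - arch - garch)] * 2   # unconditional variance as start
    resid = [0.0, 0.0]
    price = [0.0, 0.0]                       # price differentials start at zero
    path = []
    for k in range(horizon):
        # Steps 1-2: latent error term, with the shock added in the first period only.
        eps = rng.gauss(0.0, 1.0) + (shock if k == 0 else 0.0)
        latent = mu + phi * (latent - mu) + sigma * eps
        rho = math.tanh(latent)              # assumed link to a correlation in (-1, 1)
        # Step 4: a Gaussian copula draw mapped back to normal scores is
        # equivalent to drawing correlated standard normals directly.
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z = [e1, rho * e1 + math.sqrt(1.0 - rho * rho) * e2]
        # Step 5: GARCH(1,1) variance update, then unstandardized residuals.
        cond_var = [omega + arch * resid[i] ** 2 + garch * cond_var[i] for i in range(2)]
        resid = [z[i] * math.sqrt(cond_var[i]) for i in range(2)]
        # Step 6: VAR(1) update of the two price differentials.
        price = [var_coef[i][0] * price[0] + var_coef[i][1] * price[1] + resid[i]
                 for i in range(2)]
        # Step 7: difference between the two price differentials.
        path.append(price[1] - price[0])
    return path


def mean_path_market2(shock, horizon=30, n_iter=2000, seed=1):
    """Average the paths whose initial difference is positive (shock on Market 2)."""
    rng = random.Random(seed)
    total = [0.0] * horizon
    count = 0
    for _ in range(n_iter):
        path = simulate_path(shock, horizon, rng)
        if path[0] > 0:
            total = [t + d for t, d in zip(total, path)]
            count += 1
    return [t / count for t in total]


shocked = mean_path_market2(shock=-5.0)    # negative shock to the latent error
baseline = mean_path_market2(shock=0.0)    # same process without the shock
response = [s - b for s, b in zip(shocked, baseline)]
print([round(r, 3) for r in response[:5]])
```

In this toy setup the negative shock lowers the dependence between the two markets in the first period, so the mean initial gap between the price differentials is wider in the shocked runs than in the baseline runs.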