www.elsevier.com/locate/aap
Abstract
The Negative Binomial modeling technique was used to model the frequency of accident occurrence and involvement. Accident
data over a period of 3 years, accounting for 1606 accidents on a principal arterial in Central Florida, were used to estimate the
model. The model illustrated the significance of the Annual Average Daily Traffic (AADT), degree of horizontal curvature, lane,
shoulder and median widths, urban/rural classification, and section length for the frequency of accident occurrence. Several Negative
Binomial models of the frequency of accident involvement were also developed to account for the demographic characteristics of
the driver (age and gender). The results showed that heavy traffic volume, speeding, narrow lane width, larger number of lanes,
urban roadway sections, narrow shoulder width and reduced median width increase the likelihood of accident involvement.
Subsequent elasticity computations identified the relative importance of the variables included in the models. Female drivers
experience more accidents than male drivers in heavy traffic volume, reduced median width, narrow lane width, and larger number
of lanes. Male drivers have a greater tendency to be involved in traffic accidents while speeding. The models also indicated that
young and older drivers experience more accidents than middle-aged drivers in heavy traffic volume, and with reduced shoulder and
median widths. Younger drivers have a greater tendency to be involved in accidents on roadway curves and while speeding.
© 2000 Elsevier Science Ltd. All rights reserved.
Keywords: Accident occurrence; Accident involvement; Negative Binomial models; Roadway geometric characteristics; Driver characteristics;
Traffic safety
1. Introduction
Researchers have attempted three approaches to relate accidents to geometric characteristics and traffic-related
explanatory variables: multiple linear regression, Poisson regression and Negative Binomial regression. Recent research
shows, however, that multiple linear regression suffers from some undesirable statistical properties when applied to
accident analysis, some of which have been discussed by Jovanis and Chang (1986). To overcome the problems associated
with multiple linear regression models, Jovanis and Chang proposed Poisson regression for modeling accident
frequencies. They argued that Poisson regression is a superior alternative to conventional linear regression
for applications related to highway safety. In addition, it could be used with generally smaller sample sizes
than linear regression.
Joshua and Garber (1990) studied the relationship between highway geometric factors and truck accidents
in Virginia using both linear and Poisson regression models. They also concluded that the linear regression
techniques used in their research did not adequately describe the relationship between truck accidents and
the independent variables, but that the Poisson models did.
0001-4575/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0001-4575(99)00094-9
M.A. Abdel-Aty, A.E. Radwan / Accident Analysis and Prevention 32 (2000) 633–642
Miaou et al. (1992) used a Poisson regression model to establish the empirical relationship between truck
accidents and highway geometric design on a rural interstate in North Carolina. The estimated Poisson model
suggested that Annual Average Daily Traffic (AADT) per lane, horizontal curvature, and vertical gradient were
significantly correlated with truck accident likelihood. During their work, a limitation of the Poisson model
was uncovered: the model requires that the mean and variance of the accident frequency variable (the dependent
variable) be equal. In most accident data, the variance of the accident frequency exceeds the mean, in which
case the data are said to be overdispersed. They noted that, although overdispersion was present, it did not
change the conclusions about the relationship between truck accidents and the examined traffic and highway
geometric design variables. However, they did suggest a correction to overcome the problem of overdispersion.
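The equality of mean and variance required by the Poisson model can be checked directly on the counts before any model is fitted. The following minimal Python sketch computes the variance-to-mean ratio. The counts are hypothetical (they are not data from any of the studies cited), and the function name `dispersion_ratio` is introduced here for illustration only. A ratio well above 1 points to overdispersion.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Return the sample variance divided by the sample mean of the counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero, so the ratio is undefined")
    return variance(counts) / m


# Hypothetical accident counts per road section (illustration only).
counts = [0, 0, 1, 0, 3, 7, 0, 2, 12, 1]
ratio = dispersion_ratio(counts)
print(f"mean={mean(counts):.2f}, variance={variance(counts):.2f}, ratio={ratio:.2f}")
```

A ratio close to 1 is what the Poisson assumption implies; this simple check ignores covariates, so it is only a first screening step.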
A follow-up study was completed by Miaou and Lum (1993). While this study was similar in scope to
the first, its main purpose was to evaluate the statistical properties of two conventional linear regression
models and two Poisson regression models. The models studied by Miaou and Lum were comparable to those
developed in previous studies to explore the relationship between vehicle accidents and highway geometric
design. The four types of models considered were: (1) an additive linear regression model; (2) a multiplicative
linear regression model; (3) a multiplicative Poisson regression model with an exponential rate function; and
(4) a multiplicative Poisson regression model with a non-exponential rate function. The authors found that the
Poisson regression models outperformed the linear regression models, and that the Poisson regression model with
the exponential rate function was the favored model. Miaou and Lum also attempted to address overdispersion in
their frequency data. When overdispersion exists in the data and a Poisson model is used, the variances of the
estimated model coefficients tend to be underestimated. They attempted to relax the Poisson constraint that the
mean equal the variance by using Wedderburn's overdispersion parameter. They found that, with such overdispersed
data, the Poisson model may not be appropriate for making probabilistic statements about vehicle accidents
because the model may under- or overestimate the likelihood of occurrence. Because of the overdispersion
difficulties, the authors suggested the use of a more general probability distribution such as the Negative Binomial.
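The effect of overdispersion on coefficient variances can be illustrated with a short sketch. It assumes that the overdispersion parameter is estimated as the Pearson chi-square statistic divided by the residual degrees of freedom, a common estimator in quasi-likelihood practice; the exact estimator used by Miaou and Lum is not stated in the text above. The observed counts, fitted means, and function names are hypothetical.

```python
import math


def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by the residual degrees of freedom.

    Under a Poisson model the variance equals the fitted mean, so each
    squared residual is divided by the fitted mean.
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("not enough observations for the number of parameters")
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / dof


def scaled_standard_error(poisson_se, dispersion):
    """Inflate a Poisson standard error by the square root of the dispersion."""
    return poisson_se * math.sqrt(dispersion)


# Hypothetical observed counts and Poisson fitted means for six sections.
observed = [0, 4, 1, 9, 2, 0]
fitted = [1.0, 2.0, 2.0, 4.0, 4.0, 1.0]
phi = pearson_dispersion(observed, fitted, n_params=2)
print(f"dispersion={phi:.4f}, corrected SE={scaled_standard_error(0.10, phi):.4f}")
```

With a dispersion estimate above 1, the uncorrected Poisson standard errors are too small, which is the underestimation described above.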
Miaou (1994) studied the relationship between highway geometrics and accidents using Negative Binomial
regression. In this study, Miaou evaluated the performance of Poisson regression, zero-inflated Poisson
regression, and Negative Binomial regression. Maximum likelihood was used to estimate the coefficients of
the models. As an initial step in developing a model, Miaou suggested that the Poisson regression model
be used to establish the relationship between highway geometrics and accidents. If overdispersion
exists and is found to be moderate or high, both the Negative Binomial and zero-inflated Poisson regression
models can be explored. He suggested that the zero-inflated Poisson regression model appears to be
appropriate when the data exhibit a high number of zero-frequency observations.
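The screening sequence described above can be sketched in Python as follows. This is only an illustration on raw counts: the two cutoffs (`dispersion_cutoff` and `zero_excess_cutoff`) are hypothetical values chosen for the example, since the text does not quantify "moderate or high" overdispersion or "a high number" of zeros. The share of zeros expected under a Poisson distribution with the sample mean is exp(-mean).

```python
import math
from statistics import mean, variance


def screening_summary(counts):
    """Summarise dispersion and the zero share of a list of counts."""
    m = mean(counts)
    return {
        "variance_to_mean": variance(counts) / m,
        "observed_zero_share": sum(1 for c in counts if c == 0) / len(counts),
        "poisson_zero_share": math.exp(-m),
    }


def suggest_model(counts, dispersion_cutoff=1.5, zero_excess_cutoff=0.15):
    """Suggest a count model; both cutoffs are hypothetical illustration values."""
    s = screening_summary(counts)
    if s["variance_to_mean"] <= dispersion_cutoff:
        return "Poisson"
    excess_zeros = s["observed_zero_share"] - s["poisson_zero_share"]
    if excess_zeros > zero_excess_cutoff:
        return "zero-inflated Poisson"
    return "Negative Binomial"


# Hypothetical count samples (illustration only).
print(suggest_model([1, 2, 1, 2, 1, 2, 1, 2]))
print(suggest_model([1, 2, 3, 10, 1, 2, 8, 1, 3, 9]))
print(suggest_model([0, 0, 0, 0, 0, 0, 5, 6, 7, 0]))
```

In practice the dispersion would be assessed on a fitted Poisson regression rather than on the raw counts, and both alternative models could be estimated and compared.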
Ivan and O'Mara (1997) applied Poisson regression to the prediction of traffic accidents using the
Connecticut Department of Transportation's accident data. Results of the model suggest that the posted speed
limit and the annual average daily traffic of the highway are critical accident prediction variables. They
concluded that the Poisson regression model is preferable to the linear regression model.
Shankar et al. (1995) used both the Poisson and Negative Binomial distributions (Poisson when the data
were not significantly overdispersed and Negative Binomial when they were) to evaluate the effects of roadway
geometrics and environmental factors on rural accident frequency in Washington State. In addition to the
overall accident frequency on sections of highway, they modeled the frequency of specific types of accidents.
The authors concluded that separate regression models for specific types of accidents would have greater
explanatory power, and that this was statistically confirmed.
Poch and Mannering (1996) applied the Negative
Binomial regression to predict the accident frequency
on sections of principal arterials in Washington State.
They concluded that the Negative Binomial regression
is a powerful predictive tool and one that should be
increasingly applied in future accident frequency
studies.
Fridstrom et al. (1995) measured the contribution of randomness, exposure, weather, and daylight to the
variation in road accident counts. They stated that the formulation of generalized Poisson regression
models for accident counts allows the total variation in the dependent variable to be decomposed into one
part due to normal random (inexplicable) variation and another part due to systematic, causal factors.
They also concluded that simple Poisson regression models can come very close to explaining almost all of the
systematic variation in a cross-section/time-series accident data set. However, when the events analyzed are
not independent, it would be strongly advisable to use a Negative Binomial rather than a pure Poisson
specification, as a certain amount of overdispersion must always be expected in such cases.
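The idea behind this decomposition can be illustrated with a simplified sketch. Under a pure Poisson assumption the random part of the variance of a count equals its mean, so the share of the total variation attributable to randomness can be approximated by the ratio of the sample mean to the sample variance. This is an assumption-laden illustration on hypothetical counts, not the procedure of Fridstrom et al., and the function name `variance_shares` is introduced here only for the example.

```python
from statistics import mean, pvariance


def variance_shares(counts):
    """Split total variance into an approximate random and systematic share.

    Assumes pure Poisson randomness, for which the random variance of a
    count equals its mean. The random share is capped at 1.
    """
    total = pvariance(counts)
    if total == 0:
        raise ValueError("counts have no variation")
    random_share = min(mean(counts) / total, 1.0)
    return random_share, 1.0 - random_share


# Hypothetical accident counts (illustration only).
counts = [2, 0, 5, 1, 9, 3, 0, 4]
random_share, systematic_share = variance_shares(counts)
print(f"random={random_share:.3f}, systematic={systematic_share:.3f}")
```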
In summary, from a methodological perspective, previous researchers have shown that multiple linear regression
is not a suitable method for modeling the relationship between accident occurrence and geometric and traffic
factors. Poisson regression and, in the case of overdispersion, Negative Binomial regression are more
appropriate approaches for accident modeling.
2. Data collection
In order to develop a mathematical model that relates accident frequencies to roadway geometric and
traffic characteristics, one needs to select a roadway that possesses a wide variety of geometric and traffic
characteristics. The goal of this data collection exercise is to divide this roadway into segments with
homogeneous characteristics. After reviewing several roadways in Central Florida, it was decided that State
Road 50 (SR 50) was the most appropriate for this task.
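The segmentation goal stated above can be sketched in Python as follows. The record layout, the attribute tuple (number of lanes, median width, shoulder width), and all values are hypothetical; they do not reproduce the structure of any actual inventory database. Consecutive records with identical attributes are merged, and a new segment starts whenever any attribute changes.

```python
from itertools import groupby


def merge_homogeneous(records):
    """Merge consecutive inventory records whose attributes are identical.

    `records` must be ordered by position along the road. Each record is a
    (start_km, end_km, attributes) tuple; a new segment starts whenever the
    attributes differ from those of the previous record.
    """
    segments = []
    for attributes, run in groupby(records, key=lambda rec: rec[2]):
        run = list(run)
        segments.append((run[0][0], run[-1][1], attributes))
    return segments


# Hypothetical inventory: attributes are (lanes, median_width_m, shoulder_width_m).
inventory = [
    (0.0, 0.4, (4, 3.0, 1.8)),
    (0.4, 1.1, (4, 3.0, 1.8)),  # same attributes: merged with the record above
    (1.1, 1.6, (4, 6.0, 1.8)),  # median width changes: new segment
    (1.6, 2.3, (2, 6.0, 1.2)),  # lanes and shoulder change: new segment
]
segments = merge_homogeneous(inventory)
for start, end, attributes in segments:
    print(f"{start:.1f}-{end:.1f} km: {attributes}")
```

Each resulting segment is uniform in every attribute, so its length and attributes can serve as one observation in a frequency model.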
SR 50 is a 227 km major principal arterial that connects the east and west coasts of Central Florida,
passing through the center of Orlando. Parts of SR 50 are rural, and the number of lanes varies between 2, 4
and 6. This roadway also experiences high accident rates, and underwent very limited changes during the
3-year study period (1992–94). The arterial is also long enough to produce an adequate number of segments to
develop the model.
Traffic and roadway data were obtained from the Roadway Characteristics Inventory (RCI) database maintained
by the Florida Department of Transportation (FDOT). This database may be used to process, store,
and report information that describes the state highway system in Florida. Information on roadways
includes geometric characteristics such as horizontal curves, shoulder widths, and median widths, and traffic
characteristics such as traffic volumes and speed limits.
SR 50 was divided into 566 highway segments defined by any change in the geometric and/or roadway variables
(e.g. a new section would be identified when the median width changes from 3 to 6 m). Therefore, each
highway segment is uniform with respect to all the possible
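[Equations (1) to (8): only the following expressions are legible in this copy.]

\[ \mathrm{Prob}(n_i) = \frac{\Gamma(\theta + n_i)}{\Gamma(\theta)\, n_i!}\; u_i^{\theta}\, (1 - u_i)^{n_i} \]

\[ L(\lambda_i) = \prod_i \cdots \]

\[ \frac{\partial \lambda}{\partial x} \cdot \frac{x}{\lambda} \]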
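The Negative Binomial probability can be evaluated numerically as a check on its properties. The sketch below assumes the standard parameterization in which \theta = 1/\alpha (with \alpha the overdispersion parameter) and u_i = \theta/(\theta + \lambda_i); these definitions are an assumption here, because they are not legible in this copy. Under that assumption the mean is \lambda_i and the variance is \lambda_i + \alpha \lambda_i^2. The numerical values of \lambda and \alpha are hypothetical.

```python
import math


def nb_probability(n, lam, alpha):
    """Negative Binomial probability of n accidents for mean lam > 0.

    Assumed parameterization: theta = 1/alpha and u = theta / (theta + lam).
    The computation is done on the log scale for numerical stability.
    """
    theta = 1.0 / alpha
    u = theta / (theta + lam)
    log_p = (math.lgamma(theta + n) - math.lgamma(theta) - math.lgamma(n + 1)
             + theta * math.log(u) + n * math.log(1.0 - u))
    return math.exp(log_p)


# Hypothetical mean and overdispersion parameter (illustration only).
lam, alpha = 2.0, 0.25
probs = [nb_probability(n, lam, alpha) for n in range(200)]
total = sum(probs)
mean_n = sum(n * p for n, p in enumerate(probs))
var_n = sum((n - mean_n) ** 2 * p for n, p in enumerate(probs))
print(f"sum={total:.6f}, mean={mean_n:.6f}, variance={var_n:.6f}")
```

As \alpha approaches zero the variance approaches the mean, and the distribution approaches the Poisson.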
4. Estimation results
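Coefficient    t-statistics
4.182          3.78
0.325          7.62
0.622          5.59
0.124          4.46
0.122          2.63
0.024          1.58
0.364          2.09
0.302          3.78
0.235          5.45
Summary statistics
Number of sections               566
Log-likelihood at zero           1210
Log-likelihood at convergence    1077
ρ² = 1 − LL(β)/LL(0)             0.11
2(LL(β) − LL(0))                 266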
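The last four figures above are consistent with one another if 1210 and 1077 are read as the magnitudes of the log-likelihood at zero and at convergence, with the minus signs lost in this copy. That reading is an assumption. Under it, the likelihood ratio index \rho^2 = 1 - LL(\beta)/LL(0) and the statistic 2[LL(\beta) - LL(0)] can be recomputed with the short sketch below; the function name is introduced here for illustration.

```python
def goodness_of_fit(ll_zero, ll_convergence):
    """Return the likelihood ratio index and the likelihood ratio statistic."""
    rho_squared = 1.0 - ll_convergence / ll_zero
    chi_squared = 2.0 * (ll_convergence - ll_zero)
    return rho_squared, chi_squared


# Log-likelihoods entered as negative values (assumed signs, see the note above).
rho_squared, chi_squared = goodness_of_fit(ll_zero=-1210.0, ll_convergence=-1077.0)
print(f"rho squared={rho_squared:.2f}, chi squared={chi_squared:.0f}")
```

The recomputed values round to 0.11 and 266, matching the printed figures.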
Table 2
Elasticity estimates for the accident frequency model
Variable
Elasticity
0.33
0.62
0.07
0.13
0.12
0.38
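An elasticity measures the percentage change in expected accident frequency for a 1% change in a variable, that is, (\partial\lambda/\partial x)(x/\lambda). The sketch below assumes an exponential mean function, \lambda = \exp(\beta x), for which the elasticity is \beta x, while for a variable entered in logarithmic form it is simply \beta. The coefficient value and function names are hypothetical, and the derivative is taken numerically so that both cases can be checked with the same code.

```python
import math


def elasticity(mean_fn, x, rel_step=1e-6):
    """Numerical elasticity (d lambda / d x) * (x / lambda) by central differences."""
    h = x * rel_step
    slope = (mean_fn(x + h) - mean_fn(x - h)) / (2.0 * h)
    return slope * x / mean_fn(x)


BETA = 0.12  # hypothetical coefficient, for illustration only


def mean_linear(x):
    """Expected frequency when x enters the model directly."""
    return math.exp(BETA * x)


def mean_log(x):
    """Expected frequency when x enters the model as log(x)."""
    return math.exp(BETA * math.log(x))


e_linear = elasticity(mean_linear, 3.0)  # analytically BETA * x
e_log = elasticity(mean_log, 3.0)        # analytically BETA
print(f"linear term: {e_linear:.4f}, log term: {e_log:.4f}")
```

Because the elasticity of a log-transformed variable does not depend on x, it can be read directly from the coefficient, whereas for other variables it must be evaluated at a chosen value such as the sample mean.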
Table 3
Negative binomial models of male and female drivers' accident involvement
Variables
Constant                             0.323
Log of section length (km)           0.096
Log of AADT per lane                 0.128
Degree of horizontal curve           0.119
Shoulder width (m)                   0.108
Median width (m)                     0.025
Lane width (m)/no. of lanes          0.356
Speed difference/speed limit         0.095
Urban (1 if urban, 0 if rural)       0.367
Overdispersion parameter (α)         0.094
Summary statistics
Number of sections                   566
Log-likelihood at zero               3657
Log-likelihood at convergence        650
ρ² = 1 − LL(β)/LL(0)                 0.275
2(LL(β) − LL(0))                     2014
has a positive effect on accident involvement, the relative effect of AADT per lane on accident involvement is
higher for female drivers than for male drivers. This shows a tendency toward more accident involvement by
females during heavy traffic. A decrease in median width increases accident involvement frequencies for both
male and female drivers, but the relative effect of median width is more pronounced for female drivers
than for male drivers. The negative correlation between accident involvement and the interaction of lane
width and number of lanes is stronger for females than for males. It can therefore be concluded that narrow
lane width and a larger number of lanes have a larger effect on accident involvement for female than for male
drivers. For male drivers, there is a positive correlation between the speed variable (speed difference/speed
limit) and accident involvement, which is not significant for female drivers. This indicates that
male drivers have a tendency to be involved in accidents that are related to speeding.
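Coefficient    t-statistics
2.52           3.43
0.092          3.21
0.375          4.87
0.107          6.11
0.077          2.63
0.063          6.56
0.800          8.11
0.317          6.31
0.137          4.10
Summary statistics
Number of sections               566
Log-likelihood at zero           3408
Log-likelihood at convergence    2864
ρ² = 1 − LL(β)/LL(0)             0.16
2(LL(β) − LL(0))                 1088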
Elasticity (male model)    Elasticity (female model)
0.09
0.12
0.08
0.09
0.37
0.08
0.18
0.13
0.35
0.13
0.34
0.79
0.07
Table 5
Negative binomial models of young, middle-aged, and old drivers' accident involvement
Variables
Constant
Log of section length (km)
Log of AADT per lane
Degree of horizontal curve
Shoulder width (m)
Median width (m)
Lane width (m)/no. of lanes
Shoulder pavement (1 if paved, 0 otherwise)
Urban (1 if urban, 0 if rural)
Speed difference/speed limit
Overdispersion parameter (α)
Summary statistics
Number of sections
Log-likelihood at zero
Log-likelihood at convergence
ρ² = 1 − LL(β)/LL(0)
2(LL(β) − LL(0))
Coefficient    t-statistics    Coefficient    t-statistics    Coefficient    t-statistics
3.152
0.099
0.373
0.312
0.087
0.030
0.706
4.04
3.23
4.74
16.47
2.66
2.98
5.71
0.321
0.105
0.165
0.123
0.074
0.036
0.448
4.01
4.60
2.81
8.91
2.98
5.08
5.37
3.020
0.218
0.342
0.162
0.094
0.725
0.236
2.69
4.47
3.01
2.87
6.92
5.21
2.10
0.534
0.113
0.28
9.48
14.42
1.1
0.174
0.039
0.195
4.56
2.01
9.60
0.458
0.211
4.40
2.13
566
2763
2313
0.16
900
566
3994
3599
0.09
790
566
2518
1557
0.385
1922
Table 6
Elasticity estimates for age accident involvement models
Variables
0.10
0.31
0.18
0.70
0.22
0.71
0.075
0.10
0.16
0.07
0.12
0.19
0.44
0.02
0.21
0.34
0.26
0.50
0.72
Acknowledgements
The authors wish to acknowledge the comments and
suggestions of the anonymous referees. Their recommendations resulted in a substantially improved paper.
References
Abdel-Aty, M., Chen, C., Radwan, E., Brady, 1999a. Analysis of accident-involvement trends by drivers' age
in Florida. ITE Journal on the Web (Feb. 1999), pp. 69–74.
Abdel-Aty, M., Chen, C., Radwan, E., 1999b. Using conditional probabilities to explore the driver age effect
in accidents. ASCE Journal of Transportation Engineering 125 (6).
Agent, K., Deen, R., 1975. Relationship between roadway geometrics and accidents. Transportation Research
Record 541, 1–11.
Agresti, A., 1990. Categorical Data Analysis. Wiley, New York.
Chen, C., 1997. Statistical Analysis of the Effect of Demographic and Roadway Factors on Traffic Crash
Involvement. M.S. thesis, Department of Civil Engineering, University of Central Florida.
Fridstrom, L., Ifver, J., Ingebrigtsen, S., Kulmala, R., Thomsen, L., 1995. Measuring the contribution of
randomness, exposure, weather, and daylight to the variation in road accident counts. Accident Analysis and
Prevention 27 (1), 1–20.