Anisotropic Extensions of Space-Time Point Process Models For Earthquake Occurrences

University of California
Los Angeles
Anisotropic Extensions
of Space-time Point Process Models
for Earthquake Occurrences
A dissertation submitted in partial satisfaction
of the requirements for the degree
Doctor of Philosophy in Statistics
by
Ka Leung Wong
2009
© Copyright by

Ka Leung Wong
2009
The dissertation of Ka Leung Wong is approved.
Qing Zhou
Jan de Leeuw
David Jackson
Frederic Paik Schoenberg, Committee Chair
University of California, Los Angeles

2009
To Ivy
Table of Contents
1 Introduction
2 Focal Mechanism
3 Analysis of Aftershock Spatial Distribution
  3.1 Background
  3.2 Data
  3.3 Stochastic Aftershock Assignment
  3.4 Relative Location of Aftershocks with respect to Mainshock Focal Mechanism
  3.5 Tapered Pareto-Wrapped Exponential
  3.6 Alternative Aftershock Spatial Distributions
    3.6.1 The Normal Model
    3.6.2 The Kagan-Jackson Model
  3.7 Residual Analysis
    3.7.1 Quadrat Residuals
    3.7.2 Weighted K-function
  3.8 Results
    3.8.1 Fit of proposed models
    3.8.2 Diagnostics and Model Comparison
  3.9 Additional Topics
    3.9.1 Scaled Distance
    3.9.2 Relocation Catalog
  3.10 Discussion
4 Focal Mechanism-dependent Anisotropic Spatial Kernel for Space-Time Earthquake Point Process Models
  4.1 Background
  4.2 Epidemic Type Aftershock Sequence Models
    4.2.1 Anisotropic Clustering
  4.3 Fault Plane Strike Angle and Relative Aftershock Angle
  4.4 Anisotropic Extensions of ETAS Models
  4.5 Parameter Estimation
  4.6 Goodness-of-fit and Diagnostic Methods for Spatial and Spatial-temporal Point Process Models
  4.7 Deviance Residuals
  4.8 Results
    4.8.1 Pearson residuals
    4.8.2 Comparison of Isotropic Pareto and Isotropic Tapered Pareto Models
    4.8.3 Impact of the Wrapped Exponential Distribution of Relative Angles Between Mainshocks and Aftershocks
    4.8.4 Models with only Strike-slip Mainshocks
  4.9 Discussion
5 Concluding Remarks
References
List of Figures
2.1 Beach ball diagram
2.2 Ternary diagram
3.1 Definition of relative aftershock location
3.2 Scatterplot of aftershocks
3.3 Survival function of $r$
3.4 Conditional histograms of $\phi$
3.5 Estimates of $\eta$
3.6 Density plots
3.7 Quadrat residuals
3.8 Weighted K-functions
3.9 Survival function of $r/L$
3.10 Estimates of $\eta$, with respect to $r/L$
3.11 Estimates of $\eta$, from a relocation catalog
4.1 Fault plane strike angle and relative aftershock angle
4.2 Pearson residuals for model (4.5)
4.3 Deviance residuals of (4.5) against (4.8)
4.4 Deviance residuals of (4.5) against (4.10)
4.5 Deviance residuals of (4.13) against (4.14), with only strike-slip mainshocks
List of Tables
3.1 ETAS parameters used in stochastic aftershock assignment
4.1 MLE and AIC for ETAS models
4.2 MLE and AIC for ETAS models, with only strike-slip mainshocks
Acknowledgments
I owe my deepest gratitude to my advisor Rick Schoenberg. His kindness, mentorship, and guidance are vital to the development and completion of this work.
I am also grateful to the members of my committee: Qing Zhou, Jan de Leeuw,
and David Jackson. Special appreciation goes to David Jackson who has shared
with me his invaluable expertise in seismology.
I would like to extend my gratitude to Yan Kagan and Alex Veen for their
stimulating discussions, Mark Hansen for taking me under his wing early in my
career, and Glenda Jones for her brilliant and dedicated service. Many of my
colleagues also deserve praise. I would like to thank Chris Barr, David Diez, and Gong
Chen for their humor, support, and encouragement.
Lastly, I would like to acknowledge my dear parents for their unwavering support,
and my lovely Ivy, to whom I owe everything.
Vita
1981         Born, Hong Kong, China.
1999–2004    B.A. Architecture, University of California, Berkeley.
2004–2009    Ph.D. Statistics, University of California, Los Angeles.
2006–2008    Teaching Assistant, Statistics Department, University of California, Los Angeles.
2008         Ph.D. Candidate in Statistics, University of California, Los Angeles.
Publications and Presentations
Wong, Ka and Schoenberg, Frederic. On Mainshock Focal Mechanisms and
the Spatial Distribution of Aftershocks, Bulletin of the Seismological Society of
America, in press.
Int'l Workshop on Statistical Seismology 2009 (Lake Tahoe, CA)
Invited paper: A focal mechanism-dependent anisotropic spatial kernel for ETAS
models.
Annual SCEC Meeting 2008 (Palm Springs, CA)
Contributed poster: Estimation of ETAS models for earthquake occurrences.
Abstract of the Dissertation
Anisotropic Extensions
of Space-time Point Process Models
for Earthquake Occurrences
by
Ka Leung Wong
Doctor of Philosophy in Statistics
University of California, Los Angeles, 2009
Professor Frederic Paik Schoenberg, Chair
Focal mechanism provides a reasonable approximation to an earthquake's rupture
mechanics in terms of its fault plane orientation and direction of slip. The first
part of this dissertation explores focal mechanism as a means to describe the
anisotropic spatial distribution of aftershocks. Based on empirical analysis of
aftershock patterns in Southern California seismicity, a spatial distribution for
the relative location of aftershocks with respect to mainshock focal mechanism
is proposed. When compared to alternative models for aftershock and seismicity
patterns, the proposed model appears to offer superior fit to Southern California
seismicity.
The second part proposes a general framework for extending space-time earthquake point process models to incorporate focal mechanism data via an anisotropic
spatial kernel. Using the proposed model for relative aftershock locations as an
example, the effectiveness of using focal mechanism in modeling earthquake occurrences is assessed. In addition, a new residual method is proposed for assessing
the relative fit of models to spatial and spatial-temporal point process
data. This graphical tool is used to illustrate the advantages and disadvantages
of extended ETAS models compared to alternative models and appears quite
effective.
CHAPTER 1
Introduction
Focal mechanism provides a reasonable approximation to an earthquake's rupture
mechanics in terms of its fault plane orientation and direction of slip. This dissertation explores focal mechanism as a means to describe the anisotropic spatial
distribution of aftershocks and extends a state-of-the-art earthquake point process model to incorporate focal mechanism via an anisotropic spatial kernel. In
addition, new methods are proposed for the assessment of goodness-of-fit of such
spatial-temporal point process models, and these methods are applied to the comparison of branching point process models for earthquakes and their aftershocks
using recent seismological data from Southern California.
Chapter 2 provides an introduction to earthquake focal mechanism. Chapter 3
explores focal mechanism as a means to describe the anisotropic spatial distribution of aftershocks using seismological data from Southern California. A spatial
distribution is proposed for the relative location of aftershocks with respect to
mainshock focal mechanism. This model is compared to previously proposed
models based on the normal distribution and the squared cosine function. Using
residual analysis and the weighted K-function as diagnostic measures, the normal and
squared cosine models are found to suffer from several serious problems, while
the proposed distribution has features similar to both alternative models but fits
Southern California earthquake data much better.
Chapter 4 proposes a general framework for extending space-time point process
earthquake models to incorporate focal mechanism via an anisotropic spatial
kernel. The proposed model for relative aftershock locations is used as an example of
such a spatial kernel, and the effectiveness of using focal mechanism in modeling earthquake occurrences is assessed. Some methods for assessing purely
spatial and spatial-temporal point process models are briefly summarized. Building upon some of these methods, a new residual method that involves inspecting
differences between competing models in contributions to the log-likelihood over
pixels is proposed. This graphical tool appears to be quite effective at portraying
the relative fit of models to space-time point process data, and is used to illustrate the advantages and disadvantages of extended ETAS models compared to
alternative models.
Chapter 5 concludes this dissertation and suggests important topics for future
research in this area.
CHAPTER 2
Focal Mechanism
This chapter provides a brief summary of earthquake focal mechanism; for further
details see Bullen and Bolt (1985) or Bolt (2006).
Earthquakes generally result from seismic failures and can be highly non-linear
and fractal-like processes (Kagan 1997, Turcotte 1989). The observed seismic
wave patterns of most earthquakes can be effectively explained by a double couple,
as if the event were equivalent to a single nonelastic slippage on a single fault
plane (Bullen and Bolt, 1985). The focal mechanism of an earthquake, which
includes the direction of slip and the orientation of the fault on which it occurs,
provides a reasonable approximation to an event's rupture mechanics (Bullen and
Bolt, 1985).
Focal mechanisms are derived from a solution of the moment tensor of an
earthquake. A seismic moment tensor is a 3 × 3 symmetric matrix estimated
by an analysis of observed seismic waveforms. Its eigenvalues determine the
moment, or size, of an earthquake, and its eigenvectors give the directions of the
earthquake's N- (neutral), T- (tension), and P- (compression) axes relative to its
hypocenter. Inferred from the moment tensor are two ambiguous nodal planes,
one of which is the fault plane and the other its perpendicular auxiliary plane.
Differentiating the two planes requires knowledge of the event's lateral orientation
(left or right), and such information can be inferred from local geological evidence
and/or the event's aftershock pattern. The task of resolving fault plane ambiguity
in Southern California is aided by the prevalence of two major right-lateral
strike-slip fault zones, the San Andreas and San Jacinto fault zones (Sanders 1989,
1993).
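As a small illustration of how the principal axes follow from a moment tensor, the sketch below checks the eigen-structure of an idealized vertical strike-slip double couple. The tensor, its unit scalar moment, and the coordinate frame (x and y horizontal, z vertical) are assumptions chosen for illustration and are not taken from the SCEDC catalog.

```python
import math

# Idealized double couple with unit moment: slip along x on a vertical
# plane whose normal is y (hypothetical frame: x, y horizontal, z up).
M = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]


def matvec(m, v):
    """Multiply a 3 x 3 matrix by a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def is_eigenpair(m, value, vector, tol=1e-12):
    """Check that m @ vector equals value * vector up to tol."""
    mv = matvec(m, vector)
    return all(abs(mv[i] - value * vector[i]) < tol for i in range(3))


s = 1.0 / math.sqrt(2.0)
axes = {
    "T": (1.0, [s, s, 0.0]),       # largest eigenvalue: tension axis
    "N": (0.0, [0.0, 0.0, 1.0]),   # intermediate eigenvalue: neutral axis
    "P": (-1.0, [s, -s, 0.0]),     # smallest eigenvalue: compression axis
}

for name, (value, vector) in axes.items():
    print(name, value, vector, is_eigenpair(M, value, vector))
```

In this frame the neutral axis is exactly vertical and the T- and P-axes are horizontal, which is the geometry of a strike-slip event with vertical nodal planes.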
Depending on its faulting style, each earthquake can be classified
into one of three major categories: strike-slip, normal, and reverse. The majority of
earthquakes in Southern California are strike-slip events, that is, events
with nearly vertical fault planes and nearly horizontal slippage. Focal mechanisms
are typically displayed graphically using so-called beach ball diagrams. The beach
ball diagram in Figure 2.1 corresponds to the focal mechanism of a strike-slip
event whose nodal planes are exactly vertical. The black and white quadrants
represent the tension and compression zones, respectively, containing the T- and
P- axes. If the fault is right-lateral, then the fault plane is represented by nodal
plane 1.
When many focal mechanisms are displayed collectively, a ternary diagram
(Frohlich 1992, Frohlich 2001) may be a useful graphical device. A ternary diagram projects each focal mechanism as a point onto a triangle in which each
corner represents a type of earthquake: strike-slip, normal, or reverse. Figure 2.2
illustrates a ternary diagram for focal mechanisms in a typical Southern California focal mechanism catalog. Such a technique therefore allows one to summarize
a large set of focal mechanisms in a compact fashion. Minor area distortion is
introduced in the projection step.
Estimating focal mechanisms requires an extensive network of seismic stations,
and data on focal mechanisms have not historically been available in large quantities. Such data have recently become more widely available due to advances in instrumentation technology and inversion algorithms. For events located in or near
Southern California, focal mechanism estimates are provided by the Southern
California Earthquake Data Center (SCEDC), which has implemented an automatic inversion process since 1999 (Clinton et al. 2006). Owing to the difficulty
of the inversion process, focal mechanism solutions are often subject to quality
issues. Factors that affect the quality of a solution include an event's epicentral
location relative to the monitoring stations, magnitude, and depth (Clinton et
al. 2006). For instance, a small-magnitude event on the edge of a network may
not have a successful inversion. Published solutions on SCEDC are labeled with
a quality grade that reflects the precision of the inversion process. The three
quality grades are A, B, and C, in decreasing order of their precision.
[Figure 2.1 labels: nodal plane 1, nodal plane 2, T-axis, P-axis]
Figure 2.1: A beach ball diagram corresponding to the focal mechanism of a
strike-slip earthquake.
Figure 2.2: A ternary diagram showing the distribution of earthquake types in a
typical Southern California data set. The dotted lines demarcate the definitions
of strike-slip, normal, and thrust, used in Frohlich (1992) and Frohlich (2001).
CHAPTER 3
Analysis of Aftershock Spatial Distribution
3.1 Background
Since Utsu (1969) first famously noted that aftershock regions tend to be elliptical, it has been widely observed that aftershock activities are often non-circular.
A convenient model used to describe aftershocks' anisotropic spatial distribution
is the bivariate normal distribution. Ogata (1998) uses the normal model to
approximate the ellipsoidal contours of aftershock spatial distribution. Kagan
(2002) uses a similar method to measure the mainshock focal zone size. While
the normal distribution may serve as an acceptable first-order approximation for
the distribution of aftershock locations, there is scant evidence of its optimality.
Kagan (2002) lists a few reasons why the normal model cannot be
exact. For instance, aftershocks may happen at large distances from the mainshock where no other traces of earthquake rupture can be found. In addition,
aftershocks exhibit the feature of secondary clustering and are not mutually independent in space.
Another shortcoming of the normal model is that it does not incorporate
information on mainshock focal mechanisms, which may have value in forecasting
aftershock patterns since it has been widely observed that aftershocks generally
occur on or near the fault planes of their associated mainshocks. For instance,
Willemann and Frohlich (1987) used the Anderson-Darling statistic to test the
distribution of aftershock hypocenters on the focal sphere of mainshocks against
a uniform distribution to show that for deep focus earthquakes, aftershocks that
were greater than 20 km from the mainshock exhibit significant clustering in the
plane of the Wadati-Benioff zone. Michael (1989) applied the same method to six
shallow focus aftershock sequences in California and found significant clustering
on the fault plane. Kagan (1992) adopted a more exploratory approach by using
equal-area projection rather than the Anderson-Darling statistic. He too reached
a similar conclusion regarding earthquake clustering along fault planes. Based
on this result, Kagan and Jackson (1994) introduced an anisotropic function for
their spatial smoothing kernel in their long-term earthquake forecast.
The aforementioned studies focused exclusively on the clustering of hypocenters in certain directions on the focal sphere. I argue that such analyses essentially ignore the potentially important relationship between the distance from an
earthquake's hypocenter or epicenter to those of its aftershocks and the angular
separation between an earthquake and its aftershocks. In this chapter, I attempt
to model both the angle and distance between earthquakes and their aftershocks
in concert.
This analysis focuses on strike-slip earthquakes with local magnitude $M_L > 3.0$
in Southern California between 1999 and 2006 as mainshocks. Due to the
difficulty and subjectivity associated with the assessment of the precise branching
structure in earthquake catalogs, we follow Zhuang et al. (2002) in using a
model-based method to identify aftershock sequences stochastically. I propose a
semi-parametric model for the spatial distribution of aftershocks that is composed
of a marginal distribution for distance and a conditional distribution for the
relative angle from the mainshock. This model is compared to two alternatives
that have been used to describe aftershock and seismic patterns respectively: the
normal model and the spatial smoothing kernel of Kagan and Jackson (1994).
We use point process residuals and the weighted K-function to assess the fit of
the various models and show that the proposed semi-parametric model appears
to offer superior fit to Southern California seismicity.
The outline of this chapter is as follows. Sections 3.2 and 3.3 describe the
catalog and aftershock assignment procedure used in this analysis. Section 3.4
defines the relative location of aftershocks with respect to mainshock focal mechanism. A new model is proposed in Section 3.5 for the spatial distribution of
relative aftershock locations. Two alternative models used to describe aftershock
and seismic patterns are given in Section 3.6. Section 3.7 discusses two diagnostic
methods to be used to compare the competing models. Section 3.8 presents the
results. Section 3.9 investigates two additional topics in seismology that are relevant to understanding the spatial distribution of aftershocks. A discussion and
suggestions for future work are given in Section 3.10.
3.2 Data
We focus here on earthquakes in the Southern California Earthquake Data Center
(SCEDC) catalog occurring between September 18, 1999, and December 31, 2005, with
epicenters in Southern California and a moment magnitude of M3.0 or above (see
Data and Resources Section). The published hypocenters in the SCEDC moment
tensor catalog are used for the locations of both mainshocks and aftershocks.
The hypocenters are based on first motion triggers and therefore correspond to
the initial points of rupture. Based on comparison of the frequency-magnitude
distribution of the catalog to the Gutenberg-Richter distribution, the catalog
is believed to be complete for earthquakes above M3.0, though some events on
the edge of the network can be missing due to the lack of a qualified moment
tensor solution (Clinton et al. 2006). As mentioned in Chapter 2, moment tensor
solutions provided by SCEDC are labeled with a quality factor, A, B, or C, in
decreasing order of their precision. Only focal mechanisms of quality A or B are
considered because solutions of worse quality are deemed unstable and are often
discarded (Clinton et al. 2006).
Because different types of earthquakes are likely to have very different patterns of aftershock activity, in this paper we consider only the aftershock activity
surrounding strike-slip earthquakes of quality A or B, with the notion that the
aftershock activity surrounding other types of events may be analyzed using similar methods in future work. As pointed out by Kagan and Jackson (1998), the
simplicity of the geometry of strike-slip faulting facilitates the description and
interpretation of aftershock patterns: in a strike-slip event, the fault plane intersects the surface of the earth almost vertically and thus the line of intersection is a
fairly accurate representation of the fault plane itself. In seismology, this line of
intersection is termed the strike of the fault plane. Since the strike is a reasonable
proxy for the fault plane, we will refer to the two terms interchangeably.
Southern California is populated by a large number of right-lateral strike-slip
faults. As a result, strike-slip events make up the majority of earthquakes in
Southern California. In this paper, we categorize an earthquake mechanism as
strike-slip if the neutral axis of its moment tensor (B-axis) is within 20° of the
vertical (some previous authors have used a cutoff of 30° instead of 20°) (Frohlich
1992, 2001). A strike-slip event will thus have a nearly vertical fault plane and a nearly
horizontal rupture motion. Roughly 1/3 of earthquakes in the SCEDC catalog
fall into this category.
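The classification rule above amounts to a threshold on the angle between the neutral axis and the vertical. The following is a minimal sketch, assuming the neutral (B) axis is supplied as a 3-vector in a frame whose third component is vertical; the function name and the vector representation are hypothetical and are not part of the SCEDC data format.

```python
import math


def is_strike_slip(b_axis, cutoff_deg=20.0):
    """Return True if the neutral (B) axis lies within cutoff_deg of vertical.

    b_axis is an (x, y, z) vector with z vertical (an assumed convention).
    An axis has no sense of direction, so the sign of z is ignored.
    """
    x, y, z = b_axis
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("b_axis must be non-zero")
    # Clamp the cosine to guard against rounding slightly above 1.
    angle_from_vertical = math.degrees(math.acos(min(1.0, abs(z) / norm)))
    return angle_from_vertical <= cutoff_deg


# A B-axis tilted 15 degrees from vertical passes the 20-degree cutoff.
tilt = math.radians(15.0)
print(is_strike_slip((math.sin(tilt), 0.0, math.cos(tilt))))
```

Passing `cutoff_deg=30.0` reproduces the looser cutoff used by some previous authors.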
The Southern California Earthquake Data Center (SCEDC) data were obtained via SCEDC's searchable data archive at
http://www.data.scec.org/catalog_search/CMTsearch.php
3.3 Stochastic Aftershock Assignment
In determining the mainshock-aftershock assignments, we adopt the model-based
approach taken by Zhuang et al. (2002). In this approach, one assumes an
aftershock sequence model and assigns the aftershock branching structure probabilistically according to the model. For example, one may take a spatial-temporal
epidemic type aftershock sequence (ETAS) model (Ogata 1998) and assume the
conditional intensity at point (t, x, y) is the sum of the background and triggering
intensities,
$$\lambda(t, x, y \mid H_t) = \mu(x, y) + \sum_{i:\, t_i < t} g(t, x, y \mid t_i, x_i, y_i),$$
where $\mu(\cdot)$ is a constant background rate and $g(\cdot \mid \cdot)$ is the response function. After
fitting the model to the catalog, one may attribute event $j$ to be an offspring of
an earlier event $i$ with probability $\rho_{i,j}$ defined as the ratio of the response function
of $i$ to the conditional intensity at $(t_j, x_j, y_j)$, that is,
$$\rho_{i,j} = \frac{g(t_j, x_j, y_j \mid t_i, x_i, y_i)}{\lambda(t_j, x_j, y_j \mid H_{t_j})}.$$
For any event $i$ in the catalog, a stochastic realization of its aftershock sequence
can be emulated by keeping each subsequent event $j$ with probability $\rho_{i,j}$, for
$j = i + 1, i + 2, \ldots, n$. In my particular implementation, I used ETAS model
(2.3) of Ogata (1998), fit via maximum likelihood with only strike-slips as trigger
events. Parameter estimates can be found in Table 3.1. As mentioned previously, the purpose of this paper is to explore the branching behavior of Southern
California strike-slip events only, hence only the ETAS branching structures initiated by strike-slip events are modeled here. Use of the probabilistic approach
of Zhuang et al. (2002) is an alternative to the often arbitrary and subjective
decisions that can arise in determining mainshock-aftershock assignments. Furthermore, I repeated this probabilistic branching assignment multiple times and
verified that the main conclusions of this paper were not affected. We henceforth
refer to a single realization of this Zhuang et al. (2002) assignment procedure.
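The assignment procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in this analysis: `toy_background` and `toy_response` are hypothetical stand-ins with made-up constants, not ETAS model (2.3) of Ogata (1998) or the estimates in Table 3.1, and events are assumed to be (t, x, y) tuples sorted by strictly increasing time.

```python
import math
import random


def toy_background(event):
    """Hypothetical constant background rate (not a fitted value)."""
    return 0.1


def toy_response(child, parent):
    """Hypothetical response function g: exponential decay in time and
    inverse-power decay in squared distance. Not Ogata's model (2.3)."""
    dt = child[0] - parent[0]
    d2 = (child[1] - parent[1]) ** 2 + (child[2] - parent[2]) ** 2
    return 0.5 * math.exp(-dt) / (1.0 + d2)


def triggering_probabilities(events, mu, g):
    """rho[i][j] = g(event j | event i) / lambda(event j) for i < j.

    Entries with i >= j are zero: an event cannot trigger an earlier one.
    """
    n = len(events)
    rho = [[0.0] * n for _ in range(n)]
    for j in range(n):
        triggered = [g(events[j], events[i]) for i in range(j)]
        intensity = mu(events[j]) + sum(triggered)
        for i in range(j):
            rho[i][j] = triggered[i] / intensity
    return rho


def emulate_aftershocks(i, rho, rng):
    """Keep each later event j independently with probability rho[i][j]."""
    return [j for j in range(i + 1, len(rho)) if rng.random() < rho[i][j]]


events = [(0.0, 0.0, 0.0), (0.5, 0.1, 0.0), (0.6, 5.0, 5.0), (2.0, 0.0, 0.2)]
rho = triggering_probabilities(events, toy_background, toy_response)
print(emulate_aftershocks(0, rho, random.Random(0)))
```

For each event j, the column sum of `rho` plus the background share equals one, so an event close in time and space to event i is retained in the emulated sequence of i far more often than a distant one.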
While this analysis is not concerned with discerning whether particular events
are foreshocks, mainshocks, aftershocks, or swarms, for purposes of explanation
in this paper, let us refer to the strike-slip earthquakes in the SCEDC catalog of
quality A or B as mainshocks. According to these definitions, the catalog contains 190 distinct mainshocks, which collectively have a total of 1224 emulated
aftershocks.
3.4 Relative Location of Aftershocks with respect to Mainshock Focal Mechanism
We consider an aftershock's relative location to any mainshock with respect to
the focal mechanism of the mainshock, as illustrated in a beach ball diagram in
Figure 3.1 representing the focal mechanism of a right-lateral strike-slip mainshock. The location of an aftershock relative to a mainshock is measured by $r$ and
$\phi$, where $r$ is the epicentral distance between the two events, and $\phi$ is the angular
separation between the aftershock and the mainshock's fault plane. The fault
plane ambiguity is resolved by assuming all strike-slip faults in Southern California are right-lateral unless individual aftershock sequences clearly delineate a
left-lateral fault upon examination. Gomberg (2003) observed strong directivity effects among a number of unilaterally-rupturing strike-slips with magnitude
$M_s > 5.4$ in different tectonic environments. While events with directive ruptures
may have asymmetrically triggered aftershocks, such events and their propagation
directions are difficult to quantify in a large catalog of various-sized earthquakes.
Therefore we will ignore possible directivity effects and treat the fault plane as an
axis without sense of direction. To tentatively differentiate between the compression zone and dilatation zone of the mainshock focal mechanism, one may define
$\phi$ as the angle measured clockwise from the aftershock's epicenter to the nearest
mainshock fault plane for a right-lateral strike-slip, and counter-clockwise for a
left-lateral event. Thus $\phi$ spans from $0$ to $\pi$, and $\phi \in (0, \pi/2)$ and $\phi \in (\pi/2, \pi)$
represent the compression zone (containing the P-axis) and dilatation zone (containing the T-axis) respectively. While the Coulomb stress changes caused by the
slip of the mainshock may differ between the two zones (compression and dilatation), the numbers of aftershocks in the two zones are found to be similar: 618 in
the dilatation zone and 606 in the compression zone. To test whether the differences in
the observed aftershock patterns in the two zones are statistically significant, the
dilatation zone is reflected along the y-axis into the first quadrant and a chi-square
($\chi^2$) test is conducted on whether the two distributions are significantly different.
Each zone is first confined to a $[0, 20] \times [0, 20]$ window and then partitioned into
a $3 \times 3$ rectilinear grid. With only cells containing 5 points or more entering into
the test, the $\chi^2$ test yields a test statistic of 3.98 with 5 degrees of freedom, which
corresponds to a p-value of 0.55. In light of such evidence, we do not distinguish
the two zones in the remainder of this paper and restrict $\phi$ to $[0, \pi/2]$ by defining
it as the absolute angular separation from the nearest fault plane.
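The final definition of the relative location, with the angle folded into a quarter circle, can be sketched as below. The conventions are assumptions made for illustration and are not stated in the text: epicenters are given as planar (east, north) coordinates, and the strike is an azimuth in degrees measured clockwise from north. The function name is hypothetical.

```python
import math


def relative_location(main_xy, after_xy, strike_deg):
    """Return (r, angle) of an aftershock relative to a mainshock.

    r is the epicentral distance and angle is the absolute angular
    separation from the nearest end of the fault-plane strike, folded
    into [0, pi/2]. Inputs are assumed planar (east, north) coordinates;
    strike_deg is assumed to be measured clockwise from north.
    """
    dx = after_xy[0] - main_xy[0]
    dy = after_xy[1] - main_xy[1]
    r = math.hypot(dx, dy)
    azimuth = math.atan2(dx, dy)  # clockwise from north
    # The fault plane is an axis without direction, so work modulo pi.
    d = (azimuth - math.radians(strike_deg)) % math.pi
    angle = min(d, math.pi - d)
    return r, angle


# An aftershock due north-east of a mainshock on an east-west fault.
print(relative_location((0.0, 0.0), (1.0, 1.0), 90.0))
```

Aftershocks lying along either end of the strike receive an angle of zero, and those perpendicular to it receive pi/2, which matches treating the fault plane as an axis without sense of direction.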
3.5 Tapered Pareto-Wrapped Exponential
I propose to model the distribution of the relative locations of aftershocks, in
polar coordinates, as a product of two distributions: 1) a marginal tapered Pareto
distribution for the distance $r$ between mainshocks and their aftershocks, and 2)
a wrapped exponential distribution for the relative angle $\phi$ between mainshocks
and their aftershocks, given the distance $r$. The distribution may be written
$f(r, \phi) = \frac{1}{r} f_r(r)\, f_{\phi|r}(\phi|r)$, where $f_r$ and $f_{\phi|r}$ are each one-dimensional densities
to be estimated, so that $\iint f(r, \phi)\, r\, dr\, d\phi = 1$. We refer to this model as the
Tapered Pareto Wrapped Exponential (TPWE) model in what follows.
The distribution of distance between mainshocks and aftershocks has been
investigated in previous studies using various methods. Utsu (1969) noted that
aftershock regions tend to be elliptical, and Ogata (1998) built upon this work,
questioning whether the distance decay function is short range (i.e. normal)
or long range (i.e. inverse power law) and proposing several moment-weighted
models as alternatives. The inverse power law was shown by Felzer et al.
(2006) to be a good description of aftershock distances between 0.2 km and 50
km. In a time-independent framework, Kagan and Jackson (1994) used a density
proportional to 1/r to describe distances between mainshocks and aftershocks,
in producing a long-term seismic hazard map.
More recently, several authors have begun using the tapered Pareto distribution as an alternative to the Pareto and truncated Pareto distributions. The
tapered Pareto was originally suggested by Vilfredo Pareto himself (Pareto 1897)
and has been used to describe the distribution of phenomena which obey some
power-law type of behavior but are not quite as heavy-tailed as the Pareto, such
as seismic moments (Jackson and Kagan 1999; Vere-Jones et al. 2001), the times
and distances between successive earthquakes in Southern California (Schoenberg
et al. 2008), and wildfire sizes (Schoenberg et al. 2003).
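The tapered Pareto has cumulative distribution function
$$F_{tap}(x) = 1 - \left(\frac{a}{x}\right)^{\beta} \exp\left(\frac{a - x}{\theta}\right), \qquad a \le x < \infty,$$
where $a$ is the lower bound of the support, $\beta$ is the power-law exponent, and $\theta$ is a threshold after which frequency begins to decay especially rapidly.
Additional information concerning the density, characteristic function, moments,
and other properties of the tapered Pareto can be found in Kagan and Schoenberg
(2001).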
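To make the tapered Pareto concrete, the sketch below evaluates its survival function and density and draws random variates. It assumes the usual parameterization with lower bound `a`, power-law exponent `beta`, and taper threshold `theta`; the numerical values used are hypothetical and are not estimates from this analysis. The sampler relies on the fact that the tapered Pareto survival function is the product of a Pareto survival function and a shifted exponential one, so a variate can be drawn as the minimum of the two.

```python
import math
import random


def tapered_pareto_sf(x, a, beta, theta):
    """Survival function 1 - F_tap(x); equals 1 for x <= a."""
    if x <= a:
        return 1.0
    return (a / x) ** beta * math.exp((a - x) / theta)


def tapered_pareto_pdf(x, a, beta, theta):
    """Density obtained by differentiating the CDF; zero for x < a."""
    if x < a:
        return 0.0
    return (beta / x + 1.0 / theta) * tapered_pareto_sf(x, a, beta, theta)


def tapered_pareto_sample(a, beta, theta, rng):
    """Draw one variate as min(Pareto(a, beta), a + Exponential(mean theta)).

    The survival function of the minimum of two independent variables is
    the product of their survival functions, which gives the tapered Pareto.
    """
    pareto = a * (1.0 - rng.random()) ** (-1.0 / beta)
    shifted_exp = a + rng.expovariate(1.0 / theta)
    return min(pareto, shifted_exp)


# Hypothetical parameter values, chosen only for illustration.
A, BETA, THETA = 0.2, 1.0, 20.0
rng = random.Random(42)
draws = [tapered_pareto_sample(A, BETA, THETA, rng) for _ in range(20000)]
empirical = sum(d > 1.0 for d in draws) / len(draws)
print(empirical, tapered_pareto_sf(1.0, A, BETA, THETA))
```

The empirical exceedance frequency should be close to the analytic survival probability; for large `theta` the taper has little effect below the threshold and the distribution behaves like an ordinary Pareto there.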
While a variety of models for the density of $r$ have been proposed, the form of
$f_{\phi|r}(\phi|r)$ has been the subject of relatively scant scrutiny to date. As part of their
spatial smoothing kernel, Kagan and Jackson (1998) used a directivity function,
expressed as $D_\phi = 1 + \delta \cos^2(\phi)$, where $\delta$ measures the concentration of earthquake epicenters around the presumed fault plane. Here $D_\phi$ does not serve exactly
the same purpose as $f_{\phi|r}(\phi|r)$ because $f_{\phi|r}(\phi|r)$ concerns aftershocks, i.e. seismic
activity at times after a mainshock, whereas $D_\phi$ describes the time-independent
distribution of $\phi$ around any earthquake. Further, this investigation suggests
that, for the SCEDC data, the distribution of $\phi$ appears to depend on $r$, so that
the conditional distribution of $\phi$ given $r$ may be more meaningful than the overall marginal distribution of $\phi$. Due to lack of theoretical support for a particular
functional form, I take a semi-parametric approach here. I propose a circular distribution called the wrapped exponential (WE), whose single parameter $\eta$ may
be estimated locally within selected bins, as described below.
Wrapped distributions are useful for modeling angular variables such as $\phi$, the
relative angle between mainshocks and aftershocks (Mardia and Jupp 2000). Such
a distribution is obtained by conceptually wrapping a distribution on the real line
around the circumference of a unit circle. That is, if $x$ is a real random variable
with an arbitrary probability density function $f$, then the wrapped analogue of
$f$ has density
$$f_w(x_w) = \sum_{k=-\infty}^{\infty} f(x_w + 2\pi k),$$
where
$$x_w = x \ (\mathrm{mod}\ 2\pi), \qquad x_w \in [0, 2\pi).$$
This approach can be applied to any probability distribution to manufacture

a large class of circular distributions, among which the wrapped Gaussian and
wrapped Cauchy are examples. However, one shortcoming common to most such
wrapped distributions is the lack of a closed form expression, which renders parameter estimation difficult. One exception is the wrapped exponential (WE) in
which the infinite series converges and has a remarkably simple solution (Jammalamadaka and Kozubowski 2001). The WE has been used to model seismic
events triggered by periodic processes (Jupp et al. 2004), and is obtained by
applying the wrapping procedure to the exponential density, $f(x) = \lambda e^{-\lambda x}$, $x > 0$.
Since $\phi$ in this analysis only spans $[0, \pi/2]$ by construction, it will cycle on a
quarter circle instead of a full circle. On a quarter circle, the WE has density
$$f_{we}(\phi) = \frac{\lambda e^{-\lambda \phi}}{1 - e^{-\lambda \pi/2}}, \quad \phi \in [0, \pi/2].$$
Note that this is equivalent to the density of a truncated exponential random
variable on the line, i.e. $f(X \mid X < \pi/2)$, due to the memoryless property of the
exponential distribution.
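The closed form can be checked numerically. The sketch below, which assumes a rate parameter named `lam` and an arbitrary illustrative value for it, compares the closed-form quarter-circle density with a truncated version of the wrapping series, where the sum runs over k >= 0 because the exponential density vanishes for negative arguments and the wrapping period is pi/2 rather than 2*pi.

```python
import math

QUARTER = math.pi / 2.0  # wrapping period on the quarter circle


def we_density(phi, lam):
    """Closed form: lam * exp(-lam*phi) / (1 - exp(-lam*pi/2)) on [0, pi/2)."""
    return lam * math.exp(-lam * phi) / (1.0 - math.exp(-lam * QUARTER))


def wrapped_sum(phi, lam, n_terms=200):
    """Truncated wrapping series: sum over k >= 0 of f(phi + k*pi/2),
    with f(x) = lam * exp(-lam * x)."""
    return sum(lam * math.exp(-lam * (phi + k * QUARTER)) for k in range(n_terms))


lam = 1.5  # illustrative value, not an estimate from the data
for phi in (0.0, 0.4, 1.2):
    print(phi, we_density(phi, lam), wrapped_sum(phi, lam))
```

The series is geometric in k, which is why it collapses to the simple closed form.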
The WE has a single shape parameter $\lambda$. When $\lambda = 0$, the WE corresponds to
a uniform distribution on its support. As $\lambda$ increases, the skewness of the distribution increases, and as $\lambda$ approaches $\infty$, the distribution degenerates to a point
mass at $\phi = 0$. In the context of this analysis, $\lambda$ can be thought of as an aftershock
azimuthal concentration parameter. With only one parameter, the WE provides
a very good fit to the conditional distribution of $\phi$ given r. Evidence suggests,
however, that the conditional distribution of $\phi$ changes depending on the value
of r. One way to model its dependency on r is to let the parameter $\lambda$ vary as a
function of r. I propose to estimate the value of $\lambda$ locally by maximum likelihood
(ML) successively on different bins, each containing n pairs of points, sorted according to the distance r between mainshock and aftershock; in order to estimate
$\lambda$ more accurately within each bin, I use all possible mainshock-aftershock pairs,
weighting each pair by its probability $\rho_{i,j}$ of being an actual mainshock-aftershock
pairing. I experimented with several choices of n and selected n = 200 in order
to achieve a satisfactory bias-variance tradeoff.
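The binned, weighted estimation can be sketched as follows. This is an illustration under stated assumptions rather than the code used for this analysis: the weighted log-likelihood of the truncated exponential on [0, pi/2] is maximized by matching the model mean to the weighted sample mean of the angles, the root is found by bisection, the rate is constrained to be non-negative, and the bin centre is taken as the unweighted mean of r. The function names and the simulated inputs are hypothetical.

```python
import math
import random

C = math.pi / 2.0  # upper end of the support of the relative angle


def trunc_exp_mean(lam, c=C):
    """Mean of an exponential(lam) variable truncated to [0, c]."""
    x = lam * c
    if x > 700.0:  # exp(x) would overflow; the correction term is negligible
        return 1.0 / lam
    return 1.0 / lam - c / math.expm1(x)


def weighted_mle_lambda(phis, weights, c=C, hi=1e3, iters=200):
    """Weighted ML estimate: solve trunc_exp_mean(lam) = weighted mean of phi."""
    target = sum(w * p for w, p in zip(weights, phis)) / sum(weights)
    if target >= c / 2.0:  # no more concentrated than uniform: boundary estimate
        return 0.0
    if target <= trunc_exp_mean(hi, c):
        return hi
    lo = 1e-8
    for _ in range(iters):  # bisection; the truncated mean decreases in lam
        mid = 0.5 * (lo + hi)
        if trunc_exp_mean(mid, c) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def local_estimates(r, phi, weight, n_per_bin=200):
    """Sort pairs by r, split into consecutive bins, and estimate lam in each bin."""
    order = sorted(range(len(r)), key=lambda i: r[i])
    out = []
    for start in range(0, len(order) - n_per_bin + 1, n_per_bin):
        idx = order[start:start + n_per_bin]
        mean_r = sum(r[i] for i in idx) / len(idx)
        lam = weighted_mle_lambda([phi[i] for i in idx], [weight[i] for i in idx])
        out.append((mean_r, lam))
    return out


# Simulated check with a known rate; the weights stand in for pairing probabilities.
rng = random.Random(1)
true_lam = 2.0
phis = [-math.log(1.0 - rng.random() * (1.0 - math.exp(-true_lam * C))) / true_lam
        for _ in range(20000)]
weights = [rng.random() for _ in phis]
print(weighted_mle_lambda(phis, weights))
```

Smaller bins track changes in the rate with r more closely but give noisier estimates, which is the bias-variance tradeoff behind the choice of n.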
3.6 Alternative Aftershock Spatial Distributions
3.6.1 The Normal Model
Two alternative models are considered in this paper for comparison. The first
model is the normal model and the second is the spatial smoothing kernel of
Kagan and Jackson (1994). The normal model has been proposed to describe the
relative locations of aftershocks and is thus directly applicable to the problem
at hand. Kagan and Jackson's model, by contrast, was suggested in a slightly
different context, as explained below, but may nevertheless constitute a relevant
model for comparison.
The normal model is often used as a conveniently simple spatial distribution
for aftershock sequences (Rhoades and Evison 1993, Kagan 2002, Ogata 1998).
A slight modification was made by Ogata (1998), who introduced an anisotropic
function based on the normal model as a spatial extension of his earlier ETAS
model (Ogata 1988), and proposed a separate normal model to be fit to each
aftershock sequence in the de-clustered catalogue, using an anisotropic metric
that replaces the Euclidean metric. Compelling evidence to support the normal
model is the commonly seen elliptical shape of aftershock zones (Rhoades and
Evison 1993).
3.6.2 The Kagan-Jackson Model
Kagan and Jackson (1994) estimated the long-term rate densities for earthquakes
as a weighted sum of smoothing kernels, each centered at the epicenter of a
previous (ith) earthquake, using information on the earthquakes' focal mechanisms.
Adopting current notation, the density at point (x, y) is estimated in Kagan and
Jackson (1994) as
$$f(x, y) = \sum_i f_i(r_i, \phi_i, M_i) + s,$$
where s = 0.02 is a small constant to allow for surprises far from past earthquakes. Of interest here is their proposed smoothing kernel
$$f_i(r_i, \phi_i, M_i) = A\,(M_i - M_{cut})\,\frac{1}{r_i}\left[1 + \delta \cos^2(\phi_i)\right],$$
where A is a normalization constant, $M_i$ is the magnitude of the ith earthquake, and
$\delta$ is a parameter controlling the degree of azimuthal concentration in a direction
relative to the earthquake's focal mechanism (Kagan and Jackson 1994). We refer
to this model as the KJ model in what follows. Kagan and Jackson used their own
expert knowledge to choose $\delta$. In a region where the concentration of aftershocks
along the fault plane is high, $\delta$ should be assigned a high value, and $\delta$ should
be small if the earthquakes are dispersed relatively isotropically. The
smoothing kernel was selected as a result of the analysis in Kagan (1992) of focal
mechanisms and the distribution of hypocenters on the focal sphere. Although
the model was motivated by analysis of subduction zone earthquakes, Kagan et
al. (2007) recently applied this model to seismicity in southern California using
$\delta = 100$. In the current analysis, the value of $\delta$ is selected via maximum likelihood.
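To make the role of the concentration parameter concrete, the sketch below evaluates the kernel A (M - M_cut) (1/r) [1 + delta cos^2(phi)] along and across the fault-plane direction. The function name, the placeholder normalization constant of 1 and the inputs are assumptions for illustration; only the value delta = 100 is taken from the text.

```python
import math


def kj_kernel(r, phi, magnitude, delta, m_cut, norm_const=1.0):
    """KJ kernel A * (M - M_cut) * (1/r) * (1 + delta * cos(phi)**2).

    norm_const is a placeholder for the normalization constant A, which
    depends on the study region and is not computed here.
    """
    if r <= 0.0:
        raise ValueError("r must be positive; the 1/r factor is singular at r = 0")
    return norm_const * (magnitude - m_cut) / r * (1.0 + delta * math.cos(phi) ** 2)


# Same distance and magnitude, two directions relative to the fault plane.
along = kj_kernel(r=10.0, phi=0.0, magnitude=5.0, delta=100.0, m_cut=3.0)
across = kj_kernel(r=10.0, phi=math.pi / 2.0, magnitude=5.0, delta=100.0, m_cut=3.0)
print(along / across)  # ratio of densities along versus across the fault plane
```

At fixed distance and magnitude the ratio is 1 + delta, so the angular contrast of the kernel is the same at every distance, unlike the wrapped exponential with a rate that varies with r.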
3.7 Residual Analysis
3.7.1 Quadrat Residuals
I focus on a subset of relative mainshock-aftershock locations in $[0, 20] \times [0, 20]$ km,
within which I compare the goodness-of-fit for the different models. One method
of assessing the goodness-of-fit of a spatial density for a point process is by examining the residuals over various quadrats, as suggested in Baddeley et al. (2005).
That is, one partitions the space into cells and calculates the residuals in each
cell, which may be standardized in various ways. For instance, the residual $R_i$
corresponding to cell i may be defined as
$$R_i = \frac{N_i - E_i}{\sqrt{E_i}},$$
where $N_i$ is the observed number of points in the cell, and $E_i$ is the expected
number of points under the model. $E_i$ can be found by integrating the intensity f over
cell i.
These are effectively Pearson residuals in the Poisson log-linear regression context. By construction, the residuals are standardized to have mean 0 and standard
deviation approximately 1. Outliers and systematic patterns in the residuals may
indicate lack of fit.
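A minimal sketch of this calculation on a rectangular grid follows. It assumes the model is supplied as an intensity function of (x, y) and approximates each cell integral with a midpoint rule; the function name, grid size and the homogeneous test intensity are illustrative choices, not those of the analysis.

```python
import math
import random


def quadrat_residuals(points, intensity, x_max, y_max, nx, ny, sub=10):
    """Residuals (N_i - E_i) / sqrt(E_i) on an nx-by-ny grid over [0, x_max] x [0, y_max]."""
    dx, dy = x_max / nx, y_max / ny
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        if 0.0 <= x < x_max and 0.0 <= y < y_max:
            col = min(int(x / dx), nx - 1)
            row = min(int(y / dy), ny - 1)
            counts[row][col] += 1
    residuals = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            # Midpoint-rule approximation of the integral of the intensity over cell (i, j).
            expected = 0.0
            for a in range(sub):
                for b in range(sub):
                    xm = (i + (a + 0.5) / sub) * dx
                    ym = (j + (b + 0.5) / sub) * dy
                    expected += intensity(xm, ym)
            expected *= (dx / sub) * (dy / sub)
            residuals[j][i] = (counts[j][i] - expected) / math.sqrt(expected)
    return residuals


# Illustrative check: 200 uniform points against a homogeneous intensity of 0.5 per km^2.
rng = random.Random(2)
pts = [(rng.uniform(0, 20), rng.uniform(0, 20)) for _ in range(200)]
res = quadrat_residuals(pts, lambda x, y: 0.5, 20.0, 20.0, nx=4, ny=4)
for row in res:
    print(["%.2f" % v for v in row])
```

Cells where the model assigns a very small expected count produce very large residuals whenever a point falls in them, so such outliers are informative about the tails of the fitted density.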
3.7.2 Weighted K-function
The K-function described by Ripley (1981) is commonly used to detect excessive
clustering or inhibition in a point process. The function K(h) is defined as
the average number of additional points within h of any given point, divided
by the overall rate. The null hypothesis is that the underlying point process is
homogeneous Poisson. In cases where the null model is not homogeneous, each point
may be weighted according to the rate of the point process in question, yielding
the weighted or inhomogeneous K-function (Baddeley et al. 2000). The weighted
K-function has been used by Veen and Schoenberg (2005) to assess the spatial
distributions in point process models for earthquakes in Southern California. To
test the null hypothesis that the spatial intensity of points (mainshock-aftershock
pairs) in region D is $f_0(x, y)$, the weighted K-function may be defined as
$$K_W(h) = \frac{1}{f_*^2 N} \sum_i w_i \sum_{j \neq i} w_j \, 1(|p_i - p_j| \le h),$$
where N is the total number of observed pairs of points, $f_* := \inf\{f_0(x, y); (x, y) \in D\}$ is the infimum of the density over the observed region, $1(\cdot)$ is an indicator
function, and $w_r = f_*/f_0(p_r)$, where $f_0(p_r)$ is the modeled density of pairs of
points at vector distance $p_r$ apart. Veen and Schoenberg (2005) verified that
for the Poisson case where $f_0$ is locally constant on distinct subregions whose
areas are large relative to the interpoint distance $h_n$, the weighted K-function
is approximately normal with mean $\pi h^2$ and variance $2\pi h^2 A/[E(N)]^2$, where A
is the size of the area being studied and N is the number of points observed
in A. One common issue in applying the K-function or weighted K-function is
the problem of boundary correction. One method of edge correction is to make
mirror images of the points (where each point is a mainshock-aftershock pair)
along the boundaries over which these points are observed, as suggested e.g. in
Ripley (1981). Since all observed points are restricted to the first quadrant of
the plane, I reflect each point along both the x- and y-axes.
Statistical inference is drawn from simulation-based confidence bounds. 4,000
samples are taken and a weighted K-function is estimated for each sample. Based
on these weighted K-functions, I form 95% pointwise confidence bounds to make
statistical inferences.
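The double sum in the weighted K-function can be sketched directly. The code below is illustrative only: it takes the normalization 1/(f_*^2 N) from the expression in the text, expects the modeled density as a function of (x, y), and omits the edge correction and the simulation of confidence bounds. The function and argument names are hypothetical.

```python
import math


def weighted_k(points, f0, f_star, h):
    """Weighted K-function at distance h.

    Computes 1/(f_star**2 * N) * sum_i w_i * sum_{j != i} w_j * 1(|p_i - p_j| <= h),
    with weights w_r = f_star / f0(p_r). No edge correction is applied.
    """
    n = len(points)
    w = [f_star / f0(x, y) for x, y in points]
    total = 0.0
    for i in range(n):
        xi, yi = points[i]
        for j in range(n):
            if j != i and math.hypot(xi - points[j][0], yi - points[j][1]) <= h:
                total += w[i] * w[j]
    return total / (f_star ** 2 * n)


# Tiny illustrative configuration with a constant modeled density.
pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
print(weighted_k(pts, lambda x, y: 1.0, 1.0, h=1.5))
```

Points in regions where the model density is high receive small weights, so a model that over-predicts over a wide area pulls the estimate down, while observed points in regions of near-zero modeled density inflate it sharply.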
3.8 Results
Figure 3.2 shows a subset of mainshock-aftershock pairs as described in Section
3.3. Not surprisingly, it is immediately noticeable that the concentration of points
is much higher near the x-axis (i.e. the fault plane) than elsewhere. One may
observe that aftershocks seem to be distributed nearly uniformly in all directions
when they are close to the respective mainshock, whereas when they are further
away, they tend to lie more predominantly along the azimuth of the fault plane
(i.e. along the x-axis). One may also observe the discreteness in the observed
distances r between mainshocks and their aftershocks, for pairs that are very
close together; this is a result of the resolution of measurements recorded in the
catalog. Note that rounding errors in the locations, compounded by estimation
errors in epicenter locations, may translate into large errors in the estimation of
$\phi$, especially for small values of r, where a tiny change in location will translate
into a large change in $\phi$. Willemann and Frohlich (1987) and Michael (1989)
discarded aftershocks within 5 km of each mainshock, in order to avoid dealing
with these noisy observations.
Figure 3.3 shows the survival functions of r, a fitted Pareto (i.e. inverse power
law), and a fitted tapered Pareto distribution, on log-log scale. The diagram
indicates the Pareto offers a good fit to the data for r < 50 km, coinciding with
the observation made by Felzer et al. (2006). The tapered
Pareto distribution fits the data equally well in the same range; however, its
drastically better fit to the tail of the distribution renders it a preferred alternative.
Figure 3.4 displays the conditional histogram of $\phi$ for three different ranges
of r: a) 10 km $\le$ r < 15 km, b) 5 km $\le$ r < 10 km, and c) 0 km $\le$ r < 5 km. For
r less than 5 km, the distribution of $\phi$ appears almost uniform. As noted earlier,
for small r values, $\phi$ is very sensitive to errors in the locations. Hence a possible
explanation for its uniform distribution in this range of r is that it is dominated
by noise. In higher ranges of r values, the skewness of the distribution increases
as $\phi$ tends to concentrate at low values. Overlaid on the histograms are fitted WE
densities. We can see that the WE is able to capture the shape of the conditional
histograms in different regions and seems to fit each of the densities rather well.
The local behavior of $\lambda$ is sensitive to the value of r, as shown in Figure
3.5, which displays the local weighted maximum likelihood estimates of $\lambda$ plotted
against the mean distance in the corresponding bins. One sees that when r is
small, the estimates of $\lambda$ are unstable and seem again dominated by noise. As
r increases, $\lambda$ climbs steadily until roughly r = 18 km before it declines slowly.
For the purposes of interpolation, extrapolation, or forecasting, one may seek
a parametrization of the estimates of $\lambda$. The F-distribution provides a good
approximation to the shape of the estimates of $\lambda$. Let $f_{v_1,v_2}(x)$ be the density
function of the F-distribution with $v_1$ and $v_2$ degrees of freedom; the function
$2.7 f_{10,600}(x/22)$ is found to be the best fit by least squares among functions of
comparable form. The fitted function is superimposed on the estimates of $\lambda$ in
Figure 3.5.
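The fitted curve can be evaluated with a few lines of code. The sketch below writes out the standard F density using log-gamma functions and evaluates 2.7 f_{10,600}(r/22) on a grid of distances; the function names and the grid are illustrative. Since the mode of an F-distribution with (v1, v2) degrees of freedom is ((v1 - 2)/v1)(v2/(v2 + 2)), the curve should peak near 22 x 0.797, about 17.5 km, consistent with the rise until roughly 18 km noted above.

```python
import math


def f_density(x, d1, d2):
    """Density of the F-distribution with (d1, d2) degrees of freedom."""
    if x <= 0.0:
        return 0.0  # adequate here since d1 > 2; the density vanishes at the origin
    log_beta = (math.lgamma(d1 / 2.0) + math.lgamma(d2 / 2.0)
                - math.lgamma((d1 + d2) / 2.0))
    log_pdf = (0.5 * d1 * math.log(d1 * x) + 0.5 * d2 * math.log(d2)
               - 0.5 * (d1 + d2) * math.log(d1 * x + d2)
               - math.log(x) - log_beta)
    return math.exp(log_pdf)


def lambda_of_r(r):
    """Parametrization quoted in the text: 2.7 * f_{10,600}(r / 22)."""
    return 2.7 * f_density(r / 22.0, 10, 600)


grid = [0.1 * k for k in range(1, 601)]  # r from 0.1 to 60 km
peak_r = max(grid, key=lambda_of_r)
print(peak_r, lambda_of_r(peak_r))
```

Such a curve could be used to interpolate the concentration parameter at distances between bin centres, though extrapolating far beyond the fitted range would rest entirely on the assumed functional form.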
3.8.1 Fit of proposed models
Figure 3.6 displays the density surfaces on logarithmic scale for the fitted KJ,
normal, and TPWE models, to relative mainshock-aftershock locations. A comparison between these surfaces reveals characteristic differences and similarities
between the models. The KJ model has a rather sharp peak that reaches about
the same height as the normal model. The KJ density decays very slowly outward and, as a result, retains a substantial density through the entire region.
Its shape mimics a 2-petalled rose that expands along the x-axis and tightens
along the y-axis. The normal model in contrast is comparatively smooth near
the origin and has relatively flat tails, obtaining densities that are very close to
zero outside the visible contours. The contour lines themselves, for the normal
model, all have an elliptical shape that seems to resemble aftershock zones. The
TPWE model can perhaps be viewed as a hybrid of the KJ and normal models.
On the one hand, it possesses a sharp peak, like the KJ model (though the TPWE's
peak density is much higher than that of the KJ model). On the other hand,
its tail is quite thin and the visible contour lines cover roughly the same area as
the normal density. In addition, the bow-tie shape of the contours of the TPWE
model seems to have features that encompass characteristics of both the KJ and
normal models. While the TPWE model is not meant to be a compromise of
the other two models by construction, it nevertheless shares some similarity with
both of them.
3.8.2 Diagnostics and Model Comparison
The absolute values of the quadrat residuals of the KJ, normal, and TPWE
models are shown in Figure 3.7, on a logarithmic scale in order to facilitate
visualization. One sees immediately that the normal model has several outlying
residuals of very large size. This is due to the occurrence of mainshock-aftershock
pairs at relative distances where the normal model assigns a density very close to
zero. By contrast, the KJ model assigns a substantial density to these outliers.
However, it does so at the cost of having a substantial density throughout the
entire region. As a result, the model over-predicts in most of the upper half
of the left panel of Figure 3.7, at relative distances where very few mainshock-aftershock pairs were observed. The proposed TPWE joint distribution seems
to achieve a balance between the other two. On the one hand, its outlying
residuals are several orders of magnitude smaller than those in the normal model;
on the other hand, the TPWE model's residuals are much smaller than those of the KJ
model in relative locations where observations are rare. Near the origin, both the
normal and KJ models tend to under-predict the density. For instance, in the KJ
model, there is a vertical cluster of large residuals in the region above the origin,
indicating a systematic lack of fit. A similar problem is seen in the residuals of
the normal model. The peaks in these two densities are too low, relative to the
observed mainshock-aftershock pairs. The TPWE model, by contrast, has much
smaller residuals near the origin, indicating superiority of fit.
The weighted K-functions for the three models are shown in Figure 3.8. Also
plotted are the theoretical mean and simulation-based pointwise 95% confidence
bounds. There is serious departure from the confidence bounds in both the KJ
and normal models, indicating statistically significant lack-of-fit. The weighted
K-function for the KJ model is below the lower threshold of the 95% confidence
bounds for all values of h, as a result of its wide-area over-estimation of aftershock
density. For the normal model, the weighted K-function is plotted on a logarithmic
scale because the estimates of $K_W$ are orders of magnitude above the upper
bounds of the 95% confidence intervals. Such a dramatic departure is a result
of serious underestimation of the density at the origin as well as a few outlier
locations where aftershocks are observed. The TPWE does not seem to have
systematic over-estimation or under-estimation of the density of relative distances
between mainshocks and their aftershocks, nor is there any serious indication in
Figure 3.8 of clustering or inhibition of the mainshock-aftershock pairs relative
to this joint distribution.
3.9 Additional Topics
This section explores two additional topics in seismology that
are relevant to understanding aftershock spatial distribution: scaled distance and
relocation catalogs. These topics provide alternative ways of analyzing aftershocks
and are subjects of ongoing research. Although an in-depth examination is
beyond the scope of this dissertation, this section attempts to explore these topics
in relation to the analysis of aftershocks in this chapter and estimate the impact
they may have on the results.
3.9.1 Scaled Distance
It has long been a subject of debate whether the spatial distribution of aftershocks
is independent of mainshock magnitude. Previous authors have reached different conclusions using different criteria (Ogata 1998, Kagan 2002, Huc et al.
2003, Davidsen et al. 2005, Felzer et al. 2006). Given this controversy,
this section repeats part of the analysis in this chapter using a scaled distance
in an attempt to capture possible scaling effects due to mainshock magnitude.
One way to relate aftershock distances to mainshock magnitudes is to measure
distances in terms of mainshock rupture lengths (Felzer et al. 2006), which have
been shown to be closely related to magnitude for large events (Wells et al. 1994).
We estimate the surface rupture length (L) of a fault from empirical relationships
(Wells et al. 1994) and express the scaled distance in terms of fault lengths,
r/L. The marginal distribution of r/L and the conditional distribution of $\phi$ with
respect to r/L are investigated in the same manner as in the case for r.
The marginal distribution of scaled distance r/L and the corresponding conditional distribution of $\phi$ are shown in Figures 3.9 and 3.10 respectively. The fit of
the tapered Pareto to r/L is slightly worse than is the case for r but still appears
reasonable. In comparison, the Pareto distribution systematically deviates from the
empirical survival function over the entire range of data, rendering it a poor fit.
The conditional distribution of $\phi$ with respect to r/L shows strong resemblance
to its counterpart for unscaled distance. The estimates of $\lambda$ are low in the upper
and lower ranges of r/L and are largest at roughly 40 fault lengths. The shape
of the estimates seems again well captured by an F-distribution. Although it is
difficult to infer from Figures 3.9 and 3.10 whether the distribution of aftershocks
is dependent on mainshock magnitude, both figures seem to suggest that TPWE
is applicable to both scaled and unscaled distances, at least for the range of data
considered.
3.9.2 Relocation Catalog
The use of relocation catalogs for the study of aftershock spatial distribution
is open to debate. Despite the drastically reduced location errors in relocated
events, some authors have argued they may not be well suited to such studies.
Some reasons suggested by Kagan (2002) include 1) increased effort needed to
reinterpret the data, 2) bias and statistical dependence that may be introduced in
the location estimates, and 3) difficulty in communicating the relocation procedure and in reproducing the experiment. In spite of this, due to the more precise
locations, events in a relocation catalog are often found to align in linear and/or
planar structures suggestive of faults. It is therefore interesting as a descriptive
exercise to investigate the impact of relocation on the estimation of aftershock
concentration along mainshock fault planes. If clustering is indeed strong, larger
estimates of $\lambda$ will be expected.
I examine relocated events in Southern California (Shearer et al. 2005)
between 1984 and 2002 with M > 3.3 and their focal mechanisms (Hardebeck et
al. 2003) with quality A or B. Both catalogs are available from SCEDC at
http://www.data.scec.org/research/altcatalogs.html.
New estimates of $\lambda$ are obtained using the relocation catalog in a similar fashion
as before and are compared to the previous estimates of $\lambda$ using the SCEDC
catalog. While the relocation catalog covers a longer period than does the SCEDC
catalog and has a higher magnitude cut-off, it should still provide a meaningful
comparison on the clustering of aftershocks along the fault direction.
Estimates of $\lambda$ from a relative relocation catalog are shown in Figure 3.11. Of
interest is the magnitude of $\lambda$ as compared to the SCEDC catalog, particularly in
the lower range of r. Despite improved location estimates, the relative relocation
catalog does not substantially increase the clustering of nearby aftershocks along
the fault direction as reflected in $\lambda$. A possible explanation is that while relocation
decreases location errors slightly for nearby events, it does not eliminate them,
and when the distance from the mainshock is small, $\phi$ is still very susceptible
to noise. Only at larger distances, such as r > 30 km, does one observe stronger
clustering due to more precise location estimates.
3.10 Discussion
This chapter explores focal mechanism as a means to describe the anisotropy of
aftershock spatial distribution. Using the strike angle as a proxy for the fault
plane, aftershocks are found to lie preferentially along the mainshock fault plane for
southern California seismological data. The tapered Pareto / wrapped exponential (TPWE) model appears to adequately describe the locations of aftershocks
relative to the mainshock's focal mechanism. Both residual analysis and the weighted
K-function suggest that TPWE vastly outperforms
competing models such as the Kagan-Jackson and normal models.
It must be emphasized, however, that this analysis is performed using only
Southern California strike-slip earthquakes of quality A or B as candidates for
mainshocks. The effect of using only strike-slips as triggering events is that aftershocks of other types of mainshocks (e.g. dip-slip events) are now mostly
identified as background events, though some may be falsely identified as aftershocks to some strike-slip event. In other seismic regions, especially in areas
where the faulting is more heterogeneous, the TPWE model might not fit well,
and an important direction for future work is the investigation of the fit of such
models in other seismically active zones.
It should also be noted that earthquakes are treated as point sources in this
analysis. An important topic for future research is the investigation of models
for aftershock distances based on estimating the actual segment of fault which
ruptured for each earthquake and calculating minimal distances between such
segments. Related to this is the sensitivity of aftershock spatial distribution
to mainshock magnitude. This analysis considers an additional scaled distance
based on empirical relations in an attempt to capture possible scaling effects
due to mainshock magnitude. Nevertheless, such relations are only based on approximation and may not be accurate for small- and medium-sized mainshocks.
A topic for future research is a more refined and thorough approach that considers mainshocks of various sizes separately. Lastly, as depth measurements for
earthquakes become increasingly accurate and other features of earthquake faults
become discernible, including possible directivity of aftershocks, such effects may
also be taken into account to reflect a more realistic and accurate description of
aftershock spatial distribution.
[Figure 3.1 diagram labels: auxiliary plane, T-axis, aftershock, r, fault plane, P-axis.]
Figure 3.1: A beach ball diagram illustrating the definition of relative aftershock
location with respect to mainshock focal mechanism. r represents the epicentral
distance between the mainshock and aftershock, and $\phi$ measures the aftershock's
angular separation from the mainshock fault plane.
Table 3.1: ETAS parameters used in stochastic aftershock assignment.
(M 1 )
(days)
(km)
.255
.346
2.903
K0
1.324
1.305
(shocks/day/km2 )
1.888 107
1.008
[Figure 3.2 plot: scatterplot with horizontal axis strike distance (km).]
Figure 3.2: Scatterplot of mainshock-aftershock relative locations, with respect
to the mainshock's fault plane. Only a subset is displayed in a $20 \times 20$ km$^2$
window.
[Figure 3.3 plot: S(r) on log scale against r (km) on log scale; legend: data, Pareto, tapered Pareto.]
Figure 3.3: Survival function $(1 - F\{r\})$ for mainshock-aftershock distances, r.
The tapered Pareto model appears to fit r much better than does the Pareto
model.
[Figure 3.4 plots: three histogram panels of density against $\phi$ (degrees), titled 10 km < r < 15 km, 5 km < r < 10 km, and r < 5 km.]
Figure 3.4: Histograms of relative angle ($\phi$) between mainshocks and aftershocks,
arranged according to distance r between mainshocks and aftershocks. Top panel:
10 km $\le$ r < 15 km; middle panel: 5 km $\le$ r < 10 km; bottom panel: 0 km $\le$ r <
5 km. Note that $\phi$ is rather uniformly distributed for r < 5 km.
[Figure 3.5 plot: estimates against mean r (km); legend: F-distribution.]
Figure 3.5: Estimates of the aftershock azimuthal concentration parameter $\lambda$, as
a function of distance r between mainshock and aftershock. The gray dots are estimates obtained by bins of 200 mainshock-aftershock pairs each; the black dotted
curve shows the parameterization using the density function of an F-distribution.
[Figure 3.6 plots: three density surface panels with a logarithmic color scale.]
Figure 3.6: Density plots on logarithmic scale corresponding to three models for
mainshock-aftershock relative locations. Left: KJ model; middle: normal model;
right: TPWE model.
[Figure 3.7 plots: three panels of quadrat residuals with a logarithmic color scale.]
Figure 3.7: Quadrat residuals from each of the three models for mainshock-aftershock relative locations. Left: KJ model; middle: normal model; right: TPWE
model.
[Figure 3.8 plots: three panels of $\hat{K}(h)$ against h (km); legend: empirical weighted K-function, 95% pointwise confidence bounds, theoretical mean function.]
Figure 3.8: Weighted K-functions corresponding to three models for mainshock-aftershock relative locations. Top: KJ model; middle: normal model; bottom:
TPWE model.
[Figure 3.9 plot: S(r) on log scale against r/L on log scale; legend: data, Pareto, tapered Pareto.]
Figure 3.9: Survival function $(1 - F\{r/L\})$ for normalized mainshock-aftershock
distances, r/L.
[Figure 3.10 plot: estimates against mean r/L; legend: F-distribution.]
Figure 3.10: Estimates of the aftershock azimuthal concentration parameter $\lambda$,
as a function of scaled distance r/L. The gray dots are estimates obtained by
bins of 200 mainshock-aftershock pairs each; the black dotted curve shows the
parameterization using the density function of an F-distribution.
[Figure 3.11 plot: estimates against mean r (km).]
Figure 3.11: Estimates of the aftershock azimuthal concentration parameter $\lambda$
from a relocation catalog. The gray dots are estimates obtained by bins of 200
mainshock-aftershock pairs each.
CHAPTER 4
Focal Mechanism-dependent Anisotropic Spatial Kernel for Space-Time Earthquake Point Process Models
4.1 Background
Spatial-temporal Epidemic-Type Aftershock Sequence (ETAS) models for earthquake occurrences were proposed by Ogata (1998) and have since been widely
used to characterize modern catalogs of seismicity. The initial, simple versions
of these models posit an isotropic spatial distribution of aftershocks around each
mainshock. However, such a model fails to account for the anisotropic spatial distribution of aftershocks that has been widely observed since Utsu (1969). Indeed,
Ogata (1998) acknowledges the need for an anisotropic spatial kernel and suggests altering the spatial decay function in ETAS models with ellipsoidal contours
corresponding to a bivariate normal fitted to aftershock regions.
Although the normal model is a convenient choice and may serve as a first
order approximation to aftershock spatial distribution, it is shown in Chapter 3
to suffer from several serious issues. Recently, modern catalogs of earthquakes
containing seismic moment tensor estimates for many of the events have become
available. These estimates, especially the resulting estimates of focal mechanism,
appear to be quite effective at describing the anisotropic spatial distribution
of aftershocks (see Chapter 3 and the references therein). Nevertheless, such
information has not, to my knowledge, previously been used in ETAS models. A
primary purpose of the present chapter is therefore to explore ETAS models
that incorporate information regarding focal mechanism, and in particular the
orientation of earthquakes, in forecasting subsequent seismicity.
In the previous chapter, TPWE was proposed as a spatial distribution for
the relative aftershock locations with respect to mainshock focal mechanism.
The present chapter proposes general anisotropic extensions of space-time point
processes such as ETAS to incorporate focal mechanism information. A somewhat similar procedure was explored by Kagan and Jackson (1994), who formed
long-term seismic hazard maps by smoothing past seismicity in Southern California using an anisotropic spatial smoothing kernel that depends on the past
earthquakes' focal mechanisms. Here, the TPWE is used as an example of an
anisotropic, focal mechanism-dependent spatial kernel. The effectiveness of using
focal mechanism in modeling earthquake occurrences is assessed by comparing
ETAS models with isotropic and anisotropic spatial kernels.
A secondary purpose of this chapter is to explore a new technique for goodness-of-fit assessment of space-time point process models. The proposed procedure
involves inspecting differences between competing models in their contributions to the
log-likelihood over pixels; these differences are called deviance residuals because of their
connection with deviances in the context of generalized linear models.
The resulting residual plots are a small extension of those proposed by Baddeley et
al. (2005) for the purpose of assessing purely spatial point processes by examining
their behavior over pixels. The key idea of the method proposed here is to assess
the relative performance of competing models, i.e. to inspect the differences
between each model's residuals. This graphical tool appears to be quite effective
at portraying the relative fit of models to space-time point process data, and is
used to illustrate the advantages and disadvantages of extended ETAS models
compared to alternative models.
The format of this chapter is as follows. Section 4.2 provides background
information on models for aftershock behavior such as ETAS models. Section
4.3 details the mathematical relation between mainshock fault plane strike angle
and relative aftershock angle as defined in this analysis. Extended ETAS models
are proposed in Section 4.4. Section 4.5 discusses parameter estimation. Section
4.6 summarizes some diagnostic methods for purely spatial and spatial-temporal
point process models. A deviance-based residual method is proposed in Section
4.7 and is used in Section 4.8 to assess the relative fit of ETAS models
described in Section 4.4. Section 4.9 is the concluding section for this chapter.
4.2 Epidemic Type Aftershock Sequence Models
Point process models have proved to be an efficient tool for modeling earthquake
occurrences. Their use in seismology was pioneered by Vere-Jones (1970,
1975); Ogata (1999) gives a nice review. Today, the ETAS model (Ogata 1988,
1998) is considered to be the standard branching process model in seismology.
First introduced by Ogata (1988) to describe the intense temporal clustering
observed in earthquake occurrences, the ETAS model is a type of branching or
self-exciting point process. Early applications of self-exciting point processes to
earthquake occurrence models can be found in Hawkes and Adamopoulos (1973)
as well as in Lomnitz (1974, Chapter 7). Modeling earthquake occurrences using
a self-exciting point process implies the separation of the seismicity into a long-term background component and a short-term, time-dependent clustering component
$\sum_{\{i: t_i < t\}} g(t | t_i)$, which represents the aftershock activity. The clustering
component describes seismic activity as a superposition of earthquake clusters,
where each earthquake is either a mainshock or an aftershock, and the aftershocks
can themselves trigger further aftershocks, resulting in branching behavior. Note
that since the precise branching behavior of observed earthquakes is often difficult
to determine and open to debate, we use the terms mainshock and aftershock
here purely in reference to the branching behavior described by the ETAS model.
Spatial-temporal versions of the ETAS model were described in Ogata (1998).
The conditional intensity function of the spatial-temporal model can be written
as
$$\lambda(t, x, y \mid H_t) = \mu(x, y) + \sum_{\{j: t_j < t\}} g(t, x, y \mid t_j, x_j, y_j; M_j), \tag{4.0}$$
where $\mu(x, y)$ represents the background seismicity rate, and $g(t, x, y \mid t_j, x_j, y_j; M_j)$
is the rate of aftershocks ($M \ge M_c$) at space-time coordinate $(t, x, y)$ caused directly
by the $j$th earthquake $(t_j, x_j, y_j, M_j)$. The ETAS model accounts for the
separable effects of spatial and temporal clustering using a response function $g$
that consists of functions of magnitude, time, and space:
$$g(t, x, y \mid t_j, x_j, y_j; M_j) = \kappa(M_j)\,\nu(t - t_j)\,h(r, \theta; M_j) = \frac{K_0\, e^{\alpha(M_j - M_c)}}{(t - t_j + c)^p}\, h(r, \theta; M_j),$$
where $r$ and $\theta$ indicate the location $(x, y)$ in question, relative to the location
of the preceding event $(x_j, y_j)$, expressed in polar coordinates. That is,
$r = \sqrt{(x - x_j)^2 + (y - y_j)^2}$, $\theta = \arctan((y - y_j)/(x - x_j))$ for $x \ne x_j$, and $\theta = \pi/2$
for $x = x_j$.
In this equation, $\kappa(M)$ measures the productivity of aftershock activity relative to
the mainshock magnitude, and $\nu(t)$ is a temporal decay rate of aftershocks.
Both forms are derived from well-known empirical laws in seismology. The spatial
density of direct aftershocks, $h(r, \theta; M_j)$, is a primary subject of concern in this
chapter. The following forms for $h(r, \theta; M_j)$ were proposed by Ogata (1998):
$$h(r, \theta; M_j) = (r^2 + d)^{-q}, \tag{4.1}$$
$$h(r, \theta; M_j) = \left(\frac{r^2}{e^{\alpha(M_j - M_c)}} + d\right)^{-q}, \tag{4.2}$$
$$h(r, \theta; M_j) = \exp\left\{-\frac{1}{2}\,\frac{r^2}{d\, e^{\alpha(M_j - M_c)}}\right\}. \tag{4.3}$$
The discriminating features among the above models can be summarized in two
major ways: (1) $h$ can have short-range decay (i.e. normal) or long-range decay
(i.e. power law); (2) the spatial distribution of aftershocks around a given mainshock
can be parameterized in terms of the Euclidean distance between the two
events, or in terms of this distance scaled by an exponential function of mainshock
magnitude. Common to these models is that they all describe isotropic (i.e.
rotation-invariant) spatial clustering.
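The pieces of the response function can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the background rate is taken to be a constant `mu`, the function and parameter names are hypothetical, and the toy catalog and parameter values are made up rather than taken from the fitted models.

```python
import math

def kernel_pareto(r, d, q):
    """Isotropic power-law kernel (4.1): (r^2 + d)^(-q)."""
    return (r * r + d) ** (-q)

def kernel_scaled_pareto(r, d, q, alpha, mag, mag_c):
    """Kernel (4.2): squared distance scaled by exp(alpha * (M_j - M_c))."""
    scale = math.exp(alpha * (mag - mag_c))
    return (r * r / scale + d) ** (-q)

def kernel_scaled_normal(r, d, alpha, mag, mag_c):
    """Kernel (4.3): Gaussian-type decay with magnitude-scaled spread."""
    scale = math.exp(alpha * (mag - mag_c))
    return math.exp(-0.5 * r * r / (d * scale))

def trigger_rate(t, x, y, event, K0, alpha, c, p, d, q, mag_c):
    """Response function g for one earlier event (t_j, x_j, y_j, M_j), using kernel (4.1)."""
    tj, xj, yj, mj = event
    if tj >= t:  # only events strictly before time t contribute
        return 0.0
    r = math.hypot(x - xj, y - yj)
    productivity = K0 * math.exp(alpha * (mj - mag_c))
    temporal_decay = (t - tj + c) ** (-p)
    return productivity * temporal_decay * kernel_pareto(r, d, q)

def conditional_intensity(t, x, y, events, mu, **params):
    """Intensity (4.0) with a constant background rate mu (a simplifying assumption)."""
    return mu + sum(trigger_rate(t, x, y, ev, **params) for ev in events)

# Hypothetical toy catalog of (time, x, y, magnitude) and made-up parameters.
catalog = [(0.0, 0.0, 0.0, 4.0), (2.0, 1.0, 1.0, 3.5)]
params = dict(K0=2.0, alpha=0.0, c=1.0, p=1.0, d=1.0, q=1.0, mag_c=3.0)
rate = conditional_intensity(1.0, 0.0, 0.0, catalog, mu=0.1, **params)
```

Only the first event contributes to `rate` here, since the second occurs after the evaluation time.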
4.2.1
Anisotropic Clustering
It has long been observed that, after a given mainshock, subsequent events tend to
occur within a roughly elliptically shaped region around the mainshock (see e.g.
Utsu, 1969). Ogata (1998) lists several possible reasons, based on the dip angle of
the slipped fault of an earthquake, the proportion of the slipped length of the fault
to its width, and the location errors of aftershock hypocenters. Ogata (1998) and
Ogata and Zhuang (2002) suggest extending the spatial kernel $h(r, \theta; M_j)$ to
accommodate ellipsoidal aftershock zone contours by altering the Euclidean metric
in $h(r, \theta; M_j)$ for certain aftershock sequences that are anisotropically distributed.
One of the methods amounts to first identifying large aftershock sequences in a
catalog and fitting a bivariate normal distribution to the spatial coordinates of
the aftershocks within each sequence. For every fitted normal whose covariance
matrix $S_j$ is significantly different from the identity matrix, one replaces the term
$r^2$ in $h(r, \theta; M_j)$ by
$$(r\cos\theta,\ r\sin\theta)\, S_j\, (r\cos\theta,\ r\sin\theta)^T$$
for the corresponding mainshock so that its aftershocks are distributed with
ellipsoidal contours. Applying this to model (4.1), for instance, the altered spatial
kernel can be written as
$$h(r, \theta; M_j) = \left((r\cos\theta,\ r\sin\theta)\, S_j\, (r\cos\theta,\ r\sin\theta)^T + d\right)^{-q}.$$
Note that isotropic clustering is a special case where $S_j$ is the identity matrix.
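The effect of replacing $r^2$ by the quadratic form can be checked with a short Python sketch. The function names and the matrix `stretched` are hypothetical choices for illustration, not fitted quantities.

```python
import math

def quadratic_form(r, theta, S):
    """(r cos t, r sin t) S (r cos t, r sin t)^T for a 2x2 matrix S given as nested lists."""
    vx, vy = r * math.cos(theta), r * math.sin(theta)
    return (vx * (S[0][0] * vx + S[0][1] * vy)
            + vy * (S[1][0] * vx + S[1][1] * vy))

def kernel_elliptical(r, theta, S, d, q):
    """Power-law kernel (4.1) with r^2 replaced by the quadratic form in S."""
    return (quadratic_form(r, theta, S) + d) ** (-q)

identity = [[1.0, 0.0], [0.0, 1.0]]    # recovers the isotropic kernel
stretched = [[4.0, 0.0], [0.0, 1.0]]   # hypothetical S_j with elliptical contours

isotropic_value = kernel_elliptical(2.0, 0.7, identity, d=1.0, q=1.5)
along_x = kernel_elliptical(1.0, 0.0, stretched, d=1.0, q=1.5)
along_y = kernel_elliptical(1.0, math.pi / 2, stretched, d=1.0, q=1.5)
```

With the identity matrix the kernel no longer depends on the angle, while with `stretched` the value at the same distance differs between the two axis directions.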
Although the normal model may serve as a reasonable first-order approximation for the spatial patterns of aftershock zones, it was shown in Wong and
Schoenberg (2009) to suffer from several issues when used to describe aftershocks
across different sequences in Southern California. Even for individual clusters
as suggested by Ogata (1998), its application in a point process model may not
be optimal. For instance, the normal distribution is sensitive to outliers and it
may not fit well to observation regions that are irregularly shaped and/or difficult
to delineate. On the other hand, it may fit too well to short sequences, causing
the model to overfit. Ogata (1998) tries to prevent this problem by using aftershock clusters of no less than a minimum length. Most importantly, aftershock
identification is often a subjective process and rests heavily on human input.
As suggested by the analysis of Chapter 3, focal mechanism appears to be predictive
of aftershock pattern and may provide an alternative way to account for
the anisotropy of aftershock distribution in a point process model such as ETAS.
Compared to the normal approximation, such an approach has the advantages of
being more objective and easily reproducible.
4.3
Fault Plane Strike Angle and Relative Aftershock Angle
In Chapter 3, the relative angle of an aftershock with respect to mainshock focal
mechanism was defined as its angular separation from the nearest mainshock
fault plane. I now give a more precise mathematical definition of the relative angle $\psi$ in relation to
the fault plane strike angle and the aftershock's conventional polar angle.
When one expresses the epicenter of an aftershock in relation to a mainshock
in conventional polar coordinates $(r, \theta)$, as was done in Section 4.2, one implicitly
uses the mainshock epicenter and the positive x-axis as reference point and
reference vector respectively. Denoting the strike angle of the mainshock fault
plane as $\phi_j$, which is also measured counter-clockwise from the positive x-axis,
the relative angle $\psi$ can be calculated from $\theta$ and $\phi_j$ as follows:
$$\psi(\theta, \phi_j) = \begin{cases}
|\theta - \phi_j| & \text{if } 0 \le |\theta - \phi_j| < \pi/2 \\
\pi - |\theta - \phi_j| & \text{if } \pi/2 \le |\theta - \phi_j| < \pi \\
|\theta - \phi_j| - \pi & \text{if } \pi \le |\theta - \phi_j| < 3\pi/2 \\
2\pi - |\theta - \phi_j| & \text{if } 3\pi/2 \le |\theta - \phi_j| < 2\pi.
\end{cases}$$
This definition of $\psi$ is identical to that in Chapter 3, only with explicit consideration
of $\theta$ and $\phi_j$. Figure 4.1 provides an illustration of the relations between the
angles.
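The four-case definition of the relative angle translates directly into code. The sketch below assumes both the polar angle and the strike angle are given in radians in $[0, 2\pi)$; the function name is a hypothetical choice.

```python
import math

def relative_angle(theta, strike):
    """Angular separation between direction theta and the fault line with the given strike.

    Both angles are assumed to lie in [0, 2*pi); the result lies in [0, pi/2].
    """
    delta = abs(theta - strike)
    if delta < math.pi / 2:
        return delta
    if delta < math.pi:
        return math.pi - delta
    if delta < 1.5 * math.pi:
        return delta - math.pi
    return 2 * math.pi - delta

# An aftershock almost opposite the strike direction is still close to the fault line.
example = relative_angle(math.pi + 0.2, 0.0)
```

The example illustrates why the definition folds the angle difference into $[0, \pi/2]$: directions along either end of the fault line are treated alike.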
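This formula allows us to model $\theta$ for any given $\phi_j$, as opposed to modeling $\psi$
directly. For instance, in a mainshock-aftershock pair, the distribution of $\theta$ can
be described by the wrapped exponential (WE), in much the same way that it is used to describe $\psi$. In
terms of $\theta$, the WE has conditional density
$$f_{WE}(\psi(\theta, \phi_j) \mid r) = \frac{1}{4}\,\frac{\eta(r)\, e^{-\eta(r)\,\psi(\theta, \phi_j)}}{1 - e^{-\eta(r)\pi/2}} \qquad (0 \le \theta < 2\pi), \tag{4.4}$$
for any $\phi_j$.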
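As a sanity check on the factor 1/4 in the wrapped exponential density, the following Python sketch integrates the density numerically over the full circle of polar angles. The names are hypothetical, the rate is passed as a plain number standing in for its value at a given distance, and the strike and rate values are made up. The folding formula used for the relative angle is a compact equivalent of the four-case definition.

```python
import math

def relative_angle(theta, strike):
    """Relative angle in [0, pi/2]; equivalent to the four-case definition."""
    delta = abs(theta - strike) % math.pi   # fold the difference onto [0, pi)
    return min(delta, math.pi - delta)

def we_density(theta, strike, rate):
    """Wrapped exponential density (4.4) as a function of the polar angle theta."""
    psi = relative_angle(theta, strike)
    normalizer = 1.0 - math.exp(-rate * math.pi / 2)
    return 0.25 * rate * math.exp(-rate * psi) / normalizer

def integrate_over_circle(strike, rate, n=20000):
    """Midpoint-rule integral of the density over theta in [0, 2*pi)."""
    h = 2 * math.pi / n
    return sum(we_density((k + 0.5) * h, strike, rate) for k in range(n)) * h

total = integrate_over_circle(strike=0.8, rate=1.5)   # should be close to 1
```

Because each value of the relative angle corresponds to four polar angles, the truncated exponential on $[0, \pi/2]$ has to be divided by four to remain a density in the polar angle.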
4.4
Anisotropic Extensions of ETAS Models
In extending the ETAS model to incorporate the strike angle of mainshock fault
planes, we seek a focal mechanism-dependent spatial kernel that can encompass
existing isotropic models. In other words, given an isotropic $h(r, \theta; M_j)$ such as
(4.1)-(4.3), we seek $h(r, \theta; M_j, \phi_j)$ that satisfies
$$\int_0^{2\pi} h(r, \theta; M_j, \phi_j)\, d\theta = \int_0^{2\pi} h(r, \theta; M_j)\, d\theta,$$
for all $\phi_j$, so that the two marginal distributions agree. One possible parameterization
of $h(r, \theta; M_j, \phi_j)$ is
$$h(r, \theta; M_j, \phi_j) = h(r, \theta; M_j)\, f(\theta; r, \phi_j),$$
where $f(\theta; r, \phi_j)$ is a density, for any $r$, $\phi_j$. The predictive power of focal mechanism
can be evaluated by comparing ETAS models with and without the anisotropic
term $f$.
For instance, one may take the Pareto model (4.1) for $h(r, \theta; M_j)$ and generalize
it by letting $f(\theta; r, \phi_j) = f_{WE}(\psi(\theta, \phi_j); r)$. The corresponding spatial decay
functions with isotropic and anisotropic clustering can be written respectively as
$$h(r, \theta; M_j) = (r^2 + d)^{-q}, \tag{4.5}$$
$$h(r, \theta; M_j, \phi_j) = (r^2 + d)^{-q}\, f_{WE}(\psi(\theta, \phi_j); r). \tag{4.6}$$
Alternatively, one may use the tapered Pareto model as suggested by the analysis
in Chapter 3. An isotropic spatial kernel according to which $r$ has a tapered
Pareto marginal distribution can be written as
$$h_{TP}(r, \theta; M_j) = \left(\frac{\beta}{r} + \frac{1}{\gamma}\right) \frac{a^{\beta}}{r^{\beta+1}} \exp\left(\frac{a - r}{\gamma}\right) \qquad (a \le r < \infty). \tag{4.7}$$
Similar to (4.5) and (4.6), one can compare $h_{TP}$ and its extended version:
$$h(r, \theta; M_j) = h_{TP}(r, \theta; M_j), \tag{4.8}$$
$$h(r, \theta; M_j, \phi_j) = h_{TP}(r, \theta; M_j)\, f_{WE}(\psi(\theta, \phi_j); r). \tag{4.9}$$
All of the above spatial kernels are magnitude-invariant. Instead, one may
consider variations that scale with mainshock magnitude, as in models (4.2) or
(4.3). The main focus of this chapter is not on magnitude scaling, however, but
on the incorporation of focal mechanism estimates; interested readers can refer
to Section 3.9.1 for a discussion of a magnitude-dependent version of $f_{WE}$.
The analysis in Chapter 3 suggests that $f_{WE}$ is applicable to strike-slip mainshocks,
but the spatial distribution of aftershocks of other types of earthquakes
may be different. One may define an indicator variable
$$I = \begin{cases} 1 & \text{if the mainshock is a strike-slip event} \\ 0 & \text{otherwise} \end{cases}$$
and consider modified versions of (4.6) and (4.9) where the aftershock zones of
strike-slip mainshocks have directional preferences according to $f_{WE}$, while
non-strike-slip events have circular aftershock zones. The restricted spatial kernels
can be written respectively as
$$h(r, \theta; M_j, \phi_j) = (r^2 + d)^{-q} \left[ I\, f_{WE}(\psi(\theta, \phi_j); r) + \frac{1 - I}{2\pi} \right] \tag{4.10}$$
and
$$h(r, \theta; M_j, \phi_j) = h_{TP}(r, \theta; M_j) \left[ I\, f_{WE}(\psi(\theta, \phi_j); r) + \frac{1 - I}{2\pi} \right]. \tag{4.11}$$
Alternatively, considering the relatively small presence of qualified strike-slip
mainshocks in the catalog, one may put more emphasis on the directional extension
by considering a model with only strike-slip mainshocks. The conditional
intensity function of a space-time ETAS model with only strike-slip events
acting as triggering events can be written as
$$\lambda(t, x, y \mid H_t) = \mu(x, y) + \sum_{\{j:\, t_j < t,\ I(j) = 1\}} g(t, x, y \mid t_j, x_j, y_j, \phi_j; M_j). \tag{4.12}$$
A response function with isotropic spatial clustering such as (4.5),
$$g(t, x, y \mid t_j, x_j, y_j; M_j) = \frac{K_0\, e^{\alpha(M_j - M_c)}}{(t - t_j + c)^p}\, (r^2 + d)^{-q}, \tag{4.13}$$
can be compared to its focal mechanism-dependent variant based on (4.6),
$$g(t, x, y \mid t_j, x_j, y_j; \phi_j, M_j) = \frac{K_0\, e^{\alpha(M_j - M_c)}}{(t - t_j + c)^p}\, (r^2 + d)^{-q}\, f_{WE}(\psi(\theta, \phi_j); r). \tag{4.14}$$
A similar comparison can also be made for a tapered Pareto model with isotropic
and anisotropic clustering, such as (4.8) and (4.9) respectively:
$$g(t, x, y \mid t_j, x_j, y_j; M_j) = \frac{K_0\, e^{\alpha(M_j - M_c)}}{(t - t_j + c)^p}\, h_{TP}(r, \theta; M_j), \tag{4.15}$$
$$g(t, x, y \mid t_j, x_j, y_j; \phi_j, M_j) = \frac{K_0\, e^{\alpha(M_j - M_c)}}{(t - t_j + c)^p}\, h_{TP}(r, \theta; M_j)\, f_{WE}(\psi(\theta, \phi_j); r). \tag{4.16}$$
Such models consist of only the background seismicity and aftershock activities following strike-slip mainshocks. Nevertheless, by focusing on the branching structure of strike-slip mainshocks, one is able to accentuate the observed
anisotropic aftershock activities surrounding strike-slip mainshocks as suggested
by Chapter 3 and the impact of a focal mechanism-dependent spatial kernel in
ETAS models.
4.5
Parameter Estimation
Given a catalog of distinct estimated occurrence times, spatial coordinates, magnitudes,
and focal mechanisms of earthquakes, $\{(t_i, x_i, y_i, \phi_i, M_i);\ M_i \ge M_0,\ i = 1, \ldots, n\}$,
during a time interval $[0, T]$ and within an observation region $A$, the log
likelihood of a point process model such as (4.0) is given by (Daley and Vere-Jones, 1988)
$$\log L(\Theta) = \sum_{i=1}^{n} \log \lambda(t_i, x_i, y_i \mid H_{t_i}) - \int_0^T \!\! \int_A \lambda(t, x, y \mid H_t)\, dx\, dy\, dt,$$
where $\Theta$ is the parameter vector. Parameter estimates can be obtained via maximum
likelihood. For details on maximizing the log-likelihood for isotropic models
using numerical methods, see Ogata (1998). Veen and Schoenberg (2004) explore
a relatively robust and efficient Expectation Maximization-based alternative to
gradient-based approaches.
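To make the structure of the log-likelihood concrete, here is a minimal Python sketch that evaluates it for a user-supplied intensity function. It assumes a rectangular observation region and approximates the integral with a midpoint rule on a regular grid; both are simplifications for illustration, and a grid this coarse would be a crude approximation for a sharply peaked ETAS intensity. The function names and the toy events are hypothetical.

```python
import math

def log_likelihood(events, intensity, T, x_range, y_range, n_t=20, n_x=20, n_y=20):
    """Sum of log-intensities at the events minus the integrated intensity.

    events    : list of (t, x, y) tuples inside [0, T] x x_range x y_range
    intensity : callable (t, x, y) -> positive rate
    """
    point_term = sum(math.log(intensity(t, x, y)) for (t, x, y) in events)

    (x0, x1), (y0, y1) = x_range, y_range
    dt, dx, dy = T / n_t, (x1 - x0) / n_x, (y1 - y0) / n_y
    integral = 0.0
    for i in range(n_t):
        for j in range(n_x):
            for k in range(n_y):
                # evaluate the intensity at the centre of each grid cell
                integral += intensity((i + 0.5) * dt,
                                      x0 + (j + 0.5) * dx,
                                      y0 + (k + 0.5) * dy)
    integral *= dt * dx * dy
    return point_term - integral

# Toy check with a constant intensity, for which the answer is known in closed form.
events = [(0.5, 0.2, 0.3), (1.5, 0.7, 0.1), (4.0, 0.9, 0.9)]
ll = log_likelihood(events, lambda t, x, y: 2.0, T=5.0,
                    x_range=(0.0, 1.0), y_range=(0.0, 1.0))
```

For the constant intensity 2 on a region of volume 5, the exact value is $3 \log 2 - 10$, which the sketch reproduces. In practice this function would be passed to a numerical optimizer over the model parameters.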
4.6
Goodness-of-fit and Diagnostic Methods for Spatial and Spatial-temporal Point Process Models
The goodness-of-fit of multi-dimensional point process models is commonly investigated
using likelihood statistics such as the Akaike Information Criterion (AIC;
Akaike 1974) and the Bayesian Information Criterion (BIC; Schwarz 1978). For given
data of $N$ observations, let $L(\Theta)$ be the likelihood of a model with $k$ unknown
parameters, and let $\hat{\Theta}_{MLE}$ be the MLE of $\Theta$. The AIC value of the model is then defined
by
$$\mathrm{AIC} = -2 \log L(\hat{\Theta}_{MLE}) + 2k,$$
and the BIC value is
$$\mathrm{BIC} = -2 \log L(\hat{\Theta}_{MLE}) + k \log N.$$
Both AIC and BIC examine the maximized likelihood values plus a penalty term.
In addition to penalizing for the number of unknown parameters, BIC also takes
into account the sample size, which corrects the bias toward
more complex models. When used for model selection, lower values of AIC and
BIC indicate better fits.
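Both criteria are simple to compute once the maximized log-likelihood is available. The Python sketch below uses made-up log-likelihood values and parameter counts purely for illustration; the function names are hypothetical.

```python
import math

def aic(max_loglik, k):
    """Akaike Information Criterion: -2 log L + 2k."""
    return -2.0 * max_loglik + 2.0 * k

def bic(max_loglik, k, n_obs):
    """Bayesian Information Criterion: -2 log L + k log N."""
    return -2.0 * max_loglik + k * math.log(n_obs)

# Hypothetical comparison: the larger model gains 3 log-likelihood units for 4 extra parameters.
small = dict(max_loglik=-1000.0, k=7)
large = dict(max_loglik=-997.0, k=11)
aic_prefers_large = aic(**large) < aic(**small)
bic_prefers_large = bic(n_obs=1700, **large) < bic(n_obs=1700, **small)
```

In this made-up case neither criterion favors the larger model, because the gain in log-likelihood does not offset the penalty for the additional parameters.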
While the AIC and BIC are useful for scoring the overall quality of fit, graphical summaries may also be useful, especially in order to identify where one model
may be fitting well or poorly and to suggest possible ways in which a model may
be improved. Schoenberg (2003) investigates two types of residuals as diagnostic methods for space-time point process models such as ETAS. He uses thinned
residuals to detect spatial patterns in the data such as clustering that are not
adequately accounted for by a model. He also uses scaled residuals to assess the
assumption of temporal-magnitude separability. For the case of purely spatial
point process models, the L-function and K-function (see Ripley 1981) are examples
of summary functions that may be used to detect clustering or inhibition
in point process data and are often used as a test for homogeneity of planar point
processes. Such functions can be combined with point process residual analysis
such as thinned residuals and scaled residuals in order to test for more general
classes of point process models.
Baddeley et al. (2005) consider various graphical residual analysis methods
for spatial point process models that involve standardized versions of the integrated
conditional intensity over each pixel. These methods extend quite readily
to the case of spatial-temporal point processes. For instance, following Baddeley
et al. (2005), given a spatial-temporal point process $N$ with estimated conditional
intensity function $\hat{\lambda}$, the raw residuals of the process may be defined for
each pixel $B$ as
$$R(B) = N(B) - \int_B \hat{\lambda}(t, x, y)\, dt\, dx\, dy,$$
and after scaling according to $\hat{\lambda}^{-1/2}$, one obtains a spatial analog of Pearson
residuals for Poisson log-linear regression, via
$$\sum_{x_i \in B} \hat{\lambda}(x_i)^{-1/2} - \int_B \sqrt{\hat{\lambda}(t, x, y)}\, dt\, dx\, dy.$$
Pearson residuals far away from zero suggest lack of fit.
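The raw and Pearson residuals for a single space-time pixel can be sketched as follows. The sketch assumes a rectangular pixel given as three half-open intervals and uses a midpoint rule for the integrals; the names, the toy events, and the constant intensity are hypothetical.

```python
import math

def integrate_box(f, box, n=10):
    """Midpoint-rule integral of f(t, x, y) over a box ((t0, t1), (x0, x1), (y0, y1))."""
    (t0, t1), (x0, x1), (y0, y1) = box
    dt, dx, dy = (t1 - t0) / n, (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += f(t0 + (i + 0.5) * dt, x0 + (j + 0.5) * dx, y0 + (k + 0.5) * dy)
    return total * dt * dx * dy

def in_box(event, box):
    """True if the (t, x, y) event falls in the half-open box."""
    return all(lo <= v < hi for v, (lo, hi) in zip(event, box))

def raw_residual(events, intensity, box):
    """Observed count in the pixel minus the integrated intensity."""
    count = sum(1 for e in events if in_box(e, box))
    return count - integrate_box(intensity, box)

def pearson_residual(events, intensity, box):
    """Sum of intensity^(-1/2) at the points minus the integral of sqrt(intensity)."""
    point_term = sum(intensity(*e) ** -0.5 for e in events if in_box(e, box))
    return point_term - integrate_box(lambda t, x, y: math.sqrt(intensity(t, x, y)), box)

# Toy example: two events inside a unit pixel, one outside, constant intensity 4.
events = [(0.2, 0.5, 0.5), (0.8, 0.1, 0.9), (1.5, 0.5, 0.5)]
pixel = ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0))
raw = raw_residual(events, lambda t, x, y: 4.0, pixel)
pearson = pearson_residual(events, lambda t, x, y: 4.0, pixel)
```

Here the model expects four events in the pixel but only two are observed, so both residuals are negative.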
4.7
Deviance Residuals
As an alternative to the aforementioned diagnostic methods, for the purposes of

comparing two competing models for a spatial-temporal point process, I propose
to use an analog of the deviance residuals that are commonly employed in generalized linear models (GLM). Deviance can be broadly defined as minus twice
the log likelihood ratio of two models fitted by maximum likelihood. Generally
one of the two models is taken to be the saturated model with as many parameters as the number of observations, and the other is the model of interest.
As such the deviance reflects the goodness-of-fit of the model in question and
can be used for hypothesis testing and model selection. The deviances associated
with individual observations are called deviance residuals. Deviance residuals are
likened to the raw residuals in ordinary regression in much the same way deviance
is likened to the sum of squares. In GLM deviance residuals provide a way of
detecting unusual observations in the data set and measuring the contribution of
each covariate pattern to the total deviance.
Deviance residuals can be extended quite readily to point process models with
a slight modification. A unique characteristic of a point process is that the absence
of points is as important as the occurrence of points. Therefore, deviance
residuals for a point process model should be associated with a subset of the
sample space, such as a pixel in a spatial point process, rather than with individual
observations. For the purposes of comparing two competing models for a spatial-temporal
point process, deviance residuals can be defined with respect to two non-saturated
models. Given two spatial-temporal conditional intensities $\lambda_1$ and $\lambda_2$,
define the deviance residual of $\lambda_1$ against $\lambda_2$ in a space-time pixel (or subregion)
$B$ as
$$D(\lambda_1, \lambda_2, B) = \sum_{\{i:\, (t_i, x_i, y_i) \in B\}} \left[ \log \lambda_1(t_i, x_i, y_i) - \log \lambda_2(t_i, x_i, y_i) \right] - \int_B \left[ \lambda_1(t, x, y) - \lambda_2(t, x, y) \right] dx\, dy\, dt.$$
These deviance residuals thus represent the difference between the two models'
contributions to the log-likelihood from pixel $B$; positive values of $D$
indicate superior fit of model $\lambda_1$ and negative values of $D$ indicate superior fit of
model $\lambda_2$. Provided that the two models are nested, under the usual regularity
conditions the deviance residuals should be approximately chi-square distributed
with $p$ degrees of freedom (see e.g. Ogata, 1978).
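A deviance residual for one pixel is the difference between the two models' log-likelihood contributions there, which the following Python sketch computes. It assumes a rectangular space-time pixel and a midpoint-rule integral; the function name, the constant intensities, and the toy events are hypothetical.

```python
import math

def deviance_residual(events, lam1, lam2, box, n=10):
    """Deviance residual of intensity lam1 against lam2 in the box ((t0,t1),(x0,x1),(y0,y1))."""
    (t0, t1), (x0, x1), (y0, y1) = box
    inside = [e for e in events
              if t0 <= e[0] < t1 and x0 <= e[1] < x1 and y0 <= e[2] < y1]
    # difference of log-intensities at the observed points in the pixel
    point_term = sum(math.log(lam1(*e)) - math.log(lam2(*e)) for e in inside)

    # midpoint-rule integral of lam1 - lam2 over the pixel
    dt, dx, dy = (t1 - t0) / n, (x1 - x0) / n, (y1 - y0) / n
    integral = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                t = t0 + (i + 0.5) * dt
                x = x0 + (j + 0.5) * dx
                y = y0 + (k + 0.5) * dy
                integral += lam1(t, x, y) - lam2(t, x, y)
    return point_term - integral * dt * dx * dy

# Toy example: three events in a unit pixel; model 1 predicts rate 2, model 2 predicts rate 1.
events = [(0.1, 0.2, 0.3), (0.5, 0.5, 0.5), (0.9, 0.8, 0.1)]
pixel = ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0))
high = lambda t, x, y: 2.0
low = lambda t, x, y: 1.0
d12 = deviance_residual(events, high, low, pixel)
d21 = deviance_residual(events, low, high, pixel)
```

With three events in the pixel, the higher-rate model fits better, so `d12` is positive, and swapping the roles of the two models flips the sign.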
4.8
Results
4.8.1
Pearson residuals
We use a moment tensor catalog from the SCEDC of events occurring between September
18, 1999, and December 31, 2005, with epicenters in or near Southern California and a
moment magnitude of M 3.0 or above; please refer to Wong and Schoenberg (2009)
for further details on this data set. As in Chapter 3, each focal mechanism
is categorized as strike-slip if the neutral axis of the moment tensor (B-axis) is
within 20° of vertical. We focus on focal mechanisms of quality A or B only,
as focal mechanisms of worse quality are thought to be unstable and are often
discarded (Clinton et al. 2006, and private communication with Hauksson).
According to the above definition, 190 (11%) of the events in the dataset are
strike-slip events.
We first consider an application of Pearson residuals to an ETAS model and
demonstrate the difficulty in assessing the individual goodness-of-fit of this type
of branching point process. When inspecting a Pearson residual plot, one indication
of lack of fit is excessive clustering of residuals of the same sign. Clustered
residuals indicate areas where the model fits poorly. However, this criterion does
not seem to apply in a clustered point process such as ETAS. Figure 4.2 illustrates
the Pearson residuals for model (4.5). The raw residuals, even after
standardization, tend to be very large at the pixels containing points, especially
those pixels containing many points. Thus the plot conveys little information
about the model in question. When the residuals are smoothed, as suggested in
Baddeley et al. (2005), detailed local information is lost; typically, broad regions
where the model's background rate is incorrectly specified can be discerned,
but little can be gleaned about the model's description of local behavior such as
clustering or inhibition.
4.8.2
Comparison of Isotropic Pareto and Isotropic Tapered Pareto Models
Despite the shortcomings of standard residual plots for ETAS models, a relative
approach that compares the performance of competing models such as deviance
residuals may facilitate the examination of the relative strengths of the models.
I first demonstrate deviance residuals as a graphical tool for comparing ETAS
models with different isotropic spatial decay functions such as (4.5) and (4.8).
Deviance residuals are subsequently used to assess the impact of directional extensions in these models.
The deviance residuals of (4.5) against (4.8) are shown in Figure 4.3. Here
the pink and green pixels indicate areas where model (4.5) and model (4.8),
respectively, fit better than the other. Several patterns emerge from this image.
First, isolated events that are far from clusters are almost universally contained
in pink pixels, indicating superior fit of the Pareto distribution of model (4.5) in
these sparse regions. On the other hand, clusters of events are largely contained
in green pixels, indicating superior fit of the tapered Pareto distribution of model
(4.8) in these areas. Interestingly, such clusters are often bordered by pink pixels
on the outskirts, such as those surrounding the large cluster at (-116.3, 34.5).
These patterns seem to suggest that the spatial decay of aftershock intensity in (4.8) is
more concentrated around mainshocks whereas that in (4.5) is more diffuse. As a result,
model (4.8) is relatively better at describing strong clustering of aftershocks and
relatively poorer at describing remote triggering of aftershocks. The latter may be
explained by the tapering feature of the tapered Pareto model (4.8).
The AIC values and parameter estimates for both models are shown in Table
4.1. According to the AIC, the fit of (4.8) is better than that of (4.5)
by a substantial margin. Relating this to the deviance residuals, one sees
that the ability of model (4.8) to describe intense aftershock clustering in the
close proximity of mainshocks more than compensates for its relative weakness
at describing aftershock activities further away from mainshocks.
4.8.3
Impact of the Wrapped Exponential Distribution of Relative Angles Between Mainshocks and Aftershocks
In addition to comparing models with different distance decay functions, deviance
residuals can also be used to assess the impact of a focal mechanism-dependent,
directional extension in these models. We continue to use (4.5) as an example of
an ETAS model with isotropic clustering, and we compare it to its anisotropic
variant, model (4.10). In Figure 4.4, the deviance residuals of (4.5)
against (4.10) suggest disagreement between the two models in certain locations.
For instance, the green pixels near (-120.5, 36) indicate that the orientation-dependent
model (4.10) fares better in the proximity of a nearly linear cluster
of strike-slip events. In this cluster, most events are strike-slips and are aligned
closely along a fault line, and as a result, the strike directions of these strike-slip
events are highly predictive of aftershock locations. On the other hand, there
are also locations where strike-slip mainshocks are followed by rather circular
aftershock patterns, as reflected by mixtures of positive and negative deviance
residuals.
The AIC values and parameter estimates for model (4.10) are found in Table
4.1. According to the AIC, the fit of (4.10) is only slightly better than that of (4.5).
Although model complexity is accounted for in the AIC, one sees that the overall
benefit of the directional extension is marginal. The comparison between another
pair of isotropic and anisotropic models, (4.8) and (4.11), also shown in
Table 4.1, reveals similar results. The extended model (4.11) only brings about a
small improvement in the AIC.
4.8.4
Models with only Strike-slip Mainshocks
The above analysis seems to suggest that distance is a far more important factor
than focal mechanism in ETAS models. However, the effect of focal mechanism
may be obscured or diminished due to the presence of relatively few qualified
strike-slip mainshocks in the catalog. Despite being the majority earthquake
type in Southern California, strike-slip events with focal mechanism solutions of
quality A or B account for only 11% of the catalog. In order to accentuate
the impact of focal mechanism in ETAS models for diagnostic purposes, we repeat
the above analysis using only strike-slip mainshocks.
Figure 4.5 shows the deviance residuals of (4.13) against (4.14), models in
which only strike-slip events act as triggering events. The overall pattern is similar to
the previous comparison when all mainshocks were present. However, new improvement
is found near Landers (-116.3, 34.5) in a region populated by a number
of roughly parallel strike-slip faults. Parameter estimates and AIC values for both
models are provided in Table 4.2. The extended model (4.14) shows significant
improvement in terms of AIC, with a 314-point margin. A comparison based on
the tapered Pareto model yields similar results. The AIC for the extended model
(4.16) is 374 lower than that of its isotropic counterpart (4.15). Since the difference in AIC is,
under general conditions, approximately $\chi^2$ distributed with parameter equal to
the difference (here 4) between the number of fitted parameters in the competing
models (Ogata, 1978), such differences are very highly significant.
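The AIC margins quoted above follow directly from the values reported in Table 4.2, as this short Python check shows. The dictionary layout and variable names are illustrative choices.

```python
# AIC values as reported in Table 4.2 (models with only strike-slip mainshocks).
aic_values = {"4.13": 40362, "4.14": 40048, "4.15": 40056, "4.16": 39682}

# Improvement of each anisotropic model over its isotropic counterpart.
gain_pareto = aic_values["4.13"] - aic_values["4.14"]
gain_tapered_pareto = aic_values["4.15"] - aic_values["4.16"]
```

Both margins are large relative to the four additional fitted parameters in the extended models.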
4.9
Discussion
The greater effectiveness of the anisotropic extensions when non-strike-slip
mainshocks are omitted may be explained as follows. According to the models, the
aftershock intensity decays sharply in both space and time away from
the origin. Hence when all mainshocks are present, the conditional intensity
of an aftershock may be dominated by mainshocks that occurred shortly before
the aftershock in its close proximity. Since $f_{WE}$ is approximately uniform for
r < 5 km, it is unlikely to impact the triggering densities of these mainshocks
at all. Coupled with the small number of qualified strike-slips in the catalog,
the inclusion of $f_{WE}$ only brings about minor improvement in the models. On
the other hand, when non-strike-slip mainshocks are omitted and many of the
remaining mainshocks are only found at greater distances, the benefit of a focal
mechanism-dependent spatial kernel becomes much more evident. This seems to
suggest that while $f_{WE}$ is predictive of aftershocks at large, it only has a small
impact on the ETAS models examined due to its lack of directional preferences
for aftershocks at short distances.

As location estimates of earthquakes improve, an important topic for future
research is the investigation of relative aftershock locations focused on aftershocks that are near mainshocks. Small-sized and medium-sized mainshocks may
be better suited for such a study as point-source approximation may be a more
valid assumption for such mainshocks. Also important is the investigation of
aftershock activities surrounding non-strike-slip mainshocks in relation to mainshock
focal mechanism. $f_{WE}$ is currently applied to strike-slip mainshocks in
this analysis and may be complemented by similar characterizations with regard
to other mainshock types. A focal mechanism-dependent spatial kernel that embraces
other earthquake types may lead to further improvement in ETAS models.
It should be emphasized that the current analysis focuses only on the fault plane
strike angle inferred from focal mechanism as a way of describing the anisotropy of
aftershock spatial distribution. However, there also exist other methods, such as
the Coulomb stress function, that utilize focal mechanism in modeling aftershock
patterns, and these methods may be explored as alternatives to $f_{WE}$.
Table 4.1: Maximum likelihood estimates of ETAS model parameters and AIC
for models 4.5, 4.8, 4.10, 4.11.
Model   Parameters (incl. K0)                                        AIC
4.5     .667   .0267   .00945   8.951e-08   .227   .177   .0105     31093.6
4.8     .651   .0294   .0633    7.431e-08   .224   .275   48.658    30119.6
4.10    .670   .0270   .00946   8.890e-08   .227   .177   .0105     31092.4
4.11    .655   .0296   .0632    7.392e-08   .224   .275   48.020    30116.5
Table 4.2: Maximum likelihood estimates of ETAS model parameters and AIC
for models 4.13, 4.14, 4.15, 4.16, with only strike-slip mainshocks.
Model   Parameters (incl. K0)                                        AIC
4.13    .255   .346   2.903   1.008   .325   .305     1.888e-07     40362
4.14    .255   .324   2.630   .878    .324   .291     1.998e-07     40048
4.15    .266   .065   .334    1.133   .307   22.980   1.862e-07     40056
4.16    .276   .063   .328    1.121   .305   24.926   1.824e-07     39682
[Figure 4.1: diagram labeling the mainshock, the aftershock, and the fault plane]
Figure 4.1: Relative aftershock angle with respect to mainshock fault plane. Here
the mainshock is situated at the origin, $\phi_j$ is the strike angle of the mainshock
fault plane, $\theta$ is the angle of aftershock with respect to the positive x-axis, and
$\psi$ is the relative angle of aftershock with respect to the fault plane.
[Figure 4.2: map of Pearson residuals by pixel; axes: longitude, latitude]
Figure 4.2: Pearson residuals for model (4.5).
[Figure 4.3: map of deviance residuals by pixel; axes: longitude, latitude]
Figure 4.3: Deviance residuals of (4.5) against (4.8).
[Figure 4.4: map of deviance residuals by pixel, with strike-slip and non-strike-slip events marked; axes: longitude, latitude]
Figure 4.4: Deviance residuals of (4.5) against (4.10).
[Figure 4.5: map of deviance residuals by pixel, with strike-slip and non-strike-slip events marked; axes: longitude, latitude]
Figure 4.5: Deviance residuals of (4.13) against (4.14), with only strike-slip mainshocks.
CHAPTER 5
Concluding Remarks
This concluding chapter outlines some directions for future research on the methods
and applications presented in this dissertation. Chapter 3 demonstrates a way
of using focal mechanism to describe the anisotropic spatial distribution of aftershocks
and shows that the proposed tapered Pareto / wrapped exponential (TPWE)
model provides a reasonable fit to the data considered. It must be emphasized,
however, that this analysis is performed using only Southern California strike-slip
earthquakes of quality A or B as candidates for mainshocks. It remains to be
investigated whether focal mechanism and TPWE can be applied to other mainshock
types and/or in other seismically active zones where the faulting is more
heterogeneous.
Chapter 4 proposes a general framework for incorporating focal mechanism
into a point process earthquake model via an anisotropic spatial kernel. It uses
$f_{WE}$ as an example of a directional extension to an isotropic spatial kernel
and assesses the significance of focal mechanism in space-time ETAS models, using
Southern California seismological data. While this presents an alternative to the
normal approximation approach, the limitation of $f_{WE}$ to strike-slip mainshocks and
the quality issues of focal mechanism solutions present some challenges to the
analysis. As the quality of focal mechanism data improves and a more comprehensive
understanding of focal mechanism and aftershock spatial distribution becomes available,
especially for aftershocks close to mainshocks, forecast models for seismicity can
likely be further improved with the incorporation of focal mechanism data.

Chapter 4 also introduces a novel residual method called deviance residuals
for comparing the relative performance of competing spatial-temporal point process
models. Applications of many current diagnostic methods, such as thinning
and Pearson residuals, are often limited due to their shortcomings. In contrast,
deviance residuals are easy to compute and appear to offer interpretable visual
diagnostics. An interesting direction for future research is a more comprehensive
and rigorous study of deviance residuals for general clustered point process
models. Meanwhile, many current methods also seem to hold promise,
as they may be further improved and made more effective. For instance, a technique
called augmenting, which simulates points from the null hypothesis, may be
combined with thinning to yield a powerful tool for assessing the goodness-of-fit
of point process models. Additionally, Voronoi tessellation has been gaining popularity
in recent years for various purposes in analyses that involve spatial data.
Its adaptive nature may provide an interesting alternative to quadrat residuals
that are commonly employed for spatial models.
References
[1] Akaike, H. (1974). A new look at the statistical model identification. IEEE
Transactions on Automatic Control, 19(6), 716-723.
[2] Baddeley, A., Møller, J., and Waagepetersen, R. (2000). Non- and semi-parametric estimation of interaction in inhomogeneous point patterns. Statistica Neerlandica, 54(3), 329-350.
[3] Baddeley, A., Turner, R., Møller, J., and Hazelton, M. (2005). Residual
analysis for spatial point processes. Journal of the Royal Statistical Society
B, 67(5), 617-666.
[4] Bolt, B. and Bullen, K. E. (1985). An Introduction to the Theory of Seismology, Cambridge University Press, New York.
[5] Bolt, B. (1985). Earthquakes: 2006 Centennial Update, W. H. Freeman &
Company.
[6] Clinton, J. F., Hauksson, E., and Solanki, K. (2006). An evaluation of the
SCSN moment tensor solutions: robustness of the Mw magnitude scale, style
of faulting, and automation of the method, BSSA, 96(5), 1689-1705.
[7] Davidsen, J. and Paczuski, M. (2005). Analysis of the spatial distribution
between successive earthquakes, Phys. Rev. Lett., 94, Art. Num. 048501.
[8] Felzer, K. and Brodsky, E. (2006). Decay of aftershock density with distance
indicates triggering by dynamic stress, Nature, 441, 735-738.
[9] Frohlich, C. (1992). Triangle diagrams: ternary graphs to display similarity and diversity of earthquake focal mechanisms, Physics of the Earth and
Planetary Interiors, 75, 193-198.
[10] Frohlich, C. (2001). Display and quantitative assessment of distributions of
earthquake focal mechanisms, Geophys. J. Int., 144, 300-308.
[11] Frohlich, C. and Willemann, R.J. (1987). Aftershocks of deep earthquakes do
not occur preferentially on nodal planes of focal mechanisms, Nature, 329,
41-42.
[12] Frohlich, C. and Willemann, R.J. (1987). Statistical methods for comparing
directions to the orientations of focal mechanisms and Wadati-Benioff zones,
BSSA, 77(6), 2135-2142.
[13] Gomberg, J., Bodin, P., and Reasenberg, P. (2003). Observing earthquakes
triggered in the near field by dynamic deformations, BSSA, 93(1), 118-138.
[14] Hardebeck, J. L. (2006). Homogeneity of small-scale earthquake faulting, stress,
and fault strength, BSSA, 96(5), 1675-1688.
[15] Hardebeck, J. L. and Shearer, P. M. (2003). Using S/P Amplitude Ratios
to Constrain the Focal Mechanisms of Small Earthquakes, BSSA, 93, 2434-2444.
[16] Hawkes, A. G. and Adamopoulos, L. (1973). Cluster models for earthquakes
- regional comparisons, Bulletin of the International Statistical Institute,
45(3), 451-461.
[17] Huc, M. and Main, I. G. (2003). Anomalous stress diffusion in earthquake
triggering: Correlation length, time-dependence, and directionality, J. Geophys. Res., 108(B7), 2324.
[18] Jackson, D.D. and Kagan, Y.Y. (1999). Testable earthquake forecasts for
1999. Seismological Research Letters 70(4), 393-403.
[19] Jammalamadaka, S.R. and Kozubowski, T.J. (2001). A wrapped exponential
circular model, Proc. of AP Academy of Sciences, 5(1), 43-56.
[20] Jammalamadaka, S.R. and Kozubowski, T.J. (2004). New families of
wrapped distributions for modeling skew circular data, Communications in
Statistics, 33(9), 1-16.
[21] Jupp, T., Pyle, D., Mason, B., and Dade, B. (2004). A statistical model
for the timing of earthquakes and volcanic eruptions influenced by periodic
processes, Journal of Geophysical Research, 109(B2), B02206.1-B02206.16.
[22] Kagan, Y.Y. (1992). Correlations of earthquake focal mechanisms, Geophys.
J. Int., 110, 305-320.
[23] Kagan, Y.Y. (1997). Are earthquakes predictable? Geophysical Journal International, 131, 505-525.
[24] Kagan, Y.Y. (2002). Aftershock zone scaling, BSSA, 92(2), 641-655.
[25] Kagan, Y.Y. and Jackson, D.D. (1994). Long-term probabilistic forecasting
of earthquakes, Journal of Geophysical Research, 99(B7), 13,685-13,700.
[26] Kagan, Y.Y. and Jackson, D.D. (1998). Spatial aftershock distribution: Effect
of normal stress, Journal of Geophysical Research, 103(B10), 24,453-24,467.
[27] Kagan, Y.Y. and Jackson, D.D. (2000). Probabilistic forecasting of earthquakes, Geophys. J. Int., 143, 438-453.
[28] Kagan, Y.Y., Jackson, D.D., and Rong, Y. (2007). A testable five-year forecast of moderate and large earthquakes in Southern California based on
smoothed seismicity, Seismological Research Letters, 78(1), 94-98.
[29] Kagan, Y. and Schoenberg, F. (2001). Estimation of the upper cutoff parameter for the tapered Pareto distribution. J. Appl. Prob. 38A, Supplement:
Festschrift for David Vere-Jones, D. Daley, editor, 158-175.
[30] Mardia, K.V. and Jupp, P.E. (2000). Directional Statistics, Wiley, New York.
[31] Michael, A. (1989). Spatial patterns of aftershocks of shallow focus earthquakes in California and implications for deep focus earthquakes, Journal of
Geophysical Research, 94(B5), 5615-5626.
[32] Ogata, Y. (1988). Statistical models for earthquake occurrences and residual
analysis for point processes, JASA, 83(401), 9-27.
[33] Ogata, Y. (1998). Space-time point-process models for earthquake occurrences, Annals of the Institute of Statistical Mathematics, 50(2), 379-402.
[34] Ogata, Y. (1999). Seismicity analysis through point-process modeling: A
review, Pure and Applied Geophysics, 155, 471-507.
[35] Ogata, Y. and Zhuang, J. (2006). Space-time ETAS models and an improved
extension, Tectonophysics, 413(1-2), 13-23.
[36] Rhoades, D.A. and Evison, F.F. (1993). Long-range earthquake forecasting
based on a single predictor, Geophysical Journal of the Royal Astronomical
Society, 59, 43-56.
[37] Ripley, B. (1981). Spatial Statistics, Wiley, New York.
[38] Sanders, C. O. (1989). Fault segmentation and earthquake occurrence in the
strike-slip San Jacinto fault zone, California, Science, 260(5100), 973-976.
[39] Sanders, C. O. (1993). Interaction of the San Jacinto and San Andreas
fault zones, Southern California: Triggered earthquake migration and coupled recurrence intervals, Proceedings of Conference XLV; a workshop on
Fault segmentation and controls of rupture initiation and termination, ed.
by Schwartz, D. P. and Sibson, R. H., U. S. Geological Survey Open-File
Report OF 89-0315, p. 324-349.
[40] Schoenberg, F.P. (2003). Multi-dimensional residual analysis of point process
models for earthquake occurrences, JASA, 98(464), 789-795.
[41] Schoenberg, F.P., Barr, C., and Seo, J. (2008). The distribution of Voronoi
cells generated by Southern California earthquake epicenters, Environmetrics, 19, 1-14.
[42] Schwarz, G. E. (1978). Estimating the dimension of a model. Annals of
Statistics, 6(2), 461-464.
[43] Shearer, P., Hauksson, E., and Lin, G. (2005). Southern California Hypocenter Relocation with Waveform Cross-Correlation, Part 2: Results Using
Source-Specific Station Terms and Cluster Analysis, BSSA, 95(3), 904-915.
[44] Turcotte, D. L. (1989). A fractal approach to probabilistic seismic hazard
assessment, Tectonophysics, 167, 171-177.
[45] Utsu, T. (1969). Aftershocks and earthquake statistics (I): some parameters
which characterize an aftershock sequence and their interaction. Journal
of the Faculty of Science, Hokkaido University, Ser. VII (Geophysics), 3,
129-195.
[46] Veen, A. and Schoenberg, F.P. (2005). Assessing spatial point process models
for California earthquakes using weighted K-functions: analysis of California
earthquakes. In Case Studies in Spatial Point Process Models, Baddeley, A., Gregori, P., Mateu, J., Stoica, R., and Stoyan, D. (eds.), Springer,
NY, 293-306.
[47] Vere-Jones, D. (1970). Stochastic Models for Earthquake Occurrence, Journal of the Royal Statistical Society, Series B, 32(1), 1-62.
[48] Vere-Jones, D. (1975). Stochastic Models for Earthquake Sequences, Geophysical Journal of the Royal Astronomical Society, 42, 811-826.
[49] Vere-Jones, D., Robinson, R., and Yang, W.Z. (2001). Remarks on the accelerated moment release model, Geophysical Journal International, 144(3),
517-531.
[50] Wells, D. and Coppersmith, K. (1994). New empirical relationships among
magnitude, rupture length, rupture width, rupture area, and surface displacement, BSSA, 84(4), 974-1002.
[51] Willemann, R.J. and Frohlich, C. (1987). Spatial patterns of aftershocks of
deep focus earthquakes, Journal of Geophysical Research, 92(B13), 13,927-13,943.
[52] Zhuang, J., Ogata, Y., and Vere-Jones, D. (2002). Stochastic declustering of
space-time earthquake occurrences, JASA, 97(458), 369-380.