
GEOPHYSICS, VOL. 74, NO. 4 (JULY-AUGUST 2009); P. P35–P43, 11 FIGS.

10.1190/1.3119264

Principal component spectral analysis

Hao Guo1, Kurt J. Marfurt2, and Jianlei Liu3

Manuscript received by the Editor 25 June 2008; revised manuscript received 17 September 2008; published online 29 May 2009.
1 University of Houston, Allied Geophysical Laboratories, Houston, Texas, U.S.A. E-mail: hguo@hess.com.
2 The University of Oklahoma, ConocoPhillips School of Geology and Geophysics, Norman, Oklahoma, U.S.A. E-mail: kmarfurt@ou.edu.
3 Chevron Energy Technology Company, Houston, Texas, U.S.A. E-mail: jianlei.liu@chevron.com.
© 2009 Society of Exploration Geophysicists. All rights reserved.

Downloaded 08 Nov 2009 to 118.137.43.11. Redistribution subject to SEG license or copyright; see Terms of Use at http://segdl.org/

ABSTRACT

Spectral decomposition methods help illuminate lateral changes in porosity and thin-bed thickness. For broadband data, an interpreter might generate 80 or more somewhat redundant amplitude and phase spectral components spanning the usable seismic bandwidth at 1-Hz intervals. Large numbers of components can overload not only the interpreter but also the display hardware. We have used principal component analysis to reduce the multiplicity of spectral data and enhance the most energetic trends inside the data. Each principal component spectrum is mathematically orthogonal to other spectra, with the importance of each spectrum being proportional to the size of its corresponding eigenvalue. Principal components are ideally suited to identify geologic features that give rise to anomalous moderate- to high-amplitude spectra. Unlike the input spectral magnitude and phase components, the principal component spectra are not direct indicators of bed thickness. By combining the variability of multiple components, principal component spectra highlight stratigraphic features that can be interpreted using a seismic geomorphology workflow. By mapping the three largest principal components using the three primary colors of red, green, and blue, we could represent more than 80% of the spectral variance with a single image. We have applied and validated this workflow using a broadband data volume containing channels draining an unconformity, which was acquired over the Central Basin Platform, Texas, U.S.A. Principal component analysis reveals a channel system with only a few output data volumes. The same process provides the interpreter with flexibility to remove any unwanted high-amplitude geologic trends or random noise from the original spectral components by eliminating those principal components that do not aid in delineation of prospective features with their interpretation during the reconstruction process.

INTRODUCTION

Spectral decomposition of seismic data is a recently introduced interpretation tool that aids in the identification of hydrocarbons, classification of facies, and calibration of thin-bed thickness. Whether based on the discrete Fourier transform (Partyka et al., 1999), wavelet transform (Castagna et al., 2003), or S-transform (Matos et al., 2005), spectral decomposition typically generates significantly more output data than input data, presenting challenges in conveying the meaning of these data in a concise and interpreter-friendly form. Typically, an interpreter might generate 80 or more spectral amplitude and phase components spanning the usable seismic bandwidth at 1-Hz intervals. With so much data available, the key issue for interpretation is to develop an effective way for data representation and reduction.

The most common means of displaying these components is simply by scrolling through them to determine manually which single frequency best delineates an anomaly of interest. We illustrate this process for the phantom horizon slice through the seismic amplitude volume (zero-phase reflectivity) 66 ms above the Atoka unconformity from a survey acquired over the Central Basin Platform, Texas, U.S.A. (Figure 1). We compute 86 spectral components ranging from 5 Hz through 90 Hz using a matched pursuit technique described by Liu and Marfurt (2007a). Figure 2 shows representative corresponding phantom horizon slices at 20-Hz intervals from 10 Hz (Figure 2a) through 90 Hz (Figure 2e).

By observing how bright and dim areas of the response move laterally with increasing frequency, a skilled interpreter can determine whether a channel or other stratigraphic feature of interest is thickening or thinning. For example, the thinner upstream portions of the channel indicated by the magenta arrows in Figure 1 are better delineated on the 40–60-Hz components displayed in Figure 2c, whereas the thicker downstream portions of the channel indicated by the yellow arrows in Figure 1 are better delineated by the 20–40-Hz components displayed in Figure 2b. After initial analysis, we might be able to limit the display to only those frequency components most important to the task at hand (Fahmy et al., 2005). For example, we might choose the 35-Hz component to map the channel if we observe
that most channel beds have the maximum constructive interference at this frequency. However, in terms of conveying information of the spectral variation from the data space, only a small portion of the spectral variation is displayed (one out of a possible 86 spectral components).

Figure 1. Phantom horizon slice through the seismic data 66 ms above the Atoka unconformity from a survey acquired over the Central Basin Platform, Texas, U.S.A. Yellow arrows indicate downstream, and magenta arrows upstream, components of complex channel systems draining the unconformity high.

Furthermore, although scanning works well for choosing the best frequency for a given horizon, it can become impractical when trying to determine the best frequency for multiple spectral component volumes. The spectral decomposition algorithm also outputs the phase spectrum corresponding to each amplitude spectrum. Figure 3 shows the phase spectra corresponding to the amplitude spectra shown in Figure 2. The phase images in Figure 3 are not easy to interpret when compared to the amplitude slices in Figure 2, and thus the interpreter might choose to ignore the phase spectrum information during the interpretation.

We can reduce the number of images to be scanned by the use of color stacking (e.g., Liu and Marfurt, 2007b; Stark, 2005; Theophanis and Queen, 2000). In this workflow, we plot one component against red, a second against green, and a third against blue, and scale so that the color value range (0–255) of each primary color represents 95% of the corresponding component data values. Figure 4 shows red, green, and blue (RGB) images of different frequency combinations. Figures 4a-c are designed to accentuate the spectral variation in the low-, intermediate-, and high-frequency ranges, respectively. Figures 4d-f show three possible frequency combinations spanning the full frequency range.

Compared with the images in Figure 2, the RGB images in Figure 4 show greater detail by combining multiple images. Simply stated, color stacking can triple the information that is conveyed by a single mono-frequency display. Considering the complexity of the spectra in the target horizon and the limit of color channels for display, each combination will highlight only certain parts of the whole spectrum. Without further data reduction, representation of the original spectral content reaches its limits (for example, 86 × 85 × 84 combinations). This situation motivates more efficient data-reduction techniques.

Figure 2. Phantom horizon slices corresponding to those shown in Figure 1 through the (a) 10-Hz, (b) 30-Hz, (c) 50-Hz, (d) 70-Hz, and (e) 90-Hz amplitude spectral component volumes computed using a matched pursuit algorithm. Yellow arrows indicate downstream components of the channel system and are better delineated by the 20–40-Hz frequency images. Magenta arrows indicate upstream components of complex channel systems and are better delineated by the 40–60-Hz frequency images.

One method of data reduction is to generate synthetic attributes first by correlating the data against predefined spectra (or basis functions) and plotting the correlation coefficients, instead of the spectral components, against color. In the Central Basin Platform example, we note that the combination of 30, 60, 90 Hz is not strikingly different from the combination of 20, 50, 80 Hz, implying that there are no significant changes over a 10-Hz range. Taking advantage of the redundancy or correlation in the original spectra, Stark (2005) defines three spectral basis functions that can be interpreted as simple averages and plots the low-frequency average against red, intermediate-frequency average against green, and highest-frequency average against blue.

Liu and Marfurt (2007b) provide a moderate improvement to Stark’s (2005) gate or “boxcar” spectral basis function by applying raised cosines over user-defined spectral ranges to generate low-, mid-, and high-frequency color stack images. Whereas in some cases the raised cosine function better approximates the cosine-like thin-bed tuning response, such approximations are suboptimal; we might lose a considerable amount of frequency variability by using either Stark’s (2005) or Liu and Marfurt’s (2007b) three predefined basis functions. Furthermore, there is no simple measure of how much of the spectrum we do not represent.

Instead of using predefined basis functions or spectra, we propose using principal component analysis to determine mathematically those frequency spectra that best represent the data variability, thereby segregating noise components and reducing the dimensionality of spectral components. We begin with a review of principal component analysis applied to real spectral amplitudes, and then show how it is readily generalized to represent complex spectra consisting of spectral magnitudes and phases. We apply the method to a land seismic data volume from the Central Basin Platform and show how we can represent 80% of the data variation using a color stack. We conclude by showing how we can filter out selected spectral variations and thereby enhance individual spectral images by eliminating interpreter-chosen components during the reconstruction.

THEORY AND METHOD

Principal component analysis finds a new set of orthogonal axes that have their origin at the data mean and that are rotated so that the data variance is maximized (Figure 5). The orthogonal axes are called eigenvectors and represent spectra in the original frequency domain. The projections of the original spectra onto these axes are called principal component (PC) bands. The amount of the total variance that each PC band can represent is quantified by its corresponding eigenvalue. Compared with the original frequency components, PC bands are orthogonal (and thus uncorrelated) linear combinations of the original spectra. In theory, we calculate the same number of output PC bands as the input spectral components. The first PC band represents the largest percentage of data variance, the second PC band represents the second-largest data variance, and so on. The last PC bands represent the uncorrelated part of the original spectral data, which includes random noise.

Figure 3. Phantom horizon slices corresponding to those shown in Figures 1 and 2 through the (a) 10-Hz, (b) 30-Hz, (c) 50-Hz, (d) 70-Hz, and (e) 90-Hz phase spectral component volumes computed using a matched pursuit algorithm. Yellow arrows mark the location of downstream components of the channel system. Note the phase rotation of the channel system in different frequency phase components. White arrows mark the position of a northwest linear feature.

In the seismic spectral volumes we have analyzed, we find that the first 15 PC bands represent more than 95% of the total variance of the original data. The first three PC bands represent as much as 80% of
the total variance of the data. Furthermore, we can reconstruct “filtered” spectra by using a subset of the interpreter-chosen PC spectra, thereby providing a means of rejecting random noise and components that interpreters consider as uninteresting background. We show a quantitative example of this in the next section.

Figure 4. Composite RGB images of (a) 10–20–30-Hz, (b) 40–50–60-Hz, (c) 70–80–90-Hz, (d) 10–40–70-Hz, (e) 20–50–80-Hz, and (f) 30–60–90-Hz phantom horizon slices through amplitude spectral component volumes where the first spectral amplitude is plotted against red, the second against green, and the third against blue. Each spectral component has been balanced over a 500-ms analysis window over the entire survey.

Principal component analysis of more than 100 spectral components is well established in remote-sensing interpretation software and workflows (Rodarmel and Shan, 2002). Principal component analysis of seismic attributes also is well established, particularly with respect to seismic shape analysis (Coléou et al., 2003). Mathematically, principal component analysis is an effective statistical data-reduction method for data spaces in which the dimension is very large and the data axes are not orthogonal (that is, somewhat redundant) to each other. From an interpreter’s point of view, principal component analysis applied to spectral components begins by determining which frequency spectrum (the first principal component) best represents the entire data volume. The second principal component is that spectrum that best represents the part of the data not represented by the first principal component. The third principal component is the spectrum that best represents that part of the data not represented by the first two principal components, and so on.

If normalized by the total sum of all eigenvalues, the eigenvalue associated with each eigenvector represents the percentage of data that can be represented by its corresponding principal component. In general, the first principal component is a spectrum that represents more than 50% of the data variance, the second principal component is a spectrum that represents another 15% through 25% of the data variance, and the third principal component is a spectrum that represents about 5% of the data variance. Together, these three components usually represent about 80% of the frequency variance seen in the data along this horizon slice.

Principal component analysis on spectral components consists of three steps. The first step is to assemble the covariance matrix by crosscorrelating every frequency component, $j = \{1, 2, \ldots, J\}$, with itself and all other frequency components (Figure 6), resulting in a square, J by J symmetrical, covariance matrix $\mathbf{C}$:

$$C_{jk} = \sum_{n=1}^{N} \sum_{m=1}^{M} d_{mn}^{(j)} d_{mn}^{(k)}, \qquad (1)$$

where $C_{jk}$ is the jkth element of the covariance matrix $\mathbf{C}$; N is the number of seismic lines in the survey; M is the number of seismic crosslines in the survey; and $d_{mn}^{(j)}$ and $d_{mn}^{(k)}$ are spectral magnitudes of the jth and kth frequencies at line n and crossline m.

The second step is to decompose the covariance matrix into J scalar eigenvalues $\lambda_p$ and J unit length J × 1 eigenvectors $\mathbf{v}_p$ by solving the equation

$$\mathbf{C}\mathbf{v}_p = \lambda_p \mathbf{v}_p. \qquad (2)$$

Almost all numerical solutions of equation 2 (we use LAPACK [2007] program ssyevx) sort the eigenvectors $\mathbf{v}_p$ in either ascending or descending order according to their corresponding eigenvalues.

The third step is to project the spectrum at each trace onto each eigenvector $\mathbf{v}_p$ to obtain a map of coefficients $a_{mn}^{(p)}$ that measure how much of each spectrum is represented by a given eigenvector:

$$a_{mn}^{(p)} = \sum_{j=1}^{J} v_p^{(j)} d_{mn}^{(j)}, \qquad (3)$$

where the index j indicates the jth frequency. The output is a series of PC bands sorted in descending order of their statistical significance, that is, the percentage of original data variance observable in the particular PC band.
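The three steps of equations 1 through 3, together with the eigenvalue normalization and the "filtered" reconstruction mentioned earlier in this section, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`spectral_pca`, `variance_fraction`, `reconstruct`) and the array layout (frequency, line, crossline) are hypothetical, and `np.linalg.eigh` stands in for the LAPACK routine ssyevx named in the text. The covariance follows equation 1 as printed, so no mean is subtracted.

```python
import numpy as np

def spectral_pca(d):
    """PCA of spectral magnitudes d with assumed shape (J, N, M):
    J frequencies, N lines, M crosslines (layout is an assumption)."""
    d = np.asarray(d, dtype=float)
    J = d.shape[0]
    X = d.reshape(J, -1)                 # one row per frequency, one column per trace
    C = X @ X.T                          # equation 1: C_jk = sum over traces of d^(j) d^(k)
    lam, V = np.linalg.eigh(C)           # equation 2; eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]        # re-sort in descending order
    lam, V = lam[order], V[:, order]     # column p of V is the unit-length eigenvector v_p
    A = (V.T @ X).reshape(d.shape)       # equation 3: A[p] is the map of coefficients a^(p)
    return lam, V, A

def variance_fraction(lam):
    """Eigenvalues normalized by their total sum."""
    return lam / lam.sum()

def reconstruct(V, A, keep):
    """Rebuild the spectral components from the PC bands listed in `keep`."""
    idx = np.asarray(list(keep), dtype=int)
    J = V.shape[0]
    flat = V[:, idx] @ A[idx].reshape(len(idx), -1)
    return flat.reshape((J,) + A.shape[1:])
```

Under these assumptions, `variance_fraction(lam)[:3].sum()` gives the share of variance carried by the first three PC bands, and `reconstruct(V, A, [0, 2, 3, 4])` returns spectral components with the second PC band rejected.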
EXAMPLE

To illustrate the effectiveness of this technique, we examine the phantom horizon slice 66 ms above the Pennsylvanian age Atoka unconformity from a survey acquired over the Central Basin Platform, shown in Figure 1. We compute 86 spectral components ranging from 5 to 90 Hz using a matched pursuit technique described by Liu and Marfurt (2007a). Next we perform principal component analysis on the 86 spectral components, form an 86 by 86 covariance matrix using equation 1, decompose it into 86 eigenvalue-eigenvector pairs using equation 2, and project the original spectra at each trace onto each eigenvector using equation 3. Figure 7a displays the percentage of data defined by each of the principal component spectra (or eigenvectors) displayed in Figure 7b. The first three principal components account for most of the spectral variance seen along this horizon. The remaining components account for only about 17% of the data variance.

Figure 5. Principal component analysis (PCA) of data consisting of only three frequency components, with black spheres representing the three spectral components for each trace. The data cloud indicates that the three components are highly correlated and somewhat redundant. The data variance is the projection of the data cloud onto the component axis. The PCA analysis rotates the original 10, 20, and 30-Hz axes so that the first eigenvector (PC band) represents the greatest variability in the data. The variance along the second eigenvector is relatively small and mathematically uncorrelated with the major trend. The third eigenvector (not shown) is perpendicular to the first two and represents the least amount of variance in the data. Thus the first PC band can effectively capture the major features seen in the data, reducing the amount of data by a factor of three. (Figure modified from a similar one courtesy of ScottPickford Ltd.)

Figure 6. Cartoon showing the computation of the covariance matrix C. Each complex spectral time slice is simply crosscorrelated with itself and all other complex spectral time slices along the horizon of interest.

Figure 7. (a) The first 20 eigenvalues and (b) the first five corresponding eigenvectors (or principal component spectra) computed from the 86 spectral components, five of which are shown in Figure 2. The first three eigenvectors represent more than 80% of the variance in the data. The remaining eigenvectors represent only a small part of the spectral variance. The black line in (a) represents the total cumulative percentage that the first N principal components can represent. Because the data were spectrally balanced using the spectral decomposition algorithm described by Liu and Marfurt (2007a), the first principal component appears as a flat spectrum. By construction, the length of each eigenvector is exactly 1.0.

Although the input amplitude volume has nonwhite amplitude spectra, the spectral components were statistically balanced as part of Liu and Marfurt’s (2007a) matched-pursuit spectral decomposition algorithm. For this reason, the first eigenvector (PC band 1) is approximately flat and represents 62% of the total variance in the original data. The second eigenvector is monotonically decreasing, with the high frequencies contributing less than the low frequencies. We interpret this trend as representing the fact that the low frequencies probably are in-phase with each other, whereas the higher frequencies might have greater variance and need to be represented by more than one eigenvector.

Although it is tempting to assign physical significance to these spectra (with eigenvector 3 perhaps representing thin-bed tuning at about 55 Hz), we need to remember that they reside in mathematical rather than geologic space. Whereas the first eigenvector best represents the data, all subsequent eigenvectors are constructed mathematically to be orthogonal to the previous ones. The PC bands are just weighted sums of the original spectral components, as seen in equation 3.

In Figure 8, we plot the four largest principal components of the data, as well as the 71st component. It is important to remember that if a given event has very high reflectivity, it will have high amplitude in its components as well. For this reason, we see channels tuning in and out in PC 1. We note that components 2, 3, and 4 display more anomalous behavior in the sense of spectral shape changes relative to the total of all components. PC 71 represents one example of the “noisy” PC bands starting from number 20; we interpret PC 71 as random noise.

By mapping the three largest principal components against red, green, and blue, we can represent 83% of the spectral information with a single colored image (Figure 9) in which each component is rescaled to span the range 0–255. The conspicuous red features delineate the wider channels seen on the erosional high in the central and upper region in the horizon time map. The green features delineate narrower channels that are upstream from the previous red channels. Note that the narrow channels indicated by green arrows are quite difficult to see in Figures 2–4. Although they are best delineated by the higher-frequency components, these components also are contaminated by noise. Based on our earlier discussion, note that coherent spectra will be represented by the first few principal components, and incoherent spectra corresponding to random noise will be represented by a linear combination of later principal components (such as PC 71 shown in Figure 8e). Selection of the PC bands with large eigenvalues implicitly increases the signal-to-noise ratio by rejecting noisy PC bands such as 71.

Figure 8. The spectra projected onto the (a) first, (b) second, (c) third, (d) fourth, and (e) 71st principal component. Note that the anomalous areas are best represented by the second, third, and fourth principal component. The 71st principal component represents random noise.

PRINCIPAL COMPONENTS OF COMPLEX SPECTRA

The previous analysis was performed only on the magnitude of the spectra, represented by the images displayed in Figure 2. The phase spectra displayed in Figure 3 provide additional information. For example, in Figure 3c the channels marked by the yellow arrows show a 90° phase shift relative to the background, and these channels are more chaotic in the higher-frequency phase image as shown in Figure 3e. Note the spectral variation of the northwest features
marked by the white arrows in Figure 3a-e. Several geologic depositional environments might be represented by the same amplitude spectrum. In an extremely simple example, consider the following four 2-term time series: (1, −1/2), (1/2, −1), (−1/2, 1), and (−1, 1/2). Each of these four series has the same amplitude spectrum, with squared magnitude $5/4 - \cos\omega$, where $\omega$ is the angular frequency in radians per sample. This ambiguity could create a false sense of continuity, whereas in reality we change from an upward-coarsening to an upward-fining sequence.

A formal generalization of equations 1–3 would begin by computing a J by J complex Hermitian-symmetrical covariance matrix from the amplitude and phase of the complex spectra. This complex Hermitian covariance matrix then is decomposed into J real eigenvalue-complex eigenvector pairs. Crosscorrelating the complex conjugate of the complex eigenvectors with the complex spectra provides complex crosscorrelation coefficients or principal component maps. We have implemented this approach and find it unsatisfactory for two reasons. First, it is unclear how to generalize our RGB multiattribute display to represent multiple complex maps. Second, and more important, the phase of the eigenvectors provides an extra degree of freedom to fit the complex spectra. This extra degree of freedom results in the channels being blurred instead of appearing as sharp discontinuities.

Marfurt (2006) recognized this same limitation in principal-component coherence computations using the analytic (complex) rather than the original (real) trace. He devised an alternative means of computing the statistics of the original (real) and Hilbert transform (imaginary) data, and simply treated the Hilbert transform of the data as if they were additional real samples. For our complex spectral analysis problem, we therefore simply add the covariance matrix computed from the real components, $d_{mn}^{(j)} \cos\varphi_{mn}^{(j)}$, to the covariance matrix computed from the imaginary components, $d_{mn}^{(j)} \sin\varphi_{mn}^{(j)}$, where $d_{mn}^{(j)}$ is the magnitude and $\varphi_{mn}^{(j)}$ is the phase of the complex spectra at the jth frequency at line n and crossline m. If our survey has N × M seismic traces, this process is equivalent to considering the survey as having 2N × M traces. Because the real and imaginary components of the complex spectra are independent, in general, we will need to use more principal components to reconstruct the data than if we used the real data (or alternatively magnitude data) alone. Generalizing equation 1, we obtain a J by J real symmetrical covariance matrix:

$$C_{jk} = \sum_{n=1}^{N} \sum_{m=1}^{M} \left[ d_{mn}^{(j)} \cos\varphi_{mn}^{(j)} \, d_{mn}^{(k)} \cos\varphi_{mn}^{(k)} + d_{mn}^{(j)} \sin\varphi_{mn}^{(j)} \, d_{mn}^{(k)} \sin\varphi_{mn}^{(k)} \right]. \qquad (4)$$

We then reapply equations 2 and 3 and note that the phase between the real and imaginary parts of the complex spectrum is “locked” and cannot be rotated when crosscorrelated with the real eigenvectors.

In Figure 10, we display maps of the complex spectra projected onto the first four principal components. The first PC band is strikingly similar to the counterpart in Figure 8. For the second PC band, the channels stand out in excellent contrast in Figure 10 compared with Figure 8.

Figure 9. Composite RGB image of the first three principal component bands (red = PC band 1, green = PC band 2, and blue = PC band 3). Each principal component is scaled to display 95% of the data to each color channel. Note very narrow channels indicated by green arrows.

Figure 10. The complex spectra projected onto the (a) first, (b) second, (c) third, and (d) fourth principal component computed from the complex spectra using equations 4, 2, and 3.

PC ANALYSIS AS A FILTER

Because principal component analysis assigns the most coherent spectra to the first eigenvalues and the incoherent or random components of the spectra to the later eigenvalues, we can use PC analysis
as a filter. For the data, the first five PC bands will take account of struction, thereby enhancing more subtle lateral changes in spectral
more than 85% of the variance of the original data. The eigenvalues components. We interpret PC band 2 in Figure 8b to be such a tuning
corresponding to PC bands greater than five drop to a negligible lev- effect.
el, so that spectra crosscorrelated against eigenvectors with eigen- Unfortunately, the principal components themselves do not al-
vectors greater than five appear to be quite random, representing ways have a fixed relation to reflector thickness but instead will vary
very small variance in the data spectra. Plotting the first three spec- from horizon to horizon and survey to survey. Thus, we object to pre-
tral components, we generate the RGB color stack image in Figure 9, dicting reservoir thickness using principal components through geo-
thereby implicitly suppressing noise and increasing the signal-to- statistics or neural nets, although we suspect this might work in some
noise ratio. A more dramatic example can be found in Guo et al. cases. Still, prediction of thickness from physical principals requires
共2006兲, in which the horizon was contaminated by bad picks. use of the original spectral components or reconstructed original
We can use the first five PC bands to reconstruct most of the origi- components from major PC components. On the other hand, as with
nal data. In Figure 11a, we redisplay the 90-Hz spectral magnitude principal components applied to seismic trace-shape analysis 共Co-
component shown originally in Figure 2e. In Figure 11b, we recon- léou et al., 2003兲, we suspect that principal components will be an
struct the 90-Hz spectral component from the first five PC bands. We excellent tool for anomaly mapping using self-organized maps be-
notice that the reconstructed data are very similar to the original data, cause the first few principal components preserve most of the origi-
which proves the effectiveness of data reconstruction. In Figure 11c, nal data variance.
we use PC bands 1, 3, 4, and 5 to reconstruct the spectral magnitude.
Note how the narrow channels are delineated more easily after rejec-
tion of PC band 2 in Figure 8b.
Principal component filtering provides interpreters with flexibility to remove any unwanted trends from the original data by interactively rejecting those principal components that are not correlated to features of interpretation interest during data reconstruction. Such exploratory data analysis is a well-accepted workflow in attribute analysis. In some cases, the acquisition footprint might appear strongly in a given principal component spectrum and thus can be suppressed in subsequent reconstruction. In other cases, a strong thin-bed tuning imprint corresponding to the background matrix might show up in a given component and can be rejected during recon-

LIMITATIONS

Display of major principal components might overlook subtle features with little reflectivity. By construction, principal components are ordered, with the first principal component representing the greatest variance in the data. For this reason, high-amplitude background reflectors (the rock matrix through which the channel was cut in our example) will show up strongly in the first few components. However, not all features of exploration interest have strong reflectivity. A significant challenge in spectral decomposition is the illumination of “invisible channels” (Suarez et al., 2008), channels whose reflectivity is nearly indistinguishable
from that of the surrounding matrix. Wallet and Marfurt (2008) discuss an alternative, more exhaustive search based on the “grand tour” method. Because the grand tour is a projection method, they find that the first few principal components (computed using the method described here) form optimal, compact basis functions that can be projected either as a movie or interactively.

As mathematical combinations of different spectral components, principal component spectra have little direct relation to the porosity thickness provided by the original spectral components themselves. Instead, the images need to be interpreted in a more qualitative manner, using principles of seismic geomorphology within a given depositional, erosional, or diagenetic framework. Calibration of these geomorphological features needs to be done the hard way: through comparison to the original seismic data and to wells and production data. Direct correlation of spectra to reservoir properties also needs to be calibrated. Fortunately, such workflows are well established in seismic waveform analysis (e.g., Coléou et al., 2003).
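The ordering of principal components by variance, and the resulting risk of overlooking weak features, can be illustrated numerically. The sketch below is a hypothetical synthetic example, not the authors' data or code: a strong background response is present on every trace, while a weak "channel" response with a different spectral shape is present on only 5% of the traces. The function name `variance_fractions` and all amplitudes and spectral shapes are assumptions chosen for illustration.

```python
import numpy as np

def variance_fractions(spectra):
    """Fraction of total variance carried by each principal component, largest first."""
    cov = np.cov(spectra, rowvar=False)           # np.cov removes the mean itself
    eigvals = np.linalg.eigvalsh(cov)[::-1]       # ascending -> descending
    eigvals = np.clip(eigvals, 0.0, None)         # guard against tiny negative round-off
    return eigvals / eigvals.sum()

rng = np.random.default_rng(1)
n_traces, n_freqs = 1000, 40
freq = np.arange(n_freqs)

# Assumed spectral shapes: a broad background response and a narrower channel response.
background_spectrum = np.exp(-((freq - 15) / 8.0) ** 2)
channel_spectrum = np.exp(-((freq - 30) / 4.0) ** 2)

background_amp = 10.0 + 2.0 * rng.normal(size=n_traces)   # strong, on every trace
channel_amp = np.zeros(n_traces)
channel_amp[:50] = 0.5                                     # weak, on 5% of the traces

spectra = (np.outer(background_amp, background_spectrum)
           + np.outer(channel_amp, channel_spectrum)
           + 0.01 * rng.normal(size=(n_traces, n_freqs)))

fractions = variance_fractions(spectra)
cumulative = np.cumsum(fractions)
print("variance fraction of PC 1:", round(float(fractions[0]), 4))
print("variance fraction of PC 2:", round(float(fractions[1]), 4))
```

In this setup the background dominates the first component, so a display limited to the largest components would carry very little of the weak channel response. This is the situation in which a more exhaustive search of the component space becomes attractive.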
Figure 11. (a) The original 90-Hz component, (b) the same component obtained after reconstruction using only the first five PC bands, and (c) the same component obtained after reconstruction using only the PC bands numbered 1, 3, 4, and 5. Note how the channels are more clearly defined in (c).

CONCLUSIONS

We have shown that principal component analysis can reduce the redundant spectral components into significantly fewer, more manageable
bands that capture most of the statistical variance of the original spectral response. By mapping the three largest principal components using an RGB color stack, we can represent most of the spectral variance with a single image, which in our example provided an excellent delineation of channels.

Whereas the first PC band having the largest eigenvalue represents most of the signal, the later PC bands having smaller eigenvalues represent random noise. This noise might be associated with input data contaminated by ground roll, acquisition footprint, and/or bad picks. Reconstructing the spectral components from a subset of PC bands provides a filtering tool that allows us to reject noise or enhance spectral behavior that best delineates the features of interest. Mathematically, the first three PC bands represent most of the variance of the original data. However, it is important to note that there is no definite physical significance to the exact shape of the PC bands. Thus, although such images are excellent for highlighting lateral changes in data that can fit into a seismic geomorphology model, they are not easily usable for quantitative inversion, such as predicting porosity thickness. However, they could be useful input attributes to a supervised multiattribute prediction.

ACKNOWLEDGMENTS

We thank Burlington Resources for permission to use their data in this research. We also thank the sponsors of the Allied Geophysical Laboratories (AGL) industrial consortium on mapping of subtle structure and stratigraphic features using modern geometric attributes. We thank associate editor Dengliang Gao and three anonymous reviewers for their help in generating a significantly improved paper.

REFERENCES

Castagna, J., S. Sun, and R. Siegfried, 2003, Instantaneous spectral analysis: Detection of low-frequency shadows associated with hydrocarbons: The Leading Edge, 22, 120–127.
Coléou, T., M. Poupon, and K. Azbel, 2003, Interpreter’s corner - Unsupervised seismic facies classification: A review and comparison of techniques and implementation: The Leading Edge, 22, 942–953.
Fahmy, W. A., G. Matteucci, D. Butters, J. Zhang, and J. Castagna, 2005, Successful application of spectral decomposition technology toward drilling of a key offshore development well: 75th Annual International Meeting, SEG, Expanded Abstracts, 262–264.
Guo, H., K. J. Marfurt, J. Liu, and Q. Dou, 2006, Principal components analysis of spectral components: 76th Annual International Meeting, SEG, Expanded Abstracts, 988–992.
Liu, J., and K. J. Marfurt, 2007a, Instantaneous spectral attributes to detect channels: Geophysics, 72, no. 1, 23–31.
Liu, J., and K. J. Marfurt, 2007b, Multicolor display of spectral attributes: The Leading Edge, 26, 268–271.
Marfurt, K. J., 2006, Robust estimates of reflector dip and azimuth: Geophysics, 71, no. 4, 29–40.
Matos, M. C., P. Osorio, E. C. Mundim, and M. Moraces, 2005, Characterization of thin beds through joint time-frequency analysis applied to a turbidite reservoir in Campos Basin, Brazil: 75th Annual International Meeting, SEG, Expanded Abstracts, 1429–1432.
Partyka, G. A., J. Gridley, and J. Lopez, 1999, Interpretational applications of spectral decomposition in reservoir characterization: The Leading Edge, 18, 353–360.
Rodarmel, C., and J. Shan, 2002, Principal component analysis for hyperspectral image classification: Surveying and Land Information Science, 62, 115–122.
Stark, T. J., 2005, Anomaly detection and visualization using color-stack, cross-plot, and anomalousness volumes: 75th Annual International Meeting, SEG, Expanded Abstracts, 763–766.
Suarez, Y., K. J. Marfurt, and M. Falk, 2008, Seismic attribute-assisted interpretation of channel geometries and infill lithology: A case study of Anadarko Basin Red Fork channels: 78th Annual International Meeting, SEG, Expanded Abstracts, 963–967.
Theophanis, S., and J. Queen, 2000, Color display of the localized spectrum: Geophysics, 65, 1330–1340.
Wallet, B. C., and K. J. Marfurt, 2008, A grand tour of multispectral components: The Leading Edge, 27, 334–341.
