AUTOMATIC COVARIANCE MODELING AND CONDITIONAL SPECTRAL SIMULATION WITH FAST FOURIER TRANSFORM

A DISSERTATION SUBMITTED TO THE DEPARTMENT OF GEOLOGICAL & ENVIRONMENTAL SCIENCES AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

By Tingting Yao
April, 1998

UMI Number: 9901635
Copyright 1998 by Yao, Tingting. All rights reserved.
UMI Microform 9901635. Copyright 1998, by UMI Company. All rights reserved.
This microform edition is protected against unauthorized copying under Title 17, United States Code.

© Copyright 1998 by Tingting Yao

I certify that I have read this thesis and that in my opinion it is fully adequate, in scope and in quality, as a dissertation for the degree of Doctor of Philosophy.
Dr. André G. Journel (Principal adviser)

I certify that I have read this thesis and that in my opinion it is fully adequate, in scope and in quality, as a dissertation for the degree of Doctor of Philosophy.
Dr. Clayton V. Deutsch (Petroleum Engineering Department)

I certify that I have read this thesis and that in my opinion it is fully adequate, in scope and in quality, as a dissertation for the degree of Doctor of Philosophy.
Dr. Gerald M. Mavko (Geophysics Department)

I certify that I have read this thesis and that in my opinion it is fully adequate, in scope and in quality, as a dissertation for the degree of Doctor of Philosophy.
Dr. Paul Switzer (Statistics Department)

Approved for the University Committee on Graduate Studies:
Dean of Graduate Studies

Abstract

Variogram / covariance models provide a basic measure of spatial variability / continuity for geostatistical modeling. Traditionally, a closed-form analytical model is fitted to allow for interpolation of sample covariance values while ensuring the positive definiteness condition. For cokriging in presence of several cross-correlated attributes, the simultaneous modeling of these (cross) covariances is made more difficult because of restrictions imposed by the linear model of coregionalization.

Bochner's theorem maps the positive definite constraints into much simpler constraints on the Fourier transform of the covariance, that is, the spectrum density. Accordingly, this dissertation proposes an automatic (cross) covariance modeling method based on Bochner's theorem.
In the univariate case, the experimental auto-covariance table is transformed into a quasi-spectrum density table using Fast Fourier Transform (FFT). This quasi-spectrum density table is then smoothed under constraints of positivity and unit sum. A back transform (FFT) yields a permissible positive definite auto-covariance table. In the multivariate case, the algorithm is extended for automatic joint modeling of multiple (cross) covariance tables, building on an extension of Bochner's positive definiteness theorem and an eigenvalue correction. A case study using heavy metals pollution data in soil is developed to demonstrate the algorithm. The task of modeling spatial variabilities is made easier while allowing a better fit of the original sample covariance values.

The permissible density spectrum obtained in the frequency domain can be used directly for spectral simulation, which has been widely applied in electrical engineering to generate random fields with a given covariance spectrum. Spectral simulation algorithms are particularly fast when based on Fast Fourier Transform (FFT). However, because of lack of phase identification, spectral simulation can only generate unconditional realizations. Local data conditioning is typically obtained by adding a simulated kriging residual. This conditioning process calls for solving a kriging system at each simulated node, thus forfeiting the speed advantage of FFT, and it only allows reproduction of the spatial structure within the kriging search neighborhood. A new algorithm for conditioning is proposed whereby the phase values are determined iteratively to ensure approximate data honoring while reproducing the frequency spectrum, i.e., the covariance model.
A 3D case study is developed to demonstrate the better reproduction of the spatial structure over the whole field when using the proposed conditional spectral simulation algorithm.

A complete library with Fortran 77 code for implementation of the proposed algorithms is provided in the Appendix.

Acknowledgments

A dissertation is never the work of a lonely hunter. In my case, I have been so blessed with the support, collaboration, and encouragement from so many special people.

My wholehearted thanks first go to my advisor, Dr. Andre Journel. His advice, encouragement, and close reading of innumerable drafts of the dissertation have been invaluable. I can't imagine the final product of this dissertation without his input. Furthermore, his enthusiasm and devotion to geostatistics, his scientific rigor, and his commitment to excellence have constantly led me to strive for the best. He is truly the role model for my life. Even though there have been times when his sharp criticism of my papers reduced me to tears, I have to say he is the best advisor I have ever had and can ever expect to have. Andre, thank you for bearing with my initially poor English writing in reports, and for patiently nurturing an immature girl into a conscientious scholar.

I deeply appreciate Dr. Clayton Deutsch's help. During his stay at Stanford, he was the one I always turned to for any problem I encountered with the geostatistics software GSLIB. Moreover, he read my dissertation with excruciating scrutiny, and helped me to clear any conceptual confusion and to avoid even typos. I also thank Dr. Gary Mavko, who provided both financial support and academic advice during my first summer at Stanford. It was during that period that I conceived the initial ideas of my dissertation through taking the course "Fourier transform and its application", which was recommended by him. I am grateful to Dr.
Paul Switzer for being my committee member and always providing insightful advice from the perspective of a statistician. I also want to thank Dr. David Freyberg for chairing my defense.

I am extremely indebted to Dr. Tapan Mukerji for always being my support. The second topic of this dissertation, conditional spectral simulation, took shape after I discussed it with him many times. It is from him that I learned the Fourier transform, which I had lacked but which is essential for my dissertation.

My special thanks are due to Dr. Xianhuan Wen and his wife Xiaolin. They treat me like a little sister and always provide a cozy home for me when I need it most. Their loving care helped me to go through some tough times while away from my parents. I also pay my gratitude to my officemate and good friend, Sanjay Srinivasan, for his encouragement and advice, academic as well as non-academic. It has been a great pleasure to share the same office with him for three years, and the many philosophical discussions we had will be my lifetime memory.

I am deeply grateful to Phaedon Kyriakidis for carefully reviewing and correcting the whole draft of the dissertation to make it more "readable". I also want to give my thanks to all my groupmates, from each of whom I learned different things. I am very grateful to them and Stanley, the secretary of SCRF, for keeping SCRF such a pleasant work environment and for making these years such a constructive and wonderful experience for me.

My profound gratitude goes to my lovely family: my mother, father, brother and sister-in-law. As a math teacher in an elementary school, my mother introduced me to the world of science, while my father, an expert in Chinese literature, taught me to appreciate the beauty of Chinese poems and traditional music. The most important thing I've learned from them is to be honest and righteous. Such a principle and attitude has benefited my academic life, and will always guide me. I owe them a lot, if only for the lonely days they have endured since I left home at the age of 17 for further study. I want to thank my brother and sister-in-law for taking part of my responsibility to take care of our parents and comfort them when they need it most.

Last, but definitely not the least, I wish to express my special gratitude to my boyfriend for sharing each bit of trouble and happiness in my life, and cheering me up while I was frustrated in research at times. Their support and love are the main inspiration for me to accomplish the research work at Stanford. To them, I dedicate this dissertation.

Contents

1 Introduction
2 Literature review: covariance modeling
    2.1 Concepts and notations
    2.2 Parametric modeling: The Linear Model of Coregionalization
        2.2.1 Auto covariance modeling
        2.2.2 Joint cross-covariance modeling
    2.3 Non-parametric modeling based on Fourier Series Fitting
        2.3.1 Auto-covariance modeling
        2.3.2 Cross-covariance modeling
    2.4 Non-parametric modeling based on eigenvalue correction
    2.5 Comments and proposal for improvement
3 Automatic covariance modeling with Fast Fourier Transform
    3.1 Theory and methodology
    3.2 Sample and exhaustive covariance maps
    3.3 Preliminary smoothing of the covariance map
    3.4 FFT and smoothing of the spectrum table
        3.4.1 Smoothing of auto density spectrum maps
        3.4.2 Joint smoothing of cross spectra
    3.5 Using the covariance look-up table for estimation and simulation
    3.6 Concluding remarks
4 Multiple cross-covariance modeling through eigenvalue correction
    4.1 Background and motivation
    4.2 Eigenvalues correction on spectrum matrices
    4.3 The Jura data set case study
        4.3.1 Smoothing of the correlogram tables
        4.3.2 Cross validation using smoothed covariance tables
    4.4 Resolution of the covariance map
    4.5 Concluding remarks
5 Stochastic simulation: An overview
    5.1 Introduction
    5.2 Simulation in the spatial domain
        5.2.1 Sequential simulation algorithm
        5.2.2 Non-sequential simulation algorithm
    5.3 Simulation in the frequency domain
        5.3.1 Shinozuka and Jan's algorithm
        5.3.2 Conditioning spectral simulation with dual kriging
        5.3.3 The Fourier Integral Method
    5.4 Comparing lusim and FIM
6 Conditional spectral simulation algorithm
    6.1 An initial proposal for conditioning
    6.2 A case study in 2D
        6.2.1 Modification I of the Algorithm
        6.2.2 Modification II of the Algorithm
        6.2.3 Other alternatives
    6.3 Comments on the 2D case study
    6.4 3D implementation and a case study
    6.5 Concluding remarks
7 Concluding Remarks
A Fast Fourier Transform
B Relation between spectrum and amplitude of Fourier coefficients
C Computer code and Documentation
    C.1 GSLIB conventions
    C.2 Automatic covariance modeling
        C.2.1 Calculation of correlogram map
        C.2.2 Preliminary smoothing of experimental ...
        C.2.3 Modeling of auto-correlogram map
        C.2.4 Jointly modeling of multiple cross-correlogram maps
    C.3 Conditional spectral simulation

List of Figures

3.1 Reference images and sample location maps of U, V and their statistics.
3.2 Sample (cross) correlogram maps of U and V, calculated from 140 clustered data.
3.3 Sample correlogram of U in directions SW-NE, SE-NW, NS and EW.
3.4 (Cross) correlogram maps of U and V, calculated from the exhaustive data.
3.5 Exhaustive correlogram of U in four directions.
3.6 Sample (cross) correlogram maps of U and V, calculated from 140 clustered data, allowing for tolerance.
3.7 Sample correlogram of U in four directions, allowing for tolerance.
3.8 Smoothing fan for sample covariance calculation.
3.9 Sample (cross) correlogram maps of U and V after interpolation.
3.10 Sample correlogram of U in four directions after interpolation.
3.11 Smoothed jointly positive definite (cross) correlogram maps, resetting negative density spectrum values to zero in the frequency domain.
3.12 Smoothed jointly positive definite (cross) correlogram maps, with a 3 x 3 smoothing window in the frequency domain.
3.13 Smoothed jointly positive definite (cross) correlogram maps, with a 7 x 7 smoothing window in the frequency domain.
3.14 Density spectrum maps of experimental U correlogram map, and after positivity correction with smoothing windows 1 x 1, 3 x 3, 7 x 7.
3.15 Final smoothed correlogram of U in four directions (dash line), compared with the reference (solid line).
3.16 Final smoothed correlogram of V in four directions (dash line), compared with the reference (solid line).
3.17 Final smoothed cross correlogram in four directions (dash line), compared with the reference (solid line).
3.18 An example of the parameter file for kb2d using the smoothed correlogram map.
3.19 An example of the parameter file for cokb3d using the smoothed (cross) correlogram maps.
3.20 (Cross) correlogram maps of U, V built from the analytical variogram models.
3.21 Correlogram of U in four directions built from the analytical variogram model (dash line), compared with the reference (solid line).
3.22 Comparing kriging estimates of U using the traditional modeling approach and reading from the automatically-derived correlogram table.
3.23 Comparing cokriging estimates of U using the traditional linear coregionalization model and reading from the (cross) correlogram tables. Only collocated secondary data are used.
3.24 Comparing cokriging estimates of U using the traditional linear coregionalization model and reading from the (cross) correlogram tables. All neighborhood secondary data are used.
3.25 Comparing sgsim realizations of U using the traditional modeling approach and reading from the correlogram table: simulated variograms and scatterplots of simulated vs. reference values.
3.26 Accuracy plots from 50 sgsim realizations: the upper plot uses the traditional variogram model; the lower one uses the smoothed correlogram.
4.1 Location maps of all 359 cadmium samples, the 259 samples to be used as data, and the 100 samples to be used for cross-validation.
4.2 Experimental auto-correlogram maps calculated from 259 samples of cadmium (C11), chromium (C22), nickel (C33) and zinc (C44).
4.3 Experimental cross-correlogram maps between four heavy metals; r is the correlation coefficient.
4.4 Experimental auto-correlogram maps after primary interpolation.
4.5 Experimental cross-correlogram maps after primary interpolation.
4.6 Original and smoothed density spectrum maps of four auto-correlogram maps.
4.7 Original and smoothed cross density spectrum maps (real and imaginary) of 2 cross-correlogram maps.
4.8 Smoothed jointly positive definite auto-correlogram maps.
4.9 Smoothed jointly positive definite cross-correlogram maps.
4.10 Cross section views of the experimental correlograms (continuous line), the final smoothed one obtained from the proposed approach (dotted line) and those obtained from the analytical model (dash line).
4.11 Comparing kriging estimates of cadmium using the analytical variogram model and reading from the automatically-derived correlogram table.
4.12 Location maps of the 20 chromium and nickel data, and 259 chromium and zinc samples.
4.13 Comparing kriging estimates of cadmium using the analytical variogram model and reading from the automatically-derived correlogram table, retaining only 20 samples.
4.14 Comparing cokriging estimates of cadmium using the analytical variogram models (one secondary variable) and reading from the automatically-derived correlogram tables (one then three secondary variables).
6.1 Reference image and location map of conditioning data and their statistics.
6.2 Conditional sgsim results using the isotropic variogram model and 140 clustered data.
6.3 Conditional spectral simulation results using the density spectrum map and 140 clustered data.
6.4 Conditional spectral simulation results using 500 randomly drawn samples.
6.5 Conditional spectral simulation results using 1000 randomly drawn samples.
6.6 Conditional spectral simulation results using ... samples.
6.7 Histograms of 140 conditioning data and of the 8 closest neighbor values in reference, sgsim and specsim images.
6.8 Conditional spectral simulation results (Modification I) using a kriging map as initial image.
6.9 Conditional spectral simulation results (Modification II) using a pseudo-simulated initial image.
6.10 Objective function values vs. number of iterations in the spectral simulation. Left column is drawn with arithmetic scale; right column with logarithmic scale.
6.11 Accuracy evaluation of 50 realizations of sgsim and variogram model reproduction.
6.12 Accuracy evaluation and variogram model reproduction of 50 realizations of specsim starting from pseudo-simulated images built from kriging map.
6.13 Accuracy evaluation and variogram model reproduction of 50 realizations of specsim starting from pseudo-simulated images built from weighted moving average map.
6.14 Accuracy evaluation and variogram model reproduction of 50 realizations of specsim starting from completely random phases.
6.15 Flowchart of program specsim.
6.16 The simulated 3D reference volume with the input variogram model (bold line) and the simulated variograms (fine line).
6.17 Location map of randomly sampled 20 wells and the histogram of the 320 sample data.
6.18 One 3D realization volume, and the 20 simulated variograms (fine lines), plotted against the input model. The specsim algorithm starts with random phases.
6.19 One 3D realization volume, and the 20 simulated variograms (fine lines), plotted against the input model (bold line). The specsim algorithm starts with a kriged map.
6.20 One 3D realization volume, and the 20 simulated variograms (fine lines), using sgsim.
C.1 An example parameter file for corrmap.
C.2 Sample output from corrmap created by the parameter file shown in Figure C.1.
C.3 An example parameter file for intpmap.
C.4 Sample output from intpmap created by the parameter file shown in Figure C.3.
C.5 An example parameter file for automod.
C.6 Sample output from automod created by the parameter file shown in Figure C.5.
C.7 An example parameter file for multimod.
C.8 Sample output of smthcorr.2 from multimod created by the parameter file shown in Figure C.7.
C.9 An example parameter file for specsim.
C.10 Sample output from specsim created by the parameter file shown in Figure C.9.

Chapter 1

Introduction

Geostatistics provides a toolbox for the description, modeling and prediction of the spatial variability of earth science phenomena. Most geostatistical studies involve some estimation or simulation, which often calls for solving kriging systems or normal equations to account for the spatial correlation between sampled and unsampled locations. Covariance models provide a basic measure of spatial continuity of two attribute values in space and, thus, are at the heart of all geostatistical studies.
Traditionally, a closed-form analytical model is fitted to allow for interpolation of sample covariance values while ensuring the positive definiteness condition, itself ensuring existence and uniqueness of solutions to the kriging systems. This analytical model is usually based on the linear (co)regionalization model, which is simply a positive linear combination of some basic structures known beforehand to be positive definite, such as the spherical, exponential or Gaussian structure. Since these basic structures are defined by range and sill parameters, this analytical modeling is also referred to as parametric modeling. For cokriging, where data from different correlated attributes are to be integrated together, the modeling is even more difficult due to the stricter constraint imposed by the linear model of coregionalization (LMC) (Journel & Huijbregts, 1978; Goovaerts, 1997). The linear model of coregionalization requires that all the auto- and cross-covariances be fitted with exactly the same number and types of basic structures, and that the relative contributions of each basic structure be jointly subject to some restrictive constraints. The difficulty of fitting the LMC has impeded wide application of cokriging, hence the essential task of data integration.

This dissertation work aims at a completely automatic covariance modeling based on Bochner's theorem, not calling for any specific basic analytical model. Bochner's theorem maps the positive definite constraints into much simpler constraints on the Fourier transform of the covariance, that is, the density spectrum. Accordingly, we propose to transform the experimental covariance table into a quasi-density spectrum table using Fast Fourier Transform (FFT). This quasi-spectrum density table is then smoothed under constraints of positivity and unit sum. An inverse Fourier transform yields permissible positive definite covariance tables,
which can be used for constructing the kriging systems during estimation or simulation. The algorithm is extended for joint modeling of multiple (cross) covariance tables (>> 2), building on an extension of Bochner's positive definiteness theorem and an eigenvalue correction.

The smoothed permissible density spectrum previously obtained can be used directly for spectral simulation in the frequency domain without inverse Fourier transform. Such spectral simulation has been widely used in electrical engineering to generate random fields with a given covariance spectrum. The algorithms used are particularly fast when based on Fast Fourier Transform. However, because of lack of phase identification, spectral simulation can only generate unconditional realizations. In geostatistics, local data conditioning is typically obtained by adding a simulated kriging residual to the unconditional simulated value. This conditioning process calls for an additional kriging at each simulated node, thus forfeiting the elegance and speed advantage of FFT. Moreover, the spatial structure can only be reproduced within the search neighborhood used in the kriging. A new algorithm for conditioning is proposed whereby the phase values are determined iteratively to ensure approximate data honoring while reproducing the frequency spectrum, i.e., the covariance model. This frequency-domain spectral simulation allows in theory better reproduction of the spatial structure over the whole field.

Dissertation Outline

Chapter 2 introduces the basic concepts of variogram / covariance and their traditional modeling procedure. The notation introduced in this chapter is used throughout the dissertation. A review of the current analytical modeling and Fourier-series fitting methods provides the background for the dissertation. The motivation for the proposed automatic covariance modeling method arises from discussion of the disadvantages of traditional methods,
that is, the difficult analytical modeling on one hand, and the inefficient Fourier-series fitting method on the other hand.

Chapter 3 presents an automatic covariance modeling technique, in the line of the Fourier-fitting method reviewed in Chapter 2. The proposed algorithm capitalizes upon the fact that the positive definiteness constraints on the covariance are mapped into simpler constraints on its density spectrum. The main idea is to transform the experimental covariance table into quasi density spectrum tables using Fast Fourier Transform. This quasi density spectrum table is then smoothed under the constraints of positivity and unit sum. A back transform through inverse FFT yields a permissible positive definite covariance table. Through this FFT "roundtrip", a permissible covariance table is obtained without calling for any analytical model. The proposed algorithm is demonstrated step by step through an application of the smoothed covariance maps for kriging and simulation, using 2D data. The case study shows that the proposed automatic covariance map determination is faster and easier to implement than the traditional covariance modeling procedure, yet it yields comparable if not better results for kriging and simulation.

Chapter 4 extends the algorithm developed in Chapter 3 to the case of four coregionalized variables, each calling for a different density spectrum correction. For K coregionalized variables and n lag distances, checking positive definiteness involves checking a potentially very large (nK x nK) covariance matrix. Bochner's theorem reduces this large problem into ensuring positive definiteness of many (n) smaller (K x K) matrices of density spectrum. The positive definiteness of these smaller density spectrum matrices can be checked and ensured by eigenvalue correction. The algorithm is demonstrated using a public-domain data set comprising 4 coregionalized variables related to soil pollution from heavy metals.
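The eigenvalue correction outlined for Chapter 4 can be illustrated for K = 2 variables at a single frequency. The following is only a minimal sketch in Python, not the Fortran 77 code of Appendix C: the function name `correct_spectrum_matrix` is hypothetical, the 2 x 2 density spectrum matrix is assumed Hermitian, and resetting negative eigenvalues to zero is an assumed form of the correction.

```python
import math

def correct_spectrum_matrix(s11, s22, s12):
    """Return a positive semi-definite version of the Hermitian matrix
    [[s11, s12], [conj(s12), s22]] by resetting negative eigenvalues to zero.

    s11, s22: real auto-spectrum values at one frequency.
    s12: (possibly complex) cross-spectrum value at the same frequency.
    All names here are illustrative assumptions.
    """
    s12 = complex(s12)
    if s12 == 0:
        # Diagonal matrix: the eigenvalues are the diagonal entries.
        return [[complex(max(s11, 0.0)), 0j], [0j, complex(max(s22, 0.0))]]

    # Closed-form eigenvalues of a 2 x 2 Hermitian matrix.
    mean = 0.5 * (s11 + s22)
    radius = math.sqrt((0.5 * (s11 - s22)) ** 2 + abs(s12) ** 2)

    corrected = [[0j, 0j], [0j, 0j]]
    for lam in (mean + radius, mean - radius):
        # An unnormalized eigenvector for eigenvalue lam is (s12, lam - s11).
        v = (s12, complex(lam - s11))
        norm2 = abs(v[0]) ** 2 + abs(v[1]) ** 2
        lam_pos = max(lam, 0.0)  # the eigenvalue correction
        for i in range(2):
            for j in range(2):
                corrected[i][j] += lam_pos * v[i] * v[j].conjugate() / norm2
    return corrected

# Hypothetical input with eigenvalues 3 and -1: the negative one is removed,
# so every entry of `corrected` equals 1.5 (up to rounding).
corrected = correct_spectrum_matrix(1.0, 1.0, 2.0)
```

Applying such a correction independently at each of the n frequencies is what replaces the single (nK x nK) check by n small (K x K) checks.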
The impact of spectrum tables smoothing is investigated through cross-validation. This algorithm for simultaneous modeling of (cross)-covariance maps of many coregionalized variables may prove critical for multiple data types integration, leading to better prediction of a sparsely sampled primary variable.

Chapters 2 to 4 address the problem of modeling the density spectrum, a measure of spatial continuity of the phenomenon under study. In the following chapters, we will discuss spectral simulation algorithms which can generate realizations reproducing such spectra, whether they are modeled by functions or lookup tables.

Chapter 5 presents an overview of the current stochastic simulation techniques, with emphasis on the duality between the LU decomposition algorithm (lusim) in the spatial domain and the Fourier Integral Method (FIM) in the frequency domain. This duality leads to a direct conditional spectral simulation algorithm in the frequency domain which does not resort to the traditional error simulation calling for the solution of a kriging system for conditioning.

Chapter 6 proposes a conditional spectral simulation approach through phase identification based on the Fourier Integral Method. A 2D case study illustrates the algorithm. The algorithm is improved, leading finally to two alternatives:

1. in an estimation mode, a post processing of kriging maps to impose covariance reproduction
2. in a simulation mode, the generation of multiple conditional realizations.

In the latter mode, the input variogram model is reproduced significantly better than when using the traditional sequential Gaussian simulation method, as demonstrated by a 3D case study.

Chapter 7 provides some concluding remarks and future research avenues.

Appendix A details the implementation of the Fast Fourier Transform algorithm in two different situations: the power-of-2 Discrete Fourier Transform algorithm and the Prime Factor Analysis algorithm.
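As a rough illustration of the Fourier "roundtrip" outlined for Chapter 3, the sketch below corrects a small 1D correlogram table. It is a hedged toy example, not the dissertation's implementation: a naive O(n^2) discrete Fourier transform stands in for the FFT of Appendix A, the table is assumed circularly symmetric, the moving-window smoothing is omitted, and the names `dft` and `permissible_correlogram` are hypothetical.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (a stand-in for an FFT).

    The forward transform is scaled by 1/n so that the spectrum of a
    correlogram sums to its value at lag 0.
    """
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for j in range(n):
        total = sum(values[k] * cmath.exp(sign * 2j * math.pi * j * k / n)
                    for k in range(n))
        out.append(total if inverse else total / n)
    return out

def permissible_correlogram(rho):
    """Correct a circularly symmetric correlogram table rho[0..n-1].

    Assumes at least one spectrum value is positive (non-zero total).
    """
    spectrum = [s.real for s in dft(rho)]        # real because rho is even
    spectrum = [max(s, 0.0) for s in spectrum]   # positivity constraint
    total = sum(spectrum)
    spectrum = [s / total for s in spectrum]     # unit-sum constraint
    corrected = [c.real for c in dft(spectrum, inverse=True)]
    return corrected, spectrum

# Hypothetical 4-lag table whose raw spectrum contains a negative value.
rho = [1.0, 0.9, 0.1, 0.9]
fixed, spectrum = permissible_correlogram(rho)
```

With a non-negative spectrum summing to one, the back-transformed table has value 1 at lag 0 and is positive semi-definite on the circular grid, which is the property the positivity and unit-sum constraints are meant to secure.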
Appendix B demonstrates the duality between covariance in the spatial domain and the density spectrum in the frequency domain, presenting the square-root relation between Fourier coefficient amplitude and density spectrum. Appendix C presents the computer code developed in this thesis with full documentation of the required parameter files.

Chapter 2

Literature review: covariance modeling

This chapter reviews the fundamental concepts of variogram / covariance and their modeling procedure. The discussion provides the background and motivation for the modeling methodology proposed.

Section 2.1 introduces the basic concepts of random variable, random function, the stationarity decision, variogram, covariance and correlogram. The notation introduced in this section will be used throughout the thesis. Section 2.2 reviews the analytical covariance modeling method based on the linear (co)regionalization model, which amounts to adopting a linear combination of basic permissible functions known to be positive definite. Section 2.3 takes a look at the Fourier-series fitting method based on Bochner's theorem and Fourier series. Section 2.4 briefly introduces a method for constructing permissible covariance matrices based on eigenvalue correction. Section 2.5 discusses the advantages and drawbacks of the reviewed algorithms and suggests a new proposal for modeling, which will be fully developed in the following chapters.

2.1 Concepts and notations

Geostatistics is the "application of statistical methods or the collection of statistical data for use in the earth science, particularly in geology" (Olea, 1991). Geostatistics provides an important toolbox for describing, modeling and predicting the spatial variability of earth science phenomena.

In spatial statistics, any value $z$ at location $\mathbf{u}$ within the domain of interest $A$ is treated as an outcome $z(\mathbf{u})$ of a random variable (RV) $Z(\mathbf{u})$, which can take a series of possible values,
also called realizations. The set of spatially dependent RVs, $\{Z(\mathbf{u}), \mathbf{u} \in A\}$, constitutes a random function (RF), also denoted as $Z(\mathbf{u})$. A RF is associated to a specific attribute; another set of RVs related to a different attribute, $\{Y(\mathbf{u}'), \mathbf{u}' \in A\}$, defines another RF $Y(\mathbf{u})$.

The prediction of an outcome value $z(\mathbf{u})$ builds on neighboring sampled values of either the same attribute, $z(\mathbf{u}_\alpha), \alpha = 1, \ldots, n$, or a different attribute, e.g., $y(\mathbf{u}'_\alpha), \alpha = 1, \ldots, n'$. The estimated value $z^*(\mathbf{u})$ is typically expressed as a linear combination of the sampled values of these two different attributes:

$$z^*(\mathbf{u}) = \sum_{\alpha=1}^{n} \lambda_\alpha(\mathbf{u})\, z(\mathbf{u}_\alpha) + \sum_{\alpha=1}^{n'} \nu_\alpha(\mathbf{u})\, y(\mathbf{u}'_\alpha) \tag{2.1}$$

where the $\lambda_\alpha$'s and $\nu_\alpha$'s are weights to be determined. Matheron (1963) developed a generalized linear regression algorithm he called kriging, which provides a Best Linear Unbiased Estimator (BLUE) of $Z(\mathbf{u})$. In 1D, this algorithm is also known as "Wiener filtering" (Wiener, 1966). The term "best" is used here in a least square sense associated to the minimization of an error variance. This minimization calls for a set of normal equations, called the (co)kriging system (Luenberger, 1969; Goovaerts, 1997). The (co)kriging system calls for a set of covariance / cross-covariance functions. In the case of two RFs, $Z(\mathbf{u})$ and $Y(\mathbf{u})$, that cokriging system is:

$$
\begin{bmatrix}
[C_Z(\mathbf{u}_\alpha - \mathbf{u}_\beta)] & [C_{ZY}(\mathbf{u}_\alpha - \mathbf{u}'_\beta)] \\
[C_{YZ}(\mathbf{u}'_\alpha - \mathbf{u}_\beta)] & [C_Y(\mathbf{u}'_\alpha - \mathbf{u}'_\beta)]
\end{bmatrix}
\begin{bmatrix}
[\lambda(\mathbf{u})]^T \\
[\nu(\mathbf{u})]^T
\end{bmatrix}
=
\begin{bmatrix}
[C_Z(\mathbf{u}_\alpha - \mathbf{u})]^T \\
[C_{ZY}(\mathbf{u}'_\alpha - \mathbf{u})]^T
\end{bmatrix}
\tag{2.2}
$$

where $[C_Z(\mathbf{u}_\alpha - \mathbf{u}_\beta)]$ is the auto-covariance matrix between any two primary data $z(\mathbf{u}_\alpha)$ and $z(\mathbf{u}_\beta)$, $[C_Y(\mathbf{u}'_\alpha - \mathbf{u}'_\beta)]$ is that between any two secondary data $y(\mathbf{u}'_\alpha)$ and $y(\mathbf{u}'_\beta)$, and $[C_{ZY}(\mathbf{u}_\alpha - \mathbf{u}'_\beta)]$ is the cross-covariance matrix between primary data $z(\mathbf{u}_\alpha)$ and secondary data $y(\mathbf{u}'_\beta)$; moreover, under the symmetry assumption, $[C_{YZ}] = [C_{ZY}]^T$. $[\lambda(\mathbf{u})]^T$ is an $n \times 1$ vector of weights applied on the primary data and $[\nu(\mathbf{u})]^T$ is an $n' \times 1$ vector of weights applied on the secondary data; $[C_Z(\mathbf{u}_\alpha - \mathbf{u})]^T$ is the $n \times 1$ vector of primary data-to-unknown auto covariance and $[C_{ZY}(\mathbf{u}'_\alpha - \mathbf{u})]^T$
is the $n' \times 1$ vector of secondary data-to-unknown cross-covariance values.

To construct the covariance matrices, one must know the auto- and cross-covariance between any two locations, i.e., $C_Z(\mathbf{u}, \mathbf{u}')$, $C_{ZY}(\mathbf{u}, \mathbf{u}')$ and $C_Y(\mathbf{u}, \mathbf{u}')$, $\forall\, \mathbf{u}, \mathbf{u}' \in A$. In practice, there is at most one sample at any particular location. In order to infer the covariances statistically, the unavailable replication of sampling at any location $\mathbf{u}$ is traded for replication at other locations $\mathbf{u}'$ in space. This tradeoff underlies the decision of stationarity, which is a property of the RF model rather than a property that can be checked from the data (Isaaks & Srivastava, 1988; Journel, 1986). Under the stationarity decision, the covariance $C(\mathbf{u}, \mathbf{u}') = C(\mathbf{h})$ depends only on the separation vector $\mathbf{h} = \mathbf{u}' - \mathbf{u}$ and thus can be inferred from a pool of pairs of sample values $\{z(\mathbf{u}_\alpha), z(\mathbf{u}_\alpha + \mathbf{h})\}$ taken at different locations $\mathbf{u}_\alpha$ and approximately separated by the same vector $\mathbf{h}$.

More specifically, the various entries of the (co)kriging system (2.2) are estimated by:

• sample auto-covariances, which provide a measure of correlation between any two RVs of the same attribute, e.g.,

$$\hat{C}_Z(\mathbf{h}) = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} z(\mathbf{u}_\alpha) \cdot z(\mathbf{u}_\alpha + \mathbf{h}) - m_{Z_{-h}}\, m_{Z_{+h}} \qquad (2.3)$$

where $N(\mathbf{h})$ is the number of data pairs approximately separated by vector $\mathbf{h}$, $m_{Z_{-h}} = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} z(\mathbf{u}_\alpha)$ is the mean of the tail values, and $m_{Z_{+h}} = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} z(\mathbf{u}_\alpha + \mathbf{h})$ is the mean of the head values. Recall that an auto-covariance is symmetric, i.e., $C_{ZZ}(\mathbf{h}) = C_{ZZ}(-\mathbf{h})$.

• sample cross-covariances, which provide a measure of cross-correlation between two RVs related to different attributes, e.g.:

$$\hat{C}_{ZY}(\mathbf{h}) = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} z(\mathbf{u}_\alpha) \cdot y(\mathbf{u}_\alpha + \mathbf{h}) - m_{Z_{-h}}\, m_{Y_{+h}} \qquad (2.4)$$

where $m_{Y_{+h}} = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} y(\mathbf{u}_\alpha + \mathbf{h})$. Note that $C_{ZY}(\mathbf{h}) = C_{YZ}(-\mathbf{h})$ always; however, $C_{ZY}(\mathbf{h})$ may be different from $C_{YZ}(\mathbf{h})$, a difference called "lag effect" (Journel & Huijbregts, 1978).

To standardize the covariance defined above by the respective tail and head standard deviations, the sample correlogram is defined as:

$$\hat{\rho}_Z(\mathbf{h}) = \frac{\hat{C}_Z(\mathbf{h})}{\sigma_{Z_{-h}}\, \sigma_{Z_{+h}}} \qquad (2.5)$$

where
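$\sigma^2_{Z_{-h}} = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} z^2(\mathbf{u}_\alpha) - m^2_{Z_{-h}}$ and $\sigma^2_{Z_{+h}} = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} z^2(\mathbf{u}_\alpha + \mathbf{h}) - m^2_{Z_{+h}}$ define the standard deviations $\sigma_{Z_{-h}}$ and $\sigma_{Z_{+h}}$ of the tail and head values.

When the tail and head refer to different attributes, the previous measure is called the cross-correlogram:

$$\hat{\rho}_{ZY}(\mathbf{h}) = \frac{\hat{C}_{ZY}(\mathbf{h})}{\sigma_{Z_{-h}}\, \sigma_{Y_{+h}}} \qquad (2.6)$$

where $\sigma^2_{Y_{+h}} = \frac{1}{N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} y^2(\mathbf{u}_\alpha + \mathbf{h}) - m^2_{Y_{+h}}$. Note that $\hat{\rho}_{ZY}(0)$ identifies the sample linear correlation coefficient between the two attributes.

Although covariance values are used in the original normal equations (Luenberger, 1969, p. 160), geostatisticians prefer the semi-variogram alternative defined as:

$$\gamma_Z(\mathbf{h}) = \tfrac{1}{2} \mathrm{Var}\{Z(\mathbf{u} + \mathbf{h}) - Z(\mathbf{u})\} = C_Z(0) - C_Z(\mathbf{h}), \quad \text{if } C_Z(0) \text{ exists} \qquad (2.7)$$

where $C_Z(\mathbf{h})$ is the stationary covariance, and $C_Z(0) = \mathrm{Var}\{Z(\mathbf{u})\}$ is the stationary variance. The historical reason for geostatisticians preferring the variogram is that its definition (2.7) requires only second-order stationarity of the RF increments, a condition also known as the "intrinsic hypothesis" (Deutsch, 1997). The non-consequential effect for most practical situations is discussed in Isaaks and Srivastava (1988), Ripley (1988), and Journel and Rossi (1989).

The sample semi-variogram is:

$$\hat{\gamma}_Z(\mathbf{h}) = \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left[ z(\mathbf{u}_\alpha) - z(\mathbf{u}_\alpha + \mathbf{h}) \right]^2 \qquad (2.8)$$

Similarly, the sample cross-semivariogram is:

$$\hat{\gamma}_{ZY}(\mathbf{h}) = \frac{1}{2N(\mathbf{h})} \sum_{\alpha=1}^{N(\mathbf{h})} \left[ z(\mathbf{u}_\alpha) - z(\mathbf{u}_\alpha + \mathbf{h}) \right] \left[ y(\mathbf{u}_\alpha) - y(\mathbf{u}_\alpha + \mathbf{h}) \right] \qquad (2.9)$$

Due to the limited number of samples available, only a few experimental covariance values $\hat{C}(\mathbf{h}_i), i = 1, \ldots, n$ can be inferred from the data. These experimental values may display large sampling fluctuations irrelevant to the underlying spatial structure. Typically, an analytical function is fitted to the few sample covariance values $\hat{C}(\mathbf{h}_i)$ to inform all distance vectors $\mathbf{h} \neq \mathbf{h}_i$, as well as to smooth out the sampling fluctuations.

Moreover, to ensure the existence and uniqueness of the solution of the kriging systems, all auto-covariance matrices, $\mathbf{C}_Z = [C_Z(\mathbf{h})]$ and $\mathbf{C}_Y = [C_Y(\mathbf{h})]$, as well as the joint auto- and cross-covariance matrix

$$\begin{bmatrix} \mathbf{C}_Z & \mathbf{C}_{ZY} \\ \mathbf{C}_{YZ} & \mathbf{C}_Y \end{bmatrix}$$

must be positive definite (Armstrong, 1981; Armstrong, 1981). Similarly,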
the auto- and cross-variogram matrix must be negative definite (Christakos, 1984).

Traditionally, a closed-form analytical model is fitted to the experimental covariance values. These analytical models are simply positive combinations of some basic structures which are known beforehand to be positive definite. The permissible basic variogram structures include the nugget effect model, spherical model, exponential model, Gaussian model, and power model. These structures, except for the power model, actually or asymptotically reach the sill $c$, i.e., the variance, at a distance called the range $a$. For example, for the spherical model in 1D,

$$\gamma(h) = c \cdot \mathrm{Sph}\!\left(\frac{h}{a}\right) = \begin{cases} c \left[ 1.5\, \dfrac{h}{a} - 0.5 \left( \dfrac{h}{a} \right)^3 \right], & \text{if } h \le a \\ c, & \text{otherwise} \end{cases}$$

In 2D or 3D space, a phenomenon may show different patterns of spatial continuity in different directions. This is called anisotropy. There are two types of anisotropy:

• Geometric anisotropy: the directional semi-variograms have the same shape and reach the same sill at different finite ranges. These different ranges constitute an ellipsoid with the three main axes identifying the three major directions and the corresponding ranges, denoted $a_1$, $a_2$ and $a_3$. A geometric anisotropy can be corrected by transforming the original coordinates $\mathbf{h} = (h_1, h_2, h_3)^T$ into a new vector $\mathbf{h}' = (h'_1, h'_2, h'_3)^T$, applying first a rotation matrix and then a rescaling to the original vector, so that the value of the anisotropic semivariogram model $\gamma(\mathbf{h})$ identifies that of an isotropic model $\gamma_0(\mathbf{h}')$ with an isotropic range. Let $A$ be that transform, with:

$$\mathbf{h}' = A \mathbf{h} \qquad (2.10)$$

Details about anisotropy correction can be found in Isaaks & Srivastava (1989, p. 386) and Goovaerts (1998, p. 92).

• Zonal anisotropy: the directional semi-variograms reach different sills in different directions. This can be seen as an extreme case of geometric anisotropy.

After the anisotropic correction,
$\gamma(\mathbf{h}) = \gamma_0(\mathbf{h}')$. The 3D case is similar to 2D except for a more complex ($3 \times 3$) rotation and rescaling matrix.

There exist different algorithms for modeling experimental (cross) covariances subject to positive definiteness constraints, the two main approaches being parametric (Journel & Huijbregts, 1978; Isaaks & Srivastava, 1989; Deutsch & Journel, 1997; Goovaerts, 1997) and non-parametric modeling (Cherry, 1994; Hall, Fisher and Hoffman, 1994; Rehman, 1995). The traditional modeling approach, or parametric modeling, considers only positive linear combinations of the basic covariance structures (functions) which are known beforehand to be positive definite. These basic structures, e.g., the spherical model or the exponential model, are closed-form analytical functions defined by a few parameters, typically a range $a$ and a relative contribution $c$ to the total variation, hence the term "parametric" models.

Non-parametric models, as opposed to parametric models, are tables of covariance values which need not all be related by an analytical formula. Nevertheless, all values in the table must be consistent with each other and provide an overall "good" fit of the corresponding experimental covariance table. Actually, a "non-parametric" model uses many more parameters than a so-called parametric model (Journel, 1984), at the limit one parameter per entry of the covariance table. The intention is to provide a closer fit of the experimental covariance values. The present "non-parametric" modeling methods include:

• fitting a Fourier series in the spectral domain, then transforming it back into a permissible covariance table in the spatial domain, a process based on Bochner's theorem (Bochner, 1949; Rehman, 1995).

• building up a permissible (positive definite) covariance matrix based on eigenvalue correction (Wackernagel, 1994,
1995).

These various modeling approaches are discussed in detail in the following sections, and a new method for non-parametric modeling is proposed, improving on the present practice.

2.2 Parametric modeling: The Linear Model of Coregionalization

2.2.1 Auto covariance modeling

In most situations, two or more $(L + 1)$ basic structures must be combined to fit the shape of an experimental covariance function. Each of these basic covariance structures $c_l(\mathbf{h})$ can be considered as the contribution from an independent random function component $Y^l(\mathbf{u})$. The RF $Z(\mathbf{u})$ is then modeled as the sum of $(L + 1)$ independent component RFs $Y^l(\mathbf{u})$:

$$Z(\mathbf{u}) = \sum_{l=0}^{L} b^l\, Y^l(\mathbf{u}) \qquad (2.11)$$

The resulting $Z$-covariance is:

$$C(\mathbf{h}) = \sum_{l=0}^{L} (b^l)^2\, c_l(\mathbf{h}) \qquad (2.12)$$

where $c_l(\mathbf{h})$ is the basic covariance structure corresponding to component $Y^l(\mathbf{u})$ and $(b^l)^2$ is its covariance contribution. The corresponding sufficient conditions for the resulting covariance model $C(\mathbf{h})$ to be positive definite are:

1. each basic structure $c_l(\mathbf{h})$ is permissible, i.e., is positive definite, and

2. the contribution or sill $(b^l)^2$ of each basic structure $c_l(\mathbf{h})$ is positive.

The practice of modeling an experimental covariance is most often done by trial and error. One can use algorithms such as weighted least squares estimation (Cressie, 1985), quadratic estimation (Marshall & Mardia, 1985; Christensen, 1993), or maximum likelihood estimation (Zimmerman, 1989; Dietrich & Osborne, 1991; Pardo-Igúzquiza, 1997). The basic idea is to first decide upon the types of the basic covariance structures, i.e., spherical, exponential, etc., with the associated parameters (ranges and relative sills) unknown. Then an objective function is defined as the difference between the observed experimental values $\hat{C}(\mathbf{h}_i)$ and the values calculated from the model $C(\mathbf{h}_i)$:

$$WSS = \sum_{i} \omega(\mathbf{h}_i) \cdot \left[ \hat{C}(\mathbf{h}_i) - C(\mathbf{h}_i) \right]^2$$

where the weight $\omega(\mathbf{h}_i)$ is usually made proportional to the number of data pairs used to calculate the value $\hat{C}(\mathbf{h}_i)$.
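The weighted least-squares criterion can be sketched in a few lines. This is only an illustration under stated assumptions, not the code of this thesis: the function names (`spherical_cov`, `wss`, `grid_fit`), the brute-force grid search used in place of a proper optimizer, and the synthetic "experimental" values are all hypothetical. A single spherical structure with sill $c$ and range $a$ is assumed, written in covariance form $C(h) = c\,[1 - \mathrm{Sph}(h/a)]$.

```python
def spherical_cov(h, sill, a):
    """Spherical covariance: sill * (1 - 1.5 h/a + 0.5 (h/a)^3) for h < a, else 0."""
    h = abs(h)
    if h >= a:
        return 0.0
    r = h / a
    return sill * (1.0 - 1.5 * r + 0.5 * r ** 3)


def wss(lags, c_exp, weights, sill, a):
    """Weighted sum of squares between experimental and model covariances."""
    return sum(w * (c - spherical_cov(h, sill, a)) ** 2
               for h, c, w in zip(lags, c_exp, weights))


def grid_fit(lags, c_exp, weights, sills, ranges):
    """Brute-force search for the (sill, range) pair minimizing WSS."""
    best = min((wss(lags, c_exp, weights, s, a), s, a)
               for s in sills for a in ranges)
    return best[1], best[2], best[0]


# Synthetic "experimental" values generated from sill = 2, range = 10.
lags = [1.0, 2.0, 4.0, 6.0, 8.0, 12.0]
c_exp = [spherical_cov(h, 2.0, 10.0) for h in lags]
weights = [50, 48, 40, 35, 30, 20]          # stand-in for the pair counts N(h_i)

sills = [0.5 * k for k in range(1, 9)]      # candidate sills 0.5 .. 4.0
ranges = [2.0 * k for k in range(1, 11)]    # candidate ranges 2 .. 20
sill_hat, range_hat, wss_min = grid_fit(lags, c_exp, weights, sills, ranges)
```

Because the synthetic values were generated from a model lying on the search grid, the search recovers the generating sill and range. With real experimental covariances the minimum would be non-zero, and a continuous optimizer would normally replace the grid.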
The minimization of the objective function $WSS$ will provide the parameters of the various covariance components.

Covariance modeling is rarely the objective per se; it is used as a tool for kriging estimation. Therefore, one should check the parameters of the resulting covariance model by cross-validation (Davis, 1987; Journel, 1987; Lamorey, 1995). The idea is to remove one datum at a time and re-estimate its value using the various covariance models being considered. The comparison between the estimated and the real values provides insight into whether a model is appropriate or not. Different parameters for the model (2.12) are considered, then used for kriging estimation. The cross-validation between resulting estimates and true values provides a criterion for selecting a model and its parameters among many possible alternatives.

The previous modeling method requires one to decide on the number and the types of the basic covariance structures $c_l(\mathbf{h})$, then determine their model parameters (range and contribution). The prior decision about the number and types of component structures is somewhat subjective.

2.2.2 Joint cross-covariance modeling

In the presence of multiple cross-correlated variables $Z_k(\mathbf{u}), k = 1, \ldots, K$, the $K^2$ auto- and cross-covariances cannot be modeled independently. They must all be modeled simultaneously to ensure the positive definiteness of any cross-correlation matrix built from them. The linear model of coregionalization was introduced for this purpose (Journel & Huijbregts, 1978, p. 171). The idea is again to decompose each variable $Z_k(\mathbf{u})$ into a number of independent components $Y_i^l(\mathbf{u}), l = 0, \ldots, L, i = 1, \ldots, n_l$, where the components are independent of each other, although $Y_i^l$ and $Y_{i'}^l$ share the same auto-covariance $c_l(\mathbf{h})$. More precisely,

$$Z_k(\mathbf{u}) = \sum_{l=0}^{L} \sum_{i=1}^{n_l} a^l_{ki}\, Y^l_i(\mathbf{u}) \qquad (2.13)$$

Hence:

$$C_{kk'}(\mathbf{h}) = \sum_{l=0}^{L} \sum_{i=1}^{n_l} a^l_{ki}\, a^l_{k'i}\, c_l(\mathbf{h}) = \sum_{l=0}^{L} b^l_{kk'}\, c_l(\mathbf{h}) \qquad (2.14)$$

with

$$b^l_{kk'} = \sum_{i=1}^{n_l} a^l_{ki}\, a^l_{k'i}, \quad \forall\, k, k' \qquad (2.15)$$

The last expression (2.15) defines $(L + 1)$ matrices of coefficients $[b^l_{kk'}]$, each of dimension $K \times K$.
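Expression (2.15) can be checked numerically for one structure $l$. The sketch below is an illustration only: the loading matrix `A` (rows indexed by variable $k$, columns by component $i$) and the function names are hypothetical, and $K = 2$ is assumed so that positive definiteness can be verified from the leading entry and the determinant.

```python
def coregionalization_matrix(A):
    """b[k][k'] = sum_i A[k][i] * A[k'][i], i.e. B = A A^T as in expression (2.15)."""
    K = len(A)
    return [[sum(x * y for x, y in zip(A[k], A[kp])) for kp in range(K)]
            for k in range(K)]


def quadratic_form(B, w):
    """w^T B w; non-negative for every w when B is positive semi-definite."""
    n = len(w)
    return sum(w[k] * B[k][kp] * w[kp] for k in range(n) for kp in range(n))


# Hypothetical loadings for K = 2 variables and n_l = 2 components.
A = [[1.0, 0.5],
     [0.3, 2.0]]
B = coregionalization_matrix(A)

# For K = 2, positive definiteness amounts to b11 > 0 and det(B) > 0.
det_B = B[0][0] * B[1][1] - B[0][1] * B[1][0]
q = quadratic_form(B, [1.0, -1.0])
```

Any matrix built this way is symmetric and at least positive semi-definite, since $w^T A A^T w = \|A^T w\|^2 \ge 0$; it is strictly positive definite when the rows of `A` are linearly independent, as in this example.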
Expression (2.15) is also the general expression of a positive definite matrix.

Therefore, the sufficient conditions for the cross-covariance matrix $[C_{kk'}(\mathbf{h})]$ to be jointly positive definite are:

1. each component function $c_l(\mathbf{h})$ must be a permissible covariance structure,

2. the $(L + 1)$ coregionalization matrices $[b^l_{kk'}]$ of the relative contribution of each basic structure must all be positive definite.

This modeling becomes very difficult as the number $K$ of coregionalized variables increases; see Journel and Huijbregts (1978), Myers (1983), Goulard (1989), Bourgault and Marcotte (1991), Myers (1992), Bourgault et al. (1995), Goovaerts (1994), Wackernagel (1994, 1995), Webster et al. (1994).

Iterative algorithms to fit a linear coregionalization model have been proposed by Goulard (1989) and Goulard and Voltz (1992). The idea is to specify $L + 1$ permissible basic structures $c_l(\mathbf{h})$ to be combined linearly to model each (cross) variogram $C_{kk'}(\mathbf{h})$. For 2 variables, that coregionalization model is written:

$$C_{kk'}(\mathbf{h}) = \sum_{l=0}^{L} b^l_{kk'}\, c_l(\mathbf{h}), \quad \forall\, k = 1, 2;\ k' = 1, 2$$

The constraint is that the coefficient matrix for each of the $L + 1$ basic structures $c_l(\mathbf{h})$:

$$B^l = \begin{bmatrix} b^l_{11} & b^l_{12} \\ b^l_{21} & b^l_{22} \end{bmatrix} \qquad (2.16)$$

be positive definite. Goulard's algorithm starts with a set of arbitrary coefficient matrices $B^l$. The coefficients of these matrices are then iteratively perturbed so as to minimize a weighted sum of squared differences between sample and model covariance values, subject to the positive definiteness constraints (2.16). Each squared difference is weighted proportionally to the number of data pairs used in the calculation of the corresponding sample covariance value. Theoretically, the procedure may not converge, nor is it guaranteed to provide a unique solution. The most critical problem remains the prior specification of the number and type of the basic structures $c_l(\mathbf{h})$.

In summary, the parametric linear model of (co)regionalization decomposes a RF
$Z(\mathbf{u})$ into several independent components $Y^l(\mathbf{u}), l = 0, \ldots, L$, each of which has its own covariance structure; the positive linear combination of these structures provides a permissible analytical covariance model for $Z(\mathbf{u})$. Ideally, some physical interpretation of each component $Y^l(\mathbf{u})$ should be available for the process of decomposition and modeling. The linear coregionalization model requires a prior determination of the number and types of covariance structures; its parameters can then be obtained by some optimization algorithm based on goodness-of-fit or cross-validation criteria. The fit between the analytical model and the experimental values may be poor because each structure is limited to one of the few classical positive definite basic models. For the multivariate case, the modeling of jointly positive definite (cross) covariances is even more difficult because of the constraints (2.15).

Another limitation of the linear model of coregionalization is that the cross-covariance must be symmetric because of the expression (2.15) for $b^l_{kk'}$. Thus it cannot account for asymmetry, also called "lag effect" (Goovaerts, 1998, p. 48).

2.3 Non-parametric modeling based on Fourier Series Fitting

A non-parametric covariance modeling procedure consists of fitting a set of Fourier series, then ensuring that they lead to permissible covariance tables. The algorithm builds on Bochner's theorem (Bochner, 1949). Bochner's theorem states that a function $f(\mathbf{h})$ is positive definite if and only if it can be represented as the Fourier transform of a bounded non-decreasing positive measure $F(\mathbf{t})$, i.e.,

$$f(\mathbf{h}) = \int e^{i\, \mathbf{t} \cdot \mathbf{h}}\, dF(\mathbf{t}) \qquad (2.17)$$

where $\mathbf{h}$ and $\mathbf{t}$ are $d$-dimensional vectors in the spatial and frequency domains respectively.

In the 1D case, a set of discrete covariance values can be obtained by replacing the previous integral with a discrete sum:

$$c(h_i) = \sum_{j=1}^{m} e^{i\, t_j h_i}\, a_j, \quad a_j \ge 0, \quad i = 1, \ldots, n \qquad (2.18)$$

where $h_i, i = 1, \ldots, n$ are the discrete spatial sampling locations.
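The link between non-negative coefficients and permissibility can be illustrated numerically. The sketch below assumes a real, symmetric spectrum, so that the discrete sum (2.18) reduces to its cosine form $c(h) = \sum_j a_j \cos(t_j h)$; the frequencies, coefficients, sample locations and function names are hypothetical. It evaluates the quadratic form $\sum_i \sum_k w_i w_k\, c(x_i - x_k)$ for random weights, which must never be negative for a permissible model.

```python
import math
import random


def fourier_cov(h, freqs, coeffs):
    """c(h) = sum_j a_j cos(t_j h), the real form of the discrete sum (2.18)."""
    return sum(a * math.cos(t * h) for t, a in zip(freqs, coeffs))


def quadratic_form(points, weights, freqs, coeffs):
    """sum_i sum_k w_i w_k c(x_i - x_k); must be >= 0 for a permissible model."""
    return sum(wi * wk * fourier_cov(xi - xk, freqs, coeffs)
               for xi, wi in zip(points, weights)
               for xk, wk in zip(points, weights))


freqs = [0.1 * j for j in range(1, 6)]        # hypothetical frequencies t_j
coeffs = [1.0, 0.5, 0.25, 0.0, 0.125]         # non-negative coefficients a_j

rng = random.Random(0)                        # fixed seed for repeatability
points = [rng.uniform(0.0, 50.0) for _ in range(8)]
forms = [quadratic_form(points, [rng.uniform(-1.0, 1.0) for _ in points],
                        freqs, coeffs)
         for _ in range(20)]
variance = fourier_cov(0.0, freqs, coeffs)    # c(0) = sum of the a_j
```

Each cosine term contributes $a_j \left[ (\sum_i w_i \cos t_j x_i)^2 + (\sum_i w_i \sin t_j x_i)^2 \right]$ to the quadratic form, which is why $a_j \ge 0$ is sufficient. A single negative coefficient can make the form negative for some choice of weights.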
The condition that the constructed model $c(h_i), i = 1, \ldots, n$ be non-negative definite entails that the coefficients $a_j \ge 0, j = 1, \ldots, m$. These coefficients $a_j, j = 1, \ldots, m$, called Fourier series, are determined through the formulation of an optimization problem, which consists of minimizing the squared difference between the fitted values $c(h_i)$ and the experimental values $\hat{c}(h_i), i = 1, \ldots, n$.

Based on the above paradigm, Cherry (1994), Hall, Fisher and Hoffman (1991), and Salim Ur Rehman (1995) have proposed to fit the Fourier series of coefficients $a_j, j = 1, \ldots, m$ through optimization and then reconstruct the covariance values from expression (2.18). They consider this procedure as non-parametric modeling since no analytical models defined from a few parameters, such as the spherical model, have been used.

2.3.1 Auto-covariance modeling

Isotropic modeling

In 1D, the integral (2.17) can be written as:

$$f(h) = \int_{-\infty}^{+\infty} e^{i t h}\, dF(t) = \int_{-\infty}^{+\infty} \left( \cos(th) + i \sin(th) \right) dF(t)$$

Since $f(h)$ is required to be real and symmetric and the measure $F(t)$ is also real and symmetric, the above integral reduces to the cosine transform, leading to the expression:

$$f(h) = \int_{0}^{+\infty} \cos(th)\, dF(t)$$

where $F(t)$ is a bounded non-decreasing real function.

In 2D, the integral (2.17) is written as:

$$f(h_x, h_y) = \int\!\!\int e^{i (t_x h_x + t_y h_y)}\, dF(t_x, t_y)$$

For the isotropic case, $h = |\mathbf{h}| = \sqrt{h_x^2 + h_y^2}$ and $t = |\mathbf{t}| = \sqrt{t_x^2 + t_y^2}$. Hence $f(h_x, h_y) = f(h)$, and the Fourier transform over a 2D space is reduced to the Hankel transform of a single variable $t$ (Sneddon, 1951), which is:

$$f(h) = \int_{0}^{+\infty} J_0(th)\, dF(t) \qquad (2.19)$$

where $J_0(\cdot)$ is the Bessel function of the first kind and order zero.

Similarly, in the 3D and isotropic case, the integral

$$f(h_x, h_y, h_z) = \int\!\!\int\!\!\int e^{i (t_x h_x + t_y h_y + t_z h_z)}\, dF(t_x, t_y, t_z)$$

can be reduced to:

$$f(h) = \int_{0}^{+\infty} \frac{\sin(ht)}{ht}\, dF(t)$$

In summary, for an isotropic covariance, the multi-dimensional integral (2.17) can be reduced to the integral of a single variable as:

$$f(h) = \int_{0}^{+\infty} S(th)\, dF(t) \qquad (2.20)$$

where:

$$S(th) = \begin{cases} \cos(th), & \text{for } d = 1 \\ J_0(th), & \text{for } d = 2 \\ \dfrac{\sin(th)}{th}, & \text{for } d = 3 \end{cases} \qquad (2.21)$$
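The three isotropic kernels can be evaluated with the standard library alone. The sketch below is an illustration, not the thesis code: it takes the $d = 2$ kernel to be $J_0(th)$, and it assumes that computing $J_0$ from the integral representation $J_0(x) = \frac{1}{\pi} \int_0^{\pi} \cos(x \sin\theta)\, d\theta$ with a midpoint rule is accurate enough for moderate arguments. The function names `bessel_j0` and `kernel` are hypothetical.

```python
import math


def bessel_j0(x, n=2000):
    """J0(x) = (1/pi) * integral_0^pi cos(x sin(theta)) d(theta), midpoint rule."""
    step = math.pi / n
    return sum(math.cos(x * math.sin((k + 0.5) * step)) for k in range(n)) / n


def kernel(th, d):
    """Isotropic Fourier kernel S(th) for dimension d = 1, 2 or 3."""
    if d == 1:
        return math.cos(th)
    if d == 2:
        return bessel_j0(th)
    if d == 3:
        return 1.0 if th == 0.0 else math.sin(th) / th
    raise ValueError("d must be 1, 2 or 3")


# All three kernels equal 1 at the origin, so c(0) is the sum of the coefficients.
at_origin = [kernel(0.0, d) for d in (1, 2, 3)]
```

In practice a library routine for $J_0$ would normally be preferred to the quadrature used here; the integral form is chosen only to keep the example self-contained.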
In (2.21), $J_0(\cdot)$ is the Bessel function of the first kind and order zero.

For discrete covariance values, the integral (2.20) is rewritten as:

$$c(h_i) = \sum_{j=1}^{m} S(t_j h_i)\, a_j, \quad i = 1, \ldots, n \qquad (2.22)$$

A weighted least-squares criterion can be used to fit the experimental discrete covariance values $\hat{c}(h_i)$ by the discrete series $c(h_i)$. Consider $F(t)$ as a series of step functions with a finite number of positive jumps $a_1, \ldots, a_m$ at points $t_1, \ldots, t_m$. Then, the objective function to be minimized is written as:

$$Q(Y) = \sum_{i=1}^{n} w_i \left( \hat{c}(h_i) - \left[ \sum_{j=1}^{m} S(t_j h_i)\, a_j \right] \right)^2 \qquad (2.23)$$

subject to the constraint that $a_j \ge 0, j = 1, \ldots, m$, where

$w_i$: weight assigned to the $i^{th}$ covariance value,
$n$: number of experimental covariance values,
$\hat{c}(h_i)$: experimental covariance at lag $h_i$,
$t_j$: frequency in radians per unit lag,
$m$: number of Fourier series in the frequency domain,
$a_j$: $j^{th}$ element of Fourier series.

Here $Y = (a_1, \ldots, a_m)^T$ is the vector of unknowns and $S(\cdot)$ is the isotropic Fourier kernel given in (2.21). The minimization problem is solved through quadratic programming.

Anisotropic modeling

For an anisotropic covariance in $R^2$, the Fourier sum (2.22) is written:

$$c(\mathbf{h}_i) = \sum_{j=-m_1}^{m_1} \sum_{k=-m_2}^{m_2} \cos(t_j h_{ix} + t_k h_{iy})\, a_{jk} \qquad (2.24)$$

where $a_{jk}, j = -m_1, \ldots, m_1, k = -m_2, \ldots, m_2$, are the unknown coefficients. The requirement of positive definiteness of the covariance values $c(\mathbf{h}_i)$ calls for the conditions:

$$a_{jk} \ge 0, \quad j = -m_1, \ldots, m_1;\ k = -m_2, \ldots, m_2$$

The optimization problem is:

$$\min Q(Y) = \sum_{i=1}^{n} w_i \left( \hat{c}(\mathbf{h}_i) - \sum_{j=-m_1}^{m_1} \sum_{k=-m_2}^{m_2} \cos(t_j h_{ix} + t_k h_{iy})\, a_{jk} \right)^2$$

subject to the above constraints.

The difference between the isotropic model (2.22) and the anisotropic model (2.24) lies in the large increase in the number of unknowns in the optimization problem. While the isotropic model had $m$ unknowns, the anisotropic model has $(2m_1 + 1) \times (2m_2 + 1)$ unknowns. Similarly, in $R^3$, there would be $(2m_1 + 1) \times (2m_2 + 1) \times (2m_3 + 1)$ unknowns to be solved using the same optimization procedure.
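The growth in the number of unknowns, and the evaluation of the anisotropic sum (2.24), can be sketched as follows. The frequency step `dt`, the coefficient table `a` and the function names are hypothetical choices made for illustration; they are not values or code from this thesis.

```python
import math


def n_unknowns(m_per_axis):
    """Number of anisotropic coefficients: product over axes of (2 m + 1)."""
    total = 1
    for m in m_per_axis:
        total *= 2 * m + 1
    return total


def aniso_cov(hx, hy, tx, ty, a):
    """c(h) = sum_j sum_k a[j][k] cos(tx[j] hx + ty[k] hy), as in (2.24)."""
    return sum(a[j][k] * math.cos(tx[j] * hx + ty[k] * hy)
               for j in range(len(tx)) for k in range(len(ty)))


m1 = m2 = 2
dt = 0.2                                            # hypothetical frequency step
tx = [dt * j for j in range(-m1, m1 + 1)]
ty = [dt * k for k in range(-m2, m2 + 1)]
a = [[1.0 / (1 + abs(j) + 2 * abs(k)) for k in range(-m2, m2 + 1)]
     for j in range(-m1, m1 + 1)]                   # arbitrary non-negative a_jk

count_2d = n_unknowns([m1, m2])                     # (2*2 + 1)^2 coefficients
count_3d = n_unknowns([10, 10, 10])                 # 21^3 coefficients
c_origin = aniso_cov(0.0, 0.0, tx, ty, a)           # equals the sum of all a_jk
```

With only 10 frequencies per half-axis, the 3D case already involves 9261 constrained unknowns, against 10 for an isotropic model with the same frequency resolution.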
The number of unknowns in the optimization problem associated with an anisotropic variogram model is thus too large to be practical. Another approach (Rehman, 1995) calls for performing a prior coordinate transform $E$ on the lag distance $\mathbf{h}$ so that the covariance model is isotropic with regard to the transformed lag $\mathbf{h}'$. Then, the multi-dimensional Fourier integrand (2.17) can be simplified into a Bessel or sinc function of one variable as in (2.21). After that, the same optimization formulation can be carried out as in (2.23). This prior coordinate transform can be expressed as a transform matrix similar to matrix $A$ in (2.10). This matrix is defined by the anisotropy characteristics such as principal directions, major ranges and anisotropy ratios, hence the name "semi-parametric modeling", as compared with the non-parametric modeling of the isotropic case, where no anisotropy parameters are required.

Semi-parametric anisotropic modeling

Consider the following example in $R^2$: a coordinate transform matrix $E$, equivalent to $A$ in (2.10), is applied to each lag $\mathbf{h}$ so that the complex Fourier transform (2.17) can be reduced to a Bessel function of type (2.19):

$$f(\mathbf{h}) = \int_{0}^{+\infty} J_0\!\left( t\, (\mathbf{h}^T E\, \mathbf{h})^{1/2} \right) dF(t)$$

where $h = |\mathbf{h}|$. The $2 \times 2$ transform matrix $E$ is usually unknown, and should be positive definite. The LU decomposition of $E$ is written as:

$$E = L L^T = \begin{bmatrix} \alpha_1 & 0 \\ \alpha_2 & \alpha_3 \end{bmatrix} \begin{bmatrix} \alpha_1 & \alpha_2 \\ 0 & \alpha_3 \end{bmatrix}$$

where $\alpha = (\alpha_1, \alpha_2, \alpha_3)^T$ is the vector of unknowns modeling the anisotropy. The discretized covariance values can be calculated as:

$$c(\mathbf{h}_i) = \sum_{j=1}^{m} J_0\!\left( t_j\, (\mathbf{h}_i^T L L^T \mathbf{h}_i)^{1/2} \right) a_j$$

The optimization problem is then formulated as:

$$\min Q(Y, \alpha) = \sum_{i=1}^{n} w_i \left( \hat{c}(\mathbf{h}_i) - \left[ \sum_{j=1}^{m} J_0\!\left( t_j\, (\mathbf{h}_i^T L L^T \mathbf{h}_i)^{1/2} \right) a_j \right] \right)^2$$

where the vectors $Y = (a_1, \ldots, a_m)^T$ and $\alpha = (\alpha_1, \alpha_2, \alpha_3)^T$ are to be solved subject to the constraints $a_j \ge 0$, $j$
