pp. 335-49
Printed in Great Britain
The approximate distribution of partial serial correlation coefficients
calculated from residuals from regression on Fourier series
BY J. DURBIN
Department of Statistics, London School of Economics and Political Science
SUMMARY
Approximations are found to the joint distributions of noncircular and circular partial serial correlation coefficients calculated from residuals from regression on Fourier series. Results for coefficients calculated from deviations from the true mean and from the sample mean are obtained as special cases. It is shown that when the observations are independent the partial coefficients are approximately independently distributed in beta distributions that are the same for all odd-order coefficients and the same for all even-order coefficients. The approximations are of third-order accuracy in the sense that the error is of order $n^{-3/2}$. They were obtained by the technique developed in another paper (Durbin, 1980).
1. INTRODUCTION
Daniels (1956) obtained the approximate distribution of the successive circular serial correlation coefficients and partial serial correlation coefficients, with and without mean correction, for the case of normally distributed observations generated by a circular autoregression. He also obtained an approximation to the distribution of the lag one noncircular statistic with and without mean correction. The approximations were obtained by means of the saddlepoint approximation method and are of third-order accuracy in the sense that the error committed is of order $n^{-3/2}$.
In this paper this work is extended to include noncircular and circular statistics calculated from residuals from regression on Fourier series. It is hoped that the results will serve as the basis for developing tests of serial correlation of successively higher order calculated from the residuals from least squares regression on slowly changing regressors. The approximations are derived by a technique of Durbin (1980) for deriving approximations for the densities of sufficient estimators which for this problem is technically simpler than the saddlepoint method.
First we consider the appropriate choice of definition of the lag one noncircular coefficient. For a set of values $z_1, \ldots, z_n$ a number of alternatives are open to us including
$$y_t = \sum_{s=1}^{q} \beta_s \cos\{\tfrac12 (2t-1)\pi(s-1)/n\} + u_t \quad (t = 1, \ldots, n), \qquad (3)$$
It is found that the distribution of the partial coefficients is the same as that of the coefficients derived from $r_1$ provided that $q$ and $n$ are replaced by $q + 1$ and $n + 1$ respectively.
In § 4 an analogous treatment is given to the circular serial correlation coefficients when these are calculated from the residuals from the Fourier sine and cosine regression
2. APPROXIMATE DISTRIBUTION OF NONCIRCULAR COEFFICIENTS DERIVED FROM $r_1$
In this section we construct a class of noncircular serial correlation coefficients based on $r_1$ and obtain an approximation to the joint distribution of the first $m$ of them. Let us write
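$$r_1 = \frac{z'A_1z}{z'z}, \qquad A_1 = \begin{bmatrix} \tfrac12 & \tfrac12 & 0 & \cdots & 0 & 0 \\ \tfrac12 & 0 & \tfrac12 & \cdots & 0 & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & \tfrac12 \\ 0 & 0 & 0 & \cdots & \tfrac12 & \tfrac12 \end{bmatrix},$$
where $A_1$ has eigenvalues $\cos\{\pi(s-1)/n\}$ and eigenvectors proportional to $[\cos\{\pi(s-1)/(2n)\}, \cos\{3\pi(s-1)/(2n)\}, \ldots, \cos\{(2n-1)\pi(s-1)/(2n)\}]'$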
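for $s = 1, \ldots, n$. We define the lag $j$ coefficient as
$$c_3 = z_1z_3 + \tfrac12 z_2^2 + z_1z_4 + z_2z_5 + \cdots + z_{n-3}z_n + z_{n-2}z_n + \tfrac12 z_{n-1}^2,$$
and in general
$$c_{2k} = z_1z_{2k} + z_2z_{2k-1} + \cdots + z_kz_{k+1} + z_1z_{2k+1} + z_2z_{2k+2} + \cdots$$
$$R_m = \begin{bmatrix} 1 & r_1 & r_2 & \cdots & r_{m-1} \\ r_1 & 1 & r_1 & \cdots & r_{m-2} \\ \vdots & & & & \vdots \\ r_{m-1} & r_{m-2} & r_{m-3} & \cdots & 1 \end{bmatrix} \qquad (7)$$
is nonnegative-definite.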
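As an illustration of the quadratic forms above, the following minimal sketch builds a matrix $A_s$ and checks numerically that the Fourier cosine vectors are its eigenvectors with eigenvalues $\cos\{s\pi(k-1)/n\}$. The entry rule used in `a_matrix` (entries of ½ where $|t-u| = s$, $t+u-1 = s$ or $2n+1-t-u = s$) is an assumption inferred from the stated eigenvalues and regressors, and the ratio $r_s = c_s/c_0$ in `serial_coefficient` is likewise assumed; the function names are hypothetical.

```python
import math


def a_matrix(n, s):
    """Assumed form of A_s for 1 <= s < n (1-based rule, 0-based storage)."""
    A = [[0.0] * n for _ in range(n)]
    for t in range(1, n + 1):
        for u in range(1, n + 1):
            if abs(t - u) == s or t + u - 1 == s or 2 * n + 1 - t - u == s:
                A[t - 1][u - 1] = 0.5
    return A


def quad_form(A, z):
    """Quadratic form z'Az."""
    n = len(z)
    return sum(z[i] * A[i][j] * z[j] for i in range(n) for j in range(n))


def serial_coefficient(z, s):
    """Lag-s coefficient taken as c_s / c_0 with c_0 = z'z (an assumption)."""
    return quad_form(a_matrix(len(z), s), z) / sum(v * v for v in z)


def eigen_residual(n, s, k):
    """Largest absolute element of A_s v - lambda v for the k-th cosine vector."""
    A = a_matrix(n, s)
    theta = math.pi * (k - 1) / n
    v = [math.cos((2 * t - 1) * theta / 2) for t in range(1, n + 1)]
    lam = math.cos(s * theta)
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    return max(abs(Av[i] - lam * v[i]) for i in range(n))
```

For $n = 6$ and $s = 3$ the quadratic form reduces to $z_1z_3 + \tfrac12 z_2^2 + z_1z_4 + z_2z_5 + z_3z_6 + z_4z_6 + \tfrac12 z_5^2$, which agrees with the expression for $c_3$.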
We wish to investigate the distribution of the $r_j$'s when they are calculated from the least squares residuals
and where here and subsequently $K_i$ is a constant for $i = 1, 2, \ldots$. By analogy with this we take as the density of the $u_t$'s,
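$$K_2 \exp\left\{-\frac{1}{2\sigma^2}\left(\sum_{j=0}^{m}\alpha_j^2\, c_0^* + 2\sum_{j=0}^{m-1}\alpha_j\alpha_{j+1}\, c_1^* + \cdots + 2\alpha_m c_m^*\right)\right\}, \qquad (12)$$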
where $c_s^* = u'A_su$, that is $c_s^*$ is the same as $c_s$ defined by (6) with $z$ replaced by $u$ for $s = 1, \ldots, m$, and where $c_0^* = u'u$ with $u' = [u_1, \ldots, u_n]$. As in the stationary case we assume that the roots of the equation $\phi^m + \alpha_1\phi^{m-1} + \cdots + \alpha_m = 0$ have modulus less than one. One would expect (12) to give a better approximation to the distribution generated by (10) in the stationary case than does (11). This approximation has been used previously by Anderson (1948) for $m = 1$ and in G. S. Watson's N. Carolina Ph.D. thesis for $m = 2$.
We now investigate the joint distribution of $r_1, \ldots, r_m$ when the $u_t$'s have density (12). The normalizing constant is $K_2 = (2\pi\sigma^2)^{-n/2}|A|^{1/2}$, where $A$ is the matrix of the quadratic form in (12). Since the eigenvalues of $A_s$ are $\cos\{s\pi(k-1)/n\}$, we have
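$$|A| = \prod_{k=1}^{n}\left[\sum_{j=0}^{m}\alpha_j^2 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\cos\{s\pi(k-1)/n\}\right] = \prod_{k=1}^{n}\left|\sum_{j=0}^{m}\alpha_j e^{i\pi(k-1)j/n}\right|^2.$$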
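Let $\phi_1, \ldots, \phi_m$ be the roots of the equation $\phi^m + \alpha_1\phi^{m-1} + \cdots + \alpha_m = 0$. Then
$$\sum_{j=0}^{m}\alpha_j\phi^{m-j} = \prod_{s=1}^{m}(\phi - \phi_s).$$
Consequently,
$$\sum_{j=0}^{m}\alpha_j e^{i\pi(k-1)j/n} = e^{im\pi(k-1)/n}\prod_{s=1}^{m}(e^{-i\pi(k-1)/n} - \phi_s).$$
Now
$$(1-\phi_s)^{-1}\prod_{k=1}^{n}(e^{-i\pi(k-1)/n} - \phi_s)(e^{i\pi(k-1)/n} - \phi_s) = \prod_{j=-(n-1)}^{n-1}(e^{i\pi j/n} - \phi_s),$$
so that
$$\prod_{k=1}^{n}\left|\sum_{j=0}^{m}\alpha_j e^{i\pi(k-1)j/n}\right|^2 = \prod_{s=1}^{m}(1-\phi_s)\prod_{j=-(n-1)}^{n-1}(e^{i\pi j/n} - \phi_s).$$
It follows that
$$K_2 = (2\pi\sigma^2)^{-n/2}\left(\sum_{j=0}^{m}\alpha_j\right)^{1/2}\left\{\sum_{j=0}^{m}(-1)^j\alpha_j\right\}^{-1/2}\prod_{s=1}^{m}(1-\phi_s^{2n})^{1/2}.$$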
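The closed form for the normalizing constant can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it takes real roots $\phi_s$ inside the unit circle, forms the coefficients $\alpha_j$ from them, and compares the direct product over $k$ with $(\sum\alpha_j)\{\sum(-1)^j\alpha_j\}^{-1}\prod_s(1-\phi_s^{2n})$. All function names are hypothetical.

```python
import cmath
import math


def poly_from_roots(roots):
    """Coefficients [alpha_0 = 1, alpha_1, ..., alpha_m] of prod (x - root)."""
    coeffs = [1.0]
    for root in roots:
        coeffs = [c - root * p for c, p in zip(coeffs + [0.0], [0.0] + coeffs)]
    return coeffs


def det_direct(alpha, n):
    """prod_{k=1}^n | sum_j alpha_j exp{i pi (k-1) j / n} |^2."""
    total = 1.0
    for k in range(1, n + 1):
        w = sum(a * cmath.exp(1j * math.pi * (k - 1) * j / n)
                for j, a in enumerate(alpha))
        total *= abs(w) ** 2
    return total


def det_closed(alpha, roots, n):
    """(sum alpha_j) / (sum (-1)^j alpha_j) * prod_s (1 - phi_s^(2n))."""
    plus = sum(alpha)
    minus = sum((-1) ** j * a for j, a in enumerate(alpha))
    tail = 1.0
    for root in roots:
        tail *= 1.0 - root ** (2 * n)
    return plus / minus * tail


def k2(alpha, roots, n, sigma2):
    """Normalizing constant K_2 = (2 pi sigma^2)^(-n/2) |A|^(1/2)."""
    return (2 * math.pi * sigma2) ** (-n / 2) * math.sqrt(det_closed(alpha, roots, n))
```

For example, with roots 0.5 and -0.3 the coefficients are $\alpha = (1, -0.2, -0.15)$, and the two determinant expressions agree to rounding error for any $n$ tried.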
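$$-\frac{1}{2\sigma^2}\left(\sum_{j=0}^{m}\alpha_j^2\, u'u + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\, u'A_su\right)$$
$$= -\frac{1}{2\sigma^2}\left[\sum_{j=0}^{m}\alpha_j^2\, z'z + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\, z'A_sz + \sum_{k=1}^{q}\left\{\sum_{j=0}^{m}\alpha_j^2 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\cos\{s\pi(k-1)/n\}\right\}(b_k - \beta_k)^2\right],$$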
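where $z_1, \ldots, z_n$ are given by (8). If we denote the density of $y = (y_1, \ldots, y_n)$ by $f(y; \alpha, \beta, \sigma^2)$, where $\alpha = (\alpha_1, \ldots, \alpha_m)'$ and $\beta = (\beta_1, \ldots, \beta_q)'$, it follows that $f$ can be written in the form
$$f(y; \alpha, \beta, \sigma^2) = (2\pi\sigma^2)^{-n/2}\left(\sum_{j=0}^{m}\alpha_j\right)^{1/2}\left\{\sum_{j=0}^{m}(-1)^j\alpha_j\right\}^{-1/2}\prod_{s=1}^{m}(1-\phi_s^{2n})^{1/2}$$
$$\times \exp\left[-\frac{1}{2\sigma^2}\left\{\sum_{j=0}^{m}\alpha_j^2\, c_0 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\, c_s + \sum_{k=1}^{q}v_k(b_k - \beta_k)^2\right\}\right], \qquad (13)$$
where
$$c_s = z'A_sz, \qquad v_k = \sum_{j=0}^{m}\alpha_j^2 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\cos\{s\pi(k-1)/n\}.$$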
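$$s^2 = (n-q)^{-1}\left(\sum_{j=0}^{m}a_j^2\, c_0 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}a_ja_{j+s}\, c_s\right)$$
of $\alpha$, $\beta$ and $\sigma^2$, where $a_1, \ldots, a_m$ are the solution of the Yule-Walker equations
$$a_1 + a_2r_1 + \cdots + a_mr_{m-1} + r_1 = 0,$$
$$a_1r_1 + a_2 + \cdots + a_mr_{m-2} + r_2 = 0,$$
$$\cdots$$
$$a_1r_{m-1} + a_2r_{m-2} + \cdots + a_m + r_m = 0.$$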
Using (13) and (19) of Durbin (1980), we have for the density $g(a, b, s^2; \alpha, \beta, \sigma^2)$ of $a, b, s^2$
_ ? . O | (2S2)-1
where
n 2 m
-1
2 2 = S2(
S= (n-q)-'Ez a,rj
This approximation is valid by Case 2 and Theorem 1 of Durbin (1980) applied to the distribution of $c_0, \ldots, c_m$ and $b_1, \ldots, b_q$.
We have already assumed that the roots $\phi_1, \ldots, \phi_m$ of the equation $\phi^m + \alpha_1\phi^{m-1} + \cdots + \alpha_m = 0$ have modulus less than one. Let us also limit the domain of $a$ to the region for which the roots $\hat\phi_1, \ldots, \hat\phi_m$ of the equation $\phi^m + a_1\phi^{m-1} + \cdots + a_m = 0$ have modulus less than one; in
fact the probability of the sample point falling outside this domain equals zero anyway. The error in replacing the factor $\prod_s(1-\phi_s^{2n})/(1-\hat\phi_s^{2n})$ by unity is therefore exponentially small and can be neglected. Integrating out $b_1, \ldots, b_q$, we obtain for the density of $a, s^2$
/m \-im /m q-i m \
E a, E
2 2 i( n \ (M+1) tz aj
0 r> Jr(0 (- 1)Jayj
1) a
m
q-i
g(a,S2; o, a2) + ( -
kT7r)
=
Jjz(1)j Oli
}{
V2(0r2)j(n-q) Oli
I~~~~~~~~~~~~~~
IRm I(n-q-1)
1,(S2) 1 /m
=0 j=0~~~3= m m-s
X e_i(n_a)
m exl - a2 oz-?eco+ 2SE ozjofy
j+s {1 + 0(n-1)}.
x a} {1 +
11)
I(nRm
QZj)(, r) O(n-')}
(nym (Zj aj r()"(n-q-m)(E. aj)q-1{Z(-( 1)Ja1}1IRm1 {1 +0(n-')} (15)
2(EJ 0Zj)qj {Zj( _ )j 0Zj}j Qj(n-q)(ozxr)
where the summations are over $j = 0, \ldots, m$, and where we express $r_1, \ldots, r_m$ in terms of $a_1, \ldots, a_m$, with
$$Q(\alpha, r) = \sum_{j=0}^{m}\alpha_j^2 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\, r_s.$$
If we make the transformation from $a_1, \ldots, a_m$ to $r_1, \ldots, r_m$, the Jacobian for which is given by Daniels (1956), it is possible to write down the density of $r_1, \ldots, r_m$, but this is too complicated for practical use. However, a considerable simplification occurs when the distribution is expressed in terms of the partial serial correlations $r_{j\cdot}$. For the present purpose these may be defined as follows. Let $a_{s1}, \ldots, a_{ss}$ be the solution of the Yule-Walker equations for $s = 1, \ldots, m$. Then $r_{s\cdot} = -a_{ss}$. Obviously, $r_{1\cdot} = r_1$. Since $R_m$ defined by (7) is nonnegative-definite, $-1 \le r_{j\cdot} \le 1$ for $j = 1, \ldots, m$.
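The definition of the partial serial correlations can be turned into a short computation. The sketch below is a minimal illustration, not the author's procedure: it solves the Yule-Walker equations of successive orders by the standard Levinson-Durbin recursion $a_{sr} = a_{s-1,r} + a_{ss}a_{s-1,s-r}$, using the sign convention of the equations above (so that $r_{s\cdot} = -a_{ss}$). The function name is hypothetical.

```python
def partial_serial_correlations(r):
    """Map [r_1, ..., r_m] to the partial coefficients [r_1., ..., r_m.].

    Uses the Levinson-Durbin recursion with the sign convention
    a_1 + a_2 r_1 + ... + r_1 = 0, so that r_s. = -a_ss.
    """
    a_prev = []   # a_{s-1,1}, ..., a_{s-1,s-1}
    partial = []
    for s in range(1, len(r) + 1):
        # Numerator r_s + sum_k a_{s-1,k} r_{s-k}
        num = r[s - 1] + sum(a_prev[k - 1] * r[s - k - 1] for k in range(1, s))
        # Denominator 1 + sum_k a_{s-1,k} r_k
        den = 1.0 + sum(a_prev[k - 1] * r[k - 1] for k in range(1, s))
        a_ss = -num / den
        # Update the lower-order coefficients, then append a_ss
        a_new = [a_prev[k - 1] + a_ss * a_prev[s - k - 1] for k in range(1, s)]
        a_new.append(a_ss)
        partial.append(-a_ss)
        a_prev = a_new
    return partial
```

As a usage check, serial correlations of the first-order autoregressive form $r_j = \rho^j$ give a first partial coefficient equal to $\rho$ and higher-order partial coefficients equal to zero.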
Now $a_{sr} = a_{s-1,r} + a_{ss}a_{s-1,s-r}$ $(r = 1, \ldots, s-1)$ (Daniels, 1956, equation (10.3)). Consequently
$$(1 - a_{s-1,1} + a_{s-1,2} - \cdots + a_{s-1,s-1})(1 - a_{ss}) = 1 - a_{s1} + a_{s2} - \cdots - a_{ss} \quad (s \text{ odd}),$$
h(r, a) = (
n I)Im odd(1_rj )q-1(1-_rj2().n-)1 Ileven(l-r. )q+1 (1r32).(n-q)) 1
\2)} (Ejc(X)q-i {j(
-
1)i ac}* Q (n-q) (Q r)
x{1+0(n-L)} (-lrj. Al< =l..m q= 0X ..) (16)
where the summations are over $j = 0, \ldots, m$.
Daniels (1956) pointed out that when the normalizing constant in a one-term saddlepoint approximation is adjusted to make the total probability equal unity, the error of the approximation is effectively reduced from $O(n^{-1})$ to $O(n^{-3/2})$; Durbin (1980) showed that this also applies to the essentially equivalent approximations obtained by the present method in the sense that the error is $O(n^{-3/2})$ for a given standardized value of $r_{j\cdot}$ though not uniformly for all standardized $r_{j\cdot}$. To renormalize (16) consider first the null case in which $\alpha_1 = \cdots = \alpha_m = 0$. Then the $r_{j\cdot}$ are independent to the order of approximation under consideration with true normalizing constants $2^{-n}\{B(\tfrac12 n - \tfrac12 q + 1, \tfrac12 n + \tfrac12 q)\}^{-1}$ ($j$ odd) and $2^{-n}\{B(\tfrac12 n - \tfrac12 q, \tfrac12 n + \tfrac12 q + 1)\}^{-1}$ ($j$ even) in place of the asymptotic normalizing constants $\{n/(2\pi)\}^{1/2}$. Using the method of Madow (1945) it is easy to show that
the summations being over $j = 0, \ldots, m$. Using (16) for the case $\alpha = 0$ and renormalizing we therefore obtain as our final approximation to the density of $r_{1\cdot}, \ldots, r_{m\cdot}$ for general $\alpha$,
Because the dominant terms of (18) and (19) are essentially beta distributions with half-integer indices it is easy to obtain approximate significance points of $r$ from the significance points of the variance-ratio distribution. Putting $x = \tfrac12(1 + r)$ in (18) and (19) we obtain for the dominant terms the densities
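$$x^{\frac12(n-q)}(1-x)^{\frac12(n+q)-1} \quad (j \text{ odd}), \qquad x^{\frac12(n-q)-1}(1-x)^{\frac12(n+q)} \quad (j \text{ even}),$$
$$x^{\frac12\nu_1 - 1}(1-x)^{\frac12\nu_2 - 1}/B(\tfrac12\nu_1, \tfrac12\nu_2)$$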
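The step from the beta form to significance points can be sketched in a few lines. The example below is an illustration under explicit assumptions, not the algorithm used in the paper: it assumes that in the null case $x = \tfrac12(1+r_{j\cdot})$ is approximately Beta$(\tfrac12(n-q)+1, \tfrac12(n+q))$ for odd $j$ and Beta$(\tfrac12(n-q), \tfrac12(n+q)+1)$ for even $j$ (indices read off from the renormalized constants), it ignores the mean corrections, and it evaluates the incomplete beta integral by Simpson's rule with bisection for the inverse. The function names are hypothetical.

```python
import math


def beta_cdf(x, a, b, steps=2000):
    """Regularized incomplete beta I_x(a, b) by composite Simpson's rule.

    Intended for a, b > 1, where the integrand vanishes at both end points;
    steps must be even.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def pdf(t):
        if t <= 0.0 or t >= 1.0:
            return 0.0
        return math.exp(log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t))

    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return total * h / 3.0


def null_percentage_point(p, n, q, odd=True):
    """Approximate null p-quantile of r_j. (assumed beta indices, see lead-in)."""
    if odd:
        a, b = 0.5 * (n - q) + 1.0, 0.5 * (n + q)
    else:
        a, b = 0.5 * (n - q), 0.5 * (n + q) + 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):           # bisection on the beta scale
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return 2.0 * (0.5 * (lo + hi)) - 1.0   # back to the r scale
```

With $n = 25$ and $q = 1$ the assumed odd-order distribution is symmetric, so `null_percentage_point(0.05, 25, 1)` and `null_percentage_point(0.95, 25, 1)` are equal and opposite. In practice a library routine for the inverse incomplete beta function would replace the Simpson and bisection steps.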
=r r1+&2 r2
+ 0&1 +* + &l1 r, (22)
2s + All+ + Ajlr-
l + & +
ri_a + . **
OZ + r&3oj-1 =-r2_,
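which we may write as $Ra = -r$, whence $a = -R^{-1}r$. Now $R^{-1}$ is approximately $I - \mathcal{E} + \mathcal{E}^2$, where $I$ is the unit matrix and
$$\mathcal{E} = \begin{bmatrix} 0 & r_1 & r_2 & \cdots & r_{j-2} \\ r_1 & 0 & r_1 & \cdots & r_{j-3} \\ \vdots & & & & \vdots \end{bmatrix}.$$
Since for the null case $E(r_i^2) = n^{-1} + O(n^{-2})$ and $E(r_ir_k) = O(n^{-2})$, a first approximation to $\mathcal{E}^2$ is $(j-2)n^{-1}I$ and a second approximation to $R^{-1}$ is $\{1 + (j-2)n^{-1}\}I - \mathcal{E}$. It follows that a second approximation to the numerator of (22) is
$$r_j - \{1 + (j-2)n^{-1}\}(r_1r_{j-1} + r_2r_{j-2} + \cdots + r_{j-1}r_1).$$
The denominator of (22) is equal to $\prod_s(1 - r_{s\cdot}^2)$, where the product is over $s = 1, \ldots, j-1$. We can therefore write (22) as
$$r_{j\cdot}\prod_{s=1}^{j-1}(1 - r_{s\cdot}^2) = r_j - \{1 + (j-2)n^{-1}\}(r_1r_{j-1} + \cdots + r_{j-1}r_1)$$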
where the summations are over $s = q+1, \ldots, n$ and $x_{q+1}, \ldots, x_n$ are independent $N(0, 1)$. From well-known results (Durbin & Watson, 1950, p. 419), we therefore have
$$E(r_j) = \frac{1}{n-q}\sum_{s=q+1}^{n}\cos\{j\pi(s-1)/n\} \qquad (24)$$
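The null mean in (24) is simple to evaluate directly. The following sketch assumes that (24) is the average of the retained eigenvalues $\cos\{j\pi(s-1)/n\}$, $s = q+1, \ldots, n$, which is how the garbled display has been read here; the function name is hypothetical.

```python
import math


def null_mean(n, q, j=1):
    """Average of cos{j pi (s-1)/n} over s = q+1, ..., n (assumed form of (24))."""
    return sum(math.cos(j * math.pi * (s - 1) / n)
               for s in range(q + 1, n + 1)) / (n - q)
```

For $j = 1$ this gives $1/n$ when $q = 0$ and exactly zero when $q = 1$, that is, after fitting the mean only.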
and defining $r_{1\cdot}, \ldots, r_{m\cdot}$ in terms of $z_1, \ldots, z_n$ as before, we find for the density $h^*(r_\cdot, \alpha)$ of the partial coefficients $r_{1\cdot}, \ldots, r_{m\cdot}$
3. APPROXIMATE DISTRIBUTION OF NONCIRCULAR COEFFICIENTS DERIVED FROM $r_1'$
We now consider a class of noncircular statistics based on $r_1'$ and obtain the approximate distribution of the first $m$ of them. Since the derivation parallels the previous case very closely we shall only indicate points of difference.
Write $r_1'$ in the matrix form $r_1' = (z'B_1z)/(z'z)$, where $B_1$ is a matrix all of whose elements are zero, except for those adjacent to the principal diagonal which are $\tfrac12$. The eigenvalues and eigenvectors of $B_1$ are easily verified to be
$$\cos\{\pi s/(n+1)\}, \qquad l_s^* \propto [\sin\{\pi s/(n+1)\}, \sin\{2\pi s/(n+1)\}, \ldots, \sin\{n\pi s/(n+1)\}]' \quad (s = 1, \ldots, n),$$
where the $B_j$'s are required to have the same eigenvectors as $B_1$. The appropriate $c_j'$'s are given by Anderson (1971, p. 291) as
and in general
$$c'_{2k} = -z_1z_{2k-1} - z_2z_{2k-2} - \cdots - \tfrac12 z_k^2 + z_1z_{2k+1} + z_2z_{2k+2} + \cdots,$$
$$c'_{2k+1} = -z_1z_{2k} - z_2z_{2k-1} - \cdots - z_kz_{k+1} + z_1z_{2k+2} + z_2z_{2k+3} + \cdots,$$
where the terms at the ends of these expressions are deduced from those at the beginning by symmetry. The eigenvalues of $B_j$ are given by Anderson as $\cos\{\pi js/(n+1)\}$ for $s = 1, \ldots, n$; $j = 1, 2, \ldots$
We now investigate the distribution of the $r_j'$'s when they are calculated from the least squares residuals
where the summations are over $s = q+1, \ldots, n$, and where in the nonnull case $x_{q+1}, \ldots, x_n$ are independent normal variables with
$$\operatorname{var}(x_s) = \sum_{i=0}^{m}\alpha_i^2 + 2\sum_{k=1}^{m}\sum_{i=0}^{m-k}\alpha_i\alpha_{k+i}\cos\{k\pi(s-1)/n\}.$$
Similarly, when $z$ defined by (28) is referred to the eigenvectors $l_s^*$ as axes, we have
$$r_j' = \frac{\sum_s \cos\{j\pi s/(n+1)\}\, w_s^2}{\sum_s w_s^2}, \qquad (30)$$
where the summations are over $s = q+1, \ldots, n$, and $w_{q+1}, \ldots, w_n$ are independent normal variables with
$$\operatorname{var}(w_s) = \sum_{i=0}^{m}\alpha_i^2 + 2\sum_{k=1}^{m}\sum_{i=0}^{m-k}\alpha_i\alpha_{i+k}\cos\{k\pi s/(n+1)\}.$$
Comparing (29) and (30), we infer immediately that $r_j'$ has the same distribution as $r_j$ provided that $q$ and $n$ are replaced by $q + 1$ and $n + 1$. This intriguing fact was pointed out for the case $m = 1$, $q = 0$, $\alpha_1 = 0$ by Anderson (1948). It follows that if $r'_{1\cdot}, \ldots, r'_{m\cdot}$ are the partial serial correlation coefficients computed from $r_1', \ldots, r_m'$, their density is given by (17) with $n$, $q$ replaced by $n + 1$, $q + 1$.
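The replacement of $(n, q)$ by $(n+1, q+1)$ can be seen directly from the weights that multiply the squared normal variables. The sketch below is an illustration only: it assumes that $r_j$ has weights $\cos\{j\pi(s-1)/n\}$ and $r_j'$ has weights $\cos\{j\pi s/(n+1)\}$, both over $s = q+1, \ldots, n$, and it also checks the stated eigenvectors of $B_1$. The function names are hypothetical.

```python
import math


def weights_r(n, q, j):
    """Weights cos{j pi (s-1)/n}, s = q+1..n, assumed for r_j."""
    return [math.cos(j * math.pi * (s - 1) / n) for s in range(q + 1, n + 1)]


def weights_r_prime(n, q, j):
    """Weights cos{j pi s/(n+1)}, s = q+1..n, as in (30) for r_j'."""
    return [math.cos(j * math.pi * s / (n + 1)) for s in range(q + 1, n + 1)]


def b1_eigen_residual(n, s):
    """Largest absolute element of B_1 v - lambda v for the s-th sine vector."""
    v = [math.sin(t * math.pi * s / (n + 1)) for t in range(1, n + 1)]
    lam = math.cos(math.pi * s / (n + 1))
    worst = 0.0
    for t in range(n):
        left = v[t - 1] if t > 0 else 0.0        # element below the diagonal
        right = v[t + 1] if t < n - 1 else 0.0   # element above the diagonal
        worst = max(worst, abs(0.5 * (left + right) - lam * v[t]))
    return worst
```

Here `weights_r_prime(n, q, j)` coincides term by term with `weights_r(n + 1, q + 1, j)`, which is the numerical counterpart of comparing (29) with (30).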
whence on transforming to $y_1, \ldots, y_n$ and using (31) the density of $y_1, \ldots, y_n$ can be written in the form
where
m m-s
v=
s=O j=O
voj+s cos (27Tslk/n).
E oil
n-q s=4(q+1)
Thus if greater accuracy is required in the null case than is provided by the basic approximation (32) an amount
Table 1. Comparison of observed and approximate distributions of the noncircular statistic $r_{j\cdot}$ and the circular statistic

Theoretical   Observed percentage for             Observed percentage for
percentage    noncircular statistic               circular statistic
in cell       j=1    j=2    j=3    j=4    j=5     j=1    j=2    j=3    j=4    j=5

(a) n = 25, q = 1
   5          5.12   4.96   4.97   4.56   4.25    4.93   7.16   4.28   6.65   4.86
   5          5.11   5.18   4.97   4.74   5.20    5.30   6.21   4.99   6.33   5.30
  20         20.73  19.95  19.90  18.55  20.10   19.26  22.17  19.81  22.72  21.00
  20         20.01  19.52  20.11  20.47  20.11   19.91  21.44  20.71  20.65  20.45
  20         19.13  20.84  19.72  20.66  20.70   20.44  19.17  20.56  18.80  20.24
  20         20.10  20.17  20.48  20.58  19.74   20.51  16.77  20.09  17.40  19.59
   5          4.91   4.86   4.67   5.16   5.10    4.95   3.75   4.73   3.87   4.34
   5          4.89   4.52   5.18   5.28   4.80    4.70   3.33   4.83   3.49   4.22

(b) n = 50, q = 3
   5          5.04   4.75   4.78   5.36   5.73    4.88   6.82   5.26   6.81   6.06
   5          4.84   5.26   5.13   5.30   5.58    5.23   5.91   5.12   5.99   6.23
  20         20.19  19.47  20.88  20.39  21.57   19.00  22.47  21.13  22.47  22.08
  20         20.30  20.03  20.93  20.17  21.11   20.11  20.13  20.49  21.42  21.61
  20         20.72  20.76  20.23  19.66  19.34   20.43  19.50  19.99  18.95  19.07
  20         19.34  19.82  19.37  19.60  18.37   20.19  16.99  19.13  17.21  18.01
   5          4.90   4.86   4.50   4.51   4.25    5.33   4.19   4.68   3.81   3.69
   5          4.67   5.05   4.18   5.01   4.02    4.83   3.99   4.20   3.34   3.25

(c) n = 100, q = 5
   5          5.03   5.15   5.66   5.45   5.84    4.97   6.30   5.33   6.12   6.32
   5          5.26   4.78   5.32   5.80   5.50    4.86   5.68   5.12   5.83   5.85
  20         19.77  19.43  21.35  20.05  21.64   18.94  21.61  21.27  23.10  22.91
  20         19.72  19.56  20.64  20.42  20.29   20.50  20.17  20.74  21.41  20.22
  20         20.34  20.58  19.07  19.81  19.45   19.91  19.15  19.89  18.80  19.27
  20         20.03  20.83  19.14  19.33  18.98   20.49  18.60  18.91  17.48  17.89
   5          5.04   4.67   4.41   4.97   4.50    5.19   4.46   4.60   3.66   4.17
   5          4.81   5.00   4.41   4.17   3.80    5.14   4.03   4.14   3.60   3.37
5. MONTE CARLO EXPERIMENT
The accuracy of the approximations for a random series was tested by means of a Monte Carlo experiment. Random normal deviates were calculated by the Marsaglia-Bray method as described by Neave (1973) and the noncircular coefficients $r_{j\cdot}$ and the circular coefficients were computed for each of $n = 25, 50, 100$, $j = 1, 2, 3, 4, 5$ and $q = 1, 3, 5$ from residuals (8) and (31) respectively. The second noncircular coefficient $r'_{j\cdot}$ was not included since its distribution is the same as that of $r_{j\cdot}$ with $n$, $q$ replaced by $n + 1$, $q + 1$. Percentage points of each approximating beta distribution were obtained by means of the algorithm given by Majumder & Bhattacharjee (1973b). The percentage of observations falling into each of eight groups along the distribution was then compared with the theoretical percentages of 5, 5, 20, 20, 20, 20, 5, 5. The mean corrections (26) and (35) were found to be essential to achieve satisfactory accuracy for $j, q \ge 1$ and were therefore employed. Each experiment was repeated 10,000 times. The performance of the approximations was found to be unsatisfactory in the tails for $n = 25$, $q = 3, 5$ and $n = 50$, $q = 5$, except for $j = 1, 2$. The results for $n = 25$, $q = 1$, $n = 50$, $q = 3$ and $n = 100$, $q = 5$ are given in Table 1. The results for $n = 50, 100$ and $q = 1$ are at least as good as those for $n = 25$ and those for $n = 100$, $q = 3$ are at least as good as those for $n = 50$ so these results are not included. In interpreting these tables, note that the standard error of a 5% entry is 0.22 and the standard error of a 20% entry is 0.40.
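The quoted standard errors follow from the binomial variance of a cell proportion over 10,000 replications. A minimal sketch of the arithmetic, with a hypothetical function name and assuming independent replications:

```python
import math


def cell_standard_error(percent, replications):
    """Standard error, in percentage points, of an observed cell percentage."""
    p = percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / replications)
```

For 10,000 replications this gives about 0.218 for a 5% cell and 0.400 for a 20% cell, in agreement with the values 0.22 and 0.40 quoted in the text.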
The results for the noncircular approximation are fairly satisfactory but the performance of the circular approximation is noticeably inferior.
REFERENCES
ANDERSON, T. W. (1948). On the theory of testing serial correlation. Skand. Aktuar. 31, 88-116.
ANDERSON, T. W. (1971). The Statistical Analysis of Time Series. New York: Wiley.
DANIELS, H. E. (1956). The approximate distribution of serial correlation coefficients. Biometrika 43, 169-85.
DURBIN, J. (1960). The fitting of time-series models. Rev. Int. Statist. Inst. 28, 233-44.
DURBIN, J. (1980). Approximations for densities of sufficient estimators. Biometrika 67, 311-33.
DURBIN, J. & WATSON, G. S. (1950, 1951). Testing for serial correlation in least squares regression I and II. Biometrika 37, 409-28; 38, 159-78.
MADOW, W. G. (1945). Note on the distribution of the serial correlation coefficient. Ann. Math. Statist. 16, 308-10.
MAJUMDER, K. L. & BHATTACHARJEE, G. P. (1973a). The incomplete beta integral. AS 63. Appl. Statist. 22, 409-11.
MAJUMDER, K. L. & BHATTACHARJEE, G. P. (1973b). The inverse of the incomplete beta function ratio. AS 64. Appl. Statist. 22, 411-3.
McGREGOR, J. R. (1960). An approximate test for serial correlation in polynomial regression. Biometrika 47, 111-9.
NEAVE, H. R. (1973). On using the Box-Muller transformation with multiplicative congruential pseudo-random numbers. Appl. Statist. 22, 92-7.
[Received December 1976. Revised January 1980]