
Biometrika (1980), 67, 2, pp. 335-49
Printed in Great Britain

The approximate distribution of partial serial correlation coefficients calculated from residuals from regression on Fourier series

BY J. DURBIN
Department of Statistics, London School of Economics and Political Science

SUMMARY
Approximations are found to the joint distributions of noncircular and circular partial serial correlation coefficients calculated from residuals from regression on Fourier series. Results for coefficients calculated from deviations from the true mean and from the sample mean are obtained as special cases. It is shown that when the observations are independent the partial coefficients are approximately independently distributed in beta distributions that are the same for all odd-order coefficients and the same for all even-order coefficients. The approximations are of third-order accuracy in the sense that the error is of order $n^{-3/2}$. They were obtained by the technique developed in another paper (Durbin, 1980).

Some key words: Asymptotic expansion; Autoregressive series; Saddlepoint approximation.

1. INTRODUCTION
Daniels (1956) obtained the approximate distribution of the successive circular serial correlation coefficients and partial serial correlation coefficients, with and without mean correction, for the case of normally distributed observations generated by a circular autoregression. He also obtained an approximation to the distribution of the lag one noncircular statistic with and without mean correction. The approximations were obtained by means of the saddlepoint approximation method and are of third-order accuracy in the sense that the error committed is of order $n^{-3/2}$.
In this paper this work is extended to include noncircular and circular statistics calculated from residuals from regression on Fourier series. It is hoped that the results will serve as the basis for developing tests of serial correlation of successively higher order calculated from the residuals from least squares regression on slowly changing regressors. The approximations are derived by a technique of Durbin (1980) for deriving approximations for the densities of sufficient estimators which for this problem is technically simpler than the saddlepoint method.
First we consider the appropriate choice of definition of the lag one noncircular coefficient.
Firstwe considertheappropriatechoiceofdefinition ofthelag onenoncircular coefficient.
For a set of values $z_1, \ldots, z_n$ a number of alternatives are open to us, including

$$ r_1 = \frac{\tfrac12 z_1^2 + z_1 z_2 + z_2 z_3 + \cdots + z_{n-1} z_n + \tfrac12 z_n^2}{z_1^2 + z_2^2 + \cdots + z_n^2}, \qquad r_1' = \frac{z_1 z_2 + z_2 z_3 + \cdots + z_{n-1} z_n}{z_1^2 + z_2^2 + \cdots + z_n^2}, \eqno(1) $$

$$ r_1^* = \frac{z_1 z_2 + z_2 z_3 + \cdots + z_{n-1} z_n}{\tfrac12 z_1^2 + z_2^2 + \cdots + z_{n-1}^2 + \tfrac12 z_n^2}. \eqno(2) $$

Of these, the three coefficients differ only in the representation of $z_1$ and $z_n$ in the numerator and denominator. Daniels (1956) obtained the saddlepoint approximation to the distribution of $r_1^*$ for the case of a first-order normal autoregression, with and without mean correction. However, treatment of this coefficient does not seem to extend easily to higher-order coefficients and in consequence we shall confine ourselves in this paper to extensions of $r_1$ and $r_1'$.
The coefficient $r_1$ bears the relation $r_1 = 1 - \tfrac12 d$ to the statistic

$$ d = \Big\{\sum_{t=2}^{n}(z_t - z_{t-1})^2\Big\}\Big/\Big(\sum_{t=1}^{n} z_t^2\Big) $$

adopted by Durbin & Watson (1950, 1951) for testing serial correlation in least squares regression.
In § 2 we develop a class of higher-order serial correlation coefficients based on $r_1$ and investigate their distribution when calculated from residuals from the Fourier cosine regression

$$ y_t = \sum_{s=1}^{q} \beta_s \cos\{\tfrac12(2t-1)\pi(s-1)/n\} + u_t \qquad (t = 1, \ldots, n), \eqno(3) $$

where the $u_t$'s are normally distributed in a distribution to be specified later and where $q$ is small compared with $n$.
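The relation $r_1 = 1 - \tfrac12 d$ between the lag one coefficient and the Durbin-Watson statistic can be checked numerically; the sketch below uses an arbitrary illustrative series and only the definitions given above.

```python
# Numerical check that r_1 = 1 - d/2, where d is the Durbin-Watson statistic.
# The series z is arbitrary illustrative data, not taken from the paper.
def r1(z):
    n = len(z)
    num = 0.5 * z[0] ** 2 + 0.5 * z[-1] ** 2
    num += sum(z[t - 1] * z[t] for t in range(1, n))
    return num / sum(x * x for x in z)

def durbin_watson(z):
    num = sum((z[t] - z[t - 1]) ** 2 for t in range(1, len(z)))
    return num / sum(x * x for x in z)

z = [0.3, -1.2, 0.7, 0.4, -0.5, 1.1]
assert abs(r1(z) - (1 - durbin_watson(z) / 2)) < 1e-12
```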
On transforming to the partial serial correlation coefficients a substantial simplification occurs. It is found that in the null case, when the $u_t$'s are independently distributed, the coefficients are distributed approximately independently in forms that differ according to whether the order is odd or even, the form of the distribution being the same for all odd-order coefficients and the same for all even-order coefficients. This is the property found by Daniels for the circular nonregression situation. The forms found for the densities are remarkably simple. They are beta distributions with half-integer indices and their percentage points can therefore be obtained very simply from those of the variance ratio distribution.
A parallel treatment is given in § 3 to a class of higher-order serial correlation coefficients based on $r_1'$ when calculated from residuals from the Fourier sine regression

$$ y_t = \sum_{s=1}^{q} \beta_s' \sin\Big(\frac{\pi t s}{n+1}\Big) + u_t \qquad (t = 1, \ldots, n). \eqno(4) $$

It is found that the distribution of the partial coefficients is the same as that of the coefficients derived from $r_1$ provided that $q$ and $n$ are replaced by $q+1$ and $n+1$ respectively.
In § 4 an analogous treatment is given to the circular serial correlation coefficients when these are calculated from the residuals from the Fourier sine and cosine regression

$$ y_t = \beta_1 + \sum_{s=1}^{\frac12(q-1)} \Big\{\beta_{2s}\cos\Big(\frac{2\pi s t}{n}\Big) + \beta_{2s+1}\sin\Big(\frac{2\pi s t}{n}\Big)\Big\} + u_t \qquad (t = 1, \ldots, n). \eqno(5) $$

Section 5 reports the results of a Monte Carlo experiment intended to test the performance of the approximations. The approximations are satisfactory for the sample sizes considered in the noncircular case but the circular approximations are noticeably inferior.

2. APPROXIMATE DISTRIBUTION OF NONCIRCULAR COEFFICIENTS DERIVED FROM $r_1$
In this section we construct a class of noncircular serial correlation coefficients based on $r_1$ and obtain an approximation to the joint distribution of the first $m$ of them. Let us write $r_1$ in the matrix form $r_1 = (z'A_1 z)/(z'z)$, where


$$ A_1 = \begin{pmatrix} \tfrac12 & \tfrac12 & 0 & \cdots & 0 & 0 \\ \tfrac12 & 0 & \tfrac12 & \cdots & 0 & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & \tfrac12 \\ 0 & 0 & 0 & \cdots & \tfrac12 & \tfrac12 \end{pmatrix}. $$

The eigenvalues and eigenvectors of $A_1$ are easily verified to be

$$ \mu_s = \cos\{\pi(s-1)/n\}, \qquad l_s' = \big[\cos\{\pi(s-1)/(2n)\},\ \cos\{3\pi(s-1)/(2n)\},\ \ldots,\ \cos\{(2n-1)\pi(s-1)/(2n)\}\big] $$
for $s = 1, \ldots, n$. We define the lag $j$ coefficient as

$$ r_j = (z'A_j z)/(z'z) = c_j/c_0 \qquad (j = 1, 2, \ldots), \eqno(6) $$

where in order to achieve mathematical tractability we require the matrices $A_2, A_3, \ldots$ to have the same eigenvectors as $A_1$. The appropriate $c_j$'s were obtained by Anderson (1971, p. 286) and are
$$ c_2 = z_1 z_2 + z_1 z_3 + z_2 z_4 + \cdots + z_{n-2} z_n + z_{n-1} z_n, $$
$$ c_3 = z_1 z_3 + \tfrac12 z_2^2 + z_1 z_4 + z_2 z_5 + \cdots + z_{n-3} z_n + \tfrac12 z_{n-1}^2 + z_{n-2} z_n, $$
$$ c_4 = z_1 z_4 + z_2 z_3 + z_1 z_5 + z_2 z_6 + \cdots + z_{n-4} z_n + z_{n-2} z_{n-1} + z_{n-3} z_n $$

and in general

$$ c_{2k} = z_1 z_{2k} + z_2 z_{2k-1} + \cdots + z_k z_{k+1} + z_1 z_{2k+1} + z_2 z_{2k+2} + \cdots, $$
$$ c_{2k+1} = z_1 z_{2k+1} + z_2 z_{2k} + \cdots + z_k z_{k+2} + \tfrac12 z_{k+1}^2 + z_1 z_{2k+2} + z_2 z_{2k+3} + \cdots, $$

where the terms at the ends of the expressions for $c_{2k}$ and $c_{2k+1}$ are deduced from those at the beginning by symmetry. These expressions have the property that all the values $z_1, \ldots, z_n$ occur twice in each if one regards $\tfrac12 z_j^2$ as signifying a single occurrence of $z_j$. The eigenvalues $\mu_{rj}$ of $A_j$ are (Anderson, 1971, p. 288)

$$ \mu_{rj} = \cos\{j\pi(r-1)/n\} \qquad (r = 1, \ldots, n;\ j = 1, 2, \ldots). $$
It is straightforward to show that with these definitions the matrix

$$ R_m = \begin{pmatrix} 1 & r_1 & r_2 & \cdots & r_{m-1} \\ r_1 & 1 & r_1 & \cdots & r_{m-2} \\ r_2 & r_1 & 1 & \cdots & r_{m-3} \\ \vdots & & & & \vdots \\ r_{m-1} & r_{m-2} & r_{m-3} & \cdots & 1 \end{pmatrix} \eqno(7) $$

is nonnegative-definite.
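Because the $A_j$ share the eigenvectors of $A_1$ and have eigenvalues $\cos\{j\pi(r-1)/n\} = T_j(\cos\{\pi(r-1)/n\})$, where $T_j$ is the Chebyshev polynomial, they satisfy the recursion $A_j = 2A_1 A_{j-1} - A_{j-2}$ with $A_0 = I$. The sketch below (the Chebyshev route is a convenience of this illustration, not a construction used in the paper) verifies that $z'A_2 z$ reproduces the explicit expression for $c_2$; $n = 6$ and the data are illustrative.

```python
# Build A_1, form A_2 = 2 A_1^2 - I via the shared-eigenvector property,
# and check z'A_2 z against Anderson's explicit expression for c_2.
n = 6
I = [[float(i == j) for j in range(n)] for i in range(n)]
A1 = [[0.0] * n for _ in range(n)]
for i in range(n - 1):
    A1[i][i + 1] = A1[i + 1][i] = 0.5
A1[0][0] = A1[n - 1][n - 1] = 0.5

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

M = matmul(A1, A1)
A2 = [[2 * M[i][j] - I[i][j] for j in range(n)] for i in range(n)]

z = [1.0, -0.4, 0.9, 0.2, -1.3, 0.6]   # arbitrary illustrative data

def quad(A):
    return sum(z[i] * A[i][j] * z[j] for i in range(n) for j in range(n))

# c_2 = z1 z2 + z1 z3 + z2 z4 + z3 z5 + z4 z6 + z5 z6 for n = 6
c2 = z[0]*z[1] + z[0]*z[2] + z[1]*z[3] + z[2]*z[4] + z[3]*z[5] + z[4]*z[5]
assert abs(quad(A2) - c2) < 1e-12
```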
We wish to investigate the distribution of the $r_j$'s when they are calculated from the least squares residuals

$$ z_t = y_t - \sum_{k=1}^{q} b_k \cos\{\tfrac12(2t-1)\pi(k-1)/n\} \qquad (t = 1, \ldots, n) \eqno(8) $$

from the regression (3), where $q$ is small relative to $n$. This model is not only of intrinsic interest but it also provides an approximation to the general regression situation when the regressors are 'slowly changing' in the sense that they consist predominantly of low frequency components.
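Since the cosine regressors in (8) are eigenvectors of the symmetric matrix $A_1$ belonging to distinct eigenvalues, they are mutually orthogonal and each $b_k$ is a simple projection. A minimal sketch, with an arbitrary illustrative series $y$:

```python
# Least squares residuals (8) via column-by-column projection; this is valid
# because the cosine regressors are mutually orthogonal.
import math

n, q = 25, 3
X = [[math.cos((2 * t - 1) * math.pi * (k - 1) / (2 * n)) for t in range(1, n + 1)]
     for k in range(1, q + 1)]
y = [math.sin(0.3 * t) + 0.1 * t for t in range(1, n + 1)]   # arbitrary series

b = [sum(Xk[t] * y[t] for t in range(n)) / sum(v * v for v in Xk) for Xk in X]
z = [y[t] - sum(b[k] * X[k][t] for k in range(q)) for t in range(n)]

# the residuals are orthogonal to every regressor
for Xk in X:
    assert abs(sum(Xk[t] * z[t] for t in range(n))) < 1e-9
```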
The motivation for considering this distribution is to help construct tests of serial correlation of successively higher order in the residuals from least squares regression. A natural formal framework for this purpose would be to test the sequence of hypotheses $H_{j-1}$: $\alpha_j = 0$ in the stationary autoregressive models

$$ u_t + \alpha_{j1} u_{t-1} + \alpha_{j2} u_{t-2} + \cdots + \alpha_{jj} u_{t-j} = \epsilon_t \qquad (t = 1, \ldots, n;\ j = 1, \ldots, m) \eqno(9) $$

for the distribution of the errors $u_t$ in model (3), where the $\epsilon_t$'s are independent $N(0, \sigma^2)$. It is well known (Durbin, 1960, p. 234) that $-\alpha_{jj}$ is the partial autocorrelation coefficient between $u_t$ and $u_{t+j}$ if we keep intermediate observations fixed. Thus it would be natural to take as the test statistics the sequence of sample estimates of these coefficients.
Ideally, we would therefore like to investigate the distributions of the sample serial correlation coefficients and partial serial correlation coefficients when the $u_t$'s are generated by model (9) with $j = m$, which for convenience we write in the form

$$ u_t + \alpha_1 u_{t-1} + \cdots + \alpha_m u_{t-m} = \epsilon_t. \eqno(10) $$

However, it turns out that this is an intractable problem. To overcome the difficulty we construct a distribution for the $u_t$'s which, while approximating closely to the distribution generated by (10), is more compatible with our definition of the $c_j$'s. To do this we proceed by analogy with the circular case. If the autoregression (10) had been circular, i.e. had satisfied the restriction $u_t = u_{t+n}$ for all $t$, the distribution of the $u_t$'s would have had the density

$$ K_1 \exp\Big[-\frac{1}{2\sigma^2}\Big\{\sum_j \alpha_j^2 c_0^\circ + 2\sum_j \alpha_j\alpha_{j+1} c_1^\circ + \cdots + 2\alpha_m c_m^\circ\Big\}\Big], \eqno(11) $$

where $\alpha_0 = 1$ and

$$ c_s^\circ = \sum_{t=1}^{n} u_t u_{t+s} \qquad (s = 0, \ldots, m), \qquad u_t = u_{t+n}, $$

and where here and subsequently $K_i$ is a constant for $i = 1, 2, \ldots$. By analogy with this we take as the density of the $u_t$'s

$$ K_2 \exp\Big[-\frac{1}{2\sigma^2}\Big\{\sum_j \alpha_j^2 c_0^* + 2\sum_j \alpha_j\alpha_{j+1} c_1^* + \cdots + 2\alpha_m c_m^*\Big\}\Big], \eqno(12) $$

where $c_s^* = u'A_s u$, that is $c_s^*$ is the same as $c_s$ defined by (6) with $z$ replaced by $u$ for $s = 1, \ldots, m$, and where $c_0^* = u'u$ with $u' = [u_1, \ldots, u_n]$. As in the stationary case we assume that the roots of the equation $\theta^m + \alpha_1\theta^{m-1} + \cdots + \alpha_m = 0$ have modulus less than one. One would expect (12) to give a better approximation to the distribution generated by (10) in the stationary case than does (11). This approximation has been used previously by Anderson (1948) for $m = 1$ and in G. S. Watson's N. Carolina Ph.D. thesis for $m = 2$.
We now investigate the joint distribution of $r_1, \ldots, r_m$ when the $u_t$'s have density (12). The normalizing constant is $K_2 = (2\pi\sigma^2)^{-n/2}|A|^{1/2}$, where $A$ is the matrix of the quadratic form in (12). Since the eigenvalues of $A_s$ are $\cos\{s\pi(k-1)/n\}$, we have

$$ |A| = \prod_{k=1}^{n}\Big[\sum_{j=0}^{m}\alpha_j^2 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\cos\{s\pi(k-1)/n\}\Big] = \prod_{k=1}^{n}\Big|\sum_{j=0}^{m}\alpha_j e^{i\pi(k-1)j/n}\Big|^2. $$
Let $\theta_1, \ldots, \theta_m$ be the roots of the equation $\theta^m + \alpha_1\theta^{m-1} + \cdots + \alpha_m = 0$. Then

$$ \sum_{j=0}^{m}\alpha_j\theta^{m-j} = \prod_{s=1}^{m}(\theta - \theta_s). $$

Consequently,

$$ \sum_{j=0}^{m}\alpha_j e^{i\pi(k-1)j/n} = e^{im\pi(k-1)/n}\prod_{s=1}^{m}\big(e^{-i\pi(k-1)/n} - \theta_s\big). $$

Now

$$ (1-\theta_s)^{-1}\prod_{k=1}^{n}\big(e^{-i\pi(k-1)/n} - \theta_s\big)\big(e^{i\pi(k-1)/n} - \theta_s\big) = \prod_{j=-(n-1)}^{n-1}\big(e^{i\pi j/n} - \theta_s\big), $$

so that

$$ \prod_{k=1}^{n}\Big|\sum_{j=0}^{m}\alpha_j e^{i\pi(k-1)j/n}\Big|^2 = \prod_{s=1}^{m}(1-\theta_s)\prod_{j=-(n-1)}^{n-1}\big(e^{i\pi j/n} - \theta_s\big). $$

Put $e^{i\pi/n} = w$. Then $w^{2n} = 1$ and $w^{-1} = w^{2n-1}$, $w^{-2} = w^{2n-2}, \ldots, w^{-(n-1)} = w^{n+1}$; also $w^n = -1$. Consequently,

$$ \prod_{j=-(n-1)}^{n-1}\big(e^{i\pi j/n} - \theta_s\big) = (1+\theta_s)^{-1}\prod_{j=0}^{2n-1}\big(w^j - \theta_s\big) = \frac{1-\theta_s^{2n}}{1+\theta_s}, $$

so that we have

$$ \prod_{k=1}^{n}\Big|\sum_{j=0}^{m}\alpha_j e^{i\pi(k-1)j/n}\Big|^2 = \prod_{s=1}^{m}\frac{(1-\theta_s)(1-\theta_s^{2n})}{1+\theta_s}. $$

Now

$$ \prod_{s=1}^{m}(1-\theta_s) = \sum_{j=0}^{m}\alpha_j, \qquad \prod_{s=1}^{m}(1+\theta_s) = (-1)^m\prod_{s=1}^{m}(-1-\theta_s) = \sum_{j=0}^{m}(-1)^j\alpha_j. $$

It follows that

$$ K_2 = (2\pi\sigma^2)^{-n/2}\Big(\sum_{j=0}^{m}\alpha_j\Big)^{1/2}\Big\{\sum_{j=0}^{m}(-1)^j\alpha_j\Big\}^{-1/2}\Big\{\prod_{s=1}^{m}(1-\theta_s^{2n})\Big\}^{1/2}. $$

The exponent of (12) is

$$ -\frac{1}{2\sigma^2}\Big\{\sum_{j=0}^{m}\alpha_j^2\, u'u + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\, u'A_s u\Big\} = -\frac{1}{2\sigma^2}\Big\{\sum_{j=0}^{m}\alpha_j^2\, z'z + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\, z'A_s z + \sum_{k=1}^{q} v_k (b_k - \beta_k)^2\Big\}, $$

where $z_1, \ldots, z_n$ are given by (8). If we denote the density of $y = (y_1, \ldots, y_n)'$ by $f(y; \alpha, \beta, \sigma^2)$, where $\alpha = (\alpha_1, \ldots, \alpha_m)'$ and $\beta = (\beta_1, \ldots, \beta_q)'$, it follows that $f$ can be written in the form

$$ f(y; \alpha, \beta, \sigma^2) = (2\pi\sigma^2)^{-n/2}\Big(\sum_j \alpha_j\Big)^{1/2}\Big\{\sum_j(-1)^j\alpha_j\Big\}^{-1/2}\prod_{s=1}^{m}(1-\theta_s^{2n})^{1/2}\exp\Big[-\frac{1}{2\sigma^2}\Big\{\sum_j \alpha_j^2 c_0 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s} c_s + \sum_{k=1}^{q} v_k(b_k - \beta_k)^2\Big\}\Big], \eqno(13) $$

where

$$ c_s = z'A_s z, \qquad v_k = \sum_{j=0}^{m}\alpha_j^2 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\cos\{s\pi(k-1)/n\}. $$

Now introduce the sufficient estimators $a = (a_1, \ldots, a_m)'$, $b = (b_1, \ldots, b_q)'$ and

$$ s^2 = (n-q)^{-1}\Big(\sum_{j=0}^{m} a_j^2 c_0 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s} a_j a_{j+s} c_s\Big) $$

of $\alpha$, $\beta$ and $\sigma^2$, where $a_1, \ldots, a_m$ are the solution of the Yule-Walker equations

$$ a_1 + a_2 r_1 + \cdots + a_m r_{m-1} + r_1 = 0, $$
$$ a_1 r_1 + a_2 + \cdots + a_m r_{m-2} + r_2 = 0, $$
$$ \cdots $$
$$ a_1 r_{m-1} + a_2 r_{m-2} + \cdots + a_m + r_m = 0. $$
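For a Yule-Walker solution the quadratic form in $s^2$ collapses, since every row of the equations annihilates the corresponding inner sum: with $a_0 = r_0 = 1$, $\sum_j a_j^2 + 2\sum_s\sum_j a_j a_{j+s} r_s = \sum_{j=0}^m a_j r_j$. A sketch checking this, with an illustrative correlation sequence and a plain elimination solver:

```python
# Solve the Yule-Walker equations and verify the identity
# sum a_j^2 + 2 sum sum a_j a_{j+s} r_s = sum_j a_j r_j  (a_0 = r_0 = 1).
def solve(A, v):
    # Gaussian elimination with partial pivoting (Gauss-Jordan form)
    m = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(m):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][m] / M[i][i] for i in range(m)]

r = [1.0, 0.5, 0.2, 0.1]            # r_0, r_1, r_2, r_3 (illustrative)
m = 3
R = [[r[abs(i - j)] for j in range(m)] for i in range(m)]
a = [1.0] + solve(R, [-r[j] for j in range(1, m + 1)])

Q = sum(a[j] ** 2 for j in range(m + 1)) + 2 * sum(
    a[j] * a[j + s] * r[s] for s in range(1, m + 1) for j in range(m + 1 - s))
assert abs(Q - sum(a[j] * r[j] for j in range(m + 1))) < 1e-12
```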
Using (13) and (19) of Durbin (1980), we have for the density $g(a, b, s^2; \alpha, \beta, \sigma^2)$ of $a, b, s^2$

$$ g(a, b, s^2; \alpha, \beta, \sigma^2) = \Big(\frac{n}{2\pi}\Big)^{\frac12(m+q+1)} |I(a, b, s^2)|^{1/2}\, \frac{f(y; \alpha, \beta, \sigma^2)}{f(y; a, b, s^2)}\, \{1 + O(n^{-1})\}, \eqno(14) $$

where $I(\alpha, \beta, \sigma^2)$ is the limit of $n^{-1}$ times the information matrix of the sample. On taking the mean of second derivatives of $\log f$ with respect to elements of $\alpha$, $\beta$ and $\sigma^2$ we find that

$$ I(\alpha, \beta, \sigma^2) = \begin{pmatrix} R_m \sigma_z^2/\sigma^2 & 0 & 0 \\ 0 & B & 0 \\ 0 & 0 & (2\sigma^4)^{-1} \end{pmatrix}, $$

where

$$ s_z^2 = (n-q)^{-1}\sum_t z_t^2, \qquad s^2 = s_z^2 \sum_{j=0}^{m} a_j r_j, $$

$R_m$ is given by (7) and $B$ is a diagonal matrix whose $k$th element is $v_k/\sigma^2$, where

$$ v_k = \sum_{j=0}^{m}\alpha_j^2 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\cos\{s\pi(k-1)/n\} = \Big(\sum_{j=0}^{m}\alpha_j\Big)^2\{1 + O(n^{-2})\} $$

for $k = 1, \ldots, q$. We therefore have

$$ |I(a, b, s^2)| = 2^{-1}|R_m|\Big(\sum_j a_j r_j\Big)^{-m}\Big(\sum_j a_j\Big)^{2q}(s^2)^{-q-2}\{1 + O(n^{-2})\}. $$

Substituting in (14), we obtain

$$ g(a, b, s^2; \alpha, \beta, \sigma^2) = \Big(\frac{n}{2\pi}\Big)^{\frac12(m+q+1)} 2^{-\frac12}|R_m|^{\frac12}\Big(\sum_j a_j r_j\Big)^{-\frac12 m}\Big(\sum_j a_j\Big)^{q}(s^2)^{-\frac12(q+2)}\, \frac{f(y; \alpha, \beta, \sigma^2)}{f(y; a, b, s^2)}\, \{1 + O(n^{-1})\}, $$

in which the ratio of the two values of $f$ contains, among other factors, $\prod_s\{(1-\theta_s^{2n})/(1-\hat\theta_s^{2n})\}^{1/2}$. This approximation is valid by Case 2 and Theorem 1 of Durbin (1980) applied to the distribution of $c_0, \ldots, c_m$ and $b_1, \ldots, b_q$.
We have already assumed that the roots $\theta_1, \ldots, \theta_m$ of the equation $\theta^m + \alpha_1\theta^{m-1} + \cdots + \alpha_m = 0$ have modulus less than one. Let us also limit the domain of $a$ to the region for which the roots $\hat\theta_1, \ldots, \hat\theta_m$ of the equation $\theta^m + a_1\theta^{m-1} + \cdots + a_m = 0$ have modulus less than one; in fact the probability of the sample point falling outside this domain equals zero anyway. The error in replacing the factor $\prod_s(1-\theta_s^{2n})/(1-\hat\theta_s^{2n})$ by unity is therefore exponentially small and can be neglected. Integrating out $b_1, \ldots, b_q$, we obtain for the density of $a, s^2$
$$ g(a, s^2; \alpha, \sigma^2) = \Big(\frac{n}{2\pi}\Big)^{\frac12(m+1)} \frac{|R_m|^{\frac12}\big(\sum_j a_j\big)^{q-\frac12}\big\{\sum_j(-1)^j a_j\big\}^{\frac12}(s^2)^{\frac12(n-q)-1}e^{\frac12(n-q)}}{2^{\frac12}(\sigma^2)^{\frac12(n-q)}\big(\sum_j a_j r_j\big)^{\frac12 m}\big(\sum_j\alpha_j\big)^{q-\frac12}\big\{\sum_j(-1)^j\alpha_j\big\}^{\frac12}} \exp\Big\{-\frac{(n-q)s^2 Q(\alpha, r)}{2\sigma^2\sum_j a_j r_j}\Big\}\{1 + O(n^{-1})\}. $$

Integrating out $s^2$ and using Stirling's formula, we obtain for the density of $a$

$$ g(a; \alpha) = \Big(\frac{n}{2\pi}\Big)^{\frac12 m} \frac{\big(\sum_j a_j r_j\big)^{\frac12(n-q-m)}\big(\sum_j a_j\big)^{q-\frac12}\big\{\sum_j(-1)^j a_j\big\}^{\frac12}|R_m|^{\frac12}}{\big(\sum_j\alpha_j\big)^{q-\frac12}\big\{\sum_j(-1)^j\alpha_j\big\}^{\frac12}\, Q^{\frac12(n-q)}(\alpha, r)}\{1 + O(n^{-1})\}, \eqno(15) $$

where the summations are over $j = 0, \ldots, m$, and where we express $r_1, \ldots, r_m$ in terms of $a_1, \ldots, a_m$, with

$$ Q(\alpha, r) = \sum_{j=0}^{m}\alpha_j^2 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\, r_s. $$

If we make the transformation from $a_1, \ldots, a_m$ to $r_1, \ldots, r_m$, the Jacobian for which is given by Daniels (1956), it is possible to write down the density of $r_1, \ldots, r_m$, but this is too complicated for practical use. However, a considerable simplification occurs when the distribution is expressed in terms of the partial serial correlations $r_{j\cdot}$. For the present purpose these may be defined as follows. Let $a_{s1}, \ldots, a_{ss}$ be the solution of the Yule-Walker equations

$$ a_{s1} + a_{s2} r_1 + \cdots + a_{ss} r_{s-1} + r_1 = 0, $$
$$ a_{s1} r_1 + a_{s2} + \cdots + a_{ss} r_{s-2} + r_2 = 0, $$
$$ \cdots $$
$$ a_{s1} r_{s-1} + a_{s2} r_{s-2} + \cdots + a_{ss} + r_s = 0, $$

for $s = 1, \ldots, m$. Then $r_{s\cdot} = -a_{ss}$. Obviously, $r_{1\cdot} = r_1$. Since $R_m$ defined by (7) is nonnegative-definite, $-1 \le r_{j\cdot} \le 1$ for $j = 1, \ldots, m$.
Now $a_{sr} = a_{s-1,r} + a_{ss} a_{s-1,s-r}$ $(r = 1, \ldots, s-1)$ (Daniels, 1956, equation (10.3)). Consequently
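The recursion $a_{sr} = a_{s-1,r} + a_{ss}a_{s-1,s-r}$ gives a direct route from $r_1, \ldots, r_m$ to the partial coefficients $r_{j\cdot} = -a_{jj}$; a sketch, with an illustrative $r$ sequence, which also checks the product identities that follow in the text:

```python
# Levinson-type recursion for the partial serial correlations r_j. = -a_jj.
def partials(r):                      # r = [r_1, ..., r_m]
    a_prev, out = [], []
    for s in range(1, len(r) + 1):
        num = r[s - 1] + sum(a_prev[k] * r[s - 2 - k] for k in range(s - 1))
        den = 1.0 + sum(a_prev[k] * r[k] for k in range(s - 1))
        a_ss = -num / den
        a_cur = [a_prev[k] + a_ss * a_prev[s - 2 - k] for k in range(s - 1)] + [a_ss]
        out.append(-a_ss)
        a_prev = a_cur
    return out, a_cur                 # partials and final AR coefficients

r = [0.5, 0.2, 0.1]                   # illustrative serial correlations
rp, a = partials(r)
assert abs(rp[0] - r[0]) < 1e-12      # r_1. = r_1

# Sigma_j a_j = prod (1 - r_j.)   and   Sigma_j (-1)^j a_j alternates signs
s_plus = 1.0 + sum(a)
s_alt = 1.0 + sum((-1) ** (j + 1) * v for j, v in enumerate(a))
prod_plus, prod_alt = 1.0, 1.0
for j, p in enumerate(rp, start=1):
    prod_plus *= 1.0 - p
    prod_alt *= 1.0 + p if j % 2 else 1.0 - p
assert abs(s_plus - prod_plus) < 1e-12
assert abs(s_alt - prod_alt) < 1e-12
```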

$$ (1 - a_{s-1,1} + a_{s-1,2} - \cdots \pm a_{s-1,s-1})(1 - a_{ss}) = 1 - a_{s1} + a_{s2} - \cdots - a_{ss} \quad (s \text{ odd}), $$
$$ (1 - a_{s-1,1} + a_{s-1,2} - \cdots \mp a_{s-1,s-1})(1 + a_{ss}) = 1 - a_{s1} + a_{s2} - \cdots + a_{ss} \quad (s \text{ even}), $$

whence

$$ \sum_{j=0}^{m}(-1)^j a_j = (1 - a_{11})(1 + a_{22})(1 - a_{33})\cdots\{1 + (-1)^m a_{mm}\}. $$
Similarly, $\sum_j a_j = (1 - r_{1\cdot})(1 - r_{2\cdot})\cdots(1 - r_{m\cdot})$, where the summation is over $j = 0, \ldots, m$. Since $-1 \le r_{j\cdot} \le 1$ for $j = 1, \ldots, m$, we have $\sum_j(-1)^j a_j \ge 0$ and $\sum_j a_j \ge 0$.
Daniels (1956) showed that the Jacobian of the transformation from $a_1, \ldots, a_m$ to $r_{1\cdot}, \ldots, r_{m\cdot}$ is

$$ \frac{\partial(a_1, \ldots, a_m)}{\partial(r_{1\cdot}, \ldots, r_{m\cdot})} = \prod_{\text{odd}}(1 - r_{j\cdot}^2)^{\frac12(j-1)}\prod_{\text{even}}(1 - r_{j\cdot})(1 - r_{j\cdot}^2)^{\frac12 j - 1}, $$

where the products are respectively over odd and even $j$. Since also $|R_m| = \prod_j(1 - r_{j\cdot}^2)^{m-j}$ and $\sum_j a_j r_j = \prod_j(1 - r_{j\cdot}^2)$, we obtain for the density of $r_{1\cdot}, \ldots, r_{m\cdot}$, on substitution in (15),

$$ h(r_\cdot, \alpha) = \Big(\frac{n}{2\pi}\Big)^{\frac12 m} \frac{\prod_{\text{odd}}(1 - r_{j\cdot})^{q-1}(1 - r_{j\cdot}^2)^{\frac12(n-q)}\prod_{\text{even}}(1 - r_{j\cdot})^{q+1}(1 - r_{j\cdot}^2)^{\frac12(n-q)-1}}{\big(\sum_j\alpha_j\big)^{q-\frac12}\big\{\sum_j(-1)^j\alpha_j\big\}^{\frac12}\, Q^{\frac12(n-q)}(\alpha, r)}\{1 + O(n^{-1})\} \qquad (-1 \le r_{j\cdot} \le 1;\ j = 1, \ldots, m;\ q = 0, 1, \ldots), \eqno(16) $$

where the summations are over $j = 0, \ldots, m$.
Daniels (1956) pointed out that when the normalizing constant in a one-term saddlepoint approximation is adjusted to make the total probability equal unity, the error of the approximation is effectively reduced from $O(n^{-1})$ to $O(n^{-3/2})$; Durbin (1980) showed that this also applies to the essentially equivalent approximations obtained by the present method in the sense that the error is $O(n^{-3/2})$ for a given standardized value of $r_{j\cdot}$ though not uniformly for all standardized $r_{j\cdot}$. To renormalize (16) consider first the null case in which $\alpha_1 = \cdots = \alpha_m = 0$. Then the $r_{j\cdot}$ are independent to the order of approximation under consideration with true normalizing constants $2^{-n}\{B(\frac12 n - \frac12 q + 1, \frac12 n + \frac12 q)\}^{-1}$ ($j$ odd) and $2^{-n}\{B(\frac12 n - \frac12 q, \frac12 n + \frac12 q + 1)\}^{-1}$ ($j$ even) in place of the asymptotic normalizing constants $\{n/(2\pi)\}^{\frac12}$. Using the method of Madow (1945) it is easy to show that

$$ h(r_\cdot, \alpha) = h(r_\cdot, 0)\Big[\Big(\sum_j\alpha_j\Big)^{q-\frac12}\Big\{\sum_j(-1)^j\alpha_j\Big\}^{\frac12}\, Q^{\frac12(n-q)}(\alpha, r)\Big]^{-1}\{1 + O(n^{-3/2})\}, $$

the summations being over $j = 0, \ldots, m$. Using (16) for the case $\alpha = 0$ and renormalizing we therefore obtain as our final approximation to the density of $r_{1\cdot}, \ldots, r_{m\cdot}$ for general $\alpha$,

$$ h(r_\cdot, \alpha) = \prod_{\text{odd}} 2^{-n}\{B(\tfrac12 n - \tfrac12 q + 1, \tfrac12 n + \tfrac12 q)\}^{-1}(1 - r_{j\cdot})^{q-1}(1 - r_{j\cdot}^2)^{\frac12(n-q)} \times \prod_{\text{even}} 2^{-n}\{B(\tfrac12 n - \tfrac12 q, \tfrac12 n + \tfrac12 q + 1)\}^{-1}(1 - r_{j\cdot})^{q+1}(1 - r_{j\cdot}^2)^{\frac12(n-q)-1} $$
$$ \times \Big[\Big(\sum_j\alpha_j\Big)^{q-\frac12}\Big\{\sum_j(-1)^j\alpha_j\Big\}^{\frac12}\, Q^{\frac12(n-q)}(\alpha, r)\Big]^{-1}\{1 + O(n^{-3/2})\} \qquad (-1 \le r_{j\cdot} \le 1;\ j = 1, \ldots, m;\ q = 0, 1, \ldots). \eqno(17) $$

Note that the derivation of (17) from (14) requires only elementary operations and is therefore technically easier to implement than the saddlepoint method applied to the same problem.
For $q = 0$ and $q = 1$, (17) gives the approximate distribution of $r_{1\cdot}, \ldots, r_{m\cdot}$ for deviations from the true mean and from the sample mean respectively. For $m = 1$ and $\alpha_1 = 0$, (17) agrees with the result in equation (3.7) of McGregor (1960) but is inconsistent with expression (7.4) and the expression of Daniels (1956, top of p. 179) for the distribution of $r_1$ when $q = 0$ and 1 respectively; this is not, however, surprising since Daniels used a different definition of $r_1$. It follows from McGregor's treatment that (17) gives the density of $r_{1\cdot}, \ldots, r_{m\cdot}$ when calculated from residuals from regression on the first $q$ orthogonal polynomials with an error of order $q^2 n^{-3/2}$.
The term $Q(\alpha, r)$ is not expressible as a product of factors depending on the different $r_{j\cdot}$'s even asymptotically, so in general the $r_{j\cdot}$'s are not approximately independent. However, when $\alpha_j = 0$ for $j = p+1, \ldots, m$; $p = 0, 1, \ldots$, we infer that for $j > p$, $r_{j\cdot}$ is distributed approximately independently of the other partial correlations with density

$$ f(r) = \frac{(1-r)^{q-1}(1-r^2)^{\frac12(n-q)}}{2^n B(\frac12 n - \frac12 q + 1, \frac12 n + \frac12 q)}\{1 + O(n^{-3/2})\} \qquad (-1 < r < 1;\ j \text{ odd}), \eqno(18) $$

$$ f(r) = \frac{(1-r)^{q+1}(1-r^2)^{\frac12(n-q)-1}}{2^n B(\frac12 n - \frac12 q, \frac12 n + \frac12 q + 1)}\{1 + O(n^{-3/2})\} \qquad (-1 < r < 1;\ j \text{ even}). \eqno(19) $$
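The dominant terms of (18) and (19) are exact densities on $(-1, 1)$; a numerical sketch confirming that they integrate to one, using the log-gamma function for the beta constant ($n = 30$, $q = 3$ are illustrative):

```python
# Midpoint-rule check that the dominant terms of (18) and (19) integrate to 1.
import math

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def density(r, n, q, odd):
    if odd:
        lb = log_beta(n / 2 - q / 2 + 1, n / 2 + q / 2)
        return math.exp(-n * math.log(2) - lb) * (1 - r) ** (q - 1) * (1 - r * r) ** ((n - q) / 2)
    lb = log_beta(n / 2 - q / 2, n / 2 + q / 2 + 1)
    return math.exp(-n * math.log(2) - lb) * (1 - r) ** (q + 1) * (1 - r * r) ** ((n - q) / 2 - 1)

n, q, steps = 30, 3, 4000
h = 2.0 / steps
totals = {}
for odd in (True, False):
    totals[odd] = sum(density(-1 + (i + 0.5) * h, n, q, odd) * h for i in range(steps))
    assert abs(totals[odd] - 1.0) < 1e-3
```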

Because the dominant terms of (18) and (19) are essentially beta distributions with half-integer indices it is easy to obtain approximate significance points of $r$ from the significance points of the variance-ratio distribution. Putting $x = \frac12(1+r)$ in (18) and (19) we obtain for the dominant terms the densities

$$ \frac{x^{\frac12(n-q)}(1-x)^{\frac12(n+q)-1}}{B(\frac12 n - \frac12 q + 1, \frac12 n + \frac12 q)}, \qquad \frac{x^{\frac12(n-q)-1}(1-x)^{\frac12(n+q)}}{B(\frac12 n - \frac12 q, \frac12 n + \frac12 q + 1)}. $$

Now if $F$ has the variance-ratio distribution with $(\nu_1, \nu_2)$ degrees of freedom then $x = \nu_1 F/(\nu_1 F + \nu_2)$ has the beta density

$$ x^{\frac12\nu_1 - 1}(1-x)^{\frac12\nu_2 - 1}/B(\tfrac12\nu_1, \tfrac12\nu_2). $$

We deduce that for the appropriate choices of $\nu_1, \nu_2$ the quantity $r = (\nu_1 F - \nu_2)/(\nu_1 F + \nu_2)$ is distributed with densities given by the dominant terms of (18) and (19) respectively. It follows that if $r_\alpha$ is the approximate significance point of $r_{j\cdot}$ at significance level $\alpha$ then

$$ r_\alpha = \frac{(n-q+2)F_\alpha - (n+q)}{(n-q+2)F_\alpha + (n+q)} \quad (j \text{ odd}), \qquad r_\alpha = \frac{(n-q)F_\alpha - (n+q+2)}{(n-q)F_\alpha + (n+q+2)} \quad (j \text{ even}), \eqno(20) $$

where $F_\alpha$ is the $\alpha$ level significance point of $F$ based on $(n-q+2, n+q)$ and $(n-q, n+q+2)$ degrees of freedom respectively. In practice it might be more convenient to obtain the significance level from a computer subroutine intended for the calculation of the probability integral of the beta distribution, e.g. that of Majumder & Bhattacharjee (1973a).
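A sketch of the mapping in (20); the value of $F_\alpha$ itself would come from $F$ tables or a beta-quantile routine, so the $F$ used below is just an illustrative number:

```python
# Map an F significance point to the significance point of r via (20).
def r_from_F(F, n, q, odd=True):
    v1, v2 = (n - q + 2, n + q) if odd else (n - q, n + q + 2)
    return (v1 * F - v2) / (v1 * F + v2)

# consistency of the two algebraic forms r = 2x - 1 with x = v1 F/(v1 F + v2)
n, q, F = 50, 3, 1.37
for odd in (True, False):
    v1, v2 = (n - q + 2, n + q) if odd else (n - q, n + q + 2)
    x = v1 * F / (v1 * F + v2)
    assert abs(r_from_F(F, n, q, odd) - (2 * x - 1)) < 1e-12
```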
The means of densities (18) and (19) are given by

$$ E(r) = -(q-1)/(n+1) \quad (j \text{ odd}), \qquad E(r) = -(q+1)/(n+1) \quad (j \text{ even}). \eqno(21) $$

These expressions were derived from approximate distributions which were obtained by replacing eigenvalues $\cos(j\pi r/n)$ by unity for $r = 1, \ldots, q-1$. Since in practice these values may be far from unity when $jr$ is large it is worth attempting to obtain a more accurate value for the mean. This can be done as follows. From Durbin (1960, equation (7)) we have

$$ r_{j\cdot} = \frac{r_j + \hat\alpha_1 r_{j-1} + \hat\alpha_2 r_{j-2} + \cdots + \hat\alpha_{j-1} r_1}{1 + \hat\alpha_1 r_1 + \cdots + \hat\alpha_{j-1} r_{j-1}}, \eqno(22) $$

where $\hat\alpha_1, \ldots, \hat\alpha_{j-1}$ are the coefficients of a fitted autoregression of order $j-1$, that is they are the solution to the equations

$$ \hat\alpha_1 + r_1\hat\alpha_2 + \cdots + r_{j-2}\hat\alpha_{j-1} = -r_1, $$
$$ \cdots $$
$$ r_{j-2}\hat\alpha_1 + r_{j-3}\hat\alpha_2 + \cdots + \hat\alpha_{j-1} = -r_{j-1}, $$
which we may write as $R\hat\alpha = -r$, whence $\hat\alpha = -R^{-1}r$. Now $R^{-1}$ is approximately $I - \Sigma$, where $I$ is the unit matrix and

$$ \Sigma = \begin{pmatrix} 0 & r_1 & r_2 & \cdots & r_{j-2} \\ r_1 & 0 & r_1 & \cdots & r_{j-3} \\ \vdots & & & & \vdots \\ r_{j-2} & r_{j-3} & r_{j-4} & \cdots & 0 \end{pmatrix}. $$

Since for the null case $E(r_s^2) = n^{-1} + O(n^{-2})$ and $E(r_i r_k) = O(n^{-2})$, a first approximation to $\Sigma^2$ is $(j-2)n^{-1}I$ and a second approximation to $R^{-1}$ is $\{1 + (j-2)n^{-1}\}I - \Sigma$. It follows that a second approximation to the numerator of (22) is

$$ r_j - \{1 + (j-2)n^{-1}\}(r_1 r_{j-1} + r_2 r_{j-2} + \cdots + r_{j-1} r_1). $$

The denominator of (22) is equal to $\prod_s(1 - r_{s\cdot}^2)$, where the product is over $s = 1, \ldots, j-1$. We can therefore write (22) as

$$ r_{j\cdot}\prod_{s=1}^{j-1}(1 - r_{s\cdot}^2) = r_j - \{1 + (j-2)n^{-1}\}(r_1 r_{j-1} + \cdots + r_{j-1} r_1). $$

Assuming that $r_1, \ldots, r_j$ are approximately independently distributed, we have approximately on taking expectations that

$$ E(r_{j\cdot}) = E(r_j) \quad (j \text{ odd}), \qquad E(r_{j\cdot}) = E(r_j) - \{1 + (j-2)n^{-1}\}E(r_{\frac12 j}^2) \quad (j \text{ even}). \eqno(23) $$

To obtain $E(r_j)$ and $E(r_j^2)$ we refer the vector $z$ to the eigenvectors $l_s$ as axes and we obtain

$$ r_j = \Big[\sum_s \cos\{j\pi(s-1)/n\}\, x_s^2\Big]\Big/\Big(\sum_s x_s^2\Big), $$

where the summations are over $s = q+1, \ldots, n$ and $x_{q+1}, \ldots, x_n$ are independent $N(0, 1)$. From well-known results (Durbin & Watson, 1950, p. 419), we therefore have

$$ E(r_j) = \frac{1}{n-q}\sum_{s=q+1}^{n}\cos\{j\pi(s-1)/n\}, \eqno(24) $$

$$ E(r_j^2) = \frac{1}{(n-q)(n-q+2)}\Big[2\sum_{s=q+1}^{n}\cos^2\{j\pi(s-1)/n\} + \Big\{\sum_{s=q+1}^{n}\cos\{j\pi(s-1)/n\}\Big\}^2\Big], \eqno(25) $$

the latter for even $j$. In computing these expressions note that

$$ \sum_{s=1}^{n}\cos\{j\pi(s-1)/n\} = \begin{cases} 1 & (j \text{ odd}), \\ 0 & (j \text{ even}), \end{cases} $$

while $\sum_{s=1}^{n}\cos^2\{j\pi(s-1)/n\} = \tfrac12 n$ for $j$ even and positive. In (23) we have ignored the negligible quantities $E(r_i r_k)$ for $i \ne k$ and have replaced $E(r_j^2)$ by its first approximation $n^{-1}$.
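The cosine-sum identities used in evaluating (24) and (25) are easy to confirm numerically; $n = 24$ below is illustrative:

```python
# Check: sum_{s=1}^{n} cos{j pi (s-1)/n} = 1 (j odd), 0 (j even);
#        sum_{s=1}^{n} cos^2{j pi (s-1)/n} = n/2 (j even, positive).
import math

def cos_sum(j, n, squared=False):
    vals = (math.cos(j * math.pi * (s - 1) / n) for s in range(1, n + 1))
    return sum(v * v for v in vals) if squared else sum(vals)

n = 24
for j in (1, 2, 3, 4):
    assert abs(cos_sum(j, n) - (1 if j % 2 else 0)) < 1e-9
assert abs(cos_sum(2, n, squared=True) - n / 2) < 1e-9
```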
We observe that (21) and (24) are the same to order $n^{-1}$. If greater accuracy is required in the null case than is provided by the unmodified distributions (17) and (18) it is suggested that an amount

$$ E(r_{j\cdot}) + (q-1)/(n+1) \quad (j \text{ odd}), \qquad E(r_{j\cdot}) + (q+1)/(n+1) \quad (j \text{ even}) \eqno(26) $$

should be subtracted from the observed value of $r_{j\cdot}$ before applying (17) and (18), where $E(r_{j\cdot})$ is given by (23).
It is interesting to consider the effect of regressing $y_t$ on the high-frequency components $\cos\{\frac12(2t-1)\pi(n-s)/n\}$ for $s = 1, \ldots, q$ instead of the low-frequency components $\cos\{\frac12(2t-1)\pi(s-1)/n\}$ as in (3). Letting $z_t$ now denote the least squares residual

$$ z_t = y_t - \sum_{s=1}^{q} b_s^* \cos\{\tfrac12(2t-1)\pi(n-s)/n\} $$

and defining $r_1, \ldots, r_m$ in terms of $z_1, \ldots, z_n$ as before, we find for the density $h^*(r_\cdot, \alpha)$ of the partial coefficients $r_{1\cdot}, \ldots, r_{m\cdot}$

$$ h^*(r_\cdot, \alpha) = K_3\, Q^{-\frac12(n-q)}(\alpha, r)\prod_{\text{odd}}(1 + r_{j\cdot})^{q+1}(1 - r_{j\cdot}^2)^{\frac12(n-q)-1}\prod_{\text{even}}(1 - r_{j\cdot})^{q+1}(1 - r_{j\cdot}^2)^{\frac12(n-q)-1}\{1 + O(n^{-3/2})\}, \eqno(27) $$

where $K_3$ is a normalizing constant which we shall not bother to evaluate.
It seems remarkable that relative to the low-frequency regression densities there should be a drastic difference in the null density of the odd-order coefficients coupled with no difference in the null density of the even-order coefficients.

3. APPROXIMATE DISTRIBUTION OF NONCIRCULAR COEFFICIENTS DERIVED FROM $r_1'$
We now consider a class of noncircular statistics based on $r_1'$ and obtain the approximate distribution of the first $m$ of them. Since the derivation parallels the previous case very closely we shall only indicate points of difference.
Write $r_1'$ in the matrix form $r_1' = (z'B_1 z)/(z'z)$, where $B_1$ is a matrix all of whose elements are zero, except for those adjacent to the principal diagonal which are $\tfrac12$. The eigenvalues and eigenvectors of $B_1$ are easily verified to be

$$ \mu_s = \cos\Big(\frac{\pi s}{n+1}\Big), \qquad l_s^{*\prime} = \Big[\sin\Big(\frac{\pi s}{n+1}\Big),\ \sin\Big(\frac{2\pi s}{n+1}\Big),\ \ldots,\ \sin\Big(\frac{n\pi s}{n+1}\Big)\Big] \qquad (s = 1, \ldots, n). $$

We define the lag $j$ coefficient as

$$ r_j' = (z'B_j z)/(z'z) = c_j'/c_0' \qquad (j = 1, 2, \ldots), $$

where the $B_j$'s are required to have the same eigenvectors as $B_1$. The appropriate $c_j'$'s are given by Anderson (1971, p. 291) as

$$ c_2' = -\tfrac12 z_1^2 + z_1 z_3 + z_2 z_4 + \cdots + z_{n-2} z_n - \tfrac12 z_n^2, $$
$$ c_3' = -z_1 z_2 + z_1 z_4 + z_2 z_5 + \cdots + z_{n-3} z_n - z_{n-1} z_n, $$
$$ c_4' = -z_1 z_3 - \tfrac12 z_2^2 + z_1 z_5 + z_2 z_6 + \cdots + z_{n-4} z_n - \tfrac12 z_{n-1}^2 - z_{n-2} z_n $$

and in general

$$ c_{2k}' = -z_1 z_{2k-1} - z_2 z_{2k-2} - \cdots - \tfrac12 z_k^2 + z_1 z_{2k+1} + z_2 z_{2k+2} + \cdots, $$
$$ c_{2k+1}' = -z_1 z_{2k} - z_2 z_{2k-1} - \cdots - z_k z_{k+1} + z_1 z_{2k+2} + z_2 z_{2k+3} + \cdots, $$

where the terms at the ends of these expressions are deduced from those at the beginning by symmetry. The eigenvalues of $B_j$ are given by Anderson as $\cos\{j\pi s/(n+1)\}$ for $s = 1, \ldots, n$; $j = 1, 2, \ldots$.
We now investigate the distribution of the $r_j'$'s when they are calculated from the least squares residuals

$$ z_t = y_t - \sum_{s=1}^{q} b_s' \sin\Big(\frac{\pi t s}{n+1}\Big) \qquad (t = 1, \ldots, n) \eqno(28) $$

from the regression (4), where the errors $u_t$ in (4) have the distribution with density

$$ K_4 \exp\Big[-\frac{1}{2\sigma^2}\Big\{\sum_j \alpha_j^2 c_0^{**} + 2\sum_j \alpha_j\alpha_{j+1} c_1^{**} + \cdots + 2\alpha_m c_m^{**}\Big\}\Big] $$

analogous to (12), with $c_s^{**} = u'B_s u$ $(s = 1, \ldots, m)$ and $c_0^{**} = u'u$.


Whenz definedby (8) is referredto the eigenvectors
Is as axes we have

r= [Escos{T( 1)}X2] /(ZsXS2) (29)

wherethe summationsare over s = q + 1, ..., n, and wherein the nonnull case xqi, .. ., xn are
independent
normalvariableswith
m m m-k
var(xs) =E O+i
of2+ 2E , 0 k+i COS{k7T(s-1)/n}.
i=O k=l i=O

Similarly,
whenz definedby (28) is referred 1*as axes, we have
to the eigenvectors
( (30)
r = Cos Ws2
(+ )(Es W2)

where the summations are over s = q +1, ..., n, and wq+1, ..., wn are independent normal
variableswith
m m m-k
var (WS) = Etj2 + 2 E Ol' i+k cos{k7s/(n+1)}.
i=O k=l i=O
Comparing (29) and (30), we infer immediately that $r_j'$ has the same distribution as $r_j$ provided that $q$ and $n$ are replaced by $q+1$ and $n+1$. This intriguing fact was pointed out for the case $m = 1$, $q = 0$, $\alpha_1 = 0$ by Anderson (1948). It follows that if $r_{1\cdot}', \ldots, r_{m\cdot}'$ are the partial serial correlation coefficients computed from $r_1', \ldots, r_m'$, their density is given by (17) with $n, q$ replaced by $n+1, q+1$.
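The equivalence can be seen directly from the index shift: replacing $q, n$ by $q+1, n+1$ in the weights of (29) yields exactly the weights of (30). A one-line numerical confirmation with illustrative $n$, $q$, $j$:

```python
# cos{j pi (s-1)/(n+1)} for s = q+2, ..., n+1 equals
# cos{j pi s/(n+1)}     for s = q+1, ..., n  term by term.
import math

n, q, j = 20, 4, 3
shifted = [math.cos(j * math.pi * (s - 1) / (n + 1)) for s in range(q + 2, n + 2)]
sine_case = [math.cos(j * math.pi * s / (n + 1)) for s in range(q + 1, n + 1)]
assert len(shifted) == len(sine_case) == n - q
assert all(abs(a - b) < 1e-12 for a, b in zip(shifted, sine_case))
```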

4. APPROXIMATE DISTRIBUTION OF THE CIRCULAR COEFFICIENTS

The circular serial correlation coefficients are defined as

$$ r_j'' = \Big(\sum_t z_t z_{t-j}\Big)\Big/\Big(\sum_t z_t^2\Big) = z'C_j z/(z'z) = c_j''/c_0'', $$

where the summations are over $t = 1, \ldots, n$ and where $z$ is now the vector of least squares residuals

$$ z_t = y_t - b_1 - \sum_{s=1}^{\frac12(q-1)}\Big\{b_{2s}\cos\Big(\frac{2\pi s t}{n}\Big) + b_{2s+1}\sin\Big(\frac{2\pi s t}{n}\Big)\Big\} \eqno(31) $$

from the regression (5), with $z_t = z_{t+n}$. We shall obtain the approximate distribution of the $r_j''$'s when the errors $u_t$ in (5) have the distribution with density (11). For simplicity we assume that $q$ and $n$ are odd, noting that only trivial alterations are needed to obtain analogous results for $q$ and $n$ even. The proof as given holds only for $q > 1$ but an almost identical argument holds for $q = 0$. Once again we give only points of difference from the previous derivations.
Let $\theta_1, \ldots, \theta_m$ and $\hat\theta_1, \ldots, \hat\theta_m$ be the roots of the equations $\theta^m + \alpha_1\theta^{m-1} + \cdots + \alpha_m = 0$ and $\theta^m + a_1\theta^{m-1} + \cdots + a_m = 0$. Using similar reductions to those of § 2 we find that the normalizing constant $K_1$ of (11) has the value $K_1 = (2\pi\sigma^2)^{-\frac12 n}\prod_s(1 - \theta_s^n)$. The density of $u_1, \ldots, u_n$ is therefore

$$ (2\pi\sigma^2)^{-\frac12 n}\prod_{s=1}^{m}(1 - \theta_s^n)\exp\Big[-\frac{1}{2\sigma^2}\Big\{\sum_{j=0}^{m}\alpha_j^2 c_0^\circ + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s} c_s^\circ\Big\}\Big], $$

whence on transforming to $y_1, \ldots, y_n$ and using (31) the density of $y_1, \ldots, y_n$ can be written in the form

$$ f''(y; \alpha, \beta, \sigma^2) = (2\pi\sigma^2)^{-\frac12 n}\prod_{s=1}^{m}(1 - \theta_s^n)\exp\Big[-\frac{1}{2\sigma^2}\Big\{\sum_{j=0}^{m}\alpha_j^2 c_0'' + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s} c_s'' + n v_0(b_1 - \beta_1)^2 + \tfrac12 n\sum_{k=1}^{\frac12(q-1)} v_k\big\{(b_{2k} - \beta_{2k})^2 + (b_{2k+1} - \beta_{2k+1})^2\big\}\Big\}\Big], $$

where

$$ v_k = \sum_{j=0}^{m}\alpha_j^2 + 2\sum_{s=1}^{m}\sum_{j=0}^{m-s}\alpha_j\alpha_{j+s}\cos\Big(\frac{2\pi s k}{n}\Big) \qquad (k = 0, 1, \ldots, \tfrac12(q-1)). $$

Using (14) with $f$ replaced by $f''$, replacing $\prod_s(1 - \theta_s^n)/(1 - \hat\theta_s^n)$ by unity and performing manipulations essentially identical to those of § 2, we obtain for the density $h''(r_\cdot, \alpha)$ of the partial coefficients $r_{1\cdot}'', \ldots, r_{m\cdot}''$

$$ h''(r_\cdot, \alpha) = \prod_{\text{odd}} 2^{-n}\{B(\tfrac12 n - \tfrac12 q + \tfrac12, \tfrac12 n + \tfrac12 q + \tfrac12)\}^{-1}(1 - r_{j\cdot})^{q}(1 - r_{j\cdot}^2)^{\frac12(n-q-1)} \times \prod_{\text{even}} 2^{-n}\{B(\tfrac12 n - \tfrac12 q, \tfrac12 n + \tfrac12 q + 1)\}^{-1}(1 - r_{j\cdot})^{q+1}(1 - r_{j\cdot}^2)^{\frac12(n-q)-1} $$
$$ \times \Big[\Big(\sum_j\alpha_j\Big)^{q}\, Q^{\frac12(n-q)}(\alpha, r)\Big]^{-1}\{1 + O(n^{-3/2})\} \qquad (-1 \le r_{j\cdot} \le 1;\ j = 1, \ldots, m;\ q = 0, 1, 2, \ldots), \eqno(32) $$
where the summation is over $j = 0, \ldots, m$. Apart from misprints and the inclusion here of the explicit value of the normalizing constant, this expression is the same as the forms (10.5) and (11.5) of Daniels (1956) for the cases $q = 0$ and $q = 1$ respectively. Comparing (32) with (17), we observe the remarkable fact that for a random series, when $\alpha_1 = \cdots = \alpha_m = 0$, the approximate distributions of the noncircular $r_{j\cdot}$ and the circular $r_{j\cdot}''$ are the same for $j$ even. Moreover, for a random series the distributions for $j$ odd differ only in that $q$ in (32) is replaced by $q-1$ in (17).
For the approximating null distributions we find

$$ E(r_{j\cdot}'') = -q/(n+1) \quad (j \text{ odd}), \qquad E(r_{j\cdot}'') = -(q+1)/(n+1) \quad (j \text{ even}). \eqno(33) $$

To obtain an approximation to $E(r_{j\cdot}'')$ for the null case more accurate than (33) we have, analogously to (23),

$$ E(r_{j\cdot}'') = E(r_j'') \quad (j \text{ odd}), \qquad E(r_{j\cdot}'') = E(r_j'') - \{1 + (j-2)n^{-1}\}E(r_{\frac12 j}''^{\,2}) \quad (j \text{ even}), \eqno(34) $$

where

$$ E(r_j'') = \frac{2}{n-q}\sum_{s=\frac12(q+1)}^{\frac12(n-1)}\cos\Big(\frac{2\pi j s}{n}\Big), $$

$$ E(r_j''^{\,2}) = \frac{1}{(n-q)(n-q+2)}\Big[4\sum_{s=\frac12(q+1)}^{\frac12(n-1)}\cos^2\Big(\frac{2\pi j s}{n}\Big) + \Big\{2\sum_{s=\frac12(q+1)}^{\frac12(n-1)}\cos\Big(\frac{2\pi j s}{n}\Big)\Big\}^2\Big], $$

the last for even $j$. Here

$$ 2\sum_{s=1}^{\frac12(n-1)}\cos\Big(\frac{2\pi j s}{n}\Big) = -1 \quad (j \text{ odd or even}), \qquad 2\sum_{s=1}^{\frac12(n-1)}\cos^2\Big(\frac{2\pi j s}{n}\Big) = \tfrac12(n-2). $$

Thus if greater accuracy is required in the null case than is provided by the basic approximation (32) an amount

$$ E(r_{j\cdot}'') + q/(n+1) \quad (j \text{ odd}), \qquad E(r_{j\cdot}'') + (q+1)/(n+1) \quad (j \text{ even}) \eqno(35) $$

should be subtracted from the observed value of $r_{j\cdot}''$ before applying (32), where $E(r_{j\cdot}'')$ is given by (34).

Table 1. Comparison of observed and approximate distributions of the noncircular statistic $r_{j\cdot}$ and the circular statistic $r_{j\cdot}''$

Theoretical   Observed percentage, noncircular r_j.      Observed percentage, circular r_j.''
percentage
in cell       j=1    j=2    j=3    j=4    j=5            j=1    j=2    j=3    j=4    j=5

(a) n = 25, q = 1
 5            5.12   4.96   4.97   4.56   4.25           4.93   7.16   4.28   6.65   4.86
 5            5.11   5.18   4.97   4.74   5.20           5.30   6.21   4.99   6.33   5.30
20           20.73  19.95  19.90  18.55  20.10          19.26  22.17  19.81  22.72  21.00
20           20.01  19.52  20.11  20.47  20.11          19.91  21.44  20.71  20.65  20.45
20           19.13  20.84  19.72  20.66  20.70          20.44  19.17  20.56  18.80  20.24
20           20.10  20.17  20.48  20.58  19.74          20.51  16.77  20.09  17.40  19.59
 5            4.91   4.86   4.67   5.16   5.10           4.95   3.75   4.73   3.87   4.34
 5            4.89   4.52   5.18   5.28   4.80           4.70   3.33   4.83   3.49   4.22

(b) n = 50, q = 3
 5            5.04   4.75   4.78   5.36   5.73           4.88   6.82   5.26   6.81   6.06
 5            4.84   5.26   5.13   5.30   5.58           5.23   5.91   5.12   5.99   6.23
20           20.19  19.47  20.88  20.39  21.57          19.00  22.47  21.13  22.47  22.08
20           20.30  20.03  20.93  20.17  21.11          20.11  20.13  20.49  21.42  21.61
20           20.72  20.76  20.23  19.66  19.34          20.43  19.50  19.99  18.95  19.07
20           19.34  19.82  19.37  19.60  18.37          20.19  16.99  19.13  17.21  18.01
 5            4.90   4.86   4.50   4.51   4.25           5.33   4.19   4.68   3.81   3.69
 5            4.67   5.05   4.18   5.01   4.02           4.83   3.99   4.20   3.34   3.25

(c) n = 100, q = 5
 5            5.03   5.15   5.66   5.45   5.84           4.97   6.30   5.33   6.12   6.32
 5            5.26   4.78   5.32   5.80   5.50           4.86   5.68   5.12   5.83   5.85
20           19.77  19.43  21.35  20.05  21.64          18.94  21.61  21.27  23.10  22.91
20           19.72  19.56  20.64  20.42  20.29          20.50  20.17  20.74  21.41  20.22
20           20.34  20.58  19.07  19.81  19.45          19.91  19.15  19.89  18.80  19.27
20           20.03  20.83  19.14  19.33  18.98          20.49  18.60  18.91  17.48  17.89
 5            5.04   4.67   4.41   4.97   4.50           5.19   4.46   4.60   3.66   4.17
 5            4.81   5.00   4.41   4.17   3.80           5.14   4.03   4.14   3.60   3.37

5. MONTE CARLO EXPERIMENT
The accuracy of the approximations for a random series was tested by means of a Monte Carlo experiment. Random normal deviates were calculated by the Marsaglia-Bray method as described by Neave (1973) and $r_{j\cdot}$, $r_{j\cdot}''$ were computed for each of $n = 25, 50, 100$, $j = 1, 2, 3, 4, 5$ and $q = 1, 3, 5$ from residuals (8) and (31) respectively. The second noncircular coefficient $r_{j\cdot}'$ was not included since its distribution is the same as that of $r_{j\cdot}$ with $n, q$ replaced by $n+1, q+1$. Percentage points of each approximating beta distribution were obtained by means of the algorithm given by Majumder & Bhattacharjee (1973b). The percentage of observations falling into each of eight groups along the distribution was then compared with the theoretical percentages of 5, 5, 20, 20, 20, 20, 5, 5. The mean corrections (26) and (35) were found to be essential to achieve satisfactory accuracy for $j, q > 1$ and were therefore employed. Each experiment was repeated 10,000 times. The performance of the approximations was found to be unsatisfactory in the tails for $n = 25$, $q = 3, 5$ and $n = 50$, $q = 5$, except for $j = 1, 2$. The results for $n = 25$, $q = 1$, $n = 50$, $q = 3$ and $n = 100$, $q = 5$ are given in Table 1. The results for $n = 50, 100$ and $q = 1$ are at least as good as those for $n = 25$, and those for $n = 100$, $q = 3$ are at least as good as those for $n = 50$, so these results are not included. In interpreting these tables, note that the standard error of a 5% entry is 0.22 and the standard error of a 20% entry is 0.40.
The results for the noncircular approximation are fairly satisfactory but the performance of the circular approximation is noticeably inferior.

This paper is based in part on work done at the Center for Computational Research in Economics and Management Science, Massachusetts Institute of Technology, with the support of the National Science Foundation. I am grateful to Mr Y. K. Tse for the computation of Tables 1(a) and (b). Mr Tse's work was supported by the Social Science Research Council.

REFERENCES
ANDERSON, T. W. (1948). On the theory of testing serial correlation. Skand. Aktuar. 31, 88-116.
ANDERSON, T. W. (1971). The Statistical Analysis of Time Series. New York: Wiley.
DANIELS, H. E. (1956). The approximate distribution of serial correlation coefficients. Biometrika 43, 169-85.
DURBIN, J. (1960). The fitting of time-series models. Rev. Int. Statist. Inst. 28, 233-44.
DURBIN, J. (1980). Approximations for densities of sufficient estimators. Biometrika 67, 311-33.
DURBIN, J. & WATSON, G. S. (1950, 1951). Testing for serial correlation in least squares regression I and II. Biometrika 37, 409-28; 38, 159-78.
MADOW, W. G. (1945). Note on the distribution of the serial correlation coefficient. Ann. Math. Statist. 16, 308-10.
MAJUMDER, K. L. & BHATTACHARJEE, G. P. (1973a). The incomplete beta integral. AS 63. Appl. Statist. 22, 409-11.
MAJUMDER, K. L. & BHATTACHARJEE, G. P. (1973b). The inverse of the incomplete beta function ratio. AS 64. Appl. Statist. 22, 411-3.
McGREGOR, J. R. (1960). An approximate test for serial correlation in polynomial regression. Biometrika 47, 111-9.
NEAVE, H. R. (1973). On using the Box-Muller transformation with multiplicative congruential pseudo-random numbers. Appl. Statist. 22, 92-7.

[Received December 1976. Revised January 1980]
