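Let
$$
S(\omega) = \begin{bmatrix} 1 & 1 \\ e^{-i\omega} & e^{i\omega} \\ \vdots & \vdots \\ e^{-i\omega(T-1)} & e^{i\omega(T-1)} \end{bmatrix}, \quad
B = \frac{A}{2}\begin{bmatrix} e^{-i\phi} \\ e^{i\phi} \end{bmatrix}, \quad
U = \begin{bmatrix} u_0 \\ \vdots \\ u_{T-1} \end{bmatrix}.
$$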
Then model (1) can be rewritten as
$$
X = \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{T-1} \end{bmatrix} = S(\omega)B + U. \tag{2}
$$
Consider the log-likelihood function when the noise $\{u_t\}$ is an i.i.d. $N(0, \sigma^2)$ sequence:
$$
L(\omega, \phi, A, \sigma) = -\frac{T}{2}\log\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\left\|X - S(\omega)B\right\|^2.
$$
160 DAWEI HUANG
Here
$$
F_1 = \frac{1}{\sqrt{T}}\left[e^{ik\lambda_j}\right]_{j=p,p+1,\dots,q;\ k=0,\dots,T-1},
\qquad
F_2 = \frac{1}{\sqrt{T}}\left[e^{ik\lambda_j}\right]_{j=0,\dots,p-1,q+1,\dots,T-1;\ k=0,\dots,T-1}.
$$
Let $I$ be the identity matrix with an appropriate dimension. Since
$$
F_1^*F_1 + F_2^*F_2 = [F_1^*, F_2^*]\begin{bmatrix} F_1 \\ F_2 \end{bmatrix} = I,
$$
we have
$$
Q(\omega, A_1, A_2) = \|F_1X - F_1S(\omega)B\|^2 + \|F_2X - F_2S(\omega)B\|^2. \tag{3}
$$
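Let $D(\lambda) = \dfrac{\sin(T\lambda/2)}{T\sin(\lambda/2)}$ be the Dirichlet function. Then
$$
\frac{1}{T}\sum_{t=0}^{T-1} e^{it\lambda} = \frac{e^{iT\lambda} - 1}{T\left(e^{i\lambda} - 1\right)} = e^{i(T-1)\lambda/2}D(\lambda), \tag{4}
$$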
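As a quick numerical sanity check of identity (4), the following Python sketch compares the averaged exponential sum with the closed form $e^{i(T-1)\lambda/2}D(\lambda)$. The function names (`dirichlet`, `mean_exponential`, `closed_form`) and the test values of $\lambda$ and $T$ are illustrative choices, not taken from the paper.

```python
import cmath
import math


def dirichlet(lam, T):
    """D(lambda) = sin(T*lambda/2) / (T*sin(lambda/2)), with D(0) = 1."""
    s = math.sin(lam / 2.0)
    if abs(s) < 1e-15:
        return 1.0  # limiting value at lambda = 0
    return math.sin(T * lam / 2.0) / (T * s)


def mean_exponential(lam, T):
    """Left-hand side of (4): (1/T) * sum_{t=0}^{T-1} exp(i*t*lambda)."""
    return sum(cmath.exp(1j * t * lam) for t in range(T)) / T


def closed_form(lam, T):
    """Right-hand side of (4): exp(i*(T-1)*lambda/2) * D(lambda)."""
    return cmath.exp(1j * (T - 1) * lam / 2.0) * dirichlet(lam, T)


if __name__ == "__main__":
    T = 128
    for lam in (0.0, 0.01, 0.5, 1.3, 3.0):
        gap = abs(mean_exponential(lam, T) - closed_form(lam, T))
        print(f"lambda = {lam:4.2f}   |lhs - rhs| = {gap:.2e}")
```

The two sides should agree up to floating-point rounding for any $\lambda$ that is not a nonzero multiple of $2\pi$.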
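so
$$
\frac{1}{\sqrt{T}}F_1S(\omega) =
\begin{bmatrix}
e^{i(T-1)(\lambda_{m-p}-\omega)/2}D(\lambda_{m-p}-\omega) & e^{i(T-1)(\lambda_{m-p}+\omega)/2}D(\lambda_{m-p}+\omega) \\
\vdots & \vdots \\
e^{i(T-1)(\lambda_{m}-\omega)/2}D(\lambda_{m}-\omega) & e^{i(T-1)(\lambda_{m}+\omega)/2}D(\lambda_{m}+\omega) \\
\vdots & \vdots \\
e^{i(T-1)(\lambda_{m+q}-\omega)/2}D(\lambda_{m+q}-\omega) & e^{i(T-1)(\lambda_{m+q}+\omega)/2}D(\lambda_{m+q}+\omega)
\end{bmatrix}.
$$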
Since the true frequency $\omega$ should not be far away from $\lambda_m$ (see Lemma 1 below), we only need to search for the estimator of $\omega$ in a neighborhood of $\lambda_m$. When $|\omega - \lambda_m| < \pi/T$, $D(\lambda_k + \omega)$ is not significant, so the second column in the above matrix can be ignored. Similarly, we can ignore the second term in (3) since $F_2S(\omega)$ is not significant.
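Let
$$
Z = \frac{1}{\sqrt{T}}F_1X, \tag{5}
$$
$$
H(\omega) = \begin{bmatrix} D(\lambda_{m-p}-\omega) & \cdots & D(\lambda_{m}-\omega) & \cdots & D(\lambda_{m+q}-\omega) \end{bmatrix}',
$$
$$
\Lambda = \mathrm{diag}\left\{ e^{i(T-1)\lambda_{m-p}/2}, \dots, e^{i(T-1)\lambda_{m}/2}, \dots, e^{i(T-1)\lambda_{m+q}/2} \right\}.
$$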
APPROXIMATE MAXIMUM LIKELIHOOD METHOD 161
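Then we have
$$
\frac{1}{\sqrt{T}}F_1S(\omega)B = A_1e^{-i(T-1)\omega/2}\Lambda H(\omega) + O\left(T^{-1}\right).
$$
Thus to minimize $Q(\omega, A_1, A_2)$ in (3), approximately, we consider
$$
\min_{\lambda_{m-1}<\omega<\lambda_{m+1},\ C} \left\| Z - Ce^{-i(T-1)\omega/2}\Lambda H(\omega) \right\|. \tag{6}
$$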
Changing the double minimum over both $\omega$ and $C$ in (6) to two iterated minimums and using the standard least squares method, we can show that (6) is equivalent to maximizing the target function $f(\omega) = |Z^*\Lambda H(\omega)|^2 / \|H(\omega)\|^2$. Thus, let
$$
N(\omega) = \frac{1}{\|H(\omega)\|}H(\omega), \qquad Y = \Lambda^*Z, \qquad R = \mathrm{Re}\{YY^*\};
$$
we estimate $\omega$ by the maximizer of
$$
f(\omega) = N(\omega)^*RN(\omega), \qquad \omega \in [\lambda_{m-1}, \lambda_{m+1}]. \tag{7}
$$
We call this maximizer the Approximate Maximum Likelihood Estimator (AMLE).
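The construction of the target function in (7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DFT coefficients are computed as direct sums $\frac{1}{T}\sum_t e^{it\lambda_j}x_t$ rather than by FFT, the phase matrix is taken as $\Lambda = \mathrm{diag}\{e^{i(T-1)\lambda_j/2}\}$, the index $m$ is searched over $0 < j < T/2$, and the maximization is a plain grid search over $[\lambda_{m-1}, \lambda_{m+1}]$ rather than the algorithm of Section 4. The names `amle_grid`, `dft_coefficient` and `grid_points` are hypothetical.

```python
import cmath
import math
import random


def dirichlet(lam, T):
    """Dirichlet function D(lambda) = sin(T*lambda/2) / (T*sin(lambda/2))."""
    s = math.sin(lam / 2.0)
    if abs(s) < 1e-15:
        return 1.0  # limit as lambda -> 0
    return math.sin(T * lam / 2.0) / (T * s)


def dft_coefficient(x, lam):
    """(1/T) * sum_t exp(i*t*lam) * x_t, i.e. one entry of Z in (5)."""
    T = len(x)
    return sum(x[t] * cmath.exp(1j * t * lam) for t in range(T)) / T


def amle_grid(x, p=1, q=1, grid_points=2001):
    """Maximise f(omega) = N(omega)' R N(omega) of (7) on a grid.

    Assumes p <= m and m + q < T so that all indices are valid.
    """
    T = len(x)
    lam = [2.0 * math.pi * j / T for j in range(T)]
    # m indexes the largest DFT coefficient; only 0 < j < T/2 is searched.
    m = max(range(1, T // 2), key=lambda j: abs(dft_coefficient(x, lam[j])))
    idx = list(range(m - p, m + q + 1))
    # Y = Lambda^* Z removes the phase factor exp(i*(T-1)*lambda_j/2).
    Y = [cmath.exp(-1j * (T - 1) * lam[j] / 2.0) * dft_coefficient(x, lam[j])
         for j in idx]

    def f(omega):
        H = [dirichlet(lam[j] - omega, T) for j in idx]
        # H is real, so N' Re{Y Y^*} N = |sum_j Y_j H_j|^2 / ||H||^2.
        return abs(sum(y * h for y, h in zip(Y, H))) ** 2 / sum(h * h for h in H)

    lo, hi = lam[m - 1], lam[m + 1]
    candidates = [lo + (hi - lo) * k / (grid_points - 1) for k in range(grid_points)]
    return max(candidates, key=f)


if __name__ == "__main__":
    random.seed(0)
    T, A, phi, sigma = 128, 1.0, 0.4, 0.1
    omega = 2.0 * math.pi * 25.3 / T
    x = [A * math.cos(omega * t + phi) + random.gauss(0.0, sigma) for t in range(T)]
    print("true frequency:", omega)
    print("grid AMLE     :", amle_grid(x, p=1, q=1))
```

A grid search is used here only because it is easy to verify; its resolution is limited by `grid_points`, which is one reason a closed form or a Newton step is preferable in practice.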
3. Asymptotic Behavior of AMLE
First of all, we have

Lemma 1. Assume that $\{u_t\}$ is a purely non-deterministic ergodic stationary series with zero mean and finite variance. Then
$$
\limsup_{T\to\infty} T\,|\lambda_m - \omega| \le \pi, \quad \text{a.s.} \tag{8}
$$
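Proof. Using (4), we have
$$
\frac{1}{T}\sum_{t=0}^{T-1} e^{it\lambda}x_t
= \frac{A}{2}e^{-i\phi}e^{i(T-1)(\lambda-\omega)/2}D(\lambda-\omega)
+ \frac{A}{2}e^{i\phi}e^{i(T-1)(\lambda+\omega)/2}D(\lambda+\omega)
+ \frac{1}{T}\sum_{t=0}^{T-1} e^{it\lambda}u_t.
$$
Since $\omega > 0$, $D(\lambda+\omega) \to 0$ as $T \to \infty$. Also, according to the lemma in Hannan (1973), the third term in the above formula tends to zero uniformly for all $\lambda \in [0, \pi]$. Further, there must be an integer $k$ such that
$$
\lambda_k \in \left[\omega - \frac{\pi}{T},\ \omega + \frac{\pi}{T}\right].
$$
So,
$$
\liminf_{T\to\infty}\left|\frac{1}{T}\sum_{t=0}^{T-1} e^{it\lambda_m}x_t\right|
= \liminf_{T\to\infty}\max_{0\le k<T}\left|\frac{1}{T}\sum_{t=0}^{T-1} e^{it\lambda_k}x_t\right|
\ge \frac{A}{2}\liminf_{T\to\infty}\min\left\{D(\lambda-\omega);\ \omega-\frac{\pi}{T}\le\lambda\le\omega+\frac{\pi}{T}\right\}
= \frac{A}{2}\lim_{T\to\infty}\frac{1}{T\sin\frac{\pi}{2T}} = \frac{A}{\pi}.
$$
If there were a subsequence $\{\lambda_m = \lambda_m(T_j),\ j = 1, 2, \dots\}$ such that
$$
\liminf_{j\to\infty} T_j\,|\lambda_m(T_j) - \omega| > \pi(1+\varepsilon) \quad \text{for some } \varepsilon > 0,
$$
then
$$
\liminf_{T\to\infty}\left|\frac{1}{T}\sum_{t=0}^{T-1} e^{it\lambda_m}x_t\right|
\le \liminf_{j\to\infty}\left|\frac{1}{T_j}\sum_{t=0}^{T_j-1} e^{it\lambda_m(T_j)}x_t\right|
\le \frac{A}{2}\liminf_{j\to\infty}\max\left\{|D(\lambda-\omega)|;\ |\lambda-\omega|\ge\frac{\pi(1+\varepsilon)}{T_j}\right\}
\le \frac{A}{2}\lim_{j\to\infty}\frac{1}{T_j\sin\frac{\pi(1+\varepsilon)}{2T_j}} = \frac{A}{\pi(1+\varepsilon)}.
$$
This contradicts the above formula. So (8) holds.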
Next, let $\hat\omega$ be the maximizer of $f(\omega)$ on $[\lambda_{m-p}, \lambda_{m+q}]$.

Lemma 2. Under the same conditions as in Lemma 1, we have
$$
T(\hat\omega - \omega) \to 0, \quad \text{a.s.} \tag{9}
$$
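Proof. Let $E(\omega) = T^{-1/2}\begin{bmatrix} 1 & e^{i\omega} & \cdots & e^{i(T-1)\omega} \end{bmatrix}'$ and let
$$
W = \Lambda^*F_1\left\{\frac{A}{2}e^{i\phi}E(\omega) + \frac{1}{\sqrt{T}}U\right\}.
$$
Then
$$
Y = \Lambda^*Z = \frac{A}{2}e^{-i(T-1)\omega/2 - i\phi}H(\omega) + W.
$$
Thus,
$$
R = \frac{A^2}{4}H(\omega)H^*(\omega) + M, \tag{10}
$$
where
$$
M = \mathrm{Re}\left\{\frac{A}{2}\left[e^{-i(T-1)\omega/2 - i\phi}H(\omega)W^* + e^{i(T-1)\omega/2 + i\phi}WH^*(\omega)\right] + WW^*\right\}.
$$
It follows from (7) and (10) that
$$
f(\hat\omega) = \frac{A^2}{4}\left|N(\hat\omega)^*H(\omega)\right|^2 + N(\hat\omega)^*MN(\hat\omega)
\ \ge\ f(\omega) = \frac{A^2}{4}\|H(\omega)\|^2 + N(\omega)^*MN(\omega).
$$
Then,
$$
1 - \left|N(\hat\omega)^*N(\omega)\right|^2 \le \frac{4}{A^2\|H(\omega)\|^2}\left[N(\hat\omega)^*MN(\hat\omega) - N(\omega)^*MN(\omega)\right]. \tag{11}
$$
However, let $\varepsilon^2$ be the largest singular value of $M$ and $\varepsilon > 0$; then according to the lemma in Hannan (1973), $\frac{1}{\sqrt{T}}F_1U \to 0$, a.s., as $T \to \infty$. Since $\omega > 0$, by (4), $F_1E(\omega) \to 0$. Thus $W \to 0$ and then $\varepsilon \to 0$, a.s., as $T \to \infty$. Also,
$$
N(\hat\omega)^*MN(\hat\omega) - N(\omega)^*MN(\omega)
= [N(\hat\omega) - N(\omega)]^*M[N(\hat\omega) + N(\omega)]
\le \varepsilon^2\,\|N(\hat\omega) - N(\omega)\|\,\|N(\hat\omega) + N(\omega)\| \le 4\varepsilon^2. \tag{12}
$$
Further, for all $\omega \in [\lambda_{m-p}, \lambda_{m+q}]$ and $p + q > 0$, we have
$$
\frac{4}{\pi^2} \le D\left(\frac{\pi}{T}\right)^2 + D\left(-\frac{\pi}{T}\right)^2 \le \sum_{j=m-p}^{m+q} D(\lambda_j - \omega)^2 = \|H(\omega)\|^2. \tag{13}
$$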
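So it follows from (11), (12) and (13) that
$$
\lim_{T\to\infty}\left|N(\hat\omega)^*N(\omega)\right| = 1, \quad \text{a.s.} \tag{14}
$$
Let $G(\lambda)$ be the vector with components $\dfrac{\sin(T(\lambda_j-\lambda)/2)}{T(\lambda_j-\lambda)/2}$, $j = m-p, \dots, m+q$. Since
$$
\left|D(\lambda) - \frac{\sin(T\lambda/2)}{T\lambda/2}\right|
= \left|\frac{\sin(T\lambda/2)}{T\sin(\lambda/2)}\right|\cdot\frac{(\lambda/2) - \sin(\lambda/2)}{\lambda/2}
\le \frac{(\lambda/2)^2}{6},
$$
we have
$$
\sup_{\lambda_{m-p}\le\lambda\le\lambda_{m+q}} \|H(\lambda) - G(\lambda)\| \to 0, \quad \text{as } T \to \infty.
$$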
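Thus, using (13) and (14) we have
$$
\lim_{T\to\infty}\frac{\left|G(\hat\omega)^*G(\omega)\right|}{\|G(\hat\omega)\|\,\|G(\omega)\|} = 1, \quad \text{a.s.} \tag{15}
$$
Since for any given $\omega$ there is an integer sequence $\{T_j,\ j = 1, 2, \dots\}$ such that $\sin\frac{T_j\omega}{2} \ne 0$, it follows from (15) that there is also a subsequence of $\{T_j,\ j = 1, 2, \dots\}$ such that $\sin\frac{T_j\hat\omega}{2} \ne 0$. So, without loss of generality, we may assume that both $\sin\frac{T_j\hat\omega}{2}$ and $\sin\frac{T_j\omega}{2}$ do not vanish. Let
$$
V(\omega) = \begin{bmatrix} \dfrac{2(-1)^{m-p+1}}{2\pi(m-p) - T\omega}, & \dots, & \dfrac{2(-1)^{m+q+1}}{2\pi(m+q) - T\omega} \end{bmatrix}'.
$$
Then (15) can be rewritten as $\lim_{T\to\infty}\dfrac{|V(\hat\omega)^*V(\omega)|}{\|V(\hat\omega)\|\,\|V(\omega)\|} = 1$, a.s. Also, since $\hat\omega \in [\lambda_{m-p}, \lambda_{m+q}]$, we conclude
$$
\lim_{T\to\infty}\frac{V(\hat\omega)^*V(\omega)}{\|V(\hat\omega)\|\,\|V(\omega)\|} = 1, \quad \text{a.s.}
$$
So, when $T \to \infty$,
$$
\left\|V(\hat\omega) - V(\omega) - \left(\frac{\|V(\hat\omega)\|}{\|V(\omega)\|} - 1\right)V(\omega)\right\|^2
= \left\|V(\hat\omega) - \frac{\|V(\hat\omega)\|}{\|V(\omega)\|}V(\omega)\right\|^2
= \|V(\hat\omega)\|^2 - 2\frac{\|V(\hat\omega)\|}{\|V(\omega)\|}V(\hat\omega)^*V(\omega) + \|V(\hat\omega)\|^2 \to 0, \quad \text{a.s.},
$$
$$
\frac{2T(-1)^{j+1}(\hat\omega - \omega)}{(2\pi j - T\hat\omega)(2\pi j - T\omega)} - \left(\frac{\|V(\hat\omega)\|}{\|V(\omega)\|} - 1\right)\frac{2(-1)^{j+1}}{2\pi j - T\omega} \to 0, \quad \text{a.s.}, \quad m-p \le j \le m+q,
$$
$$
\frac{T(\hat\omega - \omega)}{2\pi j - T\hat\omega} - \left(\frac{\|V(\hat\omega)\|}{\|V(\omega)\|} - 1\right) \to 0, \quad \text{a.s.}, \quad m-p \le j \le m+q.
$$
In particular, taking $j = m-p$ and $j = m+q$, we have
$$
\frac{T(\hat\omega - \omega)}{2\pi(m-p) - T\hat\omega} - \frac{T(\hat\omega - \omega)}{2\pi(m+q) - T\hat\omega} \to 0, \quad \text{a.s.}, \quad T \to \infty.
$$
Thus, since $p + q > 0$, (9) holds.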
Now we can establish a Central Limit Theorem for the maximizer $\hat\omega$. We call a stationary sequence regular if its tail $\sigma$-algebra is trivial (see Hannan (1979)). It is known that the regular condition implies the ergodic condition.

Theorem 3. Suppose that $\{u_t\}$ is a regular stationary sequence with zero mean, its best linear prediction for $u_t$ is the best prediction and its spectral density function $\psi(\lambda)$ is continuous at $\omega$. Let
$$
S_T = \frac{A}{\sqrt{2}\,T}\left\|H'(\omega) - \frac{H'(\omega)^*H(\omega)}{\|H(\omega)\|^2}H(\omega)\right\|. \tag{16}
$$
Then
$$
T^{3/2}S_T(\hat\omega - \omega) \xrightarrow{d} N\left(0,\ 2\pi\psi(\omega)\right).
$$
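Proof. First we have
$$
\hat\omega - \omega = -\frac{f'(\omega)}{f''(\bar\omega)}, \qquad |\bar\omega - \omega| \le |\hat\omega - \omega|. \tag{17}
$$
Consider $f'(\omega)$. Since $N(\lambda)^*H(\omega) \le N(\omega)^*H(\omega)$ for all $\lambda$, $N'(\omega)^*H(\omega) = 0$. Then, it follows from (10) that
$$
f'(\omega) = 2N'(\omega)^*\left\{\frac{A}{2}\|H(\omega)\|\,\mathrm{Re}\left\{e^{i\phi + i(T-1)\omega/2}W\right\} + \mathrm{Re}\{WW^*\}N(\omega)\right\}.
$$
Using the result in Hannan (1979), we can show that $\mathrm{Re}\{e^{i\phi + i(T-1)\omega/2}\Lambda^*F_1U\} \xrightarrow{d} N(0, \pi\psi(\omega)I)$. Then, since $\sqrt{T}F_1E(\omega) \to 0$, we have
$$
\sqrt{T}\,\mathrm{Re}\left\{e^{i\phi + i(T-1)\omega/2}W\right\} \xrightarrow{d} N\left(0,\ \pi\psi(\omega)I\right). \tag{18}
$$
Also, let
$$
K(\omega) \equiv -\|H(\omega)\|\frac{d}{d\omega}\left\{\frac{1}{\|H(\omega)\|}\right\} = \frac{H'(\omega)^*H(\omega)}{\|H(\omega)\|^2}.
$$
Then
$$
N'(\omega) = \frac{1}{\|H(\omega)\|}\left\{H'(\omega) - K(\omega)H(\omega)\right\}. \tag{19}
$$
So, we have
$$
\frac{\sqrt{2T}\,f'(\omega)}{A\left\|H'(\omega) - \frac{H'(\omega)^*H(\omega)}{\|H(\omega)\|^2}H(\omega)\right\|} \xrightarrow{d} N\left(0,\ 2\pi\psi(\omega)\right). \tag{20}
$$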
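Next, we prove that
$$
T^{-2}\left\{f''(\bar\omega) - f''(\omega)\right\} = o_p(1). \tag{21}
$$
It follows from (7) that $f''(\omega) = 2N''(\omega)^*RN(\omega) + 2N'(\omega)^*RN'(\omega)$ and $f'''(\omega) = 2N'''(\omega)^*RN(\omega) + 6N''(\omega)^*RN'(\omega)$. By (4), $\left|\frac{d^k}{d\lambda^k}D(\lambda)\right| = O(T^k)$. Then we have from (13) that
$$
\frac{d^k}{d\omega^k}N(\omega) = \frac{d^k}{d\omega^k}\left\{\frac{1}{\|H(\omega)\|}H(\omega)\right\} = O\left(T^k\right) \quad \text{for all } \omega. \tag{22}
$$
Thus, since $f''(\bar\omega) - f''(\omega) = f'''(\omega_1)(\bar\omega - \omega)$ with $|\omega_1 - \omega| \le |\bar\omega - \omega| \le |\hat\omega - \omega|$, it follows from (9) that (21) holds.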
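Further, manipulating $N'(\omega)$ and $N''(\omega)$ gives
$$
\frac{f''(\omega)}{T^2} = \frac{A^2}{2T^2}\left\{\frac{[H'(\omega)^*H(\omega)]^2}{\|H(\omega)\|^2} - \|H'(\omega)\|^2\right\} + \frac{2}{T^2}\left\{N''(\omega)^*MN(\omega) + N'(\omega)^*MN'(\omega)\right\}.
$$
By (22) and the fact that $M = o_p(1)$, and noticing that
$$
\left\|H'(\omega) - \frac{H'(\omega)^*H(\omega)}{\|H(\omega)\|^2}H(\omega)\right\|^2 = \|H'(\omega)\|^2 - \frac{[H'(\omega)^*H(\omega)]^2}{\|H(\omega)\|^2},
$$
we have
$$
\frac{f''(\omega)}{T^2} = -\frac{A^2}{2T^2}\left\|H'(\omega) - \frac{H'(\omega)^*H(\omega)}{\|H(\omega)\|^2}H(\omega)\right\|^2 + o_p(1). \tag{23}
$$
Thus the theorem, with $S_T$ as in (16), follows from (17), (20), (21) and (23).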
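Remark 1. Let $\delta = \dfrac{T\omega}{2\pi} - m$. When $p = 0$ and $q = 1$, we have
$$
H(\omega) = \begin{bmatrix} \dfrac{\sin(\pi\delta)}{\pi\delta} \\[2ex] \dfrac{\sin(\pi\delta)}{\pi(1-\delta)} \end{bmatrix} + O\left(\frac{1}{T^2}\right),
$$
$$
T^{-1}H'(\omega) = \frac{1}{2}\begin{bmatrix} \dfrac{\cos(\pi\delta)}{\pi\delta} - \dfrac{\sin(\pi\delta)}{(\pi\delta)^2} \\[2ex] \dfrac{\cos(\pi\delta)}{\pi(1-\delta)} + \dfrac{\sin(\pi\delta)}{\pi^2(1-\delta)^2} \end{bmatrix} + O\left(\frac{1}{T^2}\right).
$$
Substituting these into (16), one can verify that
$$
\frac{2\pi\psi(\omega)}{S_T^2} = \frac{16\pi^5\delta^2(1-|\delta|)^2\left(2\delta^2 - 2|\delta| + 1\right)\psi(\omega)}{A^2\sin^2(\pi\delta)} + O\left(\frac{1}{T^2}\right).
$$
So, when two DFT coefficients are used, the asymptotic variance $\dfrac{2\pi\psi(\omega)}{T^3S_T^2}$ in Theorem 3 is asymptotically equal to that in Quinn (1994). As in Quinn (1994), the minimum of $2\pi\psi(\omega)/S_T^2$ is $\dfrac{\pi^5\psi(\omega)}{2A^2}$, while the maximum is $\dfrac{16\pi^3\psi(\omega)}{A^2}$. Also, when $p = q = 1$, we have
$$
\frac{2\pi\psi(\omega)}{S_T^2} = \frac{8\pi^5\delta^2\left(1-\delta^2\right)^2\left(3\delta^4 + 1\right)\psi(\omega)}{A^2\left(3\delta^4 + 6\delta^2 + 1\right)\sin^2(\pi\delta)} + O\left(\frac{1}{T^2}\right).
$$
Then, when three DFT coefficients are used, the asymptotic variance is asymptotically equal to that in Quinn (1997).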
Since $\left\|H'(\omega) - \dfrac{H'(\omega)^*H(\omega)}{\|H(\omega)\|^2}H(\omega)\right\|$ is the norm of the regression error of the
vector H
()
H
()
H ()
H ()
2
H ()
1
2
> 0. (24)
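Also, an interesting problem is the relationship between the efficiency of AMLE and the number of DFT coefficients used. We denote the efficiency of AMLE as
$$
\eta(\omega) = \frac{48\pi\psi(\omega)\big/\left(T^3A^2\right)}{2\pi\psi(\omega)\big/\left(T^3S_T^2\right)} = \frac{12\left\|H'(\omega) - \frac{H'(\omega)^*H(\omega)}{\|H(\omega)\|^2}H(\omega)\right\|^2}{T^2}. \tag{25}
$$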
Figure 1 shows $\eta(\omega)$ for 2, 3, 13 and 25 DFT coefficients. The X-axis is the distance between the true frequency $\omega$ and $\lambda_m$ in units of $\pi/T$. It is clear that the minimum of $\eta(\omega)$ occurs at $\lambda_m$. Also, when $\omega = \lambda_m$ and $p = q$,
$$
H'(\omega) = \begin{bmatrix} \dfrac{\cos p\pi}{2\sin(p\pi/T)} & \cdots & 0 & \cdots & -\dfrac{\cos p\pi}{2\sin(p\pi/T)} \end{bmatrix}', \qquad H(\omega)^*H'(\omega) = 0.
$$
Thus we have the following corollary.

Figure 1. Efficiency curve of AML when 2, 3, 13 and 25 DFT coefficients are used (curves labelled 2 DFTs, 3 DFTs, 13 DFTs, 25 DFTs).
Corollary 4. Under the conditions in Theorem 3 and when $p = q$,
$$
\min_{0<\omega<\pi}\eta(\omega) \ge \frac{6}{T^2}\sum_{j=1}^{p}\frac{1}{\sin^2\frac{\pi j}{T}} > \frac{6}{\pi^2}\sum_{j=1}^{p}\frac{1}{j^2}.
$$
Since $\sum_{j=1}^{\infty}\frac{1}{j^2} = \frac{\pi^2}{6}$ (see, for example, Dym and McKean (1972, Chap. 2)), the efficiency approaches one very quickly. For example, we can achieve 95% efficiency, no matter where $\omega$ is, when $p = 12$.
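The efficiency in (25) and the lower bound in Corollary 4 are easy to evaluate numerically. The sketch below does so under stated assumptions: the derivative of the Dirichlet function is coded from its definition, the true frequency is parametrized as $\omega = 2\pi(m+\delta)/T$, and the function names (`efficiency`, `corollary_bound`) and the choice $m = 25$ are illustrative only. With $T = 128$ the values at $\delta = 0$ and $\delta = 0.5$ can be compared with the last column of Tables 1 and 2.

```python
import math


def dirichlet(x, T):
    """D(x) = sin(T*x/2) / (T*sin(x/2)), with D(0) = 1."""
    s = math.sin(x / 2.0)
    if abs(s) < 1e-12:
        return 1.0
    return math.sin(T * x / 2.0) / (T * s)


def dirichlet_prime(x, T):
    """Derivative of D with respect to x; D'(0) = 0 because D is even."""
    s = math.sin(x / 2.0)
    if abs(s) < 1e-12:
        return 0.0
    c = math.cos(x / 2.0)
    num = 0.5 * T * math.cos(T * x / 2.0) * s - 0.5 * math.sin(T * x / 2.0) * c
    return num / (T * s * s)


def efficiency(delta, T, m, p, q):
    """eta(omega) of (25) at omega = 2*pi*(m + delta)/T, using lambda_{m-p..m+q}."""
    omega = 2.0 * math.pi * (m + delta) / T
    args = [2.0 * math.pi * j / T - omega for j in range(m - p, m + q + 1)]
    H = [dirichlet(a, T) for a in args]
    dH = [-dirichlet_prime(a, T) for a in args]  # d/d(omega) of D(lambda_j - omega)
    k = sum(a * b for a, b in zip(dH, H)) / sum(h * h for h in H)
    resid2 = sum((a - k * b) ** 2 for a, b in zip(dH, H))  # regression error norm^2
    return 12.0 * resid2 / T ** 2


def corollary_bound(p):
    """Lower bound (6/pi^2) * sum_{j=1}^{p} 1/j^2 from Corollary 4."""
    return 6.0 / math.pi ** 2 * sum(1.0 / j ** 2 for j in range(1, p + 1))


if __name__ == "__main__":
    T, m = 128, 25
    for label, p, q in (("2 DFTs", 0, 1), ("3 DFTs", 1, 1),
                        ("13 DFTs", 6, 6), ("25 DFTs", 12, 12)):
        worst = efficiency(0.0, T, m, p, q)
        best = efficiency(0.5, T, m, p, q)
        print(f"{label:8s} delta=0: {worst:.4f}   delta=0.5: {best:.4f}")
    print("Corollary 4 bound for p = 12:", round(corollary_bound(12), 4))
```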
4. Algorithm
In this section, we use a subscript to indicate the number of DFT coefficients on which the statistics are based. Subscript 0 means two DFT coefficients, while subscript $p > 0$ means $(2p+1)$ DFT coefficients corresponding to the Fourier frequencies $\lambda_j$, $j = m-p, \dots, m+p$. For example,
$$
H_p(\omega) = \begin{bmatrix} D(\lambda_{m-p}-\omega) & \cdots & D(\lambda_{m+p}-\omega) \end{bmatrix}', \qquad f_p(\omega) = N_p(\omega)^*R_pN_p(\omega).
$$
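First consider estimation based on two DFT coefficients. One of them is the maximal DFT coefficient corresponding to $\lambda_m$. We choose the other by the sign of $f_1'(\lambda_m)$: let
$$
j = \begin{cases} 1, & f_1'(\lambda_m) > 0; \\ -1, & f_1'(\lambda_m) \le 0. \end{cases}
$$
Then the target function is $f_0(\omega) = N_0(\omega)^*R_0N_0(\omega)$ with $R_0 = \begin{bmatrix} r_{11} & r_{12} \\ r_{12} & r_{22} \end{bmatrix}$,
$$
r_{11} = \left|\sum_{t=0}^{T-1} e^{it\lambda_m}x_t\right|^2, \qquad r_{22} = \left|\sum_{t=0}^{T-1} e^{it\lambda_{m+j}}x_t\right|^2,
$$
$$
r_{12} = \mathrm{Re}\left\{ e^{i\frac{T-1}{2}(\lambda_{m+j}-\lambda_m)} \sum_{t=0}^{T-1} e^{it\lambda_m}x_t\ \overline{\sum_{t=0}^{T-1} e^{it\lambda_{m+j}}x_t} \right\}.
$$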
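Let $[\cos\theta, \sin\theta]'$ be the eigenvector corresponding to the maximum eigenvalue of $R_0$. Then
$$
\tan\theta = -\rho = -\frac{r_{11} - r_{22} - \sqrt{(r_{11} - r_{22})^2 + 4r_{12}^2}}{2r_{12}}.
$$
So, we derive a closed form for $\hat\omega_0$:
$$
\frac{D(\lambda_{m+j} - \hat\omega_0)}{D(\lambda_m - \hat\omega_0)}
= -\frac{\sin\frac{\lambda_m}{2}\cos\frac{\hat\omega_0}{2} - \cos\frac{\lambda_m}{2}\sin\frac{\hat\omega_0}{2}}{\sin\frac{\lambda_{m+j}}{2}\cos\frac{\hat\omega_0}{2} - \cos\frac{\lambda_{m+j}}{2}\sin\frac{\hat\omega_0}{2}} = -\rho,
$$
$$
\hat\omega_0 = 2\arctan\frac{\sin\frac{\lambda_m}{2} - \rho\sin\frac{\lambda_{m+j}}{2}}{\cos\frac{\lambda_m}{2} - \rho\cos\frac{\lambda_{m+j}}{2}}. \tag{26}
$$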
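The two-coefficient closed form (26) can be sketched as follows. This is an illustration under stated assumptions rather than the author's implementation: the DFT sums are computed directly as $\sum_t e^{it\lambda}x_t$, the sign of $f_1'(\lambda_m)$ is obtained from a central finite difference (the step size is an arbitrary choice), the ratio $\rho$ is taken as $\{r_{11} - r_{22} - \sqrt{(r_{11}-r_{22})^2 + 4r_{12}^2}\}/(2r_{12})$ so that the leading eigenvector of $R_0$ has $\tan\theta = -\rho$, and the index $m$ is supplied by the caller. The function name `two_coefficient_estimate` is hypothetical.

```python
import cmath
import math
import random


def dirichlet(x, T):
    """D(x) = sin(T*x/2) / (T*sin(x/2)), with D(0) = 1."""
    s = math.sin(x / 2.0)
    if abs(s) < 1e-15:
        return 1.0
    return math.sin(T * x / 2.0) / (T * s)


def dft_sum(x, lam):
    """sum_t exp(i*t*lam) * x_t (unnormalised DFT coefficient)."""
    return sum(x[t] * cmath.exp(1j * t * lam) for t in range(len(x)))


def two_coefficient_estimate(x, m):
    """Closed-form estimate from the DFT coefficients at lambda_m and lambda_{m+j}."""
    T = len(x)

    def lam(k):
        return 2.0 * math.pi * k / T

    # Phase-corrected coefficients Y_k = exp(-i*(T-1)*lambda_k/2) * sum_t e^{i t lambda_k} x_t.
    Y = {k: cmath.exp(-1j * (T - 1) * lam(k) / 2.0) * dft_sum(x, lam(k))
         for k in (m - 1, m, m + 1)}

    def f1(w):
        # Three-coefficient target function (p = q = 1).
        H = {k: dirichlet(lam(k) - w, T) for k in Y}
        return abs(sum(Y[k] * H[k] for k in Y)) ** 2 / sum(h * h for h in H.values())

    h = 1e-4 * 2.0 * math.pi / T  # finite-difference step: an assumption
    slope = f1(lam(m) + h) - f1(lam(m) - h)
    j = 1 if slope > 0 else -1

    r11 = abs(Y[m]) ** 2
    r22 = abs(Y[m + j]) ** 2
    r12 = (Y[m] * Y[m + j].conjugate()).real
    if r12 == 0.0:
        return lam(m)  # degenerate case: fall back to the Fourier frequency
    rho = (r11 - r22 - math.sqrt((r11 - r22) ** 2 + 4.0 * r12 ** 2)) / (2.0 * r12)
    num = math.sin(lam(m) / 2.0) - rho * math.sin(lam(m + j) / 2.0)
    den = math.cos(lam(m) / 2.0) - rho * math.cos(lam(m + j) / 2.0)
    return 2.0 * math.atan(num / den)


if __name__ == "__main__":
    random.seed(0)
    T = 128
    omega = 2.0 * math.pi * 25.3 / T
    x = [math.cos(omega * t + 0.4) + random.gauss(0.0, 0.1) for t in range(T)]
    print("true:", omega, " two-coefficient estimate:", two_coefficient_estimate(x, 25))
```

Because only two coefficients enter the final formula, this estimate is cheap but its efficiency varies strongly with the position of the true frequency between the two Fourier frequencies, which is why it serves as a starting value for the refinement that follows.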
Now, for any integer $p > 0$, let
$$
\hat\omega_p = \hat\omega_0 - \frac{f_p'(\hat\omega_0)}{f_p''(\hat\omega_0)}. \tag{27}
$$
Although $\hat\omega_p$ may not be exactly the same as the maximizer $\tilde\omega_p$ of $f_p(\omega)$, we have
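Theorem 5. Under the same conditions as in Theorem 3,
$$
T^{1.5}S_p(\hat\omega_p - \omega) \xrightarrow{d} N\left(0,\ 2\pi\psi(\omega)\right), \tag{28}
$$
where $S_p = \dfrac{A}{\sqrt{2}\,T}\left\|H_p'(\omega) - \dfrac{H_p'(\omega)^*H_p(\omega)}{\|H_p(\omega)\|^2}H_p(\omega)\right\|$.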
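Proof. Since
$$
\hat\omega_p - \omega = \hat\omega_0 - \omega - \frac{f_p'(\hat\omega_0) - f_p'(\omega) + f_p'(\omega)}{f_p''(\hat\omega_0)}
= \left\{1 - \frac{f_p''(\bar\omega)}{f_p''(\hat\omega_0)}\right\}(\hat\omega_0 - \omega) - \frac{f_p'(\omega)}{f_p''(\hat\omega_0)}, \qquad |\bar\omega - \omega| \le |\hat\omega_0 - \omega|,
$$
and we have
$$
T^{1.5}S_p(\hat\omega_0 - \omega) = O_p(1), \qquad 1 - \frac{f_p''(\bar\omega)}{f_p''(\hat\omega_0)} = o_p(1),
$$
(28) follows by the argument in the proof of Theorem 3.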
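Remark 2. In practice, when the sample size is small and the Signal to Noise Ratio (SNR) is low, it is better to modify (27) by adding one more step:
$$
\hat\omega_1 = \hat\omega_0 - \frac{f_p'(\hat\omega_0)}{f_p''(\hat\omega_0)}, \qquad
\hat\omega_p = \hat\omega_1 - \frac{f_p'(\hat\omega_1)}{f_p''(\hat\omega_1)}.
$$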
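The Newton-type update in (27), and its two-step variant in Remark 2, can be sketched as below. Note the assumptions: the derivatives $f_p'$ and $f_p''$ are approximated here by central finite differences instead of the analytic expressions, the relative step `rel_step` is an arbitrary choice, and the starting value is passed in by the caller (in the paper it is the two-coefficient estimate). The helper names `make_target` and `newton_refine` are hypothetical.

```python
import cmath
import math


def dirichlet(x, T):
    """D(x) = sin(T*x/2) / (T*sin(x/2)), with D(0) = 1."""
    s = math.sin(x / 2.0)
    if abs(s) < 1e-15:
        return 1.0
    return math.sin(T * x / 2.0) / (T * s)


def make_target(x, m, p):
    """Return f_p(omega) built from the 2p+1 DFT coefficients around lambda_m."""
    T = len(x)
    lams = [2.0 * math.pi * j / T for j in range(m - p, m + p + 1)]
    # Y_j = exp(-i*(T-1)*lambda_j/2) * (1/T) * sum_t exp(i*t*lambda_j) * x_t
    Y = [cmath.exp(-1j * (T - 1) * lam / 2.0)
         * sum(x[t] * cmath.exp(1j * t * lam) for t in range(T)) / T
         for lam in lams]

    def f(omega):
        H = [dirichlet(lam - omega, T) for lam in lams]
        return abs(sum(y * h for y, h in zip(Y, H))) ** 2 / sum(h * h for h in H)

    return f


def newton_refine(f, omega0, T, steps=1, rel_step=1e-3):
    """Apply `steps` Newton updates omega <- omega - f'(omega)/f''(omega)."""
    h = rel_step * 2.0 * math.pi / T  # finite-difference step (assumption)
    omega = omega0
    for _ in range(steps):
        d1 = (f(omega + h) - f(omega - h)) / (2.0 * h)
        d2 = (f(omega + h) - 2.0 * f(omega) + f(omega - h)) / (h * h)
        if d2 >= 0.0:
            break  # not locally concave: keep the current value
        omega -= d1 / d2
    return omega


if __name__ == "__main__":
    T = 128
    omega = 2.0 * math.pi * 25.3 / T
    x = [math.cos(omega * t + 0.4) for t in range(T)]
    f6 = make_target(x, m=25, p=6)
    start = 2.0 * math.pi * 25.2 / T
    print("true     :", omega)
    print("one step :", newton_refine(f6, start, T, steps=1))
    print("two steps:", newton_refine(f6, start, T, steps=2))
```

The concavity guard (`d2 >= 0`) is a defensive addition for poor starting values and is not part of (27).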
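Remark 3. To reduce the computational complexity while keeping the same asymptotic property, we can replace $H_p(\omega)$ by
$$
V_p(\omega) = \begin{bmatrix} \dfrac{2(-1)^{p-1}}{2\pi(m-p) - T\omega} & \cdots & \dfrac{2(-1)^{p+1}}{2\pi(m+p) - T\omega} \end{bmatrix}'.
$$
When two DFT coefficients are used, this leads to
$$
\hat\omega_0 = \frac{\lambda_m - \rho\lambda_{m+j}}{1 - \rho},
$$
with $\rho$ as in (26), which is similar to the estimator in Quinn (1994).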
Remark 4. Compared with the existing methods, the computational complexity of this method is acceptable. As in Rife and Vincent (1970) and Quinn (1994, 1997), the major computation is due to the FFT and is $O(T\log T)$. For calculating $\hat\omega_p$ based on $V_p(\omega)$, one only needs $O(p)$ extra multiplications and additions. So the cost of AML is basically the same as that in Rife and Vincent (1970) and Quinn (1994, 1997). Suppose instead that we set out to find the maximizer of the likelihood function (Rice and Rosenblatt (1988)) or the periodogram (Hannan (1973)) by an iterative procedure such as Newton's method. Even if we start from the maximizer of the periodogram and assume only one iteration is used (though this is far from enough according to the investigations in Quinn and Fernandes (1991) and Quinn (1994)), the computational complexity is proportional to $O(T\log T)$ plus $T \times C$ calculations for the first and second derivatives of the log-likelihood function or the periodogram. The constant $C$ in the second part is a very large number, since the derivatives consist of $T$ sine and cosine functions and no fast algorithms are available for the calculation.
5. Simulation
Simulations were carried out and some results are displayed in Tables 1 to 3. Two frequencies, $\frac{50\pi}{T}$ and $\frac{51\pi}{T}$ with $T = 128$, were estimated; the SNR $\left(= 10\log_{10}\frac{A^2}{\sigma^2}\right)$ varied from $-5$ to $5$; and 2, 3, 13 and 25 DFT coefficients were used ($p = 0, 1, 6, 12$). The entries in the tables are the average sample efficiencies over 1000 replications for each case. These sample efficiencies are calculated as the ratio of the Cramér-Rao bound (CRB) to the sample mean squared error. The theoretical efficiency calculated by (25) is listed in the last column.
Table 1. $T = 128$, $\omega = \frac{50\pi}{T}$, 128 DFT coefficients.

SNR       -5     -4     -3     -2     -1      0      1      2      3      4      5   Theory
p = 0   .4349  .4286  .4218  .4148  .4078  .4011  .3947  .3886  .3831  .3780  .3733  .3040
p = 1   .5042  .5115  .5184  .5252  .5320  .5387  .5452  .5515  .5576  .5633  .5687  .6080
p = 6   .8962  .9074  .9144  .9185  .9208  .9218  .9221  .9219  .9213  .9206  .9197  .9074
p = 12  .9145  .9275  .9359  .9414  .9450  .9437  .9488  .9497  .9501  .9503  .9502  .9529
Table 2. $T = 128$, $\omega = \frac{51\pi}{T}$, 128 DFT coefficients.

SNR       -5     -4     -3     -2     -1      0      1      2      3      4      5   Theory
p = 0   .0001  .0029  .0577  .1111  .6376  .9409  .9456  .9486  .9501  .9503  .9490  .9855
p = 1   .0001  .0026  .0564  .1006  .3700  .9420  .9452  .9470  .9477  .9472  .9453  .9912
p = 6   .0001  .0004  .0586  .1112  .9442  .9633  .9626  .9611  .9586  .9549  .9501  .9999
p = 12  .0001  .0025  .0594  .1105  .6708  .9619  .9609  .9591  .9564  .9527  .9477  .9999
In Tables 1 and 2, the standard DFT with 128 coefficients was used. For $\omega = \frac{50\pi}{T}$, which corresponds to the worst case, the results were consistent with the theoretical efficiency. However, for the best case $\omega = \frac{51\pi}{T}$, when SNR < 0, a big difference between the efficiencies given by simulation and theory occurred. We found that it is due to the location of $\lambda_m$. In this case $|\lambda_m - \omega|$ could be more than $\pi/T$. To solve the problem, we calculated the DFT on 256 frequencies $\frac{j\pi}{T}$, $j = 0, \dots, 255$. Suppose that $\frac{n\pi}{T}$ is the maximizer. Then let $\lambda_m = \frac{n\pi}{T}$ if $n$ is even; $\lambda_m = \frac{(n+1)\pi}{T}$ if $n$ is odd and $\left|\sum_{t=0}^{T-1} e^{it(n+1)\pi/T}x_t\right| > \left|\sum_{t=0}^{T-1} e^{it(n-1)\pi/T}x_t\right|$; otherwise $\lambda_m = \frac{(n-1)\pi}{T}$. Based on such a $\lambda_m$, we obtained the results in Table 3, which were close to the theoretical values.
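The rule for locating $\lambda_m$ on the half-spaced grid can be sketched as follows. The sketch computes the 256-point DFT by direct summation rather than a zero-padded FFT, and it restricts the search to $0 < n < T$ (frequencies in $(0, \pi)$) to avoid the mirror peak of a real-valued series; that restriction, and the function names `dft_magnitude` and `select_m`, are assumptions made for illustration.

```python
import cmath
import math


def dft_magnitude(x, freq):
    """|sum_t exp(i*t*freq) * x_t| for an arbitrary frequency."""
    return abs(sum(x[t] * cmath.exp(1j * t * freq) for t in range(len(x))))


def select_m(x):
    """Pick the Fourier index m from the maximiser n*pi/T on the half-spaced grid."""
    T = len(x)
    half = math.pi / T
    # Maximiser over the half-spaced grid; 0 < n < T is an assumption for real data.
    n = max(range(1, T), key=lambda k: dft_magnitude(x, k * half))
    if n % 2 == 0:
        return n // 2            # n*pi/T is itself a Fourier frequency
    # n odd: move to the neighbouring Fourier frequency with the larger coefficient.
    if dft_magnitude(x, (n + 1) * half) > dft_magnitude(x, (n - 1) * half):
        return (n + 1) // 2
    return (n - 1) // 2


if __name__ == "__main__":
    T = 128
    omega = 51.0 * math.pi / T   # the "best case" frequency of Tables 2 and 3
    x = [math.cos(omega * t + 0.4) for t in range(T)]
    print("selected m:", select_m(x))
```

For a frequency halfway between two Fourier frequencies either neighbour is acceptable, so the demo prints the selected index without asserting a particular value.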
Table 3. $T = 128$, $\omega = \frac{51\pi}{T}$, 256 DFT coefficients.

SNR       -5     -4     -3     -2     -1      0      1      2      3      4      5   Theory
p = 0   .0455  .8978  .9132  .9253  .9343  .9409  .9456  .9486  .9501  .9503  .9490  .9855
p = 1   .0456  .9130  .9235  .9315  .9375  .9420  .9452  .9470  .9477  .9472  .9453  .9912
p = 6   .0457  .9625  .9643  .9648  .9647  .9639  .9626  .9606  .9577  .9539  .9490  .9999
p = 12  .0457  .9650  .9647  .9645  .9638  .9626  .9609  .9586  .9566  .9517  .9466  .9999
References
Dym, H. and McKean, H. P. (1972). Fourier Series and Integrals. Academic Press, New York.
Hannan, E. J. (1973). The estimation of frequency. J. Appl. Probab. 10, 510-519.
Hannan, E. J. (1979). The central limit theorems for time series regression. Stoch. Proc. Appl. 9, 281-289.
Hannan, E. J. and Huang, D. (1993). On line frequency estimation. J. Time Ser. Anal. 14, 147-161.
Kay, S. M. and Marple, S. L. (1981). Spectrum analysis - a modern perspective. Proc. IEEE 69, 1380-1419.
Kedem, B. (1986). On frequency detection by zero-crossing. Signal Processing 10, 303-306.
Kendall, M. G. and Stuart, A. (1967). The Advanced Theory of Statistics. Charles Griffin & Co., London.
Pisarenko, V. F. (1973). The retrieval of harmonics from a covariance function. Geophys. J. Roy. Astr. Soc. 33, 347-366.
Quinn, B. G. and Fernandes, J. M. (1991). A fast efficient technique for the estimation of frequency. Biometrika 78, 489-498.
Quinn, B. G. (1994). Estimating frequency by interpolation using Fourier coefficients. IEEE Trans. Signal Processing 42, 1264-1268.
Quinn, B. G. (1997). Estimation of frequency, amplitude and phase from the DFT of a time series. IEEE Trans. Signal Processing 45, 814-817.
Rice, J. A. and Rosenblatt, M. (1988). On frequency estimation. Biometrika 75, 477-484.
Rife, D. C. and Vincent, G. A. (1970). Use of the discrete Fourier transform in the measurement of frequencies and levels of tones. Bell Syst. Tech. J. 49, 197-228.
Truong-Van (1990). A new approach to frequency analysis with amplified harmonics. J. Roy. Statist. Soc. Ser. B 52, 203-221.
School of Mathematics, Queensland University of Technology, GPO Box 2434, Brisbane, QLD
4001, Australia.
E-mail: d.huang@fsc.qut.edu.au
(Received May 1997; accepted May 1999)