Variance Reduction

Book: Statistical Computing with R, Maria L. Rizzo, Chapman & Hall/CRC, 2008.
Integral estimation
g(x) is a function. We want to compute ∫ g(x) dx, assuming the integral is finite. We use facts about statistical moments to estimate integrals.
Recall that if X is a random variable with distribution f(x) (written X ~ f(x)) and Y = g(X) is another random variable, then
E(g(X)) = ∫ g(x) f(x) dx.
In particular, if X ~ U(0,1), so that f(x) = 1 on [0,1], then E(g(X)) = ∫_0^1 g(x) dx.
SAMPLING ALGORITHM:
Generate m i.i.d. U(0,1) random variables X_1, X_2, ..., X_m. (U(0,1) is used because it fits the domain of integration, [0,1].)
Set θ̂ = ḡ(X) = (1/m) Σ_{i=1}^m g(X_i); then E(θ̂) = E(g(X)).
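As a concrete illustration (not from the book), the simple Monte Carlo estimator can be sketched in R; the test function g(x) = exp(-x), the seed, and m are all hypothetical choices, with true value 1 − e⁻¹ ≈ 0.632:

```r
# Simple MC estimate of integral_0^1 g(x) dx with the
# hypothetical test function g(x) = exp(-x); truth = 1 - exp(-1).
set.seed(1)                 # illustrative seed
m <- 10000                  # number of simulations
x <- runif(m)               # X_1, ..., X_m iid U(0,1)
theta.hat <- mean(exp(-x))  # g-bar(X), the MC estimate
```

The estimate should be close to 0.632, with error shrinking like 1/sqrt(m).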
Next, estimate θ = ∫_a^b g(t) dt.
One idea is to use a change of variables so that the simple Monte Carlo estimator over [0,1] can be used. Specifically, find a function y(t) such that y(a) = 0, y(b) = 1 and perform the integration:
∫_a^b g(t) dt = ∫_{y(a)}^{y(b)} g(t(y)) (dt/dy) dy = ∫_0^1 g(t(y)) (dt/dy) dy.
Take y = (t − a)/(b − a). Then t(y) = a + (b − a)y and dt/dy = b − a.
Equivalently, the integral is an expectation with respect to U ~ U(a, b), whose density is f_U(u) = 1/(b − a):
∫_a^b g(t) dt = (b − a) ∫_a^b g(t) · (1/(b − a)) dt = (b − a) ∫_a^b g(u) f_U(u) du = (b − a) E_U[g(U)].
SAMPLING ALGORITHM:
Generate X_1, ..., X_m iid ~ U(a, b) and set θ̂ = ((b − a)/m) Σ_{i=1}^m g(X_i).
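A sketch of the U(a, b) version, again with a hypothetical g(t) = exp(-t), now on (1, 3), where the true value is e⁻¹ − e⁻³:

```r
# MC estimate of integral_a^b g(t) dt = (b - a) E[g(U)], U ~ U(a, b),
# with the hypothetical g(t) = exp(-t) on (1, 3); truth = exp(-1) - exp(-3).
set.seed(1)
m <- 10000
a <- 1; b <- 3
x <- runif(m, a, b)                   # X_i iid U(a, b)
theta.hat <- (b - a) * mean(exp(-x))  # (b - a) * g-bar(X)
```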
Example: the standard normal cdf.
For x > 0, Φ(x) = 0.5 + (1/√(2π)) ∫_0^x e^{−t²/2} dt, which brings us back to finite limits.
For x < 0, Φ(x) = 1 − Φ(−x), so use the method above.
Estimate θ = ∫_0^x e^{−t²/2} dt for an arbitrary fixed x > 0. Substituting t = xy,
θ = ∫_0^1 x e^{−(xy)²/2} dy = E_Y[x e^{−(xY)²/2}], where Y ~ U(0,1).
SAMPLING ALGORITHM:
Generate Y_1, ..., Y_m iid ~ U(0,1).
Set θ̂ = (1/m) Σ_{i=1}^m x e^{−(xY_i)²/2}, and then Φ̂(x) = 0.5 + θ̂/√(2π).
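A sketch of this algorithm in R; the choices x = 2, m, and the seed are illustrative:

```r
# Estimate Phi(2) via theta = integral_0^x exp(-t^2/2) dt with t = x*y:
set.seed(1)
m <- 10000
x <- 2
y <- runif(m)                                # Y_i iid U(0,1)
theta.hat <- mean(x * exp(-(x * y)^2 / 2))   # estimates the integral
phi.hat <- 0.5 + theta.hat / sqrt(2 * pi)    # estimate of Phi(2)
```

The result can be checked against R's built-in pnorm(2) = 0.9772499.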
Alternatively, estimate Φ(x) = ∫_{−∞}^x (1/√(2π)) e^{−t²/2} dt directly, for an arbitrary x, when you have a standard normal generator at your disposal.
Let Z ~ N(0,1). Then
E[I(Z ≤ x)] = ∫_{−∞}^{∞} I(z ≤ x) f_Z(z) dz = ∫_{−∞}^{∞} I(z ≤ x) (1/√(2π)) e^{−z²/2} dz = ∫_{−∞}^{x} (1/√(2π)) e^{−z²/2} dz = Φ(x).
SAMPLING ALGORITHM:
Generate Z_1, ..., Z_m iid ~ N(0,1) and set Φ̂(x) = (1/m) Σ_{i=1}^m I(Z_i ≤ x).
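A minimal R sketch of the indicator-based estimator, again at the illustrative value x = 2:

```r
# Estimate Phi(x) as the proportion of standard normal draws at or below x.
set.seed(1)
m <- 10000
x <- 2
z <- rnorm(m)             # Z_i iid N(0,1)
phi.hat <- mean(z <= x)   # (1/m) * sum of I(Z_i <= x)
```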
General Result
Let f(x) be a probability density supported on a set A.
To estimate θ = ∫_A g(x) f(x) dx, generate X_1, ..., X_m iid ~ f(x) and set θ̂ = (1/m) Σ_{i=1}^m g(X_i).
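For instance (a hypothetical example, not from the slides), take f to be the Exp(1) density and g(x) = x², so that θ = E[X²] = Var(X) + (E X)² = 2:

```r
# General result, hypothetical instance: f = Exp(1), g(x) = x^2, theta = 2.
set.seed(1)
m <- 10000
x <- rexp(m)             # X_i iid from f
theta.hat <- mean(x^2)   # (1/m) * sum of g(X_i)
```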
Standard errors
θ̂ is a sample mean of the independent g(X_1), g(X_2), ..., g(X_m), so we can use basic statistical principles:
Var(θ̂) = σ²/m, where σ² = Var(g(X)).
How do we estimate σ²? ...by the sample variance of g(X_1), g(X_2), ..., g(X_m).
Recall from statistics that the unbiased estimate of the variance is
s² = (1/(m − 1)) Σ_{i=1}^m (g(X_i) − θ̂)², while the maximum likelihood estimate is
σ̂² = (1/m) Σ_{i=1}^m (g(X_i) − θ̂)².
Standard errors
Plugging the MLE into Var(θ̂) = σ²/m gives
Var̂(θ̂) = σ̂²/m = ((1/m) Σ_{i=1}^m (g(X_i) − θ̂)²)/m = (Σ_{i=1}^m (g(X_i) − θ̂)²)/m².
Have to be careful to have two m's in the denominator.
Accordingly,
s.e.(θ̂) = (1/m) √(Σ_{i=1}^m (g(X_i) − θ̂)²).
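Putting the estimate and its standard error together in R, for a hypothetical g(t) = exp(-t) on (1, 3): each simulation contributes one value h_i = (b − a) g(X_i), and the two-m formula above applies to the h_i.

```r
# MC estimate with standard error; g(t) = exp(-t) on (1, 3) is hypothetical.
set.seed(1)
m <- 10000
a <- 1; b <- 3
x <- runif(m, a, b)
h <- (b - a) * exp(-x)                       # one value per simulation
theta.hat <- mean(h)
se.hat <- sqrt(sum((h - theta.hat)^2)) / m   # note the two m's: sqrt(sum)/m
ci <- theta.hat + c(-1.96, 1.96) * se.hat    # approximate 95% CI
```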
For Z ~ N(0,1), P(−1.96 < Z < 1.96) = 0.95. Substituting Z = (θ̂ − θ)/s.e.(θ̂), which is approximately standard normal by the CLT,
P(−1.96 < (θ̂ − θ)/s.e.(θ̂) < 1.96) ≈ 0.95, and
P(θ̂ − 1.96 s.e.(θ̂) < θ < θ̂ + 1.96 s.e.(θ̂)) ≈ 0.95.
A 95% CI for θ is θ̂ ± 1.96 s.e.(θ̂).
Example 5.5
Can use θ̂ = mean(...) instead of sum(...)/m; note that the mean already includes a division by m.
Example 5.5, continued
Take x = 2 and Z ~ N(0,1).
g(Z) = I(Z < x) is a Bernoulli random variable, taking value 1 if Z < x and 0 otherwise.
E[g(Z)] = E[I(Z < x)] = 1 · P(Z < x) + 0 · P(Z ≥ x) = P(Z < x) = Φ(x).
Φ(x) is the success probability, P[g(Z) = 1] = Φ(x). Therefore, according to the Bernoulli distribution, Var[g(Z)] = Φ(x)(1 − Φ(x)).
Example 5.5, from the book: the MC variance estimate.
> pnorm(2)
[1] 0.9772499
Φ(2) ≈ 0.977, which would yield theoretical variance Φ(2)(1 − Φ(2))/10,000 = 2.223e−06. The MC variance estimate is very close.
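A sketch along the lines of the book's computation (the seed and variable names here are illustrative, not the book's code):

```r
# Estimate Phi(2) and the MC variance of the estimator, as in Example 5.5.
set.seed(1)
x <- 2
m <- 10000
z <- rnorm(m)
g <- as.numeric(z < x)          # Bernoulli draws with success prob Phi(2)
cdf <- mean(g)                  # estimate of Phi(2)
v <- mean((g - mean(g))^2) / m  # MC variance estimate of the estimator
# theoretical value: pnorm(2) * (1 - pnorm(2)) / m = 2.223e-06
```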
Efficiency
θ̂_1 is said to be more efficient than θ̂_2 if Var(θ̂_1) < Var(θ̂_2).
Efficiency is called a second-order property. As the cartoon suggests, you want to first worry whether your estimator is correct (unbiased) before you concern yourself with efficiency.
Notes on efficiency
Variances are unknown, so their MC estimates are used for efficiency calculations.
Variances of averages are of order 1/m (they decrease as the number of simulations m increases), so one way to decrease the variance is to increase the number of simulations.
Sometimes the percent reduction from using θ̂_2 instead of θ̂_1 is reported:
100 · (Var(θ̂_1) − Var(θ̂_2)) / Var(θ̂_1).
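A sketch comparing the per-draw variances of the two Φ(2) estimators seen earlier (the seed and m are illustrative; at x = 2 the indicator estimator happens to have the smaller variance, so the "reduction" from switching to the substitution estimator comes out negative):

```r
# Percent variance reduction between two estimators of Phi(2).
set.seed(1)
m <- 10000
x <- 2
g1 <- as.numeric(rnorm(m) <= x)                            # indicator draws
g2 <- 0.5 + x * exp(-(x * runif(m))^2 / 2) / sqrt(2 * pi)  # substitution draws
v1 <- var(g1) / m                  # estimated Var(theta1.hat)
v2 <- var(g2) / m                  # estimated Var(theta2.hat)
reduction <- 100 * (v1 - v2) / v1  # negative: theta1.hat already wins here
```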
Power calculations
[Still: Jim Carrey in Bruce Almighty.]
Suppose we are planning to run a simulation study that is costly, and want to determine the number of simulations m needed to achieve a standard error below ε. We have an "a priori" estimate of σ² from prior experiments.
We solve σ/√m < ε for m to obtain that we need m > σ²/ε².
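The required m is then a one-liner; the values σ² = 0.25 and ε = 0.001 below are hypothetical:

```r
# Simulations needed for a standard error below eps, given a priori sigma^2.
sigma2 <- 0.25                        # hypothetical pilot estimate of sigma^2
eps <- 0.001                          # target standard error
m.needed <- ceiling(sigma2 / eps^2)   # need m > sigma^2 / eps^2
```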
Importance sampling
MOTIVATION: Recall that
∫_a^b g(t) dt = (b − a) ∫_a^b g(t) · (1/(b − a)) dt = (b − a) ∫_a^b g(u) f_U(u) du = (b − a) E_U[g(U)].
The sampling algorithm was: generate X_1, ..., X_m iid ~ U(a, b) and set θ̂ = ((b − a)/m) Σ_{i=1}^m g(X_i).
Importance sampling
The estimator θ̂ = ((b − a)/m) Σ_{i=1}^m g(X_i) will not work well if g is not matched well by the U(a, b) density.
Importance sampling
GOAL: Calculate θ = ∫ g(x) dx.
LOGIC: Find a density f(x) such that f(x) > 0 on the set {x : g(x) ≠ 0} and that you can generate from; f(x) is called the importance function.
Let Y = g(X)/f(X) be a transformed random variable of X, where X ~ f(x).
Then E[Y] = E[g(X)/f(X)] = ∫ (g(x)/f(x)) f(x) dx = ∫ g(x) dx gives the required integral.
ALGORITHM:
Generate X_1, ..., X_m iid ~ f(x).
Set θ̂ = Ê[Y] = (1/m) Σ_{i=1}^m g(X_i)/f(X_i).
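The candidate densities shown later in these slides suggest the book's example g(x) = e^{−x}/(1 + x²) on (0, 1); assuming that g, a sketch using the rescaled Exp(1) importance function f3(x) = e^{−x}/(1 − e^{−1}), sampled by inverting its cdf:

```r
# Importance sampling for g(x) = exp(-x)/(1 + x^2) on (0, 1),
# with importance function f3(x) = exp(-x)/(1 - exp(-1)).
set.seed(1)
m <- 10000
u <- runif(m)
x <- -log(1 - u * (1 - exp(-1)))   # X_i iid from f3 via inverse cdf
g <- exp(-x) / (1 + x^2)
f3 <- exp(-x) / (1 - exp(-1))
theta.hat <- mean(g / f3)          # importance-sampling estimate
se.hat <- sd(g / f3) / sqrt(m)     # its standard error
```

Because g/f3 = (1 − e^{−1})/(1 + x²) varies little over (0, 1), the standard error is small.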
Recall from earlier that Var((1/m) Σ_{i=1}^m Y_i) = Var(Y)/m, so
Var(θ̂) = Var(g(X)/f(X))/m.
We want to choose f(X) such that g(X)/f(X) has little variability.
Ideally g(X)/f(X) ≈ c, a constant, since the variance of a constant is 0.
Example
[Figure: g plotted with the candidate densities f0 (Uniform), f1 (Exp(1)), f2 (Cauchy, i.e. t with 1 df), f3 (rescaled Exp(1)), f4 (rescaled Cauchy). Note that some have a bigger support than the interval of integration.]
Example, continued
[Figure: plots of g/f2, g/f3, and g/f4.]
Plot g(x)/f(x) for each of the f's and see which is most constant. f3 looks the best. Rescaling the Cauchy (f2 → f4) really helped!
Example, continued
[Tables: Monte Carlo results for the Uniform, Exp(1), and Cauchy importance functions, and for the rescaled Exp(1) and rescaled Cauchy.]
Summary
θ = E[g(X)/f(X)] = ∫ g(x) dx is estimated by θ̂ = (1/m) Σ_{i=1}^m g(X_i)/f(X_i), with X_1, ..., X_m iid ~ f. The perfectly matched importance function, f proportional to g, is rarely usable in practice (its normalizing constant is the unknown integral itself), so choose an f that is close in shape to g and easy to sample from.
End of Chapter 5
Ch 6: MC Methods in Inference
Very important applications of what is learned in Chapter 5.
Not covered in this course except as potential homework problems.