
10.5 Properties of Gaussian PDFs


To help us develop some general MMSE theory for the Gaussian Data / Gaussian Prior case, we need some solid results for joint and conditional Gaussian PDFs.
We'll consider the bivariate case, but the ideas carry over to the general N-dimensional case.

Bivariate Gaussian Joint PDF


For 2 RVs X and Y:

$$
p(x, y) = \frac{1}{2\pi\,|\mathbf{C}|^{1/2}}
\exp\left\{ -\frac{1}{2}
\begin{bmatrix} x - \mu_X \\ y - \mu_Y \end{bmatrix}^T
\mathbf{C}^{-1}
\begin{bmatrix} x - \mu_X \\ y - \mu_Y \end{bmatrix}
\right\}
\qquad \text{(quadratic form in the exponent)}
$$

with covariance matrix and mean vector

$$
\mathbf{C} =
\begin{bmatrix} \operatorname{var}(X) & \operatorname{cov}(X, Y) \\ \operatorname{cov}(Y, X) & \operatorname{var}(Y) \end{bmatrix}
=
\begin{bmatrix} \sigma_X^2 & \sigma_{XY} \\ \sigma_{YX} & \sigma_Y^2 \end{bmatrix}
=
\begin{bmatrix} \sigma_X^2 & \rho\,\sigma_X\sigma_Y \\ \rho\,\sigma_X\sigma_Y & \sigma_Y^2 \end{bmatrix},
\qquad
E\begin{bmatrix} X \\ Y \end{bmatrix} =
\begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix}
$$

[Figure: 3-D surface plots of the bivariate Gaussian joint PDF p(x,y) over the x-y plane]
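As a quick numerical sketch (with made-up, purely illustrative values for the mean vector and covariance matrix), the quadratic-form expression above can be evaluated directly and cross-checked against scipy's multivariate_normal:

```python
# Minimal sketch: evaluate the bivariate Gaussian joint PDF from the
# quadratic-form expression and compare with scipy's implementation.
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -2.0])              # [mu_X, mu_Y] (illustrative values)
C = np.array([[4.0, 1.5],
              [1.5, 9.0]])              # [[var(X), cov], [cov, var(Y)]]

def bivariate_gauss_pdf(x, y, mu, C):
    d = np.array([x, y]) - mu                      # [x - mu_X, y - mu_Y]
    quad = d @ np.linalg.inv(C) @ d                # quadratic form in the exponent
    return np.exp(-0.5 * quad) / (2 * np.pi * np.sqrt(np.linalg.det(C)))

x0, y0 = 0.5, -1.0
print(bivariate_gauss_pdf(x0, y0, mu, C))                 # direct formula
print(multivariate_normal(mean=mu, cov=C).pdf([x0, y0]))  # scipy, same value
```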

Marginal PDFs of Bivariate Gaussian


What are the marginal (or individual) PDFs?
We know that we can get them by integrating:
$$ p(x) = \int_{-\infty}^{\infty} p(x, y)\, dy, \qquad p(y) = \int_{-\infty}^{\infty} p(x, y)\, dx $$

After performing these integrals you get that:

$$ X \sim \mathcal{N}(\mu_X, \operatorname{var}\{X\}), \qquad Y \sim \mathcal{N}(\mu_Y, \operatorname{var}\{Y\}) $$

[Figure: contour plot of the joint PDF p(x,y) with the marginal PDFs p(x) and p(y) shown along the axes]
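A short numerical sketch of this marginalization (again with illustrative, made-up parameter values): integrating the joint PDF over y at a fixed x reproduces the N(mu_X, var{X}) density.

```python
# Minimal sketch: numerically integrate the bivariate Gaussian joint PDF over y
# and check that the result equals the Gaussian marginal N(mu_X, var(X)).
import numpy as np
from scipy.stats import multivariate_normal, norm
from scipy.integrate import quad

mu_X, mu_Y = 1.0, -2.0                  # illustrative values
var_X, var_Y, cov_XY = 4.0, 9.0, 1.5
joint = multivariate_normal(mean=[mu_X, mu_Y],
                            cov=[[var_X, cov_XY], [cov_XY, var_Y]])

x0 = 2.3   # any fixed x at which to evaluate the marginal
p_x0_numeric = quad(lambda y: joint.pdf([x0, y]), -np.inf, np.inf)[0]
p_x0_exact = norm(loc=mu_X, scale=np.sqrt(var_X)).pdf(x0)
print(p_x0_numeric, p_x0_exact)         # the two values agree
```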

Comment on Jointly Gaussian

(See the Reading Notes on the counterexample posted on BB.)

We have used the term "jointly Gaussian."
Q: EXACTLY what does that mean?
A: That the RVs have a joint PDF that is Gaussian:
$$
p(x, y) = \frac{1}{2\pi\,|\mathbf{C}|^{1/2}}
\exp\left\{ -\frac{1}{2}
\begin{bmatrix} x - \mu_X \\ y - \mu_Y \end{bmatrix}^T
\mathbf{C}^{-1}
\begin{bmatrix} x - \mu_X \\ y - \mu_Y \end{bmatrix}
\right\}
\qquad \text{(example for 2 RVs)}
$$

We've shown that jointly Gaussian RVs also have Gaussian marginal PDFs.
Q: Does having Gaussian marginals imply jointly Gaussian?
In other words, if X is Gaussian and Y is Gaussian, is it always true that X and Y are jointly Gaussian???
A: No!!!!!

We'll construct a counterexample: start with a zero-mean, uncorrelated 2-D joint Gaussian PDF and modify it so that it is no longer 2-D Gaussian but still has Gaussian marginals:

$$ p_{XY}(x, y) = \frac{1}{2\pi\,\sigma_X\sigma_Y}
\exp\left\{ -\frac{1}{2}\left( \frac{x^2}{\sigma_X^2} + \frac{y^2}{\sigma_Y^2} \right) \right\} $$

But if we modify it by:
setting it to 0 in the shaded regions,
doubling its value elsewhere,
we get a 2-D PDF that is not a joint Gaussian, but the marginals are the same as the original!!!!

[Figure: the x-y plane with the shaded regions in which the modified PDF is set to zero]
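Here is a minimal numerical sketch of the counterexample, assuming the shaded (zeroed) regions are the two quadrants where x and y have opposite signs (the exact regions in the original figure are not recoverable here): zeroing the PDF there and doubling it elsewhere leaves the marginal unchanged.

```python
# Minimal sketch of the counterexample (quadrant choice is an assumption):
# the modified PDF is zero where x and y have opposite signs and doubled
# elsewhere, yet each marginal remains the original Gaussian.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

sig_X, sig_Y = 1.0, 2.0                 # illustrative values

def p_mod(x, y):
    if x * y < 0:                       # shaded regions: set to 0
        return 0.0
    return 2.0 * norm(0, sig_X).pdf(x) * norm(0, sig_Y).pdf(y)   # doubled elsewhere

x0 = 0.7
marginal_numeric = (quad(lambda y: p_mod(x0, y), -np.inf, 0.0)[0]
                    + quad(lambda y: p_mod(x0, y), 0.0, np.inf)[0])
print(marginal_numeric, norm(0, sig_X).pdf(x0))   # same Gaussian marginal value
```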

Conditional PDFs of Bivariate Gaussian


What are the conditional PDFs?
If you know that X has taken value X = xo, how is Y distributed?
Take a slice of the joint PDF at X = x_0 and renormalize:

$$ p(y \mid x_0) = \frac{p(x_0, y)}{p(x_0)} = \frac{p(x_0, y)}{\int p(x_0, y)\, dy} $$

(the denominator is just the normalizer)

[Figure: contour plot of p(x,y) for
$$ \mathbf{C} = \begin{bmatrix} 25 & 0.8\sqrt{25 \cdot 16} \\ 0.8\sqrt{25 \cdot 16} & 16 \end{bmatrix}, $$
showing the slice at x_0 = 5: the conditional PDF p(y | X = 5) compared with the marginal p(y); the line of conditional means has slope cov{X,Y}/var{X} = ρσ_Y/σ_X]

Note: Conditioning on a correlated RV
shifts the mean
reduces the variance

Theorem 10.1: Conditional PDF of Bivariate Gaussian


Let X and Y be random variables distributed jointly Gaussian with mean vector [E{X}  E{Y}]^T and covariance matrix

$$
\mathbf{C} =
\begin{bmatrix} \operatorname{var}(X) & \operatorname{cov}(X, Y) \\ \operatorname{cov}(Y, X) & \operatorname{var}(Y) \end{bmatrix}
=
\begin{bmatrix} \sigma_X^2 & \sigma_{XY} \\ \sigma_{YX} & \sigma_Y^2 \end{bmatrix}
$$

Then p(y | x) is also Gaussian, with mean and variance given by:

$$
E\{Y \mid X = x_o\} = E\{Y\} + \frac{\sigma_{XY}}{\sigma_X^2}\,(x_o - E\{X\})
= E\{Y\} + \rho\,\frac{\sigma_Y}{\sigma_X}\,(x_o - E\{X\})
$$

(the coefficient σ_XY/σ_X² is the slope of the line of conditional means)

$$
\operatorname{var}\{Y \mid X = x_o\} = \sigma_Y^2 - \frac{\sigma_{XY}^2}{\sigma_X^2}
= \left(1 - \rho^2\right)\sigma_Y^2
$$

(σ_XY²/σ_X² is the amount of reduction; (1 − ρ²) is the reduction factor)
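A minimal sketch of Theorem 10.1 in action, using the covariance numbers from the earlier conditional-PDF figure (var(X) = 25, var(Y) = 16, ρ = 0.8) and assuming zero means; the empirical statistics of samples with X near x_o agree with the formulas.

```python
# Minimal sketch: Theorem 10.1 at the slice x_o = 5, checked empirically by
# keeping only samples whose X value lies near x_o. Zero means are assumed.
import numpy as np

rng = np.random.default_rng(0)
mu_X, mu_Y = 0.0, 0.0                       # assumed zero means
sig_X, sig_Y, rho = 5.0, 4.0, 0.8
cov_XY = rho * sig_X * sig_Y
C = [[sig_X**2, cov_XY], [cov_XY, sig_Y**2]]

x_o = 5.0
cond_mean = mu_Y + (cov_XY / sig_X**2) * (x_o - mu_X)   # slope * (x_o - mu_X)
cond_var = sig_Y**2 * (1 - rho**2)                      # (1 - rho^2) reduction

samples = rng.multivariate_normal([mu_X, mu_Y], C, size=1_000_000)
near = samples[np.abs(samples[:, 0] - x_o) < 0.2]       # slice near x_o
print(cond_mean, near[:, 1].mean())    # ~3.2 for both
print(cond_var, near[:, 1].var())      # ~5.76 for both
```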

Impact on MMSE
We know the MMSE estimate of RV Y after observing the RV X = x_o:

$$ \hat{Y} = E\{Y \mid X = x_o\} $$

So, using the ideas we have just seen: if the data and the parameter are jointly Gaussian, then

$$ \hat{Y}_{MMSE} = E\{Y \mid X = x_o\} = E\{Y\} + \frac{\sigma_{XY}}{\sigma_X^2}\,(x_o - E\{X\}) $$

It is the correlation between the RVs X and Y that allows us to perform Bayesian estimation.

Theorem 10.2: Conditional PDF of Multivariate Gaussian


Let X (k×1) and Y (l×1) be random vectors distributed jointly Gaussian with mean vector [E{X}^T  E{Y}^T]^T and covariance matrix

$$
\mathbf{C} =
\begin{bmatrix} \mathbf{C}_{XX} & \mathbf{C}_{XY} \\ \mathbf{C}_{YX} & \mathbf{C}_{YY} \end{bmatrix},
\qquad
\mathbf{C}_{XX}: k \times k, \quad
\mathbf{C}_{XY}: k \times l, \quad
\mathbf{C}_{YX}: l \times k, \quad
\mathbf{C}_{YY}: l \times l
$$

Then p(y | x) is also Gaussian, with mean vector and covariance matrix given by:

$$ E\{\mathbf{Y} \mid \mathbf{X} = \mathbf{x}_o\} = E\{\mathbf{Y}\} + \mathbf{C}_{YX}\mathbf{C}_{XX}^{-1}\,(\mathbf{x}_o - E\{\mathbf{X}\}) $$

$$ \mathbf{C}_{Y|X=x_o} = \mathbf{C}_{YY} - \mathbf{C}_{YX}\mathbf{C}_{XX}^{-1}\mathbf{C}_{XY} $$

Compare to the bivariate results:

$$ E\{Y \mid X = x_o\} = E\{Y\} + \frac{\sigma_{XY}}{\sigma_X^2}\,(x_o - E\{X\}),
\qquad
\operatorname{var}\{Y \mid X = x_o\} = \sigma_Y^2 - \frac{\sigma_{XY}^2}{\sigma_X^2} $$

Note: for the Gaussian case the conditional covariance does not depend on the conditioning x-value!!!
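A minimal sketch of Theorem 10.2 with small, made-up blocks (X is 2×1 and Y is 1×1, all numbers illustrative); the conditional mean depends on x_o but the conditional covariance does not.

```python
# Minimal sketch: block formulas of Theorem 10.2 for a partitioned Gaussian.
import numpy as np

mu_X = np.array([1.0, 2.0])                 # illustrative mean of X (2x1)
mu_Y = np.array([0.5])                      # illustrative mean of Y (1x1)
C_XX = np.array([[2.0, 0.3],
                 [0.3, 1.5]])
C_XY = np.array([[0.4],
                 [0.6]])
C_YX = C_XY.T
C_YY = np.array([[1.2]])

x_o = np.array([1.5, 1.0])                  # observed value of X
gain = C_YX @ np.linalg.inv(C_XX)           # C_YX C_XX^{-1}
cond_mean = mu_Y + gain @ (x_o - mu_X)      # E{Y | X = x_o}
cond_cov = C_YY - gain @ C_XY               # C_{Y|X}, independent of x_o
print(cond_mean, cond_cov)
```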

10.6 Bayesian Linear Model


Now we have all the machinery we need to find the MMSE estimator for the Bayesian Linear Model:

$$ \mathbf{x} = \mathbf{H}\boldsymbol{\theta} + \mathbf{w} $$

where x is N×1, H is N×p and known, θ is p×1 with θ ~ N(μ_θ, C_θ), and w is N×1 with w ~ N(0, C_w).

Clearly, x is Gaussian and θ is Gaussian.
But are they jointly Gaussian???
If yes, then we can use Theorem 10.2 to get the MMSE estimator for θ!!!
Answer = Yes!!

Bayesian Linear Model is Jointly Gaussian


θ and w are each Gaussian and are independent.
Thus their joint PDF is a product of Gaussians, which has the form of a jointly Gaussian PDF.
Can now use: a linear transform of jointly Gaussian is jointly Gaussian:

$$
\begin{bmatrix} \mathbf{x} \\ \boldsymbol{\theta} \end{bmatrix}
=
\begin{bmatrix} \mathbf{H} & \mathbf{I} \\ \mathbf{I} & \mathbf{0} \end{bmatrix}
\begin{bmatrix} \boldsymbol{\theta} \\ \mathbf{w} \end{bmatrix}
\qquad \Rightarrow \qquad \mathbf{x} \text{ and } \boldsymbol{\theta} \text{ are jointly Gaussian}
$$

Thus, Thm. 10.2 applies! The posterior PDF p(θ | x) is Gaussian and is completely described by its mean and covariance.

Conditional PDF for Bayesian Linear Model


To apply Theorem 10.2, notationally let X = x and Y = θ.
First we need the means:

$$ E\{\mathbf{X}\} = \mathbf{H}\,E\{\boldsymbol{\theta}\} + E\{\mathbf{w}\} = \mathbf{H}\boldsymbol{\mu}_\theta,
\qquad
E\{\mathbf{Y}\} = E\{\boldsymbol{\theta}\} = \boldsymbol{\mu}_\theta $$

And also the covariances:

$$ \mathbf{C}_{YY} = \mathbf{C}_\theta $$

$$ \mathbf{C}_{XX} = E\left\{ (\mathbf{x} - E\{\mathbf{x}\})(\mathbf{x} - E\{\mathbf{x}\})^T \right\}
= E\left\{ [\mathbf{H}(\boldsymbol{\theta} - \boldsymbol{\mu}_\theta) + \mathbf{w}][\mathbf{H}(\boldsymbol{\theta} - \boldsymbol{\mu}_\theta) + \mathbf{w}]^T \right\} $$

$$ = \mathbf{H}\,E\left\{ (\boldsymbol{\theta} - \boldsymbol{\mu}_\theta)(\boldsymbol{\theta} - \boldsymbol{\mu}_\theta)^T \right\}\mathbf{H}^T
+ E\left\{ \mathbf{w}\mathbf{w}^T \right\} $$

(the cross terms are zero because θ and w are independent)

$$ \mathbf{C}_{XX} = \mathbf{H}\mathbf{C}_\theta\mathbf{H}^T + \mathbf{C}_w $$

Similarly:

$$ \mathbf{C}_{YX} = \mathbf{C}_{\theta x}
= E\left\{ (\boldsymbol{\theta} - \boldsymbol{\mu}_\theta)(\mathbf{x} - \boldsymbol{\mu}_x)^T \right\}
= E\left\{ (\boldsymbol{\theta} - \boldsymbol{\mu}_\theta)(\mathbf{H}\boldsymbol{\theta} + \mathbf{w} - \mathbf{H}\boldsymbol{\mu}_\theta)^T \right\}
= E\left\{ (\boldsymbol{\theta} - \boldsymbol{\mu}_\theta)(\boldsymbol{\theta} - \boldsymbol{\mu}_\theta)^T \right\}\mathbf{H}^T $$

(using E{w} = 0 and the independence of θ and w)

$$ \mathbf{C}_{\theta x} = \mathbf{C}_\theta\mathbf{H}^T $$

Then Theorem 10.2 gives the conditional PDF's mean and covariance (and we know the conditional mean is the MMSE estimate).

Posterior Mean (the Bayesian MMSE estimator):

$$ \hat{\boldsymbol{\theta}}_{MMSE} = E\{\boldsymbol{\theta} \mid \mathbf{x}\}
= \boldsymbol{\mu}_\theta + \mathbf{C}_\theta\mathbf{H}^T\left( \mathbf{H}\mathbf{C}_\theta\mathbf{H}^T + \mathbf{C}_w \right)^{-1}\left( \mathbf{x} - \mathbf{H}\boldsymbol{\mu}_\theta \right) $$

Here μ_θ is the a priori estimate; the gain C_θH^T(HC_θH^T + C_w)^{-1} contains the cross-correlation C_θx and weighs the relative quality of the data; and (x − Hμ_θ) is the data prediction error, the unpredictable part of the data, which the update transformation maps into the parameter space.

Posterior Covariance:

$$ \mathbf{C}_{\theta|x} = \mathbf{C}_\theta - \mathbf{C}_\theta\mathbf{H}^T\left( \mathbf{H}\mathbf{C}_\theta\mathbf{H}^T + \mathbf{C}_w \right)^{-1}\mathbf{H}\mathbf{C}_\theta $$

Here C_θ is the a priori covariance, and the subtracted term is the reduction due to the data.
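A minimal sketch of the posterior mean and covariance formulas for the Bayesian linear model, with made-up H, prior, and noise covariance (none of these numbers come from the text):

```python
# Minimal sketch: Bayesian linear model posterior with illustrative sizes
# (N = 5 observations, p = 2 parameters) and made-up H, prior, noise covariance.
import numpy as np

rng = np.random.default_rng(1)
N, p = 5, 2
H = rng.standard_normal((N, p))              # known observation matrix (illustrative)
mu_theta = np.array([1.0, -0.5])             # prior mean
C_theta = np.diag([2.0, 0.5])                # prior covariance
C_w = 0.1 * np.eye(N)                        # noise covariance

theta_true = rng.multivariate_normal(mu_theta, C_theta)
x = H @ theta_true + rng.multivariate_normal(np.zeros(N), C_w)

S = H @ C_theta @ H.T + C_w                  # C_xx = H C_theta H^T + C_w
gain = C_theta @ H.T @ np.linalg.inv(S)      # C_theta H^T (H C_theta H^T + C_w)^{-1}
theta_mmse = mu_theta + gain @ (x - H @ mu_theta)    # posterior mean
C_post = C_theta - gain @ H @ C_theta                # posterior covariance
print(theta_true, theta_mmse)
print(C_post)
```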

Ex. 10.2: DC in AWGN w/ Gaussian Prior


Data Model: x[n] = A + w[n], with

$$ A \sim \mathcal{N}(\mu_A, \sigma_A^2), \qquad w[n] \sim \mathcal{N}(0, \sigma^2), \qquad A \text{ and } w[n] \text{ independent} $$

Write in linear model form:

$$ \mathbf{x} = \mathbf{1}A + \mathbf{w}, \qquad \mathbf{H} = \mathbf{1} = [1\; 1\; \cdots\; 1]^T $$

Now the general result gives the MMSE estimate as:

$$ \hat{A}_{MMSE} = E\{A \mid \mathbf{x}\}
= \mu_A + \sigma_A^2\,\mathbf{1}^T\left( \sigma_A^2\,\mathbf{1}\mathbf{1}^T + \sigma^2\mathbf{I} \right)^{-1}\left( \mathbf{x} - \mathbf{1}\mu_A \right) $$

$$ = \mu_A + \frac{\sigma_A^2}{\sigma^2}\,\mathbf{1}^T\left( \mathbf{I} + \frac{\sigma_A^2}{\sigma^2}\,\mathbf{1}\mathbf{1}^T \right)^{-1}\left( \mathbf{x} - \mathbf{1}\mu_A \right) $$

This can be simplified using the Matrix Inversion Lemma.

Aside: Matrix Inversion Lemma


$$ (\mathbf{A} + \mathbf{B}\mathbf{C}\mathbf{D})^{-1}
= \mathbf{A}^{-1} - \mathbf{A}^{-1}\mathbf{B}\left( \mathbf{D}\mathbf{A}^{-1}\mathbf{B} + \mathbf{C}^{-1} \right)^{-1}\mathbf{D}\mathbf{A}^{-1} $$

where A is n×n, B is n×m, C is m×m, and D is m×n.

Special case (m = 1, with B = u, C = 1, D = u^T):

$$ (\mathbf{A} + \mathbf{u}\mathbf{u}^T)^{-1}
= \mathbf{A}^{-1} - \frac{\mathbf{A}^{-1}\mathbf{u}\mathbf{u}^T\mathbf{A}^{-1}}{1 + \mathbf{u}^T\mathbf{A}^{-1}\mathbf{u}} $$

where u is n×1.
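A quick numerical check of the rank-one special case of the lemma (random, illustrative A and u):

```python
# Minimal sketch: verify the rank-one Matrix Inversion Lemma numerically.
import numpy as np

rng = np.random.default_rng(2)
n = 4
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                 # a well-conditioned invertible matrix
u = rng.standard_normal(n)

Ainv = np.linalg.inv(A)
lemma = Ainv - (Ainv @ np.outer(u, u) @ Ainv) / (1 + u @ Ainv @ u)
direct = np.linalg.inv(A + np.outer(u, u))
print(np.allclose(lemma, direct))           # True
```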

Continuing the Example: apply the Matrix Inversion Lemma.

$$ \hat{A}_{MMSE} = \mu_A + \frac{\sigma_A^2}{\sigma^2}\,\mathbf{1}^T\left( \mathbf{I} + \frac{\sigma_A^2}{\sigma^2}\,\mathbf{1}\mathbf{1}^T \right)^{-1}\left( \mathbf{x} - \mathbf{1}\mu_A \right) $$

Use the Matrix Inversion Lemma:

$$ = \mu_A + \frac{\sigma_A^2}{\sigma^2}\,\mathbf{1}^T\left( \mathbf{I} - \frac{\mathbf{1}\mathbf{1}^T}{N + \sigma^2/\sigma_A^2} \right)\left( \mathbf{x} - \mathbf{1}\mu_A \right) $$

Factor out 1^T and use 1^T1 = N:

$$ = \mu_A + \frac{\sigma_A^2}{\sigma^2}\left( 1 - \frac{N}{N + \sigma^2/\sigma_A^2} \right)\mathbf{1}^T\left( \mathbf{x} - \mathbf{1}\mu_A \right) $$

Pass 1^T through and use 1^T x = N x̄, 1^T1 = N:

$$ = \mu_A + \frac{\sigma_A^2}{\sigma^2}\left( 1 - \frac{N}{N + \sigma^2/\sigma_A^2} \right)\left( N\bar{x} - N\mu_A \right) $$

Algebraic manipulation then gives:

$$ \hat{A}_{MMSE} = \mu_A + \underbrace{\frac{\sigma_A^2}{\sigma_A^2 + \sigma^2/N}}_{\text{gain factor}}\,(\bar{x} - \mu_A) $$

Here μ_A is the a priori estimate and (x̄ − μ_A) is the error between the data-only estimate and the prior-only estimate.

When the data is bad (σ²/N >> σ_A²), the gain is small and the data has little use: Â_MMSE ≈ μ_A.
When the data is good (σ²/N << σ_A²), the gain is near one and the data has large use: Â_MMSE ≈ x̄.
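A minimal numerical sketch of Example 10.2 with illustrative values of N, μ_A, σ_A², and σ²: the general vector formula and the simplified gain form give the same estimate.

```python
# Minimal sketch: DC level in WGN with a Gaussian prior, computed both ways.
import numpy as np

rng = np.random.default_rng(3)
N, mu_A, var_A, var_w = 20, 2.0, 4.0, 1.0   # illustrative values
A = rng.normal(mu_A, np.sqrt(var_A))        # draw the DC level from the prior
x = A + rng.normal(0.0, np.sqrt(var_w), size=N)
ones = np.ones(N)

# General formula: mu_A + var_A 1^T (var_A 11^T + var_w I)^{-1} (x - 1 mu_A)
S = var_A * np.outer(ones, ones) + var_w * np.eye(N)
A_general = mu_A + var_A * ones @ np.linalg.solve(S, x - ones * mu_A)

# Simplified form: mu_A + var_A / (var_A + var_w / N) * (xbar - mu_A)
gain = var_A / (var_A + var_w / N)
A_simple = mu_A + gain * (x.mean() - mu_A)
print(A_general, A_simple)                  # identical up to round-off
```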

Using similar manipulations gives:

$$ \operatorname{var}(A \mid \mathbf{x}) = \frac{\sigma_A^2\,(\sigma^2/N)}{\sigma_A^2 + \sigma^2/N} $$

Like parallel resistors, the small one wins! var(A | x) is smaller than both:
the data-estimate variance σ²/N
the prior variance σ_A²

Or, looking at it another way:

$$ \frac{1}{\operatorname{var}(A \mid \mathbf{x})} = \frac{1}{\sigma_A^2} + \frac{1}{\sigma^2/N} $$

Additive information!
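Continuing the same illustrative numbers, the posterior variance from the general posterior-covariance formula matches both the "parallel resistor" form and the additive-information form:

```python
# Minimal sketch: three equivalent expressions for var(A | x) in Example 10.2.
import numpy as np

N, var_A, var_w = 20, 4.0, 1.0              # same illustrative values as above
ones = np.ones(N)
S = var_A * np.outer(ones, ones) + var_w * np.eye(N)
var_post_general = var_A - var_A * ones @ np.linalg.solve(S, ones) * var_A
var_post_parallel = var_A * (var_w / N) / (var_A + var_w / N)
var_post_additive = 1.0 / (1.0 / var_A + 1.0 / (var_w / N))
print(var_post_general, var_post_parallel, var_post_additive)   # all equal
```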

10.7 Nuisance Parameters


One difficulty in classical methods is that nuisance parameters must be explicitly dealt with.
In Bayesian methods they are simply integrated away!!!!
Recall the emitter location problem: the parameter vector is [x y z f_0]^T, where f_0 is a nuisance parameter.
In the Bayesian approach, from p(x, y, z, f_0 | x) we can get p(x, y, z | x):

$$ p(x, y, z \mid \mathbf{x}) = \int p(x, y, z, f_0 \mid \mathbf{x})\, df_0 $$

Then find the conditional mean for the MMSE estimate!
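A minimal sketch of integrating a nuisance parameter away numerically, using an entirely made-up toy joint posterior on a grid (the emitter-location PDF itself is not reproduced here):

```python
# Minimal sketch: marginalize a nuisance parameter f0 out of a toy 2-D grid of
# posterior values, p(a | x) = integral of p(a, f0 | x) over f0, then take the
# conditional mean as the MMSE estimate of the parameter of interest a.
import numpy as np

a_grid = np.linspace(-5, 5, 201)            # parameter of interest
f0_grid = np.linspace(-3, 3, 121)           # nuisance parameter
A, F0 = np.meshgrid(a_grid, f0_grid, indexing="ij")

# Toy joint posterior p(a, f0 | x): an unnormalized correlated Gaussian bump
post = np.exp(-0.5 * (A - 1.0)**2 - 0.5 * (F0 - 0.5 * A)**2)

da = a_grid[1] - a_grid[0]
df = f0_grid[1] - f0_grid[0]
post /= post.sum() * da * df                # normalize on the grid
p_a = post.sum(axis=1) * df                 # integrate f0 away
a_mmse = (a_grid * p_a).sum() * da          # conditional mean = MMSE estimate
print(a_mmse)
```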

