
Economics 2110 Fall Semester 2007

Marcelo J. Moreira Harvard University

Lect. 6: Joint Distributions, Conditional Distributions, Independence of Random Variables

Definition 1 An $n$-dimensional random variable is a function from the sample space $\Omega$ to $\mathbb{R}^n$.

Definition 2 The cumulative distribution function for a bivariate random vector $(X, Y)$ is
$$F_{X,Y}(x, y) = \Pr(\{\omega \in \Omega \mid X(\omega) \le x,\ Y(\omega) \le y\}).$$

Definition 3 Let $(X, Y)$ be a discrete bivariate vector. Then the function $f_{XY}(x, y)$ from $\mathbb{R}^2$ to $\mathbb{R}$ defined by
$$f_{XY}(x, y) = \Pr(\{\omega \in \Omega \mid X(\omega) = x,\ Y(\omega) = y\})$$
is the joint probability mass function of $(X, Y)$.

Example A: Toss a coin three times. The sample space is
$$\Omega = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}.$$
An example of a bivariate random variable is $(X, Y)$, where $X(\omega)$ is the number of heads in the first two tosses and $Y(\omega)$ is the number of heads in the last two tosses. Assuming the coin is fair, so that the probability of heads is 1/2 at every toss and different tosses are independent events, the joint pmf is


$$f_{XY}(x, y) = \begin{cases}
1/8 & \text{if } (x, y) = (0, 0) \quad (\omega \in \{TTT\}), \\
1/8 & \text{if } (x, y) = (0, 1) \quad (\omega \in \{TTH\}), \\
1/8 & \text{if } (x, y) = (1, 0) \quad (\omega \in \{HTT\}), \\
2/8 & \text{if } (x, y) = (1, 1) \quad (\omega \in \{HTH, THT\}), \\
1/8 & \text{if } (x, y) = (1, 2) \quad (\omega \in \{THH\}), \\
1/8 & \text{if } (x, y) = (2, 1) \quad (\omega \in \{HHT\}), \\
1/8 & \text{if } (x, y) = (2, 2) \quad (\omega \in \{HHH\}), \\
0 & \text{otherwise.}
\end{cases}$$
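Since each pmf value is just a count of outcomes divided by 8, the table can be checked by brute-force enumeration. Here is a minimal Python sketch (illustrative code, not part of the original notes):

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# Enumerate the 8 equally likely outcomes of three fair coin tosses and
# tabulate the joint pmf of X (heads in tosses 1-2) and Y (heads in tosses 2-3).
joint = Counter()
for w in ("".join(t) for t in product("HT", repeat=3)):
    x = w[:2].count("H")   # heads in the first two tosses
    y = w[1:].count("H")   # heads in the last two tosses
    joint[(x, y)] += Fraction(1, 8)

for xy in sorted(joint):
    print(xy, joint[xy])   # (1, 1) gets 2/8 = 1/4; all other support points get 1/8
```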


The fact that we define two random variables does not change the definition of the probability mass function for either variable. Ignoring the definition of $Y(\omega)$, the pmf of $X$ is


$$f_X(x) = \begin{cases}
1/4 & x = 0 \quad (\omega \in \{TTT, TTH\}), \\
2/4 & x = 1 \quad (\omega \in \{HTH, HTT, THH, THT\}), \\
1/4 & x = 2 \quad (\omega \in \{HHT, HHH\}), \\
0 & \text{elsewhere.}
\end{cases}$$
Similarly we can calculate the pmf of $Y$. In the context of joint random variables we typically refer to these single-variable probability mass functions as marginal probability mass functions. In general they can be calculated from the joint pmf as
$$f_X(x) = \Pr(\{\omega \in \Omega \mid X(\omega) = x\}) = \Pr(\{\omega \in \Omega \mid X(\omega) = x,\ -\infty < Y(\omega) < \infty\}) = \sum_y f_{XY}(x, y),$$
and
$$f_Y(y) = \sum_x f_{XY}(x, y).$$

Given the joint pmf we can calculate probabilities involving both random variables. For example,
$$\Pr(X \le 1, Y \ge 1) = \Pr(\{\omega \in \Omega \mid X(\omega) \le 1,\ Y(\omega) \ge 1\}) = \sum_{x \le 1,\, y \ge 1} f_{XY}(x, y) = f(0, 1) + f(1, 1) + f(1, 2) = 1/2.$$
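Both the marginals and event probabilities of this kind are sums over the joint pmf, which is easy to mechanize. A short Python sketch (illustrative, not from the notes):

```python
from fractions import Fraction

# Joint pmf of Example A, stored on its support points only.
joint = {(0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8), (1, 0): Fraction(1, 8),
         (1, 1): Fraction(2, 8), (1, 2): Fraction(1, 8), (2, 1): Fraction(1, 8),
         (2, 2): Fraction(1, 8)}

# Marginal of X: sum the joint pmf over y.
f_X = {x0: sum(p for (x, y), p in joint.items() if x == x0) for x0 in (0, 1, 2)}
print(f_X)   # {0: 1/4, 1: 1/2, 2: 1/4}

# Pr(X <= 1, Y >= 1): sum the joint pmf over the event.
print(sum(p for (x, y), p in joint.items() if x <= 1 and y >= 1))   # 1/2
```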


Alternatively we can have jointly continuous random variables.

Definition 4 Let $(X, Y)$ be a continuous bivariate vector. Then the function $f_{XY}(x, y)$ from $\mathbb{R}^2$ to $\mathbb{R}$ is the joint probability density function of $(X, Y)$ if it satisfies
$$\int\!\!\int_A f_{XY}(x, y)\, dx\, dy = \Pr(\{\omega \in \Omega \mid (X(\omega), Y(\omega)) \in A\}),$$
for all sets $A$.


Example B: Pick a point randomly in the triangle with vertices $(0, 0)$, $(0, 1)$, $(1, 0)$. Let $X$ be the $x$-coordinate and $Y$ be the $y$-coordinate. A reasonable choice for the joint pdf appears to be
$$f_{XY}(x, y) = c,$$
for $\{(x, y) \mid 0 < x < 1,\ 0 < y < 1,\ x + y < 1\}$ and zero otherwise (that is, constant on the domain). What is the appropriate value for $c$? Note that we know
$$1 = \Pr(\omega \in \Omega) = \Pr\big((X, Y) \in \{(x, y) \mid 0 < x < 1,\ 0 < y < 1,\ x + y < 1\}\big)$$
$$= \int\!\!\int_{\{(x,y) \mid 0<x<1,\, 0<y<1,\, x+y<1\}} f_{XY}(x, y)\, dx\, dy = \int_0^1 \int_0^{1-y} c\, dx\, dy = c \int_0^1 (1 - y)\, dy = c/2.$$
Hence $c = 2$.
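As a quick numerical sanity check, the normalization can be confirmed with a standard quadrature routine. A minimal sketch in Python (assuming SciPy is available; the code is illustrative, not part of the notes):

```python
from scipy.integrate import dblquad

# Check that c = 2 integrates to one over the triangle. dblquad integrates
# func(y, x) with x outer in (0, 1) and y inner in (0, 1 - x).
total, err = dblquad(lambda y, x: 2.0, 0, 1, lambda x: 0.0, lambda x: 1.0 - x)
print(total)   # 1.0 up to quadrature error
```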
Note here that in general we can write such integrals in different ways:
$$\int\!\!\int_{\{(x,y) \mid 0<x<1,\, 0<y<1,\, x+y<1\}} g(x, y)\, dx\, dy = \int_0^1 \int_0^{1-y} g(x, y)\, dx\, dy = \int_0^1 \int_0^{1-x} g(x, y)\, dy\, dx.$$

The marginal pdfs are in this case
$$f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\, dy = \int_0^{1-x} 2\, dy = 2 - 2x,$$
for $0 < x < 1$ and zero otherwise. Similarly,
$$f_Y(y) = 2 - 2y, \quad 0 < y < 1, \text{ and zero otherwise.}$$
Conditional Probability

Recall how the conditional probability of events was defined:
$$\Pr(E_1 \mid E_2) = \frac{\Pr(E_1 \cap E_2)}{\Pr(E_2)},$$
provided $\Pr(E_2) > 0$. We can do the same thing for bivariate discrete random variables:


Definition 5 Given two discrete random variables $X$ and $Y$ with joint probability mass function $f_{XY}(x, y)$, the conditional probability mass function for $X$ given $Y = y$ is
$$f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)},$$
provided $f_Y(y) > 0$.

Example A (cont'd): Consider the distribution of $X$ given that $Y = 1$. In that case we are conditioning on $\{HTH, HHT, THT, TTH\}$:
$$f_{X|Y}(x|Y = 1) = \begin{cases}
1/4 & x = 0 \quad (\omega \in \{TTH\}), \\
2/4 & x = 1 \quad (\omega \in \{HTH, THT\}), \\
1/4 & x = 2 \quad (\omega \in \{HHT\}), \\
0 & \text{elsewhere.}
\end{cases}$$
Note that this is the same as the marginal pmf of $X$. This is not true if we condition on $Y = 2$. In that case we condition on $\{HHH, THH\}$:
$$f_{X|Y}(x|Y = 2) = \begin{cases}
1/2 & x = 1 \quad (\omega \in \{THH\}), \\
1/2 & x = 2 \quad (\omega \in \{HHH\}), \\
0 & \text{elsewhere.}
\end{cases}$$
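Both conditional pmfs follow mechanically from the joint pmf by dividing by the marginal. A short Python sketch (illustrative; the helper name is hypothetical):

```python
from fractions import Fraction

joint = {(0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8), (1, 0): Fraction(1, 8),
         (1, 1): Fraction(2, 8), (1, 2): Fraction(1, 8), (2, 1): Fraction(1, 8),
         (2, 2): Fraction(1, 8)}

def cond_pmf_X_given_Y(y0):
    """f_{X|Y}(x|y0) = f_{XY}(x, y0) / f_Y(y0), defined when f_Y(y0) > 0."""
    f_y = sum(p for (x, y), p in joint.items() if y == y0)
    return {x: p / f_y for (x, y), p in joint.items() if y == y0}

print(cond_pmf_X_given_Y(1))   # {0: 1/4, 1: 1/2, 2: 1/4}, equal to the marginal of X
print(cond_pmf_X_given_Y(2))   # {1: 1/2, 2: 1/2}, differs from the marginal
```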

For continuous bivariate random variables things are a little more complicated because the probability that they take on a particular value is zero, and conditional probabilities are not defined when the probability of the conditioning event is zero. Nevertheless, the definition of conditional probability density functions is very similar to that of conditional probability mass functions:

Definition 6 Given two continuous random variables $X$ and $Y$ with joint probability density function $f_{XY}(x, y)$, the conditional probability density function for $X$ given $Y = y$ is
$$f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)}.$$


To see why this works we first look at the conditional cumulative distribution function where we condition on $Y \in (y, y + \varepsilon)$:
$$\Pr(X \le x \mid Y \in (y, y + \varepsilon)) = \frac{\Pr(X \le x,\ y < Y < y + \varepsilon)}{\Pr(y < Y < y + \varepsilon)} = \frac{\int_{-\infty}^{x} \int_y^{y+\varepsilon} f_{XY}(u, v)\, dv\, du}{\int_{-\infty}^{\infty} \int_y^{y+\varepsilon} f_{XY}(u, v)\, dv\, du}.$$
Now take the limit as $\varepsilon$ goes to zero. In that case both numerator and denominator go to zero, so to get the limit we have to use l'Hôpital's rule and differentiate both numerator and denominator with respect to $\varepsilon$. In that case we get
$$F_{X|Y}(x|y) = \frac{\int_{-\infty}^{x} f_{XY}(u, y)\, du}{\int_{-\infty}^{\infty} f_{XY}(u, y)\, du} = \frac{\int_{-\infty}^{x} f_{XY}(u, y)\, du}{f_Y(y)}.$$
Take the derivative with respect to $x$ to get the conditional probability density function:
$$f_{X|Y}(x|y) = \frac{f_{XY}(x, y)}{f_Y(y)}.$$
This is not, however, the conditional density of $X$ given the event $Y = y$. Its interpretation is that of the limit of the conditional densities, conditioning on $y < Y < y + \varepsilon$ and taking the limit as $\varepsilon$ goes to zero.
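This limit interpretation can be illustrated by simulation in Example B, where $f_{X|Y}(x|y) = 2/(2 - 2y)$, so that $X$ given $Y = y$ is uniform on $(0, 1 - y)$. A Python sketch (illustrative, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)

# Uniform draws on the triangle of Example B by rejection from the unit square.
pts = rng.uniform(size=(2_000_000, 2))
pts = pts[pts.sum(axis=1) < 1]
x, y = pts[:, 0], pts[:, 1]

# Condition on the band y0 < Y < y0 + eps. As eps shrinks, the draws of X
# approach f_{X|Y}(x|y0) = 2 / (2 - 2*y0), i.e. Uniform(0, 1 - y0).
y0, eps = 0.5, 0.01
x_band = x[(y > y0) & (y < y0 + eps)]
print(x_band.mean())   # approximately 0.25, the mean of Uniform(0, 0.5)
```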

Definition 7 Let $(X, Y)$ be a bivariate random vector with pmf/pdf $f_{XY}(x, y)$. Then the random variables $X$ and $Y$ are independent if
$$f_{XY}(x, y) = f_X(x) \cdot f_Y(y),$$
where $f_X(x)$ and $f_Y(y)$ are the marginal pmf/pdfs of $X$ and $Y$.

Result 1 The two random variables $X$ and $Y$ are independent if and only if there exist functions $g$ and $h$ such that the joint pmf/pdf can be written as
$$f_{XY}(x, y) = g(x) \cdot h(y),$$
for all $-\infty < x < \infty$ and $-\infty < y < \infty$.


Result 2 Let $X$ and $Y$ be two independent random variables. Then
(i) $f_{X|Y}(x|y) = f_X(x)$, and
(ii) $f_{Y|X}(y|x) = f_Y(y)$.

Example B (cont'd): In the previous example we had
$$f_{XY}(x, y) = 2, \quad 0 < x < 1,\ 0 < y < 1,\ x + y < 1, \text{ and zero elsewhere.}$$
It may appear that $X$ and $Y$ are independent by the previous results: choose $g(x) = 2$ and $h(y) = 1$. The reason this does not work is the restrictions on the values of $X$ and $Y$. To see this, write
$$f_{XY}(x, y) = 2 \cdot 1\{0 < x < 1\} \cdot 1\{0 < y < 1\} \cdot 1\{x + y < 1\},$$
for all $-\infty < x < \infty$ and $-\infty < y < \infty$. The indicator function $1\{A\}$ is equal to one if the argument is true and zero if it is false. The indicator function $1\{x + y < 1\}$ cannot be separated into a function of $x$ times a function of $y$, and therefore $X$ and $Y$ are not independent.
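A single point is enough to confirm the failure of factorization numerically, for example $(x, y) = (0.6, 0.6)$. A tiny Python sketch (illustrative):

```python
# At (0.6, 0.6) the joint density is zero (since x + y > 1),
# but the product of the marginals is not, so the joint cannot factor.
f_XY = lambda x, y: 2.0 if (0 < x < 1 and 0 < y < 1 and x + y < 1) else 0.0
f_X = lambda x: 2 - 2 * x if 0 < x < 1 else 0.0
f_Y = lambda y: 2 - 2 * y if 0 < y < 1 else 0.0

print(f_XY(0.6, 0.6))        # 0.0
print(f_X(0.6) * f_Y(0.6))   # 0.64, not equal, so X and Y are not independent
```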
To see how tricky conditional probability density functions can be with continuous random variables, consider the Borel–Kolmogorov paradox. The joint density of two random variables $X$ and $Y$ is
$$f_{YX}(y, x) = \exp(-x - y),$$
for $x, y > 0$. The marginal density function of $X$ is
$$f_X(x) = \exp(-x).$$
The conditional density of $X$ given $Y = 1$ is
$$f_{X|Y}(x|y = 1) = \exp(-x),$$
because $X$ and $Y$ are obviously independent.


Now consider the transformation from $(X, Y)$ to $(X, Z)$, where $Z = (Y - 1)/X$. The inverse of the transformation is $X = g_1(X, Z) = X$ and $Y = g_2(X, Z) = XZ + 1$, with Jacobian
$$\left| \det \begin{pmatrix} \frac{\partial g_1}{\partial x}(x, z) & \frac{\partial g_2}{\partial x}(x, z) \\ \frac{\partial g_1}{\partial z}(x, z) & \frac{\partial g_2}{\partial z}(x, z) \end{pmatrix} \right| = \left| \det \begin{pmatrix} 1 & z \\ 0 & x \end{pmatrix} \right| = |x|.$$


The joint density of $Z$ and $X$ is then
$$f_{ZX}(z, x) = x \exp(-x - xz - 1),$$
for $z > -1/x$ and $x > 0$.


Let us calculate the marginal density of $Z$. For $z > 0$, the marginal density is
$$f_Z(z) = \int_0^{\infty} x \exp(-x(z+1))\, e^{-1}\, dx = \frac{1}{e(z+1)^2}.$$
For $z < 0$, the constraint $z > -1/x$ limits the integration to $0 < x < -1/z$, and integrating by parts, the marginal density is
$$f_Z(z) = \int_0^{-1/z} x \exp(-x(z+1))\, e^{-1}\, dx = \frac{1}{e(z+1)} \left[ -x \exp(-x(z+1)) \Big|_0^{-1/z} + \int_0^{-1/z} \exp(-x(z+1))\, dx \right]$$
$$= \frac{1}{e(z+1)} \left[ \frac{1}{z} \exp\!\left(\frac{z+1}{z}\right) - \frac{1}{z+1} \exp(-x(z+1)) \Big|_0^{-1/z} \right] = \frac{1}{e(z+1)^2} \left[ \frac{1}{z} \exp\!\left(\frac{z+1}{z}\right) + 1 \right].$$
Take the limit as $z \to 0$ to get in both cases $f_Z(z) \to e^{-1}$.
The conditional density of $X$ given $Z = 0$ is then
$$f_{X|Z}(x|z = 0) = \frac{f_{XZ}(x, z)}{f_Z(z)} \bigg|_{z=0} = x \exp(-x),$$
for $x > 0$.
for x > 0. The paradox is Z = 0 implies and is implied by Y = 1, yet the


conditional density of X given Z = 0 differs from that of X given Y = 1.
The difference is in the way the limit is taken. In one case we take the
limit for 1 < Y < 1 + . In the other case we take the limit < Z <
which is the same as the limit 1 x < Y < 1 + x which is clearly different


For fixed $\varepsilon$, the latter conditioning set obviously includes more large values of $X$.
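The two conditioning schemes can be compared directly by simulation: condition on a thin band around $Y = 1$ versus a thin band around $Z = 0$ and look at the draws of $X$. A Python sketch (illustrative, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000
x = rng.exponential(size=n)   # X ~ Exp(1)
y = rng.exponential(size=n)   # Y ~ Exp(1), independent of X
z = (y - 1) / x

eps = 0.01
print(x[np.abs(y - 1) < eps].mean())   # ~1: conditioning on 1-eps < Y < 1+eps,
                                       #     so X behaves like Exp(1)
print(x[np.abs(z) < eps].mean())       # ~2: conditioning on -eps < Z < eps,
                                       #     matching the density x * exp(-x)
```

The two conditional means differ (about 1 versus about 2), even though both bands shrink to the same event, which is exactly the paradox: the $Z$-band keeps more outcomes with large $X$.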
Expectation

Expectations are calculated in pretty much the same way as before, but now involving double (or, in general, multiple) sums or integrals:
$$E[r(X, Y)] = \int_x \int_y r(x, y) f_{XY}(x, y)\, dy\, dx,$$
for continuous bivariate random variables, and
$$E[r(X, Y)] = \sum_x \sum_y r(x, y) f_{XY}(x, y),$$
for discrete bivariate random variables.


Example B (expectation): Consider the expected value of $XY$:
$$E(XY) = \int_0^1 \int_0^{1-x} 2xy\, dy\, dx = \int_0^1 xy^2 \Big|_0^{1-x} dx = \int_0^1 x(1 - x)^2\, dx$$
$$= \int_0^1 (x^3 - 2x^2 + x)\, dx = \left[ \frac{1}{4}x^4 - \frac{2}{3}x^3 + \frac{1}{2}x^2 \right]_0^1 = \frac{1}{12}.$$
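The same number can be recovered numerically (a sketch assuming SciPy, for illustration only):

```python
from scipy.integrate import dblquad

# E[XY] = double integral of x*y*f(x, y) over the triangle, with f = 2.
val, err = dblquad(lambda y, x: 2.0 * x * y, 0, 1, lambda x: 0.0, lambda x: 1.0 - x)
print(val)   # 0.08333... = 1/12
```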
Consider two random variables $X$ and $Y$ with means $\mu_X$ and $\mu_Y$ and variances $\sigma_X^2$ and $\sigma_Y^2$, respectively. The covariance of $X$ and $Y$ is defined as
$$C(X, Y) = E[(X - \mu_X)(Y - \mu_Y)],$$
and the correlation coefficient as
$$\rho_{XY} = C(X, Y)/(\sigma_X \sigma_Y).$$
The mean and variance of the sum $X + Y$ are
$$E[X + Y] = \mu_X + \mu_Y,$$
and
$$V(X + Y) = \sigma_X^2 + \sigma_Y^2 + 2\, C(X, Y).$$
If the two random variables are independent, the variance simplifies to the sum of the variances, $\sigma_X^2 + \sigma_Y^2$.
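The variance identity is easy to verify on simulated data (an illustrative Python sketch assuming NumPy; the correlated pair is a hypothetical construction):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
y = 0.5 * x + rng.normal(size=1_000_000)   # correlated with x by construction

# V(X + Y) = V(X) + V(Y) + 2 C(X, Y), checked on the sample moments.
lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y, bias=True)[0, 1]
print(lhs, rhs)   # equal up to floating-point error
```

With `bias=True`, `np.cov` uses the same normalization as `np.var`, so the identity holds exactly in the sample, not just in expectation.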
