
Chapter 1
Multivariate random variables

Definition 1.1. An n-dimensional random variable, or random vector, $X$ is a function from the probability space $\Omega$ to $\mathbb{R}^n$, that is,

$X : \Omega \to \mathbb{R}^n.$

The joint distribution function is defined by

$F_{X_1,\dots,X_n}(x_1,\dots,x_n) = P(X_1 \le x_1, \dots, X_n \le x_n)$

for $x_1, x_2, \dots, x_n \in \mathbb{R}$. The joint distribution function can also be written in a more compact way by using vector notation:

$F_{\mathbf{X}}(\mathbf{x}) = P(\mathbf{X} \le \mathbf{x}).$


Multivariate random variables

In the discrete case the joint probability function is defined by

$p_{X_1,\dots,X_n}(x_1,\dots,x_n) = P(X_1 = x_1, \dots, X_n = x_n)$

for $x_1, x_2, \dots, x_n$, or, by using vector notation, $p_{\mathbf{X}}(\mathbf{x}) = P(\mathbf{X} = \mathbf{x})$.

In the continuous case the joint density function is the function $f_{X_1,\dots,X_n}$ for which

$F_{X_1,\dots,X_n}(x_1,\dots,x_n) = \int_{-\infty}^{x_1}\cdots\int_{-\infty}^{x_n} f_{X_1,\dots,X_n}(t_1,\dots,t_n)\,dt_n\cdots dt_1$

for $x_1, x_2, \dots, x_n$. Care must be taken when determining the limits of summation/integration.

Marginal distributions

When the joint distribution of $\mathbf{X}$ is known it is possible to determine the marginal distribution of any sub-vector of $\mathbf{X}$. In the special case $n=2$, i.e. $\mathbf{X}=(X,Y)$, it is easily shown that

$p_X(x) = \sum_y p_{X,Y}(x,y)$ and $p_Y(y) = \sum_x p_{X,Y}(x,y)$

in the discrete case, and that

$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dy$ and $f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx$

in the continuous case.
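As a small numerical illustration of the $n=2$ formulas (a Python sketch; the joint table below is made up for the example, not taken from the text), the marginals of a discrete joint probability function are obtained by summing over the other variable:

import numpy as np

# Hypothetical joint probability table p_{X,Y}(x, y): rows index x, columns index y.
p_xy = np.array([[0.10, 0.20],
                 [0.30, 0.40]])

p_x = p_xy.sum(axis=1)   # p_X(x) = sum over y of p_{X,Y}(x, y)
p_y = p_xy.sum(axis=0)   # p_Y(y) = sum over x of p_{X,Y}(x, y)
print(p_x, p_y)          # [0.3 0.7] [0.4 0.6]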

Problem 1.3.13 (part of)

Let the joint density function of $\mathbf{X}=(X,Y)$ be as given in the problem text, and determine the marginal density function of $Y$. The easiest way to correctly determine the limits of integration is by describing the domain of $\mathbf{X}=(X,Y)$ graphically.

Integrating the joint density over $x$ for fixed $y$, $f_Y(y) = \int f_{X,Y}(x,y)\,dx$, it is now easy to derive that $f_Y(y) = 1$ for $0 < y < 1$. That is, $Y \in U(0,1)$.

Independent random variables

It is not in general possible to determine the joint distribution of a random vector $\mathbf{X}$ knowing only the marginal distributions of its components.

The components of $\mathbf{X}$ are independent if and only if (iff)

$p_{\mathbf{X}}(\mathbf{x}) = \prod_{k=1}^{n} p_{X_k}(x_k)$

in the discrete case and

$f_{\mathbf{X}}(\mathbf{x}) = \prod_{k=1}^{n} f_{X_k}(x_k)$

in the continuous case.
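Continuing the sketch above, the iff condition is easy to test for a finite table: independence holds exactly when every joint probability equals the product of the corresponding marginals. The table here is again a made-up example (one that happens to factor):

import numpy as np

p_xy = np.array([[0.12, 0.28],    # a made-up joint table
                 [0.18, 0.42]])
p_x = p_xy.sum(axis=1)            # [0.4, 0.6]
p_y = p_xy.sum(axis=0)            # [0.3, 0.7]

# Independence iff p_{X,Y}(x, y) = p_X(x) p_Y(y) for all (x, y).
print(np.allclose(p_xy, np.outer(p_x, p_y)))   # True for this table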

Independent random variables

It is often cumbersome to determine whether two random variables $X$ and $Y$ are independent. The following result is useful.

$X$ and $Y$ are independent if and only if

1. the domain of $(X,Y)$ is rectangular, that is, there exist constants $a$, $b$, $c$ and $d$ such that $a \le x \le b$, $c \le y \le d$, and

2. the joint density/probability function can be written as a product of two functions $g(x)$ and $h(y)$, where $g$ is a function of $x$ only and $h$ is a function of $y$ only.

Covariance and correlation

The concepts of covariance and correlation are used to determine the magnitude of (linear) dependence between $X$ and $Y$. The covariance is defined by

$\mathrm{Cov}(X,Y) = E[(X - E[X])(Y - E[Y])].$

When deriving the covariance manually it is easier to use the formula

$\mathrm{Cov}(X,Y) = E[XY] - E[X]E[Y].$

When $X$ and $Y$ are independent there is no dependence, i.e. $\mathrm{Cov}(X,Y) = 0$; note that the converse does not hold in general. The covariance is scale-dependent. A measure of linear dependence that is scale-invariant is given by the correlation coefficient

$\rho = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}.$

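A minimal simulation sketch of these formulas (my own example; the linear relation Y = 2X + noise is chosen only to produce visible dependence):

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 2.0 * x + rng.normal(size=100_000)              # linearly dependent on x

cov_xy = np.mean(x * y) - np.mean(x) * np.mean(y)   # Cov(X,Y) = E(XY) - E(X)E(Y)
rho = cov_xy / (np.std(x) * np.std(y))              # scale-invariant correlation
print(cov_xy, rho)   # roughly 2 and 2/sqrt(5), i.e. about 0.894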

Exercise 1.2 and 1.3 (part of)

Exercise 1.2. Let $(X,Y)$ be a point that is uniformly distributed on the unit disc; that is, the joint distribution of $X$ and $Y$ is

$f_{X,Y}(x,y) = \frac{1}{\pi}$ for $x^2 + y^2 \le 1.$

Are $X$ and $Y$ independent? Are they uncorrelated?

Exercise 1.3. Let $(X,Y)$ be a point that is uniformly distributed on a square whose corners are $(\pm 1, \pm 1)$. Are $X$ and $Y$ independent? Are they uncorrelated?

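For Exercise 1.2 the answer can also be seen numerically. A Monte Carlo sketch (my own illustration, not from the slides): sample uniformly on the disc by rejection; the correlation comes out near zero, yet the spread of Y clearly depends on where X falls, so X and Y are uncorrelated but not independent (the domain is not rectangular).

import numpy as np

rng = np.random.default_rng(1)
pts = rng.uniform(-1, 1, size=(400_000, 2))
x, y = pts[np.sum(pts**2, axis=1) <= 1].T      # rejection: keep points inside the unit disc

print(np.corrcoef(x, y)[0, 1])                 # approximately 0: uncorrelated
# Dependence: when |X| is close to 1, |Y| is forced to be small.
print(y[np.abs(x) > 0.9].std(), y[np.abs(x) < 0.1].std())   # clearly different spreads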

The Transformation Theorem
One-dimensional case

Let $g(x)$ be strictly increasing and suppose that we are interested in the random variable $Y = g(X)$, where $X$ has density $f_X(x)$. Since $g(x)$ is strictly increasing it has an inverse $g^{-1}$, which is also strictly increasing, so that

$F_Y(y) = P(g(X) \le y) = P(X \le g^{-1}(y)) = F_X(g^{-1}(y)).$

We differentiate with respect to $y$, and by the chain rule it follows that

$f_Y(y) = f_X(g^{-1}(y))\,\frac{d}{dy}g^{-1}(y).$

When $g(x)$ is strictly decreasing the inverse is also strictly decreasing, so that

$F_Y(y) = P(g(X) \le y) = P(X \ge g^{-1}(y)) = 1 - F_X(g^{-1}(y)).$

We differentiate with respect to $y$, and by the chain rule it follows that

$f_Y(y) = -f_X(g^{-1}(y))\,\frac{d}{dy}g^{-1}(y).$

Since $g(x)$ is strictly decreasing, $dg^{-1}/dy$ is also negative, which means that it is possible to formulate one single transformation theorem for strictly monotone functions.

Theorem. Let $X$ be a continuous random variable with density function $f_X(x)$. Further, let $g(x)$ be a function that is strictly monotone on the domain of $X$. It then follows that $Y = g(X)$ is a continuous random variable with density function

$f_Y(y) = f_X(g^{-1}(y))\,\left|\frac{d}{dy}g^{-1}(y)\right|.$

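A quick numerical sanity check of the theorem (my own example, not from the slides): take X in U(0,1) and the strictly decreasing g(x) = -ln x, so g^{-1}(y) = e^{-y} and the theorem gives f_Y(y) = f_X(e^{-y}) e^{-y} = e^{-y}, the Exp(1) density.

import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(size=500_000)
y = -np.log(x)                                  # Y = g(X) with g strictly decreasing

hist, edges = np.histogram(y, bins=50, range=(0, 5), density=True)
mids = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - np.exp(-mids))))     # close to 0: histogram matches e^{-y}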

Example: The Rayleigh-Exponential relationship

A density function sometimes used by engineers to model lengths of life of electronic components is the Rayleigh density, given by

$f_X(x) = \frac{2x}{\theta}\,e^{-x^2/\theta}, \quad x > 0.$

Consider $Y = g(X) = X^2$. Since the domain of $X$ is the positive reals, $g(x)$ is strictly increasing, which means that the transformation theorem can be applied.

Since $Y = g(X) = X^2$ it follows that

$g^{-1}(y) = \sqrt{y}$ and $\frac{d}{dy}g^{-1}(y) = \frac{1}{2\sqrt{y}}.$

Thus, by the transformation theorem,

$f_Y(y) = f_X(\sqrt{y})\,\frac{1}{2\sqrt{y}} = \frac{2\sqrt{y}}{\theta}\,e^{-y/\theta}\,\frac{1}{2\sqrt{y}} = \frac{1}{\theta}\,e^{-y/\theta}, \quad y > 0,$

which we recognize as the density function of $\mathrm{Exp}(\theta)$.


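A simulation sketch of the result (under the parametrization assumed above, so that F_X(x) = 1 - e^{-x^2/theta} and a Rayleigh variable can be generated by inverse transform):

import numpy as np

rng = np.random.default_rng(3)
theta = 2.0
x = np.sqrt(-theta * np.log(rng.uniform(size=500_000)))   # Rayleigh via inverse transform
y = x**2                                                  # Y = g(X) = X^2

print(y.mean())                                 # approximately theta, the Exp(theta) mean
print(np.mean(y > 1.0), np.exp(-1.0 / theta))   # empirical vs theoretical P(Y > 1)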

Example: The Rayleigh-Exponential relationship

When we observe that $g(x)$ is strictly increasing it is, however, not necessary to proceed in the formal way. We can just as easily use the distribution function:

$F_Y(y) = P(X^2 \le y) = P(X \le \sqrt{y}) = F_X(\sqrt{y}).$

We differentiate with respect to $y$, and by the chain rule it follows that

$f_Y(y) = f_X(\sqrt{y})\,\frac{1}{2\sqrt{y}} = \frac{1}{\theta}\,e^{-y/\theta}, \quad y > 0.$

The Transformation Theorem
n-dimensional case (Conditions)

Let $\mathbf{X}$ be a continuous random vector with density function $f_{\mathbf{X}}(\mathbf{x})$, with its mass concentrated on $S \subseteq \mathbb{R}^n$. Further, let $g = (g_1, g_2, \dots, g_n)$ be a bijection from $S$ to $T \subseteq \mathbb{R}^n$.

Now consider the n-dimensional random vector $\mathbf{Y} = g(\mathbf{X})$, that is, the $n$ one-dimensional random variables

$Y_k = g_k(X_1, \dots, X_n), \quad k = 1, \dots, n.$

We finally assume that $g$ and its inverse are continuously differentiable.



The Transformation Theorem
n-dimensional case

Theorem 2.1. The density function of $\mathbf{Y}$ is

$f_{\mathbf{Y}}(\mathbf{y}) = f_{\mathbf{X}}(h_1(\mathbf{y}), \dots, h_n(\mathbf{y}))\,|J|$ for $\mathbf{y} \in T$ (and $0$ otherwise),

where $h$ is the inverse of $g$ and where $J$ is the Jacobian

$J = \det\left(\frac{\partial x_i}{\partial y_j}\right)_{i,j=1}^{n}.$

Determinants

Notation. The determinant of a square matrix $A$ is denoted by $\det A$ or $|A|$.

Computation. The determinant of a $2 \times 2$ matrix $A$ is given by

$\det A = a_{11}a_{22} - a_{12}a_{21}.$

The algebraic complement $A_{ij}$ of the element $a_{ij}$ is the matrix that remains after deleting the $i$-th row and the $j$-th column of $A$. The determinant of an $n \times n$ matrix $A$ can be derived recursively via cofactor expansion along any fixed row $i$:

$\det A = \sum_{j=1}^{n} (-1)^{i+j}\,a_{ij}\,\det A_{ij}.$

The purpose. The absolute value of the determinant is the generalized volume, in $n$-dimensional space, of the parallelepiped spanned by the column vectors of $A$.
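The recursion translates directly into code. A minimal Python sketch of cofactor expansion along the first row (fine for small matrices; the recursive method is O(n!) and not meant for large n):

def det(a):
    """Determinant via cofactor expansion along the first row."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0.0
    for j in range(n):
        # A_{1j}: the matrix that remains after deleting row 1 and column j.
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * a[0][j] * det(minor)   # sign (-1)^{1+j} in 1-based indexing
    return total

print(det([[1, 2], [3, 4]]))                     # -2 = 1*4 - 2*3
print(det([[2, 0, 1], [1, 3, 2], [0, 1, 4]]))    # 21, expanding along the first row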

Example: Solving a $3 \times 3$ determinant

Let us find the determinant of a $3 \times 3$ matrix $A$. In order to make the calculations as easy as possible we develop along the first row. It then follows that

$\det A = a_{11}\det A_{11} - a_{12}\det A_{12} + a_{13}\det A_{13},$

where each $A_{1j}$ is the $2 \times 2$ matrix that remains after deleting the first row and the $j$-th column of $A$.

Problem 1.3.21

Let the joint density function of $\mathbf{X}=(X,Y)$ be given by

$f_{X,Y}(x,y) = x\,e^{-x(1+y)}, \quad x, y > 0.$

Determine the joint density function of $\mathbf{U}=(U,V)$ where $U=XY$ and $V=X$.

It is obvious that this is a bijection and therefore Theorem 2.1 is applicable. Inversion yields

$x = v, \quad y = \frac{u}{v},$

that is,

$J = \det\begin{pmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{pmatrix} = \det\begin{pmatrix} 0 & 1 \\ 1/v & -u/v^2 \end{pmatrix} = -\frac{1}{v}.$

By Theorem 2.1 we now obtain

$f_{U,V}(u,v) = f_{X,Y}\!\left(v, \frac{u}{v}\right)\left|-\frac{1}{v}\right| = v\,e^{-v(1+u/v)}\,\frac{1}{v} = e^{-u}\,e^{-v}, \quad u, v > 0,$

and it is clear that $U$ and $V$ are independent, equidistributed random variables with density function

$f(t) = e^{-t}, \quad t > 0;$

that is, $U$ and $V$ are independent $\mathrm{Exp}(1)$ random variables.

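A simulation sketch of this result, assuming the joint density as reconstructed above: marginally X is Exp(1) (integrate out y), and given X = x, Y is exponential with rate x, so the pair is easy to sample; U = XY and V = X should then behave like independent Exp(1) variables.

import numpy as np

rng = np.random.default_rng(4)
x = rng.exponential(size=300_000)           # marginally, X is Exp(1)
y = rng.exponential(size=300_000) / x       # given X = x, Y is exponential with rate x

u, v = x * y, x                             # the transformation U = XY, V = X
print(u.mean(), v.mean())                   # both approximately 1, the Exp(1) mean
print(np.corrcoef(u, v)[0, 1])              # approximately 0, consistent with independence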

The Transformation Theorem
Auxiliary variables

In many situations we are only interested in the probability distribution of a one-dimensional function of a random vector. In order to use Theorem 2.1 to find this distribution we have to introduce auxiliary variables.

These auxiliary variables can be chosen arbitrarily, which means that we define them so as to make the computations as easy as possible.

When the joint density function $f_{\mathbf{Y}}(\mathbf{y})$ has been obtained, we find the sought marginal distribution by integrating over the auxiliary variables.


Problem 1.3.18

Let $X \in \mathrm{Exp}(1)$ and $Y \in U(0,1)$ be independent random variables. Determine the density function of $U = X + Y$.

To find the density function of $U$ we introduce the auxiliary variable $V = Y$, which means that

$U = X + Y, \quad V = Y,$ that is, $X = U - V, \quad Y = V,$

and so

$J = \det\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} = 1.$

By Theorem 2.1 we now obtain

$f_{U,V}(u,v) = f_X(u-v)\,f_Y(v)\,|J| = e^{-(u-v)}$

for $0 < v < u < \infty$, $v < 1$.

In order to find the marginal distribution of $U$ we have to integrate the joint density function over $v$. However, care has to be taken in order to find the correct limits of integration.

We have to break the problem up into two parts: $0 < u < 1$ and $u \ge 1$.

In the case $0 < u < 1$ we get

$f_U(u) = \int_0^u e^{-(u-v)}\,dv = e^{-u}(e^u - 1) = 1 - e^{-u},$

and in the case $u \ge 1$ we get

$f_U(u) = \int_0^1 e^{-(u-v)}\,dv = e^{-u}(e - 1).$

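A simulation sketch that checks the two-part answer: the histogram of U = X + Y should match 1 - e^{-u} on 0 < u < 1 and (e - 1) e^{-u} for u >= 1 (note that the two branches agree at u = 1).

import numpy as np

rng = np.random.default_rng(5)
u = rng.exponential(size=500_000) + rng.uniform(size=500_000)   # U = X + Y, X and Y independent

hist, edges = np.histogram(u, bins=60, range=(0, 6), density=True)
mids = 0.5 * (edges[:-1] + edges[1:])
f_u = np.where(mids < 1, 1 - np.exp(-mids), (np.e - 1) * np.exp(-mids))
print(np.max(np.abs(hist - f_u)))           # close to 0: histogram matches the piecewise density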

The Transformation Theorem
Many-to-one

Theorem 2.1 requires that the function $g$ is a bijection from $S$ to $T$. What if $g$ is not injective?

Suppose $S \subseteq \mathbb{R}^n$ can be partitioned into $m$ disjoint subsets $S_1, S_2, \dots, S_m$ such that $g : S_k \to T$ is injective for each $k$. Then

$f_{\mathbf{Y}}(\mathbf{y}) = \sum_{k=1}^{m} f_{\mathbf{X}}(h_k(\mathbf{y}))\,|J_k|,$

where $h_k = (h_{1k}, h_{2k}, \dots, h_{nk})$ is the inverse corresponding to the mapping from $S_k$ to $T$ and $J_k$ is the corresponding Jacobian.

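As a standard illustration of the many-to-one formula (my example, not from the slides), take $Y = X^2$ with $X$ defined on all of $\mathbb{R}$. Partition $S = \mathbb{R}\setminus\{0\}$ into $S_1 = (-\infty, 0)$ and $S_2 = (0, \infty)$, with inverses $h_1(y) = -\sqrt{y}$ and $h_2(y) = \sqrt{y}$ on $T = (0, \infty)$. Then

$f_Y(y) = f_X(-\sqrt{y})\,\frac{1}{2\sqrt{y}} + f_X(\sqrt{y})\,\frac{1}{2\sqrt{y}}, \quad y > 0.$

In particular, for $X$ standard normal this is the $\chi^2(1)$ density.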
