
Review of Probability

Important Topics

1. Random Variables and Probability Distributions
2. Expected Values, Mean, and Variance
3. Two Random Variables
4. The Normal, Chi-Squared, F_{m,∞}, and t Distributions
5. Random Sampling and the Sampling Distribution
6. Large-Sample Approximations to Sampling Distributions

Definitions

Outcomes: the mutually exclusive potential results of a random process.
Probability: the proportion of the time that the outcome occurs.
Sample space: the set of all possible outcomes.
Event: a subset of the sample space.
Random variable: a numerical summary of a random outcome.

Probability distribution: discrete variable

List of all possible [x, p(x)] pairs
  x = value of the random variable (outcome)
  p(x) = probability associated with that value
Mutually exclusive (no overlap)
Collectively exhaustive (nothing left out)
0 ≤ p(x) ≤ 1 for all x
Σₓ p(x) = 1

Probability distribution: discrete variable

Probabilities of events.
Cumulative probability distribution.

Example: Bernoulli distribution.

Let G be the gender of the next new person you meet, where G = 0 indicates that the person is male and G = 1 indicates that she is female.

The outcomes of G and their probabilities are

G = 1 with probability p
G = 0 with probability 1 − p
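A quick way to see the Bernoulli distribution in action is to simulate draws and check that the sample frequency of G = 1 approaches p. A minimal sketch in Python (NumPy assumed available); the value p = 0.6 is an arbitrary illustration, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.6  # hypothetical probability that G = 1

# Draw 100,000 Bernoulli(p) outcomes: 1 with probability p, 0 otherwise.
g = rng.binomial(n=1, p=p, size=100_000)

# The sample frequency of G = 1 should be close to p.
print(g.mean())  # ≈ 0.6
```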

Probability distribution: continuous variable

1. Mathematical formula
2. Shows all values, x, and frequencies, f(x)
   (Value, Frequency)
   f(x) is not a probability
3. Properties:
   ∫ f(x) dx = 1 over all x (area under the curve)
   f(x) ≥ 0 for a ≤ x ≤ b

[Figure: a density curve f(x) (frequency) plotted against value, over the interval from a to b.]

Probability density function (p.d.f.).

Probability distribution: continuous variable

Cumulative probability distribution.

Uniform Distribution

1. Equally likely outcomes
2. Probability density function:
   f(x) = 1/(d − c) for c ≤ x ≤ d
3. Mean and standard deviation:
   μ = (c + d)/2
   σ = (d − c)/√12

[Figure: the uniform density, constant at height 1/(d − c) between c and d.]
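As a sanity check on these formulas, one can simulate uniform draws and compare the sample mean and standard deviation with (c + d)/2 and (d − c)/√12. A minimal sketch in Python; the endpoints c = 2 and d = 8 are chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(1)
c, d = 2.0, 8.0  # arbitrary endpoints for illustration

x = rng.uniform(c, d, size=1_000_000)

print(x.mean(), (c + d) / 2)           # both ≈ 5.0
print(x.std(), (d - c) / np.sqrt(12))  # both ≈ 1.732
```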

Expected Values, Mean, and Variance

Expected value of a Bernoulli random variable:
E(G) = 1 × p + 0 × (1 − p) = p

Expected value of a continuous random variable:

Let f(y) be the p.d.f. of a random variable Y. Then the expected value of Y is
E(Y) = ∫ y f(y) dy
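The integral can be checked numerically for any concrete density. A minimal sketch in Python, reusing the uniform density from the previous slide (endpoints c = 2, d = 8 again chosen for illustration):

```python
import numpy as np

c, d = 2.0, 8.0
dx = 0.0001
y = np.arange(c, d, dx)
f = np.full_like(y, 1.0 / (d - c))  # uniform p.d.f. on [c, d]

# E(Y) = ∫ y f(y) dy, approximated by a Riemann sum.
print((y * f * dx).sum())           # ≈ (c + d)/2 = 5.0
```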

Variance, Standard Deviation, and Moments

Variance of a Bernoulli random variable:

The mean of the Bernoulli random variable G is p, so its variance is
Var(G) = (0 − p)² × (1 − p) + (1 − p)² × p = p(1 − p)

The standard deviation is
σ_G = √(p(1 − p))

Moments

The expected value of Y^r is called the r-th moment of the random variable Y.
That is, the r-th moment of Y is E(Y^r).
The mean of Y, E(Y), is also called the first moment of Y.

Mean and Variance of a Linear Function of a Random Variable

Suppose X is a random variable with mean μ_X and variance σ²_X, and let Y = a + bX.
Then the mean and variance of Y are

μ_Y = a + b μ_X  and  σ²_Y = b² σ²_X,

and the standard deviation of Y is

σ_Y = |b| σ_X.
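These identities are easy to confirm by simulation. A minimal sketch in Python, with X drawn from an arbitrary distribution and illustrative constants a = 3, b = −2:

```python
import numpy as np

rng = np.random.default_rng(2)
a, b = 3.0, -2.0                   # arbitrary constants for illustration
x = rng.normal(loc=1.0, scale=4.0, size=1_000_000)

y = a + b * x

print(y.mean(), a + b * x.mean())  # μ_Y = a + b μ_X
print(y.var(), b**2 * x.var())     # σ²_Y = b² σ²_X
print(y.std(), abs(b) * x.std())   # σ_Y = |b| σ_X
```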

Two Random Variables

Joint and Marginal Distributions

The joint probability distribution of two discrete random variables, say X and Y, is the probability that the random variables simultaneously take on certain values, say x and y.
The joint probability distribution can be written as the function Pr(X = x, Y = y).

The marginal probability distribution of a random variable Y is just another name for its probability distribution.

For example, if Y takes the values 0 and 1,
E(Y) = 0 × (0.15 + 0.15) + 1 × (0.07 + 0.63) = 0.70

The conditional distribution of Y given X = x is
Pr(Y = y | X = x) = Pr(X = x, Y = y) / Pr(X = x)

The conditional expectation of Y given X = x is
E(Y | X = x) = Σ_y y Pr(Y = y | X = x)

A second example, for two discrete random variables M and A with Pr(A = 0) = Pr(A = 1) = 0.5:

E(M) = 0 × (0.35 + 0.45) + 1 × (0.065 + 0.035) + 2 × (0.05 + 0.01)
     + 3 × (0.025 + 0.005) + 4 × (0.01 + 0.00)
     = 0.35

E(M) = E(M | A = 0) Pr(A = 0) + E(M | A = 1) Pr(A = 1)
     = (0 × 0.70 + 1 × 0.13 + 2 × 0.10 + 3 × 0.05 + 4 × 0.02) × 0.5
     + (0 × 0.90 + 1 × 0.07 + 2 × 0.02 + 3 × 0.01 + 4 × 0.00) × 0.5
     = 0.56 × 0.5 + 0.14 × 0.5
     = 0.35

The mean of Y is the weighted average of the conditional expectation of Y given X, weighted by the probability distribution of X.

Stated differently, the expectation of Y is the expectation of the conditional expectation of Y given X, that is,

E(Y) = E[E(Y | X)],

where the inner expectation is computed using the conditional distribution of Y given X and the outer expectation is computed using the marginal distribution of X.
This is known as the law of iterated expectations.
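The two computations of E(M) above can be reproduced directly from the joint distribution table. A minimal sketch in Python; the joint probabilities come from the slide's example, with rows indexed by a ∈ {0, 1} and columns by m ∈ {0, …, 4}:

```python
import numpy as np

m = np.arange(5)                       # values of M: 0, 1, 2, 3, 4
# Joint probabilities Pr(A = a, M = m), rows a = 0, 1 (from the slides).
joint = np.array([
    [0.35, 0.065, 0.05, 0.025, 0.01],  # a = 0
    [0.45, 0.035, 0.01, 0.005, 0.00],  # a = 1
])

# Direct computation from the marginal distribution of M.
p_m = joint.sum(axis=0)
print((m * p_m).sum())                 # E(M) = 0.35

# Law of iterated expectations: E(M) = Σ_a E(M | A = a) Pr(A = a).
p_a = joint.sum(axis=1)                # Pr(A = 0) = Pr(A = 1) = 0.5
e_m_given_a = (joint * m).sum(axis=1) / p_a   # 0.56 and 0.14
print((e_m_given_a * p_a).sum())       # again 0.35
```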

Conditional variance

The variance of Y conditional on X is the variance of the conditional distribution of Y given X:
Var(Y | X = x) = Σ_y [y − E(Y | X = x)]² Pr(Y = y | X = x)

In the example above,

Var(M | A = 0) = (0 − 0.56)² × 0.70 + (1 − 0.56)² × 0.13
              + (2 − 0.56)² × 0.10 + (3 − 0.56)² × 0.05
              + (4 − 0.56)² × 0.02 ≈ 0.99

Var(M | A = 1) = (0 − 0.14)² × 0.90 + (1 − 0.14)² × 0.07
              + (2 − 0.14)² × 0.02 + (3 − 0.14)² × 0.01
              + (4 − 0.14)² × 0.00 ≈ 0.22

Independence

Two random variables X and Y are independently distributed, or independent, if knowing the value of one of the variables provides no information about the other.

That is, X and Y are independent if, for all values of x and y,
Pr(Y = y | X = x) = Pr(Y = y)

Stated differently, X and Y are independent if
Pr(X = x, Y = y) = Pr(X = x) Pr(Y = y)

That is, the joint distribution of two independent random variables is the product of their marginal distributions.

Covariance and Correlation

Covariance
One measure of the extent to which two random variables move together is their covariance:
Cov(X, Y) = σ_XY = E[(X − μ_X)(Y − μ_Y)]

Correlation

The correlation is an alternative measure of dependence between X and Y that solves the units problem of covariance:
Corr(X, Y) = σ_XY / (σ_X σ_Y)

The random variables X and Y are said to be uncorrelated if Corr(X, Y) = 0.
The correlation is always between −1 and 1.

The Mean and Variance of Sums of Random Variables

E(X + Y) = μ_X + μ_Y
Var(X + Y) = σ²_X + σ²_Y + 2σ_XY
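A minimal sketch in Python that estimates these quantities from simulated data; the data-generating process here (Y = X plus independent noise) is an arbitrary illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=1_000_000)
y = x + rng.normal(size=1_000_000)     # Y moves with X, so Cov(X, Y) > 0

cov_xy = np.cov(x, y)[0, 1]
corr_xy = np.corrcoef(x, y)[0, 1]
print(cov_xy)                          # ≈ 1.0
print(corr_xy)                         # ≈ 0.71, always between −1 and 1

# Variance of a sum: Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).
print(np.var(x + y), x.var() + y.var() + 2 * cov_xy)
```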

The Normal, Chi-Squared, F_{m,∞}, and t Distributions

The Normal Distribution

The probability density function of a normally distributed random variable (the normal p.d.f.) is

f(y) = (1 / (σ√(2π))) exp(−(y − μ)² / (2σ²)),

where exp(x) is the exponential function of x.

The factor 1/(σ√(2π)) ensures that ∫ f(y) dy = 1.

The normal distribution with mean μ and variance σ² is expressed as N(μ, σ²).
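A minimal sketch in Python that evaluates this p.d.f. and confirms numerically that it integrates to 1 (μ = 1 and σ = 2 are arbitrary choices for illustration):

```python
import numpy as np

mu, sigma = 1.0, 2.0  # arbitrary mean and standard deviation

def normal_pdf(y):
    """The N(mu, sigma^2) density."""
    return np.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# Integrate over a wide grid; the area under the curve should be 1.
dy = 0.001
y = np.arange(mu - 10 * sigma, mu + 10 * sigma, dy)
print((normal_pdf(y) * dy).sum())  # ≈ 1.0
```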

Black swan theory

The black swan theory, or theory of black swan events, is a metaphor that describes an event that comes as a surprise, has a major effect, and is often inappropriately rationalized after the fact with the benefit of hindsight.
The theory was developed by Nassim Nicholas Taleb and emphasizes:
  The disproportionate role of high-profile, hard-to-predict, and rare events.
  The non-computability of the probability of consequential rare events using scientific methods (owing to the very nature of small probabilities).
  The psychological biases that make people individually and collectively blind to uncertainty and unaware of the massive role of the rare event in historical affairs.

The main idea in Taleb's book is not to attempt to predict black swan events, but to build robustness against negative ones that occur and to be able to exploit positive ones.
Taleb contends that banks and trading firms are very vulnerable to hazardous black swan events and are exposed to unpredictable losses.
Taleb is highly critical of the widespread use of the normal distribution model as the basis for calculating risk.

The standard normal distribution is the normal distribution with mean μ = 0 and variance σ² = 1 and is denoted N(0, 1).
A standard normal random variable is often denoted by Z, and its cumulative distribution function is denoted by Φ. Accordingly, Pr(Z ≤ c) = Φ(c), where c is a constant.

Key Concept 2.4

[Slide reproduces Key Concept 2.4 from the textbook.]

The Empirical Rule (normal distribution)

90%: ±1.64
95%: ±1.96
99%: ±2.58
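These critical values can be recovered from the standard normal c.d.f. A minimal sketch in Python (SciPy assumed available):

```python
from scipy.stats import norm

# Two-sided critical values c such that Pr(−c ≤ Z ≤ c) = coverage.
for coverage in (0.90, 0.95, 0.99):
    c = norm.ppf(1 - (1 - coverage) / 2)
    print(f"{coverage:.0%}: ±{c:.2f}")
# 90%: ±1.64, 95%: ±1.96, 99%: ±2.58
```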

The bivariate normal distribution

The bivariate normal p.d.f. for the two random variables X and Y is

f(x, y) = 1 / (2π σ_X σ_Y √(1 − ρ²))
        × exp{ −[((x − μ_X)/σ_X)² − 2ρ((x − μ_X)/σ_X)((y − μ_Y)/σ_Y) + ((y − μ_Y)/σ_Y)²] / (2(1 − ρ²)) },

where ρ is the correlation between X and Y.

Important properties of the bivariate normal distribution:

1. If X and Y have a bivariate normal distribution with covariance σ_XY, and if a and b are two constants, then
   aX + bY ~ N(aμ_X + bμ_Y, a²σ²_X + b²σ²_Y + 2ab σ_XY).

2. The marginal distribution of each of the two variables is normal. This follows by setting a = 1, b = 0 in 1.

3. If ρ = 0, then X and Y are independent.

The Chi-squared distribution

The chi-squared distribution is the distribution of the sum of m squared independent standard normal random variables.
The distribution depends on m, which is called the degrees of freedom of the chi-squared distribution.
A chi-squared distribution with m degrees of freedom is denoted χ²_m.

The F_{m,n} distribution

F = (W/m) / (V/n),

where W ~ χ²_m and V ~ χ²_n are independent.

When n is ∞, F = W/m ~ F_{m,∞}.
The F_{m,∞} distribution is the distribution of a random variable with a chi-squared distribution with m degrees of freedom, divided by m.
Equivalently, the F_{m,∞} distribution is the distribution of the average of m squared standard normal random variables.

The Student t Distribution

The Student t distribution with m degrees of freedom is defined to be the distribution of the ratio of a standard normal random variable to the square root of an independently distributed chi-squared random variable with m degrees of freedom divided by m.

That is, let Z be a standard normal random variable, let W be a random variable with a chi-squared distribution with m degrees of freedom, and let Z and W be independently distributed. Then

t = Z / √(W/m) ~ t_m.

When m is 30 or more, the Student t distribution is well approximated by the standard normal distribution, and the t_∞ distribution equals the standard normal distribution.
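A minimal sketch in Python that checks these definitions by simulation: a χ²_m draw is a sum of m squared standard normals, an F_{m,∞} draw is that sum divided by m, and a t_m draw is Z/√(W/m). The choice m = 5 is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n_sims = 5, 1_000_000

z = rng.normal(size=(n_sims, m))
w = (z ** 2).sum(axis=1)                       # chi-squared, m degrees of freedom
f = w / m                                      # F_{m,∞} draws
t = rng.normal(size=n_sims) / np.sqrt(w / m)   # t_m draws (numerator independent of W)

print(w.mean())   # E[χ²_m] = m = 5
print(f.mean())   # E[F_{m,∞}] = 1
print(t.mean())   # ≈ 0 (symmetric around zero)
```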

Random Sampling

Simple random sampling is the simplest sampling scheme, in which n objects are selected at random from a population and each member of the population is equally likely to be included in the sample.
Since the members of the population included in the sample are selected at random, the values of the observations Y₁, …, Yₙ are themselves random.

i.i.d. draws.

Because Y₁, …, Yₙ are randomly drawn from the same population, the marginal distribution of Yᵢ is the same for each i = 1, …, n; Y₁, …, Yₙ are said to be identically distributed.
When Y₁, …, Yₙ are drawn from the same distribution and are independently distributed, they are said to be independently and identically distributed, or i.i.d.

Sampling Distribution of the Sample Average

The sample average of the n observations Y₁, …, Yₙ is

Ȳ = (1/n) Σᵢ Yᵢ.

Because Y₁, …, Yₙ are random, their average is random and has a probability distribution. The distribution of Ȳ is called the sampling distribution of Ȳ.

Mean and Variance of Ȳ

Suppose Y₁, …, Yₙ are i.i.d. and let μ_Y and σ²_Y denote the mean and variance of Yᵢ. Then

E(Ȳ) = μ_Y  and  Var(Ȳ) = σ²_Y / n.
Sampling distribution of Ȳ if Y is normally distributed

A linear combination of normally distributed random variables is also normally distributed.
E(Ȳ) = μ_Y and Var(Ȳ) = σ²_Y / n (equation 2.42).

Therefore,

Ȳ ~ N(μ_Y, σ²_Y / n).
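A minimal sketch in Python that simulates the sampling distribution of Ȳ for normally distributed Y (the values μ_Y = 10, σ_Y = 3, n = 25 are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
mu_y, sigma_y, n = 10.0, 3.0, 25       # illustrative parameters

# 200,000 samples of size n; one sample average per row.
samples = rng.normal(mu_y, sigma_y, size=(200_000, n))
y_bar = samples.mean(axis=1)

print(y_bar.mean())                    # ≈ μ_Y = 10
print(y_bar.var())                     # ≈ σ²_Y / n = 9/25 = 0.36
```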

Large-Sample Approximations to Sampling Distributions

There are two approaches to characterizing sampling distributions:
Exact distribution, or finite-sample distribution, when the distribution of Y is known.
Asymptotic distribution: a large-sample approximation to the sampling distribution.

Law of Large Numbers

The law of large numbers states that, under general conditions, Ȳ will be near μ_Y with very high probability when n is large.
The property that Ȳ is near μ_Y with increasing probability as n increases is called convergence in probability, or consistency.
The law of large numbers states that, under certain conditions, Ȳ converges in probability to μ_Y, or, Ȳ is consistent for μ_Y.

Key Concept 2.6

[Slide reproduces Key Concept 2.6 from the textbook.]

The conditions for the law of large numbers are:
Yᵢ, i = 1, …, n, are i.i.d.
The variance of Yᵢ, σ²_Y, is finite.

Developing Sampling Distributions

Suppose There's a Population ...
Population size, N = 4
Random variable, x
Values of x: 1, 2, 3, 4
Uniform distribution

Population Characteristics

Summary measures:
μ = (1/N) Σᵢ xᵢ = 2.5
σ = √((1/N) Σᵢ (xᵢ − μ)²) ≈ 1.12

[Figure: population distribution, P(x) = 0.25 for each of x = 1, 2, 3, 4.]

All Possible Samples of Size n = 2

16 samples (sampling with replacement):

1st Obs   2nd Obs: 1     2     3     4
1         1,1   1,2   1,3   1,4
2         2,1   2,2   2,3   2,4
3         3,1   3,2   3,3   3,4
4         4,1   4,2   4,3   4,4

16 sample means:

1st Obs   2nd Obs: 1     2     3     4
1         1.0   1.5   2.0   2.5
2         1.5   2.0   2.5   3.0
3         2.0   2.5   3.0   3.5
4         2.5   3.0   3.5   4.0

Sampling Distribution of All Sample Means

Collecting the 16 sample means above gives the sampling distribution of the sample mean:

x̄:     1.0   1.5   2.0   2.5   3.0   3.5   4.0
P(x̄):  1/16  2/16  3/16  4/16  3/16  2/16  1/16

μ_x̄ = 2.5, σ_x̄ ≈ 0.79
Comparison

[Figure: the population distribution (uniform, P(x) = 0.25 for x = 1, …, 4) beside the sampling distribution of the mean (triangular, peaked at 2.5, over x̄ = 1.0, 1.5, …, 4.0).]
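The whole construction can be reproduced by enumerating the 16 equally likely samples. A minimal sketch in Python:

```python
import itertools
from collections import Counter

population = [1, 2, 3, 4]

# All 16 ordered samples of size 2, drawn with replacement.
samples = list(itertools.product(population, repeat=2))
means = [sum(s) / 2 for s in samples]

# Sampling distribution of the sample mean.
counts = Counter(means)
for x_bar in sorted(counts):
    print(f"x̄ = {x_bar}: P = {counts[x_bar]}/16")

mu = sum(means) / len(means)
sigma = (sum((m - mu) ** 2 for m in means) / len(means)) ** 0.5
print(mu, round(sigma, 2))   # 2.5 and ≈ 0.79
```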

Chebyshev Inequality

For any random variable X ~ (μ, σ²), and for any k > 0,

Pr(|X − μ| ≥ kσ) ≤ 1/k².

The inequality can be rewritten as

Pr(|X − μ| < kσ) ≥ 1 − 1/k².

Therefore, the inequality says that with probability at least 1 − 1/k², the random variable will fall within the region μ ± kσ.

Proof of Weak LLN

For Ȳ with E(Ȳ) = μ_Y and Var(Ȳ) = σ²_Y / n, the Chebyshev inequality gives, for any ε > 0,

Pr(|Ȳ − μ_Y| ≥ ε) ≤ σ²_Y / (nε²) → 0 as n → ∞.

Formal definitions of consistency and the law of large numbers

Consistency and convergence in probability.
Let S₁, S₂, …, Sₙ, … be a sequence of random variables. For example, Sₙ could be the sample average Ȳ of a sample of n observations of the random variable Y.
The sequence of random variables {Sₙ} is said to converge in probability to a limit μ if the probability that Sₙ is within ±δ of μ tends to one as n → ∞, for every positive constant δ.

That is, Sₙ →p μ if and only if Pr(|Sₙ − μ| ≥ δ) → 0 as n → ∞ for every δ > 0.
If Sₙ →p μ, then Sₙ is said to be a consistent estimator of μ.

The law of large numbers:
If Y₁, …, Yₙ are i.i.d., E(Yᵢ) = μ_Y, and Var(Yᵢ) < ∞, then Ȳ →p μ_Y.
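A minimal sketch in Python showing Ȳ settling near μ_Y as n grows; the Bernoulli population with μ_Y = 0.35 is an arbitrary illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
mu_y = 0.35                            # illustrative population mean

for n in (10, 100, 10_000, 1_000_000):
    y = rng.binomial(1, mu_y, size=n)  # i.i.d. Bernoulli(0.35) draws
    print(n, y.mean())                 # sample average converges to 0.35
```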

The Central Limit Theorem

The central limit theorem says that, under general conditions, the distribution of Ȳ is well approximated by a normal distribution when n is large.
Since the mean of Ȳ is μ_Y and its variance is σ²_Ȳ = σ²_Y / n, when n is large the distribution of Ȳ is approximately N(μ_Y, σ²_Y / n).
Accordingly, (Ȳ − μ_Y) / σ_Ȳ is well approximated by the standard normal distribution N(0, 1).

Convergence in distribution.

Let F₁, …, Fₙ, … be a sequence of cumulative distribution functions corresponding to a sequence of random variables S₁, …, Sₙ, ….

The sequence of random variables {Sₙ} is said to converge in distribution to S (denoted Sₙ →d S) if the distribution functions {Fₙ} converge to F, the distribution function of S.

That is, Sₙ →d S if and only if limₙ→∞ Fₙ(t) = F(t), where the limit holds at all points t at which the limiting distribution F is continuous.

The distribution F is called the asymptotic distribution of Sₙ.

The central limit theorem:

If Y₁, …, Yₙ are i.i.d. and 0 < σ²_Y < ∞, then

(Ȳ − μ_Y) / σ_Ȳ →d N(0, 1).

In other words, the asymptotic distribution of (Ȳ − μ_Y)/σ_Ȳ is N(0, 1).

Slutsky's theorem

Slutsky's theorem combines consistency and convergence in distribution.
Suppose that aₙ →p a, where a is a constant, and Sₙ →d S. Then

Sₙ + aₙ →d S + a,  Sₙ aₙ →d aS,  and  Sₙ / aₙ →d S / a (provided a ≠ 0).

Continuous mapping theorem

If g is a continuous function, then Sₙ →p a implies g(Sₙ) →p g(a), and Sₙ →d S implies g(Sₙ) →d g(S).

But how large an n is large enough?

The answer is: it depends on the distribution of the underlying Yᵢ that make up the average.

At one extreme, if the Yᵢ are themselves normally distributed, then Ȳ is exactly normally distributed for all n.
In contrast, when Yᵢ is far from normally distributed, the approximation can require n = 30 or even more.

Example: A skewed distribution.
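As an illustration of the skewed case, a minimal sketch in Python: Yᵢ drawn from a right-skewed exponential distribution (an assumed choice for illustration), with the standardized sample average compared against a N(0, 1) tail probability as n grows:

```python
import numpy as np

rng = np.random.default_rng(7)
mu_y, sigma_y = 1.0, 1.0              # mean and s.d. of Exponential(1) draws

for n in (2, 10, 30, 100):
    y = rng.exponential(scale=1.0, size=(100_000, n))
    z = (y.mean(axis=1) - mu_y) / (sigma_y / np.sqrt(n))
    # For N(0, 1), Pr(Z ≤ -1.96) ≈ 0.025; the approximation improves with n.
    print(n, (z <= -1.96).mean())
```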
