Escolar Documentos
Profissional Documentos
Cultura Documentos
Saher Asad
Variability in estimates
Variability in estimates
Application exercise
Sampling
distributions - via
CLT
Confidence
intervals
Hypothesis testing
Saher Asad
Variability in estimates
Parameter estimation
We are often interested in population parameters.
Since complete populations are difficult (or impossible) to collect
data on, we use sample statistics as point estimates for the
unknown population parameters of interest.
Sample statistics vary from sample to sample.
Quantifying how sample statistics vary provides a way to
estimate the margin of error associated with our point estimate.
But before we get to quantifying the variability among samples,
lets try to understand how and why point estimates vary from
sample to sample.
Suppose we randomly sample 1,000 adults from each state in the US.
Would you expect the sample means of their heights to be the same,
somewhat different, or very different?
Saher Asad
2 / 55
Variability in estimates
Parameter estimation
We are often interested in population parameters.
Since complete populations are difficult (or impossible) to collect
data on, we use sample statistics as point estimates for the
unknown population parameters of interest.
Sample statistics vary from sample to sample.
Quantifying how sample statistics vary provides a way to
estimate the margin of error associated with our point estimate.
But before we get to quantifying the variability among samples,
lets try to understand how and why point estimates vary from
sample to sample.
Suppose we randomly sample 1,000 adults from each state in the US.
Would you expect the sample means of their heights to be the same,
somewhat different, or very different?
Not the same, but only somewhat different.
Saher Asad
2 / 55
Variability in estimates
Application exercise
Suppose that you dont have access to the population data. In order
to estimate the average score of students in a class you might sample
from the population and use your sample mean as the best guess for
the unknown population mean.
Sample, with replacement, ten students from the population, and
record the score of these students.
Find the sample mean.
Plot the distribution of the sample averages obtained by
members of the class.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
7
5
4
4
6
2
3
5
5
6
1
10
4
4
6
Saher Asad
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
3
10
8
5
10
6
2
6
7
3
6
5
8
0
8
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
5
9
7
5
5
7
4
0
4
3
6
10
3
6
10
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
4
3
3
6
8
8
8
2
4
8
3
5
5
8
4
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
10
7
4
5
6
6
6
7
7
5
10
3
5.5
7
10
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
6
6
5
4
5
6
5
6
8
4
10
5
10
8
5
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
4
0.5
3
3
5
6
4
4
2
5
4
7
6
8
3
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
6
2
5
1
5
5
4
4
9
4
3
3
4
4
8
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
6
5
3
2
2
5
10
4
1
4
10
8
10
6
6
136
137
138
139
140
141
142
143
144
145
146
6
7
3
10
4
4
6
6
4
5
5
3 / 55
Variability in estimates
Application exercise
Example:
List of random numbers: 59, 121, 88, 46, 58, 72, 82, 81, 5, 10
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
7
5
16
4
4
6
2
3
5
5
6
1
10
4
4
6
18
Saher Asad
17
19
20
21
22
23
24
25
26
27
28
29
30
3
10
31
8
5
10
6
2
6
7
3
6
5
8
0
8
33
32
34
35
36
37
38
39
40
41
42
43
44
45
5
9
46
7
5
5
7
4
0
4
3
6
10
3
6
10
48
47
49
50
51
52
53
54
55
56
57
58
59
60
4
3
61
3
6
8
8
8
2
4
8
3
5
5
8
4
63
62
64
65
66
67
68
69
70
71
72
73
74
75
10
7
76
4
5
6
6
6
7
7
5
10
3
5.5
7
10
78
77
79
80
81
82
83
84
85
86
87
88
89
90
6
6
91
5
4
5
6
5
6
8
4
10
5
10
8
5
93
92
94
95
96
97
98
99
100
101
102
103
104
105
4
0.
5
3
3
5
6
4
4
2
5
4
7
6
8
3
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
6
2
121
5
1
5
5
4
4
9
4
3
3
4
4
8
123
122
124
125
126
127
128
129
130
131
132
133
134
135
6
5
136
3
2
2
5
10
4
1
4
10
8
10
6
6
138
137
139
140
141
142
143
144
145
146
6
7
3
10
4
4
6
6
4
5
5
4 / 55
Variability in estimates
Application exercise
Example:
List of random numbers: 59, 121, 88, 46, 58, 72, 82, 81, 5, 10
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
7
5
16
4
4
6
2
3
5
5
6
1
10
4
4
6
18
17
19
20
21
22
23
24
25
26
27
28
29
30
3
10
31
8
5
10
6
2
6
7
3
6
5
8
0
8
33
32
34
35
36
37
38
39
40
41
42
43
44
45
5
9
46
7
5
5
7
4
0
4
3
6
10
3
6
10
48
47
49
50
51
52
53
54
55
56
57
58
59
60
4
3
61
3
6
8
8
8
2
4
8
3
5
5
8
4
63
62
64
65
66
67
68
69
70
71
72
73
74
75
10
7
76
4
5
6
6
6
7
7
5
10
3
5.5
7
10
78
77
79
80
81
82
83
84
85
86
87
88
89
90
6
6
91
5
4
5
6
5
6
8
4
10
5
10
8
5
93
92
94
95
96
97
98
99
100
101
102
103
104
105
4
0.
5
3
3
5
6
4
4
2
5
4
7
6
8
3
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
6
2
121
5
1
5
5
4
4
9
4
3
3
4
4
8
123
122
124
125
126
127
128
129
130
131
132
133
134
135
6
5
136
3
2
2
5
10
4
1
4
10
8
10
6
6
138
137
139
140
141
142
143
144
145
146
6
7
3
10
4
4
6
6
4
5
5
Saher Asad
4 / 55
Variability in estimates
Application exercise
Sampling distribution
What you just constructed is called a sampling distribution.
Saher Asad
5 / 55
Variability in estimates
Application exercise
Sampling distribution
What you just constructed is called a sampling distribution.
What is the shape and center of this distribution? Based on this distribution, what do you think is the true population average?
Saher Asad
5 / 55
Variability in estimates
Application exercise
Sampling distribution
What you just constructed is called a sampling distribution.
What is the shape and center of this distribution? Based on this distribution, what do you think is the true population average?
Approximately 5.39, the true population mean.
Saher Asad
5 / 55
Variability in estimates
6 / 55
Variability in estimates
CLT - conditions
Certain conditions must be met for the CLT to apply:
1. Independence: Sampled observations must be independent.
This is difficult to verify, but is more likely if
random sampling/assignment is used, and
if sampling without replacement, n < 10% of the population.
Saher Asad
7 / 55
Variability in estimates
CLT - conditions
Certain conditions must be met for the CLT to apply:
1. Independence: Sampled observations must be independent.
This is difficult to verify, but is more likely if
random sampling/assignment is used, and
if sampling without replacement, n < 10% of the population.
This is also difficult to verify for the population, but we can check
it using the sample data, and assume that the sample mirrors the
population.
Saher Asad
7 / 55
Confidence intervals
Variability in estimates
Confidence intervals
Why do we report confidence intervals?
Constructing a confidence interval
A more accurate interval
Capturing the population parameter
Changing the confidence level
Hypothesis testing
Saher Asad
Confidence intervals
Confidence intervals
A plausible range of values for the population parameter is called
a confidence interval.
Using only a sample statistic to estimate a parameter is like
fishing in a murky lake with a spear, and using a
confidence interval is like fishing with a net.
Saher Asad
8 / 55
Confidence intervals
interval
Constructing a confidence
Saher Asad
9 / 55
Confidence intervals
interval
Constructing a confidence
x = 3.2 s = 1.74
Saher Asad
9 / 55
Confidence intervals
interval
Constructing a confidence
x = 3.2 s = 1.74
The approximate 95% confidence interval is defined as
point estimate 2 SE
Saher Asad
9 / 55
Confidence intervals
interval
Constructing a confidence
x = 3.2 s = 1.74
The approximate 95% confidence interval is defined as
point estimate 2 SE
s
SE = 1.74
= 5
n
0
0.25
Saher Asad
9 / 55
Confidence intervals
interval
Constructing a confidence
x = 3.2 s = 1.74
The approximate 95% confidence interval is defined as
point estimate 2 SE
s
SE = 1.74
= 50
n
x 2
SE =
0.25
3.2 2 0.25
Saher Asad
9 / 55
Confidence intervals
interval
Constructing a confidence
x = 3.2 s = 1.74
The approximate 95% confidence interval is defined as
point estimate 2 SE
s
SE = 1.74
= 50
n
x 2
SE = 3.2 2
0.25
0.25
= (3.2 0.5, 3.2 +
0.5)
Saher Asad
9 / 55
Confidence intervals
interval
Constructing a confidence
x = 3.2 s = 1.74
The approximate 95% confidence interval is defined as
point estimate 2 SE
s
SE = 1.74
= 50
n
x 2
SE = 3.2 2
0.25
0.25
= (3.2 0.5, 3.2 +
0.5)
Saher Asad
9 / 55
Confidence intervals
point estimate z x
SE
Saher Asad
10 / 55
Confidence intervals
point estimate z x SE
Conditions when the point estimate = x:
1. Independence: Observations in the sample must be independent
random sample/assignment
if sampling without replacement, n < 10% of population
Saher Asad
10 / 55
Confidence intervals
point estimate z x SE
Conditions when the point estimate = x:
1. Independence: Observations in the sample must be independent
random sample/assignment
if sampling without replacement, n < 10% of population
10 / 55
Confidence intervals
Saher Asad
11 / 55
Confidence intervals
Width of an interval
If we want to be more certain that we capture the population
parameter,
i.e. increase our confidence level, should we use a wider interval or a
smaller interval?
Saher Asad
12 / 55
Confidence intervals
Width of an interval
If we want to be more certain that we capture the population
parameter,
i.e. increase our confidence level, should we use a wider interval or a
smaller interval?
A wider interval.
Saher Asad
12 / 55
Confidence intervals
Width of an interval
If we want to be more certain that we capture the population
parameter,
i.e. increase our confidence level, should we use a wider interval or a
smaller interval?
A wider interval.
Can you see any drawbacks to using a wider interval?
Saher Asad
12 / 55
Confidence intervals
Width of an interval
If we want to be more certain that we capture the population
parameter,
i.e. increase our confidence level, should we use a wider interval or a
smaller interval?
A wider interval.
Can you see any drawbacks to using a wider interval?
12 / 55
Confidence intervals
Saher Asad
13 / 55
Confidence intervals
Changing the confidence level
13 / 55
Confidence intervals
level
(d) Z =
2.05
2.33
(b) Z =
(e) Z =
1.96
1.65
(c) Z =
2.33
Saher Asad
14 / 55
Confidence intervals
level
(d) Z =
2.05
2.33
(b) Z =
(e) Z =
1.96
1.65
(c) Z =
2.33
0.98
z = 2.33
z = 2.33
0.01
3
Saher Asad
0.01
2
3
14 / 55
Hypothesis testing
Variability in estimates
Confidence intervals
Hypothesis testing
Hypothesis testing framework
Testing hypotheses using confidence intervals
Formal testing using p-values
Two-sided hypothesis testing with p-values
Decision errors
Choosing a significance level
Saher Asad
Hypothesis testing
Remember when...
Gender discrimination experiment:
Promotion
Promoted Not Promoted
Gender
Male
Female
Total
Saher Asad
21
14
35
3
10
13
Total
24
24
48
15 / 55
Hypothesis testing
Remember when...
Gender discrimination experiment:
Promotion
Promoted Not Promoted
Gender
Male
Female
Total
21
14
35
3
10
13
p males = 21/24
0.88
Total
24
24
48
p females = 14/24
0.58
Saher Asad
15 / 55
Hypothesis testing
Remember when...
Gender discrimination experiment:
Promotion
Promoted Not Promoted
Gender
Male
Female
Total
21
14
35
males
females
3
10
13
Total
24
24
48
= 21/24 0.88
= 14/24 0.58
Possible explanations:
Promotion and gender are independent, no gender
discrimination, observed difference in proportions is simply due
to chance. null - (nothing is going on)
Promotion and gender are dependent, there is gender
discrimination, observed difference in proportions is not due to
chance. alternative - (something is going on)
Saher Asad
15 / 55
Hypothesis testing
Saher Asad
16 / 55
Hypothesis testing
Saher Asad
16 / 55
Hypothesis testing
Saher Asad
16 / 55
Hypothesis testing
Saher Asad
16 / 55
Hypothesis testing
16 / 55
Hypothesis testing
intervals
Saher Asad
17 / 55
Hypothesis testing
intervals
Saher Asad
17 / 55
Hypothesis testing
intervals
Saher Asad
17 / 55
Hypothesis testing
intervals
17 / 55
Hypothesis testing
Saher Asad
18 / 55
Hypothesis testing
Saher Asad
19 / 55
Hypothesis testing
Saher Asad
19 / 55
Hypothesis testing
H0 : = 8
Saher Asad
19 / 55
Hypothesis testing
H0 : = 8
We test the claim that the average number of colleges Duke
students apply to is greater than 8
HA : > 8
Saher Asad
19 / 55
Hypothesis testing
Test statistic
In order to evaluate if the observed sample mean is unusual for the
hypothesized sampling distribution, we determine how many standard
errors away from the null it is, which is also called the test statistic.
Saher Asad
20 / 55
Hypothesis testing
Test statistic
In order to evaluate if the observed sample mean is unusual for the
hypothesized sampling distribution, we determine how many standard
errors away from the null it is, which is also called the test statistic.
=8
Saher Asad
x = 9.7
20 / 55
Hypothesis testing
Test statistic
In order to evaluate if the observed sample mean is unusual for the
hypothesized sampling distribution, we determine how many standard
errors away from the null it is, which is also called the test statistic.
=8
x = 9.7
x N = 8, SE = 20 = 0.5
6
Saher Asad
20 / 55
Hypothesis testing
Test statistic
In order to evaluate if the observed sample mean is unusual for the
hypothesized sampling distribution, we determine how many standard
errors away from the null it is, which is also called the test statistic.
=8
x = 9.7
7
x N = 8,
20 = 0.5
SE =
9.7 6
=
Z=
8
3.4 Chp 4: Foundations for inference
Saher Asad
0.5
20 / 55
Hypothesis testing
Test statistic
In order to evaluate if the observed sample mean is unusual for the
hypothesized sampling distribution, we determine how many standard
errors away from the null it is, which is also called the test statistic.
=8
x = 9.7
The sample mean is 3.4 standard errors away from the hypothesized value. Is this considered unusually high? That
is, is the result statistically significant?
7
x N = 8,
20 = 0.5
SE =
9.7 6
=
Z=
8
3.4 Chp 4: Foundations for inference
Saher Asad
0.5
20 / 55
Hypothesis testing
Test statistic
In order to evaluate if the observed sample mean is unusual for the
hypothesized sampling distribution, we determine how many standard
errors away from the null it is, which is also called the test statistic.
=8
x = 9.7
x N = 8, SE = 20 = 0.5
Z=
Saher Asad
9.7 6
=
8
3.4
0.5
The sample mean is 3.4 standard errors away from the hypothesized value. Is this considered unusually high? That
is, is the result statistically significant?
Yes, and we can quantify how
unusual it is using a p-value.
20 / 55
Hypothesis testing
p-values
We then use this test statistic to calculate the p-value, the
probability of observing data at least as favorable to the
alternative hypothesis as our current data set, if the null
hypothesis were true.
Saher Asad
21 / 55
Hypothesis testing
p-values
We then use this test statistic to calculate the p-value, the
probability of observing data at least as favorable to the
alternative hypothesis as our current data set, if the null
hypothesis were true.
If the p-value is low (lower than the significance level, , which is
usually 5%) we say that it would be very unlikely to observe the
data if the null hypothesis were true, and hence reject H0.
Saher Asad
21 / 55
Hypothesis testing
p-values
We then use this test statistic to calculate the p-value, the
probability of observing data at least as favorable to the
alternative hypothesis as our current data set, if the null
hypothesis were true.
If the p-value is low (lower than the significance level, , which is
usually 5%) we say that it would be very unlikely to observe the
data if the null hypothesis were true, and hence reject H0.
If the p-value is high (higher than ) we say that it is likely to
observe the data even if the null hypothesis were true, and hence
do not reject H0.
Saher Asad
21 / 55
Hypothesis testing
values
=8
Saher Asad
x = 9.7
22 / 55
Hypothesis testing
values
=8
x = 9.7
22 / 55
Hypothesis testing
values
Saher Asad
23 / 55
Hypothesis testing
values
Saher Asad
23 / 55
Hypothesis testing
values
Saher Asad
23 / 55
Hypothesis testing
values
Saher Asad
23 / 55
Hypothesis testing
values
Saher Asad
23 / 55
Hypothesis testing
values
Saher Asad
23 / 55
Hypothesis testing
A poll by the National Sleep Foundation found that college students average about 7
hours of sleep per night. A sample of 169 college students taking an introductory
statistics class yielded an average of 6.88 hours, with a standard deviation of 0.94
hours. Assuming that this is a random sample representative of all college students
(bit of a leap of faith?), a hypothesis test was conducted to evaluate if college students
on average sleep less than 7 hours per night. The p-value for this hypothesis test is
0.0485. Which of the following is correct?
(a) Fail to reject H0, the data provide convincing evidence that
college students sleep less than 7 hours on average.
(b)Reject H0, the data provide convincing evidence that
college
students sleep less than 7 hours on average.
(c) Reject H0, the data prove that college students sleep more than 7
hours
on average.
(d)Fail
to reject
H0, the data do not provide convincing evidence that
college students sleep less than 7 hours on average.
(e)Reject H0, the data provide convincing evidence that college
students in this sample sleep less than 7 hours on
Saher Asad
Chp 4: Foundations for inference
average.
24 / 55
Hypothesis testing
A poll by the National Sleep Foundation found that college students average about 7
hours of sleep per night. A sample of 169 college students taking an introductory
statistics class yielded an average of 6.88 hours, with a standard deviation of 0.94
hours. Assuming that this is a random sample representative of all college students
(bit of a leap of faith?), a hypothesis test was conducted to evaluate if college students
on average sleep less than 7 hours per night. The p-value for this hypothesis test is
0.0485. Which of the following is correct?
(a) Fail to reject H0, the data provide convincing evidence that
college students sleep less than 7 hours on average.
(b) Reject H0, the data provide convincing evidence that
college
students sleep less than 7 hours on average.
(c)Reject
H0average.
, the data prove that college students sleep more
hours on
than
7
(d)Fail to reject H0, the data do not provide convincing evidence that
college students sleep less than 7 hours on average.
(e)Reject H0, the data provide convincing evidence that college
students in this sample sleep less than 7 hours on average.
Saher Asad
24 / 55
Hypothesis testing
H0 : = 7
HA : 7
Saher Asad
25 / 55
Hypothesis testing
H0 : = 7
HA : 7
Hence the p-value would change as well:
p-value
= 0.0485 2
= 0.097
x= 6.88
Saher Asad
= 7
7.12
Chp 4: Foundations for inference
25 / 55
Hypothesis testing
Decision errors
Decision errors
Hypothesis tests are not flawless.
In the court system innocent people are sometimes wrongly
convicted and the guilty sometimes walk free.
Similarly, we can make a wrong decision in statistical hypothesis
tests as well.
The difference is that we have the tools necessary to quantify
how often we make errors in statistics.
Saher Asad
26 / 55
Hypothesis testing
Decision errors
Saher Asad
27 / 55
Hypothesis testing
Decision errors
Saher Asad
H0 true
HA true
27 / 55
Hypothesis testing
Decision errors
Saher Asad
H0 true
HA true
27 / 55
Hypothesis testing
Decision errors
Saher Asad
H0 true
HA true
27 / 55
Hypothesis testing
Decision errors
H0 true
HA true
Type 1 Error
Saher Asad
27 / 55
Hypothesis testing
Decision errors
H0 true
HA true
Type
1 Error
Type 2 Error
G
A Type 1 Error is rejecting the null hypothesis when H0 is true.
A Type 2 Error is failing to reject the null hypothesis when HA is
true.
Saher Asad
27 / 55
Hypothesis testing
Decision errors
H0 true
HA true
Type
1 Error
Type 2 Error
G
A Type 1 Error is rejecting the null hypothesis when H0 is true.
A Type 2 Error is failing to reject the null hypothesis when HA is
true.
We (almost) never know if H0 or HA is true, but we need to
consider all possibilities.
Saher Asad
27 / 55
Hypothesis testing
28 / 55
Variability in estimates
Confidence intervals
Hypothesis testing
Saher Asad
50
100
150
Next lets look at the population data for the number of basketball
games attended:
10
20
30
40
50
60
70
Saher Asad
29 / 55
0
200
800
400
600
10
15
20
Saher Asad
30 / 55
0
200
800
400
600
10
15
Saher Asad
20
30 / 55
0
200
800
400
600
10
15
20
Saher Asad
30 / 55
800
200
400
600
How did the shape, center, and spread of the sampling distribution change going from n = 10 to n =
30?
2
6
10
Saher Asad
31 / 55
800
200
400
600
How did the shape, center, and spread of the sampling distribution change going from n = 10 to n =
30?
2
6
10
Saher Asad
31 / 55
200
600
1000
Saher Asad
32 / 55
Variability in estimates
Confidence intervals
Hypothesis testing
Saher Asad
33 / 55
point estimate z x SE
where z x is selected to correspond to the confidence level, and SE
represents the standard error.
Remember that the value z x SE is called the margin of error.
Saher Asad
34 / 55
Saher Asad
35 / 55
Saher Asad
35 / 55
1.The null hypothesis is that men and women have equal IQ, and
the alternative is that these values are different.
H0 : men = women
HA : men
women
Saher Asad
36 / 55
1. The null hypothesis is that men and women have equal IQ,
and the alternative is that these values are different.
H0 : men = women
HA : men women
Saher Asad
36 / 55
1. The null hypothesis is that men and women have equal IQ,
and the alternative is that these values are different.
H0 : men = women
HA : men women
Saher Asad
36 / 55
xmxw= 11.1
Saher Asad
mw= 0
11.1
37 / 55
11.1
=
0
97.36
0.114
The Z score is huge! And hence
the p-value will be tiny, allowing
us to reject H0 in favor of HA.
Z=
Saher Asad
38 / 55
11.1
=
0
97.36
0.114
The Z score is huge! And hence
the p-value will be tiny, allowing
us to reject H0 in favor of HA.
Z=
Saher Asad
38 / 55
Saher Asad
39 / 55
When to retreat
When to retreat
Statistical tools rely on the following two main conditions:
Independence A random sample from less than 10% of the
population ensures independence of observations. In
experiments, this is ensured by random assignment. If
independence fails, then advanced techniques must be used, and
in some such cases, inference may not be possible.
Sample size and skew For example, if the sample size is too
small, the skew too strong, or extreme outliers are present, then
the normal model for the sample mean will fail.
Saher Asad
40 / 55
Variability in estimates
Confidence intervals
Hypothesis testing
Saher Asad
Saher Asad
41 / 55
Saher Asad
41 / 55
Saher Asad
41 / 55
41 / 55
Decision
fail to reject H0
reject H0
Truth
Saher Asad
H0 true
HA true
42 / 55
Decision
fail to reject H0
Truth
reject H0
Type 1 Error,
H0 true
HA true
Saher Asad
42 / 55
Decision
fail to reject H0
Truth
Type 1 Error,
H0 true
HA true
reject H0
Type 2 Error,
Saher Asad
42 / 55
Decision
fail to reject H0
Truth
H0 true
HA true
1
Error,
reject H0
Type 1
Type 2 Error,
Type 1 error is rejecting H0 when you shouldnt have, and the
probability of doing so is (significance level)
Type 2 error is failing to reject H0 when you should have, and the
probability of doing so is (a little more complicated to calculate)
Power of a test is the probability of correctly rejecting H0, and the
probability of doing so is 1
Saher Asad
42 / 55
Decision
fail to reject H0
Truth
H0 true
HA true
1
Error,
reject H0
Type 1
Type 2 Error,
Power, 1
Type 1 error is rejecting H0 when you shouldnt have, and the
probability of doing so is (significance level)
Type 2 error is failing to reject H0 when you should have, and the
probability of doing so is (a little more complicated to calculate)
Power of a test is the probability of correctly rejecting H0, and the
probability of doing so is 1
In hypothesis testing, we want to keep and low, but there are
inherent trade-offs.
Saher Asad
42 / 55
Saher Asad
43 / 55
Saher Asad
44 / 55
H0 : = 130
HA : > 130
Saher Asad
44 / 55
H0 : = 130
HA : > 130
Well start with a very specific question What is the power of this hypothesis test
to correctly detect an increase of 2 mmHg in average blood pressure?
Saher Asad
44 / 55
Calculating power
The preceding question can be rephrased as How likely is it that this
test will reject H0 when the true average systolic blood pressure for
employees at this company is 132 mmHg?
Saher Asad
45 / 55
Calculating power
The preceding question can be rephrased as How likely is it that this
test will reject H0 when the true average systolic blood pressure for
employees at this company is 132 mmHg?
Hint: Break this down intro two simpler problems
Saher Asad
45 / 55
Calculating power
The preceding question can be rephrased as How likely is it that this
test will reject H0 when the true average systolic blood pressure for
employees at this company is 132 mmHg?
Hint: Break this down intro two simpler problems
1.Problem 1: Which values of x represent sufficient evidence to
reject H0?
Saher Asad
45 / 55
Calculating power
The preceding question can be rephrased as How likely is it that this
test will reject H0 when the true average systolic blood pressure for
employees at this company is 132 mmHg?
Hint: Break this down intro two simpler problems
1. Problem 1: Which values of x represent sufficient evidence
to reject H0 ?
2.Problem 2: What is.the probability that we would reject
H0 if
.
25
xhad come from N mean = 132, SE
= 2.5 , i.e. what is
10 the
=
probability that we can obtain such an x from this
distribution?
Saher Asad
45 / 55
Calculating power
The preceding question can be rephrased as How likely is it that this
test will reject H0 when the true average systolic blood pressure for
employees at this company is 132 mmHg?
Hint: Break this down intro two simpler problems
1. Problem 1: Which values of x represent sufficient evidence
to reject H0 ?
2.Problem 2: What is.the probability that we would reject
H0 if
.
25
xhad come from N mean = 132, SE
= 2.5 , i.e. what is
10 the
=
probability that we can obtain such an x from this distribution?
0
Saher Asad
45 / 55
Problem 1
Which values of x represent sufficient evidence to reject
H0? (Remember H0 : = 130, HA : > 130)
Saher Asad
46 / 55
Problem 1
Which values of x represent sufficient evidence to reject
H0? (Remember H0 : = 130, HA : > 130)
P(Z > z) < 0.05 z >
1.65
x > 1.65
s/
0.05
Saher Asad
130
134.125
46 / 55
Problem 1
Which values of x represent sufficient evidence to reject
H0? (Remember H0 : = 130, HA : > 130)
P(Z > z) < 0.05 z >
1.65
x > 1.65
s/
0.05
130
134.125
Saher Asad
46 / 55
Problem 2
What is the probability that we would reject H0 if x did come
from
N(mean = 132, SE = 2.5).
Saher Asad
47 / 55
Problem 2
What is the probability that we would reject H0 if x did come from
N(mean = 132, SE = 2.5).
Saher Asad
47 / 55
Problem 2
What is the probability that we would reject H0 if x did come from
N(mean = 132, SE = 2.5).
134.125
132 2.
5
=
0.85
Z=
0.8023
Saher Asad
132
0.1977
134.125
47 / 55
Problem 2
What is the probability that we would reject H0 if x did come from
N(mean = 132, SE = 2.5).
134.125
132 2.
5
=
0.85
Z=
0.8023
132
0.1977
134.125
The probability of rejecting H0 : = 130, if the true average systolic blood pressure
of employees at this company is 132 mmHg, is 0.1977 which is the power of this
test.
Therefore, = 0.8023 for this test.
Saher Asad
47 / 55
Null
distribution
120
125
130
135
140
Saher Asad
48 / 55
Null
distribution
120
125
Power
distribution
130
135
140
Saher Asad
48 / 55
Null
distribution
Power
distribution
0.05
120
125
130
135
140
Saher Asad
48 / 55
Null
distribution
Power
distribution
134.125
120
125
130
0.05
135
140
Saher Asad
48 / 55
Null
distribution
Power
distribution
134.125
120
125
130
0.05
135
140
Saher Asad
48 / 55
Null
distribution
Power
distribution
Power
120
125
130
135
140
Saher Asad
48 / 55
Saher Asad
49 / 55
Saher Asad
49 / 55
Saher Asad
49 / 55
Saher Asad
49 / 55
49 / 55
x N(H0 + , SE)
Saher Asad
50 / 55
Saher Asad
51 / 55
=4
Saher Asad
51 / 55
=4
Step 1: Determine the cutoff in order to reject H20 at = 0.05, we need a
sample mean that will yield a Z score
at +least
1.65.
x > of
130
1.655
Saher Asad
51 / 55
=4
Step 1: Determine the cutoff in order to reject H20 at = 0.05, we need a
sample mean that will yield a Z score
of at+least
1.65.
x > 130
1.655
Step 2: Set the probability of obtaining the above x if the true population is
centered at 130 + 4 = 134 to the desired power, and solve for n.
25
.
.
130 + 0.9
1.65
25
n
P Z > 134
25
Saher Asad
n.
=
= P Z > 1.65 4
0.9
25 for inference
Chp 4: Foundations
51 / 55
Saher Asad
52 / 55
0.6
0.4
0.2
1.0
power
0.8
200
400
600
800
1000
Saher Asad
52 / 55
0.6
0.4
0.2
1.0
power
0.8
200
400
600
800
1000
Saher Asad
52 / 55
0.6
0.4
0.2
1.0
power
0.8
200
400
600
800
1000
Saher Asad
52 / 55
0.6
0.4
0.2
1.0
power
0.8
200
400
600
800
1000
52 / 55
Variability in estimates
Confidence intervals
Hypothesis testing
Saher Asad
All else held equal, will the p-value be lower if n = 100 or n = 10, 000?
(a) n = 100
(b) n = 10,
000
Saher Asad
53 / 55
All else held equal, will the p-value be lower if n = 100 or n = 10, 000?
(a) n = 100
(b) n = 10,
000
Saher Asad
53 / 55
All else held equal, will the p-value be lower if n = 100 or n = 10, 000?
(a) n = 100
(b) n = 10, 000
Suppose x = 50, s = 2, H0 : = 49.5, and HA :
49.5.
Saher Asad
53 / 55
All else held equal, will the p-value be lower if n = 100 or n = 10, 000?
(a) n = 100
(b) n = 10, 000
Suppose x = 50, s = 2, H0 : = 49.5, and HA :
49.5.
Zn=10
50
49.52
10
0
Saher Asad
53 / 55
All else held equal, will the p-value be lower if n = 100 or n = 10, 000?
(a) n = 100
(b) n = 10, 000
Suppose x = 50, s = 2, H0 : = 49.5, and HA :
49.5.
Zn=10
50
49.52
10
0
Saher Asad
50 49.5
=
= 2.5, p-value =
0.5
0.0062
2
0.2
1
0
53 / 55
All else held equal, will the p-value be lower if n = 100 or n = 10,
000?
(a) n = 100
(b) n = 10, 000
Suppose x = 50, s = 2, H0 : = 49.5, and HA : 49.5.
Zn=10
Zn=1000
50
49.52
100
50
2
49.5
1000
0
Saher Asad
50 49.5
=
= 2.5, p-value =
0.5
0.0062
2
0.2
1
0
53 / 55
All else held equal, will the p-value be lower if n = 100 or n = 10,
000?
(a) n = 100
(b) n = 10, 000
Suppose x = 50, s = 2, H0 : = 49.5, and HA : 49.5.
Zn=10
Zn=1000
50
49.52
50 49.5
=
= 2.5, p-value =
0.5
0.0062
100
2
0.2
0.
50 =
1
2
5
49.5
0
1000
=
= 25, p-value 0
50 0.02
0
49.5
=
2
100
Saher Asad
53 / 55
All else held equal, will the p-value be lower if n = 100 or n = 10,
000?
(a) n = 100
(b) n = 10, 000
Suppose x = 50, s = 2, H0 : = 49.5, and HA : 49.5.
Zn=10
Zn=1000
50
49.52
50 49.5
=
=
p-value =
0.5
2.5,
0.0062
100
2
0.2
0.
50 =
p-value
1
2
5
49.5
0
0
1000
=
=
50
0
25,
49.5 0.02
As n increases - SE , Z , p-value
Saher Asad
2
100
53 / 55
x
n = 30
10.05
10.1
10.2
n = 5000
Saher Asad
54 / 55
x
n = 30
10.05
p value = 0.45
10.1
10.2
n = 5000
Saher Asad
54 / 55
x
n = 30
10.05
p value = 0.45
10.1
10.2
Saher Asad
54 / 55
x
n = 30
10.05
p value = 0.45
10.1
p value = 0.39
10.2
Saher Asad
54 / 55
x
n = 30
10.05
p value = 0.45
10.1
p value = 0.39
10.2
Saher Asad
54 / 55
x
n = 30
10.05
10.1
10.2
p value = 0.45
p value = 0.39
p value = 0.29
Saher Asad
54 / 55
x
n = 30
10.05
10.1
10.2
p value = 0.45
p value = 0.39
p value = 0.29
Saher Asad
p value 0
54 / 55
x
n = 30
10.05
10.1
10.2
p value = 0.45
p value = 0.39
p value = 0.29
p value 0
When n is large, even small deviations from the null (small effect
sizes), which may be considered practically insignificant, can yield
statistically significant results.
Saher Asad
54 / 55
55 / 55