
Use the t-test to determine if two groups (two treatments) are different.

Compare the blood pressure of patients taking a drug to that of patients not taking the drug.
Two treatments: Drug vs. No drug (data in cells A7:A13 and B7:B13)

Drug    No drug
110     130
115     145
120     145
125     150
130     150
135     170
140     175

[Chart: Blood pressure vs treatment. Y-axis: Blood pressure (0 to 200); X-axis: Patient number; series: Drug, No drug]

The p-value for the t-test is the probability of seeing a difference at least this large
between the two groups if the drug had no effect at all.
Expressed another way:
The t-test tells us how likely it is that the observed difference between the two groups
is just due to random differences in sampling the two groups,
in the absence of any effect of the drug.
A small p-value means the difference is unlikely to be due to sampling alone.
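Using the Excel TTEST worksheet function
(third argument 2 = two-tailed; fourth argument 2 = two-sample test assuming equal variances):
=TTEST(A7:A13,B7:B13,2,2)
t-test p-value = 0.00253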

Using Menu: Tools / Data Analysis / t-Test: Two-Sample Assuming Equal Variances

t-Test: Two-Sample Assuming Equal Variances
                                Drug        No drug
Mean                            125         152.1429
Variance                        116.6667    240.4762
Observations                    7           7
Pooled Variance                 178.5714
Hypothesized Mean Difference    0
df                              12
t Stat                          -3.8
P(T<=t) one-tail                0.001265
t Critical one-tail             1.782287
P(T<=t) two-tail                0.00253
t Critical two-tail             2.178813
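The pooled-variance calculation behind this output can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not what Excel runs internally; the function names `pooled_t_test` and `two_tailed_p` are made up for this sketch, and the p-value is obtained by numerically integrating the t density (Simpson's rule) rather than by a library call.

```python
import math
from statistics import mean, variance

drug = [110, 115, 120, 125, 130, 135, 140]
no_drug = [130, 145, 145, 150, 150, 170, 175]

def pooled_t_test(a, b):
    """Two-sample t-test assuming equal variances (hypothetical helper)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df, pooled

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value by Simpson integration of the t density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = abs(t) / steps
    total = pdf(0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # total * h / 3 is the area from 0 to |t|; double it for both sides.
    return 1 - 2 * total * h / 3

t_stat, df, pooled_var = pooled_t_test(drug, no_drug)
p_value = two_tailed_p(t_stat, df)
print(f"pooled variance = {pooled_var:.4f}, t = {t_stat:.2f}, df = {df}, p = {p_value:.5f}")
```

The printed pooled variance, t statistic and two-tailed p-value should agree with the corresponding rows of the Excel output to the digits shown there.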

Single-factor ANOVA can do the same thing as the t-test to determine if two groups (two treatments) are different.
Compare the blood pressure of patients taking a drug to that of patients not taking the drug.
Two treatments: Drug vs. No drug

Drug    No drug
110     130
115     145
120     145
125     150
130     150
135     170
140     175

Using the Excel TTEST worksheet function:
=TTEST(A7:A13,B7:B13,2,2)
t-test p-value = 0.00253
Single-factor ANOVA = 1-way ANOVA
Two-factor ANOVA = 2-way ANOVA

Anova: Single Factor

SUMMARY
Groups     Count   Sum    Average    Variance
Drug       7       875    125        116.6667
No drug    7       1065   152.1429   240.4762

ANOVA
Source of Variation   SS         df   MS         F       P-value   F crit
Between Groups        2578.571   1    2578.571   14.44   0.00253   4.747225
Within Groups         2142.857   12   178.5714
Total                 4721.429   13
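To see why the ANOVA p-value matches the t-test p-value, the sums of squares can be worked out by hand. The sketch below is an illustration in plain Python (the helper name `one_way_anova` is made up here); with two groups, the F statistic is the square of the equal-variance t statistic, so both tests lead to the same p-value.

```python
from statistics import mean

groups = {
    "Drug": [110, 115, 120, 125, 130, 135, 140],
    "No drug": [130, 145, 145, 150, 150, 170, 175],
}

def one_way_anova(samples):
    """Return (SS between, SS within, df between, df within, F)."""
    all_values = [x for s in samples for x in s]
    grand_mean = mean(all_values)
    # Between groups: group size times squared distance of group mean from grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within groups: squared distance of each value from its own group mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

ss_b, ss_w, df_b, df_w, f_stat = one_way_anova(list(groups.values()))
print(f"SS between = {ss_b:.3f}, SS within = {ss_w:.3f}, F = {f_stat:.2f}")
# With two groups, sqrt(F) is the absolute value of the t statistic.
print(f"sqrt(F) = {f_stat ** 0.5:.2f}")
```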


Use single-factor (1-way) analysis of variance (ANOVA) to determine if two or more groups are different.

Compare the blood pressure of patients taking three different drugs, Drug A, B and C.

Drug A   Drug B   Drug C
110      105      130
115      115      145
120      125      145
125      125      150
130      125      150
135      140      170
140      140      175

[Chart: Blood pressure (y-axis, 0 to 200) by patient (x-axis); series: Drug A, Drug B, Drug C]

The p-value for the ANOVA is the probability of seeing differences at least this large
between the groups if the drugs had no effect at all.
ANOVA tells us how likely it is that the observed differences between the groups
are just due to random differences in sampling the groups, in the absence of any effect of the drug.
If at least one group is clearly different from the others, ANOVA gives us a small p-value.
Using Menu: Tools / Data Analysis / Anova: Single Factor

SUMMARY
Groups    Count   Sum    Average    Variance
Drug A    7       875    125        116.6667
Drug B    7       875    125        158.3333
Drug C    7       1065   152.1429   240.4762

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        3438.095   2    1719.048   10.00462   0.001198   3.554561
Within Groups         3092.857   18   171.8254
Total                 6530.952   20

Looking at the SUMMARY table, we notice that the average for Drug C is 152.1429, while
the average for each of the other two groups is 125.
The ANOVA table tells us that the P-value is 0.001198, which means that it is very unlikely
we would see this big a difference between the three groups just by chance.
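The F statistic and p-value for the three-drug comparison can be reproduced by hand. The sketch below is only an illustration in plain Python. It relies on a special case: when the numerator degrees of freedom equal exactly 2 (three groups), the upper tail of the F distribution has a simple closed form, so no statistics library is needed. For other numbers of groups a statistics package would be used instead.

```python
from statistics import mean

drug_a = [110, 115, 120, 125, 130, 135, 140]
drug_b = [105, 115, 125, 125, 125, 140, 140]
drug_c = [130, 145, 145, 150, 150, 170, 175]
samples = [drug_a, drug_b, drug_c]

values = [x for s in samples for x in s]
grand_mean = mean(values)
ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
df_between = len(samples) - 1            # 2 for three groups
df_within = len(values) - len(samples)   # 18 for 21 readings
f_stat = (ss_between / df_between) / (ss_within / df_within)

# Special case only: with numerator df == 2, the upper tail of the
# F distribution is (1 + 2F/df_within) ** (-df_within / 2).
assert df_between == 2
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)
print(f"F = {f_stat:.5f}, p = {p_value:.6f}")
```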

Use two-factor (2-way) Analysis of variance to determine if either of two factors affects the outcomes
Suppose we think that two factors, age and drug, may affect the patient's response.
Use the Excel menu: Tools/Data Analysis/ ANOVA 2-factor without replication

               Factor: Drug
Factor: Age    Drug A   Drug B   Drug C
Under 21       118      128      135
21 to 55       120      130      136
Over 55        121      130      134

Anova: Two-Factor Without Replication

SUMMARY     Count   Sum   Average    Variance
Under 21    3       381   127        73
21 to 55    3       386   128.6667   65.33333
Over 55     3       385   128.3333   44.33333

Drug A      3       359   119.6667   2.333333
Drug B      3       388   129.3333   1.333333
Drug C      3       405   135        1

ANOVA
Source of Variation   SS             df   MS         F          P-value    F crit
Rows                  4.6666666667   2    2.333333   2          0.25       6.944272
Columns               360.66666667   2    180.3333   154.5714   0.000163   6.944272
Error                 4.6666666667   4    1.166667
Total                 370            8

The analysis indicates that there is a significant difference among the columns (Factor: Drug), with a p-value of 0.000163
The analysis indicates that there is NOT a significant difference among the rows (Factor: age), with a p-value of 0.25
We might be concerned that we only treated three people with each drug, and feel that we would like more replicates.
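The Rows, Columns and Error lines of a two-factor ANOVA without replication come from splitting the total variation three ways. The sketch below is an illustration in plain Python, not Excel's own procedure. Because each factor has three levels here (numerator df of 2), the p-values can again use the closed form of the F upper tail that applies only in that special case.

```python
from statistics import mean

# Rows are age groups, columns are drugs (A, B, C), one value per cell.
table = {
    "Under 21": [118, 128, 135],
    "21 to 55": [120, 130, 136],
    "Over 55":  [121, 130, 134],
}
rows = list(table.values())
cols = list(zip(*rows))
n_rows, n_cols = len(rows), len(cols)
grand_mean = mean(x for r in rows for x in r)

ss_rows = n_cols * sum((mean(r) - grand_mean) ** 2 for r in rows)
ss_cols = n_rows * sum((mean(c) - grand_mean) ** 2 for c in cols)
ss_total = sum((x - grand_mean) ** 2 for r in rows for x in r)
ss_error = ss_total - ss_rows - ss_cols   # what rows and columns leave unexplained

df_rows, df_cols = n_rows - 1, n_cols - 1
df_error = df_rows * df_cols
ms_error = ss_error / df_error
f_rows = (ss_rows / df_rows) / ms_error
f_cols = (ss_cols / df_cols) / ms_error

# Numerator df is 2 for both factors, so the F upper tail is
# (1 + 2F/df_error) ** (-df_error / 2). This shortcut holds only for df == 2.
p_rows = (1 + 2 * f_rows / df_error) ** (-df_error / 2)
p_cols = (1 + 2 * f_cols / df_error) ** (-df_error / 2)
print(f"Rows (age):     SS = {ss_rows:.4f}, F = {f_rows:.4f}, p = {p_rows:.6f}")
print(f"Columns (drug): SS = {ss_cols:.4f}, F = {f_cols:.4f}, p = {p_cols:.6f}")
```

Note that the error term has only 4 degrees of freedom, which is one way to see why more replicates would be desirable.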


2-way ANOVA with replicates


Use two-factor (2-way) Analysis of variance to determine if either of two factors affects the outcomes
Include replicates to increase confidence in the results

Suppose we think that two factors, age and drug, may affect the patient's response.

               Factor: Drug
Factor: Age    Drug A   Drug B   Drug C
Under 21       118      128      135
               117      126      136
               110      125      130
               118      131      135
21 to 55       120      130      136
               118      132      136
               121      129      140
               124      130      131
Over 55        121      130      134
               122      128      131
               119      130      138
               127      135      140

Use the Excel menu: Tools / Data Analysis / Anova: Two-Factor With Replication

Anova: Two-Factor With Replication

SUMMARY     Drug A     Drug B     Drug C     Total

Under 21
Count       4          4          4          12
Sum         463        510        536        1509
Average     115.75     127.5      134        125.75
Variance    14.91667   7          7.333333   70.20455

21 to 55
Count       4          4          4          12
Sum         483        521        543        1547
Average     120.75     130.25     135.75     128.9167
Variance    6.25       1.583333   13.58333   47.7197

Over 55
Count       4          4          4          12
Sum         489        523        543        1555
Average     122.25     130.75     135.75     129.5833
Variance    11.58333   8.916667   16.25      43.90152

Total
Count       12         12         12
Sum         1435       1554       1622
Average     119.5833   129.5      135.1667
Variance    17.35606   7          10.87879

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Sample                100.6667   2    50.33333   5.182078   0.012453   3.354131
Columns               1493.167   2    746.5833   76.86463   7.14E-12   3.354131
Interaction           24.66667   4    6.166667   0.63489    0.641998   2.727765
Within                262.25     27   9.712963
Total                 1880.75    35

The analysis indicates that there is NOT a significant interaction between the rows (Factor: Age) and the columns (Factor: Drug), with a p-value of 0.64.
The analysis indicates that there is a significant difference among the columns (Factor: Drug), with a p-value of 7.14E-12.
The analysis indicates that there is a significant difference among the rows (Factor: Age), with a p-value of 0.012.
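With replicates, the total variation splits into four parts: Sample (age), Columns (drug), Interaction and Within. The sketch below is an illustration in plain Python of how those sums of squares and F ratios arise from the age-by-drug data with four replicates per cell; the dictionary layout and variable names are choices made for this sketch, not anything taken from Excel.

```python
from statistics import mean

# data[age][drug] holds the four replicate readings for that cell.
data = {
    "Under 21": {"A": [118, 117, 110, 118], "B": [128, 126, 125, 131], "C": [135, 136, 130, 135]},
    "21 to 55": {"A": [120, 118, 121, 124], "B": [130, 132, 129, 130], "C": [136, 136, 140, 131]},
    "Over 55":  {"A": [121, 122, 119, 127], "B": [130, 128, 130, 135], "C": [134, 131, 138, 140]},
}
ages = list(data)
drugs = ["A", "B", "C"]
reps = 4

everything = [x for a in ages for d in drugs for x in data[a][d]]
grand = mean(everything)
age_mean = {a: mean(x for d in drugs for x in data[a][d]) for a in ages}
drug_mean = {d: mean(x for a in ages for x in data[a][d]) for d in drugs}

ss_sample = reps * len(drugs) * sum((age_mean[a] - grand) ** 2 for a in ages)
ss_columns = reps * len(ages) * sum((drug_mean[d] - grand) ** 2 for d in drugs)
# Within: spread of replicates around their own cell mean.
ss_within = sum((x - mean(data[a][d])) ** 2 for a in ages for d in drugs for x in data[a][d])
ss_total = sum((x - grand) ** 2 for x in everything)
# Interaction: what the cell means show beyond the two main effects.
ss_interaction = ss_total - ss_sample - ss_columns - ss_within

df_interaction = (len(ages) - 1) * (len(drugs) - 1)
df_within = len(ages) * len(drugs) * (reps - 1)
ms_within = ss_within / df_within
for name, ss, df in [("Sample (age)", ss_sample, len(ages) - 1),
                     ("Columns (drug)", ss_columns, len(drugs) - 1),
                     ("Interaction", ss_interaction, df_interaction)]:
    print(f"{name:15s} SS = {ss:9.4f}  df = {df}  F = {(ss / df) / ms_within:.4f}")
```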

2-way ANOVA with replicates, test for interaction


Use two-factor (2-way) Analysis of variance to determine if either of two factors affects the outcomes
Include replicates to increase confidence in the results

Suppose we think that two factors, age and drug, may affect the patient's response.

               Factor: Drug
Factor: Age    Drug A   Drug B
Under 21       118      120
               118      118
               110      121
               118      123
21 to 55       120      118
               118      117
               121      110
               125      118

What is the effect of drug A vs drug B? Depends on age


What is the effect of age? Depends on drug
Therefore, there is an interaction between age and drug
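A quick way to see the interaction is to compare the drug effect (Drug B minus Drug A) inside each age group with the drug effect after pooling both age groups. The sketch below is a small illustration in plain Python using the cell data above; the variable names are made up for this example.

```python
from statistics import mean

cells = {
    ("Under 21", "Drug A"): [118, 118, 110, 118],
    ("Under 21", "Drug B"): [120, 118, 121, 123],
    ("21 to 55", "Drug A"): [120, 118, 121, 125],
    ("21 to 55", "Drug B"): [118, 117, 110, 118],
}
cell_mean = {key: mean(values) for key, values in cells.items()}

# Drug effect (B minus A) inside each age group.
effect_young = cell_mean[("Under 21", "Drug B")] - cell_mean[("Under 21", "Drug A")]
effect_mid = cell_mean[("21 to 55", "Drug B")] - cell_mean[("21 to 55", "Drug A")]

# Drug means pooled over both age groups: the opposite effects nearly cancel.
pooled_a = mean(cells[("Under 21", "Drug A")] + cells[("21 to 55", "Drug A")])
pooled_b = mean(cells[("Under 21", "Drug B")] + cells[("21 to 55", "Drug B")])

print(f"B - A, Under 21: {effect_young:+.2f}")
print(f"B - A, 21 to 55: {effect_mid:+.2f}")
print(f"B - A, pooled:   {pooled_b - pooled_a:+.3f}")
```

The two within-group effects have opposite signs, while the pooled difference is close to zero. That pattern is what the Interaction line of the ANOVA tests for.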
Anova: Two-Factor With Replication

SUMMARY     Drug A     Drug B     Total

Under 21
Count       4          4          8
Sum         464        482        946
Average     116        120.5      118.25
Variance    16         4.333333   14.5

21 to 55
Count       4          4          8
Sum         484        463        947
Average     121        115.75     118.375
Variance    8.666667   14.91667   17.98214

Total
Count       8          8
Sum         948        945
Average     118.5      118.125
Variance    17.71429   14.69643

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Sample                0.0625     1    0.0625     0.005693   0.9411     4.747225
Columns               0.5625     1    0.5625     0.051233   0.82474    4.747225
Interaction           95.0625    1    95.0625    8.658444   0.012314   4.747225
Within                131.75     12   10.97917
Total                 227.4375   15

Total

605.4375

15

The analysis indicates that there IS an interaction between the rows (Factor: Age) and the columns (Factor: Drug), with a p-value of 0.01.
The analysis indicates that there is NOT a significant difference among the columns (Factor: Drug), with a p-value of 0.82.
The analysis indicates that there is NOT a significant difference among the rows (Factor: Age), with a p-value of 0.94.
Is it correct to conclude that neither age nor drug has any effect?
How should we analyze the data to determine the effect(s), if any, of age and drug?
What would we have concluded if we had not tested for the interaction between age and drug?
Because we have a significant interaction, we have to look at the effect of the drug separately in each age group.

Under 21
           Drug A   Drug B
           118      120
           118      118
           110      121
           118      123
Average    116      120.5
Drug B > Drug A if under 21
t-test for drug effect in "Under 21" age group separately: p-value = 0.032785

21 to 55
           Drug A   Drug B
           120      118
           118      117
           121      110
           125      118
Average    121      115.75
Drug B < Drug A if 21 to 55
t-test for drug effect in "21 to 55" age group separately: p-value = 0.02351

If we do not test for interaction, we would conclude that the drug has no effect.
If we test for interaction, we learn that the drug has different effects in different age groups, and that
the effect is significant, but in opposite directions, in each age group.


If you have a missing value, the number of observations in each treatment condition is unequal.
This situation is called an unbalanced design.
Excel's two-factor ANOVA tool cannot work on unbalanced designs.
You get an error message saying "Input range contains non-numeric data", because one of the cells is empty.
If you have an unbalanced design (missing values), use another statistics package, such as R, that will handle them.
Example: missing value in lower right cell of data.

               Factor: Drug
Factor: Age    Drug A   Drug B   Drug C
Under 21       118      128      135
               117      126      136
               110      125      130
               118      131      135
21 to 55       120      130      136
               118      132      136
               121      129      140
               124      130      131
Over 55        121      130      134
               122      128      131
               119      130      138
               127      135      (empty)


PatientID   Drug     Age Group   Response
P1          Drug A   Under 21    118
P2          Drug A   Under 21    117
P3          Drug A   Under 21    110
P4          Drug A   Under 21    118
P5          Drug A   21 to 55    120
P6          Drug A   21 to 55    118
P7          Drug A   21 to 55    121
P8          Drug A   21 to 55    124
P9          Drug A   Over 55     121
P10         Drug A   Over 55     122
P11         Drug A   Over 55     119
P12         Drug A   Over 55     127
P13         Drug B   Under 21    128
P14         Drug B   Under 21    126
P15         Drug B   Under 21    125
P16         Drug B   Under 21    131
P17         Drug B   21 to 55    130
P18         Drug B   21 to 55    132
P19         Drug B   21 to 55    129
P20         Drug B   21 to 55    130
P21         Drug B   Over 55     130
P22         Drug B   Over 55     128
P23         Drug B   Over 55     130
P24         Drug B   Over 55     135
P25         Drug C   Under 21    135
P26         Drug C   Under 21    136
P27         Drug C   Under 21    130
P28         Drug C   Under 21    135
P29         Drug C   21 to 55    136
P30         Drug C   21 to 55    136
P31         Drug C   21 to 55    140
P32         Drug C   21 to 55    131
P33         Drug C   Over 55     134
P34         Drug C   Over 55     131
P35         Drug C   Over 55     138
P36         Drug C   Over 55     140
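Other statistics packages usually expect the one-row-per-patient ("long") layout shown in this table rather than the wide worksheet layout. The sketch below is an illustration in plain Python of converting from one to the other; the function name `to_long` and the use of `None` to mark a missing reading are assumptions made for this example.

```python
# Wide layout from the worksheet: wide[drug][age] -> four replicate readings.
wide = {
    "Drug A": {"Under 21": [118, 117, 110, 118],
               "21 to 55": [120, 118, 121, 124],
               "Over 55":  [121, 122, 119, 127]},
    "Drug B": {"Under 21": [128, 126, 125, 131],
               "21 to 55": [130, 132, 129, 130],
               "Over 55":  [130, 128, 130, 135]},
    "Drug C": {"Under 21": [135, 136, 130, 135],
               "21 to 55": [136, 136, 140, 131],
               "Over 55":  [134, 131, 138, 140]},
}

def to_long(wide_table):
    """Flatten to one record per patient: (PatientID, Drug, Age Group, Response)."""
    records = []
    for drug, by_age in wide_table.items():
        for age, readings in by_age.items():
            for value in readings:
                if value is None:
                    # A missing reading is skipped instead of leaving an empty cell.
                    continue
                records.append((f"P{len(records) + 1}", drug, age, value))
    return records

long_rows = to_long(wide)
print(len(long_rows), long_rows[0], long_rows[-1])
```

In the long layout a missing reading simply means one fewer row, so an unbalanced design no longer leaves a hole in the data range.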

Repeated measures ANOVA


Recall that we used the paired t-test when each patient was measured before and after treatment,
or the same patient got two different treatments (drug A vs. drug B).
The advantage of the paired t-test is that we compare treatments within patients,
which controls for differences between patients.
The paired t-test only works for comparing two treatments, or two time points (before and after)
If we want to compare three or more treatments, we use Repeated Measures ANOVA, which is
the extension of the paired t-test.

In Excel, we perform a repeated measures ANOVA using the Data Analysis menu item "ANOVA: Two-factor without replication".
Essentially, we consider treatment (the drug) to be one factor, and patient to be the second factor.

PatientID   Drug A   Drug B   Drug C   Sum   Average
1           118      110      125      353   117.6667
2           117      126      117      360   120
3           110      125      110      345   115
4           118      121      131      370   123.3333
5           120      130      130      380   126.6667
6           118      132      132      382   127.3333
7           121      119      129      369   123
8           124      130      130      384   128
9           121      130      118      369   123
10          122      128      121      371   123.6667
11          119      130      130      379   126.3333
12          127      135      135      397   132.3333

Anova: Two-Factor Without Replication

SUMMARY   Count   Sum    Average    Variance
1         3       353    117.6667   56.33333
2         3       360    120        27
3         3       345    115        75
4         3       370    123.3333   46.33333
5         3       380    126.6667   33.33333
6         3       382    127.3333   65.33333
7         3       369    123        28
8         3       384    128        12
9         3       369    123        39
10        3       371    123.6667   14.33333
11        3       379    126.3333   40.33333
12        3       397    132.3333   21.33333

Drug A    12      1435   119.5833   17.35606
Drug B    12      1516   126.3333   46.78788
Drug C    12      1508   125.6667   56.78788

We see evidence of a difference in means among the drugs, but we need to consider variance,
so we look at the ANOVA p-value.

ANOVA
Source of Variation   SS                df   MS         F          P-value        F crit
Rows                  745.6388888889    11   67.78535   2.550889   0.0296064203   2.258518
Columns               332.0555555556    2    166.0278   6.247933   0.0070992405   3.443357
Error                 584.6111111111    22   26.57323
Total                 1662.3055555556   35

The ANOVA p-value for patients (rows) is 0.0296, which indicates that patients differ.
The ANOVA p-value for the drugs (columns) is 0.0071, which indicates that the drugs differ.
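The benefit of treating patient as a factor can be made concrete by computing the F ratio for the drugs twice: once with patient-to-patient variation removed from the error term (the repeated measures analysis), and once with the patient identity ignored. The sketch below is an illustration in plain Python; the second calculation is included only for contrast and is not part of the worksheet.

```python
from statistics import mean

# One row per patient: readings under Drug A, Drug B, Drug C.
readings = [
    (118, 110, 125), (117, 126, 117), (110, 125, 110), (118, 121, 131),
    (120, 130, 130), (118, 132, 132), (121, 119, 129), (124, 130, 130),
    (121, 130, 118), (122, 128, 121), (119, 130, 130), (127, 135, 135),
]
n_patients, n_drugs = len(readings), len(readings[0])
grand = mean(x for row in readings for x in row)
columns = list(zip(*readings))

ss_patients = n_drugs * sum((mean(row) - grand) ** 2 for row in readings)
ss_drugs = n_patients * sum((mean(col) - grand) ** 2 for col in columns)
ss_total = sum((x - grand) ** 2 for row in readings for x in row)
ss_error = ss_total - ss_patients - ss_drugs

df_drugs = n_drugs - 1
df_error = (n_patients - 1) * (n_drugs - 1)
f_repeated = (ss_drugs / df_drugs) / (ss_error / df_error)

# Ignoring who each reading came from folds patient variation into the error term.
df_pooled = n_patients * n_drugs - n_drugs
f_independent = (ss_drugs / df_drugs) / ((ss_error + ss_patients) / df_pooled)

print(f"F for drugs, patients as a factor: {f_repeated:.4f} on ({df_drugs}, {df_error}) df")
print(f"F for drugs, patients ignored:     {f_independent:.4f} on ({df_drugs}, {df_pooled}) df")
```

The first F ratio is the one in the Columns row of the ANOVA table. The second is smaller because differences between patients inflate the error term when they are not accounted for.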

We probably want to know which of the three drugs are different from each other.
When we compare pairs of treatments after an ANOVA, it is called "post-hoc" comparisons.
Excel doesn't have statistical tests that correct the p-values for doing multiple post-hoc comparisons.
To get p-values corrected for multiple comparisons we should use other software.
However, for now we'll use multiple t-tests, even though this method can give more false-positive results.
PatientID   Drug A   Drug B   Drug C
1           118      110      125
2           117      126      117
3           110      125      110
4           118      121      131
5           120      130      130
6           118      132      132
7           121      119      129
8           124      130      130
9           121      130      118
10          122      128      121
11          119      130      130
12          127      135      135

t-test for A vs B: p-value = 0.007941
t-test for A vs C: p-value = 0.022839
t-test for B vs C: p-value = 0.822582

It appears that A differs from B and from C (both p-values are below 0.05).
B and C do not appear to differ.
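One simple way to correct p-values for multiple post-hoc comparisons is the Bonferroni adjustment, which multiplies each p-value by the number of comparisons. The worksheet does not name a correction method, so Bonferroni is an assumption chosen here for illustration, and the function name `bonferroni` is made up. The sketch applies it to the three unadjusted p-values reported above.

```python
# Unadjusted p-values reported for the three pairwise t-tests.
raw_p = {"A vs B": 0.007941, "A vs C": 0.022839, "B vs C": 0.822582}

def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}

adjusted = bonferroni(raw_p)
for pair, p in adjusted.items():
    print(f"{pair}: raw p = {raw_p[pair]:.6f}, Bonferroni p = {p:.6f}")
```

Under this adjustment the A vs B comparison stays below 0.05, while A vs C rises above it. That shows how uncorrected multiple t-tests can overstate the evidence.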

