Purpose: to test hypotheses concerning the frequency distribution of one or more populations.

The test does not prove that a hypothesis is correct; it evaluates to what extent the data and the hypothesis have a good fit.

If the degrees of freedom = 1: apply Yates' correction for continuity.
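For reference, the continuity-corrected statistic is commonly written as below. The symbols $O_i$ (observed frequency), $E_i$ (expected frequency) and $p$ (number of classes) are notation assumed here for illustration; the correction subtracts 0.5 from each absolute deviation before squaring.

```latex
\chi^2_{c} = \sum_{i=1}^{p} \frac{\left( |O_i - E_i| - 0.5 \right)^2}{E_i}
```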

Quantitative data
One or more categories
Independent and mutually exclusive measurements
Adequate sample size (at least 10)
Random sampling
Data in frequency form (Not % or ratio)
All observations must be used

Expected values can be obtained in 3 ways:
a) Chance (probability)
b) A priori theory or hypothesis
c) Existing data and research

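Size of the expected values:
No expected value should be less than 1.
No more than 20% of the expected values should be less than 5; otherwise the chi-square approximation is INACCURATE.
Fisher's Exact Test: use when a cell count is < 5.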

The probability density curve of a chi-square distribution is an asymmetric curve stretching over the positive side of the line and having a long right tail.

The form of the curve depends on the value of the degrees of freedom.
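How the shape changes with the degrees of freedom can be seen by evaluating the density directly. The sketch below uses the standard chi-square density formula; the function name `chi2_pdf` and the evaluation points are illustrative choices, not part of the slides.

```python
import math

def chi2_pdf(x: float, df: int) -> float:
    """Chi-square probability density at x for df degrees of freedom."""
    if x <= 0:
        # Simplification: treat the density as 0 at and below zero.
        return 0.0
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))

# Print the density at a few points for several degrees of freedom.
for df in (1, 2, 4, 8):
    values = [round(chi2_pdf(x, df), 4) for x in (0.5, 2, 5, 10)]
    print(df, values)
```

With small degrees of freedom the mass sits close to zero; as the degrees of freedom increase, the peak moves to the right and the right tail carries more mass.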

Critical values for chi-square are found in tables, sorted by degrees of freedom and probability levels.

chi-square value > critical value: reject the null hypothesis.
chi-square value < critical value: fail to reject the null hypothesis.
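The decision rule can be sketched as a small lookup. The critical values below are the 5% values quoted in these slides (3.84, 5.99, 7.81 and 9.49 for 1 to 4 degrees of freedom); the names `CRITICAL_5PCT` and `decide` are hypothetical, and treating a value exactly equal to the critical value as "fail to reject" is an assumption.

```python
# 5% critical values quoted in these slides, keyed by degrees of freedom.
CRITICAL_5PCT = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}

def decide(chi_square: float, df: int) -> str:
    """Compare a computed chi-square value with the tabular 5% value."""
    critical = CRITICAL_5PCT[df]
    if chi_square > critical:
        return "reject Ho"
    # Assumption: equality is treated as failing to reject.
    return "fail to reject Ho"

print(decide(45.26, 1))
print(decide(2.01, 2))
```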

Analysis of attribute data


Test for a fixed-ratio hypothesis
Test for independence in a contingency table
Test for homogeneity of ratio

Goodness of fit tests

Attribute data: data with a finite number of discrete classes.
E.g.:
Two classes: presence or absence of an attribute (male or female, success or failure, effective or ineffective, dead or alive)
More than two classes: varietal classification, color classification, tenure status of farmers.

Example:
In rice, the green leafhopper is suspected to differ in
feeding preference between an already diseased
plant and a healthy plant. The researcher, therefore,
encloses a prescribed number of green leafhoppers in
a cage that holds an equal number of healthy and
diseased rice plants. After 2 hours of caging, he then
counts the number of insects found on diseased and
on healthy plants. Of 239 insects confined, 67 were
found on the healthy plants and 172 on the diseased
plants. Does the observed ratio of 67:172 deviate
significantly from the hypothesized no-preference
ratio of 1:1?

Hypothesis:
Ho: There is no significant difference in the feeding preference of the leafhoppers (i.e., the ratio is 1:1).
HA: There is a significant difference in the feeding preference of the leafhoppers.
Table 1: Feeding Preference Test for the Green Leafhoppers on Diseased and Healthy Rice Plants.

Feeding preference    Leafhoppers (no.)
Healthy plants        67
Diseased plants       172
Total                 239

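Expected value:

$$E_i = \frac{r_i \, n}{r_1 + r_2 + \cdots + r_p}, \qquad n = n_1 + n_2 + \cdots + n_p$$

$r_1 : r_2 : \cdots : r_p$ = hypothesized ratio

Chi-square value for two classes, with the continuity correction:

$$\chi^2 = \frac{(|n_1 - E_1| - 0.5)^2}{E_1} + \frac{(|n_2 - E_2| - 0.5)^2}{E_2}$$

$|\;|$ : absolute value
$n_1$ and $n_2$ : observed values
$E_1$ and $E_2$ : expected values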

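Two classes: 67 insects on healthy plants and 172 on diseased plants
$r_1 = r_2 = 1$, $n_1 = 67$, $n_2 = 172$, $n = 239$
$E_1 = E_2 = 239/2 = 119.5$
$\chi^2 = (|67 - 119.5| - 0.5)^2/119.5 + (|172 - 119.5| - 0.5)^2/119.5 = 45.26$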

Degrees of freedom: $(p - 1) = 2 - 1 = 1$
Tabular $\chi^2$ value: 3.84 at the 5% level of significance
45.26 > 3.84
The Ho hypothesis of no preference is rejected.
72% of the insects were on diseased plants, 28% on healthy plants.
Conclusion: Green leafhoppers preferred diseased plants over healthy plants.
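The fixed-ratio calculation for the leafhopper data can be reproduced with a few lines. This is a minimal sketch, assuming the continuity correction of 0.5 that applies with 1 degree of freedom; the function name `fixed_ratio_chi_square` and its `yates` flag are hypothetical.

```python
def fixed_ratio_chi_square(observed, ratio, yates=False):
    """Chi-square test statistic for a hypothesized fixed ratio.

    observed: counts per class; ratio: hypothesized ratio per class.
    yates: subtract 0.5 from each absolute deviation (use with 1 d.f.).
    """
    n = sum(observed)
    expected = [n * r / sum(ratio) for r in ratio]
    correction = 0.5 if yates else 0.0
    chi_square = sum(
        (abs(o - e) - correction) ** 2 / e for o, e in zip(observed, expected)
    )
    return expected, chi_square

# 67 insects on healthy plants, 172 on diseased plants, 1:1 hypothesis.
expected, chi_sq = fixed_ratio_chi_square([67, 172], [1, 1], yates=True)
print(expected, round(chi_sq, 2))  # [119.5, 119.5] 45.26
```

Without the correction (`yates=False`) the same data give about 46.13, so the reported 45.26 corresponds to the corrected statistic.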

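/* One-way chi-square test of the 1:1 ratio for the leafhopper counts */
DATA LEAFHOP;
  INPUT PLANT $ NUM;   /* PLANT: HE = healthy, DI = diseased; NUM = insect count */
  DATALINES;
HE 67
DI 172
;
PROC FREQ DATA=LEAFHOP;
  WEIGHT NUM;          /* NUM holds cell frequencies, not single observations */
  TABLES PLANT / CHISQ TESTP=(50 50);   /* test against 50% : 50% */
RUN;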

Two classification criteria
Row variable with r classes, column variable with c classes
r × c contingency table

Hypothesis:
Ho: There is independence between the two classification criteria in an r × c contingency table.
HA: There is association between the two classification criteria in an r × c contingency table.

Table 2: Frequencies of Farmers Classified According to Two Categories: Tenure Status and Adoption of New Rice Varieties

                      Farmers, no.
Tenure Status         Adopter   Nonadopter   Row Total (R)
Owner operator        102       26           128
Share-rent farmer     42        10           52
Fixed-rent farmer     4         3            7
Column total (C)      148       39
Grand total (G)                              187

Compute the expected value of each of the r × c cells as:

$$E_{ij} = \frac{R_i \, C_j}{G}$$

where
$E_{ij}$ = expected value of the (i, j)th cell
$R_i$ = total of the i th row
$C_j$ = total of the j th column
$G$ = grand total

Table 3: The Expected Values for the Data in Table 2 under the Hypothesis of Independence

Tenure Status         Adopter   Nonadopter
Owner operator        101.3     26.7
Share-rent farmer     41.2      10.8
Fixed-rent farmer     5.5       1.5
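The expected values can be recomputed from the observed counts, and the earlier guideline on small expected values (none below 1, no more than 20% below 5) can be checked at the same time. This is a sketch: the fixed-rent counts (4 adopters, 3 nonadopters) are taken from the SAS data step for this example, and the variable names are illustrative.

```python
# Observed counts: rows = owner operator, share-rent, fixed-rent;
# columns = adopter, nonadopter.
observed = [[102, 26], [42, 10], [4, 3]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected value of each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Guideline check on the size of the expected values.
flat = [e for row in expected for e in row]
min_expected = min(flat)
share_below_5 = sum(e < 5 for e in flat) / len(flat)
rule_ok = min_expected >= 1 and share_below_5 <= 0.20

for row in expected:
    print([round(e, 1) for e in row])
print(round(min_expected, 1), round(share_below_5, 3), rule_ok)
```

Only one of the six cells has an expected value below 5, and none is below 1, so the guideline is met for this table.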

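Compute the $\chi^2$ value:

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(n_{ij} - E_{ij})^2}{E_{ij}} = 2.01$$

where $n_{ij}$ and $E_{ij}$ are the observed and expected values of the (i, j)th cell.

Degrees of freedom: $(r - 1)(c - 1) = (2)(1) = 2$
Tabular $\chi^2$ value: 5.99 at the 5% level of significance
2.01 < 5.99
Ho is not rejected.
Conclusion: The data give no evidence of association between adoption and tenure status of the farmers; they are consistent with independence.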
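The test statistic for the tenure-by-adoption table can be recomputed as follows. This is a sketch under two assumptions: the fixed-rent counts (4 and 3) come from the SAS data step, and the reported 2.01 is obtained when the expected values are first rounded to one decimal, as in Table 3. The function name and the `expected_decimals` option are hypothetical.

```python
def independence_chi_square(observed, expected_decimals=None):
    """Pearson chi-square for an r x c table of observed counts.

    expected_decimals: if given, round each expected value to that many
    decimals first (added here only to mimic a hand calculation).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, n_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / grand_total
            if expected_decimals is not None:
                e_ij = round(e_ij, expected_decimals)
            chi_square += (n_ij - e_ij) ** 2 / e_ij
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_square, df

table2 = [[102, 26], [42, 10], [4, 3]]
chi_rounded, df = independence_chi_square(table2, expected_decimals=1)
chi_exact, _ = independence_chi_square(table2)
print(df, round(chi_rounded, 2), round(chi_exact, 2))  # 2 2.01 2.16
```

With unrounded expected values the statistic is about 2.16; either way it is well below 5.99, so the conclusion is unchanged.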

DATA INDE;
  INPUT ADOPT $ TENURE $ NUMBER;
  DATALINES;
AD OO 102
AD SF 42
AD FF 4
NO OO 26
NO SF 10
NO FF 3
;
PROC FREQ;
  TABLES ADOPT*TENURE / EXACT;
  WEIGHT NUMBER;
RUN;

Attribute data collected in trials that are repeated several times
To determine how to pool the information over all trials
Example:
The preference study of the green leafhoppers in rice could be repeated over time or could be repeated in several cages at the same time.

Table 4: Application of the Chi-Square Test for Homogeneity of 1:1 Ratio on the Number of Green Leafhoppers Found on Diseased and Healthy Rice Plants in Four Trials

            Total      Observed Values       Expected Values
Trial No.   Insects    Healthy   Diseased    Healthy   Diseased    $\chi^2$ Value
1           239        67        172         119.5     119.5       45.26
2           183        74        109         91.5      91.5        6.32
3           171        54        117         85.5      85.5        22.48
4           301        97        204         150.5     150.5       37.33
Total       894        292       602         447.0     447.0       106.80 = $\chi^2_T$
Sum                                                                111.39 = $\chi^2_s$

If computed $\chi^2_d \le$ tabular $\chi^2$:
Data from the total of s trials can be pooled, and the $\chi^2_s$ value is compared to the tabular $\chi^2$ value with s d.f.

If computed $\chi^2_d >$ tabular $\chi^2$:
Data from the s trials are heterogeneous (i.e., the s data sets do not share a common ratio) and cannot be pooled.

Value for additivity:

$$\chi^2_d = \chi^2_s - \chi^2_T = 111.39 - 106.80 = 4.59$$

Degrees of freedom: $(s - 1) = 3$
4.59 < 7.81 at the 5% level of significance
Conclusion: The data can be pooled.

The $\chi^2_s$ value is compared with the tabular $\chi^2$ value with 4 d.f.
111.39 > 9.49 at the 5% significance level
The hypothesis of no preference (1:1 ratio) is rejected.
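The homogeneity calculation over the four trials can be sketched as below. The per-trial values are rounded to two decimals before summing, which is an assumption made to match the tabulated figures; the helper name `chi_square_1to1` is hypothetical, and the tabular values 7.81 and 9.49 are the ones quoted above.

```python
def chi_square_1to1(healthy, diseased):
    """Continuity-corrected chi-square for a 1:1 ratio (1 d.f.)."""
    expected = (healthy + diseased) / 2
    return sum(
        (abs(count - expected) - 0.5) ** 2 / expected
        for count in (healthy, diseased)
    )

# (healthy, diseased) counts for the four trials.
trials = [(67, 172), (74, 109), (54, 117), (97, 204)]

per_trial = [round(chi_square_1to1(h, d), 2) for h, d in trials]
chi_s = round(sum(per_trial), 2)                 # sum of per-trial values
total_healthy = sum(h for h, _ in trials)
total_diseased = sum(d for _, d in trials)
chi_t = round(chi_square_1to1(total_healthy, total_diseased), 2)  # pooled totals
chi_d = round(chi_s - chi_t, 2)                  # difference, (s - 1) d.f.

poolable = chi_d <= 7.81          # tabular value, 3 d.f., 5% level
reject_no_preference = chi_s > 9.49   # tabular value, 4 d.f., 5% level
print(per_trial, chi_s, chi_t, chi_d, poolable, reject_no_preference)
```

Summing the unrounded per-trial values instead gives about 111.38 and a difference of about 4.58, which leads to the same conclusions.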

DATA LEHO;
  INPUT PLANT $ NUM;
  DATALINES;
H 67
DI 172
H 74
DI 109
H 54
DI 117
H 97
DI 204
;
PROC MIXED;
  CLASSES PLANT;
  MODEL NUM = PLANT;
  REPEATED / GROUP=PLANT;
RUN;

Goodness of fit: how close the observed data are to those predicted from a hypothesis
- to assess the compatibility of the data with Ho
Ho: the individuals are distributed in accordance with some specified proportions
HA: the individuals are not distributed in accordance with the specified proportions
Example: To test the Mendelian genetic model in the self-pollination of snapdragon (Antirrhinum majus)
Red, pink and white in a 1:2:1 ratio, from a single plant

Table 5: Observed and Expected Snapdragon Flower Counts by Colour

Colour    Observed    Expected
Red       52          52.25
Pink      128         104.50
White     29          52.25
Total     209         209

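Expected value = n × ratio (e.g., red: 209 × 1/4 = 52.25)

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

$$\chi^2 = \frac{(52 - 52.25)^2}{52.25} + \frac{(128 - 104.50)^2}{104.50} + \frac{(29 - 52.25)^2}{52.25} = 15.63$$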

df = 2, $\chi^2$ table value = 5.99
$\chi^2$ obtained > table value
Reject Ho
Conclusion: the Mendelian prediction does not hold / fit in this situation.
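The snapdragon calculation can be checked with a short script. This is a minimal sketch with illustrative variable names; no continuity correction is applied because there are 2 degrees of freedom.

```python
# Observed flower counts (red, pink, white) and the Mendelian 1:2:1 ratio.
observed = [52, 128, 29]
ratio = [1, 2, 1]

n = sum(observed)
# Expected value = n * (class ratio / sum of ratios).
expected = [n * r / sum(ratio) for r in ratio]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

print(expected, df, round(chi_square, 2))  # [52.25, 104.5, 52.25] 2 15.63
print("reject Ho" if chi_square > 5.99 else "fail to reject Ho")
```

Most of the statistic comes from the pink and white classes (about 5.28 and 10.35), while the red count agrees almost exactly with its expected value.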
