9.0 Chi-Square distribution

9.1 Introduction
9.1.1 Describe the properties and uses of the Chi-Square Distribution.

The Chi-Square Distribution


So far we have used the standard normal z, t, and F distributions as the test statistics. In
Chapter 13 we will learn how and when to use the Chi-Square as the test statistic.

The chi-square is similar to the t and F distributions in that there is a family of χ² distributions, each with a different shape depending on the number of degrees of freedom.

As the illustration shows, when the number of degrees of freedom is small, the
distribution is positively skewed, but as the number of degrees of freedom increases it
becomes more symmetrical and approaches the normal distribution.
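
The change in shape can be checked numerically. The sketch below relies on standard properties of the chi-square distribution that are not stated in these notes: with df degrees of freedom the mean is df, the variance is 2(df), and the skewness is the square root of 8/df. The function name `chi_square_shape` is only an illustrative choice.

```python
import math

def chi_square_shape(df):
    """Return (mean, variance, skewness) of a chi-square distribution with df degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    mean = df                      # mean equals the degrees of freedom
    variance = 2 * df              # variance is twice the degrees of freedom
    skewness = math.sqrt(8 / df)   # positive skew that shrinks as df grows
    return mean, variance, skewness

# Skewness falls toward 0 (the value for a normal distribution) as df increases.
for df in (1, 5, 10, 50):
    mean, variance, skewness = chi_square_shape(df)
    print(f"df={df:>2}  mean={mean}  variance={variance}  skewness={skewness:.3f}")
```

The skewness is always positive but gets closer to zero for larger df, which matches the description above.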

Chi-square is based on squared deviations between an observed frequency and an expected frequency; therefore, it is never negative.

9.2 Goodness of Fit Test

9.2.1 Describe the purpose of a goodness of fit test

Goodness-of-Fit Tests
In the goodness-of-fit test the χ² distribution is used to determine how well an observed
set of observations fits an expected set of observations.

Goodness-of-fit test: A nonparametric test involving a set of observed frequencies and a corresponding set of expected frequencies.

The purpose of the goodness-of-fit test is to determine if there is a statistical difference
between the two sets of data, one of which is observed and the other expected. Is the
difference due to chance, or can we conclude there is a significant difference between the
two sets of values?
NOTE: Again, the same systematic five-step hypothesis testing procedure is followed in
our solution.
We begin by denoting f0 as the observed set of frequencies in a particular category and fe
as the expected frequency in a particular category.

NOTE: A category is referred to as a cell.

Step 1: State the null and alternate hypotheses:

Step 2: Select the Level of Significance - This is the probability of committing a Type I
error.

Step 3: Select the test statistic. The test statistic is the chi-square statistic.



Step 4: Formulate the decision rule. Find the critical value of χ². For a goodness-of-fit
test with k categories there are k - 1 degrees of freedom. The critical value is found in
Appendix H by locating the number of degrees of freedom in the left column and moving
horizontally to the right to read the value associated with the level of significance.

Step 5: Compute the value of the Chi-square and make your decision. Page 443 of your
text illustrates the procedure for computing the χ² value.

It is not necessary that the expected frequencies be equal to apply the goodness-of-fit
test. The text illustrates the case of unequal frequencies and also gives a practical use of
chi-square.
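
As a rough sketch of Step 5, the chi-square statistic adds up (f0 - fe)² / fe over all cells, and this works whether or not the expected frequencies are equal. The data and hypothesized proportions below are hypothetical, chosen only to illustrate unequal expected frequencies. The function names are illustrative, and the critical value 5.991 is the standard table value for 2 degrees of freedom at the 0.05 level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (f0 - fe)^2 / fe over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of cells")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

def expected_from_proportions(total, proportions):
    """Turn hypothesized proportions into expected frequencies for a sample of size total."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [total * p for p in proportions]

# Hypothetical example: 100 observations in three cells, with H0 claiming
# the cells occur in the proportions 50%, 25%, 25%.
observed = [44, 32, 24]
hypothesized = [0.50, 0.25, 0.25]

expected = expected_from_proportions(sum(observed), hypothesized)  # unequal fe values
chi_square = chi_square_statistic(observed, expected)
df = len(observed) - 1        # k - 1 degrees of freedom
critical_value = 5.991        # table value for df = 2, 0.05 level

print(f"expected = {expected}")
print(f"chi-square = {chi_square:.2f}, df = {df}, critical value = {critical_value}")
print("Reject H0" if chi_square > critical_value else "Do not reject H0")
```

Here the computed value falls below the critical value, so H0 would not be rejected for this made-up sample.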

Examples:

1. A student sells baseball cards for a day. At the end of the day she records the sales of
the six types of cards in a chart as shown below.

Player          Cards Sold
Tom Seaver      13
Nolan Ryan      33
Ty Cobb         14
George Brett    7
Hank Aaron      36
Johnny Beach    17

At the 0.05 significance level, can she conclude the sales are not the same for each player?
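
One possible way to work this example is sketched below. It assumes the null hypothesis is that sales are the same for each player, so every expected frequency is the total divided by six. The critical value 11.070 is the standard table value for 5 degrees of freedom at the 0.05 level.

```python
# Cards sold, in the order listed in the chart above.
observed = [13, 33, 14, 7, 36, 17]

total = sum(observed)                  # 120 cards in all
expected = total / len(observed)       # 20 per player if sales are equal (H0)

# Chi-square statistic: sum of (f0 - fe)^2 / fe over the six cells.
chi_square = sum((fo - expected) ** 2 / expected for fo in observed)

df = len(observed) - 1                 # 6 categories - 1 = 5
critical_value = 11.070                # table value for df = 5, 0.05 level

print(f"chi-square = {chi_square:.1f}, df = {df}, critical value = {critical_value}")
print("Reject H0" if chi_square > critical_value else "Do not reject H0")
```

The computed chi-square of 34.4 exceeds 11.070, so under these assumptions H0 is rejected: the sales are not the same for each player.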

2. A human resources manager records the number of sick days over a week.
The following data was gathered.

Day of the week    Number absent
Monday             12
Tuesday            9
Wednesday          11
Thursday           10
Friday             9
Saturday           9

At the 0.01 significance level, can she conclude that there is no difference
in the absenteeism throughout the six-day workweek?

9.3 Test of Independence

9.3.1 Perform the Chi-Square Test to determine whether two classifications of the same data are independent of each other

Contingency Tables
The χ² distribution is also used to determine if there is a relationship between two or more
criteria of classification.
For example, we may be interested in whether or not there is a relationship between job
advancement within a company and the gender of the employee.

Contingency Table: A table made up of rows and columns. Each box is referred to as a cell.

The usual five-step hypothesis testing procedure is followed. The expected frequency, fe,
for each cell is computed by the formula:

fe = (row total)(column total) / (grand total)

The number of degrees of freedom used to find the critical value for χ² is:

df = (number of rows - 1)(number of columns - 1)

There is a limitation to the use of the χ² distribution. The value of fe should be at least 5
for each cell (box). This requirement is to prevent any cell from carrying an inordinate
amount of weight and causing the null hypothesis to be rejected.
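
The expected-frequency formula, the degrees of freedom, and the "at least 5" check can be sketched together as follows. The 2 x 2 table is hypothetical, picked so that one cell breaks the rule. The function names and the `minimum` parameter are illustrative choices, not part of the text.

```python
def expected_frequencies(table):
    """fe = (row total)(column total) / (grand total) for every cell of a contingency table."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]

def degrees_of_freedom(table):
    """df = (number of rows - 1)(number of columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)

def cells_below_minimum(expected, minimum=5):
    """List (row, column, fe) for each cell whose expected frequency is under the minimum."""
    return [
        (i, j, fe)
        for i, row in enumerate(expected)
        for j, fe in enumerate(row)
        if fe < minimum
    ]

# Hypothetical observed counts: 2 rows, 2 columns, 50 observations.
table = [
    [3, 17],
    [7, 23],
]

expected = expected_frequencies(table)
print("expected:", expected)
print("df:", degrees_of_freedom(table))
print("cells with fe < 5:", cells_below_minimum(expected))
```

In this made-up table the first cell has an expected frequency of 4, so the requirement is not met and the chi-square test should not be applied as it stands.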

Examples:

1. A Correction Agency is investigating whether those released from prison
show a different adjustment if they return to their hometown or if they go
elsewhere to live. In other words, they would like to know whether there is a
relationship between adjustment to civilian life and place of residence. The
data below was gathered. Using the 0.01 significance level, determine if a
relationship exists.

                    Outstanding    Fair    Unsatisfactory
Live in hometown    27             35      33
Live elsewhere      13             15      27
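
A sketch of how this example could be worked is given below. It takes H0 to be that adjustment and place of residence are not related. The critical value 9.210 is the standard table value for 2 degrees of freedom at the 0.01 level.

```python
# Observed frequencies; columns are Outstanding, Fair, Unsatisfactory.
observed = [
    [27, 35, 33],   # live in hometown
    [13, 15, 27],   # live elsewhere
]

row_totals = [sum(row) for row in observed]            # 95 and 55
column_totals = [sum(col) for col in zip(*observed)]   # 40, 50 and 60
grand_total = sum(row_totals)                          # 150

# Add (f0 - fe)^2 / fe over every cell, with fe = (row total)(column total) / (grand total).
chi_square = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * column_totals[j] / grand_total
        chi_square += (fo - fe) ** 2 / fe

df = (len(observed) - 1) * (len(observed[0]) - 1)      # (2 - 1)(3 - 1) = 2
critical_value = 9.210                                 # table value for df = 2, 0.01 level

print(f"chi-square = {chi_square:.2f}, df = {df}, critical value = {critical_value}")
print("Reject H0" if chi_square > critical_value else "Do not reject H0")
```

The computed chi-square is about 3.05, well below 9.210, so under these assumptions H0 is not rejected: the data do not show a relationship between adjustment and place of residence. Every expected frequency is above 5, so the limitation described earlier is satisfied.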

2. A social scientist sampled 140 people and classified them according to
income level and whether or not they played a lottery in the last month. The
information is given below. Can we conclude that playing the lottery is related to
income level? Use the 0.05 significance level.

                                              High income    Low income
Played the lottery in the last month          21             46
Did not play the lottery in the last month    19             14

Worksheet for 9.0

1. In a particular chi-square goodness-of-fit test there are four categories
and 200 observations. Use the .05 significance level.
a. How many degrees of freedom are there?
b. What is the critical value of chi-square?

2. In a particular chi-square goodness-of-fit test there are six categories and
500 observations. Use the .01 significance level.
a. How many degrees of freedom are there?
b. What is the critical value of chi-square?

3. The null hypothesis and the alternate are:


H0: The cell categories are equal.
H1: The cell categories are not equal.

Category f0
A 10
B 20
C 30

a. State the decision rule, using the .05 significance level.
b. Compute the value of chi-square.
c. What is your decision regarding H0?

4. The null hypothesis and the alternate are:


H0: The cell categories are equal.
H1: The cell categories are not equal.

Category f0
A 10
B 20
C 30
D 20

5. Classic Golf, Inc. manages five courses in the Jacksonville, Florida, area. The
Director wishes to study the number of rounds of golf played per weekday
at the five courses. He gathered the following sample information.

Day Rounds
Monday 124
Tuesday 74
Wednesday 104
Thursday 98
Friday 120

6. The director of advertising for the Carolina Sun Times, the largest
newspaper in the Carolinas, is studying the relationship between the type of
community in which a subscriber resides and the section of the newspaper
he or she reads first. For a sample of readers, she collected the following
sample information.

          National News    Sports    Comics
City      170              124       90
Suburb    120              112       100
Rural     130              90        88

At the .05 significance level, can we conclude there is a relationship between
the type of community where the person resides and the section of the
paper read first?

7. The Quality Control Department at Food Town, Inc., a grocery chain in
upstate New York, conducts a monthly check comparing scanned
prices to posted prices. The chart below summarizes the results of a sample
of 500 items last month. Company management would like to know whether
there is any relationship between error rates on regular priced items and
specially priced items. Use the .01 significance level.

                 Regular Price    Advertised Special Price
Undercharge      20               10
Overcharge       15               30
Correct Price    200              225
