
Copyright: Rai University

11.556 RESEARCH METHODOLOGY
1. A manufacturer of petite women's sportswear has hypothesized
that the average weight of the women buying its clothing
is 110 pounds. The company takes two samples of its
customers and finds that one sample's estimate of the population
mean is 98 pounds, and the other sample produces a mean
weight of 122 pounds. In the test of the company's hypothesis
that the population mean is 110 pounds versus the hypothesis
that the mean does not equal 110 pounds, is one of these sample
values more likely to lead us to accept the null hypothesis? Why or
why not?
2. On an average day, about 5 percent of the stocks on the New
York Stock Exchange set a new high for the year. On Friday,
September 18, 1992, the Dow Jones closed at 3,282 on a robust
volume of over 136 million shares traded. A random sample
of 120 stocks showed that 16 had set new annual highs that
day. Using a significance level of 0.01, should we conclude that
more stocks than usual set new highs on that day?
3. A finance developed a theory that predicted that closed end
equity funds should sell at a premium of about 5% on average.
Assuming that the discount/premium population is
approximately normally distributed, does the sample
information support his theory? Test at the .05 level of significance.
4. A company recently criticized for not paying women as much as
men claims that its average salary paid to all employees is $23,500.
From a random sample of 29 women, the average salary was
calculated to be $23,000. If the population standard deviation
is known to be $1,250 for these jobs, determine whether we
could reasonably (within 2 standard errors) expect to find $23,000
as the sample mean if, in fact, the company's claim is true.
5. A manufacturer of vitamins for infants inserts a coupon for a
free sample of its product in a package that is distributed at
hospitals to new parents. Historically, about 18 percent of the coupons
have been redeemed. Given current trends for having fewer
children and starting families later, the firm suspects that today's
parents are better educated, on average, and as a result more
likely to use vitamin supplements for their infants. A sample
of 1,500 new parents redeemed 295 coupons. Does this support,
at a significance level of 2 percent, the firm's belief about today's
new parents?
6. From a total of 10,200 loans made by a state employees'
credit union in the most recent five-year period, 350 were
sampled to determine what proportion was made to women.
This sample showed that 39% of the loans were made to
women employees. A complete census of loans 5 years ago
showed that 41% of the borrowers were women. At a significance
level of .02, can you conclude that the proportion of loans
made to women has changed significantly in the last five years?
LESSON 22:
TUTORIAL

In the last lecture we examined the use of the t distribution
for different types of applications. We now turn to another
important application of the t test: the case of dependent
samples. That is, we have two samples, but the two samples
are not independent of each other. This is particularly
important in research, where we frequently have matched
samples, control samples, or a before-and-after situation.
We also wrap up our overview of parametric hypothesis
testing. Hypothesis testing problems are not difficult and
require very similar types of calculations. The important
principle that students need to grasp is which
distribution and which formula to use. We also spend
some time on general applications of assorted hypothesis
testing problems.
Testing of Differences between Means with Dependent Samples
So far our samples have been chosen independently of each
other; for example, the sample respondents went through different
types of training programmes, or the samples were
chosen from two cities. Sometimes we need to take
samples which are not independent of each other. Examples
of dependent samples are:
Paired samples, which allow us to control more precisely for
extraneous factors.
Before-and-after samples, where we record sample data prior to
the introduction of a new variable and compare it with sample
data recorded after the change.
Common uses of such procedures include
ad pretests, product tests, concept tests, etc.
The procedure for hypothesis testing with dependent samples
remains the same, except that we use a different formula for
estimating the standard error of the sample differences. The
second difference is that we require both samples to be of the
same size.
In paired sample tests we essentially control for any extraneous
factors which can influence the samples. Thus, when the test
variable is introduced in one of the samples, any
differences observed stem purely from the action of the new
variable.
Example
An agricultural research institute wishes to determine whether a new
hybrid has a greater yield than an old variety.
Independent sample:
Suppose the researchers ask 10 farmers to record the yield of an acre
planted with the new variety and another 10 farmers to record the yield
of an acre planted with the old variety. In this case we do not
control for differences in sample composition of farmers,
land, fertilizer usage, etc.
Dependent Sample
It would be a dependent sample if we had asked the same set
of farmers to plant one field with the new variety and one
field with the old one. In this case we control for external
factors such as quality of land, fertilizer usage, insecticide usage,
etc. Any differences between the yields can be solely attributed to
the use of the new variety.
The null and alternative hypotheses in this case are:
\(H_0: \mu_1 = \mu_2\)
\(H_a: \mu_1 > \mu_2\)
where \(\mu_1\) is the mean yield of the new hybrid and \(\mu_2\) that of the old variety.
Another example:
Conceptually, dependent samples can be treated as one sample
with a change in the test variable. For example, suppose we wish to test
the efficacy of a weight loss programme. For a sample of 10 participants,
pre-programme and post-programme weights are recorded.
The programme promises a weight loss of more than 17 kgs. The
samples are dependent because the same set of people has been
observed twice. We wish to test at the .05 level of significance whether
the weight loss is more than 17 kgs.
The problem can be formally stated as:
\(H_0: \mu_1 - \mu_2 = 17\)
\(H_a: \mu_1 - \mu_2 > 17\), \(\alpha = 0.05\)
However, we are not interested in the actual weights before and after.
What we want to know is the average weight loss. Thus this can
be treated as one sample of weight losses, where we can designate the
mean weight loss as \(\mu_w\). The null and alternative hypotheses
can be rewritten as:
\(H_0: \mu_w = 17\)
\(H_a: \mu_w > 17\)
Because we want to know whether the weight loss exceeds 17, this is a
right-tailed test. Given n = 10, there are 9 degrees of freedom and the
critical t value is 1.833. The form of the t statistic remains unchanged;
however, the formula for calculating the standard deviation for dependent
samples is different.
To estimate the unknown population standard deviation, we
compute the sample standard deviation of the individual differences using:
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
Here \(x\) = loss = \(x_{i1} - x_{i2}\).
The sample data for the before and after situation for the weight
loss programme are as follows:
Before (x_i1)   After (x_i2)   Loss (x)   x^2
189             170            19         361
202             179            23         529
220             203            17         289
207             192            15         225
194             172            22         484
177             161            16         256
193             174            19         361
202             187            15         225
208             186            22         484
233             204            29         841
Total                          197        4055

\(\bar{x} = \sum x / n = 197/10 = 19.7\)
\[ s = \sqrt{\frac{\sum x^2}{n-1} - \frac{n\bar{x}^2}{n-1}} = \sqrt{\frac{4055}{9} - \frac{10(19.7)^2}{9}} = \sqrt{19.34} = 4.40 \]
Standard error of the mean, with \(\hat{\sigma} = s\):
\(\hat{\sigma}_{\bar{x}} = \hat{\sigma}/\sqrt{n} = 4.40/\sqrt{10} = 4.40/3.16 = 1.39\)
Now we calculate our observed t statistic:
\(t = (\bar{x} - \mu_{H_0}) / \hat{\sigma}_{\bar{x}} = (19.7 - 17)/1.39 = 1.94\)
Since the observed t statistic (1.94) is greater than the critical t value
(1.833), we reject the null hypothesis and conclude that the weight loss
claimed for the programme is valid.
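The weight-loss calculation can be checked with a few lines of Python. This is a minimal sketch rather than part of the original lesson: the variable names are illustrative, only the standard library is used, and the critical value 1.833 is copied from the text instead of being computed.

```python
import math
import statistics

# Before/after weights taken from the table in this lesson.
before = [189, 202, 220, 207, 194, 177, 193, 202, 208, 233]
after = [170, 179, 203, 192, 172, 161, 174, 187, 186, 204]

# Reduce the two dependent samples to a single sample of losses.
losses = [b - a for b, a in zip(before, after)]

n = len(losses)
mean_loss = statistics.mean(losses)      # sample mean of the losses
s = statistics.stdev(losses)             # sample sd, n - 1 in the divisor
std_error = s / math.sqrt(n)             # standard error of the mean loss

mu_h0 = 17           # hypothesized mean loss under H0
t_critical = 1.833   # from the lesson: 9 df, right tail, alpha = 0.05

t_observed = (mean_loss - mu_h0) / std_error
reject_h0 = t_observed > t_critical      # right-tailed decision rule

print(f"mean loss = {mean_loss:.1f}, s = {s:.2f}, SE = {std_error:.2f}")
print(f"t = {t_observed:.2f}, reject H0: {reject_h0}")
```

Working with the list of losses, rather than with the two weight columns separately, mirrors the point made in the text: a paired test is a one-sample t test on the differences.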
Examples
1. Sherri Welch is a quality control engineer with the windshield
wiper manufacturing division of Emsco Inc. Emsco is currently
considering two new synthetic rubbers for its wiper blades,
and Sherri was charged with seeing whether blades made with
the two new compounds wear equally well. She equipped 12
cars belonging to other Emsco employees with one blade
made of each of the two compounds. On cars 1 to 6, the right
blade was made of compound A and the left blade was made
of compound B; on cars 7 to 12, compound A was used for the
left blade. The cars were driven under normal operating
conditions until the blades no longer did a satisfactory job of
clearing the windshield of rain. The data below give the usable
life (in days) of the blades. At \(\alpha = 0.05\), do the two compounds
wear equally well?
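Blade A   Blade B   Diff (x)   x^2
183       162       21         441
347       323       24         576
247       220       27         729
269       274       -5         25
189       165       24         576
257       271       -14        196
233       224       9          81
156       178       -22        484
238       263       -25        625
211       199       12         144
241       263       -22        484
154       148       6          36
Total               35         4397

\(\bar{x} = \sum x / n = 35/12 = 2.9167\)
\[ s = \sqrt{\frac{\sum x^2}{n-1} - \frac{n\bar{x}^2}{n-1}} = \sqrt{\frac{4397}{11} - \frac{12(2.9167)^2}{11}} = 19.76 \text{ days} \]
\(\hat{\sigma}_{\bar{x}} = \hat{\sigma}/\sqrt{n} = 19.76/\sqrt{12} = 5.7042\)
\(H_0: \mu_A = \mu_B\), \(\alpha = 0.05\)
\(H_a: \mu_A \neq \mu_B\)
Observed \(t = (\bar{x} - \mu_{H_0}) / \hat{\sigma}_{\bar{x}} = (2.9167 - 0)/5.7042 = 0.511\)
t critical = 2.201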
Therefore, since the calculated t (0.511) is less than the critical t (2.201),
we do not reject the null hypothesis that there is no significant difference
in the performance of the two compounds.
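The wiper-blade example follows the same steps with a two-tailed decision rule. The sketch below wraps those steps in a small helper; `paired_t` is a hypothetical name introduced here for illustration, and the critical value 2.201 is taken from the example rather than computed.

```python
import math
from statistics import mean, stdev

def paired_t(sample_1, sample_2, mu_h0=0.0):
    """Return (mean difference, standard error, t) for two paired samples."""
    if len(sample_1) != len(sample_2):
        raise ValueError("paired samples must be the same size")
    diffs = [a - b for a, b in zip(sample_1, sample_2)]
    d_bar = mean(diffs)
    std_error = stdev(diffs) / math.sqrt(len(diffs))
    return d_bar, std_error, (d_bar - mu_h0) / std_error

# Usable life in days, from the table in the example (cars 1 to 12).
blade_a = [183, 347, 247, 269, 189, 257, 233, 156, 238, 211, 241, 154]
blade_b = [162, 323, 220, 274, 165, 271, 224, 178, 263, 199, 263, 148]

d_bar, std_error, t_observed = paired_t(blade_a, blade_b)

t_critical = 2.201  # from the example: 11 df, two tails, alpha = 0.05
reject_h0 = abs(t_observed) > t_critical  # two-tailed: compare |t|

print(f"mean diff = {d_bar:.4f}, SE = {std_error:.4f}, t = {t_observed:.3f}")
print(f"reject H0: {reject_h0}")
```

The length check reflects the requirement stated earlier in the lesson that both dependent samples be of the same size.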
2. Nine computer components dealers in major metropolitan
areas were asked for their prices on two similar color inkjet
printers. The results of this survey are given below. At \(\alpha = 0.05\),
is it reasonable to assert that, on average, the Epson printer (E)
is less expensive than the Okaydata printer (O)?
Again this is a dependent sample because we go to the same
dealers (hence we are controlling for external variables) and ask
them for the prices of both printers.
The hypotheses are:
\(H_0: \mu_O = \mu_E\)
\(H_a: \mu_O > \mu_E\), \(\alpha = 0.05\)
Epson ($)   Okaydata ($)   Diff (x)   x^2
250         270            20         400
319         325            6          36
285         269            -16        256
260         275            15         225
305         289            -16        256
295         285            -10        100
289         295            6          36
309         325            16         256
275         300            25         625
Total                      46         2190

\(\bar{x} = \sum x / n = 46/9 = 5.111\), that is, $5.11
\[ s^2 = \frac{1}{n-1}\left(\sum x^2 - n\bar{x}^2\right) = \frac{1}{8}\left(2190 - 9(5.111)^2\right) = 244.36 \]
\(s = \sqrt{s^2} = \sqrt{244.36} = 15.63\), that is, $15.63
\(\hat{\sigma}_{\bar{x}} = \hat{\sigma}/\sqrt{n} = 15.63/\sqrt{9} = 5.21\), that is, $5.21
Observed \(t = (\bar{x} - \mu_{H_0}) / \hat{\sigma}_{\bar{x}} = (5.111 - 0)/5.21 = 0.981\)
t critical = 1.860
Because the differences are taken as Okaydata minus Epson, the alternative
hypothesis corresponds to the right tail. Since the observed value (0.981)
is less than the critical t (1.860), we do not reject Ho.
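The printer example uses the shortcut form of the variance, built from the column totals. The sketch below checks that the shortcut agrees with the definitional form and then applies a one-tailed rule. It assumes the differences are taken as Okaydata minus Epson, so that "Epson is less expensive" corresponds to a positive mean difference and an upper-tail critical value of 1.860 (8 degrees of freedom, \(\alpha = 0.05\)); the variable names are illustrative.

```python
import math

# Dealer prices in dollars, from the table in the example.
epson = [250, 319, 285, 260, 305, 295, 289, 309, 275]
okaydata = [270, 325, 269, 275, 289, 285, 295, 325, 300]

# Differences taken as Okaydata minus Epson (assumed orientation).
diffs = [o - e for o, e in zip(okaydata, epson)]
n = len(diffs)
sum_x = sum(diffs)                    # column total of the differences
sum_x2 = sum(d * d for d in diffs)    # column total of squared differences
x_bar = sum_x / n

# Shortcut form built from the column totals.
var_shortcut = (sum_x2 - n * x_bar ** 2) / (n - 1)
# Definitional form: squared deviations from the mean.
var_definition = sum((d - x_bar) ** 2 for d in diffs) / (n - 1)

std_error = math.sqrt(var_shortcut) / math.sqrt(n)
t_observed = (x_bar - 0) / std_error

t_critical = 1.860                    # 8 df, one tail, alpha = 0.05
reject_h0 = t_observed > t_critical   # upper-tail decision rule

print(f"sum x = {sum_x}, sum x^2 = {sum_x2}, mean = {x_bar:.3f}")
print(f"s^2 = {var_shortcut:.2f}, SE = {std_error:.2f}, t = {t_observed:.3f}")
print(f"reject H0: {reject_h0}")
```

The two variance formulas are algebraically identical; the shortcut is convenient by hand because it needs only the two column totals.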
Exercises
1. Additive RU has developed an additive to improve fuel efficiency
for trucks that pull very heavy loads. They tested the additive by
randomly selecting 18 trucks and dividing them into 9 pairs. In
each pair, both trucks hauled the same type of load over the
same roadway, but only one truck used fuel with the new
additive. Different pairs followed different routes and carried
different loads. The resulting fuel efficiencies (in km per litre)
are given below. Do the data at \(\alpha = .01\) indicate that trucks using
fuel with the additive achieved significantly better fuel efficiency?
Pair       1    2    3    4    5    6    7    8    9
Regular    5.7  6.1  5.9  6.2  6.4  5.1  5.9  6.0  5.5
Additive   6.0  6.2  5.8  6.6  6.7  5.3  5.7  6.1  5.9


2. Donna Rose is a production supervisor on the disk drive
assembly line at Winchester Technologies. Winchester recently
subscribed to an easy listening music service at its factory, hoping
that this would relax the workers and lead to greater productivity.
Donna feels the music will distract them. She samples weekly
production for the same six workers before music and after
music. Her data are given below. Test at \(\alpha = .02\) whether average
production has changed at all.
Employee     1    2    3    4    5    6
No music     219  205  226  198  209  216
With music   235  186  240  203  221  205
Review of Applications of Hypothesis Testing
1. Clic Pens has tested two types of point-of-purchase displays
for its new erasable pens. A shelf display was placed in a random
sample of 40 stores in the test market and a floor display was
placed in 40 other stores in the area. The mean number of pens
sold per store in one month with the shelf display was 42 and
the sample standard deviation was 8. With the floor display,
the mean was 45 and the sample standard deviation was 7. At
\(\alpha = .02\), was there a significant difference between sales with the
two types of displays?
Explain to the management of this company the consequences
of Type I and Type II errors.
A manufacturer of petite women's sportswear has hypothesized
that the average weight of the women buying its clothing is 110
pounds. The company takes two samples of its customers and
finds that one sample's estimate of the population mean is 98 pounds, and
the other sample produces a mean weight of 122 pounds. In the test
of the company's hypothesis that the population mean is 110 pounds
versus the hypothesis that the mean does not equal 110 pounds, is
one of these sample values more likely to lead us to accept the null
hypothesis? Why or why not?
8-55 Many cities have adopted High Occupancy Vehicle (HOV)
lanes to speed commuter traffic to downtown business districts.
Planning for Transportation District has depended on a well-
established average 0;
passengers per HOV. But a summer intern notes that because many
firms are sponsoring vanpools, the average number of passengers
per car is probably higher. The intern takes a sample of 23 vehicles
going through the HOV lane of a toll plaza and reports a sample
mean of 43 passengers and a standard deviation of 15 passengers.
At the 0.01 level of significance, does the
sample suggest that the mean number of passengers has increased?
8-56 In Exercise SC 3-5, what would be the power of the test for \(\mu\)
= 14,500, and if the significance level were changed to 0.01?
On an average day, about 5 percent of the stocks on the New
York Stock Exchange set a new high for the year. On Friday,
September 18, 1992, the Dow Jones Industrial Average closed at
3,282 on a robust volume of over 136 million shares traded. A random
sample of 120 stocks showed that sixteen had set new annual
highs that day. Using a significance level of 0.01, should we conclude
that more stocks than usual set new highs on that day?
8-58 In response to criticism concerning lost mail, the U.S. Postal
Service initiated new procedures to alleviate this problem. The
postmaster general had been assured that this change would reduce
losses to below the historic loss rate of 0.3 percent. After the new
procedures had been in effect for 2 months, USPS sponsored an
investigation in which a total of 8,000 pieces of mail were mailed
from various parts of the country. Eighteen of the test pieces
failed to reach their destinations. At a significance level of 0.10, can
the postmaster general conclude that the new procedures achieved
their goal?
8-59 What is the probability that we are rejecting a true null
hypothesis when we reject the hypothesized value because
a. The sample statistic differs from it by more than 2.15 standard
errors in either direction?
b. The value of the sample statistic is more than 1.6 standard
errors above it?
c. The value of the sample statistic is more than 2.33 standard
errors below it?
In 1995, the average 2-week-advance-purchase airfare between
Raleigh-Durham, North Carolina, and New York City was $235.
The population standard deviation was $68. A 1996 survey of 90
randomly chosen travelers between these two cities found that
they had paid $218.77, on average, for their tickets. Did the average
airfare on this route change significantly between 1995 and 1996?
What is the largest \(\alpha\) at which you would conclude that the observed
average fare is not significantly different from $235?
Audio Sounds runs a chain of stores selling stereo systems and
components. It has been very successful in many university towns,
but it has had some
failures. Analysis of its failures has led it to adopt a policy of not
opening a
store unless it can be reasonably certain that more than 15 percent
of the students in town own stereo systems costing $1,100 or
more. A survey of n the 2,400 students at a small, liberal arts
college in the Midwest has discovered that 57 of them own stereo
systems costing at least $1,100. If Audio Sounds is willing to run
a 5 percent risk of failure, should it open a store in this town?
The City of Oakley collects a 1.5 percent transfer tax on closed real
estate transactions. In an average week, there are usually 32 closed
transactions, with a standard deviation of 2.4. At the 0.10 level of
significance, would you agree with the tax collector's conclusion that
sales are off this year if a sample of 16 weeks had a mean of 28.25
closed sales?
In 1996, it was estimated that about 72 percent of all U.S. households
were cable TV subscribers. New Times magazine's editors were sure
their readers subscribed to cable TV at a higher rate than the
population and wanted to use this fact to sell advertising space for
premium cable channels. To verify this, they sampled 250 of New
Times' subscribers and found that 194 subscribed to cable TV. At
a significance level of 2 percent, do the survey data support the
editors' belief?
A company, recently criticized for not paying women as much as
men working in the same positions, claims that its average salary
paid to all employees is $23,500. From a random sample of 29
women in the company, the average salary was calculated to be
$23,000. If the population standard deviation is known to be
$1,250 for these jobs, determine whether we could reasonably (within
2 standard errors) expect to find $23,000 as the sample mean if, in
fact, the company's claim is true.
Drive-a-Lemon rents cars that are mechanically sound, but older
than those rented by the large national rent-a-car chains. As a result, it
advertises that its rates are considerably lower than the rates of its
larger competitors. An industry survey has established that the
average charge per rental at one of the major firms is $77.38. A
random sample of 18 completed transactions at Drive-a-Lemon
showed an average total charge of $87.61 and a sample standard
deviation of $19.48. Verify that at \(\alpha = 0.025\), Drive-a-Lemon's
average total charge is significantly higher than that of the major firms.
Does this result indicate that Drive-a-Lemon's rates, in fact, are not
lower than the rates charged by major national chains? Explain.
A random sample of 20 privately held North Carolina corporations
revealed the data in Table RW8-2 about their Chief Executive
Officers (CEO
Do these data present conclusive evidence (at \(\alpha = 0.04\)) that her
prediction accuracy is significantly less than the asserted 85 percent?
In Exercise 8-26, what would be the power of the test for \(\mu\) =
54:95 and $43.95 if the significance level were changed to 0.05?
A manufacturer of a vitamin supplement for newborns inserts a
coupon for a free sample of its product in a package that is distributed
at hospitals to new parents. Historically, about 18 percent of the
coupons have been redeemed. Given current trends for having
fewer children and starting families later, the firm suspects that today's
new parents are better educated, on average, and as a result, more
likely to use a vitamin supplement for their infants. A sample of
1,500 new parents redeemed 295 coupons. Does this support, at
a significance level of 2 percent, the firm's belief about today's new
parents?
An innovator in the motor-drive industry felt
that its new electric motor drive would capture 48 percent of the
regional market within 1 year,
because of the product's low price and superior performance. There
are 5,000 users of motor drives in the region. After sampling 10
percent of these users a year later, the company found that 43
percent of them were using the new drives. At \(\alpha = 0.01\), should
we conclude that the company failed to reach its market-share goal?
According to machine specifications, the one-armed bandits in
gambling casinos should pay off once in 11.6 turns, with a
standard deviation of 27 turns. A lawyer believes that the machines
at Casino World have been tampered with and observes a payoff
once in 12.4 turns, over 36 machines. At \(\alpha = 0.01\), is the lawyer
right in concluding that the machines have a lower payoff frequency?
Chapter Concepts Test
Circle the correct answer or fill in the blank. Answers are in the back
of the book.
1. In hypothesis testing, we assume that some population
parameter takes on a particular value before we sample. This
assumption to be tested is called an alternative hypothesis.
2. Assuming that a given hypothesis about a population mean is
correct, the percentage of sample means that could fall outside
certain limits from this hypothesized mean is called the
significance level.
3. In hypothesis testing, the appropriate probability distribution
to use is always the normal distribution.
4. If we were to make a Type I error, we would be rejecting a null
hypothesis when it is really true.
5. Testing on the raw scale or the standardized scale will lead to
the same conclusion.
6. If 1.96 is the critical value of z, then the significance level of the
test is 0.05.
7. If our null and alternative hypotheses are \(H_0: \mu = 80\) and
\(H_1: \mu < 80\), it is appropriate to use a left-tailed test.
8. If the standardized sample mean is between zero and the critical
value, then you should not reject \(H_0\).
9. The value \(1 - \beta\) is known as the power of the test.
10. After performing a one-tailed test and rejecting \(H_0\), you realize
you should have done a two-tailed test, at the same signif.
In 1992, a survey of 50 municipal hospitals revealed an average
occupancy rate of 73.6 percent, and the sample standard deviation
was 18.2 percent. Another survey of 75 municipal hospitals in
1995 found an average occupancy rate of 68.9 percent, and the
sample standard deviation was 19.7 percent. At \(\alpha = 0.10\), can we
conclude that the average occupancy rate changed significantly
during the 3 years between surveys?
General Cereals has just concluded a new advertising campaign for
Fruit Crunch, its all-natural breakfast cereal with nuts, grains, and
dried fruits. To test the effectiveness of the campaign, brand manager
Alan Neebe surveyed 11 customers before the campaign and
another 11 customers after the campaign. Given are the customers'
reported weekly consumption (in ounces) of Fruit Crunch:
Before   14   5  18  18  30  10   8  26  13  29  34
After    23  14  13  29  33  11  12  25  21  26  34
a. At \(\alpha = 0.05\), can Alan conclude that the campaign has succeeded
in increasing demand for Fruit Crunch?
b. Given Alan's initial survey before the campaign, can you suggest
a better sampling procedure for him to follow after the campaign?
Ben & Jerry's Homemade, Inc., is an unconventional manufacturer
of super-premium ice cream known for adventurous flavors such
as Chocolate Chip Cookie Dough. A Wall Street Journal article reports
that part of the company's success is due to its appeal to young adult
consumers (who will presumably remain loyal customers
throughout
A chemist developing insect repellents wishes to know whether a
newly developed formula gives greater protection from insect bites
than that given by the leading product on the market. In an experiment, 14
volunteers had one arm sprayed with the old product and the
other sprayed with the
new formula. Then each subject placed his arms into two chambers
with equal numbers of mosquitoes, gnats, and other biting insects.
The numbers of bites received on each arm follow. At \(\alpha = 0.01\),
should the chemist conclude that the new formula is, indeed, more
effective than the current market leader?
Subject       1  2  3  4  5  6  7  8  9  10  11  12
Old formula   5  2  5  4  3  6  2  4  2  6   5   7
New formula   3  1  5  1  1  4  4  2  5  2   3   3
Long Distance Carrier is trying to see the effect of offering 1
month free with a monthly fixed fee of $10.95, versus an offer of
a low monthly fee of $S.75 with no free month. To test which might be
more attractive to consumers, Long Distance runs a brief market
test: 12 phone reps make calls using one approach, and 10 use the
other. The following numbers of customers agreed to switch from
their present carrier to LDC:
Offer             Number of switches
1 month free      118 115 122 99 106 125 102 100 92 103 113
Low monthly fee   115 126 113 102 124 137 108 128 110 135 128
Test at a significance level of 10 percent whether there are
significant productivity differences between the two offers.
Is the perceived level of responsibility for an action related to the
severity of its consequences? That question was the basis of a study of
responsibility in which the subjects read a description of an accident
on an interstate highway. The consequences, in terms of cost and
injury, were described as either very minor or serious. A
questionnaire was used to rate the degree of responsibility that the
subjects believed should be placed on the main figure in the story.
Below are the ratings for both the mild-consequences and
severe-consequences groups. High ratings correspond to higher
responsibility attributed to the main figure. If a 0.025 significance
level was used,
did the study conclude that severe consequences lead to a greater
attribution of responsibility?
Consequences   Degree of Responsibility
Mild           4 5 3 3 4 12 6
Severe         4 5 4 6 7 8 6 5
In October 1992, a survey of 120 macroeconomists found 87
who believed that the recession had already ended. A survey of
Circle the correct answer or fill in the blank. Answers are in the back
of the book.
1. A paired difference test is appropriate when the two samples
being tested are dependent samples.
2. A one-tailed test for the difference between means may be
undertaken when the sample sizes are either large or small, and
the procedures are similar. The only difference is that when the
sample sizes are large, we use the normal distribution, whereas
the t distribution is used when the sample sizes are small.
3. In testing hypotheses about the difference of two means,
suppose that the sample sizes are large. If we do not know the
actual standard deviations of the two populations, we can use
the sample standard deviations as estimates.
4. If we took two independent samples and performed a
hypothesis test to see whether their means were significantly
different, we would find the results very similar to a paired
difference test performed on the same two samples.
5. When doing a two-tailed test for the difference between means,
the null hypothesis is that the hypothesized difference between
the two population means is zero.
6. Exact prob values can't be determined (from the table) when
using the t distribution in a hypothesis test.
7. Two-sample tests are used
Notes
