
QUIZ I

Question I
XYZ Company manufactures a very large number of ball bearings using
three different production lines: Red Line, Blue Line and White Line.
50% of production comes from Red Line whereas 30% comes from the Blue
Line and the remaining from the White Line. From the past data it has
been found that about 15% of the production from the Red Line are
defective whereas the corresponding percentages from Blue Line and
White Line are 10% and 5% respectively. The entire production of the
day from the three production lines is combined before it is taken to
the quality control department. The quality control department,
normally, selects two items (without replacement) and tests them.
Based on the above, answer the following questions:

i) What is the probability that the second item picked is defective?

ii) What is the probability that both the first and the second items
picked are defective?

iii) What is the probability that the second item picked is defective,
given that the first item turned out to be defective?

iv) What is the probability that both the items came from the Red
Line, given that both have been found to be defective? From the
Blue Line? From the White Line?
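
One possible way to set up parts (i) and (iv) is sketched below, combining the law of total probability with Bayes' rule. It assumes that, because the production is very large, two picks without replacement can be treated as approximately independent; the variable names are illustrative only.

```python
# Shares of production and defect rates per line, as stated in the question.
share = {"Red": 0.50, "Blue": 0.30, "White": 0.20}
defect_rate = {"Red": 0.15, "Blue": 0.10, "White": 0.05}

# Law of total probability: chance that any single picked item is defective.
p_defective = sum(share[line] * defect_rate[line] for line in share)

# Assumption: with a very large lot, two picks without replacement are
# approximately independent, so P(both defective) is about p_defective ** 2.
p_both_defective = p_defective ** 2

# Bayes' rule: P(both items came from a given line | both are defective).
posterior_both = {
    line: (share[line] * defect_rate[line]) ** 2 / p_both_defective
    for line in share
}

print("P(defective) =", round(p_defective, 4))
for line, p in posterior_both.items():
    print(line, round(p, 4))
```

The three posteriors need not add up to 1, since the two defective items may also have come from two different lines.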




Question II
Mr. Joshi joined the Citibank Financial Services after graduating from
IIM Bangalore. At Citibank Financial Services, while he is supposed to
look after the foreign exchange operations, he also has the
responsibility of a vigilance officer. Recently the Citibank Financial
Services had conducted an internal test for the purpose of promoting
lower level officers. It was alleged that two officers - Mr. Ajay and
Mr. Bhaskaran - have copied in the test. The case landed with Mr.
Joshi for appropriate action.

Mr. Joshi asked for the details about the test and the following
information was given to him. The test contained 20 multiple choice
questions. Each question had five choices, namely A, B, C, D, and E. Of
course, each question had exactly one and only one correct answer
which is one of the five choices. Mr. Joshi defined a new concept
called Identical Wrong Answers. If the two candidates answered the
same question wrongly and made the same wrong choice, then the answer
is taken to be an Identical Wrong Answer. For example, if the correct
answer for question 1 happens to be the choice E, and if both the
candidates select the answer A then it is an Identical Wrong Answer.
If even one candidate selects the right answer E then question 1
cannot have an Identical Wrong Answer. Mr. Joshi, after examining the
answer scripts, found that out of 15 questions where both the
candidates had given wrong answers, 8 turned out to be Identical Wrong
Answers. Based on this data, and the knowledge he gained from the
course on "Introduction to Statistical Methods" at IIMB, Mr. Joshi
concluded that the two candidates - Mr. Ajay and Mr. Bhaskaran - did
resort to copying and recommended their demotion to the lower level.

i) Do you agree with Mr. Joshi's decision? Explain clearly why or
why not? Substantiate your opinion with detailed calculations of
probability involved.

ii) Show each of the steps and all work involved in your
calculations.
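
A hedged sketch of the kind of calculation the question asks for: if the two candidates answered independently and, on a question both got wrong, each of the four wrong choices were equally likely (an assumption, not something the question states), the number of Identical Wrong Answers among 15 would follow a binomial distribution with p = 1/4.

```python
from math import comb

n_both_wrong = 15   # questions that both candidates answered wrongly
n_identical = 8     # of those, questions with the same wrong choice
p_match = 1 / 4     # assumption: the 4 wrong choices are equally likely


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of 8 or more identical wrong answers purely by chance.
p_tail = sum(
    binom_pmf(k, n_both_wrong, p_match)
    for k in range(n_identical, n_both_wrong + 1)
)
print(round(p_tail, 4))
```

A small tail probability would count against pure chance, but the equal-likelihood assumption is strong: plausible distractors attract wrong answers unevenly.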



QUIZ II

Question I

When the professor mentioned that the time between arrivals,
generally, follows an exponential distribution, one of the PGP
students decided to check this out with real life data. Since his
sister works in the Vijay Trauma Centre, he decided to collect the
information on the arrivals of emergency cases at the trauma centre.
On one particular day, he observed that the emergency cases arrived at
12:10 am; 2:30 am; 3:50 am; 8:20 am; 12:30 pm; 2:20 pm; 3:50 pm; 6:30
pm and 8:10 pm.

Based on the above data, can he confidently say that the professor's
contention is valid? Explain your answer, showing all relevant
calculations and stating any assumptions you may have made.
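
As a first step, the arrival times can be turned into interarrival times. The sketch below does only that, plus the sample mean and standard deviation (for an exponential distribution these two should be roughly equal); it is not a complete test, and the 24-hour encoding of the times is my own.

```python
# Arrival times on a 24-hour clock as (hour, minute); 12:10 am is (0, 10).
arrivals = [(0, 10), (2, 30), (3, 50), (8, 20), (12, 30),
            (14, 20), (15, 50), (18, 30), (20, 10)]

# Minutes since midnight, then gaps between consecutive arrivals.
minutes = [h * 60 + m for h, m in arrivals]
gaps = [later - earlier for earlier, later in zip(minutes, minutes[1:])]

mean_gap = sum(gaps) / len(gaps)
var_gap = sum((g - mean_gap) ** 2 for g in gaps) / (len(gaps) - 1)
sd_gap = var_gap ** 0.5

print(gaps)
print(mean_gap, round(sd_gap, 1))
```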

Question II

Professor AK Rao is trying to compare the performance of engineers and
non-engineers in the admission test for his institute. He had taken a
sample of 25 engineers and 16 non engineers from populations that are
normally distributed. Based on the sample means of the scores in the
admission test and the estimated variances for the two populations, he
had calculated the upper and lower limits of 95% confidence intervals
for the respective population means. These intervals are as follows:

                 Upper Limit   Lower Limit
Engineers           124.9536      115.0464
Non-engineers       137.4585      122.5415

Test the Null Hypothesis that there is no significant difference
between average performances of the two groups of students. Clearly
state any assumptions that you make and carry out appropriate
statistical tests to justify the assumptions before arriving at the
conclusion.
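
The sample means and standard deviations are not given directly, but they can be backed out of the confidence limits. The sketch below assumes the limits were built as mean ± t · s/√n, with two-sided 95% t critical values read from a standard table (2.064 for 24 d.f., 2.131 for 15 d.f.).

```python
# 95% confidence limits from the question: (upper, lower, sample size).
intervals = {
    "Engineers": (124.9536, 115.0464, 25),
    "Non-engineers": (137.4585, 122.5415, 16),
}

# Assumption: two-sided 95% t critical values from a standard t table,
# keyed by degrees of freedom (n - 1).
t_crit = {24: 2.064, 15: 2.131}

recovered = {}
for group, (upper, lower, n) in intervals.items():
    mean = (upper + lower) / 2          # interval is centred on the mean
    half_width = (upper - lower) / 2    # equals t * s / sqrt(n)
    se = half_width / t_crit[n - 1]     # standard error of the mean
    sd = se * n ** 0.5                  # sample standard deviation
    recovered[group] = (mean, se, sd)
    print(group, round(mean, 2), round(se, 2), round(sd, 2))
```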



QUIZ III
Question I
The unrelenting march of computers into offices and factories over
the past decade is here to stay. With this in view, R. Swaminathan,
Professor of MIS at The Management Institute, investigated the
computer literacy of middle managers with 10 years or more experience.
He designed a questionnaire that he hoped would measure managers'
technical knowledge of computers. If the questionnaire was properly
designed, the scores received by managers could be used to design
different training seminars to bring all managers to a desired level
of computer literacy. To check the validity of the questionnaire's
design, nineteen middle level managers from Bangalore were randomly
sampled and asked to complete the questionnaire. Prior to completing
the questionnaires, the managers were asked to describe their
knowledge of and experience with computers. This information was used
to classify the managers as possessing a low (A), medium (B), or high
(C) level of technical computer expertise. These data and their scores
are listed below:

Manager I.D.   Level of Technical Expertise   Score
      1                     A                   82
      2                     A                  114
      3                     A                   90
      4                     A                   80
      5                     B                  128
      6                     B                   90
      7                     C                  156
      8                     A                   88
      9                     A                   93
     10                     B                  130
     11                     A                   80
     12                     A                  105
     13                     B                  110
     14                     B                  133
     15                     C                  128
     16                     B                  130
     17                     B                  104
     18                     C                  151
     19                     C                  140

Is there sufficient evidence to conclude that the mean score differs
for the three groups of managers?
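
A minimal sketch of the one-way ANOVA arithmetic for these scores, grouped by expertise level as in the table. It computes only the F statistic; comparing it with a critical value from an F table (2 and 16 degrees of freedom) is left to the reader.

```python
# Scores grouped by level of technical expertise, read from the table.
scores = {
    "A": [82, 114, 90, 80, 88, 93, 80, 105],
    "B": [128, 90, 130, 110, 133, 130, 104],
    "C": [156, 128, 151, 140],
}

n_total = sum(len(v) for v in scores.values())
grand_mean = sum(sum(v) for v in scores.values()) / n_total

# Between-groups and within-groups sums of squares.
ss_between = sum(
    len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in scores.values()
)
ss_within = sum(
    (x - sum(v) / len(v)) ** 2 for v in scores.values() for x in v
)

df_between = len(scores) - 1
df_within = n_total - len(scores)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(df_between, df_within, round(f_stat, 2))
```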

Question II
The speeds of fifty motor vehicles on Bannerghatta Road, as recorded
by the Department of Traffic Safety, are as follows:

46 58 60 56 70 66 48 54 69 73
62 41 39 52 45 62 53 69 42 56
65 65 67 76 52 52 59 59 47 62
67 51 46 61 40 43 42 77 67 70
67 63 59 63 63 72 57 59 63 66
Test whether the Department of Traffic Safety can conclude that there
is randomness in the observed speed of vehicles.



Midterm

Show sufficient work and give adequate explanations to support your
answers. Clearly state any assumptions that you may make.

Question I
It is Placement season again at Moni Management Mahavidyalaya
(popularly known as 3M). A large number of companies are expected to
come for the Campus placement. The placement rules of 3M allow a
student to attend not more than 12 interviews. At the same time, a
student is out of placement, once he gets firm offers from any 4
companies.

Ravi Ram Kumar, a final year student, has selected the 12 companies
and figured out that each company will make its decision independently
and that he has an equal chance of getting an offer from each of the
companies. Being a top ranker, he estimated this chance to be 30%.
Also, he decided to wait until he gets all the four offers allowed by
3M, before accepting any offer. This way, he thought, he will be able
to make the best possible choice.

1. What is the probability that Ravi has to attend less than 6
interviews before he is out of placement?

2. What is the probability that Ravi will be out of placement when he
attends exactly 7 interviews?

3. What is the probability that Ravi has to attend at least 8
interviews before he is out of placement?

4. What is the probability that Ravi has to attend all the 12
interviews?
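
One way to read "out of placement" is that Ravi stops at the interview that produces his fourth offer. Under that reading (an interpretation, not stated in so many words), the interview number of the fourth offer follows a negative binomial distribution, sketched below for parts 1 to 3.

```python
from math import comb

p_offer = 0.30       # chance of an offer from any one company
offers_needed = 4    # a student is out of placement after 4 firm offers


def prob_fourth_offer_at(n):
    """P(the 4th offer arrives exactly at interview n)."""
    if n < offers_needed:
        return 0.0
    return (comb(n - 1, offers_needed - 1)
            * p_offer ** offers_needed
            * (1 - p_offer) ** (n - offers_needed))


p_fewer_than_6 = sum(prob_fourth_offer_at(n) for n in range(4, 6))
p_exactly_7 = prob_fourth_offer_at(7)
p_at_least_8 = 1 - sum(prob_fourth_offer_at(n) for n in range(4, 8))

print(round(p_fewer_than_6, 5), round(p_exactly_7, 6), round(p_at_least_8, 4))
```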


Question II
The (probability) density function of rainfall in the South Pacific
island of Nourou is as follows:
f(x) = 0.0 + 0.5x for 0 ≤ x ≤ 1
f(x) = 0.0 + 0.5 for 1 ≤ x ≤ 2
f(x) = 1.5 - 0.5x for 2 ≤ x ≤ 3
and
f(x) = 0 otherwise,
where x is the rainfall measured in cms per day.

Since the islanders mainly depend on the tourism from the mainland,
the rainfall plays a very important role in their livelihood. A day
with less than 1.5 cms rainfall is called the Best Day whereas the
one with more than 2 cms rainfall is called the Worst Day and the one
with rainfall in between is called the Fair Day. The Nourou lunar year
consists of exactly 360 days.

1. What is the probability that 3 out of 5 days selected at random,
are Best Days?

2. What is the probability that 3 consecutive days are Fair Days?

3. What is the probability that the number of Best Days in a Nourou
lunar year is anywhere between 135 and 270?
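
The day-type probabilities follow from integrating the density over the relevant ranges. The sketch below does this numerically with a simple midpoint rule (a choice of mine; the integrals are easy to do by hand) and then applies the binomial formula for part 1, assuming days are independent.

```python
from math import comb


def density(x):
    """Piecewise density of daily rainfall (cms), as given in the question."""
    if 0 <= x <= 1:
        return 0.5 * x
    if 1 < x <= 2:
        return 0.5
    if 2 < x <= 3:
        return 1.5 - 0.5 * x
    return 0.0


def integrate(a, b, steps=3000):
    """Midpoint-rule integral of the density over [a, b]."""
    h = (b - a) / steps
    return sum(density(a + (i + 0.5) * h) for i in range(steps)) * h


p_best = integrate(0, 1.5)    # less than 1.5 cms
p_fair = integrate(1.5, 2)    # between 1.5 and 2 cms
p_worst = integrate(2, 3)     # more than 2 cms

# Part 1: exactly 3 Best Days out of 5, assuming independent days.
p_three_best_of_five = comb(5, 3) * p_best**3 * (1 - p_best) ** 2

print(round(p_best, 4), round(p_fair, 4), round(p_worst, 4))
print(round(p_three_best_of_five, 4))
```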

Question III
Bhavana, an employee of Karuna Lights and Tubes who is a manufacturer
of electric bulbs, is trying to estimate the average lifetime of bulbs
produced on a particular day. Based on the previous production, she
had estimated the standard deviation (σ) of the lifetime. After
taking a sample of 100, she had calculated a two-sided 95% confidence
interval whose lower limit turned out to be 830.5 hours. When the
management complained that the width of the confidence interval is too
large, she simply reduced the confidence level to .90 (still two-
sided) and came up with a lower limit of the confidence interval at
836.8 hours. The management is willing to accept an estimate of the
mean that is within 19.6 hours above or below the point estimate but the
chances of this estimate being incorrect must be only 1 in 20.

1. What was the original two sided 95% confidence interval of the
lifetime?

2. What was the revised confidence interval with a confidence level of
.90?

3. What should be the sample size to satisfy the requirements of the
management?

4. When Bhavana selected a new sample with size as determined in (3)
above, the sample mean turned out to be 30 hours more than the
previous sample mean. Calculate the two-sided confidence interval as
desired by the management, based on the new sample.
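
The two lower limits give two equations in two unknowns (the sample mean and its standard error). A minimal sketch of solving them follows; it assumes the usual z values 1.96 and 1.645 for two-sided 95% and 90% intervals with known σ.

```python
# Standard normal critical values (two-sided 95% and 90%); assumed, from tables.
z95, z90 = 1.96, 1.645
lower95, lower90 = 830.5, 836.8   # lower limits reported in the question
n = 100                            # original sample size
margin_wanted = 19.6               # management's tolerance, in hours

# lower95 = mean - z95 * se  and  lower90 = mean - z90 * se
se = (lower90 - lower95) / (z95 - z90)
mean = lower95 + z95 * se
sigma = se * n ** 0.5

ci95 = (mean - z95 * se, mean + z95 * se)
ci90 = (mean - z90 * se, mean + z90 * se)

# Sample size so that z95 * sigma / sqrt(n) equals the wanted margin.
n_required = (z95 * sigma / margin_wanted) ** 2

print(round(se, 2), round(mean, 2), round(sigma, 1))
print([round(v, 1) for v in ci95], [round(v, 1) for v in ci90])
print(round(n_required))
```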


Question IV
Jiffy fix-it-all is a service shop that undertakes major repairs on
electronic goods. From past records Jiffy estimates that the arrivals
of repair requests follow a Poisson distribution. On average Jiffy
receives one repair request every 2 hours. Jiffy is a very family
oriented man and works for only 8 hours per working day. During this
time he is able to handle 5 repairs. He likes to live a peaceful life
and never takes on more work than he can handle and does not perceive
the need to hire help. The quality of his work is impeccable.

1. Find the proportion of days in a month when Jiffy has to turn
customers away.

2. In the last month, of the 25 days that Jiffy worked, he had to
turn customers away on ten days. What is the probability of such an
event? Should Jiffy revise his estimate of the parameters he is
considering? Should he hire another person to help him out?

Given that four customers have arrived within the first four hours
of opening shop today, what is the probability that:

3. someone arrives in the next half hour?

4. Jiffy gets one more request than he can handle?
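
A hedged sketch for parts 1 and 3. It reads "has to turn customers away" as "more than 5 requests arrive in an 8-hour day" (an interpretation) and uses the fact that Poisson arrivals in non-overlapping periods are independent, so the first four hours do not affect the next half hour.

```python
from math import exp, factorial

rate_per_hour = 0.5    # one request every 2 hours on average
hours_per_day = 8
capacity = 5           # repairs Jiffy can handle in a day

lam_day = rate_per_hour * hours_per_day   # expected requests per day


def poisson_pmf(k, mean):
    """Probability of exactly k arrivals when the expected count is mean."""
    return exp(-mean) * mean**k / factorial(k)


# Part 1: P(more than 5 requests in a day).
p_turn_away = 1 - sum(poisson_pmf(k, lam_day) for k in range(capacity + 1))

# Part 3: P(at least one arrival in the next half hour); independent of
# what happened in the first four hours.
p_arrival_next_half_hour = 1 - exp(-rate_per_hour * 0.5)

print(round(p_turn_away, 4), round(p_arrival_next_half_hour, 4))
```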


Question V
A wholesaler of milk products serves a city with 3 distinctly
different market areas. To decide how to allocate sales efforts, the
wholesaler wants to obtain an estimate of the mean monthly expenditure
on milk products per household in the city. Random samples are to be
selected from each of the market areas for this purpose. The number of
households and the cost of interviewing (per household) in each market
area along with the estimated variance in monthly expenditure on milk
products is given below:

              Area 1    Area 2    Area 3
Households    20,000    10,000    50,000
Variance         324       225       144
Cost (Rs.)        16         4         9

If a total of 80 households are to be sampled:
1. Find the number of households to be sampled from each of the market
areas if only the varying number of households is to be taken into
account.

2. Revise this sampling strategy by taking into consideration the fact
that the estimated variances of expenditure on milk products in the
three areas are not the same.

3. Is there any advantage in revising the strategy as stated above?
Why or why not? Explain. Show all work needed to support your
answer.

4. Considering that the total budget allocated for the sampling is
only Rs. 700, determine the sample size and the allocation across
the three market areas such that the variance of the overall sample
mean is minimized and the expenditure does not cross the budget.
Compare this strategy with the two above and comment on which you
think is the best.
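
Parts 1 to 3 can be explored with the standard stratified-sampling formulas. The sketch below compares proportional allocation with Neyman allocation (sample sizes proportional to N_h · σ_h) for a total of 80 households; it ignores the finite population correction and leaves the allocations unrounded, both simplifications of mine.

```python
households = [20000, 10000, 50000]   # N_h for Areas 1, 2, 3
variances = [324, 225, 144]          # estimated variance per area
total_sample = 80

N = sum(households)

# Part 1: proportional allocation, n_h proportional to N_h.
proportional = [total_sample * n_h / N for n_h in households]

# Part 2: Neyman allocation, n_h proportional to N_h * sigma_h.
weights = [n_h * v ** 0.5 for n_h, v in zip(households, variances)]
neyman = [total_sample * w / sum(weights) for w in weights]


def variance_of_mean(allocation):
    """Variance of the stratified sample mean (no finite population correction)."""
    return sum(
        (n_h / N) ** 2 * v / a
        for n_h, v, a in zip(households, variances, allocation)
    )


print(proportional, round(variance_of_mean(proportional), 4))
print([round(a, 2) for a in neyman], round(variance_of_mean(neyman), 4))
```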



Final Examination

Show sufficient work and give adequate explanations to support your
answers. Clearly state any assumptions that you may make.

Answer all questions in the space provided below each question. You
may use the back side of the pages if needed. Do NOT attach any rough
work/sheets.

Question I
A firm's demand curve describes the quantity of its product that the
firm can sell at different prices, other things being equal. Over a
period of a year Rider Tires company varied the prices for one of
their radial tires to estimate the demand curve for the tire. The data
in the following table describe the tire's sales over the
experimental period.

Tire Price (hundreds Rs.)    Number Sold (hundreds)
           20                          13
           35                          57
           45                          85
           60                          43
           70                          17

1. Find the least squares linear regression equation to approximate
the firm's demand function.

2. Construct a scatter plot and graph your least squares line.
Comment on its fit. What does this imply about the relationship
between the tire price and its demand for the sample data?

3. Test to see if the slope of the regression line for the
population is significantly greater than zero. Can you conclude that
there is no relationship between tire prices and sales volumes?
Explain.

4. Calculate the coefficient of determination for the least squares
line of part 1. Interpret its value in the context of the problem.
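
For parts 1 and 4, the least squares line and the coefficient of determination can be computed directly from the five data points. A minimal sketch using the textbook formulas follows; the variable names are illustrative.

```python
price = [20, 35, 45, 60, 70]   # tire price, hundreds of Rs.
sold = [13, 57, 85, 43, 17]    # number sold, hundreds

n = len(price)
mean_x = sum(price) / n
mean_y = sum(sold) / n

# Corrected sums of squares and cross-products.
sxx = sum((x - mean_x) ** 2 for x in price)
syy = sum((y - mean_y) ** 2 for y in sold)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(price, sold))

slope = sxy / sxx
intercept = mean_y - slope * mean_x
r_squared = sxy ** 2 / (sxx * syy)

print(round(slope, 4), round(intercept, 3), round(r_squared, 5))
```

A value of r² close to zero here would point to a poor linear fit, not necessarily to the absence of any relationship; the scatter plot of part 2 is what shows the shape.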



Question II
The Ever Effective advertising company is constructing an aptitude
test for a job.
Mr. Sukumar, the HRD manager, feels that it is important to plan for a
fairly large variance in the test scores so that the best applicants
can be easily identified. For a certain test, scores are assumed to be
normally distributed with a mean of 80 and a standard deviation of 10.
Ten applicants are to take the aptitude test.

1. Find the approximate probability that the sample variance of the
scores for these applicants is greater than 200.

2. Give an interval that will contain the sample variances 90% of
the time.

3. Is the interval that you obtained in 2. the only interval that
will contain the sample variances 90% of the time? Explain.

4. The ten applicants were also required to take another test the
company has designed for the purpose of comparison and use in the
future. The variance of the scores obtained by the 10 candidates
on this test was found to be 50. Can you conclude from this that
the variances for the two tests designed are different? State any
assumptions that you make.

Question III
Strike Rich Oil Company (SROC) has acquired the oil drilling rights
in the Godavari Basin a few years ago. The Godavari Basin being an oil
rich area, SROC estimated the probability of striking the oil as 0.45.
If they go ahead with drilling and strike oil, the estimated profits
are Rs. 1500 million. On the other hand, if they fail to strike oil
after drilling, the estimated loss is Rs. 200 million. Recently they
have received an offer from Reliable Lease Ltd. (RLL) for subleasing
the rights. The offer is for Rs. 500 million.

1. What should SROC do?

2. Should somebody offer perfect information to SROC, how much
should SROC value such information?


Reliable Testing Services (RTS), a fully owned subsidiary of RLL
offered to conduct a special magnetic resonance test on the site. The
test can classify the area as WET or DRY. RTS supplied the
following reliability figures of their test based on the past
experience: The probability of striking oil whenever the test
classified the area as WET is 0.8. On the other hand, the probability
of striking oil whenever the test classified the area as DRY is only
0.3.

RLL does have access to the test results and they have informed SROC
that if the test is conducted and the area is classified as WET, the
offer for sublease can be increased to Rs. 750 million. But, if the test is
conducted and the area is classified as DRY, the offer for sublease
will stand reduced to Rs. 350 million.

3. Should SROC go for the magnetic resonance test?

4. What is the maximum that SROC could afford to pay for the test?

5. What is the efficiency of the information provided by RTS?

6. What should SROC do under each possible scenario?
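
The expected-value bookkeeping behind these questions can be sketched as follows. Two assumptions are mine: the WET offer is taken to be Rs. 750 million, and P(WET) is derived by requiring the two conditional probabilities to average back to the prior of 0.45. All amounts are in Rs. million.

```python
p_oil = 0.45
profit_oil, loss_no_oil = 1500, -200
sublease = 500

# Without further information: drill versus sublease.
ev_drill = p_oil * profit_oil + (1 - p_oil) * loss_no_oil
ev_no_info = max(ev_drill, sublease)

# Perfect information: drill only when oil is certain, otherwise sublease.
ev_perfect = p_oil * max(profit_oil, sublease) + (1 - p_oil) * max(loss_no_oil, sublease)
evpi = ev_perfect - ev_no_info

# Test results: P(oil | WET) = 0.8, P(oil | DRY) = 0.3.
p_oil_wet, p_oil_dry = 0.8, 0.3
# Consistency with the prior: 0.8 * P(WET) + 0.3 * (1 - P(WET)) = 0.45.
p_wet = (p_oil - p_oil_dry) / (p_oil_wet - p_oil_dry)

offer_wet, offer_dry = 750, 350   # assumption: both in Rs. million
ev_wet = max(p_oil_wet * profit_oil + (1 - p_oil_wet) * loss_no_oil, offer_wet)
ev_dry = max(p_oil_dry * profit_oil + (1 - p_oil_dry) * loss_no_oil, offer_dry)

ev_with_test = p_wet * ev_wet + (1 - p_wet) * ev_dry
evsi = ev_with_test - ev_no_info      # most the test could be worth
efficiency = evsi / evpi

print(ev_drill, evpi, round(p_wet, 2), round(evsi, 2), round(efficiency, 4))
```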







Question IV

I was asked to test if there was any significant difference in the
performance of boys and girls in the CAT. I decided to do a two-
sample test with equal number of boys and girls selected randomly from
the entire population of the candidates. It is known that the test
scores do follow a normal distribution. I did my analysis on SPSS and
when I retrieved the SPSS.LIS file into MSWord, the screen looked as
below: This must have been the work of a new virus called P9P97 which
inserts $ signs in place of all the important numbers.

T-TEST /GROUPS SIZE (1,2) /VARIABLES RATIO.
T-TEST requires 72 BYTES of workspace for execution.
----------------------------------------------------------------------
Page 39                        SPSS/PC+                        9/13/97
t-tests for independent samples of SIZE

               Number
Variable       of Cases     Mean       SD     SE of Mean

RATIO
Sample 1.00      $$$         $$$        6        $$$
Sample 2.00      $$$        95.5        5        $$$

Mean Difference = $$$

Levene's Test for Equality of Variances: F= $$$   P= $$$

       t-test for Equality of Means                          95%
Variances   t-value    df   2-Tail Sig   SE of Diff     CI for Diff

Equal        -9.82     48      $$$          $$$          $$$-$$$
Unequal       $$$     $$$      $$$          $$$          $$$-$$$

----------------------------------------------------------------------
Page 40                        SPSS/PC+                        9/13/97
This procedure was completed at 9:51:47
fin.


Based on the above SPSS output, answer the following questions:
1. What is the sample size for sample 1 (Boys)?

2. What is the SE for the mean of the sample 2 (Girls)?

3. What is the SE of DIFF under the assumption of equal variances?

4. Should we reject or not reject the assumption of equal variances?
Why or Why not?

5. What is the mean of sample 1?

6. Should we reject the null hypothesis that μ1 = μ2? Why or why not?
Define your α.

7. Build a 95% confidence interval for the mean of sample 1.

8. Build a 95% confidence interval for the mean difference (CI for
the DIFF).
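
Several of the masked numbers can be rebuilt from what survives in the printout. The sketch below assumes equal group sizes (as stated), the pooled-variance formulas for the "Equal" row, and that the t-value refers to (mean of sample 1 minus mean of sample 2); the last point is an assumption about the output's sign convention.

```python
from math import sqrt

df_equal = 48          # degrees of freedom in the "Equal" row
t_value = -9.82        # t statistic in the "Equal" row
sd1, sd2 = 6, 5        # standard deviations of samples 1 and 2
mean2 = 95.5           # mean of sample 2

# Equal group sizes: df = 2n - 2.
n = (df_equal + 2) // 2

se_mean1 = sd1 / sqrt(n)
se_mean2 = sd2 / sqrt(n)

# Pooled variance and standard error of the difference.
pooled_var = ((n - 1) * sd1**2 + (n - 1) * sd2**2) / df_equal
se_diff = sqrt(pooled_var * (2 / n))

# Assumption: t = (mean1 - mean2) / se_diff.
mean_diff = t_value * se_diff
mean1 = mean2 + mean_diff

print(n, round(se_mean2, 3), round(se_diff, 4))
print(round(mean_diff, 3), round(mean1, 2))
```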


Question V
Krypton Tea Estates is test marketing its new brand called Wah Taja.
They have selected a total of 200 housewives in 4 cities namely,
Ahmedabad, Bangalore, Calcutta and Delhi. In all the 4 cities put
together, 40% of the housewives preferred Wah Taja over their usual
brand. While 40% of the total sample is from Calcutta, 30% is from
Delhi and only 12.5% from Ahmedabad. While 32% of the housewives from
Ahmedabad had preferred Wah Taja over their regular brand, 33
housewives in Calcutta preferred Wah Taja. The number of housewives
who preferred Wah Taja in Delhi is exactly 2/3 of those who preferred
Wah Taja in Calcutta.

Test the null hypothesis that there is no difference in the preference
for Wah Taja in the four cities. Select an appropriate α.
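
The contingency table has to be pieced together from the percentages before any test can be run. A minimal sketch of that reconstruction and of the chi-square statistic for homogeneity follows; the critical value still has to be looked up for the chosen significance level.

```python
total = 200

# City sample sizes: three are given as shares, Bangalore is the remainder.
share = {"Ahmedabad": 0.125, "Calcutta": 0.40, "Delhi": 0.30}
sample = {city: round(total * s) for city, s in share.items()}
sample["Bangalore"] = total - sum(sample.values())

# Housewives preferring Wah Taja in each city.
preferred = {
    "Ahmedabad": round(0.32 * sample["Ahmedabad"]),
    "Calcutta": 33,
}
preferred["Delhi"] = round(2 / 3 * preferred["Calcutta"])
preferred["Bangalore"] = round(0.40 * total) - sum(preferred.values())

# Chi-square statistic against a common preference rate in all cities.
overall = sum(preferred.values()) / total
chi_square = 0.0
for city, n in sample.items():
    cells = (
        (preferred[city], n * overall),
        (n - preferred[city], n * (1 - overall)),
    )
    for observed, expected in cells:
        chi_square += (observed - expected) ** 2 / expected

df = len(sample) - 1
print(sample, preferred)
print(round(chi_square, 3), df)
```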








QUIZ I

Question I
Swetha Chemicals has introduced a new detergent powder in the market.
The company had distributed free samples to a large number of customers
and now the company is interested in the repurchase of the detergent
powder by these customers. Initially the company assigned the following
probabilities to the events "proportion of customers coming back for
repurchase".

E(i): Proportion of customers Probability
coming back for repurchase
-------------------------------- ------------
20% 0.15
30% 0.30
40% 0.35
50% 0.20
------------------------------- ------------

It is assumed that all customers make their decision independently.
A sample of 20 customers was chosen at random and it was found that 10
had actually come back for repurchase and the other 10 did not come back
for repurchase. After the sample, what probabilities should the
management assign to the events "proportion of customers coming back for
repurchase"?
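
This is a discrete Bayesian update: each prior probability is weighted by the binomial likelihood of seeing 10 repurchases in 20 customers, then the results are renormalised. A short sketch, with illustrative variable names:

```python
from math import comb

# Prior probabilities for each candidate repurchase proportion.
prior = {0.20: 0.15, 0.30: 0.30, 0.40: 0.35, 0.50: 0.20}
n, k = 20, 10   # sample size and number who came back

# Binomial likelihood of the sample under each candidate proportion.
likelihood = {p: comb(n, k) * p**k * (1 - p) ** (n - k) for p in prior}

# Bayes' rule: posterior is proportional to prior times likelihood.
evidence = sum(prior[p] * likelihood[p] for p in prior)
posterior = {p: prior[p] * likelihood[p] / evidence for p in prior}

for p, prob in posterior.items():
    print(p, round(prob, 4))
```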


Question II

In a recent readership survey of 100 executives, it was found that 60
read Business India (BI); 20 read Business India and Business World
(BW); 10 read only Business Today (BT); 10 read all the three; 30 read
only Business World; and 30 read Business Today and Business India. All
the executives read at least one of the three magazines.

a. What is the probability that one executive selected at random,
reads BT only?



b. What is the probability that one executive selected at random,
reads BW and BI?


c. What is the probability that one executive selected at random,
reads only one of the three magazines?



d. In a random sample of 5 executives, what is the probability that
not more than 3 of them read BW and BT?



e. In a random sample of 5 executives, what is the probability that
exactly one of them reads BW and BI?


QUIZ II
I. Bangalore Transport Authority wants to fine tune the duration of
the green lights in a busy intersection in the city. They have
collected the data during peak hours (4 to 6 pm) for this purpose. It
was found that, on 21 days, the number of vehicles passing through the
intersection on the North-South road averaged to 247.3 with a standard
deviation of 18.7 while on the East-West road the number of vehicles on
11 days averaged to 254.1 with a standard deviation of 15.2. Based on
this data, should the green light on the North-South road be different
from that on the East-West Road?
(It is assumed that the concerned random variable, i.e., the number of
vehicles, follows a normal distribution.)
(Clearly state your Null and Alternate Hypotheses)


II. Integrated Circuits India Ltd. recently introduced a Quality
Improvement Programme (QIP) because the management felt that too much of
re-working was being done on the ICs. It was found that 26 ICs out of a
random sample of 200 required re-working prior to the QIP. Following
the introduction of QIP and using Pareto Charts and other approaches to
identify significant problems, it was felt that considerable
improvements were made. Out of a sample of 100 ICs taken after
introduction of QIP, only 10 required re-working.

1. Calculate a 95% confidence interval for the true difference in the
proportion of ICs requiring re-work, before and after QIP.

2. Test, at 5% significance level, whether there was really an
improvement in terms of the proportion of ICs requiring re-work after the
introduction of QIP.
(Clearly state your Null and Alternate Hypotheses)
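
A hedged sketch of the usual large-sample calculations for two proportions: an unpooled standard error for the confidence interval and a pooled one for the test statistic. The choice of pooled versus unpooled follows common textbook practice and is not dictated by the question.

```python
from math import sqrt

x_before, n_before = 26, 200   # ICs needing re-work before QIP
x_after, n_after = 10, 100     # ICs needing re-work after QIP

p_before = x_before / n_before
p_after = x_after / n_after
diff = p_before - p_after

# Part 1: 95% confidence interval with the unpooled standard error.
z95 = 1.96
se_unpooled = sqrt(
    p_before * (1 - p_before) / n_before + p_after * (1 - p_after) / n_after
)
ci = (diff - z95 * se_unpooled, diff + z95 * se_unpooled)

# Part 2: z statistic with the pooled proportion under H0: p_before = p_after.
p_pooled = (x_before + x_after) / (n_before + n_after)
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n_before + 1 / n_after))
z_stat = diff / se_pooled

print([round(v, 4) for v in ci], round(z_stat, 3))
```

For a one-sided test at the 5% level, the z statistic would be compared with 1.645.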



QUIZ III

Question I:
The arrivals and departures at a service counter were observed for 5
days (120 hours). Arrival and departure times were recorded and from
that record, the interarrival times were computed. It was decided to
choose 20 min. intervals for creating the frequency distribution as
given below:

Duration (minutes)    Frequency
    0 -  20              110
   20 -  40               71
   40 -  60               26
   60 -  80               20
   80 - 100                8
  100 - 120                5
--------------------------------
It was felt that the arrivals follow a Poisson distribution. Test the
hypothesis by using the χ² test after calculating the parameter λ from
the above data (with α = .05). State your Null and Alternate hypotheses
clearly.


Question II:
Three special formulas for curing a special resin were studied and the
following curing times were observed:

         Resin A   Resin B   Resin C
            13        13         4
            10        11         1
             8        14         3
            11        14         4
             8                   2
                                 4
------------------------------------
Σx          50        52        18
Σx²        518       682        62
------------------------------------
Test whether there is a significant difference between the mean curing
times of the resins at α = .05. State your Null and Alternate hypotheses
clearly.
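
Because the column totals Σx and Σx² are supplied, the ANOVA sums of squares can be obtained without touching the raw observations. A minimal sketch of that shortcut:

```python
# Group sizes and the column totals given under the table.
n = {"A": 5, "B": 4, "C": 6}
sum_x = {"A": 50, "B": 52, "C": 18}
sum_x2 = {"A": 518, "B": 682, "C": 62}

N = sum(n.values())
T = sum(sum_x.values())
correction = T ** 2 / N   # correction term, T^2 / N

ss_total = sum(sum_x2.values()) - correction
ss_between = sum(sum_x[g] ** 2 / n[g] for g in n) - correction
ss_within = ss_total - ss_between

df_between = len(n) - 1
df_within = N - len(n)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(ss_between, ss_within, round(f_stat, 3))
```

The resulting F is compared with the tabulated F value for 2 and 12 degrees of freedom at the 5% level.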



MIDTERM
Question I A large group of 6400 students had taken a Statistics test
and the percentage scores are found to be normally distributed with
mean μ and standard deviation σ. The coefficient of variation was found
to be 20%. The absentminded professor forgot the values of μ and σ, but
nevertheless remembered that a one sided confidence interval for μ with
a confidence level of 78.81% yielded a lower limit of 49. Of course, he
remembered that the sample size used was 64.

1. Find the values of μ and σ.

2. If all the students who scored 70% or more obtained an "A" Grade,
what percentage of the entire group did obtain an "A" Grade?

3. If three students from the group are selected at random, what is
the probability that all the three obtained an "A" Grade?

Question II A contractor is planning to bid for a job of filling
potholes on a given stretch of road. Before finalising the bid, he got
three experts, namely VN, SS and RKH (who are considered equally
reliable in their claims) to calculate the average no. of potholes in a
one km stretch. As usual, the three experts came up with conflicting
statements. The claims of average number of potholes in a one km
stretch turned out to be 30, 35 and 40 respectively for VN, SS and RKH.

1. In a 100 metre stretch, what is the probability of finding 3 or
more potholes, given that VN's claim is correct?

2. What is the probability of finding exactly 3 potholes in a 100
metre stretch?

3. What is the probability that the distance between any two potholes
is not more than 100 metres, given that VN's claim is correct?

4. Given that you found exactly 3 potholes in a 100 metre stretch,
what is the probability that VN's claim is correct?

Question III
The density function for daily rainfall (x, measured in cms.) at a tea
garden in the Western Ghats is as follows:

f(x) = 0.4 - 0.05x for 0 ≤ x ≤ 2

f(x) = 0.6 - 0.15x for 2 ≤ x ≤ 4

f(x) = 0 otherwise.

The day is considered to be a wet day if the daily rainfall is more than
2cm; it is considered to be dry day if the daily rainfall is less than 1
cm; and normal day otherwise.

1. What is the probability that a day selected at random, turns out
to be a wet day?

2. What is the probability that on a selected day, the rainfall is
between 1 cm and 3 cm?



3. Given a wet day, what is the probability that the daily rainfall
is more than 3 cms?

4. What is the probability that more than 120 days out of 200 are wet
days?

Question IV
Virgo Toy Company manufactures toy cars in three different colours: red,
white and blue. These are packed in two different boxes: green and
yellow. Each green box is packed with 2 red cars, 3 white cars and 5
blue cars whereas each yellow box is packed with 3 red cars, 4 white
cars and 3 blue cars.
The entire production of the last month was packed into boxes. A random
sample of 50 boxes was selected and it was found that 40% of them were
green.

1. Based on the above sample, calculate a 95% two sided confidence
interval for the proportion of blue cars in the last month's
production.

2. Based on the above sample, calculate a 95% two sided confidence
interval for the proportion of red cars in the last month's
production.

3. What is the largest sample size of red cars required to make a
safe estimate of the 95% confidence interval for the proportion of
red cars in the last month's production, given that the width of
the confidence interval should be 0.06?

4. Based on the above sample of 50 boxes, calculate a 95% two sided
confidence interval for the proportion of yellow boxes.

Question V

Swetha Chemicals maintains an inventory of spares required. Over a
period of time, the inventory got accumulated and the management
initiated an exercise to prune the inventory. As a first step, the
items in the inventory are classified into 3 groups: A Class, B Class
and C Class. The details of these items are given below:

            Total No.    Standard Deviation    Cost of
            of items     of Value (Rs.)        Sampling/item (Rs.)
A Class        300             150                   16
B Class        400              80                    9
C Class       1100              40                    4

It is decided to take a sample of 90 items from the entire inventory.

1. Determine how many items are to be selected from each of the three
classes based on proportional allocation. Calculate the variance
of the overall sample mean under this scenario.

2. Determine how many items are to be selected from each of the three
classes based on optimal allocation. Calculate the variance of
the overall sample mean under this scenario.



3. Considering that the total budget allocated for the sampling is
only Rs. 750, determine the sample size and the allocation across
the three classes based on optimal allocation. Calculate the
variance of the overall sample mean under this scenario.


4. Considering that the total budget allocated for the sampling is
only Rs. 750, determine the sample size and the allocation across
the three classes using the cost of sampling so as to minimise the
variance of the overall sample mean. Calculate the variance of
the overall sample mean under this scenario.




FINAL EXAMINATION

Question I
Swetha Chemicals decided to introduce a new detergent -SOFTWHITE- in the
market. They have a choice of either SMALL advertisement budget or
LARGE advertisement budget for promotion. They have defined two
possible market conditions - LOW (defined as 20% of the users will
purchase SOFTWHITE) and HIGH (defined as 40% of the users will purchase
SOFTWHITE). The management has assigned a probability of 0.6 for LOW
and 0.4 for HIGH. The pay off matrix for the two levels of
advertisement budget is as follows:

Advertisement Market Condition
Budget -----------------
LOW HIGH
---------------------------------------------
SMALL 10 100
LARGE -20 300
---------------------------------------------
These figures are in Rs. lakh (i.e., if the company opts for LARGE
Advertisement Budget, and the market conditions turn out to be LOW, the
company incurs a loss of Rs. 20 lakh).

1. Based on the above data, decide which strategy (i.e., SMALL or
LARGE advertisement budget) should the company follow?

2. What is the expected value of Perfect Information in this case?

To facilitate the decision making, the company executives decided to
survey a random sample of 10 users (it is assumed that the users make
their decisions independently). If the number of users purchasing
SOFTWHITE is less than 2, the response is termed as BAD, if the number
of users purchasing SOFTWHITE is between 3 and 5, the response is termed
as AVERAGE and if the number of users purchasing SOFTWHITE is more than
5, the response is termed as GOOD. It was also decided that if the
response is BAD, LARGE advertisement will not be considered and on the
other hand, if the response is GOOD, only LARGE advertisement will be
considered.

3. Draw the decision tree for the above.

4. What is the maximum that the company is willing to pay for the
customer survey?

5. What is the final strategy for the company?

DECISION TREE



QUESTION II
I was trying to do an ANOVA (ONEWAY) using SPSS. Unfortunately, a new
virus called P9p96 attacked the computer and as a result some of the
numbers in the ANOVA table were replaced by "$$$". The printout of the
output is reproduced below:

SPSS/PC+ The Statistical Package for IBM PC 9/17/96
GET /FILE 'd:\vnspss\anova.sys'.
The SPSS/PC+ system file is read from
file d:\vnspss\anova.sys


The file was created on 9/17/96 at 10:26:55
and is titled SPSS/PC+
The SPSS/PC+ system file contains
50 cases, each consisting of
34 variables (including system variables).
34 variables will be used in this session.
Page 2 SPSS/PC+ 9/17/96
This procedure was completed at 10:30:57
Page 3 SPSS/PC+ 9/17/96

ONEWAY /VARIABLES WEIGHT BY NEW (0,4).
Page 4 SPSS/PC+ 9/17/96

- - - - - - - - - - O N E W A Y - - - - - - - - - -

Variable WEIGHT
By Variable NEW
Analysis of Variance

                           Sum of      Mean        F        F
Source            D.F.     Squares     Squares     Ratio    Prob.

Between Groups    $$$      $$$         $$$         14.73    .001

Within Groups     $$$      108.078     $$$

Total             $$$      $$$
Page 5 SPSS/PC+ 9/17/96

This procedure was completed at 10:31:46
-------------------------------------------------------------
Page 6 SPSS/PC+ 9/17/96
fin.

Answer the following questions, based on the above printout:

1. What is the No. of samples used in the above ANOVA?________


2. What is the "Within Groups" Degrees of Freedom?__________

3. What is the "Between Groups" Degrees of Freedom?__________


4. What is the "Total" Degrees of Freedom?__________




5. What is the "Between Groups" Sum of Squares?_________

6. What is the "Within Groups" Mean Squares?_________


7. What is the "Total" Sum of Squares?_________


8. What is the "Between Groups" Mean Squares?_________


9. What does the value .001 under the column "F Prob" signify?


10. What can you conclude from the above ANOVA Table?
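
The masked entries can be recovered from the values that survived. The following is a minimal sketch in Python, not a definitive solution. It assumes that `NEW (0,4)` defines five groups (codes 0 through 4, all present) and that all 50 cases in the system file entered the analysis; the printout states neither outright.

```python
# Reconstruct a one-way ANOVA table from the surviving printout values.
n_cases = 50          # "50 cases" in the printout
n_groups = 5          # ASSUMPTION: NEW takes the five codes 0..4
ss_within = 108.078   # Within Groups sum of squares from the printout
f_ratio = 14.73       # F ratio from the printout

df_between = n_groups - 1
df_within = n_cases - n_groups
df_total = n_cases - 1

ms_within = ss_within / df_within
ms_between = f_ratio * ms_within        # because F = MS_between / MS_within
ss_between = ms_between * df_between
ss_total = ss_between + ss_within

print("D.F. (between, within, total):", df_between, df_within, df_total)
print("SS between:", round(ss_between, 3))
print("MS between:", round(ms_between, 3))
print("MS within: ", round(ms_within, 3))
print("SS total:  ", round(ss_total, 3))
```

If the number of groups differs from the assumed five, only `n_groups` needs to change.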

Question III
The following is the data with respect to the burning times (in minutes)
of two different emergency flares.

Brand A: 19.4, 21.5, 15.3, 17.4, 16.8, 16.6, 20.3, 22.5, 21.3, 23.4,
19.7, 21.0, 19.4, 16.5, 15.8, 17.4

Brand B: 16.5, 15.8, 24.7, 10.2, 13.5, 15.9, 15.7, 14.0, 12.1, 17.4,
15.6, 15.8, 17.4, 22.5, 13.5

Test whether the two samples come from identical populations using α =
0.05. Use only a non-parametric test.

Question IV
The following is the data with respect to the number of employees and
the profit after tax for 10 companies.

No. Employees(X) Profit after Tax(Y)
---------------- -------------------
11 58
20 86
15 60
16 62
30 110
25 97
24 95
28 105
17 67
14 60

ΣX = 200    ΣY = 800    ΣXY = 17174

ΣX² = 4372    ΣY² = 67852

1. Test the null hypothesis (at 0.05 significance level) that there
is no correlation between No. of employees and PAT. (State your Null
and Alternate hypotheses clearly).

2. Fit a linear regression function with PAT as the dependent
variable and no. of employees as the independent variable.



3. Interpret the regression coefficients after testing the null
hypothesis (at α = 0.05) that both the intercept and the slope are equal
to zero.

4. Calculate R² and interpret it.

5. Calculate a 95% confidence interval for the expected value of PAT
for a company employing 30 persons.
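
The correlation and the fitted line follow directly from the summary sums supplied with the data (n = 10, sum of X = 200, sum of Y = 800, sum of XY = 17174, sum of X squared = 4372, sum of Y squared = 67852). The sketch below is one way to organise the arithmetic in Python; the variable names are illustrative, and the t statistic uses the usual test of zero correlation with n - 2 degrees of freedom.

```python
import math

# Summary sums as given with the data table.
n = 10
sum_x, sum_y = 200, 800
sum_xy, sum_x2, sum_y2 = 17174, 4372, 67852

# Corrected sums of squares and cross products.
sxx = sum_x2 - sum_x**2 / n
syy = sum_y2 - sum_y**2 / n
sxy = sum_xy - sum_x * sum_y / n

# Least-squares line: PAT = intercept + slope * employees.
slope = sxy / sxx
intercept = sum_y / n - slope * sum_x / n

# Correlation, coefficient of determination, and the t statistic
# for H0: no correlation (n - 2 degrees of freedom).
r = sxy / math.sqrt(sxx * syy)
r_squared = r**2
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"r = {r:.4f}, R^2 = {r_squared:.4f}, t = {t_stat:.2f}")
```

The t statistic is then compared with the two-sided critical value from a t table with 8 degrees of freedom.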





Question V
Four different brands of automobile tyres were tested to see how long
they will last. The sample sizes were 50 each for Brand A and Brand B,
60 for Brand C and 40 for Brand D. 26 of Brand A, 30 of Brand B, 34 of
Brand C and 20 of Brand D lasted for less than 20,000 kms. Test the
null hypothesis that the proportion of tyres lasting greater than or
equal to 20,000 kms is the same for all the four brands. Use a
parametric test at 10% level of significance. Clearly state the Null
and alternate hypotheses and assumptions, if any.



QUIZ I

Question I. A ticket printing machine makes an error on one out of
every fifty tickets printed, on the average. Tickets are packed
in bundles of 200. Ten bundles are packed into a carton and one
hundred cartons are packed into a crate. A bundle is considered
unacceptable if it contains more than eight defective tickets. A
carton is considered unacceptable if it has more than one
unacceptable bundle.

1. What is the probability that a carton contains

a) No unacceptable bundles?


b) At most one unacceptable bundle?


2. What is the expected number of defective tickets in a crate?



3. What is the expected number of unacceptable cartons in a crate?
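
The ticket problem is a chain of binomial models: tickets within a bundle, bundles within a carton, cartons within a crate. The sketch below is a hedged illustration in Python. It assumes that printing errors are independent from ticket to ticket, so that each level can be treated as binomial; the helper `binom_pmf` is written here for illustration only.

```python
from math import comb

p_error = 1 / 50          # one error per fifty tickets, on average
bundle_size = 200
bundles_per_carton = 10
cartons_per_crate = 100


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# A bundle is unacceptable if it holds more than eight defective tickets.
p_bad_bundle = 1 - sum(binom_pmf(k, bundle_size, p_error) for k in range(9))

# Number of unacceptable bundles in a carton: Binomial(10, p_bad_bundle).
p_no_bad = binom_pmf(0, bundles_per_carton, p_bad_bundle)
p_at_most_one = p_no_bad + binom_pmf(1, bundles_per_carton, p_bad_bundle)

# Expectations over a whole crate.
expected_defective = cartons_per_crate * bundles_per_carton * bundle_size * p_error
expected_bad_cartons = cartons_per_crate * (1 - p_at_most_one)

print(f"P(bundle unacceptable)       = {p_bad_bundle:.4f}")
print(f"P(no unacceptable bundles)   = {p_no_bad:.4f}")
print(f"P(at most one unacceptable)  = {p_at_most_one:.4f}")
print(f"E[defective tickets / crate] = {expected_defective:.0f}")
print(f"E[unacceptable cartons]      = {expected_bad_cartons:.2f}")
```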


Question II. ABC Electronics has a policy of training their new recruits
before they are put on a particular production line, but 20% of
the new recruits could not attend the training program before they
were put on that production line. All the new recruits are
assigned to the production line, which is assigned only new
recruits. It was found that 10% of the items produced by the
trained recruits were defective. Also, given that an item was
defective, the probability that it was produced by a trained
recruit is 0.4.

4. What is the probability that an item is defective, given that it
was produced by an untrained new recruit?


5. Given that an item is defective, what is the probability that the
recruit who produced it did not undergo any training?


6. What is the probability that an item is defective?



7. Given that an item is not defective, what is the probability that
it was produced by a trained new recruit?


8. What proportion of the items produced by the trained new recruits
are not defective?
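
All five answers follow from the three given probabilities through the definition of conditional probability. A minimal sketch in Python, assuming every item is produced by exactly one recruit who is either trained or untrained:

```python
# Given in the question.
p_trained = 0.8              # 20% of recruits missed the training
p_def_given_trained = 0.10   # P(defective | trained)
p_trained_given_def = 0.4    # P(trained | defective)

# P(defective and trained), then P(defective) from P(T|D) = P(D and T) / P(D).
p_def_and_trained = p_trained * p_def_given_trained
p_def = p_def_and_trained / p_trained_given_def

# Remaining quantities.
p_untrained = 1 - p_trained
p_def_given_untrained = (p_def - p_def_and_trained) / p_untrained   # question 4
p_untrained_given_def = 1 - p_trained_given_def                     # question 5
p_trained_given_ok = p_trained * (1 - p_def_given_trained) / (1 - p_def)  # question 7
p_ok_given_trained = 1 - p_def_given_trained                        # question 8

print(round(p_def_given_untrained, 4), round(p_untrained_given_def, 4),
      round(p_def, 4), round(p_trained_given_ok, 4), round(p_ok_given_trained, 4))
```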




QUIZ II
Question I.
The weight of oranges produced by the MoneyWorth Orchards is
distributed normally with mean μ and standard deviation σ. The
oranges which weigh more than 266 gms. are exported and those with
weight less than 100 gms. are sold to the local fruit juice
factory. When MoneyWorth Orchards estimated a two-sided 95%
confidence interval for μ based on the known value of σ, the width
of the interval turned out to be 39.2. Considering that the
standard error is inversely proportional to the square root of the
sample size, they increased the sample size by 300 more oranges,
and the width was exactly halved (i. e., it became 19.6). They
have exported 12.30% of their production.

1. What is the value of μ?


2. What is the value of σ?

3. What percentage of the production was sold to the local fruit
juice factory?
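
The two interval widths pin down the original sample size, which in turn gives the standard deviation; the export percentage then gives the mean. The sketch below walks through that chain in Python. It assumes the table value 1.96 for the 95% interval and takes the two unknown parameters to be the mean and standard deviation of the normal weight distribution.

```python
from math import sqrt
from statistics import NormalDist

Z_975 = 1.96                       # ASSUMPTION: table value for a 95% interval
width_before, width_after = 39.2, 19.6
extra_oranges = 300
export_cutoff, juice_cutoff = 266, 100
share_exported = 0.1230

# Width is proportional to 1/sqrt(n), so
# sqrt((n + extra) / n) = width_before / width_after.
ratio = width_before / width_after
n_initial = extra_oranges / (ratio**2 - 1)

# Half-width = z * sd / sqrt(n), solved for the standard deviation.
sd = (width_before / 2) * sqrt(n_initial) / Z_975

# P(weight > 266) = 12.30% fixes the mean.
z_export = NormalDist().inv_cdf(1 - share_exported)
mean_weight = export_cutoff - z_export * sd

# Share sold to the juice factory: P(weight < 100).
share_juice = NormalDist(mean_weight, sd).cdf(juice_cutoff)

print(f"initial n = {n_initial:.0f}, sd = {sd:.1f}, mean = {mean_weight:.1f}")
print(f"share sold to juice factory = {share_juice:.4f}")
```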


Question II.

The average number of machines breaking down in a factory is 5 for
60 days, Poisson distributed. Every time there is a breakdown,
they have to make a service call.

4. What is the probability that the time between two consecutive
service calls is less than 10 days?


5. What is the average time between service calls?
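
When breakdowns follow a Poisson process, the time between consecutive breakdowns is exponentially distributed with the same rate. A short Python sketch of both answers, assuming the rate of 5 breakdowns per 60 days is constant over time:

```python
from math import exp

rate_per_day = 5 / 60       # Poisson rate: 5 breakdowns per 60 days

# Time between consecutive calls is exponential with this rate.
p_gap_under_10_days = 1 - exp(-rate_per_day * 10)
mean_gap_days = 1 / rate_per_day

print(f"P(gap < 10 days) = {p_gap_under_10_days:.4f}")
print(f"mean gap = {mean_gap_days:.1f} days")
```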



Question III.
Many public polling agencies conduct surveys to determine the
current consumer sentiment concerning the state of the economy.
One such agency randomly sampled 484 consumers and found that 257
were optimistic about the state of the economy.

6. Develop a 95% confidence interval for the proportion of consumers
who are optimistic about the state of the economy.




7. Based on the above, is it possible to conclude that the majority
of the consumers are optimistic about the state of the economy?




8. If the true proportion of consumers optimistic about the economy
was 0.5, what is the probability that 257 or more in a sample of
484 are optimistic about the state of the economy?
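
The three parts can be sketched with the normal approximation to the binomial. The Python below is illustrative rather than definitive: it uses the large-sample (Wald) interval for the proportion, and for question 8 it applies a continuity correction, which is an assumption since the question does not specify one.

```python
from math import sqrt
from statistics import NormalDist

n, optimistic = 484, 257
p_hat = optimistic / n

# Question 6: large-sample 95% confidence interval for the proportion.
z = NormalDist().inv_cdf(0.975)
se = sqrt(p_hat * (1 - p_hat) / n)
ci_low, ci_high = p_hat - z * se, p_hat + z * se

# Question 7: a majority can be claimed only if the whole interval exceeds 0.5.
majority_supported = ci_low > 0.5

# Question 8: P(X >= 257) when p = 0.5, normal approximation
# with continuity correction (ASSUMPTION).
mean0 = n * 0.5
sd0 = sqrt(n * 0.5 * 0.5)
p_257_or_more = 1 - NormalDist(mean0, sd0).cdf(optimistic - 0.5)

print(f"95% CI: ({ci_low:.4f}, {ci_high:.4f})")
print("majority supported:", majority_supported)
print(f"P(X >= 257 | p = 0.5) = {p_257_or_more:.4f}")
```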


QUIZ III

Question I.

Radiant Paper Company had been using an old method of testing the
strength of paper where a special tearing machine is used by
testing a single sheet of paper (ply). It has been suggested that
measuring five sheets together (five plys) and then adjusting to
single thickness strength would be a better procedure. The first
question to be answered is whether the two procedures give
essentially the same value for strength. To answer this question,
the following procedure is used.

Five pieces of paper are cut in half. One half of each piece is
randomly selected and its strength is measured. Next, the five
remaining halves are tested together as a five ply specimen. Then
the average of the five individual readings minus the five ply
reading is calculated. This procedure is repeated for four
different, but representative types of paper. The procedure is
repeated for 3 such samples for each type of paper. The
observations of the differences are given below:

Observation No.    Type I    Type II    Type III    Type IV
-----------------------------------------------------------
       1            2.80      0.00       1.15        1.88
       2            0.75     -0.10       1.75        2.65
       3            3.70      3.45       4.20        2.70
-----------------------------------------------------------
Mean                2.417     1.117      2.367       2.410
-----------------------------------------------------------
Test whether the "mean differences" depend on paper type, using
0.05 significance level. (Use only a parametric test).
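
A one-way analysis of variance is the natural parametric test here. The sketch below computes the F ratio in Python from the twelve observations. It reads the first data column as Type I, assumes the usual ANOVA conditions (normal errors, equal variances), and takes the critical value from a standard F table.

```python
from statistics import mean

# Differences (average of five single-ply readings minus five-ply reading).
groups = {
    "Type I": [2.80, 0.75, 3.70],     # ASSUMPTION: first column is Type I
    "Type II": [0.00, -0.10, 3.45],
    "Type III": [1.15, 1.75, 4.20],
    "Type IV": [1.88, 2.65, 2.70],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# Between-groups and within-groups sums of squares.
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_ratio = (ss_between / df_between) / (ss_within / df_within)

F_CRIT_05 = 4.07   # table value of F(0.05; 3, 8)
reject_h0 = f_ratio > F_CRIT_05

print(f"SSB = {ss_between:.4f}, SSW = {ss_within:.4f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.3f}, reject H0: {reject_h0}")
```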


Question II.
A student has the impression (perhaps erroneous) that the score on
a statistics test has some relationship with the number of hours
studied for the test. She gathers data from ten of her friends
with the following result:

Hours studied 4 9 10 14 4 7 12 22 1 17
Test score 31 58 65 73 37 44 60 91 21 84

Using this information she would like to predict the test score based on
the number of hours studied.

1. Develop a regression equation with test score as the dependent
variable.

2. Determine the 95% confidence intervals for the slope and the
intercept of the regression line.

3. Interpret the meaning of the slope and the intercept.

4. State the limitations of this regression, if any.

5. If the student studies 15 hours for the test, what should be her
expected test score? Develop a 95% confidence interval for the
expected test score.

6. Can a professor use this regression equation to infer the number
of hours a student has studied, using the student's test score?
Explain.

Information that you may wish to use:

Variable Sum Sum of squares

Hours studied 100 1,376
Test score 564 36,562

Sum of (Hours studied * Test score) = 6,945



MID TERM
Question I.
Mr. Suresh of Innovative Market Research is analysing the data on
customer satisfaction for International Holiday resorts. He has
collected data from 200 customers who had visited the resorts at four
locations namely Yarcaud (Y), Kodai (K), Ooty (O) and Puri (P). He
classified the satisfaction level into 3 categories namely Highly
satisfied (H), Satisfied (S) and Unsatisfied (U). He was testing the
null hypothesis that the satisfaction level is independent of the
location of the resort. Unfortunately, a new computer virus called
pgp95 infected his computer and replaced some of the numbers in the
frequency table with ###. When he took a print out, the computer
printed the following table:

Y K O P TOTAL

H ### ### 25 ### ###

S ### 18 ### 20 ###

U ### ### ### ### ###

TOTAL 45 50 ### 50 ###

Luckily, Suresh remembered certain special aspects of the data in
the table that he noticed before the virus infection. These
aspects are as follows:

i. The events "PURI" and "Highly Satisfied" are mutually
exclusive.
ii. Customers at Yarcaud are uniformly distributed across the
three categories of satisfaction level.
iii. There are the same number of "Satisfied" customers as there are
"Unsatisfied" customers.
iv. P(H/K) = P(P/S)

Help Mr. Suresh by filling the missing numbers in the table and
answer the following questions based on the observed values in the
table.

1. What is the probability that a customer, selected randomly, is
"Highly Satisfied", given that he visited Yarcaud?

2. What is the proportion of customers who are either "Highly
Satisfied" or "Satisfied"?

3. What is the number of customers who visited Ooty?

4. What is the probability that a customer, randomly selected, is
"Satisfied" and also visited Kodai?

5. Test the null hypothesis that the satisfaction level is
independent of the location visited (α = 0.01).


Question II.
Every day, Vijaya Bakery bakes three large chocolate cakes and
those which are not sold on the same day are given away to the
students of the nearby institute. The data with respect to the
number of cakes sold on the same day that they are baked is given
below:

Number of Cakes Number of
Sold days
----------------------------------
0 1
1 16
2 55
3 228
----------------------------------

Using this data, test the null hypothesis that the random variable
"number of cakes sold on the same day that they are baked" follows a
binomial distribution. Please note that the probability of success is
not known and has to be estimated from the data.
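
A chi-square goodness-of-fit test fits this setup. The sketch below, in Python, estimates the success probability from the data and computes the test statistic. Pooling the categories 0 and 1 is an assumption made here so that every expected count is at least 5; the question itself does not prescribe a pooling rule or a significance level.

```python
from math import comb

trials = 3                                   # three cakes baked each day
observed = {0: 1, 1: 16, 2: 55, 3: 228}      # cakes sold -> number of days

days = sum(observed.values())
cakes_sold = sum(k * d for k, d in observed.items())
p_hat = cakes_sold / (trials * days)         # estimated P(a cake is sold)

# Expected number of days under Binomial(3, p_hat).
expected = {
    k: days * comb(trials, k) * p_hat**k * (1 - p_hat) ** (trials - k)
    for k in observed
}

# ASSUMPTION: pool 0 and 1 so each expected count is at least 5.
obs_cells = [observed[0] + observed[1], observed[2], observed[3]]
exp_cells = [expected[0] + expected[1], expected[2], expected[3]]

chi_square = sum((o - e) ** 2 / e for o, e in zip(obs_cells, exp_cells))
df = len(obs_cells) - 1 - 1                  # one more lost for estimating p

print(f"p_hat = {p_hat:.3f}")
print("expected:", {k: round(v, 1) for k, v in expected.items()})
print(f"chi-square = {chi_square:.3f} with {df} degree(s) of freedom")
```

The statistic is then compared with the chi-square table value for the chosen significance level.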

I. A company buys batches of an electronic component from a
particular supplier. Due to some peculiarities of the
manufacturing process used by the supplier, the average life of
the components in a batch is either 50 hours or 51 hours. In
either case, the standard deviation is one hour. The company
would like to reject batches which have the lower average life.
There is one problem, however: testing of components is
destructive, which means that any component which is tested is
destroyed and therefore not usable. The company would like to
devise a test that will help them distinguish between the two
batches in such a way that as few components are tested as
possible. The company has decided that if the mean is really 51
hours, it should accept the batch at least 95% of the time, and if
the mean really is 50 hours, it should accept the batch no more
than 10 percent of the time.

1. State the null and alternate hypotheses.


2. Determine the minimum number of components that should be tested
in order to satisfy the company's requirements.
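
The two acceptance requirements translate into the standard sample-size formula for a one-sided test of a mean with known standard deviation. A hedged sketch in Python, assuming the sample mean is normally distributed and the batch is accepted when the sample mean is at or above a cutoff:

```python
from math import ceil, sqrt
from statistics import NormalDist

mu_good, mu_bad, sigma = 51.0, 50.0, 1.0
alpha = 0.05   # P(reject batch | mean = 51) must not exceed 5%
beta = 0.10    # P(accept batch | mean = 50) must not exceed 10%

z_alpha = NormalDist().inv_cdf(1 - alpha)
z_beta = NormalDist().inv_cdf(1 - beta)

# Both requirements hold when sqrt(n) >= (z_alpha + z_beta) * sigma / (mu_good - mu_bad).
n_min = ceil(((z_alpha + z_beta) * sigma / (mu_good - mu_bad)) ** 2)

# Accept the batch when the sample mean is at or above this cutoff.
cutoff = mu_good - z_alpha * sigma / sqrt(n_min)

# Check both error rates at n_min.
dist_good = NormalDist(mu_good, sigma / sqrt(n_min))
dist_bad = NormalDist(mu_bad, sigma / sqrt(n_min))
p_accept_good = 1 - dist_good.cdf(cutoff)
p_accept_bad = 1 - dist_bad.cdf(cutoff)

print(f"minimum n = {n_min}, cutoff = {cutoff:.4f}")
print(f"P(accept | 51) = {p_accept_good:.4f}, P(accept | 50) = {p_accept_bad:.4f}")
```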

II. An automobile company wishes to compare the mean performance of
two types of shock absorbers: brand A and brand B. Of interest
was whether brand B lasted at least 500 km more than brand A, on
the average. A test was conducted using independent samples of 13
of each type of shock absorber. The sample from brand A
showed a mean of 25,200 km and a variance of 80,000 km² whereas
the sample from brand B showed a mean of 25,875 km and a standard
deviation of 40,000 km².

1. State the null and alternate hypotheses.

2. Test the hypotheses at the 5% significance level.

Question III.
The systolic blood pressure readings of males between the ages of
35 and 59 show a standard deviation of about 17 millimeters. A
sample of 41 male runners in the age group of 35 to 59 showed a
(sample) standard deviation of 15 millimeters. Test the claim
that runners in this age group show less variability in their
systolic blood pressure, using a 5% level of significance.

1. State the null and alternate hypotheses.

2. Perform the test and indicate the results.
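
The claim of lower variability is a lower-tailed chi-square test on a single variance. A small Python sketch follows; it assumes blood pressure readings are normally distributed and takes the critical value from a standard chi-square table.

```python
n = 41            # runners sampled
s = 15.0          # sample standard deviation (mm)
sigma0 = 17.0     # standard deviation under H0 (mm)

# Test statistic: (n - 1) * s^2 / sigma0^2, chi-square with n - 1 df under H0.
chi_square = (n - 1) * s**2 / sigma0**2

CHI2_LOWER_05_DF40 = 26.509   # table value: lower 5% point, 40 df
reject_h0 = chi_square < CHI2_LOWER_05_DF40

print(f"chi-square = {chi_square:.3f}, reject H0: {reject_h0}")
```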



Final Examination

I. Bharat Engineering collected the following data on the breaking
strength (in kg) from samples of two different materials.

Material A: 144, 181, 200, 187, 180, 169, 171, 180, 194, 176, 182, 198,
183

Material B: 187, 180, 176, 194, 176, 198, 154, 134, 169, 185, 161, 170,
164, 196

The company wants to test the null hypothesis that there is no
difference between the breaking strength of the two materials. Use a
non-parametric test, at 0.05 level of significance, to test the above
null hypothesis.


II. A study shows that 16 out of 200 tractors produced on the Assembly
line A required extensive adjustments before shipping, while the
same was true for 26 out of 400 tractors produced on the Assembly
line B.

1. Calculate a 95%, two sided confidence interval for the difference
between the proportions of tractors requiring extensive adjustment
from the two assembly lines.

2. Test the null hypothesis, at 0.05 significance level, that there
is no difference between the proportions of tractors requiring
extensive adjustment from the two assembly lines.
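
Both parts rest on the large-sample normal approximation for a difference of two proportions. The Python sketch below follows the common convention of an unpooled standard error for the interval and a pooled one for the test; that choice is an assumption, not something the question specifies.

```python
from math import sqrt
from statistics import NormalDist

x_a, n_a = 16, 200     # Assembly line A
x_b, n_b = 26, 400     # Assembly line B
p_a, p_b = x_a / n_a, x_b / n_b
diff = p_a - p_b

# Part 1: 95% two-sided interval with the unpooled standard error.
z_crit = NormalDist().inv_cdf(0.975)
se_unpooled = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

# Part 2: two-sided z test of equal proportions with the pooled standard error.
p_pool = (x_a + x_b) / (n_a + n_b)
se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z_stat = diff / se_pooled
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"difference = {diff:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"z = {z_stat:.3f}, two-sided p-value = {p_value:.3f}")
```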


III. The following is the SPSS Output obtained on 18/9/95 with some of
the values missing. Study the output carefully and answer the
questions. You should mention clearly the names of the dependent
and independent variables, while answering the questions.

SPSS/PC+ The Statistical Package for IBM PC 9/18/95

GET /FILE 'd:\vnspss\vn500.sys'.
The SPSS/PC+ system file is read from
file d:\vnspss\vn500.sys
The file was created on 6/16/95 at 9:55:18
and is titled SPSS/PC+
The SPSS/PC+ system file contains
500 cases, each consisting of
33 variables (including system variables).
33 variables will be used in this session.
This procedure was completed at 8:27:51
-------------------------------------------------------------------------------
REGRESSION /VARIABLES SALES1 OP1 INT1 DEP1 TAX1 NP1 ASSET1 NW1 EQ1 BORROW1
 /DEPENDENT NP1 /METHOD STEPWISE SALES1 INT1 DEP1 TAX1 ASSET1 NW1 EQ1 BORROW1.
-------------------------------------------------------------------------------
* * * * M U L T I P L E R E G R E S S I O N * * * *

Equation Number 1 Dependent Variable.. NP1 Net Profit - 94



Multiple R
R Square
Adjusted R Square .80017
Standard Error

Analysis of Variance
DF Sum of Squares Mean Square
Regression 6325822230.66524
Residual 1567902094.34276

F = Signif F =
-------------------------------------------------------------------------------

* * * * M U L T I P L E R E G R E S S I O N * * * *

Equation Number 1 Dependent Variable.. NP1 Net Profit - 94

------------------ Variables in the Equation ------------------

Variable              B        SE B       Beta         T  Sig T

NW1             .092957                .684625    16.753  .0000
TAX1            .673643                .196473     9.592  .0000
DEP1            .367192                .165261     4.064  .0001
(Constant)   -63.512913                            -.699  .4847
-------------------------------------------------------------------------------

* * * * M U L T I P L E R E G R E S S I O N * * * *

Equation Number 1 Dependent Variable.. NP1 Net Profit - 94

------------- Variables not in the Equation -------------

Variable Beta In Partial Min Toler T Sig T

SALES1 .004731 .005090 .183288 .113 .9099
INT1 -.006856 -.008282 .163832 -.184 .8539
ASSET1 .001279 .000870 .091838 .019 .9846
EQ1 .035318 .055539 .224749 1.238 .2165
BORROW1 .063269 .066066 .191605 1.473 .1414

End Block Number 1 PIN = .050 Limits reached.
-------------------------------------------------------------------------------
This procedure was completed at 8:30:57

1. What is the value of R Square for the regression equation
estimated above? What can we conclude based on this value?



2. What is the value of the Standard Error?

3. In the ANOVA table given above, what is the Regression Degrees of
Freedom?

4. In the ANOVA table given above, what is the Residual Degrees of
Freedom?

5. In the ANOVA table given above, what is the calculated F value?

6. What can we conclude based on the ANOVA table above?


7. Calculate a 95%, two sided confidence interval for the slope with
respect to the independent variable NW1.

8. Calculate a 95% two sided confidence interval for the Constant
(Intercept)


9. Interpret the value of B with respect to the independent variable
DEP1.

10. Which of the coefficients in the regression equation, if any,
cannot be considered to be significantly different from zero?
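
Most of the blanked values can be rebuilt from the two sums of squares and the coefficient table. The Python sketch below is one hedged way to do it. It assumes that all 500 cases entered the regression (no listwise deletion), that the final equation contains the three variables shown plus the constant, and that the missing standard error of a coefficient can be recovered as B divided by T; the 1.96 multiplier is a large-sample stand-in for the t table value.

```python
from math import sqrt

n_cases = 500                       # ASSUMPTION: every case was used
k_predictors = 3                    # NW1, TAX1, DEP1 in the final equation
ss_regression = 6325822230.66524
ss_residual = 1567902094.34276

df_regression = k_predictors
df_residual = n_cases - k_predictors - 1

r_square = ss_regression / (ss_regression + ss_residual)
adj_r_square = 1 - (1 - r_square) * (n_cases - 1) / df_residual
ms_residual = ss_residual / df_residual
f_value = (ss_regression / df_regression) / ms_residual
standard_error = sqrt(ms_residual)

# SE B is blank in the printout; since T = B / SE B, it can be backed out.
b_nw1, t_nw1 = 0.092957, 16.753
se_b_nw1 = b_nw1 / t_nw1
T_CRIT = 1.96                       # ASSUMPTION: large-sample approximation
ci_nw1 = (b_nw1 - T_CRIT * se_b_nw1, b_nw1 + T_CRIT * se_b_nw1)

print(f"R Square = {r_square:.5f}, Adjusted R Square = {adj_r_square:.5f}")
print(f"DF = ({df_regression}, {df_residual}), F = {f_value:.2f}")
print(f"Standard Error = {standard_error:.2f}")
print(f"95% CI for NW1 slope: ({ci_nw1[0]:.5f}, {ci_nw1[1]:.5f})")
```

The recomputed adjusted R Square can be checked against the .80017 that survived in the printout, which supports the assumed degrees of freedom.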


IV. An experimenter was interested in dieting and weight loss among
men and among women. She believed that in the first two weeks of a
standard dieting program, women would tend to lose more weight than men.
As a check on this notion, a random sample of 15 husband-wife pairs
were put on the same strenuous diet. Their weight losses (in pounds)
after two weeks showed the following:

Pair Husband Wife

1 5.0 2.7
2 3.3 4.4
3 4.3 3.5
4 6.1 3.7
5 2.5 5.6
6 1.9 5.1
7 3.2 3.8
8 4.1 3.5
9 4.5 5.6
10 2.7 4.2
11 7.0 6.3
12 1.5 4.4
13 3.7 3.9
14 5.2 5.1
15 1.9 3.4


1. Calculate the Pearson's product moment correlation coefficient
between weight loss of husband and wife.

2. Is the correlation coefficient significantly different from zero?



3. Do wives lose significantly more than husbands? (Use a parametric
test)
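
Because husbands and wives are paired, the comparison of weight loss is a paired t test on the differences, while the correlation uses the raw pairs. The Python sketch below computes both statistics. It assumes normally distributed differences and takes the critical values from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev

husband = [5.0, 3.3, 4.3, 6.1, 2.5, 1.9, 3.2, 4.1, 4.5, 2.7, 7.0, 1.5, 3.7, 5.2, 1.9]
wife = [2.7, 4.4, 3.5, 3.7, 5.6, 5.1, 3.8, 3.5, 5.6, 4.2, 6.3, 4.4, 3.9, 5.1, 3.4]
n = len(husband)

# Pearson correlation from corrected sums of squares and products.
mh, mw = mean(husband), mean(wife)
sxy = sum((h - mh) * (w - mw) for h, w in zip(husband, wife))
sxx = sum((h - mh) ** 2 for h in husband)
syy = sum((w - mw) ** 2 for w in wife)
r = sxy / sqrt(sxx * syy)

# t statistic for H0: correlation is zero (n - 2 df, two-sided).
t_corr = r * sqrt(n - 2) / sqrt(1 - r**2)
T_CRIT_TWO_SIDED_05_DF13 = 2.160       # table value
corr_significant = abs(t_corr) > T_CRIT_TWO_SIDED_05_DF13

# Paired t test, H1: wives lose more (mean of wife - husband > 0).
diffs = [w - h for h, w in zip(husband, wife)]
t_paired = mean(diffs) / (stdev(diffs) / sqrt(n))
T_CRIT_ONE_SIDED_05_DF14 = 1.761       # table value
wives_lose_more = t_paired > T_CRIT_ONE_SIDED_05_DF14

print(f"r = {r:.4f}, t = {t_corr:.3f}, significant: {corr_significant}")
print(f"mean difference = {mean(diffs):.4f}, paired t = {t_paired:.3f}, "
      f"wives lose more: {wives_lose_more}")
```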



Question V. A company manufactures a particular type of machine, for
which it offers a warranty period of five years. If there is a failure
within the warranty period, the company will fix it free of charge to
the customer, but for which the company incurs a cost of Rs 5,000. We
will assume that there can be at most one failure per machine during the
warranty period. Suppose a customer buys a batch of 200 machines from
this company, the company would like to develop a 95% confidence
interval for the cost that it is likely to incur to repair machines of
this customer. Analysis of past data shows that the time to the
first failure follows a chi-square distribution with a mean of thirteen
years. (For a chi-square distribution, the mean is equal to the number
of degrees of freedom).



1. Determine the mean and variance of the number of failures for the
200 machines during the warranty period for this customer.

2. Determine the 95 % confidence interval for the expected cost of
repair for this customer.




Question VI. Ocean Electric Company (OEC) is planning to build a
nuclear power plant in a particular state at a cost of rupees one
thousand crores. OEC has applied for an operating license to the Atomic
Energy Commission (AEC). Unfortunately, the power needs of this state
require that the plant be located in an area near a geological fault.
Should a severe earthquake rupture the plant, radioactive particles
would be released into the atmosphere in the form of a cloud which would
contaminate everything in its path and cause untold death and human
misery. The AEC is aware of this potential catastrophe. As a result,
it requires that atomic power plants located in earthquake zones be
constructed to withstand the most severe shaking ever recorded in that
location. To comply with this ruling, OEC engineers designed a plant
with a 5 metre thick reinforced concrete foundation and a 1 metre thick
reinforced concrete dome over each reactor. It is estimated that this
design would withstand a jolt of 7.0 on the Richter scale.

An analysis by AEC staff yielded the following information, where

R = Reading on Richter scale over the life of the plant.

D1 = Award operating license unconditionally
D2 = Award operating license conditional on additional
concrete reinforcement which is to withstand 8.0 on the
Richter scale
D3 = Do not award operating license.

---------------------------------------------------------------
R <= 7 7 < R <= 8 R > 8
---------------------------------------------------------------


D1 70.0 8,000.0 8,000.0
D2 70.1 70.1 8,000.1
D3 100.0 100.0 100.0
---------------------------------------------------------------
The numbers in this table represent the present value of the
expected loss to the consumer in thousands of crores of rupees
over the life of the plant. D2 represents higher costs than D1
because expensive construction rework must be undertaken, the cost
of which eventually gets passed on to the consumer in the form of
higher rates (This amounts to Rs 0.1 thousand crore). D3 implies
that the power plant is to be abandoned as a nuclear generating
facility, resulting in higher rates for alternative sources of
power and/or reduced levels of commercial activity due to lack of
generating power. The consequences of a ruptured power plant are
therefore Rs 7930 thousand crore.

The prior probabilities of the states of nature are estimated as
P(R <= 7) = 0.999990, P (7 < R <= 8) = 0.000009, and P( R > 8) =
0.000001.

1. Determine the optimal decision based on expected values.

2. What is the maximum that AEC will be willing to pay for any kind
of information?
Prior to making one of the decisions, the AEC has the option of
conducting a special geological survey over an area encompassing a
100 km radius centered on the plant. The survey will cost Rs one
crore to the consumers (courtesy of the AEC) and will result in a
forecast, the probabilities of which are estimated as follows:

Survey forecast
------------------------------------------------------
State of R <= 7 7 < R <= 8 R > 8
Nature
------------------------------------------------------
R <= 7 0.7 0.2 0.1
7 < R <= 8 0.3 0.5 0.2
R > 8 0.0 0.1 0.9

For example, the first row indicates that in regions where the maximum R
is 7, the survey forecasts a maximum R of 7, 70% of the time, an R
between 7 and 8, 20% of the time, and an R of greater than 8, 10% of the
time.

AEC has further decided that if the survey forecast is R <= 7, it
will not consider option D3 and if the survey forecast is R > 7,
it will not consider option D1.

3. Draw the decision tree and evaluate it.

4. Compute the EVSI and the efficiency of the survey information.

5. What should AEC do?

(Note : Beware of round-off errors!)
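
Because the probabilities are tiny and the losses are large, exact rational arithmetic is one way to sidestep the round-off problem. The Python sketch below uses `fractions.Fraction` to evaluate the prior decision, the value of perfect information, and the value of the survey. It treats the table entries as losses to be minimised, in thousands of crores, and encodes the restriction on options after each forecast as stated; the dictionary layout is an illustrative choice.

```python
from fractions import Fraction as F

states = ["R<=7", "7<R<=8", "R>8"]
prior = dict(zip(states, map(F, ["0.999990", "0.000009", "0.000001"])))

# Losses in thousands of crores, by decision and state.
loss = {
    "D1": dict(zip(states, map(F, ["70.0", "8000.0", "8000.0"]))),
    "D2": dict(zip(states, map(F, ["70.1", "70.1", "8000.1"]))),
    "D3": dict(zip(states, map(F, ["100.0", "100.0", "100.0"]))),
}

# likelihood[state][forecast] = P(forecast | state); forecasts share the state labels.
likelihood = {
    "R<=7": dict(zip(states, map(F, ["0.7", "0.2", "0.1"]))),
    "7<R<=8": dict(zip(states, map(F, ["0.3", "0.5", "0.2"]))),
    "R>8": dict(zip(states, map(F, ["0.0", "0.1", "0.9"]))),
}

# Options AEC will consider after each forecast.
allowed = {"R<=7": ["D1", "D2"], "7<R<=8": ["D2", "D3"], "R>8": ["D2", "D3"]}


def expected_loss(decision, weights):
    """Sum of weight * loss over states (weights need not sum to 1)."""
    return sum(weights[s] * loss[decision][s] for s in states)


# Prior analysis and expected value of perfect information.
prior_loss = {d: expected_loss(d, prior) for d in loss}
best_prior = min(prior_loss, key=prior_loss.get)
loss_perfect_info = sum(prior[s] * min(loss[d][s] for d in loss) for s in states)
evpi = prior_loss[best_prior] - loss_perfect_info

# Survey: for each forecast, weight states by P(state) * P(forecast | state)
# and pick the best allowed decision; summing gives the expected loss with survey.
loss_with_survey = F(0)
for forecast in states:
    joint = {s: prior[s] * likelihood[s][forecast] for s in states}
    loss_with_survey += min(expected_loss(d, joint) for d in allowed[forecast])

evsi = prior_loss[best_prior] - loss_with_survey
efficiency = evsi / evpi

print("best prior decision:", best_prior, float(prior_loss[best_prior]))
print("EVPI (thousand crore):", float(evpi))
print("EVSI (thousand crore):", float(evsi))
print("efficiency:", float(efficiency))
```

The survey cost of Rs one crore is 0.001 in the same units, so it can be compared directly with the EVSI.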
