
FINAL PRACTICE EXAM P (SOURCE: PRE AUT 07)

Last updated 22nd October 2010


Cover Type B

TO BE RETURNED AT THE END OF THE EXAMINATION.


THIS PAPER MUST NOT BE REMOVED FROM THE EXAM CENTRE.

SURNAME: __________________________________

FIRST NAME: __________________________________

STUDENT NUMBER: __________________________________

COURSE: __________________________________

__________________________________________________________________________________

FINAL EXAMINATION PRACTICE PAPER ‘P’


SOURCE: Past practice paper written Pre-Autumn 2007

SUBJECT NAME and NUMBER: BUSINESS INFORMATION ANALYSIS 26133


BUSINESS STATISTICS 26134

TIME ALLOWED: 3 Hours plus 10 minutes reading time

INSTRUCTIONS TO CANDIDATES:

• This question paper MUST NOT be removed from the Examination Centre
• This is an OPEN BOOK EXAMINATION (any materials allowed including Lecture Notes)

• Calculators (including Programmable Calculators) are allowed


• Record your student name and number carefully on the multiple choice answer sheet provided
• Answer all questions on the multiple choice answer sheet provided
• Use the exam question booklet for any working. All exam question booklets will also be collected and
MUST NOT be removed from the Examination Centre

Question P1:
The owner of Fortee Bakery is interested in determining the difference between mean purchase
amounts per customer at two locations. To estimate the difference, samples of 500 customer receipts
from each location are examined. The following sample statistics are produced:
                     Location 1   Location 2
Mean                 $5.26        $5.66
Standard Deviation   $0.89        $1.05
The owner asks you to examine if there is a significant difference in the average purchase amounts per
customer at the two locations. You determine that:
(a) Using an independent samples t-test, the mean purchase amounts at each location are
not significantly different at the 99% level of significance
(b) Using an independent samples t-test, the mean purchase amounts at each location are
significantly different at the 99% level of significance
(c) Using a paired samples t-test, the mean purchase amounts at each location are not
significantly different at the 99% level of significance
(d) Using a paired samples t-test, the mean purchase amounts at each location are
significantly different at the 99% level of significance
Answer: B = independent; sales significantly different
Justification:
Ho: μ1 – μ2 = 0; Ha: μ1 – μ2 ≠ 0
Test method: independent samples b/c only have aggregate information about each store. No way of
matching the two store sales at the individual observation level.
Standard error: sqrt(s1^2/n1 + s2^2/n2) = sqrt(.89^2/500 + 1.05^2/500) = .061556
99% confidence means that α=1%. Since two-tailed test area in upper tail is α/2 = ½% = .005.
There are n1=500 receipts and n2 = 500 receipts. Df = 500+500 – 2 = 998 d.f.
We use t-distribution since σ is unknown.
T(.005,998) = T(.005,Inf) = 2.576 (table available in exam)
Critical value = 0 +/- T*SE = +/- 2.576 * .061556 = +/- $0.158569
Evidence (point estimate): xbar1 – xbar2 = 5.26 – 5.66 = -$0.40
Test statistic = -$0.40 / SE = -0.40 / .061556 = -6.50
Since both the evidence in dollars and test-statistic are in the rejection region, reject Ho. Conclude
that the sales at the two locations are significantly different at the 99% level.
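
The calculation above can be checked with a short Python sketch. It assumes the critical value 2.576 read from the t-table quoted in the solution; the function name two_sample_test is illustrative only and is not part of the course material.

```python
import math

def two_sample_test(mean1, sd1, n1, mean2, sd2, n2, critical):
    """Independent-samples test from summary statistics (unpooled standard error)."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    stat = (mean1 - mean2) / se
    # Two-tailed test: reject Ho when the statistic is beyond +/- critical
    return se, stat, abs(stat) > critical

# Critical value 2.576 is taken from the table in the solution (assumption)
se, stat, reject = two_sample_test(5.26, 0.89, 500, 5.66, 1.05, 500, critical=2.576)
print(round(se, 6), round(stat, 2), reject)
```

The sign of the statistic is negative because Location 1 has the lower mean; only its magnitude matters for a two-tailed test.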

Question P2:
As part of an internal auditing process, a firm wishes to estimate the mean proportion of its credit card
holders having accounts that are overdue. The auditors have stipulated that the margin of
error must be no greater than 3% and they wish to use a 95% confidence interval estimate. What size
sample should they select if they have no idea the percentage of accounts that are overdue but still
wish the sample size to be large enough to meet their requirements?
(a) 17
(b) 30
(c) 204
(d) 1068
Answer: D = 1068
Justification:
n = (critical value)^2(sigma)^2 / (margin of error)^2
NOT n = (critical value)^2(standard error)^2 / (margin of error)^2
n = z^2 * p(1-p) / E^2.
Since you do not know what p is, you maximise p(1-p), so assume p=0.5
n = 1.96^2 * .5(1-.5) / .03^2 = 1067.11 (1067.07 using the unrounded z = 1.959964). Round up to 1068.
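
As a rough check of the sample size formula above, the sketch below rounds up to the next whole observation. The function name sample_size_proportion is hypothetical, and z = 1.96 is the table value used in the solution.

```python
import math

def sample_size_proportion(z, margin, p=0.5):
    """Smallest n such that z * sqrt(p(1-p)/n) <= margin."""
    # p = 0.5 maximises p(1 - p), the conservative choice when p is unknown
    return math.ceil(z**2 * p * (1 - p) / margin**2)

n_required = sample_size_proportion(z=1.96, margin=0.03)
print(n_required)
```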


Question P3:
60% of people have a broadband connection. 70% of people use Artsel One as their carrier of choice.
What is the expected probability that a person does not have a broadband connection and chooses a
carrier other than Artsel One, assuming that the choice to have broadband is independent of the choice
of carrier and vice versa?
(a) 12%
(b) 18%
(c) 42%
(d) 58%
Answer: A = 12%
Justification:
Let A = uses Artsel One and B = has broadband; A^C and B^C denote their complements.
P(A^C and B^C) = P(A^C).P(B^C) given independence.
P(A^C) = 1 – P(A) = 1 – .70 = .30 = Probability of not using Artsel One
P(B^C) = 1 – P(B) = 1 – .60 = .40 = Probability of not having broadband.
P(A^C and B^C) = .30 x .40 = .12

Question P4:
A listing of advertisements indicates that five advertisements choose to list a website first (coded as
one), twenty list a telephone number first (coded as two) and thirty list neither (coded as zero). The
variable, type of listing in advertisement, would be considered to have at best (in terms of ability to
conduct statistical analysis) which measurement properties:
(a) Nominal only
(b) Nominal and Ordinal
(c) Nominal, Ordinal and Interval
(d) Nominal, Ordinal, Interval and Ratio
Answer: A = Nominal only
Justification:
The codes only name (nominate) the type of listing. The outcomes cannot be ordered: {web first;
telephone first; neither listed} could equivalently be written {telephone first; neither listed; web
first}.

Question P5:
The financial manager of a magazine has compiled the following table from a regression analysis used
to make predictions about the number of sales (in dollars) per issue:
            Coefficients   Standard Error   t Stat     P-value
Intercept   6.97           .584             11.9287    5.22E-29
X1          2055           10637.5          0.19318    0.846893
X2          35.236         762.235          0.046227   0.963148
where X1 = the number of pages in the issue; X2 = a dummy variable taking the value 1 if a well
known celebrity is shown on the front cover and 0 otherwise. You predict the level of sales for an
issue in which there are 30 pages and a well known celebrity appears on the cover to be:
(a) $61,650
(b) $61,685
(c) $61,692
(d) More than $62,000
Answer: C = $61,692
Justification:
Sales = 6.97 + 2055(X1) + 35.236(X2)
X1 = 30; X2 = 1 (celebrity appears).
Sales = 6.97 + 2055*30 + 35.236*1
= 6.97 + 61650 + 35.236 = 6.97 + 61685.24
= 61692.21
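
The prediction above is a direct substitution into the fitted equation. A minimal sketch, with the coefficient names chosen here for illustration only:

```python
# Coefficients copied from the regression table in the question
INTERCEPT = 6.97
B_PAGES = 2055        # X1: number of pages
B_CELEBRITY = 35.236  # X2: 1 if a celebrity is on the cover, else 0

def predict_sales(pages, celebrity):
    """Predicted sales (in dollars) for one issue."""
    return INTERCEPT + B_PAGES * pages + B_CELEBRITY * celebrity

predicted = predict_sales(pages=30, celebrity=1)
print(round(predicted, 2))
```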


Question P6:
A pricing study examines the linear relationship between sales of Cracker Nuts and factors such as its
price; its competitor's prices; a dummy variable representing whether Cracker Nuts had a promotional
offer or not; and, a dummy variable representing whether its closest competitor, Sirius Nuts, had a
promotional offer or not. The data revealed that Cracker Nuts runs promotional offers only when
Sirius Nuts happened to be running a promotional offer as well. The regression output is very
peculiar. Based on this information you would suspect that the major concern is:
(a) non-linear effects
(b) multi-collinearity
(c) outliers
(d) too many variables
Answer: B = Multi-collinearity
Justification:
Since the two companies both run promotional offers at the same time, the dummy variables will be
correlated. Hence, this is multi-collinearity.

Question P7:
An Excel spreadsheet displays a variable that is coded with values representing the area in which an
online panel member is employed. The coding scheme used is:
1 = telecommunications
2 = finance and banking
3 = retailing
4 = other industry
You calculate the mode, median and mean on the data revealing the values 2, 3, and 3, respectively.
Which statement is correct:
(a) On average, panel members appear to be employed in retailing
(b) Most panel members appear to be employed in retailing
(c) 50% of members appear to be employed in retailing
(d) None of the above statements are correct
Answer: D = None of the above
Justification:
The data has nominal properties so only the mode is interpretable. The mode tells us that most people
are employed in industry category 2, which is finance and banking.

Question P8:
The weight of goods being transported by an airline for each passenger is observed. A sample of one
hundred passengers reveals an average weight of goods to be 17.7kg. The population of weights for
all passengers is known to follow a normal distribution with a standard deviation of 10kg. What is the
probability that the sample mean of weights observed is within +/- 2kg of the population mean
weight?
(a) .0456
(b) .1586
(c) .8414
(d) .9544
Answer: D = .9544
Justification:
n = 100; N = unknown (infinite); σ = 10 (population standard deviation, known)
s(xbar) = SE = σ / sqrt(n) = 10 / sqrt(100) = 10 / 10 = 1kg
mean = 17.7
P(xbar within +/- 2kg of the mean)
= P(15.7kg < xbar < 19.7kg)
= P(xbar < 19.7) – P(xbar < 15.7)
= P(z < (19.7 – 17.7)/SE) - P(z < (15.7 – 17.7)/SE)
= P(z<+2/1) – P(z<-2/1)
= P(z<+2) – [1 - P(z<+2)]
= 2*P(z<+2) – 1
= 2*.9772 – 1
= .9544

Note: depending on how they have been taught, some students may use the following:
P(xbar – j ≤ E(xbar) ≤ xbar + j) = 2*P(z ≤ j / s(xbar)) – 1; where E(xbar) = 17.7 ; j = 2kg.
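
The shortcut formula above can be sketched in Python using the standard normal distribution from the standard library (statistics.NormalDist, Python 3.8 or later). The function name prob_within is an assumption made for this example. The unrounded result, about .9545, differs from the table-based .9544 only in the fourth decimal place.

```python
import math
from statistics import NormalDist

def prob_within(j, sigma, n):
    """P(sample mean lies within +/- j of the population mean), sigma known."""
    se = sigma / math.sqrt(n)
    return 2 * NormalDist().cdf(j / se) - 1

p_within_2kg = prob_within(j=2, sigma=10, n=100)
print(round(p_within_2kg, 4))
```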

Question P9:
AGI Sales are informed from their car supplier that carbon monoxide (CO) emissions for a certain
kind of car follow a normal distribution, with a mean of 2.9g/mi. AGI Sales believe the emissions, on
average, may be significantly higher than that suggested by their supplier. Sampling ten of their
vehicles, AGI Sales finds that, on average, the emissions are 3.1g/mi with a standard deviation of
0.4g/mi. Using the sample, you test the concerns (using alpha = .05) of AGI Sales and conclude:
(a) On average, the emissions are significantly higher than the 2.9g/mi claimed
(b) On average, the emissions are not significantly higher than the 2.9g/mi claimed
(c) On average, the emissions are not significantly higher than the 3.1g/mi claimed
(d) With only ten vehicles tested, one cannot make any of the above conclusions
Answer: B = On average, mean emissions not significantly higher than 2.9g/mi
Justification:
Ho: μ ≤ 2.9g/mi (status quo)
Ha: μ > 2.9g/mi (claim)

Because the sample size is small (n=10) but the population follows a normal distribution, the test
statistic can be used and follows a t-distribution. Since the population standard deviation is
unknown, we use the sample standard deviation (s = .4g/mi) in its place.
Rejection region/critical value: t(.05, df = 10-1 = 9) = 1.833 (it is a one-tailed test)
From sample observations: xbar = 3.1g/mi.
We currently assume μ = 2.9g/mi until we find evidence to lend support for a contradictory view.

Test statistic = (xbar – μ)/(s/sqrt(n)) = (3.1 – 2.9)/(.4/sqrt(10)) = 1.581139

Since (test stat = 1.58) < (t(.05,9) = 1.833), we cannot reject the null hypothesis. The mean emissions
are not significantly higher than the 2.9g/mi claimed.
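
A short sketch of the one-tailed test above. The critical value 1.833 is copied from the t-table quoted in the solution rather than computed, and the function name one_sample_t is illustrative only.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """t statistic for a sample mean against a hypothesised mean mu0."""
    return (xbar - mu0) / (s / math.sqrt(n))

t_stat = one_sample_t(xbar=3.1, mu0=2.9, s=0.4, n=10)
t_crit = 1.833                 # t(.05, 9) from the table (assumption: not computed here)
reject_h0 = t_stat > t_crit    # upper-tail test, Ha: mu > 2.9
print(round(t_stat, 4), reject_h0)
```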

Question P10:
Past studies show that the previous mean time to prepare a home-cooked meal was 40 minutes. A new
study claims that the mean time to prepare a home-cooked meal has dropped to be significantly lower
than this amount. Suppose that a study is designed to test this by sampling 100 home-owners. What
should the null hypothesis be to test the claims made in the new study?
(a) μ ≥ 40 minutes
(b) μ > 40 minutes
(c) μ < 40 minutes
(d) μ ≤ 40 minutes
Answer: A = mu >= 40 minutes.
Justification:
One claim or hypothesis is that μ < 40 minutes (significantly lower than 40 minutes).
Another claim is the complement of this, namely μ ≥ 40 minutes. As this contains the equals, this will
become the null. That is: Ho: μ ≥ 40.


Question P11:
A production manager of Lotzafun Toys needs to estimate the average time taken to assemble products
using a new manufacturing technique. It is believed that the population standard deviation is 15
seconds. How large a sample of assembly times should be taken to estimate the mean assembly time
to within 2 seconds, with 95% confidence?
(a) 153
(b) 216
(c) 217
(d) 865
Answer: C = 217
Justification:
n = (critical value)^2(sigma)^2 / (margin of error)^2
NOT n = (critical value)^2(standard error)^2 / (margin of error)^2

1.96^2 * 15^2 / 2^2 = 216.09 Round up to 217.
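
The same rounding-up step can be sketched in Python. The function name sample_size_mean is hypothetical; z = 1.96 is the table value used above.

```python
import math

def sample_size_mean(z, sigma, margin):
    """Smallest n such that z * sigma / sqrt(n) <= margin."""
    return math.ceil((z * sigma / margin) ** 2)

n_assembly = sample_size_mean(z=1.96, sigma=15, margin=2)
print(n_assembly)
```

Rounding down to 216 would leave the margin of error slightly above 2 seconds, which is why the answer is 217.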

Question P12:
The mean, a measure of central tendency, is sensitive to outliers because it relies on which
measurement property of the data to be calculated?
(a) Integer
(b) Nominal
(c) Ordinal
(d) Quantitative
Answer: D = Quantitative
Justification:
The mean relies on the numerical value or quantitative properties of the data b/c it sums the data.

Question P13:
Eleven supermarkets introduced a promotional display for Grand Baked Beans promoting the product
on the basis of its low fat content. Another eleven stores were identified, this time promoting the
product on the basis of its energy producing benefits. These eleven pairs of supermarkets (22 in total)
were selected that were similar in terms of geographic location, size and product sales. The difference
in units sold was calculated for each pair. The sample mean difference was found to be 4200 units
with a standard deviation of 8800 units sold. Examining this evidence, (with α=.05) you can
conclude:
(a) The average number of units sold was not significantly different across the two sets
of stores
(b) The average number of units sold was significantly different across the two sets of
stores
(c) A conclusion about differences in average units sold cannot be made since we do not
know how each supermarket performed
(d) People appeared to really enjoy the promotion involving low fat content
Answer: A = no significant difference in number of units sold.
Justification:
The stores are matched in pairs, so use a paired samples t-test on the differences.
Ho: μd = 0; Ha: μd ≠ 0. Hypothesised mean difference = 0
Table critical value = t(α/2, n-1) = t(.05/2, 11-1) = t(.025,10) = 2.228 (Note, it is the number of
pairs that forms the number of observations).
Observed (in thousands of units): dbar = 4.2; sd = 8.8
Test statistic = t = (dbar – μd)/(sd/sqrt(n))
t = (4.2 – 0)/(8.8/sqrt(11)) = 1.583
Since t = 1.583 < 2.228 (t-table), we cannot reject Ho at the 95% level.
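
A minimal sketch of the paired test above, working from the summary statistics of the differences. The critical value 2.228 is copied from the t-table in the solution, and the function name paired_t is an assumption for this example.

```python
import math

def paired_t(dbar, sd, n_pairs, mu_d=0.0):
    """t statistic for the mean of paired differences."""
    return (dbar - mu_d) / (sd / math.sqrt(n_pairs))

t_paired = paired_t(dbar=4200, sd=8800, n_pairs=11)
t_table = 2.228                         # t(.025, 10) from the table (assumption)
reject_paired = abs(t_paired) > t_table # two-tailed test
print(round(t_paired, 3), reject_paired)
```

Working in units or in thousands of units gives the same statistic, because the scale cancels in the ratio.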


Question P14:
The premium amounts of all 2500 insurance payers (i.e., the population) follows a normal distribution
with a standard deviation of $15. An internal audit selects a sample of 250 premiums, the sample
revealing an average premium to be $550. What is the probability that this average premium is within
+/- $1.50 of the population mean premium amount? (Hint: consider the issue of finite correction)
(a) 9.5%
(b) 11.4%
(c) 88.6%
(d) 90.5%
Answer: D = 90.5%
Justification:
n = 250; N = 2500 (finite); σ = 15;
hence, n/N = 250 / 2500 = .1 which is NOT <= .05, hence the finite correction factor is required.

s(xbar) = [σ / sqrt(n)][sqrt(N-n) / sqrt(N-1)]

s(xbar) = [15 / sqrt(250)][sqrt(2500-250)/sqrt(2500-1)] = [.948683][.948873]= .90018 (NOT .948683)

See the Question P8 solution for the full step-by-step theoretical approach. The following formula was
provided to students in previous semesters to save time, but you do not have to learn this:
P(xbar – j ≤ E(xbar) ≤ xbar + j) = 2*P(z ≤ j / s(xbar)) – 1;
E(xbar) = 550 ; j = $1.50.
P(550– 1.5 ≤ E(xbar) ≤ 550+ 1.5) = 2*P(z ≤ 1.5 / .90018) – 1
P(548.50≤ E(xbar) ≤ 551.50) = 2*P(z ≤ 1.666333 ) – 1
= 2*.9525 – 1 = .905
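
The finite correction step can be checked with the sketch below, which uses statistics.NormalDist from the Python standard library (3.8 or later). The function name se_finite is illustrative. The unrounded probability comes out near .904; the table-based answer of .905 uses z rounded to 1.67.

```python
import math
from statistics import NormalDist

def se_finite(sigma, n, N):
    """Standard error of the mean with the finite population correction."""
    return sigma / math.sqrt(n) * math.sqrt((N - n) / (N - 1))

se_premium = se_finite(sigma=15, n=250, N=2500)
z_premium = 1.5 / se_premium
p_premium = 2 * NormalDist().cdf(z_premium) - 1
print(round(se_premium, 5), round(z_premium, 4), round(p_premium, 3))
```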

Question P15:
A real estate agent believes a regression will be useful to predict auction prices by including various
factors. The following regression output is produced:
ANOVA        df    SS         MS         F          Significance F
Regression   5     18.69241   3.738482   44.05016   4.21E-38
Residual     539   45.74425   0.084869
Total        544   64.43666

Estimates             Coefficients   Standard Error   t Stat     P-value
Intercept             60.39093       0.012512         4826.769   0
Square Metres         0.068209       0.021616         3.155526   0.001692
Distance to Schools   -0.05025       0.021086         -2.38312   0.017512
Distance to Shops     -0.23617       0.021768         -10.8495   6.11E-25
Bathrooms             0.033752       0.021957         1.537164   0.12484
Bedrooms              0.187133       0.021467         8.71703    3.53E-17
Using the F-statistic (with 95% level of significance) provided in the ANOVA table only, which
statement is correct:
(a) Both distance to shops and the number of bedrooms in a dwelling are significant in
predicting auction sale prices, better than chance.
(b) Only distance to shops is significant in predicting auction sale prices, better than chance.
(c) Only the number of bedrooms in a dwelling is significant in predicting auction sale
prices, better than chance.
(d) None of the above can be established using the statistics listed in the ANOVA table
Answer: D = none of the above are correct.
Justification:
The F-statistic in the ANOVA table only tests whether none of the coefficients differ from zero (Ho)
or at least one of them is significantly different from zero (Ha). It does not identify which one.


Question P16:
A town planner obtains a sample listing the actual heights of buildings (in metres) in the local central
business district. The town planner determines that 20% of buildings are above the actual height
approved by the town planning committee. The town planner also categorises buildings as being
within a close proximity (less than 2km) to the river running through the city. 30% of buildings fall
into the category of being in close proximity to the river. Randomly selecting one building for further
investigation, what is the observed conditional probability that the town planner selects a building that
is built too tall, given it is built within a close proximity to the river? Assume that the excessive height
of a building relative to its approved height is independent of its proximity to the river.
(a) 6%
(b) 20%
(c) 30%
(d) Unable to be determined
Answer: B = 20%
Justification:
P(Too Tall) = 0.20
P(River) = 0.30
P(TT | R) = P(TT and R) / P(R). P(TT and R) is not observed directly, but since we are told to
assume that independence holds, its value can be determined.
P(TT and R) = P(TT)*P(R) under independence.
P(TT | R ) = P(TT)*P(R) / P(R) = P(TT) = .20 = 20%
Note that under independence we’ve shown that P(TT | R) = P(TT) – this should make theoretical
sense to students: we are saying that the height of a building is not conditional upon whether it is near
the river or not.

Question P17:
A sports manufacturer tests the durability of five different soles. Durability is assessed based on wear
and tear, where higher ratings indicate greater wear and tear. The following results were reported
testing the null hypothesis that mean wear and tear for all five soles is equal.
ANOVA
Source of Variation   SS         df     MS         F         P-value    F crit
Between Groups        1020.782   4      255.1954   1.19364   0.311512   2.375494
Within Groups         533420.8   2495   213.7959

Total                 534441.6   2499


What would you conclude at the α=.05 level based on the Analysis of Variance (ANOVA) output:
(a) The soles are not significantly different in terms of mean wear and tear.
(b) The soles are significantly different in terms of mean wear and tear.
(c) The soles are not significantly different in terms of the amount of variability exhibited
in wear and tear.
(d) The soles are significantly different in terms of the amount of variability exhibited in
wear and tear.
Answer: A = The soles are not significantly different in terms of mean wear and tear.
Justification:
Ho: μ1 = μ2 = … = μ5 = μ*; Ha: At least one of the means is significantly different.
Using p-value: p-val = .311 > α = .05; hence we do not reject Ho.
Ho refers to the equality of the means not the variances.
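
The F statistic in the ANOVA output can be reproduced from the sums of squares and degrees of freedom. In this sketch the critical value is copied from the "F crit" column rather than computed, and the function name f_statistic is illustrative.

```python
def f_statistic(ss_between, df_between, ss_within, df_within):
    """One-way ANOVA F: mean square between groups over mean square within groups."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within

f_stat = f_statistic(1020.782, 4, 533420.8, 2495)
f_crit = 2.375494             # "F crit" column of the output (assumption: not computed)
reject_equal_means = f_stat > f_crit
print(round(f_stat, 5), reject_equal_means)
```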


Question P18:
The regression output below represents the perceptions of private hospitals, measured on a scale of 1
to 7, where 1=Poor and 7=Excellent. The dependent variable results in an index of performance.
Factor                       Coefficients   Standard Error   t Stat    P-value
Intercept                    2.2917         0.0428           53.5840   0.0000
Empathy of staff             0.5432         0.4756           1.1421    0.2562
Expertise of medical staff   2.6029         0.1816           14.3301   0.0000
Administrative efficiency    2.8014         0.2058           13.6139   0.0000
Trustworthiness of staff     0.7331         0.1794           4.0871    0.0001
Cleanliness                  1.4829         0.2205           6.7238    0.0000
Quality of facilities        2.4685         0.8181           3.0172    0.0032
Edyr Hospital has the following ratings: empathy (3); expertise (4); administration (2);
trustworthiness (6); cleanliness (2) and quality (5). Suppose the Edyr Hospital aims to improve itself
on perceptions regarding empathy and cleanliness, hoping to obtain ratings of 5 and 6 respectively.
What will be the impact on their overall index of performance as a result of their endeavour:
(a) The index will improve by 5 points, all else constant.
(b) The index will improve by 6 points, all else constant.
(c) The index will improve by 7 points, all else constant.
(d) The index will improve by 8 points, all else constant.
Answer: C = The index will improve by 7 points, all else constant.
Justification: Need only look at the change, as seen in the last two columns:
Factor                       Coefficients   Current   Improved   Change   Impact of change
Intercept                    2.2917         1         1          0
Empathy of staff             0.5432         3         5          2        1.0864
Expertise of medical staff   2.6029         4         4          0
Administrative efficiency    2.8014         2         2          0
Trustworthiness of staff     0.7331         6         6          0
Cleanliness                  1.4829         2         6          4        5.9316
Quality of facilities        2.4685         5         5          0
INDEX                                       39.6426   46.6606    7.018    7.018
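
Because the model is linear, the change in the index depends only on the factors that change. A minimal sketch, with dictionary keys chosen here for illustration:

```python
# Coefficients for the two factors that change (from the regression table)
coefficients = {"empathy": 0.5432, "cleanliness": 1.4829}
current = {"empathy": 3, "cleanliness": 2}
target = {"empathy": 5, "cleanliness": 6}

# Change in index = sum of coefficient * change in rating; all else constant
index_change = sum(coefficients[k] * (target[k] - current[k]) for k in coefficients)
print(round(index_change, 3))
```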

Question P19:
A charity was attempting to determine if the number of donations being made to its foundation
(dependent variable) was somehow related to the characteristics of donors.
                            Coefficients   Standard Error   t Stat    P-value
Intercept                   49.8416        5.1156           9.7431    0.0000
Age                         -0.0457        0.0882           -0.5176   0.6050
Income                      0.0010         0.0004           2.3912    0.0172
Number of Children          -0.0461        0.0886           -0.5202   0.6032
Sole Parent (dummy coded)   -0.1427        0.0915           -1.5602   0.1193
Examining this regression output above, the charity should consider that the following characteristics
are significant (at the α=.05 level) in explaining donation behaviour:
(a) None of the variables
(b) Income only
(c) All of the variables, except income
(d) All of the variables
Answer: B = Income only.
Justification:
Although income has the smallest coefficient, this is likely because of the scale on which it is
measured, which also makes its standard error small. When testing at α=.05, ONLY Income is
significant since its p-value of .0172 < .05, hence reject Ho that bi=0. All other variables are
insignificant at the α=.05 level since their p-values are all > .05, hence we cannot reject Ho.


Question P20:
An investment company, ABI, determines from its market share that 10% of investors choose ABI to
manage their portfolios. A recent industry survey reveals that 20% of people invest in mining, given
they choose ABI to manage this portfolio. The survey revealed that only 5% of people invest in
mining, given they used an alternative firm to manage their portfolio. To assess the potential attraction
of people to ABI because of its successful portfolio management in mining investments, the firm
wishes to determine the likelihood a person will choose ABI Investments, given they have chosen to
invest in mining. You determine this probability to be:
(a) 2%
(b) 4.5%
(c) 30.77%
(d) 69.23%
Answer: C = 30.77%
Justification:
NOTE: The following solution utilises Bayes theorem. Whilst one can avoid using Bayes Theorem to
come up with an answer it is useful to see its role. Students in some semesters will not be exposed to
Bayes Theorem so would not be expected to see such a difficult question on their final exams. Please
do not ask about this on the discussion board.

An investment company, ABI, determines from its market share that 10% of investors choose ABI to
manage their portfolios.
P(A) = .10 ; P(A^C) = .90
A recent industry survey reveals that 20% of people invest in mining, given they choose ABI to
manage this portfolio.
P(M | A) = .20
The survey revealed that only 5% of people invest in mining, given they used an alternative firm to
manage their portfolio.
P(M | A^C) = .05
To assess the potential attraction of people to ABI because of its successful portfolio management in
mining investments, the firm wishes to determine the likelihood a person will choose ABI
Investments, given they have chosen to invest in mining.
P(A | M) = ?
This probability is found to be:

Using Bayes Theorem …


P(A | M) = P(A).P(M|A) / [ P(A).P(M|A) + P(A^C).P(M | A^C) ]
= (.10)(.20) / [(.10)(.20) + (.90)(.05) ]
= .02 / (.02 + .045)
= .307692

Even without knowledge of Bayes Theorem, one can see that:


P(M | A) = P(M and A)/P(A); rearranging gives P(M and A) = P(M|A)*P(A) = .20*.10 = 0.02
P(M | A^C) = P(M and A^C)/P(A^C); rearranging gives P(M and A^C) = P(M|A^C)*P(A^C) = .05*.90 = 0.045
P(M) = P(M and A) + P(M and A^C) = .02 + .045 = 0.065
Finally, P(A|M) = P(A and M) / P(M) = 0.02 / 0.065 = .307692
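
The Bayes calculation above can be sketched as a small function. The name posterior and its parameter names are assumptions made for this example.

```python
def posterior(prior, p_given_event, p_given_complement):
    """P(A | M) from P(A), P(M | A) and P(M | not A)."""
    joint_a = prior * p_given_event                  # P(M and A)
    joint_not_a = (1 - prior) * p_given_complement   # P(M and not A)
    return joint_a / (joint_a + joint_not_a)         # divide by P(M)

p_abi_given_mining = posterior(0.10, 0.20, 0.05)
print(round(p_abi_given_mining, 4))
```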


Question P21:
A survey reveals a breakdown for the number of visits households had made to a National Park in the
previous year:
Number of Visits # Respondents
None 800
1 only 100
2 only 70
3 only 30
4 only 0
Ignoring outcomes relating to those who visit five times a year or more, the average (expected)
number of visits to a National Park that a visitor will make will be?
(a) No visits per year
(b) 0.14 visits per year
(c) 0.33 visits per year
(d) 1 visit per year
Answer: C = .33 visits per year.
Justification:
Looking only at the outcomes none, 1, 2 and 3 covers 1000 respondents. Hence, the probabilities are:
X p(x) x.p(x)
0 .8 0
1 .1 .1
2 .07 .14
3 .03 .09
Total sum x.p(x) = 0 + .1 + .14 + .09 = .33
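
The expected value above is a probability-weighted sum, which can be sketched directly from the respondent counts:

```python
# Number of visits -> number of respondents (from the survey table)
counts = {0: 800, 1: 100, 2: 70, 3: 30}

total = sum(counts.values())
# E(X) = sum of x * p(x), with p(x) = count / total
expected_visits = sum(x * c for x, c in counts.items()) / total
print(total, expected_visits)
```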

Question P22:
A sub-set of responses from an observational survey of people at a supermarket reveals that each
person spent the following amount of time (in seconds) waiting at the check-out.
70; 120; 20; 100; 80; 40; 30; 220
The average and standard deviation of check-out waiting time for this sample is:
(a) 85 seconds and 60.4 seconds, respectively.
(b) 85 seconds and 64.6 seconds, respectively.
(c) 85 seconds and 4171.4 seconds, respectively.
(d) 85 seconds and 3650 seconds, respectively.
Answer: B = 64.6
Justification:
Must be sample mean and sample standard deviation (marked **)
N         8
Average   85 **
Stdev     64.58659746 **
Var       4171.428571
Stdevp    60.41522987
Varp      3650
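
The difference between the sample and population formulas can be seen with the Python standard library. Here statistics.stdev divides by n-1 (sample) and statistics.pstdev divides by n (population).

```python
import statistics

waits = [70, 120, 20, 100, 80, 40, 30, 220]  # check-out waiting times in seconds

mean_wait = statistics.mean(waits)
sample_sd = statistics.stdev(waits)       # divides by n - 1
population_sd = statistics.pstdev(waits)  # divides by n
print(mean_wait, round(sample_sd, 1), round(population_sd, 1))
```

Options (c) and (d) in the question quote the two variances rather than the standard deviations.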


Question P23:
Based on historical attendance records, an assessment of the number of people attending an IKG
business seminar is normally distributed with a mean of 50 people and standard deviation of 10
people. Caterer rates increase substantially when numbers exceed 60 people. What is the probability
that this will occur?
(a) 16%
(b) 20%
(c) 34%
(d) 84%
Answer: A = 16%
Justification:
P(x>60) = P(z > (60-50)/10) = P(z>10/10) = P(z>1)
= 1 – P(z<=1) = 1 - .8413
= .1587
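
A sketch of the same tail probability without z-tables, using statistics.NormalDist from the Python standard library (3.8 or later):

```python
from statistics import NormalDist

attendance = NormalDist(mu=50, sigma=10)

# P(X > 60) is the upper tail beyond 60 people
p_exceeds_60 = 1 - attendance.cdf(60)
print(round(p_exceeds_60, 4))
```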

Question P24:
A listing of stock reveals that twenty stereos were sold last month while a total of fifty DVD players
were sold. Using last month's stock listings, the probability that a stereo will be sold this upcoming
month is closest to:
(a) 29%
(b) 40%
(c) 60%
(d) 71%
Answer: A = 29%
Justification:
20 stereos + 50 DVD players = 70 stock items.
Hence, P (stereo ) = 20 / 70 = .285714.

Question P25:
You have asked your administrative assistant, Cindy, to run a regression with your weekly
departmental expenditure for several years against various tasks that have been completed but with no
direct expenditure amount. Cindy runs regression 1 and reports the first part of the results in the table
below. Cindy runs a second regression in which she included some more variables that she had
initially forgotten to include.
                    Regression 1   Regression 2
Multiple R          0.965577       0.96543
R Square            0.932338       0.932054
Adjusted R Square   0.931925       0.931778
Standard Error      0.286021       0.286329
Observations        495            495
You are immediately suspicious of Cindy’s report because:
(a) Multiple R-squared went down
(b) The adjusted R-squared went down
(c) Both (a) and (b)
(d) The number of observations did not change
(e) Cindy is a psychopath and shouldn’t be trusted
Answer: A = Multiple R-squared went down.
Justification:
If you add additional items to a regression the R-squared should go up not down.
The adjusted R-squared can either go up or down.


Question P26:
Lisa is a University researcher examining the impact of several variables and their effects on average
prices in consumer markets. She is interested in predicting the movement in general inflation
indicators such as the cost of a loaf of bread or a litre of milk. Lisa hypothesises that average prices
are determined by oil prices: higher oil prices will drive average prices up. Also, she believes that
interest rate rises will see average prices fall in most consumer markets. She argues that any increase
in importation taxes will see the supply of produce decrease and so prices impacted by such taxation
increases will skyrocket. Lisa formalises her theory into a regression framework, examining the
following variables: X1 = oil, cost per barrel; X2 = interest rates; X3 = importation taxes.

If Y is the average price in the consumer market Lisa is examining and she estimates the following
regression function: Y = b0 + b1X1 + b2X2 + b3X3 + error, then the expected signs of b1, b2 and b3
respectively are:
(a) positive, positive, and positive
(b) positive, negative, and positive.
(c) positive, negative and negative.
(d) negative, negative and positive.
Answer: B = positive, negative, positive.
Justification:
higher [oil] prices will drive average prices up: hence, b1 is positive.
interest rate rises will see average prices fall. Hence, negative effect. b2 should be negative.
As importation taxes increase, the supply of goods will decrease, forcing prices up. Hence, b3 is
positive. Answer in summary is positive, negative and positive.

Question P27:
A manager wishes to quickly assess the percentage of employees who take up to a certain amount of
time to travel to work. A survey asks employees how long they have taken that day and their answers
are recorded in minutes. Which method would be the most appropriate summary technique to describe
the data and achieve the manager's objectives?
(a) Correlation Coefficient
(b) Frequency Histogram
(c) Ogive
(d) Scatter-plot
Answer: C = Ogive
Justification:
Correlation coefficient – numerical measure; examines the relationship between TWO QUANTITATIVE variables.
Frequency histogram – a chart of the frequency of data in each class; it does not show cumulative percentages.
Ogive – shows the cumulative percentage of observations up to a given level of x.
Scatterplot – graphical measure; examines the relationship between TWO QUANTITATIVE variables.

Question P28:
Brett, a small business owner of Drywall Plumbing has been told that his investment funds will take
between 5 and 10 days to transfer to his day-to-day account. He has a bill due in 7 days. What is the
probability that the funds will be available in time to pay the bill?
(a) 30%
(b) 40%
(c) 60%
(d) 70%
Answer: B = 40%
Justification:
Using uniform since we have no other information
a = 5; b=10; 1 / (b-a) = 1 / (10-5) = 1/5 = .2
P(x<=7) = (7-5)*.2 = 2*.2 = .4
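
The uniform calculation above can be sketched as a cumulative distribution function. The name uniform_cdf is illustrative only.

```python
def uniform_cdf(x, a, b):
    """P(X <= x) for X uniformly distributed between a and b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Transfer takes between 5 and 10 days; the bill is due in 7 days
p_in_time = uniform_cdf(7, 5, 10)
print(p_in_time)
```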


Question P29:
Suppose the weight gain over a 12 month period that is induced as a side-effect of a new drug for
treating patients is normally distributed. To estimate the mean weight gain, a sample of 51 patients
was drawn and the sample mean was found to be 25 kilograms gained over 12 months, with a
standard deviation of 6kg determined from the same sample. Constructing a 90% confidence interval
estimate of the mean 12-month weight gain for all patients, the upper limit of your confidence interval
will be approximately.
(a) 24 kg
(b) 26 kg
(c) 31 kg
(d) 37 kg
Answer: B = 26 kg
Justification:
Confidence Interval = point estimate +/- (critical value)(standard error)
where point estimate = sample mean = 25 kg.
critical value = sourced from t-distribution as σ unknown: 90% confidence implies 0.05 in the upper tail area, with
n-1 = 51-1 = 50 df, giving t = 1.676 (based on t-tables so some element of inaccuracy).
standard error = standard error of sample mean = s / sqrt(n) = 6 / sqrt(51) = 6/7.1414 = .840168

Hence 90% confidence interval is:


25 +/- (1.676)(.840168) = 25 +/- 1.408122 = Approx 23.59 and 26.41 kg. With 26.41 the upper limit.
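
A minimal sketch of the interval arithmetic above. The critical value 1.676 is taken from the justification as given (it is typed in, not computed), and the variable names are illustrative only.

```python
import math

n = 51          # sample size
xbar = 25.0     # sample mean weight gain (kg)
s = 6.0         # sample standard deviation (kg)
t_crit = 1.676  # assumed: t value for 0.05 upper tail, 50 df, as quoted in the text

se = s / math.sqrt(n)   # standard error of the sample mean
margin = t_crit * se    # half-width of the interval
lower, upper = xbar - margin, xbar + margin

print(f"SE = {se:.6f}")
print(f"90% CI = ({lower:.2f}, {upper:.2f}) kg")
```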

Question P30:
Karen found that she could predict whether a car was speeding from the make and model of the
vehicle, although not without some error. She over predicts on Thursday the number of cars speeding
by 2 cars. On Friday she under predicts the number of speeding cars by 4 cars. Using the philosophy
of regression, which day has been associated with more error in making predictions?
(a) Thursday
(b) Friday
(c) It cannot be determined because we don't know how many cars she predicted would
be speeding.
(d) It cannot be determined because we don't know how many cars she observed were
actually speeding.
(e) Karen is always speeding and should slow down
Answer: B = Friday
Justification:
Friday – it doesn't matter whether we are over- or under-predicting … what matters is by how much,
which regression measures by the squared residual (squared error): 2^2 = 4 on Thursday versus 4^2 = 16 on Friday.
We don't need to know the observed or the predicted values to work this out.
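
The comparison can be sketched as follows, assuming the usual convention residual = observed - predicted, so an over-prediction of 2 gives -2 and an under-prediction of 4 gives +4. Squaring removes the sign, which is why only the size of the miss matters.

```python
# Residual = observed - predicted (sign convention assumed for illustration).
residuals = {"Thursday": -2, "Friday": 4}

# Regression penalises each miss by its square, so the sign drops out.
squared_errors = {day: r ** 2 for day, r in residuals.items()}
worst_day = max(squared_errors, key=squared_errors.get)

print(squared_errors)
print("More error on:", worst_day)
```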

Question P31:
A sample consisting of fifty observations is taken to examine the quality of a production line. The
population of quality is known to follow a normal distribution. The sample reveals a mean quality
rating of 95 and standard deviation of 2. What is the standard error of the sample mean quality rating?
(a) 0.04
(b) 0.28
(c) 2.00
(d) 7.07
Answer: B = .2828
Justification:
n = 50; s = 2; N = infinite (unknown)
Hence, s(xbar) = s / sqrt(n) = 2 / sqrt(50) = 2 / 7.071068 = .2828


Question P32:
A solicitor wishes to assess the likely interest in running a series of seminars on property law in the
upcoming month for the set of executive clientele listed who used the solicitor's services in the last
year. The solicitor sends an email to a random selection of thirty executive clients to gauge interest in
the potential seminar series. Each email has a response asking them if they would come to the seminar
series if organised. This is an example of a:
(a) Cross-sectional survey using the population
(b) Cross-sectional survey using a sample
(c) Time-series survey using the population
(d) Time-series survey using a sample
Answer: B = Cross-sectional survey using a sample
Justification:
Cross-sectional … while the seminars may be a series, there is only a need to run a one off survey.
Only a random selection of clients was used: hence this represents a sample

Question P33:
A concreting company wishes to compare additives used in “batches” of concreting. In particular, the
company wishes to assess how long each additive sets, on average, and which additive “is best”.
There are two sets of machinery each running different batches, one with an old style additive and one
with a new style additive. The old style additive is used in 140 batches and reveals that, on average,
concrete would set in 17.2 hours with a standard deviation of 2.5 hours. Using a new style additive in
140 batches also, the concrete takes hold with an average and standard deviation setting time of 15.9
hours and 1.8 hours, respectively. Using only this information given, a researcher now must advise
which additive to use. The researcher is best advised to consider testing:
(a) the difference between means of two populations, and assume independent samples
(b) the difference between means of two populations using matched samples approach.
(c) the difference between two population proportions using independent samples
(d) the difference between two population proportions using a matched samples approach
Answer: A = the difference between means of two populations, assume independent samples
Justification:
Comparing average setting time – continuous variable – looking at MEAN.
Cannot create “pairs” of observations even though both samples have the same number of observations, as there is no clear
information on how one would match them. Hence, forced to use independent samples.

Question P34:
A student researching attitudes of residents to a new building proposal for a local shopping centre
decides to visit the existing shopping centre. The researcher stops people at random as they walk
through the centre. The sampling method being used is best described as:
(a) cluster sampling
(b) convenience sampling
(c) random sampling
(d) justified sampling
Answer: B = Convenience Sampling
Justification:
Please note that in previous semesters students have been assessed on this topic. Your lecture notes
will guide you as to whether this topic is assessable or not, so please do not ask on the UTSOnline
discussion board if it is or is not in a given semester.
The sample is NOT random b/c we do not have a list of residents from which to draw randomly.
The student is simply obtaining responses from the residents who may be most conveniently located
in the vicinity.
It is not a cluster sample b/c this is a type of random sampling.
There is no such thing as "justified sampling".


Question P35:
A clothing manufacturer has had some bad experiences in the past by taking some actions at times
that deviated from what they have normally done. For instance, they decided to print t-shirts using a
new dye given that the rate of defects from several experiments appeared to be significantly lower
than defect rates using previous dyes. Upon implementing the new dye system, it turned out to be a
major mistake. Looking at their future decisions to deviate from current strategies, the company
should consider:
(a) maximising critical values
(b) minimising statistical risk
(c) minimising Type I errors
(d) minimising Type II errors
Answer: C = minimising Type I errors.
Justification:
The company wants to decrease the likelihood that they make a decision to reject the null hypothesis
(status quo) given that the state of nature is that the null hypothesis is true. For instance, concluding
Ha: mu<current defect rate when H0: mu>=current defect rate is actually true. That is, minimise prob Type I error.
Minimising Type II is reducing the likelihood you make a decision to NOT reject the null hypothesis
even though the null hypothesis is false. That is, you are more likely to stay with the Status Quo.

Question P36:
A furniture manufacturer receives instructions that the commercial panels it produces must conform
so as to be made to an average thickness of .75 centimetres. Each hour, 50 panels are selected at
random and precisely measured. After 20 hours, a total of 1000 panels have been measured. With
1000 panels, the thickness averages 0.753cm with a standard deviation of .034cm. Based on the
sample data, what should the company conclude about its product meeting the thickness
specification?
(a) The average thickness is not significantly different to the desired level
(b) The average thickness is significantly different to the desired level
(c) The sample mean is not normal so we cannot make any conclusion about meeting
standards, but would instead use a binomial distribution
(d) One would need to observe the entire population of panels to be confident that production
meets the desired standard
Answer: B = The average thickness is significantly different to the desired level.
Justification:
Adapted from Groebner p325.
Ho: μ =.75cm (status quo) - two tailed
Ha: μ ≠ .75cm (claim)
α is not given so assume α=.05;
σ is unknown so estimate it with s and use t-distribution with n-1 df.
Sample size is large (n=1000>30).

Xbar = 0.753; sx = 0.034; n=1000; SE = sx/sqrt(n)

Critical value = t with 0.025 upper tail and 999 df ≈ 1.96 (effectively the ∞ row of the t-table)

Critical values in cm = μ +/- t*SE = .75 +/- 1.96*.034/sqrt(1000) = .747893 and .752107cm

Evidence: xbar = 0.753cm lies above .752107cm, i.e. in the rejection region.


Confirming via test-statistic:
Test stat = (0.753 – 0.75)/(0.034/sqrt(1000)) = 2.790245, which again lies in the rejection region when compared to the
critical value of 1.96

We reject Ho and adopt view suggested by alternative: hence, the average thickness is significantly
different from the desired level of .75cm.
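
The two-tailed test above can be reproduced with a short sketch. It assumes α = .05 as in the justification, and it uses the standard normal critical value as a stand-in for t with 999 df (the two agree to about two decimal places); variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0 = 0.75     # hypothesised mean thickness (cm)
xbar = 0.753   # sample mean (cm)
s = 0.034      # sample standard deviation (cm)
n = 1000       # number of panels measured
alpha = 0.05   # assumed significance level, as in the justification

se = s / math.sqrt(n)
test_stat = (xbar - mu0) / se

# Approximation: with 999 df the t critical value is very close to the z value.
crit = NormalDist().inv_cdf(1 - alpha / 2)

# Critical values expressed in centimetres.
lower_cut, upper_cut = mu0 - crit * se, mu0 + crit * se
reject_h0 = abs(test_stat) > crit

print(f"test statistic = {test_stat:.4f}, critical value = {crit:.2f}")
print(f"rejection region: below {lower_cut:.6f} or above {upper_cut:.6f} cm")
print("Reject Ho" if reject_h0 else "Do not reject Ho")
```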


Question P37:
The owner of Wild Club and Grill, a nightclub, wishes to construct a confidence interval estimate for
the mean number of customers coming into the club. The owner monitors numbers on several
randomly selected evenings. The width of the confidence interval estimate will be narrower:
(a) if one decreases the number of evenings that are monitored;
(b) if the mean number of visitor numbers decreases;
(c) if the mean number of visitor numbers increases;
(d) if the variation in visitor numbers over all evenings decreases;
Answer: D = if the actual variation in visitor numbers over all evenings decreases.
Justification:
Confidence Interval = point estimate +/- (critical value)(standard error)
In this example, CI = sample mean visitors +/- (za/2 or t dist value)(s or sigma/ sqrt(n))
Statement a = if one decreases the number of evenings that are monitored;
This implies that n decreases … this will increase the standard error, hence increasing the CI length.
Statement b and c = if the actual mean number of visitor numbers decreases or increases;
This implies the point estimate changes, but not necessarily that the standard error changes. Hence the
length of the CI is unaffected.
Statement d = If the variation decreases, implies s or sigma decreasing (even though we may or may
not be observing it). Hence, this decreases the standard error and hence, length of the CI decreases.
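
A rough sketch of why only n and the variation affect the width. All numbers below are hypothetical (the question gives none), and the critical value is held fixed at 1.96 for simplicity; in practice a t value would also grow slightly as n falls, which reinforces the widening.

```python
import math


def ci_width(s: float, n: int, crit: float = 1.96) -> float:
    """Full width of a CI for the mean: 2 * crit * s / sqrt(n).

    The sample mean does not appear in the formula, so changing it
    shifts the interval without changing its width.
    """
    return 2 * crit * s / math.sqrt(n)


# Hypothetical baseline: s = 20 visitors, n = 36 evenings.
base = ci_width(s=20, n=36)
fewer_evenings = ci_width(s=20, n=16)   # statement (a): n decreases
less_variation = ci_width(s=10, n=36)   # statement (d): variation decreases

print(f"baseline width       = {base:.2f}")
print(f"fewer evenings width = {fewer_evenings:.2f}")
print(f"less variation width = {less_variation:.2f}")
```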

Question P38:
Out of 1000 patents filed, 400 patents were filed with an accelerated request being made. Assuming
each patent filing is independent of the others, a random selection of six is made for a complete audit
of the patent office's decision. What is the probability that exactly two of those selected for the audit will
have been filed with an accelerated request?
(a) 4.6%
(b) 18.66%
(c) 31.1%
(d) 54.36%
Answer: C = 31.1%
Justification:
Using BINOMIAL since independent trials. There are n=6 selections/trials.
p = 400/1000 = .40
P(x=2) = 6C2.(.4^2)(.6^4) = .31104
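
The binomial calculation above, as a minimal sketch using only the standard library. The helper name binomial_pmf is hypothetical.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p): nCk * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


p = 400 / 1000                   # proportion filed with an accelerated request
p_two = binomial_pmf(2, n=6, p=p)
print(f"P(x = 2) = {p_two:.5f}")
```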

Question P39:
The following table summarises workers' roles within a particular firm with 50 employees:
Department    Frequency (Number of Employees)    Cumulative Frequency
Management    10                                 10
Production    10                                 20
Marketing     15                                 35
Accounting    15                                 50
Which statement is correct?
(a) 30% of employees are employed in production
(b) 30% of employees are employed in production or less
(c) 30% of employees are employed in marketing
(d) 30% of employees are employed in marketing or less
Answer: C = 30% of employees are employed in marketing
Justification:
Total freq = (10+10+15+15) = 50
There are 10 / 50 = 20% of employees in production.
There are 15 / 50 = 30% of employees in marketing.
Students should recognise that the cumulative frequency is meaningless here because departments are categories with no natural order. Hence, the “or less” statements are incorrect.
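
The relative frequencies can be checked with a short sketch built from the table in the question; the dictionary layout is just one convenient way to hold the counts.

```python
# Employee counts by department, copied from the question's table.
counts = {
    "Management": 10,
    "Production": 10,
    "Marketing": 15,
    "Accounting": 15,
}

total = sum(counts.values())
# Relative frequency = department count / total employees.
shares = {dept: count / total for dept, count in counts.items()}

for dept, share in shares.items():
    print(f"{dept:<11} {share:.0%}")
```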


Question P40:
A new technology for cars requires that a specially developed vehicle will require a special charging
station, similar to the concept of a petrol station. The current proposed number of charging stations for
a 50km journey follows a Poisson distribution, with only one station encountered, on average. The
new vehicle has a maximum range of 200km. What is the probability at least one station will be
encountered within the maximum range journey?
(a) 2%
(b) 63%
(c) 47%
(d) 98%
Answer: D = 98%
Justification:
Poisson distribution has average mu = 1 station / 50 km
mu = 4 stations / 200 km (multiplying both by 4)

We desire at least one i.e., P(x>=1) = 1 – P(x=0).

P(x) = (mu^x)(exp(-mu))/(x!)
P(x=0) = (4^0)(exp(-4))/(0!)
P(x=0) = 1*exp(-4)/1 = exp(-4) = .018316

P(x>=1) = 1 - .018316 = .981684
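
The Poisson steps above can be sketched as follows. The helper name poisson_pmf is hypothetical; the rescaling of the mean from 50 km to 200 km follows the justification.

```python
import math


def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for X ~ Poisson(mu): mu^x * exp(-mu) / x!."""
    return mu ** x * math.exp(-mu) / math.factorial(x)


mu_per_50km = 1.0
mu_200km = mu_per_50km * (200 / 50)   # rescale the mean to the 200 km range

p_none = poisson_pmf(0, mu_200km)     # no station encountered in 200 km
p_at_least_one = 1 - p_none           # complement rule

print(f"P(x = 0)  = {p_none:.6f}")
print(f"P(x >= 1) = {p_at_least_one:.6f}")
```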

THIS COMPLETES ALL THE QUESTIONS IN YOUR FINAL EXAM

CONGRATULATIONS ☺


Cumulative Probabilities for the Standard Normal Distribution

Entries in the table give the area under the normal probability distribution to the left of the z
value. For example, for z=1.25 the cumulative probability is .8944.

z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879

0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389

1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319

1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767

2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936

2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986

3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
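
Individual table entries can be reproduced with Python's standard library, as in the sketch below. It assumes Python 3.8 or later for statistics.NormalDist; the helper name table_entry is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def table_entry(z: float) -> float:
    """Cumulative probability to the left of z, rounded to 4 places like the table."""
    return round(std_normal.cdf(z), 4)


# The worked example quoted above the table: z = 1.25.
example = table_entry(1.25)
print(f"P(Z <= 1.25) = {example:.4f}")
```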


t Distribution

Entries in the table give t values for an area or probability in the upper tail of the t
distribution. With 10 degrees of freedom and .05 area in the upper tail, t.05 = 1.812.

Area in Upper Tail

Degrees of Freedom 0.10 0.05 0.025 0.01 0.005

1 3.078 6.314 12.706 31.821 63.656


2 1.886 2.920 4.303 6.965 9.925
3 1.638 2.353 3.182 4.541 5.841
4 1.533 2.132 2.776 3.747 4.604

5 1.476 2.015 2.571 3.365 4.032


6 1.440 1.943 2.447 3.143 3.707
7 1.415 1.895 2.365 2.998 3.499
8 1.397 1.860 2.306 2.896 3.355
9 1.383 1.833 2.262 2.821 3.250

10 1.372 1.812 2.228 2.764 3.169


11 1.363 1.796 2.201 2.718 3.106
12 1.356 1.782 2.179 2.681 3.055
13 1.350 1.771 2.160 2.650 3.012
14 1.345 1.761 2.145 2.624 2.977

15 1.341 1.753 2.131 2.602 2.947


16 1.337 1.746 2.120 2.583 2.921
17 1.333 1.740 2.110 2.567 2.898
18 1.330 1.734 2.101 2.552 2.878
19 1.328 1.729 2.093 2.539 2.861

20 1.325 1.725 2.086 2.528 2.845


21 1.323 1.721 2.080 2.518 2.831
22 1.321 1.717 2.074 2.508 2.819
23 1.319 1.714 2.069 2.500 2.807
24 1.318 1.711 2.064 2.492 2.797

25 1.316 1.708 2.060 2.485 2.787


26 1.315 1.706 2.056 2.479 2.779
27 1.314 1.703 2.052 2.473 2.771
28 1.313 1.701 2.048 2.467 2.763
29 1.311 1.699 2.045 2.462 2.756

30 1.310 1.697 2.042 2.457 2.750


40 1.303 1.684 2.021 2.423 2.704
60 1.296 1.671 2.000 2.390 2.660
120 1.289 1.658 1.980 2.358 2.617
∞ 1.282 1.645 1.960 2.326 2.576
