Statistics Questions

Final Mark Scheme
2616/01
June 2004
General Instructions
Some marks in the mark scheme are explicitly designated as M, A, B or E.
M marks (method) are for an attempt to use a correct method (not merely for stating the
method).
A marks (accuracy) are for accurate answers and can only be earned if corresponding M
mark(s) have been earned. Candidates are expected to give answers to a sensible level of
accuracy in the context of the problem in hand. The level of accuracy quoted in the mark
scheme will sometimes deliberately be greater than is required, when this facilitates marking.
B marks (explanation) are for explanation and/or interpretation. These will frequently be
subdividable depending on the thoroughness of the candidate's answer.
Follow-through marking should normally be used wherever possible; there will,
however, be an occasional designation of 'c.a.o.' for 'correct answer only'.
Full credit MUST be given when correct alternative methods of solution are used. If errors
occur in such methods, the marks awarded should correspond as nearly as possible to
equivalent work using the method in the mark scheme.
All queries about the marking should have been resolved at the standardising meeting.
Assistant Examiners should telephone the Principal Examiner (or Team Leader if
appropriate) if further queries arise during the marking.
Assistant Examiners may find it helpful to use shorthand symbols as follows:
FT   Follow-through marking
     Correct work after error
     Incorrect work after error
     Condonation of a minor slip
BOD  Benefit of doubt
NOS  Not on scheme (to be used sparingly)
     Work of no value
Q1
$X \sim N(\mu, \sigma^2)$, $Y \sim N(2\mu, 4\sigma^2)$; $T = aX + bY$

(i)
We want $\mu = E[aX + bY]$  M1
$= a\mu + b \cdot 2\mu$  1
$\therefore 2b = 1 - a$, i.e. $b = \tfrac{1}{2}(1 - a)$
Then $\mathrm{Var}(T) = a^2\sigma^2 + \left[\tfrac{1}{2}(1 - a)\right]^2 (4\sigma^2)$  M1 Substitution of $b = \tfrac{1}{2}(1 - a)$ reqd
$= \sigma^2\{a^2 + (1 - a)^2\}$
$= \sigma^2\{2a^2 - 2a + 1\}$  1 Beware printed answer

(ii)
Consider $\dfrac{d}{da}(2a^2 - 2a + 1) = 0$  M1
i.e. $0 = 4a - 2$, $\therefore a = \tfrac{1}{2}$  1 Beware printed answer
Verification that this is a minimum (e.g. trivially by $\dfrac{d^2}{da^2}$)
$T = \tfrac{1}{2}X + \tfrac{1}{4}Y \sim N(\mu, \tfrac{1}{2}\sigma^2)$  2 if all three items are correct; award 1 if any two are correct
[Both $X$ and $\tfrac{1}{2}Y$ are unbiased for $\mu$ and both are Normally distributed, all of which
is also true for $T$; but] $T$ has smaller variance ($\mathrm{Var}(X) = \sigma^2$, $\mathrm{Var}(\tfrac{1}{2}Y) = \sigma^2$)  E2

(iii)
$\bar{t} = 7.48$  B1 FT if wrong
One-sided CI is given by
$7.48 - 1.645 \times \sqrt{\dfrac{\tfrac{1}{2} \times 3}{10}}$  M1 M1 B1 M1; M1 (use of $\tfrac{1}{2}\sigma^2$ as Var(T))
$= 7.48 - 0.63(71)$
$= 6.84(29)$  A1 C.A.O.
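The arithmetic of the lower confidence bound in part (iii) can be checked with a short sketch. It assumes $\sigma^2 = 3$ and a sample of 10 values of $T$, as the working appears to imply; the function name `lower_confidence_bound` is illustrative only and not part of the mark scheme.

```python
import math

def lower_confidence_bound(mean, variance, n, z):
    """Lower bound of a one-sided CI for a mean when the variance is known."""
    standard_error = math.sqrt(variance / n)
    return mean - z * standard_error

sigma_sq = 3.0           # assumption: sigma^2 = 3, as the working suggests
var_t = 0.5 * sigma_sq   # Var(T) = sigma^2 / 2 when a = 1/2
bound = lower_confidence_bound(7.48, var_t, 10, 1.645)
print(round(bound, 4))
```

The 1.645 is the upper 5% point of N(0, 1) quoted in the scheme; the Normal distribution is used because the variance is known.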
Q2
      A   237  249  213  233  227  236
      B   203  222  214  216  230

(i)
Wilcoxon rank sum test (or Mann-Whitney form thereof).
Ranks are
      A    10   11    2    8    6    9
      B     1    5    3    4    7
M1 for attempt
A1 if all correct
Rank sum is 20 (from B, otherwise the tables can't be used)
(Mann-Whitney is 5) 1
Refer to tables of Wilcoxon rank sum (or Mann-Whitney) statistics.
Lower 2½% tail is needed. 1
Value for (5, 6) is 18 (or 3 for Mann-Whitney).
Result is not significant. 1
Seems medians are the same. 1
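As a rough check of the rank sum in part (i), the following sketch ranks the pooled data and sums the ranks for sample B. The helper `rank_sum` is hypothetical and assumes no tied values, which holds for these data; the critical value 18 is simply the tabulated figure quoted in the scheme.

```python
def rank_sum(sample, other):
    """Wilcoxon rank sum of `sample` within the pooled data (no ties assumed)."""
    pooled = sorted(sample + other)
    return sum(pooled.index(value) + 1 for value in sample)

a = [237, 249, 213, 233, 227, 236]
b = [203, 222, 214, 216, 230]

w_b = rank_sum(b, a)                     # rank sum for the smaller sample
u_b = w_b - len(b) * (len(b) + 1) // 2   # equivalent Mann-Whitney statistic
critical = 18                            # lower 2.5% point for sizes (5, 6), from the scheme
significant = w_b <= critical
print(w_b, u_b, significant)
```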
(ii)
Normality of both underlying populations/distributions.

$n_1 = 6$, $\bar{x} = 232.5$, $s_{n-1}^2 = 143.1$ ($s_{n-1} = 11.9624$) [$s_n^2 = 119.25$, $s_n = 10.9202$]
$n_2 = 5$, $\bar{y} = 217.0$, $s_{n-1}^2 = 100.0$ ($s_{n-1} = 10.0$) [$s_n^2 = 80.0$, $s_n = 8.9443$]
Pooled $s^2 = \dfrac{5 \times 143.1 + 4 \times 100.0}{9} = 123.94$ ($s = 11.1330$)
M1 for any reasonable attempt at pooling (and FT into test)
A1 if correct
Test statistic is
$\dfrac{232.5 - 217.0\ (-0)}{\sqrt{123.94\left(\tfrac{1}{6} + \tfrac{1}{5}\right)}} = \dfrac{15.5}{6.7414} = 2.29(92)$  M1; A1 FT reasonable attempt
Refer to $t_9$. 1 May be awarded even if test statistic is wrong. No FT if wrong.
Double-tailed 5% point is 2.262. 1 No FT if wrong
Significant, seems means differ. 1
(iii)
If the assumptions for the t procedure are satisfied, it is better (more
sensitive/powerful), E2
but if not it might be seriously misleading and the non-par procedure safer. E2
4
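The pooled two-sample t calculation in part (ii) can be reproduced from the raw data with the standard library. This is a minimal sketch; the function name `pooled_t` is an illustrative choice, and the critical value 2.262 is the one quoted in the scheme.

```python
import math
import statistics

def pooled_t(x, y):
    """Return (pooled variance, t statistic) for two independent samples."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, (statistics.mean(x) - statistics.mean(y)) / standard_error

a = [237, 249, 213, 233, 227, 236]
b = [203, 222, 214, 216, 230]
pooled_var, t_stat = pooled_t(a, b)
significant = abs(t_stat) > 2.262   # two-tailed 5% point of t with 9 df
print(round(pooled_var, 2), round(t_stat, 4), significant)
```

On the same data the rank sum test in part (i) was not significant while this t test is, which is the contrast part (iii) asks candidates to discuss.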
Q3
(i)
H0 : $\mu_D = 0$ (or $\mu_{AFTER} = \mu_{BEFORE}$)  1
H1 : $\mu_D < 0$ (or $\mu_{AFTER} < \mu_{BEFORE}$)  1
Where $\mu_D$ is the population mean difference "after - before"  1 for verbal defn of $\mu_D$
[NOTE: candidate might of course define $\mu_D$ as "before - after"; take care that H1 agrees]
Requires Normality of population of differences  1. Must be clear, or clearly implied
The test procedure, and the CI in (ii), MUST be PAIRED COMPARISON t.

Differences are [as "after - before"; candidate might use "before - after"]
6  19  13  31  22  44  11  14
$\bar{d} = -12.4$, $s_{n-1} = 17.621$ ($s_{n-1}^2 = 310.49$)
A1 Accept $s_n = 16.716(5)$, $s_n^2 = 279.44$ ONLY if correctly used in sequel.

Test statistic is
$\dfrac{-12.4 - 0}{17.621/\sqrt{10}} = -2.22(535)$  M1, M1, M1 (don't FT to 2nd M1); A1
Refer to $t_9$  1 May be awarded even if test statistic is wrong. No FT if wrong
Lower s.t. 5% pt is $-1.833$  1 Sign must agree with H1/test statistic, unless a
clear argument based on modulus is used. No FT if wrong.
Significant  1. Seems mean afterwards is lower.
14
(ii)
CI is given by
$-12.4 \pm 2.262 \times \dfrac{17.621}{\sqrt{10}} = -12.4 \pm 12.60(4) = (-25.00(4),\ 0.20(4))$  M1 B1 M1 A1 c.a.o.
Zero out of 4 if not same dist as for test. Same wrong dist can score max M1 B0
M1 A0. Recovery to $t_9$ is ok.
(iii)
Any non-parametric procedure 1
Paired Wilcoxon 1 [allow sign test]
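Because the individual differences are not fully legible here, the sketch below works from the summary statistics quoted in the scheme ($\bar{d} = -12.4$, $s_{n-1} = 17.621$, $n = 10$) to reproduce the paired t statistic of part (i) and the two-sided interval of part (ii). The function name `paired_t_summary` is illustrative only.

```python
import math

def paired_t_summary(d_bar, s, n, t_crit):
    """Paired t statistic (for mean difference 0) and two-sided CI from summaries."""
    standard_error = s / math.sqrt(n)
    t_stat = d_bar / standard_error
    interval = (d_bar - t_crit * standard_error, d_bar + t_crit * standard_error)
    return t_stat, interval

# 2.262 is the two-tailed 5% point of t with 9 df quoted in the scheme
t_stat, interval = paired_t_summary(-12.4, 17.621, 10, 2.262)
one_sided_significant = t_stat < -1.833   # lower single-tailed 5% point
print(round(t_stat, 4), [round(v, 3) for v in interval], one_sided_significant)
```

The one-sided test is significant while the two-sided 95% interval just includes zero; the two procedures use different critical points, so there is no contradiction.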
Q4
(i)
H0 : no association between age and level of interest. B1
H1 : association between age and level of interest. B1
$o_i$
       49    145  |  194
      216    435  |  651
      265    580  |  845
$e_i$
     60.84   133.16
    204.16   446.84
A2 Award A1 if any one is correct. But deduct 1 if not at least 2 dp
$|o_i - e_i| = 11.84$, or 11.34 with Yates' correction
$X^2 = 3.99(71)$ with Yates
$4.35(73)$ without Yates
M1 for either, near-enough correct
A1 if Yates used
Refer to $\chi^2_1$  1 [FT if 2 or 3 df averred]
Upper 5% point is 3.84 1
Significant 1
Seems there is association 1*
Seems under-30s have less interest than would be expected,
and over-30s more, than if there were no association. 2*
* These 3 marks are not available if H0 and H1 are reversed
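The expected frequencies and both versions of the test statistic in part (i) can be checked as follows. This is a minimal sketch for a 2 x 2 table; the function name `chi_squared_2x2` is illustrative and the row and column labels are deliberately left out.

```python
def chi_squared_2x2(table, yates=False):
    """Chi-squared statistic for a 2x2 table, optionally with Yates' correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            difference = abs(observed - expected)
            if yates:
                difference -= 0.5   # Yates: reduce |o - e| by 0.5
            statistic += difference ** 2 / expected
    return statistic

observed = [[49, 145], [216, 435]]
plain = chi_squared_2x2(observed)
corrected = chi_squared_2x2(observed, yates=True)
print(round(plain, 2), round(corrected, 2))
```

Both values exceed the upper 5% point of 3.84 quoted in the scheme, so the conclusion is the same with or without the correction. The code uses unrounded expected frequencies, so its values can differ from the scheme's in the fourth decimal place.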
(ii)
                          Level of interest
Directly-elected mayor    Great    Little    Total
Yes                         118       314      432
No                           49       216      265
Total                       167       530      697
M1 for table with correctly labelled rows and columns.
M1 if all margins correctly add up from the individual values.
A1, A1, A1, A1 for each individual cell (118, 314, 49, 216).
(iii)
We do not [at least prima facie] have a random sample of 697 people
who were classified over the 4 cells. The usual $\chi^2$ approach requires
such an assumption. E2
Examiners Report
2616 Statistics 4
General Comments
Most candidates appeared to be well prepared for this examination and there was no
evidence that candidates had insufficient time to complete the paper. In fact, some
candidates gave full answers to all four questions.
As in previous years candidates performed much more strongly when carrying out
the numerical parts of questions than they did when discussing assumptions or
analysing results. The two most common examples of this weakness were firstly the
assumptions required for the various t-tests to be valid: many candidates were not
clear about whether parent populations, samples, means or data had to be normally
distributed or whether they were looking at one distribution, two distributions or the
difference between two distributions.
The second weakness was in the contextualisation of the results of a hypothesis test.
Many candidates did not make any statement beyond "reject H0", whilst at the other
end of the scale, candidates were too definitive, making statements such as "reject
H0, hence the median strength using process A is greater than the median strength
using process B".
Once again, Question 1 on estimation was by far the least popular question.
However most candidates who attempted question 1 scored well.
Comments on Individual Questions

Q.1
This question was only attempted by about 20% of candidates.

Virtually all candidates knew what they had to do in part (i) and were able to
verify the value of b. Most were also able to calculate the variance of T,
although poor algebra let down some candidates.
In part (ii) most candidates used calculus to show that the variance was
minimised when a = 0.5, although some showed only that the variance had a
stationary value. A few candidates used a method involving completing the
square.
Candidates who got this far were almost all able to state the distribution of T
and explain why it was a better estimator of $\mu$ than either X or Y.
Most candidates who attempted part (iii) knew what they were doing but a
number failed to realise that Var(T) = $\tfrac{1}{2}\sigma^2$ and a number also did not realise
that because the value of $\sigma^2$ was known, the normal distribution should be
used; indeed one candidate used $t$ specifically because the sample was
small.
Q.2
This was the most popular question on the paper, being attempted by all but 2
candidates.
Part (i) was obviously familiar ground for most candidates and most scored
very well here. The method of choice for most candidates was to calculate the
Wilcoxon rank sum statistic, convert to the Mann-Whitney statistic and then
use the Mann-Whitney tables. Only a small minority of candidates calculated
a statistic (Wilcoxon or Mann-Whitney) and then moved directly to the
relevant statistical table. However, this part of the question was answered
better than any other part of the paper.
Part (ii) was not answered as well, with many candidates not realising that
Normality of both underlying populations was required. The pooled variance
also caused some confusion, with some candidates trying to pool standard
deviations, some adding variances and others being confused about the use
of $s_n^2$ and/or $s_{n-1}^2$.
Once a variance had been obtained, most candidates were then able to
calculate the test statistic correctly and compared it with the two-tailed value
of $t_9$.
In both parts (i) and (ii) a significant number of candidates were too definitive
in their interpretation of the rejection, or otherwise, of the null hypothesis.
Answers to part (iii) tended to be too vague with very few candidates
mentioning the fact that the t-test is a more powerful, or sensitive, test than
the non-parametric alternatives, as long as the assumptions are satisfied.
However, if the assumptions are not satisfied, results can be seriously
misleading.
Q.3
In part (i) many candidates lost a significant number of marks because they
did not carefully state their hypotheses or take sufficient care with the
distributional assumption. Hypotheses such as "the intensity remains the
same" and "the intensity reduces" were common. What is required are explicit
statements about either the mean of the population of differences, or about
the means of the populations before and after. In addition all terms used
should be defined. The required distributional assumption was the Normality
of the population of differences.
As with other questions, most candidates were able to carry out the
calculations competently and most used the correct value of t.
Part (ii) was very well done by the majority of candidates, although a few did
use the Normal distribution.
Virtually all candidates correctly named the paired Wilcoxon test in part (iii).
Q.4
Most candidates were obviously on comfortable ground here and tended to
score well.
In part (i) most candidates were able to state the hypotheses correctly,
although some got the hypotheses the wrong way round and some talked
about correlation.
Calculations were inevitably done correctly, but a few candidates only gave
the expected values to 1 decimal place or even to the nearest integer.
Many candidates obviously realised that it would be appropriate to use Yates'
correction, but few actually did. Of those that did, some were unsure whether
to add or subtract 0.5.
Most candidates correctly used 1 degree of freedom for the test and were
able to give the correct critical value. A small minority used 2 or 3 degrees of
freedom.
There was a definite improvement on previous years in the discussion of the
results of the hypothesis test, with many candidates considering the
contributions to the $\chi^2$ statistic, or at the very least considering the
differences between observed and expected values.
Most candidates scored full marks in part (ii).
Candidates struggled with part (iii), with the most common suggestion being
about different sample sizes. The actual reason was that we do not have a
random sample of people who were classified over the 4 cells.
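Mark Scheme
MEI STATISTICS 4 (2616)
JANUARY 2005
SOLUTIONS
Question 1
(i)
We have: $X_1 \sim \text{Poisson}(\lambda)$, $X_2 \sim \text{Poisson}(4\lambda)$, $X_3 \sim \text{Poisson}(10\lambda)$  M1 might be implicit in sequel
$\hat{\lambda} = \tfrac{1}{15}(X_1 + X_2 + X_3)$
$E(\hat{\lambda}) = \tfrac{1}{15}(\lambda + 4\lambda + 10\lambda)$  M1 for any attempt to find $E(\hat{\lambda})$; M1 for use of Poisson means
$= \lambda$  $\therefore \hat{\lambda}$ is unbiased  A1
$\mathrm{Var}(\hat{\lambda}) = \tfrac{1}{15^2}\mathrm{Var}(X_1 + X_2 + X_3)$  M1 for any (reasonable) attempt to find Var
$= \tfrac{1}{15^2}(\lambda + 4\lambda + 10\lambda)$  M1 for use of Poisson variances
$= \dfrac{\lambda}{15}$  A1 - beware printed answer
8
(ii)
$Y \sim \text{Poisson}(10\lambda)$
$E(\tfrac{1}{10}\bar{Y}) = \tfrac{1}{10}E(\bar{Y}) = \tfrac{1}{10}E(Y) = \tfrac{1}{10} \cdot 10\lambda = \lambda$  M1 A1
i.e. unbiased  1
Now $\mathrm{Var}(\tfrac{1}{10}\bar{Y}) = \tfrac{1}{100}\mathrm{Var}(\bar{Y}) = \tfrac{1}{100} \cdot \dfrac{\mathrm{Var}(Y)}{n} = \tfrac{1}{100} \cdot \dfrac{10\lambda}{n} = \dfrac{\lambda}{10n}$  M1 M1 M1, A1
7
(iii)
$\dfrac{\lambda}{10n} < \dfrac{\lambda}{15}$ for $n \ge 2$  M1
i.e. $\tfrac{1}{10}\bar{Y}$ is better  E1
For $Z \sim \text{Poisson}(\lambda)$, we have $\mathrm{Var}(\bar{Z}) = \dfrac{\lambda}{n}$
So would need $n \ge 16$ to be better than $\hat{\lambda}$  E2 Allow 1 for $n \ge 15$
5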
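The variance comparison in part (iii) can be verified exactly with rational arithmetic. In this sketch the Poisson parameter is written `lam` (the symbol is an assumption, since the original symbol is not legible here), its numerical value is arbitrary, and the function names are illustrative.

```python
from fractions import Fraction

def var_combined(lam):
    """Variance of (X1 + X2 + X3)/15 with Poisson means lam, 4*lam, 10*lam."""
    return Fraction(lam + 4 * lam + 10 * lam, 15 ** 2)

def var_scaled_mean(lam, n, scale):
    """Variance of (mean of n Poisson(scale*lam) observations) / scale."""
    return Fraction(scale * lam, n) / scale ** 2

def smallest_n_beating_combined(lam, scale):
    """Smallest n for which the scaled sample mean has strictly smaller variance."""
    n = 1
    while var_scaled_mean(lam, n, scale) >= var_combined(lam):
        n += 1
    return n

lam = 3  # hypothetical value; the comparison does not depend on it
n_for_y = smallest_n_beating_combined(lam, 10)  # estimator based on Y ~ Poisson(10*lam)
n_for_z = smallest_n_beating_combined(lam, 1)   # estimator based on Z ~ Poisson(lam)
print(n_for_y, n_for_z)
```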
Question 2
(a)
MUST be N(0,1) test and CI for comparing means
H0 : $\mu_1 = \mu_2$
H1 : $\mu_1 \ne \mu_2$
1 if both correct. DO NOT allow $\bar{X}_1 = \bar{X}_2$ or similar. Allow verbal statement
1 if $\mu_1$, $\mu_2$ are adequately defined in words (population mean times)
Test statistic is
$\dfrac{12.6 - 13.9}{\sqrt{\dfrac{(2.4)^2}{80} + \dfrac{(3.5)^2}{90}}} = \dfrac{-1.3}{\sqrt{0.2081}} = \dfrac{-1.3}{0.4562} = -2.84(97)$  M1 A1
Refer to N(0,1)  1 No FT if wrong
1% critical point (two-sided) is 2.576  1 No FT if wrong
Significant
Seems mean waiting times differ
CI is given by
$-1.3 \pm 1.96 \times 0.4562 = -1.3 \pm 0.894 = (-2.194, -0.406)$  M1 B1 M1 A1 accept (-2.2, -0.4)
12
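The large-sample test statistic and 95% interval in part (a) follow directly from the summary figures in the working (means 12.6 and 13.9, standard deviations 2.4 and 3.5, sample sizes 80 and 90). A minimal sketch, with an illustrative function name:

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Return (z statistic, standard error) for a difference of two means."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error, standard_error

z, se = two_sample_z(12.6, 2.4, 80, 13.9, 3.5, 90)
difference = 12.6 - 13.9
interval = (difference - 1.96 * se, difference + 1.96 * se)
significant_at_1_percent = abs(z) > 2.576   # two-sided 1% point from the scheme
print(round(z, 4), [round(v, 3) for v in interval], significant_at_1_percent)
```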
(b) MUST be Wilcoxon rank-sum test (or Mann-Whitney form
thereof).
It is convenient, and natural, to rank "top down"
Use of Ranks M1
Ranks are: I
II
10
11
13
12
A1
[For "bottom-up" rankings: W = 55, MW = 34; upper 5% tail]
Rank sum (for I) is 29 (Mann-Whitney is 8)
Refer to tables of Wilcoxon (or M-W) statistic
Lower 5% tail is needed
Value for (6,7) is 29 (or 8 if M-W used)
Result is significant
Seems on the whole there are differences in satisfaction scores
Question 3
Differences (after - before):
6  11  22  -5  -1  4  28  -2  7  3  9  8
(a)
MUST be PAIRED WILCOXON test.
Ranks of |d| are
6  10  11  5  1  4  12  2  7  3  9  8  M1; A1 FT if wrong
Test statistic is 5 + 1 + 2 = 8 [or 70]
Refer to paired Wilcoxon table with n = 12
Lower 5% point is 17 [upper is 61]
$\therefore$ the observed 8 [or 70] is significant
Seems coaching programme has improved short-term visual
memory
7
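The signed-rank sums in part (a) can be reproduced with the sketch below. The signs of the differences are an assumption: they are inferred from the quoted mean difference of 7.5 and the quoted statistic 5 + 1 + 2 = 8. The helper `signed_rank_sums` is hypothetical and assumes no zero or tied differences.

```python
def signed_rank_sums(differences):
    """Return (sum of ranks of negative d, sum of ranks of positive d)."""
    ordered = sorted(differences, key=abs)   # rank by |d|; no ties assumed
    negative = sum(rank for rank, d in enumerate(ordered, start=1) if d < 0)
    positive = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
    return negative, positive

# assumed signs, consistent with the mean 7.5 and the statistic 8
differences = [6, 11, 22, -5, -1, 4, 28, -2, 7, 3, 9, 8]
negative_sum, positive_sum = signed_rank_sums(differences)
significant = negative_sum <= 17   # lower 5% point for n = 12, from the scheme
print(negative_sum, positive_sum, significant)
```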
(b)
MUST be PAIRED COMPARISON t test
Normality of differences  1
$\bar{d} = 7.5$, $s_{n-1} = 9.5299$ ($s_{n-1}^2 = 90.8182$)  M1 for use of differences
B1 Accept $s_n = 9.1241$ ($s_n^2 = 83.25$) ONLY if correctly used in sequel
Test statistic (for test of $\mu_D = 0$ against $\mu_D > 0$) is
$\dfrac{7.5 - 0}{9.5299/\sqrt{12}} = 2.72(62)$  M1 A1
Refer to $t_{11}$  1 No FT if wrong
Upper 5% pt is 1.796  1 No FT if wrong
Significant
Seems coaching programme has improved short-term visual
memory
Look at differences  M1
Consider e.g. dotplot  M1, or for any other relevant display/discussion of the data
Bulk of data appear OK [assuming no concern about being
integers], but the two large upper outliers cast doubt
E2 (E0, E1, E2)
Question 4
(i) H0: no association (between success of transmission and
    type of destination)
    H1: association
2
(ii) $O_i$
      100    57    23  |  180
       21    14    13  |   48
       31    21    20  |   72
      152    92    56  |  300
$E_i$
      91.2(0)   55.2(0)   33.6(0)
      24.32     14.72      8.96
      36.48     22.08     13.44
A4 - deduct 1 per error. Must be to this level of accuracy
Contributions to $X^2$
      0.8491   0.0587   3.3440
      0.4532   0.0352   1.8216
      0.8232   0.0528   3.2019
M1
$X^2 = 10.63(985)$, awrt 10.64  A2 [give A1 if in (10.5, 10.8)]
Refer to $\chi^2_4$  2 [or zero; FT if wrong, unless 300]
Upper 10% point is 7.779  1
Significant  1
Seems there is association  1
ZERO if H0 and H1 are reversed
12
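The expected frequencies, the test statistic and the degrees of freedom in part (ii) can be checked with a short routine for a general r x c table. This is a sketch; the function name `chi_squared` is illustrative, and the row and column labels are omitted.

```python
def chi_squared(table):
    """Return (statistic, degrees of freedom, expected frequencies) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum((o - e) ** 2 / e
                    for obs_row, exp_row in zip(table, expected)
                    for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

observed = [[100, 57, 23], [21, 14, 13], [31, 21, 20]]
statistic, df, expected = chi_squared(observed)
significant = statistic > 7.779   # upper 10% point of chi-squared with 4 df
print(round(statistic, 2), df, significant)
```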
(iii) The key feature is the behaviour of transmission when intended
destinations are universities. There are many more "more than one
attempt", and many more "not successful at all", transmissions than
would be expected if there were no association, and many fewer
"successful at first attempt" transmissions. There is little or no
suggestion of any other associations.
E6 (divisible)
Mark Scheme 2616
June 2005
2616 Statistics 4
Q1
$X_1, \ldots, X_n \sim$ ind $N(\mu, \sigma^2)$, $Y = \sum (X_i - \bar{X})^2$
$E(Y) = (n - 1)\sigma^2$, $\mathrm{Var}(Y) = 2(n - 1)\sigma^4$
$T = kY$
(i)
$E(T) = k(n - 1)\sigma^2$  B1
$\mathrm{Var}(T) = 2k^2(n - 1)\sigma^4$  B1
(ii)
Bias $= E(T) - \sigma^2 = k(n - 1)\sigma^2 - \sigma^2$  M1 A1 Allow M1A0 if $\sigma^2 - E(T)$.
(iii)
MSE(T) = Variance + bias$^2$  M1 If both terms present, even if wrong.
$= 2k^2(n - 1)\sigma^4 + \{k(n - 1)\sigma^2 - \sigma^2\}^2$  A1 If both correct.
$= 2k^2(n - 1)\sigma^4 + \sigma^4\{k^2(n - 1)^2 - 2k(n - 1) + 1\}$
$= \sigma^4[2(n - 1) + (n - 1)^2]k^2 - 2\sigma^4(n - 1)k + \sigma^4$  A2 Divisible for algebra. BEWARE printed answer.
(iv)
Consider $\dfrac{d\,\mathrm{MSE}(T)}{dk} = 0$  M1 To include "= 0", possibly implied.
$\dfrac{d\,\mathrm{MSE}(T)}{dk} = \sigma^4[2(n - 1) + (n - 1)^2] \cdot 2k - 2\sigma^4(n - 1)$  A1 Correct derivative.
$= 0 \Rightarrow k = \dfrac{n - 1}{2(n - 1) + (n - 1)^2}$  A1 Isolate k.
$= \dfrac{1}{n + 1}$  A1
Check minimum by considering
$\dfrac{d^2\,\mathrm{MSE}(T)}{dk^2} = \sigma^4[2(n - 1) + (n - 1)^2] \cdot 2$  M1 Or other methods.
$> 0 \therefore$ min  A1 (Since n > 1).
(v)
With $k = \dfrac{1}{n + 1}$,
$\mathrm{MSE}(T) = \sigma^4\left[\dfrac{2(n - 1) + (n - 1)^2}{(n + 1)^2} - \dfrac{2(n - 1)}{n + 1} + 1\right]$
$= \dfrac{\sigma^4}{(n + 1)^2}\{2n - 2 + n^2 - 2n + 1 - 2n^2 + 2 + n^2 + 2n + 1\}$
$= \dfrac{\sigma^4}{(n + 1)^2}\{2n + 2\} = \dfrac{2\sigma^4}{n + 1}$  B2 Divisible for algebra. Answer not printed.
(vi)
From (ii), we want $k(n - 1)\sigma^2 - \sigma^2 = 0$  M1 For the converse argument, with no support of "only if", award SC B1.
$\therefore k = \dfrac{1}{n - 1}$  A1
In this case, MSE(T) = Var(T)  M1
$= \dfrac{2\sigma^4}{n - 1}$  A1 Or substitute in expression for MSE in (iii); this is not difficult.
4
20
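The algebra in parts (iv) to (vi) can be spot-checked with exact rational arithmetic. The sketch below evaluates MSE(T)/$\sigma^4$ for a particular n (the value n = 10 is an arbitrary illustration) and confirms that k = 1/(n + 1) gives 2/(n + 1), that k = 1/(n - 1) gives 2/(n - 1), and that nearby values of k do worse than the minimiser.

```python
from fractions import Fraction

def mse_over_sigma4(k, n):
    """MSE of T = kY divided by sigma^4: variance term plus squared bias."""
    variance = 2 * k ** 2 * (n - 1)
    bias = k * (n - 1) - 1
    return variance + bias ** 2

n = 10                            # arbitrary illustrative sample size
k_min = Fraction(1, n + 1)        # minimiser found in part (iv)
k_unbiased = Fraction(1, n - 1)   # unbiased choice from part (vi)

mse_min = mse_over_sigma4(k_min, n)
mse_unbiased = mse_over_sigma4(k_unbiased, n)
step = Fraction(1, 1000)
neighbours_worse = all(mse_over_sigma4(k_min + delta, n) > mse_min
                       for delta in (-step, step))
print(mse_min, mse_unbiased, neighbours_worse)
```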
Q2
(i)
H0 : $\mu_A = \mu_B$
H1 : $\mu_A \ne \mu_B$
B1 Both hypotheses. Do not allow any other symbols, including, e.g.,
$\bar{X}_A = \bar{X}_B$ or similar, unless they are clearly and explicitly stated to be
population means. Allow statements in words (see below).
Where $\mu_A$, $\mu_B$ are the population mean strengths for
processes A and B.
B1 For adequate verbal definitions of $\mu_A$, $\mu_B$. Must indicate "mean"; condone
"average". Allow absence of "population" if correct notation $\mu$ is
used, otherwise insist on "population".
Normality of both populations.  B1
Same variance.  B1
4
(ii)
$n_1 = 9$, $\bar{x} = 114.6667$, $s_{n-1}^2 = 87.25$ ($s_{n-1} = 9.3408$)
$n_2 = 8$, $\bar{y} = 123.75$, $s_{n-1}^2 = 109.07$ ($s_{n-1} = 10.4437$)
B1 If all means and variances correct.
Accept $s_n$'s ONLY if correctly used in sequel:
$s_n^2 = 77.5\dot{5}$, $s_n = 8.8066$; $s_n^2 = 95.4375$, $s_n = 9.7692$
Pooled $s^2 = \dfrac{698 + 763.5}{15} = 97.43$
M1 For any reasonable attempt at pooling (and ft into test and CI).
A1 If correct.
Test statistic is
$\dfrac{114.6667 - 123.75}{\sqrt{97.43\left(\tfrac{1}{9} + \tfrac{1}{8}\right)}} = \dfrac{-9.0833}{\sqrt{23.0051}} = \dfrac{-9.0833}{4.7964} = -1.89(38)$
M1 Overall structure. Allow c's pooled s.
M1 $\sqrt{\tfrac{1}{9} + \tfrac{1}{8}}$
A1 ft c's pooled $s^2$.
Refer to $t_{15}$.  M1 No ft from here if wrong.
Double tail 5% point is 2.131.  A1
Not significant.  E1 ft only c's test statistic.
Seems mean strengths are the same for both
processes.  E1 ft only c's test statistic. Expect reference to means and context.
10
(iii)
CI is given by $-9.0833 \pm 2.947 \times 4.7964$
$= -9.0833 \pm 14.1349 = (-23.21(8),\ 5.05(2))$
M1 Must be c's $(\bar{x} - \bar{y}) \pm \ldots$
B1 From $t_{15}$.
M1 Allow c's pooled s.
A1 c.a.o. Must be written as an interval.
(iv)
Wilcoxon  B1
Rank sum test  B1
Or Mann-Whitney scores B2.
2
20
Q3
(a)
H0 : $\mu_D = 0$ or $\mu_E = \mu_S$
H1 : $\mu_D \ne 0$ or $\mu_E \ne \mu_S$
B1 Both hypotheses. Do not allow any other symbols, including, e.g.,
$\bar{X}_E = \bar{X}_S$ or similar, unless they are clearly and explicitly stated to be
population means. Allow statements in words (see below).
Where $\mu_D$ is "population mean for Experimental
fertilizer - population mean for Standard fertilizer".
B1 For adequate verbal definition of $\mu_D$. Must indicate "mean"; condone
"average". Allow absence of "population" if correct notation $\mu$ is
used, otherwise insist on "population".
Normality of differences is required.  B1 Must be explicit about the population.
MUST be PAIRED COMPARISON t test.
Differences are
0.6  2.3  0.8  0.6  0.9  1.5  1.4  0.8  0.1  0.2  M1
$\bar{d} = 0.46$, $s_{n-1} = 1.0668(75)$, $s_{n-1}^2 = 1.1382$
B1 Accept $s_n = 1.0121$, $s_n^2 = 1.0244$ ONLY if correctly used in sequel.
Test statistic is
$\dfrac{0.46 - 0}{1.0668(75)/\sqrt{10}} = 1.36(35)$
M1 Allow c's $\bar{d}$ and/or $s_{n-1}$.
Allow alternative: $0 \pm (\text{c's } 2.262) \times \dfrac{1.0668(75)}{\sqrt{10}}$ ($= \pm 0.7631$) for subsequent
comparison with $\bar{d}$.
(Or $\bar{d} \pm (\text{c's } 2.262) \times \dfrac{1.0668(75)}{\sqrt{10}}$ ($= -0.303,\ 1.2231$) for comparison with 0.)
A1 c.a.o. (but ft from here if this is wrong.)
Use of $\mu_D - \bar{d}$ scores M1A0, but next 4 marks still available.
Refer to $t_9$.  M1
Double tail 5% point is 2.262.  A1
Not significant.  E1
Seems mean yield using experimental fertilizer is
same as for standard.  E1 ft only c's test statistic. Expect reference
to mean(s) and context.
11
(b)
Now need Normality for yields using experimental
fertilizer.  B1
For these yields,
$\bar{x} = 20.43$, $s_{n-1} = 4.0803$, $s_{n-1}^2 = 16.649$
B1 Accept $s_n = 3.8709$, $s_n^2 = 14.9841$ ONLY if correctly used in sequel.
One-sided CI (lower confidence bound) is given by
$20.43 - 1.833 \times \dfrac{4.0803}{\sqrt{10}} = 20.43 - 2.36(51) = 18.06(49)$
M1 Mean. Allow c's $\bar{x}$.
M1 Minus.
B1 From $t_9$.
M1 Allow c's $s_{n-1}$, or $s_n/\sqrt{9}$ (see above).
A1 Depends on all 4 preceding marks.
In repeated sampling, lower confidence bounds
obtained in this way would fall below the true mean
on 95% of occasions.
E2 (E0, E1, E2). Comment should refer to
lower bound rather than just the
confidence interval.
9
20
Q4
(a)
Data                         29   32   34   38   40   46   51   52   59   63   71   95
Difference (from median 60) -31  -28  -26  -22  -20  -14   -9   -8   -1    3   11   35
Rank of |diff|               11   10    9    8    7    6    4    3    1    2    5   12
M1 For differences. ZERO in this section if differences not used.
M1 For ranks of |difference|.
A1 All correct. ft from here if ranks wrong.
T = 2 + 5 + 12 = 19  B1 Or 1 + 3 + 4 + 6 + 7 + 8 + 9 + 10 + 11 = 59
Refer to tables of Wilcoxon single sample (/paired)
statistic.  M1
Lower (or upper if 59 used) 2½% tail is needed.  M1
Value for n = 12 is 13 (or 65 if 59 used).  A1
Result is not significant.  E1
No real evidence that median is not 60.  E1
(b)
(i)
$P(80 < N(62, \sigma = 27.3) \le 100)$  B1
$= P(0.6593(4) < N(0, 1) \le 1.3919(4))$
$= 0.9180 - 0.7452 = 0.1728$  B1
$\therefore$ expected frequency $= 200 \times 0.1728 = 34.6$
(ii)
Grouping the last two cells,
$X^2 = 5.6903 + 0.1946 + 18.3265 + 5.2024 + 8.9526 + 5.6195$
$= 43.98(59)$
M1 A1 Allow without grouping.
This becomes $\ldots + 0.0769 + 21.7529$.
$X^2$ becomes 60.19(62). Then must have $\chi^2_4$ below.
Refer to $\chi^2_3$.  M1
Extremely highly significant; overwhelming
evidence that Normal model does not fit data.  A1 NEXT mark not available if not $\chi^2_3$.
The fit is not particularly good in most of the
intervals, but the main points are that the modal class
is perhaps half an interval lower than expected, that
there are many fewer low values than expected, and
that there are a lot of upper outliers.
E2 (E0, E1, E2)
(iii)
Part (a) has a small sample and it appears that the
underlying distribution is not Normal; it could be
dangerous to use a t test.
E2 (E0, E1, E2)
There is also the point that, in the absence of
Normality (or at least of symmetry), we could not use
the t test for the mean as a proxy test for the median.
E1
20
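The expected frequency in part (b)(i) and the grouped test statistic in part (b)(ii) can be checked with the standard library's `statistics.NormalDist` (Python 3.8 or later). This is a sketch using the model N(62, $\sigma$ = 27.3) and the sample size 200 from the working; the contributions to $X^2$ are copied from the scheme rather than recomputed, since the observed frequencies are not reproduced here.

```python
from statistics import NormalDist

model = NormalDist(mu=62, sigma=27.3)

# probability of the class 80 < x <= 100 under the fitted Normal model
probability = model.cdf(100) - model.cdf(80)
expected_frequency = 200 * probability

# contributions to X^2 with the last two cells grouped, as quoted in the scheme
contributions = [5.6903, 0.1946, 18.3265, 5.2024, 8.9526, 5.6195]
x_squared = sum(contributions)
degrees_of_freedom = len(contributions) - 1 - 2   # two parameters estimated

print(round(probability, 4), round(expected_frequency, 1),
      round(x_squared, 4), degrees_of_freedom)
```

The degrees of freedom drop by two because both the mean and the standard deviation were estimated from the data, which is the point many candidates missed according to the report.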
June 2005
2616 - Statistics 4
General Comments
There were 93 candidates from 20 centres (June 2004: 82 from 20). The overall
standard of the scripts seen was pleasing: many candidates were clearly well
prepared for this paper. Routine calculations were carried out well but the
candidates' ability to comment and interpret was a little disappointing at this level.
Question 1 was by far the least popular question with only about 15 candidates
attempting it. Every candidate attempted Question 2; Questions 3 and 4 were
equally popular.
Comments on Individual Questions

1)
Estimation theory
Although this was the least popular question it seemed to have the highest mean
mark, with most of those attempting it scoring full or almost full marks. Those
who were prepared to try it were likely to be successful as long as their algebra
was up to the task. Sometimes the algebra arrived at the correct destination by
brute force rather than elegance.
There were just two places where marks seemed likely to be lost: part (iv) where
some neglected to verify that the required value of k did indeed give a minimum
and part (vi) where there was a temptation for some to use the converse argument.
2)
Two sample t test and confidence interval; the strengths of steel rods
This was the most popular question being attempted by all candidates. It was also
a very high scoring question: about half of the entry scored full or almost full
marks.
(i)
The hypotheses were usually stated correctly but there was rather less
care in providing verbal definitions of the population means. Similarly, the
required assumptions were sometimes less than ideal.
(ii)
Most candidates carried out the test competently. There was rarely any
problem over finding and using the pooled variance. The critical value was
almost always correct but on a number of occasions the conclusion was
badly expressed.
(iii)
As in part (ii) most candidates had little difficulty here. Just occasionally
the standard error (which had been correctly constructed in part (ii))
became pooled $s\sqrt{\tfrac{1}{17}}$.
(iv)
This part was almost always correct.
3)
Paired sample t test and one-sided confidence interval; comparing
fertilizers
(a)
The hypotheses were usually stated correctly but candidates were not as
careful about defining the symbol $\mu_D$. Nor were they sufficiently careful
when it came to the distributional assumption.
However there were only a very few candidates who did not realise that
they should carry out a paired test. The vast majority made good progress
with the test itself, and only the final conclusion left room for
improvement.
(b)
As above, most realised what to do here and the correct value for the
lower bound was usually found. A small minority tried to construct the
confidence interval using the information from the paired test. There was
some uncertainty again with the distributional assumption.
The main area of difficulty was with the interpretation of the interval.
Very many comments revealed a flawed understanding of a confidence
interval to quite a worrying extent.
4)
Wilcoxon single sample test for the median; Chi-squared test for goodness of fit;
waiting times in an airport
(a)
This part of the question was almost always answered well. Many fully
correct solutions were seen.
(b) (i) This part was frequently done correctly.
(ii)
Most candidates calculated a correct value of $X^2$ (with or without
grouping) but relatively few were able to identify the correct Chi-squared
distribution to look up. Most of those who got this second aspect wrong
made no allowance for estimated parameters while a few thought that
there were 200 degrees of freedom. Hardly any commented on the fact
that the test statistic was significant at any level available to them in the
tables.
Disappointingly few candidates took the trouble to comment at all on the
reasons for the poor quality of fit.
(iii)
In this part of the question very few candidates realised that they could
refer back to the previous part for evidence that the assumption of
background Normality was not viable. They knew that Normality was
required, but often chose to look at the sample data in part (a), sometimes
with the aid of a dot plot. Hardly any candidates included in their
discussion the small sample size which might prompt the use of a t test.
No more than a handful of candidates picked up on the fact that a t test
examines the population mean whereas the Wilcoxon test in part (a)
examined the median.
