Mark Scheme
2616/01
June 2004
General Instructions
Some marks in the mark scheme are explicitly designated as M, A, B or E.
M marks (method) are for an attempt to use a correct method (not merely for stating the
method).
A marks (accuracy) are for accurate answers and can only be earned if corresponding M
mark(s) have been earned. Candidates are expected to give answers to a sensible level of
accuracy in the context of the problem in hand. The level of accuracy quoted in the mark
scheme will sometimes deliberately be greater than is required, when this facilitates marking.
E marks (explanation) are for explanation and/or interpretation. These will frequently be
subdividable depending on the thoroughness of the candidate's answer.
Follow-through marking should normally be used wherever possible; there will,
however, be an occasional designation of "c.a.o." for "correct answer only".
Full credit MUST be given when correct alternative methods of solution are used. If errors
occur in such methods, the marks awarded should correspond as nearly as possible to
equivalent work using the method in the mark scheme.
All queries about the marking should have been resolved at the standardising meeting.
Assistant Examiners should telephone the Principal Examiner (or Team Leader if
appropriate) if further queries arise during the marking.
Assistant Examiners may find it helpful to use shorthand symbols as follows:
FT  Follow-through marking
Correct work after error
Incorrect work after error
BOD  Benefit of doubt
NOS
Work of no value
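Q1
(i)
We want
$\mu = E[aX + bY]$  M1
$= a\mu + b \cdot 2\mu$  1
$\therefore 2b = 1 - a$, i.e. $b = \tfrac{1}{2}(1 - a)$
Then
$\mathrm{Var}(T) = a^2\sigma^2 + \{\tfrac{1}{2}(1 - a)\}^2 \cdot 4\sigma^2$
$= \sigma^2\{a^2 + (1 - a)^2\}$
$= \sigma^2\{2a^2 - 2a + 1\}$
(ii)
Consider $\dfrac{d}{da}(2a^2 - 2a + 1) = 0$  M1
i.e. $0 = 4a - 2$  1
$\therefore a = \tfrac{1}{2}$  1  Beware printed answer
Verification that this is a minimum (e.g. trivially by $\dfrac{d^2}{da^2}$)
$T = \tfrac{1}{2}X + \tfrac{1}{4}Y \sim N(\mu, \tfrac{1}{2}\sigma^2)$
[Both $X$ and $\tfrac{1}{2}Y$ are u.b. for $\mu$ and both are Normally distributed, all of which ...]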
(iii)
$t = 7.48$  B1  FT if wrong
One-sided CI is given by
$7.48 - 1.645 \times \sqrt{\dfrac{\tfrac{1}{2} \times 3}{10}}$  M1 M1 B1
M1
M1 (use of $\tfrac{1}{2}\sigma^2$ as Var($T$))
$= 7.48 - 0.63(71)$
$= 6.84(29)$  A1 C.A.O.
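The arithmetic in Q1 can be checked numerically. The sketch below is only an illustration: it assumes Var(X) = σ² and Var(Y) = 4σ² (as implied by the variance expression in the scheme), that 7.48 is the mean of 10 values of T, and that σ² = 3. The function names `var_T` and `lower_bound` are hypothetical.

```python
import math

def var_T(a, sigma2=1.0):
    """Variance of T = aX + bY with b = (1 - a)/2, assuming Var(X) = sigma2 and Var(Y) = 4*sigma2."""
    b = (1 - a) / 2
    return a**2 * sigma2 + b**2 * 4 * sigma2

def lower_bound(t, z, sigma2, n):
    """One-sided lower confidence bound from the mean of n values of T, where Var(T) = sigma2/2."""
    return t - z * math.sqrt(0.5 * sigma2 / n)

# Crude grid search over a in [0, 1] for the minimum-variance unbiased choice.
grid = [i / 1000 for i in range(1001)]
a_best = min(grid, key=var_T)
print(a_best, var_T(a_best))

# Lower 95% bound with the figures quoted in part (iii).
print(round(lower_bound(7.48, 1.645, 3, 10), 4))
```

The grid search is deliberately naive; it only confirms that the calculus answer a = 1/2 sits at the minimum of the quadratic.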
Q2
(i)
Values, with ranks in the combined sample in brackets:
A  237 (10)  249 (11)  213 (2)  233 (8)  227 (6)  236 (9)
B  203 (1)  222 (5)  214 (3)  216 (4)  230 (7)
M1 for attempt
A1 if all correct
(ii)
$\bar{x} = 232.5$, $s_{n-1}^2 = 143.1$ ($s_{n-1} = 11.9624$); $s_n^2 = 119.25$, $s_n = 10.9202$
$\bar{y} = 217.0$, $s_{n-1}^2 = 100.0$ ($s_{n-1} = 10.0$); $s_n^2 = 80.0$, $s_n = 8.9443$
Pooled $s^2 = \dfrac{5 \times 143.1 + 4 \times 100.0}{9} = 123.94$
Test statistic is
$\dfrac{232.5 - 217.0\ (-0)}{\sqrt{123.94}\,\sqrt{\tfrac{1}{6} + \tfrac{1}{5}}} = \dfrac{15.5}{6.7414} = 2.29(92)$  A1
($\sqrt{123.94} = 11.1330$)
FT reasonable attempt
Refer to $t_9$.  1  May be awarded even if test statistic is wrong. No FT if wrong.
Double-tailed 5% point is 2.262.  1  No FT if wrong
Significant, seems means differ.  1
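As a cross-check on Q2, the pooled variance, the t statistic and the rank sums can be recomputed from the raw data. This is a sketch only: the allocation of the eleven values to samples A and B is an assumption, chosen because it reproduces the summary statistics quoted in the scheme, and `pooled_t` and `rank_sum` are hypothetical helper names.

```python
import math
import statistics

# Assumed allocation of the data to the two samples (it reproduces the quoted means and variances).
a = [237, 249, 213, 233, 227, 236]
b = [203, 222, 214, 216, 230]

def pooled_t(x, y):
    """Return (pooled variance, t statistic) for an equal-variance two-sample t test."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(sp2 * (1 / nx + 1 / ny))
    return sp2, t

def rank_sum(x, y):
    """Sum of the ranks of x in the combined sample (assumes no tied values)."""
    combined = sorted(x + y)
    return sum(combined.index(v) + 1 for v in x)

sp2, t = pooled_t(a, b)
print(round(sp2, 2), round(t, 4))
print(rank_sum(a, b), rank_sum(b, a))
```

Note that the pooling weights the two divisor n-1 variances by their degrees of freedom; pooling standard deviations or simply adding variances gives a different figure.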
(iii)
Q3
(i)
1
1
[NOTE: candidates might of course define D as "before - after"; take care that H1 agrees.]
of differences
1.
19
13
31
22
44
$\bar{d} = -12.4$, $s_{n-1} = 17.621$ ($s_{n-1}^2 = 310.49$)
11
14
A1  Accept $s_n = 16.716(5)$
Test statistic is $\dfrac{-12.4 - 0}{17.621/\sqrt{10}} = -2.22(535)$  A1
Refer to $t_9$
Lower single-tailed 5% point is $-1.833$  1  Sign must agree with H1/test statistic, unless a
clear argument based on modulus is used. No FT if wrong.
Significant
(ii)
14
CI is given by
$-12.4 \pm 2.262 \times \dfrac{17.621}{\sqrt{10}} = -12.4 \pm 12.60(4) = (-25.00(4),\ 0.20(4))$
M1 B1 M1 A1 c.a.o.
Zero out of 4 if not same distribution as for test. Same wrong distribution can score max M1 B0
M1 A0. Recovery to $t_9$ is OK.
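The paired test statistic and the two-sided interval in Q3 both come from the same standard error. A minimal sketch from the summary statistics, with the critical value 2.262 supplied by hand from tables rather than computed (`paired_t_summary` is a hypothetical name):

```python
import math

def paired_t_summary(d_bar, s, n, t_crit):
    """Return (test statistic, (lower, upper)) for a paired t analysis from summary statistics.

    s is the divisor n-1 standard deviation of the differences.
    """
    se = s / math.sqrt(n)
    return d_bar / se, (d_bar - t_crit * se, d_bar + t_crit * se)

stat, (lo, hi) = paired_t_summary(-12.4, 17.621, 10, 2.262)
print(round(stat, 4), round(lo, 3), round(hi, 3))
```

Using the same distribution for the test and the interval is what the "Zero out of 4" note above is policing.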
(iii)
Q4
(i)
Observed frequencies $o_i$ (with totals):
 49   145 | 194
216   435 | 651
265   580 | 845
Expected frequencies $e_i$:
 60.84   133.16
204.16   446.84
A2  Award A1 if any one is correct. But deduct 1 if not at least 2 dp
$|o_i - e_i| = 11.84$, or 11.34 with Yates' correction
$X^2 = 3.99(71)$ with Yates' correction
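The expected frequencies and the Yates-corrected statistic for the 2 x 2 table can be reproduced as follows. This is a sketch under the assumption that the four observed counts are 49, 145, 216 and 435 as laid out in the scheme; `yates_chi2` is a hypothetical helper, not a library function.

```python
def yates_chi2(table):
    """Yates-corrected chi-squared statistic for a 2x2 table of observed counts."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            e = row[i] * col[j] / n              # expected frequency under no association
            stat += (abs(table[i][j] - e) - 0.5) ** 2 / e
    return stat

print(round(yates_chi2([[49, 145], [216, 435]]), 4))
```

In a 2 x 2 table every cell has the same |o - e|, which is why the scheme quotes the single value 11.84 (11.34 after the correction).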
(iii)
                     Yes    No   Total
Level of interest
  Great              118   314    432
  Little              49   216    265
  Total              167   530    697
We do not [at least prima facie] have a random sample of 697 people
who were classified over the 4 cells. The usual $\chi^2$ approach requires
such an assumption.  E2
Examiners Report
2616 Statistics 4
General Comments
Most candidates appeared to be well prepared for this examination and there was no
evidence that candidates had insufficient time to complete the paper. In fact, some
candidates gave full answers to all four questions.
As in previous years candidates performed much more strongly when carrying out
the numerical parts of questions than they did when discussing assumptions or
analysing results. There were two common examples of this weakness. The first concerned the
assumptions required for the various t-tests to be valid: many candidates were not
clear about whether parent populations, samples, means or data had to be Normally
distributed, or whether they were looking at one distribution, two distributions or the
difference between two distributions.
The second weakness was in the contextualisation of the results of a hypothesis test.
Many candidates did not make any statement beyond "reject H0", whilst at the other
end of the scale, candidates were too definitive, making statements such as "reject
H0, hence the median strength using process A is greater than the median strength
using process B".
Once again, Question 1 on estimation was by far the least popular question.
However most candidates who attempted question 1 scored well.
Q.2
This was the most popular question on the paper, being attempted by all but 2
candidates.
Part (i) was obviously familiar ground for most candidates and most scored
very well here. The method of choice for most candidates was to calculate the
Wilcoxon rank sum statistic, convert to the Mann-Whitney statistic and then
use the Mann-Whitney tables. Only a small minority of candidates calculated
a statistic (Wilcoxon or Mann-Whitney) and then moved directly to the
relevant statistical table. However, this part of the question was answered
better than any other part of the paper.
Part (ii) was not answered as well, with many candidates not realising that
Normality of both underlying populations was required. The pooled variance
also caused some confusion, with some candidates trying to pool standard
deviations, some adding variances and others being confused about the use
of $s_n^2$ and/or $s_{n-1}^2$.
Once a variance had been obtained, most candidates were then able to
calculate the test statistic correctly and compare it with the two-tailed value
of $t_9$.
In both parts (i) and (ii) a significant number of candidates were too definitive
in their interpretation of the rejection, or otherwise, of the null hypothesis.
Answers to part (iii) tended to be too vague with very few candidates
mentioning the fact that the t-test is a more powerful, or sensitive, test than
the non-parametric alternatives, as long as the assumptions are satisfied.
However, if the assumptions are not satisfied, results can be seriously
misleading.
Q.3
In part (i) many candidates lost a significant number of marks because they
did not carefully state their hypotheses or take sufficient care with the
distributional assumption. Hypotheses such as "the intensity remains the
same" and "the intensity reduces" were common. What is required are explicit
statements about either the mean of the population of differences, or about
the means of the populations before and after. In addition, all terms used
should be defined. The required distributional assumption was the Normality
of the population of differences.
As with other questions, most candidates were able to carry out the
calculations competently and most used the correct value of t.
Part (ii) was very well done by the majority of candidates, although a few did
use the Normal distribution.
Virtually all candidates correctly named the paired Wilcoxon test in part (iii).
Q.4
Mark Scheme
JANUARY 2005
SOLUTIONS
Question 1
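(i)
We have:
$X_1 \sim \mathrm{Poisson}(\theta)$
$X_2 \sim \mathrm{Poisson}(4\theta)$
$X_3 \sim \mathrm{Poisson}(10\theta)$
M1 might be implicit in sequel
$\hat{\theta} = \tfrac{1}{15}(X_1 + X_2 + X_3)$
$E(\hat{\theta}) = \tfrac{1}{15}(\theta + 4\theta + 10\theta)$
M1 for any (reasonable) attempt to find $E(\hat{\theta})$
M1 for use of Poisson means
$= \theta$, so $\hat{\theta}$ is unbiased  A1
$\mathrm{Var}(\hat{\theta}) = \tfrac{1}{15^2}\mathrm{Var}(X_1 + X_2 + X_3)$
M1 for any (reasonable) attempt to find Var
$= \tfrac{1}{15^2}(\theta + 4\theta + 10\theta)$
M1 for use of Poisson variances
$= \dfrac{\theta}{15}$  A1 - beware printed answer
8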
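(ii)
$Y \sim \mathrm{Poisson}(10\theta)$
$E(\tfrac{1}{10}\bar{Y}) = \tfrac{1}{10}E(\bar{Y}) = \tfrac{1}{10}E(Y) = \tfrac{1}{10} \cdot 10\theta = \theta$, i.e. unbiased
M1 A1 1
Now
$\mathrm{Var}(\tfrac{1}{10}\bar{Y}) = \tfrac{1}{100}\mathrm{Var}(\bar{Y}) = \tfrac{1}{100} \cdot \dfrac{\mathrm{Var}(Y)}{n} = \tfrac{1}{100} \cdot \dfrac{10\theta}{n} = \dfrac{\theta}{10n}$
M1 M1 M1, A1
7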
(iii)
$\dfrac{\theta}{10n} < \dfrac{\theta}{15}$, i.e. $n > 1.5$  M1
$\tfrac{1}{10}\bar{Y}$ is better for $n \geq 2$  E1 E2
Allow 1 for $n > 1.5$
5
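The comparison in part (iii) is a comparison of two variances, each proportional to the unknown Poisson parameter, so it can be checked with exact fractions. The sketch below works per unit of the parameter and assumes the two estimators are (X1 + X2 + X3)/15 and the mean of n counts divided by 10; the function names are hypothetical.

```python
from fractions import Fraction

def var_three_counts():
    """Variance of (X1 + X2 + X3)/15 per unit of the parameter: (1 + 4 + 10)/15^2."""
    return Fraction(1 + 4 + 10, 15**2)

def var_mean_of_y(n):
    """Variance of Ybar/10 per unit of the parameter, Ybar being the mean of n Poisson(10*theta) counts."""
    return Fraction(10, 100 * n)

# The second estimator has the smaller variance as soon as n reaches 2.
for n in range(1, 5):
    print(n, var_mean_of_y(n), var_mean_of_y(n) < var_three_counts())
```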
Question 2
(a)
$H_0: \mu_1 = \mu_2$
X1 = X 2
$H_1: \mu_1 \neq \mu_2$
or similar. Allow
verbal statement
1 if $\mu_1$, $\mu_2$ are adequately defined in words (population mean times)
Test statistic is
$\dfrac{12.6 - 13.9}{\sqrt{\dfrac{(2.4)^2}{80} + \dfrac{(3.5)^2}{90}}} = \dfrac{-1.3}{\sqrt{0.2081}} = \dfrac{-1.3}{0.4562} = -2.84(97)$
M1 A1
Refer to N (0,1)
1 No FT if wrong
1 No FT if wrong
Significant
CI is given by
$-1.3 \pm 1.96 \times 0.4562 = -1.3 \pm 0.894 = (-2.194,\ -0.406)$
M1 B1 M1 A1
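The large-sample test statistic and the 95% interval in Question 2(a) share one standard error. A minimal sketch, assuming the first sample has mean 12.6, standard deviation 2.4 and size 80, and the second has mean 13.9, standard deviation 3.5 and size 90 (the pairing is read off the order in the scheme); `two_sample_z` is a hypothetical name.

```python
import math

def two_sample_z(m1, s1, n1, m2, s2, n2, z_crit=1.96):
    """Large-sample z statistic and two-sided CI for mu1 - mu2, using the sample standard deviations."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = m1 - m2
    return diff / se, (diff - z_crit * se, diff + z_crit * se)

z, (lo, hi) = two_sample_z(12.6, 2.4, 80, 13.9, 3.5, 90)
print(round(z, 4), round(lo, 3), round(hi, 3))
```

Unlike the pooled t procedure, the two variances are not pooled here; each is divided by its own sample size.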
12
(b) MUST be Wilcoxon rank-sum test (or Mann-Whitney form
thereof).
[For bottom-up
rankings
W = 55, MW = 34
Use of Ranks M1
Ranks are: I
II
Upper 5% tail
10
11
13
W=55, MW = 34]
12
A1
Result is significant
Question 3
Differences (after - before):
6 11 22 5 1 4 28 2 7 3 9 8
(a)
10
M1
11
12
A1 FT if wrong
7
(b)
Normality of differences
$\bar{d} = 7.5$, $s_{n-1} = 9.5299$ ($s_{n-1}^2 = 90.8182$)
M1 for use of differences
B1  Accept $s_n = 9.1241$ ($s_n^2 = 83.25$) ONLY if correctly used in sequel
Test statistic ($H_0: \mu_D = 0$ against $H_1: \mu_D > 0$) is
$\dfrac{7.5 - 0}{9.5299/\sqrt{12}} = 2.72(62)$  M1 A1
Refer to $t_{11}$  1  No FT if wrong
Upper 5% point is 1.796  1  No FT if wrong
Significant
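The scheme accepts the divisor-n standard deviation only "if correctly used in sequel". The sketch below shows what that means in practice: the two forms of the standard deviation are related by a factor of sqrt((n-1)/n), and each needs its own denominator to give the same t statistic. Function names are hypothetical and the figures are the summary statistics quoted above.

```python
import math

def sd_divisor_n(s_n_minus_1, n):
    """Convert a standard deviation computed with divisor n-1 to the divisor-n form."""
    return s_n_minus_1 * math.sqrt((n - 1) / n)

def t_statistic(d_bar, s_n_minus_1, n, mu0=0.0):
    """Paired t statistic using the divisor n-1 standard deviation."""
    return (d_bar - mu0) / (s_n_minus_1 / math.sqrt(n))

def t_statistic_from_sn(d_bar, s_n, n, mu0=0.0):
    """The same statistic when only the divisor-n standard deviation is available."""
    return (d_bar - mu0) / (s_n / math.sqrt(n - 1))

s_n = sd_divisor_n(9.5299, 12)
print(round(s_n, 4))
print(round(t_statistic(7.5, 9.5299, 12), 4), round(t_statistic_from_sn(7.5, s_n, 12), 4))
```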
Look at differences
M1
Question 4
(i) H0:
H1:
association
2
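(ii) Observed frequencies $O_i$ (with totals):
100    57    23 | 180
 21    14    13 |  48
 31    21    20 |  72
152    92    56 | 300
Expected frequencies $E_i$:
91.20   55.20   33.60
24.32   14.72    8.96
36.48   22.08   13.44
Contributions to $X^2$:
0.8491   0.0587   3.3440
0.4532   0.0352   1.8216
0.8232   0.0528   3.2019
$X^2 = 10.63(985)$, awrt 10.64
$\chi^2_4$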
A4 - deduct 1 per
error
Must be to this level
of accuracy
M1
A2
[give A1 if
(10.5, 10.8)]
2[or zero; FT if
wrong, unless 300]
1
1
1
ZERO
if H0 H1
12
(iii) The key feature is the behaviour of transmission when intended
destinations are universities. There are many more "more than one
attempt" and many more "not successful at all" transmissions than
would be expected if there were no association, and many fewer
"successful at first attempt" transmissions. There is little or no
suggestion of any other associations.
E6 (divisible)
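The expected frequencies, the test statistic and the cell-by-cell contributions that drive the discussion in part (iii) can all be generated from the observed table. A minimal sketch, assuming the 3 x 3 layout of observed counts given in part (ii); `chi2_contingency_stat` is a hypothetical helper written for this illustration.

```python
def chi2_contingency_stat(table):
    """Expected frequencies, Pearson X^2 and degrees of freedom for an r x c table of counts."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    expected = [[ri * cj / n for cj in col] for ri in row]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    dof = (len(row) - 1) * (len(col) - 1)
    return expected, stat, dof

observed = [[100, 57, 23], [21, 14, 13], [31, 21, 20]]
expected, stat, dof = chi2_contingency_stat(observed)
print(round(stat, 4), dof)
# Largest contributions identify the cells responsible for the association.
for obs_row, exp_row in zip(observed, expected):
    print([round((o - e) ** 2 / e, 4) for o, e in zip(obs_row, exp_row)])
```

Almost all of the statistic comes from the third column, which is the quantitative basis for the comment about one destination type.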
2616
Mark Scheme
June 2005
2616 Statistics 4
Q1
$Y = \sum (X_i - \bar{X})^2$
(i)
$E(Y) = (n - 1)\sigma^2$  B1
(ii)
$T = kY$
$\mathrm{Var}(T) = 2k^2(n - 1)\sigma^4$  B1
(iii)
Bias $= E(T) - \sigma^2$
$= k(n - 1)\sigma^2 - \sigma^2$
M1
A1
M1
A1
If both correct.
A2
M1
A1
Correct derivative.
A1
Isolate k.
A1
M1
Or other methods.
A1
B2
M1
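$\mathrm{MSE}(T) = 2k^2(n - 1)\sigma^4 + \{k(n - 1)\sigma^2 - \sigma^2\}^2$
$= \sigma^4[2k^2(n - 1) + \{k^2(n - 1)^2 - 2k(n - 1) + 1\}]$
$= \sigma^4[2(n - 1) + (n - 1)^2]k^2 - 2\sigma^4(n - 1)k + \sigma^4$
(iv)
Consider $\dfrac{d\,\mathrm{MSE}(T)}{dk} = 0$
$\dfrac{d\,\mathrm{MSE}(T)}{dk} = \sigma^4[2(n - 1) + (n - 1)^2]\,2k - 2\sigma^4(n - 1)$
$= 0 \Rightarrow k = \dfrac{n - 1}{2(n - 1) + (n - 1)^2} = \dfrac{1}{n + 1}$
Check minimum by considering
$\dfrac{d^2\,\mathrm{MSE}(T)}{dk^2} = \sigma^4[2(n - 1) + (n - 1)^2]\,2 > 0 \;\therefore$ min
(v)
With $k = \dfrac{1}{n + 1}$,
$\mathrm{MSE}(T) = \sigma^4\left\{\dfrac{2(n - 1) + (n - 1)^2}{(n + 1)^2} - \dfrac{2(n - 1)}{n + 1} + 1\right\}$
$= \sigma^4\,\dfrac{2n - 2 + n^2 - 2n + 1 - 2n^2 + 2 + n^2 + 2n + 1}{(n + 1)^2}$
$= \dfrac{\sigma^4}{(n + 1)^2}\{2n + 2\} = \dfrac{2\sigma^4}{n + 1}$
(vi)
$\dfrac{2\sigma^4}{n + 1} < \dfrac{2\sigma^4}{n - 1}$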
A1
M1
A1
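The algebra in parts (iii) to (vi) can be verified exactly for particular sample sizes. The sketch below works with MSE divided by σ⁴, assuming T = kY with E(Y) = (n-1)σ² and Var(Y) = 2(n-1)σ⁴ as in the scheme; `mse_over_sigma4` is a hypothetical name.

```python
from fractions import Fraction

def mse_over_sigma4(k, n):
    """MSE(kY)/sigma^4 = variance term 2k^2(n-1) plus squared bias (k(n-1) - 1)^2."""
    return 2 * k**2 * (n - 1) + (k * (n - 1) - 1) ** 2

n = 10
k_opt = Fraction(1, n + 1)
# Minimum-MSE multiplier k = 1/(n+1) against the unbiased choice k = 1/(n-1).
print(mse_over_sigma4(k_opt, n), Fraction(2, n + 1))
print(mse_over_sigma4(Fraction(1, n - 1), n), Fraction(2, n - 1))
```

Exact rational arithmetic avoids any doubt about rounding when comparing the two multipliers.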
Q2
(i)
(ii)
H0 : =
H1 :
B1
B1
B1
B1
B1
Pooled $s^2 = \dfrac{698 + 763.5}{15} = 97.43$
M1  For any reasonable attempt at pooling (and ft into test and CI).
A1  If correct.
($s_n^2 = 95.4375$, $s_n = 9.7692$)
Test statistic is
$\dfrac{114.6667 - 123.75}{\sqrt{97.43\left(\tfrac{1}{9} + \tfrac{1}{8}\right)}} = \dfrac{-9.0833}{\sqrt{23.0051}} = \dfrac{-9.0833}{4.7964} = -1.89(38)$
M1  Overall structure. Allow c's pooled $s$.
M1  $\sqrt{\tfrac{1}{9} + \tfrac{1}{8}}$
A1  ft c's pooled $s^2$.
Refer to $t_{15}$.  M1
Double tail 5% point is 2.131.  A1
Not significant.  E1
Seems mean strengths are the same for both processes.  E1
(iii)
CI is given by
$-9.0833 \pm 2.947 \times 4.7964 = -9.0833 \pm 14.1349 = (-23.21(8),\ 5.05(2))$
M1  Must be c's $(\bar{x} - \bar{y}) \pm \ldots$
B1  From $t_{15}$.
M1  Allow c's pooled $s$.
A1  c.a.o. Must be written as an interval.
(iv)
Wilcoxon rank sum test  B1 B1
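The pooled test and the 99% interval in Q2 can be reproduced from summary figures alone. This sketch assumes that 698 and 763.5 are the corrected sums of squares of the two samples (sizes 9 and 8) and takes the critical value 2.947 from tables; `pooled_summary` is a hypothetical name.

```python
import math

def pooled_summary(mean_x, sxx, nx, mean_y, syy, ny, t_crit):
    """Pooled variance, t statistic and CI for mu_x - mu_y from means and corrected sums of squares."""
    sp2 = (sxx + syy) / (nx + ny - 2)
    se = math.sqrt(sp2 * (1 / nx + 1 / ny))
    diff = mean_x - mean_y
    return sp2, diff / se, (diff - t_crit * se, diff + t_crit * se)

sp2, t, (lo, hi) = pooled_summary(114.6667, 698, 9, 123.75, 763.5, 8, 2.947)
print(round(sp2, 2), round(t, 4), round(lo, 3), round(hi, 3))
```

The standard error is built once and reused, which guards against the slip of substituting the pooled standard deviation for the standard error in the interval.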
10
Q3
(a)
$H_0: \mu_D = 0$ or $\mu_E = \mu_S$
$H_1: \mu_D \neq 0$ or $\mu_E \neq \mu_S$
B1
B1
B1
M1
B1
Test statistic is
$\dfrac{0.46 - 0}{1.0668(75)/\sqrt{10}} = 1.36(35)$  M1 A1
c.a.o. (but ft from here if this is wrong.)
Use of $\mu_D - \bar{d}$ scores M1A0, but next 4 marks still available.
(Or critical value for comparison with $\bar{d}$.
Or $\bar{d} \pm (\text{c's } 2.262) \times \dfrac{1.0668(75)}{\sqrt{10}}$ $(= -0.303,\ 1.2231)$ for comparison with 0.)
Refer to $t_9$.  M1  No ft from here if wrong.
Double tail 5% point is 2.262.  A1  No ft from here if wrong.
Not significant.  E1  ft only c's test statistic.
Seems mean yield using experimental fertilizer is same as for standard.  E1  ft only c's test statistic. Expect reference to mean(s) and context.
(b)
B1
$20.43 - 1.833 \times \dfrac{4.0803}{\sqrt{10}} = 20.43 - 2.36(51) = 18.06(49)$
In repeated sampling, lower confidence bounds
obtained in this way would fall below the true mean
on 95% of occasions.
B1
M1
M1
B1
Mean. Allow c's $\bar{x}$.
Minus.
From t9.
M1
A1
E2
11
9
20
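The repeated-sampling interpretation of the lower confidence bound in Q3(b) can be illustrated by simulation. The sketch below is not part of the scheme: it draws Normal samples of size 10 with an arbitrary, assumed mean and standard deviation, and records how often the bound lies below the true mean. The names `lower_bound` and `coverage`, the seed and the parameter values are all illustrative.

```python
import math
import random
import statistics

def lower_bound(sample, t_crit):
    """One-sided lower confidence bound: mean - t_crit * s / sqrt(n), with s using divisor n-1."""
    n = len(sample)
    return statistics.mean(sample) - t_crit * statistics.stdev(sample) / math.sqrt(n)

def coverage(true_mean, sd, n, t_crit, reps, seed=0):
    """Proportion of simulated Normal samples whose lower bound falls below true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        if lower_bound(sample, t_crit) < true_mean:
            hits += 1
    return hits / reps

# 1.833 is the upper 5% point of t with 9 degrees of freedom, taken from tables.
print(coverage(true_mean=20.0, sd=4.0, n=10, t_crit=1.833, reps=5000))
```

The proportion printed should be close to 0.95; the statement is about the procedure over many samples, not about the probability that the true mean exceeds one particular computed bound.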
Q4
(a)
Data              29   32   34   38   40   46   51   52   59   63   71   95
Difference       -31  -28  -26  -22  -20  -14   -9   -8   -1    3   11   35
(data - median 60)
Rank of |diff|    11   10    9    8    7    6    4    3    1    2    5   12
M1 M1 A1
(b)
(i)
T = 2 + 5 + 12 = 19
B1
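The ranking step and the statistic T = 19 can be reproduced directly from the twelve data values. This is a sketch that assumes the hypothesised median is 60 and that there are no zero or tied differences, as is the case here; `signed_rank_sums` is a hypothetical helper.

```python
def signed_rank_sums(data, median0):
    """Return (W_minus, W_plus): rank sums of |x - median0| for negative and positive differences.

    Assumes no zero differences and no ties in absolute value.
    """
    diffs = [x - median0 for x in data]
    ordered = sorted(diffs, key=abs)
    ranks = {d: i + 1 for i, d in enumerate(ordered)}
    w_plus = sum(ranks[d] for d in diffs if d > 0)
    w_minus = sum(ranks[d] for d in diffs if d < 0)
    return w_minus, w_plus

times = [29, 32, 34, 38, 40, 46, 51, 52, 59, 63, 71, 95]
print(signed_rank_sums(times, 60))
```

The two rank sums always add to n(n+1)/2 (here 78), which gives a quick check on the ranking.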
M1
M1
A1
E1
E1
B1
B1
(iii)
M1
A1
Refer to $\chi^2_3$.
Extremely highly significant: overwhelming
evidence that Normal model does not fit data.
M1
A1
E2
E2
E1
20
2616 - Statistics 4
General Comments
There were 93 candidates from 20 centres (June 2004: 82 from 20). The overall
standard of the scripts seen was pleasing: many candidates were clearly well
prepared for this paper. Routine calculations were carried out well but the
candidates' ability to comment and interpret was a little disappointing at this level.
Question 1 was by far the least popular question with only about 15 candidates
attempting it. Every candidate attempted Question 2; Questions 3 and 4 were
equally popular.
1) Estimation theory
Although this was the least popular question it seemed to have the highest mean
mark, with most of those attempting it scoring full or almost full marks. Those
who were prepared to try it were likely to be successful as long as their algebra
was up to the task. Sometimes the algebra arrived at the correct destination by
brute force rather than elegance.
There were just two places where marks seemed likely to be lost: part (iv) where
some neglected to verify that the required value of k did indeed give a minimum
and part (vi) where there was a temptation for some to use the converse argument.
2) Two sample t test and confidence interval; the strengths of steel rods
This was the most popular question being attempted by all candidates. It was also
a very high scoring question: about half of the entry scored full or almost full
marks.
(i)
The hypotheses were usually stated correctly but there was rather less
care in providing verbal definitions of the population means. Similarly, the
required assumptions were sometimes less than ideal.
(ii)
Most candidates carried out the test competently. There was rarely any
problem over finding and using the pooled variance. The critical value was
almost always correct but on a number of occasions the conclusion was
badly expressed.
(iii)
As in part (ii) most candidates had little difficulty here. Just occasionally
the standard error (which had been correctly constructed in part (ii))
became pooled s
(iv)
1
17
3)
(a)
The hypotheses were usually stated correctly but candidates were not as
careful about defining the symbol $\mu$. Nor were they sufficiently careful
when it came to the distributional assumption.
However there were only a very few candidates who did not realise that
they should carry out a paired test. The vast majority made good progress
with the test itself, and only the final conclusion left room for
improvement.
(b)
As above, most realised what to do here and the correct value for the
lower bound was usually found. A small minority tried to construct the
confidence interval using the information from the paired test. There was
some uncertainty again with the distributional assumption.
The main area of difficulty was with the interpretation of the interval.
Very many comments revealed a flawed understanding of a confidence
interval to quite a worrying extent.
4) Wilcoxon rank sum test for the median; Chi-squared test for goodness of fit;
waiting times in an airport
(a)
This part of the question was almost always answered well. Many fully
correct solutions were seen.
(iii)
In this part of the question very few candidates realised that they could
refer back to the previous part for evidence that the assumption of
background Normality was not viable. They knew that Normality was
required, but often chose to look at the sample data in part (a), sometimes
with the aid of a dot plot. Hardly any candidates included in their
discussion the small sample size which might prompt the use of a t test.
No more than a handful of candidates picked up on the fact that a t test
examines the population mean whereas the Wilcoxon test in part (a)
examined the median.