A Self-study Workbook
Written by Dr Kate Exley
Contents
List of Figures and List of Tables
10. Ways of producing accurate and clear marking guidance for questions
Appendices
Appendix 1
Appendix 2
Appendix 3
Writing Good Exam Questions
List of Figures
Figure 1 Diagram to illustrate the principles of Constructive Alignment in module design
List of Tables
Table 1 A table of suggested verbs mapped against the Anderson and Krathwohl adapted levels of Bloom’s Taxonomy of Cognition
This workbook is primarily intended for those who are new to writing examination
questions, although more experienced colleagues may find it useful as a reference or
updating source. The workbook can be used in a number of ways. For those unable to
attend the ‘Writing Better Exam Questions’ staff development workshop, it can act as a
distance learning resource that can be worked through systematically, or it can be
quickly consulted to check and review current practice.
For those using the workshop in the distance learning mode the anticipated learning
outcomes are –
A range of different question formats are reviewed and critiqued and a number of
ways of quality assuring draft questions are suggested and explained. However,
writing the exam question is only half the story – producing the associated marking
guidance is also considered here. Marking guidance can take a number of different
forms ranging from specimen or model answers through to descriptive criteria and
detailed marking schemes matched to necessary answer content. These too will be
discussed with reference to examples from The School.
There are two three-hour written examination papers taken in June. Together these
two papers contribute 30% to the final assessment (15% each).
Paper 2 tests candidates’ ability to integrate the knowledge and skills acquired during
the whole of the MSc course. Paper 2 was originally developed in the mid-1990s
after full implementation of the present teaching module structure. As a whole, Paper
2 should be examining the key knowledge/skills which a candidate graduating with
an MSc in X should have. In devising Paper 2, MSc Exam Boards should reflect on
the intended learning outcomes for the MSc, some of which are likely to require
assessment in this exam (others might have been assessed in compulsory study
modules, the project, etc.). MSc intended learning outcomes can be found in the MSc
Course Handbook, prospectus, etc. Questions should require integration of
knowledge/skills acquired in different parts of the MSc course – they might use
material from compulsory modules but not optional ones that only some of the class
might take.
Most Distance Learning (DL) modules have a 2-hour exam covering the content of
that module and this contributes 70-100% of the module’s mark. MSc EPI and CT
also have a 3-hour integrating paper (E400), akin to Paper 2 above, which
candidates sit in their final year of the course. Exam questions are usually
co-ordinated by the Module organiser or other designated members of staff.
All module exams taken will count towards the degree, save only where a student
has been assessed on more modules than are required – in which instance the
Exam Board will determine whether an award may be given, and which modules are
counted towards it.
The School’s Assessment Code of Practice describes six assessment objectives that
should be kept in mind when writing examination questions and designing
assessments.
These are –
The goal – Test items should be really difficult for people who don't understand the
subject material, but they should be straightforward for those who do. If an item is
difficult because of complicated wording (e.g., double negatives) or vocabulary, you
will end up testing language skills rather than ability in the discipline.
i. Clarity
ii. Reliability
iii. Validity
iv. Authenticity
v. Fairness
i) Clarity
e.g.
“What will be the outcome of adding further sodium chloride at this point?
Explain your answer.”
In an interview, a dyslexic student spoke of this second line ‘being hidden away’ and
said he had developed ways of re-reading questions to try to avoid this happening to
him. However, the layout of a question may add to this problem: for example, the indent
on the second line here may make it ‘more hidden’.
EXERCISE
Testing for clarity – contrast the following versions of the same exam question
(essay format answer required):
Version A:
Public health policy in the United Kingdom underwent a number of significant
changes during the Twentieth Century that can be directly attributed to the needs
and exigencies brought about by international conflict. Some of the changes and
developments that resulted to health systems and service delivery are still with us
today and it is important that we understand the background of circumstances that
influenced the decisions that were made. Provide a short analysis charting what you
consider to be the main transitions in public health policy brought about by the
unique needs and challenges, both direct and indirect, of an environment of
international conflict, within the UK health systems specifically, using the Second
World War as an example.
Version B:
Compare the advances in UK public health policy pre- and post-Second World War.
ii) Reliability
Does the question allow markers to grade it consistently and reproducibly and does it
allow markers to discriminate between different levels of performance? This
frequently depends on the quality of the marking guidance and clarity of the
assessment criteria. It may also be improved through providing markers with training
and opportunities to learn from more experienced assessors.
iii) Validity
iv) Authenticity
Authenticity is the need to match the style and approach of question setting to the
reality of practice. This is particularly important when considering the assessment of
Masters level qualifications frequently taken by mature students who are
accustomed to working within a professional context. A general example might be,
rather than set an essay style question, ask students to present their understanding
in the style of a professional, or industrial, or clinical report.
This may be very important when considering the testing of ‘procedural knowledge’
or ‘functioning knowledge’ (please see 5.1). When the exam seeks to test a
candidate’s knowledge of how something works, the order or sequencing of events,
the interplay between contributing factors, etc., it can be very important to ensure
this is built into the question formatting and context setting, to allow authenticity.
Example
v) Fairness
You need to give students a fair chance to demonstrate what they know and can do
and to be able to succeed in examinations. Fairness can be facilitated by being very
clear about expectations in student performance, providing examples of past
examination papers, giving opportunities for students to practise and gain ‘exam
technique’ (through ‘mocks’, for example), plus transparency in the processes and
criteria that will be used to mark and grade their work.
Students should know what is expected of them in order to obtain a particular grade
and their marks should be a reflection of their abilities and not a reflection of
extraneous and irrelevant factors such as gender, disability etc. Providing a ‘level
playing field’ is the aim and this is particularly important at The School when
considering the different groups of students who come to study or embark upon DL
courses, e.g. non-native English speakers, students who have previously
experienced very different educational cultures, mature professionals etc.
Figure 1. Diagram to illustrate the principles of Constructive Alignment in module design.
[Figure not reproduced; surviving label: Module Evaluation]
Example
At the end of the module students should be able to select an appropriate method
and use it to test the significance of collected data.
The learning outcome clarifies what opportunities need to be built into a test question
and helps to ensure that the test is valid. For the learning outcome given above, students
should be expected to select a method, have the scope to apply the method to some
data and, finally, be able to comment on the significance or otherwise of the data.
To clarify further, it would be beneficial to demarcate these three different tasks
within the question itself, perhaps as separate question sections a, b and c, each with
a clear allocation of the total marks for the question.
Examination questions should also aim to indicate to the students how much is
required of them to achieve a good mark – the scale and scope of their expected
answers. One common way of doing this is to give a time limit (10 questions in 20
minutes) or you could limit the amount of space in the answer booklet or on-line
pro-forma provided. Alternatively, you can set a maximum word limit for responses.
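One way to sanity-check mark and time guidance is to split the available time in proportion to the marks attached to each part of a question. The short sketch below illustrates that arithmetic only. The function name `suggest_minutes`, the 40/40/20 mark split and the 45-minute total are assumptions chosen for the example, not School policy.

```python
def suggest_minutes(marks_by_part, total_minutes):
    """Split total_minutes across question parts in proportion to their marks."""
    total_marks = sum(marks_by_part.values())
    if total_marks <= 0:
        raise ValueError("total marks must be positive")
    return {
        part: total_minutes * marks / total_marks
        for part, marks in marks_by_part.items()
    }


# Hypothetical three-part question (a, b, c) worth 40%, 40% and 20%,
# to be answered in an assumed 45 minutes.
allocation = suggest_minutes({"a": 40, "b": 40, "c": 20}, total_minutes=45)
for part, minutes in allocation.items():
    print(f"Part ({part}): about {minutes:.0f} minutes")
```

If the suggested time for a part looks too short for the answer you expect, that is a prompt to revisit either the mark allocation or the scope of that part.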
To what extent
e.g. “Using your knowledge of both prokaryotes and eukaryotes…”
With reference to
e.g. “With reference to the published research from ..”
EXERCISE
Underline the verb and key elements of the question that give an indication of
the extent (limits and boundaries) of the question.
1. Describe the three main methods of economic evaluation (40%). What are the
main strengths and weaknesses of each method? (40%). Support your answer with
examples of disease evaluation (20%)
3. Write short notes on THREE of the following. In each case explain the
importance of the infectious agent and the mode of transmission in its spread and
control.
a) rotavirus diarrhoea
b) measles
c) guinea worm
d) dengue
e) tuberculosis
Please see Appendix 1 for some feedback comments on this exercise. You may
also wish to refer directly to the learning outcomes of your modules and the Master’s
level descriptors in the Qualification Framework document.
It is possible to test a wide variety of different kinds of knowledge, skills and attitudes
through the careful writing of examination questions.
Again taking each of these elements in turn let us first consider the different kinds of
Knowledge and ways of knowing that you may wish to test in your students.
Factual Knowledge
Terminology, facts, figures
Conceptual Knowledge
Classification, Principles, Theories, Structures, Frameworks
Procedural Knowledge
Algorithms, Techniques and Methods and Knowing when and how to use
them.
Metacognitive Knowledge
Strategy, Overview, Self Knowledge, Knowing how you know.
EXERCISE
Please consider the following four examination questions and decide what
kind of knowledge you feel they would test.
1. What are the key steps and processes in bringing a new anti-cancer drug to
market and introducing it for clinical use?
3. Using the tabulated data provided calculate the incidence risk of prostate
cancer per 1000 men, per 5 years, at each of the given levels of alcohol
consumption.
4. Why do malaria parasites persist in the human population? Explain the choice
of drugs which could be used to prevent persistence of Plasmodium
falciparum and Plasmodium vivax.
e.g. Do we want to test a candidate’s ability to ‘list important features’, ‘analyse the
given findings’ or ‘critique the argument they give’?
Anderson et al’s (2001) re-working of Bloom’s taxonomy makes this easier, as they
chose to present the hierarchy of sub-categories as active verbs, and it is their
version particularly that has been widely used in course design and question design
in more recent years. It is, however, important to remember that,
“Although Bloom's lends itself to wide application, each discipline must define
the original classifications within the context of their field”
Crowe et al (2008)
Figure 2.
Bloom's Taxonomy of Cognition – Revisited by Anderson & Krathwohl (2001)
Create
Evaluate
Analyse
Apply
Understand
Remember
Note – Some colleagues in the School may already be familiar with the original Bloom
taxonomy that uses the terms Knowledge, Comprehension, Application, Analysis, Synthesis
and Evaluation
Table 1
Table 2.
Ways in which intellectual skills can be tested through different question
stems.
(Adapted from Figure 7.11 of McMillan (2001) and Piontek, M.E. (2008))
Note – you may like to compare these question stems with Bloom’s taxonomy,
given earlier and draw comparisons and to cross refer to the learning
outcomes specified for your own Modules.
EXERCISE
Take a few moments to look down this list of question stems and select two
that you feel could be used to test students on your module/course.
Short answer and essay styled questions do give an assessor the opportunity to
judge a range of generic or transferable skills in the way students answer the
questions or respond to the tasks set. The most obvious of these are to do with
ability to write clearly and appropriately, to structure and organise answers so that
most important points are prioritised and well made and the ability to cite and use
source material effectively.
If these skills are to be included and given value in the assessment this should be
clearly stated in the assessment criteria used to make judgements and this fact
should be made clear to students. At The School this is an important issue as many
of the Masters students are non-native English speakers. The proportion of the
marks for a test question allocated to skills such as ‘written English’ should be
related to the Aims and Learning Outcomes for the course and context. In some
cases accuracy and style may be considered important, e.g. to highlight professional
skills and competencies, and be included in the assessment criteria, whilst in others
such characteristics are not what is being taught and considered.
Ability to work under pressure or to demonstrate ‘stress tolerance’ etc. is unlikely to
be a valid learning outcome for a Masters Programme at The School and therefore all
attempts should be made to reduce the impact that stress and nerves may have on a
student’s performance in an examination.
It is possible to set and run examinations in ways that limit the importance of
stress-induced factors (such as memory lapses) on success. Written examinations can be
organised as open book exams* or question topics can be pre-seen by candidates.
Such strategies reduce the need to ‘question spot’ and the impact of ‘luck’ in revising
the right or wrong sub-selection of topics tested. They allow students to think more
deeply about, and possibly research, their views before attempting the questions (as
with course work assessments) but have the added advantage of avoiding some
of the concerns about plagiarism, in that candidates produce their individual answers
under exam conditions. Those familiar with running these types of examination
comment that the quality of student answers is frequently judged to be of a much higher
standard (again, as is the case with course work answers).
* Open book examinations can allow students to take their own notes or choice of
texts or previously specified items into the examination.
Check that the question does not assume a lot of background knowledge
which may be culturally specific or introduce unnecessary bias;
Provide any important (untested) background detail within the body of the
question;
Give mark or timing guides within the framing of the question that indicate the
relative importance or attached weightings for each sub-section;
Set multiple-part problem questions so that the parts are independent from
each other. This means that if a student gets the first part wrong they don’t
automatically lose marks on subsequent sections, and it makes grading much
quicker and more straightforward.
E.g. in the second part of a question, write something like “In the next part of
the calculation, assume that the answer to Part (a) was 25, regardless of what
you actually got in Part (a). Note that 25 is NOT necessarily the correct
answer to Part (a).”
EXERCISE
Can you think of any additional aspects in the exam questions you will be writing that
should be considered to reduce the impact of stress factors?
There are a number of ways in which examination questions can be written and
structured that in turn require very different responses from students. Examination
papers may consist of a variety of these formats. For example a paper may consist
of an initial section of 10 compulsory, short answer questions followed by a second
section in which the student is asked to attempt three from six longer questions
which may be essay or case study or problem solving styled questions.
Here are some examples of different ways questions are written at the School with a
commentary highlighting important features (such as the need to avoid ambiguity,
bias, inequality and yet be able to discriminate between different levels of attainment
and achievement).
There are few examples of such question types being used extensively in summative
assessments at the School and they are included here for completeness, and
as an acknowledgement that some teachers may well be using these question formats
as part of their class or on-line teaching, as self-assessment or formative
assessment opportunities for their students.
True-False
Used to test breadth of knowledge, but the problem of
‘guessing’ is a major worry.
Matching Pairs
Used to assess knowledge of complex and inter-connecting relationships.
Multiple-Choice Completions
This MCQ format allows for more than one correct answer. Such
questions are more difficult since the student is not just looking for one
correct response among four incorrect responses. However, the intent
of this format is not to test four separate points but rather to set up an
interpretive exercise.
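The worry about ‘guessing’ in True-False items can be made concrete with a little binomial arithmetic. The sketch below is illustrative only: the 10-item test, the pass mark of 6 and the comparison with a four-option, single-best-answer format are assumptions made for the example, and `prob_at_least` is a hypothetical helper name.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of at least k correct out of n items, each guessed with success chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical test: 10 items, pass mark of 6 correct, answered purely by guessing.
p_true_false = prob_at_least(6, 10, 0.5)    # two options per item
p_four_option = prob_at_least(6, 10, 0.25)  # four options, one correct (assumed format)

print(f"True-False, guessing only: {p_true_false:.3f}")
print(f"Four-option items, guessing only: {p_four_option:.3f}")
```

Under these assumptions a candidate who guesses every True-False item still has better than a one-in-three chance of reaching the pass mark, which is why such items are usually reserved for formative use or combined with other formats.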
Example -
The investigators want to perform a sample size calculation with 80% power
and 5% 2-sided significance. They estimate that HIV-free survival at 7
months will be 60% in the control arm.
(i) Calculate the sample size required to detect a 10% increase in HIV-free
survival at 7 months in the intervention versus the control arm. (Hint:
remember to identify your equation, define all your variables, show all your
calculations and conclude appropriately) (10 marks)
(ii) Assume that 5% of mother–infant pairs are lost to follow up prior to the infant
reaching 7 months and adjust your sample size calculation accordingly. (4
marks)
EXERCISE
How could you improve parts (i) and (ii) of the example question above?
Please see the concerns that were raised by the Module team over the page
Here are the views of the Module leader, who raised two questions relating to the
clarity of the draft question.
Part (i) - It isn’t clear whether the question is asking students to calculate an
absolute or a relative increase. This makes a big difference to the calculation
(see below). This is an example of how the omission of one word can make a
significant difference to how the student answers!
Not accounting for loss to follow up, a sample size of 1013 women per study arm
(2026 total) will give us 80% power and 5% significance to detect a 10% increase in
HIV free survival in the intervention from 60% in the control arm.
Part (ii) - Will the formula be included in the question or the provided formulae
sheet? This is a straightforward calculation for which there is a formula. Do
you expect the students to memorise the formula or will they expect it to be
provided?
Being clear about what actually should be tested is the important factor here.
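The effect of the missing word in Part (i) can be checked numerically. The sketch below uses the standard normal-approximation formula for comparing two proportions; whether this is the formula on the module’s formulae sheet is an assumption, as is the common practice in Part (ii) of dividing by the proportion retained. The function name `sample_size_per_arm` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(p_control, p_intervention, power=0.80, alpha=0.05):
    """Per-arm sample size for comparing two proportions (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 5% two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    variance = p_control * (1 - p_control) + p_intervention * (1 - p_intervention)
    n = (z_alpha + z_beta) ** 2 * variance / (p_intervention - p_control) ** 2
    return ceil(n)


p_control = 0.60
n_absolute = sample_size_per_arm(p_control, p_control + 0.10)  # absolute increase: 60% to 70%
n_relative = sample_size_per_arm(p_control, p_control * 1.10)  # relative increase: 60% to 66%
print(f"Absolute 10% increase: {n_absolute} per arm")
print(f"Relative 10% increase: {n_relative} per arm")

# Part (ii): inflate for 5% loss to follow-up (assumed adjustment: divide by proportion retained).
loss_to_follow_up = 0.05
n_adjusted = ceil(n_relative / (1 - loss_to_follow_up))
print(f"Relative increase, adjusted for loss to follow-up: {n_adjusted} per arm")
```

Under these assumptions the relative reading reproduces the 1013 per arm quoted above, while the absolute reading needs only about a third as many participants, so two well-prepared students could reasonably arrive at very different answers.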
Longer format to allow students to respond to open ended questions at length. Used
to test higher skills, writing and structuring skills, further reading and a deeper level
of understanding. Assessors are frequently interested in a student’s ability to
organise and integrate a range of ideas and information and build an argument or
make a case (the intellectual skills of synthesis and evaluation, going back to
Bloom’s taxonomy).
e.g. Outline the morphology, genome organisation and replication of the human
immunodeficiency virus (HIV).
To what extent do you agree with this statement and what are its implications? Make
reference to specific infections to support your conclusions.
Table 3.
Some Common Essay Style Questions used in Exams
Question Stem
EXERCISE
Look back over recent examination papers set for your course or teaching
module and add two more commonly used Question Stems to this list.
1.
2.
Here the students are provided with some data (this could be in written, tabulated,
graphical form etc) and then asked a series of questions about it. The provided
information may be some research findings or monitoring data. The questions
usually begin with a couple of straightforward interpretative questions (e.g. Using the
table of infection rates provided, which of the described drug therapies reduces the
risk of infection the most?). They then move on to more complex questions of
application and analysis that require the students to carry out standard manipulations
or calculations of the data provided. The final questions are likely to be more
evaluative and open-ended, requiring the students to predict likely impacts or
suggest improvements etc.
An Example
On a hot summer day, children in three schools had a school outing to a playground
where some of the children played in the recreational fountain. Two days later nearly
half the children had symptoms of vomiting, diarrhoea, abdominal pain and
headache. A retrospective cohort study was carried out to try to identify the source
of the outbreak with the following results.
(a) Define what is meant by the risk and relative risk of becoming ill associated
with each factor (10%).
(b) Calculate BOTH the risk and relative risk associated with each factor (30%).
(c) Suggest possible interpretations of the results, and the implications for
control recommendations (10%).
The investigators wanted to identify the infectious agent involved. One possibility
they considered was norovirus which is known to cause acute gastroenteritis.
Although the reverse transcription-PCR (RT-PCR) method is considered to be the “gold
standard” for diagnosis of this viral infection, it requires skilful personnel and a
well-equipped laboratory. A simpler diagnostic kit has been developed. The following
table shows how the simpler diagnostic kit compares to the gold standard.
(d) Would you advise the investigators to use the simpler diagnostic test in their
epidemiological study? Would your recommendations change if the simpler
diagnostic test was to be used in clinical practice? Justify your answer. (50%)
[Note on norovirus: this highly infectious RNA virus causes a self-limited, mild to
moderate disease that often occurs in outbreaks with clinical symptoms of nausea,
vomiting, diarrhoea, abdominal pain, headache, low grade fever or a combination of
these symptoms. No treatment is indicated apart from rehydration in severe cases.]
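When drafting marking guidance for a data-handling question like this one, it can help to script the expected calculations so that the model answer is easy to check. The sketch below shows the arithmetic behind parts (b) and (d). All counts are hypothetical, since the data tables from the original question are not reproduced here, and the function names are illustrative.

```python
def risk(cases, total):
    """Risk (attack rate): proportion of a group who became ill."""
    if total <= 0:
        raise ValueError("group size must be positive")
    return cases / total


def relative_risk(ill_exposed, n_exposed, ill_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return risk(ill_exposed, n_exposed) / risk(ill_unexposed, n_unexposed)


def sensitivity(true_pos, false_neg):
    """Proportion of gold-standard positives that the simpler test also calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of gold-standard negatives that the simpler test also calls negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical cohort counts for one factor (e.g. playing in the fountain).
rr_fountain = relative_risk(ill_exposed=60, n_exposed=80, ill_unexposed=25, n_unexposed=100)

# Hypothetical comparison of the simpler kit against RT-PCR.
sens = sensitivity(true_pos=90, false_neg=10)
spec = specificity(true_neg=180, false_pos=20)

print(f"Relative risk: {rr_fountain:.2f}")
print(f"Sensitivity: {sens:.2f}, specificity: {spec:.2f}")
```

Running the real question data through a script like this also gives a quick check that the numbers lead to the interpretation the examiners intend.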
EXERCISE
2.
3.
In case study styled questions a context or situation is described in detail (e.g. this
may be a patient history or government strategy position etc). Such questions are
often seen as being very authentic and ask students to apply their knowledge to a
particular, and novel, set of circumstances. They frequently take considerable work
and effort to write well and usually involve a team of people who craft an idea into a
realistic and challenging situation.
Note - Some examples of this type of question are presented as examples in section
11.
Giving Choice
Whilst the structure of exam papers is set by the Board of Examiners and not by
individual question setters, it is nevertheless interesting to consider the impact of
providing question choice within an exam.
Many people view the giving of choice as a way to increase fairness and reduce the
effect of ‘luck in question spotting’. It allows students to address questions for which
they feel most prepared and in which they have been most interested, so showing the
‘best’ the student can produce. However, providing choice inherently reduces the validity and
reliability of the test instrument because each student is in fact taking a different test
and has been encouraged to sample from their learning in different ways. It is nearly
impossible to create parallel exam questions that test achievement of the learning
outcomes to the same extent, and it is equally difficult to grade two different essays
absolutely comparably – both factors making consistency very difficult (Piontek,
2008).
EXERCISE
It is very difficult to write a question and then immediately see the ambiguities or
errors that it contains. Separating the ‘creating’ from the ‘evaluating’ roles in time can
help. Write a question and then come back to it the following day and re-read it with
fresh eyes. When you have a draft question, next write a model/specimen answer
and/or some marking guidance. As you do this, come to a decision about the
appropriate breakdown of marks and try to estimate how long it will take to tackle
the question, part by part. In coming up with the marking scheme for your question
you might find it helpful to have the learning outcomes for the module or course in
sight, so that you can check that you are valuing the right things and giving
credit to Master’s level criteria.
Below is a checklist of questions to use once you have a draft question (doing some
of this in a group with questions on overheads can work well):
4. How well does the question relate to intended learning outcomes (of the
teaching module or MSc)?
6. What are the key words describing the task? Are they clear? (eg: list, define,
‘suggest reasons behind the effect’ are better than interpret, discuss,
evaluate)
8. Check punctuation and grammar as this can markedly change the meaning of
sentences (eg “panda eats, leaves and shoots”).
12. Can the question be completed in the time available (including reading,
thinking and reviewing time), including those for whom English is not their first
language?
13. Does the question lead to answers which will distinguish between weak and
strong candidates, eg are there elements for candidates to demonstrate
distinction-level skills/knowledge?
Question Validation
The Masters programme that you contribute to is likely to have its own process for
validating questions and compiling the examination paper. It is important
that you ascertain this from the module leader and adhere to it.
In general terms, however, once you have the question, model/specimen answer
and marking scheme written, ask someone else to answer it (do not give them the
model/specimen answer), timing each part of the question. This allows you to check
that your calculated ‘time it takes to complete’ estimates were about right. Modify
the question, timings and marking scheme based on any misunderstanding
made clear by their answer.
At this stage you will be ready to submit your question to the module leader and they
too will scrutinise your question and may get back to you with further suggested
improvements (please see the extended case study in the Appendix for further detail
about the way The School conducts examination question approval processes.)
Please read the following draft question and suggest improvements. When you
have had a go, turn the page and you will see the changes that the examiners’ team
finally made to the question.
Question X Draft –
Describe the structure of the cell plasma membrane and its principal components.
How and where are plasma membranes usually made in the eukaryotic cell.
How are molecules transferred across the membrane into and out of the cell :
Water
Ethanol
Sodium and Potassium ions
Sugars.
Over the page you will find the edited version of Question X that was eventually
accepted and used in the examination.
When you submit a question to the Teaching Unit leader it is likely that they will
arrange for it to be scrutinised by members of the teaching team and they will
make suggestions for improvement. Question X, reviewed on the previous page,
ended up looking like this -
Question X accepted –
Describe the structure and synthesis of the cell plasma membrane of eukaryotic
cells and its principal components. Explain how molecules are transferred across
the membrane giving 2 examples.
The questions are usually considered together with the associated marking
guidance notes – and for this question these were the guidance notes that were
accepted –
The major components are lipids of various kinds: these may consist of
phospholipids (eg phosphatidyl choline, serine, ethanolamine etc.), triacylglycerols
(containing glycerol esterified with saturated or unsaturated fatty acids),
glycolipids (eg diacylglycerols with a sugar chain on the third glyceryl OH), sterols
or steroids (eg cholesterol) amongst others. May also contain others. Also
proteins which may be transmembrane with hydrophobic trans-membrane section
or anchored by lipid. Also protein transporters which span membrane and are
each responsible for the transport of a limited range of molecules or ions. Most
require energy.
a) Diffusion (neutral)
b) diffusion (lipid-soluble)
c) ion transporter
d) specific transporter protein
9. Marking Approaches:
Using assessment criteria and marking schemes
Assessment criteria test the intended learning outcomes for a course or teaching
unit. They describe the knowledge and skills (and possibly attitude) that a student is
expected to demonstrate in their examination answers and they are then used in
marking the work. The learning outcomes describe what students should be able to
do; assessment criteria describe how well they should be able to do it – they set
standards. Remember that learning outcomes define the minimum standard
required to achieve the award, and so in addition to these the assessment criteria
should provide an objective basis for interpreting and differentiating the performance
of students at the level of the outcome (a ‘satisfactory’ pass) and at a series of
pre-defined steps above this (usually up to a level considered an ‘excellent’ or
‘outstanding’ pass).
Assessment criteria describe the extent to which students have achieved the
specified learning outcomes. They are usually provided at two levels,
Assigning grades fairly and robustly is a demanding occupation for all teachers and
we employ a range of approaches to help us to do this reliably and consistently. Two
very different methods are often used simultaneously and symbiotically – norm
referencing and criteria referencing.
that a few students will fail and a similarly few students will get distinctions whilst the
majority will gain marks that cluster and peak in the middle mark range.
You will also sometimes hear experienced assessors referring to a particular piece of
student work as providing a ‘benchmark’. This is where the answer provided, for
various reasons, encapsulates the criteria for a mark or grade: for example,
determining the threshold for a distinction. This can be extremely helpful, and is a
way in which norm referencing and criteria referencing naturally come together.
Figure 3.
However, not all cohorts will ‘fit’ this pattern. For example, a ‘Computing for
Beginners’ course could form a two-peak pattern, with one cluster of students
achieving very high marks (the students who could have taught the course!) and a
second cluster with marks at the bottom of the range (i.e. those who had never done
any computing before!).
Absolute norm referencing also has the characteristic of effectively setting quotas:
only so many students can get ‘A’s and only so many can get ‘B’s, etc. The
application of a ‘bell-shaped curve’ to small groups or cohorts of students becomes
clearly unfair, because variations between groups, say from year to year, are likely
to give rise to very different patterns of achievement.
Criterion-referenced grading, on the other hand, specifies a standard through the
description of clear criteria, and anybody who achieves the level or standard
described gains the marks. Everybody in the cohort could therefore potentially get an
‘A’, and each student’s work is individually judged in comparison to the criteria,
regardless of what other students may or may not do.
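The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the cohort, the quota shares (20% ‘A’, 60% ‘B’, 20% ‘C’) and the mark thresholds are all hypothetical and are not taken from The School's regulations.

```python
def norm_referenced(marks, quotas):
    """Grade by rank. `quotas` is an ordered list of (grade, share of cohort)."""
    ranked = sorted(marks, key=marks.get, reverse=True)
    grades, start = {}, 0
    for i, (grade, share) in enumerate(quotas):
        # The last band takes everyone left, so rounding never drops a student.
        is_last = i == len(quotas) - 1
        end = len(ranked) if is_last else start + round(share * len(ranked))
        for student in ranked[start:end]:
            grades[student] = grade
        start = end
    return grades


def criterion_referenced(marks, thresholds):
    """Grade by standard. `thresholds` is (grade, minimum mark), highest first."""
    grades = {}
    for student, mark in marks.items():
        # The first threshold the mark reaches decides the grade.
        grades[student] = next(g for g, minimum in thresholds if mark >= minimum)
    return grades


# Hypothetical strong cohort of five students (marks out of 100).
marks = {"P": 82, "Q": 78, "R": 75, "S": 74, "T": 71}
quotas = [("A", 0.2), ("B", 0.6), ("C", 0.2)]        # assumed quota shares
thresholds = [("A", 70), ("B", 60), ("C", 0)]        # assumed criteria

norm_grades = norm_referenced(marks, quotas)
criterion_grades = criterion_referenced(marks, thresholds)
print(norm_grades)       # only one 'A' is available under the quota
print(criterion_grades)  # every student meets the 'A' standard
```

With the quota in place only the top-ranked student receives an ‘A’, whereas under the stated criteria all five do, which is the difference described above.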
EXERCISE
Please consider the strengths and limitations of both forms of grading work.
Norm-referenced assessment
Strengths
Weaknesses / limitations
Criterion-referenced assessment
Strengths
Weaknesses / limitations
In The School’s Assessment Code of Practice we can see some guidance and clarity
on this issue.
The School uses a standard assessment system, marking against six gradepoints:
integers from 0 to 5. Grades 2 and above are pass grades (grade 5 can be seen as
equivalent to distinction standard), whilst grades below 2 are fail grades. The six
gradepoints are equivalent to the old grades of A, B+, B, C, D and E.
Table 4.
Grade point | Descriptor | Typical work should include evidence of…
5 | Excellent | Excellent engagement with the topic, excellent depth of understanding & insight, excellent argument & analysis. Generally, this work will be ‘distinction standard’. NB that excellent work does not have to be ‘outstanding’ or exceptional by comparison with other students; these grades should not be capped to a limited number of students per class. Nor should such work be expected to be 100% perfect; some minor inaccuracies or omissions may be permissible.
4 | Very good | Very good engagement with the topic, very good depth of understanding & insight, very good argument & analysis. This work may be ‘borderline distinction standard’. Note that very good work may have some inaccuracies or omissions but not enough to question the understanding of the subject matter.
3 | Good | Good (but not necessarily comprehensive) engagement with the topic, clear understanding & insight, reasonable argument & analysis, but may have some inaccuracies or omissions.
2 | Satisfactory | Adequate evidence of engagement with the topic but some gaps in understanding or insight, routine argument & analysis, and may have some inaccuracies or omissions.
1 | Unsatisfactory / poor (fail) | Inadequate engagement with the topic, gaps in understanding, poor argument & analysis.
0 | Very poor (fail) | Poor engagement with the topic, limited understanding, very poor argument & analysis.
0 | Not submitted (null) | Null mark may be given where work has not been submitted, or is in serious breach of assessment criteria/regulations.
Students should be made aware of the criteria on which all assessment tasks will be
marked, to improve their understanding of the standards expected of them.
The criteria used to place students in each grade category must be written down by
staff setting assessments, and adhered to by all those involved in the marking.
10. Ways of producing accurate and clear marking guidance for questions
Marking guidelines should be based directly on the assessment criteria and, for some
modules, such as those that are quantitative in nature, there is probably a need for
model/specimen answers, in addition to or instead of marking guidelines.
The assessment criteria will serve as the basis for the development of the marking
guidelines. For each criterion I suggest that you initially think about the major steps
in the continuum of student achievement, i.e. what do you expect from a ‘Pass’
answer at a 50% grade level and what would you expect of a ‘Distinction’ answer?
Firstly, for each criterion, consider carefully what you expect students to have written
to achieve a passing mark for this criterion. Draft a detailed description of the content
and quality that markers should evaluate, in addition to what has been included in
the assignment instructions. Ask yourself: “What would comprise the minimum of
what I would expect the student to have written for this section, or about this subject,
to achieve a passing mark?” This description or set of required
concepts/ideas/issues/definitions will serve as the basis for a grade of ‘2’.
Once the basic expectations for a ‘2’ grade have been drafted in association with the
original criteria, it is then necessary to describe what additional level of content
and/or quality would achieve each of the higher marks (3, 4, 5). Please draft
descriptions of the components that might achieve each of these higher marks.
imaginative input and cross referencing from students who have access to nothing
more than the course materials. For a ‘5’ grade in particular, it is original thought, not
extra facts, that would contribute.)
You may well find that, depending on the nature of your course, module or subject
area, there is one type of criterion that tends to take precedence in differentiating the
marks. For example, in a strongly practice-based, professional course, the quality
and authenticity of reflective practice may be a priority criterion. In courses
concerned with exploring the impact of public policy decisions and practices the lead
criteria may be those emphasising the application of key principles and the analysis
of outcomes. If there are lead criteria, then a transparent approach would be to
emphasise these in advance to students both within the teaching and the
assessment design. There should also be links made between the criteria and the
intended learning outcomes that help to show students where the emphasis lies.
Finally, based on the basic criteria for a passing mark (‘2’), draft a list of fundamental
omissions or errors that would result in a ‘1’ or even in a ‘0’, fail.
This is particularly important if you are likely to be assessing ‘essay’ style questions
rather than numeric or quantitative questions. It is possible to score 100% in a
calculation answer and virtually impossible to score more than 80% in a discursive
essay style answer.
You have to give your students the opportunity to be able to excel – you need to
consider how your more able students can demonstrate their additional qualities,
creativity or more in-depth knowledge or understanding to you. This is often a difficult
thing to achieve, i.e. to incorporate into the question design an opportunity to
differentiate between your able and excellent students.
Here are a couple of examples showing how the marking guidance gives clear links
to the grading structure and differentiates between the possible grades.
Example 1.
Question
Discuss what is meant by the term “epidemic”. Describe the main features of an
epidemic curve. Identify the main types of epidemic, giving examples.
Marking Guidance
Example 2.
Question
What has been the impact of HIV on the epidemiology and control of TB?
Marking Guidance
A Grade 3 answer should provide basic information on the epidemiology and control
of TB including –
A Grade 2 answer may include some of these points, or alternatively all of these
points but with insufficient discussion.
Grade 1 and below would include some of these points but with significant errors of
interpretation.
Grades 4/5 answers will be an intelligent structured discussion of how HIV impacts
the epidemiology and control of TB including other relevant points in addition to the
ones listed above.
It is interesting to note that in both these examples the assessor has chosen to
provide a description for a Grade 3 answer first – describing a point near the middle
of the grade-scale, the peak of the normal distribution, before going on to relate
higher (4/5) and lower (2/1) scoring grades to this mid-point.
EXERCISE
Consider an examination question that you have written or are currently in the
process of drafting. Produce some marking guidance for the question that provides
clear descriptions that differentiate between the Grades (0 to 5).
Think about which point on the grading scale you find it easiest to begin with.
Example 3.
Question
On a hot summer day, children in three schools had a school outing to a playground
where some of the children played in the recreational fountain. Two days later nearly
half the children had symptoms of vomiting, diarrhoea, abdominal pain and
headache. A retrospective cohort study was carried out to try to identify the source
of the outbreak with the following results.
(a) Define what is meant by the risk and relative risk of becoming ill associated
with each factor (10 marks).
(b) Calculate BOTH the risk and relative risk associated with each factor (30
marks).
(c) Suggest possible interpretations of the results, and the implications for control
recommendations (10 marks).
Marking Guidance
Risk = children who were ill who were exposed/total number exposed
Relative risk = risk in exposed/risk in unexposed
Give 5 marks each for these definitions: total 10 marks.
Give 3 marks for each correct risk & 3 marks for each correct relative risk:
total 30 marks
Main risk factor is playing in the recreational fountain. This suggests that
the source of the outbreak is water in the fountain, possibly indicating
faecal-oral transmission. Water in the fountain should be tested regularly
for relevant bacteria and viruses (eg, E Coli, salmonella, norovirus) and
should be monitored to ensure that adequate levels of chlorine are
present in the water. Alternatively children could be prevented from
playing in the fountain (however, on a hot sunny day it may be difficult to
keep them out of the water!) Up to 10 marks for that or similar relevant
comment.
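The two definitions used in the marking guidance for parts (a) and (b) can be checked with a short calculation. The sketch below is illustrative only: the counts are hypothetical, because the results table for the question is not reproduced here, and the function names are invented for the example.

```python
def risk(ill, total):
    """Risk = number in the group who became ill / total number in the group."""
    return ill / total


def relative_risk(ill_exposed, total_exposed, ill_unexposed, total_unexposed):
    """Relative risk = risk in exposed / risk in unexposed."""
    return risk(ill_exposed, total_exposed) / risk(ill_unexposed, total_unexposed)


# Hypothetical counts for a single factor (NOT the exam data).
ill_exposed, total_exposed = 30, 40       # e.g. children who played in the fountain
ill_unexposed, total_unexposed = 10, 40   # e.g. children who did not

r_exposed = risk(ill_exposed, total_exposed)
r_unexposed = risk(ill_unexposed, total_unexposed)
rr = relative_risk(ill_exposed, total_exposed, ill_unexposed, total_unexposed)

print(f"risk in exposed   = {r_exposed:.2f}")    # 0.75
print(f"risk in unexposed = {r_unexposed:.2f}")  # 0.25
print(f"relative risk     = {rr:.2f}")           # 3.00
```

A marker could run the same calculation once per factor to produce the reference values against which the 3 marks per risk and 3 marks per relative risk are awarded.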
The investigators wanted to identify the infectious agent involved. One possibility
they considered was norovirus, which is known to cause acute gastroenteritis.
Although the reverse transcription-PCR (RT-PCR) method is considered to be the
“gold standard” for diagnosis of this viral infection, it requires skilled personnel and a
well-equipped laboratory. A simpler diagnostic kit has been developed. The following
table shows how the simpler diagnostic kit compares to the gold standard.
Diagnostic test result | Gold standard: norovirus present | Gold standard: norovirus absent
Norovirus present | 37 | 3
Norovirus absent | 13 | 47
(d) Would you advise the investigators to use the simpler diagnostic test in their
epidemiological survey? Would your recommendations change if the simpler
diagnostic test was to be used in clinical practice? Justify your answer. (50
marks)
Marking Guidance
Give 5 marks each for calculation of sensitivity and specificity (10 marks).
Discussion of whether or not to use the test in (i) an epidemiological survey
or (ii) a clinical setting: up to 40 marks for answers that identify the key
requirements of a diagnostic test in the two situations and use
information from the calculation of sensitivity and specificity correctly.
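As a worked check of the calculation in part (d), the sensitivity and specificity of the simpler kit can be computed directly from the 2x2 table in the question. This is a minimal sketch using the standard definitions (sensitivity = true positives / all with the infection; specificity = true negatives / all without it); the variable names are chosen for the example.

```python
# Counts from the table: simpler kit result versus the RT-PCR gold standard.
true_pos, false_pos = 37, 3     # kit reports norovirus present
false_neg, true_neg = 13, 47    # kit reports norovirus absent

# Proportion of truly infected samples that the kit detects.
sensitivity = true_pos / (true_pos + false_neg)   # 37 / 50

# Proportion of truly uninfected samples that the kit clears.
specificity = true_neg / (true_neg + false_pos)   # 47 / 50

print(f"sensitivity = {sensitivity:.2f}")  # 0.74
print(f"specificity = {specificity:.2f}")  # 0.94
```

Having these reference values in the marking guidance makes it easier to award the 10 calculation marks consistently across markers.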
Some of the following points may be included in the answer:
Providing students with past papers and specimen answers before an examination is
one way of providing transparency and clarity in what is expected and valued in an
answer. Providing students with a specimen answer after their papers have been
marked also helps them to review their own learning and acts as a form of ‘feedback’.
It also helps to tailor any individual feedback to the particular needs of a student (e.g.
in tutorials) rather than having to cover everything generically.
explanations of why a particular answer is correct (or more correct than others).
However, if answers are expected to ‘use evidence’ or ‘explain with reference to the
literature’, the specimen answer provided should seek to model good practice in
these academic skills whilst also emphasising that there may be other ways of
achieving positive results. In very open-ended response questions it may be best to
provide brief outlines of two or three different possible interpretations and
arguments. This can be particularly useful in a ‘feedback’ mode of
presentation in which students come, review and then discuss the different
approaches taken, thus encouraging students to find their own ‘voice’.
You may like to refer to the extended case study provided in Appendix 3 that takes
you through the steps of exam question and marking guidance development together
with extracts from the module team discussions.
The development and approval of questions is the responsibility of the course team
and is usually a process started towards the end of the Autumn term as refining
questions and marking guidance does take quite a lot of time to do well and
collaboratively.
You have an opportunity now, if you wish, to review an extended case study showing
the approach adopted by one course team and showing the development process for
one question.
Please refer to the extended case study provided in Appendix 3 to see the process
by which questions are produced by module teams.
This extended case study, based on a real example, aims to show the stages
of development that the question went through and reflections on the process
made by the course team (shown in comment boxes)
E.g.:
It is also important to remember that any grade divulged before the final meeting of
the Board of Examiners is a provisional grade, subject to external review and may be
amended at the discretion of the examiners.
Appeals
When thinking about the way we write examination questions and conduct
summative assessments it is worthwhile remembering that candidates may appeal
against a result where there is concern that the examination has not been conducted
in accordance with School policies and procedures. However, the University of
London does not allow appeals on purely academic grounds, such as challenging
the interpretation of a concept or principle.
Strategies to support students are usually based upon two guiding principles;
A common approach used in the School is to provide examples of past papers and
examiners’ reports so that students can see the process of assessment clearly.
It is also desirable to provide opportunities for students to experience assessment
forms and formats before they ‘count’. Building ‘mock’ examinations into the module
or course and giving students feedback on their approach and success is one way
that this can be done. Having formative assessment that mirrors the summative
assessment can also be helpful. This is especially true for students at the School
who may have had very diverse experiences of education and assessment
processes prior to their Masters courses either in London or by DL.
The School has produced some guidance on the delivery of feedback to students
after formal course work assessments. This particularly highlights the need for
clarity, transparency and a fast feedback turnaround time (see below).
However, I would also like to emphasise the need to provide constructive feedback
on the formative, ‘practice’ or mock assessments that are part of the teaching units
at The School. Feedback here needs to be focussed on helping the students to ‘do
it better next time’, or to coin a phrase, “Feed-forward”.
EXERCISE
How can you provide support for your students as they prepare for and
participate in examination assessments?
Feedback on Examinations
School policy is that for coursework and project reports, students should receive
individual feedback to aid their learning. For the June exams, students receive their
grades. For DL courses, Examiner’s Reports for Students are prepared on
expectations with references to marking schemes.
Question 1
Overall all the sections of question 1 were well answered by the candidates who
attempted this question. The standard of the answers was high and showed depth of
understanding. All major points were covered from questions 1a-1e. The points that
were expected to be included to gain a good mark are detailed below.
a) Gram stain
b) Lipopolysaccharide
c) Bacillus anthracis
d) Treponema pallidum
e) Corynebacterium diphtheriae
Question 2
For a safe pass, the student should have discussed that N. meningitidis is a
Gram-negative diplococcus, non-motile and lives in a certain percentage of upper
respiratory tracts within the population. They cause meningitis and
other diseases / symptoms by crossing the blood brain barrier through the same
path, which neutrophils use. As virulence factors, they have pili and fimbriae to
attach, endotoxins causing inflammation to help entry, capsules to interfere with
complement attack as well as phagocytosis, killing and degradation by
macrophages/neutrophils, and IgAase to neutralize IgA. They can be typed by
several capsule serotypes, which are not all covered by the available vaccine.
Diagnosis needs to be very fast, since the most affected ones are children and
teenagers, who can succumb to the disease rather fast. Antibiotic therapy needs to
be started quickly. It would have been excellent to name a few relevant antibiotics.
For diagnosis, growth test using CF and blood on chocolate / blood agar, and test for
sugar usage, and the latex agglutination test should be mentioned, as well as other
possible tests including PCR. The more details the better the score.
Comment (M1): This is helpful for the student to know what is a “safe pass”
Question 3
For a safe pass the student should have named 2 zoonotic infections such as
brucellosis, salmonellosis (Salmonella typhimurium), listeriosis, bovine tuberculosis
(M. bovis), leptospirosis, psittacosis, tularaemia (Francisella), anthrax (Bacillus
anthracis), Q fever (Coxiella), Lyme disease and so on, so lots of choices.
Better grades could have been achieved by describing their reservoirs, life cycles
and diseases in detail as well as how they can be controlled. Some of them have a
more complicated life cycle and are transmitted by vectors (Borrelia, Coxiella, Y. pestis);
some come from specific hosts (M. bovis from ruminants, Leptospira from rats); B.
anthracis makes spores and is therefore difficult to eliminate by simple disinfection,
and the cadavers need to be incinerated. If these issues were detailed, students
would have scored high marks.
Comment (M2): This is good
Question 4.
This question is based on the paper in your reader (Bahl et al). The questions help
you understand and interpret the data that are given in the tables, and help you
follow the discussion of the data by the authors.
Comment (M3): This is helpful for students
a) First of all, always read the titles/headers of tables carefully, because these
tell you what exactly is presented in the table: what is measured and how, what the
numbers mean, etc. The two tables give you different information: Table 1 counts
episodes, and therefore gives you incidence, whereas Table 2 gives prevalence, that
is ‘days-with-disease’ during the observation period. Looking at ‘risk’ for diarrhea, we
see in Table 1 that children with low plasma zinc are ‘at increased risk’ because there
is a higher incidence of diarrhea with a significantly higher RR (Relative Risk,
significant when the confidence interval does not contain 1): 1.47 (1.03, 2.09). There
is also a significantly higher risk for severe diarrhea (1.70), but the RR for prolonged
diarrhea is not significantly different (RR of 2.54, but confidence interval contains 1).
This is further supported by the prevalence data in Table 2, where we see that only
the diarrhea with fever (= more severe diarrhea) is significantly more frequent in the
children with low plasma zinc. There is no significant difference in the prevalence of
the other morbidities between the children with low and with normal plasma zinc (see
the P-values in the table).
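The rule of thumb used in this feedback (a relative risk is significant when its confidence interval does not contain 1) can be expressed as a small check. In the sketch below, the first interval is the one quoted above; the second is a hypothetical interval, included only to show the non-significant case, since the actual interval for prolonged diarrhea is not quoted in the report.

```python
def is_significant(ci_lower, ci_upper):
    """A relative risk differs significantly from 1 when its CI excludes 1."""
    return not (ci_lower <= 1.0 <= ci_upper)


# RR for diarrhea incidence in children with low plasma zinc: 1.47 (1.03, 2.09).
print(is_significant(1.03, 2.09))   # True: the interval lies wholly above 1

# Hypothetical wide interval around a large RR (assumed values, not from the paper).
print(is_significant(0.80, 8.10))   # False: the interval contains 1
```

The second case illustrates the point made about prolonged diarrhea: a large relative risk on its own is not evidence of a difference if the interval around it still includes 1.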
b) First, read carefully. On what data are these statements based? In Table 1 you
can see that the number of episodes of ALRI was not different between the groups
(Confidence Interval contains 1). However, the total number of days with ALRI, as
presented in Table 2, was significantly higher in the children with low plasma zinc.
Therefore, one has to conclude that there must have been more days per episode in
the children with low plasma zinc.
It is sincerely hoped that this workbook has provided the necessary guidance and
information you need to be able to produce demanding but fair examination
questions and their associated marking guidance and assessment criteria, for your
modules.
The process
• Questions that are divided into discrete sub-sections and are accompanied by
  their associated marking schedules have many benefits for both students and
  markers, providing clarity in presentation and grading reliability.
• Include data or information in the question to reduce the emphasis on memory
  and increase the emphasis on application and critical thinking.
• Check that your draft question does not favour or disadvantage students from
  particular backgrounds or cultures.
• Keep sentences short, keep the layout clear and well spaced out, and use precise
  and unambiguous language.
• Check that the question standard and assessment criteria are at Masters
  level.
• Check: does the question enable students to excel and allow markers to
  discriminate between able and excellent performances?
This workbook has focussed on the challenge of writing exam questions. However,
many of the principles and good practices highlighted are equally applicable to the
design of in-course assessments such as assignments, reports and projects.
Where there is more than one assessment task for a module or course it is important
to ensure that certain learning outcomes are not over assessed whilst others are
neglected.
Haines, C. (2004) Assessing Students’ Written Work: Marking essays and reports.
Key guides for effective teaching in higher education. RoutledgeFalmer.
McMillan, J.H. (2001) Classroom assessment: Principles and practice for effective
instruction. Boston: Allyn and Bacon.
Piontek, M.E. (2008) Best Practices for Designing and Grading Exams.
Centre for Research on Learning and Teaching, Occasional Papers no. 24,
University of Michigan, http://www.crlt.umich.edu/publinks/occasional.php
Useful web-sites
Appendices
Appendix 1.
EXERCISE
Underline the verb and key elements of the question that give an indication of
the extent (limits and boundaries) of the question.
1. Describe the three main methods of economic evaluation (40%). What are the
main strengths and weaknesses of each method? (40%). Support your answer with
examples of disease evaluation (20%)
‘Describing’ is a relatively low level cognitive skill but then the student is
asked to evaluate the three methods by giving strengths and weaknesses –
this is the Masters level task in this question.
Factors that give limits are the requirement to describe ‘three’ methods and to
support the answer with examples.
Giving ‘Advice’ requires the students to select from and apply their knowledge
in order to synthesise an appropriate surveillance system – this is Masters
level. Students are also asked to consider what makes such a system
‘Quality’ – this could be considered a further degree of difficulty. The limits in
this question are given by the scenario of the question which makes it specific
to a country and a disease context.
3. Write short notes on THREE of the following. In each case explain the
importance of the infectious agent and the mode of transmission in its spread and
control.
a) rotavirus diarrhoea
b) measles
c) guinea worm
d) dengue
e) tuberculosis
This question does not clearly articulate Masters level requirements as the
‘Write short notes’ does not indicate a level and the ‘Explain the importance’
may or may not require some level of evaluation and critique but could equally
be a measure of memory depending on what had been taught in the module.
Appendix 2.
A detailed example –
re-writing and formatting a question to ease interpretation
(related to chapter 6.)
This example has kindly been provided by the teaching team responsible for one of
the DL programmes delivered by the School – Fundamentals of Clinical Trials. It
shows clearly the way a set of guiding principles are used to mould a clearer
question context from a ‘great idea’ to a very demanding but fair question set-up.
The team wanted to write a question that tested their students’ abilities to think about
and apply key concepts rather than re-work the study materials provided. Past
experience had underlined the importance of providing relevant and realistic
question contexts and considerable effort is made to vary the scenarios used in
question setting.
What is presented here is the first draft of the question, some team discussion
notes and then the final question as it was used to assess the DL students.
One of the concerns in the treatment of babies born to HIV+ mothers in developing
countries is the transmission of HIV from mother-to-child during breast-feeding.
Infant formula, if used safely and consistently can prevent HIV from passing from
mother to child but can result in increased infant mortality.
In Botswana, free formula is provided and recommended for babies born to HIV-
infected-mothers who are able to safely and consistently formula feed their infants.
Despite the availability of milk powder, a number of HIV-infected-mothers continue to
choose breastfeeding. This may be due in part to difficulties in consistently preparing
safe-formula feeds and also the stigma associated with being seen to formula feed.
For these HIV-infected-mothers exclusive breastfeeding is recommended with early
weaning.
Recruitment Criteria: Pregnant HIV infected mothers, who are currently not
receiving antiretrovirals (ART) and who plan to breast feed
Randomisation: Women to be randomized into 2 groups.
? Both groups of women receive antiretroviral as per current standard of care during
pregnancy to reduce the risk of HIV passing from the mother to the child, and their
infant takes 1 month of prophylactic antiretrovirals following delivery.
Comment (D2): Don’t know what to call this? This is ethically lead as well as
base line driven???
Interventions for comparison:
o Group One: mothers discontinue ARVs after delivery (unless ARV needed for
their own health)
o Group Two: mothers continue ARVs for 6 months after delivery,
The primary endpoint: the proportion of babies alive and uninfected with HIV by
7 months of age.
Secondary endpoints: include cumulative HIV free survival at 7 and 18 months
and safety of maternal ARV prophylaxis for HIV exposed infants.
Comment (D3): I brought this one up from down below!!!
Teaching Team comments: We were worried about this being a more difficult context
to grasp (i.e. the intervention is given to the mother but the impact of the
intervention is on the infant; the primary outcome that we are monitoring is HIV
survival in the infant) for those whose first language is not English and those who
do not have specialist knowledge of HIV.
Comment (D4): Lets talk over the phone because I still think we have some way to go
with setting up this context. We need to make it as clear as possible for the
students and this is a complicated trial!
However, we felt that this context was in keeping with the level of understanding we
would expect from students enrolled on the module. It is also reflective of a study
based in a developing country. Thus we felt with re-presentation of the context,
using a table form, it would be easier to understand.
HIV can be transmitted from mother to child during pregnancy, during delivery, and
after delivery, through exposure to HIV via breast milk. Prevention of mother-to-
child transmission (PMTCT) of HIV therefore focuses on interventions that reduce
the risk at each of these times. A research team wishes to determine the efficacy
and safety of adding maternal antiretroviral prophylaxis during breastfeeding to
the current local standard of care, for PMTCT of HIV. The current standard of care
includes antiretroviral prophylaxis for the pregnant mother and one-month of
prophylaxis for the infant after birth. Having identified their primary outcome as infant
HIV-free survival at 7 months, they plan the following randomised control trial (RCT):
control
*Randomisation: Pregnant women are randomised 1:1 to the intervention or
control arm
Appendix 3
This extended case study, based on a real example, aims to show the stages
of development that the question went through and reflections on the process
made by the course team (shown in comment boxes)
Question Motivation: We wanted to move away from the overused cardiology drug
trial examples of previous exam papers. We have a diverse tutor team that included
clinical trialists working at the Institute of Mental Health and we were inspired by a
BMJ article by Goodyer et al reporting on a Mental Health trial on major depression
in adolescents.
Assessment Needs: The exam was composed of two questions. Prior to question
setting, we identified and allocated the key concepts (as covered in the distance
learning study material for this module) to be tested for each question. For this
question the chosen key principles to test were:
Trial designs;
Recruitment; Blinding;
Randomisation;
Bias
The second question was to be much more numerical/statistical in nature, thus this
first question excluded calculation type questions. Question two also included
questions specifically designed to be “grade differentiators”. The first question was
seen as testing students’ understanding and application of central and
“straightforward” concepts.
The question is given in plain text, marking guidance is indented and in italics, and
the module team’s discussion notes are in the comment boxes alongside the text.
The Question
Selective serotonin reuptake inhibitors (SSRIs) are prescribed for the treatment of
major depression in adolescents (age 11-16), although there are concerns regarding
their usefulness and a raised risk of suicide. The National Institute for
Health and Clinical Excellence (NICE) recommends the use of SSRIs in combination
with Cognitive Behavioural Therapy (CBT) in the UK. This recommendation is based
on data collected from the United States.
Comment (A1): What sort of depression – major? Should we define with a
depression score?
Comment (A2): Should we define adolescents? 11-16? Not certain what is
the standard.
c) What design and conduct features would you apply in order for the trial to be
explanatory or pragmatic?
(10 marks)
Comment (A7): I’m not certain I would know how to answer this question. Design
and conduct in one question overwhelms me – sorry I’m just a babe in arms really!
I’m guessing your direction is to think about an intention to treat analysis and how
we define a protocol deviation. If we continue down the pragmatic route do you
think we could streamline this question?
Think about the eligibility criteria and how restrictive these should be.
Think about who will be delivering the therapy intervention, and what
training and experience these people would have.
Parents may not want to enter their children into a trial involving an
SSRI because of the risk of suicide, especially as they have major
depression. Therefore recruitment may be slow. As NICE
recommends SSRIs in conjunction with CBT based on US data there
may not be equipoise for this trial. Therefore clinicians may be
unwilling to randomise highly depressed children and their parents
may also be unwilling to be involved.
e) At the design stage a third treatment arm was suggested for inclusion
consisting of placebo only. What would be the advantages and disadvantages of
including this treatment arm?
(4 marks)
Comment (A9): Like the idea of this question because you have to think about it.
But not certain whether it is a step too far for the students. I think maybe we could
drop and ask about randomisation? I think we have logged about 10 marks.
The disadvantages would be that it would be ethically unacceptable
to include a placebo treatment for this group of participants.
Including a third arm would increase the numbers needed to be
recruited.
The inclusion of a placebo arm would allow a direct evaluation of
treatment against no treatment.
The primary outcome of the trial was the Health and Nation Outcome scale which is
a 12 item scale covering a wide range of health and social domains such as
psychiatric symptoms, physical health, functioning, relationships and housing. Each
question is marked from 0 (no problem) to 4 (severe problem). This was completed
by an interviewer at 12 weeks post randomisation.
Comment (A10): Lovely. Should we be adding anything about a composite score
and how we use that to conclude? Or is this too much info? (I.e. what is considered
as an improvement? I think this information can come later)
Two hundred adolescents were to be recruited into the trial from six centres. Simple
Randomisation was used to allocate treatment in the ratio 1:1. Each centre had one
interviewer collecting data and several therapists giving CBT.
g) Identify and discuss possible sources of bias that could occur in each of the
design, conduct and analysis stages of this trial.
(15 marks)
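The simple 1:1 randomisation described in the stem above can be sketched in a few lines. This is only an illustration and not the procedure used by any real trial: the function name, the arm labels and the seed are assumptions made for the example.

```python
import random
from collections import Counter


def simple_randomise(n_participants, arms=("SSRI", "SSRI+CBT"), seed=None):
    """Allocate each participant independently, with equal probability per arm.

    Hypothetical helper for illustration only. A real trial would use a
    concealed, centrally managed allocation system.
    """
    rng = random.Random(seed)
    return [rng.choice(arms) for _ in range(n_participants)]


# 200 adolescents, as in the question stem; the seed is arbitrary.
allocations = simple_randomise(200, seed=2024)
counts = Counter(allocations)
print(counts)
```

Because every allocation is an independent coin toss, the two arms will not necessarily contain 100 participants each, and balance within each of the six centres is not guaranteed either. This kind of chance imbalance is one of the points a candidate might raise when discussing the design.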
The primary analysis of this trial showed that at 12 weeks post randomisation the
mean (standard deviation) of the primary outcome was 18 (CI 7.5) in the SSRI group
and 17.1 (CI 8.3) in the SSRI plus CBT group. The difference between the two
groups was not statistically significant under an intention-to-treat analysis.
Comment (A11): Am I right in remembering this as the confidence interval?
j) What other information would you consider when interpreting the results of this
trial, think particularly about what may be reported in the publication?
(20 marks)
Is the sample size large enough?
How many people were included in the analysis?
Was this the best and most appropriate design?
Has there been substantial bias introduced?
What were the results of the secondary outcomes, in particular
safety?
Are the conclusions similar under a per protocol analysis?
Was the randomisation successful, i.e. are the treatment groups
balanced?
Is the trial population generalisable to inform policy decisions?
Step ii The Question Amended after Feedback from the Exam Chair (July)
(Ready for Review By The External Examiner)
Selective serotonin reuptake inhibitors (SSRIs) are prescribed for the treatment of
major depression in adolescents (age 11-16), although there are concerns regarding
their usefulness and a possible raised risk of suicide. The National Institute for
Health and Clinical Excellence (NICE) recommends that the National Health Service
(NHS) in the UK use SSRIs (oral
Comment (L1): JR: Condense to what’s needed to answer the question.
Comment (R1): EL: Yes, but we also need to be careful not to disadvantage those
who know nothing in this area.
Pragmatic trials
A pragmatic trial determines the effectiveness of an intervention. This
is the benefit it achieves through routine clinical practice
Advantages: They are generalisable to routine clinical practice
Disadvantages: Larger sample sizes are often required
Explanatory trials
An explanatory trial determines the efficacy of an intervention. This is
the benefit it achieves under ideal conditions.
Advantages: Variation can be reduced due to the strict procedures and
so inferences can be made from smaller sample sizes
You can determine whether the intervention actually works
Disadvantages: They are conducted under strict procedures and so not
very generalisable to routine clinical practice.
Parents may not want to enter their children into a trial involving an
SSRI because of the risk of suicide, especially as they have major
depression. Therefore recruitment may be slow.
As NICE recommends SSRIs in conjunction with CBT based on US
data there may not be equipoise for this trial. Therefore clinicians
may be unwilling to randomise highly depressed children and their
parents may also be unwilling to be involved.
The two previous issues (for some too high risk, for others
effectiveness already proven) could be described as a problem of
equipoise that could affect recruitment.
Both treatments would be available outside of the trial and
therefore there is not as much incentive to take part in a clinical trial.
The population may include those younger than 16 and therefore
consent procedures for non-adults are more complicated and
challenging.
There may be a larger drop out in the CBT arm due to the extra
burden of having to attend numerous therapy sessions. Alternatively
the extra attention may be beneficial and increase retention.
They should get some credit for mentioning that, because this is a
pragmatic trial, the drop out will reflect the normal situation. What
is being evaluated is a policy of recommending CBT, so drop out will
not bias the research question.
Adolescents may drop out when they leave school
Two hundred adolescents were to be randomly assigned into the trial from six
centres. Each centre had one interviewer collecting data and several therapists
giving CBT.
The primary outcome of the trial was the total score of the Health and Nation
Outcome scale which is a 12 item scale covering a wide range of health and social
domains such as psychiatric symptoms, physical health, functioning, relationships
and housing. Each question is marked from 0 (no problem) to 4 (severe problem).
This was completed by an interviewer at 12 weeks post randomisation.
The primary analysis compared the average total score of the Health and Nation
Outcome scale at 12 weeks post randomisation between treatment groups (SSRI
alone versus SSRI+CBT).
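The total score used as the primary outcome follows directly from the description above: 12 items, each marked from 0 to 4, so totals run from 0 to 48. A minimal sketch of that arithmetic is given below. The function name and the example item scores are hypothetical, and summing the items is an assumption based on the phrase "total score".

```python
N_ITEMS = 12                 # items on the scale, as stated in the stem
MIN_ITEM, MAX_ITEM = 0, 4    # 0 = no problem, 4 = severe problem


def total_score(item_scores):
    """Sum the 12 item scores after checking they are complete and in range."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    if any(not MIN_ITEM <= s <= MAX_ITEM for s in item_scores):
        raise ValueError("each item must be scored between 0 and 4")
    return sum(item_scores)


# Hypothetical responses for one participant (not real data).
example = [2, 1, 0, 3, 4, 1, 2, 0, 1, 2, 3, 1]
print(total_score(example))  # 20, on a possible range of 0 to 48
```

Rejecting incomplete questionnaires, as this sketch does, is only one way of handling missing items. How missing data are dealt with is itself something a candidate could discuss under sources of bias.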
The term “double-blind” means that neither the participant nor the
person treating the patient (i.e. doctor and/or therapist), nor the
person responsible for evaluating the outcome (the researcher) knows
whether the patient has been allocated to treatment or not. In this
case we also have the interviewer to think about, who may or
may not be the evaluator.
d) Identify and discuss three possible sources of bias that could occur in this trial.
(6 marks)
The primary analysis of this trial showed that at 12 weeks post randomisation the
mean (standard deviation) of the primary outcome was 18 (SD=7.5) in the SSRI
group and 17.1 (SD=8.3) in the SSRI plus CBT group. The difference between the
two groups was not statistically significant under an intention-to-treat analysis.
Comment (L3): JR: This is too broad and vague a question. Change to something
more specific.
f) What other information would you consider important to report in the publication
of this trial to be able to interpret its results?
(6 marks)
Is the sample size large enough?
How many people were included in the analysis?
Was this the best and most appropriate design?
Has there been substantial bias introduced?
What were the results of the secondary outcomes, in particular
safety?
Are the conclusions similar under a per protocol analysis?
Was the randomisation successful, i.e. are the treatment groups
balanced?
Is the trial population generalisable to inform policy decisions?
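Several of the points above (sample size, numbers analysed) bear on how precisely the treatment difference was estimated. As a rough illustration, the means and standard deviations in the stem can be turned into an approximate 95% confidence interval for the difference in means. The sketch assumes 100 participants analysed in each arm (200 randomised 1:1 with no drop out, which the question does not state) and uses a normal approximation. The function name is hypothetical.

```python
import math


def diff_in_means_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Approximate 95% CI for a difference in means (normal approximation)."""
    diff = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff, diff - z * se, diff + z * se


# Means and SDs from the stem; n = 100 per arm is an ASSUMPTION.
diff, lower, upper = diff_in_means_ci(18.0, 7.5, 100, 17.1, 8.3, 100)
print(f"difference = {diff:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Under these assumptions the interval runs from roughly -1.3 to 3.1 points and includes zero, which is consistent with the stem's statement that the difference was not statistically significant. With fewer participants analysed the interval would be wider still.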
Selective serotonin reuptake inhibitors (SSRIs) are prescribed for the treatment of
major depression in adolescents (age 11-16), although there are concerns regarding
their usefulness and a possible raised risk of suicide. In the UK, SSRIs are
recommended in combination with standard Cognitive Behavioural Therapy (CBT).
This recommendation is based on data from the United States, which could limit
applicability for practice in the UK. Investigators in the UK therefore conducted a
pragmatic randomised controlled trial of SSRIs alone versus SSRIs with a 12 week
course of CBT in adolescents with major depression.
Pragmatic trials
A pragmatic trial determines the effectiveness of an intervention.
This is the benefit it achieves through routine clinical practice
Advantages: They are generalisable to routine clinical practice
Disadvantages: Larger sample sizes are often required
Explanatory trials
An explanatory trial determines the efficacy of an intervention. This
is the benefit it achieves under ideal conditions.
Advantages: Variation can be reduced due to the strict
procedures and so inferences can be made from smaller sample
sizes
You can determine whether the intervention could actually work
Disadvantages: They are conducted under strict procedures and
so not very generalisable to routine clinical practice.
Parents may not want to enter their children into a trial involving an
SSRI because of the risk of suicide, especially as they have major
depression. Therefore recruitment may be slow.
As the UK recommends SSRIs in conjunction with CBT based on US
data there may not be equipoise for this trial. Therefore clinicians
may be unwilling to randomise highly depressed children and their
parents may also be unwilling to be involved.
The population may include those younger than 16 and therefore
consent procedures for non-adults are more complicated and
challenging.
Adolescents may drop out when they leave school
Two hundred adolescents were to be randomly assigned into the trial from six
centres. The primary outcome of the trial was the total score of the 12 item Health
and Nation Outcome scale covering psychiatric symptoms, physical health,
relationships and housing. This was completed by an interviewer at 12 weeks post
randomisation.
The term “double-blind” means that neither the participant nor the
person treating the patient (i.e. doctor and/or therapist), nor the
d) Identify and discuss three possible sources of bias that could occur in this trial.
(9 marks)
No! We still needed lots more work on the model answer to make it much
more specific for the exam marking phase. We also didn't like marking it out
of 50: it is much easier to allocate marks out of 100 (but the 50 was a constraint
placed on us by the previous exam board).
EXERCISE
The Process
When do you begin developing examination questions in your course team?
What are the strengths and weaknesses for you in adopting a similar question
development approach to the one described in the case study above?
Having read this case study – what elements would you like to transfer to your
own approach to question writing?
The Question
How would you rate the above in terms of clarity, authenticity and fairness?
How strong would you expect the inter-marker reliability to be based on the
marking guidelines provided?