Claudia Mewald
Otmar Gassner
Rainer Brock
Fiona Lackenbauer
Klaus Siller
The text and the sample tasks may be downloaded from the homepage (www.bifie.at), copied, and
distributed for teaching purposes in Austrian schools, and by university colleges of teacher
education and universities in the field of initial, in-service, and continuing teacher education,
to the extent required for the respective course. Reproduction of the texts and sample tasks on a
medium other than paper (e.g. in PowerPoint presentations) is likewise permitted for teaching
purposes.
Contents
1 SPEAKING TO COMMUNICATE
2 THEORETICAL MODELS
2.1 Models of communicative competence
2.2 Communicative competence in the CEFR
2.2.1 Linguistic competences
2.2.2 Sociolinguistic competence
2.2.3 Pragmatic competence
2.3 The nature of language in unplanned speech
3 Test development
5 WASHBACK
BIBLIOGRAPHY
Appendix
Abbreviations
ANC
BIFIE
E8 BIST
CEFR
EFL
FL
SZ
1 Speaking to communicate
It is commonly acknowledged that foreign language learners as well as most stakeholders consider speaking, or more comprehensively oral communication, the most
required and important skill to be mastered.
According to Thornbury (2009, p. iv), however, "[i]t is generally accepted that knowing
a language and the ability to speak it are not synonymous." Nevertheless, the
teaching of foreign languages (FL) has long been practised as if knowing and speaking
were the same thing, thus failing to recognise the common
misconception that knowing the grammar and some vocabulary, making sentences and
pronouncing them properly (ibid., p. iv) in the foreign language amounts to the
ability to speak it. Therefore, Thornbury maintains, many courses and teachers still
teach how to vocalise grammar rather than how to communicate effectively.
Modern FL teaching, however, supported by research and the sound judgment of
its receivers, who first and foremost want to become effective FL speakers with the
ability to communicate successfully, has acknowledged that the interactive nature of
communication requires communicative competence. Moreover, the goal of most
language learners being the ability to communicate comprehensibly, effectively, and
naturally, those components of communicative competence (see p. 7) essential to
achieve successful communication are at the heart of modern FL teaching and testing.
The recognition that spoken language is significantly different from written text, as determined
by the nature of the speaking process, is comparatively recent. It has
been made tangible by the CANCODE spoken corpus, the Cambridge
International Corpus, and modern English dictionaries, which show how English is
really used, not how one is supposed to use English or how one uses it in writing.
The difference between spoken and written language features not only in its lexis but
also considerably in the grammar of spoken language (Carter & McCarthy 2006,
McCarthy 2006a), which is to be acknowledged in teaching as well as in testing and
assessment (also see p. 31).
Consequently, teaching speaking as a skill has to consider aspects of communicative
competence (see p. 7), communicative genres relevant for the target group (see p.
18), and productive strategies which FL speakers apply to communicate according
to the nature of the communicative task and thereby show their available communicative potential. This seems particularly crucial given that, according to the Austrian National
Curriculum for Foreign Languages (ANC), FL education should primarily aim at
communicative competence:
The aim of foreign language teaching is the development of communicative competence in
the skill areas of listening, reading, taking part in conversations, coherent
speaking, and writing.
The overarching learning objective in all skill areas must always be the ability to communicate successfully, which is not to be confused with error-free
communication. (bmukk 2009c, pp. 1–2)
The curricular priority on successful communication rather than accuracy suggests a fluency-oriented approach to teaching and assessing speaking (Brown 1999,
Ebsworth 1998, Krashen & Terrell 1988, Krashen 2003, McCarthy 2006b, Richards
2008). This is also emphasized by Brock et al. (2008, p. 24), who suggest that the
practice of communicative competence is even possible in large classes if teachers
manage to let go of correcting and adopt the role of facilitators who enable, support,
and encourage speech processes instead. Moreover, they maintain that the explicit
demand for all five skills to be addressed equally intensively brings forth the obligation
to assess spoken interaction and oral production regularly and reliably.
The skill areas of listening, reading, taking part in conversations, coherent
speaking, and writing are to be developed and practised regularly, to approximately the same
extent, and in as integrated a way as possible. (bmukk 2009c, p. 2)
However, since the curriculum explicitly requires oral competences to be included in the overall
assessment, in keeping with the equal status of the skills,
CEFR-oriented teaching must include oral forms of testing and practice
that reliably reflect both monologic and dialogic speaking competences.
(Brock et al. 2008, p. 12)
For this reason, the ANC and the E8 Standards (E8 BIST) describe precisely what
language learners should be able to do in spoken interaction and oral production in
can-do descriptors and CEFR levels. Testing speaking in the E8 BIST context thus
relies on an overarching framework which takes the aspects addressed in the three
documents into consideration.
In the following, theoretical models of communicative competence and language
ability, construct specifications, task specifications, and assessment specifications
are described.
2 Theoretical Models
As early as 1961 Lado (p. 239) suggested that the ability to speak was without doubt
the most highly prized skill, while testing it was the least developed and the least
practised in the field of testing. One might argue that Lado's work on testing is history and that modern FL education has long overcome this mismatch. However, the current
state of the art of testing and assessing speaking in Austrian classrooms suggests that
testing hardly ever happens in a systematic way and thus the ability to speak does
not have a strong formal impact on the learners' final grades. Therefore, it seemed
appropriate and necessary to explore findings from international test development
and apply them to the Austrian context in the development of the E8 Speaking Test.
Lado (1961) argues that the underrepresentation of testing and assessment in speaking
derives from the fact that we lack understanding of what constitutes speaking. Consequently, this section focuses on theories of speaking in order to make transparent how
the concepts of speaking and communicative competence have been captured by
the literature since the 1960s and finally in the E8 Speaking Test.
Modern testing of speaking draws on competence models that accept the view that
speaking does not happen in a vacuum but that it is a real-life process, co-constructed
between participants talking in specific contexts and situations (Fulcher 2003).
Theoretical models for testing speaking which acknowledge the communicative
function of that skill therefore define competence models.
Some of these researchers were Bachman 1990, Bachman & Palmer 1996, Canale & Swain 1980, Fulcher 2003,
Luoma 2004, Widdowson 1978, and Wilkins 1976. Their publications quoted in this paper had an impact on
the development of the E8 Speaking Test.
Grammatical competence. This type of competence will be understood to include knowledge of lexical items and rules of morphology, syntax, sentence-grammar semantics, and
phonology.
Sociolinguistic competence. This component is made up of two sets of rules: sociocultural
rules of use and rules of discourse.
Strategic competence. This component will be made up of verbal and nonverbal communication strategies that may be called into action to compensate for breakdowns in communication due to performance variables or to insufficient competence. Such strategies will
be of two main types: those that relate primarily to grammatical competence (e.g. how
to paraphrase grammatical forms that one has not mastered or cannot recall momentarily)
and those that relate more to sociolinguistic competence (e.g. various role-playing
strategies [...]).
(Canale & Swain 1980, pp. 29–30)
Bachman (1990, p. 84) pursues a concept similar to Widdowson's and describes communicative language ability (CLA) as "consisting of both knowledge, or competence,
and the capacity for implementing, or executing that competence in appropriate,
contextualized communicative language use" [our emphasis]. He therefore proposes
a framework including the following components: language competence, strategic
competence, and psychophysiological mechanisms (i.e. the neurological and psychological processes in the actual execution of language as a physical phenomenon
such as sound).
According to Bachman (ibid.), language competence comprises a set of specific knowledge components utilised in communication via language, while strategic competence embraces the mental capacity for implementing the components of language
competence in contextualized communicative language use. He strongly links this
competence with sociocultural knowledge and real-world knowledge. In this respect Bachman is in agreement with Widdowson as well as Canale & Swain, who also
emphasise the procedural and functional notion of communicative competence with
regard to contextualised and meaningful communication.
Lexical competence is described as the knowledge of, and the ability to use, the vocabulary of a language, [and] consists of lexical elements and grammatical elements.
(Council of Europe 2001, p. 110)
Lexical elements include fixed expressions and single word forms.
Single word forms are single words that may have several distinct meanings (e.g.
tank: container or armoured vehicle), open word classes (nouns, verbs, adjectives etc.),
and lexical sets (days of the week, weights and measures etc.). (Council of Europe
2001, p. 111)
Fixed expressions consist of several words that are used and learnt as wholes. In unplanned speech they are the building blocks of fluency, which demonstrate communicative capacity. According to the CEFR, fixed expressions include sentential formulae, phrasal idioms, fixed frames, and fixed phrases (Council of Europe 2001, pp.
110–111).
In speaking, fixed expressions are often called lexical phrases, formulaic language,
conversational routines or prefabs. They range from chunks of language to complete
sentences that are not assembled word by word in the speech act but have been preassembled through repeated use. Therefore, they can be accessed easily and quickly
and thus contribute to fluency (Thornbury 2009, p. 23).
Examples from performances recorded during the piloting phase of the E8 Speaking
Test:
How are you? I don't know what you mean. Have a nice day. Good bye.
(sentential formulae, also called social formulas)
What I don't like is ..., Please can I have ..., I'd like to ..., What do you think (about)
..., I hope I will ...
(fixed frames or phrases, also called sentence frames)
Well ..., Right ..., I agree ..., You see ..., Yeah ...
(discourse markers)
Three times a week, brush my teeth, go to a party/cinema/friend
(collocations)
Finally, grammatical elements that belong to closed word classes range from articles
to prepositions and particles (for a complete list see Council of Europe 2001, p. 111).
Lexical competence is assessed in the dimension Vocabulary in the E8 Speaking
Test (see p. 33f).
Grammatical competence
[...] of a language.
[...] and producing well-formed phrases and sentences. (Summarised from Council of Europe 2001,
pp. 112–113)
The CEFR does not provide a model for grammar or for the organisation of words
into sentences but it identifies parameters [...] which have been widely used in
grammatical description: elements, categories, classes, structures, processes, and relations
(Council of Europe 2001, p. 113).
In the E8 Speaking Test, grammatical competence is assessed in the dimension
Grammar (see p. 31f).
In the assessment of both lexical and grammatical competence, the nature of language
in unplanned speech is acknowledged (see p. 11).
Phonological competence
The CEFR describes phonological competence as the knowledge of, and skill in the
perception and production of:
sound units,
phonetic composition of words,
sentence phonetics, and
phonetic reduction.
Apart from the test takers' success in making use of an appropriate lexical and grammatical range and the accuracy of the performance, the naturalness and clarity of the
language used are assessed as the third component of linguistic competence in the E8
Speaking Test. A performance is considered natural and clear if the pronunciation is
intelligible and the pronunciation and intonation make it sound natural. In order to
achieve this, performances have to reach a certain level of fluency.
According to McCarthy, fluency is shown through
lexico-grammatical and phonological flow,
apparently effortless accurate selection of elements by individual speakers,
the ability of participants to converse appropriately on topics,
the ability to retrieve chunks,
interactive support by each speaker to the flow of talk, and
helping one another to be fluent.
In this way, McCarthy maintains, speakers are able to express ideas appropriately
and coherently, speak at a suitable pace, and pause at expected points.
In the E8 Speaking Test, fluency features as phonological flow in the sense that
natural and clear pronunciation and intonation should make it possible for speakers of
English to understand the test takers' utterances without having to guess at the meaning.
In agreement with the definition by Canale & Swain (1980), the CEFR defines
discourse competence as "the ability ... to arrange sentences in sequence so as to produce coherent stretches of language" (Council of Europe 2001, p. 123).
In the E8 Speaking Test this competence can best be demonstrated in the monologue
part (see p. 17), where the test takers are most likely to produce text that features
whole sentences. In other parts of the test (interview or dialogue, see pp. 16–17)
the nature of interactive talk will primarily trigger the use of short idea units and
incomplete sentences, strings of short phrases, as well as short turns (see also p. 11).
Functional competence
Functional competence refers to "the use of spoken discourse ... for particular functional purposes" (Council of Europe 2001, p. 125).
In the context of the E8 Speaking Test, functional competence comes into play as the
already mentioned ability to make use of known expressions (see p. 8ff) in meaningful exchanges surfacing as communicative capacity or communicative strategies (see
pp. 6–7).
According to the CEFR (Council of Europe 2001, p. 128), the qualitative aspects
which determine functional success are fluency, "the ability to articulate, to keep
going, and to cope when one lands in a dead end", and propositional precision, "the
ability to formulate thoughts and propositions so as to make one's meaning clear".
The aspect of fluency which describes the ability to articulate is assessed in the
dimension Clarity and Naturalness of Speech, while the ability to keep going and
to cope when one lands in a dead end, as well as the ability to formulate thoughts
and propositions so as to make one's meaning clear, are assessed in the dimension Task
Achievement and Communication Skills (see p. 28f).
Design competence
short turns;
delivered in a formal to informal register.
(Summarised from Luoma 2004, p. 12)
Although the test takers in the E8 Speaking Test are given a short time to think
about their speech act (see p. 40), their performances cannot be called planned
in Luoma's terms. Planned speech, according to Luoma (2004, pp. 12–13), is
rehearsed, consists of well-thought-out points or opinions, and has been said many
times before.
Therefore, the nature of language of unplanned speech is considered in the assessment of spoken performances in the E8 Speaking Test, especially in the assessment
of the test takers' linguistic competence, i.e. Vocabulary and Grammar. It is
acknowledged that both vocabulary and grammar in unplanned speech are limited
in their range as well as in their accuracy compared to writing and that performance
effects [which] include the use of hesitations (erm, uh ...), repeats, false starts,
incomplete utterances, and syntactic blends (i.e. utterances that blend two grammatical structures as in I've been to China in 1998) (Thornbury 2009, p. 21) are
natural.
3 Test development
The following Model of speaking test performance (Figure 1) describes the components
which constitute the E8 Speaking Test as well as a range of factors and processes that
have impact on the performance and its assessment and have therefore been considered in the development of the E8 Speaking Test.
It shows that the construct had to be related to communicative competences, task-specific
knowledge and skills, as well as test taker characteristics, which had to be
considered in task development, decisions on setting, and how to pair the test takers.
Moreover, it shows that test administration, including setting, interlocutor characteristics,
and training, has a bearing on the performance, as does its assessment based on
the assessment scale.
Thus it clarifies that the test performance is at the heart of an interrelated system
which required organised decision making in order to provide testing and assessment
tools that would most likely bring about valid and reliable results. The following
sections will describe the process of test development in the light of these aspects.
Figure 1: Model of speaking test performance (adapted from Fulcher 2003, p. 115)
Acknowledging the importance of the issues of reliability and validity, it was a major
concern of the E8 testing group to identify sources of error in the assessment of the
test takers' communicative language ability and to develop a test and an assessment
tool that would be capable of identifying the language abilities to be measured as
reliably as possible.
3.2.1 Task
The content of the tasks in the E8 Speaking Test is defined by the topics, the communicative function determining the task types, the spoken text types, and the rubrics.
Topics and context
In real life, speaking occurs in a given context. Therefore, the tasks are based on
topics that provide contexts as close to real life as possible and avoid those that might
put some test takers at a disadvantage because the task achievement requires specific
knowledge of the world and/or cultural knowledge. Moreover, topics that require
a great deal of creativity or imagination to accomplish the task or that might easily
trigger stereotypes are not used either.
The topics of the E8 Speaking Test follow curricular guidelines and the contexts the
tasks create reflect the world knowledge and experience of 14-year-old test takers
(also see p. 23). Moreover, great care is taken to design tasks that are interesting for
the test takers in order to support motivation and participation.
The topic and the context determine
the
The prompt and the content points give away as little as possible of the language that will
be needed to accomplish the task, so that the test takers have the possibility to make use of
their own ideas and language. However, making use of the language in the prompts
is not prohibited and does not have a negative impact on the assessment.
If drawings, graphs, or pictures are used to illustrate the prompt and/or to stimulate
speaking, these are provided in excellent quality so that the test takers are not put at
a disadvantage.
Input texts that are used as a part of the prompt should be authentic. If this is not
possible, adapted texts must provide correct and appropriate English. Input texts
must be as short as possible and they must not exceed 50 words so that reading is
kept to a minimum. The language level of input texts must also be at or preferably
below the tested level and therefore not exceed CEFR level A2 (see Council of
Europe 2001, p. 24).
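The two measurable constraints on input texts stated above (a maximum of 50 words and a language level not exceeding CEFR level A2) could be checked mechanically during prompt screening. The following is a minimal sketch only, not part of the E8 procedures: the names InputText and check_input_text are hypothetical, and the CEFR level is assumed to be assigned by a human rater rather than computed.

```python
from dataclasses import dataclass

# CEFR levels in ascending order (Council of Europe 2001).
CEFR_ORDER = ["A1", "A2", "B1", "B2", "C1", "C2"]

MAX_INPUT_WORDS = 50    # limit stated in the test specifications
MAX_INPUT_LEVEL = "A2"  # ceiling stated in the test specifications


@dataclass
class InputText:
    """Hypothetical record for an input text used as part of a prompt."""
    text: str
    cefr_level: str  # assumed to be assigned by a human rater


def check_input_text(item: InputText) -> list[str]:
    """Return a list of rule violations; an empty list means the text passes."""
    problems = []
    # Words are counted by splitting on whitespace, a simplifying assumption.
    word_count = len(item.text.split())
    if word_count > MAX_INPUT_WORDS:
        problems.append(f"too long: {word_count} words (limit {MAX_INPUT_WORDS})")
    if CEFR_ORDER.index(item.cefr_level) > CEFR_ORDER.index(MAX_INPUT_LEVEL):
        problems.append(f"level {item.cefr_level} exceeds {MAX_INPUT_LEVEL}")
    return problems
```

Such a check covers only length and the recorded level; whether a text is authentic, correct, and appropriate remains a matter of expert judgment.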
Rubrics
All rubrics that provide the instructions for the tasks are written in English (see
p. 41). The language used has been piloted and revised several times. It is below the
candidates' expected level of language competence and therefore easily understandable for test takers who have mastered the lower range of CEFR level A2.
Task types
Speaking tasks can be set in a way that the speakers are asked to produce speech events
independently or collaboratively (Kahn 2008 quoted in Wong & Waring 2010). For
this reason the E8 Speaking Test has been developed in a way that the test takers are
given the opportunity to produce language in a monologue and a dialogue part.
The literature (Brooks 2009, Egyud & Glover 2001, Taylor 2001) discusses various
aspects of the individual and paired peer-approach towards testing and performance
assessment and emphasizes the advantages of the latter as being more natural and
less stressful for the test takers, thus producing better and more elaborated language.
Therefore, the tasks in the E8 Speaking Test have been designed in a way that the
interactional relationship the test takers are engaged in is symmetrical, i.e. the test
takers communicate with each other about familiar topics and the power-distance
relationship between test takers and an adult interlocutor is reduced to a minimum.
The interview
At the beginning of the test an interview serves as a Warm-up with the goal to
break the ice and to make the participants and the interlocutor familiar with each
other. Following standardised instructions the interlocutor asks three to five interview questions to create a friendly conversation between the interlocutor and the two
test takers, similar to the standard teacher-pupil interaction in class the test takers
should be familiar with. The questions in the interview are global ones that will most
likely elicit short answers; questions that require knowledge of the world, embarrassing or ambiguous questions, or yes/no questions are not used. Typical interview
questions are: What's your name? Where do you live? What are your hobbies? When do
you usually get up in the morning? What do you normally do after school? What do you
normally do at the weekend?
In the interview, questions about topics that feature in the monologue or dialogue
part of the prompt set are not permitted to avoid repetition and putting test takers
at an advantage or disadvantage because they could repeat language used previously.
The monologue
In the monologue part each test taker is offered a choice of three topics. The three
topics vary in E8 BIST descriptors and text types. Moreover, they do not provide any
overlap with the second test taker's topics or with the topics used in the interview or
dialogue part.
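The no-overlap requirement for a prompt set can be illustrated with a small check. This is a hedged sketch under simplifying assumptions: topics are represented as plain strings and compared after normalising case and whitespace, and the function names are hypothetical. Deciding whether two differently worded topics overlap in content still requires a human reviewer.

```python
def normalise(topic: str) -> str:
    """Lower-case a topic label and collapse surrounding and inner whitespace."""
    return " ".join(topic.lower().split())


def repeated_topics(*parts: list[str]) -> set[str]:
    """Return topics that occur in more than one part of a prompt set.

    Each argument is the topic list of one part, e.g. the monologue topics of
    test taker A, those of test taker B, the interview, and the dialogue.
    """
    seen: set[str] = set()
    repeated: set[str] = set()
    for part in parts:
        topics = {normalise(t) for t in part}
        repeated |= seen & topics  # topics already used in an earlier part
        seen |= topics
    return repeated
```

For example, repeated_topics(["Pets", "Holidays", "Sports"], ["Music", "Food", "School trip"], ["Hobbies"], ["Party"]) returns an empty set, which indicates that no topic label is reused across the parts.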
Each monologue is triggered by six to eight content points that should provide a
guideline for the test takers. However, they are not restricting, and it is not compulsory to make use of or to cover all of them. That is, the test takers can also follow
their own ideas in the presentation of the selected topic.
Standardised repair questions are provided for all content points and some additional
ones are added. These are used by the interlocutors to support the test takers in case
of breakdown of communication or lack of ideas.
The dialogue
In the dialogue part the two test takers communicate with each other. The interaction is triggered by visual and textual cues that provide ideas for the interaction.
However, these do not restrict the test takers in their freedom to make their own
choices in the elaboration of the given topic.
The dialogue part consists of a short and a long dialogue because certain E8 Standards descriptors suggest a short format, while others lend themselves to be used in
long dialogues (see Construct Space p. 36ff).
In both formats visual stimuli and short verbal prompts (see p. 43ff) such as key
words or question starters are used to trigger interaction about the topic. Additionally, the prompts encourage the test takers to use their own ideas.
The test takers are not bound to make use of the question starters or key words in
the prompt, but successful interaction of the test takers is at the heart of E8 Standards assessment. Therefore, the prompts are considered to be a thought-provoking
medium, while the test takers have the freedom to carry out their own solutions. On
the one hand, this gives the test takers the opportunity to make use of their linguistic
and creative potential. On the other hand, test takers who are used to following
guidelines in their interaction are offered the opportunity to make use of the stimuli
offered by the prompt.
There is no standardised prompting by the interlocutor in the short dialogue because
this would result in the interlocutor taking on a part in the dialogue, which is not
desired. Therefore, if one test taker does not communicate, the interlocutor asks the
other test taker to do so. If this also fails, the long dialogue is started.
In the long dialogue the interlocutors are trained to facilitate the interaction in a
standardised way without being intrusive. Contrary to the monologue, where the
interlocutor asks questions or gives stimuli in cases of breakdown, the interlocutor
remains silent and passes repair question slips to test takers who do not ask questions.
This opens up the opportunity for one test taker to read out the question and for the
other to respond. Ample piloting of repair questions has shown that this is less intrusive than the interlocutor's direct repair questions, which re-direct the interaction
into a conversation between interlocutor and test taker.
Text types
The communicative genres or more precisely the text types used in the E8 Speaking
Test are listed in the Construct Space (see p. 36ff) in alphabetical order, as they cannot
be automatically matched with any particular E8 BIST descriptor or topic. Instead,
they have to be meaningfully selected in task design to match the E8 BIST descriptor, the topic, and the task type.
The following text types are used in the E8 Speaking Test:
Descriptions
Descriptions say what things, people, places, pets, pictures etc. are like. Mostly descriptions follow a typical structure: first they identify the phenomenon and then
they describe it in parts, qualities, and/or characteristics. In most cases, descriptions
will suggest the use of the present tense, adverbs and adjectives, or comparisons to
help picture the person or object, and employ the five senses in saying how something
or someone looks, sounds, feels, smells, or tastes.
Expository discourse
Narratives and stories are predominantly constructed in past tenses because they
usually happened in the past before someone tells them. The tenses used can be
simple past, past continuous tense, and past perfect tense. Narratives or stories often
focus on a series of events that are mostly presented in a linear sequence.
In speaking, narratives and storytelling often use direct speech to make the listeners
feel, think, and share experiences through the real dialogues of the participants. A
lot of direct speech will change the nature of the language in the monologue (e.g.
incomplete sentences, phrases, chunks, tense switches ...).
If storytelling is triggered by pictures, present tenses can also be used.
Personal reports
Personal reports describe the features of events within the experience of the test takers
(e.g. reports about holidays, weekends, sports weeks, excursions, family meetings or
feasts etc.). They generally follow a similar structure (what, when, where, with whom,
why, how) and use facts to explain something or give details about a topic. Moreover,
they can be descriptive. Reports are mainly delivered in the past. If the reports focus
on rituals in the test takers' daily lives, present tenses can also be used.
Personal statements
In personal statements the test takers present themselves; they give reasons, talk
about their plans and/or give explanations for them. The age and the life experience
of the test takers limit the topics that can be matched with this text type to those
referring to future education/job/life/ideal place to live/ideal partner or family/free
time or holiday preferences etc. A personal statement will mostly feature present
tense, future tense, or the conditional.
Argumentative discourse
Traditionally, argumentative discourse is a form of interaction in which the individuals maintain opposing positions. In the context of the E8 Speaking Test, however,
the test takers will most likely share similar opinions. Thus, argumentative discourse
will trigger arguments between equal actors engaged in personal, social interaction rather
than arguments of an abstract or conflicting nature, and it differs from informal discussion in its
more personal content.
Functional discourse
Functional discourse refers to speech acts that engage the test takers in carrying
out concrete social functions such as greeting and departing, expressing feelings like
surprise, joy, regret, interest etc., making arrangements or transactions in shops, post
offices, getting information about travel, asking and telling the way etc.
Thus, the audience of the functional discourse would normally originate in the
public domain (e.g. shop personnel, police, drivers, conductors, waiters etc.).
Although the test takers are familiar with these audiences from carrying out role
plays in English lessons at school, in the E8 Speaking Test they will not take on the
roles of adults. Functional discourse will therefore exclusively feature tasks that ask
two teenagers to carry out social functions.
Informal conversations
In an informal discussion the test takers will present arguments and information
about a familiar topic from different points of view and they may also phrase a
recommendation as to how to solve a problem or react to a certain situation. Informal discussions in the E8 context can only touch the personal or educational domain
of the test takers and will exclusively focus on familiar topics. Informal discussions
differ from argumentative discourse in the level of formality and in product orientation (recommendation, problem solving).
Text types and audience
The text types mentioned in the previous sections will require different audiences to
be addressed. Although EFL lessons offer multiple opportunities to simulate situations from the personal, educational and public domain, the testing situation must
not put test takers at a disadvantage by putting them into roles that are very different
from their range of experience or which might make them feel ashamed or shy and
thus prevent them from showing what they know.
Therefore, the E8 Speaking Test does not go beyond the typical scenarios the test
takers are used to experiencing in EFL education. Moreover, they will not be asked
to take on roles that do not reflect their real age, i.e. they will not be asked to speak
as parents, teachers, ticket clerks etc.
Prompt writing and prompt difficulty
Validating the content of a test must also be concerned with the question of whether the tasks
are a representative sample of what the test takers are familiar with from lessons that
teach speaking and if the difficulty of the tasks is similar.
In order to take care of this aspect, the prompts are exclusively written by practising
qualified English teachers who are also trained as interlocutors and assessors. They
are familiar with the test construct, the theoretical model of speaking, and the test
specifications. Moreover, they apply their experience as experts who have current and
intensive contact with the target group.
Prompt writing is carefully trained and follows guidelines. The teachers collaborate in pairs and produce first drafts, which are screened by tandem pairs. In this way four qualified teachers have given their feedback on a prompt before it is screened by expert E8 BIST trainers, who moderate editing if necessary. The completed prompt sets are pre-piloted with learners of the target group by the authors, who function as interlocutors and assessors in the pre-piloting. During this phase final adjustments to repair questions and prompts can be made. If this is the case, additional screening by the trainers is required.
A second pre-piloting takes place during the second interlocutor/assessor training,
when these prompt sets are used with pupils from a school other than that of the
prompt author. Again, adjustments can be made before these prompt sets are stored
in the BIFIE item bank where they are ready to be piloted a last time under real test
conditions in the year before the actual exam.
Prompt writers are instructed to generate prompt sets that are similar in construct
and ideally identical in the anticipated difficulty for both test takers and in comparison to other prompt sets. However, there are some variables that cannot be
controlled. Test takers who have never encountered the topic or even thought about
it or who do not yet have an opinion about it and have to perform on it in the
course of the test may certainly find the task more demanding than test takers who
have already had experiences with the required content. However, if the appropriate
strategic competences asked for in the task (e.g. describing, turn-taking, questioning
etc.) are available to test takers they can succeed in such tasks, even if they do not
have a wide range of linguistic resources available for the topic.
In Phase 1 the trainees are made familiar with communicative competence in the
CEFR, already mentioned in Chapter 2, and the E8 Speaking Test Specifications,
which will be dealt with in detail in the next chapter (see p. 35ff). To set them up
in their role as assessor, the trainees are acquainted with the construct of the test
and the CEFR scales for the assessment of spoken interaction and oral production
before they study the descriptors of the E8 Speaking Assessment Scale (see p. 42),
after which they assess several examples of video-recorded, benchmarked E8 Speaking Test performances.
Throughout the training assessors are given feedback on their assessor behaviour in
relation to the group and can thus reflect and adjust their assessments towards a more
homogeneous behaviour with the help of anchor performances and justifications
that can be used for individual standardisation practice. Moreover, in the assessment of the E8 Speaking Test multiple ratings will be collected through the online assessment of a representative sample of performances by the whole assessor population, and thus assessor behaviour (harshness or leniency) will be adjusted through multifaceted Rasch analysis.
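The idea of comparing each assessor with the group can be illustrated with a much simpler descriptive statistic than the Rasch analysis named above. The following Python sketch is not a Rasch model. It assumes hypothetical assessor labels, performance identifiers, and band scores, and it estimates each assessor's tendency as the mean signed difference between their bands and the group mean for the same performances. A negative value suggests harshness, a positive value leniency.

```python
from statistics import mean


def severity_estimates(ratings):
    """Return the mean signed deviation from the group mean per assessor.

    ratings maps an assessor label to a dict of {performance_id: band}.
    This is a simplified descriptive stand-in, not a multifaceted
    Rasch analysis.
    """
    # Collect every performance that was rated by at least one assessor.
    performances = set().union(*(r.keys() for r in ratings.values()))
    # Group mean band for each performance.
    group_mean = {
        p: mean(r[p] for r in ratings.values() if p in r)
        for p in performances
    }
    # Mean signed deviation of each assessor from the group mean.
    return {
        assessor: mean(band - group_mean[p] for p, band in r.items())
        for assessor, r in ratings.items()
    }


# Hypothetical data: three assessors rate the same three performances.
ratings = {
    "assessor_a": {"p1": 5, "p2": 3, "p3": 6},
    "assessor_b": {"p1": 4, "p2": 2, "p3": 5},
    "assessor_c": {"p1": 6, "p2": 4, "p3": 7},
}
estimates = severity_estimates(ratings)
```

In this invented data set, assessor_b scores one band below the group mean on every performance and assessor_c one band above, so their estimates have opposite signs.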
To prepare for their future role as an interlocutor the trainees are provided with guidelines for interlocutors and interlocutor behaviour, followed by reflective individual and group analyses of video recordings of perfect and flawed interlocutor behaviour.
In order to practise their dual role as interlocutors and assessors the trainees learn to
set up the seating arrangement according to a standardised plan (see p. 23) and carry
out test simulations with their peers, who provide feedback on individual interlocutor behaviour in group discussions.
Finally, to help them with the tasks they have to carry out in Phase Two, they are
presented with the intricacies of prompt writing discussed in the previous chapter,
and are provided with guidelines on how to go about writing their own prompts.
In the second phase of training the trainees work together in pairs to design and
produce one speaking prompt set (see p. 43ff). To assist them in this task, a second
pair of trainees (tandem pair) moderate and edit the prompt set before it is sent to
the trainers for a final stage of moderation.
Once the prompt set has been passed by the trainers, the trainees carry out trial
speaking tests in one of their schools with eight pairs of fourth-year pupils. A selected number of these speaking performances showing various competence levels are assessed, justified and reflected upon in pairs. At this point the interlocutors' and assessors' experiences with the prompts are discussed and analysed. If trialling has uncovered flaws in the prompts' quality, more screening and editing takes place.
Phase Three: Online Training 2
During this stage of training the trainees assess eight to ten speaking performances
that are made available to them via a secure online platform. The trainees submit
their scores on the benchmarked performances to the trainers, thereby providing
data to determine inter-rater and intra-rater reliability.
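The report does not state which statistic is used to determine reliability. As one possible illustration, the sketch below computes Cohen's kappa, a common chance-corrected agreement measure, between a trainee's bands and the benchmark bands. The function name and the score lists are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two lists of band scores."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of identical bands.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's band frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical band throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical bands for eight benchmarked performances.
trainee = [5, 3, 7, 5, 3, 1, 5, 3]
benchmark = [5, 3, 7, 3, 3, 1, 5, 5]
kappa = cohens_kappa(trainee, benchmark)
```

Applied to a trainee's bands against the benchmark, the function gives an inter-rater figure. Applied to the same trainee's two scorings of the same performances, it would give an intra-rater figure.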
Phase Four: F2F Meeting 2
In this final phase of training the trainees go through a phase of standardisation with
an emphasis on the implementation of the prompt sets in a mock E8 Speaking Test,
referred to as prompt familiarisation, and on interlocutor behaviour. Three or four
prompt sets from within the whole training group and/or the item bank, selected
by the trainers, are pre-piloted with a larger cohort of pupils that the trainees have
not met before to simulate an authentic E8 testing situation. Each trainee receives at
least one opportunity to act as an interlocutor and conduct a speaking test. During
the subsequent tests, the trainees either assume the role of assessors, whereby they
assess several speaking performances, or they observe their peers acting as interlocutors. They thus receive feedback on their interlocutor behaviour and can adjust it if
necessary.
In a future F2F meeting, shortly before the actual E8 Speaking Tests take place, the
trained interlocutors and assessors go through another standardisation and prompt
familiarisation phase.
3.3.3 Physical setting
In addition to the test procedure that is guided by standardised interlocutor behaviour, the physical setting of the E8 standards test has to be standardised in order to
create an environment that will make the results reliable because all test takers are
tested in a very similar set-up.
The tests are carried out at the test takers' school, which provides them with a familiar
environment. The head teachers of the schools are asked to choose rooms that are
well lit, well-aired, friendly, and undisturbed. They are also asked to leave two chairs
outside the testing room for the next test takers waiting for their turn.
The interlocutors arrange chairs and two tables so that they have enough space to
arrange their testing materials and that the test takers sit facing each other and facing
the interlocutor (see Figure 2). In the dialogue part the arrangement should allow for
the test takers to look at each other.
The assessor sits outside this arrangement but must be able to see the test takers' faces.
[Figure 2: seating plan with the positions labelled Candidates, Interlocutor, and Assessor]
While physical/physiological variables⁴ like age and sex can be considered to have
little bearing on performances because the test takers are all of similar age and attending year 8 classes of Austrian schools, cognitive variables like language background, educational background, and background knowledge may have a stronger
diversifying impact on performances.
Affective or situational reactions to test taking such as motivation, physical disposition, as well as factors such as learning strategies and styles, attitude, extroversion, introversion, anxiety, personality, or risk taking (Bachman 1990, Davies et al.
1999, Kunnan 1995) can hardly be controlled in a testing situation. Nevertheless,
the following insights from research have been taken into consideration in E8 Standards Testing: Berry (1994) researched the effect of introversion and extroversion on
paired speaking test performance. The results suggest that introverts perform better
in homogeneous pairs and in tests with interlocutors than if paired up with extrovert
test takers. Luoma (2004) suggests that test takers who know each other very well
tend to speak less than those who are not too familiar with each other and that
acquaintanceship has a stronger impact on performance than a mismatch in proficiency level. For this reason, it seemed appropriate that the test takers and/or their
teachers should be allowed to choose the peer partners for the E8 Speaking Test in
order to rule out disadvantages caused by individual characteristics discussed above.
We thus expect the effects of personality, culture-specific variables, proficiency levels,
and acquaintanceship to be reduced to a possible minimum.
As much as introversion may have an impact on a test taker's performance, lack of motivation may also result in scores that do not match the actual ability of a test taker. As the E8 Standards Test is a low-stakes test with no direct bearing on the test takers' school career, lack of interest in a good performance can prevent test takers
from showing what they actually can do. In order to avoid the undesired situation
where examinees do not approach the testing situation in the expected manner and
thus threaten the validity of results (Henning 1987), it must be the aim to take any
possible measure to foster motivation and to avoid hostile or negative reactions to the
content and format (Fulcher 2003). This can be achieved by making the test takers
familiar with the content and the format.
At this point it has to be acknowledged that most test takers will not have experienced many formal tests in speaking. Apart from rote-learnt role-plays, rehearsed
presentations or book and film presentations, which become part of continuous assessment, teachers hardly ever test speaking. Moreover, teachers do not often assess their pupils' pair work. Therefore, it can be expected that the situation of being tested
in speaking will be a new experience for most of the learners.
However, it is hoped that teachers will make use of published testing materials in order to support their pupils' familiarity with the test format and the test procedure. Prompts, video-recorded pilot tests, and the instructions used by the interlocutors are available on the BIFIE homepage, and it is therefore possible for teachers to show and practise the testing situation with the learners (available at: https://www.bifie.at/node/1821).
Moreover, test takers who have attended eight years of education in Austria and used
the accredited course books will have encountered similar speaking tasks and should
have reached a level of linguistic competence of at least A2 according to the CEFR in
oral production and spoken interaction in the FL as suggested in the curriculum. In favourable situations they may even have reached CEFR level B1. Additionally, the test takers' sociolinguistic competence should cover the linguistic markers of social relations and politeness conventions asked for in the E8 Standards Test.
⁴ Provisions for test takers with special needs are still to be developed (e.g. instruction cards in large fonts, technical support for the hearing impaired etc.).
More culture-specific aspects of sociocultural and sociolinguistic competence, which are part of EFL and certainly important, go beyond the possibilities of standardised testing and are therefore not a requirement for the E8 Standards Test.
Like linguistic competence, making conscious and strategic use of pragmatic competence (discourse competence, functional competence, and design competence) is
required by the curriculum and thus the test takers are expected to be competent in
engaging in interactive speaking tasks that ask them to carry out various communicative functions. Moreover, the test takers will most likely have held planned presentations in their EFL lessons and thus show design competence.
Finally, the presence of a trained person who encourages communication in a standardised way is the big advantage of the E8 Speaking Test. While in all other skills
the test takers are left alone with the task, speaking provides the opportunity for
the interlocutor to promote participation. Moreover, the contribution of the paired
set-up to motivation and participation generally has a positive impact on the performance.
familiar with the construct and the scales making use of benchmarked performances
and written justifications which exemplify consistent assessment based on set criteria.
The discussion and comparison of written justifications by the assessors and the
benchmarks are important in two ways: firstly they help the assessors adjust to the
standardised assessment scales and the common understanding of their bands and
secondly they unfold possible implicit criteria that may have been applied by the
assessors but that are not stated in the scales (e.g. "In my class this would be a top performance, so this must be a high band ..."). Identifying the implicit criteria they may
have been using can help the assessors refine their understanding and application of
the scales for future assessments.
In this way, the justifications of the benchmarked performances demonstrate how
the assessment criteria can be directly related to performance criteria. Moreover, they
exemplify the differences between the categories at certain levels through performances. Thus, the aim of the assessor training is to guide the assessors so that they arrive at independent scores based on the band descriptors within a maximum of +/- one band for a given performance.
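The tolerance of one band can be made concrete with a short Python sketch. It assumes hypothetical lists of an assessor's bands and the corresponding benchmark bands, reports the share of scores that fall within the tolerance, and lists the positions of those that do not. The function name and the data are illustrative only.

```python
def share_within_tolerance(assessor_bands, benchmark_bands, tolerance=1):
    """Share of scores that differ from the benchmark by at most `tolerance`."""
    if len(assessor_bands) != len(benchmark_bands) or not assessor_bands:
        raise ValueError("band lists must be non-empty and of equal length")
    hits = sum(
        abs(a - b) <= tolerance
        for a, b in zip(assessor_bands, benchmark_bands)
    )
    return hits / len(assessor_bands)


# Hypothetical bands for eight benchmarked performances.
assessor_bands = [5, 3, 7, 4, 6, 2, 5, 3]
benchmark_bands = [5, 4, 7, 6, 6, 1, 5, 3]

share = share_within_tolerance(assessor_bands, benchmark_bands)

# Positions of performances scored more than one band away from the benchmark.
outliers = [
    i
    for i, (a, b) in enumerate(zip(assessor_bands, benchmark_bands))
    if abs(a - b) > 1
]
```

Performances listed in `outliers` would be natural candidates for discussing the written justifications against the benchmark.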
One method of further clarifying the E8 Speaking Assessment Scale and of raising the level of awareness and recall shortly before the test is the use of anchor performances. Anchor performances are a set of carefully selected benchmarked responses that illustrate the nuances of the categories. These will be presented at the standardisation meeting prior to the mock test before the actual test and made available on a secure platform so that the assessors may refer to the anchor performances shortly before the assessment process. This should reinforce the standardisation and give the assessors the chance to recall the anchor performances and the assessments when this information is really needed. This opportunity for individualised recall is
important because the organisation of the E8 Speaking Test requires a time-slot of
several weeks within which the assessors may have to operate.
3.5.1 The Assessment Scale
In order to make the criteria of the assessment scale result in valid interpretations of a
response it is necessary for the criteria to be related to the purpose of the assessment.
Therefore, the criteria should be defined in a way that any given response would
receive the same assessment regardless of who the assessor is or when the response is
assessed.
Accordingly, the descriptors of the analytic assessment scale that assessors work with in the context of the E8 Speaking Test have been carefully designed and linked with the construct to report on the test takers' abilities in four dimensions (see p. 41f).
The E8 Speaking Assessment Scale is applied by the assessors in situ, i.e. the assessment has to be carried out during the test takers' performances. This constraint was considered in the initial development of the E8 Speaking Assessment Scale and taken seriously in the adaptations of the scale during piloting and benchmarking, which resulted in a shorter and more user-friendly version.
To provide feedback on the test takers' communicative competence, the most significant competences needed for speaking as defined in the test construct (see p. 36) are assessed in the following dimensions: task achievement & communication skills, clarity & naturalness of speech, grammar, and vocabulary. Due to the above-mentioned constraint of in situ assessment, the three parts of the E8 Speaking Test (the monologue, the short dialogue, and the long dialogue) are assessed holistically, i.e. the test takers are awarded one score on each of the four dimensions. The following interpretative descriptions of the four dimensions of the E8 Speaking Assessment Scale
add to the reliability of results in the sense that the judgements are based on defined
categories and band descriptions.
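The resulting score record, one band per dimension for the whole performance, can be sketched as a small Python data structure. The class and field names are hypothetical, and the band range of 1 to 7 is an assumption based on the seven bands mentioned for task achievement & communication skills. The actual scale may be numbered differently.

```python
from dataclasses import dataclass

# Assumption: bands are numbered 1 to 7.
VALID_BANDS = range(1, 8)


@dataclass(frozen=True)
class SpeakingScore:
    """One holistic band per dimension for a whole test performance."""

    task_achievement_communication: int
    clarity_naturalness: int
    grammar: int
    vocabulary: int

    def __post_init__(self):
        # Reject any band outside the assumed range.
        for dimension, band in vars(self).items():
            if band not in VALID_BANDS:
                raise ValueError(f"{dimension}: band {band} is outside 1-7")


# Hypothetical score for one test taker across all three test parts.
score = SpeakingScore(
    task_achievement_communication=5,
    clarity_naturalness=4,
    grammar=5,
    vocabulary=6,
)
```

Because the three test parts are assessed holistically, the record holds four values per test taker rather than four values per part.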
Task achievement & communication skills
In task achievement and communication skills the information the test takers provide (propositional precision, in all parts), the quality of the narrative (thematic
development, primarily in the monologue part) as well as the ability to interact with
a partner (turntaking, primarily in the dialogue part) are assessed.
Propositional precision refers to the information that is communicated in the performance as well as to the successful completion of a communicative speech act. In
propositional precision we ask ourselves: What is the information we get like? Is it
detailed, concrete, limited, or more or less non-existent?
In the monologue part the test takers are asked to give information about a given
topic. In addition they are provided with content points. Thematic development
primarily refers to the monologue part. It deals with the way the speaker develops a
speech act with respect to the given theme. It is to do with the elaboration of ideas
and the narration. If individual ideas (main points) are expanded with relevant detail, thematic development has been very successful.
At the other end of the scale, in basic statements at word or word group level, themes
cannot be developed.
From the linear design of the prompts we can expect the test takers to address the
content points in the sequence that they appear on the prompt cards. However, the
order is not set and therefore test takers may incorporate them into their spoken
production in a random order. The content points are to be seen as guiding points
for the test takers, to help them to speak freely for two minutes about their chosen
subject, but they are not mandatory and test takers are not penalised for not addressing them. The assessor must concentrate on the overall amount of information that
the test taker is able to pass on and its quality and evaluate it according to the assessment scale. We expect test takers to talk about the topic they have chosen and to give
information that is relevant to the topic. Test takers may even choose one content
point only, but if they give varied information on it they can still reach high bands.
The repair questions provide a guideline of what we would expect the test takers to
talk about in a sufficiently solved task. As the test takers are supposed to produce a
flow of discourse in the monologue section, and not interact with the interlocutor,
it will not be possible to assess the true level of the candidates' communication skills here. If, however, they do interact by asking for the translation of a German word into English (e.g. "What is 'Schläger' in English?"), they should receive the support necessary to carry on. What we can expect in this section is the use of discourse markers such as "well", "like", "actually", "generally", "of course", or "you know", which will reflect the level of the test takers' competence in communication skills.
In turntaking we assess the test takers' ability to interact with each other. This can be seen as the ability to begin, maintain, and end a conversation. The test takers may use prefabricated chunks, stock phrases, discourse markers, or formulaic language in doing so. If the test situation does not allow for beginning or ending the conversation, the lack of evidence for this does not necessarily lead to downgrading. If effective turntaking has been found in the conversation, high bands can still be awarded.
In the short dialogue we can expect the test takers to exhibit turntaking skills in order to achieve the task, which may be an invitation, an excuse, a purchase, a decision-making process (e.g. which film to watch) etc.
We can thus expect the test takers to show, in a guided way, the extent to which they
are able to initiate, maintain and close a conversation and how effective they are
when doing this. Good speakers will have no problems formulating the necessary
questions to accomplish the task. Utterances containing suggestions (e.g. "Would you?"), agreement (e.g. "Me too."), or disagreement (e.g. "No, I don't.") and their quality will also indicate communicative competence. Other indicators of communicative competence will be the use of stock phrases such as "of course" and "not at all" and the frequency of their use.
In the short dialogue the test takers are asked to accomplish a functional discourse.
The detail of information may be limited by the task; therefore, the successful completion of the communicative function is the element we are assessing. The functional
aspect of the short dialogue requires the test takers to come to a defined result.
Bearing this in mind, it is likely that the test takers will refer to all the points in the
prompt, because they should succeed in fulfilling the function that is required.
The long dialogues are guided by question prompts or key words that serve the same
function as the content points in the monologue. They are stimuli but not compulsory items to be dealt with. If the test takers develop a conversation about the topic
following their own ideas the task can still be rich in the quality of information we
get.
Unlike the linear design of the monologue, the long dialogue prompt is cyclical and there is no telling which content point (if any) a candidate will address first. As in the monologue section, it is not mandatory to address all the content points. At E8 we can expect good speakers to discuss many of the points, perhaps even all of them, and to discuss them in some detail. However, because the test takers should interact with each other, and may in some cases even interrupt each other, it is less likely that they will have the opportunity to provide a great deal of detailed information before they are confronted with another point by their partner.
As soon as one candidate has started the conversation and the other candidate has
replied, the decision to initiate, maintain, and end parts of the conversation lies in
the hands of the individual candidates, unless there is a marked imbalance or breakdown of communication and the interlocutor intervenes. Speakers with good communication skills will try to provide a good balance between their verbal input using learnt phrases such as "I think" or "In my opinion" etc. We can expect good speakers to use phrases such as "Me too", "I agree/disagree", "Really?", or "Cool" when reacting to their partner's utterances. And finally, stock phrases such as "And what about you?", "What do you think?", or "What's your opinion?" will be employed by good speakers to encourage verbal output from their conversation partners.
Generally speaking the prompts, content points, key words, and question prompts
are there to make the test takers talk. If they find their own ways of solving the task
and the information we get is appropriate and rich this is equally valuable.
In the assessment of task achievement & communication skills the test takers are
allocated one of seven bands.
Band 7 performers give detailed information and are able to expand main points by
relevant new elements. They are effective in turntaking.
Band 5 performers give concrete information that is clear and they develop a straightforward narrative in the monologue part. They achieve basic turntaking and can initiate, maintain and close a conversation using stock phrases.
Band 3 performers give limited information and in the monologue they give a simple
list of points at sentence or word-group level. They can ask questions effectively
in the dialogue parts. The test takers may partly rely on the interlocutor's support
through repair questions to keep going or to come up with some more information.
Band 1 performers give very little information and cannot go beyond simple statements or negations on word or word-group level in the monologue part. This will
mostly result from the fact that they cannot develop a narrative independently and
rely on the interlocutor's repair questions to come up with some information. They
make attempts to ask questions (e.g. raising intonation) but are not effective in
questioning. The interlocutor may have to use the repair question cards to keep the
dialogue going.
Clarity & naturalness of speech
The performances of band 5 speakers show some degree of fluency, although some
pausing for lexical or grammatical planning can be necessary. The speakers produce
connected stretches of language that are long enough for pronunciation and intonation to sound intelligible, although sometimes with a foreign accent. At this level
some mispronunciations that do not impair communication can be tolerated.
A band 3 performance is interrupted by noticeable pauses, hesitations and false
starts, which sometimes cause breakdown of communication. The contributions are
short and intelligibly pronounced, too short, however, to develop natural intonation.
Foreign accent or mispronunciation may sometimes impair communication.
In a band 1 performance the speaker is very hesitant which frequently causes breakdown of communication. This may not necessarily be caused by pronunciation
problems, but the very short and isolated utterances or frequent mispronunciations
may either not allow for an evaluation of pronunciation or make it hard for native
speakers to understand the message.
Grammar
The scale for grammar comprises descriptors for range, control, and the clarity of
the message. Therefore, the assessors evaluate the ability to make use of a range of
grammatical structures, the level of their accuracy as well as their impact on the
message. The focus is on grammatical forms that create meaning and that are reasonably correct to accomplish successful communication. In addition, the assessment
of grammar in the E8 Speaking Test considers the nature of grammar in unplanned
speech (see p. 11).
Although there is some planning time, speech production in the E8 Standards Test
takes place in real time and is therefore considered to show the characteristics typical
of unplanned speech. Thus, the performances are expected to be linear and the test
takers will mostly use an add-on strategy of stringing short idea units together. While
we generally expect complete sentences in the monologue, the dialogues will primarily feature incomplete sentences, word groups, short phrases, or chunks. We have to acknowledge that incomplete utterances ("Could be"), ellipsis ("Sounds like a good idea"), syntactic blends (utterances that blend two grammatical structures as in "I've been to London last year"), or vague language ("kind of machine") are natural. Moreover, present, simple, or active verb forms, "will", "would", "can", personal pronouns, and determiners are frequent; past forms, perfect forms, and the passive are rare.
In the context of E8 Speaking, grammatical range must be seen in relation to the
above-described nature of grammar in unplanned speech and the standardised tasks
of the E8 Speaking Test. On the one hand, we will expect the test takers to use structures that are meaningfully elicited by the task. On the other hand, spoken language
produced in real time has its special features. The speaking prompts focus exclusively
on familiar topics and have been designed in a way that all ability levels have a good
chance to succeed in the speech act. Thus, they are as straightforward in their set-up
as possible. However, this does not suggest that the response cannot exceed the complexity of the stimulus. Even if a task is simple in nature, we expect differentiation
in grammatical forms or sentence types. Verbs, for example, can be modified, mark
aspect, and determine various types of sentence function such as statement, question, negation, command/directive, or exclamation.
In the E8 Speaking Test range overrules accuracy in the sense that rich grammatical range through risk taking is encouraged, while inaccuracies that do not impair
meaning play a minor role. The more varied the grammatical range, the higher the
band. Risk taking which results in rich structures, but reduced control, does not
automatically lead to downgrading.
Local errors that do not hinder communication will not cause downgrading unless
their frequency impairs the message. Only global errors that interfere with the comprehensibility of the text will result in downgrading or the placement of a text at a
low band.
Test takers are encouraged to make use of their full potential and the more creative
the structural features they show, the better. Nevertheless, the use of variation should
not be exaggerated either. The tasks suggest certain scenarios that require special
structural solutions. These should produce authentic and natural variation, but not
artificial language.
The placement of a performance at a certain band reflects the range of grammatical
structures and the level of their correctness within a meaningfully and successfully
accomplished communicative task.
The monologues are designed in a way that test takers at A2⁵ or B1⁶ level have a good
chance to succeed and demonstrate their grammatical range appropriately. Short
dialogues are meant to be A2 tasks and the range of grammatical structures that is
likely to be elicited in such tasks comprises structures typically mastered at A2 level.
Long dialogues have the potential to elicit B1 language and, as a consequence, also
grammatical structures representative of that level.
Band 7 performances feature good grammatical range that creates meaning and
natural language within the framework of the task. The speaker varies the grammatical structures the prompt elicits and may occasionally go beyond the obvious and
expected. However, any enhancement should not make the message sound unnatural
or result in exaggeration of grammatical structures (range for the sake of range). In
addition to good range a relatively high degree of grammatical control is expected. A
few inaccuracies can occur but they will not impair communication.
Band 5 performances show sufficient range of grammatical structures. Sufficient range is achieved if the speaker makes enough use of the prompt's structural features to make the required communication successful and if the grammatical forms create
appropriate meaning. Occasional inaccuracies that can impair communication can
be tolerated.
Band 3 performances feature a limited range of simple grammatical structures. This
means that the grammatical structures are just enough to achieve successful communication. Mostly they are very simple, repetitive, and hardly varied. Performances at
band 3 can be frequently inaccurate and may show basic mistakes. However, these
mistakes generally do not cause breakdown of communication.
Band 1 performances feature an extremely limited range of simple structures. This
usually forces the speaker to compromise the message regarding meaning, content,
and naturalness of language. Extremely limited range results in structures that are
repetitive and follow very simple Subject-Predicate-Object sentence patterns. The
structures hardly go beyond the learnt repertoire of beginners. In addition to structural
⁵ For an inventory of grammatical areas at A2 level see KET Handbook, p. 89. Available at: http://www.exams.ru/docs/ket_handbook.pdf.
⁶ For an inventory of grammatical areas at B1 level see PET Handbook, p. 78. Available at: http://www.exams.ru/docs/pet_schools_handbook.pdf.
To assess vocabulary in the E8 Speaking Test, the assessors look at content words
(nouns, full verbs, adjectives, adverbs), collocations and chunks of language that a
speaker uses to fulfil a communicative task. They assess range, i.e. the lexis that creates
meaning and accomplishes successful communication, and control, i.e.
the level of accuracy. In doing so, the assessment of vocabulary, like the assessment of
grammar above, considers the nature of lexis in unplanned speech (see p. 11).
Vocabulary range refers to the breadth of vocabulary the speakers use in their performances. In the E8 Speaking context, range must be interpreted in relation to
the prompt, as the assessors can assess only the vocabulary actually elicited by the
prompt, and to the constraints that real-time performances impose.
Although the notion of vocabulary items is not limited to single words but rather
stretched to include lexical phrases, formulaic language, collocations, discourse
markers, and chunks, which provide good opportunities for speakers to show what
they know, we have to acknowledge the fact that the number of words a language
learner at beginner level needs to control in speaking (and listening) is fifty percent
smaller than in writing (and reading) (Thornbury & Slade 2006, Thornbury 2009).
However, even if the E8 Speaking tasks are simple in nature we may expect differentiation within choice of lexical elements. For example, if a task asks for a narrative
description about the first few days at a new school, the oral production will primarily contain words related to school, teachers, subjects, new friends etc., which,
however, can be varied and modified. Equally, spoken interaction can be varied by
the use of stock phrases and well-placed discourse markers. Although the prompt
language is as simple as possible, speakers may well exceed the prompt stimulus in
their performance.
It is not enough for a speaker to use a large number of different words in a performance to achieve a high band in assessment. The words a speaker chooses must be
relevant and appropriate to the topic and used in such a way that messages are communicated meaningfully. A top speaker will use vocabulary that is generally accurate
enough to formulate even more complex ideas with clarity. Speakers who stay in absolutely safe language areas (e.g. language picked up in years one and two) and avoid
taking any risk produce less evidence of mistakes. However, it is E8 policy to encourage
the test takers to venture out of their safe language zone by rewarding risk taking to
communicate successfully.
In the assessment of vocabulary the test takers are allocated one of seven bands.
Oral performances that show a good range of vocabulary at band 7 contain a good
selection of content words and phrases that demonstrate that the speakers are able to
express clear and precise ideas and occasionally even vary formulations so as not to
appear repetitive. We may well expect the occasional expression to stand out and
exceed what we typically expect from test takers at this level.
Band 5 performances contain a sufficient range of mostly high-frequency words that
again meet the need to communicate clear ideas and are generally used accurately.
There may be some occasional mistakes, particularly when the speaker is trying to
communicate a more complex idea.
• appropriate response to the task, the adequate use of devices that create coherence and cohesion characteristic of oral communication, and turntaking (task achievement & communicative skills)
• the ability to produce clear and natural speech by using standard pronunciation and stress and by producing fluent utterances (clarity & naturalness of speech)
• the test takers' linguistic competence demonstrated in the choice of vocabulary that has a certain range and is accurate, and the adequate use of a range of grammatical structures reflecting the nature of lexis and grammar in unplanned speech (grammar; vocabulary)
Moreover, the Construct Space, which is to be used to construct tasks, has to be
specified (see Table 14, pp. 36–39). It lists the E8 BIST, the topics from the ANC,
the spoken text types, the speaking purpose/communicative functions, the context/
audience, and the CEFR descriptors that the E8 BIST can be linked with.
MONOLOGUE PART 1

Columns: Prompt Type; Topic Area; Spoken Text Types; Speaking Purpose / Communicative Function; Primary Context / Audience; E8 BIST descriptor (Pupils can); CEFR Descriptor

Spoken Text Types:
• Narrative or story (true or invented)
• Personal report
• Personal statement
• Expository discourse
• Description

Speaking Purpose / Communicative Function:
• To describe or compare objects/people/places
• To describe dreams/hopes/plans/ambitions/events/activities/reactions
• To express feelings/hopes
• To give reasons/explanations
• To relate a narrative
• To report about events/personal experiences/topics
• To (re)tell a story

Primary Context / Audience:
• Educational: teachers, classmates etc.
• Personal: family, friends etc.
Testing Speaking for the E8 Standards
MONOLOGUE PART 2

Columns: Prompt Type; Topic Area; Spoken Text Types; Speaking Purpose / Communicative Function; Primary Context / Audience; E8 BIST descriptor (Pupils can); CEFR Descriptor

Spoken Text Types:
• Narrative or story (true or invented)
• Personal report
• Personal statement
• Expository discourse
• Description

Speaking Purpose / Communicative Function:
• To describe or compare objects/people/places
• To describe dreams/hopes/plans/ambitions/events/activities/reactions
• To express feelings/hopes
• To give reasons/explanations
• To relate a narrative
• To report about events/personal experiences/topics
• To (re)tell a story

Primary Context / Audience:
• Educational: teachers, classmates etc.
• Personal: family, friends etc.
SHORT DIALOGUE

Columns: Prompt Type; Topic Area; Spoken Text Types; Speaking Purpose / Communicative Function; Primary Context / Audience; E8 BIST descriptor (Pupils can); CEFR Descriptor

E8 BIST descriptor (Pupils can):
• make simple arrangements (A2)
• cope with familiar everyday situations, e.g. conduct conversations in shops, restaurants and at counters (A2)

Spoken Text Types:
• Functional discourse
• Informal conversation

Speaking Purpose / Communicative Function:
• To agree/accept/disagree
• To ask for/express preference
• To ask for/give information
• To ask for/offer help/attention
• To express feelings/attitudes/opinions
• To greet/depart
• To initiate/maintain/close a conversation
• To invite/request to join
• To request action
• To state ignorance
• To suggest
• To sympathise

Primary Context / Audience:
• Educational: teachers, classmates etc.
• Personal: family, friends etc.
LONG DIALOGUE

Columns: Prompt Type; Topic Area; Spoken Text Types; Speaking Purpose / Communicative Function; Primary Context / Audience; E8 BIST descriptor (Pupils can); CEFR Descriptor

Spoken Text Types:
• Informal discussion
• Informal conversation
• Argumentative discourse

Speaking Purpose / Communicative Function:
• To agree/accept/disagree
• To ask for/express preference
• To ask for/give information
• To ask for/offer help/attention
• To express feelings/attitudes/opinions
• To greet/depart
• To initiate/maintain/close a conversation
• To invite/request to join
• To request action
• To state ignorance
• To suggest
• To sympathise

Primary Context / Audience:
• Educational: teachers, classmates etc.
• Personal: family, friends etc.
themselves. The interlocutor asks each test taker three to five interview questions.
• In section two, each test taker produces a monologue based on a textual and/or visual stimulus.
• In section three, the two test takers engage in a short and a long dialogue based on textual and visual stimuli.
Section 2: The test takers are given one minute to read the prompts; the speaking
Section 3: The test takers are given oral instructions to carry out the short dialogue, which they start straight away. After the short dialogue (approx. 1–2 minutes) they are given one minute to read the prompts for the long dialogue; the speaking time for the short dialogue is one to two minutes, for the long dialogue it is five minutes (approx. 2.5 minutes speaking time per test taker); 8 minutes altogether.
2 minutes
Interview
2 minutes
1 minute
4 minutes
1–2 minutes
1 minute
5 minutes
1 minute
18 minutes
4.7 Rubrics
All rubrics are in English. However, they must be formulated in language that is well
below the test takers' expected level so that they are easy to understand.
Therefore, they must not exceed CEFR level A2. Test takers must not be put at a
disadvantage because they have difficulty understanding the rubrics.
Rubrics referring to the dialogues need to indicate the reason for communication
and the context/audience. The required length of the speaking activity is indicated
in minutes.
Speaking Assessment Scale (Oct 2011)

Task Achievement & Communication Skills
Band 7:
• detailed information communicated reliably
• description or narrative with main points expanded by relevant, detailed information and examples
• effective turntaking through initiating, maintaining and/or closing discourse, sometimes using stock phrases
Band 5:
• clear and concrete information of immediate relevance with main points communicated comprehensibly
• straightforward description or narrative
• basic turntaking through initiating, maintaining or closing discourse, sometimes using stock phrases
Band 3:
• limited information on familiar and routine matters communicated in a simple and direct exchange
• description or narrative in a simple list of points on sentence or word-group level
• effective questioning in information exchange
Band 1:
• very little information
• attempted questioning to get information in everyday situations
no task achievement

Clarity & Naturalness of Speech
Band 7:
• fluent and spontaneous at a fairly even tempo with natural pauses
• longer stretches of language
• clear, natural pronunciation and intonation
Band 5:
• some degree of fluency with some pausing for repair or grammatical and lexical planning
• connected stretches of language in a connected, linear sequence of points
• clearly intelligible pronunciation and intonation, sometimes with a foreign accent; occasional mispronunciations
Band 3:
• noticeable pauses, hesitation or false starts, sometimes causing breakdown of communication
• short contributions and exchanges linked with some simple connectors
• mispronunciations which sometimes impair understanding
Band 1:
• much hesitation frequently causing breakdown of communication
• very short, isolated, mainly pre-packaged utterances
• frequent mispronunciations; only understood by speakers of English with some effort
no assessable language

Grammar (1)
Band 7:
• good range of structures
• relatively high degree of grammatical control and few inaccuracies which do not impair communication
• message clear
Band 5:
• generally sufficient range of structures
• message clear
Band 3:
• limited range of simple structures
• frequently inaccurate with basic mistakes, generally without causing breakdown of communication
• message usually clear
Band 1:
• extremely limited range of simple structures
no assessable language

Vocabulary (2)
Band 7:
• good range of vocabulary
• formulations sometimes varied to avoid repetition
• generally accurate vocabulary
Band 5:
• sufficient range of vocabulary communicating clear ideas
• occasionally inaccurate vocabulary; major errors possible when expressing more complex ideas
Band 3:
• limited range of vocabulary mostly communicating clear ideas
• frequently inaccurate vocabulary controlling a narrow lexical repertoire
no assessable language

1 Descriptors referring to range and control reflect the features of the task and the nature of grammar and vocabulary in unplanned speech.
2 See above
Model Promptset 04
2011/12
Interlocutor:
Hello, please sit down. I'm ... . I'll do the speaking test with you.
The lady / gentleman in the back is Mrs / Mr ... .
She's / He's listening.
In the first part I will ask you some questions.
Candidate A
What's your name?
How are you today?
Candidate B
And how about you? How are you today?
What's your name?
Candidate A
(Use name), what's your favourite food?
Who cooks it for you?
Candidate B
(Use name), what's your favourite sport?
How often do you practise it?
Time (min:sec)
0:00 Please choose one and read only that topic text carefully.
You have one minute to prepare.
1:00
Candidate A:
Choose one topic and read only that topic text carefully.
3:00
(After 2 minutes)
Could you finish, please? /
Thank you, [Candidate A] (use name).
[Candidate B] (use name), would you start, please?
Candidate B:
Choose one topic and read only that topic text carefully.
5:00
(After 2 minutes)
[Candidate B] (use name), could you finish, please? /
Thank you, [Candidate B] (use name).
We'll now do part three.
Interlocutor:
You will now have a conversation together.
You are at the kids' flea market. Here are your cards. (Offer
cards and allow the candidates to have a look at the cards for 10
seconds before you carry on.)
[Candidate A] (use name), you go shopping to the flea
market and [Candidate B] (use name), you are selling your
things.
(If necessary) [Candidate A] (use name), please start.
Candidate A:
You BUY.
Candidate B:
You SELL.
Time (min:sec)
0:00
Follow trends?
Buy modern clothes? Why (not)?
Fashion and trends and you: What? Why?
What's in? What's out? Why?
Trends teenagers like/do not like?
What to do with things that are out?
Money for buying modern things?
Positive/negative things about trends?
Positive/negative things about fashion?
YOUR OWN IDEAS
http://en.fotolia.com/id/1814695
5 Washback
Since the publication of Alderson and Wall's (1993) 15 washback hypotheses,
the impact of testing on teaching/teachers and learning/learners has been widely
acknowledged. If we consider teaching and learning closely linked to curriculum,
course design as well as material production, the effects of testing on those also have
to be taken into account.
Although the E8 Standards Test is a low-stakes test that does not have any
gatekeeping function, it is expected that it will have an impact on the teaching and
learning of speaking in lower secondary foreign language education.
In general the test, together with the implementation of E8 Bildungsstandards in
2009 and the revision of the curriculum for modern foreign languages, should already have changed the teaching of speaking. In accordance with the CEFR
(Council of Europe 2001), the skill of speaking comprises the two components of oral
production (Council of Europe 2001, p. 58ff) and spoken interaction (Council
of Europe 2001, p. 73ff), which should be taught and assessed equally intensively and in
fair proportion to the other three skills. Thus, these official documents place more attention on
the role of speaking in teaching EFL and on its contribution to the pupils' final grade;
whether its role has also been strengthened in actual
teaching has yet to be shown.
Moreover, the implementation of E8 Bildungsstandards has been supported by official institutions like BIFIE and ÖSZ as well as publishers and course book writers,
who have reacted to the new requirements by offering on-line and printed
teaching and training materials that support teaching and assessing speaking, so that
the format and the activity types of the E8 Standards Test have found their way into
these materials and some course books.
Finally, sample test papers and video-recorded performances are available on-line,
which give teachers the opportunity to familiarise their pupils with the task-specific skills required by the E8 Standards Test format. Additionally, they provide
a guideline on how teachers could administer speaking tests. In order to give
teachers the opportunity to become familiar with the assessment of speaking performances, in-service training should be offered. It is also hoped that more materials
that support the teaching and learning of oral production and spoken interaction in
accordance with E8 Bildungsstandards and E8 Standards Tests will be published in
the future, which would help create the desired positive washback on teaching and
learning by making the learners familiar with the instructions and the task
types.
Bibliography
Alderson, J.C. & Wall, D. 1993. Does washback exist? In: Applied Linguistics, Vol.
14, No. 2, pp. 115–128.
Alderson J.C., Clapham C. & Wall, D. 2004. Language Test Construction and
Evaluation. Cambridge: Cambridge University Press.
Bachman, L. F. 1990. Fundamental Considerations in Language Testing. Oxford:
OUP.
Bachman, L. F. & Palmer, A. 1996. Language Testing in Practice. Oxford:
OUP.
Berry, V. 1994. The Assessment of Spoken Language under Varying Interactional
Conditions. Washington D.C.: ERIC Document ED386065. Available online:
http://eric.ed.gov/PDFS/ED386065.pdf.
Bmukk, 2009a. Verordnung: Bildungsstandards im Schulwesen. Available at:
http://www.bifie.at/sites/default/files/VO_BiSt_2009-01-01.pdf
Bmukk, 2009b. Verordnung: Bildungsstandards im Schulwesen, Anlage. Available
at: http://www.bifie.at/sites/default/files/VO_BiSt_Anlage_2009-01-01.pdf
Bmukk, 2009c. Lehrplan Lebende Fremdsprache. Available at:
http://www.bmukk.gv.at/medienpool/17135/lp_hs_lebende_fremdsprache.pdf
Brock, R., Horak, A., Lang-Heran, H., Moser, W., Schatzl, Z., Schlichtherle, B. &
Schober, M. 2008. Leistungsfeststellung auf der Basis des Gemeinsamen Europäischen Referenzrahmens für Sprachen (GERS). Praxisreihe: Heft 8. Graz: ÖSZ.
Brooks, L. 2009. Interacting in pairs in a test of oral proficiency: Co-constructing a
better performance. Language Testing, 26(3), 341–366.
Brown, K. 1999. Developing critical literacy. Sydney, Australia: National Centre for
English Language Teaching and Research.
Brumfit, C. J. & Johnson K. (eds.) 1998. The Communicative Approach to Language Teaching. Oxford: Oxford University Press.
Canale, M. & Swain, M. 1980. Theoretical bases of communicative approaches to
second language testing and teaching. Applied Linguistics, 1(1): 1–47.
Carter, R. & McCarthy, M. 2006. The criteria for a spoken grammar. In:
McCarthy, M. 2006. Explorations in Corpus Linguistics. Cambridge: CUP, pp.
27–52.
McCarthy, M. 2006a. Explorations in Corpus Linguistics. Cambridge: CUP. Available at: http://www.cambridge.org/other_files/downloads/esl/booklets/McCarthyCorpus-Linguistics.pdf
McCarthy, M. 2006b. Fluency and Confluence: What fluent speakers do. In:
McCarthy, M. Explorations in Corpus Linguistics. Cambridge: CUP, pp. 1–6.
Csépes, I. & Együd, G. 2004. Into Europe. The Speaking Handbook. Budapest:
The Teleki László Foundation.
Council of Europe (Ed.) 2001. Common European Framework of Reference for
Languages: Learning, Teaching, Assessment. Cambridge: Cambridge University
Press.
Davies, A., Brown, A., Elder, C., Hill, K., Lumley, T. & McNamara, T. 1999.
Dictionary of language testing. Cambridge: CUP.
Ebsworth, M. 1998. Accuracy Vs. Fluency: Which Comes First in ESL Instruction?
ESL Magazine. 1:2, 24–26. March/April 1998.
Együd, G. & Glover, P. 2001. Oral Testing in pairs: A secondary school perspective.
ELT Journal, 55(1), 70–76.
Fulcher, G. 2003. Testing Second Language Speaking. London: Longman.
Hanny, R. J. 2000. Assessing the SOL in classrooms. College of William and Mary.
Available at: http://www.wm.edu/education/SURN/solass.html
Henning, G.1987. A Guide to Language Testing. Cambridge, MA: Newbury House.
Hymes, D. 1972. On Communicative Competence. In J.J. Gumperz & D. Hymes
(eds.), Sociolinguistics. Harmondsworth: Penguin Books, pp. 269–293.
Hymes, D. 1974. Foundations of Sociolinguistics: An Ethnographic Approach. Philadelphia: University of Pennsylvania.
Johnson, K. & Johnson H. (eds). 1999. Encyclopedic Dictionary of Applied
Linguistics. Malden/Oxford/Victoria: Blackwell Publishing.
Kahn, G. 2008. The social unfolding of task, discourse, and development in the
second language classroom. Unpublished doctoral dissertation. Teachers College:
Columbia University.
Kerlinger, F.N. 1973. Foundations of Behavioral Research. New York: Holt, Rinehart & Winston.
Krashen, S. D. 2003. Explorations in Language Acquisition and Use, Portsmouth
NH: Heinemann.
Krashen, S. D. & Terrell, T. D. 1988. The Natural Approach. New York: Prentice-Hall.
Kunnan, A. J. 1995. Test taker characteristics and test performance. A structural
modeling approach. Cambridge: Cambridge University Press.
Lado, R. 1961. Language Testing. London: Longman.
Luoma, S. 2004. Assessing Speaking. Cambridge: CUP.
Moskal, B. M. 2000. Scoring rubrics: What, when and how? Practical Assessment,
Research & Evaluation, 7 (3) Available at: http://pareonline.net/getvn.asp?v=7&n=3
Richards, J.C. 2008. Moving beyond the Plateau: From Intermediate to Advanced
Levels in Language Learning. Cambridge: CUP. Available at: http://www.cambridge.org/other_files/downloads/esl/booklets/Richards-Beyond-Plateau.pdf.
Taylor, L. 2001. The paired speaking test format: recent studies. Research Notes, 6,
15–17. Cambridge: University of Cambridge ESOL.
Thornbury, S. 2009. How to Teach Speaking. Harlow: Pearson Longman.
Thornbury, S. & Slade, D. 2006. Conversation. From Description to Pedagogy.
Cambridge: Cambridge University Press.
Widdowson, H.G. 1978. Teaching Language as Communication. London: OUP.
Widdowson, H. G. 1983. Learning Purpose and Language Use. Oxford: OUP.
Wilkins, D.A. 1976. Notional Syllabuses. London: OUP.
Wong, J. & Waring, H.Z. 2010. Conversation Analysis and Second Language
Pedagogy. New York : Routledge.
Appendix
Pupil information and interview guide
Oral information for the pupils before the pre-piloting of the
prompts as part of the interlocutor/assessor training
I would like to ask you a few more things about the four tests. Your answers are important for
us to know whether the tests still need to be changed or whether we
can carry on working in this way.
1.
2.
3.
Thank you for your help with this pre-piloting of the E8 Standards Speaking Tests.