
Inside the Black Box: Raising Standards through Classroom Assessment

Author(s): Paul Black and Dylan Wiliam


Source: The Phi Delta Kappan, Vol. 80, No. 2 (Oct., 1998), pp. 139-144, 146-148
Published by: Phi Delta Kappa International
Stable URL: http://www.jstor.org/stable/20439383

Inside the Black Box: Raising Standards Through Classroom Assessment

BY PAUL BLACK AND DYLAN WILIAM

Firm evidence shows that formative assessment is an essential component of classroom work and that its development can raise standards of achievement, Mr. Black and Mr. Wiliam point out. Indeed, they know of no other way of raising standards for which such a strong prima facie case can be made.

PAUL BLACK is professor emeritus in the School of Education, King's College, London, where DYLAN WILIAM is head of school and professor of educational assessment.

Raising the standards of learning that are achieved through schooling is an important national priority. In recent years, governments throughout the world have been more and more vigorous in making changes in pursuit of this aim. National, state, and district standards; target setting; enhanced programs for the external testing of students' performance; surveys such as NAEP (National Assessment of Educational Progress) and TIMSS (Third International Mathematics and Science Study); initiatives to improve school planning and management; and more frequent and thorough inspection are all means toward the same end. But the sum of all these reforms has not added up to an effective policy because something is missing.


Learning is driven by what teachers and pupils do in classrooms. Teachers have to manage complicated and demanding situations, channeling the personal, emotional, and social pressures of a group of 30 or more youngsters in order to help them learn immediately and become better learners in the future. Standards can be raised only if teachers can tackle this task more effectively. What is missing from the efforts alluded to above is any direct help with this task. This fact was recognized in the TIMSS video study: "A focus on standards and accountability that ignores the processes of teaching and learning in classrooms will not provide the direction that teachers need in their quest to improve."1

In terms of systems engineering, present policies in the U.S. and in many other countries seem to treat the classroom as a black box. Certain inputs from the outside - pupils, teachers, other resources, management rules and requirements, parental anxieties, standards, tests with high stakes, and so on - are fed into the box. Some outputs are supposed to follow: pupils who are more knowledgeable and competent, better test results, teachers who are reasonably satisfied, and so on. But what is happening inside the box? How can anyone be sure that a particular set of new inputs will produce better outputs if we don't at least study what happens inside? And why is it that most of the reform initiatives mentioned in the first paragraph are not aimed at giving direct help and support to the work of teachers in classrooms?

The answer usually given is that it is up to teachers: they have to make the inside work better. This answer is not good enough, for at least two reasons. First, it is possible that some changes in the inputs may be counterproductive and make it harder for teachers to raise standards. Second, it seems strange, even unfair, to leave the most difficult piece of the standards-raising puzzle entirely to teachers. If there are ways in which policy makers and others can give direct help and support to the everyday classroom task of achieving better learning, then surely these ways ought to be pursued vigorously.

This article is about the inside of the black box. We focus on one aspect of teaching: formative assessment. But we will show that this feature is at the heart of effective teaching.

The Argument

We start from the self-evident proposition that teaching and learning must be interactive. Teachers need to know about their pupils' progress and difficulties with learning so that they can adapt their own work to meet pupils' needs - needs that are often unpredictable and that vary from one pupil to another. Teachers can find out what they need to know in a variety of ways, including observation and discussion in the classroom and the reading of pupils' written work.

We use the general term assessment to refer to all those activities undertaken by teachers - and by their students in assessing themselves - that provide information to be used as feedback to modify teaching and learning activities. Such assessment becomes formative assessment when the evidence is actually used to adapt the teaching to meet student needs.2

There is nothing new about any of this. All teachers make assessments in every class they teach. But there are three important questions about this process that we seek to answer:
* Is there evidence that improving formative assessment raises standards?
* Is there evidence that there is room for improvement?
* Is there evidence about how to improve formative assessment?

In setting out to answer these questions, we have conducted an extensive survey of the research literature. We have checked through many books and through the past nine years' worth of issues of more than 160 journals, and we have studied earlier reviews of research. This process yielded about 580 articles or chapters to study. We prepared a lengthy review, using material from 250 of these sources, that has been published in a special issue of the journal Assessment in Education, together with comments on our work by leading educational experts from Australia, Switzerland, Hong Kong, Lesotho, and the U.S.3

The conclusion we have reached from our research review is that the answer to each of the three questions above is clearly yes. In the three main sections below, we outline the nature and force of the evidence that justifies this conclusion. However, because we are presenting a summary here, our text will appear strong on assertions and weak on the details of their justification. We maintain that these assertions are backed by evidence and that this backing is set out in full detail in the lengthy review on which this article is founded.

We believe that the three sections below establish a strong case that governments, their agencies, school authorities, and the teaching profession should study very carefully whether they are seriously interested in raising standards in education. However, we also acknowledge widespread evidence that fundamental change in education can be achieved only slowly - through programs of professional development that build on existing good practice. Thus we do not conclude that formative assessment is yet another "magic bullet" for education. The issues involved are too complex and too closely linked to both the difficulties of classroom practice and the beliefs that drive public policy. In a final section, we confront this complexity and try to sketch out a strategy for acting on our evidence.

Does Improving Formative Assessment Raise Standards?

A research review published in 1986, concentrating primarily on classroom assessment work for children with mild handicaps, surveyed a large number of innovations, from which 23 were selected.4 Those chosen satisfied the condition that quantitative evidence of learning gains was obtained, both for those involved in the innovation and for a similar group not so involved. Since then, many more papers have been published describing similarly careful quantitative experiments. Our own review has selected at least 20 more studies. (The number depends on how rigorous a set of selection criteria are applied.) All these studies show that innovations that include strengthening the practice of formative assessment produce significant and often substantial learning gains. These studies range over age groups from 5-year-olds to university undergraduates, across several school subjects, and over several countries.
For research purposes, learning gains of this type are measured by comparing the average improvements in the test scores of pupils involved in an innovation with the range of scores that are found for typical groups of pupils on these same tests. The ratio of the former divided by the latter is known as the effect size. Typical effect sizes of the formative assessment experiments were between 0.4 and 0.7. These effect sizes are larger than most of those found for educational interventions. The following examples illustrate some practical consequences of such large gains.
* An effect size of 0.4 would mean that the average pupil involved in an innovation would record the same achievement as a pupil in the top 35% of those not so involved.
* An effect size gain of 0.7 in the recent international comparative studies in mathematics5 would have raised the score of a nation in the middle of the pack of 41 countries (e.g., the U.S.) to one of the top five.
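
The effect-size arithmetic described above can be illustrated with a minimal Python sketch. It rests on two assumptions that the article does not state: that the "range of scores" for typical pupils is summarized by a standard deviation, and that scores in the comparison group are roughly normally distributed. The function names and the sample numbers are hypothetical and serve only as illustration.

```python
from statistics import NormalDist


def effect_size(mean_gain: float, typical_spread: float) -> float:
    """Average score improvement divided by the typical spread of scores.

    Assumption: `typical_spread` is a standard deviation of scores for
    typical groups of pupils on the same test.
    """
    if typical_spread <= 0:
        raise ValueError("typical_spread must be positive")
    return mean_gain / typical_spread


def share_of_comparison_group_above(d: float) -> float:
    """Fraction of a normally distributed comparison group that scores
    above a pupil sitting `d` standard deviations above that group's mean."""
    return 1.0 - NormalDist().cdf(d)


if __name__ == "__main__":
    # Hypothetical numbers: a 6-point average gain on a test whose
    # scores typically spread with a standard deviation of 15 points.
    d = effect_size(mean_gain=6.0, typical_spread=15.0)
    print(f"effect size: {d:.2f}")
    print(f"share of comparison group above: {share_of_comparison_group_above(d):.1%}")
```

Under these assumptions, an effect size of 0.4 places the average pupil in an innovation above all but roughly 34 to 35 percent of the comparison group, which agrees with the "top 35%" figure quoted in the first example.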
Many of these studies arrive at another important conclusion: that improved formative assessment helps low achievers more than other students and so reduces the range of achievement while raising achievement overall. A notable recent example is a study devoted entirely to low-achieving students and students with learning disabilities, which shows that frequent assessment feedback helps both groups enhance their learning.6 Any gains for such pupils could be particularly important. Furthermore, pupils who come to see themselves as unable to learn usually cease to take school seriously. Many become disruptive; others resort to truancy. Such young people are likely to be alienated from society and to become the sources and the victims of serious social problems.

Thus it seems clear that very significant learning gains lie within our grasp. The fact that such gains have been achieved by a variety of methods that have, as a common feature, enhanced formative assessment suggests that this feature accounts, at least in part, for the successes. However, it does not follow that it would be an easy matter to achieve such gains on a wide scale in normal classrooms. Many of the reports we have studied raise a number of other issues.
* All such work involves new ways to enhance feedback between those taught and the teacher, ways that will require significant changes in classroom practice.
* Underlying the various approaches are assumptions about what makes for effective learning - in particular the assumption that students have to be actively involved.
* For assessment to function formatively, the results have to be used to adjust teaching and learning; thus a significant aspect of any program will be the ways in which teachers make these adjustments.
* The ways in which assessment can affect the motivation and self-esteem of pupils and the benefits of engaging pupils in self-assessment deserve careful attention.

Is There Room for Improvement?

A poverty of practice. There is a wealth of research evidence that the everyday practice of assessment in classrooms is beset with problems and shortcomings, as the following selected quotations indicate.
* "Marking is usually conscientious but often fails to offer guidance on how work can be improved. In a significant minority of cases, marking reinforces underachievement and underexpectation by being too generous or unfocused. Information about pupil performance received by the teacher is insufficiently used to inform subsequent work," according to a United Kingdom inspection report on secondary schools.7
* "Why is the extent and nature of formative assessment in science so impoverished?" asked a research study on secondary science teachers in the United Kingdom.8
* "Indeed they pay lip service to [formative assessment] but consider that its practice is unrealistic in the present educational context," reported a study of Canadian secondary teachers.9
* "The assessment practices outlined above are not common, even though these kinds of approaches are now widely promoted in the professional literature," according to a review of assessment practices in U.S. schools.10
The most important difficulties with assessment revolve around three issues.

The first issue is effective learning.
* The tests used by teachers encourage rote and superficial learning even when teachers say they want to develop understanding; many teachers seem unaware of the inconsistency.
* The questions and other methods teachers use are not shared with other teachers in the same school, and they are not critically reviewed in relation to what they actually assess.
* For primary teachers particularly, there is a tendency to emphasize quantity and presentation of work and to neglect its quality in relation to learning.

[Cartoon: "The food's really not half bad, but the atmosphere leaves a lot to be desired."]
The second issue is negative impact.
* The giving of marks and the grading function are overemphasized, while the giving of useful advice and the learning function are underemphasized.
* Approaches are used in which pupils are compared with one another, the prime purpose of which seems to them to be competition rather than personal improvement; in consequence, assessment feedback teaches low-achieving pupils that they lack "ability," causing them to come to believe that they are not able to learn.

The third issue is the managerial role of assessments.
* Teachers' feedback to pupils seems to serve social and managerial functions, often at the expense of the learning function.
* Teachers are often able to predict pupils' results on external tests because their own tests imitate them, but at the same time teachers know too little about their pupils' learning needs.
* The collection of marks to fill in records is given higher priority than the analysis of pupils' work to discern learning needs; furthermore, some teachers pay no attention to the assessment records of their pupils' previous teachers.

Of course, not all these descriptions apply to all classrooms. Indeed, there are many schools and classrooms to which they do not apply at all. Nevertheless, these general conclusions have been drawn by researchers who have collected evidence - through observation, interviews, and questionnaires - from schools in several countries, including the U.S.

An empty commitment. The development of national assessment policy in England and Wales over the last decade illustrates the obstacles that stand in the way of developing policy support for formative assessment. The recommendations of a government task force in 198811 and all subsequent statements of government policy have emphasized the importance of formative assessment by teachers. However, the body charged with carrying out government policy on assessment had no strategy either to study or to develop the formative assessment of teachers and did no more than devote a tiny fraction of its resources to such work.12 Most of the available resources and most of the public and political attention were focused on national external tests. While teachers' contributions to these "summative assessments" have been given some formal status, hardly any attention has been paid to their contributions through formative assessment. Moreover, the problems of the relationship between teachers' formative and summative roles have received no attention.

It is possible that many of the commitments were stated in the belief that formative assessment was not problematic, that it already happened all the time and needed no more than formal acknowledgment of its existence. However, it is also clear that the political commitment to external testing in order to promote competition had a central priority, while the commitment to formative assessment was marginal. As researchers the world over have found, high-stakes external tests always dominate teaching and assessment. However, they give teachers poor models for formative assessment because of their limited function of providing overall summaries of achievement rather than helpful diagnosis. Given this fact, it is hardly surprising that numerous research studies of the implementation of the education reforms in the United Kingdom have found that formative assessment is "seriously in need of development."13 With hindsight, we can see that the failure to perceive the need for substantial support for formative assessment and to take responsibility for developing such support was a serious error.

In the U.S. similar pressures have been felt from political movements characterized by a distrust of teachers and a belief that external testing will, on its own, improve learning. Such fractured relationships between policy makers and the teaching profession are not inevitable - indeed, many countries with enviable educational achievements seem to manage well with policies that show greater respect and support for teachers. While the situation in the U.S. is far more diverse than that in England and Wales, the effects of high-stakes state-mandated testing are very similar to those of the external tests in the United Kingdom. Moreover, the traditional reliance on multiple-choice testing in the U.S. - not shared in the United Kingdom - has exacerbated the negative effects of such policies on the quality of classroom learning.

How Can We Improve Formative Assessment?

The self-esteem of pupils. A report of schools in Switzerland states that "a number of pupils ... are content to 'get by.' ... Every teacher who wants to practice formative assessment must reconstruct the teaching contracts so as to counteract the habits acquired by his pupils."14

The ultimate user of assessment information that is elicited in order to improve learning is the pupil. There are negative and positive aspects of this fact. The negative aspect is illustrated by the preceding quotation. When the classroom culture focuses on rewards, "gold stars," grades, or class ranking, then pupils look for ways to obtain the best marks rather than to improve their learning. One reported consequence is that, when they have any choice, pupils avoid difficult tasks. They also spend time and energy looking for clues to the "right answer." Indeed, many become reluctant to ask questions out of a fear of failure. Pupils who encounter difficulties are led to believe that they lack ability, and this belief leads them to attribute their difficulties to a defect in themselves about which they cannot do a great deal. Thus they avoid investing effort in learning that can lead only to disappointment, and they try to build up their self-esteem in other ways.

The positive aspect of students' being the primary users of the information gleaned from formative assessments is that negative outcomes - such as an obsessive focus on competition and the attendant fear of failure on the part of low achievers - are not inevitable. What is needed is a culture of success, backed by a belief that all pupils can achieve. In this regard, formative assessment can be a powerful weapon if it is communicated in the right way. While formative assessment can help all

pupils, it yields particularly good results with low achievers by concentrating on specific problems with their work and giving them a clear understanding of what is wrong and how to put it right. Pupils can accept and work with such messages, provided that they are not clouded by overtones about ability, competition, and comparison with others. In summary, the message can be stated as follows: feedback to any pupil should be about the particular qualities of his or her work, with advice on what he or she can do to improve, and should avoid comparisons with other pupils.

Self-assessment by pupils. Many successful innovations have developed self- and peer-assessment by pupils as ways of enhancing formative assessment, and such work has achieved some success with pupils from age 5 upward. This link of formative assessment to self-assessment is not an accident; indeed, it is inevitable.

To explain this last statement, we should first note that the main problem that those who are developing self-assessments encounter is not a problem of reliability and trustworthiness. Pupils are generally honest and reliable in assessing both themselves and one another; they can even be too hard on themselves. The main problem is that pupils can assess themselves only when they have a sufficiently clear picture of the targets that their learning is meant to attain. Surprisingly, and sadly, many pupils do not have such a picture, and they appear to have become accustomed to receiving classroom teaching as an arbitrary sequence of exercises with no overarching rationale. To overcome this pattern of passive reception requires hard and sustained work. When pupils do acquire such an overview, they then become more committed and more effective as learners. Moreover, their own assessments become an object of discussion with their teachers and with one another, and this discussion further promotes the reflection on one's own thinking that is essential to good learning.

Thus self-assessment by pupils, far from being a luxury, is in fact an essential component of formative assessment. When anyone is trying to learn, feedback about the effort has three elements: recognition of the desired goal, evidence about present position, and some understanding of a way to close the gap between the two.15 All three must be understood to some degree by anyone before he or she can take action to improve learning.

Such an argument is consistent with more general ideas established by research into the way people learn. New understandings are not simply swallowed and stored in isolation; they have to be assimilated in relation to preexisting ideas. The new and the old may be inconsistent or even in conflict, and the disparities must be resolved by thoughtful actions on the part of the learner. Realizing that there are new goals for the learning is an essential part of this process of assimilation. Thus we conclude: if formative assessment is to be productive, pupils should be trained in self-assessment so that they can understand the main purposes of their learning and thereby grasp what they need to do to achieve.

The evolution of effective teaching. The research studies referred to above show very clearly that effective programs of formative assessment involve far more than the addition of a few observations and tests to an existing program. They require careful scrutiny of all the main components of a teaching plan. Indeed, it is clear that instruction and formative assessment are indivisible.

To begin at the beginning, the choice of tasks for classroom work and homework is important. Tasks have to be justified in terms of the learning aims that they serve, and they can work well only if opportunities for pupils to communicate their evolving understanding are built into the planning. Discussion, observation of activities, and marking of written work can all be used to provide those opportunities, but it is then important to look at or listen carefully to the talk, the writing, and the actions through which pupils develop and display the state of their understanding. Thus we maintain that opportunities for pupils to express their understanding should be designed into any piece of teaching, for this will initiate the interaction through which formative assessment aids learning.

Discussions in which pupils are led to talk about their understanding in their own ways are important aids to increasing knowledge and improving understanding. Dialogue with the teacher provides the opportunity for the teacher to respond to and reorient a pupil's thinking. However, there are clearly recorded examples of such discussions in which teachers have, quite unconsciously, responded in ways that would inhibit the future learning of a pupil. What the examples have in common is that the teacher is looking for a particular response and lacks the flexibility or the confidence to deal with the unexpected. So the teacher tries to direct the pupil toward giving the expected answer. In manipulating the dialogue in this way, the teacher seals off any unusual, often thoughtful but unorthodox, attempts by pupils to work out their own answers. Over time the pupils get the message: they are not required to think out their own answers. The object of the exercise is to work out - or guess - what answer the teacher expects to see or hear.

A particular feature of the talk between teacher and pupils is the asking of questions by the teacher. This natural and direct way of checking on learning is often unproductive. One common problem is that, following a question, teachers do not wait long enough to allow pupils to think out their answers. When a teacher answers his or her own question after only two or three seconds and when a minute of silence is not tolerable, there is no possibility that a pupil can think out what to say.

There are then two consequences. One is that, because the only questions that can produce answers in such a short time are questions of fact, these predominate. The other is that pupils don't even try to think out a response. Because they know that the answer, followed by another question, will come along in a few seconds, there is no point in trying. It is also generally the case that only a few pupils in a class answer the teacher's questions. The rest then leave it to these few, knowing that they cannot respond as quickly and being unwilling to risk making mistakes in public. So the teacher, by lowering the level

of questions and by accepting answers from a few, can keep the lesson going but is actually out of touch with the understanding of most of the class. The question/answer dialogue becomes a ritual, one in which thoughtful involvement suffers.

There are several ways to break this particular cycle. They involve giving pupils time to respond; asking them to discuss their thinking in pairs or in small groups, so that a respondent is speaking on behalf of others; giving pupils a choice between different possible answers and asking them to vote on the options; asking all of them to write down an answer and then reading out a selected few; and so on. What is essential is that any dialogue should evoke thoughtful reflection in which all pupils can be encouraged to take part, for only then can the formative process start to work. In short, the dialogue between pupils and a teacher should be thoughtful, reflective, focused to evoke and explore understanding, and conducted so that all pupils have an opportunity to think and to express their ideas.

Tests given in class and tests and other exercises assigned for homework are also important means of promoting feedback. A good test can be an occasion for learning. It is better to have frequent short tests than infrequent long ones. Any new learning should first be tested within about a week of a first encounter, but more frequent tests are counterproductive. The quality of the test items - that is, their relevance to the main learning aims and their clear communication to the pupil - requires scrutiny as well. Good questions are hard to generate, and teachers should collaborate and draw on outside sources to collect such questions.

Given questions of good quality, it is essential to ensure the quality of the feedback. Research studies have shown that, if pupils are given only marks or grades, they do not benefit from the feedback. The worst scenario is one in which some pupils who get low marks this time also got low marks last time and come to expect to get low marks next time. This cycle of repeated failure becomes part of a shared belief between such students and their teacher. Feedback has been shown to improve learning when it gives each pupil specific guidance on strengths and weaknesses, preferably without any overall marks. Thus the way in which test results are reported to pupils so that they can identify their own strengths and weaknesses is critical. Pupils must be given the means and opportunities to work with evidence of their difficulties. For formative purposes, a test at the end of a unit or teaching module is pointless; it is too late to work with the results. We conclude that the feedback on tests, seatwork, and homework should give each pupil guidance on how to improve, and each pupil must be given help and an opportunity to work on the improvement.

All these points make clear that there is no one simple way to improve formative assessment. What is common to them is that a teacher's approach should start by being realistic and confronting the question "Do I really know enough about the understanding of my pupils to be able to help each of them?"

Much of the work teachers must do to make good use of formative assessment can give rise to difficulties. Some pupils will resist attempts to change accustomed routines, for any such change is uncomfortable, and emphasis on the challenge to think for yourself (and not just to work harder) can be threatening to many. Pupils cannot be expected to believe in the value of changes for their learning before they have experienced the benefits of such changes. Moreover, many of the initiatives that are needed take more class time, particularly when a central purpose is to change the outlook on learning and the working methods of pupils. Thus teachers have to take risks in the belief that such investment of time will yield rewards in the future, while "delivery" and "coverage" with poor understanding are pointless and can even be harmful.

[Cartoon: "It has been said that a fool can ask more questions than a wise man can answer ..."]

Teachers must deal with two basic issues that are the source of many of the problems associated with changing to a system of formative assessment. The first is the nature of each teacher's beliefs about learning. If the teacher assumes that knowledge is to be transmitted and learned, that understanding will develop later, and that clarity of exposition accompanied by rewards for patient reception are the essentials of good teaching, then formative assessment is hardly necessary. However, most teachers accept the wealth of evidence that this transmission model does not work, even when judged by its own criteria, and so are willing to make a commitment to teaching through interaction. Formative assessment is an essential component of such instruction. We do not mean to imply that individualized, one-on-one teaching is the only solution; rather we mean that what is needed is a classroom culture of questioning and deep thinking, in which pupils learn from shared discussions with teachers and peers. What emerges very clearly here is the indivisibility of instruction and formative assessment practices.

The other issue that can create prob

tegral part of each pupil's learning work. It follows from this view that several changes are needed. First, policy ought to start with a recognition that the prime locus for raising standards is the classroom, so that the overarching priority has to be the promotion and support of change within the classroom. Attempts to raise standards by reforming the inputs to and measuring the outputs from the black box of the classroom can be helpful, but they are not adequate on their own. Indeed, their helpfulness can be judged only in light of their effects in classrooms. The evidence we have presented here establishes that a clearly productive way

ised by the research evidence are to be secured, each teacher must find his or her own ways of incorporating the lessons and ideas set out above into his or her own patterns of classroom work and into the cultural norms and expectations of a particular school community.17 This process is a relatively slow one and takes place through sustained programs of professional development and support. This fact does not weaken the message here; indeed, it should be seen as a sign of its authenticity, for lasting and fundamental improvements in teaching and learning must take place in this way. A recent international
study of innovation and change in educa
lems for teachers who wish to adopt an to start implementing a classroom-focused
tion, encompassing 23 projects in 13 mem
interactive model of teaching and learning policy would be to improve formative as
ber countries of the Organisation for Eco
sessment. This same evidence also estab
relates to the beliefs teachers hold about
nomic Co-operation and Development, has
lishes that in doing so we would not be con
arrived at exactly the same conclusion with
the potential of all their pupils for learn
centrating on some minor aspect of the regard to effective policies for change.'8
ing. To sharpen the contrast by overstat
ing it, there is on the one hand the "fixed business of teaching and learning. Rather,
Such arguments lead us to propose a four
a belief that each pupil has we would be concentrating on several es
I.Q." view point scheme for teacher development.
sential elements: the quality of teacher!
1. Learningfrom development. Teach
a fixed, inherited intelligence that cannot
ers will not take up ideas that sound at
be altered much by schooling. On the oth
pupil interactions, the stimulus and help
for pupils to take active responsibility for tractive, no matter how extensive the re
er hand, there is the "untapped potential"
a belief that starts from the as
their own learning, the particular help need
search base, if the ideas are presented as
view ed tomove pupils out of the trap of "low general principles that leave the task of
sumption that so-called ability is a com
plex of skills that can be learned. Here,
achievement," and the development of the translating them into everyday practice en
we argue for the underlying belief that all habits necessary for all students to be
tirely up to the teachers. Their classroom
if one come lifelong learners. Improvements in lives are too busy and too fragile for all
pupils can learn more effectively
can clear away, by sensitive handling, the formative assessment, which are within
but an outstanding few to undertake such
the reach of all teachers, can contribute
work. What teachers need is a variety of
obstacles to learning, be they cognitive fail
ures never diagnosed or damage to person
substantially to raising standards in all living examples of implementation, as prac
these ways.
ticed by teachers with whom they can iden
al confidence or a combination of the two.
If we
Four steps to implementation.
Clearly the truth lies between these two
tify and from whom they can derive the
confidence that they can do better. They
extremes, but the evidence is that ways of accept the argument outlined above, what
need to see examples of what doing bet
managing formative assessment that work needs to be done? The proposals outlined
below do not follow directly from our termeans in practice.
with the assumptions of "untapped poten
So changing teachers' practice cannot
tial" do help all pupils to learn and can analysis of assessment research. They are
consistent with itsmain findings, but they begin with an extensive program of train
give particular help to those who have
also call on more general sources for guid
previously struggled.
ing for all; that could be justified only if
ance.16
it could be claimed that we have enough
At one extreme, one might call formore
"trainers" who know what to do, which is
Policy and Practice
research to find out how best to carry out certainly not the case. The essential first
such work; at the other, one might call for step is to set up a small number of local
Changing the policy perspective. The
some primary, some
groups of schools assumptions that drive national and state an immediate and large-scale program, with
new guidelines that all teachers should put secondary, some inner-city, some from out
policies for assessment have to be called
er suburbs, some rural- with each school
into question. The promotion of testing as into practice. Neither of these alternatives
is sensible: while the first is unnecessary
an important component for establishing
committed both to a school-based devel
a competitive market in education can be because enough is known from the results opment of formative assessment and to
collaboration with other schools in its lo
very harmful. The more recent shifting of of research, the second would be unjusti

emphasistowardsettingtargetsforall,with
assessmentprovidinga touchstonetohelp
check pupils' attainments,is amore ma
tureposition.However, we would argue
that there is a need now tomove further,
tofocus on the insideof the "blackbox"
and so to explore thepotential of assess
ment to raise standardsdirectlyas an in
146

PHI DELTA KAPPAN

fied because not enough is known about


classroom practicalities in thecontextof
any one country's schools.
Thus the improvement
of formativeas
sessmentcannotbe a simplematter.There
isno quick fix thatcan alterexistingprac
tice by promising rapid rewards.On the
contrary,if the substantialrewardsprom

cal group. In such a process, the teachers


in theirclassroomswill be working out
theanswers tomany of thepracticalques
tions thattheevidencepresentedhere can
not answer.They will be reformulating
the issues, perhaps in relation to funda
mental insightsandcertainlyin termsthat
make sense to theirpeers in other class

study suggests thatassessment, as itoc


wider dissemination for example, ear
curs in schools, is far from a merely
marking funds for inservice training pro
technical problem. Rather, it is deeply
grams would have to be pursued.
social and personal.'9
We must emphasize that this process
will inevitably be a slow one. To repeat
The chief negative influence here is
what we said above, if the substantial re
that of short external tests. Such tests can
wards promised by the evidence are to be dominate teachers' work, and, insofar as
secured, each teacher mustfind his or her
they encourage drilling to produce right
own ways of incorporating the lessons and answers to short, out-of-context questions,
ideas that are set out above into his or her
they can lead teachers to act against their
own patterns of classroom work. Even with
own better judgment about the best ways
optimum training and support, such a process
to develop the learning of their pupils. This
will take time.
is not to argue that all such tests are un
3. Reducing obstacles. All features in helpful. Indeed, they have an important
the education system that actually obstruct
role to play in securing public confidence
the development of effective formative as
in the accountability of schools. For the
sessment should be examined to see how
immediate future, what is needed in any
their negative effects can be reduced. Con
development program for formative as
sider the conclusions from a study of teach
sessment is to study the interactions be
ers of English inU.S. secondary schools.
tween these external tests and formative
assessments to see how themodels of as
Most of the teachers in this study were
sessment that external tests can provide
caught inconflicts among belief systems
could be made more helpful.
and institutionalstructures,agendas, and
All teachers have to undertake some
values. The point of friction among these
existing practices.Dissemination efforts
summative assessment. They must report
conflicts was assessment, which was as
would become more active as results and
to parents and produce end-of-year
re
sociatedwith very powerful feelings of
resources became available from the de
being overwhelmed, and of insecurity,
ports as classes are due tomove on to new
velopment program. Then strategies for
This
guilt, frustration, and anger....
teachers. However, the task of assessing
rooms. It is also essential to carry out such
in a range of subject areas,
development
for the research inmathematics education
is significantly different from that in lan
guage, which is different again from that
in the creative arts.
The schools involved would need ex
tra support in order to give their teachers
time to plan the initiative in light of ex
isting evidence, to reflect on their experi
ence as it develops, and to offer advice
about training others in the future. In ad
dition, there would be a need for external
evaluators to help the teachers with their
development work and to collect evidence
of its effectiveness. Video studies of class
room work would be essential for dissem
inating findings to others.
This dimension of
2. Dissemination.
the implementation would be in low gear
at the outset offering schools no more
than general encouragement
and expla
nation of some of the relevant evidence
that they might consider in light of their


pupils summatively for external purposes is clearly different from the task of assessing ongoing work to monitor and improve progress. Some argue that these two roles are so different that they should be kept apart. We do not see how this can be done, given that teachers must have some share of responsibility for the former and must take the leading responsibility for the latter.20 However, teachers clearly face difficult problems in reconciling their formative and summative roles, and confusion in teachers' minds between these roles can impede the improvement of practice.

The arguments here could be taken much further to make the case that teachers should play a far greater role in contributing to summative assessments for accountability. One strong reason for giving teachers a greater role is that they have access to the performance of their pupils in a variety of contexts and over extended periods of time.

This is an important advantage because sampling pupils' achievement by means of short exercises taken under the conditions of formal testing is fraught with dangers. It is now clear that performance in any task varies with the context in which it is presented. Thus some pupils who seem incompetent in tackling a problem under test conditions can look quite different in the more realistic conditions of an everyday encounter with an equivalent problem. Indeed, the conditions under which formal tests are taken threaten validity because they are quite unlike those of everyday performance. An outstanding example here is that collaborative work is very important in everyday life but is forbidden by current norms of formal testing.21 These points open up wider arguments about assessment systems as a whole, arguments that are beyond the scope of this article.

4. Research. It is not difficult to set out a list of questions that would justify further research in this area. Although there are many and varied reports of successful innovations, they generally fail to give clear accounts of one or another of the important details. For example, they are often silent about the actual classroom methods used, the motivation and experience of the teachers, the nature of the tests used as measures of success, or the outlooks and expectations of the pupils involved.

However, while there is ample justification for proceeding with carefully formulated projects, we do not suggest that everyone else should wait for their conclusions. Enough is known to provide a basis for active development work, and some of the most important questions can be answered only through a program of practical implementation.

Directions for future research could include a study of the ways in which teachers understand and deal with the relationship between their formative and summative roles or a comparative study of the predictive validity of teachers' summative assessments versus external test results. Many more questions could be formulated, and it is important for future development that some of these problems be tackled by basic research. At the same time, experienced researchers would also have a vital role to play in the evaluation of the development programs we have proposed.

Are We Serious About Raising Standards?

The findings summarized above and the program we have outlined have implications for a variety of responsible agencies. However, it is the responsibility of governments to take the lead. It would be premature and out of order for us to try to consider the relative roles in such an effort, although success would clearly depend on cooperation among government agencies, academic researchers, and school-based educators.

The main plank of our argument is that standards can be raised only by changes that are put into direct effect by teachers and pupils in classrooms. There is a body of firm evidence that formative assessment is an essential component of classroom work and that its development can raise standards of achievement. We know of no other way of raising standards for which such a strong prima facie case can be made. Our plea is that national and state policy makers will grasp this opportunity and take the lead in this direction.

1. James W. Stigler and James Hiebert, "Understanding and Improving Classroom Mathematics Instruction: An Overview of the TIMSS Video Study," Phi Delta Kappan, September 1997, pp. 19-20.
2. There is no internationally agreed-upon term here. "Classroom evaluation," "classroom assessment," "internal assessment," "instructional assessment," and "student assessment" have been used by different authors, and some of these terms have different meanings in different texts.
3. Paul Black and Dylan Wiliam, "Assessment and Classroom Learning," Assessment in Education, March 1998, pp. 7-74.
4. Lynn S. Fuchs and Douglas Fuchs, "Effects of Systematic Formative Evaluation: A Meta-Analysis," Exceptional Children, vol. 53, 1986, pp. 199-208.
5. See Albert E. Beaton et al., Mathematics Achievement in the Middle School Years (Boston: Boston College, 1996).
6. Lynn S. Fuchs et al., "Effects of Task-Focused Goals on Low-Achieving Students with and Without Learning Disabilities," American Educational Research Journal, vol. 34, 1997, pp. 513-43.
7. OFSTED (Office for Standards in Education), Subjects and Standards: Issues for School Development Arising from OFSTED Inspection Findings 1994-5: Key Stages 3 and 4 and Post-16 (London: Her Majesty's Stationery Office, 1996), p. 40.
8. Nicholas Daws and Birendra Singh, "Formative Assessment: To What Extent Is Its Potential to Enhance Pupils' Science Being Realized?," School Science Review, vol. 77, 1996, p. 99.
9. Clement Dassa, Jesús Vazquez-Abad, and Djavid Ajar, "Formative Assessment in a Classroom Setting: From Practice to Computer Innovations," Alberta Journal of Educational Research, vol. 39, 1993, p. 116.
10. D. Monty Neill, "Transforming Student Assessment," Phi Delta Kappan, September 1997, pp. 35-36.
11. Task Group on Assessment and Testing: A Report (London: Department of Education and Science and the Welsh Office, 1988).
12. Richard Daugherty, National Curriculum Assessment: A Review of Policy, 1987-1994 (London: Falmer Press, 1995).
13. Terry A. Russell, Anne Qualter, and Linda McGuigan, "Reflections on the Implementation of National Curriculum Science Policy for the 5-14 Age Range: Findings and Interpretations from a National Evaluation Study in England," International Journal of Science Education, vol. 17, 1995, pp. 481-92.
14. Phillipe Perrenoud, "Towards a Pragmatic Approach to Formative Evaluation," in Penelope Weston, ed., Assessment of Pupils' Achievement: Motivation and School Success (Amsterdam: Swets and Zeitlinger, 1991), p. 92.
15. D. Royce Sadler, "Formative Assessment and the Design of Instructional Systems," Instructional Science, vol. 18, 1989, pp. 119-44.
16. Paul J. Black and J. Myron Atkin, Changing the Subject: Innovations in Science, Mathematics, and Technology Education (London: Routledge for the Organisation for Economic Co-operation and Development, 1996); and Michael G. Fullan, with Suzanne Stiegelbauer, The New Meaning of Educational Change (London: Cassell, 1991).
17. See Stigler and Hiebert, pp. 19-20.
18. Black and Atkin, op. cit.
19. Peter Johnston et al., "Assessment of Teaching and Learning in Literature-Based Classrooms," Teaching and Teacher Education, vol. 11, 1995, p. 359.
20. Dylan Wiliam and Paul Black, "Meanings and Consequences: A Basis for Distinguishing Formative and Summative Functions of Assessment," British Educational Research Journal, vol. 22, 1996, pp. 537-48.
21. These points are developed in some detail in Sam Wineburg, "T. S. Eliot, Collaboration, and the Quandaries of Assessment in a Rapidly Changing World," Phi Delta Kappan, September 1997, pp. 59-65.