
PUBLISHED: 1 MARCH 2017 | VOLUME: 1 | ARTICLE NUMBER: 0028

comment

Towards artificial intelligence-based assessment systems
Rose Luckin
‘Stop and test’ assessments do not rigorously evaluate a student’s understanding of a topic. Artificial
intelligence-based assessment provides constant feedback to teachers, students and parents about how
the student learns, the support they need and the progress they are making towards their learning goals.

Decades of research have shown that
knowledge and understanding
cannot be rigorously evaluated
through a series of 90-minute exams. The
prevailing exam paradigm is stressful,
unpleasant, can turn students away from
education, and requires that both students
and teachers take time away from learning.
And yet we persist globally in relying on these
blunt instruments, sending students off to
universities and the workplace ill-equipped
for their futures.
Perhaps one reason for the long-lasting
persistence of ‘stop and test’ forms of
assessment is that the alternatives available so
far have been unattractive and equally, or even
more, unreliable than current examination
systems. For example, within the school
education system, marks from work that
students complete as part of their course have
formed part, or all, of their exam result. Fears
about the extent to which such coursework
is truly the sole work of the student have
reduced the attractiveness of this option
and we have moved back towards exams. In
higher education, ‘open book exams’ have been used to reduce the pressure on students to remember lots of information. This type of approach can help, but it tackles only a small part of the overall problem, in this case, the pressure on memory. Other stressful and unreliable features remain, such as the exam conditions, the very limited range of the assessment, and the accuracy of marking.
However, the situation is now different and a realistic and economically attractive alternative lies at our fingertips. We have the technology to build a superior assessment system, one based on artificial intelligence (AI), but we now need to see if we have the social and moral appetite to disrupt tradition.

Figure 1 | A simple Open Learner Model for tracking how a child is using the help facilities of a piece of science software. The map in the dialogue box entitled ‘Activities’ depicts the area of the curriculum that the child is studying, with each node representing a curriculum topic. When the user clicks on a node in this map, the bar chart below and to the left of the map indicates the level of difficulty of the work that the child has completed while working on this topic, and the dots on the ‘dice’ below and to the right of the map indicate how much help the child has received. Figure courtesy of Ecolab (Luckin, 2016).

AI is everywhere
AI can be defined as the ability of computer systems to behave in ways that we would think of as essentially human. AI systems are designed to interact with the world through capabilities, such as speech recognition, and intelligent behaviours, such as assessing a situation and taking sensible actions towards a goal [1]. The use of AI in our day-to-day life has increased exponentially: we use the intelligent search behind Google, the AI voice recognition and knowledge management in the iPhone’s personal assistant, Siri, and navigation tools such as Citymapper to help us travel effectively in cities. Clever AI has penetrated general use to become so useful that it is not labelled as AI anymore [2]. We trust it with our personal, medical and financial data without a thought, so why not trust it with the assessment of our children’s knowledge and understanding?

AI and assessment
The application of AI to education has been the subject of academic research for more than 30 years, with the aim of making

NATURE HUMAN BEHAVIOUR 1, 0028 (2017) | DOI: 10.1038/s41562-016-0028 | www.nature.com/nathumbehav 1


© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.
“computationally precise and explicit forms of educational, psychological and social knowledge which are often left implicit” [3]. The evidence from existing AI systems that assess learning as well as provide tutoring is positive with respect to their assessment accuracy [4]. AI is a powerful tool to open up the ‘black box of learning’, by providing a deep, fine-grained understanding of when and how learning actually happens.
In order to open this black box of learning, AI assessment systems need information about: (1) the curriculum, subject area and learning activities that each student is completing; (2) the details of the steps each student takes as they complete these activities; and (3) what counts as success within each of these activities and within each of the steps towards the completion of each activity.
AI techniques, such as computer modelling and machine learning, are applied to this information and the AI assessment system forms an evaluation of the student’s knowledge of the subject area being studied. AI assessment systems can also be used to assess students’ skills, such as collaboration and persistence, as well as students’ characteristics, such as confidence and motivation. The information collection and processing carried out by an AI assessment system to form an evaluation of each student’s progress takes place over a period of time. Unlike the 90-minute exam, this period of time may be a whole school semester, a year, several years or more.
The output from AI assessment software provides the ingredients that can be synthesized and interpreted to produce visualizations (Fig. 1). These visualizations, referred to as Open Learner Models (OLMs), represent a student’s knowledge, skills or resource requirements and they help teachers and students understand their performance and its assessment [5]. For example, an AI assessment system collects data about students’ achievements, their emotional state, or motivation. This data can be analysed and used to create an OLM to: (1) help teachers understand their students’ approach to learning to shape their future teaching appropriately; and (2) help motivate students by enabling them to track their own progress and encouraging them to reflect on their learning.
AIAssess (Box 1) is a generic AI assessment system that exemplifies just one approach to assessing how much a student knows and understands. The system is suitable for subjects such as mathematics or science and is based on existing research tools [6,7]. However, there are many different AI techniques, such as natural language processing, speech recognition and semantic analysis, that can be used to evaluate student learning, and an appropriate mix of tools would be required for other subjects, such as spoken language or history, and skills such as collaborative problem-solving.

Box 1 | AIAssess.
AIAssess is intelligent assessment software designed for students learning science and mathematics: it assesses as students learn. AIAssess was developed by researchers at UCL Knowledge Lab through multiple evaluated implementations [5,6]. Specifically, AIAssess provides activities that assess and develop conceptual knowledge by offering students differentiated tasks of increasing levels of difficulty as the student progresses.
In order to ensure that the student keeps persevering, AIAssess provides different levels of hints and tips to help the student complete each task. It assesses each student’s knowledge of the subject matter, as well as their metacognitive awareness (knowledge of their own ability and learning needs), which is a key skill possessed by effective students and a good predictor of future performance.
To assess each student’s progress AIAssess uses: a Knowledge Component that stores AIAssess’s knowledge about science and mathematics so that it can check if each student’s work is correct; an Analytics Component that collects and analyses data about each student’s interactions with the software; and a Student Model Component that constantly calculates and stores what AIAssess judges to be each student’s subject knowledge and metacognitive awareness.
The AIAssess Knowledge Component is fine-grained so that it can generate correct and incorrect steps toward a solution, not just correct and incorrect answers. For any given task that the student is required to perform, AIAssess can generate all possible steps that a student might take as they complete each task.
The AIAssess Analytics Component collects each student’s interactions with the software. Specifically, it collects data about each step the student takes towards a task solution, the number of hints or tips that the student requires to successfully complete each step and each task, and the difficulty level of each task the student completes.
The AIAssess Student Model Component uses outputs from the Analytics Component to strengthen or weaken its judgement about every student’s:
• Knowledge and understanding of each concept in a mathematics or science curriculum, by assessing each student’s ability to complete a solution step, or entire task, correctly without any hints or tips.
• Potential for development in their knowledge and understanding of each concept in a mathematics or science curriculum, by assessing each student’s ability to complete a solution step, or entire task, correctly with a particular level of hints or tips.
• Metacognitive awareness of their knowledge and understanding, and the extent to which they need to use hints and tips to succeed, by assessing each student’s accuracy in determining the level of hints or tips they need in order to complete a solution step correctly, and in evaluating the level of difficulty at which they can succeed correctly.
At any point in time, AIAssess can produce a visualization (Fig. 1) that illustrates its judgements about a student’s performance on a particular task, across a set of tasks, and across all tasks completed. This Open Learner Model can be interrogated so that teachers and learners can trace the evidence that supports each judgement the software makes.

The cost of AI assessment
Building AI systems is not cheap and a large-scale project would certainly need extremely careful management. There is no reliable estimate of the cost of a scaled-up AI assessment system that could assess multiple school subject areas and skills.
One way of getting a glimpse of the scale of initial investment needed to develop a national AI assessment system would be to look at the costs of other large AI projects. In January 2016, the Obama administration announced that it planned to invest US$4 billion over a decade (US$400 million per year) to make autonomous vehicles viable [8], and in November 2015, Toyota committed to an initial investment of US$1 billion over the next five years (US$200 million per year) to establish and staff two new AI and robotics research and development operation centres [9]. If we add
the estimated costs of making autonomous vehicles viable, this suggests an annual budget of US$600 million for a complex AI project. It therefore seems reasonable to suggest that a country, such as England, might need to spend the equivalent of US$600 million (£500 million) per year to make AI assessment a reality for a set of core subjects and skills, at least to start with, until the upfront system development costs have been covered and the focus could shift to maintenance and improvement.
It is also hard to estimate the cost of the current exam system to make any comparison. There are no publicly available up-to-date data about the costs of the existing English exam system. The most recent information is in a 2005 report, which was prepared by PricewaterhouseCoopers for the then exam regulator, the Qualifications and Curriculum Authority (QCA) [10]. This report estimated the cost of the English school exam system as £610 million per annum (Table 1).

Table 1 | The cost of the English examination system (2005).
Item                                  Direct costs   Time costs   Total
QCA core costs                        £8m            -            £8m
QCA NCT costs                         £37m           -            £37m
Awarding body costs                   £264m          -            £264m
Exam centres: invigilation            -              £97m         £97m
Exam centres: support and sundries    £61m           £9m          £70m
Exam centres: exams officers          -              £134m        £134m
Total costs                           £370m          £240m        £610m
Source: a memorandum submitted by the Association of School and College Leaders (ASCL) to the House of Commons Select Committee on Children, Schools and Families [10]. NCT, national curriculum tests.

If we use Bank of England historical inflation rate data to convert this to a figure for 2015, then the figure is about £845 million (US$1.03 billion). Although the English examination system is not the same in 2016 as it was in 2005, it is not simpler and is unlikely to be any less expensive, so a figure of £845 million as an estimate of the cost of the English exam system in 2016 seems conservative. Although designing a nationwide learning assessment system may well be more complex than designing autonomous vehicles, comparing the level of investment in an existing complex AI project to the cost of the current examination system in England puts the enterprise of building such a system within a realistic context.
We also need to bear in mind that the initial outlay for an AI assessment system would be much greater than the ongoing development and maintenance costs. This is in contrast to the human-resource-heavy exam systems, for which the costs inevitably rise each year due to the increasing numbers of students, and therefore the increasing number of examiners, and the cost of inflation.

Social equality
The benefits of developing an AI assessment approach go beyond economics. Education is the key to changing people’s lives, and yet the changes that education makes to people’s lives are not always for the better. The less able and poorer students in society are generally least well served by education systems. Wealthier families can afford to pay for the coaching and tutoring that can help students access the best schools and pass exams. AI would provide a fairer, richer assessment system that would evaluate students across a longer period of time and from an evidence-based, value-added perspective. It would not be possible for students to be coached specifically for an AI assessment, because the assessment would be happening ‘in the background’ over time, without necessarily being obvious to the student. AI assessment systems would be able to demonstrate how a student deals with challenging subject matter, how they persevere and how quickly they learn when given appropriate support. In addition, national AI assessment systems would also offer support and formative feedback to help students improve.

Ethical concerns
The ethical questions around AI in general are equally acute, if not more so, when it comes to education. For example, the sharing of data introduces a host of challenges, from individual privacy to proprietary intellectual property concerns. If we are to build scaled AI assessment systems that will be welcomed by students, teachers and parents, it will be essential to work with educators and system developers to specify data standards that prioritize both the sharing of data and the ethics underlying data use. It is also essential that we use the older AI approaches that involve modelling as well as the more modern machine-learning techniques. The modelling approach to AI can make transparent the AI system’s reasoning in a way that machine-learning techniques cannot, and it will be essential to be able to explain the assessment decisions made by any AI assessment system and constantly provide informative feedback to students, teachers and parents.

Looking forward
How do we progress from the current system to achieve a step change in assessment using AI? We need to advance on three fronts. Socially, we need to engage teachers, learners, parents and other education stakeholders to work with scientists and policymakers to develop the ethical framework within which AI assessment can thrive and bring benefit. Technically, we need to build international collaborations between academic and commercial enterprise to develop the scaled-up AI assessment systems that can deliver a new generation of exam-free assessment. And politically, we need leaders to recognize the possibilities that AI can bring to drive forward much-needed educational transformation within tightening budgetary constraints. Initiatives on these three fronts will require financial support from governments and private enterprise working together. Initially, it may be more tractable to focus on a single subject area as a pilot project. This approach would enable us to firm up the costs and demonstrate the benefits so that we can free teachers and students from the burden of examinations.

Rose Luckin is Professor of Learner Centred Design, UCL Knowledge Lab, Institute of Education, University College London, 23–29 Emerald Street, London WC1N 3QS, UK.
e-mail: r.luckin@ucl.ac.uk

References
1. Luckin, R., Holmes, W., Griffiths, M. & Forcier, L. B. Intelligence Unleashed: An Argument for AI in Education (Pearson, 2016); http://go.nature.com/2jwF0zx
2. Bostrom, N. & Yudkowsky, E. in Cambridge Handbook of Artificial Intelligence (eds Frankish, K. & Ramsey, W. M.) 316–334 (Cambridge Univ. Press, 2011).
3. Self, J. Int. J. Artif. Intell. Educ. 10, 350–364 (1999).
4. Hill, P. & Barber, M. Preparing for a Renaissance in Assessment (Pearson, 2014).
5. Mavrikis, M. Int. J. Artif. Intell. Tools 19, 733–753 (2010).
6. Luckin, R. & du Boulay, B. Int. J. Artif. Intell. Educ. 26, 416–430 (2016).
7. Bull, S. & Kay, J. Int. J. Artif. Intell. Educ. 17, 89–120 (2007).
8. Spector, M. & Ramsey, M. U.S. proposes spending $4 billion to encourage driverless cars. The Wall Street Journal (14 January 2016); http://go.nature.com/2jZePEM
9. Toyota will establish new artificial intelligence research and development company. Toyota http://bit.ly/2jRt1gW (5 November 2015).
10. Memorandum Submitted by Association of School and College Leaders (ASCL) (UK Parliament, 2007); http://go.nature.com/2jpIBBN

Competing interests
The author declares no competing interests.
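
Supplementary sketch. Box 1 describes a Student Model Component that strengthens or weakens three judgements for each concept: knowledge (success without hints), potential for development (success with hints) and metacognitive awareness (how accurately the student judges the help they need). The short Python sketch below illustrates that idea only. The class names, the evidence fields, the 0.5 starting values and the simple proportional update rule are all assumptions made for illustration; the article does not specify how AIAssess computes its judgements.

```python
from dataclasses import dataclass, field

LEARNING_RATE = 0.2  # assumed step size; Box 1 does not give an update rule


@dataclass
class StepEvidence:
    """One solution step as an analytics component might report it (hypothetical schema)."""
    concept: str
    correct: bool
    hint_level_used: int       # 0 means no hints or tips were used
    hint_level_requested: int  # level the student judged they would need


@dataclass
class ConceptJudgement:
    knowledge: float = 0.5      # success without any hints
    potential: float = 0.5      # success when hints are given
    metacognition: float = 0.5  # accuracy of the student's own estimate of help needed


def nudge(value: float, success: bool, rate: float = LEARNING_RATE) -> float:
    """Strengthen a judgement towards 1.0 on success, weaken it towards 0.0 otherwise."""
    target = 1.0 if success else 0.0
    return value + rate * (target - value)


@dataclass
class StudentModel:
    judgements: dict = field(default_factory=dict)

    def update(self, ev: StepEvidence) -> ConceptJudgement:
        j = self.judgements.setdefault(ev.concept, ConceptJudgement())
        if ev.hint_level_used == 0:
            # Unaided attempt: evidence about knowledge and understanding.
            j.knowledge = nudge(j.knowledge, ev.correct)
        else:
            # Aided attempt: evidence about potential for development.
            j.potential = nudge(j.potential, ev.correct)
        # The self-estimate counts as accurate only if the step succeeded
        # with exactly the level of help the student asked for.
        accurate = ev.correct and ev.hint_level_requested == ev.hint_level_used
        j.metacognition = nudge(j.metacognition, accurate)
        return j


model = StudentModel()
model.update(StepEvidence("fractions", correct=True, hint_level_used=0, hint_level_requested=0))
model.update(StepEvidence("fractions", correct=False, hint_level_used=2, hint_level_requested=1))
print(model.judgements["fractions"])
```

Keeping the three judgements as separate numbers per concept mirrors the three bullet points in Box 1, and storing each piece of evidence alongside them would be one way to let teachers and learners trace why a judgement changed.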
