
Artificial Intelligence: Teaching Machines to Think Like People
Perspectives from Leading Practitioners in AI and the Science of the Brain

Jack Clark

Beijing Boston Farnham Sebastopol Tokyo


Artificial Intelligence: Teaching Machines to Think Like People
by Jack Clark
Copyright © 2017 O’Reilly Media, Inc. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA
95472.
O’Reilly books may be purchased for educational, business, or sales promotional use.
Online editions are also available for most titles (http://oreilly.com/safari). For more
information, contact our corporate/institutional sales department: 800-998-9938 or
corporate@oreilly.com.

Editor: Shannon Cutt
Production Editor: Shiny Kalapurakkel
Copyeditor: Octal Publishing, Inc.
Proofreader: Christina Edwards
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

April 2017: First Edition

Revision History for the First Edition


2017-04-14: First Release

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Artificial Intelli‐
gence: Teaching Machines to Think Like People, the cover image, and related trade
dress are trademarks of O’Reilly Media, Inc.
While the publisher and the author have used good faith efforts to ensure that the
information and instructions contained in this work are accurate, the publisher and
the author disclaim all responsibility for errors or omissions, including without limi‐
tation responsibility for damages resulting from the use of or reliance on this work.
Use of the information and instructions contained in this work is at your own risk. If
any code samples or other technology this work contains or describes is subject to
open source licenses or the intellectual property rights of others, it is your responsi‐
bility to ensure that your use thereof complies with such licenses and/or rights.

978-1-491-97493-3
[LSI]
Table of Contents

Introduction
Geoff Hinton: Adapting Ideas from Neuroscience for AI
Dharmendra Modha: Brain-Inspired Chips for Better AI, What Comes After Von Neumann, and the Need for Fault Tolerance
Adam Marblestone: What the Brain Tells Us About Building Unsupervised Learning Systems, and How AI Can Guide Neuroscience Research
Leila Wehbe: What Harry Potter and Recurrent Neural Networks Tell Us About the Brain’s Inner Narrative
Tom Griffiths: Why People Have Surprisingly Intelligent Intuitions, and What AI Researchers Can Learn from That

Introduction

This report explores some of the growing connections between the
fields of artificial intelligence (AI) and what we understand about
the brain from disciplines as varied as cognitive science, neuro‐
science, and developmental psychology.
Recently, AI has had a number of notable successes in areas of per‐
ception—particularly for speech and image recognition—and
action; for example, in the case of AlphaGo and robotic control.
These successes have been primarily due to development in two
strands of AI: reinforcement learning and deep learning. Now, as AI
scientists plot their next moves, some of them are turning to the
human brain for inspiration about what to build next.
In this report, I share the thoughts and insights of five of today’s
prominent voices in the AI field.
Geoff Hinton, a professor at the University of Toronto, for instance,
has recently explored neural network models inspired by the vary‐
ing dynamics of the synapses found in our brain. Dharmendra
Modha at IBM has been on a decades-long quest to create an
entirely new type of computer chip that breaks with the traditional
Von Neumann architecture, and instead takes inspiration from the
brain.
Meanwhile, within neuroscience there has been a “shedding of
assumptions” about what the brain can and cannot do, which,
according to MIT “neurotechnologist” Adam Marblestone, has
caused scientists to become more receptive to ideas stemming from
AI.

Breakthroughs in AI also have given researchers such as Leila
Wehbe at UC Berkeley the computational tools they need to conduct
new experiments on how the brain represents language. Tom Grif‐
fiths, a psychology and computer science professor, has also begun
to explore how humans are able to develop surprisingly correct intu‐
itions by developing complex cognitive models, which might inspire
AI technologists.

Geoff Hinton: Adapting Ideas from
Neuroscience for AI

How a better understanding of neurons could lead to smart AI sys‐
tems: An interview with Geoff Hinton, emeritus distinguished pro‐
fessor at the University of Toronto and an engineering fellow at
Google.
A better understanding of the reasons why neurons spike could lead
to smart AI systems that can store more information more effi‐
ciently, according to Geoff Hinton, who is often referred to as the
“godfather” of deep learning.
Hinton is one of the pioneers of neural networks, and was part of
the small group of academics that nursed the technology through a
period of tepid interest, funding, and development.

Key Takeaways
• Large-scale analysis of the brain through research schemes like
the Obama administration’s “BRAIN Initiative” has the promise
to shed light on new aspects of the brain, giving AI designers
new ideas.
• You can adapt ideas from neuroscience into ideas that are rele‐
vant to AI, though it takes some time. Hinton first thought, in
1973, about implementing a system with capabilities similar to
those afforded by the fact that synapses change on multiple time‐
scales, yet it took until 2016 to publish a major paper in this
area.

• It’s relatively easy to develop powerful perception systems, but
we need new techniques to build systems capable of reasoning
and language.

Jack: Why should we look at the brain when developing AI systems,
and what aspects should we focus on?
Geoff: The main reason is that it’s the thing that works. It’s the only
thing we know that’s really smart and has general-purpose intelli‐
gence. The second reason is that for many years a subset of people
thought you should look at the brain to try and make AI work bet‐
ter, and they didn’t really get very far—they made a push in the 80s,
but then it got stalled, and they were kind of laughed at by everyone
in AI, who said, “you don’t look at a bumblebee to design a 747.” But
it turned out the inspiration they got from looking at the brain was
extremely relevant and without that, they probably wouldn’t have
gone in that direction. It’s not just that we have an example of some‐
thing that’s intelligent, we also have an example of a methodology
that worked, and I think we should push it further.
Jack: Today, aspects of modern classifiers like neural nets look vaguely
similar to what we know about the brain’s visual system. We’re also
developing memory systems that are inspired by the hippocampus. Are
there other areas where we can look in the brain and start taking ele‐
ments from, like spiking neurons?
Geoff: We don’t really know why neurons spike. One theory is that
they want to be noisy so as to regularize, because we have many
more parameters than we have data points. The idea of dropout [a
technique developed to help prevent overfitting] is that if you have
noisy activations you can afford to use a much bigger model. That
might be why they spike, but we don’t know. Another reason why
they might spike is so they can use the analog dimension of time, to
code a real value at the time of the spike. This theory has been
around for 50 years, but no one knows if it’s right. In certain subsys‐
tems, neurons definitely do that, like in judging the relative time of
arrival of a signal to two ears so you can get the direction.
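
To make the regularization intuition concrete, here is a minimal NumPy sketch of dropout as noisy activations. The layer, keep probability, and data are invented for illustration and are not from the interview.

import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, keep_prob=0.5, train=True):
    """Randomly zero activations during training; rescale so that the expected
    value matches the deterministic activations used at test time."""
    if not train:
        return activations
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

h = np.tanh(rng.normal(size=(4, 8)))    # a toy hidden layer
print(dropout(h))                       # noisy activations used during training
print(dropout(h, train=False))          # unperturbed activations at test time
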
Another area is in the kinds of memory. Synapses adapt at many dif‐
ferent timescales and in complicated ways. At present, in most artifi‐
cial neural nets, we just have a timescale for adaptation of the
synapses and then a timescale for activation of the neurons. We
don’t have all these intermediate timescales of synaptic adaptation,
and I think those are going to be very important for short-term
memory, partly because it gives you a much better short-term mem‐
ory capacity.
Jack: Are there any barriers to our ability to understand the brain,
which could slow down the rate at which we can develop ideas in AI
inspired by it?
Geoff: I think if you stick an electrode in a cell and record from it,
or put an electrode near a cell and record from it, or near a bunch of
cells and try to record from half a dozen of them, then you won’t
understand things that might easily be understood by optical dyes,
which let you know what a million cells are doing. There are going to
be all sorts of things coming out of the Obama Initiative for brain sci‐
ence, which means we’ll get new techniques that will allow us to see
things and make obvious what would have been very hard to establish
otherwise. We don’t know what they’re going to be, but I suspect
that they will lead to some interesting things.
Jack: So, if we had a sufficiently large neural network, would that be
able to match a human on any given task, or are there missing com‐
ponents that we need?
Geoff: It depends on what particular task you’re talking about. If you
take something like speech recognition, I’d be very surprised if a
really big network exactly matched a human being; I think it’s either
going to be worse or it’s going to be better. Human beings aren’t the
limit. I think actually in speech recognition I wouldn’t be at all sur‐
prised if in 10 years’ time, neural nets can do it better than people.
For other areas, like reasoning and learning from a very small num‐
ber of examples, it may take longer to develop systems that match or
surpass people.
Jack: One problem modern reinforcement learning systems seem to
have is knowing what parts of a problem to devote attention to explor‐
ing, so that you don’t have to waste your time on less-interesting parts
of the image.
Geoff: This is exactly the same in vision. People make very intelli‐
gent fixations. Almost none of the optical array ever gets processed
at high resolution, whereas in computer vision people typically just
take the whole array at low resolution, medium resolution, high res‐
olution, and try to combine the information, so it’s just the same
problem in us. How do you intelligently focus on things? We’re
going to have to deal with the same problem in language. This is an
essential problem, and we haven’t solved it yet.
Jack: You recently gave a lecture on a paper you published about
short-term changes of weights within neural networks. Can you
explain this paper and why you think it is important?
Geoff: In recurrent neural networks, if they’re processing a sentence,
they have to remember stuff about what has happened so far in the
sentence, and all of that memory is in the activations in the hidden
neurons. That means those neurons are having to be used to
remember stuff and so they’re not really available for doing current
processing.
A good example of this is if you have an embedded sentence, like if
someone said, “John didn’t like Bill because he was rude to Mary.”
You process the beginning of the sentence, then you use exactly the
same knowledge processing to process “because he was rude to
Mary.” Ideally, you want to use the
same neurons and the same connections and the same weights for
the connections for this processing. That’s what true recursion
would be, and that means you have to take what you have so far in a
sentence and put it aside somewhere, so the question is: how do you
put it aside? In a computer, it’s easy because you have random access
memory, so you just copy it into some other bit of memory to free
up the memory. In the brain, I don’t think we copy neural activity
patterns, what I think we do is have rapid changes to synapse
strength so we can recreate the memories when we need them, and
we recreate them when the context is such that it would be appro‐
priate.
I published a recent paper with Jimmy Ba and some people at Deep‐
Mind showing how we can make that work. I think that’s an exam‐
ple of how the fact that synapses are changing on multiple
timescales can be useful. I first thought about this in 1973 and made
a little model that could do true recursion on a very simple problem.
A year ago, I went back to that at DeepMind and got it working
within the framework so it learns everything. Back when I first
thought about it, computers had 64K of memory and we didn’t know
how to train big neural nets.
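
For readers who want the mechanics, the following is a rough NumPy sketch of the kind of fast-weight associative memory described in that paper (Ba et al., 2016), with the slow weights omitted; the decay rate, write rate, and single settling step are illustrative simplifications, not the paper’s exact configuration.

import numpy as np

rng = np.random.default_rng(0)
dim = 16
decay, write_rate = 0.95, 0.5        # how quickly the fast weights fade and are written

A = np.zeros((dim, dim))             # fast weights: a temporary associative memory
history = []

# "Write" a short sequence of hidden states into the fast weights as they occur.
for _ in range(5):
    h = np.tanh(rng.normal(size=dim))
    history.append(h)
    A = decay * A + write_rate * np.outer(h, h)    # rapid, decaying Hebbian update

# Later, a noisy cue settles toward the stored pattern it most resembles.
cue = history[-1] + 0.3 * rng.normal(size=dim)
readout = np.tanh(A @ cue)                         # one settling step through A

for i, h in enumerate(history):
    sim = float(readout @ h / (np.linalg.norm(readout) * np.linalg.norm(h)))
    print(f"similarity of readout to stored state {i}: {sim:.2f}")
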
Jack: Do you think AI agents need to be embodied in some form,
either in a robot or sufficiently rich simulation, to become truly intelli‐
gent?

Geoff: So, I think there’s two aspects: one is the philosophical aspect
and the other is the practical aspect. Philosophically, I see no reason
why they have to be embodied, because I think you can read Wiki‐
pedia and understand how the world works. But, as a practical mat‐
ter, I think embodiment is a big help. There’s a Marx phrase, which
is, “If you want to understand how the world works, try and change
it.” Just looking is not as efficient a way of understanding things as
acting. So, the philosophical question is, “is action essential?” If
action is essential to understanding the world, then astrophysics is
in trouble. So, no, I don’t think embodiment is necessary.
Jack: If you’re able to replicate some of the properties of spiking neu‐
rons, and combine that with systems that can form temporary memo‐
ries, what will you be able to build?
Geoff: I think it might just make all the stuff we have today work
better. So, for natural language understanding, I think having an
associative memory with fast changes in the weights would be help‐
ful, and for these feedforward nets, I think coincidence detectors are
much better at filtering out clutter in the background, so they’ll be
much better at focusing on the signal and filtering out the noise.
This could also help with learning from small datasets.



Dharmendra Modha: Brain-
Inspired Chips for Better AI, What
Comes After Von Neumann,
and the Need for Fault Tolerance

What it takes to make silicon chips more like our own brain: An
interview with IBM fellow Dharmendra Modha.
Dharmendra Modha is an IBM fellow and chief scientist for the
company’s brain-inspired computing efforts. Since 2004, he’s been
on a quest to create a chip that has the power, speed, and capability
of the human brain. Now, he’s closer than ever to realizing this
dream, and hundreds of scientists beyond IBM are testing their
ideas on these chips.

Key Takeaways
• The Von Neumann architecture won’t be able to give us the
massively parallel, fault-tolerant, power-efficient systems that
we’ll need to create to embed intelligence into silicon. Instead,
we need to rethink processor design.
• You can’t throw out the baby with the bathwater: even if you
rethink underlying hardware design, you need to implement
sufficiently abstracted software libraries to reduce the pain of
the software developer so that he can program your chip.
• You can achieve power efficiency by changing the way you build
software and hardware to become active only when an event
occurs; rather than tying computation to a series of sequential
operations, you make it into a massively parallel job that runs
only when the underlying system changes.

Jack: Tell us about what you’re building and trying to achieve with the
neuromorphic computing project.
Dharmendra: If you look at the computation the brain is doing,
then the computation cannot be efficiently performed by the Von
Neumann architecture. So, what we’ve been doing is taking inspira‐
tion from the structure and dynamics and behavior of the brain to
see if we can optimize time, speed, and energy of computation.
At the same time, we want to maximize the capability that one can
deliver, while allowing for scalability. An AI system we developed
earlier in the year for Lawrence Livermore National Lab had 16
TrueNorth chips tiled in a 4-by-4 array. The chips are designed to be
tiled, so scalability is built in as a design principle rather than as an
afterthought. This architecture is designed to be a complement, and
not a competition, to the existing Von Neumann architecture that
has been with us since 1946.
Jack: Why is the Von Neumann architecture insufficient for building
brain-like computers?
Dharmendra: The Von Neumann architecture is not suitable for
brain-like computation because at its heart it consists of a CPU or
some sort of computation engine, memory, and a bus, which con‐
nects the two. The bus becomes the bottleneck and also sequentializes
the computation, because you move data to and from the memory
over it.
So, if you have to flip so much as a single bit, you have to read that
bit from the memory and write it back.
You can imagine that in the Von Neumann architecture the memory
is like oranges in Florida and computation is consuming the oranges
in California and the highway system in the United States is the bus.
So, what happens is you need to move oranges from Florida to Cali‐
fornia, and, in the process, the highway gets backed up. But what if
we moved into the groves so each person had access to a private
tree? You pluck oranges when you need them.
The idea is that if you co-locate memory and computation and
closely intertwine communication, just like the brain does, then
you can minimize the energy of moving bits from memory to
computation. You can get event-driven computation rather than clock-
driven computation, and you can compute only when information
changes.
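
As a toy illustration of the contrast, the sketch below compares a clock-driven update (every connection touched on every tick) with an event-driven update (work done only along the fan-out of neurons that actually spiked). The network, thresholds, and sizes are invented and do not reflect TrueNorth’s actual programming model.

import numpy as np

rng = np.random.default_rng(1)
n = 1000
weights = (rng.random((n, n)) < 0.01) * rng.normal(size=(n, n))   # ~1% sparse fan-out
threshold = 1.0

def clock_driven(spikes, potentials, steps=20):
    """Touch every connection on every tick, whether or not anything changed."""
    work = 0
    for _ in range(steps):
        potentials = potentials + weights @ spikes   # all n*n connections each step
        work += n * n
        fired = potentials > threshold
        potentials[fired] = 0.0
        spikes = fired.astype(float)
    return work

def event_driven(spikes, potentials, steps=20):
    """Do work only along the outgoing connections of neurons that spiked."""
    work = 0
    active = np.flatnonzero(spikes)
    for _ in range(steps):
        for j in active:                              # only spiking neurons propagate
            targets = np.flatnonzero(weights[:, j])
            potentials[targets] += weights[targets, j]
            work += len(targets)
        fired = np.flatnonzero(potentials > threshold)
        potentials[fired] = 0.0
        active = fired
    return work

p0 = np.zeros(n)
s0 = (rng.random(n) < 0.02).astype(float)             # 2% of neurons active at the start
print("clock-driven synaptic updates:", clock_driven(s0, p0.copy()))
print("event-driven synaptic updates:", event_driven(s0, p0.copy()))
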
You’re also fault tolerant. We have 4,096 parallel cores, so if some of
the cores fail, no big deal. The TrueNorth chip consists of one tiny
modular core that is repeated 4,096 times in a 64-by-64 array, so the
design complexity only increases with the size of the module, while
the architecture capability increases with how many modules you
are able to tile. You decrease the design complexity while increasing
the architectural capability, and you get fault tolerance and scalabil‐
ity for free, so that’s where we break from the Von Neumann archi‐
tecture.
This changes the programming model. The Von Neumann para‐
digm is, by definition, a sequence of instructions interspersed with
occasional if-then-else statements. Compare that to a neural net‐
work, where a neuron can reach out to up to 10,000 neighbors. In
TrueNorth, we can reach out to up to 256, and the reason for that
disparity is because we have silicon and not organic technology. But
you can see there’s a very high fan-out, and high fan-out is difficult
to implement in a sequential architecture.
Jack: Once we have these different computational substrates, how does
the software need to change?
Dharmendra: What we did is create a layer of firmware on top of
the chip, and on top of that we created a new programming lan‐
guage to encapsulate the chip, and on top of that we created deep-
learning tools that allow people to program the machine.
It has required us to change our thinking and be more open to many
kinds of neural network structures. We’ve found that neural net‐
works have remarkably robust structures, and they can be maimed
or surgically altered in multiple ways and yet maintain their ability
to approximate a wide variety of functions to be able to classify a
wide variety of objects.
To get the advantages we had to simplify the architecture. We had to
go from 32-bit synapses to 1-bit synapses; we had to go from arbi‐
trary neuron states to spiking 1-bit neurons; and we had to go from
the arbitrary connectivity that’s possible in software to connectivity
in which each neuron can reach at most 256 others.

Jack: During the course of developing this, did you find yourself look‐
ing at elements of what we know about the brain to guide your own
R&D?
Dharmendra: The cortex is believed to be the seat of perception,
action, cognition, motion, and interaction, and it is hypothesized to
consist of tiny little cortical microcircuits that are repeated through
different parts of the brain, whether dealing with sight, taste, touch,
smell, motor functionality, memory, and so forth. Essentially, the
same structure occurs in the whole mammalian spectrum, from the
tiniest mammal, all the way to mouse, rat, cat, monkey, human, ele‐
phant, whale, and so on.
The idea of creating a module with a neurosynaptic core, which is a
tiny little neural network tiled in an infinite sea of cores, came from
how the cortex itself seems to have, in a scalable fashion, given us
the substrate for our own mental prowess.
The second idea was that neurons in the brain are event-driven; they
don’t just fire at a 1 GHz clock frequency like normal chips do, but
they convey information if and when it is necessary and only
when it is necessary. Modularity allows us to bring memory and
computation and communication together, and then event-driven
computation with spiking neurons allows us to really minimize
active energy and to compute sparingly and communicate sparsely.
There’s also the fault-tolerant aspect: you and I lose neurons all the
time and yet we retain our function, at least outwardly, so that sort
of fault tolerance is also in there.
What we did is put all these constraints together in a simulation and
the simulation grew from mouse scale to cat scale to monkey scale,
and eventually to human scale, with 10^14 synapses. This simulation
required 96 racks of Blue Gene Sequoia at Lawrence Livermore
National Labs with 1.5 million processors, 1.5 petabytes of main
memory, and 6.3 million threads. The simulation ran about 1,500
times slower than real time. We worked out that a hypothetical real-
time supercomputer simulating 100 trillion synapses would need about
12 GW of power—that’s enough to power New York City, Los
Angeles, and we can throw in any midwestern city of your choice
for free (other than Chicago, which consumes a lot). So, it’s a lot of
power. Our ultimate goal is to achieve a brain in a box with 10 bil‐
lion neurons in 2 liters of volume and 1 kilowatt of power in the
foreseeable future.
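
The rough arithmetic behind that 12 GW figure is worth spelling out; the ~8 MW draw assumed below for the Sequoia system is an outside approximation, not a number from the interview.

# Back-of-the-envelope scaling behind the 12 GW figure quoted above.
sequoia_power_mw = 8       # rough power draw of the Sequoia system, in megawatts (assumption)
slowdown = 1_500           # the simulation ran about 1,500 times slower than real time

# Naively, running the same workload in real time needs ~1,500 times the machine.
realtime_power_gw = sequoia_power_mw * slowdown / 1_000
print(f"~{realtime_power_gw:.0f} GW for a real-time simulation of 100 trillion synapses")
# Compare with the stated goal: 10 billion neurons in 2 liters and about 1 kW.
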

Adam Marblestone: What the
Brain Tells Us About Building
Unsupervised Learning Systems,
and How AI Can Guide
Neuroscience Research

Exploring the permeable border between neuroscience, cognitive
science, and AI with self-styled “neurotechnologist” Adam Marble‐
stone.
Adam Marblestone is the director of scientific architecting within
the Synthetic Neurobiology Group at MIT Media Lab. Prior to that,
he explored the design of scalable biological interfaces and the prin‐
ciples behind cognition in the cortex at Harvard.

Key Takeaways
• The worlds of AI and neuroscience are converging due to the
increasing sophistication of AI models and the “shedding of
assumptions” within the neuroscience community about what
the brain can do and how it does it.
• Creating computers that can perform feats of unsupervised
learning might require us to first create a learning system that
contains a number of self-supervising learning functions, dicta‐
ted by biases baked into the system. This is similar to the way
that children appear to have a bias in their brains for spotting
hands, which lets them ultimately learn more complex visual
elements.

• The brain uses a fundamentally different model of memory
than that implemented in neural network approaches like the
Neural Turing Machine.

Jack: Why are the two fields, AI and neuroscience, coming closer
together?
Adam: In addition to the progress in neural network-based AI,
there has been a shedding of assumptions within neuroscience itself
that makes room for such connections.
A perhaps less-obvious part of this comes from the removal of
assumptions researchers have had about how the brain might possi‐
bly work. Go back to the 1980s and think about the mechanisms of
training neural networks, like backpropagation. When that stuff
came out, the neuroscientists said, “this is totally unlike what the
brain does.” For example, it requires one variable that codes for the
activation of the neuron, and another variable that codes for the
error signal that goes to that neuron, so it requires information to be
flowing in multiple directions. They said, “neurons don’t do that,
they have only the axon as an output, which goes in only one direc‐
tion.” So, there were a lot of assumptions that restricted what kinds
of models neuroscientists considered to be “biologically plausible.”
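
To see why that two-variable requirement bothered neuroscientists, here is a minimal NumPy sketch of backpropagation in a two-layer network: each hidden unit carries an activation computed on the forward pass and a separate error signal computed on the backward pass. The sizes, data, and learning rate are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 10))              # toy inputs
y = rng.normal(size=(32, 1))               # toy regression targets
W1 = rng.normal(size=(10, 20)) * 0.1
W2 = rng.normal(size=(20, 1)) * 0.1

for step in range(200):
    # Forward pass: each unit's activation flows in one direction.
    h = np.tanh(x @ W1)
    y_hat = h @ W2

    # Backward pass: a second quantity per unit, the error signal,
    # flows in the opposite direction through the same weights.
    err_out = (y_hat - y) / len(x)
    err_hidden = (err_out @ W2.T) * (1 - h ** 2)

    W2 -= 0.5 * h.T @ err_out
    W1 -= 0.5 * x.T @ err_hidden

print("final mean squared error:", float(np.mean((np.tanh(x @ W1) @ W2 - y) ** 2)))
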
Since then, empirical neuroscience has gone through a series of
intellectual changes. In the 1990s it moved into “molecular neuro‐
science,” revealing a vast amount of complexity of the molecular
machinery at the synapse and in the cell, and in the 2000s it talked
about “circuit neuroscience,” where a wide variety of cell types and com‐
plex modes of interaction between neurons came to light.
There has been this huge deepening since then in what people
understand about the actual physical structure of the neuron itself
and the complexity of the neural circuitry, which has caused people
to go beyond what you could call a “cartoon model” of neurons.
Before you have detailed knowledge of the biophysics of neurons,
you’re tempted to think of them as perfect little spheres that com‐
pute a result by taking a sum of their inputs and applying a simple
threshold, but it turns out they have a huge amount of internal state.
They compute molecularly, they have molecular states, gene expres‐
sion, multiple electrical subcompartments in the dendritic tree, and
they can execute very specific internal learning rules. So, the indi‐
vidual neuron is much more complex than we thought.

This new window on the complexity of neurons and circuits means
that traditional arguments against some sort of generic and power‐
ful optimization or learning process, like backpropagation, should
go away. You can think of what’s happening in deep learning as
being founded on this almost universal algorithm of backpropaga‐
tion, or gradient descent, for optimization, whether it be supervised
or unsupervised or reinforcement based—backpropagation is being
taken as a black box and applied to a wide range of architectures.
So, there’s a universal and powerful way of learning, which is back‐
propagation, and neuroscientists had thought the brain “can’t possi‐
bly” be doing that. What computational neuroscientists postulated
instead were relatively random networks, with local learning rules,
which are weaker, and these models were inspired by statistical
physics as much as by what really worked computationally.
In summary, it turns out that neurons and circuits are more com‐
plex than we thought, and my take is that this means that traditional
arguments against the brain doing learning algorithms at least as
powerful and generic as backpropagation were wrong. In the past,
and even now, models from theoretical neuroscience haven’t been
based on the ground truth circuitry, because we didn’t know what
the ground truth circuitry actually looked like in detail. They’ve ten‐
ded to assume random circuitry, and they’ve tended to assume rela‐
tively simple learning rules that are not that powerful compared to
what machine learning has chosen to make use of. As we start to
question those assumptions and find that these circuits in the brain
are nonrandom, quite complicated, capable of processing informa‐
tion with internal states, and so on, this opens up the possibility that
the brain can do at a minimum the types of things machine learning
was doing in terms of optimization, if not even more clever things.
Jack: What clues does the brain hold for things like unsupervised
learning?
Adam: One assumption that people tend to make is that the cerebral
cortex is some kind of unsupervised learning system. I think this
may not be quite the right way to look at it.
It’s not necessarily a clue from looking at the brain circuitry, but it’s a
clue from how biology tends to work—unsupervised learning, as
such, may not be quite the right thing to go for.

A way to think about it is that, with the way people talk about unsu‐
pervised learning now, you have an algorithm that can extract struc‐
ture from any data stream, any arbitrary datastream whatsoever,
whether it be stock prices or weather forecasting or anything, and
that algorithm is going to be doing the heavy lifting. But in contrast,
in the brain, with billions of years of evolution, what you have is the
opportunity to build in things specifically for the inputs that matter
for humans or for a particular biological organism. The datastreams
are not generic or arbitrary, and neither should the brain’s algo‐
rithms be equipotent for arbitrary types of data. The brain is allowed
to make some specific assumptions, built in by evolution, about the
world it is going to grow up in.
Certain problems are probably hard for an unsupervised learner to
solve. Shimon Ullman’s work at MIT has given a fantastic example
of this. One of the problems they talk about is the notion of detect‐
ing hands—a person’s hands—and the direction of gaze of a person’s
face. You can imagine human beings have to learn certain very-
specific things, in a certain order, to grow up successfully in the
human social environment. In order to learn language or compli‐
cated behavior, you need to learn from other humans, and to do
that, you need to learn where their hands are and whether they are
looking at you right now. You can imagine a baby has to solve not
just a general unsupervised learning task, but they have to ask very
specific questions, find the faces, find the hands, figure out where
the eyes are looking, find out if my mother is talking to me, and so
on.
If we take the example of hands, you need a simple classification
algorithm where you’re trying in more or less an unsupervised way
to find where the hands are. What Shimon Ullman found is that it’s
not easy to find the hands if you are just doing unsupervised cluster‐
ing, but it’s possible to have some prior knowledge that could have
been built into the brain by evolution, like that hands tend to
approach objects in a certain way, and that there’s a particular pat‐
tern of motion that is characteristic of hands. It turns out that that
motion could be detected by a specific algorithm via a specific kind
of calculation of so-called “optical flow.” So, this is not generic unsu‐
pervised learning, but highly specific, prebiased learning. You can
imagine this could be built into retinal areas to help the baby form an
inductive bias.
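
As a rough illustration of a built-in motion bias, the toy sketch below uses simple frame differencing to localize a moving patch; it is only meant to convey the flavor of a motion cue, not Ullman’s actual optical-flow computation.

import numpy as np

rng = np.random.default_rng(0)

def motion_energy(frames):
    """Crude motion cue: average per-pixel energy of frame-to-frame differences."""
    return np.abs(np.diff(frames, axis=0)).mean(axis=0)

# Toy video: low-level background noise plus a small bright patch that drifts.
T, H, W = 20, 64, 64
frames = rng.normal(0.0, 0.05, size=(T, H, W))
for t in range(T):
    r, c = 10 + t, 20 + t                    # the "hand" moves diagonally
    frames[t, r:r + 6, c:c + 6] += 1.0

energy = motion_energy(frames)
row, col = np.unravel_index(np.argmax(energy), energy.shape)
print("highest motion energy near pixel:", (int(row), int(col)))   # on the patch's path
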

So, it is possible that you may have a bunch of supervised learning
tasks that are self-supervised using clever tricks or cues built in by
evolution. You start by having the brain create some simple biases,
like identifying hands. This is a little different from the way the AI
people have done supervised learning, where it is often meant to be
totally independent of the precise problem you are solving. This is
not to say that we don’t have highly generic learning processes in the
brain, and that some of it is supervised by very generic kinds of sig‐
nals like “prediction of the next input.” But this is different from
what one might call “unsupervised pattern classification” and is
likely to include much more specialized mechanisms being unfolded
in a specific order to bootstrap off of one another during develop‐
ment.
Jack: What sort of fruitful overlaps between AI and neuroscience do
you see developing in the coming years?
Adam: There are two interesting ones that come to mind, although
many others exist and more will likely emerge. One is finding out if
the brain does, in fact, do backpropagation, or something quite like
backpropagation. We know a lot of the brain is driven by reward
signals, dopamine, and so forth, but nobody knows whether the
brain really does this generic kind of powerful multilayer optimiza‐
tion that backpropagation enables. Does the architecture of the cor‐
tex support backpropagation? If so, then where are the error signals?
Researchers like Jim DiCarlo at MIT are starting to look for them.
Concretely, some scientists, David Sussillo, for example, take an arti‐
ficial neural network of a kind meant to be relatively well mapped to
the brain, and train it under some assumptions about how a particu‐
lar area of the brain works, using backpropagation. Then, they ana‐
lyze the neurons in the brain and look for the same dynamics. They
are asking: does the brain do the kinds of things it would do if it
were optimized by learning for a particular task, given a particular
kind of basic network structure? It remains to be seen how far this
can go. There’s a very interesting question of how far you can push
that kind of research, because if you can push it far, it would bolster
this idea at some fundamental level that the brain is doing optimiza‐
tion. As we look at the connections between neuroscience and AI,
empirically you might start to see that although one is optimized by
learning processes and built-in heuristics shaped by evolution, and
the other by a programmer running TensorFlow or something, they
are converging on similar dynamics. They are both optimized to do
the same thing. That would be encouraging progress.
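
One common way to make such model-to-brain comparisons concrete is representational similarity analysis; the sketch below uses random stand-in data rather than real recordings, and it is not necessarily the specific analysis the researchers mentioned here use.

import numpy as np

rng = np.random.default_rng(0)
n_stimuli = 12

# Stand-ins for real data: responses of model units and of recorded neurons
# to the same stimuli (rows = stimuli, columns = units or neurons).
model_responses = rng.normal(size=(n_stimuli, 50))
brain_responses = (model_responses @ rng.normal(size=(50, 30))
                   + 0.5 * rng.normal(size=(n_stimuli, 30)))

def dissimilarity(responses):
    """Pairwise (1 - correlation) between stimulus response patterns."""
    return 1 - np.corrcoef(responses)

def rsa_score(a, b):
    """Correlate the upper triangles of two dissimilarity matrices."""
    iu = np.triu_indices(len(a), k=1)
    return float(np.corrcoef(a[iu], b[iu])[0, 1])

score = rsa_score(dissimilarity(model_responses), dissimilarity(brain_responses))
print(f"representational similarity between model and 'brain': {score:.2f}")
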
The second really interesting area has to do with these memory sys‐
tems. If you look at deep learning, memory becomes fundamental to
these questions of variable binding: how planning is done, simula‐
tion of future states of the world, and simulation of the effects of
actions. All these tasks require selective access to different types of
memories, basically pulling up the right memories at the right times.
How the hippocampus, through its interaction with the cortex,
actually stores different kinds of memories is therefore a very inter‐
esting question. There’s a lot of very interesting biophysics going on
in the hippocampus, and some of it may be relevant to these mem‐
ory computations. There are waves of oscillations called theta oscil‐
lations, and the neurons are also coordinating their firing with this
global oscillation, via relative timings of the firings. The neurons are
potentially even compressing series of events that occur into specific
subcycles of these waves, possibly leading to a way to represent tem‐
poral linkages between events, or, in other words, “first this hap‐
pened, then that happened,” or “if this happened, it is likely that will
happen next.”
If we had a better understanding of how the brain represents
sequences of events and actions, then from my limited understand‐
ing, it seems it would be a key insight for resolving a lot of the issues
in AI right now, like hierarchical reinforcement learning. That’s
because we carry out actions that occur on multiple timescales. If
you want to travel from Boston to San Francisco, you have a whole
series of things you have to do. A human would think: I should go to
the airport, but first I need to get in the car, but before that I need to
move my muscles so I can walk to the car. How is that kind of flexi‐
ble interface between actions at different levels of abstraction repre‐
sented, and how do we interpolate between long-range plans and
what we need to do now to get them done? If we understood the
representation of temporal sequences of events and actions in mem‐
ory in the brain, that might be helpful, but of course we can’t know
for sure!

Leila Wehbe: What Harry Potter
and Recurrent Neural Networks
Tell Us About the Brain’s Inner
Narrative

Why certain aspects of neuroscience are contingent on the develop‐
ment of AI technologies, according to researcher Leila Wehbe.
Leila Wehbe is a postdoctoral researcher within the Helen Wills
Neuroscience Institute at Berkeley, where she uses fMRI and MEG
techniques to study how the brain represents the meaning of words,
sentences, and stories.

Key Takeaways
• The AI community needs to develop richer, more representative
language models to enable researchers to conduct more detailed
studies of the brain.
• Neural network tools for language and image understanding
can help researchers perform experiments about more abstract
concepts, like the experience of reading books.
• Further understanding from neuroscience about how the brain
represents language and simulates ideas would help people in
the AI community build more powerful language models.

Jack: How has your study of the brain challenged your own assump‐
tions about language?

Leila: The biggest assumption I had from reading the literature
before starting this investigation is that language is constrained only
to the left hemisphere or certain parts of the temporal cortex. Once
we started studying language in a more naturalistic setting, like
making subjects read entire stories, we found that not only are the
regions involved in language processing larger than previously
thought, but they are also very bilateral—they span both the left and
the right hemispheres.
The interesting thing with language is that the brain is never doing
only language. Language interacts with other cognitive tasks that we
use in everyday life. For example, we had people read a chapter from
the first Harry Potter book. We found a region that was correlated in
activity with what character our subjects were thinking about. It
happens that this region is also involved in how we perceive other
people’s intentions in everyday life. We found another region that
was correlated with subjects reading about the characters’ physical
actions such as running. That region happens to be involved in per‐
ceiving other people’s physical actions in everyday life. From this
observation, we can hypothesize that the brain is using the same
regions it uses in everyday life to imagine the scenes it is reading.
Jack: So, where does the machine learning component come into your
work?
Leila: We are dealing with a complex text and complex brain data,
and we are trying to find correspondences between features of the
text and the activity in different regions. We need machine learning
both for modeling the features of that text and for finding which
brain regions are responsive to these different features.
The first stage is to model the content of the text. The brain under‐
stands the text stimulus; we want to know how it does it. The
method we use is to find which features of the text modulate the
activity of every brain region. To do that, we need a numerical rep‐
resentation of the different features of that text so that we can esti‐
mate how much they predict brain activity. We use natural language
processing tools for this step. An example of a model we can use is a
semantic vector space, such as word2vec, that approximates the
meaning of words. Such vector spaces are obtained from large
online corpora, and they summarize the statistics of how words
occur around other words. The idea is that words with similar
meaning occur with similar patterns. People have constructed more
complex vector representations, such as models that compose multi‐
ple successive words together into phrases or sentences. Such vec‐
tors could also be used to model brain activity.
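
The distributional idea behind such vector spaces can be shown in a few lines: represent each word by the counts of words that appear near it, then compare words by cosine similarity. The tiny corpus below is invented, and word2vec learns denser vectors than this, but the intuition is the same.

import numpy as np

# Represent each word by the counts of words appearing within a +/-2-word
# window, then compare words by cosine similarity.
corpus = ("the cat ate the fish . the dog ate the meat . "
          "the cat chased the dog . stocks fell sharply today .").split()
vocab = sorted(set(corpus))
index = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))

for i, w in enumerate(corpus):
    for j in range(max(0, i - 2), min(len(corpus), i + 3)):
        if j != i:
            counts[index[w], index[corpus[j]]] += 1

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print("cat ~ dog:   ", round(cosine(counts[index["cat"]], counts[index["dog"]]), 2))
print("cat ~ stocks:", round(cosine(counts[index["cat"]], counts[index["stocks"]]), 2))
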
We therefore have two parallel streams: a stream of recorded brain
activity while reading words from a text, and a stream of the feature
vectors that correspond to those words. We use these data to learn
the correspondence between brain activity and the features of the
words. Here, we use machine learning to choose and fit models, to
deal with the dimensionality of the data, and to make statistically
sound inferences from our results.
Jack: Why is machine learning necessary for these experiments?
Leila: Ideally, if you wanted to know how the brain represents all the
words in English, and you had enough time, you would show a sub‐
ject all the words in the English dictionary and get the brain image
for all of these words, but that would take forever. Instead, what you
can do is capture word meaning in a smaller dimensional space,
such as word2vec. Now you don’t have to show the subject all the
words, you can show them a subset and you can learn, from these
data, how the brain responds to the different dimensions on which
these words vary. For example, many of the words will be edible and
many will not, and this way you can learn the representation of edi‐
bility. In our experiments, we never explicitly model edibility, but it
is accounted for in some dimensions of semantic vector spaces. We
always learn the responses to all the different dimensions simultane‐
ously using regression.
Different regions in the brain have different roles. Some regions are
involved in visually perceiving the letters on the screen, and so their
activity is not going to be modulated by word meaning. Some other
areas of the brain are going to care about some specific dimensions
of word meaning, but maybe not about other meaning dimensions
or about low-level information such as letters. You can build a linear
model that would predict activity in every voxel in the brain; that
model will try to estimate how different features affect brain activity.
Say you find that the activity of a particular voxel increases by a
large amount if something is animate and by a small amount if it’s
edible. If you find that these weights that you computed from the
data are statistically reliable, then you can infer that this voxel repre‐
sents animacy to a greater extent than edibility, and that it’s generally
representing the meaning of words.

The way you determine the reliability of your models is by using a
training set to fit the model and then a separate test set to see where
in the brain the predictions of your model are correct. As we said
before, you have two parallel streams: feature vectors and brain
images. You use both the feature vectors and the corresponding
brain images from the training set to fit the model. Then, for the
separate test set, you take only the feature vectors, and use your fit‐
ted model to predict brain activity for that test set. You can then
compare the predictions with the real brain images that were
acquired. If you are using a semantic feature space to do your mod‐
eling, only in voxels that care about semantics will you see an accu‐
rate prediction of brain activity. Similarly, you could, instead of
using semantic vectors, have represented words by their visual prop‐
erties. The areas that will be well predicted in this case will be differ‐
ent: instead of the semantic areas, you will get the visual areas.
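
Here is a minimal sketch of that train/test encoding-model procedure, with synthetic data standing in for word feature vectors and voxel responses; the ridge penalty, dimensions, and noise levels are illustrative choices rather than details from the actual studies.

import numpy as np

rng = np.random.default_rng(0)
n_words, dim, n_voxels = 500, 50, 200

# Synthetic stand-ins: feature vectors for presented words, and voxel responses
# in which only the first 100 "voxels" actually depend on the features.
features = rng.normal(size=(n_words, dim))
true_weights = rng.normal(size=(dim, n_voxels))
true_weights[:, 100:] = 0.0
responses = features @ true_weights + rng.normal(scale=2.0, size=(n_words, n_voxels))

train, test = slice(0, 400), slice(400, 500)

# Fit a ridge regression from features to every voxel on the training set.
lam = 10.0
X, Y = features[train], responses[train]
W = np.linalg.solve(X.T @ X + lam * np.eye(dim), X.T @ Y)

# Evaluate on held-out words: correlate predicted and observed responses per voxel.
pred = features[test] @ W
held_out = responses[test]
scores = [np.corrcoef(pred[:, v], held_out[:, v])[0, 1] for v in range(n_voxels)]
print("mean prediction r, feature-sensitive voxels:", round(float(np.mean(scores[:100])), 2))
print("mean prediction r, insensitive voxels:      ", round(float(np.mean(scores[100:])), 2))
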
Jack: What types of tools being developed by the machine learning
community are useful for your work?
Leila: Progress in NLP (natural language processing) has been cru‐
cial for this type of study. As I said before, semantic vector spaces
have been very useful to study the representation of meaning. Other
useful tools being developed are, for example, methods that perform
semantic composition of sequences of words. These allow us to
study things like meaning composition; they would allow us to go
beyond a simple averaging of single-word vectors. That’s crucial for
studying composition in the brain, since it provides a numerical
representation of that task.
In one of my papers (http://aclweb.org/anthology/D/D14/D14-1030.pdf;
see also http://www.cs.cmu.edu/afs/cs/project/theo-73/www/MEGstories/), I used a
recurrent neural network to obtain a representation of context. If
you have a sequence of consecutive words, you can use a previously
trained recurrent neural network to generate a representation of all
the words preceding each word. That’s useful because we don’t just
hear words in isolation; a lot of the meaning is within the context
formed by the words and sentences before each word.
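
A sketch of what “a representation of all the words preceding each word” can look like in practice, using a tiny untrained Elman-style recurrent network over toy embeddings; in the actual work a trained language model supplies these states, and everything below is invented for illustration.

import numpy as np

rng = np.random.default_rng(0)
vocab = {"harry": 0, "raised": 1, "his": 2, "wand": 3}
embed = rng.normal(size=(len(vocab), 8))          # toy word embeddings
W_in = rng.normal(size=(8, 16)) * 0.3
W_rec = rng.normal(size=(16, 16)) * 0.3

def context_states(words):
    """For each word, return the hidden state summarizing the words before it."""
    h = np.zeros(16)
    states = []
    for w in words:
        states.append(h.copy())                   # context = everything read so far
        h = np.tanh(embed[vocab[w]] @ W_in + h @ W_rec)
    return np.array(states)

ctx = context_states(["harry", "raised", "his", "wand"])
print(ctx.shape)      # (4, 16): one context vector per word, aligned with the text
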
There are also things that haven’t been integrated in the neuro‐
science community yet; for example, tools that attempt to automati‐
cally generate captions for images by learning mappings
between images and text. These are potentially interesting tools
because brain imaging poses a similar research problem; you have
brain activity and you have text and those are two parallel streams:
how can you build a model that goes from one to the other? On
another analysis level, the brain maps visual things to textual things
just like the caption generation networks are trying to do. How does
it do it? These are underexplored areas and would be really interest‐
ing to study.
Basically, we cannot study how the brain is working under naturalis‐
tic, complex tasks if we don’t have a good numerical representation
for those tasks that we can use to model its activity. That’s what these
machine learning tools are useful for.
If you didn’t have these functions or models you’d be limited to
things like comparing two different conditions in the brain. For
example, you would have to study individual semantic concepts one
by one. It’s a very slow progression that way, because we’d have to
study every single small variation by itself. These ML [machine
learning] models open up the possibility of doing these naturalistic
experiments and thinking about how to study the brain as a compu‐
tational system.
Jack: Are there other tools that the community could create to let you
do your work?
Leila: Vectors are really useful but they’re not complete. When we
think, for example, of someone doing an action, there are other NLP
tools that have been developed to capture the structure of such an
event. Some of these tools model the action as a graph; for example,
you have an agent and an action being performed. In general, the
way we think about events is probably not in a linear fashion; it’s not
like we have an RNN in our brain that is simply accruing meaning.
We are building a complex and dynamic meaning graph of some
sort. It would be nice to use a rich representation of meaning such as
one provided by a complex graph denoting relationships between
entities, but this comes with its own challenges because the brain
representation of elements of a complex model is harder to learn
with limited data. At some point, NLP will be able to have a very
good model for the actual meaning of the sentence. A caricature
would be that you give it a sentence and it gives you an image corre‐
sponding to what the brain imagines from the sentence, for exam‐
ple. That would be useful for accurately modeling brain responses
and understanding exactly where and how meaning is represented.

Jack: Are there any discoveries people have made in neuroscience
about language that you feel the AI/NLP community can learn from?
Leila: In the vision realm, there has been a very good parallel between lay‐
ered models (such as convolutional neural networks) and how the
brain works. In fact, you could even use the output of different lay‐
ers of a neural network processing an image to predict the brain
activity of successive brain areas along the visual perception stream.
In language, it is still unclear how to do this type of crosstalk. Per‐
haps one area is how the brain deals with unexpected or wrong lan‐
guage input. There are error-correcting signals that arise in the
brain a few hundred milliseconds after it notices a mismatch
between the meaning it was predicting and what the speaker really
meant. The brain then proceeds to correct the meaning and
inferring the correct intention of the speaker. Maybe if the brain
mechanisms behind this response were better understood, they
could be included as inference-correcting mechanisms in a language
model.
Another thing is that the brain uses a lot of different sources to infer
meaning: social situations, visual stimulus and gestures, tonality and
rhythm, etc. This type of multimodal reasoning would be good to
replicate. People in deep learning are already trying to work along
these lines.

Tom Griffiths: Why People Have
Surprisingly Intelligent Intuitions,
and What AI Researchers Can
Learn from That

Figuring out what makes people able to effectively and efficiently
reason can tell us more about intelligence: An interview with UC
Berkeley Professor Tom Griffiths.
Tom Griffiths is a professor of psychology and computer science at
the University of California at Berkeley. He studies human cognition
by trying to understand the systems we use to help us navigate the
complex, algorithmic space of day-to-day life, and then analyzes the
difference between the actions humans take and the algorithmically
optimal ones to understand our cognition.

Key Takeaways
• AI will be able to exceed human capabilities in certain areas by
being able to consider far more information when making a
decision.
• Now is the right time for AI people and developmental psychol‐
ogists to begin getting together and exchanging information, as
tools like reinforcement learning and convolutional neural net‐
works have become advanced enough that they can benefit from
insights governed by these approaches.

• Some of the most impressive aspects of human cognition con‐
cern how we’re able to use biases to help us learn.

Jack: What does our understanding of how humans make decisions
suggest about how to build better AI/ML systems?
Tom: I think it’s tempting to look at human decision-making and
say, “people make errors, people are biased, let’s not build our AI
systems like that!” I take kind of the opposite view. Considering the
difficulty of the computational problems that people are solving and
the computational resources we have for solving them, I think we do
a pretty good job. To me, those errors and biases are a signature of
the difficulty of the problems and the efficiency of the algorithms we
use to navigate them. I tend to think about decision-making as a
matter of trading off error with cost of computation (or time spent
thinking). From that perspective, you probably aren’t going to want
computers to make decisions exactly like people, because the cost of
computation will potentially be radically different. But what I think
people do well is solve the meta-level problem of finding effective
decision-making strategies, given this tradeoff, and I think that’s
exactly what we should be trying to emulate in our AI systems. If
you’re trying to solve an intractable problem in a reasonable amount
of time, you’re going to make errors, you’re going to be biased. But
you want to be smart enough to find algorithms that are as efficient
as possible while keeping those errors and biases to a minimum.
Jack: What motivates your research, and how does it intersect with AI?
Tom: The big questions that motivate my research are about the
mathematical principles that underlie human intelligence. Despite a
lot of progress, there are still plenty of problems where humans are
far better at solving them than AI or machine learning. The classic
examples are things like learning language just from positive input,
learning new concepts from small numbers of examples, learning
causal relationships by observing actions between variables, and
actually being able to scale that up to doing things like science,
which are fundamentally about making those causal discoveries.
You can illustrate some of the challenges for current AI through
some of its major successes. Two big successes are AlphaGo and
DeepBlue. AlphaGo is very impressive, but if you look at how much
data was required for that system to get to where it was, it would be
the equivalent of thousands of years of experience, whereas humans
have tens of years of experience. DeepBlue was, again, super impres‐
sive, but if you look at what was required in terms of the power costs
and amount of computation to beat a human being, the human
being was relying on the equivalent of the amount of power you’d
need to illuminate a light bulb and evaluating on the order of tens of
moves, versus hundreds of thousands or millions of moves being
evaluated by the computer system. I think those two examples illus‐
trate two of the big strengths of human cognition, which are being
able to learn from small amounts of data, and being able to come up
with really good algorithms to solve problems despite the computa‐
tional constraints human minds operate under.
The kinds of things I’ve worked on in machine learning have been
about building more expressive probabilistic models of language
and also nonparametric Bayesian models. One of the distinctive
properties of those nonparametric Bayesian models is they give you
a way of learning in which you have both structured representations,
so you can draw meaningful conclusions from small amounts of
data, but also the ability to grow in complexity and accommodate
new information as you grow.
Jack: How is the data we are exposed to as we grow up transformed
into internal representations that we can use to solve problems—do we
know the fundamentals of how to create a curriculum for AI, for
example?
Tom: One way to study these things is by looking at human develop‐
ment. What comes up in Bayesian terms is Bayes rule, which has a
principle of what we call “conservation of learning” built in. This
principle states that you’ve only got one unit of prior probability to
spread over your hypotheses, and so by increasing that probability
for some hypotheses you are decreasing the probability that you
apply to other hypotheses. Your prior experiences influence how you
learn about the world, and we’ve done experiments that show this is
the case through kids and adults. There are some things kids learn a
lot faster than adults, because their prior distribution hasn’t been
shaped to establish a strong distribution, and there are other things
adults can learn faster than children because of their experience.
One of the examples we looked at is causal learning. We’ve found
that adults have a strong expectation that if you have multiple causes
influencing a phenomenon, then those causes act independently. In
one experiment, we showed people a machine and told them to
figure out what makes the machine go and that they had to put multi‐
ple objects on the machine, and then figure out what you have to do
to operate a machine like that. Adults assume that some of the
objects cause the machine to go and you just have to figure out
which objects do this, and they can learn that relatively quickly. But
if it’s a situation where you need two objects to
work together in concert, then it takes adults longer than kids to fig‐
ure that out. That’s because the need for multiple objects violates the
adults’ expectations. Kids show the reverse pattern: they’re a bit
slower to learn the independent cause version and a bit faster to
learn the one where the objects interact.
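A rough way to see how that pattern could fall out of Bayesian learning is sketched below; the blocks, trials, noise level, and priors are all invented for illustration and are not the model used in the actual experiments. The same three observations favor the conjunctive rule, but a prior that strongly expects independent single causes leaves the learner less confident in it.

```python
# Each trial records which blocks were on the machine and whether it activated.
# These observations are consistent with a conjunctive rule: A and B together.
trials = [({"A"}, False), ({"B"}, False), ({"A", "B"}, True)]

def likelihood(rule, observations, noise=0.05):
    """P(observations | rule), where rule(blocks) says whether the machine should go."""
    p = 1.0
    for blocks, went in observations:
        predicted = rule(blocks)
        p *= (1 - noise) if predicted == went else noise
    return p

hypotheses = {
    "A alone makes it go":      lambda blocks: "A" in blocks,
    "B alone makes it go":      lambda blocks: "B" in blocks,
    "A and B must be together": lambda blocks: {"A", "B"} <= blocks,
}

def posterior(prior):
    """Bayes' rule over the three candidate causal structures."""
    scores = {name: prior[name] * likelihood(rule, trials)
              for name, rule in hypotheses.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}

# Adult-like prior: independent single causes are far more plausible a priori.
adult = posterior({"A alone makes it go": 0.45,
                   "B alone makes it go": 0.45,
                   "A and B must be together": 0.10})
# Child-like prior: the conjunctive structure is not penalized in advance.
child = posterior({"A alone makes it go": 0.34,
                   "B alone makes it go": 0.33,
                   "A and B must be together": 0.33})

print("adult:", adult)   # less posterior mass on the conjunctive rule ...
print("child:", child)   # ... than the child assigns, given the same evidence
```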
There are a few different hypotheses we’re trying to tease apart. One
is that the kids have different prior distributions, another is more of
an algorithmic approach where they are just jumping around
hypotheses more rapidly. So, in terms of thinking about this and
relating this back to machine learning, I think it is clear that the pri‐
ors we have are shaped by experience, and the question is how do
you give learning systems the right kinds of experience and the right
kinds of learning algorithms so they are able to transfer that knowl‐
edge and generalize. Those are open challenges in ML.
Jack: Do we know what tools people use to attack problems they
encounter in their lives? For instance, do we have a toolbox containing
Bayesian inference and other approaches, and, if so, what are they?
Tom: Bayesian inference gives us an overarching mathematical
theory that says exactly what you should do but nothing about how
you should do it. One question that comes up, then, is how can peo‐
ple possibly be doing anything like Bayesian inference, given that
they’re operating under significant cognitive constraints? And why
is it that people seem to systematically deviate from Bayesian infer‐
ence in well-documented cases in judgment and decision making?
Answering those questions requires moving to a different level of
analysis, so rather than asking these abstract computational ques‐
tions about what the problems are and what the solutions are, we
ask questions about what representations and algorithms are used in
solving them.
The strategy we’ve developed is instead of just asking what cognitive
processes people can be using, we instead ask what’s the best algo‐
rithm for solving this problem, given that you have to navigate a
tradeoff between the errors that you make and the computational

28 | Tom Griffiths: Why People Have Surprisingly Intelligent Intuitions, and What AI Researchers Can Learn from That
cost and the amount of work you have to do? We use the computer
scientist’s perspective of how to think about algorithms and apply
that at the level of cognitive processes. Doing that gives you a much
more realistic view of what rational behavior is like—what Stuart
Russell calls bounded optimality, which is where you use the best
algorithm you can to solve a problem by making this tradeoff
between error and computational cost.
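A minimal sketch of that bounded-optimality calculation (the strategies and numbers here are made up; this is not code from the research described): choose, among a few candidate strategies, the one that minimizes expected error plus a price per unit of computation.

```python
def bounded_optimal_choice(strategies, cost_per_step):
    """Pick the strategy that minimizes expected error plus computation cost.

    Each strategy is a (name, expected_error, expected_steps) tuple.
    """
    def total_cost(strategy):
        _, error, steps = strategy
        return error + cost_per_step * steps
    return min(strategies, key=total_cost)

strategies = [
    ("exhaustive search", 0.01, 10_000),  # nearly exact, very slow
    ("heuristic",         0.10, 50),      # rough but cheap
    ("random guess",      0.50, 1),       # almost free, usually wrong
]

# When thinking is cheap, exhaustive search wins; when each step of thinking
# is expensive, the heuristic -- an apparently "irrational" shortcut -- becomes
# the rational, bounded-optimal choice.
for cost in (0.000001, 0.001):
    name, _, _ = bounded_optimal_choice(strategies, cost)
    print(f"cost per step = {cost}: choose {name}")
```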
One question we can ask is whether the sorts of algorithms people
seem to use actually are bounded optimal. We've found that some
examples of irrational behavior can be interpreted as quite sensible
things to do given that you’re navigating this tradeoff. Then, the
other question you can ask is how do we end up finding pretty good
strategies and what are these pretty good strategies we end up dis‐
covering and how do we get there? This connects with an area in the AI
literature called rational meta-reasoning. The question is how
should a rational agent reason about how to reason. We can use
some of the ideas from that rational meta-reasoning literature to
theorize about how people might end up choosing between different
strategies and decisions that they might follow.
People seem to develop internal models of how long it’s going to
take to execute an algorithm and they can use these internal models
to reason about what algorithm to execute in a situation. A fun
example of this is we actually trained people to use different sorting
algorithms. We trained them to do a version of a bubble-sort and
merge-sort algorithm, and we asked them to do this on a few
sequences, and so they got a sense of the tradeoffs involved in those
algorithms. Bubble sort is pretty slow in general, and its running time grows with the square of the length of the list, but it can be efficient if the list is short, whereas merge sort scales better with longer lists but doesn't benefit from the amount of order that already appears in the list.
People were able to learn that, and when we gave them new sequen‐
ces and asked them to figure out which algorithm to use to sort
which sequences, they made decisions in ways that were consistent
with having good estimates of how long it would take to solve that
problem. It was like they'd built an internal model of the consequences of executing those algorithms and were consulting that internal model; in fact, we showed that they actually performed better than adaptive sorting algorithms that computer scientists had developed.
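The kind of meta-level decision described here can be sketched as follows (an illustrative model only, not the procedure used in the experiment): estimate each algorithm's cost from the list's length and how out of order it already is, then dispatch to whichever sorter the internal cost model predicts will be cheaper.

```python
import math

def inversions(seq):
    """Count out-of-order pairs: a crude measure of how unsorted a list is."""
    return sum(1 for i in range(len(seq))
                 for j in range(i + 1, len(seq)) if seq[i] > seq[j])

def predicted_bubble_cost(seq):
    # Bubble sort benefits from order already in the list: its work grows
    # with the number of inversions, up to roughly n^2 in the worst case.
    return len(seq) + inversions(seq)

def predicted_merge_cost(seq):
    # Merge sort does about n * log2(n) work regardless of existing order.
    n = len(seq)
    return n * math.log2(n) if n > 1 else 1.0

def choose_sorter(seq):
    """Meta-level decision: consult the internal cost model, pick a sorter."""
    if predicted_bubble_cost(seq) <= predicted_merge_cost(seq):
        return "bubble sort"
    return "merge sort"

print(choose_sorter([1, 2, 3, 5, 4, 6, 7, 8]))   # nearly sorted -> bubble sort
print(choose_sorter([9, 3, 7, 1, 8, 2, 6, 0]))   # scrambled     -> merge sort
```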
That gives you a framework for thinking about what we’re doing
when we’re navigating this space and trying to construct the cogni‐

Tom Griffiths: Why People Have Surprisingly Intelligent Intuitions, and What AI Researchers Can Learn from That | 29
tive tools that would allow us to solve particular problems. What
we’re doing is developing a sense of what costs are involved in exe‐
cuting an algorithm, and then we figure out how to use that to make
an intelligent decision.
What remains in the domain of deep mysteries is that we have a
good theory for how people choose between different algorithms,
but it’s still a really hard problem to ask how you can construct algo‐
rithms in the first place. How do you take that same perspective and
use it to build efficient algorithms? Our ability to write computer
programs, to me, is a mysterious cognitive ability, because in the worst case that problem is undecidable, and yet we can do a remarkably good job of working out an efficient procedure for solving a problem.
Jack: When you study AI, what parts do you see and find yourself
thinking, “ah, based on my knowledge of psychology and the brain,
that seems like a useful component?”
Tom: On the cognitive science to machine learning direction, I
think deep reinforcement learning and convolutional neural net‐
works have been developed to the point that we’re very ripe to bene‐
fit from the insights of human cognitive development. AI people are
starting to think about approaching their problems in terms of how
do you represent objects, and how do you represent agents, and how
do you represent causal relationships, and how do you represent
physical reasoning, and so on. Those are exactly the questions devel‐
opmental psychologists have tackled for the last 30 years. Also, peo‐
ple are starting to integrate things like memory, attention, and
different speed learning systems into AI. These are all things that,
again, are topics people have looked at in cognitive science, and,
again, I think there’s an opportunity to be informed by cognitive sci‐
ence there. I think there’s a real chance for a back and forth; sort of
seeing what works in AI can potentially give us new insights about human systems, and we can also take what we know about human systems and see how it transfers over to AI.

About the Author
Jack Clark is a writer and communicator focused on artificial intelli‐
gence. He works at OpenAI, and previously covered AI for Bloom‐
berg and BusinessWeek, and distributed systems for The Register. He
writes a weekly newsletter on developments in AI, called Import AI.
