Can Winograd Schemas Replace Turing Test For Defining Human-Level AI?

http://spectrum.ieee.org/automaton/robotics/arti...
By Evan Ackerman
Posted 29 Jul 2014 | 16:50 GMT
Earlier this year, a chatbot called Eugene Goostman "beat" a Turing
Test for artificial intelligence (http://spectrum.ieee.org/techtalk/robotics/artificial-intelligence/virtual-tween-passes-turing-test) as
part of a contest organized by a U.K. university. Almost immediately, it
became obvious that rather than proving that a piece of software had
achieved human-level intelligence, all that this particular competition
had shown was that a piece of software had gotten fairly adept at
fooling humans into thinking that they were talking to another human,
which is very different from a measure of the ability to "think." (In fact,
some observers didn't think the bot was very clever at all
(http://www.scottaaronson.com/blog/?p=1858).)
Clearly, a better test is needed, and we may have one, in the form of a
type of question called a Winograd schema that's easy for a human to
answer, but a serious challenge for a computer.
The problem with the Turing Test is that it's not really a test of whether
an artificial intelligence program is capable of thinking: it's a test of
whether an AI program can fool a human. And humans are really, really
dumb. We fall for all kinds of tricks that a well-programmed AI can use
to convince us that we're talking to a real person who can think.
Illustration: Getty Images
For example, the Eugene Goostman chatbot pretends to be a 13-year-old boy, because 13-year-old boys are often erratic idiots (I've
been one), and that will excuse many circumstances in which the AI simply fails (http://www.scottaaronson.com/blog/?p=1858). So really, the chatbot
is not intelligent at all; it's just really good at making you overlook the times when it's stupid, while emphasizing the periodic interactions when its
algorithm knows how to answer the questions that you ask it.
Conceptually, the Turing Test is still valid, but we need a better practical process for testing artificial intelligence. A new AI contest, sponsored by
Nuance Communications and CommonsenseReasoning.org, is offering a US $25,000 prize to an AI that can successfully answer what are called
Winograd schemas, named (http://cs.nyu.edu/davise/papers/WSKR2012.pdf) after Terry Winograd (http://hci.stanford.edu/winograd/), a professor of
computer science at Stanford University.
Here's an example of one:
The trophy doesn't fit in the brown suitcase because it is too big. What is too big?
The trophy, obviously. But it's not obvious. It's obvious to us, because we know all about trophies and suitcases. We don't even have to "think" about it;
it's almost intuitive. But for a computer program, it's unclear what the "it" refers to. To be successful at answering a question like this, an artificial
intelligence must have some background knowledge and the ability to reason.
Here's another one:
Jim comforted Kevin because he was so upset. Who was upset?
These are the rules the Winograd schemas have to follow:
1. Two parties are mentioned in a sentence by noun phrases. They can be two males, two females, two inanimate objects or two groups of
people or objects.
2. A pronoun or possessive adjective is used in the sentence in reference to one of the parties, but is also of the right sort for the second
party. In the case of males, it is he/him/his; for females, it is she/her/her; for inanimate objects it is it/it/its; and for groups it is
they/them/their.
3. The question involves determining the referent of the pronoun or possessive adjective. Answer 0 is always the first party mentioned in
the sentence (but repeated from the sentence for clarity), and Answer 1 is the second party.
4. There is a word (called the special word) that appears in the sentence and possibly the question. When it is replaced by another word
(called the alternate word), everything still makes perfect sense, but the answer changes.
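
To make the four rules concrete, here is a minimal sketch of how a single schema could be represented in Python. The class name, field names, and the "{word}" placeholder are hypothetical choices for illustration, not part of the contest's format. The alternate word "small" is also an assumption: the trophy example above only gives the special word "big."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WinogradSchema:
    """Hypothetical container for one schema (names are illustrative)."""
    template: str        # sentence with "{word}" where the special word goes
    question: str        # question; may also contain "{word}" (rule 4)
    parties: tuple       # (first party, second party) = (Answer 0, Answer 1)
    pronoun: str         # ambiguous pronoun that fits both parties (rule 2)
    special: str         # the special word
    alternate: str       # the alternate word, which flips the answer
    answer_special: int  # correct answer index (0 or 1) with the special word

    def variants(self):
        """Yield (sentence, question, correct answer index) for both words."""
        pairs = (
            (self.special, self.answer_special),
            # Swapping in the alternate word changes the answer (rule 4).
            (self.alternate, 1 - self.answer_special),
        )
        for word, answer in pairs:
            yield (
                self.template.format(word=word),
                self.question.format(word=word),
                answer,
            )


# The trophy example from the article; "small" is an assumed alternate word.
trophy = WinogradSchema(
    template="The trophy doesn't fit in the brown suitcase because it is too {word}.",
    question="What is too {word}?",
    parties=("the trophy", "the brown suitcase"),
    pronoun="it",
    special="big",
    alternate="small",
    answer_special=0,
)

for sentence, question, answer in trophy.variants():
    print(sentence, question, "->", trophy.parties[answer])
```

Storing both variants in one record reflects the point of rule 4: the two sentences differ by a single word, so a program cannot pick the right answer from word order or sentence structure alone.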
For more details (including some examples of ways in which certain Winograd schemas can include clues that an AI could exploit), this paper
(http://www.aaai.org/ocs/index.php/KR/KR12/paper/view/4492/4924) is easy to understand and well worth reading. In fact, it's so well worth reading
that I'm going to steal their conclusion and post it here:
Like Turing, we believe that getting the behaviour right is the primary concern in developing an artificially intelligent system. We further
agree that English comprehension in the broadest sense is an excellent indicator of intelligent behaviour. Where we have a slight
disagreement with Turing is whether a free-form conversation in English is the right vehicle. Our WS [Winograd schemas] challenge does
not allow a subject to hide behind a smokescreen of verbal tricks, playfulness, or canned responses. Assuming a subject is willing to take
a WS test at all, much will be learned quite unambiguously about the subject in a few minutes. What we have proposed here is certainly
less demanding than an intelligent conversation about sonnets (say), as imagined by Turing; it does, however, offer a test challenge that is
less subject to abuse.
It's worth pointing out that we're a bit skeptical that you can really "test" for human-level AI in this manner. With a highly structured test with specific
questions and answers that are unambiguously right or wrong, there's a lot of potential for a clever (but not thinking) AI to find ways to exploit it.
The question, then, becomes whether "intelligence" is simply a technological system that is sufficiently complex to correctly answer a series of
questions that a slightly more complex biological system (us) has arbitrarily decided constitute a measurement of what thinking requires.
It seems inevitable that at some point, we'll have to say that true intelligence is feeling as well as thinking, and "Blade Runner" is way ahead of us:
[ Winograd Schema Challenge (http://commonsensereasoning.org/winograd.html) ] via [ BusinessWire (http://www.businesswire.com/news/home/20140728005207/en/Nuance-Announces-Winograd-Schema-Challenge-Advance-Artificial) ]