The Lexicon-Syntax Interface in Second Language Acquisition

Language Acquisition & Language Disorders


Volumes in this series provide a forum for research contributing to theories of
language acquisition (first and second, child and adult), language learnability,
language attrition and language disorders.

Series Editors

Harald Clahsen
University of Essex

Lydia White
McGill University

Editorial Board

Melissa F. Bowerman
Max Planck Institut für Psycholinguistik, Nijmegen

Luigi Rizzi
University of Siena

Katherine Demuth
Brown University

Bonnie D. Schwartz
University of Hawaii at Manoa

Wolfgang U. Dressler
Universität Wien

Antonella Sorace
University of Edinburgh

Nina Hyams
University of California at Los Angeles

Karin Stromswold
Rutgers University

Jürgen M. Meisel
Universität Hamburg

Jürgen Weissenborn
Universität Potsdam

William O'Grady
University of Hawaii

Frank Wijnen
Utrecht University

Mabel Rice
University of Kansas

Volume 30
The Lexicon-Syntax Interface in Second Language Acquisition
Edited by Roeland van Hout, Aafke Hulk, Folkert Kuiken and Richard Towell

The Lexicon-Syntax Interface in Second Language Acquisition

Edited by

Roeland van Hout
University of Nijmegen

Aafke Hulk
University of Amsterdam

Folkert Kuiken
University of Amsterdam

Richard Towell
University of Salford

John Benjamins Publishing Company
Amsterdam/Philadelphia

The paper used in this publication meets the minimum requirements
of American National Standard for Information Sciences - Permanence
of Paper for Printed Library Materials, ANSI Z39.48-1984.

Library of Congress Cataloging-in-Publication Data

The lexicon-syntax interface in second language acquisition / edited by Roeland
van Hout, Aafke Hulk, Folkert Kuiken and Richard Towell.
p. cm. (Language Acquisition and Language Disorders, ISSN
0925-0123 ; v. 30)
Includes bibliographical references and index.
1. Second language acquisition. 2. Grammar, Comparative and
general--Syntax. 3. Lexicology. I. Title: Lexicon-syntax interface in 2nd language
acquisition. II. Hout, Roeland van. III. Series.
P118.2.L49 2003
418-dc21
2003051906
ISBN 90 272 2499 4 (Eur.) / 1 58811 418 X (US) (Hb; alk. paper)

© 2003 John Benjamins B.V.
No part of this book may be reproduced in any form, by print, photoprint, microfilm, or
any other means, without written permission from the publisher.
John Benjamins Publishing Co., P.O. Box 36224, 1020 ME Amsterdam, The Netherlands
John Benjamins North America, P.O. Box 27519, Philadelphia PA 19118-0519, USA

Table of contents

Acknowledgments

1. Introduction: Second language acquisition research in search of an interface
Richard Towell

2. Locating the source of defective past tense marking in advanced L2 English speakers
Roger Hawkins and Sarah Liszka

3. Perfect projections
Norbert Corver

4. L1 features in the L2 output
Ineke van de Craats

5. Measures of competent gradience
Nigel Duffield

6. Lexical storage and retrieval in bilinguals
Ton Dijkstra

7. Inducing abstract linguistic representations: Human and connectionist learning of noun classes
John N. Williams

8. Neural substrates of representation and processing of a second language
Laura Sabourin and Marco Haverkort

9. Neural basis of lexicon and grammar in L2 acquisition: The convergence hypothesis
David W. Green

10. The interface: Concluding remarks
Roeland van Hout, Aafke Hulk and Folkert Kuiken

Name index

Subject index

Acknowledgments

This volume contains a selection of papers presented at the NWCL/LOT Expert
Seminar on 'The interface between syntax and the lexicon in second language
acquisition', held in Amsterdam on March 30-31, 2001. The seminar was
organized by the editors of this volume. We want to thank the participants of
the seminar for reviewing the papers submitted for this volume. We are very
grateful to the following institutions for financial support: the North West
Centre for Linguistics (NWCL), the Landelijke Onderzoekschool Taalwetenschap
(LOT; Netherlands Graduate School of Linguistics) and, from the
University of Amsterdam, the Amsterdam Center for Language and
Communication (ACLC), the Chair of Linguistics of the Romance Languages and
the Chair of Second Language Acquisition.

February 2003,
The editors

Chapter 1

Introduction
Second language acquisition research
in search of an interface
Richard Towell
University of Salford

1. Introduction

If it is to attain its eventual goal, second language acquisition research has to
integrate the totality of second language acquisition processes. These must
include the learning of the core syntax of a second language, the learning of the
lexical items and the determination of the role of the cognitive mechanisms which are
necessary for the use of linguistic forms in comprehension and production.
It has been accepted for a long time that these three domains involve
different kinds of learning: syntax is learnt through a process of implementing
a particular set of universal structures (Chomsky 1986, White 1989); lexis is
learnt by establishing a set of arbitrary associations which operate in a given
society (Waxman 1996); comprehension and production are reliant on general
cognitive procedures (Harley 2001). The learning of syntax is often characterised as a process of triggering (Sakas and J. D. Fodor 2001); the learning of lexis
is characterised by the building up of associations (or connections) (Schreuder
and Weltens 1993); comprehension and production are learnt by establishing
and practising the required procedures (Pinker 1997).
However, these three systems must come together in the creation of a whole
linguistic capacity in the mind of an individual. The syntax will govern the
structure of the grammar but the lexical items will govern how the structure is
implemented. The linguistic knowledge which results from the interaction of these
two systems can only develop and then find expression through the cognitive
mechanisms associated with language comprehension and production.
The researchers who attempt to provide accounts of the processes and
outcomes of second language acquisition (SLA) are generally all too aware,
therefore, that they have set themselves an ambitious, interdisciplinary task.
Ideally, as a group, they wish to account for all aspects of second language
acquisition from the phonetic to the intercultural (see Mitchell and Myles 1998).
In particular, they set themselves the task of explaining those factors which have
long been recognised as specific to the acquisition of a second or foreign
language as opposed to the mother tongue. These are usually identified as
transfer or cross-linguistic influence, evidence of a specifiable route of
acquisition regardless of first language background, variability (also known as
optionality) in the language of individual learners, and incompleteness or fossilisation
in the final state of the majority of acquirers (Towell and Hawkins 1994).
Clearly no one researcher could ever hope to deal with all aspects. At
different times, the nature of the activities which are being described has led to
the involvement of scholars from disciplines ranging from acoustics to anthropology. The central disciplines involved have, however, always been linguistics
and psychology. For what now seems a brief period in the 1950s, linguistics and
psychology came together to provide what was then thought of as a complete
description of what language was and how it was learned: a powerful combination of structuralist linguistics and behaviourist psychology (Gass and Selinker
2001). Unfortunately, this period is only talked about in today's classes on
second language acquisition in order to show how misguided both of these
initiatives were, forgetting rather that, despite the radical shifts of views which
have followed, these efforts laid the foundations of the disciplines within which
we all situate our research.
It is probably true to say that linguistics and psychology began to follow
different routes after the devastating criticisms of Skinner's (1957) Verbal
Behaviour put forward by Chomsky (1959), although many psychologists still
continued to attempt to interpret transformational theory in psychological
terms. These attempts foundered as the derivational theory of complexity (the
belief that the more complex transformations were, the longer they would take
to process) was denied by linguists. Linguists pointed out that, although terms
like 'least effort' and 'economy' were essential to their endeavours, they were
not defined with regard to processing effort but in relation to linguistic
simplicity: 'Chomsky's economy principles are unambiguously matters of competence,
in that they pertain to representations and derivations internal to the language
faculty and exclude relations beyond the interfaces' (Smith 1999: 114). Maintaining
this position, linguists have gone on to develop their discipline significantly
but within the boundaries which they have seen as necessary. They have therefore
done so with little reference to any insights from psychology. Mainstream
generative linguists have focused on syntax. That mainstream focus provides the
background for the articles by Hawkins and Liszka, Corver, Van de Craats and
Duffield in this volume.
However, during the same period, the study of language within psychology
also made rapid strides in exploring many aspects of psycholinguistics (see
Harley 2001). One of these has concentrated on the lexicon, as is demonstrated
in the chapters by Dijkstra and Williams in this volume; others have explored
issues of how language may be stored in the brain and have made use of
imaging techniques to enable us to begin to relate our theoretical analyses to
physical realities (Perani et al. 1996, Dewaele 2002). These are represented in
this volume by the chapters by Sabourin and Haverkort and by Green. There
has been the occasional fruitful interchange in some areas of the discipline from
time to time but there has been no real examination of how the two disciplines
have evolved with regard to SLA and whether there are more global reasons for
looking towards collaboration. There are now signs that both groups of
researchers are coming to an understanding that their particular view of the
world may not suffice to account for the overall process and that each will have
to understand more about what the other knows.
Whilst this book does not pretend to complete that task, it will seek to
present current examples of the way linguists think about second language
acquisition and of the way researchers working within a more psychological
frame of reference think about the same subject in such a way as to show how
there is a degree of complementarity in the work being done, even if, at the
highest levels of argument, we are unlikely to see a swift return to the unity of
view of the 1950s (cf. Smith 1999: 174).
The belief that the time is right to seek such complementarity is encouraged
by two fairly recent developments. Within linguistics, the advent of the minimalist theory and its consequences for SLA has caused researchers to look again
at the relative roles of syntax and lexis. Under the minimalist view which, as
Corver demonstrates in chapter three, applies as much to interlanguages as to
any other natural languages, syntax is thought to be universal. It is constituted
of invariant principles with options restricted to functional elements and
general properties of the lexicon (Chomsky 1995: 170). The invariant nature of
the syntax is possible because the functional elements are now seen as part of
the lexicon: 'It is clear that the lexicon contains substantive elements (nouns,
verbs) And it is reasonably clear that it contains some functional
categories' (Chomsky 1995: 240). More recently, SLA specialists (Hawkins 2001: 345,
Herschensohn 2000: 80) have stated rather more firmly that, under minimalism,
functional categories should be seen as part of the lexicon. We will argue below
that this change of emphasis (Smith (1999) argues cogently that minimalism
is a natural evolution of the generativist enterprise rather than a revolution)
may well modify the way in which linguists have to think about development in
SLA and indeed about the other central features of SLA research outlined above.
Within psychology, there has been a welcome renewal of interest in how
second languages are acquired. Since the mid 1980s, we have seen attempts to
account for second language learning drawing on a rich vein of inductive
research using computer modelling (Rumelhart and McClelland 1986). More
recently, a variety of non-intrusive ways of providing physical evidence of brain
processes have become available (Perani 1999). Both linguists and psychologists
have become interested in how knowledge is stored in the mind and how it is
retrieved from storage. Arguments have been put forward based on a distinction between declarative and procedural memory systems (Towell and Hawkins
1994, Ullman 2001) some of which suggest radical differences in the way first
and second languages are acquired and stored in the mind. Most recently,
arguments have been put forward to suggest that usage-based analyses can
account for all linguistic units: 'Psycholinguistic and cognitive linguistic
theories of language acquisition hold that all linguistic units are abstracted from
language use. In these usage-based perspectives, the acquisition of grammar is
the piecemeal learning of many thousands of constructions and the
frequency-biased abstraction of regularities within them' (Ellis 2002: 144). All of this adds
to the view that a full account of second language acquisition will require
complementary input from both disciplines.
One of the main keys must lie in how we see the concept of development
and the psychological mechanisms which underlie development. The efforts of
syntacticians focus on describing the syntactic structure which lies behind the
interlanguage of the learner. This has always made it difficult for them to
account for development (see Gregg 1996): the placing of functional categories
within the lexicon makes this difficulty more acute. The invariant syntactic
knowledge which learners have is a template present in the mind of the learner
which can be modified by the information inserted within it. There cannot be
a driving force for development in the syntax. It follows therefore that the
driving force really comes from the lexis. However, up to now the learning of
lexis has been thought of mainly in terms of one or other forms of associationist
learning theory, with connectionism being the most powerful. Theorists
pursuing this model have tended to argue that connectionist learning can
account for the totality of language learning, including the learning of syntax
(see quotation from Ellis (2002) above). But syntacticians cannot accept that
the sophisticated structures which they observe and which provide no visible
clues on the surface structure of the language can be learnt in this empirical
fashion. They claim instead that innate knowledge (mediated or not by the L1)
must be guiding the learning (Hawkins 2001).
It is not clear that this argument can be resolved by theoretical debates
between nativists and non-nativists. We need to examine in detail the
evidence of how learners acquire a second language. This evidence must come
from a variety of sources using a variety of techniques. It will take us into
questions of what it is that learners acquire, how they acquire it and how that
process modifies their linguistic capacity in both knowledge and use. Hawkins,
Corver and Van de Craats in this volume present clear empirical accounts of
how specific features of syntactic and lexical knowledge play a fundamental role
in second language development. Their accounts cannot, however, tell us
everything we need to know about the mental processes involved. Duffield asks
fundamental questions about the nature of the competence which is acquired.
Dijkstra provides an account of the way in which bilinguals store and access
their knowledge. Williams examines the way in which linguistic knowledge may be
built up on the basis of distributional evidence. Sabourin and Haverkort and then
Green look at the way in which the knowledge may be stored and used. In this way
we can see that a full account of the acquisition of a second language involves
the three systems outlined at the beginning of this chapter. We will argue that
none of the current arguments will suffice alone to account for the total process
and that it is essential to attempt to integrate the sources of knowledge available
to us (see Jackendoff (2002) for a similarly motivated position).
In this chapter we shall seek first to outline the research context from the
generativist point of view, initially in general, and then specifically with regard
to the contribution of minimalism. We will then examine in more detail the
contribution of psychological research and seek to show how a degree of
complementarity may well exist, if there is the will to look for it. In this way, we
are seeking to provide a context for the more detailed studies which figure in
the rest of the volume by means of which the value of their contribution can be
seen against the background of the evolving discipline of SLA research.

2. The Linguistics dimension: The generativist research paradigm

In outlining the contribution of generativist research, we will first briefly review
the nature of this research paradigm and the reasons why it has become so
central to SLA research.

2.1 The generativist position

Noam Chomsky (1986: 3) poses three basic questions which linguists need to be
able to answer:
1. What constitutes knowledge of language?
2. How is language acquired?
3. How is knowledge of language put to use?
He and his followers have in effect concentrated on the first question. It is
essential to, but separable from, the understanding of the other two.
Indeed, Chomsky argues that in order to obtain a proper answer to this
question it is necessary to idealise the data to be examined away from issues of
performance so that the researcher can gain insight into the abstract knowledge
which the native speaker of a language possesses, i.e. that person's linguistic
competence. Furthermore, the generativist position adopts a modular view of
the mind in which the child possesses an innate language faculty. This is
conceived of as separate from those parts of the mind which are devoted to
general cognitive skills associated with the processing of information (perception, comprehension, production) and memory. It is argued that, because
linguistic structure is universal and is not signalled overtly on the surface of any
of the languages of which it is a manifestation, it is not possible for a child to
acquire knowledge of language (in the sense of syntactic competence) on the
basis of exposure to surface cues alone. This is frequently referred to as the
logical problem of language acquisition. Surface cues are necessary to provide
an indication for the child as to which of the limited number of possible
languages he or she is confronted with, but they could never be sufficient to
provide knowledge of the kind of organisation which is present in the syntactic
structure of language. As each child acquires this knowledge with no conscious
effort, no explicit instruction, following a regular pattern of acquisition not
reflected in the data to which the child is exposed and without making the
mistakes which piecemeal learning would imply, generativists conclude that
linguistic knowledge is an innate biological endowment for humankind. This
endowment is what enables the child to know more than the surface of the
language reveals: the surface forms act only as a trigger for the underlying
knowledge which the child already possesses.
It is important to highlight four significant aspects of this theoretical
position as these will give rise to further comment below and will be dealt with
in subsequent chapters. First, a generativist approach involves idealisation of
the data to be examined. For the study of adult competence in the mother
tongue the researcher can frequently consult his or her own linguistic knowledge as a representative sample of the idealised speech community. This is not
possible in second language research. SLA research requires data gathering
methods which can isolate linguistic competence from performance factors.
Second, the primacy of syntax within the generative paradigm has led to a
separation between syntax and semantics. This is not without its problems as
more and more researchers are finding that semantic factors influence syntactic
phenomena (Juffs 1998, 2001). Duffield in this volume is concerned with how
competence can be successfully defined within this paradigm and proposes that
it is necessary to conceive of competence at two levels, one of which is more
related to the surface structures of language. Third, the conception of the
acquisition of syntactic knowledge through a process of triggering has given rise
to debate (Lightfoot 1993, Carroll 2000, Sakas and J. D. Fodor 2001). Hawkins,
Corver and Van de Craats, through an examination of the acquisition of
syntactic and lexical features, define more clearly the nature of the features
which have to be learnt and discuss the role of the L1 in providing the initial
knowledge. This might provide a more satisfactory conceptual basis at least with
regard to the initial state. It does, however, leave open the question of how the
learners use empirical evidence to move to subsequent states. Fourth, the
generativist position for SLA has to adapt to the fact that a second language
acquirer has already learnt one language. The issue of how learners transfer or
access universal knowledge in cases where one language has already been learnt
is one which may have to be looked at again within the minimalist paradigm.
This is an issue for Hawkins and Liszka, for Corver, and for Van de Craats.

2.2 Generativist second language research

The particular strength of this approach has been in providing syntactic
analyses within a theoretical framework. This enabled SLA researchers to
predict in a precise way what learners needed to acquire in order to develop
their interlanguage system. During the 1990s, this was manifested mainly
through analyses presented within the conceptual basis provided by parameter
(re-)setting (Flynn 1987). In this framework, languages could be compared on
the basis of a single underlying syntactic phenomenon which was independently
theoretically motivated by Universal Grammar (UG). This underlying syntactic
phenomenon would have several surface expressions not linked together by
other linguistic theories. Clear statements could then be made about what
learners needed to do in order to re-set their existing parameters to the setting
needed for their second language. Clear predictions could also be made about
what the learner language would look like if the re-setting took place and what
it would look like if it did not. To give two examples: the pro-drop parameter
contrasted the presence or absence in the learner's interlanguage grammar of
such diverse phenomena as null subjects, expletive 'it' and 'there', the
permissibility of subject-verb inversion and the possibility of extracting a wh-subject
across an overt complementiser (that-trace filter) in languages such as English
compared with Spanish or Italian. The verb raising parameter linked differences
in adverb placement, negation and the use of quantifiers in French and English.
The hypothesis in both cases was that exposure to the second language would
enable learners to re-set the parameter through triggering and that researchers
would be able to measure the differences in the learners' linguistic competence
at different points in time. Thus, if a native speaker of English acquiring
Spanish triggered the pro-drop parameter to the Spanish setting, that person
would immediately acquire knowledge of all the elements linked in the
parameter. If a native speaker of French learning English re-set the verb-raising
parameter, issues to do with adverb placement, the position of negatives and of
floated quantifiers would be solved at the same time. Whilst this research
provided a very positive move forward in SLA work, the empirical evidence
frequently did not bear out the view that parameter re-setting was the essential
process of SLA learning of syntax. Many studies (Hawkins, Towell and Bazergui
1993, White 1991) showed what might be called partial parameter re-setting, in
the sense that some of the elements identified were learnt together but others
were not. Whilst learners did not produce 'wild grammars', i.e. grammars which
fell outside the constraints of linguistic theory, they could not be seen simply to
re-set a parameter. This called into question either the nature of the underlying
linguistic definition of the parameter or the process involved. There was also
some uncertainty about what might constitute a trigger: would learning any
form which was a surface manifestation of one element of the parameter trigger
the other forms or was one form particularly privileged to act as a trigger?

2.2.1 The minimalist perspective

There are several important differences between the principles and parameters
(P and P) model of syntax and the minimalist version. The most important one
is the claim that the syntax is invariant and that the morpholexical system is the
source of all variation. This has important implications for second language
researchers. Herschensohn (1999) argues convincingly that minimalism should
be better able to account for all of the main features of second language
acquisition as defined above. It will be better at dealing with the area where it
has always been most successful, viz. transfer, but it should also be in a position
to give a better account of the route of learning, incompleteness and variability
(optionality). The problem with the P and P model was that it was all or
nothing: either the parameter had been re-set and all features fell into place or
it had not and they did not. As pointed out above, investigations based on this
theory tended to find that partial re-setting took place, but the theory itself
could not account for partiality given that the re-setting process was one of
switch-flipping. The notion contained within minimalism that acquisition
proceeds more through the gradual building of L2 grammar through the
control of morpholexical constructions (Herschensohn 1999: 81) allows for
learners to be aware of the need to apply certain features or categories in some
circumstances but not others.
Hawkins (2001) and Hawkins and Liszka in this volume would probably
agree that the minimalist approach opens the door to a more satisfactory
account of variability (optionality) and incompleteness, but they base their
arguments more on the presence or absence of features in the functional
categories of the L2 than on the build up of constructions. Hawkins and Liszka
also set out a specic view on the relationship between the L1 and the L2. They
show that Chinese learners learning English do not mark tense consistently.
Having investigated and rejected a range of alternative proposals, Hawkins and
Liszka argue that this is likely to be because Chinese learners are unable to
establish that the functional category English T is specified for +/- past. They
suggest, furthermore, that this feature is not available to the learners because it
does not exist in their first language. They claim that where parametrised
syntactic features are not present in a speaker's L1, they will not be accessible in
later L2 acquisition. Such a point of view, if substantiated, would argue against
full transfer from the L1 and would support the partial access hypothesis. In
the presentation of the article, Hawkins and Liszka contrast their account with
that of Lardiere, who adopts a full access point of view. Lardiere's explanation is
that the evidence suggests a failure of mapping from one component within the
language faculty to another. Hawkins and Liszka feel that the evidence from
Chinese learners argues more in favour of a partial transfer perspective.
If Hawkins and Liszka are correct in their analysis, there are considerable
consequences for acquisition in a more general sense. If L2 learners really
cannot provide a feature based analysis for this part of their interlanguage
syntax, they will have to learn the required forms (and store them) in another
way. Whilst the discussion provided by Hawkins and Liszka remains within the
framework of the role of features within generative syntax, the discussion of the
alternative strategies available to learners provides food for thought as to
whether the learning of the non-integrated forms must then be carried out in a
different way, e.g. stored as declarative as opposed to procedural knowledge, as
discussed in the article by Green.
The notion of partial transfer also contrasts with the articles by Corver and
Van de Craats. They allow for the full transfer of both lexical and syntactic
features from the L1 to the L2. Their argument is that the full transfer of the L1
features into the L2 provides the starting point for learners beginning to acquire
an L2. This is combined with a conservative strategy i.e. one in which the
learners maintain those features unless and until they perceive the need to
modify them. The learners create a series of interlanguages, all of which remain
within the constraints of UG and thus provide checkable and interpretable
information for the internal and external interfaces. As they progressively
modify the features in response to the differences which they perceive on the
basis of the input they receive from speakers of their target L2, their
interlanguages move towards a point where their interlingual perfect systems
correspond more to the natural language used by the L2 speakers.
The above comments should serve to show that the generativist perspective
on second language acquisition has the power to create well-defined hypotheses
about the nature of language and to turn these into clearly defined investigative
strategies. The shift to minimalism means, however, that the learning of items
in the lexicon is potentially more significant than it was previously. The
evidence from the Van de Craats article in particular shows how learners
progressively revise their featural specification of both functional and lexical
categories. As these are both part of the lexicon, it is clear that, from the
minimalist perspective, the way in which second language learners come to
revise their lexical forms, especially those which are linked to functional
categories, is now a more important issue. We have suggested above that it
provides the driving force for acquisition. The process of coming to know that
forms and features need to be revised, however, necessarily involves comparative
perception of language forms. How the learners trigger their knowledge or how
they perceive the differences is something which cannot easily be dealt with
within generative linguistics as currently defined, because it is more related to
performance than competence. As soon as we mention perception we need to
return to the psychological dimension and look again at second language
acquisition from that perspective.

3. The psychological dimension


It is probably as well to recognise immediately that the issues which divided
linguists and psychologists fifty years ago have not gone away. Mainstream
linguists still work within a rationalist framework which contrasts with the
psychologists' emphasis on empiricism. Linguists tend to reason on a top-down
basis; psychologists base their theories on bottom-up evidence. Linguists believe
in an innate, biological endowment specific to language; psychologists believe
that language learning is one manifestation of cognition amongst others.
Linguists believe that language is a symbolic system; some psychologists, at least,
believe that it can be accounted for without the use of symbols. Linguists tend to
reject computer modelling; many psychologists rely on computer modelling.
Despite these differences, the argument that is being developed in this
chapter (and which is the justification for this book) is that these differences
actually provide us with perspectives which are complementary rather than
contradictory, if we choose to look for the areas where insights from one field
can contribute to the other (see Hulstijn 2002 for a similarly motivated view).
Indeed, the fact that each field has excluded the domains covered by the other
discipline surely leads to a position where significant aspects of the total process
of SLA as described in Section 1 cannot be dealt with except by reference to the
other discipline.
It is therefore important to look more closely at the methods and results of
psycholinguistic research in order to establish where the complementarity may
lie. In order to do so we will now situate the articles in the book which have
adopted a psychological reference point within the development of the methodologies used in psycholinguistics.
Psychologists interested in language have made use of a variety of methods,
many of which are shared with other branches of cognitive psychology. The
three most important of these are experimental investigations which rely on the
measurement of reaction times against a theoretically predicted outcome; the
use of physical measurements of brain activity; and computer simulations of
speaker or learner behaviour. Psychologists apply these methods to a variety of
L1 and L2 speakers and to patients whose language ability is impaired in some
way, e.g. aphasics. All of these are represented in papers in this volume.
Dijkstra's article provides an excellent illustration of two out of the three
methodologies typically encountered in the psycholinguistic literature. The
problem space he addresses is whether bilinguals in possession of words from
different languages (under his definition this includes language learners as
unbalanced bilinguals) access both sets automatically in response to a given
stimulus or whether one or the other is primed by different contexts or presentation methods. His stimuli include words which are cognates (similar in both
form and meaning in the two languages), homographs (similar in orthographic
form but not in meaning) and homophones (similar in sound but not in
orthographic form or meaning). In a series of experiments Dijkstra and his
colleagues have shown that bilinguals cannot do otherwise than access the items
in both of their languages when presented with an applicable stimulus of
isolated words (they call this nonselective access). They have also shown that
frequency of use is the main determinant: highly automatised L1 words are
accessed more swiftly than L2 words. But they have also shown that both items
remain activated for a relatively long time before language selection takes place.
Dijkstra also reports on a (limited) number of studies which examine words in
sentential contexts by the measurement of Event-Related brain Potentials
(ERPs) through EEGs. The results have suggested that there are significant
differences in the way bilinguals process their second language. Those related to
semantic aspects seem quantitative in nature whilst those related to the syntax have
a qualitative dimension as well, showing differences amongst early and late second
language learners. Such studies, if replicated, could have considerable impact on
the critical age hypothesis which is essential to some of the generative hypotheses, such as the general blocking principle discussed by Hawkins and Liszka.
The study by Williams in this volume makes use, at least in part, of the third
methodology regularly exploited by psycholinguists: computer modelling.
Psychologists from Winograd (1972) onwards have been keen to exploit the
processing abilities of computers for the purposes of modelling human behaviour. The most recent manifestation is connectionism which has been extensively (and controversially) used to model language and language learning.
Connectionism (Bechtel and Abrahamsen 1991) attempts to show that many
apparently complex processes can, in fact, be accounted for in a relatively
simple way as long as the processing involved can be very large in quantitative
terms and can operate in parallel (it is also known as parallel distributed
processing or PDP). The basic idea is to have (a large number of) processing
units which feed into one another at several levels (some of which are hidden)
in sequence. The units involved can be given levels of activation or weighting
prior to any simulation. These may be random or specified in relation to the
outcome envisaged. This will vary according to whether the model is being used
simply to replicate what is assumed to happen in certain forms of processing or
whether it is intended to model a developmental process (such as language
learning). In the former case, random levels will be initially assigned and it will
be intended that during the simulation the model will learn new levels of
activation. It is hoped that these levels of activation will eventually represent
some kind of reality. Put (over-)simply, the lowest level or input units are then
given different levels of activation and these levels feed through to the higher
levels. Where the interaction with higher levels is facilitated because certain
units at those levels already have positive activation (excitation), the signal
transmitted will be strengthened. Where the level of input activation encounters
negative activation at a higher level, it will be inhibited and, in the longer term,
weakened. Over very many trials, the simulation establishes a stable level of
output activation which is in essence derived through parallel processing from
the relative frequency of the input it has received. This is arrived at through
differentiated interaction between signals at the intermediate levels on the basis
of the processes of strengthening and weakening.
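The feeding of activation from lower to higher levels described above can be illustrated with a minimal sketch. It is not taken from any of the models discussed in this volume: the function names, the use of a sigmoid function and the hand-picked weights are all assumptions made for illustration, and excitation and inhibition are simplified here to positive and negative connection weights.

```python
import math

def sigmoid(net_input):
    """Squash a summed input into an activation level between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-net_input))

def propagate(activations, weights):
    """Feed one level's activations through to the next level.

    weights[j][i] is the connection from sending unit i to receiving unit j:
    positive values are excitatory, negative values inhibitory.
    """
    return [sigmoid(sum(w * a for w, a in zip(row, activations)))
            for row in weights]

# Hypothetical weights, chosen by hand purely for illustration.
input_to_hidden = [[2.0, 0.0], [-2.0, 0.0]]
hidden_to_output = [[1.5, -1.5]]

input_units = [1.0, 0.0]                                   # lowest level
hidden_units = propagate(input_units, input_to_hidden)     # intermediate level
output_units = propagate(hidden_units, hidden_to_output)   # highest level
```

In this toy network the first hidden unit is excited by the active input unit and the second is inhibited by it, so the signal reaching the output unit is strengthened along one route and weakened along the other.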
In a simulated task learning context, models can be modified in such a way
as to include information about the extent to which the model is performing as
it should in relation to some target (back-propagation). This means that the
model can be trained to move nearer over time to a specified target. Researchers
can then see how the model responds to the information it has been given and
if and how it can progress towards the target. It should be noted that this
model only has one kind of unit which is linked to the signal strength of the
connections: there is no symbolic level within connectionist models. The
researchers can inspect the intermediate levels to see what activation levels the
model produced at different times; they will be aware of different stages of
learning and they will see the extent to which the desired outcome is obtained
and the relation with the input given.
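Training towards a target by back-propagation can be sketched in the same spirit. The class below is a hypothetical toy, not a reconstruction of any published simulation: the network size, the learning rate, the random starting weights and the training data (the logical OR pattern) are all assumptions chosen only to keep the example small.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyNetwork:
    """One hidden level, one output unit; the last weight in each row is a bias."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)  # random initial connection strengths
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, inputs):
        x = list(inputs) + [1.0]
        hidden = [sigmoid(sum(w * a for w, a in zip(row, x)))
                  for row in self.w_hidden]
        output = sigmoid(sum(w * a for w, a in zip(self.w_out, hidden + [1.0])))
        return hidden, output

    def train_step(self, inputs, target, rate=0.5):
        """Compare output with target and pass the error back down the levels."""
        hidden, output = self.forward(inputs)
        x = list(inputs) + [1.0]
        h = hidden + [1.0]
        delta_out = (target - output) * output * (1.0 - output)
        # Hidden-level error terms are computed before the output weights change.
        delta_hidden = [delta_out * self.w_out[j] * hidden[j] * (1.0 - hidden[j])
                        for j in range(len(hidden))]
        for j in range(len(h)):
            self.w_out[j] += rate * delta_out * h[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(x)):
                row[i] += rate * delta_hidden[j] * x[i]

def total_error(net, data):
    return sum((target - net.forward(inputs)[1]) ** 2 for inputs, target in data)

# Hypothetical training set: the logical OR of two input units.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
net = TinyNetwork(n_inputs=2, n_hidden=2)
error_before = total_error(net, data)
for _ in range(2000):            # very many trials
    for inputs, target in data:
        net.train_step(inputs, target)
error_after = total_error(net, data)
```

Comparing `error_before` with `error_after`, and inspecting the hidden activations returned by `forward` at different points in training, corresponds to the kind of inspection of intermediate levels mentioned above. Note that nothing in the network is a symbol: all that changes is the strength of the connections.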
There is no doubt that theorising alone cannot discover the effect of
enormous quantities of parallel processing and the computer is very efficient at
examining this effect. The sixty-four thousand dollar question, however, is whether
such modelling really reflects anything which goes on in the human brain.
Those who favour this kind of modelling argue that it parallels human learning,
that it can generalise beyond the data set which it is given and that it can
produce knowledge which is equivalent to what other approaches consider to
be abstract knowledge. It then follows naturally that proponents of this work
see no need for abstract symbolic categories in the mind because, from their
point of view, these are only the outcomes of the parallel processing described
above, not primitive units.
Numerous articles in the mid 1980s emanating from the PDP group led by
Rumelhart and McClelland (Rumelhart, McClelland and the PDP Research
Group 1986) made very strong claims about the way in which their network
could parallel human language learning. These drew a detailed response,
notably from Pinker and Prince (1988), in which the claims were thrown into
question. As Fodor and Pylyshyn put it: "Pinker and Prince argue (in effect)
that more must be going on in learning past tense morphology than merely
estimating correlations since the statistical hypothesis provides neither a close
fit to the ontogenic data nor a plausible account of the adult data on which the
ontogenic processes converge. It seems to us that Pinker and Prince have, by
quite a lot, the best of this argument." (Fodor and Pylyshyn 1988: 68). Connectionists have, however, gone on to refine their models and to enable them to call
on other sets of information through which they have renewed their claim to
model language.
These are the issues which are interestingly explored in the study by
Williams. He is critically interested in whether the learning which can be
demonstrated by computer modelling really models that of humans or not. To
investigate this, he combines computer-based simulations with experimental
investigations on humans. The context of learning is that of the specification of
abstract noun classes on the basis of a gender type classification. We already
know that there is considerable evidence available to show that second language
learners fail to assign gender in the consistent way that native speakers do.
There are two central issues. The first is whether computer-based inductive
processes can genuinely be said to have gone beyond exemplar-based generalisations to the creation of abstract classes. The second is whether what computers
do and what humans do is in fact similar. Computers may very well demonstrate
a remarkable ability to generalise from distributional examples, but do humans
do the same? Do they rely on many examples to induce classes or do they
induce classes in other ways, such as by making use of other clues, e.g. animacy?
Williams' first simulation of the training kind with feedback seemed to
show that the computer-based learning could generate productive knowledge
of noun classes which enabled the network to generalise beyond the trained
exemplars. It could behave as if it had formed abstract representations. His
second simulation which did not involve feedback in the same way did not
learn as well. Humans who learnt to classify the data into noun classes at 66%
or above seemed to be using conscious explicit strategies, which clearly were not
available to the computers, and/or to be influenced by prior knowledge of
gender languages, equally not true of the computers. The conclusions which
may be drawn are not simple: it seems as if some aspects of human behaviour
may be similar to the inductive generalising of computers but that humans
make use of other devices as well. Once feedback is introduced computer
learning is considerably more powerful, but then it is probably more powerful
than the human mind.
Green, and Sabourin and Haverkort, take us into yet other areas of psycholinguistic research. Their concern is with how linguistic knowledge may be
represented and stored in the mind. They share many reference points and to
some extent a methodology. They both discuss evidence based on aphasics; they
both rely on physiological evidence from ERPs to confirm or deny what other
sources have indicated. Their conclusions appear to be slightly different.
The broader argument which this research addresses concerns whether or
not L1 and L2 learners acquire and store language in the same way. Virtually all
researchers acknowledge the differences in learning environment for many L2
learners: they are generally older (beyond the so-called critical age) with fully
developed memory and cognitive systems; they are often literate; they are
already in possession of a first language and they are often exposed to explanations about language in classrooms as well as to language forms in more or less
authentic contexts of use. A key question has always been: do these differences
mean that they will learn in a different way? Those who argue that they do
suggest that they rely more on explicit learning and that this will be reflected in
the way their knowledge is stored in the mind. A separation is often made between
declarative knowledge, sometimes glossed as "knowing that", the kind of knowledge which can be consciously accessed and articulated, such as a rule of grammar,
and procedural knowledge, sometimes glossed as "knowing how", i.e. the kind of
knowledge which underlies skill activity, such as riding a bicycle, which cannot
be consciously accessed. The two forms of knowledge are said to be acquired in
different ways and stored in two different memories: a declarative memory and
a procedural memory, each of which is accessed in a different way (Anderson
1983, 1993, 2000). Declarative knowledge is acquired explicitly, consciously
and quickly but cannot be used swiftly as a basis for any skill-based action.
Procedural knowledge is acquired implicitly, unconsciously and slowly by dint
of a lot of practice. It is available swiftly in response to an appropriate stimulus.
The argument has been put that L1 acquisition may be implicit and procedural
and L2 acquisition may be explicit and declarative, although most researchers
who discuss this issue allow for some overlap and some movement between
categories over time as learners become more proficient and make their
knowledge more automatic. Researchers such as Paradis (1997), however,
claim categorically that explicit knowledge cannot become implicit.
Green's article questions the necessity for a separation between declarative
and procedural memory and queries the evidence from aphasics on which he
believes it is based. Computer modelling has indicated that the data derived
from the aphasic experiments does not depend on having two memories. He
therefore argues that it is worth looking at physiological evidence to see whether
it confirms the necessity of two memory systems and whether there is evidence
to suggest that L2 learners store knowledge in this differentiated way. His
interpretation of the available evidence suggests that there is no difference for
proficient L2 learners and he suggests that if there is a difference in the early
stages of learning it soon disappears. Only longitudinal studies which included
physiological studies could settle this argument conclusively.
Sabourin and Haverkort do not refer explicitly to the two kinds of knowledge or the two memory systems outlined above but they do argue in favour of
a clearer separation between the representation of linguistic knowledge and the
representation of the knowledge which lies behind linguistic processing ability.
This is because they believe that the empirical evidence that they have gathered
by comparing the results obtained when proficient L2 learners are required to
undertake the same grammaticality judgement test off-line and on-line shows
that the knowledge base is different. The results of the off-line task show no
difference between the advanced learners and native speakers but the on-line
results do. When learners do a grammaticality judgement test in a paper and
pencil way, they score as highly as native speakers. But when their performance
on the same task is measured through ERPs, it is revealed that they are not
responding in the same way. For Sabourin and Haverkort, this suggests that
they are not accessing the same knowledge even though the observed outcome
of correct answers is the same.
Green therefore argues that the underlying knowledge base for advanced
proficient learners is likely to be the same but Sabourin and Haverkort argue
that it is likely to be different. Both agree that more research is needed.

4. In what ways can the linguistic and psychological perspective be seen to be complementary?

In this final section of the chapter, an attempt is made to build on the account
given so far and to draw out some of the central themes which are treated in the
following chapters. At the level of principle, as pointed out briefly in Section 3
above, there seems to be nothing but contradiction in the present stance of the
founding disciplines of psycho-linguistics. And yet at a more pragmatic level,
the accounts of the research outlined above suggest that the two disciplines may
need each other rather more than they are prepared to admit. We will briefly
explore this issue by looking at how interlanguages are created and how they
develop bearing in mind the evidence presented in the various articles.
Let us start with some notion of the initial state of knowledge for second
language learners. This has to be defined for us by the linguists. In their
chapters they argue cogently that learners transfer the features of lexical and
functional categories from the L1 in ways which make up an operational
interlanguage. Those features which are not available via transfer may be
available through direct access to UG. In those many cases where the languages
differ, the features are not combined in the same bundles as those which are
used by L2 speakers. The task for the learners is then to modify the relationship
between features and forms in such a way as to create over time combinations
which correspond more to the bundles used by speakers of the L2. If they can
create those bundles in an appropriate manner, Universal Grammar will ensure
that they are interpretable by other cognitive systems. There are arguments
about the extent to which the knowledge will transfer, and about the nature of
competence, but the line of the argument is clear.
The next question for second language acquisition researchers is how and
why the necessary modification takes place. The linguists' answer is that the
learner must in some way come to know (but not in a conscious way) that the
bundles of features in the interlanguage are not adequate to the purpose of full
communication with native speakers in the L2. Once that unconscious realisation has taken place, the learner must then have a way of revising the existing
feature bundles to make them correspond more to the L2 bundles. This is said by
the linguists to happen implicitly and without conscious feedback. As was noted
above, the term "triggering" is one which is frequently used in the literature but
it is becoming more and more difficult to accept that what must be a complex
process can adequately be summed up by that term. At this point we really have
to turn to the psychologists to gain some insight into what may be happening.
Dijkstra's article makes it very clear that when possessors of two languages
hear a given stimulus they activate all the relevant linguistic knowledge they
have without separating it out into L1 and L2. All the activated forms have the
possibility of giving rise to overt production: there is a selection process which
determines which will actually be produced. The level of activation relates very
much to frequency of use. This suggests that when second language learners
acquire new forms they will compete with the existing forms. There will be
differences in activation depending on the context when dealing with words in
utterances (as opposed to the isolated words mostly studied) but the principle
will hold that the ability to use second language forms accurately in fluent
utterances will depend at least to some degree on the frequency of use experienced by the user.
This raises interesting questions about how the activation of the interlingual
forms for the purposes of communication in the interlanguage permits modification of the interlanguage system. If the effect of use is merely to strengthen
the connections, as is implied by the non-symbolic connectionist models
explored by Williams and implied in the work of Dijkstra, then how can use
give rise to modification of the forms? How also will they overcome the
competition of existing forms?
Sabourin and Haverkort suggest that whilst advanced L2 users may display
the same knowledge in grammaticality judgement tests, the knowledge that lies
behind their use of the language is not the same as that which lies behind the
use in language production. Williams in his discussion of the comparative
learning by humans and computers points out that humans appear to be
influenced by factors other than the purely distributional. Is it possible that
second language learners do indeed have some differential representation which
allows them not to have to rely purely on the distributional analysis? Or is there
another more symbolic version of implicit learning which could account for
modification rather than strengthening?
Green in his discussion of the storage of linguistic knowledge raises issues
about the relative contribution of declarative and procedural knowledge.
Within that area of reference, there are interesting questions about how learners
deal with those lexical and functional categories which are not fully integrated
within the syntax, as must be the case for interlingual systems which have not
yet fully developed the syntactic system. We have seen that Hawkins and Liszka
take the view that Chinese learners cannot integrate +/- past in the T category
of their interlanguage syntax. They nonetheless produce correct past tense
forms of regular verbs for at least some of the time. Where and how are these
forms stored? If they are not generated by a function of the syntax, they must be
stored as separate lexical items. This immediately opens the door to the notion
that there must be a difference between the proportion of knowledge which is
stored in the lexicon of the L1 and the L2 and in interlanguages: it would seem
probable that in the early stages of learning at least second language learners
must store a large proportion of the forms they have learnt as new lexical items
and only work out later how they may be part of the syntax.
Whilst there are no clear answers which immediately fall out of the existing
state of knowledge, it should be evident that a combination of the insights of
linguists and of psychologists will be required to answer these questions
properly.

References
Anderson, J. R. 1983. The architecture of cognition. Cambridge, Mass.: Harvard University Press.
Anderson, J. R. 1993. Rules of the mind. New Jersey: Lawrence Erlbaum.
Anderson, J. R. 2000. Learning and memory. New York: John Wiley.
Bechtel, W. and Abrahamsen, A. 1991. Connectionism and the mind. Oxford: Blackwell.
Carroll, S. E. 2000. Input and evidence. Amsterdam: John Benjamins.
Chomsky, N. 1959. Review of Skinner 1957. Language 35: 26–58.
Chomsky, N. 1986. Knowledge of language. New York: Praeger.
Chomsky, N. 1995. The minimalist program. Cambridge, Mass.: MIT Press.
Dewaele, J. M. 2002. Individual differences in L2 fluency: The effect of neurobiological
correlates. In Portraits of the L2 user, V. Cook (ed.), Clevedon: Multilingual Matters.
Ellis, N. 2002. Frequency effects in language processing: A review with implications for
theories of implicit and explicit language acquisition. Studies in Second Language
Acquisition 24 (2): 143–189.
Flynn, S. 1987. A parameter-setting model of L2 acquisition. Dordrecht: Reidel.
Fodor, J. A. and Pylyshyn, Z. W. 1988. Connectionism and cognitive architecture. In
Connections and symbols, Special Edition of Cognition, S. Pinker and J. Mehler (eds), 3–73. Cambridge, Mass.: MIT Press.
Gass, S. and Selinker, L. 2001. Second language acquisition. New Jersey: Lawrence Erlbaum.
Gregg, K. 1996. The logical and developmental problems of second language acquisition.
In Handbook of second language acquisition, W. Ritchie and T. Bhatia (eds), San Diego:
Academic Press.
Harley, T. 2001. The psychology of language. Hove: Psychology Press.
Hawkins, R. 2001. Second language syntax. Oxford: Blackwell.
Hawkins, R., Towell, R. and Bazergui, N. 1993. Universal Grammar and the acquisition of
French verb movement by native speakers of English. Second Language Research 9: 189–233.
Herschensohn, J. 1999. The second time around: Minimalism and L2 acquisition. Amsterdam:
John Benjamins.

</TARGET "tow">

Hulstijn, J. 2002. Towards a unified account of the representation, processing and acquisition of second language knowledge. Second Language Research 18 (3): 193–224.
Jackendoff, R. 2002. Foundations of language. Oxford: Oxford University Press.
Juffs, A. 1998. The acquisition of semantics-syntax correspondences and verb frequencies
in ESL materials. Language Teaching Research 2: 93–123.
Juffs, A. 2001. Verb classes, event structure, and second language learners' knowledge of
semantics-syntax correspondences. Studies in Second Language Acquisition 23: 305–313.
Lightfoot, D. 1993. How to set parameters. Cambridge, Mass.: MIT Press.
Mitchell, R. and Myles, F. 1998. Second language acquisition theories. London: Arnold.
Paradis, M. 1997. The cognitive neuropsychology of bilingualism. In Tutorials in bilingualism.
Psycholinguistic perspectives, A.M.B. de Groot and J.F. Kroll (eds), New Jersey: Lawrence
Erlbaum.
Perani, D. 1999. The functional basis of memory: PET mapping of the memory systems in
humans. In Cognitive neuroscience of memory, L. G. Nilsson and H. J. Markovitsch
(eds), 55–78. Seattle: Hogrefe and Huber.
Perani, D., Dehaene, S., Grassi, F., Cohen, L., Cappa, S. F. and Dupoux, E. 1996. Brain
processing of native and foreign languages. Neuroreport 7: 2439–2444.
Pinker, S. 1997. How the mind works. Harmondsworth: Penguin.
Pinker, S. and Prince, A. 1988. On language and connectionism. Analysis of a parallel
distributed processing model of language acquisition. Cognition 28: 73–195.
Rumelhart, D. and McClelland, J. 1986. On learning the past tense of English verbs. In
Parallel distributed processing: Vol 1. Foundations, D. Rumelhart, J. McClelland and the
PDP Research Group 1986. Cambridge, Mass.: MIT Press.
Rumelhart, D., McClelland, J. and the PDP Research Group. 1986. Parallel distributed
processing: Vol 1. Foundations. Cambridge, Mass.: MIT Press.
Sakas, W. G. and Fodor, J. D. 2001. The structural triggers learner. In Language acquisition
and learnability, S. Bertolo (ed.), 172–234. Cambridge: Cambridge University Press.
Schreuder, R. and Weltens, B. (eds). 1993. The bilingual lexicon. Amsterdam: John Benjamins.
Smith, N. 1999. Chomsky: Ideas and ideals. Cambridge: Cambridge University Press.
Skinner, B. F. 1957. Verbal behaviour. New York: Appleton Century Crofts.
Towell, R. and Hawkins, R. 1994. Approaches to second language acquisition. Clevedon:
Multilingual Matters.
Ullman, M. 2001. The neural basis of lexicon and grammar in first and second language:
The declarative/procedural model. Bilingualism: Language and Cognition 4: 105–122.
Waxman, S. 1996. The development of an appreciation of specific linkages between
linguistic and conceptual organisation. In The acquisition of the lexicon, L. Gleitman
and B. Landau (eds), Cambridge, Mass.: MIT Press.
White, L. 1989. Universal Grammar and second language acquisition. Amsterdam: John
Benjamins.
White, L. 1991. Adverb placement in second language acquisition: Some effects of negative
evidence in the classroom. Second Language Research 7: 133–61.
Winograd, T. 1972. Understanding natural language. New York: Academic Press.

<LINK "haw-n*">

<TARGET "haw" DOCINFO AUTHOR "Roger Hawkins and Sarah Liszka"TITLE "Locating the source of defective past tense marking in advanced L2 English speakers"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 2

Locating the source of defective past tense marking in advanced L2 English speakers*
Roger Hawkins and Sarah Liszka
University of Essex

1. Introduction

It is well-known that advanced L2 speakers of English from certain L1 backgrounds show persistent optionality in marking thematic verbs for simple past
tense in spontaneous oral production, as for example in "The police caught the
man and take him away". Speakers whose L1 is Chinese appear to be such a
group. Bayley (1991, 1996) found the phenomenon sufficiently robust in
Chinese speakers to undertake a variationist analysis of the factors which might
be causing it. Wolfram and Hatfield (1984) had found similar optionality in the
L2 English of Vietnamese speakers. More recently Lardiere (1998a, 1998b, 2000)
has reported remarkably consistent optionality in simple past tense marking in
a near-native speaker of English sampled with an eight-and-a-half year interval.
Patty, a native speaker of Mandarin and Hokkien, marked simple past tense on
thematic verbs around only one-third of the time in data collected from
spontaneous speech. Native speakers, by contrast, do not apparently show
endemic optionality of the same kind, although failure to inflect a verb for past
tense is found sporadically in slips of the tongue (Fromkin 1988).1
An important question for theories of SLA which assume that the mental
grammars of individual L2 speakers are derived from Universal Grammar (UG)
is why such optionality might exist in advanced/near-native speakers. Given
that the morphophonology of forms in English provides clear positive evidence
that past tense is marked, it is unexpected that advanced/near-native speakers
should continue to have problems with it. Locating the source of the difficulty
would be a small contribution to the broader goal of determining exactly how
the language faculty is involved in the construction of grammatical knowledge
by older L2 learners.
Beck (1997) has argued that the kind of optionality in question is unlikely
to be the result of a deficit in the component of the language faculty which
generates inflected phonological word forms: morphology. Beck compared the
reaction times of 31 non-native speakers from a variety of L1 backgrounds with
those of 32 natives on a task requiring the production of past-inected verb
forms. Speakers were presented with verb stems on a computer screen, and
required to produce the simple past tense form orally, which activated a timing
device. In previous studies with natives (e.g. Prasada, Pinker and Snyder 1990)
it had been found that irregular verb forms show a frequency effect: the more
frequent the stem of the verb in question (where frequency is defined as
frequency of occurrence in corpora of English usage) the faster the reaction
time to the past tense form. For regular verbs, however, frequency of the stem
form had no effect on reaction time. Such findings have led to the claim that
irregular past tense verb forms are stored associatively in memory (i.e. are
listed) and hence show frequency effects as a function of strength of association between the stem and the listed form. By contrast, regular inflection is
produced by rule, which applies in the same way to all regular stems, independently of frequency. Hence there is no reaction time effect.
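The contrast between listed and rule-produced forms can be made concrete with a small sketch. It is offered only as an illustration of the claim just described, not as the procedure used in the studies cited: the three-verb lexicon, the function names and the simplified spelling rule are all hypothetical.

```python
# Hypothetical mini-lexicon: irregular past tense forms are listed one by one.
LISTED_PAST_FORMS = {"catch": "caught", "take": "took", "go": "went"}

def regular_rule(stem):
    """Apply the default rule; spelling changes other than final -e are ignored."""
    return stem + "d" if stem.endswith("e") else stem + "ed"

def past_tense(stem):
    """Return (form, route): 'listed' for stored irregulars, 'rule' otherwise."""
    if stem in LISTED_PAST_FORMS:
        return LISTED_PAST_FORMS[stem], "listed"
    return regular_rule(stem), "rule"

def frequency_effect_expected(stem):
    """Only retrieval of a listed form is predicted to be sensitive to frequency."""
    return past_tense(stem)[1] == "listed"
```

On this view a verb such as take is retrieved from the list, so its reaction time should vary with frequency, whereas a verb such as walk goes through the same rule whatever its frequency.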
Beck's findings with the non-native speakers were that they performed
similarly to the natives: reaction times on low frequency irregular stems were
slower than on high frequency stems (although not significantly so), and there
was no difference in their performance on frequent and less frequent regular
stems. This led Beck to suggest that it is not the morphological component
which is involved in causing persistent optionality in past tense marking.
Lardiere (1998a, 1998b, 2000) has suggested that the problem for Patty
resides in the mapping between fully specified syntactic phrase markers and
surface morphophonology. Lardiere argues that other evidence from Patty's
spontaneous oral production suggests that she has a T(ense) category specified
for finiteness. Firstly, her use of nominative case-marked pronouns is perfect;
on standard assumptions the nominative case of subjects in English is the result
of an agreement or checking relation between a T category specified [+finite]
and the subject in the specifier of TP. Hence if Patty uses nominative pronouns
perfectly she must have represented a finite feature on T. Secondly, there is no
evidence for thematic verb raising (e.g. over negation) in Patty's productions,
suggesting that she has established that T has weak inflectional properties in
English, another piece of evidence that Patty has a T specified for finiteness.
Thirdly, there is evidence that Patty projects finite CPs, which on standard
assumptions implies the presence of a T specified for finiteness.
The mapping difficulty is a problem accessing morphological forms which
have layers of feature structure. Assuming a model of grammar where an
autonomous morphological component reads the output of lexical and
syntactic derivation, identifying those features which condition inflectional
operations (Lardiere 1998a: 20), Lardiere argues that the more inflectional
features there are associated with a morphological form, the more likely a
problem will arise for an L2 speaker. In the case of tense-marking, she assumes
that the output of syntactic computations presents the morphological component with a terminal T node specified [+finite]. The morphological component
must then determine whether it is [+past] or [−past], and if [+past] select
suppletive forms in the case of irregular verbs, or invoke the regular rule for the
affixation of -ed in the case of regular verbs. She proposes (speculatively) that
"it is among the increasingly complex outer layer mappings from morphology
to PF that we are likely to find the greatest vulnerability to fossilization and
critical period effects" (Lardiere 2000: 124). Additionally, if the phonological
forms themselves involve complex phonology (for example, involving word-final
clusters like -kt, -skt and -mpst in the past tense forms of walked, asked and
glimpsed), Lardiere argues that this may further affect successful mapping:
"We can further imagine that an essentially morphophonological mapping
procedure would be especially vulnerable to derailment from a variety of post-syntactic or extra-syntactic factors, such as phonological transfer from the L1"
(Lardiere 1998a: 21). Given that (Mandarin) Chinese is a language with basic
(C)V(nasal) syllable structure and no syllable- or word-final consonant clusters
(Hansen 2001), we can take Lardiere's claim to be that while mapping of
phonological forms onto terminal nodes which are the output of the syntax can
cause problems generally for L2 speakers where layers of features are involved,
in Patty's case this is compounded by the fact that the L2 requires word-final
consonant clusters where the L1 disallows them.
The combined results of the studies by Beck and Lardiere point to the
following possible conclusion: a deficit has occurred in one of the mechanisms
of the morphological component, the mechanism referred to as the 'vocabulary' in models of distributed morphology (Halle and Marantz 1993, Embick and
Noyer 2001), which inserts vocabulary items (phonological forms) into
terminal nodes where the feature specification of the vocabulary item and the
feature specification of the terminal node are non-distinct. The representation
of morphological forms themselves is not affected (as Beck's study suggests), and
the feature specification of categories manipulated by the syntax is not affected, if
Lardiere's analysis is correct. Moreover, the difference in phonotactic constraints between the L1 and L2 has a persistent influence, such that the mapping
problem is exacerbated. Thus this account locates the source of optionality in tense marking by L2
speakers at the interface between syntax and morphology.
In this chapter we test this claim by comparing the spontaneous oral
production of advanced L2 speakers of English from three different L1 backgrounds: Chinese, Japanese and German. If there is a general mapping problem
for L2 speakers at the interface between syntax and morphology involving feature
matching at the point of vocabulary insertion, we would expect this to appear in all
three groups. Since Japanese is similar to Chinese in its phonotactic structure
(disallowing word-final consonant clusters) but German is like English (cf.
word-final clusters such as -ntst: getanzt 'danced'), we would expect mapping
problems to be more marked in the Chinese and Japanese informants.2
In fact we will argue that neither of the predictions is borne out by the data.
The Chinese informants do mark simple past tense optionally in oral production, as previous studies had found, but the Japanese and German speakers are
significantly less likely to do so. We will also show that our Chinese informants
appear to know the morphological properties of English past tense verb
inflection, as Beck's study suggests that they should, and that they do not
generally appear to have problems producing word-final consonant clusters.
This will lead us to argue for a different locus for their deficit: at the interface
between the syntactic component and the lexicon. In particular, we will claim
that the Chinese speakers have difficulty assigning the formal (i.e. syntactically-relevant) feature [past], which determines the morphophonological forms of
verbs in English, to the feature inventory of the category T(ense) in the lexicon,
because this feature is not selected in Chinese, and is subject to a critical period.

2. Assumptions about the organisation of the language faculty


We follow the spirit of recent work within the minimalist program (Chomsky
1998, 1999, 2001) and assume that the language faculty has a number of
universally fixed and invariant computational procedures, and provides a
universal inventory of phonological, semantic and syntactic (formal) features F
from which lexical items can be assembled. One subset of the computational
procedures is the syntax, which is capable of a small number of empirically and
conceptually necessary operations: merge, agree and move. The syntax takes
items presented to it from the lexicon and combines them into expressions.
These expressions are interpreted by a semantic component (LF) (whose
procedures are themselves universally invariant), and which makes the syntactic
expressions legible to the conceptual-intentional modules of mind. The
syntactic expressions are also interpreted by morphological/phonological
procedures which are universally uniform, and which make syntactic expressions legible to the sensori-motor systems for the production and understanding of speech.
The universal inventory F of semantic, phonological and syntactic features
is crucial in this model, because it is from this set that features are selected for
the assembly of a lexicon whose items provide the input to the computational
procedures. Individual languages select a subset of features from F and assemble
them into lexical items. It is at this point (the selection of particular features
for the assembly of lexical items) that languages vary.
We focus here on the selection of syntactic (formal) features. Syntactic
features are those which initiate syntactic operations, for example, wh-movement, case agreement, N-to-D movement, and so on. Some selections of
syntactic features appear to be obligatory. For example, finite T appears
necessarily to select the syntactic feature required to activate structural nominative case. Nominals also appear to obligatorily select case features which render
them active for the purpose of case agreement. Languages are uniform in
selecting these features. Parametric differences between languages arise when
they make different choices of optional syntactic features.
The distinction between obligatory and optional syntactic features is
important for understanding tense and how it is realised morphologically in
languages like English and Chinese. The view we adopt is that there is a syntactic (i.e. semantically uninterpretable) tense feature which, for the sake of
exposition, we call [past], which is available in the universal inventory F, but
which is optional. English has selected it, but Chinese has not. In English, the
presence of [past] in finite T has a consequence for the morphology of the
verb. [past], being a syntactic (formal) feature, is not semantically interpretable at LF and so must be eliminated from a syntactic expression before such
interpretation takes place. This elimination is effected through a checking (or
matching) of the features of T with the morphological features of inflected verb
forms like was, had, walked, ran. There are various complications that arise in
this checking/matching operation depending on whether the verb is a light verb
(be, have, do) or a thematic verb (walk, run). We will not expand on these here
(see Lasnik 1999, Embick and Noyer 2001). But the basic claim is that finite
English T has syntactic [past] features which have morphological consequences for V. By contrast, we claim that Chinese does not have syntactic [past]
features on T, although it does have a syntactic [finite] feature (Li 1990: 18).
As a consequence, bare Vs in Chinese can be interpreted either as past or non-past, depending on context:

(1) Zhangsan kan dianying.
    Zhangsan see movie
    'Zhangsan is seeing OR saw a movie.'

We take this up in more detail in Section 4.

3. The study
To test the claim that the source of optionality in tense marking by L2 speakers
lies at the interface between the syntactic and morphological components of the
language faculty, we selected advanced L2 speakers of English from different L1
backgrounds, devised a test aimed at measuring informants' knowledge of the
morphological processes involved in simple past tense marking, and collected
a sample of spontaneous production data including simple past tense verb
forms from the same informants.
3.1 Informants
Advanced L2 speakers of English were selected for this study on the basis of
their performance on an independent measure of general proficiency. This
consisted of two components: (a) the written multiple-choice grammar test
component of the Oxford Placement Test (Allan 1992) which has 100 items
covering a range of the core morphosyntactic properties of English; (b) Nation's
(1990) vocabulary levels test, which was designed as a language teacher's aid
for giving help with vocabulary learning, but provides a rough notional measure
of the size of a speaker's vocabulary up to the 10,000-word level. Using this
combined grammar/vocabulary test, we selected informants whose L1 was
Chinese, Japanese or German and whose mean scores broadly matched at the
upper end (over 80% correct). This produced an experimental set of informants
of two Chinese, five Japanese and five German speakers.3 Details of the proficiency scores are given in Table 1.

Table 1. Proficiency test scores: experimental informants

L1                 Mean proficiency score (%)   Range (%)
Chinese (n = 2)    86.6                         83.4–89.7
Japanese (n = 5)   85.7                         83.4–87.0
German (n = 5)     90.7                         85.8–96.0

3.2 Test of knowledge of morphology
If Beck's (1997) findings are generalisable, we expect to find that the informants in
our study can productively manipulate morphological processes involved in past
tense marking in English. To test this we designed a task which required informants to inflect both real and invented (nonce) verb stems for simple past tense.
Our reasoning was that if speakers know the verb morphology associated with past
tense, it should make no difference whether real or nonce forms are involved.
3.2.1 Design
The task was adapted from one used by Prasada and Pinker (1993) with native
speakers (experiment 3 in their study). Prasada and Pinker were interested in
the extent to which natives would produce regular and irregular past tense
forms when presented with nonce verbs, and in particular whether novel
irregulars displaying the prototypical phonological shape of partially productive past tense irregulars (like string, sling, fling, cling – strung, slung, flung,
clung) would elicit novel irregular past tense forms, such as spling – splung. To
elicit such responses, informants were presented with nonce forms and an
invented definition for each, six at the top of each page of a test questionnaire
followed by six sentences, each with a blank, where one of the nonce forms
belonged (Prasada and Pinker 1993: 24).
In our test we were interested simply in whether informants would inflect
verbs appropriately for past tense in clear past tense contexts, and whether their
knowledge was generative in the sense of allowing them to inflect correctly verbs
they had never encountered before. They were therefore presented with six
verbs at the top of each page of a test questionnaire with definitions (as in the
Prasada and Pinker study), but half were real and half invented (18 of them in
fact taken from the set of prototypical regular and irregular nonce verbs used by
Prasada and Pinker). Below each set of six verbs with their definitions were six
sentences where informants had to decide which verb to insert, and what its
form should be. These contexts required either a simple past or a present
perfect form. Thus informants could not simply produce a past tense form by
rote, but had to make a definite choice of tense appropriate to the context. A
partial illustration of the form of the test is given in (2):

(2) SPLING: If you spling, you blow something out of your mouth (like
            smoke rings or air bubbles).
    CUT:    If you cut something, you make it shorter, or divide it, or
            break its surface.
    a. The ground staff haven't marked out the tennis court or put up the
       net yet, but the head gardener claims that they __________ the grass
       in readiness.
    b. As he rose slowly, the diver __________ bubbles in short bursts.

There were 120 contexts in all, 60 of which were expected to elicit simple past
tense verb forms, and of these 30 involved nonce verb stems (15 prototypical
regular types, like blark, and 15 prototypical irregular types, like spling). A
control group of native speakers (n = 5) also took the morphology test.
3.2.2 Results
Table 2 displays the frequencies of inflected and uninflected real and nonce
verbs in simple past tense contexts.

Table 2. Frequencies of real and nonce verbs inflected for simple past

                           Chi (n = 2)    Jap (n = 5)    Ger (n = 5)    Eng (n = 5)
Verb type     Inflected    Score   (%)    Score   (%)    Score   (%)    Score   (%)
Real reg      Yes             23  88.5       60  96.8       64  98.5       74   100
              No               3  11.5        2   3.2        1   1.5        0     0
Real irreg    Yes             29   100       64  97.0       65   100       73   100
              No               0     0        2   3.0        0     0        0     0
Nonce reg     Yes             25  92.6       58  96.7       64   100       72  97.3
              No               2   7.4        2   3.3        0     0        2   2.7
Nonce irreg   Yes             18  81.8       54  85.7       53  88.3       71   100
              No               4  18.2        9  14.3        7  11.7        0     0
Total         Yes             95  91.3      236  94.0      246  96.9      290  99.3
              No               9   8.7       15   6.0        8   3.1        2   0.7

NB: Cases where an informant selected a verb form other than simple past (e.g. past progressive or
perfect) are excluded from the table. Hence scores do not necessarily add up to the expected maximum.

A χ² test comparing the total frequency of inflected and uninflected forms
produced by the non-native groups shows no significant difference between
them (χ² = 4.94, df = 2, p > .05). A comparison between the non-natives as a
group and the native controls shows that there is a significant difference
(χ² = 12.64 (with Yates' correction factor), df = 1, p < .05). This difference
appears to be located primarily in the extent to which the non-natives fail to
inflect irregular nonce forms for simple past (where the native controls do
inflect in 100% of cases). This is particularly interesting because there are no
significant differences between non-natives and natives in inflecting regular
nonce forms. The result seems to suggest that speakers know that certain
irregular nonce forms are irregular, hence do not attach a regular inflection to
them, but do not know what the inflected form should be, producing an
uninflected form.
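The three-group comparison can be recomputed from the Total rows of Table 2. The sketch below is a minimal illustration and not the authors' own procedure: the function name is an assumption, and the p-value uses the closed form that holds for the chi-square distribution with exactly two degrees of freedom.

```python
import math

# Totals from Table 2: (inflected, uninflected) per non-native group.
observed = {
    "Chinese": (95, 9),
    "Japanese": (236, 15),
    "German": (246, 8),
}


def pearson_chi_square(table):
    """Pearson chi-square statistic for an r x 2 contingency table."""
    rows = list(table.values())
    col_totals = [sum(row[j] for row in rows) for j in range(2)]
    grand_total = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for j, obs in enumerate(row):
            expected = row_total * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    return stat


chi2 = pearson_chi_square(observed)
# With 2 degrees of freedom the upper-tail probability of the chi-square
# distribution is exp(-x / 2), so no statistics library is needed here.
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.2f}, df = 2, p = {p_value:.3f}")
```

Run on these totals, the statistic comes out at 4.94 with p of about .085, in line with the reported absence of a significant difference between the three non-native groups.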
Broadly, and assuming that in some sense the non-native speakers are
distinguishing nonce regulars from irregulars, frequency of past tense marking in
the responses of these advanced non-natives is very similar to that of the natives.
This is consistent with Beck's findings, and suggests that the morphological component is operating similarly in these speakers to the way it operates in natives.
3.3 Inflected simple past tense in spontaneous production
Spontaneous oral data were collected from two tasks: the retelling of a short
extract from a Charlie Chaplin film (Modern Times), and the recounting of a
happy or exciting experience each informant had had. The data were recorded
and transcribed, and only verbs in unambiguously simple past tense contexts
(i.e. those where a native could use no other form) were counted. For the
purposes of this study, only thematic verbs were scored (e.g. walked, but not
modals, copula/auxiliary was or auxiliary had). Verbs in contexts which were
phonologically ambiguous were also discounted (i.e. regular past tense verbs
followed by homophonic stops as in walked to work, or interdental fricatives, as
in chased them). All unambiguous forms thus counted were rechecked against
the original recordings to ensure accuracy. The frequencies of inflected and
uninflected forms across the three groups of non-native speakers are presented
in Table 3.
χ² tests show that there is a significant difference between groups both on
frequency of inflection with regular verbs (χ² = 30.49, df = 2, p < .01) and with irregular verbs (χ² = 8.13, df = 2, p < .05). The difference appears to be located entirely in
the Chinese speakers' performance. This is in contrast to the performance of
these speakers on the morphology test, where there was no significant difference
between the three non-native groups, either on real verbs or nonce forms.
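One way to see where a between-group difference of this kind is located is to split the χ² statistic into the contribution made by each group. The sketch below does this for the regular-verb counts reported in Table 3; the decomposition is offered as an illustration under that assumption and is not an analysis reported by the authors.

```python
# Regular-verb counts from Table 3: (inflected, uninflected) per group.
regular = {
    "Chinese": (25, 15),
    "Japanese": (137, 12),
    "German": (52, 2),
}

col_totals = [sum(counts[j] for counts in regular.values()) for j in range(2)]
grand_total = sum(col_totals)

# Contribution of each group to the Pearson chi-square statistic:
# the sum over its two cells of (observed - expected)^2 / expected.
contribution = {}
for group, counts in regular.items():
    row_total = sum(counts)
    total = 0.0
    for j, obs in enumerate(counts):
        expected = row_total * col_totals[j] / grand_total
        total += (obs - expected) ** 2 / expected
    contribution[group] = total

chi2 = sum(contribution.values())
for group, value in contribution.items():
    print(f"{group:9s} {value:6.2f}  ({value / chi2:.0%} of chi2)")
print(f"chi2 = {chi2:.2f}")
```

The contributions sum to the reported value of 30.49, and the Chinese row alone accounts for roughly four fifths of it.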

Table 3. Frequencies of inflected/uninflected verbs in simple past tense contexts:
spontaneous oral production

                           Chinese (n = 2)   Japanese (n = 5)   German (n = 5)
Verb type                  Score   (%)       Score   (%)        Score   (%)
Regular      Inflected        25  62.5         137  91.9           52  96.3
             Uninflected      15  37.5          12   8.1            2   3.7
             Total            40   100         149   100           54   100
Irregular    Inflected        64  84.2         252  93.3           79  95.2
             Uninflected      12  15.8          18   6.7            4   4.8
             Total            76   100         270   100           83   100

These results are problematic for the view that L2 speakers generally have
difficulty mapping phonological forms (vocabulary items) with layers of
morphological features onto terminal nodes generated by the syntax. If this
were the case, we would expect all three groups to perform similarly on the
spontaneous production task, but the Chinese speakers are performing significantly differently from the Japanese and German speakers.4
Is the difference the result of L1 phonology interfering with and depressing
the performance of the Chinese informants? This is unlikely. Recall that the
prediction was that if L1 phonology had such an effect, we would expect the
Chinese and Japanese to experience similar problems because the relevant
property (absence of word-final consonant clusters) is present in both L1s.
However, to pursue this possibility further, we considered performance of the
non-native speakers on consonant clusters elsewhere in their spontaneous
production, specifically in monomorphemic words like most, kind. If word-final
clusters are problematic in production, some evidence for this should surface in
these forms. Table 4 compares informants' retention of -t/-d in inflecting
regular past tense verbs with their retention of -t/-d in monomorphemes.
Table 4. Absence of word-final -t/-d in monomorphemes and regular simple past tense
forms compared

                              Chinese        Japanese       German
Word type          -t/-d      Score   (%)    Score   (%)    Score   (%)
Monomorphemes      Present        9    82       27    96       48   100
                   Absent         2    18        1     4        0     0
Simple past        Present       25    63      137    92       52    96
(Regular)          Absent        15    37       12     8        2     4

Frequencies are small, but suggestive. Although Chinese speakers do drop
final -t/-d in monomorphemes, they do not do so as frequently as in the simple
past. This result matches a similar finding in Bayley (1996), who in a group of
20 L1 Chinese speakers of intermediate and advanced proficiency in L2 English
found a 65% retention of final -t/-d in monomorphemes versus a 44% retention in simple past tense verb forms. This is in marked contrast with the pattern
of -t/-d deletion found in many studies of native speakers (Labov 1989), as
already observed. Natives are more likely to drop -t/-d when a morphological
boundary is not involved than when it is.
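The comparison in Table 4 amounts to simple arithmetic on retention rates, sketched below from the counts in the table. The variable and function names are assumptions made for the illustration.

```python
# Counts from Table 4: (present, absent) for word-final -t/-d.
counts = {
    "Chinese": {"monomorphemes": (9, 2), "regular past": (25, 15)},
    "Japanese": {"monomorphemes": (27, 1), "regular past": (137, 12)},
    "German": {"monomorphemes": (48, 0), "regular past": (52, 2)},
}


def retention(present, absent):
    """Percentage of contexts in which final -t/-d was produced."""
    return 100 * present / (present + absent)


# Gap in percentage points between monomorphemes and regular past forms.
gaps = {}
for group, by_type in counts.items():
    mono = retention(*by_type["monomorphemes"])
    past = retention(*by_type["regular past"])
    gaps[group] = mono - past
    print(f"{group:9s} mono {mono:5.1f}%  past {past:5.1f}%  gap {gaps[group]:4.1f}")
```

On these counts the gap for the Chinese informants is about 19 percentage points, against less than 5 for the other two groups, which is the asymmetry the text describes.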
Another possibility is that spontaneous oral production introduces performance pressures which make it difficult for L2 speakers to access inflected past
tense verb forms in real-time (Prévost and White 2000: 129). If this were the
case, performance on the morphology test would be a better reflection of the
informants' competence, because it lessens such pressures, while performance
in spontaneous oral use of English underrepresents informants' competence.
This also looks implausible in the context of the three-group comparison. If
performance pressures were involved, we would expect them to surface in all
three groups, not just in the Chinese speakers. However, to pursue the possibility further, we looked at performance of the informants in using regular
inflected participles in cases like were scared of, be sliced, is released, which are
identical to the simple past tense forms. Table 5 brings together the frequencies
of inflected/uninflected regular participles, monomorphemes and regular
simple past tense forms.
Again, although small frequencies are involved, if performance pressures
were responsible for producing optionality in simple past tense marking, we
might expect to see some evidence in the production of participles, where layers
of morphological features also seem to be involved: if a verb form is [−finite]
then it is either [+durative] (walking) or [−durative]. If it is [−durative] it is
either [+past] ((have) walked) or [−past] ((to) walk). Table 5 shows no such
evidence: all three groups retained -t/-d on regular participles in every case.
Table 5. Absence of word-final -t/-d in regular participles, monomorphemes and
regular simple past tense forms compared

                              Chinese        Japanese       German
Word type          -t/-d      Score   (%)    Score   (%)    Score   (%)
Participles        Present       10   100       23   100       55   100
                   Absent         0     0        0     0        0     0
Monomorphemes      Present        9    82       27    96       48   100
                   Absent         2    18        1     4        0     0
Simple past        Present       25    63      137    92       52    96
(Regular)          Absent        15    37       12     8        2     4

In summary, the results from spontaneous oral production are the following: although the non-native groups were matched for general high proficiency
in L2 English, the Chinese informants were significantly less likely to inflect
both regular and irregular thematic verbs for past tense than the Japanese or
German speakers. The Chinese speakers retained word-final -t/-d with monomorphemes more often than with regular past tense verb forms (suggesting the
problem is not caused by word-final consonant clusters). The Chinese speakers
were perfect in inflecting past participles, in contrast to simple past tense verb
forms, suggesting that performance pressures are unlikely as a source of
omission of inflections with past tense verbs.

4. Discussion
We are interested in locating the source of optionality in the marking of
thematic verbs for simple past tense in oral production by certain groups of
high proficiency L2 speakers of English, for example speakers of L1 Chinese. To
address this problem comparative data were collected from L1 speakers of
Chinese, Japanese and German matched for general proficiency as advanced
speakers of English. An important caveat in discussing the results is that the
number of informants was small; the generalisability of any conclusions drawn must therefore be treated with caution. The data were elicited from a test of
knowledge of past tense morphology, and tasks allowing free oral production
(an anecdote and the retelling of a film). Results of the morphology test suggest
that all three groups, including the Chinese speakers, know the morphological
properties of past tense marking in the context of real English verbs, and are
productively aware of the regular versus irregular distinction even with verbs
they have never encountered before (nonce forms). This is consistent with
earlier findings of Beck (1997), who showed that the reaction times of non-native speakers on a verb inflection task were parallel to those of native speakers
on the same task. This led her to conclude that the source of optionality of past
tense marking is not located in the operation of the morphological component,
a claim with which we concur.
The comparative results from the free oral production of our three experimental groups show that the Chinese speakers are significantly different from
the other two groups, producing uninflected regular verbs in unambiguously
past tense contexts in over one third of cases. This was prima facie evidence
against the idea that L2 speakers in general have difficulty mapping phonological forms (vocabulary items) onto syntactic terminal nodes where that
mapping implicates layers of morphological structure. If this were a general
problem for L2 speakers, and on the assumption that morphological properties
do not transfer from the L1 to the L2, we expected all three groups to show
evidence of it. By looking at the Chinese speakers' performance on word-final
consonant clusters in monomorphemic words, and their performance in
inflecting past participles, we also established that optionality in producing final
-t/-d was lower in these contexts. This suggested that the source of the problem
with simple past tense was unlikely to be phonological in nature or the result of
performance pressure because similar optionality would be expected across
these contexts too.
Having eliminated the morphological component, and the mapping of
morphophonological forms onto syntactic representations as likely sources of
observed optionality in past tense marking in the informants studied, an
alternative explanation is needed. In this final section we explore the consequences of assuming that optionality results from a failure of L2 learners to
include a syntactic feature for tense, that we are calling [past], among the
features which make up the lexical item T. Where this is the case, T enters
syntactic derivations without the feature which, for native speakers of English,
eventually forces the insertion of vocabulary items inected for past tense into
terminal nodes. Given such a hypothesis, we need to account for why optionality is characteristic of the Chinese speakers in our sample, but not the Japanese
or German speakers; why the Chinese speakers nevertheless have greater than
chance success in marking verbs for past tense in past tense contexts; and why
the Chinese speakers might be more successful in marking irregular past forms
and regular participles than in marking regular past forms. Central to this
account is an understanding of the relation between how tense is interpreted by
the semantic component, and the presence of a syntactic (formal) tense feature
in T. We therefore examine this relationship more closely.
Languages appear to differ in whether they grammaticalise tense through
special morphophonological forms or not. Chinese is standardly assumed to
lack such morphophonological forms, although it does have verbal aspect
markers like -le, -guo and -zhe (Li and Thompson 1981, Li 1990, Packard 2000).
For example, as already noted in Section 2, the verb kan 'see' can be freely
interpreted as present or past depending on context:
(3) a. Zhangsan kan dianying.
       Zhangsan see movie
       'Zhangsan is seeing a movie.'
    b. Zhangsan zuotian kan dianying.
       Zhangsan yesterday see movie
       'Zhangsan saw a movie yesterday.'

By contrast, Japanese and German do appear to grammaticalise tense, like English.
Japanese has the forms -ta (past) and -ru (non-past) which appear to be tense
auxiliaries (Okuwaki 2000). Thematic verbs in German inflect for past versus
non-past tense like English, with regular and irregular variants (e.g. regular:
kaufen 'buy', ich kaufte 'I bought'; irregular: singen 'sing', ich sang 'I sang').5
Why might some languages have grammatical exponents of a past/non-past
distinction while others, like Chinese, simply lack such exponents? An interesting perspective on this question can be found in the work of Chierchia (1998).
Chierchia suggests that the presence of a syntactic property in a grammar which
is associated with a particular semantic operation has the effect of inhibiting the
free application of that operation. In his own study he suggests that the presence
of articles in a language blocks the free application of the semantic operations
which give nominals generic, definite or indefinite meanings. So in English,
count Ns can only be interpreted as definite when the is present, but in Russian,
which lacks articles, bare Ns can be freely interpreted as generic, definite or
indefinite depending, presumably, on the context (Chierchia 1998: 361). This
idea has been extended by Takeda (1999: 103) into a Generalised Blocking
Principle (GBP): if a language has a certain functional category in its lexicon,
the free application of the semantic operation that has the same function as that
syntactic category is blocked in that language.
Taking 'functional category' here to mean 'feature of a functional category',
what the GBP proposes in terms of tense is that the semantic operation which
interprets a T-V configuration as past or non-past can apply freely except where
a language assigns a syntactic [past] feature to T. This then requires overt
morphological expression and blocks the free operation of tense interpretation.
So while finite bare Vs in Chinese can potentially be interpreted as past, present
or future, depending on the context, finite bare Vs in English can only be
interpreted as non-past, because past tense interpretation is associated with the
specific features which give rise to forms like walked, ran.
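The blocking logic can be stated as a very small sketch. It is a deliberately simplified illustration: the table of languages, the function name and the reduction of tense readings to a two-way past/non-past split are all assumptions made here for exposition.

```python
# Toy sketch of the Generalised Blocking Principle applied to tense.

# Whether T in a language carries the optional syntactic feature [past],
# following the claims made in the text for English and Chinese.
T_HAS_PAST_FEATURE = {"English": True, "Chinese": False}


def bare_verb_readings(language: str) -> set[str]:
    """Tense readings available to a finite bare (uninflected) verb."""
    if T_HAS_PAST_FEATURE[language]:
        # [past] on T demands overt morphology, so free interpretation is
        # blocked and a bare V can only be read as non-past.
        return {"non-past"}
    # No [past] on T: the semantic operation applies freely, and context
    # decides between the readings.
    return {"past", "non-past"}


print(bare_verb_readings("English"))
print(bare_verb_readings("Chinese"))
```

The asymmetry lies entirely in the lexicon (which features T has), not in the interpretive operation itself, which is the same for both languages.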
Consider now how this idea might be implemented in a grammar which
incorporates some version of distributed morphology (Halle and Marantz 1993,
Embick and Noyer 2001), the kind of model assumed in the work of Lardiere
(2000) and Prévost and White (2000). The syntactic component generates
expressions from bundles of syntactic and semantic features using the operations merge, move and agree (Chomsky 1998, 1999, 2001). Expressions consist
of strings of terminal nodes which are presented to the morphological component for the insertion of vocabulary items (phonological forms). Insertion is an
automatic process which takes place where the features of a vocabulary item are
non-distinct from the features of a terminal node. For example, the terminal
string T-V will present the morphological component with features like
[+finite, −past] or [+finite, +past]. Vocabulary items with the relevant features
will then compete for insertion into this position. Insertion has two important
properties. Firstly, a vocabulary item does not necessarily have to have all
of the features of the terminal node to be inserted; it is sufficient that the
features of the item are non-distinct from those of the terminal node (i.e. they
may be a subset of those features). Secondly, where vocabulary items are in
competition for insertion into a terminal node, the most highly specified item
compatible with the features of the terminal node is the one which wins the
competition for insertion (Lumsden 1992: 480). For example, suppose that
variants of walk have the following feature specifications:
(4) walks   [V, +finite, 3p, +sing]
    walked  [V, +finite, +past]
    walk    [V]

Given such specifications, only walked can be inserted into a terminal node with
the features [+finite, +past], even though walk is unspecified for those features
and is in principle available for insertion. The reason is that although both
forms have feature specifications which are non-distinct from the terminal
node, walked is the more highly specified item.
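
The competition for insertion just described can be made concrete with a small sketch. The Python below is an illustration only, not a formal statement of the model: it assumes that features can be encoded as plain strings, that 'non-distinct' can be read as the subset relation mentioned above, and that 'most highly specified' can be measured by counting features. The names VocabularyItem, ITEMS and insert are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocabularyItem:
    form: str
    features: frozenset


# Feature specifications taken from (4); encoding them as strings is an
# assumption of this sketch.
ITEMS = [
    VocabularyItem("walks", frozenset({"V", "+finite", "3p", "+sing"})),
    VocabularyItem("walked", frozenset({"V", "+finite", "+past"})),
    VocabularyItem("walk", frozenset({"V"})),
]


def insert(node_features, items=ITEMS):
    """Return the form of the most highly specified item whose features
    are a subset of the terminal node's features, or None if no item fits."""
    node = frozenset(node_features)
    # An item is a candidate if it carries no feature the node lacks.
    candidates = [item for item in items if item.features <= node]
    if not candidates:
        return None
    # Specificity is approximated by the number of features (an assumption).
    return max(candidates, key=lambda item: len(item.features)).form


past_node = {"V", "+finite", "+past"}
print(insert(past_node))  # walked wins over the less specified walk
```

On the same assumptions, a T-V node that lacks [past] altogether, such as {"V", "+finite"}, admits only the bare form walk, since walked carries a feature the node does not have.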
With such an account of the interaction between syntax and morphology,
it is easy to see how the proposal that adult L2 speakers have difficulty with the


Roger Hawkins and Sarah Liszka

mapping of morphophonological forms onto terminal nodes might be made
explicit: the procedure for inserting more specified vocabulary items over less
specified items is not operating categorically in L2 speakers; less specified forms
are being inserted where they should not be.
The account we wish to explore here, however, is one where the syntactic
feature [past] is absent from T in the terminal string which is the output of the
syntactic computations. The claim we will make is the following: where
parametrised syntactic features are not present in a speaker's L1, they will not
be accessible in later L2 acquisition.
This is equivalent to saying that syntactic features which invoke the
Generalised Blocking Principle that have not been activated in early life do not
operate in adult SLA. This immediately distinguishes Japanese and German
speakers on the one hand, from Chinese speakers on the other. Japanese and
German both have morphosyntactic exponents of past tense, hence the
underlying syntactic features which invoke the GBP. In principle this should
allow them to determine that past tense verb forms in English are syntactically
motivated. By contrast, for L1 speakers of Chinese, T does not have the syntactic
feature [past]. Our proposal entails that when Chinese speakers learn
English they are unable to establish that English T is specified for [past]. This
would mean that in their English grammars the terminal string T-V will include
[finite] but not [past]. The distinction between vocabulary items like walks,
walked, walk would then have no syntactic motivation for them.⁶ This is one
kind of answer to the first question: why is optionality in past tense marking
characteristic of the Chinese speakers in our sample, but not the Japanese and
German speakers? It is also clear, however, that the Chinese informants studied
here have acquired vocabulary items with past tense forms, and use them (at
least superficially) in a highly target-like way in the morphology test, and at
above chance level in spontaneous production. What might explain this if our
claim is correct?
One possibility is that Chinese speakers analyse participles and irregular
past tense verb forms differently from regular past tense forms: they have a
different morphological status in their grammars. Past participles are arguably
different from simple past tense forms even in native grammars, in that they are
aspectual in nature (realising perfectivity), and result from a verb-internal
word formation process, which does not involve the T-V configuration. Since
our assumption is that L2 speakers do not have difficulty with morphological
operations per se, it would not be surprising if they did not have difficulty with
participle forms. In other words, we expect Chinese speakers to have difficulty
with simple past tense forms because they involve a syntactic feature missing
from T, but not participle forms because they do not involve T. In the case of
irregular past tense forms, one possibility is that the Chinese speakers have
acquired them as items independent from the equivalent bare V forms; i.e. ran
is not the past tense exponent of run as it is in native grammars, but an
independently acquired word form, in the same way that, say, amble and saunter are
independent word forms for native speakers. While this is speculative, there is
some limited evidence consistent with it in the transcripts of the spontaneous
production data. There are some cases of doubly inflected verb forms such as
in (5a). The same speaker who produced (5a) also used ran in clearly non-past
environments ((5b) and (5c)):
(5) a. The girl ranned not far away.
b. You should ran away together.
c. She could not ran any more.

If ran were an independent word form it could be inflected for past tense and be
used in non-past contexts. Observe also that (5b) and (5c) involve the use of ran
as a non-finite verb. Given the model of vocabulary insertion assumed, ran
could not be an inflected variant of run specified for the feature [+past]. If it
were, its feature specification would clash with the feature specification of the
non-finite terminal nodes into which it is inserted in (5b) and (5c). Insertion
should simply be impossible.
If past participles and irregular forms like ran have a different morphological
status in the L2 English grammars of Chinese speakers from regular past
tense verb forms like walked, this would be one kind of answer to the question:
why are Chinese speakers more successful in marking irregular past tense verbs
and regular participles than regular past tense verbs? The speakers are not
treating these forms as past tense verb forms at all; rather, the forms have
independent lexical item status for them. However, given this account, we have
little idea currently of what these forms might mean for Chinese speakers.
This leaves the need to give an account of optionality in the marking of
regular thematic verbs for past tense. We have no real answer to give in this
case. To begin with, we have to assume that there is some reason why Chinese speakers
do not treat regular verbs as independent vocabulary items as in the case of
irregulars. One possibility is that it is the very regularity of the inflection and the
frequency of such forms in the input which forces a morphological analysis of
them as rule-based variants of the bare V. The problem then is to explain why
some forms, but not others, are inflected in spontaneous production. To give a
flavour of the problem, consider the following short extract from the transcript
of one of the Chinese informants in our sample:
    When I saw the film Lonely and Hungry and it reminded me of the old time
    when life was very hard. Some people they were very hungry and they have no
    work to do. They really don't want to steal. But they had no other choice and
    when they become so hungry and they really want to just get the food and just
    want to eat the food so they stole or they just do something bad. But it's
    understandable. It was not their mistake And I watch it maybe 20 years ago.
    But it I didn't remember clearly about what it talk about. I just laugh a lot.
    But now when we see the film, I think. It gave me much imagination.

What might be prompting this speaker's use of the bare verb forms want, watch,
talk and laugh in a context where past tense reference is often clearly intended?
We have little idea yet of what the answer might be, but some possibilities
appear unlikely. One explanation for selective verb inflection that has been
advanced in studies of the early stages of L2 acquisition is the Aspect
Hypothesis. This maintains that English past markers in early grammars are
associated preferentially with verbs/predicates which have telicity as part of their
meaning (achievements and accomplishments in the terminology of Vendler
(1967)). See Bardovi-Harlig (1999) for a review of work on this topic. Does the
same pattern obtain in the high proficiency Chinese speakers investigated here?
That is, are they treating the regular past tense form as a marker of inherent
verbal/predicate aspect? A breakdown of inflected regular forms by verb type is
given in Table 6.
Table 6. Inflected past tense regular verbs by aspectual type: Chinese speakers

                    Statives   Activities   Accomplishments   Achievements
Inflected tokens    0/4        10/14        1/2               14/20
%                   0%         71%          50%               70%

The results are not conclusive. While statives are not inflected at all, which
would be consistent with the Aspect Hypothesis, activities (which are atelic) are
inflected to the same degree as achievements (which are telic). Interestingly,
Lardiere (2002) has also analysed Patty's past tense verb forms in terms of
whether there is a correlation with the telicity of the verb/predicate, and found
that 40% of all telic verbs (112/277, covering both regulars and irregulars) were
marked for past tense, while 35% of all atelic verbs (130/371) were past-marked.
This is a non-significant difference (using χ²). Thus, although the Aspect
Hypothesis might be an explanation for the distribution of English past tense
verbs in low proficiency L2 speakers, it looks unlikely as an explanation for
optionality in the marking of past tense on regular verbs by high proficiency L2
speakers.
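
The non-significance of the telic/atelic difference can be checked from the counts quoted above. The sketch below computes a Pearson chi-square on the 2 × 2 table implied by those counts. That the uncorrected Pearson statistic and a 5% significance level were the ones used is an assumption here, and the function name chi_square_2x2 is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], with one degree of freedom."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts quoted from Lardiere (2002): past-marked tokens out of all tokens.
telic_marked, telic_total = 112, 277
atelic_marked, atelic_total = 130, 371

chi2 = chi_square_2x2(
    telic_marked, telic_total - telic_marked,
    atelic_marked, atelic_total - atelic_marked,
)

# Standard critical value of chi-square for df = 1 at the .05 level.
CRITICAL_05_DF1 = 3.841
significant = chi2 > CRITICAL_05_DF1

print(round(chi2, 2), significant)
```

The statistic comes out at roughly 1.97, well below the critical value of 3.841, which agrees with the report that the difference is not significant.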
Another unlikely possibility considered (and rejected) by Lardiere in the
case of Patty is the Discourse Hypothesis (Bardovi-Harlig 1995). This holds
that L2 speakers of English may initially use past forms of verbs to mark
foreground events in narratives, but not mark background events. Foreground
events are defined as clauses that move time forward in a narrative and which,
if interchanged, would change the sequence of events in the narrative;
background events are often out of sequence, provide additional information
about the foreground events, set the scene, evaluate or explain (Bardovi-Harlig
1995: 265–267). Lardiere calculates that 32% of Patty's past-marked verb forms
(13/41) describe foreground events, while 30% (11/38) describe background
events.⁷ Again, while the Discourse Hypothesis might explain the distribution
of forms in low proficiency L2 speakers, it looks unlikely to be the source of
optionality in high proficiency L2 speakers.
The only hypothesis we are able to advance at present (but for which there is
scant evidence in our current sample) is the following: linguistic theory appears to
need to allow operations which apply to strings post-syntactically, that is in the
morphological component or following vocabulary insertion. For example,
Chomsky (1999) proposes that English has an output condition which bars surface
adjacency between V and direct objects where the V is an unaccusative or a passive.
For example, although (6a) is the expected output expression generated by the
syntax, it is less natural than (6b) or (6c). Chomsky claims that a post-syntactic
output constraint forces the object to move either leftwards or rightwards:
(6) a. There was placed a large bowl on the table.
b. There was a large bowl placed ___ on the table.
c. There was placed ___ on the table a large bowl.

The surface ordering of clitic clusters in French also appears to be the effect of a
post-syntactic output condition (Perlmutter 1971). If two third person clitics
co-occur, an accusative form precedes a dative, but if a first or second person and
a third person clitic co-occur, a dative form precedes an accusative (as in (7)):
(7) a. Elle le     lui     donne.
       she  it-acc him-dat gives
       'She is giving it to him.'
    b. Elle me     le     donne.
       she  me-dat it-acc gives
       'She is giving it to me.'
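
The ordering generalisation illustrated in (7) can be stated as a small procedure over surface strings. The sketch below is illustrative only: it encodes each clitic as a hypothetical (form, person, case) tuple, reads the mixed-person case as 'first or second person precedes third person', and does not attempt to cover combinations the text does not discuss (for instance two non-third-person clitics). The function name order_clitics is an assumption.

```python
def order_clitics(first, second):
    """Order two object clitics, each given as (form, person, case),
    following the generalisation stated for (7)."""
    pair = [first, second]
    persons = {clitic[1] for clitic in pair}
    if persons == {3}:
        # Two third person clitics: accusative precedes dative.
        ordered = sorted(pair, key=lambda clitic: clitic[2] != "acc")
    elif 3 in persons:
        # First or second person plus third person: non-third comes first.
        ordered = sorted(pair, key=lambda clitic: clitic[1] == 3)
    else:
        raise ValueError("combination not covered by the generalisation")
    return [clitic[0] for clitic in ordered]


le = ("le", 3, "acc")
lui = ("lui", 3, "dat")
me = ("me", 1, "dat")

print(order_clitics(lui, le))  # ['le', 'lui'], as in (7a)
print(order_clitics(le, me))   # ['me', 'le'], as in (7b)
```

The point of the sketch is that the check operates on properties already present in the vocabulary items themselves, here person and case, without reference to context.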

This suggests that the language faculty allows some monitoring of surface
strings. Perhaps in the normal case for Chinese speakers the morphology inserts
bare verb vocabulary items into T-V strings. But because output checking is a
possibility available in the grammar, Chinese speakers monitor the ambient
discourse for pastness and insert V-ed forms when they are able to detect it.
So, as in French where ordering of object clitics is determined by the person of
the forms in question (first and second person must precede third person), for
the Chinese speakers selection of a V-ed form is determined by pastness. The
difference between the two cases is that whereas [person] is a feature present in
the specification of the French pronoun vocabulary items, pastness has to be
determined on the basis of context.
Because context is involved, the monitoring process is unstable and gives
rise to the kind of apparently random use of bare and inected verb forms
illustrated in the sample of informant speech given above. This is consistent
with two observations about the marking of regular verbs for past tense by
Chinese speakers. First, individual speakers of high proficiency can differ
markedly in the extent to which they are successful in inflecting verbs. Our
informants are apparently more successful in spontaneous oral production than
Patty, for example. Second, different modalities allow a greater degree of
success in past tense marking by the same individual. So our informants were
more successful on the morphology test than in oral production, and Lardiere
(2002) reports that Patty's written output (in the form of e-mail messages)
shows higher proportions of past tense marking than her spoken output.

5. Conclusion: The interface between syntax and the lexicon in adult SLA
In a comparison of the performance of three groups of highly proficient L2
speakers of English (with Chinese, Japanese and German as L1s), it was found
that the marking of past tense was optional in the spoken English of the Chinese
speakers, but not in the case of the Japanese and German speakers. With the
caveat that caution is required in generalising from small numbers of informants,
we argued that the Chinese speakers' knowledge of English morphological
processes is intact (as Beck (1997) had already argued), and that phonological
properties (i.e. -t/-d deletion) did not appear to be a factor for our speakers.
Furthermore, we found that past participle and irregular past tense forms were
inflected more consistently than regulars.
Although previous studies have argued that L2 speakers can fully acquire
the syntactic features of lexical items like T (Lardiere 1998a, 1998b, 2000,
Prévost and White 2000), we explored the alternative possibility that Chinese
speakers cannot establish [past] on T in English precisely because this feature
is absent in their L1. By contrast [past] is present on T both in Japanese and
German. If Chinese speakers do not have access to such a feature in the
construction of L2 knowledge, but Japanese and German speakers do, this would
explain the observed difference between them in inflecting thematic verbs for
past tense in spontaneous production.
We linked this deficit in the Chinese speakers' grammars to the idea that
optional syntactic features, when selected, block the free application of
semantic operations (Chierchia 1998, Takeda 1999). Our account assumes that
these optional properties are subject to a critical period (an idea which
originates in the work of Tsimpli and Roussou (1991) and Smith and Tsimpli
(1995)). We then explored, in highly speculative mode, reasons why Chinese
speakers might be successful at all in marking thematic verbs for past tense.
This line of enquiry raises an interesting possibility for the interface
between the syntactic component and the lexicon in adult SLA. Within the
spirit of minimalist enquiry, it might be proposed that all of the resources and
computational procedures of the language faculty (LF, syntax, morphology)
are intact and operative in adult SLA. However, optional (parametrised)
syntactic features which play a major role in tying morphosyntactic structure to
semantic operations in the L1 acquisition of specific languages are unavailable
if not activated in early life. In parsing L2 data, older L2 speakers can use all of
the resources except for these features. Phonological forms (vocabulary items)
which are selected by such features in native grammars are not selected by those
features in the grammars of L2 learners beyond the critical period. Highly
proficient speakers must nevertheless find some motivation for them. We
rejected some of the conceivable ways in which Chinese speakers might
motivate past tense verb forms in English (the Aspect Hypothesis and the
Discourse Hypothesis), and we suggested that they might be operating in
terms of an output condition which monitors the T-V string in relation to the
pastness of the discourse.


<DEST "haw-n*">


Notes
* Parts of this work have been presented to audiences at McGill (2000) and GASLA (2000).
We are grateful for the comments of those present in both cases. We have also benefited
greatly from the comments of Donna Lardiere on an earlier draft discussion of the ideas
outlined here. She will not agree with our conclusions, but her views have helped us sharpen
our thinking on a number of issues.
1. The phenomenon of -t/-d deletion from word-final consonant clusters, attested widely in
informal varieties of native English (Labov 1989), is strongly disfavoured in the context of the
regular simple past tense (Bayley 1996: 109). We return to this below, since it appears that the
pattern is the reverse in non-native speakers: -t/-d is more likely to be missing from the
regular past tense forms of verbs than in other contexts.
2. The assumption underlying the predictions made in this paragraph is that the
morphological features of vocabulary items do not transfer from the L1 to the L2. Donna Lardiere (p.c.
and 2000: 116–117) points out that the assumption is not self-evident: 'If the L1 exhibits the
same (or higher) degree of [morphological] complexity [L2 learners] may know what to look
for.' Whether L2 speakers do or do not transfer morphological properties from their L1 is
obviously an empirical question. If it turns out that they do, the observations reported in this
article will need to be reinterpreted. Given that evidence bearing on the empirical question
is currently lacking (although see Jarvis and Odlin (2000) for some relevant discussion), for
the purposes of this article it will be assumed that transfer of verbal morphological properties
from L1 to L2 is not a factor.
3. The small size of the Chinese group is a reflection of the difficulty we had in locating L1
Chinese informants who can achieve a score of 80% or above on the proficiency test (rather
than a difficulty in finding L1 Chinese speakers of L2 English per se). Since, however, for the
purposes of the present investigation, the frequency of past tense marking in
non-native-speaker English is the focus of interest, the variation in group size is relatively unimportant.
In further work it would be desirable to have larger samples of advanced English speakers of
the L1s in question.
4. Unless, of course, L2 speakers whose L1s are morphologically complex 'know what to look
for'. See footnote 2.
5. However, it should be noted that the present perfect ich habe gekauft predominates
in everyday spoken German in what would be simple past contexts in English (Comrie 1976).
6. A question that might be raised here is whether this shouldn't predict random usage by
the Chinese speakers of all three forms (Donna Lardiere p.c.). This would be the case only if
L2 speakers assumed that morphological variants of a lexical item can occur in free variation.
However, we argue subsequently that Chinese speakers attempt to establish representations
which distinguish morphologically distinct forms. The problem for them is that they have to
do so without the benefit of the syntactic feature [past].
7. In practice the criteria seem to allow for considerable variation between raters in deciding
which events are in the foreground and which in the background. We had difficulty applying
them to our own sample, and do not report the results here.


References
Allan, D. 1992. Oxford placement test. Oxford: Oxford University Press.
Bardovi-Harlig, K. 1995. A narrative perspective on the development of the tense/aspect system
in second language acquisition. Studies in Second Language Acquisition 17: 263–291.
Bardovi-Harlig, K. 1999. From morpheme studies to temporal semantics: tense-aspect
research in SLA. Studies in Second Language Acquisition 21: 341–382.
Bayley, R. 1991. Variation theory and second language learning: Linguistic and social constraints on
interlanguage tense marking. Unpublished doctoral dissertation, Stanford University.
Bayley, R. 1996. Competing constraints on variation in the speech of adult Chinese learners
of English. In Second language acquisition and linguistic variation, R. Bayley and D.
Preston (eds), 97–120. Amsterdam: John Benjamins.
Beck, M-L. 1997. Regular verbs, past tense and frequency: tracking down a potential source
of NS/NNS competence differences. Second Language Research 13: 93–115.
Chierchia, G. 1998. Reference to kinds across languages. Natural Language Semantics 6:
339–405.
Chomsky, N. 1998. Minimalist inquiries: the framework. MIT Working Papers in Linguistics 15: 1–56.
Chomsky, N. 1999. Derivation by phase. Ms. MIT.
Chomsky, N. 2001. Beyond explanatory adequacy. Ms. MIT.
Comrie, B. 1976. Aspect. Cambridge: Cambridge University Press.
Embick, D. and Noyer, R. 2001. Movement operations after syntax. Linguistic Inquiry 32:
555–595.
Fromkin, V. 1988. The grammatical aspects of speech errors. In Linguistics: The Cambridge
survey, Vol. II, F. Newmeyer (ed.), 117–138. Cambridge: Cambridge University Press.
Halle, M. and Marantz, A. 1993. Distributed morphology and the pieces of inflection. In
The view from building 20: Essays in linguistics in honor of Sylvain Bromberger, K. Hale
and S. J. Keyser (eds), 111–176. Cambridge, MA: MIT Press.
Hansen, J. 2001. Linguistic constraints on the acquisition of English syllable codas by native
speakers of Mandarin Chinese. Applied Linguistics 22: 338–365.
Jarvis, S. and Odlin, T. 2000. Morphological type, spatial reference, and language transfer.
Studies in Second Language Acquisition 22: 535–556.
Labov, W. 1989. The child as linguistic historian. Language Variation and Change 1: 85–98.
Lardiere, D. 1998a. Case and tense in the fossilized steady state. Second Language Research
14: 1–26.
Lardiere, D. 1998b. Dissociating syntax from morphology in a divergent L2 end-state
grammar. Second Language Research 14: 359–375.
Lardiere, D. 2000. Mapping features and forms in second language acquisition. In Second
language acquisition and linguistic theory, J. Archibald (ed.), 102–129. Malden, MA:
Blackwell.
Lardiere, D. 2002. Second language knowledge of [past] vs. [finite]. Paper presented at
GASLA 6, University of Ottawa.
Lasnik, H. 1999. Minimalist analysis. Malden, MA: Blackwell.
Li, A. Y-H. 1990. Order and constituency in Mandarin Chinese. Dordrecht: Kluwer.


</TARGET "haw">


Li, C. N. and Thompson, S. 1981. Mandarin Chinese: A functional reference grammar.
Berkeley: University of California Press.
Lumsden, J. 1992. Underspecification in grammatical and natural gender. Linguistic
Inquiry 23: 469–486.
Nation, I. P. S. 1990. Teaching and learning vocabulary. Boston, MA: Heinle and Heinle.
Okuwaki, N. 2000. Japanese -ta as an auxiliary verb. Ms. University of Essex.
Packard, J. 2000. The morphology of Chinese: A linguistic and cognitive approach. Cambridge:
Cambridge University Press.
Perlmutter, D. 1971. Deep and surface structure constraints in syntax. New York: Holt,
Rinehart and Winston.
Prasada, S. and Pinker, S. 1993. Generalization of regular and irregular morphological
patterns. Language and Cognitive Processes 8: 1–56.
Prasada, S., Pinker, S. and Snyder, W. 1990. Some evidence that irregular forms are
retrieved from memory but regular forms are rule-generated. Poster paper, 31st
Annual Meeting of the Psychonomic Society, New Orleans.
Prévost, P. and White, L. 2000. Missing surface inflection or impairment in second language
acquisition? Evidence from tense and agreement. Second Language Research 16: 103–133.
Smith, N. and Tsimpli, I-M. 1995. The mind of a savant: Language learning and modularity.
Oxford: Blackwell.
Takeda, K. 1999. Multiple headed structures. Unpublished doctoral dissertation, University
of California, Irvine.
Tsimpli, I-M. and Roussou, A. 1991. Parameter resetting in L2? University College Working
Papers in Linguistics 3: 149–169.
Vendler, Z. 1967. Verbs and times. In Linguistics and Philosophy, Z. Vendler (ed.), 97–121.
Ithaca, NY: Cornell University Press.
Wolfram, W. and Hatfield, D. 1984. Tense marking in second language learning: patterns
of spoken and written English in a Vietnamese community. ERIC document ED 25
960. Washington, DC: Centre for Applied Linguistics.

<LINK "cor-n*">

<TARGET "cor" DOCINFO AUTHOR "Norbert Corver"TITLE "Perfect projections"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 3

Perfect projections*
Norbert Corver
Utrecht University

1. An interface perspective on L2-knowledge

A central question in current generative research is how perfect
a system language is (cf. Chomsky's minimalist research program: Chomsky
1995, 2000a, 2000b). Perfection is defined here from an interface perspective:
the grammatical information provided by the linguistic expressions that are
generated by the language L must be legible to the external performance systems
within which the language faculty is embedded. In view of the traditional
assumption that language is a relation of sound and meaning, i.e. a mental
phonetic representation and a mental meaning representation, there are two
points of access from external systems. There is a sensory-motor system that is
looking at P(honetic) F(orm) and reads off information provided by the
PF-representation. And there is some language use or conceptual system that
reads off the meaning information provided by the L(ogical) F(orm)-representation.
If the linguistic expression generated by the language system is legible
both on the sound side and on the meaning side, the expression is said to
converge at both interface levels. If there is some element or property which
cannot be interpreted at the interface, the expression crashes. The sentence John
met Mary Bill crashes, for example, because one of the noun phrases (say Bill)
does not receive an interpretation at the LF (i.e. meaning) interface. The noun
phrases John and Mary are interpreted as arguments of the two-place verbal
predicate met. The noun phrase Bill cannot receive an argumental
interpretation, nor any other meaning interpretation, and therefore turns the sentence
into an LF-representation that crashes at the meaning interface.
Taking this interface perspective on human language, one could say that
language is an optimally designed, perfect system. It makes linguistic
information available to the external systems in a form which is accessible to them.
Knowledge of language, then, can be defined in interface terms: a person has
knowledge of language L if he is able to form linguistic expressions (i.e.
sound-meaning pairs) that are fully interpretable at the interface levels.
This requirement that the linguistic expressions (PF-LF pairs) generated by the
computational system be legible to the external systems plausibly holds for any
state of L1-knowledge (both interlanguage knowledge and final state grammatical
knowledge).¹ In all stages of language acquisition, the language system (i.e. the
grammar) must interact with the external systems. If it did not, the linguistic
knowledge would not be usable at all and consequently there would not be any output
products. Thus, from this interface perspective we could say that the L1-products
in any stage of language acquisition are perfect linguistic objects (PF-LF pairs)
in the sense that they fully consist of interface-interpretable properties.
What does the interface perspective on linguistic knowledge contribute to
our view on second language knowledge (and second language acquisition)? In
fact, the conclusion seems inescapable that L2-expressions are also perfect
grammatical objects. If they were not, the L2-objects generated by the computational system would not be legible and usable at all by the external systems
which interact with the L2-grammar. There simply would not be any output
(i.e. utterances). The conclusion must be that L2-products are interpretable
both on the meaning side and the sound side. And just as it does for the
various L1-knowledge states, this conclusion should also hold for the various
knowledge states of the L2-grammar (initial state, interlanguage states and
target state).²
Of course, the conclusion that L2-representations are natural language
objects, in the sense of being objects that fall within the bounds of Universal
Grammar, is not new. Over the last two decades, various researchers interested
in investigating the linguistic competence of L2-learners have argued and tried
to show that the (mental) L2-representations generated by the interlanguage
grammar are constrained by principles of UG (and consequently can be
analysed in the same way as other (e.g. L1) linguistic data; cf. Bley-Vroman 1990,
Schachter 1989, White 1988, 2000). The importance of the interface perspective
on linguistic expressions is that the conclusion of UG-consistency seems
inescapable: if interlanguage representations did not obey the bare output
conditions (i.e. the interface legibility requirements), these representations
would be illegible and inaccessible to the external cognitive systems with which
the interlanguage grammar interacts. The linguistic expressions generated by
the interlanguage grammar would simply not contain the right instructions to
the performance systems, and consequently, the latter would not be able to put
them to use. As a consequence, there would not be any output.


Thus, the L2-system, just like the L1-system, is a perfect system in the sense
of being a system that is optimally designed to meet external conditions (bare
output conditions) imposed by other cognitive systems that the language
faculty interacts with. From a different perspective, though, linguistic
expressions produced by L2-learners seem to be highly imperfect. They very often
deviate, to a greater or lesser extent, from the linguistic expressions produced by
adult mother tongue learners of the language that is acquired. Some
L2-expression E with meaning representation LFx very often differs from the equivalent
L1-expression generated by a mother tongue speaker of the language.
Consider, for example, the imperfect L2-expressions in (1a)–(4a), which
are produced by Turkish L2-learners of Dutch.³,⁴ The b-examples represent the
corresponding target expressions.
(1) a. Ik komt huis en  Slenol wegt     huis  Stokhasselt.
       I  come home and Slenol away-3sg house Stokhasselt
       'I go home and Slenol went to his house in Stokhasselt.' (L2-expression)
    b. Ik kom thuis en Slenol gaat (weg) naar z'n huis in Stokhasselt.
       'I come home and Slenol goes (away) to his home in Stokhasselt.'
       (target expression)

(2) a. Altijd uh alles          woonte  Klirsehir van Turkije.
       always uh all/everything live(d) Klirsehir of  Turkey
       'Everyone still lives in Klirsehir in Turkey.' (L2-expression)
    b. Nog   altijd woont iedereen in Klirsehir in Turkije.
       still always lives everyone in Klirsehir in Turkey
       'Everyone still lives in Klirsehir in Turkey.' (target expression)

(3) a. Ik gaan   school.
       I  go-inf school
       'I go to school.' (L2-expression)
    b. Ik ga     naar school.
       I  go-1sg to   school
       'I go to school.' (target expression)

(4) a. En  dan  andere jongens komt.
       and then other  boys-pl come-sg
       'And then, the other boys come.' (L2-expression)
    b. En  dan  komen   de  andere jongens.
       and then come-pl the other  boys-pl
       'And then, the other boys come.' (target expression)


Norbert Corver

The L2-expressions in the a-examples in (1)–(4) are (superficially) imperfect in
the sense that they deviate from the expressions generated by the grammar of
mother tongue speakers of Dutch. In (1a), imperfection relates to the lexical
item wegt, a verbal form which does not exist in (target) Dutch. In (2a),
imperfection concerns the use of the lexical item alles, a quantificational
expression that does not express quantification over persons in the target
language. In (3a), imperfection relates to the linguistic expression of
location: there seems to be no prepositional element available to carry the
locational (i.e. path) interpretation of school. In (4a), finally, we appear to
have an imperfect agreement relation between the (plural) subject noun phrase
and the finite verb.
Even though these L2-expressions may be imperfect from the perspective of
the target language, I hope to show in this paper that they are perfect from the
perspective of the interface conditions.5 L2-projections are perfect projections
in the sense that, at the interface level, they consist of features (associated with
lexical items) that are interpretable for the interface conditions. One could say:
L2-projections are perfect in being externally-interpretable.
On the basis of the types of L2-expressions illustrated in (1)–(4), I hope to
show in this article that target (im)perfection (i.e. (non-)correspondence with
the target pattern) should be distinguished from interface (im)perfection.
L2-products of interlanguage grammars are typically target-imperfect but
interface-perfect. Importantly, those target-imperfect but interface-perfect
interlanguage expressions can be of two types. First of all, there are interlanguage expressions that are legible at LF because the equivalent L1 expression
is legible at LF. An example of such an expression resulting from transfer (or
conservation; see Van de Craats, Corver and Van Hout 2000, Van de Craats this
volume) is given in (5). Secondly, there are target-imperfect interlanguage
expressions that are legible at LF and whose imperfection results from other
mechanisms, e.g. the (non-target) merger of some root element and an
inflectional suffix. An example of such an expression is the element wegt in (1).
(5) examen van tolk
    exam of interpreter
    'the interpreter at the exam'

Let me briefly dwell on the example in (5), which from a superficial perspective
looks like a perfect target Dutch pattern (surface perfection). In possessive
structures of native speakers of Dutch, the element van is an adpositional (i.e.
prepositional) marker that is interpreted as the spell-out of the abstract genitive
case feature associated with the possessor DP in postnominal position (as in
dat boek van Jan, 'that book of Jan'). A Dutch-based analysis of the sequence
examen van tolk is highly unlikely, however; under such an analysis, in which
tolk is the complement to the noun examen, the entire noun phrase would have to
be interpreted as 'the exam taken by the interpreter'. This is not the reading it
has, which is 'the interpreter at/of the exam'; examen acts as the possessor.
Given this semantic interpretation, Van de Craats, Corver and Van Hout reach
the conclusion that the syntactic structure associated with the linear sequence
in (5) is a structure transferred (i.e. conserved) from the first language (i.e.
Turkish). This amounts to an analysis according to which van is an inflectional
suffix attached to the possessor noun. It is the equivalent of the inflectional
element -nin in Turkish expressions like Ayse-nin araba-si (Ayse-gen car-3sg,
'Ayse's car'). The Turkish syntactic structure which underlies the Dutch surface
sequence in (5) is then the one in (6a). (6b) represents the syntactic structure of
the Turkish sequence Ayse-nin araba-si.
(6) a. [DP [AgrP [examen-van]_i [Agr [NP t_i tolk] Agr]] D]
    b. [DP [AgrP [Ayse-nin]_i [Agr [NP t_i araba] si]] D]

In this article, I will consider examples of interface-legible interlanguage
expressions that are target-imperfect. One may wonder what interface-illegible
interlanguage expressions look like. Being illegible at the interface, such
illegitimate patterns should never surface in the derivational output: they are
ruled out by the bare output conditions at the interface. Potential examples of
such illegible interlanguage patterns can be made up, of course. In Van de Craats,
Corver and Van Hout (2000), for example, it is noted that even though Turkish
L2-learners produce a great variety of interlanguage possessive patterns (see (7)
for some examples), certain imaginable patterns are not attested in the L2-data;
for example, the patterns in (8). The absence of these patterns in the L2
derivational output hints at the characterization of these patterns as
interface-illegible interlanguage expressions. The external systems somehow
cannot read the linguistic instructions provided by these structures and, as a
consequence, these patterns are never present in the L2-output.
(7) a. examen van tolk
       exam of interpreter
       'the interpreter of/at the exam' (attested)
    b. auto z'n lamp
       car its light
       'the car's light' (attested)

(8) a. tolk examen van
       interpreter exam of
       'the interpreter of/at the exam' (unattested)
    b. z'n lamp auto
       his light car
       'the car's light' (unattested)

Through discussion of some illustrative L2-expressions, I hope to show in this
chapter that target (im)perfection should be distinguished from interface
(im)perfection. L2-products of interlanguage grammars are typically
target-imperfect but interface-perfect. As I will show, interface perfection
applies both at the level of words and at the level of phrasal categories (i.e.
lexical projections). L2-words that may be imperfect from the point of view of the
target language are (interface-)perfect at the lexicon-syntax interface (see
Section 2). And L2-phrases that are target-imperfect turn out to be fully legible
(hence perfect) at the LF-interface (see Sections 3, 4 and 5).

2. Perfect L2-words at the lexicon-syntax interface


Chomsky (1995) systematically refers to PF and LF as interfaces of syntax with,
respectively, a perception/articulation system and an interpretation/use system,
both being mental faculties. The syntactic structure generated by the
computational system (say, merge and move/attract) is assigned a
PF-representation and an LF-representation. Besides the PF- and LF-interface of
syntax, a third interface of the syntax can be identified, viz. the syntactic
representation built up from the lexicon. This is explicitly stated in Chomsky
(1991: 46):

"that there are three fundamental levels of representation: D-structure, PF,
and LF. Each constitutes an interface of the syntax (broadly construed) with
other systems: D-structure is a projection of the lexicon, via the mechanisms of
X-bar theory; PF is associated with articulation and perception, and LF with
semantic interpretation."

Although a separate level of D-structure is no longer adopted in minimalist
theorizing, the general idea that syntactic structure is built up from the lexicon
is still a core assumption of generative linguistics. This is also clear from the
following statement by Chomsky (1995: 225):

"Another natural condition is that outputs consist of nothing beyond properties
of items of the lexicon (lexical features), in other words, that the interface
levels consist of nothing more than arrangements of lexical features."

For the lexicon-syntax interface, this implies that each lexical item (i.e. a
constellation of lexical features) must be legible to the computational system
(merge and attract/move) that accesses these objects (i.e. lexical expressions)
and builds more complex expressions (i.e. syntactic expressions) from those
lexical expressions. A question which then arises is: What makes a lexical item
legible to the computational system (i.e. the rules of grammar: e.g. merge,
move/attract, agree)?
To answer this question, let us first address the question of what a lexical
item (LI) is. In line with De Saussure's conception of words, an LI is typically
defined as a sound-meaning pair (i.e. a PF-LF pair). In a sense, an LI is a
structured object with a sound representation (the phonological matrix; sound
properties) and a meaning representation (semantic properties).
Presumably it is not the phonological and purely semantic properties which
make a lexical item legible to the computational system. If an LI were just a
sound-meaning pair one could wonder what the syntax (i.e. the recursive
procedure) should do with it. That is, what would make such a sound-meaning
pair legible to the computational system? Phonetic and purely semantic features
do not seem to be accessed by the recursive syntactic procedures. In short, these
PF-LF-pairs would remain illegible to the computational system.
So, what makes these sound-meaning pairs legible to the computational
rules, which combine these pairs into more complex sound-meaning constructs? The answer is: formal (i.e. syntactic) features. Suppose a formal feature
must be added (merged) to the sound-meaning pair (i.e. the lexical item) for
the LI to be legible to the computational processes that generate larger structures. In other words, merger of a categorial feature with the sound-meaning
pair turns the lexical item into an object that is visible at the lexicon-syntax
interface (cf. Marantz 1997, Chomsky 2000b).
Thus, what you have at the clausal level (syntax as a mediating representation between sound and meaning) is also what you have at the word level:
(9) [Diagram relating three components at the word level: meaning, phonology
    and the syntactic-formal feature]


Surface imperfections in the L2 derivational output can now be due to
miscategorisation (i.e. 'mis-' from the perspective of the target language): an
incorrect categorial feature is associated with some sound-meaning pair. In
(10)–(13), some examples of miscategorisation by L2-learners are given:
(10) Hier komt weg. Ik beetje momentes.
     here comes away I a-bit moment-infl
     'Here he goes away. I wait a bit.'
(11) Ik komt huis en Slenol wegt huis Stokhasselt.
     I come home and Slenol away-3sg house Stokhasselt
     'I go home and Slenol went to his house in Stokhasselt.'
(12) A: In de buit ligt die.
        in the outside lie those
        'Outside lay these.'
     I: Hm? (lack of understanding)
     A: In de buiten. Buiten ook heeft de steen.
        in the outside outside also has the stone
        'There are also stones (at the) outside.'
(13) a. Ja verzeker betalen he.
        yes insure pay discourse-prt
        'Yes, my insurance will pay.'
     b. Uh ik ongeluk beur maar ik heb nu geen verzeker.
        uh I accident happen but I have now no insure
        'I had an accident but I have no insurance now.'

In (10) and (11), moment and weg are treated as roots (i.e. sound-meaning
pairs) that receive a verbal character after merger of a verbal categorial feature.
Schematically (order irrelevant):
(14) [ v  moment/weg ]

After attachment of the verbal categorial feature to the root, the lexical item
displays verbal behavior: it carries, for example, the verbal inflection -t (present
tense, third person singular). From the perspective of the (grammar of the)
L2-learner, there is nothing odd about these lexical items: they each represent
a root (a sound-meaning pair), which carries a categorial feature that turns it
into a verbal form that is legible at the lexicon-syntax interface, in the sense that
the lexical item (carrying a categorial feature) is accessible to the computational
(i.e. morphosyntactic) rules. In short, these lexical items represent perfect
objects in the L2-learner's grammar.
Another interesting example of a target-imperfect but (lexicon-syntax)
interface-perfect object is the verbal form bint in the following examples:
(15) a. Ja komt politieauto en hij bint in auto toe.
        yes comes police-car and he inside-3sg in car prt
        'Yes, there comes a police car and he goes into (enters) the car.'
     b. I: Hij wat?
           he what
        A: Hij bint auto.
           he inside-3sg car
           'He goes into the car.'

The use of bint in these examples seems to be a combination of L1-transfer (i.e.
conservation) and creative use of the L2. Turkish has a verb binmek, which means
'to get in', and Dutch has a preposition/particle binnen, which means 'inside'.6
The lexical items buit(en) (cf. (12)) and verzeker (cf. (13)) are also perfect
objects in the L2-learner's grammar. As can be concluded from their
co-occurrence with determiner-like elements (de, geen), these items are nominal.
This nominal behaviour is represented by the nominal categorial feature that is
attached to the root of the lexical item, i.e. the bare sound-meaning pair.
Schematically:
(16) [ n  buit(en)/verzeker ]

The verbal analysis of items like momentes, wegt and bint and the nominal
analysis of buit(en) and verzeker are imperfect from the perspective (of the
grammar) of the mother tongue speaker of Dutch. For him, binnen (meaning:
'inside') and buiten (meaning: 'outside') are both prepositions; moment
(meaning: 'moment') is a noun and verzeker (meaning: 'to insure') is a verbal
form. In short, the 'wrong' categorial feature is associated with the root in these
L2-expressions. From the perspective of the L2-learner's interlanguage grammar,
however, these non-target words are perfect projections: assignment of a
categorial value makes these items legible at the lexicon-syntax interface and
accessible to the computational rules that take these objects as their input. The
noun buit, for instance, is merged with the determiner de.


Summarizing, miscategorisation (i.e. 'mis-' from the perspective of the
target language) yields what could be called target-imperfect lexical items.
Because of this miscategorisation, these L2 lexical items are often hard to
understand for mother tongue speakers of that language. From the interface
perspective, however, there is no reason to believe that these lexical items are
illegible for the system (read: (morpho)syntax) which interacts with the lexicon.
After assignment of a categorial feature, each such item is input to the
combinatorial rules of the grammar.
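
The categorisation mechanism discussed in this section can be illustrated with a
small sketch. The Python fragment below is only an illustration under stated
assumptions: the names Root, LexicalItem, categorise, is_legible, inflect_3sg,
is_target_like and TARGET_CATEGORY are hypothetical and not part of the analysis
itself, and the labels 'v', 'n' and 'p' merely stand in for categorial features.
It shows that a bare root is illegible, that merger of any categorial feature
makes it legible, and that target (im)perfection is a separate comparison with
the target lexicon.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Root:
    """A bare sound-meaning pair, without any formal feature."""
    sound: str
    meaning: str


@dataclass(frozen=True)
class LexicalItem:
    """A root plus an (optional) categorial feature such as 'v', 'n' or 'p'."""
    root: Root
    category: Optional[str] = None


def categorise(root: Root, category: str) -> LexicalItem:
    """Merge a categorial feature with a root (order irrelevant)."""
    return LexicalItem(root, category)


def is_legible(item: LexicalItem) -> bool:
    """Legible to the computational system only if a formal feature is present."""
    return item.category is not None


def inflect_3sg(item: LexicalItem) -> str:
    """Attach the verbal inflection -t; only items categorised as 'v' accept it."""
    if item.category != "v":
        raise ValueError("verbal inflection requires the categorial feature 'v'")
    return item.root.sound + "t"


# Assumed target-language categories, following the discussion in the text.
TARGET_CATEGORY = {"moment": "n", "buiten": "p", "verzeker": "v"}


def is_target_like(item: LexicalItem) -> bool:
    """Target perfection: the learner's category matches the target category."""
    return TARGET_CATEGORY.get(item.root.sound) == item.category


weg = Root("weg", "away")
l2_weg = categorise(weg, "v")                           # learner treats weg as a verb
l2_buiten = categorise(Root("buiten", "outside"), "n")  # learner treats buiten as a noun

print(is_legible(LexicalItem(weg)))                      # False: bare root
print(inflect_3sg(l2_weg))                               # wegt
print(is_legible(l2_buiten), is_target_like(l2_buiten))  # True False
```

The design choice worth noting is that legibility and target-likeness are two
independent checks, which mirrors the distinction between interface perfection
and target perfection drawn above.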

3. Perfect quantificational expressions


Thus far, we have seen that mis-categorisations on the part of the L2-learner lead
to the formation of L2-expressions that are perfect from an interface perspective
but may be deemed imperfect when compared with the target language.
These target imperfections do not only appear in the domain of content
words (also called: lexical categories), but, not unexpectedly, also in the domain
of function words (also called: functional categories). An interesting illustration
of interface legibility of target imperfections comes from the domain of
quantified noun phrases. As shown by the following examples, quantifying
expressions may display different forms depending on their function and
position within the syntactic structure. Consider, for example, the following
variants of the universal quantifier al in present-day Dutch:7
(17) a. Jan heeft alle mensen herkend.
        Jan has all people recognized
     b. Jan heeft alles herkend.
        Jan has everything recognized
     c. Jan heeft allen herkend.
        Jan has all recognized
        'Jan has recognized all of them/everyone.'

In (17a), we have the quantifying determiner alle, which is often treated as a
fusion of the pre-determiner al (cf. note 7) and the definite article de: al+de
→ alle (cf. Paardekooper 1974, Verkuyl 1981; but see Zwarts 1992). In (17b), al is
followed by the sequence -(e)s, which presumably used to be a genitive case
suffix in older variants of Dutch, but is no longer recognized as such;
i.e. alles seems to have developed into a non-composite form which carries a
neuter meaning: 'everything'. As shown by (17b), alles, as opposed to the
quantificational form alle, occupies an argument position in the clause. The
quantificational element allen, finally, represents a plural form. It always refers
to human beings, and as such arguably carries the formal property [+human].
Just like alles, the lexical item allen in (17c) occupies an argument position
within the clause.8
Importantly, all these quantified expressions obey the universal constraint
that there is a restriction on the quantificational element (say al). This restricted
reading on the quantifier can be represented as follows:
(18) a. for all x_i [x_i: people]   (cf. (17a))
     b. for all x_i [x_i: things]   (cf. (17b))
     c. for all x_i [x_i: people]   (cf. (17c))

By allowing this restricted quantificational reading (i.e. the set of individuals
(objects, persons) is specified over which the quantifier ranges) and binding a
variable at LF, the nominal expressions in (17) are interpretable at the LF-interface.
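
For readers who prefer a more explicit notation, the restricted readings in (18)
can be spelled out for the full sentences in (17). The rendering below is only a
sketch, assuming a standard first-order format in which the restriction appears
as the antecedent of a conditional; the predicate names (person, thing,
recognized) and the constant jan are illustrative labels, not part of the
original representation.

```latex
\begin{align*}
\text{(17a)}\quad & \forall x\,[\mathrm{person}(x) \rightarrow \mathrm{recognized}(\mathrm{jan}, x)] \\
\text{(17b)}\quad & \forall x\,[\mathrm{thing}(x) \rightarrow \mathrm{recognized}(\mathrm{jan}, x)] \\
\text{(17c)}\quad & \forall x\,[\mathrm{person}(x) \rightarrow \mathrm{recognized}(\mathrm{jan}, x)]
\end{align*}
```

In each case the quantifier has a restriction and binds the variable in the
object position of the verb, which is what LF-interpretability requires here.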
Quantification, being a core property of natural language, is also found in
the L2-derivational output. Also with quantificational expressions we find
patterns which are imperfect from the perspective of the target language but
perfect from an interface perspective on natural language expressions. Consider,
for example, the forms in (19) and (20), which are all produced by one and the
same Turkish learner of Dutch (viz. Abdullah):
(19) a. Wat doet alles ik weet niet.
        what does all I know not
        'I don't know what everyone does.'
     b. Altijd uh alles woonte Klirsehir van Turkije.
        always uh all lived Klirsehir of Turkey
        'Everyone still lives in Klirsehir in Turkey.'
     c. En dan vandaag hier komen alles.
        and then today here come all
        'Today, everyone comes here.'

(20) a. Alles mensen # toerist ja.
        all people (were) tourists yes
     b. Ken je in Nederlands alles stad?
        know you in Netherlands every city
     c. Alles ja uh kinderen niet Turks spreken.
        all yes uh children not Turkish speak
        'All children don't speak any Turkish.'
     d. Wij maakte alles maar ik weet niet alles naam # Nederlands.
        we make everything but I know not everything name # Dutch
        'We make everything but I don't know the name of everything in Dutch.'

In (19) and (20), the form alles is used instead of the target forms allen and
alle, respectively. The L2-learner has identified the universal quantificational
meaning of alles, but he has not yet discovered that alles has a non-human
interpretation and that it cannot occur as a quantificational determiner. Or, to
put it differently, the L2-learner has identified the quantifier feature associated
with alles, but other lexical features, like its categorial feature (e.g. alle being a
quantificational determiner and alles being a noun-like expression), do not
seem to have been identified yet.
Importantly, the quantificational expressions in (19) and (20) are
LF-interpretable: they receive a restrictive reading (i.e. a set of objects is
defined over which they range) and bind a variable at LF. Thus, an L2-expression
like (19c) has the following LF-structure:
(21) For all x_i, [x_i: people], x_i come here today.

In short, the pattern alles (N) represents an LF-interpretable structure from an
interface perspective.
LF-legibility also holds for the lexical item alleen, as it is produced in the
following examples by the same Turkish learner of Dutch:
(22) a. Ja nu alleen mag.
        yes now alleen may
        'Yes now, everything is permitted.'
     b. Ik wil alleen leren.
        I want alleen learn
        'I want to learn everything.'

(23) a. Maar niet alleen.
        but not alleen
        'But not everyone (is unpleasant).'
     b. Maar ik zeg niet alleen slechte mensen.
        but I say not alleen bad people
        'But I don't say that everyone is bad.'

The item alleen in (22) and (23) receives a clearly (universal) quantificational
interpretation. In (22), it receives the interpretation 'everything'; in (23), it is
interpreted as 'everyone'. Also in these examples, then, the L2-learner has
identified the universal quantificational element al.
Interestingly, this quantificational reading is not the one which is associated
with the lexical item alleen in the target language. In Dutch, alleen means
'alone', a reading which arguably derives from the two elements that compose
this expression, the quantifier al and the numeral een: alleen actually means
'one is all'.9 As regards its distributional behaviour, alleen occurs as a floating
element which enters into a predicative relationship with a noun phrase in the
sentence. In (24), for example, alleen is predicated of the subject ik.
(24) Ik ben toen alleen naar de bakker gegaan.
     I am then alone to the baker gone
     'Then I went to the bakery shop alone.'

Contrary to the L2-expression alleen in (22) and (23), the target language item
alleen can never occur in an argument position:
(25) *Ik kende alleen.
I knew alone

Although the L2-lexical item alleen, which has a universal quantificational
meaning, does not occur in the target language, we should not conclude from
this that it is an illegitimate object. At the LF-interface, this L2 quantificational
expression receives a restrictive reading (for all x_i [x_i: persons/things]) and
binds a variable at LF.
(26) For all x_i, [x_i: thing], I want to learn x_i

As a matter of fact, the L2 quantificational expression alleen is just as perfect
from an interface perspective as a target quantificational expression like iedereen
('everyone') and menigeen ('many a one'). These forms are sometimes analysed
as composite quantificational expressions that consist of a quantifying element
(ieder, menig) and an indefinite pronominal part (een), which, in the target
language, refers to humans (just like English one in One shouldn't do that). The
L2 learner who produces the forms in (22) and (23) has possibly identified the
composite character of the quantificational expression alleen: it consists of the
universal quantifier al and the indefinite pronominal element een. As opposed
to a target expression like iedereen, however, the L2 expression alleen is not
restricted to quantification over humans. This suggests that the L2-learner has
not yet discovered that the indefinite pronominal een is restricted to a human
interpretation.
To summarize: forms like alles and alleen, as produced in (19)–(20) and
(22)–(23), are legitimate expressions at the LF-interface. Even though their
distribution and interpretation (e.g. nonhuman versus human) may differ from
that of the target lexical items alles and alleen, the conclusion seems inescapable
that these quantificational expressions are fully legible at the LF-interface: they
receive a restrictive reading and bind a variable at LF.

4. The interpretability of apparently P-less structures


Target imperfection is also found with L2-structures expressing prepositional
features like location and path. Consider, for example, the expressions in (27)
and (28), that are produced by a Turkish learner of Dutch (see Schenning
(1998) for extensive discussion).
(27) a. Hij ook woon Kirslehir.
        he also lives Kirslehir
        'He also lives in Kirslehir.'
     b. Hij werkt Ankara.
        he works Ankara
        'He works in Ankara.'
     c. Ik nooit geweest Istanbul.
        I never been Istanbul
        'I have never been in Istanbul.'

(28) a. Ik gaan school.
        I go school
        'I go to school.'
     b. Kom maar mijn huis.
        come just my house
        'Come to my house.'
     c. Ja wij moet altijd moskee gaan een dag vijf keer moskee gaan.
        yes we must always mosque go a day five time mosque go
        'Yes, we must go to the mosque five times every day.'

As shown by the following target Dutch equivalents of (27a) and (28a), respectively, Dutch requires the presence of a prepositional element in those syntactic
constructs that express the abstract property of location.
(29) Hij woont ook in Kirslehir.
he lives also in Kirslehir
(30) Ik ga naar school.
I go to school

In (27), the elements Kirslehir, Ankara and Istanbul indicate a static location,
i.e. 'place'. In (28), the elements school, huis and moskee have a 'path'
interpretation. It is obvious that abstract meaning properties like 'place' or
'path' are not directly related to these nouns. These nominal elements don't have
an inherent locative meaning, as is clear from sentences in which they fulfil a
non-locative role, as in (31):
(31) a. Ik ken Ankara goed.
        I know Ankara well
     b. Ik zie mijn huis.
        I see my house

It is more likely that the locative properties place (static location) and path
(dynamic location) are associated with the category P(reposition). In the
(target) Dutch examples in (29)(30), this prepositional element expressing the
abstract meaning property space or path is phonetically realized; in the
L2-expressions in (27)(28) it is not. The non-visibility (i.e. phonetic absence)
of the prepositional element does not imply that the category P (and its
projection) is absent in expressions like Kirslehir and school. In fact, the preposition may very well be empty:
(32) [PP P_space [NP Kirslehir]]   (cf. (27a))

(33) [PP P_path [NP school]]   (cf. (28a))

In Emonds (1985, 2000), a principle is proposed that permits a closed class
category to be empty under the condition that it is realized on its phrasal sister.
According to this principle, which Emonds calls the Invisible Category Principle
(ICP), empty P structures can be utilized in a language if features like path or
space are realized on the NP-sister by means of a case marking. Thus, although
the interpretable P-feature (i.e. path/space) itself is associated with the P-head
of the prepositional structure, it can be alternatively realized on the NP-sister.
These alternative spell-outs of the locative/path feature are pure spell-outs of
features and appear late in the derivation (i.e. spell out at PF).


As noted in Kornfilt (1996), locative meanings are often expressed by
means of case suffixes in a language like Turkish:
(34) a. Kitap masa-da.
        book table-loc
        'The book is on the table.'
     b. Hasan Ankara-ya git-ti.
        Hasan Ankara-dat go-past
        'Hasan went to Ankara.'
     c. Hasan Ankara-dan gel-di.
        Hasan Ankara-abl come-past
        'Hasan came from Ankara.'

Following Emonds' ICP, these structures could be interpreted as empty
P-structures, which have the prepositional feature (space/path) realized on the
NP-sister as a case suffix. For example (order of P and NP-sister irrelevant):
(35) [PP P_space [NP masa-DA]]   (-DA as alternative realization of P-feature)

Under the assumption that L2-learners take a conservative approach towards
the expression of location- and path-denoting expressions (i.e. PPs), it is
expected that they initially do not realize the prepositional head. Just as in
Turkish, they try to realize the prepositional feature by means of a case marking
on the NP-sister of P, a strategy which is not available in Dutch.10
The following L2-variants of prepositional structures are interesting in this
context:
(36) a. I: Waar woont hij?
           where lives he
        O: van Tilburg.
           of (= in) Tilburg
     b. I: Wanneer ziet ze die jongen dan?
           when sees she that boy then
        O: van Trabzon.
           of (= in) Trabzon
     c. En dan beetje wandelen van [name of street].
        and then bit walk of [name of street]
        'And then I walk a bit in [name of street].'

In these examples, the prepositional element van appears in a prepositional
structure which denotes a location. Van itself does not seem to carry any
locative meaning. As a matter of fact, this 'meaningless' preposition van also
shows up in non-locative prepositional contexts (cf. Schenning 1998):
(37) a. Ik niet trouwen van Yvette.
        I not marry of Yvette
        'I don't marry Yvette.'
     b. Ik zegt: waarom jij niet praten van mij?
        I say why you not speak of (= with) me
     c. En dan moet ik vertellen van hun.
        and then must I tell of them
        'And then I must tell it to them.'

This distribution of the meaningless element van is suggestive of an analysis
in which it is a case suffix which alternatively realizes the locative prepositional
features and other types of prepositional features. Schematically, the
van-variants would then have the structure in (38):
(38) [PP P_space [NP VAN-Tilburg]]   (VAN as a case affix)

In conclusion, both the prepositional structures in (32)–(33) and the
prepositional structure in (38) are LF-interpretable objects. The empty preposition
(and its projection) carries the interpretable property space or path. The
target imperfection is simply a surface phenomenon that relates to the phonetic
spell out of the prepositional position: the L2-learner initially leaves the
prepositional head empty and tries to realize the prepositional feature by means of
a case-marking (i.e. alternative feature realization). Given the lack of clear case
markings in Dutch, this alternative realization remains empty initially. At a certain
stage in the acquisition process, the prepositional feature gets alternatively
realized on the NP-complement by the semantically empty preposition van.
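
The relation between one LF-interpretable prepositional structure and its
different phonetic realizations can be sketched as follows. This is a toy model
under explicit assumptions: the names PP, spell_out, lf_reading and TARGET_P and
the strategy labels are hypothetical, and the preposition table covers only the
target examples (29) and (30). The point is simply that the surface strings
differ while the interpretable P-feature stays constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PP:
    """A prepositional structure: an interpretable P-feature plus its NP sister."""
    p_feature: str  # "space" or "path"
    noun: str


# Simplified assumption: one overt target preposition per feature, as in (29)-(30).
TARGET_P = {"space": "in", "path": "naar"}


def spell_out(pp: PP, strategy: str) -> str:
    """Return the surface string for a PP under a given (assumed) strategy."""
    if strategy == "target":    # overt preposition in the P-head
        return f"{TARGET_P[pp.p_feature]} {pp.noun}"
    if strategy == "l2_empty":  # empty P, no alternative realization yet
        return pp.noun
    if strategy == "l2_van":    # empty P, feature alternatively realized by van
        return f"van {pp.noun}"
    raise ValueError(f"unknown strategy: {strategy}")


def lf_reading(pp: PP) -> tuple:
    """The LF reading depends on the P-feature and the noun, not on the PF form."""
    return (pp.p_feature, pp.noun)


school = PP("path", "school")
tilburg = PP("space", "Tilburg")

print(spell_out(school, "target"))     # naar school
print(spell_out(school, "l2_empty"))   # school
print(spell_out(tilburg, "l2_van"))    # van Tilburg
print(lf_reading(school))              # ('path', 'school')
```

Keeping lf_reading independent of the strategy argument is the design choice
that corresponds to treating the target imperfection as a pure surface
phenomenon.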

5. Agreement and asymmetric spell out


In traditional grammars, one often finds the observation that agreement is an
asymmetric relation. A verb, for example, is said to agree with its subject-DP in
person and number; it is not the subject-DP that is dependent on the verb for
agreement. And an adjective is said to agree in number and gender with the
noun it modifies; it is not the noun that is valued for certain phi-features under
agreement with the modifying adjective. In recent generative studies (cf. e.g.
Chomsky 1995), this asymmetry of the agreement relationship is captured in terms
of the notion 'interpretable (formal) feature'. An interpretable (formal) feature is
a feature that has a semantic contribution at the LF-interface (i.e. it is interpretable
at LF). A non-interpretable (formal) feature has no interpretation at LF (or for that
matter: PF). Structural case for nouns and phi-features for categories that agree
with nouns are core examples of uninterpretable formal properties.
The asymmetry in the agreement relationship, as observed in traditional
grammatical studies, has been reinterpreted in terms of the notions
[+interpretable] versus [-interpretable]. It is the element carrying the
[+interpretable] feature that agrees with the element carrying the
[-interpretable] feature. This agreement relationship involves feature matching
and elimination of the uninterpretable feature that is associated with the
matching constituent. In (39a), for example, it is the verb zagen that enters into
an agreement relationship with the plural subject-noun phrase de mannen.
Plurality ('more than one') and singularity ('one') are interpretable properties
of nouns. In a way, the plurality feature on the verb zagen is redundant; it does
not contribute any semantics and, as such, can be characterized as
uninterpretable. The plurality marking on the verb is just a formal marker of the
agreement relationship; semantically, it does not contribute anything to the
linguistic expression.
Therefore, in a certain intuitive sense, it is easier to imagine that a plural
interpretation gets associated with the ill-formed expression (39b) than with the
ill-formed expression in (39c). In (39b), plurality is morphologically specified
on the noun, i.e. the category that carries the number feature as a
[+interpretable] property, and the verb is not morphologically specified for
plurality. Even though there is a morphological mismatch, there is a tendency to
assign this sentence a plural interpretation: that is, the interpretation 'the
men-pl saw-pl me' is much more likely than the interpretation 'the man-sg saw-sg
me'. In other words, it is the plurality marking on the noun that most strongly
determines the semantic interpretation of this expression, which does not satisfy
agreement at the level of morphological expression. Consider next (39c), where
we have the reverse situation: the subject noun phrase does not bear any
marking of plurality; it is the verb zagen which is plural morphologically. In
spite of the plural marking on the verb, it is intuitively more difficult to get a
plural reading of the noun. As a matter of fact, it is the singularity of the noun
which seems to be dominant again for the interpretation. In short, given the fact
that in subject-verb agreement relations, it is the noun that determines plural or
singular interpretation, it is expected that morphological marking of singularity
or plurality is more likely to be realized on the noun than on the verb.
(39) a.  De mannen zagen mij.
         the men-pl saw-pl me
     b. *De mannen zag mij.
         the men-pl saw-sg me
     c. *De man zagen mij.
         the man-sg saw-pl me

This asymmetry in the morphological realization of the number feature is
reflected in the L2 derivational output: patterns are typically found in which the
agreement properties of the noun are correctly realized overtly, but not those of
the verb. In other words, morphological spell-out of the agreement property
generally applies to the element carrying the [+interpretable] property and not
to the element carrying the [−interpretable] property. Consider, for example,
the following L2-utterances, which were produced by a Turkish learner of Dutch.
(40) a.  En dan andere jongens komt.
         and then other boys-pl come-sg
     b.  Maar twee meisen drie jongens weet ut wel maar heel klein beetje.
         but two girls-pl three boys-pl know-sg it prt but very little bit
         'But two girls and three boys knew it, but only a little bit.'
     c.  Twee jongens woont Oisterwijk.
         two boys-pl live-sg Oisterwijk
         'Two boys live in Oisterwijk.'
     d.  Die jongens komt Turkije.
         those boys-pl come-sg Turkey
         'Those boys come to Turkey.'

In these examples, the phi-feature number carries the value plural. This
number feature is correctly spelled out on the noun of the subject noun phrase
that enters into an agreement relation with the verb. The verb carries what looks
like a singular inflection. Arguably, this verbal form is unanalysed morphologically or, alternatively, the marking -t is underspecified for number.
What is important is that, even though the L2-expressions in (40) may be
regarded as imperfect from the perspective of the target language, they are
perfect from the perspective of the LF-interface: i.e. they are expressions that are
fully legible semantically.
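
The asymmetry in (40) can be restated as a small toy model. The Python sketch below is an illustration only and not part of the analysis: the names `Element`, `spell_out` and `lf_number` are hypothetical, and treating the non-agreeing verb form as 'sg' simply follows the glosses in (40) (the text leaves open whether -t is unanalysed or underspecified).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A word taking part in an agreement relation (toy representation)."""
    gloss: str            # English gloss used in the examples, e.g. "boys"
    number: str           # value of the number feature: "sg" or "pl"
    interpretable: bool   # True on the noun, False on the agreeing verb


def spell_out(el, redundant_marking):
    """Return a gloss such as 'boys-pl'.

    Assumption: without redundant marking, number is realized only on the
    [+interpretable] element; elsewhere a default 'sg' form surfaces.
    """
    if el.interpretable or redundant_marking:
        return f"{el.gloss}-{el.number}"
    return f"{el.gloss}-sg"


def lf_number(elements):
    """Read number off the [+interpretable] element only."""
    return next(el.number for el in elements if el.interpretable)


subject = Element("boys", "pl", interpretable=True)
verb = Element("come", "pl", interpretable=False)
clause = [subject, verb]

target = [spell_out(el, redundant_marking=True) for el in clause]
learner = [spell_out(el, redundant_marking=False) for el in clause]
print(target, learner, lf_number(clause))
```

Under these assumptions the learner output corresponds to the glosses of (40a), while the number value read off at the interface is the same for the target and the learner variant.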
Let me close off this section with another illustration of an LF-interpretable
structure in which the agreement properties are morphologically spelled out
asymmetrically. The relevant example comes from noun-phrase internal
agreement between a numeral and a noun. As noted among others in Emonds
(1985), plurality is a property of numerals: i.e. numbers above one are inherently marked for the property [+plural]. This formal property is interpretable,
since plurality plays a role in the semantic interpretation of a linguistic expression. As illustrated by the following L2-expressions, plurality is not always
marked on the agreeing noun:
(41) a.  Vijf minuut he uh en dan klaar # alles dit.
         five minute prt prt and then ready # all this
         'All this is ready in five minutes.'
     b.  I: Hoe lang zit je al weer op school # twee weken?
            how long sit you already at school # two weeks
         A: Vijf daag.
            five day
     c.  I: Heb je nu vakantie van school?
            have you now holiday from school
         A: Ja twee week.
            yes two week
     d.  Vader, moeder en drie broer.
         father mother and three brother

In (41a), the noun minuut does not carry plural morphology. The target pattern
would be: vijf minuten. Also in this case, then, the L2-learner chooses the
strategy of not morphologically expressing plurality if this semantic feature is
already specified in the projected structure. In other words, redundant marking
of the plurality feature on the noun is avoided.
Again, it is important to stress that the numeral+noun patterns in (41) are
fully legible at the LF-interface. From a target language perspective, however,
these structures are imperfect: in Dutch, plurality is (redundantly) marked on
the noun when it combines with a numeral that is inherently specified for
plurality (i.e. 'more than one'). In this respect, Dutch differs from
Turkish. As noted in Kornfilt (1996: 225), there are syntactic contexts in
Turkish where, despite plural semantics of the noun phrase, the head noun
cannot be marked for plurality. When the noun is preceded by a numeral or
certain quantifiers, the plural suffix cannot occur:
(42) a.  beş çocuk(*-lar)
         five child(*ren)
         'five children'
     b.  birçok çocuk(*-lar)
         many child(*ren)
         'many children'

In view of the non-redundant marking in (42), it is likely that the L2-learner
who has produced the numeral+noun patterns in (41) has adopted a conservative strategy: the Turkish rule of not morphologically marking plurality on the
noun when it is preceded by a numeral (or certain quantifiers) is also at the
basis of the numeral+noun sequences in his second language, i.e. Dutch (see
again Van de Craats (this volume) for further discussion of the notion of
'conservation').
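
A minimal sketch of this conservative strategy, assuming a single hypothetical switch `redundant_plural` that distinguishes the Dutch target rule from the Turkish rule, might look as follows. The function name `noun_form` and the simple stem-plus-suffix concatenation are illustrative simplifications; the Dutch noun broer (plural broers) from (41d) is used because it needs no spelling adjustment.

```python
def noun_form(stem, plural_suffix, *, after_numeral, redundant_plural):
    """Return the form of a semantically plural noun.

    redundant_plural=True  : plural is marked even after a numeral (Dutch target).
    redundant_plural=False : plural is left unmarked after a numeral (Turkish rule).
    """
    if after_numeral and not redundant_plural:
        return stem
    return stem + plural_suffix


# Turkish L1 rule, cf. (42a): the suffix -lar is excluded after a numeral.
turkish = noun_form("çocuk", "lar", after_numeral=True, redundant_plural=False)

# Dutch target pattern: drie broers.
dutch_target = noun_form("broer", "s", after_numeral=True, redundant_plural=True)

# Learner variant, cf. (41d): the L1 setting is conserved with an L2 noun.
learner = noun_form("broer", "s", after_numeral=True, redundant_plural=False)

print(turkish, dutch_target, learner)
```

On this view the learner form in (41d) differs from the target only in the value of the conserved switch, not in the lexical material.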

6. Conclusion
From the perspective of the target language, L2-expressions often seem highly
imperfect. At the surface, these L2-expressions (e.g. Dutch L2-products of
Turkish learners) seem to differ greatly from those produced by mother tongue
speakers. From a different perspective, though, there does not seem to be much
wrong with those L2-expressions: they are perfect expressions, in the sense that
they meet conditions imposed by other cognitive systems that the language
faculty interacts with (external requirements). That is, any L2 (interlanguage)
grammar provides (grammatical) information that is legible to the cognitive
systems with which it interacts. I have tried to illustrate the interface-legibility
of L2-expressions by means of four types of phenomena: (a) categorial labeling
of words, (b) quantificational expressions, (c) the expression of location in
prepositional structures, and (d) the morphological expression of certain
agreement patterns. As regards the categorial labeling of words, it was noted
that certain L2-words (e.g. an L2-verb like wegt 'goes') that do not exist in the
target language are perfect lexical constructs from a (lexicon-syntax) interface
perspective: such a word is a sound-meaning pair that through assignment of a categorial
value (i.e. V) becomes accessible to the computational system (merge, move,
morphological rules) of the interlanguage grammar. I further argued that
certain L2 quantificational expressions (e.g. alles mensen) that are imperfect
from a target language perspective are fully legitimate from the perspective of
LF-legibility: they are normal quantificational expressions in the sense of
allowing a restricted quantificational reading and binding a variable at LF.
I further argued that LF-legibility also holds for what, at the surface, looks like
a bare nominal carrying a locative meaning (e.g. Ankara, meaning 'in/to
Ankara'). At a more abstract level, these L2-expressions are prepositional
structures, which have the locative meaning associated with an empty P. These
empty-P-structures are fully interpretable expressions at the LF-interface.
Finally, it was observed that in L2-agreement patterns (of beginning learners)
it is typically the [+interpretable] element that gets morphologically marked.
Absence (or underspecification) of morphological marking of the [−interpretable] feature may yield a pattern which is imperfect from the perspective of the
target language. From the interface perspective, however, the non-redundantly
marked pattern is perfect: an agreement property like number is typically
spelled out on those items for which singularity or plurality is an inherent
semantic property.
In this paper, I have only glanced at a variety of L2-expressions from
the interface perspective, a perspective which is characteristic of the minimalist
thesis. The major purpose was to show that by taking this perspective, the
conclusion seems inescapable that L2-expressions are perfect grammatical
objects, where perfection amounts to legibility of their information to the systems
with which they interact.

Notes
* I thank the participants at the workshop for their comments on the talk. I would also like
to thank an anonymous reviewer for helpful comments and suggestions.
1. In a recent interview with Adriana Belletti and Luigi Rizzi, Chomsky states the following
(cf. Chomsky, Belletti and Rizzi 1999 (rev. 2000: 17)): "Every language meets minimalist
standards. Now, that means that not only the language faculty, but every state that it can attain
yields an infinite number of interpretable expressions. That essentially amounts to saying that
there are no dead ends in language acquisition." He further states: "The minimalist thesis
would say that all states have to satisfy the condition of infinite legibility at the interface."
2. A reviewer raises the following question: What can a generative-minimalist theory account
for with respect to interlanguage expressions that other theories (e.g. a GB-based approach
using presumed UG-notions like government) cannot account for? Although such a
comparison of approaches may be useful in certain respects, it is not always easy to evaluate
the benefits of one specific analysis of interlanguage data over another one, especially if the
analytic tools are different. Important, though, is the different perspective that the minimalist
approach towards language design provides: linguistic properties are not so much considered
from an intra-grammatical perspective, but rather from an interface perspective (lexicon-syntax,
syntax-semantics, syntax-phonology). This raises different sorts of questions about
the linguistic objects one examines (e.g. What makes an (L2) representation (il)legible at the
interface?) and arguably provides different sorts of accounts of the grammatical properties
displayed by these representations.
3. The data are drawn from the European Science Foundation (ESF) Program in Second
Language Acquisition by Adult Immigrants (for design, elicitation techniques, and topics, see
Perdue 1993). This project was set up as a longitudinal and cross-linguistic multiple case
study. Most of the data discussed in this article are from the Turkish informants Abdullah
and Osman. Since the issue of language development is not central in this paper, I have left
out information about the stages in which the expressions were uttered by these informants.
4. In each of the L2-expressions (1a)–(4a), there is more than one target imperfection. For
the sake of discussion, I will pick out one type of imperfection for each of the examples.
5. I won't consider in this paper the issue of (L2) perfection at the PF-interface.
6. I would like to thank the reviewer for discussion of this example.
7. Another pattern featuring the quantificational element al is: al de boeken ('all the books').
In this pattern, the quantificational element occurs in a pre-determiner position.
8. The quantificational form allen also shows up as a floating element, as in: Zij zijn gisteren
allen gekomen (they are yesterday all come; 'They all came yesterday.').
9. Alleen can also mean 'only' in present-day Dutch. This (homophonous) adverbial element is
not quantificational and displays a cross-categorial distribution, just like its English equivalent.
10. See Van de Craats (2000) and Van de Craats, Corver and Van Hout (2000) for a
discussion of conservation of L1-grammatical features in L2-expressions. See also Van de
Craats' contribution in this volume.

References
Bley-Vroman, R. 1990. The logical problem of foreign language learning. Linguistic Analysis
20: 3–49.
Chomsky, N. 1991. Some notes on economy of derivation and representation. In Principles
and parameters in comparative grammar, R. Freidin (ed.), 417–454. Cambridge, MA:
MIT Press.
Chomsky, N. 1995. The minimalist program. Cambridge, MA: MIT Press.
Chomsky, N. 2000a. New horizons in the study of language and mind. Cambridge, UK:
Cambridge University Press.
Chomsky, N. 2000b. Minimalist inquiries (MI). In Step by step: Essays in minimalist syntax
in honor of Howard Lasnik, Martin et al. (eds). Cambridge, MA: MIT Press.
Chomsky, N., Belletti, A. and Rizzi, L. 1999 (rev. 2000). An interview on minimalism.
University of Siena.
Craats, I. van de 2000. Conservation in the acquisition of possessive constructions. Doctoral
Dissertation, Tilburg University.


</TARGET "cor">


Craats, I. van de, Corver, N. and Hout, R. van 2000. Conservation of grammatical knowledge: On the acquisition of possessive noun phrases by Turkish and Moroccan learners
of Dutch. Linguistics 38 (2): 221–314.
Emonds, J. 1985. A unified theory of syntactic categories. Dordrecht: Foris.
Emonds, J. 2000. Lexicon and grammar: The English syntacticon. Berlin/New York: Mouton
de Gruyter.
Kornfilt, J. 1996. Turkish. New York: Routledge.
Marantz, A. 1997. No escape from syntax: Don't try morphological analysis in the privacy of
your own lexicon. University of Pennsylvania Working Papers in Linguistics 4 (2): 201–225.
Paardekooper, P. C. 1974. Beknopte ABN-syntaxis. Den Bosch: Malmberg.
Perdue, C. (ed.) 1993. Adult language acquisition: Cross-linguistic perspectives, vol. I: Field
methods. Cambridge, UK: Cambridge University Press.
Schachter, J. 1989. Testing a proposed universal. In Linguistic perspectives on second language
acquisition, S. Gass and J. Schachter (eds), 73–88. Cambridge: Cambridge University Press.
Schenning, S. 1998. Learning to talk about space. The acquisition of Dutch as a second language
by Moroccan and Turkish adults. Doctoral Dissertation, Tilburg University.
Verkuyl, H. 1981. Numerals and quantifiers in X-Bar syntax and their semantic interpretation. In Formal methods in the study of language, J. Groenendijk, T. Janssen and M.
Stokhof (eds), 567–599. Amsterdam: Mathematical Centre.
White, L. 1988. Island effects in second language acquisition. In Linguistic theory in second
language acquisition, S. Flynn and W. O'Neill (eds), 144–172. Dordrecht: Reidel.
White, L. 2000. Second language acquisition: From initial state to final state. In Second
language acquisition and linguistic theory, J. Archibald (ed.), 130–155. Oxford: Blackwell.
Zwarts, J. 1992. X'-Syntax – X'-Semantics. On the interpretation of functional and lexical
heads. Doctoral Dissertation. Research Institute for Language and Speech OTS.
Utrecht University.

<TARGET "cra" DOCINFO AUTHOR "Ineke van de Craats"TITLE "L1 features in the L2 output"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 4

L1 features in the L2 output


Ineke van de Craats
University of Nijmegen

1. Introduction

In current generative syntax (e.g. Chomsky 1995), the role of the lexicon has
become more prominent in the generation of syntactic expressions. Under this
view, the lexicon is not limited to vocabulary, but also contains important
grammatical information. Lexical knowledge consists of grammatical properties
as defined by language-particular knowledge of functional categories (parameter settings), language-specific knowledge of lexical items and their features (the
vocabulary) and morphological knowledge. By means of lexical items selected
from the lexicon, the computational system of human language can build
phrases and sentences. A syntactic object (a clause or a phrase) is considered to be the structural projection of a series of linguistic properties associated
with a lexical item. Those formal features (e.g. singular, accusative, −human,
+V, −N) are stored in the vocabulary, together with the semantics of that
specific lexical item. Lexical items, like nouns and verbs, are base-generated as
a unit, including case morphology and inflectional morphology like person,
number, tense, under lexical heads. Functional heads, on the other hand, do not
dominate inflectional morphology; they dominate bundles of abstract features.
These features have to be eliminated or erased in the course of the derivation,
which is done by feature checking. This feature checking is a matching of the
features (e.g. case morphology is checked by its case assigner) and is done by
adjoining the inflected N or V to the relevant functional head. So, morphology
which is associated with a verb or a noun has to be checked by the abstract
features dominated by a functional head (e.g. Agr or T for verbs, and Agr and
D for nouns). What features are dominated by a functional head and whether
these features are strong or weak is lexical knowledge which is necessary for the
generation of a syntactic object, but is not part of the vocabulary. So, in the
minimalist approach, a parameter is related to a feature of a functional head
that attracts an identical feature of a lexical item at some point in the derivation,
and so is essentially linked to the lexicon.
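
Feature checking as matching can be pictured with a deliberately simplified sketch. The Python fragment below is an assumption-laden illustration, not a rendering of any particular formal proposal: features are modelled as attribute-value pairs in plain dictionaries, the function names `check_features` and `converges` are hypothetical, and the strong/weak distinction and the adjunction operation are left out.

```python
def check_features(head, item):
    """Return the features of the functional head that remain unchecked.

    A feature on the head is erased when the inflected lexical item carries
    the same attribute with the same value (toy notion of matching).
    """
    return {attr: val for attr, val in head.items() if item.get(attr) != val}


def converges(head, item):
    """The toy derivation converges when no feature of the head is left over."""
    return not check_features(head, item)


# Abstract features dominated by a functional head such as Agr (illustrative values).
agr = {"person": 3, "number": "pl"}

# Inflected verbs as they come from the lexicon.
verb_pl = {"category": "V", "person": 3, "number": "pl"}
verb_sg = {"category": "V", "person": 3, "number": "sg"}

print(check_features(agr, verb_pl))   # nothing left to check
print(check_features(agr, verb_sg))   # the number feature stays unchecked
```

Which features a head carries is exactly the kind of lexical knowledge that, on the view above, varies between languages and can therefore be conserved or restructured.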
If, in recent linguistic theorizing, the formal features of a lexical item and the
specification of functional heads can be seen as the seeds for building a syntactic
structure, they must play the same role in the acquisition of a new language,
assuming that we are dealing with a natural language. Applying generative theory
to second language acquisition is not new, of course. Before the work of White
(1982, 1985), Flynn (1986) and many others, Adjémian (1976) was the first to
adopt a Chomskyan approach to interlanguage development. He considered
grammatical interlanguage systems to be natural languages, different from
L1 grammars only in their permeability to aspects of the L1 system.
In this chapter, the current view of generative syntax will be applied to the
analysis of naturalistic second language data. Although L2 expressions are the
syntactic products of the interaction between the computational system and a
changing lexicon, the focus will be on how the grammatical knowledge of an L2
learner is encoded in the lexicon, not on derivation and syntactic representations. The question is more: what is the nature of this grammatical knowledge
at the L2-initial state and how does it change? Through examples produced by
L2 learners and by outlining the longitudinal development of some lexical items,
it will be shown how features of a lexical item may change in the course of the
acquisition process, giving rise to new syntactic structures. For a detailed discussion of the syntactic representations of nominal possessive constructions involving
van (Subsection 5.1) and the realisation of the personal pronoun (Subsection 5.2),
the reader is referred to Van de Craats, Corver and Van Hout
(2000); for representations of clausal possessive constructions involving heeft
(Section 6), to Van de Craats, Corver and Van Hout (2002).

2. The L2-initial state, data and informants


The central claim is that, initially, the grammatical system of an L2 learner is
not only permeable to the learner's L1 system (Adjémian 1976) but even based
on the L1 system. It is assumed that L2 learners exhibit conservative behaviour
and take the fully fledged grammar of their L1 as the starting point of the L2
acquisition process (in case they have command of only their L1). This amounts
to both the Full Transfer/Full Access Hypothesis (Schwartz and Sprouse 1996)
and the Conservation Hypothesis (Van de Craats 2000, Van de Craats, Corver
and Van Hout 2000). The latter, however, explicitly states that the learner's
output cannot show all L1 properties because of a strongly limited L2 vocabulary. With the developing vocabulary, (more) L1 properties related to free and
bound functional morphemes become manifest gradually, as we will see for
genitive markers (Subsection 5.1) and copular forms (Section 6), which,
initially, are not found in learners' data but appear gradually.
The following aspects of lexical knowledge may be conserved at the L2-initial state, when all learning starts:

- parameter settings (e.g. strength values);
- knowledge of morphology and morphological realization rules (e.g.
  realization of case);
- knowledge of lexical items: formal features (e.g. categorial features) and
  semantic-conceptual values (the meaning).

As acquisition proceeds, restructuring will apply at all levels of lexical knowledge,
from parameter values to semantic-conceptual values. Initially, L2 learners rely
on the 'old' system of the L1. On the basis of primary linguistic input, they will
depart from their conserved L1 parameter settings and L1 vocabulary
knowledge. This implies that a parameter together with its possible values will
remain available through UG.
In the next sections, the changing grammatical knowledge at the basis of L2
expressions will be shown through the spontaneous production data of eight
adults (18–24 years old) learning Dutch as a second language. These data were
collected within the framework of the European Science Foundation (ESF)
Program on Second Language Acquisition by Adult Immigrants (see Perdue
1993). The ESF project was set up as a longitudinal and cross-linguistic multiple
case study; we only use the Dutch data here. The eight informants were followed for two and a half years. The period of investigation was divided into
three cycles of nine sessions, one session a month. In the examples in the next
sections, we refer to the learner and to the cycle and recording session (e.g. I.7
= first cycle, session 7) in which the utterance was produced. At the time of the
first session, the informants had been living in the Netherlands for seven to
twelve months. They had a very low level of proficiency in Dutch, were monolingual, and had a limited level of education. Several elicitation tasks were
repeated in each cycle, such as interviews, role-playing, and film-retelling tasks.
Some other examples used were produced by child L2 learners between six and
nine years old. They come from another corpus of longitudinal and cross-linguistic
data collected by Vermeer (1986). These informants were also from a Turkish
and Moroccan Arabic background: 16 children from both language groups.
They were followed over 2.5 years from the time they entered primary school.
At the time of the first recording the children's age ranged from 6;4 to 7;9 years.

3. The nature of a lexical item


As hinted at in the previous section, we distinguish lexical knowledge as defined
by UG from language-specific knowledge of lexical items and their lexical
entries. To avoid confusion, we refer to the latter type as the vocabulary. The
first question that arises here is what knowledge learners have exactly of a lexical
item. Lexical items are combinations of sound and meaning properties which
can be read, or interpreted, by other cognitive systems (cf. Chomsky 1994,
1995). Phonetic features make up the phonetic representation and semantic
features make up the semantic representation. A sound-meaning pairing is
encoded by a phonological matrix. Each coding of a lexical item also contains
a set of formal features: intrinsic features and optional features. The former are
unpredictable, idiosyncratic grammatical properties of lexical items (e.g. the
categorial feature [+N,−V] and the person feature [3 person]); the latter include
grammatical features that are predictable from other properties of the lexical entry
(e.g. the features number and (abstract) case, which might be derived from the
categorial feature definition [+N,−V]).1 Table 1 gives an example of the concept
'bicycle' in three different languages. Only the phonological matrix differs.
In what way might we conceive of conservation of vocabulary knowledge?
Obviously, conservation does not apply at the level of the phonological matrix.2
One might conceive of early lexical acquisition as a process in which L2 learners
try to match a meaning representation associated with some lexical item of their
L1 vocabulary with a phonological matrix of the target language.
Table 1. Lexical items of the concept 'bicycle' compared for three languages

                                Turkish        Dutch          English
phonological matrix             /bisiklet/     /fiets/        /bike/
semantics                       'bicycle'      'bicycle'      'bicycle'
formal features    intrinsic    [+N,−V]        [+N,−V]        [+N,−V]
                   intrinsic    [−human]       [−human]       [−human]
                   intrinsic    [3 person]     [3 person]     [3 person]
                   optional     [singular]     [singular]     [singular]
                   optional     [nominative]   [nominative]   [nominative]


Table 2. The development of an L2 lexical item

                        L1 item         Interlanguage   L2 item
                        (Turkish)                       (Dutch)
phonological matrix     /bisiklet/      //              /fiets/
semantics               'bicycle'       'bicycle'       'bicycle'
formal features         [+N,−V]         [+N,−V]         [+N,−V]
                        [3 person]      [3 person]      [3 person]
                        [singular]      [singular]      [singular]
                        [nominative]    [nominative]    [nominative]

The consequence of this conservation model at the level of the derivational
output might be that, in an interlanguage, apparently empty constituents may
exist. They are filled by L1 feature bundles of semantic-conceptual and formal
features, but lack a phonological matrix. The phonological representation is
simply absent. The task of the L2 learner will be to fill in the empty slot of the
phonological matrix. Schematically, the acquisition of a lexical item may be
represented as in Table 2.
What learners do, in fact, is add a new phonological matrix to the already
existing bundle of semantic and formal features. They match, in Table 2 for
instance, the L2 phonological matrix /fiets/ with the semantic and formal
features belonging to the Turkish phonological matrix /bisiklet/. This combination of an L2 phonological matrix, L1 semantics and an L1 feature bundle
essentially is a new lexical item in the learner's L2 vocabulary.
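
The three stages of Table 2 can be sketched as a small data structure. The Python code below is a hypothetical illustration under simplifying assumptions: a lexical item is reduced to three fields, an empty phonological matrix is represented by `None`, and the names `LexicalItem`, `conserve` and `acquire` are introduced here only for the sake of the example.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class LexicalItem:
    phonological_matrix: Optional[str]   # None models the empty slot
    semantics: str
    formal_features: FrozenSet[str]


# L1 item (Turkish), with the feature values listed in Table 2.
L1_BICYCLE = LexicalItem(
    phonological_matrix="/bisiklet/",
    semantics="bicycle",
    formal_features=frozenset({"+N,-V", "3 person", "singular", "nominative"}),
)


def conserve(l1_item):
    """Interlanguage stage: the L1 bundle is kept, the sound form is absent."""
    return replace(l1_item, phonological_matrix=None)


def acquire(interlanguage_item, l2_matrix):
    """Fill the empty slot with an L2 phonological matrix; the rest is unchanged."""
    return replace(interlanguage_item, phonological_matrix=l2_matrix)


interlanguage = conserve(L1_BICYCLE)
l2_item = acquire(interlanguage, "/fiets/")
print(interlanguage.phonological_matrix, l2_item.phonological_matrix)
```

The point of the sketch is that only one field changes between the stages: semantics and formal features are carried over untouched from the L1 item.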
Evidence for this way of learning lexical items is (i) L2 lexical items showing
imperfect matching or mismatching and (ii) empty phonological matrices in
developmental sequences which are filled in later by elements based on the L1
syntax as will be shown by the developmental sequence of the lexical items van
(Subsection 5.1) and heeft (Section 6). In the next section, some examples of
mismatches will be presented. They are related to the semantic features, the
formal features and the argument structure of lexical (not functional) elements.
In Sections 5 and 6, we focus on mismatches of functional elements, both in the
nominal and the clausal domain.

4. Mismatches in lexical development


As long as there are no differences between the formal feature bundles of two
lexical items in the source and target languages, learners can carry out this
matching operation without making any errors. For most content words, such
a perfect match is possible as far as formal features are involved. We do not
expect a different set of formal features in two languages because entities are
typically nouns ([+N,−V]), actions are typically verbs ([−N,+V]), and qualities
are typically adjectives, each of those with its own categorial values (N, V, A)
and the formal features typically related to those categories. As for the semantic-conceptual aspects, differences between L1 and L2 are to be expected, however.
The first example of imperfect matching involves the domain of semantics.
A meaning representation of a lexical item may have different aspects. The
Turkish verb içmek, for instance, differs minimally from the Dutch verb drinken
('to drink'). The formal features are the same, but the basic meaning of the verb
/içmek/ is 'to put something in something else'. This general meaning has
several more specific meaning aspects, viz. 'to drink' and 'to smoke'. Consider
the L2 expression in (1).
(1) Als ik Marlboro drinken.                 Turkish learner: Ergün, III-5
    when I Marlboro drink
    'When I smoke a Marlboro.'

In (1), Ergün maps the L2 phonological matrix /drinken/ onto both the L1
semantic aspects and the grammatical properties of the verb içmek, which results in
a mismatch.3
Mismatches, however, are not restricted to the domain of semantics. They
may also involve formal features of lexical elements. Let us consider the
examples in (2) and (3). Mahmut, a Turkish learner of Dutch, is retelling a
scene from a silent movie in which Charlie Chaplin must pay the bill in a
restaurant. Charlie refuses to do so because he wants to go to jail. The same
episode was told twice, with an interval of approximately ten months. In (2a) and
(3a) the policeman orders Charlie to pay; in (2b) and (3b) Charlie answers that
he has no money.
(2) a.  Jij betalen geven.                   Turkish learner: Mahmut, II-9
        you to pay to give
        'You must pay.'
    b.  Ik niet betalen.
        I not pay
        'I have no money.'

(3) a.  Politie zegt: Jij geld geven.        Mahmut, III-9
        policeman says you money give
        'The policeman says: You must pay.'
    b.  Ik heb niet geld.
        I have not money
        'I do not have money.'

The examples in (3) make clear what Mahmut meant to say in (2). We can
assume that instead of betalen 'to pay' he intended to say geld 'money'.4 From
the perspective of the Conservation Hypothesis, the argument runs as follows.
A Turkish learner expects to find the verb at the end of the sentence, as Turkish
is basically a language with an SOV sentence structure. Hence, the Turkish
learner in (2a) considers geven to be a verb. Since the L1 item ödemek ('to pay')
has an internal argument, this L2 learner places this argument in the object
position that normally precedes the verb, as in (2a). Hence, betalen must be
the argument, and a noun. In (2b) betalen might be meant as a verb, but (3b)
makes that unlikely, the more so because, in cycle II, Mahmut is not yet able to
produce possessive clauses in which hebben ('to have') occurs.
As can be inferred from the sentences in (2) and (3), the categorial value of
betalen in (2) is that of a noun, [+N,−V], and not that of a verb, [−N,+V]. Only
in (3) is the phonological matrix /betalen/ replaced by /geld/, as represented in
Table 3. In this table, the subcategorization frame of 'to pay' has been integrated
in order to underline the verbal character of the L2 item.
Table 3. A learner variant of the lexical item /betalen/ compared to the relevant items
in source and target languages; deviances from the target language are in italics

                           L1 item        Learner        L2 item
                           (Turkish)      variant        (Dutch)
phonological matrix        /para/         /betalen/      /betalen/
semantics                  'money'        'money'        'to pay'
formal features            [+N,−V]        [+N,−V]        [−N,+V]
                           [−human]       [−human]
                           [3 person]     [3 person]
                           [singular]     [singular]
                           [accusative]   [accusative]
subcategorization frame                                  [DP2 DP1 ]
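
The comparison in Table 3 amounts to listing the attributes on which the learner variant departs from the target item. A minimal Python sketch of that comparison is given below; the dictionary encoding and the function name `deviances` are hypothetical, and only three of the attributes in the table are included.

```python
def deviances(learner, target):
    """Map each deviating attribute to a (learner value, target value) pair."""
    attributes = sorted(learner.keys() | target.keys())
    return {
        attr: (learner.get(attr), target.get(attr))
        for attr in attributes
        if learner.get(attr) != target.get(attr)
    }


# Learner variant and Dutch target item for /betalen/, values as in Table 3.
learner_betalen = {
    "phonological_matrix": "/betalen/",
    "semantics": "money",
    "category": "+N,-V",
}
target_betalen = {
    "phonological_matrix": "/betalen/",
    "semantics": "to pay",
    "category": "-N,+V",
}

print(deviances(learner_betalen, target_betalen))
```

In this encoding the sound form is shared, while the semantics and the categorial value are the two points of deviance.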

A comparable mismatch due to misinterpretation of the L2 item terug
('back') by the L2 learner is presented in (4).
(4) a.  Mijn vrouw ik thuis terug.               Mahmut, I-5
        my wife I home back
        'My wife came to my house.'
    b.  Ik meisje baby.                          Mahmut, I-6
        I girl baby
        'My girlfriend is expecting.'
        Vijf maanden baby terug.
        five months baby back
        'In five months the baby will come.'

In (4a), Mahmut tells about his wedding. Before the wedding, he used to go to
his girlfriend's house, far away, but after the wedding his wife came to his
house. In this early stage of acquisition, Mahmut had two options for expressing
the possessive pronoun first person singular: the target variant mijn ('my') +
possessee and the learner variant ik ('I') + possessee. He used them both in this
sentence. The particle terug expresses the action of coming. In Dutch, the
particle terug is the separable part of the compound verb terugkomen ('to come
back/to return'). In matrix clauses, the finite part of the verb appears in (more
precisely: is moved to) the second position in the sentence, while the particle
remains at the end. This is probably the cause of the misinterpretation by the
Turkish learner, who expects the finite verb in end position, where he finds
terug (instead of geliyor 'comes'). In Dutch, as in English, it is even possible to
leave out the past participle gekomen in a perfect tense and to say, as the result
of the action terugkomen, ik ben terug ('I am back'), which may be interpreted
by this learner as: 'I have come'.
In (4b) the directional element is not so evident. Mahmut took part in a
role-playing task. He was asked to explain to a housing officer why he needed a
house. The informant was given the information that his girlfriend was pregnant.
The introductory sentence (ik meisje baby) cannot mean anything other
than that his girlfriend is expecting. In Dutch, the internal argument ('a baby')
of the verb verwachten ('to be expecting') must be expressed overtly. Note that we
are dealing here with two arguments that are strongly suggestive of the predicate
verwachten ('to expect'), so that it makes sense to assume a predicate with an
empty phonological matrix, viz., verwachten. The second sentence of (4b)
confused the housing officer. He understood that the baby was already born
and that he or she would come back after a stay in Turkey or somewhere else.
The cause of this misunderstanding is that Mahmut maps the L2 phonological
matrix /terug/ onto an L1 feature bundle linked to the verb /geliyor/. In that
way, the particle terug can act as a verb and has the same argument structure as
the verb 'to come'.5 This is represented in Table 4.
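The mapping described here can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the class `LexicalItem` and the function `learner_variant` are hypothetical names introduced for this purpose, the semantics row of Table 4 is left out, and the minus sign of the categorial features is written as a plain hyphen. The feature values are those listed for /geliyor/ in Table 4.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LexicalItem:
    """A lexical item as a sound form paired with grammatical properties."""
    phonological_matrix: str      # surface form; "" would stand for an empty matrix
    formal_features: frozenset    # categorial value, person, number, tense
    subcat_frame: str             # subcategorization frame


def learner_variant(l1_item: LexicalItem, l2_form: str) -> LexicalItem:
    """Pair an L2 sound form with the unchanged L1 feature bundle."""
    return replace(l1_item, phonological_matrix=l2_form)


# Values taken from the Turkish column of Table 4.
geliyor = LexicalItem(
    phonological_matrix="geliyor",
    formal_features=frozenset({"-N", "+V", "3 person", "singular", "present tense"}),
    subcat_frame="[DP]",
)

# The learner keeps the L1 bundle and only swaps in the L2 form /terug/.
terug_variant = learner_variant(geliyor, "terug")
print(terug_variant.phonological_matrix)                          # terug
print(terug_variant.formal_features == geliyor.formal_features)  # True
```

The point of the sketch is that only one field changes: the learner variant is verbal in every formal respect even though its sound form is that of a Dutch particle.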
The examples above have shown that beginning L2 learners may have
difficulty in discerning the grammatical properties of L2 content words, which

L1 features in the L2 output

Table 4. Lexical items of the concept 'comes'; deviance from the target language is in italics

                          L1 item            Learner            L2 item
                          (Turkish)          variant            (Dutch)
phonological matrix       /geliyor/          /terug/            /komt/
semantics                 'comes'            'back'             'comes'
L1 formal features        [−N,+V]            [−N,+V]            [−N,+V]
                          [3 person]         [3 person]         [3 person]
                          [singular]         [singular]         [singular]
                          [present tense]    [present tense]    [present tense]
subcategorization frame   [DP]               [DP]               [DP]

are lexical elements with a relatively high salience in the environmental input.
For the understanding of functional elements this must be still harder.

5. Functional elements in the nominal domain


As functional elements such as determiners and affixes have little semantic load
and are often unstressed, they are less salient to L2 learners than content words.
So, L2 learners get less opportunity to perceive them in the L2 input and to
comprehend them. This is even more so for the formal features. Therefore, it is
not surprising that grammatical properties of L1 functional elements persist for
a longer time than those of lexical elements. Consider for this purpose two
mismatches of formal features at the level of the categorial value and see how
these differences become manifest in the learners' L2 expressions. The first
example relates to the realisation of genitive case. The second example is a more
indirect consequence of the possibility of dropping the subject (pro-drop) in
Turkish and the impossibility of doing so in Dutch.
5.1 The genitive case
Speakers of Turkish are used to realizing case marking overtly by a rich
morphological system of suffixes on the head noun. In Dutch, case marking is
generally done covertly, except for pronouns and the marking of genitive case
(cf. Corver in this volume). For a good understanding of the learners' data, it is
necessary to go into a bit more detail regarding the nominal possessive
constructions in Dutch and in Turkish. A Turkish possessive construction like the
one in (5) features agreement in person and number between the possessed
(pro)noun and the possessor. This agreement is manifested by the agreement
suffix -sı.
(5) Ayse-nin / o-nun      araba-sı.
    Ayse-gen / (s)he-gen  car-3sg
    'Ayse's/her car.'

The possessor noun phrase (DP) carries a genitive case feature (-nin or -nun,
the choice being determined by vowel harmony) and the possessed noun a genitive
case agreement feature (-sı), as represented in Table 5. This possessive relationship
is characterized by agreement, overtly realized both on possessor and possessee.
In line with Chomsky (1986, 1995), it is assumed that the genitive case agreement
feature of the possessee must check off the genitive case feature associated
with the possessor DP. The required structural configuration is AgrP, where the
possessor DP and possessee N enter into a Spec-Head configuration because of the
strength properties of the Agr head (cf. Van de Craats et al. 2000 for details).
Table 5. Formal feature complex of two lexical items and the functional head Agr in the
Turkish possessive construction

                      Possessor      Possessee                      Agr
phonological matrix   /Ayse-nin/     /araba-sı/                     /Ø/
semantics             'of Ayse'      'car'
formal features       [+N,−V]        [+N,−V]                        [+N,−V,+D] strong
                      [3 person]     [3 person]                     [+N,−V] weak
                      [singular]     [singular]
                      [genitive]     [+genitive case assignment]

Unlike Turkish, Dutch has several ways of expressing a possessive relationship.
Two of them look, superficially, like Turkish (6a, 6b); the other (analytic)
construction (7) does not.
(6) a. Ayse 's auto                 (Saxon genitive)
       Ayse 's car
       'Ayse's car'
    b. Ayse d'r auto                (Doubling possessive)
       Ayse her car
       'Ayse's car'
(7) de auto van Ayse                (Analytic construction)
    the car of Ayse
    'Ayse's car'

Although the possessive construction in (6a) is called the Saxon genitive
construction, the -s should not be interpreted as an inflectional case suffix
because, with a coordinated possessor, the possessive marker -s is phonologically
attached to the rightmost noun, as in (8). If it were a normal case suffix
realized on the head noun, it would be expressed on both nominal heads of a
coordinated construction (cf. Corver 1990). We assume that the Saxon genitive
is a clitic in the Agr head.6
(8) [Ayse en Jan]'s kritiek op elkaar.
    'Ayse and John's criticism of each other.'

The major characteristic of the doubling possessive construction (6b) is the
presence of a possessive clitic which doubles the possessor noun phrase and
agrees in phi features with the possessor. It is assumed that the possessive clitic
(d'r for feminine, z'n for masculine) heads the functional phrase AgrP (Miller
1991) and also that the doubled possessor DP originates within the lexical
projection NP and raises overtly to the specifier position of Agr.
In analytic constructions, as in (7), the dummy preposition van ('of') can be
considered to be the morphological realization of the inherent genitive case (cf.
Chomsky 1986). This implies that such genitive case will only be assigned by N
to a noun phrase that receives a thematic role from it. In line with Chomsky's
(1995:285) reinterpretation of inherent case, this genitive case is a [+interpretable]
feature of DP that need not, but could be, checked in a Spec-Head configuration.
This has the consequence that the genitive-bearing DP (i.e., Ayse) can remain
within its base position since its genitive case feature need not be checked.
The three types of possessive constructions differ in their modes of genitive
case licensing and in which element occupies the Agr head.
Comparison of Tables 5 and 6 shows that the important differences between
Turkish and Dutch lie in the mechanism of case licensing and the lexical
material projected in the Agr head, both abstract features and overt functional
elements. But what about the realisation of genitive case? That is another point
of difference and more transparent to learners than the properties discussed
above. In Table 7, the lexical items associated with the concept 'possessor of' or,
to put it in other words, the realisation of genitive case, are compared for
Turkish, Dutch and two learner variants.


Table 6. Formal feature complex of two lexical items and the functional head Agr for
three possessive constructions in Dutch

                  Possessor     Possessee       Agr
                                                Saxon               Doubling              Analytic
                                                genitive            possessive            construct.
phonol. matrix    /Ayse/        /auto/          /-s/                /clitic/ (e.g. z'n)   Agr not projected
semantics         'Ayse'        'car'
formal features   [+N,−V]       [+N,−V]         [+N,−V,+D] strong   [+N,−V,+D] strong
                  [3 person]    [3 person]      [+N,−V] weak        [+N,−V] weak
                  [singular]    [singular]
                  [genitive]    [+gen. case
                                assignment]

Table 7. Lexical items of the concept 'possessor of'; deviances from the target language
are in italics

                          L1 item        Learner        Learner        L2 item
                          (Turkish)      variant 1      variant 2      (Dutch)
phonological matrix       /-(n)In/       /Ø/            /van/          /van/
semantics                 'possessor'    'possessor'    'possessor'    'possessor'
categorial value          [affix gen]    [affix gen]    [affix gen]    preposition [−N,−V]
subcategorization frame   [N]            [N]            [N]            [DP]

L2 learners who map the phonological matrix of the L2, van (of), onto the
grammatical properties of their L1, produce such nominal phrases as in (9)
and (10).
(9) pronominal possessor
    a. [die van] auto                    (Ergün, III-4)
       [that of] car
       'his car'
    b. [onze van] broer                  (child learner, number T25)
       [our of] brother
       'our brother'
(10) full noun possessor
    a. [examen van] tolk                 (Ergün, III-5)
       [exam of] interpreter
       'the interpreter at the exam'
    b. [die jongen van] zijn vader       (Osman, III-2)
       [that boy of] his father
       'that boy's father'
    c. [de auto van] de lichten          (child learner, number T41)
       [the car of] the lights
       'the car lights'

In the examples in (9) and (10), the preposition van ('of') should not be
considered an element of an analytic construction, as in (7), but the genitive
case marker of the preceding possessor noun phrase, as in Turkish (cf. example
(5)). Notice that both child learners and adult learners produce these
constructions based on the L1 bundle of formal features as presented in Table 7.
It is a complicating factor that L2 learners do not show this genitive case
marker from the earliest stage of L2 acquisition. L2 learners, in general, produce
only a few functional elements in the beginning of the acquisition process and
it is questionable whether they perceive functional elements in the L2 input at
all. Nevertheless, they build syntactic constructions like the L2 expressions in
(11), which are not simple two word utterances but can be extensive phrases.
(11) a. tante dochter7                               (Osman, I-5)
        aunt daughter
        'my aunt's daughter (= cousin)'
     b. tante zoon auto                              (Mahmut, I-5)
        aunt son car
        'my aunt's son's car'
     c. mijn vrouw oma andere man dochter            (Mahmut, II-8)
        my wife grandmother other man daughter
        'my wife's grandmother's second husband's daughter'

The learner variants in (11) are almost incomprehensible to native speakers of
Dutch, but make sense to Turkish speakers because, under the view of the
Conservation Hypothesis, they base those expressions on their L1 grammar,
more particularly on the fact that in Turkish all possessor nouns are overtly
marked by a genitive case and that the head noun (the possessee) is marked by
a person and number marker that refers to the preceding possessor, as in (5).
Under this view, an empty (Ø) phonological matrix associated with an L1-based
feature bundle can be assumed for the learner variants in (11). (Compare also
Corver's contribution in this volume about the interpretability of apparently
P-less structures.) This analysis is corroborated by the fact that the empty
phonological matrix is filled in at a later developmental stage by the case marker
van ('of'), and by the fact that only Turkish learners exhibit possessive
constructions overtly marked for genitive case. Moroccan learners do not. One may
object that the productions exemplified in (11) proceed directly from UG. But
why would Turkish learners have access to UG and Moroccan learners not?
In addition to possessive noun phrases in which the possessor precedes van,
as in (9) and (10), L2 learners with a Turkish language background produce
possessive noun phrases where van precedes the possessor, as in (12) and (13).
(12) pronominal possessor
     a. [van hem] moeder                             (Ergün, II-9)
        [of him] mother
        'his mother'
        (target: zijn moeder)
     b. [van ons] die fabriek                        (Ergün, III-4)
        [of us] that factory
        'our factory'
        (target: onze fabriek)
     c. [van ons] buurman heeft goed gemaakt         (child learner, number T41)
        [of our] neighbour has good made
        'our neighbour has repaired it'
        (target: onze buurman)
(13) full noun possessor
     a. [van Ergün] auto                             (Abdullah, II-7)
        [of Ergün] car
        'Ergün's car'
        (target: Ergün z'n/'s auto)
     b. [van schoenen] die touwtje8                  (Osman, III-6)
        [of shoes] that rope
        'the shoelaces'
        (target: de veters van de schoenen; de schoenveters)

The examples in (12) and (13) exemplify a new stage of lexical development,
viz., the stage in which van is no longer a genitive case suffix, but an adposition
preceding the possessor, like van in analytic constructions as in (7). A second
alteration vis-à-vis the former stage is that van precedes a full DP instead of
being a suffix on a noun.
In addition to these examples, there are some rare learner variants of the
Saxon genitive and the doubling possessive constructions in which L1 and L2
elements are mixed. These are given in (14). In (14a), we see a target doubling
possessive construction and in (14b) a Saxon genitive construction in which the
genitive case is overtly realized together with a functional element in the Agr
head, i.e., 's and zijn (instead of the target form z'n).
(14) a. die [jongen van] zijn vader                  (Osman, III-2)
        that [boy of] his father
        'the boy's father'
        (target: de jongen z'n vader)
     b. [van Ömers] huis                             (child learner, number T29)
        [of Ömer's] house
        'Ömer's house'
        (target: Ömers huis)

Likewise, the complete development of any lexical item can be represented from
the L2-initial state, viz. the final state of the L1 lexicon, to the state where the
lexical item of the target language is attained. The formal features required for
building target syntactic objects, viz. those dominated by the functional head
Agr and those of possessor and possessee, were already presented in Table 6.
Table 8. Developmental stages of the L2 lexical item /van/ in constructions with full DP
possessors; changes with regard to the previous stage are in italics

                          Stage 1        Stage 2             Stage 3             Target
L1 agreement pattern
phonological matrix       /Ø/            /van/               /van/               d.n.a.
semantics                 'possessor'    'possessor'         'possessor'
categorial value          [affix gen]    [affix gen]         [adposition]
subcategorization frame   [N]            [N]                 [DP possor]
example                   Ayse-Ø auto    [Ayse-van] auto     [van Ayse] auto

L2 construct pattern
phonological matrix                      /van/               /van/               d.n.a.
semantics                                'possessor'         'possessor'
categorial value                         [affix gen]         [adposition]
subcategorization frame                  [N]                 [DP possor]
example                                  Ayse-van d'r auto   van Ayse's auto     Ayse d'r auto
                                                                                 Ayse's auto
L2 analytic pattern
phonological matrix                      /van/               /van/               /van/
semantics                                'possessor'         'possessor'         'possessor'
categorial value                         [preposition]       [preposition]       [preposition]
                                         [−N,−V]             [−N,−V]             [−N,−V]
subcategorization frame                  [DP possee]         [DP possee]         [DP possee]
example                                  auto [van Ayse]     auto [van Ayse]     auto [van Ayse]


Table 9.Distribution of variants of the lexical possessive item /van/ produced by four
Turkish learners in constructions with full DP possessors
Mahmut
Cycles

Ergün

Abdullah

Osman

II

III

II

III

II

III

II

III

L1 agreement pattern
Possessor
63
Possessor affix

Possor -adposition 

55



42



15



11
2


7
2
3

7

4


1
9

12

2

3
1
1


2
1

L2 construct pattern
Possor +van + clitic
Possor + clitic L2




























2


L2 analytic construct.
Van-preposition


2

8

23

12

2

14

25

In this way, lexical item learning in all aspects can be made visible. The
vocabulary-internal development of the L2 lexical item /van/ can be represented as in
Table 8 for constructions with full DP possessors. Three developmental stages
are distinguished, each characterized by one or more changes
compared to the previous stage. The order of the stages is determined by the
first emergence of the changed value. In stage 1, the possessive L2 expressions
are based on an L1 pattern and the genitive case is not overtly realized. Stage 2 is
characterized by the new (analytic) L2 pattern in which the genitive case is realized
by van, both as a preposition and as an affix. The new construction is likely to
have an impact on the agreement pattern from the L1 because learners interpret
van correctly as the morphological realization of genitive case, but they mis-assess
its categorial value. In stage 3, van is used as a preposition and as an adposition
as well.
In Table 9, the distribution of these lexical stages over the four Turkish
informants is given. The time course represents three successive cycles, each
representing ten months of data collection. The informants are arranged from
the slowest learner on the left of the table to the faster learners on the right.
From the combination of the data in Tables 8 and 9, it can be inferred
that Mahmut just entered stage 2, that Ergün attained this stage somewhat
earlier, and that Abdullah and Osman have reached stage 3 at the beginning of
the data collection. Osman is the only informant who produced a few doubling
possessive constructions. Table 9 also shows a considerable overlap between the
stages. In cycle III, Osman and Abdullah have abandoned stage 1; Mahmut and
Ergün have not yet done so.
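The ordering criterion used here, first emergence of a changed value, can be sketched as a small procedure. This is only an illustration: the function names are hypothetical and the counts in `toy_counts` are invented for the example; they are not the figures of Table 9.

```python
def first_emergence(counts_by_cycle):
    """Map each variant to the first cycle in which it is attested at all."""
    first = {}
    for cycle, counts in counts_by_cycle:          # cycles in chronological order
        for variant, n in counts.items():
            if n > 0 and variant not in first:
                first[variant] = cycle
    return first


def stage_order(counts_by_cycle):
    """Order variants by the cycle of their first emergence."""
    first = first_emergence(counts_by_cycle)
    cycle_index = {cycle: i for i, (cycle, _) in enumerate(counts_by_cycle)}
    return sorted(first, key=lambda variant: cycle_index[first[variant]])


# Invented token counts for one hypothetical learner (NOT taken from Table 9).
toy_counts = [
    ("I",   {"zero-marked possessor": 12, "van as affix": 0, "van as adposition": 0}),
    ("II",  {"zero-marked possessor": 9,  "van as affix": 3, "van as adposition": 0}),
    ("III", {"zero-marked possessor": 2,  "van as affix": 4, "van as adposition": 5}),
]

print(stage_order(toy_counts))
# ['zero-marked possessor', 'van as affix', 'van as adposition']
```

Note that the procedure looks only at when a variant first appears, not at when an older variant disappears, which is why overlapping stages, as observed in Table 9, do not disturb the ordering.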


5.2 The personal pronoun in an L1 pro-drop system


Turkish is a pro-drop language (cf. Kornfilt 1997). This implies that the
personal pronoun that functions as the subject is dropped if the subject has
already been introduced in or is known from the discourse. To be more precise,
the subject is assumed to be a lexically empty pronominal (pro) with the same
person and number features as its lexical equivalent. A full pronoun is used
either for the reintroduction of a person or thing, or for emphasizing the
subject, e.g. in (15b). In the latter case, the subject would be stressed in Dutch.
(15) a. (pro) geli-yorum                  (non-focused)
              come-pres.1sg
     b. ben          geli-yorum           (focused)
        I (stressed) come-pres.1sg
        'I come'

The same holds for the pronoun in a possessive noun phrase like his car. The
presence of a lexical pronoun yields a reading in which the possessor is emphasized, e.g., in (16b), his car as opposed to your car. In neutral, non-focused
readings, the pronoun is not lexically expressed but contains number and
person features, as in (16a).
(16) a. (pro) araba-sı                    (non-focused)
              car-3sg
        'his car'
     b. o-nun       araba-sı              (focused)
        he/that-gen car-3sg
        'his car'

So, in Turkish two different phonological matrices, /ben/ and /Ø/, can be linked
to the same formal feature bundle, as illustrated in columns 2 and 3 of
Table 10. Notice here that empty phonological matrices are not restricted to
learner variants as in (11) and in Tables 7 and 8. Dutch does not have the
possibility of pro-drop and therefore can only use a lexically filled pronominal
ik ('I'). According to recent analyses (see, among others, Corver and Delfitto
1999) Dutch pronouns should be considered transitive determiners (D). It is
assumed that their complement is a pro which stands for the phrasal projection
NP and not just for the head noun: [DP D [NP pro]]. The name 'prodeterminer'
seems to be more appropriate for these determiners than 'pronoun'. Turkish
pronouns, however, seem to be pronominals in the true sense of this word; i.e.
they seem to be of the lexical category type N rather than of the functional

Table 10. Lexical items of the concept 'I' in Turkish and Dutch

                  L1 item (Turkish)              L2 item (Dutch)
phonol. matrix    /ben/          /Ø/             /ik/ stressed   /k/ /ik/ unstressed
semantics         'I' focused    'I'             'I' focused     'I'
formal features   [+N,−V]        [+N,−V]         [+D]            [+D]
                  [1 person]     [1 person]      [1 person]      [1 person]
                  [singular]     [singular]      [singular]      [singular]
                  [nominative]   [nominative]    [nominative]    [nominative]

categorial type D (see Kornfilt 1997: 300). Support for this comes, for example,
from their behaviour with respect to various morphological rules; like common
nouns, pronominals in Turkish function as stems to which case and plural
morphology can be attached. The two types of pronominals in Turkish and
Dutch are compared in Table 10.
The differences between the two pronominal systems consist in (i) the
categorial value of the pronoun, and (ii) the use of overt pronouns as a means
of focalising. Given the Conservation Hypothesis, Turkish L2 learners match a
phonological matrix of the L2 with an L1 feature bundle. The question arises
which combination they will make. As can be inferred from the L2 data,
Turkish learners link the non-focused item (an empty pro), which is more
frequently used in Turkish than the focused one and can be considered the
default form, to the Dutch full pronoun. That is not surprising because, in
initial stages, adult learners turn out to perceive full pronouns before clitic
pronominals (cf. Broeder 1991), although the clitic pronouns occur more
frequently in the spoken L2 input. However, the question remains for L2
learners how to express the focused pronoun in their L2 Dutch. Assuming that
Dutch pronominals are of the categorial type N as well, their projections may
contain a slot for demonstrative determiners. So these L2 learners may attach a
demonstrative die or deze ('that') to the pronoun in emphatic contexts, as in
(17a), in which Mahmut explains why his daughter must learn Turkish, and in
(17b), in which Ergün talks about his brother.
(17) a. [Die ik] Hollands dochter Hollands.                         (Mahmut, I-4)
        [that I] Dutch daughter Dutch
        'When I speak Dutch, my daughter speaks Dutch.'
        [Die mijn] mama [die ik] Turks praten mijn dochter Hollands praten.
        [that my] mother [that I] Turkish speak my daughter Dutch speak
        'When my mother and I speak Turkish, my daughter speaks Dutch.'
     b. [die van mijn] broer                                        (Ergün, I-3)
        [that of my] brother
        'my (emphatic) brother'

Table 11 represents schematically how L1 knowledge is linked to L2 knowledge
in the learner's lexicon.
Table 11. Learner variants of the lexical item /ik/ compared to the target language;
deviances from the target language are in italics

                  Learner         L2 item           Learner         L2 item
                  variant 1       (Dutch)           variant 2       (Dutch)
phonol. matrix    /die ik/        /ik/ (stressed)   /ik/            /k/ /ik/ unstressed
semantics         'I' focused     'I' (focused)     'I'             'I'
formal features   [+N,−V]         [+D]              [+N,−V]         [+D]
                  [1 person]      [1 person]        [1 person]      [1 person]
                  [singular]      [singular]        [singular]      [singular]
                  [nominative]    [nominative]      [nominative]    [nominative]

6. Functional elements in the clausal domain


An interesting example of how small differences in feature bundles play a role
in L2 acquisition is the acquisition of the possessive verb hebben ('to have'). A
number of studies (e.g. Benveniste 1966, Freeze 1992, Moro 1997) have pointed
out that possessive have-constructions derive from an underlying locative
construction, and differ only in some respects from possessive constructions
without have. Under this view, the form heeft ('has') is considered a form of 'to
be' in which a locative preposition has been incorporated. In that way the Dutch
form heeft is the spell-out of the features Tense (present) (T) + Agreement (3sg)
(Agr) + Locative (LOC). Basically, the possessor is considered to be the
complement of the locative preposition (P); this PP is moved to the subject position (in
SpecAgrsP) after the incorporation of the locative preposition. The derivation is
as follows in (18).9


(18) a. dat Paul een motor heeft.10
        that Paul a motorcycle has
        'that Paul has a motorcycle.'
        [CP dat [AgrP Pauli [[TP ti [SC een motor [PP Pk ti]] [T+Pk]j]] Agr+[T+P]j (= heeft)]]
     b. Paul heeft een motor.
        Paul has a motorcycle
        'Paul has a motorcycle.'

Some languages, like French, also show a non-incorporated variant (ce livre est
à moi 'this book is to me'). Moroccan Arabic, however, does not have a variant in
which the locative preposition is incorporated in the be-copula, viz. it has no
equivalent of the verb 'to have'. Moroccan Arabic uses only a locative sentence with the
PP at+clitic (i.e. Eend-u 'at+him'), both for a present tense sentence (19a) and for
a past tense sentence (19b) (cf. Harrell 1970).
(19) a. Abder, Eend-u dar kbira.                               (present tense)
        Abder at-him house-f big-f
        'Abder has a big house.'
        [CP Abder, [CP [AgrP [Agr [Eend-u]i [SC dar kbira [PP ti]]]]]]
     b. Abder, kanet Eend-u dar kbira.                         (past tense)
        Abder cop.past.3sg.f at-him house-f big-f
        'Abder had a big house.'
        [CP Abder, [CP [AgrP [Agr kanetk [TP [Eend-u]i [T tk [SC dar kbira ti]]]]]]]

Notice that, in (19a), a copular form is lacking in the present tense and the
preposition Eend-u occupies the position of the verb (in Agr), whereas in (19b),
the position of the verb in the Agr head is taken by the past tense copula kanet and
Eend-u is in SpecTP. This may have interesting consequences for L2 learners
who are in search of equivalents of their L1 grammar in the environmental
input of the L2. Let us first compare the concept 'has' in the
source and the target languages (see Table 12).
Just as was the case for the genitive case marker in Section 5.1, L2 learners
start with an empty phonological matrix linked to the L1 feature bundle, as can
be seen in the following L2 expressions produced by adult Moroccan learners.
In (20a), Fatima is asked about her work in Morocco: she had a small shop with
some knitting machines. In (20b), Mohamed describes his situation in an
interview with a housing official.


Table 12. Lexical items of the concept 'have' compared for source and target languages

                          L1 item              L2 item
                          (Moroccan Arabic)    (Dutch)
phonological matrix       /Eend-/              /heeft/
semantics                 'with/at'            'has'
categorial value          [−N,−V]              [Agr +T +P]
                                               3sg +PRES +LOC
subcategorization frame   [DP clitic]          [DP]

(20) a. pronominal possessor
        Ik klein winkel. (= ik klein winkel)                        (Fatima, I-3)
        I small shop
        'I had a small shop.'
     b. nominal possessor
        Mijn vrouw ook klein huis. (= mijn vrouw ook klein huis)    (Mohamed, I-6)
        my wife also small house
        'My wife has also a small house.'

Slow learners often show a stage in which they produce a preposition between
the possessor and the possessee, e.g. in (21). The preposition appears in the
same position where it would appear in the L1, between the strong pronoun
(not the clitic) or the full noun phrase in the dislocated position at the
beginning of the sentence (cf. the position of Abder in (19a, b)) and the second noun
phrase. In (21a), Fatima talks again about her work and, in (21b), about her
relatives in a photograph: three of them are half sisters and brothers.
(21) a. pronominal possessor
        Ik met klein winkel.                                        (Fatima, I-3)
        I with small shop
        'I had a small shop.'
        [CP ik [CP [AgrP metj [SC klein winkel [PP tj]]]]]
     b. nominal possessor
        Fatiha Mustafa Khiliye met andere moeder.                   (Fatima, II-4)
        Fatiha Mustafa Khiliye with other mother
        'Fatiha, Mustafa and Khiliye have another mother.'

In more advanced stages of L2 development, learners occasionally produce
sentences in which the locative character of the possessive clause becomes
manifest, e.g. in (22a) and (22b), in which the entire PP has been moved to
SpecAgrP, or (22c), in which the locative preposition is overtly spelled out and
incorporated in heeft as well.
(22) a. Met kind een jaar.11                                        (Fatima, II-7)
        with child one year
        'The child is one year old.'
     b. Bij hem kief.                                               (HassanK, III-2)
        at him hashish
        'He has hashish.'
     c. Met Soumiya heeft veel pijn.                                (HassanM, III-5)
        with Soumiya has much pain
        'Soumiya suffers very much.'

Except for Fatima, such locative constructions instead of the verb have are rare
in the data produced by the Moroccan informants. When their have-constructions are considered more closely, however, we become more sceptical about
the character of the have-forms. To start with Fatima, she uses the form heeft
(3sg) not only for all person roles, but she alternatively uses met and heeft with
the same meaning for more than 13 months. Another informant produces
instances of heeft in which it can be a synonym for met (23a). The sentence in
(23b) is even a direct copy of Moroccan Arabic in which heeft is followed by a
clitic as if it were a locative preposition.
(23) a. Die heeft geel haar mag binnen.                             (HassanK, II-2)
        that has yellow hair may inside
        'The boy with blond hair is allowed to go inside.'
     b. Die meisje, heef-ze een oom.                                (HassanK, II-2)
        that girl has-she an uncle
        'That girl has an uncle.'

The situation becomes crucial when this informant wants to express a possessive clause in the past tense. Recall from (19b) that, in Moroccan Arabic, the
past tense of a have-clause is expressed by a be-copula in the past tense
followed by the locative preposition and see what HassanK produced in (24).
(24) a. Die was heeft (-pro) 30 jaar.                               (HassanK, II-3)
        he was-cop.past.3sg has [=Eend+3sg] 30 years
        'He was 30 years.'
     b. Die meisje was nooit heeft (-pro) verkering.                (II-4)
        that girl was-cop.past.3sg never has [=Eend+3sg] relationship
        'That girl was never in a relationship.'
     c. Dan was heeft (-pro) een huis.                              (HassanK, II-9)
        then was-cop.past.3sg has [=Eend+3sg] a house
        'Then he had a house.'

From the examples above we infer that HassanK and Fatima base their syntactic
have-structures on the feature bundle of the Moroccan Arabic preposition
Eend ('at', 'with'). In this developmental stage, they still match the L1 feature
bundle with an L2 phonological matrix. Heeft is in fact a prepositional heeft.
The other two informants do not produce past tense forms of the verb hebben;
that is already sufficient reason to be sceptical about the real identity of their
have-forms. The question can be raised how and when we can be sure that
heeft is a real copula and no longer a prepositional form. It is not easy to decide
when we are dealing with a copular verb, but it can safely be claimed that it is
no longer a preposition when past tense forms of hebben are expressed in one
verbal form (had 'had'). HassanK succeeds in producing several past tense
forms and Mohamed does so only once (see Table 14). A second indication is that
the verb hebben has extended from a possessive copula to an auxiliary of the
perfect tense, e.g. hij heeft gezien ('he has seen'). In this light, it is relevant to
note that all Moroccan learners produce have-copulas six months or more
before they start using the auxiliary have (see Van de Craats 2000).
Finally, the development of the lexical item /heeft/ is represented in Table 13,
from the L2-initial state (i.e. the empty phonological matrix with the L1
feature bundle) via various developmental stages to the state in which the item
has attained the complete feature constellation it has in the target language.
Table 13 focuses on the properties of the lexical item heeft, which is the
spell-out of the features [agreement], [tense] and [−N,−V]. This already suggests a
process of adjunction of several functional heads. In Table 13, we abstract away
from the formal features dominated by T and Agr triggering the movement of
P(P) (see Van de Craats et al. 2002: 156 for the derivation).
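The stage-by-stage development summarized in Table 13 can be read as a sequence of small differences between successive lexical entries. The sketch below makes that reading explicit; it is an illustration only. The names `STAGES`, `changed_properties` and `stage_changes` are hypothetical, the property values are copied from Table 13, "Ø" stands for the empty phonological matrix, and the minus sign is written as a plain hyphen.

```python
# Property values copied from Table 13 (one dictionary per stage).
STAGES = [
    ("Stage 1", {"phonological matrix": "Ø",
                 "semantics": "with, at",
                 "categorial value": "[-N,-V]",
                 "subcategorization frame": "[DP clitic]"}),
    ("Stage 2", {"phonological matrix": "met, bij",
                 "semantics": "with, at",
                 "categorial value": "[-N,-V]",
                 "subcategorization frame": "[DP clitic]"}),
    ("Stage 3", {"phonological matrix": "heeft",
                 "semantics": "with, at",
                 "categorial value": "[-N,-V]",
                 "subcategorization frame": "[DP clitic]"}),
    ("Target state", {"phonological matrix": "heeft",
                      "semantics": "has",
                      "categorial value": "[+Agr,+T,+P]",
                      "subcategorization frame": "[DP]"}),
]


def changed_properties(previous: dict, current: dict) -> list:
    """List the properties whose value differs from the previous stage."""
    return [name for name in current if current[name] != previous[name]]


def stage_changes(stages):
    """Compare every stage with the one immediately before it."""
    return {
        f"{prev_name} -> {curr_name}": changed_properties(prev, curr)
        for (prev_name, prev), (curr_name, curr) in zip(stages, stages[1:])
    }


for step, changes in stage_changes(STAGES).items():
    print(step, changes)
```

Run on the values of Table 13, the comparison shows that the first two transitions affect only the phonological matrix, whereas the step to the target state changes the semantics, the categorial value and the subcategorization frame.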
Table 14 shows how the stages are distributed over the data of the four
Moroccan learners. Slow learners exhibit more tokens of the developmental
stages than fast learners. The order of the stages is determined again by the first
emergence of a changed value. For the slow learners, a considerable overlap of
the stages can be observed. The overlap is smaller for learners like Mohamed
and HassanM. At the end of the data collection, stage 3 is attained by all four
learners; only two of them can form past tense forms according to target
language standards, however.


Table 13. Stages in the development of the L2 lexical item /heeft/; differences with
regard to the previous stage are in italics

                          Stage 1         Stage 2          Stage 3                 Target state
phon. matrix              /Ø/             /met, bij/       /heeft/                 /heeft/
semantics                 'with, at'      'with, at'       'with, at'              'has'
categorial value          [−N,−V]         [−N,−V]          [−N,−V]                 [+Agr,+T,+P]
subcategorization frame   [DP clitic]     [DP clitic]      [DP clitic]             [DP]
example                   Abder Ø boek    Abder met boek   Abder was heeft boek    Abder heeft boek

Table 14.Distribution of variants of the lexical possessive item /hebben/ produced by


four Moroccan learners over developmental stages
Cycles

Fatima
I

Mohamed

II

III

II

L1 construction
Stage 1
48
Stage 2prep.
8
Stage 3prep.
10
Stage 3heeft

62
2
26

21

76

17
1
44

Target state
Hebben present
Hebben past




13








HassanK
III

II

10 1
 
 

1
1
77

138 150
1 




HassanM
III

II

III

7 4
1 1
173 1

7

52

1 
1 
 1

 82
 2

15


143 137
 

7. Conclusions
From the examples discussed above, we can infer that, in general, L2 learners
are more aware of the fact that words differ from language to language, i.e., that
phonological matrices differ, than of the fact that other
lexical properties differ. Adult L2 learners are inclined to map or attach a new
phonological matrix from the L2 onto a bundle of semantic and formal features
from the L1. Only if those feature bundles differ can we gain insight into how
L2 learners proceed in their learning of lexical items: they start by assuming an
L1 feature constellation and they gradually change the features of the bundle
one by one, as we saw in Tables 8 and 13 for the acquisition of van and heeft. In
this way, the L2 learner's output becomes more and more target-like; not only
the surface form (= the phonological matrix) alters, but also the features.
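
The idea of a lexical item as a revisable bundle of features can be made concrete with a small sketch. The Python below is purely illustrative and not part of the original analysis: the class `LexicalItem`, its field names, and the helper `changed_features` are assumptions introduced here, the values are transcribed from Table 13, and the empty phonological matrix of Stage 1 is written as an empty string.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class LexicalItem:
    """One developmental stage of a lexical item (field names are hypothetical)."""
    phon_matrix: str
    semantics: str
    categorial_value: str
    subcat_frame: str


# Values transcribed from Table 13; "" stands for the empty phonological matrix.
STAGES = {
    "Stage 1": LexicalItem("", "with, at", "[-N,-V]", "[DP clitic]"),
    "Stage 2": LexicalItem("met, bij", "with, at", "[-N,-V]", "[DP clitic]"),
    "Stage 3": LexicalItem("heeft", "with, at", "[-N,-V]", "[DP clitic]"),
    "Target state": LexicalItem("heeft", "has", "[+Agr,+T,+P]", "[DP]"),
}


def changed_features(earlier: LexicalItem, later: LexicalItem) -> list[str]:
    """Return the names of the features whose values differ between two stages."""
    return [
        f.name
        for f in fields(earlier)
        if getattr(earlier, f.name) != getattr(later, f.name)
    ]


if __name__ == "__main__":
    names = list(STAGES)
    for prev, nxt in zip(names, names[1:]):
        print(f"{prev} -> {nxt}: {changed_features(STAGES[prev], STAGES[nxt])}")
```

Comparing adjacent stages in this way simply lists which parts of the bundle have been revised; it says nothing about how long a learner remains at a stage or how stages overlap in the data.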
In this light, it can be explained why functional elements are more difficult
to acquire than lexical elements (i.e. content words). It is not only the case that
they are less salient in the environmental input, but they also differ more in the
structure of their feature bundle. Therefore, it takes more time to discover each
of the composing features. The formal features, however, are of crucial importance for the development of the L2 syntax: they are the input for the computational system and, hence, decisive for the result of the derivation: the syntactic
objects. They are the interface itself, so to speak.12 Until a Moroccan learner of
Dutch, for instance, has discovered each of the formal features of heeft
('has'), he cannot produce the verb with subject-verb agreement,
he cannot form the past tense of that verb, nor the perfect aspect by means of
the auxiliary hebben. This shows the direct link between lexicon and syntax.
Why is there progress? The syntactic objects (i.e. the product of the computational system and the selection of lexical elements, including formal features)
in interlanguages seem to provide L2 learners with a reason to change their L2
output, because it is not sufficiently understood by native speakers, is incomplete
(recall, for instance, the empty phonological matrices in the output of beginners
in example (11)), etc. Therefore, L2 learners are forced to constantly reanalyse
their L2 output and to change the underlying formal features and/or the
parametric values. This process of restructuring seems to occur in a continuous
interplay between syntax and formal features, which are the seeds of syntax and a part
of the lexicon as well.

Notes
1. In Chomsky (1994) it is argued that only the idiosyncratic formal features are part of the
lexical item. Optional formal features are added to the lexical item when it is selected from
the lexicon. Here, we will abstract away from that distinction.
2. Except for the case in which L1 and L2 are so similar that an L2 learner decides to
use an L1 lexical item (with the L1 phonological matrix) directly in the L2 output.
3. It is certainly not the case that learners keep on using all meaning aspects of a lexical item
in their L2: the more transparent and concrete a specific semantic aspect is, the earlier it is
used in L2 expressions (cf. the 17 different aspects of breken ('to break'), the acceptability of
which Kellerman (1979) asked his Dutch subjects to judge).
4. A reviewer proposed that betalen geven would be a serial verb construction, or geven 'to
give' a light verb; geven can function as a light verb in Dutch. Although this account is not
quite impossible, the reviewer is arguing too much from a target language perspective.
Moreover, there is no light verb geven in (2b), and the example in (3b) makes clear that it is
not the verb betalen that is meant.


5. A reviewer pointed out that Adjmian (1983) had already found that L2 learners tend to transfer
lexical patterns from their L1 to their L2, and assume that verbs take the same kinds of
subject and object in their L2 as they do in their L1. Mapping an L2 phonological matrix
onto L1 lexical properties goes even further.
6. Following insights from Longobardi (1996), we take the position that the Saxon genitive
and the doubling construction are in essence hidden construct noun phrases, because they
share certain properties with the Construct-State construction (as in Arabic and Hebrew).
The Saxon genitive and the possessive doubling have the same distribution as a definite
article, viz. they block the presence of a definite article in D⁰.
7. The English translation of these examples suggests that the word order is not uncommon,
but possessor DPs in the Saxon genitive are restricted to proper names, and to human
beings in the doubling possessive construction. Moreover, recursion is odd in those
constructions. A native speaker of Dutch would only use the analytic construction for
recursive patterns, due to processing limitations.
(i) de dochter van mijn tante
(ii) de auto van de zoon van mijn tante
(iii) de dochter van de tweede man van de oma van mijn vrouw
8. Note that in Dutch, the order Possessor-Possessee is restricted to human possessors; the
Saxon Genitive (e.g. Jans fiets 'John's bike') is even restricted to proper names and nouns
equivalent to proper names, e.g. tante 'aunt'. This points all the more to the L1 grammar in
example (11b).
9. See also Van de Craats (2000) and Van de Craats, Corver and Van Hout (2002) for details
on this construction in Dutch and Moroccan Arabic.
10. In this example, a subordinate clause is given first because it shows the basic position of
the verb in Dutch (SOV language).
11. In Moroccan Arabic, one's age is also expressed by a have-construction.
12. Although the lexicon plays an important role in this view on adult L2 acquisition, it is not
identical to the Lexical Learning Hypothesis (e.g. Clahsen, Eisenbeiss and Penke 1996) nor
to other recent views on L1 acquisition (e.g. Radford 2000, Roeper 1996) in which the
acquisition of (L1) syntax is seen as gradual structure building.

References
Adjmian, C. 1976. On the nature of interlanguage systems. Language Learning 16: 297–320.
Adjmian, C. 1983. The transferability of lexical properties. In Language transfer in
language learning, S. Gass & L. Selinker (eds), 250–268. Rowley, MA: Newbury House.
Benveniste, E. 1966. Être et avoir dans leurs fonctions linguistiques. Bulletin de la Société
de Linguistique de Paris 55 (1): 113–134. Reprinted in Problèmes de linguistique
générale. Paris: Gallimard.
Broeder, P. 1991. Talking about people. A multiple case study on adult second language
acquisition. Amsterdam/Lisse: Swets & Zeitlinger.
Chomsky, N. 1986. Barriers. Cambridge, Massachusetts: MIT Press.


L1 features in the L2 output

Chomsky, N. 1994. Bare phrase structure. In MIT occasional papers in linguistics. Cambridge, Massachusetts: MIT Press.
Chomsky, N. 1995. The minimalist program. Cambridge, Massachusetts: MIT Press.
Clahsen, H., Eisenbeiss, S. and Penke, M. 1996. Lexical learning in early syntactic development. In Generative perspectives on language acquisition, H. Clahsen (ed.), 129–159.
Amsterdam/Philadelphia: John Benjamins.
Corver, N. 1990. The syntax of left branch extractions. Doctoral dissertation, Tilburg University.
Corver, N. and Delfitto, D. 1999. On the nature of pronoun movement. In Clitics in the
Language of Europe, H. van Riemsdijk (ed.), 799–861. Berlin: Mouton & de Gruyter.
Flynn, S. 1986. A parameter-setting model of second language acquisition. Dordrecht: Reidel.
Freeze, R. 1992. Existentials and other locatives. Language 68: 553–595.
Harrell, R. 1970. A short reference grammar of Moroccan Arabic. Washington, D.C.: Georgetown University Press.
Kellerman, E. 1979. Transfer and non-transfer: Where we are now. Studies in Second
Language Acquisition 2: 37–57.
Kornfilt, J. 1997. Turkish. London, New York: Routledge.
Longobardi, G. 1996. The syntax of N raising: A minimalist theory. Manuscript, Utrecht
University.
Miller, P. 1991. Clitics and constituents in phrase structure grammar. Doctoral dissertation,
University of Utrecht.
Moro, A. 1997. Predicative noun phrases and the theory of clause structure. Cambridge:
Cambridge University Press.
Perdue, C. 1993. Adult language acquisition: Cross-linguistic perspectives. Volumes I and II.
Cambridge: Cambridge University Press.
Radford, A. 2000. Children in search of perfection: towards a minimalist model of acquisition. In Essex research reports in linguistics, 34.
Roeper, T. 1996. The role of merger theory and formal features in acquisition. In Generative perspectives on language acquisition, H. Clahsen (ed.), 415–449. Amsterdam/
Philadelphia: John Benjamins.
Schwartz, B. D. and Sprouse, R. 1996. L2 Cognitive states and the full transfer/full access
model. Second Language Research 12: 40–72.
Van de Craats, I. 2000. Conservation in the acquisition of possessive constructions. A study of
second language acquisition by Turkish and Moroccan learners of Dutch. Doctoral
dissertation, Tilburg University.
Van de Craats, I., Corver, N. and Van Hout, R. 2000. Conservation of grammatical
knowledge: on the acquisition of possessive noun phrases by Turkish and Moroccan
learners of Dutch. Linguistics 38 (2): 221–314.
Van de Craats, I., Corver, N. and Van Hout, R. 2002. The acquisition of possessive have-clauses by Turkish and Moroccan learners of Dutch. Bilingualism: Language and
Cognition 5 (2): 147–174.
Vermeer, A. 1986. Tempo en struktuur van tweede-taalverwerving bij Turkse en Marokkaanse
kinderen. Doctoral dissertation, Tilburg University.
White, L. 1982. Grammatical theory and language acquisition. Dordrecht: Foris.
White, L. 1985. The pro-drop parameter in second language acquisition. Language
Learning 35: 47–62.


Chapter 5

Measures of competent gradience*


Nigel Duffield
McGill University & Max Planck Institute for Psycholinguistics,
Nijmegen

1. Introduction: Reconsidering the competence–performance distinction

An assumption of much current generative research is that grammaticality is a
strictly categorical property. By extension, underlying competence, of which
grammaticality is a reflex, is also assumed to be categorical in nature.1 In most
analyses, this assumption about grammaticality is explicitly formalized in the
convention of placing asterisks in front of ungrammatical sentences: starred
sentences are categorically bad, unstarred sentences are unequivocally good.2
A second, equally standard assumption in generative research is that
acceptability judgments are not categorical, at least if one ignores such uninteresting deviances as arbitrary word scrambles, nonce-word insertions, and so
forth. Everybody knows that acceptability judgments on most theoretically
relevant sentences are only relative, that the particular level of acceptability of
a given sentence is a complex function of a variety of factors, including extragrammatical ones. It is notable that L2 researchers have been more explicit in
recognizing this fact than pure theorists (see especially Birdsong 1989, Ellis
1991, Hedgcock 1993, Martohardjono 1998; cf. also Coppieters 1987, Mandell
1999), but the view is more generally held: see Bever 1970, Greenbaum 1977,
Schütze 1996, for discussion and further references.
In spite of the tension between these assumptions, most theoretical linguists
use continuous performance data to draw inferences about (what is assumed to
be) categorical grammaticality, and by extension, categorical underlying
competence.3 Having once idealized to competence, theoretical linguists are
free to ignore those aspects of performance that they deem irrelevant to
grammaticality: this includes the property of gradience that is a principal
concern of this paper.


Most of the foregoing discussion is familiar to anyone who has considered
the competence issue, so it might be wondered why I am bringing it up again.
There are three reasons, two of which apply to linguistic research in general,
and one of which is more specifically concerned with the relationship between native-speaker (NS) and non-native-speaker (NNS) data in second language acquisition research.
The first reason for critically re-examining the competence–performance
distinction is to save it from attack from an increasingly coherent body of
research that takes seriously the complexity of grammatical systems whilst at the
same time questioning the empirical validity of that distinction.4
One articulate statement of this new position is presented in the emergentist
work of Allen and Seidenberg (1999). Allen and Seidenberg's attack on the
competence–performance distinction as it relates to language acquisition is
motivated by what they claim to be the obscure basis of grammaticality judgments:
The mapping between competence grammar and performance is at best
complex, as we have noted: it is also largely unknown. A problem arises
because the primary data on which the standard approach relies – grammaticality judgments – are themselves performance data (Bever 1970). The
methodology of the standard approach holds that properties of the hypothesized language faculty can be identified on the basis of experts' intuitive
judgments of the well-formedness of utterances. However, the relationship
between grammaticality judgments and the structure of the grammar is no
more transparent than that between other aspects of competence and performance. (Allen and Seidenberg 1999: 117–118)

While one might take issue with the reference to 'experts' intuitive judgments',
the rest of this paragraph is unexceptionable. Allen and Seidenberg further
point out that misgivings about the basis of grammaticality judgments are to be
found within the generativist camp. The authors cite a telling paragraph in
Schütze (1996: 20):
It is conceivable that competence in this sense of a statically represented
knowledge does not exist. It could be that a given string is generated or its
status computed when necessary, and that the demands of the particular
situation determine how the computation is carried out, e.g., by some sort of
comparison to prototypical sentence structure stored in memory. Since such
a scenario would demand a major re-thinking of the goals of the field of
linguistics, I will not deal with it further.


Allen and Seidenberg claim that this sidesteps an important issue. Their
response is to try to derive the effects of grammaticality from an input-driven
system lacking any autonomous syntactic representations: that is, without any
reference to grammatical competence as theoretical linguists would construe it.
For various reasons, including both the results of their own simulation and
the results of other experiments to be discussed below, I believe that this
alternative approach to grammaticality is mistaken, and that something like the
competence–performance distinction is empirically more correct. At the same
time, their work demonstrates the urgent need for a clearer articulation of the
nature of the distinction, and for a more explicit statement of the content of
grammaticality judgments. While I will not be able to offer that statement here,
I hope to sketch out the direction in which it might be found.
A second reason for reconsidering the competence–performance distinction
is that I agree with those who claim that the line between competence and
performance is presently drawn in the wrong place: that the current idealization
to competence presents too narrow a view of the scope of competence. This is a
view held not only by connectionists, but also by some generativists, most
notably Culicover (1998, 2000), though cf. Fodor (2001).
The specific notion that I would like to include within a revised version of
competence is that of syntactic gradience. As the title of this article implies, I
will argue that part of competence consists, not in correctly making categorical
judgments about grammaticality, but in correctly making gradient ones. To cite
one brief example, to be a competent knower of English is to know, not merely
that the sentences in (1) and (2) below are all less than perfect, but also that
they are rather precisely ordered in acceptability, each example being slightly
less acceptable than the one that immediately precedes it. These examples, cited
in Kluender (1992), are originally due to Chung and McCloskey (1983) (where
the symbol '>' indicates 'more acceptable than'):
(1) a. This is the paper that we really need to find someone who
       understands. >
    b. This is the paper that we really need to find a linguist who
       understands. >
    c. This is the paper that we really need to find the linguist who
       understands. >
    d. This is the paper that we really need to find his advisor who
       understands. >
    e. This is the paper that we really need to find John, who understands.


(2) a. This is a paper that you really need to find someone that you can
       intimidate with. >
    b. Which paper do you really need to find someone that you can
       intimidate with. >
    c. How many papers do you really need to find someone that you can
       intimidate with. >
    d. What do you really need to find someone that you can intimidate with.

Kluender's concern in presenting such examples is to draw out the hidden
factors determining the acceptability of these sentences: all of the sentences in
(1), and those in (2), involve very similar syntactic structures, yet their grammaticality status varies from (almost) completely acceptable to strongly unacceptable. In this case, the degrees of acceptability are a function, not of syntactic
structure, but of a semantic factor, what Kluender terms referential specificity.
One conclusion to be drawn from such examples is that we are failing to
account for speakers' intuitions by attributing these grammaticality effects to
purely syntactic conditions. This is Kluender's contention. I return to the
hidden factors issue in Section 2.4 below, in discussing the factors determining another putatively syntactic effect: the parallelism constraint on VP-ellipsis. At this point, however, my concern is with the fact that this is an instance of
competent gradience: competent knowers of the language judge these sentences as increasingly less acceptable, and they also converge on the particular
ordering involved.
This becomes especially relevant to SLA when we consider a hypothetical
advanced L2 learner's response to such sentences.5 On the categorical view of
competence, an L2 learner who systematically rejected all of these sentences as
unacceptable (*) should in fact be more correct relative to the hypothesized
grammar (I-language) than the native speaker who is inclined to accept most or
all of them (most or all of the time). Although unusual, such instances of L2
learners outperforming native-speakers are not unknown; the existence of this
phenomenon is the third reason to re-examine the competence–performance
distinction. In fact, I will suggest that the hypothetical L2 learner's performance
does reveal an underlying categorical competence.
Intuitively, however, a learner who judges as unacceptable what native-speakers judge as less acceptable has somehow failed to achieve native-speaker
competence, if we define competence as whatever implicit knowledge determines successful language use.
By one measure, then, the L2 learner whose judgments fail to match those
of the native-speaker will be judged more competent than the native speaker,
where competence is defined in terms of convergence between a discrete
pattern of behaviour and underlying categorical knowledge. Another measure –
and common sense – dictates that the same L2 learner exhibiting the same
behaviour should be judged less competent than the native-speaker, where
competence is defined in terms of convergence on native-speakers' judgments.
Although most generative SLA researchers would claim that they are probing
the first type of competence, I know of very little work that does not determine
this competence – implicitly or explicitly – in terms of the second type.6,7
There have been two responses to this apparent paradox. As noted above,
the anti-generativist response is simply to reject the idea of competence as
statically represented, autonomous knowledge. The standard generativist
response is to continue to pretend that gradience is not part of syntactic
competence. Neither alternative seems satisfactory.8
A middle way is to claim that both impulses are correct, because there are
two types of competence. This is the proposal I will try to motivate here. For the
purposes of exposition, I'll term these hypothesized competences underlying
versus surface competence, respectively, with no preference intended. By
hypothesis, underlying competence (UC) is categorical, and consists of formal
(phonological and syntactic) principles, autonomous from the lexicon. It is
plausible to think of UC as innate. Surface competence (SC), by contrast, is
intimately determined by the interaction of contextual and specific lexical
properties with the formal principles delivered by UC; as a consequence, SC
generates gradient effects. SC is largely language-specific learned knowledge. In
principle, grammaticality judgments can be a reflection in performance of
either type of competence; generally, however – again, by hypothesis – explicit
grammatical judgment tasks will tap surface competence, whereas implicit tasks
will tap underlying competence.9
Now, it may happen that, in a particular task, a given set of judgments will
be weakly consistent with both types of competence, and also that L1 and L2
learners' judgments will converge. This state of affairs is schematised in
Figure 1 below. Gradient judgments will be weakly consistent with UC just in
case (i) all sentences judged less than perfectly acceptable violate some formal
constraint, (ii) none of the sentences judged to be perfect violate this constraint,
and (iii) it is reasonable to attribute the relative degrees of acceptability of the
unacceptable set to some non-formal (preferably extra-grammatical) factor.
This I take to be the standard generativist line.
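
Conditions (i)–(iii) can be read as a simple check over a set of judgments. The following Python is a minimal sketch under stated assumptions, not a procedure proposed in the text: the `Judgment` record, the function name, the rating scale (with `perfect` as its maximum) and the example ratings are all hypothetical. Condition (iii) is a matter of theoretical argument rather than computation, so it is passed in as a flag.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    sentence: str
    rating: float              # acceptability rating; `perfect` is the top of the scale
    violates_constraint: bool  # does the sentence violate the formal constraint at issue?


def weakly_consistent_with_uc(judgments, perfect=1.0, nonformal_factor_plausible=True):
    """Check conditions (i)-(iii) for weak consistency with underlying competence."""
    degraded = [j for j in judgments if j.rating < perfect]
    flawless = [j for j in judgments if j.rating >= perfect]
    cond_i = all(j.violates_constraint for j in degraded)        # (i)
    cond_ii = not any(j.violates_constraint for j in flawless)   # (ii)
    cond_iii = nonformal_factor_plausible                        # (iii), supplied by the analyst
    return cond_i and cond_ii and cond_iii


# Hypothetical ratings: every degraded item violates the constraint,
# and the one perfect item does not.
gradient_set = [
    Judgment("control sentence", 1.0, False),
    Judgment("mild violation", 0.8, True),
    Judgment("stronger violation", 0.4, True),
]

# Hypothetical counter-case: a degraded item that violates no formal constraint.
mixed_set = gradient_set + [Judgment("degraded but well-formed", 0.6, False)]

if __name__ == "__main__":
    print(weakly_consistent_with_uc(gradient_set))
    print(weakly_consistent_with_uc(mixed_set))
```

On this reading, gradience among the degraded items is tolerated as long as it never crosses the categorical boundary drawn by the formal constraint.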


[Diagram. Labels: Common set of judgments (shared by NSs & NNSs); Surface competence; Underlying competence]
Figure 1. Full convergence (SC and UC generate the same set of grammatical sentences,
NS and NNS converge on this set).

However, several legitimate alternatives exist. These are cases where either
the performance judgments on the data presented are a reflex of only one type
of competence (UC or SC), or the judgments of two subject groups (L1 versus
L2) reflect different competences. These logical alternatives are schematised in
Figures 2–5 below.
[Diagram. Labels: Common set of judgments (shared by NSs & NNSs); Surface competence; Underlying competence]
Figure 2. Convergence on SC only.

[Diagram. Labels: Common set of judgments (shared by NSs & NNSs); Surface competence; Underlying competence]
Figure 3. Convergence on UC only.


[Diagram. Labels: Native Speaker Judgments; Non-Native Speaker Judgments; Surface competence; Underlying competence]
Figure 4. Parallel, disjoint convergence (Type 1: NS converge on SC, NNS on UC).

[Diagram. Labels: Non-Native Speaker Judgments; Native Speaker Judgments; Surface competence; Underlying competence]
Figure 5. Parallel, disjoint convergence (Type 2: NNS converge on SC, NS on UC).

Notice that all of these scenarios assume convergence between NS and NNS
competences; that is to say, I am ignoring those cases where L2 learners have
internalised a surface or underlying competence that is different from that of
native-speakers; cf. Sorace (1993, 1996). What this model is intended to account
for are cases in which the L2 learner has attained the target grammar, but where
his/her judgment patterns nevertheless diverge systematically from those of the
native-speaker.
Clearly, hypothesizing two different types of competence complicates any
theory of the relationship between competence and performance. Parsimony
requires that this complication be shown to be empirically necessary. The
purpose of the rest of this article, therefore, is to provide such motivation. The
organization is as follows. The following section provides some experimental
evidence motivating two types of competence. Next, I consider two types of
gradient effect, both of which I will argue are properties of surface, rather than
underlying, competence. Then, I discuss two instances of principled mismatch
between NS and NNS judgments, motivating both Figures 4 and 5 above.
Finally, I examine a special case of Figure 2 above, where NS and NNS apparently converge on the same SC, but for different reasons.


2. The dual competence approach


2.1 Motivating two types of competence
This section draws attention to two pieces of recent work that suggest a distinction between two types of competence. In each case, complementary methodologies have been used to investigate speakers' judgments of particular syntactic
phenomena, which have been argued (in the theoretical literature) to exhibit a
categorical distinction. In each case, these methodologies have yielded divergent, but nevertheless coherent, results. This divergence, I would claim, reflects
the fact that two different types of competence are being tapped.
2.1.1 McKoon and MacFarland (2000)
McKoon and MacFarland's (2000) study investigates the theoretical argument
for a discrete representational contrast between two classes of verbs: externally- versus internally-caused change of state verbs. As these labels suggest, externally-caused change-of-state (ECS) verbs are those whose result state may be
brought about by some external agent/cause, whereas internally-caused change
of state (ICS) verbs describe result states whose cause is internal to the theme
NP undergoing change-of-state. An example of the former is the verb redden,
an example of the latter, bloom.
The theoretical literature on such predicates analyses the difference in terms
of a discrete lexical contrast: the lexical argument-structure of ECS verbs is said
to involve an additional abstract predicate (CAUSE), which is systematically
absent from the argument-structure of ICSs. In the specific analysis of Levin
and Rappaport Hovav (1995), this entails a structural difference in the lexico-syntactic representation of the two verb-types: as illustrated in (3) below, ECS
verbs project more structure than ICS verbs.
(3) a. ECS: ((α) CAUSE (BECOME (x STATE)))
    b. ICS: (BECOME (x STATE))

One piece of empirical evidence supporting this structural contrast comes from
the alleged (in)ability of these intransitive verbs to form transitive counterparts:
it has been claimed that ECSs, but not ICSs, permit transitive alternants. So, for
example, there is claimed to be a categorical contrast in the acceptability of (4a)
versus (4b):
(4) a. The sunshine reddened Amy's cheeks.
b. *The sunshine bloomed the tulips.


McKoon and MacFarland tested the validity of this empirical claim through a
number of different studies. First, they examined a corpus of approximately 180
million words of written and spoken English. The results of that investigation,
summarized in Table 1 below, demonstrate that ICSs considered as a class are
in fact just as likely to transitivise as ECSs.
Table 1. Probability of transitive use for ECS and ICS verbs (adapted from McKoon and
MacFarland 2000: Table 2)

Sentence type             External Cause             Internal Cause
                          verb         prob. yes     verb          prob. yes
Low prob. transitive      atrophy      .03           bloom         .00
                          awake        .05           deteriorate   .01
                          crumble      .05           germinate     .06
                          abate        .10           rot           .08
                          mean         .06           mean          .06
Higher prob. transitive   redden       .24           blister       .22
                          dissipate    .41           ferment       .54
                          fray         .52           corrode       .63
                          fossilize    .60           erode         .67
                          mean         .48           mean          .45

All of these predicates vary individually in transitivity: some ECSs such as
abate show a low probability of transitivity, others such as fossilize are more
likely to transitivise; the same is true of ICSs. In terms of usage, then, the
transitivity constraint does not distinguish the two classes of predicate; rather,
transitivity is a lexically gradient effect. Thus, the corpus study results apparently invalidate a key piece of empirical evidence for a representational contrast
between the two verb classes.
The proposed theoretical distinction becomes even more questionable in
light of a follow-up study, in which subjects were asked, in an offline grammatical judgment task, to rate the acceptability of various transitivised ECS and ICS
verbs. The results of this task confirmed the patterns already observed in the
spontaneous production data, namely, that the acceptability of the transitive
alternant of a given verb was a function of individual lexical differences, and of
selectional restrictions, rather than putative verb-class. Statistically, there is no
main effect of verb class, nor an interaction between verb-class and transitivity,
in the acceptability ratings. Overall, the acceptability rate for both classes of
verbs was extremely high: even those verbs judged least acceptable had a mean
acceptance rate of over 80%.
The results of the first two studies speak against the theoretical claim that
these two types of verbs are represented differently. On the contrary, subjects'
close convergence on a scale of relative acceptability (for the transitive forms of
individual predicates) suggests that a fine-grained, lexically determined, and
inherently gradient, type of competence informs native-speakers' performance.
Had McKoon and MacFarland stopped at this point, their work would
provide support for the usage-based, probabilistic models of language performance favoured by connectionists and others (cf. especially MacDonald et al.
(1994), Barlow and Kemmer (2000)). A final experiment, however, suggested
a different conclusion. In this latter, implicit experiment, McKoon and
MacFarland measured the response latencies involved in reading ECSs versus
ICSs that had previously been matched in terms of length and offline acceptability. In direct contrast to the previous results, this implicit measure revealed
a reliable contrast in reading time between ECS and ICS verbs; ECS verbs –
whether presented in intransitive or transitive frames – took significantly
longer to read than their matched ICS counterparts. The main results are
reproduced in Table 2 below (from McKoon and MacFarland 2000: Tables 7 and 8:
848, 851). That is to say, the results of these last experiments support the idea of
distinct representations for these verbs on the basis of the hypothesized verb
classes.
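
The size of the latency contrast can be read off Table 2 with simple arithmetic. The sketch below is illustrative only: the dictionary layout and the function name are assumptions introduced here, the figures are the mean times (ms) transcribed from Table 2, and the second block of the table is assumed to report transitive frames. The raw differences carry no information about statistical reliability, which depends on the original analyses.

```python
# Mean times in ms from Table 2, as (ECS, ICS) pairs.
# Assumption: the second block of Table 2 reports transitive frames.
MEAN_TIMES_MS = {
    "intransitive": {
        "all test sentences": (1551, 1400),
        "low prob. transitive": (1561, 1392),
        "higher prob. transitive": (1538, 1413),
    },
    "transitive": {
        "all test sentences": (2220, 2069),
        "low prob. transitive": (2230, 2131),
        "higher prob. transitive": (2210, 1963),
    },
}


def ecs_minus_ics(table):
    """Return the ECS minus ICS difference (ms) for every frame and subset."""
    return {
        frame: {subset: ecs - ics for subset, (ecs, ics) in rows.items()}
        for frame, rows in table.items()
    }


if __name__ == "__main__":
    for frame, rows in ecs_minus_ics(MEAN_TIMES_MS).items():
        for subset, diff in rows.items():
            print(f"{frame:12s} {subset:24s} {diff:+d} ms")
```

In every cell the ECS mean exceeds the ICS mean, which is the direction of the contrast described above.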
McKoon and MacFarland argue that the results of the final experiment
speak against probabilistic models of lexical representation, and confirm the
psycholinguistic reality of the theoretical model. Both conclusions are too
strong, I think. On the one hand, whether or not there is a categorical distinction in the representation of the two classes of verbs, it is still necessary to
account for subjects' ability to converge on a scale of gradient judgments for
individual predicates. As discussed earlier, a speaker whose judgments of the
acceptability of such predicates were the inverse of all other subjects' (in Experiment 2) could reasonably be judged less competent than one whose relative
judgments were in accordance with other native speakers (at least by the second
definition of competence), even if their reading times in Experiment 3 were
comparable.
Table 2. Mean reading times for ECS and ICS verbs (from McKoon and MacFarland
2000: Tables 7 & 8)

Sentence type               External Cause             Internal Cause
                            JTime (ms)   prob. yes     JTime (ms)   prob. yes
Intransitive frames
  All test sentences        1551         .91           1400         .96
  Low prob. transitive      1561         .92           1392         .96
  Higher prob. transitive   1538         .90           1413         .96
Transitive frames
  All test sentences        2220         .86           2069         .96
  Low prob. transitive      2230         .81           2131         .96
  Higher prob. transitive   2210         .93           1963         .96

Moreover, the fact that subjects' performance in an implicit task yields a
statistically discrete result does not prove that the competence underlying this
behaviour is itself categorical, and certainly is no more than consistent with the
theoretical analysis. To infer the reality of a particular contrast in underlying
competence from an apparently isomorphic contrast in processing is to assume
a very direct interpretation of the derivational theory of complexity that most
psycholinguists would view with some caution.
Having said that, the results of the third experiment do provide support for
the idea that some correlate of the categorical contrast described by the theoretical analysis has psychological reality, and that this correlate does seem to be
categorically expressed.
I suggest that an adequate model is one that accommodates both sets of
results. Rather than attempting to reconcile these within a single type of
competence, the proposal is to assume a dual-competence model, as represented in Figures 1–5 above. In this model, the results of the first two experiments
can be represented as in Figure 2 (tapping SC), those of the final experiment as
in Figure 3 (tapping UC).10
2.1.2 Duffield and White (1999), Duffield et al. (2002)
The dual competence model receives additional support from recent SLA work
using complementary methodologies to investigate L2 learners' knowledge of
pronominal clitic placement in Spanish and French: Duffield and White (1999),
Duffield et al. (2002). In this section, my concern is with the divergent results
of the French native-speaker control group in one section of experiments. Here,
unusually, an offline grammaticality judgment task revealed a grammaticality
contrast to which the implicit (sentence-matching) task was apparently insensitive.

(5) a.  Je veux le voir.
        I want 3sg to.see
        'I want to see him.'
    b. *Je le veux voir.
    c.  Je le fais chanter.
        I 3sg make to.sing
        'I make him sing.'
    d. *Je fais le chanter.

The phenomenon of interest, illustrated in (5) above, is the contrast in pronominal clitic placement between so-called restructuring verbs (such as pouvoir,
vouloir) and the causative verb faire. The distinction is of theoretical interest in
that the two structures are assumed to involve distinct syntactic representations.
Briefly, many theoretical analyses assume that restructuring verbs, as the
name suggests, restructure the syntax of the lower clause, creating a monoclausal structure. By contrast, sentences involving causatives are assumed to
remain fundamentally bi-clausal: at an abstract level of representation, the
pronominal clitic is still syntactically associated with the lower verb; see
Duffield et al. (2002) for more detailed discussion.
Notice that this analysis, derived on the basis of cross-linguistic comparison
with other Romance varieties, is precisely the opposite of that suggested by a
naïve inspection of the French facts. In French, the pronominal clitic stays close
to the verb of which it is an argument in restructuring contexts, and is displaced
from it in causative contexts: hence, one might expect that restructuring would
be required for the interpretation of clitics with causatives, rather than with
verbs like vouloir.
The theoretical analysis, however, predicts a distinction between the
acceptability of two types of ungrammatical sentence, namely, between (5b)
and (5d) above: whereas (5b) is predicted to be ungrammatical at all levels of
representation, (5d), in which the clitic is attached to the verb with which it is
thematically associated, is predicted to be ungrammatical only at surface level.
This predicts that if a task could be found that taps underlying, rather than
surface, grammaticality, then sentences such as (5d) should pattern with other
grammatical, as opposed to ungrammatical, sentences.
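The logic of this prediction can be made concrete with a small sketch. The Python fragment below is purely illustrative: it records the grammaticality status that the analysis assigns to each example in (5) at the two levels, and reads off what a task tapping one level or the other should report. The names `PREDICTED`, `TASK_LEVEL` and `treated_as_grammatical`, and the simplifying assumption that each task taps exactly one level, are introduced here for exposition only.

```python
# Grammaticality of the examples in (5) at each level, as predicted by the
# theoretical analysis described in the text.
PREDICTED = {
    "5a": {"underlying": True, "surface": True},    # Je veux le voir.
    "5b": {"underlying": False, "surface": False},  # *Je le veux voir.
    "5c": {"underlying": True, "surface": True},    # Je le fais chanter.
    "5d": {"underlying": True, "surface": False},   # *Je fais le chanter.
}

# Assumption for this sketch: an implicit task reads off the underlying level,
# an offline judgment task reads off the surface level.
TASK_LEVEL = {"implicit": "underlying", "offline": "surface"}


def treated_as_grammatical(example: str, task: str) -> bool:
    """Return the predicted status of an example under a given task."""
    return PREDICTED[example][TASK_LEVEL[task]]


for example in PREDICTED:
    print(example, {task: treated_as_grammatical(example, task) for task in TASK_LEVEL})
```

On these assumptions, (5d) is the only example on which the two tasks are predicted to diverge, which is the asymmetry at issue in what follows.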
In Duffield et al. (2002), we employed just such a task to elicit implicit
grammaticality judgements. This is the Sentence-Matching (SM) paradigm
introduced by Freedman and Forster (1985), later developed for SLA research
by Bley-Vroman and Masterson (1989); see also Eubank (1993), Eubank and
Grace (1988). In this task, subjects are asked to determine whether or not two
visually-presented sentences are identical in form. Previous research has shown
repeatedly that, in general, subjects take significantly less time to decide that
matching pairs of grammatical sentences are identical than to match corresponding ungrammatical pairs of sentences. Hence, statistically discrete
differences in reaction times constitute an implicit measure of grammaticality.
This measure has proven useful in comparing native-speakers' implicit grammaticality judgments with those of L2 learners, since one is (theoretically, at
least) able to avoid a potential pitfall of explicit grammaticality judgment
tasks, in which L2 learners may have explicitly learned a particular rule, and be
able to apply it in a grammatical judgment task, but nevertheless have a
radically different interlanguage competence from that of native-speakers.
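As a minimal sketch of how such an implicit measure might be computed, the fragment below compares matching latencies for grammatical and ungrammatical pairs. The latencies are invented for illustration and are not data from any of the studies cited; the function name `grammaticality_effect` is hypothetical, and the single Welch t statistic stands in for the by-subject and by-item analyses that an actual study would be expected to report.

```python
from math import sqrt
from statistics import mean, stdev


def grammaticality_effect(gram_rts, ungram_rts):
    """Return (mean difference in ms, Welch t statistic),
    computed as ungrammatical minus grammatical matching pairs."""
    diff = mean(ungram_rts) - mean(gram_rts)
    # Standard error of the difference, allowing unequal variances
    se = sqrt(stdev(gram_rts) ** 2 / len(gram_rts)
              + stdev(ungram_rts) ** 2 / len(ungram_rts))
    return diff, diff / se


# Hypothetical response latencies in ms (illustrative only)
grammatical = [1180, 1225, 1150, 1240, 1205]
ungrammatical = [1310, 1365, 1290, 1400, 1335]

diff, t = grammaticality_effect(grammatical, ungrammatical)
print(f"difference = {diff:.0f} ms, Welch t = {t:.2f}")
```

A reliably positive difference would count as a grammaticality effect in the sense used here; the absence of such a difference for a given condition is what the following discussion treats as informative.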
Contrary to some other researchers, I see little value in using the SM
paradigm if the only purpose is to confirm results achievable using more
traditional methodologies. Arguably, the cases of interest are those where
traditional grammaticality judgment tasks and SM yield divergent results.
Indeed, it seems unlikely that sentence-matching would have received much
attention at all had it not been for the fact that it fails in some interesting
contexts. These contexts were the focus (arguably, the raison d'être) of
Freedman and Forster's original paper from 1985. Freedman and Forster
showed that subjects who otherwise reliably distinguished in response latency
between identical pairs of grammatical versus ungrammatical sentences
appeared to treat a particular subset of ungrammatical sentences (namely,
specified subject condition violations) as though they were in fact grammatical. That is to say, there was no significant difference in response latency in this
condition. For example, subjects treated pairs of sentences rated offline as
ungrammatical, such as (6a), on a par with grammatical sentences as in (6b),
and distinct from matching pairs of ungrammatical sentences, such as (6c),
which elicited significantly longer response latencies.
(6) a. *Who did you see Rick's picture of?   (implicitly treated as grammatical)
    b.  Who did you see a picture of?
    c. *What you did of a picture see?

Freedman and Forster interpreted this systematic absence of an effect in terms
of the level of syntactic representation tapped by the SM task: given the theoretical framework they assumed, they argued that the sentence was grammatical at
s-structure, but ungrammatical at the level of logical form.
Clahsen, Hong and Sonnenstuhl-Henning (1995) reinterpret these findings
in terms of operations applying at different levels of structure: SM, they argue,
inspected an underlying syntactic level (LF), but was insensitive to operator-variable binding relations holding at that level of representation. This reinterpretation captured both the Freedman and Forster results as well as their
own results on verb-position in finite root versus embedded clauses in German.
These latter results showed that speakers treat as grammatical main clauses in
which the verb (incorrectly) occupies the final position, as in (7a), but treat as
ungrammatical embedded clauses in which the verb is incorrectly raised to
second position, as shown in (7c):
(7) a. *Hans den Hund gesehen hat.        (treated as grammatical in SM)
        Hans the dog seen has
        'Hans has seen the dog.'
    b.  Hans hat den Hund gesehen.
    c. *daß Hans hat den Hund gesehen.    (treated as ungrammatical in SM)
        that Hans has the dog seen
        'that Hans has seen the dog.'
    d.  daß Hans den Hund gesehen hat.

In our own experiments on clitic placement (see Duffield et al. 2002), we determined that in SM native-speakers treat ungrammatical causatives differently
from ungrammatical restructuring sentences. As predicted by the theoretical
analysis, French native speakers treat cases such as (5d) on a par with other
grammatical sentences: there was no significant difference in response latency
between (5c) pairs and (5d) pairs; by contrast, it took subjects significantly
longer to match (5b) pairs than their grammatical counterparts (5a). Crucially,
in all other conditions involving incorrect clitic placement, French native-speakers were significantly slower to match sentences compared to the corresponding correct placements of these clitics. Thus, it was not the case that SM
overall was insensitive to constraints on clitic placement; quite the contrary.
Instead, SM was selectively insensitive to surface violations of grammaticality:
sentences that were underlyingly grammatical were accepted as grammatical,
even though the surface string was ungrammatical.
In offline grammatical judgment tasks, on the other hand, French native-speakers consistently treat surface ungrammatical sentences equally: offline,
(5d) is considered no more grammatical than (5b). Thus, there is once again a
divergence between the results obtained by implicit versus explicit methodologies. In contrast to the McKoon and MacFarland experiments discussed in the
previous section, here it is the selective absence of a specific result from the
implicit experiment that is significant. Whichever the direction, though, both
sets of experiments require a dual competence approach to underlying linguistic
knowledge, if we wish to accommodate and model both online and offline
results. Both sets of experiments (here again, I am considering only the
native-speakers' results in our experiment) exemplify Figures 2 and 3 above,
with the implicit experiment tapping UC, and the results of the explicit grammatical judgment task reflecting surface competence.
2.2 Types of gradience
Having established the basic framework of a dual competence model to
accommodate both categorical and gradient effects, I would now like to
consider this latter notion more closely, in order to distinguish different types
of gradience. This is the issue that bears most directly on the more general
concern of this volume, namely, on the nature of the lexicon-syntax interface.
Just as I have suggested that there are two types of competence, it is also necessary
to draw a distinction between two types of gradience. Again, for want of better
terms, I will refer to these subtypes as lexical and syntactic (constructional)
gradience, respectively. Since this latter distinction is somewhat more intuitive
than the UC/SC contrast, a couple of illustrative examples should suffice.
2.2.1 Lexical gradience
Lexical gradience refers to cases where the acceptability of a given sentence
varies as a function of the properties of particular lexical items. Such properties
may be semantic or idiosyncratic. An example of a semantic property might be
intentionality: verbs that select +intentional subjects may be more acceptable in
a given sentential context than those that do not. Idiosyncratic properties, by
contrast, distinguish lexical items from near neighbours: for example, highly
frequent nouns may be more acceptable than (near) synonyms of lower
frequency; some items may be more appropriate than others in a given register.
In both cases, the acceptability of the carrier sentence is determined by properties of the specic lexical entries.
McKoon and MacFarland's work, just discussed, exemplifies this type of
gradience: their corpus study showed that some intransitive verbs transitivise
much more readily than others, independently of the verbal class to which they
belong. For instance, whereas atrophy (ECS) and deteriorate (ICS) have an
extremely low probability of occurring in transitive frames, fray (ECS) and
ferment (ICS) are much more likely to be transitivised. This difference is partly
a function of inherent semantic factors, and partly one of frequency.11

Change of Location                        selects be (least variation)
Change of State
Continuation of a pre-existing state
Existence of State
Uncontrolled process
Controlled process (motional)
Controlled process (non-motional)         selects have (least variation)

Figure 6. Auxiliary Selection Hierarchy (adapted from Sorace 2000).


Another article in the same issue of Language, by Sorace (2000), provides a
different example of lexical gradience. Sorace is concerned with variation in
auxiliary selection (have versus be) in constructions in the perfect in Germanic
and Romance (especially Italian). The standard theoretical assumption is that,
for any particular variety, the auxiliary associated with a particular predicate is
rigidly lexically-determined: a given predicate either categorically selects the be
auxiliary or the have auxiliary (the latter being the default value). Sorace's
article suggests that this categorical view is incorrect. Her paper shows that,
both cross-linguistically and within a given variety, auxiliary selection is a
gradient rather than categorical property. According to their semantic properties, verbs occupy a position on a continuum (or hierarchy, to use Sorace's
term) between be and have selection (see Figure 6).
At either end of this semantically-defined continuum, there is little variation in which auxiliary is selected, so that for a non-motional controlled
process, such as chat (It. chiacchierare), have is the only acceptable auxiliary,
whereas for a pure change of location predicate such as come (It. venire), only be
is possible. This is illustrated by the examples in (8a versus 8b) below. By
contrast, predicates whose inherent semantic properties place them in the
middle of this continuum show much more flexibility as to which auxiliary is
selected. This flexibility is reflected both in terms of linguistic variation with
respect to selection (as illustrated by the examples in (9) below) and in terms
of coercability within a given variety: as the examples in (10) illustrate, verbs
in the middle of the continuum can be pragmatically coerced into preferentially
selecting either one or the other auxiliary.12
(8) a.  Maria è/*ha venuta alla festa.
        Maria is/has come to-the party
        'Maria came to the party.' [1a]
    b.  I colleghi hanno/*sono chiacchierato tutto il pomeriggio.
        the colleagues have/are chatted whole the afternoon
        'The colleagues chatted the whole afternoon.' [33a]

(9) a.  Gli atleti svedesi hanno corso/?sono corsi alle Olimpiadi.
        the athletes Swedish have run/are run at-the Olympics
        'The Swedish athletes ran at the Olympic Games.' [37]
    b.  De temperatuur is/heeft 3 uur lang gestegen, maar is toen weer gezakt.
        the temperature is/has 3 hours long risen but is then again dropped
        'The temperature rose for three hours but then dropped again.' [11]

(10) a. Il pilota ha/?è atterrato sulla pista di emergenza.
        the pilot has/is landed on-the runway of emergency
        'The pilot landed on the emergency runway.' [44a]
     b. L'aereo ?ha/è atterrato sulla pista di emergenza.
        the plane has/is landed on-the runway of emergency
        'The plane landed on the emergency runway.' [44b]

If only the poles of the continuum are considered, this contrast gives the
appearance of being categorical; however, once one considers the middle range,
it becomes clear that auxiliary selection is a lexically gradient phenomenon.13
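The shape of this claim can be sketched as an ordered data structure. The fragment below simply encodes the seven verb classes of Figure 6 in order and measures how far a class lies from the nearer pole; it is an expository device, not part of Sorace's proposal. The names `HIERARCHY`, `pole_distance` and `polar_auxiliary` are hypothetical, and the distance is a rank on the hierarchy, not an estimate of how often either auxiliary is chosen.

```python
# Verb classes in the order of Figure 6, from the be-selecting pole
# to the have-selecting pole.
HIERARCHY = [
    "change of location",                    # e.g. venire 'come'
    "change of state",
    "continuation of a pre-existing state",
    "existence of state",
    "uncontrolled process",
    "controlled process (motional)",
    "controlled process (non-motional)",     # e.g. chiacchierare 'chat'
]


def pole_distance(verb_class: str) -> int:
    """Number of steps from the nearer pole (0 = at a pole).
    Larger values mark classes where more variation is expected."""
    i = HIERARCHY.index(verb_class)
    return min(i, len(HIERARCHY) - 1 - i)


def polar_auxiliary(verb_class: str):
    """Return 'be' or 'have' for the two poles, None for every other class."""
    i = HIERARCHY.index(verb_class)
    if i == 0:
        return "be"
    if i == len(HIERARCHY) - 1:
        return "have"
    return None


for verb_class in HIERARCHY:
    print(verb_class, pole_distance(verb_class), polar_auxiliary(verb_class))
```

Only the two classes at distance 0 behave categorically on this picture; everything in between is where variation across varieties and contexts is expected to surface.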
2.2.2 Syntactic/constructional gradience
In other constructions, however, gradient effects are observed that are independent of the particular lexical items involved. That is to say, certain sentences are
regularly judged to be less than perfectly acceptable without being deemed
wholly unacceptable. In the theoretical literature, such sentences are typically
designated as marginal, a status denoted by one or two question-marks (?/??).14
However, as noted above, since standard theoretical models have no way of
representing such judgments, marginal sentences are usually re-classified as
grammatical or ungrammatical ad hoc, depending on the analysis that is being
pursued. Such reclassification immediately obscures an essential feature of most
acceptability judgments, namely, their syntactic gradience.
One example of this type of gradience has already been mentioned, viz., the
influence of referential specificity in determining the relative strength of
syntactic island effects: Kluender (1992), see also Kluender and Kutas (1993).
A different example of syntactic gradience is provided by my work with
Ayumi Matsuo on VP-ellipsis constructions in English. Ellipsis constructions,
and the constraints pertaining to them, have provided core data for generative
analyses for several decades, their importance first brought to general attention
in Sag's dissertation (1976). The aspect of ellipsis constructions relevant to the
present discussion is a constraint on structural parallelism, which the
theoretical literature claims requires the VP of the antecedent clause to be
syntactically parallel to that of the understood ellipsis. This structural parallelism constraint is used to explain the contrast between (a) versus (b) examples
in (11) and (12) below: examples (11a) and (12a) show VP-ellipsis with
parallel active/verbal antecedent clauses; those in (11b) and (12b) illustrate
two types of non-parallel antecedent, passive VPs and nominal antecedents,
respectively. The examples in (11/12c) and (11/12d) are intended to show that
this parallelism constraint fails to apply (or, at least, does not apply so
strongly) if the ellipsis (VPE) is replaced with the semantically equivalent
VP-anaphora (VPA) clause.
(11) a.  Someone had to take out the garbage.
         But Barney refused to.                                   (VPE)
     b.  The garbage had to be taken out.
         ?/??But Barney refused to.
     c.  Someone had to take out the garbage.
         But Barney refused to do it.                             (VPA)
     d.  The garbage had to be taken out.
         But Barney refused to do it.

(12) a.  It always annoyed Sally if anyone mentioned her sister's name.
         Tom did, out of spite.                                   (VPE)
     b.  The mention of her sister's name always annoyed Sally.
         ??/*Tom did, out of spite.
     c.  It always annoyed Sally if anyone mentioned her sister's name.
         Tom did it, out of spite.                                (VPA)
     d.  The mention of her sister's name always annoyed Sally.
         ?Tom did it, out of spite.

The structural parallelism effect is interesting for at least two reasons. First, for
native-speakers, the parallelism constraint has generally gradient, rather than
categorical effects. That is to say, native-speakers typically disprefer, but do not
necessarily exclude, violations of structural parallelism with VPE (the (b)
examples above). This has been demonstrated experimentally in Tanenhaus and
Carlson (1990), as well as in our own work (Duffield and Matsuo 2001, 2002).
The availability of non-parallel ellipsis, in contrast to some other kinds of
ungrammatical sentence, has also been documented in corpora of spontaneous
speech, as reported in Hardt (1993). The following examples, taken from Hardt
(1993), attest to the productivity of violations of the parallelism constraint.
(13) a.  This information could have been released by Gorbachov, but he
         chose not to. (Daniel Schorr, National Public Radio broadcast
         10/17/92) [Hardt (131)]
     b.  A lot of this material can be presented in a fairly informal and
         accessible fashion, and often I do. (Chomsky 1982, cited in
         Dalrymple et al. (1991)) [Hardt (134)]
     c.  [Many Chicago-area cab-drivers] sense a drop in visitors to the
         city. Those who do, they say, are not taking cabs. (Chicago Tribune
         2/6/92) [cf. Hardt ex. 118]

Hence, it seems fair to claim that such sentences have a different status from
those that native-speakers quite generally reject as unacceptable.
A second point to observe about non-parallel ellipsis is that construction type seems to be a factor in determining acceptability. Once again, the experimental
evidence just cited confirms the intuition that non-parallel ellipsis where the
antecedent is a derived nominal (12b) is significantly less acceptable than non-parallel ellipsis where the antecedent is a passive VP (11b) (though it still remains
significantly more acceptable than some other kinds of ungrammatical sentence).
Standard theoretical analyses of ellipsis have no way to represent either the
gradient effects of the parallelism constraint overall, or the differential effects of
construction type. Hence, in the theoretical literature, the relative acceptability
judgments just described get recoded categorically, with non-parallel ellipsis
being considered uniformly ungrammatical (*), irrespective of the particular
type of antecedent, and non-parallel anaphora ((11d)/(12d)) deemed perfectly
acceptable, native-speakers' intuitions notwithstanding. This contrast is
represented schematically in Table 3.
It should be stressed that this type of gradience is orthogonal to lexical
gradience: for each condition tested in our experiments, different verbs and
different auxiliaries were deemed more or less acceptable in ellipsis contexts;
crucially, though, all of the verbs were accepted some of the time in non-parallel contexts.
Table 3. Designated vs. actual acceptability judgments for parallel vs. non-parallel
antecedents (VPE and VPA completions). The right-hand columns show acceptance rates as
percentages for trials in Duffield & Matsuo (2001), (2002), respectively.

Antecedent-ellipsis    Designated grammaticality    Actual acceptability judgments (%)
type                   judgment                     2001        2002
Active-VPE             OK                           90          88
Passive-VPE            *                            52          48
Verbal-VPE             OK                           89          93
Nominal-VPE            *                            39          57*
Active-VPA             OK                           96          87
Passive-VPA            OK                           91          84
Verbal-VPA             OK                           97          88
Nominal-VPA            OK                           74          76

* The relatively high acceptance rate for VPE following nominal antecedents is due to the different
balance of finite and non-finite ellipsis sentences in the latter experiment. See Section 2.4 below for further
discussion.

The main statistical findings for native speakers were as follows. First, in
both experiments, there was a reliable main effect of parallelism: in particular,
VP-ellipsis following non-parallel antecedents was significantly less acceptable
than following parallel antecedents, irrespective of construction type. Second,
two clear interactions were observed: between ellipsis type and parallelism (VPE
following non-parallel antecedents is reliably less acceptable than VPA in the
same context), and between construction type and parallelism (VPE following
passive antecedents is significantly more acceptable than following nominal
antecedents, though still less acceptable than following active antecedents).
Third, VPA also shows a reliable parallelism effect with nominal antecedents,
albeit a smaller one compared to the VPE effect.
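The direction of these effects can be read off the acceptance rates in Table 3. As a rough sketch, the fragment below computes the drop in acceptance from each parallel baseline to its non-parallel counterpart, in percentage points. The names `ACCEPTANCE`, `BASELINE` and `parallelism_cost` are introduced here for illustration, and the raw differences are descriptive only: they are not the statistical tests on which the findings above rest.

```python
# Acceptance rates (%) from Table 3: (2001 experiment, 2002 experiment)
ACCEPTANCE = {
    ("active", "VPE"): (90, 88), ("passive", "VPE"): (52, 48),
    ("verbal", "VPE"): (89, 93), ("nominal", "VPE"): (39, 57),
    ("active", "VPA"): (96, 87), ("passive", "VPA"): (91, 84),
    ("verbal", "VPA"): (97, 88), ("nominal", "VPA"): (74, 76),
}

# Each non-parallel antecedent type paired with its parallel baseline
BASELINE = {"passive": "active", "nominal": "verbal"}


def parallelism_cost(nonparallel: str, ellipsis: str, experiment: int) -> int:
    """Parallel minus non-parallel acceptance, in percentage points.
    experiment: 0 for the 2001 study, 1 for the 2002 study."""
    parallel = BASELINE[nonparallel]
    return (ACCEPTANCE[(parallel, ellipsis)][experiment]
            - ACCEPTANCE[(nonparallel, ellipsis)][experiment])


for experiment, year in enumerate(("2001", "2002")):
    for nonparallel in BASELINE:
        for ellipsis in ("VPE", "VPA"):
            cost = parallelism_cost(nonparallel, ellipsis, experiment)
            print(year, nonparallel, ellipsis, cost)
```

In both experiments the cost is larger for VPE than for VPA with the same antecedent type, and the cost for VPA is larger with nominal than with passive antecedents.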
Before discussing the second language learners, let us consider how we might
model the native-speaker results, first, given the traditional competence-performance model, and then within the dual competence model proposed here.
As far as I can determine, it is simply impossible to model these gradient
effects in the traditional framework without arbitrarily recoding them as
categorical effects. One could, for example, model the main effect of parallelism
by reclassifying the circa 50% acceptance rate for VPE in passive contexts as
equivalent to categorically unacceptable sentences, say, those with less than 5%
acceptance ratings by native speakers. One could also gloss over the statistically
reliable difference between construction types: since the principles and parameters approach allows no construction-specific rules, it cannot allow construction-specific effects to bear on grammaticality. Finally, one could dismiss the
small, but significant, effect of parallelism in VPA contexts.
Viewed charitably, this way of treating gradient effects obscures subtle and
empirically valid distinctions, while reconciling them with an explanatory
model; a less charitable interpretation would regard this as fixing the data.
However it is viewed, something important is lost. I suggest that these gradient
effects are an essential feature of linguistic competence, not something to be
factored out.
By contrast, the dual competence model allows us to model and to interpret
these gradient results without abandoning the idea that some aspects of
syntactic competence are indeed categorical and autonomous of lexical and
constructional knowledge. As was the case for the McKoon and MacFarland
results discussed above, I suggest that the results of this experiment be interpreted in terms of Figures 2 and 3 respectively, in which the structural parallelism constraint, which shows its effects across constructions, is represented in
UC, whereas specific lexical and constructional information, including frequency information, is represented in SC. The interaction between these two types
of competence gives rise to the various types of gradience observed.
The dual competence model not only permits modelling of lexical and
syntactic gradience; as discussed above, it also provides a potentially explanatory model of principled divergences between native-speakers' and second
language learners' behaviour, and a way to resolve the apparent paradox of L2
learners outperforming native-speaker controls. In the remaining sections of
this paper, I will consider some cases of what I will term parallel disjoint
convergence.
2.3 Parallel disjoint convergence
As noted above, the dual competence model allows non-native speakers' results
to differ from those of native-speakers on any given acceptability judgment task
in two principled ways: either non-native speakers' results can reflect UC while
native speakers' reflect SC, or vice versa, these options being schematised in
Figures 4 and 5 above, respectively. In the following sections, I will outline one
instance of each type of disjoint convergence.
2.3.1 Type 1 disjoint convergence
Type 1 disjoint convergence (Figure 4 above) refers to instances discussed at the
outset of this paper, in which L2 learners seem to outperform their native-speaker counterparts; that is, instances where L2 learners' behaviour actually
comes closer to the categorical behaviour predicted by standard theoretical
models than does that of native-speakers.
This mismatch is illustrated in a recent cross-linguistic study investigating L2 learners' knowledge of English derivational morphology (Duffield,
Sabourin and Curtin 1998). This study was a partial replication for SLA of a set
of experiments with native-speakers reported in Marslen-Wilson et al. (1994).
The Marslen-Wilson et al. experiments examined morphological relatedness
between derivationally related words, as evidenced by priming effects. Among
the interesting findings of the original study, the two most relevant were as
follows. First, it was determined that words related by simple phonetic overlap
did not yield priming effects (tin does not prime tinsel, nor asp, asparagus); in
other words, only words that could be decomposed into a shared stem plus a
legitimate affix (for the derived form) are considered by native-speakers to be
related. Second, it was determined that being formally morphologically related
was a necessary, but not a sufficient condition for lexical relatedness; in addition, the two forms had to be semantically related. Thus, govern was shown to
prime government (and vice versa), but depart fails to prime department, since
the latter pair do not share any meaning.
From a theoretical point of view, the second finding is somewhat unexpected, since most theoretical morphological models assume formal rules to be
autonomous of specific lexical-semantic information. Psycholinguistically,
however, for native-speakers, the results show clearly that relatedness is
encoded in particular lexical entries (which is the only possible locus of specific
semantic information). Thus, there is a mismatch between native-speakers'
psycholinguistic representations and what the theoretical models predict.
In our replication of the Marslen-Wilson et al. experiments with L2
learners, we predicted (given our theoretical assumptions) that L2 learners
might diverge in a principled way from native-speakers. Specifically, while we
expected that both native-speakers and L2 learners should fail to show priming
for purely phonetically related (non-morphologically-related) pairs, we
hypothesized that intermediate learners might initially over-generalise the
formal rule, showing priming in depart-department cases, in contrast to native
speakers. This was precisely what we found: whereas our native-speaker group
and our advanced L2 group replicated the Marslen-Wilson et al. findings, the
intermediate L2 group (native speakers of Japanese) showed priming effects for
morphologically-related pairs irrespective of semantic relatedness. By demonstrating categorical, autonomous behaviour, the intermediate group better
approximated the theoretical ideal than either the advanced group or the native
speakers: in this sense, these intermediate learners were more competent (or rather,
closer to underlying competence) than the others. On the other hand, they were
clearly less competent than the advanced learners in converging on the judgments, and by extension, on the overall competence, of native-speakers, since
in this case target competence is lexically-constrained, gradient behaviour.
Although this contrast is not exactly comparable to the other phenomena
discussed in this paper, since it is purely lexical, rather than syntactic, the same
logic applies: some grammatical phenomena are categorical and autonomous
properties, others show lexically-specific, gradient effects; both need to be
accommodated.
2.3.2 Type 2 disjoint convergence
The converse behaviour, where second language learners' results reflect
surface competence when native-speakers' results show the influence of UC,
can be seen in the sentence-matching experiments on clitic placement discussed
earlier. Recall that the claim was that native-speakers' failure to show a grammaticality effect (in the online task) in French causative constructions, in
clear contrast to the grammaticality effect they exhibited for restructuring verbs,
was due to the underlying grammaticality of the clitic placement in
sentences such as (5d) above.
If this is the correct explanation of the observed asymmetry, then the
predictions for the second language learners on this task are somewhat paradoxical, raising the possibility that L2 learners' implicit acceptability judgments
might fail to match those of native speakers by outperforming them with
respect to the presumed theoretical target. In our experiment, this is precisely
what happened: both English and Spanish speaking L2 learners of French
showed a grammaticality effect for both restructuring and causative contexts in
the online task, in contrast to the French native speakers.
Once again, the standard model provides no satisfactory account of these
results: either one is forced to exclude the causative condition altogether on the
grounds that it did not work for the native-speaker controls, or one accepts
(paradoxically) that the L2 learners have achieved native-speaker competence
in this condition, as measured by approximation to the theoretical target,
although their implicit judgments are wholly distinct from those of the native-speakers. Once more, something significant is lost.
By contrast, this pattern of results can be accommodated directly by the
dual competence model, as schematised in Figure 5 above. The necessary
assumption would be that whereas native-speakers analyse syntactic structures
in terms of a general computational system, (at least some) L2 learners' analyses
are at the level of the surface properties of specific constructions. This assumption, though controversial, is in line with a respectable body of L2 research (see
especially Clahsen and Muysken 1989, Bley-Vroman 1989, 1990). Clearly,
considerably more research is necessary to demonstrate this version of the
Fundamental Difference Hypothesis (the idea that SLA is constrained by fundamentally different principles and mechanisms than those that guide first language
acquisition). The point here is that the present model is able to treat such divergences between native and non-native speakers in a principled fashion, without
totally excluding L2 learners' access to UG (UC in present terms).
2.4 Factoring out gradient effects: L1 versus L2 differences
Before concluding, I wish to draw attention to another aspect of competence
that is revealed when one studies gradient effects, but which remains obscured
in the traditional paradigm. By focussing on gradience, it is possible to determine which of several logically independent variables contribute(s) to a
particular acceptability judgment and, just as importantly, to determine
the relative strength of these variables. One potential outcome of this type of
factor analysis in SLA studies is that native speakers and second language
learners converge on the same overall result for quite different reasons: in other
words, their common acceptability judgments are determined by distinct
constraint rankings.15
In work reported in Duffield and Matsuo (2002, in preparation), we carried
out two follow-up experiments on the VP-Ellipsis study reported above. The
previous experiments, in line with most other psycholinguistic work in this
area, had assumed that the parallelism effect was entirely due to the syntactic
properties of the antecedent clause. The follow-up experiments re-examined
this assumption, the goal being to explain the parallelism effect in VP-ellipsis
constructions by teasing apart the other linguistic factors that may contribute
to that effect, and (again) to compare native speakers' and L2 learners' sensitivity to such factors.
For these latter experiments, we considered two properties in addition to
syntactic parallelism, namely (conceptual and syntactic) recoverability, and
finiteness. Recoverability refers to the idea that the parallelism effect may be
partly due to the relative salience in the discourse representation of the material
to be reconstructed: that is, non-parallel antecedents might be dispreferred not
for structural reasons, but for interpretive ones. To test this, we manipulated
the antecedent clause in the active-passive experiment (see above), such that the
linguistic information necessary to (re)-construct and interpret the ellipsis
clause was more or less recoverable from the antecedent. Specifically, we
hypothesized that (conceptual) recoverability of non-parallel passive antecedents would be enhanced by the presence of a by-phrase, as for example in (14a)
versus (14b) below.16
(14) a. Mary was busy, so the package was sent by Tom.
        ?He had promised that he would. (with by-phrase)
     b. When we got back, our driveway had been cleared of snow.
        ??A neighbour told us that Tom had. (no by-phrase)

The other property manipulated was the finiteness of the ellipsis clause.
Standard theoretical accounts do not distinguish between non-finite ellipsis
involving 'to', as in (11) above, on the one hand, and finite ellipsis involving
'do' or some other auxiliary verb, as in (12) above, on the other. That is to say, the
parallelism effect is generally claimed to constrain finite and non-finite ellipsis
equally. Intuitively, however, the parallelism effect is considerably weaker with
non-finite ellipsis. These experiments put that intuition to the test.
Detailed results and discussion of these experiments are reported in
Duffield and Matsuo (in preparation). Here, it suffices to report the main
findings, which were as follows (see also Duffield and Matsuo 2002). First,
contrary to standard theoretical assumptions, our experiments show that the
parallelism effect in ellipsis is not uniquely due to the structural properties of
the antecedent clause: other lexical and conceptual factors interact to determine
the strength of the effect. Of these factors, finiteness is crucial: non-finite ellipsis
displays significantly weaker parallelism effects in non-parallel contexts than
finite ellipsis. Second, conceptual recoverability does have an effect on the
acceptability of non-parallel antecedents, but only (for native speakers, at
least) in interaction with finiteness: recoverability weakens the parallelism
effect only in non-finite contexts.
The comparison between native-speakers' and L2 learners' performance was
also revealing. Overall, L2 learners' performance parallels that of native-speakers: both groups exhibit a significantly lower acceptance of ellipsis in non-parallel contexts; and, for both groups, this effect is gradient, rather than
categorical (just as in the previous experiments). This clearly indicates that
gradient effects can be successfully acquired in SLA.
On the other hand, the constraint rankings underlying native speakers' and
L2 learners' common overall results appear to be quite different. Whereas
finiteness shows a robust main effect for both groups, for L2 learners conceptual recoverability has an ameliorating effect even in finite clauses (which was not
the case for native speakers). This suggests that this type of conceptual information plays a larger role in determining L2 learners' judgments than it does for
native speakers, who rely more on purely formal information.
Whatever the final interpretation of these results should be, it should be
clear that such findings are only attainable in principle if attention is paid to the
details of gradient effects: a categorical model can neither describe nor accommodate them.

3. Conclusion
The purpose of this paper has been to draw attention to various types of
gradient effects, both lexical and syntactic. As suggested in the title, I have argued
that these effects form an essential part of our implicit grammatical competence.
A revised model of competence was proposed that accommodates these effects,
but which maintains the generativist assumption that certain core aspects of
grammatical knowledge are still categorical, and autonomous of the lexicon.
The proposed model was also shown to offer an explanation for apparent
paradoxes that arise in SLA whenever second language learners outperform
native-speakers: by distinguishing two types of implicit knowledge, it is possible
to offer a principled account for certain systematic mismatches between native-speakers and second language learners. Finally, this model may ultimately allow
us to bridge the gap between those who argue for strong continuity in SLA and
those who advocate fundamental differences. There is considerable empirical
evidence for both positions; it could be that both are correct.

Notes
* I am grateful to two anonymous reviewers for constructive comments and suggestions. I
would also like to thank those people who commented on previous drafts of this paper,
including David Birdsong, Jonathan Bobaljik, Tom Roeper and Lydia White. I am especially
grateful to Jonathan Bobaljik for clarifying many misunderstandings on my part, and for
offering a persuasive defense of the standard approach. Unfortunately, due to time constraints, I have not been able to integrate all suggested revisions into this paper. No-one other
than myself is responsible for remaining errors and inconsistencies.
1. These assumptions were not always held: as noted by Levelt et al. (1977), theoretical
researchers in the sixties and early seventies developed theories on degrees of ungrammaticality (Levelt et al. cite Chomsky 1964, Katz 1964, Ziff 1964, and Lakoff 1971).
2. I postpone any discussion of the status of marginal (?) sentences, since most theoretical
analyses actually end up designating such sentences either as grammatical or ungrammatical,
usually the former; their relative deviance (or amelioration) relative to the other
sentences of their designated type being attributed to peripheral, extra-syntactic, factors. For
further discussion, see Schütze (1996: especially pp. 41).
3. In this sense, there has been comparatively little progress from a notion of grammaticality
defined in terms of extensional (infinite) sets: in this model, particular sentences are either
generated by the grammar, or they are not. Yet most generative linguists would claim,
following Chomsky (1986), that I-language, rather than E-language, is the proper object of
study (see also Hoekstra 1990).
4. Of course, denying the competence–performance distinction is nothing new; in the past,
though, its detractors often misunderstood or vastly underestimated the intricacy and
complexity of grammatical knowledge.
5. The paradox arises most clearly in L2 research simply because L2 researchers are, of necessity,
more acutely aware than are theoretical linguists of the metalinguistic nature of linguistic
judgments, and of the methodological and analytic problems of data collection and comparison.
6. The comparison involved here may be direct, where L2 subjects' judgments are compared
with the judgments of a control group of native-speakers, or indirect, where L2 judgments
are compared with a pre-established set of judgments (perhaps taken from a theoretical paper),
which native-speakers presumably would agree on; either way, L2 learners are judged competent
if their judgments match those of native-speakers in some statistically reliable way.
7. It might be objected that such a conflict only arises where native-speakers' judgments are
gradient. However, the contention here is that almost all apparently categorical judgments
are in fact gradient (when properly analyzed); hence, there is a real problem here.
8. This is a possible move, provided that there is something interesting left to a judgment
once the gradient properties have been factored out of the equation. Often, though, it seems
as if there is nothing left, no interesting residue that UG could explain. This again echoes
Culicover (1998: 48):
Chomsky has argued consistently that this perspective about linguistic theory [including
the notion of UG as an idealized characterization of linguistic competence: NGD] is
rational and scientific, virtually indisputable. In fact, it cannot reasonably be disputed
given the presumptions that: (i) a language faculty exists that contains specic syntactic
knowledge; (ii) what is left after stripping away the dynamical aspects of language is
something that really exists, in some sense, in the mind/brain (emphasis mine: NGD).
9. This latter assumption may of course be incorrect. The guiding intuition here is that the
introspection involved in explicit tasks is inevitably mediated (in adult native speakers) by
lexical knowledge, a product of surface competence. My speculation is that direct introspection of the computational system is impossible.

10. Since McKoon and MacFarland did not test L2 learners, this interpretation is intended
to apply to native-speakers only.
11. Here, I make the simplifying assumption that these are independent factors. Obviously, this
is not always the case: if, for example, inherent semantic constraints restrict or reduce the
occurrence of a verb in a transitive frame (see immediately below), this will affect the token
frequency for that item, which in turn may further inhibit its acceptability.
12. Numbers in square brackets designate Sorace's original example numbers. Example 9b
[11] above is originally due to van Hout (1993: 7).
13. Similar remarks would seem to apply to other unaccusative effects: for example,
there-insertion in English (see Levin & Rappaport Hovav 1995).
14. Crucially, the marginal status of these sentences emerges from a uni-modal pattern of
acceptances: all speakers accept these sentences sometimes, and reject them on other
occasions; see Avrutin and Wexler (1992), for a relevant discussion of uni-modal vs. bimodal patterns of acceptance, and their proper interpretation.
15. In response to a reviewer's query, the expression 'constraint ranking' is not intended here
to imply a treatment in terms of Optimality Theory necessarily. It is not obvious that
standard OT models capture gradient effects any better than mainstream generative models,
since violable constraints do not yield gradient judgments (in most models, at any rate).
Rather, the term is intended to refer to differences in the relative weighting of various lexical
and syntactic factors that determine the judgment. How these should relate to a particular
theoretical description is an independent question.
16. In the case of non-parallel nominal antecedents, we manipulated syntactic, rather than
conceptual, recoverability. Here, we contrasted zero-derived versus non-zero-derived
alternations (e.g., visit versus discussion), since it has been argued that the former (zero-derived nominals) are more easily reconstructable as verb-phrases in VPE contexts. No effect
was found for this type of syntactic recoverability (see Duffield and Matsuo in preparation).

References
Allen, J. and Seidenberg, M. S. 1999. The emergence of grammaticality in connectionist
networks. In The emergence of language, B. MacWhinney (ed.), 115–152. Mahwah, NJ:
Erlbaum.
Avrutin, S. and Wexler, K. 1992. Development of principle B in Russian: Co-indexation at
LF and coreference. Language Acquisition 2 (4): 259–306.
Barlow, M. and Kemmer, S. 2000. Usage-based models of language. Stanford: Center for the
Study of Language and Information.
Bever, T. G. 1970. The cognitive basis for linguistic structures. In The development of
language, J. R. Hays (ed.), 279–362. New York: John Wiley & Sons.
Birdsong, D. 1989. Metalinguistic performance and interlinguistic competence. New York:
Springer-Verlag.
Bley-Vroman, R. 1989. The logical problem of second language learning. In Linguistic
perspectives on second language acquisition, S. Gass and J. Schachter (eds). Cambridge:
Cambridge University Press.
Bley-Vroman, R. 1990. The logical problem of foreign language learning. Linguistic
Analysis 20: 3–49.
Bley-Vroman, R. and Masterson, D. 1989. Reaction time as a supplement to grammaticality
judgements in the investigation of second language competence. University of Hawaii
Working Papers in ESL 8 (2): 207–237.
Chomsky, N. 1964. Degrees of grammaticalness. In The structure of language: Readings in the
philosophy of language, J. A. Fodor and J. J. Katz (eds). Englewood Cliffs: Prentice Hall.
Chomsky, N. 1986. Knowledge of language: Its nature, origin and use. New York: Praeger.
Chung, S. and McCloskey, J. 1983. On the interpretation of certain island facts in GPSG.
Linguistic Inquiry 14: 704–713.
Clahsen, H. and Muysken, P. 1989. The UG paradox in L2 acquisition. Second Language
Research 5: 1–29.
Clahsen, H., Hong, U. and Sonnenstuhl-Henning, I. 1995. Grammatical constraints in syntactic
processing: Sentence-matching experiments in German. The Linguistic Review.
Coppieters, R. 1987. Competence differences between native and near-native speakers.
Language 63: 544–573.
Culicover, P. 1998. The minimalist impulse. In The limits of syntax, P. W. Culicover and L.
McNally (eds), 44–77. New York: Academic Press.
Culicover, P. 2000. Minimalist architectures (Review of Jackendoff 1997). Journal of
Linguistics 35: 137–150.
Dalrymple, M., Shieber, S. and Pereira, F. 1991. Ellipsis and higher-order unification.
Linguistics and Philosophy 14 (4): 399–452.
Duffield, N. and Matsuo, A. 2001. A comparative study of ellipsis and anaphora in L2
acquisition. In Proceedings of the 25th Boston University conference on language
development, A. H.-J. Do, L. Domínguez and A. Johansen (eds), 238–249. Somerville,
MA: Cascadilla Press.
Duffield, N. and Matsuo, A. 2002. Finiteness and parallelism: Assessing the generality of
knowledge about English ellipsis in SLA. In Proceedings of the 26th Boston University
conference on language development, B. Skarabela, S. Fish and A. H.-J. Do (eds),
197–207. Somerville, MA: Cascadilla Press.
Duffield, N. and Matsuo, A. in preparation. Acquiring competent gradience: Factoring out
the parallelism effect in VP-ellipsis. ms. McGill University/University of Ottawa.
Duffield, N., Sabourin, L. and Curtin, S. 1998. UG constraints on derivational morphology
in SLA. McGill Working Papers in Linguistics: Proceedings of GASLA 1997 13 (1, 2).
Duffield, N. and White, L. 1999. Assessing L2 knowledge of Spanish clitic placement:
Converging methodologies. Second Language Research 15 (2): 133–160.
Duffield, N., White, L., Bruhn De Garavito, J., Montrul, S. and Prévost, P. 2002. Clitic placement in L2 French: Evidence from sentence matching. Journal of Linguistics 38 (3): 137.
Ellis, R. 1991. Grammaticality judgments and second language acquisition. Studies in
Second Language Acquisition 13 (2): 161–186.
Eubank, L. 1993. Sentence matching and processing in L2 development. Second Language
Research 9: 253–280.

Eubank, L. and Grace, S. 1988. V-to-I and inflection in non-native grammars. In Morphology
and its interface in L2 knowledge, M.-L. Beck (ed.), 69–88. Amsterdam: John Benjamins.
Fodor, J. D. 2001. Parameters and the periphery: Reflections on syntactic nuts. Journal of
Linguistics 37 (2): 367–392.
Freedman, S. E. and Forster, K. I. 1985. The psychological status of overgenerated sentences. Cognition 19: 101–131.
Greenbaum, S. 1977. Acceptability in language. The Hague: Mouton.
Hardt, D. 1993. Verb phrase ellipsis: Form, meaning and processing. Computer and Information Science, University of Pennsylvania: Ph.D. dissertation.
Hedgcock, J. 1993. Well-formed vs. ill-formed strings in L2 metalingual tasks: Specifying
features of grammaticality judgments. Second Language Research 9 (1): 1–21.
Hoekstra, T. 1990. Markedness and growth. In Logical issues in language acquisition, I. Roca
(ed.), 63–83. Dordrecht: Foris.
Katz, J. J. 1964. Semi-Sentences. In The structure of language: Readings in the philosophy of
language, J. A. Fodor and J. J. Katz (eds), 400–416. Englewood Cliffs: Prentice Hall.
Kluender, R. 1992. Deriving island constraints from principles of predication. In Island
constraints: Theory, acquisition and processing, H. Goodluck and M. Rochemont (eds),
195–222. Dordrecht & Boston: Kluwer.
Kluender, R. and Kutas, M. 1993. Subjacency as a processing phenomenon. Language and
Cognitive Processes 8 (4): 573–640.
Lakoff, G. 1971. Presuppositions and wellformedness. In Semantics, D. D. Steinberg and
L. A. Jakobovitz (eds). London: Cambridge University Press.
Levelt, W. J. M., Van Gent, J. A. W. M., Haans, A. F. J. and Meijers, A. J. 1977. Grammaticality, paraphrase, imagery. In Acceptability in language, S. Greenbaum (ed.), 87–101. The
Hague: Mouton.
Levin, B. and Rappaport Hovav, M. 1995. Unaccusativity: At the syntax-lexical semantics
interface. Vol. 26. Linguistic Inquiry Monograph Series. Cambridge, Mass.: MIT Press.
Macdonald, M.-E. C., Pearlmutter, N. J. and Seidenberg, M. A. 1994. Syntactic ambiguity
resolution as lexical ambiguity resolution. In Perspectives on sentence processing, C.
Clifton Jr., L. Frazier and K. Rayner (eds), 123–154. Hillsdale, NJ: Erlbaum.
Mandell, P. B. 1999. On the reliability of grammaticality judgment tests in second language
acquisition research. Second Language Acquisition 15 (1): 73–99.
Marslen-Wilson, W., Tyler, L. K., Waksler, R. and Older, L. 1994. Morphology and meaning
in the English mental lexicon. Psychological Review 101 (1): 3–33.
Martohardjono, G. 1998. Measuring competence in L2 acquisition: Commentary on part I.
In The generative study of second language acquisition, S. Flynn, G. Martohardjono and
W. O'Neil (eds), 151–157. Mahwah, NJ: Lawrence Erlbaum Associates.
McKoon, G. and MacFarland, T. 2000. Externally and internally caused change of state
verbs. Language 76 (4): 833–858.
Sag, I. 1976. Deletion and logical form. MIT: Doctoral dissertation.
Schütze, C. 1996. The empirical base of linguistics. Chicago: University of Chicago Press.
Sorace, A. 1993. Incomplete and divergent representations of unaccusativity in non-native
grammars of Italian. Second Language Research 9: 22–48.

</TARGET "duf">

Sorace, A. 1996. The use of acceptability judgments in second language acquisition
research. In Handbook of language acquisition, T. Bhatia and W. Ritchie (eds). New
York: Academic Press.
Sorace, A. 2000. Gradients in auxiliary selection with intransitive verbs. Language 76 (4):
859–890.
Tanenhaus, M. and Carlson, G. N. 1990. Comprehension of deep and surface verbphrase
anaphors. Language and Cognitive Processes 5 (4): 257–280.
Van Hout, A. 1993. On unaccusativity: The relation between argument and aspect. Paper
presented at the Arbeitsgruppe Strukturelle Grammatik, MPG, Berlin.
Ziff, P. 1964. On understanding utterances. In The structure of language: Readings in the
philosophy of language, J. A. Fodor and J. J. Katz (eds). Englewood Cliffs: Prentice Hall.

<LINK "dyk-n*">

<TARGET "dyk" DOCINFO AUTHOR "Ton Dijkstra"TITLE "Lexical storage and retrieval in bilinguals"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 6

Lexical storage and retrieval in bilinguals*


Ton Dijkstra
NICI/University of Nijmegen

1. Introduction

Language users, monolinguals and bilinguals alike, usually communicate in
sentences. Because sentences consist of words, a complete understanding of
how language users process sentences includes an understanding of how they
recognize their constituent words. Although language users appear to recognize
words embedded in sentences of their mother tongue almost effortlessly, the
underlying word recognition process must surely be very complex. First, word
identification must depend on the characteristics of the lexical item itself, for
instance, on how often it has been encountered in the past (e.g. does it have a high
or low frequency of usage?) and on whether it is ambiguous with respect to its
syntactic category (e.g. is 'dance' used as a noun or verb?) or semantics (e.g. does
'bank' refer to the river side or the institution?). In addition, a word's recognition
process could be affected by the syntactic and semantic aspects of the preceding
sentence context, which may be more or less constraining or predictive.
For bilinguals reading in their second language (L2), the recognition of
words in sentences must be even more complex, because several additional
factors come into play. At the lexical level, for instance, it is likely that the
subjective frequency of the L2 words is considerably lower than that of their L1
words (due to the participants' lower proficiency in L2), making L2 words
harder to recognize. Furthermore, it may not always be clear in advance which
language a presented word belongs to, because a word form may be ambiguous
across languages (e.g. the word 'room' occurs both in English and in Dutch, but
in Dutch means 'cream') or because there may be code switches (language
alternations) in the sentence (e.g. 'dit is een voorbeeld van bottom-up processing',
meaning 'this is an example of bottom-up processing'). In addition, the syntactic
and semantic aspects of the preceding sentence are not the only possible constraints that may affect target word recognition: there is an additional factor,
namely the language of the preceding words, which might provide an independent source of constraint on bilingual word recognition.
Because the recognition of words in sentence context is so complex, it is no
wonder that most studies in this area during the last decades have focussed on
the bilingual's recognition of isolated words. This process already requires a
distinction of different types of structural representations for words (for
instance, orthographic, semantic, and phonological; a storage issue), and is
inextricably bound up with how words are retrieved (a processing issue), for
which purposes they are retrieved (issues with respect to task demands and
instruction), and in which non-linguistic context their retrieval takes place (e.g.
is the word positioned within a stimulus list containing words from the same or
a different language?).
The major part of this chapter is specifically concerned with the issue of
isolated word recognition, and it addresses the following questions:
1. Structure: which representations are activated during bilingual word
recognition?
2. Process: what is the time-course of activation for words of different languages?
3. Contextual constraints: how do non-linguistic experimental factors, such as
participant expectancies and instruction, affect lexical selection in bilinguals?
In the short second part of this chapter, we will discuss a few recent studies that
have examined effects of linguistic context (syntactic, semantic, and lexical) on
the recognition of words in sentence context. At present, only a handful of
reaction time (RT) studies have been done, investigating quite diverse linguistic
questions within the perspective of different theoretical frameworks. Given this
sorry state of affairs, we will focus on the more coherent studies that have
investigated the bilingual's brain activity during the processing of syntactically and
semantically incorrect sentences in terms of Event-Related Potentials (ERPs).
We will argue that while semantic processing may be quantitatively different
between monolinguals and bilinguals, syntactic parsing may be both quantitatively and qualitatively different, in complex ways that depend, for instance, on
the L2 proficiency of the bilinguals involved.

2. How bilinguals recognize words presented in isolation


Before we examine recent studies on the recognition of isolated words by
bilinguals, we will first consider three important general aspects of their
organization. First, what kind of bilinguals participated in these studies?
Second, what type of stimulus materials did they involve? Third, what were the
empirical paradigms the studies used for investigation?
Perhaps unexpectedly for some readers, the term 'bilinguals' in many psycholinguistic studies does not specifically refer to those language users who use the
words of two different languages at the same rate and with the same ease (so-called 'balanced bilinguals'). Instead, the bilinguals participating in these studies
usually are persons who use their two languages in daily life but to a different
extent, implying that they are more proficient in one language (usually their
native language or L1) than the other (their second language or L2; they are
'unbalanced bilinguals'). In many of the studies we will be talking about,
participants are university students, who are quite proficient in English (generally having eight or more years of experience) but have a different language
(often Dutch) as their strongest, native language. These bilinguals, who have
acquired their L2 relatively late (at puberty or later), will be referred to as 'late
bilinguals', while earlier L2 acquisition makes bilinguals 'early bilinguals'.
The stimulus materials that are often used in studies on isolated words are
referred to as 'interlingual homographs' and 'cognates'. Interlingual homographs are words that have identical orthographic representations across
languages but different semantics (such as 'angel', meaning 'heavenly messenger'
in English, but 'sting' in Dutch), while cognates are words that overlap across
languages in both their orthographic form and their meaning (e.g. 'film'). To
address how words of different languages are stored and retrieved in bilinguals,
RTs to interlingual homographs or cognates are compared to matched one-language control items in different experimental paradigms. Any latency
differences that arise between the two item types are assumed to be a consequence of the special bilingual status of interlingual homographs and cognates.
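The comparison logic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the reaction times are invented placeholder values (not data from any study cited in this chapter), and the function name latency_difference is hypothetical.

```python
from statistics import mean

# Hypothetical reaction times in milliseconds. These are invented
# placeholder values, not results from any study discussed here.
rts_ms = {
    "homograph": [612, 598, 640, 625],
    "control": [580, 590, 605, 585],
}


def latency_difference(test_rts, control_rts):
    """Return mean(test) - mean(control) in ms.

    A positive value means the test items were answered more slowly
    than their matched controls (inhibition); a negative value means
    they were answered faster (facilitation).
    """
    return mean(test_rts) - mean(control_rts)


effect = latency_difference(rts_ms["homograph"], rts_ms["control"])
if effect > 0:
    label = "inhibition"
elif effect < 0:
    label = "facilitation"
else:
    label = "no difference"
print(f"{effect:+.2f} ms ({label})")
```

In an actual study the difference would of course be evaluated statistically over participants and items; the sketch only shows the direction of the comparison.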
Among the many experimental paradigms used in bilingual RT studies are
variants of lexical decision, language decision, progressive demasking, (language) go/no-go, word naming, and word association. In a language-specific
lexical decision task, participants press one button if they encounter a word in
the target language, and another button if they see a nonsense letter string or
non-word. For instance, in the English lexical decision task, participants press
a 'yes' button if a presented letter string is an English word, and a 'no' button if it is
not. In a generalized lexical decision task, participants press the 'yes' button for
any word they encounter, irrespective of its language. A presented word could,
for instance, be Dutch or English. In contrast, in the language decision task
nonsense letter strings (non-words) do not occur. Only words are presented,
belonging to one of two languages. One button must be pressed when a word
belongs to one language (e.g. English), and another if it belongs to the other
language (e.g. Dutch). In the go/no-go task, only words from two languages are
presented as well. However, participants react only when they identify a word
from the target language, for instance, English (English go/no-go), but they do
nothing if a word of the non-target language (for instance, Dutch) is presented.
In progressive demasking, participants identify a word that is presented in an
alternating sequence with a pattern mask, for instance a checkerboard or a row
of hash marks. Across alternations, the pattern mask decreases in duration,
while stimulus presentation increases, until the target word is recognized. In
word naming, the participants must read aloud words in the target language, for
instance, in their mother tongue (e.g. Dutch) or in their second language (e.g.
English). Finally, in word association, participants respond to a presented word
by producing the first word that comes to their mind in the target language.
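The response requirements of the four button-press tasks described above can be summarised as a small decision function. This is a sketch for illustration only: the function name, the task labels and the return values are hypothetical, and the 'no' response to non-words in generalized lexical decision is an assumption that the description above leaves implicit.

```python
from typing import Optional


def expected_response(task: str, stimulus_language: Optional[str],
                      target_language: str = "English") -> Optional[str]:
    """Return the expected response, or None when no response is required.

    stimulus_language is the language of a presented word ("English" or
    "Dutch" in the examples used here), or None for a non-word.
    """
    if task == "language_specific_lexical_decision":
        # 'yes' only for words of the target language.
        return "yes" if stimulus_language == target_language else "no"
    if task == "generalized_lexical_decision":
        # 'yes' for any word, irrespective of language.
        # Assumption: non-words receive a 'no' response.
        return "yes" if stimulus_language is not None else "no"
    if task in ("language_decision", "go_no_go"):
        if stimulus_language is None:
            raise ValueError("non-words are not presented in this task")
        if task == "language_decision":
            # One button per language.
            return stimulus_language
        # Go/no-go: respond only to target-language words.
        return "go" if stimulus_language == target_language else None
    raise ValueError(f"unknown task: {task}")


for task in ("language_specific_lexical_decision",
             "generalized_lexical_decision",
             "language_decision", "go_no_go"):
    print(task, "->", expected_response(task, "Dutch"))
```

Laying the tasks side by side in this way makes it visible that the same Dutch word calls for a different response in each of them, which is the sense in which task demands can be varied while the stimulus is held constant.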
2.1 Structure: Which representations are activated during bilingual
word recognition?
Research has shown that in the initial stages of monolingual word recognition,
an input letter string leads to the activation of multiple word candidates in the
mental lexicon that closely match the input (see Grainger and Dijkstra 1996).
For instance, when the letter string 'word' is presented to English monolinguals,
the stored representations for words like 'word', 'cord', 'ward', 'wood', and 'work' will
initially become active (such candidates are called 'neighbours'). As the word
identification process proceeds, inappropriate candidates will gradually be
reduced in activation and no longer be considered as a possible input word.
Finally, only the lexical representation corresponding to the presented word
remains active and becomes recognized. For bilinguals, the interesting question
arises whether word candidates from different languages are activated if they overlap
sufficiently with the input letter string. For instance, are both the English and
the Dutch readings of the interlingual homograph 'angel' activated in parallel if
Dutch-English bilinguals read the word in an English book?
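The notion of a neighbour can be made concrete with a short sketch. It assumes one common operationalisation, namely that a neighbour has the same length as the input and differs from it in exactly one letter position; the text above only requires that candidates 'closely match the input', so this definition is an assumption. The toy lexicons are built from example words used in this chapter, and the function names are hypothetical.

```python
def is_neighbour(a: str, b: str) -> bool:
    """True if a and b have equal length and differ in exactly one letter position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def neighbours(target, lexicon):
    """Return the neighbours of target in lexicon, in alphabetical order."""
    return sorted(w for w in lexicon if is_neighbour(target, w))


# Toy lexicons for illustration only.
toy_english = {"word", "cord", "ward", "wood", "work", "bank", "dance"}
toy_dutch = {"bord", "wond", "room"}

# Candidates when only the English lexicon is searched.
print(neighbours("word", toy_english))
# -> ['cord', 'ward', 'wood', 'work']

# Candidates when the lexicons of both languages are searched together.
print(neighbours("word", toy_english | toy_dutch))
# -> ['bord', 'cord', 'ward', 'wond', 'wood', 'work']
```

The two calls differ only in which lexicon is searched, which is one simple way of picturing the question raised above: whether the candidate set for a bilingual reader is drawn from one language or from both.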
According to the language-selective access view on bilingual word recognition, only lexical candidates of the task-relevant language (in this example
English) are activated (Gerard and Scarborough 1989, Macnamara and Kushnir
1971). Thus, when the word 'word' is presented, Dutch word candidates that are
similar to 'word', like 'bord' and 'wond', would not become activated. In contrast,
according to the language non-selective access view, word candidates from
both languages (here English and Dutch) become active (Altenberg and Cairns
1983, Grainger and Beauvillain 1987, Van Heuven, Dijkstra and Grainger 1998).
As we shall see later, the majority of recent studies support the language non-selective access hypothesis. However, formulating the lexical access views in this
general way ignores at least two important points. First, the issue of (non)selective access should be differentiated with respect to different types of lexical
representations: e.g. orthographic, phonological, and semantic representations.
Second, the answer to the question might depend on whether one is processing
in one's native language (L1) or in a second language (L2).
Dijkstra, Grainger and Van Heuven (1999) found evidence of cross-linguistic competition between words of different languages that are similar in
form and/or meaning. They investigated whether Dutch-English bilinguals
recognized interlingual homographs faster or slower than matched one-language control items in English lexical decision and progressive demasking
tasks. The English stimulus words varied in their degree of orthographic (O),
phonological (P), and semantic (S) overlap with Dutch words. Examples of
items in their six test conditions are 'sport' (overlap in S, O, and P codes), 'wild'
(SO), 'wheel' (SP), 'pink' (OP), 'angel' (O), and 'pace' (P). The first two conditions
(SOP and SO conditions) consist of what are usually called 'cognates', while the
last three conditions contain interlingual homographs (OP and O conditions)
or interlingual homophones (P condition).
Participants were faster to make a lexical decision to the target words with
cross-linguistic overlap than to exclusively English control words if the overlap
was orthographic and/or semantic in nature (e.g. in the SO and O conditions).
In contrast, cross-linguistic phonological overlap produced inhibitory effects.
Responses to test items of the P condition, for example, were slower than to
matched purely English words.
To show that the observed result pattern did not arise because the test and
control items were not well matched in some unknown aspects, the lexical decision
experiment was replicated with American-English monolinguals. For these
participants, no RT or error differences were found between test items and their
matched controls. Furthermore, to show that the results were not restricted to
lexical decision, the experimental materials were also included in a progressive
demasking task. The result pattern obtained with this paradigm was strikingly
similar to that in lexical decision, indicating that it was not the task that induced
the facilitation and inhibition effects for homographs relative to controls.
In sum, a presented word form in L2 appears to initially activate orthographic, phonological, and semantic lexical representations in both L2 and L1.
The opposite effects of orthographic and phonological overlap may help to
explain observed differences in the result patterns of other available studies,
because the stimulus materials in these studies may have varied in terms of the
degree of cross-linguistic phonological overlap and therefore in the relative
amount of inhibition (e.g., Dijkstra, Van Jaarsveld and Ten Brinke 1998, Font
2001, Gerard and Scarborough 1989, Von Studnitz and Green 2002).
In a follow-up experiment (Van Heuven and Dijkstra 1999), English
pseudo-homophones were added to the stimulus list. Pseudo-homophones are
nonsense words for which the pronunciation sounds like a real word, like brane
and bloo. The reasoning behind this manipulation was that the presence of such
items would discourage the use of phonology, and would therefore lead to a
reduction of the earlier found phonological inhibition effect (see Davelaar,
Coltheart, Besner and Jonasson 1978, for similar arguments in a monolingual
study). A reduction of the phonological inhibition effect was indeed found in
several conditions, but the effect did not disappear completely in conditions
where cross-linguistic overlap occurred in several codes (e.g. SOP). This finding
suggests that phonology may be re-activated by interactions between codes. For
instance, via its semantic and orthographic overlap an interlingual homograph
like film might reactivate its phonology in both languages (see Gottlob, Goldinger, Stone and Van Orden 1999, for such resonance effects in a comparable
monolingual study; and Sebastián-Gallés and Kroll in press, for an overview of
the role of phonology in bilingual lexical processing).
Recent studies by Van Hell and Dijkstra (2002) and Font (2001) indicate
that language non-selective access also occurs for cross-linguistically ambiguous
target words of the native language (L1) and even when targets are not completely identical in form across languages. In the study by Van Hell and Dijkstra
(2002), trilinguals with Dutch as their native language, English as their second
language, and French as their third language performed a word association task
or a lexical decision task in their L1 (Dutch). Stimulus words were (mostly)
non-identical cognates such as tomaat or non-cognates. Shorter association and
lexical decision times were observed for Dutch-English cognates than for non-cognates. For trilinguals with a more equal (high) proficiency in French and
English, faster responses in lexical decision were found for both Dutch-English
and Dutch-French cognates. In other words, even when their orthographic and
phonological overlap across languages is incomplete, cognates may be recognized faster than non-cognates.
For French-Spanish bilinguals, Font (2001) has found that in lexical decision
cognates differing in one letter between languages (called neighbour cognates
by her) are still facilitated but significantly less so than identical cognates.
Furthermore, she has shown that the amount of facilitation that is observed
depends on the position of the deviating letter in the word. Neighbour cognates
with the different letter at the end of the word (e.g. French texte – Spanish texto)
are facilitated more than neighbour cognates with the different letter inside (e.g.
French usuel – Spanish usual). In fact, facilitatory effects for the latter type of
cognate disappeared and effects tended towards inhibition when such cognates
were of low frequency in both languages. Similar patterns of results were found
in both L1 and L2 processing.
These results make it likely that the size of RT effects observed for cognates
and interlingual homographs depends on their degree of cross-linguistic
overlap (also cf. Cristoffanini, Kirsner and Milech 1986). Note that it follows
logically that across language pairs that do not share orthography at all (e.g.
Chinese and English), no orthographically similar word candidates can be
activated, while eects of phonological similarity might still occur (depending
on, for instance, the way tonal information affects the establishment of the set
of lexical candidates).
2.2 Process: What is the time-course of activation for words from different languages?
The issue of language (non)selective access can also be examined from a
processing point of view by considering the time-course of lexical activation and
selection in bilingual word recognition. As a first question, we may consider the
rate of code activation in L1 and L2: how fast do orthographic, phonological, and
semantic codes from the two languages become active? From the monolingual
domain, we know that high frequency words are generally recognized faster
than low frequency words, and, because the words of L2 must have a lower
subjective frequency than those of L1 (simply because the former have been
encountered less often), it seems likely that L2 codes become available more slowly
than L1 codes. A comparison of the study by Dijkstra et al. (1999), discussed in
the previous section, with a study by Lemhöfer and Dijkstra (submitted)
provides information that supports this viewpoint.
Dijkstra et al. (1999) showed that in a lexical decision task where L2
(English) was the target language of the bilingual participants, cross-linguistic
effects arose for L1-L2 (Dutch-English) homographs with respect to all three
codes. Because English was the target language in this task, task execution
implied the verification of the English language membership of possible word
candidates, even when Dutch codes would be available faster than English ones.
In other words, Dutch codes had time to establish themselves and exert effects
on later available English targets that were necessary for responding.
Lemhöfer and Dijkstra (submitted) presented the same stimulus materials
to Dutch-English bilinguals in a generalized lexical decision task. In this task,
participants responded with "yes" to both English and Dutch words, but with
"no" to non-words. In contrast to English lexical decision, participants in this
task can use both Dutch and English lexical representations as a reliable basis
for response. Thus, in this task cross-linguistic effects will arise only to the
extent that L1 and L2 codes can affect each other before the fastest codes
(usually Dutch ones, we assume) are retrieved and responded to. The results of
this study were quite clear: no facilitation effects arose for interlingual homographs, while cognates were facilitated relative to control words. The pattern of
results for homographs indicates that responses were based upon the fastest
available code, usually the Dutch orthographic code, while cross-linguistic
overlap with respect to semantics in the case of cognates apparently can be used
to speed up the response.
In sum, even though L1 and L2 codes become active in parallel, L2 codes
are often activated more slowly than L1 codes, probably because of differences
in subjective frequency between languages. As a consequence, the development
of cross-linguistic effects depends on the target language in the experiment (L1
or L2) and on other temporal characteristics of the task involved.
There is a different way of approaching the issue of the time-course of
lexical selection in bilinguals. Rather than asking how fast lexical representations from different languages become active, we might wonder how long they
remain active. Even if there is an initial activation of various codes from
different languages, lexical selection might be relatively fast or slow afterwards.
This issue has been investigated in experimental studies by varying the frequency ratio of the two readings of interlingual homographs (e.g. angel is relatively
more frequent in English than in Dutch).
Dijkstra, Timmermans and Schriefers (2000) examined how long the two
readings of an interlingual homograph compete for selection and whether
language information provided by the item can be used to facilitate the selection of one of these readings. In three experiments, each with a different
instruction, bilingual participants processed the same set of homographs
embedded in identical mixed-language lists. Homographs of three types were
used: high-frequency in English and low-frequency in Dutch; low-frequency in
English and high-frequency in Dutch; and low-frequency in both languages. In the
first experiment (involving language decision), one button was pressed when an
English word was presented and another button for a Dutch word. In the
second and third experiments participants reacted only when they identified
either an English word (English go/no-go) or a Dutch word (Dutch go/no-go),
but they did not respond if a word of the non-target language (Dutch or
English, respectively) was presented.
In all three tasks, clear inhibition effects arose for homographs relative to
one-language controls. Even in the Dutch go/no-go task for Dutch-English
bilinguals performing in their native language, participants were unable to
completely exclude effects from the non-target language on homograph
identification. More important for the present discussion, however, is the
finding that target-language homographs were often overlooked, especially if
the frequency of their other-language competitor was high. The relative
frequency of the two readings of the interlingual homograph was found to affect
both RTs and error rates. In the Dutch go/no-go task, participants did not
respond to low-frequency items belonging to their native language in about 25
percent of the cases!
Inspection of cumulative distributions showed that if participants had not responded
after about 1500–1600 ms, they no longer responded within the time
window of two seconds. The observed flattening of the cumulative distribution
towards an asymptotic value suggests that recognition of the homograph
reading from the non-target language in some way prohibits the subsequent
recognition of the target language reading (e.g. after recognition, all other
lexical candidates may be suppressed). Thus, selection of one of the readings of
the interlingual homographs takes place rather late during processing. The
results suggest that until that time both readings of a presented homograph are
involved in a (frequency-dependent) race to recognition that is won by the
fastest candidate.
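The frequency-dependent race described above can be made concrete with a small simulation. The sketch below is purely illustrative and is not a model proposed by Dijkstra, Timmermans and Schriefers (2000): the function names, the logarithmic mapping from frequency to recognition time, and all parameter values are assumptions chosen for the example. It assumes that each reading of a homograph finishes at a noisy time that decreases with its frequency, and that a go/no-go target is missed whenever the non-target reading finishes first.

```python
import math
import random


def finishing_time(freq_per_million, rng, base_ms=900.0, gain_ms=40.0, noise_sd=80.0):
    """Hypothetical recognition time: shorter for more frequent readings, plus Gaussian noise."""
    return base_ms - gain_ms * math.log10(freq_per_million) + rng.gauss(0.0, noise_sd)


def miss_rate(target_freq, competitor_freq, trials=10_000, seed=1):
    """Proportion of simulated trials on which the non-target reading wins the race."""
    rng = random.Random(seed)  # fixed seed so the illustration is reproducible
    misses = 0
    for _ in range(trials):
        target_time = finishing_time(target_freq, rng)
        competitor_time = finishing_time(competitor_freq, rng)
        if competitor_time < target_time:
            misses += 1  # assumed: the winning reading blocks the target reading
    return misses / trials


# Illustrative frequencies (occurrences per million), not taken from the study.
low_target_high_competitor = miss_rate(target_freq=2, competitor_freq=200)
high_target_low_competitor = miss_rate(target_freq=200, competitor_freq=2)
print(low_target_high_competitor, high_target_low_competitor)
```

Under these assumptions, targets with a low-frequency reading and a high-frequency competitor are missed more often than the reverse, which mirrors the direction of the reported effect only qualitatively. The sketch makes no attempt to fit the observed miss rates or the L1 advantage.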
It is clear that the system must at some time arrive at a selection of one
lexical item only, but apparently the role played by the language of that item in
aiding selection is only minor. In fact, determination of the language of the item
may depend on lexical selection having taken place. In addition, it does not
seem possible to discard the homograph reading from the non-target language
and to focus on the target reading only on the basis of the instruction that just
the target language needs to be responded to.
Similar results were found when the target language was English (L2) and
when it was Dutch (L1), even though fewer target words were overlooked in the
second case. Again, this finding points to activation from both readings of the
interlingual homographs irrespective of whether the target language is the
native language or a second language.
This study allows two important additional conclusions. First, there appear
to be serious limitations on the degree of control that participants can exert on
the relative activation of their two languages. Second, the selection of the target
word appears to be based on item characteristics (such as word frequency) and
not on the language membership of the item. Language membership appears to
be available relatively late (maybe only after item identification) and therefore
cannot help to speed up lexical selection.
2.3 Contextual constraints: How do non-linguistic experimental factors affect lexical selection in bilinguals?
In the previous sections, we have argued that under the experimental circumstances of the presented studies involving isolated words, the word recognition
system functions in a language non-selective way. However, that the system can
function in a non-selective way does not imply that it does so irrespective of the
experimental circumstances. In the following two sections, we will consider to
what extent the observed language non-selectivity may be modulated by
context. We will make a distinction between two types of contextual factors:
non-linguistic or experimental and linguistic. Non-linguistic or experimental
context aspects are concerned with participant expectations based on, for
instance, (the explicitness of the) instruction and task demands. Linguistic
context aspects have to do with lexical, syntactic, semantic, and language
information, such as provided by a sentence context. We note that for lists of
individual items, stimulus list composition and in particular language intermixing could have both linguistic (lexical) and non-linguistic effects.
Language intermixing refers to whether an experiment contains exclusively
items belonging to one language (blocked presentation) or items from two
languages (mixed presentation). The effects of language intermixing and task
instruction on bilingual word recognition were the focus of three Dutch-English lexical decision experiments by Dijkstra, Van Jaarsveld and Ten Brinke
(1998). In Experiment 1, Dutch bilingual participants performed an English
lexical decision task including Dutch-English homographs, cognates, and purely
English control words. The mean RTs to interlingual homographs were
unaffected by the frequency of the Dutch reading and did not differ from those
to monolingual controls. In contrast, cognates were recognized faster than
controls. In Experiment 2, Dutch participants again performed an English
lexical decision task including interlingual homographs, but, apart from non-words, Dutch words were also incorporated, requiring a "no" reaction. Strong
inhibition effects were now obtained for interlingual homographs relative to
English control words. The size of the inhibition effect depended on the relative
frequency difference of the two readings of the homograph. It was largest when
the Dutch reading of the homographs had a high frequency relative to the
English reading. In Experiment 3, Dijkstra et al. (1998) used the same stimulus
materials but changed the task demands. Participants now performed a general
lexical decision task, responding "yes" if a word of either language was presented
(rather than saying "no" to Dutch words). In this experiment, frequency-dependent facilitation effects were found for the interlingual homographs
(relative to English control words).
The authors argued that the null-results for interlingual homographs in the
first experiment did not constitute conclusive evidence that bilingual word
recognition involves a language selective access process, because in that case the
different stimulus list composition of Experiment 2 should not have affected the
results. Instead, the results of Experiment 2 were considered as evidence
supporting the language non-selective access view, and this view was tested and
supported again under the different task demands of Experiment 3. This last
experiment further showed that task demands may affect the direction of the
observed effects: changing the task from language specific lexical decision in
Experiment 2 towards generalized lexical decision in Experiment 3 turned the
inhibition effects of Experiment 2 into facilitation effects in Experiment 3.
The null-effects in Experiment 1 make one wonder if the Dutch readings of
the interlingual homographs were activated at all in this English lexical decision
task. Recent reanalysis of the data suggests they were. A regression analysis
showed that (despite the overall null results) homograph responses became
slower as the frequency of their Dutch reading increased, while they became
faster with increasingly high English frequency readings. Furthermore, De Moor
(1998) demonstrated that the L1 semantics of the interlingual homographs was
apparently activated as well. De Moor first replicated the null-result for homographs relative to controls. Then, on the trial after the homograph appeared, she
presented the English translation of its Dutch reading. For instance, brand was
followed by fire, which is the English translation of the Dutch word brand. A
small but reliable translation priming effect of 11 ms was found. In a replication
of this experiment with different stimulus materials, Van Heste (1999) observed
a reliable 35 ms difference between translation and control trials. The Dutch
reading of the homograph on the previous trial had apparently been activated
even though this did not affect its RT (cf. De Bruijn, Dijkstra, Chwilla and
Schriefers 2001).
Finally, Dijkstra et al. (1999) performed an analogous experiment that was
reviewed in Section 2, also involving an English lexical decision task with
interlingual homographs and controls. In this study, significant facilitation
effects were found for homographs having cross-linguistic overlap in orthography but not in phonology (stage), and no effects for items with overlap in both
(step). The items in this study were comparable to those in Dijkstra et al. (1998),
making it likely that the null-effects in the earlier study were due to mixing the
two types of items. (Indirect support for this reasoning comes from a Spanish
lexical decision study involving French-Spanish bilinguals by Font (2001: 115),
who found facilitatory effects for French-Spanish homographs that had little
phonological similarity across languages.)
Several other accounts have been proposed for the null-results in Experiment 1 and the inhibitory effects in Experiment 2 from Dijkstra et al. (1998).
These accounts have either referred to differences in the relative activation of
words from the two languages in Experiments 1 and 2 (Dijkstra et al. 1998,
Grosjean 2001), or to differences in participants' decision strategies (De Groot et
al. 2000, Dijkstra et al. 2000). Let us take a closer look at the various proposals.
Dijkstra et al. (1998) assumed that the degree of activation of Dutch (the
non-target language) was higher in Experiment 2 than in Experiment 1, because
Dutch words were only included in Experiment 2. As a consequence, the
English readings of the interlingual homographs suffered from more competition by the Dutch reading in Experiment 2. As an underlying mechanism, this
view assumes that lexical activation effects can last across trials and can affect
relative language activation. In sum, the different results in the two experiments
are assumed to be a consequence of different bottom-up activation processes
due to the composition of the stimulus list. This is basically an explanation in
terms of lexical context effects.
Similarly, Grosjean (2001) interpreted the results in terms of the participants' language mode, referring to the relative state of activation of the
bilingual's languages and language processing mechanisms. The mode is
monolingual if only one language is relatively active and bilingual if both
languages are active (though one language may be more active than the other).
In Experiment 1, the participants only read English words and non-words
(although some words were homographs and cognates) and they were instructed to decide whether the items were English words or not. This would have
positioned them towards the monolingual end of the mode continuum, but
they did not reach this position totally as they knew they were being tested as
bilinguals. Thus, although their Dutch was partly active (which would explain
the cognate effect) it was not sufficiently active to create a homograph effect. In
sum, Grosjean (2001) proposed that both the participants' expectancies with
respect to the English lexical decision task and the degree of language intermixing (encountering mostly English words) affected the bilinguals' performance.
This explanation implies that both non-linguistic and linguistic context aspects
affected relative language activation.
De Groot et al. (2000) replicated the null-results observed in Experiment 1
by Dijkstra et al. (1998) using different stimulus materials and different Dutch-English bilinguals. They proposed that the participants were instructed to
perform a language specific English lexical decision task, but on some trials
may instead have treated the task as a language neutral lexical decision task.
The adoption of a language specific processing mode would induce slower
responses to homographs than to matched controls due to lexical competition
between the activated target and non-target readings of the interlingual
homographs (just as in Experiment 2 by Dijkstra et al. 1998). In contrast, in a
language neutral processing mode the response to a homograph would be
based on the availability of any reading, irrespective of language, and homographs could then be responded to faster than controls (as in the generalized
lexical decision task of Experiment 3 by Dijkstra et al. 1998). In sum, a mixture
of the two processing modes adopted by the participants led to a mixture of
facilitation and inhibition effects for homographs relative to controls, yielding
an overall null-result. (Note that this account would predict larger standard
deviations for the homographs in the condition where the Dutch reading of the
homograph has a high frequency than in the conditions where it has a low
frequency, because in the former type of condition, the Dutch reading would be
available much faster than the English reading, while that would not be the case
for the latter type of words.)
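The arithmetic behind this mixture account can be sketched in a few lines. The example below is only an illustration under stated assumptions: the function name, the control RT of 600 ms, the effect sizes, and the equal mixing proportion are all hypothetical and are not taken from De Groot et al. (2000). It treats each homograph response as coming from one of two modes, a faster language neutral mode and a slower language specific mode, and ignores any trial-to-trial noise within a mode.

```python
import math


def mixture_stats(control_rt, facilitation, inhibition, p_neutral):
    """Mean and standard deviation of a two-mode mixture of homograph RTs.

    With probability p_neutral the response is facilitated (control_rt - facilitation);
    otherwise it is inhibited (control_rt + inhibition). All values are hypothetical.
    """
    fast = control_rt - facilitation
    slow = control_rt + inhibition
    mean_rt = p_neutral * fast + (1.0 - p_neutral) * slow
    # Standard deviation of a two-point distribution.
    sd_rt = math.sqrt(p_neutral * (1.0 - p_neutral)) * (slow - fast)
    return mean_rt, sd_rt


# Assumed: effects are larger when the Dutch reading has a high frequency.
high_dutch_freq = mixture_stats(600.0, facilitation=60.0, inhibition=60.0, p_neutral=0.5)
low_dutch_freq = mixture_stats(600.0, facilitation=20.0, inhibition=20.0, p_neutral=0.5)
print(high_dutch_freq, low_dutch_freq)
```

In both hypothetical conditions the mean equals the control RT, so the overall comparison shows a null result, while the spread is larger in the condition with the larger opposing effects. This is the pattern of standard deviations that the account would predict.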
Dijkstra, De Bruijn, Schriefers and Ten Brinke (2000) pointed out that the
participants in the three studies that reported null-results were apparently not
told in advance that some of the presented letter strings would be words in both
Dutch and English. Participants might sometimes have adopted a language
neutral processing mode because they were in an uncertain situation. To
disentangle the effects of instruction and language intermixing, Dijkstra et al.
designed an experiment that combined features of Experiments 1 and 2 by
Dijkstra et al. (1998). Participants were explicitly instructed that they would
encounter Dutch words requiring a "no" response and were provided with
examples in the practice set. However, exclusively Dutch words were presented
only in the second part of the experiment. No significant RT differences were
found between the interlingual homographs and matched English control items
in the first part of the experiment. In contrast, strong inhibitory effects for
homographs relative to control words appeared in the second part. Examination
of the transition from Part 1 to Part 2 showed that, as soon as Dutch items
started to come in, the RTs to homographs were considerably slowed down
compared to control words. These results converge completely with those of
Experiments 1 and 2 by Dijkstra et al. (1998) discussed above. They suggest that
language intermixing rather than instruction-based expectancies drives the
bilingual participants' performance. Instead of interpreting the pattern of
results in Part 1 and Part 2 of the experiment as evidence for differences in
relative language activation (depending on the local absence or presence of
Dutch items in the stimulus list), Dijkstra et al. propose that participants used
different decision criteria in the two parts of the experiment, depending on the
types of lexical items they encountered.
Ignoring the details of the proposed underlying mechanisms, we can draw
a number of general conclusions on the basis of these and other studies (see
Dijkstra and Van Heuven (2002), for an elaborated model of bilingual word
recognition based on this evidence). First, word candidates from both target
and non-target languages are activated in parallel in a bottom-up way (via the
signal), even though their rates of activation may differ depending on subjective
frequency. Second, stimulus list composition and task demands are important
determinants of the response patterns. Third, task demands, instruction details,
and other top-down information sources do not override the activated
bottom-up information; instead, the activated representations in the two
lexicons are used for responding in accordance with the requirements of the
task at hand. In all, the conclusion is that for isolated words presented in
stimulus lists, bilingual word recognition is based on the input signal and is
basically automatic. Non-linguistic context effects (due to the composition of
the stimulus list, the specific instruction, or the task to be performed) appear to
affect the decision criteria that are used to accept one lexical candidate or
another during the lexical selection process, but not to affect the relative degree
of activation of word candidates from one or the other language.
3. How bilinguals recognize words presented in sentences


In everyday life, the contextual influences on word recognition are not provided
by previous words in an unrelated word list or by the demands of the experimental task that must be performed, but by the syntactic, semantic, lexical, and
language aspects of the sentence context that precedes a particular word that is
to be recognized. In the following section, we briefly consider such linguistic
context effects on word recognition. We will first examine an RT study on the
general effects of sentence context on bilingual word recognition, showing that
there may be complex interactions between different aspects of the sentence
context and word identification. Next, we will zoom in on syntactic and
semantic aspects of sentence processing as reflected in studies that measure the
bilingual's brain activity using Event-Related Potentials.
Altarriba, Kroll, Sholl and Rayner (1996) examined semantic and lexical
form effects of a preceding sentence context on bilingual word recognition in
two experiments. In the first experiment, they monitored the eye movements of
Spanish-English bilinguals who were reading English (L2) sentences that
contained either an English (L2) or a Spanish (L1) target word (Experiment 1).
Sentences provided either high or low semantic constraints on the target words.
An example sentence of the high constraint and Spanish target condition is "He
wanted to deposit all his dinero at the credit union", where dinero is Spanish for
"money". The experiment led to an interaction between the frequency of the
target word and degree of sentence constraint for Spanish target words with
respect to the first fixation duration, but not for English target words. Thus,
when the Spanish target words were of high frequency and appeared in highly
constrained sentences, the participants apparently experienced interference.
This result suggests that sentence constraint influences not only the generation
of semantic feature restrictions for upcoming words, but also that of lexical
features. The high-frequency Spanish word matched the generated set of
semantic features, but not the expected lexical features when the word appeared
in the alternate language (Altarriba et al. 1996: 483). The same pattern of results
was found in a second experiment, where the sentences were presented word by
word using the rapid serial visual presentation (RSVP) technique and participants named the capitalized target word in each sentence.
The findings of this study indicate that linguistic sentence context interacts
with target word recognition, suggesting that linguistic context functions in a
different way than non-linguistic context. Furthermore, it is interesting to note
that the data pattern showed an interaction of word frequency (a lexical
information source) and the sentence constraint, and not of language membership and the sentence constraint. This suggests that (just like for isolated words)
lexical characteristics are more important than language characteristics in the
determination of word recognition in sentences.
Only a limited number of studies have investigated syntactic effects of
sentence context on word recognition in some detail (for a full review, see Kroll
and Dussias in press). Here we will briefly describe a few recent studies that
used Event-Related brain Potentials (ERPs) to compare syntactic and semantic
aspects of sentence processing in bilinguals (Weber-Fox and Neville 1996,
Hahne 2001, Hahne and Friederici 2001). Like the authors of these studies, we
will argue that there are processing differences between monolinguals and
bilinguals with respect to semantic aspects that appear to be mainly quantitative in nature, while the differences with respect to syntax appear to be qualitative as well.
In the study by Weber-Fox and Neville (1996), five groups of Chinese-English bilinguals performed an acceptability judgment task for sentences in
their L2, English, while their EEG was recorded. These groups of participants
had learned English at different ages. Apart from normal control sentences, they
read semantically anomalous sentences in English (e.g. "The scientist criticized
Max's event of the theorem"), sentences that contained violations of English
phrase structure rules (e.g. "The scientist criticized Max's of proof the theorem"),
sentences that contained specificity constraint violations (e.g. "What did the
scientist criticize Max's proof of?"), and sentences that contained subjacency
constraint violations (e.g. "What was a proof of criticized by the scientist?"). In
terms of their brain activity, early L2 learners (those who had learned English
before age 11) responded to the semantic anomalies in a way very similar to
monolingual language users. The other bilingual groups differed from the
monolinguals only quantitatively: an N400 effect, a marker associated with the
processing of semantic anomalies, was present in their EEGs, but it was delayed
in time relative to that in monolinguals.
In contrast, several qualitative differences between proficiency groups were
found with respect to the syntactic processing of phrase structure violations.
First, none of the bilingual groups displayed a so-called early left anterior
negativity (N125) in the EEG that was present in monolinguals. The N125 is an
early effect in the EEG that may reflect automatized first-pass parsing processes.
Second, a later negativity (N300–500) was found in all groups,
which was left lateralized (found in the left hemisphere of the brain) in monolinguals and early bilinguals, but more bilaterally distributed in late bilinguals.
Third, a P600 effect was present in the monolinguals and early bilinguals but
not in the late learners. The P600 effect is considered to be the most important
EEG marker associated with syntactic reanalysis and repair. In sum, late L2
learners consistently displayed large differences in ERP patterns relative to
monolinguals, suggesting that (especially late) syntactic processes are different
in late L2 learners.
Hahne (2001) came to similar conclusions on the basis of an auditory
sentence processing study involving proficient late Russian-German bilinguals and
German monolinguals. Her participants listened to German sentences that were
either correct (e.g. "Die Tür wurde geschlossen", "The door was being closed"),
contained a semantically incorrect item (selection restriction violation: "Der Ozean
wurde geschlossen", "The ocean was being closed"), or contained a syntactically incorrect item
(word category violation: "Das Geschäft wurde am geschlossen", "The shop was
being on closed"). As before, ERP differences in processing semantic incongruities between native and L2 speakers were only quantitative in nature, while
there were qualitative differences with respect to syntactic processing between
the two participant groups. This suggests that the second language learners did
not process syntactic information in the way that native listeners did.
Hahne and Friederici (2001) examined sentence comprehension in Japanese speakers who had learned German as a second language after puberty.
These bilinguals listened to German sentences that were correct or contained
semantic and/or syntactic violations. A variety of differences was found in the
ERPs for the Japanese-German bilinguals and German monolinguals. Semantically incorrect sentences induced an ERP pattern similar for the two groups (an
N400 effect), while correct sentences led to a different pattern (greater positivity) in L2 learners than in native listeners. The latter finding may reflect the
greater difficulties the learners had with respect to syntactic integration. For
sentences containing a phrase structure violation, L2 learners, in contrast to
native listeners, did not show significant modulations of the syntax-related ERP
components mentioned above (the early anterior negativity and the P600).
Furthermore, sentences containing a pure semantic or combined syntactic/semantic
violation elicited effects not found in native listeners. These effects
may reflect additional conceptual-semantic processing in late bilinguals.
These ERP studies indicate that future RT studies examining sentence
processing in bilinguals are likely to yield evidence for complex interactions
between lexical and syntactic knowledge in L1 and L2. Of course, there is a vast
number of research issues to be addressed. For a psycholinguist working on the
bilingual lexicon, one interesting issue to explore is to what extent the language
non-selective access mechanism found at the lexical level also holds at the
syntactic level.
The assumption that the syntactic rules and syntactic categories of different
languages are not incorporated in language-specific databases but in an
integrated store leads to a variety of predictions. For instance, for a Dutch-English bilingual, processing differences could arise between a noun phrase like "the
light of a distant star" and a sentence like "the man sat in his room", because
the interlingual homograph "star" is an adjective in Dutch (meaning "rigid") but
a noun in English, while the homograph "room" is a noun in both languages
(it means "cream" in Dutch).
Furthermore, language non-selective access of syntactic rules might lead to
specific cross-linguistic priming effects. For instance, hearing a sentence like
"the librarian handed the reader a book" might prime the production of "de vader
gaf het meisje een appel" ("The father gave the girl an apple") but not "de vader gaf
een appel aan het meisje" ("The father gave an apple to the girl") (cf. Bock 1986).
Another interesting issue is whether there is a separate effect of the language
of the sentence context on the recognition of a target word. In other words,
could a noun phrase or larger sentence context elicit some kind of language
frame that affects the processing of later arriving words? In that case, processing
a sentence with a code-switch like "I see a huis" might be more difficult than
processing a regular sentence like "I see a house", simply because the words in
the first sentence context do not all belong to one and the same language.
Finally, it seems likely that future RT studies on bilingual syntactic processing
will yield some unexpected results, due to the complex interactions between lexical, syntactic, and semantic factors. One possibility is that
quantitative differences in working memory capacity for L2 syntactic processing
may lead to qualitative processing consequences between L2 monolinguals and
bilinguals (cf. Michael, Dijkstra and Kroll 2002). For instance, in a pilot study
in our lab we found that although both Dutch and German readers tended to
resolve local ambiguities in subject- and object-relative clauses in their L1 by
using syntactic information only, Dutch-German readers in their L2 used
semantic information as well (Caelen 1998, but also see Frenck-Mestre and
Pynte 1997). A similar pattern was found in L1 readers under higher processing
load conditions.
All these exciting questions and many others are amenable to empirical
research by means of existing research techniques. Unfortunately, the collection
of empirical data addressing these questions has only just started and we have
no answers to these questions yet.

4. General conclusions
In this chapter, we have considered a number of questions about bilingual
lexical processing and provided answers based on the presently available
empirical evidence with respect to interlingual homographs and cognates. First,
we have argued that during the recognition of isolated words by bilinguals,
lexical candidates from several languages are activated in parallel. Such parallel
activation does not only hold for orthographic representations, but also for
phonological and semantic codes. Moreover, there is evidence that language
non-selective access occurs even when bilinguals are processing words in their
native language and are not aware that their second language knowledge is
important. These findings indicate that the bilingual word identification system,
just like the monolingual system, is to a large extent automatic in nature, in
the sense that lexical candidates from both languages are activated in a fast
recognition process that in itself is largely unaffected by intentional and
attentional factors.
Second, we have examined the time-course of lexical activation with respect
to L1 and L2 and found that L2 is slower to be activated than L1, depending on
relative L1/L2 proficiency and therefore on (subjective) L1/L2 word frequency.
For interlingual homographs, we have found that in spite of differences in
L1/L2 activation rates, both readings of interlingual homographs remain active
during lexical processing for a relatively long time. This finding has several
important theoretical consequences. For instance, if the language membership
of word candidates could be used quickly to suppress lexical candidates that are
irrelevant in the experimental context, effects of the non-targeted reading of the
homographs should quickly disappear. However, if language membership
information becomes available late during processing, both readings of the
homograph would remain active for quite long. The available empirical studies
support the latter position, which is in correspondence with the automatic
nature of bilingual word recognition.
Third, we have demonstrated that both non-linguistic experimental and
linguistic context factors may affect the result patterns that are observed in
experiments. It appears that non-linguistic factors such as task demands and
instruction affect the performance of bilingual participants at the level of task
and decision processes as well as participant strategies. Linguistic factors such
as lexical, syntactic, and semantic aspects of the sentence context appear to
affect the word identification process more directly. Evidence from ERP studies
indicates that syntactic processing in bilinguals may differ both quantitatively
and qualitatively from that in monolinguals, but RT studies are badly needed in
order to specify from which underlying mechanisms the differences originate.
To conclude, empirical studies on bilingual word recognition in the last
decade have uncovered a number of fundamental characteristics of the bilingual
word recognition system. They have answered some major questions that are
unique to the bilingual domain, such as that about language selective or non-selective access, as well as more generally important questions, such as how
language users handle lexical ambiguity and how task and stimulus context
affect word recognition. The conclusions of these studies will have to be taken
into account during the development of a more general model of bilingual
processing. However, much more empirical evidence on the interaction
between lexical, syntactic, and semantic processing is needed before we can
even attempt to build such a model.

Note
* The author thanks Folkert Kuiken and two anonymous reviewers for their comments on
a previous version of this paper. The author also thanks Judy Kroll for her continuous
support and the many discussions that shaped the ideas in this chapter.

References
Altarriba, J., Kroll, J. F., Sholl, A. and Rayner, K. 1996. The influence of lexical and conceptual constraints on reading mixed-language sentences: Evidence from eye fixations and
naming times. Memory & Cognition 24: 477–492.
Altenberg, E. P. and Cairns, H. S. 1983. The effects of phonotactic constraints on lexical
processing in bilingual and monolingual studies. Journal of Verbal Learning and Verbal
Behavior 22: 174–188.
Bock, J. K. 1986. Syntactic persistence in language production. Cognitive Psychology 18:
355–387.
Caelen, M. 1998. Extending the study on the processing of relative clauses to bilingualism.
Unpublished Master's Thesis, University of Nijmegen.
Cristoffanini, P., Kirsner, K. and Milech, D. 1986. Bilingual lexical representation: The status of
Spanish-English cognates. Quarterly Journal of Experimental Psychology 38A: 367–393.
Davelaar, E., Coltheart, M., Besner, D. and Jonasson, J. T. 1978. Phonological recoding and
lexical access. Memory & Cognition 6: 391–402.
De Bruijn, E., Dijkstra, A., Chwilla, D. and Schriefers, H. 2001. Language context effects on
interlingual homograph recognition: Evidence from event-related potentials and
response times in semantic priming. Bilingualism: Language and Cognition 4: 155–168.

De Groot, A. M. B., Delmaar, P. and Lupker, S. J. 2000. The processing of interlexical homographs in a bilingual and a monolingual task: Support for nonselective access to
bilingual memory. Quarterly Journal of Experimental Psychology 53: 397–428.
De Moor, W. 1998. Visuele woordherkenning bij tweetalige personen. [Visual word recognition
in bilinguals.] Unpublished Master's Thesis, University of Ghent.
Dijkstra, A., De Bruijn, E., Schriefers, H. J. and Ten Brinke, S. 2000. More on interlingual
homograph recognition: Language intermixing versus explicitness of instruction.
Bilingualism: Language and Cognition 3: 69–78.
Dijkstra, A., Grainger, J. and Van Heuven, W. J. B. 1999. Recognition of cognates and
interlingual homographs: The neglected role of phonology. Journal of Memory and
Language 41: 496–518.
Dijkstra, A., Timmermans, M. and Schriefers, H. 2000. Cross-language effects on bilingual
homograph recognition. Journal of Memory and Language 42: 445–464.
Dijkstra, A. and Van Heuven, W. J. B. 2002. The architecture of the bilingual word recognition system: From identification to decision. Bilingualism: Language and Cognition 5:
175–197.
Dijkstra, A., Van Jaarsveld, H. and Ten Brinke, S. 1998. Interlingual homograph recognition: Effects of task demands and language intermixing. Bilingualism: Language and
Cognition 1: 51–66.
Font, N. 2001. Rôle de la langue dans l'accès au lexique chez les bilingues: Influence de la
proximité orthographique et sémantique interlangue sur la reconnaissance visuelle de mots.
Unpublished Doctoral Thesis of the Université Paul Valéry, Montpellier, France.
Frenck-Mestre, C. and Pynte, J. 1997. Syntactic ambiguity resolution while reading in
second and native languages. Quarterly Journal of Experimental Psychology 50: 119–148.
Gerard, L. D. and Scarborough, D. L. 1989. Language-specific lexical access of homographs
by bilinguals. Journal of Experimental Psychology: Learning, Memory and Cognition 15:
305–313.
Gottlob, L. R., Goldinger, S. D., Stone, G. O. and Van Orden, G. C. 1999. Reading homographs: Orthographic, phonologic, and semantic dynamics. Journal of Experimental
Psychology: Human Perception and Performance 25: 561–574.
Grainger, J. and Beauvillain, C. 1987. Language blocking and lexical access in bilinguals.
Quarterly Journal of Experimental Psychology 39A: 295–319.
Grainger, J. and Dijkstra, A. 1996. Visual word recognition. In Computational Psycholinguistics: AI and connectionist models of human language processing, A. Dijkstra and K.
De Smedt (eds), 139–165. London: Taylor and Francis.
Grosjean, F. 2001. The bilingual's language modes. In Language processing in the bilingual,
J. L. Nicol and T. D. Langendoen (eds), 1–25. Oxford: Blackwell.
Hahne, A. 2001. What's different in second-language processing? Evidence from event-related brain potentials. Journal of Psycholinguistic Research 30: 251–266.
Hahne, A. and Friederici, A. 2001. Processing a second language: late learners' comprehension mechanisms as revealed by event-related brain potentials. Bilingualism: Language
and Cognition 4: 123–141.
Kroll, J. F. and Dussias, P. E. In press. The comprehension of words and sentences in two
languages. Chapter to appear in Handbook of bilingualism, T. Bhatia and W. Ritchie
(eds). Cambridge, MA: Blackwell Publishers.

</TARGET "dyk">

Lemhöfer, K. and Dijkstra, A. Submitted. Recognizing cognates and interlingual homographs: Time course and code similarity effects in generalized lexical decision.
Macnamara, J. and Kushnir, S. L. 1971. Linguistic independence of bilinguals: The input
switch. Journal of Verbal Learning and Verbal Behavior 10: 480–487.
Michael, E., Dijkstra, A. and Kroll, J. F. 2002. Individual differences in the degree of language
nonselectivity in fluent bilinguals. Paper presented at the meeting of the International
Linguistic Association, Toronto, Canada.
Sebastián-Gallés, N. and Kroll, J. F. In press. Phonology in bilingual language processing:
Acquisition, perception, and production. In Phonetics and phonology in language
comprehension and production: Differences and similarities, N. Schiller and A. Meyer
(eds). Berlin: Mouton de Gruyter.
Van Hell, J. and Dijkstra, A. 2002. Foreign language knowledge can influence native
language performance in exclusively native contexts. Psychonomic Bulletin and Review
9: 780–789.
Van Heste, T. 1999. Visuele woordherkenning bij tweetaligen. [Visual word recognition in
bilinguals.] Unpublished Master's Thesis, University of Leuven.
Van Heuven, W. J. B., Dijkstra, A. and Grainger, J. 1998. Orthographic neighborhood effects
in bilingual word recognition. Journal of Memory and Language 39: 458–483.
Van Heuven, W. J. B. and Dijkstra, A. April 1999. The role of phonology in the recognition of
interlingual homographs and cognates. Paper presented at the Second International
Symposium on Bilingualism, Newcastle, UK.
Von Studnitz, R. E. and Green, D. 2002. Interlingual homograph interference in German-English bilinguals: Its modulation and locus of control. Bilingualism: Language and
Cognition 5: 1–23.
Weber-Fox, C. M. and Neville, H. J. 1996. Maturational constraints on functional specializations for language processing: ERP and behavioral evidence in bilingual speakers.
Journal of Cognitive Neuroscience 8: 231–256.

<TARGET "wil" DOCINFO AUTHOR "John N. Williams"TITLE "Inducing abstract linguistic representations"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 7

Inducing abstract linguistic representations


Human and connectionist learning of noun classes
John N. Williams
University of Cambridge

1. Introduction

Noun class information is a crucial component of the interface between the
lexicon and the grammar. In order to explain linguistic productivity it is
necessary to assume that linguistic rules are defined not over specific words, but
classes of word. This is not only true given the classical distinction between
lexicon and grammar, but also in emergentist views which see no clear
separation between these two systems (Ellis 1998, Tomasello 2000). Even
though the latter stress the lexical specificity of many grammatical rules, it is
still recognised that adult productivity can only be explained if words are
grouped into classes, even if those classes do not map neatly onto traditional
linguistic categories. The way in which words are grouped into grammatical
classes is therefore an important issue in understanding language development,
particularly in explaining the leap from lexical learning to grammar learning.
Noun classes, such as grammatical gender, are fundamentally abstract,
grammatical notions (Corbett 1991). However, attempts have been made to
uncover subtle phonological and semantic cues that can be used to predict a
word's gender (Kelly 1992). For example, masculine nouns in German are more
likely to be monosyllabic, and monosyllabic words that are masculine contain
more consonants than those of other classes. In French, feminine nouns tend to
end in closed stressed syllables (e.g. personne, tomate, viande), and masculine
nouns tend to end in open stressed syllables (e.g. avion, bruit, chapeau, bain).
There are also a number of characteristic derivational morphemes associated
with each gender (e.g. -eur and -ment are masculine, and -tion, -euse, -ière are
feminine). Sokolik and Smith (1992) trained a connectionist network to classify
French nouns as either masculine or feminine. The network was presented with
the orthographic, rather than phonological, forms of the words. They found
that it could then indicate the gender for nouns that it had not received during
training, although its performance was not perfect (ranging between 73% and
75%). This indicates that there are regularities in the form (in this case spelling)
of French words which can to a certain extent predict gender category.
Yet there are always words which fall stubbornly outside such generalisations. In the case of French, Carroll (2001) argues that in any case, the kinds of
phonological cues that have been appealed to are more subtle than could
reasonably be expected to be represented in the lexicon. This is not to say that
phonological and semantic cues do not play a role in learning gender systems,
or that they do not affect how easy it is to remember the gender of specific
words. But ultimately gender classes impose an abstract categorisation on words
which is independent of their phonological and semantic properties. Learning
gender systems, then, requires the formation of abstract grammatical categories,
and producing grammatically well-formed utterances involves applying
agreement rules which make reference to those categories.
There is a growing body of evidence which suggests that even quite advanced second language learners continue to make gender errors (Hawkins
2001, Holmes and De la Batie 1999). In contrast, such errors are relatively rare
in first language acquisition (Caselli, Leonard, Volterra and Campagnoli 1993).
There is also evidence for qualitative differences between first and second
language acquisition and processing of gender. A number of studies have shown
that second language learners are more sensitive to phonological agreement
patterns that correlate with gender classes than either children or adults in their
native language. For example, for the Italian il pettine ("the comb", masculine
singular) a second language learner might produce *le pettine, using the article
which is more often associated with the -e ending on feminine plural nouns
(Holmes and De la Batie 1999). In contrast a child would be more likely to
produce *il pettino, choosing an article that is correct for the noun's gender and
number, and providing the noun with the characteristic -o ending for masculine singulars. This demonstrates a grasp of the noun's abstract gender as the
controlling influence in determiner selection (Caselli et al. 1993). In reaction
time tasks on adults, Taraban and Kempe (1999) showed that non-native
speakers of Russian are more sensitive to phonological cues to gender than are
natives. Finally, a study by Guillelmon and Grosjean (2001) showed that
whereas native speakers of French and early bilinguals show certain gender
congruency effects in reaction time tasks, such effects are absent in late bilinguals. These studies suggest that second language learners do not achieve
native-like representation or processing of gender information.

In this chapter I shall explore the possibility that the reason why gender is
a persistent problem for second language learners is precisely because the
underlying abstract grammatical concepts are difficult to acquire through
associative learning. I shall address this issue through behavioural studies of
semi-artificial language learning in tandem with computational (connectionist)
simulations. These simulations were used as a means of assessing the viability,
and potential limitations, of a purely associative learning account of the
behavioural data.

2. The issue of abstraction in human and connectionist learning


Noun class induction provides a well-constrained domain in which to examine
the broader issue of abstraction in both human and connectionist learning. In
the case of adult implicit learning there has been a good deal of debate over
whether the knowledge that is acquired in, say, artificial grammar learning
experiments can really be characterised as abstract (compare Johnstone and
Shanks 1999, Knowlton and Squire 1996, Meulemans and Van der Linden
1997). Some degree of abstraction is suggested by the ability to transfer rule
knowledge between stimulus sets (Knowlton and Squire 1996, Mathews et al.
1989). But this appears to be no more than knowledge of patterns of alternation
or doubling of stimuli, for example the common abstract ABA structure which
underlies the syllable sequences ga-ti-ga and wo-fe-wo (Marcus, Vijayan, Bandi
Rao and Vishton 1999). Gómez and Gerken (2000) refer to this as pattern-based abstraction. But language structure depends upon patterns that are
defined over abstract categories, such as the common NVN structure underlying "Dogs eat pizza" and "John loves books". Gómez and Gerken (2000) refer to
this as category-based abstraction. Very little implicit learning research has
examined this kind of abstraction, even though it is a prime area in which
implicit learning of language structure can be evaluated.
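The contrast between the two kinds of abstraction can be made concrete with a minimal sketch in Python. The function names and the toy lexicon are illustrative assumptions, not taken from any of the studies cited. A pattern-based learner needs only to track which items repeat, whereas a category-based learner must already know the class of each word.

```python
def pattern_signature(items):
    """Label items by order of first appearance, e.g. ga-ti-ga -> 'ABA'."""
    labels = {}
    signature = []
    for item in items:
        if item not in labels:
            labels[item] = chr(ord("A") + len(labels))
        signature.append(labels[item])
    return "".join(signature)


# Hypothetical toy lexicon: the category of each word has to be known in advance.
CATEGORY = {
    "dogs": "N", "eat": "V", "pizza": "N",
    "john": "N", "loves": "V", "books": "N",
}


def category_signature(words, lexicon=CATEGORY):
    """Replace each word by its category label, e.g. 'dogs eat pizza' -> 'NVN'."""
    return "".join(lexicon[w.lower()] for w in words)


print(pattern_signature(["ga", "ti", "ga"]))          # ABA
print(pattern_signature(["wo", "fe", "wo"]))          # ABA
print(pattern_signature(["dogs", "eat", "pizza"]))    # ABC: no repetition to exploit
print(category_signature(["Dogs", "eat", "pizza"]))   # NVN
print(category_signature(["John", "loves", "books"])) # NVN
```

The two sentences share no repeated items, so their common structure only becomes visible once the words have been assigned to classes. How such classes could be induced is the question pursued in the rest of the chapter.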
In connectionist networks rule-like behaviour, such as the ability to
generalise to novel inputs, is an emergent property of the system, and there is
no separation between rote memory for examples and the representation of
underlying generalisations (consider, for example, the well-known models of
past tense formation (Rumelhart and McClelland 1986), and reading (Seidenberg and McClelland 1989)). But it has been argued that the human generative
capacity in linguistic domains cannot be accounted for without the classical
distinction between knowledge of instances and knowledge of rules, or the
traditional computational distinction between data and symbolic programs
(Fodor and Pylyshyn 1988). According to this view, the problem with connectionist models is that they respond to novel inputs purely on the basis of their
similarity to trained examples, and not by applying abstract rules (Berent,
Marcus, Shimron and Gafos 2002, Marcus 1999, Marcus et al. 1999). Category-based abstraction provides an ideal arena in which to explore this issue.

3. Previous research into human and connectionist learning of word classes
In his work on sequence learning Elman (1990) showed that there is a sense in
which a connectionist network can learn abstract noun classes. This network
learned the sequential probabilities of words in simple sentences through a
prediction task (attempting to predict the next word in a sentence on the basis
of the preceding ones). When the internal states of the network were examined
(see below for an illustration of how this is done) it was found that the activation patterns produced by words clustered into classes that reflected the
distributional properties of the training sentences. The two largest clusters were
for nouns versus verbs, and within these groups there were smaller sub-clusters
corresponding to transitivity preference for verbs, and animacy for nouns.
These clusters were based purely on a distributional analysis of the words in the
input. For example, what made a noun inanimate was nothing more than the
fact that it only occurred before certain kinds of verb (e.g. move, break) and not
others (e.g. smell, see). This work is widely cited as proof that networks can
induce word classes by performing distributional analysis, and as support for a
statistical approach to language learning (Redington and Chater 1998).
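As a rough illustration of what a distributional analysis involves, the following Python sketch compares nouns by the verbs that follow them. It is a simple count-based comparison, not Elman's recurrent network, and the miniature corpus is a hypothetical one built around the verbs mentioned above.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical two-word "sentences": inanimate nouns occur only before
# move/break, animate nouns also occur before smell/see.
SENTENCES = [
    "rock move", "rock break", "glass break", "glass move",
    "dog move", "dog smell", "dog see",
    "cat see", "cat smell", "cat move",
]


def next_word_vectors(sentences):
    """Count, for every word, which words immediately follow it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for word, following in zip(words, words[1:]):
            vectors[word][following] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two count vectors stored as Counters."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


VECTORS = next_word_vectors(SENTENCES)

# Nouns with the same privileges of occurrence end up close together.
print(cosine(VECTORS["rock"], VECTORS["glass"]))
print(cosine(VECTORS["dog"], VECTORS["cat"]))
print(cosine(VECTORS["rock"], VECTORS["dog"]))
```

In this toy corpus "rock" and "glass" have identical following-word profiles, as do "dog" and "cat", while the similarity between "rock" and "dog" is much lower. Nothing but co-occurrence counts is used, which is the sense in which such classes are purely distributional.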
Given the apparent power of distributional information to deliver noun
class information it is perhaps surprising that there is only limited evidence
from experimental studies that humans are able to exploit it in order to learn
noun classes. Saffran (2001) examined incidental learning of a set of hierarchical phrase structure rules in which each phrase was associated with a distinct
class of nonsense words. She argued that the results of the grammaticality
judgment tests showed that the participants developed sensitivity to phrase
structure and word class, and that this was based on a statistical analysis of the
distribution of the words in the input. However, abstract representations of word
class would permit test items containing word sequences that had never occurred
in the input to be judged as grammatical (or more grammatical than similar
sequences which violated phrase structure). Because no such test was performed
it is difficult to know whether abstract word classes had really been learned.
More stringent tests of word class learning become possible when noun
classes, such as gender systems, are considered. Brooks, Braine, Catalano and
Brody (1993) used an artificial language in which there were two noun classes,
and each class used different affixes to mark the location of the actor in relation
to the object denoted by the noun. Neither the form nor meaning of the nouns
provided any clue to their class. Adults were first taught the vocabulary, and
then performed both comprehension and production tasks (e.g. acting out
phrases, or describing pictures with feedback in the form of the correct answer).
After training they were tested on knowledge of the trained items, and also on
their ability to produce the correct response for noun-affix combinations that
had not been presented during training. Whilst their performance on trained
items was at around 75%, they were at chance on the generalisation items. Not
one of the 16 subjects showed evidence of having learned the system. Similar
results have been obtained in a number of other studies (Braine 1987, Braine et
al. 1990, Frigo and McDonald 1998). Frigo and McDonald (1998: 237) argue
that models of noun class learning that depend on pure distributional analysis
(Anderson 1983, Maratsos and Chalkley 1980, Pinker 1984) are too powerful.
The question is, then, does connectionism fall into this class of overly
powerful learning mechanisms for learning noun classes? The experiments and
simulations presented below further explored the circumstances under which
arbitrary and non-arbitrary noun class systems can be learned by humans and
connectionist networks.
4. Experiment 1
Williams and Lovatt (2003) tested whether humans can learn the arbitrary noun
class system shown in Table 1. There were eight nouns divided into two arbitrary classes, "masculine" and "feminine". Words in the masculine class occurred
with the determiners ig, i, ul, and tei. Words in the feminine class occurred
with the determiners ga, ge, ula, and tegge.1 The training items were the non-italicised phrases shown in Table 1. The italicised items were withheld for
testing generalisation. It would only be possible to know that "the ball" should
be translated as ig johombe by knowing that johombe belongs to the masculine
class. Neither its form, its -e ending, nor its meaning provide any clues.
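The structure of this miniature language can be sketched in a few lines of Python. The dictionaries below simply transcribe Table 1; the variable and function names are illustrative assumptions. Because the italics marking the withheld items are not represented here, the sketch does not say which phrases were withheld, apart from ig johombe, which the text identifies.

```python
# Determiners by noun class and by definiteness/number cell (from Table 1).
DETERMINERS = {
    "masculine": {"def_sg": "ig", "def_pl": "i", "indef_sg": "ul", "indef_pl": "tei"},
    "feminine": {"def_sg": "ga", "def_pl": "ge", "indef_sg": "ula", "indef_pl": "tegge"},
}

# Noun stems by class; number is marked by the ending (-e singular, -i plural).
STEMS = {
    "masculine": {"ball": "johomb", "house": "zabid", "fight": "wakim", "bird": "migen"},
    "feminine": {"shoe": "shosan", "kiss": "tissek", "cake": "chakum", "nose": "nawas"},
}
ENDINGS = {"sg": "e", "pl": "i"}
CELLS = ["def_sg", "def_pl", "indef_sg", "indef_pl"]


def noun_class(noun):
    """Look up the class of a noun; nothing in its form or meaning predicts it."""
    for cls, stems in STEMS.items():
        if noun in stems:
            return cls
    raise KeyError(noun)


def phrase(noun, cell):
    """Build the determiner-noun phrase for one cell of Table 1."""
    cls = noun_class(noun)
    number = cell.split("_")[1]
    return f"{DETERMINERS[cls][cell]} {STEMS[cls][noun]}{ENDINGS[number]}"


ALL_PHRASES = {(n, c): phrase(n, c) for cls in STEMS for n in STEMS[cls] for c in CELLS}

print(phrase("ball", "def_sg"))   # ig johombe
print(phrase("nose", "def_sg"))   # ga nawase
print(len(ALL_PHRASES))           # 32 phrases: 24 for training, 8 withheld
```

The only route from "ball" to the determiner ig runs through the class lookup, which is what makes the withheld items a test of abstract class knowledge rather than of memory for trained phrases.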
Table 1. The items employed in Experiments 1 and 2. Items used for testing generalisation are in italics.

             definite singular   definite plural   indefinite singular   indefinite plural
             (the)               (the)             (a)                   (some)
masculine
  ball       ig johombe          i johombi         ul johombe            tei johombi
  house      ig zabide           i zabidi          ul zabide             tei zabidi
  fight      ig wakime           i wakimi          ul wakime             tei wakimi
  bird       ig migene           i migeni          ul migene             tei migeni
feminine
  shoe       ga shosane          ge shosani        ula shosane           tegge shosani
  kiss       ga tisseke          ge tisseki        ula tisseke           tegge tisseki
  cake       ga chakume          ge chakumi        ula chakume           tegge chakumi
  nose       ga nawase           ge nawasi         ula nawase            tegge nawasi

The participants first learned the nouns and determiners as isolated
vocabulary items. They then received the determiner-noun combinations for
each training item as part of an exercise in rote memorisation that cycled
through phases of presentation and cued recall over sets of four items. Four
phrases were presented with their English translations, for example: "the nose"
ga nawase, "the birds" i migeni, "some balls" tei johombi, "a kiss" ula tisseke. The
participants repeated each novel phrase immediately after they had seen and
heard it. After the four phrases had been presented participants attempted to
recall each phrase given the English translation and stem as cues, for example: "the
birds" _ migen_, "the nose" _ nawas_, "a kiss" _ tissek_, "some balls" _ johomb_. They
were provided with feedback after each recall attempt in the form of the correct
answer. After receiving the 24 training items they performed a generalisation
test on the withheld items in Table 1. The generalisation test was similar to the
recall component of the training phase. The English translation of each phrase
was presented (e.g. "the ball"), along with the form of the corresponding stem
(johomb_), and the participants had to produce the appropriate determiner and
appropriately inflected noun. No feedback was given. This sequence of memory
and generalisation tasks was repeated five times.
Across 21 participants the mean generalisation performance over the five
cycles was 36%, 48%, 54%, 66%, and 67%. A repeated measures ANOVA showed
that the improvement in performance was significant, F(4,80) = 13.11, p < 0.001.
This shows that the participants learned something of the underlying noun class
organisation. However, there were large individual differences in the level of
learning. Two factors were found to independently predict performance on the
final generalisation test. The first was the participants' phonological short-term
memory, as measured prior to the experiment by their ability to recall lists of
three nonsense words (the singular forms of the nouns in the target language)
in the order of presentation. The correlation between this memory measure and
performance on the final generalisation test was r = 0.528, p < 0.05. There was
evidence that the relationship between phonological short-term memory and rule
learning was mediated by memory for determiner-noun combinations received
during training. Clearly memory ability is crucial to performing the kind of
distributional analysis upon which learning of this kind of system depends.
The second factor was a measure of the participants' breadth and depth of
knowledge of other gender languages. All of the participants' L1s were non-gender
languages (in fact all but one of them were native speakers of English), but the
more gender languages they knew as L2s, and the better they knew them, then
the better their performance on the generalisation test (r = 0.520, p < 0.05). This
suggests that the learning process was facilitated by linguistic knowledge.
There are a number of possible reasons why our participants managed to
learn an arbitrary noun class system whereas those in the previous studies did
not. First, the systems used by Brooks et al. (1993) and Braine et al. (1990)
involved agreement between spatial prepositions and nouns, and Frigo and
McDonald (1998) used a system involving agreement between greetings and
names. Participants may have had relatively little familiarity with similar
systems in other languages that they knew. Second, it is possible that the size of
the languages is important. Braine et al. (1990) used a 24-word vocabulary, Brooks
et al. (1993) used 30 words, and Frigo and McDonald (1998) used 20 words,
whereas Experiment 1 used only eight words. Clearly, keeping track of the collocates of 20 to 30 words is much harder than keeping track of the collocates of eight
words. A third potentially important factor is that in the present case some of
the determiners in each class had the same ending. The feminine class contained the pairs ga-ula and ge-tegge; the masculine class contained i-tei, and the
remaining determiners ig and ul were the only ones to end in consonants. This
similarity structure may have facilitated the learning process.
Experiment 1 demonstrates that an arbitrary noun class system is in
principle learnable. The question now is whether a connectionist simulation of
the same learning problem will be similarly successful.
4.1 Simulation 1
For this and all other simulations reported here, the simulation package Tlearn
was used (Plunkett and Elman 1997). The aim in the first simulation was to
train the network in a way which resembled as closely as possible the training
task performed by the participants in Experiment 1. I decided to focus on the
recall component of the training task. The network was taught to produce the
correct determiner for each phrase in the training set shown in Table 1. The
input consisted of representations of the noun stem, the inflection, the English
determiner, and the number of the noun. For example, the input for the item
tei johombi was johomb, -i, some, and plural. This is the information that
is relevant to predicting the determiner, and which was explicitly provided to
the participants in the recall component of the training task in Experiment 1.²
Following Elman (1990) one unique input node was used to represent each
element of the input (for example one unit was used to represent johomb),
yielding a total of 15 input nodes (eight stems, two inflections, three English
determiners, singular, plural). The input nodes were connected to five hidden
units, which were in turn connected to eight output units, one for each of the
eight possible determiners. For each input pattern the network was taught to
produce the correct determiner. For example given the input johomb, -i, some,
and plural it was taught to predict tei. This involved comparing the actual
output from the network with the correct output, and making appropriate
changes to the connection weights within the network according to the degree
of error. In this sense the network was provided feedback in the same way as the
participants in the experiment.
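The architecture just described can be sketched in plain Python. This is an illustrative reconstruction, not the Tlearn configuration used in the chapter. The determiner paradigm is inferred from the examples quoted in the text (including the form ge, which appears only in the list of determiner pairs), four of the eight stems are placeholders, and the choice of withheld items is hypothetical apart from ig johombe. The learning rate, number of epochs and initial weight range are likewise assumptions.

```python
import math
import random

# Paradigm cells: (English determiner, number, inflection).
CELLS = [("the", "singular", "-e"), ("a", "singular", "-e"),
         ("the", "plural", "-i"), ("some", "plural", "-i")]
# Determiners per class, in CELLS order (inferred from the text; "ge" is an assumption).
DETERMINERS = {"masc": ["ig", "ul", "i", "tei"], "fem": ["ga", "ula", "ge", "tegge"]}
# Four stems come from the text; masc_c, masc_d, fem_a, fem_b are placeholders.
STEMS = {"masc": ["johomb", "migen", "masc_c", "masc_d"],
         "fem": ["fem_a", "fem_b", "nawas", "tissek"]}

FEATURES = (STEMS["masc"] + STEMS["fem"] +
            ["-e", "-i", "the", "a", "some", "singular", "plural"])  # 15 input units
OUTPUTS = DETERMINERS["masc"] + DETERMINERS["fem"]                    # 8 output units

def encode(stem, eng_det, number, inflection):
    """One unit per input element: 1.0 if present in the item, else 0.0."""
    active = {stem, eng_det, number, inflection}
    return [1.0 if f in active else 0.0 for f in FEATURES]

def build_items():
    """Return (training, generalisation) lists of (input, target) pairs."""
    train, test = [], []
    for cls, stems in STEMS.items():
        for k, stem in enumerate(stems):
            for c, (eng_det, number, infl) in enumerate(CELLS):
                x = encode(stem, eng_det, number, infl)
                y = [1.0 if d == DETERMINERS[cls][c] else 0.0 for d in OUTPUTS]
                # Hypothetical design: withhold one cell per noun (cell k).
                (test if c == k else train).append((x, y))
    return train, test

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class Net:
    """Three-layer feedforward network trained by backpropagation."""
    def __init__(self, n_in, n_hid, n_out, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)] for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in self.w1]
        o = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in self.w2]
        return h, o

    def train_step(self, x, y, lr):
        h, o = self.forward(x)
        # Error signal: difference between correct and actual output.
        d_out = [(t - a) * a * (1 - a) for t, a in zip(y, o)]
        d_hid = [hj * (1 - hj) * sum(d_out[k] * self.w2[k][j] for k in range(len(o)))
                 for j, hj in enumerate(h)]
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] += lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] += lr * d_hid[j] * v

def rms(net, items):
    total = sum((t - a) ** 2 for x, y in items for t, a in zip(y, net.forward(x)[1]))
    return math.sqrt(total / (len(items) * len(OUTPUTS)))

rng = random.Random(0)
train, test = build_items()
net = Net(len(FEATURES), 5, len(OUTPUTS), rng)
rms_before = rms(net, train)
for epoch in range(500):
    rng.shuffle(train)
    for x, y in train:
        net.train_step(x, y, lr=0.3)
rms_after = rms(net, train)
print(len(train), len(test), round(rms_before, 3), round(rms_after, 3))
```

The sketch only shows the encoding and the error-driven weight updates. Whether it reproduces the generalisation behaviour reported for Simulation 1 depends on the actual items in Table 1 and on the training regime, neither of which is reproduced here.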
The network was initially trained until the root mean square (RMS) error
for the training items was 0.1 (this required an average of 2,479 cycles through
the training set).³ An error of this magnitude indicated that for each input
pattern the network was able to activate the correct determiner on the output
layer to a value close to the target value of 1.0, and all other output units had
values close to zero. Testing involved presenting the input patterns for the
generalisation items in Table 1 (i.e. the network was presented with patterns that it
had not received during training). For each input pattern, the activation level of
the output units was recorded, compared to the correct answer and the degree of
error calculated. The training and test procedure was repeated 20 times, and on
each run the connection weights were given random starting values.
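To give a feel for what an RMS error of about 0.1 means, the following sketch computes the error for a single output pattern. The activation values are invented for illustration, and the normalisation (averaging over the eight output units) is an assumption; Tlearn's exact definition may differ.

```python
import math

def rms_error(output, target):
    """Root mean square difference between output activations and targets."""
    squared = [(t - o) ** 2 for o, t in zip(output, target)]
    return math.sqrt(sum(squared) / len(squared))

# Eight determiner units; the first is the correct one (target 1.0).
target = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
confident = [0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]   # invented activations
undecided = [0.5] * 8                                   # every unit half active

print(rms_error(confident, target), rms_error(undecided, target))
```

Under this definition an error near 0.1 corresponds to the correct unit being close to 1.0 and every other unit close to zero, whereas a network that has learned nothing sits near 0.5.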
Generalisation performance on each run was perfect in the sense that the
activation on the node for the correct determiner was far greater than that of
the others. Over 20 runs the mean RMS error was 0.118 (which is not much
greater than that for trained items). That is, the network was able to correctly
predict the determiner for input patterns that it had never encountered during
training with an accuracy which was almost as high as for the trained items.⁴
In order to explore the nature of the network's internal representations the
word stems were presented alone to the input layer at test (i.e. all of the other
elements of the input were given values of zero). The activation patterns over
the five hidden units were recorded and submitted to a cluster analysis (for a
similar procedure see Elman 1990). The logic of only presenting the word stems
was that the aim was to ascertain the similarity structure of the hidden unit
activations to the nouns in a way that was not contaminated by the activations
produced in specific contexts of definiteness and number. Over six separate
runs a similar result was obtained: the activation patterns clustered according
to gender. That is, nouns within the same class produced activation patterns
that were similar to each other and distinct from the patterns produced by the
nouns in the other class.
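The logic of the cluster analysis can be illustrated with a small agglomerative procedure. The hidden-unit vectors below are invented purely for demonstration and are not activations from the reported simulation; average linkage with Euclidean distance is also an assumption, since the chapter does not state which clustering method was used.

```python
import math
from itertools import combinations

def distance(a, b):
    """Euclidean distance between two activation vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns a list of label sets."""
    clusters = [{label} for label in vectors]

    def linkage(c1, c2):
        pairs = [(p, q) for p in c1 for q in c2]
        return sum(distance(vectors[p], vectors[q]) for p, q in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two clusters whose members are closest on average.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 | c2)
    return clusters

# Invented activations over five hidden units for four of the stems.
hidden = {
    "johomb": [0.9, 0.1, 0.8, 0.2, 0.1],
    "migen":  [0.8, 0.2, 0.9, 0.1, 0.2],
    "nawas":  [0.1, 0.9, 0.2, 0.8, 0.9],
    "tissek": [0.2, 0.8, 0.1, 0.9, 0.8],
}
groups = agglomerate(hidden, 2)
print(groups)
```

With vectors of this kind the two clusters coincide with the two noun classes, which is the pattern the text reports for the trained network.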
It should be clear that this network is not simply producing responses to the
generalisation items on the basis of their similarity to trained items. For
example, for the test item ig johombe the stimuli were johomb, -e, the, and
singular. During training johomb and -e only occurred with ul. The elements
the and singular occurred with both ig and ga with equal frequency. Yet the
network was able to produce a strong output on ig and much lower levels of
activation on the remaining determiners. Simulation 1 therefore shows that a
connectionist network can achieve linguistic productivity, and can behave as if
it has formed abstract representations, even though there are no abstract
representations as such within the network.
There are various ways in which the power of Simulation 1 could be varied
in order to account for the effects of individual differences in Experiment 1, or
the failures to obtain learning of arbitrary noun classes in previous experiments.
The effect of memory ability could be dealt with by changing the learning rate
parameter (which determines the size of the weight changes in response to a
given amount of error). Factors such as the similarity structure of the determiners, or the number of nouns in the training set, would be expected to
influence learning rate as well. However, the influence of knowledge of other
gender languages is more problematic and will be considered after the remaining experiments and simulations have been reported.
Connectionist networks are commonly regarded as models of the associative mechanisms underlying implicit learning (Cleeremans and Jiménez 2002).
However, when we debriefed our participants after Experiment 1 it was clear
that the more successful amongst them had been employing intentional
learning strategies, and that there was a good correspondence between their
conscious understanding of the system and their performance in the final
generalisation test. It therefore becomes important to test whether learning
could be obtained under implicit conditions.

5. Experiment 2
This experiment employed a training task that was thought to be unlikely to
induce an intentional learning strategy. Participants first performed the same
phonological short-term memory test and vocabulary learning exercise as in
Experiment 1. Determiner-noun combinations from the training set were then
auditorily presented in a semi-random sequence, avoiding immediate repetitions of the same noun or determiner. For each item the participants had to
perform the following tasks: (1) repeat the phrase aloud, (2) indicate whether
it refers to a living or non-living thing by pressing one of two response keys, and
(3) translate the phrase into English. For example, for the item ul johombe they
would respond by saying ul johombe, pressing the non-living key, and saying
a ball. The meanings of the words were altered so that half of the nouns in each
class referred to living things and half to non-living things. The living/nonliving decision was included because this experiment was also a control for a
subsequent version in which noun animacy predicted noun class membership
(see Experiment 3 below). Here it serves as a means of increasing task demands
so that participants would be less likely to attempt to engage explicit learning
processes. The participants were told that the purpose of the experiment was to
see how their decision and translation performance improved with practice and
so they were encouraged to make their responses as quickly and as accurately as
possible. Training extended over 15 cycles through the 24-item training set,
giving a total of 360 training trials. This took between 60 and 75 minutes
including rest breaks after every five cycles.
The training phase was followed by the test phase. On each trial the English
translation of a test phrase was visually presented (e.g. the ball) and the
participants had to choose between a grammatical and ungrammatical translation in the target language, where the determiner for the ungrammatical item
was always of the correct number and definiteness, but the incorrect gender
(e.g. ig johombe versus ga johombe for the ball). First the eight generalisation
items were presented (see Table 1) followed by 16 trained items.
There were 18 participants who were selected on the basis of their good
knowledge of gender languages so as to increase the potential for obtaining
learning in this experiment. They all rated themselves as intermediate or better
in at least two gender languages (mean = 2.8, range = 2 to 6). Twelve of the
participants spoke a gender L1. Using the same scale for assessing knowledge of
gender languages as employed by Williams and Lovatt (2003) they scored 5.8,
which is much higher than the mean of 2.6 for the participants in Experiment 1.
Their phonological short-term memory was also somewhat superior, the mean
score being 71% as opposed to 64%.
None of the participants were aware of the noun class system either during
training or test phases. The average percentage correct on the generalisation
items was 56%, which was not significantly different from the chance level of
50%, t = 1.34, p > 0.1. On the other hand, performance on the trained items was
69%, which is significantly better than chance, t = 5.53, p < 0.001, and significantly better than performance on generalisation items, t = 2.58, p < 0.05. Thus,
although the participants had quite good memory for trained items, there was
no evidence of learning the underlying noun class distinction. This conclusion
is emphasised by the fact that the ten participants who scored 75% or better on
the trained items (mean = 80%) had a mean generalisation score of 50%. Nor
were there any correlations between generalisation test performance and either
phonological short-term memory or language background, and participants
who spoke a gender L1 did no better than those that did not (generalisation test
scores were 56% for both groups).
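Comparisons against the 50% chance level of a two-alternative forced choice can be computed as a one-sample t statistic, as sketched below. The chapter does not state which test was used, so this is an assumption, and the four scores are invented to keep the arithmetic transparent; they are not data from the experiment.

```python
import math
import statistics

def one_sample_t(scores, chance):
    """t statistic and degrees of freedom for mean(scores) versus a fixed chance level."""
    n = len(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    return (statistics.mean(scores) - chance) / standard_error, n - 1

# Invented proportions correct for four hypothetical participants.
scores = [0.5, 0.75, 0.5, 0.75]
t, df = one_sample_t(scores, chance=0.5)
print(t, df)
```

The statistic grows with the distance of the group mean from chance and shrinks with the variability between participants, which is why a mean of 56% over 18 participants can fail to differ reliably from 50%.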
Given the failure to obtain learning in this experiment one may conclude
that Simulation 1 was in fact too powerful, and that the learning that occurred
in Experiment 1 was a result of purely explicit processes which fall outside the
scope of the model. However, there is an alternative possibility. We should also
consider the relationship between the task performed by Simulation 1 and the
tasks performed by the participants in Experiments 1 and 2. Simulation 1 was
intended as a model of the recall component of the training task used in
Experiment 1. But in Experiment 2 the participants' task was very different.
They did not have to generate any determiners at any point during training, but
only had to perform animacy decisions and produce English translations.
Simulation 1 could not be said to be a good model of this task. A second
simulation was therefore conducted that made different assumptions about the
learning task.
5.1 Simulation 2
Incidental learning is best regarded as a relatively passive process of recording
correlations between attended features in each experience. Cleeremans and
Jiménez (2002), following O'Reilly and Munakata (2000: 18), have referred to
this as model learning, the goal of which is to enable the cognitive system to
develop useful, informative models of the world by capturing its correlational
structure. Connectionist models of model learning do not require feedback
because the system merely attempts to represent the structure of the inputs it is
provided. This is in contrast to task learning which has the aim of mastering
specific input-output mappings (i.e. achieving specific goals) in the context of
specific tasks through error-correcting learning procedures (ibid. p. 18).
Crucially for present purposes they assume that model learning operates
continuously, regardless of the task. Simulation 1 instantiated task learning, and
was successful because the underlying noun class distinction happened to be
relevant to the task the network was required to perform. But in Experiment 2
the tasks that the participants were performing (animacy decisions and translation) exerted no pressure to learn the noun class distinction. The same would
be true of simulations of those tasks. The only way in which the noun class
distinction could be learned, therefore, would be through model learning,
which requires a different kind of network from that used in Simulation 1.
One way of instantiating model learning is to train a three-layer network to
associate each input to itself. That is, the network learns to reproduce the input
pattern on the output layer. These are called autoassociation networks
(Plunkett and Elman 1997). Because there are fewer hidden than input/output
units the network is forced to discover an economical means of representing the
patterns so that they can be reproduced on the output. This gives the network
the potential to extract generalisations. Autoassociation networks do not
require feedback because the input itself provides the reference point against
which the accuracy of the output can be judged. How does such a network fare
on the arbitrary noun class induction problem?
In Simulation 2 there were 31 input units representing the eight determiners, eight stems, two inflections, three English determiners, eight English
nouns, and units for singular and plural. All of the relevant information in a
training item such as ul johombe, a ball was represented as a pattern over the
input layer. The 31 output units represented the same information as the input
units. The network had 20 hidden units.⁵ For each item in the training set the
network was trained to reproduce the input pattern on the output layer.
Training continued until output error ceased to decline (which was after about
2,500 cycles).
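An autoassociation network of this shape (31 input units, 20 hidden units, 31 output units) can be sketched as follows. As before this is a plain Python illustration rather than the Tlearn setup. The determiner paradigm is inferred from the examples in the text, four stems and four English nouns are placeholders, the withheld cells are hypothetical, and the learning rate, epoch count and initial weights are assumptions.

```python
import math
import random

DETS = ["ig", "ul", "i", "tei", "ga", "ula", "ge", "tegge"]   # first four: one class
STEMS = ["johomb", "migen", "stem_c", "stem_d", "stem_e", "stem_f", "nawas", "tissek"]
NOUNS = ["ball", "bird", "noun_c", "noun_d", "noun_e", "noun_f", "nose", "kiss"]
CELLS = [("the", "singular", "-e"), ("a", "singular", "-e"),
         ("the", "plural", "-i"), ("some", "plural", "-i")]
# Units are tagged by role so that identical strings in different roles stay distinct.
UNITS = ([("det", d) for d in DETS] + [("stem", s) for s in STEMS] +
         [("infl", i) for i in ("-e", "-i")] +
         [("eng_det", d) for d in ("the", "a", "some")] +
         [("eng_noun", n) for n in NOUNS] +
         [("number", n) for n in ("singular", "plural")])

def encode(det, stem, infl, eng_det, eng_noun, number):
    active = {("det", det), ("stem", stem), ("infl", infl),
              ("eng_det", eng_det), ("eng_noun", eng_noun), ("number", number)}
    return [1.0 if unit in active else 0.0 for unit in UNITS]

def training_patterns():
    patterns = []
    for k, stem in enumerate(STEMS):
        class_dets = DETS[:4] if k < 4 else DETS[4:]
        for c, (eng_det, number, infl) in enumerate(CELLS):
            if c != k % 4:  # hypothetical choice of withheld cell per noun
                patterns.append(encode(class_dets[c], stem, infl, eng_det, NOUNS[k], number))
    return patterns

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class AutoAssociator:
    def __init__(self, n_units, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_units + 1)] for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_units)]

    def reproduce(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in self.w1]
        return h, [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in self.w2]

    def train_step(self, x, lr):
        h, o = self.reproduce(x)
        d_out = [(t - a) * a * (1 - a) for t, a in zip(x, o)]   # the target is the input itself
        d_hid = [hj * (1 - hj) * sum(d * row[j] for d, row in zip(d_out, self.w2))
                 for j, hj in enumerate(h)]
        for d, row in zip(d_out, self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] += lr * d * v
        for d, row in zip(d_hid, self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] += lr * d * v

def rms(net, patterns):
    total = sum((t - a) ** 2 for x in patterns for t, a in zip(x, net.reproduce(x)[1]))
    return math.sqrt(total / (len(patterns) * len(UNITS)))

def determiner_activation(net, det, *rest):
    """Output activation on the unit for the determiner that was presented."""
    return net.reproduce(encode(det, *rest))[1][UNITS.index(("det", det))]

rng = random.Random(0)
patterns = training_patterns()
net = AutoAssociator(len(UNITS), 20, rng)
rms_before = rms(net, patterns)
for _ in range(200):
    rng.shuffle(patterns)
    for x in patterns:
        net.train_step(x, lr=0.3)
rms_after = rms(net, patterns)
item = ("johomb", "-e", "a", "ball", "singular")
print(round(rms_before, 3), round(rms_after, 3))
print(determiner_activation(net, "ul", *item), determiner_activation(net, "ula", *item))
```

The helper `determiner_activation` gives one way of comparing a grammatical phrase with an ungrammatical alternative: present each phrase and read off how strongly its own determiner is reproduced on the output layer. No particular values are claimed for this sketch.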
In Experiment 2, learning was assessed by forcing participants to choose
between two translations for a phrase, for example, between ga johombe and ig
johombe as translations of the ball. The model can be tested in the same way by
presenting both grammatical and ungrammatical determiner-noun combinations and comparing the strength of the output on the determiner units. For a
trained item, such as ul johombe, the strength of activation of the corresponding
determiner in the output, in this case ul, was, as one would expect, very high
(0.996 when averaged over eight training items on five separate runs, where the
required activation level was 1.0). Ungrammatical items such as ula johombe
produced much weaker activation of the corresponding output determiner
node, in this case ula (0.214). Clearly the network had not simply learned to
reproduce input patterns on the output layer. Rather, its ability to do so was
affected by whether it had received those patterns during training. In human
terms this would be the equivalent of a greater feeling of familiarity for ul
johombe than ula johombe. But for generalisation items the output activation on
determiners in both grammatical and ungrammatical items, e.g. ig johombe
versus ga johombe, was very low and not significantly different (0.054 and 0.055
respectively). In other words, both items appeared equally unfamiliar to the
network. Therefore, like the human participants in Experiment 2, the autoassociation network had good memory for trained items, but was unable to
distinguish between grammatical and ungrammatical generalisation items.
The contrast between Simulations 1 and 2 demonstrates that task learning
enabled a connectionist network to become sensitive to an abstract noun class
distinction whereas model learning did not. This is a rather surprising result
when one considers that there is a sense in which the networks were performing
rather similar tasks. In both cases they had to remember which determiners
occurred with which configurations of noun, definiteness, and number in the
training items. The difference was that in Simulation 1 the network's resources
were focused on predicting the determiner from the cues that it was provided,
whereas in Simulation 2 the network was actually attempting to remember the
unique combination of determiner, noun, definiteness, and number that
occurred in each training item. This exercise in episodic memory for entire
training episodes apparently did not exert sufficient pressure on the network to
discover the underlying noun class distinction.
The contrast between task learning and model learning is reminiscent of the
procedural-declarative distinction in Anderson's ACT framework (Anderson
1983). Productions are sets of rules which match their IF conditions against
the current contents of working memory, and if these are satisfied, they THEN
produce some action, or deposit some other kind of representation in working
memory. Although stated in a symbolic formalism in ACT, a connectionist
network can be conceived as a subsymbolic model of the entire set of productions which perform the transformation between one type of input to another
type of output (Sun et al. 2001). Both procedural learning and connectionist
learning of this type depend on error correction. In contrast, Simulation 2
could be identified with the declarative memory component of the ACT
framework. The idea that these two kinds of memory system might have
differential power to extract generalisations from the environment is clearly
relevant to attempts to construct a theory of second language acquisition in
terms of their interaction (Towell and Hawkins 1994), or to identify them with
different brain regions (Ullman 2001). Indeed, the idea that procedural learning
is more powerful at extracting abstract linguistic rules would be consistent with
the proposal that such a mechanism supports first language acquisition,
whereas second language acquisition is supported by declarative learning
(Ullman 2001).
If Simulation 1 is accepted as a valid model of the learning process in
Experiment 1 then there is another interesting consequence. The learning that
was occurring in that experiment was characterised as explicit. Not only did
the participants appear to have an intention to learn, but some of them also
made comparisons between consciously recalled input items, and formed
conscious hypotheses. Simulation 1 captures the intentional component of the
learning process, since it too evaluated its outputs with respect to feedback for
the purpose of learning in order to be able to generate determiners. But it
obviously does not model the other components of what, in human terms, we
regard as explicit learning. However, this does not necessarily detract from the
relevance of the model. Shanks (1995) reviews a range of studies on human
learning where there is a good fit between human behaviour and connectionist
models. Yet in many of these experiments the participants were actively
searching for rules. For example, in a medical diagnosis task (see Shanks
1995: 42) participants were presented with hypothetical patients with certain
symptoms and were instructed to diagnose what illness each patient had. Each
trial was accompanied by feedback in the form of the correct diagnosis. Performance was directly related to the degree of contingency between different cues
(symptoms) and outcomes (diseases) in the training data. Shanks showed that
the results could be adequately modelled by a simple connectionist network in
which symptoms were presented as inputs, diagnoses as outputs, and the correct
diagnosis was provided as feedback (Shanks 1995:120). Yet the participants in the
experiment presumably had the experience of actively trying to work out the
relationship between symptoms and diseases. Whilst it is presently unclear how the
conscious states of the learner influence the learning mechanism, it should not
be assumed that the possibility of there being such interactions rules out a
unified associative explanation (Cleeremans and Jiménez 2002).

6. Experiment 3
With the benefit of hindsight Experiment 2 was a poor test of implicit learning
because the kind of associative learning mechanism supposed to underlie
implicit and incidental learning would not be expected to learn the underlying
generalisations. We⁶ therefore decided to run Experiment 2 again, but this time
using a system that would be learnable even by the kind of autoassociation
network used in Simulation 2. The language was essentially the same as that
shown in Table 1 except that the meanings of the words were altered so that all
of the words in class I referred to living things and all of the words in class II
referred to inanimate objects. For simplicity, the living/non-living distinction
will be referred to here in terms of an animacy cue to noun class. It has been
shown that under intentional learning conditions humans have no problem
grasping semantically-based noun classes (Braine 1987, Carroll 1999). A version
of Simulation 2 that included animacy information confirmed that the present
system was also learnable by an autoassociation network. This is presumably
because there are direct associations between the units that encode animacy and
certain determiners. Note, therefore, that in this experiment we are no longer
concerned with whether implicit learning of abstract noun classes is possible.
Rather the issue is whether implicit learning of a noun class distinction can be
obtained under conditions where the connectionist model predicts that there
should be an effect.
The tasks and procedure were exactly the same as in Experiment 2. For each
phrase presented during training the participants had to repeat it, indicate
whether it referred to a living or nonliving thing, and translate it into English.
Note that this time the living/nonliving decision coincided with the noun class
of the word. The same learning tests were used as in Experiment 2. There were
37 participants with varied language backgrounds.
Only seven of the participants became aware of the noun class distinction
and its relation to animacy during the training phase, and their performance
was perfect, or near perfect, on the generalisation and trained items. None of
the remaining 30 participants became aware of the system during the training
phase and none of them claimed to have been consciously trying to work out
the system during the generalisation test. Even at the end of the whole testing
phase none of them realised the relevance of animacy. Nevertheless, performance on generalisation items was 61%, which was significantly above the
chance level of 50%, t = 3.25, p < 0.01. They scored 71% correct on trained
items, which was also significantly above chance, t = 6.09, p < 0.001. Therefore,
Experiment 3 succeeded in demonstrating at least some degree of implicit
learning of a system that was also learnable by an autoassociation network.
However, there were large individual differences in generalisation test
performance. Just as in Experiment 1 there were correlations with phonological
short-term memory (r = 0.50, p < 0.01) and knowledge of gender languages
(r = 0.586, p < 0.001), which in this case was quantified simply in terms of the
number of gender languages in which the participants rated their proficiency as
intermediate or better (mean = 1.8, range = 0 to 5). We also evaluated whether,
amongst the 30 unaware participants, speakers of gender L1s did better than
speakers of non-gender L1s. For the 13 speakers of gender L1s mean generalisation test performance was 71%, which is significantly above chance, t = 4.08,
p < 0.01, whereas for the 17 speakers of non-gender L1s it was 54%, which is not
significantly above chance, t = 0.96. The difference between these two groups
was significant, t = 2.78, p < 0.01. The two groups did not differ significantly in
terms of the number of L2s spoken to an intermediate level or better (3.54 and
3.12 respectively, t < 0.92) or the number of gender languages known as an L2 (the
means were 1.46 and 1.23 respectively), but they did differ slightly in terms of
phonological short term memory (77% versus 68%, p = 0.08). Better matched
groups resulted from removing the three participants with the lowest phonological short term memory scores from the sample (all scores were less than 50%,
and all three participants were in the non-gender L1 group). The 13 speakers of
gender L1s and remaining 14 speakers of non-gender L1s were well matched in
terms of number of gender languages spoken as an L2 (1.46 and 1.43 respectively) and in terms of phonological short term memory scores (77% and 73%). Yet
the generalisation scores were 71% and 55% (the difference being significant,
t = 2.31, p < 0.05). Note that for the non-gender L1 group the mean for the
trained test items was well above chance (67%, t = 3.74, p < 0.01).

7. Discussion
In one sense it could be argued that there is a good alignment between the
connectionist models and the human data in the present studies, provided
assumptions are made about which kind of network is appropriate to which
task conditions. Where the model was able to generalise there was also evidence
for generalisation amongst the participants in the experiment (Simulation 1 and
Experiment 1, Simulation 2 supplemented by animacy information and
Experiment 3). Where the model was not able to generalise there was no
evidence for generalisation amongst the human participants (Simulation 2 and
Experiment 2). The problem is, however, that the networks only seem to
account for learning amongst those participants who already possessed knowledge of other gender languages. Yet none of the networks contained any prior
knowledge. Seen in this light they provide a poor fit to the human data. In this
final section I shall consider ways in which prior knowledge could have influenced human learning, and whether the data then become more amenable to a
connectionist interpretation. I shall then consider the implications of the
present results for second language acquisition.
7.1 The role of prior linguistic knowledge
One way in which prior knowledge could facilitate learning is through its effect
on the learner's strategy. Recall that the success of Simulation 1 depended upon
using number, definiteness, and an abstract representation of the nouns
(represented as single nodes) to generate the determiners. But this presupposes
a certain understanding of the nature of gender systems. Participants who did
not have this understanding may simply have approached the task in the way
that it was presented to them; that is, as a short-term memory exercise for
determiner-noun combinations. In that case their learning processes would be
more appropriately modelled by Simulation 2 than Simulation 1. Indeed, the
contrast between Simulations 1 and 2, between task learning and model
learning, could be seen as a computational account of a more general contrast
between analytic and non-analytic, memory-based, learning strategies (Skehan
1998). In the present case the probability of adopting an appropriate analysis
strategy could have also depended upon metalinguistic knowledge of other
gender systems that was derived from second language learning experience.
Obviously a learning strategy account cannot apply to the kind of incidental and implicit learning occurring in Experiment 3. However, in this case
learning failures could be accounted for simply by assuming that animacy was
not perceived as being relevant to the determiners. In Williams (in preparation)
I argue that implicit learning of form-meaning connections (such as between
determiners and animacy information) is problematic because of the requirement
that form and meaning are unitised at encoding; learners must actually perceive
them as being relevant to each other. Merely paying attention to the relevant
elements does not appear to be sufficient, at least not under the task conditions
of Experiment 3. In terms of the model learning mechanism instantiated by
Simulation 2 this means that even though animacy information was attended,
it did not enter the same memory trace as information about the determiner,
noun identity, definiteness, and number. The problem is, therefore, to explain
why participants who spoke a gender L1 defied this principle and were able to
unconsciously associate the determiners with animacy information. There is no
obvious connectionist answer to this problem. Is the classical linguistic approach any more promising?
Linguistic (Carroll 1989, Hawkins 2001) and psychological (Levelt, Roelofs
and Meyer 1999, Vigliocco, Antonini and Garrett 1997) analyses of gender
representation and processing in the L1 assume abstract gender features that are
attached to nouns in the lexicon. How gender features are acquired is not often
considered. However, Carroll (2001) proposes an induction procedure which
is triggered by the presence of alternating determiner forms in the input (e.g.
two words for some). The first occurrence of one of the determiners, for
example in tei johombi (some monkeys) has no effect. But when another
phrase involving a word for some is encountered, for example tegge nawasi
(some vases), the learner seeks to rationalise the contrast by marking the noun
with a [+gender] feature. In this way, one of the determiners becomes an
assigner of the gender feature whilst the other remains the default. Remembering which of the alternating pair of determiners assigns the gender feature is
likely to be problematic, however. In the (admittedly artificial) case that
[+gender] also corresponds to some other active feature of the noun, such as
[+inanimate], one can imagine that this problem would be alleviated. To
account for the influence of gender L1s in Experiment 3 it would have to be
assumed that this kind of induction mechanism can only operate in L2 if it was
used in the L1. This is perhaps not too implausible if one considers that each
time a speaker of a gender language encounters a novel noun the same process
of using the accompanying determiner to assign gender to it must operate. On
the other hand, it is another matter to assume that, when confronted with a new
language, learners are able to assign new gender features on the basis of newly
observed alternations between determiners. It is also relevant to consider that
at present there is no evidence that speakers of gender L1s have any less
problem with gender in an L2 than do speakers of non-gender L1s (Bruhn and
White 2000). Thus, although the gender L1 advantage found in Experiment 3
is intriguing, there is no obvious way of accounting for it at the present time
from either connectionist or classical perspectives.
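The induction procedure attributed to Carroll (2001) above can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not Carroll's own formalisation: the function name `induce_gender` and the data layout are hypothetical, and the learner is assumed to treat the first determiner form it meets as the default and the first contrasting form as the assigner of [+gender].

```python
def induce_gender(phrases):
    """Sketch of contrast-triggered gender induction (hypothetical helper).

    phrases: sequence of (determiner, noun) pairs that all express the
    same meaning, e.g. two alternating words for 'some'.
    """
    default_det = None    # first form encountered; it triggers nothing
    assigner_det = None   # first contrasting form; it assigns [+gender]
    marked = {}           # noun -> True if the noun carries [+gender]
    for det, noun in phrases:
        if default_det is None:
            default_det = det
        elif assigner_det is None and det != default_det:
            assigner_det = det
        # Only nouns met with the assigner are marked; all others stay default.
        marked[noun] = (det == assigner_det)
    return default_det, assigner_det, marked


default_det, assigner_det, marked = induce_gender(
    [("tei", "johombi"), ("tegge", "nawasi")]
)
print(default_det, assigner_det, marked)
```

Which determiner ends up as the assigner depends only on the order of encounter in this sketch, which is one way of seeing why remembering the assigner is likely to be problematic.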
7.2 Implications for second language acquisition
When considering second language acquisition, particularly under naturalistic
un-instructed conditions, it is relevant to consider the power of incidental
learning mechanisms; that is, learning that takes place as a natural consequence
of processing the relevant stimuli for purposes other than discovering the
underlying regularities. This means that we should consider implicit learning
conditions like those in Experiments 2 and 3 and learning mechanisms of the
type exemplified by Simulation 2 as being the most relevant. Granted this
assumption, then the prospects for associative learning of abstract noun classes
would appear to be bleak.
However, one limitation of Experiment 2 and Simulation 2 is that they
employed a completely arbitrary noun class system. As mentioned earlier, it has
been argued that in many natural languages at least a proportion of the members of the same noun class share phonological and semantic properties. Could
the presence of these cues facilitate learning? In fact a number of experimental
studies have shown that partial phonological and semantic cues do indeed
facilitate noun class induction (Braine 1987, Brooks et al. 1993, Frigo and
McDonald 1998). However, these studies have only demonstrated an effect of
partial cues under intentional learning conditions similar to those in Experiment 1. There have been no demonstrations of their effect upon implicit
learning. Indeed, in my own preliminary investigations of learning such systems,
networks of the type used in Simulation 2 have failed to generalise to
unmarked words (whereas a network such as that in Simulation 1 would clearly have
no problem).
Even under the intentional learning conditions of the earlier experiments
there was very little evidence of generalisation to items that did not carry the
appropriate cues. The adults in the study of Brooks et al. (1993) showed barely
a significant effect using a one-tailed test (which assumes that the direction of
the difference is predicted), and for the children in their second experiment
there was no evidence of generalisation at all. Given that seven out of the 16
adults had explicit knowledge of the word classes, whereas only one of the
children did, then it seems likely that these participants were responsible for the
slightly above-chance performance of the group as a whole. Generalisation to
unmarked nouns would therefore seem to be unlikely under implicit conditions.

Only in one of three experiments of Frigo and McDonald (1998) was performance on unmarked generalisation items significantly above chance, and this
was when word class was indicated by a characteristic initial and final syllable
(e.g. wanersumglot, wanolovglot, wanalglot versus kaisalmrish, kaisilvrish,
kaisalbrish). Braine (1987) also obtained good generalisation to unmarked
words, but half of the nouns in one class referred to males and the other half to
females. Thus, generalisation appears to be limited to cases where the cues are
more salient than in natural languages.
Somewhat counter-intuitively, the above studies did find evidence of
generalisation to unmarked items when entirely novel nouns were introduced in the final test phase. The equivalent test in the context of the language
used here would involve telling participants that ul vark means 'a dog' and
asking them to produce the translation of 'the dog' (the correct answer being ig
vark). Such a test only requires knowledge of the associations between the
determiners. Therefore, it does appear that partial phonological cues can
facilitate acquisition of inter-determiner associations (or rather, their equivalent
in the languages that were used). Determiners in the same class presumably
become associated by virtue of their frequent association to the same phonological cue. Generalisation is then achieved by a process of inference from another
determiner-noun combination that is provided at test or recalled from memory.
As argued by Frigo and McDonald (1998), poor performance on generalisation
tests involving nouns that occurred in training could be because of problems
recalling an example of a determiner that occurred with that noun. But the
native speaker of a gender language is assumed to generate an appropriate
determiner directly on the basis of an abstract specification of the noun's
gender in the lexicon, not by inference. It is far from clear that the participants
in these experiments acquired knowledge of noun classes in that sense.
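The difference between inference and direct generation can be illustrated with a short sketch of the inferential route. It assumes that the learner has stored only inter-determiner associations, represented here as two tables. The forms ig, ul, tei and tegge appear in the text; the remaining forms (i, ga, ge, ula) are derived here from the Italian originals by the consonant substitutions described in note 1 and should be read as assumptions, as should the function name `infer_determiner`.

```python
# Keys are (definite, plural); each table is one set of associated determiners.
DETERMINER_SETS = [
    {(True, False): "ig", (True, True): "i",
     (False, False): "ul", (False, True): "tei"},
    {(True, False): "ga", (True, True): "ge",
     (False, False): "ula", (False, True): "tegge"},
]


def infer_determiner(known_det, definite, plural):
    """Infer a determiner for a noun from one determiner known to go with it."""
    for table in DETERMINER_SETS:
        if known_det in table.values():
            # No noun class is consulted: the answer comes from the set of
            # determiners associated with the one supplied or recalled.
            return table[(definite, plural)]
    raise ValueError(f"unknown determiner: {known_det}")


# Told that 'ul vark' means 'a dog', produce the determiner for 'the dog'.
print(infer_determiner("ul", definite=True, plural=False))  # prints "ig"
```

Nothing in this procedure stores a class feature on the noun itself, which is the sense in which it falls short of the lexical gender specification assumed for native speakers.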
The results from these studies do not, therefore, offer much prospect of
incidental learning of noun classes. This is of course consistent with the claim
that gender is a persistent problem for second language learners. Assuming an
underlying model learning mechanism such as that in Simulation 2, learning
would be predicted to be limited to rote storage of determiner-noun combinations, and associations between determiners and partial phonological and
semantic cues. This would explain L2 learners' sensitivity to phonological cues
in gender processing tasks (Guillelmon and Grosjean 2001, Holmes and De la
Batie 1999, Taraban and Kempe 1999). Unmarked nouns would have to be
dealt with through rote storage, putting a strain on phonological memory
(Williams and Lovatt 2003). The lack of a true underlying noun class organisation
would make storage of determiner-noun combinations particularly prone to
error, but if at least one instance of a determiner-noun pair can be retrieved,
other appropriate determiners could be inferred using knowledge of inter-determiner associations. Thus, second language learners can acquire a semblance of competence but the failure to organise the underlying representations
in terms of abstract noun classes will cause persistent problems. I have argued
that this reflects a weakness in the type of associative learning mechanism that
is assumed to underlie incidental learning.

Notes
1. This language was derived from Italian. The determiners were derived from the Italian il,
i, un, dei, la, le, una, and delle by systematically substituting consonants (l→g, d→t, n→l).
The nouns correspond to Italian nouns which end in -e in the singular and -i in the plural
regardless of gender, e.g. cliente (masculine), stazione (feminine). Note that none of the
participants in Experiments 1 and 2 (reported below) had any knowledge of Italian, and only
two participants in Experiment 3 knew Italian at an intermediate level or better as an L2.
2. The only difference was that in the experiment they also had to produce the inflection,
whereas in the simulation the inflection was provided on the input. However, in the
experiment the participants learned the correct plural inflections in the preliminary
vocabulary learning phase, and not in the training phase of the main experiment. In any case
the inflection provides no clue as to the correct determiner over and above the presence or
absence of the plurality of the noun.
3. A Root Mean Square error of 0.1 means that over all of the input patterns presented on a
particular cycle the average difference between the actual output and the required output on
each node was 0.1 units of activation. The point at which the correct output node was simply
the most active occurred well before an RMS error of 0.1 was achieved.
4. The Luce ratio was also used as a measure of network performance: the activation level
of the correct output node divided by the sum of the activation over all output nodes. Perfect
output would be indicated by a Luce ratio of 1.0. In this simulation the mean Luce ratio over
20 runs was 0.87.
5. The number of hidden units was set to about two thirds of the number of input/output
units so as to force the inputs through a reduced representational space, exerting pressure on
the network to extract generalisations. Other simulations were performed with either 10 or
40 hidden units but the generalisation performance was similar to that reported here.
6. This experiment was run in collaboration with Helen East.
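The two network performance measures described in notes 3 and 4 can be sketched as follows. This is a minimal illustration, not the code used in the simulations: the function names are hypothetical, and the RMS error is computed in the standard way (the square root of the mean squared difference over all output nodes and patterns), which note 3 glosses informally as an average difference.

```python
import math


def rms_error(outputs, targets):
    """Root Mean Square error over all patterns and all output nodes."""
    squared = [
        (o - t) ** 2
        for out, tgt in zip(outputs, targets)
        for o, t in zip(out, tgt)
    ]
    return math.sqrt(sum(squared) / len(squared))


def luce_ratio(output, correct_index):
    """Activation of the correct node divided by the summed activation."""
    return output[correct_index] / sum(output)


# One pattern with two output nodes, each 0.1 away from its target.
print(rms_error([[0.9, 0.1]], [[1.0, 0.0]]))
# The correct node holds most, but not all, of the total activation.
print(luce_ratio([0.8, 0.1, 0.1], correct_index=0))
```

A Luce ratio of 1.0 requires every incorrect output node to be completely inactive, so it is a stricter criterion than the correct node simply being the most active.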


References
Anderson, J. R. 1983. The architecture of cognition. Cambridge MA: Harvard University Press.
Berent, I., Marcus, G. F., Shimron, J. and Gafos, A. I. 2002. The scope of linguistic generalizations: Evidence from Hebrew word formation. Cognition 83: 113–139.
Braine, M. D. S. 1987. What is learned in acquiring word classes: A step towards an acquisition theory. In Mechanisms of language acquisition, B. MacWhinney (ed.), 65–87.
Hillsdale, NJ: Lawrence Erlbaum.
Braine, M. D. S., Brody, R. E., Brooks, P. D., Sudhalter, V., Ross, J. E., Catalano, L. and Fisch,
S. M. 1990. Exploring language acquisition in children with a miniature artificial
language: Effects of item and pattern frequency, arbitrary subclasses, and correction.
Journal of Memory and Language 29: 591–610.
Brooks, P. J., Braine, M. D. S., Catalano, L. and Brody, R. 1993. Acquisition of gender-like
noun classes in an artificial language: The contribution of phonological markers to
learning. Journal of Memory and Language 32: 76–95.
Bruhn, J. and White, L. 2000. L2 acquisition of Spanish DPs: the status of grammatical
features. In Proceedings of the 24th annual Boston University conference on language
development. Vol. 1, S. C. Howell, S. A. Fish and T. Keith-Lucas (eds), 164–175. Somerville, Mass.: Cascadilla Press.
Carroll, S. 1989. Second language acquisition and the computational paradigm. Language
Learning 39: 535–594.
Carroll, S. E. 1999. Input and SLA: Adults' sensitivity to different sorts of cues to French
gender. Language Learning 49: 37–92.
Carroll, S. E. 2001. Input and evidence: The raw material of second language acquisition.
Amsterdam: John Benjamins.
Caselli, M. C., Leonard, L. B., Volterra, V. and Campagnoli, M. G. 1993. Toward mastery of
Italian morphology: A cross-sectional study. Journal of Child Language 20: 377–393.
Cleeremans, A. and Jiménez, L. 2002. Implicit learning and consciousness: A graded,
dynamic perspective. In Implicit learning and consciousness, R. M. French and A.
Cleeremans (eds), 1–40. Hove: Psychology Press.
Corbett, G. 1991. Gender. Cambridge: Cambridge University Press.
Ellis, N. C. 1998. Emergentism, connectionism and language learning. Language Learning
48: 631–664.
Elman, J. L. 1990. Finding structure in time. Cognitive Science 14: 179–211.
Fodor, J. A. and Pylyshyn, Z. W. 1988. Connectionism and cognitive architecture: A critical
analysis. Cognition 28: 3–71.
Frigo, L. and McDonald, J. L. 1998. Properties of phonological markers that affect the
acquisition of gender-like subclasses. Journal of Memory and Language 39: 218–245.
Gómez, R. L. and Gerken, L. 2000. Infant artificial language learning and language acquisition. Trends in Cognitive Sciences 4: 178–186.
Guillelmon, D. and Grosjean, F. 2001. The gender marking effect in spoken word recognition: The case of bilinguals. Memory and Cognition 29: 503–511.
Hawkins, R. 2001. Second language syntax: A generative introduction. Oxford: Blackwell.

Holmes, V. M. and De la Batie, B. D. 1999. Assignment of grammatical gender by native
speakers and foreign language learners. Applied Psycholinguistics 20: 479–506.
Johnstone, T. and Shanks, D. R. 1999. Two mechanisms in implicit artificial grammar
learning? Comment on Meulemans and Van der Linden 1997. Journal of Experimental
Psychology: Learning, Memory, and Cognition 25: 524–531.
Kelly, M. H. 1992. Using sound to solve syntactic problems: The role of phonology in
grammatical category assignments. Psychological Review 99: 349–364.
Knowlton, B. J. and Squire, L. R. 1996. Artificial grammar learning depends on implicit
acquisition of both abstract and exemplar-specific information. Journal of Experimental
Psychology: Learning, Memory, and Cognition 22: 169–181.
Levelt, W. J. M., Roelofs, A. and Meyer, A. S. 1999. A theory of lexical access in speech
production. Behavioural and Brain Sciences 22: 1–75.
Maratsos, M. P. and Chalkley, M. A. 1980. The internal language of children's syntax: The
ontogenesis and representation of syntactic categories. In Children's Language Vol. 2,
K. Nelson (ed.), 127–214. New York: Gardner Press.
Marcus, G. F. 1999. Language acquisition in the absence of explicit negative evidence: Can
simple recurrent networks obviate the need for domain-specific learning devices?
Cognition 73: 293–296.
Marcus, G. F., Vijayan, S., Bandi Rao, S. and Vishton, P. M. 1999. Rule learning in 7-month-old infants. Science 283: 77–80.
Mathews, R. C., Buss, R. R., Stanley, W. B., Blanchard-Fields, F., Cho, J.-R. and Druhan, B. 1989.
The role of implicit and explicit processes in learning from examples: A synergistic effect.
Journal of Experimental Psychology: Learning, Memory, and Cognition 15: 1083–1100.
Meulemans, T. and Van der Linden, M. 1997. Associative chunk strength in artificial
grammar learning. Journal of Experimental Psychology: Learning, Memory, and
Cognition 23: 1007–1028.
O'Reilly, R. C. and Munakata, Y. 2000. Computational explorations in cognitive neuroscience:
Understanding the mind by simulating the brain. Cambridge, MA: MIT Press.
Pinker, S. 1984. Language learnability and language development. Cambridge, Mass.: Harvard
University Press.
Plunkett, K. and Elman, J. L. 1997. Exercises in rethinking innateness: A handbook for
connectionist simulations. Cambridge, MA: MIT Press.
Redington, M. and Chater, N. 1998. Connectionist and statistical approaches to language
acquisition: A distributional perspective. Language and Cognitive Processes 13: 129–191.
Rumelhart, D. E. and McClelland, J. L. 1986. On learning the past tense of English verbs. In
Parallel distributed processing: Explorations in the microstructure of cognition Vol. 2, J. L.
McClelland and D. E. Rumelhart (eds), Cambridge, MA: MIT Press.
Saffran, J. R. 2001. The use of predictive dependencies in language learning. Journal of
Memory and Language 44: 493–515.
Seidenberg, M. S. and McClelland, J. L. 1989. A distributed, developmental model of word
recognition and naming. Psychological Review 96: 523–569.
Shanks, D. R. 1995. The psychology of associative learning. Cambridge: Cambridge University
Press.
Skehan, P. 1998. A cognitive approach to language learning. Oxford: Oxford University Press.


</TARGET "wil">


Sokolik, M. E. and Smith, M. E. 1992. Assignment of gender to French nouns in primary and
secondary language: A connectionist model. Second Language Research 8: 39–58.
Sun, R., Merrill, E. and Peterson, T. 2001. From implicit skills to explicit knowledge: a
bottom-up model of skill learning. Cognitive Science 25: 203–244.
Taraban, R. and Kempe, V. 1999. Gender processing in native and nonnative Russian
speakers. Applied Psycholinguistics 20: 119–148.
Tomasello, M. 2000. The item-based nature of children's early syntactic development.
Trends in Cognitive Sciences 4: 156–163.
Towell, R., and Hawkins, R. 1994. Approaches to second language acquisition. Clevedon:
Multilingual Matters.
Ullman, M. T. 2001. The neural basis of lexicon and grammar in first and second language:
The declarative/procedural model. Bilingualism: Language and Cognition 4: 105–122.
Vigliocco, G., Antonini, T. and Garrett, M. F. 1997. Grammatical gender is on the tip of
Italian tongues. Psychological Science 84: 314–317.
Williams, J. N. In preparation. Implicit learning of form-meaning connections.
Williams, J. N. and Lovatt, P. 2003. Phonological memory and rule learning. Language
Learning 53: 67–121.

<LINK "sab-n*">

<TARGET "sab" DOCINFO AUTHOR "Laura Sabourin and Marco Haverkort"TITLE "Neural substrates of representation and processing of a second language"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 8

Neural substrates of representation
and processing of a second language*
Laura Sabourin and Marco Haverkort
University of British Columbia / University of Nijmegen &
Boston University

1. Introduction

Most research in second language acquisition as well as in first language
acquisition does not make a careful enough distinction between the different
levels at which language behaviour and changes in the language of the learner
are described. The description is usually cast in terms of a representation of
grammatical knowledge available to the learner, and changes in language
behaviour are viewed as the result of qualitative changes in that knowledge, for
instance the addition of a rule or the resetting of a parameter. In this paper, we
want to argue that it is important to distinguish between the representation of
grammatical knowledge, the language processor, and general cognitive strategies in adult second language acquisition.
In order to be able to distinguish between grammar and processor in
second language acquisition, we will compare results obtained with different
methods. In particular, we will use an off-line grammaticality judgment task to
tap grammatical knowledge of second language learners, and on-line EEG
measurements to investigate to what extent the processing strategies of second
language learners are qualitatively similar to those used by native speakers.
Specifically, we will investigate the use of grammatical gender to see if L2
processing is strictly linguistic in nature or depends to some degree on more
general cognitive strategies.
In the next sections we will first look at evidence from the field of aphasia
that a distinction between knowledge of grammar on the one hand and the
processor on the other should be made. We will then look at two L2 experiments to see if this distinction also holds for L2.


2. Grammar versus processor: Evidence from aphasia


There is quite extensive evidence from another domain of linguistic inquiry,
the study of aphasia, that supports the idea that the representation of
grammatical knowledge on the one hand and language processing on the other
are dissociated entities. According to this view, aphasics have the knowledge of
their language available, but cannot process language on-line, due to working
memory or other processing limitations (cf. Kolk 1995, 1998).
There are a number of observations that support the idea that aphasics still
have access to the correct grammatical representations, but that their access is
too slow for adequate on-line processing. First, although there is a spontaneous
recovery process post-onset for virtually all patients, there is no indication that
the representations must be re-acquired or relearned.
Second, the fact that patients exhibit task-dependent variation supports
aphasics having access to the grammatical representations. They perform at
chance level with certain constructions, such as object relative clauses,
passives, and object clefts (Caplan and Hildebrandt 1988, Grodzinsky 1990),
in a sentence-picture matching task, where they have to select the picture that
correctly depicts the sentence they were just given. These same patients can
perform much better (close to ceiling level) in a grammaticality judgment task.
Linebarger et al. (1983) and Grodzinsky and Finkel (1998), however, found that
their patients had problems with the grammaticality judgment task, especially
when the sentences to be judged involved antecedent-trace dependencies. The
sentence-picture matching task is more complex and involves more processing than the grammaticality judgment task: a
syntactic structure needs to be established, onto which a semantic representation is then mapped; subsequently, the pictures need to be analysed, resulting
in a conceptual structure, and finally these conceptual structures have to be
compared with the semantic representation of the sentence in order to find the
best match. In a grammaticality judgment task, on the other hand, only the first
step needs to be taken: a syntactic representation needs to be established, and if
the structure under construction fails before it is finished, the sentence is marked
as ungrammatical (Chomsky 1995). A full semantic structure does not need to be
computed, unlike in the sentence-picture matching task. Thus, if patients have a
processing problem, it is to be expected that they will perform better on the
grammaticality judgment task, which is simpler computationally and requires less storage capacity.
The fact that they perform much better on grammaticality judgments clearly
indicates that the grammatical knowledge must be available at some level.
Third, it has been shown in several studies (Burkhardt et al. 2001, Haarmann
1993, Kolk 2002, among others) that aphasics exhibit syntactic and semantic
priming effects. In a syntactic priming task, unimpaired subjects are quicker in a
lexical decision task if the target is a word that syntactically fits into the sequence of
words heard or read up to the point of presentation of the target. This reflects the fact
that language users have clear expectations about what syntactic category is to
come next. For aphasics, these effects also show up, but only with stimulus onset
asynchronies (SOA) that are larger than the optimal SOA for unimpaired subjects:
whereas the optimal SOA for unimpaired subjects is 300 ms (with longer SOAs the
effect gradually disappears), the optimal SOA for the aphasics is much larger.
Haarmann (1993) presents data from a syntactic priming study. He compared sentences such as those in (1) and found a priming effect for an unimpaired control group of about 65 ms on the last word: if that word fit the
syntactic context, the unimpaired control group made the lexical decision 65 ms
quicker than if it did not. The agrammatic patients showed the same priming
effect (a quicker response to words that fit the syntactic context), but only when
the SOA was increased from 300 ms (normals) to 1100 ms.
(1) a. Wij zijn getest/*gewandeld.
       'We are tested/walked.'
    b. Wij kunnen praten/*neus.
       'We can talk/nose.'
    c. op de tafel/*rood
       'on the table/red'

The fact that the aphasics showed a syntactic priming effect can only be
explained under the assumption that they have the relevant knowledge (regarding phrase structure and subcategorization) at their disposal and hence have
syntactic expectations as to what word class the next word will be; otherwise, no
effect should be found. The fact that the optimal SOA is a little over three times
as large for the aphasic population as for the control group, however, indicates
that the aphasics cannot make use of the relevant knowledge quickly enough on-line; as soon as they are given more time, the exact same effect shows up as for
the unimpaired population. For the unimpaired control group, by contrast, the
priming effect disappeared when items were presented at longer SOAs.
A similar priming effect has been shown to exist in filler-gap dependencies,
using semantic priming. Burkhardt et al. (2001), using sentences with moved
wh-phrases and DPs, presented semantically related or unrelated words
(Examples 2 and 3 below) at the trace or 600 ms after the trace in object
position (indicated by t_i).

(2) The kid loved the cheese which_i the brand new microwave melted t_i
    yesterday afternoon while the entire family was watching tv.
(3) The butter_i in the small white dish melted t_i after the boy turned on the
    brand new microwave.

In this experiment as well, the priming effect can only be observed in a small
window, which for normals is immediately at the object position; if the semantic prime is presented with a delay, the priming effect is gradually lost in the
unimpaired population, an indication that it is indeed the reactivation of the
semantic content of the moved phrase at the trace position that causes the
effect. Here, again, the aphasics exhibit a priming effect, but only when the
semantically (un)related word is presented with a delay of 600 ms, indicating
that the patients can construct the trace associated with the moved wh-phrase
or DP. Thus, the representation of the relevant syntactic knowledge must be
available to them; otherwise no priming effect would be expected.
These observations all point in the direction that a processing-based
account of aphasic behaviour is on the right track: the knowledge base seems to
be available, and can be used by the patients under particular conditions.
However, the task cannot be too complicated or involve too many sub-tasks,
and the patients need to be given sufficient time to do the task. At the behavioural level, young children and second language learners behave similarly to
the aphasics in a number of respects (subjects are omitted, verbs are not
inflected for tense and agreement but occur in the infinitival form in the
corresponding syntactic position instead, and functional categories, for instance conjunctions, determiners, pronouns, auxiliaries, copula verbs and prepositions,
are omitted), which suggests that their behaviour should be
explained along similar lines, viz. in terms of processing limitations, in line with
Ockham's razor (see also Avrutin, Haverkort and Van Hout 2001 and the
different papers in that volume). We hypothesize that in other populations,
particularly second language learners, however, these limitations are of a
different nature: not so much timing restrictions, as in the aphasic population,
but the use of qualitatively different processing strategies (see below).

3. Second language processing


As indicated above, the aim of this paper is to investigate the role of the
representation of grammatical knowledge, language processing and general cognitive strategies in adult second language learners. It is possible that successful
second language learners have native-like knowledge, just like the aphasics
(suggesting that access to Universal Grammar (UG) for a second language is
possible). However, they may actually process this knowledge in a non-native-like manner. Their non-native processing, though, may not be due to timing
limitations as in aphasic populations (a quantitative difference) but may be a
qualitative difference.
We will now look at studies that investigate whether advanced second
language learners of Dutch, even if they exhibit the same knowledge as native
speakers in an off-line grammaticality judgment task, exhibit the same neurophysiological responses to grammatical violations. If they do not, this would indicate that,
even though their knowledge is comparable to that of native speakers, their on-line processing differs qualitatively. Comparison of data obtained using the
traditional grammaticality judgment technique with those obtained by tapping
directly into electrophysiological activity in the brain associated with a specific
processing phenomenon allows us to study knowledge and processing separately. Grammatical gender is especially interesting in this respect, because it
involves both lexical and syntactic aspects; hence storage, computation, and
their interaction can be studied simultaneously.

4. Knowledge versus processing: Two experiments


4.1 Grammatical gender
Grammatical gender or noun classication systems are found in many of the
world's languages. Dutch is a gender language with two gender classes, marked
by the definite articles de (common gender) and het (neuter gender). Originally,
the language employed a three gender system with masculine, feminine and
neuter categories, but the former two were conflated into one common gender.
The earlier three-way system is similar to the system presently used in German,
a language that is closely related to Dutch.
The following experiments investigated how second language speakers deal
with local grammatical gender agreement within the noun phrase. There are
two different types of agreement that fall into this category. One type is the
agreement between definite determiner and noun. Common gender nouns
(such as tafel, 'table') take the common definite determiner de and neuter
gender nouns (such as kind, 'child') take, in the singular, the neuter definite
determiner het. In the plural, the determiner de is used for both common and
neuter gender nouns. The second type arises in indefinite noun phrases. The
indefinite determiner is the same for both genders, i.e. een, so agreement is
only evident on the adjective (adjective-noun agreement). For indefinite
common gender nouns, the suffix -e is added to the
adjective, while for indefinite neuter nouns the adjective remains uninflected,
as shown in the following examples:
(4) a. Een klein     kind.
       a   small-Ø   child-neut
    b. Een klein-e   tafel.
       a   small-agr table-com
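The two agreement patterns just described can be summarised in a short sketch. It is a simplified illustration of the rules as stated in the text, not a model of Dutch morphology: the function names are hypothetical, and spelling adjustments that accompany the -e suffix (for example gek becoming gekke) are deliberately ignored.

```python
def definite_determiner(gender, plural=False):
    """Determiner-noun agreement: 'het' only for singular neuter nouns."""
    if gender == "neuter" and not plural:
        return "het"
    return "de"  # common gender nouns, and all plurals


def indefinite_adjective(adjective, gender):
    """Adjective-noun agreement after 'een': -e for common gender only."""
    if gender == "common":
        return adjective + "e"
    return adjective  # neuter: the adjective remains uninflected


print("een", indefinite_adjective("klein", "neuter"), "kind")
print("een", indefinite_adjective("klein", "common"), "tafel")
print(definite_determiner("neuter"), "kind")
print(definite_determiner("common"), "tafel")
```

In both functions the gender of the noun is the only lexical information consulted, which is why the two sentence types in the experiments below probe the same piece of stored knowledge through different surface forms.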

4.1.1 Experiment 1: Grammatical knowledge


This first experiment was designed to determine the level of knowledge second
language speakers can achieve concerning the Dutch gender agreement system.
Only advanced participants with German as their native language were tested in
the second language group. The task here, as for Experiment 2, was to judge the
grammaticality of sentences. There were two types of sentences in the experimental items: the first sentence type contained either the correct or incorrect
definite determiner, while the second type contained either correct or incorrect
adjectival agreement.

Participants
In total 59 participants were tested on this task: 34 native speakers of Dutch
formed the control group, while there were 25 second language learners with
German as their native language. As we were interested in studying advanced
second language learners, participants were required to have a high level of
proficiency. To this end, participants had to have been using Dutch for
at least three years. A proficiency score was also obtained from each second
language participant; a score of 90% or more correct was required. This
proficiency score was determined by testing participants on their knowledge of
number and finiteness agreement.1 Information about the participants is
summarized in Table 1.
Table 1. Participant information. The number of participants included for each
language group along with information as to the average duration and range of
exposure the German participants had to the Dutch language.

Native language    Exposure to Dutch      Accuracy on proficiency test
Dutch (n = 34)     N/A                    Range: 90–100%
                                          Average: 98%
German (n = 25)    Range: 2–49 yrs*       Range: 92–100%
                   Average: 11.6 yrs      Average: 97%

* The one German subject who had less than 3 years of exposure to Dutch started teaching himself
Dutch while still living in Germany, but those years were not counted, as the amount of Dutch used
before moving to the Netherlands could not be determined.

Materials and methodology
The grammaticality judgment test contained 80 sentences of interest, each of
which contained the critical determiner-adjective-noun sequence. Half of these
items belonged to the determiner-noun agreement condition and the other half
to the adjective-noun agreement condition. For the first condition, the critical
nouns in the sentences were preceded either by the correct definite determiner
or by the incorrect definite determiner (Example 5). In the second condition
there were sentences containing indefinite NPs in which the critical noun was
preceded either by the correctly or incorrectly inflected form of the adjective
(Example 6). The test also included 200 filler sentences with different types of
violations: 80 sentences that were used in the proficiency measure, 80 sentences
looking at the use of the relative pronouns and 40 sentences looking at the form
of the predicative adjective. Full details can be found in Sabourin (2003).
(5) Het/*De        kleine kind       probeerde voor het eerst te lopen.
    the-neut/*com  small  child-neut tried     for  the first to walk
    The small child tried to walk for the first time. (DET-N agreement)
(6) Hij loopt op een gekke/*gek       manier. (A-N agreement)
    he  walks in a   funny-com/*neut  way

The critical nouns used in this experiment were controlled for frequency. Half
of the items were of high frequency while the other half were of a middle
frequency. The middle frequency items were still of a fairly high frequency to
ensure that second language participants would know them. The frequency of
each item was determined through the CELEX database (Burnage 1990). The
log frequency of each high frequency item was between 1.96 and 2.98 (average
2.28); for each middle frequency item, it was between 1.11 and 1.49 (average
1.31). Items were also broken down into gender class: half of the nouns were
common gender nouns (de) and the other half were neuter gender ones (het).
Each subject received a grammaticality judgment questionnaire. The
participants were asked to first go through the test making a yes/no decision as
to the grammaticality of each sentence. They were required to complete this
task in 30 minutes. After judging the grammaticality of each sentence, they were
asked to go back to the beginning and correct every sentence they had marked
as ungrammatical. This was done to ensure that subjects were rejecting a
sentence for the right reasons and not, for instance, due to the fact that they felt
that an incorrect preposition or incorrect word order had been used.
In scoring the grammaticality judgments, an answer was considered correct only
if the sentence was correctly judged as grammatical or ungrammatical and, for
sentences judged ungrammatical, the correction was a relevant one. For example,
for the ungrammatical version of the sentence in (5), repeated below as (7), the
participant might correctly have said the sentence was ungrammatical but, in
correcting it, only have changed the position of the prepositional
phrase voor het eerst. If this was the case, the answer was scored as incorrect.
(7) *De       kleine kind       probeerde voor het eerst te lopen.
    the-com   small  child-neut tried     for  the first to walk
    The small child tried to walk for the first time.

Similarly, if the sentence was supposed to be marked as grammatical but the
subject rated the sentence as ungrammatical, making a correction that was
unrelated to the condition being tested, the answer was scored as correct. For
example, if the above sentence had been grammatical (with het kind instead of
de kind), but the subject still rated it as ungrammatical due to the position of
the prepositional phrase, the answer would have been considered correct,
since a correct judgment was given with respect to gender.
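
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' scoring procedure: the names Response and score are hypothetical, and the treatment of a grammatical item that is rejected with a gender-related "correction" (scored here as incorrect) is an assumption, since the text does not discuss that case explicitly.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One judgment on one test sentence (hypothetical record layout)."""
    item_is_grammatical: bool              # status of the item with respect to gender agreement
    judged_grammatical: bool               # the participant's yes/no judgment
    correction_targets_gender: bool = False  # did the correction change the critical determiner/adjective?


def score(r: Response) -> bool:
    """Return True if the answer counts as correct under the rule in the text."""
    if r.item_is_grammatical:
        # Accepting the sentence is correct; rejecting it is still correct
        # when the correction is unrelated to gender (e.g. moving a PP).
        # Assumption: a rejection with a gender-related change is incorrect.
        return r.judged_grammatical or not r.correction_targets_gender
    # Ungrammatical item: it must be rejected AND corrected at the
    # gender-agreement site; moving "voor het eerst" alone does not count.
    return (not r.judged_grammatical) and r.correction_targets_gender
```

For instance, rejecting (7) but only moving the prepositional phrase corresponds to Response(False, False, False), which this sketch scores as incorrect.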

Results
The results for this experiment will be analysed using a four-way ANOVA
(analysis of variance). Only responses to the ungrammatical items will be
analysed as scores on the grammatical items were near perfect for both groups.
The within-subjects effects were definiteness (definite and indefinite), frequency
(high and middle), and gender (common and neuter). The between-subjects
effect was native language (Dutch and German). Only one significant interaction
was found in this analysis; definiteness significantly interacted with L1
(F(1,57) = 15.8, p < .001). This effect can be seen in Figure 1. The main effects of
definiteness (F(1,57) = 32.84, p < .001) and L1 (F(1,57) = 25.37, p < .001) were
also significant.
What is most important to note here is that while there was a difference
between the native speakers and the L2 learners, this is only clearly the case
when indefinite NPs are being used. The gap between the L2 learners and the
native speakers is significantly smaller on the definite NP items than on the
indefinite NP items.

Figure 1. Scores (in percent) comparing the Dutch and German scores on the definite
and indefinite NPs. [Plot not reproduced: y-axis from 50 to 100 percent; groups
Dutch and German; series def and indef.]

To summarize, the German group shows that, for the definite NPs (the
determiner-noun agreement condition), their knowledge is similar to that of
the native speakers. However, for indefinite NPs (the adjective-noun agreement
condition), the German group performs very poorly. One way to interpret these
results is by noting that determiner-noun agreement is similar to simply
assigning gender to nouns and can, therefore, be done on the basis of lexical
information rather than via a syntactic process. On the other hand, adjective-noun
agreement requires the participants to take the lexical knowledge of which
gender an item is and apply this information in order to correctly inflect the
adjective. In the next experiment we look at the on-line processing of the same
sentences. The question now is how the L2 group processes these different kinds
of data on the on-line version of the task.
4.1.2 Experiment 2: Processing
As was seen in the first experiment, the German group showed that when they
are given an overt determiner they can judge the grammaticality quite accurately.
If only this off-line measure had been used, the conclusion might be that
second language speakers acquire a native-like competence of their second
language. But, as was argued above, there are some reasons to think that the
representation of knowledge on the one hand and language processing on the
other can be dissociated. For aphasics, evidence was presented indicating that
they still have knowledge of the language but that they cannot process the
language on-line in a quick enough manner. It is therefore also possible that
second language learners acquire the knowledge of the second language
grammar but are not able to process this language in the same manner as native
speakers, though for different reasons than the aphasics. The fact that accuracy
decreases when syntactic agreement must be processed suggests that this
may be the case.
There are numerous techniques that can be used to test on-line language
processing. Some experiments make use of reaction time (RT) measurements
such as lexical decision and self-paced reading. Unfortunately, although these
techniques can tell us a lot about how language processing is organized in terms
of its general architecture, they may not be fine-grained enough to determine
whether similar or different processing mechanisms are being used. One on-line
technique that provides detailed information on the qualitative aspects of
processing is electroencephalography (EEG). The neuroimaging technique of
Event-Related Potentials (ERPs) is able to measure the electrophysiological
activity in the brain that is thought to directly reflect neural activity. ERPs are
negative or positive changes in the voltage of the ongoing brain activity that can
be elicited by sensory input or a cognitive task. The technique provides information on the latency, amplitude, polarity, and distribution over the scalp of
the EEG-signal. It has been found in previous language studies that signals
elicited by, for instance, grammatical and ungrammatical sentences can be
discriminated (Kutas 1993, Rugg and Coles 1995). One well-understood
ERP-correlate of syntactic language processing is the P600 or Syntactic Positive
Shift (SPS) which is associated with processing of morpho-syntactic anomalies
and complexity (Osterhout and Holcomb 1992, Hagoort, Brown and Groothusen 1993). The P600 is a positive deflection in the EEG-signal that starts
approximately 500 ms after the presentation of the word that renders a sentence
ungrammatical; this positivity continues for about 400 ms and reaches its
maximum amplitude at around 600 ms after the presentation of the word that
renders the sentence ungrammatical. This component is most prominent in
centro-parietal regions of the scalp.
There has recently been some research on the neural correlates of grammatical gender processing in both Dutch and German. These studies have only
looked at native speakers and they have looked only at determiner-noun
agreement. For Dutch, Hagoort and Brown (1999) showed that grammatical
gender incongruencies result in an increase in the amplitude of the P600
component as compared to sentences with congruent determiner-noun
agreement. A P600 component was also found in German for gender violations
(Gunter, Friederici and Schriefers 2000). Thus, grammatical gender violations
in both Dutch and German result in an increase in amplitude of the P600
component. The question then is: will Germans also show this P600 in their L2
processing?

Participants
In total 39 participants were tested on the ERP version of the above experiment.
There were 23 native speakers of Dutch and 14 second language speakers with
German as their native language. The L2 participants had lived in the Netherlands
between two and 32 years, with an average of 9.8 years. None of the
participants took part in Experiment 1.
Materials and methodology
The critical stimulus sentences used in this experiment were the same sentences
used in Experiment 1. While the task used in the ERP version also contained a
grammaticality judgment, there were a few important differences in how the
on-line task was run compared with the off-line task from Experiment 1.
During the ERP measurement, participants were seated in a dimly lit soundproof room facing a computer monitor. Sentences were presented word by
word in the middle of the screen (the word was on the screen for 250 ms,
followed by a blank screen for 250 ms before the next word appeared). Each
sentence was preceded by an asterisk (to let participants know that a new
sentence was about to start). After each sentence, a delay screen was displayed,
followed by a screen requesting subjects to give a grammaticality judgment by
pushing one of two buttons. After each sentence participants were given two
seconds in which they were allowed to blink.2 The experiment started with a
practice session to allow participants to get used to the presentation style of the
sentences and to practice not blinking during the sentence trials. The actual
experiment lasted approximately one hour. The native speakers were given
three breaks while the L2 participants were given a total of seven breaks; the
length of the pause was chosen by the participant, so the total testing time
varied, depending on the length of the pauses that were taken.
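
As a rough illustration of the presentation timing described above, the following Python sketch computes word onsets for one sentence. It assumes only what the text states (250 ms per word followed by a 250 ms blank screen); the function name word_onsets is hypothetical, and the time taken by the preceding asterisk is ignored, so onsets are counted from the first word.

```python
WORD_MS = 250   # time each word stays on the screen (from the text)
BLANK_MS = 250  # blank screen before the next word (from the text)


def word_onsets(sentence):
    """Return (word, onset in ms) pairs, with the first word at 0 ms."""
    soa = WORD_MS + BLANK_MS  # onset-to-onset interval: 500 ms
    return [(word, i * soa) for i, word in enumerate(sentence.split())]


onsets = word_onsets("Het kleine kind probeerde voor het eerst te lopen.")
# The critical noun "kind" is the third word of example (5).
print(onsets[2])
```

Under these assumptions the critical noun of example (5) appears 1000 ms after the onset of the first word, and this onset is the zero point for the ERP averaging described in the Results.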
EEG recording
The EEG activity was recorded by means of tin electrodes mounted in an elastic
cap (Electro-Cap International) from 12 electrode sites, based on the international
10–20 system. The 12 electrodes analyzed were: F7, Fz, F8, T3, Cz, T4, T5,
Pz, T6, O1, Oz and O2. The F represents frontal electrodes, T represents
temporal electrodes, C represents central electrodes, P represents parietal
electrodes and O represents occipital electrodes. Odd numbers represent electrodes
on the left half of the scalp, even numbers represent electrodes on the right
half, and the z represents electrodes along the midline. All electrodes were
referenced to linked mastoids. Both horizontal and vertical electro-oculograms
(EOGs) were measured for both eyes. Electrode impedances were kept below 5 kΩ.
EEG and EOG signals were sampled at 1000 Hz, amplified and digitally filtered
with a cut-off frequency of 30 Hz; effective sample frequency was 100 Hz.

Results
First the behavioural results (accuracy on the grammaticality judgment) will be
presented followed by separate analyses of the on-line ERP data for the native
speakers and the German group. For the L2 group only the sentences which
they correctly judged as grammatical will be analyzed.
The average accuracy scores for the Dutch and German groups are presented in Table 2.
Table 2. Accuracy scores (in %) on the ERP version of the grammaticality judgment task

                   NP Definites                  NP Indefinites
                   grammatical  ungrammatical    grammatical  ungrammatical
Dutch (n = 23)     95%          90%              93%          92%
German (n = 14)    88%          80%              93%          68%

Average ERPs were computed at the above electrode sites for each participant
in all conditions. The averaging was done for an interval starting at the
onset of the critical noun and continuing for 1500 ms post-onset. All averages
were aligned with a 200 ms pre-stimulus baseline (200 ms before onset of the
critical noun is set to zero for both conditions to correct for pre-existing
differences in the EEG). For analysis purposes, averaged ERPs of each 1500 ms
epoch were divided into 50 intervals of 30 ms. This method allows one to see
the onset and duration of effects more clearly. In each of these 50 intervals, mean
amplitudes were statistically analysed with a MANOVA. Effects will be reported only
if three or more successive intervals reach significance at the .05 level for native
speakers and at the .1 level for the German group. Three successive significant
intervals are more likely to reflect a real and reliable effect despite the use of
multiple comparisons. A .1 level will be allowed for the L2 speakers in order to
avoid false negative results of the P600 as less of their data can be analysed (only
the sentences for which they made a correct judgment) and it is expected that
their data will also be more variable. Both of these reasons will likely make it
more difficult to find significant differences in the wave patterns so a less strict
level of significance will be taken, but the reader must be aware then, that
unexpected significant results should be looked at carefully.
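
The reporting criterion described above (50 intervals of 30 ms, with an effect reported only when three or more successive intervals are significant) can be sketched as follows. This is an illustrative reading of the rule, not the analysis code used in the study: the function names are hypothetical and the p-values in the test are invented to exercise the logic, not taken from the data.

```python
def interval_edges(epoch_ms=1500, width_ms=30):
    """Split the post-onset epoch into consecutive (start, end) windows in ms."""
    return [(start, start + width_ms) for start in range(0, epoch_ms, width_ms)]


def reportable_windows(p_values, alpha, min_run=3, width_ms=30):
    """Return (start_ms, end_ms) for every run of at least `min_run`
    successive intervals whose p-value is below `alpha`."""
    windows = []
    run_start = None
    # A sentinel p-value of 1.0 closes a run that reaches the end of the epoch.
    for i, p in enumerate(list(p_values) + [1.0]):
        if p < alpha:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                windows.append((run_start * width_ms, i * width_ms))
            run_start = None
    return windows
```

With alpha set to .05 for the native speakers and .1 for the German group, an isolated run of one or two significant intervals is ignored, which is the safeguard against multiple comparisons that the text describes.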
Each language group will be analysed separately and then discussed
comparatively. For each 30 ms interval a three-way MANOVA was carried out
looking at the effects of grammaticality (two levels), front to back scalp
distribution (four levels: frontal, temporal, parietal and occipital), and left to right
scalp distribution (three levels: left, midline and right). The four levels of the
front to back scalp distribution are: F7, Fz and F8 as the first level; T3, Cz and
T4 as the second level; T5, Pz and T6 as the third level; and O1, Oz and O2 as
the fourth level. The three levels of the left to right scalp distribution are: F7, T3,
T5 and O1 as the left hemisphere electrodes; Fz, Cz, Pz and Oz as the midline
electrodes; and F8, T4, T6 and O2 as the right hemisphere electrodes.
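
The two scalp-distribution factors amount to a 4 x 3 grid over the 12 electrodes. A minimal Python sketch of that grouping is given below; the electrode groupings are taken from the text, while the dictionary and function names are hypothetical.

```python
# Factor levels exactly as listed in the text.
FRONT_TO_BACK = {
    "frontal": ["F7", "Fz", "F8"],
    "temporal": ["T3", "Cz", "T4"],
    "parietal": ["T5", "Pz", "T6"],
    "occipital": ["O1", "Oz", "O2"],
}
LEFT_TO_RIGHT = {
    "left": ["F7", "T3", "T5", "O1"],
    "midline": ["Fz", "Cz", "Pz", "Oz"],
    "right": ["F8", "T4", "T6", "O2"],
}


def factor_levels(electrode):
    """Return the (front-to-back, left-to-right) levels of one electrode."""
    row = next(level for level, sites in FRONT_TO_BACK.items() if electrode in sites)
    col = next(level for level, sites in LEFT_TO_RIGHT.items() if electrode in sites)
    return row, col


ALL_SITES = [site for sites in FRONT_TO_BACK.values() for site in sites]
print(factor_levels("Pz"))
```

Each electrode falls into exactly one cell of the grid, so, for example, Pz is the parietal midline site at which the difference waves are later compared.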

Definite NPs
Upon a visual inspection of the ERP patterns for the definite NPs for the native
speakers a clear P600 pattern is seen; the ungrammatical sentences are more
positive than the grammatical sentences over the more posterior electrodes.
This can be seen by looking at Figure 2.
Looking at the waves statistically, the presence of a P600 is supported.
Within each of the 30 ms intervals from 570 to 1500 ms the effect of grammaticality
is significant at the .05 cut-off. Within this same time frame there was also a
significant interaction with the front to back factor. From 570 to 900 ms, there
is a largely distributed P600 component which is significant over the following
electrodes: Fz, C3, Cz, C4, T5, P3, Pz, P4, T6, O1, Oz and O2. From 900 ms to
the end the posterior positivity is maintained, while a frontal negativity starts
which is only significant to the .05 level at electrodes F3 and F8.
Upon visually inspecting the ERP patterns for the German participants (see
Figure 3) we can also see a P600 component.
Statistically we see that indeed a P600 effect is present. However, it is much
more restricted and starts later than the one found for native speakers. Between
840 and 990 ms there is a significant positivity for the ungrammatical sentences at
electrodes C3, Cz, C4, T5, P3, Pz, P4 and T6. This can be seen in Figure 4 where
the difference waves at electrode Pz are compared for the Dutch and German
groups. Difference waves represent the wave found for the ungrammatical
sentence minus the wave found for the grammatical sentence; thus a positivity
in the difference wave reflects a positivity in the ungrammatical sentences.
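
The difference wave defined above is a point-by-point subtraction, which can be sketched in Python as follows. The function names are hypothetical and the amplitudes in the example are invented for illustration; they are not values from the study.

```python
def difference_wave(ungrammatical, grammatical):
    """Subtract the grammatical average from the ungrammatical one, sample by sample."""
    if len(ungrammatical) != len(grammatical):
        raise ValueError("both averages must cover the same samples")
    return [u - g for u, g in zip(ungrammatical, grammatical)]


def peak(times_ms, wave):
    """Return (latency in ms, amplitude) of the most positive sample."""
    index = max(range(len(wave)), key=lambda i: wave[i])
    return times_ms[index], wave[index]


# Invented amplitudes (microvolts) at three time points, for illustration only.
times = [500, 600, 700]
diff = difference_wave([1.0, 3.0, 6.0], [1.0, 2.0, 2.0])
print(peak(times, diff))
```

A positive value in the result means the ungrammatical condition was more positive at that sample, so comparing peak latencies of two groups' difference waves is one simple way to express the later onset and later maximum reported for the German group.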

Figure 2. ERP wave patterns for the grammatical (darker line) versus ungrammatical
sentences for the native speakers in the NP definite condition. The y-axis represents a
voltage of 5 microvolts with positive plotted up.

In Figure 4 we can see that the P600 effect starts later for the German group
and that its maximal amplitude occurs later as well. Another difference between
the ERP waves for the German and Dutch groups can be seen. The German group
does not show the late frontal negativity that is seen in the native speaker ERP.

Indefinite NPs
A visual inspection of the native speaker data (see Figure 5) shows that a P600
effect is also present for these sentences. Statistically, the P600 effect is
significant from 600 to 710 ms at electrodes C3, Cz and C4 and between 600 and 1500
ms at electrodes T5, P3, Pz, P4, T6, O1, Oz and O2. Electrodes Fz and F8 show
a late frontal negativity. This can be seen in Figure 5.
Visual inspection of the German data reveals no obvious effects (see
Figure 6). It is important to note that only sentences that were correctly judged
in the grammaticality judgment portion of this task are included in the
analyses below.

Figure 3. ERP wave patterns for the grammatical (darker line) versus ungrammatical
sentences for the German group in the NP definite condition. The y-axis represents a
voltage of 5 microvolts with positive plotted up.

Looking at the statistics there are three time frames where the effect of
grammaticality is significant: 300 to 420 ms, 780 to 960 ms and 1080 to 1200
ms. Within none of these time frames does the effect of grammaticality
significantly interact with either the front to back factor or the left to right factor.
Looking at each electrode separately also does not result in any significant
differences between grammatical and ungrammatical waves. However, upon
visual inspection, it appears that in the first time frame there is a frontal
positivity, followed by a left posterior negativity and a frontal negativity for the
ungrammatical sentences. The final significant time frame seems to be due to a
largely distributed positivity. A comparison of the difference waves for the
native and German speakers at electrode Pz can be seen in Figure 7.

Figure 4. The difference waves for the ungrammatical minus the grammatical condition for
the NP definite sentences. The P600 component as seen at electrode Pz for the native
speakers (the darker line) and the German speakers. The y-axis represents a voltage of
8 microvolts with positive plotted up.

In the case of the definite NP where gender agreement between the overt
determiner and noun can be seen as equivalent to gender assignment to nouns,
which is similar in Dutch and German, the Germans are able to perform well
off-line and their on-line processing looks very similar to that of native speakers,
although the P600 component found for the ungrammatical sentences
occurs later and is more restricted. However, in the case of Dutch indefinite
NPs, where agreement can only occur at a more purely syntactic level, since
more than just knowing whether an item is common or neuter is required, the
German speakers have quite a bit of difficulty in the off-line judgment, and in
looking at processing of only the items they correctly judged in the on-line task
they do not show a P600 component.

Figure 5. ERP wave patterns for the grammatical (darker line) versus ungrammatical
sentences for the native speakers in the NP indefinite condition. The y-axis represents
a voltage of 5 microvolts with positive plotted up.

5. General conclusions
The main goal of this paper was to show that in the study of language behaviour
in general and in the field of second language acquisition in particular the
representation of grammatical knowledge and the processing in which this
knowledge is employed need to be carefully distinguished. We have shown that
for the phenomenon of grammatical gender and for one specific group of
second language learners this is indeed an important distinction, because this
group shows a difference in processing at least for the indefinite NPs. While
quantitative differences in language processing can be seen in aphasic populations,
when compared with unimpaired language users, we have shown that
there is actually a qualitative difference between native speakers and second
language learners in the processing of language. The results presented in this
paper suggest that the German participants may be making use of a translation
strategy to learn Dutch gender assignment knowledge and that they may be
using their L1 processing strategies to process their L2 for cases where the
grammars are very similar. Further support for this is seen in the ERP processing
patterns for sentences involving subject-verb agreement and finiteness, both
phenomena for which the Germans show native-like knowledge in the off-line
task. For finiteness structures, for which the German group can translate their
L1 strategies, they show native-like processing but for subject-verb agreement,
which exists in German but is different in this language, the processing is
different from that observed in native speakers (see Sabourin 2003). Another
important thing to note is that the very clear late frontal negativity which was
seen in the ERPs of the native speakers was not found for the German group
even for the NP definite condition for which a P600 effect can be seen.3 These
findings suggest that linguistic processing (as reflected by the P600) can only
occur in the L2 when the processing strategy from the L1 can be used relatively
directly in L2 processing. Processing of grammatical structures that are not
similar in the L1 and L2 may be learned and handled by more general cognitive
strategies. Ullman (2001) discussed the declarative/procedural memory
distinction in terms of L1 processing, lexical knowledge being declarative and
syntactic knowledge being procedural in nature. For L2 speakers, Ullman claims
that both lexical and syntactic knowledge usually rely on declarative memory,
although Ullman suggests that factors such as age of exposure and practice may
influence the ability to use procedural memory in L2 learners. Using this terminology
the results presented here suggest that in L2 processing of grammatical
knowledge, only in the case where the L1 and the L2 are similar can procedural
memory be used by advanced adult L2 learners although probably with a
quantitative difference compared to native speakers, viz. a temporal delay.

Figure 6. ERP wave patterns for the grammatical (darker line) versus ungrammatical
sentences for the German group in the NP indefinite condition. The y-axis represents a
voltage of 5 microvolts with positive plotted up.

Figure 7. The difference waves for the ungrammatical minus the grammatical condition for
the NP indefinite sentences. The P600 component as seen at electrode Pz for the native
speakers (the darker line) and the German speakers. The y-axis represents a voltage of
8 microvolts with positive plotted up.

<DEST "sab-n*">

Further research should shed light on whether this non-native-like processing
is an across-the-board second language effect or whether differences may be
found depending on the particular phenomena studied and particular language
groups involved.

Notes
* We would like to thank Laurie Stowe, John Hoeks, Liz Temple and two anonymous
reviewers for their comments on an earlier draft of this paper. The research of the first author
was funded by the School of Behavioral and Cognitive Neurosciences (BCN) of the University of Groningen; the research of the second author was funded by a grant from the Royal
Netherlands Academy of Sciences (KNAW).
1. For more details on the proficiency test see Sabourin (2001, 2003).
2. Participants were asked to try their best to not blink during presentation of the sentences
as eye movements and blinks greatly distort the EEG signal.
3. This suggests that whatever the L2 speakers are doing, it is not exactly the same as the
native speakers.

References
Avrutin, S., Haverkort, M. and Van Hout, A. 2001. Introduction: Language acquisition and
language breakdown. Brain and Language 77: 269–273.
Burkhardt, P., Piñango, M. and Wong, K. 2001. The role of the anterior left hemisphere in
real-time sentence comprehension: Evidence from split intransitivity. Ms. Yale University.
Burnage, G. 1990. A guide for users. Nijmegen: CELEX Centre for Lexical Information.
Caplan, D. and Hildebrandt, N. 1988. Disorders of syntactic comprehension. Cambridge: MIT
Press.
Chomsky, N. 1995. The minimalist program. Cambridge: MIT Press.
Grodzinsky, Y. 1990. Theoretical perspectives on language deficits. Cambridge: MIT Press.
Grodzinsky, Y. and Finkel, L. 1998. The neurology of empty categories: Aphasics' failure to
detect ungrammaticality. Journal of Cognitive Neuroscience 10 (2): 281–292.
Gunter, T. C., Friederici, A. D. and Schriefers, H. 2000. Syntactic gender and semantic
expectancy: ERPs reveal early autonomy and late interaction. Journal of Cognitive
Neuroscience 12 (4): 556–568.
Haarmann, H. 1993. Agrammatic aphasia as a timing deficit. Doctoral dissertation, University
of Nijmegen.
Hagoort, P. and Brown, C. 1999. Gender electrified: ERP evidence on the syntactic nature
of gender processing. Journal of Psycholinguistic Research: Special Issue on Processing of
Grammatical Gender 28 (6): 715–728.

</TARGET "sab">

Hagoort, P., Brown, C. and Groothusen, J. 1993. The syntactic positive shift (SPS) as an
ERP-measure of syntactic processing. Language and Cognitive Processes 8: 439–483.
Kolk, H. 1995. A time-based approach to agrammatic production. Brain and Language 50:
282–303.
Kolk, H. 1998. Disorders of syntax in aphasia: Linguistic-descriptive and processing
approaches. In Handbook of neurolinguistics, B. Stemmer and H. Whitaker (eds),
250–260. San Diego: Academic Press.
Kolk, H. 2002. Language production in agrammatic aphasics: an experimental study. Paper
presented at the University of Nijmegen Linguistics Colloquium.
Kutas, M. 1993. In the company of other words: Electrophysiological evidence for single-word
and sentence context effects. Language and Cognitive Processes 8 (4): 533–632.
Linebarger, M., Schwartz, M. and Saffran, E. 1983. Sensitivity to grammatical structure in
so-called agrammatic aphasics. Cognition 13: 361–392.
Osterhout, L. and Holcomb, P. J. 1992. Event-related brain potentials elicited by syntactic
anomaly. Journal of Memory and Language 31: 785–806.
Rugg, M. D. and Coles, M. G. H. 1995. The ERP and cognitive psychology: Conceptual
issues. In Electrophysiology of the mind: Event-related brain potentials and cognition,
M. D. Rugg and M. G. H. Coles (eds), 27–39. Oxford: Oxford University Press.
Sabourin, L. 2001. L1 effects on the processing of grammatical gender in L2. In Eurosla
Yearbook, Volume 1, S. Foster-Cohen and A. Nizegorodcew (eds), 159–169. Amsterdam:
John Benjamins.
Sabourin, L. 2003. Grammatical gender agreement in L2 processing. Doctoral dissertation,
University of Groningen.
Ullman, M. T. 2001. The neural basis of lexicon and grammar in first and second language:
The declarative/procedural model. Bilingualism: Language and Cognition 4 (1):
105–122.

<LINK "gre-n*">

<TARGET "gre" DOCINFO AUTHOR "David W. Green"TITLE "Neural basis of lexicon and grammar in L2 acquisition"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 9

Neural basis of lexicon and grammar
in L2 acquisition
The convergence hypothesis*
David W. Green
University College London

1. Introduction

In acquiring a second language (L2) individuals must grasp its grammar and its
vocabulary but differences in the context of acquisition from a first acquired
language (L1) may mean that different learning mechanisms are involved. Such
differences, in turn, carry implications for the neural representation of L1 and
L2. Understanding the neural basis of the representation of L1 and L2 can
therefore contribute to a deeper understanding of the interface of syntax and
lexicon in L2 acquisition.
The basic orienting question is this: is a person's lexical and grammatical
knowledge represented differently if it is learned as an L2 as opposed to an L1?
In this chapter, I contrast this proposal, termed the differential representation
hypothesis, with an alternative, termed the convergence hypothesis. This
hypothesis states that as proficiency in L2 increases, non-native speakers
represent, and process, the language in the same way as native speakers of that
language. In addressing these hypotheses, I consider both their computational
basis, and relevant neuropsychological and neuroimaging data.
The chapter is structured as follows: I first consider the differential
representation hypothesis. On one version an L2 is represented in a person's
right-hemisphere rather than in their left-hemisphere. I reject this possibility. Next I
consider a version in which in L2, in contrast to L1, grammatical and lexical
information is represented in a common memory system. This version of the
hypothesis relies on a distinction between two memory systems that has been
justified primarily by studies on amnesic patients. However, computational
modelling shows that a single memory system is sufficient to generate the data
of these amnesic patients. Such a result prompts consideration of the computational
basis for the alternative, convergence hypothesis. By themselves computational
arguments are not decisive and so we consider neuropsychological and
neuroimaging data in an effort to adjudicate between the two hypotheses
empirically. A final section considers the kind of studies needed to further our
understanding of the issue.

2. A distinct hemispheric representation of L2?


The human brain is a product of evolution and it makes computational sense
for an evolved system to have redundant and duplicate mechanisms for
performing tasks (Edelman 1989). A strong version of the dierential representation hypothesis is therefore possible in which L2 is represented in a completely distinct neuroanatomical substrate from L1 (Scoresby-Jackson 1867).
Language functions in monolingual, right-handed individuals are typically
represented in a distributed left-hemisphere network. 91% of right-handed
participants showed left-hemisphere dominance for language in a study in
which they were injected with a barbiturate of sodium amytal either into the
right, or into the left, carotid artery (Loring, Meador, Lee et al. 1990). Sodium
amytal causes an anaesthesia for 1 to 2 minutes in the cerebral hemisphere on
the same side as the injection. If language is lateralised to that hemisphere the
person will be unable to speak. Less invasively, in a large functional imaging
study, 94% of right-handed participants showed left-hemisphere dominance for
language (Springer, Binder, Hammeke et al. 1999). These data are consistent
with the notion that the left-hemisphere contains circuitry specialised for
language processing. Nonetheless, in principle, L2 might be represented in
homologous areas of the right-hemisphere (Albert and Obler 1978). However,
Rapport, Tan and Whitaker (1983) in a study of right-handed polyglot aphasics
prior to surgery found no evidence of the disruption of picture naming following intracarotid injection of sodium amytal into the right-hemisphere. In
contrast, naming was massively disrupted following injection into the left-hemisphere. Further, in a study of 88 reported cases of right-handed bilingual
aphasics, Fabbro (1999: 210-211) found that only 8% presented with a lesion to
the right-hemisphere. Taking into account reporting biases, he concluded that the
incidence of aphasia in bilinguals with right-hemisphere lesions is not in fact
higher than that shown by monolingual aphasics. These observations suggest that
both L1 and L2 are represented in a common substrate in the left-hemisphere

Neural basis of lexicon and grammar in L2 acquisition 199

though perhaps with different microanatomical representations (Paradis 2001).


The critical question then concerns the neural representation of lexical and
grammatical knowledge for L2 within this hemisphere. I consider a more subtle
version of the differential representation hypothesis in the next section.

3. The specific representation of lexicon and grammar


Researchers have taken different views on the extent to which the lexicon and
grammar of a language are subserved by distinct neural mechanisms that are
language-specific. According to one view, words are processed in one dedicated,
posterior system and grammar is processed in another dedicated anterior
system (e.g. Chomsky 1995, Pinker 1994). An alternative view is that these two
components of the language system are indeed mediated by distinct neural
mechanisms but that these mechanisms are not in fact specific to language.
Ullman (2001a) proposed that the lexicon is stored in a neural system that
subserves declarative memory in general. By contrast, grammar is represented
in a procedural memory system that is implicated in the learning of motor and
cognitive skills in general.
The declarative memory system is held to be involved in the learning of
facts and events and to be particularly important in the learning of arbitrarily
related information from different sources such as the associations between the
sounds of words and their meanings. Information in this system is available for
explicit (i.e. conscious) recollection. Whilst initial learning may depend on the
medial temporal structures (e.g. the hippocampus), neocortical regions
subsequently become the principal site of representation (e.g. temporo-parietal
region). In contrast, the use of grammar (including syntax, morphology, and
phonology) is achieved by a system that underlies the performance of motor
skills in general: a procedural system mediated by structures in the frontal
cortex and basal ganglia and the inferior parietal region (Squire, Knowlton and
Musen 1993, Squire 1994). The nondeclarative or procedural system is held to
influence behaviour implicitly, i.e. in the absence of conscious recollection.
Hence the contrast between declarative and procedural (or nondeclarative)
memory systems is sometimes referred to, in summary terms, as a contrast
between explicit and implicit memory systems (see, for example, Paradis, 1994).
But this identification cannot be taken too far. The acquisition of vocabulary for
instance is not simply a matter of declarative memory. Gupta and Dell (1999)
argue that the learning of vocabulary involves the explicit learning of the
relationship between a phonological representation and meaning and the
implicit learning of the mapping of input phonemes onto an articulatory chain
(see also Ellis 1995 and Segalowitz and Segalowitz 1993). Consistent with this
view, Paradis (1997: 333-334) argues that the acquisition of vocabulary is a
partially explicit process. Likewise Lebrun (2002: 304) argues that common
words and phrases are stored not only neocortically but also as verbomotor subcortical patterns. But the point to note here is that this contrast applies equally to
L1 and L2 vocabulary learning.
The distinction between declarative and nondeclarative memory systems
has been exploited as a means to contrast the neural representation of L1 and
L2. Consider one possible difference between the acquisition of L1 and L2. The
representation of a language acquired in an oral, conversational setting (e.g.
Quebecois; Friulian) may differ from one acquired in the formal setting of a
school. In particular, there may be a difference in the representation of grammar and morphology (morphosyntax). Individuals who acquire L1 in a
conversational setting achieve proficiency in morphosyntax implicitly. In
contrast the grammatical rules in the school setting are part of an explicit
declarative knowledge. Maturational constraints may also affect the acquisition
of morphosyntax more than the acquisition of vocabulary (Paradis 1994: 398).
In consequence, Paradis (1994) has argued that L1 and L2 may load differently
on these two memory systems. L1, especially its morphosyntax, but also its
lexicon to an extent, may load more on the implicit, procedural memory system
whereas L2 may load more on an explicit, declarative memory system.
Ullman (2001b) has made a related proposal based on the notion that
linguistic abilities are sensitive to the age of exposure to the language (Lenneberg 1967). It is generally considered that attainment in L2 is constrained by the
age at which learning begins. For instance, there is a negative correlation
between the age at which learning begins and eventual performance (Johnson
and Newport 1989). But not all language capacities are affected equally: the use
of grammar is more adversely affected than the use of lexical items. As a result,
in L2 acquisition, there is a specific shift in processing of grammatical computation from the procedural memory system to the declarative memory system
(Ullman 2001b: 108; Note 2: 110 contrasts his proposal with that of Paradis).
There is no shift for lexical processes. These are held to depend on the declarative memory system for both L1 and for L2. I take this version of the differential
representation hypothesis and consider it in a little more detail.
In Ullman's (2001b) view, vocabulary in both L1 and in L2 is represented
in a declarative memory system in the form perhaps of an associative
network linking meanings and sounds. By contrast, whereas grammatical
processing in an L1 (e.g. forming the past-tense of a regular English verb such
as walk by adding -ed to the stem) relies on a procedural system, grammatical
processing in an L2 (such as English) is achieved declaratively. The basic notion
is that linguistic forms that are compositionally computed in L1 are memorized
in L2 as if they were words or idioms. Given that the associative lexical memory
can generalize patterns, such a system can still be productive. Certain rules may
also be learned, though these will differ in type from any implicitly learned rules
of L1. Ullman acknowledges that age of exposure to L2 is not the only factor
affecting the dependence on declarative memory: even older learners may
show a degree of dependence on procedural memory if they have had a large
amount of practice, that is, a fairly substantial amount of use of the language
(Ullman 2001b: 110). But the clear implication is that even proficient speakers
of L2 will differ from native speakers of that language in relying much more on
declarative memory for grammatical computations.
There are two elements to Ullman's (2001b) proposal. First, it is motivated
by the claim that linguistic abilities are sensitive to the age of exposure. Second,
it appeals to two distinct types of memory that have been inferred from research
on amnesic patients. The following section considers each element.

4. Some grounds for doubt


Ullman (2001b) motivated the shift towards a declarative representation of
grammatical knowledge in L2 by appealing to data (Johnson and Newport
1989) on the limits of L2 (English) attainment for native Korean and Chinese
speakers who learned English post-puberty (after the age of 17 years). In the
Johnson and Newport study, L2 learners were asked to judge whether or not
auditorily presented sentences were grammatically correct. Roughly half
of the sentences were grammatical and half were minimally different ungrammatical variants. A key finding was that age of acquisition was negatively
correlated with performance before puberty, but there was no systematic
relationship between age of acquisition and performance post-puberty. Further,
few if any of the 46 participants in their study achieved native-like levels of
performance post-puberty. These results are consistent with a critical period
view of language acquisition (Lenneberg 1967). In contrast to such data,
Birdsong and Molis (2001) in a replication of Johnson and Newport, but using
61 native Spanish speakers, found that the age of acquisition did predict
attainment in L2 post-puberty. They also found evidence of native-like attainment in late learners of L2. They argue, in line with Flege, Yeni-Komshian and
Liu (1999), that practice is an important factor in determining the eventual level
of attainment. The nature of the L1 and L2 pairing may also be relevant.
I turn now to the key element of Ullman's (2001b) proposal, namely the
idea that emphasis is shifted to the declarative memory system in L2 learners
and that there is little or no involvement of a procedural memory system in
grammatical processing. This proposal presumes that there is good evidence for
the existence of these two types of memory systems. Data from amnesic patients
appear to provide compelling support. Amnesics have poor declarative memory
but show normal performance on various tasks involving nondeclarative
memory (Gabrieli 1998). A study by Knowlton and Squire (1993) is exemplary.
They used dot patterns created by systematically distorting a prototype pattern.
Amnesic patients were able to classify these patterns normally (a nondeclarative
task) but were severely impaired in their ability to recognize whether or not a
particular pattern had been presented previously (a declarative task). Knowlton
and Squire interpreted these data as evidence that performance in the two tasks
was mediated by two different memory systems, one of which (the declarative
system) was impaired and the other of which was not. However, computational
work by Nosofsky and Zaki (1998) challenged this interpretation by showing
that differences in performance on these two tasks can be obtained within a
single memory system. A slight reduction in the value of a sensitivity parameter
in their computational model reduced classification performance marginally
but exerted a marked effect on recognition performance.
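The logic of this single-system argument can be illustrated with a toy calculation. The sketch below is not Nosofsky and Zaki's published model: the one-dimensional stimulus space, the three stored exemplars, the exponential similarity function, the shared response criterion and the two sensitivity values are all illustrative assumptions introduced here. Both judgements are read off the same summed similarity to memory, and only the sensitivity value differs between the two simulated learners.

```python
import math

# Hypothetical stored exemplars on a single stimulus dimension (illustrative values).
EXEMPLARS = [-1.0, 0.0, 1.0]
CRITERION = 1.0  # assumed response criterion, shared by both tasks


def familiarity(item, sensitivity):
    """Summed similarity of an item to every stored exemplar."""
    return sum(math.exp(-sensitivity * abs(item - e)) for e in EXEMPLARS)


def endorsement(item, sensitivity):
    """Probability of responding 'old' or 'member', driven by one memory store."""
    f = familiarity(item, sensitivity)
    return f / (f + CRITERION)


def task_scores(sensitivity):
    """Return (recognition, classification) discrimination scores."""
    old_item = 0.0     # a studied exemplar
    new_member = 0.5   # unstudied, but close to the studied set
    non_member = 5.0   # far from everything studied
    recognition = endorsement(old_item, sensitivity) - endorsement(new_member, sensitivity)
    classification = endorsement(new_member, sensitivity) - endorsement(non_member, sensitivity)
    return recognition, classification


if __name__ == "__main__":
    for label, c in (("higher sensitivity", 1.0), ("lower sensitivity", 0.5)):
        rec, cls = task_scores(c)
        print(f"{label}: recognition={rec:.3f}, classification={cls:.3f}")
```

With these assumed values, halving the sensitivity cuts the recognition score by more than half while the classification score falls by less than a fifth. That is the qualitative pattern at issue, obtained here without positing a second memory store.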
More pertinent to the present concern is work on the learning of artificial
grammars. Knowlton and Squire (1994, 1996) contrasted normals and amnesics
in their ability to learn an artificial grammar. In such studies individuals first
memorise a set of strings generated by a (finite-state) grammar and are then
informed about a set of rules generating the strings. In the classification task,
they have to classify a new set of strings into those that are grammatical and
those that are not. In the recognition task, they have to indicate whether or not
a string of symbols was presented. In the studies by Knowlton and Squire
classification performance in amnesic patients was normal but recognition
performance was impaired. Knowlton and Squire interpreted this as evidence
that the two tasks are mediated by different memory systems, only one of
which, the declarative memory system, is impaired in amnesics.
Other studies support a dissociation between recognition and repetition
priming in amnesic patients (e.g. Hamann and Squire 1997: Experiment 1).
Repetition priming refers to the improvement in the identification, detection or
production of a stimulus as a result of having experienced it previously. It is
considered to be mediated by nondeclarative memory because repetition
priming occurs even when there is no conscious recollection of the prior
experience of the stimulus (Gabrieli 1998). Priming is held to be mediated by
neocortical structures that are spared in amnesic patients (e.g. McClelland,
McNaughton and O'Reilly 1995).
Hamann and Squire (1997: Experiment 1) presented amnesic patients and
controls in a priming phase with a set of four-letter consonant strings for three
seconds each. In a later study phase, these stimuli and others were presented for
170 ms each and participants had to identify them. Priming was operationalised
as the difference in the identification of old and new strings. After two priming
and study phases participants were tested for their recognition of stimuli
presented in the study phases. The recognition test consisted of pairs of old and
new stimuli and participants had to decide which string was old. Amnesics
showed the same degree of priming as the normal controls but their recognition
performance was at chance. But do these dissociations (between recognition
and classification and between recognition and repetition priming) require us
to postulate two distinct memory systems?
On the basis of simulation data, Kinder and Shanks (2001) argue that they
do not. They used a simple recurrent network (see Cleeremans 1993) to
simulate performance in artificial grammar learning. In order to differentiate an
amnesic network from a normal network they reduced the learning rate during
acquisition and, in a separate simulation, reduced the number of hidden units
prior to test. A change in either parameter was sufficient to induce a dissociation between classification and recognition. In a second set of simulations, they
showed that a simple recurrent network could also simulate the dissociation
between recognition and repetition priming.
These simulation results show that the dissociations observed in the clinical
population do not require a dual-memory system. Instead, such results are
consistent with a single system or network and so weaken support for the
procedural/declarative distinction at the heart of the differential representation
hypothesis. Although these simulation results lead us to be wary of using
performance differences as direct evidence of different cognitive and neural
systems (Note 1), computational results only provide an existence proof and do not
establish that the brain does not in fact have distinct declarative and nondeclarative memory systems. Further, I know of no computational studies
aimed at showing whether or not the kinds of specific dissociations referenced
by Ullman can emerge from single networks (see also Ullman 2001b, Note 1:10).
At a minimum, such results encourage the search for an alternative formulation. In fact, computational considerations lead us to expect a rather different
outcome for L2 acquisition. The next section considers a computational
justication for the convergence hypothesis.

5. The convergence hypothesis and its computational basis

According to the convergence hypothesis, any qualitative differences between
native speakers of a language and L2 speakers of that language disappear as
proficiency increases. Such a hypothesis is broadly in line with the idea that
proficiency in language involves identifying, and using, the various cues to
meaning; see, for example, the competition model (MacWhinney 1997). The
convergence hypothesis is specifically concerned with the neural, and not
simply the cognitive, representation of L2. Given the diversity of languages, in
order to consider the computational grounds for the convergence hypothesis,
we need to characterise languages at a certain level of abstractness. There are
four linguistic means for communicating experience (see, for example, Tomasello 1995): individual symbols (lexical items), markers on symbols (grammatical morphology), ordering patterns of symbols (word order) and prosodic
variations of speech (e.g. stress, intonation, timing). Languages differ in the
weight they attach to these different linguistic means. In some languages, word
order is basically free and information on who did what to whom is conveyed
by word endings or by prosody in tone languages. By contrast, in English, such
information is conveyed by word order and this is relatively rigid. These
different linguistic means or signals require different devices for their processing. The first step of the computational argument for the convergence hypothesis is that the neural representation of the various linguistic devices is similar
across languages. The next paragraph spells out the basis for this argument.
Such devices may be represented by specic networks with a distinct neural
anatomical representation or they may be mediated by a specialised network
with a distributed neural representation. Specialised networks can emerge from
unique interactions amongst a set of regions each fulfilling a number of
different functions (e.g. Mesulam 1990). Consider the development of a system
using these devices to communicate meaning. First, different neural regions
may compete to process input. Those regions, whether innately specified, or
possessing some small processing advantage, will come to mediate processing of
a given linguistic means. Neural regions active at the same time will connect
together (Hebb 1949, Robertson and Murre 1999) giving rise to a specialised
network. Second, once a network has come to process signals of a particular
type it will resist processing other types of signals unless input to it is curtailed
in which case it may process signals of another type given that plastic reorganization is possible. One line of support for this view is evidence of crowding-out: when language is displaced to the right-hemisphere as a result of neurological damage to the left-hemisphere, a person's visuo-spatial skills (typically
mediated by the right-hemisphere) are impaired (Teuber 1974, Strauss, Satz
and Wada 1990). Third, given commonalities across brains in the initial
sensitivities of different regions (call this the commonality assumption; Note 2),
there will be commonalities in the neural representation of the different devices
for speakers of different languages.
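The co-activation principle invoked in the first step can be sketched in a few lines. This is a minimal illustration of a plain Hebbian update and not a model proposed in the text: the three "regions", the activity patterns and the learning rate are hypothetical values chosen only to show how repeated joint activity yields a specialised link.

```python
def hebbian_update(weights, activity, rate=0.1):
    """Strengthen the connection between every pair of co-active regions."""
    n = len(activity)
    return [
        [
            # No self-connections; a link grows only when both regions are active.
            weights[i][j] + (rate * activity[i] * activity[j] if i != j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


def train(patterns, n_regions, rate=0.1):
    """Apply the Hebbian update once per activity pattern, starting from zero weights."""
    weights = [[0.0] * n_regions for _ in range(n_regions)]
    for activity in patterns:
        weights = hebbian_update(weights, activity, rate)
    return weights


# Hypothetical input: regions 0 and 1 respond together to one kind of signal,
# region 2 responds alone to another kind.
PATTERNS = [[1, 1, 0], [0, 0, 1]] * 10

if __name__ == "__main__":
    w = train(PATTERNS, n_regions=3)
    print("link 0-1:", w[0][1], "link 0-2:", w[0][2])
```

Regions 0 and 1 end up strongly linked while region 2 stays unconnected to either. If brains share the same initial sensitivities, the same inputs would carve out the same networks, which is the intuition behind the commonality assumption.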
The second step of the computational argument for the convergence
hypothesis is that the acquisition of an L2 arises in the context of an already
specified, or partially-specified system, with a specific neural network mediating
each device (Note 3). It follows that L2 will receive convergent representation with L1.
Further, given the commonality assumption (see above) the representation of
L2 will converge with the representation of that language learned as an L1.
The convergence hypothesis does not entail that a speaker of L2 will
necessarily achieve native-like levels of performance (for example in achieving
certain phonetic norms, Flege 1995) nor does it exclude the possibility that tasks
such as mental arithmetic are carried out exclusively in L1. Clearly, also, the
contexts of acquisition (e.g. a formal school setting versus an immersion
setting) affect the initial registration of linguistic information. However, in
contrast to the differential representation hypothesis, the convergence hypothesis is committed to the prediction that as proficiency in L2 increases, the same
linguistic means involve the same neural networks as in native speakers. The
hypothesis would be refuted if there is no change in representation with
proficiency and if a normal, proficient L2 speaker activated neural networks
disjoint from those of a native speaker, especially when encoding and decoding
syntactic information. The fact that explicit, declarative representations of
grammatical information play only an initial role in on-line processing,
according to the convergence hypothesis, does not mean that they are unimportant. Explicit (metalinguistic) representation may well benefit the recovery of L2
over L1 following brain-damage (e.g. Lebrun 2002, Paradis 1994, 1997). But
such a possibility, it seems to me, cannot be used to claim a continuing role for
such representations in on-line processing, once the relevant procedures are in
place. However, this possibility is open to test.

Of course, certain differences in processing profiles and neural activation
are to be expected when L2 speakers are contrasted with monolingual speakers
of that language. The acquisition of an L2 carries consequences. Alternative
means for expressing communicative intentions can induce competition both
in production (e.g. Bialystok 1992, Gollan and Kroll 2001, Green 1986, 1998,
Hermans 2000) and in comprehension (e.g. Dijkstra, van Jaarsveld and ten Brinke
1998, de Groot, Delmaar and Lupker 2000) though the range of conditions under
which this occurs is unknown. Depending on how the system is controlled there
may be a difference in processing profiles despite convergence. But there will be
a marker for such an effect: increased competition (and hence increased
activation perhaps) in the areas associated both with lexical and with grammatical encoding will be associated with increased activation in the areas associated
with language control. Such effects will be apparent both in L2 and in L1.
The next section considers empirical data with a view to adjudicating
between the differential representation and convergence hypotheses.

6. Empirical data: Can we adjudicate?


Both neuropsychological and neuroimaging studies provide data that may help
in adjudicating between the two hypotheses. Under the latter we include Event-Related Potential (ERP) data and haemodynamic methods (Positron Emission
Tomography, PET, and functional Magnetic Resonance Imaging, fMRI); Note 4
briefly describes these classes of method. We first consider evidence for the
distinct representation of lexical and grammatical information in L1 and then
consider what we can infer from neuropsychological and neuroimaging studies
of bilingual speakers.

6.1 The representation of L1

Both neuropsychological and neuroimaging data suggest that there is a degree
of specialisation within monolingual speakers for syntactic and semantic
processes. For instance, Breedin and Saffran (1999) reported a patient, D. M.,
who was good at detecting grammatical violations despite a pervasive loss of
semantic knowledge. ERP data from normal individuals also indicate that there
are distinct mechanisms mediating at least post-lexical syntactic and semantic
processes (Hagoort, Brown and Osterhout 2000). For instance, N400 (found
400 ms after an event) is sensitive to violations of semantic expectancy whereas
P600 (found 600 ms after an event) is sensitive to syntactic violations. ERP data
cannot provide direct evidence of the neural sources of such effects but haemodynamic studies are informative. Studies on grammatical processing and
encoding in native speakers (Hagoort, Brown and Osterhout 2000) suggest a
common syntactic component subserved by the left frontal area (a dorsal part
of Broca's area and adjacent parts of the middle frontal gyrus) and studies on
the semantic representation of words identify regions in the temporo-parietal
region, the left extrasylvian temporal cortex and the left anterior inferior
frontal cortex (Price 2000). Neuropsychological data (Dronkers, Redfern and
Knight 2000) and also neuroimaging data (e.g. Price, Moore, Humphreys and
Wise 1997) suggest a specic area in the anterior temporal region as a site
critical for the interpretation of sentences (i.e. syntactic-semantic integration).

6.2 The representation of L2

What empirical evidence is there that L2 is represented differently from L1 as
proposed by the differential representation hypothesis? According to Ullman
(2001b), L2 learned late will be sensitive to damage to neocortical temporal/
temporal-parietal regions for those linguistic forms that depend on grammatical
processing in L1. A case reported by Ku, Lachmann and Nagler (1996) seems to
support his position. A 16-year-old native Chinese speaker who had been living
in the United States for six years and who had received intensive training in
English over this period suffered a circumscribed lesion to the left temporal lobe
(as a result of herpes simplex encephalitis). For three weeks following the lesion
he lost the ability to comprehend and to speak English. In contrast, naming in
Mandarin was normal. However, in speaking Mandarin his syntax was simplified and so this case is not decisive support for the claim that grammatical
information is represented differently in L2.
The notion that L1 grammatical processing is mediated by a frontal-basal
ganglia circuit predicts that damage to the basal ganglia will lead to a selective
loss of L1. Fabbro and Paradis (1995) report the case of E. M. with such a lesion
and, true to prediction, her spontaneous speech in her L1 (Venetan) was poor
whereas her speech was better in her L2 (Italian) that she rarely used prior to
the lesion. Ullman (2001b) considered the nature of her errors. There was a
similar proportion of word-finding difficulties in both languages but a tendency
for poorer grammatical performance in L1 (e.g. the omission of grammatical
function words in obligatory contexts). However, these effects are small and the
overwhelming difference is her spontaneous use of L2 in preference to L1.
Green and Price (2001) argue that language control (e.g. the ability to select
between one language and another) is also mediated by frontal-basal ganglia
circuits. An impairment in this system will also give rise to problems in modulating the output from the lexical-semantic system.
Leaving the neuropsychological data on one side, how convincing are the
ERP and neuroimaging data for a distinct representation of L2? Individuals
acquiring L2 can vary in terms of when they acquired L2, how they acquired L2
and how proficient they are in using it. Typically, proficiency is confounded
with the age of acquisition. In terms of proficiency it is natural to expect that
less proficient users of L2 will show quantitative differences on a range of
measures (e.g. naming time, ERP effects and activation patterns). The critical
issue is whether or not there are qualitative differences indicating that different
neural mechanisms are involved. If there are, it is important to determine
whether these necessarily imply different representations.
ERP data point to both quantitative and qualitative differences in processing between L1 and L2. Kutas and Kluender (1991), for instance, found that the
N400 component in response to a semantic anomaly was delayed and of lower
amplitude in a bilingual's less fluent language. Likewise, Weber-Fox and
Neville (1996) found N400 present in all groups of Chinese-English second
language learners though it was more delayed in those learning L2 after
reaching the age of 11-13 years. More critically, in contrast to monolinguals,
there was a distinct pattern of response to phrase structure violations in
bilinguals. Only individuals acquiring L2 before the age of four showed no
difference from native learners of L2. Such data are compatible with the notion
that there is a critical period for language learning and are consistent with the
notion that different brain mechanisms mediate syntactic processing in late
learners of L2. But they are not decisive as later exposure to English was
associated with worse performance in identifying syntactically anomalous
sentences. Such individuals may have been circumventing syntactic processing.
Hahne and Friederici (2001) examined the effects of phrase structure
violations and semantic anomaly in Japanese late learners of German. These
individuals also showed substantial error rates in a grammaticality judgement
task and so cannot be considered proficient. Hahne and Friederici (2001)
confirmed a delayed N400 effect in response to semantic anomaly but also found
a right anterior central negativity. Unlike native German speakers there was no
early anterior negativity in response to a syntactic violation. They propose that
late learners identify lexical content independently of morphological form (e.g.
the past participle form of the verb) and construct a representation directly
based on conceptual information. Hahne and Friederici speculated on the
source of these effects based on other functional imaging data (Falk, Durwen,
Müller et al. 1999, Opitz, Mecklinger and Friederici 2000). They proposed that
late learners of L2 (at least their Japanese participants) supplement lexical-semantic information by using the right prefrontal cortex to construct a
semantic-conceptual representation of sentence content.
Unfortunately, there is a dearth of functional imaging studies of L2 grammatical processing and encoding. On the production side more generally, Kim,
Relkin, Lee and Hirsch (1997) used fMRI to study the representation of L1 and
L2 while bilinguals covertly described what they had done the previous day.
Half of their sample acquired their L2 in infancy and half after puberty. L1 and
L2 were represented in spatially segregated parts of the left inferior frontal
cortex (Broca's area) in late learners but in overlapping parts of Broca's area in
early learners. Regions activated in Wernicke's area (traditionally linked to
language comprehension) overlapped for both groups. Kim et al. concluded
that age of acquisition affected neural representation. However, there was no
assessment of proficiency in L2 and so we cannot tell whether or not age of
acquisition is critical. Late learners could have been less proficient in their L2.
In fact, when L2 proficiency is high, Chee, Tan and Thiel (1999) found no
difference within the left prefrontal cortex (including Broca's area) when
comparing word generation in early bilinguals (L2 acquired before the age of six)
and late bilinguals (L2 acquired after the age of twelve) for Mandarin-English
speakers in Singapore. The pattern of brain activation in response to Mandarin
words was similar to that observed in response to English words, and did not
vary as a function of age of acquisition. Klein, Milner, Zatorre et al. (1995)
reached a similar conclusion: a common network of brain areas is engaged in L1
and in L2 in highly-proficient bilinguals despite late acquisition of L2.
In terms of comprehension, Abutalebi, Cappa and Perani (2001) concluded
that both languages are processed in a single and common left-sided network,
comprising all the classical language areas when L2 is acquired early (before the age
of five). In contrast for late bilinguals, the degree of language proficiency is the
critical factor. Highly proficient late bilinguals activate similar left hemispheric
areas for L1 and L2 (Perani, Paulesu, Sebastian-Galles et al. 1998) whereas less
proficient subjects have different patterns of activation for their two languages
(Perani, Dehaene, Grassi et al. 1996, Dehaene et al. 1997, Price, Green and Von
Studnitz 1999). Critically, more extensive activations are associated with the less
proficient language (e.g. greater temporal lobe dispersion) perhaps indicating that
in comprehending stories individuals process grammatical forms differently.

At the macroanatomical level, then, current functional imaging data indicate
that there is little difference in the representation of L1 and L2 for highly-proficient bilinguals. The implication is that age of acquisition is less critical
than proficiency. However, as Vaid and Hull (2002) observed, we still need
studies that directly compare individuals differing in L2 proficiency.

7. Ways forward
The differential representation hypothesis and the convergence hypothesis concur
that the initial representation of L2 may differ from that of native speakers of that
language. The fundamental research requirement then is to conduct within-participant longitudinal studies to chart changes in behaviour on various tasks,
e.g. picture naming (Kroll, Michael, Tokowicz and Dufour 2002) and to
examine changes in ERP and neuroimaging profiles as proficiency in L2
changes. Further, such studies need to involve both syntactic and lexical tasks.
To my knowledge there are currently relatively few such studies.
As discussed above, in contrast to L1, grammatical knowledge of L2 may be
represented explicitly and declaratively. According to the convergence hypothesis, ERP responses to syntactic anomalies should change with prociency.
Osterhout and McLaughlin (2000) studied responses to semantic and syntactic
anomalies in native speakers of French and in novice learners. Semantic
anomalies yielded N400 and syntactic anomalies yielded a P600 in native
French speakers. French learners after four weeks of instruction showed an
N400 in response to semantic anomalies. In contrast, there was either an N400
or no eect in response to syntactic anomalies. After just four months, however,
syntactic anomalies yielded a P600 but no N400. These data suggest that if there
are qualitative dierences between native speakers and L2 learners, these can be
rather short-lived. Any dierences in responding to syntactic anomalies are
presumably negatively correlated with the increasing grasp of syntax. Consistent
with the convergence hypothesis, Weber-Fox and Neville (submitted, cited in
Ullman 2001b) examined responses to open-class and closed-class words.
Native speakers of English showed a left anterior negativity to closed class words
(N280) and an N400 for open-class words. L2 speakers of English (with Chinese
as their rst language) showed the same open class N400 as native English
speakers. Interestingly, the response to closed class words related to an independent test of their grammatical ability. The higher the score on the test, the
earlier the anterior negativity for closed class words.

It is important to extend longitudinal investigation to examine the neural
correlates of parsing in more detail. A number of behavioural studies have
examined the extent to which L2 learners of English show an influence of their
L1 on resolving local syntactic ambiguities of various kinds (e.g. Frenck-Mestre
and Pynte 1997, Juffs 1998). The behavioural picture is complex but not
inconsistent, in my view, with the convergence hypothesis (see Kroll and
Dussias in press for a recent review). In examining the extent to which L2
parsing profiles converge with those of native speakers it is possible that we will
need to identify instances where an inappropriate parse leads to a high cost in
recovering the intended interpretation. After all, if the intended interpretation
can be recovered quickly, what computational constraint is there for the neural
processing profile of an L2 learner to converge with that of a native speaker? On
the other hand, the processing cost of recovery may be a function of language
background. As Juffs (1998: 135) proposes, languages (e.g. Japanese, Korean)
with Subject-Object-Verb structure may lead speakers to become adept at
recovery from garden-paths. Speakers of these languages may routinely make
parsing decisions about theta-roles that must be revised on encountering the
verb. Regardless of these kinds of possibilities, the convergence hypothesis
predicts that for proficient speakers of L2, sentence interpretation will activate
an area in the anterior temporal pole and this area, as in the case of L1 (Noppeney
and Price submitted), will show evidence of syntactic priming, i.e.
reduced activation in circumstances where the same syntactic structure is
repeated either within or between languages.
The achievement of proficiency also entails attending to the world in the
manner of native speakers so that lexical and grammatical processes can be
coordinated appropriately (Black and Chiat 2003, Levelt, Roelofs and Meyer
1999, Slobin 1996). It is this pattern of coordination that also needs to be
considered. Neuroimaging allows us to consider how areas work together.
Büchel, Frith and Friston (2000: 339), for instance, describe methods for
examining effective connectivity (the influence one neuronal system exerts on
another) using structural equation modelling of the patterns of activation in
different regions of interest. We should expect convergence of these patterns of
effective connectivity with those of native speakers of the language as proficiency
increases. By way of illustration, consider differences between languages in
the way they package together different aspects of a movement. English packages
manner and motion together (hop, float) and, unlike some other languages
(e.g. Spanish), has fewer verbs that express motion and direction together (e.g.
rise, fall). In order to select the correct verb in Spanish an English speaker must

<DEST "gre-n*">

explicitly encode the direction of motion, i.e. make that property of the scene
salient rather than the manner of motion. Proficiency should be associated with
changes in the explicit representation of the properties of scenes. In terms of
activation patterns, there should be changes in the activation and connection of
cortical regions mediating those properties so that the correct words can be
selected and expressed in a suitable syntactic frame.5

8. Conclusion
This chapter has contrasted two main hypotheses about the representation and
processing of lexicon and grammar in L2. The subtle form of the differential
representation hypothesis proposes that declarative representations play a much
more important role in the representation of grammar in L2 than in L1. This
chapter has considered the computational and empirical basis for the differential
representation hypothesis and argued, on both computational and neuroimaging
grounds, for an alternative, convergence hypothesis. According to this
hypothesis, as proficiency in L2 increases, the networks mediating L2 converge
with those mediating language use in native speakers of that language. Current
evidence marginally favours the convergence hypothesis. However, we lack
appropriate longitudinal studies of L2 acquisition. Crucially, we have little or no
information about the functional integration of different neural regions during
second language use. Scope indeed for discovery!

Notes
* I thank the editors of this volume for the opportunity to contribute to our understanding
of the interface between syntax and the lexicon in L2 acquisition and for constructive
comments on a previous draft of this chapter.
1. It would be useful to have converging evidence for the existence of distinct neural systems
beyond that offered by amnesia. Fortunately, we do not need to rely exclusively on
neuropsychological data. Neocortical activity is reduced for stimuli that have been processed before
(Ungerleider 1995) and this datum has been interpreted as evidence that these structures
mediate priming. But amnesics can also be impaired on priming in certain conditions
(Ostergaard 1999).
2. The commonality assumption is compatible with anatomical variability. Brains differ in
the location of quite major features (e.g. Rickard 2000). The commonality assumption refers
to the sensitivity of neural regions, not to their precise anatomical location. Neuroimaging
evidence for convergence must take such variability into account.

3. At the lexical level, the impact of a prior representation is captured by cognitive models
such as the revised hierarchical model (Kroll and Stewart 1994) and the distributed features
model (e.g. De Groot 1993, Kroll and de Groot 1997). It might also be argued that prior
representation of L1 induces a radically distinct representation of L2. Jiang and Forster
(2001) propose, on the basis of experimental evidence, that lexical items in L2 (they tested
native Chinese speakers with English as the L2) are represented in a non-lexical memory.
This memory allows the meaning of translation equivalents to be retrieved indirectly via the
L1 lexical item. However, it is unclear what it means for a non-lexical system to represent the
syntactic properties of lexical items. Their findings, as they acknowledge, need to be
replicated with proficient L2 speakers and with different language pairs.
4. ERPs provide high resolution temporal evidence of the existence of different processes
during language processing. These are derived by averaging signals from an electroencephalogram
over a series of trials that are time-locked to the presentation of a particular type of
stimulus. An ERP itself comprises various components (i.e. positive and negative voltage
peaks). Where these are affected by some experimental manipulation they are termed ERP
effects. ERP data are compatible with an infinite number of neural generators (the inverse
problem) and so they need to be complemented by data from haemodynamic methods.
Haemodynamic methods (Positron Emission Tomography, PET; or functional Magnetic
Resonance Imaging, fMRI) rely on a close coupling between changes in the activation of a
population of neurons and change in blood supply. A haemodynamic effect arises only when
there is a change in the overall metabolic demand in a neuronal population. PET and fMRI
track different signals. PET measures the decay of a short-lived isotope which accumulates
in a neural region in proportion to the amount of blood flowing through that region. The
most typical fMRI method indexes metabolic demand and hence relative neural activity by
assessing the ratio of deoxy- to oxyhaemoglobin in the blood (see Rugg (2000) for a critical
appraisal of these methods). It is worth noting that haemodynamic methods, along with
other electrophysiological methods, allow us to identify regions that are sufficient for task
performance but they do not allow us to identify regions that are necessary for task
performance. Other data are needed to identify which regions are necessary. For instance, if task
performance is impaired in a patient with a lesion at a given site then this region, or the
network of which it is part, is necessary for task performance. Likewise, virtual lesions
induced by drugs or by transcranial magnetic stimulation may help identify regions
necessary for task performance.
5. Lower levels of proficiency in L2 might also be associated with more reliance on
conceptual/pragmatic information (see Hahne and Friederici above). In native speakers, conceptual
factors affect grammatical encoding (Vigliocco and Hartsuiker 2001) and it is reasonable to
expect that such effects might be more marked in novice learners. A sentence completion
task offers one behavioural measure. Vigliocco and Franck (2001) showed that there were
more errors in generating a predicate when the sex of the referent was incongruent with the
gender of the noun. Given that novice learners of French or Italian, for instance, know the
syntactic gender of the noun, they might also show greater effects of incongruity.

References
Abutalebi, J., Cappa, S. F. and Perani, D. 2001. The bilingual brain as revealed by functional
imaging. Bilingualism: Language and Cognition 4: 179–190.
Albert, M. L. and Obler, L. K. 1978. The bilingual brain: Neuropsychological and neurolinguistic
aspects of bilingualism. New York: Academic Press.
Bialystok, E. 1992. Selective attention in cognitive processing: The bilingual edge. In
Cognitive processes in bilinguals, R. J. Harris (ed.), 501–513. Amsterdam: Elsevier Science
Publishers B. V.
Birdsong, D. and Molis, M. 2001. On the evidence for maturational constraints in second-language
acquisition. Journal of Memory and Language 44: 235–249.
Black, M. and Chiat, S. 2003. Noun-verb dissociations: a multi-faceted phenomenon.
Journal of Neurolinguistics 16: 231–250.
Breedin, S. D. and Saffran, E. M. 1999. Sentence processing in the face of semantic loss:
A case study. Journal of Experimental Psychology: General 128: 547–562.
Büchel, C., Frith, C. and Friston, K. 2000. Functional integration: Methods for assessing
interactions amongst neuronal systems. In The neurocognition of language, C. M. Brown
and P. Hagoort (eds), 337–355. Oxford: Oxford University Press.
Chee, M. W. L., Tan, E. W. L. and Thiel, T. 1999. Mandarin and English single word
processing studied with functional Magnetic Resonance Imaging. Journal of Neuroscience
19: 3050–3056.
Chomsky, N. 1995. The minimalist program. Cambridge, MA: MIT Press.
Cleeremans, A. 1993. Mechanisms of implicit learning: Connectionist models of sequence
processing. Cambridge, MA: MIT Press.
Dehaene, S. D., Dupoux, E., Mehler, J., Cohen, L., Paulesu, E., Perani, D., van de Moortele,
P. F., Lehéricy, S. and Le Bihan, D. 1997. Anatomical variability in the cortical
representation of first and second languages. Neuroreport 8: 3809–3815.
De Groot, A. M. B. 1993. Word-type effects in bilingual processing tasks: Support for a
mixed representational system. In The bilingual lexicon, R. Schreuder and B. Weltens
(eds), 27–51. Amsterdam: John Benjamins.
De Groot, A. M. B., Delmaar, P. and Lupker, S. J. 2000. The processing of interlexical
homographs in translation recognition and lexical decision: Support for non-selective access to
bilingual memory. Quarterly Journal of Experimental Psychology 53A: 397–428.
Dijkstra, A., van Jaarsveld, H. and ten Brinke, S. 1998. Interlingual homograph recognition:
Effects of task demands and language intermixing. Bilingualism: Language and
Cognition 1: 51–66.
Dronkers, N. F., Redfern, B. B. and Knight, R. T. 2000. The neural architecture of language
disorders. In The new cognitive neurosciences, M. S. Gazzaniga (ed.), 949–960.
Cambridge, MA: MIT Press.
Edelman, G. M. 1989. The remembered present: A biological theory of consciousness. New York:
Basic Books.
Ellis, N. C. 1995. The psychology of foreign language vocabulary acquisition: Implications
for CALL. Computer Assisted Language Learning 8: 103–128.

Fabbro, F. 1999. The neurolinguistics of bilingualism: An introduction. Hove, Sussex: Psychology Press.
Fabbro, F. and Paradis, M. 1995. Differential impairments in four multilingual patients with
subcortical lesions. In Aspects of bilingual aphasia, M. Paradis (ed.), 139–176. Oxford,
UK: Pergamon.
Falk, A. R., Durwen, H. F., Müller, C., König, M., Müller, E. and Heuser, L. 1999.
Determination of eloquent cortical areas in Russian bilinguals performing a word generation
task. Neuroimage 6: S1002.
Flege, J. E. 1995. Second-language speech learning: Theory, findings and problems. In
Speech perception and linguistic experience: Issues in cross-language research, W. Strange
(ed.), 233–277. Timonium, MD: York Press.
Flege, J. E., Yeni-Komshian, G. H. and Liu, S. 1999. Age constraints on second-language
acquisition. Journal of Memory and Language 41: 78–104.
Frenck-Mestre, C. and Pynte, J. 1997. Syntactic ambiguity resolution while reading in second
and native languages. Quarterly Journal of Experimental Psychology 50A: 119–148.
Gabrieli, J. D. 1998. Cognitive neuroscience of human memory. Annual Review of
Psychology 49: 87–115.
Gollan, T. H. and Kroll, J. F. 2001. Lexical access in bilinguals. In A handbook of cognitive
neuropsychology: What deficits reveal about the human mind, B. Rapp (ed.), 321–345.
New York: Psychology Press.
Green, D. W. 1986. Control, activation and resource: A framework and a model for the
control of speech in bilinguals. Brain and Language 27: 210–223.
Green, D. W. 1998. Mental control of the bilingual lexico-semantic system. Bilingualism:
Language and Cognition 1: 67–81.
Green, D. W. and Price, C. 2001. Functional imaging in the study of recovery patterns in
bilingual aphasics. Bilingualism: Language and Cognition 4: 191–201.
Gupta, P. and Dell, G. S. 1999. The emergence of language from serial order and procedural
memory. In The emergence of language, B. MacWhinney (ed.), 447–481. Mahwah, NJ:
Lawrence Erlbaum Associates.
Hagoort, P., Brown, C. M. and Osterhout, L. 2000. The neurocognition of syntactic
processing. In The neurocognition of language, C. M. Brown and P. Hagoort (eds),
273–316. Oxford: Oxford University Press.
Hahne, A. and Friederici, A. D. 2001. Processing a second language: Late learners'
comprehension mechanisms as revealed by event-related brain potentials. Bilingualism:
Language and Cognition 4: 123–141.
Hamann, S. B. and Squire, L. R. 1997. Intact priming for novel perceptual representations
in amnesia. Journal of Cognitive Neuroscience 9: 699–713.
Hebb, D. O. 1949. The organization of behaviour: A neuropsychological theory. New York: Wiley.
Hermans, D. 2000. Word production in a foreign language. Doctoral dissertation, Katholieke
Universiteit Nijmegen.
Jiang, N. and Forster, K. I. 2001. Cross-language priming asymmetries in lexical decision
and episodic recognition. Journal of Memory and Language 44: 32–51.
Johnson, J. S. and Newport, E. L. 1989. Critical period effects in second language learning:
The influence of maturational state on the acquisition of English as a second language.
Cognitive Psychology 21: 60–99.

Juffs, A. 1998. Main verb versus reduced relative clause ambiguity resolution in L2 sentence
processing. Language Learning 48: 107–147.
Kim, K. H. S., Relkin, N. R., Lee, K. M. and Hirsch, J. 1997. Distinct cortical areas associated
with native and second languages. Nature 388: 171–174.
Kinder, A. and Shanks, D. R. 2001. Amnesia and the declarative/non-declarative distinction:
A recurrent network model of classification, recognition and repetition priming.
Journal of Cognitive Neuroscience 13: 648–669.
Klein, D., Milner, B., Zatorre, R., Meyer, E. and Evans, A. 1995. The neural substrates
underlying word generation: A bilingual functional-imaging study. Proceedings of the
National Academy of Sciences USA 92: 2899–2903.
Knowlton, B. J. and Squire, L. R. 1993. The learning of categories: Parallel brain systems for
item memory and category knowledge. Science 262: 1747–1749.
Knowlton, B. J. and Squire, L. R. 1994. The information acquired during artificial grammar
learning. Journal of Experimental Psychology: Learning, Memory, and Cognition 20: 79–91.
Knowlton, B. J. and Squire, L. R. 1996. Artificial grammar learning depends on implicit
acquisition of both abstract and exemplar-specific information. Journal of Experimental
Psychology: Learning, Memory, and Cognition 22: 169–181.
Kroll, J. F. and Dussias, P. E. (in press). The comprehension of words and sentences in two
languages. In Handbook of bilingualism, T. Bhatia and W. Ritchie (eds). Cambridge,
MA: Blackwell Publishers.
Kroll, J. F. and de Groot, A. M. B. 1997. Lexical and conceptual memory in the bilingual:
Mapping form to meaning in two languages. In Tutorials in bilingualism:
Psycholinguistic perspectives, A. M. B. de Groot and J. F. Kroll (eds), 169–199. Mahwah, NJ:
Lawrence Erlbaum Associates.
Kroll, J. F. and Stewart, E. 1994. Category interference in translation and picture naming:
Evidence for asymmetric connections between bilingual memory representations.
Journal of Memory and Language 33: 149–174.
Kroll, J. F., Michael, E., Tokowicz, N. and Dufour, R. 2002. The development of lexical
fluency in a second language. Second Language Research 18: 137–171.
Ku, A., Lachmann, E. A. and Nagler, W. 1996. Selective language aphasia from herpes
simplex encephalitis. Pediatric Neurology 15: 169–171.
Kutas, M. and Kluender, R. 1991. What is who violating? A reconsideration of linguistic
violations in light of event-related brain potentials. In Cognitive Electrophysiology, H.-J.
Heinze, T. F. Münte and G. R. Mangun (eds), 183–210. Boston: Birkhäuser.
Lebrun, Y. 2002. Implicit competence and explicit knowledge. In Advances in
neurolinguistics of bilingualism, F. Fabbro (ed.), 299–313. Udine: Forum.
Lenneberg, E. H. 1976. Biological foundations of language. New York: Wiley.
Levelt, W. J. M., Roelofs, A. and Meyer, A. S. 1999. A theory of lexical access in speech
production. Behavioral and Brain Sciences 22: 1–75.
Loring, D. W., Meador, K. J., Lee, G. P., et al. 1990. Cerebral language lateralization:
evidence from intracarotid amobarbital testing. Neuropsychologia 28: 831–838.
McClelland, J. L., McNaughton, B. L. and O'Reilly, R. C. 1995. Why there are complementary
learning systems in the hippocampus and neocortex: Insights from the successes and
failures of connectionist models of learning and memory. Psychological Review 102:
419–437.

MacWhinney, B. 1997. SLA and the competition model. In Tutorials in bilingualism:
Psycholinguistic perspectives, A. M. B. de Groot and J. F. Kroll (eds), 113–142. Mahwah,
NJ: Lawrence Erlbaum Associates.
Mesulam, M. M. 1990. Large scale neurocognitive networks and distributed processing for
attention, language and memory. Annals of Neurology 28: 597–613.
Noppeney, U. and Price, C. J. (submitted). The neural basis of syntactic priming. Ms.
Wellcome Department of Imaging Neuroscience, UCL.
Nosofsky, R. M. and Zaki, S. R. 1998. Dissociations between categorization and recognition
in amnesic and normal individuals: an exemplar-based interpretation. Psychological
Science 9: 247–255.
Opitz, B., Mecklinger, A. and Friederici, A. D. 2000. Functional asymmetry of human
prefrontal cortex: Encoding and retrieval of verbally and non-verbally coded
information. Learning and Memory 7: 85–96.
Ostergaard, A. L. 1999. Priming deficits in amnesia: Now you see them now you don't.
Journal of the International Neuropsychological Society 5: 175–190.
Osterhout, L. and McLaughlin, J. 2000. What brain activity can tell us about second-language
learning. Paper presented at the 13th Annual CUNY conference on Human
Sentence Processing, San Diego.
Paradis, M. 1994. Neurolinguistic aspects of implicit and explicit memory: Implications for
bilingualism and second language acquisition. In Implicit and explicit language learning,
N. Ellis (ed.), 393–419. London: Academic Press.
Paradis, M. 1997. The cognitive neuropsychology of bilingualism. In Tutorials in
bilingualism: Psycholinguistic perspectives, A. M. B. de Groot and J. F. Kroll (eds), 331–354.
Mahwah, NJ: Lawrence Erlbaum Associates.
Paradis, M. 2001. Bilingual and polyglot aphasia. In Handbook of neuropsychology, 2nd edition,
vol. 3 Language and aphasia, R. S. Berndt (ed.), 69–91. Amsterdam: Elsevier Science.
Perani, D., Dehaene, S., Grassi, F., Cohen, L., Cappa, S. F., Dupoux, E., Fazio, F. and Mehler,
J. 1996. Brain processing of native and foreign languages. NeuroReport 7: 2439–2444.
Perani, D., Paulesu, E., Sebastian-Galles, N., Dupoux, E., Dehaene, S., Bettinardi, V., Cappa,
S. F., Fazio, F. and Mehler, J. 1998. The bilingual brain: Proficiency and age of
acquisition of the second language. Brain 121: 1841–1852.
Pinker, S. 1994. The language instinct. New York: William Morrow.
Price, C. J. 2000. The anatomy of language: contributions from functional neuroimaging.
Journal of Anatomy 197: 335–359.
Price, C. J., Green, D. and Von Studnitz, R. 1999. A functional imaging study of translation
and language switching. Brain 122: 2221–2236.
Price, C. J., Moore, C., Humphreys, G. and Wise, R. 1997. Segregating semantic from
phonological processes during reading. Journal of Cognitive Neuroscience 9: 727–733.
Rapport, R. L., Tan, C. T. and Whitaker, H. A. 1983. Language function and dysfunction
among Chinese and English speaking polyglots: Cortical stimulation, Wada testing, and
clinical studies. Brain and Language 18: 342–366.
Rickard, T. C. 2000. Methodological issues in functional magnetic resonance imaging studies of
plasticity following brain injury. In Cerebral reorganization of function after brain damage,
H. S. Levin and J. G. Grafman (eds), 304–317. Oxford: Oxford University Press.

</TARGET "gre">

Robertson, I. H. and Murre, J. M. 1999. Rehabilitation of brain damage: Brain plasticity and
principles of guided recovery. Psychological Bulletin 125: 544–575.
Rugg, M. D. 2000. Functional neuroimaging in cognitive science. In The neurocognition of
language, C. M. Brown and P. Hagoort (eds), 15–36. Oxford: Oxford University Press.
Scoresby-Jackson, R. E. 1867. Case of aphasia with right hemiplegia. Edinburgh Medical
Journal 12: 696–706.
Segalowitz, N. S. and Segalowitz, S. J. 1993. Skilled performance, practice, and the
differentiation of speed-up from automatization effects: Evidence from second language word
recognition. Applied Psycholinguistics 14: 369–385.
Slobin, D. I. 1996. From thought and language to thinking for speaking. In Rethinking
linguistic relativity, J. J. Gumperz and S. C. Levinson (eds), 177–202. Cambridge:
Cambridge University Press.
Springer, J. A., Binder, J. R., Hammeke, T. A. et al. 1999. Language dominance in
neurologically normal and epilepsy subjects: A functional MRI study. Brain 122: 2033–2046.
Squire, L. R., Knowlton, B. and Musen, G. 1993. The structure and organization of
memory. Annual Review of Psychology 44: 453–495.
Squire, L. R. 1994. Declarative and nondeclarative memory: Multiple brain systems
supporting learning and memory. In Memory systems, D. L. Schacter and E. Tulving
(eds), 203–231. Cambridge, MA: MIT Press.
Strauss, E., Satz, P. and Wada, J. 1990. An examination of the crowding hypothesis in epileptic
patients who have undergone the carotid amytal test. Neuropsychologia 28: 1221–1227.
Teuber, H. L. 1974. Why two brains? In The Neurosciences. Third study program, F. G.
Worden (ed.), 71–74. Cambridge, MA: MIT Press.
Tomasello, M. 1995. Language is not an instinct. Cognitive Development 10: 131–156.
Ullman, M. T. 2001a. The declarative/procedural model of lexicon and grammar. Journal
of Psycholinguistic Research 30: 37–69.
Ullman, M. T. 2001b. The neural basis of lexicon and grammar in first and second language:
the declarative/procedural model. Bilingualism: Language and Cognition 4: 105–122.
Ungerleider, L. G. 1995. Functional brain imaging studies of cortical mechanisms for
memory. Science 270: 769–775.
Vaid, J. and Hull, R. 2002. Re-envisioning the bilingual brain using functional neuroimaging:
Methodological and interpretive issues. In Advances in neurolinguistics of bilingualism,
F. Fabbro (ed.), 315–355. Udine: Forum.
Vigliocco, G. and Franck, J. 2001. When sex affects syntax: Context effects in sentence
production. Journal of Memory and Language 45: 368–390.
Vigliocco, G. and Hartsuiker, R. J. 2001. The interplay of meaning, sound, and syntax in
sentence production. Psychological Bulletin (under review).
Weber-Fox, C. and Neville, H. J. 1996. Maturational constraints on functional
specializations for language processing: ERP and behavioral evidence in bilingual speakers.
Journal of Cognitive Neuroscience 8: 231–256.
Weber-Fox, C. and Neville, H. J. (submitted). Sensitive periods differentiate processing
subsystems for open and closed class words: An ERP study in bilinguals. Cited in
Ullman 2001b (above).

<TARGET "hou" DOCINFO AUTHOR "Roeland van Hout, Aafke Hulk and Folkert Kuiken"TITLE "The interface"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Chapter 10

The interface
Concluding remarks
Roeland van Hout, Aafke Hulk and Folkert Kuiken
University of Nijmegen (Hout) / Utrecht University (Hulk) /
Universiteit van Amsterdam (Kuiken)

1. Interfaces in generative grammar

In its bare form, a grammar can be defined as a set of elements or symbols (the
lexicon) and a set of rules (the syntax) that together produce output strings, the
utterances of the language belonging to the grammar. The format of the lexicon
is evident: it has no structure, and it does not need to have one. Chomsky
(1965: 84) characterizes the lexicon as "simply an unordered list of all lexical
formatives". Although the assumed absence of order may be accepted as a
heuristic device, the lexicon of human languages and of human speakers is not
an unordered list, far from it, as is apparent from many recent lexical studies
as well as from the chapters in this book. The form, role and structure of lexical
items, words, lemmas, or whatever the lexical formatives are called, and the
relationships or connections between them constitute the pivotal domain of
research in the chapters of this book. The nearest neighbour to which lexical
items and their structural properties connect is syntax, the machinery by which
utterances can be computed on the basis of lexical input. The questions then
are: How are lexicon and syntax linked precisely? What kind of information do
they exchange? What is the interface between lexicon and syntax? Which
interface levels need to be distinguished? There are several possible answers to
these questions, depending on one's theoretical perspective.
Within the generative, modular paradigm, Chomsky (1995: 131) distinguishes
two interface levels: the level of phonetic form (PF) is the interface with
sensorimotor systems, and the level of logical form (LF) is the interface with systems
of conceptual structure and language use. The two performance systems involved
are the articulatory-perceptual system and the conceptual-intentional system,
or, to put it in a more straightforward way, they relate to sound and meaning.
This generativist view is captured by Jackendoff (2002: 197) in the following
diagram:
[Diagram: lexicon → syntax → phonology / semantics]

This diagram illustrates the central position of syntax and, at the same time,
raises the question of the relationship between lexicon and syntax. In the
chapter by Norbert Corver, this relationship was identified as the third interface
level. Corver cited Chomsky (1991: 46):
that there are three fundamental levels of representation: D-structure, PF
and LF. Each constitutes an interface of the syntax (broadly construed) with
other systems: D-structure is the projection of the lexicon, via the mechanisms
of X-bar theory; PF is associated with articulation and perception, and LF with
semantic interpretation.

D-structure is no longer a separate level in minimalist theory, as Corver points
out, but the internal interface of syntax and lexicon is still to be distinguished.
The lexical input needs to provide the information required to put grammar to
work. This means that lexical items in generative grammar contain information
related to syntax (the formal features), to phonology (the phonological matrix),
and to semantics (the semantic features).
The minimalist approach puts syntax in a central position. Other generative
viewpoints exist; for instance, Jackendoff (2002) attaches equal generative capacity
and autonomy to each of the three levels. From his perspective, phonological
structures, syntactic structures and conceptual structures are part of the same
processing architecture (Jackendoff 2002: 199), which implies that interface
issues between lexicon and syntax are basically not different from conceptual
and phonological interface issues, as is illustrated in the following diagram:
[Diagram: lexicon linked to phonology, syntax and semantics]

The contributions in this book concentrate on the lexicon-syntax dimension,
without any direct claims about the relative importance of other interfaces.
Interesting contributions on the relationships between semantics/conceptual
structures and syntactic structures, for instance, can be found in Bowerman and
Levinson (2001), in the context of (first) language acquisition.
In the Jackendoff approach, the key position of the lexicon in relation to
phonological, syntactic, and semantic structures is reflected in the format of
lexical items. Three layers of information are distinguished for lexical items:
phonological features, semantic features, and syntactic (or formal) features. The
lexicon contains the fuel to put language to work, which applies to both content
and function words. In his chapter Ton Dijkstra added the orthographic layer,
a consequence of our literate society that tends to be overlooked by linguists.
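The layered format of lexical items described above can be pictured as a simple record. What follows is only an illustrative sketch, not a formalism proposed by Jackendoff or by the contributors to this book: the class name `LexicalItem`, its field names and the toy feature values are all assumptions introduced here for the sake of the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LexicalItem:
    """Hypothetical record for one lexical item (illustrative only)."""
    phonological: str                               # phonological layer, here a rough transcription
    syntactic: dict = field(default_factory=dict)   # syntactic (formal) features, e.g. category
    semantic: set = field(default_factory=set)      # semantic features
    orthographic: Optional[str] = None              # optional written form (the orthographic layer)

    def layers(self):
        """Return the names of the layers that are present for this item."""
        names = ["phonological", "syntactic", "semantic"]
        if self.orthographic is not None:
            names.append("orthographic")
        return names


# Toy entries, one content word and one function word; all values are invented.
fall = LexicalItem("/fɔːl/", {"category": "V"}, {"motion", "direction"}, "fall")
the = LexicalItem("/ðə/", {"category": "D"}, {"definite"})
```

In this sketch the orthographic layer is optional, so an item stored without a written form still carries the three layers named in the text; whether the layers are best modelled as flat feature bundles at all is left open.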

2. Learning the syntax, learning the lexicon

As discussed in Richard Towell's chapter, L2 acquisition, in contrast to L1
acquisition, is marked by variability and incompleteness. Any theory of human
language should be able to explain these two phenomena, but, in addition, the
claim can be made that L2 acquisition can shed light on the properties of
human language. The interaction between one language (for instance the L1)
and another language (for instance the L2) in the heads and mouths of real
speakers can produce evidence for the core properties of lexical and syntactic
structures (cf. Muysken 2000). Including the developmental track of acquisition
and stagnation stresses the relevance of bilingual language processing as a
primary topic in language research even further.
What is the relationship between syntax and lexicon in L2 acquisition? A
strong argument in favour of the high status of lexical information is that
second language learners have always been aware of the central value of a
lexicon. Learners prefer to walk around with dictionaries, not with grammars.
It has been understood for some time that syntax and lexicon involve different
kinds of learning: syntax is learnt through a process of implementing a particular set of universal structures; lexis is learnt by establishing a set of arbitrary
associations which operate in a given society. The learning of syntax is often
characterised as a process of triggering; the learning of lexis is characterised by
the building up of associations (or connections). Yet these two systems must
come together in the creation of a whole linguistic system in the mind of an
individual. The syntax will govern the phrase structure of the grammar but the
lexical items will govern how the phrase structure is implemented: notably, the
argument structure and the feature composition of the lexical items are essential
to the implementation of the syntax in the language production process.
This book was designed to examine the relative contribution of these two
dimensions in a clear fashion, through illustrations of exemplary research
carried out within each paradigm and to examine how they can be made to
inter-relate in a way which would enable us to explain better the overall process
of SLA. An examination of the interface between syntax and the lexicon is both
timely and important. Both groups of researchers are now coming to an
understanding that their particular view of the world may not suffice to account
for the overall process and that each will have to understand more about what
the other knows. From the point of view of the researcher interested in syntax
(generally coming from a background in linguistics), the shift of linguistic
theory away from principles and parameters and into minimalism has resulted
in a crucial increase in the significance of the lexicon. Within minimalism so
much of the information essential for the working of the system has been
assigned to the lexicon that it has become crucial for syntacticians to reflect
more on how the lexicon works. From the point of view of the researcher
interested in the lexicon (generally coming from a background in psychology),
it is important to integrate the outcome of lexical learning within the overall
acquisition process. Unless one adopts a purely connectionist position, it is
clear that the use of the lexical items studied can only take place within the
syntactic system. An understanding of the acquisition of the syntax is therefore
essential to the understanding of second language processing and
acquisition as a whole. The introductory chapter by Richard Towell explicates the complementarity of linguistics and psychology in doing language research. Taking the
other chapters in this book into consideration, he balances the linguistic and
psychological dimensions of basic questions in second language acquisition
research. Towell argues that the lexical part of the lexicon-syntax interface is the
driving force of language acquisition and that we need to investigate the
psychological mechanisms behind this development.

3. Some final considerations

The chapters in this book demonstrate the many different perspectives required
to study second language acquisition over its full range. A whole series of
contrasts keeps returning: symbolic learning vs. connectionist learning, L1 vs.
L2 acquisition, procedural vs. declarative knowledge, structure vs. process, competence vs. performance, etc. One main conclusion is that, over and over again, we
need to plead for a cocktail of treatments, for many sorts of data, and for many sorts
of expertise to give an appropriate answer to the questions belonging to such
contrastive pairs. In this cocktail, the following distinctions can be made.
3.1 Variety in methodology
It will be clear that no single methodology can produce all the
answers we need. We need the linguistic analysis of spontaneous and elicited
speech data (see the chapters by Hawkins and Liszka, Van de Craats, Corver),
but also on-line and off-line grammaticality judgment tasks (see the chapters by
Duffield, and Sabourin and Haverkort). We need to carry out psycholinguistic
experiments with reaction times (see the chapters by Duffield and Dijkstra), but
newer methodologies such as eye-tracking and neuroimaging techniques should also be applied (see the chapters by Sabourin and Haverkort, and Green).
Another promising methodology is the use of computer simulations (see the
chapter by Williams).
3.2 Variety in learners and languages
The discovery and testing of general or universal principles and parameters in
language and language acquisition require the full range of learners: from real
beginners (see Corver and Van de Craats) to (near-)native speakers (see, e.g.,
Hawkins and Liszka), from unguided acquisition to classroom learning. At the
same time we need to consider the full language typological range, both as a
source and a target language in L2 acquisition (the chapters in this book cover
a range larger than in many other books on L2 acquisition). As for the computer simulations, different learning algorithms should be probed, including
algorithms where previous knowledge (L1) can be implemented to explore its
effect on acquiring another language system (L2). Without studying real
beginners, it does not seem feasible to get an answer to the question of the role of
cognitive vs. linguistic principles in speaking a new language (cf. Klein and
Perdue 1997), and how, perhaps, cognitive principles are matched by linguistic
structures. Cognitive principles can be influential in an indirect way via the
syntax-semantics interface, but maybe they directly trigger specific syntactic
mechanisms. That would contradict approaches based solely on lexical formal
features as the main sources of information for syntactic structures.

3.3 Variety in linguistic domains
The same linguistic domains must be investigated, in both L1 and L2 acquisition, including not only inflectional processes but derivational morphological
devices as well, and including not only syntactic parameters related to the whole
clause, but also parameters for subdomains (possessive DPs, for instance; see
Van de Craats). Studying different linguistic domains can be helpful in determining which words are stored and which explicit rules are active in producing
lexical items. Schreuder and Baayen (1997) show that even regular inflectional
forms can be stored instead of being computed. Morphology is interesting here
because it constitutes, due to its paradigmatic organisation, the link between
syntactic structures and sets of lexical items.
In L1 acquisition the acquisition of past tense forms has been an important
domain for testing associative learning algorithms (cf. Plunkett and Juola
1999). Past tense forms, from a generative point of view, are the topic of
research of Hawkins and Liszka in this book. They observe that high-proficiency
adult L2 speakers of English do not always mark thematic verbs like walk, notice
for past tense in obligatory past tense contexts. They find that the morphology
cannot be the source of such optionality. In their view, optionality in adult L2
performance results from the interaction between a perfectly functioning
syntactic component and an impaired lexicon (as far as tense features are
concerned).
Gender systems turn out to be another attractive domain in L2 acquisition
because of their irregularities and their covert regularities. Williams suggests
that problems learning gender in a second language may reflect the weakness of
the kind of associative learning mechanism that underlies incidental learning.
Sabourin and Haverkort argue that there is a qualitative difference between
native speakers and second language learners in language processing. They
suggest that the German participants use a translation strategy to learn Dutch
gender assignment and that they use their L1 processing strategies to process
their L2 in cases where the grammars are very similar.
3.4 Variety in contexts and tasks
Both Hawkins and Liszka, and Sabourin and Haverkort make it clear that
different tasks may return different results. A simple production task may show
that a learner has control over a specific process, whereas spontaneous language
production may show strong differences between native and L2 speakers.
Dijkstra's research corroborates the naturalness of such task differences. The
lexicon is a flexible store whose properties may differ with the task it is confronted with. Only different forms of processing can be held responsible for
such task differences, which means that processing is an inherent part of
linguistic structuring. The chapters by Dijkstra and Green show how intricate
the organisation of the bilingual lexicon is. Differences in organisation may be
related to the way differences between speakers and communities develop,
dependent upon the type of bilingualism and, in addition, the linguistic
distance between the languages involved.
3.5 Variety in perspectives
Towell's introduction to this book is a plea for a stronger cooperation between
linguistics and psychology and a plea for longitudinal studies. Generative
linguistics alone will not do.
Whilst generative linguists claim to describe the acquisition process, most
of their efforts pertain to the classification of successive linguistic stages of
learners' interlanguages. Towell concludes that the driving force of language
learning must come from lexis, as there is no driving force available in the
computational system (CS), as it is defined in modern generative syntax.
Syntactic knowledge is a template simply present in the mind of the learner that
automatically operates given the lexical information inserted. This is illustrated
by Corver, who argues, both for word level categories and phrasal level categories, that L2-expressions are just as perfect as L1-expressions from an interface
perspective, even though from the perspective of the target language they may
seem highly imperfect.
On the other hand, the learning of lexis has been thought of mainly in
terms of some form of associative learning theory, with connectionism being
the leading variant in language acquisition research. Some theorists have
concluded that connectionist learning can account for the totality of the
learning, including the learning of syntax. Generative syntacticians argue that
the sophisticated structures which they observe and which are not visible at the
surface structure of the language cannot be learnt in an empirical fashion alone
and therefore they claim that innate knowledge (mediated or not by the L1)
must be guiding language acquisition. It is clear that both groups have a strong
case to make but that neither is able on its own to account for the total
process. Syntax needs to be fed by lexical information, including formal
features; this information has to be collected by the lexicon from output strings

</TARGET "hou">

generated by syntax. This is particularly clear in the contribution of Van de
Craats, who shows how syntax and lexicon interact in the data of Moroccan and
Turkish adults learning Dutch outside the classroom. The starting point of
their developmental process is assumed to be the fully-fledged grammar of the
L1, which, under the impact of the L2-environmental input, changes
towards a more target-like L2 output.
The computation of syntactic information on the basis of output strings
seems to require storage devices, associative linking and analogical strategies.
Such computational efforts need time (see Van de Craats), and sometimes a lot
of time, before the proper information can become available in the production
of spontaneous speech (see Hawkins and Liszka).
We hope that the readers of this volume have come to appreciate the
complexities involved in the issue of interface and will be encouraged to do more
interdisciplinary research in order to pave the way for a deeper understanding of
the interface between syntax and the lexicon in second language acquisition.

References
Bowerman, M. and Levinson, S. (eds). (2001). Language acquisition and conceptual development. Cambridge: Cambridge University Press.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge MA: MIT Press.
Chomsky, N. (1991). Some notes on economy of derivation and interpretation. In Principles and
parameters in comparative grammar, R. Freidin (ed.), 417–454. Cambridge MA: MIT Press.
Chomsky, N. (1995). The minimalist program. Cambridge MA: MIT Press.
Jackendoff, R. (2002). Foundations of language. Brain, meaning, grammar, evolution. Oxford:
Oxford University Press.
Klein, W. and Perdue, C. (1997). The basic variety. Or: Couldn't natural languages be much
simpler? Second Language Research 13, 301–347.
Muysken, P. (2000). Bilingual speech. A typology of code-mixing. Cambridge: Cambridge
University Press.
Plunkett, K. and Juola, P. (1999). A connectionist model of English past tense and plural
morphology. Cognitive Science 23, 436–490.
Schreuder, R. and Baayen, H. (1997). How complex simplex words can be. Journal of
Memory and Language 37, 118–139.

<TARGET "ni" DOCINFO AUTHOR ""TITLE "Name index"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Name index
A
Abutalebi 209, 214
Allen 54-56, 67, 98, 99, 124
Altarriba 143, 148
Avrutin 124, 178, 194
B
Barlow 106, 124
Bever 97, 98, 124
Birdsong 97, 122, 124, 201, 214
Bley-Vroman 46, 67, 108, 120, 125
Brown 184, 194, 195, 206, 207, 214, 215,
218
C
Cappa 20, 209, 214, 217
Carlson 114, 127
Chomsky 1-3, 6, 19, 20, 24, 35, 39, 43, 45,
50, 51, 62, 66, 67, 69, 72, 78, 79, 93,
94, 95, 115, 123, 125, 176, 194, 199,
214, 219, 220, 226
Chung 99, 125
Clahsen 94, 95, 109, 120, 125
Coppieters 97, 125
Corver 3, 5, 7, 10, 45, 48, 49, 67, 68, 70,
77, 79, 81, 85, 94, 95, 220, 223, 225
Culicover 99, 123, 125
D
De Groot 20, 140, 141, 149, 206, 213, 214,
216, 217
De Moor 139, 149
Dijkstra 3, 5, 12, 18, 129, 132-136,
138-142, 146, 148-150, 206, 214,
221, 223, 225
Duffield 3, 5, 7, 97, 107, 108, 110, 114,
116, 118, 120, 121, 124, 125, 223
Dussias 144, 149, 211, 216

E
Ellis 4, 5, 19, 97, 125, 151, 172, 200, 214,
217
F
Fabbro 198, 207, 215, 216, 218
Fodor 1, 7, 14, 19, 20, 99, 125-127, 154,
172
Font 134, 140, 149
Foster 195
Freedman 108-110, 126
Friederici 144, 145, 149, 184, 194, 208,
209, 213, 215, 217
G
Gerard 132, 134, 149
Grainger 132, 133, 149, 150
Green 3, 5, 10, 15, 16, 18, 134, 150, 197,
206, 208, 209, 215, 217, 223, 225
Greenbaum 97, 126
Grosjean 140, 141, 149, 152, 170, 172
H
Hagoort 184, 194, 195, 206, 207, 214, 215,
218
Hahne 144, 145, 149, 208, 209, 213, 215
Hamann 202, 203, 215
Hardt 115, 126
Hebb 205, 215
Hedgcock 97, 126
Hong 109, 125
K
Kemmer 106, 124
Kinder 203, 216
Kirsner 135, 148
Kluender 99, 100, 113, 126, 208, 216
Kornfilt 60, 64, 68, 85, 86, 95

</TARGET "ni">

Kroll 20, 134, 143, 144, 146, 148-150,
206, 210, 211, 213, 215-217
Kushnir 132, 150
Kutas 113, 126, 184, 195, 208, 216
L
Lebrun 200, 205, 216
Lemhöfer 135, 136, 150
Levelt 123, 126, 168, 173, 211, 216
Levin 104, 124, 126, 217
M
MacDonald 106, 126
MacFarland 104-107, 110, 111, 117, 124,
126
Macnamara 132, 150
MacWhinney 124, 172, 204, 215, 217
Mandell 97, 126
Marslen-Wilson 118, 126
Martohardjono 97, 126
Matsuo 113, 114, 116, 120, 121, 124, 125
McCloskey 99, 125
McKoon 104-107, 110, 111, 117, 124, 126
Milech 135, 148
Murre 205, 218
Muysken 120, 125, 221, 226
N
Neville 144, 150, 208, 210, 218
O
Osterhout 184, 195, 206, 207, 210, 215,
217
P
Paradis 16, 20, 199, 200, 205, 207, 215,
217
Perani 3, 4, 20, 209, 214, 217
Price 207-209, 211, 215, 217
R
Rappaport Hovav 104, 124, 126
Rayner 126, 143, 148
Robertson 205, 218
S
Sag 114, 126
Scarborough 132, 134, 149
Schriefers 136, 140, 141, 148, 149, 184,
194
Schütze 97, 98, 123, 126
Seidenberg 98, 99, 124, 126, 153, 173
Shanks 153, 164, 173, 203, 216
Sholl 143, 148
Sonnenstuhl-Henning 109, 125
Sorace 103, 112, 124, 126, 127
Squire 153, 173, 199, 202, 203, 215, 216,
218
T
Tanenhaus 114, 127
Ten Brinke 134, 138, 141, 149, 206, 214
Timmermans 136, 149
U
Ullman 4, 20, 164, 174, 193, 195,
199-202, 204, 207, 210, 218
V
Van de Craats 3, 5, 7, 10, 48, 49, 65, 67,
69, 70, 78, 91, 94, 95, 223, 224, 226
Van Hell 134, 150
Van Heste 139, 150
Van Heuven 133, 134, 142, 149, 150
Van Hout 48, 49, 67, 70, 94, 95, 124, 127,
178, 194, 219
Van Jaarsveld 134, 138, 149, 206, 214
Von Studnitz 134, 150, 209, 217
W
Weber-Fox 144, 150, 210, 218
Wexler 124
White 1, 8, 20, 31, 35, 41, 44, 46, 68, 70,
95, 107, 122, 125, 168, 172, 178

<TARGET "si" DOCINFO AUTHOR ""TITLE "Subject index"SUBJECT "Language Acquisition & Language Disorders, Volume 30"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4">

Subject index
A
abstract representations 15, 154, 159
abstraction 4, 153, 154
acceptability judgment 117, 120, 144
acquisition of vocabulary 199, 200
age of acquisition 201, 208-210, 217
agreement 22, 25, 44, 48, 61-66, 77, 78,
83, 84, 87, 93, 152, 157, 178-181,
183, 184, 190, 192, 195
alternative feature realization 61
amnesic patients 197, 198, 201-203
animacy 14, 154, 160-162, 165-168
aphasia 175, 176, 194, 195, 198, 215-218
aphasics 12, 15, 16, 176-179, 183, 184,
194, 195, 198, 215
artificial grammar 153, 173, 202, 203, 216
artificial language 153, 155, 172
aspect hypothesis 38, 39, 41
association 22, 131, 132, 134, 150, 170
asymmetric spell out 61
auxiliary selection 112, 113, 127

B
bilingual 20, 130-132, 134-136, 138-140,
142-150, 198, 206, 208, 214-218,
221, 225, 226
bilingual syntactic processing 146
C
categorical 56, 97, 99-101, 104, 106, 107,
111-114, 116-119, 121-123
causative verb 108
clitic placement 107, 108, 110, 119, 125
closed-class words 210
cognate 135, 141
common gender 179-181
competence 2, 5-8, 11, 17, 31, 43, 46,
97-104, 106, 107, 109, 111, 116, 117,
119, 120, 122-126, 171, 183, 216, 223
competition model 204, 217
computer modelling 4, 11, 12, 14, 16
connectionism 4, 12, 19, 20, 155, 172, 225
connectionist learning 4, 151, 153, 154,
164, 222, 225
connectionist model 165, 174, 226
connectionist network 151, 154, 159, 163,
164
conservation 48, 53, 65, 67, 68, 70, 72, 73,
75, 81, 86, 95
conservation hypothesis 70, 75, 81, 86
conservative strategy 10, 65
constructional gradience 113
convergence 101-103, 106, 117, 119, 197,
198, 204-206, 210-212
convergence hypothesis 197, 198, 204,
205, 210-212
critical period 23, 24, 41, 201, 208, 215
D
declarative 4, 10, 15, 16, 18, 20, 163, 164,
174, 193, 195, 199-203, 205, 212,
216, 218, 223
declarative memory 15, 193, 199-203
derivational theory of complexity 2, 107
discourse hypothesis 39, 41
distributed morphology 23, 35, 43
distributional analysis 18, 154, 155, 157
dual competence 104, 107, 111, 116, 117,
119
Dutch go/no-go 137

E
EEG 144, 145, 175, 184-186, 194
effective connectivity 211
electroencephalography 184
English go/no-go 132, 137
English lexical decision task 131, 138-141
ERP data 186, 206-208, 213
ERPs 12, 15, 16, 130, 144, 145, 184, 186,
192, 194, 213
Event-Related brain Potentials 144, 149,
195, 215, 216
explicit learning 160, 164, 199
F
feature 9, 10, 17, 22-25, 33-37, 40-42, 48,
51-54, 56, 59-64, 66, 69, 70, 72, 73,
76, 78-81, 85-88, 91-93, 113, 117,
143, 168, 222
feature bundle 73, 76, 81, 85, 86, 88, 91,
93
formal feature 51, 73, 78, 80, 85
frequency 4, 12, 13, 18, 19, 22, 28, 29, 37,
42, 43, 111, 112, 117, 124, 129, 135,
136-139, 141-143, 147, 159, 172,
181, 182, 186
frequency of use 12, 18
frontal-basal ganglia circuit 207
functional category 9, 34
functional head 69, 78, 80, 83
G
generalised blocking principle (GBP) 34, 36
generalized lexical decision 131, 136, 139,
141, 150
generativist 4-7, 10, 98, 101, 122, 220
go/no-go 131, 132, 137
gradience 97, 99-101, 111-113, 115, 117,
120, 125
grammar 1, 4, 8, 9, 15, 17, 19-21, 23, 26,
34, 35, 40, 43, 44, 46, 48, 51-54, 65,
67, 68, 70, 81, 88, 94, 95, 98, 100,
103, 123, 151, 153, 173, 174, 175,
176, 179, 184, 195, 197, 199, 200,
202, 203, 212, 216, 218, 219, 220,
221, 226
grammatical gender 151, 173-175, 179,
184, 191, 194, 195
grammaticalise 34
grammaticality 16, 18, 97-101, 107-110,
116, 119, 123-126, 154, 175, 176,
179, 180-183, 185-187, 189, 208
grammaticality judgment 107, 109, 116,
126, 154, 175, 176, 179-181, 185,
186, 189
H
haemodynamic methods 206, 213
hemispheric representation of L2 198
I
idealisation of the data 7
implicit learning 18, 153, 159, 165-167,
169, 172, 174, 200, 214
incidental learning 154, 161, 165,
169-171, 224
initial sensitivities 205
innate knowledge 5, 225
interlanguage 4, 7, 8, 10, 17, 18, 43, 46,
48-50, 53, 65, 66, 70, 73, 94, 109
interlingual homograph 132, 134, 136,
137, 146, 148-150, 214
interpretable feature 79
invariant principles 3
invisible category principle (ICP) 59, 60

L
language decision 131, 137
language faculty 2, 6, 10, 21, 22, 24, 26,
40, 41, 45, 47, 65, 66, 98, 123
language intermixing 138, 141, 142, 149,
214
language mode 140
language non-selective access 132-134,
139, 146, 147
language processing 19, 140, 149, 150,
176, 178, 183, 184, 191, 198, 213,
218, 221, 222, 224
language processor 175
language-selective access 132
language-specific lexical decision 131
lexical decision 131, 133-136, 138-141,
150, 177, 184, 214, 215
lexical gradience 111, 112, 115

lexical item 33, 37, 42, 48, 51-53, 55-57,
69, 70, 72, 73, 72-75, 83, 84, 87, 91,
92, 93, 129, 137, 213
lexicon 4, 10, 19, 20, 24, 25, 34, 40, 41,
50-54, 65, 66, 68, 69, 70, 83, 87, 93,
94, 101, 111, 122, 126, 132, 145, 151,
152, 168, 170, 174, 195, 197, 199,
200, 212, 214, 218-222, 224-226
lexicon-syntax interface 1, 50-53, 111,
222
LF-legibility 56, 65, 66
linguistic context effects on word
recognition 143
location 48, 58-61, 65, 112, 155, 212
logical problem of language acquisition 6
M
mapping problem 24
medial temporal structures 199
metalinguistic knowledge 167
minimalist program 19, 24, 67, 95, 194,
214, 226
minimalist theory 3, 66, 95, 220
miscategorisation 52, 54
model learning 162, 163, 167, 168, 170
modular 6, 219
morphological relatedness 118
N
near-native speaker 21
neocortical regions 199
neuter gender 179-181
non-parallel ellipsis 114, 115
noun class 151, 153-157, 160-163, 165,
169, 170
noun classification 179
null-results for interlingual homographs
139
numeral 57, 64, 65
O
on-line processing 176, 179, 183, 190, 205
open-class words 210
optionality 2, 9, 21, 22, 24, 26, 31-33, 36,
37, 39, 224
output condition 39, 41
P
P600 145, 184-188, 190, 192, 193, 207,
210
parallel distributed processing 13, 20, 173
parallelism constraint 100, 114, 115, 117
parameter resetting 44
parameter setting 71
past participle 41, 76, 208
past tense marking v, 21, 22, 26, 27, 29,
31-33, 36, 40, 42
path 48, 58-61
performance 6, 7, 11, 16, 22, 26, 29-33,
40, 45, 46, 97-103, 106, 116, 121,
123, 124, 141, 142, 147, 149, 150,
152, 155-161, 164-166, 169-171,
199-203, 205, 207, 208, 213, 218,
219, 223, 224
phonological matrix 51, 72, 73, 72-78, 80,
81, 83, 86, 89, 88, 91-94, 220
procedural 4, 10, 15, 16, 18, 20, 163, 164,
174, 193, 195, 199-203, 215, 218, 223
procedural learning 164
procedural memory 4, 15, 16, 193,
199-202, 215
prodeterminer 85
progressive demasking 131-133
Q
qualitative difference 179, 191, 224
quantification 48, 55, 57
R
recognition 129, 130, 132, 135, 137-139,
142-144, 146-150, 172, 173, 202,
203, 214-218
repetition 202, 203, 216
repetition priming 202, 203, 216
representation of grammatical knowledge
175, 176, 178, 191, 201
S
semantic anomalies 144, 210
sentence-matching 107-109, 119, 125


</TARGET "si">

short-term memory 156, 157, 160, 161,
166, 167
simple past tense in spontaneous
production 29
simulation data 203
spontaneous recovery 176
subsymbolic model 164
surface competence 101, 111, 119, 123
syntactic anomalies 184, 210
syntactic feature 25, 33, 36, 37, 42
syntactic gradience 99, 113, 117
Syntactic Positive Shift (SPS) 184, 195
syntactic priming effect 177
syntactic processing 125, 144-147, 195,
208, 215

T
task learning 13, 162, 163, 167
task-dependent variation 176
terminal node 23, 35
test of knowledge of morphology 26
time-course of lexical activation 135, 147
triggering 1, 7, 8, 17, 91, 221
V
VP-ellipsis 100, 113-115, 120, 125
W
word association 131, 132, 134
word naming 131, 132
working memory 146, 163, 176

In the series LANGUAGE ACQUISITION AND LANGUAGE DISORDERS (LALD) the
following titles have been published thus far or are scheduled for publication:
1. WHITE, Lydia: Universal Grammar and Second Language Acquisition. 1989.
2. HUEBNER, Thom and Charles A. FERGUSON (eds): Cross Currents in Second
Language Acquisition and Linguistic Theory. 1991.
3. EUBANK, Lynn (ed.): Point Counterpoint. Universal Grammar in the second language. 1991.
4. ECKMAN, Fred R. (ed.): Confluence. Linguistics, L2 acquisition and speech pathology. 1993.
5. GASS, Susan and Larry SELINKER (eds): Language Transfer in Language Learning.
Revised edition. 1992.
6. THOMAS, Margaret: Knowledge of Reflexives in a Second Language. 1993.
7. MEISEL, Jürgen M. (ed.): Bilingual First Language Acquisition. French and German
grammatical development. 1994.
8. HOEKSTRA, Teun and Bonnie SCHWARTZ (eds): Language Acquisition Studies in
Generative Grammar. 1994.
9. ADONE, Dany: The Acquisition of Mauritian Creole. 1994.
10. LAKSHMANAN, Usha: Universal Grammar in Child Second Language Acquisition.
Null subjects and morphological uniformity. 1994.
11. YIP, Virginia: Interlanguage and Learnability. From Chinese to English. 1995.
12. JUFFS, Alan: Learnability and the Lexicon. Theories and second language acquisition
research. 1996.
13. ALLEN, Shanley: Aspects of Argument Structure Acquisition in Inuktitut. 1996.
14. CLAHSEN, Harald (ed.): Generative Perspectives on Language Acquisition. Empirical
findings, theoretical considerations and crosslinguistic comparisons. 1996.
15. BRINKMANN, Ursula: The Locative Alternation in German. Its structure and acquisition. 1997.
16. HANNAHS, S.J. and Martha YOUNG-SCHOLTEN (eds): Focus on Phonological
Acquisition. 1997.
17. ARCHIBALD, John: Second Language Phonology. 1998.
18. KLEIN, Elaine C. and Gita MARTOHARDJONO (eds): The Development of Second
Language Grammars. A generative approach. 1999.
19. BECK, Maria-Luise (ed.): Morphology and its Interfaces in Second Language Knowledge. 1998.
20. KANNO, Kazue (ed.): The Acquisition of Japanese as a Second Language. 1999.
21. HERSCHENSOHN, Julia: The Second Time Around. Minimalism and L2 Acquisition.
2000.
22. SCHAEFFER, Jeanette C.: The Acquisition of Direct Object Scrambling and Clitic
Placement. Syntax and pragmatics. 2000.
23. WEISSENBORN, Jürgen and Barbara HÖHLE (eds.): Approaches to Bootstrapping.
Phonological, lexical, syntactic and neurophysiological aspects of early language
acquisition. Volume 1. 2001.
24. WEISSENBORN, Jürgen and Barbara HÖHLE (eds.): Approaches to Bootstrapping.
Phonological, lexical, syntactic and neurophysiological aspects of early language
acquisition. Volume 2. 2001.
25. CARROLL, Susanne E.: Input and Evidence. The raw material of second language
acquisition. 2001.
26. SLABAKOVA, Roumyana: Telicity in the Second Language. 2001.
27. SALABERRY, M. Rafael and Yasuhiro SHIRAI (eds.): The L2 Acquisition of Tense-Aspect
Morphology. 2002.
28. SHIMRON, Joseph (ed.): Language Processing and Acquisition in Languages of
Semitic, Root-Based, Morphology. 2003.
29. FERNÁNDEZ, Eva M.: Bilingual Sentence Processing. Relative clause attachment in
English and Spanish. 2003.
30. HOUT, Roeland van, Aafke C.J. HULK, Folkert KUIKEN and Richard J. TOWELL
(eds.): The Lexicon-syntax Interface in Second Language Acquisition. 2003.
