Ida Toivonen, Piroska Csúri, and Emile van der Zee, editors
All rights reserved. No part of this book may be reproduced in any form by any
electronic or mechanical means (including photocopying, recording, or informa-
tion storage and retrieval) without permission in writing from the publisher.
MIT Press books may be purchased at special quantity discounts for business or
sales promotional use. For information, please email special_sales@mitpress.mit.edu.
This book was set in Times by Toppan Best-set Premedia Limited. Printed and
bound in the United States of America.
Structures in the mind : essays on language, music, and cognition in honor of Ray
Jackendoff / edited by Ida Toivonen, Piroska Csúri, and Emile van der Zee.
pages cm
Includes bibliographical references and index.
ISBN 978-0-262-02942-1 (hardcover : alk. paper) 1. Psycholinguistics.
2. Cognitive science. 3. Neurolinguistics. 4.
Cognition. I. Jackendoff, Ray, 1945- honoree. II. Toivonen, Ida. III.
Csúri, Piroska. IV. Zee, Emile van der.
P37.S846 2015
401.9dc23
2015009287
10 9 8 7 6 5 4 3 2 1
Contents
Acknowledgments
Introduction
0.1 The Scholar Ray Jackendoff, by Ida Toivonen, Piroska Csúri, and Emile van der Zee
0.2 Some Brief Reflections on Ray Jackendoff, by Paul Bloom
0.3 Ray Jackendoff's Scholarship, by Noam Chomsky
0.4 The Brilliant Ray of Linguistics, by Adele E. Goldberg
0.5 Meeting Ray Jackendoff, by Georgia M. Green
0.6 Ray's Influence on a Young Generative Semanticist, by Frederick J. Newmeyer
0.7 Ray Jackendoff in the Semantic Pantheon, by Barbara H. Partee
0.8 The Man Who Made Language a Window into Human Nature, by Steven Pinker
0.9 Ray Jackendoff, Cognitive Scientist, by Thomas Wasow
0.10 Why Ray Is Special, by Moira Yip
0.11 The Organization of This Volume, by Ida Toivonen, Piroska Csúri, and Emile van der Zee
I LINGUISTIC THEORY
II PSYCHOLINGUISTICS
Contributors
Index
Acknowledgments
Many people have helped in the creation of this volume, and we would
like to express our sincere gratitude to all. We wish to especially thank
the authors. It was very exciting to receive and read the chapters, and we
truly believe that Ray as well as all other readers will appreciate their
efforts as much as we do. We also want to thank the authors for providing
comments on each other's initial drafts; the resulting volume bears
witness to the ideas, care and effort they poured into this task. We also
thank the scholars that contributed to the introduction of the volume.
Their contributions provided insights, depth and color that we could not
have managed on our own. In addition, the chapter authors as well as
the authors of the introduction have also patiently helped us with general
advice and encouragement. We must especially mention Joan Maling and
Daniel Dennett for support and practical advice.
Our special thanks also go to the external reviewers who have gener-
ously lent their time to the texts included in this volume: Erik Anonby,
Ash Asudeh, Andrew Brook, Liz Coppock, Simon Durrant, Evan
Houldin, Tabish Ismail, Jonah Katz, Kumiko Murasugi, Diane Nelson,
Dan Siddiqi, and Raj Singh. We are impressed by and grateful for the
reviewers' enthusiasm, expertise, care, and general willingness to help.
Many thanks to Paul Melchin for his excellent editorial and formatting
assistance.
We also thank all the people at MIT Press who have worked on this
with us. We especially thank Sarah Courtney, Christopher Eyer, Philip
Laughlin, Mary Reilly, and Marcy Ross.
From behind the curtains, Hildy Dvorak has given us advice and
support from the very beginning, and we are very grateful for all her help.
Thank you all!
Introduction
These are exciting times for those who study cognition. Just a few decades
ago, it was commonly assumed to be futile to directly study concepts of
the mind, such as knowledge of language or consciousness. Today, cogni-
tive science is an established area of study. We are slowly moving from
a collection of disciplines with distinct methodologies and different out-
looks on what is important towards a truly interdisciplinary field. Cogni-
tive scientists are no longer necessarily linguists, psychologists, computer
scientists, etc., with a common interest in the mind. It is now common-
place to identify as a student of a topic or research area in cognition:
language, perception, consciousness, attention, memory, moral reasoning,
or learning. Researchers are cross-listed across traditional departments,
students are supervised by teams of scholars with different areas of spe-
cialization, and ties between universities and industry are flourishing.
The field brims with activity and enthusiasm, and for this we owe a huge
debt to pioneering researchers in cognitive science, scholars who took
the study of the mind seriously and had the audacity to reach out across
disciplines. One such pioneer is Ray Jackendoff. This volume is intended
as a thank you and a tribute to him and his work.
Ray Jackendoff was born in 1945. He studied mathematics at Swarth-
more College and then linguistics at MIT. He received his doctorate from
MIT in 1969. After a short stint at UCLA, he was hired at Brandeis
University, where he taught for 35 years. In 2005, he became the Seth
Merrin Professor of Humanities at Tufts University and co-director (with
Dan Dennett) of the Center for Cognitive Studies.
An introduction into the different areas to which Ray Jackendoff con-
tributed (syntax, morphology, semantics, phonology, musical cognition,
I first met Ray at a party at the conclusion of one of the San Diego Syntax
Festivals held in La Jolla in maybe 1970. These were informal roundtable
affairs during which Chomsky's students who had left the nest after
earning their degrees discussed their own and each other's work. Grad
students were there as observers. Ray and I were somewhat wary of each
other. I had a reputation in Cambridge as the Wicked Witch of the West
after an ill-advised implicature in a review I wrote for Language, and in
the heartland of what came to be known as generative semantics, he was
the face of the lexicalist heresy. I thought dancing might defuse (or
diffuse) the unspoken tension, but the music was not suitable, so speaking
happened, but dancing did not. It was probably twenty or twenty-five
years before we spoke again. By the mid-1990s, our views of the relations
between syntax and the speaker's construal of the world and how it
works, and of what a grammar is had converged to a degree neither of
us could have foreseen, following the disparate paths of interpretive
semantics (Jackendoff 1972, 1983, 1990, 1997) and modern formal phrase
structure grammar (Gazdar, Klein, Pullum, and Sag 1985; Pollard and Sag
1987; Pollard and Sag 1994; Green 2011), which in the head-driven incar-
nations was strongly lexicalist. Our differences seemed trivial and local.
When Ray came back to Illinois a few years ago (he had taught here at
two LSA Linguistic Institutes), we had a lively dinner after his talk, and
I was sorry we had never realized how much our work was going in the
same direction, from the same general principles.
been calling lexical redundancy rules. The prevailing view, which came
from Chomsky's ideas about evaluation metrics for grammars, was that
the best grammar was the shortest grammar, and theoretical adequacy
was all about finding a theory of grammar for which the grammar picked
by the evaluation metric was indeed the best grammar. This was all in
the service of language acquisition, which was presented schematically
as a matter of choosing among all the possible grammars consistent with
the primary data. And the idea that the evaluation metric would prize
the shortest grammar was undoubtedly influenced by mathematicians'
aesthetics in designing axiomatic theories, where it makes sense to search
for a non-redundant and minimal set of axioms, or physicists' search
for some small set of physical laws that explain a wide range of
phenomena.
This paper of Jackendoff's really helped to change that whole attitude;
it was probably one important step on the path from taking physics as
the model science to the more recent ethos of seeing linguistics, and
psychology more generally, as parts of biology. What Jackendoff argued
convincingly is that the lexicon is full of important subregularities, both
in semantics and in morphology, that cannot be reduced to pure redundancy
rules because they are not fully productive and the actual forms
that occur are therefore not predictable. Whereas a real redundancy rule
like [+human] → [+animate] means that you do not have to enter the
value of the animacy feature for a noun with the [+human] feature, the
forms and meanings of nominalizations like discussion, argument, rebuttal
show a similar nominalizing semantics ("result of V-ing") associated
with three different suffixes, and discussion, congregation, copulation
show three different semantic effects ("result of V-ing," "group that Vs,"
"act or process of V-ing") associated with the same suffix (Jackendoff
1975, 650). He used examples like these, and examples of morphologically
complex forms with no synchronically live base forms (like retribution,
fission), to argue that lexical redundancy rules are not productive
derivational rules; one needs full entries in the lexicon. But he did not
rest with his demonstration of the need for full entries. To me the most
interesting part of the paper was his exploration of the question of what
redundancy rules can be good for if they do not let us omit information
from the lexicon. And his answer was an important advance in psycho-
linguistics: he proposed a view on which redundancies make new lexical
items easier to learn, and on which redundancy rules can also be used
creatively in the formation of new lexical items, suggesting that lexicon
and syntax are not so sharply different as had been supposed: the
lexicon is not all simply memorized (and larger structures are not all
simply generated). This was all consistent with Chomsky's lexicalism, but
it may be hard to realize now how new and surprising was the idea of
having both redundancy rules and full lexical entries.
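As a rough illustration of the contrast drawn above (a sketch that is not taken from Jackendoff 1975), a fully productive redundancy rule such as [+human] → [+animate] can be pictured as a function that fills in a predictable feature, so that the feature need not be stored in the entry. The feature names, the entry format, and the function `fill_predictable_features` are hypothetical choices made only for this example.

```python
# Hypothetical feature names and entry format, chosen only for illustration.
REDUNDANCY_RULES = [
    ({"human": True}, {"animate": True}),  # [+human] -> [+animate]
]

def fill_predictable_features(entry, rules=REDUNDANCY_RULES):
    """Return a copy of a lexical entry with rule-predicted features filled in."""
    filled = dict(entry)
    for condition, consequence in rules:
        # The rule applies only if every feature in its condition matches.
        if all(filled.get(k) == v for k, v in condition.items()):
            for k, v in consequence.items():
                # Never overwrite a value the entry already lists.
                filled.setdefault(k, v)
    return filled

teacher = {"form": "teacher", "human": True}
print(fill_predictable_features(teacher))
```

Nothing comparable works for the nominalizations: no function from verb to noun predicts which suffix discussion, argument, or rebuttal takes, and that is why full entries are needed for them.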
The other article I remember most vividly from that period is the one
in which he looked closely at the semantics of case roles and offered
structural-metaphorical extensions of basic relations involving locations,
paths, and motion: "Toward an explanatory semantic representation"
(Jackendoff 1976). That's the one where he likens explaining an idea to
someone to putting an object into a container, and many other such
analogies. That's a good example of some of the insights that come from
conceptual semantics that don't have any direct counterpart in formal
semantics.
In all of his work, from the earliest to the most recent, he has been a
champion of the importance of semantics as a part of linguistics proper,
and at the same time has increasingly forged ties with other aspects of
psychology, especially perception. He began as a Chomskyan, and has
kept the mentalistic stance, but has argued against Chomsky's syntactocentric
view of linguistics. He agrees with Lakoff on the importance of a
conceptual perspective on semantics, but disagrees with him in many
other ways. He agrees with Fodor on some fundamental issues about the
language of thought, but parts company with him on realism. He appre-
ciates much of what has been done in formal semantics, while arguing
strenuously against classical versions of its foundations. And he takes
pains to note that not all model-theoretic semanticists insist on realistic
models, citing Bach's "Natural language metaphysics" (Bach 1986) as
compatible with the view of basing semantics on models as conceptual-
ized by the language user (Jackendoff 1998). In short, Jackendoff has
been an important and independent thinker, making a myriad of major
substantive contributions while taking on the important foundational
issues of our field. We are all in his debt. And that's even without mentioning
his musical contributions, which have brightened many occasions,
including the 1974 Linguistic Institute in Amherst, from which those of us
who were there carry happy memories of Ray's beautiful chamber recital.
0.8 The Man Who Made Language a Window into Human Nature
Steven Pinker
tying the geometry of parse trees to their meanings, Ray also fleshed out
the crucial linkage between syntax and semantics. On a personal note I
can add that reading X-Bar Syntax as a postdoctoral fellow in 1980 was
a revelation to me. The theory immediately suggested a way in which a
child could work backward from the wording of a parental utterance and
an understanding of its context to the phrase structure rules that gener-
ated it. This could allow children to bootstrap their way into the syntax
of the target language, and it served as the heart of a theory of language
acquisition that I developed in several articles and books.
"Grammar as evidence for conceptual structure," included in the
seminal 1978 collection Linguistic Theory and Psychological Reality, was
yet another revelation. Building on ideas by Jeffrey Gruber, Ray showed
that abstract concepts of motion and location lay at the heart of a vast
array of expressions that were not ostensibly about the physical world
at all. This insight truly made language a window into thought, and
it anticipated vast research enterprises in the decades to come on
analogy, conceptual metaphor, and embodied cognition. In subsequent
works, Semantics and Cognition (1983), Semantic Structures (1990), and
Parts and Boundaries (1991), Ray carried out breathtaking analyses of
the cognitive representation of space, time, motion, matter, agency, goals,
causation, and social relationships, perhaps coming closer than anyone
to laying down a spec sheet for the contents of thought.
And there was more to come. In 1997, Ray turned to the question of
"How language helps us think," and probed the relation between language
and thought in a far deeper way than the trendsetters of the neo-Whorfian
fads of the 2000s. In his 1993 paper with Barbara Landau,
"What and where in spatial cognition," he suggested that it's no coincidence
that neuroscientists co-opted interrogative pronouns for the two
major divisions of the visual system: the divisions represent two different
kinds of spatial information encoded respectively in the meanings of
nouns and prepositions. Recent studies of the neurobiological basis of
spatial cognition using neuroimaging techniques that did not exist when
Ray and Barbara wrote their paper have vindicated this ambitious idea.
One of the first and most famous applications of generative linguistics
to other domains was Leonard Bernstein's 1973 lecture series called The
Unanswered Question, which loosely applied Chomsky's theories to
music. In 1977 Ray published a critical review of this premature effort,
but Ray is never satisfied with just tearing things down. His 1983 book
with Fred Lerdahl, A Generative Theory of Tonal Music, outlined a
sophisticated analysis of the cognitive structures underlying melody and
rhythm and how they overlap with the structures of language, a topic
he returned to in his 2006 essay "The Capacity for Music: What Is It and
What's Special About It?" and his 2009 essay "Parallels and Nonparallels
Between Language And Music." In my view, it remains the richest and
most insightful analysis of the mental representation of music.
As if language, space, thought, and music were not a broad enough
range of topics, Ray turned in 1987 to the problem of consciousness.
Unlike the many cognitive scientists and neuroscientists who use the
topic as an excuse to do bad philosophy, Ray came up with a substantive,
non-obvious, and plausible hypothesis about the contents of conscious-
ness, namely that we are aware of intermediate levels of representation
in the hierarchy from sensation to abstract knowledge. He also contrib-
uted the invaluable concept of the computational unconscious, the
infrastructure of information processing that makes reasoning and
awareness possible.
This leaves the nature of language itself, and here we have Ray's two
capstone contributions. In The Architecture of the Language Faculty
(1997) and Simpler Syntax (with Peter Culicover, 2005), Ray outlined a
theory of language that (unlike the allegedly minimalist theories of his
mentor) implements Einstein's dictum that "everything should be made
as simple as possible, but no simpler." Ray's parallel architecture model,
which posits multiple generative components whose outputs contain
variables which are unified by interface rules, embraces both the open-ended
combinatorial power of language and the idiosyncrasies it tolerates
at every level. Best of all, it harmonizes with Ray's other capstone,
Foundations of Language (2002), which presents nothing less than a
theory of the place of language in nature, integrating grammatical theory,
parsing, acquisition, evolution, and neuroscience.
Ray's stunning record of contributions comes from a happy conglomeration
of traits: a concentration on the deepest questions about language,
mind, and human nature; full use of the theoretical precision and
empirical richness made available by modern linguistics; a judicious
level of formalism, which avoids the extremes of woolliness and fussi-
ness; an intuitive feel for the texture of mental phenomena; and a refusal
to be swayed by fads, fashions, ideologies, dogmas, or daringness for its
own sake.
Ray's oeuvre is distinctive for another reason. He blazed a trail into
the center of the mind without the appurtenances and perquisites of
academic power. He could not dine out on the brand-name appeal of his
university; did not preside over a factory of graduate student helpers and
The last paragraph of Ray's first book (Jackendoff 1972, 386) includes
the following:
If we open up a human being, what do we find inside? The answers have been
of the form: We find a four-chambered heart, a spine, some intestines, and a
transformational grammar with two or more syntactic levels. The question of this
section has been: What function do the things we have found serve? Why do they
have the structure they have as opposed to any other?
When I first read this, I was struck by the audacity of comparing the
psychological reality of then-current grammatical theory with the physi-
ological reality of hearts, spines, and intestines. Reflecting on the passage
over forty years later, what impresses me is rather different. The impor-
tant part is not the sentence about physical organs and grammar, but the
two questions that follow. The question about function demands a deeper
level of explanation than has been the norm in generative grammar. The
why-question invites explanations in terms of biological evolution.
It has been a hallmark of Ray's work over the decades to seek explanations
of linguistic phenomena in terms of fundamental properties of
human cognition, and to inquire into the origins of those properties. In
doing this, he has connected his own linguistic discoveries with research
in psychology and biology. Perhaps more than any other linguist, he has
worked to integrate linguistics into cognitive science.
Of course, Ray is not alone in this. Indeed, since the late 1950s, Chomsky
has touted linguistics for the insight it can provide into human cognitive
abilities. Chomsky's work played a major role in the birth of cognitive
science, combining insights from linguistics, philosophy, and psychology,
and using new tools made available by the development of computers,
to create a new science of the mind. In the 1960s and '70s, he sometimes
referred to linguistics as a branch of psychology; later, to emphasize his
strong claims about the innateness of much linguistic knowledge, he
began to refer to linguistics as a branch of biology. Such claims led to a
2002, 236). While most cognitive scientists would find this assumption
unproblematic, Ray is again breaking with Chomsky, who asserts,
"human language is not particularly or specifically a communication
system" (http://www.nancho.net/advisors/chomsky.html). Ray, in
contrast, writes, "I will assume without justification that any increase
in explicit expressive power of the communicative system is adaptive,
whether for cooperation in hunting, gathering, defense [footnote
omitted], or for social communication such as gossip" (Jackendoff 1999,
272). Ray's common-sense approach to this issue puts him out of the
mainstream of theoretical linguistics, but very much in the mainstream of
cognitive science.
Ray was recently awarded the David E. Rumelhart Prize for Contribu-
tions to the Theoretical Foundations of Human Cognition. Though not
the first linguist to win this prestigious award, he is the first whose work
involves neither laboratory experimentation nor computational model-
ing. He follows the work of the experimentalists and the modelers, and
synthesizes their findings into theories of mental architecture that he
then tests using more traditional linguistic methods. In this way, he has
been able to do more to integrate linguistics into cognitive science than
any other linguist.
These remarks are a personal case study based on the years of my contact
with Ray. I hope that it illuminates how broad-ranging his mind is, and
what a great debt many of us owe him.
For me as a phonologist, it is unusual to find myself collaborating with,
or being encouraged by, a syntactician or a semanticist. But Ray has
always resisted being typecast, hence his extraordinary breadth of knowl-
edge and enthusiasm. As a graduate student, I read his work on X-Bar
syntax, but despite this my interests moved towards phonology. Then, as
luck would have it, when my son was a few months old, Ray hired me to
fill a part-time temporary job replacing Joan Maling, who was on mater-
nity leave. The following year Jane Grimshaw took maternity leave, and
the year after that Joan Maling took a second leave, and so it happened
that after three years of me hanging about Brandeis, Ray went to bat on
my behalf, and with great resourcefulness persuaded the administration
to create a part-time tenure-track position in phonology, for which I was
duly hired. The point of this personal tale is that Ray has always seen the
big picture: in 1983 part-time tenure-track jobs were a new idea, but that
has never stopped Ray.
This comes through with great clarity in his scholarly work: he does
not get boxed in by the wisdom-du-jour. I have only collaborated with
him once, on a 1987 Language paper with Joan Maling (on which, typically,
they insisted on making me first author because I didn't have tenure
yet and they thought it might help me). The paper was on quirky case,
and it used mechanisms drawn from autosegmental phonology to assign
case markings, so we called it "Case in Tiers." In the context of Ray's oeuvre
it is a mere bagatelle, but I remember what sheer fun he was to bounce
ideas off.
When I needed a keynote speaker to launch the University College
London (UCL) interdisciplinary Centre for Human Communication, the
only person I considered was Ray. He has a skill that is desperately rare
among theoretical linguists: he can build bridges to researchers from
other branches of language sciences, as well as cognitive sciences and
philosophy. And of course he gave a superb talk.
More recently, when I began to develop an interest in comparisons
between human language and animal communication, especially bird-
song, he was one of the very few linguists to whom I sent a draft paper,
and, typically for Ray, he quickly responded with thoughtful and encour-
aging comments, including suggestions as to where to submit it. Since
then, his work on the evolution of human language has helped form my
thinking on the issues, and I always assign his papers to my students. I
plan on continuing to do so for many years to come.
This Festschrift is an indication that I am not alone in my admiration
for Ray's work, or in my gratitude for being his colleague and friend.
Notes
References
Peter W. Culicover
1.1 Introduction
1.3 Representations
Figure 1.1
Development of flow between regions at times (a), (b), and (c)
in the Penn Treebank. The human parser is trained on the corpus of the
learner's experience.
A probabilistic phrase structure grammar has rules of the form in (8),
where A, B, C . . . are categories and p is the probability of the particular
expansion.
(8) [p] A → B C . . .
When the parser encounters an instance of B, it projects the structure
[A B C . . .] with probability p. The probability is determined by the
frequency of the full structure initiated by B in the corpus that the
parser is trained on. These probabilities correspond in our physical
description of processing in a linguistic space to the width and density
of trajectories.
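A minimal sketch of how such probabilities could be read off a corpus is given below. It is not the parser used in the experiments reported here. The local trees are invented, the function name `left_corner_probabilities` is hypothetical, and conditioning on the first daughter B is an assumption based on the statement that the structure is projected when the parser encounters B.

```python
from collections import Counter, defaultdict

# Toy "treebank": each item is one local tree, written as (parent, children).
# These trees are invented for illustration; they are not Penn Treebank data.
toy_trees = [
    ("S", ("NP", "VP")),
    ("S", ("NP", "VP")),
    ("S", ("NP", "VP", "PP")),
    ("NP", ("Det", "N")),
    ("NP", ("Det", "N")),
    ("NP", ("Det", "Adj", "N")),
    ("VP", ("V", "NP")),
]

def left_corner_probabilities(trees):
    """Estimate P(parent -> children | first child) by relative frequency."""
    expansion_counts = Counter(trees)
    left_corner_totals = defaultdict(int)
    for (parent, children), n in expansion_counts.items():
        left_corner_totals[children[0]] += n
    return {
        (parent, children): n / left_corner_totals[children[0]]
        for (parent, children), n in expansion_counts.items()
    }

probs = left_corner_probabilities(toy_trees)
for (parent, children), p in sorted(probs.items()):
    print(f"[{p:.2f}] {parent} -> {' '.join(children)}")
```

An expansion that never occurs in the training data has no entry at all, which is the situation the parser faces with locally well-formed but globally unattested configurations.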
My experiments using a parser trained on a tagged corpus have shown
that configurations that are locally well-formed but globally non-existent
in the corpus cannot be correctly parsed.9 To take just one example, it is
well-known that extraction from a sentential subject in English, as in (9),
is unacceptable (Ross 1967):
(9) *These are the shares which_i [S that the president sold t_i] has
surprised the market.
This sentence is locally well-formed, in that a sentence may be a subject
in English, the wh-phrase is where a wh-phrase may be, and the gap is
where a gap may be.
Interestingly, the filler-gap configuration exemplified here does not
occur in the corpus. The reason may be that (9) is ungrammatical in the
traditional sense, or it may be nonexistent in the corpus for reasons other
than grammar per se. In any case, the parser is not trained on sentences
like (9), and hence does not handle such a sentence properly, as shown
in figure 1.2. The feature g (for gap) should appear on the node
RP-IM, but actually is passed down through VS-gNS-II. This is an error,
since the extraction is from the subject, not the matrix VP.
The traditional explanation for the unacceptability of sentences such
as (9) is that they violate a grammatical constraint. However, there is an
alternative possibility: that such cases reflect processing complexity
(Hofmeister, Casasanto, and Sag 2013). On this view, more complex
configurations, like genuine cases of ungrammaticality, are rare in the
experience of the learner. This rarity gives rise to high surprisal, reflect-
ing the low or zero probability of the configuration (Hale 2001, 2003;
Levy 2005, 2008). High surprisal in turn correlates with the subjective
experience of unacceptability (Crocker and Keller 2006).
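Surprisal is standardly defined as the negative logarithm of a probability, so that a configuration with probability zero has unbounded surprisal. The sketch below shows only that relationship. The function name and the three probabilities are hypothetical, and the cited models compute the probability of each word given its preceding context rather than taking it as a given number.

```python
import math

def surprisal(probability):
    """Return surprisal in bits, -log2(p); infinite for a zero-probability event."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability == 0.0:
        return math.inf
    return -math.log2(probability)

# Hypothetical probabilities for a frequent, a rare, and an unattested configuration.
for label, p in [("frequent", 0.5), ("rare", 0.001), ("unattested", 0.0)]:
    print(label, surprisal(p))
```

The rarer the configuration in the learner's experience, the larger the value, which is the quantity the text relates to judgments of unacceptability.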
Figure 1.2
Parse of extraction from subject
The idea that extraction from subjects introduces complexity was pro-
posed by Kluender (1992, 1998, 2004). Similar arguments have been
made for other island constraints in the recent literature (see, e.g., Hof-
meister 2011; Hofmeister and Sag 2010; Hofmeister et al. 2007; Hofmeis-
ter et al. 2013; Hofmeister, Culicover, and Winkler, forthcoming; Sag,
Hofmeister, and Snider 2007). While a fully explicit processing account
of these constraints in terms of complexity is yet to be formulated, SS
points in this direction, on the assumption that grammatical knowledge
consists only of constructions. The task of the processor is to take these
constructions, that is, memory structures in the performance mechanism,
and fit them together in order to compute representations for more
complex expressions. On this view, any judgment that cannot be tied
directly to the well-formedness conditions imposed by constructions
must have an extra-grammatical explanation.
The preceding sections suggest that no matter what the linguistic experi-
ence of the learner is, it will be incorporated into linguistic competence
in the language processing mechanism in the form of a construction. Such
a view does not explain where the linguistic experience comes from,
or what if anything constrains its properties. But it does appear that
languages share certain properties and lack others, and that some proper-
ties, at least, are good candidates for universals. So we come to what is
probably the most fundamental issue in syntactic theory, which is that of
universals: how are they represented in the mind, and where do they
come from?
Regarding the first question, we propose in SS that Universal Grammar,
that is, the human language faculty, is a toolkit that learners draw upon
in constructing grammars of their languages; this is an idea that has been
prominent in Jackendoff's work (see, e.g., Jackendoff 2002, chap. 4).
Something that is in the toolkit need not be in every grammar, but it
must be universally available. The toolkit assumed in SS is very restricted,
compared with more traditional grammatical theories (Culicover and
Jackendoff 2005, chap. 1).
Regarding the second question, in Culicover (1999) and Culicover
(2013), I suggest that universals are in part reflections of economy in
the formulation of SYN-CS correspondences. The notion of economy is of
course familiar from the Minimalist Program, where it is envisioned in
terms of computational perfection (Chomsky 1995). I take economy to
be a matter of the actual complexity of the form-meaning correspondence.
Let us begin with the plausible assumption that what is evolutionarily
prior to language is essentially human CS, as articulated by Jackendoff
(1972, 1983, 1990, 1997, 2002). In particular, assume that it represents ref-
erence to objects, relations between objects and properties of objects, and
events and states, that is, representations of the form x.F(:x). Kirby
(1997, 2002) and his colleagues (Kirby, Smith, and Brighton 2007) have
conducted computational experiments to model the evolution of lan-
guage. These experiments show how groups of agents, that is, learners, in
a generation can settle on increasingly more general grammatical hypoth-
eses about the correspondences between strings and meanings produced
by the preceding generation. Once a group of agents hits upon the idea of
using sounds to refer to and distinguish objects and their properties, syn-
tactic representations may evolve that are as complex as the CS represen-
tations, and in fact closely mirror the structure of these representations.
A key advance in the evolution of such representations is the forma-
tion of categories based on similarity of properties and distribution. So
it is reasonable to assume that three key universals are the following:
(i) CS is structured and recursive.
(ii) Sound corresponds to CS.
(iii) Form categories.
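Universal (iii) can be made concrete with a toy sketch of category formation from distribution alone. This is an illustration under simplifying assumptions, not a model proposed in the chapter or in the cited simulations: the utterances are invented, a word's distribution is reduced to the set of (previous word, next word) frames it occurs in, and the names `context_frames` and `form_categories` and the similarity threshold are hypothetical.

```python
from collections import defaultdict

# Invented toy utterances; real learner input would be far larger and noisier.
utterances = [
    "the dog sleeps", "the cat sleeps", "a dog sleeps", "a cat sleeps",
    "the dog runs", "the cat runs", "a dog runs", "a cat runs",
]

def context_frames(sentences):
    """Map each word to the set of (previous word, next word) frames it occurs in."""
    frames = defaultdict(set)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word, nxt in zip(words, words[1:], words[2:]):
            frames[word].add((prev, nxt))
    return frames

def form_categories(sentences, threshold=0.5):
    """Greedily group words whose frame sets have Jaccard similarity >= threshold."""
    frames = context_frames(sentences)
    categories = []
    for word in sorted(frames):
        for category in categories:
            # Compare against the first member of each existing category.
            rep = frames[category[0]]
            overlap = len(rep & frames[word]) / len(rep | frames[word])
            if overlap >= threshold:
                category.append(word)
                break
        else:
            categories.append([word])
    return categories

print(form_categories(utterances))
```

On this input the words fall into three groups that correspond to determiners, nouns, and verbs, although no category labels are given to the procedure in advance.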
Notes
1. I have to confess that I (deviously) got Ray to comment on another piece that
I was working on at the same time as this one that dealt with some of the same
issues. As always, his comments have been very much to the point, and have led
to substantial improvements. He is of course not responsible for any errors. More
generally, I am pleased to once again have the opportunity to thank him for his
friendship, his kindness, his patience, and his generosity, to acknowledge the
enormous influence he has had on me and my work, and to thank him for afford-
ing me the privilege of collaborating with him for (wait for it!) . . . over FORTY
fabulous years.
For very helpful comments on this piece in its present form, I thank Dan Siddiqi
and an anonymous reviewer. I am also grateful to Richard Samuel for stimulating
discussions about many issues, including the competence-performance distinc-
tion. Naturally, none of them are responsible for any errors, either.
2. Of course, linguistics is a branch of cognitive science, since language is a cre-
ation of the human mind. But much of linguistic research is not explicitly con-
cerned with the mental representation of language, while mental representation
is the central concern of cognitive science.
3. This particular quotation is from a joint article, but it has been Jackendoff's
idea for some time; see, e.g., Jackendoff (2002, chap. 6).
4. We argue in SS that the grammatical functions Subj and Obj must also be
represented in correspondences. I leave these out here in part to simplify the
exposition, and in part because in simple correspondences the grammatical func-
tions are redundant. They appear to play a role, however, in capturing relation-
ships between constructions such as active-passive.
5. I include the phonetic form of these expressions for explicitness, although it
is inherited from the forms of the individual words and the normal syntactic
structure of the English VP.
6. Treating the elements as points is of course a simplification, since they too
have temporal characteristics.
7. Since the syntactic part of the space is not structured prior to experience,
categories will vary across languages, as suggested by Culicover (1999) and Croft
(2001, 2005), among others. However, since the semantic part of the space is
universal, it will constrain the types of categories that form, under reasonable
assumptions about economy and generalization. See section 1.5 for further
discussion.
8. The traversal of a trajectory is neutral with respect to speaker and hearer. A
speaker starts with the CS representation, producing the sounds while going
through the corresponding syntactic representation and from that to the phono-
logical form. A hearer is driven through the trajectory by the phonological
form, which corresponds to the syntactic structure, which in turn corresponds to
the interpretation. In fact, in the course of real time processing, the hearer
is likely to entertain multiple alternative structures, a point that I return to in
section 1.4.
9. The experiments use the parsing environment described in Nguyen et al.
(2012), and were carried out in collaboration with William Schuler and Marten
van Schijndel.
Simpler Syntax and the Mind 17
References
Brighton, Henry, Kenneth Smith, and Simon Kirby. 2005. Language as an evolu-
tionary system. Physics of Life Reviews 2 (3): 177–226.
Briscoe, Edward. 2000. Grammatical acquisition: Inductive bias and coevolution
of language and the language acquisition device. Language 76 (2): 245–296.
Briscoe, Edward. 2002. Linguistic Evolution through Language Acquisition:
Formal and Computational Models. Cambridge: Cambridge University Press.
Chater, Nick, and Morten H. Christiansen. 2010. Language acquisition meets
language evolution. Cognitive Science 34 (7): 1131–1157.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT
Press.
Chomsky, Noam. 1986. Knowledge of Language: Its Nature, Origin, and Use. New
York: Praeger.
Chomsky, Noam. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Crocker, Matthew W., and Frank Keller. 2006. Probabilistic grammars as models
of gradience in language processing. In Gradience in Grammar, edited by Gisbert
Fanselow, Caroline Féry, Ralf Vogel, and Matthias Schlesewsky, 227–245. Oxford:
Oxford University Press.
Croft, William. 2001. Radical Construction Grammar: Syntactic Theory in Typo-
logical Perspective. Oxford: Oxford University Press.
Croft, William. 2005. Logical and typological arguments for radical construction
grammar. In Construction Grammars: Cognitive Grounding and Theoretical
Extensions, edited by Mirjam Fried and Jan-Ola Östman, 273–314. Amsterdam:
John Benjamins.
Culicover, Peter W. 1998. The minimalist impulse. In The Limits of Syntax,
edited by Peter W. Culicover and Louise McNally, 47–77. New York: Academic
Press.
Culicover, Peter W. 1999. Syntactic Nuts: Hard Cases, Syntactic Theory, and Lan-
guage Acquisition. Oxford: Oxford University Press.
Culicover, Peter W. 2011. Core and periphery. In The Cambridge Encyclopedia
of the Language Sciences, edited by Patrick Colm Hogan, 227–230. Cambridge:
Cambridge University Press.
Culicover, Peter W. 2013. Grammar and Complexity: Language at the Intersection
of Competence and Performance. Oxford: Oxford University Press.
Culicover, Peter W., and Ray Jackendoff. 2005. Simpler Syntax. Oxford: Oxford
University Press.
Culicover, Peter W., and Ray Jackendoff. 2006. The Simpler Syntax Hypothesis.
Trends in Cognitive Sciences 10 (9): 413–418.
Culicover, Peter W., and Andrzej Nowak. 2002. Learnability, markedness, and the
complexity of constructions. In Language Variation Yearbook, vol. 2, edited by
Pierre Pica and Johan Rooryk, 5–30. Amsterdam: John Benjamins.
Culicover, Peter W., and Andrzej Nowak. 2003. Dynamical Grammar. Oxford:
Oxford University Press.
Gertner, Yael, Cynthia Fisher, and Julie Eisengart. 2006. Abstract knowledge of
word order in early sentence comprehension. Psychological Science 17 (8):
684–691.
Goldberg, Adele E. 1995. Constructions: A Construction Grammar Approach to
Argument Structure. Chicago: University of Chicago Press.
Hale, John T. 2001. A probabilistic Earley parser as a psycholinguistic model. In
Proceedings of the Second Meeting of the North American Chapter of the Associa-
tion for Computational Linguistics, 1–8. Morristown, NJ: Association for Com-
putational Linguistics.
Hale, John T. 2003. The information conveyed by words in sentences. Journal of
Psycholinguistic Research 32 (2): 101–123.
Hawkins, John A. 1994. A Performance Theory of Order and Constituency. Cam-
bridge: Cambridge University Press.
Hawkins, John A. 2004. Complexity and Efficiency in Grammars. Oxford: Oxford
University Press.
Hofmeister, Philip. 2011. Representational complexity and memory retrieval in
language comprehension. Language and Cognitive Processes 26 (3): 376–405.
Hofmeister, Philip, Inbal Arnon, T. Florian Jaeger, Ivan A. Sag, and Neal Snider.
2013. The source ambiguity problem: Distinguishing the effects of grammar and
processing on acceptability judgments. Language and Cognitive Processes 28
(1–2): 48–87.
Hofmeister, Philip, Peter W. Culicover, and Susanne Winkler. Forthcoming.
Effects of processing on the acceptability of frozen extraposed constituents.
Syntax 19.
Hofmeister, Philip, T. Florian Jaeger, Ivan A. Sag, Inbal Arnon, and Neal Snider.
2007. Locality and accessibility in wh-questions. In Roots: Linguistics in Search
of Its Evidential Base, edited by Sam Featherston and Wolfgang Sternefeld,
185–206. Berlin: de Gruyter.
Hofmeister, Philip, and Ivan A. Sag. 2010. Cognitive constraints and island effects.
Language 86 (2): 366–415.
Hofmeister, Philip, Laura Staum Casasanto, and Ivan A. Sag. 2013. Islands in the
grammar? Standards of evidence. In Experimental Syntax and Island Effects,
edited by Jon Sprouse and Norbert Hornstein, 42–63. Cambridge: Cambridge
University Press.
Jackendoff, Ray. 1972. Semantic Interpretation in Generative Grammar. Cam-
bridge, MA: MIT Press.
Jackendoff, Ray. 1983. Semantics and Cognition. Cambridge, MA: MIT Press.
Jackendoff, Ray. 1990. Semantic Structures. Cambridge, MA: MIT Press.
Jackendoff, Ray. 1997. The Architecture of the Language Faculty. Cambridge, MA:
MIT Press.
Nguyen, Luan, Marten van Schijndel, and William Schuler. 2012. Accurate
unbounded dependency recovery using generalized categorial grammars. In Pro-
ceedings of COLING 2012: Technical Papers, 2125–2140.
http://www.aclweb.org/anthology/C/C12/C12-1130.pdf.
Ross, John R. 1967. Constraints on Variables in Syntax. PhD diss., MIT.
Sag, Ivan A. 2012. Sign-based Construction Grammar: An informal synopsis. In
Sign-based Construction Grammar, edited by Hans C. Boas and Ivan A. Sag,
39–170. Stanford, CA: CSLI.
Sag, Ivan A., Philip Hofmeister, and Neal Snider. 2007. Processing complexity in
subjacency violations: The complex noun phrase constraint. In Proceedings of the
43rd Annual Meeting of the Chicago Linguistic Society, edited by Malcolm
Elliott, James Kirby, Osamu Sawada, Eleni Staraki, and Suwon Yoon, 215–229.
Chicago: Chicago Linguistic Society.
Tomasello, Michael J. 2003. Constructing a Language. Cambridge, MA: Harvard
University Press.
2 What Makes Conceptual Semantics Special?
Urpo Nikanne
Figure 2.1
Hierarchical levels of a theoretical approach
Figure 2.2
Hierarchical levels of Conceptual Semantics as a scientific approach. The goals of research
form the innermost level, and the formalism and technical solutions the outermost one (cf.
Nikanne 2008).
In what follows, I will explain briefly what the different layers in figure
2.2 stand for. I will concentrate on the goals, background assumptions,
and methodological guidelines. Many characteristic properties of
Conceptual Semantics, such as compositionality of lexical meanings,
semantic functions, semantic fields, etc., belong to the formalisms and
technical solutions of the theory, and they fall outside of the scope of this
chapter.
as a part of human cognition? How does language function and what are
the relationships between language and other cognitive domains?
At this level, the following two things must be taken as given:
1. The relevance of the research topic: language is a natural phenomenon
(i.e., the research topic is something real).
2. The relevance of the point of view: language is a part of the human
mind.
These assumptions may sound self-evident, but they are still something
to be aware of. If it turns out, against all odds, that there is reason to
believe that language is not a real phenomenon or that it is not a part of
the human mind, the approach would not be scientific.
Fortunately, there seems to be no reason to give up the goals of Con-
ceptual Semantics: language has been described successfully and gram-
mars have been written for thousands of years in different cultural
traditions (Itkonen 1991), so this long experience of research gives us
reason to believe that language is a relevant scientific research topic.
Some linguists study language primarily as a social phenomenon, a
tool for communication, while other linguists study language as a part of
the human mind. There is, however, no contradiction between these
points of view: even though language is a tool and a medium of com-
munication between people, it must be processed in the minds of indi-
vidual people.
In addition, communication consists of messages with a form and
content. The content of linguistic communication often refers to different
aspects of human life: emotions, actions, social relations, visual obser-
vations, aesthetic experiences, and so on. Language must link together
all this different information and give it a linguistic form (see Macnamara
[1978]; Jackendoff [1983]). There is, thus, a connection between language
and the other domains of human cognition (see the discussion on cogni-
tive constraints below). Language can even be used for communicating
information that is a result of imagination: lies, fairy tales, surrealistic
jokes, etc. A cognitive approach to language is a crucial part of the puzzle
when science tries to understand what human life consists of.
Note that, for example, Cognitive Linguistics (see, e.g., Langacker
[1987a,b] and Lakoff [1987], among many other texts by the same authors)
shares the goals of research with Conceptual Semantics but not the same
background assumptions and methodological guidelines. Conceptual
Semantics aims at a formal theory, whereas the Cognitive Linguistic
approaches do not.
Figure 2.3
The layers of grammar, shown as nested regions labeled "Universal part of
grammar," "Core grammar of language L," and "Grammar of language L." The
dashed lines indicate that there is no clear-cut borderline between the layers.
The core grammar of L consists of the universal part and the language
specific part. The whole grammar of L consists of the core grammar and an irregular part.
to analyze which aspect of language the theory takes as primary and what
consequences follow from it:
1. System-based function-oriented view
2. System-based form-oriented view
3. Occurrence-based use-oriented view
4. Occurrence-based form-oriented view
Language is used for a variety of functions, and the different parts of the
language system (syntactic categories, affixes, phonemes, etc.) typically
serve particular functions. A theory may take the function of language
as its starting point and consider the form of language to be subordinate
to the function. This is the view generally adopted by so-called functional
theories.
Figure 2.4
System-based function-oriented view (box diagram; labels include FORM and USE)
Figure 2.5
System-based form-oriented view (box diagram; labels include FORM and USE)
Some linguists aim at basing their theories on the most concrete appear-
ance of language, namely the context. This perspective on language is
quite different from the system-based ones. Occurrence-based and use-
oriented approaches tend to take the frequency of particular parts of
structure as the fundamental tool in their analysis. These approaches are
frequency-based and probabilistic when it comes to the analysis of words
and expressions. In this view the system is an approximation based
on the typical (the most frequent) way the forms occur in concrete
contexts.
The fourth possibility is to take concrete utterances as formal units,
without their contexts, as the starting point. Taking this view would mean
that the primary aspect of language consists of concrete utterances that
would somehow be recognized. Then they would be interpreted in the
context in which they occur. Function and structure are then subordinate to
the concrete utterances and their concrete contexts. This is an unintuitive
perspective on language, and as it is not widely represented among lin-
guistic theories, I will not discuss it further.
The analysis above is only a tool for analyzing and understanding the
view of language as a background assumption of a linguistic theory. One
can easily come up with more possibilities by changing the order of the
boxes and the direction of the arrows.
Figure 2.6
Occurrence-based use-oriented view (box diagram; labels include FORM and USE)
2.7 Conclusion
formulated in this chapter are based on this idea of language and mind
as form-based systems.
2.8 Acknowledgments
References
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT
Press.
Chomsky, Noam. 1970. Remarks on nominalization. In Readings in English Trans-
formational Grammar, edited by Roderick A. Jacobs and Peter S. Rosenbaum,
184–221. Waltham, MA: Ginn.
Chomsky, Noam. 1986. Knowledge of Language: Its Nature, Origin, and Use.
Westport, CT: Praeger Publishers.
Croft, William. 2001. Radical Construction Grammar. Oxford: Oxford University
Press.
Fillmore, Charles J., and Paul Kay. 1997. Berkeley Construction Grammar.
Latest update: February 27, 1997. Accessed August 19, 2013.
http://www1.icsi.berkeley.edu/~kay/bcg/ConGram.html.
Fillmore, Charles J., Paul Kay, and Mary C. O'Connor. 1988. Regularity and idi-
omaticity in grammatical constructions: The case of let alone. Language 64 (3):
501–538.
Fodor, Jerry A. 1983. The Modularity of Mind. Cambridge, MA: MIT Press.
Fried, Mirjam, and Jan-Ola Östman, eds. 2004. Construction Grammar in a Cross-
language Perspective. Amsterdam: John Benjamins.
Goldberg, Adele. 1995. Constructions. Chicago: University of Chicago Press.
Harrikari, Heli. 1999. Epenthesis, geminates, and the OCP in Finnish. Nordic
Journal of Linguistics 22 (1): 3–26.
Itkonen, Esa. 1983. Causality in Linguistic Theory. Kent: Croom Helm.
Itkonen, Esa. 1991. Universal History of Linguistics: India, China, Arabia, Europe.
Amsterdam: John Benjamins.
Jackendoff, Ray S. 1972. Semantic Interpretation in Generative Grammar. Cam-
bridge, MA: MIT Press.
Jackendoff, Ray S. 1975. Toward an explanatory semantic representation. Linguis-
tic Inquiry 7 (1): 89–150.
Jackendoff, Ray S. 1983. Semantics and Cognition. Cambridge, MA: MIT Press.
Jackendoff, Ray S. 1987a. Consciousness and the Computational Mind. Cam-
bridge, MA: MIT Press.
Jackendoff, Ray S. 1987b. The status of thematic relations in linguistic theory.
Linguistic Inquiry 18 (3): 369–411.
Jackendoff, Ray S. 1990. Semantic Structures. Cambridge, MA: MIT Press.
Jackendoff, Ray S. 2002. Foundations of Language: Brain, Meaning, Grammar,
Evolution. Oxford: Oxford University Press.
Jackendoff, Ray S. 2007. Language, Consciousness, Culture: Essays on Mental
Structure. Cambridge, MA: MIT Press.
Karlsson, Fred. 1983. Suomen kielen äänne- ja muotorakenne. Helsinki: WSOY.
Kay, Paul. 1995. Construction Grammar. In Handbook of Pragmatics: Manual,
edited by Jef Verschueren, Jan-Ola Östman, and Jan Blommaert, 171–177. Amster-
dam: John Benjamins.
Kettunen, Lauri. 1940. Suomen murteet III. B, selityksiä murrekartastoon. Hel-
sinki: Finnish Literature Society.
Lakoff, George. 1987. Women, Fire, and Dangerous Things. Chicago: University
of Chicago Press.
Langacker, Ronald W. 1987a. Foundations of Cognitive Grammar. Vol. 1. Theo-
retical Prerequisites. Stanford, CA: Stanford University Press.
Langacker, Ronald W. 1987b. Foundations of Cognitive Grammar. Vol. 2. Descrip-
tive Application. Stanford, CA: Stanford University Press.
Macnamara, John. 1978. How can we talk about what we see? MS, Department
of Psychology, McGill University.
Nikanne, Urpo. 1990. Zones and Tiers: A Study of Argument Structure. Helsinki:
Finnish Literature Society.
Nikanne, Urpo. 1995. Action tier formation and argument linking. Studia Lin-
guistica 49 (1): 1–31.
Nikanne, Urpo. 2005. Constructions in Conceptual Semantics. In Construction
Grammars: Cognitive Grounding and Theoretical Extensions, edited by Jan-Ola
Östman and Mirjam Fried, 191–242. Amsterdam: John Benjamins.
Nikanne, Urpo. 2006. Aspectual case marking of object in Finnish. Research in
Language 4: 215–242.
Nikanne, Urpo. 2008. Conceptual Semantics. In Handbook of Pragmatics,
edited by Jan-Ola Östman and Jef Verschueren, 338–343. Amsterdam: John
Benjamins.
One (of very many) important life lessons we can learn from Ray Jack-
endoff's work is to eschew quick identification of semantic properties
with syntactic properties. Rather we must allow for a good amount of
independence between syntax and semantics, and each realm stays
simpler. Plus, with a little luck, phenomena that resist analysis in either
dimension alone can be nicely divided and conquered (e.g., Culicover
and Jackendoff 2006).
Culicover and Jackendoff (1997) present arguments that a con-
struction can at the same time involve syntactic coordination and
semantic subordination, explaining many of its otherwise puzzling
properties. In this paper, we aim to make a similar argument for a
type of coordination in German in which the syntactic coordina-
tor aber 'but' unexpectedly appears in a position characteristic of
conjunct-internal particles. We argue that, indeed, in these cases aber
is syntactically a sentence-internal particle, yet semantically it is the
coordinator it always was. Such an analysis is empirically adequate
and is arguably simpler than either of the alternatives (to wit: syntactic
displacement of a coordinator or analysis as juxtaposition, rather than
coordination).
3.1 Introduction
The German adversative coordinator aber 'but' allows for two classes of
syntactic construals. First, just like English but, it can occur between two
constituents of the same syntactic category, for example, V1 in (1a) and
S in (1b);1 in other words, it behaves like and, except that it carries an
adversative meaning:
42 Daniel Büring and Katharina Hartmann
Figure 3.1
The players: Adversative markers discussed in this paper (table with the
dimensions SYNTACTIC CATEGORY and SEMANTIC CATEGORY)
This section presents in detail two arguments that buried aber (as well
as jedoch, allerdings 'however') has a truly coordinating semantics. In the
context of our final diagnosis, this is taken to indicate that, unlike
semantically similar particles like trotzdem and dennoch 'nevertheless',
they are coordinating adversative particles.
Likewise, the prosodic juncture between the two conjuncts in (6) can be
much less dramatic or even absent, and the second conjunct will typically
be realized with a final fall, as is characteristic for declarative sentences.
A structure with a buried coordinator (and this constitutes our Exhibit
A) clearly patterns with the syndetic coordination in (6), rather than the
asyndetic ones in (5):
(7) a. Sie ist nicht reich, besitzt aber eine Yacht.
she [V1 is not rich] [V1 owns but a yacht]
She is not rich, yet owns a yacht.
b. Ich glaube, dass sie nicht reich ist, ihr Bruder aber eine
I think that [S she not rich is] [S her brother but a
Yacht besitzt.
yacht owns]
I think that she isn't rich, but (that) her brother owns a yacht.
We submit that this contrast between sentences like (5) on the one hand
and those like (6) and (7) on the other should be taken seriously, even
though it merely involves intonation and pragmatic intuitions about
up-in-the-air-ness; asyndetic coordinations without aber are a different
species from those with buried aber.
In order to make the point we are arguing more perspicuous, we intro-
duce the term BARE COORDINATIONS for coordination structures that
involve neither a syntactic coordinator, nor buried aber (nor its class-
mates jedoch or allerdings, which will be discussed in more detail in
section 3.3 below). Our claim is that bare coordinations are pragmatically
incomplete and are marked so intonationally, but coordinations with
buried aber, and coordinating particles in general, are not. We conclude
from this that buried aber, apart from expressing adversativity, has a
genuinely coordinating function even in asyndetic coordinations (which
we will model by making it a semantic coordinator in section 3.4 below).
Before going on, let us note that the bare coordination counterparts
to (7) are even more marked than the bare coordinations in (5):
(8) a. ?? Sie ist nicht reich, besitzt eine Yacht . . .
she [V1 is not rich] [V1 owns a yacht]
She is not rich, owns a yacht . . .
b. ?? Ich glaube, dass sie nicht reich ist, ihr Bruder eine
I think that [S she not rich is] [S her brother a
Yacht besitzt . . .
yacht owns]
I think that she is not rich, (that) her brother owns a yacht . . .
3.2.2 Zwar
Our Exhibit B for arguing that buried aber is a true semantic coordinator
involves the concessive particle zwar, inserted in the first conjunct.
Similar to English true . . . but, German zwar absolutely requires an
adversative coordinator in the second conjunct, which can be aber either
in coordinator position or, crucially, buried:
(12) a. Sie ist zwar nicht reich, aber sie besitzt eine Yacht.
[V2 she is zwar not rich] but [V2 she owns a yacht]
True, she is not rich, but she owns a yacht.
b. Sie ist zwar nicht reich, besitzt aber eine Yacht.
she [V1 is zwar not rich] [V1 owns but a yacht]
True, she is not rich, but she owns a yacht.
Crucially, the adversative particles dennoch and trotzdem 'nevertheless'
we met earlier do not qualify well as confederates for zwar, with
or without a syntactic coordinator; compare (13) to (10a) and (11a)
above:4
(13) * Sie ist zwar nicht reich, (und) besitzt dennoch/trotzdem
she [V1 is zwar not rich] (and) [V1 owns nevertheless
eine Yacht.
a yacht]
intended: True, she is not rich, (but) owns a yacht nevertheless.
In the previous section we have shown that buried aber behaves just
like the syntactic coordinator aber: it makes for a pragmatically
complete coordination, and it can satisfy zwar's appetite for an adversa-
tive second conjunct, two things the regular adversative nevertheless-
type particles cannot. This may seem like evidence in favor of H1. Given
that aber also occurs as an uncontroversial syntactic coordinator, why
not claim that buried aber is in fact the same as the syntactic coordinator,
shuffled into the second conjunct by some syntactic displacement
operation?
In this section we will turn to two other adversative particles,
jedoch and allerdings ('however'). The crucial observation is that these
(unlike trotzdem and dennoch 'nevertheless', discussed in the previous
section) share all the properties we took to be indicative of buried aber's
coordinator status, but that they cannot occur as syntactic coordinators.
This means there has to be an analysis of these properties that does not
rely on being a syntactic coordinator.
3.3.1 Jedoch and Allerdings Are Semantic, but Not Syntactic, Coordinators
First, asyndetic coordinations with allerdings or jedoch (however) are
pragmatically complete, just like their counterparts with buried aber;
compare (14) to (7) above:5
Semantic Coordination without Syntactic Coordinators 49
3.4 Semantics
3.6 Summary
Figure 3.2
Adversative markers with refined syntactic distribution (table with the
dimensions SYNTACTIC CATEGORY and SEMANTIC CATEGORY)
Notes
Note that S constituents do not necessarily contain a subject, either, as that may
be outside the coordination; in such cases, S-hood is diagnosed by the presence
of other uncontroversially VP-external elements such as weak pronouns (e.g.,
sich in (1b), (2b), see also section 5.1).
2. What we call particles in this paper are equally commonly classified as
adverbials; nothing hinges on this distinction here.
3. An anonymous reviewer suggests that the oddness of failing to mark prag-
matic opposition seen in (8) and (9) might be explained as an instance of failure
to maximize presupposition: aber, dennoch, trotzdem, etc. grammatically express
opposition, while plain und does not, so the former are in a sense stronger,
and, where they are appropriate, block using the latter due to some principle
of Maximize presupposition! (Heim 1991).
We think this is a plausible suggestion, except that it is unclear to us how the
contrastive or adversative content of aber and its ilk could be a presupposi-
tion (given that A aber B clearly presupposes neither A nor B, how could it
presuppose any relation between them?). Assuming instead that it is a conven-
tional implicature, we could perhaps derive the intended effect from a generaliza-
tion of Maximize presupposition! to something like Maximize non-at-issue
content!
4. We find examples like (13) seriously degraded. A reviewer suggests, however,
that examples similar to (13) could be found in corpora, and that they do occur
in Google search results.
To obtain a more systematic picture, we ran a search on a 22,248,965 word corpus
of German newspaper texts, Berliner Morgenpost, October 1997, May–December
1998, January–December 1999, using the COSMAS IIweb interface provided by
the Institut für Deutsche Sprache, Mannheim. We found that of the 7,962 occur-
rences of zwar, only 1.39% occur as sentence-mates with dennoch/trotzdem but
without one of aber/jedoch/doch/allerdings (more than half of them clause ini-
tially, incidentally); in contrast, 63.26% of zwar co-occur with aber/jedoch/doch/
allerdings (and without dennoch or trotzdem) in the same sentence (we didn't
search for co-occurrences across sentence boundaries, which probably accounts
for most of the remaining 35%).
Even considering that aber/jedoch/doch/allerdings are more than 15 times more
frequent than dennoch/trotzdem in total, they are still in fact more than 45 times
more frequent with zwar and without dennoch/trotzdem than with dennoch/
trotzdem, and without aber/jedoch/doch/allerdings. We take this to confirm our
original judgment that there is a marked and systematic difference between the
two classes.
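The arithmetic behind this comparison can be checked with a short sketch. It uses only the figures reported in this note; the variable and function names are illustrative, and the factor of 15 is the lower bound stated above ("more than 15 times"), not an exact corpus count.

```python
# Figures reported in note 4 (Berliner Morgenpost sample).
ZWAR_TOTAL = 7962            # occurrences of "zwar"
SHARE_NEVERTHELESS = 0.0139  # zwar with dennoch/trotzdem, no aber-type particle
SHARE_ABER_TYPE = 0.6326     # zwar with aber/jedoch/doch/allerdings, no dennoch/trotzdem
BASE_RATE_RATIO = 15         # overall frequency ratio of the two classes (lower bound)


def approximate_count(total: int, share: float) -> int:
    """Convert a reported percentage back into an approximate token count."""
    return round(total * share)


def cooccurrence_ratio(share_a: float, share_b: float) -> float:
    """How many times more often pattern A occurs than pattern B."""
    return share_a / share_b


ratio = cooccurrence_ratio(SHARE_ABER_TYPE, SHARE_NEVERTHELESS)
# Dividing out the base-rate difference leaves the excess attributable to zwar.
excess = ratio / BASE_RATE_RATIO

if __name__ == "__main__":
    print(approximate_count(ZWAR_TOTAL, SHARE_NEVERTHELESS))
    print(approximate_count(ZWAR_TOTAL, SHARE_ABER_TYPE))
    print(f"{ratio:.1f}", f"{excess:.1f}")
```

The co-occurrence ratio comes out at roughly 45.5, in line with "more than 45 times." Dividing by 15 suggests that the aber-type particles are about three times more frequent with zwar than their overall frequency alone would predict; since 15 is a lower bound, this factor is an upper estimate.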
5. The English translations with however work less than perfectly (we think
because however prefers to have its contrasting element in a separate sentence);
we provide them nonetheless in order to convey, as best as possible, a feel for
the German construction.
6. Again, prompted by a reviewer's Google result similar to (17a), we conducted
a search on a 4,491,138 word tagged corpus of German newspaper texts
(Tagged-C), using the COSMAS IIweb interface provided by the Institut für
References
Beckman, Mary E., Julia Hirschberg, and Stefanie Shattuck-Hufnagel. 2005. The
original ToBI system and the evolution of the ToBI framework. In Prosodic
Typology: The Phonology of Intonation and Phrasing, edited by Sun-Ah Jun,
9–54. Oxford: Oxford University Press.
Culicover, Peter W., and Ray Jackendoff. 1997. Semantic subordination despite
syntactic coordination. Linguistic Inquiry 28 (2): 195–217.
Culicover, Peter W., and Ray Jackendoff. 2006. The simpler syntax hypothesis.
Trends in Cognitive Sciences 10 (9): 413–418.
Haider, Hubert, and Inger Rosengren. 1998. Scrambling. Sprache und Pragmatik
49. Lund: University of Lund.
Heim, Irene. 1991. Artikel und Definitheit. In Semantik: Ein internationales
Handbuch der zeitgenössischen Forschung. Handbücher zur Sprach- und Kom-
munikationswissenschaft, vol. 6, edited by Arnim von Stechow and Dieter Wun-
derlich, 487–534. Berlin: Walter de Gruyter.
Hirschberg, Julia, and Mary E. Beckman. 1994. The ToBI annotation conventions.
MS. The Ohio State University.
Pasch, Renate, Ursula Brauße, Eva Breindl, and Ulrich Hermann Waßner. 2003.
Handbuch der deutschen Konnektoren: Linguistische Grundlagen der Besch-
reibung und syntaktische Merkmale der deutschen Satzverknüpfer (Konjunk-
tionen, Satzadverbien und Partikeln). Schriften des Instituts für Deutsche Sprache,
Band 9. Berlin, New York: Walter de Gruyter.
Riemsdijk, Henk van. 1978. A Case Study in Syntactic Markedness: The Binding
Nature of Prepositional Phrases. PhD diss., University of Amsterdam.
Umbach, Carla. 2004. On the notion of contrast in information structure and
discourse structure. Journal of Semantics 21 (2): 155–175.
Vicente, Luis. 2010. On the syntax of adversative coordination. Natural Lan-
guage and Linguistic Theory 28 (2): 381–415.
4 Out of Phase: Form-Meaning Mismatches in the
Prepositional Phrase
Joost Zwarts
This paper presents two cases in which the syntactic and semantic struc-
tures of a prepositional phrase (PP) do not line up. This is in line with
the relative independence of these levels of representation in the Parallel
Architecture framework of Jackendoff (2002). At the same time, these
mismatches can be analyzed as restricted lexical exceptions to the oth-
erwise rather tight correspondence between syntax and semantics in this
domain.
In the Parallel Architecture view of grammar (e.g., Jackendoff 2002),
a linguistic expression can be taken as a bundle of different types of
information, each with their own structural primitives and principles.
Take the (partial) representation of the phrase under the table in (1):
(1) Phonology: (ʌn)(dər)(ðə)(teɪ)(bəl)
Syntax: [PP P [NP D N ]]
Semantics: UNDER (THE (TABLE))
There is a piece of phonology, consisting of sound segments, organized
into syllables, a syntactic structure with parts of speech, and a representa-
tion of the expression's meaning in terms of function application. Within
this bundle, parts correspond to each other, like the phonological form
(ʌn)(dər) with the syntactic category P and the semantic function
UNDER, and (ðə)(teɪ)(bəl) with [NP D N ] and THE (TABLE),
forming smaller bundles, some basic, some derived.
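The bundle in (1) can be sketched as a small data structure in which each tier combines by its own principle and no tier is derived from another. This is only an illustration under stated assumptions: the names `Bundle` and `combine` are hypothetical, and orthographic strings stand in for the syllabified phonological forms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bundle:
    """One linked triple: a piece of sound, a piece of syntax, a piece of meaning."""
    phonology: str  # orthographic stand-in for the sound string (an assumption)
    syntax: str
    semantics: str


def combine(head: Bundle, complement: Bundle, phrase_label: str) -> Bundle:
    """Build a derived bundle; each tier uses its own mode of combination."""
    return Bundle(
        phonology=f"{head.phonology} {complement.phonology}",          # linear order
        syntax=f"[{phrase_label} {head.syntax} {complement.syntax}]",  # phrase structure
        semantics=f"{head.semantics} ({complement.semantics})",        # function application
    )


# Basic bundles (lexical items), simplified from (1).
under = Bundle("under", "P", "UNDER")
the = Bundle("the", "D", "THE")
table = Bundle("table", "N", "TABLE")

# Derived bundles.
the_table = combine(the, table, "NP")
under_the_table = combine(under, the_table, "PP")

if __name__ == "__main__":
    print(under_the_table)
```

Here `the_table` and `under_the_table` are derived bundles, while `under`, `the`, and `table` are basic ones. The sketch hard-wires a one-to-one correspondence between tiers, which is exactly the default that the mismatches discussed in this chapter depart from.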
In mainstream generative grammar, especially in its current minimalist
form, the syntactic structure forms the combinatorial backbone of an
expression. Sound and meaning components are derived by mapping the
syntactic structure to a phonological and a semantic structure. The syn-
tactic representation tends to be quite rich, allowing the mappings to
sound and meaning to be as simple and direct as possible. In the Parallel
Architecture, however, all three components function as relatively inde-
pendent pieces of structure, held together by interface rules that leave
This makes explicit what is special about the use of after in (10b) in
comparison to (10a), how a modifier is treated as if it were an argument
and the ground becomes implicit (because there is nothing in the syntax
or phonology corresponding to the reference event or time R in (12b)).
The pattern in (12b) occurs in many different languages, with a variety
of temporal prepositions that describe temporal relations (Haspelmath
1997; Caha 2010). A temporal distance is expressed from the speech time
S or a reference time R, in the direction of the past or the future. The
German PP (13b), for example, locates an event one month before the
speech time S:
(13) a. einen Monat vor dem Unfall
a.ACC month before the accident
'a month before the accident'
b. vor einem Monat
before a.DAT month
'a month ago'
In German, measure phrases are usually accusative, as einen Monat 'a
month' in (13a), but when they follow the preposition, in the temporal
distance construction, they carry the dative case that is typical for the
locative use of prepositions. This constitutes fairly direct evidence that
the measure phrase in (13b) behaves as the syntactic object of the preposition
vor even though it is semantically a modifier. The two different
lexical entries of vor that figure in (13a) and (13b) are shown in (14a)
and (14b), respectively, ignoring dative case for the time being:
(14) a. vor1 Phon2
[PP P1 NP2 ]
BEFORE1 (Event2)
b. vor1 Phon2
[PP P1 NP2 ]
BEFORE1 (S) ; Amount2
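The difference between the two entries in (14) lies only in what the syntactic object is linked to in the semantics. A minimal sketch of that linking, under the assumption that an entry can be reduced to a semantic function plus a linking instruction (the names PrepEntry, np_role, and interpret are hypothetical, as are the placeholder meanings ACCIDENT and ONE-MONTH):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PrepEntry:
    form: str
    function: str                          # semantic function, e.g. BEFORE
    np_role: str                           # "argument" or "modifier"
    implicit_ground: Optional[str] = None  # filled in when the NP is a modifier


def interpret(entry: PrepEntry, np_meaning: str) -> str:
    """Link the meaning of the syntactic object into the semantics."""
    if entry.np_role == "argument":
        # (14a): the NP supplies the ground of the temporal relation.
        return f"{entry.function} ({np_meaning})"
    if entry.np_role == "modifier":
        # (14b): the ground is implicit and the NP measures the distance.
        return f"{entry.function} ({entry.implicit_ground}) ; {np_meaning}"
    raise ValueError(f"unknown linking: {entry.np_role}")


vor_a = PrepEntry("vor", "BEFORE", "argument")
vor_b = PrepEntry("vor", "BEFORE", "modifier", implicit_ground="S")

print(interpret(vor_a, "ACCIDENT"))
print(interpret(vor_b, "ONE-MONTH"))
```

Both entries share the same syntactic frame [PP P NP]; only the second leaves the ground without any syntactic or phonological counterpart.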
The English construction a month ago does not fit the pattern of (12b)
and (14b): ago can better be treated as an intransitive preposition with
an obligatory modifier, as argued by Fillmore (2002) and Coppock
(2009).
Haspelmath (1997) and Caha (2010) choose opposite strategies in
explaining away the mismatch in (13b), either pragmatically or syntactically.
For Haspelmath einem Monat is semantically the argument of vor
and for Caha it is syntactically a modifier. Haspelmath paraphrases the
German derives from was not only richer in its inventory, but it also
allowed the spatial use of cases without any prepositions, something that
can be seen in Latin. The accusative form Romam has the meaning TO
(ROME) and the ablative form Carthagine the meaning FROM (CAR-
THAGE). It is assumed that prepositions came in later in the IE lan-
guages, developing out of adverbs (see Dal [1966] for German). This
means that nouns were already carrying obligatory case markers with
elementary directional meanings and prepositions were combined with
those case-marked nouns, adding locative meanings. The accusative case
in German is closer than the preposition because it represents an older
layer and the locative preposition is outside it, grown as a newer layer
(see Vincent [1999] for this situation in Latin and Romance).
In order to allow these non-compositional combinations, the gram-
matical system has to reanalyze them as lexical units, as in (18). It
would be impossible to first build an accusative noun phrase das
Zimmer with the meaning TO (ROOM ; DEF) and then apply in with
the meaning IN in such a way that the place function gets squeezed
between TO and the ground ROOM. The only option is to take the
combination in+ACC as a lexical unit, non-compositionally associated
with the meaning TO IN.
from above, that is, hanging (Van Staden, Bowerman, and Verhelst 2006).
Now consider the following examples:
(21) a. Bob stond op zijn handen (op de tafel).
'Bob stood on his hands (on the table).'
a′. *Bob stond aan zijn handen (op de tafel).
b. Bob hing aan zijn handen (aan de dakgoot).
'Bob hung on his hands (on the gutter).'
b′. *Bob hing op zijn handen (aan de dakgoot).
The preposition op or aan used to introduce the body part of the figure
object that makes contact with the ground object (op/aan zijn handen
'on his hands') is always the same contact preposition that is used to
express the type of contact made by the figure object with the ground
object (op de tafel 'on the table', aan de dakgoot 'on the gutter'). If there
is support from below, then op is used with both body part and ground
object; if there is support from above, then aan is used with both body
part and ground object.
Suppose now that semantically the preposition on in example (20) still
applies to an implicit ground and that his head refers to the figure of the
spatial relation and not the ground. The representation of the contribu-
tion of the PP could be as given in (22):
(22) on1 his2 head3
[PP P1 [NP D2 N3 ]]
BE (HEAD3 (BOB2), ON1(Ground))
Although many aspects of this construction need further study, it
seems a potential example of a PP that involves a mismatch between
form and meaning because the syntactic object of the preposition cor-
responds to what is conceptually the figure.
A different type of mismatch is presented by doubling in the preposi-
tional phrase, which is rare in English, but common in many other lan-
guages (see, for example, Aelbrecht and Den Dikken [2013]). The FROM
function can be expressed in Dutch by a preposition van, a postposition
vandaan (with a meaningless cranberry morpheme daan), but interest-
ingly, also by a combination of the two:
(23) a. van onder de tafel
from under the table
b. onder de tafel vandaan
under the table from-DAAN
c. van onder de tafel vandaan
from under the table from-DAAN
Acknowledgments
The research for this paper was made possible by a grant from the Neth-
erlands Organization for Scientific Research (NWO), grant 360-70-340.
Parts of this paper were presented at various workshops in the past
couple of years, and I thank the audiences there for helpful comments
and questions. Urpo Nikanne and Henk Verkuyl are gratefully acknowl-
edged for their remarks on an earlier version of this paper.
References
Aelbrecht, Lobke, and Marcel den Dikken. 2013. Preposition doubling in Flemish
and its implications for the syntax of Dutch PPs. Journal of Comparative
Germanic Linguistics 16 (1): 33–68.
Caha, Pavel. 2010. The German locative-directional alternation: A peeling
account. Journal of Comparative Germanic Linguistics 13 (3): 179–223.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT
Press.
Cinque, Guglielmo, and Luigi Rizzi, eds. 2010. Mapping Spatial PPs: The
Cartography of Syntactic Structures. Vol. 6. Oxford: Oxford University Press.
Coppock, Elizabeth. 2009. The Logical and Empirical Foundations of Baker's
Paradox. PhD diss., Stanford University.
Culicover, Peter, and Ray Jackendoff. 2005. Simpler Syntax. Oxford: Oxford
University Press.
Dal, Ingerid. 1966. Kurze deutsche Syntax auf historischer Grundlage. Tübingen:
Max Niemeyer Verlag.
Out of Phase 77
Den Dikken, Marcel. 2010. On the functional structure of locative and directional
PPs. In Mapping Spatial PPs: The Cartography of Syntactic Structures, vol. 6,
edited by Guglielmo Cinque and Luigi Rizzi, 74–126. Oxford: Oxford University
Press.
Draye, Luk. 1996. The German dative. In The Dative, vol. 1, Descriptive Studies,
edited by William van Belle and Willy van Langendonck, 155–215. Amsterdam/
Philadelphia: John Benjamins.
Fillmore, Charles. 2002. Mini-grammars of some time-when expressions in
English. In Complex Sentences in Grammar and Discourse: Essays in Honor of
Sandra A. Thompson, edited by Joan Bybee and Michael Noonan, 31–59.
Amsterdam/Philadelphia: John Benjamins.
Gehrke, Berit. 2008. Ps in Motion: On the Semantics and Syntax of P Elements
and Motion Events. PhD diss., Utrecht University.
Haspelmath, Martin. 1997. From Space to Time: Temporal Adverbials in the
World's Languages. München: LINCOM Europa.
Jackendoff, Ray. 1983. Semantics and Cognition. Cambridge, MA: MIT Press.
Jackendoff, Ray. 1990. Semantic Structures. Cambridge, MA: MIT Press.
Jackendoff, Ray. 2002. Foundations of Language: Brain, Meaning, Grammar,
Evolution. Oxford: Oxford University Press.
Jackendoff, Ray. 2008. Construction after construction and its theoretical
challenges. Language 84 (1): 8–28.
Koopman, Hilda. 2000. Prepositions, postpositions, circumpositions, and particles:
The structure of Dutch PPs. In The Syntax of Specifiers and Heads, edited by
Hilda Koopman, 204–260. London: Routledge.
Lestrade, Sander. 2010. The Space of Case. PhD diss., Radboud University
Nijmegen.
Smith, Michael B. 1995. Semantic motivation vs. arbitrariness in grammar: Toward
a more general account of the dative/accusative contrast with German two-way
prepositions. In Insights in Germanic Linguistics I. Methodology in Transition,
edited by Irmengard Rauch and Gerald Carr, 293–323. Berlin/New York: Mouton
de Gruyter.
Svenonius, Peter. 2007. Adpositions, particles, and the arguments they introduce.
In Argument Structure, edited by Eric Reuland, Tanmoy Bhattacharya, and
Giorgos Spathas, 63–103. Amsterdam/Philadelphia: John Benjamins.
Talmy, Leonard. 1972. Semantic Structures in English and Atsugewi. PhD diss.,
University of California, Berkeley.
Talmy, Leonard. 2000. Toward a Cognitive Semantics. Cambridge, MA: MIT
Press.
Van Riemsdijk, Henk. 2007. Case in spatial adpositional phrases: The
dative-accusative alternation in German. In Pitar Moş: A Building with a View. Papers
in Honour of Alexandra Cornilescu, edited by Gabriela Alboiu, Larisa Avram,
Andrei Avram, and Daniela Isac, 265–283. Bucharest: Editura Universității
București.
Van Riemsdijk, Henk, and Riny Huijbregts. 2001. Location and locality. In
Progress in Grammar: Articles at the 20th Anniversary of the Comparison of
Grammatical Models Group in Tilburg, edited by Marc van Oostendorp and Elena
Anagnostopoulou, 1–23. Amsterdam: Rocquade. Reprinted in Phrasal and
Clausal Architecture: Syntactic Derivation and Interpretation. In honor of Joseph
E. Emonds, edited by Simin Karimi, Vida Samiian, and Wendy K. Wilkins,
339–364. Amsterdam: John Benjamins, 2007.
Van Staden, Miriam, Melissa Bowerman, and Mariet Verhelst. 2006. Some
properties of spatial description in Dutch. In Grammars of Space: Explorations in
Cognitive Diversity, edited by Stephen C. Levinson and David P. Wilkins, 475–511.
Cambridge: Cambridge University Press.
Verkuyl, Henk J. 1973. Temporal prepositions as quantifiers. In Generative
Grammar in Europe, edited by Ferenc Kiefer and Nicolas Ruwet, 582–615.
Dordrecht: D. Reidel.
Vincent, Nigel. 1999. The evolution of c-structure: Prepositions and PPs from
Indo-European to Romance. Linguistics 37 (6): 1111–1153.
Zwarts, Joost. 2006. Case marking direction: The accusative in German PPs. In
Proceedings of the 42nd Annual Meeting of the Chicago Linguistic Society, vol.
2, The Panels, edited by Jacqueline Bunting, Sapna Desai, Robert Peachey,
Christopher Straughn, and Zuzana Tomková, 93–107. Chicago: Chicago Linguistic
Society.
5 The Light Verbs Say and SAY
Jane Grimshaw
This paper proposes a universal schema for what I refer to as SAY verbs,
and shows how their shared syntactic and semantic properties derive
from the schema. The proposal is that SAY verbs fall into four distinct
types: the light verb say, verbs which encode SAY and discourse role, SAY-
by-means verbs, and SAY-with-attitude verbs.
The verb say is a light verb which corresponds to the abstract light
verb SAY, which is the shared semantic component of all SAY verbs.
Verbs such as ask, announce, assert, maintain, note, order, remark,
report, tell, and wonder encode aspects of the discourse role of the events
they report: asserting, ordering, questioning, and commenting, among
others. Mode verbs, which subdivide into SAY-by-means (mutter, grunt,
write) and SAY-with-attitude (bitch, gripe), encode other aspects of the
saying event by combining with an independent activity predicate.
Discourse-role verbs and mode verbs impose restrictions on their
arguments beyond those imposed by the SAY schema. The English light
verb say directly lexicalizes the SAY schema: it does not encode the properties
that distinguish among discourse-role verbs (it can be used to
report events of asserting, questioning, and commenting), nor does it
encode the properties that distinguish among mode verbs such as mutter,
grunt, and bitch. It is therefore compatible with all of the grammatical
contexts that any of the other SAY verbs occurs in.
The SAY schema proposal builds on a long-standing hypothesis origi-
nating in works like Dowty (1979), Talmy (1985), Jackendoff (1990), and
Hale and Keyser (1993) that the syntactic and semantic properties of
predicates derive from universal semantic components and the principles
governing their realization. The core characteristics of SAY verbs are
schematized in (1). SAY requires an agentive subject and a Linguistic
The schema in (1) entails that the complement of a SAY verb should be
obligatory. Discourse-role SAY verbs like those in (3) and (4) transpar-
ently fit this pattern:3
(3) *The students {said / remarked / reported / noted / maintained}.
(4) The students {said / remarked / reported / noted / maintained} that the exam was easy.
The Linguistic Material argument can correspond to a variety of
syntactic complements, including that-CPs as in (4) and wh-CPs as
in (5):
(5) The students {asked / wondered} whether the exam was easy.
Note that the latter two complement structures are not unique to SAY
verbs. Verbs like believe, discover, and feel also allow that-CPs, and verbs
such as know and find out allow wh-CPs. The examples in (6) illustrate
this point:
(6) a. The students {believed / discovered / felt} that the exam was easy.
b. The students {knew / found out} whether the exam was easy.
However, since they have Linguistic Material arguments, SAY verbs can
combine with direct quotes.4 They do so in three contexts. In the first,
the quote is in complement position.5 In the second, the quote hosts a
parenthetical quotation fragment (QF). In the final case, the quote com-
bines with a copula in a pseudo-cleft.6 In the last two configurations the
quote is identified indirectly with the verb's complement position through
an operator that is coindexed with the quote (see Grimshaw 2013). Of
the verbs in (3)–(6) only the SAY verbs can appear with quotes in any of
these configurations.
Every SAY verb appears in all three syntactic configurations, provided
that no independent principles interfere. One factor concerns the struc-
ture of pseudo-clefts, in which the DP what fronts from the complement
position of the SAY verb. Any SAY verb that does not admit a DP comple-
ment is excluded from the pseudo-cleft SAY verb contexts, while it is
allowed in the others. This is discussed in section 5.6.
The examples in (7) and (9) are well-formed because say and remark
are SAY verbs: those in (8) and (10) are not. The verbs believe, discover,
feel, know, and find out do not combine with quotes: they are not SAY
verbs.7
(7) The students {said / remarked} "Our exam was easy."
(8) *The students {believed / discovered / felt} "Our exam was easy."
(9) The students {asked / wondered} "Will our exam be easy?"
(10) *The students {knew / found out} "Will our exam be easy?"
In (11)–(14) the quote hosts a QF, which contains an embedding verb
missing its embedded clause. I use only clause-final examples here, but
the parenthetical can appear within the quote instead of clause-finally
(Grimshaw 2013). Again, only the SAY verbs are possible.
(11) "Our exam was easy," the students {said / remarked}.
(12) *"Our exam was easy," the students {believed / discovered / felt}.
(13) "Will our exam be easy?" the students {asked / wondered}.
(14) *"Will our exam be easy?" the students {knew / found out}.
In the representation of QFs, the quote is not embedded. The comple-
ment of the verb in the parenthetical is a trace, which is bound by an
operator, which in turn is identified with the quote. (The works cited in
note 5, as well as Corver and Thiersch [2001], give evidence for the pres-
ence of the chain.) Hence, indirectly, the quote provides the complement
of the embedding verb, and only a Linguistic Material complement can
license the quote. In (15), which is the representation of (11), the func-
tional projection (FP) hosting the operator is right-adjoined to the TP
which dominates the quote:
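(15) [Tree diagram: the quote TPi (... was easy) with the functional projection FP right-adjoined to it; within FP, the operator XP binds the trace XPi, which is the complement of said.]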
The pseudo-cleft evidence again shows that SAY verbs combine with
direct quotes, but the other verbs do not:
(16) What the students {said / announced} was "Our exam was easy."
(17) *What the students {believed / discovered / felt} was "Our exam was easy."
(18) What the students {asked / wondered} was "Will our exam be easy?"
(19) *What the students {knew / found out} was "Will our exam be easy?"
All of the examples of quotation in (7)–(14) and (16)–(19) are consistent
with the effects of selection, which I return to in section 5.7. (4) and
(6) show that say, remark, believe, discover, and feel are compatible with
CPs introduced by that. (5) and (7) show that ask, wonder, know, and
The properties of SAY + mode combinations follow from the SAY schema
and the mode verb, together with independent principles constraining
thematic roles and aspect. The mode verb (e.g., mutter, grunt, write, and
bitch) provides the morphological realization for the SAY + mode combi-
nation.9 It is an activity predicate:
(22) The customer {muttered / grunted / wrote / bitched} ({at / to} the manager) for a few seconds (then left).
Like the SAY verbs discussed above, SAY + mode combinations are all
verbal and project achievements, despite the activity status of the mode
component. Projections headed by SAY verbs seem to lack the internal
structure of accomplishments.
The activity verb has its own argument-taking capacities, illustrated in
(23). Here, bitch combines with an Agent, and optionally with a Goal
(introduced by to) or a Target (introduced by at):10
Agent /
(23) bitch
Goal or Target /
The mode verb combines with the SAY schema as indicated in (24). The
indexing on the arguments is carried over from (1) and (23):
Agent / i,
(24) SAY-bitch Linguistic Material / j
Goal or Target/k,
The subject of the SAY + mode verb is an argument of both components
of SAY + bitch, represented formally by the fact that it carries two indices,
and the same holds for the Goal/Target. The Linguistic Material argu-
ment is related only to the SAY component.
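The way the two argument structures combine can be sketched as a merge in which identified arguments pool their indices. This is only an illustration under stated assumptions: the function merge_schemas is hypothetical, the indices m and n for the arguments of bitch are placeholders (the indices of the mode verb are not recoverable from the text here), and the SAY schema is assumed to supply Agent/i, Linguistic Material/j, and Goal/k.

```python
def merge_schemas(say: dict, mode: dict, identified: dict) -> dict:
    """Combine SAY with a mode verb.

    `identified` maps a SAY role to the mode-verb role it is identified
    with; an identified argument carries the indices of both components.
    """
    merged = {}
    for role, index in say.items():
        mode_role = identified.get(role)
        if mode_role is None:
            merged[role] = [index]                        # SAY only
        else:
            merged[mode_role] = [index, mode[mode_role]]  # both components
    return merged


# Assumed indices: i, j, k follow (24); m and n are placeholders.
SAY = {"Agent": "i", "Linguistic Material": "j", "Goal": "k"}
BITCH = {"Agent": "m", "Goal or Target": "n"}

say_bitch = merge_schemas(
    SAY, BITCH, {"Agent": "Agent", "Goal": "Goal or Target"}
)
print(say_bitch)
```

In the result, the Agent and the Goal/Target each carry two indices, while the Linguistic Material argument carries only the index contributed by SAY.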
As for the other SAY verbs, the Linguistic Material argument of SAY +
mode verbs can be realized as a CP or a direct quote as in (25)-(28). In
(25) and (26) the quote is in complement position:
(25) The students {muttered / grunted / wrote / bitched} that the exam was too difficult.
(26) The students {muttered / grunted / wrote / bitched} "The exam was too difficult."
SAY + mode verbs occur in QFs, as in (27), combining with quotes as
indirect complements:
(27) "The exam was too difficult," the students {muttered / grunted / wrote / bitched}.
SAY-by-means verbs can also occur in pseudo-clefts:
(28) What the students {muttered / grunted / wrote} was "The exam was too difficult."
The SAY schema contributes the Linguistic Material argument, and hence
these complementation possibilities.11
The semantic complexity of SAY + mode structures is not unique. It is
well known that a manner component can form part of the representa-
tion of motion verbs, forming verbs that would be analyzed in the present
terms as GO-by-means. For recent examples of the line of research initiated
in Talmy (1985), see Zubizarreta and Oh (2007), Beavers, Levin, and
Tham (2010), and Beavers and Koontz-Garboden (2012).
Yet another case of conflation can be found in verbs indicating gestures
or facial expressions, like shrug and beam, which allows them to
occur with clausal complements. These are not SAY verbs, though, and
they do not combine with quotes.
The SAY verbs looked at so far have agentive subjects and are eventive,
but say also occurs with a subject that encodes the location of Linguistic
Material. Nouns like shelf and river cannot appear as the subject in (29)
because of their non-linguistic character.
(29) The {sign / poster / book / article} said that the park was closed.
When the subject is a Location rather than an Agent, the entire clause
is stative. It is therefore odd in the progressive, and incompatible with a
Goal argument:12
(30) ??The {sign / poster / book / article} was saying that the park was closed.
(31) ??The {sign / poster / book / article} said to the tourists that the park was closed.
Nevertheless, say continues to display the hallmarks of a SAY verb: it
combines with quotes in complement position, in QFs, and in
pseudo-clefts:
(32) The {sign / poster / book / article} says "The park is closed."
(33) a. "The park is closed," the sign says.
b. What the sign says is, "The park is closed."
The verb say thus has two variants, corresponding to two variants of the
schema. The second is in (34):
Location/i
(34) SAY
Linguistic Material/j
All SAY verbs should occur with non-agentive subjects in principle.
Whether they do or not will depend upon the demands of their discourse
role or mode. Certain discourse roles are clearly compatible with non-
agentive subjects:
(35) a. The survey asks whether people work more than 40 hours a
week.
b. The article comments that most people lie about their work
habits.
Which ones are compatible and why remains to be investigated.
The restrictions on the subject of the SAY + mode verb are compara-
tively transparent. The subject of SAY is identified with the subject
(39) They {muttered / grunted / wrote / bitched} to the instructor that the exam was too difficult.
The analysis of SAY verbs divides them into four distinct cases: say,
discourse-role verbs, SAY-by-means, and SAY-with-attitude. These distinc-
tions coincide with differences in the realization of the Linguistic Mate-
rial argument.
(43) *What the students {bitched / griped} was "Our exam was difficult."
The source of these discrepancies lies in the expression of the Linguis-
tic Material argument, which is realized by the DP what in a pseudo-cleft.
The table in (44) summarizes the generalization, based on
pseudo-clefts:
(44) Linguistic Material argument realized as a DP
means with attitude to yield verbs with the rough paraphrases 'bitch in a
whisper' or 'whisper grouchily'. No verb has a structure which encodes
'ask bitchily' or 'assert grouchily' (combining discourse role with attitude),
and no verb has a structure which encodes 'ask by whispering' or
'assert by shouting' (combining discourse role with means). The paraphrases
indicate that these are not logically impossible meanings, but
they seem to be linguistically impossible, suggesting that discourse role,
means, and attitude compete for a single position in the structure of
complex SAY verbs. This conclusion is reminiscent of the hypothesis that
manner and result components are incompatible in verb meanings.
(See Beavers and Koontz-Garboden [2012] for a recent review.)
The next case to consider is the English light verb say. Continuing to use
QFs, we can show that say is similarly indifferent to the distinction
between interrogatives and declaratives:16
(60) a. "The exam was too difficult," the students said.
b. "Will the next exam be that difficult?" the students said.
Finally we turn to the SAY-with-attitude verbs, which show a slightly dif-
ferent pattern. They are a little odd with interrogatives, as in (61b):
(61) a. "The exam was too difficult," the students {bitched / griped}.
b. ?"Will the next exam be that difficult?" the students {bitched / griped}.
The attitude that these verbs encode when they combine with clauses
is an attitude toward a state of affairs, and an interrogative complement
does not denote a state of affairs. Hence the combination in (61b) is
possible only in a context in which the current exam was regarded as too
difficult and the students are indirectly complaining about this state of
affairs. If this line of reasoning is correct, bitch and gripe combine freely
with clausal arguments, provided that the argument supplies the state of
affairs that the attitude is related to.
Under this reasoning, the only SAY verbs that exercise control over the
clausal arguments that they combine with are the discourse-role verbs.
If discourse role is the source of selection effects among SAY verbs,
selection by verbs that do not encode discourse role, that is, non-SAY
verbs, must be different in nature from the selection observed with SAY
verbs. This is the starting point of Grimshaw (2014).
5.8 Conclusion
Acknowledgements
My gratitude goes to the editors for making this volume possible, and to
Ray Jackendoff for making it necessary. I would like to thank Pranav
Anand, Veneeta Dayal, Valentine Hacquard, Florian Jaeger, Angelika
Kratzer, Julien Musolino, Sara O'Neill, Alan Prince, Ken Safir, Roger
Schwarzschild, Chung-chieh Shan, the Colloquium audience at the
Rutgers University Center for Cognitive Science, and participants in the
2013 Rutgers Syntax I course. Their input into this research has been
enormously helpful. The paper has also benefitted from the astute com-
ments of an anonymous reviewer.
Notes
1. For the sake of simplicity, I will assume that interrogative-taking verbs such
as ask, wonder, and inquire also have Goal arguments. A more refined treatment
might modify this.
2. For related studies on say and verba dicendi see Munro (1982), Lehrer (1988),
and Suñer (2000). The special status of these verbs is recognized in typological
studies, such as Dixon (2006) and Noonan (2007).
3. Other verbs (e.g., tell) allow their complements to be elided in null comple-
ment anaphora (Grimshaw 1979, Depiante 2000) but still require the presence
of their complement if there is no appropriate antecedent in the discourse. See
note 8 on the status of manner-of-speaking verbs without complements.
4. The verbs hear and read also take Linguistic Material arguments and combine
with quotes. This suggests that it is the argument itself, rather than the SAY predi-
cate, which licenses direct quotes.
5. Whether the direct quote is the actual complement of the verb is controversial.
Obviously direct quotes are not just CPs like the complements in (2), (4), and
(6). The case for their complement status is argued in Grimshaw (2013, 2014).
See also Bonami and Godard (2008), de Vries (2006).
6. Only examples where the quote follows the copula as in (16) and (18) are
given here. The quote may instead be the subject as in (i):
(i) "Our exam was easy" is what the students said.
7. For the sake of brevity, I illustrate the behavior of verbs with Linguistic Mate-
rial arguments only in configurations where the quote is their sole argument. The
point can be replicated for verbs such as ask and tell, versus convince and show,
References
Beavers, John, Beth Levin, and Shiao Wei Tham. 2010. The typology of motion
expressions revisited. Journal of Linguistics 46 (2): 331–377.
Beavers, John, and Andrew Koontz-Garboden. 2012. Manner and result in the
roots of verbal meaning. Linguistic Inquiry 43 (3): 331–369.
Bonami, Olivier, and Danièle Godard. 2008. On the syntax of direct quotation
in French. In Proceedings of the HPSG08 Conference, edited by Stefan Müller,
355–377. Stanford, CA: CSLI Publications.
Corver, Norbert, and Craig Thiersch. 2001. Remarks on parentheticals. In
Progress in Grammar: Articles at the 20th Anniversary of the Comparison of
Grammatical Models Group in Tilburg, edited by Marc van Oostendorp and
Elena Anagnostopoulou. http://www.meertens.knaw.nl/books/progressingrammar/
corver.pdf.
Depiante, Marcela Andrea. 2000. The Syntax of Deep and Surface Anaphora: A
Study of Null Complement Anaphora and Stripping/Bare Argument Ellipsis.
PhD diss., University of Connecticut.
Dixon, Robert M. W. 2006. Complement clauses and complementation strategies
in typological perspective. In Complementation: A Cross-linguistic Typology,
edited by Robert M. W. Dixon and Alexandra Y. Aikhenvald, 1–48. Oxford:
Oxford University Press.
Dowty, David. 1979. Word Meaning and Montague Grammar. Dordrecht:
D. Reidel.
Grimshaw, Jane. 1979. Complement selection and the lexicon. Linguistic Inquiry
10 (2): 279–326.
Grimshaw, Jane. 2013. Quotes, subordination, and parentheticals. MS,
Department of Linguistics, Rutgers University.
Grimshaw, Jane. 2014. Direct quotes and sentential complementation. MS,
Department of Linguistics, Rutgers University.
Hale, Kenneth, and Samuel Jay Keyser. 1993. On argument structure and the
lexical expression of syntactic relations. In The View from Building 20: Essays in
Linguistics in Honor of Sylvain Bromberger, edited by Kenneth Hale and Samuel
Jay Keyser, 53–109. Cambridge, MA: MIT Press.
Jackendoff, Ray. 1990. Semantic Structures. Cambridge, MA: MIT Press.
Labendz, Jacob. 1998. Using standard American English manner-of-speaking and
sound-emission verbs as speech verbs. Senior essay, Brandeis University.
Lehrer, Adrienne. 1988. Checklist for verbs of speaking. Acta Linguistica
Hungarica 38 (1–4): 143–161.
Munro, Pamela. 1982. On the transitivity of say verbs. In Studies in Transitivity,
Syntax and Semantics 15, edited by Paul J. Hopper and Sandra A. Thompson,
301–318. New York: Academic Press.
Noonan, Michael. 2007. Complementation. In Language Typology and Language
Description, edited by Timothy Shopen, 42–140. Cambridge, UK: Cambridge
University Press.
Suñer, Margarita. 2000. The syntax of direct quotes with special reference to
Spanish and English. Natural Language and Linguistic Theory 18 (3): 525–578.
Talmy, Leonard. 1985. Lexicalization patterns. In Grammatical Categories and the
Lexicon, vol. 3, edited by Tim Shopen, 57–149. Cambridge, UK: Cambridge
University Press.
Vries, Mark de. 2006. Reported direct speech in Dutch. Linguistics in the
Netherlands 23: 212–223.
Zubizarreta, Maria Luisa, and Eunjeong Oh. 2007. On the Syntactic Composition
of Manner and Motion. Cambridge, MA: MIT Press.
Zwicky, Arnold. 1971. In a manner of speaking. Linguistic Inquiry 2 (2): 223–233.
6 Cognitive Illusions: Non-Promotional Passives and
Unspecified Subject Constructions
6.1 Introduction1
Figure 6.1
The Rubin vase
102 Joan Maling and Catherine O'Connor
speakers and across time, and consequently across linguists, that we are
following here.
Irish has a form of the finite verb known as the free (form of the) verb,
or the autonomous form:5
(3) a. Tógadh suas an corpán ar bharr na haille.
raise-PST-AUT up the body on top the cliff-GEN
'The body was raised to the top of the cliff.'
(McCloskey 2007, 826, ex. 1a)
b. h-Itheadh, h-Óladh, ceoladh, . . .
eat-PST-AUT drink-PST-AUT sing-PST-AUT
'There was eating, drinking, singing, (and then the storytelling
began).'
(McCloskey 2007, 826, ex. 2c)
The autonomous form is derived by adding a distinctive suffix (-(e)adh
in the Past) to the verbal stem, one for each tense (Present, Past, Future,
Conditional Mood, Past Habitual). The autonomous inflection is derived
historically from the passive; however, as illustrated in (3b), it can be
added not only to transitive verb stems, but also to intransitive verbs.
McCloskey (2007) argues that "[d]espite its origin, and despite the fact
that it fulfills many of the same discourse functions as short passives in
English, the autonomous construction is not a passive – or not at least if
by a passive form we mean one in which the underlying object of a transitive
verb is rendered as a surface subject" (827). The internal argument
of an autonomous form derived from a transitive verb stem behaves like
any other direct object in Irish: (a) it is marked accusative rather than
nominative; (b) if it is a light pronoun, it may be postposed to clause-final
position, an option available to direct objects but not to subjects; and (c)
it may be a resumptive pronoun, also an option available to direct objects
but not to subjects in Irish (see McCloskey [2007] and references cited
there). Scholars agree that the patient is not promoted to surface subject
in the autonomous form, but some still analyze it as a passive, albeit
is expressed on the surface. Based on her study of the Polish and Ukrainian
participial -no/-to constructions and the Irish autonomous construction,
Maling (1993) selected the four syntactic properties listed in (7) to
use as diagnostics. The values given below would indicate that a given
construction is active:
(7) a. No agentive by-phrase is possible.
b. Binding of anaphors (reflexive/reciprocal) by the null argument
is possible.
c. Control of subject-oriented adjuncts by the null argument is
possible.
d. Nonagentive (unaccusative) verbs can occur in the
construction.
The underlying assumption is that a syntactically present subject argu-
ment licenses binding of lexical anaphors and control of subject-oriented
adjuncts, but blocks an agentive by-phrase. Furthermore, unaccusative
verbs should be able to occur in the construction provided that the verb
selects for a human (internal) argument. A syntactically active imper-
sonal construction with an overt grammatical subject, for example,
French on or German man, has all four of these properties; in contrast,
the canonical passive construction lacks all four properties.6
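The logic of the diagnostics in (7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names are hypothetical, the two profiles encode what the text says about impersonal actives such as French on and about canonical passives, and the label "mixed" for intermediate profiles is an assumption of the sketch, not a category proposed in the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostics:
    """Observed values for the four properties in (7) for one construction."""
    by_phrase_possible: bool      # (7a) asks whether this is False
    binds_anaphors: bool          # (7b)
    controls_adjuncts: bool       # (7c)
    allows_unaccusatives: bool    # (7d)


def active_properties(d: Diagnostics) -> int:
    """Count how many of the four diagnostics point to an active analysis."""
    return sum([
        not d.by_phrase_possible,
        d.binds_anaphors,
        d.controls_adjuncts,
        d.allows_unaccusatives,
    ])


def classify(d: Diagnostics) -> str:
    """Label only the two clear-cut profiles; anything else is 'mixed'
    (a hypothetical label introduced for this sketch)."""
    score = active_properties(d)
    if score == 4:
        return "active"
    if score == 0:
        return "passive"
    return "mixed"


# Profiles as described in the text, not new data.
IMPERSONAL_ACTIVE = Diagnostics(False, True, True, True)    # e.g., French on, German man
CANONICAL_PASSIVE = Diagnostics(True, False, False, False)

print(classify(IMPERSONAL_ACTIVE))
print(classify(CANONICAL_PASSIVE))
```

A construction that patterns with the impersonal active on some diagnostics and with the canonical passive on others would come out as "mixed" here, which is the kind of case that calls for closer investigation.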
Using this diagnostic framework, Maling and Sigurjónsdóttir (2002,
100–107) contrasted the syntactic properties of the accusative-assigning
participial -no/-to construction in Polish versus Ukrainian:
(8) a. Świątynię zbudowano w 1640 roku. (Polish)
church-F.ACC built-no in 1640 year
The church was built in 1640.
(Maling and Sigurjónsdóttir 2002, ex. 8b)
b. Cerkvu bulo zbudovano v 1640 roci. (Ukrainian)
church-F.ACC was built-no in 1640 year
The church was built in 1640.
(Sobin 1985, 653)
This contrast is puzzling, because in addition to the null subject and
non-promoted direct object, both constructions display the same verbal
morphology. Maling and Sigurjónsdóttir showed that despite their common
historical origin, and the shared morphological properties of assigning
accusative case and consequent lack of agreement, the Polish and Ukrainian
constructions are polar opposites in terms of syntactic behavior. The
Cognitive Illusions 109
Table 6.1
Syntactic properties of various constructions in Polish and Ukrainian
agentive by-phrase                       *    ok   *    ok
bound anaphors in object position        ok   *    ok   *
control of subject-oriented adjuncts     ok   *    ok   *
unaccusative (nonagentive) verbs         ok   *    ok   *
accusative rather than nominative case (if that argument does not bear
a lexical case, dative or genitive):
(9) Loks var fundið stelpuna eftir mikla leit.
finally was found-NEUT girl.the-ACC after great search
The girl was finally found after a long search. or
They finally found the girl after a long search.
This innovation is a system-internal change that is neither the result of
borrowing nor the result of any phonological change or morphological
weakening. What exactly is the nature of the change? The analysis of the
innovative construction has been the subject of lively debate in recent
years; scholars differ in their assessment of whether the NTI is a transitive
passive or an active impersonal construction.8 Everyone agrees that
the postverbal NP in the NTI is an object; the disagreement lies in what
is assumed to occupy the syntactic subject position. Under one analysis,
the NTI is a non-promotional passive resembling the Ukrainian participial
-no/-to construction (Eythórsson 2008). Under the alternative
analysis, the null subject is pro_arb, a thematic [+human] subject that can
serve as a syntactic binder; the construction is syntactically active like the
Polish counterpart (Maling and Sigurjónsdóttir 2002; Maling 1993, 2006).
Icelandic also has a productive impersonal passive of intransitive
verbs, which presents an important backdrop to the NTI. The fact that
the understood subject of an impersonal passive of an intransitive verb
can be interpreted only as a volitional agent (typically human), even if
the verb allows inanimate subjects in the active voice, surely supports
the plausibility of the pro_arb analysis for the NTI. The subject of the verb
flauta 'whistle' can be many things, including tea kettles or trains, but the
impersonal passive það var flautað 'it.EXPL was whistled' can be understood
only as describing human whistlers.9
The syntactic characteristics of the NTI have been investigated in two
nationwide surveys, the first of which was conducted in 1999–2000 and
reported in Maling and Sigurjónsdóttir (2002). A questionnaire was distributed
to 1,731 tenth graders (age 15–16) in 65 schools throughout
Iceland; this number represents 45% of the children born in Iceland in
1984. More than half of the adolescents in most parts of the country (n
= 1475) accepted sentences with an accusative definite postverbal object
like the one in (9), with a range between 51% and 69% across the various
test sentences. However, only 28% of adolescents in Inner Reykjavík
(n = 220) accepted these sentences, and very few of the adult controls
(n = 200).
A surprising and unexpected result of the survey came from the adult
controls. In spite of their disagreements about the syntactic status of the
NTI, all scholars of Icelandic considered traditional impersonal passives
of intransitive verbs to be true passives. Thus it was a surprise to discover
that about half of the adult speakers in the survey accepted two of the
diagnostics for active constructions (reflexives and subject-oriented
adjuncts) in traditional impersonal passives. An example containing a
subject-oriented adjunct is shown in (10):
(10) Það var komið skellihlæjandi í tímann.
it.EXPL was come laughing.out.loud into class
People came into class laughing out loud.
(Maling and Sigurjónsdóttir 2002, ex. 37a)
Maling and Sigurjónsdóttir pointed out that "the more subject-oriented
participles are accepted, the more simple reflexives are accepted" (126).
For adolescents, the correlation was highly significant (r = 0.433, n = 1693,
p < 0.001, 2-tailed); for adults the correlation was also highly significant
(r = 0.532, n = 199, p < 0.001, 2-tailed) (Maling and Sigurjónsdóttir 2002,
126n15). This correlation supports the suggestion that these speakers
have a syntactically active representation for the traditional so-called
impersonal passives.
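As a rough plausibility check on the reported significance levels, the correlation coefficients can be converted into t statistics. The sketch below assumes that the reported r values are Pearson coefficients, which the text does not state, and uses the standard conversion t = r * sqrt((n - 2) / (1 - r^2)); the function name is hypothetical.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """Convert a Pearson correlation r from n paired observations into the
    t statistic (n - 2 degrees of freedom) used to test r against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Figures as reported in the text (Maling and Sigurjónsdóttir 2002, 126n15).
reported = {
    "adolescents": (0.433, 1693),
    "adults": (0.532, 199),
}

for group, (r, n) in reported.items():
    t = t_from_r(r, n)
    print(f"{group}: t = {t:.2f} with {n - 2} degrees of freedom")
```

Under that assumption both statistics come out far above the two-tailed critical value for p = 0.001 at these sample sizes (roughly 3.3), which is consistent with the significance levels reported.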
In contrast, there are other speakers who allow neither reflexives nor
subject-oriented adjuncts; these judgments reflect a passive analysis.
When asked about a sentence like (10), one such speaker, a woman in
her seventies, remarked: Það vantar einhvern 'someone is missing.' Her
remark suggests that her grammar did not make available a referent for
the controller of the adjunct.
We take no position on whether the grammar of an individual speaker
can have both or only one of the representations. We simply observe that,
in the aggregate, there is evidence for both grammatical analyses among
contemporaneous speakers.
Since almost no adults accepted the NTI, there is an implicational
relation: speakers who accept the NTI with accusative objects also accept
traditional impersonal passives with reflexive verbs, but not vice versa.
Maling and Sigurjónsdóttir (2002) suggested that sentences with a reflexive
object represent the first step in the reanalysis of the past participle
in the NTI from passive to syntactically active. They interpreted the age-
related variation for reflexive impersonals as reflecting three stages in
the diachronic development of the NTI. Support for the claim that reflex-
ive impersonals are an intermediate stage in the development of what is
now called the NTI comes from the fact that the reflexive impersonal is
a relative newcomer. Eythórsson (2008, 189) observed that impersonal
passives of reflexive verbs were not found in Old Icelandic, but rather
seem to be an innovation of Modern Icelandic that is increasingly gaining
ground. A corpus search on an open-access digital library (timarit.is),
which hosts digital editions of newspapers and magazines from the 17th
century to the early 21st century, found only sporadic examples of reflex-
ive impersonal passives in the earlier periods, but after about 1890, the
number of examples increases significantly (Árnadóttir, Eythórsson, and
Sigurðsson 2011). Thus a crucial first step in the reanalysis of the impersonal
passive as a syntactically active construction in Icelandic seems to
have been the extension to reflexive predicates; this then extends to
other bound anaphors, and finally to the inclusion of (definite) object
NPs, both full NPs and personal pronouns that are morphologically
accusative, as expected in an active clause.
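The implicational relation described above can be made concrete with a small check. In this sketch the three steps are taken from the text (reflexive predicates, then other bound anaphors, then definite accusative objects), but the step labels, the function name, and the sample speakers are hypothetical illustrations, not survey data.

```python
# Steps of the proposed reanalysis, earliest first (labels are hypothetical).
SCALE = ("reflexive", "bound_anaphor", "accusative_object")


def follows_scale(accepted: set) -> bool:
    """Return True if the accepted sentence types respect the implication:
    accepting a later step requires accepting every earlier step."""
    flags = [step in accepted for step in SCALE]
    # Once a step is rejected, no later step may be accepted.
    return all(earlier or not later for earlier, later in zip(flags, flags[1:]))


# Invented speakers, for illustration only.
speakers = {
    "conservative": set(),
    "intermediate": {"reflexive"},
    "innovative": {"reflexive", "bound_anaphor", "accusative_object"},
    "unexpected": {"accusative_object"},
}

for name, accepted in speakers.items():
    print(name, follows_scale(accepted))
```

A speaker like the invented "unexpected" one, who accepts accusative objects without accepting reflexive impersonals, is the pattern the implicational relation predicts should be absent or vanishingly rare.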
The ongoing syntactic change in Modern Icelandic indicates that
native (adult) speakers do not all necessarily come to the same gram-
matical analysis of every construction; on the contrary, speakers them-
selves may come to radically different analyses of the same data. The
readily observable data underdetermines the analysis; it is only by
pushing the speaker to judge more complex, or less common (even van-
ishingly rare), sentences that we can see the empirical consequences of
choosing one syntactic representation over another.
In our view, morphology and syntax conspire in the cases discussed here
to assemble a linguistic Rubin vase. Although we have discussed only
cases where an apparently passive construction has been reanalyzed as
an impersonal active, grammatical change can occur in the opposite
direction as well (Siewierska [2010], drawing on Broadwell and Duncan
[2002]). Kaqchikel, an Eastern Mayan language of highland Guatemala,
has a variety of passive constructions, including one marked with the affix
ki. The verb in the ki-passive shows active morphology, with an active
transitive verbal suffix /-Vj/ and the 3rd plural ergative agreement marker
ki, as would be appropriate for a transitive verb with an impersonal
'they' subject.
Broadwell and Duncan (2002) argue that this verb form has evolved
into a construction with the syntactic properties of a passive. It can
co-occur with an agentive by-phrase, which can be singular or plural, and
even 1st or 2nd person. But in contrast with the Polish -no/-to construction,
the ki-passive is a promotional passive: it is the patient argument
and not the agent which has the grammatical properties of a subject,
as shown by syntactic tests including the use of subject-oriented
adverbials.
Table 6.2
Mismatch between morphology and syntax
Taken together, our exemplars reveal that every possible association
between surface morphology and syntactic behavior is attested cross-
linguistically, as shown in table 6.2. In each case that we have discussed
above, there are several potential sources of indeterminacy within the
constructional object itself. One is structural: a construction that derives
an intransitive impersonal is inherently ambiguous (Haspelmath 1990,
35; Maling and Sigurjónsdóttir 2002, 126; Blevins 2003, 481), as are
subjectless transitives. As we have seen here, the morphology may be associated
with a canonical passive historically or in other languages in the
family, while the syntax may suggest an active construction. Because of
the inherent morphosyntactic ambiguity, speakers of the same language
may construe one of these constructions in different ways, leading to
eventual change, as in the Icelandic case, and in Polish versus Ukrainian.
This leads to actual divergences in the data, even as linguists seek to
discern the unifying reality.
But as we have mentioned, it is not just speakers who diverge in their
interpretation of these ambiguous linguistic objects. In their typological
survey, Keenan and Dryer (2007) delve further into disputes among
linguists over exactly the class of objects we have discussed: those at the
border of impersonal actives (which they call "unspecified subject
constructions") and non-promotional passives. They make reference to several
long-standing debates about such cases. Wolfart (1973) analyzes the
-ikawi-suffixed construction in Cree as an unspecified subject construction,
while Dahlstrom (1991) analyzes it as a passive. Wolfart's analysis
relies on the morphology, Dahlstrom's on a syntactic test. Reportedly,
Hockett and Bloomfield sparred over the analogous construction in
Ojibwa. MacKay (1999) describes a Tepehua construction as passive, and
Notes
1. We thank the editors for the invitation to contribute to this volume in honor
of Ray Jackendoff, who has made such insightful and important contributions to
our understanding of the semantics-syntax interface. The material in this paper
is based in part on work done while the first author was serving as Director of
NSF's Linguistics Program. Any opinions, findings, and conclusions expressed in
this material are those of the authors, and do not necessarily reflect the views of
the U.S. National Science Foundation. Special thanks to Jim McCloskey for
helpful discussions of the syntactic issues surrounding the Irish autonomous
construction, and to Jane Grimshaw for help in clarifying our exposition.
2. Note that here we are not saying that constructions with an oblique agent
phrase, or by-phrase, are the most common type of passives. Siewierska and
Bakker (2012), following Corbett (2005), discuss how the by-phrase constitutes
a unique feature that excludes other analyses and thus serves to identify the
canonical passive.
3. We purposely problematize the promotion of the patient/theme argument in
this paper. Oddly, Keenan and Dryer (2007), in their impressive typological
survey of passives, do not prioritize our property (ii) as characteristic of a basic
passive. Rather, they seem to take it for granted and select the following
properties as basic: "We shall refer to passives like (1b), John was slapped, as basic
passives. What makes them distinct from other passives is (i) no agent phrase
(e.g., by Mary) is present, (ii) the main verb in its non-passive form is transitive,
and (iii) the main verb expresses an action, taking agent subjects and patient
objects. Our justification for calling such passives basic is that they are the most
widespread across the world's languages" (328).
4. The earliest attested example is from 1772, First Earl of Malmesbury, cited by
Warner (1995):
(i) I have received the speech and address of the House of Lords; probably, that
of the House of Commons was being debated when the post went out.
Visser (1963–1973, vol. IV) has a long and amusing collection of vitriolic
comments on this new usage that go almost right through to the end of the 19th
century.
5. The following abbreviations are used:
ACC      accusative (case)     M       masculine
ADV      adverbial             M.EV    multiple event
AUT      autonomous            NEUT    neuter
CAUS     causative             OBL     oblique
COMP     complementizer        PL      plural
DEM      demonstrative         PPART   past participle
EXPL     expletive             PROX    proximate
F        feminine              PST     past (tense)
GEN      genitive (case)       REFL    reflexive
HABIT    habitual              S       singular
IMPERS   impersonal            SPEC    specifier
LOG      logophoric            3       third person
6. The dichotomy is not always this clear-cut. For example, in German, impersonal
passives allow a by-phrase, but also reflexives and reciprocals. Both inherent
and noninherent reflexive predicates form impersonal passives (see Plank
[1993], and especially Schäfer [2012] for discussion); moreover, at least some
unaccusative verbs can form impersonal passives (Primus 2011). A Google
search turns up examples like Es wurde gestorben auf beiden Seiten 'it was died
on both sides.' Clearly further investigation of the lexical restrictions is needed.
For Icelandic, see Sigurðsson (1989, 322, n48) and Thráinsson (2007, 266–273).
7. See Dubinsky and Nzwanga (1994) for discussion of a transitive impersonal
construction in Lingala, a western Bantu language.
8. A good survey of the empirical facts and theoretical issues can be found in
Thráinsson (2007, 273–283).
9. The situation for German and Dutch is more nuanced (see the discussion in
Primus [2011]). For impersonal passives in Norwegian, see Maling (2006).
References
Árnadóttir, Hlíf, Thórhallur Eythórsson, and Einar Freyr Sigurðsson. 2011. The
passive of reflexive verbs in Icelandic. In Relating to Reflexives, edited by
Tania E. Strahan, special issue, Nordlyd 37: 39–97.
http://septentrio.uit.no/index.php/nordlyd/article/view/2024/1884.
Austen, Jane. 1803. Northanger Abbey. Ebook, Project Gutenberg.
http://www.gutenberg.org/files/121/121-h/121-h.htm.
Austen, Jane. 1884. Letters of Jane Austen. Edited with an introduction and critical
remarks by Edward Lord Brabourne. London: Bentley.
http://www.pemberley.com/janeinfo/brablet7.html#letter37.
Blevins, James P. 2003. Passives and impersonals. Journal of Linguistics 39 (3):
473–520.
Broadwell, George Aaron, and Lachlan Duncan. 2002. A new passive in Kaqchikel.
Linguistic Discovery 1 (2): 1–16.
Corbett, Greville. 2005. The canonical approach in typology. In Linguistic Diversity
and Language Theories, edited by Zygmunt Frajzyngier, Adam Hodges, and
David S. Rood, 25–49. Amsterdam: John Benjamins.
Dahlstrom, Amy. 1991. Plains Cree Morphosyntax. Outstanding Dissertations in
Linguistics. New York: Garland Publishing.
Dubinsky, Stanley, and Mazemba Nzwanga. 1994. A challenge to Burzio's
generalization: Impersonal transitives in western Bantu. Linguistics: An
Interdisciplinary Journal of the Language Sciences 32 (1): 47–64.
Eythórsson, Thórhallur. 2008. The New Passive in Icelandic really is a passive. In
Grammatical Change and Linguistic Theory: The Rosendal Papers, edited by
Thórhallur Eythórsson, 173–219. Amsterdam: John Benjamins.
Haspelmath, Martin. 1990. The grammaticization of passive morphology. Studies
in Language 14 (1): 25–72.
Keenan, Edward, and Matthew Dryer. 2007. Passive in the world's languages. In
Language Typology and Syntactic Description, vol. I, 2nd ed., edited by Timothy
Shopen, 325–361. Cambridge: Cambridge University Press.
Kibort, Anna. 2001. The Polish passive and impersonal in Lexical Mapping
Theory. In Proceedings of the LFG '01 Conference, University of Hong Kong,
Hong Kong, edited by Miriam Butt and Tracy Holloway King, 163–183. Stanford,
CA: CSLI Publications.
Kibort, Anna. 2004. Passive and Passive-like Constructions in English and Polish:
A Contrastive Study with Particular Reference to Impersonal Constructions. Ph.D.
diss., University of Cambridge.
MacKay, Carolyn Joyce. 1999. A Grammar of Misantla Totonac. Salt Lake City:
University of Utah Press.
Siewierska, Anna, and Dik Bakker. 2012. Passive agents: Prototypical vs. canonical
passives. In Canonical Morphology and Syntax, edited by Dunstan Brown,
Marina Chumakina, and Greville G. Corbett, 151–189. Oxford: Oxford University
Press.
Sigurðsson, Halldór Ármann. 1989. Verbal Syntax and Case in Icelandic in a
Comparative GB Approach. Ph.D. diss., Lund University. Reprint, Reykjavík:
Linguistic Institute, University of Iceland, 1992.
Sobin, Nicholas. 1985. Case assignment in Ukrainian morphological passive
constructions. Linguistic Inquiry 16 (4): 649–662.
Stenson, Nancy. 1989. Irish autonomous impersonals. Natural Language and
Linguistic Theory 7 (3): 379–406.
Thráinsson, Höskuldur. 2007. The Syntax of Icelandic. Cambridge: Cambridge
University Press.
Visser, Fredericus Theodorus. 1963–1973. An Historical Syntax of the English
Language. 4 vols. Leiden: E. J. Brill.
Warner, Anthony R. 1995. Predicting the progressive passive: Parametric change
within a lexicalist framework. Language 71 (3): 533–557.
Wolfart, H. Christoph. 1973. Plains Cree: A Grammatical Study. Transactions of
the American Philosophical Society. New Series, vol. 63, part 5. Philadelphia:
American Philosophical Society.
7 Agentive Subjects and Semantic Case in Korean
7.1 Introduction
to a physical location (unless there is evidence that the verb in (7a) can
have a different argument structure from the (same) verb in (7b), which
seems unlikely; see also Sells [2004]).
Interestingly, it is on this sort of coerced reading that an agent
subject referring to an organization triggers plural agreement on the
verb in British English, as pointed out to me by Jane Grimshaw
(pers. comm.):
(8) a. My team are/*is playing tonight.
b. The hospital are/*is going to refuse patients with no health
insurance.
The existence of locative agentive subjects is certainly not an isolated
fact about Korean. Japanese, a typologically similar language, shows loca-
tive agentive subjects as well, as exemplified in (9) (Suh 1996, 209):
(9) Kore-ni tzuite-wa yatogawa-de tzuyoi hantai-o
this-DAT regard-TOP opposition.party-LOC strong objection-ACC
simesi-teiru. (Japanese)
show-PROG
On this matter, the opposition party is raising a strong objection.
The de-marked NP yatogawa in (9) also acts as a collective noun whose
referent is a political organization.
Why is it that semantic reference to an organization or institution is
an important prerequisite for locative marking in Korean? Since an
organization or institution that a collective noun refers to typically estab-
lishes itself with a location (i.e., office, mailing address, material exis-
tence), the intrinsic meaning of location becomes an integral part of the
denotation of a collective noun. The crucial point, then, is that for an
agentive subject to receive locative case, it must have an inherent meaning
of location, and this semantic condition explains why the agent subject
of (7b), which refers to individuals and not to an organization, cannot
receive locative case.
On the other hand, the contrasts in (10) show that some qualification
is needed for the required semantic condition vis-à-vis the integrated
meaning of location.
(10) a. *Phiko-eyse cayphan-ul ikiessta.
defendant-LOC trial-ACC won
The defendant(s) won the trial.
b. Phiko-chuk-eyse cayphan-ul ikiessta.
defendant-side-LOC trial-ACC won
The defense won the trial.
124 Max Soowon Kim
Why is locative case possible in (10b) but not in (10a)? Notice the crucial
difference: in (10b) a nominal suffix with a locative meaning (e.g., -chuk,
-phyen, or -ccok 'side/part (of)') has been added to the subject. The rationale
for this seems clear. An NP like phiko-chuk 'defendant's side'
includes every member of the party referred to collectively as the defendant
(i.e., all of the accused and their lawyers), rendering it a collective
noun eligible for locative case, even if the defendant's side consists of a
singleton member (i.e., a lone defendant who represents himself or
herself without a lawyer).
We now state the requisite semantic condition in (11).
(11) Eligibility for Locative Case-Marking
For an agentive subject to receive locative case, it must have an
intrinsic meaning of location (by virtue of being a collective
noun) or an acquired meaning of location (by means of a
location-denoting nominal suffix).
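The condition in (11) can be sketched as a simple check over hyphen-segmented NPs. This is a minimal illustration under stated assumptions: the three suffixes are the ones named in the text, while the one-word list of collective nouns (hakkyo 'school', from (15a)) and the function name are hypothetical stand-ins for a real lexicon.

```python
# Location-denoting nominal suffixes named in the text.
LOCATION_SUFFIXES = ("chuk", "phyen", "ccok")

# Hypothetical mini-lexicon of collective nouns; a real one would be far larger.
COLLECTIVE_NOUNS = {"hakkyo"}


def eligible_for_locative(np: str) -> bool:
    """Apply (11): an agentive subject qualifies if it has an intrinsic
    meaning of location (collective noun) or an acquired one (a final
    location-denoting suffix). The NP is given as hyphen-separated morphemes
    without the case marker itself."""
    morphemes = np.split("-")
    head, last = morphemes[0], morphemes[-1]
    intrinsic = head in COLLECTIVE_NOUNS
    acquired = len(morphemes) > 1 and last in LOCATION_SUFFIXES
    return intrinsic or acquired


for np in ["phiko", "phiko-chuk", "John-ccok", "hakkyo", "Insoo-man"]:
    print(np, eligible_for_locative(np))
```

The sketch treats -man 'only' as an ordinary non-locational suffix, so Insoo-man comes out ineligible, in line with the ungrammatical Korean example (12c).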
The availability of location-denoting nominal suffixes, then, makes virtu-
ally any agentive subject eligible for locative case in Korean, predicting
a large linguistic corpus.
It is worth noting two related properties of Japanese and comparing
them with Korean. In Japanese, the suffix -tati, which Nakanishi and
Tomioka (2004) analyze as a group-denoting suffix rather than a mere
plural marker, behaves similarly to the location-denoting suffix in Korean,
as illustrated in (12a) (Sells 2004, exx. 20–21).
(12) a. Gakusei-tati-de/*Gakusei-de bokoo-o otozureta.
student-PL-LOC/*student-LOC alma.mater-ACC visited
A group of students visited their alma mater.
b. Taroo-dake-de bokoo-o otozureta.
Taroo-only-LOC alma.mater-ACC visited
c. *Insoo-man-eyse mokyo-lul chacassta.
Insoo-only-LOC alma.mater-ACC visited
Without the suffix -tati, the agentive subject in (12a) cannot receive locative
case. On the other hand, a subset of locative agentive subjects in
Japanese clearly differs from their Korean counterparts. As the contrasts
in (12b,c) show, adding -dake 'only' to the agentive subject makes the
Japanese example eligible for locative case, whereas that still does not
salvage the Korean counterpart. These and related issues require further
research.
I have argued that locative agent subjects have the properties of true
grammatical subjects. Should they not be true grammatical subjects, what
could be an alternative analysis? The alternative analysis (raised by a
reviewer) assumes that the true subject is nonovert, most likely a null
or elided argument NP (see S. Kim [1999] for NP ellipsis), and that for
the purposes of case marking, the nonovert subject claims the nominative.
This is sketched in (14).
(14) NP_NULL-(NOM) NP-LOC NP-ACC Verb (order irrelevant)
I will examine this null-NOM analysis and show why it cannot be a
correct analysis.
The null-NOM analysis can be shown to work well for a subset of
locative agentive subjects. Relevant examples are given below:
(15) a. Wuli hakkyo-eyse kummeytal-ul ttassta.
our school-LOC gold.medal-ACC won
Our school won the gold medal.
b. Wuli hakkyo-eyse chwukkwupwu-ka kummeytal-ul ttassta.
our school-LOC soccer.team-NOM gold.medal-ACC won
The soccer team of our school won the gold medal.
The locative NP in (15a) as an institution takes the verb's agent role and
is interpreted as indicated; it does not refer to the location of the event
(where the athletic games were played and the medals were awarded).
But in (15b) there is an (extra) nominative subject that assumes the
verb's agent role. On closer inspection, however, (15b) exemplifies a
part-whole relation whereby the part-NP (i.e., the soccer team) and the
possessor NP (i.e., our school) must share the agent role (see Maling and
Kim [1992] for the case-marking of part-whole relations). Nonetheless,
it does have the structure expected under the grammatical representa-
tion sketched in (14).
But the real problem with the alternative analysis is posed by examples
where the agent subject is an individual but still receives locative case
by virtue of the location-denoting suffix it bears. This is illustrated in (16).
(16) a. John-ccok-eyse cangki-lul ikiessta.
John-side-LOC chess-ACC won
John/John's side won the chess game.
b. pro_j-(NOM) NP_j-LOC NP-ACC Verb
c. NP_j-LOC pro_j-(NOM) NP-ACC Verb
An example like (16a) is semantically and pragmatically well-formed
even when a chess game was played by only two individuals, say, John
and Peter, and John won the game by beating Peter. According to the
null-NOM analysis sketched in (14), there must be a null subject NP that
claims the nominative and either of the structures in (16b,c) must be true.
But since the overt locative NP in (16a) and the null subject NP postu-
lated in (16b) or (16c) must refer to the same individual (i.e., John), the
two NPs must co-refer as indicated. Neither of the structures, however,
can survive as grammatical since (16b) is a Condition C violation and
(16c) is a Condition B violation (for Binding Theory, see Chomsky
[1981]). This means that the locative agent NP in (16a) must be the true
subject and directly take the verb's agent theta role. The conclusion, then,
is that the alternative analysis cannot be an adequate account for the
whole set of data, and therefore must be rejected in favor of the Locative
Agentive Subject Analysis proposed here, which takes an -eyse-marked
subject to be a true grammatical subject.
Notice the gap in the paradigm: the dative case for animate NPs has no
honorific form. Why is it that only case markers used for inanimate NPs
have honorific forms? Given that honorifics are used to express defer-
ence towards individuals, the link between deference and inanimacy at
first seems counterintuitive and contradictory. I argue, however, that this
is what honorifics are truly for: avoidance of direct reference. In Korean
culture, avoidance of direct eye contact during a conversation is a way
to show respect for the other party, since direct eye contact may be
interpreted as confrontation rather than attention. The linguistic coun-
terpart of this cultural aspect of indirectness is that avoidance of
making direct reference to a person also implies respect in a similar
manner, since direct personal reference may be interpreted more likely
b. LEX
(26) a. LEX/SEM
b. LEX/SEM
7.5 Conclusion
In this chapter I have shown that agentive subjects in Korean can be case
marked with a locative case, both in plain form (-eyse) and honorific form
(-kkeyse), quite generally under a proper semantic condition. Based on
the morphosyntactic properties of locative agentive subjects, I argued
that the honorific case is a locative case and its linguistic function is to
avoid direct personal reference. In light of the semantic case marking of
agentive subjects, I reexamined certain aspects of the Case-in-Tiers
theory of morphological case assignment, especially the distinction
between case shift and case overlay, and suggested that case overlay be
treated as a subset of case shift.
Notes
1. At various stages of writing this paper I benefitted from the suggestions and
comments made by the following people: Jane Grimshaw, Ray Jackendoff, Joan
Maling, Keiko Murasugi, Mamoru Saito, James Yoon, and the anonymous review-
ers. I also benefitted from the audience at the Nanzan University Syntax Work-
shop (Nagoya, July 2011), where parts of the material were presented. I am solely
responsible for any flaws and errors. Last, but not least, I am grateful to the
editors of this volume, especially Ida Toivonen, for the opportunity to participate
in the much-awaited celebration of Ray's 70th birthday. His intellectual influence
on my linguistic thinking has been enormous, and it has been my pleasure and
privilege to learn from and be around such a gifted linguist. I fondly remember
when I was housesitting his Belmont home in the summer of 1996. I had the
privilege of using his marvelous home office on the third floor. There was a small
additional room that was accessible only from his office. What did I see there?
His stunning artistic and engineering talent: the whole room was railed for the
lovely locomotive Ray had built for himself! Who would have thought that while
writing a train of books, this world-class linguist was building a train and rails?
2. A topic marker (-un/-nun) must not be taken as a morphological case since it
can mark any NP, regardless of the NP's GF, theta role, or argumenthood, hence
it lies outside the morphological case system.
References
Maling, Joan, and Soowon Kim. 1992. Case assignment in the inalienable possession
construction in Korean. Journal of East Asian Linguistics 1 (1): 37–68.
Martin, Samuel. 1992. A Reference Grammar of Korean. Rutland, VT: Charles E.
Tuttle Co.
Nakanishi, Kimiko, and Satoshi Tomioka. 2004. Japanese plurals are exceptional.
Journal of East Asian Linguistics 13 (2): 113–140.
Sells, Peter. 1995. Korean and Japanese morphology from a lexical perspective.
Linguistic Inquiry 26 (2): 277–325.
Sells, Peter. 2004. Oblique case marking on core arguments. In Perspectives on
Korean Case and Case Marking, edited by Jong-Bok Kim and Byung-Soo Park,
151–182. Seoul: Thaehaksa.
Sigurðsson, Halldór Ármann. 2003. Case: Abstract vs. morphological. In New
Perspectives on Case Theory, edited by Ellen Brandner and Heike Zinsmeister,
223–268. Stanford, CA: CSLI Publications.
Suh, Cheong-Soo. 1996. Kwuke Mwunpep [Korean grammar]. Seoul: Hanyang
University Press.
Yip, Moira, Joan Maling, and Ray Jackendoff. 1987. Case in tiers. Language 63
(2): 217–250.
Yoon, James H. 2005. Non-morphological determination of nominal particle
ordering in Korean. In Clitic and Affix Combinations: Theoretical Perspectives,
edited by Lorie Heggie and Francisco Ordóñez, 239–282. Amsterdam: John
Benjamins.
Yoon, James H. 2007. Raising of major arguments in Korean and Japanese.
Natural Language and Linguistic Theory 25 (3): 615–653.
Yu-Cho, Young-mee, and Peter Sells. 1995. A lexical account of inflectional suffixes
in Korean. Journal of East Asian Linguistics 4 (2): 119–174.
Zaenen, Annie, Joan Maling, and Höskuldur Thráinsson. 1985. Case and grammatical
functions: The Icelandic passives. Natural Language and Linguistic
Theory 3 (4): 441–483.
8 Lexical Aspect and Natural Philosophy: How to Untie
Them
Henk J. Verkuyl
8.1 Introduction
Let me begin with two quotations. The first one is from Jackendoff:
[. . .] the learning of language isn't just a passive soaking up of information
from the environment. Rather, language learners actively construct unconscious
principles that permit them to make sense of the information coming from the
environment. (1993, 35)
For the author there is a clear distinction between an external world and
an internal one. The external one is the world of reality, nature, life, and
what these bring about; the internal world is the mental world, the world
of thoughts, giving shape to the external world in so far as this is know-
able to a human being, language teaches this in the clearest way. Then
he continues:
An action expressed by a verb is thought of as going on, as an action in progress,
or as having been done, as a completed action. An action is really the ever-
continuing transition from an action in progress to a completed action. A verb
captures an action either in the middle of this transition or at the other end, when
it has become a totally completed action.
The reason for combining the notion of lexical aspect with the notion of
natural philosophy in the title of the present chapter is for me the
observation made by Filip (2012b, 721) that the origins of our understanding
of lexical aspect lie in Aristotle's distinction of kinesis and energeia.
Both Filip (2012a,b) and Rothstein (2004), whose book is about lexical
aspect, give credit for this insight to Dowty (1979), a work that indeed
can be seen as linguistically completing the foundations for seeing aspec-
tual classes as organizing our ontological view of the world, foundations
laid around the fifties by natural language philosophers like Ryle (1949),
Vendler (1957), and Kenny (1963).
In a section titled "The development of verb classification," Dowty
observes that Aristotle distinguishes between kineseis (translated
"movements") and energiai ("actualities"), a distinction which corresponds
roughly to the distinction we shall be making between accomplishments
and activities/states (1979, 52-53).5 By this, Dowty opens the door for
an external justification of his linguistic classification by the authority of
philosophers, with Vendler in the lead. Many linguists have gone through
this door.
Let me first give the crucial quotation for understanding the Aristo-
telian distinction mentioned by Dowty:
(7) Now of these processes we should call the one type motions (kinēseis), and
the other actualizations (energeias). Every motion is incomplete – the
processes of thinning (ischnasia), learning (mathēsis), walking (badisis), building
(oikodomēsis) – these are motions (kinēseis), and incomplete (ateleis) at that.
For it is not the same thing which at the same time is walking and has walked,
or is building and has built, or is becoming and has become, or is being moved
and has been moved, but two different things; and that which is causing
motion is different from that which has caused motion. But the same thing
at the same time is seeing and has seen, is thinking and has thought. The
latter kind of process, then, is what I mean by actualization, and the former
what I mean by motion. (Metaphysics 1048b, 28-34; in Aristotle ([1933] 1961),
transl. by H. Tredennick; Greek key words as they occur in the text are added
to the translation.)
As pointed out in Ackrill (1965, 1978) and Charles (1985), the distinction
between a process aimed at a goal (kinēsis) and actualization, the situation
in which the goal has been achieved (energeia) makes them mutually
exclusive. Each process is incomplete as long as the telos has not been
reached, whereas each actualization is in itself complete. Each completed
process is an actuality (energeia), which has no goal in itself. Aristotle
made motion dependent on a force (a mover) as a precondition of
change to keep it going, all motion ultimately being reduced to the
(Unmoved) Prime Mover described in Physics, Book 8 (Aristotle 1985,
vol. 1, 418-446). As a consequence of this idea (totally absent in the
Galilean perspective), change had to be seen as always related to a telos,
a goal.
In the seventeenth century, Aristotelian natural philosophy came to
its end in the domain which today is called natural science. Physicists made
a supreme, step-by-step effort to say farewell to what nowadays is
considered at best a form of naïve physics. The Galilean vision on motion
(Galilei [1632] 1953, The Second Day) is quite different from the Aris-
totelian one in that in the former, motion is in principle eternal
(unbounded in aspectual terminology) unless there is some force bring-
ing the moving object to a stop. It should be added that Aristotle was
only interested in motion as carrier of change.
Many linguists working on aspect are attracted to the Aristotelian
kinetics, the key term for them being telicity.6 After Aristotle was put
aside as a guide in physics it took the Catholic Church nearly four
hundred years before Galileo's work finally was declared compatible
with the papal doctrine. This might explain why Aristotle remained an
ipse dixit-guide in logic and philosophy, under the label natural philoso-
phy, until far into the 20th century. In many European countries, the
humaniora were imbued with natural philosophy.7
One could, of course, argue that Aristotle's ontology has nothing to do
with the notion of motion in modern physics and that it perfectly
addresses the problem of how to account for the cognitive organization
of our dealing with the world out there. In other words, his Metaphysics
could be taken as providing a serious model for the semantics of expres-
sions in natural language in spite of severe criticisms from outside lin-
guistics, appropriately or not. I am on the side of those who hold that
one cannot exclude that the construction of unconscious principles deter-
mining our cognitive organization could turn out to be compatible with
the Aristotelian view after all. Yet the best attitude for those who take
Aristotle as an aspectual guide seems to be to remain skeptical or at least
to be prepared to follow a different track. The appeal to Aristotle's
theoretical terms such as telos, change, and motion in the aspectual literature
is amazing not only in view of the fact that his analysis of motion has
been shown to be insufficient but also because nowadays there is still a lot
of uncertainty among philosophers about the correct interpretation of
the distinctions in Metaphysics 1048.8
At any rate, it is rather difficult to escape from misleading conclusions
made on the basis of translations. For example, in (7) Aristotle uses the
Greek verb badidzein, which is translated as walk even in the case of
authoritative translations, such as Hugh Tredennick's and William David
Ross's translations of Metaphysics (Aristotle [1933] 1961, 1908) and
Ross's translation of Nicomachean Ethics in Aristotle (1985, vol. 2).
Albert Rijksbaron (pers. comm.) points out that the English walk has as
its most natural translation the verb peripatein, where peri-, of course,
already indicates that there is no telos, hence no kinēsis, as in choreuein
(dance) and aulein (play the flute). Badidzein most generally implies or
feel themselves obliged to give. The OED seems to make a clear distinc-
tion between a verb and a verb phrase, staying close to what is (or should
be) the linguistic norm: to consider a verb the kernel (head) of a predi-
cate and not a VP, let alone a predication at the S-level. In this respect
the OED does not pay lip service to the corresponding linguistic labels,
because (leaving the irrelevant things out) the verb walk is described as
in (8) and play as in (9).11
(8) walk [no obj., usu. with adverbial] move at a regular pace by lifting
and setting down each foot in turn, never having both feet off the
ground at once: I walked across the lawn | she turned and walked a
few paces.
(9) play [with obj.] produce (notes) from a musical instrument;
perform (a piece of music): they played a violin sonata.
This is exactly what is to be expected from a dictionary. In the definition
of walk there is no room for information about complements that
walk may take except in terms of some examples. This allows walk to
be the same semantic unit occurring in Mary walked in the park,
Mary walked a few paces, Mary walked to the station, and in Nobody
walked. Thus it makes no sense to take walk as pertaining to an incom-
plete action with a fixed goal. Also, we find our experiences with moving,
lifting and setting down feet and staying with one foot on the ground
back in definition (8), but the definition abstracts away from temporality
in the sense of unique actualization in real time. Verbs are atemporal
as long as they are in the lexicon, and it is only in the use of a sentence
in a particular discourse situation that actualization can play a role.
Note that, in spite of the prototyping example in (9), the definition
leaves room for all sorts of complements, due to the use of produce and
perform.
Given that temporality is to be blocked from the lexicon and given the
need to evade prototypicality, when it comes to analyzing aspectual
information, it is time to discard terms like telicity, culmination, homoge-
neity, cumulativity, and other popular terms such as DO and CAUSE in the
Aristotle-Vendler-Dowty tradition as explanatory terms. This gives room
for starting with the verb alone abstracting from the content of its argu-
ments, directed by the question: what property must a verb have to
provide the abstract structure necessary for obtaining actualization in
real time in a spoken utterance? Referring back to the separation between
PLAY and the aspectual information A in the paragraph after (9), it
seems appropriate from the methodological point of view to locate this
structure in the A-part of a verb. That is, in all verbs. So, the next question
is: what sort of structure is it that verbs have in common and that is
involved in aspectual composition?
Verkuyl (1993, chap. 13) argues that the interaction between the number
systems R and N is essential for aspectual information, where R models
our experience with continuity and density, and N our experience with
discreteness, repetition and habituality, but in both cases outside real
time. In the present section, I will follow that line, but the outcome will
be different in important respects. Given the task of analyzing verbs
without taking into account the content of their arguments, the following
list of assumptions announces itself:
a. each verb represents type structure in the sense that the time axis
T, modeling actualization in real time at the moment of speech, is
never part of the lexicon itself; therefore the denotation of each verb
in the lexicon contributes atemporal R-structure because R can be
seen as isomorphic to T;
b. certain verbs require lexically a mapping from R+ into N;
c. certain verbs require an additional mapping from N into N in order
to be able to lexically express repetition, habituality, plurality, etc.;
d. verbs not falling under (b) and (c) can occur with complements that
cause a structural shift from R+ to N;
e. the Progressive Form requires the presence of R+.
It is not possible to work this out in detail within the space allowed here.
What follows is therefore a programmatic description of the architecture
necessary for the construction of a coherent account of how a verb con-
tributes to complex aspectual information.12
The basic idea underlying the present analysis is that the tense opera-
tors pres and past map from numerical systems like R and N into the
time axis T. This idea is based on the assumption that the (mental)
lexicon has no timeline, because its essence is to abstract from actual
situations and to store our knowledge of them independently of tensed
time. A verb does not provide a token in real time, it provides type struc-
ture, and the idea is that number systems are more appropriate for
expressing type structure, and that the time axis is more suitable for
actualizing a type in real time as a unique token. This means that a
in the dark how many images of the function fco were involved in the
mapping into N. The Past tense locates the (unique) actualization of the
discrete event in real time without requiring that the range of fce be taken
as a point in time at which he died (if that is possible anyhow). In this
way, the truth conditions are clearly revealed.
The most natural way of looking at a verb meaning is that the functions
associated with it start with the origin 0, and that the ceiling function
maps to 1 in order to provide discreteness (in terms of Te Winkel [1866]:
completed action).16 However, such an assumption would not suffice
because verbs like stutter, knock, hit lexically allow for repetition, albeit
not necessarily. This repetition is not expressed by fco in R+, it belongs
clearly to N. In other words, for the correct lexical characterization of
these verbs, the output of fco is to be taken by fce so as to provide a unit
that may be repeated.
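The composition described in this paragraph can be tried out in a minimal Python sketch. It assumes that fco is simply the identity on the non-negative reals (the reading suggested by note 13) and that fce is the ordinary ceiling function; the names f_co, f_ce, and discretize are illustrative and not part of the analysis itself. The sample values are the ones used in notes 15 and 16.

```python
import math


def f_co(x: float) -> float:
    """Continuous part: assumed here to be the identity on R+ (x >= 0)."""
    if x < 0:
        raise ValueError("f_co is only defined on the non-negative reals")
    return x


def f_ce(y: float) -> int:
    """Ceiling function: maps a value in R+ to a natural number in N."""
    return math.ceil(y)


def discretize(x: float) -> int:
    """Apply f_co and then f_ce: a stretch in R+ becomes a discrete unit in N."""
    return f_ce(f_co(x))


print(discretize(3.6789))  # the image 3.6789 of f_co is mapped to 4
print(discretize(0.658))   # 1
```

The sketch only shows the shift from R+ into N; it does not model the further step, discussed in note 16, of resetting the first discrete value to 1.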
To account for this sort of higher level repetition, there is a
well-known function available: the successor function s : N → N standardly
defined as:
(13) s(n) = n + 1
Suppose that this function is also part and parcel of the information
lexically expressed by a verb on top of the shift from R into N. Then it
accounts for the unbounded repetition that may be expressed by the
verbs tick, knock, and hit. But it does not account for verbs like die and
melt and, as we shall discuss later, for sentences like The bullet hit the
target on the interpretation that the sentence is about one hit as opposed
to The bullets hit the target. This implies that if one decides to attribute
the successor function to a verb (whether lexically or structurally), there
should always be a way to block it by information outside the verb itself
to stop the function s from being applied. Thus one may think of defining
a function fs : N → N as:
(14) $f_s(x) = \begin{cases} m & \text{if } x \geq m \\ x + 1 & \text{otherwise} \end{cases}$
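A small Python sketch may help to see how the two definitions behave. It assumes that the blocked case in (14) is x ≥ m; the names successor, bounded_successor, and repetitions, as well as the sample bound m = 4, are hypothetical choices made for illustration only.

```python
from typing import List


def successor(n: int) -> int:
    """Successor function s: N -> N, as in (13)."""
    return n + 1


def bounded_successor(x: int, m: int) -> int:
    """f_s as in (14), assuming the blocked case is x >= m."""
    return m if x >= m else x + 1


def repetitions(m: int, steps: int) -> List[int]:
    """Iterate f_s starting from 1; the count stalls once it reaches m."""
    values = [1]
    for _ in range(steps):
        values.append(bounded_successor(values[-1], m))
    return values


print(successor(3))       # 4
print(repetitions(4, 6))  # [1, 2, 3, 4, 4, 4, 4]
```

With the plain successor function the sequence would grow without limit; the bound m is what stops the repetition from outside the verb.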
Applying this to sentences like Bill belched his way out of the restau-
rant and Harry moaned his way down the road discussed in Jackendoff
(1990, 211-243), this requires that the value of m be context-dependent.
In the restaurant case, the size of the restaurant as well as information
about the amount of beer drunk by Bill could bring the hearer to esti-
mate m as lying between 4 and 8. In the moaning case, it clearly depends
on the length of the road. In both cases, m can be taken as a contextually
Table 8.1: Verbs Occurring with One (External) Argument (columns: Continuous, Discrete)
Table 8.2: Verbs with an External Argument and an Internal Argument/Complement (columns: Continuous, Discrete)
Figure 8.1: The ceiling function positioned at
of fce comes from outside the lexicon and so the -position qualifies as a
place where fce can be triggered. In She played her card, the quantifica-
tional information in the internal argument her card forces fco into N by
fce. In other words, the function fce, crucial for making an action discrete,
is structurally available for receiving the complement-NP so as to yield
the VP-meaning. The place where this happens can be argued to be at
. This makes it unnecessary to go into the Noun-information itself as
in Krifka's work, so that it is not necessary to follow the process of eating
an apple in Mary ate an apple by mapping events to objects and reversely.
It suffices to have the determiner-information expressed by an available
plus the information that apple is a count noun.19
An argument in favor of figure 8.1 comes from VPs like to play matches.
The verb play itself is marked as a fco-verb, and in its intransitive use
it may pertain to something going on eternally (think of mythological
gods playing on Olympus). But in playing matches one ends up in
the same category as winning medals: one needs a way to discretize
locally, that is, per match. Due to the plural, one needs the successor
function to get a(n unbounded) sequence of playing situations. This is
only an option in the case of She played her cards. Here it should remain
underdetermined whether fce is stopped after one application or may
continue unboundedly, but the difference between She played her cards
and She played matches is not aspectual at all. That is a matter of PLAY,
not of A.
fully predictable; (iii) they are independent, free to combine with verbs
(2002, 76). These are exactly conditions compatible with the functions fco
and fce. For sentences like Bill slept away and Bill wrote on, Jackendoff
observes: These mean roughly Bill kept on V-ing, i.e. away is not direc-
tional as in run away, and on is definitely not locational (2002, 77). In
the present framework these cases are dealt with automatically in terms
of the presence or absence of the ceiling function in the -position. This
also holds for cases as discussed in Toivonen (2006). A sentence like The
children jumped on is to be analyzed as a case in which the particle on
feeds the fs ∘ fce ∘ fco function expressed by verbs like knock, hit, and jump,
with the instruction to take the second option in definition (15): the
repetition in N is unbounded.
Table 8.3
Continuity, Imparfait, and the Present Participle
8.7 Summary
In the present chapter, I have argued that in the literature about aspec-
tual classes there is an annoying imbalance in allowing too much a priori
ontology. This is due to considering the lexicon as the place to be for
having a free ride to the secrets of ontological structure. As a conse-
quence, the main aspectual difference visible in the sentences in (1) and
in the first two sentences of (3) is put aside in the domain of grammatical
aspect so that one can focus on lexical aspect, hence on aspectual classes,
hence on ontological classes. As long as one takes the position that aspect
is a complex of different semantic factors contributed by different
meaning elements in a phrase or sentence, one is bound to work more
soberly on the question of how to account for aspectual phenomena such
as continuation, compulsory repetition, unboundedness, etc., apart from
what we know about verbs. One way to sober down is to look for abstract
mathematical principles that guide our construction of complex informa-
tion. The first step is then to remove tense from the analysis and to
restrict the focus to tenseless predication. The second step is to abstract
from the content of the arguments verbs may have and to see verbs as
carrying type structure expressed by elementary functions operating on
number systems that can be assumed to underlie our temporal organiza-
tion. The combined action of the functions discussed above seems to be
sufficient for doing away with the distinction between grammatical and
lexical aspect. I am aware that the present sketch is not the whole story,
so I refer the reader to Verkuyl (forthcoming) for a more detailed
account. That paper has been revised quite drastically after handing in
the present chapter. In particular, the function fco is to be replaced by
two functions having the same format but with different restrictions, the
first one accounting for stative verbs, and the other one for nonstative
verbs. In that sense, the present chapter turns out to be the first of two
steps necessary to rescue the theory of tense and aspect from the trap of
naïve physics.
8.8 Acknowledgments
I would like to thank Emmon Bach, Robert Binnick, Olga Borik, Theo
Janssen (ILLC), Jan Luif, Pim Levelt, Remko Scha, Joost Zwarts and an
anonymous reviewer for their comments on earlier versions. I thank
Albert Rijksbaron for guiding me through thorny questions of how to
interpret Aristotle's analysis of motion without doing injustice to him,
Notes
1. The two quotations are from Te Winkel (1866, 67, 69). Translations of the quotes
are mine. Lammert A. te Winkel (1810-1868) was a prominent
nineteenth-century Dutch grammarian. Levelt's brilliant A History of Psycholinguistics
(2013) makes clear that the so-called Cognitive Revolution by many scholars
located in the United States in the decades of the fifties and sixties of the past
century has a flourishing European pre-history going back to the 18th century.
Te Winkel participated in the mid-nineteenth century linguistic discussion, being
part of that history.
2. Taking (i)-(iii) as oppositions between tense operators, this means that Mary
will leave is to be analyzed as PRES(POST(IMP(Mary leave))) and Mary will have
left as PRES(POST(PERF(Mary leave))). It should be underscored that PRES and PAST
are in fact the only operators expressing real time, whereas SYN, POST, IMP, and
PERF operate on and yield a tenseless predication. This excludes posteriority from
being identified with future because it lacks the sense of being directly related
to the point of speech. For a formalized account of the binary system, see Verkuyl
(2008).
3. Nowadays both the term aspect and its Russian equivalent vid cannot escape
from a visual interpretation. However, there are other ways of dealing with the
opposition between (3a) and (3b). In a historical analysis of the heteroclite origin
of the term aspect, De Vogüé et al. (2004, 118) show that its visual connotation
may be simply due to a dubious interpretation of the word vid. Apart from its
visual meaning connected to the Latin verb videre, the Russian vid may also
mean sort or (conceptual) subdivision, in the sense of branch. This
means that originally aspect was simply seen as a form opposition, not a semantic
one. In the early use of the term vid in the 18th and 19th century linguistic litera-
ture, this non-visual meaning was predominant. The visual metaphor crept in by
translation. For a perspicuous sketch of the history of the study of aspect and
Aktionsart, see Młynarczyk (2004, 33-67).
4. Krifka (1989, 1998) uses the term quantized for this restriction, but the two
terms have a totally different content because the notion quantized presumes
notions such as cumulativity, part structure, etc.; see Krifka (1998, 200).
5. Albert Rijksbaron (pers. comm.) points out that for classicists the identifica-
tion of activities/states with energeia is absolutely wrong even with the use of the
modifier roughly.
6. Te Winkel's distinction between Action in Progress and Completed Action
does not appeal (overtly) to Aristotle's ontology. This holds for virtually all
linguists who wrote about aspect before the second half of the last century, such as
Poutsma (1926) in his chapter on aspect and the German grammarians that I
mentioned in the first chapter of Verkuyl (1972), among them Streitberg (1889),
Herbig (1896), and Jacobsohn (1933).
7. In the early sixties, there was a huge clash in Holland between the leading
mathematical logician Evert Beth and one of the leading linguists at the time,
Anton Reichling, who was in his earlier days trained as a Jesuit priest. Elffers
(2006) describes in detail how Beth decided to discontinue the discussion about
Chomsky's Syntactic Structures with the frustrated feeling that Reichling
approached science as an Aristotelian natural philosopher. Is it accidental that
Zeno Vendler in his earlier life also was thoroughly trained as a Jesuit priest, that
Anthony Kenny was trained as a Roman Catholic priest, and that Gilbert Ryle
was a philosopher feeling himself at home in the tradition of phenomenology
(like Meinong and Heidegger)? It might explain their intimacy with Aristotle's
mental legacy and with their natural philosophical approach to the role of lan-
guage in ontological issues. A highly interesting picture of the role of the Church
in scientific research is given in Jaspers and Seuren (forthcoming).
8. A clear account of the long-standing difficulties in interpreting Aristotle's
notion of motion is Sachs (2005); see also Rijksbaron (1989).
9. It should be said that Vendler had a keen eye for translational problems (1966,
10ff.).
10. Albert Rijksbaron (pers. comm.) pointed out to me that Aristotle's analysis
of the difference between motions and actualities is restricted to the indicative;
see also Rijksbaron (2002). He explains this in terms of the dominant role of
truth in Aristotle's Metaphysics.
11. Of course, this is not the only information provided about the two verbs, but
(8) and (9) suffice to make the point at issue.
12. A more detailed account of the sketch following in section 8.5.1 is given in
Verkuyl (forthcoming).
13. In other terms, fco = {⟨x,y⟩ | y = x ∧ x ≥ 0}. What (10) does is also inherent to
the notion of continuity as used in Jackendoff (1996, 351).
14. For languages without the Pres/Past-distinction, e.g. Chinese, see Verkuyl
(2008, 162-179).
15. The floor function, applied to π, gives ⌊π⌋ = 3; applied to 0.658, ⌊0.658⌋ = 0. The ceiling function fce, applied to π, gives ⌈π⌉ = 4;
applied to 0.658, ⌈0.658⌉ = 1. The floor and ceiling functions are generally defined
as functions from R or Q to the set of integers Z, but in the present analysis we
restrict ourselves to positive integers including 0. I will continue with the ceiling
function returning to the floor function later on.
16. This would have as a consequence that if the ceiling function fce maps, say
the image 3.6789 of the function fco to 4 as the first discrete number in its range,
the number 4 will be replaced by 1. Technically, this would require an adaptation
in the definition of fce, but more importantly, conceptually it would mean that the
length of the interval created by fco is not reflected in the N-structure: the holes
between the natural numbers are made independent of the intervals between
them if they occur in R. This independence is exactly what is necessary to account
for repetition and habituality.
17. In Verkuyl (forthcoming), I explore a different route from the one sketched
here on the basis of skipping the successor function fs. The ceiling function fce is
a so-called step function which allows us to account for the repetition possibly
expressed by verbs like belch and jump in terms of a continued mapping from
R+ to N providing a sequence of similar steps. In terms of the present chapter
this amounts to allowing m to be set as m1. This alternative would lead to blur-
ring the difference between the two discrete classes in table 8.1.
18. Such a structure would even allow intransitive fcefco-verbs to have the fce
function located at the -location where the place for the NP is blocked. For
space considerations, I will not go into the technicalities here.
19. See Verkuyl (1993, 168-187) for a formal account (in the framework of the
theory of Generalized Quantification) of how to deal with quantificational infor-
mation in NPs with a count or mass noun.
20. In a binary tense system as sketched in Verkuyl (2008) this would be a
normal procedure. It runs counter to the proposal made in Verkuyl (1993, 318-327)
where PROG is considered an operator external to the tenseless predicate p:
TENSE(PROG(p)).
21. Exceptions are avoir and être. In a number of cases singular present tense
forms back out of the regularity.
22. The same appears to apply to Italian. Lenci and Bertinetto (2000) discuss
the impossibility of sentences like Gianni andava al mare con Maria (Gianni
went-IMP to the beach with Mary) occurring with adverbials like due volte (twice)
or molte volte (many times) as opposed to these sentences occurring with the
present perfect.
References
Herbig, Gustav. 1896. Aktionsart und Zeitstufe. Beiträge zur Funktionslehre des
Indogermanischen Verbums. Indogermanische Forschungen 6: 157-269.
Jackendoff, Ray. 1990. Semantic Structures. Current Studies in Linguistics Series
14. Cambridge, MA: MIT Press.
Jackendoff, Ray. 1993. Patterns in the Mind: Language and Human Nature. New
York: Harvester/Wheatsheaf.
Jackendoff, Ray. 1996. The proper treatment of measuring out, telicity, and
perhaps even quantification in English. Natural Language and Linguistic Theory
14 (2): 305-354.
Jackendoff, Ray. 2002. English particle constructions, the lexicon, and the
autonomy of syntax. In Verb-Particle Explorations, edited by Nicole Dehé, Ray
Jackendoff, Andrew McIntyre, and Silke Urban, 67-94. Berlin & New York: Mouton
de Gruyter.
Jacobsohn, Hermann. 1933. Aspektfragen. Indogermanische Forschungen 51:
292-318.
Jaspers, Dany, and Pieter A. M. Seuren. Forthcoming. The square of opposition
in Catholic hands: A chapter in the history of 20th-century logic. Logique et
Analyse.
Kenny, Anthony. 1963. Action, Emotion and Will. London: Routledge &
Kegan Paul.
Kolni-Balozky, J. [1938] 1960. A Progressive Russian Grammar. 6th ed. London:
Pitman & Sons.
Krifka, Manfred. 1989. Nominal reference, temporal constitution and quantifica-
tion in event semantics. In Semantics and Contextual Expression, Groningen-
Amsterdam Studies in Semantics 11, edited by Renate Bartsch, J. van Benthem,
and Peter van Emde Boas, 75-115. Dordrecht: Foris Publications.
Krifka, Manfred. 1998. The origins of telicity. In Events and Grammar, edited by
Susan Rothstein, 197-235. Dordrecht: Reidel.
Lenci, Alessandro, and Pier Marco Bertinetto. 2000. Aspect, adverbs, and events:
Habituality vs. perfectivity. In Speaking of Events, edited by James Higginbotham
and Fabio Pianesi, 245-287. New York: Oxford University Press.
Levelt, Willem J. M. 2013. A History of Psycholinguistics: The Pre-Chomskyan
Era. Oxford: Oxford University Press.
Młynarczyk, Anna. 2004. Aspectual Pairing in Polish. PhD diss., Utrecht
University. http://dspace.library.uu.nl/handle/1874/633.
Poutsma, Hendrik. 1926. A Grammar of Late Modern English. II. The Parts of
Speech. Groningen: Noordhoff.
Rijksbaron, Albert. 1989. Aristotle, Verb Meaning and Functional Grammar:
Towards a New Typology of States of Affairs. Amsterdam: J.C. Gieben
Publisher.
Rijksbaron, Albert. 2002. The Syntax and Semantics of the Verb in Classical
Greek. An Introduction. 3rd ed. Amsterdam: J.C. Gieben Publisher.
that dog is a word than that bank is a word. In effect, the meaning of the
preceding word cat has been activated to facilitate the processing of all
words within its underlying semantic network.
By using a cross-modal version of this technique, we were able to
follow the real-time activation of a displaced constituent during the
course of sentence comprehension. Consider the annotated sentence
This is the cat_k^1 that_k/i the young girl^2 followed (GAP)_i^3 last night in^4 the
dark, in which the subscript i shows the syntactic dependency existing
between the relative pronoun and the gap position (GAP), and the
superscripts show the locations of the visual target sitesthat is, the sites
at which the experimenter examines if the word cat has been activated
to prime the targets or probes. The subjects' task was twofold: to listen
to the sentence, and while listening, to decide if a letter string flashed on
a computer screen (the target) was a word or not. When sentences were
spoken at a normal speed, neurologically intact subjects showed faster
lexical decisions for the target dog than for the target bank at positions
1 and 3, but, crucially, not at position 2. In effect, while listening to the
sentence, subjects had activated the meaning CAT immediately after
hearing it, following which they held it in a non-active memory store, and
then reactivated it at the gap site, thus filling the gap.
Although likely capable only of coarse lexical coding, Wernicke's
aphasic patients also showed this so-called gap-filling effect. However,
Broca's aphasic patients did not (Zurif et al. 1993; Swinney et al. 1996).
This difference between the two groups' capacities bears on functional
neuroanatomy. That is, lesions yielding Broca's aphasia, which tend to be
large and somewhat variable within an imprecisely bounded left inferior
frontal region, are distinguishable from lesions yielding Wernicke's
aphasia, which emerge from damage to the left posterior superior region.
So, even though the two lesion sites are only imprecisely specified, the
classical syndromes of Broca's and Wernicke's aphasia do have
lesion-localizing value (see, e.g., Alexander, Naeser, and Palumbo 1990; Naeser
et al. 1989; Vignolo 1988). The evidence suggests, then, that gap-filling
capacity depends crucially upon an intact left anterior region,
but not upon the left posterior language region.
The Broca's patients' inability to form a syntactic link during the
course of comprehension ties into Grodzinsky's Trace Deletion Hypothesis
(1986, 1990, 2000), which characterizes their failure to understand
agentive semantically reversible sentences as an inability to represent
gaps (see also Hickok, Zurif, and Canseco-Gonzalez 1993; Avrutin 2006;
Piñango 2000). However, our subsequent work indicates that the
syntactic linkage problem that we charted for this patient group does not
reflect a limitation of linguistic knowledge. In one of the follow-up
studies of Broca's aphasic patients, we widened the temporal window in
order to probe for priming not only at the post-verbal gap position but
also at 500 ms after the gap (position 4). At this last position we observed
reactivation of the antecedent (Burkhardt, Piñango, and Wong 2003;
Burkhardt et al. 2008; Love et al. 2008). Thus, left frontal damage did not
disallow the antecedent's reactivation; rather it slowed the process such
that the syntactic linking operation was no longer formed in a timely
manner. A second study also pointed to a temporal alteration following
left frontal damage. When the rate of input was decreased by one third
(from a normal speaking rate of six syllables per second to one of four
syllables per second), Broca's patients did reliably reactivate a displaced
constituent at the gap position (Love et al. 2008).
We conclude from these findings that the failure to form syntactic
dependencies can be explained not as the result of any specific loss of
syntactic knowledge, but rather as the consequence of a disruption to
elemental processing resources that sustain the speed of lexical activation
necessary for implementing this knowledge in real time. These
resources appear to depend crucially upon the integrity of the left frontal
region associated with Broca's aphasia. Indeed, recent fMRI results
provide considerable support for this characterization of the functional
commitment of the left frontal cortex, pointing to the recruitment of this
area as a function of searching for the gap (a processing consideration)
and not as a function of licensing the gap (a representational matter)
(Piñango et al. 2009).
As described above, the reflexive formation of a relative pronoun/wh-
dependency, which fundamentally involves the creation of a syntactic
link between the wh- element and its gap, seems not to involve the left
posterior superior cortical region connected to Wernicke's aphasia. Still,
Wernicke's aphasic patients do have sentence comprehension problems,
problems that cannot always be accounted for by reference
either to single-word meanings or to other morphosyntactic factors. That is,
even when they understand the meaning of the individual words in a
structurally simple sentence, Wernicke's patients oftentimes are unable
to combine them to gain the meaning of the sentence. This seemed to us
to reflect a specifically sentence-level (i.e., compositional) semantic limi-
tation, an observation that connected in promising ways with the funda-
mental proposal of the Tripartite Parallel Architecture (Jackendoff 1997):
the possibility that not only phonology, morphology, and syntax, but also
participants are the girl and the book. The observation is that the explicit
morphosyntax of the sentence, [NP[V[NP]]], does not contain expressions
that refer to any event, let alone the beginning of one. Yet, the only
coherent interpretation demands that there be such an event. Moreover,
BEGIN is an aspectual verb that modifies temporal reference. Conse-
quently, it can be posited that the verb BEGIN (and other predicates with
similar behavior) is restricted to arguments of a temporal or eventive
nature. And herein lies the issue: the book, which denotes an entity par-
ticipant, does not directly provide an argument that satisfies such a
lexical restriction. Nevertheless, despite this verb-complement mismatch,
the sentence The girl began the book receives a coherent interpretation
with an eventive component. This indicates that a temporal/eventive
argument is supplied (or coerced) as comprehension unfolds. The mecha-
nism that accomplishes this, the analysis goes, must be semantic in nature
as the (overt) morphosyntax of the sentence does not suggest a syntactic
basis for this eventive meaning. In a manner similar to aspectual coercion,
then, complement coercion is also taken to enrich the meaning of the
sentence: that is, it introduces meaning beyond that introduced via syn-
tactic composition alone, thus providing the sentence with an accessible
interpretation.
least once a week throughout the year, and jumping for an hour need
not require jumping exhaustively during the hour, only frequently enough
during that period of time. Under this analysis, then, the interpretation
of these sentences (including those normally said to involve coercion)
does not demand the involvement of an ITER operator as the traditional
approach has it, but rather depends on the retrieval (from the conceptual
structure associated with the lexical items in the sentence) of a measure
that allows the partitioning of the specific interval along which the event
predicate distributes.
When the absolute length of the measuring interval is large in comparison
to the duration of a typical event in the predicate (for example,
a swimming practice session or a drive to the local market), the partition
measure is assumed to be correspondingly large. When the length of the
measuring interval is short in comparison to the duration of a typical
event in the predicate (for example, the duration of a jump or a sneeze),
the partition measure is correspondingly short. In both kinds of cases,
however, the source of the iteration is the same. This allows us to conclude
that iterative readings with for-adverbials do not depend on the
(a)telicity of the verb as had been previously claimed, but rather on
the interaction between knowledge of the typical duration of events, the
length of the measuring interval, and the availability of a partition
measure from the contextual conceptual structure to determine the
interval's internal structure.
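A minimal sketch may help fix the idea of a partition measure. It assumes, for illustration only, that the measuring interval is divided into consecutive cells whose width equals the partition measure, and that the event predicate distributes over the interval when every cell contains at least one event. The function names, the time units, and the numbers are hypothetical and are not part of the analysis itself.

```python
from typing import List, Sequence, Tuple


def partition_cells(interval_length: float, measure: float) -> List[Tuple[float, float]]:
    """Split [0, interval_length) into consecutive cells of width `measure`."""
    if interval_length <= 0 or measure <= 0:
        raise ValueError("interval_length and measure must be positive")
    cells = []
    start = 0.0
    while start < interval_length:
        end = min(start + measure, interval_length)
        cells.append((start, end))
        start = end
    return cells


def distributes_over(event_times: Sequence[float],
                     interval_length: float,
                     measure: float) -> bool:
    """True if every cell of the partition contains at least one event onset."""
    cells = partition_cells(interval_length, measure)
    return all(any(lo <= t < hi for t in event_times) for lo, hi in cells)


# HYPOTHETICAL example: "jump for an hour", with time counted in minutes
# and one jump every five minutes.
jump_times = list(range(2, 60, 5))

# A five-minute partition measure is satisfied by these jumps...
print(distributes_over(jump_times, interval_length=60, measure=5))
# ...whereas a one-minute measure is not: some minutes contain no jump.
print(distributes_over(jump_times, interval_length=60, measure=1))
```

The same events thus count as holding "for an hour" under one partition measure but not under a finer one, which is why the choice of measure, rather than a separate iteration operator, carries the interpretive burden in this sketch.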
Viewing the partition measure-retrieval analysis from a processing
perspective, the account of the comprehension effect reported is straight-
forward: when a for-adverbial is combined with a predicate in the process
of sentence composition, a search for a partition measure must take place.
If the predicate is atelic (e.g., SWIM in swim for an hour), the partition
measure comes at minimal cost, as the preferred interpretation can make
do with the infinitesimal partition, which is available by default. If the
predicate is telic, then the processor can still opt for the infinitesimal
partition, but in most cases this will yield an implausible interpretation
(e.g., ???sneeze for an hour, whereby only one sneeze has taken place
covering the whole hour period). When that is the case, a search through
context must take place in order for the processor to retrieve a more
plausible partition measure. It is this search that manifests itself as processing
cost. According to this analysis, then, the interpretation of sentences
such as Mary skipped/jumped for an hour entails no break in
interpretation or repair of any sort. Instead, it requires the satisfaction
of the requirements of the lexical items in the sentence. One of these
these sentences can receive multiple interpretations (one for each dimen-
sion) which, crucially, are mutually exclusive.
So, in the case of The girl began the book, upon encountering began,
the processor must exhaustively retrieve all dimension-specific functions
encoded in its lexical representation, and upon encountering the book, a
potential structured-individual under at least the spatial, informational,
and eventive dimensions, all the dimensions associated with the book are
retrieved as possible candidates for the required axis. In this situation, at
least two possible functions are viable: the eventive-dimension function
that leads to the interpretation "began event involving the book" (whereby
the girl is mapped onto the agent role whereas the book is mapped to
the patient role of the event, e.g., the girl began the book = the girl began
writing/reading/restoring the book) and the informational-dimension
function that leads to an interpretation whereby the girl is the source of
information for the segment (e.g., the girl began the book = the anecdote/
story about the girl began the book). The availability of these two interpretations
represents an ambiguity that must be resolved, as only one of
the readings can be the intended one at any given time.
As can be seen, and in contrast to the traditional account, our analysis
of aspectual verbs and of the complement-coercion effect does not
depend on the implementation and processing cost of introducing spe-
cialized entity-to-event (type-shifting) operators into the semantic rep-
resentation. Rather, our analysis tells us that the interpretation of an
aspectual verb + complement segment depends upon two processes: (1)
the exhaustive activation of all possible dimension-specific functions that
are lexically encoded in each of the predicates in the aspectual class
(which is, we claim, part of what defines the class as aspectual), and (2)
the lexically guided search through the conceptual structure associated
with the complement, seeking to determine the dimension (eventive,
informational, etc.) along which the axis (structured object) is to be
determined. In light of these observations, we further propose that it is
the combination of these two processes, exhaustive activation of functions
(at the verb) and dimension determination (at the complement),
that is the source of the cost observed in the comprehension of coercion
configuration sentences. At this point, we note that this analysis extends
to other possible configurations involving aspectual verbs. We focus on
this one because it happens to be the only experimentally studied
subclass of aspectual verb sentences.
Early support for our analysis is found in Traxler et al. (2005), who
reported that whereas previous exposure to an activity, for example,
cortex (BA7), and frontal cortex (BA6, BA24) were associated with
computation of subject + verb (aspectual vs. psychological: The boy began
vs. The boy loved), and a separate activation of subregions within the
left frontal cortex, involving BA44, BA45, BA47, BA6 (bilateral), and
marginally BA8, was associated with comprehension of the complement,
which in this case was identical for each aspectual/psychological pair
(the book... vs. the book...).
Given these data, it seems reasonable to suggest that the activation of
the region associated with Wernicke's aphasia (BA40) and some other
regions within the fronto-temporal network such as BA7 and BA6 indi-
cates their role in the exhaustive retrieval of the dimension-specific
functions encoded in the verb in anticipation of the complement. It also
seems reasonable to suggest that the functional role of the cortical area
associated with Broca's aphasia be expanded to include its participation
in some semantic operations. Still, this area remains set apart from the
other cortical regions activated in the experiment in light of its crucial
role in gap-filling. This last consideration highlights the difference
between, on the one hand, a fast-acting, impenetrable syntactic compo-
sitional process whose overriding objective for any given utterance
is to mark constituency and subcategorization displacements and, on
the other, a slower-acting, penetrable semantic compositional process
whose objective is to build the local meaning of an utterance not in isola-
tion, but inextricably embedded in the larger non-linguistic conceptual
system.
9.3 Acknowledgments
Notes
References
Alexander, Michael P., Margaret A. Naeser, and Carole Palumbo. 1990. Broca's
area aphasias: Aphasia after lesions including the frontal operculum. Neurology
40 (2): 353-362.
Avrutin, Sergey. 2006. Weak syntax. In Broca's Region, edited by Yosef Grodzinsky
and Katrin Amunts, 49-62. New York: Oxford University Press.
Burkhardt, Petra, Sergey Avrutin, María M. Piñango, and Esther Ruigendijk.
2008. Slower-than-normal syntactic processing in agrammatic Broca's aphasia:
Evidence from Dutch. Journal of Neurolinguistics 21 (2): 120-137.
Burkhardt, Petra, María M. Piñango, and Carol Wong. 2003. The role of the
anterior left hemisphere in real-time sentence comprehension: Evidence from
split intransitivity. Brain and Language 86 (1): 9-22.
Deo, Ashwini, and María M. Piñango. 2011. Quantification and context in measure
adverbs. In Proceedings of the 21st Semantics And Linguistic Theory Conference,
edited by Neil Ashton, Anca Chereches, and David Lutz, 295-312.
http://elanguage.net/journals/salt/article/view/21.295/2516.
Deo, Ashwini, María M. Piñango, Yao-Ying Lai, and Emily Foster-Hanson. 2012.
Building multiple events: The cost of context retrieval. Paper presented at the
AMLaP Conference, Riva del Garda, Italy, September 2012. Poster 224.
http://pubman.mpdl.mpg.de/pubman/item/escidoc:1563764:3/component/escidoc:1563765/Smit_huettig_monaghan_amlap2012.pdf.
Grodzinsky, Yosef. 1986. Language deficits and the theory of syntax. Brain and
Language 27 (1): 135-159.
Grodzinsky, Yosef. 1990. Theoretical Perspectives on Language Deficits.
Cambridge, MA: MIT Press.
Grodzinsky, Yosef. 2000. The neurology of syntax: Language use without Broca's
area. Behavioral and Brain Sciences 23 (1): 1-21.
Piñango, María Mercedes, Emily Finn, Cheryl Lacadie, and Todd Constable. 2009.
The Role of the Left Inferior Frontal Gyrus in Sentence Composition: Connecting
fMRI and Lesion-Based Evidence. Paper presented at the 47th Annual
Meeting of the Academy of Aphasia, Boston, MA, October 2009.
Piñango, María Mercedes, Aaron Winnick, Rashad Ullah, and Edgar Zurif. 2006.
Time-course of semantic composition: The case of aspectual coercion. Journal of
Psycholinguistic Research 35 (3): 233-244.
Piñango, María Mercedes, and Edgar Zurif. 2001. Semantic operations in aphasic
comprehension: Implications for the cortical organization of language. Brain and
Language 79 (2): 297-308.
Piñango, María Mercedes, Edgar Zurif, and Ray Jackendoff. 1999. Real-time
processing implications of enriched composition at the syntax-semantics interface.
Journal of Psycholinguistic Research 28 (4): 395-414.
Price, Cathy J. 2012. A review and synthesis of the first 20 years of PET and fMRI
studies of heard speech, spoken language, and reading. Neuroimage 62 (2):
816-847.
Pustejovsky, James. 1995. The Generative Lexicon. Cambridge, MA: MIT Press.
Shapiro, Lewis P., and Beth A. Levine. 1990. Verb processing during sentence
comprehension in aphasia. Brain and Language 38 (1): 21-47.
Shapiro, Lewis P., Edgar Zurif, and Jane Grimshaw. 1987. Sentence processing and
the mental representation of verbs. Cognition 27 (3): 219-246.
Shapiro, Lewis P., Edgar Zurif, and Jane Grimshaw. 1989. Verb representation
and sentence processing: Contextual impenetrability. Journal of Psycholinguistic
Research 18 (2): 223-243.
Swinney, David, Edgar Zurif, Penny Prather, and Tracy Love. 1996. Neurological
distribution of processing resources underlying language comprehension. Journal
of Cognitive Neuroscience 8 (2): 174-184.
Traxler, Matthew J., Brian McElree, Rihana S. Williams, and Martin J. Pickering.
2005. Context effects in coercion: Evidence from eye movements. Journal of
Memory and Language 53 (1): 1-25.
Traxler, Matthew J., Martin J. Pickering, and Brian McElree. 2002. Coercion in
sentence processing: Evidence from eye-movements and self-paced reading.
Journal of Memory and Language 47 (4): 530-547.
Utt, Jason, Alessandro Lenci, Sebastian Padó, and Alessandra Zarcone. 2013. The
curious case of metonymic verbs: A distributional characterization. In Proceedings
of the 10th International Conference on Computational Semantics (IWCS
2013). Workshop Towards a Formal Distributional Semantics.
http://aclweb.org/anthology/W/W13/W13-0604.pdf.
Vignolo, Luigi. 1988. The anatomical and pathological basis of aphasia. In
Aphasia, edited by Frank Clifford Rose, Renata Whurr, and Maria A. Wyke,
227-255. London: Whurr.
Zurif, Edgar, David Swinney, Penny Prather, Julie Solomon, and Camille Bushell.
1993. An on-line analysis of syntactic processing in Broca's and Wernicke's
aphasia. Brain and Language 45 (3): 448-464.
10 Height Matters
If two bodies collide, then the first of them collides with the second, the
second collides with the first, and they collide with each other. Surprisingly,
assenting to these mutual entailments does not imply that these
sentence forms are semantically equivalent, at least in a court of law. For
if a scooter collides with a bus then the scooter's insurance company
pays, and the reverse obtains if the bus collides with the scooter.
Although any collision must be a single event, the asymmetry of syntac-
tic structure in these cases imparts a further semantic element to the
interpretation. Of course, if the bus and the scooter collide (or the
scooter and the bus collide), that is simply a tragic accident and money
doesn't change hands. This set of syntactic structures is a striking case
whereby even a single symmetrical motion event (colliding) can be linguistically
framed so as to alter the relative prominence of the participants,
resulting in additional interpretive values of path direction and
even, as in the present case, attributions of instigation and cause. Ray
Jackendoff's career in linguistics and psychology has been materially
involved with uncovering and explicating such subtle framing properties
by means of which languages add perspective to description in rendering
the representations of events. In this essay, we focus on one such
powerful framing device that centrally influences listeners' interpretations.
This is the modulation of meaning conveyed through the relative
prominence of sentential constituents, established through height in the
syntactic structure. This structural property is a major controller of listeners'
semantic interpretations even in the face of countervailing conceptual
biases. The syntactic patternings that we will discuss, though
partly unique to English, fall within a range of parametric cross-language
variability that is sufficiently narrow so that children can use them
to recover the meanings of words. For English, as we shall see (and
to varying degrees in all languages), height in the observed (that is
188 Barbara Landau and Lila R. Gleitman
Figure 10.1
Visual capture and the interpretation of scenes. When the dog is subliminally highlighted
for 60 msec as experimental participants view this scene, they are more likely to describe
the scene as "A dog is chasing the man" than if the highlight is placed at a neutral point
or, especially, if the man is highlighted, in which case they are more likely to say "A man
is running away from the dog" (after Gleitman et al. [2007]).
terms that encode paths. This is consistent with the idea that linguistic
encoding does not exactly reflect our non-linguistic spatial representa-
tions (Landau and Jackendoff 1993). As described by Jackendoff (1983),
the major ontological type, [PATH], includes just two types of bounded
paths. Goal-paths represent paths whose endpoint is the object of the
prepositional phrase (PP, encoded in English by to plus an NP) and
Source-paths represent paths whose starting point is the object of the PP
(usually encoded by from).
Linguistic analyses support asymmetry between these two path types
on a number of grounds: goal PPs tend to be unmarked by inflectional
material in a wide range of languages, whereas source PPs tend to be
marked (Fillmore 1997; Ihara and Fujita 2000; Jackendoff 1983); goal PPs
tend to be arguments of verbs, whereas source PPs tend to be adjuncts
(despite exceptions such as English remove and empty, Nam [2004]); goal
and source PPs also distinguish themselves on other properties such as
movement and behavior in locative alternations (Nam 2004). Typological
groupings based on the collapsing of either goal or source paths with
marking of place led Nikitina (2006) to suggest that goal and source
paths are maximally distinct in universal semantic space. In addition
to the linguistic evidence for asymmetries, there is now evidence for the
prominence of goal paths over source paths in human pre-linguistic
understanding of events (Lakusta et al. 2007). The prominence of goals
prior to language learning, combined with universal prominence in syn-
tactic structures that express paths, leads naturally to the prediction that
the source-goal asymmetry should be reflected in young children's language.
It is, as we describe next. However, as in the case of the agent/
patient asymmetry, we also show that preferences for expressing the goal
path can be reversed by providing contrary linguistic information, in
this case, by the choice of source-path lexical verbs; for example, get
rather than give.
The goal bias in language has been demonstrated in several experi-
ments. Lakusta and Landau (2005) showed 3-year-olds videotaped events
in which an object or person moved from one landmark-specified loca-
tion to another, with both origin and endpoint indicated by landmarks
and visible throughout. For example, in one event, a toy bird emerged
from a bucket, moved in an arc to a glass, and came to rest in it (see
figure 10.2).
Children responded to the question "What happened?" by saying "The
bird flew to the glass," rather than "The bird flew out of the bucket" or
even "The bird flew out of the bucket and into the glass." That is, although
Figure 10.2
The goal bias in the expression of spatial events. When children or adults are shown a
motion event in which an object moves from one location (the source) to another (the
goal), they are strongly biased to express the event in terms of the goal path rather than
the source path. In the example event above, they are more likely to say "The bird flew
into the glass" than either "The bird flew out of the bucket" or "The bird flew out of the
bucket and into the glass." See text for discussion of findings (Lakusta and Landau 2005).
the physical event that was depicted afforded at least these three differ-
ent descriptions (motion from the source, to the goal, or both), children
and adults were strongly biased to describe the events in terms of motion
towards the goal. Whenever possible, they included path expressions that
encoded the goal path (with goal as argument) rather than the source
path. Lakusta and Landau found that this bias is quite general, holding
for manner-of-motion events (e.g., running, hopping) that do not have
an inherent directionality as well as for transfer events (giving, getting),
and even events involving change of state (saddening, brightening). The
generalization of the goal bias from spatial domains to non-spatial
domains accords with Gruber's (1965) observations, further articulated
by Jackendoff (1983) as the Thematic Relations Hypothesis.
The finding of a goal bias in language has now been replicated and
extended, and is robustly present in other languages that have been
investigated (e.g., Lakusta et al. 2006; Ihara and Fujita 2000) and across
both animate/intentional events and physical events (Lakusta and
Landau 2012). The goal bias is not a simple reflection of the non-linguistic
bias present in infancy, however. When children and adults are given a
non-linguistic task in which they must identify changes to either goal or
source across a sequentially-presented pair of events, they detect goal
changes more accurately, but only for animate/intentional events.
In light of the strength of the goal bias, it can be nullified or reversed
only by introducing blatantly contrary information. Lakusta and Landau
Figure 10.3
Simplified phrase structure descriptions for a sentence with the English verb meet (3a) and
the French verb se rencontrer (3b). Figure 10.3a represents an intransitive use of the verb
meet with a conjoined subject noun-phrase. This English format for meet is identical to that
for eat or walk or any ordinary intransitive, that is, it gives no indication of the reciprocal
interpretation (each other) associated with meet and other symmetrical verbs in this
construction. Rather, the symmetricality is assumed to be represented as part of the lexical
(rather than syntactic) specification for symmetrical predicates (Gleitman et al. 1996). In
contrast, as figure 10.3b shows, French and many other languages mark the reciprocal for
symmetrical predicates with a pronominal clitic (se), thus morphosyntactically (as well as
lexically) differentiating symmetrical from nonsymmetrical predications. According to
some accounts (e.g., Gleitman 1965), such a reciprocal element occurs as well in the under-
lying morphosyntactic representation of English symmetricals. In essence, under such a
syntactic rather than lexical account of the reciprocal inference structure, the underlying
syntactic tree for English meet is just like that for French se rencontrer, only in English the
reciprocal occurs on the surface as a phonetically empty item.
more complex than this. We begin with the orthodox definition of a sym-
metrical relation:
(i) $\forall x, y:\; xRy \leftrightarrow yRx$
This property is expressed in some hundreds of English language predi-
cates, including such stative relational terms as match, equal, near, and
friend, and in inherently reciprocal activity terms such as meet, argue,
and marry. For instance, if x is equal to y, so must y be equal to x, and if
John and Peter are friends then each of them stands in this relation to
the other. Because symmetrically compared entities necessarily play a
single thematic role, we would expect them to surface as sisters in a single
syntactic argument position, and so indeed they do (see figure 10.3a), as,
for example:
(1) John and Peter meet.
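The definition in (i) can be checked mechanically for any finite relation. The following is a minimal sketch, assuming a relation is represented as a set of ordered pairs; the function names and the toy data are ours and serve only as illustration.

```python
def is_symmetric(relation):
    """Return True if, for every pair (x, y) in the relation, (y, x) is also in it."""
    pairs = set(relation)
    return all((y, x) in pairs for (x, y) in pairs)


def symmetric_closure(relation):
    """Return the smallest symmetric relation containing `relation`."""
    pairs = set(relation)
    return pairs | {(y, x) for (x, y) in pairs}


# A toy "meet" relation: if John meets Peter, Peter meets John.
meet = {("John", "Peter"), ("Peter", "John")}
# A toy asymmetric relation for contrast ("chase"): the reverse pair is absent.
chase = {("dog", "man")}

print(is_symmetric(meet))
print(is_symmetric(chase))
print(is_symmetric(symmetric_closure(chase)))
```

Note that the check concerns only the truth conditions of the relation; it says nothing about the differences in prominence that arise when the two arguments occupy different syntactic positions.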
Figure 10.4
Simplified phrase structure description for John meets Peter. Here, the subject/
complement structure is asymmetric, implying some within-category distinction in promi-
nence of the compared nominals or of their roles in the predication even though the
verb itself is a symmetrical one. Notice that even with the two names (John, Peter)
used here, the positioning may imply that the complement noun-phrase expresses a more
prominent individual (as in, say, My sister met Barack Obama) or the ground in a figure-
ground perspective such that the subject noun-phrase went to meet the complement
noun-phrase.
asked to judge the semantic values of yig and zav on the basis of relative
fame, size/mobility, power, and birth order; for example, "Which is older,
the yig or the zav?" People assigned different scores as a function of the
positioning of the nonce items in the syntactic structure, with the higher
score always given to the item in complement position.
As follows from these findings, the intuition that the concepts/terms
are symmetrical is put on a more secure footing via the twin linguistic
diagnostics of plurality preference and inference-structure characteristics
(figure 10.3). Syntactically distinct placement of the compared nominal
items as subject and complement (figure 10.4) establishes their place-
ment in a conceptual hierarchy but does not alter the symmetry of the
predicate itself. Thus there is no paradox in the unequal ratings of simi-
larity as between the president comparisons in (6) and (7): these were
never the same comparison at all, and therefore the definition of sym-
metry (i) was never violated. Two entities compared on property p (say,
prominence or competence as a leader) may be very similar, but when
compared on property q (say, relative physical size or strength) may be
very different.
Armed with these findings and interpretations, we can now return to
our main focus of attention: how does the language of agents and patients
and sources and goals behave in relevant regards?
Figure 10.5
Two search situations. In one case (left panel), search for a red L among green Ls is easy
and fast; it requires a search of only one feature, color. The red L appears to pop out of
the display. In the second case (right panel), search for a red L among green Ls and red
Os is difficult; it requires a search of two features, color and shape.
If a person searches for a red L among a set of green Ls, s/he need only
use the feature red to identify the target. In such a case, one subjectively
feels that the target stimulus pops out of the display; indeed, search
time does not increase as the number of green elements increases. By
contrast, if a person searches for the same red L in a display that contains
both green Ls and red Os, s/he will need to search for both red and L
to find the target, differentiating it from green Ls and red Os. This search
feels much more effortful, and search times increase linearly with set size.
The difference between the feature and conjunction searches illustrates
that the latter is much more difficult. Although the mechanisms underly-
ing such illusory conjunctions are debatable, one theory is that active
allocation of attention must be deployed in order to accurately represent
and maintain feature conjunctions (Treisman and Gelade 1980).
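The contrast between the two search situations can be stated as a simple test on the display: a search is a feature search when at least one single feature of the target is shared by no distractor, and a conjunction search otherwise. The sketch below encodes only that logical distinction, under the assumption that each item is a (color, shape) pair; it does not model search times or attention, and the function names are hypothetical.

```python
from typing import List, Tuple

Item = Tuple[str, str]  # (color, shape)
FEATURE_NAMES = ("color", "shape")


def distinguishing_features(target: Item, distractors: List[Item]) -> List[str]:
    """List the single features on which the target differs from every distractor."""
    return [
        name
        for index, name in enumerate(FEATURE_NAMES)
        if all(d[index] != target[index] for d in distractors)
    ]


def search_type(target: Item, distractors: List[Item]) -> str:
    """Classify the display as a 'feature' or a 'conjunction' search."""
    if distinguishing_features(target, distractors):
        return "feature"
    return "conjunction"


target = ("red", "L")

# Left panel of figure 10.5: a red L among green Ls.
green_ls = [("green", "L")] * 8
print(search_type(target, green_ls), distinguishing_features(target, green_ls))

# Right panel: a red L among green Ls and red Os.
mixed = [("green", "L")] * 4 + [("red", "O")] * 4
print(search_type(target, mixed), distinguishing_features(target, mixed))
```

In the first display color alone picks out the target, whereas in the second neither color nor shape does, so the two features must be bound to find it.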
The fragility of the visual system in binding object properties under
certain conditions raises the more general question of whether language
can help resolve the potential ambiguity in the visual representation. If
the visual properties fail to bind properly, then our representation will
be indeterminate with respect to which properties go together. What is
needed is a format that establishes just a single correct representation of
the several that are possible.
In a series of experiments, Dessalegn and Landau (2008, 2013) showed
that linguistic information can indeed disambiguate the potential misas-
signment of two properties, resulting in improved memory for the right
combination of color and location in a stimulus. Their studies probed the
ability of young children to encode and then remember a simple visual
stimulus that combined color and location, specifically a square that is
split, with one red and one green half. The details of the findings show
that the effects hold only under highly specific conditions. To work, the
linguistic information must establish the choice between two possible
interpretations available to the visual system. This is accomplished by the
use of an asymmetric predicate (e.g., left, right) together with the syntac-
tic frame in which the two NPs (red, green) are situated in different
positions.
In Dessalegn and Landau's experiments, 4-year-olds were shown a
square that was split in half vertically by two colors (e.g., red on the right,
green on the left), and were told they would have to remember it. The
square then disappeared for one second, after which a display appeared
containing the original square, its reflection (e.g., red on left, green on
right), and a third square with a different geometric split (e.g., a diagonal
or horizontal split with red and green in each half; see figure 10.6 for an
example). Children were correct on only about 60% of the trials; almost
all errors were choices of the target's reflection rather than the target
itself. That is, children rarely chose the square with the different geomet-
ric split (e.g., a horizontal or diagonal split), showing that they retained
the type of split they had seen; their errors reflected fragility in remem-
bering the assignment of color to each side of the split square.
This pattern held over a number of experiments that manipulated the
context of presentation. In one, the target square was named with a novel
noun ("See this? This is a dax") in order to evaluate whether simply
naming the square could draw sufficient attention to disambiguate the
two descriptions (red left/red right). Results remained the same as
baseline. In another, children were asked to "Point to the red part" in order
to evaluate whether perceptual-motor activity might ground the child's
representation of what she saw. Results again remained the same as
baseline. These findings tentatively ruled out explanations holding that
the children simply needed to deploy more attention in order to store
and remember the correct color/location assignments.
By contrast, when children were instructed with the sentence "The red
is left/right/top/bottom of the green," their performance increased by
approximately 20 percentage points, to around 80% or better. This instruction contains
Height Matters 205
We have discussed in this essay three of the many ways that syntactic
and semantic representations interact, concentrating on a single aspect
of linguistic geometry: height in a phrase structure tree. In the first
section, we showed how anchor points in the perceptual-conceptual
domains of animacy and motion, owing to their special psychological
saliency, are prototypically assigned to higher nodes in these linguistic
representations: animates over inanimates, and motions toward over
motions from a goal. In these cases, the causal flow is from conceptual
prominence to linguistic representation, with central features capturing
the higher nodes in the configural tree. Second, we looked at the curious
case of symmetrical comparison where, one might suppose, the very
definition of symmetricality should lead us to expect that the compared
entities would appear at the same height in phrase structure trees. As we
discussed, however, though sometimes they do (as in the intransitive uses
in figure 10.3), sometimes they do not (as in the tree in figure 10.4). In
this latter case we see effects of the kinds of variable we looked at earlier,
with various influences of perceptual and semantic prominence predict-
ing which entity will surface linguistically in the subject rather than
complement position, thus higher in the tree. Third, we showed that
learners under cognitive stress (in this case, very young children trying
to distinguish spatial and hue aspects of fleetingly glimpsed symmetrical
figures) lean on asymmetrical structural information as an effective
boost to memory. This time, it is the linguistic structure that plays the
causal role, facilitating memory for the relevant aspect of the visually
perceived world.
We want to end as we began: by acknowledging our significant indebt-
edness to Ray Jackendoff, both for coaxing linguists and psychologists to
think about these interface issues and for developing the formal frame-
work that allows them to be investigated and explained. Ray: you are
very high in our personal phrase structure trees.
Notes
1. Notationally we use double quotes for utterances, italics for the mention
(rather than use) of a word or phrase, and single quotes for the concept that
the word or phrase expresses.
2. Notice that the more conceptually difficult it is to conceive of some nominal
element as an agent, the more grotesque the outcome of switching the compo-
nent noun phrases becomes, e.g., Fame/his dreams fled the man as alternatives to
The man chased fame/his dreams.
3. The interpretive contrast between the kin terms father and cousin is perhaps
the clearest indication that symmetry is a lexical-semantic rather than a syntactic
feature. When appearing in the same linguistic environments, father does not
elicit inferences of symmetry, whereas cousin (defeasibly) does.
4. Psychologists have been quick to embrace some version of the view that the
structures just discussed say something useful about the concept of similarity (see
in particular the analyses from Medin, Goldstone, and Gentner [1993] and Smith
and Heise [1992]), for once there has been at least a hint of the respects in
which entities are similar to each other, via structured syntactic representations,
then the relation of similarity itself can be rehabilitated. The relation famously
vilified (as vacuous and fickle) by Nelson Goodman (1972) can by the same
token now be viewed with the more positive descriptors dynamic and flexible.
Still, as Goodman pointed out, the rehabilitation deals with similarity only
on the streets (for practical but not theoretical purposes) because "to say that
two things are similar in having a specified property in common is to say nothing
more than that they have that property in common" (445), so the term similar is
doing no independent work despite its retitling.
5. These reassignments apply across the symmetrical class, e.g., if asked for when
they would say Red China is similar to North Korea, participants conjecture
preferences for the climate, the opportunities for surfing, etc., in regards that may
favor North Korea.
References
Ihara, Hiroko, and Ikuyo Fujita. 2000. A cognitive approach to errors in case
marking in Japanese agrammatism: The priority of goal-ni over the source-kara.
In Constructions in Cognitive Linguistics: Selected Papers from the Fifth Interna-
tional Cognitive Linguistics Conference, Amsterdam, 1997, edited by Ad Foolen
and Frederike Van der Leek, 123–140. Amsterdam: John Benjamins.
Jackendoff, Ray. 1983. Semantics and Cognition. Cambridge, MA: MIT Press.
Jespersen, Otto. 1909–1949. A Modern English Grammar on Historical Principles.
Vol. 7. Copenhagen: Munksgaard; London: Allen & Unwin.
Lakusta, Laura, and Barbara Landau. 2005. Starting at the end: The importance
of goals in spatial language. Cognition 96 (1): 1–33.
Lakusta, Laura, and Barbara Landau. 2012. Language and memory for motion
events: Origins of the asymmetry between source and goal paths. Cognitive
Science 36 (3): 517–544.
Lakusta, Laura, Laura Wagner, Kirsten O'Hearn, and Barbara Landau. 2007.
Conceptual foundations of spatial language: Evidence for a goal bias in infants.
Language Learning and Development 3 (3): 179–197.
Lakusta, Laura, Hanako Yoshida, Barbara Landau, and Linda Smith. 2006.
Cross-linguistic evidence for a goal/source asymmetry: The case of Japanese.
Poster presented at the International Conference on Infant Studies, Kyoto,
Japan, June 2006.
Landau, Barbara, and Ray Jackendoff. 1993. "What" and "where" in spatial lan-
guage and spatial cognition. Behavioral and Brain Sciences 16 (2): 217–265.
Medin, Douglas L., Robert L. Goldstone, and Dedre Gentner. 1993. Respects for
similarity. Psychological Review 100 (2): 254–278.
Miller, Carol A. 1998. It takes two to tango: Understanding and acquiring sym-
metrical verbs. Journal of Psycholinguistic Research 27 (3): 385–411.
Nam, Seungho. 2004. Goal and source: Asymmetry in their syntax and seman-
tics. Paper presented at the Workshop on Event Structure, Leipzig, Germany,
March 2004.
Nappa, Rebecca, Allison Wessel, Katherine L. McEldoon, Lila R. Gleitman, and
John C. Trueswell. 2009. Use of speaker's gaze and syntax in verb learning. Lan-
guage Learning and Development 5 (4): 203–234.
Nikitina, Tatiana. 2006. Subcategorization pattern and lexical meaning of motion
verbs: A study of the source/goal ambiguity. Linguistics 47 (5): 1113–1141.
Pinker, Steven. 1989. Learnability and Cognition: The Acquisition of Argument
Structure. Cambridge, MA: MIT Press.
Quine, Willard. 1960. Word and Object. New York: Wiley.
Rosch, Eleanor. 1975. Cognitive reference points. Cognitive Psychology 7 (4):
532–547.
Sadalla, Edward K., W. Jeffrey Burroughs, and Lorin J. Staplin. 1980. Reference
points in spatial cognition. Journal of Experimental Psychology, Human Learn-
ing, and Memory 6 (5): 516–528.
Senghas, Ann, Sotaro Kita, and Aslı Özyürek. 2004. Children creating core prop-
erties of language: Evidence from an emerging sign language in Nicaragua.
Science 305 (5691): 1779–1782.
Smith, Linda B., and Diana Heise. 1992. Perceptual similarity and conceptual
structure. In Percepts, Concepts and Categories: The Representation and Process-
ing of Information. Advances in Psychology 93, edited by Barbara Burns,
233–272. Oxford: North Holland.
Talmy, Leonard. 1983. How language structures space. In Spatial Orientation:
Theory, Research, and Application, edited by Herbert L. Pick and Linda P.
Acredolo, 225–282. New York: Plenum Press.
Treisman, Anne M., and Garry Gelade. 1980. A feature-integration theory of
attention. Cognitive Psychology 12 (1): 97–136.
Treisman, Anne, and Hilary Schmidt. 1982. Illusory conjunction in the perception
of objects. Cognitive Psychology 14 (1): 107–141.
Tversky, Amos. 1977. Features of similarity. Psychological Review 84 (4):
327–350.
Tversky, Amos, and Itamar Gati. 1978. Studies of similarity. In Cognition and
Categorization, edited by Eleanor Rosch and Barbara B. Lloyd. Hillsdale, NJ:
Erlbaum.
Woodward, Amanda L. 1998. Infants selectively encode the goal object of an
actor's reach. Cognition 69 (1): 1–34.
11 Accessibility and Linear Order in Phrasal Conjuncts
11.1 Introduction
their knowledge store to address, and then update that entry with the
New information contained in the later part of the sentence (Branigan,
McLean, and Reeve 2003, 181).
perform the same task under two different conditions. In the naming
study, the participants label pictures of pairs of objects shown on a com-
puter screen that is visible only to themselves (e.g., apple, pencil). One
of the objects in the picture is old, having been labeled in the immedi-
ately prior trial (e.g., apple), whereas the other object is new, not having
been encountered in prior trials (e.g., pencil). The confederate then finds
the picture that matches the participant's description from among a stack
of similar pictures. Based on prior research employing a similar paradigm
with adult native speakers of German (Narasimhan and Dimroth 2008),
we predict that adult English speakers are more likely to use the old-
before-new ordering within conjunct noun phrases (e.g., an apple and a
pencil) than the new-before-old ordering (e.g., a pencil and an apple).
In the naming-under-load study, a second group of participants per-
forms the identical labeling task described above, but concurrently mem-
orizes and rehearses a list of distractor words that are semantically
related to both the old and the new referents. We hypothesize that
interference from semantically related distractor words (Gordon, Hendrick,
and Levine 2002; Ferreira and Firato 2002) will make it harder for
speakers to retrieve labels for the old and the new referents in the
naming-under-load condition. A straightforward prediction is that speakers
will show an increased tendency to use the order that is easier for
them to produce when under a processing load relative to the simple
naming condition.
But what is the easier order likely to be? If accessibility leads to ease
of processing, then old referents are mentioned early in the utterance
because their mention in discourse makes them more activated and avail-
able for retrieval earlier than new referents (Bock and Irwin 1980; Fer-
reira and Yoshita 2003), freeing up working memory capacity for other
processes (Baddeley 1986; Jackendoff 2002; Just and Carpenter 1992).
But as 3- to 5-year-olds prefer the new-before-old order in phrasal con-
juncts (Narasimhan and Dimroth 2008), it may be the new-before-old
order that is easier for speakers to produce. Owing to the multiplicity of
factors that favor production of the old-before-new order (discussed
earlier), adult speakers may not exhibit the new-before-old order in
typical discourse contexts that do not tax the language processing system
in any way. But we conjecture that when speakers are placed under a
cognitive load, their processing resources are taxed in such a way that
ease-of-processing considerations become paramount during utterance
production. For instance, speakers may want to produce the new item
first because it is novel, salient, and therefore in the current focus of
218 Bhuvana Narasimhan, Cecily Jill Duffield, and Albert Kim
11.4.1 Participants
Participants were 18 native English-speaking adults (11 females), with
no history of language disorders, ranging in age from 18 to 38 years,
recruited on the University of Colorado Boulder campus. Two partici-
pants were excluded from the study, one due to showing strong influence
of a second language, and one due to equipment failure.
11.4.2 Materials
The stimulus items consisted of photographs of 24 inanimate objects. The
object names were grouped into 12 pairs. Object pairs were matched
based on the frequency of their labels in the CHILDES (Child Language
Data Exchange System) database (MacWhinney 2000) given that this
database was used to generate object labels in preparation for future
comparisons with young children. Additional matching criteria for object
pairs were based on the phonological features of object labels, ease of
labeling the objects, and the size of the real world objects that the labels
named. Three warm-up pairs and 14 filler pairs were also included (see
appendix A).
Short film clips of the two items moving in random paths across the
screen were created. Two versions of each clip were created, with the
items initially appearing in different locations on the screen in order to
avoid any spatial bias that might influence order of mention of the
objects. The stimulus items were randomized and organized in eight
conditions based on order of list presentation, version of film clip shown
(as described above), and order of stimulus presentation (item A or item
B in a pair presented first). Film clips of items were presented on a
15-inch MacBook Pro.
11.4.3 Procedure
Participants labeled a single item (e.g., a flower) shown on the computer
screen. An experimenter who could not see the screen found a matching
picture out of a set of pictures and repeated the participant's object label.
Participants then saw a clip of two items, one of which had been shown
in the immediately prior clip (e.g., a flower and a crayon), and again
labeled the objects such that the experimenter could find the matching
picture.
11.4.5 Discussion
The results of the current study replicated the results seen with the adults
in the Narasimhan and Dimroth (2008) study with German speakers.
Adults prefer to label a referent made relatively more accessible by
prior mention before a newly introduced referent. This preference may
be due to ease of processing; speakers find it easier to produce more
accessible items first. Alternatively, this word order preference may
reflect a learned convention. Perhaps participants prefer old-before-new
because it is the most frequent order to which they have been exposed,
or because they adopt an audience-design strategy; speakers may have
assumed that the old-before-new order would facilitate the confederate's
comprehension and picture-matching activity.
In order to examine whether manipulating ease of processing inde-
pendently of other factors influences linear ordering preferences, we next
employed a concurrent recall task that increases retrieval difficulty of
the labels for the old and new referents.
11.5.1 Participants
Participants were 18 native English-speaking adults (9 females) recruited
on the University of Colorado Boulder campus ranging in age from 19
to 34 years, with no history of language disorders. Two participants' data
were excluded, one due to failure to name all items presented during the
test trials, and a second due to experimenter error. None of the partici-
pants had participated in Experiment 1.
11.5.2 Materials
Stimulus items. Stimulus items were identical to those in the naming
study.
Distractor words. The materials for the concurrent verbal recall task
consisted of a list of 6 distractor words for each trial. Three words were
related to each of the two test items in the trial. Distractor items were
selected from the WordNet online database (Fellbaum 1998; Princeton
University 2010) or were semantic associates chosen by the experimenters.
Distractors were also matched for concreteness, familiarity, and image-
ability ratings from the MRC Psycholinguistic Database (Coltheart
1981). In filler trials, distractor words were randomly selected. Distractor
words were presented in an ABABAB format, such that all A-distractors
were related to one item in a pair, and all B-distractors were related to
the second item (see appendix B).
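The ABABAB presentation format can be made concrete with a short sketch. The Python fragment below is illustrative only and is not the software used in the study; the function name interleave_distractors is hypothetical. The six words are those of the idealized trial in appendix C, and their assignment to the items apple (A) and pencil (B) is an assumption made here on the basis of semantic relatedness.

```python
from itertools import chain

def interleave_distractors(a_words, b_words):
    """Return the distractors in A B A B A B order.

    a_words: distractors related to one item of the pair
    b_words: distractors related to the other item
    """
    if len(a_words) != len(b_words):
        raise ValueError("each item needs the same number of distractors")
    # zip pairs the i-th A word with the i-th B word; chain flattens the pairs
    return list(chain.from_iterable(zip(a_words, b_words)))

# Assumed grouping of the appendix C words (not stated explicitly in the text)
related_to_apple = ["ORANGE", "BANANA", "PEACH"]
related_to_pencil = ["ERASER", "RULER", "STAPLER"]

trial_list = interleave_distractors(related_to_apple, related_to_pencil)
print(trial_list)
```

With three words per item, as in the study, the resulting list has six entries and no two adjacent distractors are related to the same item.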
11.5.3 Procedure
The procedure for the distractor experiment consisted of two tasks: a
naming task and a recall task. In the recall task, participants saw six
words that they had to memorize on a computer screen. Participants
engaged in rehearsal of the distractors: they were instructed to continu-
ously repeat the distractor words aloud to aid their memorization until
they saw a screen with a question mark. Upon seeing the question mark,
participants then completed the recall task with a test: they were
instructed to recall as many of the distractor words as they could. Par-
ticipants were told that they would be rated on the number of correct
words recalled.
The naming task was concurrent with the repetition (rehearsal) stage
of the recall task: in between the presentation of the distractor words
and the question mark signaling the recall task (i.e., while participants
were repeating the distractor words aloud), participants were shown the
stimulus items for the object naming task on the computer screen: first
the old object, and then the old and new objects together. At this point,
they would name the objects (in the first instance, by labeling the old
object, and in the second, by using a phrasal conjunct labeling both the
old and new objects).
An example of the procedure is presented in appendix C.
[Bar chart. Legend: old_new, new_old. Conditions on the x-axis: Naming; Naming + Load. Bar labels: 0.90, 0.57, 0.43, 0.10.]
Figure 11.1
Mean proportions of old-before-new and new-before-old responses in the naming study
and the naming-under-load study
[Bar chart. Legend: old_new, new_old. Conditions on the x-axis: Naming; R-new (Naming + Load); R-old (Naming + Load); Else (Naming + Load). Bar labels: 0.90, 0.79, 0.65, 0.58, 0.42, 0.35, 0.21, 0.10.]
Figure 11.2
Mean proportions of old-before-new and new-before-old responses in the naming study
(first two bars to the left of the graph) and in the R-old, R-new, and Else trial groups of
the naming-under-load study. (R-old: last-produced distractor word was related to the first,
older item of the pair; R-new: last-produced distractor word was related to the newer item;
Else: the last word mentioned prior to the object labeling was a word not on the original
distractor list.) Note: Although participants reproduced distractor words that they were
instructed to memorize, sometimes they randomly produced a word that was not on the
list; such occurrences were coded as else.
Table 11.1
Effects of the Last-Produced Distractor Word and Fluency on New-Before-Old versus
Old-Before-New Responses
analysis, the only significant coefficient obtained was for distractor status.
Participants were more likely to produce old-before-new responses in
the naming study than in any of the three distractor status conditions of
the naming-plus-recall study: the R-old group (β = 3.17, SE = 0.80, Z value
= 3.97, p < 0.001), the R-new group (β = 4.45, SE = 0.83, Z value =
5.37, p < 0.001), and the Else group (β = 2.37, SE = 1.02, Z value =
2.33, p < 0.05). Thus we find a reduction in the old-before-new responses
in the naming-under-load task relative to the naming task irrespective of
the semantic relatedness of the distractor word to either new or old refer-
ent labels.
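As a consistency check on the reported statistics, each Z value should equal the coefficient estimate divided by its standard error (a Wald statistic), and the two-sided p-value then follows from the standard normal distribution. The sketch below uses only the magnitudes printed above; the sign of each coefficient is not recoverable from the text, and the helper names wald_z and two_sided_p are hypothetical.

```python
import math

# (estimate, standard error, reported Z) as printed in the text.
# Only magnitudes are used; the signs are an open question here.
reported = {
    "R-old": (3.17, 0.80, 3.97),
    "R-new": (4.45, 0.83, 5.37),
    "Else": (2.37, 1.02, 2.33),
}

def wald_z(estimate, se):
    """Wald statistic: the estimate in units of its standard error."""
    return estimate / se

def two_sided_p(z):
    """Two-sided p-value under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

for group, (estimate, se, z_reported) in reported.items():
    z = wald_z(estimate, se)
    print(f"{group}: Z = {z:.2f} (reported {z_reported}), p = {two_sided_p(z):.4f}")
```

Small discrepancies in the second decimal place are to be expected, because the printed estimates and standard errors are themselves rounded.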
11.5.5 Discussion
Our results show that speakers demonstrate an overall elimination of the
old-before-new bias under a processing load. These findings provide
empirical support for the role of speaker-oriented considerations such as
ease of processing in modulating word-order preferences. They are also
compatible with the specific hypothesis that speakers first produce new
information that has a relatively less robust representation in working
memory. Possibly, the saliency of new objects also contributes to their
activation and ease of retrieval. Furthermore, when speakers produce a
distractor that is semantically related to the new item immediately prior
to naming the objects in the test trials, the old-before-new bias is com-
pletely reversed. One possible explanation is that the related distractor
primes the new item, perhaps having an additive effect along with
saliency, increasing its activation and likelihood of being produced first.
Distractors related to the old items do not have a similar effect, a finding
that requires further research for an explanation.
While the introduction of cognitive load in our second study reduced
the preference for old-before-new responses in adults, it did not result in
a strong, across-the-board preference for new-before-old, as seen in the
young children's production even without added cognitive load (Nara-
simhan and Dimroth 2008; Dimroth and Narasimhan 2012). If the
saliency of the new object makes it the more easily accessible item, or if
the fragility of its mental representation motivates early encoding in the
utterance by speakers, why do adults not show a basic preference for
new-before-old? As discussed earlier, one possibility is the influence
of competing factors that favor the old-before-new order in adults.
For instance, adults have had more exposure than children to the
(putatively) more frequently occurring old-before-new order pattern
across different construction types over the course of their linguistic
experience. Speakers may, for example, be more likely to use, and
encounter, the old-before-new order in active declarative constructions,
which they may often hear used with the old-before-new order, than
in phrasal conjuncts (see Stephens [2010] for evidence of an
old-before-new preference in children's production of ditransitive
constructions, and Slevc [2011] for a similar preference in adults in the
absence of a cognitive load). The influence of addressee-oriented
considerations favoring the old-before-new order as a way to facilitate listener
comprehension may also play a role, albeit in an attenuated manner under
conditions in which speakers lack the cognitive resources to engage in
audience design.
An alternate explanation for the response pattern seen in the naming-
under-load study does not rely on competition between factors favoring
the old-before-new versus the new-before-old order. Rather, it accounts
for the reduction in the old-before-new bias in terms of the influence of
cognitive load on how the old information is encoded, maintained, or
retrieved. For instance, Slevc (2011) suggests that speakers' old-new
preference in the production of dative constructions is attenuated when
under a verbal processing load because of interference-based effects
from items held in memory from the concurrent recall task: "WM
[working memory] load either made it difficult to keep [old] information
sufficiently active to warrant early mention or led to increased
interference at the point of retrieving that otherwise accessible item. . . . a
plausible alternative is that the WM load interfered with the encoding of the
accessible item" (2011, 1511). Since there was no preference for the
old-before-new or new-before-old order in our second study (except in those
cases where a distractor word semantically related to a new item was
produced before a test trial), it is possible that a similar explanation can
be provided for our results. That is, it is possible that the old information
was not more robustly represented than the new information, or that it
was not even retained in memory at all. Although it is not mutually
exclusive with our account, several factors suggest that an explanation
along the lines provided by Slevc (2011) is unlikely to be the sole factor
motivating the reduction in the old-before-new bias in our second study.
First, anecdotal evidence suggests that participants are maintaining the
old vs. new distinction: participants used the definite determiner the to
label old referents (7 responses; no responses showed the definite deter-
miner used with new referents). Furthermore, in 19 excluded responses,
participants named only the new referent (there were no cases in which
participants named only the old referent and omitted the new one).
Second, producing a distractor word semantically related to a new refer-
ent label facilitates the retrieval of the new item to a greater extent than
when a distractor word related to an old referent label is produced (see
figure 11.2). This suggests that the representation of old versus new ref-
erents is distinct. Third, there is no relationship between the number of
correctly recalled distractor words and ordering preference. If impaired
memory for the old object led to the decrement in old-before-new order,
participants' ordering preferences should be influenced by differences in
their recall abilities, but this is not the case. Finally, if participants were
simply using random ordering patterns, we would expect to see a roughly
50-50 split in choice of orders at the individual level. Instead, we see a
bimodal pattern (table 11.2), where almost all the participants either
have a predominantly old-before-new preference or a predominantly
new-before-old ordering preference.
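The per-participant classification behind this observation can be sketched as follows, using the criterion given in the note to table 11.2 (60% or more of a participant's responses). The response counts below are invented for illustration and are not the study's data; the function name preferred_order is hypothetical.

```python
def preferred_order(n_old_new, n_new_old, threshold=0.60):
    """Classify one participant by their dominant ordering pattern.

    n_old_new: number of old-before-new responses
    n_new_old: number of new-before-old responses
    threshold: minimum share of responses that counts as a preference
    """
    total = n_old_new + n_new_old
    if total == 0:
        return None  # no codable responses
    if n_old_new / total >= threshold:
        return "old-before-new"
    if n_new_old / total >= threshold:
        return "new-before-old"
    return "no preference"

# Hypothetical counts (old-before-new, new-before-old) for three speakers
examples = {"P1": (10, 2), "P2": (3, 9), "P3": (6, 6)}
labels = {pid: preferred_order(*counts) for pid, counts in examples.items()}
print(labels)
```

Speakers who ordered the two labels at random would tend to cluster near an even split, whereas a bimodal sample falls mostly into the two preference categories.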
11.6. Conclusions
Table 11.2
Proportion of New-Before-Old and Old-Before-New Responses Per Participant in the
Naming-Under-Load and Naming Tasks*
*Bolded items represent participants' preferred pattern of responses (60% or more of their
responses).
Acknowledgements
Appendix A
Target labels for object pairs for each trial (filler items not shown)
Pair Item 1 Item 2
1 book chair
2 clock plate
3 flower crayon
4 cup shoe
5 key knife
6 hat egg
7 cookie bottle
8 tree bus
9 ball spoon
10 car bed
11 apple pencil
12 glass shirt
Appendix B
Target labels for object pairs and related distractors for each trial
Pair Item 1 Item 2 Distractors
Appendix C
Idealized example from a single trial in the distractor task (target items
are underlined)
(distractor words presented on the screen: ORANGE, ERASER, BANANA, RULER,
PEACH, STAPLER; Participant begins rehearsal for the recall task)
References
ings of the 25th Annual Conference of the Cognitive Science Society, edited by
Richard Alterman and David Kirsch, 180–185. Boston, MA: Psychology Press.
Clark, Eve V., and Susan E. Haviland. 1977. Comprehension and the given-new
contract. In Discourse Production and Comprehension, edited by Roy O. Freedle,
1–40. Norwood, NJ: Ablex.
Clark, Herbert H., and Deanna Wilkes-Gibbs. 1986. Referring as a collaborative
process. Cognition 22 (1): 1–39.
Clifton, Charles, Jr., and Lynn Frazier. 2004. Should given information come
before new? Yes and no. Memory and Cognition 32 (6): 886–895.
Coltheart, Max. 1981. The MRC psycholinguistic database. Quarterly Journal of
Experimental Psychology 33 (4): 497–505.
Dimroth, Christine, and Bhuvana Narasimhan. 2012. The development of linear
ordering preferences in child language: The influence of accessibility and topical-
ity. Language Acquisition 19 (4): 312–323.
Fellbaum, Christine, ed. 1998. WordNet: An Electronic Lexical Database. Cam-
bridge, MA: MIT Press.
Ferreira, Victor S., and Carla E. Firato. 2002. Proactive interference effects on
sentence production. Psychonomic Bulletin and Review 9 (4): 795–800.
Ferreira, Victor S., and Hiromi Yoshita. 2003. Given-new ordering effects on the
production of scrambled sentences in Japanese. Journal of Psycholinguistic
Research 32 (6): 669–692.
Gordon, Peter C., Randall Hendrick, and William H. Levine. 2002. Memory load
interference in syntactic processing. Psychological Science 13 (5): 425–430.
Halliday, Michael A. K. 1994. Introduction to Functional Grammar. London:
Edward Arnold.
Haywood, Sarah L., Martin J. Pickering, and Holly P. Branigan. 2005. Do speakers
avoid ambiguities during dialogue? Psychological Science 16 (5): 362–366.
Hoff-Ginsberg, Erica. 1997. Language Development. Pacific Grove, CA: Brooks/
Cole.
Jackendoff, Ray S. 1972. Semantic Interpretation in Generative Grammar. Cam-
bridge, MA: MIT Press.
Jackendoff, Ray. 1983. Semantics and Cognition. Cambridge, MA: MIT Press.
Jackendoff, Ray. 1990. Semantic Structures. Cambridge, MA: MIT Press.
Jackendoff, Ray. 2002. Foundations of Language: Brain, Meaning, Grammar,
Evolution. New York: Oxford University Press.
Just, Marcel A., and Patricia A. Carpenter. 1992. A capacity theory of comprehen-
sion: Individual differences in working memory. Psychological Review 99 (1):
122–149.
Lerdahl, Fred, and Ray Jackendoff. 1983. A Generative Theory of Tonal Music.
Cambridge, MA: MIT Press.
Levelt, Willem J. M. 1989. Speaking: From Intention to Articulation. Cambridge,
MA: MIT Press.
MacWhinney, Brian. 2000. The CHILDES Project: Tools for Analyzing Talk. 3rd
ed. Mahwah, NJ: Lawrence Erlbaum Associates.
Narasimhan, Bhuvana, and Christine Dimroth. 2008. Word order and information
status in child language. Cognition 107 (1): 317–329.
Princeton University. 2010. About WordNet. WordNet. Princeton University.
http://wordnet.princeton.edu.
Slevc, L. Robert. 2011. Saying what's on your mind: Working memory effects on
syntactic production. Journal of Experimental Psychology: Learning, Memory,
and Cognition 37 (6): 1503–1514.
Stephens, Nola. 2010. Given-before-new: The Effects of Discourse on Argument
Structure in Early Child Language. Ph.D. diss., Stanford University.
Von Stutterheim, Christiane, and Wolfgang Klein. 2002. Quaestio and
L-perspectivation. In Perspective and Perspectivation in Discourse, edited by Carl
F. Graumann and Werner Kallmeyer, 59–88. Amsterdam: John Benjamins.
Wundt, Wilhelm M. 1900. Die Sprache. Leipzig: Engelmann.
Yule, George. 1997. Referential Communication Tasks. Mahwah, NJ: Lawrence
Erlbaum Associates.
12 Sleeping Beauties
Willem J. M. Levelt
During the decade around 1860 Gregor Mendel ran his classic experi-
ments on the hybrids of pea plants in the botanical garden of his
Augustine monastery in Brünn, Austria. There he discovered the basic
principles of heredity, later called Mendel's laws: the law of segregation
(the existence of dominant and recessive traits) and the law of indepen-
dent assortment (traits being independently inherited). In 1866 he
published these discoveries as Versuche in Pflanzenhybriden in the
journal of the local natural science society, not exactly a journal that
featured on Charles Darwin's shelves. Mendel then became abbot of his
monastery and spent little further effort on promoting his discoveries.
They became sleeping beauties for the next three decades. By the
end of the 1890s, four princes, more or less independently, kissed them
back to life: Hugo de Vries from Amsterdam, Erich Tschermak-
Seysenegg (assisted by his brother Armin) from Vienna, and Carl
Correns from Tübingen. Their papers, all three reporting the rediscovery
of Mendel's laws, appeared almost simultaneously in 1900, two of them
acknowledging Mendel's priority, the third one, Hugo de Vries, soon
joining in.
This is undoubtedly the most famous case of rediscovery in modern
science. However, rediscovery is not limited to the natural sciences. The
present chapter will review a number of sleeping beauties in linguistics
and psycholinguistics: discoveries, tools, and theories that reawakened
after long periods of slumber. I came across them while writing A History
of Psycholinguistics (2013).1 One of these beauties, the first to be dis-
cussed, was kissed back from enchantment by Ray Jackendoff in his
theory of consciousness (1987).
that we cannot recognize a word when its initial speech sound is experi-
mentally changed. We will not recognize cold when we hear told. Still, we
might recognize the spoken non-word gypothesis as hypothesis. Later
versions of the theory allow for slight activation of (candidate) words that
were not in the original cohort (in the example the word hypothesis).
Another attractive feature of cohort theory is the notion of uniqueness
point. Each new incoming speech sound further reduces the cohort, till
just one candidate word is left, which is then recognized as the target
word. That can happen before all of the word's speech sounds have come
in. Take the word snorkel. When the input has reached the stage snor-,
then the cohort has been reduced to snorkel, snorer, snort, snorter, and
snorty. But as soon as k comes in, only snorkel remains. Hence, speech
sound k is snorkel's uniqueness point. A word's uniqueness point thus
depends on the set of word-initial alternatives in the listener's lexicon.
The theory predicts that a word is recognized as soon as its uniqueness
point is reached. This was nicely confirmed in the initial experiments, and
the notion is still a basic one in spoken word perception.
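The cohort reduction and uniqueness point described above can be illustrated with a minimal sketch. It is not an implementation of the cohort model itself: it assumes that letters can stand in for speech sounds, that the listener's lexicon is a small hand-picked set (the name LEXICON and both function names are hypothetical), and it ignores the graded activation allowed by later versions of the theory.

```python
def cohort(prefix, lexicon):
    """Return, in sorted order, all lexicon words that begin with `prefix`."""
    return sorted(w for w in lexicon if w.startswith(prefix))


def uniqueness_point(word, lexicon):
    """Return the 1-based position of the segment at which `word` is the
    only candidate left in the cohort, or None if that never happens."""
    for i in range(1, len(word) + 1):
        if cohort(word[:i], lexicon) == [word]:
            return i
    return None


# Toy lexicon (an assumption for illustration, not a real word list).
LEXICON = {"snorkel", "snorer", "snort", "snorter", "snorty",
           "snow", "cold", "told"}

print(cohort("snor", LEXICON))               # the five snor- candidates
print(uniqueness_point("snorkel", LEXICON))  # position of the segment k
print(uniqueness_point("snort", LEXICON))    # None: snorter and snorty remain
```

In this toy lexicon snort never reaches a uniqueness point, because snorter and snorty remain in the cohort even after its last segment; such a word can only be identified once the input ends.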
Sigmund Exner had been ahead of Marslen-Wilson by over eight
decades. He formulated the essence of the theory in 1894. Here is the
relevant text in English translation (from Levelt 2013, 81):
When you for instance hear the sound K, with [. . .] very low intensity the traces
are activated which in many earlier cases were simultaneously active with the
perception of K and which correspond to the images of Knabe [boy], Kuh
[cow], Kirsche [cherry], Kugel [ball], Kern [kernel], etc. [. . .] This activation
doesn't disappear however with the disappearance of the sound K, but continues
[. . .] as a trace for a duration of a number of seconds [. . .]. If during the existence
of this activation [. . .] also the sound I is heard, then a further bit of activation
will be received by those traces that are associatively connected to the sound I.
This should not mean that the image of Fisch [fish] is not also activated by the
I-sound because of its connection to the I-sound, but it is obvious that all images
whose name begins with KI have a remarkable advantage, because they were
already activated by the previous K-sound. [. . .] Hence, the image Kirsche will
be closer to the activation value needed for clear consciousness than the image
Fisch. In addition, it [the I-sound] will [. . .] suppress the vague images Knabe,
Kuh, Kugel, Kern, etc. [. . .] [Kirsche] will however still be at the same
activation level with other words beginning with Ki [. . .]. If then the further
sound R is added, the total activation process of the traces in the brain is nar-
rowed down following the same principle, so that only the traces representing
the images Kirsche and Kirche are activated; the further sound Sch then hits
a relatively very small number of active brain traces, but it is intensive and it will,
during the pause that follows completion of the word, develop itself into the full
activation of the image traces of Kirsche.⁴ (German original: Exner 1894,
307–308)
A most remarkable sleeping beauty has been Rudolf Meringer and Carl
Mayer's (1895) theory of speech errors and its further extension in
Meringer (1908). There is indeed great beauty here. The thoroughly
data-based theory is the first to explain speech errors from an explicit
psychological theory of utterance production, a theory that in its
essentials still stands today. It is, moreover, almost incomprehensible
how this work could suffer the fate of a decades-long sleep state. Let us
briefly consider these two features of the case.
The linguist Rudolf Meringer (1859–1931) was born in Vienna, and
held teaching positions there and, from 1899, in Graz. He was a
confirmed empiricist: "one who cannot observe is not a researcher, but a
bookworm"⁵ (Meringer 1909, 597). His grand empirical project became
the systematic collection, analysis, and psycholinguistic explanation of
spontaneous speech errors. Meringer organized the systematic collection
by involving the participants in a regular lunch-time meeting. They
agreed to stick to certain rules, such as speaking one person at a time
and halting all conversation as soon as a tongue slip occurred. The latter
would allow for proper recording of the error and for immediate intro-
spection on the part of the speaker concerned. This procedure introduced
an important methodological feature: all occurring speech errors were
recorded, not just the remarkable, interesting, or funny ones as had been
the tradition, and as would regrettably become the tradition again.
Medical doctor Carl Mayer was only marginally involved with data col-
lection and analysis and not at all with the writing. However, his
co-authorship was important for Meringer because it would mark empir-
ical speech error research as natural science. The total corpus recorded
amounted to some 2500 slips of the tongue.
Mayer, many more from his own or his colleagues, supporting an entirely
different story: speech errors result from something suppressed from
consciousness, forcing its way out. For example, "Sie werden Trost finden,
indem Sie sich völlig Ihren Kindern widwen" (target: widmen, "devote"),
spoken by a gentleman to a beautiful young widow ("you will find
consolation in fully widowing yourself to your children"). Here is Freud's
explanation for this mechanically obvious perseveration: the suppressed
thought indicated a different kind of consolation, namely that a beautiful
young widow will soon enjoy new sexual pleasures. No wonder that Meringer
describes such analyses as "jenseits von gut und böse" ("beyond good
and evil"; 1908, 129). In subsequent editions of Zur Psychopathologie des
Alltagslebens, Freud's stories become ever wilder and more offensive to
Meringer. Ultimately, after its sixth edition in 1919, Meringer had had
enough, and wrote a detailed, totally devastating, and hilarious review
(Meringer 1923). After carefully deconstructing Freud's phantom
interpretations case after case, Meringer concludes, "How much clearer
spoke Pythia than the way Fate reveals itself to modern Freud-humans!
One should even despair, if the same Fate hadn't also blessed the same
human beings with psychoanalysis"⁸ (140). However, it was to no avail.
Freud's storytelling about speech errors had conquered the world; in
1923, the 11th printing of the English edition was already available.
This brings us to the other cause of obliteration. There was never an
English translation of Meringer and Mayer's (1895) treatise. After World
War I, and especially after the establishment of the Third Reich in 1933,
the center of gravity of psycholinguistics shifted to the Anglo-Saxon
world, especially North America. As we will consider, research lines were
drastically broken, knowledge of German was limited, and mental
machinery was anathema to dominant behaviorism. Behavioristic
psycholinguistics culminated in Burrhus Frederic Skinner's Verbal Behavior
of 1957, or rather already in his William James Lectures of 1947, which
was generally considered holy writ. Verbal Behavior is essentially a
book about the speaker, in which the theoretical framework of operant
conditioning is applied to the phenomena of language production, an
enormous scaling up from elementary behaviors of rats and pigeons in
Skinner boxes to the most complex of all behaviors, speaking. Not
surprisingly, the book lacks an empirical, let alone experimental, basis; it
is a discursive text. It does however discuss speech errors. They can occur
when two verbal operants (verbal responses such as snarl and tangle)
have the same strength and become simultaneously emitted (as snangle).
Here, Skinner rejects Freud's approach to look for explanations in highly
[Figure 12.1: diagram; labels S A O V A aligned with the words "man furious child hit hard"]
[Figure 12.2: diagram; node labels a b, (a) c, A B, (a) d, (a) e, a1 b1, A D]
Als er sich den Vorwurf sehr zu Herzen zu nehmen schien (a b) und immer aufs neue
beteurte (c), daß er gewiß gern mitteile (d), gern für Freunde tätig sei (e), so empfand sie
(A B), daß sie sein zartes Gemüt verletzt habe (a1 b1), und sie fühlte sich als seine
Schuldnerin (A D). [As he seemed to take the reproach to heart (a b) and again and
again proclaimed (c) that he certainly gladly intimated (d) to be eagerly active for his
friends (e), then she experienced (A B), that she injured his tender heart (a1 b1), and
she felt indebted to him (A D).]
Figure 12.3
Neither Adolf Reinach nor Hans Lipps is referred to in John Austin's
famous 1955 William James Lectures (Austin 1962), but they had
certainly been pioneers of speech act theory.
Hermann Steinthal had introduced the term akataphasia for the inability
of certain aphasic patients to build sentences in spite of the fact that the
underlying thought or judgment is intact. Adolf Kussmaul, in his
wonderful 1877 text on disorders of language, recognized the same syndrome,
calling it agrammatism, the term we still use. It is "the inability to inflect
words appropriately and to syntactically order them into sentences"¹¹
(164). A more detailed analysis of agrammatic speech style was
undertaken by Carl Wernicke's students Karl Bonhoeffer (1902) and Karl
Heilbronner (1906). They characterized this style as telegraphic.
Heilbronner argued that this style was not voluntary but a real syntactic
inability, a primary effect of a lesion in the speech motor area.
This was the state of the art when Max Isserlin (1879–1941) published
his paper "Über Agrammatismus" (1921). The paper includes extensive
protocols of the spoken and written texts of three agrammatic patients.
Here is an utterance of case 1 (WD), who describes how his
brother-in-law was killed:
Thief been – brother-in-law at job, nothing noticed at all – 2 days – thrown in the
Pregel – in Königsberg anyhow very bad – just Goldmarks – nothing to eat.
Killer found later – taken out of bed worker.¹²
writing religious texts, teaching bible classes, and preaching in his local
Nashville community. It took almost two decades before Stroop's paper
returned to the scientific agenda. By now it is the most cited paper in
the domain of reading research. For both Mendel and Stroop, religious
duties took precedence over scientific self-promotion.
Another quite general impediment is the language of publication. This
certainly holds for all seven cases discussed in this paper. All of them
were published in German, and none of the relevant publications by
Steinthal, Exner, Meringer, Wundt, Reinach, Lipps, or Isserlin were
translated into English. With the shift of gravity in the language sciences to
the Anglo-Saxon world, especially North America, during the first half
of the 20th century, English became the language of science. Increasingly,
the mastery of German was lost in the linguistic community. Secondary
English-language sources became the tools of reference to the original
sources, often with major misrepresentations or omissions as a
consequence. Wundt, for instance, was soon called an "introspectionist" in the
United States and often still is, but he wasn't. Wundt never introduced a
method of systematically observing and reporting one's own inner
experience, thoughts, and feelings. That was done by his students Oswald
Külpe in Würzburg, and Edward Titchener at Cornell. It was the latter
who ascribed introspectionism to Wundt, whereas Wundt had himself
attacked that method in his ferocious 1908 critique of Karl Bühler's
Habilitationsschrift (Wundt 1908), which had been supervised by Külpe.
As mentioned, the major American source on Wundt's
(psycho-)linguistics was Bloomfield's (1914) text, but it left out Wundt's phrase diagrams
and didn't mention his grammar of sign language.
One really wicked fairy has been behaviorism, in particular the North
American Watsonian variant of it. This played out in linguistics and
psychology alike. All of the above beauties had originated in the minds of
mentalists. Still in 1914, the year John Broadus Watson's Behavior appeared,
Leonard Bloomfield put the common view this way: "To demonstrate in detail the
role of language in our mental processes would be to outline the facts of
psychology" (56), but then the tide quickly turned in the United States,
for reasons that are still not well understood. This is how Bloomfield
rejected mentalism in 1933: "It remains for linguists to show, in detail,
that the speaker has no ideas, and that the noise is sufficient – for the
speaker's words to act with a trigger-effect upon the nervous systems of
his speechfellows" ([1933] 1976, 93). Although behavioristic language
scholars deeply disagreed among themselves, they all outlawed
explanation in terms of mental constructs. It even became an industry to translate
their intellectual heritage. In one case, the two World Wars joined forces
to truncate a promising intellectual development. Both pioneers of
speech act theory were killed on the German front: in 1917 young Adolf
Reinach in Diksmuide, Belgium, and in 1941 Hans Lipps on the Russian
front. John Austin could hardly have become aware of their work.
12.9 Prospect
Has modern science successfully banished the wicked fairy? The language
barriers have been largely removed, with (bad) English as the generally
accepted lingua franca of science. Although dogmatic behaviorism has
faded from the scene, other forms of intellectual provincialism have
until recently blossomed in linguistics behind impenetrable walls of
defense. But this era of linguistic wars also belongs to the past, it
seems. Most importantly, the seven decades since the latest (and
hopefully very last) World War have seen a large-scale globalization of
the scientific enterprise, from which the language sciences are profiting
immensely. Language diversity can now, finally, be addressed involving
native speakers of all ethnicities and cultures. The beauties on this
global academic scene are very much alive and kicking, but let us stay
alert. One menacing wicked fairy in modern science is its quasi-market
model. Frequent publication in high-impact journals has become the sine
qua non for a scientific career. Publication rate, especially among the
young and untenured, has been rocketing in recent years. Journal papers,
especially short and multiple-authored ones, have become the dominant
output commodity of science and (psycho-)linguistics. However, a really
functioning market matches producers and consumers. That healthy
situation does not exist in science, as Klein (2012) has argued. Most
published papers are hardly ever cited and quite probably hardly ever
carefully read. There is no guarantee whatsoever that the best ideas will
ultimately emerge in the market. It seems moreover inevitable that
especially risky, non-trivial, and innovative insights will be hard put to
survive peer review. In short, new sleeping beauties are bound to be added
to the hidden, overgrown castle of science. History will keep repeating
itself.
Notes
1. Inevitably, the present paper occasionally uses material from that book.
2. Glückliche Fortschritte in der Sprachwissenschaft setzen eine entwickelte
Psychologie voraus. This and all following translations are mine.
3. Alles Sprechen und Denken in Worten beruht darauf [. . .] dass der Inhalt
seine stellvertretenden Wörter in das Bewusstsein schicke, da er selbst nicht
dahin gelangen kann.
4. Mit ähnlicher, sehr geringer Intensität werden beim Hören, z.B. des Lautes
K, die Bahnen erregt werden, welche in vielen Fällen gleichzeitig mit der
Empfindung des K in Action waren und die den Vorstellungen von Knabe, Kuh,
Kirsche, Kugel, Kern etc. entsprechen. . . . Diese Erregung verschwindet aber
nicht sofort mit dem Aufhören des Lautes K, sondern besteht als Bahnung, wie
wir gesehen haben, noch eine nach Secunden zählende Zeitdauer fort. . . . Wenn
nun während des Bestehens der Bahnung dieser Rindenfasern . . . noch der Laut
I gehört wird, so werden dadurch aus dem ganzen Bereiche der gebahnten
Vorstellungen jene Bahncomplexe einen weiteren Zuschuss an Erregung
bekommen, welche assoziativ mit dem Laute I verknüpft sind. Es soll dabei nicht
gesagt sein, dass nicht auch die Vorstellung Fisch durch den I-Laut gehoben wird,
indem auch sie mit dem Laute I zusammenhängt, aber es leuchtet ein, dass alle
Vorstellungen, deren Wortbezeichnung mit KI beginnt, einen bedeutenden Vorsprung
haben, da sie durch das vorgehende K bereits gehoben waren. . . . Es wird also
die Vorstellung Kirsche näher dem Erregungswerthe liegen, bei dem sie dem
Bewusstsein klar vorschwebt, als die Vorstellung Fisch. Sie wird weiterhin nach
dem Prinzip der centralen Hemmung die dunkle Vorstellungen Knabe, Kuh,
Kugel, Kern etc. unterdrücken, sie wird aber nicht allein dies thun, da sie mit
der Lautfolge Ki noch nicht voll entwickelt ist, vielmehr wird sie . . . noch auf
gleicher Erregungsstufe stehen mit den Vorstellungen, welche anderen mit Ki
beginnenden Worten angehört, und diese werden gemeinschaftlich die centrale
Hemmung erwecken. Reiht sich dann weiterhin der Laut R an, so wird der
gesammte Erregungsprocess der Rindenbahnen nach demselben Principe noch
weiter eingeschränkt, so dass etwa nur mehr die Bahnen, welche der Vorstellung
Kirsche und Kirche entsprechen, gebahnt sind; der weitere Laut Sch trifft nur
mehr eine verhältnissmässig sehr geringe Anzahl von Rindenfasern gebahnt,
diese Bahnung aber ist eine intensive und wird mit der Pause, welche nach
Vollendung des Wortes eintritt, sich zur vollen Erregung der Vorstellungsbahnen der
Kirsche entwicklen können. (Exner 1894, 307–308).
5. . . . und wer nicht beobachten kann, ist kein Forscher, sondern ein
Bücherwurm.
6. Beim Sprechfehler versagt nur die Aufmerksamkeit, die Machine läuft ohne
Wächter, sich selbst überlassen.
7. . . . wird beim Versprechen von solchen Lautgesetzen völlig abgesehen.
8. Wieviel klarer sprach die Pythia, als wie sich das Schicksal modernen
Freud-Menschen offenbart! Man müßte verzweifeln, wenn dasselbe Schicksal die
Menschen nicht auch mit der Psychoanalyse begnadet hätte!
9. Der Zweck unseres Sprechens ist stets der, den Willen oder Erkentniss einer
Person so zu beeinflussen, wie es dem Sprechenden als wertvoll erscheint.
10. Vielmehr ist das Befehlen ein Erlebnis eigener Art, ein Tun des Subjektes,
dem neben seiner Spontaneität, seiner Intentionalität und Fremdpersonalität die
Vernehmungsbedürftigkeit wesentlich ist.
References
Austin, John Langshaw. 1962. How to Do Things with Words. Oxford: Clarendon
Press.
Bonhoeffer, Karl. 1902. Zur Kenntniss der Rückbildung motorischer Aphasien.
Mitteilungen aus den Grenzgebieten der Medizin und Chirurgie 10: 203–224.
Bloomfield, Leonard. 1914. An Introduction to the Study of Language. New York:
Henry Holt.
Bloomfield, Leonard. [1933] 1976. Language. London: Allen & Unwin.
Cohen, Anthony. 1968. Errors of speech and their implications for understanding
the strategy of language users. Zeitschrift für Phonetik 21 (1–2): 177–181.
Exner, Siegmund. 1894. Entwurf zu einer physiologischen Erklärung der
psychischen Erscheinungen. Vol. 1. Leipzig and Vienna: Franz Deuticke.
Freud, Sigmund. [1901] 1954. Zur Psychopathologie des Alltagslebens. Frankfurt
am Main: Gustav Fischer.
Haarmann, Henk, and Herman H. J. Kolk. 1991. A computer model of the
temporal course of agrammatic sentence understanding: The effects of variation in
severity and sentence complexity. Cognitive Science 15 (1): 49–87.
Heilbronner, Karl. 1906. Ueber Agrammatismus und die Störung der inneren
Sprache. Archiv für Psychiatrie und Nervenkrankheiten 41: 653–683.
Herbart, Johann Friedrich. 1824. Psychologie als Wissenschaft, neu gegründet auf
Erfahrung, Metaphysik und Mathematik. 2 vols. Königsberg: Unzer.
Isserlin, Max. 1921. Über Agrammatismus. Zeitschrift für die gesamte Neurologie
und Psychiatrie 75: 332–410.
Isserlin, Max. 1936. Aphasie. In Handbuch der Neurologie, vol. 6, edited by
Oswald Bumke and Otfrid Foerster, 626–807. Berlin: Springer.
Jackendoff, Ray. 1987. Consciousness and the Computational Mind. Cambridge,
MA: MIT Press.
Jackendoff, Ray. 1997. The Architecture of the Language Faculty. Cambridge, MA:
MIT Press.
Jackendoff, Ray. 2007. Language, Consciousness, Culture: Essays on Mental Struc-
ture. Oxford: Oxford University Press.
Jackendoff, Ray. 2012. A User's Guide to Thought and Meaning. Oxford: Oxford
University Press.
Jespersen, Otto. 1922. Language: Its Nature, Development and Origin. New York:
Henry Holt.
Klein, Wolfgang. 2012. Auf dem Markt der Wissenschaften oder: Weniger wäre
mehr. In Herausragende Persönlichkeiten berichten über ihre Begegnung mit
Heidelberg, edited by Karlheinz Sonntag, Heidelberger Profile, 61–84. Heidelberg:
Universitätsverlag Winter.
Kolk, Herman, and Claus Heeschen. 1990. Adaptation symptoms and impairment
symptoms in Broca's aphasia. Aphasiology 4 (3): 221–231.
Kussmaul, Adolf. 1877. Die Störungen der Sprache: Versuch einer Pathologie der
Sprache. In Handbuch der Speciellen Pathologie und Therapie, edited by Hugo
von Ziemssen. Anhang. Leipzig: F. C. W. Vogel.
Levelt, Willem J. M. 2013. A History of Psycholinguistics: The Pre-Chomskyan
Era. Oxford: Oxford University Press.
Lipps, Hans. [1937, 1938] 1958. Die Verbindlichkeit der Sprache. Frankfurt: Vittorio
Klostermann.
Maas, Utz. 2010. Verfolgung und Auswanderung deutschsprachiger
Sprachforscher. 2 vols. Tübingen: Stauffenburg Verlag.
Marslen-Wilson, William D., and Alan Welsh. 1978. Processing interactions and
lexical access during word-recognition in continuous speech. Cognitive
Psychology 10 (1): 29–63.
Mendel, Gregor. 1866. Versuche über Pflanzenhybriden. Verhandlungen des
naturforschenden Vereins in Brünn 4: 3–47.
Meringer, Rudolf. 1908. Aus dem Leben der Sprache. Versprechen, Kindersprache,
Nachahmungstrieb. Berlin: Behr.
Meringer, Rudolf. 1923. Die täglichen Fehler im Sprechen, Lesen und Handeln.
Wörter und Sachen 8: 122–140.
Meringer, Rudolf, and Carl Mayer. 1895. Versprechen und Verlesen. Eine
psychologisch-linguistische Studie. Stuttgart: Göschensche Verlagshandlung.
New edition, edited by Anne Cutler and David Fay. Amsterdam: John Benjamins,
1978.
Müller, Friedrich Max. 1887. The Science of Thought. London: Longmans, Green,
and Co.
Nida, Eugene Albert. 1949. Morphology: The Descriptive Analysis of Words. 2nd
edition. Ann Arbor, MI: University of Michigan Press.
Reinach, Adolf. 1913. Die apriorischen Grundlagen des bürgerlichen Rechtes.
Halle: Max Niemeyer.
Skinner, Burrhus Frederic. 1957. Verbal Behavior. Acton, MA: Copley Publishing
Group.
Steinthal, Heymann. 1855. Grammatik, Logik und Psychologie: Ihre Prinzipien
und ihr Verhältniss zu einander. Berlin: F. Dümmler. New edition, Hildesheim:
Georg Olms, 1968.
Steinthal, Heymann. 1881. Einleitung in die Psychologie und Sprachwissenschaft.
Berlin: F. Dümmler.
Daniel Silverman
are articulatory, aerodynamic, acoustic, and auditory reasons for this (the
four As).
Regarding articulation, a complete oral closure followed by its release
is quite easy to produce in comparison to other gestures that have
become part of the speech code, gestures that often require extreme
muscular and timing precision to achieve their characteristic aerody-
namic, acoustic, and auditory traits (Ladefoged and Johnson 2011).
Aerodynamically, this simple articulatory action produces a passively
energized expulsion of air from the vocal tract. As air is the medium of
sound transmission, increased airflow allows for more salient and more
varied sounds. Perhaps especially, upon the breaking of an oral seal and
allowing air to rapidly flow from the lungs and out the mouth, the vocal
folds, when properly postured, may readily engage in vibratory activity
(Rothenberg 1968).
Acoustically, this sudden and forceful expulsion of air produces a
speech signal of comparatively heightened energy, one in which any
number of pitch/phonation (source) and resonance (filter) modifications
might be encoded.
Regarding audition, the mammalian auditory nerve is especially
responsive to sudden increases in acoustic energy (Delgutte 1982; Tyler
et al. 1982); a quick reaction to the sudden breaking of silence provides
obvious survival advantages in predation situations. The incipient speech
code would likely exploit this property from the outset, as it does to this
very day (Bladon 1986).
This nascent oral seal may be at the lips, but also, the flexibility of the
tongue allows both its front to form a seal at the alveolar ridge, and its
back to form a seal at the soft palate. The perceptual product of these
distinct closure locations is three easily-distinguished speech events of
exceptionally short duration. This tripartite perceptual distinction estab-
lishes the conditions for different acoustic signals to encode different
meanings; we might imagine an early stage during which these three
closure postures were in place, coordinated with largely undifferentiated
qualities to their opening postures, perhaps resulting in three sounds,
roughly, pu, ti, ka.
If vocal activity of this nature was indeed harnessed to encode meaning,
the semiotic character of primitive speech was of a first-order state, in
contrast to the zero-order state of the manual-gestural system with which
it may have overlapped: each of the three sounds might encode a single
meaning (maybe Run!, Kill!/Eat!, Sex!). One arbitrary event cor-
responds to one meaning, and one meaning is cued by one arbitrary
ways: -di, for example, may now come to be associated with an additional
meaning, and thus becomes free to appear as the first element of a
complex, for example, di-bu (as opposed to a different complex, ti-bu).
Note that the articulatory properties of these initial di-s are slightly dis-
tinct from -di (typically involving an expanded pharynx and lowered
larynx during oral closure in order to maintain trans-glottal airflow,
hence voicing), but nonetheless correspond to -di quite well in acoustic
terms.
This sort of simple and natural sound change sets in motion a massive
increase in the system's complexity. For example, newly-voiced medial
closures may undergo further sound changes, to be harnessed for new
meanings: when the di of di-bu is placed in second position (for example,
ka-di), it is pronounced with closure voicing, comparable to the closure
voicing that had earlier been added to -ti in this context (for example,
earlier bu-ti, now bu-di). That is, two different meanings are now cued
by the same sounds in comparable or even identical contexts. We may
have bu-di in which -di means one thing, but also bu-di in which -di
means something else. This establishes a one-to-many relationship
between sound and meaning (derived homophony), a development also
found in all languages (Silverman 2012).
If many sounds each came to symbolize more than one meaning, listener
confusion and communicative failure might result. Such a scenario
will not come to pass, however (Martinet 1952; Labov 1994; Silverman
2012). Defeating the pervasiveness of this potentially function-negative
development, the di- of di-bu may passively undergo another change
when placed in second position: some spontaneous productions of origi-
nal -di that possess a slight weakening of their voiced closures may evolve
towards a new value, perhaps, -zi, so we have bu-di (earlier bu-ti), and a
different form, bu-zi (earlier bu-di; still earlier, bu-ti). Indeed, such sound
patterns are likely to take hold exactly because of their function-positive
consequences: creeping phonetic patterns that inhibit undue listener
confusion are likely to be replicated and conventionalized. In short, suc-
cessful speech propagates; failed speech falls by the wayside.
This means we now have di- alternating with -zi, both meaning one
thing, and, recall, we have -di alternating with ti-, both meaning another.
The co-evolution of these many-to-one relationships between sound and
meaning results in many meaningful elements of the speech signal pos-
sessing both systematic phonetic variation and semantic stability, even
across varied contexts. Now, in turn, this new sound zi may unhinge itself
from its context and be deployed to signal new meanings.
Evolution of the Speech Code 265
Such speech patterns are found time and again in both (diachronic)
sound changes and (synchronic) sound alternations (Gurevich 2004).
It is now clear that the mere juxtaposition of two simple sounds trig-
gers remarkable growth and complexity of both the phonetic and the
semantic inventories. Both one-to-many and many-to-one correspon-
dences between sound and meaning naturally emerge. This is second-
order symbolism.
ambiguous affiliation of the middle term thus opens the gates to hierar-
chical structure.
Of course, these multiple interpretations of particular phonetic strings
should be few and far between, since most strings possess (1) sound-
sequencing cues, (2) meaning-sequencing cues, and (3) pragmatic cues to
the intended structure and meaning of the string. Consequently, and most
interestingly, it is exactly those rarely-encountered ambiguous forms that
are most important for the development of the system toward third-
order symbolic status. We turn to this issue now.
13.5 Discussion
When it comes to the origins of grammar, the search for evidence typi-
cally encompasses four domains:
1. Naturally occurring sub-language states in child learners, pidgins,
innovated signed languages, and impeded speech (due to drunkenness,
semi-consciousness, or pain, for example)
2. Ape-training studies
3. Laboratory experiments
4. Computer simulations
The present proposals exploit a fifth domain of inquiry, one of internal
reconstruction (Saussure 1879) taken to its final frontier. Internal recon-
struction is a method for investigating the origins of grammar inasmuch
as observing the receding of distant galaxies is a method for investigating
the real Big Bang: we observe extant pressures on structure and change,
and extrapolate them to their logical origins.
Several advantages arise from this approach to the origin of grammar.
1. These proposals properly treat language as a complex adaptive
system (Steels 2000; Beckner et al. 2009), one that is inherently social,
involving both speakers and listeners; one that is inherently dynamic,
involving competing pressures, and thus allowing for adaptive change;
one whose structures are wholly emergent; one that affects, and is
affected by, the co-evolutionary interactions of biological, cognitive,
and social structures.
2. The present approach strictly adheres to the tenets of
Uniformitarianism (Hutton 1795; Lyell 1830–1833). As noted, the proposed pressures
and emergent structures by which the system originated remain in place
to this very day. And while Uniformitarianism does not rule out the
possibility of punctuated equilibrium (Eldredge and Gould 1972), and indeed
the proposed grammatical Big Bang embodies this phenomenon, saltation
itself is still fully absent: natura non facit saltum.
3. Speaker-based approaches to the evolution of grammar and gram-
matical change, as compared to listener-based approaches, are not equals-
and-opposites: production is solely relevant at the level of the speaker
(not the listener), whereas perception crucially relies on a role for both
the speaker and the listener. That is, perception is inherently dependent
on the interlocutionary event, whereas production is not. With its empha-
sis on the interlocutionary event itself, the present approach properly
McClelland, and the PDP Research Group 1986); the present-day sen-
tence might shorten to present-day word length, and in turn, these
evolved word-sentences may be subject to an additional level of hier-
archical and recursive arrangement. The semantic content of these
higher-level (fourth-order?) structures, whatever they might turn out to
be, may force a re-evaluation of the present-day system as one of infinite
expressivity (Kirby 2007). Indeed, certain present-day languages
already reverberate with the stirrings of such properties: witness the
polysynthetic languages of North America, and the stem-modifying
languages of Meso-America and East Africa.
7. The present approach to the origins of grammar incorporates
degeneracy as an important component in its evolution: comparable
forms may have distinct functions, and single functions may be underlain
by multiple, different forms. Degeneracy may be a crucial element to the
introduction of hierarchical complexity in any complex adaptive system
(Whitacre 2010; see also Firth [1948] for an analysis in a specifically
linguistic context). Earlier employed to characterize genetic and biologi-
cal systems (Edelman and Gally 2001), degeneracy may be characteristic
of any system when categories are at once sufficiently robust to fulfill
and maintain their function (stability) and also sufficiently variable to
be under constant modification (evolvability). Clearly, the presence of
second-order symbolism, with its one-to-many and many-to-one relations
between form (sound) and function (meaning) paving the way to
third-order symbolism (hierarchical and recursive structures), is the
analog of this trait in the evolution of the speech code: a degenerative
grammar.
13.7 Acknowledgments
References
Ay, Nihat, Jessica C. Flack, and David C. Krakauer. 2007. Robustness and complexity
co-constructed in multi-modal signaling networks. Philosophical Transactions
of the Royal Society of London B 362 (1479): 441–447.
Beckner, Clay, Richard Blythe, Joan Bybee, Morton H. Christiansen, William
Croft, Nick C. Ellis, John Holland, Jinyun Ke, Diane Larsen-Freeman, and Tom
Schoenemann. 2009. Language is a complex adaptive system: Position paper.
Language Learning 59 (s1): 1–26.
Bickerton, Derek. 1990. Language and Species. Chicago: University of Chicago
Press.
274 Daniel Silverman
Bladon, Anthony. 1986. Phonetics for hearers. In Language for Hearers, edited
by Graham McGregor, 1–24. Oxford: Pergamon Press.
Delgutte, Bertrand. 1982. Some correlates of phonetic distinctions at the level of
the auditory nerve. In The Representation of Speech in the Peripheral Auditory
System, edited by Rolf Carlson and Björn Granström, 131–150. Amsterdam:
Elsevier Biomedical.
Edelman, Gerald M., and Joseph A. Gally. 2001. Degeneracy and complexity in
biological systems. Proceedings of the National Academy of Sciences of the United
States of America 98 (24): 13763–13768.
Eldredge, Niles, and Stephen J. Gould. 1972. Punctuated equilibria: An alterna-
tive to phyletic gradualism. In Models in Paleobiology, edited by Thomas J. M.
Schopf, 82–115. San Francisco: Freeman Cooper.
Firth, John R. 1948. Sounds and prosodies. Transactions of the Philological Society
47: 127–152.
Fodor, Jerry A. 1983. Modularity of Mind: An Essay on Faculty Psychology.
Cambridge, MA: MIT Press.
Gurevich, Naomi. 2004. Lenition and Contrast: The Functional Consequences of
Certain Phonetically Conditioned Sound Changes. New York: Routledge.
Hauser, Marc D., Noam Chomsky, and W. Tecumseh Fitch. 2002. The faculty of
language: What is it, who has it, and how did it evolve? Science 298 (5598):
1569–1579.
Hayes, Bruce. 1995. Metrical Stress Theory: Principles and Case Studies. Chicago:
University of Chicago Press.
Hutton, James. 1795. Theory of the Earth; with Proofs and Illustrations. Edin-
burgh: Creech.
Jackendoff, Ray. 1999. Possible stages in the evolution of the language capacity.
Trends in Cognitive Sciences 3 (7): 272–279.
Kiparsky, Paul. 1973. How abstract is phonology? In Three Dimensions of Linguistic
Theory, edited by Osamu Fujimura, 5–56. Tokyo: The TEC Corporation.
Kirby, Simon. 2007. The evolution of language. In Oxford Handbook of Evolu-
tionary Psychology, edited by Robin Ian MacDonald Dunbar and Louise Barrett,
669–681. Oxford: Oxford University Press.
Krakauer, David C., and Joshua B. Plotkin. 2004. Principles and parameters of
molecular robustness. In Robust Design: A Repertoire for Biology, Ecology and
Engineering, edited by Erica Jen, 115–133. Oxford: Oxford University Press.
Kruszewski, Mikołaj. [1883] 1995. Očerk Nauki o Jazyke (An Outline of Linguistic
Science). Translated by Gregory M. Eramian. In Writings in General Linguistics,
edited by Ernst Frideryk Konrad Koerner, 43–174. Amsterdam Classics in
Linguistics 11. Amsterdam: John Benjamins.
Labov, William. 1994. Principles of Linguistic Change: Internal Factors. Oxford:
Blackwell.
Ladefoged, Peter, and Keith Johnson. 2011. A Course in Phonetics. 6th ed. Inde-
pendence, KY: Wadsworth, Cengage Learning.
14.1 Introduction
[Diagram labels: Syntax; linear structure; hierarchical structure; Phonology; Semantics; PHOL; SEM; PHON; CS]
Figure 14.1
Syntax as the combinatorial mechanism that translates linear structure from Phonology
into hierarchical structure in Semantics, and the other way around (with functions PHOL
and SEM that generate grammatically relevant representations of sound and meaning; for
definitions of such functions see Wiese [2003b, 2004]).
[Diagram labels: BITING EVENT; is Agent of; is Patient of]
Figure 14.2
Links between sign-sign and meaning-meaning relationships for the sentence "Paula bites
Fred."
280 Heike Wiese and Eva Wittenberg
14.3 How Rituals Could Have Supported the Emergence of Dependent Links
forms the basis for phonological processes and restrictions, but it does
not link to hierarchical structures in meaning; no dependent links are
created as of yet.
However, with hierarchical sound structures established, it would
be possible for these patterns to be transferred to complex elements
above the syllable level and thus to meaningful elements. Once hierarchi-
cal structures are salient for verbal elements that carry meaning,
this could then give rise to links with hierarchical representations
of meaning. Complex sounds could now obtain their interpretation
through the connection of sign-sign relations between their referents.
This provides the basis for a grammatical system like that found in
modern language, that is, a system based on the correlation of sound
relations with hierarchical semantic relations through a syntactic system
that organizes dependent links tying together parallel structures in dif-
ferent domains.
Finally, consider example (8), which describes the same events as (5),
but in a different order:
(8) Harry read a book. Last Christmas, Sally had given it to him.
In this example, there is a mismatch between the order of events (first
the giving, then the reading) and the syntactic order of the sentences
referring to these events (the sentence referring to the reading event is
followed by that referring to the giving event). Again the parallelism
between semantic and syntactic structure is disrupted, this time at the
level of discourse. Numerous studies using different technologies such as
probe-recognition, recall, or measuring ERPs have shown that scenarios
like (8), in which the direct second-order iconicity is disrupted, require
more processing effort than scenarios in which this order is preserved
(Briner, Virtue, and Kurby 2012; Münte, Schiltz, and Kutas 1998; Ohtsuka
and Brewer 1992). Also, children's understanding of sentences with
chronologically ordered events ("Ilkka read the letter before he went to
school") is better than their understanding of sentences where the order
of events is inverted ("Before Ilkka went to school, he read the letter";
Johnson 1975; Notley et al. 2012; Pyykkönen and Järvikivi 2012). Thus,
there is ample evidence that the parallelism between conceptual and
discourse structure aids comprehension and memory.
that is, the development of dependent links, and could have provided the
crucial basis for the development of syntactic structure.
We identified a domain with strong ritual characteristics as particularly
significant in this development, namely, music. A central and evolutionarily
early phenomenon in human cultures, music not only supports the
linking of relations in general but also provides a domain for the linking
of linear and hierarchical relations in the acoustic domain in particular.
This linking can, in further steps, be transferred to meaningful elements
and connect linear representations from the acoustic domain with hierarchical
meanings; in today's language, dependent links are what ultimately
get us from sound to meaning. We argued that this works best
when the representational levels that are linked with each other run
closely in parallel with respect to their structures, thus allowing straight-
forward dependent links. We provided three examples for phenomena
where such a parallelism was disrupted (coercion, light verb constructions,
and temporal dissociation in discourse), and we reviewed psycholinguistic
evidence that shows how comprehenders rely on direct
parallelism and perform worse or more slowly when such parallelism is
absent.
Taken together, this article provided something like a voyage through
a number of areas that have benefited from Ray Jackendoff's research:
theories about such diverse topics as grammar, music, cognition, and the
evolution of language are, in his mind, never far apart; and he constantly
seeks evidence for or against his theories in a variety of places, as his
various endeavours with psycholinguists show. In his own words:
[S]cience is a lot like chamber music. You can't just do your own stuff. You have
to be constantly listening to everyone. Sometimes the crucial facts come from
your own field, sometimes from the most unexpected place in someone else's.
We're all in this together, and the goal is to create a coherent story about thought
and meaning and the mind and the brain that will satisfy us – and, we hope,
posterity. (Jackendoff 2012, 213)
References
Briner, Stephen W., Sandra Virtue, and Christopher A. Kurby. 2012. Processing
causality in narrative events: Temporal order matters. Discourse Processes 49 (1):
61–77.
Bühler, Karl. 1934. Sprachtheorie: Die Darstellungsfunktion der Sprache. Jena: G.
Fischer.
Culicover, Peter W., and Ray Jackendoff. 2005. Simpler Syntax. Oxford: Oxford
University Press.
Culicover, Peter W., and Andrzej Nowak. 2002. Learnability, markedness, and
the complexity of constructions. In Language Variation Yearbook, vol. 2, edited
by Pierre Pica and Johan Rooryk, 5–30. Amsterdam: John Benjamins. Reprinted
in Peter W. Culicover, Explaining Syntax, 5–30. Oxford: Oxford University
Press, 2013.
Czerwon, Beate, Annette Hohlfeld, Heike Wiese, and Katja Werheid. 2012. Syn-
tactic structural parallelisms influence processing of positive stimuli: Evidence
from cross-modal ERP priming. International Journal of Psychophysiology 87
(1): 3834.
Deacon, Terrence William. 1997. The Symbolic Species: The Co-evolution of Lan-
guage and the Brain. New York: Norton & Co.
Dienes, Zoltán, Gustav Kuhn, Xiuyan Guo, and Catherine Jones. 2012. Communicating
structure, affect, and movement. In Language and Music as Cognitive
Systems, edited by Patrick Rebuschat, 156–168. Oxford: Oxford University Press.
Fiebach, Christian J., Matthias Schlesewsky, Ina D. Bornkessel, and Angela D.
Friederici. 2002. Specifying the brain bases of syntax: Distinct fMRI effects of
syntactic complexity and syntactic violations. Paper presented at the 8th Annual
Conference on Architectures and Mechanisms for Language Processing (AMLAP
2002), Tenerife, Spain, September 2002.
Husband, E. Matthew, Lisa A. Kelly, and David C. Zhu. 2011. Using complement
coercion to understand the neural basis of semantic composition: Evidence from
an fMRI study. Journal of Cognitive Neuroscience 23 (11): 3254–3266.
Jackendoff, Ray. 1997. The Architecture of the Language Faculty. Cambridge, MA:
MIT Press.
Jackendoff, Ray. 1999. Possible stages in the evolution of the language capacity.
Trends in Cognitive Sciences 3 (7): 272–279.
Jackendoff, Ray. 2002. Foundations of Language. Oxford: Oxford University
Press.
Jackendoff, Ray. 2007. Language, Consciousness, Culture: Essays on Mental Struc-
ture. Cambridge, MA: MIT Press.
Jackendoff, Ray. 2012. A User's Guide to Thought and Meaning. Oxford: Oxford
University Press.
Jackendoff, Ray, and Fred Lerdahl. 2006. The capacity for music: What is it, and
what's special about it? Cognition 100 (1): 33–72.
Jackendoff, Ray, and Eva Wittenberg. 2014. What you can say without syntax: A
hierarchy of grammatical complexity. In Measuring Linguistic Complexity, edited
by Frederick Newmeyer and Laurel Preston, 65–82. Oxford: Oxford University
Press.
Johnson, Helen L. 1975. The meaning of "before" and "after" for preschool children.
Journal of Experimental Child Psychology 19 (1): 88–99.
Katsika, Argyro, David Braze, Ashwini Deo, and María Mercedes Piñango. 2012.
Complement coercion: Distinguishing between type-shifting and pragmatic inferencing.
The Mental Lexicon 7 (1): 58–76.
Arbitrariness and Iconicity in the Syntax-Semantics Interface 291
Kuperberg, Gina R., Arim Choi, Neil Cohn, Martin Paczynski, and Ray Jackend-
off. 2010. Electrophysiological correlates of complement coercion. Journal of
Cognitive Neuroscience 22 (12): 2685–2701.
Lapata, Mirella, Frank Keller, and Christoph Scheepers. 2003. Intra-sentential
context effects on the interpretation of logical metonymy. Cognitive Science 27
(4): 649–668.
Leach, Edmund R. 1968. Ritual. In International Encyclopedia of the Social Sciences,
vol. 13, edited by David L. Sills, 520–526. New York: Macmillan.
Lee, Bruce Y., and Andrew B. Newberg. 2005. Religion and health: A review and
critical analysis. Zygon 40 (2): 443–468.
Lerdahl, Fred, and Ray Jackendoff. 1983. A Generative Theory of Tonal Music.
Cambridge, MA: MIT Press.
Maess, Burkhard, Stefan Koelsch, Thomas C. Gunter, and Angela D. Friederici.
2001. Musical syntax is processed in Broca's area: An MEG study. Nature Neuroscience
4 (5): 540–545.
McElree, Brian, Liina Pylkkänen, Martin J. Pickering, and Matthew J. Traxler.
2006. A time course analysis of enriched composition. Psychonomic Bulletin and
Review 13 (1): 53–59.
Münte, Thomas F., Kolja Schiltz, and Marta Kutas. 1998. When temporal terms
belie conceptual order. Nature 395 (6697): 71–73.
Notley, Anna, Peng Zhou, Britta Jensen, and Stephen Crain. 2012. Children's
interpretation of disjunction in the scope of "before": A comparison of English
and Mandarin. Journal of Child Language 39 (3): 482–522.
Ohtsuka, Keisuke, and William F. Brewer. 1992. Discourse organization in the
comprehension of temporal order in narrative texts. Discourse Processes 15 (3):
317–336.
Patel, Aniruddh D. 1998. Syntactic processing in language and music: Different
cognitive operations, similar neural resources? Music Perception 16 (1): 27–42.
Patel, Aniruddh D. 2003. Language, music, syntax and the brain. Nature Neuroscience
6 (7): 674–681.
Patel, Aniruddh D. 2008. Music, Language, and the Brain. New York: Oxford
University Press.
Pickering, Martin J., Brian McElree, and Matthew J. Traxler. 2005. The difficulty
of coercion: A response to de Almeida. Brain and Language 93 (1): 1–9.
Piñango, María M., Jennifer Mack, and Ray Jackendoff. Forthcoming. Semantic
combinatorial processes in argument structure: Evidence from light verbs. In
Proceedings of the 32nd Annual Meeting of the Berkeley Linguistic Society. Berke-
ley, CA: Berkeley Linguistics Society.
Pyykkönen, Pirita, and Juhani Järvikivi. 2012. Children and situation models of
multiple events. Developmental Psychology 48 (2): 521–529.
Pylkkänen, Liina, Andrea E. Martin, Brian McElree, and Andrew Smart. 2009.
The anterior midline field: Coercion or decision making? Brain and Language
108 (3): 184–190.
Saussure, Ferdinand de. 1916. Cours de linguistique générale. Paris: Éditions Payot
et Rivages.
Slevc, L. Robert, Jason C. Rosenberg, and Aniruddh D. Patel. 2009. Making psy-
cholinguistics musical: Self-paced reading time evidence for shared processing of
linguistic and musical syntax. Psychonomic Bulletin and Review 16 (2): 374–381.
Tillmann, Barbara, and Emmanuel Bigand. 2002. A comparative review of
priming effects in language and music. In Language, Vision, and Music, edited by
Paul Mc Kevitt, Seán Ó Nualláin, and Conn Mulvihill, Advances in Consciousness
Research 35, 231–240. Amsterdam: John Benjamins.
Tillmann, Barbara, Emmanuel Bigand, and Marion Pineau. 1998. Effects of
global and local contexts on harmonic expectancy. Music Perception 16 (1):
99–118.
Traxler, Matthew J., Brian McElree, Rihana S. Williams, and Martin J. Pickering.
2005. Context effects in coercion: Evidence from eye movements. Journal of
Memory and Language 53 (1): 1–25.
Traxler, Matthew J., Martin J. Pickering, and Brian McElree. 2002. Coercion in
sentence processing: Evidence from eye-movements and self-paced reading.
Journal of Memory and Language 47 (4): 530–547.
Wiese, Heike. 2003a. Numbers, Language, and the Human Mind. Cambridge:
Cambridge University Press.
Wiese, Heike. 2003b. Sprachliche Arbitrarität als Schnittstellenphänomen [Linguistic
Arbitrariness as an Interface Phenomenon]. Habilitation thesis, Humboldt
University.
Wiese, Heike. 2004. Semantics as a gateway to language. In Mediating between
Concepts and Language, Trends in Linguistics 152, edited by Holden Härtl and
Heike Tappe, 197–222. Berlin: Mouton de Gruyter.
Wiese, Heike. 2007. Grammatische Relationen und rituelle Strukturen – ein evolutionärer
Zusammenhang? In Wahlverwandschaften – Verben, Valenzen, Varianten:
Festzeitschrift für Klaus Welke zum 70. Geburtstag, Germanistische
Linguistik 188/189, edited by Hartmut E. H. Lenk and Maik Walter, 113–136.
Hildesheim: Georg Olms.
Wittenberg, Eva, and María M. Piñango. 2011. Processing light verb constructions.
The Mental Lexicon 6 (3): 393–413.
Wittenberg, Eva, Martin Paczynski, Heike Wiese, Ray Jackendoff, and Gina
Kuperberg. 2014. The difference between giving a rose and giving a kiss:
Sustained neural activity to the light verb construction. Journal of Memory and
Language 73: 31–42.
Wittenberg, Eva. 2013. Paradigmenspezifische Effekte subtiler semantischer
Manipulationen. Linguistische Berichte 235: 293–308.
Wittenberg, Eva, and Jesse Snedeker. 2014. It takes two to kiss – but does it take
three to give a kiss? Conceptual sorting based on thematic roles. Language,
Cognition and Neuroscience 29 (5): 635–641.
15 The Biology and Evolution of Musical Rhythm: An
Update
W. Tecumseh Fitch
15.1 Introduction
clarify and ground our thinking about rhythmic cognition from a biologi-
cal and evolutionary viewpoint.
The starting point for any comparison of music and language, following
Jackendoff's lead, is to adopt a "divide and conquer" strategy in both the
musical and linguistic domains. There is an unfortunate tendency in the
cognitive science literature to adopt an overly monolithic view of capaci-
ties like language, music, social intelligence, and similar abilities, rather
than to squarely face their composite, multi-component nature. A mono-
lithic viewpoint leads all too naturally to the wrong questions, such as
"when did language evolve?" (as if all components have evolved at some
specific moment in our evolutionary history) or "where is music located
in the brain?" (as if the complex of perception, abstract cognition, and
production underlying music would occupy a single cortical region). The
antidote to this tendency is to recognize that any complex cognitive
capability will, when properly broken down and understood, prove to
rely upon a suite of interacting cognitive and neural capabilities, each of
which may well have its own independent evolutionary history and
neural implementation.
Jackendoff has clearly and forcefully advocated a multi-component
approach in both of these domains. In language evolution, he has offered
one of the most finely articulated multi-step scenarios for the evolution
of specific components of language, clearly separating the evolution of
phonology, syntax, and semantics (Jackendoff 1999). His general approach
to language as a system, the Parallel Architecture, embodies the need for
the separation of sub-capacities (Jackendoff 2002, 2007). Similarly in
music, his joint work with Fred Lerdahl articulates the multiple interact-
ing layers of rhythm, melody, and harmony, again illustrating that the
clearest path to understanding is to first analytically carve nature at the
joints, investigate the pieces, and then synthetically consider their inter-
actions. This becomes particularly crucial when comparing music and
language, since we can safely assume a mixture of distinctness and
overlap in their individual components.
Overall, Jackendoff's approach to the music/language comparison has
been agnostic: he proposes that we analyze each domain in its own terms,
and then "let the chips fall where they may" (Jackendoff and Lerdahl
1982; Lerdahl and Jackendoff 1983; Jackendoff and Lerdahl 2006; Jack-
endoff 2009). Not all commentators on this issue have been equally
[Diagram: three panels, each pairing Music with Language]
Figure 15.1
Three models for the relationship between music and language.
15.3 Hypotheses about the Relationship between the Music and Language
Capacities
Both music and language are universal human capacities, found in every
known culture. Both domains appear to rest on some species-specific
biological basis, but nonetheless encompass a large number of culturally-
acquired instantiations (different languages and different musical idioms).
Both are generative systems that make infinite use of finite means,
combining atomic primitives (notes, phonemes) into hierarchical com-
plexes (melodies, words, sentences). But despite these similarities, the
differences between music and language are equally obvious: most prom-
inently, music lacks the form of explicit, proposition-based semantics that
gives language its semantic power (Fitch 2006; Jackendoff and Lerdahl
2006; Jackendoff 2009). Music also has typical features lacking in lan-
guage, such as isochronicity (a steady beat) and a discretized frequency
range (pitch system) (Nettl 2000); Western tonal music also features a
complex harmonic syntax (Jackendoff and Lerdahl 2006). Fitch (2006)
dubbed these "design features" of music. Understanding this complex
pattern of similarities and differences clearly necessitates a multi-
component approach to comparison (Patel 2008; Jackendoff 2009).
Researchers who have adopted specific multi-component models have
nonetheless reached quite different conclusions (figure 15.1). On the
"different" side, there is a long tradition in neurology of seeing the neural
repeatedly strike resonant objects (cf. Fitch 2006). In the case of gorillas,
the hands typically strike the animal's own body, while chimpanzees
more commonly strike a resonant object (Arcadi, Robert, and Boesch
1998). However, there is no published evidence for synchronization of
such drumming, nor evidence that either of these species is able to
entrain their drumming to an external auditory signal. One possible
exception concerns a vocal phenomenon in bonobos dubbed "staccato
hooting" by Frans de Waal: "During choruses, staccato hooting of different
individuals is almost perfectly synchronized so that one individual
acts as the echo of another or emits calls at the same moments as
another. The calls are given in a steady rhythm of about two per second"
(De Waal 1988, 203). Unfortunately, De Waal presented no data or acoustic
analysis in support of this statement, and no further reports of staccato
hooting have occurred in the twenty-five years since this tantalizing
statement was published. Thus, in general, until recently there was virtu-
ally no evidence for synchronization in any bird or nonhuman mammal
species, which led some commentators (e.g., Williams 1967) to the conclusion
that humans are unique, at least among higher vertebrates, in
our capacity to synchronize our rhythmic movements and vocalizations
among multiple individuals, or to an external sound source.
All this changed abruptly in 2009, when two papers were published
simultaneously in the prestigious journal Current Biology (Patel et al.
2009a; Schachner et al. 2009). The initial indications of well-developed
synchronization to a musical rhythm in birds first surfaced in YouTube
videos purportedly showing dancing in a sulphur-crested cockatoo
(Cacatua galerita) named Snowball. Snowball was anonymously
donated to a bird rescue shelter along with a note indicating that he
enjoyed the music on an enclosed CD. When the CD was played, Snow-
ball began to rhythmically bob his head and lift his legs in time to the
music (figure 15.2). A YouTube video of this dancing went viral (more
than five million views by 2015) and subsequently came to the attention
of scientists, many of whom were initially sceptical about its veracity. But
the videos were suggestive enough for Aniruddh Patel and his colleagues
to travel to Snowball's home in Indiana to explore his synchronization
abilities experimentally.
Figure 15.2
Snowball, a sulfur-crested cockatoo, dancing. See (Patel et al. 2009a).
evidence that a bird can extract a rhythmic pulse from human music and
synchronize its movements to that pulse: Pulse Perception and Entrain-
ment (PPE) (cf. Fitch 2013c).
The discovery of PPE in Snowball immediately raised multiple ques-
tions about the origins and frequency of this ability in other species. To
address the zoological generality of such abilities, Adena Schachner and
colleagues performed a large-scale analysis of YouTube videos purport-
ing to show dancing animals (Schachner et al. 2009). Because many
popular videos on the internet that supposedly show dancing animals are
obviously doctored by synchronizing the audio track to the animals'
movements, initial scepticism about each video is clearly warranted.
Schachner and colleagues sifted through more than one thousand such
videos, excluding examples of doctoring, and in the remaining sample
testing whether the animal subjects maintained a consistent phase rela-
tive to the downbeat and/or matched the tempo of the music. Most
videos showed no evidence fulfilling these criteria. However, in thirty-
three videos, they observed what appeared to be PPE.
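The two criteria just described, consistent phase and matched tempo, can be illustrated with a short sketch. It is not the analysis code used by Schachner and colleagues; it assumes that movement events (for example, head bobs) have already been reduced to timestamps in seconds and that the beat period of the music is known, and the function names `phase_stats`, `rayleigh_p`, and `tempo_ratio` are hypothetical. Phase consistency is summarized here by the mean resultant length from circular statistics (near 1 when events cluster at one point in the beat cycle, near 0 when they are scattered), with a first-order approximation to the Rayleigh test as a rough check.

```python
import math
import statistics


def phase_stats(event_times, beat_period, beat_offset=0.0):
    """Return (mean_phase, resultant_length) of events relative to the beat.

    Phases are in radians; a phase of 0 means exactly on the beat.
    """
    phases = [
        2 * math.pi * (((t - beat_offset) / beat_period) % 1.0)
        for t in event_times
    ]
    c = sum(math.cos(p) for p in phases) / len(phases)
    s = sum(math.sin(p) for p in phases) / len(phases)
    return math.atan2(s, c), math.hypot(c, s)


def rayleigh_p(resultant_length, n):
    """First-order approximation to the Rayleigh test p-value, exp(-n * R^2)."""
    return math.exp(-n * resultant_length ** 2)


def tempo_ratio(event_times, beat_period):
    """Median interval between successive events, divided by the beat period."""
    intervals = [b - a for a, b in zip(event_times, event_times[1:])]
    return statistics.median(intervals) / beat_period


# Hypothetical data: music with a 0.5 s beat period (120 beats per minute).
locked = [0.5 * k + 0.02 for k in range(20)]   # one event per beat, slightly late
drifting = [0.37 * k for k in range(20)]       # regular, but at the wrong tempo

for name, events in (("locked", locked), ("drifting", drifting)):
    mean_phase, r = phase_stats(events, beat_period=0.5)
    significant = rayleigh_p(r, len(events)) < 0.05
    print(name, round(r, 2), round(tempo_ratio(events, 0.5), 2), significant)
```

A resultant length near 1 together with a tempo ratio near 1 would correspond to the pattern described above as apparent PPE. The thresholds applied in any real study are a separate matter and are not specified here.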
Among the fifteen species in Schachner and colleagues' videos for
which solid evidence for PPE was observed, an astonishing fourteen were
of parrots; the only exception was a single potential example of PPE in
an Asian elephant. Schachner and colleagues also experimentally inves-
tigated PPE in both Snowball the cockatoo and the African grey parrot
Alex. In both birds, clear evidence for PPE was uncovered, consistent
with the conclusions of Patel and colleagues. Despite hundreds of videos
showing dancing dogs, no dogs showed convincing evidence of PPE.
These data pointed to a rather surprising conclusion: PPE was charac-
teristic of only two taxa among all bird and mammal species: humans
and various parrots (Fitch 2009).
These findings led to a surge of interest in animal rhythmic abilities,
including more carefully controlled laboratory studies. The ability of
another parrot species, budgerigars (parakeets or budgies), to synchro-
nize was studied by Hasegawa and his colleagues (2011), who easily
trained eight birds to tap to an acoustically- and visually-indicated tempo
at a wide range of frequencies. While budgies learned the task more
easily for slow tempos (1200–1800 ms period), they subsequently tapped
more accurately to more rapid tempos (450–600 ms), closer to typical
human tempos. As is typical in human tapping experiments, all of the
budgies tended to lead the beat slightly, so a merely reactive process is
unlikely to account for PPE in this species. They should therefore provide
a suitable model in which to study animal rhythm further.
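Why a lead argues against a merely reactive process can be made concrete with a small sketch of signed asynchrony, the tap time minus the time of the nearest beat. This is an illustration under stated assumptions, not the procedure of Hasegawa and colleagues: taps and beats are taken to be lists of timestamps in seconds, the 600 ms period and the offsets are invented, and the function names are hypothetical. A purely reactive tapper can respond only after each beat has sounded, so its mean asynchrony should be positive; a negative mean indicates that taps anticipate the beat.

```python
def signed_asynchronies(tap_times, beat_times):
    """For each tap, return (tap time - nearest beat time) in seconds.

    Negative values mean the tap came before the beat.
    """
    return [min((t - b for b in beat_times), key=abs) for t in tap_times]


def mean_asynchrony(tap_times, beat_times):
    """Average signed asynchrony across all taps."""
    offsets = signed_asynchronies(tap_times, beat_times)
    return sum(offsets) / len(offsets)


# Hypothetical data: a 600 ms beat, one anticipating and one reacting tapper.
beats = [0.6 * k for k in range(10)]
anticipating = [b - 0.03 for b in beats]   # taps 30 ms before each beat
reacting = [b + 0.15 for b in beats]       # taps 150 ms after each beat

print("anticipating leads:", mean_asynchrony(anticipating, beats) < 0)
print("reacting lags:", mean_asynchrony(reacting, beats) > 0)
```

Pairing each tap with its nearest beat is a simplification that works only when asynchronies are small relative to the beat period; with larger offsets the pairing itself becomes ambiguous.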
With the evidence for PPE in parrots now clear, several laboratories
renewed the search for evidence of synchronization abilities in nonhu-
man primates. Two new studies with rhesus macaques confirmed major
differences between the rhythmic abilities of humans and these monkeys
(Zarco et al. 2009; Merchant et al. 2011). In both studies, macaques were
trained to tap a key at a regular rate, and their behaviour was compared
to that of human participants. Despite certain similarities in error pat-
terns, monkeys were unable to synchronize to a metronomic pulse, or to
continue tapping regularly once such a pulse was removed. Furthermore,
humans typically show a distinct advantage when tapping is cued acousti-
cally rather than visually (cf. Patel et al. 2005); such a modality difference
was not seen in macaques (Zarco et al. 2009). These recent experiments
thus lend credence to the notion that human rhythmic abilities are unique
among primates.
However, a final recent primate study provides some glimmer of hope
for other primates. Three common chimpanzees were trained to tap on
alternating, briefly illuminated keys of a MIDI keyboard (Hattori,
Tomonaga, and Matsuzawa 2013). They were required to learn to tap
alternating keys, and after a minimum of thirty consecutive taps, they
received a food reward. After consistently meeting this criterion, each
individual moved on to a test stage in which a repeated distractor
note (different from the one produced by their own keyboard press)
was played at a consistent tempo (400, 500, or 600 ms inter-onset interval).
Reward was given for completing thirty taps, irrespective of any
synchronization, so while the apes were trained to tap, they were not
trained to synchronize. Nonetheless, one of the three chimpanzees, a
female named Ai, demonstrated spontaneous synchronization to this
regular distractor note, but only at the 600 ms tempo. This chimpanzee
spontaneously aligned her taps (mean phase of roughly 0) to this steady
auditory pulse. The two other chimpanzees showed no evidence of
synchronization.
Unfortunately, Ai did not show synchronization to the other two
tempos, and the authors hypothesized that her successful synchronization
to the 600 ms tempo stemmed from the fact that her spontaneous
tapping frequency was very close to this (about 580 ms). Although the
limitation to one of three animals and a single tempo suggests that chim-
panzee synchronization abilities remain quite limited compared to the
abilities of humans or parrots, they go well beyond those previously
observed in macaques. This is thus the first well-controlled primate study
demonstrating any component of PPE in a nonhuman primate, though
Ai's performance still does not approach typical human (or parrot)
levels.
15.6 Patel's Vocal Learning Hypothesis Meets Ronan the Sea Lion
Figure 15.3
Ronan, a California sea lion, bobbing her head up and down to music. See (Cook et al.
2013).
[Tree diagram labels: * True Seals (Phocidae); Eared Seals (Otariidae); * Walruses (Odobenidae); Non-Pinniped Carnivores]
Figure 15.4
The evolutionary relationships between the three main clades of pinnipeds: true seals,
walruses and otariid eared seals like sea lions. Following (Arnason et al. 2006).
which suggest vocal learning in this clade. For example, Weddell seals
(Leptonychotes weddelli) have songs that vary considerably in spectral
cues between adjacent sites in Antarctica, and each of these neighboring
populations also has its own unique call types. These and other examples
strongly suggest that vocal learning exists in several other species of
phocid seals.
In walruses, strong circumstantial evidence for vocal learning again
comes from captivity, in this case from two walruses (a male and a
female), who were trained to emit novel vocalizations for a reward
(Schusterman 2008; Schusterman and Reichmuth 2008). All the walruses
were easily trained to emit various sounds and to make up new sounds,
demonstrating considerable vocal flexibility and control. More telling,
the male walrus developed a novel (and un-reinforced) noise-making
behavior that involved buzzing a floating toy by breathing out through
it. The sound produced seemed to be intrinsically rewarding, and it also
attracted the attention of female walruses in the tank. One of the females
later learned how to do the same thing. So, while these examples do not
conclusively demonstrate walrus vocal learning, they are consistent with
excellent vocal control and flexibility, and show the capacity of walruses
to learn a novel method of sound production.
Walruses are more closely related to otariid seals than to phocid seals,
so this nested pattern of data can be explained in various ways. The
common ancestor of all pinnipeds may have evolved vocal learning, and
then otariids (apparently) lost it, or phocids and walruses may have
independently evolved vocal learning. The third possibility is that all
pinnipeds are vocal learners, and we just don't have evidence for this in
otariids like sea lions yet.
Another possibility that would realign Patel's hypothesis with the sea
lion results relies on the fact that sea lions, like most marine mammals,
can easily be trained to bring their vocalizations under operant control
(Schusterman and Feinstein 1965). While most mammals, including pri-
mates, can (with extensive training) learn to emit vocalizations to a
command, this is typically much more difficult than ordinary operant
responses (e.g., bar pressing) (Adret 1992), and it is typically very chal-
lenging to train a primate to vocalize on command. For sea lions this task
is easy, which suggests that, instead of considering vocal learning as a
binary feature, we should think of it as a continuum (Fitch and Jarvis
2013), spanning from elaborate and complex vocal learning (humans,
parrots) through good vocal control with little learning (sea lions and
most marine mammals) to very little voluntary vocal control or learning
(most mammals, including primates). By this slight modification of Patel's
hypothesis, which could then be renamed the "vocal control and rhythmic
synchronization hypothesis," the basic insight of a close mechanistic
link between PPE and vocal motor control would remain valid, and
Ronan's performance would constitute data consistent with this revised
hypothesis.
Summarizing these new animal data, it is now quite clear that the
capacity for rhythmic synchronization exists in several nonhuman species,
including at least sea lions and multiple parrot species. Crucially, and
unlike the long-known examples of entrainment in insects and frogs, both
parrots and sea lions appear to share with humans the ability to entrain
their movements to a wide range of tempos and to infer a pulse from a
complex musical surface. Although the data for Ai the chimpanzee
suggest that some modicum of synchronization abilities may be found in
at least some individual chimpanzees, her failure to generalize to new
tempi still indicates a sharp distinction from human, parrot, or sea lion
rhythmic performance. Thus human pulse perception and entrainment,
while unusual or unique among primates, is shared with these more
distant relatives, almost certainly as a result of convergent evolution.
Returning to the comparison of music and language, however, these
new comparative data remain silent about the second major component
of musical rhythm: hierarchical metrical structure. Because metrical
structure is hypothesized to be a shared aspect of music and speech, while
entrainment to an isochronic rhythm is typical only of music, the cur-
rently available animal data are relevant only to music and not the (argu-
ably more interesting) question of the biological origins of metrical
structure. There is thus a clear need for animal studies of meter perception:
we know virtually nothing at present about any species' ability to
detect different cues to stress in speech or respond to the metrical grid
in music. I thus end with a brief discussion of meter in music and lan-
guage, explaining its relevance to the biology and evolution of both
capacities, in the hope of spurring such comparative research.
several layers of rhythmic elements (beats, bars, etc.), where some ele-
ments are picked out as stronger than others at each level. Interestingly,
in many ways these two rather different reflections of meter share similar
constraints. In both cases, the hierarchical structure allows only a very
small number of subdivisions (two or three; Lerdahl and Jackendoff
1983). In speech and music there is a strong preference for regularity and
for a good fit between the sound stream and a consistent metrical grid.
This is reflected in speech by the fact that words that receive one stress
pattern when spoken alone (like "kangaróo", which takes stress on the last
syllable) can have the stress reassigned in the context of other words in
a phrase (in "kangaroo cóurt", the main stress shifts to "court" and away from
the third syllable "oo"). Finally, in both music and language, there are
multiple potential acoustic realizations of strength, including pitch,
duration, and loudness in both domains (although an additional cue, the
timbral changes in unstressed syllables, as seen in English, Dutch, or
German vowel reduction, appears to be limited to language).
Noting these similarities, Jackendoff and Lerdahl (2006) also point out
two key differences: speech meter is less regular in terms of hierarchical
structure, and also, regarding pulse timing, the beats underlying speech
are not isochronic, at least not in ordinary spoken language. These dif-
ferences suggest a clear distinction between two cognitive processing
domains: isochrony and pulse perception, on the one hand, and meter
and hierarchy perception, on the other (cf. Fitch 2013c). Of course, this
implies no strict dichotomy between music and language: poetry, nursery
rhymes, and song lyrics all occupy an intermediate zone between music
and language and are much more regular than standard speech. Fur-
thermore, even within music, the degree of isochrony varies considerably.
While dance music tends to be strongly isochronic (Temperley 2004),
much Western art music encourages a more flexible interpretation of the
pulse for expressive purposes (Repp 1998), and several known musical
styles are not isochronic at all (Frigyesi 1993; Clayton 1996). Thus isoch-
rony represents a continuum, with dance music and ordinary speech at
opposite ends. In any case, the distinction between meter and pulse
renders these isochrony issues unproblematic for Jackendoff and
Lerdahl's proposal that meter is a shared cognitive aspect of the two domains.
In a recent book comparing language and music as cognitive systems
(Rebuschat et al. 2012), three chapters explore slightly different view-
points about metrical phonology that extend the previous discussion of
Jackendoff and Lerdahl. The target article by Nigel Fabb and Morris
Halle (Fabb and Halle 2012) first introduces their recently developed
approach to poetic meter, and then compares this with both ordinary
word stress and musical rhythm. The authors conclude that these three
domains share multiple important features, particularly of grouping, but
Fabb and Halle allow various phenomena in music (such as rests) that
violate their proposed rules of linguistic stress. They consider exceptions
in poetry to be examples of extrametricality, invoked by artistic
prerogative. Fabb and Halle's theory assigns beats exclusively to syllables,
so every abstract beat must be a projection from a pronounced syllable.
In music, by contrast, rests (silences) often project to beats. Their model
thus argues against a strict identity of the cognitive underpinnings of
meter and stress in music and poetry.
In opposition to this partial sharing hypothesis, both Vaux and Myler
(2012) and Roberts (2012) argue for a strict identity between metrical
cognition in these two domains. As Vaux and Myler point out, the metri-
cal patterns of poetic verse can easily be accommodated to that of music
if, instead of allowing only for strict projection from syllables, silences
can also play the role of beats. They illustrate this with the nursery rhyme
Hickory Dickory Dock (previously discussed by Jackendoff and
Lerdahl 2006). In this rhyme, the natural way to speak the verse involves
waltz meter, but leaving a pause (like a musical rest) after dock and
clock:
x                    x                    x                    (x)
x      x      x      x      x      x      x      (x     x)     (x     x)     x
Hick-  or-    y,     dick-  or-    y,     dock,  ___                         the
x                    x                    x                    (x)
x      (x)    x      x      (x)    x      x      (x     x)     (x     x      x)
mouse         ran    up            the    clock. ___
where the parentheses indicate beats that are felt and timed, but which
do not project to any syllable (termed "catalexis" in poetry). In particular,
at the end of the last line, an entire triplet is left silent. Vaux and
Myler suggest that this requires a modification of Fabb and Halle's
model and, instead of projection from syllables, call for a model that
involves a mapping between linguistic syllables and abstract timing
slots. In such a model, they argue, several problems with Fabb and
Halle's approach disappear and, not coincidentally, the cognitive
underpinnings of musical and speech meter are indeed identical: they conclude
that "poetic metre is music."
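The timing-slot idea can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not Vaux and Myler's formalism: the names `Slot`, `LINE_1`, `LINE_2`, `bars`, `strong_beats`, and `silent_bars` are hypothetical, each line of the rhyme is encoded as twelve timing slots (four bars of waltz meter), and `None` stands for a slot that is felt and timed but carries no syllable.

```python
from typing import List, Optional

# Hypothetical encoding: a slot holds a syllable, or None when it is silent
# (felt and timed, but not pronounced).
Slot = Optional[str]

SLOTS_PER_BAR = 3  # waltz meter: three timing slots per bar

# One entry per timing slot, transcribed from the lower row of the grid.
LINE_1: List[Slot] = [
    "Hick", "or", "y",
    "dick", "or", "y",
    "dock", None, None,
    None, None, "the",
]
LINE_2: List[Slot] = [
    "mouse", None, "ran",
    "up", None, "the",
    "clock", None, None,
    None, None, None,
]


def bars(slots: List[Slot], size: int = SLOTS_PER_BAR) -> List[List[Slot]]:
    """Group the flat slot sequence into bars of `size` slots."""
    if len(slots) % size != 0:
        raise ValueError("slot count must be a multiple of the bar size")
    return [slots[i:i + size] for i in range(0, len(slots), size)]


def strong_beats(slots: List[Slot]) -> List[Slot]:
    """Return whatever occupies the first (strong) slot of each bar."""
    return [bar[0] for bar in bars(slots)]


def silent_bars(slots: List[Slot]) -> List[int]:
    """Return the indices of bars in which no slot carries a syllable."""
    return [i for i, bar in enumerate(bars(slots))
            if all(slot is None for slot in bar)]


if __name__ == "__main__":
    print(strong_beats(LINE_1))
    print(strong_beats(LINE_2))
    print(silent_bars(LINE_2))
```

Under this encoding the fourth strong beat of each line falls on a silent slot, and the last bar of the second line is entirely silent. A model that projects beats only from pronounced syllables has no slot in which to record either fact.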
15.8 Conclusions
Acknowledgements
I thank Simon Durrant, Jonah Katz, Fred Lerdahl, and Ida Toivonen for
constructively critical comments on a previous version of the manuscript,
and ERC Advanced Grant #230604 SOMACCA for financial support.
References
Adret, Patrice. 1992. Vocal learning induced with operant techniques: An
overview. Netherlands Journal of Zoology 43 (1–2): 125–142.
Alexander, Richard D. 1975. Natural selection and specialized chorusing
behavior in acoustical insects. In Insects, Science and Society, edited by David Pimentel,
35–77. New York: Academic Press.
Allott, Robin. 1989. The Motor Theory of Language Origin. Sussex: The Book
Guild.
Arbib, Michael A., ed. 2013. Language, Music, and the Brain: A Mysterious Rela-
tionship. Cambridge, MA: MIT Press.
Arcadi, Adam Clarke, Daniel Robert, and Christophe Boesch. 1998. Buttress
drumming by wild chimpanzees: Temporal patterning, phrase integration into
loud calls, and preliminary evidence for individual distinctiveness. Primates 39
(4): 505–518.
Arnason, Ulfur, Annette Gullberg, Axel Janke, Morgan Kullberg, Niles Lehman,
Evgeny A. Petrov, and Risto Väinölä. 2006. Pinniped phylogeny and a new
hypothesis for their origin and dispersal. Molecular Phylogenetics and Evolution
41 (2): 345–354.
Bernstein, Leonard. 1981. The Unanswered Question: Six Talks at Harvard
(Charles Eliot Norton Lectures). Cambridge, MA: Harvard University Press.
Brown, Steven. 2000. The Musilanguage model of music evolution. In The
Origins of Music, edited by Nils Lennart Wallin, Björn Merker, and Steven
Brown, 271–300. Cambridge, MA: MIT Press.
Buck, John. 1938. Synchronous rhythmic flashing in fireflies. Quarterly Review of
Biology 13 (3): 301–314.
Buck, John. 1988. Synchronous rhythmic flashing in fireflies. II. Quarterly Review
of Biology 63 (3): 265–289.
Calvin, William H. 1983. A stone's throw and its launch window: Timing precision
and its implications for language and hominid brains. Journal of Theoretical
Biology 104 (1): 121–135.
Clayton, Martin R. L. 1996. Free rhythm: Ethnomusicology and the study of
music without metre. Bulletin of the School of Oriental and African Studies,
University of London 59 (2): 323–332.
Cook, Peter, Andrew Rouse, Margaret Wilson, and Colleen J. Reichmuth.
2013. A California sea lion (Zalophus californianus) can keep the beat: Motor
Hoeschele, Marisa, Hugo Merchant, Yukiko Kikuchi, Yuko Hattori, and Carel
ten Cate. 2015. Searching for the origins of musicality across species. Philosophi-
cal Transactions of The Royal Society B 370 (1664): 20140094.
Honing, Henkjan. 2012. Without it no music: Beat induction as a fundamental
musical trait. Annals of the New York Academy of Sciences 1252 (1): 85–91.
Honing, Henkjan, Hugo Merchant, Gábor P. Háden, Luis Prado, and Ramón
Bartolo. 2012. Rhesus monkeys (Macaca mulatta) detect rhythmic groups in
music, but not the beat. PLoS One 7 (12): e51369.
Hulse, Stewart H., and Jeffrey Cynx. 1985. Relative pitch perception is
constrained by absolute pitch in songbirds (Mimus, Molothrus, Sturnus). Journal of
Comparative Psychology 99 (2): 176–196.
Jackendoff, Ray. 1987. Consciousness and the Computational Mind. Cambridge,
MA: MIT Press.
Jackendoff, Ray. 1999. Possible stages in the evolution of the language capacity.
Trends in Cognitive Sciences 3 (7): 272–279.
Jackendoff, Ray. 2002. Foundations of Language. New York: Oxford University
Press.
Jackendoff, Ray. 2007. A Parallel Architecture perspective on language
processing. Brain Research 1146: 2–22.
Jackendoff, Ray. 2009. Parallels and nonparallels between language and music.
Music Perception 26 (3): 195–204. Reprinted as "Music and Language," in The
Routledge Companion to Philosophy and Music, edited by Theodore Gracyk and
Andrew Kania, 101–112. New York: Routledge. 2011.
Jackendoff, Ray, and Fred Lerdahl. 1982. A grammatical parallel between music
and language. In Music, Mind, and Brain: The Neuropsychology of Music, edited
by Manfred E. Clynes, 83–117. New York: Plenum.
Jackendoff, Ray, and Fred Lerdahl. 2006. The capacity for music: What is it, and
what's special about it? Cognition 100 (1): 33–72.
Janik, Vincent M., and Peter J. B. Slater. 1997. Vocal learning in mammals. In
Advances in the Study of Behavior, vol. 26, edited by Peter J. B. Slater, Charles
T. Snowdon, Jay Rosenblatt, and Manfred Milinski, 59–99. San Diego: Academic
Press.
Janik, Vincent M., and Peter J. B. Slater. 2000. The different roles of social
learning in vocal communication. Animal Behaviour 60 (1): 1–11.
Jarvis, Erich D. 2004. Learned birdsong and the neurobiology of human language.
Annals of the New York Academy of Sciences 1016 (1): 749–777.
Katz, Jonah, and David Pesetsky. 2009. The identity thesis for language and music.
LingBuzz. http://ling.auf.net/lingbuzz/000959.
Lashley, Karl. 1951. The problem of serial order in behavior. In Cerebral
Mechanisms in Behavior: The Hixon Symposium, edited by Lloyd A. Jeffress, 112–146.
New York: Wiley.
Leakey, Richard E. 1994. The Origin of Humankind. New York: Basic Books.
Lenneberg, Eric H. 1967. Biological Foundations of Language. New York: Wiley.
Lerdahl, Fred. 2001. The sounds of poetry viewed as music. Annals of the New
York Academy of Sciences 930 (1): 337–354.
Lerdahl, Fred. 2013. Musical syntax and its relation to linguistic syntax. In
Language, Music, and the Brain, edited by Michael A. Arbib, 257–272. Cambridge,
MA: MIT Press.
Lerdahl, Fred, and Ray Jackendoff. 1983. A Generative Theory of Tonal Music.
Cambridge, MA: MIT Press.
Levman, Bryan G. 1992. The genesis of music and language. Ethnomusicology 36
(2): 147–170.
Liberman, Mark, and Alan Prince. 1977. On stress and linguistic rhythm.
Linguistic Inquiry 8 (2): 249–336.
Livingstone, Frank B. 1973. Did the Australopithecines sing? Current
Anthropology 14 (1–2): 25–29.
Martin, James G. 1972. Rhythmic (hierarchical) versus serial structure in speech
and other behavior. Psychological Review 79 (6): 487–509.
Merchant, Hugo, Wilbert Zarco, Oswaldo Pérez, Luis Prado, and Ramón N.
Bartolo. 2011. Measuring time with different neural chronometers during a
synchronization-continuation task. Proceedings of the National Academy of
Sciences 108 (49): 19784–19789.
Merchant, Hugo, Jessica Grahn, Laurel Trainor, Martin Rohrmeier, and W.
Tecumseh Fitch. 2015. Finding the beat: A neural perspective across humans and
non-human primates. Philosophical Transactions of The Royal Society B 370
(1664): 20140093.
Merker, Björn. 2000. Synchronous chorusing and human origins. In The Origins
of Music, edited by Nils Lennart Wallin, Björn Merker, and Steven Brown,
315–327. Cambridge, MA: MIT Press.
Merker, Björn. 2002. Music: The missing Humboldt system. Musicae Scientiae 6
(1): 3–21.
Miller, George A., Eugene Galanter, and Karl H. Pribram. 1960. Plans and the
Structure of Behavior. New York: Henry Holt.
Mithen, Steven J. 2005. The Singing Neanderthals: The Origins of Music, Lan-
guage, Mind, and Body. London: Weidenfeld & Nicolson.
Montagu, Ashley. 1976. Toolmaking, hunting and the origin of language. Annals
of the New York Academy of Sciences 280 (1): 266–273.
Nettl, Bruno. 2000. An ethnomusicologist contemplates universals in musical
sound and musical culture. In The Origins of Music, edited by Nils Lennart
Wallin, Björn Merker, and Steven Brown, 463–472. Cambridge, MA: MIT Press.
Nottebohm, Fernando. 1975. A zoologist's view of some language phenomena
with particular emphasis on vocal learning. In Foundations of Language
Development: A Multidisciplinary Approach, edited by Elizabeth Lenneberg, 61–103. New
York: Academic Press.
Patel, Aniruddh D. 2003. Language, music, syntax, and the brain. Nature
Neuroscience 6 (7): 674–681.
Patel, Aniruddh D. 2006. Musical rhythm, linguistic rhythm, and human
evolution. Music Perception 24 (1): 99–104.
Patel, Aniruddh D. 2008. Music, Language, and the Brain. New York: Oxford
University Press.
Patel, Aniruddh D. 2013. Sharing and nonsharing of brain resources for language
and music. In Language, Music, and the Brain: A Mysterious Relationship, edited
by Michael A. Arbib, 329–355. Cambridge, MA: MIT Press.
Patel, Aniruddh D., John R. Iversen, Micah R. Bregman, and Irena Schulz. 2009a.
Experimental evidence for synchronization to a musical beat in a nonhuman
animal. Current Biology 19 (10): 827–830.
Patel, Aniruddh D., John R. Iversen, Micah R. Bregman, and Irena Schulz. 2009b.
Studying synchronization to a musical beat in nonhuman animals. Annals of the
New York Academy of Sciences 1169 (1): 459–469.
Patel, Aniruddh D., John R. Iversen, Yanqing Chen, and Bruno H. Repp. 2005.
The influence of metricality and modality on synchronization with a beat.
Experimental Brain Research 163 (2): 226–238.
Peretz, Isabelle, Julie Ayotte, Robert J. Zatorre, Jacques Mehler, Pierre Ahad,
Virginia B. Penhune, and Benoît Jutras. 2002. Congenital amusia: A disorder of
fine-grained pitch discrimination. Neuron 33 (2): 185–191.
Peretz, Isabelle, and Max Coltheart. 2003. Modularity of music processing. Nature
Neuroscience 6 (7): 688–691.
Peretz, Isabelle, and José Morais. 1989. Music and modularity. Contemporary
Music Review 4 (1): 279–293.
Poole, Joyce H., Peter L. Tyack, Angela S. Stoeger-Horwath, and Stephen
Watwood. 2005. Elephants are capable of vocal learning. Nature 434 (7032):
455–456.
Ralls, Katherine, Patricia Fiorelli, and Sheri Gish. 1985. Vocalizations and vocal
mimicry in captive harbor seals, Phoca vitulina. Canadian Journal of Zoology 63
(5): 1050–1056.
Ravignani, Andrea. 2014. Chronometry for the chorusing herd: Hamilton's legacy
on context-dependent acoustic signalling. Biology Letters 10 (1): 20131018.
Rebuschat, Patrick, Martin Rohrmeier, John A. Hawkins, and Ian Cross, eds.
2012. Language and Music as Cognitive Systems. Oxford: Oxford University
Press.
Repp, Bruno H. 1998. A microcosm of musical expression. I. Quantitative analysis
of pianists' timing in the initial measures of Chopin's Etude in E major. Journal
of the Acoustical Society of America 104 (2): 1085–1100.
Richman, Bruce. 1993. On the evolution of speech: Singing as the middle term.
Current Anthropology 34 (5): 721–722.
Roberts, Ian. 2012. Comments and a conjecture inspired by Fabb and Halle. In
Language and Music as Cognitive Systems, edited by Patrick Rebuschat, Martin
Rohrmeier, John A. Hawkins, and Ian Cross, 51–66. Oxford: Oxford University
Press.
Rohrmeier, John A. Hawkins, and Ian Cross, 43–50. Oxford: Oxford University
Press.
Wells, Kentwood D. 1977. The social behaviour of anuran amphibians. Animal
Behaviour 25 (3): 666–693.
Westphal-Fitch, Gesche, and W. Tecumseh Fitch. Forthcoming. Towards a com-
parative approach to empirical aesthetics. In Art, Aesthetics and the Brain, edited
by Marcos Nadal. Oxford: Oxford University Press.
Williams, Leonard. 1967. The Dancing Chimpanzee: A Study of the Origins of
Primitive Music. New York: Norton.
Wright, Anthony A., Jacqueline J. Rivera, Stewart H. Hulse, Melissa Shyan, and
Julie J. Neiworth. 2000. Music perception and octave generalization in rhesus
monkeys. Journal of Experimental Psychology: General 129 (3): 291–307.
Zarco, Wilbert, Hugo Merchant, Luis Prado, and Juan Carlos Mendez. 2009.
Subsecond timing in primates: Comparison of interval production between
human subjects and rhesus monkeys. Journal of Neurophysiology 102 (6):
3191–3202.
Zeman, Adam, Fraser Milton, Alicia Smith, and Rick Rylance. 2013. By heart:
An fMRI study of brain activation by poetry and prose. Journal of Consciousness
Studies 20 (9–10): 132–158.
16 Neural Substrates for Linguistic and Musical Abilities:
A Neurolinguist's Perspective1
Yosef Grodzinsky
It is tempting to say that musical and linguistic abilities, likely among the
hallmarks of humanity, are similar. What comes to mind are not only
formal properties and processing routines that these two abilities may
share, but also common brain mechanisms. In this chapter, I consider the
logic of inquiry and the current state of empirical evidence as they
pertain to the quest for common neural bases for language and music. I
first try to enumerate the properties that any cognitive ability akin
to language should possess (section 16.1), and move to a brief consider-
ation of the neurological argument for the modularity of language from
music (section 16.2). I then proceed to a critical review of studies that
have investigated gross double dissociations between music/language
(section 16.3). In section 16.4, I focus on studies of pitch discrimination
in amusia, which I critique (section 16.5). In section 16.6, I propose a
novel experimental paradigm for the study of pitch in language. In
response to past critiques, I show that this paradigm overcomes them.
The paradigm, which I present in detail, is based on semantic considerations,
specifically on the claim that "only" associates with focus (expressed
via pitch accent). When an element in a sentence is focused, a set of
alternative meanings emerges; "only" is a function that picks certain
alternatives out of the focus set, and negates them. This paradigm helps to
create minimal sentence pairs that need not be compared in order to test
sensitivity to pitch accent. Rather, they can be investigated separately.
This property of the materials helps the new paradigm get around criti-
cisms raised in the literature by Patel and his colleagues. I conclude
(section 16.7) by alluding to salient properties of the speech of a famous
amusical individual.
It is most pleasing to use this space for a discussion of focus in the
context of music/language modularity, as these are two areas of inquiry
to which Ray Jackendoff, an early teacher/mentor of mine, has made
How can we tell that two (or more) classes of behaviors belong in the
same cognitive unit? We must ask whether they are governed by the
same set of building blocks and rules that combine them, structural con-
straints on such combinations, and algorithms that implement them in
use. Osherson (1981) puts it very succinctly:
. . . let C1 and C2 be two classes of processes and structures that conform to two
sets of interlocking and explanatory principles, P1 and P2, respectively. If the
properties of C1 can be proved not to be deducible from P2, and likewise for C2
and P1, then distinct faculties are (provisionally) revealed. (241–242)
16.2 The Neurological Argument for the Separability of Language from Music
Measured variable A +
Measured variable B +
sentences, which they compared to songs. Their study had the following
schematic design:
Loci for language were found in the left inferior frontal gyrus (LIFG,
roughly Broca's region), left middle frontal gyrus, and left anterior,
middle, and posterior temporal regions, as well as the angular gyrus.5
Music areas were found in both hemispheres, from right and left anterior
and posterior temporal regions to right and left premotor and
supplementary motor areas. Again, a double dissociation is demonstrated, but not
as sharply as one would have wished.
We might examine the results of these studies: whether they evince
neuroanatomical overlap between language and music, and whether we
observe a match between the lesion studies and those in health, or even
anatomical congruence between the two sets of fMRI studies. However,
before looking at the results, we might question the choice of tasks,
materials, and contrasts:
I. Are the musical and linguistic materials and contrasts uniform? If dif-
ferent studies use different types of contrasts, why would one expect the
resulting errors (in the case of lesion work) or activation patterns (in
health) to be similar in the first place?
II. Are the musical and linguistic tasks matched? The specificity/
modularity agenda requires use of parallel methodology and reasoning
across cognitive domains.
III. How do the tests connect to linguistic and musical structure? The
interest in the relation between language and music stems from the belief
that linguistic and musical strings are structured and governed by rules.
We also know that the neuropsychology of other domains indicates
complex symptomatology that differentiates between different syndrome
types within each domain. How does this structural complexity enter into
the considerations here?
Reviewing the studies above, we begin with the neuropsychological
cases. G.L. and J.C. received a mixed bag of tests. G.L.'s linguistic abilities
were assessed through the Token Test (De Renzi and Vignolo 1962).
Ayotte et al. found that amusical individuals were near normal in dis-
criminating between these sentences. Their success here, contrasted with
their failure on the musical discrimination task, led Ayotte et al. to con-
clude that music and language are modular from one another.
Patel et al. (2008) and Liu et al. (2010) disagree with this conclusion.
To them, the high performance on (11) is not particularly telling.
Indeed, those individuals who had serious trouble with the musical
comparison task were also not good at distinguishing questions from
statements as in (12)–(13). Patel and colleagues conclude that this
result, the cross-modal co-occurrence of failures, argues against
domain specificity, as the musical deficit co-occurs with a linguistic one.
Still, the conclusion reached by Patel and his colleagues may be a bit
hasty. As the stakes are high (at issue is music/language modularity), I
would like to revisit Ayotte et al.'s results for (11), and see whether a
different interpretation is possible. I will then propose a way to get
around the experimental problems noted by Patel and his colleagues, one
that might lead to an improved test, with the hope of obtaining a some-
what higher resolution than previous studies.
Patel and his colleagues reject the relevance of the focus discrimination
test in (11), arguing that those in (12)–(13) are more informative.
But there may be reasons to take the opposite view: to argue that
in fact the amusical subjects' success on the contrast in (11) is a
better benchmark of their pitch identification in the linguistic context
than their failure in (12)–(13). In what follows, I will try to argue for
the latter view. A successful argument would hopefully reopen the
possibility of an empirical argument in favor of language/music
modularity.
To begin with, let me note that amusical subjects, said to fail in recogni-
tion and imitation tasks with familiar musical pieces, reportedly have
normal communicative skills.7 And yet, if the failure of these individuals
to discern a question from a statement as in (11)–(12) is indicative of a
communication deficit, why is it not manifest in their daily linguistic
functioning? It is true that many communicative acts contain many
cues beyond pitch regarding semantic type, but there surely are instances
in which such a discrimination deficit would manifest in communication.
As Liu et al. (2010) point out, amusics rarely report problems outside
the musical domain, but proceed to suggest that it may be expected
that these individuals would struggle with aspects of spoken language
that rely on pitch-varying information (1682). Curiously, while the
amusical subjects' performance level on the imitation task was lower
than normal (only 87 percent correct), it was much higher than their
chance performance in identification or discrimination. While Liu et al.
acknowledge the absence of noticeable communicative deficits in amusi-
cal subjects, they nonetheless insist that pitch deficits can be behavior-
ally relevant to both speech and music (1691), offering no further
discussion.
Next, consider the argument that linguistic pitch is carried by meaning-
and form-bearing objects, whereas musical pitch is not. Patel (2012)
proposed this distinction to account for the amusical subjects' success in
the focus discrimination condition (11). The idea is that pitch is linked
to a word (perhaps to a syllable), whereas pitch contours without lan-
guage are not, and that this link might have eased memory demands.
While this might be the case, a question immediately arises: why
are the same subjects worse with question/statement pairs, which
also have syllabic, lexical, and propositional content? Moreover, do we
understand the reasons behind the differential performance found
between discrimination/identification and imitation of questions and
statements?
In this section, I will put forward a simple proposal for an improved pitch
test in the linguistic domain, one that would get around Patel's tagging
critique of Ayotte et al.'s focus discrimination study, and would also be
on a par with the typical musical recognition task on which amusical
subjects fail. The goal here is to situate the tasks in a more naturalistic
context, which would not require a comparison (made easy by tagging),
and moreover would not be taxing in a way that isn't necessarily relevant
to communication. For that purpose, I will propose a task in which pitch
is required for linguistic (as opposed to meta-linguistic) analysis in sen-
tence comprehension. That is, as amusical subjects fail in simple tasks in
which pitch is crucial, namely recognition of familiar musical pieces (let
alone singing them or detecting deviations from melodic lines), we might
want to create a linguistic analogue, in which difference in pitch accent
would be crucial for language use. Natural candidate tasks involve com-
prehension, question answering, or verification. I will focus on the latter,
in the hope of finding a test for Patel's claim that the discrimination
between different sites of pitch accent within a sentence is not a valid
test of sensitivity to pitch.
In many of the world's languages (though by no means all), semantic
focus is triggered by pitch accent. Semantic focus evokes a set of alterna-
tives, picks out one, and makes it more salient. Ray Jackendoff made
an early contribution to the analysis of focus in the generative frame-
work, analyzing it through the use of the structured-meaning approach;
later, alternative semantics was introduced (Rooth 1985; 1992), which is
what guides my brief presentation below. In (15a), we have a sentence
p, and focus on an element within p evokes a set of alternatives whose
members are all propositions that John introduced Bill to someone in
the context.
Simplifying somewhat, assume a context C that features a scenario
in which John (and only he) is introducing people to one another, and
where the other participants are Bill, Mary, Betty, and Sue. Focus on
Sue, conveyed through pitch accent, asserts the proposition p (15b),
and in addition gives rise to additional, focus semantic value by allowing
Sentences (15a) and (16a) make the same assertion p, but differ in
their focus semantic value, as Ac^SUE ≠ Ac^BILL. Namely, the focus value
evoked when the pitch accent is on Bill or Sue is different. Thus a
scenario in which John introduced Bill to someone, who happened to
be Sue, is compatible with (15) but not (16), whereas a situation in
which John introduced someone to Sue, and that someone was Bill, is
compatible with (16) but not (15). The respective acceptability judgments
follow.
The meaning differences between (15) and (16) may seem somewhat
murky, because focus makes a certain alternative more salient than
others, and the notion of salience is somewhat difficult to capture.
However, matters become crystal clear when only is introduced as an
element that associates with focus (Rooth 1985). Sentential only is a
function that combines with a sentence p and a set of alternatives Ac that
focus evokes (i.e., the set of all non-weaker alternative propositions to p
that is supplied by C), and returns a set of propositions in which all but
p are negated. The result is a sentence that asserts p, where all the other
alternatives in Ac are false (Rooth 1985; Fox 2007):
(18) John only introduced Bill to SUE (and to no other individual present)
d. It is true that John introduced Bill to Sue, but it is false that John
introduced Bill to Betty, and it is false that John introduced Bill to
Mary
(19) John only introduced BILL to Sue (and to no other individual present)
d. It is true that John introduced Bill to Sue, but it is false that John
introduced Betty to Sue, and it is false that John introduced Mary to
Sue
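The contrast between (18) and (19) can be illustrated with a minimal Python sketch. It assumes that a proposition of the form "John introduced X to Y" can be modeled as a pair (X, Y), that the alternatives are logically independent of one another, and that a scenario is simply the set of introductions that took place. The names PEOPLE, alternatives, only, and the scenario itself are hypothetical and serve only as illustration.

```python
# Hypothetical context C: John introduces people; the others are listed here.
PEOPLE = ("Bill", "Mary", "Betty", "Sue")


def alternatives(assertion, focus):
    """Return the focus alternatives of 'John introduced X to Y'.

    assertion is a pair (introduced, recipient); focus names the accented slot.
    """
    introduced, recipient = assertion
    if focus == "recipient":      # pitch accent on the second object, as in (18)
        return {(introduced, y) for y in PEOPLE if y != introduced}
    if focus == "introduced":     # pitch accent on the first object, as in (19)
        return {(x, recipient) for x in PEOPLE if x != recipient}
    raise ValueError(f"unknown focus position: {focus}")


def only(assertion, focus, scenario):
    """True iff the assertion holds and every other alternative is false."""
    others = alternatives(assertion, focus) - {assertion}
    return assertion in scenario and not (others & scenario)


# Invented scenario: John introduced Bill to Sue and Mary to Sue, nothing else.
scenario = {("Bill", "Sue"), ("Mary", "Sue")}
p = ("Bill", "Sue")

verdict_18 = only(p, "recipient", scenario)    # John only introduced Bill to SUE
verdict_19 = only(p, "introduced", scenario)   # John only introduced BILL to Sue
print(verdict_18, verdict_19)                  # prints: True False
```

In this invented scenario the same string of words is judged true with the accent on Sue and false with the accent on Bill, so a correct verdict depends on locating the pitch accent.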
We can now see that although (18) and (19) make the same assertion p,
they have different truth-conditions, because pitch accent marks a differ-
ent element in each case, thereby evoking a different set of alternatives.
Only then negates every proposition q ≠ p:
The reader may have noticed that the examples chosen above all involve
a ditransitive predicate (introduce). This is done on purpose, in order to
make the meaning contrast that different focus choices produce as
minimal as possible. The idea here is to create a task whose performance
requires sensitivity to pitch accent, and where pitch accent is placed on
elements that are syntactically and semantically on a par, modulo the
task at hand.10 Only needs focus, and our task would include a possible
position for association with focus on each of the two objects of the
ditransitive verb. Normal performance on a verification task, given a
scenario, would require the identification of focus location, which would
occur in the absence of a comparison between two representations. In
this task, tagging, as postulated by Patel and his colleagues, is not
possible.
Let me provide a concrete example of how this meaning contrast is
produced:
The above sketch makes it quite clear, I hope, that this setup (the
association of only with focus) allows for the testing of sensitivity to pitch
accent in a task that does not require discrimination. When the right
controls are introduced (and there are many, to be sure), this should
allow for testing through a verification (truth-value judgment) task. It is
equally easy to imagine, I think, a production task with scenarios like
(21) and (22), in which amusical subjects would be forced to use only,
and the issue would be whether or not they can successfully use pitch
accent to mark the associated focus.
An implementation of this proposal is presently unavailable. What is
important about it is that the above does not enable tagging, because
no comparison or discrimination between two utterances is required.
Patel et al. would predict that amusical subjects would fail in this verifica-
tion task. Failure on their part would provide strong empirical evidence
against the modularity of language and music. And thus, while at present
no relevant result is available, the jury appears to be still out on the
modularity of language and music, at least until a result of the proposed
experiment, or some related one, is obtained.
16.7 Coda
I tried to revive the notion that amusia, as reported in the clinical
literature, does not co-occur with a language deficit (contra Liu et al. [2010]).
One anecdotal, yet not insignificant, observation relates to the famous
late economist Milton Friedman, believed to be amusical. His fame
allows us to have access to speech samples of his. An important example
is an interview on "Greed" he granted Phil Donahue in 1979.11 If you
haven't seen it, I would urge you to do so, for Friedman's especially
expressive intonation, containing many questions and exclamations
(apparently intended to make his argumentation more convincing), might
make a compelling case for language/music modularity.
Notes
1. An earlier version of this paper was presented at "Music and Brains: The
Surprising Link," an ELSC/ICNC conference at Mishkenot Sha'ananim, Jerusalem,
the Hebrew University of Jerusalem, February 10th, 2013. I would like to thank
the organizers, Eli Nelken, Ronny Granot, and Nori Jacoby, for their kind
invitation. I also thank the following agencies and institutions for their
support: Edmond and Lily Safra Center for Brain Sciences, Canada Research Chairs
(CRC), and the Canadian Social Science and Humanities Research Council
(SSHRC). Eli Nelken's comments and the crucial help of Michael Wagner,
Director of McGill's Prosodylab, are also gratefully acknowledged.
2. This evidence, however, is mostly based on work at the single word level, while
the linguistic perspective focuses on operations that form larger expressions from
more basic units (Varley, Klessinger, Romanowski, and Siegal [2005] being a
possible exception). See Heim et al. (2012), Deschamps et al. (under review) for
further evidence that bears on this issue.
References
Ayotte, Julie, Isabelle Peretz, and Krista Hyde. 2002. Congenital amusia: A
group study of adults afflicted with a music-specific disorder. Brain 125 (2):
238–251.
Bernstein, Leonard. 1973. The Unanswered Question. Cambridge, MA: Harvard
University Press.
Brannon, Elizabeth M. 2005. The independence of language and mathematical
reasoning. Proceedings of the National Academy of Sciences of the United States
of America 102 (9): 3177–3178.
Changeux, Jean-Pierre, and Alain Connes. 1998. Conversations on Mind, Matter,
and Mathematics. Edited and translated by M. B. DeBevoise. Princeton, NJ:
Princeton University Press.
Chomsky, Noam. 1988. Language and Problems of Knowledge: The Managua
Lectures. New York: Cambridge University Press.
Cohen, Laurent, and Stanislas Dehaene. 2000. Calculating without reading:
Unsuspected residual abilities in pure alexia. Cognitive Neuropsychology 17 (6):
563–583.
De Renzi, Enio, and Luigi Vignolo. 1962. The Token Test: A sensitive test to detect
receptive disturbances in aphasics. Brain 85 (4): 665–678.
Deschamps, Isabelle, Galit Agmon, Yonatan Loewenstein, and Yosef Grodzinsky.
Under review. Quantities and quantifiers: Weber's law, monotonicity and
modularity. MS. McGill University and The Hebrew University, Jerusalem.
Fadiga, Luciano, Laila Craighero, and Alice Roy. 2006. Broca's region: A speech
area? In Broca's Region, edited by Yosef Grodzinsky and Karin Amunts, 137–152.
New York: Oxford University Press.
Fazio, Patrik, Anna Cantagallo, Laila Craighero, Alessandro D'Ausilio, Alice C.
Roy, Thierry Pozzo, Ferdinando Calzolari, Enrico Granieri, and Luciano Fadiga.
2009. Encoding of human action in Broca's area. Brain 132 (7): 1980–1988.
Fedorenko, Evelina, Josh McDermott, and Nancy Kanwisher. 2012. Sensitivity to
musical structure in the human brain. Journal of Neurophysiology 108 (12):
3289–3300.
Fodor, Jerry. 1983. The Modularity of Mind. Cambridge, MA: MIT Press.
Fox, Danny. 2007. Free choice disjunction and the theory of scalar implicatures.
In Presupposition and Implicature in Compositional Semantics, edited by Uli
Sauerland and Penka Stateva, 71–120. New York: Palgrave Macmillan.
Gelman, Rochel, and Brian Butterworth. 2005. Number and language: How are
they related? Trends in Cognitive Sciences 9 (1): 6–10.
Grodzinsky, Yosef. 2006. The language faculty, Broca's region, and the mirror
system. Cortex 42 (4): 464–468.
Grodzinsky, Yosef. 2013. The mirror theory of language: A neuro-linguist's
perspective. In Language Down the Garden Path: The Cognitive and Biological Basis
for Linguistic Structure, edited by Montserrat Sanz, Itziar Laka, and Michael
Tanenhaus, 333–347. Oxford: Oxford University Press.
Grodzinsky, Yosef, and Lisa Finkel. 1998. The neurology of empty categories:
Aphasics' failure to detect ungrammaticality. Journal of Cognitive Neuroscience
10 (2): 281–292.
Henschen, Salomon Eberhard. 1920. Klinische und anatomische Beiträge zur
Pathologie des Gehirns. Stockholm: Nordiska Bokhandeln.
Heim, Stefan, Katrin Amunts, Dan Drai, Simon Eickhoff, Sara Hautvast, and
Yosef Grodzinsky. 2012. The language-number interface in the brain: A complex
parametric study of quantifiers and quantities. Frontiers in Evolutionary
Neuroscience 4 (4): 1–12.
Hyde, Krista L., Robert J. Zatorre, and Isabelle Peretz. 2011. Functional MRI
evidence of an abnormal neural network for pitch processing in congenital
amusia. Cerebral Cortex 21 (2): 292–299.
Jackendoff, Ray. 1972. Semantic Interpretation in Generative Grammar.
Cambridge, MA: MIT Press.
Jackendoff, Ray. 1983. Semantics and Cognition. Cambridge, MA: MIT Press.
Jackendoff, Ray. 2009. Parallels and non-parallels between language and music.
Music Perception 26 (3): 195–204.
Jentschke, Sebastian, and Stefan Koelsch. 2008. Musical training modulates the
development of syntax processing in children. NeuroImage 47 (2): 735–744.
Katz, Jonah, and David Pesetsky. 2011. The identity thesis for language and music.
MS. Institute Jean Nicod and MIT.
Kimura, Doreen. 1973a. Manual activity during speaking–I. Right-handers.
Neuropsychologia 11 (1): 45–50.
Kimura, Doreen. 1973b. Manual activity during speaking–II. Left-handers.
Neuropsychologia 11 (1): 51–55.
Koelsch, Stefan. 2005. Neural substrates of processing syntax and semantics in
music. Current Opinion in Neurobiology 15 (2): 207–212.
Lerdahl, Fred, and Ray Jackendoff. 1983. A Generative Theory of Tonal Music.
Cambridge, MA: MIT Press.
Liu, Fang, Aniruddh D. Patel, Adrian Fourcin, and Lauren Stewart. 2010.
Intonation processing in congenital amusia: Discrimination, identification, and
imitation. Brain 133 (6): 1682–1693.
Nelken, Israel. 2011. Music and the auditory brain: Where is the connection?
Frontiers in Human Neuroscience 5: 106.
Osherson, Daniel N. 1981. Modularity as an issue for cognitive science. Cognition
10 (1–3): 241–242.
Patel, Aniruddh. 2012. Language, music, and the brain: A resource-sharing
framework. In Language and Music as Cognitive Systems, edited by Patrick Rebuschat,
Martin Rohrmeier, John A. Hawkins, and Ian Cross, 204–223. Oxford: Oxford
University Press.
Patel, Aniruddh, Meredith Wong, Jessica Foxton, Aliette Lochy, and Isabelle
Peretz. 2008. Speech intonation perception deficits in musical tone deafness
(congenital amusia). Music Perception 25 (4): 357–368.
Peretz, Isabelle. 1993. Auditory atonalia for melodies. Cognitive Neuropsychology
10 (1): 21–56.
Peretz, Isabelle, and Max Coltheart. 2003. Modularity of music processing. Nature
Neuroscience 6: 688–691.
Pulvermüller, Friedemann, and Luciano Fadiga. 2010. Active perception:
Sensorimotor circuits as a cortical basis for language. Nature Reviews Neuroscience 11
(5): 351–360.
Rizzolatti, Giacomo, and Michael Arbib. 1998. Language within our grasp. Trends
in Neurosciences 21 (5): 188–194.
Rooth, Mats. 1985. Association with Focus. PhD diss., University of
Massachusetts, Amherst.
Rooth, Mats. 1992. A theory of focus interpretation. Natural Language Semantics
1 (1): 75–116.
Schuell, Hildred. 1965. Minnesota Test for Differential Diagnosis of Aphasia.
Minneapolis, MN: University of Minnesota Press.
Schnupp, Jan, Israel Nelken, and Andrew J. King. 2011. Auditory Neuroscience:
Making Sense of Sound. Cambridge, MA: MIT Press.
Stewart, Lauren. 2008. Fractionating the musical mind: Insights from congenital
amusia. Current Opinion in Neurobiology 18 (2): 127–130.
Varley, Rosemary A., Nicolai J. C. Klessinger, Charles A. J. Romanowski,
and Michael Siegal. 2005. Agrammatic but numerate. Proceedings of the National
Academy of Sciences of the United States of America 102 (9): 3519–3524.
Venezia, Jonathan, and Greg Hickok. 2009. Mirror neurons, the motor system,
and language: From the motor theory to embodied cognition and beyond.
Language and Linguistics Compass 3 (6): 1403–1416.
17 Structure and Ambiguity in a Schumann Song
Fred Lerdahl
17.1 Introduction
Figure 17.1
Form of generative music theory
Figure 17.2
A flowchart of GTTM's components (musical surface, grouping, meter, time-span
segmentation, time-span reduction, stability conditions, prolongational
reduction)
Figure 17.3
Schumann, "Im wunderschönen Monat Mai" (the first song of Dichterliebe)
figure 17.4 gives the German text and English translation.2 Each poetic
strophe is set to the same music, with the unresolved piano introduction
repeating at the end to produce the sense of a fragmentary, unbounded
structure. This striking formal feature reflects the poet's emotions, which
are torn between hope and doubt that his love will be reciprocated. His
uncertainty is also mirrored in the song's ambiguous tonality. The
beginning and ending imply F# minor, but this tonic never arrives, and the
song does not resolve. Each vocal stanza begins firmly in A major but
Als alle Knospen sprangen, when all the buds were bursting,
Als alle Vögel sangen, when all the birds were singing,
Figure 17.4
The text of "Im wunderschönen Monat Mai" (by Heinrich Heine)
Figure 17.5
Metrical and grouping analysis of the first vocal phrase (bars 5–8)
Figure 17.6
Grouping overlap and hypermetrical ambiguity in bars 1–15
Figure 17.5 provides a rhythmic analysis of the first vocal phrase of the
song. The grouping brackets parse the phrase into two halves. The
metrical grid represents strong and weak beats by a dot notation.3 If a beat is
strong at one level, it is also a beat at the next larger level. The note
values to the left of the grid register the distance between beats at each
level. Notice that the grouping boundaries are slightly out of phase with
the time spans between beats, showing an upbeat of one 16th note to bar
1 and three 16th notes to bar 3.
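The nesting condition on the metrical grid (a strong beat at one level is a beat at the next larger level) can be sketched in Python. This is only an illustration under stated assumptions: the grid below covers two invented bars of 2/4 counted in 16th notes rather than the grid of figure 17.5, and the names build_grid, is_well_formed, and strength are hypothetical.

```python
def build_grid(length, spacings):
    """One set of beat positions per metrical level, smallest level first."""
    return [set(range(0, length, spacing)) for spacing in spacings]


def is_well_formed(grid):
    """Every beat at a larger level must also be a beat at the next smaller level."""
    return all(larger <= smaller for smaller, larger in zip(grid, grid[1:]))


def strength(grid, position):
    """Number of levels at which the position is a beat (its column of dots)."""
    return sum(position in level for level in grid)


# Two hypothetical bars of 2/4 counted in 16th notes:
# beats every 16th, 8th, quarter, and half note (the bar level).
grid = build_grid(16, [1, 2, 4, 8])

# Print the grid in dot notation, smallest level first.
for level in grid:
    print("".join("." if t in level else " " for t in range(16)))

print(is_well_formed(grid))
print([strength(grid, t) for t in (0, 3, 4, 8)])
```

A position with more dots in its column is metrically stronger; here the downbeats of both bars carry a dot at every level.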
Figure 17.6 shows a rhythmic analysis of the first stanza, ignoring beats
beneath the bar level. Throughout the song, two-bar groups combine to
form four-bar groups, but behind this simple pattern lies a complication.
The two-bar group in bars 9–10 repeats sequentially in bars 11–12. Yet
bars 12–13 also form a two-bar group, echoing bars 1–2 with the bass line
D–C# and a progression into the dominant of F# minor. Thus bar 12 both
ends one group and begins another, producing an overlap. The figure also
displays two plausible hypermetrical interpretations.4 On one hand, there
is a preference for hearing strong beats early in a group, favoring
interpretation A. On the other hand, the relative harmonic stability of
even-numbered bars, together with the crescendos into bars 10 and 12
and the longer harmonic duration in those bars, supports interpretation
B. Further, the grouping overlap in bar 12 causes a metrical shift (or
deletion, as indicated by the dots in parentheses), for under either
interpretation A or interpretation B, the listener hears a metrical pattern in
bars 12–13 and 14–15 parallel to that in bars 1–2 and 3–4. For some
Figure 17.7
Ambiguity in the global grouping structure. Panel (a) labels the four-bar
groups A1 B1 C1 A2 B2 C2 A3.
Figure 17.8
Structural beginnings and cadences in time-span reduction
Figure 17.9
Time-span reduction of bars 1–15 on the interpretation that the song is in F# minor
ending the second phrase; similarly for the third and fourth phrases. At
the sixteen-bar level, all that remains are [b] launching the group and [c]
ending it.
Figure 17.9 shows a time-span reduction of bars 1–15 on the
interpretation that the global tonic is F# minor.5 (Later I shall consider the
alternative of A major.) Level f reduces the 16th-note musical surface to 8th notes.
Level e in turn eliminates embellishing events at level f to yield a quarter-
note sequence. Level d continues the process to the half-note level and
level c to the two-bar level. Two-bar groupings are shown beneath level
c. The overlap in bar 12 is represented by two events, a D major arrival
for the previous phrase and a C# dominant 7th for the ensuing phrase.
Levels b and a eliminate less structural events at the four- and eight-bar
levels. The dominant 7ths of F# minor dominate the entire structure
because they act as the structural beginning and cadence of the largest
groups.
Prolongational analysis is represented by a tree structure in which
right branching signifies a tensing motion and left branching a relaxing
motion. In figure 17.10a, dominating event x tenses into subordinate
event y; in figure 17.10b, subordinate x relaxes into dominating y. The tree
notation is an adaptation from syntactic trees in linguistics, but without
syntactic categories. Prolongational trees are often accompanied by a
formally equivalent notation in slurs. The slurs coordinate with branch-
ings. Dashed slurs are reserved for repetitions.
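The branching convention can be made concrete with a small Python sketch. It assumes that a single branch is fully described by its dominating event, its subordinate event, and their temporal order; the class name Branch and its fields are hypothetical and are not part of the theory's own notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    head: str         # dominating event
    dependent: str    # subordinate event
    head_first: bool  # True if the dominating event precedes the subordinate one

    @property
    def direction(self):
        # A subordinate event that follows its head attaches as a right branch.
        return "right" if self.head_first else "left"

    @property
    def motion(self):
        # Right branching signifies tensing, left branching relaxing.
        return "tensing" if self.head_first else "relaxing"

    def progression(self):
        """The two events in temporal order."""
        if self.head_first:
            return (self.head, self.dependent)
        return (self.dependent, self.head)


# The two configurations of figure 17.10: in (a) x dominates y, in (b) y dominates x.
case_a = Branch(head="x", dependent="y", head_first=True)
case_b = Branch(head="y", dependent="x", head_first=False)

for case in (case_a, case_b):
    print(case.progression(), case.direction, case.motion)
```

Both cases describe the same surface progression from x to y; only the choice of dominating event differs, and with it the direction of branching and the tensing or relaxing reading.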
A prolongational analysis derives from global to local levels of its
associated time-span reduction via the interaction principle illustrated in
Figure 17.10
The branching notation for prolongational reduction. In (a), y is subordinate to x, and the
progression from x to y is a tensing motion. In (b), x is subordinate to y, and the progression
from x to y is a relaxing motion.
Figure 17.11
Schematic diagram of the interaction principle (time-span reductional levels a–d
alongside prolongational reductional levels a–d)
Figure 17.12
Prolongational analysis of bars 1–13 on the interpretation that the song is in F# minor
of the song, yet in this analysis they are all unstable. A prolongational
analysis selects stability over salience.
The lower system in figure 17.12 removes repetitions to bring out the
basic harmonic and linear motion. The C#7 chords dominate the struc-
ture. As half cadences they point to F# minor as the global tonic. At level
b, the first dominant 7th progresses to the local tonic of A major, which
then elaborates into the region of B minor. The sequenced modulation
to D major emerges at level c. The dashed branch to the D major chord
in bar 12 receives a double branch because of the grouping overlap dis-
cussed earlier. Its second branch reflects a reinterpretation of that event
as the predominant of F# minor.
At the bottom of figure 17.12 there is a functional harmonic analysis
employing the symbols of T for tonic function, S for subdominant or
predominant function, and D for dominant function.6 Another, Dep,
signifies departure. These symbols represent not chords per se but their
prolongational role: Dep for the branching that departs from the
superordinate event, D for the branching that attaches to or points to T,
and S for the branching that attaches to D.
Figure 17.13
Normative prolongational and functional structure: T Dep S D T, with [c] marking
the cadential D to T. "Dep" stands for "departure," [c] for "cadence," usually V–I.
Figure 17.14
Time-span reduction of bars 1–15 on the interpretation that the song is in A major
The prolongational and functional analysis of most phrases takes the
form of figure 17.13: a T prolongation elaborated by a departure, followed
by S that moves into a two-membered cadence, D to T. Whatever else
happens in the phrase, this pattern usually occurs, for it efficiently
projects a tensing-relaxing pattern. In a half cadence, the final T is omitted
from the schema, and occasionally the opening T is absent. Another
variant is the absence of S. The more a phrase deviates from the schema,
the less stable the overall structure. This normative branching and
functional schema also takes place at grouping levels larger than the phrase.
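As a rough illustration, the schema and its variants can be written down as a lookup table in Python. The sketch assumes that the three omissions mentioned above (opening T, S, final T) may combine freely, which the text does not state, and it treats any other sequence as lying outside the schema. The names NORMATIVE, OPTIONAL, variants, and classify are hypothetical.

```python
from itertools import product

NORMATIVE = ("T", "Dep", "S", "D", "T")

# Positions in NORMATIVE that the text describes as omissible.
OPTIONAL = {0: "opening T", 2: "S", 4: "final T"}


def variants():
    """Map each permitted function sequence to the list of members it omits."""
    table = {}
    for keep in product([True, False], repeat=len(OPTIONAL)):
        dropped = {pos for pos, kept in zip(OPTIONAL, keep) if not kept}
        sequence = tuple(f for pos, f in enumerate(NORMATIVE) if pos not in dropped)
        table[sequence] = [OPTIONAL[pos] for pos in sorted(dropped)]
    return table


def classify(functions):
    """Return the omitted members, or None if the sequence is not a variant."""
    return variants().get(tuple(functions))


print(classify(["T", "Dep", "S", "D", "T"]))  # full schema
print(classify(["T", "Dep", "S", "D"]))       # half cadence
print(classify(["D", "Dep", "S", "D"]))       # a D-to-D frame is not a variant
```

The length of the returned list gives a crude count of deviations; how such a count would relate to perceived stability is not specified here.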
The analysis in figure 17.12 achieves a version of normative
prolongational structure but with an unorthodox functional progression. The
framing prolongation is not T to T but D to D, and the primary departure
in bar 6 is, at a smaller level, T of a related key. At a global level only S
to D in bars 12–13 is standard. This unusual realization of normative
structure weakens the sense of F# minor as global tonic.
The theory derives the alternative global tonic of A major if only one
change is made in the time-span reduction: not labeling the C#7
chord in bar 2 (and its repetitions) as half-cadential. The revised
time-span reduction in figure 17.14 takes this step. Its justification is that bars
1–2 alone do not firmly establish F# minor. With the removal of the initial
Figure 17.15
Prolongational and functional analyses of bars 1–2: (a) if the C# 7th chord is treated as a
half-cadence in F# minor; (b) if the C# 7th chord is treated as not cadential but as a
chromatic deviation within A major.
Figure 17.16
Prolongational analysis of bars 1–17 on the interpretation that the song is in A major
GTTM leaves the conditions for pitch stability, which are needed to
construct a prolongational analysis, in an imprecise state. TPS resumes
this thread to develop a quantitative model of pitch stability that
correlates with, and in a sense explains, Carol Krumhansl's well-established
empirical data on the relatedness of pitches, chords, and keys (Krumhansl
1990). TPS calculates relatedness in terms of cognitive distance and
provides a quantitative treatment of tonal tension and relaxation.
(a)
level a: C (C)
level b: C G (C)
level c: C E G (C)
level d: C D E F G A B (C)
level e: C C# D D# E F F# G G# A Bb B (C)
(b)
level a: 0 (0)
level b: 0 7 (0)
level c: 0 4 7 (0)
level d: 0 2 4 5 7 9 11 (0)
level e: 0 1 2 3 4 5 6 7 8 9 10 11 (0)
Figure 17.17
Basic diatonic space: (a) using note-letter names; (b) in numerical format (C = 0, C# =
1, . . . B = 11). Both (a) and (b) are oriented to I/C.
δ(x → y) = i + j + k,
where δ(x → y) = the distance between chord x and chord y;
i = the number of moves on the cycle of fifths at level (d);
j = the number of moves on the cycle of fifths at levels (a–c);
k = the number of noncommon pcs in the basic space of y compared
to those in the basic space of x.
Figure 17.18
Diatonic chord-distance rule
The fundamental construct of the pitch-space model is the basic space
shown in figure 17.17a, oriented to a tonic chord in C major. (Keys are
represented in boldface, with major keys designated by upper case and
minor keys by lower case.) In figure 17.17b, the same configuration is
represented in standard pitch-class set-theory notation in order to
perform numerical operations. The space represents relationships that
everyone knows intuitively: starting at the bottom row, the chromatic
scale is the collection of available pitches, repeating every octave to form
12 pitch classes; the diatonic scale is built from members of the chromatic
scale; the triad is built from members of the diatonic scale; the root and
fifth of a triad are more stable than the third; and the root is more stable
than the fifth. The basic space can be seen as an idealized form of
Krumhansl and Kessler's (1982) empirically established tone profile of the
stability of pitches in a major key. If the tonic note C is wrapped around
to itself, the basic space takes the geometric shape of a cone.
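The basic space oriented to I/C can be written out as nested pitch-class sets. The Python sketch below counts the number of levels at which each pitch class appears; reading that count as a rough index of stability is an assumption of the sketch, and the names BASIC_SPACE_I_C, NAMES, and depth are hypothetical.

```python
# The five levels of the basic space oriented to I/C, top row first (C = 0).
BASIC_SPACE_I_C = [
    {0},                      # root
    {0, 7},                   # root and fifth
    {0, 4, 7},                # triad
    {0, 2, 4, 5, 7, 9, 11},   # diatonic scale
    set(range(12)),           # chromatic scale
]

NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "Bb", "B"]


def depth(pc):
    """Number of levels of the basic space that contain the pitch class."""
    return sum(pc in level for level in BASIC_SPACE_I_C)


# Each level is built from members of the level beneath it.
nested = all(upper <= lower
             for upper, lower in zip(BASIC_SPACE_I_C, BASIC_SPACE_I_C[1:]))
print(nested)

# Pitch classes ordered from most to least deeply embedded.
for pc in sorted(range(12), key=depth, reverse=True):
    print(NAMES[pc], depth(pc))
```

On this count the root C appears at all five levels, the fifth G at four, the third E at three, the remaining diatonic pitches at two, and the chromatic pitches at one, matching the ordering described in the text.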
Any chord in any key is representable by a configuration of the basic
space. The distance rule in figure 17.18 transforms one configuration into
another and measures the distance traversed, utilizing three factors that
combine additively: (1) the number of moves on the chromatic cycle of
fifths to reach another key, for instance C major to G major; (2) the
number of moves on the diatonic cycle of fifths to reach another chord
within a key, for instance the tonic of C major to its dominant; and (3)
the number of new pitch classes, weighted by psychoacoustic salience, in
the new configuration of the basic space.
To illustrate, figure 17.19a calculates the distance from I/C to its domi-
nant. Figure 17.19b does the same from I/C to i/c. The smaller the output
number, the shorter the distance. The pitch-class set-theory notation is
not essential; indeed, a computer implementation of the rule employs
the equivalent binary notation shown at the bottom of the figure.
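The three additive factors can be sketched in Python for the two cases just mentioned. This is a simplified illustration rather than the implementation referred to in the text: it assumes natural minor for minor keys, takes the shortest path on each cycle of fifths, and measures j by scale degree alone. Chords are given as hypothetical triples (tonic pitch class, mode, zero-based scale degree), and all function names are invented for the example.

```python
MAJOR = (0, 2, 4, 5, 7, 9, 11)
MINOR = (0, 2, 3, 5, 7, 8, 10)   # natural minor (an assumption of this sketch)


def scale(tonic, mode):
    steps = MAJOR if mode == "major" else MINOR
    return [(tonic + step) % 12 for step in steps]


def basic_space(tonic, mode, degree):
    """Five levels (root, root+fifth, triad, diatonic, chromatic) as pc sets."""
    sc = scale(tonic, mode)
    root, third, fifth = (sc[(degree + n) % 7] for n in (0, 2, 4))
    return [{root}, {root, fifth}, {root, third, fifth}, set(sc), set(range(12))]


def to_bits(level):
    """Binary notation: one digit per pitch class, C first."""
    return "".join("1" if pc in level else "0" for pc in range(12))


def cyclic_distance(a, b, size):
    d = (b - a) % size
    return min(d, size - d)


def fifths_position(tonic, mode):
    """Position of the key's diatonic collection on the chromatic cycle of fifths."""
    major_tonic = tonic if mode == "major" else (tonic + 3) % 12
    return (major_tonic * 7) % 12


def chord_distance(x, y):
    """Return the factors (i, j, k) for the move from chord x to chord y."""
    i = cyclic_distance(fifths_position(x[0], x[1]), fifths_position(y[0], y[1]), 12)
    # Degree d lies 2*d (mod 7) steps along the diatonic cycle of fifths.
    j = cyclic_distance((2 * x[2]) % 7, (2 * y[2]) % 7, 7)
    # Count pcs of y's basic space absent from the same level of x's basic space.
    k = sum(len(new - old) for old, new in zip(basic_space(*x), basic_space(*y)))
    return i, j, k


I_C = (0, "major", 0)   # tonic chord of C major
V_C = (0, "major", 4)   # dominant chord of C major
i_c = (0, "minor", 0)   # tonic chord of C minor

print(chord_distance(I_C, V_C), sum(chord_distance(I_C, V_C)))
print(chord_distance(I_C, i_c), sum(chord_distance(I_C, i_c)))
print(to_bits(basic_space(*V_C)[2]))
```

Under these assumptions the sketch yields 0 + 1 + 4 = 5 for the move to the dominant and 3 + 0 + 4 = 7 for the move to the parallel minor.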
Just as there are many possible routes between cities, so there are many
routes from one chord in one key to another chord in the same or
(a)
7
2 7
2 7 11
0 2 4 5 7 9 11
0 1 2 3 4 5 6 7 8 9 10 11
δ(I/C → V/C) = 0 + 1 + 4 = 5
(b)
0
0 7
0 3 7
0 2 3 5 7 8 10
0 1 2 3 4 5 6 7 8 9 10 11
δ(I/C → i/c) = 3 + 0 + 4 = 7
In binary notation:
(a) 000000010000   (b) 100000000000
(a) 001000010000   (b) 100000010000
(a) 001000010001   (b) 100100010000
(a) 101011010101   (b) 101101011010
(a) 111111111111   (b) 111111111111
Figure 17.19
Illustrations of δ
V vii° ii IV vi
I iii V vii° ii
IV vi I iii V
vii° ii IV vi I
iii V vii° ii IV
Figure 17.20
A portion of chordal space arrayed in two dimensions
B b D d F
E e G g Bb
A a C c Eb
D d F f Ab
G g Bb bb Db
Figure 17.21
A portion of regional space arrayed in two dimensions. Major keys are in upper-case letters,
minor keys in lower-case letters.
Figure 17.22 combines figures 17.20 and 17.21 into a portion of chordal-
regional space. Each region is designated by a boldface letter, and this
letter simultaneously stands for the tonic of that key. Arrayed within each
key are its other six triads.
Figure 17.23 shows the relevant portion of chordal-regional space for
the Schumann song and traces the path of its harmonic progression on
the interpretation that it is in F# minor. The numbers next to the arrows
give the order of the progression. The double lines represent pivots, that
is, chords that assume two locations in the space. The music passes
through four adjacent regions and reaches the tonic of all of them except
for the tonic of F# minor. The graph brings out the multiple roles of the
B minor chord. At the beginning, it is the subdominant of F# minor. In
bar 5, it migrates to the supertonic of A major. In bar 10, it appears as
tonic of B minor and then pivots as the submediant of D major before
returning to its initial state as subdominant of F# minor.7
III V vii° | iii V vii° | III V vii°
VI a III | vi C iii | VI c III
ii° iv VI | ii IV vi | ii° iv VI
III V vii° | iii V vii° | III V vii°
VI d III | vi F iii | VI f III
ii° iv VI | ii IV vi | ii° iv VI
Figure 17.22
A portion of chordal-regional space
Figure 17.23
Path of the song's harmonic progression in chordal-regional space on the interpretation
that the global tonic is F# minor. The numbers next to the arrows give the order of the
progression. Double lines represent pivots.
Pitch-space paths such as that in figure 17.23 give a useful but only
approximate picture of distances from one event to the next. To achieve
a precise account, one must return to the distance rule in figure 17.18,
which, together with other rules whose discussion lies beyond the scope
of this essay, affords a quantified prediction of patterns of tension and
relaxation. The crucial concept is to equate distance traveled with the
amount of change in tension or relaxation. If the motion is away from a
point of rest, the rule computes an increase in tension; if it is toward a
point of rest, it computes a decrease in tension. A change in tension
can be computed sequentially from one event to the next, as if the
listener had no memory of past events or expectation of future ones; or it
can be computed hierarchically down the prolongational tree, so that
right branches signify connections to past events and left branches
Figure 17.24
Hierarchical tension analysis for the F# minor interpretation of the song
Figure 17.25
Tension curve for the values in figure 17.24
analysis does not include the factor of melodic and harmonic attractions,
which contribute crucially to expectations of ensuing events. For example,
a leading tone is strongly attracted to its tonic pitch and is expected to
resolve there; likewise a dominant 7th chord to its tonic chord. This factor
is especially powerful for the V7 chords that frame the song: in pitch-
space tension, they are close to the tonic, but in terms of expectation they
are very tense. The full tension model developed in TPS incorporates
both surface-dissonance and attraction factors, and their role in making
accurate tension predictions is demonstrated empirically in Lerdahl and
Krumhansl (2007). If these factors were included in figure 17.25, the most
telling effect would be to increase the composite tension of the framing
V7 chords. In spite of these missing factors, the curve graphed in figure
17.25 reflects essential aspects of the F# minor hearing and shows a
jagged bell-like shape that is typical of tension curves in most tonal
pieces.
The interpretation of the song as globally in A major presents a dif-
ferent picture. Figure 17.26 repeats the prolongational analysis from
figure 17.16 but with tension values in the tree and summed tension
numbers between the staves. Figure 17.27 translates these numbers into
an unorthodox tension curve. (The curve stops at bar 15 in order to
facilitate a comparison with figure 17.25.) After a local tensing motion to
the neighboring C#7 chord, the curve relaxes to zero tension at the A
major cadence in bar 5. At this point it follows a shape for bars 9–12
Figure 17.26
Hierarchical tension analysis on the A major interpretation of the song
Figure 17.27
Tension curve for the values in figure 17.26
Notes
1. The GTTM/TPS theory applies not only to classical and romantic tonal music
but equally well to a wide variety of musical styles including pop music (see
Jackendoff and Lerdahl 2006 for an analysis of a Beatles song). I choose to
analyze this particular Schumann song because it fascinates me and because it
challenges the theory in interesting ways. Recordings of it are easily accessible
on the internet, and I urge the reader to listen to it several times before studying
the analysis.
2. The score and translation are taken from Schumann (1971).
3. This notation, first proposed in Lerdahl and Jackendoff (1977), is analogous
to the phonological grid notation introduced at about the same time by Liberman
and Prince (1977). Lerdahl (2001b, 2013) discusses this and other aspects of the
relationship between linguistic and music theory.
4. Hypermetrical means metrical structure at a level larger than the
notated bar.
5. For reasons of space, it is convenient not to show the pitch analysis of the
entire song. Since the second strophe repeats the structure of the first, this omis-
sion does not affect the analysis in any significant way. Again for convenience,
at larger levels the music is compressed to one staff.
6. These designations are familiar from Riemannian function analysis (Riemann
1893). My use of them departs from that tradition. TPS (chap. 5) explains how
these and other functions arise from prolongational position in combination with
tonic orientation.
7. Figure 17.23 corresponds to an analysis in Cohn (2011) in the context of an
interesting comparison between TPS and neo-Riemannian theories.
8. The fleeting A# in the arpeggiation of the B minor chord in bar 1 (see figure
17.3) briefly implies the key of B minor, but this detail reduces out already at the
8th-note level of time-span reduction and is not a factor at larger levels of
analysis.
References
Cohn, Richard. 2011. Tonal pitch space and the (Neo-)Riemannian Tonnetz. In
The Oxford Handbook of Neo-Riemannian Music Theories, edited by Edward
Gollin and Alexander Rehding, 322–348. New York: Oxford University Press.
Jackendoff, Ray. 1991. Musical parsing and musical affect. Music Perception 9 (2):
199–230.
Jackendoff, Ray, and Fred Lerdahl. 2006. The capacity for music: What is it, and
what's special about it? Cognition 100 (1): 33–72.
Koffka, Kurt. 1935. Principles of Gestalt Psychology. New York: Harcourt, Brace
& World.
Krumhansl, Carol L. 1990. Cognitive Foundations of Musical Pitch. New York:
Oxford University Press.
Krumhansl, Carol L., and Edward J. Kessler. 1982. Tracing the dynamic changes
in perceived tonal organization in a spatial representation of musical keys.
Psychological Review 89 (4): 334–368.
Lerdahl, Fred. 2001a. Tonal Pitch Space. New York: Oxford University Press.
Lerdahl, Fred. 2001b. The sounds of poetry viewed as music. In The Biological
Foundations of Music, edited by Robert J. Zatorre and Isabelle Peretz. Annals
of the New York Academy of Sciences 930 (1): 337–354. Reprinted with revisions
in The Cognitive Neuroscience of Music, edited by Isabelle Peretz and Robert J.
Zatorre, 412–429. New York: Oxford University Press, 2003.
Lerdahl, Fred. 2013. Musical syntax and its relation to linguistic syntax. In
Language, Music, and the Brain: A Mysterious Relationship, edited by Michael A.
Arbib, 257–272. Strüngmann Forum Reports 10, series edited by Julia Lupp.
Cambridge, MA: MIT Press.
Lerdahl, Fred, and Ray Jackendoff. 1977. Toward a formal theory of tonal music.
Journal of Music Theory 21 (1): 111–171.
Lerdahl, Fred, and Ray Jackendoff. 1983. A Generative Theory of Tonal Music.
Cambridge, MA: MIT Press.
Lerdahl, Fred, and Carol L. Krumhansl. 2007. Modeling tonal tension. Music
Perception 24 (4): 329–366.
Liberman, Mark, and Alan Prince. 1977. On stress and linguistic rhythm.
Linguistic Inquiry 8 (2): 249–336.
Riemann, Hugo. 1893. Vereinfachte Harmonielehre; oder, Die Lehre von den
tonalen Funktionen der Akkorde. London: Augener.
Schumann, Robert. 1971. Dichterliebe. Edited by Arthur Komar. Norton Critical
Scores. New York: W. W. Norton & Company.
Smith, Nicholas A., and Lola L. Cuddy. 2003. Perceptions of musical dimensions
in Beethoven's Waldstein sonata: An application of tonal pitch space theory.
Musicae Scientiae 7 (1): 7–34.
Wittgenstein, Ludwig. 1953. Philosophical Investigations. Oxford: Blackwell.
18 The Friar's Fringe of Consciousness
Daniel Dennett
model is not where the work of semantic processing occurs. Ray argues
for this in two detailed chapters in his 1987 book, drawing on phenom-
enological observation of our experience of music, vision, and visual
imagery, and language itself, of course. He also analyzes the difficulties
of other theories. His claim has since been taken up by another fine
theorist, Jesse Prinz (2012). The Cartesian idea, shared by Jerry Fodor,
Tom Nagel, and John Searle, that consciousness is the source (somehow)
of all Understanding and Meaning1 is, I believe, the greatest single cause
of confusion and perplexity in the study of the mind. For some (e.g.,
Fodor and Nagel) it fuels the conviction that a science of the mind is
ultimately beyond us, an unfathomable mystery. For others (e.g., Searle)
it deflects attention from the one kind of science that could actually
explain how understanding happens: a computational approach that in
one way or another breaks down the whole mysterious, holistic, ineffable
kaleidoscope of phenomenology into processes that do the cognitive
work that needs to be done.
Ray has seen that the first step toward any viable theory of consciousness
must demote consciousness from its imagined position as the ultimate
Inner Control Room (where it all comes together and the
understanding happens), but he doesn't quite carry through on the
second step, which is embodied in the moral I draw from the demise of
the Cartesian Theater:
All the work done by the imagined homunculus in the Cartesian Theater must
be distributed around in space and time to various lesser agencies in the brain.
(Dennett 2005, 69)
All the work. And all the play, too, for that matter: the savoring, enjoying,
delighting, as well as the abhorring, being disgusted by, disdaining. . . . It
all has to be outsourced to lesser entities, none of which is the ego, or
the person, or the Subject. Just as the phenomenon of life is composed,
ultimately, of non-living parts (proteins, lipids, amino acids, . . . ), so
consciousness must be dismantled and shown to be the effects of
non-conscious mechanisms that work sub-personally. When this step is taken,
the Subject vanishes, replaced by mindless bits of machinery unconsciously
executing their tasks. In Consciousness Explained, I described
what I called the Hard Question: and then what happens? (255). This is
the question you must ask and answer after you have delivered some
item to consciousness. If instead you stop there, in consciousness,
you've burdened the Subject with the task of reacting, of doing something
with the delivery, and left that project unanalyzed. Answering the
Exactly. If you still have an Emperor in your model, you haven't begun
your theory of consciousness. A necessary condition any theory of consciousness
must satisfy in the end is that it portrays all the dynamic
activity that makes for consciousness as occurring in an abandoned
factory, with all the machinery churning away and not a soul in sight, no
workers, no supervisors, no bosses, not even a janitor, and certainly no
Emperor! For those who find this road to progress simply unacceptable,
there is a convenient champion of the alternative option: if you DON'T
leave the Subject in your theory, you are evading the main issue! This is
what David Chalmers (1996) calls the Hard Problem, and he argues that
any theory that merely explains all the functional interdependencies, all
the backstage machinery, all the wires and pulleys, the smoke and mirrors,
has solved the easy problems of consciousness, but left the Hard
Problem untackled. There is no way to nudge these two alternative positions
closer to each other; there are no compromises available. One side
or the other is flat wrong. There are plenty of Hard Questions crying out
for answers, but I have tried to show that the tempting idea that there is
also a residual Hard Problem to stump us once we've answered all the
Hard Questions is simply a mistake. I cannot prove this yet but I can
encourage would-be consciousness theorists to recognize the chasm and
recognize that they can't have it both ways.2
It is one thing to declare that you are abandoning the Cartesian
Theater for good, and another thing to carry through on it. Ray's work
offers a nice example of a half measure that needs to be turned into a
full measure: his discussion of what he called affects in Consciousness
The fact that he calls these items "affects" or "feels" is a bit ominous: just
whose feels are they and how does this Subject, whoever or whatever it
is, respond to them? Ray is silent on this score; that is, Ray ducks the
Hard Question. But we can try to answer it for him. These feels are
present in our phenomenology, and as such are denizens of the fringe
of consciousness, byproducts of the (higher, or more central) unconscious
workhouse in which conceptual and spatial structures get built and
analyzed. Ray's excellent half step forward is to dismantle the traditionally
mysterious and unanalyzable grasping or comprehending by
the Subject in the Cartesian Theater, outsourcing all that work to
unconscious high-level processes into which we have no introspective
access at all. Those backstage processes make all the requisite links
to conceptual structures, taking care thereby of our ongoing comprehension
of the words streaming through the fringe of consciousness.
Those words have phonological properties we experience directly,
accompanied by the feeling that they are meaningful (or not). Here
we have the beginnings of a nice division of labor: (almost) all the Work
of Understanding has been assigned to unconscious bits of machinery,
leaving only one task for the conscious Subject: appreciating the
meaningfulness or noticing the meaninglessness of whatever is on stage at the
moment.
Calling such a signal a "feeling" at first looks like a step backwards,
back into the murky chaos of qualia, but the fact that the distinction is
binary is encouraging, since it suggests that it does only a small job; it's
a single-throw switch, the effects of which are in need of delegation to
some unconscious functionaries. Let's consider some minimal reactions
and then build up from there.
producing the speech acts and analyzing them, nor to the factors that are
controlling that machinery. Monitoring our own thought, we can hope
for an insightful breakthrough, but not command one.
These are, of course, the apt and familiar responses we make to feelings
of meaninglessness or its opposite, but notice that once we have catalogued
a few of them (the highlights from an apparently inexhaustible
list of possibilities), we can leave the feeling out of it, and just have the
binary switch or flag as the triggerer of this family of responses. The
feeling is, as Ray says, ineffable (it has no content beyond just the bare
sense of meaninglessness or meaningfulness), and we have, arguably,
captured that content in our catalogue of appropriate responses. The
feeling is not doing any work. One might put it this way (tempting fate):
a zombie, lacking all feelings or qualia, who is equipped with a binary
switch with the input-output conditions we have just described doesn't
lack anything important; it can monitor its own cognition for signs of
meaninglessness, and react appropriately when they are uncovered just
as we conscious folk do; it can tell others about the phenomenology
of its own experiences of meaningfulness and meaninglessness, and that
account will jibe perfectly with our accounts, since there is nothing more
to these feelings than this.
These binary character tags are the easiest cases. Ray did well to put
the term "feelings" in scare-quotes, since they are best considered as only
feelings pro tem, on their way to the junkyard once we answer the Hard
Question about what happens next when we have them. Once we get
used to the move, we can start tackling all the more complicated,
multidimensional aspects of our experience and deconstructing them in similar
fashion.3
Notes
1. Ray's innovation in his User's Guide to Thought and Meaning of using a rather
sacred font for philosophical terms that are meant to be particularly deep and
portentous is irresistible.
2. I can offer intuition pumps to render my claim at least entertainable by those
who find it frankly incomprehensible at first. See especially "The Tuned Deck,"
in Dennett (2003) (from which some material in the previous paragraphs is
drawn), and Dennett (2005, 2013).
3. My favorite example of this kind of further deconstruction ("effing the
ineffable," we might call it) is David Huron's analysis of the qualia of musical scale
tones, in Sweet Anticipation (2006). What does the stability of do, the tonic,
amount to, compared to the instability of ti, the leading tone, and which families
References
Neil Cohn
Climbing Trees and Seeing Stars
Note
Any images provided without attribution were created by Neil Cohn (copyright
Neil Cohn).
References
Lexical aspect, 138–142, 149–156
Lexical Conceptual Semantics, xxviii
Lexicalism, xv, xvi, 21
Lexical redundancy rules, xviii–xix
Lexical semantics, xxviii, 139, 145
Lexicography, 146–147
Liberman, Mark, 296, 368n3
Light verbs, 79, 287
Linear order in linguistic representation, 8, 10
Linguistic Material arguments, 80–84, 85–86, 89–92
Linking rules/principles, 28, 35, 65, 190, 278–280, 284–289
Lipps, Hans, 246–247, 248, 249, 251
Liu, Fang, 335–336, 337, 342
Livingstone, Frank B., 315
Localist hypothesis, 68
Locatives and locative case, 69, 74, 119–134
Love, Tracy, 169
Lyell, Charles, 270

Maas, Utz, 250
MacKay, Carolyn Joyce, 113–114
Macnamara, John, xii, 26
MacWhinney, Brian, 218
Maess, Burkhard, 284
Maling, Joan, xxv, xxvi, xxviii, 104–105, 108–109, 110–111, 113, 114, 116n9, 120, 121, 126, 130, 131, 133, 134, 135n4
Manner-of-speaking verbs, 97n8. See also Mode (means and attitude) verbs
Market model in science, 251
Marr, David, 4, 381
Marslen-Wilson, William D., 238–239, 240
Martin, James G., 293
Martin, Samuel, 126
Martinet, André, 264
Martins, Mauricio D., 314
Matsuzawa, Tetsuro, 303, 304
Mattingly, Ignatius G., 271
Mayer, Carl, 240, 241–242, 243
McClelland, James L., 272
McCloskey, James, 104, 105, 106–107
McElree, Brian, 170, 173, 287
McLean, Janet F., 212, 213
Meaning. See Semantics
Means. See Mode (means and attitude) verbs
Medin, Douglas L., 207n4
Memory, lexical, xx
Memory structures, 4, 6, 8
Mendel, Gregor, 235, 241, 248, 249
Merchant, Hugo, 303, 312
Meringer, Rudolf, 240–242, 243, 248, 249
Merker, Björn, 293, 298
Methodology, 23, 33–36, 331–334, 341–342
Metrical structure, 296, 309–313, 351, 382
Meyer, David E., 167
Miller, Carol A., 201
Miller, George A., 313
Minimalist Program, 13, 63, 73
Mirror Neuron theory, 327
Mismatches
  form-meaning, xvii–xviii, 63–76
  morphology-syntax, 104, 109, 113
Mithen, Steven J., 271, 293, 315
Mithun, Marianne, 114
Mittelfeld, 53
Młynarczyk, Anna, 159n3
Model-theoretic semantics, xix
Mode (means and attitude) verbs, 79, 84–86, 87–88, 93
Modularity, 32–33, 34–35, 271, 326
  of music and language, 295, 325, 327–335, 336, 337–338, 342
Montagu, Ashley, 314
Morais, José, 296
Morgan, David B., 314
Morphological and semantic regularities in the lexicon (paper), xvii–xviii, xx
Morphology, relationship to syntax, 103–104, 109, 112–113
Motor hierarchies, 313–314
Müller, Friedrich Max, 236
Müller, Stefan, 5
Munro, Pamela, 96n2
Münte, Thomas F., 288
Music, xxi–xxii, 347–367, 382
  biological basis for, 293–315
  language, relationship to, 293–296, 309–312, 313–315, 325–342
  in rituals, 283–285, 289
Myler, Neil, 293, 296, 311

Naccache, Lionel, 375
Naeser, Margaret A., 168
Nagel, Tom, 372
Naigles, Letitia R., 6
Nakanishi, Kimiko, 124
Nam, Seungho, 192, 194
Nappa, Rebecca, 191
Narasimhan, Bhuvana, 213–215, 217, 220, 227
Natural philosophy, 142–143
Necker cube, 367
Nelken, Israel, 335
Nettl, Bruno, 295
Neuroscience and neurolinguistics, xxi, 167, 168–170, 295–296, 326–327, 328–335. See also Aphasia
Newberg, Andrew B., 283
New information. See Information status
New Transitive Impersonal construction, 109–112
Verkuyl, Henk J., 67, 138, 142, 148, 154, 158, 159n2, 160n6, 160n12, 160n14, 161n17, 161n19, 161n20
Vicente, Luis, 51
Vignolo, Luigi, 168, 331
Vincent, Nigel, 74
Virtue, Sandra, 288
Visser, Fredericus Theodorus, 117n4
Vocal learning and rhythmic synchronization hypothesis, 304–308
Vogüé, Sarah de, 159n3
Von Stutterheim, Christiane, 213
Voorhees, Burton, 373
Vries, Mark de, 96n5

Zubizarreta, Maria Luisa, 86
Zurif, Edgar, 168, 170, 172, 173, 181
Zwar, 47–48, 49
Zwarts, Joost, 71, 72
Zwicky, Arnold, 97n8