Fundamentals of
Formulaic Language
An Introduction
DAVID WOOD
Bloomsbury Academic
An imprint of Bloomsbury Publishing Plc
LONDON • OXFORD • NEW YORK • NEW DELHI • SYDNEY
Bloomsbury Academic
An imprint of Bloomsbury Publishing Plc
50 Bedford Square
London
WC1B 3DP
UK
1385 Broadway
New York
NY 10018
USA
www.bloomsbury.com
BLOOMSBURY and the Diana logo are trademarks of Bloomsbury Publishing Plc
First published 2015
© David Wood, 2015
David Wood has asserted his right under the Copyright, Designs and Patents Act,
1988, to be identified as the Author of this work.
All rights reserved. No part of this publication may be reproduced or
transmitted in any form or by any means, electronic or mechanical,
including photocopying, recording, or any information storage or retrieval
system, without prior permission in writing from the publishers.
No responsibility for loss caused to any individual or organization acting on or
refraining from action as a result of the material in this publication can be accepted by
Bloomsbury or the author.
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
Contents
Preface
References
Index
Preface
My first encounters with formulaic language date back to the mid-1990s
during my time as a teacher of English as a Second Language (ESL)
and English for Academic Purposes (EAP) at a large university. I became
particularly intrigued with the teaching of spoken language and the challenges
presented for second language learners by the real-time, ephemeral nature
of speech. In looking around for resources and background knowledge
to help, I found myself encountering the term fluency very often in the
literature. I began to look for the underlying psycholinguistic mechanisms
and the research on the nature of fluency and found some reference to
the role of formulaic language. Some of the papers I read alluded to the
notion that formulaic language might play some role in facilitating fluent
speech, or that formulaic language might be a fundamental aspect of spoken
communication in several ways. This elusive phenomenon came to haunt my
dreams for some years to come, as I soon thereafter embarked on doctoral
studies with a goal of attempting to measure or examine the relationship
between formulaic language and fluent speech.
I am still fascinated by the study of formulaic language, and have since
examined it from several other perspectives, including pedagogical and corpus-based. I have seen my students in graduate programs become interested in
formulaic language too and seen them set out to study formulaic language
from various perspectives: in academic writing, in the speech of autistic
children, in textbooks, in the discourse of official meetings, and more. I have
taught seminars on the topic and supervised a range of master of arts and
doctoral projects. Throughout all of this, I have seen students struggle with
the sheer volume and range of literature. The multidisciplinary nature of the
field means they need to quickly grasp concepts from areas as diverse as
psycholinguistics, vocabulary research, and discourse analysis, to name but a
few. To add to the burden, it became painfully clear early on that much of the
written work in the area is not particularly reader-friendly, especially for those
new to the field. The combination of complex concepts, diverse sources, and
opaque prose has made the establishment of a foundation in this area an
uphill climb indeed.
This has led me to take on the task of creating an overview of the area
for newcomers. The present volume is meant to be a resource for new
researchers first and foremost, but may also be a reference source for
established scholars. The content in this book is not mine; it is a distillation
of the work of many others from across the decades. It is taken from a wide
range of sources, from original research reports, review and summative state-of-the-art papers, edited collections, and so on.
The content of this book is not meant to be a complete presentation of
every bit of research conducted to date; it is meant instead to be a start,
a place for readers to get a sense of what exists, and then to go further
beyond this book as needed. Some parts of the book go more deeply into the
literature than others, partly due to space constraints, partly due to my own
perceptions of what needs to be foregrounded.
I encourage those who use this book as a teaching resource to bear that in
mind and to point out to students what more is needed in any area. The book
ranges widely, stops to scrutinize, and at the same time dances across many
areas of study. This is inevitable for an overview like this, crafted by a single
author. Please feel free to pick, choose, adapt, adjust, dismiss, embrace, or
elaborate on anything you find herein.
I hope this book will be a support for teachers and students in this area. I
fervently wish for you to be inspired to take some risks in research, to push
some boundaries in the area, and to go ahead and create new knowledge.
I must give thanks to those whose work was so useful in creating this
book:
To my students Randy Appel, Ridha Ben Rejeb, Lina Al Hassan, Alisa
Zavialova, Olga Makinina, Lin Chen, and Joelle Doucet. Special thanks
to Joshua Romancio for editing assistance.
And others from whom I have been inspired.
Ottawa
February, 2015
1
Formulaic Language Research
in a Historical Perspective
Across Decades and
Continents
Some years ago I had two interesting experiences with students of English
as a second language, which sparked in me an early interest in the formulaic
nature of language. A clever student whose first language was Spanish came
to me during a break in class and asked, "Teacher, what means festival?"
I was a bit puzzled by the question, since the recent lessons had had no
content related to this, but I manfully attempted to explain the word as best as
I could. It was the student's turn to be puzzled, as he tried to fit my definitions
with what he was trying to understand. After some struggle, he interrupted
me to ask, "Why do you say this word at the start?" It slowly dawned on
me that what he had heard as "festival" was in fact "first of all," which I did indeed
tend to use at the start of lessons and to give instructions. It surprised me
that he would interpret a three-word sequence as a single word, but I put it
down to some confusion about phonology and a general lack of vocabulary
knowledge. Another incident occurred with a very active and alert student
from Cambodia, who arrived in my beginner class quite late in the course,
with limited to no English proficiency whatsoever. She bravely plunged into
the job of becoming a member of the group and learning what she could. Her
English output after several lessons began with "I no stan," whenever anything
was addressed to her in English or if she had to participate in anything by
speaking. Later, this was modified to "I don no stan," and later still it became a
closer replication of the sequence "I don't understand." I noticed this at the time
as an amusing example of a student mustering one resource to cope with a
really challenging situation. It also appeared odd to me that she had taken a
three-word sequence and interpreted it as a single word. Later, in hindsight,
this story and the "festival" incident came to represent to me the power of
formulaic language and a glimpse of the process of language acquisition and
the importance of formulaic language in it.
To begin any discussion of formulaic language, it is important to establish
some foundations and establish the terminology that is used to refer to it, and
to look at a definition or definitions. Formulaic sequence is generally used to
refer to one such item, formulaic language is the uncountable noun referring
to these items as a collective, and phraseology is a term often used to refer to
the study of formulaic language. As we will see later, phraseology also does
double duty as a specific term for a particular type of analysis of formulaic
language.
These days, formulaic language is a language phenomenon that is quite well
known among researchers and students in linguistics and applied linguistics.
Articles with a focus on formulaic language are appearing in an expanding
range of journals at an ever more frequent rate, papers on topics related to
formulaic language are being presented at congresses around the world, and
graduate theses on this theme are appearing everywhere. It is remarkable
that all of this is occurring in the absence of a journal devoted to formulaic
language, and with only one widely attended recurring conference, that of
the Formulaic Language Research Network (FLaRN), which has been held at
various locations in Europe every second year since 2004. The Yearbook of
Phraseology, a creation of Europhras, a European organization dedicated to
the study of phraseology, and published by Mouton de Gruyter, is the only
periodical publication currently in existence that concerns itself with formulaic
language research. The real source of information about formulaic language
has been a range of books, both edited collections and monographs, about it
over the past fifteen to twenty years: Sinclair's (1991) landmark Corpus, Concordance,
Collocation; Nattinger and DeCarrico's (1992) Lexical Phrases
and Language Teaching; Cowie's (1998) Phraseology: Theory, Analysis,
and Applications; Wray's (2002) Formulaic Language and the Lexicon and
Formulaic Language: Pushing the Boundaries (2008); Allerton, Nesselhauf, and
Skandera's (2004) Phraseological Units: Basic Concepts and Their Applications;
Schmitt's (2004) Formulaic Sequences: Acquisition, Processing and Use;
Granger and Meunier's (2008) Phraseology: An Interdisciplinary Perspective;
and Wood's (2010b) Perspectives on Formulaic Language. This list is by no
means exhaustive, but gives a taste of the range and quantity of formulaic
language-focused work that exists on library shelves around the world. So what
is the attraction to formulaic language despite the relative lack of a coherent
set of venues in which to present or find research and information about it?
It has become apparent over the years that formulaic language is, despite its
Early research
Lacking the technology to perform extensive corpus research, and hampered
by the scattered nature of linguistic knowledge, researchers before the 1970s
paid scant attention to formulaic language. However, there were pockets of
work being conducted in diverse fields outside of linguistics proper.
Collocation researchers
Early work in the area of collocations was initiated by Firth (1951, 1957) in the
1950s, although the actual term itself had been around much longer. Firths
basic definition of collocation was the co-occurrence of words in proximity,
with several possible types of variation. One type is the habitual collocation,
in which words occur together quite frequently. Firth uses the example of
Learning psychologists
Goldman-Eisler in the late 1960s, with her book Psycholinguistics: Experiments
in Spontaneous Speech (1968), was among the first to discover that fluent
speech in particular is characterized by patterns of temporal variables such
as pause phenomena and length of runs of speech. This was the start of a
tradition of research, with a psycholinguistic flavor, into speech fluency. The
researchers discovered that patterns of fluent speech hint at a possible role for
formulaic language, as it has become apparent that fluent speakers have many
Grammarians
Early grammarians hinted at the importance of formulaic language:
Jespersen (1924) examined the phenomena of free and fixed expressions,
and the structure of idioms was a focus of the works of Chafe (1968)
and Fraser (1970). Work on phrasal dictionaries by Hornby, Gatenby, and
Wakefield (1942) and Palmer (1938) influenced how phrasal units were
handled in later works. Meanwhile, in Eastern Europe phraseology was
taking off as a legitimate area of study in its own right. Researchers including
Amosova (1963), Mel'čuk (1988), and Vinogradov (1947) compiled lists of
idioms and collocations and classified them. Some examples include
pure idioms, expressions with literal meanings totally divorced from their
idiomatic meanings, for example, chew the fat and beat around the bush.
Figurative idioms are expressions in which the figurative meaning is an
obvious derivation from the literal meaning, for example, hold water or steal
one's heart. Restricted collocations include word combinations in which the
interpretation of one word is dependent on its relationship with the other, for
example, pay a visit or meet one's needs.
Since 1970s
The 1970s represent a turning point in formulaic language research, with a
number of linguists pursuing research in the area, and some major areas of
research came to be defined. Lexicographers began to assemble information
about multiword chunks; research on speech acts and pragmatics grew. A
major event was a course taught by Charles and Lily Wong Fillmore at
the 1977 Linguistic Institute, and Krashen and Scarcella (1978) and Coulmas
(1981) published a review article and an edited collection of papers, respectively. Pawley
and Syder in 1983 published a landmark paper pointing out that formulaic
language is likely key to second language fluency and nativelike selection,
the tendency we have to use routine ways of expressing things, despite the
supposed infinite potential of language. For example, we say "how are you?"
rather than creative alternatives such as "what is the nature of your current
well-being?" The reasons for this, according to Pawley and Syder, have to do
with processing restrictions and the probability that we acquire language in
chunks and we store and retrieve word strings often as wholes from long-term memory to fit the meanings and functions that arise in communication.
In 1991, Sinclair posited the idiom and the open choice principle, a somewhat
similar idea, that most texts are largely composed of multiword expressions
that constitute single choices in the mental lexicon. Many new areas of focus
arose over the years.
Identification
Issues began to develop around the actual identification of formulaic language
in texts and discourses. Some cases are clear such as true idioms, phrasal
verbs, nominal compounds, and so on (see Chapters 2 and 3 for detailed
discussions), but many gray areas persisted. For example, discontinuous
expressions are hard to identify, as fillable slots and two-part expressions
tend to blend into surrounding text, for example, not only ... but also. Pawley
(1986) elaborated a list of twenty-seven diagnostics, and Moon (1998), among
others, also presents lists of diagnostic criteria.
Wray (2002) laid out a set of criteria for determining if multiword
combinations might be prefabricated. Structure or form of the sequence is
one such criterion, and it is often the case that strings begin with conjunctions,
articles, pronouns, prepositions, or discourse markers (p. 31). Compositionality
or internal structure of strings is also important, as Wray observes that the
string is no longer obliged to be grammatically regular or semantically logical
(p. 33). Fixedness, or the tendency for prefabricated sequences to be of
invariable form, is another such criterion, although Wray does allow that a large
subset of formulaic sequences often have fillable slots (p. 34). Other criteria
relate to phonological or prosodic aspects of the articulation of a sequence,
including intonation contour and speed of articulation, and fluency criteria
such as lack of internal pausing (p. 35). For spoken language in particular,
an important point of Wray's to bear in mind is that it may simply be that
identification cannot be based on a single criterion, but rather needs to draw
on a suite of features (p. 43).
Somewhat later, Wray (2008) came to emphasize that the processing
of formulaic sequences as wholes likely results from the ways acquisition
processes operate with respect to input. She notes that much language
input in first language acquisition is left unanalyzed unless necessary, a
phenomenon she terms needs only analysis, or NOA (p. 17). If there is a
strong form-meaning link with a particular string, for example, How do you
do, as a standard greeting among previously unacquainted adults, with no
variation, then the string will remain unanalyzed. Over the course of first
language acquisition, acquirers may note some variation in such strings,
such as lexical insertion (e.g., Have you seen my boots/shirt/watch?, or I'd
like a Coke/cheeseburger/3-month plan), but analysis will likely stop at the
recognition of the existence of a fillable slot and the possible word types
that may fill the slot. For adult second language learners, this process
may be much less frequent or slower, since the tendency of adults and
language programs is to analyze second language input for patterns, not
to mention the fact that second language learners receive greatly reduced
input compared to children in a first language.
Some researchers have linked formulaic sequences to the lexicogrammar
(Tucker, 2005) and to systemic models of functional grammar (Butler, 2003).
These researchers have noted that formulaic sequences have a place in
models of language that prioritize the lexicogrammar and levels of structure
related to speech act realizations. They acknowledge the role of formulaic
sequences in integrating extraclausal or partially clausal expressions into
functional grammars of discourse.
One of the best checklists to aid in identifying formulaic language is that of
Wray and Namba (2003) (see Chapter 2 for details).
Classification
There are many categories of formulaic language, including collocations,
idioms, phrasal verbs, lexical phrases, lexical bundles, and so on (see Chapter
3 for a detailed discussion). For formulaic sequences with pragmatic functions,
Pawley (2007) outlines seven identifying criteria:
1 Segmental phonology
2 Music, that is, intonation, rhythm, and stress of production
3 Grammatical category
4 Grammatical structure
5 Idiomaticity constraints
6 Literal meaning pragmatic function
7 Accompanying body language
insertions, and including two-word collocations (e.g., for the most part,
so far so good); institutionalized expressions, which are sentence-length, invariable, and mostly continuous (e.g., a watched pot never boils,
nice meeting you, long time no see); phrasal constraints, which allow
variations of lexical and phrase categories, and are mostly continuous
(e.g., a ___ ago, the ___er the ___er); sentence builders, which allow
construction of full sentences, with fillable slots (e.g., I think that X, not
only X but Y) (pp. 38–45).
A more recent descriptive scheme for formulaic sequences is that of
Wray and Perkins (2000), in which they focus on semantic and syntactic
irregularities of the sequences. A vital aspect of formulaic sequences,
according to Wray and Perkins, is their semantic irregularity. They are not
composed semantically, but are holistic items, like idioms and metaphors.
Another key element of formulaic sequences is their syntactic irregularity,
which is manifest in two qualities: a restriction on manipulation, for example,
one cannot pluralize beat around the bush or passivize face the music or
say you slept a wink, or feeding you up; the fact that in formulaic language
normal restrictions are flouted, such as the sequences that contain an
intransitive verb + direct object, for example, go the whole hog or other
gross violations of syntactic laws like by & large.
Prevalence
Researchers have worked to identify what proportion of discourse in a
given genre or register is in fact formulaic. For example, Altenberg (1998) in
examining the London-Lund Corpus found that over 80 percent of words are in
formulaic sequences. A well-known and often-cited figure is from the study
by Erman and Warren (2000), which found that formulaic sequences made up
52 to 58 percent of the texts analyzed.
One of the earliest observations about the nature of spoken discourse was
that of Pawley and Syder (1983), who described the way clauses tend to be
chained. They noted that everyday fluent conversational speech is composed
largely of strings of more or less independent clauses, without much
grammatical integration. For example, subordination is only present in limited
amounts in spontaneous speech (Pawley & Syder, 1983, pp. 202–204). Pawley
and Syder presented an analysis of two types of native-speaker production to
illustrate how fluency relates to clause-chaining. One speaker, George Davies,
produced speech in which fluent units were separate clauses:
/we had a /fan tastic time
[slows]
(1.1)
/there/were/ all kinds of re/lations /there/
[accel]
[slows ]
/I dun/no where they/all come /from/
[accel]
[slows]
I didnt know/alf o them
[accel]
(0.9)
and ahthe kids/sat on the floor
(0.2)
(1.5)
and ol/ Uncle Bert/he/ah
o/course /he was the life and soul of the party
[accel]
[slows]
/Uncle /Bert ad a /black bottle
[accel]
[slows] (1.5)
an ahed t/tell a/few stories
(0.2)
[accel] [slows]
an ed/take a /sip out of the/black bottle
[accel]
[slows]
n the/more sips he /took /outa / that bottle
[accel]
(1.0)
the worse the /stories got
(1.6)
(Pawley & Syder, 1983, p. 203)
Another speaker, Q., produced comparatively nonfluent speech, in a PhD
dissertation oral defense:
and it/seems to be
[accel]
if a /word is/fairly/high on the frequency /list/
[slow]
[accel]
I /havent /made /any count
[accel]
but/justim/pression isticallyum
[slow]
umthe /chances are
that you get acom /pound
[slow]
ora /notherphono /logically deviantform
[slow]
with ah/which is al/ready in other /words
[accel]
[slow]
/which is /fairly frequently the /same/phono /logical
[accel]
[slows]
shape
(Pawley & Syder, 1983, p. 201)
It is apparent that Q is planning only a few words at a time, unlike Davies. The
context and the content of the discourse are novel for him, and it is obvious
that he struggles with formulating and conceptualizing and articulating, due
to the considerable stress of the experience. To make matters even more
arduous, Q tries to use a clause-integrating strategy, in which each new
clause depends to some extent on the structure of the previous one, for
example, his false start or reformulation of the final clause in the sample,
beginning with "with," repaired to begin with "which." The genre, register,
and relative lack of interactivity of the speech production leave Q little choice
but to use this style of speech. On the other hand, Davies is speaking more
spontaneously and comfortably, and his speech shows clause-chaining of
independent clauses linked by conjunctions such as "and," in most cases.
According to Pawley and Syder, this style is most effective in narrative
speech:
With the chaining style, a speaker can maintain grammatical and
semantic continuity because his clauses can be planned more or less
independently, and each major semantic unit, being only a single clause,
can be encoded and uttered without internal breaks ... we may speak,
then, of a one clause at a time facility as an essential constituent
of communicative competence in English: the speaker must be able
regularly to encode whole clauses in their full lexical detail, in a single
encoding operation and so avoid the need for mid-clause hesitations.
(Pawley & Syder, 1983, pp. 203, 204)
Acquisition
One early researcher in the area of formulaic language in child first language
acquisition is Lily Wong Fillmore (1976), who examined the language
development of six-year-old children. A later work by Peters (1983) elaborated
on children's use of strategies to extract formulaic sequences from input and
retain them while at the same time breaking them down to build grammar
and lexical competence. Later, Wray and Perkins (2000) identified four stages
for children's use of formulaic language in first language acquisition: a purely
holistic strategy whereby they extract multiword sequences from input
without analysis; analytic stage where grammar and lexical knowledge are
acquired; fusion of sequences and use of processing shortcuts; a balance
that favors holistic processing except where circumstances require analytic
processing.
The evidence for a role of formulaic language in adult second language
acquisition is less clear than that for children. Adults tend to take an analytic
approach to language learning and only under certain circumstances will they
acquire multiword sequences holistically.
Yorio (1980) was one of the early investigators of adult language
development and formulaic sequences. In an examination of studies of
instructed adult learners' writing, he found that, unlike children, adult learners
do not appear to use formulaic language to any great extent and that when
they do, they seem not to use it to develop overall language knowledge. Instead,
they appeared to use it more as a production strategy, to save effort and
attention in spontaneous communication.
In summary
From this short and dense overview of the research history of formulaic
language, some patterns and themes emerge. One image remains, however. It
still seems that we are working with something quite elusive about language.
Like the characters in the tale of the blind men and the elephant, we can only
feel for a certain aspect of the phenomenon at a time. Luckily, we can all pool
our impressions from these encounters with particular aspects and create a
fuller image through reading and researching over time.
A few of the many themes and patterns that the research shows are:
For certain, all the questions have not been answered yet in any particular
area. How do we know whether a formulaic sequence is stored and retrieved
as a whole in spoken language? Do the basic assumptions about formulaic
language in the processing of spoken language also apply to written language,
to any extent? How valuable is it to elaborate lists of categories of formulaic
language?
2
Identifying Formulaic
Language: Frequency,
Psychological Representation,
and Judgment
Ultimately, you may simply resort to remarking that some are formulaic,
some are not, and some are more formulaic than others. But how can you
even make those decisions? You could look at how often they are used in a
particular context, study the prosodic features of the string (see Chapter 6),
or consult a corpus.
It is encouraging to know that a variety of means of identifying formulaic
sequences have been developed. However, the processes are in many cases
more inexact than we might expect. Some means of empirical measurement-based identification are discussed below, followed by a more detailed look at
criteria-based checklists, which rely on the decisions of judges as opposed to
measurement instruments.
per million words (e.g., Biber, Johansson, Leech, Conrad, & Finegan, 1999;
Simpson-Vlach & Ellis, 2010). This approach often yields word combinations
which are not complete structural units (Cortes, 2004), and are generally
labeled as lexical bundles (e.g., Biber et al., 1999), or multiword constructions
(Liu, 2012; Wood & Appel, 2014). Some researchers, using this set of criteria
as only part of a more complex identification protocol, simply use the term
formulaic sequences (e.g., Simpson-Vlach & Ellis, 2010).
This frequency-based method is most appropriate for large corpora
of hundreds of thousands of words, if not millions, taken from specific
registers of language and/or academic disciplines. It has many limitations
for use with small data sets, as the standard minimum cutoffs for frequency
established in the field may not be met in such cases. It would be difficult,
for example, to use only frequency as a criterion for identifying formulaicity
in a set of transcribed conversations on a range of topics. Some items which
we might consider formulaic might arise only once or twice in such a data
set. Another drawback of use of frequency-based analysis is that it does not
give any information about the psycholinguistic validity of the formulas. This
particular issue arose in a study by Schmitt, Grandage, and Adolphs (2004),
who identified formulas from a corpus and presented them to subjects in
spoken dictation tasks designed to overtax short-term memory capacity.
After an analysis of the participants' reconstructions of the dictations, it was
concluded that the holistic storage of the sequences, which were formulaic
according to frequency in the corpus, varied among participants (Schmitt,
Grandage, & Adolphs, 2004). A further limitation of using frequency alone
as a criterion for formulaicity is that additional steps are also required to
eliminate meaningless combinations of words for functional analyses of
formulaic language.
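The frequency-based approach described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used in any of the studies cited: the function name sequence_rates, the whitespace tokenization, and the toy token list are all assumptions made for the example.

```python
from collections import Counter


def sequence_rates(tokens, n=4):
    """Map each contiguous n-word sequence to (raw count, rate per million words)."""
    total = len(tokens)
    # Slide a window of n words across the token list.
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(" ".join(gram) for gram in grams)
    return {seq: (count, count / total * 1_000_000) for seq, count in counts.items()}


# Toy data (hypothetical): 13 running words, far too few for a real analysis.
tokens = "on the other hand i think that on the other hand we know".split()
rates = sequence_rates(tokens, n=4)
count, per_million = rates["on the other hand"]
print(count, round(per_million))
```

With so few words, even a sequence that occurs only once receives an enormous per-million rate, which is one way of seeing why normalized frequency cutoffs are only meaningful for large corpora.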
Sequences which are salient or readily recognizable as chunks, such as
on the other hand or how do you do, may or may not be frequent in any
particular corpus or genre, but they do have coherence in that they represent
elements which usually stick together in this order and which always have
a particular meaning or function. This tendency for words to stick together
can be measured statistically using measures of association such as mutual
information (MI), which determines how likely the items are to appear together
compared to chance. MI has no particular statistical significance cutoff and is
most useful for purposes of comparison. A higher MI score would indicate
a higher likelihood of co-occurrence, and taken together with frequency
measures, can provide objective evidence of formulaicity. Other measures of
the relative stickiness of word strings are also used; for example, in corpus
linguistics, Gries (2008, 2012) uses the Fisher-Yates exact probability test to
help determine the degree of association between a word and a construction.
Studies are also triangulating data from various sources such as corpus
measures of association together with eye tracking and response latency
data, procedures often referred to as psycholinguistic measures.
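As a rough illustration of how an association score such as MI is computed, the sketch below uses the pointwise form of mutual information commonly applied to two-word combinations in corpus linguistics: the base-2 logarithm of the observed frequency of the pair divided by the frequency expected if the two words occurred independently. The restriction to adjacent word pairs, the function name, and the toy data are assumptions made for the example; the studies cited here may compute MI differently, particularly for longer sequences.

```python
import math
from collections import Counter


def mutual_information(tokens, w1, w2):
    """Pointwise MI (base 2) for w1 immediately followed by w2 in a token list."""
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    observed = bigrams[(w1, w2)]
    if observed == 0:
        # The pair never occurs, so the score is undefined; report -inf.
        return float("-inf")
    # Frequency expected by chance if the two words were independent.
    expected = unigrams[w1] * unigrams[w2] / n
    return math.log2(observed / expected)


# Toy data (hypothetical): each word occurs once in an 8-word "corpus".
toy = "x y a b c d e f".split()
score = mutual_information(toy, "x", "y")
print(score)
```

A score of 0 would mean the pair occurs about as often as chance predicts; higher scores indicate a stronger tendency for the two words to occur together. Because rare words inflate the score, MI is usually read alongside frequency rather than on its own.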
Often, researchers with small or quite specific corpora will refer to a large
general corpus such as the British National Corpus (BNC) or the Corpus
of Contemporary American English (COCA) for information about particular
word strings. For example, Wood and Namba (2013) identified formulaic
sequences of potential value for Japanese university students to perform
oral presentations. The sequences were all generated using native speaker/
proficient speaker intuition, and were then confirmed as formulaic with
reference to the spoken language subcorpus of the COCA at a frequency
cutoff of at least ten occurrences per million words and with an MI score of
at least 3.0 in the corpus (for an overview of MI see Schmitt, 2010). This
ensured that the sequences were frequent in spoken discourse and that
they were strings of items highly likely to stick together, two powerful
markers of formulaicity. Other researchers have used the hits generated
by online search engines such as Google to aid in determining what is
formulaic. Shei (2008) illustrated how no popularly available corpus seems
large enough to provide adequate instances of formulaic sequences for
close investigation. Shei proposes that researchers and teachers use the
Internet as a sort of vast corpus, employing a search engine like Google
to help identify and retrieve multiword units for linguistic research and
language teaching and learning. Simply Googling a particular word string
and examining the resulting hits can yield valuable information about its
frequency, form, variability, and functions.
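The two cutoffs reported for the corpus confirmation step (at least ten occurrences per million words and an MI score of at least 3.0) amount to a simple filter over candidate sequences. The sketch below only illustrates that logic. The Candidate type, the function name, and the figures attached to the placeholder sequences are invented for the example and are not taken from COCA or from the study.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    text: str
    per_million: float  # normalized frequency in the reference corpus
    mi: float           # mutual information score in the reference corpus


def passes_cutoffs(candidate, min_per_million=10.0, min_mi=3.0):
    """True only if the candidate meets both the frequency and the MI cutoff."""
    return candidate.per_million >= min_per_million and candidate.mi >= min_mi


# Placeholder candidates with made-up figures, for illustration only.
candidates = [
    Candidate("placeholder sequence a", 25.0, 4.2),  # frequent and cohesive
    Candidate("placeholder sequence b", 40.0, 1.1),  # frequent but weakly associated
    Candidate("placeholder sequence c", 2.0, 6.5),   # cohesive but rare
]
confirmed = [c.text for c in candidates if passes_cutoffs(c)]
print(confirmed)
```

Requiring both conditions screens out strings that are merely frequent as well as strings that are cohesive but too rare to be of much use to learners.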
Psycholinguistic measures
As seen in Chapter 4, a number of studies of formulaic language have been
carried out using measures of processing speed. Conklin and Schmitt (2012)
summarize a list of studies that have incorporated a variety of measurements,
including reaction times (e.g., Conklin & Schmitt, 2012), eye movement
(e.g., Underwood, Schmitt, & Galpin, 2004), and electrophysiological (ERP)
measures (e.g., Tremblay & Baayen, 2010).
These studies use eye tracking or response latencies involving reading.
While psycholinguistic measures are useful for determining which sequences
have been stored holistically by individual speakers, they provide us with
a partial view of the use of a sequence; for example, they do not usually
help us to know how common a given sequence may be in actual use in the
community, and the formulaic sequences identified in these ways may include
rare, unusual, or one-off sequences that the speaker has tended to use for a
variety of idiosyncratic reasons.
Phonological characteristics
Another measure used to identify formulaic sequences in spoken language can
be phonological coherence, discussed at some length in Chapter 6. Formulaic
sequences tend to be uttered with particular prosodic features such as
alignment with pauses and intonation units, resistance to internal dysfluency,
no internal hesitations, fast speech rhythm, and stress placement restrictions
(see Lin, 2010, 2012, for a discussion). Some cautions are important here: as
with psycholinguistic methods, phonological coherence provides a limited or
partial sense of formulaicity. For one thing, phonological coherence is limited
to analysis of spoken language only. As well, it only relates to formulas used
by a particular speaker, and analysis is limited by the quality of the audio data
recorded.
1 phonological coherence
2 greater length and complexity than other output
3 nonproductive use of rules underlying a sequence
4 situational dependence
5 frequency and invariance in form
word string.
2 By my judgment, part or all of the word string lacks semantic
transparency.
3 By my judgment, this word string is associated with a specific
that there might not be a single answer as to what to search for was
at least partly addressed by having the judges read relevant literature
about formulaic sequences and to study and apply a set of five criteria
drawn from that literature.
5 Application of intuition in such a way may occur at the expense of
Judgment criteria
Five criteria were applied in deciding whether a sequence was a formula,
drawn from previous research on formulaic sequences. No particular criterion
or combination of criteria was deemed essential for a word combination
to be marked as formulaic; these were only guides:
1 Phonological coherence and reduction. In speech production
Judgment procedure
The expert judges were two graduate students in applied linguistics, and the
researcher himself. All had read Coulmas (1979), Nattinger and DeCarrico
(1992), Peters (1983), and Wray and Perkins (2000) prior to the judging
process. A benchmarking, preliminary discussion session was held in which
the judgment criteria and the procedure as a whole were clarified, and
two transcripts were jointly examined and coded by all three judges, in an
effort to standardize the overall approach to identification of formulas. Due
to the fact that the speech samples were very specific narrative retells, the
formulas identified covered a wide range, from idioms (love your neighbor,
that's it, instead of) to two-word verbs (throw away, come back, let out, give
up, got mad, fall down), to repeated prepositional and participial phrases
(living in the same house, taking a bath, started fighting, out of the house,
at the moment, in the middle). The judges individually coded the rest of the
transcripts, following the time sequences of the speech samples, beginning
with sample number one for a given participant and continuing to sample
two and on through sample six for the same participant. After this, marked
items were accepted as formulaic if two or all three of the judges were
in agreement. In some cases, issues such as location of the boundaries
between formulas and the surrounding language, or judges' determination
that some items were possibly but not definitely formulaic, were decided by
the researcher.
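The acceptance rule in this procedure (an item counts as formulaic if two or all three judges marked it) can be sketched as a simple vote count. This is a hypothetical illustration: the function name and the representation of each judge's coding as a set of strings are assumptions, and the researcher's case-by-case decisions about boundaries and doubtful items are not modeled.

```python
from collections import Counter

def accepted_formulas(codings, min_judges=2):
    """Return items marked as formulaic by at least `min_judges` judges.

    `codings` holds one collection of marked items per judge.
    """
    votes = Counter()
    for marked in codings:
        # set() ensures a judge contributes at most one vote per item
        votes.update(set(marked))
    return {item for item, n in votes.items() if n >= min_judges}

# Invented codings for three judges
judge_a = {"give up", "at the moment", "fall down"}
judge_b = {"give up", "at the moment"}
judge_c = {"give up", "in the middle"}
agreed = accepted_formulas([judge_a, judge_b, judge_c])
```

In practice each marked item would also need to carry its transcript and position, so that the same word string coded in different places is not conflated.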
(2003) propose that each applicable criterion on their list should be rated on
a 5-point Likert scale from strongly agree to don't know, to strongly disagree,
where strongly disagree indicates the absence of a trait that sometimes
indicates it [formulaicity] (p. 26). All four checklists represent a departure
from the methods described above in that they place considerable importance
on native speaker intuition.
Wray and Namba's (2003) checklist is the most ambitious, having a
total of eleven criteria that address thirteen points. Peters (1983), on the
other hand, lists six criteria that address eight points, while Wood's (2010a)
checklist is based on five criteria. Taken together, the checklists show
remarkable agreement on the range of characteristics that may be indicative
of formulaicity; all of them make reference to phonological characteristics
and complexity. As seen elsewhere in this book, phonological markers of
formulaicity can include phonological coherence, reduction, or distinctive
phonological patterns, including phonological fusion, reduction of syllables,
or deletion of schwa (Wood, 2010a). Complexity refers to the fact that a
given sequence may be noticeably more advanced or less advanced than
the individual's typical nonformulaic language use in terms of syntactic and
morphological features.
Wood's (2010a) is the only checklist to specifically reference form, linked
to Nattinger and DeCarrico's (1992) taxonomy of lexical phrases. Frequency is
also considered by Wray and Namba (2003) and Peters (1983) to be a mark of
formulaicity, although frequency in this context refers to frequent use by the
speaker, not an arbitrary threshold for identification set in corpus research. As
already discussed, Wood (2010a) does not consider frequency in and of itself
to be a criterion for identification.
It is also interesting to note that both Peters (1983) and Wray and Namba
(2003) allow for the two social extremes, idiosyncratic uses and community-wide uses, of formulas. Wood (2010a) does not consider idiosyncratic
uses a mark of formulaicity, though they are accepted given the developing
competence of nonnative speakers participating in the study. Wray and
Namba's (2003) checklist features criteria that can be applied to either correct
or inappropriate forms. Wray and Namba's (2003) checklist takes into account
local repetitions, including reading, derivations, and functional uses as possible
indicators of formulaicity.
Clearly, all the checklists rely on native speaker intuition to classify word
combinations as formulaic. For a number of types of research in this area,
judgment checklists can help to overcome the limitations of frequency-based, psycholinguistic, or phonologically focused identification methods, and
provide a sort of aggregate measure of formulaicity. This is first and foremost
a useful means of identifying formulas in spoken language corpora, but, as
In summary
From this general overview of approaches and means and methods of
identifying formulaic sequences in corpora and texts, some interesting
possibilities come into sight. It is possible to determine formulaicity by using
frequency statistics, either from one's own corpus or from taking sequences
from one's own text or small corpus and checking their frequency or MI
in very large corpora such as the BNC or the COCA. It is even feasible
to use Internet search engines to guide decisions about formulaicity.
Psycholinguistic or acoustical features of a sequence and its processing
can also yield useful guidance about possible formulaicity. In working with
language data, expert or native speaker judgment about formulaicity may
be employed, measures well suited to smaller or quite specific data sets. In
these cases, a checklist of characteristics of the strings and their uses can
be a useful guide for judges. A few of the many themes and patterns which
the research shows are:
Even if you use corpus frequency and MI statistics and acoustical features and
judges and checklists, you are likely to remain guarded about your decisions
about formulaicity. As the body of research grows, however, it is more and
more likely that new and more reliable or confidence-inspiring means of
determining formulaicity will emerge.
3
Categories of Formulaic Language: Labels and Characteristics
Over the history of formulaic language research the units of analysis have
been labeled in a wide range of ways. This is largely because researchers
were not all examining the exact same phenomenon, and were frequently
working in quite separate areas of linguistics and, as we saw in Chapter 1,
even in areas outside of linguistics in fields as diverse as social anthropology
and neurology. It took time for anyone to survey the existing research and
actually attempt to draw a picture of the phenomenon under examination and
sketch out the state of knowledge about it. In fact, it was not until Wray (1999)
took a step back and examined the growing body of research that the umbrella
term formulaic language/formulaic sequence came into widespread use. Since
then, that term has more or less gained and held traction in the literature.
We have seen special issues of journals devoted to formulaic language, for
example, in 2012 a special volume of Annual Review of Applied Linguistics.
We have seen particular academic symposia devoted to the field, for example,
the Symposium at University of Wisconsin Milwaukee in 2007. Wray herself
was instrumental in developing the Formulaic Language Research Network
(FLaRN), which has a social networking site on the Web with hundreds of
members, and has spawned a series of well-attended seminars over the years
at which researchers share their work.
It was Wray and Perkins (2000, p. 3) who noted that formulaic language at
that point had been labeled by as many as forty terms (Table 3.1):
As we can see from the list, the range is remarkable. In the years since
the publication of Wray and Perkins, new terms have been added, including
corpus technology-derived labels such as n-grams and concgrams. However,
a survey of the main categories in recent literature shows that there is
considerable common ground among researchers as regards exactly what
they are studying, but, at the same time, categories exist for valid reasons.
The main areas of focus which have emerged over the years are collocations,
idioms, lexical phrases, lexical bundles, metaphors, proverbs, phrasal verbs,
n-grams, concgrams, and compounds. If we examine each of these in turn, we
will end up with a strong sense of what is actually meant by formulaic language.
Collocations
The term collocation is a bit of a puzzler for many, because it appears to
simultaneously refer to a specific type of word combination and to all
to consider a collocation. Jones and Sinclair (1974) found that the span of
words which is optimal for a collocation is four words to the right or left of a
node, or core word. Kjellmer, Stubbs, and Altenberg took the computerized
methods espoused by Sinclair some steps further. Kjellmer worked on the
Dictionary of English Collocations (1984), defining a collocation as a continuous
and recurring sequence of two or more words which are grammatically well
formed. These efforts were the genesis of the computer-based, frequency-driven study of collocations.
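The idea of a node word with a span of four words to either side can be illustrated with a short sketch. This is an assumption-laden illustration, not Jones and Sinclair's actual method: the function name is hypothetical, the text is assumed to be already tokenized, and the sketch only counts co-occurring words without applying any statistical test of association.

```python
from collections import Counter

def collocates(tokens, node, span=4):
    """Count the words occurring within `span` tokens left or right of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            counts.update(left + right)
    return counts

# Invented example sentence
tokens = "he made a strong cup of tea and she made strong coffee".split()
counts = collocates(tokens, "strong")
```

Raw co-occurrence counts of this kind are only a starting point; frequent grammatical words will dominate the list unless the counts are weighed against how often each word occurs in the corpus overall.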
Idioms
Definitions of idiom
Unfortunately, the definition of idiom is in some ways just as fraught as that
of collocation. Some researchers use the term in an extremely broad sense,
encompassing proverbs, slang expressions, and even individual words of
certain types. Many others, however, use the term in a much narrower sense,
to refer only to word strings which are, in the words of Moon (1998, p. 4),
"fixed and semantically opaque or metaphorical," for example, kick the bucket
or spill the beans. It may be that for such a complex language phenomenon,
no specific single definition will do it justice.
The most encompassing definition of idiom is that which includes even
single words. The definition elaborated by Hockett (1958) is a classic in this
category, labeling any language item whose meaning is not visible from
Categorizations of idioms
As for categorization of types of idioms, various scholars have elaborated
taxonomies. Makkai (1972) identifies six subcategories (adapted from Liu,
2008, pp. 17, 18):
1 Phrasal verbs: verb and one or two particles, for example, come
across
2 Tournure: a verb and at least two words (often noun phrases), for
high-handed
regard to or in regard to
2 Formulae: grammatical in structure and compositional in meaning,
formulaic language.
2 Semantic opacity (adding up meanings does not yield the whole):
spic and span and to and fro are examples of this phenomenon,
although the lexical items involved are in and of themselves opaque;
we do not see spic, span, or fro used in other contexts (see Allerton,
1984 for more on this). Other examples of semantic opacity have
roots in history, such as kick the bucket (die), which derives from
a phenomenon known in the procedures involved in the slaughter
idea of lexicality, this implies that the words in an idiom are fixed and
cannot be substituted by synonyms. Some idioms are fixed to the
point of not allowing any syntactic or morphological variation, such
as hook, line, and sinker or by the way, or beat around the bush; we
cannot pluralize any of the items in these word sequences nor, for
example, passivize the latter one to read the bush is beaten around.
However, some idioms allow a limited amount of such variation, such
as red herring or teach an old dog new tricks; it is possible to say red
herrings, plural, and to reverse the order of an idiom such as teach
new tricks to an old dog.
Lexical phrases
Lexical phrases are a particular subset of formulaic language first publicized
by Nattinger and DeCarrico (1992), based largely on previous work by
Becker (1975). They outline two large categories of the phrases, strings of
specific lexical items and generalized frames. The former are generally unitary
lexical strings and may or may not be canonical in the grammar, while the
latter consist of category symbols and specific lexical items. Four criteria
help in classifying the phrases: length and grammatical status; canonical
or noncanonical shape; variability or fixedness; whether it is a continuous,
unbroken string of words, or discontinuous, allowing lexical insertions (pp.
37, 38). They also identify four large categories of lexical phrases which
display aspects of the four criteria: polywords, which operate as single
Lexical bundles
Lexical bundles (Biber & Conrad, 1999; Biber, Johansson, Leech, Conrad, &
Finnegan, 1999) are a category of formulaic language characterized by the
means by which they are identified and their purely functional nature: they
are not meaning units per se, but rather, units of function which serve to
characterize particular types of discourse. The work on lexical bundles has
been overwhelmingly conducted on academic language, especially academic
written text.
Lexical bundles are combinations of three or more words which are
identified in a corpus of natural language by means of corpus analysis
software programs. An additional characteristic of lexical bundles is that they
occur across a range of texts or, in the case of academic language, a range of
disciplines. Biber and Conrad (1999) noted that these word combinations are
so common, it might be assumed that lexical bundles are simple expressions,
and that they will be acquired easily (p. 188). However, the acquisition and
use of lexical bundles does not appear to occur naturally. Lexical bundles have
been shown to be used at high frequency in published academic writing,
and particular types of the bundles are characteristic of particular disciplines
(Cortes, Jones, & Stoller, 2002). Academic disciplines have different ways
of seeing the world, connected with different communicative conventions
(Hyland & Hamp-Lyons, 2001).
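The two defining properties mentioned above, recurrence of a multiword string and its occurrence across a range of texts, can be sketched as follows. This is a hypothetical illustration rather than the procedure of any particular corpus tool: the function name, the bundle length, and both cutoffs are assumptions, and raw counts are used where published studies typically normalize frequency per million words.

```python
from collections import Counter, defaultdict

def lexical_bundles(texts, n=4, min_freq=3, min_texts=2):
    """Return n-word strings that recur often enough and in enough texts.

    `texts` is a list of token lists, one per text. The thresholds are
    illustrative values, not cutoffs taken from the literature.
    """
    freq = Counter()
    spread = defaultdict(set)
    for text_id, tokens in enumerate(texts):
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            freq[gram] += 1
            spread[gram].add(text_id)
    return {gram: count for gram, count in freq.items()
            if count >= min_freq and len(spread[gram]) >= min_texts}

# Invented miniature "corpus" of three texts
texts = [
    "on the other hand we see".split(),
    "but on the other hand it".split(),
    "on the other hand on the other hand".split(),
]
bundles = lexical_bundles(texts)
```

The range requirement matters because a string repeated many times by a single writer would otherwise look like a bundle even though it is not characteristic of the discourse type as a whole.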
Biber (2006) presented a comprehensive corpus-based analysis of
university language, including an examination of lexical bundles in textbooks.
He found that academic disciplines differed in their use of lexical bundles, with
natural and social sciences relying on them more than the humanities. Overall,
the distribution of lexical bundles across functional categories in Biber's study
shows that referential bundles, which make direct reference to real or abstract
entities or to textual content or their attributes, are the most common.
Metaphors
Metaphor is essentially a semantic principle centered around an
unconventional act of reference, a word that is used to describe an entity
which is essentially outside of its denotational range, and there is tension
between a literal and a metaphoric interpretation. The structure of a metaphor
is like this: a vehicle is the term used in an interpreted sense which cannot
be understood literally because of the unusual context of use. The topic is
the referent of the vehicle. The grounds are the analogies or features shared
between vehicle and topic. Take, for example, life is a highway. Highway
is the vehicle, the word being used in an interpreted, not literal sense. Life
is the topic, and the grounds are the analogy between the passing of time
and the covering of distance. The metaphor can, of course, use a marker
such as is like or kind of such as in life is like a box of chocolates. The vehicle
can be single words or phrases or full clauses or sentences. The strength of
a metaphor depends on the degree of semantic tension between vehicle
and topic, linguistic markers such as like, kind of, and the implicitness or
prominence of the vehicle. The metaphor calls on us to compare the two
components, the vehicle and the topic.
Proverbs
Proverbs are hard to define but key are the opacity of the relationship
between literal and figurative meanings, and sentence-like length. Pragmatic
characteristics of proverbs include advice and warning (better late than never,
don't put the cart before the horse), instruction and explaining (an apple a
day keeps the doctor away, the ball is in your court), and communicating
common experience and observations (you can't get blood from a stone,
just as the sun rises in the east). They are not the words of the speaker, but
quotations from a canon of proverbs shared by members of a community.
They feature the linguistic characteristics of brevity and directness, simple
and/or parallel syntax, metaphorical quality, and sometimes archaic
structures.
Compounds
Compounds are special cases in formulaic language study, being more a
branch of word formation. A compound is, in fact, the creation of a word
with a unique meaning by combining two existing words, and in English
many compounds in fact are written as two separate words (see ten
Hacken, 2004).
Compounds show asymmetry, with the second of the two words
usually the head or core of the combination; for example, desk computer
describes a type of computer and computer desk describes a type of desk
(see Williams, 1981). The head word is subject to rules of pronoun reference
and has some freedom of syntactic form. The head represents a type and
the nonhead serves to classify the head. There are three forms of compound
words:
blended into a single word. The rules for writing compounds are not universal
or specific in English, and it is common for even experienced and highly
educated writers to need to consult dictionaries or online resources to
determine whether a given item is two words, a hyphenated compound, or
a single word.
Words modified by adjectives, for example, an old school, are different
from a compound word, for example, a high school in the degree to which
the nonhead word changes the essential character of the head, or the
degree to which the modifier and the noun are inseparable. In the example
of high school, the compound represents a single entity, a particular type
of school which is always identified as such, whereas old school is simply a
school being described as old. The adjective slot in the combination can be
filled by any number of items.
Modifying compounds are often hyphenated, for example, an old-furniture
salesman sells old furniture, but an old furniture salesman is an old man. When
compound modifiers precede a noun, they are often hyphenated: part-time
worker, high-speed chase. Adverbs, words ending in -ly, are not hyphenated
when compounded with other modifiers: highly rated university, a partially
refundable purchase.
In pluralizing, the most significant word, the head, takes the plural form.
Examples include also-rans, fathers-in-law, and go-betweens.
Phrasal verbs
Phrasal verbs are a particularly English type of formulaic language phenomenon.
They are verbs combined with a preposition or particle, or both, with often
nonliteral meanings, or both literal and figurative interpretations, like idioms.
Three structural categories exist:
Verb + preposition (prepositional phrasal verbs)
Help me look after Jake's dog for the weekend.
Other children often picked on Sebastian.
What if you run into your ex-wife at the party?
Verb + particle (particle phrasal verbs)
You should bring that up at the next meeting.
Try not to give in when you see the dessert table.
Come over and let's hang out for the afternoon.
Concgrams
A concgram is, like all formulaic language, a combination of two or more
words. However, a concgram is a noncontinuous sequence, in which the
constituent words are separated by others. The idea dates back to the 1980s
when the Cobuild team at the University of Birmingham tried to find a way
to search corpora by machine for noncontiguous sequences of associated
words.
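A search for noncontiguous sequences of this kind can be sketched briefly. The sketch rests on stated assumptions and is not the Cobuild team's software: the function name is hypothetical, it handles only two-word combinations in a fixed order, and it requires at least one and at most `max_gap` intervening words.

```python
def find_concgram(tokens, first, second, max_gap=3):
    """Find `first` ... `second` separated by 1 to `max_gap` other words.

    Returns a list of (start_index, end_index, intervening_words) tuples.
    """
    hits = []
    for i, token in enumerate(tokens):
        if token != first:
            continue
        # Start at i + 2 so that at least one word intervenes
        for j in range(i + 2, min(len(tokens), i + 2 + max_gap)):
            if tokens[j] == second:
                hits.append((i, j, tokens[i + 1:j]))
    return hits

# Invented example sentence
tokens = "they play a very important role and play a role".split()
hits = find_concgram(tokens, "play", "role")
```

Here the same pair of words is retrieved both with three intervening words and with one, which is the kind of variation a search restricted to contiguous strings would miss.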
The ability to discover noncontinuous word combinations in corpora
increases the likelihood that researchers will discover not only a more
In summary
From this general overview of categories of formulaic language, it can
be surprising how many and varied the types are. The phenomenon we
are dealing with is by no means unitary, and the classifications and the
taxonomies are somewhat leaky or slippery: the distinctions between,
for example, a collocation and an idiom are blurry, and it also appears
that particular researchers have somewhat arbitrarily composed their
own sets of descriptions and classifications and definitions of the various
types. Other types of formulaic sequences seem lost in the shuffle,
uncategorized but intuitively formulaic; look at items like and then or
sooner or later. Where do they fit? The advent of corpus analysis technology
and techniques has done much to help us identify new types of formulaic
sequences, but what makes the exact determination of a lexical bundle
different from a sequence identified using frequency and other statistical
measures such as mutual information (see Chapter 2)? Is the distinction
even worthy of debate? A few of the many themes and patterns which the
research shows are:
4
Mental Processing of Formulaic Language: Holistic and Automatized
to swim, ride a bicycle, or drive a car. The key to your ultimate success was
probably practice, right? In fact, automatization or proceduralization occurs
generally through a process of repetition and repeated use and recall. Through
automatization, content originally stored in the conscious mind can become
available for efficient use in real time. Knowledge which is proceduralized can
be said to be available for use more or less subconsciously, and one may
perform other tasks simultaneously. Let's take one of our examples and break
it into steps: learning to drive a car. At first all the necessary steps must be
explained and shown. The novice driver struggles to control the steering, the
brakes, the accelerator, and, if a standard shift vehicle, the delicate clutch, shift,
and accelerator control while shifting. Any distractions will tend to make the
novice driver lose control: noise, the need to carry on a conversation, or any
physical effort in addition to the driving itself, such as controlling windshield
wipers, attending to other traffic, and so on. With repetition, however, the
performance becomes much smoother and less effortful. Eventually one
gets to the point where the driving itself is more or less automatic, and
you can chat, sing to music, drink beverages, smoke cigarettes, and so
on, simultaneous to driving. This is an example of how automatization of a
complex set of knowledges or actions can, over time with practice, become
skilled and performable concurrently with other tasks. So it is with language.
Think about someone progressing to be more and more fluent over time.
He or she first struggles to produce even content words, let alone anything
approaching grammar. With exposure and practice he or she can start to
laboriously create roughly grammatical utterances. But this takes quite a bit
of effort. His or her cognitive and affective resources are almost entirely taken
up with formulating utterances: retrieving words from the mental lexicon,
applying rules of syntax and morphology to them, lining them up, articulating
them. All of this takes up most of his or her head space. Distractions,
stresses, or interruptions may cause him or her to lose the train of thought or
communication and need to start all over. With time, aspects of this process
become automatized and the student is able to produce utterances with less
excruciating effort, depending on context and so on. He or she can attend to
coming up with ideas to communicate, to planning the next things to say, and
so on, while actually producing language. Like the driver of a car, he or she is
now able to multitask and be more skilled and flexible with the passing of
time and plenty of practice.
Spontaneous speech
Producing spontaneous language is a shockingly complex thing, if you pause
to really look at it. Watch a person speaking in conversation or in a skilled way
in any context. Mind and the muscles of articulation are operating in synch in
a truly remarkable way. Ideas roll around the brain, simultaneous to weighing
contextual factors such as a sense of who is listening and what shared
knowledge and opinion exist. Clauses and phrases and words encapsulating
the ideas and meanings and nuances of the speaker's mind clump together
and roll out into the air. And, remarkably, a listener can attend to and react to
the utterances in real time, showing comprehension, and even be driving a
car or cooking a meal, or watching a television program at the same time. This
seems miraculous and may be at least partly explained by the automatization
or proceduralization processes described earlier. It seems that one's memory
of language bits and pieces and rules, and so on is the underpinning of all
this. We recall what we have learned and implement it using another type
of memory. But if we take a look at the nature of human memory, we find a
surprising limitation which at the same time helps us to understand a bit more
the role of formulaic language in communication.
to have encountered many idioms very often (Conklin & Schmitt, 2012). In
light of all of these restrictions on concluding from the idiom research, we
still need to approach the study of mental processing with a mix of caution
and enthusiasm. While the idiom research provides us with a few tantalizing
bits of knowledge, the study of mental processing of other types of formulaic
sequences may be a much richer and more rewarding area to investigate.
In summary
From this general overview of research and theory on mental processing of
formulaic language, it is really interesting to note that despite the quantity of
research which has explored the notion of retrieved and stored as wholes,
there are still unanswered questions. It may be stated that formulaic language
is likely to some extent dealt with holistically. But is this always the case?
Can a given sequence sometimes be dealt with holistically and sometimes
constructed in a more synthetic, conscious manner? If so, what are the factors,
cognitive and contextual, which might influence whether the sequence is
dealt with holistically or not? Is it also perhaps the case that sequences may
fit on a spectrum of holistic processing, with, for example, collocations and
idioms being on the holistic end of the spectrum, and lexical bundles or lexical
phrases being dealt with in a more constructed way? Are there answers in
second language acquisition theory that we have not yet encountered?
A few of the many themes and patterns which the research shows are:
5
Formulaic Language and Acquisition: First and Second Language
second language acquisition. It appears that initial first and second language
acquisition in children includes attending to formulaic sequences in language
input, adopting them for use, and later segmenting and analyzing them. The
analysis may take place later partly as a result of neurological development
and a resultant increase in analytic cognitive skills.
Early research
The first serious study of formulaic language in child language acquisition
dates back to the 1970s. The first such studies were basically case studies
of individual children and their progress through the acquisition of language.
Wong-Fillmore (1976) was one of the very first to study the second language
acquisition of a child and find that one prominent process involved formulaic
chunk acquisition. Her data further revealed that this was followed by a
process of segmentation or syntactic and semantic analysis and breakdown
of the acquired chunks. This in turn furthered development of overall linguistic
competence. Another early researcher in the area, Hakuta (1974), conducted
a sixty-week study of the second language acquisition of a Japanese child and
found evidence of initial acquisition of prefabricated chunks later analyzed
and used to facilitate overall language development. Much later, in a similar
vein, Hickey (1993), in a longitudinal examination of the acquisition of Irish
Gaelic of a child, also discovered a role for formulas in acquisition. Again, she
found that they were later broken down and analyzed, providing grist for the
linguistic competence mill.
A turning point in the first language acquisition research came in the early
1980s with Anne Peters's seminal piece of work on child language acquisition.
Peters (1983) documented how the process of formulaic chunk acquisition and
later segmentation, as established by Wong-Fillmore and Hakuta, might actually
work. Peters claims that there is evidence for eight assertions about the process:
1 First acquisition units by children often consist of more than one
morpheme.
2 There is no difference between these units and minimal ones in terms
of storage.
3 All of the polymorphemic units can be segmented (broken down).
4 Smaller units from segmentation are stored in the lexicon.
5 Both the original unit and the segmented ones can be stored in the
lexicon.
6 Segmentation produces structural information, starting with simplest
According to Peters, the child very early and quickly develops strategies for
extracting meaningful chunks from the flow of conversation. This may be
based on any of a range of cues, for example:
1 the utility of the chunk for his/her own needs
2 the result he/she observes occurring when the chunk is used among
adults
3 the frequency with which he/she is exposed to the chunk
4 some attractive aspect of the phonetics or prosody of the chunk
time is it?
6 Phatic formulas: to establish, prolong, or discontinue interaction, for
some ten times more frequent as a word in general discourse, making the
make X cry construction appear quite formulaic. In the study, first language
participants were required to identify whether they heard cry or try after
the carrier phrase they made me, and the signal ranged from try to cry on
an eight-step continuum. The ambiguous sounds were more often perceived
as cry, showing that the formulaic nature of the make-causative construction
was quite powerful.
Reading time also appears to be influenced by formulaic knowledge. Bod
(2001) showed that higher frequency three-word sentences such as I like it
were reacted to faster by native speakers than low frequency ones. Ellis,
Frey, and Jalkanen (2008) showed that native speakers are quick to read and
process frequent collocations with verb agreement and booster-maximizer
adjectives. Arnon and Snider (2010) showed that more frequent phrases were
processed faster than less frequent ones even when they were matched
as to the frequencies of individual words. Tremblay, Derwing, Libben, and
Westbury (2011) used three self-paced reading tasks with lexical bundles
(see Chapter 8) and matched control sentence fragments to show that the
lexical bundles were read faster.
Studies involving retention of material in short-term memory and accurate
subsequent reproduction show an influence of knowledge of formulaic
language. Bannard and Matthews (2008) found that children were more likely
to reproduce familiar sequences correctly than less frequent or familiar ones,
and to reproduce them faster. Studies of priming, in which sequences recently
encountered in communication are reproduced later, show a priming effect
for hearing, speaking, reading, or writing sequences (see e.g., McDonough
and Trofimovich, 2008 and a growing body of subsequent work).
Most of the aforementioned studies involved native speaker or child
participants. With second language participants, some remarkable work has
also been done. Conklin and Schmitt (2008) found that formulaic phrases were
read faster than matched nonformulaic phrases by both native and second
language participants. Ellis and Simpson-Vlach (2009) and Ellis, Simpson-Vlach,
and Maynard (2008) found that second language learners processed
formulaic language more effectively if it was of high frequency, whereas
native speakers processed faster those sequences which also exhibit a
high rate of mutual information (MI), a measure of how strongly words are
statistically associated, that is, how much more often they co-occur than
chance would predict.
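To make the MI measure concrete, here is a minimal sketch of one formulation often used in corpus work, pointwise mutual information for adjacent word pairs: MI = log2(f(xy) × N / (f(x) × f(y))), where f is a raw frequency and N is the number of tokens in the corpus. The formula, the function name mi_scores, and the toy sentence are illustrative assumptions; they are not taken from the studies cited above, which may have computed MI differently.

```python
import math
from collections import Counter


def mi_scores(tokens, min_freq=1):
    """Return a dict mapping each adjacent word pair to its MI score.

    Assumption: MI = log2(f(xy) * N / (f(x) * f(y))), with N taken as the
    number of tokens. Pairs occurring fewer than min_freq times are skipped.
    """
    n = len(tokens)
    word_freq = Counter(tokens)
    pair_freq = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), f12 in pair_freq.items():
        if f12 < min_freq:
            continue
        # Observed pair frequency relative to what chance would predict
        scores[(w1, w2)] = math.log2(f12 * n / (word_freq[w1] * word_freq[w2]))
    return scores


if __name__ == "__main__":
    # Hypothetical toy corpus, for illustration only
    sample = "of course he said of course and then he left of course".split()
    ranked = sorted(mi_scores(sample, min_freq=2).items(), key=lambda kv: -kv[1])
    for pair, score in ranked:
        print(pair, round(score, 2))
```

A pair scores high when its words are rarely found apart, even if the pair itself is not especially frequent. This is why MI and raw frequency can rank the same sequences quite differently.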
Extensive exposure to formulaic language appears to aid fluency of
speech. For example, Taguchi (2007) studied the development of speech
abilities in students drilled in word chunks and found that they used more
correct chunks after instruction and that they were more aware of discourse
features. Wood (2006, 2009a, 2009b) and Wood and Namba (2013) have
shown that exposure to and practice with formulaic sequences has positive
In summary
From this short overview of the research into first and second language
acquisition of formulaic language, some patterns and themes emerge. One
important element is the notion of segmentation of formulaic sequences from
the input, and subsequent breakdown of the stored sequences and use of
their constituent elements for development of the language system, grammar,
and so on. The research into adult second language acquisition of formulaic
language is heavily concentrated in naturalistic contexts of acquisition, and
leaves little for classroom or formal language teaching practitioners to work
with. A great deal of the work with adults has involved native speakers or
second language speakers of high proficiency. Is it possible that second
language learners do acquire some formulaic language units as wholes at
first and then break them down as time passes? Or do they tend to recognize
the sequences as strings of discrete and separately recognizable units, and
only when instructed to, perceive them as wholes? Is there a blend of both
of these types of processing and acquisition? More research is needed to
determine whether this is so and if so, how it works: what makes some
sequences salient as wholes and others not? A few of the many themes and
patterns which the research shows are:
For certain, all the questions have not been answered yet in any particular
area. How can we determine whether and how an adult learner might perceive
6
Formulaic Language and
Spoken Language: Fluency
and Pragmatic Competence
and may allow speakers to produce language that outstrips their actual
competence; they facilitate fluent speech.
Shin and Nation (2008) worked to identify the most common or high-frequency
collocations in English using specific criteria. Using the spoken
subcorpus of the BNC as a basis, they used the most common single
content words (nouns, verbs, adjectives, and adverbs) as a starting point,
and used a frequency cutoff of thirty per ten million running words. The list
is presented in Table 6.2:
Speech fluency
If you look around for descriptions of spoken language abilities, such as
second language syllabuses or assessment criteria and so on, you are likely to
see the word fluency represented somewhere. It is often used as a synonym
for nativelike ability in a second language, or for having good command of
language. In terms of speech in particular, ask people what it means and you
may hear words like smoothness or flow of speech. In the end, though,
the research literature on fluency has generally attended to temporal variables
of speech. These are measurable, quantifiable aspects of speech: speed;
pauses and hesitations; length of runs.
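As a rough illustration of how such temporal variables can be quantified, the sketch below computes four commonly reported measures from a speech sample that has already been divided into runs and pauses. The Segment structure, the measure names, and the sample values are hypothetical, and the choice of what counts as a pause (many studies set a minimum duration) is assumed to have been made before the data reach this function.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str          # "run" (stretch of speech) or "pause"
    duration: float    # length in seconds
    syllables: int = 0 # syllables uttered; 0 for pauses


def temporal_measures(segments):
    """Compute simple temporal fluency measures from runs and pauses."""
    runs = [s for s in segments if s.kind == "run"]
    if not runs:
        raise ValueError("at least one run is required")
    total_time = sum(s.duration for s in segments)
    speaking_time = sum(s.duration for s in runs)
    syllables = sum(s.syllables for s in runs)
    return {
        # Speed including pause time, in syllables per minute
        "speech_rate": syllables / total_time * 60,
        # Speed excluding pause time, in syllables per minute
        "articulation_rate": syllables / speaking_time * 60,
        # Mean number of syllables uttered between pauses
        "mean_length_of_run": syllables / len(runs),
        # Proportion of total time spent speaking
        "phonation_time_ratio": speaking_time / total_time,
    }


if __name__ == "__main__":
    # Hypothetical sample: two runs separated by one pause
    sample = [
        Segment("run", 3.0, 13),
        Segment("pause", 1.0),
        Segment("run", 4.0, 19),
    ]
    for name, value in temporal_measures(sample).items():
        print(name, round(value, 2))
```

Keeping speech rate and articulation rate separate shows whether a gain in speed comes from speaking faster or from pausing less.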
NP + Aux + VP ().
2 Collocations are strings of specific lexical items, such as rancid
butter and curry favor, that co-occur with a mutual expectancy
greater than chance ().
3 Lexical phrases are collocations, such as how do you do? and for
example, that have been assigned pragmatic functions () (p. 36).
The authors go on to refine these categories and further refine their shared
characteristics.
And he came back the cat came back to the his house and ah
This results in a run of thirteen syllables, only one of which is a filler nonlexical
item ah.
Here she appears to think aloud, buying time to recall the next event in the
narrative and uses a very simple subject + verb formula to repeat her lack of
clear recall. It helps her to produce a 19-syllable run.
Stringing together multiple formulaic sequences also helped speakers to
avoid pausing and to extend runs. For instance, in later retells of the film
Strings (speech samples 3 and 6; these examples are from sample 6), several
participants described the old man in the story
making music by himself in his room, a combination of three short
two-word formulas making music, by himself, and in his room. This
produces a very fluent ten-syllable run.
Reliance on a single formulaic sequence also helped speakers to avoid
pausing. To introduce the next action in the story, for example, it was common
to use and then, or and next.
Use of self-talk and fillers was a relatively sophisticated strategy used by
the speakers. This includes use of self-referential collocations such as I know,
I think, or I guess. Also included in this category are long strings used for
self-talk or circumlocution such as I don't know, or I don't know the
thing's name. These allowed them to produce longer runs.
Similarly, use of formulaic sequences as rhetorical devices was a relatively
sophisticated means of avoiding pauses and extending runs. Wood notes that,
in later retells, the study participants tended to use beginning formulas such
as at the beginning, narrative move markers such as when the story
is go ahead, and endings such as that is the end of the story. All of
these add greatly to the length of runs as well as to the effectiveness of the
storytelling.
Interestingly, this study still stands alone as the only effort to determine
the role of formulaic language in speech fluency. It shows both quantitatively
and qualitatively the importance of formulaic language in second language
speech fluency development.
Phonological characteristics
As noted in Chapter 2, formulaic sequences appear to display particular
phonological characteristics in speech. Lin (2010, 2012) cataloged these
characteristics.
Phonological coherence is the term most often used to summarize the
nature of the prosody of formulaic sequences in speech. It was Peters (1977,
1983) who first noted that children can be observed to produce sequences
which surpass their grammatical competence and which exhibit no internal
hesitations and a smooth intonation contour, making them stand out from
the rest of the speech flow. The idea is, then, that these characteristics of
formulaic sequences can give researchers a clue as to what has been acquired
as a chunk. Child language researchers such as Hickey (1993) note that
phonological coherence is basic to spoken formulaic sequences in children's
speech. As for adult speech, it is not quite so clear, although numerous
researchers have assumed that a similar dynamic applies as to child language
(e.g., Moon, 1997; Wray, 2004).
Pragmatics
One extremely important aspect of spoken language is pragmatic competence.
The idea of communicative competence (e.g., Bachman, 1990) led the
language teaching field into new ways of approaching its purpose. In addition
to grammar, communicative competence includes knowledge of discourse,
genre and text, social aspects of language, and a focus on the learner and
what he/she does. Current models of communicative competence see
communication as composed of knowledge or competence in four key areas:
organizational competence, pragmatic competence, sociolinguistic competence,
and strategic competence.
Pragmatic competence is key to successful ability to communicate in social
interactions, and is the basis of what we might call small C culture. We
can define pragmatic competence as the knowledge and skill necessary for
successful and appropriate use of language in communication, and subdivide
it into several broad categories:
community
Formulaic language has been referred to by a range of labels in pragmatics,
including conventional expression, pragmatic routine, and situation-based
utterance (SBU). The general agreement in pragmatics research is that
Research focuses
We certainly seem obsessed with academic language in our research on
formulaic language. Indeed, virtually the entire body of work on lexical bundles
and formulaic language in writing is focused on academic writing. It is time
to move the focus out of the academy and look at other areas of importance.
For example, studying the use of formulaic language in service encounters,
doctor-patient discourse, debates, counseling, and psychotherapeutic
situations, to name a few, can provide insights into how these types of
communication are structured, the nature of the discourse therein, and how
to foster facility with this language in native speakers and second language
learners. Ben Rejeb (2014) conducted a study of formulaic language used in
official business meetings of university student government, using a corpus of
meeting minutes spanning many years. More of this type of research is useful.
Similarly, perhaps it is time to take the focus away from productive skills
and toward receptive skills. We have studied the use of formulaic language
in speech and in writing, but comparatively little in reading and listening.
More research in these areas can help us not only to discover the ways in
which language users handle formulaic sequences, but also to uncover more
psycholinguistic processes. These types of research can help augment our
ways of teaching language as well.
A final area ripe for research is to examine the ways formulaic language works
with discourse analysis and writing studies. In empirical discourse analysis
such as conversation analysis, there are some obvious roles for formulaic
language to play, but we have not yet examined them. In critical discourse
analysis and the systemic-functional linguistic ways of deconstructing text
and determining ideologies, there seem to be many ways that research can
incorporate knowledge of formulaic language. Similarly, in writing studies,
the ways in which people learn to write and the ways that discourses evolve
seem to be natural places where the interface with formulaic language study
can exist. Collaborative and multifaceted research projects can help to push
formulaic language into the forefront and help us all to create an unlimited
amount of new knowledge in a wide range of fields.
References
Ädel, A. & Erman, B. (2012). Recurrent word combinations in academic writing
by native and non-native speakers of English: A lexical bundles approach.
English for Specific Purposes, 31, 81–92.
Al Hassan, L. & Wood, D. (2015). The effectiveness of focused instruction of
formulaic sequences in augmenting L2 learners' academic writing skills: A
quantitative research study. Journal of English for Academic Purposes.
Allerton, D. J. (1984). Three (or four) levels of cooccurrence restriction. Lingua, 63,
17–40.
Allerton, D. J., Nesselhauf, N., & Skandera, P. (Eds.). (2004). Phraseological units:
Basic concepts and their application. Basel: Schwabe.
Altenberg, B. (1993). Recurrent word combinations in spoken English. In
J. D. Arcy (Ed.), Proceedings of the Fifth Nordic Association for English
Studies Conference (pp. 17–27). Reykjavik: University of Iceland.
Altenberg, B. (1998). On the phraseology of spoken English: The evidence of
recurrent word combinations. In A. P. Cowie (Ed.), Phraseology: Theory,
analysis and application (pp. 101–122). Oxford: Clarendon Press.
Altenberg, B. & Tapper, M. (1998). The use of adverbial connectors in advanced
Swedish learners' written English. In S. Granger (Ed.), Learner English on
computer (pp. 3–18). New York: Longman.
Ambridge, B., Rowland, C. F., Theakston, A. L., & Tomasello, M. (2006).
Comparing different accounts of inversion errors in children's non-subject
wh-questions: What experimental data can tell us? Journal of Child Language,
33, 519–557.
Amosova, N. N. (1963). Osnovui angliiskoy frazeologii [The foundations of English
phraseology]. Leningrad: University Press.
Anderson, J. (1983). The architecture of cognition. Cambridge, MA: Harvard
University Press.
Appel, R. & Wood, D. (in press). Recurrent word combinations in EAP test-taker
writing: Differences between high and low proficiency levels. Language
Assessment Quarterly.
Arnon, I. & Snider, N. (2010). More than words: Frequency effects for multi-word
phrases. Journal of Memory and Language, 62, 67–82.
Ashby, M. (2006). Prosody and idioms in English. Journal of Pragmatics, 38(10),
1580–1597.
Austin, J. L. (1962). How to do things with words. Oxford: Clarendon Press.
Bacha, N. N. (2002). Developing learners' academic writing skills in higher
education: A study for educational reform. Language and Education, 16(3),
161–177.
Kuiper, K. & Haggo, D. (1985). The nature of ice hockey commentaries. In R. Barry
and J. Acheson (Eds.), Regionalism and national identity: Multidisciplinary
essays on Canada, Australia and New Zealand (pp. 167–175). Christchurch:
Association for Canadian Studies in Australia and New Zealand.
Kumaravadivelu, B. (2002). Beyond methods: Macrostrategies for language
teaching. New Haven, CT: Yale University Press.
Kunin, A. V. (1955). English-Russian phraseological dictionary (2nd ed., 1967;
3rd ed., 1984). Moscow: Russkii Yazik.
Lakoff, G. & Johnson, M. (1980). Metaphors we live by. Chicago, IL: University of
Chicago Press.
Laufer, B. (2011). The contribution of dictionary use to the production and
retention of collocations in a second language. International Journal of
Lexicography, 24, 29–49.
Laufer, B. & Girsai, N. (2008). Form-focused instruction in second language
vocabulary learning: A case for contrastive analysis and translation. Applied
Linguistics, 29, 694–716.
Laufer, B. & Roitblat-Rozovski, B. (2011). Incidental vocabulary acquisition: The
effects of task type, word occurrence and their combination. Language
Teaching Research, 15, 391–411.
Laufer, B. & Waldman, T. (2011). Verb-noun collocations in second language
writing: A corpus analysis of learners' English. Language Learning, 61,
647–672.
Leki, I. (2006). The legacy of first-year composition. In P. K. Matsuda, C. Ortmeier-Hooper &
Lennon, P. (1984). Retelling a story in English. In H. W. Dechert, D. Möhle &
M. Raupach (Eds.), Second language productions (pp. 50–68). Tübingen:
Gunter Narr Verlag.
Lennon, P. (1990a). The advanced learner at large in the L2 community:
Developments in spoken performance. International Review of Applied
Linguistics in Language Teaching, 28, 309–321.
Lennon, P. (1990b). Investigating fluency in EFL: A quantitative approach.
Language Learning, 40(3), 387–417.
Levy, S. (2003). Lexical bundles in professional and student writing (Doctoral
dissertation). Retrieved from CSA Linguistics and Language Behaviour
Abstracts. (ISSN: 0419-4209).
Levy, S. (2008). Lexical bundles in professional and student writing. Saarbrücken:
VDM Verlag.
Lewis, M. (1997). Pedagogical implications of the lexical approach. In J. Coady
& T. Huckin (Eds.), Second language vocabulary acquisition (pp. 255–270).
Cambridge: Cambridge University Press.
Lewis, M. (2000). Materials and resources for teaching collocation. In M. Lewis
(Ed.), Teaching collocations: Further developments in the lexical approach
(pp. 186–204). Boston, MA: Heinle.
Lewis, M. (2008). Implementing the lexical approach. London: Heinle.
Li, J. & Schmitt, N. (2009). The acquisition of lexical phrases in academic writing:
A longitudinal case study. Journal of Second Language Writing, 18(2), 85–102.
Lieven, E., Salomo, D., & Tomasello, M. (2009). Two-year-old children's production
of multiword utterances: A usage-based analysis. Cognitive Linguistics, 20,
481–508.
Wood, D. (1998). Making the grade: An interactive course in English for academic
purposes. Toronto: Prentice Hall Allyn and Bacon.
Wood, D. (2001). In search of fluency: What is it and how can we teach it?
Canadian Modern Language Review, 57(4), 573–589.
Wood, D. (2002). Formulaic language in acquisition and production: Implications
for teaching. TESL Canada Journal, 20(1), 1–15.
Wood, D. (2006). Uses and functions of formulaic sequences in second language
speech: An exploration of the foundations of fluency. Canadian Modern
Language Review, 63(1), 13–33.
Wood, D. (2009a). Preparing ESP learners for workplace placement. ELT Journal,
63(4), 323–331.
Wood, D. (2009b). Effects of focused instruction of formulaic sequences on
fluent expression in second language narratives: A case study. Canadian
Journal of Applied Linguistics, 12(1), 39–57.
Wood, D. (2010a). Formulaic language and second language speech fluency:
Background, evidence, and classroom applications. London/New York:
Continuum.
Wood, D. (2010b). Lexical clusters in an EAP textbook corpus. In D. Wood
(Ed.), Perspectives on formulaic language: Acquisition and communication
(pp. 8106). New York/London: Continuum.
Wood, D. & Appel, R. (2013). Lexical bundles in first year university business and
engineering textbooks: A resource for EAP. In H. M. McGarrell & D. Wood
(Eds.), Special research symposium issue of CONTACT. Refereed Proceedings
of TESL Ontario Research Symposium, October 2012. Vol. 39, No. 2
(pp. 92–102).
Wood, D. C. & Appel, R. (2014). Multiword constructions in first year university
textbooks and in EAP textbooks. Journal of English for Academic Purposes,
15, 1–13.
Wood, D. & Namba, K. (2013). Focused instruction of formulaic language: Use
and awareness in a Japanese university class. The Asian Conference on
Language Learning Official Conference Proceedings 2013, pp. 203–212.
Wood, M. M. (1981). A definition of idiom. Bloomington: University of Indiana
Linguistics Club.
Wray, A. (1999). Formulaic sequences in second language teaching: Principles
and practice. Applied Linguistics, 21(4), 463–489.
Wray, A. & Fitzpatrick, T. (2008). Why can't you just leave it alone? Deviations
from memorized language as a gauge of nativelike competence. In
F. Meunier & S. Granger (Eds.), Phraseology in language learning and teaching
(pp. 123–148). Amsterdam: John Benjamins.
Wray, A. & Perkins, M. R. (2000). The functions of formulaic language: An
integrated model. Language and Communication, 20, 1–28.
Wray, A. (2002). Formulaic language and the lexicon. Cambridge: Cambridge
University Press.
Wray, A. (2004). Here's one I prepared earlier: Formulaic language learning on
television. In N. Schmitt (Ed.), Formulaic sequences: Acquisition, processing
and use (pp. 249–268). Amsterdam/Philadelphia, PA: John Benjamins.
Wray, A. (2008). Formulaic language: Pushing the boundaries. Oxford: Oxford
University Press.