SLP Aba 5 2

Volume 5 Issue Number 2
ISSN 1932 - 4731
Table of Contents
Pg. 88: Guest Editors’ Comments - M. N. Hegde & Raymond Weitzman
Pg. 90: Language and Grammar: A Behavioral Analysis - M.N. Hegde
Pg. 114: Verbal Behavior by B.F. Skinner: Contributions to Analyzing Early Language Learning - Scott F. McLaughlin
Pg. 132: The Bases for Language Repertoires: Functional Stimulus-Response Relations - Raymond S. Weitzman
Pg. 150: Behavioral vs. Cognitive Views of Speech Perception and Production - Henry D. Schlinger, Jr.
Pg. 166: Speech and language assessment: A verbal behavior analysis - Barbara E. Esch, Kate B. LaLonde, & John W. Esch
Pg. 191: Effects of a Speaker Immersion Procedure on the Production of Verbal Operants - Nirvana Pistoljevic, Claire Cahill, & Fabiola Casarini
The Journal of Speech - Language Pathology
and Applied Behavior Analysis
VOLUME NO. 5, ISSUE NO. 2

ISSN: 1932 - 4731
Published: May 13, 2010

Publisher’s Statement
The Journal of Speech-Language Pathology and Applied Behavior Analysis (JSLP-ABA) is
published by Dr. Joseph Cautilli and BAO Journals. It is a peer-reviewed, electronic journal intended for
general circulation in the scientific community.
The mission of this journal is to provide a forum for SLP and ABA professionals to exchange
information on topics of mutual interest. These topics may include, but are not necessarily limited to, support
for disorders of prelinguistic communication, speech perception/production, oral language and literacy,
speech fluency, and voice. They may also address issues pertaining to accent reduction, culturally-based
language variations, and augmentative-alternative communication. JSLP-ABA welcomes articles describing
assessment and treatment efficacy research based on detailed case studies, single-subject designs, and group
designs. Also encouraged are literature reviews that synthesize a body of information, highlight areas in need
of further research, or reconsider previous information in a new light. Additionally, this journal welcomes
papers describing theoretical frameworks and papers that address issues pertaining to SLP-ABA
collaboration.
All materials, articles, and information published in JSLP-ABA are peer-reviewed by the review
board of JSLP-ABA for informational purposes only. The information contained in this journal is not
intended to create any type of patient-therapist relationship or representation whatsoever. To receive a free
subscription to JSLP-ABA, please send an e-mail to BAOJournals@aol.com. Include your name and the e-
mail address in the body of your e-mail; and type “subscribe-SLP-ABA” in the subject field. When your e-
mail is received, your name will be added to the subscription list. You will then automatically receive notice
of publication of each new issue through an e-mail containing a hyperlink to the latest issue.
All rights are reserved. The Journal of Speech – Language Pathology and Applied Behavior Analysis may be freely accessed, downloaded,
and distributed free of charge. If you wish to sell our journals or charge a fee for access to our journals, you need to obtain the express prior
written permission of the copyright holder. For uses requiring permission, contact Joseph Cautilli, Ph.D., BCBA. All information
contained within is provided as is. The SLP-ABA journal, its publisher, authors, and agents, cannot be held responsible for the way this
information is used or applied. The Journal is not responsible for typographical errors.
Mission Statement
The mission of the Journal of Speech-Language Pathology and Applied Behavior Analysis (JSLP-
ABA) is to provide a forum for SLP and ABA professionals to exchange information on topics of mutual
interest. These topics may include (but are not necessarily limited to) support for disorders of prelinguistic
communication, speech perception/production, oral language and literacy, speech fluency, and voice. They
may also address issues related to accent reduction, culturally based language variations and augmentative-
alternative communication. JSLP-ABA welcomes articles describing assessment and treatment efficacy data
based on detailed case studies, single-subject research design, and group designs. Also encouraged are
literature reviews that synthesize a body of information, highlight areas in need of further research, or
reconsider previous information in a new light. Additionally, this journal welcomes papers describing
theoretical frameworks and papers that address issues pertaining to SLP-ABA collaboration.
JSLP-ABA is viewed as a primary source of information for speech-language pathology (SLP)
professionals and professionals in applied behavior analysis (ABA) who support individuals of all ages with
communicative disorders. The contents of this journal are intended to meet the interest of these professionals
for information to support evidence-based practice. JSLP-ABA is also intended to serve as a vehicle to
encourage collaboration between these SLP and ABA professionals.
Submission Information for Authors
Overview
All papers must be submitted in MS Word DOC format to the Lead Editor (Dr. Joseph Cautilli) via e-mail at
jcautilli2003@yahoo.com. Papers may be submitted at the initiative of an author or in response to an
invitation from the Lead or Associate Editors. All submissions are peer-reviewed and must be accompanied
by a signed Assignment of Rights (AOR) form. A link to the AOR form is at the bottom of this page. After
peer review and follow-up, all articles are copyedited. Authors have an opportunity to review and approve
their manuscript prior to publication. Once approved, authors are responsible for all statements made in their
work, including changes made by the copy editor prior to approval.
Content
To be considered for publication, articles must address topics of mutual interest to SLP and ABA
professionals. These topics may include (but are not necessarily limited to) support for disorders of
prelinguistic communication, speech perception/production, oral language and literacy, speech fluency, and
voice. They may also address issues related to accent reduction, culturally-based language variations,
augmentative-alternative communication, and SLP-ABA collaboration. Articles may report original research,
descriptions of theoretical frameworks, literature reviews, treatment critiques, and tutorials.
Peer Review Process
All submitted manuscripts are reviewed initially by the Lead Editor. Manuscripts with insufficient priority
for publication will be rejected promptly. Other manuscripts will be sent to the Senior Associate Editor, who
will distribute them to editorial consultants with relevant expertise. The editorial consultants will read the
papers and evaluate (1) the importance of the topic addressed by the paper; (2) the paper’s conformity to
standards of evidence and scholarship; and (3) the clarity of writing style. Comments provided by the
editorial consultants will then be provided to the author(s) for follow-up.
Formatting Requirements: To support the electronic copy-editing process, authors must honor all of the
following guidelines:
• The page set-up for manuscripts must be set for 1-inch margins on all 4 borders.
• All pages must be in portrait orientation. There can be no pages in landscape orientation.
• Manuscripts must be typed single-spaced in size 11 “Times New Roman” font.
• Manuscripts must be submitted as one continuous document rather than in sections or sub-
documents.
• Each manuscript must include 7 elements in the following order: title, name(s) of author(s),
abstract, key words, body, references, author(s)’ contact information.
• Do not insert pagination, headers, or footers. (These are inserted in the copy-editing process.)
• The use of headings is encouraged and should be structured according to the guidelines
described in the Publication Manual of the American Psychological Association (5th edition).
• If graphics, figures and tables are used, they must be created in *.jpg or *.bmp format. No Excel
graphs will be accepted.
• Graphics, figures, and tables, if used, may be embedded in the body of the manuscript or they
may be submitted in a separate MS Word document. If the latter option is chosen, then author(s)
must indicate clearly the intended location of each item (graphic, figure, table) within the
manuscript so that the copy editor can make the insertions.
• Individual graphics, figures, and tables, when used, may not be larger than one page.
• The caption for a table must be printed above the table. The caption for a figure must be printed
below the figure.
• In the references section, please use italics where APA style would allow underlining (e.g., the
titles of journals and books).
• Author contact information must include the following 4 elements for each author: name,
mailing address, phone, and e-mail.
• Manuscripts must be saved and submitted in MS Word “DOC” format.
• When there is a conflict between the requirements of APA style (see below) and the formatting
rules listed here, the formatting rules will supersede the APA requirements.
Manuscript Style Requirements:
• With the exception of the above (formatting) guidelines, authors must write their manuscripts in
a style that is consistent with the Publication Manual of the American Psychological Association
(APA Manual) (5th edition). A copy of this manual may be ordered at http://www.apastyle.org/
• Consistent with APA style, authors must use non-sexist language. Please refer to Table 2.1 in
the APA Manual for “Guidelines for Unbiased Language.”
• Also consistent with APA style, authors must use person-first language for referring to
individuals with potentially stigmatizing characteristics. Person-first language requires an author
to name the individual first, followed by descriptive information (e.g., "child with autism")
rather than to use an adjectival form (i.e., "autistic child") or a nominal form (i.e., "the autistic").
• As noted above: When there is a conflict between the requirements of APA style and the
formatting rules listed in the above section, the formatting rules will supersede the APA
requirements.
General Guidelines for Preparing Abstracts:
The following general guidelines must be honored to ensure that JSLP-ABA will be accepted into the major
psych databases. (See PsycINFO website: http://www.apa.org/psycinfo/about/covinfo.html)
• An abstract may not exceed 960 characters and spaces (approximately 120 words). Characters
can be conserved by using digits for numbers (except at the beginning of sentences); by using
well-known abbreviations; and by using the active voice.
• Begin the abstract with the most important information, but don’t repeat the title.
• Include only the four or five most important concepts, findings, or implications.
• Embed as many key words and phrases in the abstract as possible.
• Include in the abstract only information that appears in the body of the manuscript.
• For the sake of clarity, define all acronyms and abbreviations except for measurements; spell out
the names of tests; use generic names for drugs (when possible); and define unique terms.
• Use the present tense to describe results with continuing applicability or conclusions drawn and
the past tense to describe variables manipulated or tests applied.
• As much as possible, use the third person rather than the first person.
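As an illustration only, and not part of the journal's submission process, the 960-character limit above can be checked with a short script before submitting. The sketch below is in Python. The function name check_abstract and the decision to collapse runs of whitespace before counting are assumptions made for this example; the guidelines themselves state only the limit of 960 characters and spaces.

```python
MAX_ABSTRACT_CHARS = 960  # limit stated in the guidelines (characters and spaces)


def check_abstract(text: str) -> dict:
    """Report the length of an abstract against the 960-character limit.

    Hypothetical helper for illustration; not an official JSLP-ABA tool.
    """
    # Assumption: line breaks and repeated spaces are counted as single spaces.
    normalized = " ".join(text.split())
    return {
        "characters": len(normalized),
        "words": len(normalized.split()),
        "within_limit": len(normalized) <= MAX_ABSTRACT_CHARS,
    }


if __name__ == "__main__":
    sample = "This paper reviews Skinner's functional units of verbal behavior."
    print(check_abstract(sample))
```

The word count is reported only as a rough guide, since the guidelines give the word figure (about 120) as an approximation of the character limit.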
Abstracts for Empirical Studies: Abstracts for empirical studies are also generally about 100 to 120 words
in length. They should include the following information:
• Problem under investigation (in one sentence)
• Pertinent characteristics of participants (e.g., number, type, age, sex, genus and species)
• Experimental method, including apparatus, data-gathering procedures, complete test
names, and complete generic names and dosage and routes of administration of any drugs
(particularly if the drugs are novel or important to the study)
• Findings, including statistical significance levels
• Conclusions and implications or applications
Abstracts for Literature Reviews and Theoretical Articles
Abstracts for review or theoretical articles are generally about 75 to 100 words in length, and they include
the following information:
• The topic (in one sentence)
• The purpose, thesis, or organizing construct and the scope (comprehensive or selective) of the
article
• Sources used (e.g., personal observation, published literature)
• Conclusions
Thank you!
The Behavior Analyst Online Journals Department
ADVERTISEMENT
ADVERTISING IN BAO Journals

If you wish to place an advertisement in any of our journals, you can do it by contacting us.
The prices for advertising in one issue are as follows:
1/4 Page: $50.00
1/2 Page: $100.00
Full Page: $200.00
If you wish to run the same ad in multiple issues for the year, you are eligible for the following discount:
1/4 Page: $40.00 per issue
1/2 Page: $75.00 per issue
Full Page: $150.00 per issue
An additional one-time layout/composition fee of $25.00 is applicable.
In addition to placing your ad in the journal(s) of your choice, we will place your ad on our website’s advertising
section.
For more information, or to place an ad, contact Halina Dziewolska by phone at (215) 462-6737 or e-mail at:
halinadz@hotmail.com
The Journal of
Speech - Language Pathology and Applied Behavior Analysis
ISSN: 1932-4731
Editorial Staff
Senior Editor
Joe Cautilli, Ph.D., BCBA
Co-Lead Editor
Mareile Koenig, Ph.D., CCC-SLP, BCBA
Co-Lead Editor
Douglas Greer, Ph.D.
Associate Editors
Leslie Cohen, Ph.D.
Joanne Gerenser, Ph.D., CCC-SLP
Elizabeth Grillo, Ph.D., CCC-SLP
Caio Miguel, Ph.D.
Editorial Board
Christine Barthold, Ph.D., BCBA
Vince Carbone, Ph.D.
Jenn Cronin, M.Ed.
Brian Cowley, Ph.D., BCBA
Kathy Dyer, Ph.D., CCC-SLP, BCBA
Anntonette Falco, M.Ed.
Lori Frost, MS, CCC-SLP
Cheryl Smith Gabig, Ph.D., CCC-SLP
Cheryl Gunter, Ph.D., CCC-SLP
James Halle, Ph.D.
Giri Hegde, Ph.D., CCC-SLP
Anne Holmes, Ph.D., CCC-SLP, BCBA
Laura Hutt, M.S., SLP-CCC, BCBA
James Luiselli, Ph.D.
Hedda Meadon, Ph.D.
Pat Mirenda, Ph.D., BCBA
Pete Peterson, Ph.D.
Anna I. Petursdottir, Ph.D.
David Sidener, Ph.D.
Francois Tonneau, Ph.D.
Michael Weinberg, Ph.D., BCBA
Mary Jane Weiss, Ph.D., BCBA
Ray Weitzman, Ph.D.
Notice to Readers
Due to technical difficulties, it has become necessary for us to create a mirror site to access all
editions of BAO journals published since the loss of our webmaster, Craig Thomas. We are working
to correct all the problems with the old site and we appreciate your patience. We are currently
constructing a new, fully functional BAO journal website with the features of the old BAO site. Until
further notice, you may access the current and future BAO journals at www.BAOJournal.com.
Please bookmark this location or put it in your browser’s “Favorites” for future access.
A link on the new BAO Journal site will take you to the old BAO site where all past issues are
currently accessible.
Thank you for your loyal support for BAO and its journals.
Cordially,
Joe Cautilli and BAO Journals

Guest Editors’ Comments

M. N. Hegde & Raymond Weitzman
We are pleased to offer this special issue of the Journal of Speech-Language Pathology
and Applied Behavior Analysis on bridging the gap between the conceptual foundations of speech
and language and the treatment procedures used in speech-language pathology (SLP). Our aim in
putting this special issue together was to provide the practitioner with a behavioral analytic view
of speech and language that is conceptually more consistent with the behavioral treatment widely
used in remediating communication disorders in children and adults. Currently, most speech-
language pathologists (SLPs) who routinely use the behavioral treatment methods accept the
linguistic and cognitive theories in understanding language and language development. Linguistic
and cognitive theorists, being mentalistic as well as nativistic, seem unable to derive their own
useful treatment procedures from their theories.
Treatment of communication disorders, by nature, is experimental, and experimental
approaches are based on the philosophy and methods of natural science. The behavioral view of
language and language development is a natural science account, as opposed to a cognitive,
mentalistic, and nativistic approach. It is precisely for this reason that the behavioral approach has
generated experimentally verified treatment procedures. Therefore, adopting a natural science
account of verbal behavior—a preferred substitute for language—would be more consistent with
experimentally based treatment procedures.
Conceptual and treatment consistency is not the only advantage of adopting the verbal
behavior view of language. It helps the SLPs avoid spurious, speculative, and untestable theories
of language and language acquisition. The verbal behavior view eliminates the contradiction of
holding a nativistic view of language while trying to modify speech-language behaviors through
environmental changes. The behavioral view will help SLPs target functional (cause-effect)
language units in assessment and treatment, instead of unstable or unreliable linguistic categories
that are unrelated to potential independent variables. Many other advantages are likely to follow
with an approach that integrates the basic analysis of language with experimental treatment
procedures.
This issue consists of six papers that offer a glimpse of various aspects of Skinner’s
Verbal Behavior. While the first four papers offer perspectives on the behavioral analysis of
language and language development, the last two illustrate the applied aspects of that analysis.
Hegde’s paper gives a historical background to the publication of Skinner’s Verbal
Behavior and the linguistic criticisms that followed. The paper then goes on to summarize the
various functional units of verbal behavior, an understanding of which is essential to appreciating
Skinner’s analysis. Dr. Hegde points out the clinical implications of functional units as opposed
to linguistic structural units for the work of SLPs.
The next paper by McLaughlin offers a behavioral analysis of child language
development. Dr. McLaughlin points out that the behavioral view is in fact better able to account
for the documented facts of language development in children. Although there has been much
research to show that verbal interactions between children and their caregivers are the basis of
language learning, the distorted view that only the cognitive-nativistic view of language development
is valid persists in many textbooks. Dr. McLaughlin counters the generally accepted but
unjustified criticism that the behavioral view cannot account for language development in
children.
Weitzman’s paper, next in the series, addresses what linguists consider a difficult issue
for behavioral analysts. The issue often raised relates to the poverty of the stimulus or the alleged
inadequacy of language “input” that makes it difficult to conceive how children can learn
language without internal mental mechanisms. Although stimuli are only a part of the total
contingency involved in learning language (or any other skill), the criticism has been often
repeated. Dr. Weitzman argues that the poverty of stimulus hypothesis is unwarranted. He points
out that there is plenty of justification to hold that language repertoires in children may be
acquired through operant contingencies of stimuli, responses, and reinforcement.
Perception, another presumably difficult issue for behavioral scientists, receives a
competent behavioral analysis in Schlinger’s paper. Historically, it is a problem of how the mind or
the brain understands and evaluates sensory information that it receives. The idea that perception
is behavior (or action) is radically different from the traditional psychological view that it is some
form of passive taking-in of the external world. The traditional definition has been an important
basis of infant speech perception research. Although operant conditioning has been the main
method of studying whether infants could differentially respond to different speech stimuli, the
researchers have resorted to mentalistic and cognitive theories to inconsistently explain their own
data. Schlinger makes a compelling argument that it is much more parsimonious to consider
perception as behavior and suggests that SLPs may better integrate such a concept of perception
with speech and language learning as well as their use of applied behavioral methods in treating
disorders of communication.
In their paper, Esch and associates address the neglected issue of behavioral assessment
of communication disorders. Most assessment procedures, whether based on standardized tests or
naturalistic language samples, are based on the linguistic analysis of speech and language. Dr.
Esch and colleagues point out a great need to develop assessment procedures that target
functional verbal behavior units instead of structural categories. Pointing out the limitations of a
purely structural assessment, Esch et al. urge SLPs to make a contextual analysis of speech and
language skills. Such an analysis will take into consideration the cause-effect relations of verbal
behavior units. They argue that a cause-effect based assessment will more easily lead to valid
treatment targets.
The final paper in this issue by Pistoljevic and associates offers an excellent illustration of
a new approach to teaching verbal behaviors. Their paper describes how, with the help of the
Speaker Immersion Procedure, it is possible to establish speaker repertoire in children with
language disorders. Dr. Pistoljevic and colleagues demonstrate in their study that with the help of
behavioral procedures, it is possible to generate speaker repertoire in noninstructional settings.
They further illustrate how Skinner’s functional units may be effective treatment targets for
children with language disorders.
We hope that the papers included in this special issue will prompt SLPs to gain a better
appreciation of Skinner’s Verbal Behavior. We also hope that the SLPs will consider integrating
the behavioral analytic view with their assessment and treatment approaches.
We thank the Editors of JSLP-ABA for giving us an opportunity to put together this
special issue. We are especially grateful to Dr. Mareile Koenig for her competent and friendly
support from the beginning to the end.
M. N. Hegde
Raymond Weitzman
Guest Editors
Language and Grammar: A Behavioral Analysis
M.N. Hegde
Abstract
While speech-language pathologists (SLPs) accept the behavioral methods of treatment in their
professional work, they tend to entertain an inadequate or dismissive view of the behavioral analysis of language
and grammar. This may be because SLPs’ academic study of language consists mostly of linguistic theories that
typically misrepresent Skinner’s (1957) analysis of verbal behavior. An appreciation of Skinner’s analysis would be
consistent with the clinicians’ use of applied behavioral techniques in treating speech and language disorders.
Therefore, this paper reviews Skinner’s functional units of verbal behavior and his analysis of grammar in terms of
autoclitics as secondary verbal operants. Skinner’s analysis is comprehensive, innovative, supported by
clinical research, and relevant to SLPs.
Keywords: Verbal behavior, primary and secondary verbal operants, mands, tacts,
intraverbals, echoics, textuals, audience, autoclitics, response classes
_____________________________________________________________________
Introduction
Speech-language pathologists’ (SLPs’) academic study of language is heavily influenced by
linguistic and cognitive viewpoints. A majority of textbooks and writings familiar to SLPs explore in
greater detail the linguistic and structural view of language and offer only a limited summary of the
behavioral view whose concepts and implications are not carried throughout the text. Most SLPs are well
versed in the phonologic, morphologic, syntactic, and pragmatic structures of language but are not
equally well versed in the functional units that are basic to Skinner’s (1957) analysis. Nonetheless, SLPs’
treatment methods are mostly behavioral (Hegde, 1998, 2008a). Inevitably, this has led to a conceptually
inconsistent model of language and treatment of language disorders.
Chomsky’s (1959) critical review of Skinner’s (1957) book—Verbal Behavior—is better known
than the book itself. Most students and clinicians seem to be unaware of the invalidity of Chomsky’s
criticism or the competent responses given to his negative review (e.g., Anderson, 1991; MacCorquodale,
1969, 1970; McLeish & Martin, 1975; Palmer, 2006; Richelle, 1976). Rejoinders to his review have
pointed out that Chomsky poorly understood Skinner’s Verbal Behavior, behavioral methodology, and
behaviorism. Chomsky’s misunderstanding of Skinner’s book and concepts was so severe that it “would
prompt most examination graders to read no further” (Richelle, 1976, p. 209). Chomsky frequently
attributed views of other psychologists to Skinner, who had unequivocally repudiated them. In a
questionable case of scholarship, Chomsky repeatedly misquoted Skinner (Adelman, 2007). More than
four decades after he wrote the review, Chomsky was still a critic of Skinner, and with the same distorted
understanding of Skinner’s work (Virues-Ortega, 2006).
A commonly held assumption among most linguists, and SLPs who follow them, is that Skinner’s
Verbal Behavior has faded into history. The fact, however, is that research on verbal behavior and
treatment of verbal behavior disorders based on Skinnerian analysis are flourishing. Among several others
in the United States, the journals The Analysis of Verbal Behavior, The Behavior Analyst, Journal of
Applied Behavior Analysis, Behavior Modification, and several international journals on behavior analysis
regularly publish many articles on the Skinnerian verbal behavior analysis and treatment. This journal, the
Journal of Speech-Language Pathology and Applied Behavior Analysis, is devoted to bridging the gap
between the two disciplines. As Schlinger (2008a) has ably demonstrated, Skinner’s Verbal Behavior is
alive and well. An interesting observation Schlinger makes is that although both Verbal Behavior and
Chomsky’s (1957) Syntactic Structures had their 50th anniversary in 2007, Skinner’s book on
Amazon.com has been selling better than Chomsky’s. The verbal behavior approach to treating children
with autism is now recognized internationally as the most evidence-based approach. Teaching almost all
forms of communication disorders is essentially behavioral (Hegde, 1998, 2006, 2007; Hegde & Maul,
2006; Pena-Brooks & Hegde, 2007), whether some SLPs acknowledge it or not. In fact, if any tide has
turned against something, it is the tide against Chomsky’s generative linguistics. While Skinner’s
experimental and applied behavior analysis is thriving worldwide, Chomsky’s generative grammar notion
has disappeared from linguistics (Harris, 1993; Leigland, 2007). Chomsky’s own multiple revisions and
qualifications of his 1957 theory have moved away from a cognitive, generative, rule-based theory of
language (Schoneberger, 2000). Within just a few years after Chomsky’s Syntactic Structures was
published, there was the generative semantic “rebellion” that denied the supremacy of grammar in
language. (Linguists often describe newer approaches as revolution, war, rebellion.) Soon came the
“pragmatic revolution” which asserted in the 1970s that language should be understood as actions
performed in social contexts—mostly an arm-chair philosophical view which was still structural in its
orientation. More than 30 years before the “pragmatic revolution,” Skinner had advocated the social
nature of verbal behavior with better conceptual and experimental bases than the speculative pragmatic
approach has ever had (see Skinner, 1957, Preface, for a historical account of his analysis). SLPs have
found that when they need to intervene (i.e., offer treatment), they need to turn toward Skinner’s
experimental and applied behavior analysis; linguistics of any era could offer little or no help.
Contrary to the typical portrayal of Skinner’s analysis of language as “simplistic,” it is
sophisticated, complex, and comprehensive. His analysis of verbal behavior, as he preferred to call it,
includes an innovative analysis of grammar, word order, and meaning (Hegde, 2008b) which is unfamiliar
to most SLPs. There are other methodological behavioral approaches to language (Osgood,
1963; Mowrer, 1952; Staats, 1968) that are sometimes confused with Skinner’s vastly different radical
behavioral approach that offers a natural science view of language, with an ensuing applied technology
that SLPs have readily accepted. At least three unique features of Skinner’s analysis of verbal behavior
are especially relevant to an applied science of speech-language pathology.
First, Skinner’s analysis accepts the constraints of the methods of natural science. Dependent
variables are analyzed in relation to their publicly observable, measurable, and experimentally
manipulable independent variables. Skinner’s analysis is functional in the sense that it seeks to identify
variables that cause verbal behaviors. Explanations of events are kept at the level of observation and
experimental analysis, and therefore, do not involve inferred mental, cognitive, or pseudobiological
(innate) entities.
Second, Skinner’s analysis treats language as a form of behavior, and not as a formal system that
exists in the minds or brains of speakers, independent of their actions. Thirty years after the publication of
his Verbal Behavior, Skinner (1987, p. 11) restated that his book “is not about language. A language is a
verbal environment, which shapes and maintains verbal behavior.” He went on to say that “Those who
want to analyze language as the expression of ideas, the transmission of information, or the
communication of meaning naturally employ different concepts.” (1987, p. 11). He then urged the
scientists to judge which one—a scientific causal analysis or a mental structural analysis—works better.
When a causal approach is preferred, analyses of structural properties of mechanically generated sentences
(e.g., they are eating apples, or colorless green ideas sleep furiously) are not productive because they do
not represent empirical data. Such productions will be of interest to scientists only when they are
empirically recorded utterances of speakers, under given conditions of stimulation, meeting specific social
consequences.
Third, Skinner’s analysis does not include special explanatory laws. He wrote Verbal Behavior to
show that “speech is within the domain of behaviors which can be accounted for by existing functional
laws, based on the assumption that it is orderly, lawful, and determined, and that it has no unique
emergent properties that require either a separate causal system, an augmented general system, or
recourse to mental way-stations” (MacCorquodale, 1969, p. 832). Consistent with his analysis of
behaviors in general, Skinner has analyzed verbal behaviors in terms of a contingency relationship
between (1) current states of motivation, (2) currently controlling environmental conditions, (3) past
history of reinforcement, and (4) the genetic constitution of the individual (Skinner, 1957). Operant
analysis, therefore, is not restricted to “stimuli and responses” and does not ignore the genetic factors.
Verbal Behavior: Definition
Verbal behavior (VB) is a class of behavior that is “reinforced through the mediation of other persons”
(Skinner, 1957, p. 2). Verbal behavior is social behavior, because, unlike nonverbal behavior, it cannot be
conditioned or maintained by nonsocial entities. Nonverbal behavior in this context does not refer to
nonvocal verbal behavior (as in alternative forms of communication). It refers to behaviors that are, in
traditional terms, noncommunicative (e.g., walking or watering a house plant). In contrast to nonverbal
behaviors, verbal behavior may be conditioned only by the actions of other people.
The essence of Skinner’s definition is that only people are affected by VBs in such a way that they
become conditioned to reinforce them. In other words, both the VBs, and their consequences (listener
responses), are conditioned. Also, unlike nonverbal behaviors, VBs are devoid of direct and mechanical
reinforcement contingencies (Skinner, 1957; MacCorquodale, 1969). As E. Vargas (1988) distinguished
them, VBs are verbally governed (mediated), whereas nonverbal behaviors are (environmental) event-
governed. Consider the example that contrasts a nonverbal response with a verbal response: A thirsty
woman may walk up to the refrigerator and get a drink. The nonverbal response of walking will directly
and mechanically get reinforced when she gets her drink—an environmental event. No other person need
be present to reinforce it. But instead, if her response is verbal (e.g., “May I have a glass of water?”), it
needs social mediation to get reinforced. Someone (mediator) has to reinforce it by complying with her
request. The need for a mediator to select and strengthen VBs adds an additional element to the familiar
three-term contingency involving stimuli, responses, and consequences that explains nonverbal behavior.
VB, therefore, is explained on the basis of a four-term contingency that involves (1) stimuli, (2) verbal
responses, (3) listener responses, and (4) the reinforcing effects of listener responses (J. Vargas, 2009). It
should be noted, however, that in all other respects, VB is essentially like nonverbal behavior. For
instance, verbal and nonverbal behaviors both have their respective discriminative stimuli, and are
similarly selected and strengthened by their consequences, and may be extinguished by withholding
reinforcement (E. Vargas, 1988). Also to be noted is that the uniqueness of VB does not require special
explanatory laws; Skinnerian laws of behavior are sufficient to account for it.
Verbal Behavior: Units of Analysis

An analysis of verbal behavior should first determine the units of analysis. Linguists analyze
language with such structural units as phonemes, morphemes, words, and sentences that may be adequate
for a formal analysis of language. Skinner asserted that linguistic structures tell us nothing about their
causes—but the natural science account of any phenomenon is a causal analysis. Apparently,
structuralists presume that independent variables can be sliced according to the structural properties of
responses. That is, phonemes, words, sentences, and so forth necessarily have separate causal variables—
a presumption without empirical support.
Skinner’s analysis shows that the same cause may lead to the production of a word, a phrase, or a
sentence depending on the current stimulus condition and past reinforcement history. For instance, one
might just say, “yuck” or “I think it is disgusting”—variable structural units under similar stimulus
conditions and similar effects on listeners. Conversely, the same verbal response may be controlled by
different independent variables in different situations. For instance, a boy might say “ball” because he
saw a ball, or echoed someone else, or read the printed word ball: different causes for structurally the
same response (“ball”). That structures (forms) and causes do not covary is unaccounted for in the
linguistic analysis. A word is always a word, regardless of why it was produced. A sentence is different
from a word, though it may have the same cause as a word on a given occasion. Skinner’s analysis of
verbal behaviors based on their independent variables avoids this problem inherent to structural analysis.
Technically, the response unit in the behavioral analysis is called a verbal operant, which is “. . . a
disposition (tendency, likelihood) to respond in a certain way to a certain state of affairs because of a past
history of reinforcement” (Winokur, 1976, p. 21). A given verbal response is concrete, and is an exemplar
of a class of responses. In contrast, a verbal operant is abstract because it means both a controlling
relation and a class of verbal responses with similar causes and conditioning history. Skinner classified
VBs on the basis of motivational variables, discriminative stimulus control, and other VBs (that cause
additional VBs). The following sections of this paper summarize distinct verbal operants, beginning with
mands.
Motivational Control: The Mand
A mand is a verbal operant whose cause is a motivational variable. States of deprivation or
aversive stimulation cause mands to be emitted by a speaker. Skinner defined the mand as “a verbal
operant in which the response is reinforced by a characteristic consequence and is therefore under the
functional control of deprivation or aversive stimulation” (1957, pp. 35-36). Under a state of deprivation,
positive reinforcers (consequences individuals work to obtain) will be effective. Under conditions of
aversive stimulation, negative reinforcers (consequences that remove such stimulation) will be effective.
In either case, a mand of any form, including speaking, writing, signing (e.g., American Sign Language),
pointing, finger spelling, and sending Morse code, may be emitted (Michael, 1982).
Responses such as A glass of water, please or May I have a hamburger are controlled by states of
deprivation and are reinforced positively. States of deprivation are motivational, and deprivation simply
means that a person has not had access to something specified for some measured duration. Responses
such as Quit that or Get out are controlled by their respective aversive stimulus and are reinforced
negatively when the listener complies. In all cases, a mand specifies its own reinforcer; for instance, the
mand, Will you please be quiet specifies what will (negatively) reinforce that mand: cessation of chatter.
When mands are produced, an appropriately conditioned listener will act in ways that are reinforcing to
the speaker.
Produced mostly for the benefit of speakers, and propelled by states of motivation, particular
forms of mands do not strictly covary with discriminative stimuli present in the environment. For
instance, a speaker’s mand, “May I have an apple pie?” is more likely in places where pies are available.
Nonetheless, one might also say, “I want to eat a piece of pie” when none is in sight; it may function as a
mand if another person who hears it proceeds to bake a pie. Occasionally, when deprivation is very
strong, mands may be completely free from external stimulus control, as in the “isolated desert-dwelling
hermit’s cry, ‘water’” (Winokur, 1976, p. 30). In general, requests, commands, prayers, advice, questions,
warnings, permissions, offers, and the like are mands. Note that multiple linguistic categories are reduced
to just one (mand). Whether an utterance is a mand or not cannot be determined by its structural
properties. The utterance of the word “Fire!,” for example, is a mand when addressed to a firing squad, a
textual when read aloud from print, a tact when it is evoked by the sight of fire, and an echoic when a
child in therapy imitates that modeled word. Similarly, the sentence I see fire may be a tact, a textual, or
an echoic, each with its own cause.
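The point that form alone does not identify a verbal operant can be sketched in a few lines of Python. This is only an illustration that restates the “Fire!” example, not a formalization proposed by Skinner; the names ControllingVariable, OPERANT_BY_CAUSE, and classify_operant are hypothetical, and the four causes are deliberately simplified.

```python
from enum import Enum


class ControllingVariable(Enum):
    """Hypothetical, simplified labels for what evokes an utterance."""
    MOTIVATION = "deprivation or aversive stimulation"
    NONVERBAL_STIMULUS = "object or event in the environment"
    PRINTED_STIMULUS = "printed word"
    HEARD_MODEL = "the same word modeled by another speaker"


# Mapping from cause to verbal operant, restating the examples in the text.
OPERANT_BY_CAUSE = {
    ControllingVariable.MOTIVATION: "mand",
    ControllingVariable.NONVERBAL_STIMULUS: "tact",
    ControllingVariable.PRINTED_STIMULUS: "textual",
    ControllingVariable.HEARD_MODEL: "echoic",
}


def classify_operant(form: str, cause: ControllingVariable) -> str:
    """Classify by cause only; the form argument is deliberately ignored."""
    return OPERANT_BY_CAUSE[cause]


if __name__ == "__main__":
    # The same form receives a different label under each cause.
    for cause in ControllingVariable:
        print("Fire!", "|", cause.value, "|", classify_operant("Fire!", cause))
```

The form argument never enters the lookup, which mirrors the claim that structurally identical utterances belong to different operant classes when their causes differ.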
Skinner (1957) also described a variety of generalized mands, which seem irrational but are
nevertheless lawful. Extended mands occur when people mand small babies, dolls, untrained animals, and
machines (e.g., a driver’s mand at a stop light, “Come on, green light!”) that do not reinforce the speaker.
They are maintained because of a past history of reinforcement for similar responses emitted under
similar conditions.
Clinical Implications
SLPs should be especially interested in teaching mands to children and adults with language
disorders. Traditionally, SLPs have shown greater interest in teaching tacts (see the next section)—the
perennial naming of objects and colors rather than mands. Mands, however, are an important class of
verbal operants that clinicians should target in both early and later stages of language intervention with
children as well as adults. Even individuals with aphasia, traumatic brain injury, or dementia would be
better functional communicators if they could mand. In fact, what is promoted as functional
communication in speech-language pathology is, for the most part, mands. For the purpose of
clarification, it should be noted here that function in the speech-language pathology literature does not
refer to causes, as it does in natural science and behavioral analysis. Instead it vaguely refers to the “use
of language.” It is generally and correctly asserted that individuals with significant communication
problems who learn to “express their basic needs,” “ask for information,” and “request clarification”
(all mands) are better functional communicators than are those who name or describe objects. A woman
whose husband is aphasic does not especially care if he can describe or name (tact) water; she will be
content if he can mand it when thirsty.
Michael (1988) suggests that about half of what adults say in the course of a daily interaction
with others may consist of mands. Some SLPs may assume that children who are taught the labels (tacts)
for objects, will mand the objects they want. Contrary to this assumption, clinical VB training, or what is
beginning to be called the verbal behavior approach (Barbera, 2007; Miguel, 2009), has made it clear that
children who learn tacts may not automatically mand; they need mand training as well (Hall & Sundberg,
1988; Michael, 1988). Children who cannot mand often resort to such undesirable behaviors as temper
tantrums, whining, grabbing, and aggressive nonvocal acts because they cannot request what they want
(Carr et al., 1994). Nonverbal or minimally verbal children are especially prone to using unacceptable problem
behaviors in place of socially acceptable manding. Teaching mands first to such children may reduce many
undesirable behaviors because the mands will give them access to the same reinforcers that their
undesirable behaviors successfully sought (Carr et al., 1994; Reichle & Wacker, 1993). Other classes of
VBs may then be more efficiently taught to children whose undesirable vocal or nonvocal manding
behaviors have been replaced by desirable vocal mands.
In more recent research on teaching VB to children and adults who have not learned a verbal
repertoire, the concepts of deprivation and aversive stimulation have been refined further to account for
some varied conditions under which mands tend to be produced. Generally, and as noted, mands may be
produced under states of deprivation or aversive stimulation. For instance, a person who has not had
access to water for several hours is likely to request it when the conditions that support a mand exist.
However, a person may also mand for a drink when he or she has just ingested salt, a condition that bar
owners tend to exploit by offering salty pretzels to their patrons who, after eating them, order (mand) more
drinks (J. Vargas, 2009). Eating pretzels creates a state of fluid deprivation, but is not, in itself, a state of
deprivation. SLPs offering language treatment to infants and toddlers often schedule language treatment
sessions just before their young clients have had breakfast or lunch to increase the probability that the
food used as a reinforcer during the sessions might be more effective than when the children arrived at the
clinic after a full meal. Similarly, a teacher who plans to increase question-asking behaviors (mands) in
her students may increase the difficulty of an academic task. This aversive task difficulty may increase
the probability that the students ask for help; the teacher may then use prompts, models, and other
procedures to teach mands (requests for the teacher’s help). Such necessary steps taken to increase the
motivation for mands (or for nonverbal behaviors) are known as establishing operations (EOs), a term
that Keller and Schoenfeld used in 1950 and that Michael (1988, 2000) expanded and refined. EOs also are
known as motivating operations (MOs) (J. Vargas, 2009). Under natural settings involving speakers with
a good mand repertoire (such as the adult in the bar), EOs increase the probability that a mand will be
produced. Under clinical conditions involving speakers with a limited mand repertoire, establishing
operations make it somewhat easier to teach mands. In essence, EOs have two kinds of effects that
clinicians can exploit. First, they alter the reinforcing effects of some object, event, or activity; this is the
reinforcer-establishing effect. Second, EOs change the current frequency of behaviors that were
previously reinforced by that object, event, or activity; this is the evocative effect. The two effects of EOs
are independent and concurrent (Michael, 2000). In light of these refinements of the motivational
variable, Michael defines the mand as “a type of verbal operant in which a particular response form is
reinforced by a characteristic consequence and is therefore under the functional control of the establishing
operation relevant to that consequence” (1988, p. 7; emphasis added).
EOs may be unconditioned (UEO) or conditioned (CEO). Biological propensities underlie UEOs
whereas past learning underlies CEOs (Hall & Sundberg, 1988; Michael, 1988). Asking mothers to bring
infants to early language intervention sessions just before breakfast, and then using breakfast food as
a reinforcer for vocal responses, is an example of a UEO. The infant’s sensitivity to food is biologically
determined (hence unconditioned or unlearned), although specific food preferences are learned.
On the other hand, an EO that increases the value of a toy as a reinforcer for a child under mand training
is an example of a CEO. Placing a preferred toy on a high shelf may temporarily increase its value as a
reinforcer and the probability that the child will mand it. A missing item necessary to complete a task
might also create a CEO to teach mands, as Hall and Sundberg (1987) demonstrated. While teaching a
student who is deaf to make soup, the authors omitted the needed hot water to create a CEO that helped
teach the mand, “hot water.” It may be noted that if the water manded is immediately consumed, any EO
the clinician will have manipulated would be unconditioned; if the water manded is not consumed, but is
used for some activity (such as washing hands or cooking a meal), then any EO in effect is conditioned.
The classic conditioning literature not only had recognized the effect of deprivation (currently,
part of EOs), but also that of satiation. While deprivation increases the value of a reinforcer and the
response rate associated with it, satiation decreases both. A person who has just eaten is unlikely to mand
food. This effect of satiation, too, is crucial for the clinician who plans to teach mands for food and
drink. As a child receives food following successive mands, the reinforcing value of food is likely to
decline, and so is the response rate. The term abolishing operations has been used to refer to those aspects
of EOs that reduce (a) the reinforcing value of some object or activity and (b) the response frequency
associated with that object or activity (McGill, 1999; Michael, 2000).
There is an extensive behavioral literature on EOs (see McGill, 1999, and Smith
& Iwata, 1997, for reviews) and on mand training (see Sautter & LeBlanc, 2006, for a review of treatment
research). In most of the studies, EOs were manipulated to decrease undesirable behaviors (e.g.,
self-injurious behaviors). However, EOs are important in mand training because the clinician need not wait for
opportunities to arise for the child to produce mands. Instead, the clinician can create conditions (EOs) that
encourage manding more frequently and thereby make the mand training more efficient (e.g., Hall &
Sundberg, 1988). It is also evident from VB treatment research that what SLPs call language initiation, an
important skill targeted in language therapy with children, is, for the most part, manding. Teaching mands
to children with impaired VB (language disorders) is an effective way of teaching verbal initiation
(Taylor et al., 2005). Consistent with Skinner’s suggestion (1957), VB treatment research also has shown
that mand training facilitates the training of other verbal operants (Sautter & LeBlanc, 2006).
Discriminative Stimulus Control: The Tact
Discriminative stimuli are aspects of the environment that control certain verbal responses. A
discriminative stimulus sets an occasion for a response that has characteristically received reinforcement
in the past. Although some people mand much, tacts that are controlled by discriminative stimuli are a
significant portion of most people’s everyday speech. A tact is a verbal operant evoked by objects or
events in the environment and reinforced by a verbal community in the presence of those objects and
events. Discriminative stimuli that control tacts are “nothing less than the whole of the physical
environment—the world of things and events which a speaker is said to ‘talk about’” (Skinner, 1957, p.
81). Motivational variables, critical for mands, are unimportant for tacts, which may be described as
“objective” or “disinterested.” Tacts say less about speakers' internal states than they do about their
physical world; mands do the opposite.
At the simplest level, naming could be a tact. At the next level of complexity, descriptive
statements could be tacts. Normally, listeners reinforce tacts based on the relation or correspondence
between the tact and its antecedent. For instance, a tact such as grass is reinforced if the controlling
antecedent is indeed grass; but the tact grass is green is reinforced on the basis of a correspondence
between the object grass and its conventional color. When the speaker is a child learning VB, however,
tacts whose forms do not show strict correspondence with their adult forms, and hence lack the
conventional correspondence with the antecedents as well, may still be reinforced. When the child
says “da,” for example, the mother may say, “Yes, that is a dog!” and thus reinforce the child’s tact, even
though that tact does not correspond either to its discriminative stimulus (in the adult sense) or to its adult
topographic feature. Gradually, the mother demands greater correspondence, and an appropriate repertoire
of tacts is established.
Tacts, though generally controlled by discriminative aspects of the physical environment, do not
have a point-to-point correspondence with their antecedents. Echoics and textuals (see the subsequent
sections) have such a correspondence. Tacts evoked by environmental stimuli soon become more
complex due to the recombinative arrangements with other verbal operants; intraverbals, described later,
help generate continuous speech in the absence of a parade of physical stimuli.
While mands are likely to be reinforced by unconditioned reinforcers, generalized conditioned
reinforcers always reinforce tacts. In most situations, these reinforcers are verbal responses of
listeners: right, correct, good, I agree, I think so, very interesting, and so forth. These and other
reinforcers are often interchangeable, and the speakers will get reinforced as long as their verbal
responses bear a conventional correspondence with their antecedents. Lack of correspondence can lead to
conditioned punishers: No, that is not green; I don’t agree, I see it differently, and so forth.
Like any other response, tacts, once conditioned, will generalize to similar stimulus situations.
Various kinds of generalization vastly expand the tact repertoire. A generic or simple generalization of a
tact is observed when a child or an adult produces an established tact (“ball” or “pen”) to a new stimulus
(e.g., a new ball or a new pen). More complex forms of generalized tacts are involved in what are
considered metaphor (including simile) and metonymy.
Metaphorical generalizations (that create what are generally called metaphors), philosophically
and linguistically thought to be a cognitive and creative achievement of a high order, are also a special
kind of tact under more refined discriminative stimulus control. Skinner’s (1957) example, Juliet is the
sun (metaphor) or Juliet is like the sun (simile), shows that the variables that controlled Romeo the
speaker were the sun and Juliet, which shared some common stimulus property that affected him. Creative as they
may be, metaphors and similes arise out of discriminated and shared properties of stimuli that control
them, not out of some presumed cognitive processes or intellectual achievements.
Metonymical generalization of tacts accounts for verbal operants that seem to have no controlling
stimuli (Skinner, 1957; Winokur, 1976). Metonymy is the act of naming something with another word that
is associated with it. In behavior analysis, metonymical expressions seem to lack a relevant stimulus, as
shown in the example that follows. These verbal operants pose a particularly difficult problem to the
linguists and cognitive theorists. How do speakers tact objects that are missing, which they do all the
time? If the object is missing, and the response is “about” that object, what is the discriminative stimulus
for that response? In linguistic-semantic analysis, responses of this kind are classified as nonexistence. In
cognitive analyses, the speaker emitting such a response is said to “recognize the absence of an object that
was once present.” Unfortunately, what recognition is, and how the absence of something is recognized,
pose additional explanatory challenges. A missing object cannot be a stimulus for a response, just as a
missing cause does not produce an effect. For example, when a child, while looking at the toy shelf,
says, “No truck,” we cannot conclude that the missing truck controlled that response. Many other objects,
not just trucks, were also missing, but the child did not tact them. The response “No truck” is actually
controlled by the currently present stimuli (e.g., toys that are present, along with perhaps the empty space
on the shelf) that coexisted with the missing truck. The toy truck has been a part of those stimuli in the
presence of which the response truck has been reinforced in the past. Clusters of stimuli have common
elements, and a response conditioned to one of them is also conditioned to all or some of the individual
elements in clusters. The speaker who again confronts one or some of those elements is likely to emit the
response in question. This is called metonymical extension (generalization) of responses. Metonymical
generalizations also account for more complex tacts than just naming a missing object (Skinner, 1957;
Winokur, 1976). A journalist’s report that the “White House asserts that the recession is over” is in fact a
report of what the President or a spokesperson has said. Specific speakers and the White House are commonly
associated with each other; therefore, they share a controlling relation to the tacts.
Certain processes governing tacts lead to abstraction. A controlling stimulus is typically
composed of multiple and discriminable (isolatable) properties such as shape, size, color, texture,
configuration, use, function, and so forth. A verbal response under the control of an isolated discriminable
property of a stimulus is an abstract response. Skinner wrote that “abstraction is a peculiarly verbal
process because a nonverbal environment cannot provide the necessary restricted contingency” (1957, p.
109). In other words, a nonverbal environment cannot teach abstractions; only a verbal community can. For
instance, in teaching a child to abstract redness as such, the verbal community (or a
clinician) will have to reinforce the tact “red” made in relation to objects that are red, but vary in shape,
size, texture, and to a point, hue. However, because these irrelevant properties (e.g., shape or size) also
gain some control over the verbal tact “red,” the teachers must reinforce differentially. The response red
is always reinforced in the presence of redness, regardless of the other properties of red stimuli. In this
kind of teaching, irrelevant stimulus properties do not covary with reinforcement whereas redness does,
and thus comes to control the tact “red.” As a result, a response is created that tacts an abstract property of
a stimulus that varies in other properties.
Skinner’s analysis of tacts is extensive, and includes provocative discussions on how people come
to tact private stimuli—stimuli that arise within the speaker’s body, and more importantly, how the verbal
community manages to arrange contingencies of reinforcement for them. In addition, discussions on
problems of reference and meaning (Hegde, 2008b), and a variety of literary behaviors also are included.
Generally, SLPs do a good job of teaching tacts, although the clinicians have tended to
conceptualize what they teach in linguistic terms (naming and describing objects and events). Although
tact teaching is important, there may sometimes be an overreliance on teaching simple tacts at the expense
of other verbal operants, especially mands, intraverbals, and autoclitic (often grammatic) relations. Simple
tacts are typically the responses given to the mand, “What is this?” Except at this simplest level of object
naming, tact training will include different types of verbal operants. Individual words expanded into
topographically more complex combinations of verbal operants (phrases and sentences) include mands,
intraverbals, and autoclitics (certain morphologic and syntactic aspects). As we shall see in a later section,
most everyday speech is a combination of different verbal operants. For instance, after teaching a child to
tact “ball” to certain round objects, the clinician may teach the child to produce The ball is red, which
consists of two tacts (ball and red) with two autoclitics (the and is). Similarly, the child may learn to say,
give me that red ball, which consists of a mand, two tacts (red and ball) and an autoclitic (that).
Therefore, pure tact training should soon give rise to a higher level of training in which different verbal
operants are combined into what are commonly called sentences and that Skinner (1957) considered as
larger segments of VB resulting from autoclitic activity.
After the mand, the tact is the second most frequently targeted verbal operant in VB treatment
research (Sautter & LeBlanc, 2006). Possibly, if child language treatment research published in speech-
language pathology and other related discipline journals is included, the tact may be the most frequently
taught verbal operant. In most VB treatment research, tacts were taught in combination with other verbal
operants. Several studies also have analyzed the functional independence of tacts from other verbal
operants (mands, intraverbals, and echoics). The findings have generally supported Skinner’s assertion
that verbal operants have different causes and need separate training, although in a few studies
generalization across functional units has been noted (Sautter & LeBlanc, 2006).
Verbal Behaviors Caused by Other Verbal Behaviors
Skinner wrote that “behavior generally stimulates the behaver” (1957, p. 138), and VB can
stimulate other VBs. Intraverbals, echoics, and textuals are the three kinds of VBs whose controlling
variables are VBs themselves.
Intraverbals
VBs whose controlling variables are prior verbal responses are called intraverbals. A defining
characteristic of intraverbals is that there is no point-to-point correspondence between intraverbal
responses and their stimuli. Such a correspondence is more evident in tacts, and most in echoics (imitative
responses that duplicate their own stimuli) and textuals (naming printed stimuli—reading).
Some intraverbals may be generated by another person’s verbal responses. A speaker may
say “four” when someone utters “two plus two is . . .” However, the most important classes of
intraverbals are those that are controlled by the speaker’s own prior VBs. Speech, once initiated by some
variables, is capable of evoking more speech in the same person. Much of everyday conversation is
intraverbal, as are serious discussions. One speaker’s production of “Why?” often evokes the production
of an intraverbal “Because . . .” in another speaker. Similarly, “Fine, thank you” may be an intraverbal
response to “How are you?” (Skinner, 1957). The instructor who asks the class, “What is a discriminative
stimulus?” hopes that the question will generate an intraverbal response of “A discriminative stimulus is .
. .” The instructor also hopes that what follows is an accurate (reinforceable) and complete intraverbal.
Most of lower or higher education is designed to generate intraverbal responses that the ordinary verbal
community may not establish. As these examples show, an utterance is not only a response to some other
variable, but it also is a stimulus to subsequent utterances. People often “go on speaking” not because of a
“train of ideas” rushing inside their heads, but because of the stimulus function of their own verbal
responses. Most likely, intraverbal control is a significant contributor to speech fluency (Hegde, 1982).
Intraverbals can be either chains or clusters (Winokur, 1976). Intraverbal chains have a fixed
order upon which the delivery of reinforcement is contingent. In reciting a poem, one part controls the
other in a sequential manner. The child acquires the alphabet as a chain in which one letter supplies the
necessary stimulus for the next. When the recitation of the alphabet gets interrupted, the child usually
goes back to recreate the stimuli for subsequent responses. Counting is chaining, as are formulas,
syllogisms, and symbolic logic. History is taught and learned as intraverbals. Skinner cautions, however,
that “any one link in a chain of responses is not under the exclusive control of the preceding link” (1957,
p. 72), because repeating just the last emitted letter of an interrupted recitation of the alphabet may not
reinstate the chain.
Intraverbal clusters are groups of verbal operants that can evoke each other with no specific order
or grammatical connection. Moreover, clusters are bidirectional, while chains are unidirectional. Word
association test responses are clusters. The verbal response “ring,” for example, can serve as a
discriminative stimulus for clusters, such as: (1) “gold,” “diamond,” “hand,” “finger,” “engagement;” (2)
“noise,” “clang,” “bell,” “door;” (3) “worm,” and perhaps other clusters (Winokur, 1976). However, a
member of any one cluster is usually not a member of any other cluster, although all the clusters in
question may have the same discriminative stimulus.
Intraverbal clusters are of two types, thematic and formal (Winokur, 1976). In thematic clusters
there is no acoustic similarity between the verbal response that serves as the discriminative stimulus and
the response it evokes. There is such a similarity in formal clusters. In thematic clusters, verbal stimuli
and responses evoke each other because they share common “meaning” in the sense that they have
entered into similar contingencies of reinforcement. For example, “ring,” “noise,” “clang,” and “bell,” are
all about the same thing. In formal clusters, stimuli and responses come together because they sound
similar acoustically: the word “hat” may evoke “cat,” “mat,” “pat,” and “chat.” Thus, rhyming as a
phonological skill may suggest not some kind of awareness or knowledge, but phonetically controlled
formal clusters (intraverbals). Formal clustering is explained on the basis of response induction, a
behavioral process in which new responses similar in form to those reinforced earlier are likely to occur.
Intraverbal relations have also been described in terms of divergent and convergent control. A
cluster that consists of the verbal stimulus “Chair” and several evoked intraverbals (e.g., “table,” “sofa,”
“dining,” “reclining,” etc.) illustrates divergent intraverbal control; a single stimulus evokes multiple and
varied intraverbal responses. A different cluster that consists of varied verbal stimuli (e.g., “four legs,”
“made of wood,” “something to sit on,” etc.) and a single evoked intraverbal “Chair” illustrates
convergent control (Axe, 2008). Because of their general complexity, intraverbals require not a simple
discrimination, but a conditional discrimination, in which one verbal stimulus changes the evocative
effect of another verbal stimulus, and in combination, they evoke an intraverbal response (Axe, 2008).
See the next section for examples and teaching implications of conditional discrimination in intraverbal
relations.
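As a rough illustration only, divergent and convergent control can be pictured as one-to-many and many-to-one mappings between verbal stimuli and intraverbal responses. The sketch below is a toy model built from the “chair” examples above; the dictionary and function names are hypothetical, are not drawn from Axe (2008), and make no attempt to represent conditional discrimination.

```python
# Toy model only: the stimulus-response pairs are the "chair" examples from
# the text; the data structures and function names are hypothetical.

# Divergent control: one verbal stimulus evokes several intraverbal responses.
DIVERGENT = {
    "chair": ["table", "sofa", "dining", "reclining"],
}

# Convergent control: several different verbal stimuli evoke the same response.
CONVERGENT = {
    "four legs": "chair",
    "made of wood": "chair",
    "something to sit on": "chair",
}


def evoked_by(stimulus):
    """Return the responses a single stimulus evokes (divergent control)."""
    return DIVERGENT.get(stimulus.lower(), [])


def stimuli_evoking(response):
    """Return the stimuli that converge on one response (convergent control)."""
    return sorted(s for s, r in CONVERGENT.items() if r == response.lower())
```

Calling evoked_by("Chair") lists the varied responses to one stimulus, whereas stimuli_evoking("chair") lists the varied stimuli for one response.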
SLPs who describe language disorders in terms of lack of continuous speech, limited
conversational skills, impaired sentence completion tasks, lack of topic initiation and maintenance,
limited production of synonyms and antonyms, or limited production of proverbs and common sayings are
indeed describing impaired intraverbal relations. Impairment in intraverbal relations is a higher level VB
disorder than deficiencies in producing mands and tacts. To establish the higher level intraverbal skills in
the repertoires of children and adults with language disorders, SLPs first need to teach the more basic
verbal operants, including echoics, mands, and tacts. For instance, to learn the intraverbal, “In the winter,
the big white bears hibernate,” emitted as an answer to the question, “What do the big white bears do in
winter?,” the client should have all the tacts of that intraverbal in his or her verbal repertoire; if not, the
clinician will have to establish them first.
In the SLP literature, there are few or no studies on teaching explicitly described intraverbal
relations to children and adults. Nonetheless, SLPs have taught intraverbals to their clients, though they
have not conceptualized them as such. Clinicians often teach words to children with language disorders
by (a) presenting a stimulus, (b) asking a question (e.g., “What is this?”), (c) immediately modeling the
response (e.g., “Say, ball”) and (d) reinforcing the correct response from the children. Later, they fade the
model, and just present the stimulus, ask the question, and reinforce the response. This sequence of
procedures is a way of establishing intraverbal relations. Clinicians routinely exploit a premorbidly well
established intraverbal relation, even if it is currently weakened, to evoke intraverbal responses in clients
with brain injury. For instance, clinicians often prompt in the form of an incomplete sentence (e.g., “You
eat with a . . .”) to evoke an intraverbal response “fork” from a client with aphasia. Phonetic cues (e.g., “the
word starts with a p . . .”) or cues based on the use of an object (e.g., “You write with it”) also are
examples of strategies to train or retrain intraverbal relations. Asking children to name individual
members (e.g., cats, rats, dogs, etc.) of a class of stimuli (animals) is another method of establishing
intraverbals. Responses to such directions as “Name some animals” are intraverbals. Intraverbals are
established as well when children learn to respond with synonyms when an antonym is supplied (and
vice versa) or when they learn to give correct definitions when asked to define terms.
In VB treatment research, there are several controlled treatment studies on teaching intraverbal
relations to clients who lack them. These studies help guide SLPs in developing treatment programs to increase
explicitly defined intraverbal relations in children and adults who have limited verbal repertoires. Cihon’s
(2007) review of studies on training intraverbal repertoires and the specific studies cited in it are helpful to
clinicians in designing their own intraverbal treatment programs for children and adults with VB
(language) disorders. Cihon describes peer training, conversational speech training, discrete trials, and
several other educational methods (e.g., direct instruction and precision teaching) as effective procedures
to establish intraverbal relations.
Echoics
Much has been written on the role of imitation in language learning. It may be so because echoics
(imitated verbal responses) are perhaps the earliest of vocal operants. An echoic is a verbal response
whose acoustic pattern resembles that of its own verbal stimulus. In effect, an echoic reproduces its own
stimulus. Echoics are reinforced when the responses more or less accurately reproduce the acoustic
characteristics of their stimuli and closely follow the stimuli. Repeating what someone said in the recent
or remote past is not an echoic verbal operant. Echoics are verbal operants preceded by the same or closely
resembling verbal stimuli (Skinner, 1957). Because they are reinforced verbal operants, no special faculty
or instinct is presumed to be responsible for them.
Listening consists mostly of covert echoics. [Incidentally, listening, the other part of VB, is not
covered in this paper; but see Schlinger, 2008b for the Skinnerian view that listening is behaving
verbally.] At the least, a listener may covertly repeat the important parts of a speaker’s responses. The
recipient often covertly and overtly repeats complicated traffic directions. In everyday conversation,
speakers tend to echo one another’s words or peculiar phrases. Certain work orders or complicated
instructions also may be immediately repeated (echoed). Skinner (1957) also describes self-reinforcing
self-echoic behaviors that are often called palilalia if they are excessive, apparently uncontrolled, or
pathological in other ways. Even the so-called verbal perseveration seen in individuals with neurological
disorders may be described as fully or partly self-echoics. It is important to note, therefore, that echoics
are not limited to simple imitations.
Echoics in a child’s repertoire give a distinct advantage to the verbal community that teaches its
language. The teaching begins with babbling, but both the infant and the caregivers have some ways to
travel before they arrive at helpful echoics. Echoics as early operants are shaped from the baby’s tendency
to initially exhibit unconditioned and undiscriminated babbling (i.e., non-operant vocalizations that are
neither an echoic nor any other kind of verbal operant). Nonoperant babbling occurs when well-fed (and
relaxed) babies are lying on their backs, and the air movement through the vocal folds causes them to
vibrate, resulting in random sounds or noises. Because it is random (not yet selected by reinforcement), it
is likely to include sounds that are not specific to the infant’s verbal environment (the family “language”)
(McLaughlin, 2006). That the infant’s babbling may include some sounds alien to the infant’s verbal
environment is neither surprising nor remarkable. There is no justification to hypothesize that the
nonoperant and unselected (unreinforced) babbling is due to some innate mechanism because such
babbling is a physiological-aerodynamic phenomenon.
The reinforcing effects the caregivers themselves experience from the infant’s nonoperant
babbling establish an interlocked set of caregiver-infant reactions that help the emergence of other
verbal operants. The interlocked chain of reactions includes the parent’s echo-babble, infant’s self-echoic
babble, caregivers’ reinforcement, infant’s operant babble, and the more refined reinforcement made
contingent on sounds of the surrounding (family) environment that eventually help establish mands, tacts,
and other verbal operants. It is interesting to note that the initial reinforcement occurs for the caregivers,
who (in everyday language) take delight in the random sounds or syllables their babies produce. That
their babies’ vocal sounds and syllables are conditioned reinforcers to the caregivers and (soon) to the
babies is central to the development of initial echoics and their transformation into other verbal operants.
Babies’ babbled sounds and syllables reinforce the caregivers because the caregivers will have had a long verbal
learning history; speech (and its individual sounds) has been socially reinforcing to them; lack of speech
is socially aversive. The reinforcing effects of their babies’ nonoperant babbling cause two main changes
in the caregivers’ behaviors. First, they echo-babble their babies’ babbles. That is, the caregivers repeat
what the babies babble. Second, the caregivers produce similar sounds and syllables while they are
reinforcing their babies with their caretaker routines: changing, bathing, drying, dressing, feeding, holding
them closely, and playing with them. The vocal sounds of the caregivers, by association with such
reinforcers, acquire conditioned reinforcement value for the babies. Next, the sounds the babies hear
themselves produce are immediately and automatically reinforced, causing an increase in babbling. It is
known that babies who are deaf begin to babble but soon stop—possibly because of lack of automatic
reinforcement. This reinforcement also may be the reason why babies self-echo; that is, they repeat their
own babbled sounds in a stage of language development researchers call reduplicated (e.g., da-da-da)
babbling (McLaughlin, 2006). A different type of conditioning also begins to take place. Parents, hearing
the babies’ babbled sounds, syllables, and self-echoics, begin to socially reinforce them (e.g., by smiling,
tickling, picking the babies up). There is experimental evidence that such contingent social reinforcers
increase babbling in babies. (See Hegde, 1998 for reviews of classic studies and Bloom, Russell, &
Wassenberg, 1987; Goldstein, King, & West, 2003; Goldstein & Schwade, 2008; Gros-Louis, West,
Goldstein, & King, 2006 for more recent studies in which caregiver reinforcement has systematically
increased babbling in babies.)
One typical objection to this analysis is that parents do not plan to reinforce and that they cannot
reinforce consistently (Owens, 2005). A point worth noting is that the caregiver reinforcement need not
be planned (McLaughlin, 2006), nor need it be continuous to increase nonechoic, echoic, or self-echoic
babbling. There is evidence that caregivers attend to (reinforce) about 50% of infant vocalizations
(Gros-Louis et al., 2006). It is well known that intermittent (less than 100%) reinforcement increases both the
frequency and the strength of behaviors so reinforced (J. Vargas, 2009). Indeed, because of intermittent
reinforcement, infant vocalizations resist extinction when reinforcers are not forthcoming (Goldstein,
Bornstein, Schwade, Baldwin, & Brandstadter, 2007). Nonetheless, in their sources on language
development, SLPs are likely to be told, without any review of studies, that “There is very little evidence
that the infant’s babbling is shaped gradually by selective reinforcement” (Owens, 2005, p. 77). Nor is
there any evidence that the effects of reinforcement are always slow and gradual, even if these could be
operationalized.
There is now stronger evidence that self-echoic babbling, caregiver echoics, and reinforced
babbling serve as a basis for more complex verbal operants, because one theory that discounted that
possibility has been discredited. This is Jakobson’s (1968) theory of discontinuity between early
vocalizations and later language development. Most researchers now agree that there is continuity, and
thus there is support for conditioned echoics playing a significant role in VB learning (McLaughlin,
2006; Pena-Brooks & Hegde, 2007). Later on, babies’ responses at the word level are often partial
echoics (approximations) as in dada for daddy. Nevertheless, parents tend to reinforce them initially, but
gradually, the parents require progressively better approximations, resulting in complete echoics. While
partial or complete echoics are being reinforced, other variables, such as persons, objects, and events also
are present. Eventually, these variables gain discriminative stimulus control over the response.
Michael (1982) suggests a new term, duplic, to extend Skinner’s echoic verbal operant to include
sign language, which does not have a vocal stimulus that typically evokes a vocal echoic (or other verbal
operants). The new term captures the essence of a verbal operant whose stimulus need not be verbal;
when one person—often a teacher—signs, the other person—often a learner—copies the sign (imitating,
not “replying”). The term duplic includes not only sign imitations, but also copying texts. The term
captures both the auditory and visual stimulus modes, and therefore, may be preferable to the term echoic.
Regardless of the theoretical differences on the importance of echoics in language learning, SLPs
routinely establish echoics (typically called imitations) in most of their treatment sessions. Echoics may
help establish verbal operants of any topographic feature: words, phrases, and sentences. The verbal
stimuli SLPs give to evoke echoics in their clients are called modeling, although within a behavioral
framework these models are also mands (e.g., “Say, ball”). In everyday speech, modeling is not the
necessary stimulus for echoics. In the typical verbal environment, a speaker need not manipulate a
stimulus to call the resulting response an echoic. A speaker who says, “Remember, meeting at 3 p.m.
today,” is not necessarily modeling for the listener to echo; but the listener may still overtly or covertly
echo it: “meeting at 3 p.m. today.” The initial speaker may still reinforce the listener’s overt echoic (“Yes,
that is the time!”) while not having explicitly set the stage for it.
Some of the earliest treatment studies on child language and speech disorders, conducted in the
1960s and 1970s, reported that the modeling-imitation-reinforcement sequence was effective in
establishing echoics that may be shaped either into topographically more complex responses or other
classes of verbal operants (see Hegde, 1998 for details). For instance, once such tacts as ball and big are
established, the client may be taught such mands as I want big ball or such tacts plus autoclitics as that is
a big ball. Verbal operants that are at zero baseline almost always need to be first established as echoics.
Shaping should be added to echoics when echoics are not produced on baselines (Hegde, 1998). Echoics
have been found to be a useful initial teaching target in treating adult communication disorders, including
aphasia, apraxia of speech, and dysarthria. Whenever echoics are established as a starting point in
treatment, clinicians fade modeling to bring the client’s verbal responses under the control of more natural
stimulus conditions (e.g., motivation in the case of the mand, and discriminative stimuli in the case of the
tact).
Textuals
Skinner defined the textual as a vocal response that “is under the control of a nonauditory verbal
stimulus” (1957, p. 66). Most commonly, the printed text is the visual controlling (causal) variable of
textual behavior, commonly called reading. Skinner defined a reader as a “speaker under the control of a
text” (1957, p. 65). Other visual stimuli that control textuals include pictures, pictograms, phonetic
symbols, hieroglyphs, Braille, and other visual forms that evoke textuals. In reading, printed stimuli are
covertly or overtly named, but are not described, and for this reason, textuals are said to be functionally
equivalent to the proper name of the stimulus (printed characters, words, other symbols). The stimulus
control is precise, or there will be defective textuals (misreading).
Michael (1982) suggests the term codic to include textual and to appropriately extend this type of
verbal operant to taking dictation. In both the textual and dictation taking, the stimulus is verbal, although
the verbal stimulus for the textual is print and the verbal stimulus for dictation taking is vocal. The
response form, however, is different: it is reading (a vocal response) in the case of the textual and
writing (a motoric response) in the case of dictation taking. Skinner’s analysis of the textual and the more
recent extensions (see Michael, 1982, 1985; E. Vargas, 1982, 1988) include such varied phenomena as
self-textuals (“making notes to oneself”), transcription, writing, copying printed material, and so forth.
Thus, the Skinnerian analysis is more complete than the fragmented linguistic analysis of phonologic,
semantic, grammatic, morphologic, and literacy skills.
Treating reading and writing disorders is now within the scope of practice of SLPs. In recent
times, attention has been drawn to the concept that reading and writing skills are language-based. It may
have taken this long to recognize this simple fact because of the linguistic model SLPs have been
following. Transformational generative linguistics paid little or no attention to reading and writing.
Skinner, however, had considered all aspects of VB since the 1940s, and his analysis of textuals offers a
good starting point for planning remediation programs. Unfortunately, there is not much research in
speech-language pathology on teaching textuals to children or adults. In behavioral journals (e.g., Journal
of Applied Behavior Analysis, The Analysis of Verbal Behavior), clinicians can find ways of teaching
reading and writing skills that are better based on experimental methods than are the literacy intervention
approaches often described in SLP journals and textbooks.
Space will not permit a more detailed examination of textuals as clinical treatment targets. It is,
however, useful for SLPs to consider integrating textual training with speech-language training (Hegde &
Maul, 2006). For example, pictures used to evoke (along with mands) VBs of specific topographic
features (phonemes, words, phrases, sentences) may be accompanied by printed stimuli. When teaching
the production of phoneme /b/ or the tact ball with the help of a stimulus picture, the clinician may have
the word ball printed on the stimulus card. While evoking either an echoic ball or a tact ball, the
clinician also can point to the printed word ball to reinforce a textual response. If clinicians follow the VB
approach to teaching reading and writing, they would avoid such ineffective procedures as teaching
phonological awareness or teaching encoding and decoding skills, and such other unproven cognitive-
linguistic approaches.
Audience
The listener is an important part of the contingency governing VB, which typically occurs only in
the presence of other individuals who reinforce it. The term audience refers to an effect on verbal
operants that listeners (audience) have. Therefore, audience is not a class of verbal operants like the ones
described so far. An audience is defined as “a discriminative stimulus in the presence of which verbal
behavior is characteristically reinforced and in the presence of which, therefore, it is characteristically
strong” (Skinner, 1957, p. 172). An audience may determine all that is said on a given occasion, although
this happens infrequently. Audience is usually a supplementary variable; its effects are additive to the
strength of other primary variables. As a supplementary variable, an audience may have two kinds of
effects.
First, audience may determine such production effects as audibility of utterances. On such
occasions, the audience may not have any effect on what is being said, only on how it is said. A strong
primary variable makes speech very probable, but normally it will be uttered aloud only in the presence of
an audience. In certain other social situations, strong primary variables of speech may be absent, so that
people confronting each other “do not have much to say.” The presence of each other, however, might
constitute a strong audience effect (strong discriminative stimuli for speech). People then talk
about the weather or do “small talk.” An opposite effect is seen when the audience present is too weak,
and the primary variables are very strong. The person is “itching” to speak but confronts the wrong kind of
audience, one that has not been associated with reinforcement in the past. The speaker
may nevertheless blurt out something, and then may hastily add “please don’t mind, I am just bubbling
today.”
Second, as a supplementary variable, an audience may partly determine what is being said,
although some politicians may be accused of saying all that is said because of an audience. When an
audience becomes a part of controlling variables, its effect is usually to select a particular response.
The presence of a single primary variable raises the probability of the emission of several verbal
responses that are typically conditioned to it. The audience present may determine which one of these
actually gets said. After having tested a child, a language clinician is likely to respond with different sets
of words to describe the “problem” depending on whether the listener is a junior student in speech
pathology, a graduate student, a fellow colleague, or the mother of the child. People who are bilingual
switch languages depending on their audience. Among themselves, professionals emit their own jargon, and
peer groups emit their slang. These are all instances of audience control of what is being said.
For the most part, SLPs initially establish verbal operants in the clinic. As they teach various
verbal skills to their clients, they also serve as the initial audience for the clinically established VBs.
Consequently, the clinicians become discriminative stimuli for the clinically established VBs. Unless the
clinicians take additional steps, other people the clients encounter may not evoke the newly established
VBs because their discriminative stimulus function is not yet established. This is often described as a
problem of generalization; but it is indeed a problem of lack of required additional conditioning of VBs to
audiences other than the clinician (Hegde, 1998).
To overcome that problem, clinicians usually bring other people, including family members, into
the treatment sessions. Fellow clinicians also may take part in treatment sessions. The clinician may move
treatment sessions to more natural settings where additional people serve as audience for the newly
established verbal skills. In essence, audience generalization is a matter of transferring the stimulus
control from clinician to other persons with whom the client regularly interacts.
Multiple Causation of Verbal Operants
Motivation (causing mands), environmental events (causing tacts), another speaker’s speech
(causing echoics), printed or other visual stimuli (causing textuals), one’s own speech (causing
intraverbals) and the audience (producing an effect on those verbal operants) exemplify the six kinds of
stimulus control of the first five kinds of verbal operants. These controlling variables do not exert their
influence in isolation, however. Typically, several variables concurrently control the same
verbal operant, and different sets of variables control different operants. Consequently, (1) a single
variable evokes (causes) multiple responses, (2) multiple variables evoke a single response,
and (3) different classes of verbal operants constitute single utterances. Only in echoics (duplics) and
codics (textuals and dictation taking) do we see single variables controlling their respective single
responses.
That a single variable controls more than one response is the basis of much of the speech emitted
in relatively constant environments. People do not need a parade of stimulus events to talk for extended
periods of time. One variable is sufficient to generate much speech, which serves as stimuli for more
speech (intraverbal control). Therefore, there is hardly any dearth of stimuli. In fact the problem might be
to explain how certain responses, though probable, do not get emitted and how the responses that were
emitted got selected. Skinner (1957) rejected the mentalistic idea that speakers choose their words,
because it creates a more difficult problem: explanation of an elusive choosing process and an inner
speaker who chooses. Instead, Skinner proposed that the number and strength of variables present in a
given situation determine the selection of verbal operants. The effects of two or more controlling
variables are additive. For a specific response, therefore, the greater the number of currently active
variables, the higher the probability of its occurrence. However, a few stronger variables with a longer
history of reinforcement may exert greater influence on response selection than several weaker variables
with defective or recently instituted contingencies. Additional supplementary strengthening in the form of
specific audience effects is also a source of response selection. On the basis of multiple causation, Skinner
described different kinds of supplementary strengthening and explained various kinds of neologisms,
intrusions, slips, and distortions in speech.
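The additive account of response selection can be sketched, very roughly, as a sum of strengths. The example below is only an illustration under stated assumptions: the numeric strengths, variable names, and the function select_response are hypothetical, and Skinner (1957) offers no such quantitative rule. It simply shows how one strong variable, supplemented by an audience effect, can outweigh several weaker variables that favor a different response.

```python
from collections import defaultdict


def select_response(active_variables):
    """Sum the strengths of all active variables for each candidate response.

    active_variables: iterable of (variable_name, strength, responses), where
    `responses` lists the verbal responses that variable strengthens.
    Returns (strongest_response, totals); strongest_response is None when no
    variable is active.
    """
    totals = defaultdict(int)
    for _name, strength, responses in active_variables:
        for response in responses:
            totals[response] += strength  # additive effect of each variable
    if not totals:
        return None, {}
    strongest = max(totals, key=totals.get)
    return strongest, dict(totals)


# Hypothetical strengths: one strong variable with a long reinforcement
# history, three weak variables, and a supplementary audience effect.
example = [
    ("strong primary variable", 5, ["coffee"]),
    ("weak variable 1", 1, ["tea"]),
    ("weak variable 2", 1, ["tea"]),
    ("weak variable 3", 1, ["tea"]),
    ("audience", 1, ["coffee"]),
]
strongest, totals = select_response(example)
print(strongest, totals)
```

With these made-up values, the response strengthened by the single strong variable and the audience is selected over the response favored by three weak variables.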
Different classes of verbal operants, such as tacts, mands, and intraverbals are often parts of a
single utterance. The utterance, Fool! You think I am a fool? contains an echoic and a tact. Attention,
kids! contains a mand and an audience effect. The mand, May I have a large cup of strong coffee also
contains multiple tacts (large, cup, strong, coffee) and autoclitics (a, of).
Except at the very basic level, VB training involves multiple verbal operants. Pure mands or pure
tacts, though rare, may be taught with limited response topography (words, phrases). Teaching children to
say cookie, sock, or book as either a mand or a tact may be initially necessary, but excessive training at
this level will make it difficult to shift training to more complex verbal operants. More importantly, the
simpler and purer verbal operants will not be as effective as the more complex verbal operants in
modifying the listener responses.
Response complexity may be increased only by including different verbal operants into single
utterances. The traditional sentence typically is a combination of multiple verbal operants, including
autoclitics, described in the next section. Conversational skills training, a high level of verbal training,
will include, in different combinations, mands, tacts, partial or full echoics, self-echoics, autoclitics, and
intraverbals, as noted previously.
Autoclitics: Grammar and More
In Skinner’s (1957) analysis, mands, tacts, intraverbals, echoics, textuals, and audience effects are
primary verbal behaviors, which do not include grammar and word order. This position contrasts with the
Chomskyan linguistic theory in which grammar is primary, and has the status of an independent variable
in that an innate grammar is essential for language development. However, Skinner did not ignore word
order and grammar. None of the abundant linguistic criticisms have come to grips with the subtleties of
Skinner’s analysis of grammar and word order. According to Skinner (1957), grammar and word order
are secondary to the primary verbal operant classes described previously. Grammar is secondary to first
having something to say. Without a repertoire of primary verbal operants, a speaker has no use for
grammar (including word order). Therefore, word order and grammatical features are dependent
variables. They describe certain effects, not causes. Skinner described the secondary processes of
grammar and word order under the term autoclitic. The term, however, includes several verbal
phenomena not included under the linguistic analysis of grammar and word order.
Skinner wrote that “part of the behavior of an organism becomes in turn one of the variables
controlling another part” (1957, p. 313), and that “parts of language deal with other parts of language”
(1957, p. 341). In other words, some verbal operants discriminatively tact other verbal operants. Thus,
when we “tact our own verbal behaviors, including its functional relationships” (Skinner, 1957, p. 314),
we have autoclitics. As they talk, people tell their audience how, what, and why they talk; they fine-tune
the listener reactions by making specific comments about their speech. Such specifications of what, why,
and how speech is being emitted are autoclitics, which include grammatical features. In essence,
autoclitics include speech about speech, the controlling variables of speech, or specifications of certain
aspects of stimuli that prompted speech. It is in this sense that autoclitics are secondary VBs controlled by
other, primary, VBs. Primary VBs are the controlling part; autoclitics are the parts controlled. Both are
VBs, but autoclitics are not controlled by those that control primary VBs (e.g., motivation and physical
stimuli).
It has been noted that the term autoclitics includes not only grammatical words and word order,
but many other kinds of verbal responses that are traditionally not included under grammar. Responses
such as I said, I see, in other words, certainly, perhaps, are all autoclitics too. Any part of an utterance
which discriminatively tacts any aspect of the controlling variables and their relation to primary operants
is an autoclitic.
Varieties of Autoclitics
Autoclitics are of different kinds. A large group of them are descriptive. “I said . . .,” is an
autoclitic which describes the speaker’s prior verbal responses, while I was about to say describes an
imminent response. Some of the descriptive autoclitics tact the controlling variables of the verbal operants
they accompany. A statement such as the President is in Copenhagen may be made for any one of several
reasons. But if the speaker, reading a newspaper, adds I see to that statement, the listener knows why the
speaker said what he or she did. In other words, the controlling variable of the statement (the printed story
in the newspaper) is tacted for the benefit of the listener. The same statement, when preceded by Tom
said (that the President is in Copenhagen) identifies a different variable controlling the primary VB.
A number of autoclitics specify the strength of the VBs they are a part of. Autoclitics like I guess,
I imagine, indicate that what follows in each case is not very strongly determined, whereas I am certain
that, I know for sure that, indicate that subsequent responses have strong controlling variables. Although
the primary VBs (the responses that follow or accompany those autoclitics) are the same, the effects on
the listener are not. Thus, different autoclitics modify listener reactions in specific ways. Autoclitics like I
hate to say tact the emotional state of the speaker, while those like I agree, and I should say describe the
relation between the response and other VB of the speaker or listener, or the situation in which the
behavior occurs.
Some autoclitics qualify responses they accompany, and include negation and assertion. The
linguistic analysis of negation implies that the absence of previously experienced objects is the source of
negation. But as Skinner says, this is “clearly impossible in a causal description” (1957, p. 322), because a
thing that does not exist cannot cause anything. If the response no rain is controlled by the absence of
rain, “why do we not emit a tremendous flood of responses under the control of the absence of thousands
of other things?” (Skinner, 1957, p. 322). In the example no truck discussed earlier, no is an autoclitic
controlled by the verbal operant truck, which is in turn controlled not by its absence, but by the presence
of stimuli, which, on previous occasions, accompanied the object truck. The response she is not
Swedish contains the autoclitic not, which is controlled by she and Swedish, both made likely by primary
variables. In this case, not implies that Swedish is not a part of the tact for she. Put differently, not implies
that the controlling variables for the two tacts, she and Swedish, do not covary and have not been
reinforced as a verbal operant.
The most common assertive autoclitic in English is is. It tells the listener something about the
causal variables of responses and the relationship among them. The verbal response that chair is
big contains two tacts and the autoclitic is. The autoclitic specifies that the same discriminative stimulus
controls both that chair and big. In other situations, is is also controlled by the temporal characteristics of
the stimulus. The English auxiliaries is and was assert relationships between responses, and responses and
their antecedents, but different temporal aspects govern their emission. Technically, assertive autoclitics
imply that the operants involved are either tacts or intraverbals. Mands, echoics, and textuals are not
typically asserted.
Quantifying autoclitics tact the numerical properties of the controlling stimuli that prompted a
primary verbal response. The articles the and a specify the relationship between responses and their
controlling variables in two ways. They tact the numerical properties of the controlling variables, their
specificity, or both. The definite article the literally points at the controlling variable of the response it
accompanies. The articles also make it possible to obtain a more precise response from the listener. Other
autoclitics like all, some, few, mostly, one, many and so forth are similar: they specify the numerical
properties of discriminative stimuli and their relation to verbal operants.
A number of fragmentary tacts, which are autoclitics, appear in the form of grammatical tags,
inflections, or bound morphemes. Tags like -s, -ed, and -ing (e.g., spills, spilled, and spilling) tact the
temporal relationship between the controlling variables and the speaking. The possessive inflection -s, on the
other hand, tacts the controlling relationship between two tacts in the same utterance. It also informs
the listener that the discriminative stimuli for the two tacts (boy’s hat) are likely to covary. The two
conjunctions and and or imply that more than one verbal operant is being controlled by the same
discriminative stimulus. In the case of and, however, the two operants are compatible in relation to their
single stimulus (small and beautiful). In the case of or the two operants are not compatible (genius or
crock). Most, but not all, prepositions tact the spatial relationships between the controlling variables of
verbal operants. When a speaker says I heard the sound of music, the autoclitic of discriminatively tacts
the variable responsible for the sound. Sound could be of anything, and consequently, as a verbal
response, sound could be controlled by any one of several variables. The autoclitic of discriminatively
tacts the specific controlling variable of a current verbal response.
No attempt will be made in this paper to catalog all grammatical words and tags in terms of their
autoclitic function. The point is that grammar can be accounted for in terms of a causal analysis as against
the structural analysis. One final question, however, needs to be considered: Why do autoclitics occur at
all? Why do speakers tact the controlling variables of their own VB? Clearly, the speakers need not tact
the controlling variables for their own sake, because they already are in touch with them, or else they
would not speak. It is often the listener who has no access to the speakers’ controlling variables.
Therefore, listeners ask questions like when, where, how many, how sure are you, why do you say that,
and so forth. Parts of answers to such mands (questions) are autoclitics. In sum, autoclitics occur for
the benefit of the listener, who needs to know (1) why an utterance is made, (2) how strong that
utterance is (operant strength), and (3) what temporal, spatial, physical, quantitative, and other properties of
stimuli govern that utterance. The precision of listener reaction depends on this kind of information. An
imprecise listener reaction is a defective contingency for the speaker. In the final analysis, therefore, the
speaker is a beneficiary too.
In Skinner’s analysis, ordering (syntax) seen in verbal responses is also partly due to the autoclitic
process. But there are other causal variables that help order VB. Some ordered VB may not involve an
ordering process at all. For example, the child may initially learn phrases like hi there, how are you, fine,
thank you and so on as single functional entities. For adults, too, utterances such as have a nice day may
be single functional units, not involving an ordering process.
In some situations, word order may correspond to the sequence of relevant stimuli that generate
verbal responses. For instance, a TV sports commentator’s speech is temporally ordered according to the
order in which the events (stimuli) unfold in front of them. Order may also be due to the order in which
verbal stimuli generate more verbal responses, as in the responses given to free association test items and
in all intraverbals (e.g., reciting the alphabet or a number series). This suggests that intraverbal control is
one of the multiple sources of order (syntax). Order in echoic responses is due to the order in which the
antecedent stimuli are generated for the speaker. Yet another source of order is the relative strength of
responses, because the strongest is more likely to be emitted first, and the weakest last, and thus an
order would emerge (Skinner, 1957). When a speaker says “Hand me that book, the red one,” the order of
the verbal operants may be partly determined by their relative strength. Possibly, the main mand part of it
(“Hand me [that] book”) was stronger than the tact part of it; a weaker tact (“[the] red one”) was added
because the mand was not effective in generating a quick and correct response from the listener.
Teaching elements of autoclitics to children is as old as education itself; elementary
schools were once called grammar schools. Initial treatment efficacy studies on teaching grammatical
morphemes to children with language disorders were published in the late 1960s and early 1970s. Such
studies, however, were conducted by applied behavior analysts (e.g., Guess, Sailor, Rutherford, & Baer,
1968; Schumaker & Sherman, 1970). Subsequently, studies on teaching grammatic morphemes as well as
syntactic structures began to appear in SLP journals (see Bricker, 1993 and Hegde, 1998 for historical
reviews of treatment research on teaching grammatical elements to children with language disorders).
Unfortunately, whether the research was conducted by applied behavior analysts or SLPs, the authors did
not describe the treatment targets in terms of verbal operants; many investigators have used, and continue to
use, linguistic terms to describe the skills taught to children with language disorders. The methods of
teaching and the research designs in which the treatment efficacy was evaluated were all behavioral,
however. This state of affairs continues, even in some explicitly behavioral journals (e.g., Journal of
Applied Behavior Analysis). Articles published in The Analysis of Verbal Behavior are exceptional in
using both the verbal behavior concepts and behavioral treatment.
In a later section of this paper, I will describe selected clinical research on autoclitics (especially
the grammatical morphemes) that has shed light on verbal operant response classes. Teaching various
sentence forms as SLPs conceptualize it will always involve autoclitic responses. Conversely, teaching
specific grammatical morphemes often (though not always) involves sentences. For instance, to teach the
verbal auxiliary is, the clinician needs sentences (combined or ordered verbal operants of different
classes): The boy is running, The girl is smiling, and so forth (Hegde, 1980).
The response class research, briefly reviewed in a later section, suggests that it is more efficient to
target Skinner’s autoclitics and other verbal operants rather than the linguistic categories. As teaching
various grammatical elements is a significant part of teaching VBs to children with language disorders, it
is hoped that SLPs will begin to adopt Skinner’s functional units rather than the linguistic structural units
in assessment (see Esch’s article in this issue) and treatment.
Receptive Language?
Linguistically and popularly, an appropriate response to a verbal stimulus is often described as
receptive language or comprehension. In the behavioral analysis, the terms receptive language and
comprehension are even more questionable than the term language itself. The implication that a listener
passively “receives” and “understands” (or “processes the signal,” as in the popular computer analogy)
spoken language or read material altogether misses the point of verbal behavior, which constitutes actions
of speakers and listeners. Typically, appropriate behavior to verbal stimuli is the basis for assuming a
mentalistic notion of comprehension, which remains unexplained and unobserved. What is observed is
either a reinforceable (appropriate) verbal or a nonverbal response to verbal stimuli. In the case of clients
who have an obviously limited verbal repertoire, clinicians test comprehension by sampling a correct
(reinforceable) nonverbal response to a verbal stimulus. Typically, the client is manded (asked) to point to
the named stimulus embedded in a stimulus set. If there is a conventional correspondence between the
mand, the pointing, and the physical stimulus, the clinician reinforces the pointing response. Thus, in a
behavioral analysis, comprehension is an unnecessary term that skirts the valid stimulus-response-
reinforcement contingency that is adequate to handle the relevant observations.
In an extension of Skinner’s analysis, Michael (1985) suggests that VBs of pointing at, touching,
and responding in other nonverbal ways to verbal stimuli may be called stimulus-selection-based VB.
Although teaching stimulus selection (comprehension or receptive language) as a precursor to VB
teaching to most children with speech or language disorders is unnecessary, such teaching is a part of
augmentative and alternative intervention for adults and children with limited verbal repertoire. Michael
calls the more typical forms of verbal (as well as vocal) behavior topography-based because, for instance,
the words dog and cat have different response topographies, whereas stimulus-selection-based VB (e.g.,
pointing) remains constant regardless of the object pointed to.
Language Treatment Research Supports the Behavioral Analysis
Traditionally, SLPs have relied on structural analysis of language, in which the independent
variables are either ignored or assumed to take the form of innate devices, mental schemes, mental
images, cognitive concepts, maps, and other presumed entities with little or no empirical validity. As
applied scientists, however, SLPs need to intervene in a behavioral process that for some reason has been
impaired. They have to produce changes in the language behaviors of their clients. Changes in behaviors
can be produced only when their causal variables are under the clinician’s control. In this sense, SLPs are
more like empirical scientists than structural nativists.
Because the variables that figure in the structural analysis are mentalistic and nonmanipulable, SLPs
have turned to behavioral intervention for communication and swallowing disorders (Hegde, 1998).
Behavioral intervention is also a causal analysis of behaviors, however. Its success as an applied
technology depends on its emphasis on a causal analysis of normal or impaired VBs. Whenever some
new behavior is clinically modified, the applied technology throws some light on the basic analysis of that
behavior.
Treatment Targets Should be Functionally Organized
Clinical research data in treating language disorders in children, published both in the behavioral
and speech-language pathology literature, has produced some impressive results that support Skinner’s
functional units (verbal operants), as against the structural (linguistic) categories. For instance, the VB
approach to language intervention has shown that, as Skinner asserted, mands and tacts are functionally
independent verbal operants, in the sense that teaching tacts will not automatically result in the production
of mands or vice versa (Hall & Sundberg, 1987). Many SLPs know, for instance, that a child who is
taught to say “Ball” (a tact) when shown a ball will not automatically produce “I want ball” (a mand). The
child who has learned to mand a ball will not necessarily tact it; both need separate training. Interestingly,
functional (causal) independence of verbal operants has been demonstrated with chimpanzees (Savage-
Rumbaugh, 1984) and pigeons (Michael, Whitley, & Hesse, 1983; Sundberg, 1985).
In language treatment research conducted by both behavior analysts and SLPs, functional
response classes that contradict the validity of structural categories have emerged when (a) what linguists
consider separate grammatic structures did not need separate training and (b) when what is considered a
single grammatical structure broke down into different response classes that needed separate training. For
instance, treatment research has demonstrated that such single linguistic categories as (a) the regular
plural (Guess et al., 1968), (b) the irregular plural (Hegde & McConn, 1981; Hegde, Noll, & Pecora,
1979), (c) the regular past tense (Schumaker & Sherman, 1970), and (d) the irregular past tense (Hegde
& McConn, 1981; Hegde, Noll, & Pecora, 1979) are each a collection of multiple and functionally
independent response classes (see Hegde & Maul, 2006, for a summary of research). Independent
response classes need separate training as there is no generalization, or no socially accepted generalization, across functional
response classes. Treatment research also has demonstrated that such multiple linguistic categories as (a)
the subject-noun and object-noun phrases (McReynolds & Engmann, 1974) and (b) the English auxiliary
and copula (Hegde, 1980) are not functionally independent; teaching one of the two will instate the other,
regardless of which one is taught. These two studies used the ABAB experimental design, which is more
powerful than the typical teach-one-and-probe-the-other method; one of the two skills was taught, then
reversed, and finally reinstated to show that the untaught skill is produced, reversed, and reinstated
without training. Essentially, under a causal-experimental analysis, some linguistic distinctions collapse
into single functional units and other single structural units break into multiple functional units. Although
more research on operant verbal response classes is needed, clinical research designed to remediate
deficient language skills supports Skinner’s (1957) figurative comment that when “language” is dropped
to the floor, it may break into verbal operants, not linguistic structures.
Summary and Conclusions
Skinner’s is primarily a cause-effect analysis of VB. His main concern was to describe the
controlling variables of verbal operants. Accordingly, he described six kinds of stimulus control and five
kinds of verbal operants: motivation (causing mands), environmental events (causing tacts), another
speaker’s speech (causing echoics), printed or other visual stimuli (causing textuals), one’s own speech
(causing intraverbals) and the audience effect (not a class of verbal operants, but producing an effect on
other verbal operants). Skinner described a secondary process called autoclitics that included grammar
and other related phenomena. He proposed that VBs are multiply determined and most everyday
utterances are a combination of multiple verbal operants.
SLPs receive better training in the linguistic view of language than the behavioral view. The
textbooks and other sources they read routinely distort the Skinnerian analysis and repeat Chomsky’s
misunderstood and misplaced criticism of Skinner’s Verbal Behavior, even as Chomsky’s theories have
faded and Skinner’s natural science approach and behavioral treatment have been gaining worldwide
respect. The critics of the behavioral view have an ethical responsibility to first understand Skinner’s
analysis before making critical comments. SLPs have successfully used the behavioral intervention
procedures because linguistics does not describe independent variables that are manipulated in what we
call treatment (Hegde, 1998). If the SLPs also adopt a functional (cause-effect) analysis of VBs, they
would then be internally more consistent with their concepts and treatment methods. Treatment research
in child language disorders has generally supported Skinner’s view that VB is not organized structurally,
but functionally.
I conclude with a personal epilogue. Many years ago, a distinguished clinical scientist and a
friend of mine, Leija McReynolds, then a professor at the University of Kansas, was visiting me for a few days.
After she gave a lecture at California State University-Fresno, and before she left for Stanford University
where she was spending her sabbatical year in the Linguistics Department, she asked me a sharp question in
the campus parking lot: “Giri, do you think the influence of linguistics on us SLPs has been detrimental?”
I said “Yes.” She smiled gently as she got into my car, and I drove her to the airport. Professor
McReynolds (along with Engmann) showed for the first time in SLP that when subjected to an
experimental analysis using the single-subject ABAB research design, the grammatically distinct subject
noun and object noun phrases collapse into a single functional unit.
References
Adelman, B. E. (2007). An underdiscussed aspect of Chomsky (1959). The Analysis of Verbal Behavior,
23, 29-34.
Anderson, J. (1991). Skinner and Chomsky 30 years later or: The return of the repressed. The Behavior
Analyst, 14, 49-60.
Axe, J. B. (2008). Conditional discrimination in the intraverbal relation: A review and recommendation
for future research. The Analysis of Verbal Behavior, 24, 159-174.
Barbera, M. L. (2007). The verbal behavior approach: How to teach children with autism and related
disorders. Philadelphia, PA: Jessica Kingsley Publishers.
Bloom, K., Russell, A., & Wassenberg, K. (1987). Turn taking affects the quality of infant vocalizations.
Journal of Child Language, 14, 211-227.
Bricker, D. (1993). Then, now and the path between. In A. P. Kaiser & D. B. Gray (Eds.), Enhancing
children’s communication: Research foundations for interventions (pp. 11-31). Baltimore, MD:
Paul H. Brookes.
Carr, E. G., Levin, L., McConnachie, G., Carlson, J. I., Kemp, D. C., & Smith, C. E. (1994).
Communication-based intervention for problem behavior. Baltimore, MD: Brookes.
Chomsky, N. (1959). A review of Skinner’s Verbal Behavior. Language, 35, 26-58.
Cihon, T. M. (2007). A review of training intraverbal repertoire: Can precision teaching help? The
Analysis of Verbal Behavior, 23, 123-133.
Goldstein, M. H., Bornstein, M. H., Schwade, J. A., Baldwin, F., & Brandstadter, R. (2007, March).
Five-month-old infants have learned the value of babbling. A poster presented at the biennial meeting
of the Society for Research in Child Development. Available at:
http://babylab.psych.cornell.edu/wp-content/uploads/2009/04/srcd07_still_face_handout.pdf
Goldstein, M. H., King, A. P., & West, M. J. (2003). Social interaction shapes babbling: Testing parallels
between bird song and speech. Proceedings of the National Academy of Sciences, USA, 100,
8030-8035.
Goldstein, M. H., & Schwade, J. A. (2008). Social feedback to infants’ babbling facilitates rapid
phonological learning. Psychological Science, 19, 515-523.
Gros-Louis, J. G., West, M. J., Goldstein, M. H., & King, A. P. (2006). Mothers provide differential
feedback for infants’ prelinguistic sounds. International Journal of Behavioral Development, 30,
500-516.
Guess, D., Sailor, W., Rutherford, G., & Baer, D. M. (1968). An experimental analysis of linguistic
development: The productive use of the plural morpheme. Journal of Applied Behavior Analysis,
1, 225-235.
Hall, G., & Sundberg, M. L. (1987). Teaching mands by manipulating conditioned establishing
operations. The Analysis of Verbal Behavior, 5, 41-53.
Harris, R. A. (1993). The linguistics wars. New York: Oxford University Press.
Hegde, M. N., Noll, M. J., & Pecora, R. (1979). A study of some factors affecting generalization of
language training. Journal of Speech and Hearing Disorders, 44, 301-320.
Hegde, M. N. (1980). An experimental-clinical analysis of grammatical and behavioral distinctions
between verbal auxiliary and copula. Journal of Speech and Hearing Research, 23, 864-877.
Hegde, M. N., & McConn, J. (1981). Language training: Some data on response classes and
generalization to an occupational setting. Journal of Speech and Hearing Disorders, 46, 353-358.
Hegde, M. N. (1982). Antecedents of fluent and dysfluent oral reading: A descriptive analysis. Journal of
Fluency Disorders, 7, 323-341.
Hegde, M. N. (1998). Treatment procedures in communicative disorders (3rd ed.). Austin, TX: Pro-Ed.
Hegde, M. N. (2006). A coursebook on aphasia and other neurogenic language disorders. Clifton Park,
NY: Cengage Delmar.
Hegde, M. N. (2007). Treatment protocols for stuttering. San Diego, CA: Plural Publishing.
Hegde, M. N. (2008a). Hegde’s Pocketguide to treatment in speech-language pathology (3rd ed.). Clifton
Park, NY: Cengage Delmar.
Hegde, M. N. (2008b). Meaning in behavioral analysis. Journal of Speech-Language Pathology and
Applied Behavior Analysis, Consolidated issue 2.4-3.1, 1-25. Online journal at:
http://www.slp-aba.net
Hegde, M. N., & Maul, C. A. (2006). Language disorders in children: An evidence-based approach to
assessment and treatment. Boston, MA: Allyn and Bacon.
Jakobson, R. (1968). Child language, aphasia and phonological universals (A. R. Keiler, Trans.). The
Hague, The Netherlands: Mouton.
Keller, F. S., & Schoenfeld, W. N. (1950). Principles of psychology. New York: Appleton-Century-Crofts.
Leigland, S. (2007). Fifty years later: Comments on the further development of a science of verbal
behavior. The Behavior Analyst Today, 8, 336-345.
MacCorquodale, K. (1969). B. F. Skinner’s Verbal Behavior: A retrospective appreciation. Journal of the
Experimental Analysis of Behavior, 12, 831-841.
MacCorquodale, K. (1970). On Chomsky’s review of Skinner’s Verbal Behavior. Journal of the
Experimental Analysis of Behavior, 13, 83-99.
McGill, P. (1999). Establishing operations: Implications for the assessment, treatment, and prevention of
problem behavior. Journal of Applied Behavior Analysis, 32, 393-418.
McLaughlin, S. (2006). Introduction to language development (2nd ed.). Clifton Park, NY: Cengage
Delmar.
McLeish, J., & Martin, J. (1975). Verbal behavior: A review and experimental analysis. The Journal of
General Psychology, 93, 3-66.
McReynolds, L. V., & Engmann, D. L. (1974). An experimental analysis of the relationship between
subject and object noun phrases. (ASHA Monograph # 18.) In L. V. McReynolds (Ed.),
Developing systematic procedures for training children’s language (pp. 30-46). Rockville, MD:
American Speech-Language-Hearing Association.
Michael, J. (1982). Skinner’s elementary verbal relations: Some new categories. The Analysis of Verbal
Behavior, 1, 1-3.
Michael, J. (1985). Two kinds of verbal behavior plus a possible third. The Analysis of Verbal Behavior,
3, 1-4.
Michael, J. (1988). Establishing operations and the mand. The Analysis of Verbal Behavior, 6, 3-9.
Michael, J. (2000). Implications and refinements of the establishing operation concept. Journal of Applied
Behavior Analysis, 33, 401-410.
Michael, J., Whitley, P., & Hesse, B. E. (1983). The pigeons parlance project. The Analysis of Verbal
Behavior, 2, 6-9.
Mowrer, O. H. (1952). The autism theory of speech development and some clinical applications. Journal
of Speech and Hearing Disorders, 17, 263-268.
Miguel, C. (2009). Editorial: The verbal behavior approach. The Analysis of Verbal Behavior, 25, 1-3.
Osgood, C. E. (1963). On understanding and creating sentences. American Psychologist,
18, 735-751.
Owens, R. E., Jr. (2005). Language development: An introduction (6th ed.). Boston, MA: Allyn and
Bacon.
Palmer, D. (2006). On Chomsky’s appraisal of Skinner’s Verbal Behavior: A half century of
misunderstanding. The Behavior Analyst, 29, 253-267.
Pena-Brooks, A., & Hegde, M. N. (2007). Assessment and treatment of articulation and phonological
disorders in children (2nd ed.). Austin, TX: Pro-Ed.
Reichle, J., & Wacker, D. P. (1993). Communicative alternatives to challenging behavior: Integrating
functional assessment and intervention strategies. Baltimore, MD: Brookes.
Richelle, M. (1976). Formal analysis and functional analysis of verbal behavior: Notes on the debate
between Chomsky and Skinner. Behaviorism, 4, 209-221.
Sautter, R. A., & LeBlanc, L. A. (2006). Empirical applications of Skinner’s analysis of verbal behavior
with humans. The Analysis of Verbal Behavior, 22, 35-48.
Savage-Rumbaugh, S. E. (1984). Verbal behavior at a procedural level in the chimpanzee. Journal of the
Experimental Analysis of Behavior, 43, 223-250.
Schlinger, H. D. (2008a). The long-good-bye: Why B. F. Skinner’s Verbal Behavior is alive and well on
the 50th anniversary of its publication. The Psychological Record, 58, 329-337.
Schlinger, H. D. (2008b). Listening is behaving verbally. The Behavior Analyst, 31, 145-161.
Schoneberger, T. (2000). A departure from cognitivism: Implications of Chomsky’s second revolution in
linguistics. The Analysis of Verbal Behavior, 17, 57-73.
Schumaker, J., & Sherman, J. A. (1970). Training generative verb usage by imitation and reinforcement
procedures. Journal of Applied Behavior Analysis, 3, 273-287.
Skinner, B. F. (1957). Verbal behavior. New York: Appleton Century.
Smith, R. G., & Iwata, B. A. (1997). Antecedent influence on behavior disorders. Journal of Applied
Behavior Analysis, 28, 515-535.
Staats, A. W. (1968). Learning, language, and cognition. New York: Holt.
Sundberg, M. L. (1985). Teaching verbal behavior to pigeons. The Analysis of Verbal Behavior, 3, 11-17.
Taylor, B. A., Hoch, H., Potter, B., Rodriguez, A., Spinnato, D., & Kalaigian, M. (2005). Manipulating
establishing operations to promote initiations toward peers in children with autism. Research in
Developmental Disabilities, 26, 385-392.
Vargas, E. A. (1982). Intraverbal behavior: The codic, duplic, and sequelic subtypes. The Analysis of
Verbal Behavior, 1, 5-7.
Vargas, E. A. (1988). Verbally-governed and event-governed behavior. The Analysis of Verbal Behavior,
6, 11-12.
Vargas, J. S. (2009). Behavior analysis for effective teaching. New York: Routledge.
Virues-Ortega, J. (2006). The case against B. F. Skinner 45 years later: An encounter with N. Chomsky.
The Behavior Analyst, 29(2), 243-251.
Winokur, S. (1976). A primer of verbal behavior: An operant view. Englewood Cliffs, NJ: Prentice-Hall.
Author Contact Information
M.N. Hegde, Ph.D.

California State University
Postal: 1948 Ashcroft Avenue, Clovis, CA 93611
E-mail: girih@csufresno.edu
Verbal Behavior by B.F. Skinner:
Contributions to Analyzing Early Language Learning
Scott F. McLaughlin
Abstract
B.F. Skinner’s Verbal Behavior (1957) is analyzed in the context of early language learning. In the book,
Skinner did not emphasize the foundations for language learning in infants and young children. His principles and
concepts are integrated with current knowledge of caregiver-infant interactions. Several major elements of his
functional analysis are described as well as relevant verbal operants. These are correlated with terms and concepts
used in classical and contemporary research in child language.
Keywords: Verbal behavior, caregiver-infant interaction, language learning, antecedents, consequences, selective
reinforcement, functional analysis, contingency shaped behavior, verbal operants, mands, echoics, tacts, autoclitics
Introduction
Two pieces of literature appeared quietly and without fanfare in 1957. Each book inalterably
affected how we have come to view language, human behavior, and language learning. In 1957, Noam
Chomsky published Syntactic Structures (1957), his germinal work that established the foundations of
psycholinguistics. This work, in combination with a second publication, Aspects of the Theory of Syntax
(Chomsky, 1965), broadly influenced research in linguistics and the theoretical relationships between the
mind and language. Together, they represented a firm anchor point on one end–the nativistic end–of the
philosophical continuum established in 1957.
With regard to explaining language development, one of the more publicized features of this
theory was the Language Acquisition Device, or LAD. Although there was no intention to correlate this
“device” to any underlying neurological structure, Chomsky proposed the LAD as the presumed innate
mechanism in the human brain (or mind) to explain the apparent ease and rapidity with which children
acquire language. Although the concept of the LAD has been modified through the years, in the 1960s it
became the widely accepted explanation for children’s acquisition of language, largely removing
caregivers from any active role in their children’s language abilities.
The opposite anchor point–the behavioral end–on this philosophical continuum was established
that same year, when B.F. Skinner published Verbal Behavior (1957). (See Hegde, in this issue, for a
comprehensive review of Skinner’s Verbal Behavior.) In contrast to Chomsky’s approach, Skinner
proposed an analysis of verbal behavior based on a natural science account of behavior that had evolved
since the earlier publication of his The Behavior of Organisms (1938). In Verbal Behavior, Skinner
applied a functional analysis approach to analyze language behaviors in terms of their natural occurrence
in response to observable environmental circumstances and the measurable effects they have on human
interactions. In this view, language was characterized as the result of, as opposed to the reason for,
complex human behavior. The complexities of language do not exist prior to or independent of human
behavior; instead the complexities of language behavior reflect our capacity to respond verbally to the
complex and subtle intricacies inherent in human experiences and interactions.
In Verbal Behavior, Skinner (1957) did not emphasize explaining the nature of early language
development; perhaps the explanation seemed obvious to him from the fact that he had invoked an
operant model in the overall analysis. Verbal Behavior primarily focused on an explication of the causal
variables for the verbal interactions of accomplished speakers and listeners whose learning histories were
in place and preceded the verbal behaviors in question–in essence, adult speakers. It is perhaps
unfortunate that Skinner emphasized this level of analysis–adult verbal interactions–throughout Verbal
Behavior as it ostensibly posed difficulty for some who have tried to explain the development of language
under his model. Chomsky critiqued Skinner’s functional analysis in a book review that many found
puzzling due to several lengthy criticisms put forth by Chomsky that did not relate to principles or
concepts contained in Verbal Behavior. Specifically, Chomsky criticized Skinner for proposing imitation
and conscientious parental tutoring as the major explanations for language development. Chomsky also
took Skinner to task for his supposed reliance on memorized Markovian chains as an explanation for
grammar. However, none of these elements were included in Skinner’s model. Over the years, attempts to
describe how Skinner’s model might be applied to children’s language development, including recent
sources (Owens, 2005; Berko Gleason, 2005; Hulit, 2006), have continued to rely on perpetuating these
misconceptions contained in Chomsky’s (1959) “bewildering” (MacCorquodale, 1970, p. 83) critique of
Verbal Behavior. However, others (McLaughlin, 2006; Winokur, 1976) have attempted to provide
integrated explanations of early language learning that are more consistent with the original principles set
forth in Verbal Behavior (1957).
This paper will revisit some basic terms and concepts central to understanding the functional
analysis set forth by Skinner (1957). In the context of early language learning, these elements will be
explored as to how they relate to our understanding of language learning in young children, with
particular emphasis on the foundations laid down in the first months and years of life. Finally, Skinner’s
basic classifications of verbal behaviors–verbal operants–will be described and correlated with traditional
concepts and research from language development literature.
The Foundational Contingency: Infant-Caregiver Interactions

According to Skinner (1957), a functional analysis of verbal behavior must build on the
fundamental task of describing the behaviors of interest and then extend itself to explaining the
causes of those behaviors. Such an analysis attempts to identify the variables that cause certain
verbal behaviors to occur in certain circumstances. In the same way that one might say
combustion is increasingly likely as a function of the combined presence of fuel, air, and heat,
one might say that certain verbal behaviors are dependent upon, or occur as a function of, certain
causal variables. The interrelationship of multiple and complex variables that influence verbal
behavior comprises the contingencies that are established through a learning history–even though
sometimes quite momentary–in which the setting, a behavior, and the consequences that have
attended to the behavior influence the likelihood that the behavior will occur again in similar
circumstances.
In the context of language learning, analyzing the independent variables–the causes–is no simple
proposition. To say that human beings are complex in their functioning is probably less a platitude than
an understatement. Furthermore, developing humans–from infancy through childhood–are especially
complex due to the multitude of concurrent changes that occur so rapidly during a relatively short period
of time. There is the underlying physiological and neurological maturation that provides for an array of
increasingly refined sensory and motor capabilities. Supporting and nurturing all these physiological
changes requires the social interactions associated with caregivers’ routines and activities. Eventually, the
child’s social interactions expand to an entire supporting cast of individuals with whom the child will
interact in various settings and roles. In addition, the increasingly refined motor skills and the expanding
and sophisticated social behaviors set the stage for new and more complex verbal behaviors to evolve.
This is a process that is not only multifaceted; the various facets intertwine with and drive each other
forward in the overall process.
Although Skinner’s analysis is focused on the behavioral layer in all this development, it
acknowledges that there is a myriad of accompanying genetic, neurophysiological, and social influences
(Skinner, 1974). They simply do not occur at a level that can be included in this particular analysis. They
no doubt exist and their potential roles would be addressed where necessary, but they are for the purposes
of a functional analysis the inaccessible undercurrents that are left to other fields to explore and analyze.
The functional analysis for the present purpose will confine itself to defining the roles of the antecedent
setting events, consequences, and their influence on the onset and development of communication
behaviors in infants and language learning in children.
Antecedents for Infant Behaviors: The Stimulus Setting for Language Learning
The antecedents of behavior are those stimuli or stimulus events that set the stage for behavior.
Every moment that we experience contains a multitude of external stimuli that can be detected and
internal stimuli that can be sensed at some level. As a result, any single stimulus or combination of
stimuli has the potential to serve as the setting event for a response. If that response proves to be useful in
some way that has reinforcing consequences, the stimulus that was present when reinforcement occurred
becomes a discriminative stimulus for similar behaviors in the future. In our natural interactions in the
“real world,” it is likely that more often than not, the combinations of stimuli that comprise a stimulus
setting are more arbitrary, random, and coincidental than they are planned, staged, or contrived. Things
just come together at times and present us with the occasion to respond in some way.
External Stimuli. The human infant is immersed in stimuli from the beginning–even prior to birth.
Although the fetal surroundings are darkened and relatively silent, it has been determined that the cochlea
and auditory nerve have developed at around 26 weeks of gestation and the fetus is subsequently capable
of detecting sound (Locke, 1993). The auditory system does have stimuli available to it, primarily in the
form of the internal sounds of the mother’s heartbeat and blood flow. However, the mother’s voice is also
detectable–albeit muffled–at conversational levels above this ambient noise. In this way, the fetus has
several weeks of hearing at least the prosodic character of the mother’s voice, a sound that will be
immediately associated with the most significant events in the first moments and days of the infant’s life
following birth (Locke, 1993).
At birth, the events that essentially integrate the social and functional nature of communication
are present almost immediately. There is a significant convergence of human interaction in which
caregivers begin to consistently (some might say constantly) provide for their infants’ very survival.
Human infants are uniquely dependent on their caregivers for survival from the start and for several years
following. Caregivers in turn provide for these needs through various interactions and routines that take
on a strongly social character.
It is critical to note that in this convergence (e.g., feeding sessions) the newborn’s first
experiences are both life-sustaining and social by nature. As noted earlier, the infant’s auditory system is
fully functioning at birth and the infant is probably already familiar with the mother’s voice patterns.
Although his or her visual acuity is limited, stimuli within a foot are in fact visible to the infant. By the
nature of most feeding sessions, the infant nestled in a feeding position at the breast, or in some similar
placement if bottle fed, places the caregiver’s face and voice in perfect proximity to be intimately
associated with feeding (Locke, 1993). This proximity and intimacy will persist as caregivers will need to
assist with feeding for some time. As a result, this interaction, so vital for simple survival, consistently
associates the caregiver’s face, voice, and physical contact with survival needs. As the earliest and most
familiar stimulus setting for developing infants, this convergence of attention and social stimulation
coincides with their most critical period of rapid brain growth and neurological maturation. This is a
crucial principle in understanding the learning process that ensues. As Locke surmised, "One can hardly
think of a developmental circumstance that would more favorably affect acquisition of complex behavior"
(Locke, 1993, p. 264).
Although most of the infant’s behaviors–eye gaze, hand movements, facial movements, and
vocalizing–are primarily reflexive at first, they do coincide with the earliest, most important routines that
by their nature include the caregiver’s face and voice as conspicuous elements in the overall setting event.
Most caregivers, at least in Western societies, are inclined to respond to their infants with eye gaze
behaviors, hand play, and speech (often in some form of “motherese”) while the infant is being cuddled,
bathed, dressed, and fed. Again, these earliest social experiences, repeatedly associated with the care-
giving routines, become the foundational and consistent setting events for infants interacting with the
world around them. Simultaneously, caregivers are at the center of their infants’ experiences as the most
important source for survival and stimulation. As maturation progresses, infant motor behaviors–hand
movements, facial expression, and so forth–become less reflexive and more voluntary and serve to initiate
interactions in the context of routines and new forms of interaction with the caregivers.
Beyond those earliest intimate experiences, with time the child’s developing motor system
eventually permits exploring a wider territory, which includes a vast array of smells, textures,
temperatures, shapes, colors, sizes, locations, actions, objects, persons, and relationships among them.
This is hardly a complete list of the possible physical and functional features that could present
themselves to any of us given the complexity of our world. As children grow, they become more adept at
handling and exploring objects, more aware of the diverse events around them, and more responsive to
the ways these all interact and relate to the significant people in their lives. All these very complex
stimuli–both subtle and significant–become salient setting events for the development of an increasingly
complex array of verbal behaviors. In other words, it would follow that the increasing complexity of the
child’s verbal behavior correlates with his or her responses to increasingly complex stimuli and
relationships in the environment.
Internal Stimuli. Not all stimuli that contribute to setting events for verbal behavior are external.
From birth, the infant experiences various states that presumably represent internal stimuli. Although the
internal physiological stimuli cannot be directly observed, the process that prompts them to occur can be.
The time period in which the infant is deprived of some basic need that results in the internal stimulus can
be observed and measured. The length of time a child goes between feedings, the amount of time a child
is uncovered and exposed to a chilly draft, or goes without social interaction would each be indicative of
the degree of the corresponding internal states of hunger, cold, or loneliness (social deprivation) that may
serve as motivation for the child’s subsequent behavior. Deprivation states can be real, accessible, and
measurable variables when analyzing internal stimuli as antecedent events for certain behaviors in
children.
In early months, the overt behaviors may be purely physiological, reflexive, and strictly vocal in
nature–for example, the hunger cry. Interestingly, caregivers respond differentially to negative
vocalizations (whining, fussing, crying, and so forth) as opposed to positive vocalizations (cooing,
babbling, laughing, and so forth). Negative vocalizations prompt more attempts at changing the infant’s
physical state through tactile or positional changes. Caregivers tend to respond to positive vocalizations
with vocal-verbal behaviors (Keller & Scholmerich, 1987). With the infants’ development of more
precise verbal (as opposed to vocal) behaviors, a repertoire of overt responses that are more specific to
different deprivation states and their underlying internal stimuli becomes possible and it would follow
that caregivers would become more precise in responding verbally to address their children’s internal
states (pain, hunger, cold, etc.).
Consequences: The Nature of Reinforcement in Language Learning
The other major element in the foundational contingency for language learning described by
Skinner (1957) was the nature of consequences. On one level, the idea that reinforcement probably plays
some role in verbal behavior would appear to be quite straightforward. Most may have a basic
understanding of what constitutes reinforcement. However, a purely academic understanding that simply
equates reinforcement to food and praise, or a knowledge of how clinicians explicitly and systematically
reinforce target skills may hamper our ability to recognize the subtlety in caregiver-infant interactions.
Appreciating the intricate strands of a reinforcement history that link the potency of the earliest
experiences with the most basic reinforcers to the myriad of fleeting moments of social reinforcement that
follow during the day-to-day caregiver-infant interactions is essential to understanding how language
evolves from the earliest germinal moments in an infant’s life.
Primary reinforcers. Reinforcers that have survival value–food, warmth, social contact–are
considered primary reinforcers and are a normal part of care provided to most infants. Certainly the
simplest and most obvious primary reinforcer is food. As with the youngest in any species, food plays a
vital role as a primary reinforcer with survival value for human infants. However, it is perhaps even more
crucial because human infants are entirely dependent on caregivers to provide this life sustaining primary
reinforcer a number of times each day for several years. This earliest contingency between caregiver
responses and infant needs is vital to the infant’s survival and development. However fundamental it may
be, providing for the infant’s survival is merely the first level of interconnection between infants and their
caregivers.
Secondary reinforcers. Secondary reinforcers are those stimuli or events that have the potential to
reinforce behaviors due to their prior association with primary reinforcers. If a secondary reinforcer is
established by being conditioned through association with a primary reinforcer, the social interactions
noted previously that cause caregivers’ faces, speech, and physical contact to coincide with the delivery
of food, by definition, should establish potent secondary reinforcers for an infant from the first day of life.
Through this frequent association, the caregiver’s presence, attention, facial expressions, eye contact,
touch, speech, and overall interaction are established from the beginning as powerful secondary
reinforcers. Caregivers’ contingent attention and responsiveness to the needs and behaviors of their
infants are predictive of the infant’s physical, social, intellectual, and emotional development as well as
their eventual language development (Dunham, Dunham, Hurshman, & Alexander, 1989; Tamis-
Lamonda, Bornstein, & Baumwell, 2001).
Any behavior on the infant’s part that evokes one or more of those social response elements from
a caregiver–a touch, speech sounds, eye gaze, or a smile–in response will be effectively reinforced. This
observation–the potency of the caregiver’s natural behavior to serve as a secondary reinforcer as a result
of this convergence–is perhaps most crucial to understanding the dynamics of early learning so essential
to interpreting and extrapolating Skinner’s Verbal Behavior to early language learning. As most would
agree, it has been documented that infants are in turn responsive to caregivers’ presence, interaction, and
attention in many domains of learning (Kuhl, 2004).
Natural consequences. Most caregivers are naturally responsive to infants’ noncry vocalizations.
In turn, infants appear to learn that their caregivers do respond and come to have expectations that they
will respond contingently. This reciprocity and infants’ expectation connecting infants’ vocalizations and
caregivers’ contingent responses suggest the reinforcing nature of these interactions (Goldstein,
Schwade, & Bornstein, 2009). It has been demonstrated that infants’ vocalizations can be influenced by
social reinforcement in controlled experimental studies (Rheingold, Gewirtz, & Ross, 1959; Routh, 1969;
Todd & Palmer, 1968; Wahler, 1969; Weisberg, 1963). Furthermore, it has been shown that the frequency
of infant vocalizations varies depending upon the nature of engagement with their caregivers (Keller &
Scholmerich, 1987). At least one experimental study has demonstrated that infants can be conditioned to
produce different responses based on differential social reinforcement (Routh, 1969). It should be noted
that caregivers do not need to be conscious of providing reinforcement for these communicative
behaviors; it occurs merely as the result of the natural interactions with their children. Infant behaviors
that are attended to and effectively responded to by the caregiver are naturally reinforced by the
immediacy and relevance of the caregiver’s response. This is the essence of natural consequences, a
critical concept in Skinner’s analysis that has been missed or misconstrued in many attempts to review,
critique, or interpret Verbal Behavior (1957).
Selective reinforcement. Another closely related concept, selective reinforcement, has been
misinterpreted by some as well. A number of sources (Berko Gleason, 2005; Owens, 2005; Hulit, 2006)
attempting to describe Skinner’s model in the context of language learning, have portrayed selective
reinforcement as a conscious, even conscientious, process painstakingly applied by caregivers. Again, this
interpretation seems to be related to Chomsky’s review of Verbal Behavior in which he contended that
there is no evidence that caregivers use “slow and careful” reinforcement applied with “meticulous care”
to teach language to their child (Chomsky, 1959, pp. 39, 42, 43). However, this is simply another one of
Chomsky’s “straw men” in that Skinner never indicated such requirements. More than 30 years after he
published his Verbal Behavior, Skinner (1988, p. 486) wrote that:
Chomsky and others often imply that I think that verbal behavior must be taught, that explicit
contingencies must be arranged. Of course, I do not, as Verbal Behavior makes it clear. Children learn
to speak in wholly noninstructional verbal communities. But the contingencies of reinforcement are
still there, even though they may be harder to identify. (p. 486)
Perhaps some confusion stems from the distinction between reinforcement as part of an
experiment and as it occurs in the natural environment. In Routh’s (1969) study, different classes of infant
vocal responses–consonant-like sounds, vowel-like sounds, and all vocalizations–were chosen by the
study design for the purpose of determining if infants’ vocal behaviors were responsive to differential
reinforcement. The specific vocal behavior that was reinforced in each infant in fact increased selectively.
This was a useful and purposeful distinction; the isolated increase in the specified vocal behavior being
reinforced demonstrated the effects of selective reinforcement. However, in this demonstration the vocal
behaviors were chosen by the investigator as part of the experimental design. Selective reinforcement
under normal circumstances is the natural result that follows when one response survives and is
strengthened because it is more effective than another. In the natural environment, differential
reinforcement–intentional or unintentional–is contingent on the value of a response and it “selects” the
response by strengthening its future probability.
Selective reinforcement is analogous to the natural selection that perpetuates useful variations in
traits within a species. As Skinner pointed out in About Behaviorism, in the same way that “accidental
traits, arising from mutations, are selected by their contribution to survival, so accidental variations in
behavior are selected by their reinforcing consequences” (1974, p. 114). When reinforcement is more
likely to follow an effective response, it selectively strengthens that response over less effective
responses. The caregiver does not need to be conscious of “selecting” one response over another;
children’s responses are selectively reinforced by virtue of communicating more effectively and evoking
favorable outcomes. Caregivers do not wake up each morning planning which behaviors they will
reinforce and what schedule of reinforcement they will use that day; they wake up hoping to help their
offspring thrive, learn, and be happy. Both natural consequences and selective reinforcement are central
aspects of the subtle consequences that determine children’s communication behaviors, including those
that will become integrated into the lasting verbal behaviors that comprise language.
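As a rough illustration only, the selective strengthening described above can be sketched as a small expected-value simulation in Python. Nothing here comes from Skinner's text or from the studies cited: the response names, the effectiveness values, the update rule, and the function name simulate_selection are all hypothetical assumptions, chosen to show how a more effective response can come to dominate without anyone deliberately choosing it.

```python
from typing import Dict, List


def simulate_selection(
    effectiveness: Dict[str, float], trials: int = 50, step: float = 1.0
) -> List[Dict[str, float]]:
    """Track how often each response is expected to be emitted over trials.

    effectiveness maps a response to the (assumed) chance that emitting it
    is followed by a reinforcing consequence from the caregiver.
    """
    # Every response starts with the same strength (hypothetical baseline).
    strengths = {response: 1.0 for response in effectiveness}
    history: List[Dict[str, float]] = []
    for _ in range(trials):
        total = sum(strengths.values())
        # A response is emitted in proportion to its current strength.
        shares = {r: s / total for r, s in strengths.items()}
        history.append(shares)
        for r in strengths:
            # Expected strengthening = chance emitted * chance reinforced.
            strengths[r] += step * shares[r] * effectiveness[r]
    return history


# Hypothetical values: a grunt rarely gets results, "cookie" usually does.
history = simulate_selection({"grunt": 0.2, "cookie": 0.8})
print(round(history[0]["cookie"], 3), round(history[-1]["cookie"], 3))
```

With these assumed values, the share of the more effective response rises on every trial even though no step in the loop picks a response to favor; the selection falls out of the consequences alone.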
Summary. The basic elements of Skinner’s functional analysis of verbal behavior (1957) include
the stimuli–external and internal–that set the stage for verbal behavior and the consequences that affect its
reoccurrence. For the young infant, the environment is generally composed of those stimuli that either
emanate from the caregiver–speech, touch, eye contact, facial expression, and so on–or occur as a result of
actions by the caregiver–change in position, proximity to other stimuli, presentation of food, and so on.
The infant’s frequent, consistent, and contingent interactions with the caregiver while having primary
survival needs addressed automatically key in the caregiver’s attention as a powerful secondary
reinforcer that occurs naturally and is capable of selectively strengthening various infant responses.
A Functional Analysis of Verbal Behavior: Correlates in Early Language Learning
Skinner (1957) developed his model based on verbal behavior as a conspicuous and ubiquitous
example of complex social behavior. He defined verbal behavior as a social behavior that is “reinforced
through the mediation of other persons” (p. 2)–an especially fitting description of early language learning.
His interlocking verbal paradigm captured the reciprocal nature of social interaction that is intrinsic to
human verbal behavior. Applying this framework, he then illustrated the types of verbal behaviors–verbal
operants–that occur in episodes or exchanges based on their circumstances, each speaker’s behavior, and
the consequences that follow. Different contingent relationships serve to functionally define different
primary verbal operants. Each category of verbal operant represents a set of verbal behaviors that tend to
occur under similar setting events and are reinforced in characteristic ways. Beyond the primary verbal
operants, autoclitic behaviors represent the essence of “grammar” and more in Skinner’s analysis (see
Hegde in this issue). Along with some basic concepts central to the analysis, several verbal operants and
their relevance to early language learning will be described. Correlated traditional concepts and
terminology will be integrated into the discussion of each.
Concepts Central to a Functional Analysis
Unit of Analysis
Description: In analyzing verbal behavior, especially in the context of early language learning, it
is important to note that Skinner’s unit of analysis–the verbal operant–was functional, not structural. It
was more important for him to consider the settings and the consequences of verbal behavior than its
formal structural characteristics (e.g., noun, noun phrase, declarative). With infants, of course, the initial
level of behavior is more vocal than verbal. Vocal behaviors that evoke caregiver attention and interaction
evolve from cooing, babbling, and jargon that increasingly contain adult-like speech elements, including
syllable patterns and intonational contours, and evoke verbal responses from caregivers (i.e., motherese).
As a result, infants’ vocal behaviors gradually transition toward verbal behaviors (i.e., the first word).
With children, the nature of the verbal operant will evolve over time. In addition, the size of the
functional verbal operant will expand over time. As the child becomes capable of coordinating longer
motor sequences in his or her speech, his or her utterances can orchestrate more relations and address
social agendas more deftly as they expand, for example, from mere grunts and gestures to “Cookie” to
“More cookie” to “Two big cookies” all the way to “I can tell from how they smell that you make the best
chocolate chip cookies!”
Correlates in Early Language Learning: Traditional literature includes few, if any, close
correlates to Skinner’s functional unit of analysis. Dore’s “Primitive Speech Acts” (Dore, 1974) and
“Conversational Speech Acts” (Dore, 1986) perhaps come closest due to their pragmatic basis. Given the
preceding example of how a “functional unit” might evolve from primitive grunts and gestures to quite
sophisticated and diplomatic indirect requests, some might be led to equate Skinner’s unit of analysis with
the pragmatic concept of “purposes,” such as described in Speech Acts theory (Searle, 1969). It is
important to remain aware that Skinner used the term “function” to refer to the causal explanations of
verbal behaviors, as when noting that a certain response seems to be causally related to certain
circumstances and a past history of reinforcement. Invoking the mentalistic concept of “purpose” as is
done in pragmatic analyses makes that model different from Skinner’s.
Interlocking Verbal Episodes
Description: A vital aspect of Skinner’s model was casting the functional analysis in terms of
interlocking verbal episodes intended to capture the “back and forth” reciprocal nature of verbal behavior
between a speaker and listener whose roles frequently interchange throughout the interaction. In most
cases, there are at least two participants in a verbal exchange. Each alternately serves as the speaker and
the listener. As the exchange progresses, the verbal behavior of the first serves as a discriminative
stimulus (SD) and possibly a reinforcer (Sr) for the second. In turn, the second individual may respond
verbally to the verbal SD produced by the first person. The verbal behavior of the second person may
serve to reinforce the first person and serve as a verbal SD that evokes a subsequent response from the first
person and so on until the series of verbal exchanges (i.e., the conversation) terminates.
In early interactions, episodes are frequently prompted by the caregiver with eye contact, physical
contact, facial expressions, or speech (“motherese”). Recent studies have continued to demonstrate the
significant role that the caregiver’s presence and interaction can play in influencing infant attention,
vocalization, reciprocity, and even enhancing their discrimination of the speech around them (Gartstein,
Crawford, & Robertson, 2008; Goldstein, Schwade, & Bornstein, 2009; Keller & Scholmerich, 1987;
Kuhl, Tsao, & Liu, 2003). Mothers adjust their behavior to their children’s, and children as young as 9 and
13 months make active adjustments to their caregiver behaviors based on their developmental needs to
evoke more meaningful responses from the mother (Tamis-Lamonda, Bornstein, & Baumwell, 2001).
These observations suggest that, even prior to the first birthday, interactions that fit the verbal interlocking
paradigm have already become established to enhance language growth “in the context of responsive
social exchanges between caregivers and children” (Tamis-Lamonda, Bornstein, & Baumwell, 2001, p.
763).
Correlates in Early Language Learning: For a number of years, the literature has described the
occurrence of phenomena that illustrate the concept of the verbal interlocking paradigm. Beginning with
proto-conversations in caregiver-infant feeding interactions (Bateson, 1971; Locke, 1993) and the earliest
forms of conversational interaction (Dore, 1986), caregiver-child discourse has been recognized as a rich
source of teaching and learning about language. These phenomena represent important interactions that
could be captured as interlocking verbal episodes and analyzed for the social and verbal transactions that
occur through them. A cumulative history across ongoing exchanges between caregivers and children
would probably allow an analysis of the diverse set of subtle teaching tools used by caregivers that may
not even be apparent to the caregivers.
Hart and Risley (1999) collected monthly recordings for 30 months of the interactions between 42
children and their families. In their analysis they discovered surprisingly rich opportunities in families for
most children to be exposed to language models. They found that an average of 700-800 utterances per
hour was produced by people within hearing distance of a child. The average of utterances directed by the
parent to the child was 300-400 per hour. The parents’ language to others contained richer vocabulary and
longer clauses and used declarative statements 60% of the time. In contrast, the parents used more
questions and directives with their children, averaging 83 questions per hour. Hart and Risley analyzed
much of their data in terms of “episodes,” defined as occasions in which families and children directed
social behavior toward each other terminated with 5-second boundaries of no social behavior. An average
of 96 interactional episodes was recorded each hour and, interestingly, more than half of the episodes
were initiated by the children. However, social responsiveness remained constant; in episodes containing
utterances by both parents and children, the ratio of utterances was 50:50 throughout the 2-year period of
learning to talk. These findings document extensive reciprocal and symmetrical interactions that correlate
to Skinner’s interlocking verbal episodes and occur frequently throughout a child’s experience with
adults, forming an important foundation of early language learning.
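The episode definition used by Hart and Risley, as summarized above, lends itself to a simple sketch. The following Python snippet assumes a hypothetical list of timestamps (in seconds) at which a parent or child directed social behavior toward the other, and groups them into episodes separated by at least 5 seconds with no social behavior. The function name, the data, and the treatment of a gap of exactly 5 seconds as a boundary are assumptions for illustration, not details of the original study's coding procedure.

```python
from typing import List


def segment_episodes(timestamps: List[float], gap: float = 5.0) -> List[List[float]]:
    """Group social-behavior timestamps (seconds) into interactional episodes.

    A new episode begins whenever at least `gap` seconds have passed since
    the previous social behavior (assumed reading of the 5-second boundary).
    """
    episodes: List[List[float]] = []
    for t in sorted(timestamps):
        if episodes and t - episodes[-1][-1] < gap:
            # Still within the current episode.
            episodes[-1].append(t)
        else:
            # Silence of `gap` seconds or more: start a new episode.
            episodes.append([t])
    return episodes


# Hypothetical observation: six social behaviors within half a minute.
events = [0.0, 1.5, 3.0, 10.0, 12.0, 30.0]
episodes = segment_episodes(events)
print(len(episodes), "episodes:", episodes)
```

Counting the episodes found this way over an hour of recording would give a rate comparable in kind to the hourly episode averages reported above, although the actual coding rules in the study may have differed.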
The Influence of Stimulus Control
Description: When a response has been reinforced in the presence of a stimulus, the stimulus
potentially gains the ability to influence or control the likelihood of that response occurring in the future.
The very first productions of “mama” or “dada” may or may not be verbal operants (“intentional” or
“meaningful”), but they are nonetheless fortuitous. These two syllable pairs occur frequently in the early
sound repertoire of children babbling in almost every human language, possibly due to their underlying
connection to oral-motor actions involved in feeding. The younger the infant when these syllable pairs are
first heard, the less likely they are the true “first words.” Nonetheless, at any age, mothers and fathers
respond differentially to those two productions. When those syllables occur in their absence, it is not
likely that others respond quite as enthusiastically, especially in the early months. Over time, however,
after being reinforced by the relevant parent, those syllables are produced more frequently and
differentially depending on who is immediately present. Once the child produces “mama” and “dada”
reliably in their presence, mother and father have become the controlling stimuli for those responses.
Stimulus discrimination and stimulus generalization. Stimulus discrimination occurs when a
restricted set of stimuli exclusively controls or influences a response, apparently due to a relatively
narrow set of controlling or “defining” features. Conversely, stimulus generalization occurs when
different stimuli that are similar in some defining manner become controlling stimuli for a response.
Initially, for a young child “mama” carries the sense that there is only this one person in his or her
world who deserves this moniker. That response initially exhibits strong stimulus discrimination. In most
cases, there is one adult female present to respond to and reinforce the early random forms of “mama.”
However, at some point, perhaps following more exposure to additional adult females, stimulus
generalization occurs and soon every adult female is called “mama” or “mommy.” Eventually, through a
counterbalancing process of generalization and discrimination, the child learns that only certain adult
females qualify as “mommies” and, at least implicitly, the child comes to reserve the “capitalized”
version, “Mommy,” for his or her own. He or she learns that there is a broader category of “mommies”
and a specific instance of “Mommy.”
Correlates in Early Language Learning: Children’s vocabulary learning in language development
has always been of interest to researchers. The various responses children produce as they attempt to label
and categorize people, objects, and events around them give us that special sense of having a “window
into their minds.” In traditional literature it has long been acknowledged that children only gradually learn
to use words according to the adult conventions or boundaries. In the process, children have been
described as producing “overextensions”–extending the use of a word beyond its conventional meaning
(Bloom, 1973). A familiar example used to illustrate overextensions is children calling every creature
with four legs, fur, and a tail, a “doggy.” With experience and corrective feedback, children’s
overextensions are remediated by their verbal community. However, their occurrence provides a
straightforward illustration of the behavioral process of stimulus generalization, albeit over-generalization
with respect to the conventional meanings.
Theories on children’s cognitive processes as they develop their concepts and word meanings
have focused on hypothetical mental processes that may underlie these phenomena. The prevailing notion
is that the child is hypothesis testing in attempts to develop the meaning of a word (Brown, 1958). Some
theories have emphasized the child’s apparent attention to perceptual features (Clark, 1973) and others
have stressed that children base their conceptual meanings for words on the central or core function of an
object (K. Nelson, 1974). In a functional analysis, it may turn out that these aspects are operative as
different levels of controlling stimuli across individual children, but behaviorally a child will only
demonstrate conceptual behavior if his or her response generalizes within a group of stimuli that are
similar in some defining way and discriminates across groups of stimuli that differ in defining ways.
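As a rough illustration only, this behavioral criterion can be sketched in Python. The feature sets, the overlap measure (Jaccard similarity), and the 0.5 threshold are hypothetical choices made for this sketch and are not drawn from Skinner or from the studies cited. The sketch shows only that a response reinforced in the presence of one stimulus may extend to stimuli that share enough of its features and not to stimuli that share few.

```python
# Illustrative sketch only: features, similarity measure, and threshold
# are assumptions made for this example, not an empirical model.

# Features present when "doggy" was reinforced (the family pet).
REINFORCED_FEATURES = {"four legs", "fur", "tail", "barks"}

STIMULI = {
    "neighbor's dog": {"four legs", "fur", "tail", "barks"},
    "cat": {"four legs", "fur", "tail", "meows"},
    "bird": {"two legs", "feathers", "tail", "chirps"},
}


def overlap(a, b):
    """Jaccard similarity: shared features divided by all features."""
    return len(a & b) / len(a | b)


def evokes_doggy(features, threshold=0.5):
    """Return True if the stimulus shares enough features to evoke 'doggy'."""
    return overlap(REINFORCED_FEATURES, features) >= threshold


for name, features in STIMULI.items():
    print(name, round(overlap(REINFORCED_FEATURES, features), 2), evokes_doggy(features))
```

Under these assumed values, the cat shares three of five distinct features with the reinforced stimulus and evokes “doggy” (generalization within a group), whereas the bird shares only one of seven and does not (discrimination across groups).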
Contingency Shaped Behavior

Description: Contingency shaped behavior consists of behaviors that evolve through a gradual
process of differential reinforcement. Very few, if any, voluntary behaviors, especially complex
behaviors, are present at birth. As the infant gradually gains voluntary control over various movements
and responses, the consequences that follow–some positive and some negative–cause behaviors to
gradually emerge, evolve, and refine.
To the extent that caregivers are responsive to infant behaviors, it appears that the contingent
nature of their responses is important to the infant. Goldstein, Schwade, and Bornstein (2009) found that
5-month-old infants had learned that an adult had been responding contingently to their vocalizations.
After a period of responding contingently to infants’ vocalizations, when the adults assumed a still face,
infants exhibited extinction bursts in which their smiling decreased and vocalizations increased.
Additionally, 9-month-old infants, when presented with vocal production models by their mothers
contingent on their own babbling, modified their babbling and included the mother’s phonological
patterns in their babbling (Goldstein & Schwade, 2008).
Skinner (1957) emphasized the importance of response-contingent behavior between speaker and listener that
maintains responding and shapes responses during the overall process of learning language. In addition, to
the extent that it provides differential reinforcement or corrective feedback, it would also shape the
outcome of the process by strengthening response forms that evolve toward the conventions of the
language. Tamis-LeMonda, Bornstein, and Baumwell (2001) found that maternal responsiveness to infants’
play and vocalizations at 9 months and 13 months was predictive of the infants’ achievement of several
language milestones.
Correlates in Early Language Learning: This concept has special relevance to Skinner’s analysis
of verbal behavior. Following Chomsky’s model, Brown (1973), among many others, had asserted that
the child was acquiring the rules of grammar by extracting the regularities or patterns in the language he
or she was exposed to. The regularities or patterns in the child’s language presumably revealed the
internalized rules of grammar that governed the child’s language production. The distinction between
“rule-governed behavior” and “contingency-shaped behavior” became a central issue that crystallized a
key difference between Chomsky’s theory and Skinner’s analysis. The former placed the controlling
mechanism in the child’s mind (the LAD) and the system of rules being extracted from exposure to the
language. The latter placed the controlling mechanism in the history of reinforcement that incrementally
shapes behaviors toward conventionally accepted and consistent patterns of verbal behavior. Such
consistent patterns may be described as evidence of “rules.” However, the behavioral evidence for such
“rules” contradicts what most would expect. In characterizing rule-governed behavior, one would expect
a more rapid and complete change in behavior that quite promptly complies with the newly learned rule.
However, in the case of early language learning, most behaviors evolve gradually (Brown, 1973).
In About Behaviorism, Skinner wrote, “Certainly for thousands of years people spoke
grammatically without knowing that there were rules of grammar. Grammatical behavior was shaped,
then as now, by the reinforcing practices of verbal communities in which some behaviors were more
effective than others…” (1974, pp. 127-128). In fact, in a seldom quoted section in his landmark book
about children’s mastery of grammatical morphemes, Brown (1973) very clearly described this shaping
process in describing the subjects’ “unaccountable regressions and unexplained abrupt advances,” stating
that
the learning involved must be conceived as gradual change in a set of probabilities
rather than as the sudden acquisition of quite general rules. If our conception is
correct, it means that the learning of the … 14 grammatical morphemes is more like
habit formation and operant conditioning than anyone has supposed. Skinner’s
definition of operant strength in terms of response probability is surprisingly apt. (p.
388)
Still, in his overall conclusions, Brown (1973) determined that there was no evidence in his data
of “selective pressures” being exerted through adult models or through direct reinforcement by adults.
Moerk (1992) obtained Brown’s entire data set and performed a different analysis on it. Brown had used
frequency counts of parent use of the morphemes in question and found that neither parental models for,
nor parental expansions of, a child’s utterance correlated with the children’s mastery of that morpheme.
In contrast, Moerk used a microanalysis of the learning process present in caregiver-child interactions
across adjacent time periods. This approach revealed the extent to which the parent’s modeling, corrective
feedback (reinforcement), and massing (repeated, intense modeling of a structure) were at work in
shaping the child’s subsequent productions. Brown’s simplified operational definitions (1973) of models,
reinforcement, and feedback had essentially precluded recognition of their roles in his own data (Moerk,
1992).
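Brown’s description of learning as “gradual change in a set of probabilities” can be illustrated with a small numerical sketch. The update rule below is a simple linear-operator learning rule substituted here for illustration; it is not taken from Skinner, Brown, or Moerk, and the starting probability, learning rate, and number of episodes are hypothetical. The second function shows, for contrast, the abrupt change one might expect if a general rule were suddenly acquired.

```python
# Illustrative sketch: the update rule and all parameter values are assumptions.

def reinforce(p, rate=0.1):
    """Move response probability p a fraction of the way toward 1.0."""
    return p + rate * (1.0 - p)


def shaping_curve(p_start=0.05, episodes=30, rate=0.1):
    """Probability of the conventional form after each reinforced episode."""
    curve = [p_start]
    for _ in range(episodes):
        curve.append(reinforce(curve[-1], rate))
    return curve


def rule_curve(p_start=0.05, episodes=30, learned_at=10):
    """Contrast case: an abrupt switch once a 'rule' is acquired."""
    return [p_start if i < learned_at else 1.0 for i in range(episodes + 1)]


gradual = shaping_curve()
abrupt = rule_curve()
for i in (0, 10, 20, 30):
    print(i, round(gradual[i], 2), abrupt[i])
```

In this sketch the shaped response strengthens a little with each reinforced episode and approaches, but does not jump to, consistent use, whereas the contrast curve moves from rare to certain in a single step.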
Verbal Operants in Developing Language
Skinner designated the basic elements of verbal behavior as primary verbal operants. As the
building blocks in his functional analysis, each primary verbal operant is based on the contingency
between the setting event that “sets the stage” for a characteristic type of verbal behavior and
the reinforcement that has followed that type of verbal behavior. These primary verbal operants
become the elements or fragments that, once reinforced, are available to be rearranged through the
speaker’s secondary verbal behavior into more complex responses. Consistent arrangements of these
primary verbal operants evolve in ways that, by convention, reflect the different relationships among them
and other variables influencing the speaker. The following is a brief summary of each primary verbal
operant. There is certainly more to understanding each one than can be included here. (Again, the reader
is referred to Hegde, in this issue, for a more complete review and discussion of Skinner’s Verbal
Behavior.)
Mands
Description: A mand is a verbal operant in which the verbal behavior specifies its reinforcement.
It occurs in response to a deprivation state or an aversive state. In general terms, the speaker’s verbal
behavior tells the listener how to reinforce the speaker by addressing a deprivation state (hunger, thirst,
social contact, etc.) or by terminating an aversive state (cold, wet, pain, lack of information, etc.).
Correlates in Early Language Learning: For infants, the earliest cry behaviors, although not
conscious or intentional, probably represent the first demands placed on the environment around them
(Formby, 1967). Although the infant’s cry cannot “specify” the reinforcement, there are not too many
needs to choose from–food, dry diaper, sleep, or company–and caregivers typically address the specific
need either by recognizing the particular cry through past experience or through a process of elimination.
Obviously, for the infant, this responsive and contingent behavior is crucial to implanting the sense that
the world responds in some way when he or she behaves, and it helps establish the earliest mand. With motor
development, mands expand to include reaching and pointing and as new verbal behaviors are learned,
they come to include additional specifics with more socially appropriate and effective forms–“I’d like the
donut with white icing and sprinkles, please.” The literature in early development of pragmatic abilities is
replete with examples of early communication skills that essentially illustrate mands. The most
straightforward example is Primitive Speech Acts (PSAs) (Dore, 1974). Of the 10 types of PSAs included
in the classification of pragmatic behaviors, four of them are essentially mands–Calling, Protesting,
Requesting Answer, and Requesting Action.
Echoics
Description. The obvious reference for the term echoic is the word “echo.” An echoic is verbal
behavior that reproduces the acoustic properties of another’s verbal behavior. Imitation is the traditional
sense of echoics, but imitation traditionally carries with it a sense of intention. We imitate when asked or for some
purpose, whereas some aspects of others’ verbal behavior are reproduced without any
awareness on the part of the speaker. After some time in a different geographic region, we take on the
accent or the expressions of that region without necessarily being conscious of the change in our verbal
behavior.
Correlates in Early Language Learning: Infants may exhibit the earliest echoics in the nature of
their babbling. Although there is a measure of controversy surrounding this phenomenon, it cannot be
denied that infants’ babbling gradually sounds increasingly like the native language to which they are
exposed. One theory (Mowrer, 1952), which actually predated Verbal Behavior, described the process as one
in which speech sounds produced by the infant that more closely resemble the caregivers’ sounds that
have been long associated with feeding sessions were automatically self-reinforcing. This process causes
those sounds that are more natural to the verbal community to gradually become more prominent in the
infant’s babbling while those that are not, and as a result are not self-reinforced, decrease in frequency.
The hypothetical result of Mowrer’s autism theory–the gradual evolution of babbling toward the sounds
of the parents’ language–has been documented even at the level of echoing both the phonemes and
intonational contours of another speaker (deBoysson-Bardies, Sagart, & Durand, 1984).
Beyond the first year, echoics would appear to be most useful for children in learning new
vocabulary when caregivers and teachers prompt them to repeat a new word. Children appear to imitate
vocabulary terms under a variety of circumstances. Leonard, Schwartz, Folger, Newhoff, and Wilcox
(1979) found that children’s imitations of lexical items were more likely when the term and its referent
were novel and when the referent was informative in the present context. Keenan (1974) suggested that at
times children imitate their partner’s utterance when it appears that they are unable to add significant
information to the exchange–probably not unlike the ill-prepared student who simply repeats the
instructor’s unexpected question.
In addition, some have suggested that in the process of selective imitation, producing partial
echoics might provide children with the opportunity to include words and phrases from the language
models around them as their repertoire of verbal behavior expands. The correspondence between the
models being echoed and the attempted communication might be reinforced differentially. Echoing or
including a relevant word, word ending, or phrase might result in an utterance that is reinforced through a
better outcome. Folger and Chapman (1978) found that children at the one- and two-word stage were
more likely to imitate an adult’s utterance if it was an expansion of their own preceding utterance.
Interestingly, because the adult’s expansion was typically partially based on the child’s utterance, it would
constitute a partial echoic, and the children were more inclined to imitate these than an adult’s exact
repetition of their utterance.
Farrar (1992), like several others (Moerk, 1992; K. E. Nelson, 1977), found that 2-year-old
children were two to three times more likely to imitate the correct grammatical feature in response to
corrective recasts, which replace incorrect or missing grammatical features, than to any other maternal
responses. Again, in all these examples, the children’s selective imitation through reproducing a partial
echoic appeared to be an important learning tool used to shape their production of grammatically
progressive forms.
With some fear of belaboring the point, it is important to ask why these instances of echoic
behaviors can be considered as learning tools as opposed to simply moments of meaningless, non-
progressive imitation. As indicated throughout this paper, the caregiver’s
attention and continued interactions were established as powerful secondary reinforcers from the
beginning. To the extent that a child approximates the caregiver’s (and through generalization, all adults’)
recast or expansion of his or her ungrammatical response, the child’s echoic behaviors are automatically
self-reinforced (see Hegde, in this issue, for more on echoics and language learning).
Tacts
Description: Verbal behaviors that allow listeners to be in “contact” with the speaker’s
environment are called tacts in Skinner’s model. Tacts are essentially verbal behaviors that “name” or
“describe” the elements of the environment. Because tacts, to be reinforced, must exhibit conventional
correspondence to how others in the verbal community would talk about the same circumstances, they
allow sharing with listeners the objects, persons, events, and relationships that originally evoked the
speaker’s verbal behavior. We share our experiences with others through tacting.
Tacting takes its first form in the proverbial first words, when children begin to consistently
respond to their caregiver’s presence with “mama” or “dada” or name their favorite toy. Of course, these
tacts evoke much attention and celebration, but otherwise caregivers continue to attend and respond to
most tacts to encourage children to maintain and expand descriptions of their experiences well into the
future, especially those that are out of the caregiver’s view.
For the child, the causation for tacting is different than for manding. The two verbal operants
might be functionally confused, for example when the toddler’s tact is misinterpreted as a mand. Manding
is caused by a state of motivation and will be reinforced by receipt of the specified reinforcer–say, when a
child truly asks for more ice cream and then eats it. If the words “ice cream” were not due to an actual
desire for ice cream, and instead were simply tacts noting its presence to evoke a comment from the
caregiver, any ice cream that is provided may be used for finger painting or end up on the floor.
Correlates in Early Language Learning: Children’s first words are a source of joy for caregivers
and a source of fascination for researchers. Beyond the traditional first word occurring on the first
birthday, researchers have been intrigued by the grammatical composition of first words and the first 50
words. At least from the superficial perspective, there has been a predominance of nouns reported in both
cases (K. Nelson, 1973). This “noun bias”–naming objects, people, animals, and so forth–has been found
in maternal speech to one-year-olds (Goldfield, 1993). Some (e.g., Gentner, 1982) have taken this as
evidence of an innate predisposition toward learning nouns, in which children learning any human
language should be predisposed to learning nouns. Testing this premise, however, others (e.g., Tardif,
Liang, Zhang, Fletcher, & Kaciroti, 2008) have found that there are not only cross-linguistic differences
but evidence of strong influence from parental input.
Extended Tacts
Description. Skinner noted that “stimulus control is by no means precise” (1957, p. 91). When a
response is reinforced in the presence of an object or event, there is actually a variety of features that
accompany the primary stimulus. When another stimulus appears that shares at least one of those features,
the original response may occur as an extended tact–it is extended to the new stimulus based on stimulus
generalization. Skinner described a number of extended tacts that illuminate a variety of most interesting
verbal behaviors; these will not be addressed here, but are nonetheless fascinating. (See Hegde, in this
issue, for a discussion of generalized tacts.)
Correlates in Early Language Learning: Perhaps the most obvious manifestation of extended
tacts in our observations of children learning language is what has been widely called “overextensions.”
The classic examples include the young child who having become familiar with the family pet–the one
that has four legs, fur, a tail, and barks–proceeds to call everything with four legs and fur a “doggy.”
These and the conceptual behavior they illustrate were discussed previously in relation to the influence of
stimulus control.
Other examples of extended tacts include the development of figurative language. Metaphors,
similes, idioms and proverbs are examples of figurative language that are essentially based on stimulus
generalization. In most such expressions, a relationship shared by two sets of stimuli is expressed through
an analogy as in the idiomatic expression, “Don’t let the cat out of the bag,” which likens letting a feline
escape to giving out a secret. Due to their abstract nature, these expressions are less likely to be
understood by younger children. An individual expression may be learned by rote and used in very
limited context until the stimulus generalizations that underlie such figurative expressions become well
established. The concept of metaphoric transparency has been used to describe the degree to which the
literal meaning and figurative meaning are similar (Gibbs, 1987). Because, in contrast to the previous
example, the literal meaning of the expression, “Don’t beat around the bush” is less similar to its
figurative meaning, this latter idiom would be said to exhibit less metaphoric transparency. Perhaps it
would be expeditious to not “beat around the bush” and simply recognize that the two sets of stimuli
involved in using and understanding this idiom have less in common, which behaviorally makes them less
likely to evoke or support the stimulus generalization necessary to understand them.
Autoclitics: Secondary Verbal Behaviors
Description: Perhaps the most subtle and interesting element in Skinner’s (1957) functional
analysis of verbal behavior is autoclitic behavior. Skinner characterized autoclitic behavior as a secondary
level of verbal behavior in which the speaker’s behavior comments on the relationships among the
primary verbal behaviors by inserting certain words or word endings or through rearranging their order.
Skinner classified these secondary behaviors as autoclitic words (“is”, “was,” “but,” “and,” etc.),
autoclitic tags (“-ed,” “-ing,” “-s,” “-ion,” etc.), and autoclitic orders, (such as those used in English for
statements, questions, imperatives, etc.).
Autoclitic behavior is most essentially Skinner’s way of addressing the traditional concepts of
syntax and grammar. Traditionally, syntax and grammar are comprised of the mental rules that dictate the
order of words and word endings. In contrast, Skinner conceived of these words, word endings and orders
as being reinforced and shaped to conform to the verbal community. Due to the established practice in a
verbal community, a certain word order is more effective in sharpening the listener responses than an
order that is not part of that practice; the proverbial headline “Man Bites Dog” is eye-catching not
because of the three words involved, but because the order in which they have been arranged corresponds
to a relationship that occurs infrequently and evokes effective responses as a consequence.
Correlates in Early Language Learning: An important correlate in the traditional literature was
again provided by Brown (1973) in his landmark book, A First Language. Brown noted that as the
children in his study began producing multi-word utterances, they began to express certain semantic roles
and relationships by relying on consistent word orders. This appeared to be in lieu of the grammatical
inflections that had not yet emerged. For example, without the possessive inflection /’s/ available, the
children appeared to rely on a set word order to “express possession,” as the traditional formulation goes.
Any number of different word pairs placed in similar order could achieve the same result (e.g., mommy
sock, baby shoe, daddy chair). Of course, without the possessive inflection, they still had to rely on adults
to interpret the word orders based on the situational context. Brown coined the term “case frames” for
these consistent word orders (1973, p. 135). This is an interesting correlate to Skinner’s (1957) concepts
of autoclitic frames (p. 361) and syntactic frames (p. 405), which refer to similar consistent word orders
used to comment on relationships among the primary verbal behaviors.
Another important correlate in the literature that illustrates the relevance of Skinner’s model
appeared not long after Verbal Behavior (1957) was published. Berko (1958) published what became a
landmark study illustrating a phenomenon in child language called overregularization. It has been
observed that young children initially learn to correctly produce irregular past tense verbs (e.g., ran, sat,
ate) and irregular plural nouns (e.g., men, women, children). However, at the approximate time that they
begin to produce the regular forms using the grammatical inflections /-ed/ and /-s/, they appear to
“unlearn” the irregular forms and produce overregularized forms such as eated, sitted, runned.
This phenomenon has been explained through a variety of mentalistic models, including single
mechanism, connectionist, or network mechanisms (Maslen, Theakston, Lieven, & Tomasello, 2004). In
contrast, Skinner’s model explains this phenomenon in a more objective and scientifically elegant manner
relying on the behavioral principles of stimulus generalization and response induction. For example, in
the case of past tense the controlling stimulus for both regular and irregular forms is the past time frame
for an event. The irregular verb forms learned earlier by children relate to common everyday events–sit-
sat, eat-ate, run-ran, and so forth. When the child begins to learn the regular past tense inflection, it will
naturally occur in response to events that also occur in the past time frame. Because this is the controlling
stimulus common to both, stimulus generalization occurs and response induction causes the regular form
to generalize to all occurrences of past tense, causing the child to produce sitted, eated, runned, and so on.
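The account above can be made concrete with a deliberately simplified sketch. It assumes, purely for illustration, that each irregular form has a strength of its own, that the regular “-ed” response gains strength across many regular verbs, and that some share of that strength generalizes to any verb under control of the past time frame. The numerical strengths and the generalization factor are hypothetical and are not estimates from Berko (1958) or any other study cited here.

```python
# Illustrative sketch: strengths and the generalization factor are assumptions.

IRREGULAR_FORMS = {"eat": "ate", "sit": "sat", "run": "ran"}
# Stems as spelled in the overregularized child forms (sitted, runned).
CHILD_STEMS = {"eat": "eat", "sit": "sitt", "run": "runn"}


def past_form(verb, irregular_strength, ed_strength, generalization=0.8):
    """Return the form with the greater strength under past-time control.

    irregular_strength: strength of this verb's own irregular form.
    ed_strength: strength of the "-ed" response built across regular verbs.
    generalization: share of the "-ed" strength that extends to this verb.
    """
    if ed_strength * generalization > irregular_strength:
        return CHILD_STEMS[verb] + "ed"
    return IRREGULAR_FORMS[verb]


verbs = ("eat", "sit", "run")
# Early: few regular past forms have been reinforced, so "-ed" is weak.
early = [past_form(v, irregular_strength=0.6, ed_strength=0.2) for v in verbs]
# Later: "-ed" has been reinforced across many regular verbs.
later = [past_form(v, irregular_strength=0.6, ed_strength=0.9) for v in verbs]
print(early)
print(later)
```

With the assumed values, the early pass yields the irregular forms and the later pass yields the overregularized forms, even though the strength of the irregular forms themselves has not changed. Nothing has been “unlearned” in the sketch; the regular response has simply become the stronger one under the shared controlling stimulus.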
Conclusion
The ultimate test of any analysis is whether it offers a better and more parsimonious explanation
and whether it finds application. An important disposition in the conduct of science is that it is better to
go without an explanation than to settle for an inadequate one (Bachrach, 1972; Hegde, 2003). Skinner’s
analysis of verbal behavior offers a parsimonious explanation based on experimentally manipulable
empirical relations as against unobservable mentalistic or cognitive structures (see Schlinger in the
current issue). When so many had settled prematurely on an inadequate explanation (e.g., the innate
mechanisms, grammatical universals, the LAD and the LAS), we can now see the debt of gratitude owed
to so many who were committed to arriving at an empirically based understanding of language learning.
Although more research is needed, applied research based on Skinner’s Verbal Behavior has
increased to a significant extent, as evidenced in publications in such journals as The Analysis of Verbal
Behavior, The Behavior Analyst, Journal of Applied Behavior Analysis, and several other national and
international journals. Professionals in a number of fields have used the tools of change described by
Skinner–possibly in many cases without recognizing it. Many of the evidence-based treatment
procedures in speech-language pathology are behavioral procedures that are based on Skinner’s operant
conditioning, currently generally described as applied behavior analysis (Hegde, 1998). That this
evidence has such good fit is vindicating of the insights provided by Skinner’s analysis. It is now time for
speech-language pathologists to see the inconsistency of holding up the LAD or other innate mechanisms
as the basis for language learning while applying behavioral principles in actually changing language
behaviors. That most clinicians manipulate environmental variables, including their own models,
contexts, stimuli, and reinforcement, in their everyday clinical practice not only vindicates Skinner’s
analysis but also compels clinicians to understand and adopt that analysis.
References
Bachrach, A. J. (1972). Psychological research: An introduction (3rd ed.). New York, NY: Random
House.
Bateson, M. (1971). The interpersonal context of infant vocalizations. Quarterly Progress Report,
Research Laboratory of Electronics, 100, 170-176.
Berko Gleason, J. (2005). The development of language (6th ed.). Boston: Allyn & Bacon.
Berko, J. (1958). The child's learning of English morphology. Word, 14, 150-177.
Bloom, L. (1973). One word at a time: The use of single-word utterances before syntax. The Hague:
Mouton.
Bloom, L., & Lahey, M. (1978). Language development and language disorders. New York, NY: John
Wiley & Sons.
Brown, R. (1973). A first language: The early stages. Cambridge, MA: Harvard University Press.
Brown, R. (1958). Words and things. New York, NY: Free Press.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: M.I.T. Press.
Chomsky, N. (1959). Review of Skinner's Verbal behavior. Language, 26-58.
Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.
Clark, E. V. (1973). What's in a word? On the child's acquisition of semantics in his first language. In T.
E. Moore (Ed.), Cognitive development and the acquisition of language. (pp. 65-110). New York,
NY: Academic Press.
deBoysson-Bardies, B., Sagart, L., & Durand, C. (1984). Discernible differences in the babbling of
infants according to target language. Journal of Child Language, 11, 1-15.
Dore, J. (1974). A pragmatic description of early language development. Journal of Psycholinguistic
Research, 3, 343-350.
Dore, J. (1986). The development of conversational competence. In R. Schiefelbusch (Ed.), Language
competence: Assessment and intervention (pp. 3-60). San Diego, CA: College-Hill Press.
Dunham, P., Dunham, F., Hurshman, A., & Alexander, T. (1989). Social contingency effects on
subsequent perceptual-cognitive tasks in young infants. Child Development, 60, 1486-1496.
Folger, J., & Chapman, R. (1978). A pragmatic analysis of spontaneous imitations. Journal of Child
Language, 5, 23-38.
Formby, D. (1967). Maternal recognition of infant's cry. Developmental Medicine and Child Neurology,
9, 293-298.
Gartstein, M. A., Crawford, J., & Robertson, C. D. (2008). Early markers of language and attention:
Mutual contributions and the impact of parent-infant interactions. Child Psychiatry and Human
Development, 9-26.
Gentner, D. (1982). Why nouns are learned before verbs: Linguistic relativity versus natural partitioning.
In S. Kuczaj (Ed.), Language development: Language, thought, and culture. (Vol. 2, pp. 301-
333). Hillsdale, NJ: Erlbaum.
Gibbs, R. W. (1987). Linguistic factors in children's understanding of idioms. Journal of Child Language,
14, 569-586.
Goldfield, B. (1993). Noun bias in maternal speech to one-year-olds. Journal of Child Language, 20, 85-
99.
Goldstein, M. H., & Schwade, J. A. (2008). Social feedback to infants' babbling facilitates rapid
phonological learning. Psychological Science, 19, 515-523.
Goldstein, M. H., Schwade, J. A., & Bornstein, M. H. (2009). The value of vocalizing: Five-month-old
infants associate their own noncry vocalizations with responses from caregivers. Child
Development, 636-644.
Hart, B., & Risley, T. R. (1999). The social world of children learning to talk. Baltimore, MD: Paul H.
Brookes.
Hegde, M. N. (2003). Clinical research in communication disorders: Principles and strategies (3rd ed.).
Austin, TX: Pro-Ed.
Hulit, L. M. (2006). Born to talk: An introduction to speech and language development. Boston: Allyn &
Bacon.
Keller, H., & Scholmerich, A. (1987). Infant vocalizations and parental reactions during the first 4 months
of life. Developmental Psychology, 62-67.
Keenan, E. (1974). Conversational competence in children. Journal of Child Language, 1, 163-184.
Kuhl, P. K. (2004). Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience, 5, 831-843.
Kuhl, P. K., Tsao, F.-M., & Liu, H.-M. (2003). Foreign-language experience in infancy: Effects of short-
term exposure and social interaction on phonetic learning. Proceedings of the National Academy
of Sciences, 9096-9101.
Leonard, L. B., Schwartz, R. G., Folger, M. K., Newhoff, M., & Wilcox, M. J. (1979). Children's
imitations of lexical items. Child Development, 50, 19-27.
Locke, J. L. (1993). The child's path to spoken language. Cambridge MA: Harvard University Press.
MacCorquodale, K. (1970). On Chomsky's review of Skinner's Verbal Behavior. Journal of the

Experimental Analysis of Behavior , 83-99.
Maslen, R. J., Theakston, A. L., Lieven, E. V., & Tomasello, M. (2004). A dense corpus study of past
tense and plural overregularization in English. Journal of Speech, Language, and Hearing
Research , 47, 1319-1333.
McLaughlin, S. (2006). Introduction to language development (2nd ed.). Clifton Park NY: Thomson
Delmar Learning.
Moerk, E. L. (1992). First language: Taught and learned. Baltimore, MD: Paul H. Brookes.
Mowrer, O. H. (1952). Speech development in the young child: 1. The autism theory of speech
development and some clinical appications. Journal of Speech and Hearing Disorders , 263-268.
Nelson, K. (1974). Concept, word, and sentence: Interrelations in acquisition and development.
Psychological Review, 81, 11-56.
Nelson, K. E. (1977). Facilitating children's syntax acquisition. Developmental Psychology, 13, 101-107.
Nelson, K. (1973). Structure and strategy in learning to talk. Monographs of the Society for Research in
Child Development, 38, 1-136.
Owens, R. E. (2005). Language development: An introduction (6th ed.). Boston: Allyn & Bacon.
Rheingold, K. S., Gewirtz, J., & Ross, H. (1959). Social conditioning of vocalizations in the infant.
Journal of Comparative and Physiological Psychology, 52, 68-73.
Routh, D. K. (1969). Conditioning of vocal response differentiation in infants. Developmental
Psychology, 1, 219-226.
Searle, J. R. (1969). Speech acts. Cambridge, England: Cambridge University Press.
Skinner, B. F. (1974). About behaviorism. New York: Alfred A. Knopf.
Skinner, B. F. (1988). Skinner's reply to Catania. In A. C. Catania, & S. Harnad (Eds.), The selection of
behavior: The operant behaviorism of B. F. Skinner: Comments and consequences (pp. 483-488).
Cambridge, UK: Cambridge University Press.
Skinner, B. F. (1938). The behavior of organisms. New York: Appleton.
Skinner, B. F. (1957). Verbal behavior. New York: Appleton.
Tamis-LeMonda, C. S., Bornstein, M. H., & Baumwell, L. (2001). Maternal responsiveness and children's
achievement of language milestones. Child Development, 72, 748-767.
Tardif, T., Liang, W., Zhang, Z., Fletcher, P., & Kaciroti, N. (2008). Baby's first 10 words.
Developmental Psychology, 44, 929-938.
Todd, G., & Palmer, B. (1968). Social reinforcement of infant babbling. Child Development, 39,
591-596.
Wahler, R. G. (1969). Infant social development: Some experimental analyses of an infant-mother
interaction during the first year of life. Journal of Experimental Child Psychology, 7, 101-113.
Weisberg, P. (1963). Social and nonsocial conditioning of infant vocalization. Child Development, 34,
377-388.
Winokur, S. (1976). A primer of verbal behavior. Englewood Cliffs, NJ: Prentice-Hall.
Author Contact Information:
Dr. Scott McLaughlin

Speech-Language Pathology
University of Central Oklahoma
100 N. University Drive
Edmond, OK 73034
Email: smclaughlin@uco.edu
Phone: 405-974-5297
The Bases for Language Repertoires:

Functional Stimulus-Response Relations
Raymond S. Weitzman
Abstract
This paper surveys the nature and types of stimulus-response relations and how those
involved in operant conditioning are the ontogenetic bases for establishing, maintaining, and
changing language behavior. It also examines and critiques two putative empirical claims made in
poverty of the stimulus arguments: that language behavior is stimulus free and that the
linguistic input is limited and degenerate. It then discusses the relevance of operant stimulus-
response relations and principles in the treatment of speech-language disorders.
Keywords: stimulus-response relations, reflex relations, respondent relations, operant relations,
poverty of the stimulus arguments, speech-language interventions, treatment principles, treatment
procedures
____________________________________________________________________
Introduction
Speech-language pathologists (SLPs) and applied behavior analysts (ABAs) have much
in common. They generally prefer evidence-based methodologies to inference-based
methodologies. The evidence-based methodology of behavior analysis provides a causal basis for
explaining behavior in terms of its functional relationship to environmental variables. While
behavior analysts have certainly not discovered all the causes of human behavior, their
experimental findings of lawful relations between environmental stimuli and behavior, how these
relationships are established, and what kind of factors go into establishing and maintaining these
relationships have given them great confidence in their endeavors and in their interpretations of
how behavior repertoires are formed. This has resulted in the development of effective techniques
in therapeutic situations and offers the prospect of future progress in applied areas.
From the history of their field and from their experiences working with clients, SLPs and
ABAs know that in many cases individuals with verbal behavioral disorders are amenable to
improvement, if not a complete restoration to normal behavior. My purpose here is to discuss
empirical issues that are the common concerns of SLPs and ABAs. Cognitive psychologists and
most linguists have argued that language behavior cannot be learned “simply” through stimulus-
response relations (Chomsky, 1959). But what are S-R relations? How are such relations
established? Are the objections of cognitivists and linguists to an explanation of language
behavior in terms of S-R relations valid? In what follows I will try to answer these questions and
in doing so demonstrate the relevance of special kinds of S-R relations—the operant S-R
relations—to providing behavior interventions to those with speech-language problems.
Functional Stimulus-Response Relations
S-R relations can be interpreted in a variety of ways. In behavior analysis, what such a
relation means is that behavior, in the form of an identifiable response (R), is controlled or has
come to be controlled in some way by some identifiable event, condition, or situation in the
environment. That environmental event, condition, or situation is called a stimulus (S). This
relationship is a functional relationship in that the stimulus is defined in terms of the effect it has
on behavior.
The term stimulus can refer to (1) a specific instance of physical events, (2) combinations
or complexes of events, (3) the absence of previously occurring events, (4) a relation among
events, (5) specific physical properties of events, (6) classes of events defined by physical
properties, and (7) classes of events defined functionally (Catania, 1998). After reading the
previous list of what the term stimulus can refer to, the reader may be thinking that almost
anything can be a stimulus. The most important thing to remember about an S-R relation is not
what kinds of things can be stimuli or even responses, but the role stimuli play in affecting
behavior. S-R relations must be determined empirically, i.e., based on observational and
experimental findings about how environmental variables (the stimuli) and behavior (the
responses) interact with each other.
In the last sentence of the previous paragraph I deliberately used and emphasized the
word “interact” with reference to stimuli and responses because often it is assumed that the
temporal locus of a stimulus is antecedent to the response it affects. However, the temporal locus
of a stimulus that affects behavior can also be a consequence of behavior itself, for behavior acts
on the environment and in doing so often changes it (Cooper et al., 2007, pp. 28-29). Altering the
environment may then very well have an effect on future behavior. Just how future behavior
might be affected as a result of a consequent stimulus change will be discussed in the following
section on operant relations.
Another thing to keep in mind about S-R relations is that in all likelihood these
relationships do not involve single, unique events. This can be seen by the list of things that the
term stimulus can refer to, given earlier. Although in a single observation or a single experiment,
a specific S and a specific R are being considered, the relation between them may involve a
number of different values of S and R. For example, the verbal stimuli “Sit down,” “Have a
seat,” “Why don’t you make yourself comfortable,” “Pull up a chair,” “Please be seated,” “Take
a load off,” and so forth may all result in the “same” response of sitting down. Note also that the
act of sitting down (R) is not without its variability. People sit down in many different ways,
depending on what they sit down on, the kind of clothes they are wearing, their physical state,
their emotional state, and other such factors. But basically the relationship between the verbal
stimuli of some speaker(s) and the physical response of some listener(s) is the same. Thus, for
any given S-R relation, we are usually dealing with a class of stimuli and a class of responses.
Each class may be given a particular verbal label. In this example, ‘sitting down’ is the label
given to the response class and “mand to sit” might be given as the label for the stimulus class.
It seems natural to suppose that if a stimulus class influences a response class in the same
way, we would expect the members of the stimulus class to have some physical properties in
common. This frequently happens, but it is not always the case. The members of the stimulus
class given in the previous paragraph labeled “mand to sit” really have nothing physical in
common except they are in the same receptive mode (auditory if spoken, and visual if written).
Furthermore, they do not have very much in common structurally. Some belong to the same
sentence class (Imperative) and a few share a lexical item (the indefinite article “a”), but not
much else.
Just what the specific functional relationship between S and R is depends on empirical
circumstances. From a scientific point of view what is important is that these relationships are
lawful, meaning that they are regular and predictable and established in the same empirical
manner. Furthermore, if S and R are variables that can be manipulated by the experimenter, the
possibility for altering the relationship arises. This alterability factor is of particular relevance for
the speech-language and behavior analysis practitioner whose goal is to establish, change, or
correct language behavior.
Types of S-R Relations
There are many ways of classifying S-R relations. In behavior analysis, however, S-R
relations are classified according to whether the relations are reflexive, respondent, or operant.
One of the key criteria for this classification has to do with whether the relationship is
phylogenetic or ontogenetic. Phylogenetic S-R relations are the products of the genetic selection
that arises from the evolutionary history of a species. Phylogenetic S-R relations are often referred
to as being automatic, mechanical, and unlearned, because of their high degree of predictability.
Ontogenetic S-R relations are those that are established by an organism’s interactions with its
environment. They are said to be contingent or conditional, because they are more probabilistic
and subject to the vagaries of changes in the environment. In other words, ontogenetic S-R
relations are learned.
Reflex Relations
Everyone is familiar with reflex relations or what are commonly called reflexes. Among
these are the gag reflex, the knee-jerk reflex, the startle reflex, and the pupillary light reflex.
Reflexes involve involuntary responses to various environmental stimuli. In a reflex relation the
stimulus is referred to as being unconditioned and is said to elicit the response. Human infants
are known to have about a half-dozen so-called primitive reflexes that gradually disappear by the
time they are one year of age. Reflexes are phylogenetically based S-R relations and therefore
unlearned. They are a product of the evolutionary history of a species and universally part of a
species’ behavioral repertoire. Using the jargon of computerese, reflexes are “hardwired.” Except
for the developmental reflexes in human infants mentioned above, reflexes do not change much
over the life span of an individual. Furthermore, they are usually unmodifiable. However, reflex
relations may be temporarily weakened by repeated presentation of the unconditioned stimulus.
Such a process is known as habituation. The reflex relation usually returns quickly after a fairly
brief interval of time.
Reflex relations can occur in chains. For example, the nursing process of newborn babies
involves a chain of reflexes. First, touching an infant’s cheek elicits head turning in the direction
of the tactile sensation (rooting reflex). If in the process of turning the infant’s mouth touches the
surface of an object, this elicits sucking. If the object that the infant comes in contact with is the
mother’s nipple or the nipple of a bottle, milk will then enter the infant’s mouth. The sensation of
milk in the mouth triggers swallowing. These chains are called reactive or reflex chains (Pierce
& Cheney, 2008).
Simple reflexes and reflex chains make up only a tiny part of the behavioral repertoire of
a human being. It was once believed that all behavior could be explained in terms of reflexes,
including consciousness (Sechenov, 1965). Reflexes were considered the elemental building
blocks of all behavior. This mechanistic view, however, was overthrown by the later discoveries
of respondent and, even more importantly, operant S-R relations.
Reflexive vocalizations occur when an infant is sucking, swallowing, and burping.
Fussing sounds and crying are also elicited by a wide variety of stimuli. Among these are
deprivation of food or water, diaper rash, and other discomforting stimuli, including loud noises,
noxious tastes or smells, etc. However, there is no empirical evidence that indicates that reflex
relations play any role at all in the establishment or maintenance of language repertoires.
It is important to note that although an involuntary response, whether vocal or non-vocal,
is elicited by an unconditioned stimulus in a reflex relation, the same response or response class
may also be evoked under other environmental conditions. Behaviors can and often do have
multiple causes (Skinner, 1953, 1957). For example, an infant’s cry, as pointed out above, may be
elicited under various kinds of physical deprivation or aversive stimuli, but may also come under
the control of non-eliciting ontogenetic social and situational stimulus variables (Schlinger, 1995).
Likewise, stimuli that may elicit a reflex response may also come to control behavior in a non-
eliciting way under other environmental conditions. For example, a flashing red light might elicit
the orienting reflex response of head turning, but under other conditions may cause a driver to
stop momentarily before driving on. How these non-eliciting stimulus-response functions are
established will be discussed in the next two subsections.
Respondent Relations
“If reflexes were the only legacy of natural selection, an organism would be ill-equipped to
survive in a changing environment.” (Donahoe & Palmer, 1994)
As discussed in the previous subsection, reflex relations have been phylogenetically
selected for and require the presence of a particular member of a stimulus class to elicit a
particular member of a response class. But always contiguous with a particular eliciting stimulus-
event are other kinds of environmental stimuli with other kinds of properties that are not
necessarily in the same sensory mode. This opens up the possibility for these other stimulus-
events to establish a new eliciting function with the reflex response. Ivan Pavlov (1849-1936) is
given credit for discovering how such stimulus-response relations get established through a
process known today as respondent conditioning (also called classical or Pavlovian
conditioning).
Respondent conditioning basically involves the contiguous pairing of the eliciting
unconditioned stimulus with some other, initially neutral stimulus, which is called the conditioned stimulus.
After several such pairings, the presence of the conditioned stimulus alone comes to elicit the
reflex response, now called the conditioned response. Thus, a new S-R relation has been
established through the original pairing of an unconditioned stimulus with a conditioned stimulus.
The conditioned stimulus may be a property or a by-product of the unconditioned
stimulus. For example, if a dog has been deprived of food for a sufficiently long period, the sight
of food or the smell of food brings about salivation. However, the conditioned stimulus may be
totally independent of the unconditioned stimulus, i.e., quite arbitrary. In some of Pavlov’s
experiments with dogs (Pavlov, 1927), the conditioned stimulus was the sound of a bell, or a
metronome, or the sound of the laboratory door being opened by the experimenter, or even
Pavlov himself as he entered the laboratory to perform an experiment on one of his dogs.
Pairing a conditioned stimulus with an unconditioned stimulus is referred to as first-
order respondent conditioning. It has been shown experimentally that second-order, third-order,
or in general, higher-order respondent conditioning is possible. Once the first-order
conditioned stimulus-response relation has been established it is then possible to pair the first-
order conditioned stimulus with another conditioned stimulus.
The establishment of new S-R relations through respondent conditioning might be said to
add to an individual’s behavioral repertoire in the sense that although not novel, the response is
now being elicited by a different stimulus. Except for those respondent relations where the
response is a strongly emotional one, a respondent S-R relation is not what could be called a long-
lasting relation. Once the respondent relation is established, repeated occurrences of the
conditioned stimulus without the unconditioned stimulus weaken the S-R relation until the
conditioned stimulus elicits no response. This process is known as respondent extinction. The
only way to re-establish the relationship is to go through respondent conditioning all over again.
Of course, the original reflex relation is left undisturbed.
While respondent conditioning seems to play a role in developing taste aversions, sexual
arousal, and phobias, little is known about what role it plays in language development and
everyday language behavior. It may play some role in words acquiring emotional connotations,
but little else. That does not mean that there has not been much speculation in the past about its
function with respect to language behavior. Nevertheless, textbooks on applied behavior analysis
and speech-language disorders usually have little to say about respondent S-R relations or
establishing them for therapeutic purposes. The application of respondent conditioning in speech-
language interventions seems to be extremely limited, since only the stimulus is changed in the
stimulus-response relation, not the behavior. Usually speech-language interventions involve
changing the behavior of the client, such as correcting for misarticulations. To achieve changes in
behavior requires the establishment of operant relations.
Operant Relations
“Men act upon the world, and change it, and are changed in turn by the consequences of their
action.” (Skinner, 1957)
Skinner’s quotation captures the essence of how most human behavior occurs, is
maintained, shaped, changed, elaborated on (i.e., made more complex), refined, and enlarged or
contracted in its repertoire. Behavioral repertoires are largely created and maintained through an
individual’s interactions with the physical and social environment. Traditionally, we refer to this
kind of behavior as voluntary behavior, in contrast to reflexive behavior. This kind of behavior is
called operant behavior in behavior analysis. Operant behavior is a function of its consequences
and is much more complex than reflexive behavior in that the operant S-R relations involve both
antecedent stimuli and consequent stimuli, as will be shown below.
Consequent Stimuli. The following is an illustration of how a consequent stimulus can
affect behavior. An infant is placed on her back in a crib. Typically, when placed in such a
position the infant will randomly move her arms and feet and look around in what could be called
uncommitted or emitted behavior (Cooper, Heron, & Heward, 2007). Above the infant and
within her visual field is a motionless but colorful mobile. After a while she may habituate to her
surroundings and eventually stop her activity and even fall asleep. Let us now tie the end of a
ribbon to her right leg and connect the other end to the mobile, so that every time she moves her
right leg the objects hanging from the mobile will be set in motion. The first time she moves her
right leg the motion of the mobile will attract her attention. Pretty soon she is regularly moving
her right leg to initiate the movement. The more vigorously she moves her leg the more
movement in the mobile. Notice that the behavior is first initiated randomly. We have no
evidence that there are any eliciting stimuli. Yet now the behavior is repeated because of its
consequence(s). Traditionally we say that the infant is engaging in purposeful or intentional
behavior. What was once a random movement of the right leg has become a regular, “purposeful”
activity.
Reinforcement. The consequence of the behavior, the moving mobile, has come to
function as a positive reinforcing stimulus, also called a positive reinforcer, for the generation
of an operant behavioral repertoire (right leg movement). Technically speaking, consequent
stimuli that are positively reinforcing are those that increase the probability that the response will
occur again under similar circumstances. Consequent stimuli that decrease the probability that a
response will reoccur are called punishing stimuli or punishers. For example, if a mother gently
slaps (aversive stimulus) her young son’s hand as he reaches out to grab something his mother
doesn’t want him to touch, the child will withdraw his hand. He might try several times more to
reach for the object, but each time he gets slapped, perhaps a little harder. After a while, he no
longer reaches out for the object. Each slap (the aversive punishing stimulus) is reducing the
likelihood of his reaching out for the object. On the other hand, if a response leads to the removal
of an aversive stimulus, the probability of the response is likely to increase when similar aversive
contingencies occur. In such cases, the aversive stimulus is often referred to as a negatively
reinforcing stimulus or negative reinforcer.
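The three definitions above can be summarized as a small lookup, sketched here in Python. The function name `classify_consequence` and its string arguments are illustrative assumptions rather than standard terminology. The fourth possible combination (removing a stimulus and thereby decreasing responding) is not defined in this section and is flagged as such in the code.

```python
def classify_consequence(stimulus_change, response_probability):
    """Classify a consequent stimulus by its effect on future responding.

    stimulus_change: "presented" or "removed" (what follows the response)
    response_probability: "increases" or "decreases" (under similar circumstances)
    """
    table = {
        # The moving mobile follows leg movement; leg movement becomes more likely.
        ("presented", "increases"): "positive reinforcer",
        # An aversive stimulus is removed; the response that removed it becomes more likely.
        ("removed", "increases"): "negative reinforcer",
        # A slap follows reaching; reaching becomes less likely.
        ("presented", "decreases"): "punisher",
    }
    # Any other combination is outside what this section defines.
    return table.get((stimulus_change, response_probability),
                     "not covered in this section")


print(classify_consequence("presented", "increases"))
print(classify_consequence("removed", "increases"))
print(classify_consequence("presented", "decreases"))
```

The classification depends only on the observed effect on behavior, not on any physical property of the stimulus, which is the functional point made in the text.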
Reinforcing stimuli can be classified on the basis of their origin (empirical source) or
their formal characteristics (Cooper et al., 2007). In classifying reinforcing stimuli on the basis of
origin, two types are identified: unconditioned (positive or negative) reinforcers and
conditioned (positive or negative) reinforcers. Unconditioned reinforcers, also known as
primary reinforcers, are those that are not learned. In other words, such stimuli have been
established as reinforcers through the process of natural selection over the evolutionary history of
a species. Food, water, and sexual stimulation are frequently cited as examples of unconditioned
reinforcers. Just what other stimuli are unconditioned reinforcers is difficult to specify for the
human species, because few empirical studies have attempted to sort them out. It has been
suggested that touch and some facial gestures, like the smile, may also be natural reinforcers, but
the evidence is not strong enough to know for sure. Conditioned reinforcers, also known as
secondary reinforcers, are stimuli that were originally neutral but became reinforcing by being
paired either with unconditioned reinforcers or previously conditioned reinforcers. Perhaps one of
the most well known examples of a conditioned reinforcer is money. Conditioned reinforcers like
money are referred to as generalized conditioned reinforcers because they have been paired
with so many different unconditioned and other conditioned reinforcers.
In terms of formal characteristics reinforcers can be divided into (1) tangible reinforcers,
that is, those things that can be touched, seen, smelled, and manipulated; (2) edible reinforcers,
that is, anything that can be eaten; (3) activity reinforcers, such as playing games by oneself or
with others, reading, attending a concert, playing with friends, engaging in artistic endeavors, and
so forth; and (4) social reinforcers, such as physical contacts like hugs and pats, proximity to others,
attending, and verbal stimuli like praise. Obviously social reinforcers are provided by other
people and are probably among the most useful of the conditioned reinforcers for certain kinds of
clients.
Antecedent Stimuli. Now let us complicate the previous situation of the infant and the
mobile to show how contingent (non-eliciting) antecedent stimuli can influence behavioral
responses. Let us say that during the time the infant was in her crib and engaged in these activities,
a red light was on. After the reinforcing relationship has been established between the motion of
her right leg and its consequence, we turn off the light and arrange things so that any further
movement of her right leg does not result in setting the mobile in motion. After a while, the
movement of her right leg will become less frequent and predictable, i.e., the reinforcing relation
has been extinguished. But if we turn the red light back on and once more allow movement of her
right leg to cause the mobile to move, we will soon observe the infant’s right leg moving as
frequently and as predictably as before. In other words, we have established the reinforcing
relation between the behavior (right leg movement) and its consequences (setting the mobile in
motion) but only if the red light is on. Now when we turn the light off, we will find that the infant
does not move her right leg, or more likely, moves her leg but with much less frequency and
certainly with less regularity than when the light is on.
This example illustrates how antecedent stimuli can come to control operant behavior as
a result of stimulus discrimination training, that is, reinforcing behavior in the presence of one
antecedent stimulus but not in the presence of another. Through discrimination training a new
stimulus-response relationship has been created between the presence of the antecedent red light
and the behavior. The antecedent red light is said to occasion or evoke the behavior, that is, make
it more likely to occur. The red light is said to be functioning as a discriminative stimulus. The
behavior under the control of the discriminative stimulus is referred to as the discriminated
operant.
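The discrimination training just described can be sketched as a toy simulation. The update rule below (response strength moves a fixed fraction toward 1 when reinforced and toward 0 when not) is an illustrative assumption, as are the starting value of 0.1, the rate of 0.2, and the 30 exposures per condition; none of these come from the infant example itself.

```python
def update_strength(strength, reinforced, rate=0.2):
    """Move response strength toward 1.0 when reinforced, toward 0.0 when not."""
    target = 1.0 if reinforced else 0.0
    return strength + rate * (target - strength)


# Initial, equal tendency to move the right leg under either condition.
strength = {"red light on": 0.1, "red light off": 0.1}

for _ in range(30):
    # Light on: leg movement sets the mobile in motion (reinforced).
    strength["red light on"] = update_strength(strength["red light on"], True)
    # Light off: leg movement has no effect on the mobile (not reinforced).
    strength["red light off"] = update_strength(strength["red light off"], False)

for condition, value in strength.items():
    print(f"{condition}: {value:.3f}")
```

Under these assumptions the two strengths diverge: responding becomes highly probable when the light is on and rare when it is off, which is the pattern the text attributes to a discriminated operant.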
After an antecedent stimulus has come to evoke a particular response under contingencies
of reinforcement, other stimuli are also likely to evoke the same response. This empirical
principle is called stimulus generalization. These other stimuli usually share some common
properties with the controlling stimulus, although the evocative strength of these new
(untrained) stimuli may not be as great as that of the original controlling stimulus. In the just given
example of stimulus discrimination, the infant learned that right leg movement was reinforced
when the red light was on but not when the light was off. However, lights of other colors might
now also evoke leg movement with the strength (frequency) of the leg response varying
depending upon how close the color is to red. Stimulus generalization can also be seen in
language learning. Once a child has acquired a new word, it will be evoked over a wider range of
stimuli than it would be evoked in adult speech. For example, one child learned the word “fly” in
the context of the common household insect, but then on other occasions used the same word in
the context of any small insect, crumbs, and even her own toes. Linguists often refer to such
language behavior as overextensions, but such behavior conforms very well to the concept of
stimulus generalization.
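The idea that response strength falls off as a test stimulus becomes less similar to the trained one can be sketched as a generalization gradient. The bell-shaped (Gaussian) form, the width of 60 nm, and the approximate wavelengths assigned to each color are assumptions made for illustration only; the text claims only that strength varies with closeness to red.

```python
import math


def generalization(test_nm, trained_nm=650.0, width_nm=60.0):
    """Relative response strength for a light of wavelength test_nm,
    after training with a light of wavelength trained_nm."""
    distance = test_nm - trained_nm
    return math.exp(-(distance ** 2) / (2 * width_nm ** 2))


# Approximate wavelengths in nanometers (illustrative values).
lights = {"red": 650, "orange": 600, "yellow": 580, "green": 530, "blue": 470}
gradient = {name: generalization(nm) for name, nm in lights.items()}

for name, value in gradient.items():
    print(f"{name}: {value:.2f}")
```

The trained stimulus evokes the strongest response, and each color further from red evokes a weaker one.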
Schedules of Reinforcement. A reinforcing stimulus following some behavior increases
the likelihood of that behavior occurring in the future. But for how long will the behavior reoccur,
given the appropriate conditions? The answer to this question will depend on the past
contingencies of reinforcement, that is, the frequency with which the behavior is reinforced. The
rate at which reinforcement occurs is called a schedule of reinforcement (Ferster & Skinner,
1957). A continuous reinforcement (CRF) schedule is one in which an operant response is
reinforced every time it occurs. There are also several varieties of intermittent reinforcement
schedules, each with its own special characteristics. Interval schedules are arranged according
to the interval of time that has passed after an appropriate response occurs, regardless of the
number of other appropriate, or even non-appropriate, responses that have occurred in that
interval. A fixed interval schedule (FI) is one in which the occurrence of the reinforcing stimulus
depends on a constant time interval following an appropriate response, such as five minutes (FI5)
or ten minutes (FI10). On the other hand, a variable interval schedule (VI) is one in which the
interval of time is not constant but can vary around an average. For example, a VI10 means that
on average reinforcement is occurring every 10 minutes after an appropriate response, but an
actual interval might be shorter or longer than 10 minutes. Ratio schedules are arranged
according to the number of appropriate responses that have occurred, regardless of time. Just like
interval schedules, ratio schedules can be fixed (FR) or variable (VR). For example, an FR1
means that every appropriate response is being reinforced, i.e., continuous reinforcement, while
an FR10 means that every 10th appropriate response is being reinforced. A VR15 means that on
average every 15th response is being reinforced.
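The four basic schedules can be sketched as two small Python classes. This is a minimal illustration under stated assumptions: the class names are hypothetical, interval schedules are modeled on the common laboratory reading in which the first response after the interval has elapsed since the last reinforcer is reinforced, and the variable requirements are drawn from a uniform distribution whose mean equals the nominal schedule value.

```python
import random


class RatioSchedule:
    """FR or VR: reinforcement depends on the number of responses, not on time."""

    def __init__(self, mean, variable=False, seed=0):
        self.mean = mean
        self.variable = variable
        self.rng = random.Random(seed)
        self.count = 0
        self.required = self._next_requirement()

    def _next_requirement(self):
        if self.variable:
            # Uniform over 1 .. 2*mean - 1, which averages to `mean` responses.
            return self.rng.randint(1, 2 * self.mean - 1)
        return self.mean

    def respond(self, time):
        """Register one response; `time` is ignored (kept for a common interface)."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._next_requirement()
            return True
        return False


class IntervalSchedule:
    """FI or VI: the first response after the interval has elapsed is reinforced."""

    def __init__(self, mean, variable=False, seed=0):
        self.mean = mean
        self.variable = variable
        self.rng = random.Random(seed)
        self.last_reinforced = 0.0
        self.required = self._next_requirement()

    def _next_requirement(self):
        if self.variable:
            # Uniform over 0 .. 2*mean minutes, which averages to `mean`.
            return self.rng.uniform(0, 2 * self.mean)
        return self.mean

    def respond(self, time):
        """Register a response at `time` (minutes); return True if reinforced."""
        if time - self.last_reinforced >= self.required:
            self.last_reinforced = time
            self.required = self._next_requirement()
            return True
        return False


# One response per minute for 30 minutes under FR10 and FI5.
fr10 = RatioSchedule(10)
fi5 = IntervalSchedule(5)
fr10_hits = [t for t in range(1, 31) if fr10.respond(t)]
fi5_hits = [t for t in range(1, 31) if fi5.respond(t)]
print(fr10_hits)  # [10, 20, 30]
print(fi5_hits)   # [5, 10, 15, 20, 25, 30]
```

A VR15 or VI10 schedule is obtained by passing `variable=True`, for example `RatioSchedule(15, variable=True)`. In that case the individual requirements differ from one reinforcer to the next, and only their long-run average matches the nominal value.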
Behavior that is continuously reinforced will have a very high probability of reoccurring
and will be strengthened faster than on an intermittent schedule. If reinforcement is withheld,
extinction of the behavior will occur quite rapidly. Responses on a continuous schedule of
reinforcement also tend to have very little variation in their topography and thus tend to be rather
stereotypical. Furthermore, while continuous reinforcement schedules are fairly often
employed in laboratory and clinical settings, one rarely finds
them occurring in natural settings.
Unlike a continuous reinforcement schedule where high rates of responding can be
quickly generated, the rate of responding takes more time to build up with intermittent
reinforcement. Nonetheless, responses under intermittent reinforcement also eventually reach
high probabilities of occurring under appropriate circumstances. An advantage of an intermittent
schedule of reinforcement over CRF is that the behavior will persist longer and be more resistant
to extinction. Comparing the different intermittent schedules, experimental evidence suggests that
behaviors under variable schedules (VR and VI) of reinforcement are likely to be more resistant
to extinction than those under the fixed schedules (FR and FI), as long as the average ratio or
average interval is not too great. Various kinds of intermittent schedules of reinforcement are
more likely to be found in natural settings than CRFs. This explains in part why much of our
operant behavior persists over time, including language behavior. It should also be kept in mind
that within our behavioral repertoires different operants are subject to different schedules of
reinforcement and that schedules may change over time.
The Three-Term Contingency. Operant S-R relations, thus, usually involve what is called
a three-term contingency, sometimes symbolized as S^D→R→S^R, where S^D is the discriminative
stimulus, R, the response, and S^R, the reinforcing stimulus. According to this “formula,” the
occurrence of some behavior is highly probable (other conditions being present) when a certain
antecedent (discriminative) stimulus is present as the result of a past history of the behavior being
contingently followed by a reinforcing stimulus under some schedule of reinforcement. The
three-term contingency formula for operant conditioning looks deceptively simple, which is
probably one of the reasons why many linguists and cognitive psychologists, as well as others,
have difficulty conceiving that operant conditioning could be the basis for language behavior.
What I have presented here is only the bare basics of the principles of operant conditioning. For
details, see Catania (1998), Cooper et al. (2007), Ferster & Skinner (1957), Pierce & Cheney
(2008), Skinner (1953), or Sulzer-Azaroff & Mayer (1991).
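How a three-term contingency might bring a response under stimulus control can be sketched with a toy simulation. The Python code below is a hedged illustration, not a model proposed in the sources cited above: the linear update rule, the learning rate, the starting probabilities, and the function name are all assumptions introduced for the example. A response emitted in the presence of the discriminative stimulus is reinforced, and the same response emitted in its absence is not.

```python
import random


def train_discrimination(trials, rate, rng):
    """Return response probabilities in the presence and absence of the SD.

    Assumed learning rule (illustrative only): an emitted response that is
    reinforced moves its probability toward 1; an emitted response that is
    not reinforced moves its probability toward 0.
    """
    strength = {"SD present": 0.5, "SD absent": 0.5}
    for _ in range(trials):
        context = rng.choice(["SD present", "SD absent"])
        p = strength[context]
        emitted = rng.random() < p
        if not emitted:
            continue  # no response occurred, so no consequence can follow it
        if context == "SD present":
            strength[context] = p + rate * (1.0 - p)  # reinforced
        else:
            strength[context] = p - rate * p  # not reinforced (extinction)
    return strength


result = train_discrimination(trials=1000, rate=0.2, rng=random.Random(0))
print(result)
```

Under these assumptions, the response becomes highly probable when the SD is present and improbable when it is absent, which is the sense in which the antecedent stimulus comes to control the behavior.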
Two Poverty of the Stimulus Arguments
“It is perhaps worth emphasizing that the orderly control of behavior in a stable environment by
contingencies of reinforcement is not a theory but an empirical fact.” (Palmer, 1998)
After 23 years of working on it, B. F. Skinner (1957) wrote Verbal Behavior, a 478-page
exposition of how language behavior could be explained in terms of operant principles. Actually,
Skinner avoids using the term ‘language behavior’ in his book (1957, p. 2) because for him
‘language’ refers to the practices of a linguistic community and his focus is on the verbal
behavior of the individual. Also, his use of the term verbal behavior is broader in conception than
just spoken language.
Unlike linguists, Skinner did not try to define or describe language or linguistic units in
structural terms or in terms of formal rules for generating and comprehending language forms.
Instead, he tried to characterize language behavior in terms of the sources that provide the
reinforcing stimuli and discriminative stimuli for the behavior. As he put it (Skinner, 1957, p. 2),
language behavior is “…behavior reinforced through the mediation of other persons…” In other
words, it is the members of a language community who provide the stimulus control and
reinforcement necessary for an individual to become a member of and to maintain membership in
that community. Later in Verbal Behavior (pp. 224-226), Skinner refined his characterization to
make it clear that language behavior of the speaker was a special repertoire tied to a listener
whose own language behavior was shaped by a community of speaker-listeners (Palmer, 2008).
Unlike cognitivists and linguistic nativists, Skinner did not put the structures and the rules
or principles that supposedly govern language behavior in the head. Instead, in describing
language behavior in terms of the environmental (physical and social) variables that give rise to
such behavior, he gave full responsibility for the phonological and grammatical patterns in
language behavior to the language community, through its control over the contingencies of
reinforcement.
Arguments critical of explaining language behavior in terms of operant S-R relations
might be said to have begun soon after the publication of Verbal Behavior and quickly
culminated with Chomsky’s review of that book (Chomsky, 1959). Over the ensuing decades
these arguments became more refined and new ones were added, becoming known as the poverty
of the stimulus arguments (Chomsky, 1980; Thomas, 2002).
The poverty of the stimulus arguments might be regarded as a multi-bladed Excalibur
with which the knights of linguistic nativism try to slay the empiricist dragon. Only those
considered most pertinent to the issues of concern of speech-language pathologists and applied
behavior analysts will be discussed here. It has been argued by Chomsky and others that
explaining the ontogenetic development of language behavior (i.e., language acquisition) in terms
of establishing operant S-R relations in a child’s interactions with speaking members of a
language community is insufficient and, furthermore, that the structural complexities of speech
utterances (i.e., the grammar of language) are too great to be learned through verbal stimuli.
1. Language behavior is stimulus-independent. Chomsky (1972) stated that language
behavior is “…free from control of detectable stimuli, either external or internal” (p. 12). This is
such a radical claim that one might well wonder if there is any empirical support for it.
Unfortunately, Chomsky provides no such evidence. Previous to making this claim, Chomsky
(1972) discusses another “important observation.” The observation is that “…the normal use of
language is innovative, in the sense that much of what we say in the course of normal language
use is entirely new, not a repetition of anything that we have heard before and not even similar in
patternin any useful sense of the terms “similar” and “pattern”to sentences or discourse that
we have heard in the past” (pp. 11-12). Again just what evidence there is to support this claim is
not stated. In fact, Chomsky speaks of this observation as being a “truism.” Chomsky treats the
stimulus-free aspect and the novelty of language behavior as being independent observations.
Logically, it seems that if language behavior were not under the control of stimuli, then it would be
natural to expect much novelty in language behavior. Nonetheless, Chomsky claims that language
behavior is always coherent and appropriate to the situation. These properties are apparently
intended to rule out language behavior that is random or not in conformance with the grammatical
principles of the language system. It is interesting to note that Chomsky rules out the possibility
of the coherence and appropriateness of language behavior being controlled by external stimuli,
specifically the language community. Instead he lets coherence and situational appropriateness
remain inexplicable mysteries by saying, “Just what “appropriateness” and “coherence” may
consist in we cannot say in any clear or definite way, but there is no doubt that these are
meaningful concepts” (p. 12).
While there seems to be little, if any, empirical evidence for stimulus-independent
language behavior, there is plenty of empirical evidence for language behavior being stimulus-
dependent, particularly in the language acquisition process. Microanalytical studies of the dyadic
conversations between children and their parents have shown that (1) children are relying on cues
from the immediate conversation for building up their speech repertoires; (2) parents are
providing models of speech patterns that show up later in the child’s speech; and (3) when a child’s
verbal response is grammatically incorrect, the parent will frequently follow up with an utterance
similar to what the child said but recast in the correct grammatical form, thus providing a
mechanism for shaping the child’s language repertoire. Mediated reinforcing stimuli take many
forms, such as fulfilling a child’s verbal request and providing various social reinforcements
when the child responds appropriately. For example, when the parent asks the child a question
and the child replies appropriately or when something in the environment, such as an object or
event, or a property of either evokes a verbal response from the child that is deemed appropriate,
the parent then responds with such expressions as “yes,” “right,” and so forth, possibly followed
by a recast. If there was something grammatically wrong in the child’s reply, just recasting or
even expanding on what the child said often seems to be sufficient for the child to self-correct
eventually. It might be worth mentioning here that this is probably the way by which children
begin to distinguish between what linguists refer to as grammatical and ungrammatical utterances.
Later their “sense of grammaticality” is refined largely through the contingencies of
reinforcement provided by the educational system. These findings are not what one would expect
if language behavior were stimulus-independent (Moerk, 1992, 2000).
2. Primary linguistic input is limited and degenerate. Nativists use the term “degenerate”
to refer to the fact that in everyday conversations speakers fairly often make mistakes in
pronunciation and vocabulary selection, speak in fragments rather than in full sentences, hesitate,
start an utterance, stop, and begin anew, and in general make a number of errors. In addition, the
range of grammatical constructions that listeners hear is limited. How then, nativists ask, could a
child possibly construct a mental grammar of their language that would allow them to produce
and understand virtually any utterance in their language?
The preceding question raised by nativists clearly reveals the difference between the
theoretical predispositions of nativists and behaviorists. Behaviorists make no assumption about
the existence of any innate, species-specific, mental mechanism for grammar construction. They
assume that language behavior arises, is maintained, and changes through the interactions
between an individual and other members of his/her language community. For behaviorists there
are no internal rules and representations or principles and parameters for generating linguistic
utterances or for responding to linguistic utterances; there is only the history of contingencies of
reinforcement for such behavior by the verbal community and the current circumstances or
setting in which such behavior occurs. Clearly behaviorists and nativists stand apart in their
theoretical claims. But the crux of the difference between them seems to be in their empirical
claims and the strength of the evidence that supports them.
The empirical evidence for the linguistic input being limited and degenerate seems to be
very weak. Studies of the interactions between children in the process of learning their first
language and their parents or other members of their language community strongly suggest a
great richness in the quality and quantity of the antecedent and consequent stimuli. Far from
finding adult speech to be anomalous, messy, and full of errors, as it is claimed to be, studies of
parent-child verbal interactions going back at least as far as the early 1970s (Snow, 1972) have
shown that parents modify their language behavior when interacting with children so that their
speech is simpler, shorter, more careful, less complex, and more repetitious. Moerk (1992)
found that as children develop in their language behavior, the speech of the parents becomes more
complex, staying somewhat ahead of the complexity of the child’s speech, as if providing a goal for
the child to reach.
In addition, a study of the quantity of the parent-child interactions (Hart & Risley, 1995),
involving 42 children from three different social classes from the ages of 13 months to 36 months,
found that the children were verbally engaged with their parents at an average rate of 341
utterances per hour. These utterances contained numerous kinds of morphological and syntactic
structures and numerous kinds of sentence types. Although a complete analysis of the verbal
interactions between each parent and child has yet to be done, their study reveals a tremendous
richness in the children’s linguistic input.
Richness of input is necessary but hardly sufficient for a child to learn a language.
Studies have shown that children just listening for hours and hours to another language do not
show much progress in learning the language beyond perhaps having some rudimentary
understanding of a few words or expressions. Learning a first language also demands a rich
situational context and much feedback and interaction between the speaker and listener with a
frequent reversal of speaker-listener roles. In more behavioral terms, learning language behavior
requires interactions between the child and his or her social and physical environment where
operant S-R verbal relations can be established and maintained.
The Relevance of Operant S-R Relations to Speech-Language Interventions
Speech-language interventions have four basic aims: (1) to increase desirable or target
verbal behaviors; (2) to decrease undesirable verbal behaviors; (3) to establish desirable verbal
behaviors when such behaviors do not occur in the client’s behavioral repertoire; and (4) to
provide the means for the clients to maintain the desirable verbal behaviors in their everyday life
after treatment is over. To achieve these aims, Hegde (1998) distinguishes between treatment
principles and treatment procedures. Treatment principles are verbal inductive generalizations
or rules that are the outcomes of experimental analyses of operant behavior. Thus, they are
focused on the conditions under which operant S-R relations are established and maintained.
The treatment procedures are those intervening interactions that a clinician carries out
with a client to bring about changes in the behavior of the client. In this process, not only is the
client’s behavior changed, but the clinician’s own behavior will also be altered as a consequence
of the client’s responses as he or she determines which discriminative and reinforcing stimuli are
effective in altering the client’s behavior and which schedule of reinforcement is most suitable for
the maintenance of the new behavioral repertoires.
Treatment procedures might be said to be where the rubber meets the road. They are the
specific steps taken by the clinician in implementing the empirically based principles on a case-
by-case basis in carrying out language interventions. Because the treatment principles are
empirically well founded, valid, reliable, and replicable, they are what license or legitimize the
use of the treatment procedures. Treatment procedures need to be tailored to the client because
each client is an individual and will respond differently owing to the unique aspects of his or her
physiology and history of past contingencies of operant interactions with the physical and social
environments.
Hegde (1998) points out that there are many operant principles that suggest ways to carry
out speech-language behavior interventions. The principle of positive reinforcement can be
applied to increase the probability of desirable or target speech-language behaviors, such as
community-acceptable word pronunciation. One of the problems a speech-language clinician
faces is determining what the appropriate positive reinforcers are for a particular client. Some
clients, initially at least, respond better to certain primary reinforcers, while other clients might
respond better to conditioned reinforcers, based on their past history of contingent reinforcement.
For example, a clinician working with an autistic child who has tended not to interact well with
other people and who displays little or no verbal behavior might make use of one or more primary
reinforcers, such as food, or of objects such as toys that the child shows some interest in, allowing
access to them when the child responds in some verbally desirable way, like saying “Please.”
Gradually the clinician can switch to conditioned reinforcers, particularly social reinforcers.
Some undesirable behaviors can be removed from a client’s repertoire through extinction.
If these are operant behaviors, reinforcers must be maintaining them. By finding out what those reinforcers are, the
clinician can withdraw these stimuli and thus bring about extinction of the behavior. Differential
reinforcement can help by replacing the undesirable behavior with more desirable behavior.
Another approach to removing undesirable behaviors is to follow such behaviors with aversive
stimuli (punishers). There are some advantages to punishment of undesirable behavior. The
behavior is rapidly suppressed and is not likely to recur. However, using punishing stimuli
has a downside to it. It may engender other unacceptable behaviors, such as aggression, anger,
or even withdrawal from interaction. The use of punishment should be decided on a case-by-case
basis and great care must be taken not to overuse it.
For reinforcement to be effective in building or altering a language repertoire, the target
verbal behavior must first be emitted. But what if the target behavior is not in the client’s current
repertoire or occurs too infrequently to make practical use of it? In situations like these the
clinician needs to apply procedures that can result in new verbal behaviors being added to a
client’s language repertoire. The kinds of procedures that are appropriate will depend on
assessing the current nonverbal vocal behavior, language repertoire, and learning and social skills
of the client. If a client already demonstrates some emitted vocal skills but exhibits little language
behavior, then the verbal modeling-imitation procedure may be quite useful in the development
of such behavior. As was pointed out earlier, Skinner (1957) defined verbal behavior as behavior
that is reinforced through the mediation of other persons. In verbal modeling-imitation, a model is
a verbal response produced by one person, and the imitation is the evoked verbal response,
produced by another person, that is topographically similar to its modeled stimulus. In the
example presented earlier, the red light (the SD) evoked the infant’s leg movement (R) after the
behavior was differentially reinforced in the presence of the light but not in its absence. Note that
formally or topographically, there was no resemblance between the red light and the infant’s leg
movement. This is not so in the case of modeling-imitation, where the antecedent verbal stimulus
and the verbal operant share common topographical properties, in this case acoustic ones.
Skinner (1957) called such verbal operants echoics. Others may refer to them as verbal or vocal
imitations.
Modeling-imitation can be seen in the pre-linguistic and early stages of language
development. Young children, infants and toddlers, frequently get socially reinforced for
vocalizing in ways that are similar to or echo adult speech. For example, a mother who hears her
young daughter babble the uncommitted vocal response “dada” will suddenly pay more attention
to her child, show excitement, smile brightly, gently touch or pat, and even hug her child, and say
“Yes, dada.” Some or all of these parental responses may provide reinforcement for the child to
repeat “dada.” Soon the child is saying “dada” whenever the mother says to her “Say, dada”
under some schedule of reinforcement. Thus, “dada” has become a discriminated verbal operant.
In fact, this kind of interaction can become generalized, such that whenever the mother says “Say,
mama,” or “Say, doggie,” or, in general, “Say, X,” the child will repeat back (echo) whatever X is.
Through this generalized procedure new verbal responses are being added to the child’s language
repertoire.
Modeling-imitation also has applications in speech-language interventions. For example,
if a client is having difficulties with pronouncing certain words or longer expressions, the model
discriminative stimulus provides a standard that can be used for correcting the problem. It may
also be used in dealing with a client’s morphological or syntactic problems.
Most people would not consider a child’s echoic responses to have any particular
informative value; that is, we would hardly consider the verbal echoic episodes between the
mother and her daughter as conversations. Nevertheless, echoics are a part of language behavior.
Echoic behavior, either complete or partial, plays various roles in adult verbal episodes, such as
when quoting someone else, expressing surprise, giving reassurance that one is paying attention,
concurring or showing understanding, taking vows and oaths, repeating to gain time to respond,
and so forth (Winokur, 1976). Furthermore, echoics are extremely valuable in both normal
language development (see Hegde, in this issue, and McLaughlin, in this issue) and language
interventions as a jumping-off point for developing other kinds of verbal operants that are more
involved in verbal interchanges.
Among a number of verbal operants, Skinner (1957) distinguished between two primary
ones: tacts and mands. Tacts are verbal operants that are evoked by the presence of
discriminative stimuli, such as objects, events, or properties of objects and events, spatial,
temporal, and other relationships or combinations of these. In other words, tacts are verbal
operants for whatever there is in our universe of discourse. Mands are verbal responses that are
evoked under conditions of deprivation or aversive stimulation. The form of the mand often specifies
or tacts its reinforcer, making it possible for another member of the verbal community to respond
as a reinforcement mediator to supply the reinforcer specified by the mand. Linguistically, a mand
can be as simple and as short as a single word (“Cookie!”) or as long as a sentence (“Would you go
into the bedroom and get my green slippers?”) or even longer.
To establish tact and mand verbal operants as part of the language repertoire of a child
with only echoic behavior requires the use of transfer of control procedures (Sundberg &
Partington, 1998). To illustrate such a procedure, the earlier “dada” example, where a child
acquired a repertoire of echoic behaviors, will be expanded on. Now the father is present in the
room. The mother points to the father and says, “Who is this? Dada. Say, dada.” The child
echoes “dada” and is socially reinforced with big smiles, verbal praise, and perhaps some
physical contact from one or both parents. After some similar episodes, the echoic prompt can be
faded out, while keeping the prompt “Who is this?” Later, when the father is alone in the room
with his daughter, if she utters “dada” without any prompts at all, her behavior is reinforced. In
this way, transfer of stimulus control from the verbal antecedent stimulus prompts to the presence
of the father is accomplished. The initial model antecedent stimulus acts as a kind of catalyst to
establish a new operant relationship in which the presence of another antecedent stimulus comes
to evoke the original echoic response. Once the child has learned tact relationships, the transfer of
control procedure is no longer necessary. Instead, the child can start acquiring more tacts by
observing adults and other children tacting in the presence of discriminative stimuli. This is
another kind of verbal modeling-imitation that is frequently observed in language development. If
the child has also learned mand relationships, she can start asking for the names of things,
perhaps first by pointing and then later verbally by asking, “What’s that?” Mand verbal operants
can be learned in a similar way but with the added complication that a deprivation or aversive
antecedent stimulus must be already present or must be established by the clinician, so that
transfer of control can eventually be made to it.
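The logic of fading prompts in the tact example above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than a clinical protocol: the three prompt levels paraphrase the “dada” example, while the function name and the criterion of three consecutive correct responses before fading are hypothetical choices made only for illustration (the text says merely “after some similar episodes”).

```python
# Prompt levels, from most to least verbal support (paraphrasing the example).
PROMPT_LEVELS = [
    "question plus echoic model ('Who is this? Dada. Say, dada.')",
    "question only ('Who is this?')",
    "no verbal prompt (father present)",
]


def fade_prompts(responses, criterion=3):
    """Step through prompt levels given a sequence of trial outcomes.

    responses: booleans, True when the child says the target word.
    criterion: assumed number of consecutive correct responses required
               before the current prompt is faded.
    Returns the final prompt level and a log of (level, correct) per trial.
    """
    level = 0
    streak = 0
    log = []
    for correct in responses:
        log.append((level, correct))
        if correct:
            streak += 1
            if streak == criterion and level < len(PROMPT_LEVELS) - 1:
                level += 1  # fade to the next, less supportive prompt
                streak = 0
        else:
            streak = 0  # an error resets the count at the current level
    return level, log


final_level, history = fade_prompts([True, True, False, True, True, True, True, True, True])
print("Final prompt level:", PROMPT_LEVELS[final_level])
```

When the final level is reached, the response is being evoked by the nonverbal stimulus alone, which is what the transfer of stimulus control amounts to.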
Human beings, like other living organisms, interact with the physical world and thus their
behavior is subject to the contingencies of reinforcement and stimulus discrimination the physical
world provides. But more than any other species, human beings interact with their fellow human
beings and are subject to the contingencies of reinforcement and stimulus discrimination of the
language behavior of others. And once such behavior has been established, we learn to behave,
both verbally and non-verbally, according to the verbal instructions of others. Verbal
instructions, whether vocal, written, or gestured, are stimuli that describe or specify occasioning
discriminative stimuli, the evoked responses they control, and their consequences or some
combination of these (Skinner, 1969). Verbal instructions have also been called contingency-
specifying stimuli (Schlinger & Blakely, 1987) or rules (Skinner, 1969). The behavior that they
evoke has been called by Skinner (1969) rule-governed behavior and by Catania (1998)
verbally governed behavior.
The linguistic form that verbal instructions can take is highly varied, ranging over such
traditional sentence types as declaratives, interrogatives, and imperatives. We learn many skills
via the modeling-imitation route, but many more are acquired through instructions or a combination
of modeling and instruction. For example, a manual of English pronunciation might describe how
to pronounce the “f” sound as follows: Spread your lips a little and gently bring your lower lip up
against your upper teeth; then breathe out through your mouth without vibrating your vocal cords.
If a clinician gave the same instructions to a client who is having difficulty pronouncing this
sound and the client has had a long history of being reinforced for following verbal instructions,
there is a good chance that the client would correctly pronounce the target sound. If the client still
has problems, the clinician can model the articulation of the sound, providing a visual stimulus for
the client to imitate. Also, the clinician’s articulation of the sound produces an auditory
stimulus that provides the client with a model for the auditory effect of his or her own
production.
Another means of adding new behaviors to an individual’s repertoire is shaping. Not all
clients are able to acquire new behaviors using the modeling-imitation procedure, nor are all
clients able to follow verbal instructions well. Shaping is a technique for adding new behaviors in
a step-by-step fashion using differential reinforcement. Skinner (1953) compares the shaping of
emitted behaviors into new complexes of behavior through differential reinforcement of
successive approximations to the desired terminal behavior to the shaping of a lump of clay by a
sculptor into a piece of artistic design. Teaching language skills to autistic and cognitively
deficient children (Sundberg & Partington, 1998) and rehabilitating the verbal skills of aphasic
adults can be particularly difficult, but shaping can be an effective means of establishing verbal
behavior.
Starkweather (1983) discusses shaping with the example of a five-year-old
boy with no evidence of speech behavior, who did vocalize on some occasions and had a gestural
repertoire for communicating his needs. He was brought to a clinic after being deprived of food
for a while. His mother brought a sandwich and it was made known that he could have a bite of
the sandwich at the pleasure of the clinicians. His gestures requesting the sandwich were ignored.
Finally, he started whimpering a little. Immediately he was given a bite of the sandwich. Soon he
varied his whimper with a kind of grunt, which was reinforced with another bite from the
sandwich, but whimpers were no longer reinforced. It was not long before he was just grunting to
get food reinforcement. Then the clinicians changed his reinforcement so that he was being
reinforced only if he grunted with his mouth open, producing a vowel-like sound. The open-
mouth grunts varied in their vowel quality and then only the vocalizations with a short schwa-like
quality were reinforced and not others. As the variation in his vowel-like sounds changed, some
would be reinforced and not others and only if at the same time he pointed to a particular object
and produced a certain vowel. Eventually he reached the behavioral point where he was being
reinforced when he pointed to the sandwich and said /æʔɪʔ/, which corresponded to the vowels
in the word ‘sandwich’, each ending with a glottal stop. The shaping procedure continued until he
could say the whole word correctly. As you can see, shaping can be a very onerous technique for
teaching new behaviors. It takes a great deal of time, progress may not be as straightforward as a
clinician might hope, constant monitoring is required, and the results may have some unintended
consequences. The shaping procedure offers the best approach for initiating new behaviors with
clients who do not imitate. With clients who do imitate, the modeling-imitation procedure is
always more efficient.
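The differential reinforcement of successive approximations in this example can be sketched as a simple procedure. The Python code below is an illustration only: the list of approximations is a simplified paraphrase of the steps in Starkweather's case, and the function name and the rule of advancing the criterion after two reinforced responses are assumptions made for the example, not details reported in the source.

```python
# Simplified, ordered approximations toward the target (paraphrased from the case).
APPROXIMATIONS = [
    "whimper",
    "grunt",
    "open-mouth grunt",
    "schwa-like vowel",
    "target word",
]


def shape(responses, reinforcers_per_step=2):
    """Differentially reinforce successive approximations.

    A response is reinforced only if it meets or exceeds the current
    criterion; after `reinforcers_per_step` reinforced responses (an assumed
    value), the criterion moves to the next approximation.
    Returns the final criterion index and a log of (response, reinforced).
    """
    step = 0
    earned = 0
    log = []
    for response in responses:
        # Responses outside the list (e.g. gestures) never meet the criterion.
        rank = APPROXIMATIONS.index(response) if response in APPROXIMATIONS else -1
        reinforced = rank >= step
        log.append((response, reinforced))
        if reinforced:
            earned += 1
            if earned >= reinforcers_per_step and step < len(APPROXIMATIONS) - 1:
                step += 1  # earlier approximations are no longer reinforced
                earned = 0
    return step, log


final_step, history = shape(
    ["gesture", "whimper", "whimper", "whimper", "grunt", "grunt"]
)
for response, reinforced in history:
    print(response, "->", "reinforced" if reinforced else "not reinforced")
```

In this run the third whimper goes unreinforced because the criterion has already moved on to grunts, which mirrors the point in the case at which whimpers were no longer reinforced.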
Various schedules of reinforcement are used for establishing and maintaining desirable
behaviors. If the desirable behavior only occurs infrequently, continuous reinforcement can be
used to strengthen and stabilize that behavior, as was done in the case of the five-year-old
child discussed by Starkweather. Once the desired behavior is regularly occurring in appropriate
situations, it is then advisable to “thin” the delivery of reinforcement by gradually switching to an
intermittent schedule of reinforcement to ensure that it will be maintained. As a goal, a highly
variable schedule might be preferable because it generates responses that are highly resistant to
extinction and also because variable schedules are more commonly encountered in natural
settings.
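One way to picture the gradual thinning of reinforcement is as a session-by-session plan. The following Python sketch is hypothetical: the function name, the 80% criterion, the step size of one, and the rule of backing up after a poor session are all assumptions introduced for illustration, and the ratio could be read as the mean of a variable-ratio schedule. It starts at continuous reinforcement (a ratio of 1) and raises the response requirement only while the behavior continues to occur reliably.

```python
def thinning_plan(session_accuracy, start_ratio=1, max_ratio=10, criterion=0.8):
    """Return the ratio requirement in force during each session.

    session_accuracy: proportion of opportunities on which the target
    behavior occurred in each successive session (values from 0 to 1).
    All parameter values are illustrative assumptions.
    """
    ratio = start_ratio
    plan = []
    for accuracy in session_accuracy:
        plan.append(ratio)
        if accuracy >= criterion:
            ratio = min(ratio + 1, max_ratio)  # thin the schedule
        else:
            ratio = max(ratio - 1, start_ratio)  # back up toward CRF
    return plan


print(thinning_plan([0.9, 0.9, 0.5, 0.9, 1.0]))
```

The design choice here is simply that thinning is contingent on the client's performance, so that the schedule is never leaner than the behavior can currently sustain.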
From the previous discussion it should be clear that speech-language pathologists and
applied behavior analysts must rely on operant conditioning methods to establish S-R
relations to carry out language interventions. There is no other way to try to modify speaker or
hearer verbal behavior repertoires. The successes that have been achieved by operant
conditioning methods provide the reinforcing stimuli that keep practitioners continuing to rely on
such methods and the well-founded empirical principles on which they are based.
References
Catania, A. C. (1998). Learning (4th ed.). Upper Saddle River, NJ: Prentice-Hall, Inc.
Chomsky, N. (1959). Review of Skinner 1957. Language, 35(1), 26-58.
Chomsky, N. (1972). Language and mind (Expanded ed.). New York: Harcourt Brace Jovanovich,
Inc.
Chomsky, N. (1980). Rules and representations. New York: Columbia University Press.
Cooper, J. O., Heron, T. E., & Heward, W. L. (2007). Applied behavior analysis (2nd ed.). Upper
Saddle River, NJ: Pearson Education, Inc.
Donahoe, J. W., & Palmer, D. C. (1994). Learning and complex behavior. Boston, MA: Allyn
and Bacon.
Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. New York: Appleton-
Century-Crofts.
Hart, B., & Risley, T. R. (1995). Meaningful differences in the everyday experience of young
American children. Baltimore, MD: Paul H Brookes Publishing Co.
Hegde, M. N. (1998). Treatment procedures in communicative disorders (3rd ed.). Austin, TX:
Pro-Ed.
Moerk, E. L. (1992). A first language taught and learned. Baltimore, MD: Paul H. Brookes.
Moerk, E. L. (2000). The guided acquisition of first language skills (Vol. 20). Stamford, CT:
Ablex.
Palmer, D. C. (1998). On Skinner's rejection of S-R psychology. The Behavior Analyst, 21(1),
93-96.
Palmer, D. C. (2008). On Skinner's definition of verbal behavior. International Journal of
Psychology and Psychological Therapy, 8(3), 295-307.
Pavlov, I. P. (1927). Conditioned reflexes (G. V. Anrep, Trans. paperback ed.). New York: Dover
Publications, Inc.
Pierce, W. D., & Cheney, C. D. (2008). Behavior analysis and learning (4th ed.). New York:
Psychology Press.
Schlinger, H. D., Jr. (1995). A behavior analytic view of child development. New York: Plenum
Press.
Schlinger, H. D., Jr., & Blakely, E. (1987). Function-altering effects of contingency-specifying
stimuli. The Behavior Analyst, 10(1), 41-45.
Sechenov, I. M. (1965). Reflexes of the brain (S. Belsky, Trans. English ed.). Cambridge, MA:
The M.I.T. Press.
Skinner, B. F. (1953). Science and human behavior (Paperback ed.). New York: The Free Press.
Skinner, B. F. (1957). Verbal behavior. New York: Appleton-Century-Crofts.
Skinner, B. F. (1969). Contingencies of reinforcement: A theoretical analysis. New York:
Appleton-Century-Crofts.
Snow, C. E. (1972). Mothers' speech to children learning language. Child Development, 43(2),
549-565.
Starkweather, C. W. (1983). Speech and language: Principles and processes of behavior change.
Englewood Cliffs, NJ: Prentice-Hall.
Sundberg, M. L., & Partington, J. W. (1998). Teaching language to children with autism or other
developmental disabilities (Version 7.1 ed.). Pleasant Hill, CA: Behavior Analysts, Inc.
Thomas, M. (2002). Development of the concept of "the poverty of the stimulus." The Linguistic
Review, 19, 51-71.
Winokur, S. (1976). A primer of verbal behavior: An operant view. Englewood Cliffs, NJ:
Prentice-Hall, Inc.
Raymond S. Weitzman
1035 E. Monticello Circle
Fresno, CA 93720-1872
Phone: 559-438-6334
Email: raymondw@csufresno.edu
Behavioral vs. Cognitive Views of Speech Perception and Production
Henry D. Schlinger, Jr.
Abstract
Speech perception and language acquisition have been studied primarily by cognitively oriented
researchers. Many of these researchers discount a behavioral account despite facts demonstrating that
speech in humans will not proceed typically or even at all in the absence of early exposure to a speech
environment, and that reinforcing consequences from others as well as from infants’ own vocalizations shape
their vocal repertoires. In the present article, I illustrate a general behavior analytic approach to speech
perception and production and contrast it with the more standard cognitive view. I suggest that cognitive
accounts are not parsimonious in that they make assumptions about events and processes that are not
testable and, thus, not falsifiable. I argue that a behavior analytic approach is not only parsimonious, but it
is supported by evidence from both human infants and songbirds, and it is more likely to lead to practical
applications.
Key words: speech perception, speech production, behavior analysis, operant learning, statistical learning
______________________________________________________________________________
Introduction
The study of speech perception and production has been dominated largely by cognitively
oriented researchers and theorists. However, cognitive theories do not translate well into actually teaching
speech and language. Because of its emphasis on behavior and its foundation of experimentally derived
principles of learning, operant learning theory, on the other hand, has for more than 40 years been used to
successfully teach speech and language, especially to people with a variety of speech and language
disorders (e.g., Camarata, 1993; DeLeon, Arnold, Rodriguez-Catter, & Uy, 2003; Hegde, 1998; Hegde,
2007; Hegde & Maul, 2006; Johnston & Johnston, 1972; Lancaster et al., 2004; Pena-Brooks & Hegde,
2007; Wagaman, Miltenberger, & Arndorfer, 1993). Although there is a fair amount of behavioral
research on teaching verbal behavior to people with speech and language disorders, with few exceptions
(e.g., Guess, Sailor, Rutherford, & Baer, 1968; Whitehurst, 1972; Whitehurst, Ironsmith, & Goldfein,
1974) behavioral scientists have not contributed much basic research on speech perception or production.
What behavior analysis can offer language researchers and speech-language pathologists (SLPs) is a
coherent and parsimonious interpretation of speech consistent with experimentally established scientific
principles of learning that has immediate practical applications. The purpose of this paper is to illustrate a
general behavioral approach to speech perception and production and to contrast it with the traditional
cognitive account. The task of interpreting the traditional speech perception and production literature
from a behavior analytic perspective is not as onerous as one might imagine, given that much of the
research either incorporates relatively straightforward operant conditioning methods or methods that are
amenable to an operant analysis.
Contrasting Views of Perception
Cognitive Views of Perception
Traditional treatments describe sensation in terms of the effects of stimuli on sensory receptors.
Perception, on the other hand, has generally been referred to as how the brain interprets sensory
experience, or more formally as:
…the process by which animals gain knowledge about their environment and about themselves in
relation to the environment. It is the beginning of knowing and so is an essential part of cognition.
More specifically, to perceive is to obtain information about the world through stimulation. (Gibson
& Spelke, 1983, p. 2)

The main problem with such descriptions is that gaining knowledge, knowing, and obtaining
information are inferred solely from observable behavioral evidence and, thus, do not add much to our
understanding. Moreover, placing perception inside the brain, as some descriptions do, only moves the
level of analysis further away from the actual behaviors involved when one speaks of perception. And
talking about the brain as if it is doing something – perceiving – is an example of what I refer to as the
brain-as-person metaphor because organisms, not brains, perceive, that is, behave.
Descriptions of the development of perception in infants are equally vague. For example,
according to Gallahue (1989), “newborns attach little meaning to sensory stimuli” but very soon infants
begin to attach meaning and attend to specific stimuli and to identify objects (p. 184). But what does that
mean? These verbs do not refer to specific behaviors that can be studied but rather are labels for a variety of
behaviors. To understand what it means to attend, to identify or to attach meaning, we would need to
observe what newborns actually do and the circumstances under which they do it when researchers use
such terms. And we should not be surprised if the behaviors and circumstances vary considerably from
instance to instance. Furthermore, such characterizations raise a more fundamental question: Why do
infants begin to attach meaning or attend to specific stimuli? The traditional account offers no clear
answer.
Finally, the term perception itself is problematic. As a noun, it is usually considered a name for a
process. But, of course, most of the time the only evidence for the process is the behavior that leads one to
say that an organism has perceived something in the first place. In such cases, considering perception to
be a process is an example of the logical error of reification.
A Behavioral View of Perception
Rather than debating the meaning of the term perception, a behavioral approach looks at what an
individual does (and under what circumstances) that leads investigators to say that he or she perceives
something. The focus is not on some inferred hypothetical construct, but rather on the actual behaviors
from which such inferences are made. The advantages are obvious. Because the behaviors can be
objectively described and measured, their causes can be potentially discovered and, as a result, can be
manipulated as independent variables to change the behaviors. It is not possible to change one’s
perception.
Our principal question, then, is this: What is someone doing when he or she is said to “perceive”?
For example, what does it mean to say that I perceive the computer on which I am writing the words you
are reading? In other words, what do I do that causes someone to say that I perceive the computer? The
list includes but is not restricted to looking at the computer, pushing the button to turn it on, moving the
mouse, typing on the keyboard, and, importantly for verbal organisms, calling it a computer and
describing it. Obviously, these behaviors usually, but not always, occur only in the presence of the
computer, that is, while it is functioning as a set of visual, auditory, and tactile stimuli. (When these
behaviors occur in the absence of the computer, we say that I am imagining it.) There are, however, many
other stimuli impinging on my sensory receptors as I write at the computer, but until they influence some
behavior on my part, we are not likely to say that I perceive them. For example, the sound waves
produced by a car going down the street affect sensory receptors in my inner ear, but unless I do
something like get up to see who it is, comment on it to someone else, or say to myself, “I wonder who
just drove by,” I would not be said to perceive the car even though I sensed it. As this example illustrates,
the sound of the car as a conditional stimulus (CS) or a discriminative stimulus (SD) might evoke several
different behaviors that would cause an observer to say that I perceived the car. Or, it might evoke no
behaviors at all. Our senses are constantly bombarded by stimuli that are transduced into neural impulses,
but we only perceive, that is, respond to, a very small portion of them. Behavior analysts will recognize
this as a relatively straightforward issue of stimulus control. The question for a science of behavior is:
What causes us to respond to some stimuli but not others; in other words, what causes only certain stimuli
to evoke behavior? Cognitive approaches to speech perception are problematic because they infer
hypothetical constructs. A behavioral approach is more parsimonious because it infers potentially
observable and manipulable interactions between behavior and stimuli.
Cognitive Approaches to Speech Perception
At this point in our evolutionary history, auditory perception is important for human beings
primarily because we talk and listen. But as with perception in general, speech perception in particular has
been studied within the conceptual framework of cognitive science.
Consider the following description of the perception of a spoken word by Holt and Lotto (2008):
The ease with which a listener perceives speech in his or her native language belies the complexity of
the task. A spoken word exists as a fleeting fluctuation of air molecules for a mere fraction of a
second, but listeners are usually able to extract the intended message. (p. 42, emphasis added)
This short description embodies the essence of a cognitive approach to speech perception: that
speakers have intentions, that words contain meanings, and that listeners must extract the intended
meaning. Of course, the only evidence for extracting the intended meaning of an utterance is what the
listener actually does. When this account is contrasted with a behavioral approach in which a speaker’s
verbal behavior, itself evoked by the current circumstances (often including the presence of a listener), in
turn evokes verbal (and nonverbal) behavior in the listener all because of a history of operant learning, it
is not too difficult to understand why the behavioral approach has not fared well at all against the
cognitive one. The cognitive account, despite its logical and scientific problems (see below), is the more
familiar and accessible. It is also consistent with the philosophical tradition of mentalism with which we
have all been raised. Even so, most language researchers discount any important role for operant learning
in speech perception or production.
Cognitive Views of “Skinnerian Learning”

Modern language researchers do not put any stock in a behavioral account of speech perception
or language acquisition. Some researchers still reference Chomsky’s (1959) review of Skinner’s (1957)
book Verbal Behavior as demonstrating “the failure of existing learning models, such as Skinner’s, to
explain the facts of language development” (Kuhl, 2000, p. 11852), despite the fact that Chomsky’s
review was thoroughly rebutted decades ago (MacCorquodale, 1970; see also Palmer, 2006 for a brief
history of the writing of Verbal Behavior, Chomsky’s review, and MacCorquodale’s rebuttal; see also
Hegde in this issue). As I have written elsewhere:
It seems absurd to suggest that a book review could cause a paradigmatic revolution or wreak all the
havoc that Chomsky’s review is said to have caused to Verbal Behavior or to behavioral psychology.
To dismiss a natural science (the experimental analysis of behavior) and a theoretical account of an
important subject matter that was 23 years in the writing by arguably the most eminent scientist in
that discipline based on one book review is probably without precedent in the history of science.
(Schlinger, 2008b, pp. 335-336)
Moreover, much empirical research since Chomsky’s review supports the behavioral view that
parents and other caregivers behave in ways that shape and reinforce verbal behavior in young children
(e.g., Whitehurst, Novak, & Zorn, 1972; Moerk, 1978, 1983, 1992; Hart & Risley, 1995, 1999). And
speech-language pathologists have regularly used behavioral interventions to teach speech and language
skills to children with articulation (speech) and language disorders (Hegde & Maul, 2006; Pena-Brooks &
Hegde, 2007).
In addition to citing Chomsky’s review, many language researchers perpetuate myths about
Skinner’s view of language. For example, according to Kuhl (2000),

On Skinner’s view, no innate information was necessary, developmental change was brought about
through reward contingencies, and language input did not cause language to emerge . . .The emerging
view argues that the kind of learning taking place in early language acquisition cannot be accounted
for by Skinnerian reinforcement. (p. 11850)
It is not clear what “innate information” means; if it means some presumed and unverifiable
innate mechanism, Skinner may not have supported it. Skinner, however, never discounted the
importance of well documented neurophysiological and genetic information – he believed that all
behavior, including verbal behavior, was the combined result of inheritance and learning. But Skinner
was neither a geneticist nor an evolutionary biologist (although he did write often about biology and
evolution [see Morris, Lazo, & Smith, 2004]). Rather, his area of expertise was the effect of
environmental contingencies on behavior. Thus, although Skinner emphasized the role of reinforcement
contingencies on the behavior of a wide range of species, especially humans, he never claimed that
developmental change was brought about solely through “reward contingencies.” And he most certainly
did acknowledge the contribution of language input in language acquisition. But he did so within the
confines of a unified account of behavior in general (see Skinner, 1957). Finally, as I argue below, there is
no evidence that in any way suggests “that the kind of learning taking place in early language acquisition
cannot be accounted for by Skinnerian reinforcement.” On the contrary, the kind of learning that takes
place in early language acquisition can almost entirely be accounted for by operant learning principles.
Elsewhere I have addressed some of the misconceptions of a behavioral approach to language
(see Schlinger, 1995, pp. 178-182). Nonetheless, such misconceptions continue to be perpetuated by
language researchers. One reason may be that behavior analysts have conducted very little of their own
research on speech perception or language acquisition, except in the context of teaching children with
language deficits (e.g., Learman et al., 2005; Sautter & LeBlanc, 2006). And much of the research
conducted by behavior analysts has not been published in mainstream language journals. Also, many
behavior analysts have been relatively content with interpreting the facts of verbal behavior, including
research by traditional language researchers, according to the established principles of behavior analysis.
Regardless of whether such an interpretation is correct, it has much to recommend it because it is
parsimonious and firmly grounded in, and consistent with, experimentally established principles.
But if behavior analysts themselves have not conducted much research into speech perception and
production, other researchers have demonstrated, albeit incidentally, the critical role of operant learning,
especially in speech production. For example, even though Kuhl (2000) claimed that “Skinnerian
reinforcement learning” cannot explain how infants’ perceptual systems are altered by experience,
elsewhere (e.g., Kuhl, 2004) she cited studies showing that social contact and interactions affect the
duration, rate, and frequency of vocal learning in human infants and in songbirds. In particular, she cited a
study by Goldstein, King, and West (2003) with human infants in which, in a contingent condition,
“mothers were instructed to respond immediately to their infants’ vocalizations by smiling, moving closer
to and touching their infants” (Kuhl, 2004, p. 837). Not surprisingly, at least to a behavior analyst, the
results showed that when compared to infants in the non-contingent condition, infants in the contingent
condition produced more vocalizations and more mature and adult-like vocalizations. This is obviously
operant conditioning (Kuhl would call it “Skinnerian learning”) and it is consistent with previous research
demonstrating that reinforcement, even in the absence of awareness, can strengthen (i.e., select)
vocalizations and numerous forms of speech (e.g., Greenspoon, 1955; Rosenfeld & Baer, 1970;
Rheingold, Gewirtz, & Ross, 1959; Todd & Palmer, 1968). Even Goldstein et al. refer to it as “social
shaping” (Skinner coined the term shaping to refer to the gradual differentiation of responses belonging to
a response class as a function of differential reinforcement). Other data from Goldstein’s lab confirm the
powerful role of operant learning in early language acquisition (e.g., Goldstein, Schwade, & Bornstein,
2009). But such findings are not new (e.g., Rheingold, Gewirtz, & Ross, 1959; Todd & Palmer, 1968).
Nonetheless, data such as these provide further support for the interpretation that infant vocalizations,
including those called babbling, can be, and are, shaped (i.e., selected) by consequences. The
consequences for infant babbling and speech sometimes come from others, but the most relevant
consequences are likely the match between the babbled sounds and those heard by the infant for the
several months prior to the onset of babbling (see Schlinger, 1995, pp. 158-160; Hegde, this issue). There
is simply no question that operant learning is a significant factor in language production.
What Does It Mean to Say That Someone “Perceives Speech”?
Whether we know it or not, most of the time when we say that someone perceives speech, we
mean that his or her behavior is under the stimulus control of the speech. Some questions for a science of
behavior are: 1) What form does the behavior take?; 2) What is the function of the behavior?; and, 3)
How is such behavior acquired?
At the most basic level, it could be said that infants perceive speech if they turn their head toward
the location of vocal stimuli, although we would be hard-pressed to claim that infants are “extracting
information” from such stimuli. Researchers often assess auditory perception in infants by using so-called
habituation and dishabituation methods, for example, by measuring non-nutritive sucking or changes in
heart rate, all of which, on the present account, qualify as perceptual behaviors. One example of a
seemingly simple speech perception phenomenon that has received a considerable amount of attention
from language researchers is categorical perception.
The Strange Case of Categorical Perception
Beginning in the late 1950s and early 1960s, researchers began to test the ability of both infants
and adults to discriminate different categories of phonemes, a phenomenon called categorical perception.
Traditional accounts view the phoneme as “the smallest unit of sound that signifies a difference in
meaning in a given natural language” (Aslin, 1987, p. 68). Behavior analysts, on the other hand, view the
phoneme functionally as “the smallest unit of sound that exerts stimulus control over behavior”
(Schlinger, 1995, p. 153). For the present account, whatever behavior is evoked by the sound of a
phoneme is what we mean when we speak of categorical perception.
In early experiments, researchers used computers to present a series of synthetic consonant-vowel
(CV) sounds that ranged across a number of consonants (e.g., /bV/-/dV/-/gV/). Specifically, the computer
generated synthetic speech sounds that varied along a stimulus dimension called voice onset time (VOT),
which is the basis of the distinction between CVs like pa and ba. Voice onset time refers “to the point at
which vocal cords begin to vibrate before or after we open our lips” (Bates et al., 1987, p. 152). Thus, for
sounds we react to as b, voicing begins either before or simultaneously with the consonant burst; and in
sounds that we react to as p, voicing begins after the consonant burst. A computer can present stimuli
along this VOT continuum from -150 to +150 milliseconds (ms) from burst to voice. Results from
numerous studies have shown that English-speaking adults and infants do not respond differentially to
VOTs that fall on the same side of the boundary between pa and ba, which is about 25-30 ms, whether
well above it or well below it. Results showing that infants only a few months old can respond differentially to these different
phonemic categories suggested to some researchers that humans are born with “phonetic feature
detectors” that evolved specifically for speech and that respond to phonetic contrasts found in the world’s
languages (Eimas, 1975). Such results seemed to support nativist theories of speech and language (e.g.,
Chomsky, 1957) and were couched in such terms, which is understandable because Chomsky’s views on
innate mechanisms of generative grammar still dominated the study of language in the 1970s.
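The stimulus continuum just described can be sketched in a few lines of code. The sketch below is only an illustration of the reported response pattern, not a model of any particular experiment: the boundary value of 27.5 ms (the midpoint of the 25-30 ms range given above), the 10 ms step size, and the function names are all assumptions introduced here.

```python
# Hypothetical sketch: labeling stimuli along a voice onset time (VOT) continuum.
# BOUNDARY_MS is an assumed midpoint of the 25-30 ms boundary reported in the text.
BOUNDARY_MS = 27.5


def label_vot(vot_ms: float) -> str:
    """Return the syllable an English-speaking listener would be expected to report."""
    return "pa" if vot_ms > BOUNDARY_MS else "ba"


def responds_differentially(vot_a: float, vot_b: float) -> bool:
    """True only when the two stimuli fall on opposite sides of the boundary."""
    return label_vot(vot_a) != label_vot(vot_b)


# Continuum from -150 to +150 ms; the 10 ms step size is an assumption.
continuum = list(range(-150, 151, 10))
labels = {vot: label_vot(vot) for vot in continuum}

# Each pair below differs by the same 20 ms, but only one pair straddles the boundary.
print(responds_differentially(-40, -20))  # False: both fall in the ba range
print(responds_differentially(20, 40))    # True: the pair straddles the boundary
print(responds_differentially(60, 80))    # False: both fall in the pa range
```

The point of the comparison is that an equal physical difference between two stimuli evokes different responses only when the stimuli lie on opposite sides of the boundary.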
Results from other studies, however, have shown that nonhumans (e.g., chinchillas, monkeys,
Japanese quail and even rats) could be trained to respond to these phonemic categories in the same way as
human adults and children (e.g., Dooling, Best, & Brown, 1995; Kluender, Diehl, & Killeen, 1987; Kuhl,
1981; Kuhl & Miller, 1975, 1978; Kuhl & Padden, 1982, 1983; Reed, Howell, Sackin, Pizzimenti, &
Rosen, 2003; Toro, Trobalon, & Sebastián-Gallés, 2005). Moreover, categorical perception in adults was
limited to the phonemes in their respective native languages (Miyawaki et al., 1975), which suggests a
strong experiential component.
These studies with nonhuman animals forced a conclusion that ran counter to the prevailing views
that somehow infants were born responding to phonetic units, and that language evolved in humans
discontinuously with lower species; namely, that infants are born with a general capacity to discriminate
auditory stimuli including speech sounds rather than one that evolved specifically for speech. Thus,
domain-general, rather than species-specific mechanisms seem to be responsible for infants’ tendency to
respond to phonetic units (Kuhl, 2000). As Kuhl (1981) stated, “the evolution of the sound system of
language was influenced by the general auditory abilities of mammals” (p. 347). Or, as Bates, O’Connell,
and Shore (1987) put it:
We rushed too quickly to the conclusion that the speech perception abilities of the human infant are
based on innate mechanisms evolved especially for speech. The infant’s abilities do indeed seem to
involve a great deal of innately specified information processing. But we do not yet have firm
evidence that any of this innate machinery is speech specific. We assumed that the human auditory
system evolved to meet the demands of language; perhaps, instead, language evolved to meet the
demands of the mammalian auditory system. This lesson has to be kept in mind when we evaluate
other claims about the innate language acquisition device. (p. 154)
What Bates et al. mean is that the development of speech perception in infants is constrained by what
they are able to hear, and that the range of auditory stimuli detectable by the mammalian auditory system
evolved for reasons having nothing to do with speech.
Many language researchers now agree that the language environment exploits natural boundaries
of a general auditory capacity that is common to mammals and some birds, but also can modify them
(Diehl, Lotto, & Holt, 2004). Thus, although initial studies on categorical perception were meant to support
nativist theories of language development, such as Chomsky’s, further research supported the opposite
conclusion, namely, that categorical perception resulted largely from experience and learning. In studies
on categorical perception with both human infants and non-human animals, researchers successfully
trained discriminative responses to phonemic sounds using operant conditioning procedures. One can
assume, then, that similar contingencies operate naturally for human infants. The fact that adults respond
only to phonemes in their respective native languages also suggests that those phonemes have specific
functions that non-native phonemes do not, thus supporting the contention that operant learning is
responsible. But that is not how cognitive researchers see it.
Cognitive Views of Learning
According to Kuhl (2004), “The acquisition of language and speech seems deceptively simple”
(p. 831). By that she means that children learn their native language quickly and effortlessly. She
wonders, then, how children, but not language theorists, are able to crack “the speech code” so easily.
This is a little like asking how children began walking, negotiating the effects of gravity, before physicists
understood the law of gravity. Nonetheless, Kuhl believes that the last several years have produced “an
explosion of information about how infants tackle this task” . . . that would be surprising to and
“unpredicted by the main historical theorists” (p. 831, emphasis added). Specifically, “children learn
rapidly from exposure to language, in ways that are unique to humans, combining pattern detection and
computational abilities (often called STATISTICAL LEARNING) with special social skills” (p. 831,
emphasis added).
It may come as a surprise to behavior analysts (who, I believe, are the “main historical theorists”
Kuhl refers to) that after decades of uncritically adhering to Chomsky’s structural nativist view of
language, many language researchers and theorists have done an about-face and now tout learning as the
primary mechanism for language acquisition and speech perception (e.g., Kuhl, 2000). However,
according to these scholars, this “new kind of learning” is not “Skinnerian learning.”
Many modern language researchers now propose that certain aspects of language are experience-
dependent (vs. experience independent), and even refer to the experience as “learning,” but instead of
relying on empirically supported theories of (Pavlovian and operant) learning, they posit new forms of
learning that are mostly human-language-specific (e.g., Saffran, Aslin, & Newport, 1996). To their credit,
modern language researchers explicitly reject the assertion that similarities across languages (suggested
by Chomsky) reflect innate linguistic knowledge and, instead, now believe that learning can explain them.
But, oddly, these researchers attribute the similarities across languages not to already well-established
general learning principles, but rather to poorly researched constraints on learning even while
acknowledging that these learning mechanisms were not tailored solely for language (Saffran, 2003).
So, what has caused language researchers to conclude that experience and learning (though not
operant or Pavlovian learning) play a critical role in language acquisition and speech perception? What
has changed over the last two decades to support these so-called “new views of learning” is the discovery
that “by simply listening to language, infants acquire sophisticated information about its properties . . . ”
(Kuhl, 2000, p. 11852), a phenomenon also referred to as “incidental language learning” (Saffran et al.,
1997). Two examples of this “new kind of learning” are discriminative abilities in infants and so-called
statistical learning (Kuhl, 2000).
Discriminative Abilities in Infants

The first example that illustrates the so-called “new views of learning” is that infants detect
patterns or similarities in language input. Researchers cite evidence from a variety of studies, including
the finding that at birth infants generally prefer to listen to the language spoken by their mothers during
pregnancy and specifically to their mother’s voice over another woman’s voice, and to particular stories
read by their mothers during the last several weeks of pregnancy (DeCasper & Fifer, 1980; DeCasper &
Spence, 1986; Mehler et al., 1988; Moon et al., 1993; Nazzi et al., 1998). Other examples include the
finding that by 9 months of age, but not earlier, infants prefer to hear prosodic patterns characteristic of
their native language (Jusczyk, Cutler, & Redanz, 1993). Finally, infants have been shown to listen longer
to words in their native language than to words in another language (Jusczyk et al., 1993). According to
Kuhl (2000), “At this age, infants do not recognize the words themselves, but recognize the perceptual
patterns typical of words in their language” (p. 11852). Such a locution, however, simply describes the
research results and provides no explanation. And it is not clear how researchers know that infants
recognize patterns but not words.
Statistical Learning
The second example of the “new views of learning language” is that “infants exploit the statistical
properties of the input, enabling them to detect and use distributional and probabilistic information
contained in ambient language to identify higher-order units” (Kuhl, 2000, p. 11852, emphasis added).
This type of learning is called statistical learning by language researchers (e.g., McMurray & Hollich,
2009), and for Kuhl (and many other language researchers) it is the mechanism “responsible for the
developmental change in phonetic perception between the ages of 6 and 12 months” (2004, p. 833; see
also Maye, Werker, & Gerken, 2002). According to Kuhl (2000):
Running speech presents a problem for infants because, unlike written speech, there are no breaks
between words. New research shows that infants detect and exploit the statistical properties of the
language they hear to find word candidates in running speech before they know the meanings of
words. (p. 11852)
Researchers have demonstrated that by 6 months of age, infants, who shortly after birth could
universally be taught to discriminate between phonetic units, show a preference for the phonetic units of
their native language (Kuhl et al., 1992). In fact, researchers describe the changes in infant speech
perception as a reduction in the ability to discriminate speech sounds that are not found in one’s native
language. Because “the beginnings and ends of sequences (i.e., the segmentation) of sounds that form
words in a particular language are not marked by any consistent acoustic cues” (Aslin et al., 1998, p.
321), such as pauses, and because the acoustic structure of speech across different languages is highly
variable, researchers believe that a distributional rather than an acoustical analysis must be used to solve
the problem of finding the words in a particular language. A distributional analysis refers to the
regularities in the relative positions of sounds over a large sample of linguistic input (Aslin et al., 1998).
For example, in English, “certain combinations of two consonants are more likely to occur within words
whereas others occur at the juncture between words. Thus, the combination ‘ft’ is more common within
words whereas the combination ‘vt’ is more common between words” (Kuhl, 2000, p. 11853).
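What researchers mean by a distributional analysis can be made concrete with a small counting exercise. The sketch below tallies how often adjacent sound pairs occur inside words versus across word boundaries. The three utterances, the rough phoneme spellings in PHONES, and the function name are hypothetical and chosen only so that the 'ft' and 'vt' contrast quoted above shows up; nothing here is taken from the cited studies.

```python
from collections import Counter

# Hypothetical, simplified phoneme spellings for a handful of English words.
PHONES = {
    "left": ["l", "e", "f", "t"],
    "gift": ["g", "i", "f", "t"],
    "soft": ["s", "o", "f", "t"],
    "five": ["f", "ai", "v"],
    "have": ["h", "a", "v"],
    "of": ["o", "v"],
    "toys": ["t", "oi", "z"],
    "ten": ["t", "e", "n"],
    "two": ["t", "u"],
}

# Hypothetical toy corpus; each utterance is a list of words.
UTTERANCES = [
    ["left", "five", "toys"],
    ["gift", "of", "ten"],
    ["have", "two", "soft", "toys"],
]


def pair_counts(utterances, phones):
    """Count adjacent sound pairs within words and across word boundaries."""
    within, between = Counter(), Counter()
    for utterance in utterances:
        segments = [phones[word] for word in utterance]
        for segment in segments:
            # Pairs of neighboring sounds inside a single word.
            within.update(zip(segment, segment[1:]))
        for left, right in zip(segments, segments[1:]):
            # Last sound of one word followed by first sound of the next.
            between[(left[-1], right[0])] += 1
    return within, between


within, between = pair_counts(UTTERANCES, PHONES)
print("ft within words:", within[("f", "t")], "| between words:", between[("f", "t")])
print("vt within words:", within[("v", "t")], "| between words:", between[("v", "t")])
```

In this toy corpus 'ft' occurs only inside words and 'vt' only at word junctures. Note that the regularity is a property of the corpus, that is, of the speech input itself.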
According to many language researchers, human infants need to discover the phonemes and
words in a particular language. Because the speech they are exposed to is so variable and not marked by
reliable acoustic cues, researchers believe that “infants use computational strategies to detect the
statistical and prosodic patterns in language input” (Kuhl, 2004, p. 831). Other researchers are quick to
point out that infants are not consciously calculating statistical frequencies, but rather are sensitive to
distributional information contained in the linguistic input to which they are exposed (Aslin et al., 1998).
Notwithstanding this one disclaimer, most of these researchers seem to believe that the infants, or their
brains, are extracting statistical information from the linguistic input.
Based on the results of particular studies (e.g., Saffran et al., 1996), numerous researchers have
concluded that, “infants use statistical information to discover word boundaries” (Aslin et al., 1998, p.
321), or they learn “from exposure to the distributional patterns in language input” (Kuhl, 2004, p. 835).
However, such conclusions suffer from a number of logical and scientific problems.
Problems With a Cognitive Account of Learning
There are several logical and scientific problems with the cognitive account of speech perception
and learning. The first concerns the issue of agency; that is, who does what. Cognitive accounts wrongly place
the agency producing the effects inside the infant instead of in the linguistic environment, which cognitive
theorists clearly believe is critical for such learning. This is illustrated repeatedly in the literature in the
way researchers talk about language learning infants. For example, Saffran et al. (1996) state that, “One
task faced by all language learners is the segmentation of fluent speech into words” (p. 1927). According
to Kuhl (2004), “infants use computational strategies to detect the statistical and prosodic patterns in
language input, and that this leads to the discovery of phonemes and words” (p. 831). Thus, as stated by
these researchers, infants are faced with tasks, extract information, use or exploit strategies, abstract
patterns, discover rules, and so on. Sometimes the task is assigned to the (infant’s) brain, which is said to
be endowed with mechanisms that enable it “to extract the information carried by speech” and “use them
to discover abstract grammatical properties” (Mehler, Nespor, & Peña, 2008, p. 434). Either way, the
locus of control is said to be within the individual. In essence, many of these researchers believe that it is
the job of language learners to make sense of vague or complex information. This account is at odds with
a natural science approach, which looks for physical causes of behavior. To use an evolutionary analogy, it
would be akin to saying that the task for individual organisms (or their brains) is to exploit strategies in
order to discover the rules for how to survive in a complex environment. But ever since Darwin (and
Wallace), we know that the direction of causation is the other way. The environment selects extant traits
to the extent that on average those traits enable individuals possessing them to live long enough to pass on
their genes. A selectionist account of language learning suggests that only some responses of infants to
specific stimuli will produce desired consequences.
Another problem with cognitive accounts of learning is that sometimes the questions are so
confusing as to be essentially unanswerable. For example, Saffran (2003) asked what infants were
actually learning in a segmentation task: “Are they learning statistics? Or are they using statistics to learn
language?” (p. 112). The answer, of course, is neither. To understand what is wrong with these questions
and with the notion of statistical learning in general, we must make a distinction between the researchers’
behavior and that of the subjects in experiments or infants in natural environments. It is true that a
statistician or psychologist can statistically analyze conditional probabilities of certain sounds or
arrangements of sounds within a stream of speech. But infants (or their brains) are not carrying out a
statistical analysis based on the distributional patterns in the speech they hear any more than they are
calculating force, resistance, or gravity when they walk. The evidence that researchers cite is simply that
hearing speech changes the behavior of infants. In reality it is the researchers who are doing the statistical
analysis, not the infants. The principle of parsimony suggests that explanations of phenomena should
make the fewest assumptions. But modern language researchers make many unnecessary and essentially
untestable assumptions when they suggest that infants or their brains are statistically analyzing speech
sounds. Just because a researcher can carry out a statistical analysis of speech sounds does not mean that
is how infants learn from hearing the sounds.
In addition to conceptual and logical problems with a cognitive account of speech perception,
there are also methodological problems with some of the research. For example, the infants used in many
studies on speech perception were already at least 6 months of age, which means they had at least 6
months (excluding prenatal exposure) of hearing speech and countless interactions with a vocal
environment, which researchers acknowledge contributes to speech perception and language learning
(Kuhl, 2000, 2004). Moreover, many of the studies trained infants, however briefly, to make
discriminations or to show preferences, by reinforcing appropriate responses (although the authors rarely,
if ever, referred to their training as operant conditioning). In addition, the general research paradigm in most of
these studies is the hypothetico-deductive one still common in psychology, which does not and cannot
account for the variability from one infant to the next and therefore must pool (i.e., average) data. Such an
approach obscures variation rather than refining experimental control to account for it (Schlinger, 2004).
Finally, many of the very researchers who cite animal studies as evidence against a uniquely
human capacity for speech perception claim that statistical learning is uniquely human. But studies
employing operant contingencies have shown that even rats can be taught to perceive (i.e., discriminate)
the nuances of speech (e.g., Reed et al., 2003; Toro et al., 2005). These animal studies further suggest that
operant learning is a plausible explanation for how infants learn to discriminate speech sounds.
A Behavioral View of Speech Perception
As already stated, a behavioral view of speech perception stresses what an individual does when
we say that he or she perceives speech and implicates general learning principles in the acquisition and
maintenance of such behavior. In the infant laboratory, for example, the label speech perception is
reserved for the behaviors researchers measure as responses to speech sounds, such as changes in heart
rate and non-nutritive sucking. Outside the laboratory, the term speech perception is used to denote such
behaviors as turning one’s head in the direction of the speech, smiling, and making sounds. In
sophisticated listeners, speech perception refers to a much wider range of behaviors from complying with
requests to actually listening to what a speaker says, that is, subvocally echoing or otherwise talking to
oneself (see Schlinger, 2008a).
Over and above identifying the behaviors involved, “one of the most important issues in speech
perception is how listeners come to perceive sounds in a manner that is particular to their native
language” (Diehl et al., 2004, p. 164). As mentioned previously, newborn infants have been shown to
prefer to listen to the language spoken by their mothers and specifically to their mother’s voice. By
“prefer” researchers mean that infants will engage in behaviors that result in hearing the language spoken
by their mothers during pregnancy, their mother’s voice more than another woman’s voice, and specific
features of their mother’s voice over other features (DeCasper & Fifer, 1980; DeCasper & Spence, 1986;
Moon et al., 1993; Nazzi et al., 1998). Thus, at birth or shortly thereafter, particular features of the
infant’s native language in general and the mother’s voice in particular have become potent conditioned
reinforcers in the sense that when such stimuli are presented contingently on some infant behavior (e.g.,
particular sucking patterns), that behavior increases relative to behavior that does not produce those
features. These stimuli appear to become conditioned reinforcers (and perhaps acquire other behavioral
functions as well) simply as a result of the infant’s hearing them; that is, it is not clear that pairing with other reinforcing stimuli
is necessary. We do not need to appeal to a statistical analysis to explain these effects because as
mentioned previously, it is researchers who impose those statistical properties after the fact. We also do
not need to appeal to ad hoc cognitive processes, such as “perceptual representations of speech . . . stored
in memory” (Kuhl, 2000, p. 11854) because the evidence for such explanations is only the behavior to be
explained. The behavioral explanation is simply the most parsimonious.
Early linguistic exposure produces other effects as well. For example, exposure to a specific
language alters infants’ perception of specific speech sounds by 6 months of age (Kuhl et al., 1992).
Although the precise mechanism by which such effects are produced remains to be determined, it seems
as if, just as with the sound of the infant’s native language as well as the sound of the voices of significant
others, the sounds of specific phonemes acquire behavioral functions by virtue of general principles of
Pavlovian and operant learning. The role of general learning principles, however, is much easier to
demonstrate in the production of speech, which sometimes involves the very same behaviors as
perceiving speech.
A Behavioral View of Speech Production
How Do Infants Learn to Produce Speech?

Typically, a distinction is made between perceiving and producing speech. Elsewhere, however, I
have argued that what we normally speak of as listening is behaving (subvocally) (Schlinger, 2008a).
Equating listening (to speech or music) with perceiving is consistent with the thesis in the present article;
namely that when we say that someone perceives speech he or she is behaving in certain ways. Although
most infants naturally progress through a series of stages of vocal sounds, and although those sounds are
undoubtedly influenced by reinforcement, the role of operant learning becomes especially clear when
infants begin babbling and discrete changes in their vocal output can be more easily detected. Infant
babbling contains the prosodic characteristics of adult speech to which the infants have been exposed
(e.g., Levitt & Aydelott Utman, 1992; Whalen, Levitt & Wang, 1991). Some researchers (Kuhl &
Meltzoff, 1996) refer to this phenomenon as (vocal) imitation, although such a portrayal is not quite right.
Both anecdotal and experimental observations suggest that infants in the first year of life learn to
produce not just the intonation and prosody of the language that they hear but the sounds as well. In fact,
recent research suggests that even the “melody” of newborns’ cries is influenced by hearing the
prosodic features of their native language as early as the third trimester of pregnancy (Mampe, Friederici,
Christophe, & Wermke, 2009). Despite the admission that vocal learning depends on hearing the
vocalizations of others and of oneself, language researchers acknowledge that “little is known about the
processes by which change in infants’ vocalizations are induced” (Kuhl & Meltzoff, 1996, p. 2425). That
does not prevent these researchers, however, from offering ad hoc cognitive explanations. For example,
Kuhl and Meltzoff (1996) speculate that “infants listening to ambient language store perceptually derived
representations of the speech sounds they hear which in turn serve as targets for the production of speech
utterances” or that it is as though both adults and infants “have an internalized auditory-articulatory ‘map’
that specifies the relations between mouth movements and sound” (p. 2426). The problem with such
explanations, as I have repeatedly pointed out, is that they are not parsimonious; that is, they require many
untestable, unfalsifiable assumptions and are often just redundant descriptions of the observable evidence.
More parsimonious explanations can be gleaned by comparing vocal development in human infants to
that of songbirds.
Vocal Learning in Infants and Songbirds: The Role of Automatic Reinforcement
Researchers who investigate language development in humans and the development of songs in
certain species of birds have noted many parallels (Brainard & Doupe, 2002; Doupe & Kuhl, 1999; Kuhl,
2000; 2004). For one, as already mentioned, social contingencies of reinforcement play an important role
in vocal learning (Goldstein et al., 2003; 2009). Additionally, researchers agree that hearing the
vocalizations of others and of oneself is necessary for vocal development in infants (Kuhl & Meltzoff,
1996) and in songbirds (Brainard & Doupe, 2002; Doupe & Kuhl, 1999). Thus, in infants and in many
songbirds, immature vocal sounds are shaped into more mature sounds in large part by the feedback
produced by making sounds. Although few of these scholars directly mention reinforcement or operant
learning, many describe how “infants’ successive approximations of vowels would become more
accurate” due to the “acoustic consequences of their own articulatory acts” (Kuhl & Meltzoff, 1996, p.
2426); or how “during sensorimotor song learning, motor circuitry is gradually shaped by
performance-based feedback to produce an adaptively modified behaviour” (Brainard & Doupe, 2002, p. 355). Other
researchers describe how sounds emitted by infants and songbirds “are then gradually molded to resemble
adult vocalizations” (Doupe & Kuhl, 1999, p. 574). Finally, some researchers explicitly acknowledge a
selection process involved in early vocal production. For example, de Boysson-Bardies (1999) writes:
The vocal productions of children are thus modeled by selection processes. The phonetic forms and
intonation patterns specific to the language of the child’s environment are progressively retained at
the expense of forms that are not pertinent to the phonological system of this language. The process
begins at birth, if not before. However, the first effects on vocal performance are delayed, particularly
by the slow course of motor development. (p. 56)
It is amazing that otherwise good scholars fail to either know or acknowledge that the process they
are talking about is a form of selection by consequences called operant conditioning (see Skinner, 1981).
In particular, infants and songbirds start out with a repertoire of immature or unrefined sounds. When the
infants or songbirds hear themselves making sounds that match what they have heard from others, those
sounds are automatically strengthened (i.e., reinforced) in the sense that they occur with a greater
frequency relative to sounds that do not match what they have heard from others. In other words, the
parity achieved when produced sounds are closest to heard sounds automatically strengthens the produced
sounds (Palmer, 1996). According to some researchers, this vocal learning occurs relatively rapidly in
infants and songbirds and without much in the way of external reinforcement (Doupe & Kuhl, 1999, but
see Goldstein et al., 2003, 2009, emphasis added). But it does not occur in the absence of any
reinforcement. Such shaping takes place as a function of automatic reinforcement, that is, reinforcement
not mediated by another individual (see Vaughan & Michael, 1982). Automatic reinforcement (though
not by that name) in the form of parity between self-produced auditory feedback and the sounds heard
from others has also been recognized in song learning in birds (e.g., Konishi, 1965, 1985; Watanabe &
Aoki, 1998). For example, according to Konishi (1985), “A bird’s use of auditory feedback in song
development resembles learning by trial and error; the bird corrects errors in vocal output until it matches
the intended pattern” (p. 134). By “trial and error,” Konishi means operant learning.
Automatic reinforcement also plays an important role in early language acquisition (Schlinger,
1995, pp. 158-160; Smith, Michael, & Sundberg, 1996; Sundberg et al., 1996), and it parsimoniously
explains so-called learning without reinforcement. Automatic reinforcement is the elusive mechanism
responsible for vocal learning that many language researchers (e.g., Doupe & Kuhl, 1999) are searching
for, and it is right before their very eyes (or ears). Moreover, an automatic reinforcement hypothesis
requires very few assumptions, is consistent with known scientific principles and is, thus, a parsimonious
explanation.
The only wildcard is the process by which the vocal sounds one is exposed to in his or her
environment assume automatically reinforcing qualities. At this point it is not clear whether pairing with
other reinforcing stimuli is necessary to establish the sounds of speech or songs as reinforcers, or whether
simple exposure suffices, although the evidence suggests that mere exposure is sufficient. If so, then the
question becomes how much exposure is required during the so-called sensory or perceptual learning
phase. What are not helpful, however, are cognitive accounts that appeal to “perceptually derived
representations of the speech sounds” (Kuhl & Meltzoff, 1996, p. 2425). Such accounts are inferred
solely on the basis of behavioral observations and, moreover, are only redundant descriptions of those
observations. As explanations, then, they are circular in that the only evidence of the perceptually derived
representations is the fact that an individual’s vocal sounds that approximate those previously heard come
to predominate in their repertoire. As I have tried to show in this paper, a behavioral account is simpler, in
part because it is based on a foundation of experimentally derived principles, and, thus, does not need to
appeal to inferred, hypothetical entities.
Summary and Conclusions
In this article I have illustrated the cognitive approach to speech perception and language
acquisition by noting that although language researchers often incorporate operant or operant-like
methods in their research, their interpretations of the results are problematic. Specifically, cognitively
oriented language researchers often attribute the causes of speech perception and production to children
themselves, or to their brains rather than to the environmental features which the researchers manipulate
in their studies. Because this approach infers hypothetical constructs instead of testable physical events, it
does not adhere to the principle of parsimony. Moreover, inferring hypothetical constructs leads to logical
errors of reification and circular reasoning. I also attempted to show how research conducted by
cognitively oriented speech and language researchers can be parsimoniously interpreted according to the
experimentally established principles of behavior analysis. Furthermore, because behavior analysts
demonstrate environmental variables as causes of the behaviors, their approach is inherently consistent
with the principle of parsimony and is, thus, directly testable.
For SLPs, perhaps most importantly, cognitive approaches are not very helpful in suggesting
effective assessment or treatment procedures to remediate speech and language disorders. I conclude that
a functional, behavior-analytic approach serves SLPs better by offering both an experimentally based
analysis of speech and language and measurable and manipulable methods of treatment.
References
Aslin, R. N. (1987). Visual and auditory development in infancy. In J. D. Osofsky (Ed.) Handbook of
infant development (pp. 5-97). New York: Wiley.
Aslin, R. N., Saffran, J. R., & Newport, E. L. (1998). Computation of conditional probability
statistics by 8-month-old infants. Psychological Science, 9, 321-324.
Bates, E., O’Connell, B., & Shore, C. (1987). Language and communication in infancy. In J. D.
Osofsky (Ed.), Handbook of infant development (pp.149-203). New York: Wiley.
Camarata, S. (1993). The application of naturalistic conversation training to speech production in
children with speech disabilities. Journal of Applied Behavior Analysis, 26, 173-182.
Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.
Chomsky, N. (1959). Review of Verbal Behavior by B.F. Skinner, Language 35, 26-58.
DeCasper, A. J., & Fifer, W. P. (1980). Of human bonding: Newborns prefer their mothers’ voices.
Science, 208, 1174–1176.
DeCasper, A. J., & Spence, M. J. (1986). Prenatal maternal speech influences newborns'
perception of speech sounds. Infant Behavior and Development, 9, 133-150.
Diehl, R. L., Lotto, A. J., & Holt, L. L. (2004). Speech perception. Annual Review of Psychology, 55, 149–79.
DeLeon, I. G., Arnold, K. L., Rodriguez-Catter, V., & Uy, M. L. (2003). Covariation between bizarre and
nonbizarre speech as a function of the content of verbal attention. Journal of Applied Behavior
Analysis, 36, 101-104.
Dooling, R. J., Best, C. T., & Brown, S. D. (1995). Discrimination of synthetic full-formant and
sinewave /ra-la/ continua by budgerigars (Melopsittacus undulatus) and zebra finches (Taeniopygia
guttata). Journal of the Acoustical Society of America, 97, 1839–1846.
Eimas, P. D. (1975). Developmental studies in speech perception. In L. B. Cohen & P. Salapatek (Eds.),
Infant perception: Vol. 2. From sensation to cognition (pp. 193–231). New York: Academic.
Gallahue, D. L. (1989). Understanding motor development: Infants, children, adolescents (2nd ed.)
Indianapolis: Benchmark Press.
Gibson, E. J., & Spelke, E. S. (1983). The development of perception. In P. H. Mussen, J. H. Flavell, &
E. M. Markman (Eds.), Handbook of child psychology: Vol. 3. Cognitive development (4th ed., pp.
1-76). New York: Wiley.
Goldstein, M. H., King, A. P., & West, M. J. (2003). Social interaction shapes babbling: Testing parallels
between birdsong and speech. Proceedings of the National Academy of Sciences, 100(13), 8030-8035.
Goldstein, M. H., Schwade, J. A., & Bornstein, M. H. (2009). The value of vocalizing: Five-month-old
infants associate their own noncry vocalizations with responses from adults. Child Development,
80(3), 636-644.
Greenspoon, J. (1955). The reinforcing effect of two spoken sounds on the frequency of two responses.
American Journal of Psychology, 68, 409-416.
Guess, D., Sailor, W., Rutherford, G., & Baer, D. M. (1968). An experimental analysis of linguistic
development: The productive use of the plural morpheme. Journal of Applied Behavior Analysis,
1, 297-306.
Hart, B., & Risley, T. R. (1995). Meaningful differences in the everyday experiences of young American
children. Baltimore, MD: Paul Brookes.
Hart, B., & Risley, T. R. (1999). The social world of children: Learning to talk. Baltimore: Paul
Brookes.
Hegde, M. N. (2007). Treatment protocols for stuttering. San Diego, CA: Plural Publishing.
assessment and treatment. Boston, MA: Allyn and Bacon.
Holt, L. L., & Lotto, A. J. (2008). Speech perception within an auditory cognitive science framework.
Current Directions in Psychological Science, 17, 42-46.
Johnston, J. M., & Johnston, G. T. (1972). Modification of consonant speech-sound articulation in young
children. Journal of Applied Behavior Analysis, 5, 233-246.
Jusczyk, P. W., Cutler, A. & Redanz, N. J. (1993). Infants' preference for the predominant stress patterns
of English words. Child Development, 64, 675–687.
Jusczyk, P. W., Friederici, A. D., Wessels, J. M. I., Svenkerud, V. Y., & Jusczyk, A. M. (1993). Infant's
sensitivity to the sound patterns of native language words. Journal of Memory and Language, 32,
402-420.
Kluender, K. R., Diehl, R. L., & Killeen, P. R. (1987). Japanese Quail can form phonetic categories.
Science, 237, 1195-1197.

Kuhl, P. K. (1981). Discrimination of speech by nonhuman animals: Basic auditory sensitivities
conducive to the perception of speech-sound categories. Journal of the Acoustical Society of
America, 70, 340–349.
Kuhl, P. K. (2000). A new view of language acquisition. Proceedings of the National Academy of
Sciences, 97, 11850-11857.
Kuhl, P. K. (2004). Early language acquisition: Cracking the speech code. Nature Reviews
Neuroscience, 5, 831-843.
Kuhl, P. K., & Meltzoff, A. N. (1996). Infant vocalizations in response to speech: Vocal imitation and
developmental change. Journal of the Acoustical Society of America, 100, 2425-2438.
Kuhl, P. K., & Miller, J. D. (1975). Speech perception by the chinchilla: Voiced-voiceless distinction in
alveolar plosive consonants. Science, 190, 69–72.
Kuhl, P. K., & Miller, J. D. (1978). Speech perception by the chinchilla: Identification functions for
synthetic VOT stimuli. Journal of the Acoustical Society of America, 63, 905-917.
Kuhl, P. K., & Padden, D. M. (1982). Enhanced discriminability at the phonetic boundaries for the
voicing feature in macaques. Perception & Psychophysics, 32, 542-550.
Kuhl, P. K., & Padden, D. M. (1983). Enhanced discriminability at the phonetic boundaries for the place
feature in macaques. Journal of the Acoustical Society of America, 73, 1003-1010.
Kuhl, P. K., Williams, K. A., Lacerda, F., Stevens, K. N., & Lindblom, B. (1992). Linguistic experience
alters phonetic perception in infants by 6 months of age. Science, 255, 606-608.
Lancaster, B. M., LeBlanc, L.A., Carr, J. E., Brenske, S., Peet, M. M., & Culver, S. J. (2004). Functional
analysis and treatment of the bizarre speech of dually diagnosed adults. Journal of Applied
Behavior Analysis, 37, 395-399.
Lerman, D. C., Parten, M., Addison, L.R., Vorndran, C. M., Volkert, V. M., & Kodak, T. (2005). A
methodology for assessing the functions of emerging speech in children with developmental
disabilities. Journal of Applied Behavior Analysis, 38, 303–316.
MacCorquodale, K. (1970). On Chomsky's review of Skinner's Verbal Behavior. Journal of the
Experimental Analysis of Behavior, 13, 83–99.
Mampe, B., Friederici, A. D., Christophe, A., & Wermke, K. (2009). Newborns' cry melody is shaped by
their native language. Current Biology, 20,
Maye, J., Werker, J. F. & Gerken, L. (2002). Infant sensitivity to distributional information can affect
phonetic discrimination. Cognition, 82, B101–B111.
McMurray, B., & Hollich, G. (2009). Core computational principles of language acquisition: Can
statistical learning do the job? Introduction to Special Section. Developmental Science, 12, 365-
368.
Mehler, J., Nespor, M., & Peña, M. (2008). What infants know and what they have to learn about
language. European Review, 16, 429–444.
Mehler, J., Jusczyk, P., Lambertz, G., Halsted, N., Bertoncini, J. & Amiel-Tison, C. (1988). A precursor
of language acquisition in young infants. Cognition, 29, 143–178.
Miyawaki, K., Strange, W., Verbrugge, R., Liberman, A. M., & Jenkins, J. J. (1975). An effect of
linguistic experience: The discrimination of [r] and [l] by native speakers of Japanese and
English. Perception and Psychophysics, 18, 331-340.
Moerk, E. L. (1978). Determiners and consequences of verbal behaviors of young children and their
mothers. Developmental Psychology, 14, 537-545.
Moerk, E. L. (1983). The mother of Eve—As a first language teacher. Norwood, NJ: Ablex.
Moerk, E. L. (1992). First language: Taught and learned. Baltimore: Paul H. Brookes.
Moon, C., Cooper, R. P., & Fifer, W. P. (1993). Two-day-olds prefer their native language. Infant
Behavior and Development, 16, 495-500.
Morris, E. K., Lazo, J. F., & Smith, N. G. (2004). Whether, when, and why Skinner published on
biological participation. The Behavior Analyst, 27, 153-169.
Nazzi, T., Bertoncini, J., & Mehler, J. (1998). Language discrimination by newborns: towards an
understanding of the role of rhythm. Journal of Experimental Psychology: Human Perception
and Performance, 24, 756-66.
Reed, P., Howell, P., Sackin, S., Pizzimenti, L., & Rosen, S. (2003). Speech perception in rats: Use of
duration and rise time cues in labeling of affricate/fricative sounds. Journal of the
Palmer, D. C. (1996). Achieving parity: The role of automatic reinforcement. Journal of the Experimental
Analysis of Behavior, 65, 289–290.
Palmer, D. C. (2006). On Chomsky's appraisal of Skinner's Verbal Behavior: A half-century of
misunderstanding. The Behavior Analyst, 29, 253-267.
Pena-Brooks, A., & Hegde, M. N. (2007). Assessment and treatment of articulation and phonological
disorders in children (2nd ed.). Austin, TX: Pro-Ed.
Rheingold, H. L., Gewirtz, J. L., & Ross, H. W. (1959). Social conditioning of vocalizations. Journal of
Comparative and Physiological Psychology, 52, 68-73.
Rosenfeld, H. M., & Baer, D. M. (1970). Unbiased and unnoticed verbal conditioning: The double-agent
robot procedure. Journal of the Experimental Analysis of Behavior, 14, 99-107.
Saffran, J. R. (2003). Statistical language learning: Mechanisms and constraints. Current Directions in
Psychological Science, 12, 110-114.
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science,
274, 1926-1928.
Saffran, J. R., Newport, E. L., Aslin, R. N., Tunick, R. A., & Barrueco, S. (1997). Incidental language
learning: Listening (and learning) out of the corner of your ear. Psychological Science, 8, 101-
105.
Sautter, R. A., & LeBlanc, L. (2006). Empirical applications of Skinner’s analysis of verbal behavior with
humans. The Analysis of Verbal Behavior, 22, 35–48.
Schlinger, H. D. (1995). A behavior-analytic view of child development. New York: Plenum.
Schlinger, H. D. (2004). Why psychology hasn’t kept its promises. Journal of Mind and Behavior, 25(2),
123-144.
Schlinger, H. D. (2008a). Listening is behaving verbally. The Behavior Analyst, 31, 145-161.
Schlinger, H. D. (2008b). The long goodbye: Why B. F. Skinner’s Verbal Behavior is alive and well on
the 50th anniversary of its publication. Psychological Record, 58, 329-337.
Skinner, B. F. (1957). Verbal Behavior. New York: Appleton-Century-Crofts.
Skinner, B. F. (1981). Selection by consequences. Science, 213, 501-504.
Todd, G. A., & Palmer, B. (1968). Social reinforcement of infant babbling. Child Development, 39, 591-
596.
Toro, J. M., Trobalon, J. B., & Sebastián-Gallés, N. (2005). Effects of backward speech and speaker
variability in language discrimination by rats. Journal of Experimental Psychology: Animal
Behavior Processes, 31, 95-100.
Wagaman, J. R., Miltenberger, R. G., & Arndorfer, R. E. (1993). Analysis of a simplified treatment for
stuttering in children. Journal of Applied Behavior Analysis, 26, 53-61.
Whitehurst, G. J. (1972). Production of novel and grammatical utterances by young children. Journal of
Experimental Child Psychology, 13, 502-515.
Whitehurst, G. J., Novak, G., & Zorn, G. A. (1972). Delayed speech studied in the home. Developmental
Psychology, 7, 169-177.
Whitehurst, G. J., Ironsmith, M., & Goldfein, M. (1974). Selective imitation of the passive construction
through modeling. Journal of Experimental Child Psychology, 17, 288-302.
Author Note
I am grateful to Julie Riggott for her keen editorial eye and to Giri Hegde for his helpful comments on an
earlier version of this manuscript.
Henry D. Schlinger, Jr.

Department of Psychology
California State University
Los Angeles, CA 90032-8227
E-mail: hschlin@calstatela.edu
Speech and language assessment: A verbal behavior analysis
Barbara E. Esch, Kate B. LaLonde, and John W. Esch
Abstract
Speech-language assessments typically describe deficits according to form (topography), without
identifying the environmental variables responsible for the occurrence (function) of a particular utterance.
We analyze a database of 28 standardized speech-language assessments according to six response classes
including five of Skinner’s (1957) verbal operants. We discuss the importance of including a functional
analysis of speech-language skills to better inform treatment planning and target selection.
Recommendations for future research are included.
Keywords: speech-language, assessment, verbal behavior, functional analysis, verbal operant
____________________________________________________________________________________
Introduction
As practitioners concerned with treating speech-language disorders, one of our primary goals is to
accurately and efficiently determine which communication skills should be targeted for intervention. How
do we know when something needs to be taught? What defines a skill deficit or a communication
breakdown? In everyday terms, a speech-language problem is signaled when a breakdown occurs in the
interaction between a speaker and a listener. That is, we say that communication is successful when the
outcome of an interaction is effective (i.e., functional), but when the interaction is weak and ineffective,
we suspect a deficit in the repertoire of one of the communication partners. Thus, the critical aspect that
defines communicative competence lies in the success of the dyad, a dynamic process comprised of
functional units of discourse between a speaker and a listener, even when these roles are assumed within a
single individual (e.g., Lodhi & Greer, 1989; Palmer, 1998; Skinner, 1957).
Despite the fundamentally social nature of communication, assessment tools for speech-language
deficits rarely take into account this requisite speaker-listener unit, nor is it routine to test for, describe, or
analyze specific breakdowns in this unit. Most speech-language assessments in widespread use today
evaluate response topographies (forms of responses) alone, without regard for a functional analysis of the
causal variables that lead to the specific topographic features of responses. Indeed, much assessment time
and energy is expended in classifying speech-language performance, not by its role within a unit of
functional communication between a speaker and a listener (i.e., cause and effect), but instead only by its
arbitrarily-labeled categories describing non-function based properties such as word structure (e.g., nouns,
verbs, plurals), modality (expressive, receptive), relationship (e.g., antonyms/synonyms, agreement), or
other inferred characteristics (e.g., ellipsis, nomination, phonological process). This focus is illustrated by
ASHA’s (1993) definition of language disorder as an impairment in “comprehension and/or use of
spoken, written, and/or other symbol systems. The disorder may involve (1) the form of language
(phonology, morphology, and syntax), (2) the content of language (semantics), and/or (3) the function of
language in communication (pragmatics) in any combination.” Although function is an element of this
definition, this usage of the term refers to a linguistic feature of language (pragmatics) in contrast to
Skinner’s analysis of function in which environmental variables describe (and thus, define) the contingent
relation that accounts for each particular instance of an utterance (i.e., language). As such, linguistic
descriptions are less adequate for applied work (i.e., treatments) than is Skinner’s model, which specifies
the variables that evoke and strengthen verbal behavior.
To be sure, a thorough topographic description of an individual’s speech-language repertoire may
be a necessary component to plan an appropriate therapy program, but it is insufficient to accomplish the
task because a key element of the evaluation is missing. Our job during assessment is to document not
merely occurrences of wrong responses to assessment items, but also the speaker-listener environment
(antecedent and consequent variables) in which the topography occurs. If a functional analysis of the
speaker-listener exchange is omitted from the assessment, a critical part of language learning is at risk of
being excluded from an effective intervention plan (Damico, 1993; Frost & Bondy, 2006; LaRue, Weiss,
& Cable, 2008; Rowland & Schweigert, 1993; Spradlin & Siegel, 1982; Sundberg, 2008).
Meaning defined by environmental context. The meaning of verbal behaviors is a function of their
controlling variables (Hegde, 2008; Skinner, 1957). Speakers and listeners do not “make mistakes,” “use
the wrong word,” or “fail to generalize” in the ordinary sense. A response does not occur in a vacuum,
without its controlling variables or variable (Austin, 1975; Bates, 1976; Catania, 2006; Schlinger, 1995;
Searle, 1969) and attempting to catalog responses without this information prevents our understanding of
what a particular response “means.” It is the analysis of a response form within a context defined by
antecedent and consequent variables that allows us to determine whether the response is correct or not.
For example, the cause-effect context in which a thirsty person asks for water please is different from that
in which he or she is not thirsty but nevertheless emits water in responding to a teacher’s instructions to
repeat after me: “water.” The point is the same regardless of topographies; saying water in New York or
agua in Costa Rica or Wasser in Germany does not “mean” the same thing when one wants water as it
does when one is responding to say “water” or repita “agua” or bitte wiederhole “Wasser.”
Topography is interesting only in terms of the functional context in which it occurs. The point
applies whether considering a single topography (e.g., water, agua, Wasser) or equivalent forms
(synonyms). Whether assessing or treating speech-language skills, a knowledgeable clinician will
recognize that the conditions that evoke pickle and cucumber are not at all the same as those stimuli that
evoke pickle and predicament. It is not the words that mean the same thing; antecedent and consequent
relations (e.g., request vs. repeat contingencies) are what explain the occurrence of these forms. That is,
forms may be interchangeable only to the extent that they share the same controlling variables. Thus,
“meaning” is topography within a contingent relation of controlling variables and it is this contingent
arrangement that establishes function (i.e., meaning).
Without assessing the controlling variables (motivation, discriminative stimuli, consequent
stimuli) that evoke and strengthen or weaken speech-language responses, we may fail to identify
appropriate functional (cause-effect) relations by which defective forms (e.g., grammatical errors) of a
disorder should be remediated. Evaluations that result in effective intervention plans include an
examination of the reasons (controlling variables) that an individual’s verbal environment would occasion
or maintain particular speech-language topographies (right or wrong) in the first place. We must account
for these occurrences by determining the conditions that evoke and maintain them, to adequately
prescribe a treatment program that will eliminate, modify, or otherwise resolve these errors.
In sum, a complete speech-language account (Skinner, 1957) would describe not only the form of
a speaker’s response but it would also explain the function of interactions between a speaker and a
listener, resulting in a detailed description of response errors in terms of their topographies (specific
words) and the environmental contexts (antecedent/consequent stimuli) in which those topographical
errors occur. This would provide both the description (topography) and the explanation (function) for any
given response. Such an account is essential for planning and carrying out effective interventions, whether
they involve simple or complex treatments. Without such information, we risk embarking on an
incomplete or poorly articulated treatment program that produces or maintains errors (i.e., poor stimulus
control over correct responses), resulting in gaps (e.g., splinter skills) in the overall verbal repertoire (see
Baker, LeBlanc, & Raetz, 2008; Greer & Ross, 2008).
Treatment Efficacy
A perplexing discrepancy currently exists with respect to assessment and treatment of speech-
language disorders. On the one hand, standardized assessment tools that dominate in the field of speech-
language pathology are based on, and result in, a linguistic description of speech-language, yet, at best,
these assessments can only weakly inform treatment because a linguistic approach to treatment does not
exist (Hegde & Maul, 2006). It is true that not all speech pathologists rely solely on standardized tools to
inform their treatments. However, whether they use standardized tests alone or they supplement them
with other information (e.g., language samples), the analysis of skills for the purpose of diagnosis and
treatment planning is linguistically based. This is handicapping because, despite linguistic information
from the assessment, the therapist lacks the functional analysis of verbal behavior needed to effect
behavior change, which is the sole aim of therapy. Moreover, he or she must look elsewhere (i.e., applied
behavior analysis) for effective teaching tools (e.g., Cooper, Heron, & Heward, 2007; Hegde, 1998;
Miltenberger, 2001) and formats (e.g., Lovaas & Smith, 2003) that can support clinical intervention. By
contrast, a functional (behavioral) approach to speech-language has already been described for both
assessment (e.g., Carr & Durand, 1985; Duker, 1999; Frost & Bondy, 2002; Greer & Ross, 2008; Hart &
Rogers-Warren, 1978; Lerman et al., 2005; Spradlin, 1963; Sundberg, 2008; Sundberg & Partington,
1998) and for treatment (see Hegde, 1998, Ogletree & Oren, 2001, and Sautter & LeBlanc, 2006 for
reviews). Despite this, it is only speech-language treatment that seems to have been influenced by
behavior analysis and its technology (e.g., Bourgeois, 1992; Kouri, 2005; Rvachew, 1994) whereas
assessment of these disorders remains firmly linguistically based on tools (see Directory, American
Speech-Language-Hearing Association, ASHA, 2009) that do not include or provide for an analysis of
environmental variables that control the speech-language performances assessed.
It is perhaps this problem that is referred to by proponents of informal (i.e., criterion-referenced)
assessments (Notari & Bricker, 1990; Romanczyk, Lockshin, & Matey, 2001) for children with a
diagnosis of Autism Spectrum Disorder (ASD). These advocates argue that, for this population at least,
standardized assessments typically do not identify appropriate curricular targets. Although focused on the
needs of individuals with ASD, these and other discussions (National Autism Center, 2009) emphasize
the issue of treatment efficacy for all individuals receiving speech-language intervention and the need to
administer assessments that are comprehensive enough to inform treatment.
The Purpose of Assessment
Speech-language assessment is conducted for many reasons. It can provide diagnostic labels (e.g.,
specific language impairment, apraxia of speech, aphasia) and help determine therapy progress. It can
also support documentation required by agencies, such as performance comparisons (i.e., norm-
referenced data) for Individualized Educational Plans in schools, and status updates for reimbursement
purposes in medical and clinical settings. But by far, one of the most important purposes of an assessment
tool is to provide adequate information to plan an effective intervention that fits into a sequenced
curriculum of skills. As mentioned earlier, most standardized assessment tools used by SLPs are based
theoretically on a linguistic analysis of language for which no corresponding treatment methods are
available. This “conceptual inconsistency” (Hegde & Maul, 2006) results from several historical
influences on the development of the profession’s theoretical base and may explain the prominence
(Novak & Pelaez, 2004) of diagnostic labels (e.g., apraxia, auditory processing disorder) in terms of
hypothetical constructs in lieu of function-based explanations of behavior. Duchan (2008) traces the
current conceptual perspective in speech pathology from an emphasis on psychological processing (1945
to 1965) to linguistics (1965 to 1975) and, finally, to pragmatics (1975 to 2000) at which time “we
reconsidered and reframed language in light of its communicative, linguistic, cultural, and everyday-life
contexts” (p. 2). It is unclear what is meant by “everyday-life contexts,” but a functional (cause-effect)
analysis of language may be the goal. Much of what is described in this historical review hints at the need
to address behavioral function (see also Prizant & Duchan, 1981) and there is a tangential nod to behavior
analysis evident in Duchan’s program descriptions that include sabotage techniques (i.e., motivating
operations; Laraway, Snycerski, Michael, & Poling, 2003) and response intents (i.e., mand, tact,
intraverbal; Skinner, 1957). Despite this, the descriptive focus, including that widely available (e.g.,
Pinker, 1994) to general consumers interested in language development, remains clearly non-behavioral
(e.g., psycholinguistic skills, linguistic relationships). What has evolved, and permeates the field of
speech pathology, appears to be largely a non-behavioral view of language learning in which a functional
analysis for many professionals may not mean a causal, explanatory analysis of verbal behavior in terms
of the environmental stimuli that evoke and maintain it but, rather, may resonate more as a description of
the “use of language.” This can impede prescription and remediation efforts by failing to provide a full
account of speech-language performance: speaker-listener interactions comprised of not only
topographic/structural descriptions but also of functional (i.e., causal) explanations for the occurrence of
those topographies.
Challenges to Resolve
A number of issues present both assessment and clinical application challenges for speech pathologists
and others responsible for teaching speech-language skills. We propose that solutions are available to help
resolve these issues by applying a behavioral analysis to the assessment process initially and, later,
throughout treatment. Our discussion of these concerns follows.
1. Receptive-Expressive Dichotomy
Speech-language and its assessment are typically described as consisting of two categories,
receptive and expressive. Accordingly, treatment plans are likely to channel the therapeutic focus
into this same dichotomy. As a result, speaker and listener repertoires may be regarded as simply
two halves of a common cognitive process in which words are “understood” in one modality and
“used” in another. Instead of considering language as performance (i.e., behavior), this traditional
view of language implies that a language entity exists structurally as a type of cognitive holding
tank from which appropriate responses (i.e., “meaning”) are chosen to fit a particular
communicative situation. The notion is that speakers toggle between selecting a word and using
it. It is significant, however, that we do not appeal to a similar cognitive account to explain
nonverbal behaviors, such as scratching an itch or scrubbing a pot. No one would assert that,
when the mosquito bites, we select a scratch from a mental reservoir of available muscle actions.
We would be satisfied to contend that the itchiness occurred, we scratched it, and the itch went
away.
In contrast to linguistic explanations of language, a behavioral view posits that we would
not “use” a word, water for example, any more than we would “use a reach” (Skinner, 1957, p. 7)
to obtain the water itself. Instead, antecedent and consequent conditions related to water are
sufficient to evoke either response, whether a nonverbal reach for water or a verbal water (Hegde
& Maul, 2006). Nothing is gained by inserting a hypothetical construct (receptive or expressive
“use”) into an explanation of why the response occurred. We still have to account for each
instance of the proposed use (Sundberg & Michael, 2001). This requires identifying the response
of interest as part of a unit of motivational variables, prompts, instructions, and consequences.
Instead of residing at-the-ready in a sort of cognitive container, speech-language skills are more
usefully characterized as different repertoires based on separate functional relations between
antecedent and consequent conditions (Hegde & Maul, 2006; Schlinger, 1995; Sundberg &
Michael, 2001).
Appealing to hypothetical constructs to explain instances of verbal behavior can obscure
a clinician’s efforts to pinpoint errors during assessment and to target a coordinated sequence of
skills for remediation. Consider a situation in which a child does well on a receptive test of verb
tense but fails verb items on an expressive test (e.g., CELF-4; Semel, Wiig, & Secord, 2003). Is
the problem with the speaker repertoire (expressive) in general or with verb tense specifically?
Should treatment consist of repeating verb tense forms while looking at pictures (e.g., the boy is
running) or should it provide practice in completing sentences (e.g., Bob is walking but Reggie is
. . .), with pictures or without? What if the learner can label pictures with progressive verb forms
(e.g., TWF-2; German, 2000), but cannot complete sentences with correct verb forms, or changes
verb tense when asked to repeat sentences (e.g., CELF-4; also CELF-P, Wiig, Secord, & Semel,
2004; TOLD-P:3, Newcomer & Hammill, 1988), a task that essentially tests echoic skills? Is this
a problem of verb tense, sentence completion, or poor repetition (i.e., echoic)? What about the
learner who can say rhyming words but cannot point to them (e.g., PLS-4; Zimmerman, Steiner,
& Pond, 2002)? Is the problem receptive or does it indicate a poor (possibly covert) echoic
repertoire (i.e., “expressive”)? How are we to interpret results of a test that shows a child can
point to a puppy in response to which one is little but cannot tell you the opposite of big? Should
you work on adjectives, opposites, or general expressive skills? These situations exemplify the
difficulty in determining intervention targets from assessments where skills are not explained
functionally (i.e., by their controlling variables) but, instead, they are defined linguistically and
categorized topographically as either receptive or expressive.
2. Mismatch Between Assessment Focus and Real-World Contingencies
Most speech-language tests in wide use today are standardized instruments (ASHA,
2009) that provide information about skills solely according to linguistic parameters, described
earlier as topographic responses. However, the speech-language behavior emitted by an
individual does not exist in a topography-only sense, absent its effect on a listener (Skinner,
1957) and, in the real world, topographic errors (thoup for soup) are disregarded (Hart & Rogers-
Warren, 1978) unless their form is too deviant (e.g., my doggy runded away). Topographies
become functional entities (i.e., meaningful) only when they occur in a dynamic environment
consisting of at least one speaker and one listener. We cannot know what a speaker means if we
hear him or her say shoe merely on the basis of the topography (word) itself. We need access to
the speaker’s reasons, a description of the conditions that evoked such a response (Hegde, 2008).
Functional speech-language behavior is evoked and strengthened in a unit in which antecedent
and consequent stimuli occur in temporal proximity to an instance of a speaker’s topographic
behavior and combine to become functional communication (see Sautter & LeBlanc, 2006).
Therefore, its description, to be useful for treatment planning, must involve more than just a
description of topography. Instead, we need to describe speech-language behavior more
functionally (e.g., Baker et al., 2008; Greer & Ross, 2008; Koegel & Koegel, 1995) with resulting
evaluation tools (e.g., Sundberg, 2008; Partington & Sundberg, 1998) that take this functional
unit into account.
3. Treatment Interference Due to Problem Behavior
We have often heard the sentiment expressed by clinicians and others that “I can’t work
with this person until his (or her) behavior is fixed.” It is true that interfering behavior is a
problem, yet it need not preclude our assessment and teaching efforts. A good first step is to ask
“if he were speaking English (or any language) right now, instead of crying, hitting, running
away, what would he be saying?”
Through functional analysis (Iwata, Dorsey, Slifer, Bauman, & Richman, 1994), it is
possible to identify and address weak speech-language repertoires that are functioning as problem
behavior. Functions have been identified that indicate problem behavior, although not
recognizable as true language in form, is indeed functioning as language to gain access to (i.e.,
request) attention, tangibles, or escape from task demands (e.g., Dwyer-Moore & Dixon, 2007;
Kodak, Northup, & Kelley, 2007).
For learners with weak communication skills disguised as problem behavior, listener
skills are often the initial focus of therapy (i.e., compliance training) because these skills were the
weakest (and thus most salient) during assessment. Although listener skills are critically
important in the overall speech-language repertoire, focusing initial treatment on those skills may
be unproductive for learners with interfering behavior problems. From a functional standpoint,
this is because the consequences for listener responses do not directly benefit the speaker
(Skinner, 1957). Learners who already find little to compel them to engage in treatment are
unlikely to be motivated by generalized social reinforcers (i.e., praise) when they can emit easier
responses (e.g., hitting) that readily produce consequences of greater value to them. For the
learner with a history of failure for speech-language attempts, mand (i.e., request) assessment and
training is a good first choice (Esch, 2009; Koegel & Koegel, 1995) because the consequences
that maintain mand behavior are specific and are of direct benefit (i.e., you get what you ask for).
The key issue is to train responses that are equivalent in function (e.g., access to attention) yet
are more socially acceptable in form (e.g., asking instead of hitting).
Typically developing children develop a strong repertoire of mands before other verbal
operants (Bijou & Baer, 1965; Novak, 1996) and, like any other learner, when this skill set is
defective, it is not unusual to see problem behaviors arise that fill the functional vacuum.
Therefore, the task of assessment is to identify not only an inappropriate response form but also its
function. Without determining function, eliminating an offensive form alone is unlikely to
succeed. Through assessment of verbal functions, the therapist can identify appropriate mands to
teach in order to provide the learner, child or adult, with speech-language responses that are
adaptive in the natural environment, regardless of diagnosis (e.g., ASD, traumatic brain injury),
disability label (e.g., developmental language impairment, aphasia, apraxia of speech), or
educational setting (e.g., home, school, hospital, clinic).
4. Identifying and Sequencing Intervention Targets
Assessment should lead to a plan for intervention, a prescriptive list of targets to be
acquired (LeBlanc, Dillon, & Sautter, 2009). When assessments identify deficits in non-
functional, topographic terms alone (e.g., derivational adjectives, inflection verbs), it can be
difficult to pinpoint specific speech-language responses that would be manageable therapy targets
or to determine how they fit together as part of a competent verbal repertoire. What should we
teach first – nouns, opposites, plurals, or colors? Should we work to resolve word-finding
problems before number repetition or relational vocabulary? Because none of us has access to a
learner’s perceptions or cognitions (Schlinger, 1995; see also, Schlinger, this issue), targets
identified in linguistic terms are not easily modifiable until they are re-interpreted as a
measurable, observable set of responses, defined as part of a functional verbal unit comprised of
antecedent and consequent stimuli. Given these more concrete criteria, it is easy to see how
topographic descriptions alone do not resolve our diagnostic task.
Functions of verbal behavior. No doubt most readers of this journal are familiar with
Skinner’s (1957) analysis of verbal behavior, which provides a useful theoretical framework for
assessing, and thus treating, speech-language behavior in terms of the environmental variables
that control verbal responses (see also Greer & Ross, 2008; Hegde, this issue; Sundberg, 2008;
Sundberg & Partington, 1998). Table 1 presents five of these verbal operants that are most
relevant to our discussion. In brief, consider the conditions under which we might emit the
response cookie. When hungry, we might ask for cookie. We could say cookie! in response to
seeing, smelling, or tasting one even if we are not hungry. Given the instruction say ‘cookie’, we
may emit the required repetition. Also, we could likely respond cookie to one of many verbal
stimuli related to the topic of cookies (e.g., what did your mom bake, what does c-o-o-k-i-e spell).
Finally, we might read cookie if we saw it written on the Keebler box. The foregoing examples
are identified as mand, tact, echoic, intraverbal, and textual operants, respectively, and, in each
instance, the form of the response is the same, yet the environmental conditions
(antecedent/consequent stimuli) in which each response would likely be emitted are not at all
equivalent. When assessments provide this level of speech-language information, a more
effective intervention plan can be designed, one that addresses not only response topographies but
response function as well, thus ensuring a more integrated language learning experience for those
we teach.
Table 1. Descriptions of five elementary verbal operants (Skinner, 1957)

Verbal operant | Antecedent events that evoke the operant | Response | Consequent events that strengthen the operant
Mand | Motivating conditions (e.g., wants toy airplane) | Asking (e.g., Airplane) | Specified by the mand (e.g., Gets toy airplane)
Echoic | Verbal stimulus (vocal) (e.g., “Say ‘airplane’”) | Repeating (e.g., Airplane) | Generalized social reinforcers (e.g., “Right!”)
Tact | Nonverbal stimulus (e.g., Airplane flies overhead) | Labeling (e.g., Look Mommy, Airplane!) | Generalized social reinforcers (e.g., Mom: “Wow! That’s really big!”)
Intraverbal | Verbal stimulus (any) (e.g., “Did you arrive by train?”) | Conversation (e.g., No, airplane) | Generalized social reinforcers (e.g., “Oh, how was the flight?”)
Textual | Verbal stimulus (textual) (e.g., Word: AIRPLANE) | Reading (e.g., Airplane) | Generalized social reinforcers (e.g., “Good reading!”)
NOTE: Functions that may involve complex language behavior (e.g., problem solving, remembering, joint control,
emergent relations) are outside the scope of this paper. Readers interested in these topics are referred to Donahoe
and Palmer (1994), Lowenkron (2006), or Rehfeldt and Barnes-Holmes (2009).
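The relations in Table 1 can also be read as a small lookup structure. The following is a minimal sketch, not an assessment instrument: the names Operant, Observation, TABLE_1, and classify are hypothetical, and the classification rests on a simplifying assumption that an observer has already coded the antecedent class for each response. It ignores consequences and the point-to-point correspondence that distinguishes an echoic from an intraverbal response. Its only purpose is to show that a single topography (cookie) falls into five different operant classes once its antecedent is recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operant:
    name: str          # operant label from Table 1
    antecedent: str    # class of antecedent event that evokes the operant
    response: str      # everyday description of the response
    consequence: str   # class of consequence that strengthens the operant


# Rows transcribed from Table 1 (Skinner, 1957).
TABLE_1 = (
    Operant("mand", "motivating conditions", "asking", "specified by the mand"),
    Operant("echoic", "verbal stimulus (vocal)", "repeating", "generalized social reinforcers"),
    Operant("tact", "nonverbal stimulus", "labeling", "generalized social reinforcers"),
    Operant("intraverbal", "verbal stimulus (any)", "conversation", "generalized social reinforcers"),
    Operant("textual", "verbal stimulus (textual)", "reading", "generalized social reinforcers"),
)


@dataclass(frozen=True)
class Observation:
    topography: str    # what the speaker said
    antecedent: str    # antecedent class coded by a (hypothetical) observer


def classify(obs: Observation) -> str:
    """Return the Table 1 operant whose antecedent class matches the observation."""
    for operant in TABLE_1:
        if operant.antecedent == obs.antecedent:
            return operant.name
    return "unclassified"


# The cookie example: one topography recorded under five antecedent conditions.
cookie_observations = [
    Observation("cookie", antecedent)
    for antecedent in (
        "motivating conditions",      # hungry, asks for a cookie
        "nonverbal stimulus",         # sees, smells, or tastes a cookie
        "verbal stimulus (vocal)",    # told "say 'cookie'"
        "verbal stimulus (any)",      # asked "what did your mom bake?"
        "verbal stimulus (textual)",  # sees the printed word
    )
]
labels = [classify(obs) for obs in cookie_observations]
print(labels)
```

The topography field plays no part in the classification, which is the point of the cookie example: form alone does not identify function.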
Sequential targets. Assessments need to do more than just identify what needs to be
taught. Intervention targets also need to be sequenced in such a way that the learner’s new
communication skills achieve success in his or her verbal community as quickly as possible
(Greer & Ross, 2008). Teaching targets sequenced according to a functional analysis of verbal
behavior may be more efficient than following traditionally defined sequences (i.e., receptive
before expressive) (Miguel & Petursdottir, 2009). For example, Williams and Greer (1993)
demonstrated that, when targets were defined in terms of their verbal function, children learned
functional and spontaneous speech, whereas, when linguistic targets were taught, the children
learned fewer forms and functions. This study shows that when the variables that control a
speech-language target response are identified, they can be used, modified, or otherwise brought
to bear on the response of interest to help a therapist effect change in the learner’s verbal behavior
so that the learner ultimately becomes a more competent speaker. As we shall see in the next section (see Error
Analysis below), this is a powerful tool for therapists.
Sometimes the controlling variables for certain intervention targets are inside the
learner’s body and thus they are inaccessible to the clinician. Response targets like these, often
called feelings (e.g., tired, happy, sad, angry, sick), are difficult to teach because, as clinicians,
we cannot verify the presence/absence of the stimuli that evoke them. Yet these and other private
events (Schlinger & Poling, 1998; Skinner, 1957) are commonly tested in speech-language
assessments (e.g., PLS-4; ROWPVT, Brownell, 2000; TOLD-P:3) and are often selected as
targets to teach labeling non-verbal stimuli (i.e., tact) to children whose tact repertoires are weak
even for stimuli that are outside the skin and thus are verifiable by teacher and learner alike (e.g.,
book, wagon, pizza). Because of this, assessments that identify controlling variables for potential
intervention targets (e.g., Sundberg, 2008) have the advantage of pointing clinicians toward
appropriate targets and, at the same time, focusing their efforts away from targets that may seem
important but that are premature in the developmental-functional curriculum.
5. Error Analysis
The purpose of speech-language assessment is to identify response errors in the learner’s
verbal repertoire so treatment can be provided that will eliminate these errors in the day-to-day
communication environment and replace them with more adequate responses. As discussed, a
careful analysis of the controlling relations for speech-language responses can provide valuable
information for treatment planning.
The value of an error. Error responses are instructive for clinicians because they tell us precisely
what variables control the extant incorrect response. An analysis of these errors allows us to
thereby establish correct responses and to eliminate stimuli as prompts (i.e., multiple control) that
are extraneous, but currently required, to evoke these responses (Sundberg & Michael, 2001).
For example, a learner may indeed be able to correctly answer How many feet does a duck have
when visiting the duck pond at the park but may not be able to emit the same correct response on
the ride home when the visual stimulus (i.e., the duck) is absent. By cataloging the conditions in
which a desired response does and does not occur, we have the information we need to write
intervention plans to transfer control from the current evocative variables to those that should
evoke and maintain correct responding.
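One way to picture this kind of cataloging is as a record of trials, each listing the stimuli present and whether the correct response occurred. The sketch below is illustrative only, and the function name, the stimulus labels, and the two-trial record are all hypothetical. It assumes the clinician has named the stimuli that should eventually control the response and asks which additional stimuli were present on every correct trial, since those are candidates for the extraneous prompts from which control needs to be transferred.

```python
def extraneous_controlling_stimuli(trials, target_stimuli):
    """Return stimuli present on every correct trial that are not target stimuli.

    trials: list of (stimuli_present, correct) pairs, where stimuli_present is
            a set of labels and correct is True or False.
    target_stimuli: set of stimuli that should eventually evoke the response.
    """
    correct_trials = [set(stimuli) for stimuli, correct in trials if correct]
    if not correct_trials:
        return set()  # no correct responding recorded, nothing to transfer from
    always_present = set.intersection(*correct_trials)
    return always_present - set(target_stimuli)


# Hypothetical record for the duck example in the text.
duck_trials = [
    ({"question: how many feet does a duck have", "duck in view"}, True),   # at the pond
    ({"question: how many feet does a duck have"}, False),                  # on the ride home
]
target = {"question: how many feet does a duck have"}

extras = extraneous_controlling_stimuli(duck_trials, target)
print(extras)
```

In this hypothetical record the visible duck is the only stimulus beyond the question that accompanies correct responding. It is therefore the prompt an intervention plan would aim to fade.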
Functional independence of operants and stimulus control transfer. Whereas a verbally
competent speaker may readily tact after learning to mand, or to respond intraverbally after
learning to point to an item, this seemingly automatic transfer of function does not occur easily
for individuals with speech-language impairment. For example, in a study of tact, mand, and
intraverbal responding (Sundberg, San Juan, Dawdy, & Argüelles, 1990), individuals with
traumatic brain injury demonstrated hierarchies of acquisition, showing that verbal functions
(e.g., tact, mand) could be acquired from echoic or textual (i.e., letters) control but that stimulus
control transfer (Catania, 1998) from one function to another did not occur without direct
training.
A growing body of literature in error analysis has shown the functional independence of
many language-related responses (e.g., Braam & Poling, 1983; Hall & Sundberg, 1987; Lamarre
& Holland, 1985; Luciano, 1986; Miguel, Petursdottir, Carr, & Michael, 2008; Partington &
Bailey, 1993; Petursdottir, Carr, Lechago, & Almason, 2008; Sidman, 1971; Sigafoos, Doss, &
Reichle, 1989; Twyman, 1995; Watkins, Pack-Teixeira, & Howard, 1989) and stimulus control
transfer has been reported for several verbal functions.
Sweeney-Kerwin, Carbone, O’Brien, Zecchin, and Janecky (2007) transferred control of
mand responses by children diagnosed with ASD from nonverbal stimuli (i.e., tact) to appropriate
motivating conditions. In another study of children with ASD, Goldsmith, LeBlanc, and Sautter
(2007) reported successful transfer of stimulus control to bring tact responses under intraverbal
control. A study by Lerman et al. (2005) illustrates particularly well the value of analyzing
language responses by their controlling variables. In this study, a child could tact baby but could
not mand baby nor emit any baby-related intraverbal responses. The specificity of this type of
information, by verbal function, clearly pinpoints treatment goals (e.g., teach mand and
intraverbal responses for the same topography as that acquired under tact control).
Clinical competence with stimulus control transfer is particularly useful in identifying
appropriate intraverbal targets and in providing treatment that avoids inducing errors with this
complex repertoire. Whereas the conditions that might evoke a single mand, tact, or echoic
response are fairly straightforward, the variables controlling any particular intraverbal response
can be numerous. For instance, a mand requires only sufficient motivating conditions; the tact is
evoked by a particular nonverbal stimulus; and an echoic, in general terms, is simply a repetition
of an auditory model. On the other hand, a competent speaker has an intraverbal repertoire in
which a single response is under the control of tens, perhaps hundreds, of antecedent stimuli that
evoke it. For example, under appropriate conditions, we could easily emit the intraverbal response
salsa to stimuli such as what’s tortilla dip called, let’s chop tomatoes to make some. . ., what
dance class are you taking, and any number of other salsa-related questions. But learners with
weak speech-language repertoires will be challenged by any one of these stimuli and, as we have
suggested, simply teaching a selection, tact, echoic, or mand response is unlikely to result in an
extensive salsa repertoire.
A behavioral interpretation of the findings discussed above dissuades us from cognitive
explanations of deficits identified through our assessments. That a learner can point to a dog
when asked but cannot name a dog when he sees one is not well explained by saying that he does
not yet have the concept of dog. Instead, we can more profitably turn our attention to the
variables that evoke various dog responses to plan and carry out an effective treatment program.
We cannot blame learners or their disability for error responses when we have yet to arrange
appropriate stimulus conditions that will evoke and strengthen more accurate responses. Indeed,
clinicians who understand how to assess error responses in terms of their controlling variables
have a distinct advantage in helping learners increase their speech-language skills (Sundberg &
Michael, 2001) by strengthening appropriate stimulus conditions under which particular target
responses occur.
Functional Assessment in Speech-Language Pathology
A few models (partial or comprehensive) are available for functional assessment of speech-
language disorders (e.g., Baker et al., 2008; Carr & Durand, 1985; Grow, Kelley, Roane, & Shillingsburg,
2008; Lerman et al., 2005; Partington & Sundberg, 1998; Sundberg, 2008; note: SLPs interested in
functional assessment related to feeding disorders are referred to Piazza & Roane, 2009) and researchers
have called for increased attention to environmental variables for analysis of communication disorders
(e.g., Hyter, 2007; Roth & Spekman, 1984). However, SLPs generally rely on standardized,
linguistically based assessment tools for diagnostic information, and these tools are unlikely to inform or
adequately support efforts to design appropriate and effective intervention programs. It is perhaps not
surprising, then, that speech-language pathologists often turn to criterion-referenced tests to develop
appropriate intervention targets, although, absent analyses of causal variables, such informal measures
arguably offer no advantage over their standardized counterparts in terms of providing a behavioral
analysis of language performances, which we maintain is essential for effective treatment planning.
Database of Speech-Language Tests
As a first step in bridging this gap, it would be helpful to have a “translation” of existing
assessment instruments, reinterpreted according to the verbal functions that are represented by their test
items. To that end, we examined a group of speech-language tests (Tables 2 through 7) designed to
diagnose aphasia, apraxia of speech, articulation and phonological disorders, and language disorders
(expressive, receptive, or both). Assessments for other speech-language disorders such as fluency (i.e.,
stuttering), voice quality, and dysphagia (swallowing disorders) were excluded from the database.
The assessment database consists of 28 standardized speech-language tests that were selected
from among those commonly used at a university-based speech and language clinic. The clinic is
associated with a graduate program for SLP, which is accredited by ASHA’s Council on Academic
Accreditation in Audiology and Speech-Language Pathology. Tests are administered to individuals
referred for diagnostic purposes or, in the case of established clients, the tests are given to document
progress toward therapeutic goals.
The database lists the probable controlling variables for responses required in each test or sub-
test. For some test items, it is likely that multiple stimuli must be self-generated to emit a “correct”
response (e.g., self-echoics). Thus, a more complex analysis may be needed in which additional variables
are considered (e.g., joint control, Lowenkron, 2006; emergent relations, Sidman, 1994; see also Barnes-
Holmes, Barnes-Holmes, & Cullinan, 2000; autoclitics, Skinner, 1957). Nevertheless, a beginning
analysis is offered, listing the test item’s probable controlling variables for five of Skinner’s verbal operants
(mand, echoic, tact, intraverbal, and textual) and for the nonverbal operant involving listener relations
commonly referred to as receptive language.
Procedures
Each test (or subtest) was coded according to the verbal operant represented by the inherent or
implied antecedent conditions prescribed by the test and by any other information available with respect
to the functional unit represented by each test item. Antecedent conditions included examiner’s
instructions (e.g., point to, say what I say, tell me about), materials, allowed prompts, and actual or
implied motivating operations (Laraway et al., 2003; Michael, 1982, 2004) to evoke appropriate
responses. In some assessments, allowed prompts changed the operant being tested by providing
additional stimuli that could exert control over the response. Tables 2 through 7 specify these situations
(when they could be identified by the test protocol) with the letter P (prompt) under the appropriate
operant column, indicating a potential change in, or addition to, the basic operant being tested.
Other factors that informed the coding procedure included controlling variables that were only
implied, but not directly tested, due to the nature of the test (i.e., informant assessments, see Table 5).
Such indirect assessments are so designated in the Comments column.
Each test or component subtest was coded twice, once by the first author, a board certified
behavior analyst and speech-language pathologist, and again by the second author, a graduate-level
speech-language pathology student with an undergraduate degree in behavior analysis. In the case of
disagreement, an independent behavior analyst reviewed items until agreement was reached.
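The double-coding step can be illustrated with a minimal sketch. It assumes each coder's results are stored as a mapping from test item to a set of operant codes. The function name `compare_codings`, the item labels, and the codes shown are hypothetical. The agreement proportion is included only for illustration, since the procedure described here reports review of disagreements rather than an agreement statistic.

```python
def compare_codings(coder_a, coder_b):
    """Compare two coders' operant codes, keyed by test item.

    Returns (proportion of items coded identically, sorted items needing review).
    """
    if coder_a.keys() != coder_b.keys():
        raise ValueError("Both coders must code the same set of items")
    if not coder_a:
        raise ValueError("No items to compare")
    # Items whose code sets differ go to the independent reviewer
    disagreements = sorted(
        item for item in coder_a if coder_a[item] != coder_b[item]
    )
    agreement = 1 - len(disagreements) / len(coder_a)
    return agreement, disagreements


# Hypothetical item labels and codes, for illustration only.
first_coder = {"item 1": {"T", "IV"}, "item 2": {"E"}, "item 3": {"R"}, "item 4": {"IV"}}
second_coder = {"item 1": {"T", "IV"}, "item 2": {"E"}, "item 3": {"R"}, "item 4": {"T"}}

agreement, to_review = compare_codings(first_coder, second_coder)
print(agreement, to_review)
```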
Code Definitions
Test items were coded according to Skinner’s (1957) five basic verbal operants (mand, tact,
intraverbal, echoic, textual) or, in the case of nonverbal operants, as receptive items. To be precise, the
test items themselves were not operants, but they were coded as such because of the type of functional
unit that would exist if a correct response to the test item occurred and was reinforced. It should be noted,
however, that in many general testing situations, reinforcement is specifically proscribed (presumably to
maintain test integrity). For this reason, no such functions are assumed to be established through the
testing procedure with the assessments in our database. For the examiner-practitioner, advantages of
withholding reinforcement during assessment should be carefully evaluated as some studies have shown
improved test performance under reinforcement, compared to non-reinforcement conditions (e.g., Edlund,
1972; Koegel et al., 1997).
Mand. A test item was coded mand (M) if there was evidence that the item evaluated responses
under the control of a motivating operation or if a consequence, provided or implied, was response-
specific (e.g., child says cookie and gets cookie).
Echoic. Items coded echoic (E) presented verbal stimuli for which a correct response would be
verbal with point-to-point correspondence. For example, a correct echoic response to the instruction “Say
‘what’s your name’” would be what’s your name.
Tact. A tact (T) code designated items in which a non-verbal stimulus (e.g., picture, object) was
presented to evoke a verbal response. For example, an item would be coded T if it instructed the examiner
to show a picture of a house, with house being the correct response. Note, however, that in both
assessment and instructional situations, it is a frequent practice to add the question what’s this when
presenting pictures or objects to test “labels.” In such cases, a response is more accurately described as
being under both tact (house picture) and intraverbal (what’s this) stimulus control. Items were also coded
T if a nonverbal stimulus was given to evoke verbal responses regarding attributes such as stimulus
feature, function, or class (e.g., a correct answer would be bounce or beach instead of ball).
Intraverbal. A test item was coded intraverbal (IV) if it contained a verbal stimulus to evoke a
verbal response that did not match (repeat) the examiner’s model. For example, if the verbal stimulus was
what’s your name, a correct response under the control of intraverbal contingencies might be Riley. Items
were also coded IV if a verbal stimulus was presented to evoke verbal responses regarding stimulus
attributes such as feature, function, or class (e.g., wheel, ride in, or vehicle instead of car).
Textual. A test item was coded textual (Tx) if the assessment instructed the examiner to present a
written stimulus and a correct response required reacting to the written material verbally (i.e., reading).
Items were further designated intraverbal (IV) if reading comprehension was required.
Receptive. Items asking the examiner to present an instruction, in which a correct response would
be nonverbal, were coded as receptive (R). Examples of R-coded items are point to cup, give me the
pencil, and show me jumping. Items were also coded R if a conditional discrimination was required
regarding stimulus attributes (e.g., point to the one that has a tail instead of point to dog). Note that other
operants are implicated in conditional discriminations and these are designated in the Tables (e.g., echoic,
tact).
Finally, items that may have required multiple controlling variables are so designated with the
probable operants marked within parentheses.
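The code definitions above can be read as a small set of decision rules. The following sketch is one possible rendering of those rules, not the procedure the coders actually followed. The class `AssessmentItem`, its field names, and the function `code_item` are hypothetical. It also simplifies by treating any motivating operation as sufficient for an M code and any spoken instruction with a nonverbal response as receptive.

```python
from dataclasses import dataclass


@dataclass
class AssessmentItem:
    """Antecedent and response features of one test item (hypothetical fields)."""
    spoken_stimulus: bool = False         # examiner says something
    nonverbal_stimulus: bool = False      # picture or object is shown
    written_stimulus: bool = False        # printed text is shown
    verbal_response: bool = True          # correct response is verbal
    repeats_model: bool = False           # point-to-point match to spoken model
    motivating_operation: bool = False    # response-specific consequence or MO
    comprehension_required: bool = False  # reading comprehension is tested


def code_item(item):
    """Return the set of probable operant codes for an item."""
    codes = set()
    if not item.verbal_response:
        # Spoken instruction with a nonverbal correct response: receptive
        if item.spoken_stimulus:
            codes.add("R")
        return codes
    if item.motivating_operation:
        codes.add("M")
    if item.nonverbal_stimulus:
        codes.add("T")
    if item.spoken_stimulus:
        # Repeating the model is echoic; any other verbal reply is intraverbal
        codes.add("E" if item.repeats_model else "IV")
    if item.written_stimulus:
        codes.add("Tx")
        if item.comprehension_required:
            codes.add("IV")
    return codes


# Picture of a house shown together with the question "what's this".
print(sorted(code_item(AssessmentItem(spoken_stimulus=True, nonverbal_stimulus=True))))
```

Returning a set rather than a single code mirrors the multiple-control cases noted above, such as a picture presented together with a spoken question.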
Following the database (Tables 2-7 below), we present a summary in which we discuss patterns
found in our analysis along with implications for future work on this topic.
Table 2. Aphasia Tests
Table 3. Apraxia Tests
Table 4. Articulation/Phonology Tests
Table 5. Receptive-Expressive Language Tests
Table 6. Expressive Language Tests
Table 7. Receptive Language Tests
Results, Discussion, and Considerations for the Future

Information from the speech-language assessment database points to several issues of interest for
future investigations.
First, analysis of the database revealed a striking omission in traditional speech-language tests.
The mand function, widely regarded as the earliest verbal operant established (Bijou & Baer, 1965;
Schlinger, 1995; Sundberg, 2008) and of greatest benefit to speakers (Skinner, 1957), was assessed in
only two of the 28 database tests (PLS-4; REEL-3, Bzoch, League, & Brown, 2003). Despite their
inclusion, the mand function in these tests was only indirectly evaluated (i.e., informant report, such as
parent or caregiver responses, was either required or allowed). This means that relevant motivating
conditions for the occurrence of mands were not directly arranged or evaluated for their evocative effects.
Moreover, it is of particular concern that mand contingencies were absent from the three
assessments for aphasia, an acquired neurological disorder that often is profoundly damaging to speech-
language repertoires. It would seem that, of all the verbal functions potentially impaired in aphasia, the
mand would be of foremost importance to evaluate and, if weak, to re-establish quickly. Collectively,
aphasia tests in the database represent a total of 475 response opportunities for persons with aphasia, yet
the tests contained no mand contingencies to evaluate this critically important repertoire for these
individuals. Behavioral researchers have begun to offer alternative (i.e., non-traditional) models for the
description of aphasia deficits (Baker et al., 2008), but functional evaluation of this critical skill in the
repertoires of actual individuals appears unaddressed in this population.
Next, analysis of the assessment database brought the importance of stimulus control into clearer
focus on at least two issues related to its identification. Unlike assessments in which controlling stimuli
are specified by the test items (e.g., tact, mand), traditional speech-language tests may unintentionally
require multiple stimulus control for correct responding. At other times, they may inadvertently provide
multiple stimuli (i.e., prompts) when it is undesirable to do so. As a result, test items may be harder or
easier than they are meant to be, obscuring the repertoire purportedly being tested. That is, learners would
be disadvantaged if they do not have the requisite learning history to respond correctly when doing so
requires control by more than one independent variable or when, conversely, multiple stimuli must be in
place for the learner to respond correctly to items intended to test a single function.
Several assessments in the database illustrated this issue: it seemed that several stimuli
must, or could, converge to evoke a correct response, thereby risking confounded test results. For
example, some assessments (e.g., TOLD-P:3) require the learner to listen to a word and then repeat only
part of it (e.g., say ‘baseball’ without saying ‘base’). Although this clearly evaluates echoic control, other
repertoires may be required (e.g., intraverbal, autoclitic; Schlinger, 2008; Skinner, 1957), particularly
since the correct response must necessarily omit part of the echoic model, as a self-editing response.
Multiple control was also implicated in situations where instructions to the learner seemed
ambiguous (e.g., prompting a pointing response with tell me; PLS-4). In this case, although a pointing
response is presumably sufficient to be scored as correct, a learner who not only points but also responds
verbally (i.e., it’s that one!) may have a more sophisticated repertoire than a learner who only points to
the answer. If so, this information would be important for treatment planning. Multiple control required
for correct responding was also evident in assessment items where the actual function being evaluated
changed as a result of prompts allowed during correction procedures. For example, the Goldman-Fristoe
Test of Articulation (GFTA-2; Goldman & Fristoe, 2000) consists of asking what’s this while showing
pictures one at a time. Each response is then evaluated for point-to-point correspondence with the
phonemic elements of the (unspoken) model. As such, this test evaluates a tact repertoire (more precisely,
a tact-intraverbal repertoire). However, if no response occurs, an echoic prompt is allowed (e.g., say
‘house’). Thus, the task changes from one requiring tact/intraverbal control to one that requires echoic-
only control. However, because the pictures are presumably still present, the clinician cannot be certain
whether there is partial tact control over an echoic response, should one occur.
These examples illustrate the difficulty in trying to assess speech-language skills with
assessments that specify only topography, and not contingencies, required for a correct response. That is,
individuals without the requisite learning history or those with obvious impairment (e.g., aphasia) may
have only part of the skills necessary to perform well on these assessments and, without a clear
identification of the variables required for correct responding, the learner’s repertoire may appear more or
less deficient. Therefore, assessments to identify therapy intervention targets need to clearly identify (1)
the stimulus control for various operants that define a competent speech-language repertoire and (2) the
foundational, cumulative repertoires that may need to be in place (e.g., tact, listener) to support more
complex responding (e.g., intraverbal). This explication should take into consideration recent research and
supporting literature regarding complex speech-language skills such as naming and categorization (e.g.,
Miguel et al., 2008; Petursdottir et al., 2008), equivalence (Sidman, 1994), and other derived relations
(e.g., Rosales & Rehfeldt, 2007).
Speech-language assessments yielding a functional hierarchy of skill deficits have the advantage
of being more prescriptive for subsequent intervention than are those that yield structural-only
descriptions of errors (Baker et al., 2008; Lerman et al., 2005; Sundberg et al., 1990). This is because the
independent variables governing, and thus crucial to, behavior change are not typically assessed,
described, or otherwise addressed in traditional speech-language assessments (although there is evidence
of emerging interest in the contextual communication environment; e.g., Hyter, 2007). One recently
published assessment of verbal functions and related language skills is the Verbal Behavior Milestones
Assessment and Placement Program (VB-MAPP, Sundberg, 2008), which provides clinicians with a
hierarchy of 170 skills developmentally referenced from ages 0 – 48 months. Skills are balanced across
the verbal functions (e.g., mand, tact, intraverbal, echoic) and related areas (e.g., social skills, linguistic
skills, reading) in order to avoid the rote responding that can occur when out-of-sequence skills are taught
(e.g., intraverbal) without having first established the requisite supporting functions, such as tact and
listener repertoires (also see Greer & Ross, 2008). To address behaviors that may interfere with skill
acquisition, an additional component test, the VB-MAPP Barriers Assessment, identifies 24 potential
learning barriers to which environmental (i.e., behavioral) solutions can be applied in order to maximize
instructional efficiency.
Future research needs to establish the clinical efficacy of the VB-MAPP and other function-based
speech-language assessments (e.g., Partington & Sundberg, 1998) as they become available. In the
meantime, assessments of this sort offer immediate clinical benefit over non-functional speech-language
tests because they allow clinicians to identify speaker-listener deficits according to developmental norms
in a curricular sequence and, at the same time, they pinpoint the environmental variables that currently
control these response errors. By identifying the variables of which errors are a function, assessments
like the VB-MAPP also highlight the stimuli that do not yet control desired speech-language responses;
thus, interventions can be designed that incorporate stimulus control transfer procedures for more
effective and efficient learning. Practitioners who have yet to access function-based speech-language
assessments can nevertheless begin to analyze their existing evaluation tools (some of which may appear
in the database) for the likely functions represented by these instruments. This first step would be
invaluable for informing treatments by assisting therapists in the selection and sequencing of appropriate
targets for their interventions.
Additional research is needed to further elucidate speaker-listener functions. For example, Poon
and Butler (1972) suggest there may be developmental influences on intraverbal relations (e.g., different
acquisition stages for how, when, where). As noted earlier, Baker et al. (2008) offer an initial function-
based taxonomy for evaluating speaker-listener repertoires following neurological impairment (i.e.,
aphasia). Lerman and colleagues (2005) discuss the positive treatment implications of including existing
responses in functional assessments of the verbal repertoire. Yes-no responding has been assessed and
trained across the verbal functions (Shillingsburg, Kelley, Roane, Kisamore, & Brown, 2009) following
demonstrations that these responses did not generalize from one operant (e.g., mand) to another (e.g., tact)
without specific training. Finally, Carr and Firth (2005) call for researchers and practitioners alike to
publish results of individual treatments based on Skinner’s (1957) analysis of verbal behavior. Key to this
body of evidence would be the contributions of speech pathologists in which speech-language
assessments and clinical progress reports include analyses of independent variables (i.e., functions) that
are responsible for topographies of interest.
There is much to be explained in verbal behavior (Sundberg, 1991) and much is still speculative
(Palmer, 1998). Nevertheless, the utility of our assessments will be strengthened by a more thorough
accounting of the observable variables that control speech-language behavior. If it is true that “learning
occurs best when embedded within functional activities” (Rowland & Schweigert, 1993, p. 173), then
assessment that includes a functional account is essential.
References
(Asterisk indicates references in assessment database)
American Speech-Language-Hearing Association (1993). Definitions of communication disorders and
variations. Retrieved April 4, 2010, http://www.asha.org/docs/html/RP1993-00208.html.
American Speech-Language-Hearing Association (2009). Directory of speech-language pathology
assessment instruments. Retrieved December 31, 2009, http://www.asha.org/assessments.aspx.
Austin, J. L. (1975). How to do things with words (2nd ed.). In J. O. Urmson and M. Sbisa (Eds.), How to
do things with words. Cambridge: Harvard University Press.
Baker, J. C., LeBlanc, L. A., & Raetz, P. B. (2008). A behavioral conceptualization of aphasia. The
*Bankson, N. W., & Bernthal, J. E. (1990). Bankson-Bernthal test of phonology. Chicago, IL: Applied
Symbolix.
Barnes-Holmes, D., Barnes-Holmes, Y., & Cullinan, V. (2000). Relational frame theory and Skinner’s
Verbal Behavior: A possible synthesis. The Behavior Analyst, 23, 69-84.
Bates, E. (1976). Language in context: Studies in the acquisition of pragmatics. New York: Academic
Press.
Bijou, S. W., & Baer, D. M. (1965). Child development: Vol. 2. Universal stage of infancy. New York:
Appleton-Century-Crofts.
Bourgeois, M. S. (1992). Evaluating memory wallets in conversations with persons with dementia.
Journal of Speech and Hearing Research, 35, 1344-1357.
Braam, S. J., & Poling, A. (1983). Development of intraverbal behavior in mentally retarded individuals
through transfer of stimulus control procedures: Classification of verbal responses. Applied
Research in Mental Retardation, 4, 279-302.
*Brownell, R. (2000). Receptive one-word picture vocabulary test. Novato, CA: Academic Therapy.
*Bzoch, K. R., League, R., & Brown, V. L. (2003). Receptive-expressive emergent language test (3rd ed.).
Austin, TX: Pro-Ed.
Carr, E. G., & Durand, V. M. (1985). Reducing behavior problems through functional communication
training. Journal of Applied Behavior Analysis, 18, 111-126.
Carr, J. E., & Firth, A. M. (2005). The verbal behavior approach to early and intensive behavioral
intervention for autism: A call for additional empirical support. Journal of Early Intensive
Behavioral Intervention, 2, 18-27.
*Carrow-Woolfolk, T. (1999). Test for auditory comprehension of language (3rd ed.). Austin, TX: Pro-Ed.
Catania, A. C. (1998). Learning (4th ed.). Upper Saddle River, NJ: Prentice-Hall.
Catania, A. C. (2006). Words as behavior. The Analysis of Verbal Behavior, 22, 87-88.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1980). Rules and representations. New York: Columbia University Press.
Cooper, J. O., Heron, T. E., & Heward, W. L. (2007). Applied behavior analysis (2nd ed.). Upper Saddle
River, NJ: Prentice-Hall.
*Dabul, B. (2000). Apraxia battery for adults (2nd ed.). Austin, TX: Pro-Ed.
Damico, J. S. (1993). Language assessment in adolescents: Addressing critical issues. Language, Speech,
and Hearing Services in Schools, 24, 29-35.
*Dawson, J. I., Stout, C. E., & Eyer, J. A. (2003). The structured photographic expressive language test
(3rd ed.). DeKalb, IL: Janelle.
Donahoe, J. W., & Palmer, D. C. (1994). Learning and complex behavior. Needham Heights, MA: Allyn
and Bacon.
Duchan, J. F. (2008). Getting here: A short history of speech pathology in America. Retrieved December
24, 2009, from http://www.acsu.buffalo.edu/~duchan/history.html.
Duker, P. C. (1999). The verbal behavior assessment scale (VerBAS): Construct validity, reliability, and
internal consistency. Research in Developmental Disabilities, 20, 347-353.
*Dunn, L. M., Dunn, L. M., & Dunn, D. M. (1997). Peabody picture vocabulary test (3rd ed.). Circle
Pines, MN: American Guidance Service.
Dwyer-Moore, K. J., & Dixon, M. R. (2007). Functional analysis and treatment of problem behavior of
elderly adults in long-term care. Journal of Applied Behavior Analysis, 40, 679-683.
Edlund, C. V. (1972). The effect on the behavior of children, as reflected in the IQ scores, when
reinforced after each correct response. Journal of Applied Behavior Analysis, 5, 317-319.
Esch, B. (2009). Early communication: 4 obstacles to success. Autism and Related Developmental
Disabilities SIG Newsletter, 25, 1-2.
*Fenson, L., Marchman, V. A., Thal, D. J., Dale, P. S., Reznick, S. J., & Bates, E. (2007). MacArthur-
Bates communicative development inventories. Baltimore, MD: Brookes.
Frost, L., & Bondy, A. (2002). The picture exchange communication system (PECS) training manual (2nd
ed.). Newark, DE: Pyramid.
Frost, L., & Bondy, A. (2006). A common language: Using B. F. Skinner’s verbal behavior for
assessment and treatment of communication disabilities in SLP-ABA. The Journal of Speech-
Language Pathology and Applied Behavior Analysis, 1, 103-109.
*Gardner, M. F. (1990). Expressive one-word picture vocabulary test. Novato, CA: Academic Therapy.
*German, D. J. (2000). Test of word finding: TWF-2 (2nd ed.). Austin, TX: Pro-Ed.
*Goldman, R., & Fristoe, M. (2000). Goldman-Fristoe Test of Articulation (2nd ed.). Circle Pines, MN:
American Guidance Service.
Goldsmith, T. R., LeBlanc, L. A., & Sautter, R. A. (2007). Teaching intraverbal behavior to children with
autism. Research in Autism Spectrum Disorders, 1, 1-13.
Grow, L. L., Kelley, M. E., Roane, H. S., & Shillingsburg, M. A. (2008). Utility of extinction-induced
response variability for the selection of mands. Journal of Applied Behavior Analysis, 41, 15-24.
Greer, R. D., & Ross, D. E. (2008). Verbal behavior analysis: Inducing and expanding new verbal
capabilities in children with language delays. Boston, MA: Pearson.
Hall, G., & Sundberg, M. L. (1987). Teaching mands by manipulating conditioned establishing
Hart, B., & Rogers-Warren, A. (1978). A milieu approach to teaching language. In R. L. Schiefelbusch
(Ed.), Language intervention strategies (pp. 193-235). Baltimore: University Park Press.
Hegde, M. N. (2008). Meaning in behavioral analysis. The Journal of Speech-Language Pathology and
Applied Behavior Analysis, 2.4-3.1, 1-24.
assessment and treatment. Boston, MA: Pearson.
*Helm-Estabrooks, N. (1992). Test of oral and limb apraxia. Chicago, IL: Applied Symbolix.
*Helm-Estabrooks, N., Ramsberger, G., Morgan, A. R., & Nicholas, M. (1989). Boston assessment of
severe aphasia. Chicago, IL: Applied Symbolix.
*Hickman, L. A. (1997). The apraxia profile. San Antonio, TX: The Psychological Corporation.
*Hodson, B. W. (2004). Hodson assessment of phonological patterns (3rd ed.). Austin, TX: Pro-Ed.
Hyter, Y. D. (2007). Pragmatic language assessment: A pragmatics-as-social practice model. Topics in
Language Disorders, 27, 128-145.
Iwata, B. A., Dorsey, M. F., Slifer, K. J., Bauman, K. E., & Richman, G. S. (1994). Toward a functional
analysis of self-injury. Journal of Applied Behavior Analysis, 27, 197-209. (Reprinted from
Analysis and Intervention in Developmental Disabilities, 2, 3-20, 1982).
*Kaufman, N. R. (1995). Kaufman speech praxis test. Detroit: Wayne State University Press.
*Kertesz, A. (2007). Western aphasia battery revised. San Antonio, TX: PsychCorp.
*Khan, L., & Lewis, N. (2002). Khan-Lewis phonological analysis (2nd ed.). Circle Pines, MN: American
Guidance Service.
Kodak, T., Northup, J., & Kelley, M. E. (2007). An evaluation of the types of attention that maintain
problem behavior. Journal of Applied Behavior Analysis, 40, 167-171.
Koegel, R. L., & Koegel, L. K. (1995). Teaching children with autism: Strategies for initiating positive
interactions and improving learning opportunities. Baltimore, MD: Paul H. Brookes.
Koegel, L. K., Koegel, R. L., & Smith, A. (1997). Variables related to differences in standardized test
outcomes for children with autism. Journal of Autism and Developmental Disorders, 27, 233-243.
Kouri, T. A. (2005). Lexical training through modeling and elicitation procedures with late talkers who
have specific language impairment and developmental delays. Journal of Speech, Language, and
Hearing Research, 48, 157-171.
Lamarre, J., & Holland, J. G. (1985). The functional independence of mands and tacts. Journal of the
*LaPointe, L., & Horner, J. (1998). Reading comprehension battery for aphasia (2nd ed.). Austin, TX:
Pro-Ed.
Laraway, S., Snycerski, S., Michael, J., & Poling, A. (2003). Motivating operations and terms to describe
them: Some further refinements. Journal of Applied Behavior Analysis, 36, 407-414.
LaRue, R., Weiss, M. J., & Cable, M. K. (2008). Functional communication training: The role of speech
pathologists and behavior analysts in serving students with autism. The Journal of Speech-
Language Pathology and Applied Behavior Analysis, 3.2-3.3, 26-34.
LeBlanc, L. A., Dillon, C. M., & Sautter, R. A. (2009). Establishing mand and tact repertoires. In R. A.
Rehfeldt & Y. Barnes-Holmes (Eds.), Derived relational responding: Applications for learners
with autism and other developmental disabilities: A progressive guide to change (pp. 79-108).
Oakland, CA: New Harbinger.
Lerman, D. C., Parten, M., Addison, L. R., Vorndran, C. M., Volkert, V. M., & Kodak, T. (2005). A
methodology for assessing the functions of emerging speech in children with developmental
disabilities. Journal of Applied Behavior Analysis, 38, 303-316.
*Lippke, B. A., Dickey, S. E., Selmar, J. W., & Soder, A. L. (1997). Photo articulation test (3rd ed.).
Austin, TX: Pro-Ed.
Lodhi, S., & Greer, R. D. (1989). The speaker as listener. Journal of the Experimental Analysis of Behavior,
51, 353-359.
Lovaas, O. I., & Smith, T. (2003). Early and intensive behavioral intervention in autism. In A. E. Kazdin
& J. R. Weisz (Eds.), Evidence-based psychotherapies for children and adolescents (pp. 325-
340). New York: Guilford.
Lowenkron, B. (2006). An introduction to joint control. The Analysis of Verbal Behavior, 22, 123-127.
Luciano, M. C. (1986). Acquisition, maintenance, and generalization of productive intraverbal behavior
through transfer of stimulus control procedures. Applied Research in Mental Retardation, 7, 1-20.
Michael, J. (1982). Distinguishing between discriminative and motivating functions of stimuli. Journal of
the Experimental Analysis of Behavior, 37, 149-155.
Michael, J. (2004). Concepts and principles of behavior analysis (rev. ed.). Kalamazoo, MI: Society for
the Advancement of Behavior Analysis.
Miguel, C. F., & Petursdottir, A. I. (2009). Naming and frames of coordination. In R. A. Rehfeldt & Y.
Barnes-Holmes (Eds.), Derived relational responding: Applications for learners with autism and
other developmental disabilities: A progressive guide to change (pp. 129-148). Oakland, CA:
New Harbinger.
Miguel, C. F., Petursdottir, A. I., Carr, J. E., & Michael, J. (2008). The role of naming in stimulus
categorization by preschool children. Journal of the Experimental Analysis of Behavior, 89, 383-
405.
Miltenberger, R. G. (2001). Behavior modification: Principles and procedures (2nd ed.). Belmont, CA:
Wadsworth/Thomson Learning.
National Autism Center (2009). Evidence-based practice and autism in the schools: A guide to providing
appropriate interventions to students with autism spectrum disorders. Randolph, MA: National
Autism Center.
*Newcomer, P. L., & Hammill, D. D. (1988). Test of language development – primary (3rd ed.). Austin,
TX: Pro-Ed.
Nicolosi, L., Harryman, E., & Kresheck, J. (1978). Terminology of communication disorders: Speech,
language, hearing. Baltimore, MD: Williams & Wilkins.
Notari, A., & Bricker, D. (1990). The utility of a curriculum-based assessment instrument in the
development of individualized education plans for infants and young children. Journal of Early
Intervention, 14, 117-132.
Novak, G. (1996). Developmental psychology: Dynamical systems and behavior analysis. Reno, NV:
Context Press.
Novak, G., & Pelaez, M. (2004). Child and adolescent development: A behavioral systems approach.
Thousand Oaks, CA: Sage.
Ogletree, B. T., & Oren, T. (2001). Application of ABA principles to general communication instruction.
Focus on Autism and Other Developmental Disabilities, 16, 102-109.
Palmer, D. C. (1998). The speaker as listener: The interpretation of structural regularities in verbal
behavior. The Analysis of Verbal Behavior, 15, 3-16.
Palmer, D. C. (2000). Chomsky’s nativism: A critical review. The Analysis of Verbal Behavior, 17, 39-50.
Partington, J. W., & Bailey, J. S. (1993). Teaching intraverbal behavior to preschool children. The
Partington, J. W., & Sundberg, M. L. (1998). The assessment of basic language and learning skills (The
ABLLS). Pleasant Hill, CA: Behavior Analysts.
Petursdottir, A. I., Carr, J. E., Lechago, S. A., & Almason, S. M. (2008). An evaluation of intraverbal
training and listener training for teaching categorization skills. Journal of Applied Behavior
Analysis, 41, 53-68.
Piazza, C. C., & Roane, H. S. (2009). Assessment of pediatric feeding disorders. In J. L. Matson, F.
Andrasik, & M. L. Matson (Eds.), Assessing childhood psychopathology and developmental
disabilities (pp. 471-490). New York: Springer.
Pinker, S. (1994). The language instinct: How the mind creates language. NY: Harper Collins.
Poon, W., & Butler, K. G. (1972). Evaluation of intraverbal responses in five- to seven-year-old children.
Journal of Speech and Hearing Research, 15, 303-307.
Prizant, B. M., & Duchan, J. F. (1981). The functions of immediate echolalia in autistic children. Journal
of Speech and Hearing Disorders, 46, 241-249.
Rehfeldt, R. A., & Barnes-Holmes, Y. (2009). Derived relational responding: Applications for learners
with autism and other developmental disabilities: A progressive guide to change. Oakland, CA:
New Harbinger.
Romanczyk, R. G., Lockshin, S., & Matey, L. (2001). The children’s unit for treatment and evaluation. In
J. S. Handleman & S. L. Harris (Eds.), Preschool education programs for children with autism
(pp. 49-94). Austin, TX: Pro-Ed.
Rosales, R., & Rehfeldt, R. A. (2007). Contriving transitive conditioned establishing operations to
establish derived manding skills in adults with severe developmental disabilities. Journal of
Applied Behavior Analysis, 40, 105-121.
Roth, F. P., & Spekman, N. J. (1984). Assessing the pragmatic abilities of children: Part 1. Organizational
framework and assessment parameters. Journal of Speech and Hearing Disorders, 49, 2-11.
Rowland, C., & Schweigert, P. (1993). Analyzing the communication environment to increase functional
communication. Journal of The Association for Persons with Severe Handicaps, 18, 161-176.
Rvachew, S. (1994). Speech perception training can facilitate sound production learning. Journal of
Speech and Hearing Research, 37, 347-357.
Sautter, R. A., & LeBlanc, L. A. (2006). Empirical applications of Skinner’s analysis of verbal behavior
with humans. The Analysis of Verbal Behavior, 22, 35-48.
Schlinger, H. D. (1995). A behavior analytic view of child development. New York: Plenum Press.
Schlinger, H. D. (2008). Conditioning the behavior of the listener. International Journal of Psychology
and Psychological Therapy, 8, 309-322.
Schlinger, H. D., & Poling, A. D. (1998). Introduction to scientific psychology. New York: Plenum.
Searle, J. (1969). Speech acts: An essay in the philosophy of language. London: Cambridge University
Press.
*Secord, W. A., & Donohue, J. S. (2002). Clinical assessment of articulation and phonology. Greenville,
SC: Super Duper.
*Semel, E., Wiig, E. H., & Secord, W. A. (2003). Clinical evaluation of language fundamentals ages 5-8
(4th ed.). San Antonio, TX: Pearson.
Shillingsburg, M. A., Kelley, M. E., Roane, H. S., Kisamore, A., & Brown, M. R. (2009). Evaluation and
training of yes-no responding across verbal operants. Journal of Applied Behavior Analysis, 42,
209-223.
Sidman, M. (1971). The behavioral analysis of aphasia. Journal of Psychiatric Research, 8, 413-422.
Sidman, M. (1994). Equivalence relations and behavior: A research story. Boston, MA: Authors
Cooperative.
Sigafoos, J., Doss, S., & Reichle, J. (1989). Developing mand and tact repertoires with persons with
severe developmental disabilities with graphic symbols. Research in Developmental Disabilities,
11, 165-176.
Skinner, B. F. (1957). Verbal behavior. New York: Appleton-Century-Crofts.

Spradlin, J. E. (1963). Assessment of speech and language of retarded children: The Parsons language
sample. Journal of Speech and Hearing Disorders Monograph, 10, 8-31.
Spradlin, J. E., & Siegel, G. M. (1982). Language training in natural and clinical environments. Journal of
Speech and Hearing Disorders, 47, 2-6.
Sundberg, M. L. (1991). 301 research topics from Skinner’s book Verbal Behavior. The Analysis of
Verbal Behavior, 9, 81-96.
Sundberg, M. L. (2008). VB-MAPP: Verbal behavior milestones assessment and placement program.
Concord, CA: AVB Press.
Sundberg, M. L., & Michael, J. (2001). The benefits of Skinner’s analysis of verbal behavior for children
with autism. Behavior Modification, 25, 698-724.
Sundberg, M. L., & Partington, J. W. (1998). Teaching language to children with autism or other
developmental disabilities. Pleasant Hill, CA: Behavior Analysts.
Sundberg, M. L., San Juan, B., Dawdy, M., & Argüelles, M. (1990). The acquisition of tacts, mands, and
intraverbals by individuals with traumatic brain injury. The Analysis of Verbal Behavior, 8, 83-
99.
Sweeney-Kerwin, E. J., Carbone, V. J., O’Brien, L., Zecchin, G., & Janecky, M. N. (2007). Transferring
control of the mand to the motivating operation in children with autism. The Analysis of Verbal
Behavior, 23, 89-102.
Twyman, J. S. (1996). The functional independence of impure mands and tacts of abstract stimulus
properties. The Analysis of Verbal Behavior, 13, 1-19.
*Wagner, R., Torgesen, J., & Rashotte, C. (1999). Comprehensive test of phonological processing for
ages 5 and 6. Austin, TX: Pro-Ed.
*Wallace, G., & Hammill, D. (1994). Comprehensive receptive and expressive vocabulary test. Austin,
TX: Pro-Ed.
Watkins, C. L., Pack-Teixeira, L., & Howard, J. S. (1989). Teaching intraverbal behavior to severely
retarded children. The Analysis of Verbal Behavior, 7, 69-81.
*Wiig, E. H., Secord, W. A., & Semel, E. (2004). Clinical evaluation of language fundamentals
preschool (2nd ed.). San Antonio, TX: PsychCorp.
Williams, G., & Greer, R. D. (1993). A comparison of verbal-behavior and linguistic-communication
curricula for training developmentally delayed adolescents to acquire and maintain vocal speech.
Behaviorology, 1, 31-46.
*Williams, K. T. (1997). Expressive vocabulary test. Circle Pines, MN: American Guidance Service.
*Zimmerman, I. L., Steiner, V. G., & Pond, R. E. (2002). Preschool language scale (4th ed.). San
Antonio, TX: Pearson.
Correspondence concerning this article should be addressed to the first author.
Barbara E. Esch
Esch Behavior Consultants, Inc.
P. O. Box 20002
Kalamazoo, MI 49019
Phone: 561-676-7212
E-mail: besch1@mac.com
Kate B. LaLonde
P.O.Box 20002
Kalamazoo, MI 49019
John W. Esch
P. O. Box 20002
Kalamazoo, MI 49019
Effects of a Speaker Immersion Procedure on the Production of Verbal Operants
Nirvana Pistoljevic, Claire Cahill, and Fabiola Casarini
Abstract
We studied the effects of a Speaker Immersion Procedure on the numbers of vocal verbal operants
emitted in non-instructional settings (NIS) by two preschoolers with language delays. Prior to the
intervention, the participants emitted low rates of vocal verbal behavior in NIS. The dependent variables
were the numbers of vocal verbal operants emitted in three different NIS and the numbers of mands emitted
in the presence of contrived establishing operations. The independent variable was the Speaker Immersion
Procedure, an instructional tactic using contrived and naturally occurring establishing operations to increase
opportunities to teach speaker behavior. Results showed an increase in numbers of vocal verbal operants
emitted by participants following the implementation of the Speaker Immersion Procedure (SIP).
Keywords: Speaker Immersion Procedure, Verbal Behavior, Establishing Operations, Motivating
Operations, Vocal Verbal Operants, Mands, Tacts, Echoic, Spontaneous Speech, Language
______________________________________________________________________________
Introduction
Children with native intellectual disabilities and children from impoverished backgrounds
frequently lack or do not develop functional verbal repertoires naturally (Hart & Risley, 1995). Even
when children with delays in language development receive behavioral language interventions, their prior
lack of a history of reinforcement for vocal verbal communication calls for intensive language learning
instruction (Greer, Chavez-Brown, Nirgudkar, Stolfi, & Rivera-Valdes, 2005; Greer & Keohane, 2005;
Pistoljevic, 2008; Pistoljevic & Greer, 2006). These children need intensive language experiences to
compensate for deficits in their learning history and to advance their verbal development.
According to Woods (1984), when the verbal antecedents from the parents were absent, children
with native disabilities were usually silent, lacking “spontaneous speech,” whereas typically developing
children were more likely to initiate interactions without verbal antecedents from others. Many students
with native or environmental disabilities acquire correct vocal verbal behavior during instruction but may
not emit it spontaneously in non-instructional settings (NIS) (Greer, 2002; Nuzzolo-Gomez & Greer,
2004). What they lack is “independent” or “spontaneous” speaker behavior, including verbal operants
emitted without any verbal antecedent and used in ways that were not previously reinforced (Ross,
Nuzzolo, Stolfi & Natarelli, 2006). This behavior is a major goal of language training programs.
Williams and Greer (1993) compared the effects of linguistic and verbal behavior curricula on
the acquisition of functional speech. For the linguistic curriculum, the students responded to the verbal
behavior of the instructor (e.g., “What is that?”) by emitting intraverbal responses. In the verbal behavior
curriculum, the student responses were under the control of non-verbal stimuli (i.e., the presence of the
nonverbal stimuli served as an antecedent). Operant procedures were used for both methods and both
produced the same number of correct responses. Overall, the verbal behavior curriculum resulted not only
in the students’ learning more words but also in the maintenance and generalization of verbal operants. In
other words, the two curricula produced two very different repertoires, and only the verbal behavior
curriculum resulted in “spontaneous” speech under the control of non-verbal environmental stimuli.
Typically developing children seem to learn to communicate in effortless ways. They are more
likely to respond to nonverbal antecedents in order to initiate verbal interactions spontaneously, readily
responding to the natural establishing operations (EOs) that control human communicative behavior (e.g.,
lack of social attention when attention is preferred for pure tacts; deprivation or aversive conditions for
mands). Therefore, one possible reason why children with disabilities are often observed not to emit pure
tacts and mands in NIS is that they were taught to produce these verbal operants only under the partial
antecedent verbal control of others. That is, tacts and mands were taught as intraverbals and the relevant
direct control of nonverbal antecedents was never actually learned. In other words, these children learned
to talk only when asked for a response (e.g., “What do you want?”) and not in response to a natural EO
(e.g., deprivation or aversive conditions). As an alternative, one could avoid verbal antecedents and
arrange the environment to create EOs that would encourage the emission of pure mands (i.e., requests for
desired items independent of verbal antecedents) (Pistoljevic, 2008).
Within the verbal behavior model, speaker behavior is represented by six basic operants or
functions, each defined by effects on the listener. Included are echoics, mands, tacts, intraverbals, textual
responses and autoclitics (Skinner, 1957; Ross et al., 2006). Among these operants, tacts and mands are of
particular interest for the current study. Tacts are controlled by the presence of stimuli and maintained by
generalized reinforcement, while mands are controlled by specific motivational conditions (e.g.,
deprivation) and reinforced by what is specified in the mand itself. Skinner (1957) defined the mand as a
verbal operant that specifies its consequence and is under the control of conditions of deprivation or aversive
stimuli. When compared to other verbal operants, only the mand is controlled by motivational conditions,
such as deprivation, satiation, and aversive stimulation. Additionally, the manipulation of these
motivational variables can be used to evoke verbal behavior (Skinner, 1957).
Incorporating previous work on motivation, Michael (1993) defined establishing operations
(EOs) as "changes in the environment which alter the effectiveness of any object or event as
reinforcement and simultaneously alter the frequency of the behavior followed by the reinforcement" (p.
191). Recently, the term motivating operation (MO) has been introduced as a replacement term for EO
(Laraway, Snycerski, Michael, & Poling, 2001, 2003). The MO is defined in part by its value-altering
effect: an operation that increases the reinforcing effectiveness of a stimulus, object, or event is an EO,
whereas one that decreases reinforcer effectiveness is an abolishing operation (AO).
Differentiating among the types of EOs, Michael (1993) describes unconditioned establishing
operations (UEO) as unlearned motivation such as deprivation of food, water, or sleep. The effect of the
UEO is innate, but the behavioral response is learned. In contrast, conditioned establishing operations
(CEO) are defined as learned motivation, for which social attention, toys, or money often function as
reinforcers. Not only is the CEO learned, but the response is also learned. One category of CEO is called
transitive CEOs, in which a stimulus condition “makes some other stimulus condition effective as a form
of conditioned reinforcement, and evokes behavior that has obtained that item in the past” (Sundberg,
2004). Daily activities involve transitive CEOs as motivation, such as cooking, cleaning, or schoolwork.
EOs are a component of language acquisition for typical children and are often incorporated as an
independent variable into verbal behavior instruction for children with disabilities (Sundberg, 1993, 2004;
Sundberg, Loeb, Hale, & Eigenheer, 2002; Sundberg & Michael, 2001; Sundberg & Partington, 1998). In
fact, for mand training, if the EO is not in effect one cannot deliver mand instruction (Sundberg, 2004).
But waiting for EOs to occur naturally in the environment limits the number of opportunities for mand
instruction. For that reason, contrived EOs are often an essential component of mand instruction. For
UEOs, the passage of time can increase the momentary value of the stimulus, as in the case of thirst or
hunger. The UEOs can also be contrived, such as giving salty foods to increase the value of juice
(Sundberg, 2004). For CEOs, the instructor can capture opportunities in which one stimulus increases the value
of another stimulus. In the case of transitive CEOs, opportunities to conduct mand training can be
contrived. Several experiments have identified EO tactics that have been effective in producing the
motivational contexts necessary to teach speaker verbal capabilities (Hart & Risley, 1975; McGee,
Krantz, Mason, & McClannahan, 1983; Ross, 1998; Ross & Greer, 2003; Ross et al., 2006; Schwartz,
1994; Tsiouri & Greer, 2003; Williams & Greer, 1993).
Three common procedures that use EOs to increase speaker behavior are incidental teaching,
behavior chain interruption strategy (BCIS), and brief motivational procedures with time delay (Ross et
al., 2006; Schwartz, 1994). During incidental teaching, a speaker indicates, through gestures or
comments, deprivation of an item or activity that he/she needs assistance to obtain. The listener (i.e., the
experimenter) then provides verbal questions, modeling, or expectant looks that function as prompts,
inducing the speaker to emit a pre-determined response. When the speaker emits the target response, the
desired item is contingently delivered. Experiments demonstrated that the incidental teaching procedure
was effective in increasing the number of mands emitted after its implementation for preschool children
diagnosed with autism (McGee, Morrier & Daly, 1999) and for young children with delays (Dunst,
Bruder, Trivette, Raab, & McLean, 2001; Hart & Risley, 1975).
The BCIS uses interruptions of routines or chained behaviors as an EO for both mands and tacts
(Hall & Sundberg, 1987; Ross et al., 2006). The BCIS tactic was shown to be
successful in increasing mands for play by a child with autism (Sigafoos & Littlewood, 1999) and in
increasing selection requests for elementary-school children with language delays (Grunsell & Carter,
2002). Hall and Sundberg (1987) presented instant coffee without hot water, increasing the value of the
water and creating not only motivation, but also an opportunity for mand instruction.
The brief motivational procedure incorporates time delay into the mand training and consists of
the teacher presenting an item from which a student has been deprived, waiting a few seconds and
delivering the requested item if the target response is emitted. This procedure was effective in increasing
the rate of functional vocal speech for adolescents with developmental disabilities (Williams & Greer,
1993) and in improving the acquisition of vocal speech for young children with autism (Drash, High &
Tudor, 1999).
Overall, these procedures were found to be effective in increasing the occurrence of verbal
behavior, but many studies suggested that they may not be the best tactic to facilitate or induce
independent speaker behavior. In a comparison of the three types of EOs on the acquisition and
maintenance of mands in preschoolers, all were found to be effective in acquisition, generalization, and
maintenance of mands, although BCIS resulted in the slowest acquisition of mands (Schwartz, 1994).
Carter and Grunsell (2001) found that BCIS has been functional for creating EOs only in the context of
routines that do not occur naturally, and Sundberg and Michael (2001) suggested that incidental teaching
produces too limited a number and variety of EOs to teach communication as it naturally occurs. Other
studies on brief motivational procedures found that the mand function did not transfer to the tact function
unless specific instruction was implemented (Nuzzolo-Gomez & Greer, 2004), suggesting that these
motivational procedures may not be sufficient to establish speaker behavior across different functions,
such as mand, tact and autoclitic.
A strategy known as the Speaker Immersion Procedure (SIP) was developed through research to
increase spontaneous speaker behavior (Greer & Ross, 2004; Ross, 1995; Ross et al., 2006; Williams &
Greer, 1993). During SIP, the experimenter arranges environmental conditions to create increased
opportunities for mands, requiring students to use different autoclitic frames to obtain items and activities
or to make transitions in their environment (Reilly-Lawson & Greer, 2006; Ross et al., 2006). SIP is
particularly useful for students with limited mand and tact repertoires (Greer & Ross, 2008). The
procedure is structured to make vocal communication behavior obtain maximum reinforcement with
minimum effort. Vocal verbal manding is always easier and faster than crying, gesturing, or using other
nonverbal topographies. During SIP, the student can obtain desired items or engage in preferred activities
only if he or she emits the targeted vocal mand (e.g., “I want___ please”). Opportunities to mand items or
routines are created and maintained until a physical response to an EO requires more effort than the
emission of a vocal mand for the same item or routine (Greer & Ross, 2008). Students also learn from the
correction procedure following the production of an incorrect mand. This procedure consists of echoics
provided by the teacher.
Ross, Nuzzolo, Stolfi, and Natarelli (2006) tested the effects of SIP when implemented for different
amounts of time during the day on the independent language of four students with developmental
disabilities. In the first experiment the participants were two students who emitted autoclitic mands only
when prompted. In the second experiment, the participants were two students who could emit mands and
tacts with autoclitics, but did not mand for desired stimuli or tact common items spontaneously. In the
first experiment SIP was implemented during one 60-minute session daily, and in the second experiment
it was implemented during one 10-minute session daily. During the SIP, students had to emit a verbal
response to an EO for each movement, environmental change, or activity change (e.g., to request a straw to
drink from a juice box or to request a pencil to write their name). Instructional settings were rotated with
NIS, and the participants were presented with a 10-second opportunity to mand while the experimenters
withheld a targeted event or item. The item or event was delivered if the student emitted the mand. If the
student did not emit the mand, an echoic model of the autoclitic mand was presented by the teacher,
providing the student with another opportunity to respond. Results indicated significant increases in the
numbers of independent mands emitted by all of the students, not only during the SIP but also during
generalization probes.
In this paper we report a systematic replication of Ross et al. (2006), in which EO probes and
probes of vocal operants in the non-instructional setting were conducted prior to and following the SIP.
For experimental control, the EO pre- and post-probes were matched such that the same opportunities were
created. In addition, the specific EOs used for the probes were recorded to ensure that the mands taught
during the SIP were different from those used in the probes. Another change from the original study was
that the SIP sessions were continued until the participant emitted mands in the presence of EOs with 90%
accuracy across two consecutive sessions. The students learned from the reinforcement and from the
correction procedures provided through learn units (Albers & Greer, 1991; Greer, 2002; Greer & Ross,
2008).
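The mastery criterion described above (mands emitted in the presence of EOs with 90% accuracy across two consecutive sessions) can be written as a short check. The following Python sketch is illustrative only: the function name, the (correct, total) input format, and the session counts are hypothetical and are not taken from the study.

```python
def session_meeting_criterion(sessions, criterion_pct=90, consecutive=2):
    """Return the 1-based number of the session at which the criterion is met.

    `sessions` is a list of (correct, total) counts, one pair per SIP session.
    Returns None if the criterion is never met.
    """
    run = 0  # number of consecutive sessions at or above criterion so far
    for number, (correct, total) in enumerate(sessions, start=1):
        # Integer comparison avoids floating-point rounding at exactly 90%.
        if total > 0 and correct * 100 >= criterion_pct * total:
            run += 1
        else:
            run = 0
        if run >= consecutive:
            return number
    return None


# Hypothetical session data: (correct mands, EO opportunities)
example_sessions = [(12, 20), (18, 20), (17, 20), (19, 20), (20, 20)]
print(session_meeting_criterion(example_sessions))
```

In this hypothetical series the second session reaches 90% but the third does not, so the count restarts and the criterion is only met once two sessions in a row are at or above 90%.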
Method
Participants
The participants were two male preschool students diagnosed with disabilities. The participants
were selected because they emitted mands in the one-to-one instructional setting but emitted few
appropriate vocal mands in the non-instructional setting (NIS).
Participants were selected from a private, publicly funded school for students with disabilities that
used the Comprehensive Application of Behavior Analysis to Schooling (CABAS) model (Greer, 1994).
The school was located in a suburb outside a large metropolitan area and the school day was five hours.
Participants A and B were part of an inclusion classroom, with twelve students, one teacher, and three
teaching assistants, with a teacher-student ratio of 1:3. The classroom included students with and without
disabilities and consisted of 10 male students and two female students ranging from three to five years
old.
Participant A was a 3.11-year-old male, functioning as a listener and emergent speaker (Greer &
Ross, 2008). He emitted generalized mands using full sentences (e.g., “I want cookie please”) in the 1:1
instructional setting, but he emitted few appropriate mands and instead cried or gesturally indicated the
preferred items or activities in the NIS. Participant B was a 3.7-year-old male, functioning as an emergent
listener-emergent speaker (Greer & Ross, 2008). He emitted 2-word mands (e.g., “Cookie please”) in the
1:1 instructional setting, but he emitted low numbers of any vocal mands in the NIS, waiting passively
after emitting gestural mands. The participants are described in Table 1, and examples of their
independent verbal operants prior to and following the intervention are given in Table 2.
Table 1. Description of Participants

P | Age | Level of Verbal Capability (based on Greer & Ross, 2008) | Standardized Test Scores | Repertoires (based on Greer & Ross, 2008)
A | 3.11 | Listener; Emergent Speaker | Preschool Language Scale-4: Expressive Communication SS = 78; Auditory Comprehension SS = 66; Total Language Score SS = 69. WPPSI-R: Verbal IQ SS = 79; Performance IQ SS = 76; Full Scale IQ SS = 74 | Generalized Imitation; Following Directions; Tacts and Mands with Autoclitics During 1:1 Instruction; Listener Component of Naming
B | 3.7 | Listener; Emergent Speaker | Preschool Language Scale-4: Expressive Communication SS = 67; Auditory Comprehension SS = 61; Total Language Score SS = 60 | Generalized Imitation; Following Directions; Tacts and Mands with Autoclitics During 1:1 Instruction; Listener Component of Naming

Note. P = participants; Age = represented in years & months; SS = Standard Score; WPPSI-R = Wechsler Preschool & Primary Scale of Intelligence-Revised.
Setting
During the probe sessions and intervention, a classroom teacher and/or a teaching assistant
collected data. The treatment sessions were conducted in the participants’ classroom and in the hallway
during transitions from and to the bus. The probe sessions were all conducted in the classroom during
regular activities. The EO probes were conducted in the classroom’s bathroom, at the cubbies during
morning and afternoon routines and at the lunch table, while the NIS probes were conducted in the
classroom’s toy area, lunch table and activity tables, where at least one other student was always present.
Definition of Behaviors
Dependent variables. The dependent variables measured in this study were the number of
vocal verbal operants emitted during non-instructional time and the number of mands emitted in the presence of EOs.
The dependent variables were measured during probe sessions prior to and immediately following the
SIP. Non-instructional setting (NIS) probes were conducted in the natural school setting, and the
experimenters provided no EOs during probe sessions.
NIS pre-probes were conducted for three consecutive days for both participants. Data were
collected on the number of tacts and mands emitted by the students for 10-minute sessions in each setting,
with a total of 30-minute daily pre-probes. Each setting included the presence of one or more students
together with the target student. The NIS were the classroom’s tables during group activities (e.g., playing
with blocks, play-dough or puzzles), the lunch table during lunch and the classroom’s toy area during free
play.
The procedure for the NIS probes, established by Pistoljevic and Greer (2006), consisted of 10-
minute observation sessions across three different settings. The experimenters collected the data for the
numbers and functions of utterances students emitted spontaneously across the target settings (e.g.,
numbers of pure mands and tacts). Mands were defined as verbal operants that specify their reinforcers
under relevant conditions (Skinner, 1957) and tacts as verbal operants that make contact with the
environment or identify components of the environment resulting in the listener or reader providing a
generalized reinforcer to the speaker or writer (Greer, in press). Autoclitics were defined as verbal
behavior that functions to modify, quantify, and/or qualify the effect of other verbal operants such as
mands or tacts (Greer, 2009; Skinner, 1957). Pure tacts and mands were defined as vocal verbal operants
that are under the control of nonverbal antecedents. For example, as he found a missing piece of a puzzle,
one of the participants emitted a pure tact “A train!” during the post-probe. An example of a pure mand
with autoclitics emitted during the post-probe at the lunch table was the student saying, “I want my juice”
when another student took the target student’s juice. An example of an autoclitic tact was “An orange car!”
The consequence for a tact is generalized reinforcement (e.g., social praise or attention) delivered by the
listener. The consequence for a mand is the listener delivering to the student the item or activity specified in
the mand. No intraverbal responses were recorded; the teacher did not interact with the student vocally.
Tacts and mands are called impure when they are under the control of a vocal antecedent, for example
“What is this?” for a tact and “Do you want to go to the toy area?” for a mand. During the NIS probe
sessions, Participants A and B received no vocal praise or correction from the teacher following the
emission of verbal behavior. Peers present in the NIS functioned as the audience for the verbal operants
emitted by these participants. Data on the exact words and sentences emitted by the students during the
pre- and post-probes in the NIS were also collected (Table 2).
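The distinction drawn above between pure and impure tacts and mands depends only on whether a vocal antecedent preceded the utterance. The Python sketch below shows one way an observer's records could be labeled and tallied under that definition. The record format and function names are hypothetical, the function of each utterance (mand or tact) is assumed to have been scored by the observer, and the third example utterance is invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Utterance:
    text: str
    function: str           # "mand" or "tact", as scored by the observer
    vocal_antecedent: bool  # True if a vocal antecedent preceded the utterance


def label(utterance):
    """Label an utterance as pure (no vocal antecedent) or impure."""
    purity = "impure" if utterance.vocal_antecedent else "pure"
    return f"{purity} {utterance.function}"


def tally(utterances):
    """Count utterances by purity and function for one probe session."""
    return Counter(label(u) for u in utterances)


records = [
    Utterance("A train!", "tact", vocal_antecedent=False),
    Utterance("I want my juice", "mand", vocal_antecedent=False),
    # Hypothetical impure tact, emitted after a teacher asks "What is this?"
    Utterance("A ball", "tact", vocal_antecedent=True),
]
print(tally(records))
```

Only the pure categories would count toward the NIS probe totals described in this section, since the teacher provided no vocal antecedents during those sessions.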
For the second dependent variable, data were recorded on the number of mands emitted during
establishing operation (EO) probes conducted before and after the SIP. For the pre-probe, 30 EOs were
created or captured within a school day and the specific instances were recorded, so that the same 30
opportunities were created during the post-probes. To ensure that the opportunities were actually EOs, the
experimenters contrived or waited for the occurrence of natural environment conditions that could induce
the student to mand for a desired item or activity. Each EO (e.g., cubby blocked, book withheld, door
closed) was presented by the experimenters, who waited 10 seconds for the student to emit the response.
If the target response (e.g., “I want book, please” or “Open door please”) was emitted, the item was
delivered or the student was allowed to engage in the activity requested. If the target response was not
emitted within 10 seconds, 10 more seconds were given and the EO was enhanced and exaggerated by
looking expectantly at the student or by praising other students who were engaged in the activity desired
by the target student. Data on the exact mands emitted by the participants before and after the SIP were
also collected and summarized in Table 3.
For this second dependent variable, responses to EOs were recorded as non-vocal mands, non-target vocal
mands, or target vocal mands. Non-vocal mands were defined as gestures, crying, or grabbing. Non-target
vocal mands were defined as single words or utterances with fewer words than the specified target
forms. The target mand form was specified for each student based on previously acquired mand forms.
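The three response categories above can be sketched as a small scoring helper. This is only an illustration under stated assumptions: the function name, the `target_word_count` parameter, and the sample responses are hypothetical, and counting words is a simplification of judging whether an utterance matches a student's target form.

```python
from collections import Counter

NON_VOCAL = "non-vocal mand"
NON_TARGET = "non-target vocal mand"
TARGET = "target vocal mand"


def classify_mand(utterance, target_word_count):
    """Classify one response to an EO probe.

    utterance: the words the student said, or None for a gesture, cry, or grab.
    target_word_count: assumed number of words in the student's target form
    (a hypothetical stand-in for a human judgment of the mand form).
    """
    if utterance is None or not utterance.strip():
        return NON_VOCAL
    if len(utterance.split()) < target_word_count:
        return NON_TARGET
    return TARGET


# Hypothetical responses for a student whose target form has four words.
responses = [None, "Door", "Dora book", "I want juice, please"]
tally = Counter(classify_mand(r, target_word_count=4) for r in responses)
```

In practice the observer, not a word count, decides whether an utterance is in the target form; the helper only shows how the three tallies relate to one another.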
Immediately after the SIP, NIS and EO post-probes were conducted following procedures identical to those
used during the pre-probes. The EO post-probes used scripted EOs in order to create the same mand
opportunities as those created during the pre-probes. One month after completion of the SIP, follow-up
probes were conducted to determine whether the effects were maintained.
Table 2. Specific Verbal Operants Emitted by Participants During Three 30-Minute NIS Probes

Participant A, Pre-Probes:
“I don’t want to eat my pizza”; “Throw away”; “Open please”; “Blocks!”; “Open please”; “Computer!”;
“I want that one”; “Open please”; “Cake!”

Participant A, Post-Probes:
“I want to go to recess”; “I want to clean up”; “I don’t know”; “Lie down”; “A train!”; “The end”;
“He flies!”; “I don’t know”; “I want my daddy”; “I found money!”; “A star!”; “Stand here”;
“What are you doing?”; “Now walk”; “This is a man”; “I want juice”; “Where is my juice?”;
“I want my juice”; “It’s Gordon!”; “Good morning”; “Thomas the train!”; “I don’t want this”;
“Good night!”; “Open please”; “Give me that one, please”; “I want another one”; “Down!”;
“Where is the W?”; “I made it”; “Puzzle please”; “Let’s do this”; “An umbrella!”; “orange car!”;
“All done!”; “I want more”; “I miss one”; “I did this”; “I found this”; “This goes here”;
“Where is this”; “I want this”; “I can’t clean up!”; “He falls down!”; “Excuse me”;
“Here is your seat”; “You have to sit down”; “Come on!”; “We need chair”; “Have a seat!”; “Follow me”

Participant B, Pre-Probes:
“Ok”; “away”; “a book”; “star!”; “square!”; “circle!”

Participant B, Post-Probes:
“An horse”; “Help please”; “Plate!”; “This”; “Thank you”; “It’s purple”; “Crayon”; “circle”; “Oh no!”;
“One, two, three, four, five”; “Oh, thank you”; “An heart”; “Yellow”; “Ok, what happened?”; “Brown”;
“Star”; “Oh oh green”; “I have a square”; “Wow, circle”; “Oh no”; “Bye bye”; “Help please”;
“Here you go!”; “Help, please”; “Thank you”; “A sheep”; “Oh oh, where did they go?”; “Recess?”;
“thank you”; “Puzzle”; “N”; “Look”; “C”; “Help, please”; “It’s broken”; “M”; “N”; “Ok”;
“It’s an helicopter”; “I color”; “All done”; “It’s Vinny’s”; “Blue!”; “Oh Thomas”; “Oh no!”;
“Here you go”; “It’s broken”; “The end”; “Look here”; “a book”
Independent variable. The independent variable in this study was the SIP, in which the
experimenter used learn units for mand instruction while increasing student opportunities by creating
EOs. The experimenters created 30 EOs divided into two 15-minute sessions, one during the
morning, from the school bus arrival to the morning routine, and the other during lunch. EOs, defined as
conditions which act to alter the momentary effectiveness of a reinforcer (Greer & Ross, 2008), were
created using interrupted chain procedures (e.g., withholding the straw after giving the juice box to the
student; blocking access to a necessary or desired item or activity; withholding a necessary or desired item).
After an EO occurred, the experimenter waited 10 seconds for the student to emit the correct vocal mand. To
ensure that EOs were actually in place, the experimenter counted only instances in which the participant
manded in either target or non-target form. That is, an instance was counted as an opportunity only when
the student manded; if no mand was emitted, the instance was not recorded. For example, if the teacher
had a cookie, an EO was considered to be in place only if the student manded by whining, grabbing,
pointing, or vocally asking for it. If the mand was emitted in the target form,
the experimenter delivered the item or activity and recorded it as a correct response. If the mand was
emitted in non-target form (e.g., crying, grabbing, reaching), the experimenter recorded it as an error and
provided an echoic correction by modeling the correct response. The requested item or activity was not
delivered unless the mand was emitted in target form, although in some cases the item or activity could
not be withheld. For example, if an EO was created for the student to mand to exit the school bus and he
remained silent, the experimenter eventually had to allow the student off the bus. The target mand forms
for Participant A were “I want ___ please,” “I need ___ please,” and “Give me ___ please.” The target
mand form for Participant B was “___ please.” The SIP continued until the criterion of 90% correct
responses for two consecutive sessions was met.
Table 3. Examples of Vocal Mands Emitted by Participants in Response to Contrived EO Probes
EOs Pre-Probes Post-Probes
1. Withhold backpack “I want my backpack, please”
2. Block door “Open please”
3. Block toy area “I want to go to toy area, please”
4. Block bathroom door “Door” “I want to go to the bathroom, please”
5. Withhold water “Give me water, please”
6. Withhold soap “Give me soap, please”
7. Block flushing toilet “Flush” “I want to flush the water, please”
8. Withhold book “Dora book” “Give me Thomas, please”
9. Withhold play-dough “Play-dough, please”
10. Utensils without food “Please” “I want this, please”
11. Straw without juice “Please” “I want juice, please”
12. Block chair “I want to sit down please”
For the SIP, mand instruction was delivered in learn units. A learn unit is a measure of teaching
defined by a three-term contingency for the student and two or more three-term contingencies for the
teacher (Albers & Greer, 1991; Greer, 2002). The teacher’s presentation of a learn unit occasions a
response from the student, and the student’s response occasions reinforcement or correction delivered by
the teacher. Different teaching procedures in addition to learn units were used for each student, based on
each student’s instructional history (Greer & Keohane, 2005; Greer & Ross, 2008). For Participant A, the
intervention was implemented with the mand procedure (Greer & Ross, 2008). During this procedure the
experimenter showed a preferred item or activity to the student, waited three seconds, and immediately
delivered the item or activity if the student independently emitted the target response form, or provided
an echoic correction if the student did not respond or emitted an incorrect response. For Participant B, a
1-second time delay procedure (Charlop & Trasowech, 1991; Ingenmey & Van Houten, 1991; Matson,
Sevin, Fridley, & Love, 1990) was used. As an antecedent response prompt, time delay procedures vary
the interval between the presentation of the natural stimulus and the presentation of the response prompt.
If the student did not independently respond within one second of the presentation of the item or activity,
the experimenter provided an echoic prompt of the target response for the student to echo and then
receive the item or activity specified. During this errorless learning procedure, students usually begin to
emit the correct response independently after a few sessions.
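The trial logic of the time delay procedure can be sketched as follows, using the plus, “P”, and minus codes that the Data Collection section describes for Participant B. The function and parameter names are hypothetical, and treating a response at exactly one second as independent is an assumption made only for this illustration.

```python
def score_time_delay_trial(latency_s, echoed_prompt, delay_s=1.0):
    """Score one learn unit delivered with a constant time delay.

    latency_s: seconds from presentation of the item or activity to an
        independent target response, or None if no such response occurred.
    echoed_prompt: True if the student echoed the experimenter's prompt.
    delay_s: the delay before the prompt (1 second for Participant B).
    Returns "+" (independent), "P" (prompted correct), or "-" (non-target).
    """
    if latency_s is not None and latency_s <= delay_s:
        return "+"
    if echoed_prompt:
        return "P"
    return "-"


# Hypothetical trials: independent response, prompted echo, no correct response.
codes = [
    score_time_delay_trial(0.6, echoed_prompt=False),
    score_time_delay_trial(None, echoed_prompt=True),
    score_time_delay_trial(None, echoed_prompt=False),
]
```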
Data Collection
Data were collected by one or two experimenters during the NIS probe sessions using a pen, a
clipboard, a timer, and a data collection sheet. The emission of tacts was recorded with a “T”, the
emission of mands was recorded with an “M”, and the exact words said by the students were recorded
(Tables 2 and 3). The data were then summed across settings and graphed as the total number of tacts and
mands emitted during the cumulative probe period. For the EO probes, data were collected using a pen, a
clipboard, a timer, and a data sheet. After each EO was created, a plus (+) was recorded when the student
correctly emitted the target mand form, and a minus (-) was recorded when a response was not emitted
or was emitted in a non-target form. The exact words emitted by the students were also recorded (see
Table 3). During the intervention, a timer, a pen, and a data sheet were also used. A plus (+) was recorded
for the emission of a target mand form and a minus (-) was recorded when the child did not emit the target
form of a vocal mand. For Participant B, during the one-second time delay procedure, a “P” was recorded
for correct responses emitted following a vocal prompt, a plus (+) was recorded for independent
responses in the target form, and a minus (-) was recorded for all non-target responses. The data were then
graphed on a 30-learn-unit graph, with a closed circle for correct independent responses and an open
circle for prompted responses.
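Summing the “T” and “M” records across settings is simple tallying, sketched below. The records and setting names are made up for illustration and are not data from the study.

```python
from collections import Counter

# Hypothetical data sheet: one (setting, code) pair per recorded utterance,
# where "T" marks a tact and "M" marks a mand.
records = [
    ("toy area", "T"),
    ("lunch table", "M"),
    ("lunch table", "M"),
    ("hallway", "T"),
    ("toy area", "T"),
]


def total_by_operant(records):
    """Sum tacts and mands across all settings for one probe period."""
    counts = Counter(code for _setting, code in records)
    return {"tacts": counts["T"], "mands": counts["M"]}


totals = total_by_operant(records)
```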
Interobserver Agreement
During the NIS probes, interobserver agreement (IOA) was assessed by two observers
simultaneously collecting the data. For Participant A, during NIS pre-probes IOA was calculated for 44%
of the sessions, with 100% agreement. IOA was also calculated for 33% of NIS post-probes, with 100%
agreement. For Participant B, during NIS pre-probes, IOA was calculated for 78% of the sessions,
with 100% agreement.
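The article does not state which IOA formula was used. A common choice is point-by-point agreement, the number of agreements divided by the total number of recorded items, multiplied by 100; the sketch below assumes that formula, and the observer records in it are hypothetical.

```python
def interobserver_agreement(observer_1, observer_2):
    """Point-by-point agreement between two observers, as a percentage.

    Each argument is a list of codes (for example "T" or "M") recorded for
    the same sequence of utterances. This formula is an assumption; the
    article reports only the resulting percentages.
    """
    if len(observer_1) != len(observer_2):
        raise ValueError("Both observers must score the same number of items")
    if not observer_1:
        raise ValueError("At least one scored item is required")
    agreements = sum(a == b for a, b in zip(observer_1, observer_2))
    return 100.0 * agreements / len(observer_1)


# Hypothetical session in which both observers recorded identical codes.
ioa = interobserver_agreement(["T", "M", "M", "T"], ["T", "M", "M", "T"])
```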
Design
A delayed multiple probe design across participants (Horner & Baer, 1978; Pistoljevic & Greer,
2006) was used to compare the number of verbal operants emitted before and after the SIP in NIS and to
compare the number of mands emitted during the EO probes before and after the treatment.
Results
During the EO pre-probes (Figure 1), Participant A emitted 0 target form mands, 6 non-target vocal
mands, and 24 non-vocal mands. Participant B emitted 0 target form mands, 5 non-target
vocal mands, and 25 non-vocal mands.
During the NIS pre-probes (Figure 2), Participant A emitted 1 tact and 3 mands during the first
session, 1 tact and 1 mand during the second session, and 2 tacts and 1 mand during the third session.
Participant B emitted 2 tacts and 1 mand during the first pre-probe session, 0 tacts and 0 mands in the
second session, and 3 tacts and 0 mands in the third session.
During the SIP, the numbers of correct mands emitted by Participant A were 9, 19, 25, 27, and 28,
and he met criterion after the fifth session. Participant B emitted 0, 1, 13, 18, 25, 21, 27, and 28 correct
mands during the SIP and met criterion at the eighth session.
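The criterion of 90% correct responses for two consecutive sessions can be checked against these session counts. The sketch below assumes each session contained 30 learn units, which matches the 30 EOs per day and the 30-learn-unit graph described earlier; the function name and parameters are hypothetical.

```python
def session_criterion_met(correct_counts, total=30, criterion_pct=90, consecutive=2):
    """Return the 1-based session at which criterion was met, or None.

    Criterion: at least criterion_pct percent correct in `consecutive`
    sessions in a row. Integer arithmetic avoids floating-point comparisons.
    """
    run = 0
    for session, correct in enumerate(correct_counts, start=1):
        if correct * 100 >= criterion_pct * total:
            run += 1
            if run == consecutive:
                return session
        else:
            run = 0
    return None


# Correct mands per SIP session as reported in the Results.
participant_a = [9, 19, 25, 27, 28]
participant_b = [0, 1, 13, 18, 25, 21, 27, 28]

met_a = session_criterion_met(participant_a)  # 5
met_b = session_criterion_met(participant_b)  # 8
```

With 30 learn units per session, 27 correct is exactly 90%, so the two final sessions for each participant (27 and 28 correct) satisfy the criterion.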
For the EO post-probes, Participant A emitted 26 target vocal mands, 4 non-target form mands,
and 0 non-vocal mands. Participant B emitted 26 target vocal mands, 3 non-target form mands, and 1
non-vocal mand.
Figure 1. The number of mands emitted during pre-, post-, and follow-up probes of 30 establishing
operations for both participants.
Figure 2. The number of mands and tacts emitted during 30-minute pre-, post-, and follow-up probes
across the NIS for both participants.
During the NIS post-probes, Participant A emitted 8 tacts and 6 mands in the first session, 5 tacts
and 4 mands in the second session, and 12 tacts and 15 mands in the third session. During the post-probes,
Participant B emitted 23 tacts and 2 mands in the first session, 10 tacts and 6 mands in the second
session, and 14 tacts and 1 mand in the third session.
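Summing the per-session counts reported above gives the totals for each probe phase. The sketch below does only that arithmetic on the reported numbers; the variable and function names are hypothetical.

```python
# (tacts, mands) per 30-minute NIS session, as reported in the Results.
nis_counts = {
    "A": {"pre": [(1, 3), (1, 1), (2, 1)], "post": [(8, 6), (5, 4), (12, 15)]},
    "B": {"pre": [(2, 1), (0, 0), (3, 0)], "post": [(23, 2), (10, 6), (14, 1)]},
}


def phase_totals(sessions):
    """Return total tacts and total mands across the sessions of one phase."""
    tacts = sum(t for t, _m in sessions)
    mands = sum(m for _t, m in sessions)
    return tacts, mands


totals = {
    participant: {phase: phase_totals(sessions) for phase, sessions in phases.items()}
    for participant, phases in nis_counts.items()
}
# Participant A: pre (4, 5), post (25, 25); Participant B: pre (5, 1), post (47, 9)
```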
The specific verbal operants emitted by participants in the NIS for the pre- and post-probes are
listed in Table 2, and examples of mands emitted in response to EOs are given in Table 3.
One month after the completion of the SIP, follow-up probes were conducted to measure the
continued effects of the intervention. In the follow-up probes in the NIS, Participant A
emitted 15 tacts and 12 mands and Participant B emitted 23 tacts and 5 mands. During the EO follow-up
probes, Participant A emitted 29 target form mands, 1 non-target vocal mand, and 0 non-vocal
mands. Participant B emitted 19 target form mands, 8 non-target vocal mands, and 3 non-vocal mands.
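Because every EO probe presented 30 opportunities, the reported counts can be expressed as the percentage of opportunities answered with a target form mand. The sketch below performs that conversion on the numbers reported in the Results; the names used are hypothetical.

```python
# (target, non-target vocal, non-vocal) mands per 30-EO probe, as reported.
eo_probes = {
    "A": {"pre": (0, 6, 24), "post": (26, 4, 0), "follow-up": (29, 1, 0)},
    "B": {"pre": (0, 5, 25), "post": (26, 3, 1), "follow-up": (19, 8, 3)},
}


def percent_target(counts):
    """Percentage of EO opportunities answered with a target form mand."""
    target = counts[0]
    return round(100 * target / sum(counts), 1)


summary = {
    participant: {phase: percent_target(counts) for phase, counts in phases.items()}
    for participant, phases in eo_probes.items()
}
# Participant A: 0.0, 86.7, 96.7; Participant B: 0.0, 86.7, 63.3
```

Each set of counts sums to 30, which is a quick consistency check on the reported values.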
Discussion
This study was conducted to test the effects of the SIP on the number of independent vocal verbal
operants emitted by students with low rates of spontaneous speaker behavior outside of instruction.
During the SIP, participants were exposed to multiple EOs and mand instruction, where appropriate
echoic models of mands were provided. These contrived motivating operations (Laraway, Snycerski,
Michael, & Poling, 2001) had value-altering effects, increasing the reinforcing effectiveness of some
object or event, and therefore creating an opportunity for a student to mand. In this experiment, following
the SIP, the students’ rates of emission of spontaneous speech significantly increased in response to the
EOs and during non-instructional time. One month after the intervention, the follow-up probes
revealed that both participants maintained high levels of spontaneous speech compared with baseline
levels.
In this experiment, the EOs and corresponding mands in the probes differed from those presented
during the SIP. Controlling the EOs in this manner ensured that the mands taught were not identical to
those in the probes. Thus, any mand a student emitted in the probes following the SIP was not one he had
been taught during the intervention. Rather than simply acquiring the vocabulary to mand, the
students may have learned a relationship between emitting appropriate vocal verbal behavior and delivery
of reinforcement (e.g., their needs were met more frequently).
The effectiveness of the SIP may be due to the fact that during mand instruction, the necessary
relationship between EOs and environmental circumstances in which the behavior is likely to be
reinforced (such as the presence of an appropriate audience) was established. It is also possible that the
procedure taught the students target responses that previously were not in their repertoire or that the
outcomes were a result of the presentation of multiple EOs and corrections across many different settings,
so that the instructional history necessary to acquire the target responses was produced.
For both participants, not only did the numbers of pure mands increase in the NIS during the
post-probes, but also the numbers of pure tacts significantly increased. The mean number of words
emitted per utterance also increased, showing a wider use of autoclitics for Participant A. The student
may have learned that emitting vocal verbal behavior increases the frequency of generalized reinforcement
received from the environment (e.g., attention from adults).
Interestingly, the experimenters also observed that the SIP may have affected toilet
training for Participant A. Prior to the SIP, the student did not independently mand for the bathroom and
had frequent accidents. After the SIP was implemented, the student started to independently mand for the
bathroom and no longer had accidents.
Williams and Greer (1993) found that when EOs were incorporated and verbal antecedents were
avoided, participants emitted more of what is typically characterized as spontaneous speech or the
initiation of language interactions. Both Ross and Greer (2003) and Tsiouri and Greer (2003) replicated
these findings. However, even when a child acquires a fluent speaker repertoire, learning new verbal
operants in each class of responses (e.g., mands, tacts) may require separate direct instruction for each
(Greer & Ross, 2008; Pistoljevic & Greer, 2006; Ross & Greer, 2003). Therefore, this
procedure and the results it yielded are only the first step in a long path of instruction for these students. A
speaker needs to become a speaker-as-own-listener, a Namer, Observational Learner, Reader, Writer,
Self-Editor, and Problem Solver, as defined by Greer and Keohane (2005) and Greer and Ross (2008).
References
Albers, A., & Greer, R. D. (1991). Is the three term contingency trial a predictor of effective instruction?
Journal of Behavioral Education, 1, 337-354.
Carter, M., & Grunsell, J. (2001). The behavior chain interruption strategy: A review of research and
discussion of future directions. The Journal of the Association for Persons with Severe
Handicaps, 26(1), 37-49.
Charlop, M. H., & Trasowech, J. E. (1991). Increasing autistic children's daily spontaneous speech.
Journal of Applied Behavior Analysis, 24, 747-761.
Drash, P. W., High, R. L., & Tudor, R. M. (1999). Using mand training to establish an echoic repertoire
in young children with autism. The Analysis of Verbal Behavior, 16, 29-44.
Dunst, C. J., Bruder, M. B., Trivette, C. M., Raab, M., & McLean, M. (2001). Natural learning
opportunities for infants, toddlers, and preschoolers. Young Exceptional Children, 4, 18-25.
Greer, R. D. (1994). A systems analysis of the behaviors of schooling. Journal of Behavioral Education,
4, 255-264.
Greer, R. D. (2002). Designing teaching strategies: An applied behavior analysis system approach. San
Diego, CA: Academic Press.
Greer, R. D. (2009). Teaching as applied behavior analysis. Unpublished manuscript. Teachers College,
Columbia University, NY.
Greer, R. D., & Keohane, D.D. (2005). The evolution of verbal behavior in young children. Behavioral
Development Bulletin, 1, 31-48.
Greer, R.D., Keohane, D.D., & Healy, O. (2002). Quality and comprehensive applications of behavior
analysis to schooling. The Behavior Analyst Today, 3, 120-132.
Greer, R. D., & Ross, D. E. (2004). Research in the induction and expansion of complex verbal behavior.
Journal of Early Intensive Behavioral Intervention, 1, 141-165.
Greer, R. D., & Ross, D. E. (2008). Verbal behavior analysis: Inducing and expanding new verbal
capabilities in children with language delays. New York, NY: Allyn and Bacon.
Greer, R. D., Stolfi, L., Chavez-Brown, M., & Rivera-Valdes, C. (2005). The emergence of the listener to
speaker component of naming in children as a function of multiple exemplar instruction. The
Grunsell, J., & Carter, M. (2002). The behavior chain interruption strategy: Generalization to out-of-
routine contexts. Education and Training in Mental Retardation and Developmental Disabilities,
37, 378-390.
Hart, B. & Risley, T. R. (1975). Incidental teaching of language in the preschool. Journal of Applied
Behavior Analysis, 8, 411-420.
Hart, B., & Risley, T. R. (1995). Meaningful differences in the everyday experience of young American
children. Baltimore: Paul H. Brookes.
Hall, G. A., & Sundberg, M. L. (1987). Teaching mands by manipulating conditioned establishing
Horner, R. D., & Baer, D. M. (1978). Multiple-probe technique: A variation on the multiple baseline. Journal
of Applied Behavior Analysis, 11, 189-196.
Ingenmey, R. & Van Houten, R. (1991). Using time delay to promote spontaneous speech in an autistic
child. Journal of Applied Behavior Analysis, 24, 591-596.
Ingham, P., & Greer, R. D. (1992). Changes in student and teacher responses in observed and generalized
settings as a function of supervisor observations. Journal of Applied Behavior Analysis, 25, 153-
164.
Laraway, S., Snycerski, S., Michael, J., & Poling, A. (2001). The abative effect: A new term to describe
the action of antecedents that reduce operant responding. The Analysis of Verbal Behavior, 18,
101-104.
Laraway, S., Snycerski, S., Michael, J., & Poling, A. (2003). Motivating operations and terms to
describe them: Some further refinements. Journal of Applied Behavior Analysis, 36(3), 407-414.
Lamm, N., & Greer, R. D. (1991). A systematic replication and a comparative analysis of CABAS.
Journal of Behavioral Education, 1, 427-444.
Matson, J. L., Sevin, J. A., Fridley, D., & Love, S. R. (1990). Increasing spontaneous language in three
autistic children. Journal of Applied Behavior Analysis, 23, 227-233.
McGee, G. G., Krantz, P.J., Mason, D., & McClannahan, L. E. (1983). A modified incidental-teaching
procedure for autistic youth: Acquisition and generalization of receptive object labels. Journal of
Applied Behavior Analysis, 16, 329-338.
McGee, C., Morrier, M. J., & Daly, T. (1999). An incidental teaching approach to early intervention for
toddlers with autism. The Journal of the Association for Persons with Severe Handicaps, 24, 133-
146.
Michael, J. (1993). Establishing operations. The Behavior Analyst, 16, 191-206.
Nuzzolo-Gomez, R., & Greer, R. D. (2004). Emergence of untaught mands or tacts of novel adjective
object pairs as a function of instructional history. The Analysis of Verbal Behavior, 20, 63-76.
Pistoljevic, N. (2008). The effects of multiple exemplar and intensive tact instruction on the acquisition of
Naming in preschoolers diagnosed with autism and other language delays. (Doctoral dissertation,
Columbia University, 2008). Abstract from: UMI Proquest Digital Dissertations [on-line].
Dissertations Abstract Item: AAT 3317598.
Pistoljevic, N., & Greer, R. D. (2006). The effects of daily intensive tact instruction on preschool
students’ emission of pure tacts and mands in non-instructional setting. Journal of Early and
Intensive Behavior Intervention, 3(1), 103-120.
Reilly-Lawson, T., & Greer, R. D. (2006). Teaching the function of writing to middle school students
with academic delays. Journal of Early and Intensive Behavior Intervention, 3(1), 151-170.
Ross, D. E. (1995). Verbal immersion to increase speaker behavior. Poster presentation at the annual
international conference for the Association for Behavior Analysis, Washington, DC.
Ross, D. E. (1998). Generalized imitation and the mand: Inducing the first instances of vocal verbal
behavior in young children with autism. (Doctoral dissertation, Columbia University, 1998).
Abstract from: UMI Proquest Digital Dissertations [on-line]. Dissertations Abstract Item: AAT
9834364.
Ross, D. E., & Greer, R. D. (2003). Generalized imitation and the mand: Inducing first instances of
speech in young children with autism. Research in Developmental Disabilities, 24, 58-74.
Ross, D. E., Nuzzolo, R., Stolfi, L., & Natarelli, S. (2006). Effects of speaker immersion on independent
speaker behavior of preschool children with verbal delays. Journal of Early and Intensive Behavior
Intervention, 3, 135-149.
Ross, D. E., Singer-Dudek, J., & Greer, R. D. (2005). The Teacher Performance Rate Accuracy scale
(TPRA): Training as evaluation. Education and Training in Developmental Disabilities, 40(4),
411-423.
Schwartz, B. S. (1994). A comparison of establishing operations for teaching mands to children with
language delays. (Doctoral dissertation, Columbia University, 1994). Abstract from: UMI Proquest
Digital Dissertations [on-line]. Dissertations Abstract Item: AAT 9424540.
Selinske, J. E., Greer, R. D., & Lodhi, S. (1991). A functional analysis of the comprehensive application
of behavior analysis to schooling. Journal of Applied Behavior Analysis, 24, 107-117.
Sigafoos, J., & Littlewood, R. (1999). Communication intervention on the playground: A case study on
teaching requesting to a young child with autism. International Journal of Disability,
Development and Education, 46(3), 421-429.
Skinner, B. F. (1957). Verbal behavior. Acton, MA: Copley Publishing Group and the B. F. Skinner
Foundation.
Sundberg, M. L. (1993). The applications of establishing operations. The Behavior Analyst, 16, 211-214.
Sundberg, M. L., Loeb, M., Hale, L., & Eigenheer, P. (2002). Contriving establishing operations to teach
mands for information. The Analysis of Verbal Behavior, 18, 15-29.
Sundberg, M., & Michael, J. (2001). The benefits of Skinner's analysis of verbal behavior for children
with autism. Behavior Modification, 25(5), 698-724.
Sundberg, M.L., & Partington, J. W. (1998). Teaching language to children with autism or other
developmental disabilities. Danville, CA: Behavior Analysts, Inc.
Sundberg, M. L. (2004). A behavioral analysis of motivation and its relation to mand training. In L. W.
Williams (Ed.), Developmental disabilities: Etiology, assessment, intervention, and integration.
Reno, NV: Context Press.
Tsiouri, I., & Greer, R. D. (2003). Inducing vocal verbal behavior through rapid motor imitation training
in young children with language delays. Journal of Behavioral Education, 12, 185-206.
Williams, G., & Greer, R. D. (1993). A comparison of verbal-behavior and linguistic-communication
curricula for training developmentally delayed adolescents to acquire and maintain vocal speech.
Behaviorology, 1, 31-46.
Acknowledgements
The authors would like to thank Dr. R. D. Greer, professor of Education and Psychology at Teachers
College and The Graduate School of Arts and Sciences at Columbia University for his endless guidance
and support. Also, we thank the staff, students, and families of The Fred S. Keller School for their
participation and continued support.
Nirvana Pistoljevic, PhD

Teachers College, Columbia Univ.
525 125th Street, PO Box 223
New York, NY 10027
Phone: 212-678-8328
nirvana.pistoljevic@gmail.com
Claire Cahill, MA
Fred S. Keller School
PO Box 716
680 Oak Tree Road
Palisades, NY 10964
Phone: 845-359-8846
claire.cahill@gmail.com
Fabiola Casarini, MA
Fred S. Keller School
PO Box 716
680 Oak Tree Road
Palisades, NY 10964
Phone: 845-359-8846
fc2302@columbia.edu