NUCLEAR STRUCTURES IN LINGUISTICS
Richard S. Pittman
Summer Institute of Linguistics
1. The expression ‘nuclear structures’ has become, in our day, a term to
conjure with; but the concept is not new in linguistics.¹ It is mentioned or
implied in contemporary discussions under the terms ‘immediate constituents’,
‘rank’, and ‘endocentric phrases’; in the older literature it is referred to as
‘modification’, ‘attribution’, or ‘subordination’. An assumption of different
ranks is implicit in such word-pairs as stem-affix, head-attribute,
noun-adjective, substantive-modifier, verb-adverb, principal-subordinate.
The purpose of this paper is not to offer a new concept for linguistic theory,
but rather to codify the criteria which probably serve as the basis for most
judgments of relative rank that have been tacitly invoked in linguistic
analysis.²
2. We begin with the assumption that the principle of immediate constituents
is valid.³ This principle might be described as a sort of gravitational
attraction between certain morphemes or groups of morphemes, but not between
others. As an analyst observes it when undertaking the description of a given
language, he might call it ‘concomitance’ or ‘affinity’, or simply the tendency
of a given class of sequences to occur only with certain other selected classes
of sequences. Thus, some sort of ‘essential affinity’ is observed to exist
between red and -ish but not between red and -ing, between very and just but
not between very and runs.
The usefulness of this principle consists in the very considerable
simplification of analysis which it affords. If one does not accept the IC⁴
hypothesis, one is almost compelled to regard every morpheme in an utterance as
pertinent to the description of every other morpheme. But a good analysis in
terms of ICs usually reduces the total possible environmental factors of a
given morpheme or sequence of morphemes to one: in other words, it states that
the only pertinent environment of a given IC is its concomitant (the other
IC).⁵ Thus, in the sentence I would like to have gone, the IC principle strips
down the pertinent environment of would to the single morpheme like, that of to
to the sequence have gone, and that of have to gone.⁶ This means that a proper
analysis of the ICs of any expression should be rewarded by a very appreciable
saving of labor, since it eliminates the non-pertinent parts of an utterance at
each level of description.

¹ Cf. Leonard Bloomfield, Language 161, 209 f., 21 f. (New York, 1933); Otto
Jespersen, The philosophy of grammar 97-107 (New York, 1924). Needless to say,
my development owes a great deal to these sources.
² The first stimulus for this paper was received while I was attending the
Linguistic Institute at the University of Michigan in the summer of 1945,
holding a scholarship from the American Council of Learned Societies. An
earlier version of the paper was read at a meeting of the Linguistic Society of
America, 30 December 1947. I am indebted to Zellig S. Harris for help in
preparing the published draft.
³ Cf. especially Bloomfield, op.cit.; Kenneth L. Pike, Taxemes and immediate
constituents, Lang. 19.65-82 (1943); Pike, Analysis of a Mixteco text, IJAL
10.113-38 (1944), esp. 120; Eugene A. Nida, Syntax 44-61 (Glendale, Calif.,
1946); Rulon S. Wells, Immediate constituents, Lang. 23.81-117 (1947). I am
especially indebted to Pike and Nida for instruction in the principles of
immediate constituents.
⁴ This useful abbreviation for immediate constituent (plural ICs) is taken from
Wells, op.cit.
⁵ Of course there are always exceptions, for instance factors of intonation,
concord, and substitution (pronouns). The last two might be handled by the
method suggested in Zellig S. Harris, Discontinuous morphemes, Lang. 21.121-7
(1945), and the first by the method

3. It is not the intention of this paper, however, to examine the techniques
for determining ICs. Whatever method a linguist follows, his final result will
in all probability assume several degrees of rank. To certain constituents he
is likely to assign a principal or ‘central’ status; these he may label roots,
stems, bases, themes, heads, nouns, verbs, main clauses, etc. To other
constituents he is likely to assign a subordinate or ‘lateral’ status; these he
may call affixes, enclitics, formatives, attributes, modifiers, subordinate
clauses, etc. It would be possible to term the central constituents ‘nuclei’⁷
and the lateral ones ‘satellites’.
These terms are of course not meant to imply that there must be anything
inherently principal or subordinate about all morphemes and groups of
morphemes. They merely affirm that in most expressions the linguistic structure
is such as to make this distinction and this relationship a very convenient one
for the systematizer.⁸
4. But what, precisely, is the advantage of labeling one constituent central
and another lateral? Probably the principal gain is that since we
conventionally describe the largest classes first, and smaller classes (or
satellites) in terms of their relation to nuclei, the central-lateral
classification gives us a desirable working basis for our description. It shows
us the essential nuclei at all levels, so that in the descriptive arrangement
the satellites may then be simply grouped with their respective nuclei. Just as
an astronomer finds it simpler to describe the moon's relation to the earth
than its relation to the sun, so a linguist, in analyzing the sentence Eat your
bread, finds it simpler to describe the relation of your to bread than its
relation to eat.
5. This dichotomy into nuclei and satellites poses the question which is the
primary concern of this paper: What procedures are followed in deciding that
this is a nucleus and that a satellite, this a stem or head and that an affix
or attribute? Most linguists seem to make the classification intuitively. The
following is an attempt to suggest some of the assumptions which may underlie
this intuition.
Ten premises are proposed as probably constituting the basis for most of the
distinctions of rank that linguists make. In illustrating the premises, the
sequence AB represents any two immediate constituents of an expression in any
given language. Theoretically, it does not matter whether these are regarded as
morphemes, words, or phrases; actually, however, the more morphemes there are
in an IC, the more complicated it would be for illustrative purposes. For this
reason, most of the illustrations will contain only single words and morphemes.
An arrow points from satellite to nucleus: my → hat. A subscript numeral after
a letter indicates that the letter represents an entire form-class rather than
a single form; thus, A₁ (man) = the form-class to which man belongs, i.e. the
class of nouns. It is assumed in the illustrations that the ICs have already
been determined: the problem which will occupy us is their relative rank.

suggested in the same writer's Simultaneous components in phonology, Lang.
20.181-205 (1944).
⁶ On have gone see Harris, Discontinuous morphemes (fn. 5 above).
⁷ Pike appears to have been among the first to use the term ‘nucleus’ in this
sense. See his Analysis of a Mixteco text (fn. 3 above).
⁸ The equations in Zellig S. Harris, From morpheme to utterance, Lang.
22.161-83 (1946), though not using the term IC, are based on the ‘nuclear
hypothesis’. His highest surviving numbered formulas represent the base nuclei.
The eliminated formulas at each level represent the satellites.

Premise 1. INDEPENDENCE.⁹ If one of two ICs occurs alone but the other does
not, the former is usually considered to be central and its concomitant
lateral. Thus, we probably consider affixes as subordinate to stems because
stems, in many languages, occur alone, whereas affixes usually do not. In
English, a morpheme of the class of talk may occur without the suffix -ing, but
not conversely. In Spanish, a word of the class of perro may occur without the
morpheme su, but not conversely. This premise might also be referred to as
‘dispensability’: the more dispensable of two ICs is usually regarded as the
satellite.
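
The independence test can be illustrated with a minimal sketch. It assumes a toy corpus in which each utterance is a tuple of morphemes; the corpus and the function names occurs_alone and rank_by_independence are hypothetical illustrations, not part of the paper.

```python
# Toy corpus (hypothetical data): each utterance is a tuple of morphemes.
UTTERANCES = [
    ("talk",),
    ("talk", "-ing"),
    ("walk",),
    ("walk", "-ing"),
]

def occurs_alone(form, utterances):
    """True if some utterance consists of this form and nothing else."""
    return any(utterance == (form,) for utterance in utterances)

def rank_by_independence(a, b, utterances):
    """Return (nucleus, satellite) if exactly one IC occurs alone, else None."""
    a_alone = occurs_alone(a, utterances)
    b_alone = occurs_alone(b, utterances)
    if a_alone and not b_alone:
        return (a, b)
    if b_alone and not a_alone:
        return (b, a)
    return None  # both or neither occur alone: the premise gives no verdict

print(rank_by_independence("talk", "-ing", UTTERANCES))
```

When both forms occur alone, or neither does, the sketch returns None, which corresponds to the cases where this premise cannot decide and other premises must be consulted.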
Premise 2. CLASS SIZE. If one of two ICs belongs to a larger form-class (i.e.
a class with more members) than the other, it is usually considered to be
central and its concomitant lateral. If A₁ of the sequence A₁B₂ represents a
class of fifty members and B₂ a class of five members, it is probable that A
will be labelled central and B lateral. This premise is fairly apparent in the
relative sizes of the English adverb and verb classes, pronouns and verbs,
affixes and stems, e.g. go←away, short←en, Fr. je→mange.
Premise 3. VERSATILITY (RANGE). If one of two ICs has a potential range of
occurrence with more different classes of concomitants than the other, it is
usually considered central and its concomitant lateral. If A of the sequence AB
occurs with five different classes of concomitants, while B occurs with only
two, it is probable that A will be interpreted as a nucleus and B as a
satellite. An example is the relationship of English nouns and adjectives.
Their relative independence and class size might be debatable, but there seems
to be little doubt that nouns occur in a much greater variety of environments
than adjectives. Other illustrations: come←down, in→side, Fr. deux→ans.
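
The versatility comparison amounts to counting, for each form-class, how many different classes it is found with as a concomitant. The following is a minimal sketch under the assumption that IC pairs have already been determined and labelled by class; the observed pairs and the function names are hypothetical.

```python
from collections import defaultdict

def concomitant_ranges(ic_pairs):
    """Map each form-class to the set of classes it occurs with as an IC."""
    ranges = defaultdict(set)
    for left, right in ic_pairs:
        ranges[left].add(right)
        ranges[right].add(left)
    return ranges

def more_versatile(class_a, class_b, ic_pairs):
    """Return the class with the wider range of concomitants, or None on a tie."""
    ranges = concomitant_ranges(ic_pairs)
    size_a, size_b = len(ranges[class_a]), len(ranges[class_b])
    if size_a == size_b:
        return None
    return class_a if size_a > size_b else class_b

# Hypothetical class-level IC pairs observed in a small sample.
OBSERVED = [
    ("adjective", "noun"),
    ("article", "noun"),
    ("noun", "verb"),
    ("preposition", "noun"),
    ("adverb", "adjective"),
]

print(more_versatile("noun", "adjective", OBSERVED))
```

In this sample the noun class occurs with four classes of concomitants and the adjective class with two, so the sketch favors the noun as nucleus.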
Premise 4. ENDOCENTRICITY.¹⁰ If a constitute¹¹ belongs to the same class as
one of its immediate constituents, that constituent is usually interpreted as a
nucleus and its concomitant as a satellite. If the sequence AB belongs to Class
1 and the constituent B also belongs to Class 1, B will probably be regarded as
a nucleus. Most endocentric expressions come under this premise. Examples:
can→read, big→dog, Sp. [illegible].

⁹ Premises 1, 2, and [illegible] have been suggested, in slightly different
form, by Hockett, and Premise [illegible] by Bloch, in Charles F. Hockett's
review of Nida's Morphology, Lang. 23.273-85 (1947), esp. 282. Several of these
premises were used by Kenneth L. Pike and Eunice V. Pike, Immediate
constituents of Mazateco syllables, IJAL 13.78-91 (1947).
¹⁰ Many linguists might accept this as the only valid form of rank, excluding
exocentric constructions. But rank is nevertheless implied by the description
of an exocentric form like [illegible] as consisting of stem plus affix.
¹¹ A constitute is an expression that consists of two (or more) ICs; see Wells,
op.cit.

Premise 5. CLASS FREQUENCY.¹² If one of two classes of ICs occurs oftener than
the other, it is likely to be considered central and its concomitant lateral.
If A₁ of the sequence A₁B₂ occurs 100 times to 10 occurrences of B₂, class A₁
will probably be regarded as central. In testing this premise in English, one
would check to see if nouns occur more often than adjectives, verbs more often
than adverbs, stems more often than affixes, independent clauses more often
than dependent clauses. Of course, if the premises of independence and
versatility are valid, this one would seem to be a necessary corollary, since
more independent and versatile classes would be expected to occur more often
than those which are less so.
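
The check described for English can be sketched as a count of running occurrences per form-class. The sketch assumes a text already tagged by class; the sample text, the tags, and the function name class_text_frequency are hypothetical.

```python
from collections import Counter

def class_text_frequency(tagged_text):
    """Count running (text) occurrences of each form-class."""
    return Counter(form_class for _form, form_class in tagged_text)

# Hypothetical tagged sample: (form, form-class) in running order.
TAGGED_TEXT = [
    ("the", "article"), ("big", "adjective"), ("dog", "noun"),
    ("saw", "verb"), ("a", "article"), ("cat", "noun"),
    ("and", "conjunction"), ("the", "article"), ("cat", "noun"),
    ("ran", "verb"),
]

freq = class_text_frequency(TAGGED_TEXT)
# Compare the two classes of an A B sequence, here noun and adjective.
print(freq["noun"], freq["adjective"])
```

A sample this small proves nothing about English; it only shows the shape of the comparison that a real count over a large text would make.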
Premise 6. INDIVIDUAL FREQUENCY. In seeming contradiction to the premise of
class frequency, it may be possible to state that an individual constituent
which occurs more often than its concomitant is likely to be considered lateral
and its concomitant central. If A of the sequence AB is observed to occur more
often in the language than its concomitant B, it is very possible that A will
be interpreted as lateral and B as central. This, of course, is readily
apparent in a language such as Nahuatl, where stems do not occur without
affixes, and where hence certain affixes must occur very much more often than
any member of the stem classes, e.g. the prefix ni- and the suffix -t have much
higher frequencies than any of the stems with which they occur.
Exceptions to this premise will immediately become apparent; but there seems
to be enough evidence to justify its inclusion. For example, a high percentage
of the words which were eliminated from Van der Beke's, Morgan's, and
Buchanan's word counts (for French, German, and Spanish respectively)¹³
because of frequencies too high to be worth counting, were words which would
generally be considered lateral types.
Premise 7. PROSODY. In some languages, factors of syllable length, stress,
pitch, or intonation may influence determinations of rank. In Nahuatl, for
example, a stress on the word kénika ‘how’ leads one to describe it as kéni←ka,
stem plus enclitic.
Premise 8. LENGTH. If nothing is known about two ICs except their length (i.e.
the number of phonemes contained in them), it is very likely that the longer
will be classified as nuclear and the shorter as a satellite.¹⁴ If a linguist
were asked to make a guess at two Nahuatl terms about which he knew nothing,
e.g. i-kal and neknemi-s, he would, in all probability, surmise that the i- and
the -s were lateral elements (affixes). This premise may sound like linguistic
heresy, yet there can be little doubt that the really ‘fast’ linguists who ‘get
the hang’ of a language in record time use all sorts of undefined mental
shortcuts, including this one, in probing linguistic structures. It is also
worth while to compare this premise with the premise of individual frequency,
and to recall at this point Zipf's hypothesis that ‘as the relative frequency
of a word (or morpheme) increases, it tends to decrease in magnitude.’¹⁵ It
would seem that there may indeed be a detectable correlation between length,
frequency, and rank.

¹² Martin Joos, Statistical patterns in Gothic phonology, Lang. 18.33-8
(1942), distinguishes between text frequency and list frequency. I am referring
here to text frequency.
¹³ George E. Van der Beke, French word book (Publications of the American and
Canadian Committees on Modern Languages, Vol. 15; New York, 1929); Bayard Q.
Morgan, German frequency word book (PACCML, Vol. 9; New York, 1928); Milton A.
Buchanan, A graded Spanish word book (PACCML, Vol. 3; Toronto, 1927).
¹⁴ This is probably much less likely in syntax than in morphology.

Premise 9. MEANING. Many linguists might deny any valid correlation between
meaning and rank; and yet, given the sequences noéa: ‘my house’ and keinckea
‘them he-eats’, and no further information about them, they would very probably
be willing to hazard a guess that éa: ‘house’ and ka ‘he-eats’ are nuclei and
that the other elements are prefixes. Substantival and verbal concepts are very
strongly associated in the minds of most of us with linguistic nuclei.
Premise 10. PATTERN. This premise operates here, of course, as in all other
phases of linguistic analysis: unfamiliar elements are interpreted on the
analogy of those which are familiar. Cran- is listed as lateral to berry
because black, for instance, is lateral to berry.
6. Having alleged that the foregoing premises probably form the basis for many
or most of the judgments which linguists make regarding rank, one might ask,
Are these criteria valid?
Doubtless they represent varying degrees of validity, depending on the
language in question and on the linguist handling them. The first four seem to
be especially useful. The others probably represent ‘reinforcing’ criteria
rather than primary determinants. The difficulty with the ninth is, of course,
the universal problem of the definitive classification of meanings and the
interference of the linguist's own background. Perhaps it is safe to say that
where the premises are unanimous in favoring a given ranking, few linguists
would object. Where there is considerable contradiction between the criteria,
there will be hesitancy and disagreement with regard to rank.
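
The distinction between unanimous and contradictory premises can be sketched as a simple tally. This is an illustration only: the paper does not propose a scoring procedure, and the function name combine_verdicts, the labels ‘A’ and ‘B’, and the sample verdicts are hypothetical.

```python
def combine_verdicts(verdicts):
    """Combine per-premise verdicts on which of two ICs is the nucleus.

    verdicts maps a premise name to 'A', 'B', or None (no verdict).
    Returns the agreed nucleus when all verdicts agree, 'contested'
    when they conflict, and 'undetermined' when no premise applies.
    """
    votes = [v for v in verdicts.values() if v is not None]
    if not votes:
        return "undetermined"
    if all(v == votes[0] for v in votes):
        return votes[0]
    return "contested"

# Hypothetical verdicts for a stem (A) and an affix (B).
print(combine_verdicts({
    "independence": "A",
    "class size": "A",
    "versatility": "A",
    "prosody": None,
}))
```

A fuller treatment might weight the first four premises more heavily than the ‘reinforcing’ ones, but any such weights would be an assumption beyond what the text states.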
7. This, however, is not too disturbing, since it is not claimed that a graded
relationship exists between all the constituents of any language. The most
immediately apparent exceptions are compounds and coordinate constructions in
which, instead of one nucleus and one satellite, there may be two (or more)
nuclei of the same class, e.g. It's going to rain, I'm going home. But even
these may tend, at times, to be interpreted as having a graded structure.
English compound constructions like mailman, post office, goldbug, etc., while
ostensibly having two co-equal nuclei, are actually often analyzed as
satellite-nucleus constitutes instead. Many other variations are also possible,
such as satellite-nucleus-satellite (un-truth-ful), nucleus-satellite-satellite
(whisper-ing-s), nucleus-nucleus-satellite ([illegible]), etc. A close IC
analysis of such forms, however, often results in breaking them into separate
layers.
It is also possible, of course, for a single constituent to be simultaneously
a nucleus and a satellite. Thus, in the phrase very good idea, good is
simultaneously a nucleus for the satellite very and a satellite to the nucleus
idea.

¹⁵ George K. Zipf, Relative frequency as a determinant of phonetic change,
Harvard Studies in Classical Philology 40.1-95 (1929); The psycho-biology of
language (Boston, 1935). Martin Joos, in his review of the latter work, Lang.
12.196-210 (1936), though rejecting Zipf's causal relationship between
frequency and length, nevertheless appears to admit the correlation as a
‘functional interrelation’.

Perhaps the hardest cases to handle are those where each IC represents a
different class, but the classes are approximately equal in size, versatility,
frequency, etc. English subject-predicate constructions are of this type. It is
hardly possible to call them coordinate, and yet to rank either subject or
predicate as subordinate to the other might incur considerable controversy. It
might be convenient to term such forms ‘collateral’ classes, meaning classes
which are approximately equal, judged by our premises, but not identical.
8. Although the primary implication of these premises has been conceived
as applying to grammar, it seems that they may also be profitably applied to
phonemic and syllabic structures. The chief illustration of this treatment is
the article by Kenneth L. Pike and Eunice V. Pike on Mazateco syllables, already
cited (fn. 9).