
Linguistic Society of America

Frequency of Consonant Clusters
Author(s): Sol Saporta
Source: Language, Vol. 31, No. 1 (Jan. - Mar., 1955), pp. 25-30
Published by: Linguistic Society of America
Stable URL: http://www.jstor.org/stable/410889

FREQUENCY OF CONSONANT CLUSTERS


SOL SAPORTA
University of Illinois
0.1. PSYCHOLINGUISTICS AND STRUCTURAL LINGUISTICS. The psycholinguistic approach to certain linguistic phenomena is based on two general assumptions, which most structural linguists would probably accept but would probably consider irrelevant: first, that language as a form of learned behavior is subject to the general principles which govern all learned behavior; second, that to determine what these principles are and how they function, the relative frequency of phenomena may be significant. In his description of the phonology of a language, the structural linguist has usually limited his analysis to a presentation of those sequences of phonemes which occur in a language, without noting the relative frequencies of these combinations. It is true that several studies have offered systematic statements of distribution and clustering, pointing out for example that in English a phone of the /ŋ/ phoneme does not appear in word-initial position, or that /str-/ is an initial cluster but /sθr-/ is not.1 But the greater-than-chance occurrence of some sequences and the less-than-chance occurrence of others has usually gone unmentioned. The contribution of a psycholinguistic analysis is to suggest that these deviations from chance are not random, but are governed by some 'lawful' principle.


0.2. PURPOSE OF THE PRESENT STUDY. In particular, the present paper suggests that analysis of the relative frequency of consonant clusters will reveal a tendency on the part of any language system to produce speech in such a way as to consider the effort of both the speaker and the listener, the encoder and the decoder. In any consonant cluster, the situation of least effort for the speaker is that in which the successive phonemes are most similar; but this requires maximum effort on the part of the listener, who is then forced to make a series of fine discriminations. For the listener, the optimal situation is that in which the phonemes of a cluster differ as much as possible; but this requires maximum effort on the part of the speaker. Our hypothesis may be phrased as follows: the average frequency of a consonant cluster is a function of the difference between the phonemes in the cluster: low frequencies are expected for clusters which are either extremely similar or extremely dissimilar; high frequencies are expected for clusters which are at neither extreme.2
1 For such an analysis see Benjamin L. Whorf, Linguistics as an exact science, Technology review 43:2.3-8 (1940); Zellig S. Harris, Methods in structural linguistics (Chicago, 1951). I am indebted to Henry R. Kahane and Charles E. Osgood for valuable suggestions. I have also benefited from discussions by members of the Psycholinguistic Seminar held in the summer of 1953 at Indiana University, under the sponsorship of the Social Science Research Council.
2 The consonants /r/ and /l/, as well as vowels and semivowels, are excluded from this study. Combinations involving these phonemes need further investigation.

LANGUAGE, VOLUME 31, NUMBERS 1-2

1.1. THE TEXTS. Two texts have been analyzed to test this hypothesis, one in colloquial English, the other in substandard Mexican Spanish. The English material is that presented by Carroll in a mimeographed report of 15 March 1952:3 for a corpus extracted from modern plays, comprising about 20,000 phonemes,4 the occurrences of each phoneme were tabulated so as to show how often it is followed and preceded by every other phoneme. A sequence such as /-mps/ in the word jumps is considered an occurrence of the cluster /mp/ and also an occurrence of the cluster /ps/. The Spanish material is an unpublished phonemic transcription of the speech of a Mexican informant, elicited by a field worker.5
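The counting convention just described (every pair of adjacent consonants counts once, so /-mps/ yields both /mp/ and /ps/) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used by Carroll or by the author: the consonant inventory, the token spellings, and the function name count_clusters are all hypothetical.

```python
from collections import Counter

# Illustrative inventory only (an assumption): a subset of consonants,
# with /r/ and /l/ left out as in the study.
CONSONANTS = {"m", "p", "b", "f", "v", "n", "t", "d", "s", "z", "k", "g"}


def count_clusters(phonemes):
    """Count each pair of adjacent consonants in a transcribed sequence."""
    counts = Counter()
    for first, second in zip(phonemes, phonemes[1:]):
        # A vowel or a juncture symbol such as "#" is not in CONSONANTS,
        # so it breaks the sequence and no cluster is counted across it.
        if first in CONSONANTS and second in CONSONANTS:
            counts[(first, second)] += 1
    return counts


# "jumps": the final /-mps/ contributes one /mp/ and one /ps/.
jumps = ["ǰ", "ə", "m", "p", "s"]
print(count_clusters(jumps))
```

Because a juncture symbol is simply absent from the consonant set, a sequence such as t, #, p contributes no cluster, which matches the treatment of juncture adopted later in the paper.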
1.2. PHONEMES AND DISTINCTIVE FEATURES. To determine the degrees of difference between phonemes, an analysis into distinctive features seemed appropriate; the systems used are those presented by Jakobson for English and by Alarcos Llorach for Spanish.6 For example, the English phonemes /p/, /t/ and /θ/ show the following features (Jakobson 43):

                              p   t   θ
Vocalic/Non-Vocalic           -   -   -
Consonantal/Non-Consonantal   +   +   +
Compact/Diffuse               -   -   -
Grave/Acute                   +   -   -
Nasal/Oral                    -   -   -
Tense/Lax                     +   +   +
Continuant/Interrupted        -   -   +
Strident/Mellow                   -   -
/t/ and /θ/ therefore have the same distinctive features except for the contrast continuant/interrupted; this contrast is considered to be a difference of two units. On the other hand, /t/ and /p/ contrast as to grave/acute, which is also a difference of two units; but in addition, the distinction strident/mellow, which is minus in /t/, is irrelevant in /p/. This 'semi-contrast' is considered to be one unit of difference, so that /p/ and /t/ differ by a total of three units. In this way, the degree of difference between any two phones in a language can be established and labeled. The results for English and Spanish are presented below (§2.1, §2.2).

1.3. ASSUMPTIONS AND DIFFICULTIES. A number of special problems need comment before the results of this procedure can be discussed. These concern syllable boundaries and syllable position, juncture, and the frequency of morphemes and allomorphs.

Miss Fischer-Jorgensen, in a recent article,7 has pointed out that structural laws often determine which clusters do or do not appear within a syllable. For instance, the sequence /tl/ does not occur within an English syllable; that this is a fact of structure and not an accidental gap is shown by the parallel absence of /dl/. In this paper, however, syllable boundaries are disregarded; the sequence /tl/ is listed among the clusters because it occurs in atlas. Also disregarded is the limitation of certain clusters to a particular position within the syllable, for example of /pr/ to the first position and /rp/ to the last.

If /tl/ occurs in atlas, does it also occur in the phrase at last? The scholar who transcribed our English text writes this sequence /t # l/, where the /#/ serves as a cover symbol for various kinds of juncture phenomena, both final and medial. Although the exact phonetic basis of this symbol is not always clear, it has seemed wise not to alter the phonemic transcription of the text. Accordingly, the sequence at last is not considered to include the cluster /tl/.

That certain grammatical morphemes occur frequently in any text is of course irrelevant for the purposes of this paper. It is apparent that the relatively high frequency of the cluster /ts/ is in part due to the shape and distribution of certain allomorphs ('third singular present', 'plural', etc.). The question remains, however, what principles may operate to determine the shapes of common morphemes, and it is precisely this type of question which the hypothesis attempts to explain: the distribution of the allomorphs of common morphemes is such as to produce clusters which will tend to divide the effort more or less equally between speaker and listener. It is worth noting that although the sequence /ts/ does not occur in Spanish at all, the overall result is essentially the same.

3 John B. Carroll, Transitional probabilities of English phonemes, Progress report on project 56 (Harvard University, Graduate School of Education; Cambridge, Mass., 1952).
4 Transcribed by Frederick B. Agard of Cornell University according to the phonemic system of Trager and Smith, An outline of English structure (Norman, Okla., 1951).
5 The phonemic transcription is taken from a dissertation by Gerald Markley, The verbal categories of substandard Mexican Spanish (University of Illinois, 1954). Spaces in Markley's transcription are interpreted as juncture phonemes.
6 Jakobson, Fant, and Halle, Preliminaries to speech analysis: The distinctive features and their correlates (Cambridge, Mass., 1952); E. Alarcos Llorach, Fonología española (Madrid, 1950).
7 Eli Fischer-Jorgensen, On the definition of phoneme categories on a distributional basis, Acta linguistica 7.8-39 (1952).

2.1. ENGLISH. Table 1 shows the difference, in units, between each of the nineteen English consonant phonemes and each of the others.

TABLE 1. ENGLISH PHONEMES: UNITS OF DIFFERENCE
[Matrix of pairwise unit differences among the nineteen English consonant phonemes; the cell values are not recoverable in this copy.]

TABLE 2. SPANISH PHONEMES: UNITS OF DIFFERENCE
[Matrix of pairwise unit differences among the fourteen Spanish consonant phonemes; the cell values are not recoverable in this copy.]

The tabulation below shows the relative frequency of the occurrent clusters. Numbers in column A denote the difference, in units, between members of a cluster. Column B gives the number of theoretically possible clusters for each degree of difference, 361 in all. Column C gives the total frequency of actually occurring clusters of each kind, 837 in all. (Since the transcriber of the English text wrote no double consonants without a juncture between them, the frequency of clusters in which the members differ by zero units is zero.) Column D gives the average number of clusters in each category (column C divided by column B).

A     B     C     D
0    19     0   0.0
1     0     0   0.0
2    38     8   0.2
3    24    29   1.2
4    48   244   5.1
5    72   390   5.4
6    40    16   0.4
7    60   113   1.9
8    48    35   0.7
9    12     2   0.2
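The arithmetic behind column D can be checked directly from columns B and C. The following Python sketch is offered only as an illustration: the function name average_frequencies is hypothetical, and the handling of a row with no possible clusters (reported as 0.0, as in the printed tabulation) is an assumption about the convention used.

```python
def average_frequencies(possible, occurring):
    """Column D: occurrences per theoretically possible cluster (C / B).

    Rows with no possible clusters (B = 0) are reported as 0.0; this is an
    assumed convention that matches the printed tabulation.
    """
    return [round(c / b, 1) if b else 0.0 for b, c in zip(possible, occurring)]


# Columns B and C of the English tabulation, indexed by units of difference (A).
ENGLISH_B = [19, 0, 38, 24, 48, 72, 40, 60, 48, 12]
ENGLISH_C = [0, 0, 8, 29, 244, 390, 16, 113, 35, 2]

print(sum(ENGLISH_B))  # total possible clusters, 19 x 19
print(sum(ENGLISH_C))  # total occurring clusters
print(average_frequencies(ENGLISH_B, ENGLISH_C))
```

The column totals come to 361 and 837, and the rounded quotients reproduce column D as printed.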

2.2. SPANISH. The Spanish material is given in Table 2 and in the tabulation below, which are constructed on the same principles as the ones shown for English. Note that the difference between Spanish phonemes is not always the same as that between analogous English phonemes. Thus, English /t/ and /s/ differ by 4 units; but Spanish /t/ and /s/ differ by 5, because in Spanish [z] is not a separate phoneme but only an allophone of /s/. The total number of theoretically possible clusters in Spanish (14 × 14) is 196; the number of actually occurring clusters, each counted as often as it occurs, is 924.

A     B     C     D
0    14    10   0.7
1     0     0   0.0
2    16     2   0.1
3    34   227   6.7
4    26   224   8.6
5    40   313   7.8
6    34    89   2.6
7    24    59   2.5
8     8     0   0.0
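The unit differences in column A rest on the scoring rule of §1.2: a full contrast on a feature counts two units, and a 'semi-contrast' (a feature specified in one phoneme but irrelevant in the other) counts one. The Python sketch below illustrates that rule for four English phonemes. It is a minimal sketch under assumptions, not the author's worksheet: the function name unit_difference is hypothetical, the values for /p/, /t/ and /θ/ follow the contrasts stated in §1.2, and the values for /s/ are an assumption (continuant and strident) chosen to agree with the 4-unit difference between English /t/ and /s/ cited above.

```python
FEATURES = ["vocalic", "consonantal", "compact", "grave",
            "nasal", "tense", "continuant", "strident"]

# +1 / -1 mark the two poles of a feature; None marks an irrelevant feature.
# Values for /s/ are an assumption, not taken from the text of the paper.
PHONEMES = {
    "p": [-1, +1, -1, +1, -1, +1, -1, None],
    "t": [-1, +1, -1, -1, -1, +1, -1, -1],
    "θ": [-1, +1, -1, -1, -1, +1, +1, -1],
    "s": [-1, +1, -1, -1, -1, +1, +1, +1],
}


def unit_difference(a, b):
    """Sum 2 units per full contrast and 1 unit per semi-contrast."""
    total = 0
    for x, y in zip(PHONEMES[a], PHONEMES[b]):
        if x is None and y is None:
            continue          # irrelevant in both: no difference
        if x is None or y is None:
            total += 1        # semi-contrast
        elif x != y:
            total += 2        # full contrast
    return total


for pair in [("t", "θ"), ("p", "t"), ("t", "s")]:
    print(pair, unit_difference(*pair))
```

Under these assumed feature values the sketch gives 2 units for /t/ and /θ/, 3 for /p/ and /t/, and 4 for /t/ and /s/, in line with the figures quoted in the text. A different feature system, such as the one used for Spanish, would change the values without changing the rule.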


3.1. NORMAL DISTRIBUTION CURVE. The distribution of the average frequencies of clusters (Column D) tends in both languages to follow a normal curve; the one important exception is the extremely low frequency of English clusters whose members differ by six units.8 On the basis of the limited material examined, it seems reasonable to suggest that the frequency of consonant clusters reflects the two factors mentioned earlier, the effort of the encoder and that of the decoder.9 Maximum effort for either is avoided in favor of effort more or less equally divided. In terms of units of difference, clusters of two consonantal phonemes tend to differ by 4 or 5 units, whereas clusters differing by 2 (maximum effort for the listener) or by 8 or 9 (maximum effort for the speaker) are avoided.

8 The Spanish curve has a minor deviation due to the occurrence of clusters with zero difference between the members. These are nearly all instances of /mm/, as in /kommigo/ 'with me'. But this transcription is based on morphophonemic criteria; an alternative writing /komigo/ is certainly possible also.
9 Cf. George K. Zipf, Human behavior and the principle of least effort (Cambridge, Mass., 1949).

TABLE 3. ENGLISH PHONEMES: CHANGES IN UNITS OF DIFFERENCE IF /tš/ AND /dž/ ARE REGARDED AS CLUSTERS
[Rows for /š, ž, k, g/ against the seventeen remaining consonant phonemes; the cell values are not recoverable in this copy.]

3.2. ALTERNATE PHONEMIC ANALYSIS. Most analyses of English are alike in regarding the affricates in church and judge as unit phonemes. The possibility of regarding them as clusters involves an interesting change in our statistics: (1) the number of phonemes is reduced from 19 to 17, and the number of theoretically possible clusters therefore is only 289; (2) the total number of clusters actually occurring is increased from 837 to 971; (3) instead of six phonemes characterized as compact and oral there are only four, namely /š, ž, k, g/; (4) since these can be completely defined by using in addition the contrasts tense/lax and continuant/interrupted, the contrast strident/mellow becomes irrelevant. The resulting changes in the differences between the four phonemes /š, ž, k, g/ and all other phonemes are shown in Table 3. Note especially that the difference between /t/ and /š/, or between /d/ and /ž/, is now 6 instead of 7. The revised tabulation of frequencies now has the following greatly improved appearance:

A     B     C     D
0    17     0   0.0
1     0     0   0.0
2    32    10   0.3
3    24    16   0.7
4    48   290   6.0
5    52   366   7.0
6    36   224   6.2
7    56    45   0.8
8    24    20   0.8
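One rough way to compare the shape of column D across the three tabulations is to look for interior dips, that is, values strictly lower than both of their neighbours. The Python sketch below does only that. It is an illustration, not a test of normality: a single-peaked column is a much weaker property than a normal curve, and the function name dips is hypothetical.

```python
def dips(averages):
    """Indices of interior values strictly lower than both neighbours."""
    return [
        i
        for i in range(1, len(averages) - 1)
        if averages[i] < averages[i - 1] and averages[i] < averages[i + 1]
    ]


# Column D of each tabulation, indexed by units of difference (column A).
ENGLISH_19 = [0.0, 0.0, 0.2, 1.2, 5.1, 5.4, 0.4, 1.9, 0.7, 0.2]
SPANISH = [0.7, 0.0, 0.1, 6.7, 8.6, 7.8, 2.6, 2.5, 0.0]
ENGLISH_17 = [0.0, 0.0, 0.3, 0.7, 6.0, 7.0, 6.2, 0.8, 0.8]

for name, column in [("English, 19 phonemes", ENGLISH_19),
                     ("Spanish", SPANISH),
                     ("English, 17 phonemes", ENGLISH_17)]:
    print(name, dips(column))
```

On these figures the nineteen-phoneme English column dips at 6 units, the Spanish column dips at 1 unit (a consequence of the zero-difference clusters discussed in footnote 8), and the seventeen-phoneme English column has no interior dip.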

The average frequencies in Column D follow an almost perfect normal distribution curve. If it can be demonstrated that an argument of this kind has validity (i.e. that our hypothesis is correct), it will be necessary for the structural linguist to examine all instances of deviation from a normal curve and perhaps to revise his description.

3.3. CONCLUSION. It is clear that the two samples on which this paper is based, each containing fewer than a thousand clusters, are not enough to establish a general principle; but they are certainly suggestive. If the hypothesis here discussed is valid at all, it should be valid for any language. To test it we need a large body of texts in many different languages, accurately analyzed and transcribed according to some standard method that will make direct comparisons meaningful.
