ple tapes. Hence, the prefix {wa} 'and' is applied to the above stems to form /wakatab/ and /waṣadaq/, respectively.

2.2 Phonological and Orthographic Rules

Surface-to-lexical mappings must account for phonological and orthographic processes. In fact, for many languages, the phonological and orthographic rules tend to be more numerous than the morphological rules. This is the case in Semitic. For example, the Syriac grammar reported in (Kiraz, 1996) contains 48 rules. Only six rules (a mere 12.5%)¹ are motivated by templatic morphology. The rest are phonological and orthographic.

Consider the above derivation of /katab/, but for Syriac rather than Arabic (both languages share the same morphemes in this case). Syriac has the Vowel Deletion Rule

    V → ε / ___ CV

where ε is the empty string. The rule states that short vowels in open syllables are deleted. Hence, */katab/ → /ktab/. The rule applies right-to-left; hence, when adding the object pronominal suffix {eh} 'MASCULINE 3RD SINGULAR', the second vowel is deleted, */katabeh/ → /katbeh/.

Similarly, prefixing the above {wa} morpheme (which is also shared by Syriac and Arabic) results in */wakatab/ → /waktab/ (first stem vowel is deleted), and */wakatabeh/ → /wkatbeh/ (prefix vowel and second stem vowel are deleted).

It is worth noting that such phonological rules do not depend on the nonlinear lexical structure of the stem. They actually apply to the morphologically derived stem. Semitic, then, maintains at least the following strata: lexical-morphological (where the lexical representation is nonlinear) and morphological-surface (where both representations are linear).

2.3 Other Linguistic Representations

So far we have looked at two linguistic representations: lexical and surface (≈ orthographic). Now consider a text-to-speech system which requires a phonological representation as well.

In the Arabic example above, the first phoneme of /ṣadaq/ is emphatic (denoted by the sublinear dot). This emphasis is spread at the phonological level, resulting in [ṣạḍạq] ([q] is already an emphatic phoneme).² In this case, emphasis can be determined from the surface (≈ orthographic) form.

However, this is not always the case. Syriac spirantization requires lexical information, as the following example illustrates. Synchronically speaking, the six plosives [b], [g], [d], [k], [p] and [t] undergo spirantization when in postvocalic position with respect to the lexical form,³ resulting in [v], [ɣ], [ð], [x], [f] and [θ], respectively. Hence, */katab/ → [kθav], and */wakatab/ → [waxθav] (in both cases the first stem vowel is deleted as described above).

3 Multi-Tape Grammar

This section provides a grammar for the above data using a multi-tape model and illustrates some of the complexities involved in maintaining multiple lexical tapes throughout. The multi-tape model (originally proposed by (Kay, 1987)) is an extension to the commonly used regular rewrite rules. In the multi-tape version, more than one lexical tape is allowed. Here, we shall use the following formalism, which derives from the one reported by (Pulman and Hepple, 1993), to express regular rewrite rules:

    LLC - LEX  - RLC    {⇒, ⇔}
    LSC - SURF - RSC

where LLC is the left lexical context, LEX is the lexical form, RLC is the right lexical context, LSC is the left surface context, SURF is the surface form, and RSC is the right surface context. The operators ⇒ and ⇔ indicate optional and obligatory rules, respectively. In the multi-tape version, lexical expressions are n-tuples of regular expressions of the form (x1, x2, ..., xn), with the ith expression referring to symbols on the ith lexical tape. When n = 1, the parentheses can be ignored; hence, (x) and x are equivalent.⁴

The grammars presented here assume a lexicon with the morpheme entries presented above. The pattern morpheme is {cvcvc} (in small letters); capitals in rules denote variables drawn from a finite set of symbols.

Lexical expressions make use of three tapes: pattern, root and vocalism, respectively. Hence, the

¹ Had the grammar been more exhaustive, the percentage would be much less since most additions to the rules would be in the domain of phonology/orthography, rather than templatic morphology.

² The scope of emphasis is another challenging problem. Sometimes emphasis spreads till the end of the current syllable, and sometimes till the end of the word.

³ Diachronically speaking, early Aramaic idioms, of which Syriac is one, did not apply the above vowel deletion rule; hence, in the New Testament the first [a] in sabachthani (Mt 27:46) is retained. Later, however, the vowel deletion rule took effect, but spirantized consonants remained as if the deletion did not take place.

⁴ For compiling such rules into automata, see (Grimley-Evans, Kiraz, and Pulman, 1996).
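
The interaction between the Vowel Deletion Rule of Section 2.2 and the spirantization facts of Section 2.3 can be sketched in a few lines of Python. This is only an illustration, not the grammar of the paper: the vowel inventory, the function names, and the reading of "right-to-left" as checking the CV context against the symbols that have not yet been deleted are all assumptions made here.

```python
# Toy vowel inventory for the transliteration used in the text (an assumption).
VOWELS = set("aeiou")

# Plosive -> spirant pairs as listed in Section 2.3.
SPIRANT = {"b": "v", "g": "ɣ", "d": "ð", "k": "x", "p": "f", "t": "θ"}


def deleted_positions(lexical):
    """Indices of vowels removed by V -> ε / __ CV, scanning right to left."""
    kept = [True] * len(lexical)
    for i in range(len(lexical) - 3, -1, -1):
        if lexical[i] not in VOWELS:
            continue
        # Look at the next two symbols that have not been deleted already.
        following = [lexical[j] for j in range(i + 1, len(lexical)) if kept[j]][:2]
        if (len(following) == 2
                and following[0] not in VOWELS
                and following[1] in VOWELS):
            kept[i] = False
    return {i for i, keep in enumerate(kept) if not keep}


def delete_vowels(lexical):
    """Surface form after vowel deletion only (Section 2.2)."""
    gone = deleted_positions(lexical)
    return "".join(s for i, s in enumerate(lexical) if i not in gone)


def spirantize(form, context):
    """Spirantize each plosive of `form` that follows a vowel in `context`.

    `form` and `context` must have the same length; passing the lexical form
    as its own context gives spirantization with respect to the lexical form.
    """
    out = []
    for i, sym in enumerate(form):
        if sym in SPIRANT and i > 0 and context[i - 1] in VOWELS:
            sym = SPIRANT[sym]
        out.append(sym)
    return "".join(out)


def phonological_form(lexical):
    """Spirantize against the lexical form, then drop the deleted vowels."""
    gone = deleted_positions(lexical)
    spirantized = spirantize(lexical, lexical)
    return "".join(s for i, s in enumerate(spirantized) if i not in gone)


def surface_only_form(lexical):
    """Contrast case: spirantize using only the form left after deletion."""
    surface = delete_vowels(lexical)
    return spirantize(surface, surface)


if __name__ == "__main__":
    for lex in ("katab", "katabeh", "wakatab", "wakatabeh"):
        print(lex, "->", delete_vowels(lex))  # ktab, katbeh, waktab, wkatbeh
    print(phonological_form("wakatab"))   # waxθav
    print(surface_only_form("wakatab"))   # waxtav
```

The contrast function shows why the surface form is not enough: once the first stem vowel is gone, the [t] of /waktab/ no longer looks postvocalic, so a surface-only procedure yields [waxtav] instead of [waxθav].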
Grammar 1: Grammar for Arabic /(wa)katab/ and /(wa)ṣadaq/

    R1    * - (c,C,ε) - *
          * -    C    - *

    R3    * - X - *
          * - X - *

    [R2 and the rule operators are not legible in the source.]

Grammar 2: Grammar for Syriac Vowel Deletion Rule

    R4    * - (v,ε,a) - (cv,C,a)    ⇔

    [The surface tier of R4 and the bodies of R6, R7 and R8 are not legible in the source; one further lexical tier reads "* - a - (cv,C,a) ⇔".]

    where C is a consonant and V is a vowel

any context.

⁶ For regular relations, see (Kaplan and Kay, 1994).
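
The lexicon assumed by the grammars has three tapes (pattern, root, vocalism), and a rule such as R1 pairs a lexical tuple (c,C,ε) with a surface consonant. A rough Python sketch of how the three tapes line up and collapse into a linear stem is given below. The root {ktb} and the vocalism {a} are inferred here from /katab/ and the pattern {cvcvc}; the function names and the spreading of a single vowel over both v slots are assumptions of this sketch, not part of the formalism.

```python
def align(pattern, root, vocalism):
    """Pair every pattern slot with a root or vocalism symbol.

    Returns 3-tuples (pattern, root, vocalism), with "" standing for ε on a
    tape that is not read at that position.
    """
    tuples = []
    r = v = 0
    for slot in pattern:
        if slot == "c":
            if r >= len(root):
                raise ValueError("root has too few consonants for the pattern")
            tuples.append(("c", root[r], ""))
            r += 1
        elif slot == "v":
            if not vocalism:
                raise ValueError("empty vocalism")
            # Assumption: a short vocalism spreads its last vowel to later v slots.
            tuples.append(("v", "", vocalism[min(v, len(vocalism) - 1)]))
            v += 1
        else:
            raise ValueError(f"unexpected pattern symbol: {slot!r}")
    return tuples


def linearize(tuples, prefix=""):
    """Map (c, C, ε) to C and (v, ε, V) to V; a linear prefix is copied as is."""
    return prefix + "".join(root_sym or vowel for _, root_sym, vowel in tuples)


if __name__ == "__main__":
    stem = align("cvcvc", "ktb", "a")
    print(stem[:2])                       # [('c', 'k', ''), ('v', '', 'a')]
    print(linearize(stem))                # katab
    print(linearize(stem, prefix="wa"))   # wakatab
```

Keeping the alignment and the collapse as separate steps mirrors the two strata described in Section 2.2: the tuples are the nonlinear lexical side, and the returned string is the linear form on which phonological rules such as vowel deletion can then operate.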
Grammar 3: Grammar for Spirantization, case for [b] → [v]

    R9     V - b - *                  ⇔
           * - v - *

    R10    V - (c,b,ε) - *            ⇔
           * -    v    - *

    R11    (v,ε,V) - (c,b,ε) - *
              *    -    v    - *

    where V is a vowel

An identity rule (similar to R3) is also required. Applying R8 and the identity rule on the input of G2 is illustrated below:

    |w|a|l|a|d|a|k|a|t|a|b|    LinearizedLexForm
     3 3 3 8 3 3 3 8 3 3 3
    |w|a|l| |d|a|k| |t|a|b|    Surface

Recall that the rule applies right-to-left.

It might not be clear from this example how advantageous this solution is. After all, only three rules were saved. However, note that almost all of the rules in a real grammar do not belong to the templatic morphology domain, but to the linear phonological/orthographic domain. Consider the case of Syriac spirantization mentioned above, viz.,

In addition, the size of the intermediate automata is substantially decreased in terms of space complexity.

There is another advantage of this model if used in a multilingual Semitic environment. We noted above how the derivation of /katab/ in Arabic and Syriac is similar. The only difference is that in the latter a vowel deletion rule takes place. It is then possible to generalize the lexical-to-linearized-form module for more than one Semitic language.

At the abstract finite-state level, our solution may have some similarities with the proposal of (Kornai, 1991), which aims at modeling autosegmental phonology by coding nonlinear autosegmental representations as linear strings. Kornai's approach linearizes the lexical nonlinear representation from the outset using a number of coding mechanisms.

References

Goldsmith, J. 1976. Autosegmental Phonology. Ph.D. thesis, MIT. Published as Autosegmental and Metrical Phonology, Oxford 1990.

Grimley-Evans, E., G. Kiraz, and S. Pulman. 1996. Compiling a partition-based two-level formalism. In COLING-96: Papers Presented to the 16th International Conference on Computational Linguistics.

Pulman, S. and M. Hepple. 1993. A feature-based formalism for two-level phonology: a description and implementation. Computer Speech and Language, 7:333-58.