Left: Boolean logic
Up: DATR techniques
Right: Representing lists
Perhaps surprisingly, DATR turns out to be an excellent language for
defining finite state transducers (FSTs). A path can be used as the
input tape and a value as the output tape (recall that the DATR default mechanism means that extensions to left-hand-side paths are
automatically carried over as extensions to right-hand-side paths, as
discussed in Section 3.3, above). Nodes
can be used for states, or else states can be encoded in attributes
that are prefixed to the current input path. Here, for example, is a
very simple DATR FST that will transduce a path such as <subj 1 sg
futr obj 2 sg like> into the value ni ta ku penda (Swahili for
I will like you). For clarity, this FST does not exploit
default inheritance to capture the 50% overlap between the subject
and object pronoun paradigms. See Gazdar (1992) for a
version that does.
S1:
<subj 1 sg> == ni S2:<>
<subj 2 sg> == u S2:<>
<subj 3 sg> == a S2:<>
<subj 1 pl> == tu S2:<>
<subj 2 pl> == m S2:<>
<subj 3 pl> == wa S2:<>.
S2:
<past> == li S3:<>
<futr> == ta S3:<>.
S3:
<obj 1 sg> == ni S4:<>
<obj 2 sg> == ku S4:<>
<obj 3 sg> == m S4:<>
<obj 1 pl> == tu S4:<>
<obj 2 pl> == wa S4:<>
<obj 3 pl> == wa S4:<>.
S4:
<like> == penda.
Although the example is trivial, the technique is both powerful and
useful. Gibbon (1990), for example, has made notably
effective use of it in his treatments of African tone systems. Much
of the computational phonology and morphology of the last decade and a
half has depended on FSTs. Potential lexical applications come
readily to mind - for example, the orthographic spelling rules for
English suffixation (cf. sky/skies). We give below a small
fragment of such an FST in which + is used as a morpheme
boundary marker. Note the role of DATR variables in giving concise
expression to the rules:
# vars $abc: a b c d e f g h i j k l m n o p q r s t u v w x y z.
# vars $vow: a e i o u.
SPELL:
<> ==
<+> == <>
<$abc> == $abc <>
<e + $vow> == $vow <>.
These axioms then give rise to theorems such as these:
SPELL:
<l o v e> = l o v e
<l o v e + s> = l o v e s
<l o v e + e d> = l o v e d
<l o v e + e r> = l o v e r
<l o v e + l y> = l o v e l y
<l o v e + i n g> = l o v i n g
<l o v e + a b l e> = l o v a b l e.

Left: Boolean logic
Up: DATR techniques
Right: Representing lists
Copyright © Roger Evans, Gerald Gazdar & Bill Keller, Tuesday 10 November 1998