Modelling recursion in English grammar

The model of English usage suggested by our computer mail answerer is a very piecemeal and ad hoc one. It suggests that often the way we use language is simply by responding to particular, special-purpose patterns with stereotyped responses. Although this may sometimes be true, it gives us little information about how people understand and respond to sentences that they have never seen before, or about the basic principles that lie behind efficient language use. In this section, we will begin to develop a more general RTN for a small fragment of English. This RTN will be unsatisfactory in a number of ways, but will be useful as a running example in the discussions to follow. To keep things simple, we will ignore the intricacies of verb groups suggested by our previous examples.

To start with, we need some abbreviations corresponding to some basic lexical categories used in the description of English syntax. We have seen N, NP and DET in the ENGLISH-1 network of Chapter 2. The symbol V is used here to stand for English verbs and the symbol WH for English relative pronouns like 'who' and 'which'. Secondly, we need to define some larger networks in terms of these categories.


BOX:




Name S:
    Initial 0
    Final 2
    From 0 to 1 by NP
    From 1 to 2 by VP.

Name NP:
    Initial 0
    Final 2
    From 0 to 1 by DET
    From 1 to 2 by N
    From 2 to 3 by WH
    From 3 to 2 by VP.

Name VP:
    Initial 0
    Final 1,2
    From 0 to 1 by V
    From 1 to 2 by NP
    From 1 to 3 by that
    From 3 to 2 by S.

N abbreviates: woman, house, table, mouse, man, ... .
NP abbreviates: Mayumi, Maria, Washington, John, Mary, ... .
DET abbreviates: a, the, that, ... .
V abbreviates: sees, hits, sings, lacks, saw, ... .
WH abbreviates: who, which, that.



S is the network for English sentences. A sentence can be recognized by finding first a noun phrase (NP) and then a verb phrase (VP). The noun phrase at the beginning of a sentence - for example, a phrase like 'Mayumi' or 'the woman' - is the subject of the sentence. A noun phrase can include a relative pronoun (WH) introducing a qualifying verb phrase, which is a rather simple type of relative clause, as in 'the man who sings'. The verb phrase, sometimes known as the predicate of the sentence, contains a verb and possibly a noun phrase (the object of the verb). It is also possible for certain verbs to be followed by the word 'that' and a sentential complement, as in 'thinks that Maria sings'. Thus, the above S network will recognize sentences like:


Mayumi sees the house.
Maria sings.
The table hits Washington.
Mayumi sees that Maria sings.
The table that lacks a leg hits Washington.

and also various other, less natural sounding, sequences of words. Note finally that the arc labels now have a dual status: they may simply stand for a set of items in the lexicon, as before; they may just name a subnetwork; or they may do both. Thus, we can have items in the lexicon, such as 'Mayumi', listed as members of the category NP, or we can have an NP network, or we can have both.
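To make the traversal concrete, here is a minimal sketch in Python of how the three networks in the box might be represented and searched. The dictionary layout and the function names `arc_ends`, `traverse` and `recognize` are assumptions of this sketch, not notation from the text. The lexicon holds only the words listed in the box, plus 'leg', which is added here so that the last example sentence goes through. Note how `arc_ends` reflects the dual status of a label such as NP: it tries the lexicon and the subnetwork of that name.

```python
# Each network: initial state, set of final states, arcs as (from, label, to).
NETWORKS = {
    "S": {"initial": 0, "final": {2},
          "arcs": [(0, "NP", 1), (1, "VP", 2)]},
    "NP": {"initial": 0, "final": {2},
           "arcs": [(0, "DET", 1), (1, "N", 2), (2, "WH", 3), (3, "VP", 2)]},
    "VP": {"initial": 0, "final": {1, 2},
           "arcs": [(0, "V", 1), (1, "NP", 2), (1, "that", 3), (3, "S", 2)]},
}

# Word categories from the box, lower-cased. 'leg' is an addition (assumption).
LEXICON = {
    "N": {"woman", "house", "table", "mouse", "man", "leg"},
    "NP": {"mayumi", "maria", "washington", "john", "mary"},
    "DET": {"a", "the", "that"},
    "V": {"sees", "hits", "sings", "lacks", "saw"},
    "WH": {"who", "which", "that"},
}


def arc_ends(label, words, pos):
    """Positions reachable from pos by traversing one arc with this label."""
    ends = set()
    # Lexical reading: the label is the word itself or a category containing it.
    if pos < len(words):
        word = words[pos]
        if word == label or word in LEXICON.get(label, ()):
            ends.add(pos + 1)
    # Subnetwork reading: traverse the network of that name as a subtask.
    if label in NETWORKS:
        ends |= traverse(label, words, pos)
    return ends


def traverse(name, words, pos):
    """Positions at which the named network can finish, starting at pos."""
    net = NETWORKS[name]
    finished = set()
    seen = set()
    agenda = [(net["initial"], pos)]
    while agenda:
        state, p = agenda.pop()
        if (state, p) in seen:
            continue
        seen.add((state, p))
        if state in net["final"]:
            finished.add(p)
        for source, label, target in net["arcs"]:
            if source == state:
                for q in arc_ends(label, words, p):
                    agenda.append((target, q))
    return finished


def recognize(sentence):
    """True if the S network can consume every word of the sentence."""
    words = sentence.lower().rstrip(".").split()
    return len(words) in traverse("S", words, 0)


print(recognize("The table that lacks a leg hits Washington."))
print(recognize("Mayumi the house."))
```

The sketch relies on the fact that none of these networks can re-enter itself without first consuming a word; a left-recursive network would need a more careful search strategy. It also accepts less natural sequences such as 'The table sings Mayumi', just as the text says.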

This fragment demonstrates that English syntax is fundamentally recursive. For instance, it is easy to construct an English sentence that contains an English sentence, an English sentence that contains an English sentence that contains an English sentence ... and so on for as long as we like. But, of course, such sentences will become increasingly hard to understand as they get longer:


Mayumi says that Maria is a genius.
Mayumi says that Mayumi says that Maria is a genius.
Mayumi says that Mayumi says that Mayumi says that Maria is a genius.

The recursiveness of English, and of most other natural languages, means that RTNs are a natural tool for expressing the regularities of such languages.
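The embedding pattern in these examples can be mirrored by a recursive function. The following Python sketch is only an illustration; the function name `embed` and its `depth` parameter are assumptions introduced here. Each recursive call corresponds to one further 'says that' complement, in the same way that each pass through the 'that' arc of the VP network calls the S network again.

```python
def embed(depth):
    """Wrap the core sentence in `depth` layers of 'Mayumi says that'."""
    if depth == 0:
        return "Maria is a genius"
    # A sentence containing a sentence: recurse for the embedded one.
    return "Mayumi says that " + embed(depth - 1)


# Reproduce the three example sentences of increasing depth.
for depth in range(1, 4):
    print(embed(depth) + ".")
```

Nothing in the definition limits `depth`, which is the point of the example: the rule is finite, but the set of sentences it describes is not.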

In our network, one way of traversing the S network involves traversing the VP network, which in turn involves traversing the S network again as a subtask. Once we have networks that display 'genuine' recursion in this way, we can see that RTNs are indeed more powerful than FSTNs with an abbreviatory convention for word categories. We could imagine a reference in an arc to a subnetwork as a shorthand for having a copy of that network in that position instead of the arc. This would explain why traversing the arc involves traversing the whole subnetwork. This shorthand model works nicely for examples like the VERB-GROUP subnetwork, but will not work when the definition of a phrase (directly or indirectly) includes itself. When this happens, the model would claim, in effect, that the finitely stated RTN is shorthand for an infinite network (Figure 3.1).
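The failure of the copying model can be checked mechanically. The Python sketch below is an illustration under stated assumptions: it records only the arc labels of the S, NP and VP networks, treats a label as a subnetwork reference whenever a network of that name exists (ignoring the lexical reading of NP), and uses hypothetical helper names `reachable`, `is_recursive` and `expanded_arc_count`. It tests whether a network can reach itself through its arcs, and counts the arcs that result from substituting copies to a fixed depth.

```python
# Arc labels of each network, in the order given in the box.
ARC_LABELS = {
    "S": ["NP", "VP"],
    "NP": ["DET", "N", "WH", "VP"],
    "VP": ["V", "NP", "that", "S"],
}


def reachable(name):
    """Networks referred to, directly or indirectly, from the named network."""
    found = set()
    todo = [name]
    while todo:
        current = todo.pop()
        for label in ARC_LABELS[current]:
            if label in ARC_LABELS and label not in found:
                found.add(label)
                todo.append(label)
    return found


def is_recursive(name):
    """True if the definition of the network includes itself."""
    return name in reachable(name)


def expanded_arc_count(name, depth):
    """Arcs after replacing subnetwork references by copies, `depth` levels deep.

    At depth 0 every reference is left as a single unexpanded arc.
    """
    total = 0
    for label in ARC_LABELS[name]:
        if label in ARC_LABELS and depth > 0:
            total += expanded_arc_count(label, depth - 1)
        else:
            total += 1
    return total


print(is_recursive("S"))
print([expanded_arc_count("S", d) for d in range(4)])
```

For a non-recursive network the count would stop changing once every reference had been replaced. Here all three networks reach themselves, so each extra level of copying adds arcs and the expansion never completes.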


Exercise 3.2