Implementing flexible control

Our earlier example parsers in this chapters had two characteristics in common: (i) they used the Prolog database to store the chart, and (ii) they had no explicit agenda, but simply relied on Prolog backtracking as their control strategy. This approach makes for short code, but sacrifices flexibility. In the two parsers that we describe in this section, we will maintain and manipulate both the chart and an explicit agenda as lists of edges that get passed as arguments from predicate to predicate. As we shall see, it then becomes possible to switch between depth-first-like to breadth-first-like processing by interchanging the order of arguments in a single predicate call.

Our basic data structure remains the edge, only one shorn of the final FOUND component used in our earlier chart parsers.




edge(START,FINISH,LABEL,TOFIND)

This means that, for simplicity, we are talking about a recognizer, rather than a parser. Charts and agendas simply consist of lists of such edges.

As before, we begin with a bottom-up parser. We start off with an initialization predicate 'start_agenda(String,Vertex,Agenda)' which produces an initial Agenda of lexical edges given String starting at Vertex. If the string is empty, then there is an empty agenda:

Code:chrtlib2.pl




start_agenda([],V0,[]).

Otherwise, we peel off the first word, consult the lexicon and map all the relevant inactive edges spanning this word into an agenda, appending this to the front of the agenda obtained by looking at the rest of the words:




start_agenda([Word|Words],V0,Agenda) :- V1 is V0+1, findall(edge(V0,V1,Category,[]), word(Category,Word), Agenda1), start_agenda(Words,V1,Agenda2), append(Agenda1,Agenda2,Agenda).

The predicate 'findall', which is in the utilities library, is used to collect in its third argument a list which contains for each solution of the second argument the appropriate instantiation of the first argument. It is similar to the built-in predicates 'setof' and 'bagof' that are provided with many Prolog systems, except that 'findall' produces the empty list in its third argument (rather than failing) if there are no solutions to the second argument. In this case, 'findall' is used to iterate over all possible ways of finding Word as an instance of some Category. For each such solution, an appropriate 'edge' term is included in the answer list (Agenda1).

The predicate 'extend_edges(Agenda, InChart, OutChart)' is used to add the edges in Agenda to an initial chart InChart, taking into account the fundamental rule and bottom-up rule to generate yet more edges, producing a new chart OutChart. It is an identity function when the agenda is empty:




extend_edges([],Chart,Chart).

Otherwise, it takes the first edge on the agenda and sees whether it is already in the chart. If so, it can move down the agenda without more ado:




extend_edges([Edge|Agenda1],Chart1,Chart2) :- member(Edge,Chart1), !, extend_edges(Agenda1,Chart1,Chart2).

Otherwise 'extend_edges' adds the first edge to the chart, looks for any new edges that it provokes as a result of addition, adds them to the agenda, and then continues with this new agenda and the augmented chart




extend_edges([Edge|Agenda1], Chart1, Chart3) :- Chart2 = [Edge|Chart1], new_edges(Edge,Chart2,Edges), add_edges(Agenda1,Edges,Agenda2), extend_edges(Agenda2,Chart2,Chart3).

You 'add' an edge to a list of edges either by finding that it is already there, or by just adding it:




add_edge(Edge,Edges,Edges) :- member(Edge,Edges), !. add_edge(Edge,Edges,[Edge|Edges]).

And you combine two lists of edges by calling 'add_edge' in the obvious recursive manner:




add_edges([],Edges,Edges). add_edges([Edge|Edges],Edges1,Edges3) :- add_edge(Edge,Edges1,Edges2), add_edges(Edges,Edges2,Edges3).

Code:buchart2.pl So far, so good, but the essence of chart-parsing, the fundamental rule, has yet to make its appearance. We find it here in the predicate 'new_edges(Edge,Chart,Edges)' which maps Edge and Chart into a set of new edges, Edges, that arises from the addition of Edge to Chart. It is also this 'new_edges' predicate which is responsible for adding the empty active edges that get everything going. If an inactive Category1 edge spanning V1 to V2 is being added then, for each rule whose leftmost daughter is of Category1, create an empty active edge at V1, of the kind dictated by the applicable rule:




new_edges(edge(V1,V2,Category1,[]),Chart,Edges) :- findall(edge(V1,V1,Category2,[Category1|Categories1]), rule(Category2,[Category1|Categories1]), Edges1),

Then, for every existing active edge which currently ends at V1, and which is looking for a Category1, add a further edge (which may turn out to be either active or inactive) based on it, that ends at V2:




findall(edge(V0,V2,Category3,Categories2), member(edge(V0,V1,Category3,[Category1|Categories2]),Chart), Edges2),

Finally, combine the two resulting sets of edges:




add_edges(Edges1,Edges2,Edges).

That clause deals with everything that has to be done on the occasion of the addition of an inactive edge. But we still need a clause to cater for the case of active edge addition, and, as with our earlier bottom-up chart parser example, this turns out to be simpler. We just look for every inactive edge that could allow the active edge to be extended, and add edges extending it, as appropriate:




new_edges(edge(V1,V2,Category1,[Category2|Categories]),Chart,Edges) :- findall(edge(V1,V3,Category1,Categories), member(edge(V2,V3,Category2,[]),Chart), Edges).

That really completes our new bottom-up chart parser, but it is convenient to wrap the top level predicates 'start_agenda' and 'extend_edges' into a further predicate that will save us having to specify the initial vertex, and will check to see that the chart does indeed contain an edge that spans the string with an instance of the initial symbol:




test(String) :- start_agenda(String,0,Agenda), extend_edges(Agenda,[],Chart), initial(Symbol), member(edge(0,M,Symbol,[]),Chart), N is M + 1, not(member(edge(_,N,_,_),Chart)).

The parser we have been considering operates depth-first and, to the casual reader, this might appear to simply follow from our decision to implement it in Prolog. But this is not the case. In fact, it follows from the order in which things get added to the agenda, and this is within our control. At present, the new edges that result from adding an edge get put on the front of the agenda, which means that the list is functioning as a stack. We can redefine 'extend_edges' by commenting out the existing agenda contruction call of 'add_edges' and replace it with a call in which the order of the first two arguments (the existing agenda and the new items) is reversed. This simple move gives us a list that behaves like a queue and hence a parser that explores the search space in an approximately breadth-first manner.




extend_edges([Edge|Agenda1], Chart1, Chart3) :- Chart2 = [Edge|Chart1], new_edges(Edge,Chart2,Edges), % add_edges(Edges,Agenda1,Agenda2), % depth-first processing add_edges(Agenda1,Edges,Agenda2), % breadth-first processing extend_edges(Agenda2,Chart2,Chart3).

Switching from bottom-up to top-down requires rather more work, however, although some of the code we have developed so far can be carried over unchanged, namely 'start_agenda', 'add_edge', 'add_edges' and both variants of 'extend_edges'. Code:tdchart2.pl The predicate 'start_agenda' creates an initialized agenda of lexical edges, which, given that there is no bottom-up rule, we can simply treat as the initial chart in top-down parsing. We also need a way to get active edges started, however, and we do this, as in our earlier top-down parser, with a predicate called 'start_active(Category, Vertex, Edges)'. This is straightforward: for every rule that expands Category, create an empty active edge based on the rule at Vertex and put it in Edges:




start_active(Category,Vertex,Edges) :- findall(edge(Vertex,Vertex,Category,Categories), rule(Category,Categories), Edges).

Our new top-down parser still has to incorporate the fundamental rule. We find it, as with the preceding example, in the predicate 'new_edges(Edge,Chart,Edges)' which maps Edge and Chart into a set of new edges, Edges, that arises from the addition of Edge to Chart. If an inactive Category1 edge spanning V1 to V2 is being added, then for every existing active edge which currently ends at V1, and which is looking for a Category1, add a further edge (which may turn out to be either active or inactive) based on it, that ends at V2:




new_edges(edge(V1,V2,Category1,[]),Chart,Edges) :- findall(edge(V0,V2,Category2,Categories), member(edge(V0,V1,Category2,[Category1|Categories]),Chart), Edges).

That clause deals with everything that has to be done on the occasion of the addition of an inactive edge. But we also need a clause to cater for the case of active edge addition. We need to do three things: (i) create the relevant empty active edges to look for the next category (the top-down parsing strategy), (ii) look for every inactive edge that could allow the active edge to be extended, and create edges extending it, as appropriate, and (iii) combine the two resulting sets of edges:




new_edges(edge(V1,V2,Category1,[Category2|Categories]),Chart,Edges) :- start_active(Category2,V2,Edges1), findall(edge(V1,V3,Category1,Categories), member(edge(V2,V3,Category2,[]),Chart), Edges2), append(Edges1,Edges2,Edges).

Our new top-down chart parser is almost complete. It only remains to wrap the predicates into a package that will initialize the chart, start an empty active edge for the initial symbol, create an initial agenda, elaborate the chart to termination, and conclude by checking for success:




test(String) :- initial(Symbol), start_agenda(String,0,Chart1), start_active(Symbol,0,Agenda), extend_edges(Agenda,Chart1,Chart2), member(edge(0,M,Symbol,[]),Chart2), N is M + 1, not(member(edge(_,N,_,_),Chart2)).

Exercise 6.2

Send us a comment.



[Contents] [Previous] [Next]
This document was translated by troff2html v0.21 on October 22, 1996.