Now consider the problem of parsing in the light of the data structure we have just described, but remember that a chart is simply a set of edges. As in our previous presentations of parsing, we will initially ignore the problem of actually building phrase structure trees, since this will require some additions to the edge data structure. Suppose that we are in the middle of the parsing process and we look at the chart to find that it contains the following edges, among others:
{ <0, 2, S -> NP.VP>, <2, 3, VP -> TV.NP PP>, <3, 5, NP -> Det N.>, <5, 8, PP -> P NP.>, ... }
We can represent these edges diagrammatically, as shown in Figure 6.11. For clarity, we have omitted all the other edges that would need to be in the chart for the chart to have developed thus far. What we have here is two active edges and two inactive ones. The latter constitute a noun phrase and a prepositional phrase that we have already found. The first active edge represents a hypothesis about a sentence that has found a noun phrase, although not the one represented, and is looking for a verb phrase. The second active edge represents a hypothesis about a verb phrase that has found a transitive verb and is looking for a noun phrase followed by a prepositional phrase. Consider the first active edge. To satisfy the hypothesis, we need to find an inactive verb phrase edge in the chart that starts at vertex 2. But, let us suppose there is no such edge. We do, of course, have a hypothesis about the (eventual) presence of such an edge, but, until that hypothesis is confirmed, we are not in a position to do anything about the first active edge. Turning our attention now to the second active edge, we have a verb phrase hypothesis that is looking to find a noun phrase that starts at vertex 3. And we do indeed have such an edge. This means that our verb phrase hypothesis has been further, although not yet fully, confirmed. We can represent this further confirmation by adding another active edge to the chart - namely, <2, 5, VP -> TV NP.PP>. This is a hypothesis that is looking to find a prepositional phrase starting at vertex 5; but we have one of those in our chart, so this verb phrase hypothesis is (fully) confirmed and we can, accordingly, add an inactive edge to the chart to represent the verb phrase we have found - namely, <2, 8, verb phrase -> TV NP PP.>. This is where we have got to:
{ <0, 2, S -> NP.VP>, <2, 3, VP -> TV.NP PP>, <2, 5, VP -> TV NP.PP>, <2, 8, VP -> TV NP PP.>, <3, 5, NP -> Det N.>, <5, 8, PP -> P NP.>, ... }
If we now return to our first inactive edge, we can see that its hypothesis is now also (fully) confirmed, since there is, indeed, a verb phrase that begins at vertex 2. So we can add the inactive edge <0, 8, S -> NP VP.> to our chart which now looks as shown in Figure 6.12. The fact that we have an S edge spanning the entire width of the graph means that we have succeeded in parsing the string as a sentence. There may well be further parses to find, but we have at least found one of them.


The process we have just described represents the essence of chart parsing. There is more to the latter than our informal walkthrough might lead you to think, but everything else is technical niceties. What we have done is to apply exactly the same rule three times: if an active edge meets an inactive edge of the desired category, then put a new edge into the chart that spans both the active and inactive edges. Kay calls this rule the fundamental rule of chart parsing. We have expressed it very imprecisely. What do meets and desired mean? What kind of edge do we add? But this imprecision can be readily remedied:
Fundamental rule
If the chart contains edges <i, j, A -> W1.B W2> and <j, k, B -> W3.>, where A and B are categories and W1, W2 and W3 are (possibly empty) sequences of categories or words, then add edge <i, k, A -> W1 B.W2> to the chart.
This rule does not say whether the new edge is active or inactive, but then it does not need to since this is fully determined by whether W2 is empty or not. In a real implementation, however, it might well be convenient or efficient to add a simple flag to the edge data structure to signal activity or inactivity. Notice also that this rule only adds to the chart; it does not remove the active edge that has succeeded from the chart. This is as it should be since that active edge may be able to succeed in virtue of another inactive edge that appears in the chart at a later stage. If we remove it, then we run the risk that our parser will fail to find all possible parses, or, to put it in logician's parlance, we run the risk that our parser will exhibit incompleteness.
Now that we have the fundamental rule,
there
are three further matters that we need to address before we have all
the ingredients to make a fully functioning chart parser. These are:
initialization,
rule invocation strategy, and
search strategy,
We shall deal with all of these in turn.
Send us a comment.