The DBQ evaluation algorithm presented in Section 9.2 bottoms out when it reaches a simple proposition involving a predicate and associated objects. At this point, it is assumed that the database will have an exhaustive listing of the true cases of the appropriate relation. This is highly reminiscent of our first, approach to semantic interpretation,
where we represented the meanings of words by sets. Given that DBQ formulae result from a fairly simple translation of natural language sentences, our assumption seems to be that the database must contain a listing for every English content word that might be appropriate in the domain. And yet this is clearly ridiculous, because there will be many ways of asking essentially the same question:
Does Morgan Brothers accept VISA cards? Can I pay Morgan Brothers with a VISA card? Is Morgan Brothers registered with VISA? Does Morgan Brothers take part in the VISA scheme?
and also many questions whose answers are closely related to one another:
Does Morgan Brothers accept credit cards? Will Morgan Brothers accept my VISA card? Does Morgan Brothers accept international credit cards? Do I have to pay Morgan Brothers by cash?
If we had to include a separate listing for each of the properties and relations, such as 'accept', 'pay', 'register', 'take part', 'VISA' and 'credit card', that might be mentioned, we would have to have a huge database with a great deal of information represented many times. Moreover, our machine would be showing very limited understanding of the subject matter, because it would have no explicit representation of the fact that certain concepts like 'pay' and 'accept' are fundamentally related to one another. The difficulty of constructing such a system would directly reflect the shallowness of such a model of the world.
One important respect in which a natural language understanding system should be able to model the world is in being able to determine what follows from what. If a language processing system cannot determine, for instance, that if someone has influenza then they are unwell then its model of the world (at least as far as medical matters are concerned) is surely lacking. We have already noted that one advantage of a formal MRL might be that it provides formal rules of inference for validly deriving new statements from information provided as statements in the language. We might hope to harness the potential thus offered to provide accurate and computationally feasible inference mechanisms as part of an understanding system. As we will indicate later, it is not clear that this can be done in a general way, but there are interesting models of some capabilities that such inference systems might have.
DBQ (without printout actions) is really just a restricted form of the predicate calculus, and so it is natural to look towards standard notions of logical inference to see whether they can be used to enhance our question answerers. Translating to a logical language like DBQ does not commit us to any particular account of how English words correspond to predicates. For instance, we are no more committed to translating 'Mayumi sings' to sings(mayumi) than to any of the following, say:
a(b) exists(X, mayumi(X), sings(X)) m(sing) fact(world25, m, sing) all(X, world25(sing, X), contain(X, m))
All that we need to do is to provide appropriate and consistent interpretations for the predicates and constants that we use in the translation. On the other hand, using predicate calculus does commit us to standard notions of validity and inference, but also provides computational techniques for making inferences (from the work on automatic theorem proving). In this section, we will briefly discuss some of the features of standard logical inference, to indicate why it is computationally unattractive and hence why some researchers in computational linguistics have felt a need either for restrictions of it or for other kinds of inference systems.
The standard notion of logical inference assumes that we start with a set of axioms, or facts that we are sure are true, together with a desired conclusion that we wish to establish. In a QA system, the axioms might be the background database and the conclusion the statement expressed by a question. Inference involves constructing a DAG, called a proof, where each node is labelled with a formula: the desired conclusion is at the initial node and all final nodes are labelled by axioms. For the inference to be valid, each formula in the graph must be licenced by an inference rule that shows that it follows from the set of formulae that immediately succeed it in the graph. This is illustrated in Figure 9.1.

For simplicity, we will use a restricted notation for inference rules based on a small subset of the programming language
Prolog. In this language, a general rule about the world is expressed in the form conclusion if condition, and indicates that whenever the condition is true, so is the conclusion (many expert systems use rules with roughly this flavour). Thus, a rule says that if you have already established that the conditions of the rule are true - that is, you have complete DAGs starting at nodes labelled with these formulae - then you are entitled to combine the conclusion with these DAGs to make a new legal DAG, as shown in Figure 9.2.

The condition of a rule may be a collection of conditions, separated by the word and. Conditions and conclusions all take the form of a predicate, followed optionally by a number of associated objects, enclosed in brackets and separated by commas (as in DBQ). Given suitable interpretations for the predicates accepts, public_holiday and no_change and the constant morgan_bros, the rule:
(1) accepts(morgan_bros, visa) if public_holiday and no_change(morgan_bros)
would mean that Morgan Brothers accepts VISA cards if it is a public holiday and Morgan Brothers is out of change. If public_holiday and no_change are listed in our domain database, then this rule will allow us to answer a certain question about the acceptance of credit cards. For a rule to answer a class of questions, it needs to contain variables, indicated by letters like X and Y, to denote arbitrary individuals. Thus, given suitable interpretations of predicates like accepts and credit_card, the rules:
(2) accepts(X, Y) if credit_card(Y) and registered(X, Y) (3) can_pay_with(X, Y, Z) if customer(X) and accepts(Y, Z)
say that for any individuals X and Y, X accepts mode of payment Y if Y is a type of credit card and X is registered with the Y scheme. Also, for any individuals X, Y and Z, if X is a customer, they can pay Y with mode of payment Z if Y accepts Z. Using these inference rules, a QA system with a database containing information about credit card schemes and people registered with them could answer questions about acceptance and payment.
It is important to note that our Prolog-like language for inference rules is incomplete in many ways and represents a very restricted part of predicate calculus. For instance, it only allows us to write rules that have simple facts as their conclusions; it does not allow negative conditions in rules and it does not allow rules to introduce, by the use of function symbols, objects that have not been encountered before. Nevertheless, this small subset will suffice for the main points that we wish to make. For a question answerer, the limitation on conclusions of rules means that an inference system using this restricted language could be introduced at the 'bottom level' of query evaluation, but could not be used to answer whole DBQ queries - complex DBQ formulae would need to be reduced to simple predications before the inference system could be of any use.
When using an inference system there are two main options for how the DAG of formulae that links the conclusion to the axioms is constructed. In forwards (or data-directed) inference, the DAG is constructed starting from the axioms and (hopefully) heads towards the desired conclusion. That is, the inference system searches through all possible things that follow from what it knows, intending that some desired conclusions will be among them. The way this is often organized computationally is for the system always to infer all consequences of the facts it is given (that is, to perform all possible sequences of applications of inference rules) and then to record them somewhere. So then, seeing if a goal follows simply involves looking to see whether it is in the set of facts. On the other hand, when new information is added to the set of facts, a great deal of activity may be generated to discover all new consequences. If there are infinitely many consequences, of course, the system will never finish adding the new information. In a forwards inference system, a rule of the form:
a if b and c and d ...
is interpreted to mean 'if B, C, D, ... have been shown to be true then add A to the set of facts'. This is very similar to the basic idea of bottom-up parsing, where the rule would be read as 'if you have found a B, followed by a C, followed by a D, ... then record the fact that you have found an A'. Thus, a forwards inference system usually operates by accepting a set of facts, inferring all possible conclusions from them and announcing whenever one of the set of goals is inferred to be true. For instance, if we have the last two rules and already know that:
credit_card(visa) credit_card(mastercard) customer(j_dore)
then if we discover the new information that:
registered(morgan_bros, visa)
then we are entitled to conclude:
accepts(morgan_bros, visa)
by rule (2) (with X being Morgan Brothers and Y being VISA) and:
can_pay_with(j_dore, morgan_bros, visa)
by rule (3) (with X being Jean Dore, Y being Morgan Brothers and Z being VISA). The DAG supporting this proposition is as shown in Figure 9.3. If we have previously announced an interest in the can_pay_with relation, we would expect the system to tell us about the new facts that it has inferred. Otherwise, we would expect it simply to store them away in the hope that they will eventually contribute to inferences that we are interested in.

Although forwards inference is useful in other contexts, it is not attractive for our QA system, because we do not want to explicitly store or explore every true case of every relation just in case it is asked about in a question. We require a more directed approach that focuses on what might be useful for the question at hand. In backwards (or goal-directed) inference, the DAG of formulae is constructed by starting at the desired conclusion and then heading backwards towards the axioms. In such a framework, no inferencing occurs until a goal is presented to the system. Then the system reasons backwards, reducing the thing it is trying to show to a set of (hopefully, simpler) subgoals. These subgoals are then themselves reduced to simpler goals, until we reach goals that directly match axioms. Thus in a backwards inference system, a rule of the form:
a if b and c and d ...
is interpreted to mean 'if you wish to show A, try showing B, C, D, ...'. This is very similar to the basic idea of top-down parsing, where the rule would be read as 'if you wish to find an A, try finding a B, followed by a C, followed by a D,...'. Thus, a backwards inference system usually operates by accepting a sequence of goals and answering yes or no, according to whether it can show that they are true. For instance, consider the situation where again we have the three rules stated earlier, together with a database containing, among other things, the following facts:
credit_card(visa) credit_card(mastercard) customer(j_dore) registered(morgan_bros, visa) registered(jones_inc, visa)
Imagine that the QA system is asked by Jean Dore 'Who can I pay with VISA?' and manages to translate it into the following DBQ formula:
all(X, can_pay_with(j_dore, X, visa), printout(X))
Evaluation of this query, using the algorithm we described in Section 9.2, involves enumerating all true instances of the formula:
can_pay_with(j_dore, X, visa)
collecting the possible substitutions for X and printing them out. Using backwards inference, the system would attempt to construct a DAG with a formula like this at the start. There is only one inference rule (3) that could be used to justify this formula, and so the start of the DAG must look as shown in Figure 9.4(a). Now, the customer formula is an axiom, and so only the accepts needs justification. Again, only one inference rule is applicable. This introduces an axiom and another formula to be justified, which is illustrated in Figure 9.4(b). Now, the registered formula can be matched against facts in the database in two ways. If we choose X to be Morgan Brothers, then we can get a complete proof for can_pay_with(j_dore, morgan_bros, visa), whereas if we choose X to be Jones Incorporated, then we get a proof of can_pay_with(j_dore, jones_inc, visa). There are thus two possible proofs and two answers to be printed out, as shown in Figure 9.4(c) and (d).

Of the two inference techniques we have considered, forwards inference is probably the more useful if we wish to have a system that responds rapidly to changes in its knowledge and is able to detect one of a large number of possible unusual events. On the other hand, backwards inference is more directed, and so is more appropriate for a system that knows what it is trying to do. As we will see in subsequent sections, however, logical inference is not as straightforward as these simple examples might suggest.
We will not pursue the topic of backwards inference further here, since, if the reader has reasonable intuitions about how Prolog works, then, ipso facto, they have an understanding of the nature of backwards inference. If one needs to implement backwards inference, and Prolog is the language one is using, then the simplest and most straightforward option is just to use Prolog to do the inferencing directly, without building a distinct backwards inference engine on top of Prolog. There may, of course, be good reasons (of efficiency, say, or control) for doing the latter in a real application, but they go beyond the rather modest aims of this chapter. What is, perhaps, surprising, is that one can implement a forwards inference engine in a language (Prolog) that works by doing backwards inference. We put some flesh on this observation in the following section.
Send us a comment.