Evaluating DBQ formulae

When we evaluate a DBQ formulae (as when we invoke a Prolog goal), in general we need more than just a true/false result - we need to know for what values of the variables the formula can be made true. It is simplest to see this for formulae of the first type (involving simple predicates). Imagine that we have a database about car parts which includes information about which firms supply which parts. We will assume that a relation supplies is represented in the database by a long list of entries specifying exhaustively who supplies what, as follows:




Who What

smith_corp radiators jones_inc spark_plugs morgan_bros radiators jones_inc batteries smith_corp tyres morgan_bros pumps

Questions about this relation can be phrased in DBQ by using the predicate supplies. We will use the first (ARG0) argument position associated with supplies to indicate the supplying firm and the second (ARG1) position to indicate the type of part. Then, the following are possible DBQ formulae involving this relation:




supplies(X, radiators) supplies(X, Y)

The result of evaluating the first of these formulae will be the set of possible values for X which would result in the formula matching something in the database. We can display these as a set of substitutions (assignments of values to variables), just as many Prolog systems show assignments to variables after a goal has been successfully satisfied. Here is one way of writing what we get:




{X=smith_corp; X=morgan_bros}

We use the '=' symbol to indicate the association of a possible value with a variable, the ',' symbol to group together associations and ';' to separate such groupings. The result of evaluating the second of the formulae will be the set of pairs of values for X and Y that result in the formula matching something in the database. In this case, the result corresponds to the whole relation represented by:




{X=smith_corp, Y=radiators; X=jones_inc, Y=spark_plugs; .., }

In general, a DBQ formula involving a predicate and associated arguments will be assumed to correspond to a primitive relation, each true case of which is explicitly listed in the database. The evaluation of such formulae is dealt with by the database and returns the set of possible variable bindings that allow the formula to match a true fact recorded there. Since we are using Prolog, we can simply use the \*L database to do this for us.

Let us now consider the more complex kinds of DBQ formulae. In general, there needs to be a special way of evaluating each kind of conjunction and each kind of quantifier. Consider the and conjunction. We want the truth value of an and proposition to be true if and only if each of the two sub propositions is true. A value of a variable will satisfy such a complex proposition only if it satisfies both of the subpropositions. So, here is a way of generating the possible substitutions for a conjunction:




To evaluate and(P1, P2): Let the set of answers computed so far be empty Evaluate P1, to give the set of substitutions S1 For each element s1 of S1: Apply s1 to the proposition P2, yielding P2' Evaluate P2', giving the set of substitutions S2 For each element s2 of S2: Combine s1 with s2, and add the result to the Set of answers computed so far Return the set of answers computed so far

We have made use here of the operations of applying and combining substitutions.

We will illustrate them through an example.

Consider the following DBQ formula:




and(supplies(X, radiators), supplies(X, Y))

Evaluating the first part (supplies(X, radiators)) of the and proposition against our database results in the first of the set of substitutions given above:




S1: {X=smith_corp; X=morgan_bros}

Each of these two substitutions is dealt with separately. First of all, the substitution 'X=smith_corp' is considered. The second part of the and proposition is rewritten so that all occurrences of X are replaced by smith_corp - this is applying the substitution to the formula. As a result, we now have the following formula:




supplies(smith_corp, Y)

Evaluating this gives rise to the second set of substitutions:




S2: {Y=radiators; Y=tyres}

To get possible answers, we combine the elements of S2 one by one with the element chosen from S1. This just means getting the pairs of substitutions to pool their information. So, we get the following two answers:




{X=smith_corp, Y=radiators; X=smith_corp, Y=tyres}

To complete the set of answers, we need to consider the second substitution in S1. Applying this to the second part of the and proposition gives:




supplies(morgan_bros, Y)

and evaluating this gives:




S2: {Y=radiators; Y=pumps}

Once more we combine these substitutions with the one selected from S1, this time generating the two answers:




{X=morgan_bros, Y=radiators; X=morgan_bros, Y=pumps}

Now that we have dealt with all the elements of S1, we can enumerate the set of substitutions that result from evaluating the whole formula:




{X=smith_corp, Y=radiators; X=smith_corp, Y=tyres; X=morgan_bros, Y=radiators; X=morgan_bros, Y=pumps}

Notice that the algorithm for evaluating an and proposition invokes the evaluation algorithm recursively on the parts of the formula. The algorithm will act similarly for other conjunctions and quantifiers. The recursion will bottom out every time because each recursive call is operating on a smaller piece of formula than the one that invoked it. So, eventually, the recursive calls will cause the evaluation of simple predicate formulae, which can be done by the database without any further recursive calls. Notice also that our description of the evaluation of an and query can be viewed as a rather abstract definition of how Prolog finds all solutions of a conjunction of goals - for each way of satisfying the first goal, it attempts the second goal with variables having the values that they got from the solution of the first goal. This suggests that we can implement this operation straightforwardly in Prolog.

In evaluating a DBQ formula, we will only be interested in substitutions for variables that appear outside the formula. For instance, in the and example it was important to remember the possible values of X satisfying the first subformula, in order to use them in evaluating the second. Now, the assumption will be made here that a quantified formula will never contain variables that are used outside (that is, free variables).

So, this means that when a quantified formula is evaluated we need only ever produce one of two possible answers:




{} {e}

where e is the empty substitution that assigns values to no variables. The first of these answers says that there are no substitutions that will make this formula true (it corresponds to failure in Prolog), whereas the second says that there is a substitution that will make the formula true, but it is empty because the formula has no free variables (this corresponds to a Prolog goal succeeding but not instantiating any variables). Here is the part of the evaluation algorithm dealing with the all and exists quantifiers:




To evaluate all(V, R, B): Evaluate R, to give the set of substitutions S1 If there are none, simply return {e} Otherwise, for each element s1 of S1: apply s1 to the proposition B, yielding B' if B' evaluates to give at least one substitution, continue going through the elements of S1 otherwise immediately return {} If you get successfully through all the elements of S1, return {e}

To evaluate exists(V, R, B): Evaluate R, to give the set of substitutions S1 If there are none, simply return {} Otherwise, for each element s1 of S1: apply s1 to the proposition B, yielding B' if B' evaluates to give at least one substitution, immediately return {e} otherwise continue going through the elements of S1 If you get through all the elements of S1 without any success, return {}

Intuitively, the all algorithm goes through all the values of the variable satisfying the restriction and checks that they satisfy the body. Only if they all satisfy can it return {e} - that is, true. On the other hand, the exists algorithm is content to return {e} if there exists any value of the variable that makes both the restriction and the body true. Consider, for example, the evaluation of:




all(X, supplies(X, radiators), supplies(X, tyres))

given our example database. The evaluation of the restriction part provides the following set of substitutions:




S1: {X=smith_corp; X=morgan_bros}

Taking the first of these and applying it to the body, we get:




supplies(smith_corp, tyres)

which evaluates to {e}. So, we continue with the second substitution in S1. This time, the second formula to be evaluated is:




supplies(morgan_bros, tyres)

which yields {}. Because of this, the evaluation of the whole formula returns {}. The evaluation has indicated that 'everyone who supplies radiators supplies tyres' is false, because Morgan Bros supplies radiators but not tyres.

Notice that the above strategy of evaluating the restriction R in order to substitute values into the body B will only work if the evaluation of the restriction actually produces substitutions for the variable. Not every DBQ formula will, however, produce substitutions for all the variables in it when it is evaluated. For instance, evaluating an 'all' formula, whether or not it is successful, never yields any interesting substitutions. So, in order for query evaluation as we have sketched it to work, we must require that a restriction formula be one of the following:

  1. a simple proposition mentioning the variable (e.g. owns(X,Y) mentions X and Y) or

  2. a conjunction with such a proposition as its first conjunct.

If a DBQ formula is coming from the analysis of an English sentence, this restriction is rarely a problem, because most NPs provide via a noun a simple proposition about the referent which will appear in the restriction part of the relevant DBQ formula. For instance, 'every airline' gives rise to a DBQ formula where the restriction is the simple proposition 'airline(X)', where X is the variable quantified over in the formula.

We have now sketched out the main components of an evaluation algorithm for DBQ formulae. Put together, these form a recursive algorithm for evaluating a formula by recursively evaluating its components in particular ways. The algorithm returns a set of substitutions, rather than a truth value, but for a formula with no free variables the possible results, {e} and {}, do indicate true and false, respectively. The algorithm as sketched can also be extended to handle other conjunctions, logical operators and quantifiers, such as or, not, most.

When a DBQ formula is treated as a database query, it is assumed that the person posing the query is interested in whether the proposition expressed is true or not. In general, they may be interested in knowing for what objects a given proposition is true, as well as knowing whether there are such objects. So, DBQ allows the specification of certain extra actions that are to be carried out during query evaluation. Such actions are treated as propositions that are always true, but which cause side effects to occur when their truth is verified. The only action we will actually use is that of printing out the value of a variable (that makes some proposition true). This will give rise to a DBQ form like




printout(X)

We really need to introduce a clause into the algorithm for such things, as follows:




To evaluate printout(X): Print out X Return {e}

Of course, to introduce such actions sensibly into our DBQ formulae, we need to know when they will be encountered in the evaluation process. Thus, for instance, we want a printout action only to be encountered after a particular value has been put in the place of the relevant variable.

Code:dbq.pl Writing a Prolog evaluator for DBQ formulae is extremely simple because the language is already so close to Prolog. Normal satisfaction of Prolog queries already provides for us the enumeration of possible values for variables, and on backtracking Prolog will generate all the possibilities. Of course, in Prolog we do not have explicit access to substitutions as datastructures. We thus have to harness the Prolog execution mechanism to ensure that variables are instantiated correctly before each goal is attempted. In conclusion, we simply treat a DBQ query like a normal Prolog query and provide clauses for the predicates 'and', 'or', 'exists', and so on, as well as the various database predicates (like 'supplies'). Here is such an evaluator (DBQ.PL) in its entirety:




and(Prop1,Prop2) :- Prop1, Prop2.

or(Prop1,Prop2) :- Prop1; Prop2.

exists(_,Restriction,Body) :- Restriction,Body, !.

all(_,Restriction,Body) :- not((Restriction,not(Body))).

printout(Variable) :- write(Variable), nl.

To evaluate a DBQ query, we then simply invoke it as a Prolog goal, for instance:




?- all(X,supplies(X,radiators),printout(X)).

If a DBQ query produces more than one possible substitution as its result, then we can make it enumerate all the possibilities by causing Prolog to backtrack.

Exercise 9.2

Code:europe.pl Code:airlines.pl

Send us a comment.



[Contents] [Previous] [Next]
This document was translated by troff2html v0.21 on October 22, 1996.