Question answering

Translating from a natural language sentence to an MRL expression can be useful in a number of ways, but how can we evaluate whether real understanding has occurred? Suppose somebody came up to you and claimed to have a natural language understanding program that translated English sentences into an MRL. They might give you some examples of sentences and their translations:


Mayumi sings ==> SING(M)
Beryl sees Hanni ==> SEE(B,H)

How would you evaluate their claim? You would certainly want to know what the denotations of the symbols SING, M, SEE, B and H are; that is, which objects and relations in the world they are supposed to indicate.

You would also want to know the rules for constructing formulae in the MRL (the language's syntax). And you would want to know the rules for determining whether a given formula is true or not in any given context (the language's semantics). For although the truth or falsity of a formula in a given context might not always be of direct interest, the ability to determine (or know what would be involved in determining) the truth value in a given context is a fundamental part of knowing what the formula means. This is because knowing in what circumstances a formula would be true in some context (knowing the formula's truth conditions) amounts to knowing what the formula is claiming about the world. Armed with the appropriate syntax and semantics for the MRL, you could ascertain that the example sentences were indeed translated into expressions with the correct meaning.
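The idea that a formula's meaning lies in its truth conditions can be illustrated with a minimal Python sketch. It assumes that a context is simply a set of ground facts and that an atomic formula is true exactly when the corresponding fact is present. The two contexts, and the reading of M, B and H as Mayumi, Beryl and Hanni, are made up for illustration and are not part of any particular MRL.

```python
# Two hypothetical contexts, each a set of ground facts (relation, arguments).
# Reading M, B, H as Mayumi, Beryl, Hanni is an assumption made for illustration.
context_1 = {("SING", ("M",)), ("SEE", ("B", "H"))}
context_2 = {("SEE", ("H", "B"))}


def is_true(relation, args, context):
    """An atomic formula is true in a context exactly when the context contains that fact."""
    return (relation, tuple(args)) in context


# The same two formulae, SING(M) and SEE(B,H), evaluated in each context.
for name, context in [("context_1", context_1), ("context_2", context_2)]:
    print(name, is_true("SING", ["M"], context), is_true("SEE", ["B", "H"], context))
```

The same formula can come out true in one context and false in another; knowing which contexts make it true is what knowing its truth conditions amounts to.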

But this would not be enough to demonstrate that the program had understood these sentences. To demonstrate understanding, it is not sufficient to show that the translation into an MRL, together with the semantics of the MRL, provides correct meanings for natural language sentences. If it were, we could simply use our natural language as an MRL and have the identity translation. It is not even sufficient to show that the semantics of the MRL is something that can be formally specified. For that offers no guarantee of any understanding, apart from that achievable by a human reader who is able to work out the semantics of meaning representations.

The kinds of tests for understanding that we need must examine the machine's ability to manipulate the symbols of the meaning representation language in a way that reflects their actual meaning. If a machine has sensory inputs of some kind, we might require that its manipulation of symbols like SING and SEE have some association or correlation with certain kinds of auditory or visual input or output. If the machine is able to accept or produce natural language utterances, we might require that its actions be in accordance with what we would expect from a language understander. But, above all, we might require that the way in which symbols are organized and used in the machine reflects somehow the organization and ways of the world. For instance, we would expect that there would be a connection between the way in which the symbol SING is used and the ways in which other symbols like SONG and MELODY are used, since there are obvious connections between the (real) concepts that these symbols presumably denote. Obviously, we cannot have a comprehensive model of the world in our machine, but whatever model the machine has should parallel the real world in all the respects relevant to the current application.

Question answering is one task that we can use to test the understanding of a natural language processing system. Other tasks include translation, the production of explanations and summaries, and various problem-solving tasks. We will concentrate here on question answering because it is relatively easy to build a simple question answerer from the materials we have already developed. In addition, natural language question-answering systems (QA systems) are providing an increasingly attractive way for humans to extract information from large computer databases. In its simplest form, dealing with yes-no questions, question answering requires being able to determine whether or not a given natural language statement, or rather its MRL equivalent, is true. That is, a QA system that can deal with any question (within some natural language fragment) in any context (one of a set of possible databases) will have to know the truth conditions of formulae in the MRL and hence, to act appropriately, will have to understand natural language queries in a non-trivial way.

Chapter 8 provided a syntax and informal semantics for the DBQ meaning representation language. Formulae of this language make use of general symbols, such as the connective and and the quantifier all, as well as special symbols for objects and relations that will depend on the particular database at hand. But are our semantic interpretation rules correct to translate the English word 'and' into a construction involving the connective and? What exactly does the quantifier symbol all mean to the machine? Can it really manipulate these symbols in a way that reflects the meanings of English sentences? What we will do next is specify informally an algorithm for evaluating a DBQ formula; that is, determining whether it is true or false in some given database. The operations for determining the truth of a given type of construction in the language will reflect the meaning that the construction has for the machine. The question of how far these operations reflect the original natural language meaning will remain unanswered, although there is certainly an intuitive reasonableness to the operations, at least when they are dealing with simple examples. If DBQ were to be taken seriously as an MRL, then it would be important to provide a formal semantics for it, and to determine to what extent the operations of the question-answering algorithm respected and embodied this semantics. This would enable us to separate the question of the appropriateness of DBQ translations of natural language statements from the question of the extent to which the DBQ semantics is reflected in the machine.

For semantic translation, it was convenient to regard DBQ formulae as DAGs, as this allowed us to accumulate separately the different pieces of information that went into the construction of a given formula. For question-answering purposes, however, we can take a DBQ formula to be a fully constructed object, and it will be useful to exploit this and introduce a more concise Prolog syntax similar to that of typed predicate calculus (predicate calculus where all variables are annotated with the type of object that they can range over). There are three kinds of DBQ formulae, and their representations in PATR and their concise representations are as follows:




Old notation                New notation

<predicate> = p             p(a0,a1)
<arg0> = a0
<arg1> = a1

<connective> = c            c(p1,p2)
<prop1> = p1
<prop2> = p2

<quantifier> = q            q(v,r,b)
<variable> = v
<restriction> = r
<body> = b

For example:




Old notation                            New notation

<quantifier> = all                      all(X, airline(X), employer(X))
<restriction predicate> = airline
<restriction arg0> = <variable>
<body predicate> = employer
<body arg0> = <variable>

<connective> = and                      and(airline(Delta),hotel_chain(Hilton))
<prop1 predicate> = airline
<prop1 arg0> = Delta
<prop2 predicate> = hotel_chain
<prop2 arg0> = Hilton

In translating from the old notation to the new, we replace every value of the VARIABLE feature by a Prolog variable, in such a way that two such variables are the same exactly when the corresponding values are shared in the DAG.
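The translation rule just described can be sketched in Python. This is an illustration under stated assumptions, not the book's implementation: a DAG is modelled as a nested dictionary keyed by the feature names used above, sharing is modelled by reusing one `Var` object, and `Var`, `fresh_name` and `to_term` are hypothetical names introduced here.

```python
class Var:
    """A variable value in the DAG; two occurrences share iff they are the same object."""


def fresh_name(n):
    """Pick a readable variable name for the n-th distinct variable (X, Y, Z, V3, ...)."""
    return "XYZ"[n] if n < 3 else f"V{n}"


def to_term(dag, names=None):
    """Convert a feature structure (nested dict) into the concise term notation."""
    if names is None:
        names = {}
    if isinstance(dag, Var):
        # A shared Var object always maps to the same name; distinct objects get distinct names.
        if dag not in names:
            names[dag] = fresh_name(len(names))
        return names[dag]
    if isinstance(dag, str):
        return dag
    if "quantifier" in dag:
        head = dag["quantifier"]
        parts = [dag["variable"], dag["restriction"], dag["body"]]
    elif "connective" in dag:
        head = dag["connective"]
        parts = [dag["prop1"], dag["prop2"]]
    else:
        head = dag["predicate"]
        parts = [dag[key] for key in ("arg0", "arg1") if key in dag]
    return head + "(" + ", ".join(to_term(part, names) for part in parts) + ")"


x = Var()  # one object, used three times below: the three occurrences share
dag = {
    "quantifier": "all",
    "variable": x,
    "restriction": {"predicate": "airline", "arg0": x},
    "body": {"predicate": "employer", "arg0": x},
}
print(to_term(dag))  # all(X, airline(X), employer(X))
```

Keying the name table on object identity is what makes two variable names coincide exactly when the underlying values are shared.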

Here is a slightly more complex translation example, taken from Chapter 8, and set out in an indented format -- "every hotel chain took over an airline":




quantifier : all
variable : X
restriction :
    arg0 : X
    predicate : hotel_chain
body :
    quantifier : exists
    variable : Y
    restriction :
        arg0 : Y
        predicate : airline
    body :
        arg0 : X
        arg1 : Y
        predicate : took_over

This translates into:




all(X, hotel_chain(X), exists(Y, airline(Y), took_over(X,Y)))
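To give a feel for what evaluating such a formula against a database might involve, here is a rough Python sketch. It rests on several assumptions that the text does not state: formulae are nested tuples, the database is a dictionary mapping each predicate name to a set of argument tuples, all is read as "every individual satisfying the restriction satisfies the body", and exists as "some individual satisfying the restriction satisfies the body". The function `evaluate` and the toy database are hypothetical.

```python
def evaluate(formula, db, env=None):
    """Return True if the formula holds in the database db under the variable bindings env."""
    env = env or {}
    head, *rest = formula
    if head == "and":
        return all(evaluate(prop, db, env) for prop in rest)
    if head in ("all", "exists"):
        var, restriction, body = rest
        # Candidate values: every individual mentioned anywhere in the database.
        domain = {arg for rows in db.values() for row in rows for arg in row}
        matches = [d for d in domain if evaluate(restriction, db, {**env, var: d})]
        results = (evaluate(body, db, {**env, var: d}) for d in matches)
        # With no matching individuals, "all" comes out true and "exists" false.
        return all(results) if head == "all" else any(results)
    # Atomic formula: a string counts as a variable only if it is bound in env.
    args = tuple(env.get(arg, arg) for arg in rest)
    return args in db.get(head, set())


# A made-up toy database, for illustration only.
db = {
    "hotel_chain": {("hilton",), ("ramada",)},
    "airline": {("delta",), ("twa",)},
    "took_over": {("hilton", "delta"), ("ramada", "twa")},
}

query = ("all", "X", ("hotel_chain", "X"),
         ("exists", "Y", ("airline", "Y"), ("took_over", "X", "Y")))
print(evaluate(query, db))  # True: each hotel chain took over some airline
```

If the fact took_over(ramada, twa) were removed from this toy database, the same query would evaluate to false, since one hotel chain would no longer have taken over any airline.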

Exercise 9.1

This document was translated by troff2html v0.21 on October 22, 1996.