Information about the properties and attributes of an individual can often be obtained by considering the classes that the individual belongs to. Thus, we know that Clark Kent has a notebook because he is a reporter, that he has two arms because he is a human being, and that he is warm blooded because he is a mammal. In general, it will be more economical, in terms of the number of facts and rules needed, to express information at the level of classes (mammals are warm blooded) rather than in terms of individuals (Clark Kent is warm blooded, Lois Lane is warm blooded, ...), where this is possible. In addition, it will also be more economical to express general information about how the world divides up into classes of objects (all humans are mammals) than to specify all the classes an individual belongs to (Clark Kent is a reporter, Clark Kent is a human, Clark Kent is a mammal, ...). Thus rules of the following general form are likely to be especially useful in a system for reasoning about the world:
human(X) if reporter(X)
mammal(X) if human(X)
mammal(X) if dog(X)

number_of_legs(X, 2) if human(X)
number_of_legs(X, 2) if bird(X)
number_of_legs(X, 4) if dog(X)
has_notebook(X) if reporter(X)
number_of_arms(X, 2) if human(X)
warm_blooded(X) if mammal(X)
Facts and rules of the first kind describe the hierarchy of individuals and classes that make up the world (often called an 'isa-hierarchy'). Rules of the second kind express how various attributes and properties follow from class memberships. It is convenient to draw the isa-hierarchy as a directed graph and attach information of the second kind to the classes it applies to. The result is often called a semantic network. Thus, the foregoing information would correspond to the semantic network shown in Figure 9.5.

A good deal of work has exploited this network metaphor directly in a computational way. Given the network outlined here, for instance, we can find out how many arms Clark Kent has by searching up the isa-hierarchy (following the arrows) until we find a class to which there is number_of_arms information attached. This is a very directed inference process (although still potentially involving search) as compared to the kinds of general backwards inference that we have considered. Implementing special mechanisms to perform these kinds of inferences efficiently may be effective, but it is an open problem to what extent a natural language processing system can make do with only these limited class-based inferences.
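The directed search up the isa-hierarchy can be sketched in a few lines of Prolog. This is only an illustration under stated assumptions: the predicate names 'attached' and 'lookup' are hypothetical, as is the encoding of Clark Kent as a node directly below 'reporter'.

```prolog
% Hypothetical encoding of the network described in the text.
% isa(Node, Class) gives the arrows of the isa-hierarchy.
isa(clark_kent, reporter).
isa(reporter, human).
isa(human, mammal).
isa(dog, mammal).

% attached(Class, Attribute, Value): information stored at a node.
attached(reporter, has_notebook, yes).
attached(human, number_of_arms, 2).
attached(human, number_of_legs, 2).
attached(bird, number_of_legs, 2).
attached(dog, number_of_legs, 4).
attached(mammal, warm_blooded, yes).

% lookup(+Entity, +Attribute, -Value, -Source):
% follow the isa arrows upwards from Entity until a node is found
% that has Attribute attached; Source is the node supplying Value.
lookup(Entity, Attribute, Value, Entity) :-
    attached(Entity, Attribute, Value).
lookup(Entity, Attribute, Value, Source) :-
    isa(Entity, Parent),
    lookup(Parent, Attribute, Value, Source).
```

With these clauses, the query 'lookup(clark_kent, number_of_arms, N, Where)' climbs from clark_kent through reporter to human and answers N = 2, Where = human. The search only ever visits ancestors of the starting node, which is what makes it more directed than general backwards inference.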
Our use of macros in lexical entries was very much in the spirit of organizing information around hierarchies of classes. For instance, we can think of the definitions:
Macro syn_iV:
    <cat> = V
    <arg0 cat> = NP
    <arg0 case> = nom.

Macro syn_tV:
    syn_iV
    <arg1 cat> = NP
    <arg1 case> = acc.

Lexeme eat:
    syn_tV.
as introducing the inheritance network shown in Figure 9.6.

In our treatment of the lexicon, macros were treated as abbreviations to be expanded at the time a word is looked up. Such a strategy was feasible because the total amount of information associated with a given lexical item was small. If we were representing a reasonable amount of information about humans and reporters, however, we could not carry out such an expansion every time we wanted to retrieve information about a particular human being. Instead we would need to adopt a strategy where the information was looked up only when required, for example for use in a unification.
As well as providing a way of organizing a large amount of information about the world, class membership can also be the basis of efficient detection of anomaly. If we know what kinds of objects can participate in what relations, this gives us a crude way of screening information. Consider, for instance, the problem of understanding the following sentence:
A bat perched on the wall.
where 'bat' may refer to a kind of flying mammal or a wooden implement (used, perhaps, for playing cricket or baseball, or for rioting). We can encode relevant information about chiropterans, perching, baseball bats, cricket bats and the classes of objects associated with them as the following rules for a forwards inference system:
animal(X) if perch(X, Y)
animate(X) if animal(X)
animal(X) if chiropteran(X)
chiropteran(X) if flying_bat(X)
inanimate(X) if cricket_bat(X)
inanimate(X) if baseball_bat(X)
contradiction if animate(X) and inanimate(X)
If we then choose the chiropteran sense for 'bat' and add the information:
flying_bat(bat28)
perch(bat28, wall06)
then the inference process will simply infer that bat28 is an animal and animate. No contradiction will arise. Chiropterans cannot perch, of course; they can only hang, but our knowledge base does not include this subtlety. If we instead choose one of the other two senses for 'bat' and add the information:
baseball_bat(bat29)
perch(bat29, wall07)
then the inference process will correctly produce contradiction and we will know that something has gone wrong. In practice, as we have seen, this kind of anomaly detection is normally performed by a process more like type checking. That is, the lexical entry for a verb like 'perch' would contain information that stated that its subject must be a phrase referring to an animal. This will then be checked by looking at class markers in the definition of the relevant noun word sense. In Chapter 8, we saw how such checking could be built into unifications performed during parsing. Such a technique is a clear example of optimizing a restricted kind of inference that is useful in NLP. On the other hand, the optimization requires that the acceptability of phrases can be determined solely from properties of the individual words. A more general inferential approach, as sketched here, would be capable of taking into account information from the context as well as information in lexical entries. Thus it might be able to detect the anomaly of:
It perched on the wall.
in a context where 'it' could only refer to an object already known to be a baseball bat. We will consider the use of background information further in Chapter 10.
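The forwards inference over the 'bat' rules can be tried out with a very small forward chainer. The following is a minimal sketch rather than a definitive implementation: the predicates 'rule', 'forward', 'all_known' and 'anomalous' are hypothetical names, the rules are those given earlier with their premises collected into a list, and the known facts are simply held in a list.

```prolog
:- use_module(library(lists)).

% rule(Conclusion, Premises): the rules from the text.
rule(animal(X),      [perch(X, _)]).
rule(animate(X),     [animal(X)]).
rule(animal(X),      [chiropteran(X)]).
rule(chiropteran(X), [flying_bat(X)]).
rule(inanimate(X),   [cricket_bat(X)]).
rule(inanimate(X),   [baseball_bat(X)]).
rule(contradiction,  [animate(X), inanimate(X)]).

% forward(+Facts0, -Facts): add one new conclusion at a time
% until no rule can derive anything that is not already known.
forward(Facts0, Facts) :-
    rule(Head, Body),
    all_known(Body, Facts0),
    \+ memberchk(Head, Facts0),
    !,
    forward([Head|Facts0], Facts).
forward(Facts, Facts).

% all_known(+Premises, +Facts): every premise matches a known fact.
all_known([], _).
all_known([P|Ps], Facts) :-
    member(P, Facts),
    all_known(Ps, Facts).

% anomalous(+Facts0): forward inference from Facts0 yields
% the special fact 'contradiction'.
anomalous(Facts0) :-
    forward(Facts0, Facts),
    memberchk(contradiction, Facts).
```

Under these assumptions, 'anomalous([flying_bat(bat28), perch(bat28, wall06)])' fails, whereas 'anomalous([baseball_bat(bat29), perch(bat29, wall07)])' succeeds. Because the facts can come from anywhere, the same check applies when the baseball bat fact was contributed by the earlier context rather than by the lexical entry of a word in the sentence.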
Another feature that semantic network systems offer is indexing of information by the objects involved, rather than the relations. Many logic-based systems organize their information around the relations involved, so that, for instance, all the rules that would enable us to conclude things about number_of_arms are kept in one place, all the rules about warm_blooded are kept somewhere else, and so on. In addition, there is an efficient way of going from the name of a relation to the bundle of information about it. This is convenient if we want to answer inference questions like 'Who is warm blooded?' because all the relevant information is conveniently grouped together and accessible, but it is not a suitable organization if we want to answer questions like 'What do you know about Mayumi?'. The idea of organizing information around objects is the basis of the object-oriented programming paradigm mentioned previously.
Implementing a simple semantic network in Prolog is straightforward. We introduce a predicate 'attr(Entity, Attribute, Value)' so that we can stipulate what value particular attributes have for particular entities (we will make no distinction between individuals and classes in our system -- they both count as entities). Where an attribute is a property that an entity may or may not have, we just use the values 'yes' and 'no' accordingly. Thus:
Code: inherits.pl

attr(club_member,sex,female).
attr(associate,associate_member,yes).
attr(associate,citizenship,non_US).
attr(life_member,life_member,yes).
attr(life_member,citizenship,'US').
attr(kim,over_50,no).
attr(jean,over_50,yes).
attr(mayumi,over_50,yes).
attr(beryl,over_50,no).
We then augment 'attr' with an 'isa' predicate that links up the entities into a hierarchy:
isa(associate,club_member).
isa(life_member,club_member).
isa(kim,associate).
isa(jean,associate).
isa(mayumi,life_member).
isa(beryl,life_member).
The final component is a predicate 'has_attr' that defines when an entity is to count as having a particular attribute-value pair associated with it. This will be the case either when such a pair is directly given by the 'attr' predicate:
has_attr(Entity,Attribute,Value) :- attr(Entity,Attribute,Value).
Or when the entity concerned is linked (by 'isa') to another entity, and that entity has the attribute-value pair associated with it:
has_attr(Entity1,Attribute,Value) :- isa(Entity1,Entity2), has_attr(Entity2,Attribute,Value).
Given this code, here is an exhaustive listing of the conclusions that it will allow us to draw from the example network:
Kim is an associate member. The sex of Kim is female. Kim is not over 50. The citizenship of Kim is non-US.
Jean is over 50. Jean is an associate member. The sex of Jean is female. The citizenship of Jean is non-US.
Mayumi is over 50. Mayumi is a life member. The citizenship of Mayumi is US. The sex of Mayumi is female.
Beryl is a life member. The citizenship of Beryl is US. The sex of Beryl is female. Beryl is not over 50.
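Conclusions of this kind can be collected with 'findall'. The sketch below repeats the facts and the two 'has_attr' clauses so that it can be run on its own; the predicates 'known_about' and 'who_has' are hypothetical helpers introduced here for illustration and are not claimed to be part of inherits.pl or show_net.pl.

```prolog
% The example network and inheritance clauses, repeated from the text.
attr(club_member,sex,female).
attr(associate,associate_member,yes).
attr(associate,citizenship,non_US).
attr(life_member,life_member,yes).
attr(life_member,citizenship,'US').
attr(kim,over_50,no).
attr(jean,over_50,yes).
attr(mayumi,over_50,yes).
attr(beryl,over_50,no).

isa(associate,club_member).
isa(life_member,club_member).
isa(kim,associate).
isa(jean,associate).
isa(mayumi,life_member).
isa(beryl,life_member).

has_attr(Entity,Attribute,Value) :-
    attr(Entity,Attribute,Value).
has_attr(Entity1,Attribute,Value) :-
    isa(Entity1,Entity2),
    has_attr(Entity2,Attribute,Value).

% known_about(+Entity, -Pairs): object-indexed retrieval, that is,
% everything the network lets us conclude about one entity.
known_about(Entity, Pairs) :-
    findall(Attribute-Value, has_attr(Entity, Attribute, Value), Pairs).

% who_has(+Attribute, +Value, -Entities): relation-indexed retrieval,
% that is, every entity for which the attribute has the given value.
who_has(Attribute, Value, Entities) :-
    findall(Entity, has_attr(Entity, Attribute, Value), Entities).
```

For example, 'known_about(mayumi, Pairs)' answers Pairs = [over_50-yes, life_member-yes, citizenship-'US', sex-female], which is one way of answering 'What do you know about Mayumi?'. The query 'who_has(over_50, yes, Who)' answers Who = [jean, mayumi].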
Code: show_net.pl

The implementation shown, which is to be found in the file inherits.pl, works well provided that the network it is given obeys the rules of the game: (i) the network has no 'isa' cycles (that is, it is a DAG); (ii) no entity has a value specified for an attribute that is also specified at an ancestral node in the DAG; (iii) whenever there is a pair of entities, neither being a descendant of the other but with a descendant in common, there is no attribute that both entities specify a value for; and (iv) no entity is associated locally with more than one value for a given attribute.
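Whether a given network obeys these four conditions can itself be checked mechanically. The following is a rough sketch under stated assumptions: the network is a deliberately flawed, hypothetical one (it is not the club network above), and the predicates 'entity', 'ancestor', 'reach', 'violation' and 'violations' are illustrative names rather than part of inherits.pl.

```prolog
:- use_module(library(lists)).

% A deliberately flawed, hypothetical network.
isa(penguin, bird).
isa(opus, penguin).

attr(bird, flies, yes).
attr(penguin, flies, no).     % breaks (ii): bird already specifies flies
attr(opus, colour, black).
attr(opus, colour, white).    % breaks (iv): two local values for colour

% entity(-E): E occurs somewhere in the isa-hierarchy.
entity(E) :- isa(E, _).
entity(E) :- isa(_, E).

% ancestor(+Entity, ?Ancestor): Ancestor is reachable from Entity by
% one or more isa links. The visited list keeps the search finite
% even when the network contains a cycle.
ancestor(Entity, Ancestor) :-
    reach(Entity, [Entity], Ancestor).

reach(Entity, Visited, Ancestor) :-
    isa(Entity, Parent),
    (   Ancestor = Parent
    ;   \+ memberchk(Parent, Visited),
        reach(Parent, [Parent|Visited], Ancestor)
    ).

% violation(-V): one clause per condition (i) to (iv).
violation(cycle(E)) :-                       % (i)
    entity(E),
    ancestor(E, E).
violation(overrides(E, A, Anc)) :-           % (ii)
    attr(E, A, _),
    ancestor(E, Anc),
    attr(Anc, A, _).
violation(clash(X, Y, A)) :-                 % (iii)
    attr(X, A, _),
    attr(Y, A, _),
    X @< Y,
    \+ ancestor(X, Y),
    \+ ancestor(Y, X),
    entity(D),
    ancestor(D, X),
    ancestor(D, Y).
violation(multiple_local(E, A)) :-           % (iv)
    attr(E, A, V1),
    attr(E, A, V2),
    V1 @< V2.

% violations(-Vs): all violations, sorted and without duplicates.
violations(Vs) :-
    findall(V, violation(V), Vs0),
    sort(Vs0, Vs).
```

For this network, 'violations(Vs)' answers Vs = [multiple_local(opus,colour), overrides(penguin,flies,bird)]. The second violation shows why condition (ii) matters: with the two 'has_attr' clauses given earlier, asking whether opus flies would return both 'no' (from penguin) and 'yes' (from bird), since nothing in those clauses lets a more specific value block a more general one.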