Using Prolog for natural language processing

Prolog is an expressive language for stating algorithms in computational linguistics. In NLP, we are frequently interested in manipulating symbols (words, phonemes, parts of speech) and structured objects (sequences, trees, graphs) made from them. Prolog is a high-level language in which we can directly express operations on symbols (represented, for instance, by words, strings or numbers) and on structures (represented, for instance, by lists or terms) without having to worry about how these high-level concepts are actually represented in the machine. Prolog allows us to specify complex structures concisely in terms of abstract patterns. It also allows us to talk about information at a very abstract level, as a set of "facts", and to express arbitrarily complex retrieval operations ("inferences") over those facts.

The concept of recursion plays a fundamental role in NLP. Linguistic objects are described by recursive data structures, and operations on these data structures are naturally expressed as recursive algorithms. In common with other high-level programming languages, Prolog places no restrictions on predicate definitions calling themselves (directly or indirectly), and so can express such algorithms directly.
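To make these points concrete, here is a minimal sketch rather than a definitive program. The lexicon, the tree and all predicate names (word/2, example_tree/1, np_determiner/2, leaves/2, concat_lists/3, categories/2) are hypothetical and chosen only for illustration. It shows symbols as atoms, a tree as a nested term, a pattern that picks out part of a structure, facts with a simple retrieval over them, and a recursive operation on a recursive structure.

```prolog
% Facts: a tiny illustrative lexicon relating words to parts of speech.
word(the, det).
word(dog, noun).
word(cat, noun).
word(old, adj).
word(black, adj).

% A structured object: a tree for "the old black dog" as a nested term.
example_tree(np(det(the),
                nbar(adj(old),
                     nbar(adj(black),
                          nbar(noun(dog)))))).

% An abstract pattern: any noun phrase term whose first daughter is a
% determiner Det, whatever the rest of the phrase looks like.
np_determiner(np(det(Det), _), Det).

% A recursive operation on a recursive structure:
% collect the words at the leaves of a tree, from left to right.
leaves(Leaf, [Leaf]) :-
    atom(Leaf).
leaves(Tree, Words) :-
    compound(Tree),
    Tree =.. [_Category | Subtrees],
    leaves_of_all(Subtrees, Words).

leaves_of_all([], []).
leaves_of_all([Tree | Trees], Words) :-
    leaves(Tree, First),
    leaves_of_all(Trees, Rest),
    concat_lists(First, Rest, Words).

% List concatenation, defined here so that the sketch is self-contained.
concat_lists([], List, List).
concat_lists([Head | Tail], List, [Head | Joined]) :-
    concat_lists(Tail, List, Joined).

% A simple "inference" over the facts: map each word to a category.
categories([], []).
categories([Word | Words], [Category | Categories]) :-
    word(Word, Category),
    categories(Words, Categories).
```

With these definitions loaded, the query `?- example_tree(T), leaves(T, Ws), categories(Ws, Cs).` binds Ws to [the, old, black, dog] and Cs to [det, adj, adj, noun]. Note that leaves/2 calls itself indirectly through leaves_of_all/2, mirroring the way the tree is built from smaller trees.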

Although the programs presented in this book are primarily designed to be clear and explanatory, rather than efficient, we should point out that Prolog is a language for writing "serious" NLP programs too. The Prolog programmer can choose among a range of ways of implementing the objects manipulated by a program. Depending on the characteristics of the task at hand, one or another of these may suggest itself as the most efficient, for instance because it achieves an appropriate indexing behaviour. The provision of a garbage collector in Prolog means that the programmer can write programs that freely create and discard data structures, confident that unused storage will be recycled and that there will be no artificial limits on the sizes or numbers of data structures.
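As one illustration of how a choice of representation can affect indexing, the sketch below stores the same small lexicon in two ways. It assumes a Prolog system that indexes clauses on their first argument, as many do; whether and how a particular system does this should be checked in its documentation. The entries and the predicate names (lexicon_list/1, entry_in/2, lex/2, category_from_list/2, category_from_facts/2) are hypothetical.

```prolog
% Representation 1: the lexicon as a single list of Word-Category pairs.
% Looking up a word means scanning the list from the front.
lexicon_list([the-det, dog-noun, cat-noun, old-adj, black-adj]).

category_from_list(Word, Category) :-
    lexicon_list(Entries),
    entry_in(Word-Category, Entries).

entry_in(Entry, [Entry | _]).
entry_in(Entry, [_ | Rest]) :-
    entry_in(Entry, Rest).

% Representation 2: one fact per entry, with the word as first argument.
% Assumption: the system indexes clauses on the first argument, so a
% call with a known word need not try every clause in turn.
lex(the, det).
lex(dog, noun).
lex(cat, noun).
lex(old, adj).
lex(black, adj).

category_from_facts(Word, Category) :-
    lex(Word, Category).
```

Both predicates give the same answers, for example `?- category_from_facts(dog, C).` and `?- category_from_list(dog, C).` each bind C to noun. If the program more often needed all the words of a given category, placing the category in the first argument position might be the better choice under the same assumption.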

This document was translated by troff2html v0.21 on October 22, 1996.