Prolog is an expressive language for stating algorithms
in computational linguistics.
In NLP, we are frequently interested in manipulating
symbols (words, phonemes, parts of speech)
and structured objects (sequences, trees, graphs) made from them. Prolog
is a high-level language in which we can directly express operations on
symbols (represented by atoms, strings or numbers, for instance) and structures
(represented by lists or other terms, for instance) without having to worry
about how these high-level concepts are actually represented in the machine.
Prolog allows us to specify complex structures concisely in
terms of abstract patterns.
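
As a small illustration of structures and patterns, the sketch below
writes a phrase-structure tree as a Prolog term and picks out part of it
by unification. The category names (s, np, vp, det, n, v), the example
sentence and the predicate names example_tree/1 and subject_noun/2 are
assumptions made for this illustration only.

```prolog
% A phrase-structure tree written directly as a term.
% example_tree/1 and the category names are illustrative choices.
example_tree(s(np(det(the), n(dog)),
               vp(v(chased), np(det(a), n(cat))))).

% subject_noun(+Tree, -Noun): the pattern in the clause head describes
% the shape of a sentence whose subject is a determiner followed by a
% noun. The anonymous variables (_) stand for parts we do not care about.
subject_noun(s(np(det(_), n(Noun)), _), Noun).
```

The query ?- example_tree(T), subject_noun(T, N). binds N to dog. The
pattern is stated once, in the clause head, and Prolog does the work of
taking the tree apart.
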
Prolog also allows us to describe information at a very abstract
level as a set of "facts" and to express
arbitrarily complex retrieval operations
("inferences") over those facts.
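
A minimal sketch of facts and an inference over them might look as
follows. The relation word/3, the rule agree/2 and the four lexical
entries are hypothetical, invented here for the purpose of the example.

```prolog
% Facts: a tiny lexicon. word(Form, Category, Number) is a hypothetical
% relation invented for this example.
word(dog,   noun, singular).
word(dogs,  noun, plural).
word(barks, verb, singular).
word(bark,  verb, plural).

% An "inference": a noun and a verb agree if the lexicon assigns them
% the same number.
agree(Noun, Verb) :-
    word(Noun, noun, Number),
    word(Verb, verb, Number).
```

The query ?- agree(dogs, V). binds V to bark, and ?- agree(N, V).
enumerates both agreeing pairs on backtracking. The same definition
serves for checking a pair and for retrieving one.
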
The concept of recursion plays a fundamental role in NLP.
Linguistic objects are described by recursive data
structures, and operations on
these data structures are naturally expressed as recursive algorithms.
In common with other high-level programming languages,
Prolog places no restrictions on predicate definitions
calling themselves (directly or indirectly),
and so can express such algorithms directly.
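
To make this concrete, here is a sketch of a recursive predicate that
collects the words at the leaves of a tree. It assumes, purely for
illustration, that trees are terms whose leaves are atoms (as in
s(np(det(the), n(dog)), vp(v(barks)))). The names tree_words/2,
subtrees_words/2 and join/3 are our own. Note that tree_words/2 calls
itself indirectly, through subtrees_words/2.

```prolog
% tree_words(+Tree, -Words): Words is the list of words at the leaves of
% Tree, read left to right. Leaves are atomic; every other node is a
% compound term whose arguments are its subtrees.
tree_words(Leaf, [Leaf]) :-
    atomic(Leaf).
tree_words(Tree, Words) :-
    compound(Tree),
    Tree =.. [_Category | Subtrees],
    subtrees_words(Subtrees, Words).

% subtrees_words(+Trees, -Words): joins the words of each subtree in order.
subtrees_words([], []).
subtrees_words([Tree | Trees], Words) :-
    tree_words(Tree, First),
    subtrees_words(Trees, Rest),
    join(First, Rest, Words).

% join(?Xs, ?Ys, ?Zs): Zs is Xs followed by Ys (the usual list append,
% defined here so that the example is self-contained).
join([], Ys, Ys).
join([X | Xs], Ys, [X | Zs]) :-
    join(Xs, Ys, Zs).
```

The query ?- tree_words(s(np(det(the), n(dog)), vp(v(barks))), Ws).
binds Ws to [the, dog, barks].
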
Although the programs presented in this book are
primarily designed to be clear and
explanatory, rather than efficient, we should point out that Prolog
is a language for writing "serious" NLP programs too.
The Prolog programmer is presented with a range of possible ways of
implementing the objects manipulated by a program.
Depending on the characteristics of the task at hand,
one or another of these may suggest itself
as the most efficient, for instance because it gives appropriate indexing
behaviour. The provision of a garbage collector in Prolog means that
the programmer can write programs that freely create and discard
data structures, confident that unused storage will be recycled and that
there will be no artificial limits on the sizes or numbers of data structures.
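
As a sketch of what such a choice can look like, the two fragments below
store the same small lexicon in two ways. The predicate names and the
entries are hypothetical. The remark about indexing assumes a Prolog
system that indexes clauses on their first argument, which many systems
do but which the language itself does not guarantee.

```prolog
% Representation 1: one fact per entry, with the word as first argument.
% A system that indexes on the first argument can go straight to the
% matching clauses when the word is known.
lex(the,   det).
lex(dog,   noun).
lex(barks, verb).

category_by_fact(Word, Category) :-
    lex(Word, Category).

% Representation 2: the same entries held in a list of Word-Category
% pairs. Lookup walks the list from the front, so the work done grows
% with the position of the entry in the list.
lexicon_list([the-det, dog-noun, barks-verb]).

category_by_list(Word, Category) :-
    lexicon_list(Entries),
    lookup(Word, Entries, Category).

% lookup(+Word, +Entries, -Category): finds a pair for Word in Entries.
lookup(Word, [Word-Category | _], Category).
lookup(Word, [_ | Rest], Category) :-
    lookup(Word, Rest, Category).
```

Both ?- category_by_fact(dog, C). and ?- category_by_list(dog, C). bind
C to noun. Which representation is preferable depends on how the program
uses the lexicon, for example on whether it is fixed in advance or built
up and passed around while the program runs.
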