Morphological and Orthographic Tools for English


Tools are now available for inflectional morphological analysis and generation, and for determining the orthography of the indefinite article.

The tools are:

morpha
a fast and robust morphological analyser for English, based on finite-state techniques, that returns the lemma and inflection type of a word given the word form and its part of speech. (The part of speech is optional, but accuracy is degraded if it is not supplied.)
morphg
generates a word form given a specification of the lemma, the part of speech, and the type of inflection required. Morphg is derived automatically from morpha, ensuring consistency and reversibility of the tools. An option selects British English or American English behaviour with respect to consonant doubling.
ana
postprocesses text to insert the correct form of the indefinite article (i.e. a or an). Ana encodes a set of rules keyed to the pronunciation of the next word (so an is produced if the following word starts with a vowel sound, and a otherwise). The tool handles plain text, part-of-speech-tagged text, and SGML, among other possible formats.
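
The rule that ana applies can be illustrated with a small sketch. The Python below is not ana's implementation or rule set: the function names and the two short exception lists are hypothetical, chosen only to show why the choice depends on the sound of the next word rather than its first letter. It works on plain text only and does not distinguish the article from the letter "a".

```python
import re

# Hypothetical, deliberately tiny exception lists (assumptions for illustration;
# a real rule set would need to be far larger).
SILENT_H_PREFIXES = ("hour", "honest", "honour", "honor", "heir")
CONSONANT_SOUND_PREFIXES = ("univers", "unit", "uniq", "unif", "union",
                            "use", "usu", "euro", "one", "once")
VOWEL_LETTERS = {"a", "e", "i", "o", "u"}


def indefinite_article(next_word: str) -> str:
    """Return 'a' or 'an' using a crude approximation of the vowel-sound rule."""
    w = next_word.lower()
    if w.startswith(SILENT_H_PREFIXES):
        return "an"   # written consonant, spoken vowel: "an hour"
    if w.startswith(CONSONANT_SOUND_PREFIXES):
        return "a"    # written vowel, spoken consonant: "a university"
    return "an" if w[:1] in VOWEL_LETTERS else "a"


# An existing article followed by whitespace and the next word.
ARTICLE_PATTERN = re.compile(r"\b(a|an)(\s+)(\w+)", re.IGNORECASE)


def fix_articles(text: str) -> str:
    """Rewrite each 'a'/'an' in plain text according to the word that follows."""
    def replace(match: re.Match) -> str:
        article = indefinite_article(match.group(3))
        if match.group(1)[0].isupper():
            article = article.capitalize()  # preserve sentence-initial capital
        return article + match.group(2) + match.group(3)

    return ARTICLE_PATTERN.sub(replace, text)


if __name__ == "__main__":
    print(fix_articles("A apple, a hour, and an university"))
```

A spelling-based approximation like this fails on many words (abbreviations, numerals, less common silent-h or "yoo"-initial words), which is why a rule set keyed to pronunciation, as described for ana, is the more robust design.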

The tools are implemented using widely available Unix utilities, and are free for research purposes; for any proposed commercial use, please contact John Carroll. Please also send an email if you would like to be notified of new releases. New features in the works include derivational morphology for deverbal nouns, and comparative and superlative forms of adjectives.

Recent changes:

September 2003: new version of morpha/g with pre-built binaries for Linux, Solaris and Mac OS X, and a few classes of misanalysis fixed.


A description of the tools is published in

Minnen, G., J. Carroll and D. Pearce (2001) `Applied morphological processing of English', Natural Language Engineering, 7(3). 207-223.

Minnen, G., J. Carroll and D. Pearce (2000) `Robust, applied morphological generation'. In Proceedings of the 1st International Natural Language Generation Conference, Mitzpe Ramon, Israel. 201-208.

Please refer to one of these papers when describing any research using the tools.

The tools were produced as part of the UK EPSRC-funded PSET project and are being further developed on the RASP project.

