The Natural Language and Computational Linguistics (NLCL) group devises accurate, efficient and scalable approaches to the computer-based analysis and generation of language, driven by newly emerging application areas that demand language processing which deals with meaning. Members of the group pioneered statistical approaches to natural language processing (NLP) in the 1990s, and this data-driven emphasis continues. The main areas of research are:
- domain-independent statistical parsing
- processing of language using deep linguistic knowledge
- detailed manual annotation of language data
- computing with word meanings
In the course of their research, members of the group have produced a number of state-of-the-art systems and datasets that are widely used by other researchers (see Resources).
One strand of work for which the group is becoming renowned is the use of data produced automatically by distributional analyses of word occurrence in tasks that involve computing with word meanings. Members of the group have had recent high-profile successes, recognised by best paper awards at major international conferences, in automatically identifying the predominant senses of words and in generating text from meaning representations with near-linear average time complexity.