University of Sussex

Using Selectional Preferences to Detect Non-Compositionality of Verb-Object Combinations

Speaker

Diana McCarthy

Affiliation

Sussex

Abstract

Automatic methods for detecting the non-compositionality of multiwords have attracted attention in recent years because non-compositionality matters for semantic interpretation. Various approaches have been proposed, many using distributional similarity to contrast a target multiword with its constituents. We describe our work exploring the use of selectional preferences for detecting non-compositional verb-object multiwords. To characterise the arguments in a given grammatical relationship, we experiment with three models of selectional preference: two use WordNet, and one uses the entries from a distributional thesaurus as classes for representation. In previous work on selectional preference acquisition, the classes used for representation were selected according to their coverage of argument tokens. For both the distributional thesaurus model and one of the WordNet models, we instead select classes by the number of argument types they cover, rather than the number of tokens. Only tokens under the classes that are representative of the argument head data are then used to estimate the probability distribution for the selectional preference model. We demonstrate a highly significant correlation between measures which use these 'type-based' selectional preferences and compositionality judgements from a data set used in previous research. The type-based models perform better than the models which use tokens for selecting the classes. Furthermore, the models which use the automatically acquired thesaurus entries produced the best results. The correlation for the thesaurus models is stronger than that for any of the individual features used in previous research on the same data set.
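
The contrast between selecting classes by token coverage and by type coverage can be illustrated with a small sketch. This is a toy illustration only, not the models presented in the talk: the verb, the object counts, the class inventory, and the function names (token_coverage, type_coverage, select_classes, preference_distribution) are all hypothetical, and the classes are kept disjoint for simplicity, which WordNet classes and thesaurus entries generally are not.

```python
from collections import Counter

# Hypothetical token counts of object head nouns seen with one verb (say, "spill").
object_counts = Counter({
    "beans": 50,                                     # one very frequent type
    "water": 3, "coffee": 2, "milk": 2, "wine": 1,
    "secret": 2, "detail": 1,
})

# Hypothetical, disjoint class inventory standing in for WordNet classes
# or distributional thesaurus entries.
classes = {
    "legume": {"beans", "lentils"},
    "liquid": {"water", "coffee", "milk", "wine"},
    "information": {"secret", "detail"},
}

def token_coverage(members, counts):
    """Number of argument tokens that fall under a class."""
    return sum(counts[w] for w in members)

def type_coverage(members, counts):
    """Number of distinct argument types (seen at least once) under a class."""
    return sum(1 for w in members if counts[w] > 0)

def select_classes(classes, counts, score, k):
    """Return the k best classes under the given coverage score (ties by name)."""
    ranked = sorted(classes, key=lambda c: (-score(classes[c], counts), c))
    return ranked[:k]

def preference_distribution(selected, classes, counts):
    """Estimate class probabilities using only tokens under the selected classes."""
    mass = {c: token_coverage(classes[c], counts) for c in selected}
    total = sum(mass.values())
    return {c: m / total for c, m in mass.items()}

by_tokens = select_classes(classes, object_counts, token_coverage, k=2)
by_types = select_classes(classes, object_counts, type_coverage, k=2)
type_based = preference_distribution(by_types, classes, object_counts)
```

In this toy data, token-based selection keeps the class containing the single very frequent object, while type-based selection keeps the classes that cover the most distinct objects. The probability distribution is then estimated from the tokens under the selected classes only.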

[Practice talk for a seminar in Groningen]
