University of Sussex

All-words WSD: Knowledge Acquisition Bottleneck and Effect of Domain

Speaker

David Martinez

Affiliation

Basque Country University

Abstract

The last edition of the Senseval evaluation track (Senseval-3, 2004) for word sense disambiguation (WSD) showed that all-words systems were still far from the performance achieved by lexical-sample systems. Two main problems limit the scalability of WSD algorithms to all words: the lack of training data, and the domain dependency of the systems (methods trained on one corpus usually perform worse when tested on a different one).

In this talk I will describe an approach to obtaining training examples automatically, based on (Leacock, 1998), and its application to the all-words disambiguation task. I will also discuss the importance of the domain of the target corpus, and briefly introduce several lines of work that are being explored to address this problem by relying on unlabeled data.

