Speaker
Affiliation
Nara Institute of Science and Technology, visiting Sussex
Abstract
Anaphora resolution, the process of identifying whether or not an expression refers to another expression, is an important component of various NLP applications. Empirical, corpus-based approaches to this problem have been shown to be a cost-efficient alternative to rule-based approaches, achieving performance comparable to that of the best-performing rule-based systems. Anaphora resolution can be decomposed into two subtasks: antecedent identification, which is the process of identifying an antecedent for a given anaphor, and anaphoricity determination, which is the process of judging whether or not a candidate anaphor is anaphoric.
In the first half of the talk, I will present an antecedent identification model, named the "tournament model", which captures contextual information that is more sophisticated than what is offered in Centering Theory (Grosz et al., 1995). Our experiments show that this model significantly outperforms earlier machine learning-based approaches, such as Soon et al. (2001).
In the second half of the talk, I will present an anaphoricity determination model, named the "selection-then-classification model", which reverses the order of the steps in the classification-then-search model proposed by Ng and Cardie (2002) while inheriting all the advantages of that model. I conducted experiments on resolving noun phrase anaphora in Japanese. The results show that, with the selection-then-classification modifications, the proposed model outperforms earlier learning-based approaches.
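As a rough illustration of the ordering implied by the name "selection-then-classification", the sketch below first selects the most likely candidate antecedent and only then classifies whether the anaphor is anaphoric with respect to that candidate. It is a minimal sketch under stated assumptions, not the model presented in the talk: the function names, the word-overlap scorer, and the threshold classifier are all hypothetical stand-ins for learned components.

```python
from typing import Callable, Optional, Sequence


def select_then_classify(
    anaphor: str,
    candidates: Sequence[str],
    score_antecedent: Callable[[str, str], float],
    is_anaphoric: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the chosen antecedent, or None if judged non-anaphoric.

    Hypothetical interface: both callables stand in for trained models.
    """
    if not candidates:
        return None
    # Step 1 (selection): pick the best-scoring candidate antecedent.
    best = max(candidates, key=lambda c: score_antecedent(anaphor, c))
    # Step 2 (classification): decide anaphoricity using that candidate.
    return best if is_anaphoric(anaphor, best) else None


def toy_score(anaphor: str, candidate: str) -> float:
    """Toy scorer (assumption): number of shared lowercase words."""
    shared = set(anaphor.lower().split()) & set(candidate.lower().split())
    return float(len(shared))


def toy_is_anaphoric(anaphor: str, candidate: str) -> bool:
    """Toy classifier (assumption): anaphoric if any word is shared."""
    return toy_score(anaphor, candidate) > 0


candidates = ["a president", "the new company"]
resolved = select_then_classify("the company", candidates, toy_score, toy_is_anaphoric)
unresolved = select_then_classify("rain", candidates, toy_score, toy_is_anaphoric)
```

With these toy components, "the company" is linked to "the new company", while "rain" shares no words with any candidate and is judged non-anaphoric. The point of the ordering is that the anaphoricity decision can draw on the selected candidate, rather than being made before any antecedent is considered.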