Speaker
Affiliation
Cambridge
Abstract
Accuracy scores of over 92% are now being reported for statistical parsers trained and tested on the WSJ sections of the Penn Treebank (PTB). However, it is well known that the performance of PTB parsers degrades significantly when they are applied to other domains, e.g. biomedical research papers. In addition, the use of the Parseval metrics as a general measure of parser accuracy has been called into question.
In this talk I will investigate both questions, parser adaptation and parser evaluation, in the context of a statistical parser based on Combinatory Categorial Grammar (CCG). For the adaptation case, I will show that a simple technique of retraining parser components at lower levels of representation (in this case POS tags and CCG supertags) leads to a surprisingly accurate parser for biomedical text. For the evaluation case, I will describe a new test suite consisting of manually annotated unbounded dependencies for a variety of grammatical constructions. The CCG parser's performance on recovering such dependencies compares well with that of other off-the-shelf parsers, but still leaves much room for improvement. I will motivate the need for such an evaluation despite the relatively low frequency of unbounded dependencies in naturally occurring text.