Speaker
Affiliation
Toshiba Research Europe, Cambridge
Abstract
Each year the Conference on Computational Natural Language Learning (CoNLL) features a 'shared task', in which participants train and test machine learning systems on exactly the same data sets, in order to better compare the systems. The topic for the shared task in this year's CoNLL (the tenth such conference, hence CoNLL-X) is multi-lingual dependency parsing.
Dependency structure is an alternative to constituent structure for representing syntactic analyses of sentences, and is said to be particularly well suited to languages with freer word order. During the last decade, much progress has been made in dependency parsing as well as in constituent parsing, and with the emergence of treebanks for various languages, both types of parser have increasingly been applied to languages other than English. The shared task continues this line of research.
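As a rough illustration of what a dependency structure looks like as data, the sketch below stores one head position per word instead of nested phrases. The sentence, the label names and the helper functions are hypothetical examples chosen for this illustration; they are not taken from any of the shared task treebanks.

```python
# Hypothetical example sentence; the analysis is illustrative only.
words = ["She", "reads", "books"]
# heads[i] is the 1-based position of the head of word i+1; 0 marks the root.
heads = [2, 0, 2]
# Assumed label names, one per word, naming the relation to its head.
labels = ["subj", "root", "obj"]


def dependents(head_position, heads):
    """Return the 1-based positions of all words attached to head_position."""
    return [i + 1 for i, h in enumerate(heads) if h == head_position]


def is_single_rooted_tree(heads):
    """Check that exactly one word is the root and that following head
    links from any word reaches the root without running into a cycle."""
    if heads.count(0) != 1:
        return False
    for start in range(1, len(heads) + 1):
        seen = set()
        node = start
        while node != 0:
            if node in seen:
                return False  # cycle: this word never reaches the root
            seen.add(node)
            node = heads[node - 1]
    return True
```

Because every word simply points at its head, the representation does not depend on the words of a phrase being adjacent, which is one way to see why it is considered convenient for freer word order.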
I am one of the organizers of the shared task. I will describe how we converted treebanks for 13 languages (Arabic, Bulgarian, Chinese, Czech, Danish, Dutch, German, Japanese, Portuguese, Slovene, Spanish, Swedish and Turkish) into a common data format, which parsing approaches the participants took, how parser performance is evaluated, what results were achieved, and what those results tell us about the approaches and about the problem of multi-lingual dependency parsing itself.
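The abstract does not spell out the evaluation measure. A measure commonly used for dependency parsers is the attachment score: the fraction of words whose predicted head (unlabeled) or predicted head and relation label (labeled) match the gold standard. The sketch below assumes that measure; the function name and the toy data are hypothetical.

```python
def attachment_scores(gold, predicted):
    """Compute (unlabeled, labeled) attachment scores.

    gold and predicted are equal-length lists of (head, label) pairs,
    one pair per word, in sentence order.
    """
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equal in length")
    # A word counts as unlabeled-correct if its head position matches.
    unlabeled = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # A word counts as labeled-correct if both head and label match.
    labeled = sum(g == p for g, p in zip(gold, predicted))
    total = len(gold)
    return unlabeled / total, labeled / total


# Toy data (hypothetical): the parser finds every head but mislabels one word.
gold = [(2, "subj"), (0, "root"), (2, "obj")]
predicted = [(2, "subj"), (0, "root"), (2, "subj")]
uas, las = attachment_scores(gold, predicted)
```

Scoring per word rather than per sentence lets treebanks with very different sentence lengths be compared on the same scale.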
More information about the shared task is available at the CoNLL-X Shared Task website.