Speaker
Affiliation
Cambridge
Abstract
We investigate the automatic classification of speculative language, or "hedging", in biomedical scientific literature using weakly supervised learning. We discuss the task from both a human-annotation and a machine-learning perspective, focusing on the aspects of the problem that set it apart from previous weakly supervised machine learning research. We show how the problem can be tackled with a probabilistic formulation of the self-training paradigm, and we present a theoretical and practical evaluation of our learning and classification models.