Speaker
Affiliation
Sheffield
Abstract
The Clinical E-Science Framework (CLEF) project was a five-year MRC-funded e-science project whose technical objective was to explore how advanced information technologies could be used to capture, integrate and present electronic patient information in the domain of cancer treatment, within a secure and ethical framework, so as to support clinical research and improve patient care. One important strand of work within CLEF was the automatic extraction of the rich and relatively untapped clinical information in the textual component of the patient record: radiology reports, histopathology reports and the clinical narratives recorded following every patient-doctor consultation. To address this subtask we applied information extraction (IE) technologies. In this talk I give a general overview of the approach to IE taken in the project, addressing: the creation of a rich, semantically annotated corpus of clinical documents; the implementation and evaluation of supervised learning techniques for extracting a range of clinical entity and relation types; and initial steps towards using the temporal information in clinical texts to help align the clinical events mentioned in those texts with mentions of the same events in the structured data component of the patient record. Taken together, the CLEF IE activities represent perhaps the most ambitious clinical text mining effort to date.