Paper: Extracting Data Records from Unstructured Biomedical Full Text

ACL ID D07-1088
Title Extracting Data Records from Unstructured Biomedical Full Text
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2007
Authors

In this paper, we address the problem of extracting data records and their attributes from unstructured biomedical full text. Little work on this problem has been reported in the research community. We argue that semantics is important for record extraction or finer-grained language processing tasks. We derive a data record template, including semantic language models, from unstructured text and represent them with a discourse-level Conditional Random Fields (CRF) model. We evaluate the approach from the perspective of Information Extraction and achieve significant improvements in system performance over baseline systems.
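
For orientation, a standard linear-chain CRF defines the conditional probability of a label sequence y given an observation sequence x as sketched below. This is the textbook form and not necessarily the discourse-level model used in the paper; the symbols (feature functions f_k, weights \lambda_k, sequence length T, normalizer Z) are assumptions introduced here for illustration.

```latex
p(\mathbf{y} \mid \mathbf{x})
  = \frac{1}{Z(\mathbf{x})}
    \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, \mathbf{x}, t) \right),
\qquad
Z(\mathbf{x})
  = \sum_{\mathbf{y}'}
    \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, \mathbf{x}, t) \right)
```

In a record-extraction setting, each position t could plausibly correspond to a sentence or text segment and each label to a record attribute; that mapping is an illustrative assumption, not a detail stated in the abstract.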