Paper: Creating A Test Corpus Of Clinical Notes Manually Tagged For Part-Of-Speech Information

ACL ID W04-1211
Title Creating A Test Corpus Of Clinical Notes Manually Tagged For Part-Of-Speech Information
Venue International Joint Workshop On Natural Language Processing In Biomedicine And Its Applications (NLPBA/BioNLP)
Session
Year 2004
Authors

This paper presents a project whose main goal is to construct a corpus of clinical text manually annotated for part-of-speech (POS) information. We describe and discuss the process of training three domain experts to perform linguistic annotation. We report some of the challenges encountered, along with encouraging results on inter-rater agreement and annotation consistency. We also present preliminary experimental results indicating that state-of-the-art POS taggers need to be adapted to the sublanguage of medical text.
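To make the notion of inter-rater agreement concrete, the sketch below computes pairwise Cohen's kappa over the POS tags that several annotators assign to the same token sequence. It is an illustration only: the abstract does not state which agreement statistic the authors used, so Cohen's kappa is an assumption, and the function names (`cohen_kappa`, `pairwise_kappa`), the annotator labels, and the example tags are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(tags_a, tags_b):
    """Cohen's kappa for two annotators' tags over the same tokens."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("tag sequences must be non-empty and equal in length")
    n = len(tags_a)
    # Observed agreement: fraction of tokens given the same tag by both.
    observed = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    # Chance agreement: derived from each annotator's own tag distribution.
    freq_a = Counter(tags_a)
    freq_b = Counter(tags_b)
    expected = sum(freq_a[t] * freq_b[t] for t in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical tag everywhere; kappa is
        # undefined here, so perfect agreement is reported by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def pairwise_kappa(annotations):
    """Kappa for every pair of annotators in a {name: tags} mapping."""
    return {
        (x, y): cohen_kappa(annotations[x], annotations[y])
        for x, y in combinations(sorted(annotations), 2)
    }


if __name__ == "__main__":
    # Hypothetical tags from three annotators for the same five tokens.
    annotations = {
        "A": ["NN", "VBD", "DT", "NN", "IN"],
        "B": ["NN", "VBD", "DT", "JJ", "IN"],
        "C": ["NN", "VBN", "DT", "NN", "IN"],
    }
    for pair, kappa in pairwise_kappa(annotations).items():
        print(pair, round(kappa, 3))
```

With three annotators, as in the project described here, pairwise values like these could be reported individually or averaged. A multi-rater statistic such as Fleiss' kappa would be an alternative design choice.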