Paper: Exploring variation across biomedical subdomains

ACL ID C10-1078
Title Exploring variation across biomedical subdomains
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2010

Previous research has demonstrated the importance of handling differences between domains such as “newswire” and “biomedicine” when porting NLP systems from one domain to another. In this paper we identify the related issue of subdomain variation, i.e., differences between subsets of a domain that might be expected to behave homogeneously. Using a large corpus of research articles, we explore how subdomains of biomedicine vary across a variety of linguistic dimensions and discover that there is rich variation. We conclude that an awareness of such variation is necessary when deploying NLP systems for use in single or multiple subdomains.