Paper: Linguistically debatable or just plain wrong?

ACL ID P14-2083
Title Linguistically debatable or just plain wrong?
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2014

In linguistic annotation projects, we typically develop annotation guidelines to minimize disagreement. However, in this position paper we question whether we should actually limit the disagreements between annotators, rather than embracing them. We present an empirical analysis of part-of-speech annotated data sets that suggests that disagreements are systematic across domains and to a certain extent also across languages. This points to an underlying ambiguity rather than random errors. Moreover, a quantitative analysis of tag confusions reveals that the majority of disagreements are due to linguistically debatable cases rather than annotation errors. Specifically, we show that even in the absence of annotation guidelines only 2% of annotator choices are linguistically unmoti...
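To make the notion of a "tag confusion" concrete, the following is a minimal sketch of how disagreements between two annotators could be tallied. It is not the authors' procedure: the function names `tag_confusions` and `agreement_rate`, the tag labels, and the toy annotations are all assumptions introduced for illustration.

```python
from collections import Counter


def tag_confusions(tags_a, tags_b):
    """Count unordered tag pairs on which two annotators disagree.

    Hypothetical helper: both inputs are tag sequences over the same tokens.
    """
    if len(tags_a) != len(tags_b):
        raise ValueError("both annotations must cover the same tokens")
    confusions = Counter()
    for a, b in zip(tags_a, tags_b):
        if a != b:
            # Sort the pair so that ADP/PRT and PRT/ADP count as one confusion.
            confusions[tuple(sorted((a, b)))] += 1
    return confusions


def agreement_rate(tags_a, tags_b):
    """Fraction of tokens on which the two annotators chose the same tag."""
    matches = sum(a == b for a, b in zip(tags_a, tags_b))
    return matches / len(tags_a)


# Toy, made-up annotations of six tokens (not data from the paper).
annotator_1 = ["NOUN", "VERB", "ADP", "ADJ", "NOUN", "PRT"]
annotator_2 = ["NOUN", "VERB", "PRT", "NOUN", "NOUN", "ADP"]

confusions = tag_confusions(annotator_1, annotator_2)
for pair, count in confusions.most_common():
    print(pair, count)
print("agreement:", agreement_rate(annotator_1, annotator_2))
```

If the same few tag pairs dominate such a tally across data sets, that would be consistent with systematic ambiguity rather than random error. Judging whether a given confusion is linguistically debatable still requires manual inspection.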