Paper: Weighted Krippendorff's alpha is a more reliable metrics for multi-coders ordinal annotations: experimental studies on emotion, opinion and coreference annotation

ACL ID E14-1058
Title Weighted Krippendorff's alpha is a more reliable metrics for multi-coders ordinal annotations: experimental studies on emotion, opinion and coreference annotation
Venue Annual Meeting of The European Chapter of The Association of Computational Linguistics
Session Main Conference
Year 2014
Authors

The question of data reliability is of first importance to assess the quality of manually annotated corpora. Although Cohen's κ is the prevailing reliability measure used in NLP, alternative statistics have been proposed. This paper presents an experimental study with four measures (Cohen's κ, Scott's π, binary and weighted Krippendorff's α) on three tasks: emotion, opinion and coreference annotation. The reported studies investigate the factors of influence (annotator bias, category prevalence, number of coders, number of categories) that should affect reliability estimation. Results show that the use of a weighted measure restricts this influence on ordinal annotations. They suggest that weighted α is the most reliable metric for such an annotation scheme.
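To make the weighted measure concrete, the following is a minimal sketch of Krippendorff's α computed from a coincidence matrix, with a pluggable distance function. It is not the paper's implementation: the function name `krippendorff_alpha` and the default squared-difference (interval) distance are assumptions chosen to illustrate how a weighted distance penalizes near-miss disagreements on an ordinal scale less than distant ones (Krippendorff's strictly ordinal metric, based on cumulative category frequencies, is more involved).

```python
# Minimal illustrative sketch of weighted Krippendorff's alpha.
# Not the paper's code: names and the default interval (squared-
# difference) distance are assumptions for illustration only.
from collections import Counter
from itertools import permutations

def krippendorff_alpha(units, delta=lambda c, k: (c - k) ** 2):
    """units: list of annotation units; each unit is the list of values
    assigned by the coders who labeled it (missing values omitted).
    delta: distance between categories; squared difference gives the
    interval-weighted variant, delta = (c != k) the binary one."""
    o = Counter()  # coincidence matrix over ordered value pairs
    for unit in units:
        m = len(unit)
        if m < 2:
            continue  # a unit seen by one coder carries no agreement info
        for c, k in permutations(unit, 2):
            o[(c, k)] += 1.0 / (m - 1)
    n_c = Counter()  # marginal totals per category
    for (c, _k), w in o.items():
        n_c[c] += w
    n = sum(n_c.values())
    # observed vs. expected (chance) disagreement
    d_obs = sum(w * delta(c, k) for (c, k), w in o.items()) / n
    d_exp = sum(n_c[c] * n_c[k] * delta(c, k)
                for c in n_c for k in n_c) / (n * (n - 1))
    return 1.0 - d_obs / d_exp
```

For example, two coders agreeing perfectly on three units yields α = 1.0, while adding a unit labeled 1 by one coder and 3 by the other lowers α; swapping in `delta=lambda c, k: c != k` reproduces the unweighted (binary) variant, which penalizes that near-miss as heavily as any other disagreement.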