Paper: Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks

ACL ID D08-1027
Title Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2008
Authors

Human linguistic annotation is crucial for many natural language processing tasks but can be expensive and time-consuming. We explore the use of Amazon’s Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web. We investigate five tasks: affect recognition, word similarity, recognizing textual entailment, event temporal ordering, and word sense disambiguation. For all five, we show high agreement between Mechanical Turk non-expert annotations and existing gold standard labels provided by expert labelers. For the task of affect recognition, we also show that using non-expert labels for training machine learning algorithms can be as effective as using gold standard annotations from...
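
As a rough illustration of what measuring agreement between non-expert annotations and gold standard labels can involve, the sketch below aggregates several non-expert labels per item by majority vote and reports the fraction of items where the aggregate matches the gold label. This is an assumption made for illustration only: the abstract does not state how labels are aggregated or which agreement measure is used, and the function names, item identifiers, and toy data are all hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label in a list of non-expert labels.

    Ties go to the label seen first; the toy data below uses an odd
    number of annotators per item so no ties occur.
    """
    return Counter(labels).most_common(1)[0][0]


def agreement_with_gold(annotations, gold):
    """Fraction of items whose majority-vote label equals the gold label."""
    matches = sum(
        1
        for item, labels in annotations.items()
        if majority_vote(labels) == gold[item]
    )
    return matches / len(annotations)


# Hypothetical toy data: three non-expert labels per textual entailment pair.
annotations = {
    "pair-1": ["entails", "entails", "not_entails"],
    "pair-2": ["not_entails", "not_entails", "not_entails"],
    "pair-3": ["entails", "not_entails", "not_entails"],
    "pair-4": ["entails", "entails", "entails"],
}

# Hypothetical expert (gold standard) labels for the same items.
gold = {
    "pair-1": "entails",
    "pair-2": "not_entails",
    "pair-3": "entails",
    "pair-4": "entails",
}

score = agreement_with_gold(annotations, gold)
print(f"Agreement with gold: {score:.2f}")
```

Simple accuracy against gold suits categorical tasks such as textual entailment; a graded task such as word similarity would more naturally use a correlation measure instead.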