Paper: Cheap, Fast and Good Enough: Automatic Speech Recognition with Non-Expert Transcription

ACL ID N10-1024
Title Cheap, Fast and Good Enough: Automatic Speech Recognition with Non-Expert Transcription
Venue Human Language Technologies
Session Main Conference
Year 2010
Authors

Deploying an automatic speech recognition system with reasonable performance requires expensive and time-consuming in-domain transcription. Previous work demonstrated that non-professional annotation through Amazon’s Mechanical Turk can match professional quality. We use Mechanical Turk to transcribe conversational speech for as little as one thirtieth the cost of professional transcription. The higher disagreement among non-professional transcribers does not have a significant effect on system performance. While previous work demonstrated that redundant transcription can improve data quality, we found that resources are better spent collecting more data. Finally, we describe a quality control method that does not require professional transcription.