Paper: Lessons Learned in Part-of-Speech Tagging of Conversational Speech

ACL ID D10-1080
Title Lessons Learned in Part-of-Speech Tagging of Conversational Speech
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2010
Authors

This paper examines tagging models for spontaneous English speech transcripts. We analyze the performance of state-of-the-art tagging models, either generative or discriminative, left-to-right or bidirectional, with or without latent annotations, together with the use of ToBI break indexes and several methods for segmenting the speech transcripts (i.e., conversation side, speaker turn, or human-annotated sentence). Based on these studies, we observe that: (1) bidirectional models tend to achieve better accuracy levels than left-to-right models, (2) generative models seem to perform somewhat better than discriminative models on this task, and (3) prosody improves tagging performance of models on conversation sides, but has much less impact on smaller segments. We conclude that...