Paper: Augmenting Translation Models with Simulated Acoustic Confusions for Improved Spoken Language Translation

ACL ID E14-1065
Title Augmenting Translation Models with Simulated Acoustic Confusions for Improved Spoken Language Translation
Venue Annual Meeting of The European Chapter of The Association for Computational Linguistics
Session Main Conference
Year 2014
Authors

We propose a novel technique for adapting text-based statistical machine translation to deal with input from automatic speech recognition in spoken language translation tasks. We simulate likely misrecognition errors using only a source language pronunciation dictionary and language model (i.e., without an acoustic model), and use these to augment the phrase table of a standard MT system. The augmented system can thus recover from recognition errors during decoding using synthesized phrases. Using the outputs of five different English ASR systems as input, we find consistent and significant improvements in translation quality. Our proposed technique can also be used in conjunction with lattices as ASR output, leading to further improvements.
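
The general idea of synthesizing confusable source phrases can be illustrated with a toy sketch. This is not the paper's actual procedure: the pronunciation dictionary entries, the phoneme edit-distance criterion, the `penalty` factor, and the function names (`confusable_words`, `augment_phrase_table`) are all assumptions made for illustration, and the language model mentioned in the abstract is left out entirely.

```python
from typing import Dict, List, Tuple

# Toy pronunciation dictionary (hypothetical entries, ARPAbet-like phonemes).
PRON_DICT: Dict[str, Tuple[str, ...]] = {
    "two": ("T", "UW"),
    "too": ("T", "UW"),
    "to": ("T", "UW"),
    "do": ("D", "UW"),
    "their": ("DH", "EH", "R"),
    "there": ("DH", "EH", "R"),
    "cat": ("K", "AE", "T"),
}


def phoneme_edit_distance(a: Tuple[str, ...], b: Tuple[str, ...]) -> int:
    """Levenshtein distance computed over phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def confusable_words(word: str, max_dist: int = 1) -> List[str]:
    """Words whose pronunciation is within max_dist phoneme edits of `word`."""
    pron = PRON_DICT[word]
    return sorted(
        w for w, p in PRON_DICT.items()
        if w != word and phoneme_edit_distance(pron, p) <= max_dist
    )


def augment_phrase_table(
    phrase_table: Dict[str, Dict[str, float]],
    max_dist: int = 1,
    penalty: float = 0.5,  # assumed down-weighting for synthesized entries
) -> Dict[str, Dict[str, float]]:
    """Add synthetic source phrases in which one word is swapped for an
    acoustically similar one, keeping the original phrase's translations."""
    augmented = dict(phrase_table)
    for src, translations in phrase_table.items():
        tokens = src.split()
        for i, tok in enumerate(tokens):
            if tok not in PRON_DICT:
                continue
            for alt in confusable_words(tok, max_dist):
                variant = " ".join(tokens[:i] + [alt] + tokens[i + 1:])
                if variant in phrase_table:
                    continue  # never overwrite a genuine phrase pair
                entries = augmented.setdefault(variant, {})
                for tgt, score in translations.items():
                    entries[tgt] = max(entries.get(tgt, 0.0), score * penalty)
    return augmented


if __name__ == "__main__":
    table = {"two cats": {"deux chats": 0.8}}
    for src, tgts in sorted(augment_phrase_table(table).items()):
        print(src, tgts)
```

In this sketch, a misrecognized input such as "too cats" would still match a phrase table entry and map to the translation of "two cats", which is the kind of recovery during decoding that the abstract describes. A realistic system would presumably weight the synthesized entries with scores derived from the confusion model and language model rather than a fixed penalty.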