Paper: Cohesive Phrase-Based Decoding for Statistical Machine Translation

ACL ID P08-1009
Title Cohesive Phrase-Based Decoding for Statistical Machine Translation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2008
Authors

Phrase-based decoding produces state-of-the-art translations with no regard for syntax. We add syntax to this process with a cohesion constraint based on a dependency tree for the source sentence. The constraint allows the decoder to employ arbitrary, non-syntactic phrases, but ensures that those phrases are translated in an order that respects the source tree’s structure. In this way, we target the phrasal decoder’s weakness in order modeling, without affecting its strengths. To further increase flexibility, we incorporate cohesion as a decoder feature, creating a soft constraint. The resulting cohesive, phrase-based decoder is shown to produce translations that are preferred over non-cohesive output in both automatic and human evaluations.
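
The idea of a cohesion check can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes a simplified notion of cohesion in which the phrases touching any source subtree must be translated as one uninterrupted run. The names `heads`, `phrase_order`, `cohesion_violations`, and `is_cohesive` are hypothetical and chosen for illustration only. The violation count stands in for the kind of quantity a soft-constraint feature could use, while the boolean check corresponds to a hard constraint.

```python
from typing import List, Set, Tuple


def subtree_members(heads: List[int]) -> List[Set[int]]:
    """For each source word, collect the word indices in its dependency subtree.

    heads[i] is the index of word i's head, or -1 for the root (assumed encoding).
    """
    members = [{i} for i in range(len(heads))]
    for i in range(len(heads)):
        # Walk from word i up to the root, adding i to every ancestor's subtree.
        h = heads[i]
        while h != -1:
            members[h].add(i)
            h = heads[h]
    return members


def cohesion_violations(heads: List[int],
                        phrase_order: List[Tuple[int, int]]) -> int:
    """Count subtrees whose translation is interrupted (simplified assumption).

    phrase_order lists source spans [start, end) in the order they are translated.
    A subtree is interrupted if the phrases touching it do not form one
    contiguous run in that order.
    """
    violations = 0
    for subtree in subtree_members(heads):
        touching = [k for k, (start, end) in enumerate(phrase_order)
                    if any(start <= w < end for w in subtree)]
        if touching and touching[-1] - touching[0] + 1 != len(touching):
            violations += 1
    return violations


def is_cohesive(heads: List[int], phrase_order: List[Tuple[int, int]]) -> bool:
    """Hard-constraint view: no subtree may be interrupted."""
    return cohesion_violations(heads, phrase_order) == 0


# Toy source sentence "the tall man left": the -> man, tall -> man, man -> left.
heads = [2, 2, 3, -1]

monotone = [(0, 2), (2, 3), (3, 4)]      # "the tall" | "man" | "left"
verb_first = [(3, 4), (0, 2), (2, 3)]    # "left" | "the tall" | "man"
interrupted = [(0, 2), (3, 4), (2, 3)]   # "the tall" | "left" | "man"
```

In this toy case, `monotone` and `verb_first` keep the subtree rooted at "man" together, so both pass the check even though "the tall" is not a syntactic constituent. The `interrupted` order translates "left" between "the tall" and "man", which splits that subtree and counts as one violation.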