Paper: Syntax Based Reordering with Automatically Derived Rules for Improved Statistical Machine Translation

ACL ID C10-1126
Title Syntax Based Reordering with Automatically Derived Rules for Improved Statistical Machine Translation
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2010
Authors

Syntax based reordering has been shown to be an effective way of handling word order differences between source and target languages in Statistical Machine Translation (SMT) systems. We present a simple, automatic method to learn rules that reorder source sentences to more closely match the target language word order using only a source side parse tree and automatically generated alignments. The resulting rules are applied to source language inputs as a pre-processing step and demonstrate significant improvements in SMT systems across a variety of language pairs including English to Hindi, English to Spanish and English to French as measured on a variety of internal test sets as well as a public test set.
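
To make the pre-processing step concrete, the following is a minimal sketch of applying one reordering rule to a source side parse tree before translation. It is an illustration only: the abstract does not specify the rule format, so the `Node` class, the `RULES` table, and the `reorder` and `leaves` helpers are all hypothetical names, and the single rule is written by hand, whereas the paper learns its rules automatically from parse trees and word alignments.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A parse tree node; `word` is set only on leaves (hypothetical structure)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None


# Hypothetical rule table, written by hand for illustration:
# (parent label, sequence of child labels) -> new order of the children.
RULES = {
    ("VP", ("VB", "NP")): (1, 0),  # verb-object becomes object-verb
}


def reorder(node, rules):
    """Return a reordered copy of the tree, applying matching rules bottom-up."""
    if node.word is not None:
        return node
    kids = [reorder(child, rules) for child in node.children]
    key = (node.label, tuple(kid.label for kid in kids))
    perm = rules.get(key)
    if perm is not None:
        kids = [kids[i] for i in perm]
    return Node(node.label, kids)


def leaves(node):
    """Read the words off the tree from left to right."""
    if node.word is not None:
        return [node.word]
    return [w for child in node.children for w in leaves(child)]


tree = Node("S", [
    Node("NP", [Node("PRP", word="he")]),
    Node("VP", [
        Node("VB", word="ate"),
        Node("NP", [Node("NN", word="rice")]),
    ]),
])

print(" ".join(leaves(tree)))                  # he ate rice
print(" ".join(leaves(reorder(tree, RULES))))  # he rice ate
```

The reordered word sequence, rather than the original one, would then be passed to the SMT system. The verb-final output mirrors the kind of order change one would expect for a pair such as English to Hindi, but the rule shown here is an assumption and not one reported in the paper.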