Paper: Training Phrase Translation Models with Leaving-One-Out

ACL ID P10-1049
Title Training Phrase Translation Models with Leaving-One-Out
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2010
Authors

Several attempts have been made to learn phrase translation probabilities for phrase-based statistical machine translation that go beyond pure counting of phrases in word-aligned training data. Most approaches report problems with over-fitting. We describe a novel leaving-one-out approach to prevent over-fitting that allows us to train phrase models that show improved translation performance on the WMT08 Europarl German-English task. In contrast to most previous work where phrase models were trained separately from other models used in translation, we include all components such as single word lexica and reordering models in training. Using this consistent training of phrase models we are able to achieve improvements of up to 1.4 points in BLEU. As a side effect, the phrase table ...
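
To illustrate the leaving-one-out idea named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that phrase probabilities are plain relative frequencies and that, when a given sentence pair is scored, the counts contributed by that same pair are subtracted first. The function name `loo_phrase_prob`, the `floor` value assigned to phrase pairs seen only in the held-out sentence, and the toy counts are all hypothetical.

```python
from collections import Counter


def loo_phrase_prob(src, tgt, joint_counts, src_counts,
                    local_joint, local_src, floor=1e-6):
    """Estimate p(tgt | src) with the current sentence's counts removed.

    joint_counts / src_counts: counts over the whole training corpus.
    local_joint / local_src:   counts contributed by the sentence pair
                               currently being scored (the one left out).
    floor: assumed small probability for pairs that occur only in the
           held-out sentence (a hypothetical choice for this sketch).
    """
    c_joint = joint_counts[(src, tgt)] - local_joint[(src, tgt)]
    c_src = src_counts[src] - local_src[src]
    if c_joint <= 0 or c_src <= 0:
        return floor
    return c_joint / c_src


# Toy corpus-level counts (illustrative only, not from the paper).
joint_counts = Counter({("haus", "house"): 3, ("haus", "home"): 1})
src_counts = Counter({"haus": 4})

# The sentence pair being scored contributed the only ("haus", "home") count.
local_joint = Counter({("haus", "home"): 1})
local_src = Counter({"haus": 1})

no_counts = Counter()  # empty counter: nothing is left out

plain_home = loo_phrase_prob("haus", "home", joint_counts, src_counts,
                             no_counts, no_counts)
loo_home = loo_phrase_prob("haus", "home", joint_counts, src_counts,
                           local_joint, local_src)
loo_house = loo_phrase_prob("haus", "house", joint_counts, src_counts,
                            local_joint, local_src)
```

In this toy case the pair ("haus", "home") is supported only by the sentence being scored, so its estimate falls from the plain relative frequency to the floor once that sentence is left out. This is the sense in which leaving-one-out discourages a model from memorizing phrase pairs that a single sentence alone justifies.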