Paper: Monolingual Marginal Matching for Translation Model Adaptation

ACL ID D13-1109
Title Monolingual Marginal Matching for Translation Model Adaptation
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2013

When using a machine translation (MT) model trained on OLD-domain parallel data to translate NEW-domain text, one major chal- lenge is the large number of out-of-vocabulary (OOV) and new-translation-sense words. We present a method to identify new translations of both known and unknown source language words that uses NEW-domain comparable doc- ument pairs. Starting with a joint distribution of source-target word pairs derived from the OLD-domain parallel corpus, our method re- covers a new joint distribution that matches the marginal distributions of the NEW-domain comparable document pairs, while minimiz- ing the divergence from the OLD-domain dis- tribution. Adding learned translations to our French-English MT model results in gains of about 2 BLEU points over strong baselines.