Paper: Domain Adaptation for Statistical Machine Translation with Domain Dictionary and Monolingual Corpora

ACL ID C08-1125
Title Domain Adaptation for Statistical Machine Translation with Domain Dictionary and Monolingual Corpora
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2008
Authors

Abstract Statistical machine translation systems are usually trained on large amounts of bilingual text and monolingual text. In this paper, we propose a method to perform domain adaptation for statistical machine translation, where in-domain bilingual corpora do not exist. This method first uses out-of-domain corpora to train a baseline system and then uses in-domain translation dictionaries and in-domain monolingual corpora to improve the in-domain performance. We propose an algorithm to combine these different resources in a unified framework. Experimental results indicate that our method achieves absolute improvements of 8.16 and 3.36 BLEU scores on Chinese to English translation and English to French translation respectively, as compared with the baselines using onl...