Paper: Bayesian Word Alignment for Statistical Machine Translation

ACL ID P11-2032
Title Bayesian Word Alignment for Statistical Machine Translation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2011
Authors

In this work, we compare the translation performance of word alignments obtained via Bayesian inference to those obtained via expectation-maximization (EM). We propose a Gibbs sampler for fully Bayesian inference in IBM Model 1, integrating over all possible parameter values in finding the alignment distribution. We show that Bayesian inference outperforms EM in all of the tested language pairs, domains and data set sizes, by up to 2.99 BLEU points. We also show that the proposed method effectively addresses the well-known rare word problem in EM-estimated models, and at the same time induces a much smaller dictionary of bilingual word-pairs.
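
To make the idea of sampling alignments while integrating out the translation parameters more concrete, here is a minimal sketch of a collapsed Gibbs sampler for IBM Model 1. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the prior, so a symmetric Dirichlet prior with hyperparameter `theta` is assumed, and the function name `gibbs_model1`, the `NULL` token, and the toy bitext are all hypothetical. Each target word is assumed to align to exactly one source position or to `NULL`.

```python
import random
from collections import defaultdict

NULL = "<NULL>"  # hypothetical empty-word token, not taken from the paper


def gibbs_model1(bitext, iterations=50, theta=0.01, seed=0):
    """Collapsed Gibbs sampling of IBM Model 1 alignments (illustrative sketch).

    bitext: list of (source_words, target_words) pairs.
    theta:  assumed symmetric Dirichlet hyperparameter.
    Returns the last sampled alignments and the (source, target) link counts.
    """
    rng = random.Random(seed)
    target_vocab_size = len({f for _, fs in bitext for f in fs})
    pair_count = defaultdict(int)  # links between source word e and target word f
    src_count = defaultdict(int)   # total links leaving source word e

    # Prepend NULL to every source sentence and start from random alignments.
    sources = [[NULL] + list(es) for es, _ in bitext]
    alignments = []
    for src, (_, fs) in zip(sources, bitext):
        a = [rng.randrange(len(src)) for _ in fs]
        for j, i in enumerate(a):
            pair_count[(src[i], fs[j])] += 1
            src_count[src[i]] += 1
        alignments.append(a)

    for _ in range(iterations):
        for src, (_, fs), a in zip(sources, bitext, alignments):
            for j, f in enumerate(fs):
                # Remove the current link from the counts.
                e_old = src[a[j]]
                pair_count[(e_old, f)] -= 1
                src_count[e_old] -= 1
                # Predictive probability of each candidate link with the
                # translation parameters integrated out.
                weights = [
                    (pair_count.get((e, f), 0) + theta)
                    / (src_count.get(e, 0) + target_vocab_size * theta)
                    for e in src
                ]
                a[j] = rng.choices(range(len(src)), weights=weights)[0]
                # Add the newly sampled link back.
                e_new = src[a[j]]
                pair_count[(e_new, f)] += 1
                src_count[e_new] += 1
    return alignments, pair_count


toy_bitext = [
    (["the", "house"], ["das", "haus"]),
    (["the", "book"], ["das", "buch"]),
    (["a", "book"], ["ein", "buch"]),
]
toy_alignments, toy_counts = gibbs_model1(toy_bitext, iterations=20, seed=1)
```

In this sketch, index 0 in an alignment means the target word is linked to `NULL`, and index i > 0 points to the i-th source word. The function returns only the final sample; how samples are combined into a single alignment for translation is not described in the abstract and is left out here.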