Paper: Large Scale Decipherment for Out-of-Domain Machine Translation

ACL ID D12-1025
Title Large Scale Decipherment for Out-of-Domain Machine Translation
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2012
Authors

We apply slice sampling to Bayesian decipherment and use our new decipherment framework to improve out-of-domain machine translation. Compared with the state-of-the-art algorithm, our approach is highly scalable and produces better results, which allows us to decipher ciphertext with billions of tokens and hundreds of thousands of word types with high accuracy. We decipher a large amount of monolingual data to improve out-of-domain translation and achieve significant gains of up to 3.8 BLEU points.
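
As a rough illustration of the slice sampling idea named in the abstract, the following is a minimal sketch of a generic slice-sampling move over a discrete set of candidates. It is not the paper's decipherment sampler: the function names (`slice_sample_step`, `run_chain`), the toy scores, and the two-word vocabulary are all hypothetical, and the sketch scans every candidate, so it does not show how a large-scale system would avoid that scan.

```python
import random
from collections import Counter


def slice_sample_step(current, scores, rng):
    """One slice-sampling move; `scores` maps candidates to positive unnormalized weights."""
    # Auxiliary threshold drawn uniformly between 0 and the current candidate's score.
    u = rng.uniform(0.0, scores[current])
    # The slice keeps every candidate scoring at least u, so `current` is always in it.
    in_slice = [c for c, s in scores.items() if s >= u]
    # A uniform draw from the slice leaves the normalized scores invariant.
    return rng.choice(in_slice)


def run_chain(scores, start, steps, seed=0):
    """Run the chain for `steps` moves and count how often each candidate is visited."""
    rng = random.Random(seed)
    state, counts = start, Counter()
    for _ in range(steps):
        state = slice_sample_step(state, scores, rng)
        counts[state] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical scores for two candidate plaintext words (illustration only).
    toy_scores = {"the": 3.0, "cat": 1.0}
    counts = run_chain(toy_scores, "cat", 20000)
    print({word: n / 20000 for word, n in counts.items()})
```

In this toy setup the visit frequencies should approach the normalized scores (about 0.75 and 0.25). The appeal of the technique is that each move only needs the candidates above the threshold rather than a full normalized distribution.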