Paper: Hallucinating Phrase Translations for Low Resource MT

ACL ID W14-1617
Title Hallucinating Phrase Translations for Low Resource MT
Venue International Conference on Computational Natural Language Learning
Year 2014

We demonstrate that "hallucinating" phrasal translations can significantly improve the quality of machine translation in low resource conditions. Our hallucinated phrase tables consist of entries composed from multiple unigram translations drawn from the baseline phrase table and from translations that are induced from monolingual corpora. The hallucinated phrase table is very noisy. Its translations are low precision but high recall. We counter this by introducing 30 new feature functions (including a variety of monolingually-estimated features) and by aggressively pruning the phrase table. Our analysis evaluates the intrinsic quality of our hallucinated phrase pairs as well as their impact in end-to-end Spanish-English and Hindi-English MT.
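
The composition and pruning steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `hallucinate`, the toy `unigram_table` and its probabilities, the product-of-probabilities score, the monotone word order, and the `top_k` cutoff are all assumptions made for illustration. The paper scores candidates with 30 feature functions, which this sketch does not attempt to reproduce.

```python
from itertools import product
from math import prod


def hallucinate(source_phrase, unigram_table, top_k=2):
    """Compose candidate phrase translations from unigram translations.

    Hypothetical helper: scores each candidate by the product of its
    unigram probabilities and keeps only the top_k candidates.
    """
    words = source_phrase.split()
    # Skip phrases containing a word with no known unigram translation.
    if any(w not in unigram_table for w in words):
        return []
    # One list of (target_word, probability) options per source word.
    options = [unigram_table[w].items() for w in words]
    candidates = []
    for combo in product(*options):
        # Assumption: target words keep the source word order.
        target = " ".join(t for t, _ in combo)
        score = prod(p for _, p in combo)
        candidates.append((target, score))
    # Prune: sort by descending score (ties broken alphabetically).
    candidates.sort(key=lambda c: (-c[1], c[0]))
    return candidates[:top_k]


# Toy unigram translations with made-up probabilities.
unigram_table = {
    "casa": {"house": 0.7, "home": 0.3},
    "blanca": {"white": 0.9, "blank": 0.1},
}
result = hallucinate("casa blanca", unigram_table)
```

With this toy table, four candidates are composed and two survive pruning. Because the sketch keeps source word order, the highest-scoring candidate is "house white", which hints at why composed entries tend to be noisy and why additional scoring and pruning are needed.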