Paper: Training Nondeficient Variants of IBM-3 and IBM-4 for Word Alignment

ACL ID P13-1003
Title Training Nondeficient Variants of IBM-3 and IBM-4 for Word Alignment
Venue Annual Meeting of the Association of Computational Linguistics
Session Main Conference
Year 2013
Authors

We derive variants of the fertility based models IBM-3 and IBM-4 that, while maintaining their zero and first order pa- rameters, are nondeficient. Subsequently, we proceed to derive a method to com- pute a likely alignment and its neighbors as well as give a solution of EM training. The arising M-step energies are non-trivial and handled via projected gradient ascent. Our evaluation on gold alignments shows substantial improvements (in weighted F- measure) for the IBM-3. For the IBM- 4 there are no consistent improvements. Training the nondeficient IBM-5 in the regular way gives surprisingly good re- sults. Using the resulting alignments for phrase- based translation systems offers no clear insights w.r.t. BLEU scores.