Paper: Simultaneous Word-Morpheme Alignment for Statistical Machine Translation

ACL ID N13-1004
Title Simultaneous Word-Morpheme Alignment for Statistical Machine Translation
Venue Annual Conference of the North American Chapter of the Association for Computational Linguistics
Session Main Conference
Year 2013
Authors

Current word alignment models for statistical machine translation do not address morphology beyond merely splitting words. We present a two-level alignment model that distinguishes between words and morphemes, in which we embed an IBM Model 1 inside an HMM-based word alignment model. The model jointly induces word and morpheme alignments using an EM algorithm. We evaluated our model on Turkish-English parallel data. We obtained significant improvement of BLEU scores over IBM Model 4. Our results indicate that utilizing information from morphology improves the quality of word alignments.
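
As background for the IBM Model 1 component named in the abstract, the following is a minimal sketch of standard IBM Model 1 EM training run over morpheme sequences. It is not the paper's two-level model: the HMM word-level component and the joint word-morpheme induction are omitted. The function name `train_ibm_model1`, the `<NULL>` token, the toy segmented Turkish-English pairs, and the choice to estimate P(source morpheme | target morpheme) are all illustrative assumptions.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Plain IBM Model 1 EM (illustrative sketch, not the paper's model).

    pairs: list of (source_tokens, target_tokens).
    Returns a dict-like t where t[(f, e)] approximates P(f | e).
    """
    src_vocab = {f for fs, _ in pairs for f in fs}
    # Uniform initialisation over the source vocabulary.
    t = defaultdict(lambda: 1.0 / len(src_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of e being linked

        # E-step: distribute each source token over target tokens (plus NULL).
        for fs, es in pairs:
            es_null = ["<NULL>"] + list(es)
            for f in fs:
                z = sum(t[(f, e)] for e in es_null)
                for e in es_null:
                    posterior = t[(f, e)] / z
                    count[(f, e)] += posterior
                    total[e] += posterior

        # M-step: renormalise expected counts per target token.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t


# Hypothetical toy data: hand-segmented morphemes, not taken from the paper.
toy_pairs = [
    (["ev", "+ler"], ["house", "+s"]),
    (["ev"], ["house"]),
    (["kitap", "+lar"], ["book", "+s"]),
    (["kitap"], ["book"]),
]

t = train_ibm_model1(toy_pairs)
print(t[("ev", "house")] > t[("+ler", "house")])
```

On this toy input the stem "ev" ends up more strongly associated with "house" than the plural suffix does, which is the kind of morpheme-level evidence the abstract suggests can inform word-level alignment.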