Paper: Integrating an Unsupervised Transliteration Model into Statistical Machine Translation

ACL ID E14-4029
Title Integrating an Unsupervised Transliteration Model into Statistical Machine Translation
Venue Annual Meeting of the European Chapter of the Association for Computational Linguistics
Session Main Conference
Year 2014
Authors

We investigate three methods for integrating an unsupervised transliteration model into an end-to-end SMT system. We induce a transliteration model from parallel data and use it to translate OOV words. Our approach is fully unsupervised and language independent. With the integration methods, we observed improvements of 0.23 to 0.75 BLEU points (Δ 0.41) across 7 language pairs. We also show that our mined transliteration corpora provide better rule coverage and translation quality than the gold-standard transliteration corpora.