Paper: Modified Distortion Matrices for Phrase-Based Statistical Machine Translation

ACL ID P12-1050
Title Modified Distortion Matrices for Phrase-Based Statistical Machine Translation
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2012
Authors

This paper presents a novel method to suggest long word reorderings to a phrase-based SMT decoder. We address language pairs where long reordering concentrates on few patterns, and use fuzzy chunk-based rules to predict likely reorderings for these phenomena. Then we use reordered n-gram LMs to rank the resulting permutations and select the n-best for translation. Finally we encode these reorderings by modifying selected entries of the distortion cost matrix, on a per-sentence basis. In this way, we expand the search space by a much finer degree than if we simply raised the distortion limit. The proposed techniques are tested on Arabic-English and German-English using well-known SMT benchmarks.
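
The last step of the abstract, modifying selected entries of the distortion cost matrix for one sentence, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the distance-based cost |j - i - 1|, and the choice to lower suggested jumps to a cost of 0 are all assumptions made for illustration, and the suggested permutation is simply given by hand rather than produced by chunk-based rules and a reordered LM.

```python
def distortion_matrix(n):
    """Distance-based cost of translating source word j right after word i.

    Assumption: the usual |j - i - 1| distortion cost over n source positions.
    """
    return [[abs(j - i - 1) for j in range(n)] for i in range(n)]


def apply_permutation(matrix, permutation, new_cost=0):
    """Return a copy of matrix in which each jump used by the suggested
    source-side permutation costs at most new_cost (hypothetical value)."""
    modified = [row[:] for row in matrix]
    for prev, nxt in zip(permutation, permutation[1:]):
        modified[prev][nxt] = min(modified[prev][nxt], new_cost)
    return modified


def jump_allowed(matrix, i, j, distortion_limit):
    """A jump is explored only if its cost is within the distortion limit."""
    return matrix[i][j] <= distortion_limit


# Six-word sentence; the suggested reordering moves words 4 and 5 before 1-3.
base = distortion_matrix(6)
suggested = [0, 4, 5, 1, 2, 3]
modified = apply_permutation(base, suggested)

limit = 2
print(jump_allowed(base, 0, 4, limit))      # long jump blocked by the limit
print(jump_allowed(modified, 0, 4, limit))  # same jump, now inside the limit
print(jump_allowed(modified, 0, 5, limit))  # unsuggested long jump stays blocked
```

Under these assumptions, only the jumps belonging to the suggested permutation become reachable under the unchanged distortion limit, while every other long jump keeps its original cost. This is the sense in which the search space grows more finely than it would if the limit were raised for all word pairs.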