Paper: Confidence-based Rewriting of Machine Translation Output

ACL ID D14-1133
Title Confidence-based Rewriting of Machine Translation Output
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2014

Numerous works in Statistical Machine Translation (SMT) have attempted to identify better translation hypotheses obtained by an initial decoding using an improved, but more costly scoring function. In this work, we introduce an approach that takes the hypotheses produced by a state-of-the-art, reranked phrase-based SMT system, and explores new parts of the search space by applying rewriting rules selected on the basis of posterior phrase-level confidence. In the medical domain, we obtain a 1.9 BLEU improvement over a reranked baseline exploiting the same scoring function, corresponding to a 5.4 BLEU improvement over the original Moses baseline. We show that if an indication of which phrases require rewriting is provided, our automatic rewriting procedure yields an additiona...
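
The general idea of confidence-guided rewriting described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the greedy search strategy, the `rules` table, the `score` callable, the `threshold` value, and the toy data are all assumptions made for illustration. The sketch visits phrases in order of increasing confidence, tries each candidate replacement for phrases below the threshold, and keeps a replacement only when the (hypothetical) scoring function improves.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Phrase = str


def rewrite_low_confidence(
    hypothesis: Sequence[Phrase],
    confidence: Sequence[float],
    rules: Dict[Phrase, List[Phrase]],
    score: Callable[[Sequence[Phrase]], float],
    threshold: float = 0.5,
) -> Tuple[List[Phrase], float]:
    """Greedily rewrite phrases whose confidence is below `threshold`.

    All arguments are hypothetical stand-ins: `confidence[i]` plays the role
    of a phrase-level confidence for `hypothesis[i]`, `rules` maps a phrase
    to candidate replacements, and `score` is any hypothesis-level scorer.
    """
    current = list(hypothesis)
    best = score(current)
    # Visit the least confident phrases first.
    order = sorted(range(len(current)), key=lambda i: confidence[i])
    for i in order:
        if confidence[i] >= threshold:
            break  # every remaining phrase is confident enough
        for alt in rules.get(hypothesis[i], []):
            candidate = current[:i] + [alt] + current[i + 1:]
            candidate_score = score(candidate)
            if candidate_score > best:  # keep only strict improvements
                current, best = candidate, candidate_score
    return current, best


# Toy data (invented for illustration, not taken from the paper).
PREFERRED = {"the patient", "was given", "an injection"}


def toy_score(phrases: Sequence[Phrase]) -> float:
    # Counts phrases from a hand-picked set; a real scorer would be a model.
    return float(sum(1 for p in phrases if p in PREFERRED))


toy_hypothesis = ["the patient", "has received", "a sting"]
toy_confidence = [0.9, 0.6, 0.2]
toy_rules = {
    "a sting": ["a bite", "an injection"],
    "has received": ["was given"],
}

rewritten, rewritten_score = rewrite_low_confidence(
    toy_hypothesis, toy_confidence, toy_rules, toy_score, threshold=0.5
)
```

With the default threshold of 0.5, only the last phrase is eligible for rewriting in this toy example; raising the threshold widens the set of phrases that may be rewritten, at the cost of more calls to the scoring function.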