Paper: Machine Translation by Triangulation: Making Effective Use of Multi-Parallel Corpora

ACL ID P07-1092
Title Machine Translation by Triangulation: Making Effective Use of Multi-Parallel Corpora
Venue Annual Meeting of the Association of Computational Linguistics
Session Main Conference
Year 2007
Authors

Current phrase-based SMT systems perform poorly when using small training sets. This isaconsequenceofunreliabletranslationes- timates and low coverage over source and target phrases. This paper presents a method which alleviates this problem by exploit- ing multiple translations of the same source phrase. Central to our approach is triangula- tion, the process of translating from a source to a target language via an intermediate third language. This allows the use of a much wider range of parallel corpora for train- ing, and can be combined with a standard phrase-table using conventional smoothing methods. Experimental results demonstrate BLEU improvements for triangulated mod- els over a standard phrase-based system.