Paper: Character-Level Machine Translation Evaluation for Languages with Ambiguous Word Boundaries

ACL ID P12-1097
Title Character-Level Machine Translation Evaluation for Languages with Ambiguous Word Boundaries
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2012
Authors

In this work, we introduce the TESLA-CELAB metric (Translation Evaluation of Sentences with Linear-programming-based Analysis - Character-level Evaluation for Languages with Ambiguous word Boundaries) for automatic machine translation evaluation. For languages such as Chinese, where words usually have meaningful internal structure and word boundaries are often fuzzy, TESLA-CELAB acknowledges the advantage of character-level evaluation over word-level evaluation. By reformulating the problem in the linear programming framework, TESLA-CELAB addresses several drawbacks of the character-level metrics, in particular the modeling of synonyms spanning multiple characters. We show empirically that TESLA-CELAB significantly outperforms character-level BLEU in the English-Chinese transl...