Paper: Re-evaluating the Role of Bleu in Machine Translation Research

ACL ID E06-1032
Title Re-evaluating the Role of Bleu in Machine Translation Research
Venue Annual Meeting of the European Chapter of the Association for Computational Linguistics
Session Main Conference
Year 2006
Authors

We argue that the machine translation community is overly reliant on the Bleu machine translation evaluation metric. We show that an improved Bleu score is neither necessary nor sufficient for achieving an actual improvement in translation quality, and give two significant counterexamples to Bleu’s correlation with human judgments of quality. This offers new potential for research which was previously deemed unpromising by an inability to improve upon Bleu scores.