Paper: Decomposability of Translation Metrics for Improved Evaluation and Efficient Algorithms

ACL ID D08-1064
Title Decomposability of Translation Metrics for Improved Evaluation and Efficient Algorithms
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2008
Authors

B is the de facto standard for evaluation and development of statistical machine trans- lation systems. We describe three real-world situations involving comparisons between dif- ferent versions of the same systems where one can obtain improvements in B scores that are questionable or even absurd. These situ- ations arise because B lacks the property of decomposability, a property which is also computationally convenient for various appli- cations. We propose a very conservative modi- fication to B and a cross between B and word error rate that address these issues while improving correlation with human judgments.