Paper: Automatic Evaluation of Chinese Translation Output: Word-Level or Character-Level?

ACL ID P11-2028
Title Automatic Evaluation of Chinese Translation Output: Word-Level or Character-Level?
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2011
Authors

The word is usually adopted as the smallest unit in most Chinese language processing tasks. However, for automatic evaluation of the quality of Chinese translation output when translating from other languages, either a word-level approach or a character-level approach is possible. So far, there has been no detailed study comparing how well these two approaches correlate with human assessment. In this paper, we compare word-level metrics with character-level metrics on the submitted output of English-to-Chinese translation systems in the IWSLT’08 CT-EC and NIST’08 EC tasks. Our experimental results reveal that character-level metrics correlate with human assessment better than word-level metrics. Our analysis suggests several key reasons behind this finding.
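To make the word-level versus character-level distinction concrete, the following is a minimal sketch of clipped n-gram precision (the matching component used in BLEU-style metrics) computed on the same sentence pair under both tokenizations. It is not the paper's evaluation setup: the sentences, their segmentations, and the helper names (`ngrams`, `clipped_precision`, `to_chars`) are all hypothetical and chosen only for illustration. The example shows how a segmentation disagreement can lower a word-level score while leaving the character-level score unchanged; whether this is among the reasons the paper identifies is not stated in the abstract.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hyp_tokens, ref_tokens, n):
    """Clipped n-gram precision of a hypothesis against one reference."""
    hyp_counts = Counter(ngrams(hyp_tokens, n))
    ref_counts = Counter(ngrams(ref_tokens, n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    # Each hypothesis n-gram is credited at most as often as it occurs
    # in the reference.
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in hyp_counts.items())
    return matched / total


def to_chars(words):
    """Flatten a word-segmented sentence into a character sequence."""
    return [ch for word in words for ch in word]


# Hypothetical data: identical character strings, different segmentations.
ref_words = ["我", "喜欢", "自然语言", "处理"]
hyp_words = ["我", "喜欢", "自然", "语言", "处理"]

word_p1 = clipped_precision(hyp_words, ref_words, 1)
word_p2 = clipped_precision(hyp_words, ref_words, 2)
char_p1 = clipped_precision(to_chars(hyp_words), to_chars(ref_words), 1)
char_p2 = clipped_precision(to_chars(hyp_words), to_chars(ref_words), 2)

print(f"word-level: P1={word_p1:.2f}, P2={word_p2:.2f}")
print(f"char-level: P1={char_p1:.2f}, P2={char_p2:.2f}")
```

In this toy case the hypothesis and reference contain exactly the same characters, so the character-level precisions are perfect, while the word-level precisions are penalized solely because "自然语言" was segmented differently.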