Paper: Combining Source and Target Language Information for Name Tagging of Machine Translation Output

ACL ID P08-3004
Title Combining Source and Target Language Information for Name Tagging of Machine Translation Output
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2008
Authors

A named entity recognizer (NER) generally performs worse on machine-translated text, because of the poor syntax of MT output and other translation errors. Because some tagging distinctions are clearer in the source language and others in the target language, we integrate tag information from both sides to improve target-language tagging performance, especially recall. In our experiments with Chinese-to-English MT output, we first applied a simple merge of the outputs of an entity translation (ET) system and an English NER system, which yielded an absolute F-measure gain of 7.15%, from 73.53% to 80.68%. We then trained a maximum entropy Markov model (MEMM) module to integrate the two outputs more discriminatively, obtaining a further average gain of 2.74% in F-measure, from 80.68% to 83.42%.
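
The "simple merge" step can be pictured with a minimal sketch. The abstract does not state the merge rule, so the rule used here (keep every span from one system, then add any span from the other system that does not overlap a span already kept) is an assumption made for illustration, not the authors' method. The names `Span`, `overlaps`, and `simple_merge`, the label set, and the example sentence are all hypothetical.

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    """A tagged name in the MT output (hypothetical representation)."""
    start: int  # index of the first token (inclusive)
    end: int    # index one past the last token (exclusive)
    label: str  # entity type, e.g. "PER", "ORG", "GPE"


def overlaps(a: Span, b: Span) -> bool:
    """Return True if the two token ranges share at least one token."""
    return a.start < b.end and b.start < a.end


def simple_merge(primary: List[Span], secondary: List[Span]) -> List[Span]:
    """Union of two taggers' outputs, preferring `primary` on conflicts.

    Assumed rule: keep all primary spans, then add each secondary span
    that does not overlap anything already kept. This can only add names,
    so it targets recall.
    """
    merged = list(primary)
    for candidate in secondary:
        if not any(overlaps(candidate, kept) for kept in merged):
            merged.append(candidate)
    return sorted(merged)  # order by start token, then end, then label


# Invented example: the English tagger finds one name in the MT output,
# while names projected from the source side supply two more.
tokens = "Hu Jintao met officials from Xinhua in Beijing".split()
ner_spans = [Span(7, 8, "GPE")]
et_spans = [Span(0, 2, "PER"), Span(5, 6, "ORG"), Span(7, 8, "GPE")]

merged = simple_merge(ner_spans, et_spans)
for span in merged:
    print(" ".join(tokens[span.start:span.end]), span.label)
```

A fixed-priority union like this cannot decide case by case which system to trust when the two disagree, which is the kind of decision a trained, discriminative integration module is meant to make.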