Paper: Error Detection for Statistical Machine Translation Using Linguistic Features

ACL ID P10-1062
Title Error Detection for Statistical Machine Translation Using Linguistic Features
Venue Annual Meeting of the Association of Computational Linguistics
Session Main Conference
Year 2010
Authors

Automatic error detection is desired in the post-processing to improve machine translation quality. The previous work is largely based on confidence estimation us- ing system-based features, such as word posterior probabilities calculated from N- best lists or word lattices. We propose to incorporate two groups of linguistic fea- tures, which convey information from out- side machine translation systems, into er- ror detection: lexical and syntactic fea- tures. We use a maximum entropy clas- sifier to predict translation errors by inte- grating word posterior probability feature and linguistic features. The experimen- tal results show that 1) linguistic features alone outperform word posterior probabil- ity based confidence estimation in error detection; and 2) linguistic features can furt...