Paper: Linguistically Annotated BTG for Statistical Machine Translation

ACL ID C08-1127
Title Linguistically Annotated BTG for Statistical Machine Translation
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2008
Authors

Bracketing Transduction Grammar (BTG) is a natural choice for effective integration of desired linguistic knowledge into statistical machine translation (SMT). In this paper, we propose a Linguistically Annotated BTG (LABTG) for SMT. It conveys linguistic knowledge of source-side syntax structures to BTG hierarchical structures through linguistic annotation. From the linguistically annotated data, we learn annotated BTG rules and train a linguistically motivated phrase translation model and a reordering model. We also present an annotation algorithm that captures syntactic information for BTG nodes. The experiments show that the LABTG approach significantly outperforms a baseline BTG-based system and a state-of-the-art phrase-based system on the NIST MT-05 Chinese-to-English t...