Paper: Forced Derivation Tree based Model Training to Statistical Machine Translation

ACL ID D12-1041
Title Forced Derivation Tree based Model Training to Statistical Machine Translation
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2012
Authors

A forced derivation tree (FDT) of a sentence pair {f, e} denotes a derivation tree that can translate f into its accurate target translation e. In this paper, we present an approach that leverages the structured knowledge contained in FDTs to train component models for statistical machine translation (SMT) systems. We first describe how to generate different FDTs for each sentence pair in the training corpus, and then present how to infer the optimal FDTs based on their derivation and alignment qualities. As the first step in this line of research, we verify the effectiveness of our approach in a BTG-based phrasal system, and propose four FDT-based component models. Experiments are carried out on large-scale English-to-Japanese and Chinese-to-English translation tasks, and significant improvemen...
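
To make the notion of a forced derivation tree concrete, the following is a minimal sketch of forced decoding under a bracketing transduction grammar (BTG). It is not the paper's algorithm: the function name `forced_derivation`, the `phrase_table` layout (a dict from source-phrase tuples to sets of target-phrase tuples), and the toy data are all assumptions made for illustration. The sketch returns only the first derivation it finds, whereas the abstract describes generating several FDTs per sentence pair and selecting among them by derivation and alignment quality.

```python
from functools import lru_cache


def forced_derivation(f, e, phrase_table):
    """Return one BTG derivation tree that yields exactly e from f, or None.

    phrase_table is a hypothetical structure: {source_tuple: {target_tuple, ...}}.
    """
    f, e = tuple(f), tuple(e)

    @lru_cache(maxsize=None)
    def derive(i, j, k, l):
        # Try to translate source span f[i:j] into target span e[k:l].
        src, tgt = f[i:j], e[k:l]

        # Lexical rule: the span pair is a known phrase pair.
        if tgt in phrase_table.get(src, ()):
            return ("phrase", src, tgt)

        # Binary rules: split both spans and combine the two halves.
        for m in range(i + 1, j):
            for n in range(k + 1, l):
                # Straight rule: target order follows source order.
                left = derive(i, m, k, n)
                if left is not None:
                    right = derive(m, j, n, l)
                    if right is not None:
                        return ("straight", left, right)
                # Inverted rule: the two target halves are swapped.
                left = derive(i, m, n, l)
                if left is not None:
                    right = derive(m, j, k, n)
                    if right is not None:
                        return ("inverted", left, right)
        return None

    return derive(0, len(f), 0, len(e))


# Toy data (invented for this example only).
toy_table = {("a",): {("x",)}, ("b",): {("y",)}}
tree = forced_derivation(["a", "b"], ["y", "x"], toy_table)
no_tree = forced_derivation(["a", "b"], ["x", "z"], toy_table)
```

Here `tree` is an inverted node over two phrase pairs, because the target order is the reverse of the source order, and `no_tree` is `None`, because no combination of phrase pairs reproduces the target exactly. A sentence pair with no forced derivation would simply contribute no FDT under this sketch.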