Paper: A Syntactified Direct Translation Model with Linear-time Decoding

ACL ID D09-1123
Title A Syntactified Direct Translation Model with Linear-time Decoding
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2009
Authors
  • Hany Hassan (IBM Cairo Technology Development Center, Giza, Egypt)
  • Khalil Sima'an (University of Amsterdam, Amsterdam, The Netherlands)
  • Andy Way (Dublin City University, Dublin, Ireland)

Recent syntactic extensions of statistical translation models work with a synchronous context-free or tree-substitution grammar extracted from an automatically parsed parallel corpus. The decoders accompanying these extensions typically exceed quadratic time complexity. This paper extends the Direct Translation Model 2 (DTM2) with syntax while maintaining linear-time decoding. We employ a linear-time parsing algorithm based on an eager, incremental interpretation of Combinatory Categorial Grammar (CCG). As every input word is processed, the local parsing decisions resolve ambiguity eagerly, by selecting a single supertag–operator pair for extending the dependency parse incrementally. Alongside translation features extracted from the derived parse tree, we explore synt...