Paper: cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models

ACL ID P10-4002
Title cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models
Venue Annual Meeting of the Association for Computational Linguistics
Session System Demonstration
Year 2010
Authors

We present cdec, an open source framework for decoding, aligning with, and training a number of statistical machine translation models, including word-based models, phrase-based models, and models based on synchronous context-free grammars. Using a single unified internal representation for translation forests, the decoder strictly separates model-specific translation logic from general rescoring, pruning, and inference algorithms. From this unified representation, the decoder can extract not only the 1- or k-best translations, but also alignments to a reference, or the quantities necessary to drive discriminative training using gradient-based or gradient-free optimization techniques. Its efficient C++ implementation means that memory use and runtime performance are significantly b...