Paper: Sinuhe – Statistical Machine Translation using a Globally Trained Conditional Exponential Family Translation Model

ACL ID D09-1107
Title Sinuhe – Statistical Machine Translation using a Globally Trained Conditional Exponential Family Translation Model
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2009
Authors

We present a new phrase-based conditional exponential family translation model for statistical machine translation. The model operates on a feature representation in which sentence level translations are represented by enumerating all the known phrase level translations that occur inside them. This makes the model a good match with the commonly used phrase extraction heuristics. The model’s predictions are properly normalized probabilities. In addition, the model automatically takes into account information provided by phrase overlaps, and does not suffer from reference translation reachability problems. We have implemented an open source translation system Sinuhe based on the proposed translation model. Our experiments on Europarl and GigaFrEn corpora demonstrate that fi...
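
The feature representation and the normalized conditional exponential form described in the abstract can be illustrated with a toy sketch. This is not the Sinuhe implementation: the phrase table, the weights, the function names, and the binary "phrase pair occurs in the sentence pair" feature are all assumptions made for illustration, and normalization here runs over a short explicit candidate list rather than over all possible translations.

```python
import math

# Hypothetical toy phrase table of (source phrase, target phrase) pairs.
PHRASE_TABLE = {
    ("das", "the"),
    ("haus", "house"),
    ("haus", "home"),
    ("das haus", "the house"),
}


def spans(tokens, max_len=3):
    """Return the set of contiguous token spans up to max_len tokens long."""
    return {
        " ".join(tokens[i:j])
        for i in range(len(tokens))
        for j in range(i + 1, min(i + max_len, len(tokens)) + 1)
    }


def features(source, target, table=PHRASE_TABLE):
    """Enumerate every known phrase pair that occurs inside the sentence pair.

    Overlapping phrase pairs all fire together, so information from
    phrase overlaps enters the score (a simplification for illustration).
    """
    src, tgt = spans(source.split()), spans(target.split())
    return {(s, t) for (s, t) in table if s in src and t in tgt}


def conditional_probs(source, candidates, weights):
    """p(target | source) proportional to exp(sum of active feature weights)."""
    scores = [
        sum(weights.get(f, 0.0) for f in features(source, c))
        for c in candidates
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return {c: e / z for c, e in zip(candidates, exps)}


# Hypothetical uniform weights; a trained model would learn these globally.
weights = {pair: 1.0 for pair in PHRASE_TABLE}
probs = conditional_probs("das haus", ["the house", "the home"], weights)
```

In this sketch the candidate "the house" activates three phrase pairs, including the overlapping pair ("das haus", "the house"), while "the home" activates two, so it receives the larger share of the probability mass, and the probabilities sum to one by construction.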