Paper: Fast and Adaptive Online Training of Feature-Rich Translation Models

ACL ID P13-1031
Title Fast and Adaptive Online Training of Feature-Rich Translation Models
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2013
Authors

We present a fast and scalable online method for tuning statistical machine translation models with large feature sets. The standard tuning algorithm, MERT, only scales to tens of features. Recent discriminative algorithms that accommodate sparse features have produced smaller than expected translation quality gains in large systems. Our method, which is based on stochastic gradient descent with an adaptive learning rate, scales to millions of features and tuning sets with tens of thousands of sentences, while still converging after only a few epochs. Large-scale experiments on Arabic-English and Chinese-English show that our method produces significant translation quality gains by exploiting sparse features. Equally important is our analysis, which suggests techniques for mitigati...