Paper: Max-Violation Perceptron and Forced Decoding for Scalable MT Training

ACL ID D13-1112
Title Max-Violation Perceptron and Forced Decoding for Scalable MT Training
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2013
Authors

While large-scale discriminative training has triumphed in many NLP problems, its definite success on machine translation has been largely elusive. Most recent efforts along this line are not scalable (training on the small dev set with features from top ~100 most frequent words) and overly complicated. We instead present a very simple yet theoretically motivated approach by extending the recent framework of "violation-fixing perceptron", using forced decoding to compute the target derivations. Extensive phrase-based translation experiments on both Chinese-to-English and Spanish-to-English tasks show substantial gains in BLEU by up to +2.3/+2.0 on dev/test over MERT, thanks to 20M+ sparse features. This is the first successful effort of large-scale online discriminative training fo...