Paper: Discriminative Training of 150 Million Translation Parameters and Its Application to Pruning

ACL ID N13-1034
Title Discriminative Training of 150 Million Translation Parameters and Its Application to Pruning
Venue Annual Conference of the North American Chapter of the Association for Computational Linguistics
Session Main Conference
Year 2013
Authors

Until recently, the application of discriminative training to log-linear-based statistical machine translation has been limited to tuning the weights of a small number of features or training features with a small number of parameters. In this paper, we propose to scale up the discriminative training of He and Deng (2012) to train features with 150 million parameters, which is one order of magnitude higher than previously published efforts, and to apply discriminative training to redistribute probability mass that is lost due to model pruning. The experimental results confirm the effectiveness of our proposals on the NIST MT06 set over a strong baseline.