Paper: Efficient Subsampling for Training Complex Language Models

ACL ID D11-1104
Title Efficient Subsampling for Training Complex Language Models
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2011

We propose an efficient way to train maximum entropy language models (MELM) and neural network language models (NNLM). The advantage of the proposed method comes from a more robust and efficient subsampling technique. The original multi-class language modeling problem is transformed into a set of binary problems where each binary classifier predicts whether or not a particular word will occur. We show that the binarized model is as powerful as the standard model and allows us to aggressively subsample negative training examples without sacrificing predictive performance. Empirical results show that we can train MELM and NNLM at 1% to 5% of the standard complexity with no loss in performance.
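
The binarization and negative subsampling described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `binarize_with_subsampling`, the parameter `neg_rate`, and the choice to weight each kept negative by `1 / neg_rate` are assumptions made here for illustration. Each (context, next word) pair becomes a positive example for the binary problem of the observed word and a candidate negative example for every other word; negatives are kept only with probability `neg_rate`.

```python
import random


def binarize_with_subsampling(examples, vocab, neg_rate, seed=0):
    """Build one binary training set per vocabulary word.

    examples: list of (context, next_word) pairs.
    vocab:    list of words; each word gets its own binary problem.
    neg_rate: probability of keeping a negative example (hypothetical name).

    Returns a dict mapping word -> list of (context, label, weight).
    Positives are always kept with weight 1.0. Kept negatives get weight
    1 / neg_rate, an assumption of this sketch so that the expected total
    weight of negatives matches the unsampled data.
    """
    rng = random.Random(seed)
    per_word = {w: [] for w in vocab}
    for context, target in examples:
        for w in vocab:
            if w == target:
                # The observed word is a positive example for its classifier.
                per_word[w].append((context, 1, 1.0))
            elif rng.random() < neg_rate:
                # Every other word sees this context as a negative example,
                # but only a sampled fraction is kept.
                per_word[w].append((context, 0, 1.0 / neg_rate))
    return per_word


if __name__ == "__main__":
    toy_examples = [(("the",), "cat"), (("a",), "dog"), (("the",), "dog")]
    toy_vocab = ["cat", "dog", "fish"]
    sampled = binarize_with_subsampling(toy_examples, toy_vocab, neg_rate=0.05)
    for word, rows in sampled.items():
        positives = sum(label for _, label, _ in rows)
        print(word, "positives:", positives, "kept negatives:", len(rows) - positives)
```

For clarity the sketch visits every vocabulary word for every example and so does not itself save computation; a practical version would presumably draw the kept negatives directly rather than test each word in turn.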